European Congress of Mathematics Stockholm, June 27 – July 2, 2004 Ari Laptev Editor
European Mathematical Society
Editor: Ari Laptev Department of Mathematics Royal Institute of Technology SE-100 44 Stockholm Sweden
2000 Mathematics Subject Classification 00Bxx
Bibliographic information published by Die Deutsche Bibliothek Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.ddb.de.
ISBN 3-03719-009-4 This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. For any kind of use permission of the copyright owner must be obtained.
© 2005 European Mathematical Society Contact address: European Mathematical Society Publishing House Seminar for Applied Mathematics ETH-Zentrum FLI C4 CH-8092 Zürich Switzerland Phone: +41 (0)1 632 34 36 Email:
[email protected] Homepage: www.ems-ph.org Printed in Germany 987654321
4ECM Stockholm 2004 © 2005 European Mathematical Society
Contents

Foreword by John Kingman, President of the European Mathematical Society
Opening speech of Ari Laptev, President of the 4ECM Organization Committee
Scientific Report by Ari Laptev
List of Sponsors
Invited Speakers

G. Alberti, M. Csörnyei and D. Preiss: Structure of Null Sets in the Plane and Applications
D. Auroux: Some Open Questions about Symplectic 4-manifolds, Singular Plane Curves and Braid Group Factorizations
D. Beliaev and S. Smirnov: Harmonic Measure on Fractal Sets
S. Bianchini: Singular Approximations to Hyperbolic Systems of Conservation Laws in one Space Dimension
A. Borodin and G. Olshanski: Representation Theory and Random Point Processes
F. Bouchut: Stability of Relaxation Models for Conservation Laws
B.H. Bowditch: Hyperbolic 3-manifolds and the Geometry of the Curve Complex
E. Friedgut: Proof of an Intersection Theorem via Fourier Analysis
P. Gérard: Nonlinear Schrödinger Equations on Compact Manifolds
A. Guionnet: A Probabilistic Approach to some Problems in von Neumann Algebras
S. Helmke and P. Slodowy: Singular Elements of Affine Kac–Moody Groups
H. Holden: On the Camassa–Holm and Hunter–Saxton equations
R. Klein, E. Mikusky and A. Owinoh: Multiple Scales Asymptotics for Atmospheric Flows
J. Krajíček: Proof Complexity
D. Krammer: Horizontal Configurations of Points in Link Complements
E. Lindenstrauss: Invariant Measures for Multiparameter Diagonalizable Algebraic Actions – A Short Survey
T. Łuczak: Phase Transition Phenomena in Random Discrete Structures
T. Lyons: Systems Controlled by Rough Paths
I. Madsen and M. Weiss: The Stable Mapping Class Group and Stable Homotopy Theory
P. Massart: A Non-asymptotic Theory for Model Selection
P. Mihăilescu: Reflection, Bernoulli Numbers and the Proof of Catalan’s Conjecture
M. Mustaţă, S. Takagi and K. Watanabe: F-thresholds and Bernstein-Sato Polynomials
K.G. O’Grady: Hyperkähler Manifolds and Algebraic Geometry
I.Z. Ruzsa: Sumsets
Y. Shalom: Measurable Group Theory
M. Shcherbina: Some Mathematical Problems of Neural Networks Theory
M. Sodin: Zeroes of Gaussian Analytic Functions
X. Tolsa: Painlevé’s Problem, Analytic Capacity and Curvature of Measures
A.-K. Tornberg: Regularization Techniques for Singular Source Terms in Differential Equations
V. Totik: Equilibrium Measures and Polynomials
W. Werner: SLE, Conformal Restriction, Loops
U. Zannier: On the Integral Points on Certain Algebraic Varieties

Network Lectures

A. Bonami: Some Problems Related with Holomorphic Functions on Tube Domains over Light Cones
Y. Brenier: Hyperbolic PDEs, Kinetic Formulation and Geometric Measure Theory
F. den Hollander: Random Dynamics in Spatially Extended Systems
J. Esterle: Analysis and Operators 2000–2004. Four Years of Network Activity
B. Helffer: Analysis of the Bottom of the Spectrum of Schrödinger Operators with Magnetic Potentials and Applications
J.P. Keating: Mathematical Aspects of Quantum Chaos
C. Krattenthaler: The Research Training Network “Algebraic Combinatorics in Europe”
M. Monsurrò: Algebras with Involution and Adjoint Groups
M. Reid: Constructing Algebraic Varieties via Commutative Algebra
J.P. Solovej: Mathematical Problems of Large Quantum Systems
J. Stix: The Grothendieck–Teichmüller Group and Galois Theory of the Rational Numbers – European Network GTEM –
Plenary Speakers

F. Golse: Hydrodynamic Limits
F. Guerra: Mathematical Aspects of Mean Field Spin Glass Theory
J. Håstad: Complexity Theory, Proofs and Approximation
A. Okounkov: Random Surfaces Enumerating Algebraic Curves
P. Ozsváth: On Heegaard Diagrams and Holomorphic Disks
O. Schramm: Emergence of Symmetry: Conformal Invariance in Scaling Limits of Random Systems
C. Voisin: Recent Progresses in Kähler and Complex Algebraic Geometry
Prize Lectures

F. Barthe: Isoperimetric Inequalities, Probability Measures and Convex Geometry
P. Biran: Symplectic Topology and Algebraic Families
S. Serfaty: Vortices in the Ginzburg–Landau Model of Superconductivity
W. Tucker: Validated Numerics for Pedestrians
O. Venjakob: From Classical to Non-commutative Iwasawa Theory: An Introduction to the GL2 Main Conjecture

Index of Authors
Foreword by John Kingman, President of the European Mathematical Society
It was my privilege to welcome participants to the Fourth European Congress of Mathematics, and to thank Ari Laptev and his team for all the hard work in preparation for it. Their efforts were rewarded by the attendance, from across Europe and beyond, and by the successful programme of talks on all aspects of mathematics and its applications. It is clear that European mathematics is moving forward fast, making an impressive contribution to the world scene. This is important, not just because mathematics is worthwhile in itself, but because it underpins all modern science and technology. If we want to exploit the discoveries of science for the benefit of the human race, if we want to make Europe competitive in the global market, we must develop the talents of our young people so that they can use mathematics with confidence and discernment. The challenges of the twenty-first century will demand new mathematics and new skills in applying mathematics.
John Kingman and Tuulikki Makelainen looking after the EMS stand
It was immensely encouraging to hear of the achievements of the young mathematicians awarded EMS Prizes during the Congress. They show the originality and liveliness that augur so well for the future progress of the subject. We congratulate them, and their many colleagues who narrowly missed winning the Prizes, and look forward to their future contributions.

Most mathematicians are motivated by the sheer joy of mathematical discovery, whether or not their results find immediate application. We should not apologise for pursuing research that we enjoy, because future use of new mathematics is always unpredictable. There is no sharp dividing line between pure and applied mathematics; much mathematics is ‘not yet applied’, and many of the advances announced in Stockholm will surely bear surprising fruit in future years.

I therefore commend to those who were not fortunate enough to be in Stockholm, and those who were but welcome a permanent record, this collection of so much that is best in mathematics today. Please enjoy it.

Opening speech of Ari Laptev, President of the 4ECM Organization Committee

On behalf of the Organizing Committee, I would like to say how happy we are to welcome you here today, in Stockholm, for the 4th European Congress of Mathematics. The European Congresses of Mathematics are a very new tradition compared to the International Congresses of Mathematicians, which have existed since 1897. Our congress took place for the first time in Paris in 1992, followed 4 years later by one in Budapest and, most recently, in Barcelona in 2000. However, it has already established itself as a major mathematical event within Europe. This time Stockholm, somewhat inadvertently, became the host city for the 4ECM. It has been arranged by the Royal Institute of Technology in collaboration with Stockholm University.
Much preparation was well coordinated with the European Mathematical Society’s Executive Committee and I am also indebted to the members of the 3ECM Organizing Committee for their invaluable advice. This event would not have been possible without the generous financial support from a number of Swedish and International institutions to whom we are extremely grateful and who are listed on the screen. I would like to thank the members of the Scientific Committee chaired by Prof. Lennart Carleson who, together with his Vice President Björn Engquist and other members of the Committee, has designed such an excellent programme. We very much appreciate the excellent work of the Prize Committee who accepted the difficult task of choosing 10 talented young mathematicians. The members of the Prize Committee and the prize winners will be announced in the second half of the opening ceremony.
Stockholm, view from the City Hall
I am also most grateful to my colleagues who shared with me the overwhelming responsibility of organizing this event. We have endeavoured to plan every detail of our programme, and we are delighted that so far our request regarding good weather was granted. Finally, I wish to express my gratitude to all of you who have come here to share and contribute to these 5 days of diverse mathematical lectures. I hope you will have an informative and inspiring visit, during which you will not only experience the beauty of mathematics but also the beauty of Stockholm and its archipelago.
Opening speech of Ari Laptev
Ari Laptev and Nina Uraltseva in front of the lecture hall.
Scientific Report by Ari Laptev, President of the 4ECM Organization Committee

Every four years, the European Mathematical Society (EMS) organizes a European Congress of Mathematics. The purpose of this major event of European Mathematics is threefold: to present various new aspects of Pure and Applied Mathematics to a wide audience; to provide a forum for discussion of the relationship between mathematics and society in Europe; to enhance cooperation among mathematicians from all European countries.

The Fourth European Congress of Mathematics (4ECM) took place in Stockholm, Sweden, June 27 to July 2, 2004 with 913 participants from 65 countries. 200 grants were awarded to mathematicians from Central and Eastern European countries covering their travelling, lodging and living expenses. It was the major international mathematical event of the year 2004. The theme of the Congress was “Mathematics in Science and Technology”.
There were seven Plenary Lectures, thirty-three Invited Lectures, twelve European Network Lectures, six Science Lectures and 322 poster presentations covering all areas of mathematics and many areas of its applications. One of the novelties of the 4ECM was the so-called “Science Lectures”, where the most relevant aspects of mathematics in science and technology were discussed. The following speakers gave lectures: Michael Berry (UK), Richard R. Ernst (Switzerland, Nobel Prize in Chemistry 1991), Walter Kohn (USA, Nobel Prize in Chemistry 1998), Martin Nowak (USA), George Oster (USA) and Alexander Polyakov (USA). Another novelty was the presentation of the EU Research Training Networks in Mathematics and Information Sciences and of the Programmes of the European Science Foundation (ESF) in Physical and Engineering Sciences (PESC). Twelve EU Research Training Networks and PESC projects from Brussels and Strasbourg were chosen by the Scientific Committee.
Prize Ceremony
Getting ready for the 5ECM in Amsterdam

Prize Winners

Ten EMS Prizes of 5,000 euro each were awarded to young European mathematicians who have made a particular contribution to the progress of mathematics. The prize winners are: Franck Barthe (France), Stefano Bianchini (Italy), Paul Biran (Israel), Elon Lindenstrauss (USA & Israel), Andrei Okounkov (USA & Russia), Sylvia Serfaty (USA & France), Stanislav Smirnov (Switzerland, Sweden & Russia), Xavier Tolsa (Spain), Warwick Tucker (Sweden) and Otmar Venjakob (Germany). Five of the prize winners were independently chosen by the 4ECM Scientific Committee as Plenary or Invited Speakers. The five other prize winners gave their lectures in parallel sessions.

At the 4ECM Prize Ceremony the Carl-Erik Fröberg Prize of 30,000 SEK was awarded to Anna-Karin Tornberg for her contribution to solving problems with several phases or discontinuous materials with finite element methods. She was one of the Invited Speakers at the 4ECM.

Summary

The many lectures and poster presentations devoted to different applications of modern mathematics allow us to conclude that the Fourth European Congress of Mathematics in Stockholm contributed substantially to developing close cooperation between pure and applied mathematicians.
List of Sponsors

Knut and Alice Wallenberg Foundation: funding, 1,000,000 SEK
Swedish Ministry of Higher Education: funding, 500,000 SEK
Bank of Sweden Tercentenary Foundation: funding, 500,000 SEK
Swedish Foundation for Strategic Research: funding, 250,000 SEK
Swedish National Research Council: funding, 300,000 SEK
Royal Institute of Technology in Stockholm: funding, 500,000 SEK
Stockholm University: funding, 420,000 SEK
Nobel Institutes for Physics and Chemistry: funding, 200,000 SEK
Unesco (ROSTE): funding, 25,000 US$
SAS: funding, 500 US$
The City of Stockholm: Conference Dinner at City Hall, 400,000 SEK
Kluwer Academic Publishers: funding, 5,000 euro
Springer-Verlag: printing of the Stockholm Intelligencer
Invited Speakers
Structure of Null Sets in the Plane and Applications

Giovanni Alberti, Marianna Csörnyei and David Preiss

Abstract. We describe a decomposition result for Lebesgue negligible sets in the plane, and outline some applications to real analysis and geometric measure theory. These results are contained in [2].
1. Introduction

This note is an extended version of a talk that the first author gave at the Fourth European Congress of Mathematics (Stockholm, June 27–July 2, 2004). Like the talk, this paper is aimed at non-expert readers with only a basic knowledge of measure theory and real analysis. Thus many theorems and definitions have been stated in a simplified form, while others of a more technical nature have been omitted entirely. Without the burden of generality, certain proofs turned out to be relatively simple, and have therefore been included in a sketchy but hopefully clear form. The interested reader will find general statements and detailed proofs in a forthcoming paper [2].

The starting point of our research was the observation that in the two-dimensional case the solutions of several problems of seemingly different nature can be derived from a simple covering result for null sets in the plane (Theorem 3.1). These problems include the so-called rank-one property of BV functions, the geometric structure of measures supporting normal currents, and the construction of Lipschitz maps with large non-differentiability sets. As shown below, this covering can be proved using a geometric version of a known combinatorial result (Dilworth’s lemma, or the Erdős–Szekeres theorem). Unfortunately, no equivalent combinatorial result is available in higher dimension, and it is still an open question whether the desired generalization of Theorem 3.1 holds even in three-dimensional space (this issue is briefly discussed in Section 8).

Although this paper focuses mostly on the simplest, i.e., two-dimensional, situations, the reader should keep in mind that many results extend to higher dimension, too, although in that case they may not be as complete, and many questions are still unanswered.

Received by the editors January 20, 2005.
2000 Mathematics Subject Classification. 04A20, 06A07, 26A16, 26A27, 26B05, 26B30, 26B35, 49Q15, 52C10.
Key words and phrases. Decomposition of null sets, functions of bounded variation, Rademacher theorem, differentiability of Lipschitz maps, normal currents, Dilworth’s lemma.

Acknowledgements. This research has been supported, at different moments, by EPSRC (visiting fellowship for G.A.), GNAFA (visiting grant for M.C.), MURST project “Calculus of Variations”, and the Royal Society Wolfson Research Merit Award granted to M.C.

Basic notation and terminology. In this paper, the word “measure” is only used for bounded or locally bounded measures on a Borel σ-algebra, with the only exception of the $d$-dimensional Hausdorff measure $\mathscr{H}^d$, which is not even σ-finite. Recall that if $E$ is a subset of a $d$-dimensional surface of class $C^1$ in the Euclidean space, then $\mathscr{H}^d(E)$ is the usual $d$-dimensional volume of $E$. The Lebesgue measure on $\mathbb{R}^d$ is denoted by $\mathscr{L}^d$. Unless otherwise specified, sets and functions are assumed to be Borel measurable.

We will conform as far as possible to the standard notation of measure theory, and just recall here some essential terminology: a set in $\mathbb{R}^d$ is null if it is Lebesgue negligible; a measure $\mu$ on $\mathbb{R}^d$ is singular if it is singular with respect to Lebesgue measure; the (upper/lower) density of a set $E \subset \mathbb{R}^d$ at a point $x \in \mathbb{R}^d$ is the (upper/lower) limit as $r \to 0$ of the ratio $\mathscr{L}^d(E \cap B_r(x)) / \mathscr{L}^d(B_r(x))$, where $B_r(x)$ stands for the open ball with center $x$ and radius $r$; if this limit exists and is equal to 1, then $x$ is called a density point of $E$.¹ The term “curve” denotes connected 1-dimensional submanifolds of $\mathbb{R}^d$. Given a positive real number $L$, a map $f$ is called $L$-Lipschitz if it has Lipschitz constant $\mathrm{Lip}(f) \le L$.

2. A covering result for finite sets in the plane

As usual, we denote by $x$, $y$ the coordinates of a point in the plane. We call $x$-curve the graph of a 1-Lipschitz function $y = y(x)$ defined for all $x$ in $\mathbb{R}$. Similarly, a $y$-curve is the graph of a 1-Lipschitz function $x = x(y)$.

Theorem 2.1. A set $S$ of $n$ points in the plane can be covered using at most $\sqrt{n}$ $x$-curves and $\sqrt{n}$ $y$-curves.
Remark 2.2. (i) The argument in the proof of Theorem 2.1 can be used, with few modifications, to show that there exists an $x$-curve or a $y$-curve that contains at least $\sqrt{n}$ points of $S$. This statement is a particular case of Dilworth’s lemma (see [7]). It also implies, and indeed is equivalent to, the standard formulation of the Erdős–Szekeres theorem: every finite sequence $(t_1, \dots, t_n)$ of real numbers contains a monotonic subsequence of length at least $\sqrt{n}$.² For a survey of the many variations of the Erdős–Szekeres theorem, see [18].

¹ When $\mathscr{L}^d$ is replaced by a positive measure $\mu$ we shall speak of $\mu$-density.
² To prove the Erdős–Szekeres theorem, consider the points $p_h := (h - t_h, h + t_h)$ with $h = 1, \dots, n$, and notice that any subset contained in an $x$-graph (resp., a $y$-graph) corresponds to a decreasing (resp., increasing) subsequence $(t_k)$.
(ii) The Lipschitz constant in the definition of $x$- and $y$-curves cannot be taken smaller than 1 (consider a set $S$ contained in the line $y = x$). In general, both $x$- and $y$-curves are needed to cover $S$ (consider a set $S$ with $n/2$ points on the $x$-axis and $n/2$ points on the $y$-axis).
(iii) Theorem 2.1 can be stated in a slightly stronger form: given integers $h$, $k$ such that $hk \ge n$, the set $S$ can be covered by $h$ $x$-curves and $k$ $y$-curves.

Proof. We define the following partial order on $S$: a point $p_1 = (x_1, y_1)$ is below a point $p_2 = (x_2, y_2)$, and we write $p_1 \preceq p_2$, if $y_2 - y_1 \ge |x_2 - x_1|$, that is, if $p_2$ belongs to the (one-sided) cone with vertex $p_1$ and axis parallel to the $y$-axis shown in Figure 1, left.

We extract from $S$ a chain (totally ordered subset) $C_1$ with $\sqrt{n}$ points or more. Then we extract from $S \setminus C_1$ a chain $C_2$ with $\sqrt{n}$ points or more, and we proceed in this way until every chain in $S' := S \setminus (C_1 \cup \dots \cup C_k)$ contains fewer than $\sqrt{n}$ points³ (see Figure 1, right). Now we extract from $S'$ the set $M_1$ of all maximal points, that is, points that are below no other point of $S'$. Then we extract the set $M_2$ of all maximal points of $S' \setminus M_1$, and we repeat this operation until $S' \setminus (M_1 \cup \dots \cup M_h)$ is empty (thus the sets $M_j$ are the strata of $S'$).
[Figure 1. Left: the points $p_1$, $p_2$ and the cone with vertex $p_1$ (angle $\pi/4$ marked). Right: the $n = 15$ points of $S$, with the chains $C_1$, $C_2$ and the strata $M_1$, $M_2$, $M_3$.]

To conclude, it suffices to observe the following: (i) $S$ is covered by the chains $C_1, \dots, C_k$ and the strata $M_1, \dots, M_h$; (ii) each chain is contained in a $y$-curve and each stratum is contained in an $x$-curve;⁴ (iii) the number of chains, $k$, cannot exceed $\sqrt{n}$ because the chains are disjoint subsets of $S$ and each contains at least $\sqrt{n}$ points. The number $h$ of strata cannot exceed $\sqrt{n}$ either, because it agrees with the length of the maximal chain contained in $S'$.⁵

³ If $S$ contains no chain with more than $\sqrt{n}$ points then $k = 0$ and $S' = S$.
⁴ More precisely, every chain is the graph of a 1-Lipschitz function $x = x(y)$ defined for finitely many $y$, and can be extended to all $y \in \mathbb{R}$ using McShane’s extension lemma. A similar argument applies to the strata.
⁵ Take $P_1 \in M_h$. Then there exists $P_2 \in M_{h-1}$ such that $P_1 \preceq P_2$, otherwise $P_1$ too would belong to $M_{h-1}$. By the same argument we can find $P_3 \in M_{h-2}, \dots, P_h \in M_1$ such that $P_j \preceq P_{j+1}$ for every $j$, that is, a chain of length $h$ in $S'$.
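The extraction procedure used in the proof of Theorem 2.1 can be sketched in code. The following Python sketch is only an illustration under our own assumptions: the points are distinct, the longest chain is found by a simple quadratic search, and the names `below`, `longest_chain` and `cover` are hypothetical. It is not taken from [2].

```python
def below(p, q):
    """Partial order of the proof: p is below q if q lies in the
    upward cone with vertex p, i.e. y_q - y_p >= |x_q - x_p|."""
    return q[1] - p[1] >= abs(q[0] - p[0])


def longest_chain(points):
    """Return a longest chain (totally ordered subset) of distinct points."""
    if not points:
        return []
    # Along a chain of distinct points the y-coordinate strictly increases,
    # so sorting by y lets us use a longest-path dynamic programme.
    pts = sorted(points, key=lambda p: p[1])
    best = [1] * len(pts)    # best[i]: length of the longest chain ending at pts[i]
    prev = [-1] * len(pts)   # predecessor of pts[i] in that chain
    for i in range(len(pts)):
        for j in range(i):
            if below(pts[j], pts[i]) and best[j] + 1 > best[i]:
                best[i] = best[j] + 1
                prev[i] = j
    i = max(range(len(pts)), key=best.__getitem__)
    chain = []
    while i != -1:
        chain.append(pts[i])
        i = prev[i]
    return chain[::-1]


def cover(points):
    """Split distinct points into chains (each on a y-curve) and
    strata (each on an x-curve), following the proof of Theorem 2.1."""
    n = len(points)
    rest = list(points)
    chains = []
    while True:
        c = longest_chain(rest)
        # Stop as soon as every remaining chain has fewer than sqrt(n) points.
        if not c or len(c) ** 2 < n:
            break
        chains.append(c)
        rest = [p for p in rest if p not in c]
    strata = []
    while rest:
        # Maximal points: those that are below no other remaining point.
        top = [p for p in rest
               if not any(p != q and below(p, q) for q in rest)]
        strata.append(top)
        rest = [p for p in rest if p not in top]
    return chains, strata


if __name__ == "__main__":
    sample = [(0, 0), (1, 2), (3, 1), (2, 5), (5, 0), (4, 4), (6, 3), (1, 7), (7, 7)]
    chains, strata = cover(sample)
    print(len(sample), "points:", len(chains), "chains,", len(strata), "strata")
```

Any two points of a chain satisfy $|x_2 - x_1| \le |y_2 - y_1|$, and any two points of a stratum satisfy $|y_2 - y_1| < |x_2 - x_1|$, which is why the two kinds of pieces fit on $y$-curves and $x$-curves respectively.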
3. A covering result for null sets in the plane

We call $x$-strip of thickness $\delta$ a subset $T$ of the plane of the form
$$T = T^x(f, \delta) := \{(x, y) : |y - f(x)| \le \delta/2\},$$
where $f : \mathbb{R} \to \mathbb{R}$ is a 1-Lipschitz function. The $y$-strip $T^y(f, \delta)$ is defined in the obvious way, by swapping $x$ and $y$.

Theorem 3.1. Let $E$ be a null set in the plane. Then $E$ can be written as $E^x \cup E^y$ where $E^x$ and $E^y$ satisfy the following conditions:
(a) for every $\varepsilon > 0$, $E^x$ can be covered by countably many $x$-strips $T_i^x$ of thickness $\delta_i$ so that $\sum_i \delta_i \le \varepsilon$;
(b) for every $\varepsilon > 0$, $E^y$ can be covered by countably many $y$-strips $T_j^y$ of thickness $\eta_j$ so that $\sum_j \eta_j \le \varepsilon$.
Remark 3.2. (i) By Fubini’s theorem, every null set $E$ in the plane can be written as the union of two sets $E^x$ and $E^y$ such that all one-dimensional sections of $E^x$ parallel to the $y$-axis and all sections of $E^y$ parallel to the $x$-axis are null. This means that every such section can be covered by countably many intervals so that the sum of the lengths is smaller than any given $\varepsilon > 0$. Theorem 3.1 makes this statement more precise, by showing that these intervals can be chosen so that they depend in a Lipschitz way on the variable that parametrizes sections.
(ii) Conditions (a) and (b) imply the following: $\mathscr{H}^1(C \cap E^x) = 0$ for every $y$-curve $C$ with Lipschitz constant $L < 1$ and $\mathscr{H}^1(C \cap E^y) = 0$ for every $x$-curve $C$ with Lipschitz constant $L < 1$.
(iii) Adjusting the proof below, one easily deduces the following modification of Theorem 3.1: a set $E$ with positive measure $m$ can be covered by $x$-strips $T_i^x$ of thickness $\delta_i$ and $y$-strips $T_j^y$ of thickness $\eta_j$ so that $\sum_i \delta_i \le 3\sqrt{m}$ and $\sum_j \eta_j \le 3\sqrt{m}$.⁶

Partial proof. We assume for simplicity that $E$ is compact, and only prove that for every $\varepsilon > 0$ it can be covered by $x$-strips $T_i^x$ and $y$-strips $T_j^y$ so that $\sum_i \delta_i \le \varepsilon$ and $\sum_j \eta_j \le \varepsilon$. We fix $\delta > 0$ and define the $\delta$-discretization $E_\delta$ of $E$ as the set of centers of all squares of the form $[h\delta, (h+1)\delta] \times [k\delta, (k+1)\delta]$, with $h$, $k$ integers, which intersect $E$ (see Figure 2). Since $E$ is compact, it has Lebesgue measure zero if and only if
$$\#E_\delta = o(1/\delta^2).$$
By Theorem 2.1, $E_\delta$ can be covered by $\sqrt{\#E_\delta}$ $x$-curves (the graphs of some functions $f_i$) and by $\sqrt{\#E_\delta}$ $y$-curves (the graphs of some functions $g_j$).

⁶ By a different proof we can even obtain $\sum_i \delta_i \le a$ and $\sum_j \eta_j \le b$ where $a$ and $b$ are any two positive numbers that satisfy $ab > m$ (see [2]).
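[Figure 2. The set $E$ and its $\delta$-discretization $E_\delta$.]

It is easy to check that the $x$-strips $T^x(f_i, 2\delta)$ and the $y$-strips $T^y(g_j, 2\delta)$ cover $E$. Moreover, for each of the two families of strips the sum of the thicknesses is
$$\sqrt{\#E_\delta} \cdot (2\delta) = \sqrt{o(1/\delta^2)} \cdot (2\delta) = o(1),$$
i.e., it tends to 0 as $\delta \to 0$. To conclude, we choose $\delta$ so that this sum is at most $\varepsilon$.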
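The counting behind this partial proof can be checked numerically on a simple example. The Python sketch below is only an illustration under our own assumptions: the compact null set is a straight segment, it is replaced by a dense finite sample (so only grid cells that contain a sample point are counted, which approximates the set of closed squares meeting $E$), and the names `segment_samples`, `discretize` and `total_thickness` are hypothetical.

```python
import math


def segment_samples(m=20000):
    """Dense sample of the segment from (0, 0) to (1, 0.5), a compact null set."""
    return [(i / m, 0.5 * i / m) for i in range(m + 1)]


def discretize(sample_points, delta):
    """Approximate delta-discretization: centres of the grid cells of side
    delta that contain at least one sample point."""
    cells = {(math.floor(x / delta), math.floor(y / delta))
             for x, y in sample_points}
    return {((h + 0.5) * delta, (k + 0.5) * delta) for h, k in cells}


def total_thickness(num_centres, delta):
    """Sum of the thicknesses of one family of strips: at most
    ceil(sqrt(#E_delta)) strips, each of thickness 2 * delta."""
    return math.ceil(math.sqrt(num_centres)) * 2 * delta


if __name__ == "__main__":
    samples = segment_samples()
    for delta in (0.1, 0.01, 0.001):
        count = len(discretize(samples, delta))
        # count * delta**2 plays the role of the o(1) quantity #E_delta * delta^2.
        print(delta, count, count * delta ** 2, total_thickness(count, delta))
```

For this segment the number of cells grows like $1/\delta$ rather than $1/\delta^2$, so both printed quantities shrink as $\delta$ decreases, in line with the estimate above.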
4. Tangent field to a null set in the plane

The first application of Theorem 3.1 is about a notion of tangent field for sets in the plane, and has some interesting consequences that will be explained in the next section.

Definition 4.1. Let $G(2, 1)$ be the Grassmann manifold of lines in the plane. Given a Borel set $E \subset \mathbb{R}^2$, we say that a Borel map $\tau : E \to G(2, 1)$ is a weak tangent field to $E$ if
$$\tau_S(p) = \tau(p) \quad \text{for } \mathscr{H}^1\text{-a.e. } p \in S \cap E \tag{4.1}$$
for every curve $S$ of class $C^1$, where $\tau_S$ is the tangent field to $S$ according to the usual definition.
Remark 4.2. (i) The notion of weak tangent field is compatible with the usual one: if $E$ is a curve of class $C^1$ then the tangent field $\tau_E$ is also a weak tangent field, and conversely, every weak tangent field $\tau$ agrees with $\tau_E$ up to an $\mathscr{H}^1$-negligible subset.⁷
(ii) A set $E$ in the plane is rectifiable if it can be covered by countably many curves $S_i$ of class $C^1$ except for an $\mathscr{H}^1$-negligible subset $E_0$.⁸ A weak tangent field for such a set is constructed as follows: for $p \in E \setminus E_0$ we set $\tau(p) := \tau_{S_i}(p)$ where $i$ is the smallest index such that $p$ belongs to $S_i$, while for $p \in E_0$, $\tau(p)$ is taken arbitrarily.
(iii) If $E$ is rectifiable, then the weak tangent field is unique up to $\mathscr{H}^1$-negligible sets, i.e., if $\tau_1$ and $\tau_2$ satisfy (4.1), then they agree outside a subset of $E$ with $\mathscr{H}^1$ measure equal to zero. If in addition $E$ has (locally) finite $\mathscr{H}^1$ measure, then the weak tangent field can be characterized in a pointwise way, and it is known as the approximate tangent field (or bundle) of $E$. For further details see [9], Section 3.2, or [17], Chapter 3.
(iv) A set $E$ in the plane is purely unrectifiable (p.u.) if $\mathscr{H}^1(S \cap E) = 0$ for every curve $S$ of class $C^1$.⁹ It is clear that for such sets every $\tau$ is tangent because (4.1) is automatically verified. Examples of p.u. sets are the products $E = F \times F$ where $F$ is a null set in $\mathbb{R}$, and more generally all sets $E$ which admit two different projections of measure zero. Essentially all examples of “fractals” in the plane are purely unrectifiable.
(v) The weak tangent field (if it exists) is unique up to p.u. sets; in other words, if $\tau_1$ and $\tau_2$ are tangent to $E$, then they agree outside a purely unrectifiable subset of $E$. In the following we shall denote any field in this equivalence class by $\tau_E$.
(vi) A set $E$ with positive Lebesgue measure admits no tangent field. Indeed, using as test curves $S$ in (4.1) all lines parallel to a given line $\ell \in G(2, 1)$, we deduce by Fubini’s theorem that a tangent field $\tau$ should agree with $\ell$ at $\mathscr{L}^2$-a.e. point of $E$; since this should hold for every choice of $\ell$, we have a contradiction.

⁷ This is a corollary of the following lemma: given two curves $S_1$, $S_2$ of class $C^1$, the corresponding tangent fields agree at $\mathscr{H}^1$-a.e. point of $S_1 \cap S_2$ (in fact, they agree at all points of $S_1 \cap S_2$ except a discrete subset).
⁸ In the terminology of Geometric Measure Theory these sets are called countably $(\mathscr{H}^1, 1)$-rectifiable or simply 1-rectifiable (cf. [9], [13], [17]). The standard definition, albeit equivalent, is different from this one.

The following result shows that there is nothing to add to Remark 4.2(vi):

Theorem 4.3. Every null set $E$ in the plane admits a weak tangent field.

Proof. We need some additional notation: given a unit vector $e \in \mathbb{R}^2$ and an angle $\alpha \in [0, \pi]$, we denote by $C(e, \alpha)$ the two-sided closed cone of axis $e$ and amplitude $\alpha$, that is,
$$C(e, \alpha) := \{v : |v \cdot e| \ge |v| \cos(\alpha/2)\}. \tag{4.2}$$
A map $\mathscr{C}$ from $E$ to the class of all cones is a tangent cone-field to $E$ if it satisfies the obvious analogue of (4.1) for every curve $S$ of class $C^1$:
$$\tau_S(p) \subset \mathscr{C}(p) \quad \text{for } \mathscr{H}^1\text{-a.e. } p \in S \cap E.$$

Step 1. We first establish the existence of a suitable tangent cone-field. Let $e := (1, 0)$ and $e' := (0, 1)$.
Writing $E$ as $E^x \cup E^y$ as in Theorem 3.1, the cone-field
$$\mathscr{C}_\alpha(p) := \begin{cases} C(e, \alpha) & \text{if } p \in E^x, \\ C(e', \alpha) & \text{if } p \in E^y \setminus E^x, \end{cases}$$
is tangent to $E$ for every $\alpha > \pi/2$. This is an easy consequence of the property of $E^x$ and $E^y$ stated in Remark 3.2(ii).

Step 2. If we rotate the axes by an angle $\theta$ and perform the construction in Step 1, we obtain a new tangent cone-field $\mathscr{C}_{\theta,\alpha}$ which is equal either to $C(e_\theta, \alpha)$ or to $C(e'_\theta, \alpha)$ at every point of $E$, where $e_\theta := (\cos\theta, \sin\theta)$ and $e'_\theta := (-\sin\theta, \cos\theta)$.

Step 3. We observe that every countable intersection of tangent cone-fields is still a tangent cone-field, and then we set
$$\mathscr{C}(p) := \bigcap_{\theta, \alpha} \mathscr{C}_{\theta,\alpha}(p)$$
for every $p \in E$, where the intersection is taken over all $\alpha$ in a given countable dense subset of $(\pi/2, \pi)$ and all $\theta$ in a given countable dense subset of $[0, 2\pi]$. It is not difficult to check that $\mathscr{C}(p)$ is either a line or a point for every $p \in E$; if in the latter case we change it to an (arbitrarily chosen) line, we obtain a weak tangent field to $E$.

⁹ The standard terminology is $(\mathscr{H}^1, 1)$-purely unrectifiable, or 1-purely unrectifiable.

5. The rank-one property of BV functions

Given an open set $\Omega$ in $\mathbb{R}^d$, the space $BV(\Omega)$ of functions of bounded variation consists of all $u \in L^1(\Omega)$ whose distributional derivative $Du$ is (represented by) a bounded measure on $\Omega$ with values in $\mathbb{R}^d$. Let $\mu$ be a measure in the plane and $E$ a null set. By Theorem 4.3, $E$ admits a tangent field $\tau_E$ in the sense of Definition 4.1. Then the following holds:

Proposition 5.1. For every function $u \in BV(\mathbb{R}^2)$, the Radon–Nikodym density of the vector measure $Du$ with respect to $\mu$ is a map valued in $\mathbb{R}^2$ which satisfies
$$\frac{d(Du)}{d\mu}(x) \perp \tau_E(x) \quad \text{for } \mu\text{-a.e. } x \in E. \tag{5.1}$$
Proof. It is not difficult to see that it suffices to prove (5.1) when $\mu$ is equal to $|Du|$, the total variation of the vector measure $Du$. We need the following results about BV functions: the positive measure $|Du|$ can be disintegrated as

$$|Du| = \int_{\mathbb{R}} \mathcal{H}^1 \llcorner S_t \, d\mathcal{L}^1(t), \tag{5.2}$$

where each $S_t$ is a rectifiable set with finite $\mathcal{H}^1$-measure and $\mathcal{H}^1 \llcorner S_t$ denotes the restriction of $\mathcal{H}^1$ to the set $S_t$ (see footnote 10). Moreover, denoting by $\tau_t$ the approximate tangent field to $S_t$ (see Remark 4.2(ii) and (iii)), the Radon-Nikodym density of $Du$ with respect to $|Du|$ satisfies

$$\frac{d(Du)}{d|Du|}(x) \perp \tau_t(x) \quad \text{for $\mathcal{H}^1$-a.e. $x \in S_t$ and $\mathcal{L}^1$-a.e. $t \in \mathbb{R}$.} \tag{5.3}$$

More precisely, one takes $S_t$ equal to the reduced boundary of the superlevel set $\{x : u(x) \ge t\}$; thus $S_t$ is rectifiable by De Giorgi's theorem (see [3], Theorem 3.59), and (5.2) is a reformulation of the coarea formula for BV functions (see [3], Theorem 3.40). Formula (5.3), like identity (5.2), can also be derived with little extra work from the coarea formula (cf. [1], Theorem 1.12).

Let $t$ be fixed. Since $S_t$ is rectifiable, it can be covered, up to an $\mathcal{H}^1$-negligible subset, by countably many curves of class $C^1$, and the definition of weak tangent field yields

$$\tau_t(x) = \tau_E(x) \quad \text{for $\mathcal{H}^1$-a.e. } x \in S_t \cap E. \tag{5.4}$$

Finally, (5.3) and (5.4) imply $\frac{d(Du)}{d|Du|}(x) \perp \tau_E(x)$ for $\mathcal{H}^1$-a.e. $x \in S_t \cap E$ and $\mathcal{L}^1$-a.e. $t \in \mathbb{R}$. By (5.2), the same is true for $|Du|$-a.e. $x \in E$, and we have proved (5.1) for $\mu = |Du|$.

Footnote 10. Identity (5.2) should be read as follows: $|Du|(B)$ is equal to $\int \mathcal{H}^1(S_t \cap B) \, d\mathcal{L}^1(t)$ for every Borel set $B \subset \mathbb{R}^2$. Clearly, a certain Borel regularity of the map $t \mapsto S_t$ is assumed.
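As a simple illustration of (5.1), consider the following worked example. The choices of $u$, $E$ and $\mu$ are ours and are meant only as a sketch: let $u = 1_Q$ be the characteristic function of the open unit square $Q = (0,1)^2$, let $E = \partial Q$ (a null set), and let $\mu$ be the restriction of $\mathcal{H}^1$ to $\partial Q$. Writing $\nu_Q$ for the inner unit normal to $\partial Q$ (defined away from the four corners), the Gauss-Green formula gives

```latex
\[
Du = \nu_Q \, \mathcal{H}^1 \llcorner \partial Q ,
\qquad
|Du| = \mathcal{H}^1 \llcorner \partial Q = \mu ,
\qquad
\frac{d(Du)}{d\mu}(x) = \nu_Q(x) \perp \tau_E(x)
\quad \text{for $\mathcal{H}^1$-a.e. } x \in \partial Q ,
\]
```

because at $\mathcal{H}^1$-a.e. point of $\partial Q$ the tangent field $\tau_E$ is the direction of the side through that point, which is orthogonal to the normal $\nu_Q$.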
Proposition 5.1 implies the so-called rank-one property of BV functions, which was first proved by the first author, in a completely different way, in [1], Corollary 4.6. Recall that given a BV map $u : \Omega \subset \mathbb{R}^d \to \mathbb{R}^m$, the derivative $Du$ is a measure valued in $m \times d$ matrices, and the Radon-Nikodym density of $Du$ with respect to a positive measure $\mu$ is a map valued in $m \times d$ matrices.

Theorem 5.2. Let $u$ be a map in $BV(\Omega, \mathbb{R}^m)$, $\mu$ a positive measure on $\Omega$, and $E$ a null set in $\Omega$. Then

$$\operatorname{rank}\Big( \frac{d(Du)}{d\mu}(x) \Big) \le 1 \quad \text{for $\mu$-a.e. } x \in E. \tag{5.5}$$

In particular, the density of $Du$ with respect to any singular measure $\mu$ is valued in matrices of rank one or zero.

Proof. For $d = 2$, this statement is an immediate consequence of Proposition 5.1: denoting by $u_i$, $i = 1, \dots, m$, the components of $u$, the rows of the matrix $\frac{d(Du)}{d\mu}(x)$ are the vectors $\frac{d(Du_i)}{d\mu}(x)$; since all these vectors are orthogonal to $\tau_E(x)$, they are collinear, which means that the matrix has rank one or zero. For general $d$, the statement can be proved by reduction to the previous case. Indeed, the distributional derivative of $u$ can be reconstructed from the distributional derivatives of its restrictions to the planes parallel to the coordinate planes using a natural "slicing" formula (cf. [1], Proposition 1.10), and since the rank of an $m \times d$ matrix is one or zero if (and only if) the same holds for all its $m \times 2$ submatrices, the rank-one property of $Du$ is implied by the rank-one property of its restrictions to planes.

6. Mapping sets of positive measure onto balls

Among the problems meant to explore the geometric structure of sets with positive Lebesgue measure, the following one, proposed by M. Laczkovich, is particularly interesting:

Question 6.1. Given a compact set $K$ in $\mathbb{R}^d$ of positive Lebesgue measure, is there a Lipschitz map $\Phi : \mathbb{R}^d \to \mathbb{R}^d$ which takes $K$ onto a closed ball?
It is clearly equivalent to assume that $K$ is Borel, or to require that $\Phi(K)$ contains a ball, that is, it has non-empty interior. Looking at a density point of $K$, it is possible to find a ball $B$ such that the measure of $B \setminus K$ is extremely small compared to that of $B$. Thus one would expect that a perturbation $\Phi$ of the identity can be found, which maps $B \setminus K$ into a set with empty interior, so that $\Phi(K)$ contains $\Phi(B)$, and hopefully the latter set has non-empty interior. However, after a few attempts one realizes that, in dimension larger than one, making $\Phi$ Lipschitz and the interior of $\Phi(B)$ non-empty at the same time is quite difficult.

Proposition 6.2. The answer to Question 6.1 is positive for $d = 1$.

Proof. Let $\Phi : \mathbb{R} \to \mathbb{R}$ be a primitive of the characteristic function $1_K$, that is, $\Phi(x) := \mathcal{L}^1(K \cap (-\infty, x))$ for every $x \in \mathbb{R}$. Then $\Phi$ is constant on each connected component of the complement of $K$, and $\Phi(K)$ is equal to $\Phi(\mathbb{R})$, which is a non-trivial interval because $\Phi$ is not constant.

Theorem 6.3. The answer to Question 6.1 is positive for $d = 2$.

This theorem was first proved by the third author (a version of this proof will appear in [2]); a proof based on the Erdős-Szekeres theorem was then given by J. Matoušek in [11]. Question 6.1 is still open for $d \ge 3$. Before giving the proof of this result, we briefly review some naïve solutions, and explain why they do not work.

Attempt of solution for $d > 1$. A way to extend the construction in the proof of Proposition 6.2 is to solve the equation

$$\det(\nabla\Phi) = 1_K \tag{6.1}$$

on some smooth bounded domain $\Omega$ of $\mathbb{R}^d$ which contains $K$, imposing a Dirichlet boundary condition which guarantees that $\Phi(\Omega)$ contains a ball $B$. Because of (6.1), $\Phi(\Omega \setminus K)$ must be a null set, and therefore has empty interior, which implies that $\Phi(K)$ agrees with $\Phi(\Omega)$, and in particular contains $B$. The difficulty is that in general the equation $\det(\nabla\Phi) = g$ admits no Lipschitz solution even if the datum $g$ is continuous and strictly positive (see [16], [5]), and the situation gets no better when $g$ is discontinuous and takes the value zero.
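Returning for a moment to the one-dimensional case, the map $\Phi$ of Proposition 6.2 is easy to compute when $K$ is a finite union of closed intervals. The following sketch makes that simplifying assumption (the helper name `make_phi` and the sample set `K` are ours, chosen purely for illustration) and lets one check that $\Phi$ is constant on the gaps of $K$ and that its range is the interval $[0, \mathcal{L}^1(K)]$.

```python
def make_phi(intervals):
    """Return Phi(x) = L^1(K ∩ (-inf, x)) for K a finite union of
    disjoint closed intervals given as (a, b) pairs (an assumption
    made for this sketch only)."""
    ivs = sorted(intervals)

    def phi(x):
        total = 0.0
        for a, b in ivs:
            # Length of [a, b] ∩ (-inf, x), which is zero when x <= a.
            total += max(0.0, min(x, b) - a)
        return total

    return phi


# Hypothetical sample set K = [0, 1] ∪ [2, 2.5] ∪ [4, 4.25].
K = [(0.0, 1.0), (2.0, 2.5), (4.0, 4.25)]
phi = make_phi(K)
total_measure = sum(b - a for a, b in K)

if __name__ == "__main__":
    for x in (-1.0, 0.5, 1.0, 1.5, 2.0, 2.25, 3.0, 5.0):
        print(x, phi(x))
```

On each gap of $K$, for instance on $(1, 2)$, the function takes a single value, so the image of $K$ coincides with the image of the whole line.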
An iterative construction for $d = 1$. The function $\Phi$ in the proof of Proposition 6.2 can also be obtained by an iterative construction that might be extended to higher dimension. Given an interval $I = (a, b)$ in $\mathbb{R}$, we denote by $\Phi_I$ the function

$$\Phi_I(x) := \begin{cases} x & \text{if } x \le a, \\ a & \text{if } a < x < b, \\ x - (b - a) & \text{if } b \le x. \end{cases}$$
Thus $\Phi_I$ maps $I$ into a point and is measure-preserving in the complement of $I$. By composing maps of this type we can "remove" one by one all connected components $I$ in the complement of $K$. More precisely, we take $\Phi$ to be the limit of the functions $\Phi_n$ defined by induction on $n$ as follows: $\Phi_0(x) := x$ is the identity map, and $\Phi_n(x) := \Phi_{I_n}(\Phi_{n-1}(x))$ where $I_n$ is a bounded connected component of maximal length in the complement of $\Phi_{n-1}(K)$. It is easy to check that $\Phi$ is 1-Lipschitz, maps the complement of $K$ into a set of measure 0, and is measure preserving on $K$. In particular $\Phi(K)$ agrees with $\Phi(\mathbb{R})$ and has the same measure as $K$, and therefore is an interval of positive length.

Second attempt of solution for $d > 1$. One way of adapting the previous construction to higher dimension is the following: for every open ball $B$ in $\mathbb{R}^d$ we construct a Lipschitz map $\Phi_B : \mathbb{R}^d \to \mathbb{R}^d$ such that $\Phi_B(B)$ has measure zero, we choose a bounded open set $\Omega$ which contains $K$, and then let $\Phi$ be the limit of the maps $\Phi_n$ defined as follows: $\Phi_n(x) := \Phi_{B_n}(\Phi_{n-1}(x))$ where $B_n$ is, say, the largest open ball contained in $\Phi_{n-1}(\Omega) \setminus \Phi_{n-1}(K)$. In order to ensure that the limit $\Phi$ exists and is Lipschitz, the maps $\Phi_B$ must be (asymptotically) 1-Lipschitz. Now, the difficulty is that a 1-Lipschitz map which takes a ball into a null set is far from being measure-preserving on the complement. In other words, there is no easy way to prevent the sets $\Phi_n(\Omega)$ from collapsing to a set $\Phi(\Omega)$ of measure zero, and therefore with empty interior.

Proof of Theorem 6.3. The iterative construction described in the paragraphs above can be made to work in dimension $d = 2$ by removing suitably chosen strips that cover the complement of $K$. Given an $x$-strip $T = T^x(f, \delta)$, we define $\Phi_T : \mathbb{R}^2 \to \mathbb{R}^2$ by

$$\Phi_T(x, y) := \begin{cases} (x, y) & \text{if } y \le f(x) - \delta/2, \\ (x, f(x) - \delta/2) & \text{if } f(x) - \delta/2 < y < f(x) + \delta/2, \\ (x, y - \delta) & \text{if } f(x) + \delta/2 \le y. \end{cases}$$

Thus $\Phi_T$ maps $T$ into a null set, is measure preserving in the complement of $T$, and is 1-Lipschitz provided that $\mathbb{R}^2$ is endowed with the $\ell^\infty$-norm

$$\|(x, y)\| := \sup\{|x|, |y|\} \tag{6.2}$$

instead of the Euclidean norm. Moreover $\Phi_T$ maps any $x$-strip into (but not necessarily onto) another $x$-strip with the same thickness. Now we choose an open square $\Omega$ with side-length $r$ and parallel to the coordinate axes so that the set $A := \Omega \setminus K$ is small (the precise requirement will be made explicit later). By Remark 3.2(iii), we can cover $A$ using countably many $x$- or $y$-strips $T_n$ with thickness $\delta_n$ so that

$$\sum_{n=1}^{\infty} \delta_n \le 6 \sqrt{\mathcal{L}^2(A)}.$$
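As an aside, the collapsing map $\Phi_T$ defined above lends itself to a quick numerical sanity check. The sketch below assumes that the $x$-strip $T^x(f, \delta)$ is the set $\{(x, y) : |y - f(x)| < \delta/2\}$ with $f$ a 1-Lipschitz function, which is how we read the definition given earlier in the paper; the names `make_strip_map`, `sup_dist` and `worst_ratio`, as well as the choices $f = \sin$ and $\delta = 0.3$, are illustrative only. It samples random pairs of points and records the largest ratio between the distance of the images and the distance of the points, both measured in the norm (6.2).

```python
import math
import random


def make_strip_map(f, delta):
    """Build Phi_T for the strip {(x, y): |y - f(x)| < delta/2}.
    The strip is collapsed onto its lower edge and everything above
    it is shifted down by delta."""
    def phi_T(x, y):
        lower = f(x) - delta / 2
        if y <= lower:
            return (x, y)
        if y < lower + delta:
            return (x, lower)
        return (x, y - delta)

    return phi_T


def sup_dist(p, q):
    """Distance induced by the sup norm (6.2) on the plane."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def worst_ratio(phi, trials=20000, seed=0):
    """Largest observed ratio sup_dist(phi(p), phi(q)) / sup_dist(p, q)
    over random pairs of points in the square [-3, 3]^2."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        p = (rng.uniform(-3, 3), rng.uniform(-3, 3))
        q = (rng.uniform(-3, 3), rng.uniform(-3, 3))
        d = sup_dist(p, q)
        if d > 0:
            worst = max(worst, sup_dist(phi(*p), phi(*q)) / d)
    return worst


# Illustrative strip: f = sin is 1-Lipschitz, thickness delta = 0.3.
phi_T = make_strip_map(math.sin, 0.3)

if __name__ == "__main__":
    print("largest observed Lipschitz ratio:", worst_ratio(phi_T))
```

The observed ratio never exceeds 1 because the second component of $\Phi_T$ can be written as $\min\{y, \max\{f(x) - \delta/2, \, y - \delta\}\}$, a combination of functions that are 1-Lipschitz for the norm (6.2) whenever $f$ is 1-Lipschitz.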
We assume for the time being that all $T_n$ are $x$-strips, and take $\Phi$ equal to the limit of the maps $\Phi_n$ defined as follows: $\Phi_0(x) := x$ is the identity map, and $\Phi_n(x) := \Phi_{T'_n}(\Phi_{n-1}(x))$ where $T'_n$ is a strip of thickness $\delta_n$ which contains $\Phi_{n-1}(T_n)$. Thus $\Phi$ is 1-Lipschitz with respect to the norm (6.2) and maps $A$ into a null set, and therefore $\Phi(K)$ contains $\Phi(\Omega)$. Moreover $\Phi(\Omega)$ contains a rectangle $\Omega'$ with width $r$ and height

$$r - \sum_{n} \delta_n \ge r - 6 \sqrt{\mathcal{L}^2(A)}, \tag{6.3}$$

and has non-empty interior provided that $\mathcal{L}^2(A) < r^2/36$. Note that this inequality is verified by all squares $\Omega$ centered at a density point of $K$ and sufficiently small.

This proof works only if the strips $T_n$ are of the same type. In general, using only strips of one type we cannot cover all of $A$, but we can cover at least half of it, that is, a subset $B$ such that $\mathcal{L}^2(B) \ge \mathcal{L}^2(A)/2$. Hence the map $\Phi$ given above takes $B$ into a null set, and therefore $\Phi(K)$ contains $\Omega' \setminus A'$ where $A' := \Phi(A \setminus B)$ satisfies $\mathcal{L}^2(A') \le \mathcal{L}^2(A)/2$. This estimate, in combination with (6.3), allows one to iterate this construction countably many times, and finally obtain a map $\Phi$ such that $\Phi(K)$ contains a non-trivial rectangle.

Remark 6.4. The proof described above is closer to that in [11]. The proof presented in [2] gives a stronger result: the set $\Phi(\Omega \setminus K)$ is one-dimensional and rectifiable, and not just Lebesgue negligible. This proof uses maps $\Phi : \mathbb{R}^2 \to \mathbb{R}^2$ that remove at once countable unions of $x$-strips (or $y$-strips). Although it does not rely directly on the covering result proved in Theorem 3.1, the basic argument is close in spirit to the proof of Theorem 2.1.

7. Differentiability of Lipschitz maps on the plane

A large part of [2] is devoted to the structure of differentiability sets of Lipschitz maps. In this section, we address one of the basic questions about differentiability of Lipschitz maps, and state the results which have been obtained in dimension two. Some of the results in higher dimension are briefly mentioned in Subsection 8e.
A classical theorem of Rademacher states that a Lipschitz map $f : \mathbb{R}^d \to \mathbb{R}^m$ is differentiable $\mathcal{L}^d$-almost everywhere. Thus the question naturally arises of what happens if the Lebesgue measure $\mathcal{L}^d$ is replaced by a positive measure $\mu$. There are obvious examples of singular measures $\mu$ for which Rademacher's theorem does not hold: for instance, if $\mu$ is the restriction of the Hausdorff measure $\mathcal{H}^k$ to a $k$-dimensional surface $M$ (with $k < d$), then $f(x) := \operatorname{dist}(x, M)$ is differentiable $\mu$-almost nowhere. On the other hand, if $\mu$ is absolutely continuous with respect to Lebesgue measure, then every Lipschitz map is differentiable $\mu$-almost everywhere. So the question becomes: are there other measures for which Rademacher's theorem holds besides the absolutely continuous ones?
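To see the first example concretely, take $d = 2$, $k = 1$ and $M$ the $x$-axis (our choice, for illustration only), so that $\operatorname{dist}((x, y), M) = |y|$. The sketch below compares one-sided difference quotients at a point of $M$ in the direction normal to $M$; the function names are hypothetical. The quotients from the two sides stay near $+1$ and $-1$ as the step shrinks, so the distance function has no derivative at that point.

```python
def dist_to_x_axis(x, y):
    """Distance from the point (x, y) to the line M = {y = 0}."""
    return abs(y)


def difference_quotient(f, p, e, h):
    """(f(p + h e) - f(p)) / h for a point p and a direction e in the plane."""
    return (f(p[0] + h * e[0], p[1] + h * e[1]) - f(p[0], p[1])) / h


p = (0.3, 0.0)        # a point of M
normal = (0.0, 1.0)   # direction normal to M
tangent = (1.0, 0.0)  # direction tangent to M
steps = [10.0 ** (-k) for k in range(1, 8)]

# Quotients from above and from below M in the normal direction.
right = [difference_quotient(dist_to_x_axis, p, normal, h) for h in steps]
left = [difference_quotient(dist_to_x_axis, p, normal, -h) for h in steps]
# Along M the function vanishes identically, so these quotients are zero.
along = [difference_quotient(dist_to_x_axis, p, tangent, h) for h in steps]

if __name__ == "__main__":
    print("from above:", right)
    print("from below:", left)
    print("along M:   ", along)
```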
This question can be refined by asking for which sets $E$ in $\mathbb{R}^d$ there exists a Lipschitz map which is nowhere differentiable on $E$. By Rademacher's theorem, all these sets must be Lebesgue-negligible, but is this condition also sufficient? These two questions can be restated as follows:

Question 7.1. Weak formulation: given a singular measure $\mu$ in $\mathbb{R}^d$, is there a Lipschitz map $f : \mathbb{R}^d \to \mathbb{R}^m$ which is differentiable $\mu$-almost nowhere? Strong formulation: given a null set $E$ in $\mathbb{R}^d$, is there a Lipschitz map $f : \mathbb{R}^d \to \mathbb{R}^m$ which is differentiable at no point of $E$?

Remark 7.2. (i) In the weak formulation, it does not matter whether $f$ is scalar or vector-valued. The reason is the following lemma: given a positive measure $\mu$ on $\mathbb{R}^d$ and a sequence of functions $f_n : \mathbb{R}^d \to \mathbb{R}$ which are uniformly Lipschitz, and uniformly bounded at one point, there exist $\alpha_n \in [0, 2^{-n}]$ such that the non-differentiability set of $f := \sum_n \alpha_n f_n$ agrees with the union of the non-differentiability sets of $f_n$, up to a $\mu$-negligible subset. In fact, this holds for almost every choice of the coefficients $\alpha_n$.

(ii) Whether $f$ is scalar or vector-valued does matter for the strong formulation of Question 7.1. Indeed, the third author showed in [15], Corollary 6.5, that there exist null sets $E$ in the plane such that every scalar Lipschitz function $f : \mathbb{R}^2 \to \mathbb{R}$ is differentiable in at least one point of $E$ (but for the same sets there also exist Lipschitz maps $f : \mathbb{R}^2 \to \mathbb{R}^2$ which are nowhere differentiable on $E$).

Proposition 7.3. The answer to Question 7.1 in the strong formulation is positive for $d = 1$.

Remark 7.4. (i) Proposition 7.3 is an immediate corollary of the following lemma: given a null set $E$ in $\mathbb{R}$, there exists a set $F$ of positive and finite Lebesgue measure with upper density 1 and lower density 0 at every point of $E$. Then a primitive of $1_F$, e.g., $f(x) := \mathcal{L}^1(F \cap (-\infty, x))$, is a Lipschitz function that is not differentiable at any point of $E$.
(ii) A more precise statement is proved in [19]: a set $E$ in the line is the non-differentiability set of a Lipschitz function if and only if it is a $G_{\delta\sigma}$ set (a countable union of countable intersections of open sets) and has Lebesgue measure zero.

Theorem 7.5 (see [2]). The answer to Question 7.1 in the strong formulation is positive for $d = 2$.

Remark 7.6. Given a null set $E$, the construction in [2] yields a Lipschitz map $f : \mathbb{R}^2 \to \mathbb{R}^2$ which is not differentiable at each $x \in E$ in the sense, stronger than the usual one, that the directional derivative $D_e f(x)$ does not exist for at least one direction $e \in \mathbb{R}^2$ (depending on $x$).

Question 7.1 is open for $d \ge 3$, both in the weak and strong formulation.
In the rest of this section we will recall an important class of Lipschitz functions with "large" non-differentiability sets, the distance functions, then describe a direct construction to prove Proposition 7.3, and briefly discuss its extension to dimension two.

Distance functions of porous sets. A typical example of a non-smooth Lipschitz function on $\mathbb{R}^d$ is the distance function of a closed set $E$, namely $d_E(x) := \operatorname{dist}(x, E)$. It is not difficult to see that $d_E$ is not differentiable at $x \in E$ if (and only if) there exists a sequence of open balls $B_{r_n}(x_n)$ contained in the complement of $E$, such that $x_n$ converge to $x$ and $|x_n - x| \le O(r_n)$. A set $E$ which satisfies this condition at every point is called porous; in this case the function $d_E$ is not differentiable at any point of $E$.

Let $\mu$ be a positive measure on $\mathbb{R}^d$, and assume that there are countably many porous sets $E_n$ which cover $\mu$-almost every point. Then there exists a linear combination of the distance functions $d_{E_n}$ which is differentiable $\mu$-almost nowhere (cf. Remark 7.2(i)). Unfortunately, this construction does not settle Question 7.1, because for every $d \ge 1$ there are singular measures $\mu$ on $\mathbb{R}^d$ such that every porous set is $\mu$-negligible (see footnote 11).

Footnote 11. It is not difficult to prove that given a positive measure $\mu$ and a point $x \in \mathbb{R}^d$ such that the support of every tangent measure to $\mu$ at $x$ is $\mathbb{R}^d$, then $x$ cannot be a $\mu$-density point for any porous set $E$. On the other hand, there are examples of singular measures $\mu$ on $\mathbb{R}^d$ whose tangent measures at $x$ are all multiples of the Lebesgue measure for $\mu$-a.e. $x$ (cf. [14], Example 5.9(1)); hence the set of $\mu$-density points of any porous set $E$ is $\mu$-negligible, which implies that $E$ itself is $\mu$-negligible. For further details on the notion of tangent measure see [14], Chapter 2, or [13], Chapter 14.

A direct construction for $d = 1$. Let $E$ be a compact null set in $\mathbb{R}$. Then we can find a decreasing sequence of bounded open sets $A_n$ which contain $E$ and satisfy the following property:

$$\mathcal{L}^1(A_n) \le 2^{-n} \mathcal{L}^1(I) \tag{7.1}$$

for every connected component $I$ of $A_{n-1}$ (since $E$ is compact, we can assume that $A_{n-1}$ has only finitely many connected components). We denote by $g_n$ a primitive of the characteristic function of $A_n$, and set

$$f_n(x) := \sum_{m=1}^{n} (-1)^{m+1} g_m(x) \quad \text{and} \quad f(x) := \lim_{n \to +\infty} f_n(x). \tag{7.2}$$

Using that the sets $A_n$ are decreasing, it is not difficult to show that each $f_n$ is 1-Lipschitz, and so is the limit $f$ (cf. Figure 3 below). We claim that $f$ is not differentiable at any $x \in E$. Fix an odd integer $n$, and denote by $I$ the closure of the connected component of $A_n$ which contains $x$. Then $f_n$ is affine with derivative 1 on $I$, and therefore for every $y \in I$ there holds

$$\frac{f(y) - f(x)}{y - x} \ge \frac{f_n(y) - f_n(x)}{y - x} - \sum_{m=n+1}^{\infty} \frac{|g_m(y) - g_m(x)|}{|y - x|} = 1 - \sum_{m=n+1}^{\infty} \frac{\mathcal{L}^1(A_m \cap [x, y])}{|y - x|} \ge 1 - \sum_{m=n+1}^{\infty} \frac{\mathcal{L}^1(A_m)}{|y - x|}.$$

Now, (7.1) implies $\mathcal{L}^1(A_m) \le 2^{-m} \mathcal{L}^1(I)$ for every $m > n$, and choosing $y_n \in I$ such that $|y_n - x| \ge \mathcal{L}^1(I)/2$ we obtain

$$\frac{f(y_n) - f(x)}{y_n - x} \ge 1 - \sum_{m=n+1}^{\infty} \frac{2^{-m} \mathcal{L}^1(I)}{|y_n - x|} = 1 - \frac{2^{-n} \mathcal{L}^1(I)}{|y_n - x|} \ge 1 - 2^{1-n}.$$

Thus the upper derivative of $f$ at $x$ is 1. If $n$ is even, the function $f_n$ is affine with derivative 0 on $I$, and choosing $y_n$ as above we obtain a sequence which shows that the lower derivative of $f$ at $x$ is 0.
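The oscillation of the difference quotients in the argument above can be observed numerically in the simplest possible case. The sketch below rests on several assumptions of ours: $E = \{0\}$, $A_n = (-r_n, r_n)$ with $r_0 = 1$ and $r_n = 2^{-n} r_{n-1}$ (so that (7.1) holds with equality), primitives $g_m$ normalized to vanish at $0$, the signs $(-1)^{m+1}$ in (7.2) (the choice for which $f_n$ has slope 1 on $A_n$ when $n$ is odd), and a truncation of the series after 12 terms. All function names are hypothetical.

```python
def radius(n):
    """Half-length r_n of A_n = (-r_n, r_n): r_0 = 1, r_n = 2**-n * r_{n-1}."""
    return 2.0 ** (-(n * (n + 1) // 2))


def g(m, x):
    """Primitive of the indicator function of A_m, normalized so g(m, 0) = 0."""
    r = radius(m)
    return max(-r, min(x, r))


def f(x, terms=12):
    """Partial sum of the alternating series (7.2), truncated after `terms`
    summands; the neglected tail is far below double precision here."""
    return sum((-1) ** (m + 1) * g(m, x) for m in range(1, terms + 1))


def quotient(n):
    """Difference quotient of f at x = 0 towards the endpoint y_n = r_n
    of the closure of A_n."""
    y = radius(n)
    return (f(y) - f(0.0)) / y


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, quotient(n))
```

For odd $n$ the quotients are bounded below by $1 - 2^{1-n}$, and for even $n$ they are bounded above by $2^{1-n}$, so along $y_n \to 0$ they accumulate at both 1 and 0.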
[Figure 3: the functions $f_1$, $f_2$, $f_3$ and the components of $A_1$, $A_2$, $A_3$.]

If the set $E$ is not compact, condition (7.1) may not be satisfied by any sequence of open sets $A_n$. To make the proof work in this case, one has to replace inequality (7.1) by $\mathcal{L}^1(A_n \cap I) \le 2^{-n} \mathcal{L}^1(I)$.

Extension to dimension $d = 2$. A naïve way to extend the construction in the previous paragraph to the plane would be the following: given a null set $E$, we write $E$ as $E^x \cup E^y$ as in Theorem 3.1, and construct $f$ for $E^x$ and $E^y$ separately. To construct a Lipschitz function $f$ which is not differentiable on $E^x$, we take a decreasing sequence of open sets $A_n$ so that each $A_n$ is a union of $x$-strips which cover $E^x$ and satisfy a suitable counterpart of (7.1); then we define $f$ as in (7.2), where now $g_n$ is a Lipschitz function on the plane whose partial derivative $D_y g_n$ is the characteristic function of $A_n$. Then $f$ is not differentiable in the $y$ direction at any point of $E^x$. There is, however, a serious problem: the partial derivatives $D_x g_n$ are all of order one, but, unlike the partial derivatives $D_y g_n$, do not cancel each other when summed, that is, the partial derivatives $D_x f_n$ may fail to be uniformly bounded, and $f$ may fail to be Lipschitz. Since $|D_x g_n|$ is bounded by the Lipschitz constant of the $x$-strips which cover $A_n$, this difficulty can be by-passed using strips with smaller and smaller Lipschitz constant; in turn, this requires a suitable refinement of Theorem 3.1, and a careful truncation-and-localization argument. The price to pay is that the resulting function $f$ could still be differentiable at some point of $E$. However, it is possible to tune the construction parameters so that these "bad" points are $\mu$-negligible with respect to a prescribed singular measure $\mu$, and this suffices to answer the weak formulation of Question 7.1 in the positive. The construction required for the strong formulation is considerably more complicated.

8. Further results and open problems

In this section we briefly discuss the extension to higher dimension of Theorem 3.1, and of other results from the previous sections. Since many relevant questions are still unanswered even in dimension three, the following discussion will be sometimes restricted to this case.

8a. Covering of finite sets. As usual, $x, y, z$ denote the coordinates of points in the space. Given $L > 0$, an $x$-surface of constant $L$ in the space is the graph of an $L$-Lipschitz function $x = x(y, z)$ defined for all $(y, z) \in \mathbb{R}^2$, while an $x$-curve of constant $L$ is the graph of an $L$-Lipschitz map $(y = y(x), z = z(x))$ defined for all $x \in \mathbb{R}$. The definitions of $y$- and $z$-surfaces and curves are the obvious ones.

Proposition 8.1. Every set $S$ of $n$ points in the space can be covered by $n^{1/3}$ $x$-surfaces with constant 1 and $n^{2/3}$ $x$-curves with constant 1.

The proof of Proposition 8.1 is a straightforward adaptation of that of Theorem 2.1. However, this result has limited applications, and the generalization of Theorem 2.1 with wider impact would be another:

Question 8.2 (see [11], [12]).
Are there finite positive constants $L, M$ such that any set $S$ of $n$ points in the space can be covered by $M n^{1/3}$ $x$-, $y$-, or $z$-surfaces with constant $L$?

In [2], we answer this question in the negative. To do this, we first show that a positive answer is equivalent to the following statement: there exists a finite constant $L$ such that, for every set $S$ of $n$ points in the space,

$$\max\{\sigma_{x,L}(S), \sigma_{y,L}(S), \sigma_{z,L}(S)\} \ge n^{2/3}, \tag{8.1}$$

where $\sigma_{x,L}(S)$ is the largest number of points of $S$ covered by a single $x$-surface of constant $L$, and so on. Now, equality holds in (8.1) if $S$ is the product of three sets $S_x, S_y, S_z$ in $\mathbb{R}$ with the same cardinality, and we could show that (8.1) fails for suitable "perturbations" of these product sets.

A weaker, but very interesting version of Question 8.2 is still open:
Question 8.3. Are there finite positive constants $L, M$ such that the following holds: for every set $S$ of $n$ points in the space it is possible to choose the coordinate axes so that $S$ can be covered by $M n^{1/3}$ $x$-, $y$-, or $z$-surfaces with constant $L$?

8b. Covering of null sets. Theorem 2.1 implies, via a discretization argument, Theorem 3.1. We can use Proposition 8.1 in the same way and prove the following:

Proposition 8.4. Every null set $E$ in the space can be written as $E^x \cup E_x$ where $E^x$ can be covered by $\delta_i$-neighborhoods of $x$-surfaces $S_i$ of constant 1 with $\sum_i \delta_i$ arbitrarily small, and $E_x$ can be covered by $\eta_i$-neighborhoods of $x$-curves $C_i$ of constant 1 with $\sum_i \eta_i^2$ arbitrarily small.

But again, the covering result which would be most useful is another:

Question 8.5. Is there a constant $L$ such that every null set $E$ in the space can be covered by $\delta_i$-neighborhoods of $x$-, or $y$-, or $z$-surfaces $S_i$ of constant $L$ so that $\sum_i \delta_i$ is arbitrarily small?

A positive answer to this question would imply positive answers to all open questions listed in this paper, with the notable exception of Question 6.1 for $d = 3$, for which this covering result may not be sufficient.

8c. Tangent field to a null set. We consider here a possible generalization of the notion of weak tangent field to higher dimension. Let $E$ be a set in $\mathbb{R}^d$, $\tau$ a map from $E$ into the Grassmann manifold $G(d, d-1)$ of hyperplanes in $\mathbb{R}^d$, and $k$ an integer between 1 and $d - 1$. We say that $\tau$ is $k$-weakly tangent to $E$ if for every $k$-dimensional surface $S$ of class $C^1$ there holds

$$\tau_S(x) \subset \tau(x) \quad \text{for $\mathcal{H}^k$-a.e. } x \in S \cap E. \tag{8.2}$$

If $\tau$ is $k$-weakly tangent to $E$, then it is also $h$-weakly tangent for $h$ greater than $k$, but not necessarily for $h$ smaller (see footnote 12). Using Proposition 8.4, we can show that every null set in $\mathbb{R}^3$ admits a 2-weak tangent field, but we do not know if every null set in $\mathbb{R}^3$ admits a 1-weak tangent field. Of course, this would be the case if Question 8.5 were given a positive answer.

8d. Geometric structure of one-dimensional normal currents. A 1-dimensional normal current in $\mathbb{R}^d$ is an $\mathbb{R}^d$-valued, bounded measure $T$ on $\mathbb{R}^d$ whose distributional divergence is (represented by) a finite measure (see footnote 13).

Footnote 12. Indeed, the notion of $k$-tangent field is stable under arbitrary modifications of $\tau$ in an $\mathcal{H}^k$-negligible set, including $h$-dimensional surfaces of class $C^1$, while this is clearly not true for the notion of $h$-tangent field.

Footnote 13. The usual definition of $k$-dimensional normal current looks quite different from this one, but turns out to be equivalent for $k = 1$ (for more details see [17], Chapter 6, or [9], Section 4.1). We have not included in this paper the results about general normal currents, because they are too technical.
Since $T$ is a bounded measure, it can be written as $T = \tau \cdot \mu$ where $\mu$ is a positive measure and $\tau$ is an $\mathbb{R}^d$-valued density. It is proved in [2] that if $E$ is a null set and $\tau_E$ is a 1-weak tangent field to $E$ (see the previous subsection), then $\tau(x)$ belongs to the hyperplane $\tau_E(x)$ at $\mu$-a.e. $x \in E$. An immediate consequence of this observation, and of the fact that every null set $E$ in $\mathbb{R}^2$ admits a 1-weak tangent field (Theorem 4.3), is the following:

Proposition 8.6. Let $T_1 = \tau_1 \cdot \mu_1$ and $T_2 = \tau_2 \cdot \mu_2$ be 1-dimensional normal currents on $\mathbb{R}^2$, and let $\mu$ be a positive measure absolutely continuous with respect to $\mu_1$ and $\mu_2$, such that $\tau_1(x)$ and $\tau_2(x)$ span $\mathbb{R}^2$ for $\mu$-a.e. $x$. Then $\mu$ is absolutely continuous with respect to the Lebesgue measure.

Remark 8.7. (i) Since every gradient rotated by $\pi/2$ is a divergence-free vector-field, Proposition 8.6 implies the rank-one property for BV functions on $\mathbb{R}^2$ (cf. Theorem 5.2).

(ii) The following definition of tangent bundle of a positive measure $\mu$ on $\mathbb{R}^d$ has been used in the framework of shape optimization problems (see [10], [4], and references therein): given $p \in [1, +\infty]$, the tangent bundle $T^p_\mu(x)$ is the $\mu$-essential span of all vector-fields $v \in L^q(\mu)$ such that the (distributional) divergence of $v \cdot \mu$ belongs to $L^q(\mu)$, where $q$ denotes as usual the conjugate exponent to $p$. If $\mu$ is a singular measure on $\mathbb{R}^2$, then $\mu$ is supported on a null set $E$ (i.e., $\mu(\mathbb{R}^2 \setminus E) = 0$), and therefore $T^p_\mu(x)$ is contained in the tangent field to $E$, which exists by Theorem 4.3. In particular $T^p_\mu(x)$ has dimension at most 1 for $\mu$-a.e. $x$. In the plane, this answers in the positive a question raised in [10].

It is not known if Proposition 8.6 holds in higher dimension: let $T_i = \tau_i \cdot \mu_i$, $i = 1, 2, 3$, be 1-dimensional normal currents on $\mathbb{R}^3$, and let $\mu$ be a positive measure absolutely continuous with respect to all $\mu_i$, such that the vectors $\tau_i(x)$ span $\mathbb{R}^3$ for $\mu$-a.e. $x$. Is $\mu$ absolutely continuous with respect to Lebesgue measure?
The answer would be clearly yes if every null set in $\mathbb{R}^3$ admitted a 1-weak tangent field. This is probably the weakest of all corollaries that a positive answer to Question 8.5 would yield (thus the most interesting to disprove).

8e. Differentiability of Lipschitz maps in higher dimension. The problem of characterizing those sets $E$ in $\mathbb{R}^d$ such that there exists a Lipschitz map on $\mathbb{R}^d$ which is nowhere differentiable on $E$ (cf. Section 7) has also been solved in [2] for any dimension $d$. However, the characterization for $d > 2$ is not as simple as that in the planar case; whether it can be simplified or not is an open problem. We begin with a definition:

Definition 8.8. Given a unit vector $e$ in $\mathbb{R}^d$ and an angle $\alpha \in (0, \pi)$, $C = C(e, \alpha)$ denotes the two-sided closed cone with axis $e$ and amplitude $\alpha$ (cf. formula (4.2)). A set $E \subset \mathbb{R}^d$ is called $C$-null if for every $\varepsilon > 0$ there exists an open set $A$ such that $E \subset A$ and

$$\mathcal{H}^1(A \cap S) \le \varepsilon$$

for every curve $S$ of class $C^1$ which satisfies $\tau_S \subset C$ at every point (see footnote 14). Finally, we denote by $\mathcal{N}$ the $\sigma$-ideal of all sets $E \subset \mathbb{R}^d$ which satisfy the following condition: for every $\alpha < \pi$, $E$ can be covered by countably many sets $E_i$ so that each $E_i$ is $C_i$-null for some cone $C_i$ with amplitude $\alpha$.

Theorem 8.9 (see [2]). Given a set $E \subset \mathbb{R}^d$, there exists a Lipschitz map $f : \mathbb{R}^d \to \mathbb{R}^m$, $m \ge d$, which is differentiable at no point of $E$ if and only if $E \in \mathcal{N}$.

Remark 8.10. (i) Theorem 8.9 characterizes the subsets of non-differentiability sets of Lipschitz maps. We do not have a complete characterization of non-differentiability sets.

(ii) The map $f$ constructed in [2] is not differentiable at each $x \in E$ in the sense that there exists at least one direction $e \in \mathbb{R}^d$ of non-differentiability, that is, the directional derivative $D_e f(x)$ does not exist.

(iii) In the construction in [2] we need that $m \ge d$. On the other hand, from Remark 7.2(ii) we know that $m$ cannot be 1, and the results of [6] give a strong indication that $m$ must be at least $d$.

(iv) Theorem 3.1 shows that every null set $E \subset \mathbb{R}^2$ can be written as $E_1 \cup E_2$ so that each $E_i$ is $C(e_i, \alpha)$-null where $\{e_1, e_2\}$ is any orthonormal basis of $\mathbb{R}^2$ and $\alpha$ is any angle such that $\alpha < \pi/2$ (cf. Remark 3.2(ii)). It can be proved with some additional work that $E$ belongs to $\mathcal{N}$. This remark and Theorem 8.9 imply Theorem 7.5.

(v) Theorem 8.9 leaves many questions open. The most important one is: does every null set $E$ in $\mathbb{R}^d$ belong to $\mathcal{N}$? This would be the case if Question 8.5 were given a positive answer. In fact, we do not even know if a set which is $C$-null for one cone belongs to $\mathcal{N}$.

(vi) If the set $E$ is $C$-null, then $\mathcal{H}^1(E \cap S) = 0$ for every curve $S$ such that $\tau_S \subset C$ at every point. The converse is true if $E$ is compact, but we do not know if the same holds when $E$ is a $G_\delta$ set (countable intersection of open sets); if so, the definition of $\mathcal{N}$ would become significantly simpler.
The notion of non-differentiability of a map $f$ at a point $x \in \mathbb{R}^d$ can be strengthened by requiring more than one direction of non-differentiability. For instance, a natural generalization of Question 7.1 is the following: for which sets $E \subset \mathbb{R}^d$ does there exist a Lipschitz map on $\mathbb{R}^d$ which is not differentiable in any direction (i.e., no directional derivative exists) at every point of $E$? Rademacher's theorem in dimension 1 implies that every Lipschitz map on $\mathbb{R}^d$ is differentiable in the direction $\tau_S$ at $\mathcal{H}^1$-a.e. point of every curve $S$ of class $C^1$. Hence such a set $E$ must satisfy $\mathcal{H}^1(E \cap S) = 0$ for every curve $S$, that is, $E$ is purely unrectifiable (cf. Remark 4.2(iv)). With a little more work one can show that $E$ must be $C$-null with respect to every cone $C$ (cf. Definition 8.8), what we call a uniformly purely unrectifiable (u.p.u.) set. This condition turns out to be also sufficient:

Footnote 14. It is essential that $S$ is of class $C^1$ and connected: were we to consider Lipschitz curves, the class of admissible $S$ should be defined more carefully.

Theorem 8.11 (see [2]). Given a set $E$ in $\mathbb{R}^d$, there exists a Lipschitz function $f : \mathbb{R}^d \to \mathbb{R}$ which is not differentiable in any direction at every point of $E$ if (and only if) $E$ is uniformly purely unrectifiable.

Remark 8.12. A u.p.u. set $E$ is also purely unrectifiable, and the converse is true if $E$ is compact. We do not know if the same holds when $E$ is a Borel set, or even a $G_\delta$ set (cf. Remark 8.10(vi)).

The mysterious vector-field. Let $f$ be a Lipschitz map on $\mathbb{R}^2$. As pointed out before Theorem 8.11, the set of points where $f$ is not differentiable in any direction is u.p.u. Moreover, it can be proved that the set of points where $f$ admits at least two different directions of differentiability but is not differentiable is u.p.u., too. This remark and Theorem 7.5 imply the existence, for every null set $E$ in the plane, of a map $\tau : E \to G(2, 1)$ with the following property: every Lipschitz map $f : \mathbb{R}^2 \to \mathbb{R}^m$ is differentiable in the direction $\tau$ at every point of $E$ except a u.p.u. subset. Moreover $\tau$ is unique up to a u.p.u. subset of $E$ (see footnote 15). It is not difficult to show that $\tau$ must agree with the weak tangent field to $E$ (see Definition 4.1 and Theorem 4.3) except in a p.u. subset of $E$ (see footnote 16). As it happens, the definition of $\tau$ came before that of weak tangent field, and since we found this object quite puzzling, we referred to it as the "mysterious vector-field". Were the class of u.p.u. sets strictly contained in that of p.u. sets, the definition of $\tau$ would not be equivalent to that of weak tangent field, but, in a still mysterious way, more precise.

Footnote 15. To construct $\tau$, we take a map $\bar{f}$ which is nowhere differentiable on $E$ (Theorem 7.5) and set $\tau(x)$ equal to the direction of differentiability of $\bar{f}$ at $x$; as remarked above, such a direction exists and is unique for all $x \in E$ except a u.p.u. set. Given any other Lipschitz map $f$, we know that $(f, \bar{f})$ must be differentiable in at least one direction at every point of $E$ except a u.p.u. set, and since the unique direction of differentiability of $\bar{f}$ is $\tau$, $f$ must be differentiable in the direction $\tau$ for all points of $E$ except a u.p.u. subset. The uniqueness of $\tau$ follows by the existence of $\bar{f}$.

Footnote 16. If not, we could find a curve $S$ of class $C^1$ and a Lipschitz map $f$ such that $f$ is not differentiable in the direction $\tau_S$ for a subset with positive length of $S \cap E$, and this contradicts Rademacher's theorem in dimension 1.

References

[1] G. Alberti: Rank one property for derivatives of functions with bounded variation. Proc. Roy. Soc. Edinburgh Sect. A, 123 (1993), 239–274.
[2] G. Alberti, M. Csörnyei, D. Preiss: paper in preparation.
[3] L. Ambrosio, N. Fusco, D. Pallara: Functions of bounded variation and free discontinuity problems. Oxford Mathematical Monographs. Oxford Science Publications, Oxford, 1999.
[4] G. Bouchitté, G. Buttazzo: Characterization of optimal shapes and masses through Monge-Kantorovich equation. J. Eur. Math. Soc. (JEMS), 3 (2001), 139–168.
[5] D. Burago, B. Kleiner: Separated nets in Euclidean space and Jacobians of bi-Lipschitz maps. Geom. Funct. Anal., 8 (1998), 273–282.
[6] T. de Pauw, P. Huovinen: Points of ε-differentiability of Lipschitz functions from $\mathbb{R}^n$ to $\mathbb{R}^{n-1}$. Bull. London Math. Soc., 34 (2002), 539–550.
[7] R.P. Dilworth: A decomposition theorem for partially ordered sets. Ann. of Math. (2), 51 (1950), 161–166.
[8] P. Erdős, G. Szekeres: A combinatorial problem in geometry. Compositio Math., 2 (1935), 463–470.
[9] H. Federer: Geometric measure theory. Grundlehren der mathematischen Wissenschaften, 153. Springer, New York, 1969. Reprinted in the series Classics in Mathematics. Springer, Berlin-Heidelberg, 1996.
[10] I. Fragalà, C. Mantegazza: On some notions of tangent space to a measure. Proc. Roy. Soc. Edinburgh Sect. A, 129 (1999), 331–342.
[11] J. Matoušek: On Lipschitz mappings onto a square. In The mathematics of Paul Erdős, II, 303–309 (Algorithms Combin., 14), Springer, Berlin, 1997.
[12] J. Matoušek: A lower bound on the size of Lipschitz subsets in dimension 3. Combin. Probab. Comput., 12 (2003), 427–430.
[13] P. Mattila: Geometry of sets and measures in Euclidean spaces. Fractals and rectifiability. Cambridge Studies in Advanced Mathematics, 44. Cambridge University Press, Cambridge, 1995.
[14] D. Preiss: Geometry of measures in $\mathbb{R}^n$: distribution, rectifiability, and densities. Ann. of Math. (2), 125 (1987), 537–643.
[15] D. Preiss: Differentiability of Lipschitz functions on Banach spaces. J. Funct. Anal., 91 (1990), 312–345.
[16] D. Preiss: Additional regularity for Lipschitz solutions of PDE. J. Reine Angew. Math., 485 (1997), 197–207.
[17] L. Simon: Lectures on geometric measure theory. Proceedings of the Centre for Mathematical Analysis, 3. Australian National University, Centre for Mathematical Analysis, Canberra, 1983.
[18] J.M. Steele: Variations on the monotone subsequence theme of Erdős and Szekeres. In Discrete probability and algorithms (Minneapolis, 1993), 111–131, IMA Vol. Math. Appl. 72, Springer, New York, 1995.
[19] Z. Zahorski: Sur l’ensemble des points de non-dérivabilité d’une fonction continue. Bull. Soc. Math. France, 74 (1946), 147–178.

Giovanni Alberti
Dipartimento di Matematica, Università di Pisa
L.go Pontecorvo 5, 56127 Pisa, Italy
e-mail: [email protected]

Marianna Csörnyei and David Preiss
Department of Mathematics, University College London
Gower Street, London WC1E 6BT, UK
e-mail: [email protected]
e-mail: [email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
Some Open Questions about Symplectic 4-manifolds, Singular Plane Curves and Braid Group Factorizations

Denis Auroux

Abstract. The topology of symplectic 4-manifolds is related to that of singular plane curves via the concept of branched covers. Thus, various classification problems concerning symplectic 4-manifolds can be reformulated as questions about singular plane curves. Moreover, using braid monodromy, these can in turn be reformulated in the language of braid group factorizations. While the results mentioned in this paper are not new, we hope that they will stimulate interest in these questions, which remain essentially wide open.
Partially supported by NSF grant DMS-0244844.

1. Introduction

An important problem in 4-manifold topology is to understand which manifolds carry symplectic structures (i.e., closed non-degenerate 2-forms), and to develop invariants that can distinguish symplectic manifolds. Additionally, one would like to understand to what extent the category of symplectic manifolds is richer than that of Kähler (or complex projective) manifolds. For example, one would like to identify a set of surgery operations that can be used to turn an arbitrary symplectic 4-manifold into a Kähler manifold, or two symplectic 4-manifolds with the same classical topological invariants (fundamental group, Chern numbers, ...) into each other. Similar questions may be asked about singular curves inside, e.g., the complex projective plane.

The two types of questions are related to each other via symplectic branched covers. A branched cover of a symplectic 4-manifold with a (possibly singular) symplectic branch curve carries a natural symplectic structure. Conversely, every compact symplectic 4-manifold is a branched cover of $\mathbb{CP}^2$, with a branch curve presenting nodes (of both orientations) and complex cusps as its only singularities.

In the language of branch curves, the failure of most symplectic manifolds to admit integrable complex structures translates into the failure of most symplectic branch curves to be isotopic to complex curves. While the symplectic isotopy problem has a negative answer for plane curves with cusp and node singularities, it is interesting to investigate this failure more precisely. Various partial results have been obtained recently about situations where isotopy
holds (for smooth curves; for curves of low degree), and about isotopy up to stabilization or regular homotopy. On the other hand, many known examples of non-isotopic curves can be understood in terms of braiding along Lagrangian annuli (or equivalently, Luttinger surgery of the branched covers), leading to some intriguing open questions about the topology of symplectic 4-manifolds versus that of Kähler surfaces.

If one prefers to adopt a more group-theoretic point of view, it is possible to use braid monodromy techniques to reformulate these questions in terms of words in braid groups. For example, the classification of symplectic 4-manifolds reduces in principle to a (hard) question about factorizations in the braid group, known as the Hurwitz problem.

In the following sections, we discuss these various questions and the connections between them, starting from the point of view of symplectic 4-manifolds (in Sect. 2), then translating them in terms of plane branch curves (in Sect. 3) and finally braid group factorizations (in Sect. 4).
2. Topological questions about symplectic 4-manifolds

2.1. Classification of symplectic 4-manifolds. Recall that a symplectic manifold is a smooth manifold equipped with a 2-form $\omega$ such that $d\omega = 0$ and $\omega \wedge \cdots \wedge \omega$ is a volume form. The first examples of compact symplectic manifolds are compact oriented surfaces (taking $\omega$ to be an arbitrary area form), and the complex projective space $\mathbb{CP}^n$ (equipped with the Fubini–Study Kähler form). More generally, since any submanifold to which $\omega$ restricts non-degenerately inherits a symplectic structure, all complex projective manifolds are symplectic. However, the symplectic category is strictly larger than the complex projective category, as first evidenced by Thurston in 1976 [31]. In 1994 Gompf used the symplectic sum construction to prove that any finitely presented group can be realized as the fundamental group of a compact symplectic 4-manifold [15].

An important problem in symplectic topology is to understand the hierarchy formed by the three main classes of compact oriented 4-manifolds: (1) complex projective, (2) symplectic, and (3) smooth. Each class is a proper subset of the next one, and many obstructions and examples are known, but we are still very far from understanding what exactly causes a smooth 4-manifold to admit a symplectic structure, or a symplectic 4-manifold to admit an integrable complex structure.

One of the main motivations to study symplectic 4-manifolds is that they retain some (but not all) features of complex projective manifolds: for example the structure of their Seiberg–Witten invariants, which in both cases are nonzero and count embedded (pseudo)holomorphic curves [27, 28]. At the same time, every compact oriented smooth 4-manifold with $b_2^+ \ge 1$ admits a “near-symplectic” structure, i.e., a closed 2-form which vanishes along a union of circles and is symplectic over the complement of its zero set [14, 18]; and it
appears that some structural properties of symplectic manifolds carry over to the world of smooth 4-manifolds (see, e.g., [29, 4]).

Although the question of determining which smooth 4-manifolds admit symplectic structures and how many is definitely an essential one, it falls outside of the scope of this paper. Rather, our goal will be to obtain information on the richness of the symplectic category, especially when compared to the complex projective category.

We will restrict ourselves to the class of integral compact symplectic 4-manifolds, i.e., we will assume that the cohomology class $[\omega] \in H^2(X, \mathbb{R})$ is the image of an element of $H^2(X, \mathbb{Z})$. This does not place any additional restrictions on the diffeomorphism type of $X$, but makes classification a discrete problem (by Moser’s stability theorem, deformations that keep $[\omega]$ constant are induced by ambient isotopies). By integrating the Chern classes of the tangent bundle and the symplectic class over the fundamental cycle $[X]$, one obtains various classical topological invariants: the Chern numbers $c_1^2$ $(= 2\chi + 3\sigma)$ and $c_2$ $(= \chi)$, the symplectic volume $[\omega]^2$, and $c_1 \cdot [\omega]$. Hence, the first question we will ask is:

Question 2.1. Can one classify all integral compact symplectic 4-manifolds with given values of $(c_1^2, c_2, c_1 \cdot [\omega], [\omega]^2)$ (and a given fundamental group)?

This question contains the geography problem, i.e., the question of determining which Chern numbers can be realized by compact symplectic 4-manifolds. In some specific cases, Taubes’ results on Seiberg–Witten invariants seriously constrain the list of possibilities. For example we have the following result [27]:

Theorem 2.2 (Taubes). Let $(X, \omega)$ be a compact symplectic 4-manifold with $b_2^+ \ge 2$. Then $c_1 \cdot [\omega] \le 0$. Moreover, if $X$ is minimal (i.e., does not contain an embedded symplectic sphere of square $-1$), then $c_1^2 \ge 0$.

A lot is also known about the case $c_1^2 = 0$, from Seiberg–Witten theory and from various surgery constructions.
For example, infinite families of simply connected symplectic 4-manifolds homeomorphic but not diffeomorphic to elliptic surfaces have been constructed (see, e.g., [15, 10]). However, when $c_1^2 > 0$ very little is known, and many important questions remain open. For example it is unknown whether the Bogomolov–Miyaoka–Yau inequality $c_1^2 \le 3c_2$, which constrains the Chern numbers of complex surfaces of general type, holds for symplectic 4-manifolds.

2.2. Lefschetz fibrations and stabilization by symplectic sums. One possible approach to the classification of symplectic 4-manifolds is via symplectic Lefschetz fibrations, as suggested by Donaldson. After blowing up a certain number of points, every compact integral symplectic 4-manifold can be realized as the total space of a fibration over $S^2$ whose fibers are compact Riemann surfaces, finitely many of which present a nodal singularity [9]. Conversely, the total
space of such a Lefschetz fibration is a symplectic 4-manifold [16]. If one could classify symplectic Lefschetz fibrations, then an answer to Question 2.1 would follow.

When the fiber genus is 0 or 1, the classification of Lefschetz fibrations is a classical result; in particular, these fibrations are all holomorphic [22]. For genus 2, Siebert and Tian have proved holomorphicity under assumptions of irreducibility of the singular fibers and transitivity of the monodromy [26], but in general there are non-holomorphic examples [24], and the complete classification is not known. However, the situation simplifies if we “stabilize” by repeatedly performing fiber sums with a specific holomorphic fibration $f_0$ (the fibration obtained by blowing up a pencil of curves of bidegree $(2, 3)$ in $\mathbb{CP}^1 \times \mathbb{CP}^1$). Then we have the following result [2]:

Theorem 2.3. For any genus 2 symplectic Lefschetz fibration $f : X \to S^2$, there exists an integer $n_0$ such that, for all $n \ge n_0$, $f \# n f_0$ is isomorphic to a holomorphic fibration.

In fact, given two genus 2 symplectic Lefschetz fibrations $f, f'$ with the same numbers of singular fibers of each type (irreducible, reducible with genus 1 components, reducible with components of genus 0 and 2), for all large $n$ the fiber sums $f \# n f_0$ and $f' \# n f_0$ are isomorphic [2]. More generally, as a corollary of a recent result of Kharlamov and Kulikov about braid monodromy factorizations [19], a similar result holds for all Lefschetz fibrations with monodromy contained in the hyperelliptic mapping class group. This leads to the following questions relative to the classification of symplectic 4-manifolds up to stabilization by fiber sums:

Question 2.4. Does every symplectic Lefschetz fibration become isomorphic to a holomorphic fibration after repeatedly fiber summing with certain standard holomorphic fibrations?

Question 2.5. Let $X_1, X_2$ be two integral compact symplectic 4-manifolds with the same $(c_1^2, c_2, c_1 \cdot [\omega], [\omega]^2)$.
Do $X_1$ and $X_2$ become symplectomorphic after repeatedly performing symplectic sums with the same complex projective surfaces (chosen among a finite collection of model surfaces)?

2.3. Luttinger surgery. Many of the constructions used to obtain interesting examples of non-Kähler symplectic 4-manifolds, such as symplectic sum, link surgery, and symplectic rational blowdown, rely on the idea of cutting and pasting elementary building blocks. We focus here on the construction known as Luttinger surgery [21], which has been comparatively less studied but can be used to provide a unified description of numerous examples of exotic symplectic 4-manifolds.

Given an embedded Lagrangian torus $T$ in a symplectic 4-manifold $(X, \omega)$ and a homotopically non-trivial embedded loop $\gamma \subset T$, Luttinger surgery is an operation that consists in cutting out from $X$ a tubular neighborhood of $T$,
foliated by parallel Lagrangian tori, and gluing it back in such a way that the new meridian loop differs from the old one by a twist along the loop $\gamma$ (while longitudes are not affected), yielding a new symplectic manifold $(\tilde X, \tilde\omega)$.

More precisely, identify a neighborhood of $T$ in $X$ with the neighborhood $T^2 \times D^2(r)$ of the zero section in $(T^*T^2, dp_1 \wedge dq_1 + dp_2 \wedge dq_2)$, in such a way that $\gamma$ is identified with the first factor in $T^2 = S^1 \times S^1$. Let $\theta$ be a smooth circle-valued function on the annulus $A = D^2(r) \setminus D^2(\frac{r}{2})$ such that $\partial\theta/\partial p_2 = 0$, and representing the generator of $H^1(A) = \mathbb{Z}$ (i.e., the value of $\theta$ increases by $2\pi$ as one goes around the origin). The diffeomorphism of $T^2 \times A$ defined by $\phi(q_1, q_2, p_1, p_2) = (q_1 + \theta(p_1, p_2), q_2, p_1, p_2)$ preserves the symplectic form, and so the manifold

$$\tilde X = \bigl(X \setminus T^2 \times D^2(\tfrac{r}{2})\bigr) \cup_\phi \bigl(T^2 \times D^2(r)\bigr)$$

inherits a natural symplectic structure. For more details see [21, 3].

By performing Luttinger surgery along suitably chosen Lagrangian tori, one can, e.g., transform a product $T^2 \times \Sigma$ into any surface bundle over $T^2$, or an untwisted fiber sum of Lefschetz fibrations into a twisted fiber sum. Fintushel and Stern’s symplectic examples of knot surgery manifolds can also be obtained from complex surfaces by Luttinger surgery. Although there is no good reason to believe that the answer should be positive, the wide range of examples which reduce to this construction makes it interesting to ask the following question:

Question 2.6. Let $X_1, X_2$ be two integral compact symplectic 4-manifolds with the same $(c_1^2, c_2, c_1 \cdot [\omega], [\omega]^2)$. Is it always possible to obtain $X_2$ from $X_1$ by a sequence of Luttinger surgeries?

In this question, as in Question 2.5 above, we do not require the fundamental groups of $X_1$ and $X_2$ to be isomorphic. This is because Luttinger surgery, like symplectic sum, can drastically modify the fundamental group.
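The claim in Sect. 2.3 that the gluing map φ preserves the symplectic form can be checked pointwise on Jacobians. The Python sketch below is only an illustration under stated assumptions: the coordinate order (q1, q2, p1, p2), the helper names, and the parameters `a` and `b` (standing for the partial derivatives of θ with respect to p1 and p2 at a point) are our own hypothetical choices. It verifies that the Jacobian J of φ satisfies JᵀΩJ = Ω, where Ω is the matrix of dp1 ∧ dq1 + dp2 ∧ dq2, exactly when b = 0.

```python
# Matrix of omega = dp1^dq1 + dp2^dq2 in the (assumed) basis order (q1, q2, p1, p2):
# OMEGA[i][j] = omega(e_i, e_j).
OMEGA = [
    [0, 0, -1, 0],
    [0, 0, 0, -1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*A)]


def jacobian(a, b):
    """Jacobian of phi(q1, q2, p1, p2) = (q1 + theta(p1, p2), q2, p1, p2),
    where a and b are the partial derivatives of theta in p1 and p2."""
    return [
        [1, 0, a, b],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
    ]


def preserves_form(a, b):
    """True if phi pulls omega back to itself at a point with these derivatives."""
    J = jacobian(a, b)
    return matmul(matmul(transpose(J), OMEGA), J) == OMEGA
```

The only entries of JᵀΩJ that can differ from Ω are the (p1, p2) ones, which equal ±b; this matches the one-line computation φ*ω = ω + dp1 ∧ dθ, where dp1 ∧ dθ = (∂θ/∂p2) dp1 ∧ dp2.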
Also, let us mention that a positive answer to Question 2.6 essentially implies a positive answer to Question 2.5, as we shall see in Sect. 4.

The symplectic sum construction can be used to build minimal simply connected symplectic 4-manifolds with Chern numbers violating the Noether inequality, and hence not diffeomorphic to any complex surface (see, e.g., Theorem 10.2.14 in [16]). Many of these manifolds are homeomorphic to (non-minimal) complex surfaces, but it is not clear at all whether it is possible to obtain them by Luttinger surgeries. Given the very explicit nature of the construction, these could be good test examples for Question 2.6.

3. Isotopy questions about singular plane curves

3.1. Symplectic branched covers. Let $X$ and $Y$ be compact oriented 4-manifolds, and assume that $Y$ carries a symplectic form $\omega_Y$.
Definition 3.1. A smooth map $f : X \to Y$ is a symplectic branched covering if given any point $p \in X$ there exist neighborhoods $U \ni p$, $V \ni f(p)$, and local coordinate charts $\phi : U \to \mathbb{C}^2$ (orientation-preserving) and $\psi : V \to \mathbb{C}^2$ (adapted to $\omega_Y$, i.e., such that $\omega_Y$ restricts positively to any complex line in $\mathbb{C}^2$), in which $f$ is given by one of:
(i) $(x, y) \mapsto (x, y)$ (local diffeomorphism),
(ii) $(x, y) \mapsto (x^2, y)$ (simple branching),
(iii) $(x, y) \mapsto (x^3 - xy, y)$ (ordinary cusp).

These local models are the same as for the singularities of a generic holomorphic map from $\mathbb{C}^2$ to itself, except that the requirements on the local coordinate charts have been substantially weakened. The ramification curve $R = \{p \in X,\ \det(df) = 0\}$ is a smooth submanifold of $X$, and its image $D = f(R)$ is the branch curve, described in the local models by the equations $z_1 = 0$ for $(x, y) \mapsto (x^2, y)$ and $27 z_1^2 = 4 z_2^3$ for $(x, y) \mapsto (x^3 - xy, y)$. It follows from the definition that $D$ is a singular symplectic curve in $Y$. Generically, its only singularities are transverse double points, which may occur with either the complex orientation or the opposite orientation, and complex cusps. We have the following result [1]:

Proposition 3.2. Given a symplectic branched covering $f : X \to Y$, the manifold $X$ inherits a natural symplectic structure $\omega_X$, canonical up to isotopy, in the cohomology class $[\omega_X] = f^*[\omega_Y]$.

The symplectic form $\omega_X$ is constructed by adding to $f^*\omega_Y$ a small multiple of an exact form $\alpha$ with the property that, at every point of $R$, the restriction of $\alpha$ to $\ker(df)$ is positive. Uniqueness up to isotopy follows from the convexity of the space of such exact 2-forms and Moser’s theorem.

Conversely, we can realize every integral compact symplectic 4-manifold as a symplectic branched cover of $\mathbb{CP}^2$ [1]:

Theorem 3.3. Given an integral compact symplectic 4-manifold $(X^4, \omega)$ and an integer $k \gg 0$, there exists a symplectic branched covering $f_k : X \to \mathbb{CP}^2$, canonical up to isotopy if $k$ is sufficiently large.
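Returning briefly to the local models of Definition 3.1, the equation of the branch curve in the cusp model can be verified directly: the ramification curve of (x, y) ↦ (x³ − xy, y) is {y = 3x²}, and its image is the cuspidal cubic 27 z1² = 4 z2³. The Python sketch below checks this on integer sample points; the function names and the sampling range are our own hypothetical choices, and a finite check of this kind is an illustration, not a proof.

```python
def cusp_map(x, y):
    """Local model (iii): (x, y) -> (x^3 - x*y, y)."""
    return x**3 - x * y, y


def cusp_jacobian_det(x, y):
    """det(df) for the cusp model: df = [[3x^2 - y, -x], [0, 1]]."""
    return 3 * x**2 - y


def on_cuspidal_cubic(z1, z2):
    """True if (z1, z2) satisfies 27 z1^2 = 4 z2^3."""
    return 27 * z1**2 == 4 * z2**3


def check_cusp_branch_curve(samples=range(-10, 11)):
    """Check that sample points of R = {y = 3x^2} map into the cuspidal cubic."""
    for x in samples:
        y = 3 * x**2  # point on the ramification curve, where det(df) vanishes
        if cusp_jacobian_det(x, y) != 0:
            return False
        if not on_cuspidal_cubic(*cusp_map(x, y)):
            return False
    return True
```

Symbolically, on R one has (z1, z2) = (−2x³, 3x²), so both sides of the equation equal 108x⁶.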
The maps $f_k$ are built from suitably chosen triples of sections of $L^{\otimes k}$, where $L \to X$ is a complex line bundle such that $c_1(L) = [\omega]$. In the complex case, $L$ is an ample line bundle, and a generic triple of holomorphic sections of $L^{\otimes k}$ determines a $\mathbb{CP}^2$-valued map $f_k : p \mapsto [s_0(p) : s_1(p) : s_2(p)]$. In the symplectic case the idea is similar, but requires more analysis; the proof relies on asymptotically holomorphic methods [1].

In any case, the natural symplectic structure induced on $X$ by the Fubini–Study Kähler form and $f_k$ (as given by Proposition 3.2) agrees with $\omega$ up to isotopy and scaling (multiplication by $k$).

Because for large $k$ the maps $f_k$ are canonical up to isotopy through symplectic branched covers, the topology of $f_k$ and of its branch curve $D_k$ can
be used to define invariants of the symplectic manifold $(X, \omega)$. Although the only generic singularities of the plane curve $D_k$ are nodes (transverse double points) of either orientation and complex cusps, in a generic one-parameter family of branched covers pairs of nodes with opposite orientations may be cancelled or created. However, recalling that a node of $D_k$ corresponds to the occurrence of two simple branch points in the same fiber of $f_k$, the creation of a pair of nodes can only occur in a manner compatible with the branched covering structure, i.e., involving disjoint sheets of the covering.

It is worth mentioning that, to this date, there is no evidence suggesting that negative nodes actually do occur in these high degree branch curves; our inability to rule out their presence might well be a shortcoming of the approximately holomorphic techniques, rather than an intrinsic feature of symplectic 4-manifolds. So we will occasionally consider the more conventional problem of understanding isotopy classes of curves presenting only positive nodes and cusps, although most of the discussion applies equally well to curves with negative nodes.

Assuming that the topology of the branch curve is understood, the structure of $f$ is determined by its monodromy morphism $\theta : \pi_1(\mathbb{CP}^2 - D) \to S_N$, where $N$ is the degree of the covering $f$. Fixing a base point $p_0 \in \mathbb{CP}^2 - D$, the image by $\theta$ of a loop $\gamma$ in the complement of $D$ is the permutation of the fiber $f^{-1}(p_0)$ induced by the monodromy of $f$ along $\gamma$. (Since viewing this permutation as an element of $S_N$ depends on the choice of an identification between $f^{-1}(p_0)$ and $\{1, \dots, N\}$, the morphism $\theta$ is only well-defined up to conjugation by an element of $S_N$.) By Proposition 3.2, the isotopy class of the branch curve $D$ and the monodromy morphism $\theta$ determine completely the symplectic 4-manifold $(X, \omega)$ up to symplectomorphism.
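The combinatorial side of the monodromy data can be illustrated with a small Python sketch. It assumes a hypothetical finite encoding in which the images under θ of a chosen set of geometric generators are given as transpositions, i.e., pairs (a, b) of sheet labels in {1, ..., N}; the function names are ours. The first function uses the standard fact that a set of transpositions generates all of S_N exactly when the graph they define on the N sheets is connected; the second tests the "disjoint sheets" condition for creating a pair of nodes.

```python
def generates_symmetric_group(transpositions, n):
    """True if the given transpositions of {1, ..., n} generate S_n.

    Transpositions generate S_n iff the graph with one edge per transposition
    is connected; connectivity is tested with a union-find structure.
    """
    parent = list(range(n + 1))  # index 0 is unused

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in transpositions:
        if a == b or not (1 <= a <= n and 1 <= b <= n):
            raise ValueError(f"not a transposition of 1..{n}: {(a, b)}")
        parent[find(a)] = find(b)

    return len({find(i) for i in range(1, n + 1)}) == 1


def act_on_disjoint_sheets(t1, t2):
    """True if two transpositions move disjoint pairs of sheets."""
    return set(t1).isdisjoint(t2)
```

For instance, `generates_symmetric_group([(1, 2), (2, 3), (3, 4)], 4)` holds, whereas `[(1, 2), (3, 4)]` only generates a subgroup with two orbits, which would correspond to a disconnected cover.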
The image by $\theta$ of a geometric generator of $\pi_1(\mathbb{CP}^2 - D)$, i.e., a loop $\gamma$ which bounds a small topological disc intersecting $D$ transversely once, is a transposition (because of the local model near a simple branch point). Since the image of $\theta$ is generated by transpositions and acts transitively on the fiber (assuming $X$ to be connected), $\theta$ is a surjective group homomorphism. Moreover, the smoothness of $X$ above the singular points of $D$ imposes certain compatibility conditions on $\theta$. Therefore, not every singular plane curve can be the branch curve of a smooth covering; in fact, the morphism $\theta$, if it exists, is often unique (up to conjugation in $S_N$). In the case of algebraic curves, this uniqueness property, which holds except for a finite list of well-known counterexamples, is known as Chisini’s conjecture, and was essentially proved by Kulikov a few years ago [20].

The upshot of the above discussion is that, in order to understand symplectic 4-manifolds, it is in principle enough to understand singular plane curves. Moreover, if the branch curve of a symplectic covering $f : X \to \mathbb{CP}^2$ happens to be a complex curve, then the integrable complex structure of $\mathbb{CP}^2$ can be lifted to an integrable complex structure on $X$, compatible with the
symplectic structure; this implies that $X$ is a complex projective surface. So, considering the branched coverings constructed in Theorem 3.3, we have:

Corollary 3.4. For $k \gg 0$ the branch curve $D_k \subset \mathbb{CP}^2$ is isotopic to a complex curve (up to node cancellations) if and only if $X$ is a complex projective surface.

This motivates the study of the symplectic isotopy problem for singular curves in $\mathbb{CP}^2$ (or more generally in other complex surfaces – especially rational ruled surfaces, i.e., $\mathbb{CP}^1$-bundles over $\mathbb{CP}^1$).

3.2. The symplectic isotopy problem. The symplectic isotopy problem asks under which circumstances (assumptions on degree, singularities, ...) it is true that any symplectic curve is isotopic to a complex curve (by isotopy, we mean a continuous one-parameter family of symplectic curves with the same singularities).

More generally, the goal is to understand isotopy classes of symplectic curves with given singularities in a given homology class. For example, considering only plane curves with positive nodes and cusps, one may ask the following:

Question 3.5. Given integers $(d, \nu, \kappa)$, can one classify all symplectic curves of degree $d$ in $\mathbb{CP}^2$ with $\nu$ nodes and $\kappa$ cusps, up to symplectic isotopy?

If $D$ is the branch curve of an $N$-fold symplectic covering, then the Chern classes of the symplectic manifold $(X, \omega)$ (with the symplectic structure given by Proposition 3.2) are related to the degree $d$ of $D$, its genus $g = \frac{1}{2}(d - 1)(d - 2) - \kappa - \nu$, and its number of cusps via the formulas:

$$[\omega]^2 = N, \qquad c_1 \cdot [\omega] = 3N - d, \qquad c_1^2 = g - 1 - \tfrac{9}{2} d + 9N, \qquad c_2 = 2g - 2 + 3N - \kappa.$$
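These formulas are easy to evaluate mechanically. The Python sketch below (the function names and dictionary keys are our own, hypothetical choices) computes the invariants, together with χ = c2 and σ = (c1² − 2c2)/3, from (d, ν, κ, N), using exact rational arithmetic so that any failure of integrality is visible. The sample input (d, ν, κ, N) = (6, 0, 9, 4) used in the note that follows is an assumption chosen for illustration: a sextic with nine cusps, viewed as the branch curve of a 4-fold covering.

```python
from fractions import Fraction


def chern_data(d, nu, kappa, N):
    """Invariants of an N-fold cover of CP^2 branched along a curve of
    degree d with nu nodes and kappa cusps, from the formulas quoted above."""
    g = Fraction((d - 1) * (d - 2), 2) - kappa - nu  # genus of the branch curve
    c1_sq = g - 1 - Fraction(9, 2) * d + 9 * N       # c1^2
    c2 = 2 * g - 2 + 3 * N - kappa                   # c2 = Euler characteristic
    return {
        "g": g,
        "omega_sq": Fraction(N),           # [omega]^2
        "c1_omega": Fraction(3 * N - d),   # c1 . [omega]
        "c1_sq": c1_sq,
        "c2": c2,
        "sigma": (c1_sq - 2 * c2) / 3,     # signature
    }


def is_integral(d, nu, kappa, N):
    """True if c1^2, c2 and the signature all come out as integers."""
    data = chern_data(d, nu, kappa, N)
    return all(data[key].denominator == 1 for key in ("c1_sq", "c2", "sigma"))
```

With the sample input one finds g = 1, c1² = 9, c2 = 3 and σ = 1, which are the Chern numbers and signature of the projective plane, while [ω]² = 4 and c1 · [ω] = 6. Changing d to an odd value, or κ to a value that is not a multiple of 3, makes `is_integral` fail.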
In particular, integrality constraints on the Euler–Poincaré characteristic $\chi = c_2$ and signature $\sigma = \frac{1}{3}(c_1^2 - 2c_2)$ of $X$ imply that the degree $d$ must be even, and that the number of cusps $\kappa$ must be a multiple of 3. The geography problem for symplectic 4-manifolds translates into a geography problem for symplectic branch curves: for example, the Bogomolov–Miyaoka–Yau inequality $c_1^2 \le 3c_2$ translates into the inequality $\kappa \le \frac{5}{3}(g - 1) + \frac{3}{2} d$. There are plane curves which violate this inequality, even in the algebraic world: e.g., the branch curves of generic projections of irrational ruled surfaces $\Sigma \times \mathbb{CP}^1$, where $\Sigma$ is a curve of genus $\ge 2$. However, the open question is whether one can find branch curves which violate this inequality and for which the branched covering has $c_1^2 \ge 0$. By the above remarks, these cannot be isotopic to any complex curve.

The symplectic isotopy problem is understood in various simple situations, where it can be shown that every symplectic curve is isotopic to a complex curve. The first results were obtained by Gromov [17], who used pseudoholomorphic curves to prove that every smooth symplectic curve of degree 1 or 2
in $\mathbb{CP}^2$ is isotopic to a complex curve. The idea of the argument is to equip $\mathbb{CP}^2$ with an almost-complex structure $J = J_1$ such that the given curve $C$ is $J$-holomorphic, and consider a smooth family of almost-complex structures $(J_t)_{t \in [0,1]}$ interpolating between $J$ and the standard complex structure $J_0$. By studying the deformation problem for pseudoholomorphic curves, one can prove the existence of a smooth family of $J_t$-holomorphic curves $C_t$ realizing an isotopy between $C = C_1$ and an honest holomorphic curve $C_0$.

Successive improvements of this result have been obtained by Sikorav (for smooth curves of degree $\le 3$), Shevchishin (degree $\le 6$), and more recently Siebert and Tian [26]:

Theorem 3.6 (Siebert–Tian). Every smooth symplectic curve of degree $\le 17$ in $\mathbb{CP}^2$ is isotopic to a complex curve.

A similar result has also been obtained for smooth curves in $\mathbb{CP}^1$-bundles over $\mathbb{CP}^1$ (assuming $[C] \cdot [\mathrm{fiber}] \le 7$) [26]. It is expected that the isotopy property remains true for smooth plane curves of arbitrarily large degree; this would provide an answer to Question 3.5 in the case $\nu = \kappa = 0$ (recall that all smooth complex curves of a given degree are mutually isotopic).

The isotopy property is also known to hold in some simple cases for curves with nodes and cusps in $\mathbb{CP}^2$ and $\mathbb{CP}^1$-bundles over $\mathbb{CP}^1$, as illustrated by the results obtained by Barraud, Shevchishin, and Francisco. For example, we have the following results [25, 12]:

Theorem 3.7 (Shevchishin). Any two irreducible nodal symplectic curves in $\mathbb{CP}^2$ of the same degree and the same genus $g \le 4$ are symplectically isotopic.

Theorem 3.8 (Francisco). Let $C$ be an irreducible symplectic curve of degree $d$ and genus 0 with $\kappa$ cusps and $\nu$ nodes in $\mathbb{CP}^2$, and assume that $\kappa < d$. Then $C$ is isotopic to a complex curve.

In general, we cannot expect the classification to be so simple, and there are plenty of examples of symplectic curves which are not isotopic to any complex curve.
Perhaps the most widely known such examples are due to Fintushel and Stern [11], who showed that elliptic surfaces contain infinite families of pairwise non-isotopic smooth symplectic curves representing the same homology class. Similar results have also been obtained by Smith, Etgü and Park, and Vidussi. However, if we consider singular curves with cusp singularities, then these non-isotopy phenomena already arise in $\mathbb{CP}^2$. In a non-explicit manner, it is clear that this must be the case, from Corollary 3.4; however to this date the branch curves given by Theorem 3.3 for $k \gg 0$ have not been computed explicitly for any non-complex examples. More explicitly, the following result is due to Moishezon [23] (see also [3]):

Theorem 3.9 (Moishezon). For all $p \ge 2$, there exist infinitely many pairwise non-isotopic singular symplectic curves of degree $d = 9p(p - 1)$ in $\mathbb{CP}^2$ with
$\kappa = 27(p - 1)(4p - 5)$ cusps and $\nu = \frac{27}{2}(p - 1)(p - 2)(3p^2 + 3p - 8)$ nodes, not isotopic to any complex curve.
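For orientation, the numerical invariants in Theorem 3.9 can be tabulated. The Python sketch below (the function name is a hypothetical choice of ours) evaluates d = 9p(p − 1), κ = 27(p − 1)(4p − 5) and ν = (27/2)(p − 1)(p − 2)(3p² + 3p − 8), and adds the genus g = ½(d − 1)(d − 2) − κ − ν given by the formula of Sect. 3.2. The division by 2 is exact because (p − 1)(p − 2) is a product of two consecutive integers.

```python
def moishezon_invariants(p):
    """Degree, cusps, nodes and genus of the curves in Theorem 3.9 (p >= 2)."""
    if p < 2:
        raise ValueError("Theorem 3.9 is stated for p >= 2")
    d = 9 * p * (p - 1)
    kappa = 27 * (p - 1) * (4 * p - 5)
    # (p - 1)(p - 2) is even, so integer division loses nothing here.
    nu = 27 * (p - 1) * (p - 2) * (3 * p * p + 3 * p - 8) // 2
    g = (d - 1) * (d - 2) // 2 - kappa - nu
    return d, kappa, nu, g


if __name__ == "__main__":
    for p in range(2, 6):
        print(p, moishezon_invariants(p))
```

In each case d is even and κ is a multiple of 3, in agreement with the integrality constraints on branch curves noted in Sect. 3.2.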
Moishezon’s approach is purely algebraic (using braid monodromy factorizations), and yields curves that are distinguished by the fundamental groups of their complements [23]. However a simpler geometric description of his construction can be given in terms of braiding constructions [3]; cf. Sect. 3.4.

Questions 2.1 and 3.5 are closely related to each other, via Proposition 3.2 and Theorem 3.3. Let us restrict ourselves to those plane curves which admit a compatible symmetric group-valued monodromy morphism, and assume that Chisini’s conjecture about the uniqueness of this morphism (excluding a specific degree 6 curve) extends to the symplectic case. Then integral compact symplectic 4-manifolds (up to scaling of the symplectic form) are in one-to-one correspondence with isotopy classes of singular symplectic plane branch curves up to an equivalence relation which takes into account: (1) the possibility of creating and cancelling pairs of nodes, and (2) the dependence of the branch curve $D_k$ on the parameter $k$ in Theorem 3.3. This latter dependence, while complicated and not quite understood in general, is nonetheless within reach: see [6] for a description of the relation between $D_k$ and $D_{2k}$.

If one allows creations and cancellations of pairs of nodes, then the classification problem becomes different, even considering only curves with positive nodes and cusps. Indeed, it may happen that two non-isotopic curves can be deformed into each other if one is allowed to “push” the curve through itself, creating or cancelling pairs of double points in the process (such a deformation is called a regular homotopy). In fact, in this case the classification becomes excessively simple, as shown by the following result [7]:

Theorem 3.10 (Auroux–Kulikov–Shevchishin). Any two irreducible symplectic curves with positive nodes and cusps in $\mathbb{CP}^2$, of the same degree and with the same numbers of nodes and cusps, are regular homotopic to each other.
What this means is that, when considering symplectic branch curves given by Theorem 3.3, it is important to restrict oneself to admissible regular homotopies, i.e., regular homotopies which are compatible with the symmetric group-valued monodromy morphism $\theta$. When pushing the branch curve $D$ through itself, the two branches that are made to intersect each give rise to a geometric generator of $\pi_1(\mathbb{CP}^2 - D)$. The requirement for admissibility of a node creation operation is that the images by $\theta$ of these two geometric generators should be transpositions acting on disjoint pairs of elements (i.e., the branching phenomena above the two intersecting branches of $D$ should occur in different sheets of the covering). Thus the version of the isotopy problem which naturally comes out of Theorem 3.3 is the following:

Question 3.11. Given integers $(d, \nu, \kappa, N)$, can one classify all pairs $(D, \theta)$ where $D$ is a symplectic curve of degree $d$ in $\mathbb{CP}^2$ with $\nu_+$ positive nodes, $\nu_-$ negative
nodes and $\kappa$ complex cusps, $\nu_+ - \nu_- = \nu$, and $\theta : \pi_1(\mathbb{CP}^2 - D) \to S_N$ is a compatible monodromy morphism, up to admissible regular homotopies?

3.3. Hurwitz curves and stabilization. In order to state the analogue of Question 2.4 for branch curves, we need to introduce a slightly more restrictive category of curves, known as Hurwitz curves. Roughly speaking, a Hurwitz curve in a ruled surface is a curve which behaves like a generic complex curve with respect to the ruling. In the case of $\mathbb{CP}^2$, we consider the projection $\pi : \mathbb{CP}^2 - \{(0 : 0 : 1)\} \to \mathbb{CP}^1$ given by $(x : y : z) \mapsto (x : y)$, and we make the following definition:

Definition 3.12. A curve $D \subset \mathbb{CP}^2$ (not passing through $(0 : 0 : 1)$) is a Hurwitz curve if $D$ is positively transverse to the fibers of $\pi$ at all but finitely many points, where $D$ is smooth and non-degenerately tangent to the fibers.

Hurwitz curves in $\mathbb{CP}^1$-bundles over $\mathbb{CP}^1$ can be defined similarly, considering the projection to $\mathbb{CP}^1$ given by the bundle structure.

It is easy to see that any Hurwitz curve in $\mathbb{CP}^2$ can be made symplectic by an isotopy through Hurwitz curves: namely, the image of any Hurwitz curve by the rescaling map $(x : y : z) \mapsto (x : y : \lambda z)$ is a Hurwitz curve, and symplectic for $|\lambda| \ll 1$. Moreover, Theorem 3.3 can be improved to ensure that the branch curves $D_k \subset \mathbb{CP}^2$ are Hurwitz curves [5]. So, the discussion in Sects. 3.1 and 3.2 carries over to the world of Hurwitz curves without modification.

After blowing up $\mathbb{CP}^2$ at $(0 : 0 : 1)$, we obtain the Hirzebruch surface $F_1$ (recall that $F_n = \mathbb{P}(\mathcal{O}_{\mathbb{P}^1} \oplus \mathcal{O}_{\mathbb{P}^1}(n))$), and any Hurwitz curve in $\mathbb{CP}^2$ becomes a Hurwitz curve in $F_1$, disjoint from the exceptional section. The advantage of considering Hurwitz curves in Hirzebruch surfaces rather than $\mathbb{CP}^2$ is that we can now introduce an operation of stabilization by pairwise fiber sum.
Namely, consider two Hurwitz curves $D_1 \subset \mathbb{F}_{n_1}$, $D_2 \subset \mathbb{F}_{n_2}$, of the same degree $d$ relative to the projection, i.e., such that $[D_1] \cdot [F] = [D_2] \cdot [F] = d$, where $F$ is the fiber of the ruling. Then, up to an isotopy among Hurwitz curves, we can assume that the intersections of $D_1$ and $D_2$ with fixed fibers of the rulings coincide, and we can smooth the normal crossing configuration $(\mathbb{F}_{n_1}, D_1) \cup_{\mathrm{fiber}=\mathrm{fiber}} (\mathbb{F}_{n_2}, D_2)$ into a pair $(\mathbb{F}_n, D)$, where $D$ is a Hurwitz curve in $\mathbb{F}_n$, and $n = n_1 + n_2$. If a Hurwitz curve in $\mathbb{F}_n$ is a branch curve, then the ruling on $\mathbb{F}_n$ lifts to a symplectic Lefschetz fibration on the branched cover. Assuming that the symmetric group valued morphisms are compatible (i.e., have the same restrictions to given fibers), the fiber sum operation on the branch curves then corresponds to a fiber sum operation on the covers. Hence, the analogue of Questions 2.4 and 2.5 asks whether stabilization by fiber summing can be used to simplify the classification of Hurwitz branch curves:

Question 3.13. Let $D_1$, $D_2$ be two Hurwitz curves in $\mathbb{F}_n$, representing the same homology class and with the same numbers of cusps and nodes. Assume that two compatible monodromy morphisms $\theta_i : \pi_1(\mathbb{F}_n - D_i) \to S_N$ are given ($i \in \{1, 2\}$), and that there is a fiber $F \subset \mathbb{F}_n$ such that $F \cap D_1 = F \cap D_2$ and
$\theta_1|_{F - D_1} = \theta_2|_{F - D_2}$. Is there a complex curve $C \subset \mathbb{F}_n$, compatible with the given monodromy morphisms, such that the fiber sums $D_1 \# C$ and $D_2 \# C$ are isotopic to each other as Hurwitz curves?

To remain closer to the formulation of the questions in Sect. 2, one can instead require the complex curve to be chosen among a finite list of standard models (depending on the given monodromy morphisms $\theta_i$), but allow several successive fiber sum operations. It is also interesting to ask whether the final result of the fiber sum operations can always be assumed to be isotopic to a complex curve. Requiring compatibility with the given monodromy morphisms places restrictions on the choice of the curve $C$, and makes the question more difficult. Without this constraint the answer is known, and follows directly from a recent result of Kharlamov and Kulikov about braid monodromy factorizations [19]:

Theorem 3.14 (Kharlamov–Kulikov). Let $D_1$, $D_2$ be two Hurwitz curves in $\mathbb{F}_n$, representing the same homology class and with the same numbers of cusps and nodes. Then there exists a smooth complex curve $C$ in $\mathbb{F}_0 = \mathbb{CP}^1 \times \mathbb{CP}^1$ such that the fiber sums $D_1 \# C$ and $D_2 \# C$ are isotopic.

In this result, $C$ is in fact a smooth curve of bidegree $(a, b)$, where $a = [F] \cdot [D_i]$, and $b \gg 0$ is chosen very large. Such a curve may be obtained by smoothing a configuration consisting of $a$ sections $\mathbb{CP}^1 \times \{\mathrm{pt}\}$ and $b$ fibers $\{\mathrm{pt}\} \times \mathbb{CP}^1$. Hence, in this case the fiber sum operation is equivalent to considering the union of $D_i$ with $b$ fibers of the ruling, and smoothing the intersections; a more geometric formulation of Theorem 3.14 is therefore:

Theorem 3.15. Let $D_1$, $D_2$ be two Hurwitz curves in $\mathbb{F}_n$, representing the same homology class and with the same numbers of cusps and nodes. Let $D'_i$ ($i \in \{1, 2\}$) be the curve obtained by adding to $D_i$ a union of $b$ generic fibers of the ruling, intersecting $D_i$ transversely at smooth points, and smoothing out all the resulting intersections. Then for all large enough values of $b$ the Hurwitz curves $D'_1$ and $D'_2$ are isotopic.

This construction gives an answer to Question 3.13 in the case of smooth curves (and coverings of degree $N = 2$); it is unclear whether the argument in [19] can be modified to produce complex curves compatible with branched coverings of degree $N \ge 3$.

3.4. Braiding along Lagrangian annuli. Let $D$ be a symplectic curve in a symplectic 4-manifold $Y$ (e.g., $Y = \mathbb{CP}^2$), possibly with singularities. It is often the case that we can find an embedded Lagrangian annulus $A \subset Y \setminus D$, with boundary contained in the smooth part of $D$. (This happens for example when a portion of $D$ consists of two cylinders which run parallel to each other; then we can find a Lagrangian annulus joining them). In this situation, one can twist the curve $D$ along the annulus $A$, to obtain a new symplectic curve $\tilde{D}$ which coincides with $D$ away from $A$ [3]. Namely,
we can identify a neighborhood of $A$ with the product $S^1 \times (-1, 1) \times D^2$, in such a way that $A = S^1 \times \{0\} \times [-\frac{1}{2}, \frac{1}{2}]$ and a neighborhood of $\partial A$ in $D$ is $S^1 \times (-1, 1) \times \{\pm\frac{1}{2}\}$. (If we deform $D$ suitably, then we may assume that the symplectic structure is the product one, but this is not necessary). Then the curve $\tilde{D}$ is obtained from $D$ by replacing $S^1 \times (-1, 1) \times \{\pm\frac{1}{2}\}$ by $S^1 \times \tilde{\Gamma}$, where $\tilde{\Gamma} = \{(t, \pm\frac{1}{2} \exp(i\pi\chi(t))),\ t \in (-1, 1)\} \subset (-1, 1) \times D^2$ and $\chi$ is a smooth function which equals 0 near $-1$ and 1 near $1$. This construction is called “braiding” because, forgetting the $S^1$ factor, it replaces the trivial braid $(-1, 1) \times \{\pm\frac{1}{2}\}$ with the half-twist $\tilde{\Gamma}$.

Assume now that $D$ is the branch curve of an $N$-fold symplectic covering $f : X \to Y$. Assume moreover that $f$ is ramified in the same manner above the two boundary components of $A$, i.e., that two of the $N$ lifts of $A$ have boundary contained in the ramification curve $R$; then these two lifts together form an embedded Lagrangian torus $T \subset X$, and we have the following result [3]:

Proposition 3.16 (A.–Donaldson–Katzarkov). The symplectic 4-manifold $\tilde{X}$ obtained from $X$ by Luttinger surgery along the torus $T$ is the total space of a natural symplectic branched covering $\tilde{f} : \tilde{X} \to Y$, whose branch curve $\tilde{D}$ is the curve obtained from $D$ by braiding along the annulus $A$.

Hence, the natural analogue of Question 2.6 for singular plane curves is:

Question 3.17. Let $D_1$, $D_2$ be two symplectic curves with positive nodes and cusps in $\mathbb{CP}^2$, of the same degree and with the same numbers of nodes and cusps. Is it always possible to obtain $D_2$ from $D_1$ by a sequence of braiding operations along Lagrangian annuli?

As before, there is no good reason to believe that the answer should be positive, except that most known examples of non-isotopic symplectic curves seem to reduce to this construction.
This is, e.g., the case for the Fintushel–Stern examples of non-isotopic smooth symplectic curves in elliptic surfaces [11], which are obtained by braiding a disconnected union of elliptic fibers, and for Moishezon’s examples of singular plane curves [23, 3], which are obtained by braiding the branch curve of the projection of a complex surface of general type.

4. Questions about braid monodromy factorizations

4.1. The braid monodromy of a plane curve. One of the main tools to study algebraic plane curves is the notion of braid monodromy, which has been used extensively by Moishezon and Teicher (among others) since the early 1980s in order to study the branch curves of generic projections of complex projective surfaces (see [30] for a detailed overview). Braid monodromy techniques apply equally well to the more general case of Hurwitz curves in $\mathbb{CP}^2$ or more generally in rational ruled surfaces (see Definition 3.12). Given a Hurwitz curve $D$ in $\mathbb{CP}^2$, the projection $\pi : \mathbb{CP}^2 - \{(0:0:1)\} \to \mathbb{CP}^1$ makes $D$ a singular branched cover of $\mathbb{CP}^1$, of degree $d = \deg D$. Each fiber
of $\pi$ is a complex line $\ell \simeq \mathbb{C} \subset \mathbb{CP}^2$, and if $\ell$ does not pass through any of the singular points of $D$ nor any of its vertical tangencies, then $\ell \cap D$ consists of $d$ distinct points. We can trivialize the fibration $\pi$ over an affine subset $\mathbb{C} \subset \mathbb{CP}^1$, and define the braid monodromy morphism
$$\rho : \pi_1(\mathbb{C} - \mathrm{crit}(\pi|_D)) \to B_d.$$
Here $B_d$ is the Artin braid group on $d$ strings (the fundamental group of the configuration space $X_d$ of $d$ distinct points in $\mathbb{C}$), and for any loop $\gamma$ the braid $\rho(\gamma)$ describes the motion of the $d$ points of $\ell \cap D$ inside the fibers of $\pi$ as one moves along the loop $\gamma$. Equivalently, choosing an ordered system of arcs generating the free group $\pi_1(\mathbb{C} - \mathrm{crit}(\pi|_D))$, one can express the braid monodromy of $D$ by a factorization
$$\Delta^2 = \prod_i \rho_i$$
of the central element $\Delta^2$ (representing a full rotation by $2\pi$) in $B_d$, where each factor $\rho_i$ is the monodromy around one of the special points (cusps, nodes, tangencies) of $D$. The monodromy around a tangency point is a half-twist exchanging two strands, i.e., an element conjugate to one of the standard generators of $B_d$; the monodromy around a positive (resp. negative) node is the square (resp. the inverse of the square) of a half-twist; and the monodromy around a cusp is the cube of a half-twist. Hence, we are interested in factorizations of $\Delta^2$ into products of powers of half-twists.

The same Hurwitz curve can be described by different factorizations of $\Delta^2$ in $B_d$: switching to a different ordered system of generators of $\pi_1(\mathbb{C} - \mathrm{crit}(\pi|_D))$ affects the collection of factors $\rho_1, \dots, \rho_r$ by a sequence of Hurwitz moves, i.e., operations of the form
$$\rho_1, \dots, \rho_i, \rho_{i+1}, \dots, \rho_r \ \longleftrightarrow\ \rho_1, \dots, (\rho_i \rho_{i+1} \rho_i^{-1}), \rho_i, \dots, \rho_r;$$
and changing the trivialization of the reference fiber $(\ell, \ell \cap D)$ of $\pi$ (i.e., its identification with the base point in $X_d$) affects braid monodromy by a global conjugation
$$\rho_1, \dots, \rho_r \ \longleftrightarrow\ b^{-1}\rho_1 b, \dots, b^{-1}\rho_r b.$$
For Hurwitz curves whose only singularities are cusps and nodes (of either orientation), the braid monodromy factorization determines the isotopy type completely (see for example [19]). Hence, determining whether two given Hurwitz curves are isotopic is equivalent to determining whether two given factorizations of $\Delta^2$ coincide up to Hurwitz moves and global conjugation. In this language the isotopy problem for Hurwitz curves in $\mathbb{CP}^2$ becomes:

Question 4.1. Given integers $(d, \nu, \kappa)$, can one classify all factorizations of the central element $\Delta^2$ of $B_d$ into a product of $\tau = d(d-1) - 2\nu - 3\kappa$ half-twists, $\nu$ squares of half-twists, and $\kappa$ cubes of half-twists, up to Hurwitz moves and global conjugation?
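The two operations on factorizations described above can be tried out in a toy setting. The following Python sketch is only an illustration under a simplifying assumption: braids are replaced by permutations (written as tuples), since these are easy to multiply by hand. The function names are hypothetical and do not come from the text. The sketch checks that a Hurwitz move changes the list of factors but not their product, and that a global conjugation conjugates the product.

```python
from functools import reduce


def compose(p, q):
    """Group product p*q of two permutations: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))


def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def product(factors):
    """Ordered product rho_1 * ... * rho_r of a factorization."""
    identity = tuple(range(len(factors[0])))
    return reduce(compose, factors, identity)


def hurwitz_move(factors, i):
    """Replace (rho_i, rho_{i+1}) by (rho_i rho_{i+1} rho_i^{-1}, rho_i); i is 0-based."""
    a, b = factors[i], factors[i + 1]
    conjugated = compose(compose(a, b), inverse(a))
    return factors[:i] + [conjugated, a] + factors[i + 2:]


def global_conjugation(factors, g):
    """Replace every factor rho by g^{-1} rho g."""
    g_inv = inverse(g)
    return [compose(compose(g_inv, rho), g) for rho in factors]


if __name__ == "__main__":
    # Three transpositions in S_4, standing in for the factors rho_i.
    factors = [(1, 0, 2, 3), (0, 2, 1, 3), (0, 1, 3, 2)]
    moved = hurwitz_move(factors, 0)
    print(moved != factors, product(moved) == product(factors))
```

The invariance of the product follows from $(\rho_i \rho_{i+1} \rho_i^{-1})\rho_i = \rho_i \rho_{i+1}$, which holds in any group, so the same check would apply to genuine braids given a braid group implementation.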
If our goal is to consider only branch curves of symplectic coverings (rather than arbitrary plane Hurwitz curves), then we need to look specifically for factorizations in which the factors belong to the liftable braid group, i.e., the subgroup of $B_d$ consisting of all braids compatible with given branched covering data. More precisely, assume that a Hurwitz curve $D$ is the branch curve of a symplectic branched covering $f : X \to \mathbb{CP}^2$. The fibers of $\pi$ form a pencil of lines on $\mathbb{CP}^2$, whose preimages by $f$ equip $X$ with a structure of symplectic Lefschetz pencil. By restricting the monodromy of the covering to a fiber $\ell$ of $\pi$, we obtain a symmetric group valued morphism $\theta : \pi_1(\ell - (\ell \cap D)) \to S_N$, which describes how to realize a fiber of the Lefschetz pencil as a branched covering of a fiber of $\pi$. The braid group acts on $\pi_1(\ell - (\ell \cap D))$ by automorphisms; call $b_*$ the automorphism induced by the braid $b$. Then the liftable braid group is
$$LB_d(\theta) = \{b \in B_d,\ \theta \circ b_* = \theta\}.$$
Equivalently, recall that $B_d$ is the fundamental group of the space $X_d$ of configurations of $d$ distinct points in $\mathbb{C}$, and consider the configuration space $\tilde{X}_d$ whose elements are pairs $(\Pi, \sigma)$, where $\Pi$ is a set of $d$ distinct points in $\mathbb{C}$, and $\sigma$ is a surjective group homomorphism from $\pi_1(\mathbb{C} - \Pi)$ to $S_N$ mapping generators to transpositions. The projection $(\Pi, \sigma) \mapsto \Pi$ is a finite covering, and taking $\tilde{*} = (\ell \cap D, \theta)$ as base point we have $LB_d(\theta) = \pi_1(\tilde{X}_d, \tilde{*})$.

4.2. Stabilization and partial conjugation. The main feature which makes braid groups algorithmically manageable is the Garside property. Namely, if we consider the semigroup of positive braids $B_d^+$, defined by the same generators $(\sigma_i)_{1 \le i \le d-1}$ and relations ($\sigma_i \sigma_{i+1} \sigma_i = \sigma_{i+1} \sigma_i \sigma_{i+1}$ $\forall i$ and $\sigma_i \sigma_j = \sigma_j \sigma_i$ $\forall |i - j| \ge 2$) as $B_d$ but without allowing inverses of the generators, then we have the following property [13]:

Theorem 4.2 (Garside). The natural homomorphism $i : B_d^+ \to B_d$ is an embedding.
In other terms, if two positive words in the generators of the braid group represent the same braid, then they can be transformed into each other by repeatedly using the defining relations, without ever introducing the inverses of the generators. Garside’s other fundamental observation is that for any $b \in B_d$ there exists an integer $k$ and a positive braid $\beta \in B_d^+$ such that $\Delta^{2k} b = i(\beta)$ [13]. These properties make it possible to obtain solutions to the word and conjugacy problems (see also [8] for a more modern approach); they also yield a stable classification of braid group factorizations [19]. Namely, let $F_0$ be the standard factorization $\Delta^2 = (\sigma_1 \cdot \ldots \cdot \sigma_{d-1})^d$ in $B_d$, and say that two factorizations $F' = (\rho'_1 \cdot \ldots \cdot \rho'_r)$, $F'' = (\rho''_1 \cdot \ldots \cdot \rho''_r)$ have the same number of factors of each type if they have the same number of factors
$r$ and there is a permutation $\sigma \in S_r$ such that $\rho'_i$ is conjugate to $\rho''_{\sigma(i)}$ for all $i = 1, \dots, r$. Then the following result holds [19]:

Theorem 4.3 (Kharlamov–Kulikov). Let $F'$ and $F''$ be two factorizations of the same element in $B_d$, with the same numbers of factors of each type. Then there exists an integer $n$ such that the factorizations $F' \cdot (F_0)^n$ and $F'' \cdot (F_0)^n$ are equivalent under Hurwitz moves.

(Here $F \cdot (F_0)^n$ is the factorization consisting of the factors of $F$ followed by those of $F_0$ repeated $n$ times.) Theorem 3.14 follows from this result by specifically considering factorizations of $\Delta^{2n}$ whose factors are powers of half-twists and observing that $F_0$ is the braid monodromy factorization of a smooth algebraic plane curve. However, considering that the factors in $F_0$ generate the entire braid group, of which $LB_d(\theta)$ is a proper subgroup as soon as the degree $N$ of the covering is at least 3, one is prompted to ask the following question:

Question 4.4. Given a symmetric group valued morphism $\theta$, does a statement similar to Theorem 4.3 hold for factorizations in $LB_d(\theta)$?

Assuming that the factorization in $LB_d(\theta)$ playing the role of the standard factorization $F_0$ in this statement can be realized as the braid monodromy of an algebraic curve, a positive answer to this question would imply positive answers to Questions 2.5 and 3.13.

Finally, the last question we will consider is that of partial conjugation of braid factorizations. Namely, given a factorization $F$ with factors $\rho_1, \dots, \rho_r$, integers $1 \le p < q \le r$, and a braid $b$ such that $\prod_{p \le i \le q} \rho_i$ commutes with $b$, we can form a new factorization $F'$, with factors $\rho_1, \dots, \rho_{p-1}, (b^{-1}\rho_p b), \dots, (b^{-1}\rho_q b), \rho_{q+1}, \dots, \rho_r$: the partial conjugate of $F$ by $b$.

Lemma 4.5. If the element $b$ belongs to the subgroup of $B_d$ generated by $\rho_p, \dots, \rho_q$, and if $\prod_{p \le i \le q} \rho_i$ is central in this subgroup, then $F'$ is equivalent to $F$ under Hurwitz moves.

The proof is easy, and relies on the same trick as in Lemma 6 of [2].
On the other hand, if $b$ does not belong to the subgroup generated by the factors of $F$, then we can get interesting examples of inequivalent factorizations; this is, e.g., how Moishezon’s examples [23] are constructed.

Question 4.6. Are any two factorizations of the same element in $B_d$ (resp. $LB_d(\theta)$), with the same numbers of factors of each type, equivalent under Hurwitz moves and partial conjugations by elements of $B_d$ (resp. $LB_d(\theta)$)?

A positive answer to this question (for factorizations in $LB_d(\theta)$) would imply that Questions 2.6 and 3.17 also admit positive answers. In fact, if one specifically considers factorizations of $\Delta^2$ into a product of powers of half-twists in $LB_d(\theta)$, then Questions 2.6 and 4.6 are almost equivalent. This is because, given an arbitrary Lagrangian torus $T$ in a symplectic 4-manifold, one can
build a symplectic Lefschetz pencil for which $T$ fibers above an embedded loop $\delta$ in $\mathbb{CP}^1$ and intersects each fiber above $\delta$ in a simple closed loop $\gamma$. Luttinger surgery along $T$ then amounts to a partial conjugation of the monodromy of the pencil by the Dehn twist about $\gamma$, and considering branched coverings of $\mathbb{CP}^2$ instead of Lefschetz pencils it should also amount to a partial conjugation of the braid monodromy of the branch curve.

Moreover, a positive answer to Question 4.6 also implies a positive answer to Question 4.4 (and hence to Questions 2.5 and 3.13), at least provided that there exists an algebraic plane branch curve whose braid monodromy generates the liftable braid subgroup $LB_d(\theta)$. The existence of such a factorization $F_{0,\theta}$ is rather likely, and examples should be relatively easy to find, although the question has not been studied. Assuming that this is the case, given two factorizations $F_1$, $F_2$ in $LB_d(\theta)$ with the same numbers of factors of each type, the factors in $F_1 \cdot F_{0,\theta}$ and $F_2 \cdot F_{0,\theta}$ generate $LB_d(\theta)$, and hence, by Lemma 4.5, any partial conjugation operation performed on $F_1 \cdot F_{0,\theta}$ is equivalent to a sequence of Hurwitz moves. So, if $F_1 \cdot F_{0,\theta}$ and $F_2 \cdot F_{0,\theta}$ are equivalent under Hurwitz moves and partial conjugations then they are equivalent under Hurwitz moves only.

Note added in proof. Questions 2.4 and 2.5 have now essentially been solved. The reader is referred to: D. Auroux, A stable classification of Lefschetz fibrations, Geom. Topol. 9 (2005), 203–217.
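As a small illustration of the partial conjugation operation used in Sect. 4.2, the following Python sketch works in a symmetric group instead of a braid group (an assumption made purely so that the elements can be written down explicitly; the helper names are hypothetical). It conjugates a block of consecutive factors by an element $b$ that commutes with the product of that block, and checks that the total product of the factorization is unchanged, which is what makes the result a factorization of the same element.

```python
from functools import reduce


def compose(p, q):
    """Group product p*q of two permutations: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))


def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def product(factors, size):
    """Ordered product of a (possibly empty) list of permutations of `size` points."""
    return reduce(compose, factors, tuple(range(size)))


def partial_conjugate(factors, p, q, b):
    """Conjugate the factors with 0-based indices p..q (inclusive) by b.

    Raises ValueError unless b commutes with the product of that block,
    which is the hypothesis under which the total product is preserved.
    """
    size = len(b)
    block = product(factors[p:q + 1], size)
    if compose(block, b) != compose(b, block):
        raise ValueError("b must commute with the product of the chosen block")
    b_inv = inverse(b)
    middle = [compose(compose(b_inv, rho), b) for rho in factors[p:q + 1]]
    return factors[:p] + middle + factors[q + 1:]


if __name__ == "__main__":
    t01, t23 = (1, 0, 2, 3), (0, 1, 3, 2)
    factors = [t01, t01, t23, t23]   # the first block multiplies to the identity
    b = (2, 1, 0, 3)                 # the transposition exchanging 0 and 2
    new_factors = partial_conjugate(factors, 0, 1, b)
    print(new_factors, product(new_factors, 4) == product(factors, 4))
```

In this toy example $b$ is not in the subgroup generated by the conjugated factors, so the hypotheses of Lemma 4.5 are not satisfied; the sketch only checks that the product is preserved and says nothing about Hurwitz equivalence.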
References

[1] D. Auroux, Symplectic 4-manifolds as branched coverings of $\mathbb{CP}^2$, Invent. Math. 139 (2000), 551–602.
[2] D. Auroux, Fiber sums of genus 2 Lefschetz fibrations, Turkish J. Math. 27 (2003), 1–10 (math.GT/0204285).
[3] D. Auroux, S. K. Donaldson, L. Katzarkov, Luttinger surgery along Lagrangian tori and non-isotopy for singular symplectic plane curves, Math. Ann. 326 (2003), 185–203.
[4] D. Auroux, S. K. Donaldson, L. Katzarkov, Singular Lefschetz pencils, preprint.
[5] D. Auroux, L. Katzarkov, Branched coverings of $\mathbb{CP}^2$ and invariants of symplectic 4-manifolds, Invent. Math. 142 (2000), 631–673.
[6] D. Auroux, L. Katzarkov, The degree doubling formula for braid monodromies and Lefschetz pencils, preprint.
[7] D. Auroux, V. S. Kulikov, V. V. Shevchishin, Regular homotopy of Hurwitz curves, Izv. Math. 68 (2004), 521–542 (math.SG/0401172).
[8] J. Birman, K. H. Ko, S. J. Lee, A new approach to the word and conjugacy problems in the braid groups, Adv. Math. 139 (1998), 322–353.
[9] S. K. Donaldson, Lefschetz pencils on symplectic manifolds, J. Differential Geom. 53 (1999), 205–236.
[10] R. Fintushel, R. Stern, Knots, links, and 4-manifolds, Invent. Math. 134 (1998), 363–400.
[11] R. Fintushel, R. Stern, Symplectic surfaces in a fixed homology class, J. Differential Geom. 52 (1999), 203–222.
[12] S. Francisco, Symplectic isotopy problem for cusp curves, preprint, in prep.
[13] F. A. Garside, The braid group and other groups, Quart. J. Math. Oxford 20 (1969), 235–254.
[14] D. T. Gay, R. Kirby, Constructing symplectic forms on 4-manifolds which vanish on circles, Geom. Topol. 8 (2004), 743–777.
[15] R. E. Gompf, A new construction of symplectic manifolds, Ann. Math. 142 (1995), 527–595.
[16] R. E. Gompf, A. I. Stipsicz, 4-manifolds and Kirby calculus, Graduate Studies in Math. 20, Amer. Math. Soc., Providence, 1999.
[17] M. Gromov, Pseudo-holomorphic curves in symplectic manifolds, Invent. Math. 82 (1985), 307–347.
[18] K. Honda, Transversality theorems for harmonic forms, Rocky Mountain J. Math. 34 (2004), 629–664.
[19] V. Kharlamov, V. Kulikov, On braid monodromy factorizations, Izv. Math. 67 (2003), 79–118.
[20] V. Kulikov, On a Chisini conjecture, Izv. Math. 63 (1999), 1139–1170.
[21] K. M. Luttinger, Lagrangian tori in $\mathbb{R}^4$, J. Differential Geom. 42 (1995), 220–228.
[22] B. Moishezon, Complex surfaces and connected sums of complex projective planes, Lecture Notes in Math. 603, Springer, Heidelberg, 1977.
[23] B. Moishezon, The arithmetic of braids and a statement of Chisini, Geometric Topology (Haifa, 1992), Contemp. Math. 164, Amer. Math. Soc., Providence, 1994, pp. 151–175.
[24] B. Ozbagci, A. Stipsicz, Noncomplex smooth 4-manifolds with genus 2 Lefschetz fibrations, Proc. Amer. Math. Soc. 128 (2000), 3125–3128.
[25] V. V. Shevchishin, On the local Severi problem, Int. Math. Res. Not. (2004), 211–237 (math.AG/0207048).
[26] B. Siebert, G. Tian, On the holomorphicity of genus two Lefschetz fibrations, preprint, to appear in Ann. Math. (math.SG/0305343).
[27] C. H. Taubes, The Seiberg–Witten and the Gromov invariants, Math. Res. Lett. 2 (1995), 221–238.
[28] C. H. Taubes, The geometry of the Seiberg–Witten invariants, Surveys in Differential Geometry, Vol. III (Cambridge, 1996), Int. Press, Boston, 1998, pp. 299–339.
[29] C. H. Taubes, Seiberg–Witten invariants and pseudo-holomorphic subvarieties for self-dual, harmonic 2-forms, Geom. Topol. 3 (1999), 167–210.
[30] M. Teicher, Braid groups, algebraic surfaces and fundamental groups of complements of branch curves, Algebraic Geometry (Santa Cruz, 1995), Proc. Sympos. Pure Math., 62 (part 1), Amer. Math. Soc., Providence, 1997, pp. 127–150.
[31] W. Thurston, Some simple examples of symplectic manifolds, Proc. Amer. Math. Soc. 55 (1976), 467–468.

Denis Auroux
Department of Mathematics, M.I.T., Cambridge MA 02139, USA
e-mail:
[email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
Harmonic Measure on Fractal Sets

D. Beliaev and S. Smirnov

Abstract. Many problems in complex analysis can be reduced to the evaluation of the universal spectrum: the supremum of multifractal spectra of harmonic measures for all planar domains. Its exact value is still unknown, with very few estimates available. We start with a brief survey of related problems and available estimates from above. Then we discuss in more detail estimates from below, describing the search for a fractal domain which attains the maximal possible spectrum.
1. Introduction

It became apparent during the last decade that extremal configurations in many important problems in classical complex analysis exhibit complicated fractal structure. This makes such problems more difficult to approach than similar ones where extremal objects are smooth. A striking example is given by the coefficient problem for two standard classes of univalent functions $S$ and $\Sigma$.

1.1. Coefficient problems for univalent functions. Let $\mathbb{D} = \{z : |z| < 1\}$ be the unit disc and $\mathbb{D}_- = \{|z| > 1\}$ be its complement. The classes $S$ and $\Sigma$ are defined by
$$S = \{\phi(z) = z + a_2 z^2 + a_3 z^3 + \cdots,\ \phi \text{ is univalent on } \mathbb{D}\},$$
and
$$\Sigma = \{\phi(z) = z + b_1 z^{-1} + b_2 z^{-2} + \cdots,\ \phi \text{ is univalent on } \mathbb{D}_-\}.$$
Univalent means analytic and injective; the letters $S$ and $\Sigma$ stand for the German schlicht. Here and below we use $a_n$ and $b_n$ to denote the Taylor coefficients of functions from $S$ (or $S_b = S \cap L^\infty$) and $\Sigma$ correspondingly. A complete description of all possible coefficient sequences $(a_n)$ and $(b_n)$ is perhaps beyond reach. So one asks what are the maximal possible values of individual coefficients, especially when $n$ tends to infinity. The long history behind this question goes back to works of Koebe and Bieberbach.

Class $S$. It is easy to see that the Koebe function $k(z) = \sum_{n=1}^{\infty} n z^n$ is in fact a univalent map from the unit disk to the plane with a half-line $(-\infty, -1/4]$ removed. It was conjectured by Bieberbach [8] in 1916 that this function is extremal in the class $S$, namely that for any function there one has $|a_n| \le n$.
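The claim about the Koebe function is easy to check by machine. The following Python sketch is an illustration only, with hypothetical function names; it uses the closed form $k(z) = z/(1-z)^2$, a standard identity for the series $\sum n z^n$ that is assumed here rather than quoted from the text. It recovers the coefficients $a_n = n$ from a Cauchy product and evaluates $k$ at the boundary point $z = -1$.

```python
def koebe(z):
    """Closed form of the Koebe function, k(z) = z / (1 - z)^2."""
    return z / (1 - z) ** 2


def koebe_coefficients(num_terms):
    """Taylor coefficients a_0, ..., a_{num_terms - 1} of z / (1 - z)^2."""
    # 1 / (1 - z) has all Taylor coefficients equal to 1.
    geometric = [1] * num_terms
    # Cauchy product of the geometric series with itself gives 1 / (1 - z)^2.
    squared = [
        sum(geometric[j] * geometric[k - j] for j in range(k + 1))
        for k in range(num_terms)
    ]
    # Multiplying by z shifts every coefficient up by one degree.
    return [0] + squared[: num_terms - 1]


if __name__ == "__main__":
    print(koebe_coefficients(6))  # coefficients a_0..a_5
    print(koebe(-1))              # boundary value at z = -1
```

The value $k(-1) = -1/4$ is the tip of the omitted half-line, and the coefficient list shows that equality $|a_n| = n$ holds for every $n$.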
The Bieberbach conjecture was proved by de Branges [17] in 1985 with the help of Loewner evolution [40, 39] which we discuss below. The asymptotic behavior of $\max |a_n|$ was settled much earlier by Littlewood. In 1925 [35] he showed by an elegant argument that $|a_n| \le en$ for any function $\phi \in S$.

Class $\Sigma$. The corresponding problem for class $\Sigma$ appears more difficult, with even the question of asymptotic behavior still wide open. Bieberbach [7] showed in 1914 using his area theorem that $|b_n| \le 1/\sqrt{n}$. While it is easy to produce examples of functions belonging to $\Sigma$ with $|b_n|$ of order $1/n$, Littlewood showed in [34] that those are not extremal. Moreover it is unclear how to construct an extremal function. Not just the problem of finding the sharp upper bound for $|b_n|$, but even determining the correct decay rate is extremely difficult. We define
$$\gamma_\phi := \limsup_{n \to \infty} \frac{\log |b_n|}{\log n} + 1,$$
i.e., $\gamma_\phi$ is the smallest number $\gamma$ such that $|b_n| \lesssim n^{\gamma - 1}$. We then define $\gamma = \gamma_\Sigma$ as the supremum of the $\gamma_\phi$’s over all $\phi \in \Sigma$. To find the value of $\gamma$ one has to solve two problems: prove a sharp estimate from above and construct a function exhibiting the extremal decay rate of coefficients.

The origins of the difficulties for the class $\Sigma$ were explained by Carleson and Jones [13] in 1992. Define another constant $\beta_\phi$ to be the growth rate of lengths of Green’s lines $\Gamma_\delta = \phi(\{z : |z| = 1 + \delta\})$:
$$\beta_\phi := \limsup_{\delta \to 0} \frac{\log \operatorname{length}(\Gamma_\delta)}{|\log \delta|},$$
and let $\beta = \beta_\Sigma$ be the supremum of the $\beta_\phi$’s over all $\phi \in \Sigma$. Define $\gamma_b$, $\beta_b$, $\gamma_s$, and $\beta_s$ as the corresponding constants for the classes $S_b = S \cap L^\infty$ and $S$.

Theorem 1.1 (Carleson & Jones, 1992). The following holds:
$$\gamma = \beta = \gamma_b = \beta_b < \gamma_s = \beta_s = 2.$$

The inequalities $\gamma \le \beta$ for all the three pairs are due to Littlewood [35], who used them in the proof that $|a_n| \le en$. The apparent equality was quite unexpected. Indeed, Littlewood’s argument was quite transparent and in one place used a seemingly irreversible inequality. For a function $\phi(z) = \sum a_k z^k$ in the class $S$ he wrote
$$\begin{aligned}
e \operatorname{length}(\Gamma_{1/n}) &\ge \left(1 - \frac{1}{n}\right)^{-n} \int_{|z| = 1 - 1/n} |\phi'(z)|\,|dz| \\
&= \int_{|z| = 1 - 1/n} |z^{1-n} \phi'(z)|\,d\theta \ \overset{(*)}{\ge}\ \left| \int z^{1-n} \phi'(z)\,d\theta \right| \\
&= \left| \int z^{1-n} \sum_k k a_k z^{k-1}\,d\theta \right| = \left| \sum_k \int k a_k z^{k-n}\,d\theta \right| = 2\pi n |a_n|.
\end{aligned}$$
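The last equality in the chain above, $\int z^{1-n}\phi'(z)\,d\theta = 2\pi n a_n$ on the circle $|z| = 1 - 1/n$, can be checked numerically. The Python sketch below is an illustration only: it takes $\phi$ to be the Koebe function (for which $a_n = n$), uses the derivative $\phi'(z) = (1+z)/(1-z)^3$ computed by hand from $z/(1-z)^2$, and approximates the integral by an equally spaced Riemann sum. The function names and the number of sample points are arbitrary choices.

```python
import cmath
import math


def koebe_derivative(z):
    """Derivative of the Koebe function z / (1 - z)^2."""
    return (1 + z) / (1 - z) ** 3


def coefficient_integral(n, samples=4096):
    """Riemann sum for the integral of z^(1-n) * phi'(z) d(theta) over |z| = 1 - 1/n.

    Requires n >= 2 so that the circle has positive radius.
    """
    radius = 1 - 1 / n
    total = 0j
    for j in range(samples):
        z = radius * cmath.exp(2j * math.pi * j / samples)
        total += z ** (1 - n) * koebe_derivative(z)
    return total * (2 * math.pi / samples)


if __name__ == "__main__":
    n = 5
    # Compare the numerical integral with 2 * pi * n * a_n, where a_n = n.
    print(coefficient_integral(n), 2 * math.pi * n * n)
```

Only the term $k = n$ of the series survives the integration, because $\int_0^{2\pi} e^{i(k-n)\theta}\,d\theta$ vanishes for $k \ne n$; this is why the integral isolates the single coefficient $a_n$.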
Essentially the same argument is valid for the other two classes, and it follows immediately that $\gamma_\phi \le \beta_\phi$. Note that to have an identity, one must attain an approximate equality in the triangle inequality marked by (*). Thus $z^{1-n} \phi'(z)$ should have approximately the same argument around the circle. Carleson and Jones achieved this by a small perturbation of $\phi$, while controlling the change in $\beta_\phi$ and $\gamma_\phi$.

The identity $\gamma = \beta$ explains the nature of extremal maps $\phi$: those maximize the length of Green’s lines $\Gamma_\delta$. For class $S$ the boundary $\partial\Omega$ of the image domain $\Omega = \phi(\mathbb{D})$ may be unbounded, so the Green’s lines can be long because of large diameter. This is exactly what happens for the extremal Koebe function. For classes $\Sigma$ and $S_b$ the situation is different: $\partial\Omega$ is compact. So for the length of Green’s lines to be large, they must “wiggle” a lot, and $\partial\Omega$ must be of infinite length (even $\dim_H \partial\Omega > 1$ for $\beta > 0$). This difference explains why the problem for class $S$ is much easier than for classes $\Sigma$ and $S_b$. So we know that extremal domains for the latter classes should be fractal (self-similar), but there is no understanding of their origin or structure.

1.2. Multifractal analysis of harmonic measure. In [42] Makarov put this problem in a proper perspective, utilizing the language of multifractal analysis, an intensively developing interdisciplinary subject on the border between mathematics and physics. The concepts were introduced by Mandelbrot in 1971 in [44, 45]. We use the definitions that appeared in 1986 in a seminal physics paper [22] by Halsey, Jensen, Kadanoff, Procaccia, and Shraiman, who tried to understand and describe scaling laws of physical measures on different fractals of physical nature (strange attractors, stochastic fractals like DLA, etc.).
Multifractal analysis studies different multifractal spectra (which quantitatively describe the sets where certain scaling laws apply to the mass concentration), their interrelation, and connections to other properties of the underlying measure. There are various definitions of spectra; in our context constructions similar to the grand ensemble in statistical mechanics lead to the integral means spectrum, which for a given function $\phi \in \Sigma$ (or the corresponding domain $\phi(\mathbb{D}_-)$) is defined by
$$\beta_\phi(t) := \limsup_{r \to 1+} \frac{\log \int_0^{2\pi} |\phi'(re^{i\theta})|^t\,d\theta}{|\log(r - 1)|}, \quad t \in \mathbb{R}.$$
The universal integral means spectrum $B(t)$ is defined as the supremum of $\beta_\phi(t)$ for all $\phi \in \Sigma$. Clearly the constant $\beta$ is equal to $B(1)$.

Let $\omega$ be the harmonic measure, i.e., the image under the map $\phi$ of the normalized length on the unit circle. Another useful function is the dimension spectrum, which is defined as the dimension of the set of points where harmonic measure satisfies a certain power law:
$$f(\alpha) := \dim \{ z : \omega(B(z, \delta)) \approx \delta^\alpha,\ \delta \to 0 \}, \quad \alpha \ge \frac{1}{2}.$$
Here dim stands for the Hausdorff or Minkowski dimension, leading to possibly different spectra. Of course, in the general situation there will be many points where the measure behaves differently at different scales, so one has to add lim sup’s and lim inf’s to the definition above; consult [42] for details. The universal dimension spectrum $F(\alpha)$ is defined as the supremum of the $f(\alpha)$’s over all $\phi \in \Sigma$. Note that by Beurling’s theorem the minimal possible power $\alpha$ for simply connected domains is $1/2$, which corresponds to points at the tips of the inward pointing spikes.

The basic question about dimensional structure of harmonic measure on planar domains was resolved by Makarov [41] in 1985 when he showed that the dimension of harmonic measure (i.e., the minimal Hausdorff dimension of the Borel support) on simply-connected domains is always one, and Jones and Wolff [26] proved that for multiply connected domains it is always at most one. In the language of spectra Makarov’s theorem corresponds to the behavior of $F(\alpha)$ near $\alpha = 1$ and $B(t)$ near $t = 0$, see the discussion in [42].

Makarov [42] developed in 1999 the general multifractal framework for harmonic measure. Among other things he showed that Hausdorff and Minkowski versions of universal spectra coincide (while they might differ for individual maps), and that universal integral means and dimension spectra are connected by a Legendre transform:
$$B(t) - t + 1 = \sup_{\alpha > 0}\,(F(\alpha) - t)/\alpha, \qquad F(\alpha) = \inf_{t}\,(t + \alpha(B(t) - t + 1)). \qquad (1.1)$$
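The second formula in (1.1) can be explored numerically. The Python sketch below is only an illustration under a strong assumption: it takes for $B(t)$ the conjectured form $t^2/4$ for $|t| < 2$ and $|t| - 1$ for $|t| \ge 2$ (the universal spectrum conjecture stated later in this section, not a proven result), and approximates the infimum over $t$ by a minimum over a finite grid. The function names, the grid and the interval $[-10, 10]$ are arbitrary choices.

```python
def conjectured_B(t):
    """Conjectured universal integral means spectrum (an assumption, not a theorem)."""
    return t * t / 4 if abs(t) < 2 else abs(t) - 1


def legendre_F(alpha, t_min=-10.0, t_max=10.0, steps=20000):
    """Grid approximation of inf over t of t + alpha * (B(t) - t + 1)."""
    best = float("inf")
    for k in range(steps + 1):
        t = t_min + (t_max - t_min) * k / steps
        best = min(best, t + alpha * (conjectured_B(t) - t + 1))
    return best


if __name__ == "__main__":
    for alpha in (0.5, 1.0, 2.0, 5.0):
        # Compare the grid minimum with the closed form 2 - 1/alpha.
        print(alpha, legendre_F(alpha), 2 - 1 / alpha)
```

For $\alpha \ge 1/2$ the minimum is attained at $t = 2 - 2/\alpha \in [-2, 2)$, where the quadratic branch of $B$ gives the value $2 - 1/\alpha$. For $\alpha < 1/2$ the expression is unbounded below as $t \to -\infty$, consistent with the restriction $\alpha \ge 1/2$.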
The same holds for spectra of individual maps, provided the corresponding domains are “nice” fractals. Makarov extended the Carleson–Jones fractal approximation from $B(1)$ to $B(t)$, see below. He gave a complete characterization of all functions which can occur as spectra: those are precisely all positive convex functions which are majorized by the universal spectrum and satisfy $\beta(t) - t\beta'(t\pm) \ge -1$. In the same paper Makarov described how the universal spectrum is related to many other problems in geometric function theory. We will mention several connections later.

On the basis of work of Brennan, Carleson, Jones, Makarov and computer experiments, Krätzer [30] in 1996 formulated the

Universal spectrum conjecture 1. $B(t) = t^2/4$ for $|t| < 2$ and $B(t) = |t| - 1$ for $|t| \ge 2$.

which by the work of Makarov is equivalent to

Universal spectrum conjecture 2. $F(\alpha) = 2 - 1/\alpha$ for $\alpha \ge 1/2$.

These conjectures are based on several others, discussed below. Unfortunately, apart from numerical evidence, there is not much to support them. All known methods to obtain estimates from above seem to be essentially nonsharp. It is unclear at the moment which approach could lead to the sharp
estimates from above. So it becomes even more important to search for extremal configurations, in the hope that they will help to understand the underlying structure and produce estimates from above as well. In this note we give an exposition of the available methods.

1.3. Survey of related problems. Before discussing the values of the universal spectra, we would like to mention briefly some of the problems which can be reduced to their study. For an extensive discussion, see [42].

Brennan’s conjecture. Brennan [11] conjectured that any conformal map $\psi : \Omega \to \mathbb{D}$ satisfies, for all positive $\varepsilon$,
$$\iint_{\Omega} |\psi'(z)|^{4-\varepsilon}\, dm(z) < \infty,$$
where $m$ is the planar Lebesgue measure. By considering the inverse map, it is easy to see that this conjecture is equivalent to $B(-2) = 1$. See the paper [14] of Carleson and Makarov and the Ph.D. thesis [6] of Bertilsson for reformulations and partial results. For the best known upper bounds for $B(-2)$, see the recent papers by Shimorin [53] and Hedenmalm and Shimorin [24].

The Hölder domains conjecture. Let the map $\phi$ be Hölder continuous: $\phi \in S \cap \text{Höl}(\eta)$. Jones and Makarov proved (see [25] and [42, Th. 4.3]) that the Hausdorff dimension of the boundary of the image domain $\Omega = \phi(\mathbb{D})$ satisfies
$$\dim_H \partial\Omega \le 2 - C\eta$$
for some positive constant $C$. They conjectured that for small values of $\eta$ the constant $C$ can be taken arbitrarily close to 1. It turns out that the universal spectrum conjecture suggests an even stronger statement. Indeed, a corollary of Makarov’s theory (see [42, 43] by Makarov and Pommerenke) is that the universal spectrum $B_\eta(t)$ for the class $S \cap \text{Höl}(\eta)$ is equal to
$$B_\eta(t) = \begin{cases} B(t), & t < t_\eta,\\ (1-\eta)(t - t_\eta) + B(t_\eta), & t \ge t_\eta, \end{cases}$$
where $t_\eta$ is such that the tangent to $B(t)$ at $t = t_\eta$ has slope $1 - \eta$. On the other hand, the maximal possible dimension of $\partial\Omega$ is the root of the equation $B_\eta(t) = t - 1$. After combining these statements and plugging in $B(t) = t^2/4$, an easy calculation shows that the universal spectrum conjecture for $t \in [0, 2]$ is equivalent to the Hölder domains conjecture, which states that the following estimate holds and is sharp for $\eta$-Hölder domains:
$$\dim_H \partial\Omega \le 2 - \eta.$$
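The "easy calculation" can be checked numerically. The following is only a sketch under the assumption that the conjectured value $B(t) = t^2/4$ holds on $[0, 2]$; the function names are illustrative and not taken from the paper. It builds $B_\eta$ from the piecewise formula above, solves $B_\eta(t) = t - 1$ by bisection, and compares the root with $2 - \eta$.

```python
def B_conj(t):
    """Conjectured universal spectrum: t^2/4 for |t| < 2, |t| - 1 otherwise."""
    return t * t / 4 if abs(t) < 2 else abs(t) - 1


def B_holder(t, eta):
    """B_eta(t) for the class S ∩ Höl(eta), using the piecewise formula of the text."""
    t_eta = 2 * (1 - eta)  # tangent slope B'(t) = t/2 equals 1 - eta at t_eta
    if t < t_eta:
        return B_conj(t)
    return (1 - eta) * (t - t_eta) + B_conj(t_eta)


def max_boundary_dimension(eta, tol=1e-12):
    """Root of B_eta(t) = t - 1 on [0, 2], found by bisection.

    h(t) = B_eta(t) - (t - 1) is decreasing, with h(0) = 1 > 0 and
    h(2) = -eta^2 < 0, so the root is unique for 0 < eta <= 1.
    """
    lo, hi = 0.0, 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if B_holder(mid, eta) - (mid - 1) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    for eta in (0.1, 0.5, 0.9):
        print(eta, max_boundary_dimension(eta), 2 - eta)
```

By hand: $t_\eta = 2(1-\eta)$, so $B_\eta(t) = (1-\eta)t - (1-\eta)^2$ for $t \ge t_\eta$, and equating this with $t - 1$ gives $t = 2 - \eta$.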
Multiply connected domains. One can define similar spectra for multiply connected domains. Since the class of domains is larger, they are a priori different (e.g., the integral means spectrum cannot be defined, or rather is infinite, for multiply connected domains when $t$ is negative). However, a combination of results of Binder, Makarov, Smirnov [10] and Binder, Jones [9] proves that they coincide whenever both are finite (i.e., the $B$’s for $t \ge 0$ and the $F$’s for $\alpha \ge 1/2$).

Value distribution of entire functions. There is yet another constant $\alpha$ studied by Littlewood [36], which is the smallest $\alpha$ such that
$$\sup_{p \in P_n} \iint_{\mathbb{D}} \frac{|p'|}{1 + |p|^2}\, dm \le \mathrm{const}(\varepsilon)\, n^{\alpha + \varepsilon}, \qquad \forall \varepsilon > 0,$$
where $P_n$ is the collection of all polynomials of degree $n$. The results mentioned above, together with Eremenko [20] and Beliaev, Smirnov [4], imply that $\alpha = B(1)$. Since $\alpha$ is more difficult to estimate directly, this greatly improves the previously known estimates $1.11 \cdot 10^{-5} < \alpha < 1/2 - 2^{-264}$ from [1, 33].

The constant $\alpha$ plays a role in a seemingly unrelated problem in the value distribution of entire functions. Under the assumption that $\alpha < 1/2$ (proved only later by Lewis and Wu [33]), Littlewood proved in [36] a surprising theorem: for any entire function $f$ of finite order, most roots of $f(z) = w$, for any $w$, lie in a small set. This can be quantified in several ways. One particular implication is that for any entire function $f$ of finite order $\rho > 0$ there is a set $E$ such that, for any $w$ and for sufficiently large $R$, most roots of $f(z) = w$ inside $\{|z| < R\}$ lie in $E$, while
$$\mathrm{Area}(E \cap \{|z| < R\}) \lesssim R^{2 - 2\rho(1/2 - \alpha)}.$$
See [36, 4] for an exact formulation.

Universal spectra for other classes of maps. It was shown by Makarov in [42] that universal spectra for many other classes of univalent maps (e.g., Hölder continuous, with bounds on the dimension of the boundary of the image domain, with $k$-fold symmetry) can be easily obtained from the universal spectrum $B(t)$ for the class $\Sigma$.

For example, while the universal spectrum for $S_b$ is the same, $B_b(t) = B(t)$, the universal spectrum $B_s(t)$ for the class $S$ satisfies
$$B_s(t) = \max\left(B(t),\, 3t - 1\right).$$
In particular, one notices immediately that $\gamma_s = B_s(1) = 2$.

The same idea can be applied to an old problem about coefficients of $m$-fold symmetric univalent functions:
$$\phi(z) = z + a_{m+1} z^{m+1} + a_{2m+1} z^{2m+1} + \dots.$$
Szegő conjectured that $|a_n| = O(n^{-1+2/m})$. This conjecture was proved for $m = 1$ by Littlewood [35, Th. 20], for $m = 2$ by Littlewood and Paley [37], and for $m = 3$ and (with a logarithmic correction) for $m = 4$ by Levin [32]. On the other hand, Littlewood [34] proved that the conjecture fails for large $m$.
Makarov proved [42] that the universal spectrum $B^{[m]}(t)$ for $m$-fold symmetric functions satisfies
$$B^{[m]}(t) = \max\left(B(t),\, \left(1 + \frac{2}{m}\right)t - 1\right).$$
In particular, the growth rate of the coefficients is given by
$$\begin{cases} 2/m - 1, & m \le 2/B(1),\\ B(1) - 1, & m \ge 2/B(1). \end{cases}$$
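The dichotomy in this growth rate can be tabulated directly. The sketch below is illustrative only: the function names are hypothetical, and the values fed in for $B(1)$ are inputs chosen for the example (the conjectural value 1/4, the lower bound 0.23 of Theorem 2.4, and 0.17, the exponent in Pommerenke's examples quoted in Section 2.4). It evaluates $B^{[m]}(1) = \max(B(1), 2/m)$ and finds the first $m$ at which the exponent $2/m - 1$ of Szegő's conjecture is exceeded.

```python
def B_mfold_at_1(B1, m):
    """B^[m](1) = max(B(1), 2/m), from B^[m](t) = max(B(t), (1 + 2/m) t - 1) at t = 1."""
    return max(B1, 2.0 / m)


def growth_exponent(B1, m):
    """Exponent e in |a_n| = O(n^e) for m-fold symmetric maps, given a value B1 of B(1)."""
    return B_mfold_at_1(B1, m) - 1.0


def first_failing_m(B1):
    """Smallest m with 2/m < B1, i.e. where the exponent exceeds Szegő's 2/m - 1."""
    m = 1
    while 2.0 / m >= B1:
        m += 1
    return m


if __name__ == "__main__":
    for B1 in (0.25, 0.23, 0.17):
        print(B1, first_failing_m(B1))
```

For $m = 1$ the formula gives $B^{[1]}(1) = \max(B(1), 2) = 2$, which agrees with $\gamma_s = B_s(1) = 2$ for the class $S$.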
This theorem, together with the Carleson and Jones conjecture, suggests that the Szegő conjecture holds for $m \le 8$ and fails for $m \ge 9$. The previously known estimates for $B(1)$ show that the Szegő conjecture holds for $m = 1, 2, 3, 4$ and fails for $m \ge 12$. Our improved estimate $B(1) > 0.23$ (see Theorem 2.4 below) implies that the conjecture is indeed wrong for $m \ge 9$.

1.4. Estimating universal spectra. The known results about universal spectra use a variety of approaches to produce estimates from above and below. At present the estimates from above are rather far from being sharp, and it is unclear which methods could give exact results. In the hope of gaining understanding, we concentrate in the next sections on estimates from below, that is, on constructing (fractal) maps with large spectra. There is also hope that eventually the universal spectrum will be evaluated exactly by showing that it is equal to the spectrum of some particular “fractal” map, for which it can be calculated (cf. the discussion of fractal approximation below).

Before we pass to fractal examples, we sketch the situation with estimates from above, using $B(1)$ as an example. See also Problems 6.5, 6.7, and 6.8 from Hayman’s problem list [23], the survey paper [51] and the books [49, 50] by Pommerenke. The conjectural value of $\gamma = \gamma_b = B(1)$ is 1/4, but the existing estimates are quite far from it. The first result in this direction is due to Bieberbach [7], who in 1914 used his area theorem to prove that $\gamma \le 1/2$. Littlewood, Paley, and Levin proved the aforementioned estimates on $|a_n|$ for $m$-fold symmetric functions for $m = 1, 2, 3, 4$. Clunie and Pommerenke in [16] proved that $\gamma \le 1/2 - 1/300$ and $\gamma_b \le 1/2 - \varepsilon$ for some $\varepsilon > 0$. They used a differential inequality on $|\phi'(r\xi)|^\delta$ for a fixed small $\delta$. Carleson and Jones [13] established that $\gamma = \gamma_b$ and used Marcinkiewicz integrals to prove $\gamma < 0.49755$. This estimate was improved by Makarov and Pommerenke [43] to $\gamma < 0.4886$ and then by Grinshpan and Pommerenke [21] to $\gamma < 0.4884$. The best current estimate is due to Hedenmalm and Shimorin [24], who quite recently proved that $B(1) < 0.46$.

2. Searching for extremal fractals

It is clear that extremal domains should be fractal. There are several standard classes of fractals that one can study. For most of them the fractal approximation holds. This means that the supremum of spectra over this particular
class of fractals is equal to the universal spectrum. These results can help to understand the nature of extremal domains, but it is not clear whether one can get any upper bound in this way. Another problem is that it is extremely difficult to work with harmonic measure on fractals, because the radial behavior of a conformal map depends on $\arg z$ in a highly non-regular way. We will argue that the solution to this problem might lie in considering random fractals, where averaging over many maps makes the behavior of $\phi$ statistically the same for all values of $\arg z$. Below we give a short overview of the fractals and methods that were used in the search for lower bounds.

2.1. Lacunary series. The first estimate from below is due to Littlewood [34], who disproved for large $m$ the Szegő conjecture about coefficients of $m$-fold symmetric functions: using lacunary series he constructed an explicit function with $|a_n| > A(m)\, n^{-1 + a/\log m}$ for infinitely many $n$, where $a$ is a universal constant. Much later Clunie [15] used the same technique for the class $\Sigma$ and constructed a function with $|b_n| > n^{0.002 - 1}$ for infinitely many $n$. A similar technique was used by Pommerenke [47, 48], see the discussion below. The method consisted of writing a specific Taylor series convergent in $\mathbb{D}$ and using the argument principle to check that the resulting function is a schlicht map. It turns out that such series describe maps to fractal domains. Since it is much easier to construct analytic functions than univalent ones, it would be interesting to know whether more advanced univalence criteria can be used to obtain interesting examples.

2.2. Geometric snowflakes. A canonical geometric construction, called a snowflake, was introduced by von Koch [28, 29] as an example of a nowhere differentiable curve. We start with a “building block”, a polygon $P = P_0$. The construction proceeds in the following fashion: to obtain $P_{n+1}$, a part of each side of $P_n$ is replaced by a scaled copy of $P$. In the limit a fractal called a snowflake is obtained, which we identify with a conformal map of $\mathbb{D}_-$ to its complement. Carleson and Jones proved that to find the value of $\beta$ it is enough to study snowflakes. Let $\Sigma_{\text{snowflake}}$ be the class of conformal mappings whose image domain is a snowflake, and set $\beta_{\text{snowflake}} = \sup \beta_\phi$, where the supremum is taken over all snowflakes $\phi \in \Sigma_{\text{snowflake}}$. Then

Theorem 2.1 (Fractal approximation, Carleson & Jones, 1992). $\beta_{\text{snowflake}} = \beta$.

Makarov developed their machinery to extend the result to the multifractal spectra. In [42, Th. 5.1] he gives a complete proof in the multiply connected situation (when one works with Cantor sets rather than von Koch snowflakes), and outlines it in the simply connected case. Again, $F_{\text{snowflake}}(\alpha)$ and $B_{\text{snowflake}}(t)$ are defined as suprema of $f_\phi(\alpha)$ and $\beta_\phi(t)$ over $\phi \in \Sigma_{\text{snowflake}}$:
Theorem 2.2 (Fractal approximation, Makarov, 1999). $F_{\text{snowflake}}(\alpha) = F(\alpha)$, $B_{\text{snowflake}}(t) = B(t)$.

Figure 1. Julia set for $z^2 - 0.56 + 0.664i$.
Fractal approximation tells us that it is enough to study harmonic measure on snowflakes. The construction of a snowflake is geometric, so it is easy to control dimensions, but estimating harmonic measure is much harder.

2.3. Julia sets. Harmonic measure arises in a natural way for Julia sets of polynomials. If $p(z)$ is a polynomial, we denote by $F_\infty$ its domain of attraction to infinity, that is, the set of $z$ such that the iterates $p(p(\dots p(z) \dots))$ tend to infinity. The Julia set of $p$ is then the boundary of $F_\infty$. It was demonstrated by Brolin [12] that harmonic measure on $F_\infty$ is balanced (has constant Jacobian under mapping by $p$) and by Lyubich [38] that it maximizes entropy. Similarly, multifractal spectra have dynamical meaning. For example, the integral means spectrum is related to the thermodynamical pressure:
$$\beta(t) - t + 1 = \sup_{\mu} \frac{I(\mu) - t \int \log|p'|\, d\mu}{\log \deg p},$$
where the supremum is taken over all invariant measures $\mu$ and $I(\mu)$ denotes entropy, see [42] and the references therein. This provides more tools to analyze harmonic measure. For example, establishing its dimension in this particular case is easier, and has more intuitive reasons, than in the general case: compare [46] of Manning to Makarov’s [41] treatment of the general situation.

Carleson and Jones [13] studied $\beta$ numerically for domains of attraction to infinity for quadratic polynomials $f(z) = z^2 + c$, and obtained the non-rigorous estimate $\beta \approx 0.24$ for $c = -0.560 + 0.6640i$. Figure 1 shows the corresponding Julia set. Based on this computer experiment and on an analogy with conformal field theory, they conjectured that $B(1) = 1/4$.
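The definition of the domain of attraction to infinity translates into a simple escape test for quadratic polynomials. The sketch below is not the numerical method of Carleson and Jones; it is a generic illustration, and the function name, the iteration cap and the default radius are assumptions made for the example. It relies on the standard fact that for $p(z) = z^2 + c$ any orbit that leaves the disc of radius $\max(2, |c|)$ tends to infinity; a `False` answer only means that no escape was seen within the iteration cap.

```python
def escapes_to_infinity(z, c, max_iter=200, radius=None):
    """Heuristic membership test for the basin of infinity of p(z) = z^2 + c.

    Returns True as soon as an iterate leaves the disc of radius max(2, |c|),
    outside of which |p(z)| > |z| and the orbit diverges.
    """
    if radius is None:
        radius = max(2.0, abs(c))
    for _ in range(max_iter):
        if abs(z) > radius:
            return True
        z = z * z + c
    return False


if __name__ == "__main__":
    c = complex(-0.56, 0.664)  # the parameter of Figure 1
    print(escapes_to_infinity(3 + 0j, c))
```

Points for which the test returns `False` for a large iteration cap approximate the complement of $F_\infty$, whose boundary is the Julia set.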
Recently Binder and Jones [9] proved fractal approximation by Julia sets. Together with a theorem of Binder, Makarov, and Smirnov [10] it implies that $B(t) = B_{mc}(t)$ for $t \ge 0$, where $B_{mc}$ is the (a priori larger) universal spectrum for multiply connected domains. It is conjectured by Jones that there is a fractal approximation by quadratic polynomials. If this is true, the universal spectrum will probably be attained by the Mandelbrot set. Despite this progress, it is still unclear whether one can employ Julia sets to estimate the universal spectra, since rigorous dimension estimates are very hard in this class of fractals.

2.4. Conformal snowflakes. We would like to introduce a new class of random conformal snowflakes. This class is interesting because fractal approximation holds, while estimates of the spectra reduce to (much simpler) eigenvalue estimates for integral equations. It also appears that even simple building blocks lead to snowflakes with rather large spectrum.

We start with a deterministic construction, which is related to those used by Littlewood and Pommerenke. Denote by $\Sigma$ the class of univalent maps of $\mathbb{D}_- = \{|z| > 1\}$ into itself, preserving infinity. Fix an integer $k \ge 2$. We define the Koebe $k$-root transform of $\phi \in \Sigma$ by
$$K_k\phi(z) = \sqrt[k]{\phi(z^k)} \in \Sigma.$$
The first generation of the snowflake is given by some function $\Phi_0 = \phi \in \Sigma$. Let $\Phi_n(z) = K_{k^n}\phi(z)$. The $n$th approximation to the snowflake is given by $f_n = \Phi_0(\Phi_1(\dots \Phi_n(z) \dots))$. We define the conformal snowflake as the limit $f = \lim f_n$. Let $\psi = \phi^{-1}$ and $g_n = f_n^{-1}$. It is easy to check that
$$f_{n+1}(z) = \phi\left(\sqrt[k]{f_n(z^k)}\right), \qquad g_{n+1}(z) = \sqrt[k]{g_n(\psi(z)^k)}.$$
Therefore the limit map $g = \lim g_n$ satisfies
$$g(z)^k = g(\psi(z)^k).$$
So $g$ semi-conjugates the dynamical systems $z \mapsto z^k$ and $z \mapsto \psi(z)^k$ on $\mathbb{D}_-$, and the resulting snowflake is a Julia set of $\psi^k$ acting on $\mathbb{D}_-$ (i.e., the attractor of inverse iterates).
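The recursion $f_{n+1}(z) = \phi(\sqrt[k]{f_n(z^k)})$ can be checked numerically on a toy example. The sketch below is only an illustration: the building block $\phi(z) = 2z + 1$ is a hypothetical choice (it maps $\{|z| > 1\}$ into itself and fixes infinity, but it is not the slit map used later in the paper), and the helper names are invented. The $k$-th root is taken as $z\,(F(z^k)/z^k)^{1/k}$ with the principal branch, which is the right branch here because $F(w)/w$ stays in a sector around the positive real axis for this particular $\phi$.

```python
import cmath

K = 2  # the integer k >= 2 of the construction


def phi(z):
    """Hypothetical building block: 2z + 1 maps {|z| > 1} into itself and fixes infinity."""
    return 2 * z + 1


def koebe(F, k):
    """Koebe k-root transform z -> (F(z^k))^(1/k), written as z * (F(z^k)/z^k)^(1/k).

    The principal branch is used; this is valid as long as F(w)/w avoids the
    negative real axis, which holds for the toy phi above.
    """
    def transformed(z):
        w = z ** k
        return z * (F(w) / w) ** (1.0 / k)
    return transformed


def snowflake_approx(n, k=K):
    """f_n = Phi_0 ∘ Phi_1 ∘ ... ∘ Phi_n with Phi_j = K_{k^j} phi."""
    maps = [koebe(phi, k ** j) for j in range(n + 1)]

    def f(z):
        for Phi in reversed(maps):  # apply Phi_n first, Phi_0 last
            z = Phi(z)
        return z
    return f


if __name__ == "__main__":
    z = 1.3 * cmath.exp(0.7j)
    f2, f3 = snowflake_approx(2), snowflake_approx(3)
    direct = f3(z)                          # composition Phi_0(Phi_1(Phi_2(Phi_3(z))))
    recursive = phi(koebe(f2, K)(z))        # phi applied to the k-th root of f_2(z^k)
    print(abs(direct - recursive))
```

The two evaluations agree up to rounding because $\Phi_{j+1} = K_k \Phi_j$ and the transform $K_k$ commutes with composition, so $\Phi_1 \circ \dots \circ \Phi_{n+1} = K_k f_n$.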
Because the construction is based on iterated conformal maps, harmonic measure is easier to handle than in the case of geometric snowflakes, and even of polynomial Julia sets. It turns out that there is a fractal approximation for conformal snowflakes:

Theorem 2.3 (Fractal approximation). Let $B_{\text{csf}}(t)$ be the universal integral means spectrum for conformal snowflakes. Then $B_{\text{csf}}(t) = B(t)$.

The proof is quite similar to the proof of fractal approximation for snowflakes due to Carleson and Jones. We sketch the proof for the case $t = 1$; the complete proof appears in [2]. Let us choose a function $\phi$ such that it has a long Green’s line with potential $1/k$, namely $\mathrm{length}(\Gamma_{1/k}(\phi)) \approx k^\beta$, with $\beta = B(1)$.
Then for $\Phi_j = \sqrt[k^j]{\phi(z^{k^j})}$ the Green’s line with potential $1/k^j$ has length $\approx k^\beta$. One can argue that the length of the Green’s line for $f_n$ is the product of the lengths of the Green’s lines for the $\Phi_j$’s, since those oscillate on different scales:
$$\mathrm{length}\left(\Gamma_{1/k^n}(\Phi_0 \circ \Phi_1 \circ \dots \circ \Phi_n)\right) \approx \prod_{j=0}^{n} \mathrm{length}\left(\Gamma_{1/k^j}(\Phi_j)\right) \approx k^{n\beta},$$
and it follows that the specific snowflake we constructed almost attains the universal $\beta$.

As we noted above, Pommerenke used a similar construction in [47, 48] to produce maps with large coefficients. Let
$$\phi_k(z) = z \left(\frac{1 - \lambda}{1 - \lambda z^{mq^k}}\right)^{2/mq^k},$$
where $\lambda$ and $q$ are parameters. He studied functions $f_k$ defined recursively by $f_k(z) = f_{k-1}(\phi_k(z))$. Using this construction he first found functions from $S_b$ and $\Sigma$ with $|a_n|, |b_n| > \mathrm{const}\cdot n^{0.139 - 1}$, and then improved the estimate to $|a_n|, |b_n| > \mathrm{const}\cdot n^{0.17 - 1}$. Later Kayumov [27] used this technique to prove that $B(t) > t^2/5$ for $0 < t < 2/5$.

2.5. Random conformal snowflakes. Conformal snowflakes are easier to work with than Julia sets or geometric snowflakes. However, they share the same problem: the behavior of $f$ depends on the symbolic dynamics of $\arg z$. To solve this problem we introduce a random rotation on every step:
$$g_{n+1}(z) = \sqrt[k]{g_n(\psi(e^{i\theta_n} z)^k)}, \tag{2.1}$$
where the $\theta_n$ are independent random variables uniformly distributed in $[0, 2\pi)$. Capacity estimates show that there exists a limiting random conformal map $g = g_\infty$, and sending $n \to \infty$ we obtain the stationarity of $g$ under the random transformation (2.1):
$$g(z) = \sqrt[k]{g(\psi(e^{i\theta} z)^k)}, \tag{2.2}$$
where $\theta$ is uniformly distributed in $[0, 2\pi)$, and equality should be understood in the sense of random maps having the same distribution.

Using (2.2) one can write a similar equation for the derivative $g'$, and also integral equations (depending on the building block and $k$) for expectations like $E|g'|^t$. This reduces the determination of the spectrum of a random conformal snowflake to the evaluation of the spectral radius of a particular integral operator (3.3) on the half-line. While its exact value seems beyond reach for the time being, one can obtain decent estimates. As an example, we prove in [2] the following

Theorem 2.4. There is a particular snowflake with $\beta(1) > 0.23$.

This snowflake is generated by a simple slit map. Figures 2 and 3 show its third generation and the blow up of its boundary with three Green’s lines.
Figure 2. Random conformal snowflake from Theorem 2.4
Figure 3. Blow up of the boundary of the random conformal snowflake from Theorem 2.4 with three Green’s lines.

The general theory of random conformal snowflakes is developed in [2, 3]. In particular, the fractal approximation Theorem 2.3 extends to random conformal snowflakes. Since the building blocks can be taken smooth and relate to the spectra in a simple way, we hope that eventually one might be able to develop some kind of variational principle, which together with the fractal approximation might yield estimates from above.

The random conformal snowflakes can be considered as Julia sets of random sequences of schlicht maps. One can similarly study the spectra for more
traditional Julia sets of random sequences of polynomials. Unfortunately, after some technical difficulties one arrives at integral equations which are rather hard to work with.

2.6. Schramm-Loewner Evolutions. A very interesting class of random “conformal” fractals was recently introduced by Schramm [52]. The whole plane Schramm-Loewner Evolution with parameter $\kappa \ge 0$, or SLE$_\kappa$, is defined as the solution of the Loewner equation (cf. [40, 39])
$$\partial_\tau g_\tau(z) = -g_\tau(z)\, \frac{g_\tau(z) + \xi_\tau}{g_\tau(z) - \xi_\tau}, \tag{2.3}$$
where the driving force is given by $\xi_\tau = \exp(i\sqrt{\kappa} B_\tau)$, with $B_\tau$ being the standard one-dimensional Brownian motion. The initial condition is
$$\lim_{\tau \to -\infty} e^{\tau} g_\tau(z) = z.$$
This equation describes the evolution of random univalent maps $g_\tau$ from $\mathbb{C} \setminus H_\tau$ onto $\mathbb{D}_-$. One calls SLE$_\kappa$ this family of random maps, as well as the family of random hulls $H_\tau$ and of inverse maps $f_\tau = g_\tau^{-1}$. See Lawler’s book [31] for the proof of existence and basic properties.

The traces of the Schramm-Loewner evolutions are the only possible conformally invariant scaling limits of cluster perimeters in critical lattice models. As such, the values of their spectra were (non-rigorously) predicted by the physicist Duplantier [18, 19] by means of Conformal Field Theory and Quantum Gravity arguments:

Theorem 2.5 (CFT prediction, Duplantier, 2000). The $f(\alpha)$ spectrum for the bulk of SLE$_\kappa$ is equal to
$$f(\alpha) = \alpha - \frac{(25 - c)(\alpha - 1)^2}{12(2\alpha - 1)},$$
where $c$ is the central charge, which is related to $\kappa$ by
$$c = \frac{(6 - \kappa)(6 - 16/\kappa)}{4}.$$
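The prediction is easy to evaluate numerically. The sketch below is a sanity check rather than part of the argument; the function names, the grid size and the cut-off `alpha_max` are choices made for the example. It also assumes that the Legendre-type relation (1.1), which is not reproduced in this part of the text, has the form $\beta(t) - t + 1 = \sup_{\alpha} (f(\alpha) - t)/\alpha$, and approximates the supremum on a grid over $\alpha > 1/2$.

```python
import math


def central_charge(kappa):
    """c = (6 - kappa)(6 - 16/kappa) / 4, as in Theorem 2.5."""
    return (6 - kappa) * (6 - 16 / kappa) / 4


def f_duplantier(alpha, kappa):
    """Predicted f(alpha) spectrum for the bulk of SLE_kappa (alpha > 1/2)."""
    c = central_charge(kappa)
    return alpha - (25 - c) * (alpha - 1) ** 2 / (12 * (2 * alpha - 1))


def beta_from_f(t, kappa, n=100000, alpha_max=20.0):
    """Grid approximation of beta(t) = t - 1 + sup_{alpha > 1/2} (f(alpha) - t) / alpha.

    Assumes the Legendre-type relation stated in the lead-in; the grid gives a
    lower approximation of the supremum.
    """
    best = -math.inf
    for i in range(1, n + 1):
        alpha = 0.5 + (alpha_max - 0.5) * i / n
        best = max(best, (f_duplantier(alpha, kappa) - t) / alpha)
    return best + t - 1


if __name__ == "__main__":
    print(central_charge(4))                      # central charge at kappa = 4
    print(beta_from_f(1.0, 4), 3 - 2 * math.sqrt(2))
```

At $\kappa = 4$ the central charge is $c = 1$, the formula simplifies to $f(\alpha) = (3\alpha - 2)/(2\alpha - 1)$, and the supremum at $t = 1$ is attained at $\alpha = 1 + \sqrt{2}/2$.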
The prediction should be understood as the “mean” or the “almost sure” value of the spectra. Below we sketch a rigorous proof of Duplantier’s prediction, given by us in [2, 5]. As in the case of conformal snowflakes, stationarity implies that expressions like $E|f'(z)|^t$ satisfy certain equations. This time the equation turns out to be a heat equation (3.1) with variable coefficients, and the asymptotics of solutions can be evaluated exactly. The maximal value of such spectra is attained for $\kappa = 4$:
$$f(\alpha) = \frac{3}{2} - \frac{1}{4\alpha - 2}, \qquad \kappa = 4,$$
which gives for example $\beta(1) = 3 - 2\sqrt{2} \approx 0.17$. So SLE does not have a large spectrum, but at present it is perhaps the only fractal where the spectra can be written exactly.

In the hope of obtaining a large spectrum it is natural to generalize SLE by considering other driving forces. In our derivations the Markov property plays an essential role, so the first logical choice would be to consider Lévy processes. One can apply the same technique as in the case of SLE and reduce the problem of finding the spectrum to the analysis of a particular integro-differential equation, but at present we do not have good rigorous estimates of its spectral radius. On the other hand, numerical experiments by us and by Kim and Meyer suggest that the Loewner Evolution driven by a Cauchy process has a large spectrum. In view of Theorems 2.4 and 2.5 there is certainly no fractal approximation by SLE’s, but one can argue that a fractal approximation principle could hold in the class of “Lévy-Loewner Evolutions.”

3. Estimates of spectra for random fractals

For random fractals it is very natural to study the mean spectrum, i.e., the behavior of $E|f'(z)|^t$ instead of $|f'(z)|^t$. When available, correlation estimates can be used to show that the mean spectrum is attained by almost every realization of the fractal. Moreover, one can show using Makarov’s fractal approximation theorem that the universal spectrum is greater than the mean spectrum for any class of fractals, so for estimates from below it suffices to study the mean spectrum.

The random models that we mentioned above have some kind of stationarity. This means that $E|f'(z)|^t$ is invariant with respect to some random transformation, which implies that it is a solution of a particular equation. Usually it is much easier to analyze the asymptotic behavior of solutions than the average local behavior of conformal maps. Below we describe how to apply these ideas in the case of SLE and random conformal snowflakes.

3.1. Exact solutions for SLE. Let $f_\tau : \mathbb{D}_- \to \mathbb{C} \setminus H_\tau$ be the whole plane SLE$_\kappa$. Then $e^{-\tau} f_\tau$ has the same distribution as $f_0$ (see [31] for the proof). One can check that $F(z) = E\left[e^{-t\tau} |f_\tau'(z)|^t\right]$ is a $t$-covariant martingale with respect to the filtration generated by the driving force $B_s$, $s < \tau$. This implies that $F(z) = F(r, \theta)$ solves the second-order PDE
$$t\left(\frac{r^4 + 4r^2(1 - r\cos\theta) - 1}{(r^2 - 2r\cos\theta + 1)^2} - 1\right)F + \frac{r(r^2 - 1)}{r^2 - 2r\cos\theta + 1}\,F_r - \frac{2r\sin\theta}{r^2 - 2r\cos\theta + 1}\,F_\theta + \frac{\kappa}{2}\,F_{\theta\theta} = 0. \tag{3.1}$$
Here the first term is contributed by $t$-covariance, the second and the third form the derivative in the direction of the Loewner flow (with constant driving force), whereas the fourth term is the generator of the driving force, the Brownian motion.
For such an equation it appears possible to analyze exactly the behavior of solutions as $r \to 1+$. Applying Frobenius theory formally, one can obtain the local solution near the singular “growth” point $(\theta, r) = (0, 1)$, which, e.g., for $t \le t_* = 3(4 + \kappa)^2/(32\kappa)$ has the form
$$(r - 1)^{-\beta} \cdot \left((r - 1)^2 + \theta^2\right)^{\gamma}, \tag{3.2}$$
for
$$\beta = \beta(t, \kappa) = -t + \frac{(4 + \kappa)^2 - (4 + \kappa)\sqrt{(4 + \kappa)^2 - 8\kappa t}}{4\kappa}, \qquad \gamma = \gamma(t, \kappa) = \frac{4 + \kappa - \sqrt{(4 + \kappa)^2 - 8\kappa t}}{2\kappa}.$$
Tweaking the formula (3.2), one constructs global sub- and super-solutions of the PDE (3.1) which behave as $(r - 1)^{-\beta}$ when $r \to 1+$. Hence, by the maximum principle, any solution has such asymptotics. Thus for $t \le t_*$ the mean spectrum $\beta_*(t)$ is equal to $\beta(t)$. It is easy to see that the mean spectrum is a convex function bounded by the universal spectrum. The latter is equal to $t - 1$ for $t \ge 2$, and since $\beta_*'(t_*-) = 1$, one easily infers that $\beta_*(t) = \beta(t_*) + t - t_*$ for $t > t_*$. The derived spectrum $\beta_*(t)$ is the Legendre transform (1.1) of Duplantier’s prediction for $f(\alpha)$. Details of the proof appear in [2, 5].

Our reasoning applies to the case of a Loewner Evolution driven by a Lévy process with generator $A$. The function $F(z)$ satisfies the same equation (3.1), with the term $\frac{\kappa}{2} F_{\theta\theta}$ replaced by $AF$. We are not able to perform a rigorous analysis of the resulting equations yet, but this direction of investigation seems rather promising.

3.2. Estimates for snowflakes. Let $f$ be a random conformal snowflake as defined in Section 2.5. The construction of $f_n$ is such that it seems impossible to deduce an equation for $E|f'|^t$, which seems to be the main obstacle to the exact determination of the corresponding spectra. We work with the inverse function $g$ instead. The spectrum $\beta(t)$ of the snowflake is, roughly speaking, the smallest $b$ such that
$$\int_1 \int_0^{2\pi} (r - 1)^{b-1}\, |f'(re^{i\theta})|^t\, d\theta\, dr < \infty.$$
In terms of the inverse function $g$ it means that we should study the integrability of $|g'|^{2-t} (|g| - 1)^{b-1}$ near $r = 1+$. The latter is comparable to $|g'/g|^{2-t} \log^{b-1}|g|$, for whose expectation we can derive an integral equation. Set
$$F(z) = F(|z|) = E\left[\,|g'(z)/g(z)|^{2-t} \log^{b-1}|g(z)|\,\right];$$
by the presence of the rotation in (2.2), the function $F$ depends on $|z|$ only. The mean spectrum of a snowflake is the minimal $b$ such that $F$ is integrable near $1+$. Using stationarity of $g$, namely plugging in instead of $g$ the
right-hand side of (2.2), we write
$$F(r) = E\left[\left|\frac{g'(r)}{g(r)}\right|^{2-t} \log^{b-1}|g(r)|\right] = E\left[\left|\frac{g'(\psi(re^{i\theta})^k)\, \psi'(re^{i\theta})\, \psi(re^{i\theta})^{k-1}}{g(\psi(re^{i\theta})^k)}\right|^{2-t} \left(\frac{\log|g(\psi(re^{i\theta})^k)|}{k}\right)^{b-1}\right],$$
where $\theta$ has a uniform distribution in $[0, 2\pi)$. The right-hand side can be rewritten so as to separate the expectations with respect to the (independent) distributions of $g$ and $\theta$:
$$\int_0^{2\pi} E_g\left[\left|\frac{g'(\psi(re^{i\theta})^k)}{g(\psi(re^{i\theta})^k)}\right|^{2-t} \log^{b-1}|g(\psi(re^{i\theta})^k)|\right] \frac{|\psi'(re^{i\theta})\, \psi(re^{i\theta})^{k-1}|^{2-t}}{k^{b-1}}\, \frac{d\theta}{2\pi}.$$
By the definition of $F$ the expectation under the integral is equal to $F(\psi(re^{i\theta})^k)$, hence $F$ satisfies the integral equation
$$F(r) = k^{1-b} \int_0^{2\pi} F(\psi(re^{i\theta})^k) \cdot |\psi(re^{i\theta})^{k-1}\, \psi'(re^{i\theta})|^{2-t}\, \frac{d\theta}{2\pi},$$
and we are searching for the value of $b$ at which $F$ ceases to be integrable near $1+$. Thus finding $\beta$ is reduced to the evaluation of the spectral radius in $L^1$ of the integral operator $Q$:
$$(Qf)(r) := \int_0^{2\pi} f(|\psi(re^{i\theta})|^k) \cdot |\psi(re^{i\theta})^{k-1}\, \psi'(re^{i\theta})|^{2-t}\, \frac{d\theta}{2\pi}. \tag{3.3}$$
It does not seem possible to find the spectral radius exactly in terms of $\phi$ and $k$, but one can obtain good estimates by majorization or approximation. In this way we prove Theorem 2.4 by showing that $\beta(1) > 0.23$ for a snowflake generated by a simple slit map (it maps $\mathbb{D}_-$ onto $\mathbb{D}_-$ with a straight slit of length 73) and $k = 13$, see Figures 2 and 3.

References

[1] I.N. Baker and G.M. Stallard. Error estimates in a calculation of Ruelle. Complex Variables Theory Appl., 29(2):141–159, 1996.
[2] D. Beliaev. Harmonic measure on random fractals. Royal Institute of Technology, Stockholm, 2005.
[3] D. Beliaev and S. Smirnov. Conformal snowflakes. In preparation.
[4] D. Beliaev and S. Smirnov. On Littlewood’s constants. Bull. London Math. Soc. to appear.
[5] D. Beliaev and S. Smirnov. Spectrum of SLE. In preparation.
[6] D. Bertilsson. On Brennan’s conjecture in conformal mapping. Royal Institute of Technology, Stockholm, 1999.
[7] L. Bieberbach. Zur Theorie und Praxis der konformen Abbildung. Palermo Rend., 38:98–112, 1914.
[8] L. Bieberbach. Über die Koeffizienten derjenigen Potenzreihen, welche eine schlichte Abbildung des Einheitskreises vermitteln. Berl. Ber., pages 940–955, 1916.
[9] I. Binder and P.W. Jones. In preparation.
[10] I. Binder, N. Makarov, and S. Smirnov. Harmonic measure and polynomial Julia sets. Duke Math. J., 117(2):343–365, 2003.
[11] J.E. Brennan. The integrability of the derivative in conformal mapping. J. London Math. Soc. (2), 18(2):261–272, 1978.
[12] H. Brolin. Invariant sets under iteration of rational functions. Ark. Mat., 6:103–144, 1965.
[13] L. Carleson and P.W. Jones. On coefficient problems for univalent functions and conformal dimension. Duke Math. J., 66(2):169–206, 1992.
[14] L. Carleson and N.G. Makarov. Some results connected with Brennan’s conjecture. Ark. Mat., 32(1):33–62, 1994.
[15] J. Clunie. On schlicht functions. Ann. of Math. (2), 69:511–519, 1959.
[16] J. Clunie and C. Pommerenke. On the coefficients of univalent functions. Michigan Math. J., 14:71–78, 1967.
[17] L. de Branges. A proof of the Bieberbach conjecture. Acta Math., 154(1-2):137–152, 1985.
[18] B. Duplantier. Conformally invariant fractals and potential theory. Phys. Rev. Lett., 84(7):1363–1367, 2000.
[19] B. Duplantier. Higher conformal multifractality. J. Statist. Phys., 110(3-6):691–738, 2003.
[20] A.E. Erëmenko. Lower estimate in Littlewood’s conjecture on the mean spherical derivative of a polynomial and iteration theory. Proc. Amer. Math. Soc., 112(3):713–715, 1991.
[21] A.Z. Grinshpan and C. Pommerenke. The Grunsky norm and some coefficient estimates for bounded functions. Bull. London Math. Soc., 29(6):705–712, 1997.
[22] T.C. Halsey, M.H. Jensen, L.P. Kadanoff, I. Procaccia, and B.I. Shraiman. Fractal measures and their singularities: the characterization of strange sets. Phys. Rev. A (3), 33(2):1141–1151, 1986.
[23] W.K. Hayman. Research problems in function theory. The Athlone Press University of London, London, 1967.
[24] H. Hedenmalm and S. Shimorin. Weighted Bergman spaces and the integral means spectrum of conformal mappings. Duke Mathematical Journal. to appear.
[25] P.W. Jones and N.G. Makarov. Density properties of harmonic measure. Ann. of Math. (2), 142(3):427–455, 1995.
[26] P.W. Jones and T.H. Wolff. Hausdorff dimension of harmonic measures in the plane. Acta Math., 161(1-2):131–144, 1988.
[27] I. Kayumov. Lower estimates for the integral means of univalent functions. Arkiv för Matematik. to appear.
[28] H. v. Koch. Sur une courbe continue sans tangente, obtenue par une construction géométrique élémentaire. Arkiv f. Mat., Astr. och Fys., 1:681–702, 1904.
[29] H. v. Koch. Une méthode géométrique élémentaire pour l’étude de certaines questions de la théorie des courbes planes. Acta Math., 30:145–174, 1906.
[30] P. Kraetzer. Experimental bounds for the universal integral means spectrum of conformal maps. Complex Variables Theory Appl., 31(4):305–309, 1996.
[31] G. Lawler. Conformally Invariant Processes in the Plane, volume 114 of Mathematical Surveys and Monographs. American Mathematical Society, Providence, RI, 2005.
[32] V. Levin. Über die Koeffizientensummen einiger Klassen von Potenzreihen. Math. Z., 38:565–590, 1934.
[33] J.L. Lewis and J.-M. Wu. On conjectures of Arakelyan and Littlewood. J. Analyse Math., 50:259–283, 1988.
[34] J. Littlewood. On the coefficients of schlicht functions. Q. J. Math., Oxf. Ser., 9:14–20, 1938.
[35] J.E. Littlewood. On inequalities in the theory of functions. Proceedings L. M. S., 23(2):481–519, 1925.
[36] J.E. Littlewood. On some conjectural inequalities, with applications to the theory of integral functions. J. London Math. Soc., 27:387–393, 1952.
[37] J.E. Littlewood and R.E.A.C. Paley. A proof that an odd schlicht function has bounded coefficients. Journal L. M. S., 7:167–169, 1932.
[38] M.J. Ljubich. Entropy properties of rational endomorphisms of the Riemann sphere. Ergodic Theory Dynam. Systems, 3(3):351–385, 1983.
[39] C. Loewner. Collected papers. Contemporary Mathematicians. Birkhäuser Boston Inc., Boston, MA, 1988.
[40] K. Löwner. Untersuchungen über schlichte konforme Abbildungen des Einheitskreises. I. Math. Ann., 89:103–121, 1923.
[41] N.G. Makarov. On the distortion of boundary sets under conformal mappings. Proc. London Math. Soc. (3), 51(2):369–384, 1985.
[42] N.G. Makarov. Fine structure of harmonic measure. St. Petersburg Math. J., 10(2):217–268, 1999.
[43] N.G. Makarov and C. Pommerenke. On coefficients, boundary size and Hölder domains. Ann. Acad. Sci. Fenn. Math., 22(2):305–312, 1997.
[44] B.B. Mandelbrot. Possible refinement of the lognormal hypothesis concerning the distribution of energy dissipation in intermittent turbulence. In Statistical Models Turbulence, Proc. Sympos. Univ. California, San Diego (La Jolla) 1971, Lecture Notes Phys. 12, 333–351. 1972.
[45] B.B. Mandelbrot. Intermittent turbulence in self-similar cascades: divergence of high moments and dimension of the carrier. J. Fluid Mech., 62:331–358, 1974.
[46] A. Manning. The dimension of the maximal measure for a polynomial map. Ann. of Math. (2), 119(2):425–430, 1984.
[47] C. Pommerenke. On the coefficients of univalent functions. J. London Math. Soc., 42:471–474, 1967.
[48] C. Pommerenke. Relations between the coefficients of a univalent function. Invent. Math., 3:1–15, 1967.
[49] C. Pommerenke. Univalent functions. Vandenhoeck & Ruprecht, Göttingen, 1975.
[50] C. Pommerenke. Boundary behaviour of conformal maps, volume 299 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 1992.
[51] C. Pommerenke. The integral means spectrum of univalent functions. Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI), 237(Anal. Teor. Chisel i Teor. Funkts. 14):119–128, 229, 1997.
[52] O. Schramm. Scaling limits of loop-erased random walks and uniform spanning trees. Israel J. Math., 118:221–288, 2000.
[53] S. Shimorin. A multiplier estimate of the Schwarzian derivative of univalent functions. Int. Math. Res. Not., (30):1623–1633, 2003.

D. Beliaev
KTH

S. Smirnov
KTH and Geneva University
4ECM Stockholm 2004 © 2005 European Mathematical Society
Singular Approximations to Hyperbolic Systems of Conservation Laws in one Space Dimension

Stefano Bianchini

Abstract. Consider an $n \times n$ hyperbolic system of conservation laws of the form
$$u_t + f(u)_x = 0, \qquad (t, x) \in \mathbb{R}^+ \times \mathbb{R}, \quad u \in \mathbb{R}^n. \tag{0.1}$$
Here $u = (u_1, \dots, u_n)$ is the vector of conserved quantities, while the components of $f = (f_1, \dots, f_n)$ are the fluxes. The system is said to be strictly hyperbolic if at each point $u$ the Jacobian matrix $Df(u)$ has $n$ real, distinct eigenvalues $\lambda_1(u) < \dots < \lambda_n(u)$. A fundamental ingredient in proving existence and stability in BV is the introduction of a functional, the Glimm–Liu interaction functional, which controls the interactions among nonlinear waves. The aim of this note is to present a simple interpretation of (the scalar part of) the interaction functional, and to show how it can be extended to the following approximations:
(1) a parabolic equation of the form $u_t + f(u)_x = u_{xx}$;
(2) scalar semidiscrete schemes, for example the upwind scheme $u_t(t, x) + f(u(t, x)) - f(u(t, x - 1)) = 0$, or the backward scheme $u(t, x) - u(t - 1, x) + f(u(t, x))_x = 0$;
(3) the $2 \times 2$ relaxation approximation, in particular
$$u_t + v_x = 0, \qquad v_t + u_x = f(u) - v.$$
All these approximations are interesting from the physical and numerical point of view. Finding an interaction functional is one of the key steps toward the proof of BV bounds.
2000 Mathematics Subject Classification. 35L65.
Key words and phrases. Hyperbolic systems, conservation laws, numerical scheme.

1. Motion in the direction of curvature

Figure 1. Area swept by motion in the direction of curvature.

Fix two points $A$, $B$ in the plane $\mathbb{R}^2$ and consider the family $\mathcal{F}_{AB}$ of all polygonal lines joining $A$ with $B$. Given $\gamma \in \mathcal{F}_{AB}$, with vertices $A = P_0, P_1, \dots, P_n = B$, define $v_i \doteq P_i - P_{i-1}$ and consider the functional
\[ Q(\gamma) \doteq \frac12 \sum_{\substack{i,j=1 \\ i<j}}^{n} |v_i \wedge v_j|, \tag{1.1} \]
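where $\wedge$ stands for the external product in $\mathbb{R}^2$. Let $\gamma'$ be obtained from $\gamma$ by replacing the two segments $P_{\ell-1}P_\ell$ and $P_\ell P_{\ell+1}$ by the single segment $P_{\ell-1}P_{\ell+1}$, as in Fig. 1. The area of the triangle with vertices $P_{\ell-1}$, $P_\ell$, $P_{\ell+1}$ satisfies
\[ \mathrm{Area}(P_{\ell-1}P_\ell P_{\ell+1}) = \frac12 |v_{\ell+1} \wedge v_\ell| \le Q(\gamma) - Q(\gamma'). \tag{1.2} \]
Indeed, each bracket below is nonnegative by the triangle inequality, so that
\begin{align*}
Q(\gamma) - Q(\gamma') &= \frac12 \sum_{i=1}^{\ell-1} \Big[ |v_i \wedge v_\ell| + |v_i \wedge v_{\ell+1}| - |v_i \wedge (v_\ell + v_{\ell+1})| \Big] + \frac12 |v_\ell \wedge v_{\ell+1}| \\
&\quad + \frac12 \sum_{j=\ell+2}^{n} \Big[ |v_\ell \wedge v_j| + |v_{\ell+1} \wedge v_j| - |(v_\ell + v_{\ell+1}) \wedge v_j| \Big] \\
&\ge \frac12 |v_\ell \wedge v_{\ell+1}|.
\end{align*}
Next, assume that $\gamma'$ is obtained from $\gamma$ by a finite sequence of consecutive cuts (Fig. 1). In other words, let $\gamma_0, \gamma_1, \dots, \gamma_n$ be a sequence of polygonals with $\gamma_0 = \gamma$, $\gamma_n = \gamma'$ and such that each $\gamma_i$ is obtained from $\gamma_{i-1}$ by replacing two adjacent segments with a single one. By (1.2), an inductive argument yields
\[ \mathrm{Area}(\gamma, \gamma') \le \sum_{i=1}^{n} \mathrm{Area}(\gamma_i, \gamma_{i-1}) \le Q(\gamma) - Q(\gamma'). \tag{1.3} \]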
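The inequality (1.2) is easy to test numerically. The following is a minimal sketch, assuming a polygonal is stored as a list of vertices; the helper names (`wedge`, `edges`, `Q`, `cut`, `triangle_area`) and the sample polygonal are illustrative choices, not part of the original note.

```python
from itertools import combinations

def wedge(a, b):
    """Planar external product a ∧ b = a_x b_y - a_y b_x."""
    return a[0] * b[1] - a[1] * b[0]

def edges(points):
    """Edge vectors v_i = P_i - P_{i-1} of a polygonal line."""
    return [(q[0] - p[0], q[1] - p[1]) for p, q in zip(points, points[1:])]

def Q(points):
    """Functional (1.1): half the sum of |v_i ∧ v_j| over all pairs i < j."""
    return 0.5 * sum(abs(wedge(a, b)) for a, b in combinations(edges(points), 2))

def cut(points, l):
    """Remove the interior vertex P_l, joining P_{l-1} directly to P_{l+1}."""
    return points[:l] + points[l + 1:]

def triangle_area(p, q, r):
    """Area of the triangle p, q, r, i.e. half of |(q - p) ∧ (r - q)|."""
    return 0.5 * abs(wedge((q[0] - p[0], q[1] - p[1]), (r[0] - q[0], r[1] - q[1])))

# Hypothetical test polygonal, chosen only for illustration.
gamma = [(0, 0), (1, 2), (3, 3), (4, 1), (6, 2), (7, 0)]
for l in range(1, len(gamma) - 1):
    area = triangle_area(gamma[l - 1], gamma[l], gamma[l + 1])
    drop = Q(gamma) - Q(cut(gamma, l))
    print(l, area, drop)   # (1.2) predicts area <= drop for every l
```

Cutting the same polygonal repeatedly with `cut` gives a chain $\gamma_0, \gamma_1, \dots$ as in (1.3), along which `Q` can only decrease.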
Figure 2. The $\prec$ relation.

More generally, instead of polygonals, we can define the functional $Q$ on the family $\mathcal{F}$ of parametric curves of finite length in the plane. Up to a homeomorphism $h$, we can assume that each $\gamma \in \mathcal{F}$ is absolutely continuous and parametrized by $x \in [0,1]$, and we denote by $\gamma_x$ the derivative of $\gamma$ w.r.t. $x$. We recall that $\gamma_x \in L^1([0,1]; \mathbb{R}^2)$ and
\[ L(\gamma) = \int_0^1 |\gamma_x| \, dx. \]
We can now define a functional $Q : \mathcal{F} \to \mathbb{R}$ by setting
\[ Q(\gamma) \doteq \frac12 \sup \sum_{\substack{i,j=1 \\ j>i}}^{n} \Big| \big(\gamma(x_i) - \gamma(x_{i-1})\big) \wedge \big(\gamma(x_j) - \gamma(x_{j-1})\big) \Big| = \frac12 \int_0^1 \!\! \int_x^1 |\gamma_x(x) \wedge \gamma_x(y)| \, dy \, dx \le \frac12 L(\gamma)^2, \tag{1.4} \]
where the supremum is taken w.r.t. all partitions $0 = x_0 < \dots < x_n = 1$. Observe that the definition (1.4) is the natural extension of (1.1).

Given $\gamma \in \mathcal{F}$, by a cut we mean the replacement of the portion of the curve $\{\gamma(x);\ x \in [x_1, x_2] \subseteq [0,1]\}$ with the segment connecting $\gamma(x_1)$ to $\gamma(x_2)$, for some $x_1, x_2 \in [0,1]$. We say that $\gamma'$ follows $\gamma$, and write $\gamma \prec \gamma'$, if there exists a sequence of curves $\gamma_n$ converging to $\gamma'$ in $(\mathcal{F}, d)$ such that each $\gamma_n$ is obtained from $\gamma$ by a finite sequence of consecutive cuts (Fig. 2). Note that, as a consequence of this definition, $\gamma'$ must have the same endpoints as $\gamma$. It is easy to see that $\prec$ defines a partial order relation.

Given $\gamma, \gamma' \in \mathcal{F}$ with $\gamma \prec \gamma'$, we consider the closed curve $\gamma \cup \gamma' : [0,2] \to \mathbb{R}^2$ defined as
\[ (\gamma \cup \gamma')(x) \doteq \begin{cases} \gamma(x) & x \in [0,1], \\ \gamma'(2-x) & x \in (1,2]. \end{cases} \tag{1.5} \]
By the area between $\gamma$ and $\gamma'$, denoted by $\mathrm{Area}[\gamma, \gamma']$, we mean the area of the regions where the winding number of the curve $\gamma \cup \gamma'$ is odd.
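The definition (1.4) can be explored by sampling a smooth curve and evaluating the polygonal functional (1.1) on the samples. The sketch below does this for a quarter of the unit circle, a test curve chosen here for illustration. For such a convex arc all the products $v_i \wedge v_j$ with $i<j$ have the same sign, so by the shoelace formula $Q$ of the inscribed polygonal equals the area between the polygonal and its chord; this remark, and the function names, are assumptions of the example rather than statements from the text.

```python
import math
from itertools import combinations

def wedge(a, b):
    """Planar external product a ∧ b."""
    return a[0] * b[1] - a[1] * b[0]

def Q_polygonal(points):
    """Functional (1.1) for the polygonal through `points`."""
    v = [(q[0] - p[0], q[1] - p[1]) for p, q in zip(points, points[1:])]
    return 0.5 * sum(abs(wedge(a, b)) for a, b in combinations(v, 2))

def length(points):
    """Length L of the polygonal through `points`."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def quarter_circle(n):
    """n + 1 sample points on the unit quarter circle from (1, 0) to (0, 1)."""
    return [(math.cos(k * math.pi / (2 * n)), math.sin(k * math.pi / (2 * n)))
            for k in range(n + 1)]

# Area of the circular segment between the arc and its chord.
segment_area = math.pi / 4 - 0.5

for n in (4, 16, 64, 256):
    pts = quarter_circle(n)
    # Columns: n, Q of the sampled curve, the bound L^2 / 2 from (1.4).
    print(n, Q_polygonal(pts), 0.5 * length(pts) ** 2)
print("segment area:", segment_area)
```

As the partition is refined, the printed values of `Q_polygonal` increase toward `segment_area`, in line with the supremum in (1.4), and stay well below the crude bound $L(\gamma)^2/2$.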
Lemma 1.1. If $\gamma \prec \gamma'$, then the area between the two curves satisfies
\[ \mathrm{Area}[\gamma, \gamma'] \le Q(\gamma) - Q(\gamma'). \tag{1.6} \]
Proof. In the case where $\gamma$ is a polygonal and $\gamma'$ is obtained from $\gamma$ with a finite sequence of cuts, the result was already proved in (1.3). The general case follows by approximation, using the lower semicontinuity of $Q$ with respect to the distance among curves.

Remark 1.2. One can give an equivalent definition of the partial order relation "$\prec$" by setting $\gamma \prec \gamma'$ if the following holds: there exists a sequence of parabolic problems on the plane
\[ \begin{cases} \xi^\nu_t + \lambda^\nu(t,x)\, \xi^\nu_x - c^\nu(t,x)\, \xi^\nu_{xx} = 0, & \\ \xi^\nu(0,x) = \gamma(x) & x \in [0,1], \\ \xi^\nu(t,0) = \gamma(0) & t \in [0,1], \\ \xi^\nu(t,1) = \gamma(1) & t \in [0,1], \end{cases} \tag{1.7} \]
whose solutions at time $t = 1$ converge to $\gamma'$, i.e.,
\[ \lim_{\nu \to \infty} d\big(\xi^\nu(1, \cdot), \gamma'\big) = 0. \]
Here $\lambda^\nu$, $c^\nu$ are smooth functions from $[0,1] \times [0,1]$ to $\mathbb{R}$, with $c^\nu$ strictly positive. In fact any "cut" can be obtained by choosing appropriately the coefficients $\lambda^\nu$, $c^\nu$ in (1.7), and conversely there are simple numerical schemes approximating (1.7) for which the solution moves in the direction of curvature.

More generally, consider a path varying with time. This is described by a map $\gamma : [t_1, t_2] \to \mathcal{F}$. We say that $\gamma$ moves in the direction of curvature if $\gamma(s) \prec \gamma(t)$ for all $s < t$, $s, t \in [t_1, t_2]$ (Fig. 2). Observe that, from our definitions, it follows that the endpoints of $\gamma(t)$ remain constant in time. The area swept by $\gamma(t)$ during the time interval $[t_1, t_2]$ is defined as
\[ \mathrm{Area}\big(\gamma; [t_1, t_2]\big) \doteq \sup \Big\{ \sum_{i=1}^{n} \mathrm{Area}\big[\gamma(s_i), \gamma(s_{i-1})\big] ;\ t_1 = s_0 < \dots < s_n = t_2,\ n \ge 1 \Big\}. \tag{1.8} \]
A consequence of the above definitions is

Theorem 1.3. Let $t \mapsto \gamma(t) \in \mathcal{F}$ denote a curve in the plane, moving in the direction of curvature. Then, for every $t_1 < t_2$ one has
\[ \mathrm{Area}\big(\gamma; [t_1, t_2]\big) \le Q\big(\gamma(t_1)\big) - Q\big(\gamma(t_2)\big). \tag{1.9} \]

Remark 1.4. If $\gamma$ is differentiable in $t$, $x$, then one obtains that the area swept is
\[ \mathrm{Area}(t_1, t_2) = \int_{t_1}^{t_2} \!\! \int_0^1 |\gamma_t(t,x) \wedge \gamma_x(t,x)| \, dx \, dt, \tag{1.10} \]
so that it follows
\[ \frac{d}{dt} \bigg[ \frac12 \int_0^1 \!\! \int_x^1 |\gamma_x(x) \wedge \gamma_x(y)| \, dy \, dx \bigg] \le - \int_0^1 |\gamma_t(t,x) \wedge \gamma_x(t,x)| \, dx. \tag{1.11} \]
Remark 1.5. It is possible to generalize the results to curves $\gamma : [0,1] \to \mathbb{R}^n$. For example, the functional $Q$ becomes
\[ Q(\gamma) = \frac1n \int_{0 \le x_1 \le \dots \le x_n \le 1} |\gamma_x(x_1) \wedge \dots \wedge \gamma_x(x_n)| \, dx_1 \dots dx_n. \]
We observe that the above formula is also related to the volume of zonoids, i.e., convex bodies which are the range of a bounded nonatomic vector measure.

2. Application to scalar conservation laws

Consider a scalar conservation law
\[ u_t + f(u)_x = 0, \tag{2.1} \]
with $f$ sufficiently smooth. Given an initial datum $u_0$ in BV, let $u = u(t,x)$ be the corresponding unique entropy solution. In this section we will show that to $u(t, \cdot)$ one can associate a parametric curve $\gamma(t)$ such that $\gamma(t)$ moves in the direction of curvature, i.e., $\gamma(s) \prec \gamma(t)$ whenever $0 \le s < t$.

Given a map $u : \mathbb{R} \to \mathbb{R}$ with bounded variation, define the function $U \in BV$ as
\[ U(x) \doteq \int_{-\infty}^{x} |Du| = \text{Tot.Var.}\big(u; (-\infty, x]\big). \tag{2.2} \]
Here $Du$ is the measure corresponding to the distributional derivative of $u$. For $\theta \in \big[0, \text{Tot.Var.}(u)\big]$, we define $x_\theta$ to be the point $x$ such that
\[ U(x-) \le \theta \le U(x+). \tag{2.3} \]
Let now two points $\big(u^-, f(u^-)\big)$, $\big(u^+, f(u^+)\big) \in \mathbb{R}^2$ be given, with $u^- < u^+$ (or $u^- > u^+$). We then define the curve $R\big(\theta; [u^-, u^+]\big)$, where $\theta \in \big[0, |u^+ - u^-|\big]$, as the graph of the convex (concave) envelope of the function $f(u)$ on the interval $[u^-, u^+]$. To a function $u \in BV$ we associate the parametric curve
\[ \gamma : \big[0, \text{Tot.Var.}(u)\big] \to \mathbb{R}^2 \]
defined as (Fig. 3)
\[ \gamma(u; \theta) \doteq \begin{cases} \big(u(-\infty), f(u(-\infty))\big) & \theta = 0, \\ \big(u(x_\theta), f(u(x_\theta))\big) & u \text{ is continuous at } x_\theta, \\ R\big(\theta - U(x_\theta-); [u(x_\theta-), u(x_\theta+)]\big) & u \text{ has a jump at } x_\theta,\ \theta \in [U(x_\theta-), U(x_\theta+)], \\ \big(u(\infty), f(u(\infty))\big) & \theta = \text{Tot.Var.}(u). \end{cases} \tag{2.4} \]
Figure 3. The curve $\gamma$ for the scalar equations.

The fact that the curve $\gamma$ corresponding to the Kruzhkov entropy solution moves in the direction of curvature can be proved by considering wavefront tracking approximations. Here we only prove it in a simple case.

Example 2.1. Consider the scalar equation
\[ u_t + \Big( \frac{u^2}{2} \Big)_x = 0 \]
with a monotone decreasing piecewise constant initial datum $u_0$. Let us denote by $u_i$, $i = 1, \dots, n$, the values taken by $u_0$, with $u_{i+1} < u_i$. The evolution of the entropy solution can be constructed as follows:
(1) each jump $[u_{i-1}, u_i]$ travels with the speed
\[ \sigma_i = \frac{f(u_i) - f(u_{i-1})}{u_i - u_{i-1}} = \frac12 \big( u_i + u_{i-1} \big); \]
(2) when two jumps $[u_{i-1}, u_i]$, $[u_i, u_{i+1}]$ interact, we continue the solution by considering the larger jump $[u_{i-1}, u_{i+1}]$ with speed $(u_{i+1} + u_{i-1})/2$.
The curve $\gamma$ associated to this solution is a polygonal with vertices at the points $(u_i, u_i^2/2)$, because $u$ is decreasing and $f(u)$ is convex. It is easy to verify that any interaction corresponds to replacing the two sides of the triangle with vertices $(u_{i-1}, u_{i-1}^2/2)$, $(u_i, u_i^2/2)$, $(u_{i+1}, u_{i+1}^2/2)$ that meet at $(u_i, u_i^2/2)$ by the segment joining $(u_{i-1}, u_{i-1}^2/2)$ and $(u_{i+1}, u_{i+1}^2/2)$. We note finally that in this example (but it holds in general) one can rewrite $Q$ as
\[ Q(u) = \frac12 \sum_{i<j} |u_{i+1} - u_i| \, |u_{j+1} - u_j| \, |\sigma_{i+1} - \sigma_{j+1}|, \tag{2.5} \]
where $\sigma_{i+1}$ is the speed of the jump $[u_i, u_{i+1}]$, $|u_{i+1} - u_i|$ is the strength of the $i$th wave and $|\sigma_{i+1} - \sigma_{j+1}|$ is the difference in speed.
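Example 2.1 can be checked on a small instance. The sketch below, with hypothetical states and function names chosen for illustration, computes $Q$ in two ways: from the polygonal with vertices $(u_i, u_i^2/2)$ as in (1.1), and from wave strengths and speeds as in (2.5). It then merges two adjacent jumps as in step (2).

```python
from itertools import combinations

def curve(values):
    """Vertices (u_i, u_i^2 / 2) of the polygonal gamma for the flux f(u) = u^2 / 2."""
    return [(u, 0.5 * u * u) for u in values]

def Q_curve(values):
    """Q from (1.1), evaluated on the polygonal gamma."""
    pts = curve(values)
    v = [(q[0] - p[0], q[1] - p[1]) for p, q in zip(pts, pts[1:])]
    return 0.5 * sum(abs(a[0] * b[1] - a[1] * b[0]) for a, b in combinations(v, 2))

def Q_waves(values):
    """Q from (2.5): product of strengths times difference in speed, over pairs of jumps."""
    # Each jump is stored as (signed strength, Rankine-Hugoniot speed).
    jumps = [(b - a, 0.5 * (a + b)) for a, b in zip(values, values[1:])]
    return 0.5 * sum(abs(s1 * s2 * (c1 - c2))
                     for (s1, c1), (s2, c2) in combinations(jumps, 2))

def interact(values, i):
    """Merge the two jumps meeting at values[i] by dropping that middle state."""
    return values[:i] + values[i + 1:]

# Hypothetical decreasing states, for illustration only.
states = [3.0, 2.0, 0.5, -1.0]
after = interact(states, 1)
print(Q_curve(states), Q_waves(states))   # the two expressions for Q before the interaction
print(Q_curve(after), Q_waves(after))     # and after it
```

In this decreasing, convex-flux setting the curve is itself convex, so the decrease of $Q$ at an interaction coincides with the area of the removed triangle; for general data only the inequality (1.2) is available.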
Figure 4. Example 2.1 with 3 jumps.

2.1. Viscous approximations. We can construct the interaction functional also for the viscous approximations
\[ u_t + f(u)_x - u_{xx} = 0. \tag{2.6} \]
In fact, one verifies easily that if we set $\lambda = \lambda(u) = f'(u)$, the curve in $\mathbb{R}^2$
\[ \gamma(t,x) \doteq \begin{pmatrix} u \\ f(u) - u_x \end{pmatrix} \tag{2.7} \]
satisfies the parabolic equation
\[ \gamma_t + \lambda(t,x) \gamma_x - \gamma_{xx} = 0. \tag{2.8} \]
Hence, by a direct application of Remark 1.2, $\gamma$ moves in the direction of curvature, and by Theorem 1.3 the functional
\[ Q(u) = \frac12 \int_{\mathbb{R}} \int_x^{+\infty} |\gamma_x(x) \wedge \gamma_x(y)| \, dy \, dx \tag{2.10} \]
is nonincreasing in time and controls the area swept by $\gamma$.
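A simple case in which the curve (2.7) is explicit is a travelling viscous shock. The sketch below assumes the Burgers flux $f(u) = u^2/2$ and the standard profile $u(t,x) = c - a \tanh\big(a(x - ct)/2\big)$, which solves (2.6) for this flux; the parameter values and function names are illustrative. It checks that $\gamma$ lies on the chord joining the end states $\big(c \pm a, f(c \pm a)\big)$, whose slope is the shock speed $c$.

```python
import math

def profile(x, t, a, c):
    """Travelling wave u = c - a*tanh(a*(x - c*t)/2) and its x-derivative."""
    s = a * (x - c * t) / 2
    u = c - a * math.tanh(s)
    u_x = -(a * a / 2) / math.cosh(s) ** 2
    return u, u_x

def gamma(x, t, a, c):
    """The curve (2.7) for the Burgers flux: (u, u^2/2 - u_x)."""
    u, u_x = profile(x, t, a, c)
    return u, 0.5 * u * u - u_x

a, c, t = 1.5, 0.5, 0.3   # hypothetical amplitude, speed and time
worst = 0.0
for k in range(-200, 201):
    g1, g2 = gamma(0.05 * k, t, a, c)
    chord = c * g1 + (a * a - c * c) / 2   # line of slope c through both end states
    worst = max(worst, abs(g2 - chord))
print(worst)   # deviation of gamma from the chord, at rounding level
```

Since the image of $\gamma$ is a straight segment, all products $\gamma_x(x) \wedge \gamma_x(y)$ vanish and $Q = 0$ for this solution, which is consistent with a single wave having nothing to interact with.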
2.2. Semidiscrete and discrete schemes. The simplest semidiscrete scheme (stable and diffusive for $f' > 0$) is
\[ u_t(t,x) + f(u(t,x)) - f(u(t,x-1)) = 0. \tag{2.11} \]
One can rewrite the scheme as
\begin{align*}
u_t(t,x) &+ \frac{f(u(t,x)) - f(u(t,x-1))}{u(t,x) - u(t,x-1)} \big( u(t,x) - u(t,x-1) \big) \\
&= u_t(t,x) + \lambda\big(u(t,x), u(t,x-1)\big) \big( u(t,x) - u(t,x-1) \big) = 0, \tag{2.12}
\end{align*}
with $\lambda > 0$. It is simple to see that the curve $\gamma$ solving
\[ \gamma_t(t,x) + \lambda(t,x) \big( \gamma(t,x) - \gamma(t,x-1) \big) = 0 \tag{2.13} \]
moves in the direction of curvature for $\lambda > 0$, Fig. 5.

Figure 5. Motion in the direction of curvature for semidiscrete and discrete schemes.

A similar computation holds for the simple fully discrete scheme (stable and diffusive for $0 < f' < 1$)
\[ u(t+1,x) - u(t,x) + f(u(t,x)) - f(u(t,x-1)) = 0, \tag{2.14} \]
which can be rewritten as
\begin{align*}
u(t+1,x) &= u(t,x) - \frac{f(u(t,x)) - f(u(t,x-1))}{u(t,x) - u(t,x-1)} \big( u(t,x) - u(t,x-1) \big) \\
&= \big(1 - \lambda(u(t,x), u(t,x-1))\big) u(t,x) + \lambda(u(t,x), u(t,x-1)) \, u(t,x-1). \tag{2.15}
\end{align*}
Then the curve $\gamma$ satisfying
\[ \gamma(t+1,x) = (1 - \lambda(t,x)) \gamma(t,x) + \lambda(t,x) \gamma(t,x-1) \tag{2.16} \]
moves in the direction of curvature for $0 < \lambda < 1$, Fig. 5. The main problem (which we will not address here) is how to find another variable $w$, depending on $u$ only, which solves the same equation as $u$ in both schemes. For the discrete scheme we remark that a result in this direction is not known.

3. Another interpretation of the functional

3.1. Parabolic equation. Consider again the parabolic equation $u_t + f(u)_x - u_{xx} = 0$, and construct the variable
\[ P(t,x,y) \doteq u_t(t,x) u_x(t,y) - u_t(t,y) u_x(t,x). \tag{3.1} \]
It is easy to verify that $P$ satisfies
\[ P_t + \mathrm{div} \Big( \big( f'(u(t,x)), f'(u(t,y)) \big) P \Big) = \Delta P \tag{3.2} \]
for $t \ge 0$, $x \ge y$ and the Dirichlet boundary condition $P(t,x,x) = 0$.

Figure 6. Flow through the boundary.

The interaction functional $Q(u)$ can now be interpreted as the $L^1$ norm of $P$ in $\{x \ge y\}$,
\[ Q(P) = \iint_{x \ge y} |P(t,x,y)| \, dx \, dy, \tag{3.3} \]
and its derivative controls the flux of $P$ along the boundary $\{x = y\}$,
\[ \frac{d}{dt} Q(P) \le - \int_{x=y} |\nabla P \cdot (1,-1)| \, dx = -2 \int_{\mathbb{R}} |u_{tx} u_x - u_t u_{xx}| \, dx. \tag{3.4} \]
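The link between the variable $P$ and the curve $\gamma$ of (2.7) can be made explicit. The short computation below is a sketch, assuming that $u$ is a smooth solution of (2.6) and using the planar convention $a \wedge b = a_1 b_2 - a_2 b_1$; it is not spelled out in the text.

```latex
\begin{align*}
\gamma_x &= \begin{pmatrix} u_x \\ f'(u) u_x - u_{xx} \end{pmatrix}
          = \begin{pmatrix} u_x \\ -u_t \end{pmatrix}
          \qquad \text{(by (2.6))}, \\
\gamma_x(t,x) \wedge \gamma_x(t,y)
         &= u_x(t,x) \big( -u_t(t,y) \big) - \big( -u_t(t,x) \big) u_x(t,y) \\
         &= u_t(t,x) u_x(t,y) - u_t(t,y) u_x(t,x) = P(t,x,y).
\end{align*}
```

Under these assumptions the integrand of (3.3) is the same as the integrand $|\gamma_x(x) \wedge \gamma_x(y)|$ of the curve functional in §2.1, the two expressions differing only in the normalizing factor $1/2$ and in which half-plane is used.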
3.2. An estimate for kinetic models. The estimate of the flow through the boundary can be extended to some kinetic models, the easiest one being
\[ \begin{cases} f^-_t - f^-_x = -\dfrac12 f^- + \dfrac12 f^+, \\[4pt] f^+_t + f^+_x = \dfrac12 f^- - \dfrac12 f^+. \end{cases} \tag{3.5} \]
If the initial data are positive and with $L^1$ norm equal to 1, this model describes the evolution of the probability densities of one particle which switches between the speeds $-1$, $1$ with probability $1/2$ in the time unit: $f^-(t,x)$ is the probability that in $(x + dx, t + dt)$ the particle has speed $-1$, while $f^+$ is the corresponding probability for speed $+1$.

Figure 7. Two possible particle paths.

The natural extension of the estimate of the flow through the boundary with Dirichlet boundary conditions is to set
\[ f^-(t,0) + f^+(t,0) = 0. \tag{3.6} \]
One can explain the above boundary condition by saying that when a particle hits the boundary $\{x = 0\}$ it changes sign. Due to diffusion, it is possible to verify that after some time, at each $(t,x)$ the number of particles which have bounced at $x = 0$ an even number of times is very close to the number of particles which have bounced an odd number of times (Fig. 7). A more precise estimate can be obtained in the form
\[ \int_0^{+\infty} |f^-(t,0)| \, dt \le 3 \int_{\mathbb{R}} \big( |f^-(0,x)| + |f^+(0,x)| \big) \, dx. \tag{3.7} \]
(However, we expect that the constant can be improved to 2.)

Without entering into the computations, we just say that the interaction functional for the general scheme
\[ \begin{cases} f^-_t - f^-_x = -\dfrac{1 + \lambda(u)}{2} f^- + \dfrac{1 - \lambda(u)}{2} f^+, \\[4pt] f^+_t + f^+_x = \dfrac{1 + \lambda(u)}{2} f^- - \dfrac{1 - \lambda(u)}{2} f^+ \end{cases} \tag{3.8} \]
is obtained by estimating the flux of a corresponding kinetic scheme in $\mathbb{R}^2$ through the boundary $\{x = y\}$.
References
[1] S. Bianchini and A. Bressan. On a Lyapunov functional relating viscous conservation laws and shortening curves. Nonlinear Analysis TMA, 51(4):649–662, 2002.
[2] A. Bressan. Hyperbolic systems of conservation laws. Oxford Univ. Press, 2000.
[3] C. Dafermos. Hyperbolic conservation laws in continuous physics. Springer, 2000.
[4] J. Glimm. Solutions in the large for nonlinear hyperbolic systems of equations. Comm. Pure Appl. Math., 18:697–715, 1965.
[5] S. Kruzhkov. First-order quasilinear equations with several space variables. Mat. Sb., 123:228–255, 1970. English transl. in Math. USSR Sb. 10 (1970), 217–273.
[6] T.-P. Liu. Admissible solutions of hyperbolic conservation laws, volume 240. Memoir A.M.S., 1981.

Stefano Bianchini
Istituto per le Applicazioni del Calcolo “M. Picone” – CNR
Viale del Policlinico 137
I-00161 Roma, Italy
URL: http://www.iac.rm.cnr.it/~bianchin/
Representation Theory and Random Point Processes

Alexei Borodin and Grigori Olshanski

Abstract. Using a particular example, we describe how to state and to solve the problem of harmonic analysis for groups with infinite-dimensional dual space. The representation theory for such groups differs in many respects from the conventional theory. We emphasize a remarkable connection with random point processes that arise in random matrix theory. The paper is an extended version of the second author’s talk at the Congress.
Introduction

In this paper we would like to discuss a connection between two areas of mathematics which until recently seemed to be rather distant from each other: (1) noncommutative harmonic analysis on groups and (2) some topics in probability theory related to random point processes. In order to make the paper accessible to readers not familiar with either of these areas, we will explain all needed basic concepts.

The purpose of harmonic analysis is to decompose natural representations of a given group into irreducible representations. By natural representations we mean those representations that are produced, in a natural way, from the group itself. Examples include the regular representation, which is realized in the $L^2$ space on the group, or a quasiregular representation, which is built from the action of the group on a homogeneous space. In practice, a natural representation often comes together with a distinguished cyclic vector. Then the decomposition into irreducibles is governed by a measure, which may be called the spectral measure. The spectral measure lives on the dual space to the group, the points of the dual being the irreducible unitary representations. There is a useful analogy in analysis: expanding a given function in eigenfunctions of a self-adjoint operator. Here the spectrum of the operator is a counterpart of the dual space. If our distinguished vector lies in the Hilbert space of the representation, then the spectral measure has finite mass and can be normalized to be a probability measure.¹

¹ It may well happen that the distinguished vector belongs to an extension of the Hilbert space (just as in analysis, one may well be interested in expanding a function which is not square integrable). For instance, in the case of the regular representation of a Lie group one usually takes the delta function at the identity element of the group, which is not an element of $L^2$. In such a situation the spectral measure is infinite. However, we shall deal with finite spectral measures only.

Now let us turn to random point processes (or random point fields), which form a special class of stochastic processes. In general, a stochastic process is a family of random variables, while a point process (or random point field) is a random point configuration. By a (nonrandom) point configuration we mean an unordered collection of points in a locally compact space $X$. This collection may be finite or countably infinite, but it cannot have accumulation points in $X$. To define a point process on $X$, we have to specify a probability measure on $\mathrm{Conf}(X)$, the set of all point configurations. One classical example is the Poisson process, which is employed in a lot of probabilistic models and constructions. Another important example (or rather a class of examples) comes from random matrix theory. Given a probability measure on a space of $N \times N$ matrices, we pass to the matrix eigenvalues and thus obtain a random $N$-point configuration. In a suitable scaling limit transition (as $N \to \infty$), it turns into a point process living on infinite point configurations.

As long as we are dealing with “conventional” groups (finite groups, compact groups, real or p-adic reductive groups, etc.), representation theory seems to have nothing in common with point processes. However, the situation drastically changes when we turn to “big” groups whose irreducible representations depend on infinitely many parameters. Two basic examples are the infinite symmetric group $S(\infty)$ and the infinite-dimensional unitary group $U(\infty)$, which are defined as unions of ascending chains of finite or compact groups
\[ S(1) \subset S(2) \subset S(3) \subset \dots, \qquad U(1) \subset U(2) \subset U(3) \subset \dots, \]
respectively. It turns out that for such groups, the clue to the problem of harmonic analysis can be found in the theory of point processes. The idea is to convert any infinite collection of parameters, which corresponds to an irreducible representation, to a point configuration. Then the spectral measure defines a point process, and one may try to describe this process (hence the initial measure) using appropriate probabilistic tools.

This approach was first applied to the group $S(\infty)$ (see the surveys Borodin–Olshanski [BO2], Olshanski [Ol6]). In the present paper we discuss the group $U(\infty)$; our exposition is mainly based on Olshanski [Ol7] and Borodin–Olshanski [BO6]. Notice that the point processes arising from the spectral measures do not resemble the Poisson process but are close to the processes of random matrix theory.

Acknowledgement. This research was partially conducted during the period the first author (A. B.) served as a Clay Mathematics Institute Research Fellow. He was also partially supported by the NSF grant DMS-0402047. The second author (G. O.) was supported by the CRDF grant RM1-2543-MO-03.
1. Dual space and the problem of harmonic analysis

Recall that a unitary representation of a group $G$ in a Hilbert space $H$ is a homomorphism of $G$ into the group of unitary operators in $H$. For instance, if $G$ is a locally compact topological group then there is a natural representation generated by the (say, right) action of $G$ on itself, called the regular representation. Its space is the $L^2$ space formed with respect to the Haar measure on $G$, and the operators of the representation are given by
\[ (R(g)f)(x) = f(xg), \qquad g \in G, \quad x \in G, \quad f \in L^2(G). \tag{1.1} \]
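For a finite group the Haar measure is the counting measure, and formula (1.1) can be tested directly. The sketch below uses the symmetric group $S(3)$, encoded as permutation tuples, as a stand-in; the encoding, the product convention and the function names are choices made for this illustration only.

```python
from itertools import permutations
import random

# The symmetric group S(3): each element is a tuple p with p[i] the image of i.
G = list(permutations(range(3)))

def mul(a, b):
    """Group product a*b, taken here as the composition (a*b)(i) = a[b[i]]."""
    return tuple(a[b[i]] for i in range(len(a)))

def R(g, f):
    """Right regular representation (1.1): (R(g) f)(x) = f(x g)."""
    return {x: f[mul(x, g)] for x in G}

def norm2(f):
    """Squared L^2 norm with respect to the counting measure on G."""
    return sum(abs(v) ** 2 for v in f.values())

random.seed(0)
f = {x: random.random() for x in G}   # an arbitrary test function on G

# R(g) R(h) = R(gh): the map g -> R(g) is a homomorphism.
homomorphism = all(R(g, R(h, f)) == R(mul(g, h), f) for g in G for h in G)
# Each R(g) only permutes the values of f, hence preserves the norm.
unitary = all(abs(norm2(R(g, f)) - norm2(f)) < 1e-12 for g in G)
print(homomorphism, unitary)
```

The homomorphism property relies only on associativity, $(xg)h = x(gh)$, which is why the right action in (1.1) needs no inverse.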
A unitary representation is said to be irreducible if it is not a direct sum of other representations. Irreducible representations are elementary objects like simple modules. A general unitary representation $T$ is, in a certain sense, built from irreducible ones: in the simplest cases $T$ is decomposed into a direct sum of irreducibles, and in more sophisticated situations, the direct sum is replaced by a “direct integral”.²

Two fundamental problems of unitary representation theory are:
(1) Given a group $G$, find all its irreducible unitary representations.
(2) For the most natural representations of $G$ (e.g., the regular representation), describe their decomposition into irreducibles.

The set of (equivalence classes of) irreducible unitary representations of $G$ is called the dual space to $G$ and is denoted by $\widehat{G}$. Thus, the first problem is the description of $\widehat{G}$. The second problem is called the problem of harmonic analysis. It can be viewed as a noncommutative generalization of the classical Fourier analysis.

These two problems were extensively studied for “conventional” groups. The existing literature is immense, and surveying it is beyond the scope of the present paper. What is important for us is that both problems, with appropriate refinement, make sense for certain “nonconventional” groups as well. These are the groups of automorphisms of infinite-dimensional Riemannian symmetric spaces and also certain combinatorial analogs of such groups, which are built with the help of the infinite symmetric group. Results on construction and classification of irreducible representations for the automorphism groups and their combinatorial analogs can be found in Olshanski [Ol1], [Ol5], [Ol2], [Ol4], Pickrell [Pi2], Nessonov [Nes]. The construction of natural reducible representations for these groups and related questions are discussed in Pickrell [Pi1], Kerov–Olshanski–Vershik [KOV1], [KOV2], Olshanski [Ol7].

In the present paper we focus on a single group $G$, which is $U(\infty) \times U(\infty)$.
The reason why we consider not the group $U(\infty)$ but the product of its two copies will be explained below. Here we would only like to note that $U(\infty)$ (or an appropriate completion thereof) can be viewed as an infinite-dimensional Riemannian symmetric space, and then $U(\infty) \times U(\infty)$ arises as a group of automorphisms of that space.

² This claim is true under certain assumptions on the group $G$ or on the representation $T$, but we don’t want to discuss technicalities here. Under additional (but still rather broad) assumptions, the decomposition into irreducibles is essentially unique.

2. The dual space $\widehat{U(N)}$ and spherical representations of $U(N) \times U(N)$

In this section we briefly describe a few necessary facts about representations of the groups $U(N)$. The material is classical;³ we present it in a form which will help to understand the subsequent infinite-dimensional generalization.

For $N = 1, 2, \dots$ let $U(N)$ denote the group of unitary matrices of size $N \times N$. This group is compact. Its irreducible representations are parametrized by signatures of length $N$, that is, $N$-tuples $\lambda = (\lambda_1, \dots, \lambda_N)$ of integers such that $\lambda_1 \ge \dots \ge \lambda_N$.⁴ Thus, the dual space $\widehat{U(N)}$ can be viewed as a countable discrete subset of $\mathbb{R}^N$. Let $\pi^\lambda$ denote the irreducible representation corresponding to a signature $\lambda \in \widehat{U(N)}$, $\dim \pi^\lambda$ denote the dimension of the representation space, and $R_N$ be the regular representation of $U(N)$ in the Hilbert space $L^2(U(N))$. The decomposition of $R_N$ looks as follows:
\[ R_N = \bigoplus_{\lambda \in \widehat{U(N)}} \dim \pi^\lambda \cdot \pi^\lambda. \]
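The multiplicities $\dim \pi^\lambda$ in this decomposition can be computed from the signature. The sketch below uses Weyl's dimension formula, $\dim \pi^\lambda = \prod_{i<j} (\lambda_i - \lambda_j + j - i)/(j - i)$, which is a classical fact brought in for illustration and is not stated in the text; the function names are likewise chosen for this example.

```python
from fractions import Fraction
from itertools import combinations

def is_signature(lam):
    """A signature of length N: integers with lam[0] >= lam[1] >= ... >= lam[N-1]."""
    return (all(isinstance(x, int) for x in lam)
            and all(a >= b for a, b in zip(lam, lam[1:])))

def weyl_dimension(lam):
    """Weyl's formula: dim pi^lam = prod_{i<j} (lam_i - lam_j + j - i) / (j - i)."""
    assert is_signature(lam), "expected a nonincreasing tuple of integers"
    dim = Fraction(1)
    for i, j in combinations(range(len(lam)), 2):
        dim *= Fraction(lam[i] - lam[j] + j - i, j - i)
    return int(dim)

print(weyl_dimension((1, 0, 0)))    # the defining representation of U(3)
print(weyl_dimension((1, 0, -1)))   # the representation of U(3) on traceless 3x3 matrices
print(weyl_dimension((5, 5, 5)))    # a power of the determinant
```

Exact rational arithmetic is used because the partial products are not integers in general, even though the final result always is.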
In other words, each irreducible representation enters the regular representation with multiplicity equal to the dimension of this irreducible representation. This is a special case of a general result valid for any compact group, the Peter–Weyl theorem.

We observe now that the group $U(N)$ acts on itself both on the right and on the left, so that $U(N)$ becomes a homogeneous space $U(N) \times U(N) / \mathrm{diag}(U(N))$, where $\mathrm{diag}(U(N))$ stands for the diagonal subgroup in $U(N) \times U(N)$. This enables us to extend the representation $R_N$ to a unitary representation $\widetilde{R}_N$ of the group $U(N) \times U(N)$ acting in the same space $L^2(U(N))$, cf. (1.1):
\[ (\widetilde{R}_N(g_1, g_2) f)(x) = f(g_2^{-1} x g_1), \qquad (g_1, g_2) \in U(N) \times U(N). \]
We call $\widetilde{R}_N$ the biregular representation. In contrast to $R_N$, the decomposition of $\widetilde{R}_N$ is multiplicity free:
\[ \widetilde{R}_N = \bigoplus_{\lambda \in \widehat{U(N)}} \big( \pi^\lambda \otimes \pi^{\lambda^*} \big). \tag{2.1} \]
Here $\pi^{\lambda^*}$ stands for the conjugate representation to $\pi^\lambda$; its signature is $\lambda^* = (-\lambda_N, \dots, -\lambda_1)$. We observe that general irreducible representations of $U(N) \times U(N)$ are of the form $\pi^\lambda \otimes \pi^\mu$, where $\lambda, \mu \in \widehat{U(N)}$. Representations with $\mu = \lambda^*$ are characterized as those possessing a spherical vector, that is, a nonzero vector invariant under the subgroup $\mathrm{diag}(U(N))$. Such representations are called spherical. The whole subspace of $\mathrm{diag}(U(N))$-invariants in $\pi^\lambda \otimes \pi^{\lambda^*}$ has dimension 1, so that the spherical vector is defined uniquely up to a scalar factor. Therefore, the spherical vector is a distinguished vector in the representation space.

³ See, e.g., Weyl [We], Zhelobenko [Zhe], Helgason [He].
⁴ Another term for collections $\lambda$ is “dominant highest weights for $U(N)$”.

Note that the homogeneous space $U(N) \times U(N) / \mathrm{diag}(U(N))$ is an example of a compact symmetric space $G/K$. For any such space, the associated unitary representation of $G$ in $L^2(G/K)$ is multiplicity free and its decomposition involves exactly the irreducible spherical representations of the pair $(G, K)$, that is, those irreducible representations of $G$ that possess a $K$-invariant vector. Returning to our special situation we conclude that the dual space $\widehat{U(N)}$ admits an alternative interpretation as the set of (equivalence classes of) irreducible spherical representations of the pair $(G, K) = (U(N) \times U(N), \mathrm{diag}(U(N)))$. Now we shall explain how this picture transforms when $U(N)$ is replaced by $U(\infty)$.

3. The dual space $\widehat{U(\infty)}$ and spherical representations of $U(\infty) \times U(\infty)$

Consider the tower of groups $U(1) \subset U(2) \subset U(3) \subset \dots$ where, for each $N$, the group $U(N)$ is identified with the subgroup in $U(N+1)$ formed by matrices $g = [g_{ij}]$ such that $g_{i,N+1} = g_{N+1,i} = \delta_{i,N+1}$. We define $U(\infty)$ as the union of all groups $U(N)$. Equivalently, $U(\infty)$ consists of unitary matrices $g = [g_{ij}]$ of infinite size, such that $g_{ij} = \delta_{ij}$ for $i + j$ large enough.

The conventional definition of a dual space, when applied to the group $U(\infty)$, gives a huge pathological space.⁵ It turns out that the situation drastically changes if we mimic the alternative interpretation of $\widehat{U(N)}$ stated at the end of §2:

Definition 3.1.
We set $\widehat{U(\infty)}$ to be the space of (equivalence classes of) irreducible spherical unitary representations of the pair $(G, K)$, where
\[ G = U(\infty) \times U(\infty), \qquad K = \mathrm{diag}(U(\infty)). \tag{3.1} \]
Here “spherical” has the same meaning as above: existence of a nonzero $K$-invariant vector. Again, such a vector is then unique up to a scalar factor.

Below $\mathbb{R}_+ \subset \mathbb{R}$ denotes the set of nonnegative real numbers and $\mathbb{R}_+^\infty$ denotes the direct product of countably many copies of $\mathbb{R}_+$.

⁵ This is a general property of the so-called wild groups; $U(\infty)$ is one of them.
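Theorem 3.2. The space $\widehat{U(\infty)}$, see Definition 3.1, can be identified with the subset
\[ \Omega \subset \mathbb{R}_+^{4\infty+2} = \mathbb{R}_+^\infty \times \mathbb{R}_+^\infty \times \mathbb{R}_+^\infty \times \mathbb{R}_+^\infty \times \mathbb{R}_+ \times \mathbb{R}_+ \]
formed by 6-tuples $\omega = (\alpha^+, \beta^+, \alpha^-, \beta^-, \delta^+, \delta^-)$ such that
\[ \alpha^\pm = (\alpha_1^\pm \ge \alpha_2^\pm \ge \dots \ge 0) \in \mathbb{R}_+^\infty, \qquad \beta^\pm = (\beta_1^\pm \ge \beta_2^\pm \ge \dots \ge 0) \in \mathbb{R}_+^\infty, \qquad \delta^\pm \in \mathbb{R}_+, \]
\[ \sum_{i \ge 1} (\alpha_i^\pm + \beta_i^\pm) \le \delta^\pm, \qquad \beta_1^+ + \beta_1^- \le 1. \]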
Thus, for any point $\omega \in \Omega$ there exists an attached irreducible spherical representation of $(G, K)$, which we denote by $T^\omega$. The representations $T^\omega$ belong to a larger class of admissible representations, which are studied in detail in Olshanski [Ol5], [Ol3]. In particular, we have at our disposal an explicit description of the representation space of $T^\omega$ together with the action of $G$ in it.

Theorem 3.2 has a long history. First of all, it should be said that the classification of irreducible spherical representations of $(G, K)$ is equivalent to that of finite factor representations of the group $U(\infty)$, see Olshanski [Ol1], [Ol5, §24].⁶ Finite factor representations of $U(\infty)$ were first studied by Voiculescu [Vo]. He discovered (among many other things) that these representations are parametrized by the so-called two-sided infinite totally positive sequences of real numbers. But he did not know that such sequences were completely classified much earlier by Edrei [Ed]. This fact was pointed out later by Vershik–Kerov [VK2] and Boyer [Boy]. Thus, Theorem 3.2 is hidden in Edrei’s paper. Note that [Ed] is a purely analytical work, which at first glance has nothing in common with representation theory. Another, very different approach to Theorem 3.2 was suggested in Vershik–Kerov [VK2] and further developed in Okounkov–Olshanski [OkOl].

Let $\mathrm{SGN}(N) \subset \mathbb{Z}^N$ denote the set of signatures of length $N$, see §2. We shall now define a sequence of embeddings $\iota_N : \mathrm{SGN}(N) \to \Omega$ such that as $N \to \infty$, the image $\iota_N(\mathrm{SGN}(N))$ becomes more and more dense in $\Omega$. This agrees with the intuitive idea that the space $\widehat{U(\infty)}$ should be a limit (in an appropriate sense) of the spaces $\widehat{U(N)}$. First, we need

Definition 3.3 (Vershik–Kerov [VK1]). Let $\mu$ be a Young diagram, $\mu'$ denote the transposed diagram, and $d(\mu)$ denote the number of diagonal boxes in $\mu$. We also regard $\mu$ as a partition $\mu = (\mu_1, \mu_2, \dots)$, so that $\mu_i$ is the length of the $i$th row in $\mu$ while $\mu'_i$ is the length of the $i$th column. The numbers
bi (µ) = µi − i + 12 ,
1 ≤ i ≤ d(µ)
are called the modified Frobenius coordinates of µ. For instance, if µ is the partition (3, 3, 1, 0, 0, . . . ) then d(µ) = 2 and a1 (µ) = 2 12 , a2 (µ) = 1 12 , b1 (µ) = 2 12 , b2 (µ) = 12 . The modified Frobenius 6About factor representations, see, e.g., Naimark [Na, §41.5]. In the present paper we do not use this concept.
Representation Theory and Random Point Processes
79
coordinates are always positive half-integers whose sum equals |µ|, the number of boxes in µ. Definition 3.4 (Embedding ιN : SGN(N ) → Ω). Given a signature λ ∈ SGN(N ), we represent it as a couple (λ+ , λ− ) of Young diagrams corresponding to positive and negative coordinates in λ: + − − λ = (λ+ 1 ≥ · · · ≥ λk > 0, . . . , 0 > −λl ≥ · · · ≥ −λ1 ).
Then we assign to $\lambda$ a point $\omega = \iota_N(\lambda) \in \Omega$, see Theorem 3.2, as follows:
\[ \alpha_i^\pm = \begin{cases} \dfrac{a_i(\lambda^\pm)}{N}, & i \le d(\lambda^\pm), \\ 0, & i > d(\lambda^\pm); \end{cases} \qquad \beta_i^\pm = \begin{cases} \dfrac{b_i(\lambda^\pm)}{N}, & i \le d(\lambda^\pm), \\ 0, & i > d(\lambda^\pm); \end{cases} \qquad \delta^\pm = \frac{|\lambda^\pm|}{N}. \]
It is readily verified that $\omega = (\alpha^+, \beta^+, \alpha^-, \beta^-, \delta^+, \delta^-)$ is indeed a point of $\Omega$. In particular, the inequality $\beta_1^+ + \beta_1^- \le 1$ follows from the evident fact that $k + l \le N$.

We equip $\Omega$ with the topology inherited from the ambient product space $\mathbb{R}_+^{4\infty+2}$. Then any point $\omega \in \Omega$ can be approached by a sequence of the form $\iota_N(\lambda^{(N)})$, where $\lambda^{(N)} \in \mathrm{SGN}(N)$, $N \to \infty$. Moreover, given a sequence $\{\lambda^{(N)}\}$, we have
\[ \iota_N(\lambda^{(N)}) \to \omega \iff \Big( \pi^{\lambda^{(N)}} \otimes \pi^{(\lambda^{(N)})^*} \Big) \to T^\omega, \]
where the last arrow means the convergence of representations of the groups $U(N) \times U(N)$ to a representation of the group $G = U(\infty) \times U(\infty)$, as defined in Olshanski [Ol5, §22], [Ol2].

4. The problem of harmonic analysis

Let us try to understand now what could be an analog of the decomposition (2.1) for the group $G$. From §3 we already know the counterparts of the discrete set $\widehat{U(N)}$ and of the representations $\pi^\lambda \otimes \pi^{\lambda^*}$: these are the infinite-dimensional space $\Omega$ and the spherical representations $T^\omega$. But what is the counterpart of the biregular representation $\widetilde{R}_N$ acting in the Hilbert space $L^2(U(N))$?

The conventional definition is not applicable to the group $U(\infty)$: one cannot define the $L^2$ space on this group, because $U(\infty)$ is not locally compact and hence does not possess an invariant measure. To overcome this difficulty we embed $U(\infty)$ into a larger space $\mathfrak{U}$, which can be defined as a projective limit of the spaces $U(N)$ as $N \to \infty$. The space $\mathfrak{U}$ is no longer a group but it is still a $G$-space. That is, the two-sided action of $U(\infty)$ on itself can be extended to an action on the space $\mathfrak{U}$. In contrast to $U(\infty)$, the space $\mathfrak{U}$ possesses a biinvariant finite measure, which should be viewed as a substitute for the nonexisting Haar measure.
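As an aside on §3, Definitions 3.3 and 3.4 are easy to experiment with. The following sketch computes the modified Frobenius coordinates of a partition and the finitely many nonzero coordinates of $\iota_N(\lambda)$; the function names, the dictionary layout of the result and the sample signature are assumptions made for illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def transpose(mu):
    """Transposed Young diagram: the column lengths of mu."""
    return [sum(1 for row in mu if row > j) for j in range(mu[0])] if mu else []

def frobenius(mu):
    """Modified Frobenius coordinates (a_i), (b_i) of a partition mu (Definition 3.3)."""
    mu = [m for m in mu if m > 0]
    mu_t = transpose(mu)
    d = sum(1 for i, m in enumerate(mu) if m >= i + 1)   # number of diagonal boxes
    a = [mu[i] - (i + 1) + HALF for i in range(d)]
    b = [mu_t[i] - (i + 1) + HALF for i in range(d)]
    return a, b

def embed(lam):
    """Nonzero coordinates of iota_N(lam) from Definition 3.4, with N = len(lam)."""
    N = len(lam)
    plus = [x for x in lam if x > 0]
    minus = sorted((-x for x in lam if x < 0), reverse=True)
    omega = {}
    for sign, diagram in (("+", plus), ("-", minus)):
        a, b = frobenius(diagram)
        omega["alpha" + sign] = [x / N for x in a]
        omega["beta" + sign] = [x / N for x in b]
        omega["delta" + sign] = Fraction(sum(diagram), N)
    return omega

print(frobenius([3, 3, 1]))              # the example following Definition 3.3
print(embed((4, 2, 0, 0, -1, -3)))       # a hypothetical signature of length 6
```

For points in the image of $\iota_N$ the constraint $\sum_i (\alpha_i^\pm + \beta_i^\pm) \le \delta^\pm$ of Theorem 3.2 holds with equality, because the modified Frobenius coordinates of a diagram sum to its number of boxes.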
Moreover, this biinvariant measure is included in a whole family $\{\mu^{(s)}\}_{s\in\mathbb{C}}$ of measures with good transformation properties.7

7 The idea of enlarging an infinite-dimensional space in order to build measures with good transformation properties is well known. This is a standard device in measure theory on linear spaces, but there are not so many works where it is applied to “curved” spaces (see, however, Pickrell [Pi1], Neretin [Ner]). For the history of the measures $\mu^{(s)}$ we refer to Olshanski [Ol7] and Borodin–Olshanski [BO5]. A parallel construction for the symmetric group case is given in Kerov–Olshanski–Vershik [KOV1], [KOV2].

Using the measures $\mu^{(s)}$ we explicitly construct a family $\{T_{z,w}\}_{z,w\in\mathbb{C}}$ of representations, which seem to be a good substitute for the nonexisting biregular representation. In our understanding, the $T_{z,w}$'s are “natural representations”, and we state the problem of harmonic analysis on $U(\infty)$ as follows:

Problem 4.1. Decompose the representations $T_{z,w}$ into irreducible representations.

We skip a concrete description of the representations $T_{z,w}$, which can be found in Olshanski [Ol7], and only list some of their properties that are relevant for our discussion. Henceforth we will assume that $\Re(z + w) > -1$ and that $z$ and $w$ are not integers. Then, as follows from the construction, $T_{z,w}$ comes with a distinguished unit vector $\xi$, which is $K$-invariant and cyclic. The latter property means that the linear span of the $G$-orbit of $\xi$ is dense in $H = H(T_{z,w})$, the Hilbert space of $T_{z,w}$.

Let $H_N \subset H$ be the Hilbert subspace spanned by the orbit of $\xi$ under the subgroup $U(N) \times U(N) \subset G$. Then $H_N$ carries a unitary representation of $U(N) \times U(N)$, which turns out to be equivalent to the biregular representation $\widetilde{R}_N$ of §2. Since $\{H_N\}$ is an ascending chain of spaces whose union is dense in $H$, we see that $T_{z,w}$ is an inductive limit of the biregular representations $\widetilde{R}_N$. At this point the reader might ask about the meaning of the parameters $z, w$; the answer is that to each value of $(z, w)$ there corresponds a specific tower of embeddings
\[
H_1 = L^2(U(1)) \subset \dots \subset H_N = L^2(U(N)) \subset H_{N+1} = L^2(U(N+1)) \subset \cdots. \tag{4.1}
\]
There are many (even too many) ways to realize $\widetilde{R}_N$ as a subrepresentation of $\widetilde{R}_{N+1}$, and our construction leads to a distinguished 2-parameter family of towers of embeddings.

The statement of Problem 4.1 looks rather abstract but we will gradually reduce it to a concrete form. The first step is to apply the following abstract claim.

Theorem 4.2. Let $T$ be a unitary representation of $G$ in a Hilbert space $H$ and assume that there exists a $K$-invariant cyclic vector $\xi \in H$ (we will assume $\|\xi\| = 1$). Then $(T, \xi)$ is completely determined, within a natural equivalence, by a probability measure $P$ on the dual space $\widehat{U(\infty)} = \Omega$. The decomposition of $T$ into irreducible representations is given by a multiplicity free direct integral of spherical representations $T^\omega$ with respect to the measure $P$.

We call $P$ the spectral measure of $(T, \xi)$. Note that if $\xi$ is replaced by another vector $\xi' \in H$ with the same properties then $P$ is replaced by an equivalent measure $P'$. We will not define precisely what a “direct integral of representations” is (see, e.g., Naimark [Na, §41]) but only observe that Theorem 4.2 is strictly similar to a customary fact, the spectral theorem for a pair
$(A, \xi)$, where $A$ stands for a self-adjoint operator in a Hilbert space $H$ and $\xi \in H$ is a unit cyclic vector.

Taking into account Theorem 4.2 we replace Problem 4.1 by

Problem 4.3. Assume that $z, w \in \mathbb{C} \setminus \mathbb{Z}$ and $\Re(z + w) > -1$. Let $\xi$ be the distinguished $K$-invariant cyclic unit vector provided by the construction of $T_{z,w}$, and let $P_{z,w}$ denote the spectral measure of $(T_{z,w}, \xi)$, which is a probability measure on $\Omega$. Describe $P_{z,w}$ explicitly.

Recall that the Hilbert space $H(T_{z,w})$ is the inductive limit of a chain (4.1) and that the vector $\xi$ belongs to all spaces $H_N$, which carry representations $\widetilde{R}_N$. Evidently, for each $N$, $\xi$ is a $\mathrm{diag}(U(N))$-invariant cyclic vector in the biregular representation $\widetilde{R}_N$. The pair $(\widetilde{R}_N, \xi)$ gives rise to a spectral measure $P^{(N)}_{z,w}$ on $\widehat{U(N)} = \mathrm{SGN}(N)$. Since $\mathrm{SGN}(N)$ is a discrete space, this is a purely atomic probability measure. It has a very simple meaning. According to decomposition (2.1) we obtain an orthogonal decomposition of $\xi$ into a sum of certain vectors $\xi_\lambda$. We have
\[
1 = \|\xi\|^2 = \sum_{\lambda \in \mathrm{SGN}(N)} \|\xi_\lambda\|^2 \qquad\text{and}\qquad P^{(N)}_{z,w}(\lambda) = \|\xi_\lambda\|^2 \quad\text{for } \lambda \in \mathrm{SGN}(N).
\]
The numbers $P^{(N)}_{z,w}(\lambda)$ can be computed; the result is as follows:
\[
P^{(N)}_{z,w}(\lambda) = \mathrm{const}_N \cdot \prod_{i=1}^{N} W_N(\lambda_i - i) \cdot \prod_{1 \le i < j \le N} (\lambda_i - \lambda_j - i + j)^2, \tag{4.2}
\]
\[
W_N(l) = \big|\Gamma(z - l)\,\Gamma(w + N + 1 + l)\big|^{-2}, \qquad l \in \mathbb{Z}, \tag{4.3}
\]
where $\mathrm{const}_N$ is a normalization constant. The assumption that $z, w$ are not integers just means that $P^{(N)}_{z,w}(\lambda)$ does not vanish (which is related to the cyclicity of the vector $\xi$). The assumption $\Re(z + w) > -1$ guarantees that
\[
\sum_{\lambda \in \mathrm{SGN}(N)} \; \prod_{i=1}^{N} W_N(\lambda_i - i) \prod_{1 \le i < j \le N} (\lambda_i - \lambda_j - i + j)^2 < \infty
\]
for all $N$, so that the normalization is indeed possible. On the other hand, one can prove that
\[
\lim_{N \to \infty} \iota_N\big(P^{(N)}_{z,w}\big) = P_{z,w}, \tag{4.4}
\]
where the embeddings $\iota_N : \mathrm{SGN}(N) \to \Omega$ were specified in Definition 3.4. Thus, Problem 4.3 admits a reformulation which already has a very concrete form:

Problem 4.4. Compute explicitly the limit probability measure on the right-hand side of (4.4), where the probability measures on the left-hand side are given by (4.2) and Definition 3.4.

In the remaining part of the paper we explain how this problem is solved. A detailed exposition of the material of this section can be found in Olshanski [Ol7].
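Formulas (4.2)–(4.3) can be evaluated directly for a given signature. The following is a minimal sketch, not taken from the paper: it assumes, purely for illustration, that $z$ and $w$ are real non-integers (so that the standard-library `math.gamma` suffices), it does not compute the normalization constant $\mathrm{const}_N$, and the function names are hypothetical.

```python
import math


def weight_W(l, N, z, w):
    """W_N(l) = |Gamma(z - l) * Gamma(w + N + 1 + l)|^(-2), formula (4.3).

    Assumption of this sketch: z and w are real and not integers, so both
    Gamma arguments avoid the poles. Large |l| may overflow math.gamma.
    """
    return abs(math.gamma(z - l) * math.gamma(w + N + 1 + l)) ** -2


def unnormalized_weight(lam, z, w):
    """Right-hand side of (4.2) without const_N, for a signature lam.

    lam is a weakly decreasing sequence of integers (lambda_1 >= ... >= lambda_N).
    """
    N = len(lam)
    # l_i = lambda_i - i with 1-based i, as in the argument of W_N in (4.2)
    shifted = [lam[i] - (i + 1) for i in range(N)]
    product_W = math.prod(weight_W(l, N, z, w) for l in shifted)
    # (lambda_i - lambda_j - i + j)^2 = (l_i - l_j)^2
    vandermonde_sq = math.prod(
        (shifted[i] - shifted[j]) ** 2
        for i in range(N)
        for j in range(i + 1, N)
    )
    return product_W * vandermonde_sq
```

For example, `unnormalized_weight((1, 0), 0.3, 0.4)` evaluates the product for the signature $\lambda = (1, 0)$ with $N = 2$; dividing by the (infinite) sum over all signatures would give the probability $P^{(N)}_{z,w}(\lambda)$.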
5. Random point processes

The spectral measures $P_{z,w}$ that we aim to describe live on a “very big” space $\Omega$, which is a domain in an infinite-dimensional product space. There is no hope that $\Omega$ possesses a simple reference measure (like Lebesgue measure) such that $P_{z,w}$ would be determined by a density with respect to that measure. Thus, we have to use another language to describe our measures. It turns out that such a language is provided by the theory of random point processes.

In this section we give a few necessary basic definitions concerning random point processes and also provide a few examples which seem to be relevant for the discussion of our main problem. One should not regard our exposition as a survey on point processes. As basic references on this subject the reader can consult Daley and Vere-Jones [DVJ] and Lenard [Len].

Let $\mathfrak{X}$ be a locally compact space. A point configuration in $\mathfrak{X}$ is a finite or countable subset without limit points. Let $\mathrm{Conf}(\mathfrak{X})$ be the set of all point configurations. For any Borel subset $A \subset \mathfrak{X}$ with compact closure, let $N_A : \mathrm{Conf}(\mathfrak{X}) \to \mathbb{Z}_+$ be the function defined by $N_A(X) = |A \cap X|$, where $X \in \mathrm{Conf}(\mathfrak{X})$. Consider the sigma-algebra of subsets in $\mathrm{Conf}(\mathfrak{X})$ generated by all functions $N_A$. A probability measure $\mathcal{P}$ defined on this sigma-algebra is called a random point process on $\mathfrak{X}$. Given $\mathcal{P}$, point configurations $X \subset \mathfrak{X}$ become random objects, and we can speak, for instance, about probabilities of events like this: $N_{A_1}(X) = n_1, \dots, N_{A_k}(X) = n_k$.

Example 5.1 (Poisson process). The simplest and best-known random point process is the Poisson process, which is determined by an arbitrary measure $m$ on $\mathfrak{X}$. The Poisson process is characterized by the property that the probability of each event of the form above, where $A_1, \dots, A_k$ do not intersect, equals
\[
\prod_{i=1}^{k} e^{-m(A_i)} \frac{(m(A_i))^{n_i}}{n_i!}.
\]
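The product formula of Example 5.1 is easy to evaluate numerically. The sketch below is only an illustration: it assumes that the masses $m(A_i)$ of the pairwise disjoint sets are supplied directly as numbers, and the function name is hypothetical.

```python
import math


def poisson_event_probability(masses, counts):
    """Probability that N_{A_i}(X) = n_i for all i, for disjoint A_1, ..., A_k.

    masses[i] is m(A_i) and counts[i] is n_i; the sets themselves are not
    needed because the formula depends on them only through m(A_i).
    """
    if len(masses) != len(counts):
        raise ValueError("masses and counts must have the same length")
    probability = 1.0
    for mass, n in zip(masses, counts):
        # one factor e^{-m(A_i)} m(A_i)^{n_i} / n_i! per set
        probability *= math.exp(-mass) * mass ** n / math.factorial(n)
    return probability
```

Each factor is a Poisson distribution with mean $m(A_i)$, which reflects the fact that the counts in disjoint sets are independent.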
Given a point process $\mathcal{P}$ on $\mathfrak{X}$, we can integrate various functions $F(X)$ on $\mathrm{Conf}(\mathfrak{X})$. An important class of functions $F$ is defined as follows. Let $f(x_1, \dots, x_n)$ be a continuous function on $\mathfrak{X}^n$ with compact support; we set
\[
F_f(X) = \sum_{x_1, \dots, x_n} f(x_1, \dots, x_n), \qquad X \in \mathrm{Conf}(\mathfrak{X}),
\]
summed over all $n$-tuples of pairwise distinct points in $X$. Note that $F_f$ depends on the symmetric part of $f$ only. Under mild assumptions on $\mathcal{P}$, there exists a unique symmetric measure $\rho_n$ on $\mathfrak{X}^n$ such that for any $f$ as above,
\[
\int_{\mathrm{Conf}(\mathfrak{X})} F_f(X)\, \mathcal{P}(dX) = \int_{\mathfrak{X}^n} f(x_1, \dots, x_n)\, \rho_n(dx_1 \dots dx_n),
\]
and, moreover, $\mathcal{P}$ is uniquely determined by the infinite sequence of measures $\rho_1, \rho_2, \dots$ (see Lenard [Len]). These measures are called the correlation measures of $\mathcal{P}$. They are a convenient tool for identifying and studying a point process.

When $\mathcal{P}$ is the Poisson process, we simply have $\rho_n = m^{\otimes n}$. For non-Poisson processes $\mathcal{P}$, the correlation measures can have a more sophisticated structure. In practice one can usually choose a natural reference measure $m$ on $\mathfrak{X}$ such that $\rho_n$ has a density with respect to $m^{\otimes n}$ for each $n$. Then this density is called the $n$th correlation function of $\mathcal{P}$; we will denote it as $\rho_n(x_1, \dots, x_n)$. If $\mathfrak{X}$ is a discrete space and $m$ is the counting measure then $\rho_n(x_1, \dots, x_n)$ is the probability that the random configuration $X$ contains all points $x_1, \dots, x_n$ (if these points are not all distinct then $\rho_n(x_1, \dots, x_n) = 0$). When $\mathfrak{X}$ is not discrete, $\rho_n(x_1, \dots, x_n)$ can be informally defined as follows:
\[
\rho_n(x_1, \dots, x_n) = \lim_{\Delta x_1 \to 0, \dots, \Delta x_n \to 0} \frac{\mathrm{Prob}\{\text{random } X \text{ intersects } \Delta x_1, \dots, \Delta x_n\}}{m(\Delta x_1) \cdots m(\Delta x_n)},
\]
where $\Delta x_1, \dots, \Delta x_n$ are small neighborhoods of the points $x_1, \dots, x_n$. In words, $\rho_n(x_1, \dots, x_n)$ is the density of the probability to find a point of the random configuration in each of $n$ infinitesimally small neighborhoods about $x_1, \dots, x_n$.

Definition 5.2 (Determinantal processes). Assume that a reference measure as above exists, so that we can deal with the correlation functions. Then $\mathcal{P}$ is called a determinantal point process if there exists a function $K(x, y)$ on $\mathfrak{X} \times \mathfrak{X}$ such that
\[
\rho_n(x_1, \dots, x_n) = \det[K(x_i, x_j)]_{1 \le i, j \le n}, \qquad n = 1, 2, \dots.
\]
We call $K$ the correlation kernel of $\mathcal{P}$. If $K$ is symmetric ($K(x, y) = K(y, x)$) then the points in the random configuration are negatively correlated: a very close approach of points has a relatively small probability. So, the points look like mutually repelling particles. In a Poisson process, on the contrary, the points are not correlated at all; they look like noninteracting particles. A good survey on determinantal point processes is Soshnikov [So].

All the information about a determinantal process $\mathcal{P}$ is hidden in its correlation kernel $K(x, y)$. In this respect, determinantal point processes can be compared to Gaussian measures, where all the information is contained in the covariance matrix. Knowing $K(x, y)$ we can, in principle, compute the probabilities of various natural events associated to $\mathcal{P}$. We state the simplest but important example:

Proposition 5.3. Let $\mathcal{P}$ be a determinantal point process with a correlation kernel $K$. The probability of having no particles in a region $I \subset \mathfrak{X}$ is equal to the Fredholm determinant $\det(1 - K_I)$, where $K_I$ is the restriction of $K$ to $I \times I$.
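Proposition 5.3 can be checked by hand in a toy situation. The sketch below is an illustration under explicit assumptions that are not in the text: the space is a finite set $\{0, \dots, M-1\}$ with the counting measure, and the kernel is the matrix of an orthogonal projection of rank $n$, so that the process has exactly $n$ points and each $n$-point configuration $S$ has probability $\det[K(x, y)]_{x, y \in S}$. In this setting the Fredholm determinant is an ordinary determinant, and it can be compared with a brute-force sum over configurations that avoid $I$. All function names are hypothetical.

```python
import itertools
import math


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-14:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return result


def projection_kernel(vectors):
    """K(x, y) = sum_k phi_k(x) phi_k(y), phi_k = Gram-Schmidt of the input."""
    basis = []
    for v in vectors:
        w = [float(t) for t in v]
        for b in basis:
            c = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        basis.append([wi / norm for wi in w])
    m = len(vectors[0])
    return [[sum(b[x] * b[y] for b in basis) for y in range(m)] for x in range(m)]


def submatrix(K, indices):
    return [[K[i][j] for j in indices] for i in indices]


def gap_probability(K, region):
    """det(1 - K_I) for a finite region I, as in Proposition 5.3."""
    sub = submatrix(K, region)
    size = len(region)
    return det([[(1.0 if i == j else 0.0) - sub[i][j] for j in range(size)]
                for i in range(size)])


def gap_probability_brute(K, n_points, region):
    """Sum of det K_S over all n-point configurations S disjoint from I."""
    allowed = [x for x in range(len(K)) if x not in region]
    return sum(det(submatrix(K, S))
               for S in itertools.combinations(allowed, n_points))
```

With `K = projection_kernel([[1, 1, 1, 1, 1], [0, 1, 2, 3, 4]])` (a two-point discrete orthogonal polynomial ensemble with constant weight on five sites), both `gap_probability(K, [0, 1])` and `gap_probability_brute(K, 2, [0, 1])` give the probability that the sites 0 and 1 are empty.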
It often happens that such a gap probability can be expressed through a solution of a (second order nonlinear ordinary differential) Painlevé equation, see Example 6.2 below. The best-known example of a determinantal process is the following.

Example 5.4 (Sine process). The sine kernel is given by
\[
K(x, y) = \frac{\sin(\pi(x - y))}{\pi(x - y)}, \qquad x, y \in \mathbb{R}
\]
(here the reference measure $m$ is Lebesgue measure). The sine kernel determines a remarkable translation invariant point process on $\mathfrak{X} = \mathbb{R}$. It is instructive to compare the sine process with the standard Poisson process on $\mathbb{R}$ (where $m$ is again Lebesgue measure). Both processes are translation invariant, and for both processes the mean distance between adjacent points equals 1. However, as can be seen from computer simulations, the sample random configurations of the Poisson process are more chaotic. For the Poisson process, the distance between adjacent points is a very simple random variable (it has exponential distribution), while for the sine process the corresponding distribution is expressed through a Painlevé transcendent.8

For a large number of concrete examples of determinantal processes the space $\mathfrak{X}$ is a subset of $\mathbb{R}$, $\mathbb{C}$, or $\mathbb{Z}$, and the correlation kernel has the form
\[
K(x, y) = \frac{P(x)Q(y) - Q(x)P(y)}{x - y} \tag{5.1}
\]
or, more generally,
\[
K(x, y) = \frac{\sum_{i=1}^{k} F_i(x) G_i(y)}{x - y}, \qquad\text{where}\quad \sum_{i=1}^{k} F_i(x) G_i(x) = 0. \tag{5.2}
\]
Such kernels are called integrable, see Its–Izergin–Korepin–Slavnov [IIKS], Deift [De], Borodin [B2].

Example 5.5 (Orthogonal polynomial ensembles). Let $W(x)$ be a weight function (defined, say, on a subset $\mathfrak{X} \subset \mathbb{R}$) and let $p_0 \equiv 1, p_1, p_2, \dots$ be the associated family of orthogonal polynomials. For an arbitrary $N = 1, 2, \dots$, consider the orthogonal projection operator in $L^2(\mathfrak{X}, dx)$9 onto the $N$-dimensional subspace spanned by the functions $p_i(x) W^{1/2}(x)$, $0 \le i \le N - 1$, and let $K_N^W(x, y)$ stand for the kernel of this operator. This kernel can be written in integrable form (5.1) with
\[
P(x) = \mathrm{const}\; p_N(x) W^{1/2}(x), \qquad Q(x) = \mathrm{const}\; p_{N-1}(x) W^{1/2}(x).
\]

8 This result was originally proved in Jimbo–Miwa–Môri–Sato [JMMS], and a number of other proofs and extensions were later given by different authors, see Borodin–Deift [BD] for references.

9 If $\mathfrak{X}$ is a discrete set then Lebesgue measure $dx$ is replaced by the counting measure.

In other words, $K_N^W(x, y)$ is equal to the classical Christoffel–Darboux kernel times $W^{1/2}(x) W^{1/2}(y)$. The kernel $K_N^W(x, y)$ gives rise to random $N$-point configurations in $\mathfrak{X}$. Namely, the density of probability10 of a given configuration has the form
\[
\mathcal{P}(x_1, \dots, x_N) = \mathrm{const} \prod_{i=1}^{N} W(x_i) \prod_{1 \le i < j \le N} (x_i - x_j)^2. \tag{5.3}
\]
The random point processes of this type are called orthogonal polynomial ensembles. Note that (5.3) can be written in the Gibbsian form which is common in statistical physics (here $V = W$):
\[
\mathcal{P}(x_1, \dots, x_N) = \mathrm{const}\, \exp\Big\{ -\sum_{i} \log V^{-1}(x_i) - \sum_{i < j} 2 \log |x_i - x_j|^{-1} \Big\}.
\]
The terms $\log V^{-1}(x)$ and $2 \log |x_i - x_j|^{-1}$ are interpreted as the one-particle potential and the pair potential, respectively, and the whole ensemble is interpreted as an $N$-particle log-gas system (Forrester [Fo]).

A variety of random point processes comes from spectra of random matrices. A basic example is the Gaussian Unitary Ensemble (GUE) formed by $N \times N$ Hermitian matrices distributed according to a Gaussian measure invariant under conjugation by unitary matrices from $U(N)$. The spectrum of such a random matrix is a random $N$-point configuration in $\mathfrak{X} = \mathbb{R}$ arising from the Hermite orthogonal polynomial ensemble (in the notation of Example 5.5, $W(x) = e^{-x^2}$, the weight function of the Hermite polynomials). From other ensembles of random matrices one can also obtain the Laguerre and Jacobi orthogonal polynomial ensembles (see, e.g., Forrester [Fo]).

One of the fundamental problems in random matrix theory is to study the asymptotic behavior of random matrices as their size goes to infinity. This leads, in particular, to studying the scaling limits of orthogonal polynomial ensembles in various regimes. For instance, if we focus on the $N$-point Hermite polynomial ensemble with large $N$ in a neighborhood of the origin and scale the space variable $x$ so that the mean distance between adjacent points becomes approximately 1 (which is achieved by the change of variable $x \mapsto x' = \sqrt{2N}\, x / \pi$), then we obtain in the limit $N \to \infty$ the sine process.

Orthogonal polynomial ensembles with discrete state space $\mathfrak{X}$ arise in a number of probabilistic models which include random tilings (Johansson [Jo3]) and directed percolation (Johansson [Jo1], [Jo2]). Classical discrete orthogonal polynomials known as Charlier, Krawtchouk, Meixner, and Hahn polynomials arise in this fashion.

10 If the space $\mathfrak{X}$ is discrete then one can simply speak about the probability of $(x_1, \dots, x_N)$.
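The scaling limit of the Hermite ensemble to the sine process can be observed numerically. The following sketch is an illustration only (the function names are hypothetical). It builds the kernel $K_N^W$ of Example 5.5 for $W(x) = e^{-x^2}$ from orthonormal Hermite functions computed by the standard three-term recurrence, writes it also in the integrable form (5.1) via the Christoffel–Darboux formula, and rescales it by $x = \pi x' / \sqrt{2N}$ as described above.

```python
import math


def hermite_functions(n_max, x):
    """Orthonormal functions psi_k(x) = c_k H_k(x) exp(-x^2 / 2), k = 0..n_max."""
    psi = [math.pi ** -0.25 * math.exp(-x * x / 2)]
    if n_max >= 1:
        psi.append(math.sqrt(2.0) * x * psi[0])
    for k in range(1, n_max):
        # three-term recurrence for the normalized Hermite functions
        psi.append(math.sqrt(2.0 / (k + 1)) * x * psi[k]
                   - math.sqrt(k / (k + 1)) * psi[k - 1])
    return psi


def hermite_kernel(N, x, y):
    """Projection kernel K_N(x, y) = sum_{k < N} psi_k(x) psi_k(y)."""
    px = hermite_functions(N - 1, x)
    py = hermite_functions(N - 1, y)
    return sum(a * b for a, b in zip(px, py))


def hermite_kernel_cd(N, x, y):
    """Same kernel in the integrable form (5.1) (Christoffel-Darboux), x != y."""
    px = hermite_functions(N, x)
    py = hermite_functions(N, y)
    return math.sqrt(N / 2) * (px[N] * py[N - 1] - px[N - 1] * py[N]) / (x - y)


def sine_kernel(x, y):
    if x == y:
        return 1.0
    return math.sin(math.pi * (x - y)) / (math.pi * (x - y))


def rescaled_hermite_kernel(N, u, v):
    """Kernel in the variable x' = sqrt(2N) x / pi (unit mean spacing at 0)."""
    c = math.pi / math.sqrt(2 * N)
    return c * hermite_kernel(N, c * u, c * v)
```

For moderate $N$ (say $N = 60$) the values of `rescaled_hermite_kernel(N, u, v)` near the origin are already close to `sine_kernel(u, v)`; the agreement improves as $N$ grows.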
6. Point processes $\mathcal{P}_{z,w}$. The main result

Now we return to the spectral measures $P_{z,w}$. We will explain how to convert them into random point processes $\mathcal{P}_{z,w}$ on the space $\mathfrak{X} = \mathbb{R} \setminus \{\pm\tfrac12\}$ (the real line with two punctures, at $\tfrac12$ and $-\tfrac12$). We define a projection $\Omega \to \mathrm{Conf}(\mathfrak{X})$ by
\[
\omega = (\alpha^+, \beta^+, \alpha^-, \beta^-, \delta^+, \delta^-) \;\mapsto\; X = \{\alpha_i^+ + \tfrac12\} \sqcup \{\tfrac12 - \beta_i^+\} \sqcup \{-\alpha_j^- - \tfrac12\} \sqcup \{-\tfrac12 + \beta_j^-\}, \tag{6.1}
\]
where we omit possible 0’s among $\alpha_i^+, \beta_i^+, \alpha_i^-, \beta_i^-$, and also omit possible 1’s among $\beta_i^+$ or $\beta_i^-$. Note that $X$ is bounded in $\mathbb{R}$ and its points may accumulate only near the punctures $\tfrac12$ and $-\tfrac12$.

By definition, $\mathcal{P}_{z,w}$ is the push-forward of the measure $P_{z,w}$ under the projection $\Omega \to \mathrm{Conf}(\mathfrak{X})$. The projection is not injective, so that we can, in principle, lose a part of the information about our measure $P_{z,w}$ under the passage $P_{z,w} \to \mathcal{P}_{z,w}$. However, one can present arguments showing that the losses (if any) are negligible, see the end of §9 in Borodin–Olshanski [BO6]. Thus, we can regard $\mathcal{P}_{z,w}$ as a substitute for $P_{z,w}$. The next result provides a description of the point process $\mathcal{P}_{z,w}$ and can be viewed as a solution of Problem 4.4.
Theorem 6.1 (Main result). $\mathcal{P}_{z,w}$ is a determinantal point process. Its correlation kernel can be written in integrable form (5.2) with $k = 2$, where the functions $F_1$, $F_2$, $G_1$, and $G_2$ can be explicitly expressed through the Gauss hypergeometric function. For instance, if $x > \tfrac12$ and $y > \tfrac12$ then the kernel can be written in form (5.1) with
\[
P(x) = \mathrm{const}\, \big(x - \tfrac12\big)^{-\frac12 (z + \bar z) - \bar w} \big(x + \tfrac12\big)^{\frac12 (w - \bar w)}\; {}_2F_1\Big(z + \bar w,\ \bar z + \bar w;\ z + \bar z + w + \bar w + 1;\ \big(\tfrac12 - x\big)^{-1}\Big),
\]
\[
Q(x) = \mathrm{const}\, \big(x - \tfrac12\big)^{-\frac12 (z + \bar z) - \bar w - 1} \big(x + \tfrac12\big)^{\frac12 (w - \bar w)}\; {}_2F_1\Big(z + \bar w + 1,\ \bar z + \bar w + 1;\ z + \bar z + w + \bar w + 2;\ \big(\tfrac12 - x\big)^{-1}\Big).
\]
Here ${}_2F_1(a, b; c; \zeta)$ is the Gauss hypergeometric function with parameters $a, b, c$ and argument $\zeta$. Note that this function is well defined for $\zeta < 0$.

We call the kernel of Theorem 6.1 the (continuous) hypergeometric kernel; let us denote it by $K^{\mathrm{hypergeom}}_{z,w}(x, y)$. Precise formulas for the kernel and the proof of the theorem are given in our paper [BO6].
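The projection (6.1), which underlies the definition of the point process in Theorem 6.1, can be sketched as follows. This is an illustration under stated assumptions: the infinite sequences $\alpha^\pm$, $\beta^\pm$ are represented by finite lists (trailing zeros are implied), exact rational arithmetic is used for readability, and the function name is hypothetical. The parameters $\delta^\pm$ do not enter (6.1) and are therefore not passed.

```python
from fractions import Fraction


def project_to_configuration(alpha_plus, beta_plus, alpha_minus, beta_minus):
    """Image X of a point omega under the projection (6.1), as a sorted list.

    Zeros among the alpha's and beta's and ones among the beta's are omitted,
    since they would land on the punctures +1/2 and -1/2.
    """
    half = Fraction(1, 2)
    points = []
    points += [Fraction(a) + half for a in alpha_plus if a != 0]
    points += [half - Fraction(b) for b in beta_plus if b not in (0, 1)]
    points += [-Fraction(a) - half for a in alpha_minus if a != 0]
    points += [-half + Fraction(b) for b in beta_minus if b not in (0, 1)]
    return sorted(points)
```

For instance, `project_to_configuration([Fraction(1, 4)], [Fraction(1, 8)], [Fraction(1, 2)], [])` returns the three points $-1$, $\tfrac38$, $\tfrac34$: the $\alpha$'s land outside $(-\tfrac12, \tfrac12)$ and the $\beta$'s inside.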
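Note that the kernel $K^{\mathrm{hypergeom}}_{z,w}(x, y)$ is real valued but not symmetric. It has the following symmetry property instead:
\[
K^{\mathrm{hypergeom}}_{z,w}(x, y) = \begin{cases} K^{\mathrm{hypergeom}}_{z,w}(y, x) & \text{if } x, y \text{ are both inside or both outside } (-\tfrac12, \tfrac12); \\[4pt] -K^{\mathrm{hypergeom}}_{z,w}(y, x) & \text{otherwise.} \end{cases} \tag{6.2}
\]
In other words, $K^{\mathrm{hypergeom}}_{z,w}(x, y)$ is symmetric with respect to the indefinite inner product of functions on $\mathfrak{X}$ given by
\[
[f, g] = \int_{\mathbb{R} \setminus [-\frac12, \frac12]} f(x) g(x)\, dx \;-\; \int_{(-\frac12, \frac12)} f(x) g(x)\, dx.
\]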
An explanation of this fact will be given in Remark 7.2 below.

Since all the information about the point process $\mathcal{P}_{z,w}$ is hidden in the kernel $K^{\mathrm{hypergeom}}_{z,w}(x, y)$, a natural question is: What can be extracted from the explicit expression for the kernel? For instance, each of the parameters $\alpha_i^\pm$, $\beta_i^\pm$ can be viewed as a random variable defined on the probability space $(\Omega, P_{z,w})$; what can be said about their distribution? Here are two examples.

The first example concerns the distribution of $\alpha_1^+$. The same result holds for $\alpha_1^-$; it suffices to interchange $z$ and $w$.

Example 6.2 (Painlevé VI). By virtue of Proposition 5.3, the probability distribution of $\alpha_1^+$ is given by
\[
\mathrm{Prob}\{\alpha_1^+ < u\} = \det\big(1 - K_{\frac12 + u}\big), \qquad u > 0,
\]
where we abbreviate
\[
K_s = K^{\mathrm{hypergeom}}\big|_{(s, +\infty) \times (s, +\infty)}, \qquad s > \tfrac12.
\]
Set
\[
\nu_1 = \frac{z + \bar z + w + \bar w}{2}, \qquad \nu_3 = \frac{z - \bar z + w - \bar w}{2}, \qquad \nu_4 = \frac{z - \bar z - w + \bar w}{2},
\]
\[
\sigma(s) = \big(s^2 - \tfrac14\big) \frac{d \ln \det(1 - K_s)}{ds} - \nu_1^2 s + \frac{\nu_3 \nu_4}{2}.
\]
Then $\sigma(s)$ satisfies the differential equation
\[
-\sigma' \Big( \big(s^2 - \tfrac14\big) \sigma'' \Big)^2 = \Big( 2 (s \sigma' - \sigma) \sigma' - \nu_1^2 \nu_3 \nu_4 \Big)^2 - (\sigma' + \nu_1^2)^2 (\sigma' + \nu_3^2)(\sigma' + \nu_4^2).
\]
This differential equation is the so-called $\sigma$-form of the Painlevé VI equation. The proof can be found in Borodin–Deift [BD]. We refer to the introduction of that paper for a brief historical introduction and references on this subject.

Our second example concerns the asymptotic behavior of the parameters $\alpha_i^\pm$, $\beta_i^\pm$ as $i \to \infty$.
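Example 6.3 (Law of large numbers). We conjecture that with probability 1,
\[
\lim_{k \to \infty} (\alpha_k^+)^{1/k} = \lim_{k \to \infty} (\beta_k^+)^{1/k} = q(z), \qquad \lim_{k \to \infty} (\alpha_k^-)^{1/k} = \lim_{k \to \infty} (\beta_k^-)^{1/k} = q(w),
\]
where
\[
q(z) = \exp\Big\{ -\sum_{n \in \mathbb{Z}} |z - n|^{-2} \Big\} = \exp\Big\{ -\frac{\pi \sin(\pi(z - \bar z))}{(z - \bar z) \sin(\pi z) \sin(\pi \bar z)} \Big\}.
\]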
This conjecture is based on the results of Borodin–Olshanski [BO1] and [BO7]. The result should be obtained by analogy with Theorem 5.1 of [BO1]. However, we have not yet verified the details.

7. Lattice approximation to process $\mathcal{P}_{z,w}$

Our proof of Theorem 6.1 is based on the limit relation (4.4). In §6, we have interpreted its right-hand side as a point process. Here we explain how to do the same for the left-hand side and thus to translate this relation into the language of random point processes.

Comparing (4.2)–(4.3) with (5.3) we see that the measure $P^{(N)}_{z,w}$ on $\mathrm{SGN}(N)$ gives rise to a discrete orthogonal polynomial ensemble on $\mathbb{Z}$ with weight function (4.3). Here we have used the bijective correspondence between signatures $\lambda \in \mathrm{SGN}(N)$ and $N$-point configurations $(l_1 > \dots > l_N)$ on $\mathbb{Z}$ determined by the relation $l_i = \lambda_i - i$.

Since the weight $W_N(l)$ from (4.3) has a slow (polynomial) decay at infinity,
\[
W_N(l) \sim |l|^{-2\Re(z + w) - 2N}, \qquad l \to \pm\infty,
\]
it admits only finitely many orthogonal polynomials. However, due to the assumption $\Re(z + w) > -1$, we have enough polynomials to define the orthogonal polynomial ensemble for any $N$. We call it the Askey–Lesky ensemble, because the orthogonal polynomials in question were computed in Askey [As] and Lesky [Les1], [Les2]. The Askey–Lesky polynomials are relatives of the classical Hahn polynomials; they are expressed through the value of the hypergeometric series ${}_3F_2$ at 1. From the explicit expression of these polynomials we obtain the corresponding correlation kernel $K_N^{\text{Askey-Lesky}}(x, y)$. The Askey–Lesky ensemble is an interesting example of a discrete log-gas system (the particles are confined to a lattice).

However, the Askey–Lesky ensemble is only an intermediate object; we need to transform it further in order to visualize the modified Frobenius coordinates of the Young diagrams $\lambda^\pm$ (see Definitions 3.3 and 3.4).

The first step is rather simple: we shift the configuration $(l_1, \dots, l_N)$ by $\tfrac{N+1}{2}$, so that the resulting correspondence between signatures and $N$-point configurations takes a more symmetric form
\[
\lambda \;\leftrightarrow\; L = \big\{ \lambda_1 + \tfrac{N-1}{2},\ \lambda_2 + \tfrac{N-3}{2},\ \dots,\ \lambda_{N-1} - \tfrac{N-3}{2},\ \lambda_N - \tfrac{N-1}{2} \big\}. \tag{7.1}
\]
The configuration $L$ lives on the lattice
\[
\mathfrak{X}^{(N)} = \mathbb{Z} + \tfrac{N+1}{2} = \begin{cases} \mathbb{Z}, & \text{if } N \text{ is odd;} \\ \mathbb{Z} + \tfrac12, & \text{if } N \text{ is even.} \end{cases}
\]
The next step is less obvious. Let us divide the lattice $\mathfrak{X}^{(N)}$ into two parts, which will be denoted by $\mathfrak{X}^{(N)}_{\mathrm{in}}$ and $\mathfrak{X}^{(N)}_{\mathrm{out}}$:
\[
\mathfrak{X}^{(N)}_{\mathrm{in}} = \big\{ -\tfrac{N-1}{2},\ -\tfrac{N-3}{2},\ \dots,\ \tfrac{N-3}{2},\ \tfrac{N-1}{2} \big\},
\]
\[
\mathfrak{X}^{(N)}_{\mathrm{out}} = \big\{ \dots,\ -\tfrac{N+3}{2},\ -\tfrac{N+1}{2} \big\} \cup \big\{ \tfrac{N+1}{2},\ \tfrac{N+3}{2},\ \dots \big\}.
\]
Here $\mathfrak{X}^{(N)}_{\mathrm{in}}$, the “inner” part, consists of the $N$ points of the lattice that lie in the interval $(-\tfrac N2, \tfrac N2)$, while $\mathfrak{X}^{(N)}_{\mathrm{out}}$, the “outer” part, is its complement in $\mathfrak{X}^{(N)}$, consisting of the points outside this interval.

Given an $N$-point configuration $L$ on $\mathfrak{X}^{(N)}$, which we interpret as a system of particles occupying $N$ positions on the lattice $\mathfrak{X}^{(N)}$, we assign to it another configuration, $X$, formed by the particles in $\mathfrak{X}^{(N)}_{\mathrm{out}}$ and the holes (i.e., the unoccupied positions) in $\mathfrak{X}^{(N)}_{\mathrm{in}}$. Note that $X$ is a finite configuration, too. Since the “inner” part consists of exactly $N$ points, we see that in $X$ there are equally many particles and holes. However, their number is no longer fixed; it varies between 0 and $2N$, depending on the mutual location of $L$ and $\mathfrak{X}^{(N)}_{\mathrm{in}}$. For instance, if these two sets coincide then $X$ is the empty configuration, and if they do not intersect then $|X| = 2N$.

We call the procedure of passage $L \to X$ the particles/holes involution. Under this procedure, our initial random $N$-particle system (coming from the Askey–Lesky ensemble) turns into a random system of particles and holes. Note that the map $L \to X$ is reversible, so that both random point processes are equivalent. Let us denote the second point process by $\mathcal{P}^{(N)}_{z,w}$.

The significance of the procedure described above becomes clear from the following combinatorial fact.

Lemma 7.1 ([BO6, §4]). Let $\lambda \in \mathrm{SGN}(N)$ be a signature, $L \subset \mathfrak{X}^{(N)}$ be the $N$-particle configuration defined by (7.1), and $X \subset \mathfrak{X}^{(N)}$ be the corresponding finite configuration of particles and holes as defined above. Let also $a_i^\pm$ and $b_i^\pm$ be the modified Frobenius coordinates of the Young diagrams $\lambda^\pm$, see Definitions 3.3 and 3.4. Then we have
\[
X \cap \mathfrak{X}^{(N)}_{\mathrm{out}} = \big\{ a_i^+ + \tfrac N2 \big\} \cup \big\{ -a_i^- - \tfrac N2 \big\}, \qquad
X \cap \mathfrak{X}^{(N)}_{\mathrm{in}} = \big\{ \tfrac N2 - b_i^+ \big\} \cup \big\{ -\tfrac N2 + b_i^- \big\}. \tag{7.2}
\]
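The particles/holes involution and the statement of Lemma 7.1 can be checked on small examples. The sketch below is an illustration with hypothetical function names. Since Definitions 3.3 and 3.4 lie outside this section, it assumes the standard conventions: $\lambda^+$ consists of the positive coordinates of $\lambda$, $\lambda^-$ of the absolute values of the negative coordinates in reverse order, and the modified Frobenius coordinates of a Young diagram $\mu$ are $a_i = \mu_i - i + \tfrac12$, $b_i = \mu'_i - i + \tfrac12$ for $i$ up to the length of the diagonal.

```python
from fractions import Fraction


def signature_to_L(lam):
    """Configuration L of (7.1): lambda_i + (N + 1 - 2i)/2, i = 1..N."""
    N = len(lam)
    return {Fraction(lam[i]) + Fraction(N - 1 - 2 * i, 2) for i in range(N)}


def inner_lattice(N):
    """The N lattice points -(N-1)/2, ..., (N-1)/2 inside (-N/2, N/2)."""
    return {Fraction(2 * k - (N - 1), 2) for k in range(N)}


def particles_holes(lam):
    """Particles/holes involution: (particles outside, holes inside)."""
    L = signature_to_L(lam)
    inner = inner_lattice(len(lam))
    return L - inner, inner - L


def modified_frobenius(mu):
    """Assumed standard definition: a_i = mu_i - i + 1/2, b_i = mu'_i - i + 1/2."""
    mu = [part for part in mu if part > 0]
    conj = [sum(1 for part in mu if part > j) for j in range(mu[0])] if mu else []
    d = sum(1 for i, part in enumerate(mu) if part > i)  # diagonal length
    a = [Fraction(2 * (mu[i] - i) - 1, 2) for i in range(d)]
    b = [Fraction(2 * (conj[i] - i) - 1, 2) for i in range(d)]
    return a, b


def lemma_7_1_sets(lam):
    """Right-hand sides of (7.2) computed from the Frobenius coordinates."""
    half_N = Fraction(len(lam), 2)
    lam_plus = [x for x in lam if x > 0]
    lam_minus = [-x for x in reversed(lam) if x < 0]
    a_p, b_p = modified_frobenius(lam_plus)
    a_m, b_m = modified_frobenius(lam_minus)
    outer = {a + half_N for a in a_p} | {-a - half_N for a in a_m}
    inner = {half_N - b for b in b_p} | {-half_N + b for b in b_m}
    return outer, inner
```

For the signature $\lambda = (2, 0, -1)$ with $N = 3$ both computations give the particles $\{3, -2\}$ and the holes $\{1, -1\}$.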
Comparing (7.2) with (6.1) suggests that if we shrink our phase space $\mathfrak{X}^{(N)}$ by the factor of $N$ (so that the points $\pm\tfrac N2$ turn into $\pm\tfrac12$) then our discrete point process $\mathcal{P}^{(N)}_{z,w}$ should have a well-defined scaling limit. We prove that such a limit does exist and that it coincides with the point process $\mathcal{P}_{z,w}$ on $\mathfrak{X} = \mathbb{R} \setminus \{\pm\tfrac12\}$ as defined in §6.

The discrete process $\mathcal{P}^{(N)}_{z,w}$ is determinantal, and its correlation kernel can be obtained by a transformation of the kernel $K_N^{\text{Askey-Lesky}}(x, y)$; let us denote this new kernel by $\widetilde{K}_N^{\text{Askey-Lesky}}(x, y)$. The correlation kernel $K^{\mathrm{hypergeom}}(x, y)$ of Theorem 6.1 is obtained as a scaling limit of the kernel $\widetilde{K}_N^{\text{Askey-Lesky}}(x, y)$.

We just gave a rough sketch of the proof of Theorem 6.1. The detailed proof (see Borodin–Olshanski [BO6]) is rather long and technical. The main technical difficulties arise when we want to get a convenient explicit expression for the kernel $\widetilde{K}_N^{\text{Askey-Lesky}}(x, y)$ in the case when at least one of the variables $x, y$ is in the “interior” part of the lattice.11 Here we apply a discrete version of the formalism of the Riemann–Hilbert problem, see Borodin [B2].

11 This part of the kernel describes the correlations of holes with particles and other holes. The correlations involving particles only are described by the kernel $K_N^{\text{Askey-Lesky}}(x, y)$ restricted to the “exterior” part of the lattice.

Remark 7.2 (On symmetry (6.2)). Now we are in a position to explain the indefinite-type symmetry (6.2): the same kind of symmetry occurs already in the kernel $\widetilde{K}_N^{\text{Askey-Lesky}}(x, y)$. It turns out that the particles/holes involution just converts the usual symmetry of the kernel $K_N^{\text{Askey-Lesky}}(x, y)$ into the indefinite-type symmetry of the kernel $\widetilde{K}_N^{\text{Askey-Lesky}}(x, y)$.

The point process $\mathcal{P}^{(N)}_{z,w}$ can be viewed as a discrete two-component log-gas system consisting of oppositely signed charges. Systems of such a type were earlier investigated in the mathematical physics literature (see, e.g., a number of references listed in section (f) of the introduction to Borodin–Olshanski [BO6]). However, the known concrete models are quite different from our system.

Remark 7.3 (Limit density). Given an $N$-point orthogonal polynomial ensemble, let us attach to a configuration $\{x_1, \dots, x_N\}$ a probability measure,
\[
\frac{1}{N} (\delta_{x_1} + \dots + \delta_{x_N}).
\]
Under an appropriate scaling limit as $N \to \infty$, this random measure can converge to a (nonrandom) probability measure describing the global limit density of particles. For instance, in the case of GUE, the limit density is given by the famous Wigner semi-circle law, see, e.g., Forrester [Fo, ch. 1].

When we apply this procedure to the Askey–Lesky ensemble (or rather to its shift by $\tfrac{N+1}{2}$) then it can be shown that, as $N$ gets large, almost all $N$ particles occupy positions inside $(-\tfrac N2, \tfrac N2)$. (Recall that there are exactly $N$ lattice points in this interval; hence, almost all of them are occupied by particles.) In other words, this means that the density of our discrete log-gas is asymptotically equal to the characteristic function of the $N$-point set of lattice points inside $(-\tfrac N2, \tfrac N2)$, so that in the scaling limit we get the characteristic function of $(-\tfrac12, \tfrac12)$.
It can also be shown that after the passage $L \to X$, all but finitely many particles/holes in $X$ concentrate, for large $N$, near the points $\pm\tfrac N2$. This explains why the random system of particles/holes $X$ converges to a limit point process (as opposed to the Askey–Lesky ensemble).

8. Connection with previous work

Let us briefly discuss two similar problems which also lead to spectral measures on infinite-dimensional spaces.

The first problem was initially formulated in Kerov–Olshanski–Vershik [KOV1]. It consists in decomposing certain natural (generalized regular) unitary representations $T_z$ of the group $S(\infty) \times S(\infty)$, depending on a complex parameter $z$. In [KOV1], [KOV2] the problem was solved in the case when the parameter $z$ takes integral values (then the spectral measure has a finite-dimensional support). The general case presents more difficulties and we studied it in a cycle of papers (see the surveys Borodin–Olshanski [BO2], Olshanski [Ol6] and references therein). Our main result is that the spectral measure governing the decomposition of $T_z$ can be described in terms of a determinantal point process on the real line with one punctured point. The correlation kernel was explicitly computed; it has integrable form (5.2), where $k = 2$ and the functions $F_1$, $F_2$, $G_1$, and $G_2$ are expressed through a confluent hypergeometric function (specifically, through the W-Whittaker function), see Borodin [B1], Borodin–Olshanski [BO3].

The second problem deals with the decomposition of a family of unitarily invariant probability measures on the space of all infinite Hermitian matrices into ergodic components. The measures depend on one complex parameter; within a transformation of the underlying space, they coincide with the measures $\mu^{(s)}$ mentioned in the beginning of §4. The problem of decomposition into ergodic components can also be viewed as a problem of harmonic analysis on an infinite-dimensional Cartan motion group. The main result states that the spectral measures in this case can be interpreted as determinantal point processes on the real line with an integrable correlation kernel of type (5.1), where the functions $P$ and $Q$ are expressed through another confluent hypergeometric function, the M-Whittaker function, see Borodin–Olshanski [BO5].

These two problems and the problem that we deal with in this paper have many similarities, but the latter problem is, in a certain sense, more general than both problems described above. The Askey–Lesky kernel of §7 can be viewed as the top of a hierarchy of (discrete and continuous) integrable kernels: this looks very much like the hierarchy of the classical special functions. A description of the “$S(\infty)$-part” of the hierarchy can be found in Borodin–Olshanski [BO4].
References [As] [B1] [B2] [BD]
[BO1]
[BO2] [BO3]
[BO4]
[BO5]
[BO6]
[BO7] [Boy] [DVJ] [De]
[Ed] [Fo]
[He]
R. Askey, An integral of Ramanujan and orthogonal polynomials, J. Indian Math. Soc. 51 (1987), 27–36. A. Borodin, Harmonic analysis on the infinite symmetric group and the Whittaker kernel, St. Petersburg Math. J. 12 (2001), no. 5, 733–759. A. Borodin, Riemann–Hilbert problem and the discrete Bessel kernel, Intern. Math. Research Notices (2000), no. 9, 467–494; arXiv: math.CO/9912093. A. Borodin and P. Deift, Fredholm determinants, Jimbo–Miwa–Ueno taufunctions, and representation theory, Commun. Pure Appl. Math. 55 (2002), no. 9, 1160–1230; arXiv: math-ph/0111007. A. Borodin and G. Olshanski, Point processes and the infinite symmetric group. Part III: Fermion point processes, Preprint, 1998, arXiv: math.RT/9804088. A. Borodin and G. Olshanski, Point processes and the infinite symmetric group, Math. Research Lett. 5 (1998), 799–816; arXiv: math.RT/9810015. A. Borodin and G. Olshanski, Distributions on partitions, point processes and the hypergeometric kernel, Comm. Math. Phys. 211 (2000), no. 2, 335– 358; arXiv: math.RT/9904010. A. Borodin and G. Olshanski, Z-Measures on partitions, Robinson–Schensted–Knuth correspondence, and β = 2 random matrix ensembles, In: Random matrix models and their applications (P. Bleher and A. Its, eds.). Cambridge University Press. Mathematical Sciences Research Institute Publications 40, 2001, 71–94; arXiv: math.CO/9905189. A. Borodin and G. Olshanski, Infinite random matrices and ergodic measures, Comm. Math. Phys 223 (2001), no. 1, 87–123; arXiv: math-ph/0010015. A. Borodin and G. Olshanski, Harmonic analysis on the infinite-dimensional unitary group and determinantal point processes, Ann. Math. 161 (2005), no. 3, arXiv: math.RT/0109194. A. Borodin and G. Olshanski, Random partitions and the Gamma kernel, Adv. Math. 194 (2005), 141–202; arXiv: math-ph/0305043. R.P. Boyer, Infinite traces of AF-algebras and characters of U (∞), J. Operator Theory 9 (1983), 205–236. D.J. Daley and D. 
Vere–Jones An introduction to the theory of point processes, Springer series in statistics, Springer, 1988. P. Deift, Integrable operators, In: Differential operators and spectral theory: M.Sh. Birman’s 70th anniversary collection (V. Buslaev, M. Solomyak, D. Yafaev, eds.), American Mathematical Society Translations, ser. 2, v. 189, Providence, R.I.: AMS (1999), 69–84. A. Edrei, On the generating function of a doubly-infinite, totally positive sequence, Trans. Amer. Math. Soc. 74 (1953), no. 3, 367–383. P.J. Forrester Log-gases and random matrices, Book in preparation, see Forrester’s home page at http://www.ms.unimelb.edu.au/∼matpjf/matpjf.html. S. Helgason, Groups and geometric analysis. Integral geometry, invariant differential operators, and spherical functions, Mathematical Surveys and Monographs 83, American Mathematical Society, Providence, R.I., 2000.
[IIKS] A.R. Its, A.G. Izergin, V.E. Korepin, N.A. Slavnov, Differential equations for quantum correlation functions, Intern. J. Mod. Phys. B4 (1990), 1003–1037.
[JMMS] M. Jimbo, T. Miwa, Y. Môri, and M. Sato, Density matrix of an impenetrable Bose gas and the fifth Painlevé transcendent, Physica D 1 (1980), 80–158.
[Jo1] K. Johansson, Shape fluctuations and random matrices, Commun. Math. Phys. 209 (2000), no. 2, 437–476; arXiv: math.CO/9903134.
[Jo2] K. Johansson, Discrete orthogonal polynomial ensembles and the Plancherel measure, Ann. of Math. (2) 153 (2001), no. 1, 259–296; arXiv: math.CO/9906120.
[Jo3] K. Johansson, Non-intersecting paths, random tilings and random matrices, Probab. Theory Related Fields 123 (2002), no. 2, 225–280; arXiv: math.PR/0011250.
[KOV1] S. Kerov, G. Olshanski, and A. Vershik, Harmonic analysis on the infinite symmetric group. A deformation of the regular representation, Comptes Rend. Acad. Sci. Paris, Sér. I 316 (1993), 773–778.
[KOV2] S. Kerov, G. Olshanski, and A. Vershik, Harmonic analysis on the infinite symmetric group, Invent. Math. 158 (2005), 551–642; arXiv: math.RT/0312270.
[Len] A. Lenard, Correlation functions and the uniqueness of the state in classical statistical mechanics, Comm. Math. Phys. 30 (1973), 35–44.
[Les1] P.A. Lesky, Unendliche und endliche Orthogonalsysteme von Continuous Hahnpolynomen, Results in Math. 31 (1997), 127–135.
[Les2] P.A. Lesky, Eine Charakterisierung der kontinuierlichen und diskreten klassischen Orthogonalpolynome, Preprint 98–12, Mathematisches Institut A, Universität Stuttgart (1998).
[Na] M.A. Naimark, Normed algebras, Translated from the second Russian edition (Moscow, Nauka, 1968) by Leo F. Boron. Third edition. Wolters–Noordhoff Series of Monographs and Textbooks on Pure and Applied Mathematics. Wolters–Noordhoff Publishing, Groningen, 1972.
[Ner] Yu.A. Neretin, Hua type integrals over unitary groups and over projective limits of unitary groups, Duke Math. J. 114, no. 2, 239–266; arXiv: math-ph/0010014.
[Nes] N.I. Nessonov, A complete classification of the representations of GL(∞) containing the identity representation of the unitary subgroup, Mathematics USSR – Sbornik 58 (1987), 127–147 (translation from Mat. Sb. 130 (1986), No. 2, 131–150).
[OkOl] A. Okounkov and G. Olshanski, Asymptotics of Jack polynomials as the number of variables goes to infinity, Intern. Math. Research Notices (1998), no. 13, 641–682.
[Ol1] G.I. Ol'shanskii, Unitary representations of infinite-dimensional pairs (G, K) and the formalism of R. Howe, Soviet Math. Doklady 27 (1983), no. 2, 290–294 (translation from Doklady AN SSSR 269 (1983), 33–36).
[Ol2] G.I. Ol'shanskii, Unitary representations of the group SO_0(∞, ∞) as limits of unitary representations of the groups SO_0(n, ∞) as n → ∞, Funct. Anal. Appl. 20 (1986), 292–301.
[Ol3] G.I. Ol'shanskii, Method of holomorphic extensions in the theory of unitary representations of infinite-dimensional classical groups, Funct. Anal. Appl. 22 (1988), no. 4, 273–285.
[Ol4] G.I. Ol'shanskii, Unitary representations of (G, K)-pairs connected with the infinite symmetric group S(∞), Leningrad Math. J. 1 (1990), no. 4, 983–1014 (translation from Algebra i Analiz 1 (1989), No. 4, 178–209).
[Ol5] G.I. Ol'shanskii, Unitary representations of infinite-dimensional pairs (G, K) and the formalism of R. Howe, In: Representation of Lie Groups and Related Topics (A. Vershik and D. Zhelobenko, eds.), Advanced Studies in Contemporary Math. 7, Gordon and Breach Science Publishers, New York etc., 1990, pp. 269–463.
[Ol6] G. Olshanski, An introduction to harmonic analysis on the infinite symmetric group, In: Asymptotic combinatorics with applications to mathematical physics (A.M. Vershik, ed.), A European mathematical summer school held at the Euler Institute, St. Petersburg, Russia, July 9–20, 2001, Springer Lect. Notes Math. 1815, 2003, 127–160; arXiv: math.RT/0311369.
[Ol7] G. Olshanski, The problem of harmonic analysis on the infinite-dimensional unitary group, J. Funct. Anal. 205 (2003), no. 2, 464–524; arXiv: math.RT/0109193.
[Pi1] D. Pickrell, Measures on infinite dimensional Grassmann manifold, J. Func. Anal. 70 (1987), 323–356.
[Pi2] D. Pickrell, Separable representations for automorphism group of infinite symmetric spaces, J. Func. Anal. 90 (1990), 1–26.
[So] A. Soshnikov, Determinantal random point fields, Russian Math. Surveys 55 (2000), no. 5, 923–975; arXiv: math.PR/0002099.
[VK1] A.M. Vershik and S.V. Kerov, Asymptotic theory of characters of the symmetric group, Funct. Anal. Appl. 15 (1981), 246–255.
[VK2] A.M. Vershik and S.V. Kerov, Characters and factor representations of the infinite unitary group, Soviet Math. Doklady 26 (1982), 570–574.
[Vo] D. Voiculescu, Représentations factorielles de type II_1 de U(∞), J. Math. Pures et Appl. 55 (1976), 1–20.
[We] H. Weyl, The classical groups, their invariants and representations, Princeton University Press, 1946.
[Zhe] D.P. Zhelobenko, Compact Lie groups and their representations, Nauka, Moscow, 1970 (Russian); English translation: Transl. Math. Monographs 40, Amer. Math. Soc., Providence, R.I., 1973.
Alexei Borodin
Mathematics 253-37, Caltech, Pasadena, CA 91125, USA

Grigori Olshanski
Dobrushin Mathematics Laboratory, Institute for Information Transmission Problems, Bolshoy Karetny 19, 127994 Moscow GSP-4, Russia
Stability of Relaxation Models for Conservation Laws
François Bouchut
Abstract. These notes give an introduction to the recent development of relaxation models, the associated stability conditions, and discrete approximations.
1. Relaxation models
A system of conservation laws is a system of partial differential equations of the form
\[ \partial_t u + \partial_x F(u) = 0, \qquad t \in \mathbb{R},\ x \in \mathbb{R}, \tag{1.1} \]
where $u(t,x) \in \mathbb{R}^p$ and $F(u) \in \mathbb{R}^p$. The classical features of such systems are that
• The Cauchy problem has bounded, but discontinuous, solutions $u$
• The nonlinearity $F$ induces nonuniqueness
The idea of approximation by relaxation to the system (1.1) is as follows:
• Build solutions of (1.1) as limits $u = \lim_{\varepsilon \to 0} u^\varepsilon$, $u^\varepsilon(t,x) = L f^\varepsilon(t,x)$, obtained from solutions $f^\varepsilon$ to another (simpler) system of conservation laws
• This solution $f^\varepsilon$ is forced in the limit, by a relaxation process, to lie in a manifold of equilibrium, $f(t,x) \in \mathcal{M}$
• This manifold $\mathcal{M}$ can be parametrized by $u \equiv Lf$, i.e., we have
\[ f \in \mathcal{M} \iff f = M(u) \quad \text{and} \quad LM(u) = u. \]
Example 1.1 (The Jin–Xin model). The simplest example is given by [13] (the superscript $\varepsilon$ is dropped):
\[ \partial_t f_1 - c\,\partial_x f_1 = \frac{M_1(u) - f_1}{\varepsilon}, \qquad \partial_t f_2 + c\,\partial_x f_2 = \frac{M_2(u) - f_2}{\varepsilon}, \tag{1.2} \]
where $f(t,x) = (f_1(t,x), f_2(t,x)) \in \mathbb{R}^p \times \mathbb{R}^p$, $u(t,x) = Lf(t,x)$, $c > 0$,
\[ Lf = f_1 + f_2, \qquad M(u) = \left( \frac{u - F(u)/c}{2},\ \frac{u + F(u)/c}{2} \right). \]
• One has $\partial_t u + c\,\partial_x (f_2 - f_1) = 0$
• The right-hand side forces $f - M(u) \to 0$, i.e., $f \to \mathcal{M}$, thus $c(f_2 - f_1) \simeq c(M_2(u) - M_1(u)) = F(u)$
2. Hyperbolic relaxation: general framework
A general framework is given by [9]:
\[ \partial_t f + \partial_x A(f) = \frac{Q(f)}{\varepsilon}, \tag{2.1} \]
where $f(t,x) \in \mathbb{R}^q$, $q > p$, $L : \mathbb{R}^q \to \mathbb{R}^p$ is linear, and the Maxwellian equilibrium $M(u)$ satisfies the consistency relations
\[ LM(u) = u, \qquad LA(M(u)) = F(u). \]
The relaxation term must satisfy
\[ LQ(f) = 0, \qquad Q(f) = 0 \iff f = M(u) \ \text{for some } u. \]
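As a sanity check, the consistency relations can be verified numerically for the Jin–Xin model of Example 1.1. The following is a minimal sketch in the scalar case $p = 1$; the function names and the Burgers flux $F(u) = u^2/2$ are illustrative assumptions, not taken from the text.

```python
from typing import Callable, Tuple

State = Tuple[float, float]  # f = (f1, f2), scalar case p = 1 (assumption)


def maxwellian(u: float, flux: Callable[[float], float], c: float) -> State:
    """Jin-Xin Maxwellian M(u) = ((u - F(u)/c)/2, (u + F(u)/c)/2)."""
    return (0.5 * (u - flux(u) / c), 0.5 * (u + flux(u) / c))


def moment(f: State) -> float:
    """Linear operator L f = f1 + f2."""
    return f[0] + f[1]


def transport_flux(f: State, c: float) -> State:
    """Flux A(f) = (-c f1, c f2) read off from the left-hand side of (1.2)."""
    return (-c * f[0], c * f[1])


def relaxation_term(f: State, flux: Callable[[float], float], c: float) -> State:
    """Right-hand side of (1.2) without the 1/eps factor: Q(f) = M(Lf) - f."""
    m = maxwellian(moment(f), flux, c)
    return (m[0] - f[0], m[1] - f[1])


def burgers(u: float) -> float:
    """Illustrative flux F(u) = u^2/2 (hypothetical choice for this sketch)."""
    return 0.5 * u * u
```

With these definitions, `moment(maxwellian(u, burgers, c))` returns `u` and `moment(transport_flux(maxwellian(u, burgers, c), c))` returns `burgers(u)` up to rounding, while `moment(relaxation_term(f, burgers, c))` vanishes for any state `f`.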
Example 2.1. BGK relaxation term $Q(f) = M(Lf) - f$.
3. Kinetic relaxation models
The kinetic relaxation models occur as relaxation models when
• The space $\mathbb{R}^q = (\mathbb{R}^p)^\Xi$ is a space of functions, $f = f(\xi)$, $\xi \in \Xi$, with $\Xi$ a measure space with positive measure $d\xi$,
• The nonlinearity is $A(f)(\xi) = a(\xi) f(\xi)$ for some function $a(\xi) \in \mathbb{R}$,
• The linear operator is $Lf = \int_\Xi f(\xi)\,d\xi$,
• The Maxwellian becomes $M(u) = M(u,\xi)$, and the consistency relations become moment relations
\[ \int_\Xi M(u,\xi)\,d\xi = u, \qquad \int_\Xi a(\xi)\,M(u,\xi)\,d\xi = F(u). \]
Thus kinetic relaxation models identify with semilinear diagonal relaxation models, possibly in infinite dimension. Such models arise naturally in the kinetic theory of gases, like the Boltzmann equation. They are described in [17].
4. Parabolic relaxation
Parabolic relaxation comes from a different scaling in (2.1):
\[ \partial_t f + \frac{1}{\varepsilon}\,\partial_x A(f) = \frac{Q(f)}{\varepsilon^2}, \]
leading to different features than above:
• The limit $\varepsilon \to 0$ is a parabolic equation, as in [11], [14], [12], [15], [7]
• Incompressible models can be obtained in the limit, see [1], [10], and the talk by F. Golse in this congress
5. Relaxation limit
Problem 5.1. How to justify the relaxation limit $\varepsilon \to 0$?
Several methods are used in different situations:
(1) Whenever the limit $u$ is smooth (in a Sobolev space) we use the relative entropy method to estimate the distance to the limit solution [16].
(2) When the limit equation is mildly nonlinear (like incompressible Navier–Stokes, with viscosity) our method is to control the compactness and the size of the solution [10].
(3) When the limit $u$ is a discontinuous weak solution our method is to obtain $L^\infty$ bounds on the solution and get compactness (BV estimates, or compensated compactness).
In these notes, I am more interested in the last situation.
6. Stability of the relaxation limit
Relaxation models were first discussed in [19]. In particular, a main idea there was to write structural necessary/sufficient conditions for the relaxation limit to hold.
Example 6.1. For the Jin–Xin model, a stability condition is the subcharacteristic condition $\sigma(F'(u)) \subset [-c, c]$, where $\sigma$ denotes the spectrum, and $F'(u)$ is the derivative of $F(u)$ with respect to $u$.
Several stability conditions exist. We shall discuss here the following ones:
• The entropy extension condition (EEC)
• The reduced stability condition (RSC)
• The interlacing subcharacteristic condition (ISC)
• The Chapman–Enskog dissipativity condition (CED)
Except (ISC), they involve entropy inequalities.
6.1. Entropy. The notion of entropy is used in hyperbolic conservation laws for:
• selecting admissible solutions (Lax)
• getting a priori bounds
• proving compactness (DiPerna)
For the conservation law
\[ \partial_t u + \partial_x F(u) = 0, \tag{6.1} \]
an entropy is a scalar function $\eta(u)$ such that there exists another scalar function $G(u)$, called the entropy flux, satisfying $G'(u) = \eta'(u) F'(u)$. Interest: smooth solutions to (6.1) satisfy $\partial_t \eta(u) + \partial_x G(u) = 0$. If $\eta$ is a convex entropy, a weak solution $u$ to (6.1) is said to be $\eta$-entropy satisfying if $\partial_t \eta(u) + \partial_x G(u) \le 0$.
6.2. Relaxation system: Entropy Extension Condition (EEC). This condition is due to [9]. Consider a conservation law $\partial_t u + \partial_x F(u) = 0$, and an associated relaxation system $\partial_t f + \partial_x A(f) = Q(f)/\varepsilon$, satisfying the previously stated conditions.
Definition 6.2. Given a convex entropy $\eta$, we say that (EEC) holds if there exists a convex entropy $\mathcal{H}(f)$ with entropy flux $\mathcal{G}(f)$, such that
\[ \mathcal{H}(M(u)) = \eta(u) + \mathrm{cst}, \qquad \mathcal{G}(M(u)) = G(u) + \mathrm{cst}, \]
and that the minimization principle holds,
\[ \mathcal{H}(M(u)) \le \mathcal{H}(f) \quad \text{whenever } u = Lf. \]
The relaxation term must also satisfy $\mathcal{H}'(f)\,Q(f) \le 0$.
6.3. Interest of the Entropy Extension Condition (EEC). Starting from an entropy solution $f^\varepsilon$ of the relaxation system, one automatically gets an entropy solution $u$ (if $f^\varepsilon$ converges weakly, and $u^\varepsilon$ converges strongly). Thus, in the good cases, it makes it possible to
• select admissible solutions (Lax)
• get a priori bounds
• prove compactness (DiPerna)
For the last point, compensated compactness works when there is a whole family of entropies with entropy extensions, i.e., in the scalar case $p = 1$, and for some good models [2]. In the case of only a single entropy extension, it can also work for special structures [18], and in the kinetic case with continuous variable $\xi$ [3].
6.4. Relaxation system: Reduced Stability Condition (RSC). This condition is introduced in [6]. Consider a conservation law $\partial_t u + \partial_x F(u) = 0$, and an associated relaxation system $\partial_t f + \partial_x A(f) = Q(f)/\varepsilon$, satisfying again the minimal consistency conditions. We assume hyperbolicity of both systems, i.e., that $F'(u)$ and $A'(f)$ are diagonalizable, and we denote by $P_\lambda[\,\cdot\,]$ the projector onto the eigenspace, for any eigenvalue $\lambda$.
Definition 6.3. Given a convex entropy $\eta$, we say that (RSC) holds if for any $u$ and any $\lambda$
\[ L\,P_\lambda[A'(M(u))]\,M'(u) \ \text{is symmetric nonnegative for } \eta''(u). \]
• It implies that this operator must be diagonalizable with nonnegative eigenvalues
• It involves only Maxwellian states
6.5. Relaxation system: Interlacing Subcharacteristic Condition (ISC). Consider a conservation law $\partial_t u + \partial_x F(u) = 0$, and an associated relaxation system $\partial_t f + \partial_x A(f) = Q(f)/\varepsilon$, satisfying again the minimal consistency conditions. We assume hyperbolicity of both systems, i.e., that $F'(u)$ and $A'(f)$ are diagonalizable. We denote by
\[ \lambda_1[F'(u)] \le \dots \le \lambda_p[F'(u)], \qquad \lambda_1[A'(M(u))] \le \dots \le \lambda_q[A'(M(u))], \]
the eigenvalues repeated with multiplicities.
Definition 6.4. We say that (ISC) holds if for any $u$
\[ \lambda_k[A'(M(u))] \le \lambda_k[F'(u)] \le \lambda_{q-p+k}[A'(M(u))], \]
for any $1 \le k \le p$.
6.6. Relaxation system: Chapman–Enskog Dissipativity (CED). Consider a conservation law $\partial_t u + \partial_x F(u) = 0$, and an associated relaxation system $\partial_t f + \partial_x A(f) = Q(f)/\varepsilon$ with BGK relaxation $Q(f) = M(Lf) - f$, satisfying again the minimal consistency conditions. Then formally, with $u^\varepsilon = Lf^\varepsilon$,
\[ \partial_t u^\varepsilon + \partial_x F(u^\varepsilon) = \varepsilon\,\partial_x \bigl( D(u^\varepsilon)\,\partial_x u^\varepsilon \bigr), \tag{6.2} \]
up to terms in $\varepsilon^2$, with
\[ D(u) = L\,A'(M(u))^2\,M'(u) - F'(u)^2. \tag{6.3} \]
Definition 6.5. Let $\eta$ be a convex entropy. We say that (CED) holds if (6.2) is $\eta$-symmetrically entropy dissipative, i.e., if $D(u)$ is symmetric nonnegative for $\eta''(u)$. It implies that
\[ \partial_t \eta(u) + \partial_x G(u) - \varepsilon\,\partial_x \bigl( \eta'(u) D(u)\,\partial_x u \bigr) = -\varepsilon\,D(u)^t \eta''(u)\,\partial_x u \cdot \partial_x u \le 0. \]
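For the Jin–Xin model of Example 1.1 one has $A'(f) = \mathrm{diag}(-c, c)$, so $L\,A'(M(u))^2 M'(u) = c^2$ and (6.3) gives $D(u) = c^2 - F'(u)^2$. In the scalar case $p = 1$, $q = 2$, both (ISC) and (CED) then reduce to the subcharacteristic condition $|F'(u)| \le c$. The sketch below illustrates this; the function names are hypothetical, and the eigenvalues are assumed to be supplied by the caller.

```python
from typing import Sequence


def jin_xin_diffusion(dF: float, c: float) -> float:
    """Chapman-Enskog coefficient D(u) = c^2 - F'(u)^2 for scalar Jin-Xin."""
    return c * c - dF * dF


def isc_holds(eig_F: Sequence[float], eig_A: Sequence[float]) -> bool:
    """Check lambda_k[A'] <= lambda_k[F'] <= lambda_{q-p+k}[A'] for 1 <= k <= p.

    eig_F: the p eigenvalues of F'(u); eig_A: the q eigenvalues of A'(M(u)),
    both repeated with multiplicities (any order; they are sorted here).
    """
    lam_F = sorted(eig_F)
    lam_A = sorted(eig_A)
    p, q = len(lam_F), len(lam_A)
    if q <= p:
        raise ValueError("the relaxation system needs q > p")
    # Python lists are 0-indexed, so k below runs over 0, ..., p-1.
    return all(lam_A[k] <= lam_F[k] <= lam_A[q - p + k] for k in range(p))
```

For instance, with $c = 1$ and $F'(u) = 0.5$ the call `isc_holds([0.5], [-1.0, 1.0])` succeeds and `jin_xin_diffusion(0.5, 1.0)` is positive, whereas $F'(u) = 1.5$ violates both conditions.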
6.7. Comparison of the stability conditions.
Theorem 6.6 ([9]). (EEC) $\Longrightarrow$ (ISC), (EEC) $\Longrightarrow$ (CED). Neither of (ISC) and (CED) implies the other.
Theorem 6.7 ([6]). (EEC) $\Longrightarrow$ (RSC) $\Longrightarrow$ (ISC), (EEC) $\Longrightarrow$ (RSC) $\Longrightarrow$ (CED).
Theorem 6.8 ([4]). In the kinetic case, (EEC) $\Longleftrightarrow$ (RSC).
7. Discrete approximations
Relaxation approximations make it possible to build numerical schemes for conservation laws $\partial_t u + \partial_x F(u) = 0$ by the transport-projection approach [8]. It can be summarized as follows.
(1) Start with $u^n(x)$ piecewise constant
(2) Define $f^n(x) = M(u^n(x))$, which is piecewise constant
(3) Solve the relaxation problem $\partial_t f + \partial_x A(f) = 0$ for $t_n < t < t_{n+1}$
(4) Define $u^{n+1}(x)$ by piecewise constant projection of $L f(t_{n+1}, x)$
• Build an Approximate Riemann Solver that generates a conservative finite volume scheme
• In particular, kinetic relaxation models lead to Kinetic schemes
• Condition (EEC) automatically gives entropy consistency
• Condition (RSC) gives entropy consistency for data of small variation
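Steps (1) to (4) can be sketched for the Jin–Xin model of Example 1.1 in the scalar case. The code below is an illustration under stated assumptions, not the scheme of [5] or [8]: it assumes a uniform periodic grid, the CFL restriction $c\,\Delta t \le \Delta x$, and exact transport of the two components at speeds $-c$ and $+c$ followed by cell averaging. The names `transport_projection_step` and `burgers` are hypothetical.

```python
from typing import Callable, List


def transport_projection_step(
    u: List[float],
    flux: Callable[[float], float],
    c: float,
    dt: float,
    dx: float,
) -> List[float]:
    """One transport-projection step for scalar Jin-Xin on a periodic grid."""
    nu = c * dt / dx
    if not 0.0 < nu <= 1.0:
        raise ValueError("CFL condition 0 < c*dt/dx <= 1 is required")
    n = len(u)
    # Step (2): project onto the Maxwellian, f^n = M(u^n).
    f1 = [0.5 * (ui - flux(ui) / c) for ui in u]
    f2 = [0.5 * (ui + flux(ui) / c) for ui in u]
    # Step (3) then cell averaging: f1 moves left at speed c, so cell i
    # receives a fraction nu from cell i+1; f2 moves right, from cell i-1.
    f1_new = [(1.0 - nu) * f1[i] + nu * f1[(i + 1) % n] for i in range(n)]
    f2_new = [(1.0 - nu) * f2[i] + nu * f2[(i - 1) % n] for i in range(n)]
    # Step (4): u^{n+1} = L f = f1 + f2 on each cell.
    return [a + b for a, b in zip(f1_new, f2_new)]


def burgers(u: float) -> float:
    """Illustrative flux F(u) = u^2/2 (hypothetical choice for this sketch)."""
    return 0.5 * u * u
```

When the subcharacteristic condition $|F'(u)| \le c$ holds on the range of the data, each new cell value is a nondecreasing function of its three neighbours, so the sketch conserves the total mass and keeps the solution between the minimum and maximum of the initial data.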
For these numerical features, consult [5].
8. Concluding comments on relaxation models
• Relaxation approximations yield structurally well-behaved approximations:
– Entropy conditions can be analyzed
– Stability can be analyzed
– They have the same hyperbolic structure as the limit (finite speed of propagation, ...), which is better than viscosity approximation
• Relaxation approximations can be used:
– To prove existence for the Cauchy problem of the limit, even if until now they have not led to really new results
– To build stable numerical methods
References
[1] C. Bardos, F. Golse, C.D. Levermore, The acoustic limit for the Boltzmann equation, Arch. Ration. Mech. Anal. 153 (2000), 177–204.
[2] F. Berthelin, F. Bouchut, Kinetic invariant domains and relaxation limit from a BGK model to isentropic gas dynamics, Asymptot. Anal. 31 (2002), 153–176.
[3] F. Berthelin, F. Bouchut, Relaxation to isentropic gas dynamics for a BGK system with single kinetic entropy, Methods Appl. Anal. 9 (2002), 313–327.
[4] F. Bouchut, Construction of BGK models with a family of kinetic entropies for a given system of conservation laws, J. Stat. Phys. 95 (1999), 113–170.
[5] F. Bouchut, Nonlinear stability of finite volume methods for hyperbolic conservation laws, and well-balanced schemes for sources, Frontiers in Mathematics series, Birkhäuser, 2004.
[6] F. Bouchut, A reduced stability condition for nonlinear relaxation to conservation laws, J. Hyp. Diff. Eq. 1 (2004), 149–170.
[7] F. Bouchut, F. Guarguaglini, R. Natalini, Diffusive BGK approximations for nonlinear multidimensional parabolic equations, Indiana Univ. Math. J. 49 (2000), 723–749.
[8] Y. Brenier, Averaged multivalued solutions for scalar conservation laws, SIAM J. Numer. Anal. 21 (1984), 1013–1037.
[9] G.Q. Chen, C.D. Levermore, T.-P. Liu, Hyperbolic conservation laws with stiff relaxation terms and entropy, Comm. Pure Appl. Math. 47 (1994), 787–830.
[10] F. Golse, L. Saint-Raymond, The Navier-Stokes limit of the Boltzmann equation for bounded collision kernels, Invent. Math. 155 (2004), 81–161.
[11] L. Hsiao, T.P. Liu, Convergence to nonlinear diffusion waves for solutions of a system of hyperbolic conservation laws with damping, Comm. Math. Phys. 143 (1992), 599–605.
[12] S. Jin, L. Pareschi, G. Toscani, Diffusive relaxation schemes for multiscale discrete velocity kinetic equations, SIAM J. Numer. Anal. 35 (1998), 2405–2439.
[13] S. Jin, Z.-P. Xin, The relaxation schemes for systems of conservation laws in arbitrary space dimensions, Comm. Pure Appl. Math. 48 (1995), 235–276.
[14] P.-L. Lions, G. Toscani, Diffusive limit for finite velocity Boltzmann kinetic models, Rev. Mat. Iberoamericana 13 (1997), 473–513.
[15] P. Marcati, B. Rubino, Hyperbolic to parabolic relaxation theory for quasilinear first order systems, J. Differential Equations 162 (2000), 359–399.
[16] N. Masmoudi, Some recent developments on the hydrodynamic limit of the Boltzmann equation, Mathematics & mathematics education (Bethlehem, 2000), 167–185, World Sci. Publishing, River Edge, NJ, 2002.
[17] B. Perthame, Kinetic formulations of conservation laws, Oxford University Press, 2002.
[18] A. Tzavaras, Materials with internal variables and relaxation to conservation laws, Arch. Ration. Mech. Anal. 146 (1999), 129–155.
[19] G.B. Whitham, Linear and nonlinear waves, reprint of the 1974 original, Pure and Applied Mathematics (New York), Wiley-Interscience publication, John Wiley & Sons, New York, 1999.

François Bouchut
CNRS & DMA, Ecole Normale Supérieure, 45 rue d'Ulm, F-75230 Paris cedex 05, France
Hyperbolic 3-manifolds and the Geometry of the Curve Complex
Brian H. Bowditch
Abstract. We give a brief survey of some recent work on 3-manifolds, notably towards proving Thurston's ending lamination conjecture. We describe some applications to the theory of surfaces and mapping class groups.
1. Introduction
There has recently been a great deal of activity in 3-manifold theory, with announcements of proofs of three major conjectures. In this paper, we will focus on some of the work surrounding one of these, namely the ending lamination conjecture, a proof of which was announced by Minsky, Brock and Canary in 2002. This, and related work, has unearthed an array of fascinating interconnections between the mapping class groups, Teichmüller theory and the geometry of 3-manifolds.
Much of this can be viewed in the context of geometric group theory. This subject has seen very rapid growth over the last twenty years or so, though of course, its antecedents can be traced back much earlier. Two major sources of inspiration have been 3-manifold theory and hyperbolic geometry. The work of Thurston in the late 1970s [Th1, Th2] brought these subjects much closer together, and the resulting activity was one of the factors in launching geometric group theory as a subject in its own right. The work of Gromov has been a major driving force in this. Particularly relevant here is his seminal paper on hyperbolic groups [Gr].
In this paper, we give a brief overview of some of this recent work. As an illustration, we shall offer an example of how hyperbolic 3-manifolds can be used to study an essentially combinatorial problem concerning the curve complex associated to a compact surface. This complex, introduced by Harvey around 1980, has many nice topological and geometric properties.
I am grateful to the ECM organisers for offering me the opportunity to present this work. I also thank the Max-Planck-Institut für Mathematik in Bonn, where much of this paper was written, for its support and hospitality.
Received by the editors September 2004.
2. Coarse geometry
In this section, we briefly recall some of the fundamental notions of geometric group theory. The general idea is to understand the “large scale” geometry of a metric space. This is sometimes termed “coarse” geometry since the invariants will not in general respect small scale geometry or topology. A fairly general reference is [BriH]. (We remark that a related but somewhat different viewpoint on coarse geometry is bound up with the Novikov and Baum-Connes conjectures, see for example [Ro], though we shall not discuss these matters here.)
Let $(X, d)$ be a metric space. A (global) geodesic in $X$ is a path $\pi : I \to X$ such that $d(\pi(t), \pi(u)) = |t - u|$ for all $t, u \in I$, where $I$ is a real interval. Usually we will not worry about parametrisations and identify $\pi$ with its image in $X$. We say $X$ is a geodesic space if every pair of points is connected by a geodesic. Examples are complete Riemannian manifolds, or graphs where each edge is deemed to have unit length. The following is a fundamental notion:
Definition 2.1. A function $f : (X, d) \to (X', d')$ (not necessarily continuous) between geodesic spaces is a quasi-isometry if there are constants $c_1 > 0$, $c_2, c_3, c_4, c_5 \ge 0$, such that for all $x, y \in X$,
\[ c_1 d(x, y) - c_2 \le d'(f(x), f(y)) \le c_3 d(x, y) + c_4, \]
and for all $y \in X'$, there exists $x \in X$ such that $d'(y, f(x)) \le c_5$.
We say that $X$, $X'$ are quasi-isometric and write $X \sim X'$ if there is some quasi-isometry between them. One verifies that this defines an equivalence relation on geodesic spaces. If a group, $\Gamma$, acts properly discontinuously and cocompactly on a proper (i.e. complete and locally compact) geodesic space $X$, then $\Gamma$ is finitely generated. A key observation is that if the same group also acts properly discontinuously cocompactly on another such $X'$, then $X$ and $X'$ are (equivariantly) quasi-isometric.
If $\Gamma$ is any finitely generated group, then any Cayley graph of $\Gamma$ with respect to a finite generating set is an example of such a space, and is therefore well defined up to quasi-isometry. As examples, we see that (the Cayley graph of) the group of integers $\mathbb{Z}$ is quasi-isometric to the real line; $\mathbb{Z} \oplus \mathbb{Z}$ to the Euclidean plane; and any free group to a tree. The fundamental group, $\pi_1(\Sigma_g)$, of the closed orientable surface, $\Sigma_g$, of genus $g \ge 2$ is quasi-isometric to the hyperbolic plane. The last example follows from the fact that $\Sigma_g$ admits a hyperbolic structure, and so $\pi_1(\Sigma_g)$ acts properly discontinuously cocompactly on its universal cover, the hyperbolic plane, $\mathbb{H}^2$. The following notion was introduced by Gromov [Gr]:
Definition 2.2. A geodesic space, $X$, is $k$-hyperbolic if for any triangle consisting of three geodesics, $\sigma_1, \sigma_2, \sigma_3$, in $X$, cyclically connecting three points, $\sigma_3$ lies in a $k$-neighborhood of $\sigma_1 \cup \sigma_2$. We say that $X$ is (Gromov) hyperbolic if it is $k$-hyperbolic for some $k \ge 0$.
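To make Definition 2.2 concrete in the simplest case $k = 0$, the following sketch checks the thin-triangle condition on a finite graph with unit-length edges. It is only an illustration: the helper names are hypothetical, the graph is assumed connected and given as an adjacency dictionary, and only one geodesic per pair of vertices (found by breadth-first search) is examined, which suffices when geodesics are unique, as in a tree.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Set

Graph = Dict[int, List[int]]  # adjacency lists of a connected graph


def geodesic(graph: Graph, start: int, goal: int) -> List[int]:
    """Return one shortest vertex path from start to goal (breadth-first search)."""
    parent: Dict[int, object] = {start: None}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if v == goal:
            break
        for w in graph[v]:
            if w not in parent:
                parent[w] = v
                queue.append(w)
    path: List[int] = []
    node = goal
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def edges(path: List[int]) -> Set[FrozenSet[int]]:
    """Unordered edges traversed by a vertex path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def triangle_is_0_thin(graph: Graph, a: int, b: int, c: int) -> bool:
    """True if the side from c to a lies inside the union of the other two sides."""
    side_1 = edges(geodesic(graph, a, b))
    side_2 = edges(geodesic(graph, b, c))
    side_3 = edges(geodesic(graph, c, a))
    return side_3 <= (side_1 | side_2)
```

On a tripod (three paths glued at one vertex) every such triangle passes the test, whereas on a cycle of length six the triangle with vertices 0, 2, 4 fails it for $k = 0$.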
Note, in particular, that any two geodesics with the same endpoints remain a bounded distance apart. Expositions of this notion of hyperbolicity can be found in [GhH], [CoDP], [Sho] and [Bow1]. It turns out that hyperbolicity is a quasi-isometry invariant. It thus makes sense to talk about a “hyperbolic group”. Note that $\mathbb{H}^2$ (and indeed, $\mathbb{H}^n$ for any dimension, $n$) is hyperbolic and so $\pi_1(\Sigma_g)$ is a hyperbolic group. Any tree is 0-hyperbolic and so any finitely generated free group is hyperbolic. However, the Euclidean plane, and hence $\mathbb{Z} \oplus \mathbb{Z}$, is not. Indeed one can show that no hyperbolic group can contain $\mathbb{Z} \oplus \mathbb{Z}$ as a subgroup. We remark that there are related notions of CAT(0) and CAT(−1) spaces, where geodesic triangles are assumed to be at least as “thin”, in the appropriate metric sense, as the corresponding “comparison triangles” in the Euclidean and hyperbolic planes respectively. These are not, however, quasi-isometry invariant. CAT(−1) implies both CAT(0) and hyperbolic.
3. Mapping class groups
Let $\Sigma$ be a compact orientable surface of genus $g$ with $p$ boundary components, and let $\kappa(\Sigma) = 3g + p - 4$. We shall assume that $\kappa(\Sigma) > 0$. In other words, we are ruling out a small number of “exceptional” surfaces that can be independently understood. The mapping class group, $\mathrm{Map} = \mathrm{Map}(\Sigma)$, is the group of orientation preserving self-homeomorphisms of $\Sigma$ defined up to homotopy. This group is finitely generated, but not hyperbolic: it has lots of $\mathbb{Z} \oplus \mathbb{Z}$ subgroups generated by pairs of disjoint Dehn twists (i.e. a pair of non-trivial mapping classes supported on disjoint annuli). The large scale geometry of (any Cayley graph of) $\mathrm{Map}$ has been studied by a number of authors, see for example [Ham]. In [Harv], Harvey associated a simplicial complex, $\mathcal{C} = \mathcal{C}(\Sigma)$, to $\Sigma$. Its vertex set, $V(\mathcal{C})$, is the set of homotopy classes of simple closed curves in $\Sigma$ that cannot be homotoped to a point or to a boundary component of $\Sigma$.
A subset, $A \subseteq V(\mathcal{C})$, is deemed to be a simplex if its elements can be realised disjointly in $\Sigma$. This complex is connected and has dimension $\kappa(\Sigma)$. We see that $\mathrm{Map}$ acts simplicially on $\mathcal{C}(\Sigma)$, pulling back curves under the homeomorphism, and that the quotient space is compact. The space $\mathcal{C}(\Sigma)$ is commonly referred to as the curve complex (or Harvey complex). We shall refer to its 1-skeleton, $\mathcal{G}(\Sigma)$, as the curve graph. The curve complex has nice topological and combinatorial properties that can be used to study $\mathrm{Map}(\Sigma)$. For example, in [Hare], Harer investigates the cohomology of $\mathrm{Map}$ and in [Iv], Ivanov studies its automorphisms. The Teichmüller space, $\mathcal{T} = \mathcal{T}(\Sigma)$, of $\Sigma$ is the space of marked hyperbolic structures on the interior, $\mathrm{int}(\Sigma)$, of $\Sigma$. More precisely, an element of $\mathcal{T}$ consists of a complete finite-area hyperbolic surface, $S$, which is “marked” by a homotopy class of homeomorphisms, $\mathrm{int}(\Sigma) \to S$. We see that $\mathrm{Map}$ acts on $\mathcal{T}$ by changing the marking. The quotient, $\mathcal{T}/\mathrm{Map}$, is the “moduli space” of
unmarked hyperbolic structures. By uniformisation, studying hyperbolic structures is equivalent to studying conformal structures, that is, (punctured) Riemann surfaces. The Teichmüller space has a very rich structure (see [ImT]). For example it is a complex manifold, and carries two, rather different, natural metrics, namely the Teichmüller metric and the Weil-Petersson metric. It is worth noting however, that:
Proposition 3.1. If $d$ is any complete Gromov hyperbolic $\mathrm{Map}$-invariant metric, then the action of $\mathrm{Map}$ on $\mathcal{T}$ must be parabolic (i.e., it fixes a unique point in the ideal boundary).
This follows from an argument that is most easily expressible in terms of “convergence groups”, as introduced by Gehring and Martin. In the above situation, $\mathrm{Map}$ would act as a convergence group on the ideal boundary. We have observed that any pair of disjoint Dehn twists generate a $\mathbb{Z} \oplus \mathbb{Z}$ subgroup of $\mathrm{Map}$, which must be parabolic (see for example [Tu]). It follows that any Dehn twist fixes a unique ideal point, and since the curve graph is connected, these fixed points are all equal. The result now follows from the fact that $\mathrm{Map}$ can be generated by a set of Dehn twists. (Indeed any convergence group action of $\mathrm{Map}$ must fix a unique point.) This effectively says that $\mathcal{T}$ admits no interesting invariant complete Gromov hyperbolic metric.
Topologically, $\mathcal{T}$ is an open $(6g - 6 + 2p)$-dimensional ball that can be naturally compactified to a closed ball by adjoining the space, $\partial\mathcal{T}$, of “projective laminations”. This is the “Thurston compactification” [Bon]. Given $\alpha \in V(\mathcal{C})$ and $\varepsilon > 0$, we write $T_\varepsilon(\alpha) \subseteq \mathcal{T}$ for the set of surfaces in which $\alpha$ can be realised as a curve of length less than $\varepsilon$. If $\varepsilon = \varepsilon(\Sigma) > 0$ is sufficiently small, then $A \subseteq V(\mathcal{C})$ is a simplex if and only if $\bigcap_{\alpha \in A} T_\varepsilon(\alpha) \ne \emptyset$. In other words, we can think of $\mathcal{C}$ as a nerve to the family $(T_\varepsilon(\alpha))_{\alpha \in V(\mathcal{C})}$.
Up to quasi-isometry, we can equivalently think of $\mathcal{C}$ as arising by “shrinking down” each $T_\varepsilon(\alpha)$ to a set of bounded diameter (starting, for example, with the Teichmüller metric on $\mathcal{T}$). We refer to $\bigcup_{\alpha \in V(\mathcal{C})} T_\varepsilon(\alpha)$ as the thin part of $\mathcal{T}$, and to its complement as the thick part. It is well known, following work of Mumford, that $\mathrm{thick}(\mathcal{T})/\mathrm{Map}$ is compact (see for example [Ab]). Moreover, $\mathrm{thick}(\mathcal{T})$ is connected, and we see that $\mathrm{Map}$ is equivariantly quasi-isometric to any invariant geodesic metric on $\mathrm{thick}(\mathcal{T})$. In this way, we can also view $\mathcal{C}$ up to quasi-isometry as arising by shrinking down each of a family of subgroups of $\mathrm{Map}$, namely the stabilisers of simple closed curves. In view of the fact that neither $\mathcal{T}$ nor $\mathrm{thick}(\mathcal{T}) \sim \mathrm{Map}$ admits a (sensible) proper invariant hyperbolic metric, the following result is striking:
Theorem 3.2 ([MasM1]). The curve complex, $\mathcal{C}$, is Gromov hyperbolic.
Note that it is enough here to consider the curve graph, $\mathcal{G}(\Sigma)$, since its inclusion into $\mathcal{C}$ is a quasi-isometry.
A somewhat shorter proof can be found in [Bow3], which shows, in fact, that the hyperbolicity constant is $O(\log \kappa(\Sigma))$. A major complication in applying the usual machinery of hyperbolic groups to the curve graph arises from the fact that $\mathcal{G}$ is far from being locally finite. One way of dealing with this is suggested by Bestvina and Fujiwara [BeF], where they show that the action of $\mathrm{Map}$ on $\mathcal{G}$ is what they call “weakly properly discontinuous”. As a result, they deduce:
Theorem 3.3 ([BeF]). The second bounded cohomology of $\mathrm{Map}$ is infinitely generated.
Indeed, they deduce that the same holds for “most” subgroups of $\mathrm{Map}$. Here is another result concerning the action of $\mathrm{Map}$ on $\mathcal{G}$.
Theorem 3.4 ([Bow6]).
(1) The action of $\mathrm{Map}$ on $\mathcal{G}$ is acylindrical.
(2) There is some $N = N(\Sigma) \in \mathbb{N}$ such that for all $g \in \mathrm{Map}$, $N\|g\| \in \mathbb{N}$.
“Acylindricity” says essentially that there is a bound on the number of elements that can displace a long geodesic a short distance. (To be precise, for all $r \ge 0$, there exist $R, K \ge 0$ such that if $x, y \in V(\mathcal{G})$ with $d(x, y) \ge R$, then $|\{g \in \mathrm{Map} \mid d(x, gx) \le r,\ d(y, gy) \le r\}| \le K$.) It is a natural property of an action on a hyperbolic space. In particular, it implies weak proper discontinuity in the sense of [BeF]. The stable length, $\|g\|$, of $g \in \mathrm{Map}$ is defined as $\lim_{n \to \infty} \frac{1}{n} d(x, g^n x)$ for any $x \in \mathcal{G}$. We are thus claiming that this is uniformly rational. The analogues of (1) and (2) above are known for hyperbolic groups. The proof of Theorem 3.4 will use hyperbolic 3-manifolds, and we say more about it in Section 5.
We conclude this section with some remarks about the Teichmüller and Weil-Petersson metrics. The Teichmüller metric, $d_T$, is a complete geodesic Finsler metric. As we have observed, it cannot be hyperbolic, nor is it CAT(0) [Mas]. However, Teichmüller geodesics have a nice geometric description. For simplicity consider the case where $\Sigma$ is closed.
A geodesic path $\pi : I \to \mathcal{T}$ gives rise to a particular kind of singular Riemannian metric, namely a “singular sol” geometry on $\Sigma \times I$, which we denote by $P_\pi$. If $\pi(I) \subseteq \mathrm{thick}(\mathcal{T})$, then the universal cover $\tilde{P}_\pi$ is Gromov hyperbolic. More generally, if $\sigma : I \to \mathrm{thick}(\mathcal{T})$ is any path, we can construct a space $P_\sigma \cong \Sigma \times I$, essentially by assembling the hyperbolic surfaces $\sigma(t)$ for $t \in I$. Provided this is done in a reasonably sensible manner, the universal cover, $\tilde{P}_\sigma$, is well defined up to $\pi_1(\Sigma)$-equivariant quasi-isometry. It follows from independent work in [Mo] and [Bow2] that:
Theorem 3.5. A path $\sigma : I \to \mathrm{thick}(\mathcal{T})$ remains a bounded distance from a Teichmüller geodesic if and only if $\tilde{P}_\sigma$ is Gromov hyperbolic.
(Of course one needs to interpret this in terms of the uniformity of the various constants involved.)
The Weil-Petersson metric is rather different. It is a negatively curved Riemannian Kähler metric. It is not complete, but nevertheless geodesic and globally CAT(0), see [W1, W2]. It is shown in [Bro] that $(\mathcal{T}, d_W)$ is quasi-isometric to the “pants complex”, $\mathcal{P} = \mathcal{P}(\Sigma)$, of $\Sigma$. This is a 2-dimensional cell complex related to the curve complex. Like the curve complex, up to quasi-isometry it can be thought of as obtained by shrinking down some (but this time not all) of the thin part of Teichmüller space, or as shrinking down certain subgroups of $\mathrm{Map}$. In this way, its coarse geometry can be viewed as intermediate between those of $\mathrm{Map}$ and $\mathcal{C}$. It turns out that $\mathcal{P}$ is not hyperbolic except when $\Sigma$ is a five-holed sphere or two-holed torus [BroF], and so the same follows for $(\mathcal{T}, d_W)$. See also [Ar] for a discussion of the exceptional cases. Some connections between the Weil-Petersson metric and hyperbolic 3-manifolds are discussed in [Bro].
In summary, we have seen four very natural quasi-isometry classes of metrics on which $\mathrm{Map}$ acts, namely $\mathrm{Map} \sim \mathrm{thick}(\mathcal{T})$, $\mathcal{C}(\Sigma)$, $(\mathcal{T}, d_T)$ and $(\mathcal{T}, d_W) \sim \mathcal{P}(\Sigma)$. Each has some nice property not shared by any of the others, and understanding their interconnections is an intriguing problem.
4. 3-manifolds
Two aspects of 3-dimensional space provide us with powerful tools in this dimension. The first arises from the fact that hyperbolic 3-space, $\mathbb{H}^3$, is naturally compactified to a ball by adjoining the Riemann sphere, $\mathbb{C} \cup \{\infty\}$, so that hyperbolic isometries correspond to conformal automorphisms. This gives rise to a rich analytic theory. The second stems from the topological theory of 3-manifolds developed over the last century. Such connections began to be exploited in the 1960s and 1970s, see for example [Mar], and the subject saw a revolution in the late 1970s arising out of the work of Thurston [Th1, Th2]. He proposed a number of conjectures. Among the most significant are:
(1) Geometrisation. This says that any compact 3-manifold can be canonically cut into pieces each admitting a geometric structure – the main issues arising out of spherical and hyperbolic geometry. The topological decomposition alluded to had already been described in earlier work of Kneser and Milnor, and Waldhausen, Johannson, Jaco and Shalen. It should be noted that this work has served as a major source of inspiration in geometric group theory. We note, in particular, the splitting theory developed by Stallings, Dunwoody, Rips and many others as well as the more recent JSJ decomposition of Sela [Se] which is central to his work on the Tarski problem, and in which the mapping class groups of surfaces feature prominently. Thurston proved many special cases of the geometrisation conjecture [O1, K]. Recently Perelman announced a proof in general [P1, P2]. This, of course, implies the famous Poincaré conjecture.
(2) Tameness. This can be conveniently phrased as follows. If $M$ is a complete hyperbolic 3-manifold with $\pi_1(M)$ finitely generated, then $M$ is tame (or topologically finite), i.e., homeomorphic to the interior of a compact manifold. In fact, in this form, the conjecture is due to Marden [Mar]. Thurston gave a geometric reinterpretation which was later shown to be equivalent by Canary [Can]. A significant advance was made by Bonahon [Bon], and the general case was recently announced independently by Agol [Ag] and Calegari and Gabai [CalG].

(3) The ending lamination conjecture. Suppose that $M$ is a tame hyperbolic 3-manifold. The ending lamination conjecture (ELC) asserts that $M$ is determined up to isometry by its topology together with a finite set of "end invariants". Work towards this conjecture has formed a major project of Minsky, along with coworkers, notably Masur. A general proof has now been announced in joint work with Brock and Canary [Mi4, BroCM]. See [Mi3] for a general survey.

For simplicity of exposition, consider the case where $M$ has no cusps. Each end of $M$ is of one of two types. It may be "geometrically finite", in which case it opens out exponentially fast, and can be naturally compactified by adjoining a Riemann surface (arising out of the identification of the boundary of $\mathbb{H}^3$ with $\mathbb{C} \cup \{\infty\}$). In the other, "simply degenerate", case, the geometry is quite different. For example, in the "bounded geometry" situation (see Section 5) the end is quasi-isometric to a ray $[0, \infty)$. The end invariant of a geometrically finite end is a point of $T$, namely the compactifying Riemann surface. That of a simply degenerate end is a lamination, which (modulo forgetting about transverse measures) might be thought of as a point in $\partial T$.

Suppose $M_1$ and $M_2$ are tame hyperbolic 3-manifolds, with the same topology and the same end invariants. Let $\Gamma = \pi_1(M_1) = \pi_1(M_2)$. We get actions of $\Gamma$ on the universal covers $\tilde M_1$ and $\tilde M_2$, which are each isometric to $\mathbb{H}^3$. To prove the ELC, it turns out to be sufficient to find an equivariant quasi-isometry between their covers. This follows from the deformation theory of Kleinian groups developed by Ahlfors, Bers, Marden, Maskit and Sullivan, see for example [K]. The geometrically finite case is already encompassed by this earlier work. Since this all boils down to understanding the geometry of a (simply degenerate) end which we know to be homeomorphic to a surface times $[0, \infty)$, we can see most of the essential ideas just by considering surface groups.

5. Surface groups

For simplicity, we consider only the closed surface case. Let $\Sigma = \Sigma_g$ be the closed orientable surface of genus $g \ge 2$, and let $\Gamma = \pi_1(\Sigma)$. Suppose that $\Gamma$ acts properly discontinuously on $\mathbb{H}^3$ preserving orientation and without parabolics. Thus, $M = \mathbb{H}^3/\Gamma$ is a 3-manifold without cusps. In this case, tameness follows
from [Bon], and so $M$ is homeomorphic to $\Sigma \times \mathbb{R}$. Simple hyperbolic geometry tells us that any curve $\alpha \in V(G)$ can be uniquely realised as a closed geodesic $\bar\alpha$ in $M$. (Here we mean in the usual Riemannian sense; it is only locally geodesic in the metric space sense defined earlier.)

We begin by recalling some of the standard Thurston machinery (see [CanEG]). By a pleated surface we shall mean a map $\phi : (\Sigma, \rho) \to M$ which is homotopic to the inclusion of $\Sigma$ in $M \cong \Sigma \times \mathbb{R}$, and which is 1-Lipschitz with respect to some hyperbolic metric, $\rho$, on $\Sigma$. (Normally, pleated surfaces are assumed to be folded in a particular way, but all we require here is the Lipschitz property. Indeed it would be enough for them to be uniformly Lipschitz.) The hyperbolic structure, $\rho$, is viewed as part of the data of the pleated surface. In general, pleated surfaces are not embedded. We say that $\phi$ realises a curve $\alpha \in V(G)$ if $\phi|\hat\alpha$ is a locally isometric map to $\bar\alpha$, where $\hat\alpha$ is the unique closed geodesic in $(\Sigma, \rho)$ in the class of $\alpha$. A relatively simple construction of [Th1] or of [Bon] shows:

Lemma 5.1. Any $\alpha \in V(G)$ can be realised by a pleated surface. Indeed, if $\alpha, \beta \in V(G)$ are adjacent then they can be realised by the same pleated surface.

We see a connection with the curve graph emerging, since if $\gamma_0, \dots, \gamma_n$ is any path in $G$, we get a sequence of interlocking pleated surfaces, $\phi_i : (\Sigma, \rho_i) \to M$, for $i = 1, \dots, n$, where $\phi_i$ realises both $\gamma_i$ and $\gamma_{i-1}$.

Now any infinite sequence of curves $(\gamma_i)_i$ in $V(G)$ contains a subsequence converging on a lamination $\lambda$. This means that they can be realised in $\Sigma$ so that they converge in the Hausdorff sense. Generically, a lamination is locally homeomorphic to a Cantor set times an interval, though in general a transversal may also contain (or indeed consist entirely of) isolated points. A lamination thus consists of a set of 1-dimensional leaves foliating a closed subset of $\Sigma$. (If we were to fix a hyperbolic structure on $\Sigma$, we could realise this so that all leaves are Riemannian geodesics.)

Suppose the end $e \cong \Sigma \times [0, \infty)$ of $M$ is simply degenerate. By [Bon], we get a sequence, $(\gamma_i)_{i=1}^{\infty}$ in $V(G)$, so that the realisations, $\bar\gamma_i$, go out the end $e$. Moreover, $\gamma_i$ converges on a well-defined lamination, the ending lamination of $e$ (at least modulo removing isolated leaves from the limit). We can also think of this in terms of Teichmüller space. We get a sequence of pleated surfaces, $\phi_i : (\Sigma, \rho_i) \to M$, realising $\gamma_i$. The images $\phi_i(\Sigma)$ also go out $e$. In the Thurston compactification, $T \cup \partial T$, of Teichmüller space, $(\Sigma, \rho_i)$ converges on $\lambda$ (at least after we have identified all projective laminations with support $\lambda$). In fact one can interpolate so that the $\gamma_i$ are the vertices of an infinite ray in $G(\Sigma)$, and in this way get a sequence of interlocking pleated surfaces. (Indeed it follows from work of Minsky that one can take this ray to be geodesic in $G$.)

The general strategy for proving the ELC is to construct a "model" metric on $\Sigma \times [0, \infty)$, depending only on the ending lamination $\lambda$, and then show that
the universal covers of $e$ and of the model space are $\Gamma$-equivariantly quasi-isometric.

A special case of the ELC is that of bounded geometry, i.e., where $e$ has positive injectivity radius. It then follows that the images of all pleated surfaces in $e$ have bounded diameter. This case is treated in [Mi1, Mi2], and one can take the model space to be the singular sol manifold $P_\pi$, where $\pi$ is a geodesic ray in $T$ tending to $\lambda$. In fact, by interpolating between the pleated surfaces in $M$, we get a path $\sigma : I \to \mathrm{thick}(T)$ such that $\tilde P_\sigma$ is equivariantly quasi-isometric to the universal cover, $\tilde e$. One can deduce that $\tilde P_\sigma$ is Gromov hyperbolic, and using Theorem 3.5, one sees that $\sigma$ remains close to $\pi$, from which one deduces, in turn, that $\tilde P_\sigma$ is equivariantly quasi-isometric to $\tilde P_\pi$. In other words, one recovers the following result of Minsky:

Theorem 5.2. If the end $e$ has bounded geometry, then $\tilde e$ is equivariantly quasi-isometric to the singular sol model space, $\tilde P_\pi$.

We deduce the ELC in the bounded geometry case.

Unfortunately, Theorem 5.2 will certainly fail when we move away from bounded geometry (though a possible variant of this construction is proposed in [Re]). In the general (indeed generic) case, $e$ will contain arbitrarily short closed geodesics, which are necessarily simple [O2], and hence have the form $\bar\gamma$ where $\gamma \in V(G)$. Any path of pleated surfaces going out the end will inevitably have to pass through the corresponding thin parts, $T(\gamma)$, of Teichmüller space. The picture can get very complicated, but the curve graph, $G(\Sigma)$, offers a means of coming to terms with the situation. This was one of the motivations behind the study of [MasM1, MasM2].

The idea in [Mi4] is to construct a model space out of combinatorial data of the curve graph. The details are quite involved, but a key idea is that of a "tight" geodesic. (To interpret the following discussion correctly one should substitute "multicurve" for "curve", allowing a curve to have more than one component. However, we can safely ignore this somewhat tedious complication here.)

Let $(\gamma_i)_{i=0}^{n}$ be a geodesic in $G$. We say that $(\gamma_i)_i$ is tight at $\gamma_i$ if each curve that crosses $\gamma_i$ also crosses either $\gamma_{i-1}$ or $\gamma_{i+1}$. We say $(\gamma_i)_i$ is tight if it is tight at $\gamma_i$ for all $i = 1, \dots, n-1$. Note that $\gamma_i$ must be disjoint from the connected set $\gamma_{i-1} \cup \gamma_{i+1} \subseteq \Sigma$. In general, there may be infinitely many ways of choosing $\gamma_i$. Tightness obliges us to take one of the curves bounding the subsurface of $\Sigma$ filled by $\gamma_{i-1} \cup \gamma_{i+1}$. Let $T(\alpha, \beta)$ be the set of all tight geodesics from $\alpha$ to $\beta$ in $G$.

Theorem 5.3 ([MasM2]).
(1) $T(\alpha, \beta)$ is nonempty.
(2) $T(\alpha, \beta)$ is finite.

(It is part (1) which seems to require us to reinterpret tightness in terms of multicurves.)
Given $r \in \mathbb{N}$, let $S_r(\alpha, \beta) = \{\gamma \in T(\alpha, \beta) \mid d(\alpha, \gamma) = r\}$. In other words, it is a "slice" through the union of all tight geodesics a given distance from one endpoint. We can refine Theorem 5.3(2) as:

Theorem 5.4 ([Bow5]). There is some $K = K(\mathrm{genus}(\Sigma)) \in \mathbb{N}$ such that given any $\alpha, \beta \in V(G)$ and $r \in \mathbb{N}$, $|S_r(\alpha, \beta)| \le K$.

Note that the hyperbolicity of $G$ tells us immediately that slices have bounded diameter. Theorem 5.4 states that they have bounded cardinality. In fact, there are refinements of this result that allow us to vary $\alpha$ and $\beta$, each within a set of bounded diameter, while retaining a cardinality bound on slices that remain far enough away from the endpoints.

One consequence of Theorem 5.4 (and its refinements) is that, for certain purposes, it effectively reduces us to considering locally finite graphs. In this way, a diagonal sequence argument, together with an argument of Delzant [D] in the context of hyperbolic groups, gives us:

Proposition 5.5. If $g \in \mathrm{Map}$ and $\|g\| > 0$, then there is a bi-infinite geodesic, $\pi \subseteq G$, such that $g^N \pi = \pi$, where $N = N(\Sigma)$ depends only on the topological type of $\Sigma$.

Thus, $g^N$ translates $\pi$ some distance $p \in \mathbb{N}$, and so $N\|g\| = \|g^N\| = p \in \mathbb{N}$, proving Theorem 3.4(2). We remark that $\|g\| > 0$ if and only if $g$ is a pseudo-Anosov mapping class in the Nielsen-Thurston classification. One can similarly use Theorem 5.4 to prove Theorem 3.4(1).

The proof of Theorem 5.4 uses the following relatively classical fact about hyperbolic 3-manifolds:

Lemma 5.6. Given any $\alpha, \beta \in V(G)$, we can find a complete hyperbolic 3-manifold, $M \cong \Sigma \times \mathbb{R}$, in which $\bar\alpha$ and $\bar\beta$ both have uniformly bounded length (indeed can be chosen arbitrarily short).

Here we see the necessity of passing to 3 dimensions: there is no hope of achieving such a result for hyperbolic surfaces. We need, in addition, the following:

Theorem 5.7. If $\alpha = \gamma_0, \dots, \gamma_n = \beta$ is a tight geodesic with the lengths of $\bar\alpha$ and $\bar\beta$ uniformly bounded, then the lengths of the $\bar\gamma_i$ are all bounded by another constant depending only on $\Sigma$.

This "a priori bound" is proven in [Mi4], and one can see its relevance to the ELC given that tight geodesics are used to construct the model space. Minsky's argument is part of a larger project, and uses much sophisticated machinery. A more direct proof of this statement is given in [Bow5]. The vague idea is that, if the result should fail, we can find such a set-up in a 3-manifold in which at least some of the $\bar\gamma_i$ are very long. We can connect them by interlocking pleated surfaces, $\phi_i : (\Sigma, \rho_i) \to M$. In these pleated surfaces, the very long $\gamma_i$ will tend to "fill out" certain subsurfaces, $F_i \subseteq \Sigma$.
Tightness means that $\gamma_i$ must drag around with it either $\gamma_{i-1}$ or $\gamma_{i+1}$ (or both), so that $F_i$ will have a homotopically non-trivial intersection with either $F_{i-1}$ or $F_{i+1}$. We can then use this sequence of subsurfaces to shortcut the path $(\gamma_i)_i$, contradicting the assumption that it is geodesic in $G(\Sigma)$.

To make proper sense out of this argument, we need at some point to use some kind of limiting procedure to derive a contradiction. As a result, we get some non-constructive input into the proceedings, and it is unclear whether the constant $K$ featuring in Theorem 5.4 is a computable function of $g = \mathrm{genus}(\Sigma)$. This therefore also applies to the constants in Theorem 3.4. Some algorithmic bounds associated with tight geodesics are described in [Sha], showing for example that distances in $G(\Sigma)$ are computable. However, it seems more difficult to simultaneously achieve uniformity and computability of the various constants referred to earlier.

To conclude the proof of Theorem 5.4, one needs to delve further into the geometry of $M$. For this we use the band systems constructed in [Bow4]. A "band system" gives some kind of topological account of the failure of bounded geometry in $M$. One needs to argue that realisations of curves featuring in tight geodesics cannot enter any such band. The bounded geometry of $M$ outside the bands then gives rise to combinatorial restrictions on the possibilities for such curves.

References

[Ab] W. Abikoff, The real analytic theory of Teichmüller space. Springer Lecture Notes in Mathematics No. 820 (1980), Springer Verlag.
[Ag] I. Agol, Tameness and hyperbolic 3-manifolds. preprint, Chicago (2004).
[Ar] J. Aramayona, The Weil-Petersson geometry of the five-times punctured sphere. preprint, Southampton (2004).
[BeF] M. Bestvina, K. Fujiwara, Bounded cohomology of subgroups of the mapping class groups. Geom. Topol. 6 (2002) 69–89.
[Bon] F. Bonahon, Bouts des variétés hyperboliques de dimension 3. Ann. of Math. 124 (1986) 71–158.
[Bow1] B.H. Bowditch, Notes on Gromov's hyperbolicity criterion for path-metric spaces. in "Group theory from a geometrical viewpoint" (ed. E. Ghys, A. Haefliger, A. Verjovsky), World Scientific (1991) 64–167.
[Bow2] B.H. Bowditch, Stacks of hyperbolic spaces and ends of 3-manifolds. preprint, Southampton (2002).
[Bow3] B.H. Bowditch, Intersection numbers and the hyperbolicity of the curve complex. preprint, Southampton (2002).
[Bow4] B.H. Bowditch, Systems of bands in hyperbolic 3-manifolds. preprint, Southampton (2003).
[Bow5] B.H. Bowditch, Length bounds on curves arising from tight geodesics. preprint, Southampton (2003).
[Bow6] B.H. Bowditch, Tight geodesics in the curve complex. preprint, Southampton (2003).
[BriH] M.R. Bridson, A. Haefliger, Metric spaces of non-positive curvature. Grundlehren der Mathematischen Wissenschaften No. 319, Springer-Verlag (1999).
[Bro] J.F. Brock, The Weil-Petersson metric and volumes of 3-dimensional hyperbolic convex cores. J. Amer. Math. Soc. 16 (2003) 495–535.
[BroCM] J.F. Brock, R.D. Canary, Y.N. Minsky, Classification of Kleinian surface groups II: The ending lamination conjecture. in preparation.
[BroF] J.F. Brock, B. Farb, The curvature and rank of Teichmüller space. preprint, Chicago (2001).
[CalG] D. Calegari, D. Gabai, Shrinkwrapping and the taming of hyperbolic 3-manifolds. preprint, Pasadena (2004).
[Can] R.D. Canary, Ends of hyperbolic 3-manifolds. J. Amer. Math. Soc. 6 (1993) 1–35.
[CanEG] R.D. Canary, D.B.A. Epstein, P. Green, Notes on notes of Thurston. in "Analytic and geometric aspects of hyperbolic space", London Math. Soc. Lecture Notes Series No. 111 (ed. D.B.A. Epstein), Cambridge University Press (1987) 3–92.
[CoDP] M. Coornaert, T. Delzant, A. Papadopoulos, Les groupes hyperboliques de Gromov. Lecture Notes in Mathematics No. 1441, Springer Verlag (1990).
[D] T. Delzant, Sous-groupes distingués et quotients des groupes hyperboliques. Duke Math. J. 83 (1996) 661–682.
[GhH] E. Ghys, P. de la Harpe (eds.), Sur les groupes hyperboliques d'après Mikhael Gromov. Progress in Mathematics No. 83, Birkhäuser (1990).
[Gr] M. Gromov, Hyperbolic groups. in "Essays in Group Theory" (ed. S.M. Gersten), M.S.R.I. Publications No. 8, Springer-Verlag (1987) 75–263.
[Ham] U. Hamenstädt, Train tracks and mapping class groups I. preprint, Bonn (2004).
[Hare] J.L. Harer, The virtual cohomological dimension of the mapping class groups of orientable surfaces. Invent. Math. 84 (1986) 157–176.
[Harv] W.J. Harvey, Boundary structure of the modular group. in "Riemann surfaces and related topics: Proceedings of the 1978 Stony Brook Conference" (ed. I. Kra, B. Maskit), Ann. of Math. Stud. No. 97, Princeton University Press (1981) 245–251.
[ImT] Y. Imayoshi, M. Taniguchi, An introduction to Teichmüller spaces. Springer-Verlag (1992).
[Iv] N.V. Ivanov, Automorphism of complexes of curves and of Teichmüller spaces. Internat. Math. Res. Notices (1997) 651–666.
[K] M. Kapovich, Hyperbolic manifolds and discrete groups. Progress in Mathematics No. 183, Birkhäuser (2001).
[Mar] A. Marden, The geometry of finitely generated Kleinian groups. Ann. of Math. 99 (1974) 383–462.
[Mas] H. Masur, The curvature of Teichmüller space. in "A crash course on Kleinian groups", Lecture Notes in Mathematics No. 400, Springer (1974) 122–123.
[MasM1] H.A. Masur, Y.N. Minsky, Geometry of the complex of curves I: hyperbolicity. Invent. Math. 138 (1999) 103–149.
[MasM2] H.A. Masur, Y.N. Minsky, Geometry of the complex of curves II: hierarchical structure. Geom. Funct. Anal. 10 (2000) 902–974.
[Mi1] Y.N. Minsky, Teichmüller geodesics and ends of hyperbolic 3-manifolds. Topology 32 (1993) 625–647.
[Mi2] Y.N. Minsky, On rigidity, limit sets, and ends of hyperbolic 3-manifolds. J. Amer. Math. Soc. 7 (1994) 539–588.
[Mi3] Y.N. Minsky, Short geodesics and end invariants. in "Comprehensive Research in Complex Dynamics and Related Fields" (eds. M. Kisaka, S. Morosawa), RIMS Kokyuroku No. 1153 (2000) 1–20.
[Mi4] Y.N. Minsky, The classification of Kleinian surface groups I: Models and bounds. preprint, Stony Brook (2002).
[Mo] L. Mosher, Stable Teichmüller quasigeodesics and ending laminations. Geom. Topol. 7 (2003) 33–90.
[O1] J.-P. Otal, Thurston's hyperbolization of Haken manifolds. Surveys in differential geometry, Vol. III, 77–194, International Press, 1998.
[O2] J.-P. Otal, Les géodésiques fermées d'une variété hyperbolique en tant que noeuds. in "Kleinian groups and hyperbolic 3-manifolds" (ed. Y. Komori, V. Markovic, C. Series), London Math. Soc. Lecture Notes Series No. 299 (2003), Cambridge University Press, 95–104.
[P1] G. Perelman, The entropy formula for Ricci flow and its geometric applications. preprint, Saint Petersburg (2003).
[P2] G. Perelman, Ricci flow with surgery on 3-manifolds. preprint, Saint Petersburg (2003).
[Re] M. Rees, The geometric model and large Lipschitz equivalence direct from Teichmüller geodesics. preprint, Liverpool (2004).
[Ro] J. Roe, Lectures on coarse geometry. University Lecture Series No. 31, American Mathematical Society (2003).
[Se] Z. Sela, Structure and rigidity in (Gromov) hyperbolic groups and discrete groups in rank 1 Lie groups II. Geom. Funct. Anal. 3 (1997) 561–593.
[Sha] K.J. Shackleton, Tightness and computing distances in the curve complex. preprint, Southampton (2004).
[Sho] H. Short et al., Notes on word hyperbolic groups. in "Group theory from a geometrical viewpoint" (ed. E. Ghys, A. Haefliger, A. Verjovsky), World Scientific (1991) 3–63.
[Th1] W.P. Thurston, The geometry and topology of 3-manifolds. notes, Princeton (1979).
[Th2] W.P. Thurston, Three-dimensional manifolds, Kleinian groups and hyperbolic geometry. Bull. Amer. Math. Soc. 9 (1982) 357–381.
[Tu] P. Tukia, Convergence groups and Gromov's metric hyperbolic spaces. New Zealand J. Math. 23 (1994) 157–187.
[W1] S.A. Wolpert, Geodesic length functions and the Nielsen problem. J. Differential Geom. 25 (1987) 275–296.
[W2] S.A. Wolpert, Geometry of the Weil-Petersson completion of Teichmüller space. in "Surveys in Differential Geometry, Vol. VIII", Boston (2003) 357–393.
Brian H. Bowditch School of Mathematics, University of Southampton Highfield, Southampton SO17 1BJ, Great Britain URL: http://www.maths.soton.ac.uk/staff/Bowditch
Proof of an Intersection Theorem via Fourier Analysis

Ehud Friedgut

Abstract. Let $p \le 1/2$ and let $\mu_p$ be the product measure on $\{0,1\}^n$, where $\mu_p(x) = p^{\sum x_i}(1-p)^{n-\sum x_i}$. Let $A \subset \{0,1\}^n$ be an intersecting family, i.e., for every $x, y \in A$ there exists $1 \le i \le n$ such that $x_i = y_i = 1$. Then $\mu_p(A) \le p$. The proof uses discrete harmonic analysis.
1. Introduction

This note, which is to appear in the proceedings of the Fourth European Congress of Mathematics, is related to my talk there, but does not quite reflect its precise contents. The talk I gave was based mainly on the paper [1] and on an upcoming paper [4]. Rather than repeat the contents of those papers I prefer to present a related result. It is extremely simple, but reflects the same theme for which the above papers were chosen: yet another illustration of the power of Fourier analysis in discrete settings. I hope to present an expanded demonstration of the potential of this method for treating intersection problems of this type in an upcoming paper [6]. I should note that the proof I present here is closely related to the proofs of Hoffman [8] and Wilson [9].

The Erdős–Ko–Rado theorem [5], henceforth EKR, is perhaps the most fundamental theorem in extremal set theory. In this note we prove an analogue of this theorem in a slightly different setting, using Fourier analysis. The main theorem we prove here is implicit in a paper by Dinur and Safra [3], where they introduce an asymptotic approach that yields a way to deduce it from the EKR theorem itself. (As a matter of fact they prove the analogue of a far-reaching generalization of the EKR theorem, the Ahlswede–Khachatrian theorem [2].)

Let us begin with some simple definitions. Let $p \in [0,1]$, and let $q = 1 - p$. Let $n$ be a positive integer fixed throughout this paper. We consider the space $V = V_n = \{0,1\}^n$ as a probability space, endowed with the product measure $\mu = \mu_p$. For any $x \in V$, the measure of $x$ is $\mu(x) = p^{|x|} q^{n-|x|}$, where $|x| = \sum_{i=1}^{n} x_i$, and for any $A \subseteq V$ we define $\mu(A) = \sum_{x \in A} \mu(x)$. Let $[n]$ denote the set $\{1, \dots, n\}$. A $k$-set is a subset of $[n]$ of size $k$, and we use $\binom{[n]}{k}$ to denote the set of all $k$-sets contained in $[n]$. As usual we identify subsets of $[n]$ with their characteristic vectors and, vice versa, we identify $x \in \{0,1\}^n$ with the set $\{i : x_i = 1\}$. We will say $A \subseteq V$ is an intersecting family if every two sets that belong to $A$ have non-empty intersection. We will call a family $A \subset \binom{[n]}{k}$ a principal family if there exists $1 \le i \le n$ such that $A = \{B \in \binom{[n]}{k} : i \in B\}$. We will call a family $A \subseteq V$ a dictatorship if there exists $1 \le i \le n$ such that $A = \{B \subseteq [n] : i \in B\}$.

Theorem 1.1 (Erdős–Ko–Rado). Let $k \le n/2$ and let $A \subseteq \binom{[n]}{k}$ be an intersecting family. Then
(1) $|A| \le \frac{k}{n}\binom{n}{k} = \binom{n-1}{k-1}$.
(2) If $|A| = \binom{n-1}{k-1}$ then $A$ is a principal family.

Clearly the condition $k \le n/2$ is necessary, else every two $k$-sets contained in $[n]$ intersect. The following theorem is a natural analogue of the EKR theorem, and arose in [3] where a generalization of it was first stated and proved.

Theorem 1.2. Let $p < 1/2$ and let $A \subset \{0,1\}^n$ be an intersecting family. Then
(1) $\mu_p(A) \le p$.
(2) If $\mu_p(A) = p$ then $A$ is a dictatorship.
(3) If $\mu_p(A) = p - \varepsilon$ then $A$ is $O(\varepsilon)$-close to some dictatorship. (The square of the $L^2$ distance to some dictatorship is of order $\varepsilon$.)

If $p = 1/2$ then $\mu$ is the uniform measure, and the first part of the theorem is true but trivial, since from every pair of complementary sets at most one can belong to $A$. As before (in the case $k > n/2$), the theorem does not hold in general for $p > 1/2$, as one may take $A$, for example, to be the family of all subsets of $[n]$ of size greater than $n/2$.

2. Proof of Theorem 1.2

For $i = 1, \dots, n$ let $\chi_i : \{0,1\}^n \to \mathbb{R}$ be defined by
$$\chi_i(x) = \begin{cases} -\sqrt{q/p} & \text{if } x_i = 1, \\ \sqrt{p/q} & \text{if } x_i = 0, \end{cases}$$
and for any $S \subseteq [n]$ let $\chi_S = \prod_{i \in S} \chi_i$. The functions $\{\chi_S\}$ form an orthonormal basis for the functions on $\{0,1\}^n$. Given any function $f : \{0,1\}^n \to \mathbb{R}$ we can expand it in terms of this basis as $f = \sum_S \hat f(S)\chi_S$. We refer to $\hat f$ as the Fourier transform of $f$.

Let $A \subseteq \{0,1\}^n$ be an intersecting family and let $f$ be its characteristic function. Let $\Pr(f = 1) = \alpha$. We wish to prove that $\alpha \le p$. The main observation that we will need is the following lemma about the Fourier expansion of $f$.

Lemma 2.1.
$$\sum_{S \subseteq [n]} \hat f^2(S)\,(-p/q)^{|S|} = 0.$$
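Lemma 2.1 can be sanity-checked numerically on small examples. The following is a minimal sketch, not part of the original argument: it assumes the characters are read as $\chi_i(x) = -\sqrt{q/p}$ for $x_i = 1$ and $\sqrt{p/q}$ for $x_i = 0$ (the square roots are what orthonormality with respect to $\mu_p$ requires), computes $\hat f(S) = \sum_x \mu_p(x) f(x) \chi_S(x)$ by brute force, and evaluates the sum in the lemma. All function and variable names are hypothetical, and coordinates are indexed from 0 rather than 1.

```python
from itertools import product
from math import sqrt


def mu(x, p):
    """Product measure mu_p of a single point x of {0,1}^n."""
    k = sum(x)
    return p**k * (1.0 - p) ** (len(x) - k)


def chi(S, x, p):
    """Character chi_S(x): product over i in S of chi_i(x) (assumed form)."""
    q = 1.0 - p
    value = 1.0
    for i in S:
        value *= -sqrt(q / p) if x[i] == 1 else sqrt(p / q)
    return value


def fourier(family, n, p):
    """Map each S (tuple of 0-based indices) to f_hat(S) = E_mu[f * chi_S],
    where f is the indicator function of `family`."""
    subsets = [tuple(i for i in range(n) if bits[i])
               for bits in product((0, 1), repeat=n)]
    return {S: sum(mu(x, p) * chi(S, x, p) for x in family) for S in subsets}


def lemma_sum(family, n, p):
    """Left-hand side of Lemma 2.1: sum over S of f_hat(S)^2 * (-p/q)^|S|."""
    q = 1.0 - p
    return sum(c * c * (-p / q) ** len(S)
               for S, c in fourier(family, n, p).items())


def measure(family, p):
    """mu_p of a family, i.e. the sum of mu_p over its members."""
    return sum(mu(x, p) for x in family)


def is_intersecting(family):
    """True if every x, y in the family (including x == y) share a coordinate
    where both are 1."""
    return all(any(a & b for a, b in zip(x, y)) for x in family for y in family)


# Illustrative data: n = 3, p = 0.3 (arbitrary choices with p < 1/2).
N, P = 3, 0.3
CUBE = list(product((0, 1), repeat=N))        # not intersecting
DICTATOR = [x for x in CUBE if x[0] == 1]     # dictatorship on coordinate 0
MAJORITY = [x for x in CUBE if sum(x) >= 2]   # intersecting, not a dictatorship
```

For the two intersecting families the sum vanishes up to rounding, while for the whole cube (which contains the empty set and so is not intersecting) only $\hat f(\emptyset) = 1$ is non-zero and the sum equals 1, so the identity does depend on the intersection property.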
We will prove this lemma shortly, but first let us see how it implies the theorem. Since $f$ takes on only the values 0 and 1,
$$\sum_S \hat f^2(S) = \|f\|_2^2 = \|f\|_1 = \hat f(\emptyset) = \alpha.$$
Also recall that $p < 1/2$, so $p/q < 1$. Hence
$$0 = \sum_S \hat f^2(S)(-p/q)^{|S|} = \alpha^2 + \sum_{S \ne \emptyset} \hat f^2(S)(-p/q)^{|S|} \ge \alpha^2 + (\alpha - \alpha^2)(-p/q).$$
This immediately yields $\alpha \le p$ as required: dividing by $\alpha$ and rearranging gives $\alpha q \le (1 - \alpha)p$, and $p + q = 1$. Also the above implies that one only has equality if $\hat f(S) = 0$ for all $|S| > 1$. This means that $f$ is a Boolean function that is linear. It is quite simple to deduce from this that in such a case $f$ depends on a single coordinate. Furthermore, this type of observation is the key to proving part (3) of the theorem. One can show that if all but a small portion of the $L^2$ weight of $\hat f$ sits on sets of size at most 1 then $f$ is close to a dictatorship. This uses a theorem to this effect from [7]. We omit the details. (For a proof of this nature see [1].)

Proof of Lemma 2.1. First define the following matrix:
$$A_1 = \begin{pmatrix} \frac{q-p}{q} & \frac{p}{q} \\ 1 & 0 \end{pmatrix}.$$
Note that the eigenvectors of $A_1$ are $(1, 1)$ and $(\sqrt{p/q}, -\sqrt{q/p})$, which if viewed as functions on $\{0,1\}$ are precisely $\chi_\emptyset$ and $\chi_1$. The corresponding eigenvalues are $1$ and $-p/q$ respectively. Now define $A_n$ as the $n$-fold tensor product of $A_1$ with itself. The eigenvectors of $A_n$ are precisely the functions $\chi_S$ for $S \subseteq [n]$ and the corresponding eigenvalues are $(-p/q)^{|S|}$. The main reason for our interest in $A_n$ is the following fact, which is not hard to prove by induction. If we label the columns and rows of $A_n$ in the natural way by the elements of $\{0,1\}^n$ then it is a non-intersection matrix of the subsets of $[n]$: if $x, y \in \{0,1\}^n$ are the characteristic vectors of intersecting sets then $(A_n)_{x,y} = 0$. The conclusion of this simple fact is that if $f \in \{0,1\}^{2^n}$ is the characteristic vector of an intersecting family then $f A_n f^{tr} = 0$. Expanding $f$ as $\sum_S \hat f(S)\chi_S$ immediately yields the lemma.

References

[1] N. Alon, I. Dinur, E. Friedgut, B. Sudakov, Graph Products, Fourier Analysis and Spectral Techniques. To appear in G.A.F.A.
[2] R. Ahlswede, L. Khachatrian, The complete intersection theorem for systems of finite sets. European J. Combin. 18 (1997), no. 2, 125–136.
[3] I. Dinur, S. Safra, On the importance of being biased (1.36 hardness of approximating Vertex-Cover). Annals of Mathematics, to appear. Proc. of 34th STOC, 2002.
[4] I. Dinur, E. Friedgut, Large monotone intersecting families are contained in a junta, in preparation.
[5] P. Erdős, C. Ko, R. Rado, Intersection theorems for systems of finite sets, Quart. J. Math. Oxford, ser. 2 12 (1961), 313–318.
[6] E. Friedgut, On the measure of intersecting families, in preparation.
[7] E. Friedgut, G. Kalai, A. Naor, Boolean Functions whose Fourier Transform is Concentrated on the First Two Levels, Adv. in Appl. Math. 29 (2002), 427–437.
[8] A.J. Hoffman, Eigenvalues of graphs. Studies in graph theory, Part II, pp. 225–245. Studies in Math., Vol. 12, Math. Assoc. Amer., Washington, D.C., 1975.
[9] R.M. Wilson, The exact bound in the Erdős–Ko–Rado theorem. Combinatorica 4 (1984), no. 2–3, 247–257.

Ehud Friedgut
Institute of Mathematics
Hebrew University
Jerusalem, Israel
e-mail:
[email protected]
Nonlinear Schrödinger Equations on Compact Manifolds

P. Gérard

Abstract. Nonlinear Schrödinger equations have been studied by mathematicians for about thirty years. However, until recently, most of the contributions concerned the equation on the whole Euclidean space, with the notable exception of J. Bourgain's contributions on tori. In the case of general Riemannian manifolds, the interaction of geometry with nonlinear operations leads to new phenomena, particularly if the manifold is compact. Here we review the state of the art concerning the Cauchy problem on such manifolds, and we describe optimal results on spheres, where new estimates on spherical harmonics play a crucial role. The matter of this paper is based on a series of results in collaboration with N. Burq and N. Tzvetkov.
2000 Mathematics Subject Classification. 35Q55, 35BXX, 37K05, 37L50, 81Q20.
Key words and phrases. nonlinear Schrödinger, eigenfunction estimates, dispersive equations.

1. Introduction

Let $(M, g)$ be a complete Riemannian manifold of dimension $d$. It is well known (see Gaffney [21]) that the Laplace operator $\Delta_g = \mathrm{div}\,\nabla$ on functions is essentially selfadjoint, and therefore generates a unitary one parameter group on $L^2(M, dx)$,
$$S(t) = e^{it\Delta_g}, \tag{1.1}$$
called the Schrödinger group. Let us consider the nonlinear evolution equation
$$i u_t + \Delta u = \varepsilon |u|^2 u, \qquad u|_{t=0} = u_0, \tag{1.2}$$
where $\varepsilon \in \{1, -1\}$, $u_0 : M \to \mathbb{C}$ is a given Cauchy data, and $u : \mathbb{R} \times M \to \mathbb{C}$ is the unknown. Other nonlinearities in the right-hand side of (1.2) can appear; here, for the sake of simplicity, we have chosen to reduce our discussion to the cubic one. The nonlinear Schrödinger equation (1.2) can be seen as an infinite dimensional Hamiltonian system on $L^2(M)$ endowed with the symplectic form
$$\sigma(f_1, f_2) = 2\,\mathrm{Im} \int_M f_1 \overline{f_2}\, dx,$$
and associated to the (unbounded) energy function
$$H(f) = \int_M |\nabla f|^2 + \frac{\varepsilon}{2} |f|^4 \, dx.$$
From this structure it inherits the following formal conservation laws,
$$\|u(t)\|_{L^2} = \|u(0)\|_{L^2}; \qquad H(u(t)) = H(u(0)). \tag{1.3}$$
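The formal computation behind (1.3) can be sketched as follows. This is only a sketch under stated assumptions: $u$ is taken smooth enough (and, if $M$ is not compact, sufficiently decaying) for the integrations by parts to be justified, the energy is read as $H(f) = \int_M |\nabla f|^2 + \frac{\varepsilon}{2}|f|^4\,dx$, and equation (1.2) is rewritten as $i u_t = -\Delta u + \varepsilon |u|^2 u$.

```latex
\begin{align*}
\frac{d}{dt}\,\|u(t)\|_{L^2}^2
  &= 2\,\mathrm{Re}\int_M u_t\,\overline{u}\,dx
   = 2\,\mathrm{Re}\int_M \bigl(i\Delta u - i\varepsilon |u|^2 u\bigr)\,\overline{u}\,dx \\
  &= 2\,\mathrm{Re}\Bigl(-i\int_M |\nabla u|^2\,dx \;-\; i\varepsilon\int_M |u|^4\,dx\Bigr) = 0, \\[4pt]
\frac{d}{dt}\,H(u(t))
  &= 2\,\mathrm{Re}\int_M \bigl(-\Delta u + \varepsilon |u|^2 u\bigr)\,\overline{u_t}\,dx
   = 2\,\mathrm{Re}\int_M i\,u_t\,\overline{u_t}\,dx
   = 2\,\mathrm{Re}\Bigl(i\int_M |u_t|^2\,dx\Bigr) = 0 .
\end{align*}
```

In both lines the quantity inside the real part is purely imaginary, which is why the derivative vanishes; the computation is formal in the sense that it says nothing about solutions of low regularity.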
In the Euclidean case, (1.2) arises naturally in a number of physical contexts (see, e.g., the recent book of Sulem-Sulem [35] and the references therein). In nonlinear optics, it has important applications to the modelling of laser beams. In this context, a non-Euclidean metric would correspond essentially to a medium with variable optical index. In Quantum Mechanics, the study of Bose-Einstein condensates led to an equation similar to (1.2), where a confining potential is added to the Laplace operator; at least in some regimes, the expected effect of this potential can be very close to a localization on a compact manifold.

Our purpose in this paper is to investigate the influence of the geometry of the manifold $M$ on the dynamics of (1.2). This problem is particularly relevant in view of the infinite propagation speed displayed by the Schrödinger group (1.1), which suggests that one should expect global geometric effects even at small times, particularly if $M$ is compact. More specifically, natural issues about the Cauchy problem (1.2) are
• Definition of dynamics: choice of the phase space, local in time or global in time existence, uniqueness, regularity...
• Qualitative properties of the flow map: in particular, the stability, even for small times, of the evolution with data displaying singular behaviors, such as oscillations, concentration effects, etc.
In order to address such questions, we now introduce the basic notion of uniform wellposedness.

1.1. Uniform wellposedness on Sobolev spaces.

For simplicity, we shall assume from now on that $(M, g)$ belongs to one of the following two classes of complete Riemannian manifolds:
a) Compact manifolds.
b) $M = \mathbb{R}^d$ and $g$ satisfies the following global estimates,
$$\forall x \in \mathbb{R}^d,\ \forall \alpha \in \mathbb{N}^d, \qquad c\,I \le g(x) \le C\,I, \qquad |\partial^\alpha g(x)| \le C_\alpha.$$
Given $s \in \mathbb{R}$, we denote by $H^s(M)$ the space $(1 - \Delta)^{-s/2}(L^2(M))$, which is natural in this context since the linear Schrödinger group (1.1) acts unitarily on it.
Definition 1.1. Let $s \in \mathbb{R}$. We shall say that the nonlinear Schrödinger equation (1.2) is (locally in time) uniformly well posed on $H^s(M)$ if, for any bounded subset $B$ of $H^s(M)$, there exist $T > 0$ and a Banach space $X_T$ continuously embedded in $C([-T, T], H^s(M))$, such that
i) For every Cauchy data $u_0 \in B$, (1.2) has a unique solution $u \in X_T$.
ii) If $u_0 \in H^\sigma(M)$ for $\sigma > s$, then $u \in C([-T, T], H^\sigma(M))$.
iii) The map $u_0 \in B \mapsto u \in X_T$ is uniformly continuous.
This definition calls for several remarks. Firstly, using the Sobolev inequality $H^s(M) \subset L^\infty(M)$ for $s > d/2$, it is easy to prove that (1.2) is uniformly well posed on $H^s(M)$ for such $s$, where $X_T = C([-T, T], H^s(M))$ and the uniform continuity in iii) is in fact Lipschitz continuity.
Secondly, although the above definition only concerns local in time properties, in some cases it can be combined with the conservation laws (1.3) to provide global in time results. Specifically, assume for instance that (1.2) is uniformly well posed on $L^2(M)$. Since the $L^2$ conservation law holds for every solution in $C([-T, T], H^s)$ with $s$ large enough, it follows from requirement ii) and from the continuity part of requirement iii) that this conservation law holds on $[-T, T]$ as soon as $u_0 \in L^2$. Combining this observation with requirement i), we conclude that uniform wellposedness on $L^2$ implies global in time wellposedness on $L^2$, including propagation of regularity. We shall see that this is always the case if $d = 1$. Similarly, one can show that uniform wellposedness on $H^1$ implies global in time wellposedness on $H^1$, including propagation of regularity, under the assumption that a bound on $\|f\|_{L^2}$ and on $H(f)$ is equivalent to a bound on $\|f\|_{H^1}$, which holds if $\varepsilon = 1$ and $d \le 4$.
Finally, let us briefly discuss the meaning of requirement iii) by focusing on the particular case of a sequence $(u_0^N)$ of Cauchy data which are spectrally localized at a high frequency $N$, namely,
$$\mathbf{1}_{[N,2N]}(u_0^N) = u_0^N , \quad N \to \infty . \tag{1.4}$$
In this context, the boundedness of $u_0^N$ in $H^s$ is clearly equivalent to the following information on the size of the $L^2$ norm,
$$\|u_0^N\|_{L^2} \le C N^{-s} . \tag{1.5}$$
Requirement iii) means that for such a sequence, a small perturbation of the data in $H^s$ results in a small perturbation of the corresponding solution in the same space during a fixed time interval; in other words, iii) can be interpreted as a high frequency stability requirement. Moreover, in view of (1.5), the smaller $s$ is, the larger the data satisfying this stability property are allowed to be. Now we can state more precisely the problem we are going to study.
Problem. Find the uniform wellposedness threshold
$$s_c(M, g, \varepsilon) = \inf\{ s \in \mathbb{R} : \text{(1.2) is uniformly well posed on } H^s(M) \} .$$
1.2. Contents of the paper. In the next section, we shall briefly survey the main known results about uniform wellposedness for (1.2) on the Euclidean space: in this case, scaling considerations suggest the value of the critical threshold $s_c$. The key tool for confirming this value is the so-called dispersion property for the Schrödinger group, and a set of space-time a priori inequalities, known as Strichartz inequalities in the literature.
In Section 3, we shall see that the dispersion property fails on any compact manifold; as a consequence, we shall only be able to obtain weak generalizations of Strichartz inequalities on general compact manifolds, involving a fractional loss of derivative, which will imply a rough upper bound for $s_c$. Then we shall compare these general results with the earlier pioneering work of J. Bourgain [2], [3], [6], which concerns the case $M = \mathbb{T}^d$, and where the value of $s_c$ is the Euclidean one. We shall also compare our weak Strichartz inequalities with $L^p$ estimates on eigenfunctions of the Laplacian due to C. Sogge [31], [32], [33], and deduce from the particular case of the sphere a lower bound for the Strichartz loss of derivative.
In Section 4, we focus on the case of the two-dimensional sphere, for which we compute the value of the critical threshold $s_c$, which is larger than the corresponding one on the Euclidean plane or on the torus. The main steps of the proof are
• the construction and the study of high frequency stationary solutions of (1.2) on the sphere concentrating on equators, in the continuation of a paper by A. Weinstein [37];
• a sharp bilinear Strichartz inequality on the sphere, based on a bilinear version of Sogge's inequalities and on the clustering structure of the spectrum of the Laplacian.
Finally, in the last section, we discuss a few generalizations of the results of Section 4 and open problems.
We close this introduction by mentioning that the results described in this paper were obtained in collaboration with N. Burq and N. Tzvetkov (see [8], [9], [10], [14], [16]). We shall not discuss here the case of manifolds with boundary, which is more intricate; however some results can be found in [11], [12], [13].
2. Dispersion and the classical results in the Euclidean case
In this section $M = \mathbb{R}^d$ and $g = g_e$ is the Euclidean metric. Then it is possible to take advantage of the explicit representation of the Schrödinger group,
$$S(t)f(x) = \frac{1}{(4i\pi t)^{d/2}} \int_{\mathbb{R}^d} e^{i|x-y|^2/4t} f(y)\, dy . \tag{2.1}$$
From (2.1), one infers the following dispersion estimate,
$$\|S(t)\|_{L^1(\mathbb{R}^d) \to L^\infty(\mathbb{R}^d)} \le \frac{C}{|t|^{d/2}} . \tag{2.2}$$
This estimate has important consequences for the nonlinear Schrödinger equation. In order to state these consequences, we introduce a standard definition.
Definition 2.1. We shall say that a pair $(p, q) \in [1, \infty] \times [1, \infty]$ is d-admissible if
$$\frac{2}{p} + \frac{d}{q} = \frac{d}{2} , \quad p \ge 2 , \quad (p, q) \ne (2, \infty) .$$
By means of a functional-analytic argument, the dispersion property (2.2) leads to the following "Strichartz estimates" for the Schrödinger group.
Theorem 2.2 (Ginibre-Velo [23], Keel-Tao [28]). If $(p, q)$ is a d-admissible pair, there exists a constant $C$ such that
$$\Big( \int_{\mathbb{R}} \Big( \int_{\mathbb{R}^d} |S(t)f(x)|^q\, dx \Big)^{p/q} dt \Big)^{1/p} \le C \|f\|_{L^2(\mathbb{R}^d)} .$$
For instance, if $d = 2$, the pair $(4, 4)$ is admissible, thus the function given by (2.1) belongs to $L^4(\mathbb{R} \times \mathbb{R}^2)$ as soon as $f \in L^2(\mathbb{R}^2)$. With Strichartz estimates in hand, it is now possible to compute the threshold of uniform wellposedness for (1.2).
Theorem 2.3. For every $d \ge 1$, we have
$$s_c(\mathbb{R}^d, g_e, \varepsilon) = \max\Big( 0, \frac{d}{2} - 1 \Big) .$$
Moreover, if $d = 1$, (1.2) is uniformly well posed on $L^2(\mathbb{R})$.
Notice that the value $s_e = d/2 - 1$ can be guessed by the following scaling considerations: for every $\lambda > 0$, equation (1.2) is invariant under the transformation $u \mapsto u_\lambda$ with $u_\lambda(t, x) = \lambda u(\lambda^2 t, \lambda x)$, and $s_e$ is the real number $s$ such that the homogeneous $\dot H^s$ Sobolev norm is invariant under this transformation.
Theorem 2.3 is the outcome of a long series of contributions. The uniform wellposedness for $s > s_e$ can be seen as a particular case of results by Cazenave-Weissler [18], with a number of earlier contributions including Ginibre-Velo [22], [23] and Kato [27]. The case $s = 0$, $d = 1$ is due to Tsutsumi [36]. The basic approach is to solve (1.2) as the integral equation
$$u(t) = S(t)u_0 - i\varepsilon \int_0^t S(t - \tau)(|u(\tau)|^2 u(\tau))\, d\tau \tag{2.3}$$
and to apply a fixed point theorem in, say,
$$X_T = C([-T, T], H^s(\mathbb{R}^d)) \cap L^p([-T, T], (1 - \Delta)^{-s/2}(L^q(\mathbb{R}^d)))$$
where $(p, q)$ is a suitable d-admissible pair.
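The scaling heuristic behind $s_e$ can be made explicit. The computation below is a sketch which assumes the usual Fourier definition of the homogeneous norm, $\|f\|_{\dot H^s} = \| |\xi|^s \hat f \|_{L^2}$, a convention not spelled out in the text. For $f_\lambda(x) = \lambda f(\lambda x)$ one has $\hat f_\lambda(\xi) = \lambda^{1-d} \hat f(\xi/\lambda)$, hence, after the change of variable $\xi = \lambda\eta$,
$$\|f_\lambda\|_{\dot H^s}^2 = \lambda^{2-2d} \int_{\mathbb{R}^d} |\xi|^{2s} |\hat f(\xi/\lambda)|^2\, d\xi = \lambda^{2+2s-d}\, \|f\|_{\dot H^s}^2 .$$
The norm is therefore invariant for every $\lambda > 0$ exactly when $1 + s - d/2 = 0$, that is $s = d/2 - 1$. For instance this gives $s = 0$ (the $L^2$ norm) if $d = 2$ and $s = 1$ if $d = 4$, in agreement with the two critical problems discussed in this section.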
The lack of uniform wellposedness was first proved in the particular case $d = 1$, $s < 0$, $\varepsilon = -1$ by Kenig-Ponce-Vega [29]; the general case was solved recently by Christ-Colliander-Tao [19], [20]. Finally, let us close this brief survey by mentioning two critical problems:
• If $d = 4$, then $s_c = 1$; in the focusing case $\varepsilon = -1$, existence of solutions blowing up in finite time (see, e.g., Zakharov [38], Glassey [25]) implies that (1.2) is not uniformly well posed on $H^1(\mathbb{R}^4)$. In the defocusing case $\varepsilon = 1$, the problem is still open, except in the radial case, solved by Bourgain [7].
• If $d = 2$, then $s_c = 0$ and we have a similar situation: if $\varepsilon = -1$, (1.2) is not uniformly well posed on $L^2$, due to solutions blowing up in finite time. If $\varepsilon = 1$, the problem is wide open, despite many attempts to solve it.
We refer to Cazenave [17] for a more complete survey of the Euclidean case.
3. The failure of dispersion on compact manifolds
We now assume that $M$ is a compact manifold of dimension $d \ge 1$. Then the dispersion inequality (2.2) is strongly violated for any time $t \ne 0$, since an operator which maps $L^1(M)$ into $L^\infty(M)$ has a kernel in $L^\infty(M \times M)$, therefore in $L^2(M \times M)$ since $M$ is compact. But such an operator is Hilbert-Schmidt on $L^2(M)$, hence is a compact operator, which the unitary operator $S(t)$ cannot be. This obstruction can be made more quantitative as follows: given a function $\chi \in C_0^\infty(\mathbb{R})$ with, say, $\chi(0) = 1$, and a small parameter $h > 0$, let us estimate the norm of the smoothing operator $\chi(h^2 \Delta_g)\, S(t)$ as a map from $L^1(M)$ to $L^\infty(M)$ as $h$ goes to 0. By the above consideration on the kernel of this operator, one obtains the following lower bound,
$$\|\chi(h^2 \Delta_g)\, S(t)\|_{L^1 \to L^\infty} \ge \frac{1}{\mathrm{vol}(M)} \Big( \sum_\lambda |\chi(h^2 \lambda)|^2 \Big)^{1/2} ,$$
where the sum in the right-hand side runs over all $\lambda$ in the spectrum of the Laplacian, repeated according to their multiplicity. Applying Weyl's asymptotics, we infer
$$\|\chi(h^2 \Delta_g)\, S(t)\|_{L^1 \to L^\infty} \ge \frac{C}{h^{d/2}} .$$
As a consequence, the norm of $\chi(h^2 \Delta_g)\, S(t)$ as a map from $L^1(M)$ to $L^\infty(M)$ cannot be bounded by $C|t|^{-d/2}$ if $|t| \gg h$. It turns out that this obstruction is optimal.
Theorem 3.1 ([8]). Given $\chi \in C_0^\infty(\mathbb{R})$, there exist $\theta > 0$, $C > 0$ such that, for all $h \in (0, 1]$,
$$\|\chi(h^2 \Delta)\, S(t)\|_{L^1(M) \to L^\infty(M)} \le \frac{C}{|t|^{d/2}} , \quad |t| \le \theta h .$$
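The Weyl-type lower bound stated before Theorem 3.1 can be illustrated numerically on the sphere $S^2$, where the eigenvalues of $-\Delta$ are $n(n+1)$ with multiplicity $2n+1$. The sketch below is only an illustration: the particular cutoff `bump` and the helper names are hypothetical choices, not taken from [8]. It evaluates the right-hand side of the lower bound and checks that it grows like $h^{-d/2} = h^{-1}$.

```python
import math

def bump(x):
    """Smooth cutoff supported in (-1, 1) with bump(0) = 1 (an assumed choice of chi)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(1.0 - 1.0 / (1.0 - x * x))

def kernel_lower_bound(h):
    """(1 / vol(S^2)) * sqrt(sum of |chi(h^2 * lambda)|^2 over the spectrum of S^2)."""
    vol = 4.0 * math.pi
    total = 0.0
    n = 0
    # Eigenvalue n(n+1) of -Delta on S^2 has multiplicity 2n+1;
    # terms with h^2 * n(n+1) >= 1 vanish because of the support of bump.
    while h * h * n * (n + 1) < 1.0:
        total += (2 * n + 1) * bump(h * h * n * (n + 1)) ** 2
        n += 1
    return math.sqrt(total) / vol

def scaled_bounds(hs):
    """Multiply the bound by h^(d/2) = h (d = 2); the result should be nearly constant."""
    return [h * kernel_lower_bound(h) for h in hs]
```

Calling `scaled_bounds([0.02, 0.01, 0.005])` returns three nearly equal numbers, which is the numerical counterpart of the bound $C h^{-d/2}$ in dimension $d = 2$.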
The main argument in the proof of Theorem 3.1 is that, for Cauchy data localized at a frequency $h^{-1}$ and for $|t| \le \theta h$, the linear Schrödinger equation is a semiclassical equation, therefore its solution can be represented by means of a Fourier integral operator. Applying the stationary phase formula to the kernel of this operator yields the theorem. As a consequence of Theorem 3.1, we obtain the following weak generalizations of Strichartz estimates.
Theorem 3.2 ([34], [8]). Given a d-admissible pair $(p, q)$ and $I \subset\subset \mathbb{R}$,
$$\Big( \int_I \Big( \int_M |S(t)f(x)|^q\, dx \Big)^{p/q} dt \Big)^{1/p} \le C(I)\, \|f\|_{H^{1/p}(M)} .$$
In comparison with Theorem 2.2, notice the loss of $1/p$ derivative in the right-hand side of the above estimate, which is a consequence of the bad dispersive properties of the Schrödinger group. If $p = 2$ and $d \ge 3$, this loss can be shown to be optimal if $M$ is the sphere. Arguing as in the Euclidean case, these estimates have consequences on the wellposedness theory for (1.2).
Corollary 3.3 ([8]). For every compact Riemannian manifold $(M, g)$ of dimension $d$, we have
$$s_c(M, g, \varepsilon) \le \frac{d - 1}{2} .$$
Compared with Theorem 2.3, this result seems to be a rough bound. In particular, in the case of dimension 3, it only asserts the uniform wellposedness in $H^s$ for $s > 1$, which barely misses the energy threshold. In fact, the situation is a little better: by using Theorem 3.2 in a finer way, logarithmic estimates are derived in [8] so that, in the case $\varepsilon = 1$, the Cauchy problem (1.2) has unique global solutions in $C(\mathbb{R}, H^1(M))$ for every data $u_0 \in H^1(M)$, with propagation of the regularity and continuity of the flow map $u_0 \in H^1 \mapsto u \in C([-T, T], H^1)$ for all $T$. However, we do not know if this map is uniformly continuous on bounded subsets of $H^1$. It is therefore tempting to investigate the optimality of Corollary 3.3 in particular cases of compact manifolds. The first natural example is of course the torus $\mathbb{T}^d = \mathbb{R}^d / \mathbb{Z}^d$ endowed with the standard metric.
3.1. The case of tori. In the series of papers [2], [3], Bourgain investigated the wellposedness of the nonlinear Schrödinger equation on $\mathbb{T}^d$. His approach to (1.2) is different from the one described above. On the one hand, he does not try to improve the general Strichartz estimates of Theorem 3.2, but only uses the space-time norm $L^4(I \times \mathbb{T}^d)$ of $S(t)f$, for which he gets, by different methods, essentially the same estimates as in the Euclidean space. For instance, if $d = 2$, he proves
$$\Big( \int_{\mathbb{T}} \int_{\mathbb{T}^2} |S(t)f(x)|^4\, dx\, dt \Big)^{1/4} \le C_\delta\, \|f\|_{H^\delta(\mathbb{T}^2)} ,$$
for every $\delta > 0$. On the other hand, these new estimates are refined by involving a new set of Banach spaces $X_T = X_T^{s,b}$ which are adapted to the symbol of the linear Schrödinger equation. We refer to subsection 4.3 below for more details about these spaces. Of course the arguments make strong use of the algebraic properties of the Fourier decomposition. Bourgain's results in the particular case of (1.2) can be rephrased as follows.
Theorem 3.4 (Bourgain [2], [6]). For every $d \ge 1$, the cubic nonlinear Schrödinger equation (1.2) is uniformly well posed on $H^s(\mathbb{T}^d)$, for every
$$s > \max\Big( 0, \frac{d}{2} - 1 \Big) .$$
Moreover, if $d = 1$, it is uniformly well posed on $L^2(\mathbb{T})$.
Combining these results with recent illposedness results on the torus (see [9], [20]), we conclude that Theorem 2.3 is still true if one replaces the Euclidean space $\mathbb{R}^d$ by the torus $\mathbb{T}^d$. In other words, no geometric effect is detected by comparing the value of $s_c$ on the Euclidean space and on the torus. It should be stressed that this result is by no means an adaptation of the methods in the Euclidean space. In fact, if instead one deals with a multidimensional torus with irrational sides, the situation is far from being understood. A very recent work by Bourgain [5] shows for instance that, if
$$M = (\mathbb{R}/\theta_1 \mathbb{Z}) \times (\mathbb{R}/\theta_2 \mathbb{Z}) \times (\mathbb{R}/\theta_3 \mathbb{Z}) ,$$
where $\theta_1, \theta_2, \theta_3$ are arbitrary positive real numbers, then
$$s_c(M, \varepsilon) \le \frac{2}{3} .$$
3.2. Eigenfunction estimates. Another way of testing the optimality of our general results on compact manifolds is to compare the Strichartz estimates of Theorem 3.2 with $L^q$ estimates for eigenfunctions of the Laplacian. Indeed, if $\varphi$ satisfies
$$\Delta_g \varphi + \lambda \varphi = 0 \tag{3.1}$$
with, say, $\lambda \ge 1$, then the estimates of Theorem 3.2 applied to $f = \varphi$ imply
$$\|\varphi\|_{L^q(M)} \le C\, \big(\sqrt{\lambda}\big)^{\frac{d}{4} - \frac{d}{2q}}\, \|\varphi\|_{L^2(M)} , \quad 2 \le q \le \frac{2d}{d - 2} , \ q < \infty .$$
These inequalities are to be compared with the following result.
Theorem 3.5 (Sogge [31], [32], [33]). If $\varphi$ satisfies (3.1) with $\lambda \ge 1$, then
$$\|\varphi\|_{L^q(M)} \le C\, \big(\sqrt{\lambda}\big)^{s(q)}\, \|\varphi\|_{L^2(M)} , \quad 2 \le q \le \infty ,$$
with
$$s(q) = \begin{cases} \dfrac{d-1}{2} \Big( \dfrac{1}{2} - \dfrac{1}{q} \Big) & \text{if } 2 \le q \le \dfrac{2(d+1)}{d-1} , \\[2mm] \dfrac{d-1}{2} - \dfrac{d}{q} & \text{if } \dfrac{2(d+1)}{d-1} \le q \le \infty . \end{cases}$$
Moreover, the exponent $s(q)$ is optimal for every $q$ if $M$ is a sphere.
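To compare the two families of exponents above, the following sketch evaluates Sogge's exponent $s(q)$ of Theorem 3.5 and the exponent $d/4 - d/(2q)$ deduced from Theorem 3.2, both as functions of $1/q$ in exact rational arithmetic. The function names are hypothetical and chosen for this illustration only.

```python
from fractions import Fraction

def sogge_exponent(d, inv_q):
    """Exponent s(q) of Theorem 3.5, as a function of 1/q (inv_q = 0 means q = infinity)."""
    inv_q = Fraction(inv_q)
    if not (0 <= inv_q <= Fraction(1, 2)):
        raise ValueError("need 2 <= q <= infinity, i.e. 0 <= 1/q <= 1/2")
    half = Fraction(d - 1, 2)
    inv_q_crit = Fraction(d - 1, 2 * (d + 1))  # 1/q at q = 2(d+1)/(d-1)
    if inv_q >= inv_q_crit:
        # range 2 <= q <= 2(d+1)/(d-1)
        return half * (Fraction(1, 2) - inv_q)
    # range 2(d+1)/(d-1) <= q <= infinity
    return half - d * inv_q

def strichartz_exponent(d, inv_q):
    """Exponent d/4 - d/(2q) of sqrt(lambda) obtained from Theorem 3.2 with f an eigenfunction."""
    return Fraction(d, 4) - Fraction(d, 2) * Fraction(inv_q)
```

For $d = 2$ and $q = 4$ this gives $s(4) = 1/8$ against $1/4$ from Theorem 3.2, while for $d = 3$ and $q = 2d/(d-2) = 6$ both exponents equal $1/2$.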
[Figure 1. The Sogge diagram in dimension 2: the exponent $s(q)$ plotted against $1/q$.]
Notice that $s(q) > 0$ for every $q > 2$, hence the optimality in Theorem 3.5 implies that in general one cannot avoid a positive loss of derivative in Strichartz estimates. However it is not clear whether the loss in Theorem 3.2 is optimal, except in the particular case $q = 2d/(d - 2)$, $d \ge 3$, for which $s(q) = (d - 1)/2 - d/q = 1/2$. Figure 1 plots the function $1/q \mapsto s(q)$ when $d = 2$.
4. An optimal result on the two-dimensional sphere
This section is devoted to the computation of $s_c(M, g, \varepsilon)$ if $(M, g)$ is the standard two-dimensional sphere $S^2$.
Theorem 4.1 ([9], [14]).
$$s_c(S^2, \varepsilon) = \frac{1}{4} .$$
Notice the contrast with the above results on $\mathbb{R}^2$ and $\mathbb{T}^2$, which assert $s_c = 0$, and with Corollary 3.3, which only gives $s_c \le 1/2$. In the rest of this section we describe the main steps of the proof of Theorem 4.1.
4.1. High frequency instability for $0 \le s < \frac{1}{4}$. In this subsection we prove that uniform continuity in requirement iii) of Definition 1.1 cannot hold. The idea is to study the solution of (1.2) for a family of high frequency Cauchy data displaying a strong concentration in $L^4$ norm. Referring to Sogge [31], it is known that the strongest concentration in $L^4$ norm of spherical harmonics on $S^2$ is displayed by the following
$$\psi_n(x) = (x_1 + i x_2)^n \tag{4.1}$$
where $(x_1, x_2, x_3)$ are Cartesian coordinates on $\mathbb{R}^3$ and
$$S^2 = \{ (x_1, x_2, x_3) \in \mathbb{R}^3 : x_1^2 + x_2^2 + x_3^2 = 1 \} .$$
Notice that $\psi_n$ is concentrated on the equator $\{ x_3 = 0 \}$. Moreover, it is easy to check that
$$\|\psi_n\|_{L^4} \sim n^{1/8}\, \|\psi_n\|_{L^2}$$
as $n \to \infty$, which is consistent with $s(4) = 1/8$ in Theorem 3.5. In [9] (see also Banica [1] for more precise results) we proved the instability by finding an ansatz for the solution of (1.2) with $u(0) = c_n \psi_n$, where $c_n$ is a normalisation factor chosen so that $u(0)$ is of order 1 in $H^s$. Here we propose a new proof of the instability based on the construction of stationary solutions to (1.2). For simplicity, we deal with the defocusing case $\varepsilon = 1$. The starting point is to observe that $\psi_n$ is the ground state of $-\Delta$ restricted to the space
$$L^2_n(S^2) = \{ f \in L^2(S^2) : \forall \alpha \in \mathbb{R}, \ f \circ R_\alpha = e^{in\alpha} f \} ,$$
where $R_\alpha$ denotes the rotation of angle $\alpha$ around the $x_3$ axis. For every $\delta > 0$, let us minimize the energy
$$H(f) = \int_{S^2} \Big( |\nabla f|^2 + \frac{1}{2} |f|^4 \Big)$$
on the sphere of radius $\delta$ in $L^2_n(S^2)$. By using the Rellich theorem, it is easy to prove the existence of a minimizer $f_n$. The Euler equation reads
$$-\Delta f_n + |f_n|^2 f_n = \omega_n f_n$$
for some number $\omega_n$, so that $u_n(t, x) = e^{-it\omega_n} f_n(x)$ is a solution of (1.2). Notice that
$$\omega_n = \frac{1}{\delta^2} \int_{S^2} (|\nabla f_n|^2 + |f_n|^4) > 0 . \tag{4.2}$$
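The growth $\|\psi_n\|_{L^4} \sim n^{1/8} \|\psi_n\|_{L^2}$ stated above can be checked with closed formulas. Since $|\psi_n|^2 = (1 - x_3^2)^n$ on $S^2$ and the surface measure is $dx_3\, d\phi$ in the height and longitude variables, one gets $\int_{S^2} |\psi_n|^p = 2\pi \sqrt{\pi}\, \Gamma(a + 1)/\Gamma(a + 3/2)$ with $a = np/2$. The sketch below (function names are hypothetical) evaluates the ratio of norms and estimates its growth exponent between $n$ and $2n$.

```python
import math

def log_norm_power(n, p):
    """log of the integral of |psi_n|^p over S^2, where psi_n = (x1 + i x2)^n."""
    a = n * p / 2.0
    # integral of (1 - z^2)^a over [-1, 1] equals sqrt(pi) * Gamma(a + 1) / Gamma(a + 3/2)
    return (math.log(2.0 * math.pi) + 0.5 * math.log(math.pi)
            + math.lgamma(a + 1.0) - math.lgamma(a + 1.5))

def l4_over_l2(n):
    """Ratio of the L^4 norm to the L^2 norm of psi_n."""
    return math.exp(log_norm_power(n, 4) / 4.0 - log_norm_power(n, 2) / 2.0)

def growth_exponent(n):
    """Slope of log(ratio) against log(n), measured between n and 2n."""
    return math.log(l4_over_l2(2 * n) / l4_over_l2(n)) / math.log(2.0)
```

For large $n$, `growth_exponent(n)` approaches $1/8$, in agreement with $s(4) = 1/8$.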
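Let $\varphi_n = c_n \psi_n$ with $c_n > 0$ such that $\|\varphi_n\|_{L^2} = \delta$. As $n$ goes to $\infty$, it turns out that $f_n$ is well approximated by $\varphi_n$.
Lemma 4.2. There exist $C > 0$ and $n_0$ such that, for every $n \ge n_0$, for every $\delta \in\, ]0, 1]$,
$$\|f_n - e^{i\alpha_n} \varphi_n\|_{H^s} \le C\, n^{s - \frac{1}{4}}\, \delta^2 \tag{4.3}$$
for some $\alpha_n \in \mathbb{R}$ and for every $s \in [0, 1]$, and
$$\Big| \omega_n - \frac{1}{\delta^2} \int_{S^2} (|\nabla \varphi_n|^2 + |\varphi_n|^4) \Big| \le C \delta^4 . \tag{4.4}$$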
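Notice that
$$\int_{S^2} |\nabla \varphi_n|^2 = n(n + 1)\, \delta^2 \ ; \quad \int_{S^2} |\varphi_n|^4 = c_0\, \delta^4 \sqrt{n} + O\Big( \frac{\delta^4}{\sqrt{n}} \Big) ,$$
so that estimate (4.4) of Lemma 4.2 above means that
$$\omega_n = n(n + 1) + c_0\, \delta^2 \sqrt{n} + O\Big( \delta^4 + \frac{\delta^2}{\sqrt{n}} \Big) . \tag{4.5}$$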
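Proof of Lemma 4.2. From the definition of $f_n$ and $\varphi_n$ we have
$$\int_{S^2} |\nabla f_n|^2 \ge \int_{S^2} |\nabla \varphi_n|^2 , \quad H(f_n) \le H(\varphi_n) ,$$
therefore
$$\theta_n := \int_{S^2} (|\nabla \varphi_n|^2 + |\varphi_n|^4) - \delta^2 \omega_n \ge 0 .$$
Let us decompose
$$f_n = z_n \varphi_n + q_n \tag{4.6}$$
with $q_n \perp \varphi_n$. Then, computing $\|f_n\|_{L^2}^2$ and $\|\nabla f_n\|_{L^2}^2$, we get
$$\delta^2 = |z_n|^2 \delta^2 + \|q_n\|_{L^2}^2 \ ; \quad \delta^2 \omega_n = |z_n|^2\, n(n + 1)\, \delta^2 + \|\nabla q_n\|_{L^2}^2 + \int_{S^2} |f_n|^4 . \tag{4.7}$$
Combining these two identities yields
$$\theta_n = \int_{S^2} |\varphi_n|^4 - \int_{S^2} |f_n|^4 - \big( \|\nabla q_n\|_{L^2}^2 - n(n + 1) \|q_n\|_{L^2}^2 \big) . \tag{4.8}$$
Set $H_n(q_n) = \|\nabla q_n\|_{L^2}^2 - n(n + 1) \|q_n\|_{L^2}^2$. Using $\theta_n \ge 0$, we infer from (4.8) the a priori bound
$$H_n(q_n) \le \int_{S^2} |\varphi_n|^4 \le B\, \delta^4 \sqrt{n} . \tag{4.9}$$
Further, let us transform the expression (4.8) of $\theta_n$ as
$$\theta_n \le (1 - |z_n|^4) \int_{S^2} |\varphi_n|^4 + C_1 \int_{S^2} (|\varphi_n|^3 |q_n| + |\varphi_n| |q_n|^3) - H_n(q_n) \le C_2 \big( \delta^3 \|q_n\|_{L^2} \sqrt{n} + \delta \|q_n\|_{L^6}^3 \big) - H_n(q_n) , \tag{4.10}$$
where, in the second inequality, we used the a priori bounds on $\varphi_n$ and the first identity in (4.7). Now we estimate $q_n$ by taking advantage of the fact that $\varphi_n$ is the unique ground state of $-\Delta$ on $L^2_n$. Let us decompose
$$q_n = \sum_{k=1}^\infty h_{n,k}$$
where $h_{n,k}$ is a spherical harmonic of degree $n + k$, so that
$$H_n(q_n) = \sum_{k=1}^\infty \big( (n + k)(n + k + 1) - n(n + 1) \big) \|h_{n,k}\|_{L^2}^2 .$$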
Then
$$\|q_n\|_{L^2} = \Big( \sum_{k=1}^\infty \|h_{n,k}\|_{L^2}^2 \Big)^{1/2} \le \frac{1}{\sqrt{n}}\, (H_n(q_n))^{1/2}$$
so that
$$\delta^3 \|q_n\|_{L^2} \sqrt{n} \le C_2\, \delta^6 + \frac{1}{4 C_2} H_n(q_n) , \tag{4.11}$$
and, using again the first identity in (4.7) and the a priori bound (4.9) on $H_n(q_n)$,
$$0 \le 1 - |z_n| \le B\, n^{-1/2} \delta^2 . \tag{4.12}$$
Similarly,
$$\forall s \in [0, 1] , \quad \|q_n\|_{H^s} \le n^{s - 1/2} (H_n(q_n))^{1/2} \le B\, n^{s - 1/4} \delta^2 . \tag{4.13}$$
Plugging (4.12) and (4.13) into (4.6), we obtain (4.3), where $\alpha_n$ denotes the argument of $z_n$. Furthermore, by using Sogge's estimates of Theorem 3.5,
$$\|q_n\|_{L^6} \le \sum_{k=1}^\infty \|h_{n,k}\|_{L^6} \le C_3 \sum_{k=1}^\infty (n + k)^{1/6} \|h_{n,k}\|_{L^2} \le C_3 \Big( \sum_{k=1}^\infty \frac{(n + k)^{1/3}}{(n + k)(n + k + 1) - n(n + 1)} \Big)^{1/2} (H_n(q_n))^{1/2} \le C_4\, n^{-1/3} (\log(n))^{1/2} (H_n(q_n))^{1/2} . \tag{4.14}$$
Using again the a priori bound (4.9) we infer
$$\delta \|q_n\|_{L^6}^3 \le C_5\, \delta^3 n^{-3/4} (\log(n))^{3/2} H_n(q_n) . \tag{4.15}$$
Plugging (4.11) and (4.15) into (4.10), we deduce that, for $n \ge n_0$ and $\delta \le 1$, $\theta_n \le C \delta^6$, which completes the proof of (4.4).
Let us complete the proof of the high frequency instability. First we observe that, multiplying $f_n$ by a phase factor, we may assume $\alpha_n = 0$ in (4.3). We define $f_n$ and $\tilde f_n$ corresponding respectively to the following values of $\delta$,
$$\delta = n^{-s} , \quad \tilde\delta = \kappa_n n^{-s}$$
where $\kappa_n$ goes to 1 as $n \to \infty$ in a way to be defined below. Then the a priori bounds on $f_n$ and $\tilde f_n$ imply
$$\|f_n\|_{L^2} + \|\tilde f_n\|_{L^2} \le C n^{-s} \ ; \quad \|\nabla f_n\|_{L^2} + \|\nabla \tilde f_n\|_{L^2} \le C n^{1-s} ,$$
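so that $f_n$ and $\tilde f_n$ are bounded in $H^s$, and, by (4.3),
$$\|f_n - \tilde f_n\|_{H^s} = O(|\kappa_n - 1| + n^{-s - \frac{1}{4}}) .$$
The corresponding solutions of (1.2) are
$$u_n(t) = e^{-it\omega_n} f_n \ ; \quad \tilde u_n(t) = e^{-it\tilde\omega_n} \tilde f_n$$
and therefore
$$\|u_n(t) - \tilde u_n(t)\|_{H^s} \ge |e^{-it\omega_n} - e^{-it\tilde\omega_n}|\, \|f_n\|_{H^s} - O(|\kappa_n - 1| + n^{-s - \frac{1}{4}}) .$$
Since $f_n \in L^2_n$, we have $\|f_n\|_{H^s} \ge n^s \|f_n\|_{L^2} = 1$ and, taking advantage of Lemma 4.2 through (4.5),
$$\omega_n - \tilde\omega_n = c_0 (1 - \kappa_n^2)\, n^{1/2 - 2s} + O(n^{-4s}) .$$
Since $s < 1/4$, one can choose $\kappa_n$ such that $(1 - \kappa_n^2)\, n^{1/2 - 2s} \to \infty$, which implies that, for every $T > 0$,
$$\liminf_{n \to \infty}\ \sup_{|t| \le T} \|u_n(t) - \tilde u_n(t)\|_{H^s} > 0 ,$$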
contradicting requirement iii) of Definition 1.1.
Remark 4.3. a) In fact, the above proof shows that, for every $t \ne 0$, the flow map $u_0 \in H^s \mapsto u(t) \in H^s$ is not uniformly continuous on bounded subsets of $H^s$.
b) By using the Agmon inequalities (see, e.g., Helffer [26]), one can prove that, like $\varphi_n$, $f_n$ enjoys an exponential localization near the equator $\{ x_3 = 0 \}$, namely
$$|f_n(x)| \le C \delta\, n^\beta e^{-\alpha n x_3^2} \tag{4.16}$$
for some $C, \alpha, \beta > 0$.
c) If $s < 0$, an adaptation of the arguments of Christ-Colliander-Tao [20] (see also the appendix of [16]) allows one to show that, for any $T > 0$, the map $u_0 \in H^s \mapsto u \in C([-T, T], H^s)$ cannot be continuous at $u_0 = 0$.
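Returning to the choice of $\kappa_n$ in the instability proof of this subsection, one admissible choice can be written down explicitly. The following is only an illustrative sketch (the parameter $\gamma$ is introduced here and does not appear in the original argument), valid for $0 \le s < 1/4$. Fix $\gamma$ with $0 < \gamma < \frac{1}{2} - 2s$ and set
$$\kappa_n^2 = 1 - n^{-\gamma} , \quad \text{so that} \quad \kappa_n \to 1 \quad \text{and} \quad (1 - \kappa_n^2)\, n^{1/2 - 2s} = n^{1/2 - 2s - \gamma} \to \infty .$$
Since $s \ge 0$, the remainder $O(n^{-4s})$ in the expansion of $\omega_n - \tilde\omega_n$ is bounded, so $|\omega_n - \tilde\omega_n| \to \infty$ (assuming $c_0 \ne 0$, as the expansion (4.5) suggests). Moreover
$$|e^{-it\omega_n} - e^{-it\tilde\omega_n}| = 2 \Big| \sin\Big( \frac{t (\omega_n - \tilde\omega_n)}{2} \Big) \Big| ,$$
whose supremum over $|t| \le T$ equals 2 as soon as $T |\omega_n - \tilde\omega_n| \ge \pi$. Hence $\sup_{|t| \le T} \|u_n(t) - \tilde u_n(t)\|_{H^s} \ge 2 - o(1)$, while $\|u_n(0) - \tilde u_n(0)\|_{H^s} = O(|\kappa_n - 1| + n^{-s - 1/4}) \to 0$.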
4.2. Uniform wellposedness for $s > \frac{1}{4}$. In this subsection, we prove the second part of Theorem 4.1. We first introduce the useful notion of bilinear Strichartz estimate on a compact Riemannian manifold $(M, g)$. Given such a manifold, for every dyadic integer $N$, we introduce the spectral dyadic projector
$$P_N = \mathbf{1}_{[N, 2N[}(\sqrt{-\Delta}) .$$
Definition 4.4. We shall say that the Schrödinger group satisfies a bilinear Strichartz estimate of loss $\sigma_0 \ge 0$ on $M$ if there exists $C > 0$ such that, for all $L^2$ functions $f, \tilde f$ on $M$ and all dyadic integers $N, L$,
$$\Big( \int_{[0,1] \times M} |S(t)(P_N f)(x)\ S(t)(P_L \tilde f)(x)|^2\, dx\, dt \Big)^{1/2} \le C\, (\min(N, L))^{2\sigma_0}\, \|f\|_{L^2} \|\tilde f\|_{L^2} .$$
Notice that by setting $\tilde f = f$, $L = N$ and by using the Littlewood-Paley inequality, one shows easily that a bilinear Strichartz estimate of loss $\sigma_0$ implies a Strichartz-type estimate of the space-time $L^4$ norm of a solution to the linear Schrödinger equation in terms of the $H^{\sigma_0}$ norm of the Cauchy data. However, if $\sigma_0 > 0$, a bilinear Strichartz estimate says more, since the price to pay for estimating the $L^2$ norm of a product of such solutions only involves the lowest frequency of these solutions. This fact is crucial in the wellposedness theory of the cubic nonlinear Schrödinger equation.
Proposition 4.5 ([14]). If the Schrödinger group satisfies a bilinear Strichartz estimate of loss $\sigma_0$ on a manifold $M$, then (1.2) is uniformly well posed on $H^s(M)$ for every $s > 2\sigma_0$.
The proof of this proposition is a generalization to every manifold of Bourgain's approach on the torus (see also Klainerman-Machedon [30] in the context of the wave equation and null quadratic forms). The main idea is to introduce the scale of Hilbert spaces
$$X^{s,b}(\mathbb{R} \times M) = \{ v \in \mathcal{S}'(\mathbb{R} \times M) : (1 + |i\partial_t + \Delta|^2)^{b/2} (1 - \Delta)^{s/2} v \in L^2(\mathbb{R} \times M) \}$$
for $s, b \in \mathbb{R}$. Denoting by $X_T^{s,b}$ the space of restrictions of elements of $X^{s,b}(\mathbb{R} \times M)$ to $]-T, T[\, \times M$, it is easy to observe that
$$\forall b > \frac{1}{2} , \quad X_T^{s,b} \subset C([-T, T], H^s(M))$$
and that
$$\forall f \in H^s(M) , \ \forall b > 0 , \quad (t, x) \mapsto S(t)f(x) \in X_T^{s,b} .$$
Moreover, the Duhamel term in the integral equation (2.3) can be handled by means of these spaces as
$$\Big\| \int_0^t S(t - \tau) f(\tau)\, d\tau \Big\|_{X_T^{s,b}} \le C\, T^{1 - b - b'}\, \|f\|_{X_T^{s,-b'}}$$
if $0 < T \le 1$, $0 < b' < \frac{1}{2} < b$, $b + b' < 1$. We refer to [24] for a pedagogical introduction to this strategy. The crux of the proof of Proposition 4.5 is then to observe that a bilinear Strichartz estimate of loss $\sigma_0$ implies, for $\sigma \ge s > 2\sigma_0$ and suitable $b, b'$ as above,
$$\|v_1 v_2 v_3\|_{X^{s,-b'}} \le C\, \|v_1\|_{X^{s,b}} \|v_2\|_{X^{s,b}} \|v_3\|_{X^{s,b}} ,$$
$$\| |v|^2 v \|_{X^{\sigma,-b'}} \le C\, \|v\|_{X^{s,b}}^2 \|v\|_{X^{\sigma,b}} ,$$
which allows the use of a fixed point argument in $X_T^{s,b}$ in the resolution of the integral equation (2.3).
Using Proposition 4.5, one can recover the information on $s_c$ already obtained in the previous sections. Indeed, some of the Fourier series estimates of Bourgain [2], [6] can be rephrased as bilinear Strichartz estimates of loss $\sigma_0 > (d - 2)/4$ on the torus $\mathbb{T}^d$ (see Theorem 3.4). On an arbitrary compact manifold $M$, combining the Strichartz inequalities of Theorem 3.2 with the Sobolev inequalities, one shows easily bilinear Strichartz inequalities of loss $\sigma_0 > (d - 1)/4$, which yields Corollary 3.3. The rest of the proof of Theorem 4.1 lies in an improvement of these bilinear estimates in the case of $S^2$.
Proposition 4.6 ([14]). On $S^2$, the Schrödinger group satisfies a bilinear Strichartz estimate of any loss $\sigma_0 > 1/8$.
The proof of Proposition 4.6 is based on two ingredients. A first step consists in proving the following bilinear version of Sogge's $L^4$ inequality: if $H_n$, $\tilde H_\ell$ are spherical harmonics of degree $n, \ell \ge 1$,
$$\|H_n \tilde H_\ell\|_{L^2(S^2)} \le C\, (\min(n, \ell))^{\frac{1}{4}}\, \|H_n\|_{L^2(S^2)} \|\tilde H_\ell\|_{L^2(S^2)} . \tag{4.17}$$
This inequality is in fact true on any compact surface, and follows from similar properties for the approximate spectral projectors $\chi(\sqrt{-\Delta} - N)$, where $\chi \in \mathcal{S}(\mathbb{R})$. We refer to [14], [15] and [16] for different proofs. The second step takes advantage of the clustering property of the spectrum on the sphere. Indeed,
$$S(t)(P_N f) = \sum_{N/2 \le n \le 2N} e^{-itn(n+1)} H_n , \quad S(t)(P_L \tilde f) = \sum_{L/2 \le \ell \le 2L} e^{-it\ell(\ell+1)} \tilde H_\ell .$$
Using the Parseval formula in the time variable and introducing, for each integer $\tau$, the set
$$\Lambda_{NL}(\tau) = \Big\{ (n, \ell) : \frac{N}{2} \le n \le 2N , \ \frac{L}{2} \le \ell \le 2L , \ n(n + 1) + \ell(\ell + 1) = \tau \Big\} ,$$
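we obtain
$$\|S(t)(P_N f)\ S(t)(P_L \tilde f)\|_{L^2((0, 2\pi)_t \times S^2)} = \Big( \sum_\tau \Big\| \sum_{(n, \ell) \in \Lambda_{NL}(\tau)} H_n \tilde H_\ell \Big\|_{L^2(S^2)}^2 \Big)^{1/2} \le \big( \sup_\tau \#\Lambda_{NL}(\tau) \big)^{1/2} \Big( \sum_{n, \ell} \|H_n \tilde H_\ell\|_{L^2(S^2)}^2 \Big)^{1/2} \le C\, \big( \sup_\tau \#\Lambda_{NL}(\tau) \big)^{1/2} (\min(N, L))^{1/4}\, \|f\|_{L^2} \|\tilde f\|_{L^2} ,$$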
where we used (4.17) in the last inequality. Proposition 4.6 is then a consequence of the elementary number-theoretic estimate
$$\forall \delta > 0 , \quad \#\Lambda_{NL}(\tau) \le C_\delta\, (\min(N, L))^\delta .$$
5. Open problems and generalizations
5.1. The case of surfaces. In dimension 2, the instability result of subsection 4.1 can be generalized to a large class of manifolds, including revolution surfaces with a nondegenerate equator. Moreover, the exponential localization (4.16) of the stationary solution allows one to extend this instability to non-compact surfaces admitting a subdomain which is isometric to a neighborhood of the equator in such a revolution surface. On the other hand, the proof of uniform wellposedness in subsection 4.2 extends easily to any Zoll surface, since it enjoys the same spectral clustering properties as the sphere. Of course many open questions remain. For instance, notice that the rough bound of Corollary 3.3 gives $s_c \le 1/2$, while in all the examples of surfaces we studied, we were able to prove that $s_c \le 1/4$. A natural open question is thus: does there exist a surface $M$ such that $s_c(M) > 1/4$? Another wide open question is the evaluation of $s_c$ for negatively curved compact surfaces.
5.2. Higher dimensions. In dimension 3, following the same ideas as in section 4, one can prove that $s_c(S^3) = 1/2$ (see [16]). Notice that this is the same value as on $\mathbb{T}^3$ and $\mathbb{R}^3$. However differences occur if other types of nonlinearities in the right-hand side of (1.2) are considered. It is also possible to prove that $s_c(S^2 \times S^1) \le 3/4$ (see [16]) but the exact value is not known. Apart from these examples and the ones studied by Bourgain (see section 3), we do not know whether uniform wellposedness in the energy space $H^1$ holds on a three-dimensional manifold. In dimension $d \ge 4$, again it is possible to prove that
$$s_c(S^d) = \frac{d}{2} - 1 = s_c(\mathbb{T}^d) = s_c(\mathbb{R}^d) .$$
As in dimension 3, the geometric effects of the sphere can be seen with subcubic nonlinearities. To our knowledge the critical threshold has not been computed on any other high dimensional manifold.
5.3. Critical problems. As on Euclidean spaces and on tori, critical problems on spheres are wide open: we do not know if (1.2) is uniformly well posed on $H^{1/4}(S^2)$, $H^{1/2}(S^3)$, $H^1(S^4)$. However, in the two latter cases, it is possible to prove that bilinear Strichartz inequalities with the critical loss fail (see Theorem 4 in [8]). Using Remark 2.12 in [14], we conclude that the flow map $u_0 \mapsto u$ cannot be $C^3$ at 0, which is in strong contrast with the Euclidean case. Hence the study of the critical Cauchy problem for small data is certainly a challenging issue.
References
[1] V. Banica. On the nonlinear Schrödinger dynamics on $S^2$. J. Math. Pures Appl., 83: 77–98, 2004.
[2] J. Bourgain. Fourier transform restriction phenomena for certain lattice subsets and application to nonlinear evolution equations I. Schrödinger equations. Geom. and Funct. Anal., 3: 107–156, 1993.
[3] J. Bourgain. Exponential sums and nonlinear Schrödinger equations. Geom. and Funct. Anal., 3: 157–178, 1993.
[4] J. Bourgain. Eigenfunction bounds for the Laplacian on the n-torus. Internat. Math. Res. Notices, 3: 61–66, 1993.
[5] J. Bourgain. Remarks on Strichartz' inequalities on irrational tori. Personal communication, 2004.
[6] J. Bourgain. Global Solutions of Nonlinear Schrödinger equations. Colloq. Publications, American Math. Soc., 1999.
[7] J. Bourgain. Global wellposedness of defocusing critical nonlinear Schrödinger equation in the radial case. J. Amer. Math. Soc., 12: 145–171, 1999.
[8] N. Burq, P. Gérard and N. Tzvetkov. Strichartz inequalities and the nonlinear Schrödinger equation on compact manifolds. Amer. J. Math., 126-3: 569–605, 2004.
[9] N. Burq, P. Gérard and N. Tzvetkov. An instability property of the nonlinear Schrödinger equation on $S^d$. Math. Res. Lett., 9(2-3): 323–335, 2002.
[10] N. Burq, P. Gérard and N. Tzvetkov. The Cauchy problem for the nonlinear Schrödinger equation on compact manifolds. J. Nonlinear Math. Physics, 10: 12–27, 2003.
[11] N. Burq, P. Gérard and N. Tzvetkov. Two singular dynamics of the nonlinear Schrödinger equation on a plane domain. Geom. Funct. Anal., 13: 1–19, 2003.
[12] N. Burq, P. Gérard and N. Tzvetkov. An example of singular dynamics for the nonlinear Schrödinger equation on bounded domains. Hyperbolic Problems and Related Topics, F. Colombini and T. Nishitani editors, Graduate series in Analysis, International Press, 2003.
[13] N. Burq, P. Gérard and N. Tzvetkov. On nonlinear Schrödinger equations in exterior domains. Ann. I. H. Poincaré-AN, 21: 295–318, 2004.
[14] N. Burq, P. Gérard and N. Tzvetkov. Bilinear eigenfunction estimates and the nonlinear Schrödinger equation on surfaces. Inventiones Mathematicae, 159: 187–223, 2005.
[15] N. Burq, P. Gérard and N. Tzvetkov. Multilinear estimates for Laplace spectral projectors on compact manifolds. C. R. Acad. Sci. Paris, Ser. I, 338: 359–364, 2004.
[16] N. Burq, P. Gérard and N. Tzvetkov. Multilinear eigenfunction estimates and global existence for the three-dimensional nonlinear Schrödinger equations, to appear in Ann. Scient. Ec. Norm. Sup., arXiv math.AP/0409015.
[17] T. Cazenave. Semilinear Schrödinger equations. Courant Lecture Notes in Mathematics, 10. New York University. American Mathematical Society, Providence, RI, 2003.
[18] T. Cazenave and F. Weissler. The Cauchy problem for the critical nonlinear Schrödinger equation in $H^s$. Nonlinear Analysis, Theory, Methods and Applications, pages 807–836, 1990.
[19] M. Christ, J. Colliander and T. Tao. Asymptotics, modulation and low regularity ill-posedness for canonical defocusing equations. Amer. J. Math., 125: 1225–1293, 2003.
[20] M. Christ, J. Colliander and T. Tao. Ill-posedness for nonlinear Schrödinger and wave equations. Preprint, 2003.
[21] M. Gaffney. A special Stokes theorem for complete Riemannian manifolds. Ann. of Math., 60: 140–145, 1954.
[22] J. Ginibre and G. Velo. On a class of nonlinear Schrödinger equations. J. Funct. Anal., 32: 1–71, 1979.
[23] J. Ginibre and G. Velo. The global Cauchy problem for the nonlinear Schrödinger equation. Ann. I. H. Poincaré-AN, 2: 309–327, 1985.
[24] J. Ginibre. Le problème de Cauchy pour des EDP semi-linéaires périodiques en variables d'espace (d'après Bourgain). Séminaire Bourbaki, Exp. 796, Astérisque, 237: 163–187, 1996.
[25] R. T. Glassey. On the blowing up of solutions to the Cauchy problem for nonlinear Schrödinger equations. J. Math. Phys., 18: 1794–1797, 1977.
[26] B. Helffer. Semi-classical analysis for the Schrödinger operator and applications. Lecture Notes in Mathematics 1336, Springer-Verlag, 1988.
[27] T. Kato. On nonlinear Schrödinger equations. Ann. Inst. Henri Poincaré, Physique théorique, 46: 113–129, 1987.
[28] M. Keel and T. Tao. Endpoint Strichartz estimates. Amer. J. Math., 120: 955–980, 1998.
[29] C. Kenig, G. Ponce and L. Vega. On the ill-posedness of some canonical dispersive equations. Duke Math. J., 106: 617–633.
[30] S. Klainerman and M. Machedon. Finite energy solutions of the Yang-Mills equations in $\mathbb{R}^{3+1}$. Ann. of Math. (2), 142 (1): 39–119, 1995.
[31] C. Sogge. Oscillatory integrals and spherical harmonics. Duke Math. Jour., 53: 43–65, 1986.
[32] C. Sogge. Concerning the $L^p$ norm of spectral clusters for second order elliptic operators on compact manifolds. J. Funct. Anal., 77: 123–138, 1988.
[33] C. Sogge. Fourier integrals in classical analysis. Cambridge Tracts in Mathematics, 1993.
[34] G. Staffilani and D. Tataru. Strichartz estimates for a Schrödinger operator with nonsmooth coefficients. Comm. Partial Differential Equations, 27(7-8): 1337–1372, 2002.
[35] C. Sulem and P.L. Sulem. The Nonlinear Schrödinger Equation. Self-Focusing and Wave Collapse. Applied Mathematical Sciences, 139, Springer-Verlag, New York, 1999.
[36] Y. Tsutsumi. $L^2$-solutions for nonlinear Schrödinger equations and nonlinear groups. Funkcial. Ekvac., 30: 115–125, 1987.
[37] A. Weinstein. Nonlinear stabilization of quasimodes. Proc. A.M.S. Symp. on Geometry of the Laplacian, Hawaï, 1979, A.M.S. Colloq. Publ., 36: 301–318, 1980.
[38] V.E. Zakharov. Collapse of Langmuir waves. Sov. Phys. JETP, 35: 908–914, 1972.

P. Gérard
Université Paris-Sud
Mathématiques
Bât. 425
F-91405 Orsay Cedex, France
e-mail:
[email protected]
A Probabilistic Approach to some Problems in von Neumann Algebras

A. Guionnet
One of the most famous open questions concerning von Neumann algebras is whether free group factors with different numbers of generators are isomorphic or not:
$$L(F_m) \simeq L(F_n) \quad \text{if } n \neq m\ ???$$
To attack such questions, Voiculescu introduced free probability theory about twenty years ago. Free probability theory is a probability theory for non-commutative variables equipped with a notion of freeness analogous to the classical notion of independence. This similarity makes it possible to generalize many concepts from classical probability, such as central limit theorems, Brownian motions, etc., and provides intuition for the domain. On the other hand, freeness is related to the usual notion of freeness on groups and is therefore meaningful in standard operator algebra theory. Last but not least, independent large Gaussian random matrices were shown to be asymptotically free by D. Voiculescu [18]. Since then, large random matrices have become a source of examples of interesting laws of non-commutative variables.

In these proceedings, we shall describe how such a philosophy has been developed to try to answer the isomorphism problem and related issues. Even though this problem has not yet been settled, we want to emphasize that such a strategy has already been fruitful (cf. [21], [12], [13]). We hope to convince analysts and probabilists that these issues are very closely related to standard problems in analysis and probability. We shall follow the following plan:
(1) Description of the free probability framework. Relation with large random matrices.
(2) The isomorphism problem in free probability terms.
(3) Trying to disprove it by an entropy approach. Entropy theory, large deviations techniques. Recent developments and discussion.
(4) Conclusion.
For completeness, we provide in the appendix the proof of the Gelfand-Naimark-Segal construction and show how non-commutative laws prescribe von Neumann algebras up to isomorphism.

I wish to thank N. Brown and D. Shlyakhtenko for many useful conversations on the topic I discussed in these notes.
1. Free probability versus classical probability

In this section, we provide a short introduction to free probability, comparing it with standard probability.

1.1. The setting. A non-commutative (or $W^*$-) probability space is a pair $(\mathcal{A}, \tau)$ such that
• $\mathcal{A}$ is a von Neumann algebra, i.e., a weakly closed sub-$C^*$-algebra of the space $B(H)$ of bounded linear operators on some Hilbert space $H$.
• $\tau$ is a state on $\mathcal{A}$, that is, a complex-valued linear form on $\mathcal{A}$ such that
$$\overline{\tau(A)} = \tau(A^*), \qquad \tau(AA^*) \ge 0, \qquad \tau(I) = 1, \qquad \forall A \in \mathcal{A}.$$
In the following we shall consider tracial states, which are states satisfying the additional hypothesis that
$$\tau(AB) = \tau(BA), \qquad \forall A, B \in \mathcal{A}.$$

Example 1.1. Let $n \in \mathbb{N}$, $\mathcal{A} = M_n(\mathbb{C}) = B(\mathbb{C}^n)$. For any $v \in \mathbb{C}^n$ such that $\|v\|_{\mathbb{C}^n} = 1$, we set
$$\tau_v(M) = \langle v, Mv \rangle_{\mathbb{C}^n}, \qquad \forall M \in M_n(\mathbb{C}).$$
Then, $\tau_v$ is a state. There is a unique tracial state on $M_n(\mathbb{C})$, which is the normalized trace
$$\mathrm{tr}(M) = \frac{1}{n} \sum_{i=1}^{n} M_{ii}.$$

Example 1.2 (Standard Probability). Let $(X, \Sigma, d\mu)$ be a classical probability space. Then $\mathcal{A} = L^\infty(X, \Sigma, d\mu)$ equipped with $\tau : f \mapsto \int f\, d\mu$ is a (non-)commutative probability space. Here, $L^\infty(X, \Sigma, d\mu)$ is seen as a space of bounded linear operators on the Hilbert space $H = L^2(X, \Sigma, d\mu)/\equiv$, equipped with the scalar product
$$\langle f, g \rangle_\mu = \int f(x)\, \overline{g(x)}\, d\mu(x),$$
by the embedding given by the multiplication operator $M(f)g = fg$. Here, $H$ is obtained by separating $L^2(X, \Sigma, d\mu)$ by the equivalence relation $f \equiv g \Leftrightarrow \mu(|f - g|^2) = 0$, so that $\langle \cdot, \cdot \rangle_\mu$ furnishes it with a Hilbert structure.

Example 1.3. Let $G$ be a discrete group, and $(e_h)_{h \in G}$ be a basis of $\ell^2(G)$. Let $\lambda(h) e_g = e_{hg}$. Then, we take $\mathcal{A}$ to be the von Neumann algebra generated by the linear span of $\lambda(G)$. The (tracial) state is the linear form given, once restricted to $\lambda(G)$, by
$$\tau(\lambda(g)) = 1_{g = e}.$$
Here, $e$ denotes the neutral element.

We refer to [25] for further examples and details.
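Example 1.1 can be checked by hand for $n = 2$. The following is a minimal sketch in plain Python (the helper names `matmul`, `vector_state` and `normalized_trace` are illustrative, not from the text), assuming the inner product is linear in its second argument. It shows that the vector state $\tau_v$ is in general not tracial, while the normalized trace is.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def vector_state(v, m):
    """tau_v(M) = <v, M v>, conjugating the first argument (assumed convention)."""
    n = len(v)
    mv = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i].conjugate() * mv[i] for i in range(n))

def normalized_trace(m):
    """tr(M) = (1/n) * sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m))) / len(m)

# Two matrix units in M_2(C): A = E_12 and B = E_21.
A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
v = [1 + 0j, 0j]          # unit vector e_1

AB = matmul(A, B)         # E_11
BA = matmul(B, A)         # E_22

tau_AB = vector_state(v, AB)   # picks up the (1,1) entry of AB
tau_BA = vector_state(v, BA)   # picks up the (1,1) entry of BA
tr_AB = normalized_trace(AB)
tr_BA = normalized_trace(BA)
```

Here `tau_AB` and `tau_BA` differ, whereas `tr_AB` and `tr_BA` coincide, in line with the statement that the normalized trace is the unique tracial state on $M_n(\mathbb{C})$.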
1.2. The law of m self-adjoint non-commutative variables. Let $(\mathcal{A}, \tau)$ be a non-commutative probability space. If $(X_1, \dots, X_m) \in \mathcal{A}^m$, $X_i = X_i^*$, their joint law is given by the restriction of $\tau$ to the algebra generated by $(X_1, \dots, X_m)$:
$$\tau_{X_1,\dots,X_m}(P) = \tau(P(X_1, \dots, X_m)) \qquad \forall P \in \mathbb{C}\langle X_1, \dots, X_m \rangle,$$
where $\mathbb{C}\langle X_1, \dots, X_m \rangle$ denotes the set of polynomial functions in $m$ non-commutative variables. Such a definition can of course be extended to the case of non self-adjoint variables by taking polynomial functions of $(X_i, X_i^*)_{1 \le i \le m}$, but we shall not consider this generalization.

Classical setting. This definition is a generalization of the observation that, in the commutative setting, the law $\mu_{f_1,\dots,f_m}$ of $m$ bounded real-valued random variables $(f_i)_{1 \le i \le m} \in L^\infty(X, \Sigma, \mu)$ is determined by their joint moments, i.e., if $\mathbb{C}[X_1, \dots, X_m]$ denotes the set of polynomial functions in $m$ commutative variables, the law $\mu_{f_1,\dots,f_m}$ is determined by
$$\mu_{f_1,\dots,f_m}(P) = \int P(f_1(\omega), \dots, f_m(\omega))\, d\mu(\omega), \qquad \forall P \in \mathbb{C}[X_1, \dots, X_m].$$

As a consequence, the space $\mathcal{M}^m$ of laws of $m$ self-adjoint non-commutative variables can be seen as the set of linear forms on $\mathbb{C}\langle X_1, \dots, X_m \rangle$ which are
a) non-negative: $\tau(PP^*) \ge 0$ for all $P \in \mathbb{C}\langle X_1, \dots, X_m \rangle$,
b) with mass one: $\tau(I) = 1$.
We shall assume that they are tracial: $\tau(PQ) = \tau(QP)$ $\forall P, Q \in \mathbb{C}\langle X_1, \dots, X_m \rangle$. This abstract point of view is actually equivalent to the previous one in the sense that, by the Gelfand-Naimark-Segal (GNS) construction, given $\mu \in \mathcal{M}^m$, we can construct a $W^*$-probability space $(\mathcal{A}, \tau)$ and operators $(X_1, \dots, X_m)$ such that
$$\mu = \tau_{X_1,\dots,X_m}. \tag{1.1}$$
We recall this construction in Appendix 5.1; roughly speaking it shows that, as in the commutative setting, $\mathcal{A}$ can be thought of as $L^\infty(\mu)$ in the sense that it is embedded into the space $B(L^2(\mu))$ of bounded linear operators on the space of functions with finite second moment. We shall also denote by $W^*(X_1, \dots, X_m)$ the von Neumann algebra $\mathcal{A}$.

If $R \in \mathbb{R}$, the subset $\mathcal{M}^m_R$ of $\mathcal{M}^m$ of variables uniformly bounded by $R$,
$$\mathcal{M}^m_R = \{\tau \in \mathcal{M}^m ;\ \tau(X_i^{2n}) \le R^{2n} \quad \forall n \in \mathbb{N}\},$$
is Polish when equipped with its weak-∗ topology
$$\lim_{n \to \infty} \tau_n = \tau \ \Leftrightarrow\ \lim_{n \to \infty} \tau_n(P) = \tau(P) \qquad \forall P \in \mathbb{C}\langle X_1, \dots, X_m \rangle.$$
Classical setting. Note that $\mathcal{M}^1_R$ is exactly the space $\mathcal{P}([-R, R])$ of probability measures on $[-R, R]$. The set $\mathcal{P}([-R, R]^m)$ of probability measures on $[-R, R]^m$ is more generally described as the set of linear forms in $\mathbb{C}[X_1, \dots, X_m]^*$ which are positive and with mass one, and it is a Polish space when equipped with its weak-∗ topology. The assumption that the variables are bounded (i.e., $R < \infty$) can be relaxed in the classical setting by considering bounded continuous test functions, $\mathcal{P}(\mathbb{R}^m) \subset C_b(\mathbb{R}^m)^*$. This approach can be generalized to $\mathcal{M}^m$ by considering bounded non-commutative test functions (see [6]).

Example 1.4. Let $(A^N_1, \dots, A^N_m) \in \mathcal{H}_N^m$, with spectral radius $\|A^N_i\|_\infty$ bounded by $R$ for $1 \le i \le m$, and consider
$$\hat{\mu}^N_{A^N_1,\dots,A^N_m}(P) = \mathrm{tr}\big(P(A^N_1, \dots, A^N_m)\big), \qquad \forall P \in \mathbb{C}\langle X_1, \dots, X_m \rangle.$$
Then, $\hat{\mu}^N_{A^N_1,\dots,A^N_m} \in \mathcal{M}^m_R$. If $(A^N_1, \dots, A^N_m)_{N \in \mathbb{N}}$ is a sequence such that
$$\lim_{N \to \infty} \hat{\mu}^N_{A^N_1,\dots,A^N_m}(P) = \tau(P), \qquad \forall P \in \mathbb{C}\langle X_1, \dots, X_m \rangle,$$
then $\tau \in \mathcal{M}^m_R$ since $\mathcal{M}^m_R$ is Polish.

There is a well-known question of A. Connes related to the last example.

Question 1.5. Can all $\tau \in \mathcal{M}^m$ be constructed as a limit of $\hat{\mu}^N_{A^N_1,\dots,A^N_m}$ for a sequence $(A^N_1, \dots, A^N_m) \in \mathcal{H}_N^m$, $N \in \mathbb{N}$?

Classical setting. In the case $m = 1$, $\mathcal{M}^1_R = \mathcal{P}([-R, R])$ and the question amounts to asking whether for all $\mu \in \mathcal{P}([-R, R])$, there exists a sequence $(\lambda^N_1, \dots, \lambda^N_N)_{N \in \mathbb{N}}$ such that
$$\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} \delta_{\lambda^N_i} = \mu.$$
This is well known to be true according to Birkhoff's theorem, but is still an open question for $m \ge 2$ in the non-commutative setting.

1.3. Notion of freeness. $X = (X_1, \dots, X_m)$ are said to be free with $Y = (Y_1, \dots, Y_n)$ iff for any $P_1, \dots, P_q \in \mathbb{C}\langle X_1, \dots, X_m \rangle$ and $Q_1, \dots, Q_q \in \mathbb{C}\langle X_1, \dots, X_n \rangle$ such that $\mu_X(P_i) = 0$ and $\mu_Y(Q_i) = 0$ $\forall\, 1 \le i \le q$,
$$\mu_{X,Y}\big(P_1(X) Q_1(Y) P_2(X) Q_2(Y) \cdots P_q(X) Q_q(Y)\big) = 0. \tag{1.2}$$
Freeness, like independence, uniquely defines the joint law from the marginals $\mu_X$ and $\mu_Y$, since one easily checks by induction over the degree of $P$ that $\mu_{X,Y}(P)$ is uniquely determined for any $P \in \mathbb{C}\langle X_1, \dots, X_m, Y_1, \dots, Y_n \rangle$.

Classical setting. In comparison, if $X, Y$ are two bounded random variables with law $\tau$, $X$ is independent of $Y$ under $\tau$ iff for all $(P, Q) \in \mathbb{C}[X_1, \dots, X_m] \times \mathbb{C}[Y_1, \dots, Y_m]$,
$$\mu_X(P) = 0,\ \mu_Y(Q) = 0 \ \Rightarrow\ \tau(P(X) Q(Y)) = 0.$$
Note here that if $X, Y$ are centered random variables which are commutative and independent under $\tau$, then $\tau(XYXY) = \tau(X^2)\tau(Y^2) > 0$, whereas if they are free, $\tau(XYXY) = 0$.

Example 1.6. In the case of a discrete group considered in Example 1.3 with two free generators $g_1, g_2$ (in the usual sense that for any polynomials such that $P_{ij}(g_j) \neq e$, $P_{11}(g_1) P_{12}(g_2) P_{21}(g_1) \cdots \neq e$), $(\lambda(g_1), \lambda(g_2))$ are also free in the sense that the law prescribed by $\tau_{g_1,g_2}(\lambda(g)) = 1_{g = e}$ for any element $g$ of the group generated by $g_1$ and $g_2$ satisfies (1.2).

Example 1.7 (Voiculescu [18]). Take $X^N_1, X^N_2 \in \mathcal{H}_N$ to be sequences of uniformly bounded matrices with spectral distributions converging as $N$ goes to infinity toward $\mu_1$ and $\mu_2$ respectively. Then, if $U$ follows the Haar measure on $U(N)$,
$$\lim_{N \to \infty} \mathrm{tr}\big(P(X^N_1, U X^N_2 U^*)\big) = \lim_{N \to \infty} \hat{\mu}^N_{X^N_1,\, U X^N_2 U^*}(P) = \tau_{\mu_1,\mu_2}(P) \qquad \forall P.$$
Here $\tau_{\mu_1,\mu_2} \in \mathcal{M}^2$ is the distribution of two free variables with marginal distributions given by $\mu_1$ and $\mu_2$.

If $X^N_2$ is distributed according to the Gaussian law (GUE) (that is, a Gaussian Wigner matrix)
$$\mu_N(dX) = \frac{1}{Z_N}\, 1_{X \in \mathcal{H}_N} \exp\Big\{-\frac{N^2}{2}\, \mathrm{tr}(X^2)\Big\}\, dX,$$
then for any unitary matrix $U$, $\mu_N(dX) = \mu_N(U\, dX\, U^*)$. Hence, since by Wigner [27] $\hat{\mu}^N_{X^N_2}$ converges towards the semi-circular law $\sigma(dx) = (2\pi)^{-1} \sqrt{4 - x^2}\, dx$,
$$\hat{\mu}^N_{X^N_1, X^N_2} \Rightarrow \tau_{\mu_1, \sigma}.$$
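The contrast between free and commuting independent variables, and the freeness of the generators in Example 1.6, can be illustrated on the group algebra of the free group on two generators. The sketch below is an illustration under assumptions of our own: words are encoded as tuples of nonzero integers (`k` for $g_k$, `-k` for $g_k^{-1}$), and the names `reduce_word`, `multiply`, `tau` and `power_product` are hypothetical helpers, not notation from the text. It takes $X = \lambda(g_1) + \lambda(g_1^{-1})$ and $Y = \lambda(g_2) + \lambda(g_2^{-1})$, which are centered for $\tau(\lambda(g)) = 1_{g=e}$.

```python
from collections import Counter
from itertools import product

def reduce_word(word):
    """Freely reduce a word: cancel adjacent pairs k, -k."""
    out = []
    for letter in word:
        if out and out[-1] == -letter:
            out.pop()
        else:
            out.append(letter)
    return tuple(out)

def multiply(p, q):
    """Product in the group algebra; elements are dicts {reduced word: coefficient}."""
    result = Counter()
    for (w1, c1), (w2, c2) in product(p.items(), q.items()):
        result[reduce_word(w1 + w2)] += c1 * c2
    return dict(result)

def tau(p):
    """Trace tau(lambda(g)) = 1 if g is the neutral element (empty word), else 0."""
    return p.get((), 0)

def power_product(*factors):
    """Ordered product of several group algebra elements."""
    acc = {(): 1}
    for f in factors:
        acc = multiply(acc, f)
    return acc

X = {(1,): 1, (-1,): 1}    # lambda(g1) + lambda(g1^{-1})
Y = {(2,): 1, (-2,): 1}    # lambda(g2) + lambda(g2^{-1})

tau_X2 = tau(power_product(X, X))
tau_Y2 = tau(power_product(Y, Y))
tau_XYXY = tau(power_product(X, Y, X, Y))   # alternating centered word
tau_X2Y2 = tau(power_product(X, X, Y, Y))
```

With these definitions `tau_XYXY` vanishes, as (1.2) requires, while `tau_X2Y2` equals `tau_X2 * tau_Y2`, which is the value $\tau(XYXY)$ would take for commuting independent variables with the same marginals.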
1.4. Some notions borrowed from classical probability. The role played by Gaussian laws with respect to independence is played by semi-circular laws when freeness is considered. Indeed, if $(X_1, \dots, X_n, \dots)$ are free centered variables ($\tau(X_i) = 0$) with covariance one ($\tau(X_i^2) = 1$), then $n^{-1/2} \sum_{i=1}^{n} X_i$ converges in distribution to a semi-circular distribution (cf. [18]).

Classical setting. When the $(X_1, \dots, X_n, \dots)$ are independent centered variables with covariance one, the well-known central limit theorem asserts that $n^{-1/2} \sum_{i=1}^{n} X_i$ converges in distribution to a standard Gaussian variable.

One can define a free Brownian motion $(S_t, t \ge 0)$ as a process starting from the origin and such that for all $t > s$, $(t - s)^{-1/2}(S_t - S_s)$ is free from $\sigma(S_u, u \le s)$ and has a semi-circular distribution. Free stochastic differential (Itô) calculus can be constructed (cf. [2]). Namely, if $K_\cdot$ is a function of non-commutative variables such that for $t \in \mathbb{R}$, $K_t$ depends only on the algebra $\sigma(X_u, u \le t)$ generated by $(X_u, u \le t)$ and is uniformly Lipschitz with respect to the operator norm, then there exists a unique solution to the operator-valued differential equation
$$dX_t = dS_t + K_t(X)\, dt, \tag{1.3}$$
as can be seen by using a standard Picard iteration argument.

2. The isomorphism problem

The fundamental observation (which belongs to free probability folklore) is that the law of the variables $X_1, \dots, X_m$ determines the von Neumann algebra they generate. More precisely,

Lemma 2.1. If $X_1, \dots, X_m$ (resp. $Y_1, \dots, Y_m$) are non-commutative variables with law $\tau_X$ (resp. $\tau_Y$), then
$$\tau_X = \tau_Y \ \Rightarrow\ W^*(X_1, \dots, X_m) \simeq W^*(Y_1, \dots, Y_m),$$
where $\mathcal{A} \simeq \mathcal{B}$ means that the two algebras are isomorphic.
Proof. The proof of this lemma is recalled in Appendix 5.2.

Now,
$$W^*(X_1, \dots, X_m) \simeq W^*(Y_1, \dots, Y_n)$$
iff there exist
$$(F_1(X), \dots, F_n(X)) \in W^*(X_1, \dots, X_m)^n \quad \text{resp.} \quad (G_1(Y), \dots, G_m(Y)) \in W^*(Y_1, \dots, Y_n)^m$$
and unitary operators
$$U : L^2(W^*(X_1, \dots, X_m)) \to L^2(W^*(Y_1, \dots, Y_n)) \quad \text{resp.} \quad V : L^2(W^*(Y_1, \dots, Y_n)) \to L^2(W^*(X_1, \dots, X_m))$$
so that $Y_i = U F_i(X) U^*$ for $1 \le i \le n$, resp. $X_i = V G_i(Y) V^*$ for $1 \le i \le m$. Hence, let us say that $\tau_X$ is equivalent to $\tau_Y$, which we denote by $\tau_X \equiv \tau_Y$, iff $\tau_X$ and $\tau_Y$ are the pushforward of each other, that is, there exist $F \in W^*(X_1, \dots, X_m)^n$, $G \in W^*(Y_1, \dots, Y_n)^m$ such that
$$\tau_Y(P) = F_\# \tau_X(P) = \tau_X(P \circ F), \qquad \tau_X(P) = G_\# \tau_Y(P) = \tau_Y(P \circ G) \qquad \forall P.$$
Then, Lemma 2.1 shows that
$$W^*(X_1, \dots, X_m) \simeq W^*(Y_1, \dots, Y_n) \ \Leftrightarrow\ \tau_X \equiv \tau_Y. \tag{2.1}$$

Problem 2.2 (The isomorphism problem). Let $\sigma_m$ be the law of $m$ free semi-circular variables $S_1, \dots, S_m$. By (2.1), $L(F_m) \simeq W^*(S_1, \dots, S_m)$. The isomorphism problem can therefore be recast into
$$W^*(S_1, \dots, S_m) \simeq W^*(S_1, \dots, S_n) \ \Leftrightarrow\ \sigma_m \equiv \sigma_n \ \Rightarrow\ m = n\,?$$
Classical setting. It is well known that a probability measure on $\mathbb{R}^m$ is equivalent to a probability measure on $\mathbb{R}^n$ provided they have no atoms.

3. Entropy approach

Voiculescu [20] introduced a quantity $\delta : \mathcal{M}^m \to [0, m]$, analogous to the Minkowski dimension, such that for all $m \in \mathbb{N}$
$$\delta(\sigma_m) = m.$$
It is currently actively debated whether $\delta$ is an invariant of the von Neumann algebra, that is, whether for all $\mu \in \mathcal{M}^m$, $\mu \equiv \sigma_m$ implies $\delta(\sigma_m) = \delta(\mu)$. If this is the case, then one has proved that $L(F_m) \not\simeq L(F_n)$ if $m \neq n$.

To define $\delta$, Voiculescu [20] built an entropy theory based on the microstates free entropy $\chi$, which we now define. Let $\tau \in \mathcal{M}^m$ and define a micro-state $\Gamma_R(\tau, \epsilon, k)$ by
$$\Gamma_R(\tau, \epsilon, k) = \big\{A_1, \dots, A_m \in \mathcal{H}_N : |\mathrm{tr}(A_{i_1} \cdots A_{i_p}) - \tau(X_{i_1} \cdots X_{i_p})| < \epsilon \ \ \forall p \le k,\ \forall\, 1 \le i_j \le m,\ \ \|A_j\|_\infty \le R \ \ \forall\, 1 \le j \le m\big\}.$$
Then we set
$$\chi(\tau) := \lim_{\substack{\epsilon \downarrow 0 \\ k \uparrow \infty,\, R \uparrow \infty}}\ \limsup_{N \to \infty}\ \frac{1}{N^2} \log \mu_N^{\otimes m}\big(\Gamma_R(\tau, \epsilon, k)\big).$$
The original definition of Voiculescu uses the Lebesgue measure instead of the Gaussian measure, but it is not hard to see (cf. [7]) that these two definitions are equivalent up to a Gaussian term $2^{-1} \mu(X_i^2)$.

Classical setting. The classical analogue of $\chi$ is the Boltzmann-Shannon entropy:
$$S(\mu) = \lim_{\substack{\epsilon \downarrow 0 \\ k \uparrow \infty,\, R \uparrow \infty}}\ \limsup_{N \to \infty}\ \frac{1}{N} \log \tilde{\mu}_N^{\otimes m}\big(\Gamma_R(\tau, \epsilon, k)\big)$$
where $\tilde{\mu}_N$ is the law of diagonal matrices with i.i.d. standard Gaussian entries. In fact, for diagonal matrices
$$\tau(X_{i_1} \cdots X_{i_p}) = \Big\langle \frac{1}{N} \sum_{i=1}^{N} \delta_{X^1_{ii}, \dots, X^m_{ii}},\ x_{i_1} \cdots x_{i_p} \Big\rangle,$$
so that $\Gamma_R(\tau, \epsilon, k)$ is a small neighborhood of the empirical measure of the entries. Moreover, when the random variables are bounded, it is well known that the weak-∗ topology generated by polynomial functions is equivalent to the topology generated by bounded continuous functions, and hence we arrive at the more common definition of the Boltzmann-Shannon entropy
$$S(\mu) = \lim_{\epsilon \downarrow 0}\ \limsup_{N \to \infty}\ \frac{1}{N} \log \tilde{\mu}_N^{\otimes m}\Big(d\Big(\frac{1}{N} \sum_{i=1}^{N} \delta_{X^1_{ii}, \dots, X^m_{ii}},\ \mu\Big) < \epsilon\Big)$$
where $d$ is a distance compatible with the weak topology, such as Dudley's distance
$$d(\mu, \nu) = \sup\Big\{\Big|\int f\, d\mu - \int f\, d\nu\Big| \ ;\ |f(x)| \text{ and } \Big|\frac{f(x) - f(y)}{x - y}\Big| \le 1,\ \forall x \neq y\Big\}.$$
By Sanov's theorem (cf. [9], Theorem 6.2.10), if $\gamma$ is the standard Gaussian law $\gamma(dx) = (\sqrt{2\pi})^{-1} e^{-\frac{1}{2} x^2}\, dx$,
$$S(\mu) := \lim_{\epsilon \downarrow 0}\ \limsup_{N \to \infty}\ \frac{1}{N} \log \tilde{\mu}_N^{\otimes m}\Big(d\Big(\frac{1}{N} \sum_{i=1}^{N} \delta_{X^1_{ii}, \dots, X^m_{ii}},\ \mu\Big) < \epsilon\Big) = \lim_{\epsilon \downarrow 0}\ \liminf_{N \to \infty}\ \frac{1}{N} \log \tilde{\mu}_N^{\otimes m}\Big(d\Big(\frac{1}{N} \sum_{i=1}^{N} \delta_{X^1_{ii}, \dots, X^m_{ii}},\ \mu\Big) < \epsilon\Big) = S^*(\mu)$$
where
$$S^*(\mu) := \begin{cases} -\displaystyle\int \log \frac{d\mu}{d\gamma^{\otimes m}}\, d\mu & \text{if } \mu \ll \gamma^{\otimes m}, \\ -\infty & \text{otherwise.} \end{cases}$$

The natural question is to seek a generalization of Sanov's theorem in the non-commutative setting, that is, to show that in the definition of $\chi$ one can replace the lim sup by a lim inf, and then to find a formula for this limit which does not depend on the description via micro-states.

When $m = 1$, Voiculescu showed that one can indeed replace the lim sup by a lim inf in the definition of $\chi$, and also that for all $\mu \in \mathcal{P}(\mathbb{R})$
$$\chi(\mu) = \chi^*(\mu) := \iint \log|x - y|\, d\mu(x)\, d\mu(y) - \frac{1}{2} \int x^2\, d\mu(x) + \frac{3}{4}.$$
When $m \ge 2$, Biane, Capitaine and myself [3] proved by using large deviations techniques that, for all $\tau \in \mathcal{M}^m$,
$$\chi^{**}(\tau) \le \chi(\tau) \le \chi^*(\tau).$$
$\chi^*$ had been previously defined by Voiculescu [22] by means of the free Fisher information and called non-microstates free entropy, since it does not depend on the micro-state and random matrix definition. It is the analogue of the relative entropy $S^*$. The question whether one can replace the lim sup by a lim inf is still wide open since, as we shall see, the equality between $\chi^*$ and $\chi^{**}$ is still unclear and actually related to very deep questions such as Connes's. $\chi^*(\tau)$ and $\chi^{**}(\tau)$ can be seen as the cost to construct $\tau$ from free semi-circular increments:
$$\chi^*(\tau) := -\inf\Big\{\frac{1}{2} \int_0^1 \phi\big(|K_t(X)|^2\big)\, dt\Big\} \tag{3.1}$$
where the infimum is taken over all $\phi$ which are laws of continuous non-commutative processes $(X^1_t, \dots, X^m_t)_{t \in [0,1]}$ which start at null operators $(X^1_0, \dots, X^m_0) = (0, \dots, 0)$, end at time one at operators $(X^1_1, \dots, X^m_1)$ with law $\tau$, and satisfy in a weak sense
$$dX^i_t = dS^i_t + K^i_t(X)\, dt, \qquad 1 \le i \le m, \tag{3.2}$$
where $K_t$ belongs to the von Neumann algebra generated by $(X_u, u \le t)$ and $S$ is an $m$-dimensional free Brownian motion (i.e., $(S^1, \dots, S^m)$ are $m$ free Brownian motions).

More precisely, assume for simplicity that $K_t$ is uniformly bounded, so that the solution to (3.2) is uniformly bounded (note here that $S_t$ is uniformly bounded since the semi-circular law is compactly supported). For $s \in [0, 1]$, let $\tilde{\phi}_s$ be the law of $(X_{u \wedge s} + S_{(u - s) \vee 0},\ u \in [0, 1])$. Then (3.2) is satisfied in a weak sense iff for all $t \in [0, 1]$ and all polynomial functions $P = Q(X_{t_1}, \dots, X_{t_n})$ on cylinders,
$$\tilde{\phi}_t(P) - \sigma(P) = \int_0^t \phi\Big(\tilde{\phi}_s(\nabla_s P \,|\, \mathcal{B}_s)\, K_s\Big)\, ds \tag{3.3}$$
where $\sigma$ is the law of a free Brownian motion and $\tilde{\phi}_s(\cdot \,|\, \mathcal{B}_s)$ denotes the orthogonal projection in $L^2(\phi)$ on the σ-algebra $\mathcal{B}_s = \sigma(X^i_u,\ 1 \le i \le m,\ u \le s)$. $\nabla_s$ is the Malliavin operator
$$\nabla^l_s\big(x^{i_1}_{t_1} \cdots x^{i_n}_{t_n}\big) = \sum_{p=1}^{n} 1_{i_p = l}\ x^{i_{p+1}}_{t_{p+1}} \cdots x^{i_n}_{t_n}\, x^{i_1}_{t_1} \cdots x^{i_{p-1}}_{t_{p-1}}\ 1_{[0, t_p]}(s).$$
It was shown in [3] that for nice $K_\cdot$, (3.3) is a strong solution of (3.2). $\chi^{**}$ is defined similarly, but the infimum is restricted to processes for which the drift $K$ is sufficiently smooth.

It was shown in [3] that the infimum in the definition (3.1) of $\chi^*$ is attained at the distribution of a free Brownian bridge
$$\Big(tX + (1 - t) \int_0^t (1 - u)^{-1}\, dS_u,\ t \in [0, 1]\Big),$$
where $X = (X_1, \dots, X_m)$ has law $\tau$ and $S = (S_1, \dots, S_m)$ is an $m$-dimensional free Brownian motion, free with $X$. Plugging this fact into the definition (3.1) of $\chi^*$ yields the initial definition of Voiculescu.

Classical setting. It can be seen that $S^*$ can be defined similarly by replacing $(S_t, t \ge 0)$ by a standard Brownian motion.

The so-called unification problem (cf. Voiculescu [26]) is to prove that the lim sup can be replaced by a lim inf in the definition of $\chi$ and that $\chi(\tau) = \chi^*(\tau)$, at least for $\tau$ such that $\chi(\tau) > -\infty$. It seems that this problem is related to a better understanding of the analysis of non-commutative functions and to Connes's question. In fact, to show that $\chi^{**}(\tau) = \chi^*(\tau) = \chi(\tau)$, we would like to show that processes with smooth fields are dense, i.e., that for any law of non-commutative processes $\phi$,
there exists a sequence $(\phi^\epsilon)$ associated with a smooth field $K^\epsilon$ by (3.2) such that
$$\lim_{\epsilon \to 0} \phi^\epsilon = \phi \qquad \text{and} \qquad \lim_{\epsilon \to 0} \int_0^1 \phi^\epsilon\big(|K^\epsilon_t(X)|^2\big)\, dt = \int_0^1 \phi\big(|K_t(X)|^2\big)\, dt.$$
But because $K^\epsilon$ is smooth, $dX^\epsilon_t = dS_t + K^\epsilon_t(X^\epsilon)\, dt$ has a unique solution $X^\epsilon_t = F^\epsilon_t(S_s, s \le t)$ with a smooth function $F^\epsilon$, as can be checked again by a Picard argument. Moreover, if $H^N$ is an $m$-tuple of $N$-dimensional Hermitian Brownian motions (that is, $N \times N$ Hermitian matrices with Brownian motion entries), the asymptotic freeness of independent Wigner matrices together with Wigner's convergence [27] imply that
$$\hat{\mu}^N_{H^N_t,\, t \ge 0} \Rightarrow \tau_{S_t,\, t \ge 0}.$$
Consequently, since the $F^\epsilon$ are smooth,
$$\phi^\epsilon = \lim_{N \to \infty} \hat{\mu}^N_{A^N_1, \dots, A^N_m} \qquad \text{with } A^N_t = F^\epsilon_t(H^N_s, s \le t) \in \mathcal{H}_N^m,\ t \in [0, 1].$$
Thus, $\phi^\epsilon$ can be approximated by non-commutative distributions of finite matrices, and hence so can $\phi$.

When $m = 1$, such a program was realized by taking for $\phi^\epsilon$ the law obtained by convolving $\phi$ with small Cauchy laws (cf. Zeitouni and myself [14]), but the generalization of this strategy to $m \ge 2$ fails on crucial analytic questions which are not yet understood in the non-commutative context.

The entropy dimension was defined by Voiculescu [20] for $\tau = \tau_{X_1,\dots,X_m} \in \mathcal{M}^m$, if $S_1, \dots, S_m$ are free semi-circular variables, free with $X$, by
$$\delta(\tau) = m + \limsup_{\epsilon \downarrow 0} \frac{\chi\big(\tau_{X_1 + \epsilon S_1, \dots, X_m + \epsilon S_m}\big)}{|\log \epsilon|}.$$
It satisfies the following properties.

Proposition 3.1.
(a) $\delta(\tau_{X_1,\dots,X_m}) = \sum_{i=1}^{m} \delta(\tau_{X_i})$ if $X_1, \dots, X_m$ are free (cf. [20]).
(b) $\chi(\tau) > -\infty$ implies $\delta(\tau) = m$ (cf. [20]).
(c) If $m = 1$, $\mu \in \mathcal{P}(\mathbb{R})$, Voiculescu [20] proved that
$$\delta(\mu) = 1 - \sum_{t \in \mathbb{R}} \mu(\{t\})^2.$$
(d) By [3],
$$\delta^{**}(\tau) \le \delta(\tau) \le \delta^*(\tau),$$
where $\delta^*$ and $\delta^{**}$ are defined as $\delta$ but with $\chi^*$ (resp. $\chi^{**}$) instead of $\chi$.

Note that by 3.1 (c), we see that when $m = 1$, $\delta$ counts the atoms, whose existence is crucial in isomorphism questions in the commutative setting.
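Proposition 3.1 (c) is easy to evaluate for concrete measures. The following minimal sketch (the function name `entropy_dimension_one_variable` and the encoding of a measure by a dictionary of its atoms are our own illustrative choices) computes $\delta(\mu) = 1 - \sum_t \mu(\{t\})^2$ from the atomic part of $\mu$ alone, the remaining mass being assumed to carry no atoms.

```python
from fractions import Fraction

def entropy_dimension_one_variable(atoms):
    """delta(mu) = 1 - sum of squared atom masses, for a probability measure on R.

    `atoms` maps each atom location t to its mass mu({t}); any mass not
    accounted for is assumed to belong to an atomless part of mu.
    """
    masses = list(atoms.values())
    if any(mass < 0 for mass in masses) or sum(masses) > 1:
        raise ValueError("atom masses must be non-negative and sum to at most 1")
    return 1 - sum(mass ** 2 for mass in masses)

half = Fraction(1, 2)

delta_atomless = entropy_dimension_one_variable({})                 # e.g. semi-circular law
delta_dirac = entropy_dimension_one_variable({0: Fraction(1)})      # a single point mass
delta_bernoulli = entropy_dimension_one_variable({-1: half, 1: half})
delta_mixed = entropy_dimension_one_variable({0: half})             # half atom, half atomless
```

In this encoding an atomless law has dimension 1, a Dirac mass has dimension 0, and the intermediate values record how much mass sits on atoms.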
Recent work of Connes and Shlyakhtenko [8] tried to define another invariant for von Neumann algebras. They generalized the notion of $L^2$-homology and $L^2$-Betti numbers to a tracial von Neumann algebra, motivated by the measure-equivalence invariance of the group theoretical $L^2$-Betti numbers proved by Gaboriau [11]. They define an $L^2$-Betti number $\Delta$. They can show it is related to $\delta^*$, and thus to $\delta$ by [3], by
$$\delta(\tau) \le \delta^*(\tau) \le \Delta(\tau).$$
Mineyev and Shlyakhtenko [15] proved that in the case of a finitely generated group
$$\delta^*(\tau) = \beta_1(G) - \beta_0(G) + 1$$
with the group $L^2$-Betti numbers $\beta$. Yet the question of the invariance of $\delta$, $\delta^*$, $\Delta$ is still open.

Another attempt was made by Haagerup et al. to try to prove that $\delta$ is NOT an invariant. A good candidate for a counterexample was a priori the so-called DT operator $T$, which is obtained as the limit of upper triangular matrices with i.i.d. Gaussian variables above the diagonal. The idea is that the circular operator $C$, the limit of square matrices with i.i.d. Gaussian entries, is such that $C = T + \tilde{T}^*$ where $T, \tilde{T}$ are free. It is known to generate a two-dimensional free group factor, and $\delta(C) = 2$. Thus, since $T$ is generated with half as many random variables, it could be hoped that $\delta(T) < 2$. On the other hand, it was shown by Dykema and Haagerup [10] that the von Neumann algebra generated by $T$ is isomorphic to $L(F_2)$, so that invariance of $\delta$ would be disproved if $\delta(T)$ were strictly smaller than 2. However, it was recently shown by Aagaard [1] that $\delta^*(T) = 2$, so that DT operators do not provide a counterexample for $\delta^*$, nor for $\delta$ if one believes the unification problem to hold true.

4. Conclusion

Free probability allows one to express in probabilistic terms many problems from non-commutative algebras, and hence gives probabilists a chance to use their skills in this topic. However, open questions are often in the end analytic questions: Connes's question and the $\chi = \chi^*$ problem could be settled if we understood better the regularizing properties of free convolution. The important use of Gaussian random matrices in this domain also connects it with combinatorics and physics, since tracial states which satisfy Connes's approximating property can be seen as limits of matrix models, which have been used in these last domains to enumerate maps (see the review [28]).
5. Appendix

5.1. About the GNS construction. This construction can be summarized as follows (cf. [16], [17]). Consider the bilinear form on $\mathbb{C}\langle X_1, \dots, X_m \rangle^2$ given by
$$\langle P, Q \rangle_\mu = \mu(PQ^*).$$
We then construct a Hilbert space $(H_\mu, \langle \cdot, \cdot \rangle_\mu)$ as follows. We consider the left ideal $L_\mu = \{F \in L^2(\mu) : \|F\|_\mu = 0\}$ and the quotient space $h_\mu := \mathbb{C}\langle X_1, \dots, X_m \rangle / L_\mu$. We let $\eta_\mu$ be the inclusion map from $\mathbb{C}\langle X_1, \dots, X_m \rangle$ into $h_\mu$. $\langle \cdot, \cdot \rangle_\mu$ determines a pre-Hilbert structure on $h_\mu$, and therefore the completion $H_\mu$ of $h_\mu$ for the norm $\|\cdot\|_\mu = \langle \cdot, \cdot \rangle_\mu^{1/2}$ is a Hilbert space.

The non-commutative polynomials $\mathbb{C}\langle X_1, \dots, X_m \rangle$ act by left multiplication on $H_\mu$. In fact, if we denote for $P, Q \in \mathbb{C}\langle X_1, \dots, X_m \rangle$
$$\pi_\mu(P)\, \eta_\mu(Q) = \eta_\mu(PQ),$$
then $\pi_\mu(P)$ extends uniquely as a bounded linear operator on $H_\mu$ since
$$\|\pi_\mu(P)(\eta_\mu(Q))\|^2_\mu = \mu(PQQ^*P^*) \le \|QQ^*\|_\infty\, \mu(PP^*) = \|QQ^*\|_\infty\, \|\pi_\mu(P)\|^2_\mu.$$
Moreover, one checks that $\pi_\mu(\mathbb{C}\langle X_1, \dots, X_m \rangle)$ is an involutive algebra equipped with the operator norm
$$|||\pi_\mu(P)||| = \sup_{Q \in H_\mu} \|\eta_\mu(Q)\|_\mu^{-1}\, \|\eta_\mu(PQ)\|_\mu.$$
The involution is simply given by $(X_{i_1} \cdots X_{i_n})^* = X_{i_n} \cdots X_{i_1}$. We denote by $\mathcal{A}_\mu$ the von Neumann algebra obtained by completing $\pi_\mu(\mathbb{C}\langle X_1, \dots, X_m \rangle)$ for the weak topology on $H_\mu$. $\mathcal{A}_\mu$ is equipped with the tracial state
$$\tau_\mu(\pi_\mu(P)) = \langle \pi_\mu(P)\, \eta_\mu(I), \eta_\mu(I) \rangle_\mu = \mu(P).$$
We then easily check that $(\mathcal{A}_\mu, \tau_\mu)$ satisfies (1.1). In the sense that $\mathcal{A}_\mu \subset B(H_\mu)$, where $H_\mu$ is roughly speaking the space of square integrable functions, we can think of $\mathcal{A}_\mu$ as the set of bounded measurable functions $L^\infty(\mu)$.

5.2. Proof of Lemma 2.1. This fact can be deduced from Proposition 3.3.7 of [16] and the previous proof of the GNS construction, when one notices that $(\pi_\mu, H_\mu)$ can be seen to be a cyclic representation of $\mathbb{C}\langle X_1, \dots, X_m \rangle$ (cf. [16], Section 3.3). Let us however summarize it.

The proof uses uniqueness of GNS representations. It can be recast in the general framework of two non-commutative probability spaces $(M, \tau_M)$ and $(N, \tau_N)$; if $\tau_M = \tau_N$ then we want to show that $N \simeq M$. Indeed, if we regard $M \subset B(H_{\tau_M})$ and $N \subset B(H_{\tau_N})$ as acting via the GNS representation, then one defines a unitary operator $U : H_{\tau_M} \to H_{\tau_N}$,
$$U\big(\eta_{\tau_M}(P(X_1, \dots, X_n))\big) = \eta_{\tau_N}(P(Y_1, \dots, Y_n)),$$
for every polynomial $P$. The fact that
$$\tau_M(P(X_1, \dots, X_n)) = \tau_N(P(Y_1, \dots, Y_n))$$
ensures that $U$ is well defined and isometric on a dense subspace of $H_{\tau_M}$ and
maps this dense subspace onto a dense subspace of $H_{\tau_N}$. Hence we may extend it (uniquely) to a unitary $U : H_{\tau_M} \to H_{\tau_N}$. Finally, one checks that $U X_i U^* = Y_i$ and therefore $U M U^* = N$.

References

[1] L. Aagaard; The non-microstates free entropy dimension of DT-operators. Preprint, Syddansk Universitet (2003)
[2] P. Biane, R. Speicher; Stochastic calculus with respect to free Brownian motion and analysis on Wigner space, Prob. Th. Rel. Fields 112: 373–409 (1998)
[3] P. Biane, M. Capitaine, A. Guionnet; Large deviation bounds for the law of the trajectories of the Hermitian Brownian motion, Invent. Math. 152: 433–459 (2003)
[4] P. Biane, D. Voiculescu; A free probability analogue of the Wasserstein metric on the trace-state space, Geom. Funct. Anal. 11: 1125–1138 (2001)
[5] N. Brown; Finite free entropy and free group factors, http://front.math.ucdavis.edu/math.OA/0403294
[6] T. Cabanal-Duvillard, A. Guionnet; Large deviations upper bounds and non-commutative entropies for some matrices ensembles, Annals Probab. 29: 1205–1261 (2001)
[7] T. Cabanal-Duvillard, A. Guionnet; Discussions around non-commutative entropies, Adv. Math. 174: 167–226 (2003)
[8] A. Connes, D. Shlyakhtenko; $L^2$-Homology for von Neumann Algebras, http://front.math.ucdavis.edu/math.OA/0309343
[9] A. Dembo, O. Zeitouni; Large deviations techniques and applications, second edition, Springer (1998)
[10] K. Dykema, U. Haagerup; Invariant subspaces of the quasinilpotent DT-operator, J. Funct. Anal. 209: 332–366 (2004)
[11] D. Gaboriau; Invariants $\ell^2$ de relations d'équivalences et de groupes, Publ. Math. Inst. Hautes Études Sci. 95: 93–150 (2002)
[12] L. Ge; Applications of free entropy to finite von Neumann algebras, Amer. J. Math. 119: 467–485 (1997)
[13] L. Ge; Applications of free entropy to finite von Neumann algebras II, Annals of Math. 147: 143–157 (1998)
[14] A. Guionnet, O. Zeitouni; Large deviations asymptotics for spherical integrals, Jour. Funct. Anal. 188: 461–515 (2001)
[15] I. Mineyev, D. Shlyakhtenko; Non-microstates free entropy dimension for groups, http://front.math.ucdavis.edu/math.OA/0312242
[16] G.K. Pedersen; $C^*$-algebras and their automorphism groups, London Mathematical Society Monograph 14 (1989)
[17] V.S. Sunder; An invitation to von Neumann algebras, Universitext, Springer (1987)
[18] D. Voiculescu; Limit laws for random matrices and free products, Invent. Math. 104: 201–220 (1991)
[19] D. Voiculescu; The analogues of Entropy and Fisher's Information Measure in Free Probability Theory, Commun. Math. Phys. 155: 71–92 (1993)
[20] D. Voiculescu; The analogues of Entropy and Fisher's Information Measure in Free Probability Theory, II, Invent. Math. 118: 411–440 (1994)
[21] D.V. Voiculescu; The analogues of Entropy and Fisher's Information Measure in Free Probability Theory, III. The absence of Cartan subalgebras, Geom. Funct. Anal. 6: 172–199 (1996)
[22] D. Voiculescu; The analogues of Entropy and Fisher's Information Measure in Free Probability Theory, V: Noncommutative Hilbert Transforms, Invent. Math. 132: 189–227 (1998)
[23] D. Voiculescu; A Note on Cyclic Gradients, Indiana Univ. Math. J. 49: 837–841 (2000)
[24] D.V. Voiculescu; A strengthened asymptotic freeness result for random matrices with applications to free entropy, Internat. Math. Res. Notices 1: 41–63 (1998)
[25] D. Voiculescu; Lectures on free probability theory, Lecture Notes in Mathematics 1738: 283–349 (2000)
[26] D. Voiculescu; Free entropy, Bull. London Math. Soc. 34: 257–278 (2002)
[27] E. Wigner; On the distribution of the roots of certain symmetric matrices, Ann. Math. 67: 325–327 (1958)
[28] A. Zvonkin; Matrix integrals and Map enumeration; an accessible introduction, Math. Comput. Modelling 26: 281–304 (1997)

A. Guionnet
UMPA
Ecole Normale Supérieure de Lyon
46, allée d'Italie
F-69364 Lyon Cedex 07, France
e-mail:
[email protected]
Singular Elements of Affine Kac–Moody Groups

Stefan Helmke and Peter Slodowy
1. Introduction In 1970, E. Brieskorn [5] explained a remarkable connection between simple singularities and simple algebraic groups. It was since then an open question, how this connection would extend to a larger class of singularities. The first results in this direction were obtained about 10 years later in [22]. Here, the connection was between simple elliptic singularities and affine Kac–Moody groups. But in contrast to the simple algebraic groups, the singularities could not be directly observed in the affine Kac–Moody groups, essentially because a classification of conjugacy classes in those groups was not yet established, but also because one actually needs a completion of this group and it was not clear which completion would be the right one. The situation changed dramatically another almost 20 years later. At that time, G. Br¨ uchert [6] constructed a section of the adjoint quotient map for an affine Kac–Moody group in his thesis under the advise of the second author. At the same time, R. Friedman, J. Morgan and E. Witten [8] constructed the moduli space of semi-stable principal bundles over an elliptic curve, by using the deformation of a certain unstable bundle. It turned out, that those two constructions where equivalent and that the use of principal bundles was a powerful tool to classify conjugacy classes. It also strongly suggested which completion of the affine Kac–Moody group one should use. We then succeeded quickly, to classify all the necessary conjugacy classes and identified the corresponding singularities [10], [11] and [12]. We found that the simple elliptic singularities and their deformations are indeed realized in this way. After a longer introduction to the theory of singularities and groups, in which we will explain Brieskorn’s Theorem and some of its generalizations, we will concentrate on the study of those isolated singularities, which appear in the classical affine Kac–Moody groups. 
In most classical cases, the singularities are non-isolated. Only for groups of type $A_1^{(1)}$ and $D_5^{(1)}$ do isolated singularities also appear. Those can be identified much more directly than the singularities in the exceptional groups.
1.1. Simple singularities. The simple singularities can be constructed as follows. Let $\Gamma$ be a finite subgroup of $SL_2(\mathbb{C})$. Then, $\Gamma$ is one of the following: a cyclic group of order $\ell+1$, $\ell \ge 1$ ($A_\ell$), a binary dihedral group of order $4(\ell-2)$, $\ell \ge 4$ ($D_\ell$), the binary tetrahedral group ($E_6$), the binary octahedral group ($E_7$), or
the binary icosahedral group ($E_8$). Here, the word binary refers to the fact that the group is the preimage of the corresponding subgroup of the rotation group $SO_3(\mathbb{R})$ in its universal cover. In 1874, F. Klein showed that the ring of polynomials in two variables which are invariant under $\Gamma$ is generated by three elements $x$, $y$ and $z$, which satisfy the following relation
$$\begin{array}{ll}
A_\ell\ (\ell \ge 1) & x^{\ell+1} + y^2 + z^2 = 0 \\
D_\ell\ (\ell \ge 4) & x^{\ell-1} + xy^2 + z^2 = 0 \\
E_6 & x^4 + y^3 + z^2 = 0 \\
E_7 & x^3 y + y^3 + z^2 = 0 \\
E_8 & x^5 + y^3 + z^2 = 0.
\end{array}$$
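As a hedged illustration of Klein’s result in the simplest case, the following sketch works out the invariants for a cyclic group of order $\ell+1$. The chosen generator, the coordinate names $u, v$ on $\mathbb{C}^2$, the generator names $X, Y, Z$ and the final linear change of variables are assumptions made for this example; they are not taken from the text.

```latex
% Assumed generator of the cyclic group, with \zeta = e^{2\pi i/(\ell+1)}:
\gamma = \begin{pmatrix} \zeta & 0 \\ 0 & \zeta^{-1} \end{pmatrix},
\qquad u \mapsto \zeta u, \quad v \mapsto \zeta^{-1} v .
% A monomial u^m v^n is multiplied by \zeta^{m-n}, so it is invariant
% if and only if m \equiv n \pmod{\ell+1}. The invariant ring is generated by
X = u^{\ell+1}, \qquad Y = v^{\ell+1}, \qquad Z = uv,
\qquad \text{with the single relation} \qquad XY = Z^{\ell+1}.
% Linear change of variables X = y + iz,\ Y = -(y - iz),\ Z = x:
XY = -(y^2 + z^2) = x^{\ell+1}
\quad\Longleftrightarrow\quad
x^{\ell+1} + y^2 + z^2 = 0 .
```

The last line is the relation listed for type $A_\ell$; the other types require more involved invariant computations that are not sketched here.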
In other words, the quotient $\mathbb{C}^2/\Gamma$ is a hypersurface singularity. Those singularities have various characteristic properties. For example, V.I. Arnold [1] proved that they are the only hypersurface singularities which deform into only a finite number of isomorphism classes of other singularities, and they were therefore called simple singularities.
1.2. Conjugacy classes in simple algebraic groups. Let $G$ be a simple and simply connected algebraic group over the complex numbers. There are four series of such groups, the classical groups $A_\ell = SL_{\ell+1}(\mathbb{C})$, $B_\ell = Spin_{2\ell+1}(\mathbb{C})$, $C_\ell = Sp_{2\ell}(\mathbb{C})$ and $D_\ell = Spin_{2\ell}(\mathbb{C})$, and the five exceptional groups $E_6$, $E_7$, $E_8$, $F_4$ and $G_2$. In $G$, one has a Jordan decomposition, i.e., each element can be written as a product of a unipotent element and a semi-simple element commuting with each other, and this decomposition is unique. The classification of semi-simple conjugacy classes is easy: Let $T \subset G$ be a maximal torus in $G$ and $W = N_G(T)/T$ its Weyl group. Then, the intersection of a semi-simple conjugacy class with $T$ is exactly one $W$-orbit. There is even a morphism
$$\chi : G \longrightarrow T/W \cong \mathbb{C}^\ell$$
which maps each element of $G$ to the $W$-orbit of its semi-simple part. This map can also be given by the characters of the $\ell$ fundamental representations of $G$, where $\ell = \operatorname{rank} G$. On the other hand, the unipotent part of an element of $G$ is just a unipotent element of the centralizer of its semi-simple part, which is itself a reductive group. Hence, the classification of conjugacy classes in simple algebraic groups reduces to the classification of unipotent conjugacy classes. Furthermore, with the help of Jacobson–Morozov triplets, this can be reduced to the classification of conjugacy classes of subgroups of $G$ isomorphic to $SL_2(\mathbb{C})$, and this was done, in much greater generality, by Dynkin [7].
1.3. Singularities of the unipotent variety. The unipotent variety of $G$ is the set of all of its unipotent elements. It is exactly the fiber $\chi^{-1}(\chi(1))$ and therefore, it is a subvariety of $G$.
Other fibers are isomorphic to fiber bundles over the unipotent variety of the centralizer of the corresponding element in $T$. Even though
the detailed classification of unipotent conjugacy classes depends strongly on $G$, there are a few common properties. First of all, the unipotent variety is irreducible and it contains a unique dense $G$-orbit. The elements of this orbit are called regular unipotent. A unipotent element is regular if and only if the Jacobian of the map $\chi$ has maximal rank at this element. The complement of the regular unipotent orbit has codimension two in the unipotent variety and again, this complement is irreducible and contains a dense orbit. The elements of this orbit are called subregular unipotent. The complement of those two orbits may not be irreducible anymore, in general. However, there are only finitely many unipotent $G$-orbits and all of them contain the so-called minimal unipotent orbit in their closure, which by itself contains only the 0-dimensional 1-orbit in its closure. So one can draw a symbolic picture of the unipotent variety as follows
◦ regular orbit
|
◦ subregular orbit
⋮
◦ minimal orbit
|
◦ 1-orbit.
Each circle ◦ in this picture represents a unipotent conjugacy class and two circles are connected if and only if the orbit represented by the lower of those two circles is contained in the closure of the orbit represented by the upper circle. The vertical dots are subject to the more detailed classification, which is $G$-dependent.
Example 1.1. The simplest example is $G = SL_2(\mathbb{C})$. In this case, there are only two unipotent conjugacy classes: the 1-orbit and the orbit through any other unipotent element. The first of those two orbits is the subregular orbit and the other one is the regular orbit. The map $\chi$ is given by the character of the fundamental 2-dimensional representation of $G$, i.e., the trace
$$\chi : SL_2(\mathbb{C}) \longrightarrow \mathbb{C}, \qquad \begin{pmatrix} x & y \\ z & w \end{pmatrix} \longmapsto x + w.$$
Hence, the defining equation for the unipotent variety is $w = 2 - x$. Substituting this into the defining equation $xw - zy = 1$ for $SL_2(\mathbb{C})$ leads to $(x-1)^2 + zy = 0$. This is just the equation for a simple singularity of type $A_1$.
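The substitution in Example 1.1 can be written out step by step. The sketch below uses only the trace condition and the determinant condition from the example; the numerical check on one matrix and the coordinates $s, p, r$ are illustrations added here, not part of the original argument.

```latex
\begin{align*}
x + w = 2, \quad xw - zy = 1
  &\;\Longrightarrow\; x(2 - x) - zy = 1 \\
  &\;\Longleftrightarrow\; -(x^2 - 2x + 1) - zy = 0 \\
  &\;\Longleftrightarrow\; (x - 1)^2 + zy = 0 .
\end{align*}
% Check on the unipotent matrix with x = 1, y = 1, z = 0, w = 1:
% trace = 2, determinant = 1, and (1 - 1)^2 + 0 \cdot 1 = 0.
%
% Assumed coordinates s = x - 1, \; p = (z + y)/2, \; r = (y - z)/(2i):
p^2 + r^2 = \tfrac{(z+y)^2 - (y-z)^2}{4} = zy
\quad\Longrightarrow\quad
s^2 + p^2 + r^2 = 0 ,
```

which is the $A_\ell$ normal form $x^{\ell+1} + y^2 + z^2 = 0$ for $\ell = 1$.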
Figure 1. The unipotent variety of $SL_2(\mathbb{C})$, showing the regular orbit through $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and the subregular orbit $\left\{ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right\}$.
So the unipotent variety of $SL_2(\mathbb{C})$ has a singularity of this type at the unit element 1 (cf. Fig. 1). In general, the unipotent variety is singular along the closure of the subregular unipotent orbit. Hence, in all cases except $A_1$, the unipotent variety has non-isolated singularities. However, at the generic point of the singular locus, this singularity is a simple hypersurface singularity of the corresponding type.
Theorem 1.2 (Brieskorn’s Theorem, cf. [5]). Assume that $G$ is of type $A_\ell$, $D_\ell$ or $E_\ell$. The intersection of a transversal slice to the subregular unipotent orbit with the unipotent variety has a simple surface singularity of the same type as
$G$. Moreover, the restriction of the map $\chi$ to this slice is the semi-universal deformation of the simple singularity.
Actually, the second statement of this theorem was one of its motivations. Since it was known already that the discriminant of the map $\chi$ and the discriminant of a simple singularity of the corresponding type were isomorphic, it was likely that there is a closer connection between simple algebraic groups and simple singularities. For groups of type $B_\ell$, $C_\ell$, $F_4$ or $G_2$, the situation was temporarily less clear, since there was no associated singularity. However, V.I. Arnold [2] introduced the notion of boundary singularities and showed that the simple objects in this category correspond to the simple algebraic groups of type $B_\ell$, $C_\ell$ and $F_4$. A boundary singularity is equivalent to an ordinary singularity together with an action of a cyclic group $\mathbb{Z}_2$ of order 2 and therefore, the notion of boundary singularities naturally generalizes to the notion of singularities with a finite symmetry group. The remaining case $G_2$ is now associated to a simple singularity of type $D_4$ with the action of the symmetric group $S_3$ of order 6, and Brieskorn’s Theorem extends as follows.
Theorem 1.3 (cf. P. Slodowy [21]). Assume that $G$ is of type $B_\ell$, $C_\ell$, $F_4$ or $G_2$. The intersection of a transversal slice to the subregular unipotent orbit with the unipotent variety has a simple singularity of the following type
$$\begin{array}{l|cccc}
G & B_\ell & C_\ell & F_4 & G_2 \\ \hline
\text{Type} & A_{2\ell-1} & D_{\ell+1} & E_6 & D_4.
\end{array}$$
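As a rough consistency check on this table (an added illustration, not part of the original text), one can count the orbits of the symmetry group on the nodes of the Dynkin diagram of the singularity and compare with the rank of $G$, which is the dimension of the base space $T/W$. The node-orbit counts below are the standard ones for these diagram automorphisms and are stated here as background facts.

```latex
\begin{array}{l|l|l|l|l}
G & \text{Type} & \text{symmetry} & \text{nodes} & \text{node orbits} \\ \hline
B_\ell & A_{2\ell-1} & \mathbb{Z}_2 & 2\ell-1 & (\ell-1) + 1 = \ell \\
C_\ell & D_{\ell+1}  & \mathbb{Z}_2 & \ell+1  & (\ell-1) + 1 = \ell \\
F_4    & E_6         & \mathbb{Z}_2 & 6       & 2 + 2 = 4 \\
G_2    & D_4         & S_3          & 4       & 1 + 1 = 2
\end{array}
% In each row the number of node orbits equals \operatorname{rank} G = \dim T/W.
```

In every case the count agrees with $\operatorname{rank} G$, which is at least compatible with the restriction of $\chi$ being the invariant part of the semi-universal deformation.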
The group of connected components of the centralizer of a subregular element (isomorphic to $\mathbb{Z}_2$ for $B_\ell$, $C_\ell$, $F_4$ and $S_3$ for $G_2$) acts on the transversal slice and the restriction of $\chi$ to this slice is the invariant part of the semi-universal deformation of the simple singularity.
In general, if a finite group acts on a hypersurface singularity, then it automatically acts also on the base space of its semi-universal deformation. Since the symmetry group in the previous statement comes from the adjoint action of $G$ on itself and $\chi$ is invariant under this action, the deformation
induced by $\chi$ must come from a morphism from the base space $T/W$ to the invariant part of the semi-universal deformation. What is a bit surprising is that this morphism is an isomorphism in all cases.
Figure 2. Blow-down of the zero-section of a line bundle over a curve (elliptic curve, elliptic singularity).
E. Brieskorn [5] sketched two proofs for his theorem. Both are explained in full detail in [21]. The first argument is very geometric. It uses a group theoretic construction of a resolution of the singularities of the unipotent variety due to T.A. Springer (and A. Grothendieck for the whole family $\chi$). R. Steinberg and J. Tits proved that the exceptional locus over a subregular element is a union of smooth rational curves whose dual graph is the Dynkin diagram of $G$. In order to conclude that the singularity is indeed of the expected type, one needs to know that the self-intersection numbers of all those rational curves in a transversal slice are $-2$, which was later proved by H. Esnault. The second argument is technically easier and based on the quasi-homogeneity of simple singularities. First note that the exponential map is a morphism from the Lie algebra $\mathfrak{g}$ of $G$ onto $G$ which maps the nilpotent variety isomorphically onto the unipotent variety. Therefore, it is equivalent to prove the corresponding statements for the Lie algebra $\mathfrak{g}$ instead of $G$. The advantage is that on $\mathfrak{g}$ one has a natural $\mathbb{C}^*$-action which makes the deformation equivariant. It is basically just given by scalar multiplication but it has to be modified with a 1-parameter subgroup of $G$ which comes from a Jacobson–Morozov triplet for the subregular nilpotent element. The weights of this action on the transversal slice and the base space can be expressed in terms of some well-known invariants of the corresponding root system. For a general quasi-homogeneous singularity, this information is far from being enough to identify it.
However, it is a very special property of simple singularities that they and their semi-universal deformations are indeed uniquely determined by those weights.
1.4. Simple elliptic singularities. After the connection between simple algebraic groups and simple singularities, as explained in the previous section, was established, there was the hope that one could extend this kind of connection to a larger class of singularities, which would help to understand their deformation theory (cf. [22] and in much more detail [23], [24], [25], [26]). From the view
point of V. I. Arnold’s classification, it is natural to look at uni-modular singularities next, i.e., those which deform in an (at most) one-dimensional family of pairwise non-isomorphic singularities. Some of the simplest uni-modular singularities can be constructed as follows: Let $L$ be the total space of a line bundle over an elliptic curve $E$ such that $k := -\deg L > 0$. Then, one can contract the zero-section of $L$ to a point and the result is called a simple elliptic singularity of degree $k$ (cf. Fig. 2). They have been widely studied by K. Saito [20], E.J.N. Looijenga [14], [15], [16], H.C. Pinkham [18] and J.Y. Mérindol [17]. The embedding dimension of a simple elliptic singularity of degree $k \le 3$ is equal to 3, i.e., it is a hypersurface singularity. Its equation can be written as
$$\begin{array}{ll}
E_8^{(1)} & x^6 + y^3 + z^2 + \lambda\, xyz = 0 \\
E_7^{(1)} & x^4 + y^4 + z^2 + \lambda\, xyz = 0 \\
E_6^{(1)} & x^3 + y^3 + z^3 + \lambda\, xyz = 0
\end{array}$$
for $k = 1$, 2 and 3 respectively. The parameter $\lambda$ is a constant which is related to the $j$-invariant of the elliptic curve $E$ (cf. [20]). For $k \ge 4$ the embedding dimension is $k$. A simple elliptic singularity of degree 4 is called $D_5^{(1)}$. It is the complete intersection of two quadrics in four variables. For $k > 4$, a simple elliptic singularity of degree $k$ is not a complete intersection anymore. At least some of those are also related to affine root systems. For example, a simple elliptic singularity of degree 5 is related to $A_4^{(1)}$. But since they are not complete intersections, they can unfortunately not appear in the following construction. There might be some other construction, but this is still unknown.
1.5. Conjugacy classes in affine Kac–Moody groups. The deformation theory of simple elliptic singularities shows that they are related to affine root systems. In fact, the discriminant of the semi-universal deformation of a simple elliptic singularity is isomorphic to the discriminant of an affine Weyl group (cf. [15]). The groups corresponding to affine root systems are the so-called affine Kac–Moody groups (cf. [13]). We will need a certain completion of those groups, which can be constructed as follows. Let $G$ be a simple and simply connected algebraic group over $\mathbb{C}$ as before. The set of holomorphic loops
$$LG := \{\, \varphi : \mathbb{C}^* \longrightarrow G \mid \varphi \text{ is holomorphic} \,\}$$
has a natural group structure (pointwise multiplication) and is called a holomorphic loop group. It has a universal central extension $\widetilde{LG}$ with the center isomorphic to the multiplicative group $\mathbb{C}^*$, i.e., there is an exact sequence
$$1 \longrightarrow \mathbb{C}^* \longrightarrow \widetilde{LG} \longrightarrow LG \longrightarrow 1$$
and every other central extension of $LG$ with $\mathbb{C}^*$ contained in its center is induced from this exact sequence (cf. [19]). The action of $\mathbb{C}^*$ on $LG$ given
by $(q \cdot \varphi)(z) = \varphi(qz)$ extends to the universal central extension $\widetilde{LG}$ and hence, one can construct the semi-direct product
$$\widehat{LG} := \widetilde{LG} \rtimes \mathbb{C}^*,$$
which is the desired completion of an affine Kac–Moody group. If the group $G$ is of type $X_\ell$ for some $X \in \{A, B, \dots, F\}$, then the group $\widehat{LG}$ is said to be of type $X_\ell^{(1)}$. In contrast to the finite-dimensional group $G$, some elements in $\widehat{LG}$ do not have a Jordan decomposition. Moreover, the exponential map for $\widehat{LG}$ is not surjective. Therefore, we cannot reduce the study of the closures of conjugacy classes in the Lie group $\widehat{LG}$ to those in its Lie algebra as before in the finite-dimensional case. But at least an analog of the map $\chi$ can be constructed in the following way. The group $\widehat{LG}$ has $\ell + 1$ fundamental highest weight representations (cf. [19]), where $\ell = \operatorname{rank} G$. The formal characters $\chi_0, \dots, \chi_\ell$ of those highest weight representations are convergent on the open set
$$UG := \widetilde{LG} \times D^* \subset \widehat{LG}, \quad \text{where } D^* := \{\, q \in \mathbb{C}^* \mid |q| < 1 \,\},$$
and they are invariant under conjugation with arbitrary elements of $\widehat{LG}$. This follows from the much more general work by R. Goodman and N. Wallach [9]. Those fundamental characters are the first $\ell + 1$ components of our map
$$\widehat{\chi} : UG \longrightarrow \mathbb{C}^{\ell+1} \times D^*.$$
The final component is just the canonical projection onto the pointed disc $D^*$, which is also invariant under conjugation. Hence, every conjugacy class in $UG$ is contained in a fiber of $\widehat{\chi}$. The codimension of every conjugacy class in $UG$ is finite and every fiber of $\widehat{\chi}$ for which at least one of the fundamental characters is non-zero contains only finitely many conjugacy classes. The elements in those classes also have a unique Jordan decomposition and the centralizer of their semi-simple part is a finite-dimensional reductive group. However, the fibers over elements $(0, q)$ with $q \in D^*$ contain infinitely many conjugacy classes and those do not admit a Jordan decomposition.
In order to prove those first results on conjugacy classes in $UG$ and eventually to get a more complete classification, the following observation due to E.J.N. Looijenga is very helpful. The quotient of the multiplicative group $\mathbb{C}^*$ by the cyclic subgroup generated by an element $q \in D^*$ is an elliptic curve $E$. Moreover, for any element $\varphi \in LG$ one can define an action of the free cyclic group $\mathbb{Z}$ on the product $\mathbb{C}^* \times G$ such that the generator of $\mathbb{Z}$ acts by the automorphism
$$\mathbb{C}^* \times G \longrightarrow \mathbb{C}^* \times G, \qquad (z, g) \longmapsto \big(qz,\ \varphi(z) \cdot g\big),$$
and the quotient of $\mathbb{C}^* \times G$ by this $\mathbb{Z}$-action is a principal $G$-bundle over $E$. It turns out that the isomorphism class of this principal $G$-bundle depends only on the $LG$-conjugacy class of the element $(\varphi, q) \in LG \rtimes \mathbb{C}^*$. This natural construction even induces a bijection between $LG$-conjugacy classes in $LG \times \{q\}$ and isomorphism classes of principal $G$-bundles over $E$. Also note that the
multiplicative subgroup $\mathbb{C}^* \subset LG \rtimes \mathbb{C}^*$ acts as a translation group on the elliptic curve $E$. Now, consider an $\widehat{LG}$-orbit $O \subset UG$ and assume first that at least one of the fundamental characters is non-zero on $O$. Then, one can show that $O$ is invariant under the translation group and the projection of $O$ to $LG \rtimes \mathbb{C}^*$ has finite fibers. The principal $G$-bundle associated to its image is semi-stable (cf. [4]). On the other hand, if all the fundamental characters vanish on $O$, then the orbit is not invariant under the translation group and the fibers of the projection to $LG \rtimes \mathbb{C}^*$ are isomorphic to the center $\mathbb{C}^*$. In this case, the principal $G$-bundle associated to its image is unstable. In summary, one can therefore say that the classification of conjugacy classes of $UG$ is equivalent to the classification of principal bundles over an elliptic curve.
1.6. Singularities of the unstable variety. As in the finite-dimensional case, all the fibers of $\widehat{\chi}$ through elements which admit a Jordan decomposition are isomorphic to fiber bundles over the unipotent variety of the centralizer of any semi-simple element in that fiber. Since these centralizers are finite-dimensional reductive groups, those fibers have at most simple singularities in codimension 2, by Brieskorn’s Theorem. But the fibers over $0 \times D^*$ look different. For simplicity, in the following we will fix a number $q \in D^*$. The fiber $\widehat{\chi}^{-1}(0, q)$ is called the unstable variety. If $G$ is of type $A_\ell$, then this variety has $\ell$ irreducible components. In all other cases the unstable variety is irreducible. Each irreducible component contains a dense (regular) orbit. This was independently found by G. Brüchert [6] and by R. Friedman, J. Morgan and E. Witten [8]. There is no orbit which has codimension 1 in the unstable variety and there are always (subregular) orbits which have codimension 2. However, in most cases we find even 1-parameter families of subregular orbits.
Hence, the codimension of the complement of the regular orbit(s) may be 1, in sharp contrast to the finite-dimensional case. In [11] we classified all those subregular orbits (even for non-simply connected groups $G$). For groups of type $A_\ell^{(1)}$, $D_\ell^{(1)}$ and $E_\ell^{(1)}$ the result can be summarized in the following table.
[Table: closure diagrams of the regular and subregular unstable orbits for the types $A_1^{(1)}$, $A_{\ell>1}^{(1)}$, $D_4^{(1)}$, $D_5^{(1)}$, $D_{\ell>5}^{(1)}$, $E_6^{(1)}$, $E_7^{(1)}$ and $E_8^{(1)}$.]
Here, a single circle ◦ represents one unstable orbit and a double circle ◦◦ represents a 1-parameter family of orbits. The circles at the top represent the regular
orbits and the circles directly below represent the subregular orbits. A subregular orbit or family is connected by a line to a regular one if it is contained in its closure. Subregular orbits which appear in a 1-parameter family are called non-isolated subregular orbits, the others are called isolated subregular orbits. As we see, the exceptional groups contain only isolated subregular orbits, but the classical groups contain mostly non-isolated subregular orbits, except $A_1^{(1)}$ and $D_5^{(1)}$. The analogue of Brieskorn’s Theorem can now be stated as follows.
Theorem 1.4 (cf. [10] and [12]). Let $\widehat{LG}$ be a group of type $D_5^{(1)}$, $E_6^{(1)}$, $E_7^{(1)}$ or $E_8^{(1)}$. The intersection of a transversal slice to an isolated subregular unstable orbit with the unstable variety has a simple elliptic singularity of the same type as $\widehat{LG}$. Moreover, the restriction of the map $\widehat{\chi}$ to this slice is the semi-universal deformation of the simple elliptic singularity.
Again, there are essentially two different ways to prove this result. The first one uses the deformation theory of principal $G$-bundles over an elliptic curve. The transversal slice can be identified with the base space of the semi-universal deformation of the subregular unstable principal $G$-bundle. Under this identification, the intersection of the transversal slice with the unstable variety is just the locus in the base space corresponding to unstable bundles. In general, the semi-universal deformation of a subregular unstable principal $G$-bundle is quite complicated, but at least for $D_5^{(1)}$ we were able to determine exactly its unstable locus. The other argument uses quasi-homogeneity of simple elliptic singularities. Here, a $\mathbb{C}^*$-action can be directly realized in the group. Essentially it is given by multiplication with the center $\mathbb{C}^* \subset \widehat{LG}$, but, as in the finite-dimensional case, it has to be modified with a 1-parameter subgroup in order to fix a subregular unstable element. The calculation of the weights of this action on the transversal slice and the base space is very similar to the corresponding calculation in the finite-dimensional case. It turns out that those weights coincide with the weights for the corresponding simple elliptic singularity. Finally, the simple elliptic singularities also have the property that they are uniquely determined by those weights, except that the $j$-invariant of the elliptic curve cannot be obtained this way. But even the $j$-invariant of the elliptic curve is then determined by the discriminant of the semi-universal deformation of the corresponding simple elliptic singularity. We used this method for the exceptional groups, since the first method was too complicated in those cases.
Before the existence of subregular unstable orbits was known, it was a somewhat mysterious question what would happen for the majority of affine Kac–Moody groups, for which no simple elliptic singularity exists. Now we are able to answer this question.
First of all, the intersection of a transversal slice to a non-isolated subregular unstable orbit with the unstable variety has non-isolated singularities. If $\widehat{LG}$ is of type $A_\ell^{(1)}$ with $\ell > 1$, then those singularities are of
type $A_\infty$, i.e., they are isomorphic to two smooth planes intersecting transversally along a line. If $\widehat{LG}$ is of type $D_\ell^{(1)}$, then besides singularities of type $A_\infty$, at four non-isolated subregular unstable orbits the singularity is of type $D_\infty$, i.e., a so-called Whitney Umbrella (cf. Fig. 3). The semi-universal deformation of such a non-isolated hypersurface singularity has an infinite-dimensional base space. But some careful analysis shows that the singularity has an additional symmetry group which is now infinite, and the restriction of the map $\widehat{\chi}$ to the transversal slice is essentially the invariant part of the semi-universal deformation. The situation is therefore very similar to the finite-dimensional case, where also no simple singularity corresponds to a group of type $B_\ell$, $C_\ell$, $F_4$ and $G_2$, but instead they correspond to a simple singularity with symmetry.
Figure 3. The Whitney Umbrella $x^2 y = z^2$ ($D_\infty$), drawn with coordinate axes $x$, $y$, $z$.
2. Singularities of orbit closures in $A_1^{(1)}$
In this section, we will study the unstable variety of an affine Kac–Moody group of type $A_1^{(1)}$ in terms of rank 2 vector bundles over an elliptic curve $E$. Recall that the orbits in the unstable variety are in one-to-one correspondence with the isomorphism classes of unstable principal $SL_2(\mathbb{C})$-bundles modulo translations on the elliptic curve, due to Looijenga’s construction. Given a principal $SL_2(\mathbb{C})$-bundle, one can associate to it a rank 2 vector bundle $\mathcal{E}$ with trivial determinant by using the fundamental 2-dimensional representation of $SL_2(\mathbb{C})$. On the other hand, if $\mathcal{E}$ is a vector bundle of rank 2, its frame bundle is a principal $GL_2(\mathbb{C})$-bundle and if the determinant of $\mathcal{E}$ is trivial, the structure group of this principal bundle reduces to $SL_2(\mathbb{C})$. It is therefore equivalent to study the deformations of rank 2 vector bundles with trivial determinant instead of principal $SL_2(\mathbb{C})$-bundles, which we will do in the following.
Definition 2.1. Let $\mathcal{E}$ be a rank 2 vector bundle with trivial determinant over an elliptic curve $E$. The instability index of $\mathcal{E}$ is the integer
$$i(\mathcal{E}) := \max\{\, \deg L \mid L \text{ a line bundle with } \operatorname{Hom}(L, \mathcal{E}) \neq 0 \,\}.$$
Note that $i(\mathcal{E}) \ge 0$ for every rank 2 vector bundle $\mathcal{E}$ with trivial determinant over $E$ and that $i(\mathcal{E}) > 0$ if and only if $\mathcal{E}$ is unstable.
Lemma 2.2. Let $\mathcal{E}$ be an unstable rank 2 vector bundle with trivial determinant over an elliptic curve $E$. Then there is a unique line subbundle $L$ of $\mathcal{E}$ with $\deg L = i(\mathcal{E})$. Moreover,
$$\mathcal{E} \cong L \oplus L^*. \tag{2.1}$$
Proof. In fact, just by the definition we can find a line bundle $L$ over $E$ whose degree is equal to the instability index of $\mathcal{E}$ together with a non-trivial homomorphism $L \to \mathcal{E}$. Because of the maximality of the degree of $L$ and the fact that the local rings of $E$ are principal ideal domains, $L$ must be a subbundle of $\mathcal{E}$. Since $\mathcal{E}$ has trivial determinant it is therefore an extension of $L^*$ by $L$. But on an elliptic curve $\operatorname{Ext}^1(L^*, L) = 0$ for any line bundle $L$ with positive degree. Therefore, the extension splits as a direct sum (2.1) and this also implies the uniqueness of $L$.
The isomorphism classes of line bundles over $E$ of a fixed degree $i$ are in one-to-one correspondence with points on $E$. But if $i > 0$, then all those line bundles are equivalent up to translations on $E$. Therefore, for each $i = 1, 2, \dots$ there is exactly one isomorphism class of rank 2 vector bundles with trivial determinant and instability index $i$ modulo translations.
We will now describe the deformations of an unstable rank 2 vector bundle $\mathcal{E}$ with trivial determinant. The tangent space of the semi-universal deformation of $\mathcal{E}$ is isomorphic to $\operatorname{Ext}^1(\mathcal{E}, \mathcal{E}) \cong H^1(\operatorname{End} \mathcal{E})$. But this includes deformations which change the determinant. In order to keep the determinant fixed we have to consider only the subspace $H^1(\operatorname{End}_0 \mathcal{E})$, where $\operatorname{End}_0 \mathcal{E}$ denotes the traceless endomorphisms of $\mathcal{E}$. If $L$ is a line subbundle of $\mathcal{E}$ with maximal degree $i = i(\mathcal{E})$, then, using (2.1) we find
$$\operatorname{End}_0 \mathcal{E} \cong L^{-2} \oplus \mathcal{O} \oplus L^{2} \quad \text{and} \quad H^1(\operatorname{End}_0 \mathcal{E}) \cong H^1(L^{-2}) \oplus \mathbb{C}.$$
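The dimension count behind this decomposition follows from Riemann–Roch and Serre duality on an elliptic curve. The sketch below is an added illustration; the auxiliary line bundle $M$ is a symbol introduced only for this example.

```latex
% On an elliptic curve E (genus 1, trivial canonical bundle), for a line bundle M:
h^0(M) - h^1(M) = \deg M, \qquad h^1(M) = h^0(M^{-1}).
% A line bundle of negative degree has no non-zero sections, hence
\deg M > 0 \;\Longrightarrow\; h^1(M) = 0 \ \text{ and } \ h^0(M) = \deg M .
% Applied to the three summands L^{-2}, \mathcal{O}, L^{2} with \deg L = i > 0:
h^1(L^{2}) = 0, \qquad h^1(\mathcal{O}) = 1, \qquad h^1(L^{-2}) = h^0(L^{2}) = 2i,
% so that
\dim H^1(\operatorname{End}_0 \mathcal{E}) = 2i + 1 .
```

The summand of dimension $2i$ is the one that survives once the one-dimensional contribution of $H^1(\mathcal{O})$ is set aside.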
The second term in $H^1(\operatorname{End}_0 \mathcal{E})$ comes from $H^1(\mathcal{O})$. Note that the infinitesimal translation group of $E$ acts transitively on this term. Since we consider isomorphism classes of bundles modulo translations on $E$, we may ignore this term. Hence, the tangent space of the semi-universal deformation of $\mathcal{E}$ with fixed determinant and modulo translations reduces to the $2i$-dimensional space
$$\Lambda := H^1(L^{-2}) \cong \operatorname{Ext}^1(L, L^*).$$
The deformations are all unobstructed and we can even identify $\Lambda$ with the global base space of the semi-universal deformation, not only the infinitesimal one: An element $\lambda \in \Lambda$ corresponds to an extension
$$0 \longrightarrow L^* \longrightarrow \mathcal{E}_\lambda \longrightarrow L \longrightarrow 0 \tag{2.2}$$
such that, after applying the functor $\operatorname{Hom}(L, –)$, the connecting homomorphism
$$1 \in \operatorname{Hom}(L, L) \xrightarrow{\ \delta\ } \operatorname{Ext}^1(L, L^*) \ni \lambda \tag{2.3}$$
maps the identity to the extension class $\lambda$. Our main interest is the unstable locus in the deformation space of the subregular bundle, which corresponds to the case $i = 2$. We will see that this locus is the cone over an elliptic curve of degree 4 and therefore a simple elliptic singularity of type $D_5^{(1)}$. More generally, we can prove the following.
Theorem 2.3. Let $\mathcal{E}$ be an unstable rank 2 vector bundle with trivial determinant over an elliptic curve $E$ and let $\Lambda$ be the base space of its semi-universal deformation. Assume that the instability index $i$ of $\mathcal{E}$ is at least 2 and let
$$\Lambda_j := \{\, \lambda \in \Lambda \mid i(\mathcal{E}_\lambda) \ge j \,\}$$
for $j = 0, \dots, i$. Then, the stratum $\Lambda_{i-1}$ is the cone over a regular embedding of $E$ into $\mathbb{P}(\Lambda)$, i.e., $\Lambda_{i-1}$ has a simple elliptic singularity of degree $2i$ at 0.
Proof. Suppose that $\lambda \in \Lambda_{i-1}$. Then, we can find a line bundle $L'$ of degree $i - 1$ and a non-trivial homomorphism $L' \to \mathcal{E}_\lambda$. Note that $\operatorname{Hom}(L', L)$ is 1-dimensional. In other words, there is a non-trivial homomorphism $h : L' \to L$, which is unique up to a scalar. This homomorphism has exactly one zero at some closed point $P \in E$, i.e., there is an exact sequence
$$0 \longrightarrow L' \xrightarrow{\ h\ } L \longrightarrow L \otimes \mathcal{O}_P \longrightarrow 0. \tag{2.4}$$
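We now apply the two functors $\operatorname{Hom}(L, –)$ and $\operatorname{Hom}(L', –)$ to the exact sequence (2.2) to get the following commutative diagram
$$\begin{array}{ccccccc}
0 & \longrightarrow & \operatorname{Hom}(L, \mathcal{E}_\lambda) & \longrightarrow & \operatorname{Hom}(L, L) & \xrightarrow{\ \delta\ } & \operatorname{Ext}^1(L, L^*) \\
 & & \downarrow h^* & & \downarrow h^* & & \downarrow h^* \\
0 & \longrightarrow & \operatorname{Hom}(L', \mathcal{E}_\lambda) & \longrightarrow & \operatorname{Hom}(L', L) & \longrightarrow & \operatorname{Ext}^1(L', L^*).
\end{array}$$
The map $\delta$ is the connecting homomorphism from (2.3) and therefore, it maps the identity to $\lambda$. On the other hand, the second vertical map $h^*$ is an isomorphism, since it maps the identity to $h$. So, we may conclude that
$$\operatorname{Hom}(L', \mathcal{E}_\lambda) \neq 0 \quad \text{if and only if} \quad \lambda \in \operatorname{Ker} h^*. \tag{2.5}$$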
On the other hand, if we apply the functor $H^0(L \otimes –)^*$ to (2.4), we get
$$0 \longrightarrow H^0(L^2 \otimes \mathcal{O}_P)^* \longrightarrow H^0(L^2)^* \xrightarrow{\ h^*\ } H^0(L \otimes L')^* \longrightarrow 0.$$
Using Serre duality, we can identify the homomorphism labeled $h^*$ in this exact sequence with the homomorphism $h^*$ in (2.5). In other words, the kernel of $h^*$ is nothing else than the image of the dual of the evaluation map of global sections of $L^2$ at $P$. This is a 1-dimensional subspace of $\Lambda$. When $L'$ runs through all line bundles of degree $i - 1$, then $P$ runs through all closed points of $E$ and the union of the corresponding 1-dimensional subspaces of $\Lambda$ is the cone over
the embedding $E \to \mathbb{P}(\Lambda)$ given by the linear system of global sections of $L^2$. Finally, it follows from (2.5) that the stratum $\Lambda_{i-1}$ is exactly this cone.
Conclusion. Fix a number $q \in D^*$ such that $\mathbb{C}^*/\{\, q^n \mid n \in \mathbb{Z} \,\} \cong E$. Denote by $O_i$ the orbit in the unstable variety $\widehat{\chi}^{-1}(0, q)$ of $A_1^{(1)}$ which consists of those elements whose corresponding vector bundle has instability index $i$. Recall that in Lemma 2.2, we have shown that $O_i$ is indeed a single orbit and that the unstable variety is the disjoint union of $O_1, O_2, O_3, \dots$. A transversal slice at an element of $O_i$ induces a deformation of the corresponding rank 2 vector bundle $\mathcal{E}$. This deformation also deforms the elliptic curve $E$, but if we fix $q$ in the transversal slice, the deformation is exactly the semi-universal deformation of $\mathcal{E}$, which we studied in Theorem 2.3. Therefore, the codimension of $O_i$ in $UG$ is $2i + 1$ and $O_i$ is contained in the closure of $O_j$ if and only if $j \le i$. Moreover, the closure of $O_{i-1}$ has a simple elliptic singularity of degree $2i$ at the generic point of $O_i$. In summary, we may now draw a symbolic picture of the unstable variety of $A_1^{(1)}$ as follows.
$O_1$ ◦
|  cone over $E \to \mathbb{P}^3$
$O_2$ ◦
|  cone over $E \to \mathbb{P}^5$
$O_3$ ◦
|  cone over $E \to \mathbb{P}^7$
$O_4$ ◦
⋮
As usual, each orbit is represented by a circle and two circles are connected by a vertical line if the upper orbit contains the lower orbit in its closure. In addition, the circles are labeled by their instability index and the lines are labeled by the generic type of the singularity of the corresponding orbit closure.
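For orientation, the numbers in this picture can be tabulated for small $i$. The table below only evaluates formulas stated in Theorem 2.3 and in the Conclusion (dimension $2i$ of $\Lambda$, degree $2i$ of the embedded curve, codimension $2i+1$ of $O_i$ in $UG$); it is an added illustration rather than a new result.

```latex
\begin{array}{c|c|c|c|c}
i & \dim \Lambda = 2i & \mathbb{P}(\Lambda) = \mathbb{P}^{2i-1}
  & \text{degree of } E \subset \mathbb{P}(\Lambda)
  & \operatorname{codim}(O_i \subset UG) = 2i+1 \\ \hline
2 & 4 & \mathbb{P}^3 & 4 & 5 \\
3 & 6 & \mathbb{P}^5 & 6 & 7 \\
4 & 8 & \mathbb{P}^7 & 8 & 9
\end{array}
% Row i describes the singularity of the closure of O_{i-1} at the generic point of O_i.
```

The row $i = 2$ is the degree 4 case, which the text identifies with a simple elliptic singularity of type $D_5^{(1)}$.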
3. Isolated subregular singularities in $D_5^{(1)}$
In the previous section, we gave a complete description of the unstable variety of $A_1^{(1)}$ by using the deformation theory of rank 2 vector bundles with trivial determinant over an elliptic curve. This was possible essentially because the deformations of every unstable bundle of rank 2 with trivial determinant are realized as simple extensions (2.2). Unfortunately, this fails for most bundles of higher rank. However, there are other examples of bundles with this property and luckily, the two isolated subregular unstable orbits in $D_5^{(1)}$ correspond to such examples. We will now describe these particular principal bundles in terms of their associated vector bundles (cf. [11]).
Let $\mathcal{E}$ be an indecomposable vector bundle of rank 5 and degree 2 over an elliptic curve $E$. By [3] such a bundle exists and is unique up to translations on the elliptic curve. The direct sum with its dual,
$$\mathcal{F} := \mathcal{E} \oplus \mathcal{E}^* \tag{2.1'}$$
has a natural non-degenerate bilinear form. Therefore, the structure group of the frame bundle of $\mathcal{F}$ reduces to $SO_{10}(\mathbb{C})$. Since the degree of $\mathcal{E}$ is even, it has four spin structures and among the corresponding four principal $Spin_{10}(\mathbb{C})$-bundles, two are non-isomorphic. Those are exactly the principal bundles corresponding to the two isolated subregular unstable orbits in $D_5^{(1)}$.
In order to describe the deformations of $\mathcal{F}$ as a principal $Spin_{10}(\mathbb{C})$-bundle, we may fix a spin structure of $\mathcal{F}$ and then consider only the deformations of $\mathcal{F}$ as a principal $SO_{10}(\mathbb{C})$-bundle, since the spin structure will automatically extend to any of its deformations. The tangent space of the semi-universal deformation of $\mathcal{F}$ is isomorphic to the first cohomology group of the sheaf
$$\operatorname{End} \mathcal{F} \cong \operatorname{End} \mathcal{E} \oplus \operatorname{End} \mathcal{E}^* \oplus \operatorname{Hom}(\mathcal{E}, \mathcal{E}^*) \oplus \operatorname{Hom}(\mathcal{E}^*, \mathcal{E}).$$
The first two terms have 1-dimensional cohomology and the last one has trivial cohomology (cf. [3]). As in the previous section, the first two terms disappear if we consider only deformations with fixed determinant and if we identify bundles which differ only by a translation of the elliptic curve. The remaining third term corresponds to extensions
$$0 \longrightarrow \mathcal{E}^* \longrightarrow \mathcal{F}_\lambda \longrightarrow \mathcal{E} \longrightarrow 0. \tag{2.2'}$$
Those extensions parameterize exactly the deformations of $F$ as an $SL_{10}(\mathbb{C})$-bundle modulo translations. The non-degenerate bilinear form of $F$ extends to a deformation $F_\lambda$ if and only if the extension class $\lambda$ is skew-symmetric, i.e., if the connecting homomorphism
\[ 1 \in \operatorname{Hom}(\mathcal{E},\mathcal{E}) \xrightarrow{\ \delta\ } \operatorname{Ext}^1(\mathcal{E},\mathcal{E}^*) \supset H^1(\wedge^2\mathcal{E}^*) =: \Lambda \ni \lambda \tag{2.3'} \]
maps the identity to a skew-symmetric element of $\operatorname{Ext}^1(\mathcal{E},\mathcal{E}^*) \cong H^1(\mathcal{E}^*\otimes\mathcal{E}^*)$. To continue our discussion, we will need the following.

Lemma 3.1 (cf. M.F. Atiyah [3]). Let $\mathcal{E}$ be an indecomposable vector bundle of rank 5 and degree 2 over an elliptic curve $E$. Then there is an indecomposable vector bundle $G$ of rank 5 and degree 4 over $E$, such that $\wedge^2\mathcal{E} \cong G \oplus G$.

Proof. Theorem 14 of [3] shows that there is an indecomposable vector bundle $G$ of rank 5 and degree 4 over $E$, such that $\mathcal{E}\otimes\mathcal{E} \cong G \oplus G \oplus G \oplus G \oplus G$. On the other hand, the tensor product $\mathcal{E}\otimes\mathcal{E}$ is isomorphic to the direct sum of the second symmetric and the second exterior power of $\mathcal{E}$. The symmetric power has rank 15 and the exterior power has rank 10. This already proves the assertion of the lemma.

Remark 3.2. Theorem 14 of [3] does not in general determine exactly the isomorphism class of $G$. However, we have seen that the second symmetric power of $\mathcal{E}$ is isomorphic to the direct sum of three $G$ and the second exterior power
is isomorphic to the direct sum of two $G$. On the other hand, the determinant of the symmetric power is isomorphic to $(\det\mathcal{E})^6$ and the determinant of the exterior power is isomorphic to $(\det\mathcal{E})^4$. Combining those two calculations, we find $\det G \cong (\det\mathcal{E})^2$, which actually determines the isomorphism class of $G$ uniquely.

We can now prove the main result of the current section.

Theorem 3.3. Let $\mathcal{E}$ be a vector bundle of rank 5 and degree 2 over an elliptic curve $E$ and denote by $\Lambda$ the 8-dimensional base space of the semi-universal deformation of $F = \mathcal{E}\oplus\mathcal{E}^*$ as a $Spin_{10}(\mathbb{C})$-bundle. Then the unstable locus
\[ \{\lambda \in \Lambda = H^1(\wedge^2\mathcal{E}^*) \mid F_\lambda \text{ is unstable}\} \]
is contained in a 4-dimensional subspace $\Lambda_+ \subset \Lambda$ and it is equal to the cone over a regular embedding of $E$ into $\mathbb{P}(\Lambda_+)$.

Proof. Suppose that $\lambda \in \Lambda$ is such that the corresponding vector bundle $F_\lambda$ is unstable. Then we can find an indecomposable vector bundle $\mathcal{E}'$ of positive degree and an injective homomorphism $\mathcal{E}' \to F_\lambda$. Since this homomorphism degenerates to an injective homomorphism $\mathcal{E}' \to F$, we see that the vector bundle $\mathcal{E}'$ can only be of degree 1 and rank 3, 4 or 5, or of degree 2 and rank 5. If it is of degree 2 and rank 5, then $\lambda$ must be 0, which we may exclude in the following. If it is of degree 1 and rank 5, then $F_\lambda$ must be isomorphic to the direct sum of $\mathcal{E}'$ with its dual, which has no spin structure and can therefore not appear. If $\mathcal{E}'$ is of degree 1 and rank 4, then $F_\lambda$ is isomorphic to the direct sum of $\mathcal{E}'$, its dual and a rank 2 bundle. Again, since $F_\lambda$ must have a spin structure, this rank 2 bundle must be the direct sum of a line bundle with its dual and the line bundle must have odd degree. Those spin bundles have too large a deformation space and cannot appear either. So $\mathcal{E}'$ has rank 3 and degree 1 and there is a non-trivial homomorphism $h : \mathcal{E}' \to \mathcal{E}$ which is unique up to a scalar. Actually, $\mathcal{E}'$ is a subbundle of $\mathcal{E}$ and we have an exact sequence
\[ 0 \longrightarrow \mathcal{E}' \xrightarrow{\ h\ } \mathcal{E} \longrightarrow \mathcal{E}'' \longrightarrow 0, \tag{2.4'} \]
where $\mathcal{E}''$ is a vector bundle of rank 2 and degree 1. As in the proof of Theorem 2.3, we apply the two functors $\operatorname{Hom}(\mathcal{E}, -)$ and $\operatorname{Hom}(\mathcal{E}', -)$ to the exact sequence (2.2') to get the following commutative diagram
\[
\begin{array}{ccccccc}
0 & \longrightarrow & \operatorname{Hom}(\mathcal{E}, F_\lambda) & \longrightarrow & \operatorname{Hom}(\mathcal{E},\mathcal{E}) & \xrightarrow{\ \delta\ } & \operatorname{Ext}^1(\mathcal{E},\mathcal{E}^*) \supset \Lambda \ni \lambda \\
 & & \downarrow h^* & & \downarrow h^* & & \downarrow h^* \\
0 & \longrightarrow & \operatorname{Hom}(\mathcal{E}', F_\lambda) & \longrightarrow & \operatorname{Hom}(\mathcal{E}',\mathcal{E}) & \longrightarrow & \operatorname{Ext}^1(\mathcal{E}',\mathcal{E}^*)
\end{array}
\]
and as before, we see that there is a non-trivial homomorphism $\mathcal{E}' \to F_\lambda$ if and only if $\lambda$ lies in the kernel of the rightmost vertical homomorphism $h^*$ restricted to $\Lambda$. Obviously, this kernel contains the 1-dimensional subspace $H^1(\wedge^2\mathcal{E}''^*)$ and we claim that it is actually equal to this 1-dimensional subspace. To see
this, we may degenerate the exact sequence (2.4') to a splitting sequence, i.e., we may replace $\mathcal{E}$ by the direct sum $\mathcal{E}' \oplus \mathcal{E}''$. Then the first two terms of
\[ H^1\big(\wedge^2(\mathcal{E}'^* \oplus \mathcal{E}''^*)\big) \cong H^1(\wedge^2\mathcal{E}'^*) \oplus H^1(\mathcal{E}'^*\otimes\mathcal{E}''^*) \oplus H^1(\wedge^2\mathcal{E}''^*) \]
map under the degeneration of $h^*$ injectively into the corresponding terms in
\[ \operatorname{Ext}^1(\mathcal{E}', \mathcal{E}'^*\oplus\mathcal{E}''^*) \cong H^1(\mathcal{E}'^*\otimes\mathcal{E}'^*) \oplus H^1(\mathcal{E}'^*\otimes\mathcal{E}''^*) \]
and hence, as claimed, only the third term survives. In combination with our previous observation, this shows that for every indecomposable rank 3 and degree 1 bundle $\mathcal{E}'$ there is a line in $\Lambda$, such that all $F_\lambda$ with $\lambda$ contained in this line admit a non-trivial homomorphism $\mathcal{E}' \to F_\lambda$, and the unstable locus is the union of those lines. More precisely, the line in $\Lambda$ corresponding to $\mathcal{E}'$ is the image of the homomorphism
\[ H^1(\wedge^2\mathcal{E}''^*) \longrightarrow H^1(\wedge^2\mathcal{E}^*) \tag{3.1} \]
induced from the exact sequence (2.4'). Now, recall that the second exterior power of $\mathcal{E}$ is a direct sum of a rank 5 and degree 4 vector bundle $G$ with itself, by Lemma 3.1. The bundle $G$ can be realized as an extension of its determinant $L$, which is a line bundle of degree 4, by a trivial bundle of rank 4:
\[ 0 \longrightarrow H^0(L)\otimes\mathcal{O} \longrightarrow G \longrightarrow L \longrightarrow 0. \tag{3.2} \]
Denote by $L'$ the determinant of $\mathcal{E}''$, which is a line bundle of degree 1. The functor $\operatorname{Hom}(-, L')$ applied to the previous exact sequence (3.2) leads us to
\[ 0 \longrightarrow \operatorname{Hom}(G, L') \longrightarrow H^0(L)^*\otimes H^0(L') \longrightarrow \operatorname{Ext}^1(L, L') \longrightarrow 0. \tag{3.3} \]
In order to determine the image of the homomorphism (3.1), we will consider a generic projection onto the vector bundle $G^*$. Note that there is a unique line subbundle $L'^* \to G^*$. Using Serre duality and (3.2) we see that, on the one hand, $H^0(L)^*$ is isomorphic to $H^1(G^*)$ and the image of the left homomorphism in (3.3) coincides with the image of the induced map $H^1(L'^*) \to H^1(G^*)$ and, on the other hand, the right homomorphism in (3.3) is the dual of the cup product
\[ H^0(L\otimes L'^*)\otimes H^0(L') \longrightarrow H^0(L). \]
When $\mathcal{E}'$ runs through all indecomposable rank 3 vector bundles of degree 1, then $L'$ runs through all line bundles of degree 1 and therefore, the union of all the corresponding lines in $H^1(G^*)$ is the same as the cone over the embedding of $E$ given by the linear system of global sections of $L$. Finally, since the map from this cone into $H^1(G^*)$ is unique up to a scalar, there is a unique subspace $\Lambda_+ \cong H^1(G^*)$ of the deformation space $\Lambda$, such that the image of (3.1) is contained in $\Lambda_+$ for all $\mathcal{E}'$.

Conclusion. As a consequence of Theorem 3.3 we have proved that the intersection of a transversal slice to an isolated subregular unstable orbit with the unstable variety of $D_5^{(1)}$ has a simple elliptic singularity of the same type. From this, it is easy to conclude that the restriction of the map $\chi$ to the transversal
slice is the semi-universal deformation of the simple elliptic singularity. But unfortunately, it is quite difficult to use this technique for principal bundles of type E. Here we rely entirely on the weights of the $\mathbb{C}^*$-action on the transversal slice, as explained in the introduction.

References

[1] V.I. Arnol'd, Normal forms for functions near degenerate critical points, the Weyl groups $A_k$, $D_k$, $E_k$, and Lagrangian singularities, Funct. Anal. Appl. 27 (1972), 254–272.
[2] V.I. Arnol'd, Critical points of functions on a manifold with boundary, the simple groups $B_k$, $C_k$ and $F_4$ and singularities of evolutes, Russian Math. Surveys 33 (1978), 99–116.
[3] M.F. Atiyah, Vector bundles over an elliptic curve, Proc. London Math. Soc. 7 (1957), 414–452.
[4] V. Baranovsky and V. Ginzburg, Conjugacy classes in loop groups and G-bundles on elliptic curves, Internat. Math. Res. Notices 15 (1996), 733–751.
[5] E. Brieskorn, Singular elements of semisimple algebraic groups, Actes Congr. Int. Math., Nice, tome 2 (1970), 279–284.
[6] G. Brüchert, Trace class elements and cross sections in Kac–Moody groups, Can. J. Math. 50 (1998), 972–1006.
[7] E.B. Dynkin, Semisimple subalgebras of semisimple Lie algebras, Amer. Math. Soc. Transl. 2 (1957), 111–156.
[8] R. Friedman, J. Morgan and E. Witten, Vector bundles and F-theory, Commun. Math. Phys. 187 (1997), 679–743.
[9] R. Goodman and N. Wallach, Structure of unitary cocycle representations of loop groups and the group of diffeomorphisms of the circle, J. Reine Angew. Math. 347 (1984), 69–133.
[10] S. Helmke and P. Slodowy, Loop groups, principal bundles over elliptic curves and elliptic singularities, Annual Meeting of the Math. Soc. of Japan, Hiroshima, Sept. 1999, Abstracts, Section Infinite-dimensional Analysis, 67–77.
[11] S. Helmke and P. Slodowy, On unstable principal bundles over elliptic curves, Publ. RIMS 37 (2001), 349–395.
[12] S. Helmke and P. Slodowy, Loop groups, elliptic singularities and principal bundles over elliptic curves, Geometry and Topology of Caustics – Caustics '02, Banach Center Publ. 62, Warszawa (2004), 87–99.
[13] V.G. Kac, Constructing groups associated to infinite-dimensional Lie algebras, Infinite-dimensional groups with applications, ed. V.G. Kac, MSRI Publications Vol. 4, 167–216, Springer Verlag (1985).
[14] E.J.N. Looijenga, Root systems and elliptic curves, Invent. Math. 38 (1976), 17–32.
[15] E.J.N. Looijenga, On the semi-universal deformation of a simple elliptic singularity II, Topology 17 (1978), 23–40.
[16] E.J.N. Looijenga, Invariant theory for generalized root systems, Invent. Math. 61 (1980), 1–32.
[17] J.Y. Mérindol, Les singularités simples elliptiques, leurs déformations, les surfaces de Del Pezzo et les transformations quadratiques, Ann. Scient. Éc. Norm. Sup. 15 (1982), 17–44.
[18] H.C. Pinkham, Simple elliptic singularities, Del Pezzo surfaces and Cremona transformations, Proc. Symp. Pure Math. 30 (1977), 69–71.
[19] A. Pressley and G. Segal, Loop Groups, Oxford University Press (1986).
[20] K. Saito, Einfache elliptische Singularitäten, Invent. Math. 23 (1974), 289–325.
[21] P. Slodowy, Simple Singularities and Simple Algebraic Groups, Springer Lecture Notes in Math. 815, Springer (1980).
[22] P. Slodowy, Chevalley groups over $\mathbb{C}((t))$ and deformations of simply elliptic singularities, RIMS Kokyuroku 415 (1981), 19–38, Kyoto University, and Proceedings of the International Conference on Algebraic Geometry, La Rabida 1981, Springer Lecture Notes in Math. 961, 285–301, Springer (1982).
[23] P. Slodowy, Singularitäten, Kac–Moody Liealgebren, assoziierte Gruppen und Verallgemeinerungen, Habilitationsschrift, Universität Bonn (1984).
[24] P. Slodowy, A character approach to Looijenga's invariant theory for generalized root systems, Compositio Mathematica 55 (1985), 3–32.
[25] P. Slodowy, An adjoint quotient for certain groups attached to Kac–Moody algebras, Infinite-dimensional groups with applications, ed. V.G. Kac, MSRI Publications Vol. 4, 307–333, Springer Verlag (1985).
[26] P. Slodowy, On the algebraic geometry of Kac–Moody groups, RIMS Kokyuroku 1086 (1999), 71–87, Kyoto University.

Stefan Helmke
Research Institute for Mathematical Sciences
Kyoto University
Kyoto 606-8502, Japan
4ECM Stockholm 2004, © 2005 European Mathematical Society
On the Camassa–Holm and Hunter–Saxton equations

Helge Holden

Abstract. We survey recent results for the Camassa–Holm equation $u_t - u_{xxt} + 2\kappa u_x + 3uu_x - 2u_xu_{xx} - uu_{xxx} = 0$, in particular convergence of a carefully selected finite difference scheme in the case of periodic initial data, and a detailed description of algebro-geometric solutions of the Camassa–Holm hierarchy. Furthermore, we present results for the generalized hyperelastic-rod wave equation $u_t - u_{xxt} + \frac12 g(u)_x = \gamma(2u_xu_{xx} + uu_{xxx})$. Finally, we discuss convergence of finite difference schemes for the Hunter–Saxton equation $(u_t + uu_x)_x = \frac12 (u_x)^2$ and describe semi-discrete, implicit as well as explicit upwind schemes that converge to dissipative solutions of the Hunter–Saxton equation.
1. Introduction

We aim at giving a brief survey of some recent results for two families of nonlinear partial differential equations. The presentation will by necessity be brief, and details are to be found in the references. The first comprehensive study of the Camassa–Holm equation
\[ u_t - u_{xxt} + 2\kappa u_x + 3uu_x - 2u_xu_{xx} - uu_{xxx} = 0 \tag{1.1} \]
appeared in [9, 10]. With $\kappa$ positive it models, see [34], propagation of unidirectional gravitational waves in a shallow water approximation, with $u$ representing the fluid velocity. Moreover, the equation possesses many intriguing properties, making it a popular equation to study. It is bi-Hamiltonian and completely integrable. In the case when $\kappa$ vanishes, the Camassa–Holm equation has special solutions, called peakons, that interact like solitary waves. The one-peakon solution reads $u(x,t) = ce^{-|x-ct|}$ for real constants $c$. The equation has been studied extensively as an initial value problem, both on the real line and in the periodic case.

In this paper we will study two aspects of this equation. First, we study convergence of a particular difference scheme in the periodic case. Secondly, we will study properties of algebro-geometric solutions of the Camassa–Holm hierarchy, i.e., special solutions of a highly selected infinite sequence of nonlinear partial differential equations of which the first one is the Camassa–Holm equation. In both cases it suffices to consider $\kappa = 0$, since solutions with nonzero $\kappa$ are obtained from solutions with zero $\kappa$ by the transformation $v(x,t) = u(x + \kappa t, t) - \kappa$.

2000 Mathematics Subject Classification. Primary: 35A05; Secondary: 35B30.
Key words and phrases. Camassa–Holm equation, Hunter–Saxton equation.
Partially supported by the BeMatA program of the Research Council of Norway and the European network HYKE, contract HPRN-CT-2002-00282.

When we study finite difference approximations, it turns out to be convenient to rewrite the equation as a system
\[ m_t = -(mu)_x - mu_x, \qquad m = u - u_{xx}. \tag{1.2} \]
The behavior of the solutions depends strongly on whether $m$ has a definite sign or not. More precisely, the fundamental existence theorem, due to Constantin and Escher [16], reads as follows: If $u_0 \in H^3([0,1])$ and $m_0 := u_0 - u_0'' \in H^1([0,1])$ is non-negative, then the equation
\[ u_t - u_{xxt} + 3uu_x - 2u_xu_{xx} - uu_{xxx} = 0 \tag{1.3} \]
has a unique global periodic solution $u \in C([0,T), H^3([0,1])) \cap C^1([0,T), H^2([0,1]))$ for any $T$ positive. However, if $m_0 \in H^1([0,1])$ with $\int_0^1 m_0\,dx = 0$ (but $u_0$ not identically zero), then the maximal time interval of existence is finite. Furthermore, if $u_0 \in H^1([0,1])$ and $m_0 = u_0 - u_0''$ is a positive Radon measure on $[0,1]$, then (1.3) has a unique global weak periodic solution. Additional results in the periodic case can be found in [13, 14, 16, 19, 38].

In [29] we prove convergence of a particular finite difference scheme, thereby giving the first constructive approach to the actual determination of the solution. We work in the case where one has global solutions, that is, when $m_0 \ge 0$. The scheme is semi-discrete: time is not discretized, and we have to solve a system of ordinary differential equations. We reformulate (1.1) to give meaning in $C([0,T]; H^1([0,1]))$ to solutions such as peakons, and we prove that our scheme converges in $C([0,T]; H^1([0,1]))$.

As for algebro-geometric solutions of the Camassa–Holm hierarchy, the technical machinery and the notation needed are rather extensive, and we refer to [26] for a complete treatment. In the context of completely integrable systems, the Camassa–Holm equation appears as a compatibility requirement of a postulated zero-curvature equation. More precisely, define the $2\times 2$ matrices
\[ U(z,x,t) = \begin{pmatrix} -1 & 1 \\ z^{-1}(4u - u_{xx}) & 1 \end{pmatrix}, \tag{1.4} \]
\[ V_1(z,x,t) = \begin{pmatrix} -z + 2u + u_x & z - 2u \\ z^{-1}\big((4u + 2u_x)z - 2u_x^2 - 4uu_x - 8u^2\big) & z - 2u - u_x \end{pmatrix}. \tag{1.5} \]
Postulating the zero-curvature relation
\[ U_t - V_{1,x} + [U, V_1] = 0 \tag{1.6} \]
(where $[\,\cdot\,,\,\cdot\,]$ denotes the commutator), we find that (1.6) is equivalent to the Camassa–Holm equation (with a slightly different normalization compared to (1.1)). By replacing the elements of $V_1$ by carefully constructed polynomials in $z$, one can construct higher-order integro-differential equations that constitute the Camassa–Holm hierarchy. Stationary solutions of (1.6) are called algebro-geometric solutions. The aim is to characterize these solutions in terms of properties of an underlying hyperelliptic curve $K_n$. A central object in the analysis is the solution of the Dubrovin equation, a system of ordinary differential equations (in the stationary case) on $K_n$. The solution $u$ can be expressed in terms of the solution of the Dubrovin equation by a trace formula. However, it turns out that the Abel map does not linearize solutions of the Dubrovin equations, and this property is distinct from the very characteristic linear behavior that one encounters for other completely integrable systems like the Korteweg–de Vries (KdV) equation, sine-Gordon equation, Thirring equation, AKNS system, etc., see [26].

We provide a complete description of algebro-geometric solutions of any equation in the Camassa–Holm hierarchy in terms of Riemann theta functions and constants derived from $K_n$. We also discuss the algebro-geometric initial value problem, by which we mean the following: Given a stationary solution $u_0$ of the $n$th stationary Camassa–Holm equation and the compact hyperelliptic curve $K_n$, we seek a solution of the $r$th time-dependent Camassa–Holm equation that coincides with $u_0$ initially. It turns out that the solution will satisfy the $n$th stationary Camassa–Holm equation for all times. Finally, we describe the solution of this initial value problem in terms of Riemann theta functions. Discussion of various aspects of algebro-geometric solutions of the Camassa–Holm hierarchy can be found in [1], [2], [3], [4], [5], [6], [7], [8], as well as [24], [25], and [26].

Furthermore, we are interested in the Cauchy problem for the nonlinear equation
\[ \frac{\partial u}{\partial t} - \frac{\partial^3 u}{\partial t\,\partial x^2} + \frac{\partial}{\partial x}\Big(\frac{g(u)}{2}\Big) = \gamma\Big(2\frac{\partial u}{\partial x}\frac{\partial^2 u}{\partial x^2} + u\frac{\partial^3 u}{\partial x^3}\Big), \qquad t > 0,\ x \in \mathbb{R}, \tag{1.7} \]
where the function $g : \mathbb{R} \to \mathbb{R}$ and the constant $\gamma \in \mathbb{R}$ are given.
Observe that if $g(u) = 2\kappa u + 3u^2$ and $\gamma = 1$, then (1.7) is the classical Camassa–Holm equation. With $g(u) = 3u^2$, Dai [20, 21, 22] derived (1.7) as an equation describing finite length, small amplitude radial deformation waves in cylindrical compressible hyperelastic rods, and the equation is often referred to as the hyperelastic-rod wave equation. We call (1.7) the generalized hyperelastic-rod wave equation. This equation is considerably less studied than the Camassa–Holm equation. Recently, Yin [44, 45, 46] (see also Constantin and Escher [17]) proved local well-posedness, global well-posedness for a particular class of initial data, and in particular that smooth solutions blow up in finite time (with a precise estimate of the blow-up time) for a large class of initial data. Lopes [37] proved stability of solitary waves for (1.7) with $\gamma = 1$, while Kalisch [35] studied the stability when $g(u) = 2\kappa u + 3u^3$ and $\gamma \in \mathbb{R}$. Our approach is heavily influenced by that of Xin and Zhang, see [42, 43]. We prove in [11, 12] that (1.7) possesses a global weak solution that is stable with respect to perturbations in the initial data $u_0$ as well as variation in the function $g$ and the parameter $\gamma$, a result that is
new even for the Camassa–Holm equation. The starting step in the proof is to show existence of solutions of a viscous regularization of (1.7) with sufficient stability when the regularization is turned off, see [11].

The Hunter–Saxton equation,
\[ (u_t + uu_x)_x = \tfrac12 (u_x)^2, \tag{1.8} \]
was first derived as a model for nematic liquid crystals [30]. Liquid crystals are mesophases, i.e., intermediate states between the liquid and the crystal phase. Nematic liquid crystals can be described by two linearly independent vector fields: one describing the fluid flow and one describing the director field that gives the orientation of the rod-like molecules. In the situation where one studies stationary flow governed by the Oseen–Frank expression for internal energy, one can derive an equation for the director field only. Minimizing an action principle for a planar director field perturbed around a constant state, one ends up with the Hunter–Saxton equation (1.8) in the unknown $u$ that describes the angle of the director field. Soon after its derivation, it was discovered that the equation possessed many unexpected and interesting properties. Like the Camassa–Holm equation, the Hunter–Saxton equation is completely integrable, bivariational and bi-Hamiltonian [31].

We study the problem on a half-line. Introducing $v = u_x$, we can write (1.8) as
\[ v_t + (uv)_x = \tfrac12 v^2, \quad\text{or}\quad v_t + uv_x = -\tfrac12 v^2. \tag{1.9} \]
We impose initial-boundary conditions as
\[ u(0,t) = 0, \qquad v(x,0) = v_0(x). \tag{1.10} \]
Central in the study of the Hunter–Saxton equation are the characteristics given by
\[ \frac{d}{dt}\Phi(x,t) = u(\Phi(x,t),t), \qquad \Phi(x,0) = x. \]
If $v_0 \ge 0$, then
\[ \Phi(x,t) = \int_0^x \big(1 + \tfrac12 v_0(y)t\big)^2\,dy, \]
\[ u(\Phi(x,t),t) = \int_0^x \big(1 + \tfrac12 v_0(y)t\big)v_0(y)\,dy, \]
\[ v(\Phi(x,t),t) = \frac{2v_0(x)}{2 + v_0(x)t}. \]
In contrast to the case of hyperbolic conservation laws, where characteristics in general will collide, the characteristics for the Hunter–Saxton equation will only focus. Smooth solutions can be expressed as the solution of a system (see [32])
\[ u = u_0(\xi) + tg(\xi) + h'(t), \qquad x = \xi + tu_0(\xi) + \tfrac12 t^2 g(\xi) + h(t), \tag{1.11} \]
where $h$ is any function with $h(0) = h'(0) = 0$, and $g'(\xi) = \frac12 u_0'(\xi)^2$. However, the Hunter–Saxton equation will not in general have classical solutions. More precisely, if $u_0$ is not monotone increasing, then
\[ \inf(u_x) \to -\infty \quad\text{as } t \uparrow t^* = 2/\sup(-u_0'). \tag{1.12} \]
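The blow-up statement (1.12) can be illustrated numerically. The following minimal Python sketch is not taken from the cited references; the function names and the sample initial slope are illustrative assumptions. It evaluates the closed-form slope $2v_0/(2 + v_0 t)$ along a characteristic, compares it with a Runge–Kutta integration of $dv/dt = -\frac12 v^2$ (the second form of (1.9) restricted to a characteristic), and computes $t^*$ from sampled values of $v_0 = u_0'$.

```python
import math

def v_along_characteristic(v0, t):
    """Closed form v = 2*v0 / (2 + v0*t), solving dv/dt = -v^2/2 with v(0) = v0."""
    return 2.0 * v0 / (2.0 + v0 * t)

def blow_up_time(v0_samples):
    """t* = 2 / sup(-v0) over the samples; infinite if v0 >= 0 everywhere."""
    s = max(-v for v in v0_samples)
    return 2.0 / s if s > 0 else math.inf

def rk4_riccati(v0, t_end, steps=2000):
    """Integrate dv/dt = -v^2/2 up to t_end with the classical RK4 method."""
    f = lambda v: -0.5 * v * v
    v, dt = v0, t_end / steps
    for _ in range(steps):
        k1 = f(v)
        k2 = f(v + 0.5 * dt * k1)
        k3 = f(v + 0.5 * dt * k2)
        k4 = f(v + dt * k3)
        v += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return v

if __name__ == "__main__":
    # Hypothetical sample slope v0(x) = -1.5 sin(2 pi x) on a grid of 100 points.
    samples = [-1.5 * math.sin(2.0 * math.pi * k / 100) for k in range(100)]
    print("blow-up time t*:", blow_up_time(samples))
    print("closed form vs RK4 at t = 1 for v0 = -1:",
          v_along_characteristic(-1.0, 1.0), rk4_riccati(-1.0, 1.0))
```

For non-negative $v_0$ the helper returns an infinite blow-up time, in line with the explicit formulas for $\Phi$, $u$ and $v$ stated for $v_0 \ge 0$.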
The solution concept for weak solutions is more complicated. Two different solution concepts are discussed in the literature, namely that of conservative and dissipative solutions, see [48, 49]. Dissipative solutions are characterized by having a non-increasing energy as well as satisfying a one-sided Oleinik entropy condition. Conservative solutions, on the other hand, preserve energy even locally. We here study a family of upwind schemes, see [28]. More precisely, we analyze (i) a semi-discrete scheme where the time is kept continuous while the spatial variable is discretized; (ii) an implicit (fully) discrete system; and (iii) an explicit (fully) discrete system. We show that all schemes converge to the dissipative solution in the case when $v_0$ is non-negative. Finally, we extend the semi-discrete scheme to the considerably more difficult case of varying sign of $v_0$, and show that even in this case the scheme converges to the dissipative solution.

We close this introduction by noting the following similarity between the Camassa–Holm and Hunter–Saxton equations. Recall from (1.2) that one may recast the Camassa–Holm equation as follows:
\[ m_t = -(mu)_x - mu_x, \qquad m = u - u_{xx}. \]
Similarly, one may write the Hunter–Saxton equation as
\[ m_t = -(mu)_x - mu_x, \qquad m = u_{xx}. \]
2. The Camassa–Holm equation

The Camassa–Holm equation can be studied from many different points of view. Here we focus on two that have rather different aims. The finite difference approximation offers a constructive approach to compute the solution of the Camassa–Holm equation for a wide class of periodic initial data. On the other hand, the algebro-geometric approach aims at characterizing explicitly a certain class of solutions.

2.1. Convergence of a finite difference scheme for the Camassa–Holm equation. This section is based on joint work with X. Raynaud, see [29]. We may rewrite the Camassa–Holm equation as follows:
\[ u_t - u_{xxt} = -\tfrac32 (u^2)_x - \tfrac12 (u_x^2)_x + \tfrac12 (u^2)_{xxx}. \tag{2.1} \]
A function $u$ in $L^\infty([0,T]; H^1)$ is said to be a solution of the periodic Camassa–Holm equation if it is periodic and satisfies (2.1) in the sense of distributions.

First we present the necessary notation for the finite difference scheme we apply. Introduce the partition of the unit interval $[0,1]$ into $n$ points $x_i = hi$, $i = 0, \dots, n-1$, with spacing $h = 1/n$. Given any vector $u = (u_0, \dots, u_{n-1}) \in \mathbb{R}^n$ we can define a unique continuous, piecewise linear and periodic (i.e., $u(0) = u(1)$) function $u : [0,1] \to \mathbb{R}$ such that $u(x_i) = u_i$. The left and right derivatives of $u$ coincide with the usual finite difference quantities
\[ (D_\pm u)_i = \pm\frac{1}{h}(u_{i\pm 1} - u_i). \]
In addition we need the symmetric difference
\[ (Du)_i = \frac12\big((D_+u)_i + (D_-u)_i\big) = \frac{1}{2h}(u_{i+1} - u_{i-1}). \]
Recall that the Camassa–Holm equation may be written as
\[ m_t = -(mu)_x - mu_x, \qquad m = u - u_{xx}. \tag{2.2} \]
We consider the following semi-discrete approximation
\[ m^n_t = -D_-(m^nu^n) - m^nDu^n, \qquad m^n = u^n - D_-D_+u^n, \tag{2.3} \]
with initial condition $u^n|_{t=0} = v^n$. Here $m^n = m^n_i(t) \approx m(i/n, t)$ and $u^n = u^n_i(t) \approx u(i/n, t)$, where $u = u(x,t)$ and $m = m(x,t)$ denote the exact solution of (2.2). The second equation allows for an inversion in the sense that (see [29, eqn. (2.8)])
\[ u^n_i = \frac{c}{1 - e^{-\kappa n}} \sum_{j=0}^{n-1} \big(e^{-\kappa(i-j)} + e^{\kappa(i-j-n)}\big) m^n_j, \qquad i = 0, \dots, n-1, \tag{2.4} \]
with $\kappa = \ln\big((1 + 2n^2 + \sqrt{1 + 4n^2})/(2n^2)\big)$, which in effect reduces the first equation to a system of ordinary differential equations in $m^n$. Using the inversion formula once more we obtain the approximate solution $u^n$. We extend this spatially discrete function to a continuous, piecewise linear and periodic function.

A key property of the Camassa–Holm equation is the dependence of the solution on the sign of $m$ initially. Indeed, we recall [13, Theorem 4] the following result.

Theorem 2.1. Assume that $m_0 \in H^2([0,1])$ is non-negative and periodic. Then the system (2.2) with initial data $m|_{t=0} = m_0$ is globally well posed in $H^2([0,1])$.

However, once one permits a change in the sign of $m_0$, the qualitative behavior of the solution changes considerably, as the next result shows (see [16]).
Theorem 2.2. Assume that $m_0 \in H^1([0,1])$ with $\int_0^1 m_0\,dx = 0$ and the associated $u_0 \not\equiv 0$. Then there exists a $T$ positive such that the system (2.2) with initial data $m|_{t=0} = m_0$ has a unique periodic solution $u$ such that
\[ u \in C([0,T), H^3([0,1])) \cap C^1([0,T), H^2([0,1])). \tag{2.5} \]
The maximal choice of $T$ is finite.

With the discrete approximation we work in the context of globally well posed problems. An important element in the proof of Theorem 2.1 is the fact that the sign of $m$ is preserved as a function of time. The approximation given by (2.3) shares that property, as the next lemma shows (see [29, Lemma 2.2]).

Lemma 2.3. Assume that $m_0 \ge 0$. For any solution $u(t)$ of the system (2.3), we have that $m(t) \ge 0$ for all $t \ge 0$.

Next we want to estimate the $H^1$ norm of the approximation. To that end let
\[ E_n(t) = \Big(\frac1n\sum_{i=0}^{n-1}(u^n_i)^2 + \frac1n\sum_{i=0}^{n-1}(D_+u^n_i)^2\Big)^{1/2}. \tag{2.6} \]
This implies
\[
\begin{aligned}
\frac{dE_n(t)^2}{dt} &= \frac2n\sum_{i=0}^{n-1}\big[u^n_iu^n_{i,t} + D_+u^n_i\,D_+u^n_{i,t}\big] \\
&= -\frac2n\sum_{i=0}^{n-1}\big[D_-(m^nu^n)_i\,u^n_i + m^n_i\,Du^n_i\,u^n_i\big] \\
&= \frac2n\sum_{i=0}^{n-1}\big[m^n_iu^n_i(D_+u^n_i - Du^n_i)\big] \\
&= \frac{1}{n^2}\sum_{i=0}^{n-1}\big[m^n_iu^n_i(-m^n_i + u^n_i)\big].
\end{aligned}
\]
Using that $u^n_i$ is positive (see (2.4) and Lemma 2.3), we find
\[ \frac{dE_n(t)^2}{dt} \le \frac{1}{n^2}\sum_{i=0}^{n-1} m^n_i(u^n_i)^2. \]
Since
\[ \|u^n\|_\infty \le O(1)E_n(t) \quad\text{and}\quad \frac1n\sum_{i=0}^{n-1} m^n_iu^n_i = E_n(t)^2, \]
we find that
\[ \frac{1}{E_n(t)} \ge \frac{1}{E_n(0)} - \frac{O(1)}{n}\,t. \tag{2.7} \]
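The discrete objects entering this estimate can be checked on a small grid. The Python sketch below is only an illustration under stated assumptions, not the implementation of [29]: the value $c = 1/\sqrt{1+4n^2}$ is a normalization derived here for the periodic Green's function, the index difference $i-j$ in (2.4) is read modulo $n$, and all function names are hypothetical. It inverts $m^n = u^n - D_-D_+u^n$, checks the identity $\frac1n\sum_i m^n_iu^n_i = E_n^2$, and evaluates the right-hand side of (2.3).

```python
import math

def kappa_and_c(n):
    # kappa as in the text; c = 1/sqrt(1 + 4 n^2) is an assumption derived for this sketch
    kappa = math.log((1 + 2 * n * n + math.sqrt(1 + 4 * n * n)) / (2 * n * n))
    return kappa, 1.0 / math.sqrt(1 + 4 * n * n)

def invert(m):
    """Recover u from m = u - D_- D_+ u via formula (2.4), with i - j taken modulo n."""
    n = len(m)
    kappa, c = kappa_and_c(n)
    pref = c / (1.0 - math.exp(-kappa * n))
    g = [pref * (math.exp(-kappa * k) + math.exp(kappa * (k - n))) for k in range(n)]
    return [sum(g[(i - j) % n] * m[j] for j in range(n)) for i in range(n)]

def d_plus(u):
    n = len(u)  # h = 1/n, periodic indexing
    return [n * (u[(i + 1) % n] - u[i]) for i in range(n)]

def d_minus(u):
    n = len(u)
    return [n * (u[i] - u[i - 1]) for i in range(n)]

def d_central(u):
    n = len(u)
    return [0.5 * n * (u[(i + 1) % n] - u[i - 1]) for i in range(n)]

def helmholtz(u):
    """m = u - D_- D_+ u."""
    return [a - b for a, b in zip(u, d_minus(d_plus(u)))]

def energy_sq(u):
    """E_n^2 as in (2.6)."""
    n = len(u)
    return (sum(a * a for a in u) + sum(a * a for a in d_plus(u))) / n

def rhs(m):
    """Right-hand side of the semi-discrete scheme (2.3): -D_-(m u) - m D u."""
    u = invert(m)
    flux = d_minus([a * b for a, b in zip(m, u)])
    du = d_central(u)
    return [-f - a * b for f, a, b in zip(flux, m, du)]

if __name__ == "__main__":
    n = 32
    m = [1.0 + 0.5 * math.cos(2.0 * math.pi * j / n) for j in range(n)]  # sample m >= 0
    u = invert(m)
    print("max |m - (u - D_-D_+u)|:", max(abs(a - b) for a, b in zip(helmholtz(u), m)))
    print("(1/n) sum m u - E_n^2  :", sum(a * b for a, b in zip(m, u)) / n - energy_sq(u))
```

Feeding `rhs` to any standard ODE integrator would give a time-stepping version of the scheme; that step is omitted here to keep the sketch short.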
If we assume that $u^n(0)$ tends to a nonzero $v$ in $H^1$, then $\|u^n(0)\|_{H^1}$ and therefore $E_n(0)$ are bounded. It implies that $E_n(0)^{-1}$ is bounded from below by a strictly positive constant and, for any given $T > 0$, there exist $N \ge 0$ and a constant $C > 0$ such that for all $n \ge N$ and all $t \in [0,T]$, we have $E_n(0)^{-1} - O(1)t/n \ge 1/C$. Hence, $E_n(t) \le C$ and thus the $H^1$-norm of $u^n(t)$ is uniformly bounded in $[0,T]$.

Further estimates are needed if we want to conclude that $u^n$ converges to a solution. We state, but do not prove, the following result here.

Lemma 2.4. We have the following properties:
(i) $u^n_x$ is uniformly bounded in $L^\infty([0,1])$.
(ii) $u^n_x$ has a uniformly bounded total variation.
(iii) $u^n_t$ is uniformly bounded in $L^2([0,1])$.

We stress that in the proof of this lemma, the positivity of $m$ enters in a crucial way. We can now state the main theorem.

Theorem 2.5. Let $v^n$ be a sequence of continuous, periodic and piecewise linear functions on $[0,1]$ that converges to $v$ in $H^1([0,1])$ as $n \to \infty$ and such that $v^n - D_-D_+v^n \ge 0$. Then, for any given $T > 0$, the sequence $u^n = u^n(x,t)$ of continuous, periodic and piecewise linear functions determined by the system of ordinary differential equations
\[ m^n_t = -D_-(m^nu^n) - m^nDu^n, \qquad m^n = u^n - D_-D_+u^n, \tag{2.8} \]
with initial condition $u^n|_{t=0} = v^n$, converges in $C([0,T]; H^1([0,1]))$ as $n \to \infty$ to the solution $u$ of the Camassa–Holm equation (1.3) with initial condition $u|_{t=0} = v$.

Sketch of proof. Applying the theorem of Simon [41, Corollary 4], we consider the Banach space
\[ X = \{v \in H^1 \mid v_x \in BV\} \]
with norm
\[ \|v\|_X = \|v\|_{H^1} + \|v_x\|_{BV} = \|v\|_{H^1} + \|v_x\|_{L^\infty} + \mathrm{TV}(v_x), \]
which injects compactly in $H^1$. As spaces $B$ and $Y$ in Simon's theorem we use $H^1$ and $L^2$, respectively. Simon's theorem implies the existence of a subsequence of $u^n$ that converges in $C([0,T], H^1)$ to some $u \in H^1$. Next, we show that the limit is indeed a solution of the Camassa–Holm equation. Take $\varphi$ in $C^\infty([0,1]\times[0,T])$ and multiply, for each $i$, the first equation in (2.3) by $h\varphi(x_i,t)$. We denote by $\varphi^n$ the continuous piecewise linear function
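given by $\varphi^n(x_i,t) = \varphi(x_i,t)$. Then
\[
\begin{aligned}
\sum_{i=0}^{n-1} h\big(u^n_{i,t} - (D_-D_+u^n_i)_t\big)\varphi^n_i
&= \sum_{i=0}^{n-1} h(u^n_i)^2D_+\varphi^n_i - \sum_{i=0}^{n-1} hu^n_i\,D_-D_+u^n_i\,D_+\varphi^n_i \\
&\quad - \sum_{i=0}^{n-1} hu^n_i\,Du^n_i\,\varphi^n_i + \sum_{i=0}^{n-1} hD_-D_+u^n_i\,Du^n_i\,\varphi^n_i.
\end{aligned}
\tag{2.9}
\]
A detailed analysis of each term shows that the right-hand side converges, as $n \to \infty$, to
\[ \int_0^1 u^2\varphi_x\,dx - \frac12\int_0^1 u^2\varphi_{xxx}\,dx + \int_0^1 \big(u_x^2\varphi_x - uu_x\varphi\big)\,dx - \frac12\int_0^1 u_x^2\varphi_x\,dx. \tag{2.10} \]
Integrating with respect to time over the interval $[0,T]$ we find that the left-hand side converges to
\[ \int_0^T \sum_{i=0}^{n-1} h\big(u^n_{i,t} - D_-D_+u^n_{i,t}\big)\varphi(x_i,t)\,dt \to -\int_0^T\!\!\int_0^1 u(\varphi_t - \varphi_{txx})\,dx\,dt + \Big[\int_0^1 u(\varphi - \varphi_{xx})\,dx\Big]_{t=0}^{t=T}. \]
Thus we conclude that
\[
\begin{aligned}
&-\int_0^T\!\!\int_0^1 u(\varphi_t - \varphi_{txx})\,dx\,dt + \Big[\int_0^1 u(\varphi - \varphi_{xx})\,dx\Big]_{t=0}^{t=T} \\
&\qquad = \int_0^T \Big[\int_0^1 u^2\varphi_x\,dx - \frac12\int_0^1 u^2\varphi_{xxx}\,dx + \int_0^1 \big(u_x^2\varphi_x - uu_x\varphi\big)\,dx - \frac12\int_0^1 u_x^2\varphi_x\,dx\Big]\,dt,
\end{aligned}
\tag{2.11}
\]
which shows that $u$ is a solution of the Camassa–Holm equation.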
2.2. Algebro-geometric solutions for the Camassa–Holm hierarchy. This section describes joint work with F. Gesztesy, see [24, 25, 26]. We will not be specific about smoothness assumptions on solutions $u$ and simply assume that solutions are infinitely differentiable with bounded derivatives. We start by constructing the Camassa–Holm hierarchy. To that end define $\{f_\ell\}_{\ell\in\mathbb{N}_0}$ recursively by
\[ f_0 = 1, \qquad f_{\ell,x} = -2\mathcal{G}\big(2(4u - u_{xx})f_{\ell-1,x} + (4u_x - u_{xxx})f_{\ell-1}\big), \quad \ell \in \mathbb{N}, \tag{2.12} \]
where $\mathcal{G}$ is given by
\[ \mathcal{G} : L^\infty(\mathbb{R}) \to L^\infty(\mathbb{R}), \qquad (\mathcal{G}v)(x) = \frac14\int_{\mathbb{R}} dy\, e^{-2|x-y|}v(y), \quad x \in \mathbb{R},\ v \in L^\infty(\mathbb{R}). \tag{2.13} \]
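To make the operator in (2.13) concrete, the following Python sketch approximates it by a truncated trapezoidal rule and applies it to the first step of the recursion (2.12). The truncation width, the step count and the test profile $u = \cos 3x$ are illustrative assumptions. For $v(y) = \cos(ky)$ the integral can be evaluated by hand and gives $\cos(kx)/(4 + k^2)$, which suggests that the operator acts as the inverse of $4 - \partial_x^2$ on such functions.

```python
import math

def G(v, x, half_width=20.0, steps=40000):
    """Trapezoidal approximation of (1/4) * integral of exp(-2|x-y|) v(y) dy,
    truncated to |y - x| <= half_width (the kernel is about e^{-40} at the cut)."""
    a = x - half_width
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        y = a + k * h
        w = 0.5 if k in (0, steps) else 1.0
        total += w * math.exp(-2.0 * abs(x - y)) * v(y)
    return 0.25 * h * total

if __name__ == "__main__":
    x = 0.7
    # Check against the hand-computed value cos(3x) / (4 + 9).
    print(G(lambda y: math.cos(3.0 * y), x), math.cos(3.0 * x) / 13.0)
    # First recursion step with u = cos(3x) and f_0 = 1:
    # 4 u_x - u_xxx = -39 sin(3x), so f_{1,x} = -2 G(-39 sin(3 .)) should be close to -2 u_x.
    f1x = -2.0 * G(lambda y: -39.0 * math.sin(3.0 * y), x)
    print(f1x, 6.0 * math.sin(3.0 * x))
```

The second check is consistent with $f_1 = -2u + c_1$, where $c_1$ is the integration constant at the first level.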
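Figure 1. Left: the initial data $e^{-|x-2|} + 3e^{-|x-5|} + 2e^{-|x-8|}$ (periodized). The period is 40 and $n = 2^{14}$. Right: the computed and the exact (dotted) solution at time $t = 6$.

At each level a new integration constant, denoted by $c_\ell$, is introduced. Moreover, we introduce coefficients $\{g_\ell\}_{\ell\in\mathbb{N}_0}$ and $\{h_\ell\}_{\ell\in\mathbb{N}_0}$ by
\[ g_\ell = f_\ell + \tfrac12 f_{\ell,x}, \qquad h_\ell = (4u - u_{xx})f_\ell - g_{\ell+1,x}, \quad \ell \in \mathbb{N}_0. \tag{2.14} \]
Define the following $2\times 2$ matrix $V_n$ by
\[ V_n(z,x,t_n) = \begin{pmatrix} -G_n & F_n \\ z^{-1}H_n & G_n \end{pmatrix}, \qquad n \in \mathbb{N}_0,\ z \in \mathbb{C}\setminus\{0\},\ x, t_n \in \mathbb{R}, \tag{2.15} \]
assuming $F_n$, $G_n$, and $H_n$ are polynomials
\[ F_n = \sum_{\ell=0}^n f_{n-\ell}z^\ell, \qquad G_n = \sum_{\ell=0}^n g_{n-\ell}z^\ell, \qquad H_n = \sum_{\ell=0}^n h_{n-\ell}z^\ell \tag{2.16} \]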
of degree $n$ with respect to $z$. Postulating the zero-curvature relation (recall the definition (1.4) of $U$)
\[ U_{t_n} - V_{n,x} + [U, V_n] = 0, \qquad n \in \mathbb{N}_0, \tag{2.17} \]
yields the following set of time-dependent equations
\[ 4u_{t_n} - u_{xxt_n} - H_{n,x} + 2H_n - 2(4u - u_{xx})G_n = 0, \tag{2.18} \]
\[ F_{n,x} = 2G_n - 2F_n, \tag{2.19} \]
\[ zG_{n,x} = (4u - u_{xx})F_n - H_n. \tag{2.20} \]
Inserting the polynomial expressions for $F_n$, $H_n$, and $G_n$ into (2.19) and (2.20), respectively, first yields recursion relations (2.12) and (2.14) for $f_\ell$ and $g_\ell$ for $\ell = 0, \dots, n$. For fixed $n \in \mathbb{N}$ we obtain from (2.18) the recursion for $h_\ell$ for $\ell = 0, \dots, n-1$ and
\[ h_n = (4u - u_{xx})f_n. \tag{2.21} \]
In addition, one finds
\[ 4u_{t_n}(x,t_n) - u_{xxt_n}(x,t_n) - h_{n,x}(x,t_n) + 2h_n(x,t_n) - 2\big(4u(x,t_n) - u_{xx}(x,t_n)\big)g_n(x,t_n) = 0. \tag{2.22} \]
Using relations (2.14) and (2.21) permits one to write (2.22) as
\[ \mathrm{CH}_n(u) = 4u_{t_n} - u_{xxt_n} + (u_{xxx} - 4u_x)f_n - 2(4u - u_{xx})f_{n,x} = 0. \tag{2.23} \]
Varying $n\in\mathbb{N}_0$ in (2.23) then defines the time-dependent Camassa–Holm hierarchy. We obtain the Camassa–Holm equation for $n = 1$, which with the current choice of numerical factors reads
$$4u_{t_1} - u_{xxt_1} - 2uu_{xxx} - 4u_xu_{xx} + 24uu_x + c_1(u_{xxx} - 4u_x) = 0.$$
The algebro-geometric framework is best described in the stationary case, where
$$-V_{n,x}(z,x) + [U(z,x), V_n(z,x)] = 0, \tag{2.24}$$
which yields the stationary Camassa–Holm equation
$$\text{s-CH}_n(u) = (u_{xxx} - 4u_x)f_n - 2(4u - u_{xx})f_{n,x} = 0. \tag{2.25}$$
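As a worked check of the $n = 1$ equation quoted above, the following sketch evaluates (2.23) for $n = 1$. It assumes that the operator in (2.13) inverts $4 - \partial_x^2$ on the functions considered, so that (2.12) gives $f_{1,x} = -2u_x$, and it writes the integration constant as $c_1$. It is an illustration added here, not part of the original argument.

```latex
\begin{align*}
f_{1,x} &= -2\,\mathcal{G}\big(4u_x - u_{xxx}\big) = -2u_x
  \quad\Longrightarrow\quad f_1 = -2u + c_1, \\
(u_{xxx} - 4u_x)\, f_1 &= -2uu_{xxx} + 8uu_x + c_1\,(u_{xxx} - 4u_x), \\
-2(4u - u_{xx})\, f_{1,x} &= 16uu_x - 4u_x u_{xx}, \\
\mathrm{CH}_1(u) &= 4u_{t_1} - u_{xxt_1} - 2uu_{xxx} - 4u_x u_{xx} + 24uu_x
  + c_1\,(u_{xxx} - 4u_x).
\end{align*}
```

Adding the second and third lines to $4u_{t_1} - u_{xxt_1}$ gives the last line, which agrees term by term with the displayed Camassa–Holm equation.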
Furthermore, in the stationary case we find that $z^2G_n(z,x)^2 + zF_n(z,x)H_n(z,x)$ is independent of $x$, and thus
$$z^2G_n(z,x)^2 + zF_n(z,x)H_n(z,x) = R_{2n+2}(z), \tag{2.26}$$
for some polynomial $R_{2n+2}$ of degree $2n+2$, or
$$R_{2n+2}(z) = \prod_{m=0}^{2n+1}(z - E_m), \quad E_0, E_1, \dots, E_{2n}\in\mathbb{C},\ E_{2n+1} = 0. \tag{2.27}$$
In the following we assume that the $E_j$ are all distinct. We introduce the hyperelliptic curve $\mathcal{K}_n$ of arithmetic genus $n$ defined by
$$\mathcal{K}_n\colon \mathcal{F}_n(z,y) = y^2 - R_{2n+2}(z) = 0. \tag{2.28}$$
Compactify $\mathcal{K}_n$ by adding two distinct points at infinity, $P_{\infty_+}$, $P_{\infty_-}$, still denoting its projective closure by $\mathcal{K}_n$. Hence $\mathcal{K}_n$ becomes a two-sheeted Riemann surface of arithmetic genus $n$. Points $P$ on $\mathcal{K}_n\setminus\{P_{\infty_\pm}\}$ are denoted by $P = (z,y)$, where $y(\,\cdot\,)$ denotes the meromorphic function on $\mathcal{K}_n$ satisfying $\mathcal{F}_n(z,y) = 0$. In the following the roots of the polynomial $F_n$ will play a special role and hence we introduce on $\mathbb{C}\times\mathbb{R}$
$$F_n(z,x) = \prod_{j=1}^{n}(z - \mu_j(x)). \tag{2.29}$$
Indeed we have the fundamental trace relation
$$u(x) = \frac12\sum_{j=1}^{n}\mu_j(x) - \frac14\sum_{m=0}^{2n+1}E_m, \tag{2.30}$$
when $u$ satisfies the $n$th stationary Camassa–Holm equation, $\text{s-CH}_n(u) = 0$. Moreover, we introduce
$$\hat\mu_j(x) = \big(\mu_j(x), -\mu_j(x)G_n(\mu_j(x),x)\big)\in\mathcal{K}_n, \quad j = 1, \dots, n,\ x\in\mathbb{R}, \tag{2.31}$$
and
$$P_0 = (0,0). \tag{2.32}$$
The branch of $y(\,\cdot\,)$ near $P_{\infty_\pm}$ is fixed according to
$$\lim_{\substack{|z(P)|\to\infty \\ P\to P_{\infty_\pm}}} \frac{y(P)}{z(P)G_n(z(P),x)} = \mp 1. \tag{2.33}$$
Next, we introduce the fundamental meromorphic function $\phi(\,\cdot\,,x)$ on $\mathcal{K}_n$ by
$$\phi(P,x) = \frac{y - zG_n(z,x)}{F_n(z,x)} = \frac{zH_n(z,x)}{y + zG_n(z,x)}, \quad P = (z,y)\in\mathcal{K}_n,\ x\in\mathbb{R}. \tag{2.34}$$
Given $\phi(\,\cdot\,,x)$, one defines the associated Baker–Akhiezer vector $\Psi(\,\cdot\,,x,x_0)$ on $\mathcal{K}_n\setminus\{P_{\infty_+}, P_{\infty_-}, P_0\}$ by
$$\Psi(P,x,x_0) = \begin{pmatrix}\psi_1(P,x,x_0) \\ \psi_2(P,x,x_0)\end{pmatrix}, \quad P\in\mathcal{K}_n\setminus\{P_{\infty_+}, P_{\infty_-}, P_0\},\ (x,x_0)\in\mathbb{R}^2, \tag{2.35}$$
where
$$\psi_1(P,x,x_0) = \exp\Big(-(1/z)\int_{x_0}^{x}dx'\,\phi(P,x') - (x - x_0)\Big), \tag{2.36}$$
$$\psi_2(P,x,x_0) = -\psi_1(P,x,x_0)\,\phi(P,x)/z. \tag{2.37}$$
The basic properties of $\phi$ and $\Psi$ then read as follows.
Lemma 2.6. Assume the $n$th stationary Camassa–Holm equation (2.25) holds, and let $P = (z,y)\in\mathcal{K}_n\setminus\{P_{\infty_+}, P_{\infty_-}, P_0\}$, $(x,x_0)\in\mathbb{R}^2$. Then $\phi$ satisfies the Riccati-type equation
$$\phi_x(P,x) - z^{-1}\phi(P,x)^2 - 2\phi(P,x) + 4u(x) - u_{xx}(x) = 0, \tag{2.38}$$
while $\Psi$ fulfills
$$\Psi_x(P,x,x_0) = U(z,x)\Psi(P,x,x_0), \tag{2.39}$$
$$-y\Psi(P,x,x_0) = zV_n(z,x)\Psi(P,x,x_0). \tag{2.40}$$
Abbreviate
$$\hat\mu = \{\hat\mu_1, \dots, \hat\mu_n\}\in\sigma^n\mathcal{K}_n, \tag{2.41}$$
where $\sigma^m\mathcal{K}_n$, $m\in\mathbb{N}$, denotes the $m$th symmetric product of $\mathcal{K}_n$. Then it turns out that $\hat\mu$ satisfies a first-order system of ordinary differential equations, called the Dubrovin equations, which is described in the next lemma.
Lemma 2.7. Assume that $\mathcal{K}_n$ is nonsingular. Suppose that the $n$th stationary Camassa–Holm equation (2.25) holds on an open interval $\tilde\Omega_\mu\subseteq\mathbb{R}$. Moreover, suppose that the zeros $\mu_j$, $j = 1, \dots, n$, of $F_n(\,\cdot\,)$ remain distinct and nonzero on $\tilde\Omega_\mu$. Then $\{\hat\mu_j\}_{j=1,\dots,n}$, defined by (2.31), satisfies the following first-order system of differential equations
$$\mu_{j,x}(x) = 2\,\frac{y(\hat\mu_j(x))}{\mu_j(x)}\prod_{\substack{\ell=1 \\ \ell\neq j}}^{n}\big(\mu_j(x) - \mu_\ell(x)\big)^{-1}, \quad j = 1, \dots, n,\ x\in\tilde\Omega_\mu. \tag{2.42}$$
Next, assume $\mathcal{K}_n$ to be nonsingular and introduce the initial condition
$$\{\hat\mu_j(x_0)\}_{j=1,\dots,n}\subset\mathcal{K}_n \tag{2.43}$$
for some $x_0\in\mathbb{R}$, where $\mu_j(x_0)\neq 0$, $j = 1, \dots, n$, are assumed to be distinct. Then there exists an open interval $\Omega_\mu\subseteq\mathbb{R}$, with $x_0\in\Omega_\mu$, such that the initial value problem (2.42), (2.43) has a unique solution $\{\hat\mu_j\}_{j=1,\dots,n}\subset\mathcal{K}_n$ satisfying
$$\hat\mu_j\in C^\infty(\Omega_\mu,\mathcal{K}_n), \quad j = 1, \dots, n, \tag{2.44}$$
and $\mu_j$, $j = 1, \dots, n$, remain distinct and nonzero on $\Omega_\mu$. A detailed analysis of the solution $u$ of the $n$th stationary Camassa–Holm equation reveals that it can be written explicitly in terms of the Riemann theta function associated with $\mathcal{K}_n$. To that end recall that
$$\theta(z) = \sum_{n\in\mathbb{Z}^n}\exp\big(2\pi i(n,z) + \pi i(n,\tau n)\big), \quad z\in\mathbb{C}^n, \tag{2.45}$$
where $(u,v) = \sum_{j=1}^{n}u_jv_j$ denotes the scalar product in $\mathbb{C}^n$. Here the matrix $\tau = (\tau_{j,\ell})_{j,\ell=1,\dots,n}$ is defined by
$$\tau_{j,\ell} = \int_{b_j}\omega_\ell, \quad j,\ell = 1, \dots, n, \tag{2.46}$$
for a given homology basis $\{a_j, b_j\}_{j=1,\dots,n}$ and where $\omega_1, \dots, \omega_n$ are the holomorphic differentials on $\mathcal{K}_n$. Next, fix a base point $Q_0\in\mathcal{K}_n\setminus\{P_0, P_{\infty_\pm}\}$ and denote by $L_n = \{z\in\mathbb{C}^n \mid z = m + \tau n,\ m,n\in\mathbb{Z}^n\}$ the period lattice. Then $J(\mathcal{K}_n) = \mathbb{C}^n/L_n$ is the Jacobi variety of $\mathcal{K}_n$. Define the Abel map $A_{Q_0}$ by
$$A_{Q_0}\colon\mathcal{K}_n\to J(\mathcal{K}_n), \quad A_{Q_0}(P) = \Big(\int_{Q_0}^{P}\omega_1, \dots, \int_{Q_0}^{P}\omega_n\Big) \pmod{L_n}, \quad P\in\mathcal{K}_n. \tag{2.47}$$
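The theta series (2.45) converges rapidly once the imaginary part of the period matrix is positive definite, so it can be evaluated by plain truncation. The following minimal sketch does this in genus one only, where $\tau$ is a single complex number; the values of `tau` and `z`, the cutoff and the function name `theta` are illustrative assumptions, and computing the actual period matrix (2.46) of $\mathcal{K}_n$ is not attempted.

```python
import cmath

def theta(z, tau, cutoff=10):
    """Truncated genus-one theta series: sum over |n| <= cutoff of exp(2*pi*i*n*z + pi*i*tau*n^2)."""
    if tau.imag <= 0:
        raise ValueError("tau must have positive imaginary part for the series to converge")
    return sum(
        cmath.exp(2j * cmath.pi * n * z + 1j * cmath.pi * tau * n * n)
        for n in range(-cutoff, cutoff + 1)
    )

# Hypothetical sample values, chosen only for the check below.
tau = 0.3 + 1.1j
z = 0.17 - 0.05j

# Periodicity under the lattice vector 1.
err_one = abs(theta(z + 1, tau) - theta(z, tau))

# Quasi-periodicity under the lattice vector tau:
# theta(z + tau) = exp(-pi*i*tau - 2*pi*i*z) * theta(z).
factor = cmath.exp(-1j * cmath.pi * tau - 2j * cmath.pi * z)
err_tau = abs(theta(z + tau, tau) - factor * theta(z, tau))
print(err_one, err_tau)
```

Both checks express the behaviour of $\theta$ under shifts by elements of the period lattice $L_n$, which is what makes ratios such as the one in the theta function representation of $u$ meaningful on the Jacobi variety.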
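Similarly, we introduce
$$\alpha_{Q_0}\colon\mathrm{Div}(\mathcal{K}_n)\to J(\mathcal{K}_n), \quad \mathcal{D}\mapsto\alpha_{Q_0}(\mathcal{D}) = \sum_{P\in\mathcal{K}_n}\mathcal{D}(P)A_{Q_0}(P), \tag{2.48}$$
where $\mathrm{Div}(\mathcal{K}_n)$ denotes the set of divisors on $\mathcal{K}_n$. Furthermore,
$$\hat A_{Q_0}\colon\hat{\mathcal{K}}_n\to\mathbb{C}^n, \quad P\mapsto\hat A_{Q_0}(P) = \big(\hat A_{Q_0,1}(P), \dots, \hat A_{Q_0,n}(P)\big) = \Big(\int_{Q_0}^{P}\omega_1, \dots, \int_{Q_0}^{P}\omega_n\Big) \tag{2.49}$$
and
$$\hat\alpha_{Q_0}\colon\mathrm{Div}(\hat{\mathcal{K}}_n)\to\mathbb{C}^n, \quad \mathcal{D}\mapsto\hat\alpha_{Q_0}(\mathcal{D}) = \sum_{P\in\hat{\mathcal{K}}_n}\mathcal{D}(P)\hat A_{Q_0}(P). \tag{2.50}$$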
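Here $\hat{\mathcal{K}}_n$ denotes the simply connected interior of the fundamental polygon $\partial\hat{\mathcal{K}}_n$. As noted in the introduction, it turns out that the Abel map does not linearize solutions of the Dubrovin equations. More precisely, one finds that
$$\frac{d}{dx}\alpha_{Q_0}(\mathcal{D}_{\hat\mu(x)}) = \frac{1}{\prod_{j=1}^{n}\mu_j(x)}\,c, \quad x\in\Omega_\mu, \tag{2.51}$$
for some constant $c$. This also affects the formula for the solution $u$ of the $n$th stationary Camassa–Holm equation. Indeed, we find that
$$u(x) = Y_n + \frac12\sum_{j=1}^{n}U_j\frac{\partial}{\partial w_j}\ln\bigg(\frac{\theta\big(\hat z(P_{\infty_+},\hat\mu(x)) + w\big)}{\theta\big(\hat z(P_{\infty_-},\hat\mu(x)) + w\big)}\bigg)\bigg|_{w=0}, \tag{2.52}$$
where
$$\hat z(P_{\infty_\pm},\hat\mu(x)) = Z_{Q_0}(P_{\infty_\pm}) + \hat\alpha_{Q_0}(\mathcal{D}_{\hat\mu(x)}) = Z_{Q_0}(P_{\infty_\pm}) + \hat\alpha_{Q_0}(\mathcal{D}_{\hat\mu(x_0)}) + \int_{x_0}^{x}\frac{dx'}{\prod_{j=1}^{n}\mu_j(x')}\,c. \tag{2.53}$$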
Here the constants $U_j$, $Y_n$, $Z_{Q_0}(P_{\infty_\pm})$ are determined by the hyperelliptic curve and can be determined explicitly, see, e.g., [26, Theorem 5.5.8]. By the algebro-geometric initial value problem one means that given a solution $u_0$ of the $n$th stationary Camassa–Holm equation together with the given hyperelliptic curve $\mathcal{K}_n$ as described above, one seeks a solution $u$ of the $r$th time-dependent Camassa–Holm equation that coincides with $u_0$ at $t = 0$, i.e., $u|_{t=0} = u_0$. There is no relationship between $r$ and $n$, and the constants $c_\ell$ used in the construction are independent for the stationary and time-dependent equations. More precisely, one defines
$$U_{t_r}(z,x,t_r) - \tilde V_{r,x}(z,x,t_r) + [U(z,x,t_r), \tilde V_r(z,x,t_r)] = 0, \tag{2.54}$$
$$-V_{n,x}(z,x,t_r) + [U(z,x,t_r), V_n(z,x,t_r)] = 0, \tag{2.55}$$
for $(z,x,t_r)\in\mathbb{C}\times\mathbb{R}^2$ where
$$\begin{aligned}
U(z,x,t_r) &= \begin{pmatrix} -1 & 1 \\ z^{-1}\big(4u(x,t_r) - u_{xx}(x,t_r)\big) & 1 \end{pmatrix}, \\
\tilde V_r(z,x,t_r) &= \begin{pmatrix} -\tilde G_r(z,x,t_r) & \tilde F_r(z,x,t_r) \\ z^{-1}\tilde H_r(z,x,t_r) & \tilde G_r(z,x,t_r) \end{pmatrix}, \\
V_n(z,x,t_r) &= \begin{pmatrix} -G_n(z,x,t_r) & F_n(z,x,t_r) \\ z^{-1}H_n(z,x,t_r) & G_n(z,x,t_r) \end{pmatrix}.
\end{aligned} \tag{2.56}$$
We use a tilde to emphasize that the sets of integration constants $c_\ell$ are different in the two cases. Observe that even if we only know that (2.55) holds initially, it turns out that for the solutions constructed, it will hold for all times. Careful analysis reveals that
$$-V_{n,t_r}(z,x,t_r) + [\tilde V_r(z,x,t_r), V_n(z,x,t_r)] = 0 \tag{2.57}$$
holds as well. Similarly to the stationary case, the analysis is based on the meromorphic function
$$\phi(P,x,t_r) = \frac{y - zG_n(z,x,t_r)}{F_n(z,x,t_r)} \tag{2.58}$$
$$\phantom{\phi(P,x,t_r)} = \frac{zH_n(z,x,t_r)}{y + zG_n(z,x,t_r)}, \quad P = (z,y)\in\mathcal{K}_n\setminus\{P_{\infty_\pm}\},\ (x,t_r)\in\mathbb{R}^2. \tag{2.59}$$
The corresponding time-dependent vector $\Psi$ reads
$$\Psi(P,x,x_0,t_r,t_{0,r}) = \begin{pmatrix}\psi_1(P,x,x_0,t_r,t_{0,r}) \\ \psi_2(P,x,x_0,t_r,t_{0,r})\end{pmatrix}, \tag{2.60}$$
$P\in\mathcal{K}_n\setminus\{P_{\infty_\pm}\}$, $(x,x_0,t_r,t_{0,r})\in\mathbb{R}^4$, where
$$\psi_1(P,x,x_0,t_r,t_{0,r}) = \exp\Big(-\int_{t_{0,r}}^{t_r}ds\,\big((1/z)\tilde F_r(z,x_0,s)\phi(P,x_0,s) + \tilde G_r(z,x_0,s)\big) - (1/z)\int_{x_0}^{x}dx'\,\phi(P,x',t_r) - (x - x_0)\Big), \tag{2.61}$$
$$\psi_2(P,x,x_0,t_r,t_{0,r}) = -\psi_1(P,x,x_0,t_r,t_{0,r})\,\phi(P,x,t_r)/z. \tag{2.62}$$
Key properties are contained in the following lemma.
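Lemma 2.8. Assume (2.54) and (2.55). Let $P = (z,y)\in\mathcal{K}_n\setminus\{P_{\infty_+}, P_{\infty_-}, P_0\}$ and $(x,t_r)\in\mathbb{R}^2$. Then $\phi$ and $\Psi$ satisfy
$$\phi_x - z^{-1}\phi^2 - 2\phi + 4u - u_{xx} = 0, \tag{2.63}$$
$$\phi_{t_r} = (4u - u_{xx})\tilde F_r - \tilde H_r + 2(\tilde F_r\phi)_x \tag{2.64}$$
$$\phantom{\phi_{t_r}} = (1/z)\tilde F_r\phi^2 + 2\tilde G_r\phi - \tilde H_r, \tag{2.65}$$
$$\Psi_x = U\Psi, \tag{2.66}$$
$$-y\Psi = zV_n\Psi, \tag{2.67}$$
$$\Psi_{t_r} = \tilde V_r\Psi. \tag{2.68}$$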
The Dubrovin equations extend to the time-dependent case as the next lemma shows.
Lemma 2.9. Assume (2.54), (2.55) on an open and connected set $\tilde\Omega_\mu\subseteq\mathbb{R}^2$. Moreover, suppose that the zeros $\mu_j$, $j = 1, \dots, n$, of $F_n(\,\cdot\,)$ remain distinct and nonzero on $\tilde\Omega_\mu$. Then $\{\hat\mu_j\}_{j=1,\dots,n}$, defined by (2.31), satisfies the following first-order system of differential equations
$$\mu_{j,x}(x,t_r) = 2\mu_j(x,t_r)^{-1}\,y(\hat\mu_j(x,t_r))\prod_{\substack{\ell=1 \\ \ell\neq j}}^{n}\big(\mu_j(x,t_r) - \mu_\ell(x,t_r)\big)^{-1}, \tag{2.69}$$
$$\mu_{j,t_r}(x,t_r) = 2\tilde F_r(\mu_j(x,t_r),x,t_r)\,\mu_j(x,t_r)^{-1}\,y(\hat\mu_j(x,t_r))\prod_{\substack{\ell=1 \\ \ell\neq j}}^{n}\big(\mu_j(x,t_r) - \mu_\ell(x,t_r)\big)^{-1}, \tag{2.70}$$
$j = 1, \dots, n$, $(x,t_r)\in\tilde\Omega_\mu$. Next, assume $\mathcal{K}_n$ to be nonsingular and introduce the initial condition
$$\{\hat\mu_j(x_0,t_{0,r})\}_{j=1,\dots,n}\subset\mathcal{K}_n \tag{2.71}$$
for some $(x_0,t_{0,r})\in\mathbb{R}^2$, where $\mu_j(x_0,t_{0,r})\neq 0$, $j = 1, \dots, n$, are assumed to be distinct. Then there exists an open and connected set $\Omega_\mu\subseteq\mathbb{R}^2$, with $(x_0,t_{0,r})\in\Omega_\mu$, such that the initial value problem (2.69)–(2.71) has a unique solution $\{\hat\mu_j\}_{j=1,\dots,n}\subset\mathcal{K}_n$ satisfying
$$\hat\mu_j\in C^\infty(\Omega_\mu,\mathcal{K}_n), \quad j = 1, \dots, n, \tag{2.72}$$
and µj , j = 1, . . . , n, remain distinct and nonzero on Ωµ . Again the Dubrovin equations are not linearized by the Abel map, and the formula (2.52) as well as the trace formula (2.30) extend to the time-dependent case.
3. The generalized Camassa–Holm equation
This section describes joint work with G.M. Coclite and K.H. Karlsen, see [11, 12]. Here we are interested in the Cauchy problem for the nonlinear differential equation (1.7). We shall use the following definition of weak solution.
Definition 3.1. We call $u\colon[0,\infty)\times\mathbb{R}\to\mathbb{R}$ a weak solution of the Cauchy problem for (1.7) if
(i) $u\in C([0,\infty)\times\mathbb{R})\cap L^\infty\big((0,\infty); H^1(\mathbb{R})\big)$;
(ii) $u$ satisfies (3.4) in the sense of distributions;
(iii) $u(0,x) = u_0(x)$, for every $x\in\mathbb{R}$;
(iv) $\|u(t,\cdot)\|_{H^1(\mathbb{R})}\le\|u_0\|_{H^1(\mathbb{R})}$, for each $t > 0$.
If, in addition, there exists a positive constant $K_1$ depending only on $\|u_0\|_{H^1(\mathbb{R})}$ such that
$$\frac{\partial u}{\partial x}(t,x)\le\frac{2}{\gamma t} + K_1, \quad (t,x)\in(0,\infty)\times\mathbb{R}, \tag{3.1}$$
then we call $u$ an admissible weak solution of the Cauchy problem for (1.7). We shall assume
$$u|_{t=0} = u_0\in H^1(\mathbb{R}), \tag{3.2}$$
and
$$g\in C^\infty(\mathbb{R}), \quad g(0) = 0, \quad \gamma > 0. \tag{3.3}$$
Formally, equation (1.7) is equivalent to the elliptic-hyperbolic system
$$\frac{\partial u}{\partial t} + \gamma u\frac{\partial u}{\partial x} + \frac{\partial P}{\partial x} = 0, \quad -\frac{\partial^2P}{\partial x^2} + P = h(u) + \frac{\gamma}{2}\Big(\frac{\partial u}{\partial x}\Big)^2, \tag{3.4}$$
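when $h(\xi) = \frac12\big(g(\xi) - \gamma\xi^2\big)$. The solution of (3.4) is obtained from the regularized system
$$\begin{cases}
\dfrac{\partial u_\varepsilon}{\partial t} + \gamma u_\varepsilon\dfrac{\partial u_\varepsilon}{\partial x} + \dfrac{\partial P_\varepsilon}{\partial x} = \varepsilon\dfrac{\partial^2u_\varepsilon}{\partial x^2}, & t > 0,\ x\in\mathbb{R}, \\[1ex]
-\dfrac{\partial^2P_\varepsilon}{\partial x^2} + P_\varepsilon = h(u_\varepsilon) + \dfrac{\gamma}{2}\Big(\dfrac{\partial u_\varepsilon}{\partial x}\Big)^2, & t > 0,\ x\in\mathbb{R}, \\[1ex]
u_\varepsilon(0,x) = u_{\varepsilon,0}(x), & x\in\mathbb{R},
\end{cases} \tag{3.5}$$
assuming that
$$\|u_{\varepsilon,0}\|_{H^1(\mathbb{R})}\le\|u_0\|_{H^1(\mathbb{R})}, \ \varepsilon > 0, \quad\text{and}\quad u_{\varepsilon,0}\to u_0 \text{ in } H^1(\mathbb{R}). \tag{3.6}$$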
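One can prove, see [11], that if $u_{\varepsilon,0}\in H^\ell(\mathbb{R})$ with $\ell\ge 2$ and (3.6) holds, then there exists a unique solution $u_\varepsilon\in C\big(\mathbb{R}; H^\ell(\mathbb{R})\big)$ to the Cauchy problem (3.5). Moreover, for each $t\ge 0$,
$$\|u_\varepsilon(t,\cdot)\|^2_{H^1(\mathbb{R})} + 2\varepsilon\int_0^t\|q_\varepsilon(s,\cdot)\|^2_{H^1(\mathbb{R})}\,ds = \|u_{\varepsilon,0}\|^2_{H^1(\mathbb{R})},$$
where $q_\varepsilon = u_{\varepsilon,x}$. Furthermore, it is proved that the viscous approximation satisfies the following key estimates.
Lemma 3.2. (A) The unique solution of (3.5) satisfies
$$\frac{\partial u_\varepsilon}{\partial x}(t,x)\le\frac{2}{\gamma t} + C_2, \tag{3.7}$$
for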
.1/2 7 γ 2 h(ξ) + u0 2 1 2 C2 = . √max H (R) γ 2 |ξ|≤ 2u0 H 1 (R)
(3.8)
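(B) Let $0 < \alpha < 1$, $T > 0$, and $a, b\in\mathbb{R}$, $a < b$. Then there exists a positive constant $C_3$ depending only on $\|u_0\|_{H^1(\mathbb{R})}$, $\alpha$, $T$, $a$ and $b$, but independent of $\varepsilon$, such that
$$\int_0^T\!\!\int_a^b\Big|\frac{\partial u_\varepsilon}{\partial x}(t,x)\Big|^{2+\alpha}\,dt\,dx\le C_3. \tag{3.9}$$
(C) There exists a positive constant $C_4$ depending only on $\|u_0\|_{H^1(\mathbb{R})}$ such that
$$\|P_\varepsilon(t,\cdot)\|_{L^\infty(\mathbb{R})},\ \|P_\varepsilon(t,\cdot)\|_{L^2(\mathbb{R})},\ \Big\|\frac{\partial P_\varepsilon}{\partial x}(t,\cdot)\Big\|_{L^\infty(\mathbb{R})},\ \Big\|\frac{\partial P_\varepsilon}{\partial x}(t,\cdot)\Big\|_{L^2(\mathbb{R})}\le C_4. \tag{3.10}$$
In particular, the family $\{P_\varepsilon\}_\varepsilon$ is uniformly bounded in $L^\infty([0,\infty); W^{1,\infty}(\mathbb{R}))$ and $L^\infty([0,\infty); H^1(\mathbb{R}))$.
From this result it follows that there exists a sequence $\{\varepsilon_j\}_{j\in\mathbb{N}}$ tending to zero such that
$$q_{\varepsilon_j}\rightharpoonup q \text{ in } L^p_{\mathrm{loc}}([0,\infty)\times\mathbb{R}), \quad q_{\varepsilon_j}\overset{*}{\rightharpoonup} q \text{ in } L^\infty_{\mathrm{loc}}([0,\infty); L^2(\mathbb{R})), \tag{3.11}$$
$$q^2_{\varepsilon_j}\rightharpoonup\overline{q^2} \text{ in } L^r_{\mathrm{loc}}([0,\infty)\times\mathbb{R}), \tag{3.12}$$
for each $1 < p < 3$ and $1 < r < \frac32$, for functions $q\in L^p_{\mathrm{loc}}([0,\infty)\times\mathbb{R})$, $\overline{q^2}\in L^r_{\mathrm{loc}}([0,\infty)\times\mathbb{R})$. Moreover,
$$q^2(t,x)\le\overline{q^2}(t,x) \text{ for almost every } (t,x)\in[0,\infty)\times\mathbb{R} \tag{3.13}$$
and
$$\frac{\partial u}{\partial x} = q \text{ in the sense of distributions on } [0,\infty)\times\mathbb{R}. \tag{3.14}$$
Following [42], we wish to improve the weak convergence of $q_\varepsilon$ in (3.11) to strong convergence (and then we have an existence result for (1.7)). Roughly speaking, the idea is to derive a "transport equation" for the evolution of the defect measure $(\overline{q^2} - q^2)(t,\cdot)\ge 0$, so that if it is zero initially, then it will continue to be zero at all later times $t > 0$. The proof is complicated by the fact that we do not have a uniform bound on $q_\varepsilon$ from below but merely
$$q_\varepsilon(t,x),\ q(t,x)\le\frac{2}{\gamma t} + C_2, \quad t\ge 0,\ x\in\mathbb{R}, \tag{3.15}$$
and that in Lemma 3.2 we have only $\alpha < 1$.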
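Our existence results are collected in the following theorem:
Theorem 3.3. There exists a strongly continuous semigroup of solutions associated to the Cauchy problem (1.7). More precisely, let
$$S\colon[0,\infty)\times(0,\infty)\times\mathcal{E}\times H^1(\mathbb{R})\longrightarrow C([0,\infty)\times\mathbb{R})\cap L^\infty\big([0,\infty); H^1(\mathbb{R})\big),$$
where
$$\mathcal{E} = \{g\in\mathrm{Lip}_{\mathrm{loc}}(\mathbb{R}) \mid g(0) = 0\},$$
be such that
(j) for each $u_0\in H^1(\mathbb{R})$, $\gamma > 0$, $g\in\mathcal{E}$ the map $u(t,x) = S_t(\gamma,g,u_0)(x)$ is an admissible weak solution of (1.7);
(jj) it is stable with respect to the initial condition in the following sense, if
$$u_{0,n}\longrightarrow u_0 \text{ in } H^1(\mathbb{R}), \quad \gamma_n\longrightarrow\gamma, \quad g_n\longrightarrow g \text{ in } L^\infty(I), \tag{3.16}$$
then
$$S_t(\gamma_n,g_n,u_{0,n})\longrightarrow S_t(\gamma,g,u_0) \text{ in } L^\infty([0,T]; H^1(\mathbb{R})), \tag{3.17}$$
for every $\{u_{0,n}\}_{n\in\mathbb{N}}\subset H^1(\mathbb{R})$, $\{\gamma_n\}_{n\in\mathbb{N}}\subset(0,\infty)$, $\{g_n\}_{n\in\mathbb{N}}\subset\mathcal{E}$, $u_0\in H^1(\mathbb{R})$, $\gamma > 0$, $g\in\mathcal{E}$, $T > 0$, where
$$I = \frac{1}{\sqrt2}\Big[-\sup_n\|u_{0,n}\|_{H^1(\mathbb{R})},\ \sup_n\|u_{0,n}\|_{H^1(\mathbb{R})}\Big].$$
Moreover, the following statements hold:
(k) Estimate (3.1) is valid with $K_1 = C_2$ given by (3.8).
(kk) There results
$$\frac{\partial}{\partial x}S(\gamma,g,u_0)\in L^p_{\mathrm{loc}}([0,\infty)\times\mathbb{R}), \tag{3.18}$$
for each $1\le p < 3$.
(kkk) The following identity holds in the sense of distributions on $[0,\infty)\times\mathbb{R}$
$$\frac{\partial}{\partial t}\Big(\frac{u^2 + q^2}{2}\Big) + \frac{\partial}{\partial x}\Big(\frac{\gamma}{2}uq^2 + Pu + \frac{\gamma}{3}u^3 - H(u)\Big) = -\mu, \tag{3.19}$$
where $u = S_t(\gamma,g,u_0)$, $q = \frac{\partial}{\partial x}S_t(\gamma,g,u_0)$, and $H' = h$. The defect measure $\mu$ is a nonnegative Radon measure such that
$$R\,q\,(q + R)\,\chi_{(-\infty,-R)}(q)\rightharpoonup\mu \text{ as } R\to\infty$$
in the sense of measures and $\mu([0,\infty)\times\mathbb{R})\le\frac12\|u_0\|_{H^1(\mathbb{R})}$.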
4. The Hunter–Saxton equation
This section is based on joint work with N.H. Risebro and K.H. Karlsen, see [28]. We here consider the initial-boundary value problem
$$v_t + uv_x = -\frac12v^2, \quad u_x = v, \quad v(x,0) = v_0(x), \quad u(0,t) = 0, \tag{4.1}$$
in the half strip $(x,t)\in Q_T = [0,\infty)\times[0,T]$. First, we assume that $v_0\ge 0$. We study several schemes, i.e., semi-discrete, implicit and explicit finite difference schemes. To keep the presentation short, we only describe the semi-discrete scheme here. We discretize the half-line with a spacing $\Delta x$, thus
$$x_j = j\Delta x, \quad f_j = f(x_j), \quad D_\pm f_j = \pm\frac{1}{\Delta x}\big(f_{j\pm1} - f_j\big), \quad j\in\mathbb{N}_0.$$
The semi-discrete scheme is given by
$$v_t + uD_-v = -\frac12v^2, \quad D_+u = v, \quad v|_{t=0} = v(0), \quad u_0(t) = 0, \tag{4.2}$$
where $v(0)$ is a discretization of $v_0$ that converges in $L^2([0,\infty))$. Then we define
$$v_{\Delta x}(x,t) = \sum_{j\ge0}v_j(t)\chi_{I_j}(x), \quad\text{and}\quad u_{\Delta x}(x,t) = \int_0^xv_{\Delta x}(y,t)\,dy, \tag{4.3}$$
where $I_j = [x_{j-1/2}, x_{j+1/2})$ and $x_{j\pm1/2} = x_j\pm\Delta x/2$.
Lemma 4.1. Set $\bar v_\Delta(t) = \max_{j\ge0}v_j(t)$. Then for $t > 0$ we have
$$0\le v_j(t)\le\bar v_\Delta(t)\le\frac{2\bar v_\Delta(0)}{t\bar v_\Delta(0) + 2}. \tag{4.4}$$
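The right-hand side of (4.4) is the solution of a scalar Riccati equation. The short computation below is added as an illustration and assumes only that a function $y$ solves $y' = -\tfrac12y^2$ with $y(0) = y_0 > 0$; how the lemma reduces the scheme to this comparison equation is not spelled out in the text.

```latex
\begin{align*}
y'(t) = -\tfrac12\, y(t)^2,\quad y(0) = y_0 > 0
  \;&\Longrightarrow\; \frac{d}{dt}\Big(\frac{1}{y(t)}\Big) = \frac12
  \;\Longrightarrow\; \frac{1}{y(t)} = \frac{1}{y_0} + \frac{t}{2}, \\
y(t) &= \frac{2y_0}{t\,y_0 + 2} \;\le\; \frac{2}{t}, \qquad t > 0.
\end{align*}
```

With $y_0 = \bar v_\Delta(0)$ this reproduces the bound in (4.4), and the last inequality is independent of the size of the initial data.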
Observe that
$$D_-(u_jv_j) = u_jD_-v_j + v_{j-1}D_-u_j = u_jD_-v_j + v_{j-1}^2,$$
which implies that a conservative formulation of the scheme is
$$v_{j,t} + D_-(u_jv_j) = \frac12v_j^2 - (v_{j-1} + v_j)\,D_-v_j\,\Delta x. \tag{4.5}$$
Multiplying equation (4.5) with $f'(v_j)$ we find
$$\frac{d}{dt}f(v_j) + u_jD_-f(v_j) + \frac{\Delta x}{2}u_jf''(\xi_j)(D_-v_j)^2 = -\frac12f'(v_j)v_j^2 \tag{4.6}$$
for some $\xi_j$ between $v_{j-1}$ and $v_j$. Using that
$$D_-(u_jf(v_j)) = u_jD_-f(v_j) + f(v_{j-1})D_-u_j = u_jD_-f(v_j) + f(v_{j-1})v_{j-1}$$
we obtain
$$\frac{d}{dt}f(v_j) + D_-(u_jf(v_j)) + \frac{\Delta x}{2}u_jf''(\xi_j)(D_-v_j)^2 = -\frac12f'(v_j)v_j^2 + v_{j-1}f(v_{j-1}). \tag{4.7}$$
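The conservative form (4.5) can be verified in a few lines from the semi-discrete scheme (4.2) and the discrete product rule. The sketch below is an added illustration; it uses $D_-u_j = v_{j-1}$, which follows from $D_+u_{j-1} = v_{j-1}$ and is therefore assumed for $j\ge1$.

```latex
\begin{align*}
v_{j,t} + D_-(u_j v_j)
  &= \big(v_{j,t} + u_j D_- v_j\big) + v_{j-1}^2
   = -\tfrac12 v_j^2 + v_{j-1}^2 \\
  &= \tfrac12 v_j^2 - \big(v_j^2 - v_{j-1}^2\big)
   = \tfrac12 v_j^2 - (v_{j-1} + v_j)\,(v_j - v_{j-1}) \\
  &= \tfrac12 v_j^2 - (v_{j-1} + v_j)\, D_- v_j\, \Delta x .
\end{align*}
```

The last term is formally of order $\Delta x$, which is why the scheme is consistent with the conservative form $v_t + (uv)_x = \tfrac12v^2$ of (4.1).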
In particular, choosing $f(v) = v^p$ in (4.7), we find the key relation
$$\frac{d}{dt}v_j^p + u_jD_-v_j^p + \frac{p(p-1)}{2}u_j\xi_j^{p-2}(D_-v_j)^2\Delta x = -\frac{p}{2}v_j^{p+1}. \tag{4.8}$$
If $v_0$ is non-negative and in $L^q([0,\infty))$ for $q > 2$, one can show that $u_\Delta\to u$, and that $v_\Delta\rightharpoonup\bar v$, $v_\Delta^2\rightharpoonup\bar w$. Furthermore, the following result holds.
Lemma 4.2. We have that
$$\bar v_t + (u\bar v)_x = \frac12\bar w \tag{4.9}$$
and
$$\bar w_t + (u\bar w)_x\le 0 \tag{4.10}$$
weakly in $Q_T$.
From general properties, we know that $\bar w\ge\bar v^2$. The next lemma shows that indeed $\bar w = \bar v^2$.
Lemma 4.3. Assume that $u\colon Q_T\to[0,\infty)$ is bounded and continuous, $v\in L^2(Q_T)$ and $w\in L^1(Q_T)$, such that
$$v_t + (uv)_x = \frac12w, \tag{4.11}$$
$$w_t + (uw)_x\le 0, \tag{4.12}$$
$$u_x = v, \tag{4.13}$$
weakly in $Q_T$. If we have $w(x,t)\ge v^2(x,t)$ almost everywhere in $Q_T$ and $w(x,0) = v^2(x,0)$, then $w = v^2$.
Finally, we state our main result.
Theorem 4.4. Let $v_0$ be a non-negative function in $L^2([0,\infty))\cap L^q([0,\infty))$ for $q > 2$. Define the semi-discrete approximation $(v_{\Delta x}, u_{\Delta x})$ for $\Delta x$ positive using (4.3) and (4.2). Then $(v_{\Delta x}, u_{\Delta x})$ converges to $(v,u)$, i.e.,
$$\|u_{\Delta x} - u\|_{L^\infty(Q_T)}\to 0, \quad \|v_{\Delta x}(t) - v(t)\|_{L^2([0,\infty))}\to 0$$
as $\Delta x\to 0$. The limit satisfies
$$(v,u)\in L^\infty([0,T], L^2([0,\infty)))\otimes C(Q_T),$$
$$v_t + (uv)_x = \frac12v^2 \text{ and } u_x = v \text{ weakly in } Q_T,$$
$$\|v(t)\|_{L^2([0,\infty))}\le\|v_0\|_{L^2([0,\infty))},$$
$$u(x,t)\to 0 \text{ as } x\to 0,$$
$$\|v(t) - v_0\|_{L^1([0,\infty))}\to 0 \text{ as } t\to 0,$$
$$v(x,t)\le 2/t \text{ a.e. on } Q_T.$$
Figure 2. The function $u$. See Figure 3 for the corresponding $v$.
We mention here in passing that similar results can be proved for the implicit scheme
$$D^t_+v^n_j + u^{n+1}_jD_-v^{n+1}_j = -\frac12\big(v^{n+1}_j\big)^2, \quad D_+u^{n+1}_j = v^{n+1}_j, \quad u^n_0 = 0, \quad n, j\in\mathbb{N}_0, \tag{4.14}$$
and the explicit scheme
$$D^t_+v^n_j + u^n_jD_-v^n_j = -\frac12\big(v^n_j\big)^2, \quad D_+u^n_j = v^n_j, \quad u^n_0 = 0, \quad n\ge 0,\ 0\le j\le J. \tag{4.15}$$
Here $D^t_+g(x,t) = (g(x,t+\Delta t) - g(x,t))/\Delta t$. To cover the case with varying sign of the initial data $v_0$ we modify the semi-discrete scheme as follows. Consider
$$\dot v_j + (u_j\vee 0)\,D_-v_j + (u_{j+1}\wedge 0)\,D_+v_j = -\frac12v_j^2, \quad v_j = D_+u_j, \quad u_0(t) = 0, \tag{4.16}$$
where $(a\wedge b) = \min\{a,b\}$ and $(a\vee b) = \max\{a,b\}$. With this scheme we can show that Theorem 4.4 still holds with (4.2) replaced by (4.16).
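As a rough illustration of how the explicit scheme (4.15) can be stepped forward for non-negative data, the following sketch implements one time step and monitors a bound of the form (4.4) along the way. The grid size, time step, domain length, initial profile and the function name `explicit_step` are hypothetical choices made here, not values from the paper; the step sizes are picked small enough that $u^n_j\Delta t/\Delta x$ stays below one.

```python
import math

def explicit_step(v, dx, dt):
    """One step of the explicit upwind scheme (4.15) for non-negative data."""
    # u_0 = 0 and D_+ u_j = v_j give u_j = dx * (v_0 + ... + v_{j-1}).
    u = [0.0]
    for vj in v[:-1]:
        u.append(u[-1] + dx * vj)
    new_v = []
    for j, vj in enumerate(v):
        # At j = 0 the backward difference is multiplied by u_0 = 0, so it is set to zero.
        backward_diff = (vj - v[j - 1]) / dx if j > 0 else 0.0
        new_v.append(vj - dt * (u[j] * backward_diff + 0.5 * vj * vj))
    return new_v

# Hypothetical discretization parameters and initial data.
dx, dt, length, final_time = 0.05, 0.005, 5.0, 2.0
cells = int(round(length / dx))
v = [math.exp(-((j * dx - 2.0) ** 2)) for j in range(cells + 1)]
v_max0 = max(v)

bound_ok = True
steps = int(round(final_time / dt))
for n in range(1, steps + 1):
    v = explicit_step(v, dx, dt)
    t = n * dt
    # Right-hand side of (4.4) with the initial maximum of the discrete data.
    bound = 2.0 * v_max0 / (t * v_max0 + 2.0)
    bound_ok = bound_ok and min(v) >= 0.0 and max(v) <= bound + 1e-12
print(bound_ok, max(v))
```

The upwind difference is taken backward because $u\ge0$ when $v\ge0$; for data of varying sign the direction has to be switched as in (4.16), which this sketch does not do.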
Figure 3. The solution v with initial data v0 = cos(πx)/ .1 + x2 with ∆x = 10−8 .
Figure 4. The function u. See Figure 5 for the corresponding v.
Figure 5. The solution v with initial data v0 = − sin(πx)/ .1 + x2 with ∆x = 10−8.
Figure 6. The function u. See Figure 7 for the corresponding v.
Figure 7. The solution v with initial data v0 = −2χ{x<(1−t)2 } /(2 − t) with ∆x = 10−8.
Acknowledgements. I acknowledge with pleasure and gratitude the joyful and stimulating collaboration with G.M. Coclite, F. Gesztesy, K.H. Karlsen, X. Raynaud, and N.H. Risebro on which this paper is based. Figure 1 is due to X. Raynaud, while Figures 2–7 are due to N.H. Risebro.
References
[1] M.S. Alber. N-component integrable systems and geometric asymptotics. In: Integrability: The Seiberg-Witten and Whitham equations (H.W. Braden and I.M. Krichever, editors). Gordon and Breach Science Publishers, Singapore, 2000, pp. 213–228.
[2] M.S. Alber, R. Camassa, Yu.N. Fedorov, D.D. Holm, and J.E. Marsden. On billiard solutions of nonlinear PDEs. Phys. Lett. A 264 (1999) 171–178.
[3] M.S. Alber, R. Camassa, Yu.N. Fedorov, D.D. Holm, and J.E. Marsden. The complex geometry of weak piecewise smooth solutions of integrable nonlinear PDE's of shallow water and Dym type. Comm. Math. Phys. 221 (2001) 197–227.
[4] M.S. Alber, R. Camassa, and M. Gekhtman. Billiard weak solutions of nonlinear PDE's and Toda flows. In: SIDE III–Symmetries and Integrability of Difference Equations (D. Levi and O. Ragnisco, editors). CRM Proceedings and Lecture Notes, Amer. Math. Soc., Providence, RI, 2000, volume 25, pp. 1–11.
[5] M.S. Alber, R. Camassa, D.D. Holm, and J.E. Marsden. The geometry of peaked solitons and billiard solutions of a class of integrable PDE's. Lett. Math. Phys. 32 (1994) 137–151.
[6] M.S. Alber, R. Camassa, D.D. Holm, and J.E. Marsden. On the link between umbilic geodesics and soliton solutions of nonlinear PDE's. Proc. Roy. Soc. London Ser. A 450 (1995) 677–692.
[7] M.S. Alber and Yu.N. Fedorov. Wave solutions of evolution equations and Hamiltonian flows on nonlinear subvarieties of generalized Jacobians. J. Phys. A 33 (2000) 8409–8425.
[8] M.S. Alber and Yu.N. Fedorov. Algebraic geometrical solutions for certain evolution equations and Hamiltonian flows on nonlinear subvarieties of generalized Jacobians. Inverse Problems 17 (2001) 1017–1042.
[9] R. Camassa and D.D. Holm. An integrable shallow water equation with peaked solitons. Phys. Rev. Lett. 71 (1993) 1661–1664.
[10] R. Camassa, D.D. Holm, and J.M. Hyman. A new integrable shallow water equation. Adv. Appl. Mech. 31 (1994) 1–33.
[11] G.M. Coclite, H. Holden, and K.H. Karlsen. Wellposedness of solutions of a parabolic-elliptic system. Discrete Contin. Dyn. Syst. Ser. A, to appear.
[12] G.M. Coclite, H. Holden, and K.H. Karlsen. Global weak solutions to a generalized hyperelastic-rod wave equation. Preprint, 2004. Submitted.
[13] A. Constantin. On the Cauchy problem for the periodic Camassa–Holm equation. J. Differential Equations 141 (1997) 218–235.
[14] A. Constantin. On the blow-up of solutions of a periodic shallow water equation. J. Nonlinear Sci. 10 (2000) 391–399.
[15] A. Constantin and J. Escher. Global existence and blow-up for a shallow water equation. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 26 (1998) 303–328.
[16] A. Constantin and J. Escher. Well-posedness, global existence, and blowup phenomena for a periodic quasi-linear hyperbolic equation. Comm. Pure Appl. Math. 51 (1998) 475–504.
[17] A. Constantin and J. Escher. On the Cauchy problem for a family of quasilinear hyperbolic equations. Comm. Partial Differential Equations 23 (1998) 1449–1458.
[18] A. Constantin and H.P. McKean. A shallow water equation on the circle. Comm. Pure Appl. Math. 52 (1999) 949–982.
[19] A. Constantin and L. Molinet. Global weak solutions for a shallow water equation. Comm. Math. Phys. 211 (2000) 45–61.
[20] H.-H. Dai. Exact travelling-wave solutions of an integrable equation arising in hyperelastic rods. Wave Motion 28 (1998) 367–381.
[21] H.-H. Dai. Model equations for nonlinear dispersive waves in a compressible Mooney–Rivlin rod. Acta Mech. 127 (1998) 193–207.
[22] H.-H. Dai and Y. Huo. Solitary shock waves and other travelling waves in a general compressible hyperelastic rod. R. Soc. Lond. Proc. Ser. A 456 (2000) 331–363.
[23] R. Danchin. A note on well-posedness for Camassa–Holm equation. J. Differential Equations 192 (2003) 429–444.
[24] F. Gesztesy and H. Holden. Algebro-geometric solutions of the Camassa–Holm hierarchy. Rev. Mat. Iberoamericana 19 (2003) 73–142.
[25] F. Gesztesy and H. Holden. Real-valued algebro-geometric solutions of the Camassa–Holm hierarchy. Philos. Trans. Roy. Soc. London Ser. A, to appear.
[26] F. Gesztesy and H. Holden. Algebro-Geometric Solutions of Soliton Equations. Vol. I. (1 + 1)-Dimensional Continuous Models. Cambridge Univ. Press, Cambridge, 2003.
[27] A.A. Himonas and G. Misiolek. The Cauchy problem for an integrable shallow-water equation. Differential Integral Equations 14 (2001) 821–831.
[28] H. Holden, K.H. Karlsen, and N.H. Risebro. Convergence of upwind methods for the Hunter–Saxton equation. Preprint, CMA, UiO, 2004. In preparation.
[29] H. Holden and X. Raynaud. Convergence of a finite difference scheme for the Camassa–Holm equation. Preprint, NTNU, 2004. Submitted.
[30] J.K. Hunter and R.A. Saxton. Dynamics of director fields. SIAM J. Appl. Math. 51 (1991) 1498–1521.
[31] J.K. Hunter and Y. Zheng. On a completely integrable nonlinear hyperbolic variational equation. Physica D 79 (1994) 361–386.
[32] J.K. Hunter and Y. Zheng. On a nonlinear hyperbolic variational equation: I. Global existence of weak solutions. Arch. Rat. Mech. Anal. 129 (1995) 305–353.
[33] J.K. Hunter and Y. Zheng. On a nonlinear hyperbolic variational equation: II. The zero-viscosity and dispersion limits. Arch. Rat. Mech. Anal. 129 (1995) 355–383.
[34] R.S. Johnson. Camassa–Holm, Korteweg–de Vries and related models for water waves. J. Fluid Mech. 455 (2002) 63–82.
[35] H. Kalisch. Stability of solitary waves for a nonlinearly dispersive equation. Discrete Contin. Dyn. Syst. 10 (2004) 709–717.
[36] Y.A. Li and P.J. Olver. Well-posedness and blow-up solutions for an integrable nonlinearly dispersive model wave equation. J. Differential Equations 162 (2000) 27–63.
[37] O. Lopes. Stability of peakons for the generalized Camassa–Holm equation. Electron. J. Differential Equations, No. 5, 12 pp. (electronic), 2003.
[38] G. Misiolek. Classical solutions of the periodic Camassa–Holm equation. Geom. Funct. Anal. 12 (2002) 1080–1104.
[39] T. Qian and M. Tang. Peakons and periodic cusp waves in a generalized Camassa–Holm equation. Chaos Solitons Fractals 12 (2001) 1347–1360.
[40] G. Rodríguez-Blanco. On the Cauchy problem for the Camassa–Holm equation. Nonlinear Anal. 46 (2001) 309–327.
[41] J. Simon. Compact sets in the space L^p(0, T; B). Ann. Mat. Pura Appl. 146 (1987) 65–96.
[42] Z. Xin and P. Zhang. On the weak solutions to a shallow water equation. Comm. Pure Appl. Math. 53 (2000) 1411–1433.
[43] Z. Xin and P. Zhang. On the uniqueness and large time behavior of the weak solutions to a shallow water equation. Comm. Partial Differential Equations 27 (2002) 1815–1844.
[44] Z. Yin. On the blow-up scenario for the generalized Camassa–Holm equation. Comm. Partial Differential Equations 29 (2004) 867–877.
[45] Z. Yin. On the Cauchy problem for the generalized Camassa–Holm equation. Preprint.
[46] Z. Yin. On the Cauchy problem for a nonlinearly dispersive wave equation. J. Nonlinear Mathematical Physics 10 (2003) 10–15.
[47] P. Zhang and Y. Zheng. On oscillations of an asymptotic equation of a nonlinear variational wave equation. Asymptot. Anal. 18 (1998) 307–327.
[48] P. Zhang and Y. Zheng. On the existence and uniqueness of solutions to an asymptotic equation of a variational wave equation. Acta Math. Sinica (Engl. Ed.) 15 (1999) 115–130.
[49] P. Zhang and Y. Zheng. Existence and uniqueness of solutions of an asymptotic equation arising from a variational wave equation with general data. Arch. Rat. Mech. Anal. 155 (2000) 49–83.
Helge Holden
Department of Mathematical Sciences
Norwegian University of Science and Technology
NO–7491 Trondheim, Norway
and
Centre of Mathematics for Applications
University of Oslo
P.O. Box 1053, Blindern
NO–0316 Oslo, Norway
e-mail: [email protected]
URL: www.math.ntnu.no/~holden/
4ECM Stockholm 2004 © 2005 European Mathematical Society
Multiple Scales Asymptotics for Atmospheric Flows
Rupert Klein, Eileen Mikusky and Antony Owinoh
Abstract. One important activity of theoretical meteorology involves the development of simplified model equations that describe selected scale-dependent phenomena observed in atmospheric flows. This paper summarizes a unified mathematical approach to the derivation of such models, based on multiple scales asymptotic techniques. First we motivate the approach by an example from fluid mechanics, the interaction of small-scale quasi-incompressible flow with long-wave acoustics. In this case, the analysis proceeds via multiple scales asymptotics in terms of the Mach number, M, as the small expansion parameter. Then we discuss the particular setting of meteorology, where there is a host of singular small parameters to be taken into account. Examples are the Rossby, Froude, Mach, and Strouhal numbers. A particular distinguished limit among these parameters is introduced in combination with systematic multiple scales asymptotics. A wide range of simplified meteorological models can then be recovered by specializing this general ansatz to a single horizontal, a single vertical, and a single time coordinate. As a concrete example we report on the multiple scales derivation of boundary layer theories. In particular, we recover the classical Ekman boundary layer equations for flows on synoptic scales (∼ 500 km, 12 h), and find an extension of the nonlinear Prandtl boundary layer equations to atmospheric mesoscales (∼ 70 km, 2 h).
1. Introduction
1.1. Single time, multiple space scales for low Mach number flows. To motivate the approach suggested here for meteorological modelling, we review the multiple scales asymptotics for low Mach number flows in [12]. Thus we discuss compressible flow with characteristic flow velocities $u_{\mathrm{ref}}$ that are small compared with a typical sound speed $c_{\mathrm{ref}}$ of the fluid, so that
$$\mathrm{M} = \frac{u_{\mathrm{ref}}}{c_{\mathrm{ref}}}\ll 1. \tag{1.1}$$
The governing Euler compressible flow equations read
$$\begin{aligned}
\rho_t + \nabla\cdot(\rho v) &= 0, \\
(\rho v)_t + \nabla\cdot(\rho v\circ v) + \frac{1}{\mathrm{M}^2}\nabla p &= 0, \\
(\rho e)_t + \nabla\cdot([\rho e + p]\,v) &= 0.
\end{aligned} \tag{1.2}$$
Here $(\rho, v, p)$ are the density, velocity, and pressure, respectively. The total energy density, $\rho e$, obeys the equation of state,
$$\rho e = \frac{p}{\gamma - 1} + \mathrm{M}^2\frac{\rho v^2}{2}, \tag{1.3}$$
with $\gamma$ the isentropic exponent, which is assumed constant here. Consider a one-dimensional setting as sketched in Fig. 1.
Figure 1. Single time, multiple space scales, a), vs. multiple time, single space scales, b), for low Mach number flows. Thick wiggly lines indicate perturbations, imposed locally in an oscillatory fashion, a), or with given spatial variation at some point in time, b).
The graph on the left sketches a situation in which, e.g., an unstable flame drives the surrounding flow field, thereby simultaneously inducing advectively transported fluctuations of entropy, vorticity, and chemical species, and acoustic pressure perturbations. As the time scale, $T$, of the oscillations is imposed by the fluctuating flame, the difference in propagation speeds, $u_{\mathrm{ref}}\ll c_{\mathrm{ref}}$, results in differing characteristic lengths, $L$ and $L/\mathrm{M}$, of the associated advective and acoustic phenomena, respectively. This regime may be captured by a multiple space scale asymptotic expansion of the form, [12],
$$\mathbf{U}(x,t;\mathrm{M}) = \sum_i\mathrm{M}^i\,\mathbf{U}^{(i)}(x,\xi,t), \quad\text{where}\quad \xi = \mathrm{M}x. \tag{1.4}$$
The opposite regime, sketched in Fig. 1b), arises, e.g., in the solution of a Cauchy initial value problem when the initial data have a certain characteristic length, $L$, and include fluctuations that again excite advective as well as acoustic modes. The appropriate asymptotic representation for $\mathrm{M}\ll 1$ reads
$$\mathbf{U}(x,t;\mathrm{M}) = \sum_i\mathrm{M}^i\,\mathbf{U}^{(i)}(x,t,\tau), \quad\text{where}\quad \tau = \frac{1}{\mathrm{M}}t. \tag{1.5}$$
In (1.4), (1.5), $\mathbf{U}$ is a placeholder for any of the dependent variables.
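How the ansatz (1.4) generates a hierarchy of equations can be sketched as follows. This is the standard multiple scales bookkeeping, stated here as an illustration rather than quoted from [12]: spatial derivatives act on both the small-scale variable $x$ and the long-wave variable $\xi = \mathrm{M}x$, and terms are then collected by powers of $\mathrm{M}$.

```latex
\begin{align*}
\nabla\Big[\mathbf{U}^{(i)}(x,\xi,t)\big|_{\xi = \mathrm{M}x}\Big]
  &= \big(\nabla_x + \mathrm{M}\,\nabla_\xi\big)\mathbf{U}^{(i)}, \\
\nabla\mathbf{U}(x,t;\mathrm{M})
  &= \nabla_x\mathbf{U}^{(0)}
   + \mathrm{M}\big(\nabla_x\mathbf{U}^{(1)} + \nabla_\xi\mathbf{U}^{(0)}\big)
   + \mathrm{M}^2\big(\nabla_x\mathbf{U}^{(2)} + \nabla_\xi\mathbf{U}^{(1)}\big)
   + \dots
\end{align*}
```

Applied to the term $\mathrm{M}^{-2}\nabla p$ in the momentum equation of (1.2), the two lowest orders require $\nabla_x p^{(0)} = 0$ and $\nabla_x p^{(1)} + \nabla_\xi p^{(0)} = 0$, which constrains how the leading pressure contributions may depend on $x$ and $\xi$.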
The second of the two regimes has been analysed in quite some detail. See, e.g., [9, 18, 5, 11, 22, 19, 10, 24, 25, 3] and the references therein. The first is less prominent, but equally important. We use it here to demonstrate how the multiple scales ansatz in (1.4) allows us to (1) provide a unified derivation of two well-known simplified single-scale models of theoretical fluid mechanics, and (2) derive a new extended model that explicitly describes the scale interactions in this regime. In [12] one of the present authors derives the leading order closed set of equations that results from the expansion scheme in (1.4):
Pressure decomposition
$$p^{(0)}\equiv P_0(t), \quad p^{(1)} = P^{(1)}(\xi,t), \quad p^{(2)} = p^{(2)}(x,\xi,t). \tag{1.6}$$
Small scale quasi-incompressible flow
$$\begin{aligned}
\rho_t + \nabla_x\cdot(\rho v) &= 0, \\
(\rho v)_t + \nabla_x\cdot(\rho v\circ v) + \nabla_xp^{(2)} &= -\nabla_\xi P^{(1)}, \\
\gamma P_0\,\nabla_x\cdot v &= -\frac{dP_0}{dt}.
\end{aligned} \tag{1.7}$$
Long wave acoustics
$$\begin{aligned}
(\overline{\rho v})_t + \nabla_\xi P^{(1)} &= 0, \\
P^{(1)}_t + \gamma P_0\,\nabla_\xi\cdot\overline{v} &= 0.
\end{aligned} \tag{1.8}$$
Here the overbar denotes averages of the perturbation functions in the small scale variable $x$. Consider now the single-scale specialization of (1.6), (1.7) and (1.8) for which $\nabla_\xi\mathbf{U}^{(i)}\equiv 0$. In this case, $P^{(1)}\equiv P_1(t)$, and it disappears from the small scale flow equations in (1.7). These reduce to the well-known equations for variable density, zero Mach number flow with background compression, i.e.,
$$\begin{aligned}
\rho_t + \nabla_x\cdot(\rho v) &= 0, \\
(\rho v)_t + \nabla_x\cdot(\rho v\circ v) + \nabla_xp^{(2)} &= 0, \\
\nabla_x\cdot v &= -\frac{1}{\gamma P_0}\frac{dP_0}{dt}.
\end{aligned} \tag{1.9}$$
Consider, on the other hand, the specialization of (1.4) to allow only for long-wave dependencies on $\xi$, requiring $\nabla_x\mathbf{U}^{(i)}\equiv 0$. Then $dP_0/dt\equiv 0$, i.e., $P_0\equiv P_\infty = \text{const.}$, and $\rho^{(0)} = \rho_0(\xi)$ from (1.7)$_3$ and (1.7)$_1$, respectively. The long wave equations from (1.8) reduce to the equations of linear acoustics with space-dependent speed of sound,
$$\begin{aligned}
\rho_0v_t + \nabla_\xi P^{(1)} &= 0, \\
P^{(1)}_t + \gamma P_\infty\,\nabla_\xi\cdot v &= 0.
\end{aligned} \tag{1.10}$$
R. Klein, E. Mikusky and A. Owinoh
In these equations,
\[ P_\infty = \text{const.}, \qquad \rho_0 = \rho_0(\xi), \tag{1.11} \]
and with them the sound speed $c(\xi) = \sqrt{\gamma P_\infty/\rho_0(\xi)}$, are to be specified together with the initial data for $(P^{(1)}, v)$ to define a closed problem. Thus, by specializing the general multiple scales ansatz in (1.4) to two different single-scale versions, we have recovered from (1.6)–(1.8) two well-known sets of equations from theoretical fluid mechanics.

When dependencies on both x and $\xi$ are relevant, interactions between the small scale and long wave components of the flow will occur. One way to reveal these explicitly proceeds by rewriting (1.6)–(1.8) as a system for the long wave components $\overline{\rho}, \overline{m}, P^{(1)}, P_0$ of the flow variables and their small-scale fluctuations $\tilde\rho, \tilde m, \tilde p^{(2)}$, where
\[ m = \rho v. \tag{1.12} \]
First, the sublinear growth conditions on m and v for $|x| \to \infty$ applied to (1.7)$_{1,3}$ allow us to conclude again that
\[ P_0 \equiv P_\infty = \text{const.}, \qquad\text{and}\qquad \rho^{(0)} = \rho_0(\xi). \tag{1.13} \]
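These results thus do not depend on the assumption of a single-scale dependence on $\xi$ only, which was made to obtain (1.11). $P_\infty$, $\rho_0$ are thus time independent (on the time scales considered) and are to be extracted from the initial data. The remaining unknowns $\tilde\rho, \tilde m, \tilde p^{(2)}$ and $\overline{m}, P^{(1)}$ obey the

Low Mach number multiscale model:

Small scale quasi-incompressible flow
\[ \begin{aligned}
\tilde\rho_t + \nabla_x\cdot\tilde m &= 0, \\
\tilde m_t + \nabla_x\cdot(v\circ\tilde m) + \nabla_x \tilde p^{(2)} &= 0, \\
\nabla_x\cdot v &= 0.
\end{aligned} \tag{1.14} \]
Long-wave acoustics
\[ \begin{aligned}
\overline{m}_t + \nabla_\xi P^{(1)} &= 0, \\
P^{(1)}_t + \nabla_\xi\cdot\bigl(c^2\,\overline{m}\bigr) &= -\nabla_\xi\cdot\bigl(\gamma P_\infty\,\overline{\tilde\alpha\,\tilde m}\bigr),
\end{aligned} \tag{1.15} \]
where
\[ \alpha = 1/(\overline{\rho} + \tilde\rho), \tag{1.16} \]
\[ v = \alpha\,(\overline{m} + \tilde m), \tag{1.17} \]
\[ c = \sqrt{\gamma P_\infty\,\overline{\alpha}}. \tag{1.18} \]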
The two systems for the small scale flow and the long-wave acoustic component couple in a subtle way. The small scale flow is influenced by the long wave components $(\overline{m}, \overline{\rho})$ through (1.16), (1.17). The two-fold influence of the small scales on the long wave modes is obvious from (1.15). On the left we have the operator of linear acoustics with a space-time-dependent sound speed. The latter depends on the average of the inverse density, $\overline{\alpha} = \overline{1/(\overline{\rho} + \tilde\rho)}$, and thus depends non-trivially on the density fluctuation $\tilde\rho$. In addition, on the right of the long wave pressure equation (1.15)$_2$ we find a modification of the effective energy fluxes proportional to correlations of the small scale fluctuations, $\overline{\tilde\alpha\,\tilde m}$.

1.2. Organization of the paper. In analogy with the derivations of the present section we will now move to atmospheric applications. After some preliminary discussions related to the appearance of multiple small parameters in the dimensionless flow equations in Section 2, we introduce a general multiple time / multiple space scale ansatz in Section 3. Various specializations of this ansatz will be described which have allowed us to reproduce a host of well-known simplified model equations of theoretical meteorology. These results will be reviewed briefly in Section 4. The particular example of atmospheric boundary layer flows will be discussed in more detail in Section 5. Section 6 summarizes our main points and provides an outlook on ongoing and future work.

2. Dimensionless parameters and limit regimes

Atmospheric flows generally feature a very wide range of length and time scales. Simultaneously one finds turbulent fluctuations on the smallest scales, characterized by a continuous spatio-temporal spectrum, and larger scale organized flow features, characterized by a relatively clean scale separation.
These scale separations justify the quest for simplified model equations based on what is called scaling analysis in theoretical meteorology and is labelled multiple scales asymptotics in theoretical fluid mechanics and applied mathematics. As we will see below, such scale separations appear naturally on length and time scales comparable to or larger than about 10 km/20 min (the pressure scale height and the associated characteristic advection time scale). In the present paper we focus on simplified descriptions of scale separated phenomena, importing turbulence closures for smaller scales as “black boxes” where needed. 2.1. Dimensionless parameters for large scale atmospheric flows. There are a few physical parameters, shown in Tables 1,2, that appear to be universal to flows of the atmosphere. While the “properties of the rotating earth” deserve no further explanation, the aerothermodynamic reference values given in Table 2 should be explained. Amongst all forces acting on the atmosphere, gravity is the strongest. As a consequence, the thermodynamic pressure is almost everywhere determined by hydrostatic balance. It follows that the pressure at some reference height, such as the mean sea level, will balance the weight of the column of air above,
i.e., the sea level pressure is, to leading order, determined by the total mass of air in the atmosphere. Since gravity essentially inhibits the mass of air from leaving the planet, this reference pressure of about 1 bar may be considered a generally given constant. The order of magnitude of the mean temperature of the atmosphere at sea level is set by the global radiative equilibrium. Even if we neglect all "greenhouse effects" due to water vapor, CO2, and other radiatively active species, this balance would provide for a mean temperature of about T ≈ 250 K. All the greenhouse effects together do substantially raise the mean temperature above this level, but not by an order of magnitude. Through the equation of state of an ideal gas, $\rho = p/RT$, this estimate justifies the reference density given in the table.

Table 1. Properties of the rotating earth

  Earth's radius             $a$        ∼ $6\cdot 10^{6}$   m
  rotation frequency         $\Omega$   ∼ $10^{-4}$         s$^{-1}$
  acceleration of gravity    $g$        ∼ $10$              m/s$^2$

Table 2. Aerothermodynamic conditions

  thermodynamic pressure     $p_{ref}$     ∼ $10^{5}$   kg/(m s$^2$)
  air flow velocity          $u_{ref}$     ∼ $10$       m/s
  air density                $\rho_{ref}$  ∼ $1$        kg/m$^3$
The choice of the reference flow velocity of about 10 m/s is somewhat more subtle. On the one hand, this is a characteristic value for meteorologically relevant flow velocities found almost universally in textbooks, and this may suffice to justify the present choice. On the other hand, there are theoretical arguments which relate this value to the mean vertical wind shear across the troposphere induced by the "thermal wind", which is a result of the dominant momentum balances on the large scales: hydrostatics in the vertical direction, and geostrophic balance (pressure gradients balance the Coriolis force) in the horizontal direction. Figure 2 shows the long-term distribution of the zonally averaged zonal wind speed, and it corroborates the scaling in the table for most of the atmosphere.

The reference values in Tables 1, 2 can be combined into three independent non-dimensional characteristic numbers (six quantities involving three fundamental physical dimensions). The following combinations have proven to be useful:
\[ \pi_1 = \frac{a\,\Omega}{c_{ref}} \approx 2.0, \qquad \pi_2 = \frac{u_{ref}}{c_{ref}} \approx 0.03, \qquad \pi_3 = \frac{a\,\Omega^2}{g} \approx 0.006. \tag{2.1} \]
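The three characteristic numbers can be checked directly from the tabulated orders of magnitude. The following is a minimal sketch, not part of the original derivation; the function name is hypothetical, and the reference speed is taken as the square root of the reference pressure over the reference density, as stated in the text following Figure 2.

```python
import math

# Order-of-magnitude reference values from Tables 1 and 2 (SI units).
A_EARTH = 6.0e6   # Earth's radius [m]
OMEGA = 1.0e-4    # rotation frequency [1/s]
G = 10.0          # acceleration of gravity [m/s^2]
P_REF = 1.0e5     # thermodynamic pressure [kg/(m s^2)]
U_REF = 10.0      # air flow velocity [m/s]
RHO_REF = 1.0     # air density [kg/m^3]


def characteristic_numbers():
    """Return (c_ref, pi1, pi2, pi3) built from the table values.

    Hypothetical helper for illustration only.
    """
    c_ref = math.sqrt(P_REF / RHO_REF)   # reference wave speed [m/s]
    pi1 = A_EARTH * OMEGA / c_ref        # rotation speed vs. wave speed
    pi2 = U_REF / c_ref                  # flow speed vs. wave speed
    pi3 = A_EARTH * OMEGA**2 / G         # centripetal vs. gravitational acceleration
    return c_ref, pi1, pi2, pi3


if __name__ == "__main__":
    c_ref, pi1, pi2, pi3 = characteristic_numbers()
    print(f"c_ref = {c_ref:.1f} m/s")
    print(f"pi1 = {pi1:.2f}, pi2 = {pi2:.3f}, pi3 = {pi3:.3f}")
```

With these inputs the first number comes out slightly below 2, which agrees with the quoted value at the level of accuracy of the table entries.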
Figure 2. Magnitude of the zonal wind, in m/s, averaged zonally and over the years of 1968 to 1996. The vertical axis indicates height parameterized in terms of pressure levels and measured in mbar, the horizontal axis is the latitude in degrees. (Image provided by the NOAA-CIRES Climate Diagnostics Center, Boulder, Colorado, from their Web site at http://www.cdc.noaa.gov/.)

Here $c_{ref} = \sqrt{p_{ref}/\rho_{ref}}$ characterizes the speed of sound as well as the speed of long wavelength (barotropic) gravity waves in the atmosphere. In the attempt to construct a systematic unified approach to theoretical meteorology we seek solutions to the compressible flow equations in a rotating system that are characterized by $\pi_1 = O(1)$, whereas $\pi_2, \pi_3 \ll 1$. Before we proceed, a note on distinguished limits is in order.

2.2. Distinguished limits. Suppose we consider a differential equation in some independent variable x involving two singular small parameters, $\epsilon, \delta$, and we are interested in describing the asymptotic behavior of solutions as $\epsilon, \delta \to 0$. It is often assumed that the most general approach to this problem, at least in the single-scale setting, would be a two-parameter expansion of the solution $U(x;\epsilon,\delta)$ according to
\[ U(x;\epsilon,\delta) = U^{(0)}(x) + \epsilon\, U^{(1,0)}(x) + \delta\, U^{(0,1)}(x) + o(\epsilon,\delta). \tag{2.2} \]
Such an expansion assumes the existence of the gradient (in the sense of Fréchet) of U with respect to $\epsilon, \delta$ at $\epsilon = \delta = 0$, and of higher derivatives if higher-order terms are included. Unfortunately, even for the simple case of the linear oscillator with small mass (say, $\epsilon$) and small damping (say, $\delta$), such a Fréchet derivative does not
exist, because the sequential limits do not commute: If the mass vanishes first, the oscillator enters the strongly damped regime and its motion is monotonic, being governed by a balance of the spring and damping forces. If the damping vanishes first, it enters the oscillatory regime and undergoes high-frequency oscillations. Any asymptotic limit equation for the oscillator thus will depend on the path to the origin in the $\epsilon$-$\delta$-plane, i.e., on the particular distinguished limit chosen. We conclude that distinguished limits, being a generalization of the directional or Gâteaux derivative, will exist under less stringent conditions than independent two-parameter expansions.

The way to proceed in the presence of the two small parameters $\pi_2, \pi_3$ from (2.1) is thus to pick some judiciously chosen distinguished limits and analyse the asymptotics of the compressible flow equations in the resulting regimes. One such limit, which is compatible with the above estimates for $\pi_1, \dots, \pi_3$, has proven to be particularly useful,
\[ \pi_1 \sim \varepsilon^0 = 1, \qquad \pi_2 \sim \varepsilon^2, \qquad \pi_3 \sim \varepsilon^3, \qquad\text{as}\qquad \varepsilon \to 0, \tag{2.3} \]
and this will be assumed throughout the rest of this paper.

3. Non-dimensional governing equations and general multiple scales ansatz

In nondimensionalizing the compressible flow equations with gravity and rotation below we use the reference values $(\rho_{ref}, u_{ref}, p_{ref})$ from Table 2 to measure the dependent variables, and
\[ \ell_{ref} = h_{sc} = \frac{p_{ref}}{g\,\rho_{ref}} \qquad\text{and}\qquad t_{ref} = \frac{\ell_{ref}}{u_{ref}} \tag{3.1} \]
to measure the space and time coordinates, respectively. Here $h_{sc} \sim 10$ km, called the pressure scale height, is the characteristic vertical distance over which the thermodynamic pressure drops appreciably in a nearly hydrostatic atmosphere. With this understanding we consider here the governing equations
\[ \begin{aligned}
\rho_t + \nabla\cdot(\rho v) &= 0, \\
(\rho v)_t + \nabla\cdot(\rho v\circ v) + \varepsilon\,\Omega\times\rho v + \frac{1}{\varepsilon^4}\nabla p &= -\frac{1}{\varepsilon^4}\,\rho g\,k, \\
(\rho e)_t + \nabla\cdot\bigl(v\,[\rho e + p]\bigr) &= S_{\rho e},
\end{aligned} \tag{3.2} \]
with the equation of state
\[ \rho e = \frac{p}{\gamma - 1} + \varepsilon^4\,\frac{\rho v^2}{2}, \tag{3.3} \]
and with $S_{\rho e}$ an effective energy source term that will not be specified further here. In practice it will include source terms from the radiative balance and from latent heat exchange due to condensation and evaporation. In (3.2) we have left out terms describing molecular transport or closures for turbulent
transport for brevity. We will return in Section 5 to how the need for such closures arises naturally in a multiple scales setting.

For the reader familiar with meteorological theory we note that the various appearances of the small parameter $\varepsilon$ in (3.2) imply certain distinguished limits for the Rossby, Mach, and barotropic Froude numbers,
\[ \mathrm{Ro} = \frac{u_{ref}}{\Omega\,h_{sc}} \sim \frac{1}{\varepsilon}, \qquad \mathrm{M} = \frac{u_{ref}}{c_{ref}} \sim \varepsilon^2, \qquad \mathrm{Fr}_{baro} = \frac{u_{ref}}{\sqrt{g\,h_{sc}}} \sim \varepsilon^2. \tag{3.4} \]
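As a rough plausibility check of these scalings against the distinguished limit (2.3), one can evaluate the three numbers from the reference values in Tables 1 and 2 and read off the value of the expansion parameter that each of them would imply. This is only an illustrative sketch: the function and key names are hypothetical, the Rossby number is taken in its usual form (velocity over rotation frequency times length), and order-of-magnitude inputs can only be expected to give order-of-magnitude agreement.

```python
import math


def scaling_check(a=6.0e6, omega=1.0e-4, g=10.0,
                  p_ref=1.0e5, u_ref=10.0, rho_ref=1.0):
    """Evaluate Ro, M, Fr and the epsilon implied by each scaling relation.

    Hypothetical helper; default arguments are the table values in SI units.
    """
    h_sc = p_ref / (g * rho_ref)          # pressure scale height [m]
    c_ref = math.sqrt(p_ref / rho_ref)    # reference wave speed [m/s]
    rossby = u_ref / (omega * h_sc)       # Rossby number based on h_sc
    mach = u_ref / c_ref                  # Mach number
    froude = u_ref / math.sqrt(g * h_sc)  # barotropic Froude number
    pi3 = a * omega**2 / g
    return {
        "h_sc": h_sc,
        "Ro": rossby,
        "M": mach,
        "Fr": froude,
        "eps_from_Ro": 1.0 / rossby,         # Ro ~ 1/eps
        "eps_from_M": math.sqrt(mach),       # M ~ eps^2
        "eps_from_pi3": pi3 ** (1.0 / 3.0),  # pi3 ~ eps^3
    }


if __name__ == "__main__":
    for name, value in scaling_check().items():
        print(f"{name:>12s} = {value:.4g}")
```

The three implied values of the expansion parameter all fall between about 0.1 and 0.2, so a single small parameter of that size is compatible with all three relations at once. Note also that the Mach and Froude numbers coincide here because the scale height is defined through the reference pressure and density.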
The Rossby number being large may appear odd at first glance. Yet it should be noticed that Ro as defined above is the Rossby number with respect to the pressure scale height, $h_{sc} \sim 10$ km, not as usual with respect to the horizontal synoptic scales, $L_{syn} \sim 1000$ km. Here we will capture flows on such large scales not by an a priori scaling of the governing equations, but by systematic use of multiple scales techniques. This will be demonstrated explicitly in the context of Ekman boundary layer theory in Section 5.

4. General multiscale expansion and classical results

4.1. Multiscale expansion scheme. The equations in (3.2), (3.3) include a single small parameter $\varepsilon$, which motivates the following asymptotic expansions. The goal is to capture the wide variety of length and time scales and associated simplified model equations of theoretical meteorology through a general multiple scales ansatz,
\[ U(x,t;\varepsilon) = \sum_i \phi^{(i)}_\varepsilon\, U^{(i)}\bigl(\dots, \psi^{(-1)}_\varepsilon x,\; x,\; \psi^{(1)}_\varepsilon x, \dots,\; \psi^{(-1)}_\varepsilon t,\; t,\; \psi^{(1)}_\varepsilon t, \dots\bigr). \tag{4.1} \]
Here the scaling coefficients satisfy
\[ \phi^{(0)}_\varepsilon = \psi^{(0)}_\varepsilon \equiv 1, \tag{4.2} \]
and
\[ \phi^{(j+1)}_\varepsilon = o\bigl(\phi^{(j)}_\varepsilon\bigr), \qquad \psi^{(j+1)}_\varepsilon = o\bigl(\psi^{(j)}_\varepsilon\bigr), \qquad\text{as}\qquad \varepsilon \to 0. \tag{4.3} \]
The general asymptotic solution ansatz in (4.1) allows us to explicitly describe phenomena taking place on asymptotically separated scales, with scale ratios $\phi^{(j+1)}_\varepsilon/\phi^{(j)}_\varepsilon$ for amplitudes and $\psi^{(j+1)}_\varepsilon/\psi^{(j)}_\varepsilon$ for space and time.

The governing equations (3.2) involve only integer powers of $\varepsilon$, so that an analogous choice for the scaling coefficients, i.e., $\phi^{(j)}_\varepsilon = \psi^{(j)}_\varepsilon = \varepsilon^j$, suggests itself. This choice allows one to derive a wide range of well-known simplified model equations of theoretical meteorology using standard asymptotics procedures. A counter-example arises in tropical meteorology, however, where non-integer powers of $\varepsilon$ turn out to be the relevant scaling factors, see [16], and the next section.
Table 3. Simplified models of theoretical meteorology

  Coordinate scalings                                                                        Simplified model obtained
  $U^{(i)} = U^{(i)}(t, x, z)$                                                               Anelastic & pseudo-incompressible models, [15, 6, 1]
  $U^{(i)} = U^{(i)}(\varepsilon^2 t, \varepsilon^2 x, z)$                                   Mid-latitude Quasi-Geostrophic model, [21, 8]
  $U^{(i)} = U^{(i)}(\varepsilon^2 t, \varepsilon^2 x, z)$                                   Equatorial Weak Temperature Gradient models, [26]
  $U^{(i)} = U^{(i)}(\varepsilon^2 t, \varepsilon^{-1}\xi(\varepsilon^2 x), z)$              Semi-geostrophic model, [21, 8, 4]
  $U^{(i)} = U^{(i)}(\varepsilon^{5/2} t, \varepsilon^{5/2} x, z)$                           Gill's model for balanced equatorial flows, [20, 27, 7]
  $U^{(i)} = U^{(i)}(\varepsilon^{5/2} t, \varepsilon^{7/2} x, \varepsilon^{5/2} y, z)$      Quasi-linear equatorial long-wave equations, [17]
4.2. Classical results through specializations of the general scheme. Specializing the general multiple scales ansatz from (4.1) by neglecting all but one time, one horizontal, and one vertical space coordinate, it could be demonstrated in [16, 13, 14] that the following simplified models of theoretical meteorology could be derived directly from the three-dimensional compressible flow equations using standard techniques of asymptotic analysis (see Table 3). Notice that all of these results have been obtained using the same distinguished limit from (2.3). This indicates an aspect of mutual consistency of all these models that – at least to the authors – had not been obvious to begin with. We should mention, however, that there is one system of simplified equations, the hydrostatic primitive equations (HPEs), which is the basis of the majority of the computational weather prediction codes and climate models, and which cannot be obtained through the same distinguished limit. The only simplifying assumptions made in deriving the HPEs are that there is a dominant balance between pressure gradient and gravity in the vertical momentum equation, and that there is an (asymptotic) scale separation between the characteristic vertical and the characteristic horizontal scales. In particular, the Mach number and the barotropic Froude number based on the horizontal flow velocities are assumed to be of order O(1) in this theory. In contrast, here they are small of order O(ε2 ) as ε → 0, see (3.4). The asymptotic expansion schemes shown above in the table all lead to “single scale models”, because we allow for only one characteristic scale in each of the time and space directions. (The ansatz reproducing the semigeostrophic theory just allows for anisotropic horizontal scales. It is not an honest-to-goodness multiple scales expansion.)
Of course, the true potential of the general scheme in (4.1) may be exploited only when more than one coordinate is retained for at least one of the space-time directions. The first multiple scales application of the present approach in this sense has been presented in [16], where it was demonstrated how the approach allows one to construct systematic multiple scales models for a host of near-equatorial flow phenomena directly from the three-dimensional compressible flow equations. In the next section we give an example of a multiple scales theory by sketching the derivations of two models describing boundary layer flows. We will also demonstrate how the necessity of "closure" or "parameterization" comes up naturally when the small-scale part of a multi-scale model enters the large scale dynamics through some nonlinear averages, yet does not allow for explicit, analytical solutions.

5. Boundary layer flows

As concrete examples of model derivations via the systematic multiple scales approach presented above we consider atmospheric boundary layer flows. In particular, we summarize the derivations for two different flow regimes which result in very different effective models. The first ansatz reproduces the classical Ekman boundary layer theory (see, e.g., [21]). Our derivation addresses the interaction of three-dimensional flow near the ground, at characteristic spatio-temporal scales of the order of 200 m, 20 s, with middle latitude synoptic scale motions on scales of 500 km, 12 h and above. Considering that our units of measure for the space-time coordinates are $\ell_{ref} = h_{sc} \sim 10$ km and $u_{ref} \sim 10$ m/s, and that $\varepsilon \sim 1/7$, the appropriate multiple scales ansatz reads

Ekman Layer Scalings:
\[ U(x,t;\varepsilon) = \sum_i \varepsilon^i\, U^{(i)}(X, Z, T, \xi_s, \tau_s), \tag{5.1} \]
where
\[ X = \frac{x_h}{\varepsilon^2}, \qquad Z = \frac{z}{\varepsilon^2}, \qquad T = \frac{t}{\varepsilon^2} \tag{5.2} \]
denote the "microscale coordinates" resolving the 200 m, 20 s scales near the ground, and
\[ \xi_s = \varepsilon^2 x_h, \qquad \tau_s = \varepsilon^2 t \tag{5.3} \]
are the "synoptic scale coordinates" resolving the 500 km, 12 h scales. Notice that the latter, according to the last section, are the proper horizontal and time coordinates for the derivation of quasi-geostrophic theory. In both (5.2) and (5.3) we have anticipated that, for the scales of interest here, an approximate description of the flow in a local tangent plane to the
earth's surface is sufficient for our purposes. Denoting the vertical unit vector by k we have used the abbreviations
\[ x_h = (1 - k\circ k)\,x, \qquad z = k\cdot x \tag{5.4} \]
for the local horizontal and vertical components of the position coordinate x.

The second regime is relevant for the same small scales, but for 70 km, 2 h as the larger scales involved. This regime leads to a nonlinear model very similar to the classical boundary layer equations in theoretical (non-geophysical) fluid mechanics, [23], yet it includes Coriolis effects on the larger scale. The appropriate asymptotic expansion scheme reads

Nonlinear Boundary Layer Scalings:
\[ U(x,t;\varepsilon) = \sum_i \varepsilon^i\, U^{(i)}(X, Z, T, \xi, \tau), \tag{5.5} \]
where
\[ \xi = \varepsilon\, x_h, \qquad \tau = \varepsilon\, t. \tag{5.6} \]
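The dimensional scales quoted for the three sets of coordinates follow from the reference length and velocity together with the value of the expansion parameter quoted above. A short sketch of that arithmetic is given below; the function name and dictionary keys are hypothetical labels chosen for illustration.

```python
def resolved_scales(eps=1.0 / 7.0, l_ref=1.0e4, u_ref=10.0):
    """Return (length [m], time [s]) resolved by each coordinate set.

    Hypothetical helper: l_ref is the pressure scale height (about 10 km),
    u_ref the reference velocity (about 10 m/s), eps the expansion parameter.
    """
    t_ref = l_ref / u_ref  # reference advection time [s]
    return {
        # X = x_h / eps^2, T = t / eps^2 vary by O(1) over eps^2 reference units
        "microscale": (eps**2 * l_ref, eps**2 * t_ref),
        # xi = eps x_h, tau = eps t vary by O(1) over 1/eps reference units
        "mesoscale": (l_ref / eps, t_ref / eps),
        # xi_s = eps^2 x_h, tau_s = eps^2 t vary by O(1) over 1/eps^2 reference units
        "synoptic": (l_ref / eps**2, t_ref / eps**2),
    }


if __name__ == "__main__":
    for name, (length, time) in resolved_scales().items():
        print(f"{name:>10s}: {length / 1000.0:8.2f} km, {time / 3600.0:6.2f} h")
```

The results are roughly 200 m and 20 s for the microscale, 70 km and 2 h for the mesoscale, and about 490 km and 14 h for the synoptic scale, in line with the rounded figures used in the text.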
5.1. Multiscale derivation of Ekman boundary layer theory.

5.1.1. Leading order balances and the Boussinesq approximation. Consider the multiple scales ansatz in (5.1). Inserting it into the governing equations (3.2), (3.3) and collecting like powers of $\varepsilon$ we obtain a hierarchy of perturbation equations as usual. The few leading orders are dominated by the derivatives with respect to the fast variables, X, Z, T. As a result we find
\[ \nabla_X p^{(i)} \equiv 0, \qquad \frac{\partial p^{(i)}}{\partial Z} \equiv 0 \qquad\text{for}\qquad i \in \{0, 1, 2, 3\}. \tag{5.7} \]
We are considering here a thin boundary layer of characteristic height $O(\varepsilon^2 h_{sc})$. The solution in this layer must match with the bulk flow in the atmosphere above. Thus, as usual in matched asymptotics for boundary layer flows, we find that the pressures $p^{(0)}, \dots, p^{(3)}$ are all imposed from outside. The natural scalings for the outer flow, compatible with the present ansatz for the boundary layer, would involve an asymptotic expansion $U = \sum_i \varepsilon^i\, U^{(i)}(\xi_s, z, \tau_s)$. This reproduces the well-known quasi-geostrophic theory (QG) as mentioned in the last section. In derivations not shown here, [14], we find that horizontal variations in the slow scale variables $\xi_s, \tau_s$ occur first in $p^{(3)}$. As a consequence we may adopt
\[ p^{(0)} \equiv P_0 = 1 = \text{const.}, \qquad p^{(i)} \equiv 0 \quad (i \in \{1, 2\}) \tag{5.8} \]
and an externally given
\[ p^{(3)} \equiv P^{(3)}_{QG}(\xi_s, \tau_s) \tag{5.9} \]
from here on. It is assumed that constants for $p^{(1)}, p^{(2)}$ have been absorbed in $P_0$.
Order of magnitude estimates in [13, 16] for entropy fluctuations in the troposphere show that these are of order $O(\varepsilon^2)$ under typical conditions. Assuming this scaling here as well, we conclude that the "potential temperature", $\theta$, obeys
\[ \theta = \frac{p^{1/\gamma}}{\rho} = 1 + \varepsilon^2\,\theta^{(2)} + \dots, \tag{5.10} \]
with $\theta^{(2)}(X, Z, T, \xi_s, \tau_s)$ depending on the small as well as the large scale variables. For the density perturbations we conclude from (5.8)–(5.10) that
\[ \rho^{(0)} \equiv \rho_0 = 1 = \text{const.}, \qquad \rho^{(1)} \equiv 0, \qquad \rho^{(2)} = -\theta^{(2)}. \tag{5.11} \]
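The step from the pressure and potential temperature expansions to the density expansion can be spelled out as follows. This is a sketch of the intermediate algebra only, under the assumption that the first non-constant pressure perturbation enters at third order, as stated in (5.8), (5.9).

```latex
% Solve the definition of potential temperature for the density:
\rho = \frac{p^{1/\gamma}}{\theta}.
% Pressure is constant up to and including second order:
p = 1 + \varepsilon^3 p^{(3)} + O(\varepsilon^4)
\quad\Longrightarrow\quad
p^{1/\gamma} = 1 + O(\varepsilon^3).
% Expand the inverse of the potential temperature:
\frac{1}{\theta} = \frac{1}{1 + \varepsilon^2 \theta^{(2)} + O(\varepsilon^3)}
                 = 1 - \varepsilon^2 \theta^{(2)} + O(\varepsilon^3).
% Multiply and compare powers of epsilon:
\rho = 1 - \varepsilon^2 \theta^{(2)} + O(\varepsilon^3)
\quad\Longrightarrow\quad
\rho^{(0)} = 1, \qquad \rho^{(1)} = 0, \qquad \rho^{(2)} = -\theta^{(2)}.
```

In other words, at second order the density perturbation is purely thermal, which is what allows buoyancy to appear in the small scale equations without any compressibility effect.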
The resulting leading order small scale equations are (dropping the order indicator on $u^{(0)}, w^{(0)}$ for convenience)
\[ \begin{aligned}
(\rho_0 u)_T + \nabla_X\cdot(\rho_0\, u\circ u) + (\rho_0\, w u)_Z + \nabla_X p^{(4)} &= 0, \\
(\rho_0 w)_T + \nabla_X\cdot(\rho_0\, u\, w) + (\rho_0 w^2)_Z + p^{(4)}_Z &= \rho_0\,\theta^{(2)}, \\
(\rho_0\theta^{(2)})_T + \nabla_X\cdot(\rho_0\, u\,\theta^{(2)}) + (\rho_0\, w\,\theta^{(2)})_Z &= (\gamma - 1)\, S^{(0)}_{\rho e}, \\
\nabla_X\cdot u + w_Z &= 0.
\end{aligned} \tag{5.12} \]
These are the Boussinesq equations for small scale incompressible flow. Importantly, they include thermal effects through the buoyancy term $\rho_0\theta^{(2)}$ in the vertical momentum balance, and through the transport equation for $\theta^{(2)}$ in the next line.

The main goal in this section is the derivation of the effective large scale boundary layer equations. The appropriate tools are sublinear growth conditions. The equations in (5.12) are in divergence form, so that averaging in X, T is straightforward. The resulting spatio-temporal sublinear growth conditions read
\[ \begin{aligned}
\bigl(\rho_0\,\overline{w^{(0)} u^{(0)}}\bigr)_Z &= 0, \\
\bigl(\rho_0\,\overline{w^{(0)\,2}} + \overline{p^{(4)}}\bigr)_Z &= \rho_0\,\overline{\theta^{(2)}}, \\
\bigl(\rho_0\,\overline{w^{(0)}\theta^{(2)}}\bigr)_Z &= (\gamma - 1)\,\overline{S^{(0)}_{\rho e}}, \\
\overline{w^{(0)}}_Z &= 0.
\end{aligned} \tag{5.13} \]
Here the overbar denotes the double average in the X, T coordinates. Taking into account the bottom boundary condition which states that the large scale average vertical velocity at Z = 0 vanishes (there is no vertical velocity pattern with velocity magnitude of order O(10 m/s) that would be coherent over distances of O(500 km)), (5.13)$_4$ yields
\[ \overline{w^{(0)}} \equiv 0, \tag{5.14} \]
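and in the sequel we define
\[ w' \equiv w^{(0)} - \overline{w^{(0)}} = w^{(0)}. \tag{5.15} \]
The averaging procedure $\overline{\,\cdot\,}$ satisfies the Reynolds averaging conditions, i.e., for some quantities $a = \overline{a} + a'$ and $b = \overline{b} + b'$ with $\overline{a'} = \overline{b'} = 0$ we have
\[ \overline{\overline{a}\,b'} = 0, \qquad \overline{\overline{a}\,\overline{b}} = \overline{a}\,\overline{b}, \qquad \overline{a\,b} = \overline{a}\,\overline{b} + \overline{a'\,b'}. \tag{5.16} \]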
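These averaging rules are easy to verify for a discrete average. The sketch below uses a plain arithmetic mean over sampled signals as a stand-in for the double average in the fast coordinates; the sample signals and all function names are hypothetical and serve only to illustrate the identities.

```python
import math


def mean(values):
    """Arithmetic mean, standing in for the averaging operator."""
    return sum(values) / len(values)


def reynolds_split(values):
    """Split a sampled signal into its mean and zero-mean fluctuation."""
    m = mean(values)
    return m, [v - m for v in values]


def check_reynolds(n=1000):
    """Return (mean(ab), a_bar*b_bar + mean(a'b'), mean(a_bar*b'), mean(a'))."""
    xs = [2.0 * math.pi * k / n for k in range(n)]
    # Hypothetical periodic test signals with correlated fluctuations.
    a = [1.5 + math.sin(x) for x in xs]
    b = [-0.5 + 0.3 * math.sin(x) + 0.2 * math.cos(3.0 * x) for x in xs]
    a_bar, a_fl = reynolds_split(a)
    b_bar, b_fl = reynolds_split(b)
    lhs = mean([p * q for p, q in zip(a, b)])
    rhs = a_bar * b_bar + mean([p * q for p, q in zip(a_fl, b_fl)])
    mixed = mean([a_bar * q for q in b_fl])
    return lhs, rhs, mixed, mean(a_fl)


if __name__ == "__main__":
    lhs, rhs, mixed, a_fl_mean = check_reynolds()
    print(f"mean(ab)                 = {lhs:.6f}")
    print(f"a_bar*b_bar + mean(a'b') = {rhs:.6f}")
    print(f"mean(a_bar*b')           = {mixed:.2e}")
    print(f"mean(a')                 = {a_fl_mean:.2e}")
```

The point of the example is the last identity: the average of a product differs from the product of the averages by the correlation of the fluctuations, which is exactly the kind of term that later requires a closure.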
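Using (5.14) we conclude that
\[ \overline{w^{(0)} u^{(0)}} = \overline{w' u'}, \qquad \overline{w^{(0)}\theta^{(2)}} = \overline{w'\theta'}, \qquad \overline{w^{(0)} w^{(0)}} = \overline{w'^{\,2}}. \tag{5.17} \]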
The same argument which we used to derive (5.14) may now be applied to (5.13)$_1$ to obtain
\[ \overline{w' u'} \equiv 0. \tag{5.18} \]
While, therefore, there is no large scale mean vertical transport of horizontal momentum at leading order, this is not true for the transport of vertical momentum and of heat (represented by the perturbation potential temperature, $\theta^{(2)}$). In fact, (5.13)$_2$ describes a modification of hydrostatics by the effective vertical momentum transport $\rho_0\,\overline{w'^{\,2}}$, and (5.13)$_3$ shows how fluctuations of vertical velocity and potential temperature must correlate to produce an effective vertical heat flux that matches the vertically integrated heat source,
\[ \rho_0\,\overline{w'\theta'} = \int_0^Z (\gamma - 1)\,\overline{S^{(0)}_{\rho e}}\; dZ. \tag{5.19} \]
For later reference we note that, in analogy with (5.13)$_3$, the first- and second-order potential temperature equations also represent a balance of fluxes and source terms, so that
\[ \bigl(\overline{\rho w \theta}^{(i)}\bigr)_Z = (\gamma - 1)\,\overline{S^{(i-2)}_{\rho e}} \qquad\text{for}\qquad i \in \{1, 2\}. \tag{5.20} \]
5.1.2. Three-term balance in the Ekman layer. We are interested here in exhibiting the relation of the present multiple scales derivations with Ekman boundary layer theory. This theory addresses the balance of horizontal momentum on large space and time scales. In the present setting we consider, in analogy with (5.13)$_1$, the sublinear growth conditions from the horizontal momentum equations at first, second, and third order. These are (we add the earlier result for completeness)
\[ \begin{aligned}
\partial_Z\,\overline{(\rho w u)}^{(0)} &= 0, \\
\partial_Z\,\overline{(\rho w u)}^{(1)} &= 0, \\
\partial_Z\,\overline{(\rho w u)}^{(2)} &= 0, \\
f\,k\times\rho_0\,\overline{u^{(0)}} + \nabla_\xi P^{(3)}_{QG} + \partial_Z\,\overline{(\rho w u)}^{(3)} &= 0.
\end{aligned} \tag{5.21} \]
Here $f = k\cdot\Omega$ is the vertical component of the earth rotation vector in the considered tangent plane. Thus we recover the Ekman boundary layer theory's three-term balance equation for horizontal momentum in (5.21)$_4$. A few remarks are in order here:

(i) Flow evolution on the length and time scales assumed in Ekman theory requires very weak correlations of the fluctuations of density and the vertical and horizontal velocities. Because of the no-slip bottom boundary condition, equations (5.21)$_1$–(5.21)$_3$ imply
\[ \overline{(\rho w u)}^{(i)} \equiv 0 \qquad\text{for}\qquad i \in \{0, 1, 2\}. \]
(ii) The three-term balance in (5.21)$_4$ is obtained here allowing general entropy (potential temperature) perturbations of order $O(\varepsilon^2)$. As seen in the Boussinesq equations governing the small scale, (5.12), as well as in the equation describing the large scale vertical thermal flux, (5.19), such perturbations are sufficient to induce leading order small scale vertical motions. Thus the present derivations are valid for thermally driven as well as for thermally neutral flows.

(iii) Depending on whether such thermal fluctuations are present or not, the structure of the small scale flow will differ considerably, and so will the relation between the effective Ekman layer momentum flux $\overline{(\rho w u)}^{(3)}$ and the large scale mean flow.

(iv) Classical Ekman theory, [21], assumes thermally neutral flow and uses a simple gradient flux approximation to represent the vertical momentum flux. That is, in (5.21)$_4$ we would have
\[ \overline{(\rho w u)}^{(3)} = -\rho_0\, K_m\, \frac{\partial\,\overline{u^{(0)}}}{\partial Z}, \]
and, using the geostrophic balance $f\,k\times\rho_0\,u_{QG} + \nabla_\xi P^{(3)}_{QG} = 0$ of the outer flow, the Ekman layer equation becomes (dropping the superscript $(0)$ and the overbar for the moment)
\[ \frac{\partial^2}{\partial Z^2}\,(u - u_{QG}) - \frac{f}{K_m}\, k\times(u - u_{QG}) = 0. \]
This is the classical Ekman layer equation for the deviation $(u - u_{QG})$ of the horizontal mean flow velocity from the quasi-geostrophic large scale flow that prevails above the boundary layer. Solutions must satisfy the boundary and asymptotic matching conditions
\[ (u - u_{QG}) = -u_{QG} \quad\text{at}\quad Z = 0, \qquad\text{and}\qquad (u - u_{QG}) \to 0 \quad\text{as}\quad Z \to \infty. \]
There are exact solutions to this problem which exhibit a spiral vertical distribution of the horizontal large scale flow velocity throughout the layer: As we leave the layer, vertical derivatives vanish asymptotically and we approach geostrophic balance of Coriolis force and pressure gradient. As a consequence,
flow velocity and pressure gradient are orthogonal to each other: the flow velocity is tangent to the level sets of the pressure field. In contrast, near the ground the flow velocity and the Coriolis force vanish, and we have a balance of pressure gradient and the effective vertical momentum transport term. The vertical shear, and with it the velocity close to the ground, thus have to be aligned with the pressure gradient.

(v) The present derivations are somewhat incomplete, at least for the case of thermally driven layers. In the presence of second-order density perturbations another, somewhat thicker boundary layer will establish itself, within which the mean vertical fluxes are dominated entirely by buoyant updrafts. A detailed analysis of this regime is beyond the scope of this paper.

5.2. Boundary layer with advective nonlinearity. Here we consider the modified boundary layer scalings from (5.5). All the leading perturbation equations up to the first appearance of derivatives with respect to the large scale variables $\xi, \tau$ are unchanged in this regime, and so are the associated sublinear growth conditions. Thus, all the results from (5.7) to (5.21)$_3$ remain valid. We are interested here in the modification of the large scale evolution equations in the presence of mesoscale variations. The sublinear growth conditions from the third-order horizontal momentum and potential temperature equations read
\[ \overline{\theta^{(2)}}_\tau + \nabla_\xi\cdot\overline{u^{(0)}\theta^{(2)}} + \frac{1}{\rho_0}\bigl(\overline{\rho w \theta}^{(5)}\bigr)_Z = (\gamma - 1)\,\overline{S^{(3)}_{\rho e}}, \tag{5.22} \]
\[ \overline{u^{(0)}}_\tau + \nabla_\xi\cdot\overline{u^{(0)}\circ u^{(0)}} + f\,k\times\bigl(\overline{u^{(0)}} - u_{QG}\bigr) + \frac{1}{\rho_0}\Bigl(\nabla_\xi\,\overline{p^{(4)}} + \bigl(\overline{(\rho w u)}^{(3)}\bigr)_Z\Bigr) = 0. \tag{5.23} \]
In addition to their counterparts in the Ekman regime, (5.21)$_4$ and (5.13)$_3$, these equations involve the local time derivative, and horizontal advection in the form of a nonlinearly averaged effective transport term. The latter cannot be expressed in terms of the large scale variables only without further approximation.
However, under the reasonable assumption that the leading, first-, and second-order correlations involving the horizontal velocity components vanish, in analogy with the decorrelation observed for the horizontal and vertical velocities in (5.21)$_1$–(5.21)$_3$, we may simplify these terms. In this case, e.g.,
\[ \overline{u^{(0)}\circ u^{(0)}} = \overline{u^{(0)}}\circ\overline{u^{(0)}}, \qquad \overline{u^{(0)}\theta^{(2)}} = \overline{u^{(0)}}\;\overline{\theta^{(2)}}. \tag{5.24} \]
There is one more difference between the present mesoscale regime and the Ekman regime considered in the last subsection. As the horizontal scale considered here is smaller by one order of magnitude than in the former regime, the horizontal large scale flow divergence is larger by one order of magnitude here. In fact, a detailed analysis of the first-order mass continuity equations shows that
\[ \nabla_\xi\cdot\overline{u^{(0)}} + \partial_Z\,\overline{w^{(3)}} = 0, \tag{5.25} \]
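whereas the analogue of this equation in the Ekman regime would have involved $\overline{w^{(4)}}$. Therefore, there will generally be a coherent large scale vertical velocity at third order, i.e., $\overline{w^{(3)}} \neq 0$. Taking into account that the vertical flux terms, $\overline{(\rho w u)}^{(3)}$, $\overline{\rho w \theta}^{(5)}$, will thus have a coherent contribution from large scale vertical advection by $\overline{w^{(3)}}$, we may rewrite them as
\[ \begin{aligned}
\overline{(\rho w u)}^{(3)} &= \overline{w^{(3)}}\;\overline{u^{(0)}} + \overline{\rho w' u'}^{(3)}, \\
\overline{\rho w \theta}^{(5)} &= \overline{w^{(3)}}\;\overline{\theta^{(2)}} + \overline{\rho w' \theta'}^{(5)}.
\end{aligned} \tag{5.26} \]
This decomposes the flux terms into large scale coherent advection and net fluxes resulting from nonlinear averages over small scale fluctuations. We summarize the boundary layer equations for this regime in a streamlined notation using the following replacements,
\[ \overline{u^{(0)}} \to u, \qquad \overline{w^{(3)}} \to w, \qquad \overline{p^{(4)}} \to p, \qquad \overline{\theta^{(2)}} \to \theta. \tag{5.27} \]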
Nonlinear Boundary Layer Equations:
\[ \begin{aligned}
\frac{Du}{D\tau} + f\,k\times(u - u_{QG}) + \frac{1}{\rho_0}\nabla_\xi p &= -\frac{1}{\rho_0}\bigl(\overline{\rho w' u'}^{(3)}\bigr)_Z, \\
\frac{D\theta}{D\tau} &= -\frac{1}{\rho_0}\bigl(\overline{\rho w' \theta'}^{(5)}\bigr)_Z + (\gamma - 1)\,\overline{S^{(3)}_{\rho e}}, \\
\frac{1}{\rho_0}\,p_Z - \theta &= -\bigl(\overline{w^{(0)\,2}}\bigr)_Z, \\
\nabla_\xi\cdot u + w_Z &= 0.
\end{aligned} \tag{5.28} \]
Here $u_{QG}$ is the flow velocity outside the boundary layer, and
\[ \frac{D}{D\tau} = \partial_\tau + u\cdot\nabla_\xi + w\,\partial_Z. \tag{5.29} \]
These equations resemble the classical Prandtl boundary layer equations in that, in addition to the vertical fluxes from nonlinear small-scale averages, they explicitly include the unsteady term and advection both in the horizontal and vertical directions. In addition, however, they include Coriolis effects and internal gravity waves, the latter arising due to the interaction of potential temperature transport in (5.28)$_2$ with the horizontal momentum balance in (5.28)$_1$ via the hydrostatic balance in (5.28)$_3$.

6. Closing Remarks

This paper has summarized a unified approach to meteorological modelling. It uses judiciously chosen distinguished limits among the multiple singular parameters of the system, and systematic multiple scales asymptotics based on
the remaining $\varepsilon \ll 1$. We have outlined how a wide range of well-established simplified "single scale" models of theoretical meteorology can be recovered naturally through this approach. Here "single scale" means that only a single characteristic scale is assumed for any of the coordinate directions and for time, respectively. Two aspects of these re-derivations of established theories seem worth noting:

• The small expansion parameter, $\varepsilon$, is a representative of a particular distinguished limit amongst the various singular small parameters of the system, i.e., of the Rossby, Froude, Mach, and other characteristic numbers. One and the same distinguished limit turns out to be adequate for the derivation of most of the simplified models considered.

• All derivations use the full three-dimensional compressible flow equations as the starting point.

The approach naturally lends itself as a tool to study multiple scales interactions. A first successful application was the derivation of "Systematic multiscale models for the tropics" in [16], which led to very promising further developments, [2]. Here we have discussed a recent analysis of atmospheric boundary layer flows as an example of a typical multi-scale application.

First we have revisited the flow regime of the classical Ekman boundary layer theory involving ∼ 500 km, 12 h length and time scales, respectively. We were able to reproduce Ekman's theory, which describes the quasi-stationary balance of the Coriolis, pressure gradient and (turbulent) friction terms. We obtained these classical results by replacing the nonlinear averages of vertical advective fluxes, which naturally arise in sublinear growth conditions from the multiple scales technique, with certain simplified closures normally imported from turbulence theory. However, our derivations also demonstrate how such closures may be improved once additional information on the small scale flow becomes available.
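For the simplest such closure, a constant exchange coefficient as in remark (iv) of Section 5.1.2, the resulting Ekman spiral can be written down and checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' computation: the horizontal velocity is represented as a complex number (first component plus i times second component), the layer equation is taken in the standard form "exchange coefficient times second vertical derivative equals i f times the deviation from the outer flow" with positive f, and the dimensional values of f, the exchange coefficient, and the outer flow are hypothetical.

```python
import cmath
import math


def ekman_velocity(z, u_qg=10.0 + 0.0j, f=1.0e-4, k_m=5.0):
    """Complex horizontal velocity of the classical Ekman spiral at height z.

    Assumed form: W(z) = U * (1 - exp(-(1 + i) z / d)), d = sqrt(2 K_m / f),
    which vanishes at the ground and tends to the outer flow U aloft.
    """
    depth = math.sqrt(2.0 * k_m / f)  # Ekman layer depth scale [m]
    return u_qg * (1.0 - cmath.exp(-(1.0 + 1.0j) * z / depth))


def ekman_residual(z, h=1.0, u_qg=10.0 + 0.0j, f=1.0e-4, k_m=5.0):
    """Residual of K_m W'' - i f (W - U), using a centred second difference."""
    def w(s):
        return ekman_velocity(s, u_qg, f, k_m)

    second_derivative = (w(z + h) - 2.0 * w(z) + w(z - h)) / h**2
    return k_m * second_derivative - 1j * f * (w(z) - u_qg)


if __name__ == "__main__":
    for height in (0.0, 100.0, 300.0, 1000.0, 3000.0):
        vel = ekman_velocity(height)
        angle = math.degrees(cmath.phase(vel)) if abs(vel) > 0.0 else float("nan")
        print(f"z = {height:7.1f} m: |u| = {abs(vel):6.3f} m/s, "
              f"direction = {angle:6.1f} deg, "
              f"residual = {abs(ekman_residual(height)):.1e}")
```

Near the ground the velocity is small and turned relative to the outer flow, and it spirals towards the outer flow with increasing height. A different closure for the momentum flux would replace the constant coefficient and, in general, the closed-form profile used here.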
We have then considered flows on the “meso scales” covering ∼ 70 km, 2 h. Here we found a very different set of boundary layer equations describing inherently unsteady effects, advection both in the horizontal and in the vertical direction, and nonlinear inertio-gravity waves. The latter are wave motions being driven simultaneously by Coriolis effects and by the mechanisms for internal gravity waves.

Acknowledgments. The authors thank Ann S. Almgren, Julian C.R. Hunt, Andrew J. Majda, and V. Petukhov for fruitful discussions and encouragement, and the Deutsche Forschungsgemeinschaft for their continuing support under grants KL 611/6, KL 611/14.
References
[1] Bannon P.R., 1995: Potential vorticity conservation, hydrostatic adjustment, and the anelastic approximation, J. Atmos. Sci., 52, 2302–2312
[2] Biello J.A., Majda A.J., 2004: A new multi-scale model for the Madden-Julian oscillation, submitted to J. Atmos. Sci.
[3] Cheverry C., 1996: The modulation equations of nonlinear geometric optics, Comm. Part. Diff. Eqs., 21, 1119–1140
[4] Cullen M.J.P., 2000: On the accuracy of the semi-geostrophic equations, J. Roy. Met. Soc., 126, 1099–1116
[5] DiPerna R., Majda A.J., 1985: The validity of nonlinear geometric optics for weak solutions of conservation laws, Comm. Math. Phys., 98, 313–347
[6] Durran D.R., 1989: Improving the anelastic approximation, J. Atmos. Sci., 46, 1453–1461
[7] Gill A.E., 1980: Some simple solutions for heat-induced tropical circulation, Quart. J. Roy. Meteor. Soc., 106, 447–462
[8] Gill A.E., 1982: Atmosphere-Ocean Dynamics, Academic Press
[9] Hunter J., Keller J., 1983: Weakly nonlinear high frequency waves, Comm. Pure & Appl. Math., 36, 547–569
[10] Joly J.L., Métivier G., Rauch J., 1993: Resonant one-dimensional nonlinear geometric optics, J. Funct. Analysis, 114, 106–231
[11] Klein R., Peters N., 1988: Cumulative effects of weak pressure waves during the induction period of a thermal explosion in a closed cylinder, J. Fluid Mech., 187, 197–230
[12] Klein R., 1995: Semi-implicit extension of a Godunov-type scheme based on low Mach number asymptotics I: One-dimensional flow, J. Comp. Phys., 121, 213–237
[13] Klein R., 2003: An applied mathematical view of theoretical meteorology, invited presentation, in: Proc. ICIAM 2003, Sydney, Australia
[14] Klein R., Mikusky E., Vater S., 2004: Mathematische Modellierung in der Klimaforschung (in German), Lecture Notes, FB Mathematik & Informatik, Freie Universität Berlin
[15] Lipps F., Hemler R.S., 1982: A scale analysis of deep moist convection and some related numerical calculations, J. Atmos. Sci., 39, 2192–2210
[16] Majda A.J., Klein R., 2003: Systematic multi-scale models for the tropics, J. Atmos. Sci., 60, 393–408
[17] Majda A., 2003: Introduction to PDE’s and waves for the atmosphere and ocean, American Mathematical Society
[18] Majda A.J., Rosales R.R., 1984: Resonantly interacting weakly nonlinear hyperbolic waves: I. A single space variable, Stud. Appl. Math., 71, 149–179
[19] Majda A.J., Rosales R.R., Schönbek M., 1988: A canonical system of integrodifferential equations arising in resonant nonlinear acoustics, Stud. Appl. Math., 79, 205–262
[20] Matsuno T., 1966: Quasi-geostrophic motions in the equatorial area, J. Meteor. Soc. Japan, Ser II, 44, 25–43
[21] Pedlosky J., 1987: Geophysical Fluid Dynamics, Springer Verlag
[22] Pego R., 1988: Some explicit resonating waves in weakly nonlinear gas dynamics, Stud. Appl. Math., 79, 263–270
[23] Schlichting H., Gersten K., 2003: Boundary-Layer Theory, 8th edition, Springer Verlag
[24] Schochet S., 1994: Resonant nonlinear geometric optics for weak solutions of conservation laws, J. Diff. Eqs., 113, 473–504
[25] Schochet S., 1994: Fast singular limits of hyperbolic PDEs, J. Diff. Eqs., 114, 476–512
[26] Sobel A., Nilsson J., Polvani L., 2001: The weak temperature gradient approximation and balanced tropical moisture waves, J. Atmos. Sci., 58, 3650–3665
[27] Webster P.J., 1972: Response of the tropical atmosphere to local steady forcing, Mon. Wea. Rev., 100, 518–541

Rupert Klein
Mathematik & Informatik, Freie Universität Berlin

Eileen Mikusky and Antony Owinoh
Potsdam Institut für Klimafolgenforschung
4ECM Stockholm 2004 © 2005 European Mathematical Society
Proof Complexity
Jan Krajíček

Abstract. This note, based on my 4ECM lecture, presents a few basic points of proof complexity in a way accessible to any mathematician.
In many parts of mathematics one finds statements asserting that a finite object with a particular, feasibly verifiable, property does not exist. Such statements include, for example, the unsolvability of an equation, the non-existence of a combinatorial pattern or of an algebraic object, or the non-existence of a computation solving a problem. The qualification “feasibly” will mean throughout the paper “in polynomial time” (p-time for short).

A universal statement of this form is a statement that a propositional formula cannot be satisfied by any truth assignment or, equivalently, that the negation of the formula is a tautology. The qualification “universal” means, in particular, that proving statements about the non-existence of particular finite objects can be reduced to proving that particular propositional formulas are tautologies.

The ultimate goal of proof complexity is to show that there is no universal propositional proof system allowing for efficient proofs of all tautologies. This is equivalent to showing that the computational complexity class NP is not closed under complementation (and it implies the famous conjecture that P differs from NP). By universality, propositional proof systems subsume methods from other parts of mathematics used for proving non-existence statements. Because of this, even the partial results known at present (lower bounds for some specific proof systems) have revealed interesting links of proof complexity to logic, algebra, combinatorics, computational complexity, and other areas.

I shall attempt to explain some basic points of proof complexity in a coherent manner accessible to any mathematician. I will first give a few informal examples (Section 1) in order to motivate the main concepts and problems of proof complexity (Section 2). After that I will discuss two particular topics in proof complexity, one classic (Section 3) and one quite recent (Section 4), in some detail.
The author is also a member of the Institute for Theoretical Computer Science of the Charles University. Partially supported by grant # A 101 94 01 of the Academy of Sciences and by project LN00A056 of the Ministry of Education of the Czech Republic.
The informed reader will notice that I do not discuss here any lower bounds for particular proof systems, although these are generally the most difficult and appreciated results in proof complexity. This is chiefly because of a lack of space (and time in the 4ECM lecture), as one needs a substantial background to appreciate that even results about particular proof systems say something about the original fundamental problems.

1. Examples of proof systems

Let $p_i(x_1, \dots, x_n) = 0$, $i = 1, \dots, k$, be a system of polynomial equations over $\mathbf{F}_q$. We want to know if the system is solvable in the field. We do not necessarily want to decide this by some algorithm; rather, we want to “prove” either that the system is solvable or that it is unsolvable. If the system is solvable then any solution $a = (a_1, \dots, a_n) \in (\mathbf{F}_q)^n$ will serve as a “proof” of the solvability. The soundness of any such proof is verified easily by checking that indeed all equations $p_i(a) = 0$ are satisfied. This can be done in p-time (as long as the polynomials are represented well, e.g., in the so-called dense notation as an explicit sum of monomials).

If the system is unsolvable we can take as a “proof” polynomials $q_i \in \mathbf{F}_q[x_1, \dots, x_n]$, $i = 1, \dots, k+n$, such that in $\mathbf{F}_q[x_1, \dots, x_n]$ it holds:
\[ \sum_{i \le k+n} q_i \cdot p_i = 1 , \]
where $p_{k+j} := x_j^q - x_j$ for $j = 1, \dots, n$. This equality can again be checked in p-time. For any $a \in (\mathbf{F}_q)^n$ that would be a solution of the system it would hold that $p_i(a) = 0$ for all $i \le k+n$. But that is impossible by the equality. Hence our “proofs” are sound: whenever suitable $q_i$’s exist, the system is unsolvable. In fact, by Hilbert’s Nullstellensatz such $q_i$’s always exist as long as the system is unsolvable in the field; hence this “proof system” is also complete.

Note that while we can bound the size of the proofs in the first case by the size of the polynomial system itself (i.e., of the problem instance), no similar bound is a priori possible (in general) for the proofs in the second case. The only bound on the size of some “proofs” is about $q^n$ (this can be exponential in the size of the problem instance). This trivial bound is achieved by an exhaustive search of all potential solutions. When we later use the phrase that there is no a priori bound we mean a bound better than the trivial exponential one.

Our second example is from graph theory. Let G be a graph (finite, undirected). The issue is whether G is 3-colorable or not. Again we are not interested in an algorithm deciding this but in “proof systems” that would allow us to prove that it is or it is not, respectively. A proof of the affirmative case is simple: any 3-coloring will do, as its correctness can be checked in p-time. To prove the non-colorability we may use a method devised by Hajós [7]. He showed that there is a finite number of initial graphs, all non-3-colorable, and a couple of elementary rules how to
obtain a new graph from two graphs already constructed, such that the class of graphs constructible in this way coincides with the class of non-3-colorable graphs. In particular, a sequence of graphs $H_1, \dots, H_k$ which are constructed from the initial graphs by Hajós’s rules and such that $H_k = G$ is a “proof” of non-3-colorability of G. The soundness of this proof system is obvious from the definition of the rules; the completeness is provided by Hajós’s theorem [7]. As before, while we have a trivial upper bound on the size of proofs in the affirmative case, we have no a priori subexponential upper bound for the sequences $H_1, \dots, H_k$.

The third example I shall discuss is the most important one. Let ϕ be a propositional formula in some fixed complete language for propositional logic, e.g., the DeMorgan language 0, 1, ¬, ∨, ∧. We want to prove that ϕ is, resp. is not, satisfiable. The satisfiability can be proved easily: any satisfying assignment will do as a “proof”. For the unsatisfiability we also have an elegant way of proving it: note that ϕ is unsatisfiable iff ¬ϕ is a tautology, and hence any proof of ¬ϕ in some propositional calculus will serve well. Again we have no a priori upper bounds on such proofs that would be better than exponential (in the size of ϕ).

I shall conclude this set of examples with an example of a different nature. Let f(x, y) = 0 be a Diophantine equation over Z, and let B ∈ N be a parameter. We want to prove, resp. to disprove, that the equation has an integer solution bounded in absolute value by B. The affirmative case is proved by simply producing such a solution. But now (and this is the point of this example) we do not seem to have any elegant way of proving unsolvability other than simply proving it in Mathematics, and then taking as a proof its formalization in some theory used for formalizing Mathematics (e.g., in set theory).
The soundness of such a proof comes from the soundness of the theory, and of course we have no a priori subexponential upper bound on its size. Note that the key property that it can be verified in p-time whether something is a proof or not is maintained (this is due to the fact that the axioms of set theory are given by a finite number of schemes and so it is p-verifiable whether a string is an axiom or not).

2. Basic concepts and problems

We now leave behind classes of particular finite objects (polynomials, graphs, formulas, etc.) and enter the abstract framework usual in theoretical computer science. Finite objects are encoded by binary words, i.e., elements of $\{0,1\}^*$. Decision problems are identified with the set of those problem instances for which the problem has the affirmative answer. That is, decision problems are languages $L \subseteq \{0,1\}^*$ and the original query is replaced by a query of the form $x \in^? L$. The size of a problem instance x becomes the length |x| of the word x.
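In this encoding, the p-time checks from Section 1 become ordinary programs on finite data. As a minimal sketch, the following Python code verifies a Nullstellensatz certificate $\sum_{i \le k+n} q_i \cdot p_i = 1$ from the first example. The representation is an assumption made for illustration only: a polynomial is a dictionary from exponent tuples to coefficients, q is taken to be prime so that arithmetic modulo q is arithmetic in $\mathbf{F}_q$, and all function names are hypothetical.

```python
from collections import defaultdict

def poly_add(f, g, q):
    """Sum of two polynomials over Z/q (dict: exponent tuple -> coefficient)."""
    h = defaultdict(int)
    for poly in (f, g):
        for mono, c in poly.items():
            h[mono] = (h[mono] + c) % q
    return {m: c for m, c in h.items() if c}

def poly_mul(f, g, q):
    """Product of two polynomials over Z/q."""
    h = defaultdict(int)
    for m1, c1 in f.items():
        for m2, c2 in g.items():
            m = tuple(a + b for a, b in zip(m1, m2))
            h[m] = (h[m] + c1 * c2) % q
    return {m: c for m, c in h.items() if c}

def field_equations(n, q):
    """The extra polynomials p_{k+j} := x_j^q - x_j for j = 1, ..., n."""
    eqs = []
    for j in range(n):
        hi = tuple(q if i == j else 0 for i in range(n))
        lo = tuple(1 if i == j else 0 for i in range(n))
        eqs.append({hi: 1, lo: (-1) % q})
    return eqs

def verify_certificate(ps, qs, n, q):
    """Check that sum_i q_i * p_i == 1, with the field equations appended to ps."""
    all_ps = ps + field_equations(n, q)
    if len(qs) != len(all_ps):
        return False
    total = {}
    for qi, pi in zip(qs, all_ps):
        total = poly_add(total, poly_mul(qi, pi, q), q)
    return total == {(0,) * n: 1}

# x^2 + x + 1 = 0 has no solution in F_2; the certificate uses the field equation:
# 1 * (x^2 + x + 1) + 1 * (x^2 - x) = 1 over F_2.
ps = [{(2,): 1, (1,): 1, (0,): 1}]
qs = [{(0,): 1}, {(0,): 1}]
print(verify_certificate(ps, qs, n=1, q=2))  # True
```

The check runs in time polynomial in the total size of the $p_i$ and $q_i$; nothing in it bounds how large the $q_i$ have to be, which is exactly the point made in Section 1.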
Definition 2.1 (Cook-Reckhow [6]). A proof system for $L \subseteq \{0,1\}^*$ is a binary relation P(x, y) satisfying the following three conditions:
(1) Completeness: x ∈ L → ∃y; P(x, y).
(2) Soundness: ∃y; P(x, y) → x ∈ L.
(3) p-verifiability: P(x, y) is a p-time decidable relation.

The important distinction between the affirmative cases and the negative cases, the existence of a priori short proofs, is formalized by the following definition.

Definition 2.2 (Cook-Reckhow [6]). A proof system P(x, y) is p-bounded iff there exists k ≥ 1 such that for all $x, y \in \{0,1\}^*$:
\[ P(x, y) \to \exists z\,(|z| \le (|x| + 2)^k);\ P(x, z) . \]
That is, the size of proofs in a p-bounded P can be a priori bounded by a polynomial in the size of the problem instance.

Definition 2.3. NP is the class of languages L admitting a p-bounded proof system. coNP is the class of complements of NP-languages.

In our examples, the sets of instances for which the problems have an affirmative answer form NP-sets (solvable polynomial systems, 3-colorable graphs, satisfiable formulas, etc.), while the sets of instances with the negative answer form coNP-sets (unsolvable polynomial systems, non-3-colorable graphs, unsatisfiable formulas or tautologies, etc.).

The following problem, the so-called NP versus coNP problem, is the central problem in proof complexity. It formalizes the question whether or not there are efficient ways to prove the negative cases in our, and in many other similar, examples.

Problem 2.4. NP =? coNP. That is, does the implication $L \in NP \to \{0,1\}^* \setminus L \in NP$ hold for any L?

It is a prevailing conjecture that NP ≠ coNP. The universality of propositional tautologies mentioned in the introduction means precisely what is stated in the following theorem. The theorem is a consequence of the so-called NP-completeness of the set of satisfiable formulas (Cook [4]).

Theorem 2.5 (Cook-Reckhow [6]). NP = coNP iff the set TAUT of propositional tautologies (in DeMorgan language) admits a p-bounded proof system.

Proof complexity (tacitly propositional proof complexity) studies proof systems for TAUT with the main aim to prove that none of them is p-bounded. It is a many-faceted area where computational complexity theory overlaps with mathematical logic.
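The definitions above can be illustrated on the 3-colorability example from Section 1. The sketch below is only an illustration under assumed conventions: a graph is a pair (n, edges) on vertices 0, ..., n−1, a proof is a tuple of colors, and the function names are hypothetical. The relation `P_3col` plays the role of P(x, y): it is decidable in time linear in the instance, and a proof is never longer than the number of vertices, so this proof system is p-bounded.

```python
from itertools import product

def P_3col(graph, coloring):
    """P(x, y): y is a proper 3-coloring of the graph x = (n, edges)."""
    n, edges = graph
    if len(coloring) != n or any(c not in (0, 1, 2) for c in coloring):
        return False
    return all(coloring[u] != coloring[v] for u, v in edges)

def in_language(graph):
    """x is in L iff some y with P(x, y) exists (completeness and soundness).
    This search is exponential; only the check P itself is feasible."""
    n, _ = graph
    return any(P_3col(graph, y) for y in product(range(3), repeat=n))

triangle = (3, [(0, 1), (1, 2), (0, 2)])
k4 = (4, [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)])
print(P_3col(triangle, (0, 1, 2)))  # True
print(in_language(k4))              # False
```

For the complement (non-3-colorable graphs) the code offers no short proof at all: `in_language(k4)` returns False only after inspecting every candidate coloring, which is the trivial exponential bound discussed in Section 1.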
Topics studied include:
• Connections with first order theories and central issues of mathematical logic (bounded arithmetic, Gödel’s theorem, etc.).
• Lower bounds for particular proof systems (resolution, bounded depth Frege systems, effective Nullstellensatz, polynomial calculus, cutting planes, theory of discretely ordered modules, etc.).
• Upper bounds and various simulations between proof systems.
• Connections with computational complexity (boolean complexity, cryptography, derandomization, communication complexity, etc.).

I shall say nothing about the middle two items (with the exception of a couple of brief remarks at the end of Section 3) but I will discuss examples from the first and from the last items in the next two sections.

3. Consistency statements

Let A(x, y) be a p-time relation. Given $a \in \{0,1\}^n$ we want to express that
\[ \forall y \in \{0,1\}^m;\ A(a, y) . \]
For example, we can express in this way that a is an unsolvable polynomial system, an unsatisfiable formula, etc. In these cases m ≤ poly(n), and we will always assume this bound.

Consider an algorithm executed by a Turing machine running in time poly(n, m) ≤ poly(n) and deciding A(x, y). The Turing machine is the established mathematical model of a computer, but any other faithful model would work just as well. At time t = 0 the tape of the machine (i.e., the memory) contains the bits of x, y, and additional bits, a finite number of them encoding the state of the machine. Call all these bits $w^0_1, w^0_2, \dots, w^0_T$. Generally, at time t, the memory holds bits $w^t_i$, for i ≤ T. The parameter T here is some universal bound deduced from the time bound of the machine, so T ≤ poly(n).

The key observation is that any bit $w^{t+1}_i$ depends only on a finite number of bits $w^t_j$, for some j’s. This allows us to write down a propositional formula $\mathrm{Correct}^A_n(x, y, w)$, the conjunction of all local conditions expressing that each bit $w^{t+1}_i$ is correctly computed from the $w^t_j$’s. Note that the number of bits $w^{t+1}_i$ is poly(n) and so the size of $\mathrm{Correct}^A_n(x, y, w)$ is also poly(n). The main property of this propositional representation is this:
• $\forall y \in \{0,1\}^m;\ A(a, y)$ holds true iff the propositional formula
\[ \mathrm{Correct}^A_n(a, y, w) \to w^T_{\mathrm{output}} \]
is a tautology.
Here $w^T_{\mathrm{output}}$ is the bit representing the output of the machine, i.e., it is equal to 1 (true) iff the machine yields the affirmative answer. We shall denote the implication by the symbol
\[ \|A\|_n(a, y) . \]
This is a propositional formula that has the bits of a substituted for the bits $x_1, \dots, x_n$, has bits $y_1, \dots, y_m$, and also bits $w^t_i$ that we do not show explicitly in this notation.

With this general way of translating bounded universal statements into propositional formulas we now define a formula expressing the soundness of a proof system Q:
\[ \forall x, y, z\,(|x|, |z| \le |y|);\ Q(x, y) \to Sat(x, z) . \]
We assume without loss of generality that Q(x, y) implies |x| ≤ |y| (we can always stipulate that a formula x is “a part” of a proof y). Sat(x, z) is the p-time relation “z is a truth assignment satisfying x”, |z| ≤ |x| ≤ |y|. Let $\|\mathrm{Ref}_Q\|_n(x, y, z)$ be the propositional translation of this formula for |y| = n.

Assume that we have a proof $\sigma_n$ of this tautology in a proof system P. Given any Q-proof π of any tautology τ, reason in P as follows:
(1) Q(τ, π) → Sat(τ, z) [this is an instance of the tautology $\|\mathrm{Ref}_Q\|_n(x, y, z)$ proved by $\sigma_n$]
(2) Q(τ, π) [this is a true instance and is proved by evaluating it]
(3) Sat(τ, z) [from (1) and (2)]
(4) Sat(τ, z) ≡ τ(z) [this is a general fact that is verified by induction on the logical complexity of τ]
(5) τ(z) [from (3) and (4)]

To summarize: Given a P-proof $\sigma_n$ of $\|\mathrm{Ref}_Q\|_n(x, y, z)$ we can transform any Q-proof π of size n into a P-proof $\pi^*$ of the same formula, and of size
\[ |\pi^*| \le O(|\sigma_n| + poly(n)) . \]
That is, “any” P proving the tautologies $\|\mathrm{Ref}_Q\|_n(x, y, z)$ by p-size proofs is “at least as good as” Q. The concept “at least as good as” is called simulation: P simulates Q iff P-proofs are at most polynomially longer than Q-proofs. In particular, if Q is p-bounded, so is P.

I remark without elaborating on it that many Q do prove their own soundness in p-size, and so the formulas $\|\mathrm{Ref}_Q\|_n(x, y, z)$ are actually the hardest tautologies that Q proves in p-size. See [8, Chap. 9] for details.
The formula expressing the soundness of Q has the form ∀y; A(x, y) (we leave out the bounds on y) such that it is universally true for all x and not just for some instances x := a. For such formulas it seems more natural to consider their proofs in a first order theory rather than to perform a redundant translation, for each length of x, to propositional formulas and then to prove all these formulas separately.
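As a toy illustration of the propositional translation $\|A\|_n$ introduced at the beginning of this section, the following sketch replaces the Turing machine by a small straight-line program (an assumption made to keep the example short; the text works with the bits $w^t_i$ of a machine computation). The relation A(x, y) chosen here, “y ≤ x bitwise” for |x| = |y| = 2, the gate list, and all names are hypothetical. Each gate contributes one local condition, `correct` is their conjunction, and the code checks by brute force that the implication “Correct → output bit” is a tautology exactly when ∀y A(a, y) holds.

```python
from itertools import product

OPS = {
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "not": lambda a, _: 1 - a,
}

# Bits 0,1 hold x; bits 2,3 hold y; gate number g defines bit 4 + g.
# The program computes (x0 or not y0) and (x1 or not y1), i.e. "y <= x bitwise".
GATES = [("not", 2, 2), ("or", 0, 4), ("not", 3, 3), ("or", 1, 6), ("and", 5, 7)]
N_INPUTS = 4

def correct(bits):
    """Conjunction of local conditions: each w-bit equals its gate on earlier bits."""
    return all(
        bits[N_INPUTS + g] == OPS[op](bits[i], bits[j])
        for g, (op, i, j) in enumerate(GATES)
    )

def translation_is_tautology(a):
    """Brute-force check that Correct(a, y, w) -> w_output holds for all y and w."""
    for y in product((0, 1), repeat=2):
        for w in product((0, 1), repeat=len(GATES)):
            bits = list(a) + list(y) + list(w)
            if correct(bits) and bits[-1] != 1:
                return False
    return True

def forall_y_A(a):
    """Direct evaluation of the statement: for all y, A(a, y)."""
    return all(
        all(yi <= ai for ai, yi in zip(a, y))
        for y in product((0, 1), repeat=2)
    )

for a in product((0, 1), repeat=2):
    assert translation_is_tautology(a) == forall_y_A(a)
```

The number of local conditions equals the number of gates, which mirrors the remark that the size of $\mathrm{Correct}^A_n$ is polynomial in n; the brute-force tautology check, by contrast, is exponential and is used here only to confirm the equivalence on a tiny instance.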
A relation between first-order theories and proof systems can indeed be developed; the theory is called bounded arithmetic (the term comes from a universal form of the theories one can take in this context). I shall not elaborate on it but just informally state its main points. See [8] for details (original references include [5, 15, 1]).

Proof systems P and theories T come in pairs such that:
(1) When T proves ∀x, y; A(x, y), then the tautologies $\|A\|_n(x, y)$ have p-size P-proofs.
(2) T proves the soundness of P, and if T proves the soundness of Q then P simulates Q.
(3) A form of converse to (1) (this needs some background in model theory that I shall omit).

One can view the relation between theories and proof systems as being analogous to the relation between algorithms and circuits; it is the uniform and the non-uniform version, respectively, of the same concept.

Before leaving these issues we remark that this allows us to interpret lower bounds for a particular proof system P as lower bounds for a whole class of proof systems whose soundness can be proved in P. This is often the case when the proof system is based on utilizing one combinatorial or algebraic fact (as was the case in some of our examples and as is the case in most proof systems studied so far).

Property (1) can be used to prove upper bounds on the size of P-proofs (it is often much easier to see that T proves a universal formula than to construct short P-proofs of its individual instances). Property (2) can be used for proving simulations between proof systems. In fact, all more involved upper bounds or simulations have been first proved via bounded arithmetic.

4. Proof complexity generators

The first issue one has to deal with when trying to prove lower bounds for a proof system P, or a class of proof systems, is to formulate tautologies that could be hard for the proof system. To produce sensible candidate tautologies appears actually surprisingly difficult. The point is that the tautology should be “hard” but should also have a clear intuitive meaning and structure in order that we are able to devise a lower bound proof. I use the verb “devise” because it can be shown that any lower bound proof amounts to constructing a model (in a precise logical meaning) for the negation of the tautology.

We have, at present, three categories of such candidate hard tautologies:
(1) Formulas $\|\mathrm{Ref}_Q\|_n(x, y, z)$ for “strong” Q.
(2) Combinatorial principles such as the pigeonhole principle.
(3) Complexity/logic motivated candidates.
The first type of formulas is difficult to use for lower bounds because the adjective “strong” is difficult to substantiate. The combinatorial formulas work well but only for weaker systems. I shall discuss in this section an example of candidates of the third type, the so-called τ-formulas. I shall give enough details to informally explain the main ideas but I make no attempt at an exhaustive or most general exposition (the interested reader may start with [12]).

The route to the τ-formulas went via feasible interpolation [9, 16] and provability of the dual weak pigeonhole principle in bounded arithmetic [10] (at least in my case). They have been defined in [10] and independently in [2], and their theory is being developed [11, 18, 12, 19, 13]. A more detailed discussion of the background than I offer below can be found in the introductions to [12] and [19].

Consider a p-time map g with restrictions $g_n : \{0,1\}^n \to \{0,1\}^m$ with n < m ≤ poly(n). We assume that |g(x)| depends only on |x|. Hence m := m(n) depends only on n and we will assume for simplicity of notation that $m(n) \ne m(n')$ if $n \ne n'$. As m > n, we have $\{0,1\}^m \setminus Rng(g) \ne \emptyset$. Let $b \in \{0,1\}^m \setminus Rng(g)$. The formula $\tau_b(g)$ expresses that b is outside of the range of g, “$g(x_1, \dots, x_n) \ne b$”, i.e., it is
\[ \|g(x) \ne y\|_n(x, y/b) . \]
Note that its size is poly(n, m) ≤ poly(n). We will say that g is hard for P iff
• For every k ≥ 1 and sufficiently large n, for no $b \in \{0,1\}^m \setminus Rng(g)$ does the formula $\tau_b(g)$ have a P-proof of size less than $m(n)^k$.

For a fixed k ≥ 1 define $Easy_k \subseteq \{0,1\}^*$ to consist of those y such that
\[ \exists z\,(|z| \le |y|^k);\ P(\tau_y(g), z) . \]
The hardness of g means that all these sets $Easy_k$ are finite. So what we want is a map g with parameters as above such that Rng(g) intersects all infinite NP-sets.

This is akin to the definition of pseudo-random number generators from cryptography (more precisely to that of hitting set generators): these are maps that intersect all P/poly-sets of non-negligible density. The relaxation from infinite sets to sets of positive density would not be such a problem for us: it would merely mean that perhaps not all τ-formulas are hard as we have required, but the fraction of b’s yielding easy instances would be negligible. What is difficult is the requirement that g should intersect NP-sets rather than P/poly-sets. Even the existence of pseudorandom generators is proved only under other assumptions (like the existence of one-way functions) and for the version with NP-sets we have nothing similar. There is a construction of a weaker type of generators, the so-called Nisan-Wigderson [17] generators,
sufficient in derandomization. Their existence can be proved (even w.r.t. NP-sets) from a plausible hypothesis in boolean complexity. But the problem here is that the parameters achieved are insufficient for our purposes (the time complexity of g, and hence the size of the $\tau_b(g)$’s, grows with the constant k ≥ 1). [19] conjectures that the original parameters suffice, while different parameters were proposed in [12]. Detailed discussions of this can be found, from somewhat different perspectives, in these two papers.

Instead of discussing the technicalities further we shall consider another type of result, showing that in a particular sense there is “the hardest” g. For that we need to recall the notion of a boolean circuit. It is an object similar to a propositional formula except that it is represented very economically: any subformula (or subcircuit) is written just once, even if it is needed in several occurrences. It is easy to see that a circuit of size s can be encoded by O(s log(s)) bits.

Let C be a circuit with k inputs and of size at most $2^{k/3}$. By the above it is encoded by a string of $O(2^{k/3} k/3) < 2^{k/2}$ bits. Denote by tt(C) the truth table of the boolean function on $\{0,1\}^k$ computed by C; it is an element of $\{0,1\}^{2^k}$. The truth-table function
\[ tt : C \in \{0,1\}^{2^{k/2}} \to tt(C) \in \{0,1\}^{2^k} \]
has the parameters $n = 2^{k/2}$ and $m = 2^k$ we want from g, and it is p-time computable.

The truth-table function will be the hardest one, but first we need to modify the definition of hardness a bit. I shall discuss this only informally. An intuitive drawback in the definition of hardness is that although it may be hard to prove that any particular b is outside of the range of g, one still cannot “consistently think” in P that g is onto. This is because it may be that some disjunction of the form
\[ g(x) \ne b \lor g(x') \ne b' \lor \dots \]
has a short P-proof, or a bit more generally a disjunction of the form
\[ g(x) \ne b \lor g(x') \ne b'(x) \lor \dots \]
where $b'(x)$ is a circuit computing $b'$ from x, etc. Define informally that g is very hard for P if “no such” disjunction has a p-size P-proof, for $n \gg 0$. The technical terms used here are pseudosurjectivity or iterability (cf. [12]).

Theorem 4.1 (Krajíček [12]). If there is any g very hard for P containing resolution then the truth table function tt is very hard for P too.

At least for the case of P being resolution we can prove the hypothesis of the theorem.

Theorem 4.2 (Razborov [19]). There is a g that is very hard for resolution.

The theorems imply the following corollary.
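The truth-table function itself is easy to make concrete. The sketch below is an illustration under assumed conventions, not the encoding used in [12]: a circuit is a list of gates (op, i, j) over earlier wires, the last wire is the output, and the binary encoding of circuits by strings of length $2^{k/2}$ is not modelled. The helper `range_of_small_circuits` enumerates, by brute force, all truth tables of circuits with at most a given number of gates, so that “b is outside the range” can be read directly as a circuit lower bound for b.

```python
from itertools import product

def evaluate(circuit, inputs):
    """Evaluate a circuit given as a list of gates (op, i, j) over earlier wires."""
    wires = list(inputs)
    for op, i, j in circuit:
        if op == "and":
            wires.append(wires[i] & wires[j])
        elif op == "or":
            wires.append(wires[i] | wires[j])
        elif op == "not":
            wires.append(1 - wires[i])
        else:
            raise ValueError(f"unknown gate {op!r}")
    return wires[-1]

def tt(circuit, k):
    """Truth table of the k-input function computed by the circuit (2**k bits)."""
    return "".join(str(evaluate(circuit, a)) for a in product((0, 1), repeat=k))

def circuits_of_size(k, size):
    """All circuits with exactly `size` gates on k inputs (exponentially many)."""
    if size == 0:
        yield []
        return
    wires = k + size - 1  # wires available to the last gate
    for prefix in circuits_of_size(k, size - 1):
        for op in ("and", "or", "not"):
            for i in range(wires):
                for j in ((i,) if op == "not" else range(wires)):
                    yield prefix + [(op, i, j)]

def range_of_small_circuits(k, max_size):
    """Truth tables of all circuits with at most `max_size` gates."""
    return {tt(c, k) for s in range(max_size + 1) for c in circuits_of_size(k, s)}

# XOR of two inputs, built from four gates.
XOR = [("or", 0, 1), ("and", 0, 1), ("not", 3, 3), ("and", 2, 4)]
print(tt(XOR, 2))                                 # 0110
print("0110" in range_of_small_circuits(2, 1))    # False
```

The last line says that no circuit with at most one gate computes XOR, i.e. the statement expressed by $\tau_b(tt)$ for b = 0110 with this toy size bound is true. Proving such statements for realistic parameters is precisely a circuit lower bound.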
Corollary 4.3. The truth table function tt is very hard for resolution. In particular, it is also hard for resolution.

The formula $\tau_b(tt)$ expresses that b is a truth table of a function with large ($> 2^{k/3}$) circuit complexity (the size $2^{k/3}$ has been chosen for our discussion but can be almost anything). Hence these formulas are indeed hard if it is hard to prove a circuit lower bound for any boolean function. This puts us in a somewhat peculiar situation: our program succeeds, i.e., the τ-formulas are very hard even for strong proof systems, if it is hard to prove circuit lower bounds, i.e., if it is hard to carry out other programs in complexity theory that reduce various conjectures (like P ≠ NP or universal derandomization of probabilistic computations) to circuit lower bounds.

5. A broader perspective

Let me conclude with a look at proof complexity problems from a distance. Around 1900 mathematicians were worried about questions such as:
• Is the consistency of Mathematics provable?
• Is predicate calculus (i.e., what is true in all structures) algorithmically decidable?
The first issue led to Gödel’s theorem and Gentzen’s proof theory analysis, while the second led to the work of Turing, Church, Kleene and others (a formal definition of the notion of algorithm, undecidable problems). We can view the problems of complexity theory as quantitative versions of the above questions. In particular:
• Is the consistency of Mathematics w.r.t. proofs of size n provable in size comparable to n?
• Is it feasibly decidable what is true in all structures of size n?
If “comparable” means polynomially bounded and “feasibly” means in p-time, then the first problem is exactly NP =? coNP while the second one is P =? NP. The links with logic run deep. For example, there is a quantitative version of Gödel’s theorem that, if true, would imply that no proof system can simulate all other proof systems, and hence NP ≠ coNP and P ≠ NP.
See [15]; there is an exposition in [8] too, or a brief and non-technical one in [14].

References
[1] M. Ajtai, The complexity of the pigeonhole principle, in: Proc. IEEE 29th Annual Symp. on Foundation of Computer Science, (1988), pp. 346–355.
[2] M. Alekhnovich, E. Ben-Sasson, A.A. Razborov, and A. Wigderson, Pseudorandom generators in propositional proof complexity, Electronic Colloquium on Computational Complexity, Rep. No. 23, (2000). Ext. abstract in: Proc. of the 41st Annual Symp. on Foundation of Computer Science, (2000), pp. 43–53.
[3] S.R. Buss, Bounded Arithmetic. Naples, Bibliopolis, (1986).
[4] S.A. Cook, The complexity of theorem proving procedures, in: Proc. 3rd Annual ACM Symp. on Theory of Computing, (1971), pp. 151–158. ACM Press.
[5] S.A. Cook, Feasibly constructive proofs and the propositional calculus, in: Proc. 7th Annual ACM Symp. on Theory of Computing, (1975), pp. 83–97. ACM Press.
[6] S.A. Cook and A.R. Reckhow, The relative efficiency of propositional proof systems, J. Symbolic Logic, 44(1), (1979), pp. 36–50.
[7] G. Hajós, Über eine Konstruktion nicht n-färbbarer Graphen, Wiss. Z. Martin-Luther-Univ., Halle-Wittenberg, Math. Natur. Reihe, 10, (1961), pp. 116–117.
[8] J. Krajíček, Bounded arithmetic, propositional logic, and complexity theory, Encyclopedia of Mathematics and Its Applications, Vol. 60, Cambridge University Press, (1995).
[9] J. Krajíček, Interpolation theorems, lower bounds for proof systems, and independence results for bounded arithmetic, J. Symbolic Logic, 62(2), (1997), pp. 457–486.
[10] J. Krajíček, On the weak pigeonhole principle, Fundamenta Mathematicae, Vol. 170(1-3), (2001), pp. 123–140.
[11] J. Krajíček, Tautologies from pseudo-random generators, Bulletin of Symbolic Logic, 7(2), (2001), pp. 197–212.
[12] J. Krajíček, Dual weak pigeonhole principle, pseudo-surjective functions, and provability of circuit lower bounds, Journal of Symbolic Logic, 69(1), (2004), pp. 265–286.
[13] J. Krajíček, Diagonalization in proof complexity, Fundamenta Mathematicae, 182, (2004), pp. 181–192.
[14] J. Krajíček, Hardness assumptions in the foundations of theoretical computer science, Archive for Mathematical Logic, to appear.
[15] J. Krajíček and P. Pudlák, Propositional proof systems, the consistency of first order theories and the complexity of computations, J. Symbolic Logic, 54(3), (1989), pp. 1063–1079.
[16] J. Krajíček and P. Pudlák, Some consequences of cryptographical conjectures for $S^1_2$ and EF, Information and Computation, Vol. 140(1), (January 10, 1998), pp. 82–94.
[17] N. Nisan and A. Wigderson, Hardness vs. randomness, J. Comput. System Sci., Vol. 49, (1994), pp. 149–167.
[18] A.A. Razborov, Resolution lower bounds for perfect matching principles, in: Proc. of the 17th IEEE Conf. on Computational Complexity, (2002), pp. 29–38.
[19] A.A. Razborov, Pseudorandom generators hard for k-DNF resolution and polynomial calculus resolution, preprint, (May 2003).

Jan Krajíček
Mathematical Institute
Academy of Sciences
Žitná 25
CZ-11567 Prague 1, The Czech Republic
e-mail: [email protected]
Horizontal Configurations of Points in Link Complements

Daan Krammer

Abstract. For any tangle $T$ (up to isotopy) and integer $k \ge 1$ we construct a group $F(T)$ (up to isomorphism). It is the fundamental group of the configuration space of $k$ points in a horizontal plane avoiding the tangle, provided the tangle is in what we call Heegaard position. This is analogous to the first half of Lawrence's homology construction of braid group representations. We briefly discuss the second half: homology groups of $F(T)$.
1. Introduction

In her thesis [7] Ruth Lawrence introduced and studied certain representations of braid groups. She related her representations to the Jones polynomial (see also [3]). Some of her representations were later shown to be faithful [2], [5]. Encouraged by these results, we ask whether (new) link invariants can be obtained by similar methods.

Very briefly, the Lawrence representations of braid groups are constructed in two steps. Firstly, the braid group acts on a homotopy type called configuration space. Secondly, certain homology modules of configuration space are braid group modules. In the case of links we expect the same two steps:
• From links to groups.
• From groups to homology.
Our main result belongs to the first bullet. On the second bullet we have only some simple remarks.

Let $L \subset \mathbb{R}^3$ be a link (not up to isotopy!) and fix a positive integer $k$. Consider the configuration spaces
\[ C(L) = \{ X \subset \mathbb{R}^3 \setminus L \mid |X| = k \}, \]
\[ M(L) = \{ X \in C(L) \mid X \text{ lies in a horizontal plane} \}. \]
It is trivial that up to diffeomorphism $C(L)$ depends only on the isotopy class of $L$. Therefore, a (twisted) homology module of $C(L)$ is a link invariant. But the fundamental group of $C(L)$ has no representations $U$ such that $H_*(C(L), U)$ is at all interesting, at least no more than $\pi_1(\mathbb{R}^3 \setminus L)$ has.¹ The group $\pi_1 M(L)$ has many more representations and we will henceforth concentrate on this group.

A particular case of our main result 4.3 states: If $L$ is a Heegaard link then $\pi_1 M(L)$ depends only on the isotopy class of $L$. (The full result considers the more general case of tangles.) See Section 3 for the definition of Heegaard links. Every link is isotopic to a Heegaard link. A direct consequence of the above result is the construction of a link invariant which takes isomorphism classes of groups for values.

I do not know which other properties of $M(L)$ depend only on the isotopy class of $L$ ($L$ again Heegaard). Does $M(L)$ up to diffeomorphism? Does it up to homotopy equivalence?

The paper is built as follows. In Section 2 we review Lawrence's representations. Tangles and Heegaard tangles are introduced in Section 3. The main result is formulated in Section 4 and proved in Section 5. Section 6 discusses the second bullet (from groups to homology) but it does not get very far.

2. Lawrence representations

We will review Lawrence's representations of braid groups [7]. The braid group $B_n$ is defined to be the fundamental group of
\[ BS_n = \{ X \subset \mathbb{C} : |X| = n \}, \]
the space of sets (called configurations) of $n$ complex numbers. Throughout this paper, we fix a natural number $k \ge 1$. Let $E$ denote the space of pairs $(X, Y)$ where $X \in BS_n$ and $Y \subset \mathbb{C} \setminus X$ is a set of $k$ points which avoid $X$. The map
\[ f\colon E \longrightarrow BS_n, \qquad (X, Y) \longmapsto X \]
is a fibre bundle (topologically locally trivial map). Let $F \subset E$ be the fibre of $f$ over a base-point in $BS_n$. The fibre bundle $f$ admits a continuous section $s\colon BS_n \to E$ (section means $f s(X) = X$ for all $X$), for example
\[ s(X) = \bigl( X, \{ a_X + 1, a_X + 2, \dots, a_X + k \} \bigr) \quad \text{where} \quad a_X = \max \{ |x| : x \in X \}. \]
It gives rise to a splitting $t\colon B_n \to \pi_1 E$.

¹ This is because $\pi_1 C(L)$ is a semi-direct product $S_k \ltimes \pi_1(\mathbb{R}^3 \setminus L)^k$.
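The section $s$ is explicit enough to compute. The following is a minimal Python sketch, assuming configurations are given as finite sets of complex numbers; the function names `section` and `bundle_projection` are hypothetical and only illustrate the formula for $s(X)$ and the identity $f s(X) = X$.

```python
def section(X, k):
    """Return s(X) = (X, Y) with Y = {a_X + 1, ..., a_X + k}, a_X = max |x|."""
    X = frozenset(complex(x) for x in X)
    a = max((abs(x) for x in X), default=0.0)
    # The k new points lie on the real axis, strictly outside the disk of radius a_X.
    Y = frozenset(complex(a + j, 0.0) for j in range(1, k + 1))
    return X, Y


def bundle_projection(pair):
    """The bundle map f: (X, Y) -> X."""
    X, _ = pair
    return X
```

Since every point of $Y$ has absolute value larger than $a_X$, the set $Y$ automatically avoids $X$, which is what makes $s$ land in $E$.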
Figure 1. $b_q$ and $b_t$.
The theory of fibre bundles says that $\pi_1 E$ is now a semi-direct product $B_n \ltimes \pi_1 F$. In particular we have an action
\[ B_n \longrightarrow \mathrm{Aut}(\pi_1 F) \tag{2.1} \]
defined by $x(y) = (tx)\, y\, (tx)^{-1}$ ($x \in B_n$, $y \in \pi_1 F$).

Let $U$ be a module over any ring and let $r\colon \pi_1 E \to GL(U)$ be a linear representation. We then put $V := H_k(F, U)$. It is known that $F$ is a $K(\pi, 1)$, so that we also have $V = H_k(\pi_1 F, U)$. The $B_n$-action (2.1) on $\pi_1 F$ gives rise to a $B_n$-action $B_n \to GL(V)$ because homology is a functor. This is a general form of the Lawrence representation of the braid group [7].

The case where $U$ is 1-dimensional is especially interesting. In that case, the space of representations $r$ is 2-dimensional if $k \ge 2$; indeed, the abelianisation $(\pi_1 E)^{\mathrm{ab}}$ of $\pi_1 E$ is isomorphic to $\mathbb{Z}^2$. Figure 1 shows two generators of $(\pi_1 E)^{\mathrm{ab}}$. White dots are elements of $X$ and shall be called punctures. Black dots are elements of $Y$. So $b_q \in \pi_1 E$ means that an element of $Y$ makes a full circle around an element of $X$ in counterclockwise direction, and $b_t \in \pi_1 E$ means that two elements of $Y$ interchange counterclockwise. Then $(\pi_1 E)^{\mathrm{ab}}$ is the free abelian group on the images of $b_q$ and $b_t$. The representation
\[ r\colon \pi_1 E \longrightarrow GL\bigl(1, \mathbb{Z}[q^{\pm 1}, t^{\pm 1}]\bigr) \]
is defined by $r(b_q) = q$ and $r(b_t) = t$. It can be shown that
\[ \dim V = \binom{n+k-2}{k} \]
so that the Lawrence representation can briefly be written
\[ B_n \longrightarrow GL(V) = GL\Bigl( \tbinom{n+k-2}{k},\ \mathbb{Z}[q^{\pm 1}, t^{\pm 1}] \Bigr). \]
The Jones polynomial has been related to these representations in [7] and [3]. For k = 1 this representation is the well-known Burau representation discovered in 1936. The representation for k = 2 was shown to be faithful in [2] and [5]. In the following we will try to apply similar methods to obtain knot and link invariants.
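As a small illustration of the case $k = 1$, here is a hedged Python sketch. It uses the standard unreduced Burau matrices, which are $n \times n$ and are an assumption of this example: the text only states that the $k = 1$ Lawrence representation, of dimension $\binom{n-1}{1} = n - 1$, is the Burau representation. The helper names `lawrence_dim`, `burau` and `matmul` are hypothetical. The sketch evaluates the matrices at a rational value of $t$ so that the braid relations can be checked exactly.

```python
from fractions import Fraction
from math import comb


def lawrence_dim(n, k):
    """dim V = binomial(n + k - 2, k), as stated in the text."""
    return comb(n + k - 2, k)


def burau(n, i, t):
    """Unreduced Burau matrix of sigma_i (1 <= i < n) at the parameter value t."""
    M = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    r = i - 1
    # Replace the 2x2 block at rows/columns (i, i+1) by [[1 - t, t], [1, 0]].
    M[r][r], M[r][r + 1] = 1 - t, t
    M[r + 1][r], M[r + 1][r + 1] = Fraction(1), Fraction(0)
    return M


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[r][j] * B[j][c] for j in range(n)) for c in range(n)]
            for r in range(n)]
```

Checking the relations at one rational value of $t$ is only a spot check, not a proof that they hold as identities in $\mathbb{Z}[t^{\pm 1}]$.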
3. Tangles

We define tangles and Heegaard tangles. A tangle of type $[a, b]$ ($a, b \in \mathbb{R}$, $a < b$) is a smooth compact 1-manifold $T \subset \mathbb{C} \times [a, b]$ (with coordinates $(x + iy, z)$) with $\partial T = T \cap (\mathbb{C} \times \{a, b\})$ and such that $T$ is not tangent to $\mathbb{C} \times \{a, b\}$. A link is a tangle with empty boundary.

Two tangles of types $[a, b]$ and $[c, d]$ are isotopic if one is taken to the other by a diffeomorphism $f\colon \mathbb{C} \times [a, b] \to \mathbb{C} \times [c, d]$ with $f(x, a) = (x, c)$ and $f(x, b) = (x, d)$ for all $x \in \mathbb{C}$. The isotopy class of a tangle $T$ is written $[T]$.

It may happen that the union $T_1 \cup T_2$ of a tangle $T_1$ of type $[a, b]$ and a tangle $T_2$ of type $[b, c]$ is again a tangle (of type $[a, c]$); if this happens we call $T_1 T_2 := T_1 \cup T_2$ the product of $T_1$, $T_2$. The isotopy class $[T_1 T_2]$ depends only on $[T_1]$, $[T_2]$ and we thus obtain the multiplication of isotopy classes of tangles.

Definition 3.1. Let
\[ p_3\colon \mathbb{C} \times \mathbb{R} \longrightarrow \mathbb{R}, \qquad (x + iy, z) \longmapsto z \]
denote the projection on the third real coordinate. A plane $p_3^{-1}(d)$ ($d \in \mathbb{R}$) is called horizontal. A tangle $T$ of type $[a, b]$ is said to be Heegaard if a number $c \in [a, b]$ is distinguished such that any local maximum (or cap) $x$ of $p_3|_T\colon T \to \mathbb{R}$ satisfies $p_3(x) > c$, and every local minimum (or cup) satisfies $p_3(x) < c$, and $T$ is not tangent to $p_3^{-1}(c)$. The horizontal plane $p_3^{-1}(c)$ is called the Heegaard plane and it separates the caps from the cups.

Two Heegaard tangles are said to be Heegaard isotopic if one is taken to the other by a diffeomorphism $f\colon \mathbb{C} \times [a, b] \to \mathbb{C} \times [c, d]$ as for the usual isotopy, provided $f$ takes the one Heegaard plane to the other. If two Heegaard tangles are Heegaard isotopic then they are isotopic, but not conversely. It is known [4] that every tangle is isotopic to a Heegaard tangle.

The concept of Heegaard tangles is closely related to the plat closure about which we shall be rather brief. A detailed discussion can be found in [4].
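The Heegaard condition of Definition 3.1 can be tested on sampled data. The sketch below is only an illustration under stated assumptions: each component of a link is assumed to be given as a cyclic list of heights $p_3$ sampled along the curve, with consecutive samples distinct and at least three samples per component, so that local extrema of the list stand in for caps and cups. The names `local_extrema` and `heegaard_levels` are hypothetical.

```python
def local_extrema(heights):
    """Heights sampled cyclically along one closed component.

    Returns (maxima, minima): the sampled heights of caps and cups.
    """
    n = len(heights)
    maxima, minima = [], []
    for j in range(n):
        prev, cur, nxt = heights[j - 1], heights[j], heights[(j + 1) % n]
        if cur > prev and cur > nxt:
            maxima.append(cur)
        elif cur < prev and cur < nxt:
            minima.append(cur)
    return maxima, minima


def heegaard_levels(components):
    """Open interval (lo, hi) of levels c separating caps from cups, or None."""
    caps, cups = [], []
    for heights in components:
        mx, mn = local_extrema(heights)
        caps += mx
        cups += mn
    # Every closed component has at least one cap and one cup.
    lo, hi = max(cups), min(caps)
    return (lo, hi) if lo < hi else None
```

A level $c$ exists exactly when the highest cup lies below the lowest cap; any $c$ strictly between them can serve as the height of the Heegaard plane for the sampled curve.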
We have a commuting diagram
\[
\begin{array}{ccc}
\bigsqcup_{n \ge 0} B_{2n} & \xrightarrow{\ \text{plat closure}\ } & \{\text{Links}\} / \text{isotopy} \\
 & \searrow & \uparrow \\
 & & \{\text{Heegaard links}\} / \text{Heegaard isotopy}
\end{array}
\tag{3.1}
\]
and the map in the top row, the plat closure, adds $n$ caps and $n$ cups to a braid on $2n$ strings. Loosely speaking, the plat closure of a braid is in Heegaard position in a natural way. The elements of the set at the bottom of (3.1) can be viewed as certain double cosets of braids as is done in [4]. The term plat closure is well known, but we prefer the language of Heegaard tangles because they are not isotopy classes, contrary to plat closures.

4. From tangles to groups

Definition 4.1. Let $T$ be a tangle of type $[a, b]$. We define
\[ M(T) = \bigl\{ X \subset (\mathbb{C} \times [a, b]) \setminus T \ \bigm|\ |X| = k,\ x, y \in X \Rightarrow p_3(x) = p_3(y) \bigr\} \]
which we call the configuration space of $T$. Note that every $X \in M(T)$ is required to lie in a horizontal plane. For example, if $k = 1$ then $M(T) = (\mathbb{C} \times [a, b]) \setminus T$.

Definition 4.2. A Heegaard tangle is saturated if each of its components contains a cap or a cup.

It is known [4] that every tangle is isotopic to a Heegaard tangle, and therefore clearly also to a saturated Heegaard tangle. Every Heegaard link is saturated. The following is our main result.

Theorem 4.3. Let $T_1$, $T_2$ be saturated Heegaard tangles. If $T_1$, $T_2$ are ('non-Heegaard') isotopic then $\pi_1 M(T_1) \cong \pi_1 M(T_2)$.

The point of the theorem is that $T_1$, $T_2$ are not assumed to be Heegaard isotopic but just isotopic, which is a weaker assumption. It is trivial that $\pi_1 M(T_1)$ and $\pi_1 M(T_2)$ are isomorphic if $T_1$, $T_2$ are Heegaard isotopic.

A tangle invariant is just a map from the set of isotopy classes of tangles to any set. Theorem 4.3 suggests a tangle invariant as follows.

Definition 4.4. Let $T$ be a tangle. We define a group $F(T)$ as follows. First, choose a saturated Heegaard tangle $U$ isotopic to $T$. We put $F(T) = \pi_1 M(U)$.
Figure 2. Elementary stabilisation (from $T_1$ to $T_2$, near a point $h$ of the Heegaard plane).
By 4.3, $F(T)$ is independent of the choice of $U$. It is immediate that $F(T)$ depends only on the isotopy class of $T$, so $T \mapsto F(T)$ is a tangle invariant.

I do not know which other properties of $M(T)$ ($T$ a saturated Heegaard tangle) depend only on the isotopy class of $T$. Is $M(T)$ up to diffeomorphism a tangle invariant? Is its homotopy type?

5. Proof

After some preparation, we will prove our main result 4.3. In this section, all Heegaard tangles will be of type $[-1, 1]$ and with Heegaard plane $H := p_3^{-1}(0)$, unless stated otherwise.

Stabilisation. Let $T_1$, $T_2$ be Heegaard tangles. We say that $T_2$ is obtained from $T_1$ by an elementary stabilisation if $T_1$, $T_2$ only differ close to some intersection point $h \in T_1 \cap H$ and $T_2$ has three such intersection points close to $h$ rather than one. See Figure 2. Note that in the foregoing, $T_2$ is determined up to Heegaard isotopy by $T_1$ and $h$ (and isotopy which is trivial far away from $h$).

We will use the following result by Birman [4]. In fact she considered only links, but her proof also works for tangles.

Theorem 5.1. The obvious map
\[ \{\text{Heegaard links}\} / (\text{Heegaard isotopy and stabilisation}) \longrightarrow \{\text{Links}\} / \text{isotopy} \]
is bijective. (The set on the left is by definition the quotient of the set of Heegaard links by the equivalence relation $\sim$ generated by $T_1 \sim T_2$ whenever $T_1$, $T_2$ either are Heegaard isotopic or differ by an elementary stabilisation.)

An application of the Seifert-Van Kampen theorem. Let $T$ be a Heegaard tangle. Let $Z$ denote the map
\[ Z\colon M(T) \longrightarrow \mathbb{R}, \qquad X \longmapsto p_3(x) \ \text{ for one (hence all) } x \in X. \]
We will use the following notation.
\[
\begin{array}{ll}
M = M(T) & G = G(T) = \pi_1 M(T) \\
M_0 = M_0(T) = Z^{-1}(0) & G_0 = G_0(T) = \pi_1 M_0(T) \\
M_+ = M_+(T) = Z^{-1}[0, 1] & G_+ = G_+(T) = \pi_1 M_+(T) \\
M_- = M_-(T) = Z^{-1}[-1, 0] & G_- = G_-(T) = \pi_1 M_-(T)
\end{array}
\]
An immediate application of the Seifert-Van Kampen theorem shows that the diagram
\[
\begin{array}{ccccc}
 & & G_+ & & \\
 & \nearrow & & \searrow & \\
G_0 & & & & G \\
 & \searrow & & \nearrow & \\
 & & G_- & &
\end{array}
\tag{5.1}
\]
(with obvious arrows) is a push-out diagram. There is another way of saying the same thing, because all maps in (5.1) are surjective: writing
\[
\begin{aligned}
K_+ = K_+(T) &= \ker(G_0 \to G_+) \\
K_- = K_-(T) &= \ker(G_0 \to G_-) \\
K = K(T) &= \ker(G_0 \to G)
\end{aligned}
\tag{5.2}
\]
we have
\[ K = \langle K_+, K_- \rangle. \tag{5.3} \]
Each of the three straight lines of two arrows in
\[
\begin{array}{ccccc}
K_- & & & & G_+ \\
 & \searrow & & \nearrow & \\
K & \longrightarrow & G_0 & \twoheadrightarrow & G \\
 & \nearrow & & \searrow & \\
K_+ & & & & G_-
\end{array}
\]
is exact, that is, $K_- \to G_0 \to G_-$, $K \to G_0 \to G$ and $K_+ \to G_0 \to G_+$ are exact.

Some group presentations. The next lemma gives generators for $K_+(T)$.

Lemma 5.2. Let $T$ be a Heegaard tangle. Let $D_1, \dots, D_\ell$ be disjoint closed disks in the Heegaard plane $H$ and write $D_i := p_3 D_i$. Suppose that $(\partial D_i \times [0, 1]) \cap T = \emptyset$ for all $i$ and that $(D_i \times [0, 1]) \cap T$ is an interval whose boundary lies in $H$, for all $i$. See Figure 3. Loosely speaking every cap of $T$ lives in another $D_i \times [0, 1]$. Recall that the Heegaard plane is $H = p_3^{-1}(0)$ so we are only looking at the part above $H$. Fix a set $X_0 \subset H \setminus (D_1 \cup \dots \cup D_\ell)$ of $k - 1$ elements (the heavy dots in Figure 3). Consider the conjugacy class $Y_i \subset G_0(T)$ of those elements given by
a closed path in $M_0(T)$ homotopic to the map
\[ \partial D_i \longrightarrow M_0(T), \qquad x \longmapsto \{x\} \cup X_0. \]
Then $K_+(T)$ is generated by $Y_1 \cup \dots \cup Y_\ell$.

Proof. Left to the reader.

Figure 3. $T \cap (\mathbb{C} \times [0, 1])$.

Figure 4. Generators for the braid group of the punctured disk.
Remark 5.3. One can get around the need of Lemma 5.2 if one replaces $\pi_1 M(T)$ in the main result 4.3 by the group implied in Lemma 5.2, that is,
\[ G_0(T) / \langle Y_1, \dots, Y_\ell, Z_1, \dots, Z_m \rangle \]
where the $Y_i$ are as in 5.2 and the $Z_i$ likewise with cups instead of caps. The price one pays is that one should show that this group is well defined, that is, does not depend on the choice of the disks $D_i$ in 5.2. Modification of the proof of 4.3 is not necessary.
Figure 5. Three fundamental tangles $U_1$, $U_2$, $U_3$.

Proposition 5.4. The $k$-string braid group
\[ \pi_1 \bigl\{ X \subset \mathbb{C} \ \bigm|\ |X| = k,\ X \cap \{1, \dots, m\} = \emptyset \bigr\} \]
of the $m$ times punctured disk with base-point $\{1 + i, 2 + i, \dots, k + i\}$ ($i = \sqrt{-1}$) is presented by generators
\[ \sigma_i \ (1 \le i < k), \qquad g_{ip} \ (1 \le i \le k,\ 1 \le p \le m) \tag{5.4} \]
(see Figure 4) and relations
\[ \sigma_i \sigma_j = \sigma_j \sigma_i \qquad |i - j| > 1 \tag{5.5} \]
\[ \sigma_i \sigma_j \sigma_i = \sigma_j \sigma_i \sigma_j \qquad |i - j| = 1 \tag{5.6} \]
\[ [g_{ip}, g_{jq}] = 1 \qquad i < j,\ p < q \tag{5.7} \]
\[ \sigma_i\, g_{i+1,p}\, \sigma_i^{-1}\, g_{ip}^{-1} = 1 \tag{5.8} \]
\[ [g_{ip}, \sigma_i g_{ip} \sigma_i] = 1 \tag{5.9} \]
\[ [g_{ip}, \sigma_j] = 1 \qquad i \notin \{j, j + 1\} \tag{5.10} \]
where $[a, b] = a b a^{-1} b^{-1}$.

Proof. Presentations for this group can be found in Theorem 2 or Theorem 3 of [6] or Theorem 5.1 of [1]. The generators $\sigma_j$, $a_{ij}$ of Theorem 2 in [6] are our $\sigma_j$, $g_{j,m+1-i}^{-1}$. It is left to the reader to check that this identification respects the group presentations.

For any tangle $T$, the group $G_0(T)$ is the $k$-string braid group of a punctured disk $H \setminus T$.
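One cheap consistency check on a presentation of this shape is the free rank of its abelianisation. The Python sketch below is an illustration only. It assumes the index ranges $1 \le i < k$ for $\sigma_i$ and $1 \le i \le k$, $1 \le p \le m$ for $g_{ip}$, and it reads (5.8) as $\sigma_i g_{i+1,p} \sigma_i^{-1} g_{ip}^{-1} = 1$; both are assumptions of this example. The function names `relators` and `abelian_rank` are hypothetical. The code encodes relations (5.5)–(5.10) as words and computes the rank by Gaussian elimination on exponent sums.

```python
from fractions import Fraction
from itertools import product


def relators(k, m):
    """Relators (5.5)-(5.10) as words: lists of (generator, exponent) pairs."""
    s = lambda i: ("s", i)
    g = lambda i, p: ("g", i, p)
    inv = lambda w: [(x, -e) for x, e in reversed(w)]
    comm = lambda a, b: a + b + inv(a) + inv(b)
    S, I, P = range(1, k), range(1, k + 1), range(1, m + 1)
    rels = []
    for i, j in product(S, S):
        if j - i > 1:                                    # (5.5)
            rels.append(comm([(s(i), 1)], [(s(j), 1)]))
        if j - i == 1:                                   # (5.6)
            a, b = (s(i), 1), (s(j), 1)
            rels.append([a, b, a] + inv([b, a, b]))
    for i, j, p, q in product(I, I, P, P):
        if i < j and p < q:                              # (5.7)
            rels.append(comm([(g(i, p), 1)], [(g(j, q), 1)]))
    for i, p in product(S, P):
        rels.append([(s(i), 1), (g(i + 1, p), 1),
                     (s(i), -1), (g(i, p), -1)])         # (5.8)
        x = [(g(i, p), 1)]
        rels.append(comm(x, [(s(i), 1)] + x + [(s(i), 1)]))  # (5.9)
    for i, j, p in product(I, S, P):
        if i not in (j, j + 1):                          # (5.10)
            rels.append(comm([(g(i, p), 1)], [(s(j), 1)]))
    return rels


def abelian_rank(k, m):
    """Free rank of the abelianisation: #generators - rank(exponent-sum matrix)."""
    gens = [("s", i) for i in range(1, k)]
    gens += [("g", i, p) for i in range(1, k + 1) for p in range(1, m + 1)]
    col = {x: c for c, x in enumerate(gens)}
    rows = []
    for w in relators(k, m):
        v = [Fraction(0)] * len(gens)
        for x, e in w:
            v[col[x]] += e
        rows.append(v)
    rank = 0
    for c in range(len(gens)):                           # elimination over Q
        piv = next((r for r in range(rank, len(rows)) if rows[r][c] != 0), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][c] != 0:
                f = rows[r][c] / rows[rank][c]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return len(gens) - rank
```

Under these assumptions the rank comes out as $m$ for $k = 1$ (a free group on the $m$ loops) and $m + 1$ for $k \ge 2$ (one class for the $\sigma_i$ and one for each puncture), which is what one would expect for a braid group of an $m$ times punctured disk.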
We consider three Heegaard tangles $U_1$, $U_2$, $U_3$ defined by Figure 5. We have
\[
\begin{aligned}
U_1 \cap H &= \{1, \dots, n\} \times \{0\} \\
U_2 \cap H &= \{1, \dots, n + 1\} \times \{0\} \\
U_3 \cap H &= \{1, \dots, n - 1,\ n - \tfrac{2}{3},\ n - \tfrac{1}{3},\ n\} \times \{0\}.
\end{aligned}
\]
In the next lemma, we will prove that $G(U_1)$ and $G(U_2)$ are isomorphic in a precise sense. Of course, $G_0(U_1)$ is just the braid group of the $n$ times punctured disk. By definition, $G(U_1) = G_0(U_1)/K(U_1)$. Combining (5.3) and Proposition 5.4 then shows that $G(U_1)$ is presented by generators (5.4) and relations (5.5)–(5.10) (with $m = n$) as well as
\[ g_{i,n-1}\, g_{in} = 1 \qquad \text{for all } i \in \{1, \dots, k\}. \tag{5.11} \]
Similarly, $G(U_2)$ is presented by generators (5.4) and relations (5.5)–(5.10) (with $m = n + 1$) and
\[ g_{i,n-1}\, g_{in} = 1 \qquad \text{for all } i \in \{1, \dots, k\} \tag{5.12} \]
\[ g_{in}\, g_{i,n+1} = 1 \qquad \text{for all } i \in \{1, \dots, k\}. \tag{5.13} \]
Lemma 5.5. There is a (unique) isomorphism $f\colon G(U_2) \to G(U_1)$ such that
\[
\begin{aligned}
f(\sigma_i) &= \sigma_i \\
f(g_{ip}) &= g_{ip} \qquad (p \ne n + 1) \\
f(g_{i,n+1}) &= g_{i,n-1}.
\end{aligned}
\]
Proof. We need to prove that the substitution
\[ g_{i,n+1} \longmapsto g_{i,n-1} \tag{5.14} \]
takes any relation for $G(U_2)$ to one of the relations for $G(U_1)$ or a consequence of them. (It is clear that all relations of $G(U_1)$ are obtained this way.)

First consider (5.7) with $q = n + 1$. Suppose $p < n$. Then we have the following computation in $G(U_1)$:
\[ [g_{ip}, g_{j,n-1}] \overset{(5.12)}{=} [g_{ip}, g_{j,n}^{-1}] \overset{(5.7)}{=} 1. \tag{5.15} \]
The substitution (5.14) takes (5.7) to $[g_{ip}, g_{j,n-1}] = 1$ which is true in $G(U_1)$ by (5.15). Suppose $p = n$. Then the following holds in $G(U_1)$:
\[ [g_{in}, g_{j,n-1}] \overset{(5.11)}{=} [g_{i,n-1}^{-1}, g_{jn}^{-1}] \overset{(5.7)}{=} 1. \tag{5.16} \]
But the substitution (5.14) takes (5.7) to $[g_{in}, g_{j,n-1}] = 1$ which is true by (5.16).

Our substitution takes (5.8) with $p = n + 1$ to $\sigma_i\, g_{i+1,n-1}\, \sigma_i^{-1}\, g_{i,n-1}^{-1} = 1$ which is true in $G(U_1)$ by (5.8). Likewise, the substitution takes (5.9) with $p = n + 1$ to $[g_{i,n-1}, \sigma_i g_{i,n-1} \sigma_i] = 1$ which is true in $G(U_1)$ by (5.9). Also,
the substitution takes (5.10) with $p = n + 1$ to $[g_{i,n-1}, \sigma_j] = 1$ which is true in $G(U_1)$ by (5.10). The relation (5.12) is just (5.11). Our substitution takes (5.13) to a void statement. All relations for $G(U_2)$ that we have not mentioned so far do not involve $g_{i,n+1}$ and are clearly taken to a relation for $G(U_1)$.

We have
\[ H \setminus U_3 \subset H \setminus U_1. \tag{5.17} \]
Indeed, the difference between these two sets is precisely $\{n - \tfrac{2}{3},\ n - \tfrac{1}{3}\} \times \{0\}$. The inclusion (5.17) yields an inclusion $M_0(U_3) \subset M_0(U_1)$ which in turn induces a surjective map of their fundamental groups
\[ p\colon G_0(U_3) \twoheadrightarrow G_0(U_1). \tag{5.18} \]
Lemma 5.6. We have $G(U_3) \cong G(U_1)$ and $pK(U_3) = K(U_1)$.

Proof. One proves that there exists an isomorphism $q\colon G(U_3) \to G(U_1)$ by applying 5.5 twice, once on $(U_1, U_2)$ and once on $(U_2, U_3)$. Inspection of 5.5 and the presentations of $G(U_i)$ also shows that $q$ can be taken to be induced by $p$. By the definition of $K(U_i)$ (5.2) it follows that $pK(U_3) = K(U_1)$.

Proof of the main result 4.3. Let $T_1$, $T_3$ be two saturated Heegaard tangles of type $[-2, 2]$ with Heegaard plane $H = p_3^{-1}(0)$, and suppose that $T_1$, $T_3$ differ by an elementary stabilisation. Our aim is to prove $G(T_1) \cong G(T_3)$. After applying a Heegaard isotopy to $T_1$ and $T_3$ if necessary, and changing the sign of the $z$-coordinate in both of them if necessary, we may assume
\[ T_1 \cap (\mathbb{C} \times [-1, 1]) = U_1, \qquad T_3 \cap (\mathbb{C} \times [-1, 1]) = U_3, \qquad T_1 \setminus U_1 = T_3 \setminus U_3 \]
where $U_1$ and $U_3$ are as in Figure 5. (The cap in $T_1 \cap (\mathbb{C} \times [-1, 1])$ exists because $T_1$ is saturated.) Note that
\[ G_0(U_i) = G_0(T_i) \qquad (i = 1, 3) \]
which therefore contains both $K(U_i)$ and $K(T_i)$. In Lemma 5.6 we saw that the natural map $p$ in (5.18) takes $K(U_3)$ to $K(U_1)$. We will show that it also takes $K(T_3)$ to $K(T_1)$.

Let $C$ denote the set of caps and cups of $T_1 \setminus U_1 = T_3 \setminus U_3$. By 5.2 we have
\[ K(U_i) \subset K(T_i) \qquad (i = 1, 3). \]
Moreover one can associate, to each $c \in C$, two conjugacy classes $Y_i(c) \subset G_0(U_i)$ ($i = 1, 3$) such that
\[ K(T_i) = \bigl\langle K(U_i),\ \{ Y_i(c) \mid c \in C \} \bigr\rangle, \qquad pY_3(c) = Y_1(c) \ \text{ for all } c \in C. \]
We find
\[
\begin{aligned}
pK(T_3) &= p \bigl\langle K(U_3),\ \{ Y_3(c) \mid c \in C \} \bigr\rangle \\
&= \bigl\langle pK(U_3),\ \{ pY_3(c) \mid c \in C \} \bigr\rangle \\
&= \bigl\langle K(U_1),\ \{ Y_1(c) \mid c \in C \} \bigr\rangle = K(T_1)
\end{aligned}
\]
as promised. Since the map $p$ is surjective we get
\[ G(T_3) = G_0(T_3)/K(T_3) \cong G_0(T_1)/K(T_1) = G(T_1). \]
This finishes the proof of our main result 4.3.
6. From groups to homology

In Section 4 we defined a tangle invariant $F(T)$ which is a group. As we saw in the introduction, one hopes to turn this invariant into a more manageable invariant.

Suppose that we have, for each tangle $T$, a representation
\[ r(T)\colon F(T) \longrightarrow GL\bigl(U(T)\bigr) \]
defined over any ring. Suppose moreover that the pair $(F(T), r(T))$ is a tangle invariant up to isomorphism, in the sense that for any two isotopic tangles $T_1$, $T_2$ there exists a commutative diagram
\[
\begin{array}{ccc}
F(T_1) & \xrightarrow{\ r(T_1)\ } & GL\bigl(U(T_1)\bigr) \\
\downarrow \cong & & \downarrow \cong \\
F(T_2) & \xrightarrow{\ r(T_2)\ } & GL\bigl(U(T_2)\bigr)
\end{array}
\]
whose vertical arrows are isomorphisms, the right-hand side one coming from an isomorphism $U(T_1) \to U(T_2)$. (Let us call a family $\{r(T)\}_T$ like this good.) Then the isomorphism class of the homology module $H_*\bigl(F(T), U(T)\bigr)$ is a tangle invariant. For example, if $k = 1$ and $U(T)$ is 1-dimensional then $H_1\bigl(F(T), U(T)\bigr)$ is the well-known coloured Alexander module.

For any tangle $T$ and any $k > 0$ we have a continuous map
\[ M(T) \longrightarrow BS_k, \qquad X \longmapsto \{ x + iy \mid (x + iy, z) \in X \} \]
(projection on the first two real coordinates) which induces a map to the braid group $G(T) \to B_k$. If also $T$ is a saturated Heegaard tangle then $F(T) = G(T)$ (see Section 4) so that we have a map
\[ v(T)\colon F(T) \longrightarrow B_k. \]
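The projection $M(T) \to BS_k$ is simple to write down on points. A minimal Python sketch follows, assuming a configuration is given as a set of triples $(x, y, z)$ of real numbers; the function name `project` is hypothetical, and avoidance of the tangle is not checked here.

```python
def project(X):
    """Send a horizontal configuration X of points (x, y, z) to {x + iy} in C."""
    heights = {z for (_, _, z) in X}
    if len(heights) != 1:
        raise ValueError("configuration is not contained in one horizontal plane")
    return frozenset(complex(x, y) for (x, y, _) in X)
```

Because all points of $X$ share the same height, distinct points have distinct projections, so the image again has $k$ elements and is a point of $BS_k$.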
Every representation $w\colon B_k \to GL(U)$ (not depending on any tangle) gives rise to a representation
\[ F(T) \xrightarrow{\ v(T)\ } B_k \xrightarrow{\ w\ } GL(U) \]
and therefore to a tangle invariant $H_*(F(T), U)$. This is one way to produce good families of representations $\{r(T)\}_T$ but certainly not the only way.

It would be interesting to compute any of the homology modules of $F(T)$ sketched in this section, or to know if they reveal information about links.

References
[1] Bellingeri, Paolo, On presentations of surface braid groups, J. Algebra 274 (2004), no. 2, 543–563.
[2] Bigelow, Stephen, Braid groups are linear, J. Amer. Math. Soc. 14 (2001), no. 2, 471–486.
[3] Bigelow, Stephen, A homological definition of the Jones polynomial, in Invariants of knots and 3-manifolds (Kyoto, 2001), 29–41, Geom. Topol. Monogr. 4, Geom. Topol. Publ., Coventry, 2002.
[4] Birman, Joan S., On the stable equivalence of plat representations of knots and links, Canad. J. Math. 28 (1976), no. 2, 264–290.
[5] Krammer, Daan, Braid groups are linear, Ann. of Math. (2) 155 (2002), no. 1, 131–156.
[6] Lambropoulou, Sofia, Braid structures in knot complements, handlebodies and 3-manifolds, in Knots in Hellas '98 (Delphi), 274–289, Ser. Knots Everything, 24, World Sci. Publishing, River Edge, NJ, 2000.
[7] Lawrence, R.J., Homological representations of the Hecke algebra, Commun. Math. Phys. 135 (1990), no. 1, 141–191.

Daan Krammer
Mathematics Institute
University of Warwick
Coventry CV4 7AL, United Kingdom
e-mail:
[email protected]
Invariant Measures for Multiparameter Diagonalizable Algebraic Actions – A Short Survey

Elon Lindenstrauss
1. Classifying measures on the one-torus

One of the simplest dynamical systems is the map $\times_n\colon x \mapsto nx \bmod 1$ on the unit interval, where $n$ is any natural number. In order to make this map continuous, we think of it as a map on the 1-torus $\mathbb{T} = \mathbb{R}/\mathbb{Z}$. This system is very well understood, and it has many closed invariant sets and many invariant probability measures. Indeed, let $\tau\colon \Sigma = \{0, \dots, n - 1\}^{\mathbb{Z}} \to \mathbb{R}/\mathbb{Z}$ be the map
\[ \tau(a_1, a_2, \dots) = \sum_{i=1}^{\infty} n^{-i} a_i. \]
Then any shift invariant probability measure $\nu$ on $\Sigma$, for example an i.i.d. Bernoulli measure, gives rise to the $\times_n$-invariant measure $\mu = \tau_* \nu$ (and similarly for sets). Every $\times_n$-invariant probability measure on $\mathbb{R}/\mathbb{Z}$ is of this form, and moreover for measures $\mu$ for which $\mu(\{0\}) = 0$ the map $\tau_*$ is also one-to-one.

However, $\mathbb{R}/\mathbb{Z}$ has additional structure: it is an abelian group, and for a fixed $n$, the map $\times_n$ is just one out of many endomorphisms of this group. In 1967, Hillel Furstenberg considered the joint action of two such endomorphisms $\times_n$ and $\times_m$ for $n$ and $m$ multiplicatively independent (i.e., not powers of the same integer).¹ This $\mathbb{Z}_+^2$ action turns out to be much more subtle. In his landmark paper [8] Furstenberg introduced the notion of disjointness in dynamical systems and ergodic theory, a notion which has proven quite central in the modern theory of these subjects, and also proved as a byproduct that the closed subsets $C \subset \mathbb{R}/\mathbb{Z}$ satisfying $\times_n(C) \subset C$ and $\times_m(C) \subset C$ are either $\mathbb{R}/\mathbb{Z}$ or finite sets of rationals.

The analogous question for measures was also posed by Furstenberg (though apparently not in writing) in 1967, namely classifying the probability measures on $\mathbb{R}/\mathbb{Z}$ invariant under $\times_n$ and $\times_m$. This has proven substantially more difficult to resolve than the topological question. Furstenberg conjectured that any such invariant measure is a linear combination of Lebesgue measure and atomic measures supported on finite orbits of the semigroup $\{\times_{n^l m^k}\}$.
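The relation between the shift on digit sequences and the map $\times_n$, and the finiteness of orbits of rational points, can both be checked with exact arithmetic. The following Python sketch is illustrative only: digit sequences are assumed to be finite (followed by zeros), and the function names `times`, `tau` and `orbit` are hypothetical.

```python
from fractions import Fraction


def times(n, x):
    """The map x -> n x mod 1 on R/Z, represented by [0, 1)."""
    return (n * x) % 1


def tau(n, digits):
    """tau(a_1, a_2, ...) = sum_i n^{-i} a_i for a finite digit string."""
    return sum(Fraction(a, n ** i) for i, a in enumerate(digits, start=1)) % 1


def orbit(x, n, m):
    """Orbit of x under the semigroup generated by times_n and times_m.

    Terminates for rational x, whose orbit is finite.
    """
    seen, todo = set(), [x % 1]
    while todo:
        y = todo.pop()
        if y not in seen:
            seen.add(y)
            todo += [times(n, y), times(m, y)]
    return seen
```

For instance, applying $\times_3$ to $\tau(a_1, a_2, \dots)$ gives $\tau(a_2, a_3, \dots)$, and the orbit of $1/5$ under $\times_2$ and $\times_3$ consists of the four points $j/5$ with $1 \le j \le 4$; finite orbits of this kind carry the atomic measures in Furstenberg's conjecture.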
The author is supported by a Clay Research Fellowship; partial support was also received from NSF grant DMS-0434403.
¹ One can also study the interplay between $\times_n$ for only one $n$ and the additive structure on $\mathbb{R}/\mathbb{Z}$; this also leads to interesting questions. For more details, see [21].
To date the best result towards Furstenberg's conjecture is due to Daniel Rudolph [37] and Aimee Johnson [10], who have shown that any measure $\mu$ which is invariant under $\times_n$ and $\times_m$ is a linear combination of Lebesgue measure and measures which have zero entropy with respect to the map $\times_n$; the first substantial result towards Furstenberg's conjecture, which is weaker than the Rudolph-Johnson theorem, is due to Russell Lyons [25]. The Rudolph-Johnson theorem is completely equivalent to the statement that the only $\times_n, \times_m$-ergodic and invariant measure on $\mathbb{R}/\mathbb{Z}$ with entropy $h_\mu(\times_n) > 0$ is Lebesgue measure. For $n$, $m$ relatively prime, Bernard Host [9] has given a proof of a sharper version of Rudolph's theorem, which also has the advantage that it is more easily quantifiable (for the extension to more general $n$, $m$ see [16]; see also [21]).

The $\times_n, \times_m$ action on $\mathbb{R}/\mathbb{Z}$ is prototypical for a much larger class of algebraic multiparameter actions, and these actions occur naturally in many contexts. We limit ourselves in the remainder of this note exclusively to the case of $\mathbb{R}^k$-actions on locally homogeneous spaces with $k \ge 2$. This does not cover many interesting and important cases, such as actions on tori, and actions on totally disconnected groups. I also do not cover my own work on arithmetic quantum unique ergodicity, which is closely related to the topics I survey here; the interested reader can consult [17] or the expository papers [20, 19].

2. More general algebraic actions

The algebraic actions we will consider here are by affine transformations on $\Gamma \backslash G / K$ where $G$ is a locally compact group (usually a linear group), $\Gamma < G$ a discrete subgroup, and $K < G$ compact, where by affine transformation we mean a map of the form $\Gamma g K \mapsto \Gamma \Phi(g) h K$ with $\Phi$ an endomorphism of $G$ and $h \in G$; to make this well-defined, we need to assume that $\Phi(\Gamma) \subset \Gamma$, $\Phi(K) \subset K$ and that $h \in G$ commutes with every $k \in K$, i.e., lies in the centralizer $C_G(K)$.
There is little loss of generality in specializing to the special case where this endomorphism is the identity, i.e., looking at the action of a closed subgroup $H < G$ by right translations on $\Gamma \backslash G / K$ where $K \subset C_G(H)$ is as before compact. Even Furstenberg's $\times_n, \times_m$ conjecture can be presented in this way for a suitable $G$. Also, while there certainly are interesting issues arising in the study of more general (even abelian!) groups $G$ such as those considered in [38], we will consider only S-algebraic groups $G$, i.e., groups $G$ which are the product of finitely many linear algebraic groups over local fields of characteristic 0 (without loss of generality, $\mathbb{R}$ or $\mathbb{Q}_p$ for some prime $p$).

Now if $H$ is generated by unipotent one parameter elements (or even just by unipotents) many of the dynamical properties of this action are well understood. In the late 1980's, Gregory Margulis solved the long-standing Oppenheim conjecture about nonrational indefinite quadratic forms in three or
more variables by classifying the closed $H = SO(2, 1)$ invariant subsets in $SL(3, \mathbb{Z}) \backslash SL(3, \mathbb{R})$ (see [2] for a very accessible treatment). Marina Ratner completely classified the $H$-invariant measures for any subgroup generated by unipotent one parameter elements (this has been proved in a series of papers culminating in [33]), and used this to classify orbit closures of such $H$, indeed even the behavior of single orbits [34]. Ratner's theorem, which has been extended to the S-arithmetic context (there are two treatments: one by Ratner [35] and one by Margulis and George Tomanov [27]), has found many diverse applications.

This, however, does not cover, e.g., the case of $H$ a commutative diagonalizable group. Indeed, it seems that in view of Ratner's theorem, understanding the action of commutative groups (e.g., their invariant measures) is probably the main missing step in understanding actions of general closed, connected $H$ (see [15, Sec. 4b]).

The following is a prominent example: for $n \ge 2$, let $X_n$ be the space $SL(n, \mathbb{Z}) \backslash SL(n, \mathbb{R})$. We can identify $X_n$ with the space of lattices in $\mathbb{R}^n$ of covolume one by assigning to every $SL(n, \mathbb{Z}) g$ the lattice $\mathbb{Z}^n g < \mathbb{R}^n$. The space $X_n$ is not compact: a sequence of lattices $x_i$ is bounded in $X_n$ if and only if there is some $\delta > 0$ so that every nonzero vector $v_i$ in the lattice $x_i$ has size $\ge \delta$. Now take $H$ to be the group of diagonal $n \times n$ matrices of determinant 1.

There is a sharp dichotomy between $n = 2$ and $n \ge 3$. If $n = 2$ then $X_2$ is isomorphic to a double cover of the unit cotangent bundle of the finite volume surface $SL(2, \mathbb{Z}) \backslash \mathbb{H}$. Under this isomorphism, the action of $H$ becomes the geodesic flow on $SL(2, \mathbb{Z}) \backslash \mathbb{H}$. As is well-known this flow is a prototypical hyperbolic flow, and has good symbolic codings which were already pioneered by Koebe, Morse and Hedlund at the beginning of the 20th century.
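The compactness criterion for $X_n$ can be seen at work in the simplest case. The sketch below, in Python, is an illustration under stated assumptions: it takes $n = 2$, uses the row-vector convention $\mathbb{Z}^2 g$ of the text, and finds a shortest nonzero lattice vector by brute force over a small box of integer coefficients, which is only reliable for matrices like the ones used here. The names `shortest_vector` and `diag_flow` are hypothetical.

```python
import math
from itertools import product


def shortest_vector(g, radius=5):
    """Length of a shortest nonzero vector (a, b) g with |a|, |b| <= radius."""
    best = math.inf
    for a, b in product(range(-radius, radius + 1), repeat=2):
        if (a, b) != (0, 0):
            v = (a * g[0][0] + b * g[1][0], a * g[0][1] + b * g[1][1])
            best = min(best, math.hypot(*v))
    return best


def diag_flow(t):
    """The diagonal matrix diag(e^t, e^{-t}) of determinant 1."""
    return [[math.exp(t), 0.0], [0.0, math.exp(-t)]]
```

Along the orbit of the standard lattice $\mathbb{Z}^2$ under $\mathrm{diag}(e^t, e^{-t})$ the shortest vector has length $e^{-|t|}$, which tends to $0$, so by the criterion above this orbit is unbounded in $X_2$.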
This is closely analogous to the situation with the $\times_3$ map acting on $\mathbb{R}/\mathbb{Z}$ which has good symbolic codings.² And using these symbolic codings, one can construct lots of invariant measures and lots of invariant sets with various properties, for example closed $H$-invariant sets which contain no periodic points and have positive fractional Hausdorff dimension.

For $n \ge 3$ the situation is drastically different. There, the dynamics of the diagonal group $H$ is much more rigid. For example, in [28] Margulis made the following conjectures regarding this action:

Conjecture 2.1 (Margulis). Let $X_n = SL(n, \mathbb{Z}) \backslash SL(n, \mathbb{R})$ and $H < SL(n, \mathbb{R})$ as above, with $n \ge 3$.
(1) Any bounded $H$ orbit is in fact a compact orbit.
(2) Any $H$-invariant probability measure $\mu$ on $X_n$ is a linear combination of algebraic measures (i.e., $L$-invariant probability measures on a closed orbit of a closed subgroup $L < SL(n, \mathbb{R})$ containing $H$).³

² Though there is one important difference: for $SL(2, \mathbb{Z}) \backslash SL(2, \mathbb{R})$ we are considering symbolic codings for a flow and not for a single transformation, which is somewhat more complicated.
³ This part of the conjecture is not stated explicitly there, but follows from [28, Conjecture 2].
Similar conjectures have also been made by Furstenberg (which prompted work of Shahar Mozes towards the topological question [29, 30]) as well as Anatole Katok and Ralf Spatzier [13] (who also made the first substantial progress towards classifying such invariant measures in the context of homogeneous spaces).

Note that there is a subtlety here which was not present in the cases covered by Ratner's theorem. There is no one parameter subgroup of $H$ whose action is in any way rigid: only the action of the full group $H$ (or at least a two-dimensional subgroup of this group) is rigid. This can be used to construct non-algebraic orbit closures or invariant probability measures, even for $H$ the group of $n \times n$ diagonal matrices and $G = SL(n, \mathbb{R})$ on quotients $\Gamma \backslash G$ for certain lattices $\Gamma < G$, where one can create situations where the action essentially degenerates to the action of a one parameter subgroup. Because of certain coincidences, this does not happen on $X_n = SL(n, \mathbb{Z}) \backslash SL(n, \mathbb{R})$, but even in $X_n$ one can easily construct non-algebraic $H$-invariant and ergodic Radon measures using the same idea. This complication has been pointed out by M. Rees in the unpublished [36] and independently by Mozes in [30]; a nice account of the Rees example and some generalizations can be found in [4, Section 9].

As mentioned before, Margulis proved the long-standing Oppenheim conjecture by classifying orbit closures of the group $SO(2, 1)$ in $X_3$. Similarly, even the simplest(?) $n = 3$ case of either (1) or (2) in Conjecture 2.1 will give a proof of the following conjecture of Littlewood posed roughly at the same time as Oppenheim's conjecture (and much more):

Conjecture 2.2 (Littlewood (c. 1930)). Let $\|x\|$ denote the distance from $x \in \mathbb{R}$ to the closest integer. Then
\[ \liminf_{n \to \infty} n\, \|n\alpha\|\, \|n\beta\| = 0 \tag{2.1} \]
for any real numbers $\alpha$ and $\beta$.

This implication has been discovered in a different terminology long before Furstenberg's pioneering work regarding the rigidity of multiparameter actions (and long before Margulis' proof of the Oppenheim conjecture using dynamical techniques) by J.W.S. Cassels and H.P.F. Swinnerton-Dyer [1]; however, it was Margulis who first recast this in dynamical terms [26].

Unlike the case of the one torus, there is no reason to believe that in the context of actions on locally homogeneous spaces the topological question, e.g., (1) of Conjecture 2.1, is substantially easier than the measure theoretic question (e.g., (2) of the same conjecture). It is, however, true that using [1] or [24] one can deduce (1) of Conjecture 2.1 from (2).

3. Partial results towards classifying invariant measures in the locally homogeneous case

In [13, 14], Katok and Spatzier made the first steps towards classifying invariant measures in the locally homogeneous case, in particular covering the case of the
diagonal group (which we denote as before by H) acting on $X_n$. Their work contains some elements which are geometric analogues of the techniques used by Rudolph for the one-dimensional torus in [37], as well as some additional new ingredients, particularly in handling actions of abelian groups with nontrivial Jordan form. A good exposition of their method (though without mention of the locally homogeneous case), clarifying some aspects of the original work, can be found in [11]. Later, Boris Kalinin and Spatzier further developed this method [12], with nice ergodic theoretic applications that we discuss in the next section.

For classifying measures, not all elements of H are created equal. The most important are those nontrivial elements $\operatorname{diag}(h_1, h_2, \dots, h_n) \in H$ which have at least two equal entries, say $h_i = h_j$. Such elements act isometrically, indeed only by translations, on the leaves of the H-invariant foliation of $X_n$ into orbits of the unipotent group
\[ U_{ij} = \{ (u_{kl})_{kl} : u_{kk} = 1 \text{ for all } k,\ u_{kl} = 0 \text{ for all } k \neq l \text{ except } (k,l) = (i,j) \}, \]
and there is a substantial amount of information which can be learned about the measure µ merely because it is invariant under even a single element with this partial isometric property. Implicitly, this partial isometric feature of the action of some elements of the acting group is used already in [37].

For $X_n$ the techniques of Katok and Spatzier (as well as the later work of Kalinin and Spatzier) give that if the subgroups $H^{ij} = \{\operatorname{diag}(h_1, h_2, \dots, h_n) : h_i = h_j\}$ act ergodically⁴ with respect to µ, and if there is any one parameter subgroup of H which acts with positive entropy, then µ must be algebraic. Of these two assumptions, the assumption regarding ergodicity of the groups $H^{ij}$ (which did not appear in Rudolph’s theorem) is the more restrictive one.
The reason for this is that in a typical application of measure classification results such as Ratner’s theorems or the conjectured measure classification for the diagonal flow in Conjecture 2.1, the measure one analyzes is obtained as a weak∗ limit of measures on which we have some control (for example, empirical measures on orbit segments). Ergodicity properties are not stable under weak∗ limits. Entropy, on the other hand, is well behaved under weak∗ limits: though this is not true in the greatest generality, in the type of systems we consider entropy is an upper semicontinuous function of the measure with respect to the weak∗ topology.

⁴ The assumption that the groups $H^{ij}$ act ergodically can be weakened somewhat, and the analogous weaker statement is very important in some special cases, such as for the ×n, ×m-action on R/Z or for $\mathbb{Z}^{k-1}$ actions on k-dimensional tori, but it does not seem to be very useful in the locally homogeneous context.
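The partially isometric behavior of diagonal elements with two equal entries, discussed above, can be checked by hand on small matrices. The following Python sketch is only an illustration and not part of the original argument: the helper names `diag`, `unipotent`, `matmul` and `conjugate` are hypothetical, indices are 0-based, and exact rational arithmetic is used. Conjugating an element of $U_{ij}$ by $\operatorname{diag}(h_1, \dots, h_n)$ multiplies its (i, j) entry by $h_i/h_j$, so the element is left unchanged exactly when $h_i = h_j$.

```python
from fractions import Fraction


def diag(entries):
    """Diagonal matrix (list of rows) with the given diagonal entries."""
    n = len(entries)
    return [[entries[r] if r == c else Fraction(0) for c in range(n)]
            for r in range(n)]


def unipotent(n, i, j, t):
    """Element of U_ij: the identity plus t in position (i, j), with i != j."""
    u = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    u[i][j] = Fraction(t)
    return u


def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def conjugate(h, u):
    """Return a * u * a^{-1} for a = diag(h), all entries of h nonzero."""
    a = diag([Fraction(x) for x in h])
    a_inv = diag([1 / Fraction(x) for x in h])
    return matmul(matmul(a, u), a_inv)


u = unipotent(3, 0, 1, 5)
# h_0 = h_1: the element of U_01 is fixed by the conjugation.
fixed = conjugate([2, 2, Fraction(1, 4)], u)
# h_0 != h_1: the (0, 1) entry is rescaled by h_0 / h_1 = 4.
scaled = conjugate([2, Fraction(1, 2), 1], u)
```

Here `fixed` equals `u`, while the (0, 1) entry of `scaled` is 20 rather than 5, which is the expansion that is absent in the singular directions.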
In the last couple of years there has been substantial progress towards eliminating the need for an ergodicity assumption in the locally homogeneous context, and research in this direction is still in progress. There are two complementary ways of proving such results. One of them, developed by Manfred Einsiedler and Katok in [4], uses non-commutativity of invariant contracting foliations, such as the foliations by orbits of $U_{ij}$ and $U_{jk}$ for $X_n$, together with a simple but very useful lemma regarding a product structure of some conditional measures under the multiparameter diagonalizable flows discussed here. In particular, Einsiedler and Katok have proved that any H-invariant measure on $X_n$ which has positive entropy under all elements of the diagonal group is algebraic; in fact, under this assumption the measure must be Haar measure on $X_n$. Einsiedler and Katok have generalized their results to a very general class of groups (the original paper covered only R-split groups) in [5].

This approach cannot be used for every G. For example, for the action of the two-dimensional diagonal group on Γ\ SL(2, R) × SL(2, R) there are no non-commuting invariant contracting foliations. There a completely different method needs to be used: one that uses some of the ideas and techniques in Ratner’s work on the rigidity of unipotent flows, particularly from her earlier works [31, 32]. In [17], we show that a measure on Γ\ SL(2, R) × SL(2, R) which is invariant and ergodic under the action of the two-dimensional diagonal group (in fact it is enough that it be invariant under the action of the diagonal group in one SL(2, R) factor, and recurrent under the action of the other SL(2, R) factor, which is a substantially weaker assumption; see [17] for the exact statement) and which has positive entropy with respect to some one parameter diagonal subgroup is algebraic (again, in this case, Haar measure on the quotient).
Combining these two techniques, one has the following results for $X_n$:

Theorem 3.1 (Einsiedler, Katok and L. [6]). Let µ be an H-invariant and ergodic measure on $X_n$. Assume that there is some one parameter subgroup of H with respect to which µ has positive entropy. Then µ is algebraic, and is not compactly supported. If n is prime then µ is the Haar measure on $X_n$.

This theorem, precisely because of the good behavior of entropy with respect to weak∗ limits, implies the following partial result towards Littlewood’s conjecture:

Theorem 3.2 (Einsiedler, Katok and L. [6]). The set of $(\alpha, \beta) \in \mathbb{R}^2$ for which $\liminf_{n\to\infty} n\,\|n\alpha\|\,\|n\beta\| > 0$ has Hausdorff dimension zero.

As mentioned above, a key role in the proof of these measure classification results is played by the singular directions in which the action is partially isometric. Understanding such actions for their own sake seems to be a fruitful direction of research; see [22, 23, 18] for more details. The results of [17] are also best seen in this light.
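To get a feeling for the quantity $n\,\|n\alpha\|\,\|n\beta\|$ appearing in Littlewood's conjecture, one can tabulate its running minimum for a given pair (α, β). The Python sketch below is an illustration only: the function names are hypothetical, it works in floating point (so it is meaningful only for moderate n), and a finite computation of this kind can of course neither prove nor disprove a statement about a lower limit.

```python
import math


def dist_to_nearest_int(x):
    """Distance from the real number x to the closest integer."""
    return abs(x - round(x))


def littlewood_running_min(alpha, beta, n_max):
    """Smallest value of n * ||n alpha|| * ||n beta|| over 1 <= n <= n_max.

    Returns the pair (minimum value, first n at which it is attained).
    """
    best, best_n = math.inf, None
    for n in range(1, n_max + 1):
        value = n * dist_to_nearest_int(n * alpha) * dist_to_nearest_int(n * beta)
        if value < best:
            best, best_n = value, n
    return best, best_n


# Running minimum for the pair (sqrt(2), sqrt(3)) over two ranges of n.
short_run = littlewood_running_min(math.sqrt(2), math.sqrt(3), 1_000)
long_run = littlewood_running_min(math.sqrt(2), math.sqrt(3), 100_000)
```

By construction the minimum over the longer range is at most the minimum over the shorter one; how fast it decreases depends on the pair (α, β).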
4. Joining and isomorphism rigidity of diagonalizable actions on locally homogeneous spaces

Earlier we mentioned the dichotomy between the action of the diagonal group on $X_2$ and the action of the corresponding group on $X_n$ for n ≥ 3. In this section, we give another facet of this dichotomy.⁵ We start with generalities: let H be some group, and suppose this group H acts on the two spaces X and X′. Let m, m′ be H-invariant measures on X and X′ respectively. A joining of (X, H, m) and (X′, H, m′) is a measure on X × X′, invariant under the diagonal action of H on X × X′, whose push forward under the projection to X (X′) is m (m′ respectively). Any measurable isomorphism φ between (X, m) and (X′, m′) commuting with the H action gives rise to a joining between (X, H, m) and (X′, H, m′) supported on the graph of this isomorphism.

For n = 2 it is possible to construct non-algebraic joinings of ($X_2$, H, m) with itself or with other one parameter flows, and a similar statement should be true for isomorphisms. This is in stark contrast with what happens for n ≥ 3. Kalinin and Spatzier [12] proved a general isomorphism rigidity theorem for such multidimensional actions; in particular they proved:

Theorem 4.1 (Kalinin and Spatzier [12]). Let $G_1, G_2$ be connected semisimple Lie groups without compact factors. For i = 1, 2, let $\Gamma_i < G_i$ be a uniform lattice, $m_i$ Haar measure on $\Gamma_i \backslash G_i$, and let $\rho_i$ be an embedding of $\mathbb{R}^k$ (k ≥ 2) into the Cartan subgroup of $G_i$. Then any measurable isomorphism between the $\mathbb{R}^k$ actions corresponding to $\rho_i$ on $(\Gamma_i \backslash G_i, m_i)$ is algebraic.

The assumption that the lattices $\Gamma_i$ are uniform does not seem to be essential for the proof. More generally, one can consider joinings of such actions. Because isomorphisms give rise to a particular kind of joining, this is a more general question, and as we shall see below it has applications to equidistribution.
Kalinin and Spatzier give some results towards classifying joinings, but because they rely on the technology of [13] they need to assume ergodicity of the joining with respect to one parameter subgroups. Using the results of Einsiedler and Katok in [5], and some ideas of Einsiedler and Tom Ward from [3], jointly with Einsiedler we have the following:

Theorem 4.2 (Einsiedler-L. [7]). Let $G_1, G_2$ be connected semisimple Lie groups, $\Gamma_i < G_i$ a lattice, $m_i$ Haar measure on $\Gamma_i \backslash G_i$, and $\rho_i$ an embedding of $\mathbb{R}^k$ (k ≥ 2) into the Cartan subgroup of $G_i$ such that the image of $\rho_i(\mathbb{R}^k)$ on every factor of $G_i$ has dimension ≥ 2. Then any ergodic joining between the $\mathbb{R}^k$ actions corresponding to $\rho_i$ on $(\Gamma_i \backslash G_i, m_i)$ is algebraic.

⁵ Again we limit ourselves to flows on locally homogeneous spaces; a lot of interesting work has been done in other contexts by Einsiedler, Katok, Kalinin, Schmidt, Thouvenot, Ward and many others.
In particular, any self joining of $X_n$ for n ≥ 3 is algebraic. Theorem 4.2 can be used, for example, to show the following:

Theorem 4.3 (Einsiedler-L. [7]). Let G be a connected simple Lie group of R-rank ≥ 2. Let $\Gamma_1, \Gamma_2$ be two lattices which cannot be conjugated so as to be commensurable. Suppose $x_1 \in \Gamma_1 \backslash G$ and $x_2 \in \Gamma_2 \backslash G$ have the property that their orbit under the R-Cartan subgroup H < G is equidistributed.⁶ Then the same holds for the orbit of $(x_1, x_2)$ in $\Gamma_1 \backslash G \times \Gamma_2 \backslash G$ under the diagonal embedding of H.

⁶ More precisely, let $\nu_r$ be the uniform measure on the ball of radius r in H centered at the identity (with respect to, e.g., a left G-invariant Riemannian metric on G, bi-invariant under its maximal compact subgroup). Then the push-forward of $\nu_r$ to $\Gamma_i \backslash G$ under the map $g \mapsto x_i.g^{-1}$ tends as r → ∞ in the weak∗ topology to the Haar measure.

Acknowledgments. I would like to thank the organizing committee of the 4ECM for giving me the opportunity to present my work both orally and in writing to a wide mathematical community, as well as Ari Laptev for his patience.

References

[1] J.W.S. Cassels and H.P.F. Swinnerton-Dyer. On the product of three homogeneous linear forms and the indefinite ternary quadratic forms. Philos. Trans. Roy. Soc. London. Ser. A., 248:73–96, 1955.
[2] S.G. Dani and G.A. Margulis. Values of quadratic forms at integral points: an elementary approach. Enseign. Math. (2), 36(1-2):143–174, 1990.
[3] M. Einsiedler and T. Ward. Entropy geometry and disjointness for zero-dimensional algebraic actions. Preprint, 21 pages.
[4] Manfred Einsiedler and Anatole Katok. Invariant measures on G/Γ for split simple Lie groups G. Comm. Pure Appl. Math., 56(8):1184–1221, 2003. Dedicated to the memory of Jürgen K. Moser.
[5] Manfred Einsiedler and Anatole Katok. Rigidity of measures – the high entropy case, and non-commuting foliations. Preprint, 54 pages, 2004.
[6] Manfred Einsiedler, Anatole Katok, and Elon Lindenstrauss. Invariant measures and the set of exceptions to Littlewood’s conjecture. To appear in Annals of Math. (45 pages), 2004.
[7] Manfred Einsiedler and Elon Lindenstrauss. Joining of semisimple actions on locally homogeneous spaces. In preparation, 2004.
[8] Harry Furstenberg. Disjointness in ergodic theory, minimal sets, and a problem in Diophantine approximation. Math. Systems Theory, 1:1–49, 1967.
[9] Bernard Host. Nombres normaux, entropie, translations. Israel J. Math., 91(1-3):419–428, 1995.
[10] Aimee S.A. Johnson. Measures on the circle invariant under multiplication by a nonlacunary subsemigroup of the integers. Israel J. Math., 77(1-2):211–240, 1992.
[11] Boris Kalinin and Anatole Katok. Invariant measures for actions of higher rank abelian groups. In Smooth ergodic theory and its applications (Seattle, WA, 1999), volume 69 of Proc. Sympos. Pure Math., pages 593–637. Amer. Math. Soc., Providence, RI, 2001.
[12] Boris Kalinin and Ralf Spatzier. Rigidity of the measurable structure for algebraic actions of higher rank abelian groups. To appear in Ergodic Theory Dynam. Systems.
[13] A. Katok and R.J. Spatzier. Invariant measures for higher-rank hyperbolic abelian actions. Ergodic Theory Dynam. Systems, 16(4):751–778, 1996.
[14] A. Katok and R.J. Spatzier. Corrections to: “Invariant measures for higher-rank hyperbolic abelian actions” [Ergodic Theory Dynam. Systems 16 (1996), no. 4, 751–778; MR 97d:58116]. Ergodic Theory Dynam. Systems, 18(2):503–507, 1998.
[15] Dmitry Kleinbock, Nimish Shah, and Alexander Starkov. Dynamics of subgroup actions on homogeneous spaces of Lie groups and applications to number theory. In Handbook of dynamical systems, Vol. 1A, pages 813–930. North-Holland, Amsterdam, 2002.
[16] Elon Lindenstrauss. p-adic foliation and equidistribution. Israel J. Math., 122:29–42, 2001.
[17] Elon Lindenstrauss. Invariant measures and arithmetic quantum unique ergodicity. To appear in Annals of Math. (54 pages), 2003.
[18] Elon Lindenstrauss. Recurrent measures and measure rigidity. To appear in the proceedings of the II Workshop on Dynamics and Randomness, Santiago de Chile, Dec. 9-13, 2002. Eds. A. Maass, S. Martinez, J. San Martin. (25 pages), 2003.
[19] Elon Lindenstrauss. Rigidity of multiparameter actions. Submitted to the forthcoming volume(s) of the Israel J. of Math. dedicated to H. Furstenberg (26 pages), 2003.
[20] Elon Lindenstrauss. Arithmetic quantum unique ergodicity and adelic dynamics. Preprint (24 pages), 2004.
[21] Elon Lindenstrauss, David Meiri, and Yuval Peres. Entropy of convolutions on the circle. Ann. of Math. (2), 149(3):871–904, 1999.
[22] Elon Lindenstrauss and Klaus Schmidt. Invariant measures of nonexpansive group automorphisms. To appear in Israel J. Math. (28 pages), 2003.
[23] Elon Lindenstrauss and Klaus Schmidt. Symbolic representations of nonexpansive group automorphisms. To appear in Israel J. Math. (34 pages), 2003.
[24] Elon Lindenstrauss and Barak Weiss. On sets invariant under the action of the diagonal group. Ergodic Theory Dynam. Systems, 21(5):1481–1500, 2001.
[25] Russell Lyons. On measures simultaneously 2- and 3-invariant. Israel J. Math., 61(2):219–224, 1988.
[26] G.A. Margulis. Oppenheim conjecture. In Fields Medallists’ lectures, volume 5 of World Sci. Ser. 20th Century Math., pages 272–327. World Sci. Publishing, River Edge, NJ, 1997.
[27] G.A. Margulis and G.M. Tomanov. Invariant measures for actions of unipotent groups over local fields on homogeneous spaces. Invent. Math., 116(1-3):347–392, 1994.
[28] Gregory Margulis. Problems and conjectures in rigidity theory. In Mathematics: frontiers and perspectives, pages 161–174. Amer. Math. Soc., Providence, RI, 2000.
[29] Shahar Mozes. On closures of orbits and arithmetic of quaternions. Israel J. Math., 86(1-3):195–209, 1994.
[30] Shahar Mozes. Actions of Cartan subgroups. Israel J. Math., 90(1-3):253–294, 1995.
[31] Marina Ratner. Factors of horocycle flows. Ergodic Theory Dynam. Systems, 2(3-4):465–489, 1982.
[32] Marina Ratner. Horocycle flows, joinings and rigidity of products. Ann. of Math. (2), 118(2):277–313, 1983.
[33] Marina Ratner. On Raghunathan’s measure conjecture. Ann. of Math. (2), 134(3):545–607, 1991.
[34] Marina Ratner. Raghunathan’s topological conjecture and distributions of unipotent flows. Duke Math. J., 63(1):235–280, 1991.
[35] Marina Ratner. Raghunathan’s conjectures for Cartesian products of real and p-adic Lie groups. Duke Math. J., 77(2):275–382, 1995.
[36] M. Rees. Some $R^2$-Anosov flows. 1982.
[37] Daniel J. Rudolph. ×2 and ×3 invariant measures and entropy. Ergodic Theory Dynam. Systems, 10(2):395–406, 1990.
[38] Klaus Schmidt. Dynamical systems of algebraic origin, volume 128 of Progress in Mathematics. Birkhäuser Verlag, Basel, 1995.
4ECM Stockholm 2004 © 2005 European Mathematical Society

Phase Transition Phenomena in Random Discrete Structures
Tomasz Łuczak

Abstract. We present a few results concerning the phase transition phenomenon in the theory of random graphs, mathematical logic, and game theory.
1. Introduction

A random graph is a probability measure defined on a family of subgraphs of some underlying graph F (or, equivalently, a random subgraph of F), typically parametrized by a parameter ρ. The simplest example of a random graph is F(p), in which each edge is removed from F independently with probability 1 − p. For many random graph models there exists a critical value $\rho_{cr}$ of ρ such that the structure of F(ρ) changes abruptly for $\rho \sim \rho_{cr}$; for instance, for arbitrarily small ε > 0, $F(\rho_{cr} + \varepsilon)$ may contain a giant component covering a positive fraction of all vertices of F, while all components of $F(\rho_{cr} - \varepsilon)$ are of moderate size. This behavior resembles the phase transition phenomena studied by physicists, where small changes of parameters can greatly affect properties of a system; in fact, the random graph F(p), directly related to the Ising and Potts models, is commonly used to model and study phase transition phenomena in statistical physics.

In order to model the phase transition phenomena considered by physicists one should study the random graph F(p) where the underlying graph is an infinite d-dimensional lattice. In this case we expect the critical behavior of F(p) to depend mainly on the dimension of F; e.g., although random graphs based on 2-dimensional square and hexagonal lattices have different critical probabilities, their critical behavior should be very similar. Thus, percolation theory, which deals with results of this kind, uses not only probabilistic but also geometric as well as topological and analytical tools.

Here, however, we consider random graphs F(ρ) where the underlying graph F is the complete graph on n vertices and so has no geometric structure. This part of random graph theory has a more combinatorial flavor and, although not so important for studying the properties of physical systems (it corresponds to the not very exciting mean-field approximation approach), it is rich in interesting results and mathematical challenges. We concentrate on a few recent developments in this area and mention some connections of the phase transition phenomena with mathematical logic and game theory.
2. The phase transition in the standard model

Let us start with a short description of the phase transition in the most widely used random graph models G(n, p) and G(n, M). The binomial random graph G(n, p) is a graph with vertex set [n] = {1, 2, . . . , n}, where each of the $\binom{n}{2}$ pairs of vertices is an edge of G(n, p) independently with probability p. Thus, a given graph G with vertex set [n] and e(G) edges appears as G(n, p) with probability
\[ \Pr(G(n,p) = G) = p^{e(G)} (1-p)^{\binom{n}{2} - e(G)}. \tag{2.1} \]
An alternative way to generate G(n, p) is to consider a family of independent random variables $U_{ij}$, 1 ≤ i < j ≤ n, uniformly distributed on [0, 1], and define G(n, p) as a graph in which a pair {i, j} is an edge if and only if $U_{ij} \le p$. Consequently, one can view G(n, p) as a stage of a Markov process $\{G(n,p)\}_{0 \le p \le 1}$. The uniform random graph G(n, M) is a graph chosen uniformly at random from the family of all graphs with vertex set [n] and M edges. An equivalent way to obtain G(n, M) is to start with an empty graph with vertex set [n] and add to it M edges, one by one, so that in each step a new edge is chosen uniformly at random among all available pairs. Thus, G(n, M) is a stage of a Markov chain $\{G(n,M)\}_{M=0}^{n(n-1)/2}$.

In random graph theory we often allow the parameters p and M to depend on n, and consider asymptotic properties of G(n, p(n)) and G(n, M(n)) as n → ∞. In particular, we say that a property holds asymptotically almost surely (a.a.s.) if its probability tends to one as n → ∞. We also remark that for most natural properties (in particular, for all properties considered below) the asymptotic behavior of both G(n, p) and G(n, M) is basically the same provided $M = p\binom{n}{2}$. Thus, we shall state our results (and heuristics) only for one of the models.

In order to study the structure of G(n, p) let us fix a vertex v and identify the vertices which belong to the component L(v) containing v using breadth-first search. Then the number of vertices in L(v) can be viewed as the total number of offspring in a branching process where all particles are taken from a finite reservoir of particles. The fact that we restrict the total number of particles does not matter much at the beginning of the process, so the giant component emerges for p such that the expected number of neighbors of a vertex, given by (n − 1)p, is close to one.
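The coupling through the uniform variables $U_{ij}$ and the breadth-first exploration of components described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than an efficient implementation: the function names are hypothetical, vertices are labelled 0, …, n − 1 instead of 1, …, n, and all $\binom{n}{2}$ labels are stored explicitly, so it is suitable only for small n.

```python
import random
from collections import deque


def sample_uniform_labels(n, seed=None):
    """One independent Uniform[0, 1) label U_ij for every pair i < j."""
    rng = random.Random(seed)
    return {(i, j): rng.random() for i in range(n) for j in range(i + 1, n)}


def graph_at(n, labels, p):
    """Adjacency lists of G(n, p): the pair {i, j} is an edge iff U_ij <= p."""
    adj = [[] for _ in range(n)]
    for (i, j), u in labels.items():
        if u <= p:
            adj[i].append(j)
            adj[j].append(i)
    return adj


def component_sizes(adj):
    """Sizes of all components, largest first, found by breadth-first search."""
    seen = [False] * len(adj)
    sizes = []
    for start in range(len(adj)):
        if seen[start]:
            continue
        seen[start] = True
        queue, size = deque([start]), 0
        while queue:
            v = queue.popleft()
            size += 1
            for w in adj[v]:
                if not seen[w]:
                    seen[w] = True
                    queue.append(w)
        sizes.append(size)
    return sorted(sizes, reverse=True)


# The same labels give the whole process {G(n, p)} for 0 <= p <= 1.
n = 400
labels = sample_uniform_labels(n, seed=2004)
below = component_sizes(graph_at(n, labels, 0.5 / n))  # c = 0.5 < 1
above = component_sizes(graph_at(n, labels, 2.0 / n))  # c = 2.0 > 1
```

Because both graphs are read off from the same labels, the graph at p = 0.5/n is a subgraph of the graph at p = 2/n, so `below[0] <= above[0]` holds for every sample, in line with the Markov process point of view.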
One can easily make this argument rigorous (see, for instance, [11]) and get the following result, proved (for G(n, M)) in one of the first and by far the most influential papers in random graph theory, published in 1960 by Erdős and Rényi [9]. Here and below by $L_k = L_k(n, p)$, k = 1, 2, we denote the random variable which counts the number of vertices in the kth largest component of a random graph.

Theorem 2.1. Let p = c/n, where c is a constant.
(i) If c < 1 then a.a.s. $L_1 = O(\log n)$.
(ii) If c > 1 then a.a.s. $L_1 = (\alpha(c) + o(1))n$, where $\alpha(c) \in (0, 1)$ is a root of the equation
\[ \alpha(c) + e^{-c\alpha(c)} = 1 \tag{2.2} \]
and $L_2 = O(\log n)$.

The investigation of the structure of G(n, p) when np → 1 was started by Bollobás [5] (see also [6]) and continued by Łuczak [12], Łuczak, Pittel and Wierman [14], and Janson, Knuth, Łuczak and Pittel [10]. They showed that the dominating component appears in G(n, p) when $np = 1 + O(n^{-1/3})$, and gave a fairly detailed picture of the structure of a random graph in this critical period. In particular, the following result holds.

Theorem 2.2. Let np = 1 + ε, where ε = ε(n) → 0.
(i) If $\varepsilon n^{1/3} \to -\infty$, then a.a.s.
\[ L_1 = (1 + o(1)) L_2 = (1 + o(1)) \frac{2}{\varepsilon^2} \log\bigl(n|\varepsilon|^3\bigr). \]
(ii) If $\varepsilon n^{1/3} \to a \in (-\infty, \infty)$, then for every b ∈ (0, ∞) and k = 1, 2,
\[ \lim_{n\to\infty} \Pr\bigl(L_k \le b n^{2/3}\bigr) = \gamma(k; a, b), \]
where 0 < γ(k; a, b) < 1 is a continuous function of both a and b.
(iii) If $\varepsilon n^{1/3} \to \infty$, then a.a.s. $L_1 = (2 + o(1))\varepsilon n$ and
\[ L_2 = (1 + o(1)) \frac{2}{\varepsilon^2} \log\bigl(n\varepsilon^3\bigr). \]

Thus, the random variable $L_1/L_2$ grows from 1 to infinity in the “critical interval” $np = 1 + O(n^{-1/3})$. It turns out that this special period of the evolution of a random graph has several other distinctive features. We mention just two of them.

Note first that one can identify the critical period by considering the size of the largest component alone: it is the only time during the random process when the random variable $L_1$ is not sharply concentrated around its median. The reason why this is the case is easy to see when we consider G(n, p) as a stage of the random graph process, obtained from the empty graph by adding to it randomly chosen edges. In the subcritical phase, when np = 1 + ε, $\varepsilon n^{1/3} \to -\infty$, the largest components merge only with very small components. Hence, $L_1$ is basically the maximum over a large family of independent random variables, and so it is sharply concentrated around its median. This is no longer true in the critical interval: now the leading components are so large that with a non-vanishing probability an edge added in the critical period may connect two of them; for instance, at some (random) moment of this phase of the random process the largest component may merge with, say, the second largest one, and increase its size considerably. Thus, one should not expect the random variable $L_1$ to
be sharply concentrated. Finally, in the supercritical phase, when np = 1 + ε, $\varepsilon n^{1/3} \to \infty$, the giant component grows by merging with components much smaller than $L_1$. Thus, its size can be approximated by a sum of independent random variables and so, again, it is sharply concentrated.

The critical interval can also be identified by considering the internal structure of the components of G(n, p) instead of their sizes. In the subcritical phase of the process $\{G(n,p)\}_{0 \le p \le 1}$ the probability that a new edge added to the graph has both its ends in one small component is very small as well, and so a.a.s. each component of G(n, p) is either a tree or contains exactly one cycle. In the supercritical phase the random graph a.a.s. contains a large component with a fairly complicated internal structure, but all other components of G(n, p) have at most as many edges as vertices. It is only in the critical period that, with non-negligible probability, G(n, p) may have several components containing at least two cycles each.

3. Cluster scale random model

Theorem 2.1 states that if p = c/n, then the random variable $L_1/n$ tends to α(c) as n → ∞, where α(c) = 0 for c ≤ 1 and is given by the non-zero solution of (2.2) for c > 1. Thus, α(c) is continuous for c ∈ (0, ∞); in fact it is analytic in the whole domain except the point c = 1, where its first derivative jumps from 0 to 2. Thus, the phase transition at c = 1 is continuous; moreover, Theorem 2.2(ii) states that it is quite smooth if we “rescale” the critical period appropriately. This is yet another consequence of the fact that G(n, p) is a stage of a Markov process in which edges are added to a graph one by one; a new edge cannot increase the size of the largest component more than twice, so $L_1$ grows smoothly with p. The phase transition in G(n, p) is nowadays thoroughly studied and well understood.
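The limit fraction α(c) has no closed form, but it is easy to evaluate numerically from the equation α + e^{−cα} = 1. The following Python sketch is an illustration only; the function name `giant_fraction` and the tolerance are assumptions. It uses bisection on the interval [ln(c)/c, 1]: the function f(a) = a + e^{−ca} − 1 is convex, vanishes at 0, attains its negative minimum at a = ln(c)/c when c > 1, and is positive at a = 1, so the non-zero root lies in this interval.

```python
import math


def giant_fraction(c, tol=1e-12):
    """Non-zero root of a + exp(-c a) = 1 for c > 1, and 0.0 for c <= 1."""
    if c <= 1.0:
        return 0.0

    def f(a):
        return a + math.exp(-c * a) - 1.0

    lo = math.log(c) / c  # minimum point of f, where f < 0
    hi = 1.0              # f(1) = exp(-c) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Difference quotient just above c = 1; it should be close to 2.
slope_above_one = giant_fraction(1.001) / 0.001
```

For c = 2 this gives a value between 0.79 and 0.80, and the difference quotient just above c = 1 is close to 2, which illustrates the jump of the first derivative from 0 to 2 at the critical point.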
Thus, one is tempted to modify this model so that it would admit other types of critical behavior; in particular, we would like to have a simple probabilistic model for which a non-continuous phase transition can be observed and analyzed. As we have already remarked, one should not expect any discontinuities in “dynamic” models of random graphs, in which we add new edges to a graph, possibly with some restrictions and/or preferences (e.g., we may prefer large or small vertices, avoid some subgraphs, etc.); although the critical behavior of some such models differs greatly from the one observed for G(n, p), the phase transition is typically continuous. Thus, we should rather look at “static” random graph models, in which a graph is selected according to some probability distribution similar to (2.1). A natural example of such a graph is $G_q(n, p)$, a cluster-scaled random graph closely related to the Potts model studied in statistical physics. For $G_q(n, p)$ the probability of each graph G is proportional to $q^{c(G)}$, where c(G) is the number of components of G, and q > 0 is an additional parameter of the model, i.e.,
\[ \Pr(G_q(n,p) = G) = q^{c(G)} p^{e(G)} (1-p)^{\binom{n}{2} - e(G)} / Z_q(n,p), \tag{3.1} \]
and the partition function $Z_q(n, p)$ is given by
\[ Z_q(n,p) = \sum_{G} q^{c(G)} p^{e(G)} (1-p)^{\binom{n}{2} - e(G)}. \]

Let us mention that, unlike in the case of G(n, p), there is no natural way to obtain $G_q(n, p'')$ from $G_q(n, p')$ for p'' > p' when q ≠ 1, i.e., $G_q(n, p)$ is “non-Markovian”. Nevertheless, if q is an integer, $G_q(n, p)$ is related to a naturally defined Markov chain $\{G_i\}_{i=0}^{\infty}$. The chain starts with any graph $G_0$ with vertex set [n]. For i ≥ 0, $G_{i+1}$ is obtained from $G_i$ in the following way. First, we choose uniformly at random one of q colors independently for each component of $G_i$, and color all vertices of this component by this color. In this way we get a partition of [n] into q parts of $n_1, \dots, n_q$ vertices respectively. Then we remove all edges of $G_i$, and connect each pair of vertices which are colored with the same color by an edge independently with probability p. Thus, the resulting graph $G_{i+1}$ is a sum of disjoint independent copies of $G(n_j, p)$, j = 1, 2, . . . , q. It is easy to see that, whenever 0 < p < 1, the above Markov chain is ergodic, and its unique stationary distribution is given by (3.1).

From the above description of $G_q(n, p)$ and Theorem 2.1 it follows that, at least for integer q, if np/q → c > 1 as n → ∞, then the random graph $G_q(n, p)$ a.a.s. contains a large component of size at least α(c)n/q. Bollobás, Grimmett and Janson [7] showed that if np/q = c, then the phase transition occurs at c ∼ 1 for 0 < q ≤ 2, but for q > 2 a discontinuous phase transition takes place already for $c \sim c_{crit} < 1$. A somewhat simplified version of their result can be stated as follows. Here and below q > 0 is a (not necessarily integer valued) constant, which does not depend on n.

Theorem 3.1. Let np/q → c as n → ∞.
(i) If 0 < q ≤ 2, then a.a.s. $L_1 = O(\log n)$ for c < 1, while for c > 1 we have $L_1/n \to \beta_q(c) > 0$, where $\beta_q(c)$ is a continuous function of c and $\beta_q(0) = 0$.
(ii) If q > 2, then a.a.s. $L_1 = O(\log n)$ for $c < c_{crit}$, while for $c > c_{crit}$ we have $L_1/n \to \beta_q(c) > 0$, where
\[ c_{crit} = \frac{2}{q} \cdot \frac{q-1}{q-2} \ln(q-1), \]
$\beta_q(c)$ is a continuous function of c in the interval $[c_{crit}, \infty)$, and
\[ \beta_q(c_{crit}) = \frac{q-2}{q-1} > 0. \]

A more detailed analysis of the critical period was given by Luczak and Łuczak [13]. We proved that for 0 < q < 2 the picture of the phase transition in $G_q(n, p)$ is basically the same as for $G(n, p) = G_1(n, p)$, with the critical interval $np/q = 1 + O(n^{-1/3})$. The critical behavior of $G_q(n, p)$ is much more interesting for q ≥ 2. If q > 2 and $np/q = c_{crit} + b/n$, then the probability space corresponding to $G_q(n, p)$ can be partitioned into three parts: $S_1$, in which all graphs consist of small components of size O(log n), $S_2$, in which the size of
the giant component is concentrated in an interval of length $\Theta(\sqrt{n})$ around its median $m_L \sim \frac{q-2}{q-1}\,n$, and the “bottleneck” $S_3$. The probabilities $\Pr(S_1)$ and $\Pr(S_2)$ are bounded away from zero and are smooth functions of the parameter b, while $\Pr(S_3)$ quickly tends to 0 as n → ∞.

Theorem 3.2. Let q > 2 and $np/q = c_{crit} + b/n$ for some constant b. Then there exists a continuous function ζ(b) such that
\[ \lim_{n\to\infty} \Pr\Bigl(L_1 = (1 + o(1)) \tfrac{q-2}{q-1}\,n\Bigr) = \zeta(b), \]
whereas
\[ \lim_{n\to\infty} \Pr\bigl(L_1 = O(\log n)\bigr) = 1 - \zeta(b). \]
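For orientation, the critical value and the size of the jump in Theorem 3.1 are easy to tabulate. The Python sketch below is an illustration that rests on one stated assumption: the critical value for q > 2 is read as $c_{crit} = \frac{2}{q} \cdot \frac{q-1}{q-2} \ln(q-1)$, and the jump of the limit fraction at $c_{crit}$ as (q − 2)/(q − 1). The function names are hypothetical.

```python
import math


def c_crit(q):
    """Critical value of c = np/q; equal to 1 for 0 < q <= 2 (Theorem 3.1(i))."""
    if q <= 2.0:
        return 1.0
    # Assumed reading of the formula in Theorem 3.1(ii).
    return (2.0 / q) * ((q - 1.0) / (q - 2.0)) * math.log(q - 1.0)


def jump_size(q):
    """Limit of L_1 / n at c = c_crit: zero for q <= 2, (q-2)/(q-1) for q > 2."""
    if q <= 2.0:
        return 0.0
    return (q - 2.0) / (q - 1.0)


# A small table (q, c_crit(q), jump_size(q)) for a few values of q.
table = [(q, c_crit(q), jump_size(q)) for q in (2.0, 2.5, 3.0, 5.0, 10.0)]
```

With this reading, $c_{crit}$ tends to 1 as q decreases to 2 and is strictly smaller than 1 for the tabulated values of q > 2, consistent with the statement that for q > 2 the discontinuous transition occurs already at some $c_{crit} < 1$.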
Note that the phase transition in this case is surprisingly sharp: a small change of the probability p by $\Theta(n^{-2})$ affects the limit probability that $G_q(n, p)$ has a giant component, although the expected size of the largest component is modified only by Θ(1). The result describing the critical behavior of $G_2(n, p)$ is slightly more complicated.

Theorem 3.3. Let q = 2 and np/2 = 1 + ε, where ε = ε(n) = o(1).
(i) If $\varepsilon n^{1/3} \to -\infty$, then a.a.s.
\[ L_1 = (1 + o(1)) L_2 = (1 + o(1)) \frac{2}{\varepsilon^2} \log\bigl(n|\varepsilon|^3\bigr). \]
(ii) If $\varepsilon n^{1/3} \to a \in (-\infty, \infty)$, then for every b ∈ (0, ∞) and k = 1, 2,
\[ \lim_{n\to\infty} \Pr\bigl(L_k \le b n^{2/3}\bigr) = \gamma_2(k; a, b), \]
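where $0 < \gamma_2(k; a, b) < 1$ is a continuous function of both a and b.
(iii) If $\varepsilon n^{1/3} \to 0$ but $\varepsilon n^{1/2} \to -\infty$, and b ∈ (0, ∞), then
\[ \lim_{n\to\infty} \Pr\Bigl(L_1 \le b \sqrt{\frac{n}{|\varepsilon|}}\Bigr) = \sqrt{\frac{2}{\pi}} \int_0^b e^{-x^2/2}\,dx, \]
and
\[ \lim_{n\to\infty} \Pr\Bigl(L_2 \le \frac{n|\varepsilon|}{b^2} \log\frac{1}{n|\varepsilon|^3}\Bigr) = \sqrt{\frac{2}{\pi}} \int_b^\infty e^{-x^2/2}\,dx. \]
(iv) If $\varepsilon n^{1/2} \to c > 0$ as n → ∞, and b ∈ (0, ∞), then
\[ \lim_{n\to\infty} \Pr\bigl(L_1 \le b n^{3/4}\bigr) = \frac{\int_0^b \exp(-x^4/12 + cx^2/2)\,dx}{\int_0^\infty \exp(-x^4/12 + cx^2/2)\,dx}, \]
and
\[ \lim_{n\to\infty} \Pr\Bigl(L_2 \le \frac{\sqrt{n}\,\log n}{2b^2}\Bigr) = \frac{\int_b^\infty \exp(-x^4/12 + cx^2/2)\,dx}{\int_0^\infty \exp(-x^4/12 + cx^2/2)\,dx}. \]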
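(v) If $\varepsilon n^{1/2} \to \infty$ but ε(n) = o(1) as n → ∞, then a.a.s.
\[ L_1 = (1 + o(1)) \sqrt{3\varepsilon}\, n \quad \text{and} \quad L_2 = (1 + o(1)) \frac{\log\bigl(n^2 \varepsilon^3\bigr)}{3\varepsilon}. \]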
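The recolor-and-resample Markov chain $\{G_i\}$ introduced earlier in this section, whose stationary distribution is (3.1) for integer q, can be simulated directly. The Python sketch below shows one step of the chain under stated assumptions: q is a positive integer, vertices are labelled 0, …, n − 1, edges are stored as pairs (i, j) with i < j, and the function names `components` and `chain_step` are hypothetical. It is meant as an illustration for small n, since every step examines all pairs of vertices.

```python
import random


def components(n, edges):
    """Vertex sets of the components of the graph ([n], edges), via union-find."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for i, j in edges:
        parent[find(i)] = find(j)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return list(groups.values())


def chain_step(n, edges, q, p, rng):
    """One step G_i -> G_{i+1}: color components, then resample inside colors."""
    color = {}
    for comp in components(n, edges):
        c = rng.randrange(q)  # one uniform color per component
        for v in comp:
            color[v] = c
    new_edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            # Old edges are discarded; same-colored pairs are joined w.p. p.
            if color[i] == color[j] and rng.random() < p:
                new_edges.add((i, j))
    return new_edges, color


# A short run of the chain for q = 2, started from the empty graph.
rng = random.Random(2004)
n, q = 200, 2
p = 2 * 1.2 / n  # np/2 = 1.2
edges = set()
for _ in range(20):
    edges, _colors = chain_step(n, edges, q, p, rng)
largest = max(len(comp) for comp in components(n, edges))
```

Tracking `largest` along a long run is one way to observe experimentally how the size of the largest component in one step feeds, through the color classes, into the next step.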
Thus, in the evolution of $G_2(n, p)$, besides the subcritical, critical and supercritical phases, analogous to those observed for $G(n, p) = G_1(n, p)$, one can observe an “early supercritical” phase which occurs for np/2 = 1 + ε, where $\varepsilon n^{1/3} \to 0$ but $\varepsilon n^{1/2} = O(1)$, in which the largest component is unique, yet neither $L_1$ nor $L_2$ is sharply concentrated. The reason for such behavior becomes clear when we recall the construction of the Markov chain $\{G_i\}_{i=0}^{\infty}$, which has $G_q(n, p)$ as its stationary distribution. If np = 2(1 + ε) and $\varepsilon(n) n^{1/3} \to \infty$, Theorem 2.2 implies that a large component of $L^{(i)} \gg n^{2/3}$ vertices emerges in $G_i$. Let us randomly color the components of $G_i$ with two colors and suppose that the vertices of this large component are colored with the first color. If $L^{(i)}$ is large enough, then the number of vertices colored with the first color is a.a.s. $n/2 + (1/2 + o(1)) L^{(i)}$, and, by Theorem 2.2, the size of the largest component in $G_{i+1}$ will be sharply concentrated and determined only by $L^{(i)}$. However, if $\varepsilon n^{1/3}$ tends to infinity slowly enough, $L^{(i)}$ might be so small that it does not much affect the difference $d^{(i)}$ between the numbers of vertices colored with the first and the second colors. Thus, the size of the largest component $L^{(i+1)}$ of $G_{i+1}$ does not depend on $L^{(i)}$ but only on $d^{(i)}$ and, since $d^{(i)}$ is not sharply concentrated, the random variable $L^{(i+1)}$ is not sharply concentrated either.

4. The phase transition and zero-one laws

In the first two sections we have presented a fairly detailed description of the structure of a random graph near the critical point; now we consider how the picture of this phenomenon depends on the language we use to describe it. One of the simplest languages used to express properties of graphs is the first-order language of graphs, which, besides equality, contains only one binary predicate “∼”, where “x ∼ y” is interpreted as the adjacency of vertices x and y.
Thus, the sentence
$$\forall x\,\forall y\,\exists z\,(x \sim z \wedge y \sim z)$$
shows that the property that every two vertices of a graph are connected by a path of length two is a first-order property of graphs; on the other hand, one can prove that the property that a graph is connected cannot be expressed in this language. It turns out that the first-order language of graphs is not rich enough to describe the phase transition phenomenon or even to identify the moment when the giant component emerges in the random graph. This fact follows from a result of Lynch [17], who showed that if $p = c/n$, then the probability that $G(n,p)$ has a first-order property $\varphi$ converges to a limit $\rho_\varphi(c)$, and basically
characterized all functions which can appear as $\rho_\varphi(c)$; none of these functions has a singularity at $c = 1$.

Theorem 4.1. Let $\varphi$ be any first-order sentence and $c$ be a positive constant. Then the limit
$$\rho_\varphi(c) = \lim_{n\to\infty} \Pr(G(n, c/n) \models \varphi)$$
exists. Moreover, $\rho_\varphi(c)$ is analytic for $c \in (0,\infty)$.

Thus, let us strengthen our language and consider the monadic second-order language of graphs, in which it is allowed to quantify not only over vertices but also over sets of vertices; furthermore, the monadic second-order language contains a predicate “$\in$”, where, of course, “$x \in y$” means “a vertex $x$ belongs to a set of vertices $y$”. The monadic second-order language of graphs is significantly stronger than the first-order one; for instance, the monadic second-order sentence
$$\exists x\,\Big[\exists y\,(y \in x) \;\wedge\; \exists z\,\neg(z \in x) \;\wedge\; \forall y\,\forall z\,\big((y \in x \wedge \neg(z \in x)) \Rightarrow \neg(y \sim z)\big)\Big]$$
corresponds to the property that a graph is disconnected, which, as we have already mentioned, is not expressible in the first-order language. It is not hard to state in the monadic second-order language of graphs a sentence $\varphi_{\mathrm{comp}}$ which corresponds to the property that a graph contains a component with at least two cycles. Since in the subcritical phase a.a.s. each component of $G(n,p)$ contains at most one cycle, while in the supercritical phase the graph a.a.s. contains a giant component with a complex internal structure, one can use $\varphi_{\mathrm{comp}}$ to identify the point of the phase transition (and the critical interval). However, in the evolution of $G(n,p)$ presented in the monadic second-order language the point $p \sim 1/n$ plays a much more important role: Shelah and Spencer [18] proved that once we pass it one can no longer guarantee that the probability of each monadic second-order sentence converges, i.e., the “weak zero-one law” does not hold any more.

Theorem 4.2. Let $c > 0$ be a constant.
(i) For every monadic second-order sentence $\psi$ and $c < 1$ the limit
$$\rho_\psi(c) = \lim_{n\to\infty} \Pr(G(n, c/n) \models \psi)$$
exists, and $\rho_\psi(c)$ is analytic for $c \in (0,1)$.
(ii) There exists a monadic second-order sentence $\psi_0$ such that for every $c > 1$
$$\liminf_{n\to\infty} \Pr(G(n, c/n) \models \psi_0) = 0,$$
while
$$\limsup_{n\to\infty} \Pr(G(n, c/n) \models \psi_0) = 1.$$
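For small graphs the two sentences displayed in this section can be evaluated by brute force, which makes the difference between the two languages concrete: the first-order sentence quantifies over vertices only, while the monadic second-order sentence also ranges over sets of vertices. The Python sketch below is only an illustration. The function names and the representation of a graph as a dictionary of neighbour sets are assumptions made here, not notation from the text.

```python
from itertools import combinations


def has_common_neighbours(adj):
    """First-order sentence: for all x, y there is z with x ~ z and y ~ z."""
    vertices = list(adj)
    return all(
        any(z in adj[x] and z in adj[y] for z in vertices)
        for x in vertices
        for y in vertices
    )


def is_disconnected(adj):
    """Monadic second-order sentence: some non-empty proper vertex set X
    has no edge joining X to its complement (brute force over all subsets)."""
    vertices = list(adj)
    for size in range(1, len(vertices)):
        for subset in combinations(vertices, size):
            inside = set(subset)
            outside = [z for z in vertices if z not in inside]
            if all(z not in adj[y] for y in inside for z in outside):
                return True
    return False


# Three small hypothetical test graphs (adjacency sets).
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path = {0: {1}, 1: {0, 2}, 2: {1}}
two_edges = {0: {1}, 1: {0}, 2: {3}, 3: {2}}
```

The subset search is exponential in the number of vertices, so this is meant for toy graphs only; it mirrors the quantifier structure of the sentences rather than offering an efficient connectivity test.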
Does the weak zero-one law fail precisely at the critical interval, when $np = 1 + O(n^{-1/3})$? Although this problem has not been completely settled yet, there is a lot of evidence (see, for instance, [16]) that this is “almost” the case and that the following claim holds.

Claim 4.3. Let $pn = 1 + \varepsilon$, where $\varepsilon = \varepsilon(n) = o(1)$, and, for a sentence $\psi$, let $r_n(\psi)$ denote the probability that $G(n,p)$ has the property described by $\psi$.
(i) If $\varepsilon n^{1/3} \to -\infty$, then for every monadic second-order sentence $\psi$ the sequence $r_n(\psi)$ converges as $n \to \infty$.
(ii) If $\varepsilon n^{1/3} \to a$, then for every monadic second-order sentence $\psi$ the limit
$$\rho_\psi(a) = \lim_{n\to\infty} \Pr(G(n, p) \models \psi)$$
exists, and $\rho_\psi(a)$ is continuous for $a \in (-\infty, \infty)$.
(iii) There exists a function $\varepsilon_0(n)$ such that $\varepsilon_0 n^{1/3} \to \infty$, and $r_n(\psi)$ converges for every monadic second-order sentence $\psi$.
(iv) If $\varepsilon n^{1/3} \ge f(n)$ for some recursive function $f$ which tends to infinity as $n \to \infty$, then there exists a monadic second-order sentence $\psi_0$ such that
$$\liminf_{n\to\infty} \Pr(G(n, p) \models \psi_0) = 0,$$
while
$$\limsup_{n\to\infty} \Pr(G(n, p) \models \psi_0) = 1.$$
We briefly describe how one may try to prove the above claim, to illustrate the techniques used in this area of probabilistic combinatorics. In order to show (i) and (ii) it is enough to construct two sets of sentences: $\{\varphi_\alpha\}_{\alpha\in A}$, corresponding to “global properties of graphs”, and $\{\varphi_\beta\}_{\beta\in B}$, related to “local properties”, such that
• for each monadic second-order sentence $\psi$, either $\psi$ or $\neg\psi$ can be deduced from $\{\varphi_\alpha\}_{\alpha\in A} \cup \{\hat\varphi_\beta\}_{\beta\in B}$, where for $\beta \in B$ we have either $\hat\varphi_\beta = \varphi_\beta$ or $\hat\varphi_\beta = \neg\varphi_\beta$;
• for each $\alpha \in A$ the property $\varphi_\alpha$ holds a.a.s. for $G(n,p)$;
• for each $\beta \in B$ the probability that $\varphi_\beta$ holds for $G(n,p)$ converges as $n \to \infty$;
• for every finite subset $B' \subseteq B$
$$\Pr\Big(\bigwedge_{\beta\in B'} \{G(n,p) \models \varphi_\beta\}\Big) = (1+o(1)) \prod_{\beta\in B'} \Pr(G(n,p) \models \varphi_\beta).$$
The existence of a function $\varepsilon_0(n)$ in (iii) is a simple consequence of (ii) and the fact that there are only countably many monadic second-order properties of graphs. Finally, a natural way to show (iv) is to use special properties of $G(n,p)$ to define a subset $S_n$ of its vertices such that a.a.s. each ternary relation on $S_n$ can be expressed in the monadic second-order language. Thus, one can use $S_n$ to code an initial segment of the standard model of arithmetic, and, in particular, if $g$ is a recursive function which (slowly) tends to infinity, then there is a monadic
second-order sentence $\psi(g)$ which corresponds to the property “$g(|S_n|)$ is even”. If $pn = 1 + \varepsilon$ and $\varepsilon n^{1/3} \to \infty$, then $|S_n|$ typically grows as some power of $\log(\varepsilon n^{1/3})$. Suppose now that for some recursive function $f$, $f(n) \to \infty$, we have, say, $\log\log f \le |S_n| \le n$. Then one can easily find a recursive function $g$ such that each natural number appears as the value of $g(|S_n|)$ for some $n$. Consequently,
$$\liminf_{n\to\infty} \Pr(G(n, p) \models \psi(g)) = 0,$$
while
$$\limsup_{n\to\infty} \Pr(G(n, p) \models \psi(g)) = 1.$$
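The oscillation in this argument comes from a simple arithmetic fact: a slowly growing integer-valued function takes every value on a long stretch of arguments, so its parity never settles down. The Python sketch below shows this for one hypothetical choice of $g$ (the iterated bit length), which is an assumption made here for illustration and is not the function used in the proof.

```python
def slow_g(m):
    """A hypothetical slowly growing recursive function:
    the number of binary digits of the number of binary digits of m."""
    return m.bit_length().bit_length()


def parities_along_powers_of_two(max_exponent):
    """Parity of slow_g(2**k) for k = 0, ..., max_exponent."""
    return [slow_g(2 ** k) % 2 for k in range(max_exponent + 1)]


def witness(j):
    """An argument m with slow_g(m) == j, for j >= 1."""
    return 2 ** (2 ** (j - 1) - 1)
```

Because `slow_g` is non-decreasing and attains every positive integer, both parities occur for arbitrarily large arguments. If $|S_n|$ is concentrated, the probability that $g(|S_n|)$ is even is then close to 0 for some large $n$ and close to 1 for others.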
Note that a recursive lower bound for $\varepsilon n^{1/3}$ is crucial for this type of argument (cf. (iii)); for more examples of similar “recursive bound behavior” see [8] and [15].

5. The phase transition and deterministic games

We conclude this brief survey with some results on deterministic combinatorial games whose connection with the phase transition in random graphs (if there is any) is not yet well understood. Let us consider a perfect information game Comp$(n,r)$, played between two players, Maker and Breaker, on the complete graph on $n$ vertices. In each round Maker colors one edge of the graph red and Breaker answers by coloring blue at most $r$ of the yet uncolored edges. The game ends when all edges are colored with one of the two colors. Let $L$ denote the number of vertices in the largest component of Maker's (red) graph. The aim of Maker is to maximize $L$, while Breaker is trying to make $L$ as small as possible. By $\bar L(n,r)$ we denote the number of vertices in the largest component of Maker's graph in Comp$(n,r)$ when both players play according to their optimal strategies.

It is easy to see that if $r = cn$ and $c > 1$, then Breaker can force each component of Maker's graph to be of bounded size, but if $c < 1$ then he cannot prevent Maker from building a component which contains a positive fraction of the vertices. The behavior of $\bar L(n,r)$ for $r = (1+o(1))n$ was studied by Bednarska and Łuczak [4], who proved the following result.

Theorem 5.1.
(i) If $1 \le s \le n-1$ and $r = n + s$, then
$$\frac{n}{s + \sqrt{n}} < \bar L(n,r) < \frac{n}{s} + 1.$$
(ii) If $n \ge 10^6$ and $r = n - s$, where $0 \le s \le 0.01n$, then
$$s \le \bar L(n,r) \le s + 100\sqrt{n}.$$

The behavior of $\bar L(n,r)$ resembles quite closely the behavior of the size $L_1(n,p)$ of the largest component of $G(n,p)$ near the point of the phase transition. This analogy is even more evident if we notice that for the “critical value” $r \sim n$
Maker's graph has density $\frac{1}{r+1} \sim \frac{1}{n}$, precisely the expected density $p$ of $G(n,p)$ at the point of the phase transition.

It is not quite clear whether this and many other similarities in the behavior of combinatorial games and random structures are just coincidences or whether there is a “meta-theorem” still to be found which explains all or most of them. This problem was studied and discussed in detail by Beck (see [2], or his forthcoming monograph [1]). For instance, in some cases the connection between combinatorial games and random graphs is due to the fact that in the game a random strategy is nearly optimal for one of the players (see, for instance, [3]). Note, however, that this is certainly not the case for Comp$(n,r)$: a random strategy for either of the players is far worse than the optimal one. Let us also mention that a surprising connection between the behavior of $L_1(n,p)$ and $\bar L(n,r)$ can be observed not only at the point of the phase transition, but also for graphs of larger densities. Namely, if $n/r \to \infty$, then
$$n - \bar L(n,r) = n e^{-\Theta(n/r)}, \qquad (5.1)$$
while for $np \to \infty$, a.a.s.
$$n - L_1(n,p) = n e^{-\Theta(np)}.$$

Besides the above similarities between $G(n,p)$ and Comp$(n,r)$ we should also mention some important differences in the behavior of $L_1(n,p)$ and $\bar L(n,r)$. For instance, in the critical phase $\bar L(n,r) = \Theta(\sqrt{n})$, while the “critical size” of the largest component of $G(n,p)$ is $\Theta(n^{2/3})$. Moreover, it seems that the evolution of $\bar L(n,r)$ is more interesting than that of $L_1(n,p)$. Indeed, let
$$\bar\alpha(c) = \limsup_{n\to\infty} \frac{\bar L(n, cn)}{n}.$$
Then, from Theorem 5.1 we have $\bar\alpha''(c) = 0$ for $c \in (0.99, 1)$, while (5.1) implies that $\bar\alpha''(c) \ne 0$ for small $c$. The formula $\bar\alpha(c) = 1 - c$ for $c \in (0.99, 1)$ follows from the fact that the best strategy for Breaker near the critical point is to color edges with precisely one end in the component of Maker's graph which is the largest at this stage of the game. However, for small $r/n$ this strategy fails completely; it cannot even prevent Maker from building a connected graph. Thus, $\bar\alpha(c)$ has at least one more singularity in the interval $(0, 0.99)$, i.e., in the evolution of $\bar L(n,r)$ there exist several “critical points”. Somewhat annoyingly, although the formula for the corresponding function $\alpha(c)$ for $L_1(n, c/n)$ was already given in [9] (see Theorem 2.1), the behavior of $\bar\alpha(c)$ is still to be determined.

References

[1] Beck, J.: Combinatorial Games, Cambridge Univ. Press, 2005.
[2] Beck, J.: The Erdős-Selfridge theorem in positional game theory. In “Paul Erdős and his mathematics, II” (Budapest, 1999), Bolyai Soc. Math. Stud., 11, János Bolyai Math. Soc., Budapest, 2002, 33–77.
[3] Bednarska, M., Łuczak, T.: Biased positional games for which random strategies are nearly optimal, Combinatorica 20, 477–488 (2000).
[4] Bednarska, M., Łuczak, T.: Biased positional games and the phase transition, Random Struct. Algorithms 18, 141–152 (2001).
[5] Bollobás, B.: The evolution of random graphs, Trans. Amer. Math. Soc. 286, 257–274 (1984).
[6] Bollobás, B.: Random Graphs, 2nd edition. Cambridge University Press, 2001.
[7] Bollobás, B., Grimmett, G.R., Janson, S.: The random-cluster model on the complete graph, Probab. Theory Relat. Fields 104, 283–317 (1996).
[8] Dolan, P., Lynch, J.F.: The logic of ordered random structures, Random Struct. Algorithms 4, 429–445 (1993).
[9] Erdős, P., Rényi, A.: On the evolution of random graphs, Magyar Tud. Akad. Mat. Kutató Int. Közl. 5, 17–61 (1960).
[10] Janson, S., Knuth, D.E., Łuczak, T., Pittel, B.: The birth of the giant component, Random Struct. Algorithms 4, 233–358 (1993).
[11] Janson, S., Łuczak, T., Ruciński, A.: Random Graphs. Wiley, 2000.
[12] Łuczak, T.: Component behaviour near the critical point of the random graph process, Random Struct. Algorithms 1, 287–310 (1990).
[13] Luczak, M., Łuczak, T.: The phase transition in the cluster-scaled model of a random graph, Random Struct. Algorithms, to appear.
[14] Łuczak, T., Pittel, B., Wierman, J.C.: The structure of a random graph near the point of the phase transition, Trans. Amer. Math. Soc. 341, 721–748 (1994).
[15] Łuczak, T., Spencer, J.: When does the zero-one law hold? J. Amer. Math. Soc. 4, 451–468 (1991).
[16] Łuczak, T., Thoma, L.: Convergence of probabilities for the second-order monadic properties of a random mapping, Random Struct. Algorithms 11, 277–295 (1997).
[17] Lynch, J.F.: Probabilities of sentences about very sparse random graphs, Random Struct. Algorithms 3, 33–53 (1992).
[18] Shelah, S., Spencer, J.: Zero-one laws for sparse random graphs, J. Amer. Math. Soc. 1, 97–115 (1988).
Tomasz Łuczak
Faculty of Mathematics and CS
Adam Mickiewicz University
PL-61-614 Poznań, Poland
e-mail:
[email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
Systems Controlled by Rough Paths

Terry Lyons

Abstract. It is a matter of observation that many complex and important systems evolve, and that this evolution depends, to an extent, on external stimuli (which we call controls). These stimuli frequently vary with time, and on normal scales are often highly oscillatory and potentially non-differentiable. To model them mathematically one has to go beyond the classical theory of differential equations and find a meaning for
$$\frac{dy_t}{dt} = \sum_i f^i(y_t)\,\frac{d\gamma^i_t}{dt}, \qquad y_0 = a,$$
when $\gamma$ is a rough path. The theory of rough paths draws together earlier perspectives and results of L.C. Young and K.T. Chen; it develops the analysis required to model these interactions without imposing the requirement that the control be differentiable.
1. Introduction

Differential equations are basic tools in pure and applied mathematics; they model the evolution of systems and can express interactions between systems. The theory of rough paths develops a rigorous extension of the classical theory of differential equations powerful enough to express interactions between (appropriate) systems without imposing the classical smoothness requirements. The approach is essentially non-linear and complementary to the theory of distributions.

Let $y_t \in M$ represent the state of an autonomous evolving system at time $t$. If the state space $M$ is a manifold and the evolution is smooth and deterministic, then the direction and magnitude of that evolution (when started at a general point $a \in M$) defines a vector field $f(a)$. One can model the evolution using a differential equation:
$$\frac{dy_t}{dt} = f(y_t)$$
or, in slightly different notation,
$$dy_t = f(y_t)\,dt. \qquad (1.1)$$
Suppose one adds to this picture the possibility of external influences (or controls) acting to modify the evolution. Consider a smooth path $(\gamma_t)_{t\in[0,T]}$ in $\mathbb{R}^n$ and a family $(f^i)_{i\in\{1,\dots,n\}}$ of vector fields on $M$. Then one might consider
the differential equation
$$dy_t = \sum_i f^i(y_t)\,d\gamma^i_t. \qquad (1.2)$$
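For a smooth control, (1.2) is an ordinary differential equation and its response can be approximated numerically. The Python sketch below uses an explicit Euler scheme driven by the increments of $\gamma$. It assumes a Euclidean state space and the time interval $[0,1]$, and the name `euler_response` and its signature are illustrative choices made here, not part of the theory.

```python
import math


def euler_response(vector_fields, control, y0, steps=1000):
    """Explicit Euler scheme for dy = sum_i f^i(y) dgamma^i on [0, 1].

    vector_fields: list of functions f^i mapping a state (list) to a list.
    control: function t -> list of the coordinates gamma^i_t.
    """
    y = list(y0)
    for k in range(steps):
        g_start = control(k / steps)
        g_end = control((k + 1) / steps)
        # Increment of each coordinate of the control over this step.
        d_gamma = [b - a for a, b in zip(g_start, g_end)]
        update = [0.0] * len(y)
        for f, dg in zip(vector_fields, d_gamma):
            direction = f(y)
            for j in range(len(y)):
                update[j] += direction[j] * dg
        y = [yj + uj for yj, uj in zip(y, update)]
    return y


# dy = -y dt, the autonomous case with the control gamma_t = t.
decay = euler_response([lambda y: [-y[0]]], lambda t: [t], [1.0], steps=10000)

# dy = y dgamma with gamma_t = sin(2t); the exact response is exp(gamma_1 - gamma_0).
growth = euler_response(
    [lambda y: [y[0]]], lambda t: [math.sin(2 * t)], [1.0], steps=20000
)
```

The scheme only ever uses increments of $\gamma$, never its derivative. That is convenient here, but on its own it does not make the scheme converge for rough controls.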
In this case we think of $\gamma$ as a control¹ influencing the evolution of $y_t$ through the $f^i$. If $n = 1$ and $\gamma_t = t$ then we recover (1.1). However, this model has wide application. Much of geometry is involved with the study of connections. Any connection is, by definition, a differential equation of this kind. It provides the information required to lift a path from a base space (the control) to the covering space (the response). We could also consider more applied settings, where $y$ represents the state of some physical system and the external influence or control comes from the external environment (for example, $\gamma_t$ could be the air pressure in my ear and $y$ could represent the state of my cochlea).

The examples are diverse; (1.2) captures an important class of mathematical models. Thought of as a functional from control to response, it is essentially non-linear and contains considerable mathematical structure. For example, we may concatenate two controls $\gamma$ and $\tau$ to get a new control $\gamma * \tau$. This composition is an associative product making the space of paths of finite length (modulo starting point) into a monoid. The flow defined by $\pi_\gamma(y_0) := y_T$ defines a homomorphism from the space of controls with this product into the group of diffeomorphisms of $M$, and solving (1.2) can be regarded as constructing a group representation of $(\{\gamma, \text{paths of finite length}\}, *)$ having been given a Lie map. Every homomorphism of finitely generated Lie groups can be factored through such a model. In some sense these are the simple cases (because the Lie algebra generated by the $f^i$ is finite-dimensional); most of the problems that arise are truly infinite-dimensional or come from understanding behavior as dimension increases in some way.

1.1. Non-differentiable controls.

Considering again the more physical examples, one concludes that it seems inappropriate to regard the control $\gamma$ as piecewise smooth or even differentiable on normal time-scales. One has a fundamental question.
Problem 1.1. Can one extend calculus to make sense of equations such as
$$dy_t = \sum_i f^i(y_t)\,d\gamma^i_t, \qquad y_0 = a,$$
for a wider class of $\gamma$, and move beyond the classical examples where $f^i$ is Lipschitz and $\gamma^i$ is of finite length?

¹ One should point out that in the standard control theory literature $\dot\gamma$ is called the control. We regard $\gamma$ as the control; using our notation $\gamma$ and $y$ are essentially similar types of object, there is no implicit smoothness assumption forced on us regarding $\gamma$, and it is transparent that development using a connection provides another example of the same class.

One simple approach would be to avoid the difficulty and assume there is a very fine time-scale on which controls and evolutions are differentiable. In that case one merely moves the problem to the closely related

Problem 1.2. Is there a metric on paths in which the control-response map $I_f$ taking $\gamma$ to $y$ is continuous or, better, uniformly continuous?

The initial answer is negative: this “control to response” map is never continuous in the uniform norm on $\gamma$, except in the trivial case where the vector fields all commute (which includes the case $d = 1$). On reflection, one can easily see that this is as it should be: it could never make unambiguous sense to develop a general continuous path into a non-abelian Lie group, and this is the simplest example of (1.2). Nonetheless, one can point to two positive results. The first is due to L.C. Young [34], who proved that if $\gamma$ is a path of finite $p$-variation and $\tau$ is a path of finite $q$-variation, where $1/p + 1/q > 1$, and the jumps of the two paths do not coincide, then
$$\int_0^{2\pi} \tau\,d\gamma$$
is well defined. He used the result to make a decisive contribution to Harmonic Analysis. It was only recently that his method was extended [20] to treat differential equations of the type (1.2) where $\gamma$ is of finite $p$-variation with $1 \le p < 2$ and $f$ is Hölder of order greater than $p - 1$. This does not cover the Brownian case (any $p > 2$ would suffice).

The paper [20] can be regarded as one of many trying to catch up with Itô! In 1942 Itô [10] provided a spectacular positive answer for an interesting probabilistic example, when he showed that one could make sense of $I_f(W)$ where $W$ was Brownian motion on $(\Omega, \mathcal{F}, \mathbb{P})$ and $I_f(W)$ was regarded as an element of $L^2(\Omega, \mathcal{F}, \mathbb{P})$. While Young's method was deterministic, probability was essential to Itô's construction and, unfortunately, $I_f(W)$, in common with all elements of $L^2$, is an equivalence class of functions. In particular, the response $I_f(\omega)$ is defined for almost all paths, but is not defined at a given path $\omega$! Nonetheless, the importance of these stochastic differential equations is now unquestionable (although it took some time for Itô's contribution to be fully recognized) and the functional $I_f$ is known as the Itô functional, reflecting the breakthrough his work represented. However, the Itô-Stratonovich approach is quite unstable, and for $d > 1$ and non-commuting $f^i$, $\mathbb{P}$-almost every point in $C([0,1], \mathbb{R}^d)$ is a point of discontinuity for $I_f(W)$ in any norm that carries Wiener measure. One might conclude that Itô's approach was the best one could do [19] and that it would be impossible to treat (1.2) for paths that were rougher than Brownian paths.

So it came as a surprise that there is a non-linear approach that works for any degree of roughness in $\gamma$ and does not involve probability. Using Lévy's
construction of a stochastic area allows the approach to be applied to almost all $d$-dimensional Brownian paths. It applies to many other stochastic models, completely outside the range of classical semi-martingale theory [4, 8, 1, 11].

1.2. The signature of a path of finite length.

The theory of Rough Paths starts from the premise that the Itô functional has an obvious canonical meaning if a control $\gamma$ is smooth, and then provides a positive answer to the question posed in Problem 1.2. The core of the methodology is to combine the ideas of Young with those of K.T. Chen [5], who considers iterated integrals as natural functions on the space of paths. Suppose $(\gamma_t)_{t\in[0,T]}$ is a continuous path segment of bounded variation (finite length) in a Banach space $V$. One may consider its first $n$ iterated integrals, which we group together as a single element in the truncated tensor algebra:
$$\tilde Z_\gamma := 1 + \int_{0<u<T} d\gamma_u + \iint_{0<u_1<u_2<T} d\gamma_{u_1}\,d\gamma_{u_2} + \cdots + \int\!\cdots\!\int_{0<u_1<\cdots<u_n<T} d\gamma_{u_1}\,d\gamma_{u_2}\cdots d\gamma_{u_n} \;\in\; \mathbb{R} \oplus V \oplus (V \otimes V) \oplus \cdots \oplus V^{\otimes n}.$$
We call $\tilde Z_\gamma$ the truncated signature of $\gamma$. As Chen appreciated, there are several fundamental reasons for this choice. One important remark is that the map $\gamma \to \tilde Z_\gamma$ is a representation of the monoid of paths into the truncated tensor algebra. Let $\otimes$ denote the product on the truncated algebra (the truncated tensor product); then
$$\tilde Z_\gamma \otimes \tilde Z_\tau = \tilde Z_{\gamma * \tau}.$$
This follows easily if one observes that $\tilde Z_\gamma$ is the solution at time $T$ to the differential equation
$$dZ_u = Z_u \otimes d\gamma_u, \qquad Z_0 = 1.$$
It also follows from this observation that the range (in the truncated tensor algebra) of the map $\gamma \to \tilde Z_\gamma$ is a group for each $n$: it is closed under products and (running $\gamma$ backwards) contains the inverse of every element.

Remark 1.3. The truncated signature is a map from paths of finite length onto the free $n$-step nilpotent group $G^{(n)}$ over $V$. The group is itself represented in the truncated tensor algebra. It is a routine consequence of the Chow-Rashevskii theorem that the map is onto.

Letting $n \to \infty$ one could also consider the full signature $Z_\gamma$ of the path. Again it is obvious that the range of this map is a group in the full tensor algebra. Chen proved that any path $\gamma$ of finite length which has a continuous derivative when parameterised at unit speed is completely characterised by its
full signature. Quite recently Hambly and Lyons have considered the signature map for general paths of finite length, exactly characterised the fibres of this map $\gamma \to Z_\gamma$, and proved that among paths with a given signature there is a unique shortest one, which they call the tree reduced path. Recall that there is a canonical projection from the space of all words in an alphabet $\Lambda$ onto the free group generated by $\Lambda$. The reduced word is the canonical representative of a fibre in the projection map. The tree reduced paths form a group, and play a similar conceptual role in this continuous setting to the reduced word. Since one may easily associate with any word a path along the lattice in $\mathbb{R}^\Lambda$, we can also see the discrete setting as a restriction of the continuous case.

Problem 1.4. Find effective algorithms for recovering the tree reduced path from its signature.

Sidorova and Lyons (to appear) observe that there is always at least one shortest path with given truncated signature and that, as a consequence of the uniqueness of the tree reduced path and the continuity of the signature in rough path $p$-metrics for $p > 1$, these paths will converge pointwise, as $n \to \infty$, to the reduced path. However, the methods for determining the path with given truncated signature are deeply problematic.

1.3. Rough paths.

Every path segment $\gamma$ of finite length has a signature. We can use the truncated signature to compare paths. Fix $p > 1$ and let $\tilde Z_\gamma$ be the signature of a path segment $\gamma$ truncated after the tensors of degree $n$ and regarded as an element of the $n$-step nilpotent group $G^{(n)}$. We can immediately introduce a metric on paths $\gamma$ in $\mathbb{R}^d$ by asking that these lifts be close in $p$-variation. (The method we give here for constructing a metric is conceptually simple but loses information as $d$ or $n$ increase. All the results here can be extended to paths in Banach spaces if one defines the metrics more carefully, and in general the results are much more precise than the presentation here would suggest, so the seriously interested reader should revert to these more effective, if conceptually less simple, definitions [27].) Fix $n > p - 1$. Let $\gamma$ and $\tau$ be two paths of finite length in $\mathbb{R}^d$ defined on $[0,T]$, and set
$$d_p(\gamma, \tau) = \sup_{D} \sum_{D} \left\| \tilde Z^{-1}_{\gamma|_{[t_i, t_{i+1}]}}\, \tilde Z_{\tau|_{[t_i, t_{i+1}]}} \right\|^p,$$
where the supremum is taken over dissections $D = \{t_i\}$ of $[0,T]$.

Theorem 1.5. Fix $p$. If a sequence of paths $\gamma_k$ is $d_p$-Cauchy for one $n > p - 1$, it is Cauchy for all $n$. Indeed, every path in $G^{([p])}$ has a canonical lift to a path of $p$-variation in $G^{(n)}$ for any $n > [p]$, and the map is continuous.

Since the space of paths of finite $p$-variation in $G^{(n)}$ is a complete metric space, we see that the completion of the space of paths of finite length can be identified with a collection of paths in $G^{(n)}$ having finite $p$-variation.
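For a piecewise linear path the truncated signature can be computed exactly, which gives a concrete check of the multiplicative property $\tilde Z_\gamma \otimes \tilde Z_\tau = \tilde Z_{\gamma*\tau}$ and of the group inverse obtained by running a path backwards. The Python sketch below works at truncation level $n = 2$ in $\mathbb{R}^d$. The function names and the pair-of-lists representation are assumptions made for this illustration.

```python
def segment_signature(dx):
    """Level-2 signature of a straight segment with increment dx:
    first level dx, second level (dx tensor dx) / 2. The leading 1 is implicit."""
    d = len(dx)
    level2 = [[dx[i] * dx[j] / 2.0 for j in range(d)] for i in range(d)]
    return (list(dx), level2)


def tensor_product(a, b):
    """Truncated tensor product of (1, a1, a2) and (1, b1, b2) at level 2."""
    a1, a2 = a
    b1, b2 = b
    d = len(a1)
    c1 = [a1[i] + b1[i] for i in range(d)]
    c2 = [[a2[i][j] + b2[i][j] + a1[i] * b1[j] for j in range(d)] for i in range(d)]
    return (c1, c2)


def signature(points):
    """Level-2 signature of the piecewise linear path through the given points."""
    d = len(points[0])
    sig = ([0.0] * d, [[0.0] * d for _ in range(d)])
    for p, q in zip(points, points[1:]):
        sig = tensor_product(sig, segment_signature([q[i] - p[i] for i in range(d)]))
    return sig


# Unit square traversed counterclockwise: a closed loop enclosing area 1.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
square_sig = signature(square)
```

For the closed square the first level vanishes, while the antisymmetric part of the second level, `square_sig[1][0][1] - square_sig[1][1][0]`, equals twice the enclosed area. This is the extra information that the position of the path alone does not record.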
Definition 1.6. Fix $p < n + 1$. A geometric rough path in $\mathbb{R}^d$ with finite $p$-variation is a path $\Gamma$ in $G^{(n)}$ having finite $p$-variation.

In view of the previous theorem the definition does not depend on the choice of $n$.

Theorem 1.7. If the vector fields $f^i$ are Lip$(p')$ then the Itô functional $I_f$ extends to a functional on the geometric rough paths with finite $p$-variation, $p < p'$, and the map is continuous in $d_p$ for each $p < p'$. If the vector fields are smooth then the map extends to all rough paths.

These metrics are not norms and the completion is a space of paths in a free homogeneous nilpotent group projecting onto $\mathbb{R}^d$! The complexity of this group increases as $p$ increases. For $p < 2$ it is just $\mathbb{R}^d$ and one recovers the Young theory. The degree of non-linearity in the analysis is directly related to the degree of roughness one allows in the control. The results were shown to be sharp by A.M. Davie (private communication), who also has a proof of this theorem based on Peano ideas. The original proof in [27] was based on studying Picard iteration and proving uniform convergence. It turns out that Brownian motion is on the edge of the linear theory $p < 2$; slightly outside, but so close to it that cancellations ensure (with probability one) that everything works using classical arguments.

1.4. Summary.

One can complete the space of piecewise smooth paths in any of these metrics. Anything in the completion can be a control. It turns out that this approach works well, and has many ramifications. However, one should caution the reader: such an approach could be useless and even a tautology (let $\gamma^{(n)}$ converge if and only if $I_f(\gamma^{(n)})$ converges for all $f$), and it is the concrete realisation of these spaces of rough paths that gives one the possibility of a useful theory. Still it might be of little interest if it did not, at least, include the main Itô examples. In fact it does much more. The development has already had reasonably wide ramifications.
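The $p$-variation that appears throughout these statements is easy to compute for a path sampled at finitely many points: the supremum over dissections becomes a maximum over subsequences of sample points containing both endpoints, which a short dynamic programme finds. The Python sketch below is a minimal illustration for paths in $\mathbb{R}^d$ with the Euclidean distance. The function name is an assumption, and it returns the sum of $p$-th powers without taking a $p$-th root.

```python
from math import dist


def p_variation_sum(points, p):
    """Maximum over dissections of sum |x_{t_{i+1}} - x_{t_i}|^p for a sampled path.

    best[j] is the optimum over dissections of the samples 0..j that end at j.
    """
    n = len(points)
    best = [0.0] * n
    for j in range(1, n):
        best[j] = max(best[i] + dist(points[i], points[j]) ** p for i in range(j))
    return best[-1]


monotone = [(0.0,), (1.0,), (2.0,), (3.0,)]
zigzag = [(0.0,), (1.0,), (0.0,), (1.0,)]
```

For $p = 1$ this is the length of the sampled path. For $p > 1$ a monotone path is best served by the trivial dissection, while an oscillating one gains from using every turning point, which is why rapid oscillation at small amplitude is what roughness means here.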
The key connection with Brownian motion is the observation that if one considers the piecewise linear dyadic approximations to Brownian motion used in Lévy's construction of the process then, with probability one, these approximations are $d_p$-Cauchy for every $p > 2$, and so almost every Brownian path can be thought of as a geometric rough path. This has allowed a path-wise interpretation of solutions to stochastic differential equations without exploiting the previsible assumption. However, broadly the same approach has been shown to work much more widely. Many other processes can also be regarded as geometric rough paths, permitting completely new classes of stochastic coupling and stochastic differential equations: there is a canonical way to regard almost every sample path of multi-dimensional fractional Brownian motion as a rough path, provided the Hurst index is at least 1/4 [8]. The theory of rough paths is infinite-dimensional and this has allowed one to give the first proofs for the existence of solutions to
wide classes of linear SDEs driven by infinite-dimensional Brownian motions [11]. It has allowed new proofs, with better estimates, of hard results such as the support theorem [12]. Not all the applications are related to probability theory. The approach suggests new ways to compress sound, and there is also a project on speech recognition based on it. There are now three extended sets of notes on the topic and another is planned; it is quite impossible to describe all the developments here. We give an extended bibliography below.

2. Pure rough paths

However, perhaps the most exciting aspect of the theory so far discovered is only now becoming apparent. Conventional wisdom allows one to drive a stochastic process with Brownian motion, but not with its derivative, white noise. However, it seems that this wisdom is not really correct, and that there are other simple surprises.

Fix some $p > 2$, set $n = [p]$, and consider one of these rough path metrics $d_p$. One apparently awkward aspect of the theory is that the graph of the Itô functional is not closable in the space of paths in $\mathbb{R}^d$. The completion of the smooth paths in the $d_p$ metric lives in the space of paths of finite $p$-variation in a finite-dimensional group $G^{(n)}$. That group projects onto $\mathbb{R}^d$, so every rough path gives rise to a path in $\mathbb{R}^d$; moreover every path of finite $p$-variation in $\mathbb{R}^d$ has a $p'$-rough path over it for any $p' > p$. However, this lift is never unique! If $p > 2$ then there will be many rough paths in $G^{(n)}$ with finite $p$-variation above a path in $\mathbb{R}^d$ if there is one! Each rough path extension produces a different response, at least for some differential equations. On the positive side, the lift from $G^{(n)}$ to a finite $p$-variation path in $G^{(n')}$ is unique if $n' > n$. This lack of uniqueness is true even for smooth paths.
There is a unique rough path with finite 1-variation associated to a smooth path, but if $p > 2$ then there are many geometric rough paths with finite $p$-variation over any smooth path. This is even true of the constant path, the path $\gamma_t = 0$, $t \in [0,T]$. As one learns the theory, one quickly understands the extra information carried in a rough path, over and above its position in $\mathbb{R}^d$. It is finite-dimensional and not mysterious. However, it can still sometimes be surprising, and it leads to situations where you see the smooth path in $\mathbb{R}^d$ but the rough path creating the response is not the canonical one associated with the smooth path but some other. At first this seemed like a pathology and an awkwardness. The remainder of this paper will present some very simple examples which have convinced this author that situations where what you see ($\gamma$, the smooth path in $\mathbb{R}^d$) is not what you feel (the rough control $\Gamma$ with finite $p$-variation with $p > 2$ over $\gamma$) are actually very natural and must be quite common in the wild.

2.1. What you see and what you feel.

In this section we give some quite simple and conceptual examples where the path you can see is clearly not responsible
for the way the system responds. We start with elementary pure mathematical examples; these were introduced in [27]. Consider the following planar curves
$$\gamma_{n,t} := \frac{1}{n}\left(\cos n^2 t,\ \sin n^2 t\right) := (r_{n,t}, s_{n,t}).$$
One can quickly verify that these are Cauchy in the rough path metric $d_p$ for any $p > 2$. This example is particularly instructive in that it demonstrates that in general one cannot expect $I_f$ to be continuous in the uniform topology. What you see is the curves $\gamma_n \in \mathbb{R}^2$ clearly converging to zero in the uniform topology and in $p$-variation for any $p > 2$. If we take the anti-symmetric part of the two-tensor component of the signature then it has the expression
$$\iint_{0<u_1<u_2<t} dr_{n,u_1}\,ds_{n,u_2} - ds_{n,u_1}\,dr_{n,u_2}$$
and is (twice) the area captured by the path, counted with multiplicity. The limiting signature of $\gamma_n$, which obviously has no 1-tensor component, does not tend to zero at level two and is
$$\Gamma_t = 1 + 0 + \frac{t}{2}\,(1 \otimes i - i \otimes 1).$$
The limiting behavior of
$$dy_{n,t} = \sum_i f^i(y_{n,t})\,d\gamma^i_{n,t}$$
as $n$ tends to infinity is non-trivial: one feels the control in the limit. In fact, although we cannot prove it here, $\lim_{n\to\infty} y_{n,t} = y_t$ where $y_t$ solves
$$dy_t = [f^i, f^j](y_t)\,dt.$$

The above example is simple to understand and purely mathematical. The next suggests that rough paths with “non-canonical” second order terms occur naturally.

Example 2.1 (Hoff, Thesis, 2005). Let $B_t$ be a Brownian motion and for some $\varepsilon > 0$ consider the delay differential equation
$$dy_{\varepsilon,t} = f^1(y_{\varepsilon,t})\,dB_t + f^2(y_{\varepsilon,t})\,dB_{t-\varepsilon}.$$
We might loosely think of this equation as modelling the vibrations in a glass when a sound $B_t$ hits it; the multiple terms arise because the sound can take multiple routes, e.g., a direct path and also a bounce off a wall, arriving slightly later. For convenience, take the equation in the Stratonovich sense; we can use Itô calculus to construct solutions since, over intervals of time shorter than $\varepsilon$, the two Brownian motions in the two time frames are independent. What happens as $\varepsilon \to 0$? Intuitively one might expect that one gets superposition of the effects and $y_\varepsilon \to y$ where
$$dy_t = \left(f^1(y_t) + f^2(y_t)\right) dB_t.$$
In general this is quite false. It only happens if the vector fields $f^i$ commute. The point is that $(B_t, B_{t-\varepsilon})$ converges as a rough path (so the responses do converge) but the limit has non-trivial second order terms and is not the obvious lift of $(B_t, B_t)$. In fact $y_\varepsilon \to y$ where
$$dy_t = \left(f^1(y_t) + f^2(y_t)\right) dB_t + [f^i, f^j](y_t)\, dt.$$
In both of these examples, it was easy to write down the correction as a new differential equation of classical kind. This does not have to be the case but it is reasonable to expect it in the simplest cases. In the following example we do something quite radical (but not difficult given the theory). We take a discrete model for white noise and renormalise it so that one gets convergence in distribution to a random rough path $\Pi_t$ with finite $p$-variation for every $p > 4$. It will have no spatial motion. The solution to a rough differential equation
$$dy_t = \sum_i f^i(y_t)\, d\Pi^i_t$$
can be expressed as a classical stochastic differential equation driven by Brownian motion; however these equations involve only Lie brackets of vector fields (and so are invisible in the case where the vector fields commute). The Brownian motion in the expression comes from the 2-tensor term in the signature of $\Pi$.

Example 2.2 (Hambly and Lyons). Suppose that $X_n$ are independent identically distributed two-dimensional normal or Gaussian random variables with mean 0 and variance 1. Consider the path $w_t$ defined by
$$w_t = X_n, \qquad t \in (n-1, n].$$
Then $w$ is a discrete analogue of white noise. If one renormalises it,
$$w^{(T)}_t = \sqrt{T}\, w_{Tt},$$
then, as a distribution valued random variable, $w^{(T)}$ converges in law to white noise. Its indefinite integral
$$b^{(T)}_t := \int_0^t w^{(T)}_s\, ds$$
converges in law to Brownian motion. Let $\gamma_t$ be the continuous function that satisfies $\gamma_n = X_n$ and is linear in between. Then essentially the same remarks hold true for $w^{(T)}_t$ and $\sqrt{T}\,\gamma_{Tt}$. Although one might suppose from classical Itô integration that there was not much point in looking at differential equations driven by $\gamma$ and its renormalisations, we now have a wider notion of convergence: the possibility of convergence to a non-trivial control is not ruled out by convergence to zero as a path in $\mathbb{R}^d$. So let us be bold! Let
$$\gamma^{(T)}_t := \frac{\gamma_{Tt}}{T^{1/4}}.$$
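The rescaling just defined can be explored numerically. The following is only a minimal sketch, not the construction of Hambly and Lyons: it assumes that each coordinate of $X_n$ is a standard normal, adds an independent vertex $X_0$ at time 0, restricts to the time interval $[0, 1]$, and uses the hypothetical helper names `rescaled_vertices`, `levy_area` and `summarize`. It estimates the sup norm of $\gamma^{(T)}$ and the spread of the area between the path and its chord.

```python
import numpy as np


def rescaled_vertices(T, rng, t_max=1.0):
    """Vertices of gamma^{(T)} on [0, t_max]: gamma_k = X_k, divided by T^(1/4).

    ASSUMPTION: each coordinate of X_k is standard normal, and an extra
    independent vertex X_0 is placed at time 0.
    """
    n = int(round(T * t_max))            # number of linear pieces on [0, T * t_max]
    X = rng.standard_normal((n + 1, 2))  # i.i.d. Gaussian vertices X_0, ..., X_n
    return X / T ** 0.25


def levy_area(vertices):
    """Signed area between a piecewise linear planar path and its chord.

    The formula 0.5 * sum((x_k - x_0) * dy_k - (y_k - y_0) * dx_k) is exact
    for piecewise linear paths.
    """
    rel = vertices[:-1] - vertices[0]    # position relative to the starting point
    inc = np.diff(vertices, axis=0)      # increment along each linear piece
    return 0.5 * np.sum(rel[:, 0] * inc[:, 1] - rel[:, 1] * inc[:, 0])


def summarize(T, n_samples=200, seed=0):
    """Monte Carlo estimate of (mean sup norm, standard deviation of the area)."""
    rng = np.random.default_rng(seed)
    sups, areas = [], []
    for _ in range(n_samples):
        v = rescaled_vertices(T, rng)
        # The sup norm of a piecewise linear path is attained at a vertex.
        sups.append(np.linalg.norm(v, axis=1).max())
        areas.append(levy_area(v))
    return float(np.mean(sups)), float(np.std(areas))


if __name__ == "__main__":
    for T in (100, 10_000):
        mean_sup, area_std = summarize(T)
        print(f"T={T:>6}: mean sup norm = {mean_sup:.3f}, std of area = {area_std:.3f}")
```

Under these assumptions the mean sup norm should shrink as $T$ grows while the standard deviation of the area stays of order one (a short variance computation suggests a value near $\sqrt{1/2}$), which is the qualitative behaviour the example describes.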
A trivial Borel–Cantelli argument shows that the paths converge uniformly to zero as $T \to \infty$. It is an easy exercise to see that the stochastic area has a non-trivial limiting distribution. The processes converge in distribution to a stochastic 4-rough path process $\Pi_t$ and so a limiting behavior exists for $dy_t = \sum_i f^i(y_t)\, d\Pi^i_t$ (provided the $f^i$ are $C^{4+\varepsilon}$).

3. Conclusions

When Lebesgue generalised the integral of Riemann he did it in an essentially commutative (rearrangement invariant) way. In contrast, the theory of rough paths exploits the order of the incremental events that make up the control $\gamma$ in a decisive way. An alternative name for the theory could have been noncommutative analysis. Based on the iterated integral expansions studied by Chen, as well as by control theorists, and using analytic tools extending results of Young, the author established a family of uniform metric conditions on paths $\gamma$, $y$ which ensure that the functional $\gamma \to y(\gamma)$ is uniformly continuous, so that two controls $\gamma$ which are close in one of these metrics will produce similar responses $y(\gamma)$. The closure of the piecewise smooth paths in these metrics consists of the rough paths. They are concrete spaces of “generalised” paths and are quite straightforward to understand. They include paths with arbitrary degrees of roughness and lead to new classes of controls. Probability theory is a rich and natural environment producing many examples of “Rough Paths”, and these methods recover the main results of the Itô methodology through the existence of Lévy area; however the rough path methodology allows one to identify completely new couplings of stochastic systems. Systems controlled by multidimensional fractional Brownian motion can often be fitted into this framework even though they are outside the usual Itô framework. The approach leads to new classes of control, which we refer to as pure rough paths.

On reflection it is clear that the extra structure implicit in these pure rough paths must be reflected in a number of everyday phenomena although it is not visible in more classical differential equation models. We also mention that continuous paths have played an important part in topology and pure mathematics for many years. Rough paths seem to be the natural class to consider when one is interested in issues of a more geometric or differential nature. For example, on the one hand, under reasonable equicontinuity assumptions a pointwise limit of rough paths is a rough path; on the other, given a fibre bundle over a base space, a rough path on the base space, and a smooth connection, it is always possible to lift the rough path in a unique way to the bundle. If I could suggest a single future development, I would point out that the theory of rough paths is a rigorous and structured approach to multi-scale analysis. It takes a control and says what we need to know about its large
scale structure if we are to predict its influence on an evolving system. On the pure side it is of real interest to understand the nature of the range of the signature map and how to invert it efficiently. But there are many other questions...

Finally, in the talk and this short note we have tried to give a flavor of some of the consequences of the theory of rough paths. However, this is only partly possible without more details and less informality. There are now a surprising number of contributions to the field and we have tried to include many in the Bibliography [28, 2, 3, 31, 14, 8, 29, 11, 1, 12, 13, 15, 24, 4, 33, 30, 25, 9, 23, 32, 27, 18, 21, 22, 17, 16, 26, 20, 19, 7, 6, 5].

References

[1] R.F. Bass, B.M. Hambly, and T.J. Lyons, Extending the Wong-Zakai theorem to reversible Markov processes, J. Eur. Math. Soc. (JEMS) 4 (2002), no. 3, 237–269. MR 1924401 (2003f:60128)
[2] Richard F. Bass, Krzysztof Burdzy, and Zhen-Qing Chen, Stochastic differential equations driven by stable processes for which pathwise uniqueness fails, Stochastic Process. Appl. 111 (2004), no. 1, 1–15. MR 2049566
[3] Fabrice Baudoin, Équations différentielles stochastiques conduites par des lacets dans les groupes de Carnot, C. R. Math. Acad. Sci. Paris 338 (2004), no. 9, 719–722. MR 2065381
[4] M. Capitaine and C. Donati-Martin, The Lévy area process for the free Brownian motion, J. Funct. Anal. 179 (2001), no. 1, 153–169. MR 1807256 (2001k:46103)
[5] Kuo-Tsai Chen, Integration of paths, geometric invariants and a generalized Baker-Hausdorff formula, Ann. of Math. (2) 65 (1957), 163–178. MR 0085251 (19,12a)
[6] Kuo-Tsai Chen, Formal differential equations, Ann. of Math. (2) 73 (1961), 110–133. MR 0150370 (27 #371)
[7] Kuo-Tsai Chen, Formal differential equations, Rev. Un. Mat. Argentina 20 (1962), 58–62. MR 0152522 (27 #2500)
[8] Laure Coutin and Zhongmin Qian, Stochastic analysis, rough path analysis and fractional Brownian motions, Probab. Theory Related Fields 122 (2002), no. 1, 108–140. MR 1883719 (2003c:60066)
[9] B.M. Hambly and T.J. Lyons, Stochastic area for Brownian motion on the Sierpinski gasket, Ann. Probab. 26 (1998), no. 1, 132–148. MR 1617044 (99f:60145)
[10] K. Ito.
[11] M. Ledoux, T. Lyons, and Z. Qian, Lévy area of Wiener processes in Banach spaces, Ann. Probab. 30 (2002), no. 2, 546–578. MR 1905851 (2003h:60088)
[12] M. Ledoux, Z. Qian, and T. Zhang, Large deviations and support theorem for diffusion processes via rough paths, Stochastic Process. Appl. 102 (2002), no. 2, 265–283. MR 1935127 (2003m:60152)
[13] Antoine Lejay, On the convergence of stochastic integrals driven by processes converging on account of a homogenization property, Electron. J. Probab. 7 (2002), no. 18, 18 pp. (electronic). MR 1943891 (2003k:60073)
[14] Antoine Lejay, An introduction to rough paths, Séminaire de Probabilités XXXVII, Lecture Notes in Math., vol. 1832, Springer, Berlin, 2003, pp. 1–59. MR 2053040
[15] T.J. Lyons, System control and rough paths, Numerical methods and stochastics (Toronto, ON, 1999), Fields Inst. Commun., vol. 34, Amer. Math. Soc., Providence, RI, 2002, pp. 91–99. MR 1944747 (2003k:60121)
[16] T.J. Lyons and Z.M. Qian, Calculus for multiplicative functionals, Itô's formula and differential equations, Itô's stochastic calculus and probability theory, Springer, Tokyo, 1996, pp. 233–250. MR 1439528 (98k:60111)
[17] T.J. Lyons and Z.M. Qian, Calculus of variation for multiplicative functionals, New trends in stochastic analysis (Charingworth, 1994), World Sci. Publishing, River Edge, NJ, 1997, pp. 348–374. MR 1654380 (2000k:60112)
[18] T.J. Lyons and Z.M. Qian, A class of vector fields on path spaces, J. Funct. Anal. 145 (1997), no. 1, 205–223. MR 1442166 (98d:60107)
[19] Terry Lyons, On the nonexistence of path integrals, Proc. Roy. Soc. London Ser. A 432 (1991), no. 1885, 281–290. MR 1116958 (92j:60069)
[20] Terry Lyons, Differential equations driven by rough signals. I. An extension of an inequality of L. C. Young, Math. Res. Lett. 1 (1994), no. 4, 451–464. MR 1302388 (96b:60150)
[21] Terry Lyons and Zhongmin Qian, Flow equations on spaces of rough paths, J. Funct. Anal. 149 (1997), no. 1, 135–159. MR 1471102 (99b:58241)
[22] Terry Lyons and Zhongmin Qian, Stochastic Jacobi fields and vector fields induced by varying area on path spaces, Probab. Theory Related Fields 109 (1997), no. 4, 539–570. MR 1483599 (98m:60016)
[23] Terry Lyons and Zhongmin Qian, Flow of diffeomorphisms induced by a geometric multiplicative functional, Probab. Theory Related Fields 112 (1998), no. 1, 91–119. MR 1646428 (99k:60153)
[24] Terry Lyons and Zhongmin Qian, System control and rough paths, Oxford Mathematical Monographs, Oxford University Press, Oxford, 2002, Oxford Science Publications. MR 2036784
[25] Terry Lyons and Ofer Zeitouni, Conditional exponential moments for iterated Wiener integrals, Ann. Probab. 27 (1999), no. 4, 1738–1749. MR 1742886 (2001g:60198)
[26] Terry J. Lyons, The interpretation and solution of ordinary differential equations driven by rough signals, Stochastic analysis (Ithaca, NY, 1993), Proc. Sympos. Pure Math., vol. 57, Amer. Math. Soc., Providence, RI, 1995, pp. 115–128. MR 1335466 (96d:34076)
[27] Terry J. Lyons, Differential equations driven by rough signals, Rev. Mat. Iberoamericana 14 (1998), no. 2, 215–310. MR 1654527 (2000c:60089)
[28] Renaud Marty, Théorème limite pour une équation différentielle à coefficient aléatoire à mémoire longue, C. R. Math. Acad. Sci. Paris 338 (2004), no. 2, 167–170. MR 2038288 (2004j:60123)
[29] David Nualart and Aurel Răşcanu, Differential equations driven by fractional Brownian motion, Collect. Math. 53 (2002), no. 1, 55–81. MR 1893308 (2003f:60105)
[30] Thomas Simon, Support theorem for jump processes, Stochastic Process. Appl. 89 (2000), no. 1, 1–30. MR 1775224 (2001j:60116)
[31] Thomas Simon, Small deviations in p-variation for multidimensional Lévy processes, J. Math. Kyoto Univ. 43 (2003), no. 3, 523–565. MR 2028666
[32] Shinzo Watanabe, T. J. Lyons' theory of differential equations that are driven by a rough signal and its application to stochastic differential equations, Sūrikaisekikenkyūsho Kōkyūroku (1998), no. 1032, 1–12, Problems in stochastic numerical analysis, III (Japanese) (Kyoto, 1997). MR 1652420
[33] David R.E. Williams, Path-wise solutions of stochastic differential equations driven by Lévy processes, Rev. Mat. Iberoamericana 17 (2001), no. 2, 295–329. MR 1891200 (2003h:60102)
[34] L.C. Young, An inequality of Hölder type, connected with Stieltjes integration, Acta Math. (1936).
Terry Lyons Mathematical Institute University of Oxford 24-29 St Giles’ Oxford OX1 3LB, United Kingdom e-mail:
[email protected]
4ECM Stockholm 2004 c 2005 European Mathematical Society
The Stable Mapping Class Group and Stable Homotopy Theory
Ib Madsen and Michael Weiss

Abstract. This overview is intended as a lightweight companion to the long article [20]. One of the main results there is the determination of the rational cohomology of the stable mapping class group, in agreement with the Mumford conjecture [26]. This is part of a recent development in surface theory which was set in motion by Ulrike Tillmann's discovery [34] that Quillen's plus construction turns the classifying space of the stable mapping class group into an infinite loop space. Tillmann's discovery depends heavily on Harer's homological stability theorem [15] for mapping class groups, which can be regarded as one of the high points of geometric surface theory.
Key words and phrases. Mapping class group, Mumford conjecture, Morse function, homological stability, infinite loop space, homotopy colimit.
Partially supported by the American Institute of Mathematics (I. Madsen). Partially supported by the Royal Society (M. Weiss).

1. Surface bundles without stable homotopy theory

We denote by $F_{g,b}$ an oriented smooth compact surface of genus $g$ with $b$ boundary components; if $b = 0$, we also write $F_g$. Let $\mathrm{Diff}(F_{g,b}; \partial)$ be the topological group of all diffeomorphisms $F_{g,b} \to F_{g,b}$ which respect the orientation and restrict to the identity on the boundary. (This is equipped with the Whitney $C^\infty$ topology.) Let $\mathrm{Diff}_1(F_{g,b}; \partial)$ be the open subgroup consisting of those diffeomorphisms $F_{g,b} \to F_{g,b}$ which are homotopic to the identity relative to the boundary.

Theorem 1.1. [10], [11]. If $g > 1$ or $b > 0$, then $\mathrm{Diff}_1(F_{g,b}; \partial)$ is contractible.

Idea of proof. For simplicity suppose that $b = 0$, hence $g > 1$. Write $F = F_g = F_{g,0}$. Let $\mathcal{H}(F)$ be the space of hyperbolic metrics (i.e., Riemannian metrics of constant sectional curvature $-1$) on $F$. The group $\mathrm{Diff}_1(F; \partial)$ acts on $\mathcal{H}(F)$ by transport of metrics. The action is free and the orbit space is the Teichmüller space $\mathcal{T}(F)$. The projection map $\mathcal{H}(F) \to \mathcal{T}(F)$ admits local sections, so that $\mathcal{H}(F)$ is the total space of a principal bundle with structure group $\mathrm{Diff}_1(F; \partial)$. By Teichmüller theory, $\mathcal{T}(F)$ is homeomorphic
to a Euclidean space, hence contractible. It is therefore enough to show that $\mathcal{H}(F)$ is contractible. This is not easy. Let $\mathcal{S}(F)$ be the set of conformal structures on $F$ (equivalently, complex manifold structures on $F$ which refine the given smooth structure and are compatible with the orientation of $F$). Let $\mathcal{J}(F)$ be the set of almost complex structures on $F$. Elements of $\mathcal{J}(F)$ can be regarded as smooth vector bundle automorphisms $J : TF \to TF$ with the property $J^2 = -\mathrm{id}$ and $\det(a(v), aJ(v)) > 0$ for any $x \in F$, $v \in T_xF$ and oriented isomorphism $a : T_xF \to \mathbb{R}^2$. Hence $\mathcal{J}(F)$ has a canonical (Whitney $C^\infty$) topology. It is a consequence of the “uniformization theorem” that the forgetful map $\mathcal{H}(F) \to \mathcal{S}(F)$ is a bijection. The forgetful map $\mathcal{S}(F) \to \mathcal{J}(F)$ is also a bijection. This is another hard old theorem (the Korn–Lichtenstein theorem); see, e.g., [8], [27]. Hence the composite map $\mathcal{H}(F) \to \mathcal{J}(F)$ is a bijection. It is clearly continuous. One of the main points of [10] and [11] is that the inverse $\mathcal{J}(F) \to \mathcal{H}(F)$ is also continuous. Hence $\mathcal{H}(F)$ is homeomorphic to $\mathcal{J}(F)$, and $\mathcal{J}(F)$ is clearly contractible.

Definition 1.2. With the assumptions of Theorem 1.1, the mapping class group $\Gamma_{g,b}$ is $\pi_0\mathrm{Diff}(F_{g,b}; \partial) = \mathrm{Diff}(F_{g,b}; \partial)/\mathrm{Diff}_1(F_{g,b}; \partial)$.

Remark 1.3. $B\mathrm{Diff}(F_{g,b}; \partial) \simeq B\Gamma_{g,b}$.

Proof. By Theorem 1.1, the projection $\mathrm{Diff}(F_{g,b}; \partial) \to \Gamma_{g,b}$ is a homotopy equivalence. Hence the induced map $B\mathrm{Diff}(F_{g,b}; \partial) \to B\Gamma_{g,b}$ is a homotopy equivalence.

It seems that a homological theory of mapping class groups emerged only after the Earle–Eells–Schatz result, Theorem 1.1. One of the most basic homological results is the following, due to Powell [29].

Proposition 1.4. $H_1(B\Gamma_g; \mathbb{Z}) = 0$ for $g \ge 3$.

This is of course equivalent to the statement that $\Gamma_g$ is perfect when $g \ge 3$. The proof is based on a result of Dehn's which states that $\Gamma_g$ can be generated by a finite selection of Dehn twists along simple closed curves in $F_g$. Powell shows that each of these generating Dehn twists is a commutator.
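The equivalence between Proposition 1.4 and perfectness rests on the standard identification of the first homology of the classifying space of a discrete group with its abelianization. Spelled out as a sketch (the superscript $\mathrm{ab}$ is notation introduced only for this remark):

```latex
H_1(B\Gamma_g;\mathbb{Z}) \;\cong\; \Gamma_g^{\mathrm{ab}} \;=\; \Gamma_g/[\Gamma_g,\Gamma_g],
\qquad\text{so}\qquad
H_1(B\Gamma_g;\mathbb{Z}) = 0 \iff \Gamma_g = [\Gamma_g,\Gamma_g].
```

If every element of a generating set is a commutator, then all generators lie in $[\Gamma_g,\Gamma_g]$, which forces $\Gamma_g = [\Gamma_g,\Gamma_g]$.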
An important consequence of Proposition 1.4 is that there exist a simply connected space $B\Gamma_g^+$ and a map $f : B\Gamma_g \to B\Gamma_g^+$ which induces an isomorphism in integer homology. The space $B\Gamma_g^+$ and the map $f$ are essentially unique and the whole construction is a special case of Quillen's plus construction, beautifully explained in [1].

Around 1980, Hatcher and Thurston [16] succeeded in showing that $\Gamma_g$ is finitely presented. Their proof uses a simplicial complex of cut systems on a surface, an idea introduced a few years earlier by W.J. Harvey. This is also an essential ingredient in the proof of the following theorem.

Theorem 1.5. Let $N$ be an oriented compact surface, $N = N_1 \cup N_2$ where $N_1 \cap N_2$ is a union of finitely many smooth circles in $N \smallsetminus \partial N$. Suppose that $N_1 \cong F_{g,b}$
and $N \cong F_{h,c}$. Then the homomorphism $H_*(B\Gamma_{g,b}; \mathbb{Z}) \to H_*(B\Gamma_{h,c}; \mathbb{Z})$ induced by the inclusion $N_1 \to N$ is an isomorphism for $* \le g/2 - 1$.

This is the homological stability theorem of Harer [15] with improvements due to Ivanov [17], [18]. It is a hard theorem and we shall not attempt to outline the proof.

Corollary 1.6. $H_1(B\Gamma_{g,b}; \mathbb{Z}) = 0$ for all $b$ if $g \ge 4$.

Proof. This follows easily from Proposition 1.4 and Theorem 1.5.
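The bound $g \ge 4$ can be read off from the stable range. The following is a sketch of the omitted step, under the assumption (made here only for illustration) that Theorem 1.5 is applied with $N_1 \cong F_{g,b}$ and $N \cong F_g$ obtained by capping off the boundary circles with discs:

```latex
1 \le \frac{g}{2} - 1 \iff g \ge 4,
\qquad
H_1(B\Gamma_{g,b};\mathbb{Z}) \xrightarrow{\ \cong\ } H_1(B\Gamma_{g};\mathbb{Z}) = 0
\quad (g \ge 4).
```

The vanishing on the right is Proposition 1.4, which applies because $g \ge 4$ implies $g \ge 3$.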
By Remark 1.3, there is a “universal” surface bundle $E \to B\Gamma_{g,b}$ with oriented fibers $\cong F_{g,b}$ and trivialized boundary bundle $\partial E \to B\Gamma_{g,b}$ (so that $\partial E$ is identified with $\partial F_{g,b} \times B\Gamma_{g,b}$). Let $T_vE$ be the vertical tangent bundle of $E$, a two-dimensional oriented vector bundle on $E$ with a trivialization over $\partial E$. This has an Euler class $e \in H^2(E, \partial E; \mathbb{Z})$. The image of $e^{i+1}$ under the Gysin transfer
$$H^{2i+2}(E, \partial E; \mathbb{Z}) \to H^{2i}(B\Gamma_{g,b}; \mathbb{Z})$$
(alias integration along the fiber) is the Mumford–Morita–Miller characteristic class $\kappa_i \in H^{2i}(B\Gamma_{g,b}; \mathbb{Z})$. It was introduced by Mumford [26], but the description in differential topology language which we use here owes much to Miller [22] and Morita [24]. The class $\kappa_0$ equals the genus $g \in \mathbb{Z} \cong H^0(B\Gamma_{g,b}; \mathbb{Z})$. For $i > 0$, however, $\kappa_i$ is stable, i.e., independent of $g$ and $b$. Namely, the homomorphism $H^*(B\Gamma_{h,c}; \mathbb{Z}) \to H^*(B\Gamma_{g,b}; \mathbb{Z})$ induced by an embedding $F_{g,b} \to F_{h,c}$ as in Theorem 1.5 takes the $\kappa_i$ class in $H^*(B\Gamma_{h,c}; \mathbb{Z})$ to the $\kappa_i$ class in $H^*(B\Gamma_{g,b}; \mathbb{Z})$.

Mumford conjectured in [26] that the homomorphism of graded rings
$$\mathbb{Q}[x_1, x_2, x_3, \dots] \to H^*(B\Gamma_{g,b}; \mathbb{Q})$$
taking $x_i$ to $\kappa_i$ (where $\deg(x_i) = 2i$) is an isomorphism in a (then unspecified) “stable range”. By the Harer–Ivanov stability theorem, which is slightly younger than Mumford's conjecture, we can take that to mean: in degrees less than $g/2 - 1$. Morita [24], [25] and Miller [22] were able to show relatively quickly that Mumford's homomorphism $\mathbb{Q}[x_1, x_2, x_3, \dots] \to H^*(B\Gamma_{g,b}; \mathbb{Q})$ is injective in the stable range. There matters stood until, in 1996-7, Tillmann introduced concepts from stable homotopy theory into surface bundle theory.

2. Stabilization and Tillmann's theorem

Here it will be convenient to consider oriented surfaces $F_{g,b}$ where each of the $b$ boundary circles is identified with $S^1$. These identifications may or may not be orientation preserving; if an identification is orientation preserving, we regard the boundary component as “outgoing”, otherwise as “incoming”. We write $F_{g,b_1+b_2}$ to indicate that there are $b_1$ incoming and $b_2$ outgoing boundary circles.
Fix standard surfaces $F_{g,1+1}$ for $g \ge 0$ in such a way that $F_{g+h,1+1}$ is identified with the union $F_{g,1+1} \sqcup_{S^1} F_{h,1+1}$ (the outgoing boundary circle of $F_{g,1+1}$ being glued to the incoming boundary circle of $F_{h,1+1}$). A smooth automorphism $\alpha$ of $F_{g,1+1}$, relative to the boundary, can be regarded as a smooth automorphism $\alpha \sqcup_{S^1} \mathrm{id}$ of $F_{g,1+1} \sqcup_{S^1} F_{1,1+1} \cong F_{g+1,1+1}$. This gives us stabilization homomorphisms
$$\cdots \to \Gamma_{g,1+1} \to \Gamma_{g+1,1+1} \to \Gamma_{g+2,1+1} \to \cdots$$
and we define $\Gamma_{\infty,1+1}$ as the direct limit $\mathrm{colim}_{g\to\infty}\, \Gamma_{g,1+1}$. This is the most obvious contender for the title of a stable mapping class group. It is still a perfect group.

A more illuminating way to proceed is to note that a pair of smooth automorphisms $\alpha : F_{g,1+1} \to F_{g,1+1}$ and $\beta : F_{h,1+1} \to F_{h,1+1}$, both relative to the boundary, determines an automorphism $\alpha \sqcup \beta$ of $F_{g+h,1+1}$. In other words, we have concatenation homomorphisms $\Gamma_{g,1+1} \times \Gamma_{h,1+1} \to \Gamma_{g+h,1+1}$ which induce maps $B\Gamma_{g,1+1} \times B\Gamma_{h,1+1} \to B\Gamma_{g+h,1+1}$. These maps amount to a structure of topological monoid on the disjoint union
$$\coprod_{g \ge 0} B\Gamma_{g,1+1}.$$
We can form the group completion $\Omega B(\coprod_g B\Gamma_{g,1+1})$. The inclusion of $\coprod_g B\Gamma_{g,1+1}$ in the group completion is a map of topological monoids and the target is a group-like topological monoid (i.e., its $\pi_0$ is a group) because it is a loop space.

Proposition 2.1. $\Omega B(\coprod_g B\Gamma_{g,1+1}) \simeq \mathbb{Z} \times B\Gamma^+_{\infty,1+1}$.
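On the level of path components the statement can be checked by hand. The sketch below uses only the standard fact that $\pi_0$ of the group completion $\Omega B(-)$ of a topological monoid is the group completion of its monoid of path components:

```latex
\pi_0\Big(\coprod_{g\ge 0} B\Gamma_{g,1+1}\Big) \cong (\mathbb{N},+),
\qquad
\pi_0\,\Omega B\Big(\coprod_{g\ge 0} B\Gamma_{g,1+1}\Big) \cong (\mathbb{Z},+).
```

The genus is additive under concatenation, so the monoid of components is $\mathbb{N}$, and its group completion $\mathbb{Z}$ accounts for the factor $\mathbb{Z}$ on the right-hand side of Proposition 2.1.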
Idea of proof. It is enough to produce a map from right-hand side to left-hand side which induces an isomorphism in integer homology. Indeed, the existence of such a map implies that $H_1(\text{left-hand side}; \mathbb{Z}) = 0$. Since the left-hand side is a loop space, the vanishing of $H_1$ implies that all its connected components are simply connected.

Let $\mathcal{M} = \coprod_g B\Gamma_{g,1+1}$ and let $\mathcal{F}$ be the homotopy direct limit (here: telescope) of the sequence
$$\mathcal{M} \xrightarrow{\ z\cdot\ } \mathcal{M} \xrightarrow{\ z\cdot\ } \mathcal{M} \xrightarrow{\ z\cdot\ } \mathcal{M} \xrightarrow{\ z\cdot\ } \cdots$$
where $z\cdot$ is left multiplication by a fixed element in the genus one component of $\mathcal{M}$. The topological monoid $\mathcal{M}$ acts on the right of $\mathcal{F}$. Theorem 1.5 implies that it acts by maps $\mathcal{F} \to \mathcal{F}$ which induce isomorphisms in integer homology. It follows [21] that the projection from the Borel construction $\mathcal{F}_{h\mathcal{M}}$ to the classifying space $B\mathcal{M}$ is a homology fibration. (The Borel construction $\mathcal{F}_{h\mathcal{M}}$ is the classifying space of the topological category with object space $\mathcal{F}$ and morphism space $\mathcal{F} \times \mathcal{M}$, where the “source” map is the projection $\mathcal{F} \times \mathcal{M} \to \mathcal{F}$, the “target” map
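is the right action map $\mathcal{F} \times \mathcal{M} \to \mathcal{F}$, and composition of morphisms is determined by the multiplication in $\mathcal{M}$.) In particular, the inclusion of the fiber of $\mathcal{F}_{h\mathcal{M}} \to B\mathcal{M}$ over the base point into the corresponding homotopy fiber induces an isomorphism in integer homology. Since the fiber over the base point is $\mathcal{F} \simeq \mathbb{Z} \times B\Gamma_{\infty,1+1}$, it remains only to identify the homotopy fiber over the base point as $\Omega B(\mathcal{M})$. For that it is enough to show that $\mathcal{F}_{h\mathcal{M}}$ is contractible. But $\mathcal{F}_{h\mathcal{M}}$ is the homotopy direct limit (telescope) of the sequence
$$\mathcal{M}_{h\mathcal{M}} \xrightarrow{\ z\cdot\ } \mathcal{M}_{h\mathcal{M}} \xrightarrow{\ z\cdot\ } \mathcal{M}_{h\mathcal{M}} \xrightarrow{\ z\cdot\ } \mathcal{M}_{h\mathcal{M}} \xrightarrow{\ z\cdot\ } \cdots$$
where each term $\mathcal{M}_{h\mathcal{M}}$ is contractible.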
One remarkable consequence of Proposition 2.1 is that $\mathbb{Z} \times B\Gamma^+_{\infty,1+1}$ is a loop space. Miller did better than that [22] by constructing a two-fold loop space structure on $\mathbb{Z} \times B\Gamma^+_{\infty,1+1}$. To be more accurate, he constructed such a structure on a space which ought to be denoted $\mathbb{Z} \times B\Gamma^+_{\infty,0+1}$ but which is homotopy equivalent to $\mathbb{Z} \times B\Gamma^+_{\infty,1+1}$ by the Harer–Ivanov theorem. This construction of Miller's will not be explained here (perhaps unfairly, because it may have influenced the proof of the following theorem due to Tillmann).

Theorem 2.2. [34] The space $\mathbb{Z} \times B\Gamma^+_{\infty,1+1}$ is an infinite loop space.

Remark. If $Y$ is an infinite loop space, then the contravariant functor taking a space $X$ to $[X, Y]$, the set of homotopy classes of maps from $X$ to $Y$, is the 0th term of a generalized cohomology theory. Apart from Eilenberg–MacLane spaces, the most popular example is $Y = \mathbb{Z} \times BU$, which is an infinite loop space because it is homotopy equivalent to its own two-fold loop space. The corresponding generalized cohomology theory is, of course, the K-theory of Atiyah, Bott and Hirzebruch. The construction, description, classification, etc., of generalized cohomology theories is considered to be a major part of stable homotopy theory.

Outline of proof of Theorem 2.2. It is well known that infinite loop spaces can be manufactured from symmetric monoidal categories, i.e., categories with a notion of “direct sum” which is associative and commutative up to canonical isomorphisms. For more details on symmetric monoidal categories, see [1]. If $\mathcal{C}$ is such a category, then the classifying space $B\mathcal{C}$ has a structure of topological monoid which reflects the direct sum operation in $\mathcal{C}$. If this happens to be group-like, i.e., $\pi_0 B\mathcal{C}$ is a group, then $B\mathcal{C}$ is an infinite loop space. If not, then at least the group completion $\Omega B(B\mathcal{C})$ is an infinite loop space. More details and a particularly satisfying proof can be found in [32]. For an overview and alternative proofs, see also [1].
The standard example of such a category is the category of finitely generated left projective modules over a ring $R$, where the morphisms are the $R$-isomorphisms. Here group completion of the classifying space is required and the resulting infinite loop space is the algebraic K-theory space $K(R)$.

For a slightly different example, take the category of finite-dimensional vector spaces over $\mathbb{C}$, with $\mathrm{mor}(V, W)$ equal to the space of $\mathbb{C}$-linear isomorphisms from $V$ to $W$. The new feature here is that we have a symmetric monoidal category with a topology on each of its morphism sets. This “enrichment” must be fed into the construction of the classifying space, which then turns out to be homotopy equivalent to $\coprod_n BU(n)$. Again, group completion is required and the associated infinite loop space is $\mathbb{Z} \times BU$, up to a homotopy equivalence.

Another example which is particularly important here is as follows. Let $\mathrm{ob}(\mathcal{C})$ consist of all closed oriented 1-manifolds. Given two such objects, say $C$ and $C'$, we would like to say roughly that a morphism from $C$ to $C'$ is a smooth compact surface $F$ with boundary $-C \sqcup C'$ (where the minus sign indicates a reversed orientation). To be more precise, let $\mathrm{mor}(C, C')$ be “the” classifying space for bundles of smooth compact oriented surfaces whose boundaries are identified with the disjoint union $-C \sqcup C'$. The composition map
$$\mathrm{mor}(C, C') \times \mathrm{mor}(C', C'') \to \mathrm{mor}(C, C'')$$
is given by concatenation, as usual. Disjoint union of objects and morphisms can be regarded as a “direct sum” operation which makes $\mathcal{C}$ into a symmetric monoidal category, again with a topology on each of its morphism sets. The enrichment must be fed into the construction of $B\mathcal{C}$. Then $B\mathcal{C}$ is clearly connected, and by the above it is an infinite loop space.

Unfortunately it is not clear whether the homotopy type of $B\mathcal{C}$ is at all closely related to that of $\mathbb{Z} \times B\Gamma^+_{\infty,1+1}$. This is mostly due to the fact that, in the above definition of $\mathrm{mor}(C, C')$ for objects $C$ and $C'$ of $\mathcal{C}$, we allowed arbitrary compact surfaces with boundary $\cong -C \sqcup C'$ instead of insisting on connected surfaces. And if we had insisted on connected surfaces throughout, we would have lost the “direct sum” alias “disjoint union” operation which is so essential. (Disjoint unions of connected things are typically not connected.)

A new idea is required, and Tillmann comes up with the following beautiful two-liner. Make a subcategory $\mathcal{C}_0$ of $\mathcal{C}$ by keeping all objects of $\mathcal{C}$, but only those morphisms (surfaces) for which the inclusion of the outgoing boundary induces a surjection in $\pi_0$. In the above notation, where we have a surface $F$ and $\partial F$ is identified with $-C \sqcup C'$, the condition means that $\pi_0 C' \to \pi_0 F$ is onto. It is clear that $\mathcal{C}_0$ is closed under the disjoint union operation, and that $B\mathcal{C}_0$ is connected, so $B\mathcal{C}_0$ is still an infinite loop space. While the surfaces which we see in the definition of $\mathcal{C}_0$ need not be connected, they always become connected when we compose on the left (i.e., concatenate at the outgoing boundary) with a morphism to the connected object $S^1$. This observation leads fairly automatically, i.e., by imitation of the proof of Proposition 2.1, to a
homotopy equivalence $\Omega(B\mathcal{C}_0) \simeq \mathbb{Z} \times B\Gamma^+_{\infty,1+1}$ and so to the conclusion that $\mathbb{Z} \times B\Gamma^+_{\infty,1+1}$ is an infinite loop space. Namely, we introduce a contravariant functor $\mathcal{F}$ on $\mathcal{C}_0$ in such a way that $\mathcal{F}(C)$, for an object $C$, is the homotopy direct limit (= telescope) of the sequence
$$\mathrm{mor}_{\mathcal{C}_0}(C, S^1) \xrightarrow{\ z\cdot\ } \mathrm{mor}_{\mathcal{C}_0}(C, S^1) \xrightarrow{\ z\cdot\ } \mathrm{mor}_{\mathcal{C}_0}(C, S^1) \xrightarrow{\ z\cdot\ } \cdots$$
where $z\cdot$ is left multiplication by a fixed element in the genus one component of $\mathrm{mor}_{\mathcal{C}_0}(S^1, S^1)$. Theorem 1.5 implies that any map $\mathcal{F}(C') \to \mathcal{F}(C)$ determined by a morphism $C \to C'$ in $\mathcal{C}_0$ induces an isomorphism in integer homology. It follows that the projection from the homotopy direct limit of $\mathcal{F}$ to $B\mathcal{C}_0$ is a homology fibration. (The homotopy colimit of $\mathcal{F}$ replaces the Borel construction in the proof of Proposition 2.1; see Definition 6.1 below for more details.) The fiber over the vertex determined by the object $S^1$ is $\mathcal{F}(S^1) \simeq \mathbb{Z} \times B\Gamma_{\infty,1+1}$. It remains to show that the corresponding homotopy fiber is $\Omega B\mathcal{C}_0$, and for that it is enough to prove that $\mathrm{hocolim}\, \mathcal{F}$ is contractible. But $\mathrm{hocolim}\, \mathcal{F}$ is the homotopy direct limit (telescope) of a sequence
$$\mathrm{hocolim}\, \mathcal{E} \xrightarrow{\ z\cdot\ } \mathrm{hocolim}\, \mathcal{E} \xrightarrow{\ z\cdot\ } \mathrm{hocolim}\, \mathcal{E} \xrightarrow{\ z\cdot\ } \mathrm{hocolim}\, \mathcal{E} \xrightarrow{\ z\cdot\ } \cdots$$
where $\mathcal{E}$ is the representable contravariant functor $C \mapsto \mathrm{mor}_{\mathcal{C}_0}(C, S^1)$. Homotopy colimits of representable contravariant functors (on categories where the morphism sets are topologized and composition of morphisms is continuous) are always contractible.

Remark. The outline above is deliberately careless about the definition of the composition maps (alias concatenation maps) $\mathrm{mor}(C, C') \times \mathrm{mor}(C', C'') \to \mathrm{mor}(C, C'')$ in the category $\mathcal{C}$. This is actually not a straightforward matter. Tillmann has a very elegant solution in a later article [35] where she constructs a category equivalent to the $\mathcal{C}_0$ above using (few) generators and relations.

3. Mock surface bundles

Relying on Theorem 2.2, Tillmann in [35] began to develop methods to split off known infinite loop spaces from $\mathbb{Z} \times B\Gamma^+_{\infty,1+1}$, specifically infinite loop spaces of the “free” type
$$Q(X) = \mathrm{colim}_{n\to\infty}\, \Omega^n \Sigma^n X$$
where $X$ is a pointed space. This was taken to a higher level in a joint paper by Madsen and Tillmann [19]. The paper begins with the construction of an integral version of the total Mumford–Morita–Miller class, which is an infinite loop map $\alpha_\infty$ from $\mathbb{Z} \times B\Gamma^+_{\infty,1+1}$ to a well-known infinite loop space $\Omega^\infty \mathbb{CP}^\infty_{-1}$. The main result is a splitting theorem, formulated in terms of $\alpha_\infty$ and known
decompositions of $\Omega^\infty \mathbb{CP}^\infty_{-1}$, which can be regarded as a $p$-local version of the Morita–Miller injectivity result. It is proved by methods which are somewhat similar to Morita's methods. Here we are going to describe $\alpha_\infty$ from a slightly different angle, emphasizing bordism theoretic ideas and initially downplaying the motivations from characteristic class theory.

Definition 3.1. Let $X$ be a smooth manifold (with empty boundary). A mock surface bundle on $X$ consists of a smooth manifold $M$ with $\dim(M) - \dim(X) = 2$, a proper smooth map $q : M \to X$, a stable vector bundle surjection $\delta q : TM \to q^*TX$ and an orientation of the two-dimensional kernel vector bundle $\ker(\delta q)$ on $M$.

Explanations. The word stable in “stable vector bundle surjection” means that $\delta q$ is a vector bundle map $TM \times \mathbb{R}^i \to q^*TX \times \mathbb{R}^i$ for some $i$, possibly large. Note that $\delta q$ is not required to agree with the differential $dq$ of $q$. It should be regarded as a “formal” differential of $q$. If $\delta q = dq$, then $q$ is a smooth proper submersion. Smooth proper submersions are fiber bundles by Ehresmann's lemma [5]. In short, an integrable mock surface bundle ($\delta q = dq$) is a surface bundle.

Mock surface bundles share many good properties with honest surface bundles. They can (usually) be pulled back, they have a classifying space, and they have Mumford–Morita–Miller characteristic classes, as we shall see.

To begin with the pullback property, suppose that $q : M \to X_2$ with $\delta q$ etc. is a mock surface bundle and let $f : X_1 \to X_2$ be a smooth map. If $f$ is transverse to $q$, which means that the map $(x, y) \mapsto (f(x), q(y))$ from $X_1 \times M$ to $X_2 \times X_2$ is transverse to the diagonal, then the pullback
$$f^*M = \{(x, y) \in X_1 \times M \mid f(x) = q(y)\}$$
is a smooth manifold, with projection $p : f^*M \to X_1$. The transversality property and the information in $\delta q$ can be used to make a canonical choice of formal (stable) differential $\delta p : T(f^*M) \to p^*TX_1$ with oriented two-dimensional kernel bundle. Then $(p, \delta p)$ is a mock surface bundle on $X_1$.
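One of these details is easy to check directly. As a sketch, using only the standard dimension formula for transverse preimages: the pullback $f^*M$ is the preimage of the diagonal of $X_2 \times X_2$ under $(x, y) \mapsto (f(x), q(y))$, so its codimension in $X_1 \times M$ equals $\dim X_2$, and therefore

```latex
\dim f^*M \;=\; \dim X_1 + \dim M - \dim X_2 \;=\; \dim X_1 + 2,
```

which is the fiber dimension required by Definition 3.1 for a mock surface bundle on $X_1$.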
The details are left to the reader. If $f$ is not transverse to $q$, then we can make it transverse to $q$ by a small perturbation [5, 14.9.3]. In that situation, of course, $(p, \delta p)$ is not entirely well defined because it depends on the perturbation. It is however well defined up to a concordance:

Definition 3.2. Two mock surface bundles $q_0 : M_0 \to X$ and $q_1 : M_1 \to X$ (with vector bundle data which we suppress) are concordant if there exists a mock surface bundle $q_{\mathbb{R}} : M_{\mathbb{R}} \to X \times \mathbb{R}$ (with vector bundle data ...) such that $q_{\mathbb{R}}$ is transverse to $X \times \{0\}$ and $X \times \{1\}$, and the pullbacks of $q_{\mathbb{R}}$ to $X \times \{0\}$ and $X \times \{1\}$ agree with $q_0 \times \{0\}$ and $q_1 \times \{1\}$, respectively.
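The transversality condition used for the pullback property, and the dimension count it yields, can be spelled out as follows. This is a standard unpacking of the definition under the notation of Definition 3.1, offered as a sketch and not taken from the source.

```latex
% f : X_1 -> X_2 is transverse to q : M -> X_2 if, at every pair (x,y)
% with f(x) = q(y), the images of the two differentials span the target:
\[
  df(T_xX_1) + dq(T_yM) \;=\; T_{f(x)}X_2 .
\]
% Then f^*M is the preimage of the diagonal of X_2 x X_2, which has
% codimension dim(X_2), under a map transverse to it. Hence
\[
  \dim(f^*M) \;=\; \dim X_1 + \dim M - \dim X_2 \;=\; \dim X_1 + 2 ,
\]
% using dim(M) - dim(X_2) = 2 from Definition 3.1.
```

So the pulled-back map $p : f^*M \to X_1$ again has relative dimension 2, as a mock surface bundle on $X_1$ must.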
The Stable Mapping Class Group and Stable Homotopy Theory
Next we turn to the construction of a classifying space for mock surface bundles. This is an instance of Pontryagin–Thom theory in a cohomological setting which was popularized by Quillen [9] and later by Buoncristiano–Rourke–Sanderson [7]. Let $\mathrm{Gr}_2(\mathbb{R}^{2+n})$ be the Grassmannian of oriented 2-planes in $\mathbb{R}^{2+n}$ and let $P_n$, $V_n$ be the canonical vector bundles of dimension 2 and $n$ on $\mathrm{Gr}_2(\mathbb{R}^{2+n})$, respectively. Let $\mathrm{Th}(V_n)$ be the Thom space (one-point compactification of the total space) of $V_n$. Since $V_{n+1}|\mathrm{Gr}_2(\mathbb{R}^{2+n})$ is identified with $V_n \times \mathbb{R}$, there is a preferred embedding $\Sigma\mathrm{Th}(V_n) \to \mathrm{Th}(V_{n+1})$, with adjoint $\mathrm{Th}(V_n) \to \Omega\mathrm{Th}(V_{n+1})$. We form the direct limit
$$\operatorname*{colim}_{n\to\infty} \Omega^{n+2}\mathrm{Th}(V_n) =: \Omega^\infty\mathbb{C}P^\infty_{-1}.$$
Lemma 3.3. For any smooth manifold $X$ there is a natural bijection from the set of homotopy classes $[X, \Omega^\infty\mathbb{C}P^\infty_{-1}]$ to the set of concordance classes of mock surface bundles on $X$.

Outline of proof (one direction only). A map from $X$ to $\Omega^\infty\mathbb{C}P^\infty_{-1}$ factors through $\Omega^{n+2}\mathrm{Th}(V_n)$ for some $n$. Let $f$ be the adjoint, a based map from the $(n+2)$-fold suspension of $X_+$ to $\mathrm{Th}(V_n)$. It is convenient to identify the complement of the base point in $\Sigma^{n+2}X_+$ with $X \times \mathbb{R}^{n+2}$. We can assume that $f$ is transverse to the zero section of $V_n$. Let $M \subset X \times \mathbb{R}^{n+2}$ be the inverse image of the zero section under $f$. Let $q : M \to X$ be the projection. By construction of $M$ there is an isomorphism
$$TM \oplus (f|M)^*V_n \cong q^*TX \times \mathbb{R}^{n+2}$$
of vector bundles on $M$. Adding $(f|M)^*P_n$ on the left hand side and noting that $TM \oplus (f|M)^*V_n \oplus (f|M)^*P_n$ is identified with $TM \times \mathbb{R}^{n+2}$, we get a vector bundle surjection
$$\delta q : TM \times \mathbb{R}^{n+2} \longrightarrow q^*TX \times \mathbb{R}^{n+2}$$
with $\ker(\delta q) \cong (f|M)^*P_n$, which implies an orientation on $\ker(\delta q)$. Now $q : M \to X$ and $\delta q$ with the orientation on $\ker(\delta q)$ constitute a mock surface bundle whose concordance class is independent of all the choices we made in the construction.

Finally we construct Mumford–Morita–Miller classes for mock surface bundles. Let $q : M \to X$ be a mock surface bundle, with $\delta q : TM \to q^*TX$. The oriented 2-dimensional vector bundle $\ker(\delta q)$ on $M$ has an Euler class $e \in H^2(M; \mathbb{Z})$. Our hypotheses on $q$ imply that $q$ induces a transfer map in cohomology,
$$H^{*+2}(M; \mathbb{Z}) \longrightarrow H^*(X; \mathbb{Z}).$$
This is obtained essentially by conjugating an induced map in homology with Poincaré duality. (The correct version of homology for this purpose is locally finite homology with $\mathbb{Z}$-coefficients twisted by the orientation character.) We now
define $\kappa_i(q, \delta q) \in H^{2i}(X; \mathbb{Z})$ to be the image of $e^{i+1} \in H^{2i+2}(M; \mathbb{Z})$ under the transfer. The classes $\kappa_i$ are concordance invariants and behave naturally under (transverse) pullback of mock surface bundles. They can therefore be regarded as classes in the cohomology of the classifying space for mock surface bundles:
$$\kappa_i \in H^{2i}(\Omega^\infty\mathbb{C}P^\infty_{-1}; \mathbb{Z}).$$

It is not difficult to see that certain mild modifications of Definition 3.1 do not change the concordance classification. In particular, a convenient modification of that sort consists in allowing $q : M \to X$ with $\delta q$ etc. where $M$ has a boundary $\partial M$, the restriction $q|\partial M$ is a trivialized bundle with fibers $\cong -S^1 \sqcup S^1$ and $\delta q$ agrees with the differential $dq$ on $\partial M$. If we now regard $\Omega^\infty\mathbb{C}P^\infty_{-1}$ as a classifying space for these modified mock surface bundles, then we obtain a comparison map of classifying spaces
$$\coprod_g B\Gamma_{g,1+1} \longrightarrow \Omega^\infty\mathbb{C}P^\infty_{-1}.$$
(Indeed, the left-hand side is a classifying space for honest bundles whose fibers are connected oriented smooth surfaces with prescribed boundary $\cong -S^1 \sqcup S^1$.) Furthermore, the map commutes with concatenation and its target is a grouplike space. By the universal property of the group completion, the map just constructed extends in an essentially unique way to a map
$$\alpha_\infty : \mathbb{Z} \times B\Gamma^+_{\infty,1+1} \longrightarrow \Omega^\infty\mathbb{C}P^\infty_{-1}$$
(where we are using Proposition 2.1). One of us (I.M.) conjectured the following, now a theorem [20]:

Theorem 3.4. The map $\alpha_\infty$ is a homotopy equivalence.

As a conjecture this is stated in [19], and supported by the splitting theorem mentioned earlier. In the same article, it is shown that $\alpha_\infty$ is a map of infinite loop spaces, with Tillmann’s infinite loop space structure on $\mathbb{Z} \times B\Gamma^+_{\infty,1+1}$, and the obvious infinite loop structure on
$$\Omega^\infty\mathbb{C}P^\infty_{-1} = \operatorname*{colim}_{n\to\infty} \Omega^{n+2}\mathrm{Th}(V_n).$$
It is easy to show that the rational cohomology of any connected component of $\Omega^\infty\mathbb{C}P^\infty_{-1}$ is a polynomial ring $\mathbb{Q}[x_1, x_2, x_3, \dots]$ where $\deg(x_i) = 2i$; moreover the $x_i$ can be taken as the $\kappa_i$ classes for $i > 0$. The cohomology with finite field coefficients $H^*(\Omega^\infty\mathbb{C}P^\infty_{-1}; \mathbb{F}_p)$ is much more difficult to determine. Nevertheless this has been done in the meantime by Galatius [13].

Remark on notation. The strange abbreviation $\Omega^\infty\mathbb{C}P^\infty_{-1}$ for the direct limit $\operatorname{colim}_n \Omega^{n+2}\mathrm{Th}(V_n)$ can be justified as follows. Let $\mathbb{C}P^n \subset \mathrm{Gr}_2(\mathbb{R}^{2n+2})$ be the Grassmannian of one-dimensional $\mathbb{C}$-linear subspaces in $\mathbb{C}^{n+1} \cong \mathbb{R}^{2n+2}$, alias
complex projective space of complex dimension $n$. Let $L_n$ be the tautological line bundle on $\mathbb{C}P^n$ and $L_n^\perp$ its canonical complement, a complex vector bundle of dimension $n$. The inclusion
$$\operatorname*{colim}_{n\to\infty} \Omega^{2n+2}\mathrm{Th}(L_n^\perp) \longrightarrow \operatorname*{colim}_{2n\to\infty} \Omega^{2n+2}\mathrm{Th}(V_{2n})$$
is a homotopy equivalence. Now Thom spaces of certain vector bundles on (complex) projective spaces can be viewed as “stunted” projective spaces $\mathbb{C}P^i_k = \mathbb{C}P^i/\mathbb{C}P^{k-1}$ where $i \ge k$. Namely, $\mathbb{C}P^i_k$ is identified with the Thom space of the Whitney sum of $k$ copies of the tautological line bundle on $\mathbb{C}P^{i-k}$. Allowing $k = -1$, stable homotopy theorists therefore like to write
$$\mathrm{Th}(L_n^\perp) = \Sigma^{2n+2}\mathrm{Th}(-L_n) = \Sigma^{2n+2}\mathbb{C}P^{n-1}_{-1}.$$
In addition they use the reasonable abbreviation
$$\operatorname*{colim}_{n\to\infty} \Omega^{2n+2}\Sigma^{2n+2}\mathbb{C}P^{n-1}_{-1} =: \Omega^\infty\mathbb{C}P^\infty_{-1}.$$
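As a sanity check on the stunted projective space convention, the smallest cases can be worked out by hand. The following is a sketch based on the identification stated in the remark; the convention $\mathbb{C}P^{-1} = \emptyset$ is an assumption made here for the case $k = 0$.

```latex
% k = 1, i = 1: the base CP^{i-k} = CP^0 is a point, so the Thom space of
% one copy of the tautological line bundle is the one-point
% compactification of C, and indeed
\[
  \mathbb{C}P^1_1 \;=\; \mathbb{C}P^1/\mathbb{C}P^0 \;\cong\; S^2
  \;\cong\; \mathrm{Th}(\mathbb{C} \to \mathrm{pt}) .
\]
% k = 0 (with CP^{-1} empty): the Thom space of the zero bundle on CP^i
% is CP^i with a disjoint base point,
\[
  \mathbb{C}P^i_0 \;=\; \mathbb{C}P^i_+ .
\]
% k = -1, i = n-1: formally "minus one copy" of the tautological bundle
% on CP^{i-k} = CP^n. Since L_n \oplus L_n^\perp is the trivial bundle of
% real dimension 2n+2,
\[
  \mathrm{Th}(L_n^\perp) \;=\; \Sigma^{2n+2}\,\mathrm{Th}(-L_n)
  \;=\; \Sigma^{2n+2}\,\mathbb{C}P^{n-1}_{-1} .
\]
```

The last line is the stable statement: $\mathrm{Th}(-L_n)$ only makes sense as a spectrum, which is why the $(2n+2)$-fold suspension appears before passing to the colimit.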
4. First desingularization

In the remaining sections, some key ideas from the proof of Theorem 3.4 in [20] will be sketched. The proof proceeds from the target of $\alpha_\infty$ to the source. That is, it starts from the original (co)bordism-theoretic description of $\Omega^\infty\mathbb{C}P^\infty_{-1}$ and goes through a number of steps to obtain alternative descriptions which are more and more bundle theoretic. Each step can also be viewed as a step towards the goal of “desingularizing” mock surface bundles.

The first step in this sequence is a little surprising. Let $q : M \to X$ together with $\delta q : TM \to q^*TX$ be a mock surface bundle. We form $E = M \times \mathbb{R}$ and get $(p, f) : E \to X \times \mathbb{R}$ where $p(z, t) = q(z)$ and $f(z, t) = t$ for $(z, t) \in M \times \mathbb{R} = E$. There is a formal (surjective, stable) differential $\delta p : TE \to p^*TX$, obtained by composing the projection $TE \to TM$ with $\delta q$. There is also the honest differential of $f$, which we regard as a vector bundle surjection $\delta f = df : \ker(\delta p) \to f^*(T\mathbb{R})$. All in all, we have made a conversion
$$(q, \delta q) \rightsquigarrow (p, f, \delta p, \delta f).$$
Here $(p, f) : E \to X \times \mathbb{R}$ is smooth and proper, $\delta p$ is a formal (stable, surjective) differential for $p$ with a 3-dimensional oriented kernel bundle, and $\delta f$ is a surjective vector bundle map from $\ker(\delta p)$ to $f^*(T\mathbb{R})$ (which agrees with $df$).
We are going to “sacrifice” the equation $\delta f = df$ in order to “obtain” an equation $\delta p = dp$. It turns out that this can always be achieved by a continuous deformation $((p_s, f_s, \delta p_s, \delta f_s))_{s\in[0,1]}$ of the quadruple $(p, f, \delta p, \delta f)$, on the understanding that each $(p_s, f_s) : E \to X \times \mathbb{R}$ is smooth and proper, each $\delta p_s$ is a formal (stable, surjective) differential for $p_s$ with a 3-dimensional oriented kernel bundle, and each $\delta f_s : \ker(\delta p_s) \to f_s^*(T\mathbb{R})$ is a surjective vector bundle map. (For $s = 0$ we want $(p_s, f_s, \delta p_s, \delta f_s) = (p, f, \delta p, \delta f)$ and for $s = 1$ we want $\delta p_s = dp_s$, so that $p_1$ is a submersion.)

The proof is easy modulo submersion theory [28], [14], especially if $X$ is closed, which we assume for simplicity. Firstly, obstruction theory [33] shows that $\delta p$, although assumed to be a stable vector bundle surjection, can be deformed (through stable vector bundle surjections) to an honest vector bundle surjection $\delta_u p$ from $TE$ to $p^*TX$. Secondly, the manifold $E$ has no compact component, so that the main theorem of submersion theory applies to $E$ and the pair $(p, \delta_u p)$. The combined conclusion is that $(p, \delta p)$ can be deformed through similar pairs $(p_s, \delta p_s)$ to an integrable pair $(p_1, \delta p_1)$, so that $\delta p_1 = dp_1$ and consequently $p_1$ is a submersion. We set $f_s = f$ for all $s \in [0, 1]$. Finally, since $\ker(\delta p_s) \cong \ker(\delta p)$ for each $s$, there is no problem in defining $\delta f_s : \ker(\delta p_s) \to f_s^*(T\mathbb{R})$ somehow, for all $s \in [0, 1]$, as a surjective vector bundle map depending continuously on $s$. Note that the maps $(p_s, f_s) : E \to X \times \mathbb{R}$ are automatically proper since each $f_s = f$ is proper.

These observations amount to an outline of more than half the proof of the following proposition.

Proposition 4.1. The classifying space for mock surface bundles, $\Omega^\infty\mathbb{C}P^\infty_{-1}$, is also a classifying space for families of oriented smooth 3-manifolds $E_x$ equipped with a proper map $f_x : E_x \to \mathbb{R}$ and a vector bundle surjection $\delta f_x : TE_x \to f_x^*(T\mathbb{R})$.

Details.
The “families” in Proposition 4.1 are submersions $\pi : E \to X$ with fibers $E_x$ for $x \in X$. They are not assumed to be bundles. The parameter space $X$ can be any smooth manifold without boundary (and in some situations it is convenient to allow a nonempty boundary). The maps $f_x : E_x \to \mathbb{R}$ are supposed to make up a smooth map $f : E \to \mathbb{R}$. Similarly the $\delta f_x$ make up a vector bundle surjection $\delta f$ from the vertical tangent bundle of $E$ to $f^*(T\mathbb{R})$. The properness condition, correctly stated, means that $(\pi, f) : E \to X \times \mathbb{R}$ is proper. Although these families are submersions rather than bundles, they can be pulled back just like bundles. The classification is up to concordance. A concordance between two families on $X$ (of the sort under discussion) is another family (of the sort under discussion) on $X \times \mathbb{R}$, restricting to the prescribed families on the submanifolds $X \times \{0\}$ and $X \times \{1\}$.
Outline of remainder of proof of Proposition 4.1. We have seen how a mock surface bundle on $X$ can be converted to a family as in Proposition 4.1. Going in the other direction is easier: namely, given a family $\pi : E \to X$ with $f : E \to \mathbb{R}$ etc., as in Proposition 4.1, choose a regular value $c \in \mathbb{R}$ for $f$ and let $M = f^{-1}(c) \subset E$. Then $q = \pi|M$ etc. is a mock surface bundle on $X$.

In showing that these two procedures are inverses of one another, we have to verify in particular the following. Given a family $\pi : E \to X$ with $f : E \to \mathbb{R}$ etc., as in Proposition 4.1, and a regular value $c \in \mathbb{R}$ for $f$ with $M = f^{-1}(c)$, there exists a concordance from the original family to another family with total space $\cong M \times \mathbb{R}$. This is particularly easy to see when $X$ is compact (i.e., closed). In that case we can choose a small open interval $U$ about $c \in \mathbb{R}$ containing no critical values of $f$, and an orientation preserving diffeomorphism $h : U \to \mathbb{R}$. Let $E' = f^{-1}(U) \cong M \times \mathbb{R}$. Now $\pi|E'$ together with $h \circ f|E'$ and $dh \circ \delta f$ constitute a new family which is concordant to the old one. (To make the concordance, use an isotopy from $\mathrm{id} : \mathbb{R} \to \mathbb{R}$ to $h^{-1}$.) Yes, the concordance relation is very coarse.
5. A zoo of generalized surfaces

The advantage of the new characterization of $\Omega^\infty\mathbb{C}P^\infty_{-1}$ given in Proposition 4.1 is that it paves the way for a number of useful variations on the Madsen conjecture alias Theorem 3.4. We are going to formulate these as statements about classifying spaces for families of certain generalized (“thickened”) surfaces. Following is a list of the types of generalized or thickened surface which we need, with labels. (They are all defined as 3-manifolds with additional structure; but see the comments below.)

  $\mathcal{V}$               oriented smooth 3-manifold $E_x$ with proper smooth nonsingular $f_x : E_x \to \mathbb{R}$
  $\mathcal{W}$               oriented smooth 3-manifold $E_x$ with proper smooth Morse function $f_x : E_x \to \mathbb{R}$
  $\mathcal{W}_{loc}$         oriented smooth 3-manifold $E_x$ with smooth Morse function $f_x : E_x \to \mathbb{R}$ whose restriction to the critical point set $\mathrm{crit}(f_x)$ is proper
  $h\mathcal{V}$              oriented smooth 3-manifold $E_x$ with proper $f_x : E_x \to \mathbb{R}$ and vector bundle surjection $\delta f_x : TE_x \to f_x^*(T\mathbb{R})$
  $h\mathcal{W}$              oriented smooth 3-manifold $E_x$ with proper $f_x : E_x \to \mathbb{R}$ and $\delta f_x : TE_x \to f_x^*(T\mathbb{R})$ of Morse type (details below)
  $h\mathcal{W}_{loc}$        oriented smooth 3-manifold $E_x$ with $f_x : E_x \to \mathbb{R}$ and $\delta f_x : TE_x \to f_x^*(T\mathbb{R})$ of Morse type; restriction of $f_x$ to $\mathrm{crit}(\delta f_x)$ is proper (details below)
Details. The map $\delta f_x$ of “Morse type” in the definition of types $h\mathcal{W}$ and $h\mathcal{W}_{loc}$ is a map $TE_x \to f_x^*(T\mathbb{R})$ over $E_x$, but is not required to be a vector bundle homomorphism. It is required to be the sum of a linear term $\ell$ and a quadratic term $k$, subject to the condition that $k_z$ is nondegenerate whenever $\ell_z = 0$, for $z \in E_x$. Its formal critical point set $\mathrm{crit}(\delta f_x)$ is the set of $z \in E_x$ such that $\ell_z = 0$.

Comments. The conditions “proper” and “nonsingular” in the definition of type $\mathcal{V}$ imply that $f_x : E_x \to \mathbb{R}$ is a proper submersion, hence a bundle of closed surfaces on $\mathbb{R}$. From a classification point of view, this carries the same information as the closed surface $f_x^{-1}(0)$. Similarly, in the definitions of type $\mathcal{W}$ and $\mathcal{W}_{loc}$, the focus is mainly on $f_x^{-1}(0)$, which in both cases is a surface with finitely many very “moderate” singularities. (It is compact in the $\mathcal{W}$ case, but can be noncompact in the $\mathcal{W}_{loc}$ case.) The subscripts $x$ have been kept mainly for consistency with the formulation of Proposition 4.1. They do indicate, correctly, that we are interested in families of such generalized surfaces.

Let $|\mathcal{V}|$, $|\mathcal{W}|$, $|\mathcal{W}_{loc}|$, $|h\mathcal{V}|$, $|h\mathcal{W}|$ and $|h\mathcal{W}_{loc}|$ be the classifying spaces for families of generalized surfaces of type $\mathcal{V}$, $\mathcal{W}$, $\mathcal{W}_{loc}$, $h\mathcal{V}$, $h\mathcal{W}$ and $h\mathcal{W}_{loc}$, respectively. We have seen the details in the case of $h\mathcal{V}$; they are similar in the other cases. In particular, “family with parameter manifold $X$” should always be interpreted as “submersion with target $X$”. (The existence of the six classifying spaces can be deduced from a general statement known as Brown’s representation theorem [6], but more explicit constructions are available. In the $\mathcal{V}$ case, the families alias submersions are automatically bundles with fibers $E_x \cong F_x \times \mathbb{R}$, where $F_x$ is a closed surface.) We obtain a commutative diagram of classifying spaces
$$\begin{array}{ccccc}
|\mathcal{V}| & \longrightarrow & |\mathcal{W}| & \longrightarrow & |\mathcal{W}_{loc}| \\
\downarrow & & \downarrow & & \downarrow \\
|h\mathcal{V}| & \longrightarrow & |h\mathcal{W}| & \longrightarrow & |h\mathcal{W}_{loc}|
\end{array} \qquad (*)$$
where the vertical arrows are obtained essentially by viewing honest derivatives as “formal” derivatives. One of the six spaces, $|\mathcal{V}|$, is a little provisional because it classifies all bundles of closed surfaces (whereas we should be interested in connected surfaces of high genus). The other five, however, are in final form. We saw that $|h\mathcal{V}| \simeq \Omega^\infty\mathbb{C}P^\infty_{-1}$. Modulo a plus construction and small corrections in the definition of $|\mathcal{V}|$, the left-hand vertical arrow in the diagram is $\alpha_\infty$.

Proposition 5.1. The lower row of diagram $(*)$ is a homotopy fiber sequence.

Lemma 5.2. The right-hand vertical arrow in $(*)$ is a homotopy equivalence.

About the proofs. The proof of Proposition 5.1 is a matter of stable homotopy theory and specifically bordism theory. The spaces $|h\mathcal{W}|$ and $|h\mathcal{W}_{loc}|$ have alternative bordism-theoretic descriptions similar to the equivalence
$$|h\mathcal{V}| \simeq \operatorname*{colim}_{n\to\infty} \Omega^{n+2}\mathrm{Th}(V_n)$$
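of Proposition 4.1. In particular, let $\mathrm{Gr}_{\mathcal{W}}(\mathbb{R}^{n+3})$ be the Grassmannian of 3-dimensional oriented linear subspaces of $\mathbb{R}^{n+3}$ equipped with a function $\ell + k$ of Morse type (where $\ell$ is a linear form and $k$ is a quadratic form). Let $W_n$ be the canonical $n$-dimensional vector bundle on $\mathrm{Gr}_{\mathcal{W}}(\mathbb{R}^{n+3})$. Then
$$|h\mathcal{W}| \simeq \operatorname*{colim}_{n\to\infty} \Omega^{n+2}\mathrm{Th}(W_n).$$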
From the bordism-theoretic descriptions, it follows easily that the lower row of $(*)$ is a homotopy fiber sequence.

The proof of Lemma 5.2 is easy. Apart from the fact that $|h\mathcal{W}_{loc}|$ is well understood in bordism-theoretic terms, the main reason for that is as follows: A generalized surface $(E_x, f_x)$ of type $\mathcal{W}_{loc}$ is determined, up to a canonical concordance, by its germ about the critical point set of $f_x$. This carries over to families of surfaces of type $\mathcal{W}_{loc}$.

Theorem 5.3. The middle vertical arrow in $(*)$ is a homotopy equivalence.

This is a distant corollary of a hard theorem due to Vassiliev [36], [37]. Following are some definitions and abbreviations which are useful in the formulation of Vassiliev’s theorem.

Let $M$ be a smooth manifold without boundary, $z \in M$. A $k$-jet from $M$ to $\mathbb{R}^n$ at $z$ is an equivalence class of smooth map germs $f : (M, z) \to \mathbb{R}^n$, where two such germs are considered equivalent if they agree to $k$th order at $z$. Let $J^k(M, \mathbb{R}^n)_z$ be the set of equivalence classes and let
$$J^k(M, \mathbb{R}^n) = \bigcup_z J^k(M, \mathbb{R}^n)_z.$$
The projection $J^k(M, \mathbb{R}^n) \to M$ has a canonical structure of smooth vector bundle. Every smooth function $f : M \to \mathbb{R}^n$ determines a smooth section $j^k f$ of the jet bundle $J^k(M, \mathbb{R}^n) \to M$, the $k$-jet prolongation of $f$. The value of $j^k f$ at $z \in M$ is the $k$-jet of $f$ at $z$. Note that $j^k f$ determines $f$.

Let $A$ be a closed semialgebraic subset [3] of the vector space $J^k(\mathbb{R}^m, \mathbb{R}^n)$ where $m = \dim(M)$. Suppose that $A$ is invariant under the right action of the group of diffeomorphisms $\mathbb{R}^m \to \mathbb{R}^m$, and of codimension $\ge m + 2$ in $J^k(\mathbb{R}^m, \mathbb{R}^n)$. Let $A(M) \subset J^k(M, \mathbb{R}^n)$ consist of the jets which, in local coordinates about their source, belong to $A$. Let $\Gamma_{\neg A}(J^k(M, \mathbb{R}^n))$ be the space of smooth sections of the vector bundle $J^k(M, \mathbb{R}^n) \to M$ which avoid $A(M)$. Let $\mathrm{map}_{\neg A}(M, \mathbb{R}^n)$ be the space of smooth maps $f : M \to \mathbb{R}^n$ whose jet prolongations avoid $A(M)$. Both are to be equipped with the Whitney $C^\infty$ topology.
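To make the jet bundle concrete, it may help to write out the case $m = 3$, $n = 1$, $k = 2$ in coordinates. The block below is an illustrative sketch using the standard Taylor-coefficient description of jets over Euclidean space; the symbols $a$, $b$, $Q$ are introduced here and do not appear in the source.

```latex
% A 2-jet at z of a germ f : (R^3, z) -> R is recorded by its Taylor
% coefficients up to order 2:
\[
  j^2 f(z) \;=\; \bigl(a,\, b,\, Q\bigr)
  \;=\; \bigl(f(z),\; df(z),\; d^2 f(z)\bigr)
  \;\in\; \mathbb{R} \oplus (\mathbb{R}^3)^* \oplus \mathrm{Sym}^2(\mathbb{R}^3)^* .
\]
% Hence the fiber of J^2(R^3, R) -> R^3 over z has dimension
\[
  1 + 3 + 6 \;=\; 10 \;=\; \binom{3+2}{2} ,
\]
% an instance of the general count
\[
  \dim J^k(\mathbb{R}^m,\mathbb{R}^n)_z \;=\; n\binom{m+k}{k} .
\]
```

In these coordinates a closed subset $A$ defined by polynomial conditions on $(a, b, Q)$ is semialgebraic, and its codimension can be read off by counting independent conditions.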
Theorem 5.4. [36], [37]. Suppose that $M$ is closed. Then with the above hypotheses on $A$, the jet prolongation map
$$\mathrm{map}_{\neg A}(M, \mathbb{R}^n) \longrightarrow \Gamma_{\neg A}(J^k(M, \mathbb{R}^n))$$
induces an isomorphism in cohomology with coefficients $\mathbb{Z}$. A corresponding statement holds for compact $M$ with boundary, with the convention that all
smooth maps $M \to \mathbb{R}^n$ and all sections of $J^k(M, \mathbb{R}^n)$ in sight must agree near $\partial M$ with a prescribed $\varphi : M \to \mathbb{R}^n$ which has no $A$-singularities near $\partial M$.

For an idea of how Theorem 5.3 can be deduced from Theorem 5.4, take $m = 3$, $n = 1$ and $k = 2$. Let $A \subset J^2(\mathbb{R}^3, \mathbb{R})$ be the complement of the set of 2-jets represented by germs $f : (\mathbb{R}^3, z) \to \mathbb{R}$ which either have a nonzero value $f(z)$, or a nonzero first derivative at $z$, or a nondegenerate critical point at $z$. In other words, a 2-jet belongs to $A$ precisely when the value and the first derivative at $z$ vanish and the second derivative at $z$ is degenerate. The codimension of $A$ is exactly $3 + 2$, the minimum of what is allowed in Vassiliev’s theorem. Change the definition of the “generalized surfaces” of type $\mathcal{W}$ given earlier by asking only that critical points of $f_x$ with critical value 0 be nondegenerate. In other words, require only that $f_x : E_x \to \mathbb{R}$ be Morse on a neighborhood of the compact set $f_x^{-1}(0)$. Change the definition of type $h\mathcal{W}$ generalized surfaces accordingly. These changes do not affect the homotopy types of $|\mathcal{W}|$ and $|h\mathcal{W}|$, by a shrinking argument similar to that given at the end of section 4. Note also that $\delta f_x$ in the definition of type $h\mathcal{W}$ ought to have been more correctly described as a section of the jet bundle $J^2(E_x, \mathbb{R}) \to E_x$. (After a choice of a Riemannian metric on $E_x$, an element of $J^k(E_x, \mathbb{R})$ with source $z \in E_x$ can be viewed as a polynomial function of degree $\le k$ on the tangent space of $E_x$ at $z$.) With these specifications and changes, Theorem 5.3 begins to look like a special case of Vassiliev’s theorem. It should however be seen as a generalization of a special case, due to the fact that families of noncompact manifolds $E_x$ depending on a parameter $x \in X$ are involved. Vassiliev’s theorem as stated above is about a “constant” compact manifold.

Remarks concerning the proof of Vassiliev’s theorem. It is a complicated proof and the interested reader should, if possible, consult [36] as well as [37]. One of us (M.W.) has attempted to give an overview in [39], but this is already obsolete because of the following.
Vassiliev’s proof uses a spectral sequence converging to the cohomology of the section space $\Gamma_{\neg A}(J^k(M, \mathbb{R}^n))$, and elaborate transversality and interpolation arguments to show that it converges to the cohomology of $\mathrm{map}_{\neg A}(M, \mathbb{R}^n)$, too. The spectral sequence is well hidden in the final paragraphs of the proof and looks as if it might depend on a number of obscure choices. But Elmer Rees informed us recently, naming Vassiliev as the source of this information, that the spectral sequence, from the second page onwards, does not depend on obscure choices and agrees with a spectral sequence of “generalized Eilenberg–Moore type”, discovered already in 1972 by D. Anderson [2]. Anderson intended it as a spectral sequence converging to the (co)homology of a space of maps $X \to Y$. Here $X$ is a finite-dimensional CW-space and $Y$ is a $\dim(X)$-connected space. (There is a version for based spaces, too; the case where $X = S^1$ and all maps are based is the standard Eilenberg–Moore spectral sequence [12], [30].) Vassiliev needs a variation where the space of maps is replaced by a space of
sections of a certain bundle on $M$ whose fibers are $\dim(M)$-connected. The bundle is, of course, $J^k(M, \mathbb{R}^n) \smallsetminus A(M) \to M$, whose space of sections is $\Gamma_{\neg A}(J^k(M, \mathbb{R}^n))$. In conclusion, anybody wanting to understand Vassiliev’s proof really well should try to understand the Anderson–Eilenberg–Moore spectral sequence for mapping spaces first. Anderson’s article [2] is an announcement, but detailed proofs can be found in [4].

6. Stratifications and homotopy colimit decompositions

The developments in the previous section essentially reduce the proof of Theorem 3.4 to the assertion that the homotopy fiber of the inclusion map $|\mathcal{W}| \to |\mathcal{W}_{loc}|$ in diagram $(*)$ is homotopy equivalent to $\mathbb{Z} \times B\Gamma^+_{\infty,1+1}$. The proof of that assertion in [20] takes up many pages and relies mainly on compatible decompositions of $|\mathcal{W}|$ and $|\mathcal{W}_{loc}|$ into manageable pieces. There is no point in repeating the details here. But there is a point in providing some motivation for the decompositions. The motivation which we propose here (very much “a posteriori”) is almost perpendicular to the hard work involved in establishing the decompositions, and so does not overlap very much with anything in [20]. As a motivation for the motivation, we shall begin by describing the decompositions (without, of course, constructing them).

Definition 6.1. [31]. Let $F$ be a covariant functor from a small category $\mathcal{C}$ to spaces. The transport category $\mathcal{C}\int F$ is the topological category where the objects are the pairs $(c, x)$ with $c \in \mathrm{ob}(\mathcal{C})$ and $x \in F(c)$, and where a morphism from $(c, x)$ to $(d, y)$ is a morphism $g : c \to d$ in $\mathcal{C}$ such that $F(g) : F(c) \to F(d)$ takes $x$ to $y$. Thus $\mathrm{ob}(\mathcal{C}\int F)$ is the space $\coprod_c F(c)$ and the morphism space $\mathrm{mor}(\mathcal{C}\int F)$ is the pullback of
$$\mathrm{ob}(\mathcal{C}\textstyle\int F) \longrightarrow \mathrm{ob}(\mathcal{C}) \xleftarrow{\ \text{source}\ } \mathrm{mor}(\mathcal{C}).$$
The homotopy colimit of $F$ is the classifying space of the topological category $\mathcal{C}\int F$. Notation: $\operatorname{hocolim} F$, $\operatorname{hocolim}_{\mathcal{C}} F$, $\operatorname{hocolim}_{c \text{ in } \mathcal{C}} F(c)$.
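The transport category of Definition 6.1 can be listed explicitly when everything is finite. The following Python sketch makes the simplifying assumption that each space $F(c)$ is a finite discrete set; the function name `transport_category` and the example data are hypothetical and serve only as an illustration.

```python
def transport_category(morphisms, F_obj, F_mor):
    """List objects and morphisms of the transport category of a functor F.

    morphisms: dict  name -> (source, target), identities included
    F_obj:     dict  object -> finite set F(object), a discrete "space"
    F_mor:     dict  name -> dict realising F(name) : F(source) -> F(target)
    """
    objects = [(c, x) for c in sorted(F_obj) for x in sorted(F_obj[c])]
    arrows = []
    for g, (c, d) in sorted(morphisms.items()):
        for x in sorted(F_obj[c]):
            # g becomes a morphism (c, x) -> (d, F(g)(x))
            arrows.append((g, (c, x), (d, F_mor[g][x])))
    return objects, arrows


# Hypothetical example: one object "*", endomorphisms the group Z/2 = {e, s},
# and F(*) = {0, 1} with s acting by the swap.
morphisms = {"e": ("*", "*"), "s": ("*", "*")}
F_obj = {"*": {0, 1}}
F_mor = {"e": {0: 0, 1: 1}, "s": {0: 1, 1: 0}}

objects, arrows = transport_category(morphisms, F_obj, F_mor)
print(objects)
print(len(arrows))
```

In this one-object example the transport category has two objects and four morphisms. Its classifying space is the Borel construction for a free transitive action of $\mathbb{Z}/2$, which is contractible.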
Remarks 6.2. If $\mathcal{C}$ has only one object, then $\mathcal{C}$ is a monoid, $F$ amounts to a space with an action of the monoid, and $\operatorname{hocolim} F$ is the Borel construction. The variance of $F$ is not important; if $F$ is a contravariant functor from $\mathcal{C}$ to spaces, replace $\mathcal{C}$ by $\mathcal{C}^{op}$ in the above definition. In that situation it is still customary to write $\operatorname{hocolim}_{\mathcal{C}} F$ for the homotopy colimit.

Definition 6.3. Let $\mathcal{K}$ be the discrete category defined as follows. An object of $\mathcal{K}$ is a finite set $S$ with a map to $\{0, 1, 2, 3\}$. A morphism from $S$ to $T$ in $\mathcal{K}$ consists of an injection $f : S \to T$ over $\{0, 1, 2, 3\}$, and a map $\varepsilon$ from $T \smallsetminus f(S)$ to $\{-1, +1\}$. The composition of $(f_1, \varepsilon_1) : S \to T$ with $(f_2, \varepsilon_2) : R \to S$ is $(f_1 f_2, \varepsilon_3) : R \to T$ where $\varepsilon_3(t) = \varepsilon_1(t)$ if $t \notin f_1(S)$ and $\varepsilon_3(f_1(s)) = \varepsilon_2(s)$ if $s \notin f_2(R)$.
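The composition rule in Definition 6.3 is easy to misread, so a small executable sketch may help. It assumes objects of $\mathcal{K}$ are encoded as Python dicts from elements to labels in $\{0,1,2,3\}$ and morphisms as pairs of dicts; the names `is_morphism` and `compose` and the sample data are hypothetical.

```python
def is_morphism(S, T, f, eps):
    """Check that (f, eps) is a morphism S -> T in the category K."""
    injective = len(set(f.values())) == len(f)
    # f must be defined on all of S and respect the labels in {0, 1, 2, 3}
    over_labels = set(f) == set(S) and all(
        t in T and T[t] == S[s] for s, t in f.items()
    )
    # eps must be defined exactly on the complement of the image of f
    complement = set(T) - set(f.values())
    signs_ok = set(eps) == complement and all(e in (-1, 1) for e in eps.values())
    return injective and over_labels and signs_ok


def compose(f1, eps1, f2, eps2):
    """Compose (f1, eps1): S -> T with (f2, eps2): R -> S, giving R -> T."""
    f3 = {r: f1[s] for r, s in f2.items()}
    eps3 = dict(eps1)              # eps3(t) = eps1(t) for t outside f1(S)
    for s, sign in eps2.items():   # eps3(f1(s)) = eps2(s) for s outside f2(R)
        eps3[f1[s]] = sign
    return f3, eps3


# Hypothetical objects: elements labelled by values in {0, 1, 2, 3}.
R = {"a": 1}
S = {"a": 1, "b": 2}
T = {"a": 1, "b": 2, "c": 0}

f2, eps2 = {"a": "a"}, {"b": 1}              # R -> S
f1, eps1 = {"a": "a", "b": "b"}, {"c": -1}   # S -> T

f3, eps3 = compose(f1, eps1, f2, eps2)
print(f3, eps3)
```

The domain of `eps3` is the union of $T \smallsetminus f_1(S)$ and $f_1(S \smallsetminus f_2(R))$, which equals $T \smallsetminus f_1 f_2(R)$ because $f_1$ is injective; this is what makes the composite a morphism again.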
The category $\mathcal{K}$ arises very naturally in the taxonomy of generalized surfaces of type $\mathcal{W}$ and $\mathcal{W}_{loc}$. Let $(E_x, f_x)$ be a generalized surface of type $\mathcal{W}$ or $\mathcal{W}_{loc}$. Then the set $\mathrm{crit}_0(f_x) = \mathrm{crit}(f_x) \cap f_x^{-1}(0)$ is a finite set with a map to $\{0, 1, 2, 3\}$ given by the Morse index. In other words it is an object of $\mathcal{K}$. In view of this, we expand our earlier list of generalized surface types by adding the following sub-types $\mathcal{W}_S$ and $\mathcal{W}_{loc,S}$ of types $\mathcal{W}$ and $\mathcal{W}_{loc}$, respectively, for a fixed object $S$ of $\mathcal{K}$.

  $\mathcal{W}_S$             oriented smooth 3-manifold $E_x$ with proper smooth Morse function $f_x : E_x \to \mathbb{R}$ and an isomorphism $S \to \mathrm{crit}_0(f_x)$ in $\mathcal{K}$
  $\mathcal{W}_{loc,S}$       oriented smooth 3-manifold $E_x$ with smooth Morse function $f_x : E_x \to \mathbb{R}$ such that $f_x|\mathrm{crit}(f_x)$ is proper, and an isomorphism $S \to \mathrm{crit}_0(f_x)$ in $\mathcal{K}$

The classifying spaces for the corresponding families (which are, as usual, submersions) are denoted $|\mathcal{W}_S|$ and $|\mathcal{W}_{loc,S}|$, respectively. The promised decompositions of $|\mathcal{W}|$ and $|\mathcal{W}_{loc}|$ can now be described loosely as follows.

Theorem 6.4. $|\mathcal{W}| \simeq \operatorname{hocolim}_{S \text{ in } \mathcal{K}} |\mathcal{W}_S|$ and $|\mathcal{W}_{loc}| \simeq \operatorname{hocolim}_{S \text{ in } \mathcal{K}} |\mathcal{W}_{loc,S}|$.
Implicit in these formulae is the claim that $|\mathcal{W}_S|$ and $|\mathcal{W}_{loc,S}|$ are contravariant functors of the variable $S$ in $\mathcal{K}$. A rigorous verification would take up much space, and does take up much space in [20], but the true reasons for this functoriality are not hard to understand. Fix a morphism $(g, \varepsilon) : S \to T$ in $\mathcal{K}$. Let $(E_x, f_x)$ be a generalized surface of type $\mathcal{W}_T$ or $\mathcal{W}_{loc,T}$, so that $\mathrm{crit}_0(f_x)$ is identified with $T$. Choose a smooth function $\psi : E_x \to \mathbb{R}$ with support in a small neighborhood of $\mathrm{crit}_0(f_x)$ such that $\psi$ equals $\varepsilon$ near points of $\mathrm{crit}_0(f_x) \cong T$ not in $g(S)$, and equals 0 near the remaining points of $\mathrm{crit}_0(f_x)$. Then for all sufficiently small $c > 0$, the function $f_x + c\psi$ is Morse and has exactly the same critical points as $f_x$. But the values of $f_x + c\psi$ on the critical points differ from those of $f_x$, with the result that $(E_x, f_x + c\psi)$ is a generalized surface of type $\mathcal{W}_S$ or $\mathcal{W}_{loc,S}$ as appropriate. The procedure generalizes to families and so induces maps
$$|\mathcal{W}_T| \longrightarrow |\mathcal{W}_S|, \qquad |\mathcal{W}_{loc,T}| \longrightarrow |\mathcal{W}_{loc,S}|.$$
Theorem 6.4 in its present raw state can be deduced from a recognition principle for homotopy colimits over certain categories. In the special case when the category is a group $G$, the recognition principle is well known and states the following. Suppose that $Y$ is the total space of a fibration $p : Y \to BG$. Then $Y \simeq X_{hG}$ for some $G$-space $X$ such that $X \simeq p^{-1}(*)$. (See Remarks 6.2, and for the proof let $X$ be the pullback along $p$ of the universal cover of $BG$.) In the general setting, the indexing category is an EI-category, that is, a category in which every Endomorphism is an Isomorphism.
The category $\mathcal{K}$ is an example of an EI-category. Groupoids and posets are also extreme examples of EI-categories. The opposite category of any EI-category is an EI-category. EI-categories have something to do with stratified spaces, which justifies the following excursion.

Definition 6.5. A stratification of a space $Z$ is a locally finite partition of $Z$ into locally closed subsets, the strata, such that the closure of each stratum in $Z$ is a union of strata.

Example 6.6. Let $\mathcal{C}$ be a small EI-category. For each isomorphism class $[C]$ of objects in $\mathcal{C}$, we define a locally closed subset $B\mathcal{C}_{[C]}$ of the classifying space $B\mathcal{C}$, as follows. A point $x \in B\mathcal{C}$ is in $B\mathcal{C}_{[C]}$ if the unique cell of $B\mathcal{C}$ containing $x$ corresponds to a diagram $C_0 \leftarrow C_1 \leftarrow \cdots \leftarrow C_k$ without identity arrows, where $C_0$ is isomorphic to $C$. (Remember that $B\mathcal{C}$ is a CW-space, with one cell for each diagram $C_0 \leftarrow C_1 \leftarrow \cdots \leftarrow C_k$ as above.) Then $B\mathcal{C}$ is stratified, with one stratum $B\mathcal{C}_{[C]}$ for each isomorphism class $[C]$. The closure of the stratum $B\mathcal{C}_{[C]}$ is the union of all strata $B\mathcal{C}_{[D]}$ for objects $D$ which admit a morphism $D \to C$.

To be even more specific, we can take $\mathcal{C} = \mathcal{K}^{op}$. The isomorphism types of objects in $\mathcal{K}^{op}$ correspond to quadruples $(n_0, n_1, n_2, n_3)$ of non-negative integers. The stratum of $B\mathcal{K}^{op}$ corresponding to such a quadruple turns out to have a normal vector bundle in $B\mathcal{K}^{op}$, of fiber dimension $n_0 + n_1 + n_2 + n_3$; hence the stratum can be said to have codimension $n_0 + n_1 + n_2 + n_3$. Its closure is the union of all strata corresponding to quadruples $(m_0, m_1, m_2, m_3)$ where $m_i \ge n_i$. There is a unique open stratum, corresponding to the object $\emptyset$ of $\mathcal{K}^{op}$ or the quadruple $(0, 0, 0, 0)$.

Digression. The stratification of $B\mathcal{K}^{op}$ just described can be used to determine the homotopy type of $B\mathcal{K}^{op}$, roughly as follows. Let $f : X \to B\mathcal{K}^{op}$ be a map, where $X$ is a smooth manifold. Up to a homotopy, $f$ is “transverse” to the strata of codimension $> 0$.
Then the union of the inverse images of these codimension $> 0$ strata is the image of a proper smooth codimension 1 immersion $M \to X$ with trivialized normal line bundle, with transverse self-intersections, and with a map $M \to \{0, 1, 2, 3\}$. The construction can be reversed, i.e., such an immersion determines a homotopy class of maps $X \to B\mathcal{K}^{op}$. In this sense, $B\mathcal{K}^{op}$ classifies (up to concordance) proper smooth codimension one immersions with trivialized normal bundle and with a map from the source $M$ to $\{0, 1, 2, 3\}$. It follows that
$$B\mathcal{K}^{op} \simeq QS^1 \times QS^1 \times QS^1 \times QS^1$$
because $QS^1 = \Omega^\infty\Sigma^\infty S^1$ is known to classify proper smooth codimension 1 immersions with trivialized normal bundle [38].

Definition 6.7. Let $Z$ be a stratified space. A path $\gamma : [0, 1] \to Z$ is nonincreasing if, for each $t \in [0, 1]$, the set $\gamma[0, t]$ is contained in the closure of the stratum
which contains $\gamma(t)$. A homotopy of maps $(h_t : X \to Z)_{t\in[0,1]}$, where $X$ is some space, is nonincreasing if, for each $x \in X$, the path $t \mapsto h_t(x)$ is nonincreasing.

Remark. For a nonincreasing path $\gamma$, the depth, complexity, etc. of the stratum containing $\gamma(t)$ is a nonincreasing function of $t$.

Definition 6.8. Let $p : Y \to Z$ be a map, where $Z$ is stratified. Say that $p$ is a downward fibration if it has the homotopy lifting property for nonincreasing homotopies. That is, given a nonincreasing homotopy $(h_t : X \to Z)_{t\in[0,1]}$ and a map $g_0 : X \to Y$ such that $pg_0 = h_0$, there exists a homotopy $(g_t : X \to Y)_{t\in[0,1]}$ such that $pg_t = h_t$ for all $t \in [0, 1]$.

Pre-theorem 6.9. Let $\mathcal{C}$ be an EI-category. Stratify $B\mathcal{C}$ as in Example 6.6. Let $Y$ be a space and let $p : Y \to B\mathcal{C}$ be a downward fibration. Then
$$Y \simeq \operatorname*{hocolim}_{c \text{ in } \mathcal{C}} F(c)$$
where F is a covariant functor from C to spaces such that F(c) p−1 (c) for all objects c of C, alias vertices of BC. This is the recognition principle (no proof offered for lack of time and space). It has an obvious weakness: the functor F is not sufficiently determined by the conditions F(c) p−1 (c). But then it is meant as a principle, a rule of thumb. In any case we should apply it with Y = |W| or Y = |Wloc | and C = Kop , the opposite of K. There is a problem with that plan. Explicit descriptions of |W| and |Wloc | have not yet been given (in this paper). Instead, we have highfalutin characterizations of |W| and |Wloc | as classifying spaces for certain families. The modified plan is, therefore, to move BC = BKop to the same highfalutin level, and to verify the hypothesis of Pretheorem 6.9 at that level. This leads us to the interesting question: What does the classifying space of a category C classify ? There is no doubt that the question has many correct answers. One such answer is given in [20, 4.1.2]. This is essentially identical with an answer known to tom Dieck (but possibly attributed to G. Segal) in the early 70’s, according to unpublished lecture notes for which we are indebted to R. Vogt. Moerdijk [23] has a more streamlined answer, and many generalizations of the question, too. The following proposal is inspired by a passage in Moerdijk’s book, but is apparently not identical with (a special case of) his answer and if it should fail badly the responsibility is ours. Terminology. A C-set is a functor from C op to sets. The category of C-sets shares many good properties with the category of sets. (It is a topos.) In particular, we can talk about sheaves of C-sets on a space. A C-set is representable if it is isomorphic to one of the form c → morC (c, c0 ) for a fixed object c0 in C. Pre-theorem 6.10. The classifying space BC classifies sheaves of C-sets whose stalks are representable.
Remark. Traditionally there are two equivalent definitions of the notion “sheaf” on a space $X$. According to one of them, a sheaf is a contravariant functor from the poset of open sets of $X$ to sets, subject to a gluing condition. According to the other, a sheaf on $X$ is an étale map to $X$. While the first point of view is better for processing most of the interesting examples, the second one is better for showing that sheaves behave contravariantly (can be “pulled back”). This carries over to sheaves of $C$-sets.

The classification of the sheaves in Pre-theorem 6.10 is up to concordance. Two sheaves $G_0$, $G_1$ on $X$ as in the pre-theorem are concordant if there exists a sheaf on $X \times [0,1]$, as in the pre-theorem, whose restrictions to $X \times \{0\} \cong X$ and $X \times \{1\} \cong X$ are isomorphic to $G_0$ and $G_1$, respectively. The claim is that, for “most” spaces $X$, there is a natural bijection from the set of homotopy classes $[X, BC]$ to the set of concordance classes of sheaves of $C$-sets with representable stalks on $X$.

Example 6.11. Let $(\pi, f)$ be a family of generalized surfaces of type $W$ on a smooth manifold $X$. That is, $\pi : E \to X$ is a smooth submersion with oriented 3-dimensional fibers, $f : E \to \mathbb{R}$ is a smooth map such that $(\pi, f) : E \to X \times \mathbb{R}$ is proper, and the restrictions $f_x = f|_{E_x}$ of $f$ to the fibers of $\pi$ are Morse functions. With these data, we can associate a sheaf $I_{(\pi,f)}$ of $K^{op}$-sets on $X$. Namely, for an open subset $U$ of $X$ and an object $S$ of $K$, let $I_{(\pi,f)}(U)(S)$ be the subset of
$$\prod_{x\in U} \mathrm{mor}_K(\mathrm{crit}_0(f_x), S)$$
consisting of the elements for which the adjoint map from $\bigcup_{x\in U} \mathrm{crit}_0(f_x) \subset E$ to $S$ is continuous and the resulting sign function, defined on a subset of $U \times S$, is also continuous. Then $I_{(\pi,f)}(U)(S)$ is a covariant functor of $S$ in $K = (K^{op})^{op}$ and a contravariant functor of the variable $U$, as it should be. The stalk at $x \in X$ is easily identified with the functor $S \mapsto \mathrm{mor}_K(\mathrm{crit}_0(f_x), S)$. It is obviously representable as a functor on $K^{op}$.

The construction of $I_{(\pi,f)}$ in example 6.11 works equally well for a family of generalized surfaces of type $W_{loc}$.

Now Theorem 6.4 can be understood as a special case of (something analogous to) the recognition principle, Pre-theorem 6.9. Take $C = K^{op}$ and $Y = |W|$ or $Y = |W_{loc}|$ in Pre-theorem 6.9. There are no explicit maps $|W| \to BK^{op}$ or $|W_{loc}| \to BK^{op}$ to work with. But there is instead the procedure of example 6.11 which from every family $(\pi, f)$ of the sort classified by $|W|$ or $|W_{loc}|$ constructs a sheaf $I_{(\pi,f)}$ of the sort classified by $BK^{op}$. The “downward fibration” condition in Pre-theorem 6.9 can be stated and proved in this setting.

In more detail, for the case of $|W|$, fix a smooth manifold $X$ and a sheaf $H$ of $K^{op}$-sets on $X \times [0,1]$ with representable stalks. Assume that $H$ is nonincreasing. This means simply that, for every $x \in X$, the function which to $t \in [0,1]$ assigns the cardinality of the representing object for the stalk at $(x,t)$ is nonincreasing. Assume further that the restriction of $H$ to $X \times \{0\}$ is identified with
$I_{(\pi,f)}$ for a family $(\pi, f)$ on $X \times \{0\}$, as in example 6.11. Then we can extend that family to a family $(\psi, g)$ on $X \times [0,1]$, and the isomorphism of sheaves to an isomorphism of $I_{(\psi,g)}$ with $H$. The verification is left to the reader.

7. Final touches

The guiding idea for this chapter is that, because of Theorem 6.4, we should be able to understand the homotopy fiber(s) of $|W| \to |W_{loc}|$ by understanding the homotopy fibers of $|W_S| \to |W_{loc,S}|$ for each object $S$ in $K$. The underpinning for this strategy is the following general fact. (Notation: “$\operatorname{hofiber}_z(f)$” is short for the homotopy fiber over a point $z$ in the target of a map $f$.)

Proposition 7.1. Let $C$ be a small category and let $F_1$, $F_2$ be functors from $C$ to the category of spaces. Let $u : F_1 \to F_2$ be a natural transformation. Suppose that, for every morphism $g : c \to d$ in $C$ and every $z \in F_2(c)$, the map
$$\operatorname{hofiber}_z(F_1(c) \to F_2(c)) \xrightarrow{\;g_*\;} \operatorname{hofiber}_{g_*(z)}(F_1(d) \to F_2(d))$$
induced by $g$ is a homotopy equivalence (resp., induces an isomorphism in integral homology). Then, for any object $c$ in $C$ and $z \in F_2(c)$, the inclusion
$$\operatorname{hofiber}_z(F_1(c) \to F_2(c)) \longrightarrow \operatorname{hofiber}_z(\operatorname{hocolim} F_1 \to \operatorname{hocolim} F_2)$$
is a homotopy equivalence (resp., induces an isomorphism in integral homology).

We now have to ask whether the hypotheses of this proposition are satisfied or “nearly satisfied” in the case where $C$ is (equivalent to) $K^{op}$ and $u : F_1 \to F_2$ is given by the inclusions $|W_S| \to |W_{loc,S}|$ for $S$ in $K^{op}$.

Lemma 7.2. The space $|W_{loc,S}|$ is a classifying space for oriented 3-dimensional Riemannian vector bundles $V$ on $S$ equipped with the following extra structure: an orthogonal splitting $V \cong V(+) \oplus V(-)$, where the fiber dimension function of $V(-)$ agrees with the structure map $S \to \{0, 1, 2, 3\}$.

Idea of proof. A generalized surface $(E_x, f_x)$ of type $W_{loc,S}$ is canonically concordant to $(V, f_x|_V)$ for any open neighborhood $V$ of $\mathrm{crit}_0(f_x) \cong S$. Take $V$ to be a standard tubular neighborhood of $S$, so that the retraction $V \to S$ comes with a vector bundle structure. By Morse theory, there is no substantial loss of information in replacing $f_x|_V$ by the “total Hessian” of $f_x$, which is a nondegenerate symmetric form on $V$. A choice of an orthogonal splitting of $V$ into a positive definite part and a negative definite part for the Hessian can be added, because that is a contractible choice. By changing the sign of the Hessian on the negative definite summand, we obtain a Riemannian structure on $V$.

Lemma 7.3. The space $|W_S|$ is a classifying space for bundles of smooth closed oriented surfaces, where each fiber $F$ is equipped with “surgery data” as follows:
• a 3-dimensional vector bundle $V$ on $S$, etc., as in lemma 7.2;
• a smooth orientation preserving embedding $e$ of $D(V(+)) \times_S S(V(-))$ in $F$, where $D(\dots)$ and $S(\dots)$ denote unit disk and unit sphere bundles.
Idea of proof. In the definition of a generalized surface $(E_x, f_x)$ of type $W_S$, add the condition $\mathrm{crit}_0(f_x) = \mathrm{crit}(f_x)$, so that critical values other than 0 are forbidden. A shrinking argument similar to all the previous shrinking arguments in this paper shows that this change does not affect the homotopy type of the classifying space $|W_S|$. With the new condition $\mathrm{crit}_0(f_x) = \mathrm{crit}(f_x)$, however, a generalized surface $(E_x, f_x)$ of type $W_S$ can be described as the (long) trace of $|S|$ simultaneous surgeries on the genuine smooth oriented surface $f_x^{-1}(c)$ for fixed $c < 0$. The simultaneous surgeries are in the usual way determined by disjoint embeddings of certain thickened spheres (labelled by the elements of $S$) in the surface $f_x^{-1}(c)$.

Corollary 7.4. The homotopy fiber of $|W_S| \to |W_{loc,S}|$ over any point $z$ in $|W_{loc,S}|$ is a classifying space for bundles of compact smooth oriented surfaces with a prescribed (oriented) boundary depending on $S$ and $z$.

Outline of proof. The choice of $z$ amounts to a choice of a Riemannian vector bundle $V$ on $S$ with splitting etc., as in lemma 7.2. To obtain a correct description of the homotopy fiber, simply fix $V$ etc. in the re-definition of $|W_S|$ given in lemma 7.3. This fixes the source of the codimension zero embedding $e$. Hence the information carried by the surface $F$ and the embedding $e$ is carried by the closure of $F \smallsetminus \operatorname{im}(e)$, and the identification of its boundary with $S(V(+)) \times_S S(V(-))$.

It is obvious how the homotopy fibers in corollary 7.4 depend on $z$ alias $V$. The dependence on $S$ is more interesting because we can vary by morphisms in $K$ which are not isomorphisms. It suffices to describe the dependence in the case of a morphism $(g, \varepsilon) : R \to S$ in $K$ where $g$ is an inclusion and $S \smallsetminus R$ has a single element $s$. Let $z \in |W_{loc,S}|$ correspond to a vector bundle $V$ on $S$, etc., as in 7.2. Then the image $y \in |W_{loc,R}|$ of $z$ under $(g, \varepsilon)^*$ corresponds to $V|_R = V \smallsetminus V_s$.

Lemma 7.5. The map induced by $(g, \varepsilon)$ from the homotopy fiber of $|W_S| \to |W_{loc,S}|$ over $z$ to the homotopy fiber of $|W_R| \to |W_{loc,R}|$ over $y$ is given by a gluing construction $\sqcup_{\partial L}\, L$, applied to the surfaces featuring in corollary 7.4, where
$$L = \begin{cases} D(V_s(+)) \times S(V_s(-)) & \text{if } \varepsilon(s) = +1, \\ S(V_s(+)) \times D(V_s(-)) & \text{if } \varepsilon(s) = -1. \end{cases}$$

The transition maps described in this lemma do not induce homology isomorphisms in general (i.e., do not satisfy the conditions of Proposition 7.1), but in a sense they come close to that. Indeed they are maps of the type considered in the Harer–Ivanov stability Theorem 1.5. The remaining difficulty, from this point of view, is therefore that the surfaces featuring in corollary 7.4 need not be connected and of large genus. Fortunately it is possible to make some changes in the decomposition $|W| \simeq \operatorname{hocolim}_S |W_S|$ so that corollary 7.4 comes out “right”, i.e., with something resembling the phrase connected and of large genus in it. (A welcome side-effect of these changes is that $|W_\emptyset|$ metamorphoses into $\mathbb{Z} \times B\Gamma_{\infty,1+1}$.) These adjustments occupy the final chapters of [20].
References
[1] J.F. Adams, Infinite loop spaces, Annals of Math. Studies, vol. 90, Princeton Univ. Press, 1978.
[2] D.W. Anderson, A generalization of the Eilenberg–Moore spectral sequence, Bull. Amer. Math. Soc. 78 (1972), 784–786.
[3] R. Benedetti and J.-J. Risler, Real algebraic and semi-algebraic sets, Actualités mathématiques, Hermann, 1990.
[4] A.K. Bousfield, On the homology spectral sequence of a cosimplicial space, Amer. J. Math. 109 (1987), 361–394.
[5] T. Bröcker and K. Jänich, Introduction to differential topology, Camb. Univ. Press, 1982, German edition Springer-Verlag (1973).
[6] E. Brown, Cohomology theories, Ann. of Math. 75 (1962), 467–484, correction in Ann. of Math. 78 (1963).
[7] S. Buoncristiano, C. Rourke, and B. Sanderson, A geometric approach to homology theory, Lond. Math. Soc. Lecture Note ser., vol. 18, Camb. Univ. Press, 1976.
[8] S.S. Chern, An elementary proof of the existence of isothermal parameters on a surface, Proc. Amer. Math. Soc. 6 (1955), 771–782.
[9] D. Quillen, Elementary proofs of some results of cobordism theory using Steenrod operations, Advances in Math. 7 (1971), 29–56.
[10] C.J. Earle and J. Eells, A fibre bundle description of Teichmüller theory, J. Differential Geom. 3 (1969), 19–43.
[11] C.J. Earle and A. Schatz, Teichmüller theory for surfaces with boundary, J. Differential Geom. 4 (1970), 169–185.
[12] S. Eilenberg and J.C. Moore, Homology and fibrations. I. Coalgebras, cotensor product and its derived functors, Comment. Math. Helv. 40 (1966), 199–236.
[13] S. Galatius, Mod p homology of the stable mapping class group, Topology 43 (2004), 1105–1132.
[14] A. Haefliger, Lectures on the theorem of Gromov, Proc. of 1969/70 Liverpool Singularities Symp., Lecture Notes in Math., vol. 209, Springer, 1971, pp. 128–141.
[15] J.L. Harer, Stability of the homology of the mapping class groups of oriented surfaces, Ann. of Math. 121 (1985), 215–249.
[16] A. Hatcher and W. Thurston, A presentation for the mapping class group of a closed orientable surface, Topology 19 (1980), 221–237.
[17] N.V. Ivanov, Stabilization of the homology of the Teichmueller modular groups, Algebra i Analiz 1 (1989), 110–126, translation in: Leningrad Math. J. 1 (1990), 675–691.
[18] N.V. Ivanov, On the homology stability for Teichmüller modular groups: closed surfaces and twisted coefficients, Mapping class groups and moduli spaces of Riemann surfaces (Göttingen/Seattle 1991), Contemporary Mathematics, vol. 150, Amer. Math. Soc., 1993, pp. 149–194.
[19] I. Madsen and U. Tillmann, The stable mapping class group and $Q(\mathbb{C}P^\infty_+)$, Invent. Math. 145 (2001), 509–544.
[20] I. Madsen and M. Weiss, The stable moduli space of Riemann surfaces: Mumford’s conjecture, preprint, arXiv:math.AT/0212321, 2002.
[21] D. McDuff and G. Segal, Homology fibrations and the “group-completion” theorem, Invent. Math. 31 (1976), 279–284.
[22] E. Miller, The homology of the mapping class group, J. Diff. Geom. 24 (1986), 1–14.
[23] I. Moerdijk, Classifying spaces and classifying topoi, Lecture Notes in Math., vol. 1616, Springer, 1995.
[24] S. Morita, Characteristic classes of surface bundles, Bull. Amer. Math. Soc. 11 (1984), 386–388.
[25] S. Morita, Characteristic classes of surface bundles, Invent. Math. 90 (1987), 551–577.
[26] D. Mumford, Towards an enumerative geometry of the moduli space of curves, Arithmetic and Geometry, II, Progr. in Math., vol. 36, Birkhäuser, 1983, pp. 271–328.
[27] A. Newlander and L. Nirenberg, Complex analytic coordinates in almost complex manifolds, Ann. of Math. 65 (1957), 391–404.
[28] A. Phillips, Submersions of open manifolds, Topology 6 (1967), 170–206.
[29] J. Powell, Two theorems on the mapping class group of a surface, Proc. Amer. Math. Soc. 68 (1978), 347–359.
[30] D.L. Rector, Steenrod operations in the Eilenberg–Moore spectral sequence, Comment. Math. Helv. 45 (1970), 540–552.
[31] G. Segal, Classifying spaces and spectral sequences, Inst. Hautes Études Sci. Publ. Math. 34 (1968), 105–112.
[32] G.B. Segal, Categories and cohomology theories, Topology 13 (1974), 293–312.
[33] E.H. Spanier, Algebraic topology, McGraw–Hill, New York, 1966.
[34] U. Tillmann, On the homotopy of the stable mapping class group, Invent. Math. 130 (1997), 257–275.
[35] U. Tillmann, A splitting for the stable mapping class group, Math. Proc. Camb. Phil. Soc. 127 (1999), 55–56.
[36] V. Vassiliev, Topology of spaces of functions without complicated singularities, Funktsional. Anal. i Prilozhen. 93 (1989), 24–36, Engl. translation in Funct. Analysis Appl. 23 (1989), 266–286.
[37] V. Vassiliev, Complements of Discriminants of Smooth Maps: Topology and Applications, Transl. of Math. Monographs, vol. 98, Amer. Math. Soc., 1994 (1992), revised edition.
[38] P. Vogel, Cobordisme d’immersions, Ann. Sci. École Norm. Sup. 7 (1974), 317–357.
[39] M. Weiss, Cohomology of the stable mapping class group, Topology, Geometry and Quantum Field Theory: Proc. of 2002 Oxf. Symp. in honour of G. Segal’s 60th birthday, Cambridge Univ. Press, 2004, ed. by U. Tillmann.

Ib Madsen
Institute for the Math. Sciences, Aarhus University
DK-8000 Aarhus C, Denmark
e-mail: [email protected]

Michael Weiss
Department of Mathematics, University of Aberdeen
Aberdeen AB24 3UE, UK
e-mail: [email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
A Non-asymptotic Theory for Model Selection
Pascal Massart

Abstract. Model selection is a classical topic in statistics. The idea of selecting a model via penalizing a log-likelihood type criterion goes back to the early seventies with the pioneering works of Mallows and Akaike. One can find many consistency results in the literature for such criteria. These results are asymptotic in the sense that one deals with a given number of models and the number of observations tends to infinity. We shall give an overview of a non-asymptotic theory for model selection which has emerged during these last ten years. In various contexts of function estimation it is possible to design penalized log-likelihood type criteria with penalty terms depending not only on the number of parameters defining each model (as for the classical criteria) but also on the “complexity” of the whole collection of models to be considered. For practical relevance of these methods, it is desirable to get a precise expression of the penalty terms involved in the penalized criteria on which they are based. Our approach heavily relies on concentration inequalities, the prototype being Talagrand’s inequality for empirical processes which leads to explicit penalties. Simultaneously, we derive non-asymptotic risk bounds for the corresponding penalized estimators showing that they perform almost as well as if the “best model” (i.e., with minimal risk) were known. Our purpose will be to give an account of the theory and discuss some selected applications such as variable selection or change point detection.
Received by the editors October 2004.
2000 Mathematics Subject Classification. Primary: 60E15; Secondary 60F10, 94A17.
Key words and phrases. Change point detection, classification, concentration inequalities, empirical processes, model selection, regression estimation, statistical learning, variable selection.

1. Statistical inference

If one observes some random variable $\xi$ (which can be a random vector or a random process) with unknown distribution, the basic problem of statistical inference is to take a decision about some quantity $s$ related to the distribution of $\xi$, for instance estimate $s$ or provide a confidence set for $s$ with a given level of confidence. Usually, one starts from a genuine estimation procedure for $s$ and tries to get some idea of how far it is from the target. Since generally speaking the exact distribution of the estimation procedure is not available, the role of Probability Theory is to provide relevant approximation tools to evaluate it. In the situation where $\xi = \xi^{(n)}$ depends on some parameter $n$ (typically when $\xi = (\xi_1, \dots, \xi_n)$, where the variables $\xi_1, \dots, \xi_n$ are independent), asymptotic theory in statistics uses limit theorems (Central Limit Theorems, Large Deviation Principles, ...) as approximation tools when $n$ is large. One of the first examples of such a result to be found in the literature is the use of the CLT to analyze the behavior of a maximum likelihood estimator on a given regular parametric model (independent of $n$) as $n$ goes to infinity. More recently, since the seminal works of Dudley in the seventies, the theory of probability in Banach spaces has deeply influenced the development of asymptotic statistics, the main tools involved in these applications being limit theorems for empirical processes. This led to decisive advances for the theory of asymptotic efficiency in semiparametric models, for instance, and the interested reader will find numerous results in this direction in the books by van der Vaart and Wellner [29] or van der Vaart [28].

2. Model selection

Designing a genuine estimation procedure requires some prior knowledge on the unknown distribution of $\xi$, and choosing a proper model is a major problem for the statistician. The aim of model selection is to construct data-driven criteria to select a model among a given list. We shall see that in many situations motivated by applications such as signal analysis, it is useful to allow the size of the models to depend on the sample size $n$. In these situations, classical asymptotic analysis breaks down and one needs to introduce an alternative approach that we call non-asymptotic. By non-asymptotic, we do not mean of course that large samples of observations are not welcome, but that the size of the models as well as the size of the list of models should be allowed to be large when $n$ is large, in order to be able to warrant that the statistical model is not far from the truth. When the target quantity $s$ to be estimated is a function, this allows one in particular to consider models which have good approximation properties at different scales and to use model selection criteria to choose from the data the best approximating model to be considered.
Over the last 20 years, the phenomenon of the concentration of measure has received much attention, mainly due to the remarkable series of works by Michel Talagrand which led to a variety of new powerful inequalities (see in particular [25] and [26]). The main interesting feature of concentration inequalities is that, unlike central limit theorems or large deviation inequalities, they are indeed non-asymptotic. The point is that these new tools of Probability theory lead to a non-asymptotic theory for model selection and illustrate the benefits of this approach for several functional estimation problems. The basic examples of functional estimation frameworks that we have in mind are the following.

Example 2.1 (Density estimation). One observes $\xi_1, \dots, \xi_n$ which are i.i.d. random variables with unknown density $s$ with respect to some given measure $\mu$.

Example 2.2 (Regression). One observes $(X_1, Y_1), \dots, (X_n, Y_n)$ with
$$Y_i = s(X_i) + \varepsilon_i, \quad 1 \le i \le n.$$
One assumes the explanatory variables $X_1, \dots, X_n$ to be independent (but not necessarily i.i.d.) and the regression errors $\varepsilon_1, \dots, \varepsilon_n$ to be i.i.d. with $E[\varepsilon_i \mid X_i] = 0$. The function $s$ is the so-called regression function.

Example 2.3 (Binary classification). As in the previous setting, one observes independent pairs $(X_1, Y_1), \dots, (X_n, Y_n)$, but here we assume those pairs to be copies of a pair $(X, Y)$ where the response variable $Y$ takes only two values, say 0 or 1. The basic problem of statistical learning is to estimate the so-called Bayes classifier $s$ defined by
$$s(x) = \mathbb{1}_{\eta(x) \ge 1/2}$$
where $\eta$ denotes the regression function, $\eta(x) = E[Y \mid X = x]$.

Example 2.4 (Gaussian white noise). Let $s \in L_2([0,1]^d)$. One observes the process $\xi^{(n)}$ on $[0,1]^d$ defined by
$$d\xi^{(n)}(x) = s(x)\,dx + \frac{1}{\sqrt{n}}\, dW(x), \quad \xi^{(n)}(0) = 0,$$
where $W$ denotes a Brownian sheet. The level of noise $\varepsilon$ is here written as $\varepsilon = 1/\sqrt{n}$ for notational convenience and in order to allow an easy comparison with the other frameworks.

In all of the examples above, one observes some random variable $\xi^{(n)}$ with unknown distribution which depends on some quantity $s \in \mathcal{S}$ to be estimated. One can typically think of $s$ as a function belonging to some space $\mathcal{S}$ which may be infinite-dimensional. For instance:
• In the density framework, $s$ is a density and $\mathcal{S}$ can be taken as the set of all probability densities with respect to $\mu$.
• In the i.i.d. regression framework, the variables $\xi_i = (X_i, Y_i)$ are independent copies of a pair of random variables $(X, Y)$, where $X$ takes its values in some measurable space $\mathcal{X}$. Assuming the variable $Y$ to be square integrable, the regression function $s$ defined by $s(x) = E[Y \mid X = x]$ for every $x \in \mathcal{X}$ belongs to $\mathcal{S} = L_2(\mu)$, where $\mu$ denotes the distribution of $X$.

One of the most commonly used methods to estimate $s$ is minimum contrast estimation.

2.1. Minimum contrast estimation. Let us consider some empirical criterion $\gamma_n$ (based on the observation $\xi^{(n)}$) such that on the set $\mathcal{S}$
$$t \mapsto E[\gamma_n(t)]$$
achieves a minimum at point $s$. Such a criterion is called an empirical contrast function for the estimation of $s$. Given some subset $S$ of $\mathcal{S}$ that we call a model, a minimum contrast estimator $\hat{s}$ of $s$ is a minimizer of $\gamma_n$ over $S$. The heuristics of
minimum contrast estimation is that, if one substitutes the empirical criterion $\gamma_n$ for its expectation and minimizes $\gamma_n$ on some subset $S$ of $\mathcal{S}$ (that we call a model), there is some hope to get a sensible estimator of $s$, at least if $s$ belongs (or is close enough) to the model $S$. This estimation method is widely used and has been extensively studied in the asymptotic parametric setting, for which one assumes that $S$ is a given parametric model, $s$ belongs to $S$ and $n$ is large. Probably the most popular examples are maximum likelihood and least squares estimation. Let us see what this gives in the above functional estimation frameworks. In each example given below we shall check that a given empirical criterion is indeed an empirical contrast function by showing that the associated natural loss function
$$l(s, t) = E[\gamma_n(t)] - E[\gamma_n(s)] \qquad (2.1)$$
is nonnegative for all $t \in \mathcal{S}$. In the case where $\xi^{(n)} = (\xi_1, \dots, \xi_n)$, we shall define an empirical criterion $\gamma_n$ in the following way
$$\gamma_n(t) = P_n[\gamma(t, \cdot)] = \frac{1}{n} \sum_{i=1}^{n} \gamma(t, \xi_i),$$
so that it remains to specify in each example the adequate function $\gamma$ to be considered.

Example 2.5 (Density estimation). One observes $\xi_1, \dots, \xi_n$ which are i.i.d. random variables with unknown density $s$ with respect to some given measure $\mu$. The choice
$$\gamma(t, x) = -\log(t(x))$$
leads to maximum likelihood estimation, and the corresponding loss function $l$ is given by
$$l(s, t) = K(s, t),$$
where $K(s, t)$ denotes the Kullback-Leibler information number between the probabilities $s\mu$ and $t\mu$, i.e.,
$$K(s, t) = \int s \log\left(\frac{s}{t}\right) d\mu$$
if $s\mu$ is absolutely continuous with respect to $t\mu$, and $K(s, t) = +\infty$ otherwise. Assuming that $s \in L_2(\mu)$, it is also possible to define a least squares density estimation procedure by setting this time
$$\gamma(t, x) = \|t\|^2 - 2t(x)$$
where $\|\cdot\|$ denotes the norm in $L_2(\mu)$, and the corresponding loss function $l$ is in this case given by
$$l(s, t) = \|s - t\|^2$$
for every $t \in L_2(\mu)$.
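The two loss identities of Example 2.5 can be checked numerically in a toy setting. The sketch below assumes that $\mu$ is the counting measure on a three-point set, so that densities are probability vectors; the vectors `s` and `t` and all function names are hypothetical choices made for illustration only.

```python
import math

def expected_log_contrast(s, t):
    """E[-log t(xi)] when xi has density s w.r.t. counting measure on a finite set."""
    total = 0.0
    for p, q in zip(s, t):
        if p > 0:
            if q == 0:
                return math.inf  # s*mu is not absolutely continuous w.r.t. t*mu
            total -= p * math.log(q)
    return total

def kullback_leibler(s, t):
    """K(s, t) = sum_x s(x) log(s(x)/t(x)), or +inf if s is not dominated by t."""
    total = 0.0
    for p, q in zip(s, t):
        if p > 0:
            if q == 0:
                return math.inf
            total += p * math.log(p / q)
    return total

def expected_ls_contrast(s, t):
    """E[||t||^2 - 2 t(xi)] = ||t||^2 - 2 <s, t> in L2 of the counting measure."""
    return sum(q * q for q in t) - 2.0 * sum(p * q for p, q in zip(s, t))

s = [0.5, 0.3, 0.2]  # hypothetical true density
t = [0.2, 0.5, 0.3]  # hypothetical candidate density

# Natural loss (2.1) for each contrast, compared with its closed form.
log_loss = expected_log_contrast(s, t) - expected_log_contrast(s, s)
ls_loss = expected_ls_contrast(s, t) - expected_ls_contrast(s, s)
squared_distance = sum((p - q) ** 2 for p, q in zip(s, t))

print(log_loss, kullback_leibler(s, t))
print(ls_loss, squared_distance)
```

In both cases the two printed numbers agree up to rounding, which is the content of the identities $l(s,t) = K(s,t)$ and $l(s,t) = \|s - t\|^2$ in this discrete setting.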
Example 2.6 (Regression). One observes $(X_1, Y_1), \dots, (X_n, Y_n)$ with
$$Y_i = s(X_i) + \varepsilon_i, \quad 1 \le i \le n,$$
where $X_1, \dots, X_n$ are independent and $\varepsilon_1, \dots, \varepsilon_n$ are i.i.d. with $E[\varepsilon_i \mid X_i] = 0$. Let $\mu$ be the arithmetic mean of the distributions of the variables $X_1, \dots, X_n$; then least squares estimation is obtained by setting, for every $t \in L_2(\mu)$,
$$\gamma(t, (x, y)) = (y - t(x))^2,$$
and the corresponding loss function $l$ is given by
$$l(s, t) = \|s - t\|^2,$$
where $\|\cdot\|$ denotes the norm in $L_2(\mu)$.

Example 2.7 (Binary classification). One observes independent copies $(X_1, Y_1), \dots, (X_n, Y_n)$ of a pair $(X, Y)$ where $Y$ takes its values in $\{0, 1\}$. We take the same function $\gamma$ as in the least squares regression case, but this time we restrict the minimization to the set $\mathcal{S}$ of classifiers, i.e., $\{0, 1\}$-valued measurable functions (instead of $L_2(\mu)$). This leads to the so-called empirical risk minimization procedure according to Vapnik’s terminology (see [30]). Setting
$$s(x) = \mathbb{1}_{\eta(x) \ge 1/2}$$
where $\eta$ denotes the regression function, $\eta(x) = E[Y \mid X = x]$, the corresponding loss function $l$ is given by
$$l(s, t) = P[Y \ne t(X)] - P[Y \ne s(X)] = E\big[\,|2\eta(X) - 1|\,|s(X) - t(X)|\,\big].$$

Finally we can consider the least squares procedure in the Gaussian white noise framework too.

Example 2.8 (Gaussian white noise). Recall that one observes the process $\xi^{(n)}$ on $[0,1]^d$ defined by
$$d\xi^{(n)}(x) = s(x)\,dx + \frac{1}{\sqrt{n}}\, dW(x), \quad \xi^{(n)}(0) = 0,$$
where $W$ denotes a Brownian sheet. We define for every $t \in L_2([0,1]^d)$
$$\gamma_n(t) = \|t\|^2 - 2 \int_0^1 t(x)\, d\xi^{(n)}(x),$$
then the corresponding loss function $l$ is simply given by
$$l(s, t) = \|s - t\|^2.$$
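To make the notion of a minimum contrast estimator concrete, here is a minimal sketch for the least squares regression contrast of Example 2.6, with the model $S$ taken to be the piecewise constant functions on a regular partition of $[0,1)$. The regression function, the noise level, the sample size and the number of bins are all hypothetical; on this model the minimizer of $\gamma_n$ is the vector of bin-wise means of the responses.

```python
import random

def empirical_contrast(t, data):
    """gamma_n(t) = (1/n) * sum_i (Y_i - t(X_i))^2."""
    return sum((y - t(x)) ** 2 for x, y in data) / len(data)

def least_squares_histogram(data, n_bins):
    """Minimum contrast estimator over piecewise constant functions on a
    regular partition of [0, 1) into n_bins intervals: bin-wise mean of Y."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in data:
        j = min(int(x * n_bins), n_bins - 1)
        sums[j] += y
        counts[j] += 1
    # An empty bin gets the arbitrary value 0; it does not affect the contrast.
    values = [sums[j] / counts[j] if counts[j] else 0.0 for j in range(n_bins)]

    def estimator(x):
        return values[min(int(x * n_bins), n_bins - 1)]

    return estimator, values

def s(x):
    """Hypothetical regression function; it happens to belong to the model."""
    return 1.0 if x < 0.5 else -1.0

random.seed(0)
n = 200
xs = [random.random() for _ in range(n)]
data = [(x, s(x) + random.gauss(0.0, 0.5)) for x in xs]

s_hat, values = least_squares_histogram(data, n_bins=4)
print(values)
print(empirical_contrast(s_hat, data), empirical_contrast(s, data))
```

The empirical contrast of `s_hat` is never larger than that of the true `s`, since `s` itself lies in the model and `s_hat` minimizes $\gamma_n$ over it.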
2.2. The model choice paradigm. The main problem which arises from minimum contrast estimation in a parametric setting is the choice of a proper model $S$ on which the minimum contrast estimator is to be defined. In other words, it may be difficult to guess what is the right parametric model to consider in order to reflect the nature of real-life data, and one can get into problems whenever the model $S$ is false in the sense that the true $s$ is too far from $S$. One could then be tempted to choose $S$ as big as possible. Taking $S$ as $\mathcal{S}$ itself or as a “huge” subset of $\mathcal{S}$ is known to lead to inconsistent (see [3]) or suboptimal estimators (see [6]). We see that choosing some model $S$ in advance leads to some difficulties:
• If $S$ is a “small” model (think of some parametric model, defined by 1 or 2 parameters for instance), the behavior of a minimum contrast estimator on $S$ is satisfactory as long as $s$ is close enough to $S$, but the model can easily turn out to be false.
• On the contrary, if $S$ is a “huge” model (think of the set of all continuous functions on $[0, 1]$ in the regression framework for instance), the minimization of the empirical criterion leads to a very poor estimator of $s$ even if $s$ truly belongs to $S$.

2.2.1. Illustration (white noise). In the white noise framework, if one takes $S$ as a linear space with dimension $D$, one can compute the least squares estimator explicitly. Indeed, if $(\varphi_j)_{1 \le j \le D}$ denotes some orthonormal basis of $S$, one has
$$\hat{s} = \sum_{j=1}^{D} \left( \int_0^1 \varphi_j(x)\, d\xi^{(n)}(x) \right) \varphi_j.$$
Since for every $1 \le j \le D$
$$\int_0^1 \varphi_j(x)\, d\xi^{(n)}(x) = \int_0^1 \varphi_j(x)\, s(x)\, dx + \frac{1}{\sqrt{n}}\, \eta_j$$
where the variables $\eta_1, \dots, \eta_D$ are i.i.d. standard normal variables, the expected quadratic risk of $\hat{s}$ can be easily computed. One indeed has
$$E\|s - \hat{s}\|^2 = d^2(s, S) + \frac{D}{n}.$$
This formula for the quadratic risk perfectly reflects the model choice paradigm: if one wants to choose a model in such a way that the risk of the resulting least squares estimator is small, one has to warrant that the bias term $d^2(s, S)$ and the variance term $D/n$ are small simultaneously. It is therefore interesting to consider a family of models instead of a single one and try to select some appropriate model among the family. More precisely, if $(S_m)_{m \in \mathcal{M}}$ is a list of finite-dimensional subspaces of $L_2([0,1]^d)$ and $(\hat{s}_m)_{m \in \mathcal{M}}$ is the corresponding list of least squares estimators, an “ideal” model should minimize $E\|s - \hat{s}_m\|^2$
with respect to $m \in \mathcal{M}$. Of course, since we do not know the bias, the quadratic risk cannot be used as a model choice criterion but just as a benchmark. More generally, if we consider some empirical contrast $\gamma_n$ and some (at most countable and usually finite) collection of models $(S_m)_{m \in \mathcal{M}}$, let us represent each model $S_m$ by the minimum contrast estimator $\hat{s}_m$ related to $\gamma_n$. The purpose is to select the “best” estimator among the collection $(\hat{s}_m)_{m \in \mathcal{M}}$. Ideally, one would like to consider $m(s)$ minimizing the risk $E[l(s, \hat{s}_m)]$ with respect to $m \in \mathcal{M}$. The minimum contrast estimator $\hat{s}_{m(s)}$ on the corresponding model $S_{m(s)}$ is called an oracle (according to the terminology introduced by Donoho and Johnstone, see [17] for instance). Unfortunately, since the risk depends on the unknown parameter $s$, so does $m(s)$, and the oracle is not an estimator of $s$. However, the risk of an oracle can serve as a benchmark which will be useful in order to evaluate the performance of any data-driven selection procedure among the collection of estimators $(\hat{s}_m)_{m \in \mathcal{M}}$. Note that this notion is different from the notion of true model. In other words, if $s$ belongs to some model $S_{m_0}$, this does not necessarily imply that $\hat{s}_{m_0}$ is an oracle. The idea is now to consider data-driven criteria to select an estimator which tends to mimic an oracle, i.e., one would like the risk of the selected estimator $\hat{s}_{\hat{m}}$ to be as close as possible to the risk of an oracle.

2.3. Model selection via penalization. Let us describe the method. The penalized minimum contrast estimation procedure consists in considering some proper penalty function $\mathrm{pen} : \mathcal{M} \to \mathbb{R}_+$ and taking $\hat{m}$ minimizing
$$\gamma_n(\hat{s}_m) + \mathrm{pen}(m)$$
over $\mathcal{M}$. We can then define the selected model $S_{\hat{m}}$ and the selected estimator $\hat{s}_{\hat{m}}$. This method is definitely not new.
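The difference between an oracle and a penalized choice can be sketched in a sequence-space version of the white noise framework. The setup is an assumption made for illustration: the models are nested, $S_m = \mathrm{span}(\varphi_1, \dots, \varphi_m)$, the coefficients `beta` of $s$ are hypothetical, and the penalty $2m/n$ is just one possible choice. For least squares one has $\gamma_n(\hat{s}_m) = -\|\hat{s}_m\|^2$, which is what `penalized_choice` uses.

```python
import random

def model_risks(beta, n):
    """Risk of the least squares estimator on S_m = span(phi_1..phi_m), m = 0..N:
    bias sum_{j>m} beta_j^2 plus variance m/n."""
    N = len(beta)
    tail = [0.0] * (N + 1)
    for m in range(N - 1, -1, -1):
        tail[m] = tail[m + 1] + beta[m] ** 2
    return [tail[m] + m / n for m in range(N + 1)]

def penalized_choice(beta_hat, pen):
    """Index m minimizing gamma_n(s_hat_m) + pen(m), where for least squares
    gamma_n(s_hat_m) = -sum_{j<=m} beta_hat_j^2."""
    crit, fit = [], 0.0
    for m in range(len(beta_hat) + 1):
        crit.append(-fit + pen(m))
        if m < len(beta_hat):
            fit += beta_hat[m] ** 2
    return min(range(len(crit)), key=crit.__getitem__)

random.seed(1)
n, N = 100, 50
beta = [j ** -1.5 for j in range(1, N + 1)]  # hypothetical coefficients of s
# Observed coefficients: truth plus Gaussian noise of level 1/sqrt(n).
beta_hat = [b + random.gauss(0.0, 1.0) / n ** 0.5 for b in beta]

risks = model_risks(beta, n)
oracle = min(range(N + 1), key=risks.__getitem__)          # needs the unknown s
m_hat = penalized_choice(beta_hat, lambda m: 2.0 * m / n)  # uses the data only
print("oracle dimension:", oracle, "selected dimension:", m_hat)
```

The oracle index is computed from `beta`, which a statistician never sees, whereas `m_hat` depends only on `beta_hat`; comparing the two over repeated draws gives a feel for how well a given penalty mimics the oracle.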
Penalized criteria were proposed in the early seventies by Akaike (see [1]) for penalized maximum log-likelihood in the density estimation framework and by Mallows for penalized least squares regression (see [16] and [22]). In both cases the penalty functions are proportional to the number of parameters $D_m$ of the corresponding model $S_m$:
• Akaike: $D_m / n$
• Mallows: $2 D_m \sigma^2 / n$,
where the variance $\sigma^2$ of the errors of the regression framework is assumed to be known for the sake of simplicity. Akaike’s heuristics leading to the choice of the penalty function $D_m / n$ heavily relies on the assumption that the dimensions and the number of the models are bounded w.r.t. $n$ and $n$ tends to infinity. Let us give a simple motivating example for which those assumptions are clearly not satisfied.

Example 2.9 (Change point detection). Change point detection on the mean is indeed a typical example for which these criteria are known to fail. A noisy
signal $\xi_j$ is observed at each time $j/n$ on $[0, 1]$. We consider the fixed design regression framework
$$\xi_j = s(j/n) + \varepsilon_j, \quad 1 \le j \le n,$$
where the errors are i.i.d. centered random variables. Detecting change points on the mean amounts to selecting the “best” piecewise constant estimator of the true signal $s$ on some arbitrary partition $m$ with endpoints on the regular grid $\{j/n,\ 0 \le j \le n\}$. Defining $S_m$ as the linear space of piecewise constant functions on partition $m$, this means that we have to select a model among the family $(S_m)_{m \in \mathcal{M}}$, where $\mathcal{M}$ denotes the collection of all possible partitions by intervals with end points on the grid. Then, the number of models with dimension $D$, i.e., the number of partitions with $D$ pieces, is equal to $\binom{n-1}{D-1}$, which grows polynomially w.r.t. $n$.

2.3.1. The non-asymptotic approach. The approach to model selection via penalization that we have developed (see for instance the seminal papers [7] and [5]) differs from the usual parametric asymptotic approach in the sense that:
• The number as well as the dimensions of the models may depend on $n$.
• One can choose a list of models because of its approximation properties:
  – wavelet expansions
  – trigonometric or piecewise polynomials
  – artificial neural networks, etc.

It may perfectly happen that many models of the list have the same dimension, and in our view the “complexity” of the list of models is typically taken into account via the choice of a penalty function of the form
$$(C_1 + C_2 L_m)\, \frac{D_m}{n}$$
where the weights $L_m$ satisfy the constraint
$$\sum_{m \in \mathcal{M}} e^{-L_m D_m} \le 1$$
and $C_1$ and $C_2$ do not depend on $n$. As we shall see, concentration inequalities are deeply involved both in the construction of the penalized criteria and in the study of the performance of the resulting penalized estimator $\hat{s}_{\hat{m}}$.

3. Gaussian model selection

Focusing on the Gaussian framework, say the “white noise” (or regression on a fixed design with Gaussian errors with variance equal to 1), allows us to detail the rationale of our approach. The results below are part of a joint work with Lucien Birgé (see [8]). Each model $S_m$ is assumed to be linear with dimension $D_m$, and $\hat{s}_m$ denotes the least squares estimator on $S_m$.
In such a situation
\[ \mathbb{E}\,\|\hat s_m - s\|^2 = \|s_m - s\|^2 + \frac{D_m}{n}, \]
where $s_m$ denotes the orthogonal projection of $s$ on $S_m$. The oracle being an ideal model achieving $\inf_{m\in\mathcal{M}} \mathbb{E}\,\|\hat s_m - s\|^2$, the aim is to mimic the oracle by estimating the risk.

3.1. Mallows' heuristics. The classical answer given by Mallows' $C_p$ heuristics is as follows. An "ideal" model should minimize the quadratic risk
\[ \|s_m - s\|^2 + \frac{D_m}{n} = \|s\|^2 - \|s_m\|^2 + \frac{D_m}{n} \]
or equivalently
\[ -\|s_m\|^2 + \frac{D_m}{n}. \]
Substituting for $\|s_m\|^2$ its natural unbiased estimator $\|\hat s_m\|^2 - D_m/n$ leads to Mallows' criterion
\[ -\|\hat s_m\|^2 + \frac{2D_m}{n}. \]

3.2. A general theorem. The above heuristics can be justified (or corrected) if one can specify how close $\|\hat s_m\|^2$ is to its expectation $\|s_m\|^2 + D_m/n$, uniformly w.r.t. $m \in \mathcal{M}$. The Gaussian concentration inequality is precisely the adequate tool to do that. Note that we simultaneously get a precise form for the penalty and an "oracle" type inequality.

Theorem 3.1. Let $(x_m)_{m\in\mathcal{M}}$ be some family of positive numbers such that
\[ \sum_{m\in\mathcal{M}} \exp(-x_m) = \Sigma < \infty. \]
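Let $K > 1$ and assume that
\[ \operatorname{pen}(m) \ge \frac{K}{n}\left(\sqrt{D_m} + \sqrt{2x_m}\right)^2. \]
Then, if $\hat m$ minimizes the penalized least-squares criterion
\[ -\|\hat s_m\|^2 + \operatorname{pen}(m), \]
the following inequality is valid:
\[ \mathbb{E}\,\|\hat s_{\hat m} - s\|^2 \le C(K)\left[\inf_{m\in\mathcal{M}}\left(\|s_m - s\|^2 + \operatorname{pen}(m)\right) + \frac{\Sigma}{n}\right], \tag{3.1} \]
where $C(K)$ depends only on $K$.

It is important to realize that Theorem 3.1 makes it easy to compare the risk of the penalized estimator with the benchmark $\inf_{m\in\mathcal{M}} \mathbb{E}\,\|\hat s_m - s\|^2$. To illustrate this idea, remembering that
\[ \mathbb{E}\,\|\hat s_m - s\|^2 = \|s_m - s\|^2 + \frac{D_m}{n}, \]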
let us indeed consider the simple situation where one can take $(x_m)_{m\in\mathcal{M}}$ such that $x_m = L D_m$ for some positive constant $L$ and $\sum_{m\in\mathcal{M}} \exp(-x_m) \le 1$, say (nothing magic with 1 here). Then, taking $\operatorname{pen}(m) = K D_m (1 + \sqrt{2L})^2/n$, the right-hand side in the risk bound is (up to constant) bounded by
\[ \inf_{m\in\mathcal{M}} \mathbb{E}\,\|\hat s_m - s\|^2. \]
In such a case, we recover the desired benchmark, which means that the selected model performs (almost) as well as an "oracle". It is also worth noticing that Theorem 3.1 provides a link with Approximation Theory. To see this let us assume the number of models with the same dimension to be finite. Then a typical choice of the weights is $x_m = x(D_m)$ with $x(D) = \alpha D + \log|\{m \in \mathcal{M};\ D_m = D\}|$ and $\alpha > 0$, so that those weights really represent the price to pay for redundancy (i.e., many models with the same dimension). The penalty can be taken as
\[ \operatorname{pen}(m) = \operatorname{pen}(D_m) = \frac{K}{n}\left(\sqrt{D_m} + \sqrt{2x(D_m)}\right)^2 \]
and (3.1) becomes
\[ \mathbb{E}\,\|\hat s_{\hat m} - s\|^2 \le C' \inf_{D\ge 1}\left[\inf_{m\in\mathcal{M},\,D_m = D}\|s_m - s\|^2 + \frac{D}{n}\left(1 + \sqrt{\frac{2x(D)}{D}}\right)^2\right], \]
where the positive constant $C'$ depends on $K$ and $\alpha$. This bound shows that the approximation properties of $\bigcup_{D_m = D} S_m$ are absolutely essential. One can hope for substantial gains in the bias term when considering redundant models at some reasonable price, since the dependency of $x(D)$ with respect to the number of models with the same dimension is logarithmic. This is typically what happens when one uses wavelet expansions to denoise some signal.

3.3. Examples. Many examples of applications of Theorem 3.1 are to be found in [8]. We just focus here on two examples: variable selection and change point detection.

Example 3.2 (Variable selection). Let $\{\varphi_j,\ j \le N\}$ be some collection of linearly independent functions. For every subset $m$ of $\{1,\dots,N\}$ we define $S_m$ to be the linear span of $\{\varphi_j,\ j \in m\}$ and we consider some collection $\mathcal{M}$ of subsets of $\{1,\dots,N\}$. We first consider the ordered variable selection problem. In this case $\mathcal{M}$ is the collection of subsets of the form $\{1,\dots,D\}$ with $D \le N$. Then one can take $\operatorname{pen}(m) = K|m|/n$ with $K > 1$, and one can show that this constraint is sharp (if $K < 1$, it can be proved that the selection criterion explodes in the sense that it systematically selects models with dimensions of order $N$). This leads to an oracle inequality of the form
\[ \mathbb{E}\,\|\hat s_{\hat m} - s\|^2 \le C \inf_{m\in\mathcal{M}} \mathbb{E}\,\|\hat s_m - s\|^2. \]
Hence the selected model behaves like an oracle.
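As a hedged illustration of the ordered variable selection rule above, the following sketch works in the Gaussian sequence form of the problem. This is an assumption made here for simplicity: the empirical coefficients $\hat\beta_j$ on an orthonormal system are taken as given, so that $\|\hat s_m\|^2 = \sum_{j\le D}\hat\beta_j^2$ for $m = \{1,\dots,D\}$. The function name and the default value of $K$ are hypothetical choices, not part of the text.

```python
def select_ordered_dimension(beta_hat, n, K=2.0):
    """Return the D in {1, ..., N} that minimizes the penalized criterion
    -sum_{j<=D} beta_hat[j]^2 + K * D / n  (ordered variable selection).

    beta_hat : empirical coefficients on an orthonormal system (assumption)
    n        : sample size, so that the noise variance per coefficient is 1/n
    K        : penalty constant; the theory requires K > 1
    """
    best_D, best_crit = None, float("inf")
    energy = 0.0  # running value of ||s_hat_m||^2 for m = {1, ..., D}
    for D, b in enumerate(beta_hat, start=1):
        energy += b * b
        crit = -energy + K * D / n
        if crit < best_crit:
            best_D, best_crit = D, crit
    return best_D
```

With coefficients `[1.0, 0.5, 0.01, 0.01]` and `n = 100`, the two large coefficients are kept and the two small ones, whose squared size is below the penalty increment `K / n`, are discarded.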
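In the complete variable selection context, $\mathcal{M}$ is the collection of all subsets of $\{1,\dots,N\}$. Taking $x_m = |m|\log(N)$ leads to
\[ \Sigma = \sum_{m\in\mathcal{M}} \exp(-x_m) = \sum_{D\le N}\binom{N}{D}\exp(-D\log(N)) \le e \]
and
\[ \operatorname{pen}(m) = \frac{K|m|}{n}\left(1 + \sqrt{2\log(N)}\right)^2 \]
with $K > 1$. Then
\[ \mathbb{E}\,\|\hat s_{\hat m} - s\|^2 \le C \inf_{D\ge 1}\left[\inf_{m\in\mathcal{M},\,D_m = D}\|s_m - s\|^2 + \frac{D\log(N)}{n}\right] \]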
and we see that the extra factor $\log(N)$ is a rather modest price to pay as compared to the potential gain in the bias provided by the redundancy of models with the same dimension. Interestingly, no orthogonality assumption is required on the system of functions $\{\varphi_j,\ j \le N\}$ to derive this result. However, whenever $\{\varphi_j,\ j \le N\}$ is an orthonormal system, the penalized estimator can be explicitly computed and one recovers the hard-thresholding estimator introduced by Donoho and Johnstone in the white noise framework (see [17]). Indeed it is easy to check that
\[ \hat s_{\hat m} = \sum_{j=1}^{N} \hat\beta_j\, 1_{|\hat\beta_j| \ge T}\, \varphi_j, \]
where the $\hat\beta_j$'s are the empirical coefficients (i.e., $\hat\beta_j = \int \varphi_j(x)\, d\xi^{(n)}(x)$) and $T = \sqrt{K/n}\,\left(1 + \sqrt{2\log(N)}\right)$. Again the constraint $K > 1$ turns out to be sharp. Note that the previous computations for the weights can be slightly refined. More precisely, it is possible to replace the logarithmic factor $\log(N)$ above by $\log(N/|m|)$. This leads to a better risk bound which turns out to be optimal in a minimax sense on each set $\bigcup_{|m| = D} S_m$, $D \le N$.

Example 3.3 (Change points detection). We consider the problem of change point detection on the mean described above. Recall that one observes the noisy signal
\[ \xi_j = s(j/n) + \varepsilon_j, \quad 1 \le j \le n, \]
where the errors are i.i.d. standard normal random variables. Defining $S_m$ as the linear space of piecewise constant functions on the partition $m$, the change point detection problem amounts to selecting a model among the family $(S_m)_{m\in\mathcal{M}}$, where $\mathcal{M}$ denotes the collection of all possible partitions by intervals with end points on the grid $\{j/n,\ 0 \le j \le n\}$. Since the number of models with dimension $D$, i.e., the number of partitions with $D$ pieces, is equal to $\binom{n-1}{D-1}$, this collection of models has about the same combinatorial properties as the family of models corresponding to complete variable selection among $N = n - 1$ variables. Hence the same considerations concerning the penalty choice and the same resulting risk bounds as for complete variable selection hold true.
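The combinatorial comparison made in Example 3.3 can be checked numerically. The sketch below is an illustration only, with hypothetical function names: it counts the partitions of the grid into $D$ intervals, verifies the count by brute force for small $n$, and evaluates the sum $\Sigma$ obtained when each partition with $D$ pieces receives the weight $x_m = D\log(n)$ (an assumption mirroring the choice $x_m = |m|\log(N)$ of Example 3.2).

```python
import math
from itertools import combinations


def count_partitions(n, D):
    """Number of partitions of [0, 1] into D intervals with end points on
    the grid {j/n}: choose D-1 interior break points among n-1 candidates."""
    return math.comb(n - 1, D - 1)


def count_partitions_brute_force(n, D):
    """Same count, obtained by enumerating the sets of interior break points."""
    return sum(1 for _ in combinations(range(1, n), D - 1))


def weight_sum(n):
    """Sigma = sum over D of (number of partitions with D pieces) * exp(-D log n)."""
    return sum(
        count_partitions(n, D) * math.exp(-D * math.log(n))
        for D in range(1, n + 1)
    )
```

Since $\binom{n-1}{D-1} \le n^{D}$, the logarithm of the number of models of dimension $D$ is at most $D\log(n)$, which is why a penalty of order $D\log(n)/n$ suffices here.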
3.4. Conclusions. The following points can be made:
• Mallows' criterion can underpenalize.
• Condition $K > 1$ in the statement of Theorem 3.1 is sharp.
• What penalty should be recommended? One can try to optimize the oracle inequality. The result is that $K = 2$ is a good choice (see [9]).
• In practice, the level of noise is unknown, but one can retain from the theory the rule of thumb: "optimal" penalty = 2 × "minimal" penalty. Interestingly, the minimal penalty can be evaluated from the data because when the penalty is not heavy enough one systematically chooses models with large dimension. It remains to multiply by 2 to produce the desired (nearly) optimal penalty. This is a strategy for designing a data-driven penalty without knowing in advance the level of noise.
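The rule of thumb "optimal penalty = 2 × minimal penalty" can be sketched in code under assumptions that go beyond the text: an ordered collection of models in the Gaussian sequence setting, a penalty of the shape $\operatorname{pen}(D) = \alpha D$, and a minimal constant located at the largest drop of the selected dimension along an increasing grid of values of $\alpha$. The grid, the jump-detection rule and all function names are hypothetical; this is one possible reading of the strategy, not the author's procedure.

```python
def cumulative_energy(beta_hat):
    """cum[D] = sum of the first D squared empirical coefficients."""
    cum = [0.0]
    for b in beta_hat:
        cum.append(cum[-1] + b * b)
    return cum


def selected_dimension(cum, alpha):
    """Dimension D in {0, ..., N} minimizing -cum[D] + alpha * D."""
    crit = [-energy + alpha * D for D, energy in enumerate(cum)]
    return min(range(len(crit)), key=crit.__getitem__)


def calibrate_penalty(beta_hat, alphas):
    """Return (2 * alpha_min, selected dimensions) for an increasing grid.

    alpha_min is taken as the first grid value after the largest drop of the
    selected dimension, i.e. the smallest value for which the criterion no
    longer selects a very large model (a heuristic assumption).
    """
    cum = cumulative_energy(beta_hat)
    dims = [selected_dimension(cum, a) for a in alphas]
    drops = [dims[i] - dims[i + 1] for i in range(len(dims) - 1)]
    i_star = max(range(len(drops)), key=drops.__getitem__)
    alpha_min = alphas[i_star + 1]
    return 2.0 * alpha_min, dims
```

On data with five large coefficients followed by many coefficients of squared size 0.01, the selected dimension stays maximal while $\alpha < 0.01$ and collapses to 5 once $\alpha > 0.01$; the calibrated constant is then twice the first grid value beyond that jump, and no prior knowledge of the noise level is used.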
4. The role of concentration inequalities
Coming back to the general problem of constructing sensible penalized empirical criteria for possibly non Gaussian observations, our approach can be described as follows. We take as a loss function the non negative quantity $l(s,t)$ and recall that our aim is to mimic the oracle, i.e., minimize $\mathbb{E}[l(s,\hat s_m)]$ over $m \in \mathcal{M}$. Let us introduce the centered empirical process
\[ \nu_n(t) = \gamma_n(t) - \mathbb{E}[\gamma_n(t)]. \]
By definition a penalized estimator $\hat s_{\hat m}$ satisfies, for every $m \in \mathcal{M}$ and any point $s_m \in S_m$,
\[ \gamma_n(\hat s_{\hat m}) + \operatorname{pen}(\hat m) \le \gamma_n(\hat s_m) + \operatorname{pen}(m) \le \gamma_n(s_m) + \operatorname{pen}(m) \]
or, equivalently, if we substitute $\nu_n(t) + \mathbb{E}[\gamma_n(t)]$ for $\gamma_n(t)$,
\[ \nu_n(\hat s_{\hat m}) + \operatorname{pen}(\hat m) + \mathbb{E}[\gamma_n(\hat s_{\hat m})] \le \nu_n(s_m) + \operatorname{pen}(m) + \mathbb{E}[\gamma_n(s_m)]. \]
Subtracting $\mathbb{E}[\gamma_n(s)]$ from each side of this inequality finally leads to the following important bound:
\[ l(s,\hat s_{\hat m}) \le l(s,s_m) + \operatorname{pen}(m) + \nu_n(s_m) - \nu_n(\hat s_{\hat m}) - \operatorname{pen}(\hat m). \]
Hence, the penalty should be
• heavy enough to annihilate the fluctuations of $\nu_n(s_m) - \nu_n(\hat s_{\hat m})$,
• but not too large, since ideally we would like that $l(s,s_m) + \operatorname{pen}(m) \le \mathbb{E}[l(s,\hat s_m)]$.
Therefore we see that an accurate calibration of the penalty should rely on a sharp evaluation of the fluctuations of $\nu_n(s_m) - \nu_n(\hat s_{\hat m})$. This is precisely why we need local concentration inequalities in order to analyze the uniform deviation of $\nu_n(u) - \nu_n(t)$ when $t$ is close to $u$ and belongs to a given model.
In other words the key is to get a good control of the supremum of some conveniently weighted empirical process
\[ \frac{\nu_n(u) - \nu_n(t)}{\omega(u,t)}, \quad t \in S_m. \]
The prototypes of such bounds are the by now classical Gaussian concentration inequality (see [15]) and, in the non-Gaussian case, Talagrand's inequality for empirical processes (see [26]). The latter ensures that, given a sample $\xi_1,\dots,\xi_n$ of i.i.d. random variables and a countable class $\mathcal{F}$ of functions which are centered and uniformly bounded by 1, defining $Z = \sup_{f\in\mathcal{F}} \sum_{i=1}^n f(\xi_i)$ and $v = \mathbb{E}\left[\sup_{f\in\mathcal{F}} \sum_{i=1}^n f^2(\xi_i)\right]$, then, for every positive $x$, except on a set with probability less than $\exp(-x)$, the following inequality holds:
\[ Z \le \mathbb{E}[Z] + \sqrt{2v\kappa x} + cx, \]
where $\kappa$ and $c$ are universal constants. Following Ledoux's approach to concentration (see [20] and [21]), based on log-Sobolev type inequalities, it can be proved that one can take $\kappa = 4$ and $c = 2$ (see [23]). At the price of modifying the variance factor $v$ above by setting $v = \mathbb{E}[Z] + \sup_{f\in\mathcal{F}} \sum_{i=1}^n \mathbb{E}\left[f^2(\xi_i)\right]$, one can even prove that the inequality above is valid with the optimal values $\kappa = 1$ and $c = 1/3$ (see [13]). As pointed out for the first time in [7] in the context of least-squares density estimation, this type of concentration inequality for empirical processes makes it possible to derive analogues of the Gaussian model selection Theorem stated above. Among other works building upon this idea, let us cite [14] for modified Akaike criteria on log-splines, [2] for Mallows' type criteria in the context of design regression with non Gaussian errors, [4] for extensions of the previous results to weakly dependent data and [24] for results on statistical learning.

5. Data driven penalties
Practical implementation of penalization methods involves the extension to non Gaussian frameworks of the data-driven penalty choice strategy suggested above in the Gaussian case.
It can roughly be described as follows:
• Compute the minimum contrast estimator $\hat s_D$ on the union of models defined by the same number $D$ of parameters.
• Use the theory to guess the shape of the penalty $\operatorname{pen}(D)$, typically $\operatorname{pen}(D) = \alpha D$.
• Estimate $\alpha$ from the data by multiplying by 2 the smallest value for which the corresponding penalized criterion does not explode.
In the context of change point detection, this data-driven calibration method for the penalty has been successfully implemented and tested by E. Lebarbier (see [19]). In the non Gaussian case, we believe that this procedure remains valid, but theoretical justification is far from being trivial and remains open. This problem is especially challenging in the classification context since
it is connected to the question of defining adaptive margin classifiers, which is a topic attracting much attention in the statistical learning community at this moment (see [27] for instance). More generally, defining proper data-driven strategies for choosing a penalty offers a new field of mathematical investigation, since future progress on the topic requires understanding in depth the behavior of $\gamma_n(\hat s_D)$. Recent advances involve new concentration inequalities. A first step in this direction is made in [12], and a joint work in progress with S. Boucheron and O. Bousquet is building upon the new moment inequalities proved in [10].

References
[1] Akaike, H. Information theory and an extension of the maximum likelihood principle. In P.N. Petrov and F. Csaki, editors, Proceedings 2nd International Symposium on Information Theory, pages 267–281. Akademia Kiado, Budapest, 1973.
[2] Baraud, Y. Model selection for regression on a fixed design. Probability Theory and Related Fields 117, no. 4, 467–493 (2000).
[3] Bahadur, R.R. Examples of inconsistency of maximum likelihood estimates. Sankhya Ser. A 20, 207–210 (1958).
[4] Baraud, Y., Comte, F. and Viennet, G. Model selection for (auto-)regression with dependent data. ESAIM: Probability and Statistics 5, 33–49 (2001). http://www.emath.fr/ps/.
[5] Barron, A.R., Birgé, L., Massart, P. Risk bounds for model selection via penalization. Probab. Th. Rel. Fields 113, 301–415 (1999).
[6] Birgé, L. and Massart, P. Rates of convergence for minimum contrast estimators. Probab. Th. Relat. Fields 97, 113–150 (1993).
[7] Birgé, L. and Massart, P. From model selection to adaptive estimation. In Festschrift for Lucien Lecam: Research Papers in Probability and Statistics (D. Pollard, E. Torgersen and G. Yang, eds.), 55–87 (1997). Springer-Verlag, New York.
[8] Birgé, L. and Massart, P. Gaussian model selection. Journal of the European Mathematical Society, no. 3, 203–268 (2001).
[9] Birgé, L., Massart, P. A generalized Cp criterion for Gaussian model selection. Prépublication, no. 647, Universités de Paris 6 & Paris 7 (2001).
[10] Boucheron, S., Bousquet, O., Lugosi, G., Massart, P. Moment inequalities for functions of independent random variables. Ann. of Probability (to appear).
[11] Boucheron, S., Lugosi, G. and Massart, P. A sharp concentration inequality with applications. Random Structures and Algorithms 16, no. 3, 277–292 (2000).
[12] Boucheron, S., Lugosi, G., Massart, P. Concentration inequalities using the entropy method. Ann. of Probability 31, no. 3, 1583–1614 (2003).
[13] Bousquet, O. A Bennett concentration inequality and its application to suprema of empirical processes. C.R. Math. Acad. Sci. Paris 334, no. 6, 495–500 (2002).
[14] Castellan, G. Density estimation via exponential model selection. IEEE Trans. Inform. Theory 49, no. 8, 2052–2060 (2003).
[15] Cirel'son, B.S., Ibragimov, I.A. and Sudakov, V.N. Norm of Gaussian sample function. In Proceedings of the 3rd Japan-U.S.S.R. Symposium on Probability Theory, Lecture Notes in Mathematics 550, 20–41 (1976). Springer-Verlag, Berlin.
[16] Daniel, C. and Wood, F.S. Fitting Equations to Data. Wiley, New York (1971).
[17] Donoho, D.L. and Johnstone, I.M. Ideal spatial adaptation by wavelet shrinkage. Biometrika 81, 425–455 (1994).
[18] Dudley, R.M. Uniform Central Limit Theorems. Cambridge Studies in Advanced Mathematics 63, Cambridge University Press (1999).
[19] Lebarbier, E. Detecting multiple change points in the mean of Gaussian process by model selection. Stochastic Processes and their Applications (to appear).
[20] Ledoux, M. On Talagrand deviation inequalities for product measures. ESAIM: Probability and Statistics 1, 63–87 (1996). http://www.emath.fr/ps/.
[21] Ledoux, M. The concentration of measure phenomenon. Mathematical Surveys and Monographs 89, American Mathematical Society.
[22] Mallows, C.L. Some comments on Cp. Technometrics 15, 661–675 (1973).
[23] Massart, P. About the constants in Talagrand's concentration inequalities for empirical processes. Ann. of Probability 28, no. 2, 863–884 (2000).
[24] Massart, P. Some applications of concentration inequalities to Statistics. Probability Theory. Annales de la Faculté des Sciences de Toulouse (6) 9, no. 2, 245–303 (2000).
[25] Talagrand, M. Concentration of measure and isoperimetric inequalities in product spaces. Publications Mathématiques de l'I.H.E.S. 81, 73–205 (1995).
[26] Talagrand, M. New concentration inequalities in product spaces. Invent. Math. 126, 505–563 (1996).
[27] Tsybakov, A.B. Optimal Aggregation of Classifiers in Statistical Learning. Ann. of Statistics 32, no. 1 (2004).
[28] Van der Vaart, A. Asymptotic statistics. Cambridge University Press (1998).
[29] Van der Vaart, A. and Wellner, J. Weak Convergence and Empirical Processes. Springer, New York (1996).
[30] Vapnik, V.N. Estimation of dependencies based on empirical data. Springer, New York (1982).

Pascal Massart
Université de Paris-Sud
Current address: Equipe de "Probabilités Statistique et Modélisation", Laboratoire de Mathématique UMR 8628, Bât. 425, Centre d'Orsay, Université de Paris-Sud, F-91405 Orsay Cedex
e-mail:
[email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
Reflection, Bernoulli Numbers and the Proof of Catalan's Conjecture
Preda Mihăilescu

There was a garden that was called the Earth,
With a bed of moss for making love.
No, it was neither Paradise nor Hell,
Nor anything already seen or already heard:
One day, my child, it will bloom for you . . . *

To Seraina and Theres

Abstract. Catalan's conjecture states that the equation $x^p - y^q = 1$ has no other integer solutions but $3^2 - 2^3 = 1$. We prove a theorem which simplifies the proof of this conjecture.
1. Introduction
Let $p, q$ be distinct odd primes with $p \not\equiv 1 \bmod q$, let $\zeta \in \mathbb{C}$ be a primitive $p$-th root of unity, $E = \mathbb{Z}[\zeta + \bar\zeta]^\times$ the real units of $\mathbb{Q}(\zeta)$ and $E_q$ the subgroup of those units which are $q$-adic $q$-th powers (also called $q$-primary units). Let $G = \operatorname{Gal}(\mathbb{Q}(\zeta + \bar\zeta)/\mathbb{Q})$, let $\mathbb{F}_q[G]$ be the group ring over the prime finite field with characteristic $q$, and $\mathbf{N} = N_{\mathbb{Q}(\zeta+\bar\zeta)/\mathbb{Q}} \in \mathbb{Z}[G]$. The main theorem of this paper states:

Theorem 1.1. Let $p > q$ be odd primes with $p \not\equiv 1 \bmod q$. If $C$ is the ideal class group of $\mathbb{Q}(\zeta + \bar\zeta)$, $E = \mathbb{Z}[\zeta + \bar\zeta]^\times$ and $A_q = \{x \in C : x^q = 1\}$, and the module $T$ is defined by
T = supp($E_q/E^q$) ? supp($A_q$),
then $T \ne (\mathbf{N}) \cdot \mathbb{F}_q[G]$.

The notion of support, supp(T), will be defined below, and the meaning of the various modules over the group ring will be given in detail. The module $T$ introduced above has the following connection to the Catalan conjecture, which is proved in [Mi]:

* Freely adapted from Georges Moustaki.
Theorem 1.2. If $p, q$ are distinct odd primes with $p \not\equiv 1 \bmod q$, such that Catalan's equation $x^p - y^q = 1$ has a non-trivial solution in the integers, then, with the notation introduced above, $T = (\mathbf{N})$.

Remark 1.3. In [Mi], the Theorem of Thaine and the assumption $p > q$ are used for the proof of $T \ne (\mathbf{N})$. The new Theorem thus makes it possible to bypass the use of Thaine's Theorem, but not the condition $p > q$.

2. Cyclotomic fields and their group rings
The $n$-th cyclotomic extension is denoted, following [Ono], by $C_n$ and its maximal real subfield is $C_n^+$; thus $C_p = \mathbb{Q}(\zeta)$, etc. The $n$-th cyclotomic polynomial is $\Phi_n(X) \in \mathbb{Z}[X]$. The Galois groups are $G_n = \operatorname{Gal}(C_n/\mathbb{Q}) \cong (\mathbb{Z}/n\cdot\mathbb{Z})^*$ and $G_n^+ = \operatorname{Gal}(C_n^+/\mathbb{Q})$. For $c \in (\mathbb{Z}/n\cdot\mathbb{Z})^*$, we let $\sigma_c$ be the automorphism of $\mathbb{Q}(\zeta_n)$ with $\zeta_n \mapsto \zeta_n^c$. If $n, n'$ are coprime odd integers, then the fields $C_n, C_{n'}$ are linearly independent [Ono] and $G_{n\cdot n'} = G_n \times G_{n'}$. An automorphism $\sigma \in G_n$ lifts to $G_{nn'}$ by fixing $\zeta_{n'}$. Complex conjugation is an automorphism $\jmath \in G_n$ for all $n \in \mathbb{N}$.

2.1. Group rings. If $R$ is a ring and $G = \operatorname{Gal}(K/\mathbb{Q})$ a Galois group, the module $R[G]$ is a free $R$-module generated by the elements of $G$ and is called the group ring of $G$. For $|G| \in R^\times$, the group ring is separable, and we require that this condition holds. We shall write $R[G]' = R[G]/(N_{K/\mathbb{Q}})$ for the quotient obtained by modding out the ideal generated by the norm. If $n$ is an odd prime power, $G = G_n$ is generated by $\varsigma \in G$ and $\varphi(n) \in R^\times$, then the polynomial $X^{\varphi(n)} - 1$ is separable over $R$ and $\varsigma \mapsto X \bmod X^{\varphi(n)} - 1$ induces an isomorphism
\[ \iota_n : R[X]/\left(X^{\varphi(n)} - 1\right) \to R[G_n] \tag{2.1} \]
with
\[ R[G_n]' = \iota_n\left(R[X]\Big/\left(\frac{X^{\varphi(n)} - 1}{X - 1}\right)\right). \]
For $(n, n') = 1$, the isomorphism $\iota$ extends by multiplicativity. It is thus defined for all cyclotomic fields and we shall write $\iota$, irrespective of the value of $n$ and the ring $R$. The real group ring embeds in $R[G]$ by $R[G^+] \cong \frac{1+\jmath}{2} \cdot R[G]$, and if $R$ is a finite field of odd characteristic, then $R[G^+] \cong (1+\jmath)R[G]$. In the latter case we shall think of the real group ring in terms of the module on the right-hand side of the isomorphism. Let $G^- = G/G^+$, the minus part of $G$; then
$R[G^-] \cong \frac{1-\jmath}{2} \cdot R[G]$, etc. In particular, since $\varphi(n)$ is even, under the isomorphism $\iota$ we have:
\[
\begin{aligned}
R[G^+] &= \iota_n\left(R[X]/\left(X^{\varphi(n)/2} - 1\right)\right), \\
R[G^+]' &= \iota_n\left(R[X]\Big/\left(\frac{X^{\varphi(n)/2} - 1}{X - 1}\right)\right), \quad\text{and} \\
R[G^-] &= \iota_n\left(R[X]/\left(X^{\varphi(n)/2} + 1\right)\right).
\end{aligned}
\tag{2.2}
\]

2.2. Characters, idempotents and irreducible modules. The topics we expand next belong to representation theory, essentially Maschke's Theorem. We present them in some detail, in order to keep a consistent notation. Let $n \in \mathbb{N}_{>1}$ be a positive integer. A Dirichlet character ([Wa], Chapter 3) of conductor $n$ is a multiplicative map $\psi : \mathbb{Z} \to \mathbb{C}$, such that $\psi(x) = \psi(y)$ if $x \equiv y \bmod n$ and $\psi(x) = 0$ iff $(x, n) > 1$. The Dirichlet character is thus a multiplicative map $\chi : (\mathbb{Z}/n\cdot\mathbb{Z})^* \to \mathbb{C}$; if $n \mid n'$, one can regard the same character as a map $(\mathbb{Z}/n'\cdot\mathbb{Z})^* \to \mathbb{C}$ by composition with the natural projection $(\mathbb{Z}/n'\cdot\mathbb{Z})^* \to (\mathbb{Z}/n\cdot\mathbb{Z})^*$. The set of integers $n$ for which the same map is defined forms an ideal and it is convenient to choose the generator of this ideal as conductor. A character defined with respect to its minimal conductor (which is sometimes denoted [Wa] by $n_\chi$) is called primitive. We will only consider primitive characters. A character is odd if $\psi(-1) = -1$ and even if $\psi(-1) = 1$. Odd and even characters multiply like signs: odd times odd is even, etc. The trivial character is unique for all conductors and will be denoted by $1$, so $1(x) = 1$ for all $x \in \mathbb{Z}$. The isomorphism $G_n \cong (\mathbb{Z}/n\cdot\mathbb{Z})^*$ allows one to consider Dirichlet characters as characters of the Galois group $G_n = \operatorname{Gal}(C_n/\mathbb{Q})$. More precisely, let $H = (\mathbb{Z}/n\cdot\mathbb{Z})^*/\ker\psi \subset (\mathbb{Z}/n\cdot\mathbb{Z})^*$. Then there is a field $K \subset C_n$ with Galois group isomorphic to $H$ and $\psi$ may be regarded as a character of this field.

Let $G = \operatorname{Gal}(K/\mathbb{Q})$ as before, let $R = k$ be a field and $\bar k$ an algebraic closure. If $K = C_n$ is a cyclotomic field (the case we are interested in) then, due to the linear independence mentioned above, we may restrict ourselves to the case when $n$ is a prime power; we shall also assume that $n$ is odd. Furthermore, the polynomial $F(X) = X^{\varphi(n)} - 1$ should be separable over $k$, so we require $(\operatorname{char}(k), \varphi(n)) = 1$. Let $\mathcal{F} \subset k[X]$ be the set of irreducible factors of $X^{\varphi(n)} - 1$ over $k$ and, naturally, $\mathcal{F}' = \mathcal{F} \setminus \{X - 1\}$; since $F(X)$ is separable, $F(X) = \prod_{f\in\mathcal{F}} f(X)$. We have the disjoint union $\mathcal{F} = \mathcal{F}^+ \cup \mathcal{F}^-$ induced by the rational polynomial factorization:
\[ X^{\varphi(n)} - 1 = \left(X^{\varphi(n)/2} - 1\right) \cdot \left(X^{\varphi(n)/2} + 1\right). \]
The primitive (Galois) characters $\chi : G \to \bar k$ are multiplicative maps which form a group $\hat G$. We shall make the dependence on $k$ explicit by writing $\hat G(k)$, whenever the context requires it. The Galois characters $\chi \in \hat G(\mathbb{Q})$ can
be identified with Dirichlet characters of conductor $n$ via the convention
\[ \chi(c) = \chi(\sigma_c) \quad \text{for } c \in (\mathbb{Z}/n\cdot\mathbb{Z})^*. \]
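A simple and important property of sums of characters is the following:

Lemma 2.1. Let $G$ be an abelian Galois group and $H' \subset \hat G(k)$ a subgroup of the Galois characters. Then
\[ \sum_{\chi\in H'} \chi(x) = \begin{cases} 0 & \forall\, x \in \mathbb{Z} \setminus \ker(H'), \text{ and} \\ |H'| & \forall\, x \in \ker(H'), \end{cases} \tag{2.3} \]
\[ \sum_{x\in G} \chi(x) = \begin{cases} 0 & \forall\, \chi \in \hat G,\ \chi \ne 1, \text{ and} \\ |G| & \text{if } \chi = 1. \end{cases} \tag{2.4} \]

Proof. Let $x \in \mathbb{Z}$ with $H'(x) \ne \{1\}$; then there is a $\chi' \in H'$ such that $\chi'(x) \ne 1$. Let $s(x) = \sum_{\chi\in H'} \chi(x)$. Then
\[ (\chi'(x) - 1) \cdot s(x) = \sum_{\chi\in H'} \chi'(x)\cdot\chi(x) - \sum_{\chi\in H'} \chi(x) = \sum_{\chi''\in H'} \chi''(x) - \sum_{\chi\in H'} \chi(x) = 0. \]
Since $\chi'(x) - 1 \ne 0$, it follows that $s(x) = 0$. For $x \in \ker(H')$ we have $\chi(x) = 1$ for all $\chi \in H'$ and obviously $s(x) = |H'|$. The proof of (2.4) is similar. □

Let $\mu \in \bar k$ be a primitive $\varphi(n)$-th root of unity. Since $G_n$ is cyclic, if $\varsigma \in G$ is a generator, then $\chi(\varsigma) \in \bar k$ determines all the values of $\chi$ by multiplicativity. Furthermore $\varsigma^{\varphi(n)} = 1$, so $(\chi(\varsigma))^{\varphi(n)} = 1$ and $\chi(\varsigma) \in \langle\mu\rangle$ is a $\varphi(n)$-th root of unity. The orthogonal idempotents [Lo] of $G$ over this field are:
\[ 1_\chi = \frac{1}{|G|} \cdot \sum_{\sigma\in G} \chi(\sigma)\cdot\sigma^{-1} \in k(\mu)[G], \quad \forall\, \chi \in \hat G. \tag{2.5} \]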
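An easy computation shows that the idempotents verify:
\[
\begin{aligned}
1_{\chi_1} \times 1_{\chi_2} &= \delta(\chi_1,\chi_2)\cdot 1_{\chi_1}, && \forall\, \chi_1, \chi_2 \in \hat G, \\
\sum_{\chi\in\hat G} 1_\chi &= 1, \\
\sigma \cdot 1_\chi &= \chi(\sigma)\cdot 1_\chi, && \forall\, \sigma \in G,\ \chi \in \hat G, \\
1_\chi \times (\chi(\sigma_0) - \sigma_0) &= 0 && \forall\, \sigma_0 \in G.
\end{aligned}
\tag{2.6}
\]
Here $\delta(\chi_1,\chi_2) = 1$ if $\chi_1 = \chi_2$ and $0$ otherwise. In general $1_\chi \notin k[G]$, so they have merely an abstract meaning, but their actions may not be well defined. We need idempotents in $k[G]$; let $S(\chi) = \operatorname{Gal}(k(\chi(G))/k)$, where $k(\chi(G))$ is the field obtained by adjoining all the values $\chi(x)$, $x \in G$, to the base field $k$. The action of $S(\chi)$ induces an equivalence relation on $\hat G$ given by
\[ \chi \sim \chi' \iff \exists\, s \in S(\chi) : \chi' = s(\chi). \]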
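The identities (2.4) and (2.6) can be verified numerically in a small case. The sketch below is an illustration under assumptions chosen here: $G$ is cyclic of order 6 (as for $G_7$) and $k = \mathbb{F}_7$, which already contains the 6th roots of unity, so that every $S(\chi)$ is trivial and the idempotents $1_\chi$ lie in $k[G]$. Group ring elements are stored as coefficient lists indexed by the exponent of a generator $\varsigma$; the primitive root 3 modulo 7 and all function names are hypothetical choices.

```python
P = 7        # characteristic of k = F_7
M = 6        # order of the cyclic group G = <ς>
ROOT = 3     # a primitive 6th root of unity in F_7


def chi(j, a):
    """Value at ς^a of the character χ_j determined by χ_j(ς) = ROOT^j."""
    return pow(ROOT, (j * a) % M, P)


def idempotent(j):
    """Coefficients of 1_χ = |G|^{-1} Σ_a χ(ς^a) ς^{-a} for χ = χ_j."""
    inv_m = pow(M, P - 2, P)          # inverse of |G| in F_7 (Fermat)
    e = [0] * M
    for a in range(M):
        e[(-a) % M] = inv_m * chi(j, a) % P
    return e


def mul(x, y):
    """Product in k[G]: cyclic convolution of the coefficient lists."""
    z = [0] * M
    for a, xa in enumerate(x):
        for b, yb in enumerate(y):
            z[(a + b) % M] = (z[(a + b) % M] + xa * yb) % P
    return z


def shift(x, b):
    """Action of ς^b on a group ring element."""
    return [x[(a - b) % M] for a in range(M)]
```

Checking `mul(idempotent(i), idempotent(j))` against `idempotent(i)` or the zero list, and `shift(idempotent(j), b)` against `chi(j, b)` times `idempotent(j)`, reproduces the first and third lines of (2.6) in this example.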
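We let $X \subset \hat G$ be a set of representatives for the classes of $\hat G/\sim$. The $k$-rational idempotents are defined by taking traces:
\[ \varepsilon_\chi = \frac{1}{|S(\chi)|} \cdot \sum_{s\in S(\chi)} 1_{s(\chi)} \in k[G], \quad \chi \in \hat G. \]
The isomorphism $\iota$ defined by (2.1) extends to the field $k[\mu]$, by fixing this extension. Then $\iota(\chi(\varsigma)) = \chi(\varsigma) = \nu$ is a root of unity whose order is equal to the order of the character $\chi \in \hat G$. The annihilator $\chi(\varsigma) - \varsigma$ of $1_\chi$ maps under the isomorphism defined in (2.1) to $\iota(\chi(\varsigma) - \varsigma) = X - \nu$. The group $S(\chi)$ acts on $\chi$ and on $\nu$ but not on $\varsigma$, and thus
\[ \iota\Big(\prod_{s\in S(\chi)} \big(\varsigma - s(\chi(\varsigma))\big)\Big) \equiv \prod_{s\in S(\chi)} \big(X - s(\nu)\big) \equiv f_\chi(X) \bmod X^{\varphi(n)} - 1. \]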
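Note that the polynomial $f_\chi \in k[X]$ since it is invariant under the group $S(\chi)$ acting on $\nu$. Furthermore it is an irreducible factor of $X^{\varphi(n)} - 1$, so $f_\chi \in \mathcal{F}$. We have thus a one-to-one map $\phi : X \to \mathcal{F}$, $\chi \mapsto f_\chi$. Since $f_\chi(\varsigma)$ annihilates $1_{\chi'}$ for all conjugate characters $\chi'$ of $\chi$, it follows that it annihilates $\varepsilon_\chi$. Furthermore, since $(\varsigma - \chi(\varsigma)) \mid (\sigma_0 - \chi(\sigma_0))$ for any $\sigma_0 \in G$, it is also the minimal annihilator. We have thus the following properties for the $k$-rational idempotents:
\[
\begin{aligned}
\varepsilon_{\chi_1} \times \varepsilon_{\chi_2} &= \delta(\chi_1,\chi_2)\cdot\varepsilon_{\chi_1}, && \forall\, \chi_1, \chi_2 \in \hat G, \\
\sum_{\chi\in X} \varepsilon_\chi &= 1, \\
\sigma \cdot \varepsilon_\chi &= \chi(\sigma)\cdot\varepsilon_\chi, && \forall\, \sigma \in G,\ \chi \in \hat G, \\
\varepsilon_\chi \times f_\chi(\sigma_0) &= 0 && \forall\, \sigma_0 \in G.
\end{aligned}
\tag{2.7}
\]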
Here, unlike (2.6), $\delta(\chi_1,\chi_2) = 1$ if $\chi_1 \sim \chi_2$ and $0$ otherwise. We define the irreducible submodules of $k[G]$ by $M_\chi = \varepsilon_\chi \cdot k[G]$, $\chi \in X$. By the previous remarks, they have $f_\chi(\varsigma)$ as minimal annihilator and thus $M_\chi \cong k[G]/(f_\chi(\varsigma)k[G])$; they are in fact fields and:
\[ k[G] = \bigoplus_{\chi\in X} \varepsilon_\chi \cdot k[G] = \bigoplus_{\chi\in X} M_\chi. \tag{2.8} \]
Let $H$ be a finite multiplicative abelian group on which $G$ acts. The action of $G$ makes $H$ into a $k[G]$-module and (2.8) induces a direct sum representation of the module $H = k[G]\cdot H$:
\[ k[G]\cdot H = \bigoplus_{\chi\in X} (\varepsilon_\chi \cdot k[G])\cdot H = \bigoplus_{\chi\in X} M_\chi \cdot H. \tag{2.9} \]
The subgroups $M_\chi \cdot H \subset H$ are called irreducible components of $H$; a component is the direct sum of one or more irreducible components. Note that the $\mathbb{Q}$-rational idempotents correspond to the factorization of $X^{\varphi(n)} - 1$ over the rationals. The induced $\mathbb{Q}$-irreducible components are thus always unions of one or more $\mathbb{F}_r$-irreducible components, for some prime $r$.
We define the support and annihilator of $H$ as the direct sum of irreducible modules which act non-trivially, resp. trivially, on $H$:
\[ \operatorname{supp}(H) = \bigoplus_{\substack{\chi\in X' \\ M_\chi\cdot H \ne \{1\}}} M_\chi, \qquad \operatorname{ann}(H) = \bigoplus_{\substack{\chi\in X' \\ M_\chi\cdot H = \{1\}}} M_\chi. \]
Note that $\operatorname{supp}(H), \operatorname{ann}(H) \subset k[G]$; they are components of $k[G]$ and not of $H$. In particular, various unrelated abelian groups may share the same support and annihilator. Furthermore, an irreducible component need not be a cyclic module. Since $H$ is finite, there are a finite number of cyclic modules in $M_\chi\cdot H$:
\[ \exists\, m_{\chi,1}, m_{\chi,2}, \dots, m_{\chi,k} \in H : \quad M_\chi\cdot H = \bigoplus_{i=1}^{k} M_\chi\cdot m_{\chi,i}. \]
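The number $k$ of cyclic modules $M_\chi\cdot m_{\chi,i}$ in $M_\chi\cdot H$ is called the cycle-rank of $M_\chi\cdot H$ and will be denoted by cyc.rk.$(M_\chi\cdot H)$.

Let now $n_1, n_2$ be powers of coprime integers. Then $G = G_{n_1\cdot n_2} = G_{n_1} \times G_{n_2}$, as noted in the previous section. A character $\chi \in \hat G_{n_1 n_2}$ then splits as $\chi = \chi_1\cdot\chi_2$, with $\chi_i \in \hat G_{n_i}$, $i = 1, 2$. If $\mu \in \bar k$ is a primitive $\varphi(n_1 n_2)$-th root of unity, we define the orthogonal idempotents by the same formula (2.5) used in the case of prime powers. Let $\chi \in \hat G$ with $\chi = \chi_1\cdot\chi_2$ as above. An easy computation shows that, using the representation $\tau \in G_{n_1 n_2}$ with $\tau = \sigma_1\cdot\sigma_2$, where $\sigma_i \in G_{n_i}$, $i = 1, 2$, we have:
\[
\begin{aligned}
1_\chi &= \frac{1}{|G|}\cdot\sum_{\tau\in G} \chi(\tau)\cdot\tau^{-1} \\
&= \frac{1}{|G_{n_1}|\cdot|G_{n_2}|}\cdot\sum_{\sigma_i\in G_{n_i}} \chi_1(\sigma_1)\cdot\chi_2(\sigma_2)\cdot\sigma_1^{-1}\cdot\sigma_2^{-1} \\
&= \Big(\frac{1}{|G_{n_1}|}\cdot\sum_{\sigma_1\in G_{n_1}} \chi_1(\sigma_1)\cdot\sigma_1^{-1}\Big) \times \Big(\frac{1}{|G_{n_2}|}\cdot\sum_{\sigma_2\in G_{n_2}} \chi_2(\sigma_2)\cdot\sigma_2^{-1}\Big) \\
&= 1_{\chi_1} \times 1_{\chi_2}.
\end{aligned}
\tag{2.10}
\]
Herewith all the properties of idempotents, and the further definitions which build upon these properties, extend by multiplicativity to general cyclotomic fields.

3. Explicit reflection
We let now $\ell$ be an odd prime and $n \in \mathbb{N}$ be divisible by $\ell$ and such that $\ell \nmid \varphi(n)$. The fields will be $K = C_n$, so $\operatorname{Gal}(K/\mathbb{Q}) = G_n$, and $k = \mathbb{F}_\ell$. Remember that the group ring $k[G_n]$ is defined by multiplicativity and it is semisimple, since $\ell = \operatorname{char}(k) \nmid |G_n|$.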
There is a unique character $\omega = \omega_\ell \in \hat G_n$ such that
\[ \sigma(\zeta_\ell) = \zeta_\ell^{\omega(\sigma)}, \quad \forall\, \sigma \in G_n. \]
This character is called the cyclotomic character for $\ell$ and it is an odd character. If $\chi \in \hat G$ we define the reflected character $\chi^* \in \hat G$ by
\[ \chi^*(\sigma) = \omega(\sigma)\cdot\chi(\sigma^{-1}). \tag{3.1} \]
Since $\omega(\sigma) \in \mathbb{F}_\ell = k$ it follows that $\chi^*$ is irreducible iff $\chi$ is so; also, $\omega$ being odd, reflection changes the parity of a character. The definition of reflected irreducible modules and reflected idempotents follows naturally. We shall write $1^*_\chi = 1_{\chi^*}$, etc. One also remarks that reflection is an involutive operation, since
\[ (\chi^*)^* = \omega\cdot\left(\omega\chi^{-1}\right)^{-1} = \chi. \]
If $n = \ell$, the polynomial $X^{\varphi(\ell)} - 1 = X^{\ell-1} - 1 = \prod_{j=1}^{\ell-1}(X - j)$ splits in linear factors over $k$. The orthogonal idempotents are thus annihilated by linear polynomials $\varsigma - j$ and can be indexed by these polynomials. They have in this case the representation ([Wa], Chapter 6.2):
\[ \varepsilon_j = \varepsilon_{\chi_j} = -\sum_{\sigma\in G} \omega^j(\sigma)\cdot\sigma^{-1}. \tag{3.2} \]
Reflection of idempotents follows here the simple law: $\varepsilon^*_j = \varepsilon_{\ell-j}$.

We now present Leopoldt's Reflection Theorem, which will establish relations between various $\ell$-groups which are all $k[G_n]$-modules. Leopoldt's original paper [Le] (see also [Lo]) treats the general case in which $K$ is a normal field containing $\zeta_\ell$ and such that $([K : \mathbb{Q}], \ell) = 1$. Furthermore, the groups there are Sylow groups, while we are only interested in their elementary $\ell$-subgroups, i.e., the subgroups of exponent $\ell$. This second modification is only marginal, but it makes it possible to bypass a step in which the base field for the group rings has to be $k = \mathbb{Q}_\ell$, the $\ell$-adic rational field.

Let $C$ be the ideal class group of $K$ and $E = \mathcal{O}(K^+)^\times$ be the real units. Let $\alpha \in K$ have valuation zero at each prime $\mathfrak{L} \supset (\ell)$; we say that $\alpha$ is $\ell$-primary iff $\alpha \equiv \nu^\ell \bmod \ell\cdot(1 - \zeta_\ell)^2$, for some $\nu \in K$. We then write $K_\ell = \{x \in K^\times : x \text{ is } \ell\text{-primary}\}$ and let $E_\ell = E \cap K_\ell$. Note that if $K' \subset K$ is a field in which $\ell$ is inert, then the necessary condition for $\ell$-primary numbers in $K'$ is $\alpha \equiv \nu^\ell \bmod \ell^2$. The first actors of reflection are then:
\[ A_\ell = \{x \in C : x^\ell = 1\}, \quad\text{and}\quad U_\ell = E_\ell/E^\ell. \]
If $A_\ell \ne \{1\}$, there is a maximal abelian unramified elementary $\ell$-extension $L \supset K$, i.e., an extension with $\ell$-elementary Galois group $H = \operatorname{Gal}(L/K)$. This is a subfield of the Hilbert class field of $K$ and the Artin map yields an isomorphism between the groups $H \cong A_\ell$. The module $k[G]$ acts on $H$ by conjugation:
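$\sigma h = h^\sigma = \sigma^{-1}\circ h\circ\sigma$, for all $h \in H$, $\sigma \in G$. Finally, a number $\alpha \in K$ is called $\ell$-singular if there is a non-principal ideal $\mathfrak{a} \in x \in A_\ell$ such that $\mathfrak{a}^\ell = (\alpha)$. Note that by definition $\alpha \notin K^\ell$. We let $B = \{\alpha \in K_\ell : \alpha \text{ is } \ell\text{-singular}\} \cap (K_\ell \setminus E_\ell)$ and $B_\ell = B/(K^\times)^\ell$.

Theorem 3.1 (Leopoldt's Reflection Theorem). Notations being like above, let $M = M_\chi \subset k[G]'$ be an irreducible submodule, with $\chi \in X$ an even character. Then the $k[G]'$-modules $A_\ell$, $U_\ell$ and $B_\ell$ are related by:
\[ \text{cyc.rk.}(M_\chi B_\ell) + \text{cyc.rk.}(M_\chi U_\ell) = \text{cyc.rk.}(M_{\chi^*} A_\ell), \tag{3.3} \]
\[ \text{cyc.rk.}(M_{\chi^*} B_\ell) = \text{cyc.rk.}(M_\chi A_\ell), \tag{3.4} \]
and
\[ \text{cyc.rk.}(M_\chi B_\ell) \le \text{cyc.rk.}(M_\chi A_\ell), \qquad \text{cyc.rk.}(M_{\chi^*} B_\ell) \le \text{cyc.rk.}(M_{\chi^*} A_\ell). \tag{3.5} \]
Moreover, the following inequality holds:
\[ \text{cyc.rk.}(M_\chi\cdot A_\ell) \le \text{cyc.rk.}(M_{\chi^*}\cdot A_\ell) \le \text{cyc.rk.}(M_\chi\cdot A_\ell) + \text{cyc.rk.}(M_\chi\cdot U_\ell). \tag{3.6} \]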
Proof. Note that the norm NK/Q annihilates all the groups under consideration, which explains why we concentrate on k[G] . The numbers in B are primary singular non-units and the union F = B ∪ U is disjoint, so cyc.rk.(M F ) = cyc.rk.(M B ) + cyc.rk.(M U ) for each simple submodule M ⊂ k[G] . If x ∈ F and y ∈ K× , y ≡ x mod (K × ) , then K(y 1/ ) is an unramified abelian extension (e.g., [Wa], Chapter 9, Exercises). These are exactly all possibilities for generating the extension L. The inequalities (3.5) are obvious, since it takes an ideal in a ∈ x ∈ A in order to define a singular number in B, and not all singular numbers are also primary, so the inequalities may be strict. We have the following one-to-one maps: F ↔ H ↔ A . The first map is a consequence of the above remark, the second is the Artin map. The inequalities (3.5) now follow from |M ∗ A | = |M F | = |M B | + |M U |. For odd characters χ, Mχ · U = {1}, since in this case Mχ annihilates the real units. This explains the asymmetry between (3.3) and (3.4). The symmetry is regained if we write, with F defined above, cyc.rk.(Mχ F ) = cyc.rk.(Mχ∗ A ).
(3.7)
This relation holds for any character $\chi$, and we shall prove it below. The extension $L/K$ is an abelian Kummer extension [La]; for $b \in \mathbf{b} \in F$, the extension $K(b^{1/\ell})$ depends only upon the class $\mathbf{b} \in F$ of the algebraic number $b$. There is thus a (Kummer) pairing $H \times F \to \langle \zeta_\ell \rangle$ given by
$$\langle h, \mathbf{b} \rangle = \frac{h(b^{1/\ell})}{b^{1/\ell}}, \quad \text{for any } b \in \mathbf{b}.$$
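The pairing does not depend upon the choice of the $\ell$-th root of $b$ [La], is bilinear and non-degenerate. Furthermore, it is $G$-covariant in the sense that
$$\langle h^\sigma, \mathbf{b}^\sigma \rangle = \langle h, \mathbf{b} \rangle^\sigma, \quad \forall\, \sigma \in G. \tag{3.8}$$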
Let now χ ∈ G . We claim that the Kummer pairing verifies the reflection property: ε∗χ h, b = h, εχ b. (3.9) Indeed h, bσ = ζnσ = h, bω(σ) so (3.8) implies σh, b = h, bω(σ) = h, ω(σ)b. The statement now follows by directly inserting the definition of εχ and using the fact that |S(χ)| = |S(χ∗ )|. Let now b ∈ Mχ F , so εχ b = b. Then (3.9) implies that h, b = ε∗χ h, b, ∗ so if h, b '= 1 then εχ h '= 1. But this means that h ∈ Mχ∗ H; however, if b ∈ b ∈ F and 1 '= h ∈ Gal (K(b1/ )/K), then the pairing is necessarily h, b '= 1. This shows that the correspondence F ↔ H acts componentwise by reflection, implies (3.7) and completes the proof. The main application of reflection is, for our purpose, the following: Proposition 3.2. Let n = · n with ' | ϕ(n), an odd prime and n ∈ N. Let A , U be like above and χ ∈ Gn , an even character belonging to the field K = Cn ⊂ K. If Mχ U or Mχ A are not trivial, then Mχ∗ A '= {1} Proof. If Mχ U '= {1}, then by (3.3), Mχ∗ A '= {1}. Otherwise, if Mχ A is non trivial, then Mχ∗ B is non trivial as a consequence of (3.4) and (3.5). In both cases, Mχ∗ A '= {1}, which completes the proof. Let ε1 be the orthogonal idempotent in (3.2), defined with respect to = q. The Proposition implies: Corollary 3.3. Let T and Aq be as in the statement of Theorem 1.1. Then T∗ ⊃ supp(ε1 · Aq ). Proof. If χ ∈ Gp then χ∗ = ω ·χ−1 and M ∗ χ ⊂ ε1 k[Gpq ]. The statement follows now from Proposition 3.2. 4. Bernoulli numbers If χ '= 1 is a Dirichlet character of conductor f , then the generalized Bernoulli numbers are defined ([Wa], Chapter 4), by: B1,χ =
$$\frac{1}{f} \sum_{a=1}^{f} a \cdot \chi(a). \tag{4.1}$$
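The definition (4.1) is easy to evaluate numerically. The following Python sketch is an illustration only and not part of the argument: it assumes a prime conductor p and indexes the characters by an exponent k through χ_k(g^j) = exp(2πi·jk/(p−1)) for a primitive root g; the function names are hypothetical. Since χ(f) = 0 for a nontrivial character, the sum is taken over a = 1, ..., f − 1.

```python
import cmath

def primitive_root(p):
    """Smallest primitive root modulo an odd prime p (brute force)."""
    for g in range(2, p):
        seen, x = set(), 1
        for _ in range(p - 1):
            x = x * g % p
            seen.add(x)
        if len(seen) == p - 1:  # powers of g hit every nonzero residue
            return g
    raise ValueError("no primitive root found")

def characters_mod_prime(p):
    """List chi_0, ..., chi_{p-2}; chi_k maps a = g^j to exp(2*pi*i*j*k/(p-1))."""
    g = primitive_root(p)
    log, x = {}, 1
    for j in range(p - 1):  # discrete logarithm table: log[g^j] = j
        log[x] = j
        x = x * g % p
    return [
        {a: cmath.exp(2j * cmath.pi * k * log[a] / (p - 1)) for a in range(1, p)}
        for k in range(p - 1)
    ]

def bernoulli_1(chi, f):
    """B_{1,chi} = (1/f) * sum_{a=1}^{f} a * chi(a), using chi(f) = 0."""
    return sum(a * chi[a] for a in range(1, f)) / f

if __name__ == "__main__":
    p = 7
    for k, chi in enumerate(characters_mod_prime(p)):
        if k == 0:
            continue  # skip the trivial character
        parity = "even" if k % 2 == 0 else "odd"
        print(k, parity, bernoulli_1(chi, p))
```

With this indexing χ_k(−1) = (−1)^k, so the even characters are those with k even. For p = 7 the two nontrivial even characters give 0 up to rounding, and the quadratic character (k = 3) gives −1.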
A major distinction between Galois characters and Dirichlet characters becomes clear in the definition (4.1): although it is formally identical to the definition of the idempotent 1χ−1 , no factorization like (2.10) is possible. The reason is that in the definition of idempotents, χ(σ) is multiplied by an automorphism
– thus, under the identification of Galois and Dirichlet characters, there is an implicit reduction modulo the conductor of χ. In (4.1) however, the factors a are considered as complex numbers, so the factorization is true only modulo f . The next lemma gathers some computational facts on various characters: Lemma 4.1. Let , n be like in the previous section and µ ∈ C a primitive ϕ(n)-th root of unity, L = Q(µ) and () ⊂ L ⊂ O(L) a prime ideal above . Let Fr = O(L)/L be a field of characteristic so that the group Gn (F ) has images in Fr ; finally, let L ⊃ Q the extension of the -adic field for which O(L )/ ( · O(L )) = Fr [Go]. If ν ≡ µ mod L ∈ Fr , then µ is the unique root of unity in C with this property. Furthermore, there is a unique ϕ(n)-th root of unity µ ∈ L such that µ mod ( · O(L )) = ν. If χ ∈ Gn (F ) there are unique characters ψχ ∈ Dn = Gn (Q) and λχ ∈ Gn (Q ) – thus a Dirichlet and an -adic character – such that ψχ (x) ≡ χ(x) mod L, ∀ x ∈ Z, λχ (x) ≡ χ(x) mod ( · O(L )) ,
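$\forall\, x \in \mathbb{Z}$,
$$\psi_\chi(x) \equiv \lambda_\chi(x) \bmod \mathfrak{L}^N, \quad \forall\, x \in \mathbb{Z},\ N \in \mathbb{N}. \tag{4.2}$$
If $\omega$ is the cyclotomic character for $\ell$, then
$$\omega := \psi_\omega(x) \equiv x^{\ell^{N-1}} \bmod \mathfrak{L}^N, \quad \forall\, x \in \mathbb{Z},\ N \in \mathbb{N}. \tag{4.3}$$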
Proof. There is exactly one µ ∈ C with µ ≡ ν mod L. If this was not the case and µ1 ≡ µ2 ≡ ν mod L, then µ1 −µ2 ≡ 0 mod L and N(µ1 −µ2 ) ≡ 0 mod . But the norm on the right hand side is only divisible by primes dividing the order of µ, thus dividing ϕ(n), which is coprime to , so µ1 = µ2 . The unicity of the root µ is proved similarly. It is an elementary fact on -adic extensions [Go], that O(L)/(LN ) ∼ = O(L )/ N · O(L ) for all N ∈ N. Let χ ∈ Gn (F ) and eχ (x) : Z → Z/(ϕ(n) · Z) be the exponent with χ(x) = ν eχ (x) ; then the characters in (4.2) are given by ψχ (x) = µeχ (x) and λχ (x) = (µ )eχ (x) . The properties in (4.2) are immediate consequences. Finally, the character ω has order − 1 and is defined by its values for a = 1, 2, . . . , − 1 for which ω(a) ≡ a mod . One verifies that the character ψω mod LN given by (4.3) has exactly these properties and the claim (4.3) follows from the unicity of ψω and λω . For even characters, B1,χ = 0 and the odd characters are connected to the field K by the class number formula [Wa], Theorem 4.17: k B1,χ , k ∈ Z h− n =2 ·n· χ odd Since we are interested in divisibility of h− n by the odd prime , the power of 2 is of less concern in our case. The factor n cancels with the denominator of B1,@ ωt , for all the cyclotomic characters defined with respect to prime divisors
of t|n; all the other Bernoulli numbers are algebraic integers. The class number formula indicates that if $\ell \mid h_n^-$, then some Bernoulli numbers will be divisible by prime ideals above $\ell$. The next step is to follow this indication and gather finer, component-dependent information about the divisibility of $B_{1,\chi}$ by primes above $\ell$. Let
$$\theta = \frac{1}{n} \sum_{0 < a < n,\ (a,n)=1} a \cdot \sigma_a^{-1}$$
be the Stickelberger element of K ([Wa], Theorem 15.1). Then θc = (c − σc )θ ∈ Z[Gn ], for (c, n) = 1 and it annihilates the class group C of K. Idempotents, Bernoulli numbers and Stickelberger element are related by the following formula, which is a consequence of (2.6). We assume here that the characters χ ∈ G are defined with respect to the field k = Q and they are identified to Dirichlet characters as shown before. θ · 1χ = B1,χ−1 · 1χ , ∀ χ ∈ G (Q), (4.4) ∀ χ ∈ G (Q). (c − σc )θ · 1χ = (c − χ(c)) · B1,χ−1 · 1χ , By reducing the above relations modulo primes lying above , we obtain important information about Bernoulli numbers, when an -component of the class group is non trivial. Proposition 4.2. Let be an odd prime and n = · n ∈ N with (, ϕ(n)) = 1, K = Cn ; for m|ϕ(n), m > 1, let µ ∈ C be a primitive m-th root of unity and G = Gn (F ). We fix a prime ideal () ⊂ L ⊂ O(Q(µ)) and consider χ ∈ G , a non-trivial primitive group character of exact order m, other then the cyclotomic character ω . Let C be the class group of K, A = {x ∈ C : x = 1} and suppose that Mχ · A '= {1}. If ψ = ψχ is the Dirichlet character defined in (4.2), then: B1,ψ−1 ≡ 0 mod L.
(4.5)
Furthermore, if $M_\chi \cdot A_\ell \neq \{1\}$ for all characters of exact order $m$, then
$$B_{1,\psi^{-1}} \equiv 0 \bmod \ell \cdot O(\mathbb{Q}(\mu)). \tag{4.6}$$
Proof. Let c ∈ Z with χ(c) '≡ c mod – this is possible, since χ '= ω – so θc = (c−σc )θ ∈ Z[G] and it annihilates the class group. Thus θc ·1κ A = {1} for all κ ∈ G and in particular for κ belonging to S(χ)χ. But since Mχ A '= {1}, it follows that the last annihilation is non-trivial. We insert c in the second relation of (4.4) and use c − χ(c) '≡ 0 mod L, thus finding θc εχ ≡ (c − χ(c)) · B1,ψ−1 · εχ
mod L.
Since εχ does by definition not annihilate Mχ A and c − χ(c) mod L ∈ F× r , it follows that B1,ψ−1 must vanish modulo L, which is the statement of (4.5). Suppose now that (4.5) holds for all characters of order m and let ψχ be the Dirichlet character induced by one of the χ ∈ Gn (F ). Let σ ∈ Gal (Q(µ)/Q);
then σ(ψ) is also a character of exact order m for which (4.5) holds. Thus B1,σ−1 (ψ−1 ) = σ −1 B1,ψ−1 ≡ 0 mod L, and, by applying σ to the above congruence, we find that B1,ψ−1 ≡ 0 mod σL. This is the case for all σ ∈ Gal (Q(µ)/Q) and (4.6) follows. In particular, when the situation described in the Proposition happens for the reflected of all even characters in F [Gn ] , then we have: Corollary 4.3. Let the notations be the same as in Lemma 4.2, n ≥ 7 or n = 5 and suppose that Mχ∗ · A '= {1} for all even characters χ ∈ F [Gn ] . If µ ∈ C is a primitive ϕ(n )-th root of unity and () ⊂ L ⊂ O(Q(µ)) is a prime ideal above , then for all even Dirichlet characters ψ of conductor n the following holds: (4.7) B1,(ω−1 ·ψ ≡ 0 mod · O(Q(µ)). Proof. Note that for n < 7, n '= 5, we have ϕ(n ) ≤ 2 and there are no nontrivial even characters in Gn . The Corollary is a consequence of (4.6) and the fact that the ideal () in the m-th cyclotomic extension lifts to the ideal () in Q(µ), for any 1 < m|ϕ(n ). 5. Proof of the Theorem The proof of Theorem 1.1 is an application of Corollaries 3.3 and 4.3 combined with some involved computations with congruences and integer parts. Let p, q be the primes in Theorem 1.1 and let = q, n = pq and n = p. Since p '≡ 1 mod q and p > q, p ≥ 5, we are in the situation of the previous results. Assume that T = (N) in Theorem 1.1. Then Corollary 3.3 implies that Mχ∗ Aq is non trivial for all even, non-trivial χ ∈ Gp with images in Fq . Let µ ∈ C be a primitive (p − 1)/2-th root of unity – since we consider only even characters of Gp , their order divides (p − 1)/2; let Ep be the set of all even, non-trivial Dirichlet characters of conductor p. Then Corollary 4.3 implies that (4.7) holds for all ψ ∈ Ep . For such ψ, we write β1,ψ = pqB1,ψ , so that B1,ψ ≡ 0
mod q ⇔ β1,ψ ≡ 0 mod q 2 .
The characters $\psi \in E_p$ are even, $\psi(a) = \psi(p - a)$. We need some facts on computations modulo $pq$. Let $0 < u < q$, $0 < v < p$ be the unique integers, given by the extended Euclidean algorithm, such that $up + vq \equiv 1 \bmod pq$. Since $vq \equiv 1 \bmod p$, the following easy consequence of the definition of $u, v$ will be used below:
$$v \equiv \pm 1 \bmod p \iff q \equiv \pm 1 \bmod p.$$
Let $0 < x(a, b) < pq$ and $0 \le n(a, b) < p$ be the unique integers with
$$x(a,b) \equiv a \bmod p,\quad a = 1, 2, \dots, p-1; \qquad x(a,b) = b + q \cdot n(a,b) \equiv b \bmod q,\quad b = 1, 2, \dots, q-1. \tag{5.1}$$
Then $x(a, b) \equiv upb + vqa \bmod pq$ and
$$q \cdot n(a,b) \equiv avq + bq \cdot \frac{up - 1}{q} \equiv q(av - bv) \bmod pq,$$
so
$$n(a,b) \equiv (a - b)v \bmod p. \tag{5.2}$$
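The congruence (5.2) and the defining properties of x(a, b) can be checked by direct computation. The Python sketch below is an illustration under the definitions above; the function names are hypothetical, and the three-argument `pow` with exponent −1 (Python 3.8 or later) is used for the modular inverses u ≡ p⁻¹ mod q and v ≡ q⁻¹ mod p.

```python
def crt_coefficients(p, q):
    """Return (u, v) with 0 < u < q, 0 < v < p and u*p + v*q ≡ 1 (mod p*q)."""
    u = pow(p, -1, q)  # u*p ≡ 1 (mod q)
    v = pow(q, -1, p)  # v*q ≡ 1 (mod p)
    return u, v

def x_and_n(a, b, p, q):
    """x(a,b) in (0, pq) with x ≡ a (mod p), x ≡ b (mod q), and n = (x - b)/q."""
    u, v = crt_coefficients(p, q)
    x = (u * p * b + v * q * a) % (p * q)
    n, rem = divmod(x - b, q)
    assert rem == 0  # x ≡ b (mod q) by construction
    return x, n

if __name__ == "__main__":
    p, q = 7, 5
    u, v = crt_coefficients(p, q)
    for a in range(1, p):
        for b in range(1, q):
            x, n = x_and_n(a, b, p, q)
            assert x % p == a and x % q == b
            assert n == (a - b) * v % p                       # relation (5.2)
            assert n + x_and_n(p - a, q - b, p, q)[1] == p - 1
    print("u, v =", u, v, "all checks passed")
```

For p = 7, q = 5 one finds u = v = 3 and up + vq = 36 = pq + 1. The last assertion checks the symmetry n(a, b) + n(p − a, q − b) = p − 1, which follows from x(p − a, q − b) = pq − x(a, b).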
Note the identity $n(a, b) + n(p - a, q - b) = p - 1$. Indeed, since $x(p - a, q - b) = pq - x(a, b)$, we have
$$pq = b + q\,n(a,b) + (q - b) + q\,n(p-a, q-b) = q \cdot \bigl(1 + n(a,b) + n(p-a, q-b)\bigr),$$
which confirms the claim. For $a = 1, 2, \dots, p - 1$ we let $f(a) \in \mathbb{F}_q$ be defined by:
$$f(a) \equiv \sum_{b=1}^{q-1} b^{-1} \cdot n(a,b) \bmod q. \tag{5.3}$$
Then
$$f(p - a) \equiv \sum_{b=1}^{q-1} (q-b)^{-1} \cdot n(p-a, q-b) \equiv \sum_{b=1}^{q-1} -b^{-1} \cdot \bigl(p - 1 - n(a,b)\bigr) \equiv f(a) \bmod q,$$
where the last step uses $\sum_{b=1}^{q-1} b^{-1} \equiv 0 \bmod q$. With this, (4.7) implies for all non-trivial $\psi \in E_p$:
$$\beta_{1,\omega^{-1}\psi} = \sum_{a=1}^{(p-1)/2} \sum_{b=1}^{q-1} \psi(a)\,\omega^{-1}(b) \cdot \bigl(x(a,b) + x(p-a,b)\bigr) \equiv \sum_{a=1}^{(p-1)/2} \sum_{b=1}^{q-1} \psi(a) \Bigl(2b^{1-q} + q\,b^{-1} \cdot \bigl(n(a,b) + n(p-a,b)\bigr)\Bigr) \bmod q^2.$$
From (2.4), since $\psi \neq 1$, we have $2 \cdot \sum_{a=1}^{(p-1)/2} \psi(a) = \sum_{a=1}^{p-1} \psi(a) = 0$. The sum vanishes in $\mathbb{C}$ and a fortiori modulo $q^2$, and with the definition (5.3), the previous congruence becomes
$$\sum_{a=1}^{(p-1)/2} \psi(a) \cdot f(a) \equiv 0 \bmod q, \quad \psi \in E_p. \tag{5.4}$$
We can regard the above as a homogeneous linear system of equations over $\mathbb{F}_q$, with $(p - 1)/2$ unknowns and $(p - 3)/2$ equations. One recognizes that the system matrix has a submatrix of rank $(p - 3)/2$, which is in fact a Vandermonde matrix. An easy verification shows that the constant vector is a solution of (5.4), so
$$\exists\, c_0 \in \mathbb{F}_q \ \text{ such that } \ f(a) = c_0, \ \text{ for } a = 1, 2, \dots, p - 1.$$
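Since $x - p \cdot \lfloor x/p \rfloor \in \{0, 1, \dots, p - 1\}$ for all $x \in \mathbb{Z}$, it follows that $n(a, b) = (a - b)v - p \cdot \lfloor (a-b)v/p \rfloor$. We can compute the constant $c_0$ directly, using (5.2):
$$c_0 \equiv \sum_{b=1}^{q-1} b^{-1} \left( (a-b)v - p \left\lfloor \frac{(a-b)v}{p} \right\rfloor \right) \equiv v - p \cdot \sum_{b=1}^{q-1} b^{-1} \left\lfloor \frac{(a-b)v}{p} \right\rfloor \bmod q.$$
With a new constant $c_1 \equiv \frac{v - c_0}{p} \equiv uv - uc_0 \bmod q$, we have the linear system of equations:
$$\sum_{b=1}^{q-1} b^{-1} \cdot \left\lfloor \frac{(a-b)v}{p} \right\rfloor - c_1 \equiv 0 \bmod q, \quad a = 1, 2, \dots, p - 1. \tag{5.5}$$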
For a heuristic investigation of (5.5), let us define
$$\theta_{a,b} = \sum_{b=1}^{q-1} \left\lfloor \frac{(a-b)v}{p} \right\rfloor \cdot \sigma_b^{-1} - c_1 \in \mathbb{F}_q[G_q].$$
Then (5.5) says that $\varepsilon_1 \theta_{a,b} = 0$ for $a = 1, 2, \dots, p - 1$, with $\varepsilon_1$ the idempotent in (3.2) with respect to $\ell = q$. We assume that the vectors $(n(a, b))_{b=1}^{q-1}$ are randomly distributed for $a = 1, 2, \dots, (p - 1)/2$. By fixing $c_1$ such that $\theta_{1,b}\, \varepsilon_1 = 0$, the probability that the same component vanishes for the further $(p - 3)/2$ independent elements in $\mathbb{F}_q[G_q]$ is $q^{-(p-3)/2}$. For fixed $p$ and $q < N \to \infty$, the probability that (5.5) is verified for at least one $q$ is thus
$$P(p) < \zeta\left(\frac{p-3}{2}\right) - 1 < 1,$$
with $\zeta$ the Riemann zeta function. The heuristic thus suggests that (5.5) has no solutions, irrespective of the size of $p$ and $q$.

For a proof, we shall need to restrict generality to the case $p > q$, as in the statement of Theorem 1.1; since $p$ and $q$ are primes, then $p - 2 \ge q$. We let $s_v(z) = \left\lfloor \frac{(z+1)v}{p} \right\rfloor - \left\lfloor \frac{zv}{p} \right\rfloor$ for $z \in \mathbb{Z}$. Since $0 < v < p$, it follows that $0 \le s_v(z) \le 1$ for all $z \in \mathbb{Z}$. We extend the summation range to $b = 0$ and replace $b^{-1}$ by $\omega^{-1}(b)$, which is also defined at $b = 0$. By subtracting the identities above for two successive values $a, a + 1$ with $0 < a < p - 2$, it follows that
$$c_1 - c_1 \equiv \sum_{b=0}^{q-1} \omega^{-1}(b) \cdot \left( \left\lfloor \frac{(a+1-b)v}{p} \right\rfloor - \left\lfloor \frac{(a-b)v}{p} \right\rfloor \right) \equiv \sum_{b=0}^{q-1} \omega^{-1}(b) \cdot s_v(a-b) \equiv 0 \bmod q,$$
or, equivalently,
$$\sum_{t=a+1-q}^{a} \omega^{-1}(a-t) \cdot s_v(t) \equiv 0 \bmod q. \tag{5.6}$$
Since $p > q$, relation (5.1) implies that $v \not\equiv \pm 1 \bmod p$, and a simple computation shows that $s_v(z) = s_v(z + q)$ for $1 - q < z \le 0$. This allows us to keep the argument of $s_v(t)$ in the range $0 \le t < q$, when $a < q$:
$$\sum_{t=0}^{q-1} \omega^{-1}(a-t) \cdot s_v(t) \equiv 0 \bmod q, \quad a = 1, 2, \dots, p - 2. \tag{5.7}$$
The first $q$ equations in (5.7) then lead to a square homogeneous system modulo $q$. Let the matrices $\Omega_i \in M(\mathbb{F}_q, q - i)$, $i = 0, 1$, be defined by:
$$\Omega_i = \left( \omega^{-1}(a-t) \right)_{a,t=0}^{q-1-i}, \quad i = 0, 1.$$
Then $\Omega_1$ is a submatrix of $\Omega_0$, which is the system matrix of the first $q$ equations in the system (5.7). Note that $\Omega_1$ is a Toeplitz matrix and it has the characteristic polynomial $X^{q-1} + 1$, as follows by applying a usual method of numerical analysis for such matrices. The method consists in completing the matrix into a $2(q-1) \times 2(q-1)$ circulant matrix, whose eigenvalues are then $\xi_{2(q-1)}^k$, where $\xi_{2(q-1)}$ is a primitive $2(q-1)$-th root of unity over $\mathbb{F}_q$ (i.e., a square root of a generator of $\mathbb{F}_q^\times$) and $k = 0, 1, \dots, 2(q-1) - 1$. One verifies that the odd powers are eigenvalues of $\Omega_1$, which leads to the claimed characteristic polynomial. In particular, $\Omega_1$ is a regular matrix and since $\Omega_0 x = 0$ allows the constant vector as solution, it follows that this is also the only solution. But then $s_v(t)$ is the constant vector, for $t = 0, \dots, q - 1$; since $s_v(0) = 0$ and
$$\sum_{t=0}^{q-1} s_v(t) = \sum_{t=0}^{q-1} \left( \left\lfloor \frac{(t+1)v}{p} \right\rfloor - \left\lfloor \frac{tv}{p} \right\rfloor \right) = \left\lfloor \frac{qv}{p} \right\rfloor = \left\lfloor \frac{1 + (q-u)p}{p} \right\rfloor = q - u > 0,$$
we have reached a contradiction, which completes the proof of the Theorem.

Remark 5.1. The careful reader may have noted that we started from a redundant system of equations, which allowed for the substitution $a \to p - a$, and we obtained a non-redundant system of rank $q - 1$. This may seem surprising, especially if $q - 1 > \frac{p-1}{2}$. However, tracing back the use of $s_v(t) = s_v(t + q)$, one notes that (5.6) is invariant under the above substitution, while (5.7) is not.

Theorem 1.1 is tailored to the needs of the proof of Catalan's equation. Proposition 4.2 allows for more general results and raises more general questions than the Theorem, questions and results which shall be presented separately.
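The telescoping sum that closes the proof of the Theorem can be verified numerically. The Python sketch below is an illustration only, with hypothetical function names: it evaluates s_v(t) for t = 0, ..., q − 1 and compares the total with q − u, where u and v are the integers with up + vq = pq + 1.

```python
def s(v, z, p):
    """s_v(z) = floor((z+1)v/p) - floor(zv/p); Python's // is floor division."""
    return ((z + 1) * v) // p - (z * v) // p

def s_values(p, q):
    """Return the list (s_v(0), ..., s_v(q-1)) together with u and v."""
    u = pow(p, -1, q)  # 0 < u < q, u*p ≡ 1 (mod q)
    v = pow(q, -1, p)  # 0 < v < p, v*q ≡ 1 (mod p)
    return [s(v, t, p) for t in range(q)], u, v

if __name__ == "__main__":
    for p, q in [(7, 5), (11, 7), (13, 5)]:
        values, u, v = s_values(p, q)
        assert set(values) <= {0, 1}   # 0 <= s_v(z) <= 1
        assert values[0] == 0          # s_v(0) = floor(v/p) = 0
        assert sum(values) == q - u    # telescoping: floor(q*v/p) = q - u
        print(p, q, values, "sum =", sum(values))
```

Since s_v(0) = 0 while the total q − u is positive, the vector (s_v(t)) cannot be constant, which is exactly the contradiction used above.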
The general question is the following: given $\ell$, $n = \ell \cdot n'$ as in the previous section, and if $T \subset \mathbb{F}_\ell[G_n]$ is one of the supports $\mathrm{supp}(A_\ell)$, $\mathrm{supp}(F)$, is it possible that $T$ contains a full $\mathbb{Q}$-rational component of $\mathbb{F}_\ell[G]$? Further manipulation of the fundamental system (5.4), together with heuristics similar to the one above (and the one used by Washington in [Wa] for analysing the likelihood of Vandiver's conjecture), suggests that this should never happen, independently of the size of $\ell$, $n'$, as long as the degree of the rational components is at least 3. Lacking a proof, we conjecture that it is impossible and will investigate this conjecture in future work.

Conjecture 5.2. Let $\ell$, $n$ be as in the previous section and $T \subset \mathbb{F}_\ell[G]$ be one of $\mathrm{supp}(A_\ell)$, $\mathrm{supp}(F)$. Let
$$g(X) \in \mathbb{Z}[X] \quad\text{with}\quad g(X) \,\Big|\, \frac{X^{\varphi(n)} - 1}{X^2 - 1}$$
be an irreducible factor of degree at least 3 and let
$$X_g = \{ \chi \in G : g(X) \equiv 0 \bmod (\ell, f_\chi(X)) \}.$$
Then $\bigcup_{\chi \in X_g} M_\chi \not\subset T$.

Acknowledgments. I thank Francisco Thaine for his suggestions and encouragement shown during the development of this paper.

References
[Go] Gouvêa, F. Q.: p-adic Numbers, An Introduction, Second Edition, Springer Universitext (1991).
[La] Lang, S.: Algebra, Third Edition, Springer (2002), Graduate Texts in Mathematics 211.
[Le] Leopoldt, H. W.: Über Einheitengruppe und Klassenzahl reeller abelscher Zahlkörper, Abhandlungen der Deutschen Akademie der Wissenschaften, Berlin, Kl. Math. Nat. 1953, no. 2 (1954).
[Lo] Long, R.: Algebraic number theory, Marcel Dekker, Series in Pure and Applied Mathematics (1977).
[Mi] Mihăilescu, P.: Primary Cyclotomic Units and a Proof of Catalan's Conjecture, J. reine angew. Math. 572 (2004), pp. 167–195.
[Ono] Ono, T.: Algebraic Number Theory, Academic Press.
[Wa] Washington, L.: Introduction to Cyclotomic Fields, Second Edition, Springer (1996), Graduate Texts in Mathematics 83.
Preda Mihăilescu, Institut für Mathematik der Universität Göttingen
F-thresholds and Bernstein-Sato Polynomials
Mircea Mustaţă, Shunsuke Takagi and Kei-ichi Watanabe
Introduction

We introduce and study invariants of singularities in positive characteristic called F-thresholds. They give an analogue of the jumping coefficients of multiplier ideals in characteristic zero. Unlike these, however, the F-thresholds are not defined via resolution of singularities, but via the action of the Frobenius morphism. We are especially interested in the connection between the invariants of an ideal $\mathfrak{a}$ in characteristic zero and the invariants of the different reductions mod $p$ of $\mathfrak{a}$. Our main point is that this relation depends on arithmetic properties of $p$. We present several examples, as well as some questions on this topic. In a slightly different direction, we describe a new connection between invariants mod $p$ and the roots of the Bernstein-Sato polynomial.

We will restrict ourselves to the case of an ambient smooth variety, when our invariants have a down-to-earth description. Let $(R, \mathfrak{m})$ be a regular local ring of characteristic $p > 0$. We want to measure the singularities of a nonzero ideal $\mathfrak{a} \subseteq \mathfrak{m}$. For every ideal $J \subseteq \mathfrak{m}$ containing $\mathfrak{a}$ in its radical, and for every $e \ge 1$, we put
$$\nu^J_{\mathfrak{a}}(p^e) := \max\{ r \mid \mathfrak{a}^r \not\subseteq J^{[p^e]} \},$$
where $J^{[p^e]} = (f^{p^e} \mid f \in J)$. One can check that the limit
$$c^J(\mathfrak{a}) := \lim_{e \to \infty} \frac{\nu^J_{\mathfrak{a}}(p^e)}{p^e}$$
exists and is finite. We call this limit the F-threshold of $\mathfrak{a}$ with respect to $J$. When $J = \mathfrak{m}$, we simply write $c(\mathfrak{a})$ and $\nu_{\mathfrak{a}}(p^e)$. The invariant $c(\mathfrak{a})$ was introduced in [TW] under the name of F-pure threshold.

In the first section we define these invariants and give their basic properties. The second section is devoted to the connection with the generalized test ideals introduced by Hara and Yoshida in [HY]. More precisely, we show that our invariants are the jumping coefficients for their test ideals. As it was shown in [HY] that the test ideals satisfy properties similar to those of the multiplier ideals in characteristic zero, it is not surprising that the F-thresholds behave similarly to the jumping coefficients of the multiplier ideals from [ELSV]. Such an analogy was also stressed in [TW], where it was shown that
the smallest F-threshold $c(\mathfrak{a})$ behaves in the same way as the smallest jumping coefficient in characteristic zero (known as the log canonical threshold). We point out that it is not known whether the analogues of two basic properties of jumping coefficients of multiplier ideals hold in our setting: whether $c^J(\mathfrak{a})$ is always a rational number and whether the set of all F-thresholds of $\mathfrak{a}$ is discrete.

There are very interesting questions related to the invariants attached to different reductions mod $p$ of a characteristic zero ideal $\mathfrak{a}$. We discuss these in §3. For simplicity, we assume that $\mathfrak{a}$ and $J$ are ideals in $\mathbb{Z}[X_1, \dots, X_n]$, contained in $(X_1, \dots, X_n)$ and such that $\mathfrak{a}$ is contained in the radical of $J$. Let us denote by $\mathfrak{a}_p$ and $J_p$ the localizations at $(X_1, \dots, X_n)$ of the images of $\mathfrak{a}$ and $J$, respectively, in $\mathbb{F}_p[X_1, \dots, X_n]$. We want to compare our invariants mod $p$ (which we write as $\nu^J_{\mathfrak{a}}(p^e)$ and $c^J(\mathfrak{a}_p)$) with the characteristic zero invariants of $\mathfrak{a}$ (more precisely, with the invariants around the origin of the image $\mathfrak{a}_{\mathbb{Q}}$ of $\mathfrak{a}$ in $\mathbb{Q}[X_1, \dots, X_n]$).

First, let us denote by $\mathrm{lc}_0(\mathfrak{a})$ the log canonical threshold of $\mathfrak{a}_{\mathbb{Q}}$ around the origin. It follows from results of Hara and Watanabe (see [HW]) that if $p \gg 0$ then $c(\mathfrak{a}_p) \le \mathrm{lc}_0(\mathfrak{a})$ and $\lim_{p \to \infty} c(\mathfrak{a}_p) = \mathrm{lc}_0(\mathfrak{a})$. Moreover, results of Hara and Yoshida from [HY] allow the extension of these formulas to higher jumping numbers (see Theorems 3.3 and 3.4 below for statements). It is easy to give examples in which $c(\mathfrak{a}_p) \neq \mathrm{lc}_0(\mathfrak{a})$ for infinitely many $p$. On the other hand, one conjectures that there are infinitely many $p$ with $c(\mathfrak{a}_p) = \mathrm{lc}_0(\mathfrak{a})$. We give examples in which more is true: there is a positive integer $N$ such that for $p \equiv 1 \pmod{N}$ we have equality $c(\mathfrak{a}_p) = \mathrm{lc}_0(\mathfrak{a})$. Moreover, in these examples one can find rational functions $R_i \in \mathbb{Q}(t)$ associated to every $i \in \{1, \dots, N - 1\}$ relatively prime to $N$, such that $c(\mathfrak{a}_p) = R_i(p)$ whenever $p \gg 0$ satisfies $p \equiv i \pmod{N}$.
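The dependence on the residue class of p can be observed on a small example. The following Python sketch is an illustration and not taken from the paper: it assumes f = x² + y³ with J = m = (x, y), and the function name `nu_cusp` is hypothetical. Since f^r = Σ_i C(r, i) x^{2i} y^{3(r−i)} is a sum of distinct monomials and m^{[q]} = (x^q, y^q) is a monomial ideal, f^r lies outside m^{[q]} exactly when some i has C(r, i) ≢ 0 mod p, 2i < q and 3(r − i) < q.

```python
from fractions import Fraction
from math import comb

def nu_cusp(p, e):
    """nu_f(p^e) for f = x^2 + y^3 and J = (x, y) in characteristic p."""
    q = p ** e
    r = 0
    while True:
        nxt = r + 1
        # f^nxt is outside (x^q, y^q) iff some monomial x^(2i) y^(3(nxt-i))
        # has a coefficient nonzero mod p and both exponents below q
        outside = any(
            comb(nxt, i) % p != 0 and 2 * i < q and 3 * (nxt - i) < q
            for i in range(nxt + 1)
        )
        if not outside:
            return r  # membership persists for larger powers, so r is the maximum
        r = nxt

if __name__ == "__main__":
    for p in (5, 7):
        for e in (1, 2):
            nu = nu_cusp(p, e)
            print(p, e, nu, Fraction(nu, p ** e), Fraction(nu + 1, p ** e))
```

The log canonical threshold of x² + y³ at the origin is 5/6. For p = 7 the sketch gives ν(7) = 5 and ν(49) = 40, with ratios 5/7 and 40/49 below 5/6. For p = 5 it gives ν(5) = 3 and ν(25) = 19; by Proposition 1.9 below, c(f) ≤ (ν(q) + 1)/q = 4/5 in this case, which is strictly smaller than 5/6.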
It would be interesting to understand better when such a behavior holds. As the example of a cone over an elliptic curve without complex multiplication shows, this cannot hold in general. On the other hand, motivated by our examples one can speculate that the following holds: there is always a number field $K$ such that whenever the prime $p$ is large enough and completely split in $K$, then $c(\mathfrak{a}_p) = \mathrm{lc}_0(\mathfrak{a})$.

A surprising fact is that our invariants for $\mathfrak{a}_p$ are related to the Bernstein-Sato polynomial $b_{\mathfrak{a},0}(s)$ of $\mathfrak{a}$. More precisely, we show that for all $p \gg 0$ and for all $e$, we have $b_{\mathfrak{a},0}(\nu^J_{\mathfrak{a}}(p^e)) \equiv 0 \pmod{p}$. We show in some examples in §4 how to use this to give roots of the Bernstein-Sato polynomial (and not just roots mod $p$). In these examples we will see the following behavior: given some ideal $J$ containing $\mathfrak{a}$ in its radical, and $e \ge 1$, we can find $N$ such that for all $i \in \{1, \dots, N - 1\}$ relatively prime to $N$ there are polynomials $P_i \in \mathbb{Q}[t]$ of degree $e$ satisfying $\nu^J_{\mathfrak{a}}(p^e) = P_i(p)$ for all $p \gg 0$ with $p \equiv i \pmod{N}$. The previous observation implies that $b_{\mathfrak{a},0}(P_i(0))$ is divisible by $p$ for every such $p$. By Dirichlet's Theorem we deduce that $P_i(0)$ is a root of $b_{\mathfrak{a},0}$.
An interesting question is which roots can be obtained by the above method. It is shown in [BMS1] that for monomial ideals the functions $p \mapsto \nu^J_{\mathfrak{a}}(p^e)$ behave as described above, and moreover, all roots of the Bernstein-Sato polynomial are given by this procedure. On the other hand, Example 4.1 below shows that in some cases there are roots which cannot be given by our method.

1. F-thresholds

Let $(R, \mathfrak{m}, k)$ be a regular local ring of dimension $n$ and of characteristic $p > 0$. Since $R$ is regular, the Frobenius morphism $F : R \to R$, $F(x) = x^p$ is flat. In what follows $q$ denotes a positive power of $p$, and if $I = (y_1, \dots, y_s)$ is an ideal in $R$, then
$$I^{[q]} := (y^q \mid y \in I) = (y_1^q, \dots, y_s^q).$$
We will use below the fact that as $R$ is regular, every ideal $I$ is equal to its tight closure (see, for example [HH]). This means that if $u, f \in R$ are such that $u f^q \in I^{[q]}$ for all $q \gg 0$, and if $u \neq 0$, then $f \in I$. This is easy to see: by the flatness of the Frobenius morphism we have $(I^{[q]} : f^q) = (I : f)^{[q]}$. Therefore $u$ lies in $\bigcap_q (I : f)^{[q]}$, which is zero if $f$ is not in $I$.

Let $\mathfrak{a}$ be a fixed ideal of $R$, such that $(0) \neq \mathfrak{a} \subseteq \mathfrak{m}$. To each ideal $J$ of $R$ such that $\mathfrak{a} \subseteq \mathrm{Rad}(J) \subseteq \mathfrak{m}$, we associate a threshold as follows. For every $q$, let
$$\nu^J_{\mathfrak{a}}(q) := \max\{ r \in \mathbb{N} \mid \mathfrak{a}^r \not\subseteq J^{[q]} \}.$$
As $\mathfrak{a} \subseteq \mathrm{Rad}(J)$, this is a nonnegative integer.

Lemma 1.1. For every $\mathfrak{a}$, $J$ and $q$ as above, we have $\nu^J_{\mathfrak{a}}(pq) \ge p \cdot \nu^J_{\mathfrak{a}}(q)$.

Proof. The inequality is a consequence of the fact that if $u \notin J^{[q]}$, then $u^p \notin J^{[pq]}$.

It follows from the above lemma that
$$\lim_{q \to \infty} \frac{\nu^J_{\mathfrak{a}}(q)}{q} = \sup_q \frac{\nu^J_{\mathfrak{a}}(q)}{q}. \tag{1.1}$$
We call this limit the F-threshold of the pair $(R, \mathfrak{a})$ (or simply of $\mathfrak{a}$) with respect to $J$, and we denote it by $c^J(\mathfrak{a})$.

Remark 1.2. The above limit is finite. In fact, if $\mathfrak{a}$ is generated by $r$ elements, and if $\mathfrak{a}^N \subseteq J$, then
$$\mathfrak{a}^{N(r(p^e - 1) + 1)} \subseteq (\mathfrak{a}^{[p^e]})^N = (\mathfrak{a}^N)^{[p^e]} \subseteq J^{[p^e]}.$$
Therefore $\nu^J_{\mathfrak{a}}(p^e) \le N(r(p^e - 1) + 1) - 1$. Dividing by $p^e$ and taking the limit gives $c^J(\mathfrak{a}) \le Nr$. We also have $c^J(\mathfrak{a}) > 0$. More precisely, as $\mathfrak{a} \neq (0)$, Krull's Intersection Theorem shows that we can find $e$ such that $\mathfrak{a} \not\subseteq J^{[p^e]}$, so $c^J(\mathfrak{a}) \ge 1/p^e$. We make the convention $c^R(\mathfrak{a}) = 0$.
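For monomial ideals and J = m the numbers ν(q) can be computed by a direct search, which gives a concrete feel for the limit (1.1). The Python sketch below is an illustration only: it assumes that the ideal is given by exponent vectors of monomial generators (each nonzero) and that J = (x_1, ..., x_n); the function name is hypothetical. It uses the fact that a^r is not contained in m^{[q]} exactly when some product of r generators has every exponent smaller than q.

```python
from fractions import Fraction

def nu_monomial(gens, q):
    """nu_a(q) for the monomial ideal a with exponent-vector generators gens, J = m.

    Each generator must be a nonzero exponent vector, so the search terminates.
    """
    frontier = {tuple(0 for _ in gens[0])}  # exponent vector of the empty product
    r = 0
    while True:
        nxt = set()
        for e in frontier:
            for g in gens:
                s = tuple(x + y for x, y in zip(e, g))
                if all(c < q for c in s):  # monomial still outside (x_i^q)
                    nxt.add(s)
        if not nxt:
            return r  # no product of r + 1 generators avoids m^[q]
        frontier, r = nxt, r + 1

if __name__ == "__main__":
    gens = [(2, 0), (0, 3)]  # a = (x^2, y^3)
    for q in (7, 49):
        nu = nu_monomial(gens, q)
        print(q, nu, Fraction(nu, q))
```

For a = (x², y³) one gets ν(7) = 5 and ν(49) = 40, in agreement with Lemma 1.1 (40 ≥ 7 · 5). Counting directly, ν(q) = ⌊(q−1)/2⌋ + ⌊(q−1)/3⌋, so the ratios increase to 5/6. For a = m in three variables the same function returns 3(q − 1), as in Example 1.3.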
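Example 1.3. If $J$ is an ideal generated by a regular sequence $y_1, \dots, y_r$ in $R$, then $\nu^J_J(q) = r(q - 1)$ for all $q$. Therefore $c^J(J) = r$.

Question 1.4. Is it true that for all nonzero ideals $\mathfrak{a}$ and $J$ with $\mathfrak{a} \subseteq \mathrm{Rad}(J) \subseteq \mathfrak{m}$, the F-threshold $c^J(\mathfrak{a})$ is a rational number?

Remark 1.5. The F-pure threshold $c(\mathfrak{a})$ was defined in [TW] (under the assumption that the Frobenius morphism $F$ on $R$ is finite) as the supremum of those $t \in \mathbb{Q}_+$ such that the pair $(R, \mathfrak{a}^t)$ is F-pure. Under this extra assumption on $F$, since $R$ is regular, the pair $(R, \mathfrak{a}^t)$ is F-pure if and only if for $q \gg 0$ we have $\mathfrak{a}^{\lfloor t(q-1) \rfloor} \not\subseteq \mathfrak{m}^{[q]}$ (see Lemma 3.9 in [Ta]). Here we use the notation $\lfloor \alpha \rfloor$ for the largest integer $\le \alpha$. The above condition is equivalent to $\nu^{\mathfrak{m}}_{\mathfrak{a}}(q) \ge \lfloor t(q - 1) \rfloor$ for $q \gg 0$. It follows from our definition that if $(R, \mathfrak{a}^t)$ is F-pure, then $t \le c^{\mathfrak{m}}(\mathfrak{a})$, and that if $t < c^{\mathfrak{m}}(\mathfrak{a})$, then $(R, \mathfrak{a}^t)$ is F-pure. Therefore the F-pure threshold $c(\mathfrak{a})$ is equal to the F-threshold $c^{\mathfrak{m}}(\mathfrak{a})$ of $\mathfrak{a}$ with respect to the maximal ideal. We will keep the notation $c(\mathfrak{a})$ for $c^{\mathfrak{m}}(\mathfrak{a})$, and moreover, we will put $\nu_{\mathfrak{a}}(q) := \nu^{\mathfrak{m}}_{\mathfrak{a}}(q)$. Note that the F-pure threshold was defined in [TW] without the regularity assumption on $R$, but in what follows we will work under this restrictive hypothesis.

Remark 1.6. In characteristic zero, the only analogue of $J^{[q]}$ which does not depend on the choice of generators for $J$ is the usual power $J^q$. If we imitate the definition of the F-pure threshold in this setting, replacing $\mathfrak{m}^{[q]}$ by $\mathfrak{m}^q$, then we get $1/\mathrm{mult}_0(\mathfrak{a})$, where $\mathrm{mult}_0(\mathfrak{a})$ is the largest power of $\mathfrak{m}$ containing $\mathfrak{a}$.

Here are a few properties of F-thresholds. When $J = \mathfrak{m}$ these have been proved in [TW] in a more general setting.

Proposition 1.7. Let $\mathfrak{a}, \mathfrak{b}, J \subseteq \mathfrak{m}$ be nonzero ideals, such that $\mathfrak{a}$ and $\mathfrak{b}$ are contained in the radical of $J$.
(1) If $\mathfrak{a} \subseteq \mathfrak{b}$, then $c^J(\mathfrak{a}) \le c^J(\mathfrak{b})$.
(2) $c^J(\mathfrak{a}^s) = \frac{c^J(\mathfrak{a})}{s}$ for every positive integer $s$.
(3) If $\mathfrak{a} \subseteq J^s$ and $J$ can be generated by $m$ elements, then $c^J(\mathfrak{a}) \le m/s$. If $\mathfrak{a} \not\subseteq \mathfrak{m}^{s+1}$ and $J \subseteq \mathfrak{m}^{\ell}$, then $c^J(\mathfrak{a}) \ge \ell/s$.
(4) If $\overline{\mathfrak{a}}$ is the integral closure of $\mathfrak{a}$, then $c^J(\overline{\mathfrak{a}}) = c^J(\mathfrak{a})$.
(5) For every $q$, we have $\frac{\nu^J_{\mathfrak{a}}(q)}{q} < c^J(\mathfrak{a})$.
(6) We have $c^J(\mathfrak{a} + \mathfrak{b}) \le c^J(\mathfrak{a}) + c^J(\mathfrak{b})$.

Proof. The first assertion is trivial: since $\mathfrak{a} \subseteq \mathfrak{b}$, we get $\nu^J_{\mathfrak{a}}(q) \le \nu^J_{\mathfrak{b}}(q)$ for all $q$. Hence $c^J(\mathfrak{a}) \le c^J(\mathfrak{b})$.

Given $s$ and $q$, we have $(\mathfrak{a}^s)^r \not\subseteq J^{[q]}$ if and only if $rs \le \nu^J_{\mathfrak{a}}(q)$. Hence $\nu^J_{\mathfrak{a}^s}(q) = \lfloor \nu^J_{\mathfrak{a}}(q)/s \rfloor$, which after dividing by $q$ and passing to the limit gives (2).

If $\mathfrak{a} \subseteq J^s$, then by (1) and (2) we have $c^J(\mathfrak{a}) \le c^J(J^s) = \frac{c^J(J)}{s} \le \frac{m}{s}$. The last inequality follows from Remark 1.2.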
Suppose now that $\mathfrak{a} \not\subseteq \mathfrak{m}^{s+1}$. If $c^J(\mathfrak{a}) < \ell/s$, then by taking $q$ large enough we can find $r$ such that
$$c^J(\mathfrak{a}) < \frac{r}{q} < \frac{\ell}{s}. \tag{1.2}$$
The first inequality shows that $\mathfrak{a}^r \subseteq J^{[q]} \subseteq \mathfrak{m}^{\ell q}$. As by hypothesis $\mathfrak{a}^r \not\subseteq \mathfrak{m}^{rs+1}$, we deduce $rs + 1 > \ell q$. This contradicts the second inequality in (1.2).

For (4) note first that $c^J(\mathfrak{a}) \le c^J(\overline{\mathfrak{a}})$ follows from (1). For the reverse inequality, recall that by general properties of the integral closure, there is a fixed positive integer $s$ such that $\overline{\mathfrak{a}}^{\ell+s} \subseteq \mathfrak{a}^{\ell}$ for all $\ell$. Hence we have $\nu^J_{\mathfrak{a}}(q) \ge \nu^J_{\overline{\mathfrak{a}}}(q) - s$ for every $q$, which implies $c^J(\mathfrak{a}) \ge c^J(\overline{\mathfrak{a}})$.

In order to prove (5) suppose that for some $q$ we have $\nu^J_{\mathfrak{a}}(q)/q = c^J(\mathfrak{a})$. If $\nu^J_{\mathfrak{a}}(q) = r$, this implies that $\nu^J_{\mathfrak{a}}(qq') = rq'$ for all $q'$. Therefore $\mathfrak{a}^{rq'+1} \subseteq J^{[qq']}$ for all $q'$. As $J^{[q]}$ is equal to its tight closure, this gives $\mathfrak{a}^r \subseteq J^{[q]}$, a contradiction.

We now prove (6). If $(\mathfrak{a} + \mathfrak{b})^r \not\subseteq J^{[q]}$, then there are $\ell_1$ and $\ell_2$ such that $\ell_1 + \ell_2 = r$ and $\mathfrak{a}^{\ell_1} \not\subseteq J^{[q]}$, $\mathfrak{b}^{\ell_2} \not\subseteq J^{[q]}$. Therefore $\nu^J_{\mathfrak{a}+\mathfrak{b}}(q) \le \nu^J_{\mathfrak{a}}(q) + \nu^J_{\mathfrak{b}}(q)$ for all $q$, which gives (6).

As pointed out in [TW], the F-pure threshold can be considered as an analogue of the log canonical threshold. Similarly, the F-thresholds play the role of the jumping coefficients from [ELSV]. We will see this analogy more clearly in the next sections.

In what follows we fix the ideal $\mathfrak{a}$, and study the F-thresholds which appear for various $J$. We record in the next proposition some easy properties which deal with the variation of $J$.

Proposition 1.8.
(1) If $\mathfrak{a}$ and $J_1$, $J_2$ are as above with $J_1 \subseteq J_2$, then $c^{J_2}(\mathfrak{a}) \le c^{J_1}(\mathfrak{a})$. In particular, the F-pure threshold $c(\mathfrak{a})$ is the smallest (nonzero) F-threshold of $\mathfrak{a}$.
(2) If $J = \bigcap_{\lambda \in \Gamma} J_\lambda$, then $c^J(\mathfrak{a}) = \sup_{\lambda \in \Gamma} c^{J_\lambda}(\mathfrak{a})$.
(3) We have $c^{J^{[q]}}(\mathfrak{a}) = q \cdot c^J(\mathfrak{a})$ for every $q$.

Proof. The first assertion is straightforward, as we have $\nu^{J_2}_{\mathfrak{a}}(q) \le \nu^{J_1}_{\mathfrak{a}}(q)$ for all $q$. For the second assertion, note that since the Frobenius morphism is flat, we have $J^{[q]} = \bigcap_\lambda J_\lambda^{[q]}$, so $\nu^J_{\mathfrak{a}}(q) = \max_\lambda \nu^{J_\lambda}_{\mathfrak{a}}(q)$, which gives the formula for $c^J(\mathfrak{a})$. The equality in (3) is trivial, as in the definition of $c^J(\mathfrak{a})$ we have a limit.

When the ideal $\mathfrak{a}$ is generated by one element, then we can say more. The next proposition shows that in this case the F-threshold determines the numbers $\nu^J_{\mathfrak{a}}(q)$ for all $q$. If $\mathfrak{a} = (f)$, we simply write $\nu^J_f(q)$ and $c^J(f)$. We denote by $\lceil \alpha \rceil$ the smallest integer $\ge \alpha$.
Proposition 1.9. Let $J \subseteq \mathfrak{m}$ be an ideal whose radical contains $f \neq 0$. For every $q$ we have
$$\frac{\nu^J_f(pq) + 1}{pq} \le \frac{\nu^J_f(q) + 1}{q},$$
so $c^J(f) = \inf_q \frac{\nu^J_f(q) + 1}{q}$. Moreover, we have $\nu^J_f(q) = \lceil c^J(f)\, q \rceil - 1$ for all $q$.

Proof. For the first assertion it is enough to note that if $f^{\nu^J_f(q)+1}$ lies in $J^{[q]}$ then $f^{p(\nu^J_f(q)+1)}$ is in $J^{[pq]}$. The last statement follows from
$$\frac{\nu^J_f(q)}{q} < c^J(f) \le \frac{\nu^J_f(q) + 1}{q}.$$
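The formula of Proposition 1.9 can be checked on a monomial. The Python sketch below is an illustration only: it assumes f = x²y³ and J = m = (x, y), for which f^r lies outside m^{[q]} exactly when 2r < q and 3r < q, so that c(f) = 1/3 by a direct count; the function name is hypothetical.

```python
from fractions import Fraction
from math import ceil

def nu_monomial_f(exps, q):
    """nu_f(q) for the monomial f with (positive) exponents exps and J = m."""
    r = 0
    # f^(r+1) is outside (x_i^q) iff every exponent of f^(r+1) is below q
    while all((r + 1) * e < q for e in exps):
        r += 1
    return r

if __name__ == "__main__":
    exps = (2, 3)                    # f = x^2 * y^3
    c = Fraction(1, max(exps))       # c(f) = 1/3 for this example
    previous_bound = None
    for q in (5, 25, 125):
        nu = nu_monomial_f(exps, q)
        assert nu == ceil(c * q) - 1             # nu_f(q) = ceil(c q) - 1
        bound = Fraction(nu + 1, q)
        assert Fraction(nu, q) < c <= bound      # nu/q < c <= (nu+1)/q
        if previous_bound is not None:
            assert bound <= previous_bound       # (nu(pq)+1)/(pq) <= (nu(q)+1)/q
        previous_bound = bound
        print(q, nu, bound)
```

For p = 5 this gives ν = 1, 8, 41 at q = 5, 25, 125, and the upper bounds 2/5, 9/25, 42/125 decrease towards 1/3.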
We clearly have $c^{(f)}(f) = 1$, so 1 is always an F-threshold for principal ideals. The next proposition shows that moreover, in this case it is enough to understand the thresholds in $(0, 1)$.

Proposition 1.10. If $J$ is an ideal containing the nonzero $f$ in its radical, then
\[ c^{fJ}(f) = c^J(f) + 1, \qquad c^{(J : f)}(f) = \max\{c^J(f) - 1,\ 0\}. \tag{1.3} \]
In particular, a nonnegative $\lambda$ is an F-threshold of $\mathfrak{a} = (f)$ if and only if $\lambda + 1$ is.

Proof. The proof is straightforward. The only thing to notice is that since the Frobenius morphism is flat, we have $(J : f)^{[q]} = J^{[q]} : f^q$ for all $q$.

Remark 1.11. It follows easily from Proposition 1.9 that when $\mathfrak{a}$ is principal, $c^J(\mathfrak{a})$ is a rational number if and only if the function $e \mapsto \overline{\nu}^J_{\mathfrak{a}}(p^e) := \nu^J_{\mathfrak{a}}(p^e) - p\,\nu^J_{\mathfrak{a}}(p^{e-1})$ is eventually periodic. Furthermore, this is equivalent to the fact that the series
\[ P^J_{\mathfrak{a}}(t) = \sum_{e \ge 1} \nu^J_{\mathfrak{a}}(p^e)\, t^e \]
is a rational function. One could ask more generally whether for any $\mathfrak{a}$ the above series is a rational function (again, this would imply that $c^J(\mathfrak{a})$ is rational). It follows from [BMS1] that this stronger assertion holds for monomial ideals. In fact, in this case it is again true that the function $\overline{\nu}^J_{\mathfrak{a}}$ is eventually periodic.

Remark 1.12. In the study of singularities in characteristic zero, one can often reduce the invariant of an arbitrary ideal $\mathfrak{a}$ to that of a principal ideal $(f)$ by taking $f$ general in $\mathfrak{a}$. This does not work in our setting. For example, let $\mathfrak{a} = \mathfrak{m}^{[p]}$. We have $c(\mathfrak{a}) = c(\mathfrak{m}^p) = n/p$, but for every $f \in \mathfrak{a}$, $\nu_f(p) = 0$, so $c(f) \le 1/p$.
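The computation behind Remark 1.12 can be spelled out directly. The following short derivation is ours, not taken from the original text, and assumes $R = k[X_1, \dots, X_n]_{(X_1, \dots, X_n)}$, $J = \mathfrak{m}$ and $q = p^e$ with $e \ge 1$.

```latex
% a = m^{[p]} = (X_1^p, ..., X_n^p),  J = m,  q = p^e.
% a^r is generated by the monomials X^{p i} with |i| = r, so
\[
\mathfrak{a}^r \not\subseteq \mathfrak{m}^{[q]}
\iff \exists\, i \in \mathbb{N}^n \text{ with } \textstyle\sum_j i_j = r
     \text{ and } p\, i_j \le q - 1 \text{ for all } j
\iff r \le n\,(p^{e-1} - 1).
\]
\[
\nu_{\mathfrak{a}}(p^e) = n\,(p^{e-1} - 1), \qquad
c(\mathfrak{a}) = \lim_{e \to \infty} \frac{n\,(p^{e-1} - 1)}{p^e} = \frac{n}{p}.
\]
% Every f in a already lies in m^{[p]}, so nu_f(p) = 0, and Proposition 1.9 gives
\[
c(f) \le \frac{\nu_f(p) + 1}{p} = \frac{1}{p}.
\]
```

So for $n \ge 2$ no single element of $\mathfrak{a}$ attains the F-pure threshold of $\mathfrak{a}$.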
2. F-thresholds as jumping coefficients

Test ideals are a very useful tool in tight closure theory. In [HY] Hara and Yoshida introduced a generalization of test ideals in the setting of pairs. These ideals enjoy properties similar to those of multiplier ideals in characteristic zero. In fact, there is a strong connection between the test ideals and the multiplier ideals via reduction mod $p$ (see [HY], and also the next section). We start by reviewing their definition in our particular setting, in order to describe the connection between test ideals and F-thresholds.

Let us first fix some notation. Let $E = E(R) := H^n_{\mathfrak{m}}(R)$ be the top local cohomology module of $R$, so $E$ is isomorphic to the injective hull of $k$. If $x_1, \dots, x_n$ form a regular system of parameters in $R$, then the completion $\hat{R}$ of $R$ is isomorphic to the formal power series ring $k[[X_1, \dots, X_n]]$ such that $x_i$ corresponds to $X_i$. Note that we have
\[ E(R) \cong E(\hat{R}) \cong R_{x_1 \cdots x_n} \Big/ \sum_{i=1}^{n} R_{x_1 \cdots \widehat{x_i} \cdots x_n}. \tag{2.1} \]
Whenever working in $E$ we will assume we have fixed such a regular system of parameters, so via the above isomorphism we may represent each element of $E$ as the class $[u/(x_1 \cdots x_n)^d]$ for some $u \in R$ and some $d$.

We will freely use Matlis duality: $\mathrm{Hom}(-, E)$ induces a duality between finitely generated $\hat{R}$-modules and Artinian $\hat{R}$-modules (which are the same as the Artinian $R$-modules). See, for example, [BH] for more on local cohomology and Matlis duality.

On $E$ we have a Frobenius morphism $F_E$ which via the isomorphism in (2.1) is given by $F_E([u/(x_1 \cdots x_n)^d]) = [u^p/(x_1 \cdots x_n)^{pd}]$. $F_E$ is injective. Moreover, if $a \in R \setminus \{0\}$ is such that $a F_E^e(w) = 0$ for all $e$, then $w = 0$. Indeed, this is an immediate consequence of the fact that every ideal is equal to its tight closure.

Let $\mathfrak{a} \subseteq \mathfrak{m}$ be a fixed ideal. For every $r \ge 0$ and $e \ge 1$ we put
\[ Z_{r,e} := \ker(\mathfrak{a}^r F_E^e) = \{ w \in E \mid h F_E^e(w) = 0 \text{ for all } h \in \mathfrak{a}^r \}. \]

Lemma 2.1. If $r < s$, then $Z_{r,e} \subseteq Z_{s,e}$. We have $E = \bigcup_r Z_{r,e}$. Moreover, $Z_{pr,e+1}$ is contained in $Z_{r,e}$.

Proof. The first assertions are clear, and the last one follows from the injectivity of $F_E$ and the fact that $F_E(hw) = h^p F_E(w)$ for all $h \in R$ and $w \in E$.

Definition 2.2. ([HY]) If $\mathfrak{a} \subseteq \mathfrak{m}$ is a nonzero ideal, and if $c \in \mathbb{R}_+$, the test ideal of $\mathfrak{a}$ of exponent $c$ is
\[ \tau(\mathfrak{a}^c) := \mathrm{Ann}_R \Big( \bigcap_{e \ge 1} Z_{\lceil c p^e \rceil, e} \Big). \]
As $E$ is Artinian, it follows from Lemma 2.1 that $\tau(\mathfrak{a}^c) = \mathrm{Ann}_R\, Z_{\lceil c p^e \rceil, e}$ if $e \gg 0$.

For every $c > 0$, let $Z_c := \bigcap_e Z_{\lceil c p^e \rceil, e}$. Note that $Z_c \neq E$. Indeed, if $m \ge c$ is an integer and if $h$ is a nonzero element of $\mathfrak{a}$, then $Z_c \subseteq \bigcap_e Z_{m p^e, e} \subseteq \bigcap_e \ker(h^{m p^e} F_E^e)$, which is equal to the kernel of the multiplication by $h^m$ on $E$ (this follows from the injectivity of $F_E$). Therefore $Z_c$ is a proper submodule of $E$.

If we replace $R$ by $\hat{R}$ and $\mathfrak{a}$ by $\mathfrak{a}\hat{R}$, then $Z_c$ remains the same. For the basic properties of test ideals we refer the reader to [HY]. We prove only the following lemma, which we will need in the next section. See [HY], [HW] and [Smi] for related stronger statements.

Lemma 2.3. For every $c > 0$, the submodule $Z_c$ is the unique maximal proper submodule of $E$ invariant under all $h F_E^e$, where $e \ge 1$ and $h \in \mathfrak{a}^{\lceil c p^e \rceil}$.

Proof. It is clear that $Z_c$ is invariant under $h F_E^e$ as above, as $h F_E^e(Z_{\lceil c p^e \rceil, e}) = 0$ by definition. Since $Z_c$ does not change when we pass to the completion, we may assume that $R$ is complete. In this case every proper submodule of $E$ has nonzero annihilator. Therefore in order to finish the proof it is enough to show that if $g \in R$ is a nonzero element, and if $w \in E$ is such that $g\, \mathfrak{a}^{\lceil c p^e \rceil} F_E^e(w) = 0$ for all $e \ge 1$, then $w \in Z_c$.

Fix $e$ and $h \in \mathfrak{a}^{\lceil c p^e \rceil}$. For every $e'$ we have
\[ g F_E^{e'}(h F_E^e(w)) = g\, h^{p^{e'}} F_E^{e+e'}(w) = 0, \]
as $h^{p^{e'}} \in \mathfrak{a}^{p^{e'} \lceil c p^e \rceil} \subseteq \mathfrak{a}^{\lceil c p^{e+e'} \rceil}$. This implies that $h F_E^e(w) = 0$, which completes the proof.
Our goal is to show that the F-thresholds we have introduced in the previous section can be interpreted as jumping coefficients for the test ideals. We start by interpreting the function $\nu^J_{\mathfrak{a}}$ in terms of the Frobenius morphism on $E$.

Lemma 2.4. Let $\mathfrak{a}$ and $J \subseteq \mathfrak{m}$ be nonzero ideals, with $\mathfrak{a}$ contained in the radical of $J$. If $M$ is a submodule of $E$ such that $J = \mathrm{Ann}_R(M)$, then $\nu^J_{\mathfrak{a}}(p^e)$ is the largest $r$ such that $M \not\subseteq Z_{r,e}$.

Proof. For every $w \in M$ we put $J_w = \mathrm{Ann}_R\, w$, so $J = \bigcap_{w \in M} J_w$. If $w = [u/(x_1 \cdots x_n)^d]$, then $J_w^{[p^e]} = (x_1^d, \dots, x_n^d)^{[p^e]} : u^{p^e}$. For every $w$, we see that $\nu^{J_w}_{\mathfrak{a}}(p^e)$ is the largest $r$ such that $w \notin Z_{r,e}$. As $\nu^J_{\mathfrak{a}} = \max_{w \in M} \nu^{J_w}_{\mathfrak{a}}$, we get the assertion in the lemma.

Remark 2.5. By Matlis duality, we may take $M = \mathrm{Ann}_E(J)$ in the above lemma.

For future reference we also include the next lemma, whose proof is immediate from the definition.

Lemma 2.6. If $\mathfrak{a} \subseteq \mathfrak{b}$, then $\tau(\mathfrak{a}^c) \subseteq \tau(\mathfrak{b}^c)$ for every $c \in \mathbb{R}_+$. If $c_1 < c_2$, then $\tau(\mathfrak{a}^{c_2}) \subseteq \tau(\mathfrak{a}^{c_1})$ for every ideal $\mathfrak{a}$.
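Proposition 2.7. If $\mathfrak{a} \subseteq \mathfrak{m}$ is a nonzero ideal contained in the radical of $J$, then
\[ \tau(\mathfrak{a}^{c^J(\mathfrak{a})}) \subseteq J. \]
Going the other way, if $\alpha \in \mathbb{R}_+$, then $\mathfrak{a}$ is contained in the radical of $\tau(\mathfrak{a}^\alpha)$ and
\[ c^{\tau(\mathfrak{a}^\alpha)}(\mathfrak{a}) \le \alpha. \]
Therefore the maps $J \mapsto c^J(\mathfrak{a})$ and $\alpha \mapsto \tau(\mathfrak{a}^\alpha)$ give a bijection between the set of test ideals of $\mathfrak{a}$ and the set of F-thresholds of $\mathfrak{a}$.

Proof. For the first statement, let $M = \mathrm{Ann}_E\, J$, so by Matlis duality we need to prove that $M \subseteq Z_{\lceil c^J(\mathfrak{a}) p^e \rceil, e}$ for all $e$. This follows from Lemma 2.4 and the fact that $\lceil c^J(\mathfrak{a}) p^e \rceil > \nu^J_{\mathfrak{a}}(p^e)$.

We show now that for $\alpha \in \mathbb{R}_+$, we have $\mathfrak{a} \subseteq \mathrm{Rad}(\tau(\mathfrak{a}^\alpha))$. Let $e \gg 0$ be such that $\tau(\mathfrak{a}^\alpha) = \mathrm{Ann}_R\, Z_{\lceil \alpha p^e \rceil, e}$. If $m \ge \alpha$ is an integer, it follows from the injectivity of $F_E$ that $\mathfrak{a}^m \subseteq \tau(\mathfrak{a}^\alpha)$. We deduce now from Lemma 2.4 and from the definition of $\tau(\mathfrak{a}^\alpha)$ that
\[ \nu^{\tau(\mathfrak{a}^\alpha)}_{\mathfrak{a}}(p^e) \le \lceil \alpha p^e \rceil - 1 < \alpha p^e. \]
Dividing by $p^e$ and taking the limit gives the required inequality. The last statement is a formal consequence of the first two assertions.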
Remark 2.8. It follows from Proposition 2.7 that if we have an F-threshold $c$ of $\mathfrak{a}$, then there is a unique minimal ideal $J$ such that $c^J(\mathfrak{a}) = c$. Indeed, this is $\tau(\mathfrak{a}^c)$. Moreover, if $c_1$ and $c_2$ are such F-thresholds, then $c_1 < c_2$ if and only if $\tau(\mathfrak{a}^{c_2})$ is strictly contained in $\tau(\mathfrak{a}^{c_1})$.

Remark 2.9. As $R$ is Noetherian, it follows from the previous remark that there is no strictly decreasing sequence of F-thresholds of $\mathfrak{a}$.

Remark 2.10. There are arbitrarily large thresholds: take, for example, $p^e c(\mathfrak{a}) = c^{\mathfrak{m}^{[p^e]}}(\mathfrak{a})$ for $e \ge 1$. Note also that a sequence of thresholds $\{c_m\}_m$ of $\mathfrak{a}$ is unbounded if and only if $\bigcap_m \tau(\mathfrak{a}^{c_m}) = (0)$. The only thing one needs to check is that $\bigcap_{c \in \mathbb{R}_+} \tau(\mathfrak{a}^c) = (0)$. This follows since for every integer $\ell \ge n$, we have $\tau(\mathfrak{a}^\ell) \subseteq \tau(\mathfrak{m}^\ell) \subseteq \mathfrak{m}^{\ell - n + 1}$.

The jumping coefficients for multiplier ideals are discrete. We do not know if the analogous assertion is true for the F-thresholds.

Question 2.11. Given an ideal $(0) \neq \mathfrak{a} \subseteq \mathfrak{m}$, could there exist finite accumulation points for the set of F-thresholds of $\mathfrak{a}$?

Remark 2.12. Given a test ideal $J$ corresponding to $\mathfrak{a}$, the set of those $\alpha \in \mathbb{R}_+$ such that $\tau(\mathfrak{a}^\alpha) = J$ is an interval of the form $[a, b)$. Indeed, if $a = c^J(\mathfrak{a})$, it follows from Lemma 2.6 and Proposition 2.7 that for $\lambda < a$, $\tau(\mathfrak{a}^\lambda)$ strictly contains $J = \tau(\mathfrak{a}^a)$. On the other hand, if $J'$ is the largest test ideal strictly contained in $J$, and if $b = c^{J'}(\mathfrak{a})$, then it is clear that $b = \sup\{\lambda \mid \tau(\mathfrak{a}^\lambda) = J\}$, which gives our assertion.
Example 2.13. Consider the case when $\mathfrak{a}$ is a monomial ideal, i.e., $\mathfrak{a}$ is generated by monomials in the localization of $k[X_1, \dots, X_n]$ at $(X_1, \dots, X_n)$. It is shown in [HY] that for every $\alpha$ we have $\tau(\mathfrak{a}^\alpha) = \mathcal{I}(\mathfrak{a}^\alpha)$, where $\mathcal{I}(\mathfrak{a}^\alpha)$ is the multiplier ideal of $\mathfrak{a}$ with exponent $\alpha$. It follows from this and from Proposition 2.7 that the set of F-thresholds of $\mathfrak{a}$ coincides with the set of jumping coefficients of the multiplier ideals of $\mathfrak{a}$.

Let us recall the description of multiplier ideals for monomial ideals from [Ho2]. Consider the Newton polyhedron $P_{\mathfrak{a}}$ of $\mathfrak{a}$: this is the convex hull in $\mathbb{R}^n$ of $\{u \in \mathbb{N}^n \mid X^u \in \mathfrak{a}\}$, where for $u = (u_1, \dots, u_n)$ we put $X^u = X_1^{u_1} \cdots X_n^{u_n}$. If we put $e = (1, \dots, 1)$, then
\[ \mathcal{I}(\mathfrak{a}^\alpha) = (X^u \mid u + e \in \mathrm{Int}(\alpha \cdot P_{\mathfrak{a}})). \tag{2.2} \]
It follows that each jumping coefficient $\alpha$ of the multiplier ideals is associated to some $b = (b_i)$ with all $b_i$ positive integers, where $\alpha$ is such that $b$ lies in the boundary of $\alpha \cdot P_{\mathfrak{a}}$. Of course, several distinct $b$ can give the same $\alpha$.

In fact, one can show that in order to compute all the F-thresholds of $\mathfrak{a}$ it is enough to consider only ideals $J$ of the form $(X_1^{b_1}, \dots, X_n^{b_n})$ with $b_i$ positive integers. Moreover, one can check directly that if $J$ is the above ideal, then $c^J(\mathfrak{a}) = \alpha$, where $\alpha$ is associated to $b = (b_1, \dots, b_n)$ as above (see [BMS1] for this approach).

We end this section by considering in more detail the case when $\mathfrak{a} = (f)$ is a principal ideal. One can easily check that for such $\mathfrak{a}$ we have $Z_{r,e} = Z_{pr,e+1}$. This shows that any two $Z_{r_1,e_1}$ and $Z_{r_2,e_2}$ are comparable. As $E$ is Artinian, we may consider the submodules of $E$ defined inductively as follows: let $M_0 := \{0\}$ be the minimal module in $\mathcal{M} := \{Z_{r,e} \mid r, e\}$, and for $m \ge 1$, let $M_m$ be the unique minimal module in $\mathcal{M} \setminus \{M_0, \dots, M_{m-1}\}$. It follows that $M_m$ is properly contained in $M_{m+1}$. In addition, given any $r$ and $e$, either $M_m \subseteq Z_{r,e}$, or $Z_{r,e} \in \{M_0, \dots, M_{m-1}\}$.

Proposition 2.14. With $f \in \mathfrak{m}$ as above, we put for every $i$, $J_i = \mathrm{Ann}_R(M_i)$ and $c_i = c^{J_i}(f)$.
(1) For $i \ge 1$, let $\nu_i(e)$ be the largest $r$ such that $Z_{r,e} \subseteq M_{i-1}$. Then
\[ c_i = \lim_{e \to \infty} \frac{\nu_i(e)}{p^e}. \]
(2) Every $J_i$ is a test ideal of $\mathfrak{a}$, and if $J$ is any test ideal different from all $J_i$, then $J$ is contained in all these ideals.

Proof. Note that by definition $\nu_i(e)$ is the largest $r$ such that $M_i \not\subseteq Z_{r,e}$. By Lemma 2.4, we get $\nu_i(e) = \nu^{J_i}_f(p^e)$ and this proves (1).

We show that $J_i$ is a test ideal by proving that $\tau(f^{c_i}) = J_i$. By Proposition 2.7, it is enough to show that $J_i \subseteq \tau(f^{c_i})$ for $i \ge 1$. This follows from $\lceil c_i p^e \rceil = \nu_i(e) + 1$ (see Proposition 1.9), which implies $\bigcap_e Z_{\lceil c_i p^e \rceil, e} = M_i$.
For the last statement it is enough to show that for all $i \ge 0$ and for $c \in [c_i, c_{i+1})$ we have $\tau(f^c) = J_i$ (with the convention $c_0 = 0$). If $e \gg 0$, we have $\lceil c p^e \rceil < \nu_{i+1}(e)$, so $\bigcap_e Z_{\lceil c p^e \rceil, e} \subseteq M_i$. This implies $J_i \subseteq \tau(f^c)$, and the other inclusion is clear, as we have seen that $\tau(f^{c_i}) = J_i$.

Remark 2.15. Note that in the case of a principal ideal the set of F-thresholds of $(f)$ is discrete if and only if $\lim_{m \to \infty} c_m = \infty$. This is equivalent to the fact that $\bigcup_m Z_m = E$. Note also that by the periodicity of the F-thresholds (see Proposition 1.10), this is further equivalent to the finiteness of the set of F-thresholds in $(0, 1)$.

It follows from Proposition 1.9 that $c(f) = 1$ if and only if $\nu_f(p^e) = p^e - 1$ for every $e$. The following proposition, based on an argument of Fedder, shows that in fact it is enough to check this for only one $e \ge 1$.

Proposition 2.16. ([Fe]) If $f$ is a nonzero element in $\mathfrak{m}$, then $c(f) = 1$ if and only if there is $e$ such that $\nu_f(p^e) = p^e - 1$. Moreover, this is the case if and only if the action of the Frobenius morphism on $H^{n-1}_{\mathfrak{m}}(R/(f))$ is injective.

Proof. The exact sequence
\[ 0 \to R \xrightarrow{\ f\ } R \to R/(f) \to 0 \]
induces an isomorphism of $H^{n-1}_{\mathfrak{m}}(R/(f))$ with the annihilator of $f$ in $E$. Moreover, via this identification the Frobenius morphism on $H^{n-1}_{\mathfrak{m}}(R/(f))$ is given by $\tilde{F}(u) = f^{p-1} F_E(u)$. We see that $\tilde{F}^e$ is injective if and only if $Z_{p^e - 1, e} = (0)$. This is the case if and only if $\nu_f(p^e) = p^e - 1$. Since $\tilde{F}$ is injective if and only if $\tilde{F}^e$ is, this completes the proof.
3. Reduction mod p and the connection with the Bernstein polynomial

In this section we study the way our invariants behave for different reductions mod $p$ of a given ideal. Everything in this section works in the usual framework for reducing mod $p$ which is used in tight closure theory (see for example [HY]). In order to simplify the presentation as well as the notation, we prefer to work in the following concrete setup. The interested reader should have no trouble translating everything to the general setting.

Let $A$ be the localization of $\mathbb{Z}$ at some nonzero integer. We fix a nonzero ideal $\mathfrak{a}$ of $A[X] = A[X_1, \dots, X_n]$, such that $\mathfrak{a} \subseteq (X_1, \dots, X_n)$. Let $\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$. We want to relate the invariants attached to $\mathfrak{a}_{\mathbb{Q}} := \mathfrak{a} \cdot \mathbb{Q}[X]$ around the origin with those attached to the localizations of the reductions mod $p$, $\mathfrak{a}_p := \mathfrak{a} \cdot \mathbb{F}_p[X]_{(X_1, \dots, X_n)}$, where $p$ is a large prime. We will use the same subscripts whenever tensoring with $\mathbb{Q}$ or reducing (and localizing) mod $p$. Note that since we are interested only in large primes, we are free to further localize $A$ at any nonzero element.
Let us consider a log resolution of $\mathfrak{a}_{\mathbb{Q}}$ defined over $\mathbb{Q}$: this is a proper birational morphism $\pi_{\mathbb{Q}} : Y_{\mathbb{Q}} \to \mathbb{A}^n_{\mathbb{Q}}$, with $Y_{\mathbb{Q}}$ smooth, such that the product between $\pi_{\mathbb{Q}}^{-1}(\mathfrak{a}_{\mathbb{Q}})$ and the ideal defining the exceptional locus of $\pi_{\mathbb{Q}}$ is principal, and it defines a divisor with simple normal crossings. Such a resolution exists by [Hir]. After further localizing $A$, we may assume that $\pi_{\mathbb{Q}}$ is obtained by extending the scalars from a morphism $\pi : Y \to \mathbb{A}^n_A$ with analogous properties. If we denote by $D$ the effective divisor defined by $\pi^{-1}(\mathfrak{a})$ and if $K$ is the relative canonical divisor of $\pi$ (i.e., the effective divisor defined by the Jacobian of $\pi$), then for all $\alpha \in \mathbb{R}$
\[ \mathcal{I}(\mathfrak{a}^\alpha) := H^0(Y, \mathcal{O}_Y(K - \lfloor \alpha D \rfloor)). \tag{3.1} \]
Here $\lfloor \alpha D \rfloor$ denotes the integral part of the $\mathbb{R}$-divisor $\alpha D$. Note that $\mathcal{I}(\mathfrak{a}^\alpha)_{\mathbb{Q}}$ is the multiplier ideal of $\mathfrak{a}_{\mathbb{Q}}$ with exponent $\alpha$. We refer to [Laz] for the theory of multiplier ideals.

The jumping numbers at $(0, \dots, 0)$ introduced in [ELSV] are the numbers $\lambda$ such that $\mathcal{I}(\mathfrak{a}^\lambda)_{\mathbb{Q}}$ is strictly contained in $\mathcal{I}(\mathfrak{a}^{\lambda - \epsilon})_{\mathbb{Q}}$ in a neighborhood of the origin, for every $\epsilon > 0$. The smallest positive such number is the log canonical threshold $\mathrm{lc}_0(\mathfrak{a})$: it is the first $\lambda$ such that $\mathcal{I}(\mathfrak{a}^\lambda)_{\mathbb{Q}}$ is different from the structure sheaf around the origin. In order to simplify the notation we will drop the subscript $\mathbb{Q}$ whenever considering the invariants associated to $\mathfrak{a}_{\mathbb{Q}}$.

In our setting, by taking $p \gg 0$, we may assume that the above resolution induces a log resolution $\pi_p$ for $\mathfrak{a}_p$. Over $\mathbb{Q}$ we have $R^i \pi_*(\mathcal{O}_Y)_{\mathbb{Q}} = 0$ for $i \ge 1$. This remains true for the reductions mod $p$ if $p \gg 0$. From now on we assume that $p$ is large enough, so these conditions are satisfied. We define $\mathcal{I}(\mathfrak{a}_p^\alpha)$ by a formula similar to (3.1), using $\pi_p$. Note that for a fixed $\alpha$, we have $\mathcal{I}(\mathfrak{a}^\alpha)_p = \mathcal{I}(\mathfrak{a}_p^\alpha)$ if $p \gg 0$.

We recall two results which describe what is known about the connection between multiplier ideals and test ideals. The first one is proved in more generality in [HY], based on ideas from [HW]. We include the proof as it is quite short in our context.

Theorem 3.1. With the above notation, if $p \gg 0$, then for every $\alpha$ we have
\[ \tau(\mathfrak{a}_p^\alpha) \subseteq \mathcal{I}(\mathfrak{a}_p^\alpha). \]
Proof. Let $R$ be the localization of $\mathbb{F}_p[X]$ at $(X_1, \dots, X_n)$, and let $\mathfrak{m}$ be its maximal ideal. We denote by $W \subseteq Y_p$ the subset defined by $\pi_p^{-1}(\mathfrak{m})$. We will use the notation from the previous section.

If $E = H^n_{\mathfrak{m}}(R)$, using the fact that the higher direct images of $\mathcal{O}_{Y_p}$ are zero, and the long exact sequence for local cohomology, we get $E \cong H^n_W(\mathcal{O}_{Y_p})$. A version of Local Duality shows that if
\[ \delta : H^n_W(\mathcal{O}_{Y_p}) \to H^n_W(\mathcal{O}_{Y_p}(\lfloor \alpha D_p \rfloor)) \]
is the surjective morphism induced by the natural inclusion of sheaves, then $\mathcal{I}(\mathfrak{a}_p^\alpha) = \mathrm{Ann}_R(\ker \delta)$. By Lemma 2.3, it is therefore enough to show that if $h \in \mathfrak{a}_p^{\lceil \alpha p^e \rceil}$, then $h F_E^e(\ker \delta) \subseteq \ker \delta$.
The Frobenius morphism on local cohomology is induced by the Frobenius morphism $F$ on the fraction field of $R$. As the inclusion $h F^e(\mathcal{O}_{Y_p}) \subseteq \mathcal{O}_{Y_p}$ is clear, in order to finish it is enough to show also that $h F^e(\mathcal{O}_{Y_p}(\lfloor \alpha D_p \rfloor)) \subseteq \mathcal{O}_{Y_p}(\lfloor \alpha D_p \rfloor)$. This is an immediate consequence of the definitions.

The proof of the next theorem is more involved, so we refer the reader to [HY].

Theorem 3.2. With the above notation, if $\alpha$ is given and if $p \gg 0$ (depending on $\alpha$), then
\[ \tau(\mathfrak{a}_p^\alpha) = \mathcal{I}(\mathfrak{a}_p^\alpha). \]

We reformulate the above results in terms of thresholds. In order to do this we index the jumping coefficients of $\mathfrak{a}_{\mathbb{Q}}$ at the origin by analogy with the F-thresholds, as follows. Suppose that $J \subseteq (X_1, \dots, X_n) A[X]$ is an ideal containing $\mathfrak{a}$ in its radical. We define
\[ \lambda^J_0(\mathfrak{a}) := \min\{\alpha > 0 \mid \mathcal{I}(\mathfrak{a}^\alpha)_{\mathbb{Q}} \subseteq J_{\mathbb{Q}} \text{ around } 0\}. \]
It is clear that this is a jumping coefficient of $\mathfrak{a}_{\mathbb{Q}}$ around the origin, and that every such coefficient appears in this way for a suitable $J$. For example, if $J = (X_1, \dots, X_n)$, then $\lambda^J_0(\mathfrak{a}) = \mathrm{lc}_0(\mathfrak{a})$.

Using Proposition 2.7, we may reformulate the above results as follows. We will denote the invariants of $\mathfrak{a}_p$ with respect to $J_p$, which we have introduced in §1, simply by $c^J(\mathfrak{a}_p)$ and $\nu^J_{\mathfrak{a}}(p^e)$.

Theorem 3.3. If $p \gg 0$, then for every ideal $J$ as above we have $c^J(\mathfrak{a}_p) \le \lambda^J_0(\mathfrak{a})$. In particular, we have the following inequality between the F-pure threshold and the log canonical threshold: $c(\mathfrak{a}_p) \le \mathrm{lc}_0(\mathfrak{a})$.

Theorem 3.4. Given an ideal $J$ as above, we have
\[ \lim_{p \to \infty} c^J(\mathfrak{a}_p) = \lambda^J_0(\mathfrak{a}). \]
In particular, we have $\lim_{p \to \infty} c(\mathfrak{a}_p) = \mathrm{lc}_0(\mathfrak{a})$.

Remark 3.5. The fact that in Theorem 3.2 $p$ depends on $\alpha$ is reflected in Theorem 3.4 in that we may have $c^J(\mathfrak{a}_p) < \lambda^J_0(\mathfrak{a})$ for infinitely many $p$. This is a very important point, and we will see examples of such behavior (for $J = \mathfrak{m}$) in the next section.

We discuss now possible further connections between the invariants over $\mathbb{Q}$ and those of the reductions mod $p$. We formulate them in the case $J = \mathfrak{m}$ and we will give some examples in §4. However, note that similar questions can be asked for arbitrary $J$.

Conjecture 3.6. Given the ideal $\mathfrak{a}$, there are infinitely many primes $p$ such that $c(\mathfrak{a}_p) = \mathrm{lc}_0(\mathfrak{a})$.
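To make Theorems 3.3 and 3.4 and Remark 3.5 concrete, here is a small sketch that tabulates the F-pure thresholds of $f = x^2 + y^3$ as listed in Example 4.3 of this paper and compares them with $\mathrm{lc}_0(f) = 5/6$. The helper name `fpt_cusp` is ours, and the values are copied from that example rather than recomputed.

```python
from fractions import Fraction

LCT = Fraction(5, 6)  # lc_0(x^2 + y^3), as stated in Example 4.3

def fpt_cusp(p):
    """F-pure threshold of x^2 + y^3 in characteristic p, per Example 4.3."""
    if p == 2:
        return Fraction(1, 2)
    if p == 3:
        return Fraction(2, 3)
    if p % 3 == 1:
        return LCT
    return LCT - Fraction(1, 6 * p)   # case p = 2 (mod 3), p != 2

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
for p in primes:
    c = fpt_cusp(p)
    assert c <= LCT            # the inequality of Theorem 3.3
    print(p, c, LCT - c)       # for p > 3 the gap is either 0 or 1/(6p)
```

In this example equality holds for every $p \equiv 1 \pmod 3$, the inequality is strict for every $p \equiv 2 \pmod 3$, and the gap $1/(6p)$ tends to 0, which is consistent with the limit in Theorem 3.4.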
Problem 3.7. Given the ideal $\mathfrak{a}$, give conditions such that there is a positive integer $N$ with the following property: for every prime $p$ with $p \equiv 1 \pmod N$ we have $c(\mathfrak{a}_p) = \mathrm{lc}_0(\mathfrak{a})$.

Problem 3.8. Give conditions on an ideal $\mathfrak{a}$ such that there is a positive integer $N$, and rational functions $R_i \in \mathbb{Q}(t)$ for every $i \in \{0, \dots, N-1\}$ with $\gcd(i, N) = 1$, with the following property: $c(\mathfrak{a}_p) = R_i(p)$ whenever $p \equiv i \pmod N$ and $p$ is large enough.

These problems are motivated by the examples we will discuss in the next section. We will see that the behavior described in the problems is satisfied in many cases. On the other hand, Example 4.6 below shows that one cannot expect such behavior to hold in general. We will see in this example that the failure is related to subtle arithmetic phenomena.

However, note that if $p$ is an odd prime, then one can reinterpret the condition $p \equiv 1 \pmod N$ in Problem 3.7 as saying that $p$ is completely split in the cyclotomic field of the $N$th roots of unity (see [Neu], Cor. 10.4). We will see that something similar happens in Example 4.6 below: there is a number field $K$ such that if $p$ splits completely in $K$, then the log canonical threshold is equal to the corresponding F-pure threshold. This motivates the following

Question 3.9. Given an ideal $\mathfrak{a}$ as above, is there a number field $K$ such that whenever the prime $p \gg 0$ splits completely in $K$, we have $c(\mathfrak{a}_p) = \mathrm{lc}(\mathfrak{a})$?

Note that by Čebotarev's Density Theorem (see [Neu], Cor. 13.6), given a number field $K$ there are infinitely many primes $p$ which split completely in $K$. Therefore a positive answer to Question 3.9 would imply Conjecture 3.6.

We include here another problem with a similar flavor, on the behavior of the functions $\nu^J_{\mathfrak{a}}(p^e)$ when we vary $p$. The interest in this problem comes from the fact that whenever we can prove that such behavior holds, one can use this to give roots of the Bernstein-Sato polynomial of $\mathfrak{a}_{\mathbb{Q}}$ (see Remark 3.13 below). The Conjecture is proved for monomial ideals in [BMS1]. For other examples, see the next section.

Problem 3.10. Find conditions on an ideal $\mathfrak{a}$ such that the following holds. Given an ideal $J$ as above, and $e \ge 1$, there is a positive integer $N$, and polynomials $P_j \in \mathbb{Q}[t]$ of degree $e$, for every $j \in \{1, \dots, N-1\}$ with $\gcd(j, N) = 1$, such that $\nu^J_{\mathfrak{a}}(p^e) = P_j(p)$ for every $p \gg 0$, $p \equiv j \pmod N$. When could $N$ be chosen independently of $J$ and $e$?

We turn now to a different connection between invariants which appear in characteristic zero and the ones we have defined in §1. The characteristic zero invariants we will consider are the roots of the Bernstein-Sato polynomial, whose definition we now recall.

Let $I \subseteq \mathbb{C}[X_1, \dots, X_n]$ be a nonzero ideal, and let $f_1, \dots, f_r$ be nonzero generators of $I$. We introduce indeterminates $s_1, \dots, s_r$ and the Bernstein-Sato
polynomial $b_I$ is the monic polynomial in one variable of minimal degree such that we have an equation
\[ b_I(s_1 + \cdots + s_r) \prod_{i=1}^{r} f_i^{s_i} = \sum_{c} P_c(s, X, \partial_X) \bullet \prod_{j,\, c_j < 0} \binom{s_j}{-c_j} \prod_{i=1}^{r} f_i^{s_i + c_i}. \tag{3.2} \]
Here the sum varies over finitely many $c \in \mathbb{Z}^r$ such that $\sum_j c_j = 1$, for every such $c$ we have the nonzero differential operator $P_c \in \mathbb{C}[s_j, X_i, \partial_{X_i} \mid j \le r,\ i \le n]$, and as usual $\binom{s_j}{-c_j} = s_j (s_j - 1) \cdots (s_j + c_j + 1)/(-c_j)!$. Note that $\bullet$ denotes the action of a differential operator. Equation (3.2) is understood formally, but if we let $s_i = m_i \in \mathbb{N}$, then it has the obvious meaning. If we require (3.2) to hold only in some neighborhood of the origin in $\mathbb{C}^n$, then we get the local Bernstein-Sato polynomial $b_{I,0}(s)$.

Note that if $r = 1$, i.e., if $I = (f)$ is a principal ideal, then (3.2) takes the more familiar form
\[ b_f(s) f^s = P(s, X, \partial_X) \bullet f^{s+1}. \]
We refer to [Bj] for some basic properties of the Bernstein-Sato polynomial of principal ideals, and to [BMS2] for the general case. In the case of principal ideals, there is an extensive literature on connections between this polynomial and other invariants of singularities (see [Mal], [Ka2], [Ig] and [Kol]). Some of these results have been extended to arbitrary ideals in [BMS2].

Here are a few properties which are relevant to our study. First, it is proved in [BMS2] that this polynomial does not depend on the choice of generators. All the roots of $b_{I,0}$ are negative rational numbers, the largest one is $-\mathrm{lc}_0(I)$, and for every jumping coefficient $\lambda$ of $I$ around the origin, if $\lambda \in [\mathrm{lc}_0(I), \mathrm{lc}_0(I) + 1)$, then $-\lambda$ is a root of $b_{I,0}$. For these facts, see [Ka2], [Kol] and [ELSV] for principal ideals and [BMS2] for the general case.

We return now to our setting. The extension of our ideal $\mathfrak{a}$ to $\mathbb{C}$ defines a Bernstein-Sato polynomial around the origin, which we simply denote by $b_{\mathfrak{a},0}$.
Consider the defining equation (3.2) and let $B$ be a subalgebra of $\mathbb{C}$, finitely generated over $\mathbb{Z}$ and containing all the coefficients of $b_{\mathfrak{a},0}$ and of the $P_c$. Moreover, we may assume that for all $c$ which appear in (3.2) and for all $j$ such that $c_j < 0$, $(-c_j)!$ is invertible in $B$. It is clear that there is $M$ such that for every prime $p \ge M$, there is a maximal ideal $P$ of $B$ with $pA = P \cap A$. For such $p$ and $P$, let $R_p$ and $S_P$ be the localizations of $\mathbb{F}_p[X]$ and $(B/P)[X]$, respectively, at the ideal generated by the variables.

Suppose now that $J \subseteq (X_1, \dots, X_n) A[X]$ is an ideal containing $\mathfrak{a}$ in its radical. We will denote by $J_p$ and $J_P$ the image of $J$ in $R_p$ and $S_P$, respectively. Note that since $S_P$ is flat over $R_p$ and since the Frobenius morphism is flat, it follows that for every $e$ we have $J_P^{[p^e]} \cap R_p = J_p^{[p^e]}$. In particular, we have
\[ \nu^J_{\mathfrak{a}}(p^e) = \nu^{J_P}_{\mathfrak{a}_p S_P}(p^e). \tag{3.3} \]
Proposition 3.11. If $\mathfrak{a} \subseteq (X_1, \dots, X_n) A[X]$ is a nonzero ideal, then for every prime $p \gg 0$ and for every $J \subseteq (X_1, \dots, X_n) A[X]$ containing $\mathfrak{a}$ in its radical we have
\[ b_{\mathfrak{a},0}(\nu^J_{\mathfrak{a}}(p^e)) = 0 \quad \text{in } \mathbb{F}_p \tag{3.4} \]
for all $e$.

Remark 3.12. Recall that all roots of $b_{\mathfrak{a},0}$ are rational, so $b_{\mathfrak{a},0} \in \mathbb{Q}[s]$. After localizing $A$ at a suitable element, we may assume that $b_{\mathfrak{a},0} \in A[s]$. Therefore for every $m \in \mathbb{Z}$, $b_{\mathfrak{a},0}(m)$ has a well-defined class in $\mathbb{F}_p$.

Proof of Proposition 3.11. We use the above notation and let $m = \nu^J_{\mathfrak{a}}(p^e)$. Recall that we have generators $f_1, \dots, f_r$ of $\mathfrak{a}$. It follows from (3.3) that there are nonnegative integers $\ell_1, \dots, \ell_r$ such that $\sum_i \ell_i = m$ and $\prod_i f_i^{\ell_i} \notin J_P^{[p^e]}$. On the other hand, for all nonnegative integers $\ell_1, \dots, \ell_r$ with $\sum_i \ell_i = m + 1$, we have $\prod_i f_i^{\ell_i} \in J_P^{[p^e]}$.

Note that (3.2) holds in $S_P$ if $s_i = \ell_i$ for all $i$. If $c$ and $i$ are such that $\ell_i + c_i < 0$, then $\ell_i (\ell_i - 1) \cdots (\ell_i + c_i + 1) = 0$, so this term does not appear in the corresponding equality. As $J_P^{[p^e]}$ is invariant under the action of operators in $(B/P)[X_i, \partial_{X_i} \mid i \le n]$, we deduce that $b_{\mathfrak{a},0}(m)$ is zero in $R_p$, hence in $\mathbb{F}_p$.

Remark 3.13. Note that whenever we can show that $\mathfrak{a}$ behaves as in Problem 3.10, we get roots of $b_{\mathfrak{a},0}$. More precisely, suppose that for some $J$ as above and for some $e$, there is a positive number $N$ and polynomials $P_j \in \mathbb{Q}[t]$ for every $j$ with $\gcd(j, N) = 1$ such that $\nu^J_{\mathfrak{a}}(p^e) = P_j(p)$ for every prime $p \gg 0$ with $p \equiv j \pmod N$. In this case, the above proposition shows that $b_{\mathfrak{a},0}(P_j(p))$ is divisible by $p$, so $p$ divides $b_{\mathfrak{a},0}(P_j(0))$. By the Dirichlet Theorem there are infinitely many such $p$, and therefore $P_j(0)$ is a root of $b_{\mathfrak{a},0}$.

Remark 3.14. Let $\mathfrak{a}$ be a principal ideal generated by $f$. Suppose that the analogue of the setup in Problem 3.7 holds for a jumping coefficient $\mu \in (0, 1]$ (around the origin) of $f_{\mathbb{Q}}$. More precisely, suppose that $J$ and $N$ are such that if $p \equiv 1 \pmod N$, then $c^J(f_p) = \mu$ (a natural choice for such a $J$ is $J = \mathcal{I}(f^\mu)$). We may choose such $p$ so that $\mu(p - 1)$ is an integer. Since $\mu \le 1$, it follows from Proposition 1.9 that in this case we have $\nu^J_{\mathfrak{a}}(p^e) = \mu(p^e - 1)$ for all $e \ge 1$. Remark 3.13 implies now that $-\mu$ is a root of $b_{f,0}(s)$.
As we have already mentioned, this is proved in [ELSV], but this would provide an “explanation” from our point of view.

Remark 3.15. It is an interesting question which roots of $b_{\mathfrak{a},0}$ can be given by the procedure in Remark 3.13. It is proved in [BMS1] that this is the case for all the roots if $\mathfrak{a}$ is a monomial ideal. On the other hand, Example 4.1 below shows that some roots may not come from our approach.
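Proposition 3.11 can be checked numerically in a small case. The sketch below is ours, not from the paper: it computes $\nu_f(p^e)$ by brute force for a binomial $f = x^a + y^b$ with $J = (x, y)$, using the fact that the monomials of $f^r$ are $x^{ak} y^{b(r-k)}$ with coefficients $\binom{r}{k}$, so that $f^r \notin (x^q, y^q)$ exactly when some such monomial has both exponents below $q$ and a coefficient not divisible by $p$. It then verifies $b_{f,0}(\nu_f(p^e)) \equiv 0 \pmod p$ for $f = x^2 + y^3$, taking the roots $-5/6, -1, -7/6$ from Example 4.3 and clearing denominators, so $36\, b_{f,0}(s) = (s+1)(6s+5)(6s+7)$. The function names are hypothetical, and the proposition is only asserted for $p \gg 0$; that the small primes below already satisfy it is an observation of this illustration, not a claim of the text.

```python
from math import comb

def nu_binomial(a, b, p, e):
    """Largest r with (x^a + y^b)^r not in (x^q, y^q) over F_p, where q = p^e."""
    q = p ** e
    best = 0
    for r in range(q):  # f^q = x^(aq) + y^(bq) always lies in (x^q, y^q)
        # f^r = sum_k C(r, k) x^(a k) y^(b (r - k)); distinct k give distinct monomials
        if any(a * k < q and b * (r - k) < q and comb(r, k) % p != 0
               for k in range(r + 1)):
            best = r
    return best

def b_cusp_times_36(s):
    """36 * b_{f,0}(s) for f = x^2 + y^3, with roots -5/6, -1, -7/6 (Example 4.3)."""
    return (s + 1) * (6 * s + 5) * (6 * s + 7)

for p, e in [(5, 1), (7, 1), (11, 1), (13, 1), (5, 2), (7, 2)]:
    m = nu_binomial(2, 3, p, e)
    assert b_cusp_times_36(m) % p == 0   # equation (3.4), up to the unit 36
    print(p, e, m)
```

For $p \equiv 2 \pmod 3$ and $e = 1$ the computed values agree with $(5p - 7)/6$, whose constant term $-7/6$ is the root produced by the procedure of Remark 3.13.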
4. Examples

Example 4.1. Let $n \ge 3$ and $f = X_1 X_2 + X_3^2 + \cdots + X_n^2$, so its Bernstein-Sato polynomial is given by $b_f(s) = (s + 1)(s + \frac{n}{2})$ (see [Ka1], Example 6.19, but this is actually one of the few examples which can be computed directly). We will see that we cannot account for the root $-\frac{n}{2}$ by the procedure described in Remark 3.13.

We claim that for every $p$ and for every $e \ge 1$, we have $\nu_f(p^e) = p^e - 1$. To see this note that if over $\mathbb{F}_p$ we have $f^r \in (X_1^{p^e}, \dots, X_n^{p^e})$, then $(X_1 X_2)^r \in (X_1^{p^e}, \dots, X_n^{p^e})$, as follows by choosing a monomial order on the polynomial ring such that $\mathrm{in}(f) = X_1 X_2$ (see [Eis], Chapter 15 for monomial orders). Therefore $\nu_f(p^e) \ge p^e - 1$, so we must have equality.

This shows that the smallest nonzero F-threshold is $c(f_p) = 1$. Proposition 1.10 shows that if $\lambda$ is an F-threshold of $f_p$ which is not an integer, then the fractional part of $\lambda$ gives an F-threshold in $(0, 1)$, a contradiction. Therefore the set of F-thresholds of $f_p$ consists of the set of positive integers. Proposition 1.9 implies that for every ideal $J$ contained in $(X_1, \dots, X_n)$ and such that $f \in \mathrm{Rad}(J)$, and for every $p$, there is a positive integer $m$ such that $\nu^{J_p}_f(p^e) = m p^e - 1$ for all $e$. Therefore the only root of $b_f(s)$ we get by the procedure described in Remark 3.13 is $-1$.

Example 4.2. Consider $f \in \mathbb{Z}[X_1, \dots, X_n]$ which we write as $f = \sum_{i=1}^{r} c_i X^{\alpha_i}$, where all $\alpha_i = (\alpha_{i,1}, \dots, \alpha_{i,n}) \in \mathbb{N}^n$ and all $c_i \in k$ are nonzero. We assume that $\alpha_1, \dots, \alpha_r$ are affinely independent, i.e., if $\sum_i \lambda_i \alpha_i = 0$ and $\sum_i \lambda_i = 0$ for $\lambda = (\lambda_i) \in \mathbb{Q}^r$, then $\lambda_i = 0$ for all $i$. We will assume also that for every $j \le n$ there is $i \le r$ with $\alpha_{i,j} > 0$ (otherwise we may work in a smaller polynomial ring).

Let $\mathfrak{a} = (X^{\alpha_i} \mid 1 \le i \le r)$. One can check that our condition on $f$ implies that $f$ is generic with respect to $\mathfrak{a}$ in the following sense. If $P$ is a compact face of the convex hull $P_{\mathfrak{a}}$ of $\{\alpha_i \mid 1 \le i \le r\}$, and if $g$ is the sum of the terms in $f$ which correspond to elements in $P$, then the differential $dg$ does not vanish on $(\mathbb{C}^*)^n$. Under this assumption it is proved in [Ho1] that around the origin we have $\mathcal{I}(\mathfrak{a}^\alpha) = \mathcal{I}(f^\alpha)$ for $\alpha < 1$. In particular, we have $\mathrm{lc}_0(f) = \min\{1, \mathrm{lc}(\mathfrak{a})\}$ (note that $\mathrm{lc}(\mathfrak{a}) = \mathrm{lc}_0(\mathfrak{a})$ as $\mathfrak{a}$ is a monomial ideal).

We start by computing $\nu_f(p)$ when $p \gg 0$. Over $\mathbb{Z}$ we have
\[ f^m = \sum_{a} \frac{m!}{a_1! \cdots a_r!}\, c_1^{a_1} \cdots c_r^{a_r}\, X^{\sum_i a_i \alpha_i}, \tag{4.1} \]
where the sum is over those $a = (a_i) \in \mathbb{N}^r$ with $\sum_i a_i = m$. Our hypothesis on $f$ implies that if $a \neq b$ and $\sum_i a_i = \sum_i b_i$, then we have $X^{\sum_i a_i \alpha_i} \neq X^{\sum_i b_i \alpha_i}$. We may assume that $p$ does not divide any of the $c_i$, so if $m \le p - 1$, then $p$ does not divide any of the coefficients in (4.1). Hence a monomial $X^b$ appears in $f_p^m$ if and only if there are $a_1, \dots, a_r \in \mathbb{N}$ such that $\sum_i a_i = m$ and
$\sum_i a_i \alpha_i = b$. Therefore $f_p^m \in (X_1^p, \dots, X_n^p)$ if and only if the following holds: for every $(a_i) \in \mathbb{N}^r$ with $\sum_i a_i \alpha_{i,j} \le p - 1$ for all $j$, we have $\sum_i a_i \le m - 1$. Let
\[ Q := \Big\{ (a_1, \dots, a_r) \in \mathbb{R}_+^r \ \Big|\ \sum_{i=1}^{r} a_i \alpha_{i,j} \le 1 \text{ for all } j \Big\}. \]
The above discussion shows that
\[ \nu_f(p) = \min\Big\{ p - 1,\ \max_{b \in (p-1)Q \cap \mathbb{N}^r} \sum_i b_i \Big\}. \tag{4.2} \]
Compare this with the following formula for $\mathrm{lc}(\mathfrak{a})$ which follows easily from the description of $\mathrm{lc}(\mathfrak{a})$ in [Ho2] (see, for example, Proposition 3.10 in [BMS1]):
\[ \mathrm{lc}(\mathfrak{a}) = \max_{b \in Q} \sum_i b_i. \tag{4.3} \]
Note that since for every $j \le n$ we have $\alpha_{i,j} > 0$ for some $i$, $Q$ is bounded.

It is more complicated to compute $\nu_f(p^e)$ for $e \ge 2$. Let $c = \mathrm{lc}_0(f)$. We show now that $\nu_f(p^e) = c(p^e - 1)$ for all $e$ when $p \equiv 1 \pmod N$, where $N$ will be suitably chosen. If $\mathrm{lc}_0(f) = 1$, choose any $v \in Q \cap \mathbb{Q}^r$ such that $\sum_i v_i = 1$, and if $\mathrm{lc}_0(f) < 1$, let $v$ be one of the vertices of $Q$ such that $\sum_i v_i = \max_{b \in Q} \sum_i b_i$. We take $N$ such that $N v_i$ is an integer for all $i$. Note that we have $c = \sum_i v_i$.

If $p \equiv 1 \pmod N$, then $a_i := (p^e - 1) v_i \in \mathbb{N}$, and in order to show that $\nu_f(p^e) \ge c(p^e - 1)$, it is enough to show that $p$ does not divide $((p^e - 1)c)!/(a_1! \cdots a_r!)$. Therefore it is enough to check that
\[ \lfloor c(p^e - 1)/p^{e'} \rfloor = \sum_{i=1}^{r} \lfloor (p^e - 1) v_i / p^{e'} \rfloor \]
for $1 \le e' \le e$, where $\lfloor x \rfloor$ is the integral part of $x$. This follows by an easy computation.

Dividing by $p^e$ and passing to the limit, we deduce that if $p \equiv 1 \pmod N$, then $c(f_p) \ge \mathrm{lc}_0(f)$. On the other hand, the reverse inequality holds by Theorem 3.3 (here this follows also from the fact that $f \in \mathfrak{a}$ and $\mathrm{lc}_0(f) = \mathrm{lc}(\mathfrak{a})$). Note that using Proposition 1.9, we deduce that $\nu_f(p^e) = c(p^e - 1)$ for all $e$ and all $p$ as above. In particular, we see that $c(f_p)$ exhibits the behavior described in Problem 3.7.

Example 4.3. Let $f = x^2 + y^3$, so $\mathrm{lc}_0(f) = \frac{1}{2} + \frac{1}{3} = \frac{5}{6}$. Moreover, $b_{f,0}$ has simple roots $-\frac{5}{6}$, $-1$, $-\frac{7}{6}$ (see [Ka1], Example 6.19). We give below the list of F-pure thresholds of $f_p$. We see that for $p \neq 2, 3$, the behavior depends on the congruence class of $p$ mod 3. Note that we see the behavior described in Problems 3.7, 3.8 and 3.10 (when $J = \mathfrak{m}$). Moreover, we obtain by our procedure all the roots of $b_{f,0}(s)$.
(1) If $p = 2$, then $c(f_p) = \frac{1}{2}$.
(2) If $p = 3$, we have $c(f_p) = \frac{2}{3}$.
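(3) If $p \equiv 1 \pmod 3$, then $c(f_p) = \frac{5}{6}$. We have $\nu_f(p^e) = \frac{5}{6}(p^e - 1)$ for all $e$, so this gives the root $-\frac{5}{6}$ of $b_{f,0}(s)$.
(4) If $p \equiv 2 \pmod 3$ and $p \neq 2$, then $c(f_p) = \frac{5}{6} - \frac{1}{6p}$, so
\[ \nu_f(p^e) = \begin{cases} \frac{5}{6} p - \frac{7}{6} & \text{if } e = 1, \\ \frac{5}{6} p^e - \frac{1}{6} p^{e-1} - 1 & \text{if } e \ge 2. \end{cases} \]
Therefore we get the roots $-\frac{7}{6}$ and $-1$ of $b_{f,0}(s)$.

Example 4.4. Let $f = x^2 + y^7$. The log canonical threshold is given by $\mathrm{lc}_0(f) = \frac{1}{2} + \frac{1}{7} = \frac{9}{14}$. All the roots of $b_{f,0}$ are simple, and they are
\[ -\frac{9}{14},\ -\frac{11}{14},\ -\frac{13}{14},\ -1,\ -\frac{15}{14},\ -\frac{17}{14},\ -\frac{19}{14} \]
(see [Ka1], Example 6.19). We assume that $p \neq 2, 7$ and we give the description of the F-pure thresholds and of the functions $\nu_f(p^e)$. The behavior depends on the congruence class of $p$ mod 7. We see again the behavior described in Problems 3.7, 3.8 and 3.10. In addition, we get all the roots of $b_{f,0}$ by our procedure.
(1) If $p \equiv 1 \pmod 7$, then $c(f_p) = \frac{9}{14}$, and $\nu_f(p^e) = \frac{9}{14}(p^e - 1)$ for all $e$. This gives the root $-\frac{9}{14}$ of $b_{f,0}(s)$.
(2) If $p \equiv 2 \pmod 7$, then $c(f_p) = \frac{9}{14} - \frac{1}{14 p^2}$. Hence
\[ \nu_f(p^e) = \begin{cases} \frac{9}{14} p - \frac{11}{14} & \text{if } e = 1, \\ \frac{9}{14} p^2 - \frac{15}{14} & \text{if } e = 2, \\ \frac{9}{14} p^e - \frac{1}{14} p^{e-2} - 1 & \text{if } e \ge 3. \end{cases} \]
This gives the roots $-\frac{11}{14}$, $-\frac{15}{14}$ and $-1$ of $b_{f,0}(s)$.
(3) If $p \equiv 3 \pmod 7$, then $c(f_p) = \frac{9}{14} - \frac{5}{14 p^3}$. Therefore
\[ \nu_f(p^e) = \begin{cases} \frac{9}{14} p - \frac{13}{14} & \text{if } e = 1, \\ \frac{9}{14} p^2 - \frac{11}{14} & \text{if } e = 2, \\ \frac{9}{14} p^3 - \frac{19}{14} & \text{if } e = 3, \\ \frac{9}{14} p^e - \frac{5}{14} p^{e-3} - 1 & \text{if } e \ge 4. \end{cases} \]
We get the roots $-\frac{13}{14}$, $-\frac{11}{14}$, $-\frac{19}{14}$ and $-1$ of $b_{f,0}(s)$.
(4) If $p \equiv 4 \pmod 7$, then $c(f_p) = \frac{9}{14} - \frac{1}{14p}$. Hence
\[ \nu_f(p^e) = \begin{cases} \frac{9}{14} p - \frac{15}{14} & \text{if } e = 1, \\ \frac{9}{14} p^e - \frac{1}{14} p^{e-1} - 1 & \text{if } e \ge 2. \end{cases} \]
This gives the roots $-\frac{15}{14}$ and $-1$ of $b_{f,0}(s)$.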
(5) If $p \equiv 5 \pmod 7$, then $c(f_p) = \frac{9}{14} - \frac{3}{14p}$. Therefore
\[
\nu_f(p^e) = \begin{cases} \frac{9}{14}\,p - \frac{17}{14} & \text{if } e = 1,\\[2pt] \frac{9}{14}\,p^e - \frac{3}{14}\,p^{e-1} - 1 & \text{if } e \ge 2. \end{cases}
\]
This gives the roots $-\frac{17}{14}$ and $-1$ of $b_{f,0}(s)$.
(6) If $p \equiv 6 \pmod 7$, then $c(f_p) = \frac{9}{14} - \frac{5}{14p}$. Hence
\[
\nu_f(p^e) = \begin{cases} \frac{9}{14}\,p - \frac{19}{14} & \text{if } e = 1,\\[2pt] \frac{9}{14}\,p^e - \frac{5}{14}\,p^{e-1} - 1 & \text{if } e \ge 2. \end{cases}
\]
This gives the roots $-\frac{19}{14}$ and $-1$ of $b_{f,0}(s)$.

Example 4.5. Let $f = x^5 + y^4 + x^3y^2$. The following are the roots of $b_{f,0}(s)$ (see [Ya]):
\[
-\frac{9}{20},\ -\frac{11}{20},\ -\frac{13}{20},\ -\frac{7}{10},\ -\frac{17}{20},\ -\frac{9}{10},\ -\frac{19}{20},\ -1,\ -\frac{21}{20},\ -\frac{11}{10},\ -\frac{23}{20},\ -\frac{13}{10},\ -\frac{27}{20}.
\]
As in Example 4.2, since the exponents of the monomials in $f$ satisfy the genericity condition, we can compute the jumping coefficients of $f$ using Howald's description from [Ho2]. The ones in $(0, 1]$ are
\[
\frac{9}{20},\ \frac{13}{20},\ \frac{7}{10},\ \frac{17}{20},\ \frac{9}{10},\ \frac{19}{20},\ 1.
\]
As pointed out in [Sa], the interest in this example comes from the fact that there is a root $\lambda \in (-1, 0)$ of $b_{f,0}(s)$ such that $-\lambda$ is not a jumping coefficient of $f$ (namely $\lambda = -\frac{11}{20}$). Note also that $f$ has an isolated singularity at the origin.

We show that we can get all roots of $b_{f,0}(s)$ by the procedure described in Remark 3.13. Note however that it is not enough to consider only $c^J_f(p^e)$ for $J = \mathfrak{m}$. We will use the notation in Proposition 2.14 to index the functions $\nu^J(-)$. In particular, $\nu_1(p^e) = \nu_f(p^e)$. We assume that $p \ne 2, 5$ and we compute $\nu_1(p^e)$, depending on the congruence class of $p$ mod 20.
(1) If $p \equiv 1 \pmod{20}$, then
\[
\nu_1(p^e) = \tfrac{9}{20}(p^e - 1) \quad \text{for all } e \ge 1.
\]
(2) If $p \equiv 3 \pmod{20}$, then
\[
\nu_1(p^e) = \begin{cases} \frac{9}{20}\,p - \frac{27}{20} & \text{if } e = 1,\\[2pt] \frac{9}{20}\,p^e - \frac{7}{20}\,p^{e-1} - 1 & \text{if } e \ge 2. \end{cases}
\]
(3) If $p \equiv 7 \pmod{20}$, then
\[
\nu_1(p^e) = \begin{cases} \frac{9}{20}\,p - \frac{23}{20} & \text{if } e = 1,\\[2pt] \frac{9}{20}\,p^e - \frac{3}{20}\,p^{e-1} - 1 & \text{if } e \ge 2. \end{cases}
\]
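(4) If $p \equiv 9 \pmod{20}$, then
\[
\nu_1(p^e) = \begin{cases} \frac{9}{20}\,p - \frac{21}{20} & \text{if } e = 1,\\[2pt] \frac{9}{20}\,p^e - \frac{1}{20}\,p^{e-1} - 1 & \text{if } e \ge 2. \end{cases}
\]
(5) If $p \equiv 11 \pmod{20}$, then
\[
\nu_1(p^e) = \begin{cases} \left(\frac{9}{20}\,p - \frac{19}{20}\right)\cdot\frac{p^{e+1} - 1}{p^2 - 1} + \left(\frac{19}{20}\,p - \frac{9}{20}\right)\cdot\frac{p^e - p}{p^2 - 1} & \text{if } e \text{ is odd},\\[4pt] \frac{9}{20}(p^e - 1) & \text{if } e \text{ is even}. \end{cases}
\]
(6) If $p \equiv 13 \pmod{20}$, then
\[
\nu_1(p^e) = \begin{cases} \frac{9}{20}\,p - \frac{17}{20} & \text{if } e = 1,\\[2pt] \frac{9}{20}\,p^2 - \frac{21}{20} & \text{if } e = 2,\\[2pt] \frac{9}{20}\,p^e - \frac{1}{20}\,p^{e-2} - 1 & \text{if } e \ge 3. \end{cases}
\]
(7) If $p \equiv 17 \pmod{20}$, then
\[
\nu_1(p^e) = \begin{cases} \frac{9}{20}\,p - \frac{13}{20} & \text{if } e = 1,\\[2pt] \frac{9}{20}\,p^2 - \frac{21}{20} & \text{if } e = 2,\\[2pt] \frac{9}{20}\,p^e - \frac{1}{20}\,p^{e-2} - 1 & \text{if } e \ge 3. \end{cases}
\]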
(8) If $p \equiv 19 \pmod{20}$, then
\[
\nu_1(p^e) = \left(\tfrac{9}{20}\,p - \tfrac{11}{20}\right)(1 + p + \cdots + p^{e-1}) \quad \text{for all } e \ge 1.
\]
We see that in this way we have accounted for the following roots of $b_{f,0}(s)$: $-\frac{9}{20}$, $-\frac{11}{20}$, $-\frac{13}{20}$, $-\frac{17}{20}$, $-\frac{19}{20}$, $-1$, $-\frac{21}{20}$, $-\frac{23}{20}$, $-\frac{27}{20}$. In order to get the other four roots, we need to compute also some values of $\nu_3(p)$.
(1) If $p \equiv 1 \pmod{10}$, then $\nu_3(p) = \frac{7}{10}(p - 1)$.
(2) If $p \equiv 3 \pmod{10}$, then $\nu_3(p) = \frac{7}{10}\,p - \frac{11}{10}$.
(3) If $p \equiv 7 \pmod{10}$, then $\nu_3(p) = \frac{7}{10}\,p - \frac{9}{10}$.
(4) If $p \equiv 9 \pmod{10}$, then $\nu_3(p) = \frac{7}{10}\,p - \frac{13}{10}$.
Therefore we recover also the roots $-\frac{7}{10}$, $-\frac{9}{10}$, $-\frac{11}{10}$ and $-\frac{13}{10}$ of $b_{f,0}(s)$.
Note that if $p \equiv 19 \pmod{20}$, then
\[
c(f_p) = \frac{9p - 11}{20(p - 1)},
\]
so this gives an example when $c(f_p)$ is not a polynomial in $\frac1p$.

Example 4.6. Let $f \in \mathbb{Z}[X_1, \ldots, X_n]$ be a homogeneous polynomial of degree $n$, defining a smooth hypersurface $Y$ in $\mathbb{P}^{n-1}$. It is well known that in this case $\mathrm{lc}_0(f) = 1$ (see [Kol]). On the other hand, it follows from Proposition 2.16 that $c(f_p) = 1$ if and only if the action of the Frobenius morphism on $H^{n-1}_{\mathfrak{m}}(R/(f_p))$ is injective. Here $R = \mathbb{F}_p[X_1, \ldots, X_n]_{(X_1, \ldots, X_n)}$. The above action is injective if and only if it is injective on the socle of $H^{n-1}_{\mathfrak{m}}(R/(f_p))$. We assume that $p$ is large enough, so $Y_p$, the reduction mod $p$ of $Y$, is smooth. It follows that $c(f_p) = 1$ if and only if the action induced by the Frobenius morphism on $H^{n-2}(Y_p, \mathcal{O}_{Y_p})$ is injective.

If $n = 3$, then $Y$ is an elliptic curve. In this case we see that $c(f_p) = 1$ if and only if $Y_p$ is not supersingular. There are two cases. Suppose first that $Y$ has complex multiplication (over $\mathbb{C}$). In this case, $Y_p$ is supersingular if and only if $p$ is inert in the imaginary quadratic CM field. On the other hand, if $Y$ has no complex multiplication then Serre [Se] proved that the set of primes $p$ for which $Y_p$ is supersingular has natural density zero in the set of all primes. This clearly suggests that the behavior of $c(f_p)$ does not depend on the congruence class of $p$ modulo some $N$, as in Problems 3.7 and 3.8. It would be interesting to compute explicitly the F-pure thresholds at the primes where the curve is supersingular.

We mention a result of Elkies [El] which is relevant in this setting: it says that for every elliptic curve $Y$ as above, there are infinitely many primes $p$ for which $Y_p$ is supersingular, and therefore $c(f_p) \ne 1$. As B. Conrad and N. Katz pointed out to us, if $K$ is the field obtained by adjoining to $\mathbb{Q}$ all points of order $\ell$ of $Y$ (for some odd prime $\ell$), then for every odd prime $p$ such that $p$ splits completely in $K$, the curve $Y_p$ is not supersingular. This provides an affirmative answer to Question 3.9 in this example.
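The closed formulas for $\nu_f(p^e)$ in Examples 4.3 and 4.4 can be spot-checked numerically. The sketch below is not the authors' procedure; it assumes the definition $\nu_f(p^e) = \max\{r : f^r \notin \mathfrak{m}^{[p^e]}\}$ with $\mathfrak{m}^{[p^e]} = (x^{p^e}, y^{p^e})$, and it only treats binomials $f = x^a + y^b$. For such $f$, the expansion $f^r = \sum_i \binom{r}{i} x^{ai} y^{b(r-i)}$ has pairwise distinct monomials, so $f^r \notin \mathfrak{m}^{[p^e]}$ exactly when some $i$ satisfies $ai < p^e$, $b(r-i) < p^e$ and $\binom{r}{i} \not\equiv 0 \pmod p$. The function name `nu` is a hypothetical helper introduced only for this illustration.

```python
from math import comb


def nu(a: int, b: int, p: int, e: int) -> int:
    """Largest r such that (x^a + y^b)^r is NOT in (x^q, y^q) over F_p, q = p^e.

    Assumption: this is the function nu_f(p^e) for f = x^a + y^b and J = m.
    """
    q = p ** e
    max_i = (q - 1) // a  # largest i with a*i < q
    max_j = (q - 1) // b  # largest j with b*j < q
    # A surviving monomial x^(a*i) y^(b*j) of f^r has i + j = r,
    # so r can be at most max_i + max_j; search downwards.
    for r in range(max_i + max_j, -1, -1):
        for i in range(max(0, r - max_j), min(r, max_i) + 1):
            if comb(r, i) % p != 0:  # coefficient is nonzero mod p
                return r
    return 0  # not reached: r = 0 always qualifies


if __name__ == "__main__":
    # Example 4.3, p = 7 (p = 1 mod 3): compare with (5/6)(p^e - 1).
    print(nu(2, 3, 7, 1), nu(2, 3, 7, 2))
    # Example 4.3, p = 5 (p = 2 mod 3): compare with the two-case formula.
    print(nu(2, 3, 5, 1), nu(2, 3, 5, 2))
    # Example 4.4, p = 29 (p = 1 mod 7): compare with (9/14)(p - 1).
    print(nu(2, 7, 29, 1))
```

The search is a brute-force scan, which is adequate for small $p^e$; for larger exponents one would test $\binom{r}{i} \bmod p$ digit by digit via Lucas' theorem rather than computing the full binomial coefficient.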
Acknowledgements. We are grateful to Johan de Jong, Lawrence Ein and Martin Olsson for useful discussions. We are particularly indebted to Karen Smith for suggesting to us Example 4.6, and to Brian Conrad and Nick Katz for answering our questions on supersingular elliptic curves. While working on this project the first author was a Clay Mathematics Institute Research Fellow. Our work started during the first author's visit to the University of Tokyo. He is grateful to his host Yujiro Kawamata for his wonderful hospitality. Part of this work was done during the second author's stay at the University of Michigan. He would like to express his deep gratitude to Melvin Hochster for his hospitality and support.
References
[Bj] J.-E. Björk, Rings of differential operators, North-Holland, Amsterdam, 1979.
[BH] W. Bruns and J. Herzog, Cohen-Macaulay rings, Cambridge Studies in Advanced Mathematics 39, 1998.
[BMS1] N. Budur, M. Mustaţă and M. Saito, Roots of Bernstein-Sato polynomials for monomial ideals, preprint 2004.
[BMS2] N. Budur, M. Mustaţă and M. Saito, Bernstein-Sato polynomials of arbitrary varieties, preprint 2004, math.AG/0408408.
[ELSV] L. Ein, R. Lazarsfeld, K.E. Smith and D. Varolin, Jumping coefficients of multiplier ideals, Duke Math. J. 123 (2004), 469–506.
[Eis] D. Eisenbud, Commutative algebra. With a view toward algebraic geometry, Graduate Texts in Mathematics 150, Springer-Verlag, New York, 1995.
[El] N. Elkies, The existence of infinitely many supersingular primes for every elliptic curve over Q, Invent. Math. 89 (1987), 561–567.
[Fe] R. Fedder, F-purity and rational singularity, Trans. Amer. Math. Soc. 278 (1983), 461–480.
[Ha] N. Hara, A characterization of rational singularities in terms of injectivity of Frobenius maps, Amer. J. Math. 120 (1998), 981–996.
[HW] N. Hara and K.-i. Watanabe, F-regular and F-pure rings vs. log terminal and log canonical singularities, J. Algebraic Geom. 11 (2002), 363–392.
[HY] N. Hara and K.-i. Yoshida, A generalization of tight closure and multiplier ideals, Trans. Amer. Math. Soc. 355 (2003), 3143–3174.
[Hir] H. Hironaka, Resolution of singularities of an algebraic variety over a field of characteristic zero, Ann. of Math. (2) 79 (1964), 109–326.
[HH] M. Hochster and C. Huneke, Tight closure, invariant theory and the Briançon-Skoda theorem, J. Amer. Math. Soc. 3 (1990), 31–116.
[Ho1] J. Howald, Multiplier ideals of sufficiently general polynomials, preprint 2003, math.AG/0303203.
[Ho2] J. Howald, Multiplier ideals of monomial ideals, Trans. Amer. Math. Soc. 353 (2001), 2665–2671.
[Ig] J.-i. Igusa, An introduction to the theory of local zeta function, AMS/IP Studies in Advanced Mathematics 14, American Mathematical Society, Providence, RI; International Press, Cambridge, MA, 2000.
[Ka1] M. Kashiwara, D-modules and microlocal calculus. Translated from the 2000 Japanese original by Mutsumi Saito. Translations of Mathematical Monographs 217, Iwanami Series in Modern Mathematics, American Mathematical Society, Providence, RI, 2003.
[Ka2] M. Kashiwara, B-functions and holonomic systems. Rationality of roots of B-functions, Invent. Math. 38 (1976/77), 33–53.
[Kol] J. Kollár, Singularities of pairs, in Algebraic geometry, Santa Cruz 1995, Proc. Symp. Pure Math. 62, Amer. Math. Soc., 1997, 221–286.
[Laz] R. Lazarsfeld, Positivity in algebraic geometry II, Ergebnisse der Mathematik und ihrer Grenzgebiete, 3. Folge, A Series of Modern Surveys in Mathematics, Vol. 49, Springer-Verlag, Berlin, 2004.
[Mal] B. Malgrange, Le polynôme de Bernstein d'une singularité isolée, in Fourier integral operators and partial differential equations, pp. 98–119, Lecture Notes in Math., Vol. 459, Springer, Berlin, 1975.
[Neu] J. Neukirch, Algebraic number theory. Translated from the 1992 German original and with a note by Norbert Schappacher. With a foreword by G. Harder. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], Vol. 322, Springer-Verlag, Berlin, 1999.
[Sa] M. Saito, On b-function, spectrum and rational singularity, Math. Ann. 295 (1993), 51–74.
[Se] J.-P. Serre, Propriétés galoisiennes des points d'ordre fini des courbes elliptiques, Invent. Math. 15 (1972), 259–331.
[Smi] K.E. Smith, F-rational rings have rational singularities, Amer. J. Math. 121 (1997), 159–180.
[Ta] S. Takagi, F-singularities of pairs and Inversion of Adjunction of arbitrary codimension, Invent. Math. 157 (2004), 123–146.
[TW] S. Takagi and K.-i. Watanabe, On F-pure thresholds, J. Algebra 282 (2004), 278–297.
[Ya] T. Yano, On the theory of b-functions, Publ. Res. Inst. Math. Sci. 14 (1978), 111–202.

Mircea Mustaţă
Department of Mathematics
University of Michigan
Ann Arbor, MI 48109, USA
e-mail: [email protected]

Shunsuke Takagi
Faculty of Mathematics
Kyushu University
6-10-1 Hakozaki, Higashi-ku
Fukuoka-city, 812-8581, Japan
e-mail: [email protected]

Kei-ichi Watanabe
Department of Mathematics
College of Humanities and Sciences
Nihon University, Setagaya-Ku
Tokyo 156-0045, Japan
e-mail: [email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
Hyperkähler Manifolds and Algebraic Geometry

Kieran G. O'Grady

Abstract. We survey some recent and less recent results on hyperkähler manifolds, i.e., irreducible (holomorphic) symplectic manifolds. The point of view will be that of an algebraic (or complex) geometer.
1. Introduction

A compact Kähler surface $S$ is a K3 if it is simply connected and it carries a global holomorphic symplectic form (i.e., the canonical bundle $K_S := \wedge^2\Omega^1_S$ is trivial). An example: if $P(x_0, \ldots, x_3)$ is a homogeneous polynomial of degree 4 such that $\nabla P$ has no non-trivial zeroes then
\[
S = V(P) := \{[x_0, \ldots, x_3] \in \mathbb{P}^3_{\mathbb{C}} \mid P(x_0, \ldots, x_3) = 0\} \tag{1.1}
\]
is a K3 surface¹. These surfaces play an important rôle in the classification of Kähler manifolds and they have a very rich geometry. Thus it is natural to search for higher-dimensional analogues of K3 surfaces. A natural generalization of the definition of a K3 is that of a Calabi-Yau: a compact Kähler manifold $X$ with trivial canonical bundle and $h^0(\Omega^p_X) = 0$ for $0 < p < \dim X$². Another generalization of the definition of a K3 is that of an irreducible symplectic manifold: a simply connected compact Kähler manifold $X$ carrying a holomorphic symplectic form spanning $\Gamma(\Omega^2_X)$. It turns out that higher-dimensional irreducible symplectic manifolds behave very much like K3's in many respects. In these notes we will survey certain recent (and less recent) results which have been proved on these manifolds.

First we explain why they are also known as hyperkähler manifolds. If $X$ is irreducible symplectic then by Yau's solution of Calabi's conjecture there exists a Riemannian metric $g$ on $X$ such that $(X, g)$ is an irreducible hyperkähler manifold, i.e., the holonomy is isomorphic to the tautological representation of
\[
Sp(r) := \{\phi : \mathbb{H}^r \to \mathbb{H}^r \mid \phi \text{ right-linear and } \overline{\phi(v)}^{\,t} \cdot \phi(w) = \overline{v}^{\,t} \cdot w\} \tag{1.2}
\]

Supported by Cofinanziamento MIUR 2003-2004.
¹ $S$ is simply connected by Lefschetz' Hyperplane Section Theorem. The exact sequence $0 \to I_S|_S \to \Omega^1_{\mathbb{P}^3}|_S \to \Omega^1_S \to 0$ together with the isomorphisms $I_S|_S \cong \mathcal{O}_S(-4)$ and $\wedge^3\Omega^1_{\mathbb{P}^3} \cong \mathcal{O}_{\mathbb{P}^3}(-4)$ gives that $K_S$ is trivial.
² We do not require that $\pi_1(X)$ is trivial, only that $b_1(X) = 0$. One shows that if $n = 2$ the latter condition is equivalent to the former.
on $\mathbb{H}^r$ where $4r = \dim_{\mathbb{R}} X$. Conversely let $(M, g)$ be a compact irreducible hyperkähler manifold. Fix a point $p \in M$; left-multiplication by $\lambda = ai + bj + ck$ with $\lambda^2 = -1$ on the tangent space $\Theta_pM$ is an endomorphism commuting with the holonomy group and hence gives $M$ the structure of a complex manifold $X_\lambda$ for which $g$ is the real part of a Kähler metric. Furthermore since $Sp(r) = U(\mathbb{C}^{2r}) \cap Sp(\mathbb{C}^{2r})$ we get that $X_\lambda$ carries a holomorphic symplectic form spanning $\Gamma(\Omega^2_{X_\lambda})$. It can be proved that $\pi_1(M) = \{1\}$ (see [2]) and hence $X_\lambda$ is an irreducible symplectic manifold.

We will start our survey by giving the first examples of higher-dimensional ($\dim > 2$) irreducible symplectic manifolds that were ever constructed (Fujiki, Beauville): they give two distinct deformation classes in every (even) dimension greater than 2. In Section (3) we will review that part of the general theory that was developed roughly 25 years ago by Bogomolov, Fujiki and Beauville and we will state some of Huybrechts' recent theorems on the Kähler cone and surjectivity of the period map. These results give strong evidence in favor of the slogan "higher-dimensional irreducible symplectic manifolds are analogues of K3 surfaces". In fact these manifolds, similarly to K3 surfaces, are studied via periods of the symplectic form, and the results of Section (3) are extensions to arbitrary dimension of theorems which had previously been proved to hold for K3 surfaces. After that we review the many examples one encounters in algebraic geometry, mostly moduli spaces of sheaves on projective K3 surfaces or abelian surfaces (following Mukai). These moduli spaces will also give us examples of interesting birational maps between irreducible symplectic manifolds. We will recall Huybrechts' beautiful Theorem which states that birationally equivalent irreducible symplectic manifolds are deformation equivalent.

In the following section we will give our construction of examples in dimensions 6 and 10 which are not deformations of the previously known ones. Every known higher-dimensional irreducible symplectic manifold is deformation equivalent to one of Beauville's examples or to one of our examples³. Thus we know of 2 distinct deformation classes in every even dimension at least 4, with one extra deformation class in dimensions 6 and 10.

1.1. Notation. We will be working in the category of complex spaces or of complex algebraic varieties. Thus unless we specify otherwise a symplectic form is holomorphic.

2. First higher-dimensional examples

Beauville constructed two families of irreducible symplectic manifolds in every even dimension greater than 2. The first family consists of Hilbert schemes of 0-dimensional subschemes of a K3, the second family consists of generalized Kummer manifolds. Members of distinct families are not deformation equivalent because they do not have the same Betti numbers.

³ Kodaira proved roughly 40 years ago that any two K3 surfaces are deformation equivalent.
2.1. Hilbert schemes of K3's. Let $S$ be a K3 surface: the Hilbert scheme (or Douady space) $S^{[n]}$ is the $2n$-dimensional connected manifold parametrizing subschemes $Z \subset S$ of finite length⁴ equal to $n$. One forms a picture of $S^{[n]}$ by contemplating the cycle map
\[
\gamma_n : S^{[n]} \to S^{(n)} \tag{2.1}
\]
where $S^{(n)}$ is the symmetric product of $n$ copies of $S$. The map $\gamma_n$ is an isomorphism over $(S^{(n)})^0$, the smooth locus of $S^{(n)}$, i.e., the subset parametrizing cycles $p_1 + \cdots + p_n$ with pairwise distinct $p_i$'s. The fibers of $\gamma_n$ over points of $\mathrm{sing}(S^{(n)})$ are positive dimensional.

Let us examine $S^{[2]}$ more closely. In this case we may avoid appealing to the theory of Hilbert schemes: simply define $S^{[2]}$ to be the blow-up of $S^{(2)}$ with center $\mathrm{sing}(S^{(2)})$, so that (2.1) is the blow-up map (this is Fujiki's construction of the first example of a higher-dimensional irreducible symplectic manifold). Then $S^{[2]}$ is stratified according to the dimension of fibers of (2.1). There are two strata: the open stratum isomorphic to $(S^{(2)})^0$ and the closed stratum isomorphic to the projectivization of the tangent bundle of $S$.

The manifold $S^{[n]}$ is Kähler by a Theorem of Varouchas. One associates to a symplectic form $\sigma$ on $S$ a symplectic form $\sigma^{[n]}$ on $S^{[n]}$ as follows. Let $\pi_i : S^n \to S$ be the $i$th projection. The 2-form on $S^n$ given by $\sum_{i=1}^n \pi_i^*\sigma$ is symplectic and invariant under the action of the symmetric group, hence it descends to a symplectic form on $(S^{(n)})^0$. Since $\gamma_n$ is an isomorphism over $(S^{(n)})^0$ we get a symplectic form on $\gamma_n^{-1}((S^{(n)})^0)$; one verifies easily that this form extends to a symplectic form on $S^{[n]}$. We refer to [2] for the proof that $S^{[n]}$ is irreducible symplectic. The important Betti number $b_2(S^{[n]})$ is computed as follows. One proves that the exceptional divisor $\gamma_n^{-1}(\mathrm{sing}(S^{(n)}))$ is irreducible; from this one easily gets⁵ that
\[
b_2(S^{[n]}) = b_2(S^{(n)}) + 1 = b_2(S) + 1 = 23, \qquad n \ge 2. \tag{2.2}
\]
2.2. Generalized Kummer manifolds. Let $T$ be a 2-dimensional complex torus⁶ and $\sigma$ be a symplectic form on $T$. Proceeding as in the case of K3 surfaces one associates to $\sigma$ a symplectic form $\sigma^{[n+1]}$ on $T^{[n+1]}$. However $T^{[n+1]}$ is not irreducible symplectic. In fact consider the composition
\[
T^{[n+1]} \xrightarrow{\ \gamma_{n+1}\ } T^{(n+1)} \xrightarrow{\ \zeta_{n+1}\ } T, \tag{2.3}
\]
where $\gamma_{n+1}$ is the cycle map (see (2.1)) and $\zeta_{n+1}$ is the map defined by the addition law on $T$. By (2.3) we have $b_1(T^{[n+1]}) \ge b_1(T) = 4$ and we also get that $(\zeta_{n+1} \circ \gamma_{n+1})^*(\sigma)$ is a 2-form independent of $\sigma^{[n+1]}$. On the other hand [2] it turns out that $K^{[n+1]}(T) := (\zeta_{n+1} \circ \gamma_{n+1})^{-1}(0)$ is irreducible symplectic of

⁴ If the ideal sheaf of $Z$ is $I_Z \subset \mathcal{O}_S$ the length of $Z$ is the dimension of $\mathcal{O}_Z := \mathcal{O}_S/I_Z$ as complex vector space.
⁵ Thom-Hirzebruch's Index Theorem gives that $b_2(K3) = 22$.
⁶ $T = \mathbb{C}^2/\Lambda$ where $\Lambda \cong \mathbb{Z}^4$ is a discrete subgroup, i.e., it spans $\mathbb{C}^2$ over $\mathbb{R}$.
dimension $2n$. If $n = 1$ this is the classical Kummer surface, a particular K3. Considering the cycle map $\gamma_{n+1}$ one shows that
\[
b_2(K^{[n+1]}(T)) = b_2(T) + 1 = 7, \qquad n \ge 2. \tag{2.4}
\]
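The Betti number comparison that separates the two families can be written out as a small check. This is only a bookkeeping sketch of formulas (2.2) and (2.4), using $b_2(K3) = 22$ and $b_2(T) = 6$; the function names are hypothetical and introduced purely for illustration.

```python
B2_K3 = 22     # second Betti number of a K3 surface
B2_TORUS = 6   # second Betti number of a 2-dimensional complex torus


def b2_hilbert_scheme_of_k3(n: int) -> int:
    """b_2 of S^[n] for a K3 surface S, following (2.2); stated for n >= 2."""
    if n < 2:
        raise ValueError("formula (2.2) is stated for n >= 2")
    return B2_K3 + 1  # one extra class from the exceptional divisor


def b2_generalized_kummer(n: int) -> int:
    """b_2 of K^[n+1](T), which has dimension 2n, following (2.4); n >= 2."""
    if n < 2:
        raise ValueError("formula (2.4) is stated for n >= 2")
    return B2_TORUS + 1


if __name__ == "__main__":
    for n in range(2, 6):
        hilb = b2_hilbert_scheme_of_k3(n)
        kum = b2_generalized_kummer(n)
        # Same complex dimension 2n, different b_2: not deformation equivalent.
        print(f"dim {2 * n}: b2(S^[n]) = {hilb}, b2(K^[n+1](T)) = {kum}")
```

Since Betti numbers are invariant under deformation, the inequality $23 \ne 7$ in each dimension $2n \ge 4$ is all that the comparison needs.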
3. Periods

Almost 50 years ago A. Weil [32] formulated a series of conjectures on moduli⁷ and periods⁸ of K3 surfaces. Most of the conjectures were proved in the following 20 years; the most celebrated result is the Global Torelli Theorem proved in the '70s by Piatetski-Shapiro and Shafarevich [27], Burns and Rapoport [7], Looijenga and Peters [20] (Friedman [11] gave a radically different proof). Roughly 20 years ago Beauville [2] started investigating periods of irreducible symplectic manifolds of arbitrary dimension (there had been a first attempt by Bogomolov [4]) and showed that periods of higher-dimensional irreducible symplectic manifolds behave very much like those of K3's. Recently Huybrechts [15, 16, 17] proved many deep results on Kähler classes and moduli of irreducible symplectic manifolds. Huybrechts made heavy use of the period map, in particular periods of the twistor family $\{X_\lambda\}_{\lambda \in \mathbb{P}^1}$ described in Section (1). Notice that the twistor family exists because of the equivalence between irreducible symplectic manifolds and irreducible hyperkähler manifolds (i.e., thanks to Yau's solution of the Calabi conjecture).

3.1. Deformations and the local period map. Let $X$ be an irreducible symplectic manifold. Bogomolov [4] proved that deformations of $X$ are unobstructed. Thus there exists a proper submersive map $f : \mathcal{X} \to U$ where $U$ is a polydisc, $X \cong \mathcal{X}_0 := f^{-1}(0)$ and any irreducible symplectic manifold whose complex structure is "close" to $X$ is isomorphic to $\mathcal{X}_t := f^{-1}(t)$ for some $t \in U$ (if $\mathrm{Aut}(X)$ is trivial then $t$ is unique, in general the set of such $t$ is at most countable). Furthermore the Kodaira-Spencer map
\[
\Theta_{U,0} \xrightarrow{\ \kappa\ } H^1(\Theta_X) \xrightarrow{\ \sim\ } H^1(\Omega^1_X) \tag{3.1}
\]
is an isomorphism. (The second map of (3.1) is the isomorphism induced by contraction with a symplectic form.) We say that $f : \mathcal{X} \to U$ is a representative of $\mathrm{Def}(X)$: in what follows we will feel free to shrink $U$ arbitrarily around 0; in other words we are mostly interested in the germs of $\mathcal{X}$ and $U$ at $\mathcal{X}_0$ and 0 respectively. From (3.1) and the Hodge equality $b_2(X) = 2h^{2,0}(X) + h^{1,1}(X)$ we get that
\[
\dim U = b_2(X) - 2. \tag{3.2}
\]

Example 3.1. Let $S$ be a K3 and $X = S^{[n]}$ with $n \ge 2$: by (3.2) and (2.2) we have $\dim U = 21$. On the other hand the deformation space of $S$ has dimension 20 by (3.2) and hence the generic deformation of $S^{[n]}$ is not of the form $(K3)^{[n]}$.

⁷ Isomorphism classes.
⁸ Integrals ("periods") of a symplectic form over integral 2-cycles.
Similarly the generic deformation of a generalized Kummer manifold is not a generalized Kummer.

Now we define the period map. Since $U$ is contractible $\mathcal{X}$ is diffeomorphic to $X \times U$ and hence for all $t \in U$ we have a well-defined integral isomorphism
\[
\phi_t : H^2(X) \xrightarrow{\ \sim\ } H^2(\mathcal{X}_t). \tag{3.3}
\]
The local period map of $X$ is given by
\[
P_X : U \to \mathbb{P}(H^2(X)), \qquad t \mapsto \phi_t^{-1} H^{2,0}(\mathcal{X}_t). \tag{3.4}
\]
By Griffiths's general results on the derivative of period maps the image of $dP_X(0)$ lies in the subspace
\[
\mathrm{Hom}(H^0(\Omega^2_X), H^1(\Omega^1_X)) \subset \mathrm{Hom}(H^0(\Omega^2_X), H^2(X)/H^0(\Omega^2_X)) \tag{3.5}
\]
and we have a natural identification of $dP_X(0)$ with the map
\[
H^1(\Theta_X) \to \mathrm{Hom}(H^0(\Omega^2_X), H^1(\Omega^1_X)), \qquad \theta \mapsto \mathrm{contr}(\cdot, \theta). \tag{3.6}
\]
Since $H^0(\Omega^2_X)$ is spanned by a symplectic form $\sigma$ and contraction with $\sigma$ defines an isomorphism of vector-bundles $\Theta_X \xrightarrow{\sim} \Omega^1_X$ we get that the above map is an isomorphism. Thus $dP_X(0)$ is injective and hence $P_X$ is an immersion of $U$ near 0: this is the Local Torelli Theorem. By (3.6) (or by (3.2)) the image $P_X(U)$ is a smooth analytic subset of codimension 1 in $\mathbb{P}(H^2(X))$.

3.2. Beauville's quadratic form and Fujiki's constant.

Theorem 3.2. [(Beauville: Thm. (4) of [2])+(Fujiki: Thm. (4.7) of [12])] Let $X$ be an irreducible symplectic manifold of dimension $2n$. There exist a positive rational number $c_X$ (Fujiki's constant) and an integral indivisible nondegenerate symmetric bilinear form $(,)_X$ on $H^2(X)$ (Beauville's form) of signature $(3, b_2(X) - 3)$ such that the following hold:
(1) $\mathrm{Im}(P_X) \subset Q := \{[\sigma] \in \mathbb{P}(H^2(X)) \mid (\sigma, \sigma)_X = 0,\ (\sigma, \overline{\sigma})_X > 0\}$,
(2) $\int_X \alpha^{2n} = c_X \cdot (\alpha, \alpha)^n_X$ for $\alpha \in H^2(X)$,
(3) $(\alpha, \alpha')_X = 0$ if $\alpha \in H^{p,2-p}(X)$, $\alpha' \in H^{p',2-p'}(X)$ with $p + p' \ne 2$.

Proof. Let $\widetilde{F} \in S^{2n}H^2(X)^\vee$ be the intersection form
\[
\widetilde{F}(\alpha_1, \ldots, \alpha_{2n}) := \int_X \alpha_1 \wedge \cdots \wedge \alpha_{2n}. \tag{3.7}
\]
Let $[\alpha] \in \mathrm{Im}(P_X)$ and $\beta_1, \ldots, \beta_{n-1} \in H^2(X)$ be arbitrary; we claim that
\[
\widetilde{F}(\underbrace{\alpha, \ldots, \alpha}_{n+1}, \beta_1, \ldots, \beta_{n-1}) = 0. \tag{3.8}
\]
In fact if $[\alpha] = P_X(t)$ then
\[
\widetilde{F}(\underbrace{\alpha, \ldots, \alpha}_{n+1}, \beta_1, \ldots, \beta_{n-1}) = \int_{\mathcal{X}_t} \phi_t(\alpha)^{n+1} \wedge \phi_t(\beta_1) \wedge \cdots \wedge \phi_t(\beta_{n-1}). \tag{3.9}
\]
By definition of the period map we may represent $\phi_t(\alpha)$ by a (holomorphic) symplectic form and hence the integrand is represented by a sum of forms of type $(p, q)$ with $p \ge 2n + 2$; since $\dim X = 2n$ these forms are identically zero and the integral vanishes. This proves (3.8). Let $F$ be the degree-$2n$ polynomial defined by
\[
F(\gamma) := \widetilde{F}(\underbrace{\gamma, \ldots, \gamma}_{2n}). \tag{3.10}
\]
Setting $\beta_1 = \cdots = \beta_{n-1} = \alpha$ in (3.8) we get that $F$ vanishes on $\mathrm{Im}(P_X)$: since $F$ is not identically zero⁹ it follows that the Zariski closure¹⁰ of $\mathrm{Im}(P_X)$ in $\mathbb{P}(H^2(X))$ is a proper subset of $\mathbb{P}(H^2(X))$. On the other hand we know by Subsection (3.1) that $\mathrm{Im}(P_X)$ is a smooth connected analytic subset of codimension 1 in $\mathbb{P}(H^2(X))$ and hence the Zariski closure of $\mathrm{Im}(P_X)$ is the set of zeroes of an irreducible non-zero homogeneous polynomial $A$. One verifies that $\mathrm{Im}(P_X)$ does not belong to a hyperplane, i.e., $\deg A \ge 2$. Since $F$ vanishes on $\mathrm{Im}(P_X)$ it vanishes also on $V(A)$ and hence by irreducibility of $A$ we have $F = F_1 \cdot A$.

If $n = 1$ we have $2 = \deg F = \deg F_1 + \deg A \ge \deg F_1 + 2$ and hence $\deg F_1 = 0$, $\deg A = 2$. Of course if $n = 1$ the theorem is quite trivial: Beauville's form is the intersection form $F$ and Fujiki's constant is equal to 1.

If $n = 2$ we notice that (3.8) tells us that the partial derivatives of $F$ vanish on $\mathrm{Im}(P_X)$ and hence also on the zero-set of $A$. This implies that $A$ divides $F_1$ and hence
\[
F = F_2 \cdot A^2. \tag{3.11}
\]
Thus $4 = \deg F = \deg F_2 + 2\deg A \ge \deg F_2 + 4$ and hence $\deg F_2 = 0$, $\deg A = 2$. Equation (3.11) determines the constant $F_2$ and the quadratic form $A$ up to multiplicative factors; as is easily verified we can rescale $F_2$ and $A$ so that $A$ is integral, indivisible and $A(\sigma + \overline{\sigma}, \sigma + \overline{\sigma}) > 0$ for a (holomorphic) symplectic form $\sigma$. Let $(,)_X$ be the bilinear form defined by the "rescaled" quadratic polynomial $A$ and $c_X$ be the "rescaled" $F_2$. All the statements in the theorem hold by construction except possibly for the statement regarding the non-degeneracy and signature of $(,)_X$; this follows easily from the Hodge index Theorem.

If $n > 2$ one proceeds similarly; by (3.8) the partial derivatives of $F$ up to order $(n - 1)$ vanish on $\mathrm{Im}(P_X)$ and hence also on the zero-set of $A$. Dividing $F$ successively by $A$ one gets that $F = F_n \cdot A^n$ where $F_n$ is a constant and $\deg A = 2$. The rest of the argument is as in the case $n = 2$.

⁹ If $\omega$ is a Kähler class then $F(\omega) > 0$.
¹⁰ Common zeroes of all homogeneous polynomials vanishing on $\mathrm{Im}(P_X)$.
A few comments:
(a) Let $U$ be a representative of $\mathrm{Def}(X)$: by the Local Torelli Theorem (see the end of Subsection (3.1)) $P_X : U \to Q$ is an isomorphism onto an open subset of $Q$.
(b) The quantities $c_X$ and $(,)_X$ are uniquely characterized by Properties (1)–(2) above and are invariant under deformation of complex structure; they are the main discrete invariants of $X$.
(c) Since $(,)_X$ is integral it gives $H^2(X; \mathbb{Z})$ a structure of lattice¹¹.

The discrete invariants of Beauville's examples are as follows. Let $S$ be a K3 surface; then
\[
c_{S^{[n]}} = \frac{(2n)!}{n!\,2^n}, \qquad H^2(S^{[n]}; \mathbb{Z}) \cong H^{\oplus 3} \oplus (-E_8)^{\oplus 2} \oplus (-2(n-1)), \qquad n \ge 2 \tag{3.12}
\]
where $H$ is the standard hyperbolic plane. Let $T$ be a 2-dimensional complex torus; then
\[
c_{K^{[n+1]}(T)} = \frac{(2n)!}{n!\,2^n}(n+1), \qquad H^2(K^{[n+1]}(T); \mathbb{Z}) \cong H^{\oplus 3} \oplus (-2(n+1)), \qquad n \ge 2. \tag{3.13}
\]

3.3. The Kähler cone and surjectivity of the period map. We will state some of Huybrechts' recent results (with an improvement by Boucksom [5]); we refer to [18] for a very readable survey and of course to the original papers [15, 16, 17, 5]. The first result is a projectivity criterion.

Proposition 3.3. [Projectivity criterion] An irreducible symplectic manifold $X$ is projective if and only if there exists $\alpha \in H^{1,1}_{\mathbb{Z}}(X)$ such that $(\alpha, \alpha)_X > 0$.

Assume that $X$ is projective and that $L$ is an ample line bundle on $X$. Let $\sigma \in \Gamma(\Omega^2_X)$ be a symplectic form; then $\int_X c_1(L)^2 \wedge (\sigma + \overline{\sigma})^{2n-2} > 0$, where $2n = \dim X$. Applying Items (2)–(3) of Theorem (3.2) we get that $(c_1(L), c_1(L))_X > 0$. Thus the non-trivial part of the criterion is the sufficiency of the condition.

The next result describes the Kähler cone $\mathcal{K}_X \subset H^{1,1}_{\mathbb{R}}(X)$ of Kähler classes. First we recall that the positive cone $\mathcal{C}_X \subset H^{1,1}_{\mathbb{R}}(X)$ is the connected component of $\{\alpha \in H^{1,1}_{\mathbb{R}}(X) \mid (\alpha, \alpha)_X > 0\}$ containing $\mathcal{K}_X$.

Theorem 3.4. [Huybrechts [17]+Boucksom [5]] The Kähler cone $\mathcal{K}_X$ consists of those $\alpha \in \mathcal{C}_X$ such that $\int_C \alpha > 0$ for all rational curves $C \subset X$. Here a rational curve $C \subset X$ is the image of a non-constant map $\mathbb{P}^1 \to X$.

A comment on the statement of the theorem. Demailly-Paun [9] have recently extended the Nakai-Moishezon theorem to the case of a compact Kähler manifold $X$, i.e., they proved that the Kähler cone is a connected component of the set of $\alpha \in H^{1,1}_{\mathbb{R}}(X)$ such that $\int_Z \alpha^d > 0$ for all $d$-dimensional analytic subsets $Z \subset X$. Theorem (3.4) states that if $X$ is an irreducible symplectic manifold

¹¹ A finitely generated free abelian group endowed with a non-degenerate integral symmetric bilinear form.
it suffices to test those $Z$ which are rational curves: in this respect $X$ really behaves like a K3 surface. A comment on the proof: essential use is made of the twistor family $f : \mathcal{X} \to \mathbb{P}^1$ one can associate to an irreducible symplectic manifold together with the choice of a Kähler class; the complex structure on the fibers $X_\lambda := f^{-1}(\lambda)$ is defined as in Section (1).

In order to formulate the last result we recall how to define the global period map. Choose a deformation class $D$ of irreducible symplectic manifolds: thus there is a lattice $\Lambda$ with bilinear form $(,)_\Lambda$ such that for any $X \in D$ the lattice $H^2(X; \mathbb{Z})$ is isometric to $\Lambda$. The associated period space is
\[
Q_\Lambda := \{[\sigma] \in \mathbb{P}(\Lambda \otimes \mathbb{C}) \mid (\sigma, \sigma)_\Lambda = 0,\ (\sigma, \overline{\sigma})_\Lambda > 0\}. \tag{3.14}
\]
A marked manifold in $D$ consists of a couple $(X, \phi)$ where $X \in D$ and $\phi : H^2(X; \mathbb{Z}) \xrightarrow{\sim} \Lambda$ is an isometry. To a marked manifold we associate its period $P(X, \phi) := \phi_{\mathbb{C}}(H^{2,0}(X)) \in Q_\Lambda$. The set of equivalence classes of marked manifolds in $D$ is a (non-Hausdorff) analytic space $\mathcal{M}_D$ and the period map $P : \mathcal{M}_D \to Q_\Lambda$ is holomorphic.

Theorem 3.5. [Huybrechts [15]] Let $\mathcal{M}^0_D$ be a connected component of $\mathcal{M}_D$. The restriction of $P$ to $\mathcal{M}^0_D$ is surjective onto $Q_\Lambda$.

Again the existence of the twistor family is essential for the proof.
4. More examples, birational maps

We will present most of the known explicit constructions of irreducible symplectic manifolds. First the Fano variety of lines on a smooth cubic hypersurface in $\mathbb{P}^5$; this example is due to Beauville and Donagi. Next we give the construction, due to Mukai, of a symplectic form on moduli spaces of stable sheaves on a projective surface $S$ with trivial canonical bundle¹² and we recall the result (Mukai, Huybrechts-Göttsche, O'Grady, Yoshioka) stating that if such a moduli space is compact then it is an irreducible symplectic manifold ($S$ a K3) or one of its "Bogomolov-Beauville factors" is ($S$ an abelian surface). By this method one gets a very rich series of examples of irreducible symplectic varieties and also of interesting birational maps between them; we give explicit examples of Mukai flops, the simplest non-regular birational maps. We finish by stating Huybrechts' Theorem on birational irreducible symplectic manifolds.

¹² Thus $S$ is either a K3 or an abelian surface.
4.1. Lines on a cubic 4-fold. Let Y ⊂ P5 be a smooth cubic hypersurface and X := F (Y ) be the set of lines L ⊂ X. Thus X is a closed subvariety of the Grassmannian Gr(1, P5 ). Beauville and Donagi [3] proved that X is an irreducible symplectic manifold deformation equivalent to (K3)[2] . Let Gr(1, P5 ) → P14 be the Pl¨ ucker embedding. Thus we have X ⊂ P14 ; let h := c1 (OX (1)) be the first Chern class of the hyperplane bundle on X. One verifies [3] that (h, h)X = 6. The remarkable feature of Beauville-Donagi’s example is the following: the family of X = F (Y ) one gets by letting Y vary among all smooth cubic hypersurfaces is locally complete for deformations keeping the class h of type (1, 1). In other words every small deformation of X = F (Y ) keeping h of type (1, 1) is isomorphic to X = F (Y ) for some cubic hypersurface Y . I know of no other explicit locally complete family of higher-dimensional polarized irreducible symplectic varieties. 4.2. Moduli spaces of sheaves. Let S be a projective surface. In general any natural algebraic structure on the set of isomorphism classes of vector-bundles on S is not separated, i.e., not Hausdorff. In order to get separated moduli spaces one restricts to the class of H-stable vector-bundles, where H is an ample divisor13 on S. In general moduli spaces of H-stable vector-bundles are not compact: in order to get compact moduli spaces one considers the larger class of H-semistable torsion-free sheaves and one introduces S-equivalence, a relation which coincides with isomorphism for H-stable sheaves and is coarser than isomorphism for H-semistable non stable sheaves. Explicitely: a torsionfree sheaf F is H-semistable if for all non-zero subsheaves G ⊂ F we have 1 1 χ(G ⊗ OS (mH)) ≤ χ(F ⊗ OS (mH)). (4.1) rk(G) rk(F ) If the inequality is strict whenever G '= F then F is H-stable. 
A celebrated theorem of Gieseker and Maruyama states that the set of S-equivalence classes of $H$-semistable torsion-free sheaves with fixed rank and Chern classes (in $H^*(S)$) has a natural structure of projective variety. Now let us assume that $K_S$ is trivial, i.e., that $S$ is a K3 or an abelian surface. Given a positive $r \in \mathbb{N}$, $s \in \mathbb{Z}$ and $c_1 \in H^{1,1}_{\mathbb{Z}}(S)$ we let $M(r, c_1, s)$ be the set of S-equivalence classes of coherent pure $H$-semistable sheaves $F$ on $S$ with
$$\mathrm{rk}(F) = r, \quad c_1(F) = c_1, \quad \chi(F) = \begin{cases} r + s & \text{if $S$ is a K3,} \\ s & \text{if $S$ is an abelian surface.} \end{cases} \tag{4.2}$$
(To simplify notation we omit reference to $S$, $H$; however one must keep in mind that the moduli space depends both on $S$ and $H$.) Mukai [22] proved that the open subset $M^{st}(r, c_1, s) \subset M(r, c_1, s)$ parametrizing stable sheaves is smooth and that if it is non-empty then
$$\dim M^{st}(r, c_1, s) = 2 - 2rs + c_1^2. \tag{4.3}$$
¹³ i.e., there exists an embedding $f : S \hookrightarrow \mathbb{P}^n$ with $f^* \mathcal{O}_{\mathbb{P}^n}(1) \cong \mathcal{O}_S(kH)$ for some $k > 0$.
Furthermore Mukai showed how to associate to a symplectic form $\sigma$ on $S$ a symplectic form $\sigma_M$ on $M^{st}(r, c_1, s)$. We give the definition of $\sigma_M$ at a point $[F] \in M^{st}(r, c_1, s)$ representing a locally-free sheaf, i.e., a vector-bundle. Since $F$ is a vector-bundle there is a canonical isomorphism¹⁴
$$\Theta_{[F]} M(r, c_1, s) \cong H^{0,1}(\mathrm{End}(F)). \tag{4.4}$$
Given $\alpha, \beta \in H^{0,1}(\mathrm{End}(F))$ one sets
$$\langle \sigma_M, \alpha \wedge \beta \rangle := \int_S \sigma \wedge \mathrm{Tr}(\alpha \wedge \beta). \tag{4.5}$$
If $(r, c_1, s)$ and $H$ are suitably chosen then $M^{st}(r, c_1, s) = M(r, c_1, s)$ and we may hope that $M(r, c_1, s)$ is an irreducible symplectic manifold.
Example 4.1. If $S$ is a K3 surface then $M(1, 0, 1 - n) \cong S^{[n]}$: a sheaf is represented by a point of $M(1, 0, 1 - n)$ if and only if it is isomorphic to $I_Z$ where $[Z] \in S^{[n]}$. If $S$ is an abelian surface then $M(1, 0, -n) \cong S^{[n]} \times \mathrm{Pic}^0(S)$: a sheaf is represented by a point of $M(1, 0, -n)$ if and only if it is isomorphic to $I_Z \otimes L$ where $[Z] \in S^{[n]}$ and $[L] \in \mathrm{Pic}^0(S)$.
The example above suggests that $M(r, c_1, s)$ might be irreducible symplectic if $S$ is a K3. If $S$ is an abelian surface we should first "reduce" $M(r, c_1, s)$ by considering the map
$$\Phi : M(r, c_1, s) \longrightarrow S \times \mathrm{Pic}^{c_1}(S), \qquad [F] \mapsto \Big( \sum c_2^{rat}(F), [\wedge^r F] \Big). \tag{4.6}$$
We explain our notation: $c_2^{rat}(F) \in CH^2(S)$ is the second Chern class in the group of rational equivalence classes of 0-cycles on $S$ and $\sum : CH^2(S) \to S$ is induced by the addition law on $S$, $[\wedge^r F]$ is the isomorphism class of the line-bundle $\wedge^r F$. Assume that $\dim M(r, c_1, s) \ge 4$: then $\Phi$ is submersive and any two of its fibers are isomorphic. Thus $M(r, c_1, s)^0 := \Phi^{-1}(a, [\xi])$ is well defined up to isomorphism and by (4.3)
$$\dim M(r, c_1, s)^0 = -2 - 2rs + c_1^2. \tag{4.7}$$
One verifies that the restriction of $\sigma_M$ to $M^{st}(r, c_1, s)^0$ is symplectic. Now we can state the main result regarding $M(r, c_1, s)$ when $S$ is a K3 and $M(r, c_1, s)^0$ when $S$ is an abelian surface, under the hypothesis that $(r, c_1, s)$ and $H$ have been chosen so that $M^{st}(r, c_1, s) = M(r, c_1, s)$.
Theorem 4.2. [23, 13, 24, 30, 31] Keep notation and hypotheses as above. If $S$ is a K3 then $M(r, c_1, s)$ is a deformation of $S^{[n]}$ where $2n = 2 - 2rs + c_1^2$. If $S$ is an abelian surface and $\dim M(r, c_1, s) \ge 4$ then $M(r, c_1, s)^0$ is a deformation of $K^{[n+1]}(S)$ where $2n = -2 - 2rs + c_1^2$.
¹⁴ If $F$ is not locally-free replace $H^{0,1}(\mathrm{End}(F))$ by $\mathrm{Ext}^1(F, F)$.
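The dimension formulas (4.3) and (4.7) can be checked against Example 4.1 with a few lines of arithmetic. The following is only a bookkeeping sketch: the function names `dim_moduli` and `dim_reduced` are hypothetical, and the argument `c1_sq` stands for the self-intersection $c_1^2$, which is assumed to be supplied by the reader.

```python
def dim_moduli(r, c1_sq, s):
    """Dimension 2 - 2rs + c1^2 of M^st(r, c1, s), as in formula (4.3)."""
    return 2 - 2 * r * s + c1_sq


def dim_reduced(r, c1_sq, s):
    """Dimension -2 - 2rs + c1^2 of the fiber M(r, c1, s)^0, as in formula (4.7).

    The target S x Pic^{c1}(S) of the map Phi in (4.6) has dimension 4,
    so the fiber dimension is the total dimension minus 4.
    """
    return dim_moduli(r, c1_sq, s) - 4


for n in range(2, 6):
    # K3 case of Example 4.1: M(1, 0, 1 - n) is S^[n], of dimension 2n.
    assert dim_moduli(1, 0, 1 - n) == 2 * n
    # Abelian case: M(1, 0, -n) is S^[n] x Pic^0(S), of dimension 2n + 2.
    assert dim_moduli(1, 0, -n) == 2 * n + 2
    # Its reduced fiber then has dimension 2n - 2.
    assert dim_reduced(1, 0, -n) == 2 * n - 2
```

In both cases the formula reproduces the dimension one reads off directly from the description of the moduli space in Example 4.1.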
We notice that although the above moduli spaces belong to the same deformation class as Beauville's examples they are in general not isomorphic (and not birational) to Beauville's examples (recall Example (3.1)).
4.3. Moduli spaces and Mukai flops. We will examine a particular moduli space of sheaves on a K3 surface. This will serve two purposes: it will show how one goes about proving Theorem (4.2) and it will introduce Mukai flops, the simplest non-regular birational maps between holomorphic symplectic manifolds. Let $S \subset \mathbb{P}^3$ be a smooth quartic surface, i.e., a hypersurface given by (1.1), and assume that $S$ contains a line $L$. Let $\ell := c_1(L)$. We consider $M := M(2, \ell, -1)$, where stability is with respect to $\mathcal{O}_S(1)$. As is easily checked $M = M^{st}$. By the results of Mukai quoted in the preceding subsection we get that if $M$ is non-empty then it is a 4-dimensional smooth projective variety with a regular symplectic form. Let us show that $M$ is birational to $S^{[2]}$: in particular this will prove that $M$ is irreducible symplectic.
Claim 4.3. Keeping notation as above, let $[F] \in M$. Then $h^0(F) \ge 1$.
Proof. Serre duality gives that $H^2(F) \cong \mathrm{Hom}(F, \mathcal{O}_S)^\vee$ and the last group vanishes by stability, hence $h^2(F) = 0$. By definition of $M$ (see (4.2)) we have $\chi(F) = 1$ and hence we get that $h^0(F) \ge 1$.
Let $\tau \in H^0(F)$ be non-zero. Then $\tau$ has isolated zeroes by stability of $F$ and hence $F$ fits into an exact sequence
$$0 \to \mathcal{O}_S \overset{\tau}{\longrightarrow} F \longrightarrow I_Z \otimes \mathcal{O}_S(L) \to 0, \tag{4.8}$$
where $I_Z$ is the ideal sheaf of a 0-dimensional subscheme $Z \subset S$. From $\chi(F) = 1$ we get that $\chi(I_Z \otimes \mathcal{O}_S(L)) = -1$ and hence $Z$ has length 2. From this one easily gets that $M = M_1 \sqcup M_2$ where
$$M_i := \{[F] \in M \mid h^0(F) = i\}. \tag{4.9}$$
By upper-semicontinuity of cohomology dimension we get that $M_1$ is open in $M$. One gets a regular map $M_1 \to S^{[2]}$ by associating to $[F]$ the (unique) $Z$ appearing in Exact Sequence (4.8). One checks easily that this map gives an isomorphism
$$f : M_1 \overset{\sim}{\longrightarrow} (S^{[2]} \setminus L^{[2]}), \tag{4.10}$$
where $L^{[2]} \subset S^{[2]}$ is the closed subset parametrizing subschemes of $L$. On the other hand we have an isomorphism
$$M_2 \cong (L^{[2]})^\vee. \tag{4.11}$$
(Explanation: $L^{[2]} \cong \mathbb{P}^2$ and $(L^{[2]})^\vee$ is the dual plane.) Isomorphism (4.11) is defined as follows. To $[F] \in M_2$ we associate the set $R_F$ of $Z \subset S$ appearing in (4.8) as $\tau$ varies among $(H^0(F) \setminus \{0\})$. One verifies easily that all $Z$ parametrized by $R_F$ are contained in $L$ and that $R_F$ is a line in $L^{[2]}$. From the above we get that $M_1$ is dense in $M$ and hence that $f$ defines a birational map $f : M \dashrightarrow S^{[2]}$. One checks that $f$ is not regular. The inverse $f^{-1}$ replaces $L^{[2]}$
by its dual plane $(L^{[2]})^\vee$. This is an example of a Mukai flop, defined in general as follows. Let $X$ be an irreducible symplectic manifold with symplectic form $\sigma$. Assume that there exists a closed $Z \subset X$ of codimension $r$ and that we have a $\mathbb{P}^r$-fibration $\rho : Z \to B$. Let $p \in Z$ and $\mathbb{P}^r = \rho^{-1}(\rho(p))$ be the fiber of $\rho$ through $p$. The restriction of $\sigma$ to $Z$ is the pull-back of a 2-form on $B$, hence contraction with $\sigma$ defines an isomorphism
$$\Theta_p \mathbb{P}^r \cong (N_{Z/X})^\vee_p. \tag{4.12}$$
Let $\widetilde{X} \to X$ be the blow up of $Z$ and $E \subset \widetilde{X}$ be the exceptional divisor. From (4.12) we get an inclusion
$$\iota : E \hookrightarrow Z \times_B Z^\vee \tag{4.13}$$
where $\rho^\vee : Z^\vee \to B$ is the dual fibration of $\rho$ and $\mathrm{Im}(\iota)$ is the relative incidence subvariety consisting of couples $(p, H)$ with $\rho(p) = \rho^\vee(H)$ and $p \in H$. Thus in addition to the (blow-up) $\mathbb{P}^{r-1}$-fibration $\pi : E \to Z$ we have a dual $\mathbb{P}^{r-1}$-fibration $\pi^\vee : E \to Z^\vee$. By Nakano's contractibility criterion there is a morphism $\widetilde{X} \to X^\vee$ to a smooth $X^\vee$ contracting the fibers of $\pi^\vee$ and hence we have a non-regular birational map
$$X \dashrightarrow X^\vee \tag{4.14}$$
which is an isomorphism outside $Z$, $Z^\vee$. This is a Mukai flop, see [22]. The complex manifold $X^\vee$ is simply connected and it has a symplectic form spanning the space of holomorphic 2-forms, hence if it is Kähler it is irreducible symplectic. In our example $X = M$, $X^\vee = S^{[2]}$, $r = 2$ and $B = \mathrm{pt}$. Markman [21] has introduced and studied so-called generalized Mukai flops. There are many natural birational maps between moduli spaces of sheaves on a K3 or abelian surface: they are Mukai flops in low dimensions, in general they tend to be generalized Mukai flops.
4.4. Huybrechts' Theorem. Birational maps between irreducible symplectic manifolds have been studied intensively, see [6, 8, 14, 33]. We single out Huybrechts' beautiful result.
Theorem 4.4. [Huybrechts [17]] Let $X$, $Y$ be birational (bimeromorphic) irreducible symplectic manifolds. Then $X$, $Y$ are deformation equivalent.
The theorem above should be compared to theorems of Batyrev [1] and Denef-Loeser [10] stating that birational manifolds with trivial canonical bundles have the same Betti numbers, respectively Hodge numbers; however birational CY's need not be deformation equivalent, in fact they may not be homeomorphic. As for Theorems (3.4)–(3.5) a key rôle in the proof of Theorem (4.4) is played by the Twistor family.
5. New deformation classes
We will sketch our construction [25, 26] of 6- and 10-dimensional irreducible symplectic manifolds which are not deformation equivalent to Beauville's examples. Let $S$ be a K3 or abelian surface with an ample divisor $H$. We consider the moduli space $M(2, 0, -2)$. This is a typical example in which $M^{st}(2, 0, -2) \ne M(2, 0, -2)$: if $S$ is a K3 the sheaf $I_Z \oplus I_W$ where $[Z], [W] \in S^{[2]}$ is a semistable non-stable sheaf parametrized by $M(2, 0, -2)$; if $S$ is a torus the sheaf $(I_p \otimes L) \oplus (I_{p'} \otimes L')$ where $p, p' \in S$ and $[L], [L'] \in \mathrm{Pic}^0(S)$ is a semistable non-stable sheaf parametrized by $M(2, 0, -2)$. If $H$ is chosen "generically" these are precisely the semistable non-stable sheaves parametrized by $M(2, 0, -2)$ and their moduli sweep out the singular locus of $M(2, 0, -2)$. I was able to construct a symplectic desingularization $\pi : \widetilde{M}(2, 0, -2) \to M(2, 0, -2)$ with $\widetilde{M}(2, 0, -2)$ projective; symplectic means that $\pi^* \sigma_M$ extends to a symplectic form on $\widetilde{M}(2, 0, -2)$. If $S$ is a torus set $\widetilde{M}(2, 0, -2)^0 := \pi^{-1}(M(2, 0, -2)^0)$; by (4.3) and (4.7) we have $\dim \widetilde{M}(2, 0, -2) = 10$ and $\dim \widetilde{M}(2, 0, -2)^0 = 6$. I proved that if $S$ is a K3 then $\widetilde{M}(2, 0, -2)$ is irreducible symplectic and
$$b_2(\widetilde{M}(2, 0, -2)) \ge 24. \tag{5.1}$$
This shows that $\widetilde{M}(2, 0, -2)$ belongs to a new deformation class of 10-dimensional irreducible symplectic manifolds because for the standard Beauville examples $b_2$ is either 7 or 23. We proved also that if $S$ is a torus then $\widetilde{M}(2, 0, -2)^0$ is irreducible symplectic and
$$b_2(\widetilde{M}(2, 0, -2)^0) = 8. \tag{5.2}$$
Thus $\widetilde{M}(2, 0, -2)^0$ belongs to a new deformation class of 6-dimensional irreducible symplectic manifolds. A few comments on the proof. The symplectic desingularization is obtained by first following Kirwan's procedure that gives (partial) desingularizations of GIT quotients whenever there are semistable non-stable orbits and then by contracting an extremal ray; see also Kaledin and Lehn [19] for an approach which avoids the contraction. The hardest part of the proof consists in showing that $\widetilde{M}(2, 0, -2)$ (when $S$ is a K3) or $\widetilde{M}(2, 0, -2)^0$ (when $S$ is an abelian surface) is irreducible symplectic and that (5.1)–(5.2) hold. To explain where the problem lies we first take a step backwards: Theorem (4.2) is proved by showing that for a suitable choice of $(S, H)$ the moduli space is isomorphic to $(K3)^{[n]}$ (birational suffices by Huybrechts' Theorem (4.4)) and this is also the quickest method for showing that $M$ (or $M^0$) is irreducible symplectic. In studying $\widetilde{M}$ and $\widetilde{M}^0$ we need to proceed differently: for the time being we have no description of $\widetilde{M}$ (or $\widetilde{M}^0$) other than as a moduli space. Applying Lefschetz' Hyperplane Section Theorem we can describe quite explicitly a certain subset of $\widetilde{M}$ (or $\widetilde{M}^0$) whose low-dimensional topology resembles that of the mysterious variety we are studying. Examining
this subset we are able to show that the mysterious variety is irreducible symplectic and also that (5.1)–(5.2) hold. The basic discrete invariants of $\widetilde{M}^0$ have been computed recently by Rapagnetta [28]. In the same paper Rapagnetta also proved that the topological Euler characteristic of $\widetilde{M}^0$ is equal to 1920. The question that naturally arises is whether one can generalize the above construction to produce other deformation classes of irreducible symplectic manifolds. In [25] we studied $M(2, 0, 2 - 2k)$ for $S$ a K3 and any $k \ge 2$ (if $k < 2$ we get nothing interesting). If $H$ is chosen generically the semistable non-stable sheaves are represented by $I_Z \oplus I_W$ where $[Z], [W] \in S^{[k]}$ and their moduli sweep out the singular locus of the $(8k - 6)$-dimensional space $M(2, 0, 2 - 2k)$. The singularities of $M(2, 0, 2 - 2k)$ for $k > 2$ differ from those of $M(2, 0, -2)$: in [25] we constructed a projective symplectic partial desingularization $\widetilde{M}(2, 0, 2 - 2k) \to M(2, 0, 2 - 2k)$ which is an actual desingularization only when $k = 2$. Recently Kaledin and Lehn [19] proved that $M(2, 0, 2 - 2k)$ has no symplectic resolution if $k > 2$. Of course there are many other moduli spaces $M(r, c_1, s)$ for which $M^{st}(r, c_1, s) \ne M(r, c_1, s)$. However $M(2r', 2c_1', 2s')$ with $(r', c_1', s')$ an indivisible vector in $(\mathbb{Z} \oplus H^2(S; \mathbb{Z}) \oplus \mathbb{Z})$ is deformation equivalent to $M(2, 0, 2 - 2k)$ for an appropriate $k$ and hence we will get nothing new. In general it looks unlikely that we will find new deformation classes of irreducible symplectic manifolds by desingularizing moduli spaces $M(r, c_1, s)$.
References
[1] V. Batyrev, Birational Calabi-Yau n-folds have equal Betti numbers, New trends in algebraic geometry (Warwick 1996), London Math. Soc. Lecture Note Ser. 264, CUP, 1999, pp. 1–11.
[2] A. Beauville, Variétés Kähleriennes dont la première classe de Chern est nulle, J. Differential Geometry 18, 1983, pp. 755–782.
[3] A. Beauville, R. Donagi, La variété des droites d'une hypersurface cubique de dimension 4, C. R. Acad. Sci. Paris Sér. I Math. 301, 1985, pp. 703–706.
[4] F. Bogomolov, Hamiltonian Kählerian manifolds, Soviet Math. Dokl. 19 (1978), 1979, pp. 1462–1465.
[5] S. Boucksom, Le cône kählérien d'une variété hyperkählérienne, C. R. Acad. Sci. Paris 333, 2001, pp. 935–938.
[6] D. Burns, Y. Hu, T. Luo, HyperKähler Manifolds and Birational Transformations in dimension 4, Vector bundles and representation theory (Columbia, MO, 2002), Contemp. Math. 322, AMS, 2003, pp. 141–149.
[7] D. Burns, M. Rapoport, On the Torelli problem for Kählerian K3 surfaces, Ann. scient. Éc. Norm. Sup. 8 (1975), pp. 235–274.
[8] K. Cho, Y. Miyaoka, N. Shepherd-Barron, Characterizations of projective space and applications to complex symplectic manifolds, Higher-dimensional birational geometry (Kyoto, 1997), Adv. Stud. Pure Math. 35, Math. Soc. Japan, 2002, pp. 1–88.
[9] J.P. Demailly, M. Paun, Numerical characterization of the Kähler cone of a compact Kähler manifold, Ann. of Math. 159 (2004), pp. 1247–1274.
[10] J. Denef, F. Loeser, Germs of arcs on singular algebraic varieties and motivic integration, Invent. Math. 135, 1999, pp. 201–232.
[11] R. Friedman, A new proof of the global Torelli theorem for K3 surfaces, Ann. of Math. 120, 1984, pp. 237–269.
[12] A. Fujiki, On the de Rham Cohomology Group of a Compact Kähler Symplectic Manifold, Adv. Studies in Pure Math. 10, Algebraic Geometry, Sendai 1985, 1987, pp. 105–165.
[13] L. Göttsche, D. Huybrechts, Hodge numbers of moduli spaces of stable bundles on K3 surfaces, Internat. J. Math. 7, 1996, pp. 359–372.
[14] Y. Hu, S.-T. Yau, HyperKähler manifolds and birational transformations, Adv. Theor. Math. Phys. 6, 2002, pp. 557–574.
[15] D. Huybrechts, Compact hyper-Kähler manifolds: basic results, Invent. Math. 135, 1999, pp. 63–113.
[16] D. Huybrechts, Erratum: "Compact hyper-Kähler manifolds: basic results" [Invent. Math. 135 (1999), no. 1, 63–113], Invent. Math. 152, 2003, pp. 209–212.
[17] D. Huybrechts, The Kähler cone of a compact hyperkähler manifold, Math. Ann. 326, 2003, pp. 499–513.
[18] D. Huybrechts, Compact hyperkähler manifolds, Calabi-Yau manifolds and related geometries (Nordfjordeid 2001), Universitext, Springer, Berlin, 2003, pp. 161–225.
[19] D. Kaledin, M. Lehn, Local structure of hyperkähler singularities in O'Grady examples, arXiv:math.AG/0405575.
[20] C. Peters, E. Looijenga, Torelli theorems for Kähler K3 surfaces, Compositio Math. 42, 1980/81, pp. 145–186.
[21] E. Markman, Brill-Noether duality for moduli spaces of sheaves on K3 surfaces, J. Algebraic Geom. 10, 2001, pp. 623–694.
[22] S. Mukai, Symplectic structure of the moduli space of sheaves on an abelian or K3 surface, Invent. Math. 77, 1984, pp. 101–116.
[23] S. Mukai, On the moduli space of bundles on K3 surfaces, I, Vector Bundles on Algebraic Varieties, TIFR, Bombay, O.U.P., 1987, pp. 341–413.
[24] K.G. O'Grady, The weight-two Hodge structure of moduli spaces of sheaves on a K3 surface, J. Algebraic Geom. 6, 1997, pp. 599–644.
[25] K.G. O'Grady, Desingularized moduli spaces of sheaves on a K3, J. für die reine und angew. Math. 512, 1999, pp. 49–117.
[26] K.G. O'Grady, A new six-dimensional irreducible symplectic variety, J. Algebraic Geom. 12 (2003), pp. 435–505.
[27] I. Piatechki-Shapiro, I.R. Shafarevich, A Torelli theorem for algebraic surfaces of type K3, Math. USSR Izvestija 5 (1971), pp. 547–588.
[28] A. Rapagnetta, Topological invariants of O'Grady's six-dimensional irreducible symplectic variety, arXiv:math.AG/0406026.
[29] J. Varouchas, Sur l'image d'une variété kählérienne compacte, Fonctions de plusieurs variables complexes, V (Paris, 1979-1985), Springer LNM 1188, 1986, pp. 245–259.
[30] K. Yoshioka, Some examples of Mukai's reflections on K3 surfaces, J. Reine Angew. Math. 515, 1999, pp. 97–123.
[31] K. Yoshioka, Moduli spaces of stable sheaves on abelian surfaces, Math. Ann. 321, 2001, pp. 817–884.
[32] A. Weil, Final report on contract AF 18(603)-57, André Weil – Collected papers, vol. II, Springer, 1979, pp. 393–395.
[33] J. Wierzba, J. Wiśniewski, Small contractions of symplectic 4-folds, Duke Math. J. 120, 2003, pp. 65–95.
Kieran G. O'Grady
Università di Roma "La Sapienza"
Dipartimento di Matematica "Guido Castelnuovo"
Piazzale Aldo Moro n. 5
I-00185 Rome, Italy
e-mail:
[email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
Sumsets
Imre Z. Ruzsa
Abstract. Highlights in the theory of sets with small sumset and related problems.
1. Introduction
Let $A$ and $B$ be sets in a commutative group. We will call the group operation addition and use additive notation. The sumset of these sets is
$$A + B = \{a + b : a \in A, b \in B\}.$$
We shall also consider a more general form of sumsets. Let $G$ be a graph whose vertices contain $A \cup B$. We define the sum along $G$ as
$$A \overset{G}{+} B = \{a + b : a \in A, b \in B, \text{$a$ and $b$ are connected}\}.$$
We shall consider the following sort of question. Write $|A| = n$, and assume that $|A + A| \le Kn$ (or $|A \overset{G}{+} A| \le Kn$). What can we say about the set $A$? $A$ will typically be a set of integers or residues modulo $m$. Sometimes we can extend the results to general commutative groups. Very rarely can we handle noncommutative groups, and we shall emphasize when we have a result valid without commutativity. The discussion is divided into three parts. In Section 2 we consider the set of all sums, which corresponds to the case of a complete graph. In Section 3 we consider sums along dense graphs; the results will be similar to the case of all sums. In Section 4 we consider thin graphs; this changes the situation completely.
Caveat: due to an accident, the author could work much less on this paper than planned and it is quite incomplete; it is an outline of a survey.
Supported by Hungarian National Foundation for Scientific Research (OTKA), Grants No. T 38396, T 43623 and T 42750.
2. All sums
We want to describe sets that have few sums. If $|A| = n$, then clearly $|A + A| \ge n$ in every group (with equality for cosets), which can be improved to $2n - 1$ for sets of integers (or torsionfree groups in general). What can we say if we know that $|A + A| \le Kn$, where $K$ is constant or grows slowly as $n \to \infty$? That is, we are looking for statements of the form
$$|A| = n, \ |A + A| \le Kn \longrightarrow (\dots).$$
Such a condition $(\dots)$ is adequate if this implication can be reversed to some degree, that is, there is an implication in the other direction
$$(\dots) \longrightarrow |A + A| \le K'n,$$
with $K' = K'(K)$ depending only on $K$ and not on $n$ or other properties of the set. Between such results we can distinguish on two grounds. First, the smaller the value of $K'$, the better the description; next, subjectively, the more we learn about the structure of the set the happier we are. As an example consider the following implication [11] (valid in every group, even without commutativity):
$$|A| = n, \ |A + A| \le Kn \longrightarrow |A - A| \le K^2 n.$$
(The exponent 2 is best possible here.) In commutative groups we have a similar implication in the other direction [12]:
$$|A| = n, \ |A - A| \le Kn \longrightarrow |A + A| \le K^2 n$$
(the exponent 2 is probably not best possible here). If we combine the two we get that
$$|A + A| \le Kn \longrightarrow |A - A| \le K^2 n \longrightarrow |A + A| \le K^4 n, \qquad K' = K^4,$$
so this is an adequate description with a very good value of $K'$, but it tells little about the structure of $A$ and it is not surprising. Indeed, $a + b = c + d \iff a - c = d - b$, so a coincidence between sums corresponds to a coincidence between differences. In particular, this shows that $|A + A|$ attains its maximal value $n(n + 1)/2$ exactly when $|A - A|$ attains its maximal value $n(n - 1) + 1$. (Such sets, with no nontrivial coincidence between sums or differences, are often called Sidon sets.) There is a similar connection between minimal values of these quantities. For sets of integers the minimal value of both $|A + A|$ and $|A - A|$ is $2n - 1$, and equality occurs only for arithmetic progressions. Still, the connection here is less obvious than it looks. We illustrate this by the case of near-maximal values. Suppose that $|A + A| \ge \kappa n^2$; does it follow that $|A - A| \ge \kappa' n^2$ with some $\kappa'$ depending on $\kappa$?
The answer is negative in a rather strong way: $|A + A| > n^2/2 - n^{2-\delta}$ and $|A - A| < n^{2-\delta}$ can happen with some constant $\delta > 0$. Similarly $|A - A| > n^2/2 - n^{2-\delta}$ and $|A + A| < n^{2-\delta}$ is also possible [14].
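The extremal values quoted above are easy to confirm on small sets. The sketch below is only an illustration: the helper names `sumset` and `diffset` and the two sample sets are choices made here, not taken from the text.

```python
def sumset(A, B):
    """All sums a + b with a in A and b in B."""
    return {a + b for a in A for b in B}


def diffset(A, B):
    """All differences a - b with a in A and b in B."""
    return {a - b for a in A for b in B}


# An arithmetic progression with n = 6 terms: both counts equal 2n - 1.
n = 6
progression = set(range(3, 3 + 4 * n, 4))
assert len(progression) == n
assert len(sumset(progression, progression)) == 2 * n - 1
assert len(diffset(progression, progression)) == 2 * n - 1

# A Sidon set with m = 4 elements: both counts are maximal.
sidon = {1, 2, 5, 11}
m = len(sidon)
assert len(sumset(sidon, sidon)) == m * (m + 1) // 2
assert len(diffset(sidon, sidon)) == m * (m - 1) + 1
```

Brute force of this kind is quadratic in the size of the set, which is enough for experimenting with the inequalities between $|A + A|$ and $|A - A|$ on small examples.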
Freiman's theory. A set of integers with a minimal sumset ($|A + A| = 2n - 1$) is necessarily an arithmetic progression. This easy result exhibits some stability. A set with a nearly minimal sumset is almost an arithmetic progression, as the following result shows.
Theorem 2.1 (G. Freiman [6]). If $A \subset \mathbb{N}$, $|A| = n$, $|A + A| \le 3n - 4$, then $A$ is contained in an arithmetic progression of length $\le |A + A| - n + 1 \le 2n - 3$.
Beyond $3n$, however, a single arithmetic progression is insufficient, as the following example shows. Take
$$A = \{1, \dots, n/2\} \cup \{t + 1, \dots, t + n/2\};$$
[Figure: dot diagram of $A$ as two blocks of consecutive integers.]
for $t \ge n$ we have $|A + A| = 3n - 3$, and $A$ cannot be covered by a progression shorter than $t + n/2$. The reason is that this set has a hidden two-dimensional structure:
[Figure: dot diagram of the corresponding two-dimensional arrangement.]
These sets are not isomorphic algebraically, but they behave analogously regarding the coincidence of sums. To describe such sets we need multidimensional, or generalized arithmetic progressions. A generalized arithmetic progression is a set of the form
$$P = \{b + x_1 q_1 + \dots + x_d q_d : 0 \le x_i \le l_i - 1\}$$
(a projection of a cube). We call $d$ the dimension, $|P| = l_1 l_2 \dots l_d$ the size of $P$. The principal result sounds as follows.
Theorem 2.2 (G. Freiman [6]). If $A \subset \mathbb{N}$, $|A| = n$, $|A + A| \le Kn$, then $A$ is contained in a generalized arithmetic progression of dimension $\le d(K)$ and size $\le s(K)n$.
This is an adequate description with the simplest possible structure: if $A \subset P$, then
$$|A + A| \le |P + P| < 2^d |P| \le 2^d s n, \qquad K' = 2^{d(K)} s(K).$$
(The above is not exactly what Freiman proved, but to acknowledge his fundamental contribution I prefer to call it his theorem.) For a comprehensive account of this theory up to 1996 see Nathanson's book [10]. Three basic questions arise here:
(1) to find good bounds for $d(K)$, $s(K)$;
(2) is this the "real" form?
(3) how to extend this from $\mathbb{N}$ to other groups.
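The two-block example and the definition of a generalized arithmetic progression can be tried out numerically. This is a minimal sketch under stated assumptions: the helper `gap` and the sample values of `n` and `t` are hypothetical choices, with `t` taken much larger than `n` so that the three blocks of sums do not overlap.

```python
from itertools import product


def sumset(A, B):
    """All sums a + b with a in A and b in B."""
    return {a + b for a in A for b in B}


def gap(b, steps, lengths):
    """The set {b + x1*q1 + ... + xd*qd : 0 <= xi <= li - 1}, as a set of integers."""
    return {
        b + sum(x * q for x, q in zip(xs, steps))
        for xs in product(*(range(l) for l in lengths))
    }


n, t = 10, 1000
A = set(range(1, n // 2 + 1)) | set(range(t + 1, t + n // 2 + 1))

# The sumset splits into three blocks of n - 1 consecutive integers each.
assert len(sumset(A, A)) == 3 * n - 3

# A contains 1 and 2, so any arithmetic progression covering it has common
# difference 1 and therefore at least max(A) - min(A) + 1 = t + n/2 terms.
assert max(A) - min(A) + 1 == t + n // 2

# The same set is a 2-dimensional generalized arithmetic progression of size n.
P = gap(1, (1, t), (n // 2, 2))
assert P == A
assert len(sumset(P, P)) < 2 ** 2 * len(P)
```

Here the one-dimensional cover needs about $t$ terms, while the two-dimensional description has size exactly $n$, which is the point of passing to generalized arithmetic progressions.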
Bounds. Due to works by the author [13, 15], Y. Bilu [2], M. C. Chang [3] we know that $d < K$ (best possible) and $s < e^{K^c}$. It is also known that a bound for $s$ must be $\gg 2^K$; probably the proper order is $e^{cK}$.
The real form. Probably a flexible form (several covering sets, projections of lattice points in more general convex bodies) would give better bounds for $K'$.
Other groups. For sets situated in $\mathbb{Z}^m$ or in general commutative torsionfree groups verbatim the same result holds. In groups with torsion a new phenomenon arises, namely any coset has $|A + A| = |A|$. For groups with a strong torsion property this alone suffices to characterize sets with small sumsets. Recall that the exponent of a group $S$ is the smallest positive integer $r$ such that $rg = 0$ for every $g \in S$.
Theorem 2.3 (Ruzsa [16]). Let $S$ be a commutative group of exponent $r$, $A \subset S$, $|A| = n$, $|A + A| \le Kn$. Then $A$ is contained in a coset of a subgroup of size $\le K^2 r^{[2K^2 - 2]} n$.
Here I have a conjecture how the optimal form should look. I formulate it for the case $r = 2$.
Conjecture 2.4. Let $S = \mathbb{Z}_2^m$ be a dyadic group, $A \subset S$, $|A| = n$, $|A + A| \le Kn$. Then $A$ is contained in $\le K^c$ cosets of a subgroup of size $\le n$.
In the most optimistic form $c$ would be $1 + o(1)$. This is equivalent to the following problem, which I think is interesting in its own right.
Equivalent Conjecture. Let $S$ be as above, $f : S \to S$ a function such that $f(x + y) - f(x) - f(y)$ assumes at most $K$ distinct values. Then $f$ has a decomposition $f = g + h$, where $g$ is a homomorphism and $h$ assumes $\le K^c$ values.
The equivalence is meant in a loose sense, the values of $c$ need not be the same. (The proof of this equivalence is unpublished.)
General commutative groups. In a general commutative group, a set with a small sumset can be covered by a combination of the two mentioned structures, cosets and generalized arithmetic progressions.
Theorem 2.5 (Green-Ruzsa (in preparation)). Let $S$ be a commutative group, $A \subset S$, $|A| = n$, $|A + A| \le Kn$. Then $A$ is contained in a set of the form $H + P$, where $H$ is a subgroup, $P$ is a generalized arithmetic progression, the dimension of $P$ is $\le d(K)$ and $|H||P| \le s(K)n$.
For the quantities we have the following bounds: $d(K) \ll K^c$, $s(K) \ll e^{K^c}$.
Noncommutative groups. For general groups, I do not even have a decent conjecture! There is a structure theorem for $SL_2(\mathbb{R})$ (Elekes-Király [5]). Roughly speaking, it asserts that a set with a small sumset is contained in a few cosets of a commutative subgroup, and within a coset we have a generalized arithmetic progression structure.
3. Dense graphs
In the sequel let $A$ be in a commutative group, $|A| = n$, and let $G$ be a graph on $A$. Recall that
$$A \overset{G}{+} A = \{a + b : a, b \in A, \text{$a$ and $b$ are connected}\}.$$
The first result on such sumsets is due to Balog and Szemerédi.
Theorem 3.1 (Balog-Szemerédi [1]). If $|A \overset{G}{+} A| \le Kn$ and $G$ has $\ge cn^2$ edges, then there is $A' \subset A$ such that $|A'| \ge c_1 n$ and $|A' + A'| \le c_2 n$, with positive $c_1$, $c_2$ depending on $K$ and $c$.
In this way the graph-sum problem is reduced to the previous type of problem about ordinary sumsets, and a Freiman-type result can be applied if it is available. We cannot hope for much more than a subset $A'$ in this situation; indeed, a part of $A$ may have no edge at all and then clearly we cannot say anything about these elements. We can claim more if every degree is large.
Theorem 3.2 (Elekes-Ruzsa [4]). If $|A \overset{G}{+} A| \le Kn$ and in $G$ every vertex has degree $\ge \beta n$, then there is a decomposition $A = A_1 \cup \dots \cup A_k$ such that $|A_i| \ge \beta n/2$, $k \le 2/\beta$, $|A_i + A_i| \le f(K, \beta)n$.
With a stronger assumption we can omit the partition. $G$ is $\beta$-dense-connected if for every $B \subset A$ there are $\ge \beta|B||A \setminus B|$ edges between $B$ and $A \setminus B$.
Theorem 3.3 (Elekes-Ruzsa [4]). If $|A \overset{G}{+} A| \le Kn$ and $G$ is $\beta$-dense-connected, then $|A + A| \le g(K, \beta)n$.
Clearly such an assumption is necessary; if $G$ is the union of two disjoint cliques, then we cannot expect anything about sums between elements of different cliques.
4. Thin graphs
If we do not assume anything about the graph $G$, then we cannot hope to deduce any structural property of $A$ from the assumption that $A \overset{G}{+} A$ is small. We concentrate on a single problem: if the number of sums along a graph is small, what can we say about the differences along the same graph?
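To make the dependence on the graph concrete, the two-clique remark from Section 3 can be tested on a small instance. This is a sketch under assumptions made here: edges are stored as ordered pairs, loops $(a, a)$ are included in each clique, and the particular sets `A1`, `A2` are illustrative choices.

```python
def graph_sumset(edges):
    """Sums a + b over all connected pairs (a, b)."""
    return {a + b for a, b in edges}


def clique(V):
    """All ordered pairs inside V; loops (a, a) are included by assumption."""
    return {(a, b) for a in V for b in V}


m = 10
A1 = set(range(m))                       # 0, 1, ..., m - 1
A2 = {1000 + m * j for j in range(m)}    # a progression with difference m
A = A1 | A2
n = len(A)

# Sums along the union of two disjoint cliques stay below 2n ...
along_graph = graph_sumset(clique(A1) | clique(A2))
assert len(along_graph) == 2 * (2 * m - 1)
assert len(along_graph) <= 2 * n

# ... while the full sumset also contains the m * m cross sums a1 + a2.
full = graph_sumset(clique(A))
assert len(full) == m * m + 2 * (2 * m - 1)
```

With $n = 2m$ the sums along the graph number fewer than $2n$, yet $|A + A|$ grows like $n^2/4$, so no bound of the form $g(K, \beta)n$ can hold without a connectivity assumption.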
Recall that for the complete graph we had the implication
$$|A| = n, \ |A + A| \le Kn \longrightarrow |A - A| \le K^2 n.$$
We shall concentrate on the case $K = 1$:
$$|A| = n, \ |A \overset{G}{+} A| \le n \longrightarrow |A \overset{G}{-} A| \le \ ?$$
Even a bound $o(n^2)$ is not obvious here. A bound of the form $n^{2-c}$ was obtained by Gowers, improved by Bourgain, then by Katz and Tao [8].
Theorem 4.1 (Gowers, Bourgain, Katz-Tao). For arbitrary sets in a commutative group and a graph on them we have
$$|A \overset{G}{-} B| \le (|A||B|)^{2/3} |A \overset{G}{+} B|^{1/2};$$
in particular,
$$|A| = n, \ |A \overset{G}{+} A| \le n \longrightarrow |A \overset{G}{-} A| \le n^{11/6}.$$
We show now that the exponent cannot be much improved.
Example. Take $A = \{0, 1\}^k$, $n = 2^k$; connect $(x_1, \dots, x_k)$ and $(y_1, \dots, y_k)$ if $x_i + y_i \le 1$ for all $i$. We have clearly
$$|A \overset{G}{+} A| = 2^k = n, \qquad |A \overset{G}{-} A| = 3^k = n^c, \qquad c = \frac{\log 3}{\log 2} = 1.58496\dots$$
This example is a "power": the coordinates are treated independently. We can improve it by imposing a dependence on them.
Better Example. We will take a set $A \subset \{0, 1\}^{3k}$, namely those vectors where exactly $k$ coordinates are equal to 1. Clearly $|A| = n = \binom{3k}{k} \approx (27/4)^k$. Connect $(x_1, \dots, x_{3k})$ and $(y_1, \dots, y_{3k})$ if always $x_i + y_i \le 1$. $A \overset{G}{+} A$ consists of the vectors with exactly $2k$ coordinates equal to 1, so $|A \overset{G}{+} A| = n$. $A \overset{G}{-} A$ will consist of the vectors with exactly $k$ coordinates equal to 1, exactly $k$ equal to 0 and $k$ equal to $-1$. Consequently
$$|A \overset{G}{-} A| = (3k)!/k!^3 \approx 27^k \approx n^c, \qquad \text{where } c = \frac{\log 27}{\log (27/4)} = 1.72598\dots$$
Recall that the upper bound for the exponent was $11/6 = 1.8333\dots$
The second example and entropy. Let $X$, $Y$ be (dependent) random variables, $(X, Y) = (0, 0)$ or $(0, 1)$ or $(1, 0)$ each with probability $1/3$. We can retell the previous example as follows. $(x_1, \dots, x_k) \in A$ if the statistical distribution of coordinates is the same as the distribution of $X$; $(y_1, \dots, y_k) \in B$ if the distribution of coordinates is the same as the distribution of $Y$. We draw an edge between them if the joint distribution of pairs $(x_i, y_i)$ is the same as the joint distribution of $(X, Y)$.
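Both constructions are small enough to check by exhaustive enumeration. The sketch below is an illustration only; the helper `graph_sums_and_diffs` and the chosen values of `k` are assumptions made here.

```python
from itertools import product
from math import comb, factorial, log


def graph_sums_and_diffs(A):
    """Sums and differences over pairs x, y in A with x_i + y_i <= 1 for all i."""
    sums, diffs = set(), set()
    for x in A:
        for y in A:
            if all(a + b <= 1 for a, b in zip(x, y)):
                sums.add(tuple(a + b for a, b in zip(x, y)))
                diffs.add(tuple(a - b for a, b in zip(x, y)))
    return sums, diffs


# First example: the full cube {0, 1}^k.
k = 4
cube = list(product((0, 1), repeat=k))
cube_sums, cube_diffs = graph_sums_and_diffs(cube)
assert len(cube_sums) == 2 ** k
assert len(cube_diffs) == 3 ** k

# Better example: vectors in {0, 1}^(3k) with exactly k ones.
k = 2
layer = [x for x in product((0, 1), repeat=3 * k) if sum(x) == k]
layer_sums, layer_diffs = graph_sums_and_diffs(layer)
assert len(layer) == comb(3 * k, k) == len(layer_sums)
assert len(layer_diffs) == factorial(3 * k) // factorial(k) ** 3

# Limiting exponents of the two examples, both below the bound 11/6.
c_cube = log(3) / log(2)
c_layer = log(27) / log(27 / 4)
assert c_cube < c_layer < 11 / 6
```

For $k = 2$ the second construction has $n = 15$ points, 15 sums and 90 differences along the graph; the exponent $\log 27 / \log(27/4)$ is only approached as $k$ grows.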
In this way we will have
$$|A| \approx 2^{h(X)k}, \quad |B| \approx 2^{h(Y)k}, \quad |A \overset{G}{+} B| \approx 2^{h(X+Y)k}, \quad |A \overset{G}{-} B| \approx 2^{h(X-Y)k},$$
where $h(X) = -\sum p_i \log p_i$ is the entropy of a discrete variable which assumes its values with probabilities $p_i$. We can do the same for any pair of variables $X$, $Y$ assuming finitely many values with rational probabilities. If now we apply the Katz-Tao inequality
$$|A \overset{G}{-} B| \le (|A||B|)^{2/3} |A \overset{G}{+} B|^{1/2}$$
to such sets, we obtain the inequality
$$h(X - Y) \le \frac{2}{3} (h(X) + h(Y)) + \frac{1}{2} h(X + Y).$$
It is now a routine argument to extend this inequality to every pair of random variables for which all the entropies exist. In fact, one can prove (unpublished) a general equivalence between a class of similar entropy inequalities and a corresponding inequality for sumsets.
Sumsets, arithmetical progressions and the Kakeya conjecture. Let us return to the starting question
$$|A| = n, \ |A \overset{G}{+} A| \le n \longrightarrow |A \overset{G}{-} A| \le \ ?$$
An equivalent formulation is as follows: how large is the union of $m$ 3-term arithmetic progressions with distinct differences? To see the connection between these problems, given a collection of 3-term arithmetic progressions, let $A$ be the set of both endpoints and connect two elements of $A$ if one of the given progressions starts with one and ends with the other. Here $A \overset{G}{+} A$ will contain the midpoints multiplied by 2, and $A \overset{G}{-} A$ will contain the differences multiplied by 2. The Katz-Tao inequality asserts that $m \le |A|^{4/3} |A \overset{G}{+} A|^{1/2}$, so we get that the cardinality of the union is $\ge m^{6/11}$. We know little about longer progressions, though Katz and Tao also have results in this direction.
Conjecture 4.2. The union of $m$ $k$-term arithmetic progressions with distinct differences has $\gg m^{1 - \varepsilon_k}$ elements with $\varepsilon_k \to 0$.
An interesting aspect of this problem is its connection with what is known as the Kakeya conjecture: if a set in $\mathbb{R}^d$ contains a unit interval in each direction, then its box-dimension (or Minkowski dimension) is $d$. (For more on this problem and its connections to other branches of mathematics see Tao [17]. For further results see Katz-Tao [9, 7].) Sometimes the Hausdorff dimension is used; I would be very surprised if the answers were different for this case, though the known bounds for Minkowski and Hausdorff dimension do differ.
We outline this connection in an informal and heuristic way. First change the assumption in the Kakeya problem to the following. For arbitrary numbers $0 \le a_1, \dots, a_{d-1} \le 1$ our set contains an interval with endpoints $(x_1, x_2, \dots, x_{d-1}, 0)$ and $(x_1 + a_1, \dots, x_{d-1} + a_{d-1}, 1)$ for suitable $x_i$. In this version the length of the intervals varies between 1 and $\sqrt{d}$, and we have only a subset of directions, namely those where the last coordinate is the largest; it is easy to see that these changes do not affect the result. The relevance of the restriction that the starting points lie on a hyperplane is less obvious; we leave it to the reader to realize that the two problems are equivalent.

Assume now that this set has dimension α. Now take an integer l for which $\varepsilon_l$ is small, then take a large n and divide the space into cubes of size 1/n. Our set intersects about $n^{\alpha}$ of them. A layer (cubes with a common last coordinate) contains $n^{\alpha-1}$ on average. With a statistical argument we can find l layers with the following properties:
• the last coordinates form an arithmetic progression,
• the starting points lie in the interval (0, 1/3), the endpoints in (2/3, 1),
• together they contain $< n^{\alpha-1+\varepsilon}$ nonempty cubes.

Take a typical interval forming our set, and consider the l cubes in the selected l layers through which it passes. Now form the arithmetic progression of length l which starts with the center of the first cube and ends with the center of the last. The ith term of this progression need not coincide with the center of the ith cube, but the difference is a vector with coordinates of the form $j/((l-1)n)$, $|j| \le l$. Hence these progressions together have $< n^{\alpha-1+\varepsilon}$ elements. Now restrict our attention to intervals arising from vectors $(a_1, \dots, a_{d-1})$ such that each coordinate $a_i$ is a rational number of the form 10m/n. The arithmetic progressions formed from these intervals in the above described way will have distinct differences.
The number of these progressions is $n^{d-1}$, thus our conjecture implies $n^{\alpha-1+\varepsilon} \gg n^{(d-1)(1-\varepsilon_l)}$, that is, $\alpha - 1 + \varepsilon \ge (d-1)(1-\varepsilon_l)$. If $\varepsilon_l$ can be taken arbitrarily small, we can conclude that α = d as wanted.

References
[1] A. Balog and E. Szemerédi, A statistical theorem of set addition, Combinatorica 14 (1994), 263–268.
[2] Y. Bilu, Structure of sets with small sumset, Structure theory of set addition, Astérisque, vol. 258, Soc. Mat. France, 1999, pp. 77–108.
[3] Mei-Chu Chang, A polynomial bound in Freiman’s theorem, (preprint).
[4] Gy. Elekes and I. Z. Ruzsa, The structure of sumsets with few sums along a graph, J. Combinatorial Th., Ser. A., submitted.
[5] György Elekes and Zoltán Király, On combinatorics of projective mappings, 14 (2001), 183–197.
[6] G. Freiman, Foundations of a structural theory of set addition, American Math. Soc., 1973.
[7] N. Katz and T. Tao, Recent progress on the Kakeya conjecture, Proceedings of the 6th international conference on harmonic analysis and partial differential equations (Barcelona), U. Barcelona.
[8] N. Katz and T. Tao, Bounds on arithmetic projections, and applications to the Kakeya conjecture, Math. Res. Letters 6 (1999), 625–630.
[9] N. Katz and T. Tao, New bounds for Kakeya problems, J. Anal. Jerusalem 87 (2002), 231–263.
[10] M. B. Nathanson, Additive number theory: Inverse problems and the geometry of sumsets, Springer, 1996.
[11] I. Z. Ruzsa, On the cardinality of A + A and A − A, Combinatorics (Keszthely 1976), Coll. Math. Soc. J. Bolyai, vol. 18, North-Holland – Bolyai Társulat, Budapest, 1978, pp. 933–938.
[12] I. Z. Ruzsa, An application of graph theory to additive number theory, Scientia, Ser. A 3 (1989), 97–109.
[13] I. Z. Ruzsa, Arithmetical progressions and the number of sums, Periodica Math. Hung. 25 (1992), 105–111.
[14] I. Z. Ruzsa, On the number of sums and differences, Acta Math. Sci. Hungar 59 (1992), 439–447.
[15] I. Z. Ruzsa, Generalized arithmetical progressions and sumsets, Acta Math. Hung. 65 (1994), 379–388.
[16] I. Z. Ruzsa, An analog of Freiman’s theorem in groups, Structure theory of set addition, Astérisque, vol. 258, Soc. Mat. France, 1999, pp. 323–326.
[17] T. Tao, From rotating needles to stability of waves: Emerging connections between combinatorics, analysis, and pde, Notices of the AMS (2001), 294–303.

Imre Z. Ruzsa
Alfréd Rényi Institute of Mathematics
Budapest, Pf. 127
H-1364 Hungary
e-mail:
[email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
Measurable Group Theory Yehuda Shalom
Contents
1. Introduction
   Acknowledgments
2. The setting, basic notions, and some appetizers...
   The setting and basic notions
   Amenable vs. non-amenable
3. The ergodic theoretic approach
4. The bounded cohomology approach
5. The relation to operator algebra and $\ell^2$-Betti numbers
   The group measure space construction and some applications
   The fundamental group of factors and equivalence relations
   $\ell^2$-Betti numbers
6. Measurable vs. geometric group theory
   Quasi-isometries and Measure Equivalence of groups
   ME Rigidity results
   Applications to the geometric group theory of amenable groups
7. The relation to descriptive set theory
8. Graphings, cost, and treeability
9. Other ME-invariants, further remarks, and some questions
   9.1. Spectral ME-invariants and property (T)
   9.2. Hyperbolic groups
   9.3. Lattices in rank 1 Lie groups
   9.4. Strengthening Measure Equivalence
   9.5. Algebraic structures on relations
Addendum
References
1. Introduction Measurable group theory aims at understanding how much of the algebraic structure of a countable group can be recovered solely from the equivalence relation of “being in the same orbit”, induced by (specific, or all) finite measure preserving actions of the group. It turns out that some groups (e.g., abelian) lose “most” of their structure, while for others the opposite happens, to the extent that occasionally both the group and the action can be entirely reconstructed from the equivalence relation. This theme turns out to be a common playground for diverse areas of research including ergodic theory, operator algebra, and descriptive set theory. It can also be viewed as the “measurable younger brother” of geometric group theory; a perspective (due to Gromov), which is found to be fruitful in both disciplines. Nowadays, following five to six years of rapid progress involving the introduction of diverse and deep tools, it may be considered as an independent – even if interdisciplinary – area of research in its own right. The purpose of this survey is to describe the foundations of the discipline on one hand, and its most recent exciting developments on the other, in as friendly and non-technical a manner as possible. The vast majority of results presented here appeared over the past six years, and of those many had not been published at the time of writing this paper. Proofs are not given, and occasionally even the results themselves are not stated in their full generality. Rather, the emphasis is on the main ideas, approaches, and concepts underlying the results, along with the connections between them. Each of the sections of the paper actually deserves a considerably more detailed exposition, and we hope that the inevitable omission of some results will be received with understanding. We do try, however, to offer a fairly complete list of relevant bibliography to which, at appropriate places, readers are referred for further details. 
We hope that while helping to bridge the “cultural differences” present in this interdisciplinary area, the exposition will also serve as an attractive and welcoming invitation for mathematicians working in neighboring fields, as well as graduate students taking their first steps. Acknowledgments. It is a pleasure to thank Alex Furman and Damien Gaboriau, with whom we held numerous enlightening discussions on and around measurable group theory, and whose insights and results considerably influenced this exposition. We particularly thank Nicolas Monod for the enjoyable and fruitful collaboration we have had around some of the results presented here, as well as Greg Hjorth, Alekos Kechris, Sorin Popa, Benjy Weiss, and Pierre de la Harpe, whose much appreciated remarks and suggestions have found their way into this manuscript.
2. The setting, basic notions, and some appetizers...

The setting and basic notions. Throughout this paper we shall consider infinite countable groups Γ acting on probability measure spaces (X, µ). The measure spaces will always be assumed standard; measurably they can all be thought of as the unit interval equipped with the Lebesgue measure. Unless specified otherwise, we shall keep the following assumptions on the actions throughout the paper:
• Measure preserving: $\forall \gamma \in \Gamma,\ A \subseteq X:\ \mu(\gamma A) = \mu(A)$
• Ergodic: if $A \subseteq X$ and $\gamma A = A\ \ \forall \gamma \in \Gamma$, then $\mu(A)\,\mu(X - A) = 0$
• (Essentially) Free: $\forall \gamma \in \Gamma,\ \gamma \ne \mathrm{id}:\ \mu\{x \in X \mid \gamma x = x\} = 0$
Any such action induces on X an equivalence relation $R = R_\Gamma$ of “being in the same orbit”:

$$x \overset{R_\Gamma}{\sim} y \iff \exists \gamma \in \Gamma \ \text{s.t.}\ \gamma x = y$$
We remark that most of the equivalence relations R we shall be concerned with are of the form $R_\Gamma$. However, one can (and does) define abstractly the class of relations R which are: countable (equivalence classes are countable), measurable (the set $\{(x_1, x_2) \in X \times X \mid x_1 \sim x_2\}$ is measurable), finite measure preserving (µ is finite and is preserved by every µ-measure class preserving isomorphism $f \in \mathrm{Aut}(X, R)$), and ergodic (the R-saturation of any positive measure subset has full measure). These are called type $\mathrm{II}_1$ relations. While studying this abstraction, Feldman-Moore [34] showed that any such relation is generated by some action of a countable group, leaving open the question of whether the action can be chosen to be free. This was answered negatively by Adams [2] in the Borel setting, or, in the measurable one where the measure is not ergodic. The more natural (from our point of view) measurable and ergodic version was answered negatively only recently by Furman [43], as a by-product of his breakthrough discussed in Sections 3 and 6 below (see the paragraph preceding Theorem 6.10 for more details).

Here are some basic examples of group actions, which we shall use in the sequel:
1. Any invertible, ergodic, measure preserving transformation T of (X, µ) corresponds to a $\mathbb{Z} = \{T^n\}_{n \in \mathbb{Z}}$ action on X (e.g., T = non-periodic rotation of the circle). There is of course a host of such actions, which are very different from the ergodic theoretic point of view.
2. The $\Gamma = SL_n(\mathbb{Z})$-action on the n-torus ($\mathbb{T}^n = \mathbb{R}^n/\mathbb{Z}^n$, µ = Lebesgue), induced by its standard linear action on $\mathbb{R}^n$.
3. Let G be a second countable, locally compact group, and let Γ, Λ < G be discrete subgroups which are lattices: their natural multiplication action on G admits a finite (Haar) measure fundamental domain. In this setting, one has a naturally associated action of (say) Γ on the finite measure quotient space G/Λ (leaving aside here the general issues of ergodicity or freeness). Important basic example: $\Gamma = SL_n(\mathbb{Z}) < SL_n(\mathbb{R}) = G$ is a lattice.
4. Let K be a compact (say, Lie) group, equipped with its Haar (probability) measure µ, and Γ < K be a dense subgroup. The left multiplication action of Γ on K is always measure preserving, free, and ergodic (by density).
5. Let Γ be any (countable) group. Define: $X = [0, 1]^{\Gamma} = \{(x)_{\gamma \in \Gamma}\}$, and let Γ act on it by “permuting coordinates”: $(\gamma_0(x))_{\gamma} = (x)_{\gamma_0 \gamma}$. Any product measure of the form $\mu = \upsilon \times \upsilon \times \cdots$ on X (with υ an arbitrary probability measure on [0, 1], possibly of finite support, excluding only Dirac point masses) is Γ-invariant, and gives rise to an ergodic free Γ-action. We shall generally refer to such group actions as Bernoulli actions.

The basic notion which enables one to formulate in a precise manner the questions posed in the introduction is the following:

Definition 2.1. The Γ-action on (X, µ) and Λ-action on (Y, υ) are called Orbit Equivalent (OE) if there exists an isomorphism of measure spaces f : X → Y such that $f(\Gamma x) = \Lambda f(x)$ for a.e. x, i.e., $R_\Gamma \cong R_\Lambda$ are isomorphic relations. When $R_\Gamma$ and $R_\Lambda$ are isomorphic via an isomorphism f : X → Y as above, we say that f induces the (given) orbit equivalence.

Amenable vs. non-amenable. The following striking result, which is the departure point of this exposition, was first proved by Dye [30], [31] for $\mathbb{Z}$ or groups of polynomial growth, and in general by Ornstein-Weiss [93] (for a more general version, not relevant to us here, see [20]):

Theorem 2.2. If Γ and Λ are amenable groups, then any actions of them are OE.

For all purposes of this paper, one may take as the definition of amenability of a group (one of many equivalent ones) the existence of an invariant probability measure for any continuous action on a compact space. Thus, it follows for instance that all the actions in Example 1 in the list above are OE. More importantly, for any (countably infinite) amenable group, the only structural property reflected in the equivalence relations it generates is its amenability (that at least that much is preserved is easy to see).
Thus, Theorem 2.2 demonstrates a sharp non-rigidity phenomenon, which may be taken as a somewhat discouraging beginning: any group theoretic property which can be separated within the class of amenable groups (e.g., finite generation), cannot be captured in general, in the orbit equivalence relations generated by group actions. Our main purpose in this paper is to show that nevertheless, a rich and deep rigidity theory underlies the notion of orbit equivalence. There are two related, yet independent ways, in which one may proceed in order to contrast the Dye-Ornstein-Weiss Theorem above: I. Finding groups Γ (as “natural” and “familiar” as possible) possessing many non OE actions, and II. Finding pairs of groups Γ, Λ whose actions are never OE. The tools developed for these questions are sometimes strong enough to establish situations where the equivalence relation actually determines (at least much about) both the group and its action. The next sections are organized along the first of these two
directions, and subsequently they merge. Here are some concrete applications of the various approaches we shall discuss:

Theorem 2.3. The group $\Gamma = SL_{n \ge 3}(\mathbb{Z})$ (or any lattice in $SL_{n \ge 3}(\mathbb{R})$) admits a continuum of non OE actions.

This is one illustration of Zimmer’s “cocycle superrigidity” approach from the ergodic theory of algebraic groups, which is the subject of the next section.

Theorem 2.4. The same result holds for a continuum of groups Γ, e.g., any torsion free group of the form $\Gamma = \Gamma_1 \times \Gamma_2$, where each $\Gamma_i$ is a free product of infinite groups, or a Gromov hyperbolic group.

This is one application of Monod-Shalom’s bounded cohomology approach, discussed in Section 4.

Theorem 2.5. The same result holds when Γ is any non-abelian free group.

This result of Gaboriau-Popa is one consequence of the relation with operator algebra, which became particularly powerful with the recent pioneering work of Popa. The connections with operator algebra, both classical and very recent, turn out to be fruitful in both ways, and are discussed in Section 5. In fact, prior to all the results stated here, a rather coincidental construction of countable “exotic” groups Γ satisfying the same conclusion as in the preceding Theorems appeared, using operator algebra [13]. These constructions are applications of McDuff’s work [85], but unlike the other directions pursued here, no systematic approach ever followed them. Finally, to complete the general picture, we mention the following converse to Theorem 2.2:

Theorem 2.6. Any non-amenable group Γ admits at least two non OE actions.

The result is a beautiful application of Kazhdan’s property (T) from infinite-dimensional unitary representation theory (cf. [61]). The proof divides into two cases, according to whether Γ does, or does not, have property (T). The latter was handled 25 years ago by Connes-Weiss [24], leaving open the intriguing case of Kazhdan groups.
This was settled only recently by the logician Hjorth [67], who showed that such groups admit in fact a continuum of non OE actions, via a clever, yet elementary argument (unlike the proofs of the other previously stated results, which involve heavy machinery). We remark that Property (T) was also central in the work of Gefter-Golodets [54], in the constructions (among other things) of the first relations with trivial fundamental group (see Section 5 below). More on the connections with property (T) can be found in Section 9.1.

3. The ergodic theoretic approach

Assume that the measurable isomorphism f : X → Y induces orbit equivalence of the Γ-action on (X, µ) with the Λ-action on (Y, υ). Associated naturally to
this setting we have a so-called cocycle α : Γ × X → Λ defined as follows: α(γ, x) is the unique (by freeness) λ ∈ Λ satisfying $f(\gamma x) = \lambda f(x)$. It is easy to verify directly that α indeed satisfies the defining cocycle identity:

$$\alpha(\gamma_1 \gamma_2, x) = \alpha(\gamma_1, \gamma_2 x)\, \alpha(\gamma_2, x) \quad \text{for all } \gamma_i \in \Gamma \text{ and a.e. } x \in X \qquad (*)$$
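The verification of (∗) can be tried out on a finite toy model. The sketch below is only an illustration and departs from the setting of the paper, where the groups are infinite: as an assumption of this example, Γ is replaced by the cyclic group Z/6 acting on itself by translation, Λ by the symmetric group S_3 acting on itself by left multiplication, and f is an arbitrarily chosen bijection between the two six-point spaces. All function names are hypothetical.

```python
from itertools import permutations, product

# Toy finite stand-ins (assumptions of this sketch, not the paper's setting):
#   Gamma  = Z/6 acting on X = {0, ..., 5} by translation,
#   Lambda = S_3 acting on Y = S_3 by left multiplication.
# Both actions are free with a single orbit.
N = 6
S3 = list(permutations(range(3)))  # elements of S_3 as tuples


def gamma_act(g, x):
    """Action of g in Z/6 on the point x of X."""
    return (g + x) % N


def mul(p, q):
    """Product in S_3, with (p * q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(3))


def inv(p):
    """Inverse permutation of p."""
    out = [0] * 3
    for i, pi in enumerate(p):
        out[pi] = i
    return tuple(out)


# An arbitrary bijection f : X -> Y; it maps the single Gamma-orbit
# onto the single Lambda-orbit, so it is an orbit equivalence here.
f = {x: S3[x] for x in range(N)}


def alpha(g, x):
    """The unique lam in S_3 with f(g.x) = lam * f(x), unique by freeness."""
    return mul(f[gamma_act(g, x)], inv(f[x]))


def cocycle_identity_holds():
    """Check alpha(g1 g2, x) == alpha(g1, g2.x) * alpha(g2, x) everywhere."""
    return all(
        alpha((g1 + g2) % N, x)
        == mul(alpha(g1, gamma_act(g2, x)), alpha(g2, x))
        for g1, g2, x in product(range(N), repeat=3)
    )


def depends_on_x(g):
    """True if alpha(g, .) is not constant in x."""
    return len({alpha(g, x) for x in range(N)}) > 1
```

The identity holds for the same reason as in the general case: both sides are the unique group element carrying f(x) to f(γ₁γ₂x). In this toy model α(1, ·) cannot be constant, since a constant value λ would give f(x) = λ^x f(0) and force S_3 to be cyclic; this is a small instance of a cocycle that is not a homomorphism.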
Notice that when such α does not depend on x, it is a group homomorphism of Γ. There are many isomorphisms f : X → Y inducing the same bijection between the sets of orbits. However, it is easy to see that each of them is obtained by cutting X into (countably many) measurable pieces, and on each composing f with some element of Λ. To such a new f corresponds a cocycle which is equivalent (or “cohomologous”) to α, a notion which can be similarly defined for all cocycles satisfying (∗).

A breakthrough, and the first systematic approach in the OE rigidity theory, came with Zimmer [124] as a consequence of his deep generalization of Margulis’ seminal superrigidity theory (see [84]), from homomorphisms to general cocycles satisfying (∗) above (not necessarily ones related to orbit equivalence). The essence of Zimmer’s result is as follows: Let G be a simple Lie group of real rank at least 2 (e.g., $SL_{n \ge 3}(\mathbb{R})$), and let Γ < G be a lattice (e.g., $\Gamma = SL_{n \ge 3}(\mathbb{Z})$). Assume that Γ acts ergodically and measure preservingly on the finite measure space X. Then any “nondegenerate” cocycle α : Γ × X → H, where H is any simple linear algebraic group defined over a locally compact field, is equivalent to a homomorphism. Zimmer’s original result [124] was stated for G in place of Γ, and when taking in this case X = G/Γ, it is no more than a reformulation of Margulis’ superrigidity (see [126, Ch. 5]). Results of this type are referred to as cocycle (super-)rigidity, and their strength, for a given group Γ, depends on the precise “non-degeneracy” assumption made on α, and on the nature of the family of groups H covered. In view of the previous discussion, it is clear why such a result, when applied to “orbit equivalence cocycles”, should give rise to sharp OE rigidity results (it is not difficult to show that OE-cocycles are indeed “nondegenerate”).
We remark that while Zimmer’s first motivation in proving the above result in [124] was the application (brought there) to OE-rigidity, this theorem soon became a very powerful tool, with a variety of other applications, in the ergodic theory of semisimple Lie groups (see [126], [127] for a few). A particularly friendly exposition of Zimmer’s Theorem can be found in [38] (see also [45]). In subsequent efforts, Zimmer and others (notably Gefter and Golodets – see [54], [55], [56] and the references therein) were able to deduce from the cocycle superrigidity Theorem sharp applications to the OE theory of higher rank lattices (including Theorem 2.3 above). However, in trying to go further and understand which groups can possess an action which is OE to an action of a given higher rank lattice Γ (such as $SL_{n \ge 3}(\mathbb{Z})$), for
apparent reasons they all faced the wall of some linearity assumption on these possible “mystery” groups – see [128]. It was only with another breakthrough of Furman [42], [43] (inspired, as explained in Section 6 below, by ideas from geometric group theory), when the approach culminated in spectacular rigidity results. One elementarily stated sample from Furman’s work is the following, which describes a situation where the orbit structure captures “everything” (for another application see Theorem 6.10 below):

Theorem 3.1. Fix odd n ≥ 3. If the $SL_n(\mathbb{Z})$-action on ($\mathbb{T}^n = \mathbb{R}^n/\mathbb{Z}^n$, µ = Lebesgue) (see Example 2 in Section 2) is OE to a Λ-action, for some group Λ, then necessarily $\Lambda \cong SL_n(\mathbb{Z})$, and the OE is induced by an isomorphism of the actions.

See [43, Cor. A,B] for this and other related results. There is nothing very special to the particular Γ-action appearing in Theorem 3.1; it should only avoid having any quotients of the form described in Example 3 of the list in Section 2, where $G = SL_n(\mathbb{R})$ is the same ambient Lie group. (To sharpen the statement made in [43] to the one here, one needs also the property that the restriction to any finite index subgroup remains ergodic, and then apply arguments similar to those in the proof of [90, Thm 1.10]. The oddness of n gives triviality of center; otherwise the statement holds only after a slight modification.) Following Zimmer, it is now a basic principle that “sufficiently strong” cocycle superrigidity theorems immediately apply to orbit equivalence problems – see also [6], [8] for such results, and the discussion following Theorem 4.3 below for some additional rigidity applications. Recently, Monod-Shalom [89] established such a theorem in a general setting of product of groups using bounded cohomology.
An elementarily stated, particular case of their main result (combining [86] as well) is the following: Retain the same setting as in Zimmer’s theorem above, only let now $\Gamma = \Gamma_1 \times \Gamma_2$ be a product of any (countably infinite) Kazhdan groups, each acting ergodically on X (as is the case, e.g., in Theorem 4.3 below). Let H be any torsion free Gromov hyperbolic group. Then with these notations the conclusion of Zimmer’s theorem above holds for any cocycle α. Monod-Shalom’s results can substitute some (but not all) applications of Zimmer’s theorem, but yield some new phenomena as well. A more general bounded cohomology approach to OE rigidity was introduced by them in [90], and is the subject of the next section. Other OE rigidity results using cocycles from the ergodic theoretic viewpoint can be found in [4], [5] and [71].

4. The bounded cohomology approach

Assume that f induces an orbit equivalence between the Γ-action on (X, µ) and Λ-action on (Y, υ), and let α : Γ × X → Λ be the associated cocycle, exactly as in the beginning of the previous section. At the heart of the bounded cohomology approach introduced by Monod-Shalom in [90] lies the idea that one can use this setting in order to relate between certain representations,
and more importantly cohomology, of Γ and Λ. Once coupled with appropriate vanishing and non-vanishing cohomological results, OE rigidity applications can be deduced. The class of representations of use to us here is that of unitary representations on Hilbert spaces. Given such a Λ-representation π on the Hilbert space $V_\pi$, one can define an induced Hilbert space $\mathrm{Ind}_\Lambda^\Gamma \pi$ by:

$$L^2(X, V_\pi) = \Bigl\{ F : X \to V_\pi \ \Bigm|\ \int_X \|F(x)\|^2_{V_\pi}\, d\mu(x) < \infty \Bigr\}$$

(with the natural inner product), and a unitary Γ-representation on it by:

$$[\gamma F](x) = \pi\bigl(\alpha(\gamma^{-1}, x)^{-1}\bigr)\, F(\gamma^{-1} x)$$
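That this formula really defines a representation of Γ follows from the cocycle identity (∗) of Section 3. The check is not spelled out in the text; a sketch of it, for arbitrary $\gamma_1, \gamma_2 \in \Gamma$, could run as follows.

```latex
% (*) is applied with \gamma_2^{-1} and \gamma_1^{-1} in the roles of \gamma_1 and \gamma_2
\begin{align*}
[\gamma_1(\gamma_2 F)](x)
 &= \pi\bigl(\alpha(\gamma_1^{-1},x)^{-1}\bigr)\,[\gamma_2 F](\gamma_1^{-1}x) \\
 &= \pi\bigl(\alpha(\gamma_1^{-1},x)^{-1}\,
        \alpha(\gamma_2^{-1},\gamma_1^{-1}x)^{-1}\bigr)\,
    F(\gamma_2^{-1}\gamma_1^{-1}x) \\
 &= \pi\Bigl(\bigl[\alpha(\gamma_2^{-1},\gamma_1^{-1}x)\,
        \alpha(\gamma_1^{-1},x)\bigr]^{-1}\Bigr)\,
    F\bigl((\gamma_1\gamma_2)^{-1}x\bigr) \\
 &= \pi\bigl(\alpha((\gamma_1\gamma_2)^{-1},x)^{-1}\bigr)\,
    F\bigl((\gamma_1\gamma_2)^{-1}x\bigr)
    && \text{by } (*) \\
 &= [(\gamma_1\gamma_2)F](x).
\end{align*}
```

Unitarity then comes from two facts already assumed: π is unitary, so $\|[\gamma F](x)\|_{V_\pi} = \|F(\gamma^{-1}x)\|_{V_\pi}$, and the Γ-action preserves µ, so the integral of this quantity squared equals that of $\|F(x)\|^2_{V_\pi}$.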
One can also use this structure in order to define a natural “induced” morphism I from the first to the second (group equivariant) co-chain complex associated to these Λ- and Γ-representations, as follows:

$$[I\omega(\gamma_0, \dots, \gamma_n)](x) = \omega\bigl(\alpha(\gamma_0^{-1}, x)^{-1}, \dots, \alpha(\gamma_n^{-1}, x)^{-1}\bigr) \qquad (\omega : \Lambda^{n+1} \to V_\pi).$$
It is easy to verify directly that I satisfies, at the formal level, the properties required in order to induce a map between the two cohomology groups. But the exact description of I is of no real importance to us here. Rather, what should be transparent is the problem one immediately encounters when working with standard group cohomology: it is not at all clear why the norm of $[I\omega](\gamma_0, \dots, \gamma_n)$ should be a square integrable function of x ∈ X for a fixed $(\gamma_0, \dots, \gamma_n) \in \Gamma^{n+1}$. Indeed, this need not be the case, and in fact one cannot guarantee in general any control on any function of this kind, as the cocycle α may be (and sometimes is) “wild”. Thus if one nevertheless wishes to pursue this direction, it is imperative to work with bounded cohomology, i.e., the cohomology of the complex of uniformly norm bounded (group equivariant) cochains. It is then immediate that the finiteness of measure of X automatically implies that the map I above is well defined. Consequently, it induces maps, for each value of n:

$$I^n : H_b^n(\Lambda, \pi) \to H_b^n(\Gamma, \mathrm{Ind}_\Lambda^\Gamma \pi)$$

between the two bounded cohomology groups. Of course, any such map is of use only if it can be shown to be injective, and it is here that a price is paid for “forcing” the cohomology theory to adapt to our setting. In bounded, rather than ordinary cohomology, one generally has much less cohomological machinery available. Indeed, had we been somehow guaranteed that there are no convergence issues (a situation we shall indeed encounter in the last subsection of Section 6 below), and hence could work instead with usual group cohomology, the injectivity of $I^n$, for each n, would have been almost a formality. The first ingredient in Monod-Shalom’s approach is the following (see [90, Sec. 4]):

Theorem 4.1. In the above setting, for every unitary Λ-representation π the map $I^n$ is injective when n = 2.
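The remark preceding Theorem 4.1, that finiteness of the measure makes I well defined on bounded cochains, amounts to a one-line estimate. As a sketch, write $\|\omega\|_\infty$ for the uniform norm bound of a bounded cochain ω; this symbol is notation introduced here, not taken from the text, and measurability questions are left aside.

```latex
% \|\omega\|_\infty := \sup of \|\omega(\lambda_0,\dots,\lambda_n)\|_{V_\pi} over \Lambda^{n+1} (assumed notation)
\begin{align*}
\bigl\|[I\omega(\gamma_0,\dots,\gamma_n)](x)\bigr\|_{V_\pi}
 &= \bigl\|\omega\bigl(\alpha(\gamma_0^{-1},x)^{-1},\dots,
      \alpha(\gamma_n^{-1},x)^{-1}\bigr)\bigr\|_{V_\pi}
  \le \|\omega\|_\infty
  \quad \text{for a.e. } x \in X, \\
\int_X \bigl\|[I\omega(\gamma_0,\dots,\gamma_n)](x)\bigr\|^2_{V_\pi}\, d\mu(x)
 &\le \|\omega\|_\infty^2\, \mu(X) < \infty .
\end{align*}
```

The bound does not depend on $(\gamma_0, \dots, \gamma_n)$, so Iω takes values in $L^2(X, V_\pi)$ and is again uniformly norm bounded. No comparable estimate is available for an unbounded ω, which is the difficulty with ordinary cohomology described above.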
It is probably not true in general that the injectivity in Theorem 4.1 holds for n > 2. Having the injectivity for n = 2 at hand, one can prove and apply some vanishing vs. non-vanishing results in bounded cohomology, in order to deduce sharp consequences. Two such results, whose “tension” is essential for the rigidity applications (e.g., those in Theorems 2.4 above, and 4.3, 4.4 below), are described in the following theorem. Here “negatively curved” represents a wide class of groups including non-elementary: Gromov hyperbolic groups, free products of groups, and discrete subgroups of the isometry group of any CAT(-1) space.

Theorem 4.2.
1. Any “negatively curved” group Γ satisfies $H_b^2(\Gamma, \ell^2(\Gamma)) \ne 0$.
2. Let σ be any unitary representation of the (countable) group $\Gamma = \Gamma_1 \times \Gamma_2$, in which neither one of the factors $\Gamma_i$ has a non-zero invariant vector. Then $H_b^2(\Gamma, \sigma) = 0$.

Concerning Part 1, see [89] for the first general results of this nature (motivated by Sela’s [109]), which were later complemented by [86] in the case of all (subgroups of) hyperbolic groups – see also the recent [60], inspired by Brooks’ well known quasimorphism approach. Part 2 is a result of Burger-Monod [15] (see also [90] for a simpler proof). Both Theorems 4.1, 4.2, as well as other results required in this approach at the bounded cohomology level, rely heavily on the recent deep functorial approach to bounded cohomology developed by Burger-Monod [15], [87], which is particularly useful for $H_b^2$. One sample consequence, taken from [90], was mentioned in Theorem 2.4 above. Here are two more from that paper, where the first makes use of Example 5, and the second of Example 4 in the list of actions in Section 2.

Theorem 4.3. Let $\Gamma = \Gamma_1 \times \Gamma_2$ be as in Theorem 2.4 above. If a Bernoulli Γ-action on $[0, 1]^{\Gamma}$ is OE to a Bernoulli Λ-action on $[0, 1]^{\Lambda}$ for some group Λ (and any choices of Bernoulli measures $\upsilon \times \upsilon \times \upsilon \cdots$), then necessarily $\Lambda \cong \Gamma$ and the OE is induced by an isomorphism of the two actions.

Theorem 4.4.
There exists a continuum of type $\mathrm{II}_1$ relations R ($R = R_\Gamma$) with:

$$\mathrm{Aut}\, R\ \bigl(= \{f \in \mathrm{Aut}(X, \mu) \mid x \overset{R}{\sim} y \Leftrightarrow f(x) \overset{R}{\sim} f(y)\}\bigr) \;=\; \mathrm{Inn}\, R\ \bigl(= \{f \in \mathrm{Aut}(X, \mu) \mid f(x) \overset{R}{\sim} x \text{ for a.e. } x \in X\}\bigr)$$

The first construction of a relation with trivial outer automorphism group was obtained by Gefter [53], using Zimmer’s cocycle superrigidity. A more comprehensive treatment of this theme, still in the framework of higher rank lattices, was recently carried out by Furman [44], who constructed a continuum of relations R as in the theorem, but which are all weakly equivalent (see Section 6 below), unlike the case here. The proof of Theorem 4.4 capitalizes on the additional flexibility made possible in the bounded cohomology approach, compared to the more rigid setting of higher rank lattices.
5. The relation to operator algebra and $\ell^2$-Betti numbers

The group measure space construction and some applications. It is through operator algebra that measurable group theory came to life, in the seminal work of Murray and von-Neumann, where the group measure space construction was introduced [92]. For a long time, it seems, the results obtained in the orbit equivalence theory were primarily examined in the perspective of operator algebra, a point of view which has gradually shifted since Zimmer’s work described earlier. However, with the exception of a few rather coincidental results, it is only in the last three years or so, that the connections between the two theories became truly powerful and useful in both directions. These developments, mostly due to the pioneering work of Popa, have led to the solutions of several outstanding open questions in the two areas.

Recall that a von-Neumann algebra N is an algebra of bounded operators on a separable Hilbert space, which is closed under the ∗ operation and in the weak topology, and contains the identity operator. It is called a factor if its center consists of the scalars alone. The following definition of the group measure space construction is an equivalent simplified version of the one found in most textbooks (cf. [40], [120]), which is valid only in the finite measure preserving case we are interested in (it is actually identical to the original one due to Murray-von-Neumann appearing in [92]).

Definition 5.1. Assume that the countable group Γ acts on (X, µ) (as usual ergodically, freely, finite measure preservingly). Consider the space Γ × X with the product of the counting measure on Γ and µ, and the natural diagonal Γ-action on it. This Γ-action induces a unitary Γ-representation π on the Hilbert space $H = L^2(\Gamma \times X)$, on which the abelian algebra $A = L^\infty(X, \mu)$ acts as well, via multiplication.
The type II₁ factor N associated with the Γ-action on X is then the weak closure of the algebra generated by the family of operators π(Γ) and A.

The fact that this is a factor comes from ergodicity, while its being of type II₁ (i.e., the existence of an appropriate finite trace) comes from the finiteness and invariance of µ. A comprehensive treatment of this subject can be found, e.g., in [120] (and its sequel); for a friendlier introduction to it, see [40]. The basic fact which makes this algebra so relevant to our discussion is that it depends on the action only up to Orbit Equivalence. More on this issue can be found in the work of Krieger (see [76] and the references therein); see also Moore’s survey [91].

The next major advance in this direction was the seminal work of Feldman-Moore [34], [35]. They showed that not only can the factor be constructed directly from the relation R itself, but in fact, given a type II₁ factor N, the above construction can be reversed, provided N contains a so-called Cartan Subalgebra A ⊆ N. Such A is a maximal abelian ∗-subalgebra defined by certain technical properties. In the case where N is obtained via the group
Measurable Group Theory
measure space construction, one such A is provided by L^∞(X); however, in general a Cartan Subalgebra need not be unique up to (unitary) conjugation. This fact, rather unexpected at the time, was first demonstrated five years later by Connes-Jones in [21]. Thus, there is a bijection between abstract type II₁ (i.e., countable, finite measure preserving, ergodic) equivalence relations (X, R), and pairs (N, A); any isomorphism of objects on one side corresponds to an isomorphism on the other. The fact that this relation depends not only on N but also on A (whose conjugacy class is not canonical) makes it a priori too weak for substantial applications. Thus, a reformulation of the Connes-Jones example mentioned above is that there are non-OE relations whose associated factors are nevertheless isomorphic.

Recently, a breakthrough was obtained by Popa, beginning with [100]. He showed that for certain group actions, the Cartan Subalgebra A = L^∞(X) arising in the group measure space construction satisfies stronger properties which, when they hold, determine a Cartan Subalgebra A ⊆ N satisfying them uniquely up to unitary conjugation. This so-called “rigid inclusion” A ⊆ N arises from a “tension” between (relative) property (T) and amenability (or the so-called “Haagerup property”), a tension whose usefulness appears in various forms in rigidity theory. In light of the above discussion it is now clear that in situations where such rigid inclusions are present, any (auto)morphism or invariant of the equivalence relation transfers to one of the associated factor and vice versa, thereby opening possibilities for a variety of applications in both directions. These ideas play an important role in the work of Gaboriau-Popa [51] on free groups (Theorem 2.5 above), which in fact covers a much larger family of groups (see section 9.1 below), even if considerably more effort is still required there.
We should note that prior to [51] only a few (at most 5, it seems) non-orbit equivalent actions of a free group were known to exist. We next describe a major application of Popa’s theory, involving the notion of the fundamental group of a factor.

The fundamental group of factors and equivalence relations. Given any (type II₁) factor N and real t > 0, Murray and von-Neumann defined a new “rescaled” factor, called the “t-amplification” of N and denoted N^t. They then defined the fundamental group of N, F(N), to be the group of all t such that N ≅ N^t. The first construction of N with F(N) ≠ R^*_+ (indeed, of a factor with countable F) was due to Connes [18], using Kazhdan’s property (T). However, nothing more explicit was known about these (or other) fundamental groups; in particular, this left open the natural question (raised by Kadison in 1967) of whether there exists N with F(N) = 1. Recently, this was settled affirmatively by Popa [100] (to which we refer for more background and relevant references):

Theorem 5.2. There exists N with F(N) = 1. In fact, the type II₁ factor N corresponding via the group measure space construction to R_Γ with Γ = SL₂(Z) acting on T² (Example 2 in the list of Section 2) satisfies this property.
In fact, even more recently Popa established [101] the following striking result:

Theorem 5.3. For any countable subgroup S ⊂ R^*_+ there exists N with F(N) = S.

In order to understand how Popa was able to apply here his theory and the relation between operator algebra and orbit equivalence, we need to recall the corresponding notion in the measurable setting. Given any type II₁ relation R on the probability measure space (X, µ) (say R = R_Γ), and a subset Y ⊆ X with µ(Y) = t > 0, one can consider the restriction R_Y of R to Y (i.e., two points in Y are equivalent in R_Y iff they are R-equivalent in X). It is not difficult to see that if Y, Z ⊆ X satisfy µ(Y) = µ(Z) = t > 0, then R_Y ≅ R_Z (this follows from the fact that the so-called Full group, the group Aut(R) defined in Theorem 4.4 above, acts transitively on the sets (modulo 0) of measure t). Thus, for any 0 < t ≤ 1 the “rescaled” relation R^t is well defined by restricting to any subset of measure t, and one can then define:

Definition 5.4. The fundamental group F(R) of a type II₁ relation R is the group generated by the t’s with R ≅ R^t.

For example, it is easy to deduce from Theorem 2.2 above that when R = R_Γ with Γ amenable, F(R) = R^*_+. It should not come as a surprise that in Feldman-Moore’s correspondence, the relation R^t corresponds to N^t with A^t = L^∞(Y). Thus any isomorphism R ≅ R^t clearly gives rise to one between N and N^t (thereby inducing an embedding of the first fundamental group in the second). However, the converse will hold only for isomorphisms intertwining A and A^t (up to conjugation), which cannot be guaranteed in general. Nevertheless, in situations where one has Popa’s rigid inclusion, since A is canonical it does follow that F(N) = F(R), and if moreover F(R) = 1, this establishes Theorem 5.2 above.
As was shown first by Gefter-Golodets [54], property (T), as well as Zimmer’s cocycle superrigidity approach, enables one to find Γ with F(R_Γ) = 1 (in fact, this will be the case for any action of a “higher rank” lattice Γ). However, Popa’s theory requires one to work with Γ having the “Haagerup property”, which higher rank lattices never possess. This suggests one motivation for the next subsection, concerning Gaboriau’s important theory of ℓ²-Betti numbers of equivalence relations, in which we shall see that for Γ = SL₂(Z) appearing in Theorem 5.2 (to which Popa’s machinery can be applied), all relations R = R_Γ have trivial F. We remark that later, Valette implemented in [122] Popa’s ideas for other groups with vanishing ℓ²-Betti numbers (e.g., products of lattices in SL₂(C)), where Monod-Shalom’s bounded cohomology approach can be applied to show triviality of the fundamental group. For other recent related results of Popa and further details, see also [102], [103].

ℓ²-Betti numbers. The fascinating theme of ℓ²-Betti numbers (denoted β_n for each natural n) deserves a discussion in its own right, which obviously cannot be offered here. First defined in a very analytic form by Atiyah [10] in
the setting of manifolds admitting a co-compact group action, and generalized to foliations by Connes [19], ℓ²-Betti numbers reached their complete definition in the work of Cheeger-Gromov [16]. Considerable advance came recently with the more algebraic approach of Lück (cf. [81], [82]). See also the much recommended survey [32] for more details. Perhaps the simplest equivalent definition, due to Lück [78], is available for groups Γ satisfying appropriate “finiteness properties” (like all the groups mentioned here): If Γ contains a decreasing sequence Γ_i of finite index normal subgroups with trivial intersection, then β_n(Γ) = lim_i b_n(Γ_i)/[Γ : Γ_i], where b_n = dim H^n is the usual nth Betti number. The relevance to measurable group theory came in the extension of this invariant to type II₁ equivalence relations by Gaboriau [48]:

Theorem 5.5. Let R be any type II₁ equivalence relation.
1. For every n one can define abstractly β_n(R) (without any reference to a group), so that if R = R_Γ then β_n(R) = β_n(Γ).
2. For any 0 < t ≤ 1 one has for the rescaled relation: β_n(R^t) = β_n(R)/t.

A different proof of the Theorem was recently suggested by Sauer [111]. It follows immediately that if Γ is a countable group such that for some n one has 0 < β_n(Γ) < ∞, then F(R_Γ) = 1. For example, the free group on m generators F_m satisfies β₁(F_m) = m − 1, and hence the virtually free group Γ = SL₂(Z) satisfies β₁(Γ) ≠ 0 as well, which, as explained previously, was used by Popa [100] in the proof of Theorem 5.2 above. In fact, in situations where Popa’s “rigid inclusions” A ⊆ N are present, we have seen that invariants of the relation transfer to ones of the associated factor, thereby defining ℓ²-Betti numbers of the factor N (behaving appropriately with respect to amplifications).
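The two computations in the preceding paragraph can be spelled out as follows. This is only a sketch: it uses Lück’s approximation formula and Theorem 5.5 as quoted in the text, together with two standard facts that the text does not state and that are assumed here, namely that free groups are residually finite and that, by the Nielsen-Schreier formula, a subgroup of index k in F_m is free of rank k(m − 1) + 1. The symbol k_i is notation introduced for this sketch.

```latex
% (a) beta_1(F_m) = m - 1 via Lueck's approximation.
% Assumption: (\Gamma_i) is a decreasing chain of finite index normal
% subgroups of \Gamma = F_m with trivial intersection, so that
% k_i := [\Gamma : \Gamma_i] \to \infty.
\begin{align*}
b_1(\Gamma_i) &= \operatorname{rank}(\Gamma_i) = k_i (m-1) + 1
  && \text{(Nielsen-Schreier)},\\
\beta_1(F_m) &= \lim_{i\to\infty} \frac{b_1(\Gamma_i)}{k_i}
  = \lim_{i\to\infty} \Bigl( (m-1) + \frac{1}{k_i} \Bigr) = m-1 .
\end{align*}

% (b) 0 < \beta_n(\Gamma) < \infty forces F(R_\Gamma) = 1.
% Let R = R_\Gamma and suppose R \cong R^t for some 0 < t \le 1.
% Isomorphic relations have equal \beta_n, and Theorem 5.5(2) gives
% \beta_n(R^t) = \beta_n(R)/t.
\begin{align*}
\beta_n(R) = \beta_n(R^t) = \frac{\beta_n(R)}{t}
\quad\Longrightarrow\quad (1-t)\,\beta_n(R) = 0
\quad\Longrightarrow\quad t = 1 ,
\end{align*}
% where the last step uses that \beta_n(R) = \beta_n(\Gamma) is finite and nonzero.
```

Since F(R) is generated by the values t ≤ 1 with R ≅ R^t, step (b) leaves only t = 1, which is the triviality of the fundamental group claimed above.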
It is a general hope, motivated by the intimate connections between orbit equivalence theory and operator algebra, that any group invariant which “respects” orbit equivalence should find its von-Neumann algebra counterpart (cf. [22] in the case of property (T)). An intriguing implementation of this principle in the case of ℓ²-Betti numbers was recently suggested by Connes-Shlyakhtenko [23] (see also [113] in relation to cost, discussed in Section 8 below). Ideally, this may eventually lead to a solution of the long standing problem of whether the group von-Neumann algebras associated with different free groups are (non)isomorphic factors.

We close this section with a remarkable application, due to Gaboriau [48], of measurable group theory back to the theory of ℓ²-Betti numbers:

Theorem 5.6. Let Γ be a countable group with β₁(Γ) ≠ 0. If N ◁ Γ is a normal subgroup which is both infinite and has infinite index, then N is not finitely generated.

Under the assumption that Γ/N is not torsion (or contains arbitrarily large finite subgroups), this was proved by Lück [79], [80], but the only proof known to date of the general statement uses measurable orbit equivalence theory (see also the discussion following Theorem 8.2 below). There are general situations in which one can a priori guarantee that Γ satisfies β₁(Γ) ≠ 0 (for a
geometric one see [110, Thm 1.5] and its extension in [14], see also [11]), thereby obtaining for Γ the conclusion of Theorem 5.6 (classically known as Schreier’s theorem for a free group Γ). An application of Gaboriau’s results to problems in percolation theory can be found in [49] (see also [83]).

6. Measurable vs. geometric group theory

Quasi-isometries and Measure Equivalence of groups. Roughly speaking, geometric group theory studies the algebraic properties of a group which are reflected in its geometry at a “large scale”. The following basic notion enables one to make this more precise:

Definition 6.1. Let Γ, Λ be finitely generated groups, and let d_Γ, d_Λ be the associated word metrics on Γ, Λ w.r.t. some finite generating sets. Say that Γ is quasi-isometric to Λ, denoted Γ ∼_{q.i.} Λ, if there exists a (so-called) quasi-isometry f : Γ → Λ, i.e., f which satisfies for some global constants C, L, D and all γ, γ′ ∈ Γ, λ ∈ Λ:

(1/C) d_Γ(γ, γ′) − L ≤ d_Λ(f(γ), f(γ′)) ≤ C d_Γ(γ, γ′) + L   and   d_Λ(f(Γ), λ) < D.

It is immediate that changing one finite generating set to another gives q.i. metrics on the group, hence the q.i. equivalence relation on groups is well defined. Beginning with Gromov’s general program (cf. [58]) to classify groups up to quasi-isometry, geometric group theory has developed, in the last decade or so, remarkable results and tools incorporating diverse areas of mathematics (cf. the survey [33], see also the discussion preceding Theorem 6.10 below). The following elementary observation of Gromov [58] (see [62, p. 98], or [111, Sec. 2] for the easy details) opens the door for the connection with measurable group theory:

Proposition 6.2. Γ ∼_{q.i.} Λ ⇔ there exists a locally compact space X on which Γ, Λ act continuously and properly, with bounded fundamental domains, and the actions commute.

A space X as in the Proposition is referred to as a topological coupling of the groups. A basic example of this situation, making use of the commuting left and right group multiplication, is the following:

Example 6.3. Any two co-compact lattices in a locally compact group G are q.i.

Gromov then proceeded to suggest the following natural measurable analogue:

Definition 6.4. Say that the groups Γ and Λ are Measure Equivalent (ME), denoted Γ ∼_{ME} Λ, if there exists a σ-finite measure space X on which Γ, Λ act measure preservingly, with finite measure fundamental domains, such that the actions commute. Such X is called a measurable coupling of the groups, and the ratio between the measures of (any choice of) fundamental domains of them is called the coupling index.
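As a minimal illustration of Definition 6.4 (a standard example, not taken from the text), let Λ be a subgroup of finite index in a countable group Γ, and take X = Γ with the counting measure. The symbols m, F_Γ, F_Λ and the coset representatives g_1, ..., g_k below are notation introduced only for this sketch.

```latex
% Coupling: \Gamma acts on X = \Gamma by left multiplication and
% \Lambda acts by right multiplication. The two actions commute and
% both preserve the counting measure m.
\begin{align*}
\gamma \cdot x &= \gamma x, \qquad \lambda \cdot x = x\lambda^{-1},
  \qquad \gamma \in \Gamma,\ \lambda \in \Lambda,\ x \in X .
\end{align*}
% Fundamental domains: a single point for \Gamma, and a set of left
% coset representatives for \Lambda.
\begin{align*}
F_\Gamma &= \{e\}, & m(F_\Gamma) &= 1,\\
F_\Lambda &= \{g_1,\dots,g_k\}, \quad \Gamma = \bigsqcup_{j=1}^{k} g_j\Lambda,
  & m(F_\Lambda) &= k = [\Gamma:\Lambda].
\end{align*}
% Both fundamental domains have finite measure, hence \Gamma \sim_{ME} \Lambda,
% and the ratio of their measures is [\Gamma:\Lambda] (or its inverse,
% depending on the order in which the ratio is taken).
```

In this example the coupling index recovers the index of the subgroup, which is one way to see why it is a natural measurable substitute for that notion.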
Analogous to the geometric setting, one has:

Example 6.5. Any two lattices in a locally compact group G are ME.

The important conceptual bridge between measurable and geometric group theory is given by the following result of Furman [43] (who credits Zimmer as well):

Theorem 6.6. The groups Γ, Λ are ME iff for some actions of them, say Γ on (X, µ) and Λ on (Y, υ), and some t, s ∈ R_+, one has OE of the restricted relations (see Definition 5.4 above): R_Γ^t ≅ R_Λ^s.

When some Γ- and Λ-actions satisfy the condition in the second (“if”) part of the theorem, they are called Weakly (or Stably) Orbit Equivalent, denoted WOE. This generalizes the notion of OE, which corresponds to the case where t = s (= 1). The more flexible WOE turns out to be natural to work with; for example, any finite index subgroup is ME to the ambient group, and dividing a given action by a finite normal subgroup, or inducing it to a finite index ambient group, results in a WOE action (see [43], and [90, Sec. 2] for more details). Typically, all the results in the OE setting generalize naturally to the WOE setting, and sometimes, even if one is interested in OE only, passing through WOE considerations seems necessary. For a related examination of the ME setting from the operator algebra perspective, see Vershik’s [123].

Measure equivalence provides a convenient framework, enabling one to focus on the OE (or WOE) properties of groups, rather than of particular actions of them. Thus, the basic conceptual question of “how much” of the structure of Γ is revealed in all the relations its actions generate can be made more precise as:

Question 6.7. Given Γ, what can be said about the groups ME to it?

We conclude this introductory subsection by remarking that neither one of the ME and q.i. relations implies the other (e.g., by using amenability on one side, and property (T), which is ME invariant but not q.i. invariant, on the other; see e.g. [47] for details). Empirical experience (cf. Dye-Ornstein-Weiss Theorem 2.2 above) shows, however, that it typically takes finer tools to distinguish between groups measurably than geometrically.

ME Rigidity results. The Dye-Ornstein-Weiss Theorem 2.2 (together with Theorem 6.6) immediately yields one instance where a complete answer to the above question is available:

Theorem 6.8. Assume that Γ is amenable. Then Λ ∼_{ME} Γ ⇔ Λ is amenable.

As remarked in [90] (see [50] for much more on this issue), the following additional result of non-rigid type can easily be deduced:

Proposition 6.9. There is a continuum of groups ME to a free group (or to SL₂(Z)).
We remark that a similar result will of course hold for direct products of free groups. Somewhat surprisingly though, such products do exhibit sharp rigidity phenomena [90] (compare with Theorems 4.3 above and 6.12 below; see also [71]). The outstanding rigidity result in this setting is the following, due to Furman [42], which should be contrasted with the previous Proposition.

Theorem 6.10. Let Γ = SL_n(Z) with n ≥ 3 (or let it be any lattice in a simple Lie group of real rank > 1). Then all the groups ME to Γ are accounted for by the example following Definition 6.4 above, modulo finite kernels and co-kernels. Namely, if Λ ∼_{ME} Γ, then after passing to a finite index subgroup and dividing by a finite normal subgroup, Λ and Γ are lattices in the same ambient simple Lie group.

Furman borrowed a key idea from geometric group theory in the proof of this remarkable result. It is a well known basic principle in q.i. rigidity theory that if one has “sufficiently good” control over all the self quasi-isometries of a group Γ, then this gives considerable information on the possible groups q.i. to it. More precisely, one can define naturally the group of such self q.i.’s (modulo ones of uniformly bounded distance), denoted QI(Γ), in which every Λ q.i. to Γ must embed (modulo a finite kernel). While in the measurable category there seems to be no direct analogue of QI(Γ), implementing the same philosophy turned out to be crucial in Furman’s proof, and enabled him to cross the “linearity barrier” alluded to in Section 3 above, which seemed to be present due to the assumption on the target group H in Zimmer’s cocycle superrigidity theorem (notice that the problem indeed disappears when studying an orbit equivalence between two Γ-actions).

Continuing the analogy with geometric group theory, we mention that the q.i. analogue of Theorem 6.10 (see Example 6.3) was previously established for all lattices in simple Lie groups G, as an accumulation of various highly involved results, due to a long list of authors whom we shall not mention here (see the useful survey [33]). In the geometric setting, however, co-compact and non co-compact lattices behave differently, the latter being quasi-isometric only to groups commensurable with them.

Finally, Furman’s first examples of (ergodic) type II₁ relations which cannot be obtained from a free measure preserving action of any group can now be described quite easily: From Theorem 6.10 (and 6.6) it follows that for any action of any higher rank lattice Γ, if one has for some t an isomorphism of the rescaled relation R_Γ^t ≅ R_Λ for some group Λ, then t must lie in a prescribed countable set S (computed in terms of the cardinality and index of the kernel and co-kernel appearing in Theorem 6.10 above, and the possible co-volume ratios of the two lattices there). Thus for any Γ-action and t ∉ S, the relation R_Γ^t will do the job! See [43, Thm D] for more details.

Gaboriau’s Theorem 5.5 above admits the following adaptation to the ME setting, which is often useful in distinguishing “measurably” between groups:
Theorem 6.11 (“Gaboriau’s Proportionality”). Assume that Γ ∼_{ME} Λ. Then there exists a positive constant c (= t/s in the notation of Theorem 6.6), such that for all n one has: β_n(Γ) = c β_n(Λ).

This result has no analogue in geometric group theory (although the vanishing of β_n was shown by Pansu to be a q.i. invariant). Finally, we mention the following sample application of Monod-Shalom’s bounded cohomology approach [90] (see Section 4), in which “negatively curved” has the same meaning as before Theorem 4.2 above.

Theorem 6.12. Let Γ_i, Λ_j be “negatively curved” (as above), and torsion free. If Γ_1 × ··· × Γ_n ∼_{ME} Λ_1 × ··· × Λ_m, then n = m, and after re-ordering, Γ_i ∼_{ME} Λ_i for all i.

A similar prime factorization result is known to hold in geometric group theory. Partially motivated by Theorem 6.12, a result of this type in the operator algebra setting was also established recently by Ozawa-Popa [94]. Another rigidity result established in [90] for groups with “radical” is the following:

Theorem 6.13. Let N ◁ Γ, M ◁ Λ be amenable normal subgroups such that both quotients Γ/N, Λ/M are torsion free and “negatively curved” (e.g., as above). Then Γ ∼_{ME} Λ ⇒ Γ/N ∼_{ME} Λ/M.
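A quick sanity check of Gaboriau’s proportionality (Theorem 6.11) can be made with free groups. It assumes the standard fact, not stated in the text, that a subgroup of index 2 in F₂ is free of rank 3. The group F₃ then sits in F₂ with finite index, so the two groups are ME by the remark following Theorem 6.6, and the values β₁(F_m) = m − 1 quoted in Section 5 determine the constant c.

```latex
\begin{align*}
\beta_1(F_2) &= 1, \qquad \beta_1(F_3) = 2,\\
\beta_1(F_2) &= c\,\beta_1(F_3)
  \quad\Longrightarrow\quad c = \tfrac{1}{2} = \frac{1}{[F_2:F_3]},\\
\beta_n(F_2) &= 0 = c\,\beta_n(F_3) \qquad (n \neq 1).
\end{align*}
% The last line uses the standard fact (an assumption here) that the
% \ell^2-Betti numbers of a free group vanish outside degree 1.
```

The same constant works in every degree, as the theorem requires, and it agrees with the ratio of fundamental domain measures in the coupling given by the inclusion of F₃ in F₂.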
Applications to the geometric group theory of amenable groups. Despite the mutual independence of geometric and measurable group theory, we have already seen one example of a flow of techniques, from the former to the latter, in the proof of Theorem 6.10 above. In this subsection we briefly describe an interaction going in the opposite direction, giving rise to new rigidity results on the large scale geometry of amenable groups. For further details the reader is referred to [111].

Assume that Γ and Λ are amenable groups which are quasi-isometric, and consider a topological coupling X of them, as in Proposition 6.2. Notice that since the two actions commute, this coupling induces an action of Γ on X/Λ (and vice versa), which by compactness and amenability admits a finite invariant measure. It is easy to see that “lifting” (and tessellating) this measure to X then yields an ME coupling structure on X, as in Definition 6.4. The departure point of the approach taken in [111] is to gain information on the relation between Γ and Λ by using this simple observation, in order to shift from the geometric to the measurable category, and then apply techniques of the latter (an idea which may seem quite strange at first, in light of Theorem 6.8 above). However, the topological structure is not abandoned completely. Continuing the transition, we next use Theorem 6.6 to arrive at OE (or WOE) actions of Γ and Λ, to which the OE cocycle α is associated, similarly to the discussion in Section 3 above. Inspecting more closely the process carried out here, one readily observes that the original topological structure (particularly the compactness of the fundamental domains) gives rise to the optimal finiteness properties one would like the OE cocycle α to have. Namely, for each γ₀ ∈ Γ, the map α(γ₀, −)
takes only finitely many values in Λ. We can now return to the same construction of induction of representations and cohomology, as outlined in Section 4. This time, however, using the above finiteness property of α, we are able to work with various categories of representations and cohomology without having to bother with “convergence issues”. As mentioned in Section 4, the injectivity of the map I defined there is quite easily obtained when working with ordinary group cohomology. Of course, to implement this strategy, additional results at the representation and cohomological levels (some of interest in their own right) are established. Among the applications of this approach to the geometric group theory of amenable groups we mention:

1. (Co-)homological dimension over Q is a q.i.-invariant. Consequently, so is the Hirsch length of solvable groups: hΓ = Σ_i dim_Q((Γ^(i)/Γ^(i+1)) ⊗_Z Q).
2. The ordinary Betti numbers are q.i. invariant for nilpotent groups.
3. First “substantial” results on the large scale geometry of polycyclic groups, and some non-finitely presentable groups.
4. A proof of q.i. rigidity of abelian groups avoiding Gromov’s celebrated polynomial growth theorem (thereby avoiding Montgomery-Zippin’s involved work).

Some elements of the approach in [111] were recently improved by Sauer [104], enabling him (among other things) to sharpen some results of [111] to the statement made in 1, and even to cover in some results all groups, not only amenable ones. The result in 2 gives rise to the first examples of non q.i. nilpotent groups which do have isomorphic graded real Lie algebras (recall that Pansu showed [95] that the latter is a q.i. invariant for nilpotent groups, leaving open its completeness as such). One sample result covered by 3 is that any group q.i. to a polycyclic group, or to the (non-finitely presentable) Lamplighter group, has a finite index subgroup with infinite abelianization.
Finally, the result in 4 follows a natural question raised by various authors; see [111] and the references therein for details on this, as well as other issues mentioned in this brief account.

7. The relation to descriptive set theory

A theme of growing interest in descriptive set theory is the systematic study of “complexity” of equivalence relations in general, and ones related to classification problems in particular. Roughly speaking, in classification problems a category of objects is given (e.g., all groups generated by a fixed number of elements, all irreducible unitary representations of a given group, etc.), which one would like to classify up to an “isomorphism”, giving rise to a natural equivalence relation on the category. In many natural examples the category carries, or can be given, a structure of a “Polish” (or “Borel”) space, with respect to which the equivalence relation is Borel as well. Sometimes “reasonable classification” is indeed possible (by which one generally means describing a “simple” set of complete (say, real) invariants parameterizing the equivalence classes), but typically this cannot be done, and one is left with the problem
of trying to “measure” or “compare” the complexity of different classification problems (soon to be made more precise and concrete). Thus, we enter the playground of Borel equivalence relations (defined similarly to measurable relations in Section 2 above), on Borel spaces. Letting now E, F be two such relations on the spaces X, Y resp., consider the following:

Definition 7.1. We say that E is Borel reducible to F, and denote E ≤ F, if there is a Borel map f : X → Y such that x₁ ∼_E x₂ iff f(x₁) ∼_F f(x₂). Such f is called a Borel reduction from E to F. If both E ≤ F and F ≤ E hold, they are said to be Borel bireducible, denoted E ∼ F. If E ≤ F, yet F ≤ E does not hold, write E < F.

Thus, when f is a Borel reduction from E to F, it forms a reduction of the problem of “classification up to E” to “classification up to F”, thereby inducing a natural (partial) order on relations. Various expositions of this subject and its recent developments can be found, e.g., in Hjorth’s [63], [65], and Jackson-Kechris-Louveau [73]. Of particular interest are the countable (-equivalence classes) Borel relations, which turn out to appear in (or be Borel bireducible to) many of those one encounters. It turns out that among those relations E which are “non-trivial”, there is a “smallest” (the “hyperfinite”) one, denoted E₀, and a largest, “universal” one, denoted E_∞, namely, for all such E: E₀ ≤ E ≤ E_∞. Surprisingly, the following basic result of Adams-Kechris [9] appeared only 5 years ago:

Theorem 7.2.
1. There exist incomparable relations, i.e., ≤ is only a partial ordering.
2. There exist infinitely many (indeed a continuum of) relations which are mutually non bireducible.

In fact, much more on this was established in [9], to which we also refer the reader for more background material and references.
It turns out that virtually all that is currently known in the subject comes with the aid of groups, and the techniques of measurable orbit equivalence theory (particularly ones related to cocycle rigidity results, see Section 3 above). Indeed, to any Borel reduction of relations one can associate naturally an “orbit” cocycle α, just as explained in Section 3, and in situations where the Borel relation arises from a measure preserving action of a “higher rank lattice”, Zimmer’s Theorem (for example) naturally enters the game. In fact, the cocycle superrigidity theorem for products of groups in [89] can often serve as a substitute for Zimmer’s theorem, e.g., in Adams-Kechris’ work [9], or in proving Theorem 7.3 below. A more (yet not entirely) elementary, ergodic theoretic (cocycle rigidity) approach to some of these applications is also suggested by Hjorth-Kechris in [71].

Let us mention now two other basic and elementarily stated problems in this area, which were recently answered via the same ergodic theoretic construction. For the first, recall that if E, F are Borel relations defined on the same Borel space, then one has the natural notion of containment E ⊆ F. Also, given
a Borel relation E on the Borel space X, one defines the relation E ⊕ E on the space X × {1, 2} by declaring (x, i) ∼ (y, j) iff i = j and x ∼ y. Obviously, E ≤ E ⊕ E. The first part of the following result was proved by Adams [7], while the second, following the latter, by Thomas [117]. Both use the same construction, which is of the type described in Example 4 of the list in Section 2, with Γ being (isomorphic to) a suitable higher rank lattice.

Theorem 7.3.
1. There exist relations E ⊆ F for which E ≤ F fails.
2. There exists a relation E with E < E ⊕ E.

Finally, we discuss briefly one concrete classification problem, which attracted considerable interest until its recent final solution. This is the classification problem of torsion free abelian groups. A torsion free abelian group of rank at most n is a subgroup of Q^n. The space S(Q^n) of all such groups admits a natural Borel structure, on which the relation of group isomorphism forms a countable Borel equivalence relation, denoted ≅_n. In the case n = 1 one has (due to Baer) a completely satisfactory (and easy) solution to the classification problem, and the resulting relation on S(Q) turns out to be the same as the smallest E₀ mentioned before Theorem 7.2 above. However, very little was known when n > 1. In fact, Hjorth-Kechris conjectured in [69] that for all n > 1 the relations ≅_n are the most “complex” relations, i.e., they are bireducible to the universal E_∞ (alluded to before Theorem 7.2 above). The following, due to Thomas [118], is the culmination of several partial results on this problem due to various authors (see below).

Theorem 7.4. For all n one has: ≅_n < ≅_{n+1} (< E_∞).

It is easy to see that if two subgroups A, B < Q^n are isomorphic, then there is an element g ∈ GL_n(Q) with gA = B, hence the equivalence relation we study is induced by the (non-free) action of the countable group GL_n(Q).
A key ingredient in the progress towards Theorem 7.4 above was Hjorth’s result [64], that even though the latter group does not admit a finite invariant measure for its action, our “favorite” group SL_n(Z) does! This opens the door for, and makes relevant techniques from, measurable orbit equivalence theory, even if there is still much more work to be done. We refer the reader to [115] for a clear account of the developments around Theorem 7.4 and the results preceding it (see also [66]). In the next Section we describe the notion of treeability of equivalence relations, first introduced by Adams, which has recently been receiving some attention from the descriptive set theory viewpoint as well (cf. [73, Section 3], [71]).

8. Graphings, cost, and treeability

Given a probability measure space (X, µ), there is a natural way to construct a (countable, measure preserving) equivalence relation R, by introducing on X a Graphing Φ. By this we mean a countable family of partial isomorphisms of
X: Φ = {φ_1, φ_2, ...}, namely, for each i, φ_i : A_i → B_i preserves the measure µ, where A_i, B_i ⊆ X. Any graphing Φ naturally generates an equivalence relation R_Φ on X, namely, the smallest equivalence relation ∼ for which x ∼ φ_i(x) for all i and all x ∈ A_i. For example, if a group Γ acts on X (preserving µ), and S ⊆ Γ is any generating subset, then for Φ = Φ_S = {φ_s}_{s∈S} (where φ_s is the action map of s), we have R_Φ = R_Γ (here A_i = X for all i). We can now define:

Definition 8.1. Retain the above setting and notations.
1. The cost of Φ, C(Φ), is defined by C(Φ) = Σ_i µ(A_i).
2. The cost of an equivalence relation R is defined by: C(R) = inf{C(Φ) | R = R_Φ}.
3. Φ is called a treeing of the relation R if R = R_Φ, and for a.e. x, the graph structure on the equivalence class R(x) induced by Φ (i.e., the graph with edges corresponding to the φ_i^{±1}) is a tree. In this case we say that R is treeable.

Graphings and treeability were first introduced by Adams [1] (see also [3] and [6] for other early results). Cost was introduced by Levitt in [77]. It is by definition an orbit equivalence invariant, the difficulty being, of course, showing it is a non-trivial one and computing it, beyond the basic observations that C(R) ≥ 1 (R is infinite), with equality when R is amenable (use Theorem 2.2 above and Γ = Z). The breakthrough around this notion came with the work of Gaboriau [46], who produced the first non-trivial examples, along with a systematic study of it. The most basic construction of such examples came through his insight that when R is treeable, any treeing Φ of R achieves the infimum in the definition of C(R), thereby enabling its computation (it is easy to see that conversely, if there is a graphing attaining the infimum then it must be a treeing). This accounts for Part 1 in the following Theorem, summarizing some of Gaboriau’s main results in [46]:

Theorem 8.2. Cost satisfies the following properties:
1. If Γ = F_n is the free group on n ≤ ∞ generators, then for any Γ-action one has: C(R_Γ) = n. Consequently (answering a question of Kaimanovich by taking n = ∞), there exist R which cannot be generated by any finitely generated group.
2. If Y ⊂ X, and S is the restriction to Y of the relation R on X, then C(S) − µ(Y) = C(R) − µ(X).
3. Many groups Γ (do they all?) “do not bargain”, but rather have a “fixed price property”, namely, C(R_Γ) = C(Γ) is a constant depending only on Γ, and not on the action; for example:
4. One has C(π_g) = 2g − 1 for the “genus-g surface group” π_g, C(SL₂(Z)) = 13/12, C(SL_n(Z)) = 1 for n ≥ 3, and C(Γ_1 × Γ_2) = 1 for any infinite groups Γ_1, Γ_2 (the latter was obtained originally under the assumption that Γ_1 contains an infinite amenable subgroup, later to be removed by Kechris-Miller [75]). Finally, any real value ≥ 1 is obtained as some C(Γ).
5. The following relation between the first $\ell^2$-Betti number and the cost of an (infinite) equivalence relation R holds: $1 + \beta_1(R) \le C(R)$ (equality holds for treeable relations; no case of strict inequality is known).

Part 5 of the Theorem was established in [48]. Notice that Part 1 shows that when $n \neq m$, the groups $F_n$ and $F_m$ do not have any OE actions, an application which answered a major open problem. Note that the latter can also be deduced directly from Theorem 5.5. In fact, to some extent there has been a shift of interest from cost theory to the superseding $\ell^2$-Betti numbers approach, even though some intriguing questions on cost are still open (e.g., the one in Part 3). A comprehensive treatment of cost (among other things), along with various problems around it, can be found in the recent, recommended book by Kechris and Miller [75]; see also [29]. We only remark that an elegant application of it is made in one proof of Theorem 5.6 above: it is not difficult to see (cf. [47, Thm 3.4]) that if $N \triangleleft \Gamma$ is a finitely generated, infinite, and infinite index normal subgroup, then any diagonal product action of Γ on X × Y, where the Γ-action on X is free and the action on Y factors through the quotient Γ/N, has cost one; hence by the general inequality in Part 5 above (and Theorem 5.5), necessarily $\beta_1(\Gamma) = 0$.

The particular case of treeable equivalence relations, and groups which induce such actions, has received a fair amount of attention. Very recently it has matured into a rather complete theory. A summary of the essential results is as follows:

Theorem 8.3. Let R be a treeable (type $\mathrm{II}_1$) relation. Then:
1. All subrelations, restrictions and relations which Borel reduce to R (as in Definition 7.1 above) are treeable (see [46], [73, Section 3]).
2. R has cost one if and only if $R = R_\Gamma$ for amenable Γ.
3. R is always induced by a free action of some group (unlike the case for general type $\mathrm{II}_1$ relations, as mentioned in Section 2 above). Moreover, if C(R) = n is an integer, then such a group can always be taken to be the free group $F_n$.

Statement 2, due to Gaboriau [46], follows from his insight that treeings achieve the cost (as compared with Levitt’s [77]), while statement 3 is a highly non-trivial recent result of Hjorth [68]. From the groups point of view, we have the following:

Theorem 8.4. For a countable group Γ the following are equivalent:
1. Γ admits one treeable action, i.e., an action where $R_\Gamma$ is treeable.
2. All actions of Γ are treeable.
3. Γ is ME to a free group (possibly also Z, or $F_\infty$).
4. Γ is treeable in the following sense defined by Pemantle-Peres [98]: the set of trees with vertex set Γ supports a Γ-invariant probability measure.
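The notions of Definition 8.1, and the principle that treeings attain the cost, can be tried out on a finite toy model. The following sketch is an illustration only and is not taken from the works cited: it assumes that X is a finite set with the uniform probability measure, so that all classes are finite and costs fall below 1, in contrast to the infinite relations considered in the text, where C(R) ≥ 1. The helper names `cost`, `classes`, `is_treeing` and `relation_cost` are hypothetical. In this finite setting a graphing must connect every class, so the infimum equals the sum of (|class| − 1)/|X|, and it is attained exactly by treeings; the last lines check the restriction identity of Part 2 of Theorem 8.2 for a subset Y meeting every class.

```python
from fractions import Fraction

# Toy assumption: X = {0, ..., n-1} with the uniform probability measure.
# A graphing is a list of partial injections, each stored as a dict x -> phi(x).

def cost(graphing, n):
    """C(Phi) = sum_i mu(A_i), where A_i is the domain of phi_i."""
    return sum(Fraction(len(phi), n) for phi in graphing)

def classes(graphing, points):
    """Equivalence classes of R_Phi on `points`, computed by union-find."""
    parent = {x: x for x in points}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for phi in graphing:
        for x, y in phi.items():
            parent[find(x)] = find(y)
    blocks = {}
    for x in points:
        blocks.setdefault(find(x), set()).add(x)
    return list(blocks.values())

def is_treeing(graphing, points):
    """Each class is connected by construction, so its graph is a tree
    exactly when it has (number of vertices - 1) edges."""
    blocks = classes(graphing, points)
    index = {x: i for i, block in enumerate(blocks) for x in block}
    edges = [0] * len(blocks)
    for phi in graphing:
        for x in phi:  # one edge x -- phi(x) per point of the domain
            edges[index[x]] += 1
    return all(e == len(b) - 1 for e, b in zip(edges, blocks))

def relation_cost(blocks, n):
    """Infimum of C(Phi) in the finite toy model: a spanning tree per class."""
    return sum(Fraction(len(b) - 1, n) for b in blocks)

X = range(6)
tree = [{0: 1, 3: 4}, {1: 2, 4: 5}]             # two paths: 0-1-2 and 3-4-5
cycle = [{0: 1, 1: 2, 2: 0, 3: 4, 4: 5, 5: 3}]  # two 3-cycles, same classes
R = classes(tree, X)

print(cost(tree, 6), cost(cycle, 6))              # 2/3 1
print(is_treeing(tree, X), is_treeing(cycle, X))  # True False

# Restriction to Y (measured with mu, not renormalized): C(S) - mu(Y) = C(R) - mu(X)
Y = {0, 1, 3}
S = [b & Y for b in R if b & Y]
print(relation_cost(S, 6) - Fraction(len(Y), 6) == relation_cost(R, 6) - 1)  # True
```

Both graphings generate the same relation, but only the treeing attains the minimal value 2/3, while the cyclic graphing spends one superfluous edge per class.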
The equivalence 1 ⇔ 2 is a recent result of Dooley-Golodets [29], 1 ⇔ 4 is straightforward, and 1 ⇔ 3 follows from the last assertion of Part 3 in Theorem 8.3 (the point being to pass to a restriction of R having integral cost, cf. [68] and [50]). Another property equivalent to 1 (and hence to all of the above) is having ergodic dimension (as defined by Gaboriau [48]) equal to 1. Many groups ME to the free group can be found in Gaboriau’s recent account [50] (which also contains a clear exposition of some results presented here), the simplest family of such groups being free products of any number of infinite amenable groups.

9. Other ME-invariants, further remarks, and some questions

9.1. Spectral ME-invariants and property (T). The most basic and easily established ME-invariant is amenability (note that this does not require the strength of the Dye-Ornstein-Weiss Theorem 2.2 above), and it is not very difficult to verify the same for Kazhdan’s property (T) (see [42], following Zimmer’s earlier definition of an “action with property (T)”; cf. [126, Def. 9.16] and the references therein). A more “sophisticated” ME-invariant (established as such in [28], see also [74]) is the so-called Cowling-Haagerup constant Λ(Γ) [27], pertaining to unit approximation in the Fourier algebra A(Γ). When Λ(Γ) = 1, Γ has the “Haagerup property”, but the converse (conjectured by Cowling) is not known. The Haagerup property, namely, admitting a proper isometric action on a Hilbert space (according to one characterization; cf. [17]), is also known to be ME-invariant (see e.g., [74], [99]).

Property (T) deserves some further discussion, especially in the light of Hjorth’s Theorem 2.6 above (which answers one of Schmidt’s questions in [108]). It is not difficult to see that the property of an action to have “almost (or asymptotically) invariant subsets” (see [106]) is an OE invariant. Whenever a group Γ is non-amenable (resp. non-Kazhdan), it has an action without (resp. with) almost invariant sets (see [24]), hence any group belonging to neither of these two classes admits at least two non OE (in fact non WOE) actions. Thus the Kazhdan groups were the last to resist, until Theorem 2.6. In [67] Hjorth gives a beautiful elementary argument showing that any Γ-action (say, on (X, µ)) inducing a given relation R is “locally rigid” within the (separable) “space of all free Γ-actions on X inducing this R”. Thus only countably many Γ-actions (among the continuum of them) can induce a given relation. Hjorth was influenced by operator algebra considerations [99], but the proof also has the flavor of the recent local rigidity results for Kazhdan groups by Fisher-Margulis [41].

Of course, the next destination is to show that any non-amenable group admits infinitely many non OE actions, and there are by now quite a few families of groups which can be handled, often making use of the relative property (T) mentioned earlier in connection with Popa’s work. Gaboriau-Popa’s proof of Theorem 2.5 makes strong use of this property, as does Tornquist [121], who very recently used their work to obtain a proof of Theorem 2.5 without involving operator algebras. In fact, even though both Gaboriau-Popa and Tornquist state
their result only for free groups, a close inspection of their proof enables one to prove the following more general statement, which highlights the essential role of the relative property (T): Let Γ be a countable group admitting an automorphism action on a discrete abelian group A, such that the semi-direct product $\Gamma \ltimes A$ has the relative property (T). Then the free product Γ ∗ Z has a continuum of non OE actions. (Note that since Γ ∗ Z has infinitely many ends, it satisfies $\beta_1 \neq 0$ [11, Cor. 5], hence by Gaboriau’s Theorem 5.5, WOE of its actions implies OE.) The relative advantage of each approach is reflected in the different perspectives: Gaboriau-Popa’s full result shows the stronger fact that the factors associated to the actions are actually non-isomorphic, while that of Tornquist yields that there is not even a Borel classification of the actions up to OE. A host of examples of groups Γ satisfying the assumption in the above statement is furnished by Fernos’ very recent [39], e.g., any unbounded subgroup of $GL_n(\mathbb{C})$ with trivial solvable radical (see also Valette’s preceding [122] for lattices). Section 9.5 below contains additional results of this nature, including a generalization by Popa of Hjorth’s result above.

Staying with property (T), we remark that in [54] it is applied not only to construct the first examples of relations with trivial fundamental group $\mathcal{F}$, but in fact to show that Kazhdan arithmetic lattices admit a continuum of non OE actions. Although Golodets and Gefter formulate this result only for higher rank lattices, it is actually the case of rank one lattices (i.e., those in $Sp(n, 1)$ and $F_4$) which could not have been approached by Zimmer’s cocycle superrigidity (or in any other way, until the very recent developments). The arithmeticity, or merely the linearity, of these groups Γ was necessary in order to construct actions of the type described in Example 4 of Section 2, varying K among the different profinite completions of Γ.

A natural question remaining in the direction of property (T) is whether Kazhdan groups can admit at all an action with non-trivial fundamental group, or, more generally, whether the coupling index of some ME coupling of them can be different from 1. (Incidentally, a similar question is open in general for lattices in SO(n, 1) with n odd; see also the general discussion on such lattices below.)

9.2. Hyperbolic groups. Gromov hyperbolic groups are of central importance in geometric group theory. Being hyperbolic is well known to be a geometric property, but it is certainly not an ME-invariant (e.g., a non-uniform lattice in the automorphism group of a regular tree is ME to a uniform lattice there, which is a free group, but is not even finitely generated). Hence, following [86], it is natural to ask what part of the geometric hyperbolicity property is, after all, captured in the measurable setting. The first result in this direction was established by Adams [4] (generalizing Zimmer’s result [125] for rank 1 lattices), who showed that if Γ is ME to a (non-elementary) hyperbolic group, then its center is finite and it cannot be decomposed as a direct product of infinite groups.
This result admits a substantial generalization via Monod-Shalom’s bounded cohomology approach, as follows. It is shown in [90] that the property of satisfying the conclusion of Part 1 in Theorem 4.2 above, namely, $H_b^2(\Gamma, \ell^2(\Gamma)) \neq 0$, is an ME-invariant. The class of groups with this property is denoted by $\mathcal{C}_{\mathrm{reg}}$, and contains many families of “negatively curved” groups, including hyperbolic ones (see the references following Theorem 4.2 above). Now, in order to extend Adams’ result mentioned above, one observes (see the introduction of [86]) that not only direct products of infinite groups, or groups with infinite center, are excluded from $\mathcal{C}_{\mathrm{reg}}$, but in fact similarly excluded are groups having an infinite normal subgroup with either one of these two properties. Motivated by the $\ell^2$-Betti numbers theory, it would be interesting to try to make the “being in $\mathcal{C}_{\mathrm{reg}}$” property quantitative (and then OE-invariant), by finding a way to “measure the size” of $H_b^2(\Gamma, \ell^2(\Gamma))$. Another related ME-invariant, which covers the larger class of groups which are hyperbolic after dividing by a normal amenable subgroup, is having a unitary representation σ which is weakly contained in the regular representation and satisfies $H_b^2(\Gamma, \sigma) \neq 0$ (in contrast, by Part 2 of Theorem 4.2, such a representation cannot exist for a product of two non-amenable groups). Another candidate for an ME-invariant which would apply to all hyperbolic groups is described in the next subsection.

9.3. Lattices in rank 1 Lie groups. A particularly interesting class of groups (all ME to hyperbolic groups) are lattices in simple Lie groups of real rank 1. In his original cocycle superrigidity paper [124], Zimmer asked whether one could show that two such lattices are ME only if their ambient groups are (locally) isomorphic. Gaboriau’s $\ell^2$-Betti numbers results indeed enable one to distinguish between “many” of them, with the one family completely resisting this approach (having $\beta_n = 0$) being the lattices in SO(n, 1) for n odd (cf. [50] for more details). Some information on them is available using Gaboriau’s notion of ergodic dimension, which shows that if two lattices in SO(n, 1), SO(m, 1) are ME, then one necessarily has $\max(n, m) \le 2\min(n, m)$ [48, Cor. 6.9]. One group invariant which would be particularly useful for hyperbolic groups in general, and rank 1 lattices in particular, is the infimum over the (possibly empty) set of p for which $H^1(\Gamma, \ell^p(\Gamma)) \neq 0$ (see [14] and [96]). In the latter work it was shown by Pansu to take the value n − 1 for lattices in SO(n, 1). This invariant turns out to coincide often with one defined in [110, Def. 1.8], involving $\ell^p$-integrability of matrix coefficients for cohomological unitary representations, which is itself also a candidate for an ME-invariant. Both invariants reduce to $\ell^2$-Betti numbers at the value 2, and the difficulties of establishing ME invariance for them seem related.

Among rank 1 lattices, first candidates for rigidity are lattices in $Sp(n, 1)$ and the exceptional $F_4$, which are known (following Corlette [25] and Gromov-Schoen [59]) to satisfy Margulis’ superrigidity. A weaker version of cocycle superrigidity was established for them by Corlette-Zimmer in [26], but it is not strong enough for applications to OE-ME rigidity. One ME-invariant these
lattices do enjoy is the Cowling-Haagerup constant (alluded to in the first subsection), which separates, e.g., the various $Sp(n, 1)$’s. In fact, it was suggested by Furman that all rank 1 lattices (excluding, as usual, $SL_2(\mathbb{R})$ as the ambient group) might actually satisfy a far reaching rigidity result, similar to his Theorem 6.10 above in the higher rank case. Even though many of them (e.g., all those in SO(n, 1)) are known not to satisfy Zimmer’s cocycle (or even Margulis’) superrigidity in its full generality, he suspects that, in analogy with Mostow rigidity, for the very specific OE-cocycles Zimmer’s superrigidity may actually hold (possibly by adapting elements of the geometric Besson-Courtois-Gallot techniques (cf. [12]) to this setting). At this point, however, it is not even known that all rank one lattices (without property (T)) admit a continuum of non OE actions.

9.4. Strengthening Measure Equivalence. Within any given ME class of groups, one may strengthen the notion of ME as follows: say that Γ and Λ are Strongly Measure Equivalent (SME) if for every Γ-action there is some Λ-action which is Weakly Orbit Equivalent to it, and vice versa. This is the ultimate “OE-indistinguishability” relation. For example, two amenable, or two commensurable groups, are always SME. Another example of this situation is when Γ is a free group, and Λ is a free product of amenable groups. It may also happen that there is an interesting non-symmetric “hierarchy” in the ME class, e.g., when every Γ-action is WOE to a Λ-action (this happens, by Theorem 8.3 above, whenever Λ is a free group and Γ is ME to Λ), but not vice versa. Can this happen? Is this the case in the last example? What about the particular case where Γ is a surface group: is every treeable relation “surfaceable”? One example to keep in mind of two ME groups which are not SME is $\Gamma = SL_2(\mathbb{Z}[\sqrt{2}])$ and Λ = a product of two non-abelian free groups, both lattices in $SL_2(\mathbb{R}) \times SL_2(\mathbb{R})$. Using Zimmer’s cocycle superrigidity (the irreducible lattice version) on one side, and Monod-Shalom’s cocycle superrigidity for products of groups on the other, one can construct an action of each of the groups which is not WOE to any action of the other. What about other pairs of lattices in the same ambient simple Lie group?

9.5. Algebraic structures on relations. Many standard algebraic structures on groups found their natural analogues in equivalence relations R, starting with the work of Krieger [76] and Feldman-Moore [34], [35]. First and foremost is the associated von Neumann algebra already discussed in Section 5 above, which is related to notions like the regular representation and cohomology of R (see also the survey [91]). In fact, almost the first known OE-invariant (following, and capturing, Schmidt’s “strong ergodicity”, i.e., the absence of “almost invariant subsets” alluded to in subsection 9.1 above) is the first cohomology of R. This was first defined by Singer [114] in the von Neumann algebra group action setting, and then transferred to OE-invariance by Moore [91], who computed some (first) examples (see also [106] for the relation with strong ergodicity, and some computations of it by Gefter [52]). It is only recently, however, that Popa [103],
within a comprehensive treatment of it, demonstrated its usefulness, e.g., by showing that any group Γ containing an infinite normal subgroup with the relative property (T) (e.g., a product of a Kazhdan group with any group) admits a continuum of non OE actions (thereby covering Hjorth’s result as well). Even more recently, Ioana [72] defined and established the OE invariance of a suitable subgroup of $H^1$, and used it to show that every direct product of infinite amenable and non-amenable groups admits a continuum of non OE actions.

Of course, often the algebraic structures one defines on relations are standard extensions of the relevant notions from groups to groupoids. One can define amenability intrinsically, and more interestingly Kazhdan’s property (T) for R (see the interesting [97] in that direction). As on the group theoretic level, one can work with subrelations (e.g., finite index, normal) and also quotients (see [36], [37]), and, following Gaboriau [46], define free or amalgamated products, or HNN extensions of relations (these are particularly useful from the point of view of cost). Typically, when $R = R_\Gamma$, algebraic operations, decompositions, or properties of the acting group pass to R, but things do not always go in the opposite direction. In general, motivated by the Dye-Ornstein-Weiss Theorem 2.2, and the (relatively few) known results, one may expect that group invariants which are “transparently trivial” in the class of amenable groups may be (at least) good candidates to be ME-invariants (outstanding examples here being $\ell^2$- and bounded cohomology techniques and invariants).

Addendum

1. A revised version of Sorin Popa’s paper [102] (February 2005) was posted, containing several truly spectacular results. Among the achievements are the explicit calculations of various fundamental groups of type $\mathrm{II}_1$ relations via pure operator algebra techniques, and the establishing of far reaching OE-superrigidity results, e.g.: Assume that Γ is Kazhdan, and has infinite conjugacy classes (ICC). Consider the relation $R_\Gamma$ obtained by a Bernoulli Γ-action (Example 5 in the list of Section 2). If Λ is any group such that for some action of it and some t, one has $R_\Gamma^t \cong R_\Lambda$, then t = 1, $\Lambda \cong \Gamma$, and the OE is induced by an isomorphism of the two actions. In particular, taking such $R_\Gamma^t$ with 0 < t < 1 adds a large family of examples to Furman’s first constructions (discussed after Theorem 6.10 above) of relations which cannot be generated by a free action of any group [43]. Note however, that Furman’s constructions include also relations which are not even weakly equivalent to a relation which is freely generated by a countable group.

2. We were alerted by Damien Gaboriau to a possible gap in the proof of [29, Lemma 7.5], which is used, through Theorem 7.3 there, in the proof of 1 ⇒ 2 in Theorem 8.4 above. Although Dooley and Golodets assured him that the problem is fixable, details have not been supplied yet.
References

[1] S. Adams. Indecomposability of treed equivalence relations. Israel J. Math., 64, 362–380, 1988.
[2] S. Adams. An equivalence relation that is not freely generated. Proc. Am. Math. Soc. 102, No. 3, 565–566 (1988).
[3] S. Adams. Trees and amenable equivalence relations. Erg. Th. Dyn. Sys., 10(1):1–14, 1990.
[4] S. Adams. Indecomposability of equivalence relations generated by word hyperbolic groups. Topology, 33(4):785–798, 1994.
[5] S. Adams. Some new rigidity results for stable orbit equivalence. Ergodic Theory Dynam. Systems, 15(2):209–219, 1995.
[6] S. Adams. Reduction of cocycles with hyperbolic targets. Ergodic Theory Dynam. Systems, 16, no. 6, 1111–1145, 1996.
[7] S. Adams. Containment does not imply Borel reducibility. In: S. Thomas (Ed.), Set Theory: The Hajnal Conference, DIMACS Series vol. 58, 2002, pp. 1–23.
[8] S. Adams and R. Spatzier. Kazhdan groups, cocycles and trees. Amer. J. Math., 112(2):271–287, 1990.
[9] S. Adams, A.S. Kechris. Linear algebraic groups and countable Borel equivalence relations. J. Amer. Math. Soc., 13, no. 4, 903–943, 2000.
[10] M. Atiyah. Elliptic operators, discrete groups and von Neumann algebras. In Colloque “Analyse et Topologie” en l’Honneur de Henri Cartan, pages 43–72. Astérisque, No. 32–33, 1976.
[11] M.E.B. Bekka, A. Valette. Group cohomology, harmonic functions and the first $L^2$-Betti number. Potential Anal. 6, no. 4, 313–326, 1997.
[12] G. Besson, G. Courtois, S. Gallot. Minimal entropy and Mostow’s rigidity theorems. Ergodic Theory Dynam. Systems, 16, no. 4, 623–649, 1996.
[13] S. Bezuglyĭ, V. Golodets. Hyperfinite and $\mathrm{II}_1$ actions for nonamenable groups. J. Funct. Anal. (1), 40, 30–44, 1981.
[14] M. Bourdon, F. Martin, A. Valette. Vanishing and non-vanishing of the first $L^p$-cohomology of groups. Comment. Math. Helvetici, to appear.
[15] M. Burger, N. Monod. Continuous bounded cohomology and applications to rigidity theory. Geom. Funct. Anal. 12, No. 2, 219–280, 2002.
[16] J. Cheeger and M. Gromov. $L^2$-cohomology and group cohomology. Topology, 25(2):189–215, 1986.
[17] P. Cherix, M. Cowling, P. Jolissaint, P. Julg, A. Valette. Groups with the Haagerup property. Gromov’s a-T-menability. Progress in Mathematics, 197. Birkhäuser Verlag, Basel, 2001.
[18] A. Connes. A type $\mathrm{II}_1$ factor with countable fundamental group. J. Operator Theory, 4, 151–153, (1980).
[19] A. Connes. A survey of foliations and operator algebras. In Operator algebras and applications, Part I (Kingston, Ont., 1980), pages 521–628. Amer. Math. Soc., Providence, R.I., 1982.
[20] A. Connes, J. Feldman, and B. Weiss. An amenable equivalence relation is generated by a single transformation. Ergodic Theory Dynam. Systems, 1(4):431–450 (1982), 1981.
[21] A. Connes, V.F.R. Jones. A $\mathrm{II}_1$ factor with two nonconjugate Cartan subalgebras. Bull. Amer. Math. Soc. (N.S.) 6, no. 2, 211–212, 1982.
[22] A. Connes, V.F.R. Jones. Property T for von Neumann algebras. Bull. London Math. Soc., 17, 57–62, 1985.
[23] A. Connes, D. Shlyakhtenko. $L^2$ Homology for von Neumann algebras. Preprint.
[24] A. Connes, B. Weiss. Property T and asymptotically invariant sequences. Israel J. Math., (3), 37, 209–210, 1980.
[25] K. Corlette. Archimedean superrigidity and hyperbolic geometry. Ann. of Math. (2), 135, no. 1, 165–182, 1994.
[26] K. Corlette, R.J. Zimmer. Superrigidity for cocycles and hyperbolic geometry. Internat. J. Math., 5, no. 3, 273–290, 1994.
[27] M. Cowling, U. Haagerup. Completely bounded multipliers of the Fourier algebra of a simple Lie group of real rank one. Invent. Math. 96, No. 3, 507–549, 1989.
[28] M. Cowling, R. Zimmer. Actions of lattices in Sp(1, n). Ergodic Theory Dynamical Systems, 9(2):221–237, 1989.
[29] A.H. Dooley, V.Ya. Golodets. The cost of an equivalence relation is determined by the cost of a finite index subrelation. Preprint (2004).
[30] H. Dye. On groups of measure preserving transformations. I. Am. J. Math., 81:119–159, 1959.
[31] H. Dye. On groups of measure preserving transformations. II. Am. J. Math., 85:551–576, 1963.
[32] B. Eckmann. Introduction to $\ell_2$-methods in topology: reduced $\ell_2$-homology, harmonic chains, $\ell_2$-Betti numbers. Israel J. Math., 117:183–219, 2000. Notes prepared by Guido Mislin.
[33] B. Farb. The quasi-isometry classification of lattices in semisimple Lie groups. Math. Res. Letters, Vol. 4, No. 5, 705–718, Sept. 1997.
[34] J. Feldman and C. Moore. Ergodic equivalence relations, cohomology, and von Neumann algebras. I. Trans. Amer. Math. Soc., 234(2):289–324, 1977.
[35] J. Feldman, C. Moore. Ergodic equivalence relations, cohomology, and von Neumann algebras. II. Trans. Amer. Math. Soc., 234(2):325–359, 1977.
[36] J. Feldman, C.E. Sutherland, R.J. Zimmer. Normal subrelations of ergodic equivalence relations. Miniconferences on harmonic analysis and operator algebras (Canberra, 1987), 95–102. Proc. Centre Math. Anal. Austral. Nat. Univ., 16, Austral. Nat. Univ., Canberra, 1988.
[37] J. Feldman, C.E. Sutherland, R.J. Zimmer. Subrelations of ergodic equivalence relations. Ergodic Theory Dynam. Systems, 9, no. 2, 239–269, 1989.
[38] R. Feres. An introduction to Cocycle Super-Rigidity. In Rigidity in Dynamics and Geometry (Cambridge, UK, 2000). Springer-Verlag, 2002.
[39] T. Fernos. Kazhdan’s relative property (T): some new examples. Preprint, November 2004.
[40] P.A. Fillmore. A user’s guide to operator algebras. Canadian Mathematical Society Series of Monographs and Advanced Texts. New York, 1996.
[41] D. Fisher, G.A. Margulis. Almost isometric actions, property T, and local rigidity. Invent. Math., to appear.
[42] A. Furman. Gromov’s measure equivalence and rigidity of higher rank lattices. Ann. of Math. (2), 150(3):1059–1081, 1999.
[43] A. Furman. Orbit equivalence rigidity. Ann. of Math. (2), 150(3):1083–1108, 1999.
[44] A. Furman. Outer automorphism groups of some ergodic equivalence relations. Comm. Math. Helv., Vol 80, 2005.
[45] H. Furstenberg. Rigidity and cocycles for ergodic actions of semisimple Lie groups (d’après G.A. Margulis and R. Zimmer). Bourb. Seminar, Lec. Notes in Math, Vol. 842, pp. 273–292, 1981.
[46] D. Gaboriau. Coût des relations d’équivalence et des groupes. Inv. Math., 139(1):41–98, 2000.
[47] D. Gaboriau. On Orbit Equivalence of Measure Preserving Actions. In Rigidity in dynamics and geometry (Cambridge, 2000), 167–186, Springer, Berlin, 2002.
[48] D. Gaboriau. Invariants $L^2$ de relations d’équivalence et de groupes. Publ. math. Inst. Hautes Étud. Sci., 95(1):93–150, 2002.
[49] D. Gaboriau. Invariant Percolation and Harmonic Dirichlet Functions. GAFA, to appear.
[50] D. Gaboriau. Examples of groups that are Measure Equivalent to the free group. Preprint.
[51] D. Gaboriau, S. Popa. An uncountable family of non-orbit equivalent actions of $F_n$. Jour. of AMS, to appear.
[52] S.L. Gefter. Cohomology of the ergodic action of a T-group on the homogeneous space of a compact Lie group. In Operators in function spaces and problems in function theory (Russian), 77–83, 146, “Naukova Dumka”, Kiev, 1987.
[53] S.L. Gefter. Ergodic equivalence relation without outer automorphisms. Dokl. Akad. Nauk Ukraïni, no. 11, 25–27, 1993.
[54] S. Gefter and V. Golodets. Fundamental groups for ergodic actions and actions with unit fundamental groups. Publ. Res. Inst. Math. Sci., 24(6):821–847 (1989), 1988.
[55] V. Golodets. Actions of T-groups on Lebesgue spaces and properties of full factors of type $\mathrm{II}_1$. Publ. Res. Inst. Math. Sci., 22, no. 4, 613–636, 1986.
[56] V. Golodets. Ergodic actions of T-groups that are not orbit equivalent and factors of type $\mathrm{II}_1$. Dokl. Akad. Nauk SSSR, 286, no. 3, 524–527, 1986.
[57] V. Golodets, N. Nessonov. Property T and non-isomorphic factors of type II and III. J. Funct. Analysis, 70, 80–89, 1987.
[58] M. Gromov. Asymptotic invariants of infinite groups. In Geometric group theory, Vol. 2 (Sussex, 1991), pages 1–295. Cambridge Univ. Press, Cambridge, 1993.
[59] M. Gromov, R. Schoen. Harmonic maps into singular spaces and p-adic superrigidity for lattices in groups of rank one. Inst. Hautes Études Sci. Publ. Math., no. 76, 165–246, 1992.
[60] U. Hamenstadt. Bounded cohomology and isometry groups of hyperbolic spaces. Preprint.
[61] P. de la Harpe, A. Valette. La propriété (T) de Kazhdan pour les groupes localement compacts. Astérisque, 175, Soc. Math. de France, 1989.
[62] P. de la Harpe. Topics in geometric group theory. Univ. of Chicago Press, Chicago, 2000.
[63] G. Hjorth. When is an equivalence relation classifiable? Proceedings of the International Congress of Mathematicians, Vol. II, Doc. Math. Extra vol. II, 23–32, 1998.
[64] G. Hjorth. Around nonclassifiability for countable torsion free abelian groups. Abelian groups and modules (Dublin, 1998), 269–292, Trends Math., Birkhäuser, Basel, 1999.
[65] G. Hjorth. Classification and orbit equivalence relations. Math Surveys Monogr. Amer. Math. Soc. vol. 75, xviii+195 pp., 2000.
[66] G. Hjorth. The isomorphism relation on countable torsion free abelian groups. Fund. Math. 175, no. 3, 241–257, 2002.
[67] G. Hjorth. A converse to Dye’s Theorem. Trans. of AMS, to appear.
[68] G. Hjorth. A lemma for cost attained. Preprint.
[69] G. Hjorth, A.S. Kechris. Borel equivalence relations and classification of countable models. Ann. Pure Appl. Logic, 82, 221–272, 1996.
[70] G. Hjorth, A.S. Kechris. Recent developments in the theory of Borel reducibility. Fund. Math. 170, no. 1–2, 21–52, 2001.
[71] G. Hjorth, A. Kechris. Rigidity theorems for actions of product groups and countable Borel equivalence relations. To appear in Memoirs of the Amer. Math. Soc., 2005.
[72] A. Ioana. A relative version of Connes’ χ(M) invariant and existence of orbit inequivalent actions. Preprint, November 2004.
[73] S. Jackson, A. Kechris, A. Louveau. Countable Borel equivalence relations. J. Math. Log. 2, no. 1, 1–80, 2002.
[74] P. Jolissaint. Approximation properties for Measure Equivalent groups. Preprint, 2001.
[75] A. Kechris, B. Miller. Topics in orbit equivalence. Springer-Verlag, Berlin, 2004.
[76] W. Krieger. On constructing non-*isomorphic hyperfinite factors of type III. J. Functional Analysis 6, 97–109, 1970.
[77] G. Levitt. On the cost of generating an equivalence relation. Ergodic Theory Dynam. Systems, 15(6):1173–1181, 1995.
[78] W. Lück. Approximating $L^2$-invariants by their finite-dimensional analogues. Geom. Funct. Anal., no. 4, 455–481, 1994.
[79] W. Lück. $L^2$-torsion and 3-manifolds. In Low-dimensional topology (Knoxville, TN, 1992), pages 75–107. Internat. Press, Cambridge, MA, 1994.
[80] W. Lück. Dimension theory of arbitrary modules over finite von Neumann algebras and $L^2$-Betti numbers. I. Foundations. J. Reine Angew. Math., 495:135–162, 1998.
[81] W. Lück. Dimension theory of arbitrary modules over finite von Neumann algebras and $L^2$-Betti numbers. II. Applications to Grothendieck groups, $L^2$-Euler characteristics and Burnside groups. J. Reine Angew. Math., 496:213–236, 1998.
[82] W. Lück. $L^2$-Invariants and K-Theory. Mathematics – Monograph, Springer-Verlag, to appear.
[83] R. Lyons, Y. Peres. Probability on Trees and Networks. Forthcoming book.
[84] G.A. Margulis. Discrete subgroups of semisimple Lie groups. Berlin: Springer (1991).
[85] D. McDuff. Uncountably many $\mathrm{II}_1$ factors. Ann. of Math. (2), 90, 372–377, 1969.
[86] I. Mineyev, N. Monod, Y. Shalom. Ideal bicombing for hyperbolic groups and applications. Topology 43(6):1319–1344, 2004.
[87] N. Monod. Continuous bounded cohomology of locally compact groups. Lecture Notes in Mathematics, 1758. Berlin: Springer.
[88] N. Monod, Y. Shalom. Negative curvature from a cohomological viewpoint and cocycle superrigidity. C. R. Math. Acad. Sci. Paris, 337, no. 10, 635–638, 2003.
[89] N. Monod, Y. Shalom. Cocycle superrigidity and bounded cohomology for negatively curved spaces. J. Differential Geometry, Vol. 67, no. 3, 395–455, 2004.
[90] N. Monod, Y. Shalom. Orbit equivalence rigidity and bounded cohomology. Ann. of Math., to appear.
[91] C. Moore. Ergodic theory and von Neumann algebras. In Operator algebras and applications, Part 2 (Kingston, Ont., 1980), pages 179–226. Amer. Math. Soc., Providence, R.I., 1982.
[92] F. Murray and J. von Neumann. On rings of operators. Ann. of Math., II., 37:116–229, 1936.
[93] D. Ornstein and B. Weiss. Ergodic theory of amenable group actions. I. The Rohlin lemma. Bull. Amer. Math. Soc. (N.S.), 2(1):161–164, 1980.
[94] N. Ozawa, S. Popa. Some prime factorization results for type $\mathrm{II}_1$ factors. Invent. Math. 156, no. 2, 223–234, 2004.
[95] P. Pansu. Métriques de Carnot-Carathéodory et quasiisométries des espaces symétriques de rang un. Ann. of Math. (2), 129(1):1–60, 1989.
[96] P. Pansu. Cohomologie $L^p$ des variétés à courbure négative, cas du degré 1. In: PDE and geometry 1988, Rend. Sem. Mat. Tor. Fasc. Spez., 95–120 (1989).
[97] M. Pichot. Conditions simpliciales de rigidité pour les relations de type $\mathrm{II}_1$. C. R. Math. Acad. Sci. Paris, 337 (2003), no. 1, 7–12.
[98] R. Pemantle, Y. Peres. Nonamenable products are not treeable. Isr. J. Math., 118:147–155, 2000.
[99] S. Popa. Correspondences. Prépublication Institutul National Pentru Creatie Stiintifica si Tehnica, (1986).
[100] S. Popa. On a class of type $\mathrm{II}_1$ factors with Betti numbers invariants. Ann. of Math., to appear.
[101] S. Popa. Strong rigidity of $\mathrm{II}_1$ factors arising from malleable actions of w-rigid groups, Part I. math.OA/0305306, Preprint 2004.
[102] S. Popa. Strong rigidity of $\mathrm{II}_1$ factors arising from malleable actions of w-rigid groups, Part II. math.OA/0407103, Preprint 2004.
[103] S. Popa. Some computations of 1-cohomology groups and construction of non orbit equivalent actions. math.OA/0407199, Preprint, September 2004.
[104] R. Sauer. Homological invariants and quasi-isometry. Preprint (2004).
[105] R. Sauer. $L^2$-Betti numbers of discrete measured groupoids. Preprint.
[106] K. Schmidt. Asymptotically invariant sequences and an action of SL(2, Z) on the 2-sphere. Israel J. Math., 37(3):193–208, 1980.
[107] K. Schmidt. Amenability, Kazhdan’s property T, strong ergodicity and invariant means for ergodic group-actions. Ergodic Theory Dynamical Systems, 1, no. 2, 223–236, 1981.
[108] K. Schmidt. Some solved and unsolved problems concerning orbit equivalence of countable group actions. In Proceedings of the conference on ergodic theory and related topics, II (Georgenthal, 1986), pages 171–184, Leipzig, 1987.
[109] Z. Sela. Uniform embeddings of hyperbolic groups in Hilbert spaces. Israel J. Math. 80, no. 1-2, 171–181, 1992.
Measurable Group Theory
423
[110] Y. Shalom. Rigidity, unitary representations of semisimple groups, and fundamental groups of manifolds with rank one transformation group. Ann. of Math. (2), 152, no. 1, 113–182, 2000. [111] Y. Shalom. Harmonic analysis, cohomology, and the large-scale geometry of amenable groups. Acta Math., 192, no. 2, 119–185, 2004. [112] D. Shlyakhtenko. Free Fisher information with respect to a completely positive map and cost of equivalence relations. Comm. Math. Phys. 218, 133–152, 2001. [113] D. Shlyakhtenko. Microstates free entropy and cost of equivalence relations. Duke Math. J., 118, no. 3, 375–425, 2003, [114] I.M. Singer, Automorphisms of finite factors. Amer. J. Math. 77, 117–133, 1955. [115] S. Thomas. On the complexity of the classification problem for torsion-free abelian groups of finite rank. Bull. Symbolic Logic 7 no. 3, 329–344, 2001. [116] S. Thomas. On the complexity of the classification problem for torsion-free abelian groups of rank two. Acta Math. 189, no. 2, 287–305, 2002. [117] S. Thomas. Some applications of superrigidity to Borel equivalence relations. DIMACS Ser. Discrete Math. Theoret. Comput. Sci. 58, 129–134, 2002. [118] S. Thomas. The classification problem for torsion-free abelian groups of finite rank. J. Amer. Math. Soc. 16, no. 1, 233–258, 2003. [119] S. Thomas. Superrigidity and countable Borel equivalence relations. Ann. Pure Appl. Logic 120, no. 1-3, 237–262, 2003. [120] M. Takesaki. Theory of operator algebras I. Springer-Verlag, New York 1979. [121] A. Tornquist. Orbit equivalence and actions of Fn . preprint, August 2004. [122] A. Valette. Group pairs with property (T), from arithmetic lattices. Geom. Ded., to appear. [123] A.M. Vershik. Strange factor representations of type II1 and pairs of dual dynamical systems. Mosc. Math. J. 3, no. 4, 1441–1457, 2003. [124] R.J. Zimmer. Strong rigidity for ergodic actions of semisimple Lie groups. Ann. of Math. (2), 112, no. 3, 511–529. 1980. [125] R.J. Zimmer. 
Ergodic actions of semisimple groups and product relations. Ann. of Math. 118, no. 1, 9–19, 1983. [126] R.J. Zimmer. Ergodic theory and semisimple groups. Birkh¨ auser Verlag, Basel, 1984. [127] R.J. Zimmer. Actions of semisimple groups and discrete subgroups. Proceedings of the ICM Vol. 1, 2 (Berkeley, Calif., 1986), 1247–1258, Amer. Math. Soc., Providence, RI, 1987. [128] R.J. Zimmer. Groups generating transversals to semisimple Lie group actions. Israel J. Math., 73(2):151–159, 1991. Yehuda Shalom School of Mathematical Sciences Tel-Aviv University Tel-Aviv 69978 Israel e-mail:
[email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
Some Mathematical Problems of Neural Networks Theory

M. Shcherbina

Abstract. We discuss some problems of the dynamics of neural networks, in particular the rigorous results on the critical capacity of the Hopfield model, the cavity method for spin glasses and the rigorous solution of the Gardner problem.
1. Critical capacity of neural network models

The spin glass and neural network theories are of considerable importance and interest for a number of branches of theoretical and mathematical physics (see [MPV] and references therein). Among the many topics of interest, the analysis of different models of neural network dynamics is one of the most important because of its numerous links with computer applications, in particular with models of associative memory.

To discuss these models let us start from a simple example. Suppose that we have $p$ chosen "patterns" and we want to teach a computer to recognize them when they are slightly modified. For example, these patterns can be hand-written letters, sequences of sounds ("words"), pictures of people, etc. Let us map these patterns into sequences of $\pm 1$ of length $N$ (usually $N$ is very big): $\xi^{(\mu)} = (\xi_1^{(\mu)}, \dots, \xi_N^{(\mu)})$, $\xi_i^{(\mu)} = \pm 1$, $\mu = 1, \dots, p$. Now consider the set $\Sigma_N = \{\sigma \in \mathbb{R}^N,\ \sigma_i = \pm 1\}$ of all possible sequences of $\pm 1$ of length $N$ with the usual distance:

$$\|\sigma - \sigma'\|^2 \equiv \sum_{i=1}^{N} (\sigma_i - \sigma'_i)^2.$$

Consider also some initial configuration $\sigma(0) = (\sigma_1(0), \dots, \sigma_N(0)) \in \Sigma_N$ which is close enough to one of the chosen patterns, e.g., to $\xi^{(1)}$:

$$\|\sigma(0) - \xi^{(1)}\|^2 \le \varepsilon_0 N. \tag{1.1}$$
Our goal is to introduce some sequential dynamics on $\Sigma_N$ in such a way that, starting from any $\sigma(0)$ satisfying the above condition, we eventually arrive at $\xi^{(1)}$. Moreover, we want our dynamics to have the same property around each of the chosen patterns.
One of the most popular ways to introduce the sequential dynamics on $\Sigma_N$ is the following:

$$\sigma_k(t+1) = \mathrm{sign}\Big\{ \sum_{j=1,\, j \ne k}^{N} J_{kj}\, \sigma_j(t) \Big\}, \tag{1.2}$$
where the matrix of interactions $\{J_{ij}\}$ (not necessarily symmetric) depends on the concrete model. In the mathematical models, $\{\xi^{(\mu)}\}_{\mu=1}^{p}$ are usually chosen to be i.i.d. random vectors with i.i.d. components $\xi_i^{(\mu)}$ ($i = 1, \dots, N$), assuming values $\pm 1$ with $E\{\xi_i^{(\mu)}\} = 0$. Here and below we denote by $E\{\dots\}$ the averaging with respect to all random variables of the problem. Now the problem of storing $p$ chosen patterns is transformed into the following mathematical problem:

Problem 1.1. To introduce interactions $\{J_{ij}\}_{i,j=1}^{N}$ in such a way that chosen random independent vectors $\{\xi^{(\mu)}\}_{\mu=1}^{p}$ (patterns) with i.i.d. components $\xi_i^{(\mu)} = \pm 1$ are the fixed points of the dynamics (1.2).
To analyze the dynamics (1.2) it is usually useful to consider also the energy function (the Hamiltonian)

$$H(\sigma) = -\frac{1}{2} \sum_{j \ne k}^{N} J_{jk}\, \sigma_j \sigma_k.$$
It is easily seen that the function $H(\sigma(t))$ does not increase in the process of evolution. Thus, the dynamics of the model depends on the "energy landscape" of the function $H(\sigma)$, and the local minima of this function are the fixed points of the dynamics. Hopfield [H] was one of the first to propose a possible form of the matrix $J_{jk}$. He introduced the matrix of the form

$$J_{jk} = \frac{1}{N} \sum_{\mu=1}^{p} \xi_j^{(\mu)} \xi_k^{(\mu)}, \tag{1.3}$$
where $\xi^{(\mu)}$ ($\mu = 1, \dots, p$) are the chosen patterns. It is easy to check that if $p$ is finite and $N \to \infty$, then with probability 1, starting from any point close enough to any of the patterns $\xi^{(\mu)}$ (see condition (1.1)), the dynamics (1.2) will stop at the point $\xi^{(\mu)}$. The next natural question is whether $p$ can increase together with $N$ so that our dynamical system keeps the same property as $N, p \to \infty$. How big can $p$ be compared with $N$? This problem was solved by McEliece et al. [McEPRV]. They proved that if $p < N/(4 \log N)$, then $\{\xi^{(\mu)}\}_{\mu=1}^{p}$ are the fixed points of the dynamics (1.2) for the Hopfield model. If $p \gg N/\log N$, then $\{\xi^{(\mu)}\}_{\mu=1}^{p}$ are not fixed points of (1.2).
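The retrieval property described above can be tried out numerically. The following is a minimal sketch, not taken from the paper: it builds the Hopfield matrix (1.3), runs the sequential dynamics (1.2) from a corrupted copy of the first pattern and records the energy after every sweep. The function names, the sizes N = 200 and p = 3, the 10% corruption level and the tie-breaking rule sign(0) = +1 are all assumptions made for illustration.

```python
import random

def hebbian_matrix(patterns):
    """Hopfield matrix (1.3): J_jk = (1/N) sum_mu xi_j^mu xi_k^mu, with zero diagonal."""
    n = len(patterns[0])
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j + 1, n):
            J[j][k] = J[k][j] = sum(xi[j] * xi[k] for xi in patterns) / n
    return J

def energy(J, sigma):
    """H(sigma) = -(1/2) sum_{j != k} J_jk sigma_j sigma_k."""
    n = len(sigma)
    return -0.5 * sum(J[j][k] * sigma[j] * sigma[k]
                      for j in range(n) for k in range(n) if j != k)

def sequential_sweep(J, sigma):
    """One pass of the dynamics (1.2) over k = 1..N; returns the number of flipped spins."""
    flips = 0
    n = len(sigma)
    for k in range(n):
        field = sum(J[k][j] * sigma[j] for j in range(n) if j != k)
        new = 1 if field >= 0 else -1  # assumption: sign(0) is taken to be +1
        if new != sigma[k]:
            sigma[k] = new
            flips += 1
    return flips

def retrieve(J, sigma0, max_sweeps=50):
    """Iterate sweeps until no spin flips; return the final state and the energy history."""
    sigma = list(sigma0)
    energies = [energy(J, sigma)]
    for _ in range(max_sweeps):
        flips = sequential_sweep(J, sigma)
        energies.append(energy(J, sigma))
        if flips == 0:
            break
    return sigma, energies

random.seed(0)
N, p = 200, 3
patterns = [[random.choice((-1, 1)) for _ in range(N)] for _ in range(p)]
J = hebbian_matrix(patterns)

# Corrupt 10% of the first pattern, then let the dynamics run.
noisy = list(patterns[0])
for i in random.sample(range(N), N // 10):
    noisy[i] = -noisy[i]

final, energies = retrieve(J, noisy)
overlap = sum(a * b for a, b in zip(final, patterns[0])) / N
print("overlap with the first pattern:", overlap)
print("energy after each sweep:", energies)
```

Because this matrix is symmetric with zero diagonal, every single-spin update changes the energy by a non-positive amount, so the recorded energies form a non-increasing sequence.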
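Let us transform Problem 1.1 a bit. We assume now that the fixed points of the dynamics do not coincide exactly with $\xi^{(\mu)}$ but are situated in such small neighborhoods of $\xi^{(\mu)}$ that we can still distinguish different patterns. Numerical computations show that such a picture is observed if $p/N \to \alpha$ as $p, N \to \infty$ and $\alpha < \alpha_c$, $\alpha_c \sim 0.138\dots$. For these $\alpha$ the dynamics (1.2), starting from any initial point $\sigma(0)$ close enough to one of the patterns $\xi^{(\mu)}$, converges quickly to some fixed point which is very close to $\xi^{(\mu)}$. And if $\alpha > \alpha_c$ the dynamics (1.2) becomes chaotic.

The first rigorous result for the Hopfield model with $p/N \to \alpha$ as $p, N \to \infty$ was obtained by Newman [N]. He proved that for $\alpha \le 0.056\dots$ an "energy barrier" exists with probability 1 around every pattern $\xi^{(\mu)}$, i.e., there exist some positive numbers $\delta < 1/2$ and $\varepsilon$ such that for any $\sigma$ belonging to

$$\Omega_\delta^{\mu} \equiv \{\sigma : \|\sigma - \xi^{(\mu)}\|^2 = 4[\delta N]\}$$

the inequality $H(\sigma) - H(\xi^{(\mu)}) \ge \varepsilon N$ holds.

To illustrate the methods of the field let us explain in a few lines the idea of the proof. We denote

$$A = \Big\{ \min_{\sigma \in \Omega_\delta^{1}} \big( H(\sigma) - H(\xi^{(1)}) \big) \ge \varepsilon N \Big\}$$

and show that the complementary event $\bar{A}$ satisfies

$$\mathrm{Prob}\{\bar{A}\} \le e^{-NC}. \tag{1.4}$$

It is easy to see that

$$\begin{aligned}
\mathrm{Prob}\{\bar{A}\} &\le \mathrm{Prob}\Big\{ \bigcup_{\sigma \in \Omega_\delta^{1}} \{ H(\sigma) - H(\xi^{(1)}) \le \varepsilon N \} \Big\}
\le \sum_{\sigma \in \Omega_\delta^{1}} \mathrm{Prob}\{ H(\sigma) - H(\xi^{(1)}) \le \varepsilon N \} \\
&= C_N^{[\delta N]}\, \mathrm{Prob}\{ H(\sigma^{(1,\delta)}) - H(\xi^{(1)}) \le \varepsilon N \},
\end{aligned}$$

where $\sigma^{(1,\delta)}$ is some fixed point in $\Omega_\delta^{1}$, e.g.,

$$\sigma_k^{(1,\delta)} = -\xi_k^{(1)} \quad (k = 1, \dots, [\delta N]), \qquad \sigma_k^{(1,\delta)} = \xi_k^{(1)} \quad (k = 1 + [\delta N], \dots, N).$$

According to the Chebyshev inequality

$$\begin{aligned}
\mathrm{Prob}\{ H(\sigma^{(1,\delta)}) - H(\xi^{(1)}) \le \varepsilon N \}
&\le \min_{\lambda > 0} E \exp\Big\{ \lambda \big[ -H(\sigma^{(1,\delta)}) + H(\xi^{(1)}) + \varepsilon N \big] \Big\} \\
&= \min_{\lambda > 0} E \exp\Big\{ \lambda \sum_{\mu=1}^{p} \Big( N^{-1/2} \sum_{i=1}^{[\delta N]} \xi_i^{(\mu)} \Big) \Big( N^{-1/2} \sum_{j=[\delta N]+1}^{N} \xi_j^{(\mu)} \Big) + \lambda \varepsilon N \Big\} \\
&= \exp\{ -N (\phi(\delta, \varepsilon) + o(1)) \}.
\end{aligned}$$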
It is evident that if for some $\alpha$ there exist $\delta, \varepsilon$ such that

$$\phi(\delta, \varepsilon) > N^{-1} \log C_N^{[\delta N]} \sim -\delta \log \delta - (1 - \delta) \log(1 - \delta),$$

then the inequality (1.4) holds and so, according to the Borel-Cantelli lemma, the energy barrier exists with probability 1. One can show that if such a "barrier" exists, then inside each open ball

$$B_\delta^{\mu} \equiv \{\sigma : \|\sigma - \xi^{(\mu)}\|^2 < 2[\delta N]\} \qquad (\mu = 1, \dots, p)$$
there exists a point of local minimum of the function $H(\sigma)$ which, as mentioned above, is a fixed point of the dynamics (1.2). This result was improved by Loukianova [L]. Using a similar method, she proved the existence of the energy barriers for $\alpha \le 0.071$, so $\alpha_c \ge 0.071\dots$. Then this result was improved a little by Talagrand [T1].

In the paper [FST] a novel approach to the study of $\alpha_c$ was introduced. This approach is based upon the analysis of the Fourier transform of the joint distribution of the effective fields

$$\tilde{x}_k \equiv N^{-1} \sum_{\mu=1}^{p} \sum_{j=1}^{N} \xi_k^{(\mu)} \xi_j^{(\mu)}. \tag{1.5}$$
It enables us to obtain a new bound for the critical capacity ($\alpha_c \ge 0.113\dots$) and also allows us to find the asymptotic (for small $\alpha$) behavior of the distance between the fixed points and the nearest patterns. The idea of the method is to study the probability that the point $\sigma^{(1,\delta)} \in \Omega_\delta^{1}$ is a local minimum of the function $H(\sigma)$ on $\Omega_\delta^{1}$. This means that

$$H(\sigma^{(1,\delta)}) - H(\sigma^{(1,\delta,j,k)}) \le 0$$

for any $\sigma^{(1,\delta,j,k)} \in \Omega_\delta^{1}$ which is a "nearest neighbor" of $\sigma^{(1,\delta)}$ in $\Omega_\delta^{1}$, where

$$\sigma_i^{(1,\delta,k,j)} = \begin{cases} \sigma_i^{(1,\delta)}, & i \ne j, k, \\ -\sigma_i^{(1,\delta)}, & \text{otherwise} \end{cases}$$

($k = 1, \dots, [\delta N]$ and $j = [\delta N] + 1, \dots, N$). After some transformations we are faced with the problem of finding

$$P_N(q, q'; \alpha, \delta) \equiv E\Big\{ \prod_{k=1}^{[\delta N]} \theta(\tilde{x}_k - a_1) \prod_{k=1+[\delta N]}^{N} \theta(\tilde{x}_k - a_2) \Big\},$$

where $\tilde{x}_k$ are the effective fields defined by (1.5), $\theta(x) = \begin{cases} 1, & x \ge 0, \\ 0, & x < 0 \end{cases}$ is the Heaviside function, and

$$a_1 \equiv \alpha + 1 - 2\delta + q, \qquad a_2 \equiv \alpha - 1 + 2\delta + q'.$$

The following theorem proven in [FST] gives us the motivation to study the asymptotic behavior as $N \to \infty$ of the function $P_N(q, q'; \alpha, \delta)$:
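Theorem 1.2. Denote by $A$ the event that there exist some $\delta, \varepsilon > 0$ and some point $\sigma^0$ with $\|\sigma^0 - \xi^{(1)}\|^2 < 4\delta N$ such that

$$\min_{\sigma \in \Omega_\delta^{1}} H(\sigma) - H(\sigma^0) > \varepsilon^2 N.$$

Then if for some $\alpha$ and $\delta$

$$\max_{q > 0} \limsup_{N \to \infty} \frac{1}{N} \Big[ \log P_N(q, -q; \alpha, \delta) + \log C_N^{[\delta N]} \Big] < 0, \tag{1.6}$$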
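then there exists some $C(\alpha) > 0$ such that $\mathrm{Prob}\{\bar{A}\} \le e^{-N C(\alpha)}$, where $\bar{A}$ is the complement of $A$.

The main technical result of the paper [FST], which describes the asymptotic behavior of the function $P_N(q, q'; \alpha, \delta)$ as $N \to \infty$, is given by Theorem 1.3 below. We would like to stress that the proofs for the cases of Gaussian $\xi_i^{(\mu)}$ and non-Gaussian $\xi_i^{(\mu)}$ are very different. In the Gaussian case the computations are not very simple, but they are straightforward. We express the function $P_N(q, q'; \alpha, \delta)$ in terms of the joint Fourier transform $F(\zeta)$ ($\zeta = (\zeta_1, \dots, \zeta_N)$) of the distribution of the effective fields $\tilde{x}_k$ (1.5):

$$F(\zeta) \equiv (2\pi)^{-N/2}\, E \exp\Big\{ i \sum_{k=1}^{N} \tilde{x}_k \zeta_k \Big\} = (2\pi)^{-N/2}\, E\, e^{i(\tilde{u}, \tilde{v})},$$

where we denote $\tilde{u} = (\tilde{u}^1, \dots, \tilde{u}^p)$, $\tilde{v} = (\tilde{v}^1, \dots, \tilde{v}^p)$ with

$$\tilde{u}^{\mu} \equiv N^{-1/2} \sum_{k=1}^{N} \xi_k^{(\mu)} \zeta_k, \qquad \tilde{v}^{\mu} \equiv N^{-1/2} \sum_{j=1}^{N} \xi_j^{(\mu)}.$$

It is easy to see that

$$e^{i \tilde{u}^{\mu} \tilde{v}^{\mu}} = (2\pi)^{-1} \int du^{\mu}\, dv^{\mu}\, e^{i(u^{\mu} \tilde{u}^{\mu} + v^{\mu} \tilde{v}^{\mu})}\, e^{-i u^{\mu} v^{\mu}}.$$

Thus, using the inverse Fourier transform for the function $F(\zeta)$, we get

$$\begin{aligned}
P_N(q, q'; \alpha, \delta) &= \frac{1}{(2\pi)^{N/2}} \int \prod_{k=1}^{N} \theta(x_k - a_k)\, dx_k \int d\zeta\, \exp\Big\{ -i \sum_{k=1}^{N} x_k \zeta_k \Big\} F(\zeta) \\
&= \frac{1}{(2\pi)^{N+p}} \int e^{-i(u, v)}\, du\, dv \prod_{k=1}^{N} \int dx_k\, \theta(x_k - a_k) \\
&\quad \times \int d\zeta_k\, E \exp\{ -i \zeta_k x_k + i(\tilde{u}, u) + i(\tilde{v}, v) \},
\end{aligned}$$

where we denote for simplicity

$$a_k = \begin{cases} a_1, & k \le [\delta N], \\ a_2, & k > [\delta N]. \end{cases}$$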
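But, since $\xi_i^{(\mu)}$ are independent normal variables, the average over the components $\xi_k^{(\mu)}$ ($\mu = 1, \dots, p$) entering the $k$-th factor is

$$\begin{aligned}
E \exp\{ -i \zeta_k x_k + i(\tilde{u}, u) + i(\tilde{v}, v) \}
&= e^{-i \zeta_k x_k} \int \Big( \prod_{\mu=1}^{p} \frac{e^{-(\xi_k^{(\mu)})^2/2}}{\sqrt{2\pi}}\, d\xi_k^{(\mu)} \Big) \exp\Big\{ i \Big( N^{-1/2} \sum_{\mu=1}^{p} u^{\mu} \xi_k^{(\mu)} \zeta_k + N^{-1/2} \sum_{\mu=1}^{p} v^{\mu} \xi_k^{(\mu)} \Big) \Big\} \\
&= e^{-i \zeta_k x_k} \prod_{\mu=1}^{p} \exp\Big\{ -\frac{(u^{\mu} \zeta_k + v^{\mu})^2}{2N} \Big\}.
\end{aligned} \tag{1.7}$$

Therefore

$$P_N(q, q'; \alpha, \delta) = \frac{1}{(2\pi)^{N/2 + p}} \int du\, dv\, \exp\Big\{ -i(u, v) - \frac{1}{2}(v, v) \Big\} \prod_{k=1}^{N} \int dx_k\, \frac{\theta(x_k - a_k)}{U} \exp\Big\{ \frac{(i x_k + N^{-1}(u, v))^2}{2U^2} \Big\},$$

where $U \equiv (u, u)^{1/2} N^{-1/2}$. Integrating with respect to $x_k$, we get

$$P_N(q, q'; \alpha, \delta) = (2\pi)^{-p} \int du\, dv\, \exp\Big\{ -i(u, v) - \frac{1}{2}(v, v) \Big\} \prod_{k=1}^{N} H\Big( \frac{a_k - i N^{-1}(u, v)}{U} \Big).$$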
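Now let us fix $u$ and change variables in the integral with respect to $v$:

$$v_1 = \frac{1}{\sqrt{N}} (e_1, v), \quad v_2 = (e_2, v), \quad \dots, \quad v_p = (e_p, v),$$

where $\{e_i\}_{i=1}^{p}$ is an orthonormal system of vectors in $\mathbb{R}^p$ such that $e_1^{\mu} = (U \sqrt{N})^{-1} u^{\mu}$. Then, integrating with respect to $v_2, \dots, v_p$, we obtain

$$\begin{aligned}
P_N(q, q'; \alpha, \delta) = (2\pi)^{-(p-1)/2} \int \Big( \prod_{\mu=1}^{p} du^{\mu} \Big) \int dv_1\, \exp\Big\{ & -i N U v_1 - \frac{N}{2} (v_1)^2 \\
& + [N\delta] \log H\Big( \frac{a_1}{U} - i v_1 \Big) + (N - [N\delta]) \log H\Big( \frac{a_2}{U} - i v_1 \Big) \Big\}.
\end{aligned}$$

Using the spherical coordinates in the integral with respect to $u$ and integrating with respect to the angular variables, we get

$$\begin{aligned}
P_N(q, q'; \alpha, \delta) = \Gamma(p) \int_0^{\infty} dU \int dv_1\, \exp\Big\{ & (p - 1) \log U - i N U v_1 - \frac{N}{2} (v_1)^2 \\
& + [N\delta] \log H\Big( \frac{a_1}{U} - i v_1 \Big) + (N - [N\delta]) \log H\Big( \frac{a_2}{U} - i v_1 \Big) \Big\}.
\end{aligned}$$

Then, using the saddle point method, we obtain the asymptotic expression (1.8).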
The difference of the non-Gaussian case from the Gaussian one is that we have, in (1.7),

$$\prod_{\mu=1}^{p} \cos \frac{u^{\mu} \zeta_k + v^{\mu}}{\sqrt{N}}$$

instead of

$$\prod_{\mu=1}^{p} \exp\Big\{ -\frac{(u^{\mu} \zeta_k + v^{\mu})^2}{2N} \Big\}.$$

To replace the former term by the latter one we have to estimate the difference between them for different $u$, $v$ and $\zeta$. Besides, since most of the integrals do not converge absolutely, estimates of the absolute values (unlike in Newman's work) do not work. This produces so many technical difficulties that in the non-Gaussian case we are able to prove only the upper bound for $P_N(q, q'; \alpha, \delta)$. There are still some doubts that the true asymptotics of $P_N(q, q'; \alpha, \delta)$ for all values of the parameters $q, q', \alpha, \delta$ coincides with that for the Gaussian case. But the remarkable fact is that in the region of parameters which we need to study in order to apply Theorem 1.2 we can prove that the upper bound for $P_N(q, q'; \alpha, \delta)$ coincides with the asymptotic expression for $P_N(q, q'; \alpha, \delta)$ in the case of normal $\xi_i^{(\mu)}$.

Theorem 1.3. For the Gaussian i.i.d. $\xi_i^{(\mu)}$

$$\lim_{N \to \infty} N^{-1} \log P_N(q, q'; \alpha, \delta) = \max_{U > 0} \min_{V} F_0(U, V; \alpha, \delta, q, q') - \frac{\alpha}{2} \log \alpha + \frac{\alpha}{2}, \tag{1.8}$$

where

$$F_0(U, V; \alpha, \delta, q, q') \equiv \delta \log H\Big( \frac{a_1^*}{U} - V \Big) + (1 - \delta) \log H\Big( \frac{a_2^*}{U} - V \Big) - UV + \frac{1}{2} V^2 + \alpha \log U, \qquad H(x) \equiv \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt.$$

For the Bernoulli i.i.d. $\xi_i^{(\mu)}$:

$$\limsup_{N \to \infty} N^{-1} \log P_N(q, q'; \alpha, \delta) \le \max_{U > 0} \min_{V} F_0^{(D)}(U, V; \alpha, \delta, q, q') - \frac{\alpha}{2} \log \alpha + \frac{\alpha}{2},$$

where $F_0^{(D)}(U, V; \alpha, \delta, q, q') \ge F_0(U, V; \alpha, \delta, q, q')$ (see [FST] for the exact expression of $F_0^{(D)}(U, V; \alpha, \delta, q, q')$).

As was already mentioned above, in the region of interest

$$F_0^{(D)}(U, V; \alpha, \delta, q, q') = F_0(U, V; \alpha, \delta, q, q').$$

Remark 1.4. Numerical calculations show that condition (1.6) is fulfilled for any $\alpha \le \alpha_c^* = 0.113\dots$
The result of Theorem 1.3 also enables us to obtain a rather simple upper bound for the probability to have a fixed point of the dynamics (1.2) at the distance $\delta$ from the first pattern:

Theorem 1.5. $P_N^*(\delta, \alpha)$, the probability to have a fixed point of the dynamics of the Hopfield model at the distance $\delta$ from the first pattern, has an upper bound of the form

$$P_N^*(\delta, \alpha) \le \exp\Big\{ N \Big[ -\delta \log \delta - (1 - \delta) \log(1 - \delta) + \delta \log H\Big( \frac{1 - 2\delta}{\sqrt{\alpha}} \Big) + (1 - \delta) \log H\Big( -\frac{1 - 2\delta}{\sqrt{\alpha}} \Big) + O(e^{-1/\alpha}) + o(\delta \log \alpha^{-1}) + o(1) \Big] \Big\}.$$

It is shown in [FST] that this bound becomes asymptotically exact for small $\alpha$ ($\alpha \to 0$). Moreover, Theorem 1.5 implies a very important corollary:

Corollary 1.6. It follows from Theorem 1.5 that $\delta_c(\alpha)$, the minimal $\delta$ for which $P_N^*(\delta, \alpha)$ does not decay exponentially in $N$ as $N \to \infty$, has the asymptotic behavior

$$\delta_c(\alpha) \sim \frac{\sqrt{\alpha}}{\sqrt{2\pi}}\, e^{-1/(2\alpha)}.$$

This result coincides with the formula found by Amit et al. [AGS] with replica calculations.

2. The Hopfield model of spin glasses

Now we discuss another method to study the Hopfield model, the so-called statistical mechanics approach. This approach is based on the observation that if we take some positive parameter $\beta$ (usually $\beta$ is called the inverse temperature) and introduce the Gibbs measure on $\Sigma_N = \{\sigma \in \mathbb{R}^N,\ \sigma_i = \pm 1\}$ of the form

$$\langle \dots \rangle = Z_N^{-1} \sum_{\sigma \in \Sigma_N} (\dots)\, e^{-\beta H(\sigma)}, \qquad Z_N = \sum_{\sigma \in \Sigma_N} e^{-\beta H(\sigma)},$$

then this measure is an invariant measure of the so-called Glauber dynamics for fixed $\beta$. The Glauber dynamics is a special kind of stochastic dynamics, and the neural network dynamics (1.2) is the limiting case of the Glauber dynamics for $\beta \to \infty$. So the idea is to study the Gibbs measure for fixed $\beta$ and then draw some conclusions about its behavior as $\beta \to \infty$. The key role in studies of the Gibbs measure is played by the free energy

$$f_N(\beta) = -\frac{1}{\beta N} \log Z_N,$$

because the most important characteristics of the Gibbs measure can be obtained as derivatives of the free energy with respect to the different parameters.
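For very small systems the partition function and the free energy can be computed by direct enumeration, which makes the limit $\beta \to \infty$ concrete. The sketch below is an illustration only: it assumes the Hopfield energy $H(\sigma) = -\frac{1}{2}\sum_{j \ne k} J_{jk}\sigma_j\sigma_k$ with the matrix (1.3), and the names `hopfield_energy` and `free_energy`, the sizes N = 10, p = 2 and the values of $\beta$ are hypothetical choices. Since $e^{-\beta \min H} \le Z_N \le 2^N e^{-\beta \min H}$, the free energy must lie within $\log 2/\beta$ below the ground state energy per spin.

```python
import itertools
import math
import random

def hopfield_energy(patterns, sigma):
    """H(sigma) = -(1/2) sum_{j != k} J_jk s_j s_k with the Hopfield matrix (1.3).

    Uses the identity sum_{j != k} J_jk s_j s_k = (1/N) sum_mu [(xi^mu . sigma)^2 - N].
    """
    n = len(sigma)
    total = 0.0
    for xi in patterns:
        overlap = sum(x * s for x, s in zip(xi, sigma))
        total += overlap * overlap - n
    return -total / (2.0 * n)

def free_energy(patterns, beta):
    """Return (f_N(beta), min_sigma H(sigma) / N) by enumerating all 2^N configurations."""
    n = len(patterns[0])
    energies = [hopfield_energy(patterns, s)
                for s in itertools.product((-1, 1), repeat=n)]
    e_min = min(energies)
    # Shift by e_min before exponentiating so that large beta does not overflow.
    log_z = -beta * e_min + math.log(sum(math.exp(-beta * (e - e_min)) for e in energies))
    return -log_z / (beta * n), e_min / n

random.seed(1)
N, p = 10, 2
patterns = [[random.choice((-1, 1)) for _ in range(N)] for _ in range(p)]
results = {beta: free_energy(patterns, beta) for beta in (1.0, 5.0, 50.0)}
for beta, (f, e0) in results.items():
    print(f"beta={beta}: f_N={f:.6f}, ground state energy per spin={e0:.6f}")
```

Enumeration costs $2^N$ evaluations, so this is only a sanity check for toy sizes; it says nothing about the thermodynamic limit studied in the text.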
Consider the Hopfield model with additional parameters $\tau$, $\varepsilon$ which correspond to some additional terms (fields) in the energy function:

$$H(\sigma) = -\sum_{i,j=1}^{N} J_{ij} \sigma_i \sigma_j + \tau \sum_{i=1}^{N} \xi_i^{(1)} \sigma_i + \varepsilon \sum_{i=1}^{N} h_i \sigma_i, \qquad J_{ij} = \frac{1}{N} \sum_{\mu=1}^{p} \xi_i^{(\mu)} \xi_j^{(\mu)},$$

with $h_i$ i.i.d. normal variables. This model for the case $p = \mathrm{const}$ was introduced initially by Pastur and Figotin [PF] as an exactly solvable model of spin glasses. They showed that the free energy of the Hopfield model with a finite number of patterns in the limit $N \to \infty$ coincides with that of the Curie-Weiss model. This result means, in particular, that the Gibbs measure for finite $\beta$ in the limit $N \to \infty$ is concentrated on some spheres around the patterns $\xi^{(\mu)}$, and the radius of these spheres tends to zero as $\beta \to \infty$. A similar result was obtained by Koch and Piasko [KP] in the case $p \sim \log N$ as $N \to \infty$. And finally in the work [ST1] this result was generalized to the case $p, N \to \infty$, $p/N \to 0$.

The Hopfield model with extensively many patterns ($p, N \to \infty$, $p/N \to \alpha$) was widely discussed in the physical literature. By using so-called replica calculations, which are not rigorous from the mathematical point of view but sometimes very efficient, a lot of results on the Hopfield model were found. But most of them still await a mathematical proof. Let us discuss briefly this method and its results.

2.1. Replica trick. The replica trick was proposed initially by Parisi to study the free energy of another very popular model of spin glasses, the Sherrington-Kirkpatrick model (see [MPV] and references therein). The method is based on the simple observation that

$$E \log Z_N = \lim_{n \to 0} \frac{d}{dn} E Z_N^n.$$

So the idea is to find for $n \in \mathbb{N}$

$$E Z_N^n = \exp\{ N (\phi(n) + o(1)) \}.$$
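The observation underlying the replica trick follows from a one-line formal computation. The sketch below assumes that differentiation in $n$ and the limit $n \to 0$ may be exchanged with the expectation $E$; this is an assumption of the sketch, not a statement from the text.

```latex
\[
Z_N^{\,n} = e^{\,n \log Z_N}
\quad\Longrightarrow\quad
\frac{d}{dn}\, E\{Z_N^{\,n}\} = E\{Z_N^{\,n} \log Z_N\}
\;\longrightarrow\; E\{\log Z_N\}
\qquad (n \to 0).
\]
\[
\text{Equivalently, since } E\{Z_N^{\,n}\} = 1 + n\,E\{\log Z_N\} + o(n):
\qquad
E\{\log Z_N\} = \lim_{n \to 0} \frac{1}{n} \log E\{Z_N^{\,n}\}.
\]
```

The identity itself is elementary; the non-rigorous part of the method is the passage from the values $\phi(n)$ at integer $n$ to the behavior near $n = 0$.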
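Then we construct an analytic function $\phi(\zeta)$ such that $\phi(\zeta)|_{\zeta = n} = \phi(n)$ and find $\phi'(0)$. Then $\lim_{N \to \infty} f_N = -\beta^{-1} \phi'(0)$. This scheme was realized for the Hopfield model by Amit, Gutfreund and Sompolinsky [AGS]. They found that there exists some $\alpha_c(\beta)$ such that for $\alpha < \alpha_c(\beta)$ the order parameter of the problem (the so-called Edwards-Anderson order parameter)

$$q_N = N^{-1} \sum_{i=1}^{N} \langle \sigma_i \rangle^2 \tag{2.1}$$

possesses the self-averaging property (its variance vanishes as $N \to \infty$) and its limiting mean value is the solution of the so-called replica symmetric equations:

$$\begin{aligned}
m^{\nu} &= \int \frac{dz}{\sqrt{2\pi}}\, e^{-z^2/2}\, E\Big\{ \xi_1^{\nu} \tanh \beta \Big( \sqrt{\alpha r}\, z + \sum_{\nu'=1}^{s} (m^{\nu'} + h^{\nu'}) \xi_1^{\nu'} \Big) \Big\}, \\
q &= \int \frac{dz}{\sqrt{2\pi}}\, e^{-z^2/2}\, E\Big\{ \tanh^2 \beta \Big( \sqrt{\alpha r}\, z + \sum_{\nu'=1}^{s} (m^{\nu'} + h^{\nu'}) \xi_1^{\nu'} \Big) \Big\}, \\
r &= \frac{q}{(1 - \beta(1 - q))^2},
\end{aligned} \tag{2.2}$$

where

$$q = \lim_{N \to \infty} q_N, \quad m^{\nu} = \lim_{N \to \infty} m_N^{\nu}, \quad r = \lim_{N \to \infty} r_N, \qquad m_N^{\nu} = N^{-1} \sum_{i=1}^{N} \xi_i^{(\nu)} \langle \sigma_i \rangle, \quad r_N = \sum_{\mu=s+1}^{p} (m_N^{\mu})^2. \tag{2.3}$$

And the mean value of the free energy has the limit

$$\begin{aligned}
f = \min_{m^1, \dots, m^s, r, q} \Big\{ & \frac{1}{2} \sum_{\nu=1}^{s} (m^{\nu})^2 + \frac{\alpha}{2} + \frac{\alpha \beta r (1 - q)}{2} + \frac{\alpha}{2\beta} \Big[ \ln(1 - \beta(1 - q)) - \frac{\beta q}{1 - \beta(1 - q)} \Big] \\
& - \frac{1}{\beta} \int \frac{dz}{\sqrt{2\pi}}\, e^{-z^2/2}\, E\Big\{ \ln\Big[ 2 \cosh \beta \Big( \sqrt{\alpha r}\, z + \sum_{\nu=1}^{s} (m^{\nu} + h^{\nu}) \xi_1^{\nu} \Big) \Big] \Big\} \Big\}.
\end{aligned} \tag{2.4}$$

It is easy to check that equations (2.2) can be obtained as the extremum conditions for the expression on the right-hand side of (2.4). For $\alpha > \alpha_c(\beta)$ the Edwards-Anderson order parameter is a random variable even in the limit $N \to \infty$, and its distribution is a solution of some rather complicated nonlinear partial differential equation of the second order. The most important result here for the neural network dynamics is that for $\alpha < \alpha_c(\beta)$ the Gibbs measure is concentrated around the patterns $\xi^{(\mu)}$. And it was shown that $\alpha_c(\beta) \to 0.138\dots$ as $\beta \to \infty$.

Till now there are not so many rigorous results for the Hopfield model with extensively many patterns ($p, N \to \infty$, $p/N \to \alpha$). The self-averaging property of the free energy, i.e., the fact that the variance of the free energy vanishes as $N \to \infty$,

$$\lim_{N \to \infty} E\{ (f_N - E f_N)^2 \} = 0,$$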
was proven in [ST1]. This result was generalized by Bovier et al. [BGP] who proved the large deviation type bounds for (fN − EfN ). The most interesting rigorous results on the Hopfield model of spin glass (see [PST1], [PST2], [BG], [T1]) were obtained by using some version of the cavity method, which we are going to discuss now.
2.2. Cavity method. In the spin glass theory this method is used mainly to study the replica symmetric region (for $\alpha < \alpha_c(\beta)$). Recall the simple identity

$$\langle \sigma_i \rangle = \Big\langle \tanh \beta \Big( \sum_{j \ne i}^{N} J_{ij} \sigma_j + \varepsilon h_i \Big) \Big\rangle \tag{2.5}$$

valid for the Ising model ($\sigma_i = \pm 1$) with any interaction $J_{ij}$. The mean field approximation is based on the assumption that the thermodynamic correlations between spins vanish in the macroscopic limit:

$$|\langle \sigma_i \sigma_j \rangle - \langle \sigma_i \rangle \langle \sigma_j \rangle| \to 0, \qquad N \to \infty. \tag{2.6}$$

Then if $J_{ij} \to 0$ as $N \to \infty$, we can replace (2.5) by the relation

$$\langle \sigma_i \rangle = \tanh \beta \Big( \sum_{j \ne i}^{N} J_{ij} \langle \sigma_j \rangle + \varepsilon h_i \Big) + o(1), \tag{2.7}$$

that can be regarded as a system of equations for the "local magnetization" $\langle \sigma_i \rangle$ and leads to the corresponding self-consistent equations for the order parameters of the model. The rigorous version of the cavity method for the spin glass theory was proposed first in [PS], [S] and then adapted to the Hopfield model in [PST1], [PST2]. It was shown there that the vanishing of correlations is equivalent to the self-averaging property of the Edwards-Anderson order parameter

$$E\{ (q_N - E\{q_N\})^2 \} \to 0, \qquad N \to \infty, \tag{2.8}$$

and if for some $\alpha, \beta, \varepsilon, t$ the parameter $q_N$ is self-averaging, then the following analog of (2.7) is valid:

$$\langle \sigma_1 \rangle = \tanh \beta \Big( \sum_{j=2}^{N} J_{1j} \langle \sigma_j \rangle_0 + \varepsilon h_1 \Big) + r_{1,N}, \qquad E\, r_{1,N}^2 \to 0. \tag{2.9}$$

Here $\langle \dots \rangle_0$ is the Gibbs measure corresponding to $H(\sigma)|_{\sigma_1 = 0}$. From the last relation it is straightforward to derive the replica symmetric equations (2.2) for the order parameters. Thus, the key point of the cavity method is the proof of some analog of (2.6). As soon as we establish (2.6) for some model, we can derive some kind of self-consistent equations. There are a few works where (2.6) is obtained for the Hopfield model and then equations (2.2) are derived (see, e.g., [BG] and [T1]). But unfortunately all of them deal with $\alpha \ll 1$, so they cannot be used for the purposes of the neural network dynamics.
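The exact identity that the cavity method starts from can be verified by brute force on a small system. The following sketch is an illustration under stated assumptions, not the paper's computation: it uses the convention $H(\sigma) = -\frac{1}{2}\sum_{j \ne k} J_{jk}\sigma_j\sigma_k - \sum_i h_i \sigma_i$ with a symmetric matrix $J$, so that the local field acting on spin $i$ is $\sum_{j \ne i} J_{ij}\sigma_j + h_i$; the sign and normalization conventions of the fields in the text may differ. The Gaussian couplings, the size N = 8 and $\beta = 0.7$ are arbitrary hypothetical choices.

```python
import itertools
import math
import random

def energy(J, h, s):
    """Assumed convention: H = -(1/2) sum_{j != k} J_jk s_j s_k - sum_j h_j s_j."""
    n = len(s)
    pair = sum(J[j][k] * s[j] * s[k] for j in range(n) for k in range(n) if j != k)
    return -0.5 * pair - sum(h[j] * s[j] for j in range(n))

def check_identity(J, h, beta, i):
    """Return (<s_i>, <tanh(beta * local field at i)>) under the Gibbs measure."""
    n = len(h)
    z = lhs = rhs = 0.0
    for s in itertools.product((-1, 1), repeat=n):
        w = math.exp(-beta * energy(J, h, s))
        local_field = sum(J[i][j] * s[j] for j in range(n) if j != i) + h[i]
        z += w
        lhs += w * s[i]
        rhs += w * math.tanh(beta * local_field)
    return lhs / z, rhs / z

random.seed(2)
N, beta = 8, 0.7
J = [[0.0] * N for _ in range(N)]
for j in range(N):
    for k in range(j + 1, N):
        J[j][k] = J[k][j] = random.gauss(0.0, 1.0) / math.sqrt(N)  # symmetric couplings
h = [random.gauss(0.0, 1.0) for _ in range(N)]

pairs = [check_identity(J, h, beta, i) for i in range(N)]
for i, (a, b) in enumerate(pairs):
    print(f"i={i}: <s_i>={a:+.12f}  <tanh(...)>={b:+.12f}")
```

The two columns agree to rounding error for every $i$ because, conditionally on the other spins, $\sigma_i$ takes the values $\pm 1$ with weights proportional to $e^{\pm\beta \cdot \text{field}}$. The mean field relation, in which the average is moved inside the hyperbolic tangent, is only approximate and is what requires the vanishing of correlations.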
3. The Gardner problem

Now let us come back to the neural network dynamics (1.2) and recall that the main problem here was to introduce an interaction $\{J_{ij}\}_{i,j=1}^{N}$ (not necessarily symmetric) in such a way that some chosen vectors $\{\xi^{(\mu)}\}_{\mu=1}^{p}$ (patterns) are the fixed points of the dynamics (1.2). The choice of the matrix $\{J_{ij}\}_{i,j=1}^{N}$ depends on the concrete model, but one can easily see that multiplication of all coefficients in the same row $\{J_{ij}\}_{j=1}^{N}$ by some positive constant $\lambda_i$ does not change the dynamics (1.2). So it is natural to consider matrices whose rows satisfy some kind of normalization conditions. For the most popular models of neural network dynamics (e.g., for the Hopfield model) these conditions have the form

$$\sum_{j=1,\, j \ne i}^{N} J_{ij}^2 = NR \qquad (i = 1, \dots, N), \tag{3.1}$$

where $R$ is some fixed number which could be taken equal to 1. It is also obvious that if $\xi^{(\mu)}$ are fixed points of (1.2), then the interaction matrix $\{J_{ij}\}$ also satisfies the conditions

$$\xi_i^{(\mu)} \sum_{j=1,\, j \ne i}^{N} J_{ij} \xi_j^{(\mu)} > 0 \qquad (i = 1, \dots, N), \quad (\mu = 1, \dots, p). \tag{3.2}$$

Sometimes condition (3.2) is not sufficient to have $\xi^{(\mu)}$ as the end points of the dynamics. To have some "basin of attraction" (that is, some neighborhood of $\xi^{(\mu)}$ starting from which we arrive at $\xi^{(\mu)}$ for sure) one should introduce some positive parameter $k$ and impose the conditions

$$\xi_i^{(\mu)} \sum_{j=1,\, j \ne i}^{N} J_{ij} \xi_j^{(\mu)} > k \qquad (i = 1, \dots, N), \quad (\mu = 1, \dots, p). \tag{3.3}$$

Gardner (see [G]) was the first to solve a kind of inverse problem.

Problem 3.1. For which $\alpha = p/N$ does an interaction $\{J_{ij}\}$ satisfying (3.1) and (3.3) exist? What is the ratio of the total Lebesgue measure of the interactions satisfying (3.3) and (3.1) to the measure of all interactions satisfying (3.1) (she called this quantity the typical fractional volume of the interactions)?

Since all conditions (3.1) and (3.3) are factorized with respect to $i$, this problem after a simple transformation can be replaced by the following one. For the system of $p \sim \alpha N$ i.i.d. random patterns $\{\xi^{(\mu)}\}_{\mu=1}^{p}$ with i.i.d. components $\xi_i^{(\mu)}$ ($i = 1, \dots, N$) assuming values $\pm 1$ with probability $\frac{1}{2}$, consider

$$\Theta_{N,p}(k) = |S_N|^{-1} \int_{(J, J) = N} dJ \prod_{\mu=1}^{p} \theta\big( N^{-1/2} (\xi^{(\mu)}, J) - k \big) \tag{3.4}$$
($\theta(x)$ is the Heaviside function), where $|S_N|$ is the Lebesgue measure of the $N$-dimensional sphere of radius $N^{1/2}$. Then the question of interest is the behavior of

$$\frac{1}{N} \log \Theta_{N,p}(k)$$

in the limit $N, p \to \infty$, $p/N \to \alpha$.

This problem has a very nice geometrical interpretation. For a very large integer $N$ consider the $N$-dimensional sphere $S_N$ of radius $N^{1/2}$ centered at the origin and $p = \alpha N$ independent random half-spaces $\Pi_\mu$ ($\mu = 1, \dots, p$). Let $\Pi_\mu = \{J \in \mathbb{R}^N : N^{-1/2} (\xi^{(\mu)}, J) \ge k\}$, where $\xi^{(\mu)}$ are i.i.d. random vectors with i.i.d. Bernoulli components $\xi_j^{(\mu)}$ and $k$ is the distance from $\Pi_\mu$ to the origin. The problem is to find the maximum value of $\alpha$ such that the volume of the intersection of $S_N$ with $\cap \Pi_\mu$ is not "too small" compared with $|S_N|$, i.e., their ratio is of the order $e^{-NC}$ with some bounded $C$. Let us remark here that since

$$|S_N| \sim \pi^{1/2} \Big( \frac{\pi e}{2} \Big)^{N/2}, \qquad \text{as } N \to \infty,$$
it is natural to expect that the "normal behavior" of our ratio is just $e^{-NC}$, and so the words "too small" mean that the ratio tends to zero faster than $e^{-NC}$ with any positive $C$.

Gardner [G] solved this problem by using the replica trick described in the previous section. As mentioned above, this method is far from being rigorous from the mathematical point of view, but it plays a very important role in the physical literature and gives results which usually are correct. Using this method Gardner showed that for any $\alpha < \alpha_c(k)$, where

$$\alpha_c(k) \equiv \Big( \frac{1}{\sqrt{2\pi}} \int_{-k}^{\infty} (u + k)^2 e^{-u^2/2}\, du \Big)^{-1}, \tag{3.5}$$

we have the so-called replica-symmetric solution of the problem. This means first of all that, if we define the Edwards-Anderson order parameter as

$$q_N = N^{-1} \sum_{i} \langle J_i \rangle_\Theta^2, \tag{3.6}$$

with $\langle \dots \rangle_\Theta$ being the average with respect to the uniform distribution on the intersection of $S_N$ with $\cap \Pi_\mu$, then $q_N$ possesses the self-averaging property (2.8) and its limiting mean value can be found as a solution of the replica symmetric equation

$$q = \alpha (1 - q)\, E\Big\{ \Big[ \frac{H'}{H} \Big( \frac{u \sqrt{q} + k}{\sqrt{1 - q}} \Big) \Big]^{2} \Big\}, \tag{3.7}$$

where $H(x) \equiv \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt$ and $u$ is a Gaussian normal random variable. Besides, there exists

$$\lim_{N, p \to \infty,\ p/N \to \alpha} N^{-1} E\{ \log \Theta_{N,p}(k) \} = F(\alpha, k) \equiv \min_{q : 0 \le q \le 1} \Big[ \alpha E \log H\Big( \frac{u \sqrt{q} + k}{\sqrt{1 - q}} \Big) + \frac{1}{2} \frac{q}{1 - q} + \frac{1}{2} \log(1 - q) \Big]. \tag{3.8}$$

It is easy to check that equation (3.7) is just the minimum condition for the function in the right-hand side of (3.8). For $\alpha \ge \alpha_c(k)$

$$\frac{1}{N} \log \Theta_{N,p}(k) \to -\infty, \qquad \text{as } N \to \infty.$$
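The critical value $\alpha_c(k)$ of (3.5) is easy to evaluate numerically. The sketch below is an illustration, not part of the original text: it uses the closed form $\frac{1}{\sqrt{2\pi}}\int_{-k}^{\infty}(u+k)^2 e^{-u^2/2}\,du = (1+k^2)\Phi(k) + k\varphi(k)$, obtained by integrating by parts, where $\Phi$ and $\varphi$ are the standard normal distribution function and density, and cross-checks it against a composite Simpson rule. The function names and the truncation of the integral at $u = 12$ are assumptions of the sketch.

```python
import math

def gauss_pdf(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def gauss_cdf(u):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def alpha_c_closed(k):
    """alpha_c(k) from (3.5), using the closed form of the Gaussian integral."""
    return 1.0 / ((1.0 + k * k) * gauss_cdf(k) + k * gauss_pdf(k))

def alpha_c_quadrature(k, upper=12.0, steps=20000):
    """alpha_c(k) from (3.5) by composite Simpson integration on [-k, upper]."""
    a, b = -k, upper
    step = (b - a) / steps  # steps must be even for Simpson's rule

    def f(u):
        return (u + k) ** 2 * gauss_pdf(u)

    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return 1.0 / (total * step / 3.0)

values = {k: (alpha_c_closed(k), alpha_c_quadrature(k)) for k in (0.0, 0.5, 1.0, 2.0)}
for k, (closed, numeric) in values.items():
    print(f"k={k}: closed form {closed:.6f}, quadrature {numeric:.6f}")
```

For $k = 0$ the integral in (3.5) equals $1/2$, so the formula gives $\alpha_c(0) = 2$, and $\alpha_c(k)$ decreases as the stability margin $k$ grows.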
It is interesting to observe that $\alpha_c(0) = 2$, since the integral in (3.5) equals $1/2$ for $k = 0$ (cf. the Hopfield model, where $\alpha_c = 0.138\dots$). Let us remark that, according to the results of Gardner, for this model there is no region of parameters with a so-called broken replica symmetry solution, so there is a hope that, unlike the Hopfield model, the Gardner model could be studied completely (i.e., in the whole range of parameters) by using the cavity method described in the previous section.

3.1. Rigorous results for the Gardner problem. The first rigorous result for the Gardner problem with Gaussian $\xi^{(\mu)}$ was obtained by Talagrand [T2]. He proved large deviation type bounds for the fluctuations of $\log \Theta_{N,p}(k)$. A complete rigorous solution of the Gardner problem was obtained in [ST2] (see also [ST3]), where the Gardner formulas (3.8), (3.7) for the free energy and the Edwards-Anderson order parameter were proved. To this end we use a version of the cavity method, but the problem is that we are not able to produce the equations for the order parameter in the case when the "randomness" is not included in the Hamiltonian but is contained in the form of the integration domain. That is why we used a rather common trick: we substitute the $\theta$-functions by some smooth functions which depend on a small parameter $\varepsilon$ and tend, as $\varepsilon \to 0$, to the $\theta$-function. We choose for this purpose $H(x \varepsilon^{-1/2})$, where $H$ is the erf-function, but the particular form of these smoothing functions is not very important for us. The most important fact is that they do not vanish at any point, and so, taking their logarithms, we can treat them as a part of our Hamiltonian. So we introduce the intermediate Hamiltonian

$$H_{N,p}(J, k, z, \varepsilon) \equiv -\sum_{\mu=1}^{p} \log H\Big( \frac{k - (\xi^{(\mu)}, J) N^{-1/2}}{\sqrt{\varepsilon}} \Big) + \frac{z}{2} (J, J). \tag{3.9}$$
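The partition function for this Hamiltonian is

$$\begin{aligned}
Z_{N,p}(k, z, \varepsilon) &= |S_N|^{-1} \int dJ\, \exp\{ -H_{N,p}(J, k, z, \varepsilon) \} \\
&= |S_N|^{-1} \int dJ \prod_{\mu=1}^{p} H\Big( \frac{k - (\xi^{(\mu)}, J) N^{-1/2}}{\sqrt{\varepsilon}} \Big) \exp\{ -z (J, J)/2 \}.
\end{aligned} \tag{3.10}$$

We denote also by $\langle \dots \rangle$ the corresponding Gibbs averaging and

$$f_{N,p}(k, z, \varepsilon) \equiv \frac{1}{N} \log Z_{N,p}(k, z, \varepsilon).$$

One more difference of this model from the model (3.4) is that we introduce an additional parameter $z > 0$ to replace the integration over the sphere $(J, J) = N$ in (3.4) by the integration over the whole space $\mathbb{R}^N$ in (3.10). It is proven in [ST2] that if we find the thermodynamic limit

$$\lim_{N, p \to \infty,\ p/N \to \alpha} E\{ f_{N,p}(k, z, \varepsilon) \} = F(\alpha, k, z, \varepsilon)$$

and choose $z^*$ from the condition

$$F(\alpha, k, z^*, \varepsilon) + \frac{z^*}{2} = \min_{z > 0} \Big\{ F(\alpha, k, z, \varepsilon) + \frac{z}{2} \Big\},$$

then

$$\lim_{N, p \to \infty,\ p/N \to \alpha} N^{-1} E\Big\{ \log \sigma_N^{-1} \int_{(J, J) = N} dJ\, \exp\{ -H_{N,p}(J, k, 0, \varepsilon) \} \Big\} = F(\alpha, k, z^*, \varepsilon) + \frac{z^*}{2}.$$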
We call the model (3.9)–(3.10) by the modified Gardner model. The free energy of this model can be found using the following theorem proven in [ST2]: Theorem 3.2. For α < 2, ε small enough, and z ≤ ε−1/3 , there exists E{fN,p (k, z, ε)} = F (α, k, z, ε), √ u q+k F (α, k, h, z, ε) ≡ max min αE log H √ R>0 0≤q≤R ε+R−q
1 q 1 z + + log(R − q) − R , 2R−q 2 2
lim
N,p→∞,αN →α
where u is a normal random variable. As it was mentioned above, the proof of Theorem 3.2 is based on the the application of the cavity method to the Gardner problem. The key point of
440
M. Shcherbina
this application is the proof of the vanishing of the thermodynamic correlations between $J_i$ and $J_j$ in the limit $N \to \infty$ (cf. (2.6)):
\[
E\,\bigl\langle (J_i - \langle J_i\rangle)(J_j - \langle J_j\rangle)\bigr\rangle^2 \to 0, \quad \text{as } N \to \infty, \tag{3.11}
\]
which follows from the Brascamp-Lieb [BL] inequalities, according to which for any integer $n$ and any $x \in \mathbb{R}^N$
\[
\Bigl\langle \Bigl(\frac{(\dot J, x)}{\sqrt{N}}\Bigr)^{2n} \Bigr\rangle \le \frac{\Gamma(2n-1)}{z^n\,\Gamma(n-1)}\,\frac{|x|^{2n}}{N^n}. \tag{3.12}
\]
It is interesting to remark that the Brascamp-Lieb inequalities follow from a classical geometrical theorem:

Theorem 3.3 (Brunn-Minkowski). Let $M \subset \mathbb{R}^N$ be some convex set. Consider the family of hyperplanes $L(t) = \{x \in \mathbb{R}^N : (x,e) = t\sqrt{N}\}$. Let $A(t) = M \cap L(t)$. Consider $R(t) \equiv [\mathrm{mes}\,A(t)]^{1/N}$. Then
\[
\frac{d^2 R(t)}{dt^2} \le 0,
\]
and $\frac{d^2 R(t)}{dt^2} \equiv 0$ for $t \in [t_1,t_2]$ if and only if all the sets $A(t)$ for $t \in [t_1,t_2]$ are homothetic to each other.
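The variational formula of Theorem 3.2 can be sanity-checked in the degenerate case $\alpha = 0$, where the term containing $H$ drops out and the bracket reduces to $\frac{q}{2(R-q)} + \frac12\log(R-q) - \frac z2 R$. This expression is increasing in $q$, so the minimum sits at $q = 0$, the maximum at $R = 1/z$, and $F(0,k,z,\varepsilon) = -\frac12\log z - \frac12$; consequently $F + z/2$ is minimal at $z^* = 1$ with value $0$. The following Python sketch reproduces these numbers by brute-force grid search. The function names and grid sizes are illustrative assumptions and are not taken from [ST2].

```python
import math


def bracket(q, R, z):
    """Expression under max-min in Theorem 3.2 with the alpha-term dropped (alpha = 0)."""
    return 0.5 * q / (R - q) + 0.5 * math.log(R - q) - 0.5 * z * R


def F_alpha0(z, n_R=1000, n_q=10, R_max=10.0):
    """Grid approximation of max_{R>0} min_{0<=q<R} bracket(q, R, z)."""
    best = -math.inf
    for i in range(1, n_R + 1):
        R = R_max * i / n_R
        # q runs over a grid in [0, R); q = R itself is excluded (singular point)
        inner = min(bracket(R * j / n_q, R, z) for j in range(n_q))
        best = max(best, inner)
    return best


def closed_form(z):
    """Hand-computed value at alpha = 0: q = 0, R = 1/z."""
    return -0.5 * math.log(z) - 0.5


def best_z():
    """Minimise F(z) + z/2 over a coarse grid of z (assumed range 0.5..2)."""
    zs = [0.5 + 0.05 * i for i in range(31)]
    values = [(F_alpha0(z) + 0.5 * z, z) for z in zs]
    return min(values)  # (minimal value, minimising z)


if __name__ == "__main__":
    print(F_alpha0(2.0), closed_form(2.0))
    print(best_z())
```

The grid search is deliberately naive; for $\alpha > 0$ one would additionally need a quadrature for the Gaussian expectation, which is omitted here.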
After the proof of Theorem 3.2 the next step is the limiting transition $\varepsilon \to 0$, i.e., the proof that the product of $\alpha N$ $\theta$-functions in (3.4) can be replaced by the product of $H(x/\sqrt{\varepsilon})$ with a small error when $\varepsilon$ is small enough. Contrary to expectations, this is the most difficult step from the technical point of view. It is rather simple to prove that the expression (3.8) is an upper bound for $\log \Theta_{N,p}(k)$. But the estimate from below is much more complicated. The problem is that to estimate the difference between the free energies corresponding to two Hamiltonians we, as a rule, need to have them defined on a common configuration space or, at least, to know some a priori bounds for some Gibbs averages. In the case of the Gardner problem we do not possess this information. This leads to rather serious technical problems. The final result has the form:

Theorem 3.4. For any $\alpha < \alpha_c(k)$ there exists
\[
\lim_{N,p\to\infty,\ p/N\to\alpha} E\{N^{-1}\log \Theta_{N,p}(k)\} = \lim_{\varepsilon\to 0}\max_{z>0} F(\alpha,k,z,\varepsilon) = F(\alpha,k),
\]
where $F(\alpha,k)$ is the Gardner expression. For $\alpha > \alpha_c(k)$, $E\{N^{-1}\log \Theta_{N,p}(k)\} \to -\infty$ as $N \to \infty$.

It is interesting to mention one more problem which is very similar to the Gardner problem. This is the so-called Gardner-Derrida problem [DG], in which we seek a matrix $\{J_{ij}\}_{i,j=1}^{N}$ satisfying conditions (3.2) or (3.3) but assuming values $J_{ij} = \pm 1$. The geometrical interpretation here is that we are interested
in the measure of the intersection of our random half spaces $\Pi_\mu$ with the discrete cube $\Sigma_N = \{-1,1\}^N$. This problem was also solved by the replica trick (see [DG]) and, similarly to the Gardner problem, it was shown that the replica symmetric solution for this problem is true in the whole range of parameters $(\alpha,k)$. But until now a rigorous proof of these results by some version of the cavity method has been found only for small $\alpha$ (see [T3], [T4]). This difference from the Gardner problem is explained by the fact that for the Gardner problem we can use the Brascamp-Lieb inequalities (3.12) to prove the vanishing of the thermodynamic correlations (3.11), while in the case of the Gardner-Derrida model these inequalities are not applicable.

3.2. CLT for the free energy and order parameters. An important ingredient of the analysis of the free energy of the model (3.9) in [ST2] was the proof of the fact that the variance of its order parameters (or the overlap parameters) vanishes in the thermodynamic limit. In the paper [ST4] we study the behavior of the fluctuations of the overlap parameters, defined as
\[
R_{l,m} = \frac{1}{N}\,(J^{(l)}, J^{(m)}), \qquad l,m = 1,\dots,n, \tag{3.13}
\]
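where the upper indices of the variables $J$ mean that we consider $n$ replicas of the Hamiltonian (3.9) with the same random parameters $\{\xi^{(\mu)}\}_{\mu=1}^{p}$, but different $J^{(1)},\dots,J^{(n)}$. We also introduce the notations:
\[
\dot q = N^{1/2}(R_{1,2} - q), \qquad
T_{l,m} = \frac{1}{N^{1/2}}\,(\dot J^{(l)}, \dot J^{(m)}), \qquad
T_l = \frac{1}{N^{1/2}}\,(\dot J^{(l)}, \langle J\rangle). \tag{3.14}
\]
Here $\dot J \equiv J - \langle J\rangle$ and $\langle J\rangle = (\langle J_1\rangle,\dots,\langle J_N\rangle) \in \mathbb{R}^N$, where $\langle\ldots\rangle$ is the Gibbs averaging with respect to the Hamiltonian (3.9). $(q,R)$ is the solution of the system of equations:
\[
\begin{aligned}
q &= (R-q)^2\,\frac{\alpha}{R-q+\varepsilon}\, E\Bigl\{A^2\Bigl(\frac{\sqrt{q}\,u+k}{\sqrt{R-q+\varepsilon}}\Bigr)\Bigr\},\\
z &= \frac{\alpha}{(R-q+\varepsilon)^{3/2}}\, E\Bigl\{(\sqrt{q}\,u+k)\,A\Bigl(\frac{\sqrt{q}\,u+k}{\sqrt{R-q+\varepsilon}}\Bigr)\Bigr\} - \frac{q}{(R-q)^2} + \frac{1}{R-q},
\end{aligned} \tag{3.15}
\]
with
\[
A(x) = -\frac{1}{\sqrt{2\pi}}\,\frac{d}{dx}\log H(x).
\]
These equations are equivalent to
\[
\frac{\partial F}{\partial q} = 0, \qquad \frac{\partial F}{\partial R} = 0,
\]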
for the function $F(q,R;k,z,\varepsilon)$ which is defined by the expression in the r.h.s. of (1.8) before taking $\max_R \min_q$. It is proven in [ST2] that if $\alpha < 2$, $\varepsilon \le \varepsilon^*(\alpha,k)$ and $z \le \varepsilon^{-1/3}$, then the system (3.15) has a unique solution.

To avoid additional technical difficulties in the proof of the central limit theorems we assume that $\{\xi_i^{(\mu)}\}$ are independent normal random variables. The main result of the paper [ST4] is

Theorem 3.5. Consider any $\alpha < 2$, $k > 0$, $\varepsilon \le \varepsilon^*(\alpha,k)$ and $z \le \varepsilon^{-1/3}$. Then for any integer $n$ the family of random variables $\{\sqrt{N}(R_{l,m} - E R_{l,m})\}_{l<m\le n}$ converges in distribution, as $N,p \to \infty$, $p/N \to \alpha$, to the Gaussian family of random variables $\{v_{l,m}\}_{l<m\le n}$ with the covariance matrix:
\[
\begin{aligned}
E\{v_{l,m} v_{l,m}\} &= A^*,\\
E\{v_{l,m} v_{l,m'}\} &= B^* \quad (m \ne m'),\\
E\{v_{l,m} v_{l',m'}\} &= C^* \quad (m, m', l, l' \text{ are different}).
\end{aligned} \tag{3.16}
\]
In particular,
\[
\begin{aligned}
\lim_{N,p\to\infty,\ p/N\to\alpha} E\langle T_{1,2}^{2n}\rangle &= \frac{\Gamma(2n-1)}{\Gamma(n-1)}\,A_*^n,\\
\lim_{N,p\to\infty,\ p/N\to\alpha} E\langle T_{1}^{2n}\rangle &= \frac{\Gamma(2n-1)}{\Gamma(n-1)}\,B_*^n,\\
\lim_{N,p\to\infty,\ p/N\to\alpha} E\{\dot q^{\,2n}\} &= \frac{\Gamma(2n-1)}{\Gamma(n-1)}\,C_*^n,
\end{aligned} \tag{3.17}
\]
where the constants $A^*, B^*, C^*, A_*, B_*, C_*$ depend on $\alpha, k, z, \varepsilon$, and all odd moments of these random variables tend to zero.

Remark 3.6. In fact it follows from our proof that $\{T_{l,m}\}_{l<m\le n}$ and $\{T_l\}_{l\le n}$ in some sense do not depend on the random variables $\{\xi_i^{(\mu)}\}$, i.e., if $P$ is some product of $\{T_{l,m}\}_{l<m\le n}$ and $\{T_l\}_{l\le n}$, then
\[
\lim_{N,p\to\infty,\ p/N\to\alpha} E\{(\langle P\rangle - E\langle P\rangle)^2\} = 0. \tag{3.18}
\]
A similar result for the free energy (3.9) of the modified Gardner model was obtained in [ST5].

Theorem 3.7. Consider the modified Gardner model with i.i.d. normal variables $\{\xi_i^{(\mu)}\}_{i=1,\dots,N,\ \mu=1,\dots,p}$. Then for any $\alpha < 2$, $k > 0$, $\varepsilon \le \varepsilon^*(\alpha,k)$ and $z \le \varepsilon^{-1/3}$ the random variable $v_{N,p} = N^{1/2}(f_{N,p} - E\{f_{N,p}\})$ converges in distribution, as $N,p \to \infty$, $p/N \to \alpha$, to a Gaussian random variable with zero mean and the variance
\[
V = \alpha E\Bigl\{\log^2 H\Bigl(\frac{u\sqrt{q}+k}{\sqrt{\varepsilon+R-q}}\Bigr)\Bigr\} - \alpha\Bigl(E\Bigl\{\log H\Bigl(\frac{u\sqrt{q}+k}{\sqrt{\varepsilon+R-q}}\Bigr)\Bigr\}\Bigr)^2 .
\]
Similar results for the fluctuations of the overlap parameters and of the free energy were obtained in [T4] for the Gardner-Derrida model for small $\alpha$ and for the Sherrington-Kirkpatrick model at high temperature. We would like to mention also the work [GuT], where the fluctuations of the overlap parameters for the Sherrington-Kirkpatrick model in the high temperature region were studied by the method of characteristic functions.

References

[AGS] D. Amit, H. Gutfreund and H. Sompolinsky. Statistical Mechanics of Neural Networks. Annals of Physics 173, 30–47 (1987)
[BGP] A. Bovier, V. Gayrard and P. Picco. Large deviation principles for the Hopfield and the Kac-Hopfield model. Probab. Theory Rel. Fields 101, 511–546 (1995)
[BG] A. Bovier, V. Gayrard. Hopfield models as generalized random mean field models. In "Mathematical aspects of spin glasses and neural networks", A. Bovier and P. Picco (eds.), Progress in Probability 41, 1–89 (Birkhäuser, Boston 1998)
[BL] H.J. Brascamp, E.H. Lieb. On Extensions of the Brunn-Minkowski and Prékopa-Leindler Theorems, Including Inequalities for Log Concave Functions, and with an Application to the Diffusion Equation. J. Func. Anal. 22, 366–389 (1976)
[DG] B. Derrida, E. Gardner. Optimal Storage Properties of Neural Network Models. J. Phys. A: Math. Gen. 21, 271–284 (1988)
[FST] J. Feng, M. Shcherbina, B. Tirozzi. On the critical capacity of the Hopfield model. Commun. Math. Phys. 216, 139–177 (2001)
[G] E. Gardner. The Space of Interactions in Neural Network Models. J. Phys. A: Math. Gen. 21, 257–270 (1988)
[GuT] F. Guerra and F.L. Toninelli. Central Limit Theorem for Fluctuations in the High Temperature Region of the Sherrington-Kirkpatrick Spin Glass Model. J. Math. Phys. 43, 6224–6237 (2002)
[H] J. Hopfield. Neural Networks and Physical Systems with Emergent Collective Computational Abilities. Proc. Nat. Ac. Sci. 79, 2554–2558 (1982)
[KP] H. Koch, J. Piasko. Some Rigorous Results on the Hopfield Neural Network Model. J. of Stat. Phys. 55, 5/6, 903–928 (1993)
[L] D. Loukianova. Lower bounds on the restitution error of the Hopfield model. Prob. Theor. Relat. Fields 107, 161–176 (1997)
[McEPRV] R.J. McEliece, E.C. Posner, E.R. Rodemich, S.S. Venkatesh. The capacity of the Hopfield associative memory. IEEE Trans. Inform. Theory 33, 461–468 (1987)
[MPV] M. Mézard, G. Parisi, M.A. Virasoro. Spin Glass Theory and Beyond. Singapore: World Scientific, 1987.
[N] C. Newman. Memory capacity in neural network models: Rigorous lower bounds. Neural Networks I, 223–238 (1988)
[PF] L. Pastur, A. Figotin. Exactly Soluble Model of the Spin Glass. Soviet. Phys. J.E.T.P. 25, 348–353 (1977)
[PS] L. Pastur, M. Shcherbina. Absence of Self-Averaging of the Order Parameter in the Sherrington-Kirkpatrick Model. J. Stat. Phys. 62, 1–26 (1991)
[PST1] L. Pastur, M. Shcherbina, B. Tirozzi. The Replica-Symmetric Solution Without Replica Trick for the Hopfield Model. J. Stat. Phys. 74, 5/6, 1161–1183 (1994)
[PST2] L. Pastur, M. Shcherbina, B. Tirozzi. On the replica symmetric equations for the Hopfield model. J. Math. Phys. V. 40 (1999)
[S] M. Shcherbina. More about the absence of selfaverageness of order parameter in SK-model. Preprint Rome University-I, 1991.
[ST1] M. Shcherbina, B. Tirozzi. The Free Energy of a Class of Hopfield Models. J. of Stat. Phys. 72, 1/2, 113–125 (1993)
[ST2] M. Shcherbina, B. Tirozzi. Rigorous Solution of the Gardner Problem. Commun. Math. Phys. (2003)
[ST3] M. Shcherbina, B. Tirozzi. On the Volume of the Intersection of a Sphere with Random Half Spaces. CRAS Ser. I 334, 803–806 (2002)
[ST4] M. Shcherbina, B. Tirozzi. Central Limit Theorems for Order Parameters of the Gardner Problem. Markov processes and related fields, N4, 583–602 (2003)
[ST5] M. Shcherbina, B. Tirozzi. Central limit theorems for the free energy of the modified Gardner problem. To appear in Markov processes and related fields (2004)
[T1] M. Talagrand. Rigorous Results for the Hopfield Models with Many Patterns. Prob. Theor. Rel. Fields 110, 109–176 (1998)
[T2] M. Talagrand. Self-averaging and the Space of Interactions in Neural Networks. Random Structures and Algorithms 14, 199–213 (1998)
[T3] M. Talagrand. Intersecting Random Half-Spaces: Toward the Gardner-Derrida Problem. Ann. Probab. 28, 725–758 (2000)
[T4] M. Talagrand. Spin glasses: a challenge for mathematicians. Mean field models and cavity method. Springer-Verlag, 2002.

M. Shcherbina
Institute for Low Temperature Physics, Ukr. Ac. Sci.
47 Lenin ave., Kharkov, Ukraine
Zeroes of Gaussian Analytic Functions

Mikhail Sodin

Supported by the Israel Science Foundation of the Israel Academy of Sciences and Humanities.

Abstract. Geometrically, zeroes of a Gaussian analytic function are intersection points of an analytic curve in a Hilbert space with a randomly chosen hyperplane. Mathematical physics provides another interpretation as a gas of interacting particles. In the last decade, these interpretations influenced progress in understanding statistical patterns in the zeroes of Gaussian analytic functions, and led to the discovery of canonical models with invariant zero distribution. We shall discuss some recent results in this area and mention several open questions.

Introduction

A Gaussian analytic function is a linear combination
\[
f(z) = \sum_{k\ge 0} \zeta_k f_k(z)
\]
of analytic functions $f_k : G \to \mathbb{C}$ ($G \subseteq \mathbb{C}$ is a domain),
\[
\sum_{k\ge 0} |f_k(z)|^2 < \infty \quad \text{locally uniformly in } G,
\]
with independent standard complex Gaussian random coefficients $\zeta_k$. The random zero set $Z_f = f^{-1}(0)$ is the theme of this talk.

Pioneering contributions in this area were made by Paley and Wiener [28, Chapter X], Kac [15, Chapter I], and Rice [30]. Paley and Wiener constructed a large class of Gaussian analytic functions in a strip with stationary distribution with respect to the shifts, and computed the mean number of zeroes. Their work was influenced by ergodic theory and the theory of almost-periodic functions. Kac was interested in the mean number of real zeroes of polynomials with real coefficients. Rice systematically treated both theoretical and applied aspects of random noises in radio signals. The technique introduced by Kac and Rice has been significant to radio engineers and physicists.

These studies were continued in various directions, notably by Littlewood and Offord [20], Hammersley [12], Offord [26, 27], and Kahane [16, Chapter 13]. Hammersley looked at purely probabilistic aspects of point processes generated by zeroes of random polynomials. The other authors were motivated by the
entire functions and Nevanlinna theory. Introducing randomness, they tried to single out 'typical patterns' in the zero distribution in various classes of entire and analytic functions.

In the 90s, the subject was revived by several groups of researchers who came from different areas: Bogomolny, Bohigas and Leboeuf; Shub and Smale; Edelman and Kostlan; Hannay; Bleher, Shiffman and Zelditch; Nonnenmacher and Voros; by no means is this list complete. They established new links to physics (Coulomb gas of charged particles, random matrices, quantum chaos) and geometry (analytic curves in projective Hilbert spaces), and drastically changed the whole subject.

We start this lecture with a quick review of basic results pertaining to zero sets of arbitrary Gaussian analytic functions (Section 1). In Section 2, we introduce three canonical random zero processes (on the Riemann sphere, complex plane, and the unit disc), distinguished by stationarity with respect to the isometries. In Sections 3 and 4, we consider in more detail one of them, the canonical random zero process in $\mathbb{C}$. The exposition in this part is based on joint works with Tsirelson [34]. In Section 5, we discuss an 'exactly solvable case' of the hyperbolic zero process recently discovered by Peres and Virág [29].

1. The random zero process $Z_f$

Informally speaking, the random zero set $Z_f = f^{-1}(0)$ is the intersection of an analytic curve $f : G \to P(H)$ in a projective Hilbert space with a random hyperplane; the analytic functions $f_k$ are homogeneous coordinates of the curve $f$. Projective unitary transformations of the curve $f$ do not change the random zero set $Z_f$. Hence the random set $Z_f$ depends only on the geometry of the curve $f$. Its study can be interpreted as part of the H. Cartan–Ahlfors–H. and J. Weyl theory of analytic curves independent of the dimension of the target space.¹

The properties of the random process $Z_f$ can be expressed by its counting measure
\[
n_f = \sum_{a\,:\,f(a)=0} \delta_a ,
\]
where $\delta_a$ is a point measure at $a$. The measure $n_f$ is a random, positive, locally finite measure on $G$. The classical formula
\[
n_f = \frac{1}{2\pi}\,\Delta \log |f| \tag{1.1}
\]
(the Laplacian is understood in the sense of distributions) explicitly relates the random measure $n_f$ to the Gaussian analytic process $f$. Proofs of most of the results presented below start with this relation.

¹ An interesting attempt to build a 'dimensionless theory' of analytic curves was made by Favorov [10]. His approach is based on the pluripotential theory in Banach spaces.
1.1. The Edelman-Kostlan formula for the mean measure. The first question about the random measure $n_f$ is to find its average, which is a non-negative measure in $G$.

Theorem 1.1 (Edelman-Kostlan [9]).
\[
E n_f = \frac{1}{2\pi}\,\Delta \log \|f\| ,
\qquad\text{where}\qquad
\|f\|(z) := \sqrt{\sum_{k\ge 0} |f_k(z)|^2}\, .
\]
I.e., the mean measure $E n_f$ coincides with the Riesz measure of the subharmonic function $\log \|f(z)\|$.

The RHS of the Edelman-Kostlan formula is a pull-back of the Fubini-Study area measure from the projective space $P(H)$ to $G$ by the curve $f$. Its density with respect to the Euclidean area measure in $G$ equals
\[
\frac{1}{\pi}\,(f^{\#})^2 = \frac{1}{\pi}\,\frac{\|f \wedge f'\|^2}{\|f\|^4}\, .
\]
The function $f^{\#}$ is a 'Fubini-Study derivative' of the curve $f$. The Edelman-Kostlan formula can be viewed as a version of the classical Crofton formula from integral geometry. Its proof is a simple computation based on equation (1.1):
\[
E n_f = \frac{1}{2\pi}\,\Delta\bigl(E \log |f|\bigr) = \frac{1}{2\pi}\,\Delta \log \|f\| ,
\]
since, for any complex Gaussian random variable $\zeta$, $E \log |\zeta| = \log \|\zeta\| + \mathrm{const}$ (here $\|\zeta\| = (E|\zeta|^2)^{1/2}$).

What about the higher 'moments' of the random measure $n_f$? They are expressed by the $k$-point correlation measures
\[
d\mu(z_1,\dots,z_k) = E\bigl(dn_f(z_1)\cdots dn_f(z_k)\bigr) \tag{1.3}
\]
on $G \times \dots \times G$ ($k$ times). Hannay [13] derived explicit formulas for these measures which generalize (1.2). They involve determinants and permanents of $k \times k$ matrices. In different contexts, rigorous proofs of these formulas are given in [4] and [29].

1.2. Calabi's rigidity. Surprisingly, the mean $E n_f$ determines the random zero set $Z_f$. In geometry, the same phenomenon was discovered by Calabi already in the beginning of the 50s.

Theorem 1.2. Let $f$ and $g$ be Gaussian analytic functions in a domain $G$, and let $E n_f = E n_g$. Then the corresponding random zero sets $Z_f$ and $Z_g$ have the same distribution.
This holds due to the underlying analyticity. The idea is not difficult: let $K(z_1,z_2) = E\{f(z_1)\overline{f(z_2)}\}$ be the covariance of the process $f$. By the Edelman-Kostlan formula, the mean measure $E n_f$ determines the function $z \mapsto \log K(z,z)$ up to a harmonic summand. In turn, the diagonal $K(z,z)$ determines the whole covariance kernel $K$ (due to analyticity of $K$ in $z_1$ and $\bar z_2$), and hence the distribution of the Gaussian process $f$. The details and references are in [33].

Here is Calabi's original formulation: If two linearly non-degenerate analytic curves $f : M \to P(H_1)$, $g : M \to P(H_2)$ of a complex manifold $M$ induce the same Riemannian metric on $M$ by pulling back the corresponding Fubini-Study metrics, then the projective spaces coincide, $P(H_1) = P(H_2)$, and the curves are unitarily equivalent.

1.3. Offord-type estimate.

Theorem 1.3. Let $f$ be a Gaussian analytic function on a plane domain $G$. Then for any test function $\varphi \in C_0^2(G)$ with a compact support in $G$, and any $\lambda > 0$,
\[
P\Bigl(\Bigl|\int_G \varphi\,\bigl(dn_f - E(dn_f)\bigr)\Bigr| > \lambda\Bigr) \le 3\exp\Bigl(-\frac{2\pi\lambda}{\|\Delta\varphi\|_1}\Bigr). \tag{1.4}
\]
Here, $\|\cdot\|_1$ is the $L^1$ norm with respect to the area measure.

Here is an argument borrowed from Offord [26]. By (1.1) and Green's formula, we need to estimate the probability
\[
P\Bigl(\Bigl|\int_G \bigl(\log|f| - E\log|f|\bigr)\,\Delta\varphi\,dm\Bigr| > 2\pi\lambda\Bigr),
\]
where $m$ is the area measure. This reduces the proof to a simple fact about concentration of $\log|\zeta|$, where $\zeta$ is a complex Gaussian random variable. The details are in [33].

The result can be extended in various directions. It persists for zero sets of any random analytic process $f$ in $G$ with uniformly bounded exponential moment: for some $c > 0$,
\[
\sup_{z\in G} E\bigl\{e^{c\,|\log|f(z)||}\bigr\} < \infty .
\]
Examples of such analytic processes are given in [24]. Instead of the $L^1$-norm of the Laplacian $\Delta\varphi$, one can fix any of the $L^q$-norms, $2 < q \le \infty$, of the gradient $\nabla\varphi$. In Section 4, we discuss a more complicated 'global version' of Theorem 1.3. There is a price for such a level of generality: sometimes, Offord's estimate does not give an optimal result. For example, it does not yield sharp bounds for the 'hole probability' (see (3.2) and (5.1) below).
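The "simple fact about concentration of $\log|\zeta|$" used in the argument above can be made explicit. Assuming the normalization $E|\zeta|^2 = 1$ for a standard complex Gaussian (so that $|\zeta|^2$ is exponentially distributed with mean one), one gets $E\log|\zeta| = -\gamma/2$ with $\gamma$ the Euler constant, $E\log|\sigma\zeta| = \log\sigma - \gamma/2$, and explicit two-sided tails. The Python sketch below checks these identities numerically; the function names and quadrature parameters are illustrative assumptions.

```python
import math

EULER_GAMMA = 0.5772156649015329


def mean_log_abs(sigma=1.0, s_min=-40.0, s_max=4.0, n=200_000):
    """E log|zeta| for a complex Gaussian with E|zeta|^2 = sigma^2.

    |zeta|^2 / sigma^2 is Exp(1), so E log|zeta| = log(sigma) + 0.5 * I with
    I = integral of log(t) e^{-t} over (0, inf); substituting t = e^s gives
    I = integral of s * exp(s - e^s) over the real line (trapezoid rule).
    """
    h = (s_max - s_min) / n
    total = 0.0
    for i in range(n + 1):
        s = s_min + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * s * math.exp(s - math.exp(s))
    return math.log(sigma) + 0.5 * total * h


def tail_prob(t):
    """P(|log|zeta|| >= t) for the standard case sigma = 1, t >= 0."""
    lower = -math.expm1(-math.exp(-2.0 * t))  # P(|zeta| <= e^{-t})
    upper = math.exp(-math.exp(2.0 * t))      # P(|zeta| >= e^{t})
    return lower + upper


if __name__ == "__main__":
    print(mean_log_abs(), -EULER_GAMMA / 2)
    print(tail_prob(3.0))
```

The lower tail decays like $e^{-2t}$ and the upper tail doubly exponentially, which is the kind of exponential concentration an Offord-type bound relies on.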
2. Chaotic analytic zero points

In the beginning of the nineties, Bogomolny, Bohigas and Leboeuf; Kostlan; and Shub and Smale introduced a remarkably unique class of Gaussian analytic functions with unitary invariance of zero points. Following Hannay [13], we use the term 'chaotic analytic zero points' (CAZP, for short). We consider here three CAZP models: the elliptic CAZP, the flat CAZP, and the hyperbolic CAZP². They are the random zero sets of the Gaussian analytic functions
\[
f_L(z) = \sum_{k=0}^{L} \zeta_k \sqrt{\frac{L(L-1)\cdots(L-k+1)}{k!}}\; z^k \qquad (\text{elliptic},\ L = 1,2,\dots), \tag{2.1}
\]
\[
f_L(z) = \sum_{k=0}^{\infty} \zeta_k \sqrt{\frac{L^k}{k!}}\; z^k \qquad (\text{flat},\ L > 0), \tag{2.2}
\]
\[
f_L(z) = \sum_{k=0}^{\infty} \zeta_k \sqrt{\frac{L(L+1)\cdots(L+k-1)}{k!}}\; z^k \qquad (\text{hyperbolic},\ L > 0). \tag{2.3}
\]
The analytic function (2.1) is a polynomial of degree $L$ (the domain of the elliptic CAZP is the Riemann sphere), the function (2.2) with probability one is an entire function, and the function (2.3) with probability one is analytic in the unit disc.

We introduce unified notation: $M$ for the domain of the CAZP, and $\Gamma$ for the symmetry group of $M$. Then CAZP is a unique $\Gamma$-stationary random zero process. Here, $\Gamma$-stationarity means that for any $\gamma \in \Gamma$ the point processes $Z_f$ and $Z_{f\circ\gamma}$ have the same distribution. Uniqueness means that CAZP is the only $\Gamma$-stationary process on $M$ among the random zero sets of Gaussian analytic functions.

Having explicit formulas (2.1), (2.2), (2.3), it is very easy to prove $\Gamma$-stationarity and uniqueness. It suffices only to check that
\[
E n_{f_L} = L \cdot m^* , \tag{2.4}
\]
where $m^*$ is a normalized $\Gamma$-invariant area measure on $M$. Then, by Calabi's rigidity, $Z_{f_L}$ is $\Gamma$-stationary, and, again by Calabi's rigidity, $f_L$ is unique. Verification of (2.4) is a straightforward application of the Edelman-Kostlan formula. For example, in the flat case, $\|f_L\|^2(z) = \exp(L|z|^2)$, and
\[
E n_{f_L} = \frac{1}{2\pi}\,\Delta\bigl(L|z|^2/2\bigr) = L \cdot \frac{1}{\pi}\, m .
\]
We see that in the flat case the normalized area $m^*$ equals $\frac{1}{\pi} m$. It is also worth mentioning that canonical isometric embeddings of $M$ into projective Hilbert spaces corresponding to the Gaussian analytic functions (2.1), (2.2) and (2.3) are well known in geometry and physics.
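The flat-case verification of (2.4) is easy to reproduce numerically. The following Python sketch sums the series $\sum_k L^k|z|^{2k}/k!$ coming from (2.2), compares it with $\exp(L|z|^2)$, and then evaluates $\frac{1}{2\pi}\Delta\log\|f_L\|$ with a five-point finite-difference Laplacian, which should return the constant density $L/\pi$. The function names, the truncation length and the step size are illustrative assumptions.

```python
import math


def norm_sq_flat(z, L, terms=200):
    """||f_L||^2(z) = sum_k L^k |z|^(2k) / k! for the flat model (2.2), truncated."""
    x = L * abs(z) ** 2
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= x / k          # builds x^k / k! incrementally
        total += term
    return total


def mean_density(z, L, h=1e-3):
    """(1 / 2 pi) * Laplacian of log ||f_L|| at z, via a five-point stencil."""
    def u(w):
        return 0.5 * math.log(norm_sq_flat(w, L))

    laplacian = (u(z + h) + u(z - h) + u(z + 1j * h) + u(z - 1j * h) - 4.0 * u(z)) / h ** 2
    return laplacian / (2.0 * math.pi)


if __name__ == "__main__":
    z0, L = 0.7 + 0.3j, 2.5
    print(norm_sq_flat(z0, L), math.exp(L * abs(z0) ** 2))
    print(mean_density(z0, L), L / math.pi)
```

Since $\log\|f_L\| = L|z|^2/2$ is quadratic, the stencil is exact up to rounding, and the computed density does not depend on the chosen point.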
² The toric CAZP (which we do not discuss here) was introduced by Nonnenmacher and Voros [25].
By (2.4), the parameter $L$ equals the mean number of random zeroes per unit area on $M$. In what follows, we shall consider the asymptotic behavior of the random zero processes $Z_{f_L}$ in the 'large intensity limit' $L \to \infty$. Some features for all three canonical models are similar, some are different. Due to compactness, the elliptic model sometimes is simpler to analyze. On the other hand, the hyperbolic model has additional intriguing features.

To fix ideas, we shall concentrate on the flat model. In this case, introducing $L$, we just make a homothety of the plane with coefficient $r = \sqrt{L}$. This makes the flat case more transparent³. Thus, we do not need the parameter $L$ anymore, and we consider the asymptotic zero distribution of the Gaussian entire function of order two
\[
f(z) = \sum_{k=0}^{\infty} \zeta_k \frac{z^k}{\sqrt{k!}}\, .
\]
The function $f(z)$ can be viewed as a Gaussian counterpart of the Weierstrass $\sigma$-function
\[
\sigma(z) = z \prod_{\omega \in \sqrt{\pi}\,\mathbb{Z}^2\setminus\{0\}} \Bigl(1 - \frac{z}{\omega}\Bigr)\, e^{z/\omega + \frac12 (z/\omega)^2} .
\]
Indeed, the random function $|f(z)|e^{-|z|^2/2}$ has a stationary distribution, while the function $|\sigma(z)|e^{-|z|^2/2}$ has periods $\sqrt{\pi}$ and $i\sqrt{\pi}$.

3. Linear statistics

Given a test-function $h : \mathbb{C} \to \mathbb{R}$ with a compact support, consider the random variable
\[
Z_r(h) = \int h\bigl(\tfrac{z}{r}\bigr)\,dn(z), \qquad E Z_r(h) = \frac{r^2}{\pi}\int h\,dm ,
\]
where $n$ is the counting measure of the flat CAZP with intensity $L = 1$ and $m$ is the area measure. We are interested in the asymptotic behavior of $Z_r(h)$ when $r \to \infty$. The size of fluctuations of $Z_r(h)$ depends on the smoothness of the test-function $h$.

3.1. Smooth linear statistics.

Theorem 3.1 ([34]). Let $h$ be a $C^2$-function on $\mathbb{C}$ with a compact support. Then
\[
\mathrm{Var}\, Z_r(h) = \frac{\kappa + o(1)}{r^2}\,\|\Delta h\|_2^2, \qquad r \to \infty, \tag{3.1}
\]
where $\kappa$ is a positive numerical constant. The random variables
\[
\frac{r}{\sqrt{\kappa}\,\|\Delta h\|_2}\Bigl(Z_r(h) - \frac{r^2}{\pi}\int h\,dm\Bigr)
\]
converge in distribution to the standard Gaussian law $N(0;1)$ for $r \to \infty$.

³ The scaling $z = z_0 + \frac{w}{\sqrt{L}}$ flattens out the elliptic and hyperbolic geometry as $L \to \infty$. In this limit, the entire function (2.2) is a locally uniform limit of the functions (2.1) and (2.3), and the flat CAZP appears as a scaling limit of the other two CAZP models. This is the motivation for an advanced theory developed by Bleher, Shiffman and Zelditch in [4].
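The scaling limit described in the footnote can be seen directly on the Taylor coefficients. Taking $z_0 = 0$ for simplicity (an assumption made only for this illustration) and substituting $z = w/\sqrt{L}$, the coefficient of $\zeta_k w^k$ in (2.1), (2.2), (2.3) becomes $\sqrt{\prod_{j<k}(1 \mp j/L)/k!}$ in the elliptic and hyperbolic cases and exactly $1/\sqrt{k!}$ in the flat case, so all three agree as $L \to \infty$. A short Python sketch, with hypothetical function names:

```python
import math


def rescaled_coeff(kind, L, k):
    """Coefficient of zeta_k * w^k after the substitution z = w / sqrt(L).

    kind is one of "elliptic", "flat", "hyperbolic", matching (2.1)-(2.3).
    """
    prod = 1.0
    for j in range(k):
        if kind == "elliptic":
            factor = L - j
        elif kind == "flat":
            factor = L
        elif kind == "hyperbolic":
            factor = L + j
        else:
            raise ValueError("unknown model: " + kind)
        prod *= factor / L      # one factor of 1/L per power of z
    return math.sqrt(prod / math.factorial(k))


if __name__ == "__main__":
    for kind in ("elliptic", "flat", "hyperbolic"):
        print(kind, [round(rescaled_coeff(kind, 10 ** 6, k), 6) for k in range(6)])
```

For small $L$ the models differ visibly; for instance the elliptic coefficient vanishes for $k > L$, reflecting that (2.1) is a polynomial of degree $L$.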
Asymptotic formula (3.1) first appeared in Forrester and Honner [11]. It is worth mentioning that Theorem 3.1 persists for the other two CAZP models in the large intensity limit $L \to \infty$ [34, Part I].

It is instructive to compare (3.1) with the size of variations for a simple point process in the plane given by i.i.d. Gaussian perturbations of the lattice. Consider the point process
\[
S = \bigl\{\sqrt{\pi}(k + il) + \eta_{k,l} : (k,l) \in \mathbb{Z}^2\bigr\},
\]
where $\eta_{k,l}$ are independent standard complex Gaussian random variables. In this case, $\mathrm{Var}\, S_r(h) \sim \mathrm{const}\,\|\nabla h\|_2^2$ for $r \to \infty$. This is rather different from (3.1). Asymptotic similarity to the flat CAZP $Z_f$ can be achieved by inventing special correlations between the perturbations $\eta_{k,l}$.⁴ In Section 4, we return to the idea of the flat CAZP as a perturbed lattice.

The proof of Theorem 3.1 starts with the Green formula
\[
Z_r(h) - \frac{r^2}{\pi}\int h\,dm = \frac{1}{2\pi}\int \log|f_r^*|\,\Delta h\,dm , \qquad f_r^*(z) = \frac{f(rz)}{\sqrt{\mathrm{Var}\, f(rz)}}\, .
\]
The RHS is a non-linear functional of the Gaussian process $f_r^*$. The rest is based on the method of moments à la Breuer and Major [5]: we expand the function $\zeta \mapsto \log|\zeta|$ in Hermite polynomials in the space $L^2_{\mathbb{C}}(e^{-|\zeta|^2})$ (the Wick expansion), and evaluate the moments of $Z_r(h)$ using the combinatorial diagram technique.

3.2. Number of chaotic analytic zero points. Let $\Omega \subset \mathbb{C}$ be a bounded domain with a piecewise smooth boundary. We are interested in the asymptotic behavior of the random variable $n(r\Omega) = Z_r(\mathbf{1}_\Omega)$ for $r \to \infty$. Forrester and Honner [11] argued that the technique developed by Martin and Yalçin [22] for studying the Gibbs states of infinite systems of charged particles, applied to the flat CAZP, gives
\[
\mathrm{Var}\, n(r\Omega) = r \cdot (\tau + o(1))\,\mathrm{Length}(\partial\Omega), \qquad r \to \infty,
\]
where $\tau$ is a positive numerical constant. This is consistent with the idea that the variation of the number of points in $r\Omega$ should behave like the number of points in the 'strip' of constant size around the boundary $\partial(r\Omega)$. Presumably, the method of Martin and Yalçin also yields that the random variables
\[
\frac{n(r\Omega) - \pi^{-1} r^2 m(\Omega)}{\sqrt{r \cdot \tau\,\mathrm{Length}(\partial\Omega)}}
\]
converge in distribution to $N(0;1)$ for $r \to \infty$.

⁴ Lattice points are aggregated into clusters and each cluster scatters in a special (equiangular and equidistant) way [34, Part I, Introduction].
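The counting variable is easy to explore by simulation. For the disc $\{|z| \le r\}$ the mean number of zeroes of the flat CAZP with intensity $L = 1$ is $\pi^{-1} r^2 \cdot \pi = r^2$. The Python sketch below samples a truncated version of the series $\sum_k \zeta_k z^k/\sqrt{k!}$ and counts its zeroes in the disc through the winding number of $f$ along the circle (argument principle). The truncation degree, the number of sample points on the circle, and all function names are assumptions made for illustration; the discrete winding number can miscount when a zero lies extremely close to the circle.

```python
import cmath
import math
import random


def sample_flat_gaf(rng, degree=30):
    """Coefficients zeta_k / sqrt(k!), k = 0..degree, with E|zeta_k|^2 = 1."""
    coeffs, log_fact = [], 0.0
    for k in range(degree + 1):
        if k > 0:
            log_fact += math.log(k)
        zeta = complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) / math.sqrt(2.0)
        coeffs.append(zeta * math.exp(-0.5 * log_fact))
    return coeffs


def evaluate(coeffs, z):
    """Horner evaluation of sum_k coeffs[k] * z^k."""
    acc = 0j
    for c in reversed(coeffs):
        acc = acc * z + c
    return acc


def count_zeros(coeffs, r, samples=400):
    """Number of zeroes in |z| < r as the winding number of f along |z| = r."""
    total = 0.0
    prev = evaluate(coeffs, r)
    for j in range(1, samples + 1):
        cur = evaluate(coeffs, r * cmath.exp(2j * math.pi * j / samples))
        total += cmath.phase(cur / prev)   # phase increment in (-pi, pi]
        prev = cur
    return round(total / (2.0 * math.pi))


def mean_count(r, trials=150, seed=1):
    """Monte Carlo estimate of E n(r) for the truncated flat model."""
    rng = random.Random(seed)
    return sum(count_zeros(sample_flat_gaf(rng), r) for _ in range(trials)) / trials


if __name__ == "__main__":
    print(mean_count(2.0), "expected about", 2.0 ** 2)
```

The same routine applied to a fixed polynomial gives a deterministic check of the counting step, for example $z^2 - 0.2z - 0.15 = (z - 0.5)(z + 0.3)$.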
It would be interesting to find a counterpart of the law of the iterated logarithm; i.e., to find a function $\varphi(r)$ such that with probability one
\[
\limsup_{r\to\infty} \frac{|n(r) - r^2|}{\sqrt{r\,\varphi(r)}} = 1 .
\]
Here $n(r) = n(\{|z| \le r\})$.

3.3. The 'hole probability' and large deviations. The next theorem proves an estimate conjectured by Yuval Peres:

Theorem 3.2 ([34]). For $r \ge 1$,
\[
e^{-c_1 r^4} \le P\bigl(n(r) = 0\bigr) \le e^{-c_2 r^4}, \tag{3.2}
\]
where $n(r) = n(\{|z| \le r\})$, and $c_1$ and $c_2$ are positive numerical constants.

It would be interesting to check whether there exists the limit
\[
\lim_{r\to\infty} \frac{\log^- P\bigl(n(r) = 0\bigr)}{r^4}\, ,
\]
and (if it does) to compute its value.

The lower bound in (3.2) is obtained by an explicit construction. The upper bound follows from

Theorem 3.3 ([34]). For any $\delta \in (0, \tfrac14]$ and $r \ge 1$,
\[
P\Bigl(\Bigl|\frac{n(r)}{r^2} - 1\Bigr| \ge \delta\Bigr) \le e^{-c(\delta) r^4} .
\]
The proof of Theorem 3.3 uses tools from the theory of entire functions. First, we show that with very high probability $\log \max_{r\mathbb{D}} |f|$ is close to $r^2/2$. Then, estimating $\log|f|$ from below, we show that with very high probability the average
\[
\int_0^{2\pi} \log|f(re^{i\theta})|\,\frac{d\theta}{2\pi}
\]
is also close to $r^2/2$. From this, using Jensen's formula, we deduce Theorem 3.3.

Theorems 3.2 and 3.3 are consistent with the results known for a one component Coulomb system of charged particles of one sign embedded into a uniform background of the opposite sign, Jancovici, Lebowitz and Magnificat [18].

It would be good to understand the asymptotic behavior of the random variable $r^{-\alpha}\bigl(n(r) - r^2\bigr)$ for $r \to \infty$ and $\alpha \ge 1/2$. At present, we understand the extreme case $\alpha = \tfrac12$, and (not completely) the case $\alpha \ge 2$. A plausible guess (motivated by [18]) is
\[
\lim_{r\to\infty} \frac{\log\log \dfrac{1}{P\bigl(|n(r) - r^2| \ge r^{\alpha}\bigr)}}{\log r} =
\begin{cases}
2\alpha - 1, & \tfrac12 \le \alpha \le 1;\\
3\alpha - 2, & 1 \le \alpha \le 2;\\
2\alpha, & \alpha \ge 2.
\end{cases}
\]
In the first case, the normalized charge $|n(r) - r^2|$ grows slower than the perimeter of the disc; in the second case it grows faster than the perimeter but slower than the area; in the last case (so-called 'overcrowding') it grows faster than the area. According to the philosophy of [18], this should lead to a change of the asymptotic regime at $\alpha = 1$ and $\alpha = 2$. The technique we developed for the proof of Theorem 3.2 helps to analyze the case $\alpha > 2$. The other two cases seem to require a different technique.⁵

4. Flat chaotic analytic zero points as a perturbed lattice

How evenly do the flat CAZP spread over the plane? In the Euclidean case we have a very natural system of points spread evenly throughout the plane, namely, the lattice points. Instead of comparing the random counting measure $n_f$ with its average $\frac{1}{\pi} m$, we consider the flat CAZP as a perturbed lattice
\[
\bigl\{\sqrt{\pi}(k + il) + \xi_{k,l} : k, l \in \mathbb{Z}\bigr\},
\]
for some (dependent) complex-valued random variables $\xi_{k,l}$. We may hope for fast decay of the tail probabilities $P(|\xi_{k,l}| \ge \lambda)$ for large $\lambda$, uniformly in $(k,l) \in \mathbb{Z}^2$. The uniformity becomes trivial if the distribution of $(\xi_{k,l})$ is invariant under lattice shifts. We treat the random variables $\xi_{k,l}$ as measurable functions on the space $\Omega = \mathbb{C}^{\mathbb{Z}^2}$ of two-dimensional arrays $\xi : \mathbb{Z}^2 \to \mathbb{C}$.

Theorem 4.1 ([34]). There exists a probability measure $P$ on $\Omega$ such that
(i) $P$ is invariant under the shifts of $\mathbb{Z}^2$;
(ii) the random set $S = \bigl\{\sqrt{\pi}(k + il) + \xi_{k,l} : (k,l) \in \mathbb{Z}^2\bigr\}$ is distributed like the flat CAZP $Z$;
(iii) $E\bigl\{e^{\varepsilon|\xi_{0,0}|^2}\bigr\} < \infty$ for some $\varepsilon > 0$.

This result gives no information about correlation between the $\xi_{k,l}$. Probably, they can be chosen to be nearly independent on large distances.

The proof of Theorem 4.1 does not use the Gaussian nature of the flat CAZP but only the uniform boundedness of the exponential moment of the 'random potential' $u(z) = \log|f(z)| - \frac12|z|^2$. The main ingredients of the proof are M. Hall's 'marriage lemma' (needed to match the flat CAZP with the lattice $\sqrt{\pi}\,\mathbb{Z}^2$), and a potential theory lemma which can be useful in other discrepancy problems:

Lemma 4.2. Let $u$ be a bounded delta-subharmonic function on $\mathbb{C}$ (i.e., a difference of two subharmonic functions), and let $\Delta u = \mu - m$ in the distributional sense, where $\mu$ is a non-negative measure. Then for any bounded Borel set $E \subset \mathbb{C}$
\[
\mu(E) \le m(E_{+t}) \qquad\text{and}\qquad m(E) \le \mu(E_{+t}),
\]
where $E_{+t} = \{z : \mathrm{dist}(z,E) \le t\}$ is the $t$-vicinity of $E$, $t = \mathrm{const}\,\|u\|_\infty^{1/2}$.

⁵ Added in proofs: The case $\alpha > 2$ was completely analyzed in a recent preprint of M. Krishnapur. His results also capture the 'phase transition' in the exponent at $\alpha = 2$.
The boundedness of $u$ is too strong for applications. It can be easily weakened by convolving $u$ with a smooth convolutor supported by an appropriate disc.

The two ingredients described above alone help to prove only a local result in the spirit of Theorem 4.1. Globalization is still a problem: after smoothing, the random potential $u(z) = \log|f(z)| - \frac12|z|^2$ is almost surely unbounded. Rare fluctuations appear somewhere on the infinite plane $\mathbb{C}$, though probably far from the origin. To achieve some locality, we introduce a special adaptive metric $\rho$ on $\mathbb{C}$ using Hörmander's construction [14, Section 1.4]. This metric $\rho$ is small where the potential is large. Then we use a version of Lemma 4.2 making use of $\rho$-neighborhoods instead of Euclidean ones.

It is worth mentioning that in [1] Ajtai, Komlós and Tusnády studied high probability matchings of a system of $N^2$ independent random points $\Lambda = \{\lambda_1,\dots,\lambda_{N^2}\}$ uniformly distributed in the square $[0,N]^2 \subset \mathbb{R}^2$ with the grid $\{\omega_1,\dots,\omega_{N^2}\} = [0,N)^2 \cap \mathbb{Z}^2$. They considered the average transportation
\[
T(\Lambda) := \min_{\pi} \frac{1}{N^2} \sum_{1\le i\le N^2} |\lambda_i - \omega_{\pi(i)}| ,
\]
where the minimum is taken over all permutations $\pi$ of $\{1,2,\dots,N^2\}$. Then with high probability
\[
\mathrm{const}\,\sqrt{\log N} \le T(\Lambda) \le \mathrm{Const}\,\sqrt{\log N} . \tag{4.1}
\]
For related results see Leighton and Shor [19], and Talagrand [35]. In our global set-up we deal with infinite measures in the plane. Then, according to the lower bound in (4.1), for any matching the average transportation distance tends to infinity in the $N \to \infty$ limit. This leaves no hope for a finite average distance matching between the Poissonian point process in $\mathbb{R}^2$ and a lattice, even without a quantitative estimate (iii).⁶

It would be interesting to find a hyperbolic counterpart of Theorem 4.1.

5. Hyperbolic CAZP

The hyperbolic CAZP, in contrast to the other models, depends on two intrinsic parameters: the mean density of zeroes per unit hyperbolic area, and the size of the sets on which we count the number of points. For instance, $n_L(D_r)$ is the number of points of the hyperbolic CAZP with intensity $L$ in the hyperbolic disc of radius $r$. This leads to different asymptotic regimes, and makes the hyperbolic model richer than the other two. It is also worth mentioning that, for different values of the intensity $L$, the Gaussian analytic function (2.3) exhibits completely different patterns in the asymptotic behavior when $z$ approaches the boundary of the unit disc (Kahane [16, Chapter 13]).

⁶ Added in proofs: Existence of a stationary matching between the Poisson point process in $\mathbb{R}^2$ and the lattice follows from the results recently announced by C. Hoffman, A. E. Holroyd, and Y. Peres.
An interesting observation by Diaconis and Evans [8], and Peres and Virág [29], says that the real part of the hyperbolic random function (2.3) is, up to a constant term, a Poisson integral of a Gaussian random noise on the unit circle. In the case L = 1 this is classical white noise [29, p.11]; the case L = 2 corresponds to the Gaussian process on the Dirichlet space $H_2^{1/2}(\mathbb{T})$ [8, Example 5.6].

5.1. An exactly solvable model. Here, we discuss a recent finding of Peres and Virág [29] which pertains to the case L = 1. Recall that the k-point correlation function $p(z_1, \dots, z_k)$ of a random point process is
\[ p(z_1, \dots, z_k) = \lim_{\varepsilon \to 0} \frac{p_\varepsilon(z_1, \dots, z_k)}{(\pi \varepsilon^2)^k}, \]
where $p_\varepsilon(z_1, \dots, z_k)$ is the probability that each disc $\{|z - z_j| \le \varepsilon\}$, $1 \le j \le k$, contains at least one point of the process. For the random zero processes of a Gaussian analytic function, the limit on the RHS always exists. An equivalent definition says that $p(z_1, \dots, z_k)$ is the density of the k-point correlation measure (1.3) with respect to Lebesgue measure. The correlation measure also contains singular terms supported by the large diagonal; these terms are expressed via j-point correlation functions with j < k. Thus, the random zero process can be described by its correlation functions. Using Hannay's formulas [13], Peres and Virág proved

Theorem 5.1 (Peres-Virág [29]). The correlation function of the hyperbolic CAZP with intensity L = 1 is
\[ p(z_1, \dots, z_k) = \pi^{-k} \det\left[ \frac{1}{(1 - z_i \bar z_j)^2} \right]_{1 \le i,j \le k}. \]

This remarkable identity makes the hyperbolic CAZP with L = 1 an 'exactly solvable model' among all CAZP.⁷ In particular, it yields amazingly simple explicit expressions for the distribution of the number of zeroes n(ρ) in the disc $\{|z| \le \rho\}$ and for the asymptotics of the 'hole probability':

Corollary 5.2 ([29]). Let Z be the hyperbolic CAZP with intensity L = 1. Then
(i) n(ρ) has the same distribution as $\sum_{j=1}^{\infty} X_j$, where $\{X_j\}$ is a sequence of independent Bernoulli random variables with $P(X_j = 1) = \rho^{2j}$;
(ii) for $\rho \to 1$,
\[ P(n(\rho) = 0) = \exp\left( -\frac{\pi^2 + o(1)}{12(1 - \rho)} \right); \tag{5.1} \]
(iii) the ratio
\[ \frac{n(\rho) - E\,n(\rho)}{\sqrt{\operatorname{Var} n(\rho)}} \]
(with $E\,n(\rho) = \frac{\rho^2}{1 - \rho^2}$ and $\operatorname{Var} n(\rho) = \frac{\rho^2}{1 - \rho^4}$) converges in distribution to the standard Gaussian law N(0, 1) for $\rho \to 1$.

In this case, $\operatorname{Var} n(\rho)$ has the same order of magnitude as $E\,n(\rho)$, whilst in the flat case the variance grows only as a square root of the mean.

⁷ Peres and Virág observe that this is the only determinantal process among CAZP.

This
naturally reflects the difference between the hyperbolic and Euclidean geometries: in hyperbolic geometry the perimeter grows like the area, and many more random zeroes are located near the boundary circle. We are not aware of counterparts of (ii) and (iii) for the hyperbolic CAZP with $L \ne 1$.

Loose ends

In this lecture we have only touched the 'ground level' of the theory. There are plenty of interesting and deep developments. Among them are
• scaling limits of zeroes of random polynomials (Ibragimov and Zeitouni [17] and Shiffman and Zelditch [31]) and of random holomorphic sections of high powers of Hermitian line bundles (Bleher, Shiffman and Zelditch [4]);
• solutions of random systems of algebraic equations, including sparse systems (see Edelman and Kostlan [9], Malajovich and Rojas [21], Shiffman and Zelditch [32] and references therein);
• distribution of real zeroes of random real polynomials (Maslova [23], Dembo, Poonen, Shao and Zeitouni [6], Bleher and Di [3], Aldous and Fyodorov [2]);
• links with the zero distribution of chaotic eigenfunctions (Nonnenmacher and Voros [25], Hannay [13]) and with the distribution of eigenvalues of random matrices with independent complex Gaussian entries (Forrester and Honner [11], Diaconis and Evans [8], Dennis and Hannay [7] and references therein),
and each of them deserves a special lecture (cf. [36]). But these are different stories to be told by other people.

Acknowledgement. For the last four years, I have been enjoying collaboration with Boris Tsirelson on the subject of this lecture. I am grateful to him for his encouragement and patience. I am also grateful to Fëdor Nazarov, Leonid Pastur, Yuval Peres, Leonid Polterovich, Zeev Rudnick, Bernard Shiffman, Peter Yuditskii, and Steve Zelditch for useful discussions.

References

[1] M. Ajtai, J. Komlós and G. Tusnády, On optimal matchings. Combinatorica 4 (1984), 259–264.
[2] A. Aldous and Y. Fyodorov, Real roots of random polynomials: universality close to accumulation points. J. Phys. A 37 (2004), 1231–1239.
[3] P. Bleher and Xiaojun Di, Correlations between zeros of non-Gaussian random polynomials. IMRN 2004, no. 46, 2443–2484, arXiv:math.MP/0308014.
[4] P. Bleher, B. Shiffman and S. Zelditch, Universality and scaling of correlations between zeros on complex manifolds. Invent. Math. 142 (2000), 351–395; Poincaré-Lelong approach to universality and scaling of correlations between zeros. Comm. Math. Phys. 208 (2000), 771–785.
[5] P. Breuer and P. Major, Central limit theorems for nonlinear functionals of Gaussian fields. J. Multivariate Anal. 13 (1983), 425–441.
[6] A. Dembo, B. Poonen, Qi-Man Shao and O. Zeitouni, Random polynomials having few or no real zeros. J. Amer. Math. Soc. 15 (2002), 857–892.
[7] M.R. Dennis and J.H. Hannay, Saddle points in the chaotic analytic function and Ginibre characteristic polynomial. J. Phys. A 36 (2003), 3379–3383.
[8] P. Diaconis and S. Evans, Linear functionals of eigenvalues of random matrices. Trans. Amer. Math. Soc. 353 (2001), 2615–2633.
[9] A. Edelman and E. Kostlan, How many zeros of a random polynomial are real? Bull. Amer. Math. Soc. (N.S.) 32 (1995), 1–37.
[10] S.Yu. Favorov, Growth and distribution of the values of holomorphic mappings of a finite-dimensional space into a Banach space. Siberian Math. J. 31 (1990), 137–146; On the growth of holomorphic mappings from a finite-dimensional space into a Banach space. Mat. Fiz. Anal. Geom. (Kharkov) 1 (1994), 240–251. (Russian)
[11] P.J. Forrester and G. Honner, Exact statistical properties of the zeros of complex random polynomials. J. Phys. A 32 (1999), 2961–2981.
[12] J.M. Hammersley, The zeros of a random polynomial. Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, 1954–1955, vol. II, pp. 89–111.
[13] J.H. Hannay, Chaotic analytic zero points: exact statistics for those of a random spin state. J. Phys. A 29 (1996), L101–L105; The chaotic analytic function. J. Phys. A 31 (1998), L755–L761.
[14] L. Hörmander, The analysis of linear partial differential operators. I. Distribution theory and Fourier analysis. Grundlehren der Mathematischen Wissenschaften, 256. Springer-Verlag, Berlin, 1983.
[15] M. Kac, Probability and related topics in physical sciences. Lectures in Applied Mathematics. Proceedings of the Summer Seminar, Boulder, Colo., 1957, Vol. I. Interscience Publishers, London-New York, 1959.
[16] J.-P. Kahane, Some random series of functions (2nd ed). Cambridge University Press, Cambridge, 1985.
[17] I. Ibragimov and O. Zeitouni, On roots of random polynomials. Trans. Amer. Math. Soc. 349 (1997), 2427–2441.
[18] B. Jancovici, J.L. Lebowitz and G. Magnificat, Large charge fluctuations in classical Coulomb systems. J. Statist. Phys. 72 (1993), 773–787.
[19] T. Leighton and P. Shor, Tight bounds for minimax grid matching with applications to the average case analysis of algorithms. Combinatorica 9 (1989), 161–187.
[20] J.E. Littlewood and A.C. Offord, On the distribution of zeros and a-values of a random integral function. II. Ann. of Math. (2) 49 (1948), 885–952; errata 50 (1949), 990–991.
[21] G. Malajovich and M. Rojas, Polynomial systems and the momentum map. Foundations of computational mathematics (Hong Kong, 2000), 251–266, World Sci. Publishing, River Edge, NJ, 2002.
[22] Ph.A. Martin and T. Yalçin, The charge fluctuations in classical Coulomb systems. J. Statist. Phys. 22 (1980), 435–463.
[23] N.B. Maslova, The variance of the number of real roots of random polynomials. Teor. Verojatnost. i Primenen. 19 (1974), 36–51. (Russian); The distribution of the number of real roots of random polynomials. Teor. Verojatnost. i Primenen. 19 (1974), 488–500. (Russian)
[24] F. Nazarov, M. Sodin and A. Volberg, The geometric Kannan-Lovasz-Simonovits lemma, dimension-free estimates for the distribution of the values of polynomials, and the distribution of the zeros of random analytic functions. St. Petersburg Math. J. 14 (2003), 351–366; Local dimension-free estimates for volumes of sublevel sets of analytic functions. Israel J. Math. 133 (2003), 269–283.
[25] S. Nonnenmacher and A. Voros, Chaotic eigenfunctions in phase space. J. Stat. Phys. 92 (1998), 431–518.
[26] A.C. Offord, The distribution of zeros of power series whose coefficients are independent random variables. Indian J. Math. 9 (1967), 175–196.
[27] A.C. Offord, The distribution of the values of an entire function whose coefficients are independent random variables, I. Proc. London Math. Soc. (3) 14a (1965), 199–238; II. Math. Proc. Cambridge Philos. Soc. 118 (1995), 527–542.
[28] R.E.A.C. Paley and N. Wiener, Fourier transforms in the complex domain. AMS Colloquium Publications, 1934.
[29] Yu. Peres and B. Virág, Zeros of the i.i.d. Gaussian power series and a conformal invariant determinantal process. Acta Math., to appear, arXiv:math.PR/0310297.
[30] S.O. Rice, Mathematical analysis of random noise. Bell System Tech. J. 23 (1944), 282–332; Ibid 24 (1945), 46–156.
[31] B. Shiffman and S. Zelditch, Equilibrium distribution of zeros of random polynomials. IMRN (2003), no. 1, 25–49.
[32] B. Shiffman and S. Zelditch, Random polynomials with prescribed Newton polytope. J. Amer. Math. Soc. 17 (2004), 49–108.
[33] M. Sodin, Zeroes of Gaussian analytic functions. Math. Res. Lett. 7 (2000), 371–381.
[34] M. Sodin and B. Tsirelson, Random complex zeroes. I. Asymptotic normality. Israel Jour. Math. 144 (2004), 125–149, arXiv:math.CV/0210090; II. Perturbed lattice. Israel Jour. Math., to appear, arXiv:math.CV/0309449; III. Decay of the hole probability. Israel Jour. Math. 147 (2005), 371–379, arXiv:math.CV/0312258.
[35] M. Talagrand, Matching theorems and empirical discrepancy computations using majorizing measures. J. Amer. Math. Soc. 7 (1994), 455–537.
[36] S. Zelditch, From random polynomials to symplectic geometry. XIIIth International Congress on Mathematical Physics (London, 2000), 367–376, Int. Press, Boston, MA, 2001; Asymptotics of polynomials and eigenfunctions. Proceedings of the International Congress of Mathematicians, Vol. II (Beijing, 2002), 733–742, Higher Ed. Press, Beijing, 2002.
Mikhail Sodin School of Mathematics Tel Aviv University Tel Aviv 69978, Israel e-mail:
[email protected]
Painlevé's Problem, Analytic Capacity and Curvature of Measures

Xavier Tolsa

Abstract. In this paper we survey some recent results in connection with the so-called Painlevé's problem, the semiadditivity of analytic capacity and other related results.
1. Introduction

A compact set $E \subset \mathbb{C}$ is said to be removable for bounded analytic functions if for any open set Ω containing E, every bounded function analytic on $\Omega \setminus E$ has an analytic extension to Ω. In order to study removability, in the 1940's Ahlfors [Ah] introduced the notion of analytic capacity. The analytic capacity of a compact set $E \subset \mathbb{C}$ is
\[ \gamma(E) = \sup |f'(\infty)|, \tag{1.1} \]
where the supremum is taken over all analytic functions $f : \mathbb{C} \setminus E \to \mathbb{C}$ with $|f| \le 1$ on $\mathbb{C} \setminus E$, and $f'(\infty) = \lim_{z \to \infty} z(f(z) - f(\infty))$.

In [Ah], Ahlfors showed that E is removable for bounded analytic functions if and only if γ(E) = 0.

Painlevé's problem consists in characterizing removable singularities for bounded analytic functions in a metric/geometric way. By Ahlfors' result this is equivalent to describing compact sets with positive analytic capacity in metric/geometric terms.

Vitushkin in the 1950's and 1960's showed that analytic capacity and the so-called continuous analytic capacity play a central role in problems of uniform rational approximation on compact sets of the complex plane. The continuous analytic capacity of a compact set $E \subset \mathbb{C}$ is defined as
\[ \alpha(E) = \sup |f'(\infty)|, \]
where the supremum is taken over all continuous functions $f : \mathbb{C} \to \mathbb{C}$ which are analytic on $\mathbb{C} \setminus E$ and uniformly bounded by 1 on $\mathbb{C}$. Many results obtained by Vitushkin in connection with uniform rational approximation are stated in terms of α and γ. See [Vi1] and [Vi2], for example.

Partially supported by grants MTM2004-00519 (Spain), 2001-SGR-00431 (Generalitat de Catalunya), and HPRN-2000-0116 (European Union).

Because of its applications
to this type of problem he raised the question of the semiadditivity of γ and α. Namely, does there exist an absolute constant C such that
\[ \gamma(E \cup F) \le C(\gamma(E) + \gamma(F))? \]
And analogously for α.

In [To6] it was recently proved that analytic capacity is indeed semiadditive. Moreover, a characterization of removable sets for bounded analytic functions in terms of the so-called curvature of measures is also given in [To6]. In the present paper we survey these results and describe the main ideas and techniques involved in their proofs. We also deal with other related results, although we do not intend to give a complete account of all recent advances in connection with analytic capacity and Painlevé's problem.

Let us make some comments about the notation used in the paper. By a cube Q we mean a closed cube with sides parallel to the axes. We denote its side length by $\ell(Q)$. As usual, the letter 'C' stands for an absolute constant which may change its value at different occurrences. The notation $A \lesssim B$ means that there is a positive absolute constant C such that $A \le CB$. Also, $A \approx B$ is equivalent to $A \lesssim B \lesssim A$.

2. Analytic capacity

2.1. Basic properties of analytic capacity. One should keep in mind that, in a sense, analytic capacity measures the size of a set as a non-removable singularity for bounded analytic functions. A direct consequence of the definition is that
\[ E \subset F \;\Rightarrow\; \gamma(E) \le \gamma(F). \]
Moreover, it is also easy to check that analytic capacity is translation invariant:
\[ \gamma(z + E) = \gamma(E) \quad \text{for all } z \in \mathbb{C}. \]
Concerning dilations, we have
\[ \gamma(\lambda E) = |\lambda|\,\gamma(E) \quad \text{for all } \lambda \in \mathbb{C}. \]
Further, if E is connected, then
\[ \operatorname{diam}(E)/4 \le \gamma(E) \le \operatorname{diam}(E). \]
The second inequality (which holds for any compact set E) follows from the fact that the analytic capacity of a closed disk coincides with its radius, and the first one is a consequence of Koebe's 1/4 theorem (see [Gam, Chapter VIII] or [Gar2, Chapter I] for the details, for example). Thus if E is connected and different from a point, then it is non-removable. This implies that any removable compact set must be totally disconnected.
[Figure 1. The square $Q^0$ and the sets $E_1$, $E_2$ and $E_3$, which appear in the first stages of the construction of the corner quarters Cantor set.]

2.2. Relationship with Hausdorff measure. The relationship between analytic capacity and Hausdorff measure is the following:
• If $\dim_H(E) > 1$ (here $\dim_H$ stands for the Hausdorff dimension), then γ(E) > 0. This result follows easily from Frostman's Lemma.
• $\gamma(E) \le H^1(E)$, where $H^1$ is the one-dimensional Hausdorff measure, or length. This follows from Cauchy's integral formula, and it was proved by Painlevé about one hundred years ago. Observe that, in particular, we deduce that if $\dim_H(E) < 1$, then γ(E) = 0.

By the statements above, dimension 1 is the critical dimension in connection with analytic capacity. Moreover, a natural question arises: is it true that γ(E) > 0 if and only if $H^1(E) > 0$?

Vitushkin showed that the answer is no. He showed that there are sets with positive length and vanishing analytic capacity. A typical example of such a set is the so-called corner quarters Cantor set. This set is constructed in the following way: consider a square $Q^0$ with side length 1. Now replace $Q^0$ by 4 squares $Q_i^1$, i = 1, ..., 4, with side length 1/4 contained in $Q^0$, so that each $Q_i^1$ contains a different vertex of $Q^0$. Analogously, in the next stage each $Q_i^1$ is replaced by 4 squares with side length 1/16 contained in $Q_i^1$ so that each one contains a different vertex of $Q_i^1$. So we will have 16 squares $Q_k^2$ of side length 1/16. We proceed inductively (see Fig. 1), and we set $E_n = \bigcup_{i=1}^{4^n} Q_i^n$ and $E = \bigcap_{n=1}^{\infty} E_n$. This is the corner quarters Cantor set. Taking into account that
\[ \sum_{i=1}^{4^n} \ell(Q_i^n) = 1 \]
for each n, it is not difficult to see that $0 < H^1(E) < \infty$. The proof of the fact that γ(E) = 0 is more difficult, and it is due independently to Garnett [Gar1] and Ivanov¹ [Iv].

¹ Vitushkin constructed a different example previously.

Recall that a set is called rectifiable if it is $H^1$-almost all contained in a countable union of rectifiable curves. On the other hand, it is called purely unrectifiable if it intersects any rectifiable curve at most in a set of zero length.
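The construction of the corner quarters Cantor set described above can be sketched in a few lines of Python. This is only an illustration: the function name `corner_quarters` and the choice of $Q^0 = [0,1]^2$ as the starting square are assumptions made here, not notation from the text. The sketch lists the squares $Q_i^n$ of the n-th generation and lets one check that there are $4^n$ of them, each of side $4^{-n}$, so that the side lengths sum to 1.

```python
from fractions import Fraction


def corner_quarters(n):
    """Squares of the n-th generation E_n, as (x, y, side) triples.

    (x, y) is the lower-left corner. The starting square Q^0 is taken
    to be [0, 1]^2, which is an assumption of this sketch.
    """
    squares = [(Fraction(0), Fraction(0), Fraction(1))]
    for _ in range(n):
        next_squares = []
        for x, y, side in squares:
            s = side / 4  # each child has a quarter of the parent's side
            # one child square sitting at each of the four corners of the parent
            for dx in (0, side - s):
                for dy in (0, side - s):
                    next_squares.append((x + dx, y + dy, s))
        squares = next_squares
    return squares


if __name__ == "__main__":
    for n in range(4):
        gen = corner_quarters(n)
        total = sum(side for _, _, side in gen)
        print(n, len(gen), total)
```

Exact rational arithmetic (`Fraction`) is used so that the identity for the sum of side lengths holds exactly rather than up to rounding.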
It turns out that the corner quarters Cantor set, and also Vitushkin's example, are purely unrectifiable. Motivated by this fact, Vitushkin conjectured that pure unrectifiability is a necessary and sufficient condition for vanishing analytic capacity for sets with finite length. Guy David [Dd1] showed in 1998 that Vitushkin's conjecture is true:

Theorem 2.1. Let $E \subset \mathbb{C}$ be compact with $H^1(E) < \infty$. Then γ(E) = 0 if and only if E is purely unrectifiable.

To be precise, let us remark that the "if" part of the theorem is not due to David. In fact, it follows from Calderón's theorem on the $L^2$ boundedness of the Cauchy transform on Lipschitz graphs with small Lipschitz constant. The "only if" part of the theorem, which is more difficult, is the one proved by David. See also [MMV], [DM] and [Lé] for some important preliminary contributions to the proof.

Theorem 2.1 is the solution of Painlevé's problem for sets with finite length. The analogous result is false for sets with infinite length (see [Ma1] and [JM]). For this type of set there is no such nice geometric solution of Painlevé's problem, and we have to content ourselves with a characterization such as the one in Corollary 6.2 below (at least, for the moment).

2.3. The capacity γ+ and the Cauchy transform. Given a finite complex Radon measure ν on $\mathbb{C}$, the Cauchy transform of ν is defined by
\[ \mathcal{C}\nu(z) = \int \frac{1}{\xi - z}\, d\nu(\xi). \]
Although the integral above is absolutely convergent a.e. with respect to Lebesgue measure, it does not make sense, in general, for $z \in \operatorname{supp}(\nu)$. This is the reason why one considers the truncated Cauchy transform of ν, which is defined as
\[ \mathcal{C}_\varepsilon \nu(z) = \int_{|\xi - z| > \varepsilon} \frac{1}{\xi - z}\, d\nu(\xi), \]
for any ε > 0 and $z \in \mathbb{C}$.

Given a positive Radon measure µ on the complex plane and a µ-measurable function f on $\mathbb{C}$, we also denote
\[ \mathcal{C}_\mu f(z) := \mathcal{C}(f\,d\mu)(z) \quad \text{for } z \notin \operatorname{supp}(f), \]
and
\[ \mathcal{C}_{\mu,\varepsilon} f(z) := \mathcal{C}_\varepsilon(f\,d\mu)(z) \]
for any ε > 0 and $z \in \mathbb{C}$. We say that $\mathcal{C}_\mu$ is bounded on $L^2(\mu)$ if the operators $\mathcal{C}_{\mu,\varepsilon}$ are bounded on $L^2(\mu)$ uniformly in ε > 0.

The capacity γ+ of a compact set $E \subset \mathbb{C}$ is
\[ \gamma_+(E) := \sup\{\mu(E) : \operatorname{supp}(\mu) \subset E,\ \|\mathcal{C}\mu\|_{L^\infty(\mathbb{C})} \le 1\}. \tag{2.1} \]
That is, γ+ is defined as γ in (1.1) with the additional constraint that f should coincide with $\mathcal{C}\mu$, where µ is some positive Radon measure supported on E
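(observe that $(\mathcal{C}\mu)'(\infty) = -\mu(\mathbb{C})$ for any Radon measure µ). To be precise, there is another slight difference: in (1.1) we asked $\|f\|_{L^\infty(\mathbb{C} \setminus E)} \le 1$, while in (2.1), $\|f\|_{L^\infty(\mathbb{C})} \le 1$ (for $f = \mathcal{C}\mu$). Trivially, we have
\[ \gamma_+(E) \le \gamma(E). \]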
3. The curvature of a measure

A Radon measure µ on $\mathbb{R}^d$ has growth of degree n (or is of degree n) if there exists some constant C such that $\mu(B(x,r)) \le C r^n$ for all $x \in \mathbb{R}^d$, r > 0. When n = 1, we say that µ has linear growth.

Given three pairwise different points $x, y, z \in \mathbb{C}$, their Menger curvature is
\[ c(x,y,z) = \frac{1}{R(x,y,z)}, \]
where R(x,y,z) is the radius of the circle passing through x, y, z (with R(x,y,z) = ∞, c(x,y,z) = 0 if x, y, z lie on the same line). If two among these points coincide, we let c(x,y,z) = 0. For a positive Radon measure µ, we set
\[ c_\mu^2(x) = \iint c(x,y,z)^2 \, d\mu(y)\,d\mu(z), \]
and we define the curvature of µ as
\[ c^2(\mu) = \int c_\mu^2(x)\, d\mu(x) = \iiint c(x,y,z)^2 \, d\mu(x)\,d\mu(y)\,d\mu(z). \tag{3.1} \]

The notion of curvature of a measure was introduced by Melnikov [Me] when he was studying a discrete version of analytic capacity, and it is one of the ideas responsible for the major recent advances in connection with analytic capacity.

The notion of curvature is connected to the Cauchy transform by the following result, proved by Melnikov and Verdera [MV].

Proposition 3.1. Let µ be a Radon measure on $\mathbb{C}$ with linear growth. We have
\[ \|\mathcal{C}_\varepsilon \mu\|_{L^2(\mu)}^2 = \frac{1}{6}\, c_\varepsilon^2(\mu) + O(\mu(\mathbb{C})), \tag{3.2} \]
where $|O(\mu(\mathbb{C}))| \le C\mu(\mathbb{C})$.

In this proposition, $c_\varepsilon^2(\mu)$ stands for the ε-truncated version of $c^2(\mu)$ (defined as in the right-hand side of (3.1), but with the triple integral over $\{x, y, z \in \mathbb{C} : |x - y|, |y - z|, |x - z| > \varepsilon\}$).

The identity (3.2) is remarkable because it relates an analytic notion (the Cauchy transform of a measure) with a metric-geometric one (curvature). We give a sketch of the proof.
Sketch of the proof of Proposition 3.1. If we do not worry about truncations and the absolute convergence of the integrals, we can write
\[ \|\mathcal{C}\mu\|_{L^2(\mu)}^2 = \int \left| \int \frac{1}{y - x}\, d\mu(y) \right|^2 d\mu(x) = \iiint \frac{1}{(y - x)\,\overline{(z - x)}}\, d\mu(y)\,d\mu(z)\,d\mu(x). \]
By Fubini (assuming that it can be applied correctly), permuting x, y, z, we get
\[ \|\mathcal{C}\mu\|_{L^2(\mu)}^2 = \frac{1}{6} \iiint \sum_{s \in S_3} \frac{1}{(z_{s_2} - z_{s_1})\,\overline{(z_{s_3} - z_{s_1})}}\, d\mu(z_1)\,d\mu(z_2)\,d\mu(z_3), \]
where $S_3$ is the group of permutations of three elements. An elementary calculation shows that
\[ \sum_{s \in S_3} \frac{1}{(z_{s_2} - z_{s_1})\,\overline{(z_{s_3} - z_{s_1})}} = c(z_1, z_2, z_3)^2. \]
So we get
\[ \|\mathcal{C}\mu\|_{L^2(\mu)}^2 = \frac{1}{6}\, c^2(\mu). \]
To argue rigorously, above we should use the truncated Cauchy transform $\mathcal{C}_\varepsilon \mu$ instead of $\mathcal{C}\mu$. Then we would obtain
\[ \|\mathcal{C}_\varepsilon\mu\|_{L^2(\mu)}^2 = \iiint_{\substack{|x-y|>\varepsilon \\ |x-z|>\varepsilon}} \frac{1}{(y - x)\,\overline{(z - x)}}\, d\mu(y)\,d\mu(z)\,d\mu(x) = \iiint_{\substack{|x-y|>\varepsilon \\ |x-z|>\varepsilon \\ |y-z|>\varepsilon}} \frac{1}{(y - x)\,\overline{(z - x)}}\, d\mu(y)\,d\mu(z)\,d\mu(x) + O(\mu(\mathbb{C})). \tag{3.3} \]
By the linear growth of µ, it is easy to check that $|O(\mu(\mathbb{C}))| \lesssim \mu(\mathbb{C})$. As above, using Fubini and permuting x, y, z, one shows that the triple integral in (3.3) equals $c_\varepsilon^2(\mu)/6$.

The notion of curvature is related to rectifiability, and there is a strong connection of this notion with the coefficients β which appear in the travelling salesman theorem of P. Jones [Jo]. The following nice result of Léger [Lé] is an example of this relationship.

Theorem 3.2. Let $E \subset \mathbb{C}$ be compact with $H^1(E) < \infty$. If $c^2(H^1|_E) < \infty$, then E is rectifiable.

Observe that from the preceding result and Proposition 3.1 one infers that if $H^1(E) < \infty$ and the Cauchy transform is bounded on $L^2(H^1|_E)$, then E must be rectifiable. A more quantitative version of this result due to Mattila, Melnikov and Verdera [MMV] asserts that if E is such that
\[ H^1(E \cap B(x,r)) \approx r \quad \text{for all } x \in E \text{ and } 0 < r \le \operatorname{diam}(E) \]
and the Cauchy transform is bounded on $L^2(H^1|_E)$, then E is contained in a regular curve Γ (i.e., a curve which also satisfies the preceding estimates, with Γ instead of E).
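The elementary identity used in the sketch of the proof of Proposition 3.1, namely that the sum over the six permutations equals $c(z_1, z_2, z_3)^2$, can be checked numerically. The following Python sketch is only an illustration; the function names are hypothetical, points of the plane are represented as Python complex numbers, and the curvature is computed from the standard formula $1/R = 4 \cdot \text{Area} / (abc)$ for a triangle with side lengths a, b, c.

```python
from itertools import permutations


def menger_curvature(x, y, z):
    """Menger curvature 1/R of three points of the plane (complex numbers)."""
    x, y, z = complex(x), complex(y), complex(z)
    sides = abs(x - y) * abs(y - z) * abs(z - x)
    if sides == 0:
        return 0.0  # convention from the text: c = 0 if two points coincide
    # |Im((y - x) * conj(z - x))| is twice the area of the triangle
    twice_area = abs(((y - x) * (z - x).conjugate()).imag)
    return 2.0 * twice_area / sides  # equals 4 * area / (a * b * c)


def permutation_sum(z1, z2, z3):
    """Sum over s in S_3 of 1 / ((z_s2 - z_s1) * conj(z_s3 - z_s1)).

    The three points must be pairwise different.
    """
    pts = (complex(z1), complex(z2), complex(z3))
    total = 0j
    for s1, s2, s3 in permutations(pts):
        total += 1 / ((s2 - s1) * (s3 - s1).conjugate())
    return total


if __name__ == "__main__":
    triple = (0, 1, 1j)  # right triangle, circumradius sqrt(2)/2
    print(menger_curvature(*triple) ** 2)
    print(permutation_sum(*triple))
```

For the right triangle with vertices 0, 1, i the circumradius is $\sqrt{2}/2$, so both quantities should be close to 2, and the imaginary part of the permutation sum should vanish up to rounding.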
4. The T(1) and T(b) theorems and Calderón-Zygmund theory with non-doubling measures

The study of analytic capacity has led to the extension of Calderón-Zygmund (CZ) theory to the situation where the underlying measure µ on $\mathbb{C}$ is non-doubling. Recall that µ is said to be doubling if there exists some constant C such that
\[ \mu(B(z, 2r)) \le C\mu(B(z,r)) \quad \text{for } z \in \operatorname{supp}(\mu) \text{ and } r > 0. \]
Let us remark that in the classical CZ theory this doubling assumption plays an essential role in almost all results. When one deals with analytic capacity one is forced to deal with measures which may be non-doubling, and which are only assumed to have linear growth.

The use of CZ theory has been fundamental in most of the recent developments in connection with analytic capacity. For instance, the so-called "T(b) type theorems" are essential tools in the proofs of Vitushkin's conjecture by G. David and of the semiadditivity of analytic capacity in [To6].

In this section we will describe briefly some results of CZ theory without doubling assumptions. In particular, we will state in detail the T(1) theorem and one of the T(b) type theorems of Nazarov, Treil and Volberg.

Let us introduce some terminology. We say that $k(\cdot,\cdot) : \mathbb{R}^d \times \mathbb{R}^d \setminus \{(x,y) \in \mathbb{R}^d \times \mathbb{R}^d : x = y\} \to \mathbb{C}$ is an n-dimensional Calderón-Zygmund kernel if there exist constants C > 0 and η, with 0 < η ≤ 1, such that the following inequalities hold for all $x, y \in \mathbb{R}^d$, $x \ne y$:
\[ |k(x,y)| \le \frac{C}{|x - y|^n}, \]
and
\[ |k(x,y) - k(x',y)| + |k(y,x) - k(y,x')| \le \frac{C|x - x'|^\eta}{|x - y|^{n+\eta}} \quad \text{if } |x - x'| \le |x - y|/2. \tag{4.1} \]
For example, the Cauchy kernel 1/(y − x), with $x, y \in \mathbb{C}$, is a 1-dimensional CZ kernel.

Given a real or complex Radon measure µ on $\mathbb{R}^d$, we define
\[ T\mu(x) := \int k(x,y)\, d\mu(y), \quad x \in \mathbb{R}^d \setminus \operatorname{supp}(\mu). \tag{4.2} \]
We say that T is an n-dimensional Calderón-Zygmund operator (CZO) with kernel $k(\cdot,\cdot)$. We also consider the following ε-truncated operators $T_\varepsilon$, ε > 0:
\[ T_\varepsilon \mu(x) := \int_{|x - y| > \varepsilon} k(x,y)\, d\mu(y), \quad x \in \mathbb{R}^d. \]
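As a short check of the claim that the Cauchy kernel is a 1-dimensional CZ kernel, the following worked estimate verifies the two kernel conditions for $k(x,y) = 1/(y - x)$ with n = 1 and η = 1. The constant 4 is simply what this direct computation gives; it is not a value taken from the text.

```latex
% Size condition, with C = 1:
\[ |k(x,y)| = \frac{1}{|x - y|}. \]
% Smoothness condition: assume |x - x'| \le |x - y|/2, so that
% |y - x'| \ge |y - x| - |x - x'| \ge |x - y|/2.
\begin{align*}
|k(x,y) - k(x',y)|
  &= \left| \frac{1}{y - x} - \frac{1}{y - x'} \right|
   = \frac{|x - x'|}{|y - x|\,|y - x'|}
   \le \frac{2\,|x - x'|}{|x - y|^{2}}, \\
|k(y,x) - k(y,x')|
  &= \left| \frac{1}{x - y} - \frac{1}{x' - y} \right|
   = \frac{|x - x'|}{|x - y|\,|x' - y|}
   \le \frac{2\,|x - x'|}{|x - y|^{2}}.
\end{align*}
% Adding the two bounds gives (4.1) with n = 1, \eta = 1 and C = 4:
\[ |k(x,y) - k(x',y)| + |k(y,x) - k(y,x')| \le \frac{4\,|x - x'|}{|x - y|^{2}}. \]
```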
If µ is non-negative and $f \in L^1_{loc}(\mu)$, we denote
\[ T_\mu f(x) := T(f\,d\mu)(x), \quad x \in \mathbb{R}^d \setminus \operatorname{supp}(f\,d\mu), \]
and
\[ T_{\mu,\varepsilon} f(x) := T_\varepsilon(f\,d\mu)(x). \]
We say that $T_\mu$ is bounded on $L^2(\mu)$ if the operators $T_{\mu,\varepsilon}$ are bounded on $L^2(\mu)$ uniformly in ε > 0.

Given ρ > 1, we say that $f \in L^1_{loc}(\mu)$ belongs to the space $BMO_\rho(\mu)$ if
\[ \sup_Q \frac{1}{\mu(\rho Q)} \int_Q |f - m_Q(f)|\, d\mu < \infty, \]
where the supremum is taken over all the cubes Q in $\mathbb{R}^d$ and $m_Q(f)$ is the µ-mean of f over Q.

Following [NTV1], a Calderón-Zygmund operator $T_\mu$ is said to be weakly bounded if
\[ |\langle T_{\mu,\varepsilon}\chi_Q, \chi_Q \rangle| \le C\mu(Q) \quad \text{for all the cubes } Q \subset \mathbb{R}^d, \text{ uniformly in } \varepsilon > 0. \]
Notice that if $T_\mu$ is antisymmetric, then the left-hand side above vanishes and so $T_\mu$ is weakly bounded.

Now we are ready to state the T(1) theorem:

Theorem 4.1 (T(1) theorem). Let µ be a Radon measure on $\mathbb{R}^d$ of degree n, and let T be an n-dimensional Calderón-Zygmund operator. The following conditions are equivalent:
(a) $T_\mu$ is bounded on $L^2(\mu)$.
(b) $T_\mu$ is weakly bounded and, for some ρ > 1, we have that $T_{\mu,\varepsilon}(1), T^*_{\mu,\varepsilon}(1) \in BMO_\rho(\mu)$ uniformly in ε > 0.
(c) There exists some constant C such that for all ε > 0 and all the cubes $Q \subset \mathbb{R}^d$,
\[ \|T_{\mu,\varepsilon}\chi_Q\|_{L^2(\mu|_Q)} \le C\mu(Q)^{1/2} \quad \text{and} \quad \|T^*_{\mu,\varepsilon}\chi_Q\|_{L^2(\mu|_Q)} \le C\mu(Q)^{1/2}. \]

The classical way of stating the T(1) theorem is the equivalence (a) ⇔ (b). However, for some applications it is more practical to state the result in terms of the $L^2$ boundedness of $T_\mu$ and $T^*_\mu$ over characteristic functions of cubes, i.e., (a) ⇔ (c).

Theorem 4.1 is the extension of the classical T(1) theorem of David and Journé to measures of degree n which may be non-doubling. The result was proved by Nazarov, Treil and Volberg in [NTV1], although not exactly in the form stated above. An independent proof for the particular case of the Cauchy transform was obtained almost simultaneously in [To1]. For the equivalence of conditions (b) and (c) above, the reader should see [To4, Remark 7.1 and Lemma 7.3]. Other (more recent) proofs of the T(1) theorem for non-doubling measures are in [Ve2] (for the particular case of the Cauchy transform) and in [To4].
By Proposition 3.1, the T(1) theorem for the Cauchy transform can be rewritten in the following way:

Theorem 4.2. Let µ be a Radon measure on $\mathbb{C}$ with linear growth. The Cauchy transform is bounded on $L^2(\mu)$ if and only if
\[ c^2(\mu|_Q) \le C\mu(Q) \quad \text{for all the squares } Q \subset \mathbb{C}. \]

Observe that this result is a restatement of the equivalence (a) ⇔ (c) in Theorem 4.1, by an application of (3.2) to the measure $\mu|_Q$, for all the squares $Q \subset \mathbb{C}$.

Let us remark that the boundedness of $T_\mu$ on $L^2(\mu)$ does not imply the boundedness of $T_\mu$ from $L^\infty(\mu)$ into BMO(µ) (this is the space $BMO_\rho(\mu)$ with parameter ρ = 1), and in general $T_{\mu,\varepsilon}(1), T^*_{\mu,\varepsilon}(1) \notin BMO(\mu)$ uniformly in ε > 0. See [Ve2] and [MMNO]. On the other hand, one can show that if $T_\mu$ is bounded on $L^2(\mu)$, then it is also bounded from $L^\infty(\mu)$ into $BMO_\rho(\mu)$, for ρ > 1, by arguments similar to the classical ones for homogeneous spaces. However, the space $BMO_\rho(\mu)$ has some drawbacks. For example, it depends on the parameter ρ and it does not satisfy the John-Nirenberg inequality. To solve these problems, in [To2] a new space called RBMO(µ) has been introduced. RBMO(µ) is a subspace of $BMO_\rho(\mu)$ for all ρ > 1, and it coincides with BMO(µ) when µ is an AD-regular measure, that is, when
\[ \mu(B(x,r)) \approx r^n \quad \text{for all } x \in \operatorname{supp}(\mu) \text{ and } 0 < r \le \operatorname{diam}(\operatorname{supp}(\mu)). \]
Moreover, RBMO(µ) satisfies a John-Nirenberg type inequality, and all CZO's which are bounded on $L^2(\mu)$ are also bounded from $L^\infty(\mu)$ into RBMO(µ). For these reasons RBMO(µ) seems to be a good substitute for the classical space BMO for non-doubling measures of degree n. For the precise definition of RBMO(µ) and its properties, see [To2].

T(b) type theorems are other criteria for the $L^2(\mu)$ boundedness of CZO's. To state one of these theorems in detail we need to introduce the notion of weak accretivity. We say that a function $b \in L^1_{loc}(\mu)$ is weakly accretive if there exists some positive constant C such that
\[ \left| \int_Q b\, d\mu \right| \ge C^{-1}\mu(Q) \quad \text{for all cubes } Q \subset \mathbb{R}^d. \]
Then we have:

Theorem 4.3 (T(b) theorem). Let µ be a Radon measure on $\mathbb{R}^d$ of degree n, and let T be an n-dimensional Calderón-Zygmund operator. Let $b_1, b_2$ be two weakly accretive functions belonging to $L^\infty(\mu)$. Then $T_\mu$ is bounded on $L^2(\mu)$ if and only if the operator $b_2 T_\mu b_1$ is weakly bounded and $T_{\mu,\varepsilon} b_1$, $T^*_{\mu,\varepsilon} b_2$ belong to $BMO_\rho(\mu)$ uniformly in ε > 0, for some ρ > 1.

The condition that $b_2 T_\mu b_1$ is weakly bounded means that
\[ |\langle b_2\, T_{\mu,\varepsilon}(\chi_Q b_1), \chi_Q \rangle| \le C\mu(Q) \]
uniformly in ε > 0, for all cubes $Q \subset \mathbb{R}^d$. Notice that if $T_\mu$ is antisymmetric and $b_1 = b_2 = b$, then $b T_\mu b$ is always weakly bounded.

The preceding theorem has been proved in [NTV4], and it is a generalization of a classical theorem of David, Journé and Semmes to the case of non-doubling measures (and so it requires new ideas and techniques). Other variants of this result (i.e., other T(b) type theorems) can be found in [NTV3] and [NTV5].

For the particular case of the Cauchy transform, Theorem 4.3 yields the following result.

Theorem 4.4. Let µ be a Radon measure on $\mathbb{C}$ with linear growth. Suppose that there exists a function b such that:
(a) $b \in L^\infty(\mu)$,
(b) b is weakly accretive,
(c) $\mathcal{C}_{\mu,\varepsilon} b \in BMO_\rho(\mu)$ uniformly in ε > 0, for some ρ > 1.
Then $\mathcal{C}_\mu$ is bounded on $L^2(\mu)$.

Many more results on Calderón-Zygmund theory with non-doubling measures have been proved recently. For example, there are results concerning $L^p$ and weak (1, 1) estimates [NTV2]; Hardy spaces [To3]; weights [GCM1], [MM], [OP]; commutators [CS], [HMY2], [To2]; fractional integrals [GCM2], [GCG1]; Lipschitz spaces [GCG2]; etc. See also the survey paper [Ve3].

5. Semiadditivity of γ+ and its characterization in terms of curvature

We denote by Σ(E) the set of Radon measures supported on E such that $\mu(B(x,r)) \le r$ for all $x \in \mathbb{C}$, r > 0. The following theorem characterizes γ+ in terms of curvature of measures and in terms of the $L^2$ norm of the Cauchy transform.

Theorem 5.1. For any compact set $E \subset \mathbb{C}$ we have
\[ \gamma_+(E) \approx \sup\{\mu(E) : \mu \in \Sigma(E),\ c^2(\mu) \le \mu(E)\} \approx \sup\{\mu(E) : \mu \in \Sigma(E),\ \|\mathcal{C}_\mu\|_{L^2(\mu),L^2(\mu)} \le 1\}. \tag{5.1} \]

In the statement above, $\|\mathcal{C}_\mu\|_{L^2(\mu),L^2(\mu)}$ stands for the operator norm of $\mathcal{C}_\mu$ on $L^2(\mu)$. That is, $\|\mathcal{C}_\mu\|_{L^2(\mu),L^2(\mu)} = \sup_{\varepsilon>0} \|\mathcal{C}_{\mu,\varepsilon}\|_{L^2(\mu),L^2(\mu)}$.

Sketch of the proof of Theorem 5.1. Call $S_1$ and $S_2$ the first and second suprema on the right side of (5.1) respectively. To see that $S_1 \gtrsim \gamma_+(E)$ take µ supported on E such that $\|\mathcal{C}\mu\|_\infty \le 1$ and $\mu(E) \ge \gamma_+(E)/2$. One easily gets that $\|\mathcal{C}_\varepsilon\mu\|_\infty \lesssim 1$ on supp(µ) for every ε > 0 and $\mu(B(x,r)) \le Cr$ for all r > 0. From Proposition 3.1, it follows then that $c^2(\mu) \le C\mu(E)$.
The inequality S_2 ≳ S_1 can be proved using the T(1) theorem. Indeed, let µ be supported on E with linear growth such that c^2(µ) ≤ µ(E) and S_1 ≤ 2µ(E). We set
\[
A := \Bigl\{ x \in E : \iint c(x,y,z)^2 \, d\mu(y)\, d\mu(z) \le 2 \Bigr\}.
\]
By Tchebychev µ(A) ≥ µ(E)/2. Moreover, for any set B ⊂ C,
\[
c^2(\mu|_{B\cap A}) \le \iiint_{x \in B\cap A} c(x,y,z)^2 \, d\mu(x)\, d\mu(y)\, d\mu(z) \le 2\mu(B).
\]
In particular, this estimate holds when B is any square in C, and so C_{µ|A} is bounded on L^2(µ|A), by Theorem 4.2. Thus S_2 ≳ µ(A) ≈ S_1.

Finally, the inequality γ+(E) ≳ S_2 follows from a dualization of the weak (1, 1) inequality for the Cauchy transform. See [To1] for the details, for example.

From Theorem 5.1, since the term sup{µ(E) : µ ∈ Σ(E), ‖C_µ‖_{L^2(µ),L^2(µ)} ≤ 1} is countably semiadditive, we infer that γ+ is also countably semiadditive.

Corollary 5.2. The capacity γ+ is countably semiadditive. That is, if E_i, i = 1, 2, ..., is a countable (or finite) family of compact sets, we have
\[
\gamma_+\Bigl(\bigcup_{i=1}^{\infty} E_i\Bigr) \le C \sum_{i=1}^{\infty} \gamma_+(E_i).
\]
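The Tchebychev step in the sketch of the proof above can be checked numerically on a toy example. The following is only a minimal sketch under stated assumptions: it uses a finitely supported measure (which does not have linear growth, so it is not an element of Σ(E)), it takes c(x, y, z) to be the Menger curvature, i.e., the reciprocal of the circumradius of the triangle x, y, z, and it ignores degenerate triples. The function names are hypothetical. The measure is rescaled so that c^2(µ) = µ(E), using that c^2(tµ) = t^3 c^2(µ).

```python
import math
import random


def menger_curvature(x, y, z):
    """Reciprocal of the circumradius of the triangle x, y, z (complex numbers)."""
    a, b, c = abs(x - y), abs(y - z), abs(z - x)
    if a == 0 or b == 0 or c == 0:
        return 0.0  # assumption made here: degenerate triples contribute nothing
    twice_area = abs(((y - x) * (z - x).conjugate()).imag)
    return 2.0 * twice_area / (a * b * c)  # equals 4 * area / (a * b * c)


def pointwise_curvature_sq(points, weights):
    """For each atom x_i, the double sum of c(x_i, y, z)^2 against the measure."""
    n = len(points)
    result = []
    for i in range(n):
        s = 0.0
        for j in range(n):
            for k in range(n):
                c = menger_curvature(points[i], points[j], points[k])
                s += c * c * weights[j] * weights[k]
        result.append(s)
    return result


def chebyshev_check(points, weights):
    """Rescale the measure so that c^2(mu) = mu(E); return (mu(A), mu(E))."""
    total = sum(weights)
    g = pointwise_curvature_sq(points, weights)
    c2 = sum(gi * wi for gi, wi in zip(g, weights))
    t = math.sqrt(total / c2) if c2 > 0 else 1.0  # c^2(t mu) = t^3 c^2(mu)
    scaled = [t * w for w in weights]
    g_scaled = pointwise_curvature_sq(points, scaled)
    mu_A = sum(w for gi, w in zip(g_scaled, scaled) if gi <= 2.0)
    return mu_A, sum(scaled)


if __name__ == "__main__":
    random.seed(0)
    pts = [complex(random.random(), random.random()) for _ in range(30)]
    mu_A, mu_E = chebyshev_check(pts, [1.0 / 30] * 30)
    print(mu_A, mu_E)
```

By Tchebychev's inequality the first returned value is at least half of the second, whatever the point configuration.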
Another consequence of Theorem 5.1 is that the capacity γ+ can be characterized in terms of the following potential, introduced by Verdera [Ve2]:
\[
U_\mu(x) = \sup_{r>0} \frac{\mu(B(x,r))}{r} + c_\mu(x), \tag{5.2}
\]
where c_µ(x) is the pointwise version of curvature defined in (3.1). The precise result is the following.

Corollary 5.3. For any compact set E ⊂ C we have
\[
\gamma_+(E) \approx \sup\bigl\{\mu(E) : \mu \in \Sigma(E),\ U_\mu(x) \le 1\ \ \forall x \in \mathbb{C}\bigr\}.
\]
The proof of this corollary follows easily from the fact that γ+(E) ≈ sup{µ(E) : µ ∈ Σ(E), c^2(µ) ≤ µ(E)}, using Tchebychev.

Let us remark that the preceding characterization of γ+ in terms of U_µ is interesting because it suggests that some techniques of potential theory can be useful to study γ+. See [To5] and [Ve2].
6. The comparability between γ and γ+

In [To6] the following result has been proved.

Theorem 6.1. There exists an absolute constant C such that for any compact set E ⊂ C we have γ(E) ≤ Cγ+(E). As a consequence, γ(E) ≈ γ+(E).

Let us remark that the comparability between γ and γ+ had been previously proved by P. Jones for compact connected sets by geometric arguments, very different from the ones in [To6] (see [Pa1, Chapter 3]). Also, in [MTV] it had already been shown that γ ≈ γ+ holds for a big class of Cantor sets. In particular, for the corner quarters Cantor set E (see Fig. 1) it was proved in [MTV] that γ(E_n) ≈ γ+(E_n). Recall that E_n is the nth generation appearing in the construction of E. By results due to Mattila [Ma2] and Eiderman [Ei] (see also [To5]) it was already known that γ+(E_n) ≈ n^{-1/2}. Thus, one has γ(E_n) ≈ n^{-1/2}.

An obvious corollary of Theorem 6.1 and the characterization of γ+ in terms of curvature obtained in Theorem 5.1 is the following.

Corollary 6.2. Let E ⊂ C be compact. Then, γ(E) > 0 if and only if E supports a non-zero Radon measure with linear growth and finite curvature.

Since we know that γ+ is countably semiadditive, the same happens with γ:

Corollary 6.3. Analytic capacity is countably semiadditive. That is, if E_i, i = 1, 2, ..., is a countable (or finite) family of compact sets, we have
\[
\gamma\Bigl(\bigcup_{i=1}^{\infty} E_i\Bigr) \le C \sum_{i=1}^{\infty} \gamma(E_i).
\]
In the rest of this section we will describe the main ideas involved in the proof of Theorem 6.1. Notice that, by Theorem 5.1, to prove Theorem 6.1 it is enough to show that there exists some measure µ supported on E with linear growth, satisfying µ(E) ≈ γ(E), and such that the Cauchy transform Cµ is bounded on L2 (µ) with absolute constants. To implement this argument, the main tool used in [To6] is the T (b) theorem of Nazarov, Treil and Volberg in [NTV3], which is similar in spirit to the T (b) type theorem stated in Theorem 4.4 but more appropriate for the present situation. To apply this T (b) theorem, one has to construct a suitable measure µ and a function b ∈ L∞ (µ) fulfilling some precise conditions, analogous to the conditions (a), (b) and (c) in Theorem 4.4. Because of the definition of analytic capacity, there exists some function f (z) which is analytic and bounded in C \ E with f (∞) = γ(E) (this is the so-called Ahlfors function). By a standard approximation argument, it is
not difficult to see that one can assume that E is a finite union of disjoint segments, so that in particular H^1(E) < ∞. Then one has to construct µ and b and to prove the comparability γ(E) ≈ γ+(E) with estimates independent of H^1(E). Since E is a finite union of disjoint segments, there exists some complex measure ν_0 (obtained from the boundary values of f(z)) supported on E such that f = Cν_0. This measure satisfies the following properties:
\[
\|C\nu_0\|_\infty \le 1, \tag{6.1}
\]
\[
|\nu_0(E)| = \gamma(E), \tag{6.2}
\]
\[
d\nu_0 = b_0 \, d\mathcal{H}^1|_E, \quad \text{where } b_0 \text{ satisfies } \|b_0\|_\infty \le 1. \tag{6.3}
\]
Given this information, by a more or less direct application of a T(b) type theorem we cannot expect to prove that the Cauchy transform is bounded, with absolute constants, with respect to a measure µ such as the one described above. Let us explain the reason in some detail. Suppose for example that there exists some function b such that dν_0 = b dµ and we use the information about ν_0 given by (6.1), (6.2) and (6.3) (notice the difference between b and b_0). From (6.1) and (6.2) we derive
\[
\|C(b\,d\mu)\|_\infty \le 1 \tag{6.4}
\]
and
\[
\Bigl|\int b \, d\mu\Bigr| \approx \mu(E). \tag{6.5}
\]
The estimate (6.4) is very good for our purposes. In fact, most classical T(b) type theorems (like Theorem 4.4) require only the BMO_ρ(µ) norm of b to be bounded, which is a weaker assumption. The estimate (6.5) is likewise good; it is a global accretivity condition, and with some technical difficulties (which may involve some kind of stopping time argument, like in [Dd1] or [NTV3]), one can hope to be able to prove that the accretivity condition
\[
\Bigl|\int_Q b \, d\mu\Bigr| \approx \mu(Q \cap E)
\]
holds for many squares Q. Our problems arise from (6.3). Notice that this implies that
\[
|\nu_0|(E) \le \mathcal{H}^1(E), \tag{6.6}
\]
where |ν_0| stands for the variation of ν_0. This is a very bad estimate since we don't have any control on H^1(E) (we only know H^1(E) < ∞ because of our assumption on E). However, as far as we know, all T(b) type theorems require the estimate ‖b‖_{L^∞(µ)} ≤ C, or variants of it, which in particular imply that
\[
|\nu_0|(E) \le C\mu(E) \approx \gamma(E). \tag{6.7}
\]
That is to say, the estimate that we get from (6.3) is (6.6), but the one we need is (6.7). So by a direct application of a T(b) type theorem we will obtain bad results when γ(E) ≪ H^1(E).
To prove Theorem 6.1, we need to work with a measure “better behaved” than ν_0, which we call ν. This new measure will be a suitable modification of ν_0 with the required estimate for its total variation. To construct ν, in [To6] we consider a set F containing E made up of a finite disjoint union of squares: F = ⋃_{i∈I} Q_i. One should think that the squares Q_i approximate E at some “intermediate scale”. For example, if E = E_N is the N-th approximation of the corner quarters Cantor set, then a good choice for F would be E_{N/2} (assuming N even), and the squares Q_i are the 4^{N/2} squares of generation N/2. For each square Q_i, we take a complex measure ν_i supported on Q_i such that ν_i(Q_i) = ν_0(Q_i) and |ν_i|(Q_i) = |ν_i(Q_i)| (that is, ν_i is a constant multiple of a positive measure). We set ν = Σ_i ν_i. So ν is some kind of approximation of ν_0, and if the squares Q_i are big enough, the variation |ν| becomes sufficiently small (because there are “cancellations” in the measure ν_0 in each Q_i). On the other hand, the squares Q_i cannot be too big, because we need
\[
\gamma_+(F) \le C\gamma_+(E). \tag{6.8}
\]
In this way, we will have constructed a complex measure ν supported on F satisfying
\[
|\nu|(F) \approx |\nu(F)| = \gamma(E). \tag{6.9}
\]
Taking a suitable measure µ such that supp(µ) ⊃ supp(ν) and µ(F) ≈ γ(E), we will be ready for the application of a T(b) type theorem, such as the one in [NTV3], which is a very powerful tool. Indeed, notice that (6.9) implies that ν satisfies a global accretivity condition and that also the variation |ν| is controlled. On the other hand, if we have been careful enough, we will have also some useful estimates on |Cν|, since ν is an approximation of ν_0. Then, using the T(b) theorem in [NTV3], we will deduce γ+(F) ≥ C^{-1} µ(E), and so γ+(E) ≥ C^{-1} γ(E), by (6.8), and we will be done.

Nevertheless, in order to obtain the right estimates on the measures ν and µ it will be necessary to use an induction argument involving the sizes of the squares Q_i, which will allow us to assume that γ(E ∩ Q_i) ≈ γ+(E ∩ Q_i) for each square Q_i.

Let us remark that the choice of the right squares Q_i which approximate E at an intermediate scale is one of the key points of the argument. The potential defined in (5.2) plays an important role here.

7. Other results

In [To7], some results analogous to Theorems 5.1 and 6.1 have been obtained for the continuous analytic capacity α. Recall that this capacity is defined like γ in (1.1), with the additional requirement that the functions f considered in the sup should extend continuously to the whole complex plane. In particular, in [To7] it is shown that α is semiadditive. This result has some nice consequences for the theory of uniform rational approximation on the complex plane. For example, it implies the so-called inner boundary conjecture (see [DØ] and [Ve1] for previous contributions).
Corollary 6.2 yields a characterization of removable sets for bounded analytic functions in terms of curvature of measures. Although this result has a definite geometric flavor, it is not clear if this is a really good geometric characterization. Nevertheless, in [To8] it has been shown that the characterization is invariant under bilipschitz mappings, using a corona type decomposition for non-doubling measures. See also [GV] for an analogous result for some Cantor sets.

Using the corona type decomposition for measures with finite curvature and linear growth obtained in [To8], it has been proved in [To9] that if µ is a measure without atoms such that the Cauchy transform is bounded on L^2(µ), then any CZO associated to a sufficiently smooth odd kernel is also bounded on L^2(µ).

Volberg [Vo] has proved the natural generalization of Theorem 6.1 to higher dimensions. In this case, one should consider Lipschitz harmonic capacity instead of analytic capacity (see [MP] for the definition and properties of Lipschitz harmonic capacity). The main difficulty arises from the fact that in this case one does not have any good substitute of the notion of curvature of measures, and then one has to argue with a potential very different from the one defined in (5.2). See also [MT] for related results about Cantor sets in R^d which avoid the use of any notion similar to curvature.

However, the relationship of Lipschitz harmonic capacity with rectifiability is not well understood. That is to say, a result analogous to David's Theorem 2.1 is missing for this capacity. The reason is that, given a set E ⊂ R^d with H^{d−1}(E) < ∞ (where H^{d−1} stands for the (d − 1)-dimensional Hausdorff measure), it is not known if the fact that the Riesz transform, i.e., the CZO associated to the vectorial kernel (x − y)/|x − y|^d, is bounded on L^2(H^{d−1}|E) implies that E is (d − 1)-rectifiable.

The techniques for the proof of Theorem 6.1 have also been used by Prat [Pr] and Mateu, Prat and Verdera [MPV] to study the capacities γ_s associated to s-dimensional signed Riesz kernels with s non-integer:
\[
k(x,y) = \frac{x-y}{|x-y|^{s+1}}.
\]
In [Pr] it is shown that sets with finite s-dimensional Hausdorff measure have vanishing capacity γ_s when 0 < s < 1. Moreover, for these values of s it is proved in [MPV] that γ_s is comparable to the capacity C_{\frac{2}{3}(n−s), \frac{3}{2}} from nonlinear potential theory. The case of non-integer s with s > 1 seems much more difficult to study, although in the AD regular situation some results have been obtained [Pr]. The results in [Pr] and [MPV] show that the behavior of γ_s with s non-integer is very different from the one with s integer.

For more information, we recommend the interested reader to look at the recent surveys [Dd2] and [Pa2], where the geometric part of the recent developments in connection with Painlevé's problem is treated in more detail
than in the present paper. For open questions about the relationship between the length of projections of sets and their analytic capacity, as well as other related problems, see [Ma3].

References

[Ah] L.V. Ahlfors, Bounded analytic functions, Duke Math. J. 14 (1947), 1–11.
[CS] W. Chen and E. Sawyer, A note on commutators of fractional integrals with RBMO functions, Illinois J. Math. 46 (2002), 1287–1298.
[Dd1] G. David, Unrectifiable 1-sets have vanishing analytic capacity, Revista Mat. Iberoamericana 14(2) (1998), 369–479.
[Dd2] G. David, Uniformly rectifiable sets, Preprint (2002).
[DM] G. David and P. Mattila, Removable sets for Lipschitz harmonic functions in the plane, Rev. Mat. Iberoamericana 16 (2000), no. 1, 137–215.
[DØ] A.M. Davie and B. Øksendal, Analytic capacity and differentiability properties of finely harmonic functions, Acta Math. 149 (1982), no. 1-2, 127–152.
[Ei] V. Eiderman, Hausdorff measure and capacity associated with Cauchy potentials (Russian), Mat. Zametki 63 (1998), 923–934. Translation in Math. Notes 63 (1998), 813–822.
[Gam] T. Gamelin, Uniform Algebras, Prentice Hall, Englewood Cliffs N.J., 1969.
[GCM1] J. García-Cuerva and J.M. Martell, Weighted inequalities and vector-valued Calderón-Zygmund operators on non-homogeneous spaces, Publ. Mat. 44 (2000), 613–640.
[GCM2] J. García-Cuerva and J.M. Martell, Two-weight norm inequalities for maximal operators and fractional integrals on non-homogeneous spaces, Indiana Univ. Math. J. 50 (2001), no. 3, 1241–1280.
[GCG1] J. García-Cuerva and E. Gatto, Boundedness properties of fractional integral operators associated to non-doubling measures, Studia Math. 162 (2004), 245–261.
[GCG2] J. García-Cuerva and E. Gatto, Lipschitz spaces and Calderón-Zygmund operators associated to non-doubling measures, Preprint (2002).
[Gar1] J.B. Garnett, Positive length but zero analytic capacity, Proc. Amer. Math. Soc. 24 (1970), 696–699.
[Gar2] J. Garnett, Analytic capacity and measure, Lecture Notes in Math. 297, Springer-Verlag, 1972.
[HMY2] G. Hu, Y. Meng and D. Yang, New atomic characterization of H^1 space with non-doubling measures and its applications, to appear in Math. Proc. Cambridge Philos. Soc.
[Iv] L.D. Ivanov, On sets of analytic capacity zero, in “Linear and Complex Analysis Problem Book 3” (part II), Lecture Notes in Mathematics 1574, Springer-Verlag, Berlin, 1994, pp. 150–153.
[GV] J. Garnett and J. Verdera, Analytic capacity, bilipschitz maps and Cantor sets, Math. Res. Lett. 10 (2003), 515–522.
[Jo] P.W. Jones, Rectifiable sets and the traveling salesman problem, Invent. Math. 102 (1990), 1–15.
[JM] P.W. Jones and T. Murai, Positive analytic capacity but zero Buffon needle probability, Pacific J. Math. 133 (1988), 89–114.
[Lé] J.C. Léger, Menger curvature and rectifiability, Ann. of Math. 149 (1999), 831–869.
[MM] J. Martín and M. Milman, Gehring's lemma for nondoubling measures, Michigan Math. J. 47 (2000), 559–573.
[MMNO] J. Mateu, P. Mattila, A. Nicolau, and J. Orobitg, BMO for nondoubling measures, Duke Math. J. 102 (2000), 533–565.
[MPV] J. Mateu, L. Prat and J. Verdera, The capacities associated to signed Riesz kernels, and Wolff potentials, Preprint (2003). To appear in J. Reine Angew. Math.
[MT] J. Mateu and X. Tolsa, Riesz transforms and harmonic Lip_1-capacity in Cantor sets, Proc. London Math. Soc. 89(3) (2004), 676–696.
[MTV] J. Mateu, X. Tolsa and J. Verdera, The planar Cantor sets of zero analytic capacity and the local T(b)-Theorem, J. Amer. Math. Soc. 16 (2003), 19–28.
[Ma1] P. Mattila, Smooth maps, null sets for integralgeometric measure and analytic capacity, Ann. of Math. 123 (1986), 303–309.
[Ma2] P. Mattila, On the analytic capacity and curvature of some Cantor sets with non-σ-finite length, Publ. Mat. 40 (1996), 127–136.
[Ma3] P. Mattila, Hausdorff dimension, projections, and Fourier transform, Publ. Mat. 48 (2004), 3–48.
[MMV] P. Mattila, M.S. Melnikov and J. Verdera, The Cauchy integral, analytic capacity, and uniform rectifiability, Ann. of Math. (2) 144 (1996), 127–136.
[MP] P. Mattila and P.V. Paramonov, On geometric properties of harmonic Lip_1 capacity, Pacific J. Math. 171:2 (1995), 469–490.
[Me] M.S. Melnikov, Analytic capacity: discrete approach and curvature of a measure, Sbornik: Mathematics 186(6) (1995), 827–846.
[MV] M.S. Melnikov and J. Verdera, A geometric proof of the L^2 boundedness of the Cauchy integral on Lipschitz graphs, Internat. Math. Res. Notices (1995), 325–331.
[NTV1] F. Nazarov, S. Treil and A. Volberg, Cauchy integral and Calderón-Zygmund operators in non-homogeneous spaces, Internat. Res. Math. Notices 15 (1997), 703–726.
[NTV2] F. Nazarov, S. Treil and A. Volberg, Weak type estimates and Cotlar inequalities for Calderón-Zygmund operators in nonhomogeneous spaces, Int. Math. Res. Notices 9 (1998), 463–487.
[NTV3] F. Nazarov, S. Treil and A. Volberg, How to prove Vitushkin's conjecture by pulling ourselves up by the hair, Preprint (2000).
[NTV4] F. Nazarov, S. Treil and A. Volberg, The Tb-theorem on non-homogeneous spaces, Acta Math. 190 (2003), 151–239.
[NTV5] F. Nazarov, S. Treil and A. Volberg, Accretive system Tb-theorems on nonhomogeneous spaces, Duke Math. J. 113 (2002), 259–312.
[OP] J. Orobitg and C. Pérez, A_p weights for nondoubling measures in R^n and applications, Trans. Amer. Math. Soc. 354 (2002), 2013–2033.
[Pa1] H. Pajot, Analytic capacity, rectifiability, Menger curvature and the Cauchy integral, Lecture Notes in Math. 1799 (2002), Springer.
[Pa2] H. Pajot, Capacité analytique et le problème de Painlevé, Séminaire Bourbaki, 56 année, 2003-04, n. 936.
[Pr] L. Prat, Potential theory of signed Riesz kernels: capacity and Hausdorff measures, Int. Math. Res. Notices 19 (2004), 937–981.
[To1] X. Tolsa, L^2-boundedness of the Cauchy integral operator for continuous measures, Duke Math. J. 98(2) (1999), 269–304.
[To2] X. Tolsa, BMO, H^1 and Calderón-Zygmund operators for non-doubling measures, Math. Ann. 319 (2001), 89–149.
[To3] X. Tolsa, The space H^1 for nondoubling measures in terms of a grand maximal operator, Trans. Amer. Math. Soc. 355 (2003), 315–348.
[To4] X. Tolsa, Littlewood-Paley theory and the T(1) theorem with non-doubling measures, Adv. Math. 164 (2001), 57–116.
[To5] X. Tolsa, On the analytic capacity γ+, Indiana Univ. Math. J. 51(2) (2002), 317–344.
[To6] X. Tolsa, Painlevé's problem and the semiadditivity of analytic capacity, Acta Math. 190:1 (2003), 105–149.
[To7] X. Tolsa, The semiadditivity of continuous analytic capacity and the inner boundary conjecture, Amer. J. Math. 126 (2004), 523–567.
[To8] X. Tolsa, Bilipschitz maps, analytic capacity, and the Cauchy integral, Preprint (2003). To appear in Ann. of Math.
[To9] X. Tolsa, L^2 boundedness of the Cauchy transform implies L^2 boundedness of all Calderón-Zygmund operators associated to odd kernels, Publ. Mat. 48 (2) (2004), 445–479.
[Ve1] J. Verdera, BMO rational approximation and one-dimensional Hausdorff content, Trans. Amer. Math. Soc. 297 (1986), 283–304.
[Ve2] J. Verdera, On the T(1)-theorem for the Cauchy integral, Ark. Mat. 38 (2000), 183–199.
[Ve3] J. Verdera, The fall of the doubling condition in Calderón-Zygmund theory, Proceedings of the 6th International Conference on Harmonic Analysis and Partial Differential Equations (El Escorial, 2000), Publ. Mat. 2002, Vol. Extra, 275–292.
[Vi1] A.G. Vitushkin, Estimate of the Cauchy integral (Russian), Mat. Sb. 71 (113) (1966), 515–534.
[Vi2] A.G. Vitushkin, The analytic capacity of sets in problems of approximation theory, Uspeikhi Mat. Nauk. 22(6) (1967), 141–199 (Russian); in Russian Math. Surveys 22 (1967), 139–200.
[Vo] A. Volberg, Calderón-Zygmund capacities and operators on nonhomogeneous spaces, CBMS Regional Conf. Ser. in Math. 100, Amer. Math. Soc., Providence, 2003.
Xavier Tolsa
Institució Catalana de Recerca i Estudis Avançats (ICREA) and Departament de Matemàtiques
Universitat Autònoma de Barcelona, Spain
e-mail:
[email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
Regularization Techniques for Singular Source Terms in Differential Equations

Anna-Karin Tornberg

Abstract. Regularization of singular source terms in partial differential equations is a widely used approach to deal with the challenge of numerically approximating such equations. In this paper, we analyze regularization techniques in one and several dimensions. We consider general numerical grids as well as the special case of a uniform grid, in which case discrete properties allow for reduced errors. All regularizations are based on a regularization of the Dirac delta function in one dimension, which is then extended to more dimensions. It is critical how this extension is done. While it can produce a multi-dimensional approximation with the same accuracy as given by the one-dimensional analysis, we also show that a technique commonly used in connection to level-set methods may lead to O(1) errors. Modifications to this inconsistent technique are introduced.
Research partially sponsored by Swedish VR-grant no 222-2000-434.

1. Introduction and preliminaries

In numerical methods, functions are represented by their values at nodes of a computational grid. Singular and discontinuous functions are not well represented on such grids since a singularity can fall anywhere between two grid points, without any change in the discrete representation. This is an issue for example in the quadrature of low regularity functions, and in the discretization of partial differential equations with discontinuous coefficients or singular source terms. Regularization of singularities is a way to properly place the singularities on the grid; there is no longer any ambiguity regarding their positions. However, how the regularization is designed will affect the overall accuracy of the numerical discretization.

Singular source terms in differential equations appear in many different applications. Examples include multiphase flows [8, 11, 14, 16], dendritic solidification [4, 8], simulation of elastic boundaries in blood flow [9, 10] and subgrid wire modeling in computational electromagnetics [5]. In the example of immiscible multiphase flow, the singular source term arises from the surface tension forces that are acting on the moving and deforming interfaces that are separating any two different liquids.

Regularization of singular terms is an important component in many computational techniques that have been applied to these problems, as for example in the immersed boundary method by Peskin [10], in the front-tracking method by Tryggvason et al. [16] and in connection to the level-set method, see Osher and Fedkiw [8] and Sethian [11]. These are all techniques for moving interface problems, in which the underlying grid is not adapted to the moving boundaries. The boundaries, or interfaces, are instead represented separately. Singular source terms with support on these interfaces must be discretized on the background grid, which is often uniform.

In a finite element setting, the handling of the interface source terms can be done by evaluating the resulting surface integral in the weak formulation [14]. In a finite difference method, an alternative to regularization is to incorporate the jump conditions arising from the singular term into the numerical algorithm, as is done in the immersed interface method by LeVeque and Li [6]. Regularization of the singularity is however more common, with the main advantage that standard finite difference or finite element methods may be used to discretize the equation.

Let Γ ⊂ R^d be a (d − 1)-dimensional continuous and bounded surface and let S be surface coordinates on Γ. Define δ(Γ, g, x), x ∈ R^d, as a delta function of variable strength supported on Γ such that
\[
\int_{\mathbb{R}^d} \delta(\Gamma, g, x)\, f(x)\, dx = \int_{\Gamma} g(S)\, f(X(S))\, dS, \tag{1.1}
\]
where X(S) ∈ Γ.

We want to replace the Dirac delta function δ with support on Γ by a more regular function δ_ε, which can be used in connection to numerical solution of differential equations with singular source terms and quadrature with singular integrands. An explicit representation of Γ is available in front-tracking methods, and it is convenient to define regularizations δ_ε(Γ, g, x) based on this representation. In level-set methods however, Γ is only implicitly defined as the zero level set of a continuous function. Here, it is more convenient to make use of the closest distance to Γ, given as the absolute value of a signed distance function, d(Γ, x), in the definition of the delta approximation.

We have analyzed this type of regularizations in a sequence of papers [12, 13, 15, 2]. The discussion in this paper is based on the results in these papers, combined with certain new results.

Consider a partial differential equation
\[
Lu = \delta(\Gamma, g, x), \qquad x \in \Omega \subset \mathbb{R}^d,
\]
where L is a linear differential operator and we here for simplicity assume that u satisfies homogeneous initial and boundary conditions. The solution can then be given in the form
\[
u(x) = \int_{\Omega} G(x, y)\, \delta(\Gamma, g, y)\, dy,
\]
where G(x, y) is the fundamental solution or Green's function.
Assume that a numerical approximation of this equation with the singular source term δ(Γ, g, x) replaced by a regularized approximation δ_ε(Γ, g, x) has the solution u_{ε,h}. Considering the form in which u(x) is given above, in the analysis of the error |u − u_{ε,h}| we will make use of results regarding the error in the numerical integration of δ(Γ, g, x) f(x) with δ(Γ, g, x) replaced by a regularized approximation δ_ε(Γ, g, x).

The analysis of the quadrature errors is also of interest by itself. For example, in connection to level-set methods, where Γ is defined only implicitly, numerical integration is applied to a Dirac delta function with support on a curve or surface Γ to compute the curve length or the surface area of Γ [8, 11].

Denote numerical integration of a function f(x) by quad(f(x)), such that
\[
\mathrm{quad}(f(x)) = \sum_{i=1}^{N_Q} w_i\, f(q_i),
\]
where w_i and q_i, i = 1, ..., N_Q, are the quadrature weights and points, respectively. We define the discretization error to be
\[
E = \Bigl| \int_{\Omega} \delta(\Gamma, g, x)\, f(x)\, dx - \mathrm{quad}\bigl(\delta_\varepsilon(\Gamma, g, x)\, f(x)\bigr) \Bigr|. \tag{1.2}
\]
The integral in this expression evaluates as given in Eq. (1.1). One approach to analyze this error is to split it into two parts,
\[
E \le \Bigl| \int_{\Omega} \delta(\Gamma, g, x)\, f(x)\, dx - \int_{\Omega} \delta_\varepsilon(\Gamma, g, x)\, f(x)\, dx \Bigr|
+ \Bigl| \int_{\Omega} \delta_\varepsilon(\Gamma, g, x)\, f(x)\, dx - \mathrm{quad}\bigl(\delta_\varepsilon(\Gamma, g, x)\, f(x)\bigr) \Bigr|, \tag{1.3}
\]
where the first part is the analytical error made when replacing δ(Γ, g, x) with a regularized approximation δ_ε(Γ, g, x). The second part is the numerical error in the integration of δ_ε(Γ, g, x) f(x). This approach can be used whatever quadrature rule is applied for the numerical integration, and will be discussed in Section 2.

However, if we define a uniform grid, with the quadrature simply a sum over those grid points, the total error can be analyzed directly, without splitting it up. Doing this, one finds that there are specific choices of regularizations where the error is particularly small.

Let us consider an example in one dimension, where we define δ_ε(t) = (1 + t/ε)/ε for −ε ≤ t < 0, (1 − t/ε)/ε for 0 ≤ t ≤ ε, and zero elsewhere. This is a linear hat function. If we analyze the error in the integration of δ(x − x̄) by splitting it in two parts, the analysis yields that the best choice is to take ε proportional to √h, where h is a representative grid size, and that the error then is of O(h). However, if we consider a uniform grid with grid size h (grid
points x_j = jh, j ∈ Z) and define the discretization error as (with g ≡ 1),
\[
E_{1D} = \Bigl| f(\bar{x}) - h \sum_{j \in \mathbb{Z}} \delta_\varepsilon(x_j - \bar{x})\, f(x_j) \Bigr|,
\]
we can analyze the full error directly. With ε = ph, p integer, we find that the error is O(h^2). However, if p is not an integer, the error will be of O(1). This is discussed in Section 3.1.

One-dimensional delta function approximations with compact support can be designed to yield any desired order of accuracy in the regularization parameter ε. This order of accuracy can be retained in higher dimensions, if the one-dimensional delta function approximation is extended to several dimensions by a so-called product rule. The proof is given in Section 3.2.

In connection to level set methods, it is common to extend to several dimensions by using a signed distance function to Γ. Here, a choice of ε = ph will however lead to an O(1) error, even if p is an integer. This is discussed in Section 3.2, where, as a remedy to this problem, we also introduce other regularizations based on the signed distance function that are consistent.

This special case of a regular uniform grid is important particularly in the context of partial differential equations. In Section 4, we make use of the results from the quadrature analysis in Sections 2 and 3, and apply them to the analysis of partial differential equations. In Section 5, we give some brief comments about regularization in connection to quadrature of discontinuous functions and discontinuous coefficients in partial differential equations.

2. Quadrature of singular functions

In this section we analyze the error by splitting it into two parts, an analytical and a numerical error, as given in Eq. (1.3). If no regularization is made, there is no analytical error, but the numerical error tends to be large. The aim is to balance these two sources of errors.

2.1. One dimension. Before we consider higher dimensions, let us study a point of singularity, x̄, in one dimension. We define a regularized delta function by
\[
\delta_\varepsilon(t) =
\begin{cases}
\frac{1}{\varepsilon}\, \varphi(t/\varepsilon), & |t| \le \varepsilon, \\
0, & |t| > \varepsilon,
\end{cases} \tag{2.1}
\]
where ϕ(ξ) is a smooth function in −1 ≤ ξ ≤ 1.

With a singularity at x = x̄, we have the analytical error
\[
E_{1D,\varepsilon} = \int_{-\infty}^{\infty} \bigl[\delta(x - \bar{x}) - \delta_\varepsilon(x - \bar{x})\bigr] f(x)\, dx
= f(\bar{x}) - \int_{-\infty}^{\infty} \delta_\varepsilon(x - \bar{x})\, f(x)\, dx. \tag{2.2}
\]
Assuming that f(x) ∈ C^m([x̄ − ε, x̄ + ε]), we Taylor expand this function around x = x̄. Using this expansion, and substituting t = x − x̄ in the integrals, we get
\[
E_{1D,\varepsilon} = f(\bar{x}) \Bigl[ 1 - \int_{-\infty}^{\infty} \delta_\varepsilon(t)\, dt \Bigr]
- \sum_{p=1}^{m-1} \frac{1}{p!}\, f^{(p)}(\bar{x}) \int_{-\infty}^{\infty} \delta_\varepsilon(t)\, t^p\, dt
- \frac{1}{m!}\, f^{(m)}(\eta) \int_{-\infty}^{\infty} \delta_\varepsilon(t)\, t^m\, dt, \tag{2.3}
\]
where η ∈ [x̄ − ε, x̄ + ε]. To cancel the leading order error term, we need δ_ε(t) to satisfy the mass condition
\[
\int_{-\infty}^{\infty} \delta_\varepsilon(t)\, dt = \int_{-\varepsilon}^{\varepsilon} \delta_\varepsilon(t)\, dt = \int_{-1}^{1} \varphi(\xi)\, d\xi = 1. \tag{2.4}
\]
Assuming that δ_ε also satisfies m − 1 more moment conditions, i.e.,
\[
\int_{-\infty}^{\infty} \delta_\varepsilon(t)\, t^p\, dt = \int_{-\varepsilon}^{\varepsilon} \delta_\varepsilon(t)\, t^p\, dt = \varepsilon^p \int_{-1}^{1} \varphi(\xi)\, \xi^p\, d\xi = 0, \tag{2.5}
\]
for p = 1, ..., m − 1, we have that
\[
E_{1D,\varepsilon} = -\frac{1}{m!}\, f^{(m)}(\eta) \int_{-\infty}^{\infty} \delta_\varepsilon(t)\, t^m\, dt
= -\frac{1}{m!}\, f^{(m)}(\eta)\, \varepsilon^m \int_{-1}^{1} \varphi(\xi)\, \xi^m\, d\xi. \tag{2.6}
\]
The last integral is bounded independent of ε, and so we have the bound |E_{1D,ε}| < Cε^m. Hence, the number of moment conditions that are satisfied determines the order of the analytical error in powers of the width of the regularization zone, ε.

To consider the numerical error, assume that we have a representative grid size of h and a quadrature rule that is of formal order Q, so that the quadrature error for a smooth function is bounded by Ch^Q. Now, if δ_ε(t) ∈ C^k(R), k + 1 < Q, and the (k + 1)th derivative of δ_ε(t) is bounded, then the absolute value of the numerical error
\[
E_{1D,\mathrm{quad}} = \int_{-\infty}^{\infty} \delta_\varepsilon(x - \bar{x})\, f(x)\, dx - \mathrm{quad}\bigl(\delta_\varepsilon(x - \bar{x})\, f(x)\bigr) \tag{2.7}
\]
is bounded by C max_{t∈R} |δ_ε^{(k+1)}(t)| h^{k+2}, where C is independent of ε. This follows from a theorem by Jackson [3], which is stated in full in [12]. We have here assumed that f(x) is of higher regularity than δ_ε. Finally, since δ_ε^{(k+1)}(t) = ε^{−(k+2)} ϕ^{(k+1)}(t/ε), this yields
\[
|E_{1D,\mathrm{quad}}| < C \max_{\xi \in [-1,1]} |\varphi^{(k+1)}(\xi)|\, \frac{h^{k+2}}{\varepsilon^{k+2}}. \tag{2.8}
\]
Hence, the regularity of the delta approximation determines the order of the numerical error, assuming that the order of the quadrature rule is high enough.

The linear hat function discussed in the introduction with a moment order m = 2 and k = 0 has an analytical error of O(ε^2) and a numerical error of
O(h^2/ε^2). In this case, the optimal scaling is ε ∼ √h, yielding both errors of O(h).
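For comparison with this O(h) estimate, the uniform-grid error E_{1D} from the introduction can be evaluated directly for the hat function. The following is a minimal sketch, not taken from the paper: the test function f = cos, the base point 0.3 and the helper names are assumptions chosen only for illustration. It takes the largest error over several positions of x̄ relative to the grid, with ε = ph for an integer and a non-integer p.

```python
import math


def hat_delta(t, eps):
    """Linear hat function of half-width eps and unit mass."""
    a = abs(t)
    return (1.0 - a / eps) / eps if a <= eps else 0.0


def grid_error(f, x_bar, h, p):
    """|f(x_bar) - h * sum_j delta_eps(x_j - x_bar) f(x_j)| with x_j = j*h, eps = p*h."""
    eps = p * h
    j_lo = math.floor((x_bar - eps) / h) - 1
    j_hi = math.ceil((x_bar + eps) / h) + 1
    total = sum(h * hat_delta(j * h - x_bar, eps) * f(j * h)
                for j in range(j_lo, j_hi + 1))
    return abs(f(x_bar) - total)


def worst_error(f, h, p, x0=0.3, shifts=11):
    """Largest error over several positions of x_bar relative to the grid."""
    return max(grid_error(f, x0 + (i / shifts) * h, h, p) for i in range(shifts))


if __name__ == "__main__":
    for h in (0.1, 0.01, 0.001):
        # columns: h, error with eps = 2h, error with eps = 1.5h
        print(h, worst_error(math.cos, h, 2), worst_error(math.cos, h, 1.5))
```

In this sketch the column for p = 2 should shrink roughly like h^2, while the column for p = 1.5 should stay at a fixed size as h decreases, because the discrete mass h Σ_j δ_ε(x_j − x̄) then differs from 1 by an amount that does not depend on h.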
2.2. Several dimensions. Now, we want to define a delta function approximation when a curve $\Gamma \subset \mathbb{R}^2$ defines the location of the singularity. One way to do so is to use the signed distance function to $\Gamma$. For each point $x$, this function $d(\Gamma, x)$ gives the distance to the closest point on $\Gamma$. There is a sign convention such that the function is positive on one side of $\Gamma$ and negative on the other, so that $\Gamma$ is the zero level set of $d(\Gamma, x)$. We define

$$\delta_\varepsilon(\Gamma, g, x) = \tilde{g}(x)\,\delta_\varepsilon(d(\Gamma, x)), \tag{2.9}$$
where $\tilde{g}$ is an extension from $g(S)$ to a neighborhood of $\Gamma$. This technique is commonly used in connection to the level set method since the distance function $d(\Gamma, x)$ is then readily available, see [8, 11]. We will in this paper not discuss the method for extending the function $g$. The regularity of the one-dimensional delta approximation as well as the moment conditions discussed in the previous section will be important also in the analysis of the error for this two-dimensional delta approximation. We therefore introduce the following definition:

Definition 2.1. Denote by $\delta_\varepsilon^{m,k}(t)$ the delta function approximation defined by

$$\delta_\varepsilon^{m,k}(t) = \begin{cases} \frac{1}{\varepsilon}\,\varphi^{m,k}(t/\varepsilon), & |t| \le \varepsilon, \\ 0, & |t| > \varepsilon, \end{cases} \tag{2.10}$$

where the function $\varphi^{m,k}(\xi)$ is such that for $m \ge 0$ and $k \ge -1$,

$$\int_{-1}^{1} \varphi^{m,k}(\xi)\,d\xi = 1, \tag{2.11}$$

$$(\varphi^{m,k})^{(\beta)}(\pm 1) = 0, \qquad \beta = 0, \dots, k, \tag{2.12}$$

and furthermore,

$$\eta_\alpha = \int_{-1}^{1} \varphi^{m,k}(\xi)\,\xi^\alpha\,d\xi = 0, \qquad \alpha = 1, \dots, m-1. \tag{2.13}$$
With this definition, we have that $\delta_\varepsilon^{m,k}(t) \in C^k(\mathbb{R})$. Furthermore, it obeys $m$ moment conditions: first, moment condition number zero, or the mass condition, as defined in (2.4) and (2.11), and then $m-1$ more moment conditions, as defined in (2.5) and (2.13). We say that the delta approximation is of moment order $m$.

Theorem 2.2. Let $\delta_\varepsilon^{m,k}(t)$ be as in Definition 2.1. Assume that $\Gamma$ can be parameterized by $\Gamma = (X(s), Y(s))$, $X, Y \in C^2[s_1, s_2]$, and that the curvature $\kappa(s)$ is such that $\varepsilon \max_s |\kappa(s)| < 1$. Furthermore, let $d(\Gamma, x)$ be the signed distance function to $\Gamma$, and assume $\Omega$ such that $\Omega_\varepsilon \subset \Omega$, where $\Omega_\varepsilon = \{x \in \mathbb{R}^2 : |d(\Gamma, x)| \le \varepsilon\}$. Then, the analytical error

$$E_{\varepsilon,f}(\delta_\varepsilon^{m,k}) = \int_{\Omega} \{\delta_\varepsilon^{m,k}(d(\Gamma, x)) - \delta(d(\Gamma, x))\}\,f(x)\,dx, \tag{2.14}$$
is given by

$$E_{\varepsilon,f}(\delta_\varepsilon^{m,k}) = C_{m,f}\,\eta_m\,\varepsilon^m + O(\varepsilon^{m+1}), \tag{2.15}$$

with $\eta_m$ defined by (2.13). The constant $C_{m,f}$ is independent of $\varepsilon$, and bounded under the assumption that all partial derivatives of $f(x)$ up to order $m$ are bounded.

The proof of this theorem is given in [12]. It uses a parameterization of the region $\Omega_\varepsilon = \{x \in \mathbb{R}^2 : |d(\Gamma, x)| \le \varepsilon\}$ in one coordinate $s$ along $\Gamma$, and another ($t$) across the regularization zone. Similarly to one dimension, Taylor expansion is applied in the $t$ coordinate, and the moment conditions satisfied by $\delta_\varepsilon^{m,k}(t)$ are used to deduce the final result.

The functions $\varphi^{m,k}(\xi)$ that are used to define the delta approximation can be from any function class. For example, one could define $\varphi^{2,1}_{\cos}(\xi) = \frac{1}{2}(1 + \cos(\pi\xi))$. This approximation was introduced by Peskin in 1977 [9]. It is of moment order two, and it has one continuous derivative. We could also use a piecewise linear function, $\varphi^{2,0}_{\mathrm{hat}}(\xi) = \min(1 + \xi, 1 - \xi)$. This continuous approximation is also of moment order two, but its first derivative is discontinuous. However, to define delta approximations of higher moment order, it is convenient to work with polynomials defined on $-1 \le \xi \le 1$.

Definition 2.3. Denote by $\varphi^{m,k}(\xi)$ the delta polynomial of lowest degree that obeys the conditions in Definition 2.1.

The following theorem concerning these delta polynomials was proven in [12]:

Theorem 2.4. The delta polynomial $\varphi^{m,k}(\xi)$ exists and is uniquely determined by the conditions in Definition 2.3. It is a polynomial of degree $r = 2\left(\lfloor \frac{m+1}{2} \rfloor + k\right)$, containing only even powers of $\xi$.

Hence, given any desired moment order and regularity of the resulting delta approximation, the theorem certifies that such a delta polynomial does indeed exist. Moreover, if we want it to be of the lowest degree possible, it is unique, with the polynomial degree as given by the theorem.

Remark 2.5. In addition to (2.13), the quantity $\eta_\alpha = 0$ for all $\alpha$ odd, since $\varphi^{m,k}(\xi)$ is an even polynomial.
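This yields that the polynomial $\varphi^{n-1,k}(\xi)$, $n$ even, is equal to $\varphi^{n,k}(\xi)$. Consequently, for the delta polynomials, we can modify the error formula (2.15) in Theorem 2.2 to read

$$E_{\varepsilon,f}(\delta_\varepsilon^{m,k}) = C_{\beta,f}\,\eta_\beta\,\varepsilon^\beta + O(\varepsilon^{\beta+2}), \qquad \text{where } \beta = 2\left\lfloor \tfrac{m+1}{2} \right\rfloor. \tag{2.16}$$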
A few examples of polynomials $\varphi^{m,k}(\xi)$ defining delta approximations $\delta_\varepsilon^{m,k}(t)$ through (2.10) are given below. The polynomials with moment order two ($m = 2$) and with $k = 0$, 1 and 2 continuous derivatives, respectively, are

$$\varphi^{2,0} = \tfrac{3}{4}(1 - \xi^2), \qquad \varphi^{2,1} = \tfrac{15}{16}(1 - 2\xi^2 + \xi^4), \qquad \varphi^{2,2} = \tfrac{35}{32}(1 - 3\xi^2 + 3\xi^4 - \xi^6). \tag{2.17}$$

The polynomials with moment order four ($m = 4$) and with $k = 0$, 1 and 2 continuous derivatives, respectively, are

$$\varphi^{4,0} = \tfrac{15}{32}(3 - 10\xi^2 + 7\xi^4), \qquad \varphi^{4,1} = \tfrac{105}{64}(1 - 5\xi^2 + 7\xi^4 - 3\xi^6), \qquad \varphi^{4,2} = \tfrac{315}{512}(3 - 20\xi^2 + 42\xi^4 - 36\xi^6 + 11\xi^8). \tag{2.18}$$

These polynomials are plotted in Figure 1.
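As a cross-check of (2.17) and (2.18), the conditions of Definition 2.1 can be turned into a small linear system for the coefficients of an even polynomial of the degree stated in Theorem 2.4. The sketch below is an illustration only: the function names `solve_exact` and `delta_polynomial` and the use of exact rational arithmetic are assumptions made here, not the construction used in [12]. It relies on the uniqueness asserted by Theorem 2.4 for the system to be nonsingular.

```python
from fractions import Fraction
from math import perm


def solve_exact(a, b):
    """Gauss-Jordan elimination over the rationals for a square system a x = b."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def delta_polynomial(m, k):
    """Coefficients c_0..c_n of phi(xi) = sum_i c_i xi**(2 i), n = floor((m+1)/2) + k."""
    n = (m + 1) // 2 + k
    rows, rhs = [], []
    # Mass condition (2.11): integral of xi**(2 i) over [-1, 1] is 2 / (2 i + 1).
    rows.append([Fraction(2, 2 * i + 1) for i in range(n + 1)])
    rhs.append(Fraction(1))
    # Even moment conditions (2.13); odd moments vanish automatically by symmetry.
    for alpha in range(2, m, 2):
        rows.append([Fraction(2, 2 * i + alpha + 1) for i in range(n + 1)])
        rhs.append(Fraction(0))
    # Boundary conditions (2.12) at xi = 1; the beta-th derivative of xi**a at 1
    # is a (a-1) ... (a-beta+1) = perm(a, beta). Evenness covers xi = -1.
    for beta in range(k + 1):
        rows.append([Fraction(perm(2 * i, beta)) for i in range(n + 1)])
        rhs.append(Fraction(0))
    return solve_exact(rows, rhs)


if __name__ == "__main__":
    for m, k in [(2, 0), (2, 1), (2, 2), (4, 0), (4, 1), (4, 2)]:
        print(m, k, [str(c) for c in delta_polynomial(m, k)])
```

For example, `delta_polynomial(2, 0)` returns the coefficients of $1$ and $\xi^2$ in $\varphi^{2,0} = \frac{3}{4}(1 - \xi^2)$.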
Figure 1. Plot of polynomials $\varphi^{m,k}(\xi)$ versus $\xi$, defining delta approximations $\delta_\varepsilon^{m,k}(t)$ with moment order $m$ and $k$ continuous derivatives. One panel shows $\varphi^{2,0}$, $\varphi^{2,1}$, $\varphi^{2,2}$, the other $\varphi^{4,0}$, $\varphi^{4,1}$, $\varphi^{4,2}$. Note that the y-axis is scaled differently in the two plots.

Considering the numerical error, the regularity of the one-dimensional delta approximation is critical also in this two-dimensional case. Assuming a quadrature rule of high enough order, one can show that

$$E_{\mathrm{quad},f}(\delta_\varepsilon^{m,k}(d(\Gamma, x))) = \left| \int_{\Omega} \delta_\varepsilon^{m,k}(d(\Gamma, x))\,f(x)\,dx - \mathrm{quad}(\delta_\varepsilon^{m,k}(d(\Gamma, x))\,f(x)) \right| \le C \max_{\xi\in(-1,1)} |\varphi^{(k+1)}(\xi)|\,\frac{h^{k+2}}{\varepsilon^{k+2}}, \tag{2.19}$$
where C is independent of ε, and h is representative of the grid size (distance between quadrature points). For details, see [12].
For a fixed grid size $h$, the numerical error decreases with increasing $\varepsilon$, i.e., as the delta approximation is better resolved on the grid. The analytical error however increases with increasing $\varepsilon$, i.e., as the regularization zone gets wider. Furthermore, from this numerical error term, we can see that a choice of $\varepsilon$ proportional to $h$ leads to an $O(1)$ error term. In general, we need a choice of $\varepsilon \sim h^\alpha$, $\alpha < 1$. For an optimal scaling, we need to balance the order of the analytical error $\varepsilon^m$ and the numerical error. For example, for a $\delta_\varepsilon^{m,k}$ with $m = 2$ and $k = 0$ (such as the linear hat function or the polynomial $\varphi^{2,0}$ given in Eq. (2.17)), with a choice of $\varepsilon = C\sqrt{h}$, both the analytical error and the numerical error are $O(h)$.

This analysis and discussion was based on the use of the signed distance to $\Gamma$ to define the delta function approximation $\delta_\varepsilon(\Gamma, x) = \delta_\varepsilon(d(\Gamma, x))$. The extension to several dimensions can also be made using a so-called product rule, as introduced by Peskin [9]. For this approximation, the moment conditions of the one-dimensional delta approximation will again determine the order of the analytical error, and its regularity the order of the numerical error. This extension to several dimensions is convenient when an explicit parameterization of $\Gamma$ is available. It is defined below as we state and prove the following theorem regarding the analytical error for this approximation.

Theorem 2.6. Let $\Gamma \subset \mathbb{R}^d$ be parameterized by $\Gamma = X(S) = (X^{(1)}(S), \dots, X^{(d)}(S))$. Suppose $\delta_\varepsilon^{m,k}(t)$ is as in Definition 2.1, $g \in C$ and $f \in C^r(\mathbb{R}^d)$, $r \ge m$. Define $\delta_\varepsilon(\Gamma, g, x)$ by

$$\delta_\varepsilon(\Gamma, g, x) = \int_{\Gamma} \left( \prod_{l=1}^{d} \delta_\varepsilon^{m,k}(x^{(l)} - X^{(l)}(S)) \right) g(S)\,dS. \tag{2.20}$$
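Furthermore, let $d(\Gamma, x)$ be the signed distance function to $\Gamma$, and assume $\Omega \subset \mathbb{R}^d$ such that $\Omega_\varepsilon \subset \Omega$, where $\Omega_\varepsilon = \{x \in \mathbb{R}^2 : |d(\Gamma, x)| \le \sqrt{2}\,\varepsilon\}$. Then, the analytical error

$$E_P = \int_{\Omega} \{\delta_\varepsilon(\Gamma, g, x) - \delta(\Gamma, g, x)\}\,f(x)\,dx \tag{2.21}$$

is bounded by

$$|E_P| \le C\varepsilon^m, \tag{2.22}$$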
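and $E_P = 0$ if $f$ is constant.

Proof. First, by definition of the delta function,

$$\int_{\Omega} \delta(\Gamma, g, x)\,f(x)\,dx = \int_{\Gamma} f(X(S))\,g(S)\,dS.$$

Using the definition of $\delta_\varepsilon(\Gamma, g, x)$ in Eq. (2.20), we have

$$I = \int_{\Omega} \delta_\varepsilon(\Gamma, g, x)\,f(x)\,dx = \int_{\Omega} \left( \int_{\Gamma} \left( \prod_{l=1}^{d} \delta_\varepsilon^{m,k}(x^{(l)} - X^{(l)}(S)) \right) g(S)\,dS \right) f(x)\,dx.$$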
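Let $\bar{\Omega} \supset \Omega$, $\bar{\Omega} \subset \mathbb{R}^d$, such that $\bar{\Omega} = [x_a^{(1)}, x_b^{(1)}] \times \dots \times [x_a^{(d)}, x_b^{(d)}]$, for some values $x_a^{(1)}$, $x_b^{(1)}$ etc. Since the support of $\delta_\varepsilon(\Gamma, g, x)$ is compact, we can replace the integral over $\Omega$ with an integral over $\bar{\Omega}$, for which we have $\int_{\bar{\Omega}} = \int_{x_a^{(1)}}^{x_b^{(1)}} \dots \int_{x_a^{(d)}}^{x_b^{(d)}}$. Changing order of the integration over $\bar{\Omega}$ and $\Gamma$, this can be written as

$$I = \int_{\Gamma} \left( \int_{x_a^{(1)}}^{x_b^{(1)}} \delta_\varepsilon^{m,k}(x^{(1)} - X^{(1)}(S)) \left( \int_{x_a^{(2)}}^{x_b^{(2)}} \dots \left( \int_{x_a^{(d)}}^{x_b^{(d)}} \delta_\varepsilon^{m,k}(x^{(d)} - X^{(d)}(S))\,f(x)\,dx^{(d)} \right) \dots \right) dx^{(1)} \right) g(S)\,dS.$$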
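From Taylor expansion of $f(x)$ in $x^{(d)}$ around $X^{(d)}(S)$, using that $\delta_\varepsilon^{m,k}$ is of moment order $m$, similarly to the derivation of Eq. (2.6), the last bracket evaluates as

$$\int_{x_a^{(d)}}^{x_b^{(d)}} \delta_\varepsilon^{m,k}(x^{(d)} - X^{(d)}(S))\,f(x)\,dx^{(d)} = f(x^{(1)}, \dots, x^{(d-1)}, X^{(d)}(S)) + \sum_{m \le p \le r} \frac{1}{p!} \left( \int_{-\infty}^{\infty} \delta_\varepsilon^{m,k}(t)\,t^p\,dt \right) \frac{\partial^p}{\partial x_d^p} f(x^{(1)}, \dots, x^{(d-1)}, X^{(d)}(S)) + O(h^{r+1}),$$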
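where we have substituted $t = x^{(d)} - X^{(d)}(S)$, and used the compact support of $\delta_\varepsilon^{m,k}$ to simplify integration limits. Repeating this step for $x^{(d-1)}, \dots, x^{(1)}$ gives

$$I = \int_{\Gamma} \left( f(X(S)) + \sum_{\beta \in R_{mr}} \left[ \prod_{i=1}^{d} \frac{1}{\beta_i!} \int_{-\infty}^{\infty} \delta_\varepsilon^{m,k}(t)\,t^{\beta_i}\,dt \right] D^\beta f(X(S)) \right) g(S)\,dS + O(h^{r+1}).$$

Here, we have used that $\delta_\varepsilon^{m,k}$ satisfies the moment conditions according to Definition 2.1. We have introduced a multi-index $\beta$, such that $|\beta| = \sum_{i=1}^{d} \beta_i$, and

$$D^\beta f = \frac{\partial^{\beta_1 + \beta_2 + \dots + \beta_d}}{\partial x_1^{\beta_1}\,\partial x_2^{\beta_2} \cdots \partial x_d^{\beta_d}}\,f.$$

The sum is over $\beta \in R_{mr}$, where $R_{mr} = \{\beta : m \le |\beta| \le r,\ \beta_i \in \{0, m, m+1, \dots, r\}\}$. Since

$$\int_{-\infty}^{\infty} \delta_\varepsilon^{m,k}(t)\,t^{\beta_i}\,dt = \varepsilon^{\beta_i} \int_{-1}^{1} \varphi^{m,k}(\xi)\,\xi^{\beta_i}\,d\xi,$$

we have

$$I = \int_{\Gamma} f(X(S))\,g(S)\,dS + \tilde{E}, \tag{2.23}$$

where $|\tilde{E}| \le C\varepsilon^m$.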
Here, we have used that $|\int_{\Gamma} g(S)\,dS|$ is bounded independent of $\varepsilon$. In the special case where $f$ is constant, all derivatives of $f$ are zero, and so $\tilde{E} = 0$. From this, the theorem follows.

3. Error analysis on uniform grids

In Section 2, the error was divided into an analytical error and a numerical error that were analyzed separately. For a very narrow support, the $\delta_\varepsilon$ function is not sufficiently resolved to analyze the error by splitting it into these two parts. Instead, the error must be analyzed directly, taking into account discrete effects of the computational grid. This is possible to do in the case of uniform grids.

3.1. One dimension. In the previous analysis in Section 2, moment conditions were shown to determine the order of the analytical error. Here, we need a discrete analogue to these conditions. Assume a regular grid in one dimension, with grid size $h$ and grid points $x_j = jh$, $j \in \mathbb{Z}$. We introduce the following definition:

Definition 3.1. A function $\delta_\varepsilon(x) \in Q^q$ if $\delta_\varepsilon$ has compact support in $[-\varepsilon, \varepsilon]$, $\varepsilon = ph$, $p > 0$, and

$$M_r(\delta_\varepsilon, \bar{x}, h) = h \sum_{j=-\infty}^{\infty} \delta_\varepsilon(x_j - \bar{x})\,(x_j - \bar{x})^r = \begin{cases} 1, & r = 0, \\ 0, & 1 \le r < q, \end{cases} \tag{3.1}$$
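for any $\bar{x} \in \mathbb{R}$, where $x_j = jh$, $h > 0$, $j \in \mathbb{Z}$. If $\delta_\varepsilon$ satisfies $q$ moment conditions, as in this definition, we will say that it has moment order $q$. Note the essential requirement that these moment conditions hold for all shifts in the grid. We have the following proposition, as given in [1, 15]:

Proposition 3.2. Suppose that $\delta_\varepsilon \in Q^q$, $q > 0$, as in Definition 3.1, and $f(x) \in C^q(\mathbb{R})$. Then

$$E = \left| h \sum_{j=-\infty}^{\infty} \delta_\varepsilon(x_j - \bar{x})\,f(x_j) - f(\bar{x}) \right| \le Ch^q,$$

and $E = 0$ if $f$ is constant.

Proof. By Taylor expansion, it follows that

$$\begin{aligned} h \sum_{j=-\infty}^{\infty} \delta_\varepsilon(x_j - \bar{x})\,f(x_j) &= h \sum_{j=-\infty}^{\infty} \delta_\varepsilon(x_j - \bar{x}) \left( \sum_{r=0}^{q-1} \frac{(x_j - \bar{x})^r}{r!}\,f^{(r)}(\bar{x}) + O(h^q) \right) \\ &= \sum_{r=0}^{q-1} \frac{1}{r!}\,f^{(r)}(\bar{x})\; h \sum_{j=-\infty}^{\infty} \delta_\varepsilon(x_j - \bar{x})\,(x_j - \bar{x})^r + O(h^q) \\ &= M_0(\delta_\varepsilon, \bar{x}, h)\,f(\bar{x}) + \sum_{r=1}^{q-1} \frac{1}{r!}\,f^{(r)}(\bar{x})\,M_r(\delta_\varepsilon, \bar{x}, h) + O(h^q). \end{aligned}$$

Since $\delta_\varepsilon \in Q^q$, $q > 0$, we have $M_0(\delta_\varepsilon, \bar{x}, h) = 1$ and $M_r(\delta_\varepsilon, \bar{x}, h) = 0$ for $r = 1, \dots, q-1$. From this, the proposition follows.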
Note how the discrete moment conditions appear very similarly to the continuous moment conditions in Section 2. These moment conditions must be fulfilled for any position of the singularity in the grid, and in [15] we proved that such $\delta_\varepsilon$ functions do exist:

Theorem 3.3. There exists $\delta_\varepsilon \in Q^q$ if and only if $2\varepsilon \ge qh$.

The most compact $\delta_\varepsilon$ approximation that obeys $q$ moment conditions may not be continuous. In computations, it is however most practical to deal with continuous $\delta_\varepsilon$ functions. Define an approximate continuous delta function $\delta_\varepsilon$ as

$$\delta_\varepsilon(x) = \begin{cases} \frac{1}{h}\,\psi_p(x/h), & |x| \le \varepsilon = ph, \\ 0, & |x| > \varepsilon = ph, \end{cases} \tag{3.2}$$

where $\delta_\varepsilon \in C(\mathbb{R})$, i.e., $\psi_p(-p) = \psi_p(p) = 0$. For examples of such delta function approximations, and their moment order, see Table 1, and the related Figure 2. Note that the linear hat functions are of moment order 2, whereas the cosine function is only of moment order 1. The cubic function of moment order 4 can also be found in [7, 17].

  $\delta_\varepsilon$        $\varepsilon$   $\psi_p(\xi)$, $|\xi| \le p$                                                    $q$
  $\delta_h^L$                $h$             $\psi_1^L(\xi) = \min(\xi + 1,\ 1 - \xi)$                                      2
  $\delta_{2h}^L$             $2h$            $\psi_2^L(\xi) = \frac{1}{4}\min(\xi + 2,\ 2 - \xi)$                           2
  $\delta_{2h}^{\cos}$        $2h$            $\psi_2^{\cos}(\xi) = \frac{1}{4}(1 + \cos(\pi\xi/2))$                         1
  $\delta_{2h}^C$             $2h$            $\psi_2^C(\xi) = 1 - \frac{1}{2}|\xi| - |\xi|^2 + \frac{1}{2}|\xi|^3$ for $0 \le |\xi| \le 1$,
                                              $\psi_2^C(\xi) = 1 - \frac{11}{6}|\xi| + |\xi|^2 - \frac{1}{6}|\xi|^3$ for $1 < |\xi| \le 2$      4

Table 1. Delta function approximations as defined in (3.2), with their $\psi_p$ functions. The moment order $q$ is given for each approximation.
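The moment orders listed in Table 1 can be checked numerically from Definition 3.1. The sketch below is an illustration under stated assumptions: it sets $h = 1$, samples only a finite set of grid shifts, and uses a floating-point tolerance, so it gives evidence rather than proof. The helper names (`discrete_moment`, `moment_order`) are introduced here and do not come from the original text.

```python
import math


def psi_hat1(xi):
    return max(0.0, 1.0 - abs(xi))


def psi_hat2(xi):
    return 0.25 * max(0.0, 2.0 - abs(xi))


def psi_cos2(xi):
    return 0.25 * (1.0 + math.cos(0.5 * math.pi * xi)) if abs(xi) <= 2.0 else 0.0


def psi_cubic2(xi):
    a = abs(xi)
    if a <= 1.0:
        return 1.0 - 0.5 * a - a * a + 0.5 * a ** 3
    if a <= 2.0:
        return 1.0 - 11.0 / 6.0 * a + a * a - a ** 3 / 6.0
    return 0.0


def discrete_moment(psi, p, r, shift):
    """M_r of (3.1) for h = 1 and singularity position x_bar = shift."""
    lo = math.floor(shift - p)
    hi = math.ceil(shift + p)
    return sum(psi(j - shift) * (j - shift) ** r for j in range(lo, hi + 1))


def moment_order(psi, p, max_q=6, tol=1e-12):
    """Largest q (up to max_q) with M_0 = 1 and M_r = 0, 1 <= r < q, on all sampled shifts."""
    shifts = [i / 16.0 for i in range(16)]
    q = 0
    for r in range(max_q):
        target = 1.0 if r == 0 else 0.0
        if all(abs(discrete_moment(psi, p, r, s) - target) < tol for s in shifts):
            q = r + 1
        else:
            break
    return q


if __name__ == "__main__":
    for name, psi, p in [("hat, eps=h", psi_hat1, 1), ("hat, eps=2h", psi_hat2, 2),
                         ("cosine", psi_cos2, 2), ("cubic", psi_cubic2, 2)]:
        print(name, moment_order(psi, p))
```

Sampling several shifts matters: on a grid-aligned singularity (shift zero) the cosine function also satisfies the first moment condition by symmetry, and it is only for general shifts that the condition fails.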
Figure 2. $\psi(\xi)$ versus $\xi$. From left to right, $\psi_1^L$, $\psi_2^{\cos}$, $\psi_2^C$, as defined in Table 1.

Let us again consider the discrete sum over the delta approximation

$$\Sigma(\delta_\varepsilon) = h \sum_{j\in\mathbb{Z}} \delta_\varepsilon(x_j - \bar{x}) = h \sum_{j=0}^{n-1} \delta_\varepsilon(x_j - \bar{x}), \tag{3.3}$$
where we now have assumed $\bar{x} > 0$, and $x_{n-1} \le \bar{x} + \varepsilon < x_n$. If the delta approximation obeys the mass condition, this sum evaluates as 1. This is true for example for the linear hat function with $\varepsilon = h$ or $\varepsilon = 2h$. Now, let us write $\bar{x} = x_k + rh$, $0 \le r < 1$, with $x_k$ being the $x$-value of the closest grid point to the left of $\bar{x}$. With $\varepsilon = ph$, the sum (3.3) for the linear hat function evaluates as

$$\Sigma(\delta_{ph}^L) = \frac{1}{p^2} \left( (k_0 + k_1 + 1)\,p + (k_1 - k_0 - 1)\,r - \frac{k_0 (k_0 + 1)}{2} - \frac{k_1 (k_1 + 1)}{2} \right), \tag{3.4}$$

where

$$k_0 = \lfloor p - r \rfloor, \qquad k_1 = \lfloor p + r \rfloor. \tag{3.5}$$

From this formula we can see that if $p = 1$ or $p = 2$, or any other integer, the mass condition is indeed fulfilled independently of the choice of $r$, that is, independently of how the grid is shifted relative to the location of $\delta^L$. However, if $p$ is not an integer, there is an error in the mass condition. This error is independent of $h$, and from the Taylor expansion in the proof of Proposition 3.2 one can see that this leads to an error $E$ of $O(1)$. See Figure 3 for a plot of the error when $r = 0$.

The violation of the moment conditions when the support is dilated is not specific to the linear hat function. It will occur for all the delta approximations with compact support that we have introduced. It is however possible to construct a delta approximation that obeys the mass condition for a wide range of dilations. By defining a function in Fourier space with compact support on $\omega \in [-\beta, \beta]$, $\beta < 1$, and the corresponding function in real space (with a proper scaling parameter $\varepsilon$), Poisson's summation formula yields that the mass condition will be satisfied for a wide range of $\varepsilon$ values. For details, see [13]. This delta approximation will however not have compact support, but can, with a proper choice in Fourier space, have
an exponential decay. In practical calculations, this function can then be truncated to a compact support. A delta approximation constructed in this manner, that obeys more moment conditions, will have a slower decay. In this case, the Fourier space approach has not yet proven suitable for practical calculations.

Figure 3. The error in the sum, $\Sigma(\delta_{ph}^L) - 1$, plotted versus $p$ (where $\varepsilon = ph$) for the shift $r = 0$, i.e., $\bar{x}$ on a grid point.

3.2. Extending to several dimensions. In the previous section, we discussed delta approximations in one dimension, as discretized on a uniform grid. Now, let $\Gamma \subset \mathbb{R}^d$ be a $(d-1)$-dimensional continuous and bounded surface and let $S$ be surface coordinates on $\Gamma$. Assume that the space $\mathbb{R}^d$ is covered by a regular grid;

$$\{x_j\}_{j\in\mathbb{Z}^d}, \qquad x_j = (x_{j_1}^{(1)}, \dots, x_{j_d}^{(d)}), \qquad x_{j_l}^{(l)} = x_0^{(l)} + j_l h_l, \quad j_l \in \mathbb{Z}, \quad l = 1, \dots, d. \tag{3.6}$$
We are interested in $\Gamma$ with general location relative to the computational grid. Since we will consider fully general $\Gamma$, there is no restriction if we fix $x_0^{(l)}$, and we will for simplicity let $x_0^{(l)} = 0$, $l = 1, \dots, d$. Again, we consider the same two techniques as in Section 2 to extend the one-dimensional regularization to the multi-dimensional case for which the singularity is supported on a curve or a surface $\Gamma$. Here, we define the product formula as

$$\delta_\varepsilon(\Gamma, g, x) = \int_{\Gamma} \left( \prod_{l=1}^{d} \delta_{\varepsilon_l}(x^{(l)} - X^{(l)}(S)) \right) g(S)\,dS, \tag{3.7}$$

with the regularization parameters $\varepsilon_l = p h_l$, $l = 1, \dots, d$, where the grid sizes $h_1, \dots, h_d$ refer to the regular grid introduced in Eq. (3.6). As before, $\delta_{\varepsilon_l}$ is a one-dimensional regularized $\delta$ function, $x \in \mathbb{R}^d$, and $X(S) = (X^{(1)}(S), \dots, X^{(d)}(S))$ is a point on $\Gamma$.
For this approximation, we have the following theorem:

Theorem 3.4. Suppose that $\delta_\varepsilon \in Q^q$, $q > 0$, as in Definition 3.1; $g \in C$ and $f \in C^r(\mathbb{R}^d)$, $r \ge q$. Furthermore, let $\delta_\varepsilon(\Gamma, g, x_j)$ be defined as in Eq. (3.7). Then

$$E = \left| \left( \prod_{l=1}^{d} h_l \right) \sum_{j\in\mathbb{Z}^d} \delta_\varepsilon(\Gamma, g, x_j)\,f(x_j) - \int_{\Gamma} g(S)\,f(X(S))\,dS \right| \le Ch^q \tag{3.8}$$

with $h = \max_{1\le l\le d} h_l$ and $E = 0$ for constant $f$.

In one dimension, we saw in Proposition 3.2 that it is solely the moment order of the one-dimensional delta approximation that determines the order of accuracy. This theorem asserts that the same is true in several dimensions, when the multi-dimensional delta approximation is defined by the product rule, as given in Eq. (3.7). This theorem was proven in [15]. The proof is similar in spirit to the proof of Proposition 3.2, expanding in one dimension at a time in a manner similar to the proof of Theorem 2.6.

Remark 3.5. There is a discrete analogue of Theorem 3.4. If the integral over $\Gamma$ is replaced by a discrete sum, both in the definition of $\delta_\varepsilon$ in Eq. (3.7) and of $E$ in Eq. (3.8), the same estimate for $E$ holds. The proof is identical to before, except that the integral over $\Gamma$ needs to be changed to the discrete sum.

Now, let us turn to the extension by distance function. Here we define

$$\delta_\varepsilon(\Gamma, x) = \delta_\varepsilon(d(\Gamma, x)), \tag{3.9}$$

setting $g(S) \equiv 1$ in Eq. (2.9). The regularization parameter is $\varepsilon = ph$ with $h = \max_{1\le l\le d} h_l$. The choice of the support in practical level-set simulations has mainly been $\varepsilon = h$, $1.5h$ or $2h$, for discretization on regular grids (with $h_1 = \dots = h_d = h$) [8, 11]. We shall show that such a choice may result in $O(1)$ error.

We have already had strong indications regarding this problem. From the continuous analysis, we have that there is a numerical error of order $(h/\varepsilon)^{k+2}$. If we let $\varepsilon = mh$, where $m$ is a large enough integer that the continuous analysis is valid, this yields an error term of $O(1)$, independent of the regularity of $\delta_\varepsilon$ (as given by $k$). For a large $\varepsilon$, the total error will however likely be dominated by the analytical error, and this $O(1)$ error might be difficult to see in a numerical test. Another indication comes from the discussion of $O(1)$ errors in case of dilation of the support of the delta approximation in one dimension.

In two dimensions, let $\Gamma$ be a straight line at 45 degree angle to the $x^{(1)}$ axis in the grid, i.e., define $\Gamma = \{x : x^{(2)} = x^{(1)},\ 0 \le x^{(1)} < \bar{S}/\sqrt{2}\}$. Consider the calculation of the length $|\Gamma|$,

$$|\Gamma| = \bar{S} = \int_{\mathbb{R}^2} \delta(\Gamma, x)\,dx, \tag{3.10}$$
computed using a $\delta_\varepsilon(d(\Gamma, x))$ approximation on a regular grid (with $h_1 = h_2 = h$),

$$\bar{S}_h = h^2 \sum_{j\in\mathbb{Z}^2} \delta_\varepsilon(d(\Gamma, x_j)), \qquad x_j = (x_{j_1}^{(1)}, x_{j_2}^{(2)}), \quad x_{j_l}^{(l)} = j_l h, \quad j_l \in \mathbb{Z}, \quad l = 1, 2. \tag{3.11}$$

For $\delta_\varepsilon = \delta_h^L$ (the narrow hat function) a straightforward calculation, as presented in [15], yields

$$\bar{S}_h = \frac{3 - \sqrt{2}}{\sqrt{2}}\,\bar{S} + O(h),$$

which results in a relative error $(|\bar{S}_h - \bar{S}|/\bar{S})$ of over 12% as $h \to 0$. Repeating the exercise for the wider piecewise linear hat function with $\varepsilon = 2h$ we have

$$\bar{S}_h = \frac{1}{4}(5\sqrt{2} - 3)\,\bar{S} + O(h),$$

which yields a relative error of 1.8% as $h \to 0$.

Remark 3.6. The $O(1)$ errors that we are observing are not a result of the specific choice of the one-dimensional delta function approximation $\delta_\varepsilon$ that is used to define $\delta_\varepsilon(d(\Gamma, x))$. An $O(1)$ error is to be expected for any $\delta_\varepsilon$ approximation with narrow compact support of $\varepsilon = ph$, $p > 0$.

To summarize, we have arrived at very different results for the two approaches to extend the one-dimensional delta approximation to several dimensions. For extension by the product rule, the order of accuracy is still determined by the moment order of the one-dimensional delta approximation, whereas the extension by distance function with $\varepsilon = ph$ is found to be inconsistent.

The product rule is very convenient to use if an explicit parameterization of $\Gamma$ is available. This is the case for different front-tracking methods [10, 16]. In level set methods however, such an explicit parameterization is not available, and a definition based on the signed distance to $\Gamma$ is much more convenient. In [2], we introduced alternative consistent ways to define a multi-dimensional delta approximation, based on $d(\Gamma, x)$.

The first, and simplest, approach is to define the delta approximation based on the signed distance function as before, but with a variable $\varepsilon$ that depends on the local orientation of $\Gamma$ in the grid. We define

$$\varepsilon = \varepsilon(\nabla d, \varepsilon_0) = |\nabla d|_1\,\varepsilon_0 = \left( \sum_{l=1}^{d} \left| \frac{\partial}{\partial x_l} d(\Gamma, x) \right| \right) \varepsilon_0, \tag{3.12}$$
with $\varepsilon_0 = ph$. With $\Gamma$ a straight line, the integration is now consistent, yielding an $O(h)$ error in general. For certain straight lines with a rational slope, it yields the exact result, see the theorem in [2]. Any rational slope can be expressed as a quotient of two relatively prime positive integers, i.e., two integers that have no common divisor other than 1, and the proof of this theorem makes use of the theory of relative primes. This modification of $\varepsilon$ works in both two and three dimensions.
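The constants quoted for the 45 degree line, and the effect of the variable width (3.12), can be reproduced with a few lines of code. The sketch below rests on simplifying assumptions that are not in the original: the line is taken to be infinite and to pass through a grid point, $h = 1$, and the sum (3.11) is taken over whole grid columns so that end effects (the $O(h)$ terms) are absent. The function name `length_ratio` is hypothetical; it returns $\bar{S}_h/\bar{S}$ for the linear hat function.

```python
import math


def hat(t, eps):
    """Linear hat delta approximation with half-width eps and unit mass."""
    return max(0.0, 1.0 - abs(t) / eps) / eps


def length_ratio(slope, p=1.0, variable_eps=False, columns=1000, h=1.0):
    """Computed length over exact length for the line x2 = slope * x1.

    Uses eps = p*h, or eps = |grad d|_1 * p*h as in (3.12) when variable_eps
    is True. The sum runs over `columns` full grid columns.
    """
    norm = math.hypot(1.0, slope)              # sqrt(1 + slope^2)
    eps = p * h
    if variable_eps:
        # For a straight line, |grad d|_1 = (1 + |slope|) / sqrt(1 + slope^2).
        eps *= (1.0 + abs(slope)) / norm
    reach = int(math.ceil(eps * norm / h)) + 1  # vertical index range to scan
    total = 0.0
    for i in range(columns):
        x = i * h
        j0 = int(math.floor(slope * x / h))
        for j in range(j0 - reach, j0 + reach + 1):
            d = (j * h - slope * x) / norm      # signed distance to the line
            total += h * h * hat(d, eps)
    exact = columns * h * norm                  # line length over these columns
    return total / exact


if __name__ == "__main__":
    print("eps = h      :", length_ratio(1.0, p=1.0))
    print("eps = 2h     :", length_ratio(1.0, p=2.0))
    print("variable eps :", length_ratio(1.0, variable_eps=True))
```

Under these assumptions the fixed widths $\varepsilon = h$ and $\varepsilon = 2h$ give the ratios $(3 - \sqrt{2})/\sqrt{2}$ and $(5\sqrt{2} - 3)/4$ stated above, while the variable width gives a ratio of one for the slopes 1, 2 and 1/2.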
The next approach is to use an approximate product rule. If $\Gamma$ is a curve in $\mathbb{R}^2$, one way to do this is to define the intersection points of $\Gamma$ and the grid lines in the underlying grid, and from these points define a piecewise linear curve $\bar{\Gamma}$. The product rule can then be applied to $\bar{\Gamma}$. This yields a second-order accurate method, assuming that the one-dimensional delta approximation used is at least of moment order two (such as the linear hat function). To do this, one does however need to use the grid to define the regularized delta function. There is an alternative way, similar in spirit, which simply uses the signed distance function and its gradient in the definition. We define a delta function approximation $\tilde{\delta}_\varepsilon(\Gamma, \mathbf{x})$ that in a point $\mathbf{x} = (x, y)$ can be computed by

$$\tilde{\delta}_\varepsilon(\Gamma, \mathbf{x}) = \int_{\bar{\Gamma}} \delta_\varepsilon^L(x - \bar{X}(\mathbf{x}, s))\,\delta_\varepsilon^L(y - \bar{Y}(\mathbf{x}, s))\,ds, \tag{3.13}$$
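where $(\bar{X}(\mathbf{x}, s), \bar{Y}(\mathbf{x}, s))$, $s \in \mathbb{R}$, is the tangent line to $\Gamma$ at $\bar{\mathbf{x}} \in \Gamma$, the closest point on $\Gamma$ to $\mathbf{x}$. The one-dimensional $\delta_\varepsilon$ function is the linear hat function. Due to the compact support of the one-dimensional $\delta_\varepsilon$ function, this integrand is non-zero only in the box $[x - \varepsilon, x + \varepsilon] \times [y - \varepsilon, y + \varepsilon]$, and within this box, the tangent line $\bar{\mathbf{X}}(\mathbf{x}, s) = (\bar{X}(\mathbf{x}, s), \bar{Y}(\mathbf{x}, s))$ will be close to $\Gamma$, see Figure 4.

Figure 4. The integrand in Eq. (3.13), defining $\tilde{\delta}_\varepsilon(\Gamma, \mathbf{x})$, $\mathbf{x} = (x, y)$, is non-zero only in the box $[x - \varepsilon, x + \varepsilon] \times [y - \varepsilon, y + \varepsilon]$. Within this box, $\Gamma$ is approximated by the tangent line $\bar{\Gamma}$. To evaluate the integral, intersections of $\bar{\Gamma}$ with the boundaries of this box must be computed. The sketch marks the points $\mathbf{x}$ and $\bar{\mathbf{x}}$, the curves $\Gamma$ and $\bar{\Gamma}$, and the points $\bar{\mathbf{X}}(\mathbf{x}, s_1)$, $\bar{\mathbf{X}}(\mathbf{x}, s_0^x)$, $\bar{\mathbf{X}}(\mathbf{x}, s_0^y)$ and $\bar{\mathbf{X}}(\mathbf{x}, s_2)$.

The tangent line $\bar{\Gamma}(\mathbf{x}, s) = (\bar{X}(\mathbf{x}, s), \bar{Y}(\mathbf{x}, s))$ is defined as

$$\bar{\Gamma}(\mathbf{x}, s) := \begin{pmatrix} \bar{X}(\mathbf{x}, s) \\ \bar{Y}(\mathbf{x}, s) \end{pmatrix} = \mathbf{x} - d(\Gamma, \mathbf{x}) \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} + s \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix},$$

where $s$ is the arclength, $s \in K = [s_1, s_2]$, and

$$\theta = \arctan\left( \left| \frac{\partial}{\partial x} d(\Gamma, \mathbf{x}) \Big/ \frac{\partial}{\partial y} d(\Gamma, \mathbf{x}) \right| \right).$$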
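The line integral must be split into different pieces, such that each lies within one quadrant of the box $[x - \varepsilon, x + \varepsilon] \times [y - \varepsilon, y + \varepsilon]$. To define these segments of $\bar{\Gamma}$, in addition to $s_1$ and $s_2$, we need to define (if applicable) $s = s_0^x$ such that $(\bar{X}(s_0^x), \bar{Y}(s_0^x)) = (x, \bar{Y}(s_0^x))$, and $s = s_0^y$ such that $(\bar{X}(s_0^y), \bar{Y}(s_0^y)) = (\bar{X}(s_0^y), y)$, see Figure 4.

Now, $\tilde{\delta}_\varepsilon(\Gamma, \mathbf{x})$, as defined in Eq. (3.13), can be evaluated as

$$\tilde{\delta}_\varepsilon(\Gamma, \mathbf{x}) = \begin{cases} I^\varepsilon_{+,+}(s_1, s_2, d, \theta) & \text{if } s_0^x, s_0^y \notin K, \\ I^\varepsilon_{+,+}(s_1, s_0^x, d, \theta) + I^\varepsilon_{-,+}(s_0^x, s_2, d, \theta) & \text{if } s_0^x \in K \text{ and } s_0^y \notin K, \\ I^\varepsilon_{+,-}(s_1, s_0^y, d, \theta) + I^\varepsilon_{+,+}(s_0^y, s_2, d, \theta) & \text{if } s_0^x \notin K \text{ and } s_0^y \in K, \\ I^\varepsilon_{+,-}(s_1, s_0^y, d, \theta) + I^\varepsilon_{+,+}(s_0^y, s_0^x, d, \theta) + I^\varepsilon_{-,+}(s_0^x, s_2, d, \theta) & \text{if } s_0^x, s_0^y \in K, \end{cases}$$

where $d = d(\Gamma, \mathbf{x})$. Using $c_1 = \pm 1$ and $c_2 = \pm 1$ to represent the signs in the subscript, the integrals above evaluate as $I^\varepsilon_{c_1,c_2}(s_a, s_b, d, \theta) = I_{c_1,c_2}(s_a/\varepsilon, s_b/\varepsilon; d/\varepsilon, \theta)/\varepsilon$, where

$$I_{c_1,c_2}(a, b; \tilde{d}, \theta) = \int_a^b \left( 1 + c_1 \left( \tilde{d}\cos\theta - \alpha\sin\theta \right) \right) \left( 1 + c_2 \left( \tilde{d}\sin\theta + \alpha\cos\theta \right) \right) d\alpha. \tag{3.14}$$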
The related method discussed above, with $\bar{\Gamma}$ a piecewise linear approximation to $\Gamma$, can be shown to be second order. This method is also expected to be second order, and all numerical tests performed so far indicate that this is the case. However, as of yet, this method has formally only been shown to be at least first order accurate [2].

4. Partial Differential Equations

The properties of source term regularization in the numerical solution of differential equations are closely related to the regularization of singular integrands in numerical quadrature. Let the solution of a differential equation

$$\begin{aligned} Lu &= s(x), & x &\in \Omega \subset \mathbb{R}^d, \\ Bu &= r(x), & x &\in \partial\Omega, \end{aligned} \tag{4.1}$$

be given on the standard form as an integral of the fundamental solution $G(x, y)$ multiplying the source term $s(x)$,

$$u(x) = \int_{\Omega} G(x, y)\,s(y)\,dy + R(x), \tag{4.2}$$

where $R(x)$ represents the contribution from the boundary conditions. We will consider Eq. (4.2) with $s(y) = \delta(\Gamma, g, y)$ for $x$ values away from the discontinuity and thus assume that $\delta(\Gamma, g, y)$ has compact support away from the boundaries and that $|x - y| \ge C > \varepsilon$ for any $y \in \Gamma$. If we consider homogeneous boundary conditions, we have that $u(x)$ is given by Eq. (4.2) with $R(x) = 0$.

Let $u(x)$ denote the solution to the original problem ($s(x) = \delta(\Gamma, g, x)$), $u^\varepsilon(x)$ the solution to the regularized problem ($\delta(\Gamma, g, x)$ replaced by $\delta_\varepsilon(\Gamma, g, x)$), and $u_j^{\varepsilon,h}$ the numerical solution to the regularized problem at $x = x_j$. We then have

$$\left| u(x_j) - u_j^{\varepsilon,h} \right| \le \left| u(x_j) - u^\varepsilon(x_j) \right| + \left| u^\varepsilon(x_j) - u_j^{\varepsilon,h} \right|.$$

For the first part of the error, we have

$$\left| u(x_j) - u^\varepsilon(x_j) \right| = \left| \int_{\Omega} G(x_j, y) \left[ \delta(\Gamma, g, y) - \delta_\varepsilon(\Gamma, g, y) \right] dy \right|.$$
For $|x_j - y| \ge C > \varepsilon$ for any $y \in \Gamma$, $G$ is regular, and this is the analytical error as analyzed in Section 2, with the main Theorem 2.2 for an extension of a one-dimensional delta approximation by the distance function, and Theorem 2.6 for an extension by the product rule. Hence, if we assume a regularization based on a one-dimensional delta function approximation of (continuous) moment order $m$, the error will be of $O(\varepsilon^m)$.

The second part of the error is the numerical error in the solution of the regularized problem. The order of this error term will depend on the particular numerical method. In applying any estimate for the numerical error for the method of choice, one should remember that the derivatives of $\delta_\varepsilon$ contain powers of $\varepsilon^{-1}$.

In the case of a uniform grid as introduced in Eq. (3.6), we can split the error a bit differently, and now instead use the results discussed in Section 3. Concerning the solution of the corresponding numerical approximation to (4.1) at $x_j$, we can explicitly write

$$u_j = \left( \prod_{k=1}^{d} h_k \right) \sum_{m\in\Omega_h} G_{jm}\,s_m + R_j, \tag{4.3}$$

where $G_{jm}$ is the discrete fundamental solution and $\Omega_h$ is the index set for the grid points inside $\Omega$. $R_j$ is the contribution from the boundary conditions.
We again use a regularized delta function in this discrete approximation and define $s_m = \delta_\varepsilon(\Gamma, g, x_m)$ in Eq. (4.3). For homogeneous boundary conditions $R_j = 0$. The summation over $m$ can be replaced by $m \in \mathbb{Z}^d$, due to the compact support of the delta approximation. To use the specific results for the analysis of uniform grids, we now subtract and add a discrete sum over the Green's function for the continuous problem, and write

$$\begin{aligned} \left| u(x_j) - u_j^{\varepsilon,h} \right| &= \left| \int_{\Omega} G(x_j, y)\,\delta(\Gamma, g, y)\,dy - \left( \prod_{k=1}^{d} h_k \right) \sum_{m\in\mathbb{Z}^d} G_{jm}\,\delta_\varepsilon(\Gamma, g, x_m) \right| \\ &\le \left| \int_{\Omega} G(x_j, y)\,\delta(\Gamma, g, y)\,dy - \left( \prod_{k=1}^{d} h_k \right) \sum_{m\in\mathbb{Z}^d} G(x_j, x_m)\,\delta_\varepsilon(\Gamma, g, x_m) \right| \\ &\quad + \left| \left( \prod_{k=1}^{d} h_k \right) \sum_{m\in\mathbb{Z}^d} \left[ G(x_j, x_m) - G_{jm} \right] \delta_\varepsilon(\Gamma, g, x_m) \right|. \end{aligned}$$

For the first part of the error, we can now identify the function $f$ in Eq. (3.8) with the Green's function above for fixed $x_j$. The error analysis of Section 3 will thus apply directly. If we assume $\delta_\varepsilon(\Gamma, g, x_m)$ is defined by the product rule as in Eq. (3.7), based on $\delta_\varepsilon(x)$, $x \in \mathbb{R}$, where $\delta_\varepsilon(x) \in Q^q$ (i.e., satisfies $q$ moment conditions), the error of the first part will be of $O(h^q)$. Furthermore, if the numerical approximation is of order $p$ with $|G_{jm} - G(x_j, x_m)| \le C_1 h^p$, away from $x_j = x_m$, then the total error

$$|u_j - u(x_j)| \le C_2 h^{\min(p,q)}, \tag{4.4}$$

where $|x_j - x| \ge C > \varepsilon$ for any $x \in \Gamma$. We have here used that the discrete sum $\left( \prod_{k=1}^{d} h_k \right) \sum_{m\in\mathbb{Z}^d} \delta_\varepsilon(\Gamma, g, x_m)$ is bounded, since this is the mass of the delta approximation, which is equal to 1.

For $\delta_\varepsilon(\Gamma, x) = \delta_\varepsilon(d(\Gamma, x))$ ($g \equiv 1$) with $\varepsilon = mh$, no such estimate can be obtained. In fact, as shown in Section 3, there are cases where the quadrature error is $O(1)$. A second-order error can be achieved if we apply the modified extension based on the distance function, as defined in Eqs. (3.13)–(3.14) (although it is formally only shown to be at least first order accurate).

4.1. Error in maximum norm. So far, we have discussed the error in the numerical solution of a partial differential equation away from the singularities. At or near the location of any singularity, we can however in general expect a first-order error in the maximum norm.
Only in some simple cases can a delta function approximation be constructed to avoid the first-order maximum error. To do this, the specific discretization of the problem must be considered. For some examples in one dimension for a second-order and a fourth-order finite difference approximation, see [15]. For more general equations in several dimensions, such tailored delta approximations are very complicated to construct, if not impossible. When applying regularization to singular sources, without altering the numerical discretization of the equation, one should in general expect a first-order error in maximum norm.
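The estimate (4.4) for the error away from the singularity can be illustrated on a one-dimensional model problem. The sketch below is an assumption-laden example, not taken from the original: it solves $-u'' = \delta(x - \bar{x})$ on $(0, 1)$ with $u(0) = u(1) = 0$ using standard second-order central differences ($p = 2$) and the Thomas algorithm, with the source regularized by either the hat function $\psi_1^L$ ($q = 2$) or the cosine function $\psi_2^{\cos}$ ($q = 1$) from Table 1. The error is measured only at grid points at least `gap` away from $\bar{x}$. All function names are introduced here for illustration.

```python
import math


def psi_hat(xi):
    """psi_1^L from Table 1 (moment order 2, eps = h)."""
    return max(0.0, 1.0 - abs(xi))


def psi_cos(xi):
    """psi_2^cos from Table 1 (moment order 1, eps = 2h)."""
    return 0.25 * (1.0 + math.cos(0.5 * math.pi * xi)) if abs(xi) <= 2.0 else 0.0


def solve_poisson(n, xbar, psi):
    """Nodal values u_0..u_n of (-u_{j-1} + 2u_j - u_{j+1})/h^2 = delta_eps(x_j - xbar)."""
    h = 1.0 / n
    # Right-hand side h^2 * (1/h) * psi((x_j - xbar)/h) at the interior nodes.
    rhs = [h * psi(j - xbar / h) for j in range(1, n)]
    m = n - 1
    c = [0.0] * m  # modified super-diagonal
    d = [0.0] * m  # modified right-hand side
    c[0] = -0.5
    d[0] = rhs[0] / 2.0
    for i in range(1, m):  # forward sweep of the Thomas algorithm
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    u = [0.0] * m
    u[-1] = d[-1]
    for i in range(m - 2, -1, -1):  # back substitution
        u[i] = d[i] - c[i] * u[i + 1]
    return [0.0] + u + [0.0]


def error_away(n, xbar, psi, gap=0.1):
    """Maximum nodal error over grid points with |x_j - xbar| >= gap."""
    h = 1.0 / n
    u = solve_poisson(n, xbar, psi)
    err = 0.0
    for j, uj in enumerate(u):
        x = j * h
        if abs(x - xbar) >= gap:
            exact = x * (1.0 - xbar) if x < xbar else xbar * (1.0 - x)
            err = max(err, abs(uj - exact))
    return err


if __name__ == "__main__":
    xbar = 1.0 / 3.0
    for n in (40, 80, 160, 320):
        print(n, error_away(n, xbar, psi_hat), error_away(n, xbar, psi_cos))
```

In this setting the cosine function gives a first-order error, consistent with $h^{\min(p,q)}$ for $q = 1$. For this particular model problem the Green's function is piecewise linear and reproduced exactly by the scheme at the nodes, so the hat function gives errors at rounding level; this is a special feature of the example and should not be read as a general statement.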
5. Discontinuities

Before we conclude, we here give a brief comment regarding quadrature of discontinuous functions and discontinuous coefficients in partial differential equations. Consider a function $\Upsilon(x)$, discontinuous across $\Gamma$. With $d(\Gamma, x)$ the signed distance function to $\Gamma$, and

$$\Upsilon(x) = \begin{cases} \Upsilon_1(x), & d(\Gamma, x) > 0, \\ (\Upsilon_1(x) + \Upsilon_2(x))/2, & d(\Gamma, x) = 0, \\ \Upsilon_2(x), & d(\Gamma, x) < 0, \end{cases} \tag{5.1}$$

we can write $\Upsilon(x) = \Upsilon_2(x) + H(d(\Gamma, x))(\Upsilon_1(x) - \Upsilon_2(x))$, where $H(t)$ is the one-dimensional Heaviside function. Similarly to what was done with the Dirac delta function, we introduce a regularized Heaviside function $H_\varepsilon(t)$. A similar quadrature analysis to that in Section 2 can then be performed. The moment conditions, here for $H(t) - H_\varepsilon(t)$, determine the analytical error, and the regularity of $H_\varepsilon$ determines the numerical error. See [12] for details. It is possible to define $\Upsilon(x)$ also along the lines of the product rule for the multi-dimensional delta approximation (Eq. (2.20) or Eq. (3.7)), but it is not as simple to compute. See [15].

In partial differential equations with a discontinuous source term, the discussion will be very similar to that of Section 4, using the results from the quadrature analysis. However, if discontinuous functions appear as coefficients in a partial differential equation, the situation can be quite different. In some instances, the standard moment conditions work, but only up to a certain limit. In others, typically for discontinuous coefficients in elliptic operators, it is the inverse of these coefficients that should be regularized, with the accuracy of the regularization determined by the moment conditions applied to this inverse. See the discussion in [13].
6. Conclusions

Regularization is a practical and sound numerical technique for problems with singularities. Care must however be taken to apply regularization in a manner that yields an accurate method. As was discussed in Section 3.2, the technique commonly used in connection to level-set methods, to approximate $\delta(\Gamma, g, x)$ (with $g \equiv 1$) by $\delta_\varepsilon(d(\Gamma, x))$ with $\varepsilon$ proportional to the grid size, may lead to an $O(1)$ error.

For a wider support of the regularization, the errors associated with regularization in connection to numerical integration can be analyzed by dividing the error into an analytical and a numerical part, and analyzing each part separately. This leads to continuous moment conditions on the one-dimensional delta approximation that determine the analytical error also for the multi-dimensional delta approximation. The regularity of the one-dimensional delta approximation determines the leading order of the numerical error (assuming that the quadrature is of high enough order). This holds whether the extension to several dimensions is made using the closest distance to $\Gamma$ or the product rule. To achieve the optimal order in terms of a characteristic grid size $h$, there is an optimal scaling of the width of the support of the regularization ($\varepsilon$) as a function of the grid size ($h$). Typically, $\varepsilon \sim h^\alpha$ with $0 < \alpha < 1$.

For the practically preferred case of narrow support of the regularizations, over a few grid cells only, discrete effects will be important, and the total error must be analyzed directly. This is possible for a uniform grid with $\varepsilon = ph$, where $h$ is the grid size. The analysis in one dimension now instead leads to discrete moment conditions to be imposed on the delta function approximation, to ensure a certain accuracy. There are here no requirements on the regularity of the delta function approximation. In this case, the extension by product rule and the extension by closest distance yield very different results.
The product rule is proven to naturally carry over the properties and accuracy obtained in one dimension to several dimensions. The extension by closest distance however, may lead to O(1) errors, as mentioned above. As a remedy to this problem, we have introduced two modified extensions to multi-dimensions based on the closest distance, that are consistent. The results from the above analysis can be applied to the discretization of a certain class of partial differential equations. Away from the singularities, the error is given by two parts. The first part is determined by the above analysis, and the second is the error in the numerical solution to the regularized problem, which can be bounded by applying estimates for that particular numerical method. In the case of a discretization on a uniform grid, the error can be analyzed using a different decomposition, yielding a bound assuming only the formal order of accuracy of the numerical approximation of the PDE, avoiding inverse powers of ε in the final result.
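To make the notion of a discrete moment condition concrete, here is a small one-dimensional sketch. It assumes the piecewise linear "hat" function as the delta approximation and evaluates the discrete moments h Σ_j δε(x_j − x0)(x_j − x0)^r on a uniform grid x_j = jh. The hat function, the grid and the function names are illustrative assumptions, and the snippet does not reproduce the multi-dimensional analysis summarized above.

```python
def hat_delta(t, eps):
    """Piecewise linear 'hat' approximation of the Dirac delta, support [-eps, eps]."""
    return max(0.0, 1.0 - abs(t) / eps) / eps

def discrete_moment(order, x0, h, eps, n_cells=50):
    """h * sum_j hat_delta(x_j - x0) * (x_j - x0)**order on the grid x_j = j*h."""
    j0 = round(x0 / h)
    total = 0.0
    for j in range(j0 - n_cells, j0 + n_cells + 1):
        t = j * h - x0
        total += hat_delta(t, eps) * t ** order
    return h * total

h = 0.1
for x0 in (0.0, 0.37, 1.234):
    # eps = h: mass 1 and first moment 0 for every position of x0 in the grid
    print(x0, discrete_moment(0, x0, h, h), discrete_moment(1, x0, h, h))

# eps = 1.5*h: the discrete mass depends on the position of x0 relative to the grid
print(discrete_moment(0, 0.0, h, 1.5 * h), discrete_moment(0, 0.05, h, 1.5 * h))
```

With ε = h the hat function reproduces mass one and a vanishing first moment for every shift x0, whereas with ε = 1.5h the discrete mass already differs from one at x0 = 0. This is the kind of grid-dependent effect that the discrete conditions are meant to exclude.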
References

[1] R.P. Beyer and R.J. LeVeque. Analysis of a one-dimensional model for the immersed boundary method. SIAM J. Num. Anal., 29:332–364, 1992.
[2] B. Engquist, A.-K. Tornberg, and R. Tsai. Discretization of Dirac delta functions in level set methods. J. Comput. Phys., 207:28–51, 2005.
[3] D. Jackson. The Theory of Approximation. American Mathematical Society, New York, 1930.
[4] D. Juric and G. Tryggvason. A front-tracking method for dendritic solidification. J. Comput. Phys., 123:127–148, 1996.
[5] G. Ledfelt. A thin wire sub cell model for arbitrary oriented wires for the fd-td method. In G. Kristensson, editor, Proc. EMB 98 – Electromagnetic computations for analysis and design of complex systems, pages 148–155, 1998.
[6] R.J. LeVeque and Z.L. Li. Immersed interface methods for Stokes flow with elastic boundaries or surface tension. SIAM J. Sci. Comput., 18:709–735, 1997.
[7] J.J. Monaghan. Extrapolating B splines for interpolation. J. Comput. Phys., 60:253–262, 1985.
[8] S.J. Osher and R.P. Fedkiw. Level set methods and dynamic implicit surfaces. Springer Verlag, 2002.
[9] C.S. Peskin. Numerical analysis of blood flow in the heart. J. Comput. Phys., 25:220–252, 1977.
[10] C.S. Peskin. The immersed boundary method. Acta Numerica, 11:479–517, 2002.
[11] J.A. Sethian. Level set methods and fast marching methods. Evolving interfaces in computational geometry, fluid mechanics, computer vision and materials science. Cambridge University Press, 1999.
[12] A.K. Tornberg. Multi-dimensional quadrature of singular and discontinuous functions. BIT, 42:644–669, 2002.
[13] A.K. Tornberg and B. Engquist. Regularization techniques for numerical approximation of PDEs with singularities. J. of Sci. Comput., 19:527–552, 2003.
[14] A.K. Tornberg and B. Engquist. The segment projection method for interface tracking. Commun. Pur. Appl. Math., 56:47–79, 2003.
[15] A.K. Tornberg and B. Engquist. Numerical Approximations of Singular Source Terms in Differential Equations. J. Comput. Phys., 200:462–488, 2004.
[16] G. Tryggvason, B. Bunner, A. Esmaeeli, D. Juric, N. Al-Rawahi, W. Tauber, J. Han, S. Nas, and Y.J. Jan. A front-tracking method for the computations of multiphase flow. J. Comput. Phys., 169:708–759, 2001.
[17] J. Waldén. On the approximation of singular source terms in differential equations. Numer. Meth. Part. D E, 15:503–520, 1999.

Anna-Karin Tornberg
Courant Institute of Mathematical Sciences
New York University
e-mail:
[email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
Equilibrium Measures and Polynomials Vilmos Totik Abstract. Two areas of approximation theory are reviewed which utilize potential theoretical methods. These are approximation by weighted polynomials with varying weights and the so-called polynomial inverse image method. We also illustrate the latter method by finding the form of some equilibrium measures.
1. Potential theoretical methods

In the last two decades potential theoretical methods have penetrated several areas of approximation theory and orthogonal polynomials. They provided the right tools to solve several problems. They were used in connection with general orthogonal polynomials; approximation by varying weights; fast decreasing polynomials; orthogonal polynomials with varying weights; extremal polynomials and numerical conformal mappings; rational and Padé approximation; polynomial inequalities; the polynomial inverse image method; steepest descent and Riemann-Hilbert approach; some problems from physics (elasticity, statistical-mechanical models).

The list of main contributors to these developments includes: A.I. Aptekarev, D. Benko, P. Deift, A.A. Gonchar, M. Ismail, T. Kriecherbauer, A.B. Kuijlaars, A.L. Levin, G.L. Lopez, D.S. Lubinsky, A. Martinez-Finkelshtein, K.T.-R. McLaughlin, H.N. Mhaskar, L. Pastur, F. Peherstorfer, E.A. Rahmanov, E.B. Saff, P. Simeonov, M. Shcherbina, H. Stahl, W. Van Assche, S. Venakides, X. Zhou. Let me also mention that there is another direction that is connected to pluripotential theory and approximation in C^n, but here we shall restrict our attention to classical approximation and potential theory. Altogether there have been over 150 papers and about 10 monographs in the period 1980–2003 connected with potential theoretical methods.

In this paper we shall concentrate on some developments in the two areas
• approximation by varying weights and
• the polynomial inverse image method.

Before we embark on our discussion we shall need to make a short detour in potential theory.

Supported by NSF grant DMS-040650 and by OTKA T/034323, TS44782. Sections 1–11 are the transcripts of the lecture given at 4ECM, Stockholm, Sweden, 2004.
2. Equilibrium measures

Let E ⊂ C be a compact subset of the plane considered as a conductor and suppose we put a unit charge on E, the distribution of which is the probability measure µ. In equilibrium the energy integral

\[
\iint \log\frac{1}{|x-t|}\,d\mu(x)\,d\mu(t)
\]

is minimized (in C the Coulomb force is proportional to the reciprocal of the distance). There are some sets for which this energy integral is always infinite (these are called polar sets or sets of logarithmic capacity zero); otherwise there is a unique minimizing measure µ_E, which is called the equilibrium measure of E. For example, for a disk or a circle µ_E is the normalized arc measure on the boundary, and if E is a segment, say E = [−1, 1], then

\[
d\mu_{[-1,1]}(x) = \frac{dx}{\pi\sqrt{1-x^2}}
\]

is the so-called Chebyshev or arcsine distribution. In what follows, if E ⊂ R, we denote by ω_E(x) the density of µ_E with respect to linear measure (Radon-Nikodym derivative) provided it exists. Thus,

\[
\omega_{[-1,1]}(x) = \frac{1}{\pi\sqrt{1-x^2}}.
\]

3. Polynomial approximation with varying weights

Let us recall Weierstrass' theorem according to which on any finite closed interval any continuous function can be arbitrarily well approximated by polynomials. Let Σ ⊆ R be a closed set and w = e^{−Q} a weight on Σ. We shall consider the problem: what functions can be approximated by expressions of the form w^n P_n? For example, what functions are approximable by
• e^{−n|x|^λ} P_n(x) on (−∞, ∞), or by
• x^{αn} P_n(x) on [0, 1]?

We emphasize that in this problem the weight changes with the degree of the polynomial, hence this is very different from what is traditionally called weighted approximation (where the weight function is fixed and does not depend on the degree). In some sense this is a much more difficult problem, for the polynomial has to balance exponential oscillations in w^n (any change in w is exponentially enlarged in w^n). Nevertheless, this is the type of approximation that has arisen in several problems such as incomplete polynomials, Freud-type orthogonal polynomials or fast decreasing polynomials. See [21] for details. To solve this approximation problem we have to modify the classical equilibrium problem from potential theory.
4. Modified equilibrium problem

Let Σ ⊆ C be a closed set of positive logarithmic capacity and Q : Σ → R a weight function ("external field") with the properties:
• Q is continuous,
• lim_{|z|→∞, z∈Σ} (Q(z) − log |z|) = ∞ if Σ is unbounded.

In the presence of this external field the weighted energy

\[
\iint \log\frac{1}{|x-t|}\,d\mu(x)\,d\mu(t) + 2\int Q\,d\mu
\]

is to be minimized for all unit Borel measures µ supported in Σ. If µ is thought of as a charge placed on the conductor Σ, then the first integral is just the internal energy of the charge, while the second integral is associated with the potential energy coming from the external field. It can be shown [21] that there is a unique minimizing measure µ_Q, which is called the equilibrium measure for Q. It has compact support denoted by S = supp(µ_Q).

5. Solving the approximation problem

Let, as before, w = e^{−Q} on a closed set Σ ⊆ R where |z|w(z) → 0 as |z| → ∞ if Σ is unbounded. Solve with this Q the equilibrium problem from the preceding section, and set S = supp(µ_Q). Now it turns out ([25]) that if f is approximable by weighted polynomials of the form w^n P_n, then f(x) must vanish outside S, i.e., non-trivial approximation is possible only on S. Note that this is in very sharp contrast with the Weierstrass approximation theorem.

There is a Stone-Weierstrass type result [12]: there is a closed set Z ⊂ Σ with the property that a continuous f is approximable by w^n P_n if and only if f vanishes on Z. Thus, the problem is reduced to finding Z. As we have just mentioned, always R \ S ⊆ Z. However, here we may have strict containment; for example, if Σ = (−∞, ∞) and Q(x) = |x|^λ, 0 < λ < 1, then S = [−a_λ, a_λ] with some a_λ (see below), while Z = (R \ [−a_λ, a_λ]) ∪ {0}.

Whether a point x0 ∈ Σ belongs to Z or not depends on the behavior of the density v of µ_Q with respect to linear Lebesgue measure. The main results on this question are as follows:
• If v is continuous and positive in a neighborhood of x0, then x0 ∉ Z ([25]).
• If v(t) ∼ |t − x0|^α with α ≠ 0, then x0 ∈ Z ([11]).
• If v > 0 is slowly varying at x0, then x0 ∉ Z ([26]).
• If Q is convex, then Z = R \ S ([27]).
• If Σ ⊂ [0, ∞) and xQ′(x) is increasing (xQ′(x) ↗), then Z = R \ S ([3]).
We remark that, e.g., in the last but one statement, for convex Q the density v may have an infinite singularity on a dense set, so this case does not follow from the previous local ones.
6. Examples

1. Freud weights. Let λ ≥ 1, and consider the weight w(x) = e^{−|x|^λ} on (−∞, ∞). Then ([13]) there are polynomials P_n of respective degree n = 1, 2, . . . with e^{−n|x|^λ} P_n(x) → f(x) uniformly on (−∞, ∞) if and only if f(x) = 0 outside S = [−γ_λ, γ_λ], where

\[
\gamma_\lambda = \left( \frac{\Gamma\!\left(\frac{\lambda}{2}\right)\Gamma\!\left(\frac{1}{2}\right)}{2\,\Gamma\!\left(\frac{\lambda}{2}+\frac{1}{2}\right)} \right)^{1/\lambda}.
\]

If 0 < λ < 1, then f must also vanish at 0 ([14]).

2. Lorentz' incomplete polynomials. The above results can be used to prove the following approximation theorem for incomplete polynomials. Suppose that 0 < θ < 1. There are polynomials of the form a_m x^m + a_{m+1} x^{m+1} + · · · + a_n x^n with m/n → θ as n → ∞ uniformly approximating f on [0, 1] if and only if f(x) = 0 on [0, θ^2] ([19] and [10]). Here one first uses the aforementioned approximation results for the weight w(x) = x^{θ/(1−θ)}, to get approximation by weighted polynomials of the form x^{(θ/(1−θ))n} P_n(x), and then shows that, by approximating appropriate functions that depend on f, the polynomials x^{[(θ/(1−θ))n]} P_n(x) do the job.

3. Fast decreasing polynomials. The results are also relevant for constructing fast decreasing (sometimes called pin) polynomials, i.e., polynomials which peak at the origin, and decrease very fast (on [−1, 1]) as we move away from the origin. These polynomials approximate the "Dirac delta", and they can be used in convolution kernels or in well localized "partitions of unity" consisting of polynomials. We list only one theorem from this area.

Let ϕ be even and increasing on [0, 1] such that ϕ(√x) is convex from below. Then ([25]) there are polynomials P_n with P_n(0) = 1 and

\[
|P_n(x)| \le e^{-n\varphi(x)}, \qquad x \in [-1,1],
\]

if and only if

\[
\frac{2}{\pi}\int_0^1 \frac{\varphi(t)}{t^2\sqrt{1-t^2}}\,dt \le 1.
\]

For example, there are P_n with P_n(0) = 1 and

\[
|P_n(x)| \le e^{-nx^2}, \qquad x \in [-1,1],
\]

but for no ε > 0 is it possible to have P_n(0) = 1 and

\[
|P_n(x)| \le e^{-n(1+\varepsilon)x^2}, \qquad x \in [-1,1].
\]
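The two explicit quantities in Examples 1 and 3 are easy to evaluate numerically. The sketch below assumes the Gamma-function expression for the endpoint γ_λ of S in Example 1 and the integral criterion of Example 3; the function names and the midpoint quadrature after the substitution t = sin θ are choices made only for this illustration.

```python
import math

def freud_endpoint(lam):
    """Right endpoint of S for Q(x) = |x|**lam, from the Gamma-function expression."""
    ratio = math.gamma(lam / 2) * math.gamma(0.5) / (2 * math.gamma(lam / 2 + 0.5))
    return ratio ** (1.0 / lam)

def decay_integral(phi, n=20000):
    """(2/pi) * integral over (0, 1) of phi(t) / (t^2 sqrt(1 - t^2)) dt.

    The substitution t = sin(theta) removes the square root; the remaining
    integral over (0, pi/2) is approximated by the midpoint rule.
    """
    h = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        s = math.sin((k + 0.5) * h)
        total += phi(s) / (s * s)
    return (2 / math.pi) * h * total

print(freud_endpoint(1.0), math.pi / 2)          # lam = 1
print(freud_endpoint(2.0))                       # lam = 2
print(decay_integral(lambda t: t * t))           # borderline case phi(x) = x^2
print(decay_integral(lambda t: 1.1 * t * t))     # phi(x) = 1.1 x^2 violates the bound
```

For ϕ(x) = x² the integral sits exactly at the threshold, which matches the statement that the rate e^{−nx²} is attainable while e^{−n(1+ε)x²} is not.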
7. Polynomial inverse images of intervals

Let T_N be a real polynomial of degree N with N − 1 alternating minima and maxima that are ≤ −1 resp. ≥ 1. Then the set T_N^{−1}[−1, 1] consists of N intervals on which T_N is a one-to-one mapping onto [−1, 1], but some of these intervals may be attached to one another, so T_N^{−1}[−1, 1] consists of some l intervals, where 1 ≤ l ≤ N. We call such polynomials admissible. Recently a new method has emerged to transfer results from a single interval to general compact subsets of the real line, which is based on the following density theorem.

Density theorem (Bogatyrev, Peherstorfer, Totik). If E = ∪_{j=1}^{k} [a_j, b_j] and ε > 0, then there is E* = T^{−1}[−1, 1], E* = ∪_{j=1}^{k} [a*_j, b*_j] with some admissible polynomial T such that

\[
|a_j - a_j^*| \le \varepsilon, \qquad |b_j - b_j^*| \le \varepsilon.
\]

Actually, we may have here a*_j = a_j for all j, and even b*_k = b_k. One can also request b*_j < b_j or b*_j > b_j for all other j = 1, . . . , k − 1. Note that in this theorem T^{−1}[−1, 1] must consist of precisely k intervals while the degree of T is generally large (for small ε). This theorem was proved independently in three different papers by three different methods, see [8], [16], [23].

Suppose we have a result on [−1, 1], and want to find its analogue/generalization on compact subsets of R. The aforementioned polynomial inverse image method consists of the following steps:
• Start from the result on [−1, 1].
• Transform it by x → T(x) to T^{−1}[−1, 1] for any admissible polynomial T.
• Approximate any set consisting of finitely many intervals by a polynomial inverse image T^{−1}[−1, 1] using the density theorem.
• Approximate a compact set by sets consisting of finitely many intervals.

We emphasize that we DO NOT want to copy the proof applied in the interval case (which would be in most cases a futile attempt), but want to get the general statement from the simpler interval case by the above transfer technique. The point is that in many cases intervals can be handled much more easily than general compact sets (e.g., one can map [−1, 1] onto the unit circle, where powerful methods of harmonic analysis are available), and if the above transfer works then one gets the general case almost for free. At the heart of the method is the fact that the equilibrium measure is preserved under polynomial inverse images (with respect to admissible polynomials), see Section 12. Now this method may or may not work with a given result, but below we list three cases where it works nicely.
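A simple concrete instance of an admissible polynomial, given here only as an illustration and not taken from the text, is the quadratic T(x) = (2x² − a² − b²)/(b² − a²) with 0 < a < b. Its single local extremum is a minimum at 0 with value −(a² + b²)/(b² − a²) ≤ −1, and T^{−1}[−1, 1] = [−b, −a] ∪ [a, b]. The following sketch locates the inverse image by sampling; the sampling routine and all names are assumptions of this example.

```python
def T(x, a, b):
    """Quadratic polynomial with T^{-1}[-1, 1] = [-b, -a] U [a, b], where 0 < a < b."""
    return (2 * x * x - a * a - b * b) / (b * b - a * a)

def inverse_image_intervals(poly, lo, hi, n=200001):
    """Approximate {x in [lo, hi] : |poly(x)| <= 1} as a list of intervals by sampling."""
    h = (hi - lo) / (n - 1)
    intervals, start, prev = [], None, None
    for i in range(n):
        x = lo + i * h
        inside = abs(poly(x)) <= 1.0
        if inside and start is None:
            start = x                       # a new component begins here
        if not inside and start is not None:
            intervals.append((start, prev)) # the component ended at the previous sample
            start = None
        prev = x
    if start is not None:
        intervals.append((start, prev))
    return intervals

a, b = 0.5, 1.0
components = inverse_image_intervals(lambda x: T(x, a, b), -2.0, 2.0)
print(components)  # two intervals, close to [-1, -0.5] and [0.5, 1]
```

Here N = 2 and the inverse image has l = 2 components. Taking a = 0 instead would attach the two branches at the origin and give the single interval [−b, b].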
8. Bernstein inequality on compact subsets of the real line

Bernstein's inequality connects the size of the derivative of a polynomial with its supremum norm:

\[
|P_n'(x)| \le \frac{n}{\sqrt{1-x^2}}\,\|P_n\|_{[-1,1]}, \qquad x \in [-1,1].
\]

This is one of the basic inequalities for polynomials which is frequently used in approximation theory. With the polynomial inverse image method one gets the following generalization (recall that ω_E is the density of the equilibrium measure of E ⊂ R with respect to linear Lebesgue measure wherever it exists): If E ⊂ R is compact, then

\[
|P_n'(x)| \le n\pi\omega_E(x)\,\|P_n\|_E, \qquad x \in \mathrm{Int}(E). \tag{8.1}
\]

More is true, namely

\[
\left(\frac{|P_n'(x)|}{\pi\omega_E(x)}\right)^2 + n^2|P_n(x)|^2 \le n^2\|P_n\|_E^2, \qquad x \in \mathrm{Int}(E), \tag{8.2}
\]

which is the analogue of the inequality

\[
\left(|P_n'(x)|\sqrt{1-x^2}\right)^2 + n^2|P_n(x)|^2 \le n^2\|P_n\|_{[-1,1]}^2 \tag{8.3}
\]

of Bernstein [7] (see also [22]). (8.1) and (8.2) are due to M. Baran [2], who actually got them also in higher dimension. Both inequalities were rediscovered in [23] with the polynomial inverse image method. It was also proven in [23] that the inequality (8.1) is sharp: If E ⊂ R is compact and x ∈ Int(E), then for every ε > 0 there are polynomials P_n of degree n ≥ n_0 such that

\[
|P_n'(x)| \ge (1-\varepsilon)\,n\pi\omega_E(x)\,\|P_n\|_E.
\]

Note that if E = [−1, 1], then πω_E(x) = 1/√(1 − x²), and (8.1) gives back the original Bernstein inequality. A similar method can be used to get asymptotically sharp constants in the analogue of Markov's inequality for sets consisting of finitely many intervals (see [23]).

9. Christoffel functions

Christoffel functions play a fundamental role in the theory of orthogonal polynomials. The Christoffel function associated with a measure ν supported on the real line is defined as

\[
\lambda_n(\nu,z) = \inf_{P_n(z)=1} \int |P_n|^2\,d\nu.
\]
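For one specific measure the infimum can be evaluated in closed form. The sketch below assumes the Chebyshev (arcsine) measure dν = dx/(π√(1 − x²)) on [−1, 1], reads P_n as a polynomial of degree at most n, and uses the standard identity λ_n(ν, x) = 1/Σ_{k=0}^{n} p_k(x)², where p_0 = 1 and p_k = √2 T_k are the orthonormal Chebyshev polynomials. The choice of measure and the function name are assumptions made for illustration.

```python
import math

def christoffel_arcsine(n, x):
    """lambda_n(nu, x) for d nu = dx / (pi sqrt(1 - x^2)) on [-1, 1].

    Uses lambda_n = 1 / sum_{k=0}^{n} p_k(x)^2 with p_0 = 1 and
    p_k(x) = sqrt(2) * cos(k * arccos(x)) for k >= 1.
    """
    theta = math.acos(x)
    kernel = 1.0 + sum(2.0 * math.cos(k * theta) ** 2 for k in range(1, n + 1))
    return 1.0 / kernel

for n in (10, 100, 1000):
    print(n, n * christoffel_arcsine(n, 0.3))
```

At interior points the product n λ_n(ν, x) approaches 1 for this measure, since the sum of squares equals n + 1 plus a bounded oscillating term.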
The asymptotic behavior of λ_n(ν, z) is closely linked to the behavior of orthogonal polynomials with respect to ν. For z not lying on the support this asymptotic behavior goes back to Szegő, and the same is true when z is on the support but the measure is a smooth one. When the measure is not smooth, the
situation is more difficult. When the support of ν is [−1, 1], A. Máté, P. Nevai and V. Totik proved [15]: If supp(ν) = [−1, 1] and log ν′ is integrable, then for almost all x ∈ [−1, 1]

\[
\lim_{n\to\infty} n\lambda_n(\nu,x) = \pi\sqrt{1-x^2}\,\nu'(x). \tag{9.1}
\]

From here the polynomial inverse image method gives the following. Let E = supp(ν) ⊂ R be compact, and let ν satisfy locally the condition log ν′ ∈ L¹ in the interior Int(E) of E. Then for almost all x ∈ Int(E)

\[
\lim_{n\to\infty} n\lambda_n(\nu,x) = \frac{\nu'(x)}{\omega_E(x)}.
\]

Note again that if E = [−1, 1], then 1/ω_E(x) = π√(1 − x²), so we get back the original Máté-Nevai-Totik result.

10. Approximation on compact subsets

Let E be a subset of the real line, and f a continuous function on E. The quantity

\[
E_n(f,E) = \inf_{\deg(P_n)\le n} \|f - P_n\|_E
\]

is the error of best approximation of f on E by polynomials of degree at most n, and it is one of the basic quantities in approximation theory. S.N. Bernstein proved in [4] that the limit

\[
\lim_{n\to\infty} nE_n(|x|,[-1,1]) = \sigma \tag{10.1}
\]

exists and is positive and finite. The exact value of the constant σ is not known. This simple-looking result is quite difficult; the shortest known proof is over 50 pages. Later Bernstein [5], [6] extended (10.1) by proving that if p ≥ 0 is not an even integer, then

\[
\lim_{n\to\infty} n^p E_n(|x|^p,[-1,1]) = \sigma_p
\]

with some constant σ_p. Furthermore, with the same σ_p, if x_0 ∈ (−1, 1), then

\[
\lim_{n\to\infty} n^p E_n(|x-x_0|^p,[-1,1]) = (1-x_0^2)^{p/2}\sigma_p.
\]

In general, we have the following extension: If E ⊂ R is compact and x_0 ∈ Int(E), then

\[
\lim_{n\to\infty} n^p E_n(|x-x_0|^p,E) = (\pi\omega_E(x_0))^{-p}\sigma_p. \tag{10.2}
\]

If E = [−1, 1], then (πω_E(x_0))^{−p} = (1 − x_0²)^{p/2}, so we get back Bernstein's third theorem above. This theorem was found by R.K. Vasiliev [29] in a different form. Vasiliev's proof is over a hundred pages long, while the polynomial inverse image method proves (10.2) in a few pages (from Bernstein's original theorem), see [28].
11. Why is the density theorem true?

For E = ∪_{j=1}^{k} [a_j, b_j] set H(x) = ∏_{j=1}^{k} (x − a_j)(x − b_j). Then the following are equivalent ([17], [23], [1]):
• E = T_N^{−1}[−1, 1] for some admissible polynomial T_N,
• µ_E([a_j, b_j]) are each rational (where µ_E is the equilibrium measure of E),
• the Pell type equation

\[
S_n^2(x) - H(x)\,U_{n-k}^2(x) = 1
\]

is solvable for the polynomials S_n, U_{n−k}.

Thus, if we want to hit upon a set which is the inverse image of [−1, 1] under an admissible polynomial, then we must make sure that µ_E([a_j, b_j]), 1 ≤ j ≤ k, are each rational. For x = (x_1, . . . , x_k) consider E_x = ∪_{j=1}^{k} [a_j, b_j + x_j] and g_j(x) = µ_{E_x}([a_j, b_j + x_j]). With G(x) = (g_1(x), . . . , g_k(x)) we want to prove: there are arbitrarily small x for which G(x) is a rational point in R^k. The functions g_j have the following properties (for small x).
(A) g_j is a continuous function on some cube [0, a]^k,
(B) g_j is strictly monotone increasing in x_j and strictly monotone decreasing in every x_i with i ≠ j, and
(C) Σ_{j=1}^{k} g_j(x) = 1.

Call any system {g_j(x_1, . . . , x_k)}_{j=1}^{k} with these properties a monotone system. There are many monotone systems, from market shares to scheduling jobs on a mainframe computer. Here is one that appeared as one of the problems in the 1991 Miklós Schweitzer Contest in Hungary.

The inheritance problem. To divide their inheritance k brothers turn to an impartial judge. Secretly, however, each of them bribes the judge. What a given brother gets depends continuously and monotonically on the bribes: it is monotone increasing in his own bribe and it is monotone decreasing in everybody else's bribe. Show that if the eldest brother does not give too much to the judge, then the others can give so that the decision will be fair.

Let x_j be the bribe of the jth brother and g_j(x_1, . . . , x_k) his inheritance. These form a monotone system. If we set again G(x) = (g_1(x), . . . , g_k(x)), then the inheritance problem states that if x_k is not large, then there are x_1, . . . , x_{k−1} so that

\[
G(x) = G(0). \tag{11.1}
\]

Recall that for g_j(x) = µ_{E_x}([a_j, b_j + x_j]) we needed that rational points in the range of x → G(x) are dense. Note that G : [0, a]^k → R^k is a singular mapping, its range lies in {y : Σ_j y_j = 1}, so neither property (density of rational points or (11.1)) is obvious.

But both follow from general properties of monotone systems: Let g_j, j = 1, . . . , k form a monotone system, and set G_{k−1}(x) = (g_1(x), . . . , g_{k−1}(x)). Then there is a δ > 0 such that for fixed
x_k ∈ [0, δ] the image of any [0, ε]^{k−1} under G_{k−1} contains a neighborhood of G_{k−1}(0). This can be proven by induction on k. In particular, the density theorem follows. Actually, for fixed x_k the mapping x → G_{k−1}(x) is a homeomorphism.

12. An illustration of the polynomial inverse image method

As we have already mentioned, the equilibrium measure of [−1, 1] is given by the Chebyshev distribution

\[
\frac{d\mu_{[-1,1]}}{dx} = \frac{1}{\pi\sqrt{1-x^2}}. \tag{12.1}
\]

As an illustration of the polynomial inverse image method we show that if E = ∪_{j=1}^{n} [a_{2j−1}, a_{2j}], a_1 < a_2 < · · · < a_{2n}, consists of finitely many intervals, then its equilibrium measure is of the form

\[
\frac{d\mu_E(x)}{dx} = \frac{1}{\pi}\,\frac{\prod_{j=1}^{n-1}|x-\zeta_j|}{\sqrt{\prod_{i=1}^{2n}|x-a_i|}}, \qquad x \in E, \tag{12.2}
\]

where ζ_j ∈ (a_{2j}, a_{2j+1}) are some fixed numbers in the intervals contiguous to E. This is known (see, e.g., [20, Lemma 4.4.1], [30]), but the following elementary deduction uses the aforementioned density theorem. We shall transfer (12.1) into (12.2).

For a set E ⊂ R consisting of finitely many intervals the equilibrium measure µ_E is the unique measure minimizing the logarithmic energy

\[
\iint \log\frac{1}{|x-t|}\,d\mu(x)\,d\mu(t)
\]

among all probability measures on E. It is characterized also as the unique probability measure on E for which the logarithmic potential

\[
U^{\mu}(x) = \int \log\frac{1}{|x-t|}\,d\mu(t) \tag{12.3}
\]

is constant on E (otherwise we could move mass to a place of lower potential to reduce the energy). For example, we have

\[
\int_{-1}^{1} \log\frac{1}{|z-u|}\,\frac{1}{\pi\sqrt{1-u^2}}\,du = \log 2, \qquad z \in [-1,1]. \tag{12.4}
\]

For all these see, e.g., [18, Chapter 3].

Now we transfer (12.1) into (12.2). First of all, if E = T_N^{−1}[−1, 1] = ∪_{j=1}^{n} [a_{2j−1}, a_{2j}], a_1 < a_2 < · · · < a_{2n}, with some admissible polynomial T_N, then we prove (see [9])

\[
\frac{d\mu_E(x)}{dx} = \frac{|T_N'(x)|}{N\pi\sqrt{1-T_N^2(x)}}, \qquad x \in E. \tag{12.5}
\]
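Formula (12.5) can be checked numerically on a simple example. The sketch below assumes the illustrative admissible polynomial T(x) = (2x² − a² − b²)/(b² − a²), N = 2, for which E = [−b, −a] ∪ [a, b], and evaluates the logarithmic potential of the measure in (12.5) branch by branch after the substitution u = T(x) = cos θ. The midpoint rule, the sample points z and all names are assumptions of this example. The computed values are compared with (1/2) log(4/(b² − a²)), which is log(1/cap(E)) for this set and agrees with the constant (log |γ_N| + log 2)/N obtained later in this section for γ_N = 2/(b² − a²).

```python
import math

def branch_points(theta, a, b):
    """The two solutions x of T(x) = cos(theta), T(x) = (2x^2 - a^2 - b^2)/(b^2 - a^2)."""
    r = math.sqrt((a * a + b * b + (b * b - a * a) * math.cos(theta)) / 2)
    return (-r, r)

def potential(z, a, b, n=200000):
    """Logarithmic potential at z of the measure with density
    |T'(x)| / (2 pi sqrt(1 - T(x)^2)) on E = [-b, -a] U [a, b].

    On each branch the substitution u = T(x) = cos(theta) turns the integral into
    (1 / (2 pi)) * integral over (0, pi) of log(1 / |z - x(theta)|) d theta,
    which is approximated by the midpoint rule.
    """
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        for x in branch_points(theta, a, b):
            total += math.log(1.0 / abs(z - x))
    return total * h / (2 * math.pi)

a, b = 0.5, 1.0
reference = 0.5 * math.log(4.0 / (b * b - a * a))
for z in (0.55, 0.7, -0.9):
    print(z, potential(z, a, b), reference)
```

The integrand has an integrable logarithmic singularity where x(θ) = z, so the midpoint rule converges slowly, and only agreement to a few digits should be expected from this sketch.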
This already verifies (12.2) in the special case when the set E is a polynomial inverse image of an interval. In fact, recall that there are N intervals I_1, . . . , I_N that are mapped by T_N onto [−1, 1] in a one-to-one manner. Now E = T_N^{−1}[−1, 1] = ∪_{i=1}^{N} I_i consists of the n intervals [a_{2j−1}, a_{2j}], j = 1, . . . , n, so each [a_{2j−1}, a_{2j}] is the union of some I_i's. If [a_{2j−1}, a_{2j}] is the union of s such I_i's, then at those s − 1 endpoints z of the latter intervals which lie inside (a_{2j−1}, a_{2j}) we must have T_N′(z) = 0. There are N − n such z's; furthermore, since |T_N(a_{2j})| = |T_N(a_{2j+1})| = 1, there must be at least one zero of T_N′ in each of the contiguous intervals (a_{2j}, a_{2j+1}), j = 1, . . . , n − 1 (note that |T_N(x)| > 1 on these intervals). This is already (N − n) + (n − 1) = N − 1 zeros for T_N′, and since T_N′ has at most N − 1 zeros, there must be a unique zero ζ_j lying in each interval (a_{2j}, a_{2j+1}), j = 1, . . . , n − 1. Now if γ_N is the leading coefficient of T_N, then we get

\[
|T_N'(x)| = N|\gamma_N| \prod_{j=1}^{n-1}|x-\zeta_j| \prod_{z}|x-z|,
\]

where the last product is taken for all endpoints z of the intervals I_i that lie inside E. In a similar fashion, since all these z's are double zeros of 1 − T_N²(x), for x ∈ E we have

\[
1 - T_N^2(x) = |T_N^2(x) - 1| = \gamma_N^2 \prod_{j=1}^{2n}|x-a_j| \prod_{z}|x-z|^2.
\]

Thus, the factors |x − z| cancel on the right of (12.5), and we obtain the form (12.2).

To prove (12.5), first of all the change of variable u = T_N(x) shows that for every i = 1, . . . , N we have

\[
\int_{I_i} \frac{|T_N'(x)|}{N\pi\sqrt{1-T_N^2(x)}}\,dx = \frac{1}{N}\int_{-1}^{1} \frac{1}{\pi\sqrt{1-u^2}}\,du = \frac{1}{N},
\]

hence the measure given in (12.5) is a probability measure on E. Thus, it is enough to prove that its logarithmic potential is constant on E. The same change of variable gives for z ∈ E

\[
\int_{I_i} \log\frac{1}{|z-x|}\,\frac{|T_N'(x)|}{N\pi\sqrt{1-T_N^2(x)}}\,dx
= \frac{1}{N}\int_{-1}^{1} \log\frac{1}{|z-T_{N,i}^{-1}(u)|}\,\frac{1}{\pi\sqrt{1-u^2}}\,du, \tag{12.6}
\]

where T_{N,i}^{−1} is the inverse of the restriction of T_N onto I_i. Denote T_{N,i}^{−1}(u) by u_i. The polynomial ∏_{i=1}^{N} (z − u_i) has the zeros z = u_1, . . . , u_N, which, by definition,
are the zeros of T_N(z) − u = 0. Hence,

\[
\prod_{i=1}^{N}(z-u_i) = \frac{1}{\gamma_N}\,(T_N(z)-u).
\]

Therefore, we obtain from (12.6)

\[
\begin{aligned}
\int_E \log\frac{1}{|z-x|}\,\frac{|T_N'(x)|}{N\pi\sqrt{1-T_N^2(x)}}\,dx
&= \sum_{i=1}^{N}\int_{I_i} \log\frac{1}{|z-x|}\,\frac{|T_N'(x)|}{N\pi\sqrt{1-T_N^2(x)}}\,dx \\
&= \sum_{i=1}^{N}\frac{1}{N}\int_{-1}^{1} \log\frac{1}{|z-T_{N,i}^{-1}(u)|}\,\frac{1}{\pi\sqrt{1-u^2}}\,du \\
&= \frac{1}{N}\int_{-1}^{1} \log\frac{|\gamma_N|}{|T_N(z)-u|}\,\frac{1}{\pi\sqrt{1-u^2}}\,du \\
&= \frac{\log|\gamma_N| + \log 2}{N},
\end{aligned}
\]

where in the last step we used (12.4) and the fact that T_N(z) ∈ [−1, 1]. This completes the proof for E = T^{−1}[−1, 1].

Finally, let E = ∪_{j=1}^{n} [a_{2j−1}, a_{2j}] be an arbitrary set consisting of finitely many intervals. This case is obtained from the polynomial inverse image case by a simple limit process using the density theorem. Let us choose polynomial inverse image sets E ⊆ E_m with E_m = ∪_{j=1}^{n} [a_{2j−1,m}, a_{2j,m}], where a_{2j−1,m} = a_{2j−1}, a_{2j,m} ↘ a_{2j} as m → ∞ as in the density theorem (see the remark after the density theorem in Section 7). For E_m we have already verified the validity of (12.2), and let the corresponding ζ's be ζ_{j,m} ∈ (a_{2j,m}, a_{2j+1,m}), j = 1, . . . , n − 1. By selecting a subsequence if necessary, we may assume that ζ_{j,m} → ζ_j as m → ∞ for some ζ_j ∈ [a_{2j}, a_{2j+1}] for all j = 1, . . . , n − 1. Let the measure µ on E be given by the right-hand side of (12.2) with these ζ_j's. We claim that this µ is the equilibrium measure of E, and to this end it is sufficient to show that it has total measure 1 and its logarithmic potential is constant on E, i.e., for z, y ∈ E we have

\[
\int \log\frac{|z-x|}{|y-x|}\,d\mu(x) = 0. \tag{12.7}
\]

Since on [a_{2j−1}, a_{2j}] we have

\[
\frac{1}{\pi}\,\frac{\prod_{i=1}^{n-1}|x-\zeta_{i,m}|}{\sqrt{\prod_{i=1}^{2n}|x-a_{i,m}|}} \le \frac{C}{\sqrt{|x-a_{2j-1}|\,|x-a_{2j}|}},
\]
with a C independent of m, Lebesgue's dominated convergence theorem implies that

\[
\int_{[a_{2j-1},a_{2j}]} \log\frac{|z-x|}{|y-x|}\,d\mu(x) = \lim_{m\to\infty}\int_{[a_{2j-1},a_{2j}]} \log\frac{|z-x|}{|y-x|}\,d\mu_{E_m}(x).
\]

It is also easy to show from the concrete form given in (12.2) for E_m that

\[
\int_{a_{2j}}^{a_{2j+1,m}} \log\frac{|z-x|}{|y-x|}\,d\mu_{E_m}(x) \to 0 \qquad \text{as } m \to \infty.
\]

These give now

\[
\int_E \log\frac{|z-x|}{|y-x|}\,d\mu(x) = \lim_{m\to\infty}\int_{E_m} \log\frac{|z-x|}{|y-x|}\,d\mu_{E_m}(x) = \lim_{m\to\infty} 0 = 0,
\]

so (12.7) has been verified. That µ is a probability measure follows from a similar argument using that the total mass of µ_{E_m} is 1.

Finally, to complete the proof of (12.2) we still have to make sure that ζ_j ∈ (a_{2j}, a_{2j+1}), i.e., that ζ_j cannot coincide with either a_{2j} or a_{2j+1}. But that is easy: if E ⊆ E′ = ∪_{i=1}^{n} [a′_{2i−1}, a′_{2i}] is a polynomial inverse image set under an admissible mapping with a′_{2j} = a_{2j} for a given j (see the remark after the density theorem in Section 7), then dµ_{E′}(x)/dx has a 1/√|x − a_{2j}| singularity at a′_{2j} = a_{2j} (this follows from (12.2), which has been verified for E′). Since on E we have dµ_E(x)/dx ≥ dµ_{E′}(x)/dx (in fact, µ_E is obtained from µ_{E′} by redistributing onto E the mass of µ_{E′} lying on E′ \ E; this is the so-called balayage process), it follows that dµ_E(x)/dx has a singularity at least this strong. But then ζ_j = a_{2j} is impossible, for then the factor |x − ζ_j| in the numerator of (12.2) would cancel the factor √|x − a_{2j}| in the denominator. A similar argument shows that ζ_j ≠ a_{2j+1}, and this completes the proof of (12.2).

References

[1] A.I. Aptekarev, Asymptotic properties of polynomials orthogonal on a system of contours and periodic motions of Toda lattices, Math. USSR Sbornik, 53 (1986), 223–260.
[2] M. Baran, Complex equilibrium measure and Bernstein type theorems for compact sets in R^n, Proc. Amer. Math. Soc., 123 (1995), 485–494.
[3] D. Benko, Approximation by weighted polynomials, J. Approx. Theory, 120 (2003), 153–182.
[4] S.N. Bernstein, Sur la meilleure approximation de |x| par des polynômes des degrés donnés, Acta Math., 37 (1914), 1–57.
[5] S.N. Bernstein, On the best approximation of |x|^p by means of polynomials of extremely high degree, Izv. Akad. Nauk SSSR, Ser. Mat. 2 (1938), 160–180. Reprinted in S.N. Bernstein Collected Works, Vol. 2, pp. 262–272. Izdat. Nauk SSSR, Moscow, 1954 (Russian).
[6] S.N. Bernstein, On the best approximation of |x − c|^p, Dokl. Akad. Nauk SSSR, 18 (1938), 379–384. Reprinted in S.N. Bernstein Collected Works, Vol. 2, pp. 273–260. Izdat. Nauk SSSR, Moscow, 1954 (Russian).
[7] S.N. Bernstein, Extremal properties of polynomials and best approximation of functions of a real variable, I., ONTI, 1–203 (Russian).
[8] A.B. Bogatyrev, Effective computation of Chebyshev polynomials for several intervals, Math. USSR Sb., 190 (1999), 1571–1605.
[9] J.S. Geronimo and W. Van Assche, Orthogonal polynomials on several intervals via a polynomial mapping, Trans. Amer. Math. Soc., 308 (1988), 559–581.
[10] M. von Golitschek, Approximation by incomplete polynomials, J. Approx. Theory, 28 (1980), 155–160.
[11] A.B. Kuijlaars, The role of the endpoint in weighted polynomial approximation with varying weights, Constr. Approx., 12 (1996), 287–301.
[12] A.B. Kuijlaars, A note on weighted polynomial approximation with varying weights, J. Approx. Theory, 87 (1996), 112–115.
[13] D.S. Lubinsky and E.B. Saff, Uniform and mean approximation by certain weighted polynomials, with applications, Constr. Approx., 4 (1988), 21–64.
[14] D.S. Lubinsky and V. Totik, Weighted polynomial approximation with Freud weights, Constructive Approx., 10 (1994), 301–315.
[15] A. Máté, P. Nevai and V. Totik, Szegő's extremum problem on the unit circle, Annals of Math., 134 (1991), 433–453.
[16] F. Peherstorfer, Deformation of minimizing polynomials and approximation of several intervals by an inverse polynomial mapping, J. Approx. Theory, 111 (2001), 180–195.
[17] F. Peherstorfer and K. Schiefermayr, Theoretical and numerical description of extremal polynomials on several intervals I, Acta Math. Hungar., 83 (1999), 27–58.
[18] T. Ransford, Potential Theory in the Complex Plane, Cambridge University Press, Cambridge, 1995.
[19] E.B. Saff and R.S. Varga, Uniform approximation by incomplete polynomials, Internat. J. Math. Math. Sci., 1 (1978), 407–420.
[20] H. Stahl and V. Totik, General Orthogonal Polynomials, Encyclopedia of Mathematics, 43, Cambridge University Press, New York, 1992.
[21] E.B. Saff and V. Totik, Logarithmic Potentials with External Fields, Grundlehren der mathematischen Wissenschaften, 316, Springer Verlag, Berlin, Heidelberg, 1997.
[22] G. Szegő, Über einen Satz des Herrn Serge Bernstein, Schriften Königsberger Gelehrten Ges. Naturwiss. Kl., 5 (1928/29), 59–70.
[23] V. Totik, Polynomial inverse images of intervals and polynomial inequalities, Acta Math., 187 (2001), 139–160.
[24] V. Totik, Asymptotics for Christoffel functions with varying weights, Advances of Applied Math., 25 (2000), 322–351.
[25] V. Totik, Weighted approximation with varying weights, Lecture Notes in Mathematics, 1569, Springer Verlag, New York, 1994.
[26] V. Totik, Weighted polynomial approximation for weights with slowly varying extremal density, J. Approx. Theory, 99 (1999), 258–288.
[27] V. Totik, Weighted polynomial approximation for convex fields, Constr. Approx., 16 (2000), 261–281.
[28] V. Totik, Metric properties of harmonic measures (manuscript).
[29] R.K. Vasiliev, Chebyshev Polynomials and Approximation Theory on Compact Subsets of the Real Axis, Saratov University Publishing House, 1998.
[30] H. Widom, Polynomials associated with measures in the complex plane, J. Math. Mech., 16 (1967), 997–1013.

Vilmos Totik
Bolyai Institute, University of Szeged, Aradi v. tere 1, H-6720, Hungary
and
Department of Mathematics, University of South Florida, Tampa, FL 33620, USA
e-mail:
[email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
SLE, Conformal Restriction, Loops
Wendelin Werner
Abstract. There have been quite a few surveys and proceedings papers on Schramm-Loewner Evolutions and 2D critical phenomena in recent years, for instance [38, 17, 18, 8, 42], as well as lecture notes [39, 16, 41], and a book [19] is in preparation. Here we try to review the subject, including some recent and ongoing work, in an informal style close to that of an actual seminar talk.
1. Background
The large-scale behavior of generic physical systems tends to be deterministic, even if, on a small scale, one can model it as being random. One of the goals of “statistical physics” is precisely to start from a random discrete model and to show that, when the size of the system goes to infinity, the macroscopic observables of the system converge to deterministic values.
However, it has also been observed for many systems that if they are taken exactly at their critical point (i.e., the value of the parameter, usually the temperature, at which a phase transition occurs), the macroscopic behavior can exhibit random features. Furthermore, these features are related to the behavior of the deterministic system when the temperature is very close but not equal to the critical temperature. More precisely, some of these deterministic observables diverge when the temperature of the system approaches the critical temperature $T_c$ (a phase transition is often described in those terms), and typically, when $T \to T_c$, one expects them to behave like $(T - T_c)^{\alpha}$ for some power $\alpha$ that is related to the behavior of the random system at $T_c$ via so-called hyperscaling relations. The number $\alpha$ is a critical exponent.
It has been (and still is) an important goal both for physicists and mathematicians to determine these exponents and to study these random macroscopic systems. In three-dimensional space, there seems to be no way to give a precise mathematical description. For instance, the critical exponents are not expected to take any special value. By contrast, for two-dimensional models, it was recognized more than twenty years ago in the physics community that the critical systems could be conformally invariant in their scaling limit, and that this should make it possible to describe some of their features. This has given rise to many striking predictions, via tools such as conformal field theory (which used and developed some elaborate mathematics).
Figure 1. Percolation in a 5 × 5 rhombus.
It is maybe a good idea at this point to say a word about what conformal invariance means in this context. We deliberately opt here for a definition in terms of random sets, since this is the approach that we shall use throughout this paper.
To fix things, let us first quickly describe one of the simplest probabilistic models that clearly exhibit random behavior at the macroscopic level. In Fig. 1, each site of the triangular lattice in a rhombus has been colored black or white, independently with probability 1/2. Let us now consider the event A that in this rhombus there exists a path of white cells that joins the left side of the rhombus to the right side. In fact, this happens if and only if there is no black path that joins the bottom side of the rhombus to the top (such a black path would block any possible white crossing). In particular, we see that for symmetry reasons the probability of A is 1/2, regardless of the size of the rhombus. Hence, even at very large scale, there are non-deterministic observables with probability bounded away from 0 and from 1.
In general, one focuses on the connectivity properties of the subgraph consisting of the white cells, say. It is described by its connected components, which are called “clusters”. The previous argument yields that at any scale one can find (with high probability) clusters of size comparable to that scale.
Suppose now that the considered system (which can be the percolation model that we have just defined, or another model) is described in terms of clusters. This is a family of disjoint compact sets $(K_j)$. What does it mean for such a system to be conformally invariant in the scaling limit? Recall that Riemann’s mapping theorem ensures that for any two simply connected domains D and D' (that are both non-empty and with non-empty complement), there exist many conformal maps from D onto D', i.e., bijections from D onto D' that preserve the angles.
In fact, the family of such maps (for fixed D and D') can be parametrized by three real parameters. Now, consider the statistical physics system on a very fine grid approximation of D (i.e., the mesh-size of the grid is very small), and also on a very fine grid approximation of D'. This gives a law on clusters $(K_j)$ in D, and a law on clusters $(K'_j)$ in D'. We say that the system is conformally invariant in the scaling limit if, when the mesh-size of the lattice vanishes, the law of $(K'_j)$ becomes closer and closer to the law of $(\Phi(K_j))$, where Φ is any fixed conformal map from D onto D'. Loosely speaking: What one sees in D' is close (in law) to the image
of what one sees in D under any conformal map from D onto D'. Note that here (as in most of this text), one obtains the scaling limit by fixing a geometry (the domain D) and letting the mesh-size of the grid approximation vanish.
This geometrical description has more or less equivalent formulations in terms of “correlation functions” that are more standard in the physics literature, but that turn out to be more involved, as one then has to use the asymptotic fractal dimension of the clusters in the definition of conformal invariance. An example of a correlation function in the percolation picture above would be the probability that two given points are in the same cluster.
In fact, the laws of the systems are often described in a domain with some marked points. For instance, one can cut the boundary of D into two different parts separated by the two boundary points A and B, and put different boundary conditions on these two boundary arcs. Then, conformal invariance means that in the previous setup (with two domains D and D' with respective boundary points (A, B) and (A', B')), the law of what one sees in D' is close to the image of what one sees in D under any conformal map from D onto D' that sends the boundary points A and B onto A' and B' respectively.
Belavin, Polyakov and Zamolodchikov [4] have proposed that the law of the limits of two-dimensional critical models from statistical physics, under the assumption that the limits exist and are conformally invariant, can be described/classified in terms of conformal fields. Furthermore, these fields should possess certain properties due to conformal invariance and the nature of the considered models, which can be translated into properties of certain related highest-weight representations of some infinite-dimensional Lie algebras. Thanks to the classification of these special (degenerate) representations, one can predict various features of the considered model, such as the value of the critical exponents etc.
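Returning to the percolation example of Fig. 1, the claim that a white left-right crossing exists exactly when there is no black bottom-top crossing, and hence that the crossing probability is 1/2, can be checked numerically. The following Python sketch is an illustration only: the coordinate system (sites (i, j) with six neighbours, as on a Hex board) and all function names are assumptions made here, not notation from the text.

```python
import random

# Six neighbours of a site of the triangular lattice, written in rhombus
# coordinates (i = row, j = column); this is the adjacency of a Hex board.
NEIGHBOURS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def sample_colouring(n, rng):
    """Colour each site of an n x n rhombus white (True) or black (False),
    independently with probability 1/2."""
    return [[rng.random() < 0.5 for _ in range(n)] for _ in range(n)]


def has_crossing(grid, colour, left_right):
    """Depth-first search for a path of the given colour joining the left and
    right sides (left_right=True) or the bottom and top sides (False)."""
    n = len(grid)
    if left_right:
        stack = [(i, 0) for i in range(n) if grid[i][0] == colour]
    else:
        stack = [(0, j) for j in range(n) if grid[0][j] == colour]
    seen = set(stack)
    while stack:
        i, j = stack.pop()
        if (j if left_right else i) == n - 1:
            return True
        for di, dj in NEIGHBOURS:
            a, b = i + di, j + dj
            if (0 <= a < n and 0 <= b < n and (a, b) not in seen
                    and grid[a][b] == colour):
                seen.add((a, b))
                stack.append((a, b))
    return False


def estimate_crossing_probability(n, trials, seed=0):
    """Monte Carlo estimate of P(A), the white left-right crossing event."""
    rng = random.Random(seed)
    hits = sum(has_crossing(sample_colouring(n, rng), True, True)
               for _ in range(trials))
    return hits / trials
```

For instance, `estimate_crossing_probability(5, 2000)` should return a value close to 0.5, and the estimate should stay near 0.5 for larger rhombi.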
In the example of critical percolation described before, Cardy [7] could for instance give an explicit prediction of the limit of the probability of a left-right crossing of a conformal rectangle (instead of a rhombus) of aspect ratio L (for instance an L × 1 rectangle) as a function of L. An example of a critical exponent in this case is the following prediction: The probability that a given point belongs to a cluster of diameter R times bigger than the mesh-size decays like $R^{-5/48}$ when $R \to \infty$ (more precisely, it decays like $R^{-5/48+o(1)}$).
In fact, for most models, several important things are missing in order to get a clean and rigorous proof of these predictions via this route. First of all, conformal invariance (in the previous sense) for discrete models is only rigorously established for a few exceptional models:
• Simple random walk, which converges to planar Brownian motion; the latter was proved by Paul Lévy to be conformally invariant back in the 50’s [27].
• The above-described percolation model, for which Smirnov [37] recently proved Cardy’s formula and deduced more generally conformal invariance (see also [5]).
• The loop-erased random walks and uniform spanning trees (Lawler, Schramm and Werner [22]).
• The “harmonic explorer” (or “harmonic navigator”), see Schramm and Sheffield [32].
For all other models, proving the existence of a scaling limit, and a fortiori its conformal invariance, is an open problem.
Secondly, the actual relation between these concrete models and the conformal fields (and the relations between them) remained somewhat mysterious. See for instance the review paper [15] that tries to make sense of Cardy’s arguments. This paper has been quite influential in that it has attracted many mathematicians’ attention to these questions. The present survey deals essentially with the properties of the continuous objects, and not with the issue of whether the discrete models actually converge to these continuous objects.
2. Schramm-Loewner Evolutions
Suppose that one wishes to describe the law of a random continuous curve γ in a simply connected domain D that joins two boundary points A and B. Typically, this curve would be the scaling limit of an interface in a discrete grid-based model. We are going to assume that this law satisfies the following two properties (P1) and (P2):
(P1) The law of γ is invariant under the (one-dimensional) family of conformal maps from D onto itself that leave A and B unchanged.
If this property holds, then for any D', A', B' as before, it is possible to define the law of a random curve that joins A' and B' in D', by taking the conformal image of the law in D (the law of this conformal image is independent of the actual choice of the conformal map because of (P1)). By construction, the law of this random curve is then conformally invariant in the sense described before (for any triplet D, A, B it defines a law $P_{D,A,B}$ on curves joining A to B in D, and for any conformal map Φ, $\Phi \circ P_{D,A,B} = P_{\Phi(D),\Phi(A),\Phi(B)}$).
Note that it is not really difficult to find a random curve satisfying (P1): Take D to be the upper half-plane H, A = 0, B = ∞. Then, any random curve γ that is scale-invariant in law would do (the conformal maps from H onto itself that preserve 0 and ∞ are just the multiplications by a positive constant).
(P2) The “Markovian property”: Suppose that the curve is parametrized “from A to B” and that we know how γ begins (i.e., we know γ[0, t], say). What is the (conditional) law of the future of γ? It is exactly the law $P_{D_t,\gamma(t),B}$, where $D_t$ is the connected component of $D \setminus \gamma[0,t]$ that has B on its boundary (when γ is a simple curve, $D_t$ is simply $D \setminus \gamma[0,t]$).
It is very natural to combine these two properties. In most of the relevant discrete models that are conjectured to be conformally invariant, one can find
natural curves in these models (interfaces etc.) that satisfy a discrete analog of (P2). In particular, critical interfaces in nearest-neighbor interaction models are expected to satisfy both (P1) and (P2).
Figure 2. Condition (P2).
Schramm [31] pointed out that there are not that many possible random curves with these two properties. They form a one-dimensional family and are the so-called SLE curves (SLE stands for Schramm, or Stochastic, Loewner Evolutions). These random curves can be described and constructed via iterations of random conformal maps.
The parameter that describes which SLE one is talking about is usually denoted by the letter κ. It is a positive real number that, loosely speaking, describes how wiggly the curve γ is. In fact, one can prove (see Rohde-Schramm [30] for the upper bound and Beffara [3] for the more involved lower bound) that the Hausdorff dimension of SLE$_\kappa$ is 1 + κ/8 when κ < 8. When κ ≥ 8 the curve becomes space-filling [30] and has Hausdorff dimension 2. Of course, this is not very informative for the reader at this point, since the actual definition of SLE$_\kappa$ has not yet been given. But it already shows that these random curves can have different, sometimes surprising, fractal structures. There is also an important phase transition at the value κ = 4: The SLE curves are simple (without double points) when κ ≤ 4, and they have double points as soon as κ > 4; see [30].
Let us now sketch the basic idea behind the construction of SLEs: Suppose for instance that D is the unit square and that A and B are two opposite corners. For convenience, let us focus on the case where the random curve γ is simple. Construct first the path γ up to some time t. Define then the unique conformal map $f_t$ from $D \setminus \gamma[0,t]$ onto D such that $f_t(\gamma(t)) = A$, $f_t(B) = B$ and $f_t'(B) = 1$. The image of the path γ[0, t] under $f_t$ now becomes a part of the boundary of D (that contains A). Suppose now that we continue the path γ after time t.
Then, the image of this additional stretch under $f_t$ is a simple path in D that joins $f_t(\gamma(t)) = A$ to $f_t(B) = B$. The combination of the properties (P1) and (P2) implies immediately that the conditional law of $\tilde\gamma := f_t(\gamma[t,\infty))$ (given γ[0, t]) is $P_{D,A,B}$
(because the conditional law of $\gamma[t,\infty)$ is $P_{D_t,\gamma(t),B}$ and conformal invariance holds). In particular, this conditional law is independent of γ[0, t]. Hence, $\gamma[0,t]$ and $\tilde\gamma[0,t]$ are independent and have the same law.
Figure 3. Sketch of the maps $f_t$, $\tilde f_t$ and $f_{2t} = \tilde f_t \circ f_t$.
Now, the conformal map $f_{2t}$ is clearly obtained by composing the two independent identically distributed maps $\tilde f_t$ (corresponding to $\tilde\gamma[0,t]$) and $f_t$. Similarly, for any N > 1, $f_{Nt}$ is obtained by iterating N independent copies of $f_t$. It is therefore very natural to encode the curve γ via these conformal maps $f_t$, so that the Markovian property translates into independence of the “increments” of $f_t$. In fact, for any t, $f_t$ itself can be viewed as the iteration of N independent copies of $f_{t/N}$. Hence, for any small $\epsilon$, the knowledge of the law of $f_\epsilon$ yields the law of γ at any time that is a multiple of $\epsilon$. This leads to the fact that the knowledge of the law of $f_\epsilon$ for infinitesimal $\epsilon$ in fact characterizes the law of the entire curve γ. It turns out that there is just “one possible direction” in which $f_\epsilon$ can move on the infinitesimal level, and this implies that there exists just a one-dimensional family of possible laws for $f_\epsilon$, each one corresponding to a certain speed of motion in this direction. This leads to the one-dimensional family of possible SLE curves.
More precisely, suppose now that D is the upper half-plane, and that A = 0 and B = ∞. Then, the conformal map $f_t$ has a Laurent expansion at infinity: $f_t(z) = z + a_0(t) + a_1(t)/z + o(1/z)$. It is easy to see that $a_1(t)$ is positive, increasing continuously with t, and that it is therefore natural to use a time-parametrization for which $a_1(t)$ increases linearly (in this way, $f_{2t}$ is indeed obtained as the iteration of two independent copies of $f_t$). Then, the Markovian property implies immediately that $a_0(t)$ is a (real-valued) symmetric Markov process with independent increments.
This yields that $a_0(t) = \beta(\kappa t)$ for some constant κ ≥ 0, where β is a standard real-valued Brownian motion.
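The description of $f_{Nt}$ as a composition of N independent copies of $f_t$ suggests a simple way to draw approximate samples of the curve. The following Python sketch is one such scheme, not a construction taken from the text: it assumes that D is the upper half-plane with A = 0 and B = ∞, that time is normalized so that $a_1(t) = 2t$, and that each small increment can be approximated by an explicit vertical-slit map. All function names are hypothetical.

```python
import cmath
import math
import random


def inverse_slit_map(w, shift, dt):
    """Inverse of F(z) = sqrt((z - shift)**2 + 4*dt), which maps the upper
    half-plane minus a vertical slit of height 2*sqrt(dt) based at `shift`
    onto the upper half-plane and sends the tip of the slit to 0."""
    s = cmath.sqrt(w * w - 4.0 * dt)
    if s.imag < 0:  # choose the root lying in the upper half-plane
        s = -s
    return shift + s


def sle_trace(kappa, total_time, steps, seed=0):
    """Approximate points gamma(t_1), ..., gamma(t_steps) of the curve."""
    rng = random.Random(seed)
    dt = total_time / steps
    # Independent Gaussian increments of variance kappa*dt for the driving
    # Brownian motion (for kappa = 0 they all vanish).
    shifts = [rng.gauss(0.0, math.sqrt(kappa * dt)) for _ in range(steps)]
    trace = []
    for k in range(1, steps + 1):
        # gamma(t_k) is the preimage of 0 under F_k o ... o F_1, so the
        # inverse maps are applied from the most recent one back to the first.
        w = 0j
        for j in range(k - 1, -1, -1):
            w = inverse_slit_map(w, shifts[j], dt)
        trace.append(w)
    return trace
```

With `kappa = 0` the driving function vanishes and the sketch returns points on the vertical ray $2i\sqrt{t}$. The cost grows quadratically with the number of steps, which is acceptable for a few hundred points.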
Furthermore, one can recover γ if one knows only the function $t \mapsto a_0(t)$: One has to solve (for each fixed z) the stochastic differential equation
$$df_t(z) = da_0(t) + \frac{2}{f_t(z)}\,dt$$
and see that $\gamma(t) = f_t^{-1}(A)$. It is not easy, but it can be proved [30], that this procedure indeed defines (for any fixed κ) almost surely a path γ (as we have already mentioned, one can even compute its fractal dimension).
Some of the SLE curves can be shown to have special properties that can be related to the special features of the corresponding discrete models: The independence properties of percolation correspond to the “locality” property of SLE$_6$ derived and studied in [20, 21]. The special properties of self-avoiding walks correspond to the restriction property of SLE$_{8/3}$ that we will describe below, etc.
3. Conformal restriction and universality
We are now going to describe a “static approach”, developed in [24] (some ideas go back to [25]), that can be fruitfully combined with the previous ideas. Suppose that one knows the law of a random set K in a simply connected domain D. For a given simply connected subset D' of D, there exist two different simple ways to try to define a law on a random subset of D'. The first one is to use conformal invariance: Take the law of the conformal image of K under a given conformal map from D onto D'. The other simple possibility is to use conditioning and to take the law of K, conditioned on {K ⊂ D'} (provided that this event has positive probability). The general idea that will be described in the present section is to compare these two possibilities, and to see, for instance, that for some exceptional laws on K (called restriction measures), these two ways coincide.
Suppose that we are looking for a random connected closed subset K of D that joins the two boundary points A and B of D.
Conformal restriction property.
We will say that (the law of) K satisfies conformal restriction if, for any D' as above, the law of K conditioned on {K ⊂ D'} is identical to that of Φ(K) for any conformal map Φ from D onto D' such that Φ(A) = A and Φ(B) = B. Note that conformal restriction implies conformal invariance (just take D' = D). It is not very difficult to check that the following sets satisfy conformal restriction:
(1) A Brownian motion started from A conditioned to exit D at B (it is not a big problem to turn this into a rigorous definition).
(2) A Brownian loop starting and ending at A, conditioned to stay in D and to go through B (this can be viewed as the union of two independent Brownian motions defined in (1): one goes from A to B and one comes back).
(3) The scaling limit of a critical and ordinary percolation cluster, when the percolation is conditioned so that this cluster stays in D and touches the boundary of D only at A and B.
(4) An SLE curve from A to B in D with parameter κ = 8/3.
(5) The union of two SLE curves from A to B in D with κ = 8/3 that are conditioned not to intersect (again, this conditioning can be made rigorous).
It is conjectured that the last two objects respectively correspond to the scaling limits of “self-avoiding walks” from A to B in D (see, e.g., [23] for more details) and “self-avoiding polygons” in D conditioned to touch ∂D at A and B.
In fact, one can see [24] that there exists exactly a one-dimensional family of such random sets K: If K satisfies conformal restriction, then for some positive α, one has $P(K \subset D') = (\Phi'(A)\Phi'(B))^{-\alpha}$ for all D' and all conformal maps Φ from D onto D' that leave A and B fixed. This equation does in fact characterize the law of the filling of K (i.e., all the “holes” in K are added to K). Conversely, for each α ≥ 5/8 (but not for smaller values of α), it is possible to construct such a random set that satisfies conformal restriction. See [24] for details.
It turns out that the boundaries of all these random sets K are similar. Their fractal (Hausdorff) dimension is always 4/3 (which is indeed the dimension of SLE$_{8/3}$). Also, the SLE$_{8/3}$ curve is the only simple curve satisfying this property, and it corresponds to the value α = 5/8. This is related to the fact that it is the conjectured scaling limit of self-avoiding walks [23].
It is not difficult to argue that the value of α for the random objects described in cases (2), (3), (5) listed above must be α = 2. This is due to the fact that a typical self-avoiding polygon/Brownian loop/critical percolation cluster will only have a few (i.e., a tight number of) points with maximal (resp. minimal) y-coordinates in the plane, so that the probability that these points fall in a disc of radius $\epsilon$ will be of order $\epsilon^2$.
This simple observation shows that the fillings of these three objects have the same law. For instance, a critical percolation cluster in its scaling limit, seen from the outside, has exactly the same random shape as the outside of a Brownian loop. This sort of argument has made it possible to deduce from SLE computations the fact that the dimension of the boundary of a Brownian motion is 4/3 [21].
4. Restriction for families of loops
4.1. Loop-measure. The previous section suggests that the restriction measure for α = 2 has something special. One way to see this [26] is to relate it to a natural (infinite) measure on simply connected subsets of the entire plane that possesses remarkable properties. One way to define this infinite measure is as follows: Weight first a starting point z according to Lebesgue measure in the plane, and weight a time-length T according to the measure $dT/(2\pi T^2)$. Draw a Brownian loop of time-length T starting and ending at z, i.e., a planar Brownian motion $(B_t, t \le T)$ conditioned on $B_0 = B_T = z$. Consider the filling K of the obtained loop. The measure under which K is defined will be called µ.
Note that µ is clearly an infinite measure. It is not difficult to see, using Brownian scaling, that it is scale-invariant (K and λK are defined under the same measure). The relation with the previous restriction measure is relatively clear: If one “conditions appropriately” K to stay in D and to touch the boundary of D at A and B, one gets the restriction measure with α = 2.
For each simply connected set D, we define $\mu_D$ as the measure µ restricted to those sets K that are subsets of D. Clearly, when D' ⊂ D, $\mu_{D'} = \mu_D 1_{\{K \subset D'\}}$. Also, the scale-invariance of µ shows that $\mu_D$ is invariant under scaling: If K is defined under $\mu_D$, then λK is defined under $\mu_{\lambda D}$. In fact, much more is true: $\mu_D$ is conformally invariant; if Φ is a one-to-one map from D onto Φ(D) that preserves the angles, then the image measure of $\mu_D$ under Φ (which is therefore a measure supported on sets that stay in Φ(D)) is exactly $\mu_{\Phi(D)}$.
Hence, $\mu_D$ also satisfies a conformal restriction-type property: If Φ(D) ⊂ D, then there exist two different ways to construct $\mu_{\Phi(D)}$ using $\mu_D$: One by restriction, one by conformal invariance. These two ways coincide for all such Φ. In fact, it is possible to argue that the measures $\mu_D$ (and their multiples) are the unique ones with this property (via a conformal restriction-type argument), which shows that they are very natural in this context.
4.2. Soups. It is possible [26] to use this infinite measure $\mu_D$ to construct random (i.e., defined under a probability measure) collections of subsets K of D using Poissonization. Mathematically speaking, one constructs a Poisson point process with intensity $c\mu_D$ (where c is some positive constant). This is a random countable collection of subsets of D that we will now focus on.
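The Poissonization can be made concrete once very small loops are cut off. The following Python sketch samples an approximation of the collection of Brownian loops (whose fillings are the sets K) in the unit disc. The cutoff `t_min` on the time-length, the discretization of each loop by a random-walk bridge, the choice of the unit disc as D, and all names are assumptions made here for illustration.

```python
import math
import random


def poisson(mean, rng):
    """Sample a Poisson variable by counting unit-rate exponential arrivals
    before time `mean`."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def brownian_loop(root, time_length, steps, rng):
    """Discretized planar Brownian bridge of duration `time_length` that
    starts and ends at the complex number `root`."""
    sd = math.sqrt(time_length / steps)
    walk = [0j]
    for _ in range(steps):
        walk.append(walk[-1] + complex(rng.gauss(0, sd), rng.gauss(0, sd)))
    end = walk[-1]
    return [root + w - (k / steps) * end for k, w in enumerate(walk)]


def loop_soup_in_disc(c, t_min, steps=64, seed=0):
    """Loops of time-length >= t_min of a soup of intensity c in the unit disc."""
    rng = random.Random(seed)
    # Roots are sampled in the square [-1, 1]^2 (area 4); the time-length
    # density dT / (2 pi T^2) has total mass 1 / (2 pi t_min) on [t_min, inf).
    mean = c * 4.0 / (2.0 * math.pi * t_min)
    soup = []
    for _ in range(poisson(mean, rng)):
        root = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
        time_length = t_min / (1.0 - rng.random())  # density prop. to T^-2
        loop = brownian_loop(root, time_length, steps, rng)
        if all(abs(p) < 1.0 for p in loop):  # keep only loops staying in D
            soup.append(loop)
    return soup
```

Filtering the returned list once more, keeping only the loops that stay in a smaller disc, gives a sample of the same kind for the smaller domain, which mirrors the restriction property of $\mu_D$.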
Intuitively, one may think of this Poisson point process as follows: The measure $\mu_D$ measures how likely a given set of possible K’s is. We let it “rain” during time c. What falls are not raindrops but subsets of D. The probability that during an infinitesimal time-interval dt, a set K of the collection A of possible sets falls is proportional to $\mu_D(A)\,dt$. After time c, many sets will have fallen on D: These are the sets of the so-called soup in D with intensity c. Alternatively, one can also let it rain Brownian loops (those that create the sets K) instead of their simply connected fillings K, and this gives rise to the so-called Brownian loop-soup.
This random collection of subsets of D inherits the restriction-type properties of $\mu_D$: It is conformally invariant, and the soup in D' ⊂ D is obtained by keeping the sets in the D-soup that stay in D'.
Suppose that D is a bounded simply connected domain. Then, it is immediate from the definition that a sample of the soup will be a countably infinite family of subsets of D, but that there are (almost surely) only finitely many large ones. Also, the larger c is, the more sets there are in the
soup. Note that the sets in the soup typically overlap each other: Each set will in fact almost surely intersect countably many other sets in the soup.
When c = 2, it turns out [26] that the Brownian loop-soup is very closely related to the family of Brownian loops that are erased when one “loop-erases chronologically” a Brownian path (recall that the scaling limit of loop-erased random walks has been proved [22] to be SLE with κ = 2). More generally, the quantities that measure the restriction defect of SLE curves (i.e., the Radon-Nikodym derivative of the law of SLE$_\kappa$ in D conditioned to stay in D' with respect to the law of SLE$_\kappa$ in D') for κ ≠ 8/3 (that have been determined in [24] in terms of Schwarzian derivatives etc.) can be expressed simply and naturally in terms of a Brownian loop-soup with a certain intensity c = c(κ).
4.3. Soup-percolation. In the sequel, we will focus on some properties of the set $D \setminus \cup_n K_n$, where the $K_n$’s are the sets in a given soup. Before proceeding further in the analysis of some aspects of the soup, it is worth recalling some facts concerning the “fractal percolation” model introduced by Mandelbrot (see, e.g., [29]). Consider the unit square $C_0 = [0,1]^2$ and fix some parameter p ∈ (0, 1). Divide it into 9 squares of side-length 1/3. For each of them, decide independently with probability 1 − p to remove it, and with probability p to keep it. This constructs a random union $C_1$ of squares of side-length 1/3. For each of these sub-squares, we iterate the procedure: Divide each of them into 9 squares of side-length $1/3^2$, and remove some of them at random with probability 1 − p. This constructs a random union $C_2$ of squares of side-length 1/9. If we iterate the procedure, we get a random limiting set $C_\infty = \cap_n C_n$. Of course, if p ≤ 1/9, then $C_\infty$ is almost surely empty, and otherwise $C_\infty \neq \emptyset$ with positive probability.
In fact, there is another interesting phase transition (see [9, 10]): If p is sufficiently large, then $C_\infty$ contains connected sets and paths with positive probability (for instance a path in $C_\infty$ that joins two opposite sides of $C_0$). The phase transition in this case can be shown to be of the following type: For some fixed $p_c$, the limiting set $C_\infty$ contains a.s. no path if and only if $p < p_c$ (i.e., at $p = p_c$, there are paths).
The soup that we described above has many similarities with Mandelbrot’s fractal percolation: At each scale, one removes certain sets, and what one does is scale-invariant, because of the scale-invariance in law of the soup. Hence, it should not be surprising that there is a similar phase transition in the geometry of $D \setminus \cup_n K_n$. When c is small (not many sets have fallen in D), this set does contain paths. In other words, there are many disjoint clusters: The set $\cup_n K_n$ has a countable number of connected components. In particular, one can identify their outer boundaries, which form a family of disjoint (a priori) fractal loops. On the other hand, when c is large, then almost surely $\cup_n K_n$ has just one connected component (for very large c it is easy to see that $\cup_n K_n = D$). These two phases will be separated by a critical value $c_0$, above which one can no longer detect boundaries of clusters in the soup with intensity c.
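Mandelbrot’s construction is easy to simulate at finite depth. The Python sketch below builds the sets $C_0, C_1, \ldots, C_n$ and tests for a left-right crossing of $C_n$. The encoding of squares by integer indices, the use of side-adjacency as a finite-level proxy for the existence of paths in $C_\infty$, and the function names are assumptions made here for illustration.

```python
import random


def fractal_percolation(p, levels, seed=0):
    """Return [C_0, ..., C_levels]; C_n is the set of retained squares of
    side-length 3**-n, each encoded by its integer indices (i, j)."""
    rng = random.Random(seed)
    current = {(0, 0)}
    history = [current]
    for _ in range(levels):
        refined = set()
        for i, j in current:
            # Split the square into 9 sub-squares, keeping each one
            # independently with probability p.
            for di in range(3):
                for dj in range(3):
                    if rng.random() < p:
                        refined.add((3 * i + di, 3 * j + dj))
        history.append(refined)
        current = refined
    return history


def has_left_right_crossing(cells, size):
    """True if retained squares sharing a side connect column 0 to
    column size - 1 of the size x size grid."""
    stack = [cell for cell in cells if cell[0] == 0]
    seen = set(stack)
    while stack:
        i, j = stack.pop()
        if i == size - 1:
            return True
        for nb in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if nb in cells and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return False
```

Running `has_left_right_crossing(fractal_percolation(p, n)[n], 3 ** n)` for several seeds and several values of p gives a rough picture of the transition between the phase with crossings and the phase without.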
But in the soup picture, one is able to say much more than for Mandelbrot percolation: In fact, it is possible to give a hand-waving argument [40] to justify (and hopefully it is also possible to prove rigorously [36]) that the outer boundaries of clusters (in the sub-critical case where there are several such clusters) are SLE-type curves, because they satisfy some Markovian-type properties. Furthermore, it is possible to understand how the law of the cluster boundaries changes if one perturbs the domain that they are defined in: Loosely speaking, if one compares the law in D to the law in D' ⊂ D using the same soup, we see that the boundary curves γ and γ' are identical unless one loop that stays in D but exits D' intersects γ'. If one compares this with the corresponding properties that have been worked out for SLE curves (see [24]), one gets a dictionary between c and κ, i.e., the value of κ that corresponds to boundaries of soup clusters with intensity c. The relation is
$$c = \frac{(3\kappa - 8)(6 - \kappa)}{2\kappa}$$
for κ ∈ (8/3, 4]. In fact, in the CFT language, the intensity c of the soup is simply the central charge of the model, because it will correspond to the central charge of the corresponding degenerate highest-weight representation of the Virasoro algebra. Recall that the maximal value of κ for which the SLE is a simple self-avoiding curve is κ = 4. This will correspond to the critical intensity $c_0 = 1$, which is the maximal positive central charge for which such degenerate highest-weight representations exist.
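The dictionary between c and κ can be evaluated in both directions. The short Python sketch below computes c from κ and inverts the relation on the stated range by solving the quadratic $3\kappa^2 + (2c - 26)\kappa + 48 = 0$ obtained by clearing the denominator. The function names are hypothetical, and restricting the inverse to 0 < c ≤ 1 is a choice made here to match the range κ ∈ (8/3, 4].

```python
import math


def central_charge(kappa):
    """Soup intensity c attached to kappa by c = (3k - 8)(6 - k) / (2k)."""
    return (3.0 * kappa - 8.0) * (6.0 - kappa) / (2.0 * kappa)


def kappa_from_intensity(c):
    """Root in (8/3, 4] of 3 k^2 + (2c - 26) k + 48 = 0, for 0 < c <= 1."""
    if not 0.0 < c <= 1.0:
        raise ValueError("the dictionary is stated for 0 < c <= 1")
    # The discriminant simplifies to 4 (1 - c)(25 - c); the smaller root is
    # the one lying in (8/3, 4].
    return ((13.0 - c) - math.sqrt((1.0 - c) * (25.0 - c))) / 3.0
```

For example, κ = 4 gives c = 1 and κ = 3 gives c = 1/2, while c tends to 0 as κ decreases to 8/3.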
4.4. Loops via the Gaussian Free Field. An important recent and ongoing development is the study of the intimate relation between the “geometry” of the Gaussian Free Field and SLE, by Schramm-Sheffield [33] and Sheffield [35]. The Gaussian Free Field in a domain D, see, e.g., [34], is a Gaussian generalized surface (i.e., it is a random distribution A) with correlations between its values at two points given by the Green function in D. In other words, for any smooth functions f and g, the covariance between A(f) and A(g) is
$$\int_{D \times D} G(x,y)\, f(x)\, g(y)\, dx\, dy.$$
A is not defined at points (i.e., $A(\delta_x)$ is not well-defined), but, for instance, the mean value of A on a curve makes sense.
The Gaussian Free Field inherits conformal invariance properties from the properties of Green’s functions. In particular, it follows that some geometric curves on the field (say level-lines), if one is able to define them, will be conformally invariant. It turns out that it is indeed possible to prove the existence of such lines and to prove that they are SLE-type curves. These “level-lines” in fact form a system of loops in the GFF. This is an example of the so-called Gaussian loop-ensembles. Again, there is hope to prove [36], via a universality-type argument, that these loop-ensembles are the same as those defined as boundaries of loop-soup clusters.
5. Relation to conformal field theory
One of the motivations to define these families of SLE loops in a domain comes from the fact that, in the generic case, one SLE curve is not enough to construct a conformal field. One way to explain this is the following: Conformal fields are supposed to describe the properly rescaled limit of correlation functions of critical systems such as the Ising model or the Potts models for q ≤ 4 (see, e.g., [4]). An SLE curve from A to B in a domain D is related to an interface between two domains in D, one attached to each of the parts of ∂D \ {A, B}, which correspond to different boundary conditions (wired/free, +/−). Once the SLE curve is defined, one is left with new domains (the connected components of D \ γ) that have “monochromatic” boundary conditions, and one needs to understand what goes on in them in order to control the correlation functions; so, the law of one SLE curve is not enough (except for κ = 6, where locality makes things easier, see, e.g., [6]).
In the physics literature, the quest for a probabilistic construction of the conformal fields has not been so intensive. One reason is maybe its difficulty. Another possibly comes from the fact that CFT (and more generally the “Euclidean theories”) has been developed using analogies with quantum field theory, for which a probabilistic interpretation (i.e., states are measures, operators are “conditionings” etc.) does not seem to work.
In the κ = 8/3 case, for which SLE satisfies conformal restriction, it is however possible [14] to directly construct the highest-weight representation that is related to the CFT predictions. It suffices to consider functionals of the remaining domain that are expressed as probabilities to visit boundary points. This makes it possible to give a simple interpretation of the CFT “Ward identities for the stress-energy tensor” in terms of conformal restriction. In the other cases, the loop-soup gives one way to construct CFT.
Another way to relate SLE to CFT is to couple to an SLE the conformal fields that are supposed to describe what happens in the unexplored domains. The observables will then evolve like martingales when one explores the domain with an SLE curve. This leads to various interplays between SLE and CFT considerations. This (among other things) has been explored by Bauer and Bernard in a series of papers ([1, 2] and references therein). A closely related issue is to define SLE curves on Riemann surfaces. Indeed, many of the CFT considerations are based on the fact that one not only perturbs “conformally” the domain D, but also distorts the metric inside D and looks at the (infinitesimal) response of the system to this. This motivates the papers [13] (which takes a more physical approach) and [43, 28]. This list of related papers is far from exhaustive; we have only selected those directly relevant to this exposition. Very close to the restriction property is Dubédat’s [11] approach to the conjectured “SLE duality” (i.e., the relation between the outer boundary of an $\mathrm{SLE}_\kappa$ for κ > 4 and the simple $\mathrm{SLE}_{16/\kappa}$
curve conjectured by Duplantier). Also, for a full list of references on the related Coulomb gas and/or quantum gravity physics literature, see, e.g., [12].
References
[1] M. Bauer, D. Bernard (2003), Conformal Field Theories of Stochastic Loewner Evolutions, hep-th/0210015, Comm. Math. Phys. 239, 493–521.
[2] M. Bauer, D. Bernard (2003), Conformal transformations and the SLE partition function martingale, math-ph/0305061, Ann. Henri Poincaré 5, 289–326.
[3] V. Beffara (2002), The dimension of the SLE curves, math.PR/0211322, preprint.
[4] A.A. Belavin, A.M. Polyakov, A.B. Zamolodchikov (1984), Infinite conformal symmetry in two-dimensional quantum field theory, Nuclear Phys. B 241, 333–380.
[5] D. Beliaev, S. Smirnov, Harmonic Measure on Fractal Sets, this volume.
[6] F. Camia, C. Newman (2003), Continuum nonsimple loops and 2D percolation, math.PR/0308122, preprint.
[7] J.L. Cardy (1992), Critical percolation in finite geometries, J. Phys. A 25, L201–L206.
[8] J.L. Cardy (2002), Conformal Invariance in Percolation, Self-Avoiding Walks and Related Problems, cond-mat/0209638, Plenary talk given at TH-2002 Paris.
[9] J.T. Chayes, L. Chayes, R. Durrett (1988), Connectivity properties of Mandelbrot’s percolation process, Probab. Theory Related Fields 77, 307–324.
[10] F.M. Dekking, R.W.J. Meester (1990), On the structure of Mandelbrot’s percolation and other random Cantor sets, J. Stat. Phys. 58, 335–341.
[11] J. Dubédat (2004), SLE(κ, ρ) martingales and duality, math.PR/0303128, Ann. Probab., to appear.
[12] B. Duplantier (2003), Conformal Fractal Geometry and Boundary Quantum Gravity, math-ph/0303034, in Fractal geometry and application, A jubilee of Benoit Mandelbrot, AMS Proc. Symp. Pure Math., 2004, 365–482.
[13] R. Friedrich, J. Kalkkinen (2004), On Conformal Field Theory and Stochastic Loewner Evolution, hep-th/0308020, Nucl. Phys. B687 (2004), 279–302.
[14] R. Friedrich, W. Werner (2003), Conformal restriction, highest-weight representations and SLE, math-ph/0301018, Comm. Math. Phys. 243, 105–122.
[15] R. Langlands, Y. Pouillot, Y. Saint-Aubin (1994), Conformal invariance in two-dimensional percolation, Bull. A.M.S. 30, 1–61.
[16] G.F. Lawler (2004), Conformally invariant processes in the plane, ICTP Lecture Notes 17, 305–351.
[17] G.F. Lawler (2004), Conformal invariance, universality and the dimension of the Brownian frontier, Proceedings of the ICM 2002 Beijing, Vol. III, 63–72.
[18] G.F. Lawler (2003), Restriction property for conformally covariant measures, Proceedings of ICMP Lisboa 2003.
[19] G.F. Lawler (2004), Conformally invariant processes in the plane, AMS, 2005.
[20] G.F. Lawler, O. Schramm, W. Werner (2001), Values of Brownian intersection exponents I: Half-plane exponents, math.PR/9911084, Acta Mathematica 187, 237–273.
[21] G.F. Lawler, O. Schramm, W. Werner (2001), Values of Brownian intersection exponents II: Plane exponents, math.PR/0003156, Acta Mathematica 187, 275–308.
[22] G.F. Lawler, O. Schramm, W. Werner (2004), Conformal invariance of planar loop-erased random walks and uniform spanning trees, math.PR/0112234, Ann. Prob. 32, 939–995.
[23] G.F. Lawler, O. Schramm, W. Werner (2002), On the scaling limit of planar self-avoiding walks, math.PR/0204277, in Fractal geometry and application, A jubilee of Benoit Mandelbrot, AMS Proc. Symp. Pure Math., 2004, 329–364.
[24] G.F. Lawler, O. Schramm, W. Werner (2003), Conformal restriction properties. The chordal case, math.PR/0209343, J. Amer. Math. Soc. 16, 917–955.
[25] G.F. Lawler, W. Werner (2000), Universality for conformally invariant intersection exponents, J. Europ. Math. Soc. 2, 291–328.
[26] G.F. Lawler, W. Werner (2004), The Brownian loop-soup, math.PR/0304419, Probab. Th. Rel. Fields 128, 565–588.
[27] P. Lévy, Processus Stochastiques et Mouvement Brownien, Gauthier-Villars, Paris, 1948.
[28] N. Makarov, D. Zhan (2004), SLE-type processes on Riemann surfaces, in preparation.
[29] B.B. Mandelbrot, The Fractal Geometry of Nature, Freeman, 1982.
[30] S. Rohde, O. Schramm (2001), Basic properties of SLE, math.PR/0106036, Ann. Math. 105 (2005), 879–920.
[31] O. Schramm (2000), Scaling limits of loop-erased random walks and uniform spanning trees, Israel J. Math. 118, 221–288.
[32] O. Schramm, S. Sheffield (2003), The harmonic explorer and its convergence to SLE(4), math.PR/0310210, preprint.
[33] O. Schramm, S. Sheffield (2004), in preparation.
[34] S. Sheffield (2003), Gaussian Free Fields for mathematicians, math.PR/0312099, preprint.
[35] S. Sheffield (2004), in preparation.
[36] S. Sheffield, W. Werner (2004), in preparation.
[37] S. Smirnov (2001), Critical percolation in the plane: conformal invariance, Cardy’s formula, scaling limits, C. R. Acad. Sci. Paris Sér. I Math. 333, 239–24
[38] W. Werner (2001), Critical exponents, conformal invariance and planar Brownian motion, Proc. ECM 2000 Barcelona, Birkhäuser, 87–103.
[39] W. Werner (2004), Random planar curves and Schramm-Loewner evolutions, math.PR/0303354, Lecture notes from the 2002 Saint-Flour summer school, Springer, L.N. Math. 1840, 107–195.
[40] W. Werner (2003), SLEs as boundaries of clusters of Brownian loops, math.PR/0308164, C. R. Ac. Sci. Paris Sér. I Math. 337, 481–486.
[41] W. Werner (2003), Conformal restriction and related questions, math.PR/0307353, The Probability Surveys, vol. 2, to appear (2005).
[42] W. Werner (2004), Some mathematical aspects of the scaling limit of critical two-dimensional systems, Proceedings Statphys XXI Bangalore, Pramana J. Phys., to appear.
[43] Dapeng Zhan (2004), Stochastic Loewner Evolution in doubly connected domains, math.PR/0310350, Probab. Theor. Rel. Fields, 340–380.
Wendelin Werner, Université Paris-Sud and IUF
4ECM Stockholm 2004 © 2005 European Mathematical Society
On the Integral Points on Certain Algebraic Varieties
Umberto Zannier
Abstract. We shall briefly present some recent results concerning S-integral points on certain varieties of dimension > 1. Some of them concern non-singular surfaces, others arise from diophantine equations with linear recurrences. These results, obtained jointly with P. Corvaja, use the Schmidt Subspace Theorem applied to suitable auxiliary linear forms, constructed as values of certain regular functions on the variety. For the methods to apply, it is often necessary that the divisor at infinity is highly reducible; we give examples showing that on dropping this restriction very difficult problems arise even in simple contexts. We then show some instances when the reducibility condition may be achieved by taking an unramified cover, as in recent work by Faltings. Generally speaking, we shall not assume prior knowledge of the subject and we shall often illustrate the theorems by quite concrete examples.
The present article surveys some old and more recent results on integral points on affine algebraic varieties. We shall not consider however the important and difficult known results about rational points, nor shall we pause on the deep results on integral points on (semi)abelian varieties; rather, we shall especially focus on some recent joint work with P. Corvaja. Generally speaking, we shall not assume prior knowledge of the subject and we shall often illustrate the theorems by quite concrete examples. We start by recalling some simple definitions.
1. Some definitions and known facts; reducibility at infinity
Let X be an affine algebraic variety, defined over the field $\overline{\mathbb{Q}}$ of algebraic numbers, embedded in $\mathbb{A}^n$, and for our purposes supposed throughout to be irreducible. If X is defined by polynomial equations $f_i(x_1, \ldots, x_n) = 0$, i ∈ I, our basic problem is to describe the solutions to this system in integers $x_i \in \mathbb{Z}$. Actually, it will be convenient to deal, more generally, with algebraic integer solutions in an arbitrary given number field k, and also to allow denominators made up only of primes from a prescribed finite set S; in other words, given a finite set S of places of k, containing the archimedean ones, we define the ring $\mathcal{O}_S$ of S-integers of k by $\mathcal{O}_S = \mathcal{O}_{k,S} = \{x \in k : |x|_v \le 1, \ \forall v \notin S\}$ and we consider $X(\mathcal{O}_S) = \{x \in \mathcal{O}_S^n : f_i(x) = 0, \ i \in I\}$, the set of S-integral points of X.
Supported by the ECM organization and by the European Network Group of Arithmetic Algebraic Geometry: MRTN-CT2003-504917.
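To make the definition of $\mathcal{O}_S$ concrete in the simplest case k = Q, the following minimal sketch tests whether a rational number is an S-integer, i.e., whether its reduced denominator is built only from the primes in S. The names `is_s_integer` and `is_s_unit`, and the encoding of S as a set of finite primes (the archimedean place being implicitly included), are illustrative assumptions and not notation from the text.

```python
from fractions import Fraction

def is_s_integer(x: Fraction, primes: set) -> bool:
    """True if x lies in O_S for k = Q, where S = {infinity} together with `primes`.

    Equivalent to |x|_p <= 1 for every prime p outside S: the reduced
    denominator of x may only contain primes from S.
    """
    d = x.denominator  # Fraction stores x in lowest terms with denominator > 0
    for p in primes:
        while d % p == 0:
            d //= p
    return d == 1

def is_s_unit(x: Fraction, primes: set) -> bool:
    """True if x is invertible in O_S (an S-unit)."""
    return x != 0 and is_s_integer(x, primes) and is_s_integer(1 / x, primes)

# With S = {infinity, 2}: 3/4 is an S-integer, 1/3 is not.
print(is_s_integer(Fraction(3, 4), {2}), is_s_integer(Fraction(1, 3), {2}))
```

With S reduced to the archimedean place (an empty set of primes in this encoding) the test singles out the ordinary integers.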
All of the results that we shall meet depend only on geometric properties of X and are invariant under such extensions of k or S. Here, for simplicity, we are assuming that X is already embedded in some affine space; however a more intrinsic definition of the sets of S-integral points is possible, only in terms of the values of the regular functions on X; for this, see, e.g., [Se], [V] or [Z]. We now view $\mathbb{A}^n$ as the standard open subset of $\mathbb{P}^n$ given by $X_0 \neq 0$, where $X_0, \ldots, X_n$ are homogeneous coordinates. We let $\tilde{X}$ be the closure of X in $\mathbb{P}^n$; it is defined by homogenizing all the equations which hold on X. The set $\tilde{X} \setminus X$ is a finite union of distinct irreducible varieties $D_1, \ldots, D_r$. It turns out that many results on integral points strongly involve the structure of the $D_i$’s and in particular their number r, which usually, at any rate implicitly, has to be “large” for the relevant assumptions to be satisfied.¹ With this in mind, let us review a few rather classical instances.
(i) Thue’s Theorem (1909) on the integer solutions to equations f(x, y) = c, where $f \in \mathbb{Z}[x, y]$ is homogeneous of degree d, without multiple factors, and where c is a given nonzero integer. Now $X \subset \mathbb{A}^2$ is the plane curve f = c and $\tilde{X} \subset \mathbb{P}^2$ is defined by $f(X, Y) = cZ^d$ and is obtained by adding to X the points on the line Z = 0 at infinity, gotten from f(X, Y) = 0, which has d distinct zeros in $\mathbb{P}^1$. Hence r = d now. Thue’s result asserts that if r = d ≥ 3 we have finiteness of $X(\mathbb{Z})$. (Note that for d = 2 we have, e.g., the Pell equations $x^2 - ay^2 = 1$, which admit infinitely many solutions in $\mathbb{Z}^2$ when a is a positive integer, not a square.)
(ii) Siegel’s Theorem (1929) on integral points on an affine irreducible curve X; this states that if either X has positive genus or if r ≥ 3 then $X(\mathcal{O}_S)$ is finite² (which includes Thue’s result).
(iii) W.M. Schmidt’s theorems (1972) on “Norm-form equations”; these equations, which generalize Thue’s ones in dimension > 1, have the shape $N^k_{\mathbb{Q}}(L(x)) = c$, where L is a linear form in $x = (x_1, \ldots, x_n)$ with coefficients in the number field k and c is a nonzero integer. The divisors at infinity correspond to the hyperplanes defined by $L^\sigma(x) = 0$, for the conjugates $L^\sigma$ of L. Schmidt gave a full description of the cases with infinitely many integer solutions; we do not give any complete statement here, and refer to [S1], [S2] or [Z] for this. However we remark that the condition r ≥ n + 1 is always implicit in these criteria, in order to ensure finiteness.
¹ Note however that r may depend on the embedding, as happens for instance with $\mathbb{A}^2$ embedded in $\mathbb{P}^2$ or in $\mathbb{P}_1^2$. We wonder whether there exist situations when the verification of the assumptions of some of the theorems stated here depends on the embedding of X.
² Siegel’s original version was over $\mathbb{Z}$; the sharpening to $\mathcal{O}_S$ is due to Mahler, at any rate for genus 1. Naturally Siegel’s Theorem follows, at any rate for genus ≥ 2, from Faltings’s difficult solution of the Mordell conjecture that a curve of genus ≥ 2 has only finitely many rational points.
(iv) Laurent’s Theorem (1984) on the integral points on a subvariety X of $\mathbb{G}_m^n$ (see [Z]). We recall that the multiplicative algebraic group $\mathbb{G}_m$ is just the affine variety $\mathbb{A}^1 \setminus \{0\}$ endowed with the multiplicative group law. It may be embedded in $\mathbb{A}^2$ as the hyperbola xy = 1 (two points at infinity), and then we see that the S-integral points on $\mathbb{G}_m$ correspond to the S-units $\mathcal{O}_S^*$, the multiplicative group of invertible elements of $\mathcal{O}_S$. Hence, in this embedding $X(\mathcal{O}_S)$ consists of the points of X with coordinates in $\mathcal{O}_S^*$; recalling that $\mathcal{O}_S^*$ is finitely generated (which is part of a celebrated result by Dirichlet), we see that we are concerned now with exponential diophantine equations (like, e.g., $3^a + 5^b = 7^c + 11^d$). Laurent’s result (answering a question of Lang, who did the case of curves already in 1966) implies that: The Zariski closure of $X(\mathcal{O}_S)$ is a finite union of translates of algebraic subgroups of $\mathbb{G}_m^n$, and thus in particular is a finite set if X does not contain any such positive-dimensional translate. (Seeking such translates in $\mathbb{G}_m^4 \cap \{x + y = z + w\}$ one recovers the finiteness of integral solutions for the above example.) Now usually several divisors at infinity arise from the pairs of points at infinity of each factor $\mathbb{G}_m$.
(v) A theorem of Vojta (see [V] or [Z]) on integral points on an affine open subset X of a nonsingular projective variety $\tilde{X}$ with $\mathrm{Pic}^0(\tilde{X}) = 0$. Let ρ be the rank of the Néron-Severi group of $\tilde{X}$; Vojta’s Theorem (which may be derived in a fairly simple way from Laurent’s) states that: If r ≥ dim X + 1 + ρ then $X(\mathcal{O}_S)$ is not Zariski-dense in X.³ Once more, we see the relevance of having a large number r of components at infinity.
(vi) Finally we recall the very deep result by Faltings (1991; see [F, Cor. 6.2 to Thm. 2]) that an affine subset of an abelian variety has only finitely many S-integral points. (Vojta (1999) later obtained the difficult extension of this to the case of semi-abelian varieties.)
This is one of the few known results which hold even with r = 1, i.e., with a single irreducible divisor at infinity. At the end we shall see other instances of this, obtained by working with a finite cover of X.
³ Vojta has later removed the condition on $\mathrm{Pic}^0$ by using his much deeper result on semi-abelian varieties, as in (vi).
2. Diophantine approximation
The proofs of these theorems use, in various fashions, Diophantine Approximation, a subject which originated as the theory of lower (and upper) bounds for the rational approximations to algebraic numbers. Thus, a solution in integers x = p, y = q of the Thue equation $x^3 - 2y^3 = 1$ produces a very good rational approximation p/q for $\sqrt[3]{2}$, in fact such that $|\sqrt[3]{2} - p/q| \ll q^{-3}$; Thue was able to show that actually for any positive ε one has $|\sqrt[3]{2} - p/q| > q^{-5/2-\varepsilon}$ for sufficiently large q, thus deducing the finiteness of solutions. Thue had analogous lower bounds for arbitrary algebraic numbers θ in place of $\sqrt[3]{2}$, with “exponents” for q depending on the degree of θ. After some important improvements due to Siegel, Gelfond and Dyson, finally Roth (1955) proved a “best-possible” lower bound $|\theta - p/q| > q^{-2-\varepsilon}$ for any fixed algebraic θ, fixed positive ε and q large enough with respect to θ and ε. In 1970 Schmidt extended in a substantial way the known techniques and proved a multi-dimensional extension of Roth’s result, which again was in a way optimal. Rather than bounding below merely the rational approximations to a given algebraic number, he bounded below the distance of rational points in $\mathbb{P}^n$ to a given hyperplane in that space, defined over the field of algebraic numbers. Schmidt’s Subspace Theorem is actually more general, dealing with the “average” distance from several given hyperplanes; in the affine version it states that given linearly independent linear forms $L_1, \ldots, L_n$ in n variables, over $\overline{\mathbb{Q}}$, for any ε > 0 the inequality $|L_1(x) \cdots L_n(x)| > (\max |x_i|)^{-\varepsilon}$ holds for all integer points $x = (x_1, \ldots, x_n) \in \mathbb{Z}^n$ outside the union of certain finitely many proper linear subspaces of $\mathbb{Q}^n$ (depending on the $L_i$ and ε). Roth’s result immediately follows by taking $L_1 = x_1 - \theta x_2$, $L_2 = x_2$. Schmidt applied this, for instance, to the norm-form equations (see (iii) above). Later Schlickewei extended the theorem to cover simultaneously several absolute values of k; these lower bounds now involved the (Weil) “height” of the points, which is a kind of arithmetical complexity, and which for a point in $\mathbb{Z}^n$ with coprime coordinates is just the maximum absolute value of them. This important evolution, analogous to what had been done for Roth’s theorem by Mahler, Ridout and Lang, led to several applications to diophantine equations and eventually to the above mentioned result by Laurent.
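As a numerical aside on why the exponent 2 in Roth’s bound cannot be lowered, one may look at the Pell equation $x^2 - 2y^2 = 1$ from (i): each solution (p, q) gives an approximation to $\sqrt{2}$ with $q^2 |\sqrt{2} - p/q|$ bounded (it tends to $1/(2\sqrt{2}) \approx 0.3536$). The sketch below generates solutions by multiplying by the fundamental unit $3 + 2\sqrt{2}$; the function names are hypothetical and chosen only for this illustration.

```python
import math

def pell_solutions(count):
    """First `count` positive integer solutions (x, y) of x^2 - 2*y^2 = 1."""
    x, y = 3, 2  # fundamental solution
    out = []
    for _ in range(count):
        out.append((x, y))
        # (x + y*sqrt(2)) * (3 + 2*sqrt(2)) = (3x + 4y) + (2x + 3y)*sqrt(2)
        x, y = 3 * x + 4 * y, 2 * x + 3 * y
    return out

def scaled_error(p, q):
    """q^2 * |p/q - sqrt(2)| for a Pell solution (p, q), computed stably.

    From p^2 - 2*q^2 = 1 one gets p - q*sqrt(2) = 1 / (p + q*sqrt(2)),
    which avoids subtracting two nearly equal floating-point numbers.
    """
    return q / (p + q * math.sqrt(2))

for p, q in pell_solutions(6):
    print(p, q, scaled_error(p, q))
```

Since the scaled error stays bounded along an infinite sequence, no lower bound of the form $q^{-\mu}$ with μ < 2 can hold for $\theta = \sqrt{2}$ and all large q.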
We should mention that a new, more general, geometric formulation of the theorem has been given by Faltings and Wüstholz (1994), in which the approximant points are restricted to an algebraic subvariety of the ambient space; also the proof of this version is new compared to Schmidt’s original one (see [FW] and also [EF] for a quantitative version and a different approach).
3. Choosing suitable embeddings
Most direct applications to integral points of these theorems of Diophantine Approximation implicitly require that the relevant variety $\tilde{X}$ is embedded in $\mathbb{P}^n$ in a somewhat special way. For instance, in the case of Thue’s equations, the tangents at the points at infinity have high order contact with the curve, that is, higher than expected for a “generic” curve. This implies that the distance of the integral points from a suitable tangent at infinity decreases so rapidly⁴ as to contradict Thue’s or Roth’s Theorems (note that actually each tangent at infinity is defined by an equation with algebraic coefficients). For an arbitrary curve one cannot ensure this behavior of the relevant distance functions. To cope with this difficulty and prove his theorem on integral points in full generality, Siegel embedded the curve into its Jacobian (the case of positive genus was the difficult one) and lifted the points to a cover of the curve of suitably high degree.⁵ Then he applied the lower bounds of Diophantine Approximation to the points of the cover. Going to the cover does not change the local distances much but it changes the heights considerably; this behavior of the height functions evaluated at the algebraic points on abelian varieties allowed a sufficiently good improvement of the lower bound provided by a direct use of theorems of Roth’s type, and Siegel could conclude. (See, e.g., [Se] for a modern presentation of this proof.)
⁴ That is, rapidly compared to the coordinates, or the “height”, of the points.
Another approach to a proof of Siegel’s Theorem has been recently proposed in joint work [CZ1] of P. Corvaja and the writer; in this argument, rather than invoking the Jacobian, the curve X in question is embedded into a projective space of sufficiently high dimension, in such a way that it has very high order contact with suitable hyperplanes, at the points at infinity; this provides a convenient metric to work with. In other words the integral points on the curve rapidly approach the alluded hyperplanes, allowing a direct application of the Subspace Theorem. (Roth’s Theorem, sufficient for Siegel’s method, is not enough here.) To be a little more detailed, suppose first that the number r of points $Q_1, \ldots, Q_r$ at infinity of X is ≥ 3; by going to a larger number field we may assume that they are defined over k. We pick a large integer N and consider the vector space $V_N$ of functions in k(X) whose pole divisor is $\le N(Q_1 + \cdots + Q_r)$.⁶ Let $f_1, \ldots, f_m$ be a basis of this space; from Riemann-Roch we have $m = \dim V_N \ge rN + O(1)$. Note that the $f_i$ are regular on X and so, multiplying them by suitable nonzero constants if necessary, we may assume that they take S-integral values at the S-integral points. Let now $\{P_n\}$ be an infinite sequence of distinct S-integral points in $X(\mathcal{O}_S)$ and select an absolute value v ∈ S.
Going to an infinite subsequence we may assume by compactness that (for all v ∈ S) $P_n$ converges v-adically to some point $P^v \in \tilde{X}(k_v)$ (where $k_v$ is the v-adic completion of k). If $P^v \in X$ is not at infinity then the $P_n$ are v-adically bounded, which is a harmless case. So, let us assume that $P^v = Q_i$, i = i(v), namely that $P_n \to Q_i$ in the v-adic topology. By easy linear algebra we may construct functions $g_1, \ldots, g_m$ forming another basis for $V_N$ and such that $\mathrm{ord}_{P^v} g_j \ge j - 1 - N$. Each $g_j$ may be expressed as a linear form $L_j$ in the $f_i$’s. Now, since m ≥ rN + O(1) ≥ 3N + O(1) is “large”, the function $g_j$ eventually has a zero of large order at $P^v$ and, since $P_n$ converges to $P^v$ with respect to v, we see that many of the linear forms $L_j$ evaluated at the integral points $(f_1(P_n), \ldots, f_m(P_n))$ will be “very small” as n grows; in particular the product of the $L_j$ will be very small at these points; note in fact that the product $g_1 \cdots g_m$ of the $g_j$’s vanishes at $P^v$ with order at least
$$\sum_{j=1}^{m} (j - 1 - N) \ge \frac{m^2}{2} - Nm + O(N) \ge (r^2 - 2r)N^2/2 + O(N) \ge 3N^2/2 + O(N).$$
⁵ This lifting essentially preserves rationality, e.g., by the Mordell-Weil Theorem.
⁶ We tacitly assume throughout that the curve X is nonsingular, which is a harmless condition here.
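For the reader who wishes to check the last chain of inequalities, here is a short worked version, written under the simplifying assumption that m = rN + O(1) exactly (which is what Riemann-Roch gives for large N on a nonsingular curve); the constants hidden in the O-terms depend on r and the genus but not on N.

```latex
\begin{align*}
\sum_{j=1}^{m} (j - 1 - N)
  &= \frac{m(m-1)}{2} - Nm
   = \frac{m^2}{2} - Nm - \frac{m}{2}, \\
\frac{m^2}{2} &= \frac{r^2 N^2}{2} + O(N), \qquad
Nm = rN^2 + O(N), \qquad
\frac{m}{2} = O(N), \\
\sum_{j=1}^{m} (j - 1 - N)
  &= \frac{(r^2 - 2r)\,N^2}{2} + O(N)
   \;\ge\; \frac{3N^2}{2} + O(N)
   \quad\text{since } r^2 - 2r = r(r-2) \ge 3 \text{ for } r \ge 3 .
\end{align*}
```

For r = 1 or r = 2 the leading coefficient $r^2 - 2r$ is $-1$ or $0$, so no positive order of vanishing of size $N^2$ is obtained; this is one way to see where the hypothesis r ≥ 3 enters.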
Quantifying all of this and combining the information for the various absolute values in S, we find ourselves in a position to apply Schlickewei’s formulation of the Subspace Theorem mentioned above, obtaining a contradiction. This argument well illustrates the relevance of the condition “r ≥ 3”. The remaining cases of Siegel’s Theorem, namely when the number of points at infinity is 1 or 2 and the genus is positive, may be reduced to the special case just treated by a nowadays standard process. One first takes an unramified cover $\pi : \tilde{X}^* \to \tilde{X}$, of finite degree ≥ 3, defined over $\overline{\mathbb{Q}}$; this is possible, essentially because X has positive genus and so $\tilde{X}(\mathbb{C})$ is not simply connected. Then one invokes the Chevalley-Weil Theorem to show that the integral points on X lift to integral points on $X^* := \pi^{-1}(X)$ and applies the special case to $X^*$; we shall return to this technique with a little more detail at the end. We remark that the principle of improving the diophantine approximations by changing the embedding occurs, in a somewhat different way, also in [FW] and especially in work by Evertse and Ferretti; for instance in the paper [EF] they use suitable embeddings to derive a good quantitative version of the Subspace Theorem for points restricted to algebraic varieties; as a byproduct they surprisingly recover the full version by Faltings and Wüstholz from the original versions by Schmidt and Schlickewei.
4. Some recent results
The above sketched approach to Siegel’s Theorem avoids the somewhat delicate arithmetic theory of heights on Jacobians (even the existence of the Jacobian as a projective variety need not be invoked). This is just a methodological point, but the procedure sometimes has also other advantages, such as:
(a) Quantitative formulations of Siegel’s Theorem which seem to escape from the other known methods.
By “quantitative” we mean “explicit bounds for the number of integral points on an algebraic curve” (general effective bounds for the heights of integral points are not presently known). For example, for a fixed curve X with at least three points at infinity we prove that, for variable number field k and set S, $\#X(\mathcal{O}_S)$ is bounded only in terms of #S and $[k : \mathbb{Q}]$ (see [CZ1, Remark] and [CZ2]). See especially [CZ2] for precise statements and several corollaries of these uniform conclusions. For instance, here is a special case of Corollary 2 therein: Let f ∈ k[X] be a cubic polynomial with at least two distinct roots; the number of S-integral solutions of $f(x) = ay^3$ is bounded uniformly in $a \in \mathcal{O}_S$. We remark that crucial ingredients for these applications are the deep uniform quantitative versions of the Subspace Theorem obtained by Evertse, Schlickewei and Schmidt. (We refer to [CZ2] for bibliographical details on this.)
(b) Applications to certain varieties of dimension > 1. Here the embedding is obtained similarly to the above sketched case of curves; one uses either a suitable version of the Riemann-Roch Theorem (as in [CZ3]) or a more explicit
construction of functions small at some divisor at infinity (as in [CZ4]). Naturally, some complications of detail appear with respect to the case of curves. An important difficulty occurs when the integral points converge (with respect to the same absolute value) simultaneously to several divisors; now the relevant “linear forms” have to vanish to high order at all the involved divisors, which in general is a geometric constraint not so easy to control. To state explicitly some results, let us begin with the paper [CZ3] on surfaces, where the following theorem appears (in a slightly more general form), for an affine open subset X of a nonsingular projective surface $\tilde{X}$; we preserve the above notation for $D_1, \ldots, D_r$:
Theorem A. Assume that r ≥ 2 and that: (i) No three of the $D_i$ share a common point and $D := D_1 + \cdots + D_r$ is ample. (ii) Defining $\xi_i$, for i = 1, . . . , r, as the minimal positive solution of the equation $D_i^2 \xi^2 - 2(D.D_i)\xi + D^2 = 0$ ($\xi_i$ exists), we have the inequality $2D^2 \xi_i > (D.D_i)\xi_i^2 + 3D^2$. Then $X(\mathcal{O}_S)$ is not Zariski-dense in X.
Here we have denoted by (·.·) the intersection product of divisors on the surface $\tilde{X}$. The first part of condition (i) is “generically” verified. Also, the second part on ampleness follows from the Nakai-Moishezon criterion (see, e.g., [H]) as soon as $(D.D_i) > 0$ for all i; in fact any curve $\tilde{C}$ on $\tilde{X}$, not entirely at infinity, must meet some $D_i$ at $\tilde{C} \setminus X$, whence $(D.\tilde{C}) > 0$. The numerical condition (ii) looks cumbersome, but we shall now present some illustrative natural examples when it is verified (see [CZ3] for other applications); roughly speaking, it “tends” to be true when the number r is large, so we have another instance of how it can be convenient that $\tilde{X} \setminus X$ is highly reducible. As a first application, let us show (as in [CZ3, Ex. 1.5]) how Siegel’s Theorem follows directly from Theorem A, when r = 3, i.e., when the curve has three points at infinity.
(We have already remarked that the general case may be derived in a standard way from this.) First, it is a known easy fact that going to a normalization we may assume that the projective curve, denoted now $\tilde{C}$, is nonsingular. We apply Theorem A with $\tilde{X} = \tilde{C} \times \tilde{C}$ and X = C × C. If $Q_1, Q_2, Q_3$ are the points at infinity for C, then the six divisors $\Delta_i := Q_i \times \tilde{C}$, $\Delta'_i := \tilde{C} \times Q_i$, i = 1, 2, 3 are at infinity for X. Condition (i) is clear. As to (ii), we plainly have $(\Delta_i.\Delta'_j) = 1$, while $(\Delta_i.\Delta_j) = (\Delta'_i.\Delta'_j) = 0$ for all i, j. This leads to $(D.D_i) = 3$, $D^2 = 18$, $\xi_i = 3$, whence the final inequality amounts to the true one $2 \cdot 18 \cdot 3 > 3 \cdot 9 + 3 \cdot 18$. We thus deduce that the integral points on X are not Zariski-dense, which would not be the case if $C(\mathcal{O}_S)$ were infinite (note that $X(\mathcal{O}_S) = C(\mathcal{O}_S) \times C(\mathcal{O}_S)$). As another application, suppose that no three of the $D_i$ intersect and that all the products $(D_i.D_j)$ equal a certain positive integer c. Then to have
“r ≥ 4” ensures the conclusion of Theorem A. (See [CZ3, Thm. 1(a)].)⁷ In fact, condition (i) follows as explained above. As to (ii), now we have $(D.D_i) = rc$, $D^2 = r^2 c$, which gives $\xi_i = r$ and then what we need amounts to $2cr^3 > cr^3 + 3cr^2$, i.e., to r > 3, which we are assuming. In turn, this corollary of Theorem A yields an amusing application to the description of quadratic-integral points on a curve. By this we mean points whose coordinates are algebraic integers, but now in a field which varies with the point and is only restricted by being of degree ≤ 2 over the base field k.⁸ Referring to [CZ3] (see the Corollary and the Addendum) for the full statements, for simplicity we illustrate here only a rather special (and easier) case. This concerns the so-called double Pell equations, considered already by Fermat, namely, certain affine open subsets of the intersections of two quadrics in $\mathbb{P}^3$. A concrete instance is the curve C defined by $x^2 - 2y^2 = 1$ and $z^2 - 3y^2 = 1$, say. This is an irreducible nonsingular curve of genus 1, embedded in $\mathbb{A}^3$; it has four (nonsingular) points at infinity, gotten from the equations $X^2 = 2Y^2$, $Z^2 = 3Y^2$ in homogeneous coordinates on the hyperplane at infinity $\mathbb{P}^3 \setminus \mathbb{A}^3$. Both criteria of Siegel’s Theorem thus imply the finiteness of $C(\mathcal{O}_S)$.⁹ On the contrary, we have an infinite set of quadratic-integral points on C, even when the ground field is $\mathbb{Q}$ (and #S = 1); actually, there exist at least three infinite families. In fact, we may solve in $\mathbb{Z}^2$ the first Pell equation $x^2 = 2y^2 + 1$ in infinitely many ways and then set, for a solution (x, y), $z = \pm\sqrt{3y^2 + 1}$; or, similarly, we may solve the second Pell equation and then put $x = \pm\sqrt{2y^2 + 1}$; or, finally, we may solve $3x^2 = 2z^2 + 1$ in $\mathbb{Z}^2$ and then set $y = \pm\sqrt{(x^2 - 1)/2}$. Well, it is not difficult to derive from the mentioned corollary of Theorem A that at most a finite number of quadratic-integral points can escape from such description. We briefly sketch the argument.
We consider the symmetric square $\tilde{X} = \tilde{C}^{(2)}$, i.e., the surface $\tilde{C} \times \tilde{C}/\sim$, where ∼ is the equivalence relation which identifies (P, Q) and (Q, P). This surface may be shown to be nonsingular. The surface $X = C^{(2)} = C \times C/\sim$ is an open affine subset; since C has four points at infinity, X has four divisors $D_1, \ldots, D_4$ at infinity such that no three intersect and which satisfy, as is easy to see, $(D_i.D_j) = 1$ for all i, j. So we may apply the special case of Theorem A, proving that $X(\mathcal{O}_S)$ is not Zariski-dense and hence is contained in a certain (possibly reducible) curve Z ⊂ X.
⁷ It may be proved that the condition yields that the $D_i$ are numerically equivalent; then it may be seen that this special case of Theorem A is also a corollary of Vojta’s very deep theorem on semi-abelian varieties. However this is not the case for the full Theorem A; see [CZ3] and [Z2].
⁸ The quadratic-rational points have been almost fully described by Abramovich and Harris using Faltings’s Theorem on rational points on subvarieties of abelian varieties. On the contrary, Faltings’s Theorem at (vi) above seems not enough to recover directly our results on quadratic-integral points.
⁹ By Baker’s techniques from transcendental number theory this finite set is even effectively computable, which does not follow from the methods of proof mentioned here.
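Returning for a moment to the numerical condition (ii) of Theorem A, the two computations carried out above (the product surface with six divisors, and the case where all $(D_i.D_j)$ equal c) can be checked mechanically. The following is a minimal sketch, assuming the three intersection numbers $D_i^2$, $(D.D_i)$ and $D^2$ are supplied by hand; the function names are hypothetical and the code only evaluates the quadratic equation and the inequality, nothing more.

```python
import math

def minimal_positive_root(di2, ddi, d2):
    """Smallest positive root xi of di2*xi^2 - 2*ddi*xi + d2 = 0, or None.

    di2 = D_i^2, ddi = (D.D_i), d2 = D^2 (intersection numbers, given as input).
    """
    if di2 == 0:
        # The equation degenerates to the linear one -2*ddi*xi + d2 = 0.
        if ddi == 0:
            return None
        xi = d2 / (2 * ddi)
        return xi if xi > 0 else None
    disc = ddi * ddi - di2 * d2  # a quarter of the discriminant
    if disc < 0:
        return None
    s = math.sqrt(disc)
    positive = sorted(t for t in ((ddi - s) / di2, (ddi + s) / di2) if t > 0)
    return positive[0] if positive else None

def condition_ii(di2, ddi, d2):
    """Check 2*D^2*xi_i > (D.D_i)*xi_i^2 + 3*D^2 for the given numbers."""
    xi = minimal_positive_root(di2, ddi, d2)
    return xi is not None and 2 * d2 * xi > ddi * xi ** 2 + 3 * d2

# Product surface C x C with three points at infinity: D_i^2 = 0, (D.D_i) = 3, D^2 = 18.
print(minimal_positive_root(0, 3, 18), condition_ii(0, 3, 18))

# All (D_i.D_j) = c: D_i^2 = c, (D.D_i) = r*c, D^2 = r^2*c; the condition holds iff r > 3.
for r in range(2, 7):
    print(r, condition_ii(1, r, r * r))
```

The second loop reproduces the threshold r ≥ 4: for r = 3 the two sides of the inequality coincide, so the strict inequality fails.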
Now, for a quadratic-integral point P ∈ C, consider the conjugate point P′ over k. Then the point (P, P′) ∈ C × C is sent to (P′, P) by conjugation, so its image in X is fixed by conjugation and is thus an ordinary integral point on X and so lies in Z. If a component Z′ of Z contains infinitely many such points it must be a rational curve by Siegel’s Theorem; then the map $(P, Q)/\sim \ \mapsto P + Q$ from Z′ to the Jacobian $J(\tilde{C})$ (= $\tilde{C}$ now) must be constant. We conclude that $P + P' \in J(\tilde{C})$ assumes only finitely many values $A_1, \ldots, A_l$. This shows that each quadratic-integral point is sent to a rational point by at least one of a finite number of rational maps on C of degree 2 (defined by $P \mapsto P \times (A_i - P)/\sim$) and with a little more effort one shows that there are at most three such relevant maps, corresponding to the above families (see [CZ3] for details).¹⁰ Still other consequences of Theorem A are the object of joint work in progress with Corvaja; in fact, there are simple corollaries which extend the above “special case” and moreover sometimes one can drop the restriction that no three of the $D_i$ intersect, by a suitable blow-up. For instance, we have the following: Suppose that $D_1, D_2, D_3$ intersect at a single point pairwise transversely and $(D_i.D_j) = c > 0$ for all i, j. Then the conclusion of Theorem A holds. Nevertheless, Theorem A does not apply, at any rate directly, to the natural case of hypersurfaces in $\mathbb{A}^3$, which are defined by a single polynomial equation. In fact, for the divisor at infinity to have at least r components (in the standard embedding), the defining equation must have the shape $f_1 \cdots f_r = g$, where $f_1, \ldots, f_r, g \in k[x_1, x_2, x_3]$ and where $\deg g < \deg(f_1 \cdots f_r)$.
Now, if $\deg g = \deg(f_1 \cdots f_r) - 1$, the inequalities for Theorem A are not satisfied, while if the degree of $g$ is smaller (as in the case of norm-form equations, when $\deg g = 0$), the surface is necessarily singular at infinity, again preventing an application of Theorem A. It happens, however, that, leaving aside Theorem A, the basic principles of the method may be applied directly even to these cases, and in fact for any number of variables. For example, in [CZ4] the following theorem is proved, where we denote by $f_1, \dots, f_r, g$ polynomials in $k[x_1, \dots, x_n]$, by $\bar{f}_i, \bar{g}$ their homogenizations in $k[X_0, \dots, X_n]$ and by $X$ the hypersurface defined by $f_1 \cdots f_r = g$, which is a kind of general Thue’s equation:

Theorem B. Suppose that the set of common zeros (in $\mathbb{P}^n$) of $X_0 \bar{g}$ and any $n - 1$ of the forms $\bar{f}_i$ is finite and that no $n$ of the $\bar{f}_i$ have a common zero at infinity. Assume also that
$$\sum_{i=1}^{r} \deg f_i > n \max(\deg f_i) + \deg g.$$
Then $X(O_S^n)$ is not Zariski-dense in $X$.

The conditions on the zeros are the analogue of the first part of (i) in Theorem A and cannot be omitted from the statement. (Similar conditions of “general position” occur in well-known broad conjectures by Vojta, which we shall briefly meet in the sequel.) As to the proof, the construction of a suitable embedding now does not use a Riemann-Roch Theorem, but is rather more explicit (though somewhat less efficient). We note that since $g$ has “small” degree, some of the $f_i$ must be relatively small at the integral points. Then, for each absolute value in $S$, we construct many functions small at the integral points as polynomials which are divisible by some monomial of high degree in the relevant $f_i$’s. To check that we obtain sufficiently many independent functions in this way, we use the Hilbert polynomials for the components of $X$. The argument also leads to a kind of Subspace Theorem for polynomials (see [CZ4, Thm. 3]). We note that this is again related to the mentioned approach of Evertse and Ferretti to the Theorem of Faltings and Wüstholz; actually in a very recent paper [EF2] they apply their techniques to recover and further extend some of the results in [CZ4].

10 This proof is highly ineffective, in that it does not even allow one to estimate the number of exceptional solutions escaping from the infinite families. In fact, a first ineffectivity comes from Theorem A, which does not allow one to compute the exceptional curve $Z$ in the above argument, but only to bound its degree. Now, no version of Siegel’s Theorem is known which bounds the number of integral points solely in terms of the degree, and this double obstacle clearly produces a “higher degree” of ineffectivity.

5. Further applications and problems

Other applications to integral points of the Subspace Theorem, after a suitable embedding, arise from diophantine equations with linear recurrences, which we only describe very briefly here, with a few concrete cases. A typical one occurs with the equation $y^2 = 1 + 2^n + 3^n$, or more generally with the problem of perfect powers in recurrent sequences. Most methods here (see [Z] for references) work only when there is a suitable factorization of the recurrence, and in general essentially work for binary recurrences only.
For instance, concerning the example, the finiteness of the squares in the ternary recurrence $1 + 2^n + 3^n$ on the right-hand side was not known until [CZ5], where some finiteness results were obtained for recurrences with arbitrarily many exponentials. The principle of the arguments therein is very simple: considering again the example, we expand the square root of $1 + 2^n + 3^n$ as a series
$$3^{n/2} \sum_{r=0}^{\infty} \binom{1/2}{r} \left( \frac{1 + 2^n}{3^n} \right)^r,$$
which converges for $n > 1$. Truncating the series after a large (but fixed) number $R$ of terms gives a good approximation for the square root by an ordinary exponential polynomial. If the square root $y = y(n)$ is integral, this provides a small linear form
$$3^{Rn} y(n) - 3^{n/2 + Rn} \sum_{r,s=0}^{R} \binom{1/2}{r} \binom{r}{s} 2^{sn} 3^{-rn}$$
with (algebraic) integer entries $3^{Rn} y(n)$, $2^{ns} 3^{n/2 + (R-r)n}$ ($s \le r$). Other small linear forms are obtained at the places 2 and 3 by picking just the variables $3^{Rn} y$ and $2^{ns} 3^{n/2 + (R-r)n}$.
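To get a feeling for the quality of this approximation, the following minimal Python sketch may help. It is an illustration added here, not code from [CZ5], and the function names are hypothetical. It evaluates the truncated binomial series exactly in rational arithmetic and compares its square with $1 + (1 + 2^n)/3^n$.

```python
from fractions import Fraction

def binom_half(r):
    """Generalized binomial coefficient C(1/2, r) as an exact fraction."""
    c = Fraction(1)
    for j in range(r):
        c *= (Fraction(1, 2) - j) / (j + 1)
    return c

def truncated_series(n, R):
    """Partial sum over r = 0..R of C(1/2, r) * ((1 + 2^n) / 3^n)^r.

    Multiplying the result by 3^(n/2) approximates sqrt(1 + 2^n + 3^n).
    """
    t = Fraction(1 + 2**n, 3**n)
    return sum(binom_half(r) * t**r for r in range(R + 1))

def squared_error(n, R):
    """|s^2 - (1 + t)|, where s is the partial sum and t = (1 + 2^n) / 3^n."""
    t = Fraction(1 + 2**n, 3**n)
    s = truncated_series(n, R)
    return abs(s * s - (1 + t))

if __name__ == "__main__":
    # Error of the truncation for n = 6 and increasing R.
    for R in (2, 4, 8):
        print(R, float(squared_error(6, R)))
```

For fixed $n > 1$ the error shrinks roughly like the $R$-th power of $(1 + 2^n)/3^n$, which is why a large but fixed $R$ is enough to make the linear form small.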
Some calculations show that, if $R$ has been chosen large enough, the Subspace Theorem applies and allows one to conclude. Even this technique relies on first finding a convenient embedding, going from the original space of triples $(y(n), 2^n, 3^n)$ to the space of the said variables for the linear forms; this time we obtain what we need directly from the Taylor series, but at bottom we are again constructing a function vanishing on a divisor at infinity.

The method has been further expanded. For instance, we may view $2^n$, $3^n$ as $S$-units (for $S$ containing $\{\infty, 2, 3\}$) and consider more generally the $S$-integral points on the subvariety $y^2 = 1 + x_1 + x_2$ in $\mathbb{A}^1 \times \mathbb{G}_m^2$; a special case of this problem occurs with the equation $y^2 = 1 + 2^m + 3^n$, for which the finiteness of integer solutions $(m, n, y)$ is not known, though it is expected. In fact, as we shall soon see, it would follow from a broad conjecture of Lang and Vojta. The above argument does not suffice, essentially because the Taylor expansion is not efficient enough when $2^m$ and $3^n$ are roughly of the same magnitude; however one may obtain in this way that for any possible infinite sequence of solutions the ratio $m \log 2 / n \log 3$ converges to 1. For some other superficially similar equations, like $y^2 = 1 + 2^m + 6^n$, one can combine such a distributional constraint with a similar one, relative to a $p$-adic absolute value (here with $p = 2$), to prove unconditionally the finiteness of solutions. (See [CZ6] for this and for more general instances of equations $f(a^m, y) = b^n$ with only finitely many solutions $(y, m, n) \in \mathbb{Z}^3$.) These questions are in turn special cases of the problem of describing the integral points on subvarieties of $\mathbb{A}^1 \times \mathbb{G}_m^n$.
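For orientation only, small solutions of $y^2 = 1 + 2^m + 3^n$ are easy to list by machine. The sketch below is plain Python; the search bound and the function name are arbitrary choices made here, not taken from the cited papers. A finite search of this kind of course says nothing about the finiteness question discussed above.

```python
from math import isqrt

def small_solutions(bound):
    """Return all (m, n, y) with 0 <= m, n <= bound, y >= 0 and y^2 = 1 + 2^m + 3^n."""
    found = []
    for m in range(bound + 1):
        for n in range(bound + 1):
            value = 1 + 2**m + 3**n
            y = isqrt(value)  # exact integer square root, no floating point
            if y * y == value:
                found.append((m, n, y))
    return found

if __name__ == "__main__":
    print(small_solutions(20))
```

Using `isqrt` keeps the test exact for large exponents, where a floating-point square root would lose precision.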
In spite of Laurent’s result at point (iv) above, even the mentioned example $y^2 = 1 + x_1 + x_2$ seems rather intractable at present in full generality.11 A partial result, which contains the positive answers just mentioned, appears as Theorem IV.5 of [Z], which we do not state in this brief general account (it concerns the perfect $d$th powers of the shape $x_1 + \cdots + x_n$, with $x_i \in O_S^*$). At p. 62 of [Z] the following conjecture also appears: Let $X$ be an irreducible subvariety of $\mathbb{A}^1 \times \mathbb{G}_m^n$ with a Zariski-dense set of $S$-integral points, such that the projection $\pi : X \to \mathbb{G}_m^n$ is finite. Then $\pi(X)$ is an algebraic translate $uH$ and there are an isogeny $\sigma : H \to H$ and a morphism $\tau : H \to X$ such that $u\sigma = \pi \circ \tau$. After Laurent’s theorem, the difficult point is the existence of $\sigma$, $\tau$. In case $X$ is a curve the conjecture is true, as can be seen, e.g., by Siegel’s Theorem (see [Z], Ex. III.10).

11 Even the function field analogue, usually substantially simpler, appears rather hard in this case; see [Z3] for some results.

Let me now mention how these problems of integral points are related to further ones, coming from other varieties and leading to open questions. We have seen several examples where reducibility at infinity is an important issue. This fits into the following celebrated conjecture of Lang and Vojta (see, e.g., [L], p. 223 or [HSi], p. 486), where $D$ is the divisor at infinity and $K$ is the canonical class of $\tilde{X}$: The integral points should not be Zariski-dense when $K + D$
is pseudo-ample and $D$ has simple normal crossings. The “normal-crossings” condition ([L], p. 191) holds when the $D_i$ are nonsingular and meet transversally everywhere. The reducibility of $D$ does not explicitly appear, but in practice the ampleness condition is likely to be verified if $D$ has many components.

To see some instances of this conjecture, from now on let us concentrate on the simply described cases when $X = \mathbb{P}^n \setminus D$; we are now concerned with a rational variety, but nevertheless very interesting equations, often representing non-rational varieties, arise from these situations. To see how, note that now the divisor $D$ at infinity is defined by a single equation $f = 0$, where $f$ is a form in $X_0, \dots, X_n$ of degree $d = \deg D$. The integral points on $X$ will give rise to integral points for the affine cone in $\mathbb{A}^{n+1}$ over $X$, which is just the complement in $\mathbb{A}^{n+1}$ of $f = 0$; this cone may be embedded as the affine variety in $\mathbb{A}^{n+2}$ defined by $z f(x_0, \dots, x_n) = 1$. In turn, an $S$-integral point of this variety will be such that $z \in O_S^*$; since $O_S^*$ is finitely generated we may enlarge $k$ to a fixed finite extension and write $z = w^d$ for an $S$-integer $w$ in $k$. Then, setting $x_i' := w x_i \in O_S$, we finally find that the points in $X(O_S)$ “essentially” correspond to $S$-integral solutions $x' = (x_0', \dots, x_n') \in O_S^{n+1}$ of the equation $f(x_0', \dots, x_n') = 1$.12 (This rough argument may be replaced by a Veronese affine embedding of $X$; see, e.g., [Z2, Prop. 1].) Note that if $D$ splits into distinct irreducible components $D_1, \dots, D_r$, defined by forms $f_1, \dots, f_r$ such that $f = f_1 \cdots f_r$, the equation $f(x) = 1$, $x \in O_S^{n+1}$, will (essentially) imply that $f_i(x) \in O_S^*$ for all $i = 1, \dots, r$; again, we see that the reducibility of $D$ somewhat strengthens our information.

Going back to the Lang-Vojta conjecture, recall that the canonical class of $\mathbb{P}^n$ is $-(n + 1)H$, where $H$ is the class of a hyperplane. Also, the class of $D$ is $(\deg D)H = dH$.
Hence $K + D \sim (d - n - 1)H$ is (pseudo) ample if and only if $d = \deg D \ge n + 2$. Thus the conjecture predicts a non-Zariski-dense set of $S$-integral points for $\mathbb{P}^n \setminus D$ if $D$ has simple normal crossings and degree $\ge n + 2$.

When the divisor $D$ splits into at least $n + 2$ components, the conclusion of the conjecture is a simple case of the result of Vojta at (v) above; in fact, $\mathrm{Pic}^0(\mathbb{P}^n) = 0$ and the Néron-Severi group of $\mathbb{P}^n$ has rank $\rho = 1$. In short, the principle of Vojta’s proof is as follows: for an $S$-integral point $x$ we have seen that $y_i := f_i(x)$ lies in $O_S^*$, for $i = 1, \dots, n + 2$; now, the $f_i$ depend on $n + 1$ variables, so there is a nontrivial identical relation $R(f_1, \dots, f_{n+2}) = 0$, which leads to $R(y_1, \dots, y_{n+2}) = 0$ for the $S$-integral points in question. This gives an $S$-integral point on the subvariety of $\mathbb{G}_m^{n+2}$ determined by $R = 0$, and now Laurent’s Theorem applies.

12 Again, we note that $X$ is a rational variety, but the equation $f = 1$ in most cases represents an irrational one, as happens already with $X = \mathbb{P}^1 \setminus \{0, 1, \infty\}$; this apparent paradox is due to the use of the cover-map $z = w^d$ of $\mathbb{G}_m$, through the finite generation of $O_S^*$.
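As a concrete illustration of the principle of Vojta’s proof, one may work out the smallest case $n = 1$ with $D = \{0, 1, \infty\} \subset \mathbb{P}^1$. The particular choice of coordinates and of the forms $f_1, f_2, f_3$ below is an assumption made here for illustration; it is not spelled out in the text.

```latex
% n = 1, D = {0, 1, \infty} \subset \mathbb{P}^1, so d = 3 = n + 2.
f_1 = X_0, \quad f_2 = X_1, \quad f_3 = X_0 - X_1,
\qquad R(T_1, T_2, T_3) = T_1 - T_2 - T_3 .
% R(f_1, f_2, f_3) = X_0 - X_1 - (X_0 - X_1) = 0 identically, hence
y_1 - y_2 - y_3 = 0, \quad y_i = f_i(x) \in O_S^*
\quad\Longrightarrow\quad
\frac{y_2}{y_1} + \frac{y_3}{y_1} = 1 .
```

So in this case the relation $R = 0$ is linear, and the $S$-integral points give solutions in $S$-units of an equation with two unknowns summing to 1.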
The simplest instance of such splitting occurs when $D$ is the sum of $n + 2$ hyperplanes, and the result in this case was first proved independently by Evertse and van der Poorten-Schlickewei (see [Z]). It amounts to a special case of Laurent’s Theorem (and ultimately depends on the Subspace Theorem), but was obtained earlier and actually constituted a crucial tool for Laurent’s proof. By the above remarks it is not difficult to see that the problem boils down to the $S$-integral points on the linear subvariety $y_0 + \cdots + y_n = 1$ of $\mathbb{G}_m^{n+1}$, i.e., we find the so-called $S$-unit equation. Here, Laurent’s quoted theorem leads immediately to the conclusion since the subvariety is irreducible and not a translate of an algebraic subgroup.13

Another simple splitting occurs when $D = D_1 + \cdots + D_{n+1}$ for just one quadratic $D_i$ and linear remaining ones. The conclusion of the conjecture is now unknown already in the case $n = 2$, when $D$ is a sum of two lines and a conic (in general position). This leads to apparently innocuous equations like $x_0 x_1 (x_0^2 + x_1^2 + x_2^2) = 1$, $x_i \in O_S$; again, $x_0, x_1$ must be $S$-units and by finite generation of $O_S^*$ we may suppose (going to a larger but fixed number field) that $x_0 x_1 = t^2$ is a square. Then, putting $y = x_2 t$, $u = -x_0 x_1^3$, $v = -x_0^3 x_1$, we find $y^2 = 1 + u + v$ with $S$-integer $y$ and $S$-units $u, v$. Namely, we have recovered one of the equations in the previous examples concerning recurrences and subvarieties of $\mathbb{A}^1 \times \mathbb{G}_m^n$ (and the steps may be reversed). On the one hand this shows that those problems fit in a broader context than it might perhaps appear; on the other hand, we have already remarked that little is known about them, so we get an idea of the depth of the general conjecture. Assuming the conjecture for this extremely special case, we deduce that the points $(y, u, v) \in O_S \times (O_S^*)^2$ such that $y^2 = 1 + u + v$ are not Zariski-dense in this hypersurface.
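The passage from the equation with two lines and a conic to $y^2 = 1 + u + v$ can be checked in a couple of lines. The following short computation is added here for the reader’s convenience and uses only the substitutions $x_0 x_1 = t^2$, $y = x_2 t$, $u = -x_0 x_1^3$, $v = -x_0^3 x_1$ of the text.

```latex
y^2 = x_2^2 t^2 = x_0 x_1 x_2^2,
\qquad
-u - v = x_0 x_1^3 + x_0^3 x_1 = x_0 x_1 (x_0^2 + x_1^2),
\\
y^2 - u - v = x_0 x_1 (x_0^2 + x_1^2 + x_2^2) = 1
\quad\Longrightarrow\quad
y^2 = 1 + u + v .
```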
It easily follows that $u, v$ must satisfy $F(u, v) = 0$ for some nonzero polynomial $F$ depending only on $k$ and $S$. Further, from Laurent’s theorem it is easy to infer that $u, v$ must satisfy one of finitely many nontrivial equations of type $u^a v^b = \lambda$, $a, b \in \mathbb{Z}$, $\lambda \in O_S^*$. In particular, if $u = 2^m$, $v = 3^n$, either $m$ or $n$ must be bounded and we easily recover (say from Siegel’s Theorem) the previous claim on the finiteness of the integer solutions of $y^2 = 1 + 2^m + 3^n$.

13 One can derive, as was done by the quoted authors, the more precise result that there are only finitely many $S$-integral solutions such that no nonempty subsum $\sum_{i \in I} y_i$ vanishes.

6. The method of covers

As announced earlier in this paper, we now turn to the description of the technique of (unramified) covers, which sometimes allows one to deal with cases when there is a single divisor at infinity. A basic tool here is (a version of) the “Chevalley-Weil Theorem” (see [Se]): Let $\pi : X' \to X$ be a finite unramified map of affine varieties, defined over the number field $k$. Then, given a finite set $S$ of places of $k$, there exist a number field $k'$ and a finite set of places $S'$ of $k'$ such that $\pi^{-1}(X(O_{k,S})) \subset X'(O_{k',S'})$.
In other words, we can lift the integral points on $X$ to integral points on $X'$ defined over one and the same number field $k'$; naturally the important thing is that $k'$ does not depend on the lifted point, and for this it is crucial that $\pi$ is unramified. Often $\pi$ may be obtained as the restriction to $X' := \tilde{\pi}^{-1}(X)$ of a finite map $\tilde{\pi} : \tilde{X}' \to \tilde{X}$ of projective varieties, which is unramified except (possibly) above the divisor at infinity $\tilde{X} \setminus X$. This theorem is an arithmetic analogue of the lifting of maps in homotopy theory. Roughly speaking, it may be proved by observing first that the coordinates of the points in $\pi^{-1}(X(O_S))$ satisfy algebraic equations over $k$, of degree $\le \deg \pi$ and whose discriminants have a gcd which is essentially in $O_S^*$ (this comes from the fact that $\pi$ is unramified). Then one concludes by a classical result, due to Hermite, asserting the finiteness of number fields of bounded degree and discriminant in $O_S^*$.

In the situation of the theorem, we may then work with the integral points on $X'$ rather than on $X$. This may be advantageous for several reasons; we have mentioned an example occurring in Siegel’s proof of his theorem, where reading the “diophantine approximation” on the cover yields a sharpening of the bounds. Other instances may occur when $X'$, but not $X$, may be embedded in a semi-abelian variety, for which the theory is fairly complete. Further, going to the cover sometimes increases the number of components at infinity, and we have stressed throughout how important the condition is that this number be large. We met an example of this last phenomenon in the deduction of the general case of Siegel’s Theorem from the special case when there are at least three points at infinity; in fact, when the curve $C$ has positive genus and a single point $Q$ at infinity we may take an unramified cover of degree $d \ge 3$ of the projective closure $\tilde{C}$. Then the inverse image of $Q$ will consist of $d$ distinct points, and removing them from the cover leaves an affine curve $C'$ with at least three points at infinity. Now, if $C(O_{k,S})$ were infinite, the same would hold for $C'(O_{k',S'})$ by the Chevalley-Weil Theorem, contradicting the special case.
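A toy instance of this lifting may help. The assumptions here are not made in the text: we take $k = \mathbb{Q}$, $S = \{\infty, 2, 3\}$ and the unramified cover $w \mapsto w^2$ of $\mathbb{G}_m$. Every $S$-unit $\pm 2^a 3^b$ then has a square root in one of finitely many quadratic extensions, so all lifted points are defined over the single field $\mathbb{Q}(\sqrt{-1}, \sqrt{2}, \sqrt{3})$. The Python sketch below, with hypothetical function names, merely enumerates the square classes that occur.

```python
from itertools import product

def square_class(sign, a, b):
    """Squarefree representative of the S-unit sign * 2**a * 3**b modulo squares.

    Assumption for illustration: k = Q and S = {infinity, 2, 3}, so the
    S-units are exactly the numbers +-2**a * 3**b with a, b integers.
    """
    return sign * 2 ** (a % 2) * 3 ** (b % 2)

def square_classes(bound):
    """Collect the square classes of all S-units with |a|, |b| <= bound."""
    exponents = range(-bound, bound + 1)
    return {square_class(s, a, b) for s, a, b in product((1, -1), exponents, exponents)}

if __name__ == "__main__":
    # The set of classes does not grow when the bound is increased.
    print(sorted(square_classes(3)))
    print(square_classes(3) == square_classes(30))
```

A square root of a unit in the class $c$ lies in $\mathbb{Q}(\sqrt{c})$, so finitely many classes give one fixed field; this is the point at which the finite generation of $O_S^*$ enters.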
Unfortunately, when $X$ has dimension $> 1$ the inverse image of an irreducible divisor under a finite map “usually” remains irreducible, so generally speaking the number of components at infinity does not increase. However there are exceptions to this. To see an instance, let us start with an unramified finite cover $\pi : \tilde{X}' \to \tilde{X}$ of projective varieties, let us take a divisor $D'$ on $\tilde{X}'$ and let us define $D := \pi(D')$. Then, this time, if $\deg \pi > 1$, we shall “usually” have that $\pi^{-1}(D) = \pi^{-1}(\pi(D'))$ will bring components not contained in $D'$ and so generally will have more components than $D$ (see, e.g., [CZ3], Ex. 1.4, for an instance with abelian surfaces). However this construction is somewhat artificial (and of course does not work if $\tilde{X}$ is simply connected), since we would like to start with given $D$, rather than with given $D'$.
Some other, more natural, examples in which the number of components at infinity increases after lifting to a cover have been proposed by Faltings in the paper [F2]. They concern affine surfaces of type $\mathcal{X} = \mathbb{P}^2 \setminus D$ for certain irreducible divisors $D$ which we are going to describe. We again note that by the above remarks the $S$-integral points on $\mathcal{X}$ now correspond to $S$-integral solutions $(x, y, z) \in O_S^3$ of an equation $f(x, y, z) = 1$, where $f$ is a certain homogeneous irreducible polynomial defining $D$ in $\mathbb{P}^2$.

Following [F2] we briefly recall, in a slightly different language, Faltings’s construction. We let $X$ be a projective smooth geometrically irreducible algebraic surface, defined over $\mathbb{Q}$, embedded in $\mathbb{P}^n$ as a surface of degree $\nu > 8$. We consider a “generic” projection $f_E : X \to \mathbb{P}^2$ with center an $(n-3)$-dimensional linear subspace $E \subset \mathbb{P}^n$, which is parametrized by the corresponding Grassmannian. Note that $f_E$ has degree $\nu$ and will be regular and finite for $E$ in an open subset of the Grassmannian. We define $Z = Z_E \subset X$ as the ramification locus of $f_E$ and $D = D_E := f_E(Z) \subset \mathbb{P}^2$ as the branch locus in $\mathbb{P}^2$. For technical reasons we also assume that $Z$ represents an ample class; in [F2] it is remarked that the class of $Z$ is $K_X + 3L$, where $L$ is the class of a hyperplane section; in particular, this condition on $Z$ depends only on $X$ and the chosen embedding, not on $E$. We are interested in the $S$-integral points for $\mathcal{X} := \mathbb{P}^2 \setminus D$. We continue to follow [F2], defining $Y \to X \to \mathbb{P}^2$ as the associated Galois closure. Note that the cover map $\pi = \pi_E : Y \to \mathbb{P}^2$ is unramified except above $D$ (however, the map to $X$ may be ramified also outside $Z$).
In [F2] it is proved that, provided a hyperplane section of $X$ satisfies certain ampleness conditions which we do not repeat here, for all $E$ in a certain open dense subset of the Grassmannian it happens that $E$ generates $L$ and is such that $Z$ is smooth and irreducible, $f_E$ is birational onto $D$, $D$ has only cusps and simple double points as singularities and $Y$ is smooth with Galois group $S_\nu$. Throughout we shall assume that $E$ is such a linear space, saying in short that it is “general”.

Now, a crucial fact for our purposes is that, for a general $E$, the inverse image of $D$ in $Y$ splits into $\nu(\nu - 1)/2$ irreducible components (see [F2]). As in [Z2] we remark that intuitively this is clear if we think of the simplest case when $X$ is a hypersurface in $\mathbb{P}^3$ and $E = (0 : 0 : 0 : 1)$, in which case $f_E$ is the projection on the first three homogeneous coordinates. Now $X$ will be defined by a homogeneous equation $F(X_0, X_1, X_2, X_3) = 0$ of degree $\nu$, monic in $X_3$, and an equation for the branch divisor $D$ will be $\Delta(X_0, X_1, X_2) = 0$, where $\Delta$ is the discriminant of $F$ with respect to $X_3$. Let $P = (x_0 : x_1 : x_2 : u)$ be a generic point in $X$; then the points in $X$ above $f_E(P) = (x_0 : x_1 : x_2)$ are given by $(x_0 : x_1 : x_2 : u_i)$, where $u_i$ runs through the $\nu$ distinct roots of $F(x_0, x_1, x_2, U) = 0$; an ordering of these points corresponds to a point of $Y$ above $P$. Note now the factorization $\Delta(x_0, x_1, x_2) = \prod_{i<j} (u_i - u_j)^2$; well, the
$\nu(\nu - 1)/2$ factors here correspond precisely to the mentioned components in the inverse image of $D$ in $Y$.

In the paper [F2], Faltings applies to this setting methods of diophantine approximation close to the papers [F] and [FW]; he first uses the Chevalley-Weil theorem and analyzes the integral points on $\mathcal{X} = \mathbb{P}^2 \setminus D$ by first studying directly those on $Y \setminus \pi^{-1}(D)$, where $\pi = \pi_E : Y \to \mathbb{P}^2$ is the above cover map. For a general projection as above, he obtains in this way the finiteness of the $S$-integral points for $\mathcal{X}$, under the condition that, with the above notation, $(Z.L)L - \alpha Z$ is ample on $X$ for some $\alpha > 12$; recalling that the class of $Z$ is $K + 3L$, this is equivalent to $((K.L) + 3\nu - 3\alpha)L - \alpha K$ being ample.14 As an instance, he applies this when $X$ is $\mathbb{P}^1 \times \mathbb{P}^1$, embedded in $\mathbb{P}^{a+b+ab}$ as a surface of degree $\nu = 2ab$ by means of the bihomogeneous polynomials of degrees $a, b$ in two pairs of coordinates. It is noted that the required ampleness conditions for $L$ are satisfied when $a, b \ge 3$, and thus we get quite explicit examples of finiteness of the $S$-integral points for $\mathbb{P}^2 \setminus D$, with certain irreducible $D$. Faltings also remarks that these results are not direct corollaries of the known ones, e.g., by Vojta, on subvarieties of semi-abelian varieties, by proving that the relevant cover $Y \setminus \pi^{-1}(D)$ of $\mathcal{X}$ cannot always be embedded into a semi-abelian variety.

In the paper [Z2] we have “tested”, so to speak, the methods and results of [CZ3], by applying them to the context introduced by Faltings, just described. We have borrowed from [F2] the geometrical picture and the calculations of the relevant intersection numbers and geometric numerical invariants; it has turned out that a construction slightly different compared to [F2] (see [Z2], Thm.
3.2, where we remove from $Y$ fewer divisors than in [F2]) produces a variety $Y \setminus D' \supset Y \setminus \pi^{-1}(D)$ for which the assumptions for Theorem A above hold, under suitable conditions on the original surface $X$ and its embedding; then the conclusion of Theorem A together with an application of Siegel’s Theorem prove the finiteness of integral points on $Y \setminus \pi^{-1}(D)$; finally, the Chevalley-Weil theorem gives the sought finiteness of $\mathcal{X}(O_S)$. The conditions which are required to apply Theorem A to this setting, differently from [F2], are this time purely numerical (see [Z2], Thm. 3.1); it is not clear whether they imply or are implied by Faltings’s ones, providing further evidence that the methods of [FW], although related to ours, are not in fact equivalent. In §4 of [Z2] we recover the above mentioned results of [F2] on $\mathbb{P}^1 \times \mathbb{P}^1$ and we also show that the alluded numerical assumptions hold quite generally, by proving, e.g., that: If $X$ has Kodaira number $\ge 0$ then $\mathcal{X}$ has only finitely many $S$-integral points.15 (See [Z2], Cor. 4.1; naturally we tacitly assume here as in [F2] that $\nu > 8$ and that the projection in question is “general”.) In view of Faltings’s remarks mentioned above, these conclusions also show that Theorem A cannot be obtained directly by embeddings in semi-abelian varieties.

14 Faltings also requires the somewhat technical condition that $D$ has some double point; this is verified in the cases in question below, so we forget it here.

15 We recall that the surfaces with negative Kodaira number are in a sense special, necessarily birational to a product $C \times \mathbb{P}^1$ for a suitable curve $C$; see [H].

We also note that the interpretation of the splitting of $D$ in $Y$ by a discriminant factorization, as explained earlier, may be carried out directly, actually in any number of variables, and leads to other results about integral points (see Thm. 2.1 in [Z2]). In concrete terms, they concern integral solutions of “discriminantal” equations $\Delta(x_0, \dots, x_n) = 1$; actually, this kind of principle has long been known (see [B]). However, in the case of surfaces this more direct approach works only when $X \subset \mathbb{P}^3$ and so does not include the full context of [F2]. On the contrary, in “most” cases the surface $X$ can be mapped to $\mathbb{P}^3$ only at the cost of introducing singularities and new components in the branch locus (see Remark 3.1 in [Z2]).

The results of this last section illustrate that sometimes the method of covers is a powerful one. It seems moreover probable that further substantial applications of it can be found. Faltings himself remarks in [F2] that, though the Galois covers appearing in the context cannot always be embedded in semi-abelian varieties, one cannot exclude a priori that the embedding exists for a further cover of them. These possibilities of course depend on the fundamental group of the affine variety given at the beginning as well as on the structure “at infinity” of its finite covering spaces16; it seems that generally speaking the knowledge here is not complete. Some advance in this interesting topic in the topology of algebraic varieties might then provide rather striking new applications to the number-theoretical problem of integral points.

Acknowledgment. I wish to thank Enrico Bombieri and Pietro Corvaja for helpful remarks and discussions.
16 The questions here are purely topological, due to a theorem of Grauert and Remmert which roughly speaking asserts that the relevant topological coverings may always be realized as algebraic varieties.

Added in Proof: Recently the methods of Sections 3 and 4 have been developed in arbitrary dimension by Aaron Levin in his PhD Thesis (Berkeley, 2005). A preprint by Levin in this respect appears on the web.

References

[B] E. Bombieri, Sulle soluzioni intere dell’equazione $4X^3 = 27Y^2 + N$, Riv. Mat. Univ. Parma, 8 (1957), 199–206.
[CZ1] P. Corvaja, U. Zannier, A Subspace Theorem approach to integral points on curves, C. R. Acad. Sci. Paris, Ser. I 334 (2002), 267–271.
[CZ2] P. Corvaja, U. Zannier, On the number of integral points on algebraic curves, J. reine angew. Math. 565 (2003), 27–42.
[CZ3] P. Corvaja, U. Zannier, On integral points on surfaces, Annals of Math., 160 (2004), 705–726.
[CZ4] P. Corvaja, U. Zannier, On a general Thue’s equation, American J. of Math. 126 (2004), 1033–1055.
[CZ5] P. Corvaja, U. Zannier, Diophantine equations with power sums and Universal Hilbert Sets, Indag. Mathem., N.S., 9 (3) (1998), 317–332.
[CZ6] P. Corvaja, U. Zannier, On the diophantine equation $f(a^m, y) = b^n$, Acta Arith. 94.1 (2000), 25–40.
[EF] J.-H. Evertse, R. Ferretti, Diophantine inequalities on projective varieties, International Math. Res. Notices, 25 (2002), 1295–1330.
[EF2] J.-H. Evertse, R. Ferretti, A generalization of the Subspace Theorem with polynomials of higher degree, preprint NT/0408381, to appear in the Proceedings of the Schmidt Conference, Vienna 2003.
[F] G. Faltings, Diophantine Approximation on Abelian Varieties, Annals of Math. 133 (1991), 549–576.
[F2] G. Faltings, A New Application of Diophantine Approximation, in A Panorama of Number Theory, or The View from Baker’s Garden, G. Wüstholz Ed., Cambridge Univ. Press, 2002, 231–246.
[FW] G. Faltings, G. Wüstholz, Diophantine Approximations on Projective Varieties, Inventiones Math. 116 (1994), 109–138.
[H] R. Hartshorne, Algebraic Geometry, Springer-Verlag GTM 52, 1977.
[HSi] M. Hindry, J.H. Silverman, Diophantine Geometry, Springer-Verlag, 2000.
[L] S. Lang, Number Theory III, Encyclopaedia of Mathematical Sciences, Vol. 60, Springer-Verlag, 1991.
[S1] W.M. Schmidt, Diophantine Approximation, Springer-Verlag LNM 785.
[S2] W.M. Schmidt, Diophantine Approximations and Diophantine Equations, Springer-Verlag LNM 1467, 1991.
[Se] J-P. Serre, Lectures on the Mordell-Weil Theorem, Vieweg, 1989.
[Se2] J-P. Serre, Algebraic groups and class fields, Springer Verlag, GTM 117, 1988.
[Si] J.H. Silverman, The Arithmetic of Elliptic Curves, Springer-Verlag GTM 106, 1986.
[V] P. Vojta, Diophantine Approximations and value distribution theory, Springer Verlag LNM 1239.
[Z] U. Zannier, Some Applications of Diophantine Approximation to Diophantine Equations, Forum Editrice, Udine, December 2003.
[Z2] U. Zannier, On the integral points on the complement of ramification divisors, Journal de Math. de Jussieu 4 (2005), 317–330.
[Z3] U. Zannier, Polynomial squares of the form $aX^m + b(1 - X)^n + c$, Rend. Sem. Mat. Univ. Padova 112 (2004), 1–9.

Umberto Zannier
Scuola Normale Superiore
Piazza dei Cavalieri 7
I-56126 Pisa, Italy
e-mail:
[email protected]
Network Lectures
4ECM Stockholm 2004 © 2005 European Mathematical Society
Some Problems Related with Holomorphic Functions on Tube Domains over Light Cones

Aline Bonami

Abstract. In this survey, we consider two kinds of problems on tube domains over light cones. The first one is related to Poisson-Szegő integrals $F$. When $F$ is a bounded real function and is appropriately smooth up to the boundary, then $F$ is necessarily the real part of a holomorphic function. The second one is the $L^p$ boundedness of the Bergman projection, for which known positive and negative results leave a gap between them. This gives an illustration of activities within HARP.
Research partially financed by the European Commission IHP Network 2002-2006 Harmonic Analysis and Related Problems (Contract Number: HPRN-CT-2001-00273 - HARP).

1. Introduction

Let us consider the complex tube domain $\Omega = \mathbb{R}^n + i\Gamma \subset \mathbb{C}^n$, $n \ge 3$, where $\Gamma$ is the forward light cone given by
$$\Gamma = \{ y = (y_1, \dots, y_{n-1}, y_n) \in \mathbb{R}^n : y_1 > \sqrt{y_2^2 + \cdots + y_n^2} \}. \tag{1.1}$$
The cone $\Gamma$ is the simplest example of a symmetric irreducible cone, apart from the positive real line, for which the associated tube domain is the upper half-plane. The description of such cones can be done through Jordan algebras, and may be found in [FK]. They can be identified with symmetric spaces. Here $\Gamma$ identifies with $SO_0(n-1, 1)/SO(n-1)$, where $SO_0(n-1, 1)$ is the identity component of the Lorentz group. $\mathbb{R}^n$ is the Shilov boundary of the tube domain $\Omega$.

The Hardy space $H^2(\Omega)$ consists of holomorphic functions that may be written as Laplace transforms of functions $g \in L^2(\mathbb{R}^n)$ that are supported in $\Gamma$, that is,
$$F(z) := (2\pi)^{-n} \int_\Gamma e^{i z \cdot \xi} g(\xi)\, d\xi = C_y * f(x), \qquad z = x + iy,$$
where $f$, which has Fourier transform $g$, is the (Shilov) boundary value of the holomorphic function $F$, and $C$ is the Cauchy kernel. The latter can be explicitly computed:
$$C_y(x) := (2\pi)^{-n} \int_\Gamma e^{i z \cdot \xi}\, d\xi = c_n \Delta(z/i)^{-n/2},$$
with $\Delta(y) := y_1^2 - y_2^2 - \cdots - y_n^2$ the Lorentz form. For $F \in H^2$, as in all domains (see [S]), the scalar product of its (Shilov) boundary value with the Szegő kernel $S(z, \cdot)$ gives its evaluation at $z$. Here $S(z, u) = C_y(x - u)$. Now, the Poisson-Szegő kernel is defined by
$$P(z, u) := \frac{|S(z, u)|^2}{S(z, z)}, \qquad z \in \Omega,\ u \in \mathbb{R}^n,$$
and also gives the evaluation at the point $z$ when restricted to holomorphic functions. Moreover, it gives an approximate identity, like the usual Poisson kernel related to the upper half-plane. We say that the function $F$ in the tube domain is a Poisson-Szegő integral whenever it may be written as
$$F(z) := \int_{\mathbb{R}^n} P(z, u) f(u)\, du \tag{1.2}$$
when this integral makes sense (possibly extending to distributions). As is seen from the last formulas, which can be obtained in the context of all irreducible symmetric cones (see [FK] or [Gi]), all this mimics the situation in the upper half-plane. But there are main differences, and we will consider two of them.
• Poisson-Szegő integrals are no longer pluriharmonic functions (that is, sums of holomorphic and anti-holomorphic functions);
• the singularities of the kernels are not of Calderón-Zygmund type. They involve oscillatory integrals.
These last ones have been the object of many studies, starting from the theorem of Fefferman [Fef], which states that the Szegő projection for the tube domain $\Omega$ is not bounded in $L^p(\mathbb{R}^n)$ for $p \ne 2$. This projection is given, on the Fourier side, by the multiplication by the characteristic function of $\Gamma$, and this assertion can be deduced from the fact that the characteristic function of the disc is not a Fourier multiplier in two dimensions, for $p \ne 2$. Let us recall that this counter-example of Fefferman has led to the consideration of Bochner-Riesz means (see [S]), for which the problem of boundedness in $L^p$ is still open from dimension 3, as well as the equivalent problem related to the cone itself, known as the “cone multiplier problem”, for which Wolff has obtained the best known partial results [W]. We will see that, when replacing the Szegő projection by the weighted Bergman projection, we also have partial results, which are in some way connected with the ones for the cone multiplier problem.

2. Poisson-Szegő integrals and pluri-harmonicity

Poisson-Szegő integrals are known to coincide with solutions of a second-order system of partial differential equations, called the Hua system: this is due to Johnson and Korányi in the general context of tube domains over symmetric cones, and with boundary values that are defined as hyperfunctions (see [JK], [FK]).
On the other hand, it has been observed, for the first time by Folland in [Fo], that harmonic functions for the hyperbolic Laplacian in the complex
unit ball are not smooth up to the boundary, unless they are pluriharmonic. This phenomenon has been given more precise statements (see [Gr], [BBG]). In particular, smoothness may be understood in terms of distributions. In the context of the tube domain Ω, generalized to all irreducible symmetric cones, we have the following.
Theorem [BBDHJ]. Let D be an irreducible symmetric domain of tube type. There exists k (depending on the dimension and the rank) such that, if F is the Poisson-Szegő integral of a bounded function and F extends to a function of class $C^k$ on $\overline{D}$, then F is pluriharmonic.
One needs some boundary condition on F to get the conclusion, but boundedness is certainly far from necessary. Also, one would like to read this property from the behavior of the Fourier transform of $P_y$ outside Γ ∪ (−Γ) (since pluriharmonic functions have their spectrum contained in this set). Trojan has some partial results in this direction [T]. The proof given in [BBDHJ] is much more indirect. Many sufficient conditions for pluriharmonicity, which can be written in terms of families of second-order operators, had been given previously (see [BDH] and [DHMP]). Let us mention that, when dealing with Siegel domains of type II, the characterization of Poisson-Szegő integrals as solutions of a system of partial differential equations is not as satisfactory as for tube domains. Berline and Vergne in [BV] give a system that may be of third order. If one writes the analogue of the Hua system for such general domains, using the curvature tensor in the same way, then one is led to a second-order system which, surprisingly, annihilates only pluri-harmonic functions once some growth condition at the boundary is added (see [BBDHPT] and [Bu]).
3. Bergman projection and Besov spaces
Since the Szegő projection is not bounded in $L^p(\mathbb{R}^n)$, it is natural to ask whether the weighted Bergman projections are. This is studied in a series of papers, [BB], [BBPR], [BBGR]; see also [BBGNPR].
More precisely, let $L^p_\nu(\Omega)$ be the weighted Lebesgue space for the measure $\Delta(y)^{\nu - n/2}\,dx\,dy$ (recall that $z = x + iy$), and let $A^p_\nu$ be its closed subspace consisting of holomorphic functions. Integrability of the weight requires the condition $\nu > n/2 - 1$, which guarantees that $A^p_\nu$ is not reduced to zero. Again, the weighted Bergman kernel $B_\nu(z, w)$, which gives the evaluation at z for functions of $A^2_\nu$, is known explicitly:
$$B_\nu(z, w) = d(\nu)\, \Delta^{-(\nu + n/2)}\big((z - \overline{w})/i\big).$$
It is the kernel of the weighted Bergman projection, called $P_\nu$ and given by
$$P_\nu F(z) = \int_\Omega B_\nu(z, w)\, F(w)\, \Delta(\Im w)^{\nu - n/2}\, du\, dv,$$
where we have used the notation w = u + iv. The best statement, up to now, is the following. We restrict to p ≥ 2, since Pν is self-adjoint.
Theorem [BBGR]. The weighted Bergman projection $P_\nu$ is bounded in $L^p$ when $2 \le p < \frac{2(\nu+n-2)}{n-2} + \varepsilon(\nu)$, for some explicit positive $\varepsilon(\nu)$. It is unbounded for $p \ge \frac{2(\nu+n-1)}{n-2}$. Moreover, there exists some explicit constant $\nu_0$ such that, for $\nu > \nu_0$, $P_\nu$ is bounded in $L^p$ in the whole range $p \in \left[2, \frac{2(\nu+n-1)}{n-2}\right)$.
We conjecture that it is always bounded in the whole range $2 \le p < \frac{2(\nu+n-1)}{n-2}$. This is why we do not give the explicit values of $\varepsilon(\nu)$ and $\nu_0$, which depend on non-optimal estimates of Wolff and Łaba-Wolff (see [W], [LW]); we refer to [BBGR] for them. Estimates of Tao-Vargas [TV] can also be used to obtain some improvement. Note that the Szegő projection corresponds formally to the case ν = 0 (just look at the kernels given above), and the critical index that appears in the conjecture coincides with the critical index in the cone multiplier problem.
We give an idea of the proof. Let us first mention that all this generalizes to general symmetric cones, except for the gain of $\varepsilon(\nu)$: the results of Łaba-Wolff, which we use in the case of light cones, are not known in general. Apart from this last use of their estimates, the main ingredient is a reformulation of the problem in terms of inequalities for holomorphic functions in the tube domain, which may be thought of as Laplace transforms of functions with support in Γ. For the upper half-plane, it is well known that the spaces of boundary values of weighted Bergman spaces coincide with Besov spaces at the boundary. The latter are related to a Littlewood-Paley decomposition, which comes from a Whitney decomposition of the real line. The analogous problem for Ω is proved to be equivalent to the $L^p$ boundedness of the weighted Bergman projection. There are many equivalent definitions of the Besov spaces in this context, as in the classical case. We borrow the next one from Debertol [D]. The role of the dilation group is now played by the triangular group T := AN, given in the Iwasawa decomposition of $SO_0(n-1, 1)$. The group T acts simply transitively on Γ. Let dτ be its Haar measure. We fix a non-zero smooth function ψ in $\mathbb{R}^n$, which is compactly supported in Γ, and write e := (1, 0, . . . , 0). Then the Besov norm of a smooth function f, whose Fourier transform is supported in Γ, is given by
$$\|f\|^p_{B^{p,p}_\nu} := \int_T \|f * \psi_\tau\|_p^p\, \Delta(\tau e)^{-\nu}\, d\tau,$$
where $\psi_\tau(\xi) := \psi(\tau\xi)$. The main issue is to prove that the extension operator into a holomorphic function is a bounded operator, when considered from the Besov space into the Bergman space. We are led to estimate $\big\|\sum_j f_j\big\|^p_{L^p(\mathbb{R}^n)}$ in terms of $\sum_j \|f_j\|^p_{L^p(\mathbb{R}^n)}$ when the $f_j$ have spectra supported in nearly disjoint parallelepipeds of the same size, which cover a neighborhood of the boundary of Γ, conveniently truncated. This is exactly the problem that is considered by Łaba and Wolff, in connection with the cone multiplier problem.
Let us indicate a last problem. One can define a Hardy-type space that corresponds to the limit value $\nu = n/2 - 1$ (see [Ga]). The problem of the $L^p$-
boundedness of the corresponding projection is entirely open, except for the same negative results as above.
Acknowledgement. This is a summary of the talk given by the author at 4ECM as the co-ordinator of the European network HARP. It illustrates some of the activities of HARP, and especially the fact that within HARP the interplay between Euclidean Harmonic Analysis and its counterpart on Lie groups is emphasized. Among other participants of HARP who have contributed to this area, one can mention D. Buraczewski (Wroclaw), E. Damek (Wroclaw), D. Debertol (Pisa), G. Garrigós (Madrid), A. Hulanicki (Wroclaw), Ph. Jaming (Orléans), D. Müller (Kiel), M. Peloso (Torino), F. Ricci (Pisa) and, indirectly, A. Vargas (Madrid).
References
[BB] Békollé, D. and A. Bonami. Estimates for the Bergman and Szegő projections in two symmetric domains of $\mathbb{C}^n$, Colloq. Math. 68 (1995), 81–100.
[BBPR] Békollé, D., A. Bonami, M. Peloso and F. Ricci. Boundedness of weighted Bergman projections on tube domains over light cones, Math. Z. 237 (2001), 31–59.
[BBGR] Békollé, D., A. Bonami, G. Garrigós and F. Ricci. Littlewood-Paley decompositions related to symmetric cones and Bergman projections in tube domains, Proc. London Math. Soc. 89 (2004), 317–360.
[BBGNPR] Békollé, D., A. Bonami, G. Garrigós, C. Nana, M. Peloso and F. Ricci. Lecture notes on Bergman projectors in tube domains over cones: an analytic and geometric viewpoint, Proceedings of the International Workshop in Classical Analysis, Yaoundé 2001. Available at http://www.uam.es/gustavo.garrigos.
[BV] Berline, N. and M. Vergne. Equations de Hua et noyau de Poisson, Lecture Notes in Math. 880 (1981), 1–51, Springer-Verlag.
[BBG] Bonami, A., J. Bruna and S. Grellier. On Hardy, BMO and Lipschitz spaces of invariant harmonic functions in the unit ball, Proc. London Math. Soc. 71 (1998), 665–696.
[BBDHPT] Bonami, A., D. Buraczewski, E. Damek, A. Hulanicki, R. Penney and B. Trojan. Hua system and pluriharmonicity for symmetric irreducible Siegel domains of type II, J. Funct. Anal. 188 (2002), 38–74.
[BBDHJ] Bonami, A., D. Buraczewski, E. Damek, Ph. Jaming and A. Hulanicki. Maximum boundary regularity of bounded Hua-harmonic functions on tube domains, J. Geom. Anal. 14 (2004), 457–486.
[Bu] Buraczewski, D. The Hua system on irreducible Hermitian symmetric spaces of nontube type, Ann. I. Fourier 54 (2004), 81–128.
[BDH] Buraczewski, D., E. Damek and A. Hulanicki. Bounded pluriharmonic functions on symmetric irreducible Siegel domains, Math. Z. 240 (2002), 169–195.
[DHMP] Damek, E., A. Hulanicki, D. Müller and M. Peloso. Pluriharmonic $H^2$ functions on symmetric irreducible Siegel domains, Geom. Funct. Anal. 10 (2000), 1090–1117.
[D] Debertol, D. Besov spaces and the boundedness of weighted Bergman projections over symmetric tube domains, preprint (2003).
[FK] Faraut, J. and A. Korányi. Analysis on Symmetric Cones, Oxford Math. Monographs, Oxford Sc. Publ., Clarendon Press, 1994.
[Fef] Fefferman, C. The multiplier problem for the ball, Ann. of Math. 94, 330–336.
[Fo] Folland, G. Spherical harmonic expansion of the Poisson-Szegő kernel for the ball, Proc. Amer. Math. Soc. 47 (1975), 401–408.
[Ga] Garrigós, G. Generalized Hardy spaces on tube domains over cones, Colloq. Math. 90 (2001), 213–251.
[Gi] Gindikin, S. G. Analysis on homogeneous domains, Russian Math. Surveys 19 (1964), 1–89.
[Gr] Graham, C. R. The Dirichlet problem for the Bergman Laplacian. I, Comm. Partial Differential Equations 8 (1983), 433–476.
[JK] Johnson, K. and A. Korányi. The Hua operators on bounded symmetric domains of tube type, Ann. of Math. 111 (1980), 589–608.
[LW] Łaba, I. and T. Wolff. A local smoothing estimate in higher dimensions, J. Anal. Math. 88 (2002), 149–171.
[S] Stein, E. M. Harmonic Analysis, Princeton University Press, Princeton, 1993.
[TV] Tao, T. and A. Vargas. A bilinear approach to cone multipliers II. Applications, Geom. Funct. Anal. 10 (2000), 216–258.
[T] Trojan, B. Hua-harmonic functions on homogeneous Siegel domains, preprint.
[W] Wolff, T. Local smoothing type estimates on $L^p$ for large p, Geom. Funct. Anal. 10 (2000), 1237–1288.
Aline Bonami
Université d’Orléans
Faculté des Sciences
Département de Mathématiques
BP 6759
F-45067 Orléans Cedex 2, France
e-mail:
[email protected]
Hyperbolic PDEs, Kinetic Formulation and Geometric Measure Theory
Yann Brenier
Abstract. One of the most fruitful interactions between kinetic theory and nonlinear hyperbolic PDEs has been the so-called kinetic formulation of conservation laws. Very recently, within the HYKE network, a beautiful structure theorem for non-smooth solutions of some non-linear hyperbolic PDEs has been obtained by De Lellis, Otto and Westdickenberg [DOW]. The proof is an unusual blend of kinetic ideas, blow-up techniques and tools coming from geometric measure theory.
1. Scalar conservation laws and BV functions
The simplest multidimensional non-linear hyperbolic equations, known as “scalar conservation laws” (see [Da, Se, BDLL], etc.), read:
$$\partial_t u + \sum_{i=1}^{d} \partial_{x_i}\big(A_i(u)\big) = 0, \qquad (1.1)$$
where $A : \mathbb{R} \to \mathbb{R}^d$ is a given smooth function, say Lipschitz continuous, $u : (t, x) \in \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}$ is an unknown function to be determined from its value $u_0$ at t = 0, and $\partial_t$, $\partial_{x_i}$ respectively denote the time derivative and the partial derivative with respect to $x_i$, for i = 1, . . . , d. For most data A and $u_0$, there is no global smooth solution to those equations. Typically, solutions become discontinuous in finite time and develop “shock singularities”, namely jumps across sets of codimension one. Kruzhkov and Volpert [Kr, Vo] showed, in the late 1960s, that there is a unique solution $u \in C^0(\mathbb{R}_+, L^1(\mathbb{R}^d))$ for all fixed A and $u_0 \in L^1(\mathbb{R}^d)$, provided u is required to satisfy the so-called entropy inequalities
$$\partial_t U(u) + \sum_{i=1}^{d} \partial_{x_i}\big(A_i^U(u)\big) \le 0, \qquad (1.2)$$
in the distribution sense, for all convex Lipschitz functions U, with
$$A^U(s) = \int_0^s A'(r)\, U'(r)\, dr, \quad s \in \mathbb{R}.$$
The method used by Kruzhkov and Volpert involves in an essential way the space $E = (L^\infty \cap BV)(\mathbb{R}^d)$ of all bounded Lebesgue measurable functions v such that
$$TV(v) := \sup_{h \neq 0} \int_{\mathbb{R}^d} \frac{|v(x + h) - v(x)|}{\sqrt{h_1^2 + \cdots + h_d^2}}\, dx < \infty.$$
More precisely, it is first shown that, for every initial condition $u_0$ in E, there is an entropy solution u = u(t, x) in $L^\infty(\mathbb{R}_+; E) \cap C^0(\mathbb{R}_+, L^1(\mathbb{R}^d))$. Next, it is shown that, for two such solutions u and $\tilde{u}$, the stability estimate
$$\int_{\mathbb{R}^d} |u(t, x) - \tilde{u}(t, x)|\, dx \le \int_{\mathbb{R}^d} |u_0(x) - \tilde{u}_0(x)|\, dx$$
holds true for all t ≥ 0. Finally, using a standard density argument, existence and uniqueness of solutions in $C^0(\mathbb{R}_+, L^1(\mathbb{R}^d))$ are established for all initial data in $L^1(\mathbb{R}^d)$. Qualitatively speaking, the BV space E is very well suited to describe discontinuous solutions with shock type singularities. Indeed, according to geometric measure theory [Fe, Gi, EG], the so-called “structure theorem” for BV functions asserts that any function v ∈ E has a rectifiable “jump” set of codimension one along which v takes two different values, one on each side of the set. (Of course this vague statement has to be made more precise.) Unfortunately, the structure theorem does not apply (at all) to entropy solutions u with general initial condition $u_0 \in L^1(\mathbb{R}^d) \setminus E$, since, then, u(t, ·) is not expected to enter the BV space E at any positive t. (See [DOW] for a more detailed discussion.)
2. Kinetic formulation of scalar conservation laws
It was established in [Br1, Br2, GM] that scalar conservation laws can be lifted to linear hyperbolic equations by introducing an extra variable $w \in \mathbb{R}$. In physical terms, according to the Maxwell-Boltzmann kinetic theory of dilute gases [CIP], the additional variable w can be interpreted as a (scalar) “momentum variable”. The linear equation corresponding to (1.1) reads:
$$\partial_t f + \sum_{i=1}^{d} A_i'(w)\, \partial_{x_i} f = 0, \qquad (2.1)$$
where f = f(t, x, w) and u = u(t, x) are related to each other through a lifting operator $L : u \mapsto f$ and a projection operator $P : f \mapsto u$. These operators were introduced in [Br2] and defined by:
$$Lu(t, x, w) = 1_{\{0 < w < u(t,x)\}} - 1_{\{0 > w > u(t,x)\}}, \qquad Pf(t, x) = \int_{-\infty}^{+\infty} f(t, x, w)\, dw.$$
Clearly, PL is just the identity map, while the “collapse” operator M = LP differs from the identity map, is a non-expansive operator in $L^1$ and enjoys an “entropy production” property. Of course, (2.1) is easily solved by
$$f(t, x, w) = G(t) f(0, \cdot, \cdot)(x, w) := f(0, x - tA'(w), w).$$
More precisely, it is shown in [Br1, Br2, GM], using BV estimates and Kruzhkov type analysis, that the entropy solution u(t, x) to (1.1) with initial condition $u_0$ is just the limit in $L^\infty(\mathbb{R}_+, L^1(\mathbb{R}^d))$, as $\tau \to 0^+$, of the approximate solution $u^{(\tau)}$ defined by:
$$u^{(\tau)}(t, x) = \big(P\, G(\tau)\, L\big)^n u_0, \qquad n\tau \le t < (n + 1)\tau, \quad n = 0, 1, 2, 3, \ldots$$
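As an illustration of the scheme $(P\,G(\tau)\,L)^n u_0$, the following Python sketch applies it to the one-dimensional Burgers flux $A(u) = u^2/2$ (so that $A'(w) = w$) on a periodic grid. The grid sizes, the midpoint quadrature in w, the linear interpolation used for the transport step and all function names are assumptions made for this example; they are not taken from [Br1, Br2, GM].

```python
import numpy as np

def lift(u, w):
    """L: u(x) -> f(x, w) = 1_{0<w<u(x)} - 1_{0>w>u(x)} on the grid w."""
    U = u[:, None]
    W = w[None, :]
    pos = (0 < W) & (W < U)
    neg = (U < W) & (W < 0)
    return pos.astype(float) - neg.astype(float)

def project(f, dw):
    """P: f(x, w) -> u(x), the integral of f in w (midpoint rule)."""
    return f.sum(axis=1) * dw

def transport(f, x, w, tau, a_prime):
    """G(tau): f(x, w) -> f(x - tau*A'(w), w), periodic in x with period 1.

    The shifted values are obtained by linear interpolation, which is an
    approximation introduced by this sketch.
    """
    out = np.empty_like(f)
    for j, wj in enumerate(w):
        out[:, j] = np.interp(x - tau * a_prime(wj), x, f[:, j], period=1.0)
    return out

def transport_collapse(u0, x, w, tau, n_steps, a_prime):
    """Apply (P G(tau) L) n_steps times to the initial data u0."""
    dw = w[1] - w[0]
    u = u0.copy()
    for _ in range(n_steps):
        u = project(transport(lift(u, w), x, w, tau, a_prime), dw)
    return u

if __name__ == "__main__":
    x = np.arange(200) / 200.0              # periodic grid on [0, 1)
    w = (np.arange(-64, 64) + 0.5) / 64.0   # midpoints covering (-1, 1)
    u0 = 0.5 + 0.4 * np.sin(2 * np.pi * x)
    u = transport_collapse(u0, x, w, tau=0.01, n_steps=50, a_prime=lambda s: s)
    print("range of u0:", u0.min(), u0.max())
    print("range of u at t = 0.5:", u.min(), u.max())
```

In this sketch the kinetic variable w is resolved only up to the grid spacing, so `project(lift(u, w), dw)` reproduces u only up to an error of that size; refining the w grid together with τ is what the limit $\tau \to 0^+$ corresponds to here.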
Alternatively, $f = Lu^{(\tau)}$ can be seen as the solution of the following PDE:
$$\partial_t f + \sum_{i=1}^{d} A_i'(w)\, \partial_{x_i} f = \sum_{n=1}^{\infty} (LPf - f)\, \delta(t - n\tau). \qquad (2.2)$$
A variant of (2.2), the so-called BGK model, was later considered by Perthame and Tadmor [PT]:
$$\partial_t f + \sum_{i=1}^{d} A_i'(w)\, \partial_{x_i} f = \frac{LPf - f}{\tau}. \qquad (2.3)$$
Again, using BV estimates, the authors prove the convergence of the approximate solution Pf to the correct entropy solution as the “relaxation time” τ > 0 tends to zero. Right after [PT], it was observed by Lions, Perthame and Tadmor [LPT1] that, without any approximation, equation (1.1) and the entropy conditions (1.2) can be directly formulated in kinetic style. This so-called “kinetic formulation” reads
$$\partial_t f + \sum_{i=1}^{d} A_i'(w)\, \partial_{x_i} f = \partial_w \mu, \qquad f = LPf, \qquad (2.4)$$
where μ = μ(t, x, w) is a nonnegative measure and can be seen as a Lagrange multiplier for the constraint f = LPf. This equation can be easily obtained either directly from (1.1), (1.2) or by letting τ → 0 in equation (2.3). The kinetic formulation (2.4) quickly turned out to be very useful and influential (although its generalization to systems of conservation laws seems impossible except for very peculiar systems, as in [BC]).
3. Averaging lemmas
The kinetic formulation (2.4) is very suitable for the use of the so-called “averaging lemmas”, one of the most powerful tools introduced in kinetic theory (see [GLPS, GG, DLM], etc.), going back to Golse, Perthame, Sentis [GPS] and Agoshkov [Ag]. (The best-known application is the famous DiPerna-Lions theorem showing existence of global solutions for the Boltzmann equation [DL].) Using averaging lemmas, the authors of [LPT2] showed that each entropy solution u of (1.1) immediately becomes smoother than its initial value $u_0$, provided A satisfies the following “genuine non-linearity condition”:
$$\big|\{w \in \mathbb{R} :\ \tau + A'(w) \cdot \xi = 0\}\big| = 0, \qquad \forall (\tau, \xi) \neq (0, 0), \qquad (3.1)$$
where | · | denotes the one-dimensional Lebesgue measure. (Notice that in the linear case $A'$ is a constant and the condition cannot be satisfied. In that case, it is obvious that singularities are preserved by evolution under (1.1).) Indeed, u can be written as the “average” u = Pf of the solution f to a kinetic equation (2.4), where the right-hand side has some limited but controlled distributional regularity (since μ is a nonnegative measure). Under condition (3.1), it follows from some refined averaging lemmas [DLM] that Pf belongs to a fractional time-space Sobolev space $W^{s,p}$ (where s < 1 and p depends
on A), which roughly means that fractional derivatives of order s in space and time are local $L^p$ functions, when the initial condition is just an $L^1$ function. This truly remarkable result shows (and quantifies) the dissipative effect of nonlinearities in a (formally) conservative and reversible PDE such as (1.1), without any dissipative term added. However, from a qualitative point of view, this result is somewhat disappointing. Indeed, a typical function belonging to a fractional space such as $W^{s,p}$ has a much larger and wilder singularity set than a BV function, and there is no hope of exhibiting a nice rectifiable codimension 1 jump set for such a function, as one would like.
4. Blow-up techniques for conservation laws
The averaging lemmas do not straightforwardly apply to the approximate equation (2.2) (although they do to (2.3)!), due to the singularity of the right-hand side (presence of delta measures in t). This difficulty was overcome by Vasseur [V1], who introduced for that purpose a powerful “blow-up” technique, borrowed from elliptic theory. This approach turned out to be useful for other applications. For instance, Vasseur was later able to prove the existence of traces for entropy solutions to multidimensional conservation laws without BV estimates (see [V3] and the related works [CF, CR]). A different application can be found in [V2]. The blow-up technique can be described, following [DOW], using compact space-time notations such as n = d + 1, y = (t, x), $a(w) = (1, A'(w))$. For each fixed solution $f(y, w) = L(u(y, \cdot))(w)$ to the kinetic formulation (2.4) with measure μ(y, w), we consider, for each fixed $y^\infty \in \mathbb{R}^n$, the blow-up family
$$u^{y^\infty, r}(y) = u(y^\infty + ry), \qquad \mu^{y^\infty, r}(A \times B) = r^{1-n}\, \mu\big((y^\infty + rA) \times B\big)$$
for small r > 0 and all Borel subsets A and B of, respectively, $\mathbb{R}^n$ and $\mathbb{R}$. This family is relatively compact (in $L^1$ for $u^{y^\infty, r}$ and in the space of Radon measures for $\mu^{y^\infty, r}$) and its limit points $u^\infty$, $f^\infty = Lu^\infty$, $\mu^\infty$ still satisfy (2.4), while, for most blow-up points $y^\infty$, $\mu^\infty$ has a much simpler structure than the original μ.
5. A structure theorem for entropy solutions
The most impressive application of blow-up techniques for scalar conservation laws is, in our opinion, the recent structure theorem [DOW] by De Lellis, Otto and Westdickenberg, which, in addition, involves many tools from geometric measure theory. The authors are able to show that each entropy solution to a genuinely non-linear multidimensional scalar conservation law (1.1) has a singularity set just like that of a typical BV function, although u itself is not in general a BV function. (Related works can be found in [DO, DR].) Roughly speaking, it is shown that there is an (n − 1)-dimensional set J on which, for almost every $y^\infty$, $\mu^\infty$ is non-trivial and factorizes as $\mu^\infty(x, w) = h^\infty(w)\, \nu^\infty(x)$, where $h^\infty$ is a BV function on $\mathbb{R}$ and $\nu^\infty$ a measure on $\mathbb{R}^n$, both depending only on the blow-up
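point $y^\infty$. Then, a careful classification of the blow-up points $y^\infty$ is performed, according to the behavior of $(u^\infty, h^\infty, \nu^\infty)$, viewed as a solution of:
$$\sum_{i=1}^{n} a_i(w)\, \partial_{x_i} f^\infty = \partial_w h^\infty \otimes \nu^\infty, \qquad f^\infty = Lu^\infty.$$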
Using a long series of geometric measure theoretical tricks, the authors manage to prove that, indeed, the set J is rectifiable and essentially behaves as the jump set of a BV function, although the corresponding entropy solution u has no reason to be a BV function itself (cf. [DW]). Of course, we refer to [DOW] for more details.
Acknowledgment. This work has been supported by the European IHP network “HYKE” HPRN-CT-2002-00282.
References
[Ag] V.I. Agoshkov, Spaces of functions with differential-difference characteristics and the smoothness of solutions of the transport equation, Dokl. Akad. Nauk. SSSR 276 (1984) 1289–1293.
[AF] L. Ambrosio, N. Fusco, Functions of bounded variation and free discontinuity problems, Oxford Mathematical Monographs, The Clarendon Press, Oxford University Press, New York, 2000.
[BDLL] G. Boillat, C. Dafermos, P. Lax, T.P. Liu, Recent mathematical methods in nonlinear wave propagation, Lecture Notes in Math., 1640, Springer, Berlin, 1996.
[Br1] Y. Brenier, Une application de la symétrisation de Steiner aux équations hyperboliques: la méthode de transport et écroulement, C. R. Acad. Sci. Paris Ser. I Math. 292 (1981) 563–566.
[Br2] Y. Brenier, Résolution d’équations d’évolution quasilinéaires en dimension N d’espace à l’aide d’équations linéaires en dimension N + 1, J. Differential Equations 50 (1983) 375–390.
[BC] Y. Brenier, L. Corrias, A kinetic formulation for multi-branch entropy solutions of scalar conservation laws, Ann. Inst. H. Poincaré Anal. Non Linéaire 15 (1998) 169–190.
[CIP] C. Cercignani, R. Illner, M. Pulvirenti, The mathematical theory of dilute gases, Applied Mathematical Sciences, 106, Springer-Verlag, New York, 1994.
[CF] G.-Q. Chen, H. Frid, Divergence-measure fields and hyperbolic conservation laws, Arch. Ration. Mech. Anal. 147 (1999) 89–118.
[CR] G.-Q. Chen, M. Rascle, Initial layers and uniqueness of weak entropy solutions to hyperbolic conservation laws, Arch. Ration. Mech. Anal. 153 (2000) 205–220.
[Da] C. Dafermos, Hyperbolic conservation laws in continuum physics, Grundlehren der Mathematischen Wissenschaften, 325, Springer-Verlag, 2000.
[DO] C. De Lellis, F. Otto, Structure of entropy solutions to the eikonal equation, J. Eur. Math. Soc. (JEMS) 5 (2003) 107–145.
[DOW] C. De Lellis, F. Otto, M. Westdickenberg, Structure of entropy solutions for multi-dimensional scalar conservation laws, Arch. Ration. Mech. Anal. 170 (2003) 137–184.
[DW] C. De Lellis, M. Westdickenberg, On the optimality of velocity averaging lemmas, Ann. Inst. H. Poincaré Anal. Non Linéaire 20 (2003) 1075–1085.
[DR] C. De Lellis, T. Rivière, The rectifiability of entropy measures in one space dimension, J. Math. Pures Appl. (9) 82 (2003) 1343–1367.
[DL] R.J. DiPerna, P.-L. Lions, On the Cauchy problem for Boltzmann equations: global existence and weak stability, Ann. of Math. (2) 130 (1989) 321–366.
[DLM] R.J. DiPerna, P.-L. Lions, Y. Meyer, $L^p$ regularity of velocity averages, Ann. Inst. H. Poincaré Anal. Non Linéaire 8 (1991) 271–287.
[EG] L.C. Evans, R.F. Gariepy, Measure theory and fine properties of functions, Studies in Advanced Mathematics, CRC Press, Boca Raton, FL, 1992.
[Fe] H. Federer, Geometric measure theory, Die Grundlehren der mathematischen Wissenschaften, 153, Springer-Verlag, New York, 1969.
[GG] P. Gérard, F. Golse, Averaging regularity results for PDEs under transversality assumptions, Comm. Pure Appl. Math. 45 (1992) 1–26.
[GM] Y. Giga, T. Miyakawa, A kinetic construction of global solutions of first order quasilinear equations, Duke Math. J. 50 (1983) 505–515.
[Gi] E. Giusti, Minimal surfaces and functions of bounded variation, Monographs in Mathematics, 80, Birkhäuser Verlag, Basel, 1984.
[GLPS] F. Golse, P.-L. Lions, B. Perthame, R. Sentis, Regularity of the moments of the solution of a transport equation, J. Funct. Anal. 76 (1988) 110–125.
[GPS] F. Golse, B. Perthame, R. Sentis, Un résultat de compacité pour les équations de transport et application au calcul de la limite de la valeur propre principale d’un opérateur de transport, C. R. Acad. Sci. Paris Ser. I Math. 301 (1985) 341–344.
[Kr] S.N. Kružkov, First order quasilinear equations with several independent variables, Mat. Sb. (N.S.) 81 (123) (1970) 228–255.
[LPT1] P.-L. Lions, B. Perthame, E. Tadmor, Formulation cinétique des lois de conservation scalaires multidimensionnelles, C. R. Acad. Sci. Paris Ser. I Math. 312 (1991) 97–102.
[LPT2] P.-L. Lions, B. Perthame, E. Tadmor, A kinetic formulation of multidimensional scalar conservation laws and related equations, J. Amer. Math. Soc. 7 (1994) 169–191.
[PT] B. Perthame, E. Tadmor, A kinetic equation with kinetic entropy functions for scalar conservation laws, Comm. Math. Phys. 136 (1991) 501–517.
[Se] D. Serre, Systems of conservation laws, 1 and 2, ch. 9.6 and 10.1, Cambridge University Press, Cambridge, 2000.
[V1] A. Vasseur, Convergence of a semi-discrete kinetic scheme for the system of isentropic gas dynamics with γ = 3, Indiana Univ. Math. J. 48 (1999) 347–364.
[V2] A. Vasseur, Time regularity for the system of isentropic gas dynamics with γ = 3, Comm. Partial Differential Equations 24 (1999) 1987–1997.
[V3] A. Vasseur, Strong traces for solutions of multidimensional scalar conservation laws, Arch. Ration. Mech. Anal. 160 (2001) 181–193.
[Vo] A.I. Volpert, Spaces BV and quasilinear equations, Mat. Sb. (N.S.) 73 (115) (1967) 255–302.
Yann Brenier, CNRS, Université de Nice, on leave from Université Paris 6, France e-mail: [email protected]
Random Dynamics in Spatially Extended Systems
F. den Hollander
Abstract. This short contribution describes the scientific programme “Random Dynamics in Spatially Extended Systems” that is supported by the European Science Foundation. In this programme, which runs over the period 2002–2006, 13 European countries participate. The main activities of the programme are listed, and a brief sketch is given of some of the main developments and future challenges in each of the eight research themes the programme is targeting.
1. Activities
Random Dynamics in Spatially Extended Systems (RDSES) is a scientific programme that is running with the support of the European Science Foundation (2002–2006). The programme centres around mathematical statistical physics. Spatially extended systems consist of a large number of components that interact locally but that may nevertheless exhibit a global dependence, resulting in anomalous fluctuation phenomena and phase transitions. The main goal of the programme is to study the random dynamics, acting on the components of such systems, through the application of space-time scaling arguments and probabilistic limiting techniques. The challenge is to give a precise mathematical treatment of the interesting and complex physical phenomena that arise from this random dynamics. RDSES focusses on the following eight research themes in equilibrium and nonequilibrium statistical physics:
(a) Gibbsian vs. non-Gibbsian spin systems.
(b) Polymers and self-interacting random processes.
(c) Interfaces and surface phenomena.
(d) Disordered media.
(e) Relaxation to equilibrium and metastability.
(f) Hydrodynamic behavior of conservative systems.
(g) Entropy production and fluctuations far from equilibrium.
(h) Granular media and sandpile dynamics.
“European network lecture” delivered at the 4th European Congress of Mathematics, 27 June–3 July, 2004, Stockholm, Sweden.
In total 13 European countries are participating in RDSES: Austria, Belgium, Czech Republic, Denmark, Finland, France, Germany, Hungary, Netherlands, Poland, Sweden, Switzerland and United Kingdom. Each country has a representative on the steering committee, which oversees the development of the programme. Chair: F. den Hollander. The main activities of the programme are:
(1) Visitor exchange: short scientific visits of 1–2 weeks. A call is sent out 4 times a year to some 120 junior and senior researchers. So far 50 applications have been granted. Special care is taken that the call reaches young people.
(2) Workshops: 3–5 day meetings on topics selected by the steering committee. Since the start, 10 workshops have been supported throughout Europe:
– Constructing Non-Equilibrium Statistical Mechanics (November 2002, Leuven, Belgium);
– Statistical Mechanics and Probability Theory (March 2003, Marseille, France);
– Random Walks in Random Environments (August 2003, Cambridge, United Kingdom);
– Random Matrix Theory (September 2003, Gregynog Hall, Wales, United Kingdom);
– Gibbs vs. non-Gibbs in Statistical Mechanics and Related Fields (December 2003, Eindhoven, Netherlands);
– Interacting Particle Systems: New Trends, with Application in Biology and Economy (January 2004, Paris, France);
– Young European Probabilists I: Conformal Invariance, Scaling Limits and Percolation (April 2004, Eindhoven, Netherlands);
– Equilibrium and Dynamics of Spin Glasses (April 2004, Ascona, Switzerland);
– Statistical Mechanics and Interacting Particle Systems (June 2004, Rouen, France);
– Stochastic-Geometric and Combinatorial Ideas in Statistical Mechanics (June 2004, Gothenburg, Sweden).
Targeted topics for the future are: hydrodynamic scaling, metastability, ageing in disordered systems, and polymers. Two more workshops for Young European Probabilists are scheduled for 2005 and 2006, to train junior researchers.
(3) Summer schools: 2–4 week tutorial programmes for junior researchers. One summer school on Mathematical Statistical Mechanics was organised in Prague, Czech Republic, in July 2003. A follow-up is planned for 2006. A large summer school on Mathematical Statistical Physics will take place in Les Houches, France, in July 2005. There will be lectures on 15 hot topics
by top researchers from Europe and from North and South America. These topics are intended to be a road map for mathematical physics in the next decade.
(4) Meetings of the steering committee. The steering committee meets once a year. Thus far, meetings were held in Strasbourg (April 2002), Cambridge (August 2003) and Gothenburg (June 2004). During these meetings the activities of the programme are discussed, the workshop and summer school topics are selected, and strategic discussions take place on the development of the research area in the participating countries and in Europe.
RDSES maintains a homepage at the ESF website: www.esf.org/rdses This homepage describes the mission, goals and scientific background of the programme, as well as details of the various activities that are being undertaken. Suggestions and comments are welcome. In March 2003, an ESF brochure for RDSES was printed, which has been widely distributed. Copies are available upon request from the ESF Administrative Assistant to the programme: Ms. C. Werner, e-mail: [email protected]
Mathematical statistical physics is an eclectic research area. The aim of the programme is to bring together the various groups that are active in this area within Europe. RDSES also acts as a forum for the development of ideas and actions, as witnessed by the start-up of a number of bilateral collaborations that grew out of the RDSES activities. Mathematical statistical physics is an interdisciplinary research area, with interfaces towards physics, chemistry, computer science, the life sciences, engineering, economics and telecommunication. RDSES provides training in the analysis and modelling of complex dynamical processes via the propagation of a common language and the stimulation of international exchange.
2. Research themes
In this section we give a brief sketch of some of the main developments and future challenges in each of the eight research themes that are targeted by RDSES.
The aim is to give the reader a flavor of what is going on and to guide him/her to some of the relevant literature. Obviously, it is not possible to do full justice to the field. (a) Gibbsian vs. non-Gibbsian spin systems. Gibbs theory, which has been successful for almost a century, aims at describing physical systems in equilibrium. Such systems consist of countably many interacting components, often referred to as spins, that are subject to a local interaction, among themselves and/or with an external field. This interaction is given by a Hamiltonian, which assigns an energy to each spin configuration. In equilibrium, the probability of a spin configuration is proportional to the negative exponential of its energy, the
F. den Hollander
so-called Boltzmann weight factor. The Hamiltonian typically contains one or more relevant parameters, such as temperature or magnetic field. Depending on the type of Hamiltonian, the system may undergo a phase transition along a curve of critical values in the space of parameters. On or close to this curve the system exhibits long-range dependence with universal scaling properties. At the extremities of this curve the system is believed to be scale invariant.

More recently, it has become clear that Gibbsianness out of equilibrium is rare: many physical systems that are subjected to some dynamics do not allow for a Gibbsian description, due to the presence of a non-local interaction that cannot be properly described by a Hamiltonian. Examples are spin systems subject to random dynamics, to renormalization transformations or to disorder. For instance, a high-temperature Glauber spin-flip dynamics applied to a low-temperature Ising-spin Gibbs measure may destroy the Gibbs property in finite time and may afterwards restore it.

Currently there is intense research activity to classify various possible scenarios for non-Gibbsianness and to investigate how much of classical Gibbs theory can be saved. A particularly important notion, namely that of weak Gibbsianness introduced by Dobrushin, gives focus to these efforts. Here the idea is that a Hamiltonian description is still possible for “most” spin configurations, after some “bad” configurations (of measure zero) are discarded. It is still unclear what the full physical consequences of this notion are. Some systems turn out to be weak Gibbs, others not. One challenge is to find an algorithm that decides non-Gibbsianness. Another challenge is to understand Gibbsianness under conservative dynamics and under non-reversible dynamics.

Georgii [12] is a key monograph for Gibbs theory. A fundamental paper on the issue of Gibbs vs. non-Gibbs is Van Enter, Fernández and Sokal [9].
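The Boltzmann weight described under theme (a) can be made concrete on a very small system. The following is only a minimal sketch, not taken from the text: it assumes a finite open Ising chain with nearest-neighbour coupling and a homogeneous external field, and the names `ising_energy`, `gibbs_distribution`, `coupling`, `field` and `beta` (the inverse temperature) are introduced here purely for illustration.

```python
import itertools
import math

def ising_energy(spins, coupling=1.0, field=0.0):
    """Energy of a finite open chain: -J * sum s_i s_{i+1} - h * sum s_i (assumed Hamiltonian)."""
    pair_sum = sum(a * b for a, b in zip(spins, spins[1:]))
    return -coupling * pair_sum - field * sum(spins)

def gibbs_distribution(n_sites, beta, coupling=1.0, field=0.0):
    """Map each spin configuration to its probability, proportional to exp(-beta * energy)."""
    configs = list(itertools.product((-1, 1), repeat=n_sites))
    weights = [math.exp(-beta * ising_energy(c, coupling, field)) for c in configs]
    partition_sum = sum(weights)  # normalising constant
    return {c: w / partition_sum for c, w in zip(configs, weights)}

# Three spins, zero field: aligned configurations carry the largest weight.
dist = gibbs_distribution(n_sites=3, beta=1.0)
```

Enumeration is only feasible for a handful of spins, since the number of configurations doubles with every site; the infinite-volume theory discussed in the text is needed precisely because such brute-force normalisation is not available there.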
For a recent overview of the area, see the proceedings of the workshop in December 2003 that was supported by RDSES, edited by Van Enter, Le Ny and Redig [10]. Key references for interacting particle systems are the monographs by Liggett [22], [23].

(b) Polymers and self-interacting random processes. The spatial and temporal behavior of polymer chains is an exciting area, with applications in the physical, chemical, biological and engineering sciences. Mathematics has been involved since the 1950s, although full immersion has taken place only over the last 15 years or so. Polymer chains are characterised by an irregular folding in space and by a long-range interaction (remote parts of the chain meet and interact with each other). As such they are rather different from more classical objects like Brownian motion, percolation or the contact process. There is a host of interesting models: self-repellent polymers, elastic polymers, charged polymers, polymers in a random potential, copolymers near interfaces. Many of these models are still largely unexplored. The self-avoiding walk, which is the archetypical model of a polymer, is described in the monograph by Madras and Slade [24]. For an
overview of a variety of different polymer models in a more physical context, see the monograph by Vanderzande [37].

Copolymers are polymer chains consisting of a random concatenation of monomers of two (or more) types, e.g., hydrophobic and hydrophilic. In the presence of an interface separating two immiscible fluids, e.g., oil and water, the copolymer may or may not localise near the interface. Which of these two scenarios it chooses depends on the Hamiltonian of the interaction, which favors one type of monomer in one type of fluid and vice versa. A phase transition between the two scenarios depends on the parameters in the Hamiltonian and on the shape of the interface. The techniques to study this phase transition rely on the theory of large deviations. For an introduction to large deviation theory, see the monograph by den Hollander [16]. For an overview of the behavior of random copolymers, see Soteros and Whittington [32].

Branched polymers, consisting of a network of polymer chains appropriately tied together, turn out to scale to super-Brownian motion in high dimensions. The same type of scaling occurs in a variety of models that are (or turn out to be) close to branching random walk, such as critical percolation, lattice trees and the critical contact process. The key technique to prove this scaling is the lace expansion, a diagrammatic perturbation technique that is able to deal with complex interactions in high dimensions. A key reference for percolation is the monograph by Grimmett [13]. For an overview of the lace expansion and its applications, see Slade [31]. For related aspects, see the contribution by T. Luczak elsewhere in this volume.

In dimension two, conformal invariance and the Schramm-Löwner evolution are central to a whole range of models at criticality. This theory, which combines ideas from stochastic analysis and conformal map theory, has led to a spectacular development, providing the identification of scaling limits and of associated critical exponents (the latter describe the behavior close to criticality). Overviews are given in Werner [38] and in Kager and Nienhuis [18]. The candidate scaling limits of a variety of discrete critical models have been identified, but it remains a challenge to prove that these scaling limits actually exist and are conformally invariant. This has so far been achieved for only a few models, like critical site percolation on the triangular lattice, loop-erased random walk, uniform spanning trees, and the harmonic explorer. Still open are the self-avoiding walk and the Potts model. See the contributions by O. Schramm and W. Werner elsewhere in this volume.

(c) Interfaces and surface phenomena. Interfaces in spatially extended systems arise from geometric constraints or from inhomogeneous initial conditions in combination with conservation laws. Examples are wetting phenomena (droplets interacting with a wall) and metastable phenomena (droplets acting as the energy barrier for a crossover between different phases).

Wulff droplets have recently been the object of intense investigation. For a system in equilibrium at a first-order phase transition, a large droplet of
one phase inside another phase assumes the so-called Wulff shape. On the macroscopic scale, the shape is deterministic and is the solution of a variational problem involving the surface tension associated with the interface between the two phases. Examples occur in Ising and Potts models and in solid-on-solid models. For an overview, see Bodineau, Ioffe and Velenik [5].

An open question is to identify the shape of large droplets subject to a random dynamics, such as large critical droplets for metastable transitions between different phases. For Ising spins subject to a Glauber spin-flip dynamics, it was shown by Schonmann and Shlosman [30] that, close to the phase transition line, the critical droplet has the Wulff shape, i.e., the dynamics manages to keep the droplet close to (quasi-)equilibrium while it is growing, shrinking and moving. It is a challenge to extend this result to the lattice gas subject to a Kawasaki hopping dynamics. Here, particle conservation turns out to be a serious obstacle, since it causes long-range dependence and depletion of the gas around growing droplets. Anisotropic dynamics are expected not to preserve the Wulff shape.

On the mesoscopic scale, the interface of droplets typically shows anomalous fluctuations. Remarkably, these fluctuations exhibit a high degree of universality. In dimension two considerable progress has been made, with Wigner’s semi-circle law and the Tracy-Widom distribution appearing as universal attractors for the scaling. A unification is envisioned for a whole range of different models, all in some way related to the behavior of spectra of large random matrices. Here a new world is opening up, linking geometry and analysis. A key reference is Baik, Deift and Johansson [2]. See also the contribution by A. Guionnet elsewhere in this volume. Simulations indicate that limiting shapes are delicate objects, which typically retain part of the information of the underlying lattice structure.

(d) Disordered media. This has been a very active area for several decades already, with applications to amorphous materials, neural networks, chemical catalysis and biomolecules. Percolation, the random field Ising model, the Hopfield model, the random energy model, and random walk in random environment are by now classical.

Exciting recent developments concern spin glasses (“random magnetic alloys”), in particular, the Sherrington-Kirkpatrick model and the Edwards-Anderson model. Here, new types of phase transitions are expected to occur due to a competition of interactions (“frustration”), causing a highly complex energy landscape given by a random Hamiltonian. For the Sherrington-Kirkpatrick model, which is a mean-field model with a long-range interaction, Parisi predicted the occurrence of replica symmetry breaking. After many years of hard effort, this prediction has recently been proved to be correct by Guerra [14] and Talagrand [35]. (See the contribution by F. Guerra elsewhere in this volume.) In Parisi’s solution, a key concept is the ultrametric structure of the ground states. The role of this ultrametric structure has been elucidated through the work of Aizenman, Sims and Starr [1]. The techniques
that are developed in this area find application in a range of other fields, including coding and hard combinatorial optimisation. An overview of spin glass theory can be found in the monograph by Talagrand [34].

The Edwards-Anderson model has a short-range interaction. Its relation to the Sherrington-Kirkpatrick model remains unclear: replica symmetry breaking may not occur in short-range models (see Newman [28] for an alternative scenario).

Caricatures of spin glasses, such as the Hopfield model and the random energy model, are by now well understood. They shed light on the universality of ultrametricity in mean-field models. See Bovier and Kurkova [6] for an overview of the developments around random energy models.

Ageing in disordered media is a new challenge on the horizon. Here one studies the evolution of systems that go through a cascade of metastable equilibria. This results in a correlation structure of the system that evolves with time. The behavior of spin glasses subject to a random dynamics is still largely open.

Random walk in random environment has recently gone through major developments, especially in higher dimensions, where now some of the hard questions are finally reaching a solution. See Zeitouni [40] for an overview.

Catalytic branching models, describing a reactant evolving in the presence of a catalyst, are models of disorder with random dynamics. This is an area that is growing fast, with applications in population dynamics. See the overviews by Dawson and Fleischmann [7] and by Klenke [21].

(e) Relaxation to equilibrium and metastability. A physical system out of equilibrium tends to relax towards equilibrium. This relaxation may, however, be extremely slow, a phenomenon that is called metastability. Consider, for instance, a system in equilibrium with parameters on one side of a first-order phase transition curve. Suppose that the parameters are suddenly changed to values corresponding to the opposite side of this curve.
Then the system wants to relax from the old phase to the new phase, but in order to do so it has to overcome an energy barrier. Before crossing this energy barrier, the system persists for a long time in what is called a metastable state, which is characterised by many unsuccessful attempts to cross the barrier. The crossover is typically achieved after the system creates a critical droplet of the new phase inside the old phase. Several models are of interest, such as Ising spins under a Glauber spin-flip dynamics or the lattice gas under a Kawasaki hopping dynamics. The challenge is to give a detailed description of the crossover time and of the typical trajectories followed by the system prior to the crossover. The theory either relies on large deviation theory for the trajectories of the system (“pathwise approach”) or on a close analogy between metastable transition times and capacities in electric networks (“potential-theoretic approach”).
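The persistence in a metastable state described above can be observed in a toy simulation. The sketch below is an illustration under stated assumptions, not the setting of the cited works: it uses a discrete-time heat-bath version of Glauber spin-flip dynamics on a small two-dimensional torus, with coupling J = 1, and all names (`plus_probability`, `glauber_sweeps`, `magnetisation`) and parameter values are hypothetical choices made for the example.

```python
import math
import random

def plus_probability(local_field, beta):
    """Heat-bath probability that the updated spin is set to +1."""
    return 1.0 / (1.0 + math.exp(-2.0 * beta * local_field))

def glauber_sweeps(spins, beta, field, sweeps, rng):
    """Run single-site heat-bath updates on an L x L torus, in place."""
    size = len(spins)
    for _ in range(sweeps * size * size):
        i, j = rng.randrange(size), rng.randrange(size)
        neighbours = (spins[(i + 1) % size][j] + spins[(i - 1) % size][j]
                      + spins[i][(j + 1) % size] + spins[i][(j - 1) % size])
        local_field = neighbours + field  # coupling J = 1 is an assumption
        spins[i][j] = 1 if rng.random() < plus_probability(local_field, beta) else -1
    return spins

def magnetisation(spins):
    """Average spin value over the whole lattice."""
    size = len(spins)
    return sum(map(sum, spins)) / (size * size)

# Start in the "wrong" phase (all minus) with a small positive field at low temperature.
rng = random.Random(0)
lattice = [[-1] * 8 for _ in range(8)]
glauber_sweeps(lattice, beta=2.0, field=0.2, sweeps=10, rng=rng)
final_m = magnetisation(lattice)
```

With these parameters the plus phase is favoured, yet over short runs one would expect the magnetisation to stay close to -1, because an isolated plus spin in a minus background is energetically very costly; the crossover requires a critical droplet, whose formation time grows rapidly as the temperature is lowered.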
In two dimensions substantial progress has been made and key questions have been settled for a variety of different models. In three dimensions the geometry of critical droplets is rather complex and progress has only been partial. Describing metastable behavior under a conservative dynamics is a hard challenge.

For an overview of the history and the developments in metastability, see the monograph by Olivieri and Vares [29]. For a critical comparison between Glauber and Kawasaki dynamics, as well as for mathematical references to droplet growth in metastability, see den Hollander [17].

(f) Hydrodynamic behavior of conservative systems. One of the basic problems of non-equilibrium statistical mechanics is the derivation of hydrodynamic equations. On the proper macroscopic space-time scales, interacting particle systems develop autonomous behavior for a collection of locally conserved quantities, such as density, momentum and energy. The evolution of these quantities is given by a set of coupled partial differential equations (PDEs). For deterministic microscopic dynamics only mild progress has been made, with even issues like ergodicity and mixing being still largely open. For random microscopic dynamics (i.e., in the presence of noise), progress has been fast over the past decade, especially for those systems whose quasi-equilibria conditioned on the locally conserved quantities are well understood.

The type of PDE depends on the scaling that is chosen. Eulerian scaling (space scales like time) leads to hyperbolic PDEs, while diffusive scaling (space scales like the square root of time) leads to parabolic PDEs. Diffusive systems are generally well understood; hyperbolic systems are much less so, since they may develop “shocks” in finite time. For an overview of hydrodynamic scaling, see the monographs by Spohn [33] and by Kipnis and Landim [19].
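As an illustration of the two scalings, one may keep in mind the simple exclusion process on the integer lattice. The equations below are the standard textbook limits for that model, stated here as an example and not taken from the text; the symbols are assumptions of this sketch: ε is the lattice spacing, i the microscopic site, τ the microscopic time, p and q the jump rates to the right and left, and ρ(x,t) the macroscopic particle density.

```latex
% Eulerian scaling: x = \varepsilon i, \; t = \varepsilon \tau,
% asymmetric exclusion with p \neq q: a hyperbolic conservation law (Burgers type)
\partial_t \rho + (p - q)\, \partial_x \bigl[ \rho (1 - \rho) \bigr] = 0 .

% Diffusive scaling: x = \varepsilon i, \; t = \varepsilon^{2} \tau,
% symmetric exclusion with p = q = \tfrac12: a parabolic equation (heat equation)
\partial_t \rho = \tfrac12\, \partial_x^{2} \rho .
```

The first equation can develop discontinuities from smooth initial data, which is the shock phenomenon mentioned above; the second smooths out any initial profile.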
The large deviation techniques developed by Kipnis, Olla and Varadhan [20], and the relative entropy method of Yau [39], yield a derivation of the hydrodynamic equations in a rather broad context of models. Both Eulerian and diffusive scaling can be handled. In the former, the shocks and their microscopic counterpart have been the subject of intense research. Typically, the methods that are employed only give the hydrodynamic equation until the first time when a shock appears. A major breakthrough in the understanding of hydrodynamics with shocks was made in recent work by Fritz and Tóth [11], where, with the help of the analytic theory of conservation laws, the validity of the hydrodynamic equation is obtained beyond shocks. This promises to open up a new line of research. Particularly challenging is the analysis of multi-component hyperbolic systems, where attractiveness typically fails, causing trouble with uniqueness issues. Important progress has been achieved in the recent paper by Tóth and Valkó [36].

(g) Entropy production and fluctuations far from equilibrium. For systems in a non-equilibrium steady state, such as a gas flowing through a pipe or a fluid in
contact with two heat reservoirs at different temperatures, it is no longer possible to use considerations that are valid for systems in or close to equilibrium. Especially when driven by large external fields, the system is beyond the regime where linear response theory can be applied. Therefore it is of key importance to search for general principles in non-equilibrium, in particular, symmetry relations between the transport coefficients.

A non-equilibrium steady state is non-reversible, and so it produces entropy. The study of entropy production and its fluctuations is therefore a central issue. For a discussion, see Maes, Redig and Van Moffaert [26]. The Gallavotti-Cohen fluctuation theorem expresses a symmetry property for the large deviations of the entropy production that holds in complete generality. Close to equilibrium, this symmetry reduces to the classical Onsager reciprocity relations for the response coefficients.

In a recent approach, put forward by Maes [25], a non-equilibrium steady state is viewed as a Gibbs measure on space-time trajectories. In this setting, the Gallavotti-Cohen fluctuation theorem immediately follows from the Dobrushin-Lanford-Ruelle conditions on the space-time Gibbs measure. The entropy production is precisely the time-reversal antisymmetric part of the Hamiltonian of the space-time Gibbs measure. The Gallavotti-Cohen fluctuation theorem can thus be viewed as similar to the Ward identities in quantum field theory.

The area is witnessing the slow emergence of a microscopic theory, from which not only the thermodynamics of irreversible processes close to equilibrium can be derived, but which promises to go far beyond the linear regime. Further challenges in the study of non-equilibrium systems are to derive Fourier’s law (relating macroscopic flow to external field) and to construct non-equilibrium fluctuation symmetries for quantum systems.

(h) Granular media and sandpile dynamics.
Granular media are systems whose components have a physical shape, rather than being idealised point particles. Examples are powder, sand, grains or rocks. The question is how this shape affects the microscopic, mesoscopic and macroscopic behavior. Inelastic collisions between the components and internal degrees of freedom play an important role. For the proceedings of a recent workshop in this area, see Helbing, Hermann, Schreckenberg and Wolf [15]. Mathematically, the area is largely undeveloped. In sandpile dynamics, grains of sand topple and cause avalanches, i.e., a motion involving a large number of components at the same time. Since these avalanches are highly non-local, it is hard to even define the dynamics properly. The concept of self-organised criticality (SOC), originally proposed by Bak, Tang and Wiesenfeld [3], has become central to a variety of physical, chemical and biological systems. SOC means that the system is “dynamically tuned towards criticality”, even though it has no parameter to tune. In other words, the system exhibits “power-law decay of avalanche sizes” (power law decay of correlations being typical for systems at criticality). Experiments on
granular media, such as sandpiles, have confirmed the presence of these power laws.

One outstanding paradigm of SOC is the so-called abelian sandpile model, which allows for a mathematical treatment because of an underlying abelian group structure, originally revealed by Dhar [8]. This model has strong connections with fundamental objects in graph theory, such as the discrete Laplacian, wired spanning forests, and two-component spanning trees. In two dimensions, physicists predict a conformal field theory in the continuum limit. The abelian sandpile model also appears in algebraic combinatorics, in discrete potential theory, in group theory, and in computer science (see Biggs [4]).

From the perspective of mathematical physics, the limit of infinite graphs is important, corresponding to what is called the thermodynamic limit in statistical physics. The first results in this direction have been obtained by Maes, Redig and Saada [27] for the abelian sandpile model on an infinite tree. By now, much progress has been made in a global understanding of the ergodic theory of this system, and its relation to random walks on compact groups. A challenge is to understand the basic features of abelian sandpile models in high dimensions.

Acknowledgment. The author is grateful to Aernout van Enter and Frank Redig for commenting on a draft of this paper.

References

[1] M. Aizenman, R. Sims and S. Starr, An extended variational principle for the SK spin-glass model, Phys. Rev. B68 (2003) 214403.
[2] J. Baik, P. Deift and K. Johansson, On the distribution of the length of the second row of a Young diagram under Plancherel measure, Geom. Func. Anal. (2000) 702–731.
[3] P. Bak, K. Tang and K. Wiesenfeld, Self-organized criticality, Phys. Rev. A38 (1988) 364–374.
[4] N.L. Biggs, Chip-firing and the critical group of a graph, J. Algebraic Combin. 9 (1999) 25–45.
[5] T. Bodineau, D. Ioffe and Y. Velenik, Rigorous probabilistic analysis of equilibrium crystal shapes, J. Math. Phys. 41 (2000) 1033–1098.
[6] A. Bovier and I. Kurkova, Rigorous results on some simple spin glass models, Markov Proc. Related Fields 9 (2003) 209–242.
[7] D.A. Dawson and K. Fleischmann, Catalytic and mutually catalytic branching, in: Infinite-Dimensional Stochastic Analysis (eds. Ph. Clément, F. den Hollander, J. van Neerven and B. de Pagter), Royal Netherlands Academy of Arts and Sciences, Amsterdam, 2000, pp. 145–170.
[8] D. Dhar, Self-organised critical state of sandpile automaton models, Phys. Rev. Lett. 64 (1990) 1613–1616.
[9] A.C.D. van Enter, R. Fernández and A.D. Sokal, Regularity properties and pathologies of position-space renormalization-group transformations: scope and limitations of Gibbsian theory, J. Stat. Phys. 72 (1993) 879–1167.
[10] A.C.D. van Enter, A. Le Ny and F. Redig (eds.), Gibbs vs. non-Gibbs in Statistical Mechanics and Related Fields, Markov Proc. Related Fields 10 (2004) 377–564.
[11] J. Fritz and B. Tóth, Derivation of the Leroux system as the hydrodynamic limit of a two-component lattice gas, Commun. Math. Phys. 249 (2004) 1–27.
[12] H.-O. Georgii, Gibbs Measures and Phase Transitions, De Gruyter Studies in Mathematics 9, de Gruyter, Berlin, 1988.
[13] G. Grimmett, Percolation (2nd ed.), Springer, Berlin, 1999.
[14] F. Guerra, Broken replica symmetry bounds in the mean field spin glass model, Commun. Math. Phys. 233 (2003) 1–12.
[15] D. Helbing, H.J. Hermann, M. Schreckenberg and D.E. Wolf (eds.), Traffic and Granular Flow, Springer, Berlin, 2000.
[16] F. den Hollander, Large Deviations, Fields Institute Monographs 14, American Mathematical Society, Providence, RI, 2000.
[17] F. den Hollander, Metastability under stochastic dynamics, Stoch. Proc. Appl. 114 (2004) 1–26.
[18] W. Kager and B. Nienhuis, A guide to Stochastic Löwner Evolution and its applications, J. Stat. Phys. 115 (2004) 1149–1229.
[19] C. Kipnis and C. Landim, Scaling Limits of Interacting Particle Systems, Springer, Berlin, 1999.
[20] C. Kipnis, S. Olla and S.R.S. Varadhan, Hydrodynamics and large deviations for simple exclusion processes, Comm. Pure Appl. Math. 42 (1989) 115–137.
[21] A. Klenke, A review on spatial catalytic branching, in: Stochastic Models, CMS Conf. Proc. 26, American Mathematical Society, 2000, pp. 245–263.
[22] T.M. Liggett, Interacting Particle Systems, Grundlehren der Mathematischen Wissenschaften 276, Springer-Verlag, New York, 1985.
[23] T.M. Liggett, Stochastic Interacting Systems: Contact, Voter and Exclusion Processes, Grundlehren der Mathematischen Wissenschaften 324, Springer-Verlag, Berlin, 1999.
[24] N. Madras and G. Slade, The Self-Avoiding Walk, Birkhäuser, Boston, 1993.
[25] C. Maes, The fluctuation theorem as a Gibbs property, J. Stat. Phys. 95 (1999) 367–392.
[26] C. Maes, F. Redig and A. Van Moffaert, On the definition of entropy production, via examples, J. Math. Phys. 41 (2000) 1528–1554.
[27] C. Maes, F. Redig and E. Saada, The abelian sandpile model on an infinite tree, Ann. Probab. 30 (2002) 2081–2107.
[28] C.M. Newman, Topics in Disordered Systems, Lectures in Mathematics, ETH Zürich, Birkhäuser, Basel, 1997.
[29] E. Olivieri and M.E. Vares, Large Deviations and Metastability, Cambridge University Press, Cambridge, 2004.
[30] R.H. Schonmann and S. Shlosman, Wulff droplets and the metastable relaxation of kinetic Ising models, Commun. Math. Phys. 194 (1998) 389–462.
[31] G. Slade, The lace expansion and its applications, in: Ecole d’Eté de Probabilités de Saint Flour XXXIV-2004. To appear as Springer Lecture Notes in Mathematics.
[32] C.E. Soteros and S.G. Whittington, The statistical mechanics of random copolymers, J. Phys. A: Math. Gen. 37 (2004) R279–R325.
[33] H. Spohn, Large Scale Dynamics of Interacting Particles, Springer, Berlin, 1991.
[34] M. Talagrand, Spin Glasses: A Challenge for Mathematicians, Ergebnisse der Mathematik und ihrer Grenzgebiete 46, Springer, Berlin, 2003.
[35] M. Talagrand, The Parisi formula, to appear in Ann. Math.
[36] B. Tóth and B. Valkó, Onsager relations and Eulerian hydrodynamic limit for systems with several conservation laws, J. Stat. Phys. 112 (2003) 497–521.
[37] C. Vanderzande, Lattice Models of Polymers, Cambridge University Press, Cambridge, 1998.
[38] W. Werner, Random planar curves and Schramm-Loewner evolutions, in: Ecole d’Eté de Probabilités de Saint Flour XXXII-2002, Lecture Notes in Math. 1840, Springer, Berlin, 2004, pp. 107–195.
[39] H.T. Yau, Relative entropy and hydrodynamics of Ginzburg-Landau models, Lett. Math. Phys. 22 (1991) 63–80.
[40] O. Zeitouni, Random walk in random environment, in: Ecole d’Eté de Probabilités de Saint Flour XXXI-2001 (ed. J. Picard), Lecture Notes in Mathematics 1837, Springer, Berlin, 2004, pp. 189–312.

F. den Hollander
EURANDOM
P.O. Box 513
NL-5600 MB Eindhoven
The Netherlands
4ECM Stockholm 2004 © 2005 European Mathematical Society

Analysis and Operators 2000–2004
Four Years of Network Activity
J. Esterle
1. General description of network activity

In this paper we give an overview of the research activity of the Research and Training Network “Classical analysis, operator theory, geometry of Banach spaces, their interplay and their applications”, contract HPRN-CT-2000-00116, which was funded from June 1, 2000 to May 31, 2004 by the European Commission within the 5th PCRDT, with a budget of 1 494 000 euro. Further information is available on the network homepage http://maths.leeds.ac.uk/pure/analysis/rtn.html

The coordinating team was Université Bordeaux 1, and the network coordinator was the author of this report. The following table gives the nodes, subnodes and node coordinators.

team no.  node                                                          subnodes                                        node coordinator
1         Université Bordeaux 1                                         Lille, Metz                                     N. Nikolski
2         Vrije Universiteit Amsterdam                                  Delft, Leiden                                   R. Kaashoek
3         Universitat Autònoma de Barcelona                             Barcelona, La Laguna                            J. Bruna
4         University College Dublin                                     Belfast, Düsseldorf, London, Maynooth           S. Gardiner
5         Leeds University                                              Cambridge, Lancaster, Newcastle, Sheffield      J. Partington
6         Université Paris 6                                            Besançon, Cergy-Pontoise, Lyon, Marne la Vallée Y. Raynaud
7         Norwegian University of Science and Technology at Trondheim   Bergen, Lund, Stockholm, Uppsala                K. Seip
8         TU Vienna                                                     Bremen, Regensburg                              H. Langer
9         Tel-Aviv University                                                                                           A. Atzmon
10        St Petersburg branch of the Steklov Institute                 University of St Petersburg                     S. Kislyakov
M. Sodin (team 9) and X. Tolsa (team 3) each gave an invited talk at 4ECM [77], [81], and X. Tolsa received, after the Salem Prize, one of the ten ECM prizes for
his (outstanding) proof of semiadditivity of analytic and continuous analytic capacities.

The project was organized along three main directions of research. Here is the summary of the research objectives given in annex 1 of the contract.

1) Function theory. For Bergman and related spaces of holomorphic functions, develop factorization theory and characterize “inner-outer” functions in terms of their growth near the boundary. In one and multi-dimensional situations, characterize interpolating and sampling sequences, and use interpolating Blaschke products to find new invertibility criteria for Toeplitz operators. Develop approximation theory (harmonic approximation on general sets, tangential approximation) and find further relations between quadrature identities and best approximation problems. Improve understanding of capacities in metric and geometric terms and, more generally, improve understanding of the Cauchy and Hilbert transforms.

2) Operator theory. Develop the theory of function models, and clarify the relations between spectral properties of Hankel and Toeplitz operators and function theoretical properties of their symbols. Develop new operator theoretical methods to analyze problems arising from concrete classes of integral, differential and delay equations. Describe the spaces spanned by generalized eigenvectors for non-selfadjoint operators arising from delay equations. Find new applications of the Brown approximation scheme, and use function theoretical tools to study translation invariant subspaces of $\ell^2_\omega(\mathbb{Z})$.

3) Geometry of Banach spaces, Convex geometry. Develop theories emerging from Banach space geometry and related to function theory and operator theory (operator spaces, noncommutative analysis) and continue the “transfer of technology of Banach spaces” to these areas. Find new applications to analysis, convex geometry and statistical mechanics of the principle of concentration of measure and of the majorizing measure theorem. Improve estimates for contractive approximation algorithms of convex bodies by polytopes. Develop variational principles and pursue their applications to differential equations.

The training program took advantage of the publication by network members during the period 2000–2004 of many monographs devoted to topics playing a central role in network activity, [6], [43], [62], [63], [66], [68], [75] (the monographs [4], [24], [78] are also relevant to network activity). Also, the two volumes of the Handbook of the geometry of Banach spaces contain several important review papers by network members on modern aspects of Banach space geometry and related topics [5], [25], [33], [36], [47], [55], [56], [69], [71].
Besides the daily individual training provided in the various nodes, an important part of the training at network level was provided by series of morning lectures at the four network annual meetings detailed below. St. Petersburg. (May 13–17, 2001, 44 network participants, 4 participants exterior to the network): Capacities and harmonic approximation, by S. Gardiner and A. O’Farrell (Dublin), Linear operators in Krein spaces and applications, by H. Langer (Vienna), Spectral Analysis of self-adjoint Jacobi matrices, by S. Naboko (St. Petersburg). Biarritz. (May 2–7, 2002, 68 network participants, 3 participants exterior to the network) The semiadditivity of analytic capacity, by X. Tolsa (Barcelona), Interpolation of Hardy type spaces, by S. Kislyakov (St. Petersburg), Geometric aspects of approximation in high dimension and connections of convex geometry with complexity theory, by V. Milman (Tel-Aviv), Local theory of operator spaces, by G. Pisier (Paris). Tenerife. (May 21–26, 2003, 67 network participants, 14 participants exterior to the network) Bergman function theory, by H. Hedenmalm (Stockholm), Control theory for analysts, by N. Nikolski (Bordeaux),Translation invariant subspaces, by A. Atzmon (Tel-Aviv), and J. Esterle (Bordeaux). Dalfsen. (May 1–7, 2004, 50 network participants, 2 participants exterior to the network) Singular integrals and capacities, by G. David (Orsay) and J. Verdera (Barcelona), Delay equations and infinite-dimensional systems, by J. Partington (Leeds) and S. Verduyn Lunel (Leiden), Toeplitz operators on Bergman spaces, by N. Vasilevski (Mexico). 
The impact of such lectures in future networks could certainly be improved by making the slides used for the series of morning lectures available on the network webpage (this was partially done for the lectures at the third and fourth meetings), by providing a relevant bibliography in advance, and by arranging within the nodes training seminars related to these lectures before and after the annual meetings.

In addition to these annual meetings, three pre/postdoc workshops were organized at Bordeaux (January 17–18, 2002) and Paris (November 21–22, 2002 and January 22–23, 2004). All pre/postdocs appointed by the network at the time of these meetings (and some previous and future appointees) were given the opportunity to present a one-hour talk, and these events helped to structure a community of young mathematicians appointed by the network. More specialized workshops on specific topics were organized at Trondheim, July 2–4, 2003 (Spaces of holomorphic functions), Leeds, July 3–5, 2003 (Invariant subspaces), Amsterdam, August 20–22, 2003 (Operator theory), Barcelona, November 20–23, 2003 (Bergman spaces and related topics in complex analysis) and Vienna, March 3–4, 2004 (Operator theory).
During network activity there was some joint work by S. Verduyn Lunel (team 2) and members of the Chemistry department of Leiden University, using functional analysis tools to study periodic chemical processes. Also R. Gay, a retired Professor from Bordeaux, published a couple of papers with engineers on signal theory problems arising from industry. At Bordeaux F. Turcu, who was preparing a thesis on a very abstract subject, also worked with Professor N. Najim, from an engineering laboratory, on problems of 2-D random field modelling [1]. He actually helped to solve some very concrete problems by using sophisticated operator-theoretical tools recently developed at Timisoara. He now holds a permanent CNRS research position in this engineering laboratory. Altogether, the potential for direct applications was not exploited as fully as it could have been at network level.

A precise report on the job situation of former pre/postdocs appointed by the network is not yet available. There was unfortunately at least one example of unemployment this year. There is also a success story of employment outside the academic sector: a postdoc whose network appointment ended on May 31, 2004 started on June 1 a job at a bank in London, with a basic annual salary of £105 000, plus a guaranteed bonus of £20 000, plus £20 000 in stock options, with a free car after six months of employment (this fellow never received any specific training in financial mathematics before this new appointment).

We now wish to describe the scientific activity of the network. Recall that the analytic capacity of a compact subset $K$ of $\mathbb{C}$ is given by the formula
$$\gamma(K) = \sup_{f \in U} \lim_{|z| \to \infty} |z f(z)|,$$
where $U$ denotes the set of functions analytic and bounded by 1 in modulus on $\mathbb{C} \setminus K$ which vanish at infinity.
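As a quick illustration of this definition, the following worked example computes the analytic capacity of a closed disc. It is only a sketch of the standard argument, under the normalization $|f| \le 1$ on $\mathbb{C} \setminus K$; the symbols $r$, $g$ and $w$ are introduced here for the example and do not appear in the text.

```latex
% Worked example (illustrative): K = \{ z : |z| \le r \} with r > 0.
\begin{align*}
&\text{Lower bound: } f(z) = \frac{r}{z} \text{ is analytic on } \mathbb{C} \setminus K,\quad
  |f(z)| \le 1,\quad f(\infty) = 0,\quad \lim_{|z| \to \infty} |z f(z)| = r,
  \quad\text{so } \gamma(K) \ge r. \\
&\text{Upper bound: for any admissible } f, \text{ put } g(w) = f(r/w) \ (0 < |w| < 1),\ g(0) = 0. \\
&\text{Then } g \text{ is analytic on the unit disc with } |g| \le 1,
  \text{ and the Schwarz lemma gives } |g(w)| \le |w|, \\
&\text{that is } |f(z)| \le \frac{r}{|z|} \text{ for } |z| > r,
  \text{ hence } \lim_{|z| \to \infty} |z f(z)| \le r. \\
&\text{Conclusion: } \gamma(\{ |z| \le r \}) = r .
\end{align*}
```

In particular a single point has zero analytic capacity, in agreement with the classical Riemann theorem on removable singularities.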
The compact sets of zero analytic capacity are exactly the sets K which are removable for bounded analytic functions (removable in short): every bounded analytic function on U \ K, where U is an open subset of C containing K, extends to a function analytic on the whole of U. The continuous analytic capacity is defined in a similar way, using continuous functions on C vanishing at infinity which are analytic on C \ K, and the sets of vanishing continuous analytic capacity are those sets K for which any function continuous on an open set U containing K and analytic on U \ K is in fact analytic on U.

The scientific highlights are dominated by the accomplishments of X. Tolsa concerning these analytic and continuous analytic capacities. He showed in particular that these quantities are semiadditive (that is, subadditive up to a multiplicative constant), which implies that the union of two removable sets is removable. These results, which follow the solution of related problems concerning planar Cantor sets by the Barcelona team during the first year of network activity [53], use the notion of Menger curvature and its role in removability pointed out in [54], and a chain of recent results on T(1) and T(b) theorems developing the classical Calderón-Zygmund theory of singular integrals, which culminates with a recent paper by Nazarov, Treil and Volberg
[59], where such theorems are obtained without assuming the so-called “doubling condition” for related measures. We refer to Tolsa’s original papers [79], [80], to Tolsa’s contribution to the present volume [81] and to the survey provided by J. Verdera in [82] for the proof of these remarkable results and a global presentation of this circle of ideas.

Other important results, such as the characterization of subsets of the boundary where harmonic functions in a domain may tend to infinity [34], the construction of a rank one perturbation of a unitary operator satisfying the linear growth condition which is not similar to a normal operator [1], optimality results related to contractive liftings and distance to intertwining operators [32], the study of BMO-regular lattices [48], a noncommutative version of Grothendieck’s theorem [70], a description of subspaces of noncommutative $L^p$-spaces [74], breakthroughs on hypercyclicity [14], etc., were obtained without collaboration between different nodes. We refer to the annual, midterm and final reports available on the network homepage for more information about these individual accomplishments, which were largely used in the network training and dissemination of knowledge program.

We will devote the remainder of the paper to a description of some joint results by members of different teams, or by postdocs appointed by the network and senior members of the host node (these two situations overlap), which give a good sample of network activity. In the last section we will take advantage of the wide circulation of this volume to draw attention once more to two long-standing problems, which are too hard to be inserted in any realistic network research workplan but seem to be of strategic interest.

2. Surjective Toeplitz operators

We will use standard notations, and denote respectively by $D$, $P^+$, and $T$ the open unit disk, the open upper half-plane and the unit circle.
For $0 < p < +\infty$, the usual Hardy spaces on the disc or the half-plane are defined by the formulae
$$H^p(D) = \Big\{ f \in \mathrm{Hol}(D) \;\Big|\; \sup_{0 \le r < 1} \int_0^{2\pi} |f(re^{it})|^p \, dt < +\infty \Big\},$$
$$H^p(P^+) = \Big\{ F \in \mathrm{Hol}(P^+) \;\Big|\; \sup_{b > 0} \int_{-\infty}^{+\infty} |F(x + ib)|^p \, dx < +\infty \Big\}.$$
We will denote by $H^\infty(D)$ and $H^\infty(P^+)$ the spaces of bounded analytic functions on $D$ and $P^+$. The Nevanlinna class
$$N(D) = \Big\{ f \in \mathrm{Hol}(D) \;\Big|\; \sup_{0 \le r < 1} \int_0^{2\pi} \log^+ |f(re^{it})| \, dt < +\infty \Big\}$$
is the set of all functions analytic on $D$ which can be written as the quotient of two bounded analytic functions on $D$. For $f \in N(D)$, the function $f^*$ is defined a.e. on $T$ by the formula $f^*(e^{it}) := \lim_{r \to 1^-} f(re^{it})$ (in fact we have nontangential limits a.e. on $T$). Identifying $f$ with $f^*$ we get $H^p(D) \simeq \{ f \in L^p(T) \mid \hat f(n) = 0 \text{ for } n < 0 \}$ for $p \ge 1$. Similarly $H^p(P^+) \simeq \{ F \in L^p(\mathbb{R}) \mid \hat F|_{\mathbb{R}^-} = 0 \text{ a.e.} \}$ for $p \ge 1$.
An inner function on the disc is a function $\phi \in H^\infty(D)$ such that $|\phi(e^{it})| = 1$ a.e., and an inner function $\phi$ is said to be singular if $\phi(z) \ne 0$ for $z \in D$. A nonzero function $f \in H^2(D)$ is said to be outer when
$$\log |f(0)| = \frac{1}{2\pi} \int_0^{2\pi} \log |f^*(e^{it})| \, dt,$$
or, equivalently, if we have, for $z \in D$,
$$f(z) = \exp\Big( \frac{1}{2\pi} \int_0^{2\pi} \frac{e^{it} + z}{e^{it} - z} \log |f^*(e^{it})| \, dt \Big).$$
A standard factorization result shows that each nonzero function $f \in N(D)$ can be written in a unique way in the form $f = \phi g$, where $g \in N(D)$ is outer, and where $\phi$ is inner. The shift operator $S$ and the backward shift $R = S^*$ are defined on $H^2(D)$ by the formulae
$$S(f)(z) = z f(z) \ \text{ for } |z| < 1, \qquad R(f)(z) = \frac{f(z) - f(0)}{z} \ \text{ for } |z| < 1,\ z \ne 0, \qquad R(f)(0) = f'(0).$$
A closed subspace $M$ of $H^2(D)$ is said to be z-invariant when $S(M) \subset M$. A classical result of Beurling shows that each nonzero z-invariant subspace $M$ of $H^2(D)$ has the form $M = \phi H^2(D)$, where $\phi$ is inner. Consequently every closed subspace $N$ of $H^2(D)$ which is invariant for the backward shift has the form $N = H^2(D) \ominus \phi H^2(D)$. For $f \in L^1(T)$, $|z| < 1$, set
$$P_+(f)(z) = \sum_{n=0}^{+\infty} \hat f(n) z^n = \frac{1}{2i\pi} \int_T \frac{f(\zeta)}{\zeta - z} \, d\zeta.$$
This is the Cauchy projection, which maps $L^p(T)$ onto $H^p(D)$ for $1 < p < +\infty$, but maps $L^\infty(T)$ onto the space BMOA of (nontangential limits of) analytic functions of bounded mean oscillation on $D$. For $\psi \in L^\infty(T)$, the Toeplitz operator of symbol $\psi$ is the operator $T_\psi : H^2(D) \to H^2(D)$ defined by $T_\psi(f) = P_+(f^* \psi)$. It is known that if $|\psi| = 1$ a.e., $T_\psi$ is invertible iff $\mathrm{dist}(\psi, H^\infty) < 1$ and $\mathrm{dist}(\bar\psi, H^\infty) < 1$. If $T_\psi$ is Toeplitz, $\mathrm{Ker}(T_\psi)$ is nearly invariant for the backward shift: if $f \in \mathrm{Ker}(T_\psi)$ and $f(0) = 0$, then $Rf \in \mathrm{Ker}(T_\psi)$. Let $g$ be the extremal function for $M := \mathrm{Ker}(T_\psi)$, i.e., $\|g\| = 1$ and $\mathrm{Re}(g(0))$ is maximal over the unit ball of $M$. Results of Hitt and Sarason [44], [73] show that $g$ is outer, that multiplication by $g$ is an isometry on $N := \{ f/g : f \in M \}$, and that $N$ is an $R$-invariant subspace of $H^2(D)$. Hence $N = H^2(D) \ominus \phi H^2(D)$, where $\phi \in H^\infty$ is inner.
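To make the definition of $T_\psi$ concrete: in the orthonormal basis $1, z, z^2, \dots$ of $H^2(D)$ the matrix of $T_\psi$ has entries $\hat\psi(i - j)$, since $P_+$ keeps only the nonnegative Fourier coefficients of $\psi f$. The sketch below builds a finite section of that matrix and checks that the symbols $z$ and $\bar z$ give the shift $S$ and the backward shift $R$. The function names, the truncation size and the sample count are assumptions made for this illustration only.

```python
import cmath


def fourier_coefficient(psi, n, samples=256):
    """Approximate the n-th Fourier coefficient of psi on the unit circle
    by an equally spaced Riemann sum (exact for trigonometric polynomials
    of degree below `samples`)."""
    total = 0j
    for k in range(samples):
        t = 2 * cmath.pi * k / samples
        total += psi(cmath.exp(1j * t)) * cmath.exp(-1j * n * t)
    return total / samples


def toeplitz_section(psi, size, samples=256):
    """size x size truncation of T_psi in the basis 1, z, z^2, ...:
    the entry in row i, column j is psi_hat(i - j)."""
    coeffs = {n: fourier_coefficient(psi, n, samples)
              for n in range(-size + 1, size)}
    return [[coeffs[i - j] for j in range(size)] for i in range(size)]


def apply_matrix(matrix, vector):
    """Matrix-vector product acting on a list of Taylor coefficients."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# Symbol z gives the shift S, symbol conj(z) gives the backward shift R.
shift = toeplitz_section(lambda z: z, 4)
backward = toeplitz_section(lambda z: z.conjugate(), 4)

coefficients = [1, 2, 3, 4]          # f(z) = 1 + 2z + 3z^2 + 4z^3
shifted = apply_matrix(shift, coefficients)
backshifted = apply_matrix(backward, coefficients)
```

Up to rounding, `shifted` holds the coefficients of $zf(z)$ truncated to degree 3, and `backshifted` those of $(f(z) - f(0))/z$.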
One can write $g = \dfrac{a}{1 - b}$, with $a$, $b$ in the unit ball of $H^\infty$, where
$$\frac{1 + b(z)}{1 - b(z)} = \frac{1}{2\pi} \int_0^{2\pi} \frac{e^{it} + z}{e^{it} - z} \, |g(e^{it})|^2 \, dt.$$
Then $|a|^2 + |b|^2 = 1$ a.e., and $\phi$ is a divisor of $b$ in $H^\infty$. The following result from [40] provides a characterization of surjective Toeplitz operators with nontrivial kernel associated to unimodular functions (an easy application of a theorem by Hartman and Wintner reduces the problem to the case where the symbol of the Toeplitz operator is unimodular).

Theorem 2.1. If $\psi$ is unimodular and $\mathrm{Ker}(T_\psi) \ne \{0\}$, then $T_\psi$ is onto iff $g_0 := \dfrac{a}{1 - b/\phi}$ satisfies $|g_0|^2 = \exp(u + \tilde v)$, with $u, v \in L^\infty_{\mathbb{R}}(T)$, $\|v\|_\infty < \dfrac{\pi}{2}$ (Helson-Szegő condition).

A. Hartmann (team 1) was a postdoc at Trondheim (team 7) from September 2001 to August 2002. This paper follows another joint paper [42] with K. Seip, coordinator of team 7, on extremal functions of kernels of Toeplitz operators on the Hardy spaces $H^p(D)$, also initiated during the postdoc appointment. The situation turns out to be very interesting, since these extremal functions happen to be contractive divisors when $p < 2$ and (modulo $p$-dependent multiplicative constants) expansive divisors when $p > 2$. A. Hartmann is now preparing his habilitation at Bordeaux.
3. Bergman and related spaces
The space $B^2(D) = \{ f \in \mathrm{Hol}(D) \mid \iint_D |f(x + iy)|^2 \, dx\,dy < +\infty \}$ is the Bergman space, the natural analog of the Hardy space $H^2(D)$. On $B^2(D)$ there are now analogs of the notions of inner and outer functions and an elaborate theory of contractive divisors of z-invariant (i.e., $S$-invariant) subspaces of $B^2$ due to Hedenmalm, based on the notion of extremal function, and it is known that the lattice of z-invariant subspaces of $B^2(D)$ is very large. The z-invariant subspaces of $B^2(D)$ satisfy a ‘Beurling-type theorem’ [2], [76]: if $M$ is z-invariant, then $M = \bigvee_{n \ge 0} z^n (M \ominus zM)$. On the other hand no characterization of zero sets is known for functions in $B^2(D)$, and no progress was reported during network activity concerning the characterization of inner and outer functions in terms of their behavior near the boundary. We refer to the monograph [43] for a description of the state of the art up to the year 2000 concerning the Bergman space $B^2(D)$.

For $\omega \in L^2[0, 1]$, strictly positive, with the convention $dm(x + iy) = \dfrac{dx\,dy}{\pi}$, set
$$B^2_\omega(D) = \Big\{ f \in \mathrm{Hol}(D) \;\Big|\; \int_D |f(z)|^2 \omega^2(|z|) \, dm(z) < +\infty \Big\},$$
$$L^2_\omega(D) = \Big\{ f \text{ meas.} \;\Big|\; \int_D |f(z)|^2 \omega^2(|z|) \, dm(z) < +\infty \Big\},$$
so that $B^2_\omega(D)$ is a closed subspace of $L^2_\omega(D)$. Set
$$\sigma(n) = \Big[ 2 \int_0^1 r^{2n+1} \omega^2(r) \, dr \Big]^{1/2}, \qquad \hat f(n) = \frac{f^{(n)}(0)}{n!} \quad \text{for } n \ge 0,\ f \in \mathrm{Hol}(D).$$
We have
$$B^2_\omega(D) = H^2(\sigma) := \Big\{ f \in \mathrm{Hol}(D) \;\Big|\; \sum_{n=0}^{+\infty} |\hat f(n)|^2 \sigma(n)^2 < +\infty \Big\}.$$
Set $k^\omega_\lambda(z) = \sum_{n=0}^{+\infty} \bar\lambda^n \sigma^{-2}(n) z^n$, so that $k^\omega_\lambda$ is the reproducing kernel for $B^2_\omega(D)$, which means that we have $f(\lambda) = \langle f, k^\omega_\lambda \rangle$ for $f \in B^2_\omega(D)$, $|\lambda| < 1$. For $\phi \in L^\infty(D)$, $f \in B^2_\omega(D)$, $\lambda \in D$, define the Toeplitz operator of symbol $\phi$ by the formula
$$T_\phi(f)(\lambda) = \langle \phi f, k^\omega_\lambda \rangle, \qquad (3.1)$$
so that $T_\phi(f) = P_+(\phi f)$, where $P_+$ is the orthogonal projection from $L^2_\omega(D)$ onto $B^2_\omega(D)$. Since $k^\omega_\lambda$ is bounded on $D$, formula (3.1) defines an analytic function $T_\phi(f)$ on $D$ for $\phi \in L^2_\omega(D)$, $f \in B^2(\omega)$, and for $\phi\omega^2 \in L^1(D)$, the Berezin transform of $\phi$ is defined on $D$ by the formula
$$\mathrm{Ber}_\omega(\phi)(\lambda) = \frac{\langle \phi k^\omega_\lambda, k^\omega_\lambda \rangle}{\| k^\omega_\lambda \|^2}. \qquad (3.2)$$
Set $\omega_\alpha(r) = (1 - r^2)^{\alpha/2}$ for $\alpha \in (-1, +\infty)$. Following partial answers by Stroethoff and Zheng to a question of Sarason concerning boundedness of the product of Toeplitz operators on the standard Bergman space, Sandra Pott and Elizabeth Strouse (team 1) obtained in [72] the following result.

Theorem 3.1. Let $\alpha \in (-1, +\infty)$, and let $\phi, \psi \in B^2(\omega_\alpha)$.
(i) If $T_\phi T_{\bar\psi}$ defines a bounded operator from $B^2(\omega_\alpha)$ into itself, then
$$\sup_{\lambda \in D} \mathrm{Ber}_{\omega_\alpha}(|\phi|^2)(\lambda) \, \mathrm{Ber}_{\omega_\alpha}(|\psi|^2)(\lambda) < +\infty.$$
(ii) If $\sup_{\lambda \in D} \mathrm{Ber}_{\omega_\alpha}(|\phi|^2)(\lambda) \, \mathrm{Ber}_{\omega_\alpha}(|\psi|^2)(\lambda) < +\infty$, then $T_\phi T_{\bar\psi}$ defines a bounded operator from $B^2(\omega_\beta)$ into itself for every $\beta > \alpha$.
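For the standard weights $\omega_\alpha$ the quantities $\sigma(n)$ and the reproducing kernel can be computed in closed form, which gives a convenient check on the definitions above. The substitution $t = r^2$ turns $\sigma(n)^2 = 2\int_0^1 r^{2n+1}(1 - r^2)^\alpha\,dr$ into the Beta integral $B(n+1, \alpha+1)$, and summing the series then gives $k^{\omega_\alpha}_\lambda(z) = (\alpha+1)/(1 - \bar\lambda z)^{\alpha+2}$. The following sketch verifies both facts numerically; the function names, step counts and sample points are assumptions chosen for the illustration.

```python
import math


def sigma_squared_closed_form(n, alpha):
    """sigma(n)^2 = B(n+1, alpha+1), evaluated through log-gamma
    so that large n does not overflow."""
    return math.exp(math.lgamma(n + 1) + math.lgamma(alpha + 1)
                    - math.lgamma(n + alpha + 2))


def sigma_squared_numeric(n, alpha, steps=20_000):
    """Midpoint-rule value of 2 * int_0^1 r^(2n+1) (1 - r^2)^alpha dr
    (alpha >= 0 is assumed so that the integrand stays bounded)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        total += r ** (2 * n + 1) * (1.0 - r * r) ** alpha
    return 2.0 * h * total


def kernel_partial_sum(lam, z, alpha, terms=200):
    """Partial sum of k_lambda(z) = sum_n conj(lam)^n z^n / sigma(n)^2."""
    w = lam.conjugate() * z
    return sum(w ** n / sigma_squared_closed_form(n, alpha)
               for n in range(terms))


lam, z = 0.5 + 0.2j, 0.3 - 0.4j
w = lam.conjugate() * z
kernel_alpha0 = kernel_partial_sum(lam, z, 0.0)   # classical Bergman kernel
kernel_alpha1 = kernel_partial_sum(lam, z, 1.0)
```

For $\alpha = 0$ this recovers $\sigma(n)^2 = 1/(n+1)$ and the classical Bergman kernel $1/(1 - \bar\lambda z)^2$.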
This work was initiated during a three-month postdoc appointment of S. Pott at Bordeaux in spring 2003.

The existence of a nontrivial zero-free closed z-invariant subspace $M$ of $B^2_\omega(D)$ such that $\dim(M \ominus zM) = 1$ is an open problem. Partial results go back to Nikolski [61], and Atzmon obtained in 1997 a positive answer under a mild regularity condition on the weight by using entire functions of zero exponential type [9]. Borichev (team 1), Hedenmalm (team 7) and Volberg (partially in team 6) obtained in [18] the following result, which shows that the problem has a positive answer for all “large” weights (their function $F$ has in some sense “extremal growth”).

Theorem 3.2. Assume that $\omega(r)$ decreases to zero as $r \to 1^-$, and satisfies for some $\varepsilon \in (0, 1)$
$$\lim_{r \to 1^-} (1 - r)^\varepsilon \log\log \frac{1}{\omega(r)} = 0.$$
Then there exists a non z-cyclic function $F \in B^2(\omega)$ without zeroes in $D$.

4. Fourier frames, interpolation, and sampling

Other spaces of holomorphic functions include the Paley-Wiener space PW of all entire functions of exponential type at most $\pi$ whose restrictions to $\mathbb{R}$ are square-integrable. A sequence $\Lambda = (\lambda_k)_{k \in \mathbb{Z}}$ is sampling for PW iff there exist $A, B > 0$ such that
$$A \int_{-\infty}^{+\infty} |g(x)|^2 \, dx \le \sum_{k=-\infty}^{+\infty} |g(\lambda_k)|^2 \le B \int_{-\infty}^{+\infty} |g(x)|^2 \, dx \qquad (4.1)$$
holds for all $g \in PW$. Set $\hat f(z) = \dfrac{1}{\sqrt{2\pi}} \int_{-\pi}^{\pi} f(t) e^{-izt} \, dt$ for $f \in L^2[-\pi, \pi]$. Then $F : f \mapsto \hat f$ is an isometry from $L^2[-\pi, \pi]$ onto PW, by the classical Paley-Wiener theorem. Condition (4.1) is equivalent to the fact that
$$A \int_{-\pi}^{\pi} |f(t)|^2 \, dt \le \sum_{k=-\infty}^{+\infty} \Big| \int_{-\pi}^{\pi} f(t) e^{-i\lambda_k t} \, dt \Big|^2 \le B \int_{-\pi}^{\pi} |f(t)|^2 \, dt \qquad (4.2)$$
for all $f \in L^2[-\pi, \pi]$. In other terms the sequence $(\lambda_k)$ is sampling for PW iff the system $\{ e^{i\lambda_k x} \}$ is a Fourier frame in the sense of Duffin and Schaeffer. A nondecreasing sequence $(\lambda_k)_{k \in \mathbb{Z}}$ is separated if $\inf_{k \in \mathbb{Z}} (\lambda_{k+1} - \lambda_k) > 0$, interpolating if the equation $f(\lambda_k) = a_k$ for all $k$ has a solution in PW for every square-summable sequence $(a_k)_{k \in \mathbb{Z}}$, and an interpolating sequence $(\lambda_k)$ is said to be complete when this solution is always unique. Separated complete interpolating sequences were characterized by Pavlov [67] and Hruschev-Nikolski-Pavlov [45], and these sequences are in some sense sampling sequences with no redundant points. There exist sampling sequences for which no subsequence is complete interpolating, but every sampling sequence has a separated sampling subsequence.

To a separated sequence $\Lambda = (\lambda_k)_{k \in \mathbb{Z}}$ is associated a distribution function $n_\Lambda$ defined by the formula $n_\Lambda(b) - n_\Lambda(a) = \mathrm{card}(\Lambda \cap (a, b])$ for $a < b$, normalized so that $n_\Lambda(0) = 0$. A necessary condition for a separated sequence $\Lambda$ to be sampling is given by Landau’s inequality
$$n_\Lambda(b) - n_\Lambda(a) \ge (b - a) - A \log^+(b - a) - B \quad \text{for } a < b,$$
where the constants $A$ and $B$ are independent of $a$ and $b$.

Denote by $U$ the set of all entire functions $E$ without zeroes in the upper half-plane such that $|E(z)| \ge |E(\bar z)|$ for $\mathrm{Im}(z) > 0$. If $E \in U$, denote by $H(E)$ the set of entire functions $f$ such that $\dfrac{f(z)}{E(z)}$ and $\dfrac{\overline{f(\bar z)}}{E(z)}$ belong to $H^2(P^+)$. In fact $H(E)$ is a Hilbert space with respect to the norm given by $\|f\|_E^2 = \int_{-\infty}^{+\infty} \Big| \dfrac{f(t)}{E(t)} \Big|^2 dt$. The family of these spaces is exactly the class of de Branges Hilbert spaces of entire functions. J. Ortega-Cerda (Barcelona) and K. Seip (Trondheim) gave in [65] the following characterisation of separated sampling sequences for PW.

Theorem 4.1. A separated sequence $\Lambda$ is sampling for PW if and only if there exist two entire functions $E$, $F$ in $U$ such that
(i) $H(E) = PW$;
(ii) $\Lambda$ is the zero sequence of the entire function $z \mapsto E(z)F(z) + \overline{E(\bar z)}\,\overline{F(\bar z)}$.

We see that the very important notion of de Branges Hilbert spaces of entire functions plays a crucial role in this characterization. Using this theorem it is in particular possible to deduce Landau’s inequality from the Hruschev-Nikolski-Pavlov theorem by using the John-Nirenberg theorem for BMO functions. Other interpolation problems were studied jointly by teams 3 and 7 in [15] and by teams 1 and 3 in [41]. We refer to the monograph [75] for more information about interpolation and sampling in spaces of analytic functions and to the contribution of J. Bruna [19] to 3ecm for a general presentation of sampling in complex and harmonic analysis.

5. Approximation in the boundary and sets of determination for harmonic functions

Recall that the Poisson kernel on $D$ is defined by the formula
$$P(z, e^{it}) = \frac{1}{2\pi} \, \frac{1 - |z|^2}{|z - e^{it}|^2}.$$
Denote by $h^\infty(D)$ the space of real-valued bounded harmonic functions on $D$. For a subset $E$ of $D$, the following are known to be equivalent:
(a) For each $f \in L^1(\partial D)$ and $\varepsilon > 0$ there exist sequences $(\lambda_k)$ in $\mathbb{R}$ and $(x_k)$ in $E$ such that $f = \sum_k \lambda_k P(x_k, \cdot)$ in $L^1(\partial D)$ and $\sum_k |\lambda_k| < \|f\|_{L^1(\partial D)} + \varepsilon$.
(b) $\sup_E h = \sup_D h$ for every $h \in h^\infty(D)$.
(c) Almost every point of $\partial D$ is the nontangential limit of some sequence in $E$.
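The Poisson kernel $P(z, e^{it}) = \frac{1}{2\pi}\,\frac{1 - |z|^2}{|z - e^{it}|^2}$ recalled above can be checked numerically: it integrates to 1 over the circle, and integrating it against the boundary values of a harmonic function returns the value of that function at $z$. The sketch below does this for the boundary function $\cos 2t$, whose harmonic extension is $\mathrm{Re}(z^2)$; the sample point and the number of quadrature nodes are arbitrary choices made for the illustration.

```python
import cmath
import math


def poisson_kernel(z, t):
    """P(z, e^{it}) = (1 / (2*pi)) * (1 - |z|^2) / |z - e^{it}|^2, |z| < 1."""
    return (1.0 - abs(z) ** 2) / (2.0 * math.pi * abs(z - cmath.exp(1j * t)) ** 2)


def poisson_integral(boundary, z, samples=2000):
    """Approximate int_0^{2 pi} P(z, e^{it}) boundary(t) dt by an equally
    spaced sum, which converges very fast for smooth periodic integrands."""
    step = 2.0 * math.pi / samples
    return step * sum(poisson_kernel(z, k * step) * boundary(k * step)
                      for k in range(samples))


z = 0.5 + 0.3j
total_mass = poisson_integral(lambda t: 1.0, z)
harmonic_value = poisson_integral(lambda t: math.cos(2 * t), z)
expected_value = (z * z).real   # Re(z^2) is the harmonic extension of cos(2t)
```

Here `total_mass` is close to 1 and `harmonic_value` agrees with `expected_value` up to rounding.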
Analogous results for $C^+(\partial D)$ were obtained by Hayman and Lyons in 1990. Extensions of these results to all connected Greenian open subsets of $\mathbb{R}^N$ were obtained in [68] by S. Gardiner (team 4) and J. Pau (team 3) during the one-year postdoctoral position of Jordi Pau at Dublin [35]. Set
$$U_y(x) = -\log \|x - y\| \quad (x \ne y,\ N = 2), \qquad U_y(x) = \|x - y\|^{2-N} \quad (x \ne y,\ N \ge 3), \qquad U_y(x) = +\infty \quad (x = y).$$
An open set $\Omega$ is Greenian when $U_y$ has a subharmonic minorant on $\Omega$ for every $y \in \Omega$ (this is always true for $N \ge 3$). In this case $U_y$ has a largest harmonic minorant $h_y$ on $\Omega$ and the Green function $G$ on $\Omega$ is defined on $\Omega \times \Omega$ by the formula $G(x, y) = U_y(x) - h_y(x)$. For the general case of the Gardiner-Pau theorem the notion of Martin boundary and sets of minimal boundary points would be needed (see Chapter 8 of [6]). The situation is simpler for Lipschitz domains. Let $\nu_0 \ge 0$ be a measure with compact support contained in $\Omega$. The Martin kernel is then defined on $\Omega \times \partial\Omega$ by the formula
$$M(x, y) = \lim_{z \to y} \frac{G(x, z)}{\int_\Omega G(z, u) \, d\nu_0(u)}.$$
We can now state the Gardiner-Pau theorem in the special case of Lipschitz domains.

Theorem 5.1. Let $\Omega$ be a bounded Lipschitz domain of $\mathbb{R}^N$ and let $\mu \ge 0$ be a measure on $\partial\Omega$. Set $H(x) = \int_{\partial\Omega} M(x, y) \, d\mu(y)$ ($x \in \Omega$). The following conditions are equivalent for $E \subset \Omega$:
(a) For each $f \in L^1(\mu)$ and $\varepsilon > 0$ there exist sequences $(\lambda_k)$ in $\mathbb{R}$ and $(x_k)$ in $E$ such that $f = \sum_k \lambda_k M(x_k, \cdot)$ in $L^1(\mu)$ and $\sum_k |\lambda_k| H(x_k) < \|f\|_{L^1(\mu)} + \varepsilon$.
(b) $\sup_E \dfrac{h}{H} = \sup_\Omega \dfrac{h}{H}$ for every harmonic function $h$ such that $\dfrac{h}{H}$ is bounded on $\Omega$.
Extensions to general domains of the results of Hayman and Lyons mentioned above are also obtained in [35].

6. Algebraic Riccati equations

Consider the state linear system
$$z'(t) = Az(t) + Bu(t), \qquad y(t) = Cz(t), \qquad z(0) = z_0,$$
where $u(t)$, $y(t)$, $z(t)$ respectively belong to the separable Hilbert spaces $U$, $Y$, $Z$, where $A$ is the generator of a $C_0$-semigroup $(T(t))_{t \ge 0}$ on $Z$, and where $B : U \to Z$ and $C : Z \to Y$ are bounded linear operators. The trajectories are given by
$$z(t) = T(t) z_0 + \int_0^t T(t - s) B u(s) \, ds, \qquad y(t) = Cz(t).$$
Given a bounded invertible positive operator $R : U \to U$ we want to minimize for $u \in L^2([0, \infty), U)$ the quantity
$$J(z_0, u) = \int_0^{+\infty} \|y(s)\|^2 \, ds + \int_0^{+\infty} \|Ru(s)\|^2 \, ds.$$
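A scalar toy case may help to fix ideas about the cost $J(z_0, u)$, the sum of the integrals of $\|y(s)\|^2$ and $\|Ru(s)\|^2$ over $[0, +\infty)$. The sketch below takes $U = Y = Z = \mathbb{R}$ with hypothetical values $a = b = c = r = 1$ (an unstable system) and restricts attention to static feedback controls $u = -kz$; this restriction is an assumption made for the illustration, not something imposed by the problem. The trajectory is then $z(t) = z_0 e^{(a - bk)t}$, and for $bk > a$ the cost equals $(c^2 + r^2k^2)z_0^2 / (2(bk - a))$.

```python
import math


def cost_closed_form(k, a=1.0, b=1.0, c=1.0, r=1.0, z0=1.0):
    """Cost J for the scalar system z' = a z + b u, y = c z, with u = -k z."""
    rate = b * k - a   # closed-loop decay rate; the cost is infinite if <= 0
    if rate <= 0:
        return math.inf
    return (c * c + r * r * k * k) * z0 * z0 / (2.0 * rate)


def cost_numeric(k, a=1.0, b=1.0, c=1.0, r=1.0, z0=1.0,
                 horizon=30.0, steps=100_000):
    """Midpoint-rule approximation of J on [0, horizon] for the same control."""
    h = horizon / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        z = z0 * math.exp((a - b * k) * t)
        total += (c * z) ** 2 + (r * k * z) ** 2
    return h * total


# Scan stabilizing gains k > 1 on a grid and keep the cheapest one.
gains = [1.0 + 0.001 * i for i in range(1, 5000)]
best_gain = min(gains, key=cost_closed_form)
best_cost = cost_closed_form(best_gain)
```

Setting the derivative of the closed-form cost to zero gives $k^2 - 2k - 1 = 0$, so the best gain and the minimal cost are both $1 + \sqrt{2}$ for these values. The same number is the nonnegative root of the scalar equation $dp^2 - 2ap - q = 0$ with $d = b^2/r^2$ and $q = c^2$, which is the scalar form of the Riccati equation discussed next.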
This problem is discussed in Chapter 6 of the monograph [23] by Ruth Curtain and Hans Zwart. Assuming that the linear system is optimizable, which just means that for each $z_0 \in Z$ there exists an input function $u$ such that $J(z_0, u) < +\infty$, there exists a self-adjoint bounded nonnegative operator $\Pi : Z \to Z$ such that $\min_{u \in L^2([0,\infty),U)} J(z_0, u) = \langle z_0, \Pi z_0 \rangle$. The minimizing function $s \mapsto u_{\min}(s, z_0)$ can be computed explicitly from $\Pi$, and $\Pi$ happens to be the minimal nonnegative solution in $B(Z)$ of the weak algebraic Riccati equation
$$\langle Az_1, \Pi z_2 \rangle + \langle \Pi z_1, Az_2 \rangle + \langle Cz_1, Cz_2 \rangle - \langle R^{-1}B^*\Pi z_1, R^{-1}B^*\Pi z_2 \rangle = 0,$$
for $z_1, z_2 \in \mathrm{Dom}(A)$. Set $Q = C^*C$, $D = BR^{-2}B^*$, so that $Q$ and $D$ are bounded and nonnegative. In some situations studied by H. Langer (team 8), A.C.M. Ran (team 2) and B.A. van de Rotten (team 2) the strong algebraic Riccati equation
$$\Pi D \Pi - A^*\Pi - \Pi A - Q = 0$$
has a unique nonnegative bounded solution $\Pi$, which is the minimal symmetric solution of the weak equation. Set
$$\tilde A = \begin{pmatrix} A & -D \\ -Q & -A^* \end{pmatrix},$$
which we view as a perturbation of
$$\tilde A_0 = \begin{pmatrix} A & 0 \\ 0 & -A^* \end{pmatrix}.$$
If the closed densely defined operator $A$ satisfies
$$A - zI \text{ invertible for } |\mathrm{Re}\, z| < \omega_0, \qquad \|(A - zI)^{-1}\| \le \frac{M}{1 + |z|^\beta},$$
with $\omega_0 > 0$, $M > 0$, $\beta > 1/2$, then there exists $\omega > 0$ such that
$$\tilde A - zI \text{ is invertible for } |\mathrm{Re}\, z| \le \omega, \qquad \lim_{|t| \to \infty} \sup_{|s| \le \omega} \|(\tilde A - (s + it)I)^{-1}\| = 0.$$
Moreover there exist two closed subspaces $\tilde H_+$ and $\tilde H_-$ of $\tilde H := H \oplus H$, invariant for $\tilde A$, such that $D(\tilde A) \cap \tilde H_+$ is dense in $\tilde H_+$, $D(\tilde A) \cap \tilde H_-$ is dense in $\tilde H_-$, $\tilde H$ is the direct sum of $\tilde H_+$ and $\tilde H_-$, and such that if $\tilde A_+$ is the restriction of $\tilde A$ to $D(\tilde A) \cap \tilde H_+$ and $\tilde A_-$ the restriction of $\tilde A$ to $D(\tilde A) \cap \tilde H_-$ we have
$$\inf_{z \in \mathrm{Spec}\, \tilde A_+} \mathrm{Re}\, z > 0, \qquad \sup_{z \in \mathrm{Spec}\, \tilde A_-} \mathrm{Re}\, z < 0.$$
Assume that $\{ (A - zI)^{-1} Dx : x \in H,\ |\mathrm{Re}\, z| \le \omega_0 \}^- = H$ (this means that the pair $(A, D)$ is approximately controllable) and that $\{ (A^* - zI)^{-1} Qx : x \in H,\ |\mathrm{Re}\, z| \le \omega_0 \}^- = H$ (this means that the pair $(Q, A)$ is approximately observable). Then there exist a (possibly unbounded) positive one-to-one self-adjoint operator $\Pi_-$ and a (possibly unbounded) negative one-to-one self-adjoint operator $\Pi_+$ such that
$$\tilde H_+ = \left\{ \begin{pmatrix} x \\ \Pi_+(x) \end{pmatrix} \right\}_{x \in D(\Pi_+)}, \qquad \tilde H_- = \left\{ \begin{pmatrix} x \\ \Pi_-(x) \end{pmatrix} \right\}_{x \in D(\Pi_-)}.$$
These results use in particular two structures of Krein spaces on $\tilde H$. Recall that an operator $A$ is said to be mu-sectorial if $A - zI$ is invertible for $\mathrm{Re}\, z > 0$ and if there exist $\theta \in (0, \pi/2)$ and $\beta > 0$ such that
$$\pi/2 + \theta \le \mathrm{Arg}(Ax, x) \le 3\pi/2 - \theta \quad \text{and} \quad \mathrm{Re}(Ax, x) \le -\beta \|x\|^2, \qquad x \in D(A).$$
Langer, Ran and van de Rotten proved in [51] the following result.

Theorem 6.1. Assume that $A$ is mu-sectorial, that the pair $(A, D)$ is approximately controllable, and that the pair $(Q, A)$ is approximately observable. Then the positive operator $\Pi_-$ is bounded, and it is the unique nonnegative bounded solution of the algebraic Riccati equation $\Pi D \Pi - A^*\Pi - \Pi A - Q = 0$.

The notion of angular subspace plays an important role in this result and in many other situations. For example let
$$\mathcal{A} = \begin{pmatrix} A & B \\ B^* & D \end{pmatrix}$$
be a block operator matrix in a Hilbert space $H = H_1 \oplus H_2$, with bounded operators $A$, $B$ and $D$, where $A$ and $D$ are self-adjoint. It is well known that if the spectra of $A$ and $D$ are separated, e.g., $d = \max[\sigma(D)] < \min[\sigma(A)] = a$, then the interval $(d, a)$ belongs to the resolvent set of $\mathcal{A}$ and $\min \sigma(\mathcal{A}) \le d < a \le \max \sigma(\mathcal{A})$. Moreover the spectral subspace of $\mathcal{A}$ associated to $[a, +\infty)$ is angular: this subspace is the graph of a contraction $K : H_1 \to H_2$ (a similar property holds of course for the spectral subspace associated to $(-\infty, d]$). The purpose of the paper [50], by H. Langer (team 8), A. Markus, V. Matsaev (team 9), and C. Tretter (team 8), is to investigate the situation where the spectra of $A$ and $D$ are not separated. For example if the operator $\mathcal{A}$ has spectrum on a closed interval $\Delta \subset \rho(D)$ then the spectral subspace associated to $\Delta$ has an angular representation associated to an operator $K$ which is in general defined only on a subspace of $H_1$ and is no longer a contraction. If the interval $\Delta \subset \rho(D)$ is half-open or open then the operator $K$ may be unbounded. The first Schur complement $S_1(\lambda) = A - \lambda - B(D - \lambda)^{-1}B^*$ corresponding to $\Delta$ plays an important role in this investigation. This paper should become a reference for further investigations because the methods used can be extended to some situations where the operator $\mathcal{A}$ is not self-adjoint and has unbounded coefficients.

The two papers mentioned above are part of a large flow of joint papers involving teams 2, 8 and 9 which use a blend of complex analysis and operator-theoretical methods to solve problems arising from control theory or differential-difference equations, see for example [3], [13], [38], [39], [49]. We refer to the annual, midterm and final reports, available on the network homepage, for further information.

7. Hadamard products

Recall that the Hadamard product $M \circ A$ of two matrices $M = (m_{i,j})_{i,j \ge 0}$ and $A = (a_{i,j})_{i,j \ge 0}$ is given by the formula $M \circ A = (m_{i,j} a_{i,j})$. Identify a bounded operator $A$ on $l^2$ with the matrix $(a_{i,j})$, where $(e_i)_{i \ge 1}$ is the standard orthonormal basis for $l^2$ and where $a_{i,j} = \langle Ae_i, e_j \rangle$ for $i \ge 1$, $j \ge 1$. If $M = (m_{i,j})$ is an infinite matrix, set $M \circ A = (m_{i,j} a_{i,j})$. The matrix $M$ is called a Schur multiplier if the map $A \mapsto M \circ A$ is a bounded map from $B(l^2)$ into itself. L.N. Nikolskaia (Bordeaux) and Yu.B. Farforovskaya (St. Petersburg) recently obtained interesting results on this very classical subject. For example let $\phi : \mathbb{Z}^+ \to \mathbb{Z}^+$ be a map, and set $\sigma_\phi = \{ (i, j) \in \mathbb{Z}^+ \times \mathbb{Z}^+ \mid j \ge \phi(i) \}$. They show in particular in [31] that if $M(\phi)$ is the matrix associated to the characteristic function of $\sigma_\phi$ (i.e., $M(\phi)_{i,j} = 1$ if $j \ge \phi(i)$, $M(\phi)_{i,j} = 0$ otherwise), then $M(\phi)$ is a Schur multiplier if and only if $\phi(\mathbb{Z}^+)$ is finite. In this case we have
$$c \cdot \log(n + 1) \le \|M(\phi)\|_{HSM} \le 1 + \log(n),$$
where $n = \mathrm{card}(\phi(\mathbb{Z}^+))$, and where $c$ is an absolute constant. They also obtain in [31] a complete characterization of Toeplitz Schur multipliers.

Theorem 7.1. Let $T = (t_{i-j})$ be an infinite Toeplitz matrix.
Then $T$ is a Schur multiplier if and only if there exists a measure $\mu$ on the unit circle such that $t_n = \hat\mu(n)$ for $n \in \mathbb{Z}$, and in this case the Schur-multiplier norm of $T$ equals the total variation of $\mu$ on the unit circle.

The situation is more complicated for Hankel matrices $M = (m_{i+j})$, and some inequalities follow from Pisier’s general version of a theorem of Grothendieck, which shows that $M = (m_{i,j})$ is a Schur multiplier of norm $\le C$ if and only if there exist two bounded sequences $(x_i)$ and $(y_j)$ in the Hilbert space such that $\sup_i \|x_i\| \cdot \sup_j \|y_j\| \le C$ and such that $m_{i,j} = \langle x_i, y_j \rangle$ for $i \ge 1$, $j \ge 1$. Other interesting links between operator theory and Fourier analysis can be found in [12], where C. Badea (team 1) and G. Cassier (team 6) develop a theory of constrained von Neumann inequalities, i.e., inequalities satisfied by Hilbert space contractions subject to some algebraic conditions, and deduce from these inequalities, in some usual situations, estimates on Fourier coefficients.
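The simplest instance of Theorem 7.1 can be checked by hand and by machine. If $\mu$ is the unit point mass at $e^{i\theta}$, then $t_n = \hat\mu(n) = e^{-in\theta}$, and the Hadamard product $T \circ A$ coincides with $UAU^*$, where $U$ is the unitary diagonal matrix with entries $e^{-ik\theta}$. Conjugation by a unitary preserves the operator norm, which is consistent with a multiplier norm equal to the total variation 1 of $\mu$. The sketch below verifies the identity $T \circ A = UAU^*$ on a small matrix; the function names, the test matrix and the value of $\theta$ are illustrative assumptions.

```python
import cmath


def hadamard(m, a):
    """Entrywise (Hadamard) product of two matrices of the same shape."""
    return [[x * y for x, y in zip(row_m, row_a)]
            for row_m, row_a in zip(m, a)]


def toeplitz_from_point_mass(theta, size):
    """Toeplitz matrix (t_{i-j}) with t_n = exp(-1j*n*theta), the Fourier
    coefficients of the unit point mass at exp(1j*theta)."""
    return [[cmath.exp(-1j * (i - j) * theta) for j in range(size)]
            for i in range(size)]


def conjugate_by_diagonal(a, theta):
    """U A U^* for the unitary diagonal matrix U = diag(exp(-1j*k*theta))."""
    size = len(a)
    u = [cmath.exp(-1j * k * theta) for k in range(size)]
    return [[u[i] * a[i][j] * u[j].conjugate() for j in range(size)]
            for i in range(size)]


a = [[1, 2, 0], [0, 1j, 3], [4, 0, -1]]
theta = 0.7
left = hadamard(toeplitz_from_point_mass(theta, 3), a)
right = conjugate_by_diagonal(a, theta)
max_gap = max(abs(left[i][j] - right[i][j])
              for i in range(3) for j in range(3))
```

The same example fits the Grothendieck-type factorization quoted above, with the one-dimensional vectors $x_i = e^{-ii\theta}$ and $y_j = e^{-ij\theta}$, for which $\langle x_i, y_j \rangle = e^{-i(i-j)\theta}$ and $\sup_i \|x_i\| \cdot \sup_j \|y_j\| = 1$.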
8. Duality of metric entropy

For two subsets $K$ and $T$ of a vector space $E$, the (possibly infinite) covering number of $K$ by $T$, denoted $N(K, T)$, is defined as the minimal number of translates of $T$ needed to cover $K$:
$$N(K, T) = \min\Big\{ N : \text{there exist } x_1, \dots, x_N \in E,\ K \subset \bigcup_{1 \le i \le N} (x_i + T) \Big\}.$$
Similarly the packing number $M(K, T)$ is the (possibly infinite) maximal number of disjoint translates of $T$ by elements of $K$. These notions are closely related, and we have the inequality
$$N(K, T - T) \le M(K, T) \le N\big(K, \tfrac{1}{2}(T - T)\big).$$
If $T$ is a ball in a normed space, and if $K$ is a subset of a normed space, these notions reduce to considerations involving $\varepsilon$-nets or $\varepsilon$-separated subsets of $K$. Now for two Banach spaces $X$ and $Y$, with unit balls $B_X$ and $B_Y$ respectively, and for a linear operator $u : X \to Y$, the (possibly infinite) $k$th entropy number of $u$ is defined by the formula
$$e_k(u) := \inf\{ \varepsilon : N(u(B_X), \varepsilon B_Y) \le 2^{k-1} \}.$$
Hence $e_1(u) = \|u\|_{op}$, and one can easily see that $e_k(u) \to 0$ as $k \to \infty$ if and only if $u$ is a compact operator. So the sequences $(e_k(u))_{k \ge 1}$ and $(e_k(u^*))_{k \ge 1}$ always begin with the same number $\|u\|_{op} = \|u^*\|_{op}$, and $e_k(u) \to 0$ if and only if $e_k(u^*) \to 0$. Since the sequence $(e_k(u))_{k \ge 1}$ quantifies in some sense the compactness of $u$, it is natural to ask to what extent $(e_k(u))_{k \ge 1}$ and $(e_k(u^*))_{k \ge 1}$ behave similarly. This led to the duality conjecture for metric entropy.

Conjecture 8.1. (Pietsch, 1972) Do there exist numerical constants $a, b \ge 1$ such that for any two Banach spaces $X$ and $Y$ and any linear operator $u : X \to Y$, the inequality $e_{bk}(u^*) \le a\, e_k(u)$ holds for every $k \ge 1$?

If $K \subset \mathbb{R}^n$ is a convex body, we will denote by
$$K^0 = \{ u \in \mathbb{R}^n \mid \sup_{x \in K} \langle x, u \rangle \le 1 \}$$
the polar body of $K$. S. Artstein (team 9), V. Milman (team 9) and S.J. Szarek (team 6) obtained in [7] the following result.

Theorem 8.2. Let $D$ be the Euclidean unit ball in $\mathbb{R}^n$. Then there exist two universal constants $\alpha$ and $\beta$ such that
$$N(D, \alpha^{-1} K^0)^{1/\beta} \le N(K, D) \le N(D, \alpha K^0)^\beta$$
for any dimension $n$ and any convex body $K \subset \mathbb{R}^n$, symmetric with respect to the origin.
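The elementary comparison between covering and packing numbers recalled at the beginning of this section has a short constructive proof, which the following sketch illustrates for a Euclidean ball $T$ of radius $t$ (so that $T - T$ is the ball of radius $2t$). A maximal family of points of $K$ whose mutual distances exceed $2t$ gives pairwise disjoint translates of $T$, and by maximality the balls of radius $2t$ around the same points cover $K$, whence $N(K, T - T) \le M(K, T)$. The finite grid standing in for $K$, the radius and the function name are hypothetical choices made for the example.

```python
import math


def greedy_separated_subset(points, t):
    """Greedily select points whose mutual distances exceed 2*t, so that
    the closed balls of radius t centred at them are pairwise disjoint."""
    centers = []
    for p in points:
        if all(math.dist(p, c) > 2 * t for c in centers):
            centers.append(p)
    return centers


# K: a finite grid in the unit square (a stand-in for a compact set).
grid = [(i / 20, j / 20) for i in range(21) for j in range(21)]
t = 0.1
centers = greedy_separated_subset(grid, t)

# Packing property: the selected centres are pairwise more than 2t apart.
separated = all(math.dist(p, q) > 2 * t
                for i, p in enumerate(centers) for q in centers[i + 1:])

# Covering property: every point of K is within 2t of some centre.
covered = all(any(math.dist(p, c) <= 2 * t for c in centers) for p in grid)
```

The number of selected centres is therefore at least the covering number of the grid by balls of radius $2t$ and at most its packing number by balls of radius $t$.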
This theorem implies the duality conjecture for metric entropy in the special but central case where one of the Banach spaces $X$ or $Y$ is a Hilbert space. A. Pajor (team 6) and V. Milman (team 9) proved in [57] other interesting results concerning the regularization of an arbitrary star body by cutting it with random half-spaces, showing that the resulting convex body has (with large probability) better regularity properties. For example, cutting with $n/2$ suitable half-spaces an $\ell_1^n$ ball of diameter of order $\sqrt{n}$ containing the standard Euclidean ball, one obtains a body with (absolutely) bounded diameter which still contains the unit ball.

These packing and covering numbers appear naturally in numerous subfields of mathematics, ranging from classical and functional analysis through probability theory and operator theory to information theory and computer science, where a code is typically a packing, while covering numbers quantify the complexity of a set. In fact, the quantity $\log(N(K, tT))$ is the complexity of $K$, measured in bits, at the level of resolution $t$ with respect to the metric for which $T$ is the unit ball. Accordingly, Theorem 8.2 says that when $K$ is a subset of a Hilbert space the complexity of $K$ is controlled by the complexity of the Euclidean ball with respect to the norm of $\mathbb{R}^n$ for which the unit ball is $K^0$, and vice versa, at every level of resolution. The phenomenon of concentration of measure, related to convex geometry and high-dimensional Banach space geometry, plays an increasing role in mathematical physics and statistics, see the monograph by Talagrand [78], the very interesting lecture of Massart [52] on applied statistics at 4ecm, and a recent spectacular application of convex geometry to complex analysis (estimates of the volume of level sets of analytic functions were obtained in [58]). The links between high-dimensional convex geometry, complexity theory, the phenomenon of concentration of measure, fluid dynamics, etc. led to an innovative RTN project, involving people originating from different areas (Brenier, Gromov, Milman, Pastour, etc.) and coordinated by Pajor, which was one of the two projects in Mathematics accepted in the November call of 6th PCRDT (see http://phd-math.univ-mlv.fr/ for further information).

9. Variational principles and invariant subspaces

In the research objectives mentioned at the beginning of this paper, applications of variational principles to differential equations were expected. In fact A. Atzmon (Tel-Aviv) and G. Godefroy (Paris) obtained in [11] an application of variational principles to invariant subspaces, a very different direction. If a Banach space $X$ admits an equivalent Gâteaux smooth norm (which is true for all separable Banach spaces) it is proved in [11] that given a function $G : X \to \mathbb{R} \cup \{\infty\}$ which is lower-semicontinuous and bounded below, if $\varepsilon > 0$ and $y \in X$ satisfy 2 G(y) < inf(G) + 12
then there exists a Lipschitzian and Gateaux smooth function g : X → R such that
(1) $\sup_{x \in X} (|g(x)| + \|g'(x)\|) < \varepsilon$,
(2) G + g attains its minimum on X at a point w such that $\|y - w\| < \varepsilon$.
An operator A is said to have a moment sequence if there exist $x_0 \in X \setminus \{0\}$, $x_0^* \in X^* \setminus \{0\}$ and a positive Borel measure µ on R such that
$\langle x_0^*, A^n(x_0) \rangle = \int_{\mathbb{R}} t^n \, d\mu(t) \quad (n \ge 0).$
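To make the definition of a moment sequence concrete, here is a minimal finite-dimensional sketch (it is not taken from [11]). For a real symmetric matrix A on $\mathbb{R}^2$ and the functional $x_0^* = \langle x_0, \cdot \rangle$, the spectral decomposition yields a positive discrete measure $\mu = \sum_i \langle x_0, e_i \rangle^2 \delta_{\lambda_i}$ whose moments reproduce $\langle x_0^*, A^n x_0 \rangle$. The matrix, the vector x_0 and all function names below are illustrative assumptions; in finite dimension invariant subspaces exist trivially, so the code only illustrates the definition, not the theorem.

```python
import math

# Hypothetical example data: a real symmetric 2x2 matrix with eigenvalues 1 and 3.
A = [[2.0, 1.0], [1.0, 2.0]]
S = 1.0 / math.sqrt(2.0)
EIGENPAIRS = [(1.0, [S, -S]), (3.0, [S, S])]  # (eigenvalue, unit eigenvector)
X0 = [1.0, 0.0]  # the vector x_0; the functional x_0^* is taken to be <x_0, .>


def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def apply(matrix, v):
    """Apply a matrix (list of rows) to a vector."""
    return [dot(row, v) for row in matrix]


# Discrete positive measure mu = sum_i <x_0, e_i>^2 * delta_{lambda_i},
# stored as a list of (atom, weight) pairs.
MEASURE = [(lam, dot(X0, e) ** 2) for lam, e in EIGENPAIRS]


def moment_from_operator(n):
    """Return <x_0, A^n x_0>, computed by applying A to x_0 n times."""
    v = list(X0)
    for _ in range(n):
        v = apply(A, v)
    return dot(X0, v)


def moment_from_measure(n):
    """Return the n-th moment of mu, i.e. the integral of t^n against mu."""
    return sum(weight * lam ** n for lam, weight in MEASURE)


if __name__ == "__main__":
    for n in range(5):
        print(n, moment_from_operator(n), moment_from_measure(n))
```

The two columns printed by the script agree up to rounding, which is exactly the moment identity above for this particular choice of A and x_0.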
The variational principle described above allowed Atzmon and Godefroy to prove for all Banach spaces the following result, previously obtained by Atzmon in the reflexive case [10].

Theorem 9.1. Let X be a real Banach space, and let A : X → X be a bounded linear operator. If A has a moment sequence, then A has a nontrivial invariant subspace.

10. More on invariant subspaces

Let ω : Z → (0, +∞) and assume that
$0 < \inf_{n \in \mathbb{Z}} \frac{\omega(n+1)}{\omega(n)} \le \sup_{n \in \mathbb{Z}} \frac{\omega(n+1)}{\omega(n)} < +\infty.$
Set $\ell^2_\omega(\mathbb{Z}) = \{u = (u_n)_{n \in \mathbb{Z}} \mid \sum_{n \in \mathbb{Z}} |u_n|^2 \omega^2(n) < +\infty\}$. The bilateral shift $S : (u_n)_{n \in \mathbb{Z}} \mapsto (u_{n-1})_{n \in \mathbb{Z}}$ is bounded and invertible on $\ell^2_\omega(\mathbb{Z})$, and a closed subspace M of $\ell^2_\omega(\mathbb{Z})$ is said to be translation invariant if $S(M) \cup S^{-1}(M) \subset M$. The existence of nontrivial translation invariant subspaces is an open problem. Atzmon’s Hilbert space version of the moment sequence theorem gives a positive answer for all symmetric weights. For “antisymmetric weights”, i.e., weights satisfying $\omega(n) = 1/\omega(-n)$, Domar obtained in 1997 in [27] a positive answer, with some regularity assumptions, using entire functions of exponential type. The Borichev-Hedenmalm-Volberg theorem gives other types of translation invariant subspaces for antisymmetric weights which are log-convex on $\mathbb{Z}^+$, and Atzmon obtained in 1997 in [9] a positive answer for new classes of weights by using entire functions of zero exponential type. The problem is still open in the case where the spectrum of the bilateral shift is an annulus.

Inspired by the solution by Borichev and Hedenmalm of Levin’s problem [16] and using minimum principles for almost holomorphic functions (see [17] for recent results of this type), A. Volberg and the author showed in [30] that if the spectrum of the bilateral shift equals the unit circle, and if ω(n) tends to infinity sufficiently quickly and regularly as n → −∞, then all translation invariant subspaces M are generated by their “analytic part” $M^+ = \{u = (u_n)_{n \in \mathbb{Z}} \in M \mid u_n = 0 \ \forall n < 0\}$. For example, if ω(n) = 1 for n ≥ 0 and $\omega(n) = \exp(|n|/\log(1 + |n|)^2)$ for n < 0, then all translation invariant subspaces of $\ell^2_\omega(\mathbb{Z})$ are generated by the
Fourier sequence of a singular inner function. This shows that if there exists a weighted Hardy space, for which the spectrum of the shift and of the backward shift equals the closed unit disc, such that $\dim(M \ominus zM) \ge 2$ for every zero-free z-invariant subspace M, then there is a counterexample to the translation invariant subspace problem.

The so-called Brown approximation scheme plays a very important role in the construction of nontrivial invariant subspaces. For example, if T is an absolutely continuous contraction on the Hilbert space H for which the functional calculus $h \mapsto h(T)$ is an isometry, this scheme shows that for every $f \in L^1(\mathbb{T})$ and every $\varepsilon > 0$ there exist x, y ∈ H such that $\langle T^n x, y \rangle = \hat{f}(-n)$ for n ≥ 0, which implies that T has a very rich lattice of invariant subspaces. I. Chalendar (team 6), J.R. Partington (team 5) and R. Smith (team 5) show in [22] that the existence of pairs (x, y) of elements of H such that $\langle T^n x, y \rangle = \hat{f}(-n)$ (n ≥ 0) for some specific $f \in L^1(\mathbb{T})$ does imply the existence of nontrivial invariant subspaces for T. This is the case for example for functions $f \in L^1(\mathbb{T})$ which agree a.e. with the nontangential limit on the circle of the quotient of two bounded analytic functions on the open unit disc. They also establish for the first time a link between the Brown approximation scheme and the Hilbert space version of the Atzmon-Godefroy moment theorem mentioned above, which gives in particular nontrivial translation invariant subspaces for all weighted Hilbert spaces of sequences associated to an even weight. The ideas introduced in this paper could help to solve the “recalcitrant cases” of weighted Hilbert spaces of sequences for which the spectrum of the bilateral shift has nonempty interior and for which the existence of translation invariant subspaces remains unknown (this was a network objective, for which significant partial results were obtained in [29]).
Other network contributions to the Brown approximation scheme can be found in [20].

11. Two open problems

We conclude this report with two open problems, which still seem out of reach and cannot reasonably be part of the research objectives of a realistic network project.

1. Let p ≥ 2, let $F : \mathbb{C}^p \to \mathbb{C}^p$ be holomorphic and let $F^n$ be the nth iterate of F. Is $\bigcap_{n \ge 1} F^n(\mathbb{C}^p)$ always nonempty? More generally, is $\bigcap_{n \ge 1} F_1 \circ F_2 \circ \cdots \circ F_n(\mathbb{C}^p)$ always nonempty if $(F_n)$ is a sequence of holomorphic functions from $\mathbb{C}^p$ into itself? (A negative answer would imply that characters on Fréchet algebras are continuous, see [26], [28].)

2. Discontinuous algebra norms on C[0, 1] do exist if $2^{\aleph_0} = \aleph_1$, as shown independently by H.G. Dales and the author in 1976; their existence is not decidable if $2^{\aleph_0} = \aleph_2$, as shown independently in 1994 by Woodin and Frantiszek (Solovay and Woodin had already shown in 1976 that Martin’s axiom does not imply
that all algebra norms on C[0, 1] are continuous, and detailed references about these questions can be found in the monograph [24]). What about $2^{\aleph_0} \ge \aleph_3$?

References

[1] O. Alata, M. Najim, C. Ramananjarasoa and F. Turcu, Extension of the Schur-Cohn stability test for 2-D AR quarter-plane model, IEEE Trans. Inform. Theory 49 (2003), 3099–3106.
[2] A. Aleman, S. Richter and C. Sundberg, Beurling’s theorem for the Bergman space, Acta Math. 177 (1996), 275–310.
[3] D. Alpay, A. Dijksma (team 2), and H. Langer (team 8), Factorization of J-unitary matrix polynomials on the line and a Schur algorithm for generalized Nevanlinna functions, Lin. Alg. Appl. 387 (2004), 313–342.
[4] P. Ara and M. Mathieu (team 4), Local multipliers of C*-algebras. Springer Monographs in Mathematics. Springer-Verlag London, Ltd., London, 2003. xii+319 pp.
[5] S. Argyros, G. Godefroy (team 6) and H.P. Rosenthal, Descriptive set theory and Banach spaces, Handbook of the geometry of Banach spaces, Vol. 2, 1007–1069, North-Holland, Amsterdam, 2003.
[6] D.H. Armitage (team 4) and S.J. Gardiner (team 4), Classical potential theory. Springer Monographs in Mathematics. Springer-Verlag London, Ltd., London, 2001. xvi+333 pp.
[7] S. Artstein (team 9), V. Milman (team 9) and S. Szarek (team 6), Duality of metric entropy, Annals of Math., to appear.
[8] S. Artstein (team 9), V. Milman (team 9), S. Szarek (team 6) and N. Tomczak-Jaegermann, On convexified packing and metric entropy, Geom. Funct. Anal. 14 (2004), 1134–1141.
[9] A. Atzmon, Entire functions, invariant subspaces and Fourier transforms, Israel Math. Conf. Proceedings 11 (1997), 37–52.
[10] A. Atzmon, The existence of translation invariant subspaces for symmetric selfadjoint sequence spaces on Z, J. Func. An. 178 (2000), 372–380.
[11] A. Atzmon (team 9) and G. Godefroy (team 6), An application of the smooth variational principle to the existence of nontrivial invariant subspaces, C.R. Acad. Sci. Paris Sér. I 332 (2001), 151–156.
[12] C. Badea (team 1) and G. Cassier (team 6), Constrained von Neumann inequalities, Adv. Math. 166 (2002), 260–297.
[13] A. Batkai (team 8), P. Binding, A. Dijksma (team 2), R. Hryniv and H. Langer (team 8), Spectral problems for operator matrices, Math. Nachrichten, to appear.
[14] F. Bayart (team 1) and S. Grivaux (team 6), Hypercyclicity: the role of the unimodular point spectrum, C. R. Acad. Sci. Paris 338 (2004), 703–708.
[15] B. Boe (team 7) and A. Nicolau (team 3), Interpolation by functions in the Bloch space, preprint (http://www.mat.uab.es).
[16] A. Borichev and H. Hedenmalm, Completeness of translates in weighted spaces on the half-line, Acta Math. 174 (1995), 1–84.
[17] A. Borichev (team 1), F. Nazarov and M. Sodin (team 9), Lower bounds for quasianalytic functions II, The Bernstein quasianalytic functions, Math. Scand. 95 (2004), 44–58.
[18] A. Borichev (team 1), H. Hedenmalm (team 7) and A. Volberg (team 6), Large Bergman spaces: invertibility, cyclicity and subspaces of arbitrarily large index, J. Func. An. 207 (2004), 111–160.
[19] J. Bruna, Sampling in complex and harmonic analysis, European Congress of Mathematics, Vol. I (Barcelona, 2000), 225–246, Progr. Math. 201, Birkhäuser, Basel, 2001.
[20] G. Cassier (team 6), I. Chalendar (team 6) and B. Chevreau (team 1), A mapping theorem for the boundary set $X_T$ of a contraction T, J. Op. Th. 50 (2003), 331–343.
[21] I. Chalendar (team 6), J.R. Partington (team 5) and M. Smith (team 5), Approximation in reflexive Banach spaces and applications to the invariant subspace problem, Proc. Amer. Math. Soc. 132 (2003), 1133–1142.
[22] I. Chalendar (team 6), J.R. Partington (team 5) and R. Smith (team 5), $L^1$-factorizations, moment problems and invariant subspaces, Studia Math. 167 (2005), 183–194.
[23] R. Curtain and H. Zwart, An introduction to infinite-dimensional linear systems theory, Texts in Applied Mathematics, 21. Springer-Verlag, New York, 1995. xviii+698 pp.
[24] H.G. Dales (team 5), Banach algebras and automatic continuity, London Mathematical Society Monographs. New Series, 24. Oxford Science Publications. The Clarendon Press, Oxford University Press, New York, 2000. xviii+907 pp.
[25] R. Deville (team 1) and N. Ghoussoub, Perturbed minimization principles and applications, Handbook of the geometry of Banach spaces, Vol. I, 393–435, North-Holland, Amsterdam, 2001.
[26] P.G. Dixon and J. Esterle, Michael’s problem and the Poincaré-Fatou-Bieberbach phenomenon, Bull. Amer. Math. Soc. 15 (1986), 127–187.
[27] Y. Domar, Entire functions of order ≤ 1, with bounds on both axes, Ann. Acad. Sci. Fenn. Math. 22 (1997), 339–34.
[28] J. Esterle, Picard’s theorem, Mittag-Leffler methods, and continuity of characters on Fréchet algebras, Ann. Sci. Ec. Norm. Sup. 29 (1996), 539–582.
[29] J. Esterle (team 1), Apostol’s bilateral weighted shifts are hyper-reflexive, Op. Th. Adv. Appl. 127 (2001), 243–266.
[30] J. Esterle (team 1) and A. Volberg (team 6), Asymptotically holomorphic functions and translation invariant subspaces of weighted Hilbert spaces of sequences, Ann. Sci. Ec. Norm. Sup. 35 (2002), 185–230.
[31] Yu. Farforovskaia (team 10) and L. Nikolskaia (team 1), Toeplitz and Hankel matrices as Hadamard-Schur multipliers, St. Petersburg Math. Journal 15 (2004), 1–14.
[32] C. Foias, A.E. Frazho and M.A. Kaashoek (team 2), The distance to intertwining operators, contractive liftings and a related optimality result, Int. Eq. and Op. Th. 47 (2003), 71–89.
[33] T.W. Gamelin and S.V. Kislyakov (team 10), Uniform algebras as Banach spaces, Handbook of the geometry of Banach spaces, Vol. I, 671–706, North-Holland, Amsterdam, 2001.
[34] S. Gardiner and W. Hansen, Boundary sets where harmonic functions may become infinite, Math. Ann. 323 (2002), 41–54.
[35] S. Gardiner (team 4) and J. Pau (team 3), Approximation in the boundary and sets of determination for harmonic functions, Illinois J. Math. 47 (2003), 1115–1136.
[36] G. Godefroy (team 6), Renormings of Banach spaces, Handbook of the geometry of Banach spaces, Vol. I, 781–835, North-Holland, Amsterdam, 2001.
[37] I. Gohberg (team 9), S. Goldberg and M.A. Kaashoek (team 2), Basic Classes of Linear Operators, Birkhäuser Verlag, Basel, 2003; 423 pp.
[38] I. Gohberg (team 9), M.A. Kaashoek (team 2) and A.L. Sakhnovich, Scattering problems for a canonical system with a pseudo-exponential potential, Asymp. An. 29 (2002), 1–38.
[39] I. Gohberg (team 9), M.A. Kaashoek (team 2) and F. van Schagen (team 2), On inversion of convolution integral operators on a finite interval, in: Operator Theoretical Methods and Applications to Mathematical Physics. The Erhard Meister Memorial Volume, OT 147, Birkhäuser Verlag, Basel, 2004, pp. 277–285.
[40] A. Hartmann (team 1), D. Sarason, and K. Seip (team 7), Surjective Toeplitz operators, Acta Sci. Math. Szeged 70 (2004), 609–621.
[41] A. Hartmann (team 1), X. Massaneda (team 3), A. Nicolau (team 3) and P. Thomas, Free Interpolation in the Nevanlinna and Smirnov classes and harmonic majorants, J. of Func. An., to appear.
[42] A. Hartmann (team 1) and K. Seip (team 7), Extremal functions as divisors for kernels of Toeplitz operators, J. Funct. Anal. 202 (2003), no. 2, 342–362.
[43] H. Hedenmalm (team 7), B. Korenblum and K. Zhu, Theory of Bergman spaces, Graduate Texts in Mathematics, 199. Springer-Verlag, New York, 2000. x+286 pp.
[44] D. Hitt, Invariant subspaces of $H^2$ of an annulus, Pacific J. Math. 134 (1988), 101–120.
[45] S.V. Hruscev, N.K. Nikolski and B.S. Pavlov, Unconditional bases of exponentials and reproducing kernels, Springer Lect. Notes 864, Springer-Verlag, New York (1981), 214–335.
[46] B. Jacob (team 5), J.R. Partington (team 5) and S. Pott (team 1), Conditions for admissibility of observation operators and boundedness of Hankel operators, Int. Eq. and Op. Theory 47 (2003), 315–338.
[47] S.V. Kislyakov (team 10), Banach spaces and classical harmonic analysis, Handbook of the geometry of Banach spaces, Vol. I, 871–898, North-Holland, Amsterdam, 2001.
[48] S.V. Kislyakov, On BMO-regular lattices of measurable functions, (Russian) Algebra i Analiz 14 (2002), no. 2, 117–135; translation in St. Petersburg Math. J. 14 (2003), no. 2, 273–286.
[49] H. Langer (team 8), A. Markus, V. Matsaev (team 9) and C. Tretter (team 8), A new concept for block operator matrices: The quadratic numerical range, Linear Algebra Appl. 330 (2001), 89–112.
[50] H. Langer (team 8), A. Markus, V. Matsaev (team 9) and C. Tretter (team 8), Self-adjoint block operator matrices with non-separated diagonal entries and their Schur complement, J. Func. An. 199 (2003), 427–451.
[51] H. Langer (team 8), A.C.M. Ran (team 2) and B. van de Rotten (team 2), Invariant subspaces of infinite-dimensional Hamiltonians and solutions of the corresponding Riccati equations, Operator Theory: Advances and Applications 130 (2001), 235–254.
[52] P. Massart, A nonasymptotic theory for model selection, European Congress of Mathematics (Stockholm, 2004), this volume.
[53] J. Mateu (team 3), X. Tolsa (team 3) and J. Verdera (team 3), The planar Cantor sets of zero analytic capacity and the local T(b)-theorem, J. Amer. Math. Soc. 16 (2003), no. 1, 19–28.
[54] P. Mattila, M. Melnikov and J. Verdera, The Cauchy integral, analytic capacity, and uniform rectifiability, Annals of Math. 144 (1996), 127–136.
[55] B. Maurey (team 6), Banach spaces with few operators, Handbook of the geometry of Banach spaces, Vol. 2, 1247–1297, North-Holland, Amsterdam, 2003.
[56] B. Maurey (team 6), Type, cotype and K-convexity, Handbook of the geometry of Banach spaces, Vol. 2, 1299–1332, North-Holland, Amsterdam, 2003.
[57] V. Milman (team 9) and A. Pajor (team 6), Regularization of star bodies by random hyperplane cut off, Studia Math. 159 (2003), 247–261.
[58] F. Nazarov, M. Sodin (team 9) and A. Volberg (team 6), Local dimension-free estimates for volumes of sublevel sets of analytic functions, Israel J. Math. 133 (2003), 269–283.
[59] F. Nazarov, S. Treil and A. Volberg (team 6), The T(b)-theorem on nonhomogeneous spaces, Acta Math. 190 (2003), no. 2, 151–239.
[60] A. Nicolau (team 3), J. Ortega-Cerda (team 3) and K. Seip (team 7), The constant of interpolation, Pac. J. Math. 213 (2004), 389–398.
[61] N.K. Nikolski, Selected Problems of Weighted Approximation and Spectral Analysis, Proc. Steklov Math. Inst., Vol. 120 (1974), Amer. Math. Soc., Providence (1976).
[62] N.K. Nikolski (team 1), Operators, functions, and systems: an easy reading. Vol. 1. Hardy, Hankel, and Toeplitz. Mathematical Surveys and Monographs, 92. American Mathematical Society, Providence, RI, 2002. xiv+461 pp.
[63] N.K. Nikolski (team 1), Operators, functions, and systems: an easy reading. Vol. 2. Model operators and systems. Mathematical Surveys and Monographs, 93. American Mathematical Society, Providence, RI, 2002. xiv+439 pp.
[64] N.K. Nikolski (team 1) and S. Treil, Linear resolvent growth of rank one perturbation of a unitary operator does not imply its similarity to a normal operator. Dedicated to the memory of Thomas H. Wolff, J. Anal. Math. 87 (2002), 415–431.
[65] J. Ortega-Cerda (team 3) and K. Seip (team 7), On Fourier frames, Annals of Math. 155 (2002), 789–806.
[66] J.R. Partington (team 5), Linear operators and linear systems: An Analytical approach to Control Theory, London Math. Soc. Student texts 60, Cambridge University Press, 2004, 176 pp.
[67] B.S. Pavlov, The basis property of a system of exponentials and the condition of Muckenhoupt, Dokl. Acad. Nauk SSSR 247 (1979), 37–40.
[68] G. Pisier (team 6), Introduction to operator space theory. London Mathematical Society Lecture Note Series, 294. Cambridge University Press, Cambridge, 2003. viii+478 pp.
[69] G. Pisier (team 6), Operator spaces, Handbook of the geometry of Banach spaces, Vol. 2, 1425–1458, North-Holland, Amsterdam, 2003.
[70] G. Pisier (team 6) and S. Shlyakhtenko, Grothendieck’s theorem for operator spaces, Invent. Math. 150 (2002), 185–217.
[71] G. Pisier (team 6) and Q. Xu (team 6), Non-commutative $L_p$-spaces, Handbook of the geometry of Banach spaces, Vol. 2, 1459–1517, North-Holland, Amsterdam, 2003.
[72] S. Pott (team 1) and E. Strouse (team 1), Product of Toeplitz operators on weighted Bergman spaces, submitted.
[73] D. Sarason, Nearly invariant subspaces for the backward shift, Op. Theory Adv. Appl. 35 (1988), 481–493.
[74] Y. Raynaud (team 6) and Q. Xu (team 6), Subspaces of non-commutative $L_p$-spaces, J. Func. Anal. 203 (2003), 149–196.
[75] K. Seip (team 7), Interpolation and Sampling in Spaces of Analytic Functions, University Lect. Notes Series 33, AMS, Providence, R.I., 2004, xii+139 pp.
[76] S. Shimorin (team 7), On Beurling type theorems in weighted $\ell^2$ and Bergman spaces, Proc. Amer. Math. Soc. 131 (2003), 1777–1787.
[77] M. Sodin (team 9), Zeroes of gaussian analytic functions, European Congress of Mathematics (Stockholm, 2004), this volume.
[78] M. Talagrand (team 6), Spin glasses: A challenge for mathematicians. Cavity and mean field models. Erg. Math. 46, Springer-Verlag Berlin, 2003, ix, 586 pp.
[79] X. Tolsa (team 3), Painlevé’s problem and the semiadditivity of analytic capacity, Acta Math. 190 (2003), 105–149.
[80] X. Tolsa (team 3), The semiadditivity of continuous analytic capacity and the inner boundary conjecture, Amer. J. Math. 126 (2004), 523–567.
[81] X. Tolsa (team 3), The semiadditivity of analytic capacity and the Painlevé problem, European Congress of Mathematics (Stockholm, 2004), this volume.
[82] J. Verdera (team 3), Ensembles effaçables, ensembles invisibles et le problème du voyageur de commerce, ou comment l’analyse réelle aide l’analyse complexe, Gazette des mathématiciens 101 (2004), 21–49. (Traduit du catalan par N. Marco.)

J. Esterle
Laboratoire Bordelais d’Analyse et Géométrie
UMR 5467
Université Bordeaux 1
351, Cours de la Libération
F-33405-Talence, France
e-mail: [email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
Analysis of the Bottom of the Spectrum of Schrödinger Operators with Magnetic Potentials and Applications

Bernard Helffer

Abstract. The aim of this article is to present a review on the analysis of the discrete spectrum of the Schrödinger operator with magnetic field, with particular emphasis on the bottom of the spectrum.

This research is partially supported by the programme SPECT (ESF). This programme in Spectral Theory and Partial Differential Equations is supported by the European Science Foundation (http://www.esf.org/). Although the present text is not a report on the whole activity of the programme, we hope that it will give a good illustration of one of its main subjects.

1. Introduction

We would like to review the discrete spectrum of a Schrödinger operator with magnetic field in the semi-classical regime. We know that there have been many surveys on this question (see for example [25], [26] or [66]), but we will mainly discuss, without claiming to be exhaustive or complete, recent results obtained in this regime for low lying eigenvalues of this operator, and we refer to another survey [27] for the description of the possible applications to Superconductivity. In an open set $\Omega \subset \mathbb{R}^n$ (or more generally on a compact or non compact manifold, with or without boundary) we consider the Schrödinger operator with magnetic field:
$\Delta_{h,A,V} = \sum_j (hD_{x_j} - A_j)^2 + V,$
where h > 0 is a possibly small parameter (semi-classical limit), $\omega_A$ is the 1-form called the magnetic potential (sometimes identified with a vector $\vec{A}$),
$\omega_A = \sum_j A_j(x)\,dx_j, \quad \vec{A} = (A_1, \dots, A_n),$
and V is a $C^\infty$ potential. Because some open set Ω is involved, we should be more precise about the problem under consideration and define which selfadjoint realization we will consider. When Ω is bounded with regular boundary, we will
mainly be interested in the analysis of two selfadjoint realizations, which are determined by:
• the Dirichlet condition: $u_{/\partial\Omega} = 0$, or
• the Neumann condition: $(\vec{n} \cdot (h\nabla - i\vec{A})u)_{/\partial\Omega} = 0$.
When $\Omega = \mathbb{R}^n$, one can show (due to the regularity assumptions) that $\Delta_{h,A,V}$ is essentially selfadjoint on $C_0^\infty(\mathbb{R}^n)$ as soon as it is semi-bounded. This is in particular the case when V is semibounded. Our basic object is the magnetic field, which is by definition the 2-form
$\sigma_B = d\omega_A.$ (1.1)
When n = 2, one frequently uses the identification of $\sigma_B$ with a function B, through the formula $\sigma_B = B\,dx_1 \wedge dx_2$. When n = 3, the two-form $\sigma_B$,
$\sigma_B = \sum_{i<j} B_{ij}\,dx_i \wedge dx_j,$
can be identified with a vector field $\vec{B} = (B_1, B_2, B_3)$ by the Hodge map: $B_1 = B_{23}$, $B_2 = -B_{13}$, $B_3 = B_{12}$. Sometimes, we extend $B_{ij}$ into an antisymmetric matrix. In particular, one can also write
$\sigma_B = \frac{1}{2} \sum_{i \ne j} B_{ij}\,dx_i \wedge dx_j.$
For the analysis, it is crucial to realize that the components of the magnetic field can be recovered as a commutator:
$B_{jk} = \frac{1}{ih}\,[hD_{x_j} - A_j,\ hD_{x_k} - A_k].$ (1.2)
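The commutator formula (1.2) can be checked numerically in dimension 2. The sketch below assumes the conventions $D_{x_j} = -i\partial_{x_j}$ and $B_{12} = \partial_{x_1} A_2 - \partial_{x_2} A_1$; the potential, the test function, the value of h and all names are arbitrary choices made for illustration, and derivatives are approximated by central differences, so the agreement is only up to discretization error.

```python
import cmath
import math

H = 0.7       # semi-classical parameter h (arbitrary test value)
STEP = 1e-3   # finite-difference step


def a1(x, y):
    """First component of a hypothetical magnetic potential."""
    return x * y * y


def a2(x, y):
    """Second component of a hypothetical magnetic potential."""
    return math.sin(x) + y


def b12(x, y):
    """B_12 = d(A_2)/dx - d(A_1)/dy, computed by hand for the potential above."""
    return math.cos(x) - 2.0 * x * y


def u(x, y):
    """Smooth complex-valued test function."""
    return cmath.exp(-(x * x + y * y) + 1j * (x - 2.0 * y))


def p1(f):
    """Return the function (h D_x - A_1) f, with D_x = -i d/dx."""
    def g(x, y):
        dfdx = (f(x + STEP, y) - f(x - STEP, y)) / (2.0 * STEP)
        return -1j * H * dfdx - a1(x, y) * f(x, y)
    return g


def p2(f):
    """Return the function (h D_y - A_2) f, with D_y = -i d/dy."""
    def g(x, y):
        dfdy = (f(x, y + STEP) - f(x, y - STEP)) / (2.0 * STEP)
        return -1j * H * dfdy - a2(x, y) * f(x, y)
    return g


def commutator_over_ih(f, x, y):
    """Evaluate (1/(ih)) [h D_x - A_1, h D_y - A_2] f at the point (x, y)."""
    comm = p1(p2(f))(x, y) - p2(p1(f))(x, y)
    return comm / (1j * H)


if __name__ == "__main__":
    x, y = 0.3, -0.4
    print(commutator_over_ih(u, x, y))   # left-hand side applied to u
    print(b12(x, y) * u(x, y))           # multiplication by B_12
```

Both printed values should coincide up to an error of the order of the square of the step, which reflects that the commutator acts as multiplication by the field component.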
Finally, another important point is the gauge invariance:
$(u, \vec{A}) \mapsto (u \exp(-i\phi/h),\ \vec{A} + \nabla\phi),$
where $x \mapsto \exp(-i\phi/h)$ has to be a $C^\infty$ function in Ω. So φ can be chosen locally as a $C^\infty$ function, and globally when Ω is simply connected. In particular, it is easy to see that dφ is a $C^\infty$ one-form and that $\omega_A$ and $\omega_A + d\phi$ have the same associated $\sigma_B$. The converse is only partially true. The topology of Ω can indeed play a role and create interesting phenomena like the Aharonov-Bohm effect, and this will be discussed later.
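The statement that a potential and its gauge transform produce the same magnetic field can be illustrated numerically in dimension 2. This is only a sketch under stated assumptions: the potential and the gauge function $\phi(x, y) = \sin(xy) + x^2 y$ are arbitrary illustrative choices, the field is taken to be $B = \partial_x A_2 - \partial_y A_1$, and derivatives are approximated by central differences.

```python
import math

STEP = 1e-4  # finite-difference step


def potential(x, y):
    """A hypothetical magnetic potential (A_1, A_2)."""
    return (x * y * y, math.sin(x) + y)


def grad_phi(x, y):
    """Exact gradient of the gauge function phi(x, y) = sin(x*y) + x^2 * y."""
    c = math.cos(x * y)
    return (y * c + 2.0 * x * y, x * c + x * x)


def gauged_potential(x, y):
    """The potential A + grad(phi)."""
    a = potential(x, y)
    g = grad_phi(x, y)
    return (a[0] + g[0], a[1] + g[1])


def field(pot, x, y):
    """Approximate B = d(A_2)/dx - d(A_1)/dy by central differences."""
    d1_a2 = (pot(x + STEP, y)[1] - pot(x - STEP, y)[1]) / (2.0 * STEP)
    d2_a1 = (pot(x, y + STEP)[0] - pot(x, y - STEP)[0]) / (2.0 * STEP)
    return d1_a2 - d2_a1


if __name__ == "__main__":
    for point in [(0.3, -0.4), (1.2, 0.7), (-2.0, 1.5)]:
        print(point, field(potential, *point), field(gauged_potential, *point))
```

The two field values agree at each point up to discretization error, since the contribution of the gradient term is the curl of a gradient.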
2. Mathematical questions and physical motivations

We are mainly interested in:
(1) the dependence of the bottom of the spectrum (also called, in more physical language, the ground state energy) of the selfadjoint realization of $\Delta_{h,A,V}$ on A (or on the associated magnetic field $\sigma_B$), on the parameter h, and on the geometry of Ω (holes, points of maximal curvature, corners);
(2) the localization, if it exists, of a corresponding eigenvector (also called groundstate) in the semi-classical limit, that is when h → 0;
(3) the multiplicity of the lowest eigenvalue (when A = 0, we know that the lowest eigenvalue, if it exists, is simple).

In addition to their intrinsic interest, these mathematical questions are strongly motivated by various areas in Physics or Mathematics. In Atomic Physics, we mention for example all the questions around the stability of matter (see the results by Lieb-Solovej-Yngvason in the presentation given in [77] and references therein). In Geometry, one more naturally speaks about connections (instead of magnetic potentials) and curvature (instead of magnetic fields). Geometry and Complex Analysis have brought many interesting questions in spectral analysis. We refer to Montgomery [65], Helffer-Mohamed [34], Kwek-Pan [55], Demailly [18], Fu-Straube¹ [22], Christ-Fu [15]. We will not discuss here all the questions coming from Solid State Physics. The properties of the Schrödinger operator with constant magnetic field and periodic potential are connected to the spectral properties of the fascinating Harper’s operator, which is isospectral, for a suitable $\tilde{h} > 0$ depending on the flux, to an $\tilde{h}$-pseudodifferential operator on $L^2(\mathbb{R})$: $\cos(\tilde{h} D_x) + \cos x$. We refer to Bellissard [5, 6], Helffer-Sjöstrand [46, 76] for works which were mainly obtained in the nineties in an attempt to understand, with semi-classical techniques, the Ten Martini conjecture² about the Cantor structure of the spectrum of this Harper’s operator, and also the so-called De Haas-Van Alphen effect. Finally, many of the results presented here are motivated by questions coming from the theory of Superconductivity (see Del Pino-Felmer-Sternberg [71], Lu-Pan [58, 59, 60, 61], Helffer-Morame [35, 36, 37], Pan [68], Fournais-Helffer [21]).

¹ The authors investigate there the connection between the compactness of the $\bar\partial$-Neumann operator on Hartogs domains in $\mathbb{C}^2$ and the spectral properties of certain Schrödinger operators in the semi-classical framework.
² This Ten Martini conjecture was finally solved in 2004, without the help of semi-classical analysis, by the combined efforts of many mathematicians including Last (1994), Puig (2004), Avila-Krikorian (2004) and Avila-Jitomirskaya (2004) (see [51]).

3. Compactness of the resolvent and essential spectrum

3.1. The general question. It is well known that, when $\Omega = \mathbb{R}^n$, $-\Delta_A + V$ has compact resolvent if V tends to +∞ as |x| → +∞. Let us recall that it
is not a necessary condition. The standard example
$-\Delta + x_1^2 x_2^2$
shows that there are cases (already without magnetic potential) when V does not tend to ∞ but the operator has compact resolvent. The next question, which was already addressed in Avron-Herbst-Simon [4], is the question of the magnetic bottles and is devoted to the analysis of the same question in the more specific case when V = 0. This was analyzed, sometimes as a by-product of other questions³ (as for example by Helffer-Nourrigat [39]), by many authors including D. Robert [72], B. Simon [74], Helffer-Mohamed [33]. This question now has a thirty-year history and we refer to [25] and to [54] for references. We just recall here a few elements. When n = 2, the inequality
$\pm h \int_\Omega B(x)\,|u(x)|^2\,dx \le \|\nabla_{h,A} u\|^2, \quad \forall u \in C_0^\infty(\Omega),$ (3.1)
with $\nabla_{h,A} = h\nabla - i\vec{A}$, immediately gives the result that if |B(x)| → +∞, then the Dirichlet realization has compact resolvent. A first remark is that this is more difficult for n ≥ 3. If the same proof works when $\vec{B}$ has constant direction (see [4]), this is no longer the case when $\vec{B}$ has a strongly oscillating direction. So there is an analog of this inequality, but with an additional remainder term and an additional assumption of regularity on the components of the magnetic field. Another remark is that it works for the Dirichlet problem but not for the Neumann one.

3.2. Some criteria for magnetic bottles. Inequality (3.1) is based on the fact that B(x) can be expressed as the bracket of two selfadjoint vector fields (see (1.2)). One can iterate this argument with higher-order brackets, following Kohn’s argument [53]. This leads to the Helffer-Mohamed criterion, which in this particular case (we look here at the pure magnetic effect, so we assume V = 0) reads:

Theorem 3.1. Suppose $\Omega = \mathbb{R}^n$ and, for k ≥ 0, let
$m_k(x) = \sum_{|\alpha| = k,\ j,\ \ell} |D_x^\alpha B_{j\ell}(x)|.$
Suppose that there exist r ≥ 0 and C > 0 such that
$m^r(x) := \sum_{k \le r} m_k(x) \to +\infty$ as |x| → +∞,
and
$m_{r+1}(x) \le C\,(1 + m^r(x)).$
Then $-\Delta_A$ has compact resolvent.

³ Initially developed for analyzing the hypoellipticity of polynomials of vector fields like the Hörmander operators, these “nilpotent” techniques relate the problem of compact resolvent for some operators with polynomial coefficients to the question of irreducibility of some representation of a nilpotent Lie group.
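The quantities $m_0$ and $m_1$ of Theorem 3.1 can be evaluated numerically for polynomial potentials in $\mathbb{R}^2$. The following sketch uses the two potentials of the examples discussed in this subsection, read off under the assumption that the operator is written as $(D_{x_1} - A_1)^2 + (D_{x_2} - A_2)^2$ and that the field is the single function $B = \partial_{x_1} A_2 - \partial_{x_2} A_1$ (so the sum over the indices j, ℓ is reduced to one term). All function names are hypothetical and derivatives are taken by central differences.

```python
STEP = 1e-3  # finite-difference step


def potential_first(x1, x2):
    """A = (x2 * x1^2, -x1 * x2^2); its field is B = -(x1^2 + x2^2)."""
    return (x2 * x1 ** 2, -x1 * x2 ** 2)


def potential_second(x1, x2):
    """A = (x2 * x1^2, x1 * x2^2); its field is B = x2^2 - x1^2."""
    return (x2 * x1 ** 2, x1 * x2 ** 2)


def field(pot, x1, x2):
    """Approximate B = d(A_2)/dx1 - d(A_1)/dx2 by central differences."""
    d1_a2 = (pot(x1 + STEP, x2)[1] - pot(x1 - STEP, x2)[1]) / (2.0 * STEP)
    d2_a1 = (pot(x1, x2 + STEP)[0] - pot(x1, x2 - STEP)[0]) / (2.0 * STEP)
    return d1_a2 - d2_a1


def m0(pot, x1, x2):
    """m_0(x) = |B(x)| (two-dimensional simplification of the theorem's sum)."""
    return abs(field(pot, x1, x2))


def m1(pot, x1, x2):
    """m_1(x) = |dB/dx1| + |dB/dx2|, again by central differences."""
    d1 = (field(pot, x1 + STEP, x2) - field(pot, x1 - STEP, x2)) / (2.0 * STEP)
    d2 = (field(pot, x1, x2 + STEP) - field(pot, x1, x2 - STEP)) / (2.0 * STEP)
    return abs(d1) + abs(d2)


if __name__ == "__main__":
    for t in (1.0, 5.0, 10.0):
        # Along the diagonal x1 = x2 = t the second field vanishes,
        # while its first derivatives keep growing.
        print(t, m0(potential_first, t, t), m0(potential_second, t, t),
              m1(potential_second, t, t))
```

Along the diagonal the first potential has $m_0$ growing quadratically, whereas for the second potential $m_0$ vanishes and only $m_0 + m_1$ grows, which is why the second case needs r = 1.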
The two following examples in $\mathbb{R}^2$ illustrate the theorem. In the first case, the operator
$(D_{x_1} - x_2 x_1^2)^2 + (D_{x_2} + x_1 x_2^2)^2$
satisfies the condition with r = 0, but it is easy to obtain the statement directly by using inequality (3.1). The second example satisfies the condition with r = 1:
$(D_{x_1} - x_2 x_1^2)^2 + (D_{x_2} - x_1 x_2^2)^2.$
Note that in this case B(x) has no constant sign and vanishes for $x_1 = \pm x_2$. With a different approach, the reader can look at [54] for other results (involving the notion of capacity) and references on the subject.

3.3. Are there magnetic bottles for Dirac or Pauli? Here we consider the case when $\Omega = \mathbb{R}^n$ (n = 2, 3). We would like to recall in this subsection rather old results in order to mention still open conjectures. We recall that the Dirac operator⁴ is defined on $L^2(\Omega, \mathbb{C}^k)$ (k = 2 if n = 2, k = 4 if n = 3) as
$D_{h,A} = \sum_j \alpha_j (hD_{x_j} - A_j),$ (3.2)
where, for j = 1, . . . , n, the $\alpha_j$’s are the Pauli (symmetric) k × k matrices:
$\alpha_j \alpha_k + \alpha_k \alpha_j = 2\delta_{jk}.$ (3.3)
Also of interest is its square $D_{h,A}^2$, which is also called the Pauli operator. Under suitable assumptions, which are in particular implied by the assumptions of Theorem 3.1, Helffer-Nourrigat-Wang [41] have shown that the Dirac and Pauli operators do not have compact resolvent! Moreover, they show that when n ≥ 3, the essential spectrum is R. Together with other results recalled in the excellent book by B. Thaller [78], this leads to the following:

Conjecture 3.2. The pure magnetic Dirac operator $D_{h,A}$ never has compact resolvent.

Note that it is also related to recent papers by L. Erdös and J.P. Solovej [20] and the Aharonov-Casher theorem (see [16] and references therein).

⁴ We mention only the “massless” case.

4. Decay estimates

4.1. Semi-classical decay estimates. The fine analysis of the groundstates in the semi-classical regime requires an a priori control of the decay of the eigenfunctions. There are very few cases, outside the case of the eigenfunctions of the harmonic oscillator, where explicit expressions are known and therefore, except in the 1-dimensional case where special techniques can be used, we need to develop a priori estimates. In the semi-classical regime, this control of the decay has played a basic role in the analysis of the tunnelling effect for the low lying eigenvalues of $-h^2\Delta + V$ (Helffer-Sjöstrand [42], Simon [75]). These authors have shown
how the techniques initially developed by S. Agmon [1] (one can also refer to Lithner) for the analysis of the decay at ∞ of eigenfunctions associated with eigenvalues below the essential spectrum can be transposed to the semi-classical context. Later it also appeared to be quite important for the analysis of pure magnetic Laplacians (Brummelhuis [12], Helffer-Mohamed [34], Helffer-Nourrigat [40], Helffer-Morame [35, 36, 37]). We describe below various types of estimates which were obtained in this spirit.

4.1.1. Boundaryless case. There are two types of results in the case without boundary.

Type 1: The magnetic field increases the decay. This concerns the operator $\Delta_{h,A,V}$. The groundstate decays at least as in the case when A = 0 (this is connected to diamagnetism), and it is proven in [45] that one has “roughly” an estimate of the type
$|u_h(x)| \le C \exp\bigl(-d_{V - \min V}(x, V^{(-1)}(\min V))/h\bigr).$ (4.1)
Here $d_{V - \min V}$ is the Agmon distance which will be defined later. By “roughly”, we mean that it should be a corresponding $L^2$ estimate, that it could be true only on compact subsets and that, depending on the assumptions, C has to be replaced, in the right-hand side of (4.1), by $C_\varepsilon \exp(\varepsilon/h)$, $\forall \varepsilon > 0$, or by $C_{N_0} h^{-N_0}$ for some $N_0$, the inequality being true uniformly for $h \in\, ]0, h_0]$, with $h_0 > 0$ small enough. In addition, it has been shown that the estimate is rather optimal (for A = 0) when WKB constructions are available (in the case when V has non degenerate minima) [42]. When a magnetic field is added, explicit computations in the case when B is constant and V quadratic [79] show that the estimate is not optimal.

Type 2: The magnetic field can create the decay. This concerns the operator $\Delta_{h,A,0}$. The magnetic field is itself creating the decay, and the next inequality expresses the property that a groundstate lives near the minima of the function $x \mapsto |B(x)|$:
$|u_h(x)| \le C \exp\bigl(-d_B(x, |B|^{(-1)}(\min |B|))/(C\sqrt{h})\bigr).$ (4.2)
Note that, as is already clear from the proof, the result is NOT optimal (see [45], [25], [19], [67], [62]).

Degenerate minima. When |B| has a degenerate minimum, it is interesting to analyze the decay inside the submanifold where |B| is minimum. Analogous phenomena were met previously in the analysis of the Schrödinger operator with degenerate wells [43, 44], that is when the set of minima of the electric potential V is a union of disjoint connected submanifolds. Two typical models are given below corresponding to quite different effects:
• Miniwells:
$$h^2(D_x^2 + D_y^2) + (1 + x^2)(1 - x^2 - y^2)^2. \tag{4.3}$$
The well is $x^2 + y^2 = 1$ and the miniwells are $(0, \pm 1)$.
• Uniformly degenerate wells:
$$h^2(D_x^2 + D_y^2) + (1 - x^2 - y^2)^2. \tag{4.4}$$
The well is $x^2 + y^2 = 1$ and is invariant by rotation. In the first case, the ground state is localized at the miniwells. In the second case instead, the ground state is uniformly localized on the circle, as an invariance argument immediately shows.

4.1.2. The case with boundary. When a boundary is involved, the situation is more complicated and depends on the boundary condition. The localization, which appeared previously at the minima of $V$ or at the minima of $|B|$, can also appear at the boundary. The situation could depend dramatically on which realization is concerned. When $B$ is constant, we will for example meet
$$|u_h(x)| \le C \exp\bigl(-d(x, \partial\Omega)/\sqrt{Ch}\bigr), \tag{4.5}$$
for the groundstate of the Neumann realization of the magnetic Laplacian, and sometimes it will be possible to measure the decay inside the boundary [59], [71], [35] and the recent [21].

4.2. Agmon estimates. The Agmon distance $d_V$ is associated to the metric $(V - E)_+\,dx^2$. The proof of the so-called Agmon estimates is based on the identity:
$$\mathrm{Re}\,\Bigl\langle \exp\frac{2\Phi}{h}\,(\Delta_{h,A,V} - \lambda)u \Bigm| u \Bigr\rangle = \Bigl\|\nabla_{h,A}\Bigl(\exp\frac{\Phi}{h}\,u\Bigr)\Bigr\|^2 + \int (V - \lambda - |\nabla\Phi|^2)\exp\frac{2\Phi}{h}\,|u|^2. \tag{4.6}$$
Then the main point is to find the right $\Phi$ (usually associated to a possibly regularized distance) and to apply this identity to an eigenvector $u_h$ (attached to $\lambda_h$). As observed in [45], this leads to the result that the decay obtained when $A = 0$ for a groundstate can also be shown in the same way for the case with magnetic field, but the result is no longer optimal [19, 67, 62].

Rough heuristics. The main idea for understanding qualitatively the localization of the ground state of the Dirichlet realization of $\Delta_{h,A} + V_0 + hV_1$ is to replace it by $-h^2\Delta + V_0 + h(\|\sigma_B\| + V_1)$. When $V_0 = 0 = V_1$, the groundstate is localized near the minima of $\|B\|$. When $V = hV_1$, one should compare the effect of $V_1$ and $\|\sigma_B\|$.
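To make the Agmon distance of Section 4.2 concrete, here is a minimal one-dimensional sketch. In dimension one, the distance attached to the metric $(V - E)_+\,dx^2$ reduces to the integral of $\sqrt{(V - E)_+}$ between the two points. The double-well potential $V(t) = (1 - t^2)^2$, the energy $E = 0$ and the helper name `agmon_distance_1d` are illustrative assumptions, not taken from the text; for this choice the distance between the two wells is $\int_{-1}^{1}(1 - t^2)\,dt = 4/3$.

```python
import math


def agmon_distance_1d(V, E, x, y, n=20000):
    """Midpoint-rule approximation of the 1D Agmon distance
    d(x, y) = |integral from x to y of sqrt((V(t) - E)_+) dt|.
    (Hypothetical helper, for illustration only.)"""
    a, b = min(x, y), max(x, y)
    step = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * step
        total += math.sqrt(max(V(t) - E, 0.0))
    return total * step


def double_well(t):
    # Illustrative potential with wells at t = -1 and t = +1 (an assumption).
    return (1.0 - t * t) ** 2


d = agmon_distance_1d(double_well, 0.0, -1.0, 1.0)
print(d)  # close to 4/3
```

In the spirit of (4.1), a groundstate attached to one well is then expected to be, roughly, of size $\exp(-d/h)$ near the other well.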
The situation for the Neumann realization is completely different! When $V = 0$ and $\sigma_B$ is constant (not zero), the groundstate is localized at the boundary: the effective potential near the boundary is $h\Theta_0\|\sigma_B\|$, with $0 < \Theta_0 < 1$, where $\Theta_0$ will be defined in the analysis of the second model of Section 5. So this effective potential is below the interior effective potential, which is $h\|\sigma_B\|$.

5. Five reference models

One part of the analysis is based on spectral properties of models. Let us briefly describe some of them and refer to the original papers for further analysis [17], [9], [58], [59], [60], [61], [8], [71], [36, 37, 38].

The first model is the harmonic oscillator
$$H(t, D_t) := D_t^2 + t^2 \quad \text{on } \mathbb{R}. \tag{5.1}$$
As is standard, it appears immediately when analyzing the Schrödinger operator with constant magnetic field equal to 1 on $\mathbb{R}^2$. Actually, the operator $(D_{x_1} - \frac{x_2}{2})^2 + (D_{x_2} + \frac{x_1}{2})^2$ is unitarily equivalent to the operator $H(t, D_t)$ seen as an unbounded operator on $L^2(\mathbb{R}^2_{s,t})$. In particular the bottom of the spectrum is 1, that is the celebrated first Landau level.

The second model occurs in the analysis of the Schrödinger operator with constant magnetic field equal to 1 on $\mathbb{R}^2_+$. This time the Neumann realization of $(D_{x_1} - \frac{x_2}{2})^2 + (D_{x_2} + \frac{x_1}{2})^2$ in $\mathbb{R}^2_+$ is unitarily equivalent to the Hilbertian integral (over $\rho \in \mathbb{R}$) of the family
$$H^{Neu}(\rho) := D_t^2 + (t - \rho)^2 \quad \text{in } \mathbb{R}_+, \tag{5.2}$$
with Neumann condition at 0. So the bottom of the spectrum is the infimum over $\rho$ of the bottom of $\sigma(H^{Neu}(\rho))$. It has been shown ([9], [17]) that this infimum $\Theta_0$ is attained for a unique $\rho_0 > 0$ and that $\Theta_0 \in ]0, 1[$. The fact that $\Theta_0$ is strictly below the value 1, which corresponds to the groundstate energy in $\mathbb{R}^2$, is crucial for understanding the localization of the ground state in a general bounded domain $\Omega$.

The third model occurs ([61], [36], [37]) in the analysis of the model in $\mathbb{R}^3_+$. If $\frac{\pi}{2} - \theta$ is the angle between $B$ and the vector $(0, 0, 1)$, one is reduced to analyzing the infimum $\Theta(\theta)$ over $\rho$ of the bottom of the spectrum of $D_t^2 + D_s^2 + (t\cos\theta - s\sin\theta - \rho)^2$ on $\mathbb{R}^{2,+}$, with Neumann condition at $t = 0$. What is crucial here is that the map $\theta \mapsto \Theta(\theta)$ is a monotonically increasing bijection from $[0, \frac{\pi}{2}]$ onto $[\Theta_0, 1]$. In particular $\Theta$ is minimal when $\theta = 0$, that is when the magnetic vector is tangent to the boundary.

The fourth model is the family
$$M_\rho(u, D_u) := D_u^2 + (u^2 - \rho)^2 \quad \text{on } \mathbb{R}. \tag{5.3}$$
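The infima over $\rho$ attached to the one-dimensional models (5.2) and (5.3) can be approximated numerically. The following is a minimal sketch, not the method of the cited papers: it assumes a second-order finite-difference discretization on a truncated interval, a Sturm-sequence bisection for the lowest eigenvalue, and a golden-section search in $\rho$ (which presupposes that the map $\rho \mapsto$ bottom of the spectrum is unimodal on the search interval). The function names, grid sizes and truncation lengths are illustrative choices. For (5.2) the result should be close to the known value $\Theta_0 \approx 0.59$.

```python
import math


def lowest_eigenvalue(potential, a, b, n, neumann_left=False):
    """Smallest eigenvalue of -u'' + potential(t) u on (a, b), potential >= 0.
    Finite differences on n cell centres, Sturm-sequence bisection.
    Dirichlet-type truncation at both ends, except at a if neumann_left."""
    dt = (b - a) / n
    off2 = 1.0 / dt ** 4  # square of the off-diagonal entry -1/dt^2
    diag = [2.0 / dt ** 2 + potential(a + (i + 0.5) * dt) for i in range(n)]
    if neumann_left:
        diag[0] -= 1.0 / dt ** 2  # ghost cell u_{-1} = u_0 encodes u'(a) = 0

    def count_below(x):
        """Number of eigenvalues of the tridiagonal matrix that are < x."""
        count, q = 0, 1.0
        for i, d in enumerate(diag):
            q = d - x - (off2 / q if i else 0.0)
            if q == 0.0:
                q = 1e-200
            if q < 0.0:
                count += 1
        return count

    lo, hi = 0.0, min(diag) + 1.0  # the lowest eigenvalue lies in [0, min(diag)]
    while hi - lo > 1e-10:
        mid = 0.5 * (lo + hi)
        if count_below(mid) >= 1:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def argmin_golden(f, lo, hi, tol=1e-4):
    """Golden-section search; assumes f is unimodal on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - g * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + g * (hi - lo)
            fd = f(d)
    x = 0.5 * (lo + hi)
    return x, f(x)


# Second model (5.2): D_t^2 + (t - rho)^2 on the half-line, Neumann at 0.
rho0, theta0 = argmin_golden(
    lambda rho: lowest_eigenvalue(lambda t: (t - rho) ** 2, 0.0, 8.0, 400,
                                  neumann_left=True),
    0.0, 2.0)

# Fourth model (5.3): D_u^2 + (u^2 - rho)^2 on the line (truncated to [-5, 5]).
rho1, nu0 = argmin_golden(
    lambda rho: lowest_eigenvalue(lambda u: (u * u - rho) ** 2, -5.0, 5.0, 500),
    0.0, 1.5)

print("second model: infimum", theta0, "attained near rho =", rho0)
print("fourth model: infimum", nu0, "attained near rho =", rho1)
```

The cell-centred grid is chosen so that the Neumann condition is imposed by a symmetric modification of a single diagonal entry, which keeps the matrix tridiagonal and symmetric.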
This fourth model will appear in the analysis of Montgomery’s model ([65], [38]), and also in the analysis of superconductivity in dimension 3 ([68], [37]). Again one can show [55] that the infimum over $\rho$ of the bottom of the spectrum of $M_\rho$ is attained at a unique $\rho_0$.

The fifth model appears in the analysis of domains with corners by V. Bonnaillie [10, 11]. The question is to analyze the function $]0, \pi[ \ni \alpha \mapsto \tilde\Theta(\alpha)$, where $\tilde\Theta(\alpha)$ is the bottom of the spectrum of the Neumann realization of $D_s^2 + (D_t - s)^2$ in an infinite sector of angle $\alpha$. Let us mention here two conjectures which are discussed in the work of V. Bonnaillie.

Conjecture 5.1. For $\alpha \in ]0, \pi[$, $\tilde\Theta(\alpha)$ is an eigenvalue of multiplicity 1.

Conjecture 5.2. The function $\tilde\Theta$ is a bijection of $]0, \pi[$ onto $]0, \Theta_0[$.

Application. As an application of this analysis of the models, we can understand some asymptotic properties of the ground states of the Neumann problem, like the localization at the boundary, the localization at the points of maximal curvature ($n = 2$), the localization at the points where the magnetic field (seen as a vector) is tangent to the boundary ($n = 3$) or at the corners (Jadallah [50], Bonnaillie [10, 11]). Below, the reader can find the result of a numerical computation kindly communicated by Hornberger [48] (and [49]) for describing this maximal curvature effect.
[Figure: numerical computation communicated by Hornberger.] When $B$ is constant ($\neq 0$), the groundstate is localized at the points of maximal curvature.

6. Diamagnetism, paramagnetism in the semi-classical regime

6.1. Diamagnetism. It is well known (Kato’s inequality) that the ground state energy satisfies
$$\lambda_{h,A,V} \ge \lambda_{h,0,V}. \tag{6.1}$$
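As a sanity check of (6.1), one can look at an explicitly solvable case. The sketch below assumes that $\Delta_{h,A,V}$ stands for $(hD - A)^2 + V$ on $\mathbb{R}^2$, with a constant magnetic field $B$ and the quadratic potential $V(x) = \omega^2|x|^2$; neither this example nor the name `ground_energy` comes from the text. In the symmetric gauge the ground state energy is then given by the classical Fock-Darwin formula $h\sqrt{B^2 + 4\omega^2}$, which reduces to $2h\omega$ when $B = 0$ and to the first Landau level $h|B|$ when $\omega = 0$.

```python
import math


def ground_energy(h, B, omega):
    """Closed-form ground state energy of (hD - A)^2 + omega^2 |x|^2 on R^2
    for a constant magnetic field B (Fock-Darwin formula, symmetric gauge).
    Illustrative helper; the model is an assumption, not taken from the text."""
    return h * math.sqrt(B * B + 4.0 * omega * omega)


h, omega = 0.1, 1.0
gaps = []
for B in (0.0, 0.5, 1.0, 2.0):
    gap = ground_energy(h, B, omega) - ground_energy(h, 0.0, omega)
    gaps.append(gap)
    print(B, gap)  # diamagnetic shift: nonnegative, zero only for B = 0
```

In this example the shift $\lambda_{h,A,V} - \lambda_{h,0,V}$ is of order $h$, in contrast with the exponentially small flux effects discussed in the case $\sigma_B = 0$.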
A simple result (Lavine-O’Carroll (heuristic) [57], Helffer [23, 24]) gives a characterization of the equality:
$$\lambda_{h,A,V} = \lambda_{h,0,V} \ \text{ if and only if } \ \begin{cases} \sigma_B = 0, \\ \frac{1}{2\pi}\int_\gamma \omega_A \in \mathbb{Z}, \ \forall \text{ path } \gamma. \end{cases} \tag{6.2}$$
In order to have a better understanding of the phenomenon, it is interesting to measure quantitatively $\lambda_{h,A,V} - \lambda_{h,0,V}$, especially in the case when $\sigma_B = 0$. In this last case (in the two-dimensional case), this is called the Aharonov-Bohm effect (see footnote 5) for bounded states. Let us mention two types of results where this effect appears:

(1) New Hardy estimates for Bohm-Aharonov Hamiltonians have been obtained by Laptev-Weidl [56], Balinsky [Ba], Christ-Fu [15]. The simplest relevant inequality is that, for all $u \in C_0^\infty(\mathbb{R}^2 \setminus 0)$,
$$\int_{\mathbb{R}^2} \frac{|u(x)|^2}{|x|^2}\,dx \le \bigl(\min_{k\in\mathbb{Z}} |k - \Psi|\bigr)^{-2} \int_{\mathbb{R}^2} |(\nabla + iA_{BH})u(x)|^2\,dx. \tag{6.3}$$
Here $A_{BH}$ is the Bohm-Aharonov magnetic potential, defined by
$$A_{BH} = \frac{\Psi}{r^2}(-y, x), \tag{6.4}$$
where $\Psi$ is the normalized circulation of $A$ around a small positively oriented circle centered at 0.

(2) Semi-classical analysis permits a comparison between direct effects and, in the case of “holes” or high electric barriers on the support of $B$, flux effects. We refer to [23, 24] for a fine analysis of this effect in a very particular situation. The electric potential $V$ is assumed to have a unique (non degenerate) minimum $x_{\min}$, which is outside the support of the magnetic field. Then we have roughly:
$$\lambda_{h,A,V} - \lambda_{h,0,V} \sim \Bigl(1 - \cos\frac{\Phi}{h}\Bigr)\,a(h)\exp\Bigl(-\frac{S_0}{h}\Bigr) + b(h)\exp\Bigl(-\frac{2S_1}{h}\Bigr),$$
where
• $S_0$ is the Agmon length (for the metric $d_{V-\min V}$) of the shortest touristical path (that is starting from $x_{\min}$, turning around the support of $\sigma_B$, and coming back to $x_{\min}$),
• $S_1$ is the Agmon distance to the support of $B$,
• $a(h) \sim h^{\mu_1}\sum_{j\ge 0} a_j h^j$, $b(h) \sim h^{\mu_2}\sum_{j\ge 0} b_j h^j$, with $a_0 \neq 0$ and $b_0 \neq 0$.

Footnote 5: The initial Aharonov-Bohm effect actually gives the interpretation of a scattering experiment.
[Figure: the point $x_{\min}$ and supp(B), with the minimal geodesic to supp(B) and the minimal geodesic around supp(B).]
6.2. Paramagnetism. Let us briefly discuss some questions around paramagnetism and come back to the analysis of the Pauli operator. We are interested in the validity of the inequality:
$$\lambda_{\min}(D_{h,A}^2 + V) \le \lambda_{\min}(\Delta_{h,0,V}),$$
with
$$D_{h,A}^2 = \Delta_{h,A} \otimes I + h\sum_j \sigma_j B_j. \tag{6.5}$$
Counterexamples to this inequality (initially formulated as a conjecture) have been found by Avron-Simon (who consider a radial example [3]) and Helffer (by semi-classical analysis [24]), and appear as an important technical tool in the recent preprint by Christ-Fu [15]. In Helffer’s example, under the same assumptions as above, the term $h\sigma\cdot B$ perturbs the bottom of the spectrum, in comparison with that of the magnetic Schrödinger operator, by $O(\exp(-\frac{2S_1}{h}))$. When $S_0 < 2S_1$, this error is negligible and does not significantly perturb the diamagnetic inequality obtained for $\Delta_{h,A,V}$, except when $\cos\frac{\Phi}{h}$ is exponentially close to 1.

Remark 6.1. B. Parisse has shown by semi-classical estimates that flux effects can produce, by tunnelling, the splitting of a double eigenvalue for $D_{A,V}$ [69].

7. Can we hear the zero locus of a magnetic field?

This formulation is due to R. Montgomery [65] (in reference to the celebrated sentence by M. Kac). We have already mentioned that for the Dirichlet realization ($V = 0$), the ground state is localized near the minimum of $\|B\|$. By finding the right substitute for the harmonic approximation, one can be more precise by giving the asymptotics of the groundstate energy:
(1) when $\|B\|$ has a non degenerate strictly positive minimum (Helffer-Mohamed [34] and Helffer-Morame [35] have given an expansion in the form $\mu_1(h) = \inf\|B\|\,h + \gamma_2 h^2 + o(h^2)$),
(2) when $\|B\|$ vanishes at a point [34] or along a closed curve (see Montgomery [65], Helffer-Mohamed [38], Kwek-Pan [55]).

Typically, the model considered by Montgomery was locally
$$(hD_t)^2 + (t^2 - hD_s)^2, \tag{7.1}$$
on say $\mathbb{T}_s \times \mathbb{R}_t$, which is related to the spectral analysis of $D_t^2 + (t^2 - \rho)^2$ in $\mathbb{R}$. This model also appears in the analysis of superconductivity, and it appears as well when looking for necessary conditions for analytic hypoellipticity of partial differential equations (see Helffer, Pham The Lai-Robert [70], Christ [14], Chanillo-Helffer-Laptev [13]). Typically the non-hypoanalyticity of $D_t^2 + (t^2 D_x - D_y)^2$ in $\mathbb{R}^3$ can be proved by showing that there exists $\rho \in \mathbb{C}$ such that $\mathrm{Ker}\,(D_t^2 + (t^2 - \rho)^2) \cap \mathcal{S}(\mathbb{R}) \neq \{0\}$.

More recently, a contribution of Kwek-Pan [55] should be mentioned. The authors consider models of the type:
$$(hD_t)^2 + (\gamma(s)t^2 - hD_s)^2. \tag{7.2}$$
This time, $\gamma(s) \neq 0$ is not constant and there is a localization of the ground state near the minima of $|\gamma|$. The bottom of the spectrum is given by
$$\mu_1(h) = h^{4/3}(\min|\gamma|^{1/3})\,\hat\nu_0 + O(h^{5/3}), \quad \text{where } \hat\nu_0 = \inf_\rho \inf\sigma(D_t^2 + (t^2 - \rho)^2).$$
It is expected (Helffer-Morame, work in progress) that:
$$\mu_1(h) \sim h^{4/3}(\min|\gamma|^{1/3})\,\mu_0 + \nu_1 h^{5/3} + h^{11/6}\sum_{\ell\ge 0}\delta_\ell h^{\ell/6}. \tag{7.3}$$
What is proved for the moment is only the existence of an eigenvalue with this expansion.

8. Further analysis of the Neumann problem

This has been the most active subject in recent years, in connection with superconductivity.

8.1. Neumann realization when n = 2. The initial question was proposed by Bernoff-Sternberg in [8]. These authors construct only quasimodes, supporting the idea of an expansion for the groundstate energy of the Neumann realization in the case when $B$ is constant. The proof that this expansion holds is a result of the joint efforts of Lu-Pan [58], del Pino-Felmer-Sternberg [71] and Helffer-Morame [35] (the remainder was recently improved in Fournais-Helffer [21]).
Theorem 8.1. The groundstate energy of the Neumann realization satisfies
$$\mu^{(1)}(h) = \Theta_0 h - 2\kappa_{\max} M_3 h^{3/2} + O(h^{7/4}). \tag{8.1}$$
Here $\kappa_{\max}$ is the maximal curvature and $M_3 > 0$ is a universal constant.

For the upper bound, there is a construction by Bernoff-Sternberg [8] of a quasimode in a neighborhood of a point of maximal curvature ($s = 0$ defining this point). This was continued in a more systematic way in [21]. The main term of the quasimode takes the form, for $\alpha > 0$ to be chosen suitably,
$$\phi_0(s, t) = (2\alpha)^{1/4} h^{-5/16}\, e^{-\alpha s^2/h^{1/4}}\, e^{i\xi_0 s/h^{1/2}}\, u_0(h^{-1/2}t)\,\chi\Bigl(\frac{2s}{|\partial\Omega|}\Bigr)\chi(t/t_0), \tag{8.2}$$
where $t_0$ is a constant defining the tubular neighborhood of the boundary on which one may use boundary coordinates $x \mapsto (t(x), s(x))$ with $s(x) \in \partial\Omega$ and $t(x) = d(x, \partial\Omega)$, $u_0$ is the ground state of the second model in Section 5 considered for $\xi_0 = -\rho_0$, and $\chi$ is a cutoff function permitting to localize near 0.
Remark 8.2. The remainder $O(h^{7/4})$ is optimal. The coefficient of $h^{7/4}$ is related to the second derivative of the curvature at the points of maximal curvature. (See Bernoff-Sternberg [8].) It is proved by Fournais-Helffer in [21], by using a reduction to the boundary based on the introduction of a Grushin problem, that there exists a complete expansion:
$$\mu^{(1)}(h) \sim \Theta_0 h - 2\kappa_{\max} M_3 h^{3/2} + \gamma h^{7/4} + h^{15/8}\sum_{j\ge 0}\alpha_j h^{j/8}, \tag{8.3}$$
and that a similar expansion exists for the second eigenvalue. This gives information on the splitting:
$$\lambda_2(h) - \lambda_1(h) \sim \gamma_{12} h^{7/4}, \tag{8.4}$$
with $\gamma_{12} \neq 0$, which could be important for the analysis of the non-linear Ginzburg-Landau problem.

8.2. Neumann realization when n = 3. We keep the assumption of constant magnetic field and analyze the three-dimensional case. It has been observed by Lu-Pan [61] (see also [36]) that the ground state $u_h$ is localized (as $h \to 0$) near the boundary $\partial\Omega$, but more precisely on the set:
$$\Gamma_B = \{x \in \partial\Omega \mid \langle B \mid N(x)\rangle = 0\}, \tag{8.5}$$
where $N(x)$ is the normal at $x$ to $\partial\Omega$; $\Gamma_B$ is thus the set of points in $\partial\Omega$ where $B$ is tangent. It is natural to assume that:
$$\Gamma_B \text{ is a regular submanifold of } \partial\Omega, \tag{8.6}$$
which will be oriented on each component. At each point $x$ of $\Gamma_B$, we define the normal curvature along $B$ by:
$$\kappa_{n,B}(x) := K_x\Bigl(T(x) \wedge N(x), \frac{B}{\|B\|}\Bigr), \tag{8.7}$$
where $K$ denotes the second fundamental form on the surface $\partial\Omega$ and $T(x)$ is the unit oriented tangent vector to $\Gamma_B$ at $x$. It is also natural to assume that:
$$\kappa_{n,B} \neq 0 \quad \text{on } \Gamma_B. \tag{8.8}$$
The last generic assumption is that the set of points where $B$ is tangent to $\Gamma_B$ is isolated. On $\Gamma_B$, we can introduce the function:
$$\tilde\gamma_0(x) = \Bigl(\frac{1}{2}\Bigr)^{2/3}\hat\nu_0\,\delta_0^{1/3}\,|\kappa_{n,B}(x)|^{2/3}\Bigl(\delta_0 + (1 - \delta_0)\Bigl|\Bigl\langle T(x) \Bigm| \frac{B}{|B|}\Bigr\rangle\Bigr|^2\Bigr)^{1/3}, \tag{8.9}$$
where $\hat\nu_0 > 0$ and $\delta_0 \in ]0, 1[$ are universal constants attached to spectral invariants related to two model Hamiltonians respectively defined on $\mathbb{R}$ and $\mathbb{R}_+$. The constant $\hat\nu_0$ is indeed
$$\hat\nu_0 = \inf_\rho \inf\sigma(D_u^2 + (u^2 - \rho)^2).$$
The involved operator appeared as our fourth model in Section 5. The minimum of the function $\tilde\gamma_0$, which plays the role of an effective curvature, is also an important invariant:
$$\hat\gamma_0 = \inf_{x\in\Gamma_B}\tilde\gamma_0(x). \tag{8.10}$$
We now state the main theorem (see Lu-Pan [61], Pan [68] and Helffer-Morame [36, 37]).

Theorem 8.3. Under the previous assumptions, there exists $\eta > 0$ such that:
$$\inf\sigma(P^{h,N}_{A,\Omega}) = b\Theta_0 h + \hat\gamma_0\, b^{2/3} h^{4/3} + O(h^{\eta + 4/3}), \tag{8.11}$$
where $\hat\gamma_0$ is defined in (8.10).

Although it is not verified in detail (see however [68]), one can hope for a corresponding exponential localization at the points of $\Gamma_B$ where the infimum of $\tilde\gamma_0$ is attained.

9. Nodal sets and multiplicity

We would like to describe in this section other flux effects due to the simultaneous presence of magnetic fields and holes. Let us consider the case of an annulus-like symmetric domain in $\mathbb{R}^2$ and the Dirichlet case with 0-magnetic field.
As before let us consider
$$\Theta := \frac{1}{2\pi}\int_\sigma \omega_A.$$
(So $\Theta$ is the normalized flux in the hole, i.e., the normalized circulation of $\omega_A$ along a simple path $\sigma$ around the hole.)

Theorem 9.1. Let $\lambda(\Theta)$ be the ground state energy of the Dirichlet realization of $\Delta_A + V$ in $\Omega$ and let us assume that the magnetic field is 0 and that $\Omega$ and $V$ are invariant by the symmetry $(x_1, x_2) \mapsto (x_1, -x_2)$. Then:
(1) $\Theta \mapsto \lambda(\Theta)$ is 1-periodic, and $\lambda(-\Theta) = \lambda(\Theta)$.
(2) The multiplicity is 1 for $\Theta \notin \mathbb{Z} + \frac{1}{2}$, and $\le 2$ for $\Theta \in \mathbb{Z} + \frac{1}{2}$.
(3) $[0, \frac{1}{2}] \ni \Theta \mapsto \lambda(\Theta)$ is monotonic.
(4) The zero set is empty for $\Theta \notin \mathbb{Z} + \frac{1}{2}$. For $\Theta \in \mathbb{Z} + \frac{1}{2}$, there is a basis of the groundstate eigenspace such that the nodal set is one line joining the two components of the boundary.
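The statements (1) to (3) of Theorem 9.1 can be visualized on a toy model which is not the setting of the theorem: the operator $(D_\varphi - \Theta)^2$ on the unit circle, which one may think of as an infinitely thin annulus carrying the normalized flux $\Theta$. This model, and the helper name `ground_state`, are assumptions made for illustration only. Its eigenfunctions are $e^{ik\varphi}$ with eigenvalues $(k - \Theta)^2$, $k \in \mathbb{Z}$, so the ground state energy is the squared distance from $\Theta$ to $\mathbb{Z}$.

```python
def ground_state(theta, kmax=50):
    """Ground energy and multiplicity of (D_phi - theta)^2 on the unit circle.
    Eigenfunctions are exp(i k phi), eigenvalues (k - theta)^2, k integer.
    Toy model (an assumption); valid for |theta| well below kmax."""
    energies = sorted((k - theta) ** 2 for k in range(-kmax, kmax + 1))
    lowest = energies[0]
    multiplicity = sum(1 for e in energies if abs(e - lowest) < 1e-12)
    return lowest, multiplicity


print(ground_state(0.25))  # (0.0625, 1): simple ground state
print(ground_state(0.5))   # (0.25, 2): double ground state at half-integer flux

# Sampled values on [0, 1/2], where the ground energy is monotonic.
samples = [ground_state(i / 16.0)[0] for i in range(9)]
print(samples)
```

At $\Theta = \frac{1}{2}$ the two-dimensional eigenspace is spanned by $1$ and $e^{i\varphi}$, and the combination $1 - e^{i\varphi}$ vanishes at the single point $\varphi = 0$, the one-dimensional analogue of the nodal line in statement (4).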
The initial motivation is the paper by Berger-Rubinstein [7], which was further developed by various subsets of {Helffer, Maria Hoffmann-Ostenhof, Thomas Hoffmann-Ostenhof, Nadirashvili, Owen} (see [29, 30, 31, 32]). Extensions have been obtained for the case of many holes (the nodal lines have to “split” the domain) and for Schrödinger operators with periodic potentials $V$ in $\mathbb{R}^n$ or in strips (with additional symmetry).

Below we give (after [29]) three qualitative figures (which are not the result of a computation!) describing the possible topological structures of the nodal lines for the groundstate (see footnote 6) of the magnetic Schrödinger operator with 0-magnetic field in domains with holes, each hole creating a normalized flux equal to $\frac{1}{2}$ (modulo $\mathbb{Z}$). Essentially this illustrates the properties that the nodal set is a union of disjoint nodal lines, each one joining two distinct components of the boundary, and such that the nodal domain $D$ is connected and the magnetic potential 1-form is exact in $D$.

Figure 1 describes the case of one hole (with normalized flux congruent modulo $\mathbb{Z}$ to $\frac{1}{2}$). The dotted line corresponds to the nodal line announced in Theorem 9.1 (4).
Figure 2 describes the case of two holes $H_1$ and $H_2$. The support of $B$ is contained in the union of the two holes (with normalized flux congruent modulo $\mathbb{Z}$ to $\frac{1}{2}$ for each hole). Two possibilities for the nodal set of a “real” groundstate (see footnote 6) are left open:
• one line joining the boundaries of the two holes $H_1$ and $H_2$,
• two lines, each line joining the boundary of one hole to the external boundary.

Footnote 6: Actually, in the case when the multiplicity is > 1, this property is only true for a basis of groundstates having the property to be “real” in a suitable sense.
[Figure 2: two panels, each showing the holes $H_1$ and $H_2$.]
Figure 3 describes the three topological possibilities for the nodal set in the case of three holes.
[Figure 3: three panels, each showing the holes $H_1$, $H_2$ and $H_3$.]
Note that there are very few “quantitative” results, except some examples where semiclassical analysis (for example in the case of two holes, $\frac{1}{2} + \frac{1}{2} = 1$) is relevant (see above) or for very symmetric situations by analyzing singular limits of domains. The last figure gives such an example for a Neumann realization in a rectangle of length $L$ containing two squares of size $h$. In this case, one can decide between the two topological situations for the nodal set of a groundstate. When $e > 0$ is small, one can show ([52]) that one is in Situation 1 (left case). On the other hand, when the two (square) holes are sufficiently close (when $a = L - 2h - 2e$ is small), we are in Situation 2 (right case).
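[Figure: two panels (Situation 1 on the left, Situation 2 on the right), each showing a rectangle of length $L$ with two square holes $H_1$ and $H_2$; the labels $h$, $e$ and $a$ mark the lengths used in the text.]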
Acknowledgements. The author would like to thank all his collaborators, colleagues or students, who have helped him, directly or indirectly, to write this survey and particularly V. Bonnaillie, S. Fournais, T. Hoffmann-Ostenhof, K. Hornberger, R. Joly, A. Laptev, A. Morame, T. Ramond and G. Raugel.
References
[1] S. Agmon. Lectures on exponential decay of solutions of second-order elliptic equations. Math. Notes, T. 29, Princeton University Press (1982).
[2] Y. Aharonov and D. Bohm. Significance of electromagnetic potentials in the quantum theory. Phys. Rev. (2) 115, p. 485–491 (1959).
[3] J. Avron and B. Simon. A counterexample to the paramagnetic conjecture. Phys. Lett. A 75 (1-2), p. 41–42 (1979).
[4] J. Avron, I. Herbst, and B. Simon. Schrödinger operators with magnetic fields I. Duke Math. J. 45, p. 847–883 (1978).
[Ba] A. Balinsky. Hardy type inequalities for Aharonov-Bohm magnetic potentials with multiple singularities. Math. Res. Lett. 10 (2-3), p. 169–176 (2003).
[5] J. Bellissard. Noncommutative methods in semiclassical analysis. Transition to chaos in classical and quantum mechanics (Montecatini Terme, 1991), p. 1–64, Lecture Notes in Math., 1589, Springer, Berlin, 1994.
[6] J. Bellissard. Le papillon de Hofstadter (d’après B. Helffer et J. Sjöstrand). Séminaire Bourbaki, Vol. 1991/92. Astérisque 206 (1992), Exp. No. 745, 3, p. 7–39.
[7] J. Berger and J. Rubinstein. On the zero set of the wave function in superconductivity. Comm. Math. Phys. 202, p. 621–628 (1999).
[8] A. Bernoff and P. Sternberg. Onset of superconductivity in decreasing fields for general domains. J. Math. Phys. 39, p. 1272–1284 (1998).
[9] C. Bolley and B. Helffer. An application of semi-classical analysis to the asymptotic study of the supercooling field of a superconducting material. Ann. Inst. Poincaré (Section Physique Théorique) 58 (2) (1993), p. 169–233.
[10] V. Bonnaillie. On the fundamental state for a Schrödinger operator with magnetic fields in a domain with corners. C. R. Math. Acad. Sci. Paris 336 (2), p. 135–140 (2003).
[11] V. Bonnaillie. On the fundamental state energy for a Schrödinger operator with magnetic field in domains with corners. To appear in Asymptotic Analysis (2004).
[12] R. Brummelhuis. Exponential decay in the semi-classical limit for eigenfunctions of Schrödinger operators with magnetic fields and potentials which degenerate at infinity. Comm. Partial Differential Equations 16 (8-9), p. 1489–1502 (1991).
[13] S. Chanillo, B. Helffer, and A. Laptev. Nonlinear eigenvalues and analytic hypoellipticity. J. Funct. Anal. 209 (2), p. 425–443 (2004).
[14] M. Christ. A progress report on analytic hypoellipticity, in: Geometric Complex Analysis (Hayama, 1995), World Sci. Publishing, River Edge, NJ, 1996, p. 123–146.
[15] M. Christ and S. Fu. Compactness in the $\bar\partial$-Neumann problem, magnetic Schrödinger operators, and the Aharonov-Bohm effect. Preprint 2003, math.CV/0311225.
[16] H.L. Cycon, R.G. Froese, W. Kirsch, and B. Simon. Schrödinger operators with application to quantum mechanics and global geometry. Texts and Monographs in Physics. Springer-Verlag (1987).
[17] M. Dauge and B. Helffer. Eigenvalues variation I, Neumann problem for Sturm-Liouville operators. J. Differential Equations 104 (2), p. 243–262 (1993).
[18] J.P. Demailly. Champs magnétiques et inégalités de Morse pour la d''-cohomologie. Ann. Inst. Fourier 35 (4), p. 189–229 (1985).
[19] L. Erdős. Gaussian decay of the magnetic eigenfunctions. Geom. Funct. Anal. 6 (2), p. 231–248 (1996).
[20] L. Erdős and J.P. Solovej. The kernel of Dirac operators on $S^3$ and $R^3$. Rev. Math. Phys. 13 (10), p. 1247–1280 (2001).
[21] S. Fournais and B. Helffer. Accurate estimates for magnetic bottles in connection with superconductivity. Preprint 2004.
[22] S. Fu and E.J. Straube. Semi-classical analysis of Schrödinger operators and compactness in the $\bar\partial$-Neumann problem. J. Math. Anal. Appl. 271 (1), p. 267–282 (2002). Erratum J. Math. Anal. Appl. 280 (1), p. 195–196 (2003).
[23] B. Helffer. Introduction to the semiclassical analysis for the Schrödinger operator and applications. Springer Lecture Notes in Math. 1336 (1988).
[24] B. Helffer. Effet d’Aharonov-Bohm pour un état borné. Comm. Math. Phys. 119 (2), p. 315–329 (1988).
[25] B. Helffer. On spectral theory for Schrödinger operators with magnetic potentials. Advanced Studies in Pure Mathematics 23, p. 113–141 (1994).
[26] B. Helffer. Semi-classical analysis for the Schrödinger operator with magnetic wells (after R. Montgomery, B. Helffer-A. Mohamed). Proceedings of the conference in Minneapolis, The IMA Volumes in Mathematics and its Applications, Vol. 95. Quasiclassical Methods, Springer Verlag, p. 99–114 (1997).
[27] B. Helffer. Semi-classical methods in Ginzburg-Landau theory. In Abstract and Applied Analysis, Proceedings of the International Conference in Hanoi 13–17 August 2002, edited by N.M. Chuong, L. Nirenberg and W. Tutschke, World Scientific (2004).
[28] B. Helffer. Introduction to semi-classical methods for the Schrödinger operator with magnetic field. CIMPA Lecture Notes 2004 (provisory version). http://www.math.u-psud.fr/~helffer.
[29] B. Helffer, T. and M. Hoffmann-Ostenhof, and M. Owen. Nodal sets for the groundstate of the Schrödinger operator with zero magnetic field in a non simply connected domain. Comm. Math. Phys. 202 (3), p. 629–649 (1999).
[30] B. Helffer, T. and M. Hoffmann-Ostenhof, and M. Owen. Nodal sets, multiplicity and super conductivity in non simply connected domains. Lecture Notes in Physics 62 (2000).
[31] B. Helffer, M. and T. Hoffmann-Ostenhof, and N. Nadirashvili. Spectral theory for the dihedral group. Geom. Funct. Anal. 12 (5), p. 989–1017 (2002).
[32] B. Helffer, T. Hoffmann-Ostenhof, and N. Nadirashvili. Periodic Schrödinger operators and Aharonov-Bohm hamiltonians. Mosc. Math. J. 3 (1), p. 45–61, 258 (2003).
[33] B. Helffer and A. Mohamed. Sur le spectre essentiel des opérateurs de Schrödinger avec champ magnétique. Ann. Inst. Fourier 38 (2), p. 95–112 (1988).
[34] B. Helffer and A. Mohamed. Semiclassical analysis for the ground state energy of a Schrödinger operator with magnetic wells. J. Funct. Anal. 138 (1), p. 40–81 (1996).
[35] B. Helffer and A. Morame. Magnetic bottles in connection with superconductivity. J. Funct. Anal. 185 (2), p. 604–680 (2001).
[36] B. Helffer and A. Morame. Magnetic bottles in connection with superconductivity: Case of dimension 3. Proc. Indian Acad. Sci. (Math. Sci.) 112 (1), p. 71–84 (2002).
[37] B. Helffer and A. Morame. Magnetic bottles for the Neumann problem: Curvature effects in the case of dimension 3. The general case. Ann. Sci. École Norm. Sup. (4) 37 (1), p. 105–170 (2004).
[38] B. Helffer and A. Morame. Work in progress.
[39] B. Helffer and J. Nourrigat. Hypoellipticité maximale pour des opérateurs polynômes de champs de vecteurs. Birkhäuser, Boston, Vol. 58 (1985).
[40] B. Helffer and J. Nourrigat. Décroissance à l’infini des fonctions propres de l’opérateur de Schrödinger avec champ électromagnétique polynomial. J. Anal. Math. 58, p. 263–275 (1992).
[41] B. Helffer, J. Nourrigat, and X.P. Wang. Spectre essentiel pour l’équation de Dirac. Ann. Sci. École Norm. Sup. 22 (4), p. 515–533 (1989).
[42] B. Helffer and J. Sjöstrand. Multiple wells in the semiclassical limit I. Comm. Partial Differential Equations 9 (4), p. 337–408 (1984).
[43] B. Helffer and J. Sjöstrand. Puits multiples en limite semiclassique V – le cas des minipuits –. Current topics in Partial Differential Equations, Papers dedicated to Professor S. Mizohata on the occasion of his sixtieth birthday, edited by Y. Ohia, K. Kasahara and N. Shimakura (1986), Kinokuniya company LTD, Tokyo, p. 133–186.
[44] B. Helffer and J. Sjöstrand. Puits multiples en limite semi-classique VI – le cas des puits variétés –. Ann. Inst. Poincaré (Section Physique Théorique) 46 (4), p. 353–373 (1987).
[45] B. Helffer and J. Sjöstrand. Effet tunnel pour l’équation de Schrödinger avec champ magnétique. Ann. Scuola Norm. Sup. Pisa Cl. Sci. 14 (4), p. 625–657 (1987).
[46] B. Helffer and J. Sjöstrand. Equation de Schrödinger avec champ magnétique et équation de Harper. Schrödinger operators (Sønderborg, 1988), p. 118–197, Lecture Notes in Phys. 345, Springer, Berlin, 1989.
[47] R. Hempel and I. Herbst. Strong magnetic fields, Dirichlet boundaries, and spectral gaps. Comm. Math. Phys. 169 (2), p. 237–259 (1995).
[48] K. Hornberger. Personal communication.
[49] K. Hornberger and U. Smilansky. The boundary integral method for magnetic billiards. J. Phys. A, Math. Gen. 33 (14), p. 2829–2855 (2000).
[50] H.T. Jadallah. The onset of superconductivity in a domain with a corner. J. Math. Phys. 42 (9), p. 4101–4121 (2001).
[51] S. Jitomirskaya. The resolution of the ten martinis conjecture. Proceedings of QMath9, September 12th–16th, Giens, 2004.
[52] R. Joly. Personal communication (2003) and to appear as a chapter of a book in preparation by J. Hale and G. Raugel.
[53] J. Kohn. Lectures on degenerate elliptic problems. Pseudodifferential operators with applications, C.I.M.E., Bressanone 1977, p. 89–151 (1978).
[54] V. Kondratiev, V. Maz’ya, and M. Shubin. Discreteness of spectrum and strict positivity criteria for magnetic Schrödinger operators. Comm. Partial Differential Equations 29 (3-4), p. 489–521 (2004).
[55] K.H. Kwek and X-B. Pan. Schrödinger operators with non-degenerately vanishing magnetic fields in bounded domains. Trans. Amer. Math. Soc. 354 (10), p. 4201–4227 (2002).
[56] A. Laptev and T. Weidl. Hardy inequality for magnetic Dirichlet forms. Oper. Theory, Adv. Appl. (108), p. 299–305 (1999).
[57] R. Lavine and M. O’Carroll. Ground state properties and lower bounds for energy levels of a particle in a uniform magnetic field and external potential. J. Mathematical Phys. 18 (10), p. 1908–1912 (1977).
[58] K. Lu and X-B. Pan. Estimates of the upper critical field for the Ginzburg-Landau equations of superconductivity. Physica D 127, p. 73–104 (1999).
[59] K. Lu and X-B. Pan. Eigenvalue problems of Ginzburg-Landau operator in bounded domains. J. Math. Physics 40 (6), p. 2647–2670 (1999).
[60] K. Lu and X-B. Pan. Gauge invariant eigenvalue problems on $R^2$ and $R^2_+$. Trans. Amer. Math. Soc. 352 (3), p. 1247–1276 (2000).
[61] K. Lu and X-B. Pan. Surface nucleation of superconductivity in 3-dimension. J. Differential Equations 168 (2), p. 386–452 (2000).
[62] A. Martinez and V. Sordoni. Microlocal WKB expansions. J. Funct. Anal. 168, p. 380–402 (1999).
[63] H. Matsumoto. Semiclassical asymptotics of eigenvalues for Schrödinger operators with magnetic fields. J. Funct. Anal. 129 (1), p. 168–190 (1995).
[64] H. Matsumoto and N. Ueki. Spectral analysis of Schrödinger operators with magnetic fields. J. Funct. Anal. 140 (1), p. 218–255 (1996).
[65] R. Montgomery. Hearing the zerolocus of a magnetic field. Comm. Math. Physics 168, p. 651–675 (1995).
[66] A. Mohamed and G.D. Raikov. On the spectral theory of the Schrödinger operator with electromagnetic potential. In: Pseudo-differential Calculus and Mathematical Physics. Math. Top. 5, Berlin Akademie Verlag, p. 298–390 (1994).
[67] S. Nakamura. Gaussian decay estimates for the eigenfunctions of magnetic Schrödinger operators. Comm. Partial Differential Equations 21 (5-6), p. 993–1006 (1996).
[68] X-B. Pan. Surface superconductivity in 3 dimensions. Preliminary version in October 2001. Trans. Amer. Math. Soc. 356 (10), p. 3899–3937 (2004).
[69] B. Parisse. Effet d’Aharonov-Bohm sur un état borné de l’opérateur de Dirac. Asymptotic Anal. 10 (3), p. 199–224 (1995).
[70] Pham The Lai and D. Robert. Sur un problème aux valeurs propres non linéaire. Israel J. Math. 36, p. 169–186 (1980).
[71] M. del Pino, P.L. Felmer, and P. Sternberg. Boundary concentration for eigenvalue problems related to the onset of superconductivity. Comm. Math. Phys. 210, p. 413–446 (2000).
[72] D. Robert. Comportement asymptotique des valeurs propres d’opérateurs du type Schrödinger à potentiel “dégénéré”. J. Math. Pures Appl. (9) 61, no. 3, p. 275–300 (1982).
[73] I. Shigekawa. Eigenvalue problems for the Schrödinger operator with the magnetic field on a compact Riemannian manifold. J. Funct. Anal. 75 (1), p. 92–127 (1987).
[74] B. Simon. Some quantum operators with discrete spectrum but classically continuous spectrum. Ann. Physics 146, p. 209–220 (1983).
[75] B. Simon. Semi-classical analysis of low lying eigenvalues I. Ann. Inst. Poincaré (Section Physique Théorique) 38 (4), p. 295–307 (1983).
[76] J. Sjöstrand. Microlocal analysis for the periodic magnetic Schrödinger equation and related questions. In Microlocal analysis and applications (Montecatini Terme, 1989), p. 237–332, Lecture Notes in Math. 1495, Springer, Berlin, 1991.
[77] J.P. Solovej. Mathematical results on the structure of large atoms. European Congress of Mathematics, Vol. II (Budapest, 1996), p. 211–220, Progr. Math., 169, Birkhäuser, Basel, 1998.
[78] B. Thaller. The Dirac equation. Texts and Monographs in Physics. Springer-Verlag, Berlin, 1992.
[79] N. Ueki. Lower bounds for the spectra of Schrödinger operators with magnetic fields. J. Funct. Anal. 120 (2), p. 344–379 (1994). Erratum n° 11, p. 257–259 (1995).
[80] N. Ueki. Asymptotics of the infimum of the spectrum of Schrödinger operators with magnetic fields. J. Math. Kyoto Univ. 37 (4), p. 615–638 (1998).

Bernard Helffer
Département de Mathématiques, UMR CNRS 8628, Bât. 425, Université Paris-Sud, F-91405 Orsay Cedex, France
e-mail: [email protected]
Mathematical Aspects of Quantum Chaos
J.P. Keating
1. Introduction

This is a brief report on some of the highlights of the research supported by the EC Training Research Network Mathematical Aspects of Quantum Chaos, as described in my lecture at the 2004 European Mathematical Congress. The Network consisted of teams from Bristol (UK), Bologna (Italy), Paris (France), Tel Aviv (Israel), Ulm (Germany) and Uppsala (Sweden). It covered a broad spectrum of interests, from mathematics to theoretical physics. My focus here will be on a personal selection of some of the main mathematical achievements.

2. Quantum chaos

It is now widely appreciated that within classical mechanics it is possible to have a broad spectrum of qualitatively different types of dynamics. At one end there is integrability and at the other chaos. The subject of Quantum Chaos is concerned with the question of how this important fact manifests itself in quantum mechanics, in the semiclassical limit (i.e., as $\hbar \to 0$), as the boundary with classical mechanics is approached. For example, how does it influence the distribution of the eigenvalues and the morphology of eigenfunctions of the Schrödinger operator in the limit as the de Broglie wavelength tends to zero?

As an example, consider quantum billiards in $\mathbb{R}^2$. A billiard is an enclosure with hard walls, so that the classical trajectories are straight line segments with specular reflections (angle of incidence equal to angle of reflection) at the boundary. In some billiards the classical trajectories are integrable (e.g., in a rectangle or a circle), in others they are strongly chaotic. The Schrödinger equation for billiards is just the Helmholtz equation

$$-\frac{\hbar^2}{2m}\nabla^2\Psi = k^2\Psi \tag{2.1}$$

with appropriate (e.g., Dirichlet) conditions on the boundary. The question then is: how do the eigenfunctions $\Psi_n$ and eigenvalues $k_n$ reflect the chaotic nature of the underlying classical dynamics in the limit as $k \to \infty$?

Given that many of the principal consequences of chaos, such as ergodicity and mixing, are statistical, it is natural to expect its influence on quantum mechanics in the semiclassical limit to be seen most clearly in the statistical properties of the eigenfunctions and eigenvalues.
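For the rectangle, one of the integrable examples mentioned above, equation (2.1) can be solved by separation of variables, which makes it a convenient test case. The following is a minimal sketch only: it assumes units in which $\hbar^2/2m = 1$ and Dirichlet conditions on the rectangle $[0,a]\times[0,b]$, and the function names and the parameters `a`, `b`, `k_max` are illustrative choices, not taken from the lecture. Under these assumptions the eigenfunctions are $\sin(p\pi x/a)\sin(q\pi y/b)$ with $k^2 = \pi^2(p^2/a^2 + q^2/b^2)$ for integers $p, q \ge 1$.

```python
import math


def rectangle_levels(a, b, k_max):
    """Sorted Dirichlet eigenvalues k <= k_max of -∇²Ψ = k²Ψ on [0,a]x[0,b].

    Assumes units with hbar^2/2m = 1 (an assumption made for this sketch).
    """
    levels = []
    p = 1
    while math.pi * p / a <= k_max:       # k is always larger than pi*p/a
        q = 1
        while True:
            k = math.pi * math.hypot(p / a, q / b)
            if k > k_max:
                break
            levels.append(k)
            q += 1
        p += 1
    return sorted(levels)


def psi(p, q, a, b, x, y):
    """Unnormalised eigenfunction sin(p*pi*x/a) * sin(q*pi*y/b)."""
    return math.sin(p * math.pi * x / a) * math.sin(q * math.pi * y / b)


def helmholtz_residual(p, q, a, b, x, y, h=1e-3):
    """Finite-difference value of -∇²Ψ - k²Ψ at (x, y); should be close to 0."""
    def f(s, t):
        return psi(p, q, a, b, s, t)
    laplacian = (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
                 - 4 * f(x, y)) / h ** 2
    k_squared = math.pi ** 2 * ((p / a) ** 2 + (q / b) ** 2)
    return -laplacian - k_squared * f(x, y)
```

For instance, `rectangle_levels(1.0, 1.0, 8.0)` lists the levels of the unit square up to $k = 8$, and `helmholtz_residual` gives a quick numerical check that a product of sines does solve the equation at an interior point.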
One of the fundamental results in the field of Quantum Chaos concerns quantum ergodicity. Originally put forward by Schnirelman in 1974, the quantum ergodicity theorem asserts, in its simplest form, that in systems in which the classical trajectories are ergodic, quantum eigenfunctions (specifically $|\Psi_n|^2$) become uniformly distributed (with respect to Liouville measure) as one approaches the semiclassical limit through subsequences of eigenstates that have density one with respect to all subsequences [22]. For example, in billiards in which the classical trajectories are ergodic, the integral of $|\Psi_n|^2$ over an interior region $\gamma$ tends, as $n \to \infty$ through almost all subsequences of eigenstates, to the ratio of the area of $\gamma$ to that of the whole billiard.

This prompts the important question as to whether the eigenfunctions can be ergodic with respect to all subsequences – that is whether they can exhibit quantum unique ergodicity [21] – or whether exceptional subsequences exist. Related to this is the long-standing and important issue of scars: in some chaotic systems one finds eigenfunctions with an enhanced modulus near to short classical periodic orbits [13]. Are there subsequences of eigenstates for which this persists in the semiclassical limit? Obviously such subsequences, if they exist, must be of density zero.

Another important problem is that of determining the rate of approach to the quantum ergodic limit as the semiclassical limit is approached. For example, if $f(x)$ denotes a function of position $x$ in a two-dimensional billiard, then quantum ergodicity implies that the integral of $f(x)|\Psi_n(x)|^2$ over the billiard converges to the average of $f(x)$ over the billiard in the limit as $n \to \infty$. The question is then: what is the rate of convergence? General heuristic calculations imply that for generic billiards the variance of the integral of $f(x)|\Psi_n(x)|^2$ is proportional to the integral of the classical time-correlation function divided by $k_n$, as $n \to \infty$ [10].
Quantum ergodicity provides information about the semiclassical limit of eigenfunctions on fixed scales. What do these eigenfunctions look like on the scale of the de Broglie wavelength (which is of the order of $\hbar$)? It was conjectured by Berry in 1977 that on this scale they may be modelled statistically by Gaussian Random Functions [3]. There is considerable experimental and numerical evidence in support of this, but proving it remains an important open problem. Note that it implies that quantum wavefunctions in classically chaotic systems exhibit statistical universality on the wavelength scale.

There are in addition two very important conjectures relating to energy level correlations on the scale of the mean level spacing in the semiclassical limit: in 1977, Berry and Tabor [4] suggested that in generic integrable systems these should be the same as those arising from a Poisson process (i.e., the same as those of uncorrelated random numbers); and in 1984 Bohigas, Giannoni and Schmit [7] proposed that in generic chaotic systems they are the same as those relating to the eigenvalues of random matrices in the limit as the matrix size becomes infinite. In the case of chaotic systems, the appropriate ensemble of
random matrices depends on the symmetries (e.g., time-reversibility) of the classical dynamics.

For example, in two-dimensional billiards Weyl’s law gives that

$$\#\{j : k_j \le k\} \sim c k^2 \tag{2.2}$$

when $k \to \infty$, where the constant $c$ is proportional to the area of the billiard. Let us define

$$X_j = c k_j^2. \tag{2.3}$$

In this case the Berry-Tabor conjecture asserts that if the classical dynamics of a billiard is integrable (and sufficiently “generic”) then the $X_j$ have the same local statistical properties as independent random variables from a Poisson process. This means that

$$N(T, L) := \#\{j : T \le X_j \le T + L\}, \tag{2.4}$$

the number of $X_j$s in a randomly shifted interval $[T, T + L]$ of fixed length $L$, is distributed according to the Poisson law $\frac{L^k}{k!} e^{-L}$ (the probability that $N(T, L) = k$). By contrast, if the classical dynamics is chaotic (and sufficiently “generic”) then the $X_j$ have the same local statistical properties as the eigenvalues of certain ensembles of random matrices in the limit as the matrix-size tends to infinity. Again, there is considerable experimental and numerical evidence in support of these conjectures. In this case it is worth noting that they imply statistical universality on the scale of the mean level spacing.

Quantum eigenvalues are related, in the semiclassical limit, to periodic orbits of the corresponding classical dynamical system. This relationship is described by a class of formulae known as trace formulae; for example, the eigenvalues of the Laplacian on surfaces of constant negative curvature are related to closed geodesics on these surfaces by the Selberg Trace Formula. Statistical correlations between the quantum energy levels on the scale of the mean spacing, such as those suggested by the Bohigas-Giannoni-Schmit conjecture for chaotic systems, therefore imply the existence of correlations between classical periodic orbits [1]. For example, in billiards these correlations relate to the lengths, stabilities and Maslov indices of the periodic orbits.

3. Model systems

Except for quantum ergodicity, no significant rigorous results relating to questions in Quantum Chaos have been obtained for general systems. Substantial progress has, however, been made in the study of certain model systems. These include the following.
• Billiards in $\mathbb{R}^n$.
• Hyperbolic billiards; for example compact surfaces of constant negative curvature. In this case there is an important distinction between arithmetic surfaces, where the eigenvalues of the Laplacian are known not to exhibit random-matrix
statistics because of the existence of Hecke operators (note again the important role played by symmetries), and non-arithmetic surfaces.
• Quantum maps. Let $\Phi$ be a smooth symplectic map acting on $\mathbb{T}^2$. A specific example might be a linear hyperbolic automorphism (represented by an element of the modular group $SL_2(\mathbb{Z})$), which may easily be shown to generate strongly chaotic (Anosov) dynamics. To quantize $\Phi$, fix an integer $N \ge 1$ – this plays the role of the inverse of Planck’s constant – and define the corresponding Hilbert space $H_N$ of quantum states to be $L^2(\mathbb{Z}/N\mathbb{Z})$ with the inner product

$$\langle \psi_1, \psi_2 \rangle := \frac{1}{N} \sum_{Q \bmod N} \psi_1(Q)\, \overline{\psi_2(Q)}. \tag{3.1}$$
The quantum map $U_\Phi \in U(N)$ then acts on the wavefunctions $\psi$ in a way that corresponds, in the limit as $N \to \infty$, to the action of $\Phi$; that is, $U_\Phi$ generates the quantum dynamics. For $\Phi \in SL_2(\mathbb{Z})$ the explicit form of the corresponding quantum map was written down by Hannay and Berry [12]. In this case the eigenvalue statistics do not coincide with those of random matrix theory [15], because of the presence of (non-generic) arithmetic symmetries of $H_N$ that play the same role as the Hecke operators in the theory of modular forms [16].
• Quantum graphs; for example, graphs with the one-dimensional Laplacian acting on the edges and matching conditions for the solutions at the vertices.
• The non-trivial zeros of the Riemann zeta function $\zeta(s)$. These are conjectured to have asymptotically the same statistical distribution, on the scale of their mean spacing, as the eigenvalues of random unitary matrices in the limit as the matrix size tends to infinity [19, 20]. The Katz-Sarnak philosophy extends this connection to families of L-functions and matrices from the other classical compact groups [14].

4. Highlights from the network

4.1. Eigenfunction statistics. Let me focus first on some of the main results relating to the statistical properties of eigenfunctions obtained by members of the network. These include the following.
• The extension of the quantum ergodicity theorem to multicomponent wavefunctions (e.g., to particles with spin) [8].
• A proof that scarring survives in the semiclassical limit for certain particular subsequences of the eigenstates of the quantum maps corresponding to linear hyperbolic automorphisms of $\mathbb{T}^2$ (i.e., for $\Phi \in SL_2(\mathbb{Z})$), and hence that these systems do not exhibit quantum unique ergodicity. Specifically, a sparse sequence of $N$s and eigenfunctions in $H_N$ was identified such that as $N \to \infty$ these eigenfunctions tend to a limit that is not uniform on $\mathbb{T}^2$, but instead has part of its mass concentrated on fixed points of the classical map. This represents the first proof of scarring in the semiclassical limit in a strongly chaotic system [11].
• A proof that star graphs (a particular family of quantum graphs) are not quantum ergodic in the limit as the number of bonds tends to infinity, even though any given star graph is ergodic, and a proof that scarring survives in the semiclassical limit for certain particular subsequences of the eigenstates in these systems [2].
• A rigorous calculation of the rate of approach to the quantum ergodic limit as $N \to \infty$ for the linear hyperbolic automorphisms of $\mathbb{T}^2$ [17]. The result obtained is consistent with the general heuristic expectations of [10], once the non-generic (Hecke-type) symmetries of the cat maps are taken into account.
• The investigation, involving both extensive numerics and heuristic arguments, of the statistical properties of nodal domains in the quantum eigenfunctions of chaotic systems, and in the random wave model, leading to a number of extremely interesting conjectures [5, 6].

4.2. Eigenvalue statistics. The main results relating to spectral statistics include the following.
• Proof of the Berry-Tabor conjecture for pair correlations in the case of a free particle on a $k$-dimensional torus threaded by flux lines of strength $\alpha = (\alpha_1, \dots, \alpha_k)$, if $\alpha$ is diophantine of type $\kappa < \frac{k-1}{k-2}$ and the components of $(\alpha, 1)$ are linearly independent over $\mathbb{Q}$ [18]. This then provides specific values of $\alpha$ for which the Berry-Tabor conjecture holds. Previous rigorous results were only able to establish convergence to the Poisson limit for almost all systems in certain families.
• The development of a theoretical understanding of the origins of the correlations between classical periodic orbits in chaotic systems that are related, via the trace formulae, to the random-matrix conjecture for spectral statistics [23].
• Applications of random matrix theory to shed significant new light on some very deep and long-standing problems in the theory of the Riemann zeta-function and families of L-functions; in particular, the development of random-matrix inspired conjectures for the moments of $\zeta(1/2 + it)$ and for families of L-functions evaluated at the centre of the critical strip [9].

Acknowledgement. I am sure that all of the other members of the Network would wish to join me in acknowledging the outstanding leadership provided by our coordinator, Dr. Jonathan Robbins, and in expressing our gratitude to the EU for funding this collaboration.

References
[1] N. Argaman, F.M. Dittes, E. Doron, J.P. Keating, A. Kitaev, M. Sieber and U. Smilansky, Correlations in the actions of periodic orbits derived from quantum chaos, Phys. Rev. Lett., 71: 4326–4329 (1993).
[2] G. Berkolaiko, J.P. Keating and B. Winn, No quantum ergodicity for star graphs, Commun. Math. Phys., 250: 259–285 (2004).
[3] M.V. Berry, Regular and irregular semiclassical wavefunctions, J. Phys. A, 10: 2083–2091 (1977).
[4] M.V. Berry and M. Tabor, Level clustering in the regular spectrum, Proc. Roy. Soc. Lond. A, 356: 375–394 (1977).
[5] G. Blum, S. Gnutzmann and U. Smilansky, Nodal domains statistics: A criterion for quantum chaos, Phys. Rev. Lett., 88: 114101 (2002).
[6] E. Bogomolny and C. Schmit, Percolation model for nodal domains of chaotic wave functions, Phys. Rev. Lett., 88: 114102 (2002).
[7] O. Bohigas, M.-J. Giannoni and C. Schmit, Characterization of chaotic quantum spectra and universality of level fluctuation laws, Phys. Rev. Lett., 52: 1–4 (1984).
[8] J. Bolte and R. Glaser, A semiclassical Egorov theorem and quantum ergodicity for matrix valued operators, Commun. Math. Phys., 247: 391–419 (2004).
[9] J.B. Conrey, D.W. Farmer, J.P. Keating, M.O. Rubinstein and N.C. Snaith, Integral moments of L-functions, Proc. Lond. Math. Soc., in the press.
[10] B. Eckhardt, S. Fishman, J. Keating, O. Agam, J. Main and K. Müller, Approach to ergodicity in quantum wave functions, Phys. Rev. E, 52: 5893–5903 (1995).
[11] F. Faure, S. Nonnenmacher and S. De Bievre, Scarred eigenstates for quantum cat maps of minimal periods, Commun. Math. Phys., 239: 449–492 (2003).
[12] J.H. Hannay and M.V. Berry, Quantization of linear maps on the torus – Fresnel diffraction by a periodic grating, Physica D, 1: 267–290 (1980).
[13] E.J. Heller, Bound state eigenfunctions of classically chaotic Hamiltonian systems – scars of periodic orbits, Phys. Rev. Lett., 53: 1515–1518 (1984).
[14] N.M. Katz and P. Sarnak, Random Matrices, Frobenius Eigenvalues and Monodromy, American Mathematical Society Colloquium Publications, 45. American Mathematical Society, Providence, Rhode Island, 1999.
[15] J.P. Keating, The cat maps: quantum mechanics and classical motion, Nonlinearity, 4: 309–341 (1991).
[16] P. Kurlberg and Z. Rudnick, Hecke theory and equidistribution for the quantization of linear maps of the torus, Duke Math. J., 103: 47–77 (2000).
[17] P. Kurlberg and Z. Rudnick, On the distribution of matrix elements for the quantum cat map, Annals of Math., in the press.
[18] J. Marklof, Pair correlation densities of inhomogeneous quadratic forms, Annals of Math., 158: 419–471 (2003).
[19] H.L. Montgomery, The pair correlation of zeros of the zeta function, Proc. Symp. Pure Math., 24: 181–193 (1973).
[20] A.M. Odlyzko, The $10^{20}$-th zero of the Riemann zeta function and 70 million of its neighbors, Preprint, 1989.
[21] Z. Rudnick and P. Sarnak, The behaviour of eigenstates of arithmetic hyperbolic manifolds, Commun. Math. Phys., 161: 195–213 (1994).
[22] A.I. Schnirelman, Ergodic properties of eigenfunctions, Uspehi Mat. Nauk, 29: 181–182 (1974).
[23] M. Sieber, Leading off-diagonal approximation for the spectral form factor for uniformly hyperbolic systems, J. Phys. A, 35: L613–L619 (2002).

J.P. Keating
School of Mathematics
University of Bristol
Bristol BS8 1TW, UK
The Research Training Network “Algebraic Combinatorics in Europe”
C. Krattenthaler

Abstract. I present four highlights from the scientific activities of the eight teams of the Research Training Network “Algebraic Combinatorics in Europe” in Bordeaux, Jerusalem, Linköping, Lyon, Marburg, Marne-la-Vallée, Roma, and Wien:
• the proof of the combinatorial invariance conjecture for Kazhdan–Lusztig polynomials in an important special case;
• the proof of the conjectures on the random assignment problem;
• important progress on the Neggers–Stanley Conjecture on posets and the Charney–Davis conjecture on (certain) simplicial complexes;
• new bijections between maps and trees with interesting applications to the Ising and the hard particle model on maps.
1. Brief introduction to ACE

The Research Training Network “Algebraic Combinatorics in Europe” (ACE) is a research training network in the 5th Framework Programme “Improving Human Potential and the Socio-Economic Knowledge Base” of the European Commission, running from September 1, 2002 to August 31, 2005, of which I happen to be the coordinator. The network consists of eight teams at
• the Université Bordeaux 1 (Scientist-in-Charge: Mireille Bousquet-Mélou);
• the Hebrew University of Jerusalem, the Weizmann Institute, Rehovot and the Bar Ilan University, Ramat Gan (Scientist-in-Charge: Ehud Friedgut);
• Linköpings Universitet, KTH Stockholm and Chalmers Tekniska Högskola (Scientist-in-Charge: Svante Linusson);
• the Université Claude Bernard Lyon 1 – CNRS Rhône–Alpes (Scientist-in-Charge: Christian Krattenthaler); before, until August 31, 2003: Université Louis Pasteur, Strasbourg (Scientist-in-Charge: Dominique Foata);
• the Université de Marne-la-Vallée (Scientists-in-Charge: Jacques Désarménien and Jean-Yves Thibon);
• the Philipps–Universität Marburg (Scientist-in-Charge: Volkmar Welker);
• the Università di Roma “Tor Vergata” (Scientist-in-Charge: Francesco Brenti);
• the Universität Wien (Scientist-in-Charge: Markus Fulmek).

Research partially supported by EC’s IHRP Programme, grant HPRN-CT-2001-00272, “Algebraic Combinatorics in Europe”.
The members of the Scientific Committee of the network are Anders Björner, Dominique Foata, Gil Kalai and Alain Lascoux. In line with the goals of the 5th Framework Programme, the network trains pre- and post-doctoral researchers, by letting them work within the joint work programme of the eight teams. I refer the reader to the WWW site of the network at http://igd.univ-lyon1.fr/~kratt/ace for the various activities (the series of meetings Séminaire Lotharingien de Combinatoire, other meetings and conferences, summer schools, preprints, etc.) that are also open to the public outside of the network.

In the next section, I try to briefly indicate what one should understand by “Algebraic Combinatorics.” Then, in Section 3, I shall give a list of themes and objects that the teams of the network are interested in. Finally, in Sections 4–7, I present some of the advances that have been made by members of the network which I consider important.

2. What is “Algebraic Combinatorics”?

Clearly, this puts me in a difficult position since it is probably already difficult to “define” Combinatorics itself. Maybe it is preferable to quote “the” prototypical example, representation theory of the symmetric group (cf. [28, 29, 41]). This is an algebraic subject, the study of which leads necessarily to combinatorial objects like partitions (which label the irreducible representations), standard Young tableaux (for constructing bases of the irreducible representations), and to combinatorial algorithms like the Robinson–Schensted correspondence and Schützenberger’s jeu de taquin, which play an essential role in deeper studies of the subject.
Let us content ourselves with the following: Algebraic Combinatorics concerns itself, on the one hand, with the study of combinatorial problems arising from other fields and, on the other hand, with the application and use of techniques coming from other parts of mathematics to combinatorial problems, assuming that there is some algebraic flavor on (at least) one of the two sides.

3. The research themes of ACE

The “definition” given in the previous section is clearly still rather wide. More specifically, the research of the network features three main directions: Enumerative Combinatorics (with permutations, words, plane partitions, alternating sign matrices, fully packed loop configurations, maps, trees, paths as objects of study), Combinatorics of Coxeter Groups and Related Polynomials (with symmetric functions, non-commutative symmetric functions, quasi-symmetric functions, and related combinatorial Hopf algebras, Kazhdan–Lusztig polynomials, Schubert polynomials, Macdonald polynomials as objects of study), and Geometric Combinatorics (with simplicial complexes, order complexes of posets,
combinatorial homotopy and homology, discrete Morse theory, algebraic shifting as objects of study). The following sections bring to the fore four highlights from the research done by the network.

4. Combinatorial invariance of Kazhdan–Lusztig polynomials

Here, I describe the recent advance that has been made by Brenti (Roma team), Caselli (post-doc in the network) and Marietti (Roma team) towards the proof of one of the outstanding conjectures on Kazhdan–Lusztig polynomials. For in-depth introductions into the subject, I refer the reader to [31, Ch. 7] and [6, Ch. 6]. I also point out the survey article [14].

The Kazhdan–Lusztig polynomials are polynomials $P_{u,v}(q)$ in $q$, which are indexed by elements $u, v$ of a Coxeter group $W$. For the sake of simplicity, for the most part of this section, I shall restrict my explanations to the case where $W$ is the symmetric group $S_n$ of permutations of $\{1, 2, \dots, n\}$.

To start with, I have to define the Kazhdan–Lusztig polynomials for the symmetric group $S_n$. Elements of $S_n$ will be written in one-line notation $\sigma = \sigma_1 \sigma_2 \dots \sigma_n$, meaning the permutation

$$\sigma = \begin{pmatrix} 1 & 2 & \dots & n \\ \sigma_1 & \sigma_2 & \dots & \sigma_n \end{pmatrix}.$$

The length of a permutation $\sigma$, denoted by $\ell(\sigma)$, is the number of inversions of $\sigma$, an inversion being a pair $(i, j)$ with $i < j$ but $\sigma(i) > \sigma(j)$. For example, we have $\ell(2413) = 3$, the inversions being the pairs (1, 3), (2, 3) and (2, 4).

The definition of the Kazhdan–Lusztig polynomials depends heavily on the Bruhat order on $S_n$, which I have to explain next. Let $u, v \in S_n$. Write $u \to v$ if there exists a transposition $(i, j)$ (i.e., a permutation which interchanges the elements $i$ and $j$ but leaves fixed all other elements) such that

$$u \circ (i, j) = v \quad \text{and} \quad \ell(u) < \ell(v).$$
For example, we have $2143 \to 3142$. Figure 1 shows the Bruhat graph for $S_3$, which is the graph containing all the relations $\to$ for $S_3$.
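The two definitions above (length as the number of inversions, and the relation $\to$) are easy to experiment with. The following is a minimal sketch, assuming permutations are stored as Python tuples in one-line notation; the function names are illustrative and not part of the original text.

```python
from itertools import combinations, permutations


def length(sigma):
    """Number of inversions of a permutation given in one-line notation."""
    return sum(1 for i, j in combinations(range(len(sigma)), 2)
               if sigma[i] > sigma[j])


def times_transposition(u, i, j):
    """One-line notation of u∘(i,j): the entries in positions i and j
    (1-based) of u are exchanged."""
    v = list(u)
    v[i - 1], v[j - 1] = v[j - 1], v[i - 1]
    return tuple(v)


def arrow(u, v):
    """True if u -> v, i.e. v = u∘(i,j) for some transposition (i,j)
    and length(u) < length(v)."""
    n = len(u)
    return length(u) < length(v) and any(
        times_transposition(u, i, j) == v
        for i, j in combinations(range(1, n + 1), 2))


def bruhat_graph(n):
    """All pairs (u, v) of permutations in S_n with u -> v."""
    perms = list(permutations(range(1, n + 1)))
    return [(u, v) for u in perms for v in perms if arrow(u, v)]
```

For example, `length((2, 4, 1, 3))` counts the three inversions listed above, `arrow((2, 1, 4, 3), (3, 1, 4, 2))` checks the relation $2143 \to 3142$, and `bruhat_graph(3)` lists the arrows of the Bruhat graph of $S_3$.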
Figure 1. The Bruhat graph of $S_3$ (vertices, from bottom to top: 123; 213, 132; 231, 312; 321)
The Bruhat order then is the partial order on $S_n$ which is the transitive closure of the relation $\to$, that is,

$$u \le v \quad \text{if } u = v \text{ or else there exist } w_1, w_2, \dots, w_k \text{ such that } u \to w_1 \to \dots \to w_k \to v.$$
Figure 2 shows the Hasse diagram of the Bruhat order on $S_3$ (which encodes the partial order $\le$ in the way that $u \le v$ if and only if $u$ is lower than $v$ and there is an upward path from $u$ to $v$ in the diagram).

Figure 2. The Bruhat order of $S_3$ (levels, from bottom to top: 123; 213, 132; 231, 312; 321)

We are now almost in the position to be able to define the Kazhdan–Lusztig polynomials for $S_n$, except that, first, we have to define the so-called R-polynomials. These are also polynomials in $q$ indexed by pairs of permutations. The definition of the R-polynomials $R_{u,v}(q)$, $u, v \in S_n$, is recursive:
• $R_{u,v}(q) = 0$ if $u \not\le v$;
• $R_{u,v}(q) = 1$ if $u = v$;
• for any adjacent transposition $s = (i, i + 1)$ with $\ell(vs) < \ell(v)$, we have

$$R_{u,v}(q) = \begin{cases} R_{us,vs}(q) & \text{if } \ell(us) < \ell(u), \\ q R_{us,vs}(q) + (q - 1) R_{u,vs}(q) & \text{if } \ell(us) > \ell(u). \end{cases}$$

In fact, it is not at all clear that this is a coherent definition, i.e., it is not clear whether there are indeed polynomials $R_{u,v}(q)$ satisfying these (somewhat strange) properties. This is, however, a theorem due to Kazhdan and Lusztig [32] (see also [31, Sec. 7.5]).

The Kazhdan–Lusztig polynomials $P_{u,v}(q)$, $u, v \in S_n$, are then defined by
• $P_{u,v}(q) = 0$ if $u \not\le v$;
• $P_{u,v}(q) = 1$ if $u = v$;
• $\deg P_{u,v}(q) < \frac{1}{2}\big(\ell(v) - \ell(u)\big)$ if $u < v$;
• $q^{\ell(v) - \ell(u)} P_{u,v}(1/q) = \sum_{u \le a \le v} R_{u,a}(q)\, P_{a,v}(q)$ if $u \le v$,
which involves the R-polynomials in the last rule. Again, one can show that this is a coherent definition (see [32] or [31, Sec. 7.10 and 7.11]).

Clearly, in a first run, this is completely impossible to grasp. It seems therefore that we should better abandon this and do something else. (I almost
hesitate to add that it is straightforward to extend the above definitions of R-polynomials and Kazhdan–Lusztig polynomials for $S_n$ to arbitrary Coxeter groups.)

However: Kazhdan–Lusztig polynomials for a Coxeter group $W$ are instrumental in the construction of certain important representations of the Hecke algebra associated to $W$. This is, in fact, where these polynomials arose for the first time [32]. (In that context, the artificial-looking definition of the Kazhdan–Lusztig polynomials becomes natural.) Furthermore, Kazhdan and Lusztig have shown in [33] that, for Weyl groups $W$, the coefficients of $P_{u,v}(q)$ are dimensions of local intersection homology of the Schubert variety associated to $W$. (Thus, in particular, the coefficients of $P_{u,v}(q)$ are non-negative in that case, which is not at all evident from the definition of $P_{u,v}(q)$. For arbitrary Coxeter groups, this is still an open conjecture.) Furthermore, Kazhdan–Lusztig polynomials appear also in the theory of Verma modules (see, e.g., [4], [19]), the algebraic geometry and topology of Schubert varieties (see, e.g., [33], [34], [6]), canonical bases ([27], [50]), and immanent inequalities ([30]).

Thus, these complicatedly defined polynomials are of fundamental importance in representation theory, Schubert calculus, and related areas. In particular, it is of greatest interest to study their properties, because this would then have immediate implications in representation theory, Schubert calculus, etc. In this regard, over 20 years ago Lusztig (in private) and Dyer [26] made the following surprising invariance conjecture.

Conjecture 4.1. $P_{u,v}(q)$ depends only on the combinatorial structure of the interval $[u, v] := \{x : u \le x \le v\}$; that is, given two Coxeter groups $W$ and $W'$, if $u, v \in W$ and $u', v' \in W'$, and if $[u, v] \cong [u', v']$ as partially ordered sets (posets), then $P_{u,v}(q) = P_{u',v'}(q)$.
For example, if $W = W' = S_4$, then the intervals [1234, 3142] and [1234, 4123] are isomorphic as posets, as is shown in Figure 3, and indeed one can verify that $P_{1234,3142}(q) = P_{1234,4123}(q) = 1$.

Figure 3. The intervals [1234, 3142] and [1234, 4123] (levels, from bottom to top: 1234; 1324, 1243, 2134; 1342, 3124, 2143; 3142 for the first interval, and 1234; 1243, 1324, 2134; 1423, 2143, 3124; 4123 for the second)
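The recursive definitions of the R-polynomials and of the Kazhdan–Lusztig polynomials for $S_n$ can be turned into a small program, which is one way to "verify" values such as the ones just quoted. The sketch below is only an illustration under stated assumptions, not the method of any of the cited papers: permutations are tuples in one-line notation, polynomials in $q$ are tuples of integer coefficients (lowest degree first, `()` for the zero polynomial), the Bruhat order is computed naively as the transitive closure of $\to$, and the R-recursion is applied with an adjacent transposition $s$ satisfying $\ell(vs) < \ell(v)$. All function names are made up for this example.

```python
from functools import lru_cache
from itertools import combinations


def length(p):
    return sum(1 for i, j in combinations(range(len(p)), 2) if p[i] > p[j])


def swap(p, i, j):
    """p∘(i,j) for 0-based positions i, j."""
    r = list(p)
    r[i], r[j] = r[j], r[i]
    return tuple(r)


@lru_cache(maxsize=None)
def upper_set(u):
    """All v with u <= v: follow arrows x -> x∘(i,j) that raise the length."""
    seen, stack = {u}, [u]
    while stack:
        x = stack.pop()
        for i, j in combinations(range(len(x)), 2):
            y = swap(x, i, j)
            if length(y) > length(x) and y not in seen:
                seen.add(y)
                stack.append(y)
    return frozenset(seen)


def leq(u, v):
    return v in upper_set(u)


def trim(coeffs):
    c = list(coeffs)
    while c and c[-1] == 0:
        c.pop()
    return tuple(c)


def add(a, b):
    n = max(len(a), len(b))
    a, b = list(a) + [0] * (n - len(a)), list(b) + [0] * (n - len(b))
    return trim(x + y for x, y in zip(a, b))


def mul(a, b):
    if not a or not b:
        return ()
    c = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            c[i + j] += x * y
    return trim(c)


@lru_cache(maxsize=None)
def R(u, v):
    if not leq(u, v):
        return ()
    if u == v:
        return (1,)
    # s = (i, i+1) with l(vs) < l(v): any descent position of v will do.
    i = next(k for k in range(len(v) - 1) if v[k] > v[k + 1])
    us, vs = swap(u, i, i + 1), swap(v, i, i + 1)
    if u[i] > u[i + 1]:                       # l(us) < l(u)
        return R(us, vs)
    return add(mul((0, 1), R(us, vs)), mul((-1, 1), R(u, vs)))


@lru_cache(maxsize=None)
def P(u, v):
    if not leq(u, v):
        return ()
    if u == v:
        return (1,)
    d = length(v) - length(u)
    # f = sum over u < a <= v of R(u,a) P(a,v) = q^d P(u,v)(1/q) - P(u,v)
    f = ()
    for a in upper_set(u):
        if a != u and leq(a, v):
            f = add(f, mul(R(u, a), P(a, v)))
    # deg P(u,v) < d/2, so P(u,v) is minus the part of f of degree < d/2.
    return trim(-c for k, c in enumerate(f) if 2 * k < d)
```

With these conventions, `P((1, 2, 3, 4), (3, 1, 4, 2))` and `P((1, 2, 3, 4), (4, 1, 2, 3))` return the coefficient tuples of the two polynomials compared in the example above. The naive closure in `upper_set` is only practical for small $n$.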
Conjecture 4.1 has been shown by Brenti, Caselli and Marietti [16] for the important special case where $u = e$ and $u' = e'$, $e$ and $e'$ denoting the identity elements in $W$ and $W'$, respectively. That is, we have the following theorem.

Theorem 4.2. For any Coxeter group, the Kazhdan–Lusztig polynomial $P_{e,v}(q)$ depends only on the poset structure of $[e, v]$.

Remark 4.3. For the case of $W = W' = S_n$, this was proved by Brenti [15]. Independently, du Cloux [22] showed this for finite and affine Coxeter groups. Furthermore, Delanoy [24] proved it for simply-laced Coxeter groups.

The proof of Theorem 4.2 begins with the observation that it suffices to show the claim for the R-polynomials, i.e., to show that if $x, y \in [e, v]$ then $R_{x,y}(q)$ depends only on the poset structure of $[e, v]$. This follows directly from the definition of the R- and the Kazhdan–Lusztig polynomials. To accomplish the latter, the crucial idea is that of a special matching, introduced in [15]. To motivate this idea, let us recall that, given $s = (i, i + 1)$ such that $\ell(vs) < \ell(v)$, the map

$$x \mapsto xs \tag{4.1}$$

defines a matching of the elements of $[e, v]$ with the property:

$$\text{if } x \lhd y \text{ and } xs \ne y, \text{ then } xs \le ys. \tag{4.2}$$

(Here, $x \lhd y$ means that $x < y$ and there is no $z$ with $x < z < y$.) Here is the definition of a special matching. As before, we view a matching as a fixed-point free involution, i.e., as a function $M$ without fixed points such that $M(M(x)) = x$ for all $x$.

Definition 4.4. A matching $M$ is called special if for all $x$ and $y$ we have:

$$\text{if } x \lhd y \text{ and } M(x) \ne y, \text{ then } M(x) \le M(y). \tag{4.3}$$

The reader should note the analogy between (4.2) and (4.3). In particular, the map (4.1) is a special matching. In order to give an example, let us consider the poset given in Figure 4 (which is indeed a lower interval $[e, v]$ in a Coxeter group). Then the pairing {{1, 2}, {3, 7}, {4, 9}, {5, 14}, {6, 12}, {8, 13}, {10, 15}, {11, 17}, {16, 18}} defines a special matching.

What Brenti, Caselli and Marietti manage to prove is the following theorem, which “generalizes” the original definition of the R-polynomials given earlier.

Theorem 4.5. Let $W$ be a Coxeter group, $v \in W$, and let $M$ be a special matching defined on the interval $[e, v]$. Then

$$R_{u,v}(q) = \begin{cases} R_{M(u),M(v)}(q) & \text{if } M(u) \lhd u, \\ q R_{M(u),M(v)}(q) + (q - 1) R_{u,M(v)}(q) & \text{if } M(u) \rhd u. \end{cases}$$

Theorem 4.5 implies immediately the Invariance Theorem 4.2.
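Definition 4.4 can be checked mechanically on small lower intervals of $S_n$. The sketch below is an illustration only, under the following assumptions: permutations are tuples in one-line notation, the interval $[e, v]$ and the order relation are computed naively from the relation $\to$, and a matching is tested exactly as described in the text (a fixed-point free involution of $[e, v]$). The function names, including `right_multiplication`, are hypothetical helpers introduced for this example.

```python
from itertools import combinations


def length(p):
    return sum(1 for i, j in combinations(range(len(p)), 2) if p[i] > p[j])


def swap(p, i, j):
    """p∘(i,j) for 0-based positions i, j."""
    r = list(p)
    r[i], r[j] = r[j], r[i]
    return tuple(r)


def lower_interval(v):
    """The set [e, v]: everything reachable from v by length-decreasing swaps."""
    seen, stack = {v}, [v]
    while stack:
        x = stack.pop()
        for i, j in combinations(range(len(x)), 2):
            y = swap(x, i, j)
            if length(y) < length(x) and y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


def is_special_matching(M, v):
    """Check that M is a fixed-point free involution of [e, v] satisfying (4.3)."""
    interval = lower_interval(v)
    if not all(M(x) in interval and M(x) != x and M(M(x)) == x
               for x in interval):
        return False
    below = {y: lower_interval(y) for y in interval}

    def leq(x, y):
        return x in below[y]

    def covered_by(x, y):            # x is covered by y: x < y, nothing between
        return x != y and leq(x, y) and not any(
            z != x and z != y and leq(x, z) and leq(z, y) for z in interval)

    return all(leq(M(x), M(y))
               for x in interval for y in interval
               if covered_by(x, y) and M(x) != y)


def right_multiplication(i):
    """The map x -> x∘(i+1, i+2), written with a 0-based position i."""
    return lambda x: swap(x, i, i + 1)
```

For $v = 3142$ the adjacent transpositions with $\ell(vs) < \ell(v)$ sit at the descents of $v$ (0-based positions 0 and 2), so `is_special_matching(right_multiplication(0), (3, 1, 4, 2))` tests the map (4.1) on the first interval of Figure 3. Position 1 is not a descent of 3142, and there the map does not even preserve the interval.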
Figure 4. A lower interval in a symmetric group (Hasse diagram with 18 elements; levels, from bottom to top: 1; 2, 3, 4; 5, 6, 7, 8, 9; 10, 11, 12, 13, 14; 15, 16, 17; 18)
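$$\begin{pmatrix} 3 & 1 & 4 & 2 \\ 2 & 5 & 7 & 3 \\ 1 & 3 & 1 & 5 \\ 2 & 4 & 1 & 6 \end{pmatrix}$$

Figure 5. A matrix, an assignment, and a minimum assignment (the assignment shown selects the entries 3, 7, 5, 4 in positions (1,1), (2,3), (3,4), (4,2); the minimum assignment selects the entries 1, 3, 1, 1 in positions (1,2), (2,4), (3,1), (4,3))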
5. The random assignment problem

The purpose of this section is to report the solution of the random assignment problem by Linusson and Wästlund (Linköping team), the origins of which lie in conjectures by Mézard and Parisi [37, 39].

Let $M$ be an $n \times n$ matrix of non-negative real numbers. An assignment is a selection of entries of $M$, such that we take exactly one from each row and each column. A minimum assignment is an assignment in which the sum of its entries is minimal. Figure 5 shows a 4 × 4 matrix, together with an assignment (with sum of entries equal to 19), and a minimum assignment (with sum of entries equal to 6). (Readers familiar with graph theory and combinatorial optimization will realize that finding the minimum assignment in an $n \times n$ matrix $M$ is equivalent to finding the minimum weight perfect matching in the corresponding bipartite graph, in which the edge from $i$ to $j$ has weight $M_{ij}$.)

Now let us choose the entries of the matrix $M$ independently at random according to the exponential distribution with mean 1. The problem that we pose ourselves is to compute the expected value of the minimum assignment.
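For a matrix as small as the one in Figure 5, the definitions can be checked by brute force. The following minimal sketch enumerates all $n!$ assignments, which is feasible only for very small $n$ (in practice one would use a polynomial-time method such as the Hungarian algorithm). The matrix entries are those read off Figure 5; the names `FIGURE_5`, `assignment_sum` and `minimum_assignment` are illustrative.

```python
from itertools import permutations

# The 4 x 4 matrix of Figure 5, row by row.
FIGURE_5 = [
    [3, 1, 4, 2],
    [2, 5, 7, 3],
    [1, 3, 1, 5],
    [2, 4, 1, 6],
]


def assignment_sum(matrix, cols):
    """Sum of the entries matrix[i][cols[i]]; cols is a permutation of the
    column indices (0-based), so one entry is taken per row and column."""
    return sum(row[c] for row, c in zip(matrix, cols))


def minimum_assignment(matrix):
    """Return (minimal sum, one minimising choice of columns) by trying
    every permutation of the columns."""
    n = len(matrix)
    best = min(permutations(range(n)),
               key=lambda cols: assignment_sum(matrix, cols))
    return assignment_sum(matrix, best), best
```

Here `assignment_sum(FIGURE_5, (0, 2, 3, 1))` evaluates the first assignment shown in the figure, and `minimum_assignment(FIGURE_5)` searches for a minimum one.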
$$\begin{pmatrix} 3 & 1 & 4 & 2 & 4 \\ 2 & 5 & 7 & 3 & 1 \\ 1 & 3 & 1 & 5 & 6 \end{pmatrix}$$

Figure 6. A rectangular matrix, and a minimum 2-assignment (the circled entries are the 1s in positions (3,1) and (2,5))

Using the (highly non-rigorous) replica method, Mézard and Parisi [37] predicted that

$$\lim_{n \to \infty} E(\text{min. assignment}) = \frac{\pi^2}{6}. \tag{5.1}$$

(Instead of the exponential distribution, they do in fact consider the uniform distribution on [0, 1], but, modulo physical intuition, in the limit $n \to \infty$ this is equivalent.) Several years later, Parisi [39] discovered that, apparently, there was a finite version of this conjecture.

Conjecture 5.1. If the entries of an $n \times n$ matrix $M$ are chosen independently, each according to the exponential distribution with mean 1, then

$$E(\text{min. assignment}) = 1 + \frac{1}{4} + \frac{1}{9} + \dots + \frac{1}{n^2}.$$

Clearly, this is well in accordance with the prediction (5.1). When one first sees the conjecture then one has the feeling that one should be able to resolve it during a weekend, say. However, one quickly realizes that the simplicity of the statement is deceiving. In particular, it is probably too special to be approached directly, so that one is quickly led to look for generalizations that may be easier to prove.

One possible direction of generalization could be to extend the conjecture to rectangular matrices, $m \times n$ matrices $M$ say. In that case, one has to also adapt the notion of assignment. In fact, we generalize the concept to the concept of a $k$-assignment of $M$, which is a selection of $k$ elements of $M$ out of which no two elements are in the same row or in the same column. For example, for the 3 × 5 matrix in Figure 6, the circled elements mark a 2-assignment, which is in fact also a minimum 2-assignment. Coppersmith and Sorkin [23] came up with the following surprising generalization of Conjecture 5.1.

Conjecture 5.2. If the entries of an $m \times n$ matrix $M$ are chosen independently, each according to the exponential distribution with mean 1, then

$$E(\text{min. $k$-assignment}) = \sum_{\substack{i, j \ge 0 \\ i + j < k}} \frac{1}{(m - i)(n - j)}.$$
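The formula of Conjecture 5.2 is easy to evaluate exactly for small parameters, which gives a quick sanity check of its relation to Conjecture 5.1. The sketch below assumes that the sum runs over all integers $i, j \ge 0$ with $i + j < k$ (the summation range of the Coppersmith–Sorkin formula) and that $k \le \min(m, n)$; the function names are illustrative.

```python
from fractions import Fraction


def coppersmith_sorkin(m, n, k):
    """Right-hand side of Conjecture 5.2: the sum of 1/((m-i)(n-j)) over all
    i, j >= 0 with i + j < k. Assumes k <= min(m, n)."""
    return sum(Fraction(1, (m - i) * (n - j))
               for i in range(k) for j in range(k - i))


def parisi(n):
    """Right-hand side of Conjecture 5.1: 1 + 1/4 + ... + 1/n^2."""
    return sum(Fraction(1, i * i) for i in range(1, n + 1))
```

Comparing `coppersmith_sorkin(n, n, n)` with `parisi(n)` for a few small values of `n` illustrates the case $m = n = k$; exact rational arithmetic avoids any rounding issues in the comparison.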
It is not immediately obvious, but it is not too difficult to show, that for m = n = k this implies Conjecture 5.1. The reader may wonder why we did not consider another source of possible generalization: why not replace the exponential distribution with mean 1 by exponential distributions with other means? (That the distribution needs to
be exponential seems to be instrumental if one wants to have exact results for the finite problem.) This leads us to the conjecture by Buck, Chan and Robbins [20].

Conjecture 5.3. If the entries of an m × n matrix M are chosen independently, the (i, j)-entry according to the exponential distribution with mean $1/(u_i v_j)$, then
\[ E(\text{min. $k$-assignment}) = \sum_{X\subseteq\{1,\dots,m\}} \sum_{Y\subseteq\{1,\dots,n\}} \frac{(-1)^{k-1-|X|-|Y|}}{u_{\bar X}\, v_{\bar Y}} \binom{m+n-1-|X|-|Y|}{k-1-|X|-|Y|}, \]
where $u_{\bar X} = \sum_{i=1,\, i\notin X}^{m} u_i$, and similarly for $v_{\bar Y}$.
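As a sanity check on the shape of this formula, the sketch below evaluates the double sum by enumerating subsets. It rests on two assumptions: X ranges over subsets of the row indices and Y over subsets of the column indices, and the binomial coefficient is read as zero when its lower index is negative, so that only pairs with |X| + |Y| ≤ k − 1 contribute. The function names are hypothetical, and k ≤ min(m, n) is required.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def subsets(indices, max_size):
    """Yield all subsets of `indices` with at most `max_size` elements."""
    for size in range(max_size + 1):
        yield from combinations(indices, size)


def bcr_expected_min(u, v, k):
    """Evaluate the Buck-Chan-Robbins expression when the (i, j)-entry has mean 1/(u[i] v[j])."""
    m, n = len(u), len(v)
    total = Fraction(0)
    for X in subsets(range(m), k - 1):
        for Y in subsets(range(n), k - 1 - len(X)):
            r = k - 1 - len(X) - len(Y)          # lower index of the binomial
            u_bar = sum(u[i] for i in range(m) if i not in X)
            v_bar = sum(v[j] for j in range(n) if j not in Y)
            sign = (-1) ** r
            total += Fraction(sign * comb(m + n - 1 - len(X) - len(Y), r)) / (u_bar * v_bar)
    return total


# With all u_i = v_j = 1 the value should reduce to the one in Conjecture 5.2.
print(bcr_expected_min([1, 1], [1, 1], 2))
```

For k = 1 only X = Y = ∅ contributes, and the expression is the mean of the minimum of mn independent exponential variables, as it should be.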
Again, it is not completely obvious but not too difficult to show that Conjecture 5.3 implies Conjecture 5.2, and thus Conjecture 5.1; see [20].

Now, a naïve way to try to prove this is to somehow apply induction. For example, let us suppose that $M_{11}$ happens to be the smallest element in the first row. One can show that, conditioning on that event, the differences $M_{12} - M_{11}, \dots, M_{1n} - M_{11}$ are again independent and exponentially distributed. Hence, one subtracts $M_{11}$ from all the entries in the first row, thus creating a 0 in the place of the (1, 1)-entry. One could imagine doing something similar for other rows, or columns. In the end, one will realize that, for such an approach to be successful, it will be necessary to have a formula for matrices in which some entries are fixed to be 0. Such a formula was found, and proved, by Linusson and Wästlund [36]; see the theorem below. (It generalizes a previous one from [35], which was already sufficient to prove Conjectures 5.1 and 5.2. We should remark that a proof of the prediction (5.1) was found earlier by Aldous [1], and that a proof of Conjectures 5.1 and 5.2 different from the one in [35] has been found by Nair, Prabhakar and Sharma [38].) Here is the main theorem from [36].

Theorem 5.4. Let M be a (random) m × n matrix in which the entries in a certain (fixed) set Z are all 0, while an (i, j)-entry outside of Z is distributed exponentially with mean $1/(u_i v_j)$. Then
\[ E(\text{min. $k$-assignment}) = \sum_{(X,Y)\in J_k(M)} \frac{-\mu\bigl((X,Y),\hat 1\bigr)}{u_{\bar X}\, v_{\bar Y}}. \]
Clearly, I need to explain the notation which is used in the statement of the theorem. A subset λ of the rows and columns of M is called a cover if all entries of Z are in λ. A (k − 1)-cover is one consisting of exactly k − 1 rows and columns. A partial (k − 1)-cover is a subset of a (k − 1)-cover. Let $J_k(M)$ be the set of partial (k − 1)-covers, and let $\hat J_k(M)$ be the poset consisting of the elements of $J_k(M)$ ordered by inclusion, together with an artificial maximal element $\hat 1$. The function $\mu(\,\cdot\,, \hat 1)$ in the theorem denotes the Möbius function on the poset $\hat J_k(M)$,
i.e., the function on $\hat J_k(M)$ which is uniquely determined by $\sum_{B\supseteq A} \mu(B, \hat 1) = \delta_{A,\hat 1}$. (The reader may have wondered what a problem such as the random assignment problem, which is clearly a probabilistic problem, may have to do with algebraic combinatorics. Now, with the appearance of the Möbius function, the relation with algebraic combinatorics becomes transparent.)

The proof of the theorem is based on a clever induction, elements from the “Hungarian method” for finding the minimum weight perfect matching in a bipartite graph, König’s theorem on matchings and vertex coverings in bipartite graphs, and an urn model invented by Buck, Chan and Robbins in [20]. Again, it is not obvious that Theorem 5.4 would imply Conjectures 5.1–5.3, but this is shown in [36].

6. The Charney–Davis and the Neggers–Stanley conjectures

Here I report on progress made by Reiner and Welker (Marburg team), and by Brändén (pre-doc in the network) on two tantalizing conjectures in poset theory and the theory of simplicial complexes: the Neggers–Stanley Conjecture and the Charney–Davis Conjecture. I refer the reader to [13, 46, 47] for an introduction to the relevant theory at the interface of combinatorics and commutative algebra, and for more background information.

I begin by explaining the first of the two conjectures. Let P be a partially ordered set (poset) on {1, 2, . . . , n}. Let L(P) denote its set of linear extensions, that is, the set of $w = (w_1, w_2, \dots, w_n) \in S_n$ for which $i <_P j$ implies that i precedes j in w. The polynomial
\[ W(P,t) = \sum_{w\in L(P)} t^{\operatorname{des}(w)} \]
is the generating function for the linear extensions w ∈ L(P) counted according to the number of descents des(w), which is the number of all i, 1 ≤ i ≤ n − 1, such that $w_i > w_{i+1}$. This polynomial arose originally in the context of Stanley’s theory of P-partitions [44], but has since then appeared and found applications in many other situations. The Neggers–Stanley conjecture says the following.

Conjecture 6.1. For any labelled poset P on {1, 2, . . . , n}, the polynomial W(P, t) has only real (non-positive) zeroes.

Neggers had made this conjecture for naturally labelled posets, that is, for posets with the property that $i <_P j$ implies $i < j$.
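For small posets the polynomial W(P, t) can be computed directly from its definition. The sketch below enumerates all permutations of {1, ..., n}, keeps those that are linear extensions, and counts descents. It assumes the usual convention that w is a linear extension when i precedes j in w whenever i is smaller than j in P; the function name and the encoding of P as a list of pairs (i, j) with i below j are illustrative choices.

```python
from itertools import permutations


def w_polynomial(n, relations):
    """Return [c_0, c_1, ...], where c_d is the number of linear extensions with d descents.

    `relations` lists pairs (i, j) meaning that i is smaller than j in the poset P.
    """
    coeffs = [0] * n
    for w in permutations(range(1, n + 1)):
        position = {value: index for index, value in enumerate(w)}
        if all(position[i] < position[j] for i, j in relations):
            descents = sum(1 for a, b in zip(w, w[1:]) if a > b)
            coeffs[descents] += 1
    return coeffs


# Antichain on {1, 2, 3}: every permutation is a linear extension.
print(w_polynomial(3, []))
# Naturally labelled poset with 1 and 2 both below 3.
print(w_polynomial(3, [(1, 3), (2, 3)]))
```

For the antichain one obtains the Eulerian polynomial 1 + 4t + t², whose zeroes are real and negative, in accordance with Conjecture 6.1.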
[...] curvature. Given an abstract simplicial complex ∆ triangulating a (d − 1)-dimensional (homology) sphere, the face numbers $f_i$ count the number of i-dimensional faces. These face numbers are usually collected in the f-vector $(f_{-1}, f_0, f_1, \dots, f_{d-1})$. The information about the face numbers can also be equivalently, but very often more conveniently, encoded by the h-vector $(h_0, h_1, \dots, h_d)$, defined by
\[ \sum_{i=0}^{d} h_i\, t^{d-i} = \sum_{i=0}^{d} f_{i-1}\, (t-1)^{d-i}. \]
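The passage from the f-vector to the h-vector is a finite computation, which the following sketch carries out by expanding the polynomial $\sum_{i=0}^{d} f_{i-1}(t-1)^{d-i}$ and reading off the coefficient of $t^{d-i}$ as $h_i$. The function name and the example complexes (the boundary of the octahedron and the pentagon) are illustrative assumptions, not examples from the text.

```python
def h_vector(f):
    """Convert an f-vector (f_{-1}, f_0, ..., f_{d-1}) into the h-vector (h_0, ..., h_d)."""
    d = len(f) - 1
    poly = [0] * (d + 1)                  # poly[e] is the coefficient of t^e
    for i, face_count in enumerate(f):    # f[i] plays the role of f_{i-1}
        term = [1]                        # build (t - 1)^(d - i), lowest degree first
        for _ in range(d - i):
            term = [a - b for a, b in zip([0] + term, term + [0])]
        for e, c in enumerate(term):
            poly[e] += face_count * c
    return [poly[d - i] for i in range(d + 1)]


# Boundary of the octahedron: 6 vertices, 12 edges, 8 triangles (d = 3).
print(h_vector([1, 6, 12, 8]))
# Pentagon: 5 vertices and 5 edges (d = 2).
print(h_vector([1, 5, 5]))
```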
We denote the polynomial $\sum_{i=0}^{d} h_i t^i$ by h(∆, t). The Charney–Davis Conjecture [21, Conjecture D] concerns the evaluation of h(∆, t) at t = −1 in the case where d is even and ∆ is a flag simplicial homology (d − 1)-sphere, “flag” meaning that the minimal subsets of vertices which do not span a simplex all have cardinality two.

Conjecture 6.2. When ∆ is a flag simplicial homology (d − 1)-sphere and d is even, then $(-1)^{d/2}\, h(\Delta, -1) \ge 0$.

In [40], Reiner and Welker establish a link between these seemingly unrelated conjectures. Let P be a graded poset, that is, a poset in which every maximal chain has the same number of elements, r say. They construct a flag simplicial polytopal sphere $\Delta_{eq}(P)$ of dimension |P| − r − 1 (called the equatorial sphere) such that
\[ W(P,t) = h(\Delta_{eq}(P), t). \tag{6.1} \]
(Space does not permit reproducing this construction here.) Furthermore, by using the fact that $\Delta_{eq}(P)$ is the boundary of a simplicial convex polytope, which makes it possible to apply the g-Theorem [45] characterizing the number of faces of such polytopes, they prove that the sequence of coefficients of W(P, t) is (symmetric and) unimodal. (A sequence is unimodal if its terms are weakly increasing up to a certain point, and from that point on weakly decreasing.) Since one can show that polynomials with real non-positive zeroes automatically have unimodal coefficients, this statement is slightly weaker than Conjecture 6.1 for graded posets. As a consequence of (6.1), the Neggers–Stanley Conjecture for P implies the Charney–Davis Conjecture for $\Delta_{eq}(P)$, because it is well known that if $p(t) = p_d t^d + \cdots + p_1 t + p_0$ is a polynomial in t of even degree d with non-negative coefficients, and if p(t) is symmetric and has only real zeroes, then $(-1)^{d/2} p(-1) \ge 0$. The paper [40] does in fact put forward evidence for the thesis that both of the conjectures should be fruitfully viewed within the context of Koszul algebras. It is worthwhile to point out that this
work inspired Athanasiadis [3] to find a powerful general theorem which implies the above result by Reiner and Welker, and proves at the same time Stanley’s conjecture on the unimodality of the h-vector of the Birkhoff polytope of magic squares.

In [11], Brändén gives a different proof (avoiding the powerful g-Theorem) of symmetry and unimodality of the coefficients of W(P, t), in the generalized context of what he calls sign-graded posets P. Moreover, he is able to prove that W(P, −1) always has the “right” sign, so that, via (6.1), the Charney–Davis Conjecture holds for all the flag simplicial polytopal spheres $\Delta_{eq}(P)$.

On the slightly negative side, Brändén constructs counter-examples to the Neggers–Stanley Conjecture in [12]. However, none of his examples is a naturally labelled poset, so that it is still possible that Neggers’ original conjecture is true. There may well also be other important families of posets for which the Neggers–Stanley conjecture is still true. (In fact, this is known for some already; see [11] for references.) Also the Charney–Davis Conjecture may still be true in general.

7. Map enumeration

The subject of this final section is the recent work by Bousquet-Mélou (Bordeaux team) and Schaeffer (Marne-la-Vallée team) on map enumeration. A map is an embedding of a connected graph on a surface without crossing edges. The connected components of the complement of the graph in the surface are called faces of the map. For our purposes, the surface will always be the plane. For such a planar map, one usually marks one edge on the outer face and orients it so that the outer face is to the right of the edge. This marked edge is called the root edge of the map. Figure 7.a shows a rooted planar map. The root edge is indicated by the arrow.
Figure 7. A map (a), an Ising configuration (b), and a hard particle configuration (c)

In the Ising model on maps, one colors the vertices of the map in two colors, black and white, say; see Figure 7.b. (In the figure, “bicolored” edges, that is, edges between vertices of different color, are dotted.) The task is then to
compute the partition function (generating function, in combinatorial terms)
\[ \sum_{m\in C}\ \sum_{\text{2-colorings of } m} X^{\#\text{white vertices of } m}\; Y^{\#\text{black vertices of } m}\; u^{\#\text{bicolored edges}}, \]
for a given class C of maps.

In the hard particle model on maps, we are allowed to place “hard” particles on the vertices, but two hard particles can never be placed on neighboring vertices. See Figure 7.c for an example. Here, the task is to compute the partition function (generating function)
\[ \sum_{m\in C}\ \sum_{\text{particle configurations}} X^{\#\text{empty vertices of } m}\; Y^{\#\text{particles}}, \]
for a given class C of maps.

Before I become more specific, let me recall some (very selective) history. Map enumeration begins with the fundamental work of Tutte (see, e.g., [18, 48, 49]). Among other things, using the so-called quadratic method and Lagrange inversion, he derived miraculous closed form expressions for the number of maps in several classes of maps. The long-standing question of giving a (combinatorial) explanation of why these closed form formulas arise was answered by Schaeffer [42, 43], by constructing new bijections between maps and trees.

On the physics side, the idea of putting models on maps originates from two-dimensional quantum gravity (see [5, 17]). In many cases, the physicists found (most of the time using matrix integrals; see [25] for a recent survey) nice algebraic partition functions. Again, the question arises why these nice algebraic functions arise. Bousquet-Mélou and Schaeffer [8] give the answer again in terms of bijections between maps and trees. (This paper largely extends an earlier one [10] by Bouttier, Di Francesco and Guitter, which is likewise based on “Schaeffer-like” bijections.)

Bousquet-Mélou and Schaeffer consider a very fine partition function (generating function) for bipartite maps (these are maps in which the vertices are colored black or white, and each edge connects a black and a white vertex), namely
\[ \sum_{m \text{ a bipartite map}}\ \prod_{i\ge 1} x_i^{\#\text{white vertices of degree } i \text{ in } m}\ \prod_{i\ge 1} y_i^{\#\text{black vertices of degree } i \text{ in } m}. \]
The main theorem that they prove is the following.

Theorem 7.1. The above degree generating function for bipartite maps rooted at a black vertex of degree 2 (i.e., the root edge is directed from a black vertex to a white vertex, and the black vertex has degree 2) is equal to
\[ y_2\bigl((W_0 - B_2)^2 + W_1 - B_3 - B_2^2\bigr), \tag{7.1} \]
where the $W_i \equiv W_i(x_1, \dots, y_1, \dots)$ and $B_i \equiv B_i(x_1, \dots, y_1, \dots)$ are degree generating functions for certain families of trees. The latter can be calculated from the system of equations
\[ W_i = [z^i] \sum_{k\ge 0} x_{k+1}\bigl(z + B(z)\bigr)^k \quad\text{and}\quad B_i = [z^i] \sum_{k\ge 0} y_{k+1}\Bigl(\frac{1}{z} + W(z)\Bigr)^k, \]
where $W(z) := \sum_{i\ge 0} W_i z^i$ and $B(z) := \sum_{i\le 1} B_i z^i$, and where $[z^i]S(z)$ denotes the coefficient of $z^i$ in the series S(z).
The point is that the above definition of the auxiliary generating functions $W_i$ and $B_i$ implies that, once one imposes certain additional restrictions (for example, restricting oneself to maps in which the vertices have bounded degree), one easily finds polynomial equations for the $W_i$’s and $B_i$’s. (As a consequence, the generating function in (7.1) is algebraic in these cases, which gives the desired explanation for the phenomena observed by physicists mentioned earlier. Not only that, all the terms in (7.1) have a precise combinatorial meaning as generating functions of trees.) Furthermore, suitable specializations, respectively limits, produce the afore-mentioned Ising and hard particle generating functions. To illustrate the power of the theorem, here are two corollaries obtained in this manner. (The first one was originally obtained, in an equivalent form, by Bouttier, Di Francesco and Guitter [9], while the second was originally obtained, again in equivalent form, by Boulatov and Kazakov [7].)

Corollary 7.2. The hard particle generating function for tetravalent maps (these are maps in which all the vertices have degree 4; see Figure 7.a for an example of such a map) rooted at an edge whose ends are vacant (i.e., the vertices which the root edge connects are not occupied by a particle) is
\[ xP^3 + \frac{xP^2(3-2P)}{1-9xyP^2} - \frac{27x^3yP^6}{(1-9xyP^2)^3}, \]
where $P \equiv P(x, y)$ is the power series defined by
\[ P = 1 + 3xyP^3 + \frac{3xP^2}{(1-9xyP^2)^2}. \]
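A quick consistency check of Corollary 7.2 is possible in the particle-free case y = 0. Reading the formula as $xP^3 + xP^2(3-2P)/(1-9xyP^2) - 27x^3yP^6/(1-9xyP^2)^3$ with $P = 1 + 3xyP^3 + 3xP^2/(1-9xyP^2)^2$, the specialization y = 0 gives $xP^2(3-P)$ with $P = 1 + 3xP^2$, and this should count all rooted tetravalent planar maps by their number of vertices. The sketch below expands the series and compares it with the classical closed formula $2\cdot 3^n (2n)!/(n!\,(n+2)!)$; the function names are illustrative, and the reading of the formula is an assumption as just stated.

```python
from math import factorial


def multiply(a, b, order):
    """Product of two power series given as coefficient lists, truncated at x^order."""
    result = [0] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= order:
                result[i + j] += ai * bj
    return result


def vacant_series(order):
    """Coefficients of x * P^2 * (3 - P) up to x^order, where P = 1 + 3 x P^2."""
    P = [1] + [0] * order
    for _ in range(order):                      # each pass fixes one more coefficient
        square = multiply(P, P, order)
        P = [1] + [3 * c for c in square[:order]]
    square = multiply(P, P, order)
    three_minus_P = [3 - P[0]] + [-c for c in P[1:]]
    product = multiply(square, three_minus_P, order)
    return [0] + product[:order]                # multiplication by x


def rooted_tetravalent_maps(n):
    """Closed formula for the number of rooted tetravalent planar maps with n vertices."""
    return 2 * 3 ** n * factorial(2 * n) // (factorial(n) * factorial(n + 2))


print(vacant_series(5))
print([rooted_tetravalent_maps(n) for n in range(1, 6)])
```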
Corollary 7.3. The Ising generating function for bicolored tetravalent maps rooted at a white vertex is
\[ I(X,Y,u) = \frac{1}{9}\Bigl(135x^2y^2P^6 + 72xy^2P^5 - 3y(15x + 8y - 3v^2y + 36xy)P^4 - y(32 + 3v^2 - 36x - 36y)P^3 - (3 + 2v^2 - 72y)P^2 + 12(1-3y)P - 9\Bigr) + \frac{(1 - P - yP^2)(1 + 3yP)(12 - 8P + 3v^2P)}{9(1 + 3xP)}, \]
where $P \equiv P(x, y, v)$ is defined by
\[ P = 1 + 3xyP^3 + v^2\, \frac{P(1 + 3xP)(1 + 3yP)}{(1 - 9xyP^2)^2}, \]
and where $x = X(u - \frac{1}{u})^2$, $y = Y(u - \frac{1}{u})^2$ and $v = \frac{1}{u}$.

The proof of Theorem 7.1 relies, first of all, on a (trivial) decomposition of maps which results from cutting the root edge: either the map gets disconnected when we cut the root edge (see the left half of Figure 8), in which case we end up with two maps, each with a pending (half-)edge, or the map stays connected (see the right half of Figure 8), in which case we end up with one map which contains two pending (half-)edges.
Figure 8. Cutting the root edge

The heart of the proof is then a bijection between maps with one or two pending half-edges and certain trees, called blossom trees. These are rooted trees with white and black vertices, in which neighboring vertices must have different colors, in which black vertices may have buds (indicated by arrows) and white vertices may have leaves (indicated by half-edges). See Figure 9 for an example.

Figure 9. A blossom tree
Figure 10. The closure of a blossom tree

Given a blossom tree, one can construct a map by the following closure procedure. Starting from the root, we go around the tree, and, whenever we encounter a bud immediately followed by a leaf, we connect the two by an edge. See Figure 10 for an example. It can be shown that, under certain additional (technical) conditions, this defines a bijection. If one puts the pieces together appropriately, one finally arrives at the assertion of Theorem 7.1.

References

[1] D. Aldous, The ζ(2) limit in the random assignment problem, Random Structures Algorithms 18 (2001), 381–418.
[2] S.E. Alm and G.B. Sorkin, Exact expectations and distributions in the random assignment problem, Combin. Probab. Comput. 11 (2002), 217–248.
[3] C. Athanasiadis, Ehrhart polynomials, simplicial polytopes, magic squares and a conjecture of Stanley, preprint, 2003; math.CO/0312031.
[4] A. Beilinson and J. Bernstein, Localisation de g-modules, C. R. Acad. Sci. Paris 292 (1981), 15–18.
[5] D. Bessis, C. Itzykson and J.B. Zuber, Quantum field theory techniques in graphical enumeration, Adv. Appl. Math. 1(2) (1980), 109–157.
[6] S. Billey and V. Lakshmibai, Singular loci of Schubert varieties, Progress in Math. 182, Birkhäuser, Boston, MA, 2000.
[7] D.V. Boulatov and V.A. Kazakov, The Ising model on a random planar lattice: the structure of the phase transition and the exact critical exponents, Phys. Lett. B 186(3-4) (1987), 379–384.
[8] M. Bousquet-Mélou and G. Schaeffer, The degree distribution in bipartite planar maps: applications to the Ising model, preprint, 2003; math.CO/0211070.
[9] J. Bouttier, P. Di Francesco and E. Guitter, Critical and tricritical hard objects on bicolourable random lattices: exact solutions, J. Phys. A 35(17) (2002), 3821–3854.
[10] J. Bouttier, P. Di Francesco and E. Guitter, Census of planar maps: from the one-matrix model solution to a combinatorial proof, Nucl. Phys. B645 (2002), 477–499.
[11] P. Brändén, Sign-graded posets, unimodality of W-polynomials and the Charney–Davis Conjecture, preprint, 2004; math.CO/0406019.
[12] P. Brändén, Counterexamples to the Neggers–Stanley conjecture, preprint, 2004; math.CO/0408312.
[13] F. Brenti, Log-concave and unimodal sequences in algebra, combinatorics, and geometry: an update, Contemp. Math. 178 (1994), 71–89.
[14] F. Brenti, Kazhdan-Lusztig polynomials: History, Problems, and Combinatorial Invariance, Séminaire Lotharingien Combin. 46 (2003), Article B46c, 30 pp.
[15] F. Brenti, The intersection cohomology of Schubert varieties is a combinatorial invariant, Europ. J. Combin. 25 (2004), 1151–1167.
[16] F. Brenti, F. Caselli and M. Marietti, Special matchings and Kazhdan-Lusztig polynomials for doubly laced Coxeter systems, preprint, 2003.
[17] E. Brézin, C. Itzykson, G. Parisi and J.B. Zuber, Planar diagrams, Comm. Math. Phys. 59(1) (1978), 35–51.
[18] W.G. Brown and W.T. Tutte, On the enumeration of rooted non-separable planar maps, Canad. J. Math. 16 (1964), 572–577.
[19] J.-L. Brylinski and M. Kashiwara, Kazhdan-Lusztig conjecture and holonomic system, Invent. Math. 64 (1981), 387–410.
[20] M.W. Buck, C.S. Chan and D.P. Robbins, On the expected value of the minimum assignment, Random Structures Algorithms 21 (2002), 33–58.
[21] R. Charney and M. Davis, Euler characteristic of a nonpositively curved, piecewise Euclidean manifold, Pacific J. Math. 171 (1995), 117–137.
[22] F. du Cloux, Rigidity of Schubert closures and invariance of Kazhdan-Lusztig polynomials, Adv. in Math. 180 (2003), 146–175.
[23] D. Coppersmith and G.B. Sorkin, Constructive bounds and exact expectations for the random assignment problem, Random Structures Algorithms 15 (1999), 133–144.
[24] E. Delanoy, Invariance combinatoire des polynômes de Kazhdan–Lusztig sur les intervalles partant de l’origine, preprint, 2003; math.CO/0312237.
[25] P. Di Francesco, 2D quantum gravity, matrix models and graph combinatorics, preprint, 2004; math-ph/0406013.
[26] M.J. Dyer, On the “Bruhat graph” of a Coxeter system, Compositio Math. 78 (1991), 185–191.
[27] I. Frenkel, M. Khovanov and A. Kirillov, Kazhdan-Lusztig polynomials and canonical basis, Transform. Groups 3 (1998), 321–336.
[28] W. Fulton, Young tableaux, Cambridge University Press, Cambridge, 1997.
[29] W. Fulton and J. Harris, Representation Theory, Springer–Verlag, New York, 1991.
[30] M. Haiman, Hecke algebra characters and immanent conjectures, J. Amer. Math. Soc. 6 (1993), 569–595.
[31] J.E. Humphreys, Reflection groups and Coxeter groups, Cambridge University Press, Cambridge, 1990.
[32] D. Kazhdan and G. Lusztig, Representations of Coxeter groups and Hecke algebras, Invent. Math. 53 (1979), 165–184.
[33] D. Kazhdan and G. Lusztig, Schubert varieties and Poincaré duality, Geometry of the Laplace operator, Proc. Sympos. Pure Math. 34, Amer. Math. Soc., Providence, RI, 1980, pp. 185–203.
[34] V. Lakshmibai and B. Sandhya, Criterion for smoothness of Schubert varieties in Sl(n)/B, Proc. Indian Acad. Sci. Math. Sci. 100 (1990), 45–52.
[35] S. Linusson and J. Wästlund, A proof of Parisi’s conjecture on the random assignment problem, Probab. Theory Related Fields 128 (2004), 419–440.
[36] S. Linusson and J. Wästlund, A proof of a conjecture of Buck, Chan and Robbins on the random assignment problem, preprint; math.CO/0307357.
[37] M. Mézard and G. Parisi, Replicas and optimization, J. Phys. Lett. 46 (1985), 771–778.
[38] C. Nair, B. Prabhakar and M. Sharma, A proof of the conjecture due to Parisi for the finite random assignment problem, preprint, 2003; available at http://www.stanford.edu/~balaji/rap.html.
[39] G. Parisi, A conjecture on random bipartite matching, manuscript, 1998; cond-mat/9801176.
[40] V. Reiner and V. Welker, On the Charney-Davis and Neggers-Stanley conjectures, preprint, 2002; available from http://www.mathematik.uni-marburg.de/~welker.
[41] B.E. Sagan, The symmetric group, 2nd edition, Springer–Verlag, New York, 2001.
[42] G. Schaeffer, Bijective census and random generation of Eulerian planar maps with prescribed vertex degrees, Electron. J. Combin. 4(1) (1997), #R20, 14 pp.
[43] G. Schaeffer, Conjugaison d’arbres et cartes combinatoires aléatoires, Ph.D. thesis, Université Bordeaux 1, France, 1998.
[44] R.P. Stanley, Ordered structures and partitions, Mem. Amer. Math. Soc. No. 119, Amer. Math. Soc., Providence, R.I., 1972.
[45] R.P. Stanley, The number of faces of a simplicial convex polytope, Adv. in Math. 35 (1980), 236–238.
[46] R.P. Stanley, Log-concave and unimodal sequences in algebra, combinatorics, and geometry, Ann. New York Acad. Sci. 576 (1989), 500–534.
[47] R.P. Stanley, Combinatorics and commutative algebra, 2nd edition, Birkhäuser, Basel, 1996.
[48] W.T. Tutte, A census of planar maps, Canad. J. Math. 15 (1963), 249–271.
[49] W.T. Tutte, On the enumeration of planar maps, Bull. Amer. Math. Soc. 74 (1968), 64–74.
[50] D. Uglov, Canonical bases of higher level q-deformed Fock spaces and Kazhdan-Lusztig polynomials, Physical Combinatorics, Progress in Math. 191, Birkhäuser, Boston, MA, 2000, 249–299.

C. Krattenthaler
Institut Girard Desargues, Université Claude Bernard Lyon-I
21, avenue Claude Bernard, F-69622 Villeurbanne Cedex, France
e-mail: [email protected]
URL: http://igd.univ-lyon1.fr/~kratt
4ECM Stockholm 2004 © 2005 European Mathematical Society
Algebras with Involution and Adjoint Groups

Marina Monsurrò
Introduction

The aim of this paper is to emphasize the deep relations between the study of linear algebraic groups and the theory of central simple algebras endowed with an involution. Historically, those relations can be traced back to two main sources. In 1960, Weil [15] proved that every semisimple linear algebraic group of adjoint type (excluding the exceptional groups and the groups of trialitarian type $D_4$) can be realized as the connected component of the identity in the automorphism group of a separable algebra with involution, and that the converse is true (excluding the factors of degree 2 with involution of orthogonal type). The algebraic groups arising from algebras with involution are called classical. Moreover, investigating the representations of a semisimple linear algebraic group G, Tits proved in [12] that some irreducible representations over a separable closure descend to representations of G in the group of invertible elements of some uniquely determined central simple algebra (carrying, in certain cases, a canonical involution).

These results have been known and used for more than twenty years but, in the last few years, investigations from different sources have unexpectedly converged to renew interest in this topic. One reason for this convergence is that the theory of linear algebraic groups provides the proper level of generality to clarify the relationship between various results on central simple algebras and quadratic forms. Index reduction formulas are a case in point. Further impetus for this study was given by the fact that the language of algebras with involution, being more explicit and straightforward, can be successfully applied to investigate some “mysterious” properties of algebraic groups; namely, the rationality problem for some of them can be treated in this way, and cohomological invariants of high degree (such as the Rost invariant) can be written in a very explicit way.
In the first part of this note, we sketch the correspondence between linear algebraic groups and algebras with involution, via the Dynkin diagram, and we recall some basic properties of those objects. At the end of this note, we give an example of the fruitful interaction of these two theories by showing some results on stable rationality.
1. Algebraic groups and Dynkin diagrams

We begin by recalling some facts about algebraic groups; in particular, we outline the main steps of the construction of the associated Dynkin diagram.

1.1. Basic definitions. Let G be an algebraic group defined over a field F and let $F_{alg}$ denote an algebraic closure of F. We say that G is solvable if the group of $F_{alg}$-rational points $G(F_{alg})$ is solvable. We say that G is semisimple if it is not trivial, connected, and if $G_{F_{alg}}$ has no nontrivial normal connected solvable subgroups. A subtorus T of G is called maximal if it is not contained in any strictly larger subtorus. Recall that maximal subtori remain maximal under base extension and that they are conjugate over $F_{alg}$ by an element of $G(F_{alg})$. A semisimple algebraic group is called split if it contains a maximal subtorus which is split. Note that over a separably closed field every semisimple group is split. We say that a semisimple split algebraic group G is simple if $G_{F_{alg}}$ contains no nontrivial proper normal connected subgroups.

1.2. Root systems. Let G be a semisimple split algebraic group, T ⊂ G a maximal split subtorus and $T^*$ the abelian group of characters of T. We consider the adjoint representation ad : G → GL(Lie(G)) which maps every g ∈ G to ad(g). For every character $\alpha \in T^*$, we set
\[ V_\alpha := \{ v \in \mathrm{Lie}(G) \mid t\cdot v = \alpha(t)v \ \ \forall t \in T \}. \]
The weights of the representation ad are the characters $\alpha \in T^*$ such that $V_\alpha \neq \{0\}$, and the nontrivial weights are called roots. Since T is diagonalisable, a classical argument from representation theory applied to $\mathrm{ad}|_T$ yields the decomposition
\[ \mathrm{Lie}(G) = \bigoplus_{\alpha} V_\alpha, \]
where the sum is taken over the weights $\alpha \in T^*$ of the representation $\mathrm{ad}|_T$; moreover, $\dim(V_\alpha) = 1$ for every root α. We denote by Φ(G) the set of roots.
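One can easily show that
(1) $0 \notin \Phi(G)$ and Φ(G) generates $T^* \otimes_{\mathbb Z} \mathbb R$ over $\mathbb R$;
(2) if α ∈ Φ(G) and xα ∈ Φ(G) for some $x \in \mathbb R$, then x = ±1;
(3) for all α ∈ Φ(G), there exists a reflection $s_\alpha$ with respect to α such that $s_\alpha(\Phi(G)) = \Phi(G)$;
(4) for all α and β ∈ Φ(G), we have $s_\alpha(\beta) - \beta = u\alpha$ with $u \in \mathbb Z$;
in other words, Φ(G) is a root system in $T^* \otimes_{\mathbb Z} \mathbb R$ (cf. [5]). The reflection $s_\alpha$ in (3) is uniquely determined by α. For α ∈ Φ(G), we define $\alpha^* \in (T^* \otimes_{\mathbb Z} \mathbb R)^*$ by
\[ s_\alpha(v) = v - \langle \alpha^*, v\rangle\, \alpha \qquad \forall v \in T^* \otimes_{\mathbb Z} \mathbb R. \]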
Such $\alpha^*$ are called coroots. One can easily prove that $\langle \alpha^*, \beta\rangle$ is an integer for all β ∈ Φ(G) and that $\langle \alpha^*, \alpha\rangle = 2$.

Now, we denote by Π(G) the set of simple roots in Φ(G) (cf. [5]), by $\Lambda_r$ the root lattice in $T^*$ (the additive subgroup of $T^* \otimes_{\mathbb Z} \mathbb R$ generated by all roots α ∈ Φ(G)), and by Λ the dual lattice defined by
\[ \Lambda = \{ v \in T^* \otimes_{\mathbb Z} \mathbb R \mid \langle \alpha^*, v\rangle \in \mathbb Z \text{ for } \alpha \in \Phi(G) \}. \]
In general, we have the inclusions $\Lambda_r \subset T^* \subset \Lambda$, and we say that the split semisimple group G is of adjoint type (or simply adjoint) if $\Lambda_r = T^*$ and, respectively, simply connected if $\Lambda = T^*$.

We recall, and we will see it in detail in a special case, that we can associate to the root system Φ(G) a graph having Π(G) as its set of vertices; two vertices α and β are connected by $\langle \alpha^*, \beta\rangle \cdot \langle \beta^*, \alpha\rangle$ edges. If $\langle \alpha^*, \beta\rangle > \langle \beta^*, \alpha\rangle$, then all edges between α and β are directed, with α the origin and β the target. This graph is called the Dynkin diagram of Φ(G) and it does not depend on the choice of Π(G). Two root systems are isomorphic if and only if their Dynkin diagrams are isomorphic, and the graph is connected if and only if the root system is irreducible. There are four infinite families $A_n$, $B_n$, $C_n$, $D_n$ of irreducible root systems and five exceptional ones, denoted by $E_6$, $E_7$, $E_8$, $F_4$ and $G_2$. They are classified by the following graphs:

Figure 1. The Dynkin diagrams (of types $A_n$, $B_n$, $C_n$, $D_n$, $E_6$, $E_7$, $E_8$, $F_4$ and $G_2$)

We refer to Bourbaki [3] for proofs and details. The following result tells us how to construct a graph, the Dynkin diagram, associated to any split semisimple algebraic group.

Proposition 1.1. A split semisimple algebraic group G is simple if and only if the associated root system Φ(G) is irreducible.

Recall that a split semisimple group G of adjoint type can be written as the product of uniquely determined simple subgroups $G_i$ and, therefore, the associated root system Φ(G) is given by $\Phi(G) = \bigsqcup_i \Phi(G_i)$.
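The rule for the number of edges can be tried out on rank two examples. The sketch below assumes that the root system is realized in a Euclidean space, so that the pairing with a coroot is given by $\langle \alpha^*, v\rangle = 2(\alpha, v)/(\alpha, \alpha)$ for the standard inner product; this realization, the chosen coordinates of the simple roots and the function names are illustrative assumptions and do not come from the text.

```python
from fractions import Fraction


def pairing(alpha, beta):
    """<alpha*, beta> = 2 (alpha, beta) / (alpha, alpha) for the standard inner product."""
    dot = sum(a * b for a, b in zip(alpha, beta))
    norm = sum(a * a for a in alpha)
    return Fraction(2 * dot, norm)


def edge_count(alpha, beta):
    """Number of edges joining two distinct simple roots in the Dynkin diagram."""
    return pairing(alpha, beta) * pairing(beta, alpha)


# One common choice of simple roots for the rank two types A2, B2 and G2.
simple_roots = {
    "A2": [(1, -1, 0), (0, 1, -1)],
    "B2": [(1, -1), (0, 1)],
    "G2": [(1, -1, 0), (-2, 1, 1)],
}
for name, (alpha, beta) in simple_roots.items():
    print(name, edge_count(alpha, beta))
```

The three cases give a single, a double and a triple edge, respectively, which are exactly the bonds occurring in the diagrams of Figure 1.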
2. Algebras with involutions and automorphisms

In this section we collect known facts from the theory of hermitian forms and algebras with involutions. The main references are [6] and [10].

2.1. Involutions. Let F be a field of characteristic different from 2, and let K be either the field F, or a quadratic étale extension of F (not necessarily a field). Suppose that A is an Azumaya algebra over K (if K is a field, then A is a central simple algebra over K) endowed with an involution (antiautomorphism of order two) σ such that F coincides with the subfield of σ-invariant elements in K. The involution σ is said to be of the first kind if it fixes K elementwise (i.e., if K = F), and of the second kind otherwise.

We can think of involutions of the first kind (K = F) on a central simple F-algebra A as twisted forms of nonsingular symmetric or skew-symmetric bilinear forms up to scalar factors. Let V be a finite-dimensional vector space over the separable closure $F_s$ of F and let $b : V \times V \to F_s$ be a nonsingular symmetric or skew-symmetric bilinear form. There exists a unique involution $\sigma_b$ on $\mathrm{End}_{F_s}(V)$ defined by
\[ b(f(v), w) = b(v, \sigma_b(f)(w)) \qquad \forall\, v, w \in V,\ f \in \mathrm{End}_{F_s}(V). \]
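In coordinates, the adjoint involution has a simple description: if b(v, w) = vᵀBw for a Gram matrix B, then the defining relation forces $\sigma_b(f) = B^{-1} f^{T} B$. The following sketch checks, for 2 × 2 matrices with rational entries, that this formula satisfies the defining relation, reverses products and has order two. The restriction to 2 × 2 matrices, the helper names and the sample matrices are assumptions made only for illustration.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def inverse_2x2(a):
    det = Fraction(a[0][0] * a[1][1] - a[0][1] * a[1][0])
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def adjoint_involution(gram, f):
    """sigma_b(f) = B^{-1} f^T B, where B is the Gram matrix of the form b."""
    return matmul(matmul(inverse_2x2(gram), transpose(f)), gram)


B = [[1, 0], [0, -3]]            # a nonsingular symmetric form (hypothetical example)
f = [[1, 2], [3, 4]]
g = [[0, 1], [5, 2]]
sigma_f = adjoint_involution(B, f)
print(sigma_f)
print(adjoint_involution(B, sigma_f) == f)                       # order two
print(adjoint_involution(B, matmul(f, g))
      == matmul(adjoint_involution(B, g), sigma_f))              # reverses products
```

The same formula works for a skew-symmetric Gram matrix, since only the relation Bᵀ = ±B is used.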
We call $\sigma_b$ the adjoint involution with respect to the bilinear form b. After a scalar extension to $F_s$, the algebra A splits as $A \otimes_F F_s \simeq \mathrm{End}_{F_s}(V)$ for some vector space V over $F_s$, and every involution of the first kind is adjoint to a nonsingular symmetric or skew-symmetric bilinear form.

In the case where the involution is not trivial on the center K, the algebra structure must be considered over the subfield F of invariant elements (over which K clearly has dimension 2), so that the involution behaves well under base extension. Since $K \otimes_F F_s \simeq F_s \times F_s$, the algebra A decomposes, after scalar extension to $F_s$, as the direct product of two factors permuted by the involution (and hence anti-isomorphic to each other). In order to obtain a notion which is stable under scalar extension, with a slight abuse of terminology, we thus again call (A, σ) a central simple algebra with an involution of the second kind over F. In this way, we allow A to be of the form $B \times B^{op}$ for some central simple F-algebra B, with σ the switch involution $(x, y^{op}) \mapsto (y, x^{op})$.

Summarizing, the involutions on a central simple algebra A over a field F can be divided into the following categories:

Definition 2.1. An involution σ on a central simple F-algebra A is said to be
• of orthogonal type if $(A, \sigma) \otimes_F F_s \simeq (\mathrm{End}_{F_s}(V), \sigma_b)$ for some finite-dimensional vector space V over $F_s$ and some nonsingular symmetric bilinear form b;
• of symplectic type if $(A, \sigma) \otimes_F F_s \simeq (\mathrm{End}_{F_s}(V), \sigma_b)$ for some finite-dimensional vector space V over $F_s$ and some nonsingular skew-symmetric bilinear form b;
• of unitary type if $(A, \sigma) \otimes_F F_s \simeq (\mathrm{End}_{F_s}(V) \times \mathrm{End}_{F_s}(V)^{op}, \varepsilon)$ for some finite-dimensional vector space V over $F_s$, where ε is the switch involution.

We call the dimension of the vector space V the degree of A. We say that (A, σ) is split if it is isomorphic to $(\mathrm{End}_F(V), \sigma_b)$ for some F-vector space V and some bilinear form b; more generally, a splitting field for (A, σ) is a field extension L of F such that $(A, \sigma) \otimes_F L$ is split. By a theorem of Wedderburn, for every finite-dimensional simple F-algebra A there exists a splitting field L such that [L : F] is finite.

2.2. Automorphisms. Let (A, σ) be an algebra with involution (of any kind) over a field F. As announced in the introduction, we are mainly interested in its group of automorphisms.

Definition 2.2. An automorphism of (A, σ) is an F-algebra automorphism which commutes with σ. We denote the group of automorphisms of (A, σ) by Aut(A, σ).

Recall that, by the Skolem-Noether theorem, any algebra automorphism of a central simple F-algebra A is inner (cf. [10], Theorem 8.4.2). So, if we take an element $a \in A^\times$, the condition $\sigma \circ \mathrm{Int}(a) = \mathrm{Int}(a) \circ \sigma$ is equivalent to σ(a)a ∈ K, where K denotes the center of A. Thus Int(a) belongs to Aut(A, σ) if and only if the product σ(a)a lies in the center.

In order to get a better understanding of the structure of these automorphism groups, we first consider the simplest situation, namely the split orthogonal case. If (A, σ) is of the form $(\mathrm{End}_F(V), \sigma_b)$ for some nonsingular symmetric bilinear form b defined over an F-vector space V, easy calculations show that Int(a) belongs to Aut(A, σ) if and only if b(a(v), a(w)) = λ b(v, w) for all v and w in V, where $\lambda = \sigma(a)a \in F^\times$. This means that Int(a) is an automorphism of (A, σ) if and only if a is a similitude of the bilinear form b with multiplier λ.
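The similitude condition in the split orthogonal case is easy to test in coordinates: with Gram matrix B, the relation b(a(v), a(w)) = λ b(v, w) for all v, w amounts to the matrix identity aᵀBa = λB. The sketch below returns the multiplier when such a λ exists. The helper names and the sample matrices are illustrative assumptions, and integer entries are assumed for simplicity.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def multiplier(gram, a):
    """Return lam with a^T B a = lam * B and lam != 0, or None if a is not a similitude."""
    n = len(gram)
    lhs = matmul(matmul(transpose(a), gram), a)
    # B is nonsingular, so it has a nonzero entry from which lam can be read off.
    i, j = next((i, j) for i in range(n) for j in range(n) if gram[i][j] != 0)
    lam = Fraction(lhs[i][j], gram[i][j])
    consistent = all(lhs[r][c] == lam * gram[r][c] for r in range(n) for c in range(n))
    return lam if consistent and lam != 0 else None


identity_form = [[1, 0], [0, 1]]
print(multiplier(identity_form, [[1, -1], [1, 1]]))   # a rotation composed with a scaling
print(multiplier(identity_form, [[1, 1], [0, 1]]))    # a shear is not a similitude
```

Similitudes with multiplier 1 are exactly the matrices preserving b, that is, the elements of the orthogonal group of (V, b).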
In the general situation, the group of similitudes of an algebra with involution $(A, \sigma)$ with center $K$ is defined, by analogy with the split orthogonal case, by
$$\mathrm{Sim}(A, \sigma) = \{a \in A^\times \mid \sigma(a)a \in K\}.$$
Thus, the map $a \mapsto \mathrm{Int}(a)$ induces an isomorphism between $\mathrm{Sim}(A, \sigma)/K^\times$ and the group of inner automorphisms of the algebra with involution $(A, \sigma)$. For any similitude $a$ in $\mathrm{Sim}(A, \sigma)$, the element $\sigma(a)a \in K$ is called the multiplier
of $a$. Keeping the analogy with the split orthogonal case, the similitudes with multiplier equal to 1 are called isometries and we set
$$\mathrm{Iso}(A, \sigma) = \{a \in A^\times \mid \sigma(a)a = 1\}.$$
We moreover define $\mathrm{Aut}_K(A, \sigma)$ as the subgroup of $\mathrm{Aut}(A, \sigma)$ consisting of $K$-algebra automorphisms.

Theorem 2.3 ([6], 12.15). With the above notation, we have
$$\mathrm{Aut}_K(A, \sigma) = \{\mathrm{Int}(a) \mid a \in \mathrm{Sim}(A, \sigma)\}.$$
Therefore, there is an exact sequence
$$1 \to K^\times \to \mathrm{Sim}(A, \sigma) \xrightarrow{\ \mathrm{Int}\ } \mathrm{Aut}_K(A, \sigma) \to 1.$$
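The criterion of §2.2 (Int(a) commutes with σ exactly when σ(a)a is central) can be sanity-checked numerically in the split orthogonal case. The sketch below is only an illustration under stated assumptions: it takes V = Q^2 with the hyperbolic form whose Gram matrix is [[0, 1], [1, 0]], uses σ_b(x) = G^{-1} x^T G, and all helper names (`sigma`, `conj`, `commutes_with_sigma`, ...) are hypothetical, not taken from the paper.

```python
from fractions import Fraction
from itertools import product

# Assumption for illustration: V = Q^2 with the hyperbolic symmetric form b,
# given by the Gram matrix GRAM, so that b(v, w) = v^T GRAM w.
GRAM = [[0, 1], [1, 0]]

def mul(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv(p):
    """Inverse of an invertible 2x2 matrix, with exact rational entries."""
    (a, b), (c, d) = p
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def transpose(p):
    return [[p[j][i] for j in range(2)] for i in range(2)]

def sigma(x):
    """Adjoint involution of b: sigma(x) = GRAM^{-1} x^T GRAM."""
    return mul(inv(GRAM), mul(transpose(x), GRAM))

def conj(a, x):
    """Int(a)(x) = a x a^{-1}."""
    return mul(a, mul(x, inv(a)))

def is_scalar(m):
    return m[0][1] == 0 and m[1][0] == 0 and m[0][0] == m[1][1]

def commutes_with_sigma(a):
    """Check sigma o Int(a) = Int(a) o sigma on the basis of matrix units."""
    for i, j in product(range(2), repeat=2):
        e = [[0, 0], [0, 0]]
        e[i][j] = 1
        if sigma(conj(a, e)) != conj(a, sigma(e)):
            return False
    return True

similitude = [[2, 0], [0, 3]]      # sigma(a) a = 6 * identity
not_similitude = [[1, 1], [0, 1]]  # sigma(a) a is not scalar
```

Here `similitude` has multiplier 6 and `Int` of it commutes with `sigma`, while `not_similitude` fails both conditions, in line with the equivalence stated in the text.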
In order to obtain a more detailed view of the situation, we have to consider the three different types separately.

2.3. The orthogonal case.
Let $A$ be a central simple algebra over $F$ endowed with an orthogonal involution $\sigma$. We recall that orthogonal involutions are a special case of involutions of the first kind, so that the center $F$ is also the field fixed (elementwise) by the involution. As observed above, in the split case the algebra with involution $(A, \sigma)$ is of the form $(\mathrm{End}_F(V), \sigma_b)$, where $b$ is a nonsingular symmetric bilinear form. In this situation, one easily sees that $\mathrm{Iso}(\mathrm{End}_F(V), \sigma_b)$ coincides with the orthogonal group $\mathrm{O}(V, b)$ of the space $(V, b)$.

We have seen that orthogonal involutions are adjoint to symmetric bilinear forms; however, in order to define orthogonal groups, the point of view of quadratic forms seems more appropriate. In this paper we consider fields of characteristic different from two, so that these two notions are equivalent. In characteristic 2, on the other hand, more refined structures are required; two slightly different solutions are proposed in [11] and in [6], which allow the whole theory to be developed in the most general case.

By analogy with the split case, we define the following objects:
$$\mathrm{O}(A, \sigma) := \mathrm{Iso}(A, \sigma), \qquad \mathrm{GO}(A, \sigma) := \mathrm{Sim}(A, \sigma),$$
$$\mathrm{PGO}(A, \sigma) := \mathrm{GO}(A, \sigma)/F^\times, \qquad \mathrm{O}^+(A, \sigma) := \{a \in \mathrm{O}(A, \sigma) \mid \mathrm{Nrd}_A(a) = 1\},$$
where $\mathrm{Nrd}_A : A^\times \to F^\times$ is the reduced norm map. Since every algebra $(A, \sigma)$ with orthogonal involution becomes split after scalar extension, the group $\mathrm{O}(A, \sigma)$ is a form of an orthogonal group. As pointed out before, by the Skolem-Noether theorem the map $a \mapsto \mathrm{Int}(a)$ induces an isomorphism
$$\mathrm{PGO}(A, \sigma) \xrightarrow{\ \sim\ } \mathrm{Aut}(A, \sigma).$$
If the degree of $A$ is odd, one can show that $(A, \sigma)$ is always split and that the multiplier of every similitude of $(A, \sigma)$ is a square in $F$. This yields
the isomorphisms (cf. [6], Proposition 12.4)
$$\mathrm{GO}(A, \sigma) \simeq \mathrm{O}^+(A, \sigma) \times F^\times \quad\text{and}\quad \mathrm{PGO}(A, \sigma) \simeq \mathrm{O}^+(A, \sigma).$$
Finally, we deduce from Theorem 2.3 that the groups $\mathrm{Aut}(A, \sigma)$ and $\mathrm{O}^+(A, \sigma)$ are isomorphic. In particular, $\mathrm{Aut}(A, \sigma)$ is a connected and absolutely simple adjoint group and one easily checks that this group is of type $B_n$, where $\deg(A) = 2n + 1$ (see §3). The corresponding simply connected group is the spinor group $\mathrm{Spin}(A, \sigma)$.

If the degree of $A$ is even, then the groups $\mathrm{GO}(A, \sigma)$ and $\mathrm{PGO}(A, \sigma)$ are not connected. The connected component of the identity in $\mathrm{GO}(A, \sigma)$ is
$$\mathrm{GO}^+(A, \sigma) = \{a \in \mathrm{GO}(A, \sigma) \mid \mathrm{Nrd}_A(a) = (\sigma(a)a)^{\deg(A)/2}\}.$$
Therefore, the connected component of the identity in $\mathrm{Aut}(A, \sigma) = \mathrm{PGO}(A, \sigma)$ is $\mathrm{PGO}^+(A, \sigma) = \mathrm{GO}^+(A, \sigma)/F^\times$. This last group is adjoint, semisimple, of type $D_n$ if $\deg(A) = 2n \ge 4$; it is absolutely simple if $n > 3$. The corresponding simply connected group is the spinor group $\mathrm{Spin}(A, \sigma)$.

2.4. The symplectic case.
Let $A$ be a central simple algebra over $F$ endowed with a symplectic involution $\sigma$. As in the previous section, inspired by the split case, we set
$$\mathrm{Sp}(A, \sigma) := \mathrm{Iso}(A, \sigma), \quad \mathrm{GSp}(A, \sigma) := \mathrm{Sim}(A, \sigma) \quad\text{and}\quad \mathrm{PGSp}(A, \sigma) := \mathrm{GSp}(A, \sigma)/F^\times.$$
By Theorem 2.3, we have an isomorphism
$$\mathrm{PGSp}(A, \sigma) \xrightarrow{\ \sim\ } \mathrm{Aut}(A, \sigma).$$
Note that in the split case, $\mathrm{Sp}(\mathrm{End}_F(V), \sigma_b)$ is the group of isometries of the nonsingular skew-symmetric bilinear space $(V, b)$; since every central simple algebra with symplectic involution splits after scalar extension, it follows that $\mathrm{Sp}(A, \sigma)$ is a form of a symplectic group. We further note that in the symplectic case (split or not), the dimension of the vector space $V$, and hence the degree of the algebra $A$, is always even, say $\deg(A) = 2n$.

The groups $\mathrm{Sp}(A, \sigma)$, $\mathrm{GSp}(A, \sigma)$ and $\mathrm{PGSp}(A, \sigma)$ are connected. Moreover, the group $\mathrm{PGSp}(A, \sigma)$ is absolutely simple, adjoint, of type $C_n$. The corresponding absolutely almost simple and simply connected group is the symplectic group $\mathrm{Sp}(A, \sigma)$.

2.5. The unitary case.
Let $(A, \sigma)$ be an algebra with unitary involution. As above, we denote by $K$ the center of $A$, where $K$ is a quadratic étale extension of the field $F$ fixed by the involution.

As in the previous cases, we first look at a simpler situation; suppose, thus, that $K$ is a field, that $A = \mathrm{End}_K(W)$ for some $K$-vector space $W$ and that $\sigma = \sigma_h$ is the involution adjoint to some nonsingular hermitian form $h$ on
$W$. Easy computations show that in this case the group $\mathrm{Iso}(A, \sigma)$ is the unitary group $\mathrm{U}(W, h)$ of the hermitian space $(W, h)$ (cf. [10], §7.1). In the more general situation, by analogy with this case, we set
$$\mathrm{U}(A, \sigma) := \mathrm{Iso}(A, \sigma), \qquad \mathrm{GU}(A, \sigma) := \mathrm{Sim}(A, \sigma),$$
$$\mathrm{PGU}(A, \sigma) := \mathrm{GU}(A, \sigma)/K^\times \quad\text{and}\quad \mathrm{SU}(A, \sigma) := \{u \in \mathrm{U}(A, \sigma) \mid \mathrm{Nrd}_A(u) = 1\}.$$
Now, a unitary algebra $(A, \sigma)$ may have automorphisms which do not restrict to the identity on $K$; however, one can prove that $\mathrm{Aut}_K(A, \sigma)$ (see the notation of Theorem 2.3 above) coincides with the connected component of the identity in $\mathrm{Aut}(A, \sigma)$. We can then apply Theorem 2.3 and get an isomorphism
$$\mathrm{Aut}(A, \sigma)^0 \xrightarrow{\ \sim\ } \mathrm{PGU}(A, \sigma).$$
If $K = F \times F$, then $A = B \times B^{\mathrm{op}}$ for some central simple $F$-algebra $B$ and we may assume that $\sigma = \varepsilon$ is the switch involution. Clearly
$$\mathrm{U}(B \times B^{\mathrm{op}}, \varepsilon) = \{(u, (u^{-1})^{\mathrm{op}}) \mid u \in B^\times\} \simeq \mathrm{GL}(B),$$
hence $\mathrm{SU}(A, \sigma) \simeq \mathrm{SL}(B)$ and $\mathrm{PGU}(A, \sigma) \simeq \mathrm{PGL}(B)$. Since every central simple algebra with unitary involution becomes isomorphic to $(\mathrm{End}_{F_s}(V) \times \mathrm{End}_{F_s}(V)^{\mathrm{op}}, \varepsilon)$ by extending scalars to a separable closure, the group $\mathrm{SU}(A, \sigma)$ is a form of a special linear group. It is a simply connected, absolutely almost simple group of type $A_{n-1}$ if $\deg(A) = n \ge 2$. The connected component of the identity in $\mathrm{Aut}(A, \sigma)$, which is isomorphic to $\mathrm{PGU}(A, \sigma)$, is the corresponding adjoint simple group of type $A_{n-1}$.

3. An example of explicit construction of the Dynkin diagram

To give an idea of the way the Dynkin diagram of a simple adjoint group is constructed, we consider the easiest case: a central simple algebra with orthogonal involution $(A, \sigma)$ of odd degree $\deg(A) = 2n + 1$. As pointed out above, in this case $(A, \sigma)$ is split, $(A, \sigma) = (\mathrm{End}_F(V), \sigma_b)$, where $\dim(V) = 2n + 1$; we also recall that $\mathrm{Aut}(A, \sigma) \simeq \mathrm{O}^+(A, \sigma)$. Let $\mathcal{B} = (v_0, v_1, \dots, v_{2n})$ be a basis of $V$ such that
$$b(v_0, v_i) = 0 \ \ \forall\, i \ge 1 \quad\text{and}\quad b(v_i, v_j) = \begin{cases} 1 & \text{if } i = j \pm n, \\ 0 & \text{otherwise.} \end{cases}$$
We denote by $q$ the quadratic form associated to the nonsingular symmetric bilinear form $b$. We want to apply the procedure of §1.2 to the group $G = \mathrm{O}^+(V, q) \subset \mathrm{GL}_{2n+1}(F)$. The subgroup $T$ of diagonal matrices
$$t = \mathrm{diag}(1, t_1, t_2, \dots, t_n, t_1^{-1}, \dots, t_n^{-1})$$
is a split maximal subtorus of $G$. We denote by $\chi_i \in T^*$ the characters defined,
for all $i$, by $\chi_i(t) = t_i$. We can identify the group of characters $T^*$ with $\mathbb{Z}^n$ by the isomorphism
$$T^* \xrightarrow{\ \sim\ } \mathbb{Z}^n, \qquad \chi_i \mapsto e_i,$$
where $\{e_i\}$ denotes the canonical basis of $\mathbb{Z}^n$. We finally remark that
$$\mathrm{Lie}(G) = \{x \in \mathrm{End}(V) \mid \mathrm{tr}(x) = 0 \text{ and } b(v, xv) = 0 \ \forall\, v \in V\}.$$
We can easily calculate the weight subspaces in $\mathrm{Lie}(G)$ with respect to $\mathrm{ad}\, T$ and, via the above identification $T^* \leftrightarrow \mathbb{Z}^n$, we obtain
$$\Phi(G) = \{\pm e_i \ \forall\, i\} \cup \{\pm e_i \pm e_j \ \forall\, i > j\}$$
and
$$\Pi(G) = \{\alpha_1 = e_1 - e_2, \ \dots, \ \alpha_{n-1} = e_{n-1} - e_n, \ \alpha_n = e_n\}.$$
If we calculate the $\langle \alpha_i^*, \alpha_j \rangle$, we get:
$$\langle \alpha_i^*, \alpha_{i+1} \rangle = \frac{2(\alpha_i, \alpha_{i+1})}{(\alpha_i, \alpha_i)} = -1 \quad \forall\, i \le n - 1,$$
$$\langle \alpha_{i+1}^*, \alpha_i \rangle = \frac{2(\alpha_{i+1}, \alpha_i)}{(\alpha_{i+1}, \alpha_{i+1})} = -1 \quad \forall\, i \le n - 2,$$
where $(\cdot, \cdot)$ denotes the usual scalar product, and $\langle \alpha_n^*, \alpha_{n-1} \rangle = -2$. The Dynkin diagram obtained in this way is of type $B_n$ (see Figure 1).

4. Stable rationality and R-equivalence

As announced in the introduction, we give in this section an example of the applications of algebras with involution to the study of linear algebraic groups. Namely, we present some (partial) answers to the classical problem of rationality. We first introduce some notation. We recall that two irreducible varieties are birationally equivalent when their function fields are isomorphic.

Definition 4.1. Let $X$ be an irreducible variety over the field $F$. We say that $X$ is stably rational if $X \times \mathbb{A}^n_F$ is rational (i.e., birationally equivalent to $\mathbb{A}^k$) for some integer $n$.

The first example of a simply connected semisimple group that is not stably rational, found by Platonov in [9], was the group $\mathrm{SL}_1(A)$ for a suitable central simple algebra $A$. For adjoint semisimple groups, by contrast, only a few results were known. As was first noticed by Voskresenskii ([13]), the invariant which Platonov used in [9] to show that the group $\mathrm{SL}_1(A)$ is not stably rational is nothing but the group of R-equivalence classes. The notion of R-equivalence was introduced by Manin in [7] and studied for linear algebraic groups in [4] by Colliot-Thélène and Sansuc.
Definition 4.2. Let $G$ be an algebraic group over a field $F$. We define the (normal) subgroup $RG(F)$ of $G(F)$ as the set of elements $x \in G(F)$ such that there exists a rational map $f \in G(F(t))$, $f : \mathbb{A}^1_F \dashrightarrow G$, defined at 0 and 1, with $f(0) = 1$ and $f(1) = x$. We denote the quotient $G(F)/RG(F)$ by $G(F)/R$ and we call it the group of R-equivalence classes of $G$. We say that an algebraic group $G$ over $F$ is R-trivial if $G(E)/R$ is trivial for each field extension $E/F$.

The relation between R-triviality and stable rationality is given by the following

Theorem 4.3 ([4]). If a connected group variety $G$ defined over a field $F$ is stably rational then it is R-trivial.

Thanks to these results, the theory of central simple algebras with involution, and in particular the study of their automorphism groups, allows us to give a fairly complete description of this problem for classical groups.

4.1. The case $A_n$.
As we have seen in §2.5 above, groups of type $A_n$ correspond to central simple algebras with unitary involution; an adjoint group is then of the form $\mathrm{PGU}(A, \sigma)$ where the degree of $A$ is $n + 1$. For even $n$, Voskresenskii and Klyachko proved in [14] that any adjoint simple group of type $A_n$ is rational. Again using algebras with involution, Merkurjev proved in [8] that adjoint groups of type $A_1$ are rational. The first examples of adjoint groups of type $A_n$ that are not R-trivial and, a fortiori, not stably rational were obtained by Berhuy, Tignol and the author in [2] for $n + 1$ (the degree of $A$) a multiple of 8.

4.2. The case $B_n$.
We have seen that groups of type $B_n$ correspond to central simple algebras of degree $2n + 1$ endowed with an orthogonal involution and that those algebras are always split (cf. §2.3 and §3 above). One can easily prove (cf. [8], Lemma 1) that the variety of the group $\mathrm{O}^+(A, \sigma)$ is rational; adjoint groups of type $B_n$ are thus always rational.

4.3. The case $C_n$.
By the correspondence established in §2, the case $C_n$ is obtained for central simple algebras of degree $2n$ endowed with a symplectic involution $\sigma$; adjoint groups of type $C_n$ are thus of the form $\mathrm{PGSp}(A, \sigma)$. Merkurjev proved in [8] that for $n$ odd and for $n = 2$ such groups are rational. In the case where $n$ is a multiple of 4, by contrast, an infinite family of counterexamples is constructed by Berhuy, Tignol and the author in [2]; the construction relies heavily on an explicit expression of the Rost invariant for the symplectic group (introduced by the same authors in [1]) and gives a family of groups of type $C_{4k}$ that are not R-trivial and, a fortiori, not stably rational.

4.4. The case $D_n$.
We first recall that we are not considering the trialitarian forms of $D_4$. Except for this case, adjoint groups of type $D_n$ can be realized as the connected component of the identity in the automorphism group of a central simple algebra of degree $2n$ endowed with an orthogonal involution. The first examples of non R-trivial and, a fortiori, non stably rational adjoint
groups of this type were obtained by Merkurjev in [8] for involutions $\sigma$ having discriminant different from 1. In the case of trivial discriminant an infinite family of non R-trivial groups of type $D_{4k}$ is constructed in [1] by a method analogous to the symplectic case.

References
[1] G. Berhuy, M. Monsurrò, J.-P. Tignol, The discriminant of a symplectic involution, Pacific J. of Math., 209 (2003), 201–218.
[2] G. Berhuy, M. Monsurrò, J.-P. Tignol, Cohomological invariants and R-triviality of adjoint classical groups, Math. Z., to appear. Available on the preprint server Linear Algebraic Groups and Related Structures, http://www.mathematik.uni-bielefeld.de/LAG/.
[3] N. Bourbaki, Éléments de mathématique, Groupes et algèbres de Lie, Chapitres 7 et 8, Hermann, Paris, 1975.
[4] J.-L. Colliot-Thélène, J.-J. Sansuc, La R-équivalence sur les tores, Ann. Scient. Éc. Norm. Sup., 4e série, 10 (1977), 175–230.
[5] J. E. Humphreys, Introduction to Lie algebras and representation theory, second printing, revised, Graduate Texts in Mathematics 9, Springer-Verlag, New York-Berlin, 1978.
[6] M. Knus, A. Merkurjev, M. Rost, J.-P. Tignol, Book of involutions, American Mathematical Society, Colloquium Publications Volume 44 (1998).
[7] Y.I. Manin, Cubic forms, North Holland, Amsterdam, 1974.
[8] A.S. Merkurjev, R-equivalence and rationality problem for semisimple adjoint classical algebraic groups, Pub. Math. IHES, 46 (1996), 189–213.
[9] V.P. Platonov, Algebraic groups and reduced K-theory, Proc. Inter. Congr. Math. Vol. 1, Helsinki 1978, 311–317.
[10] W. Scharlau, Quadratic and hermitian forms, Grundlehren Math. Wiss. 270, Springer-Verlag, Berlin (1985).
[11] J. Tits, Formes quadratiques, groupes orthogonaux et algèbres de Clifford, Invent. Math. 5 (1968), 19–41.
[12] J. Tits, Représentations linéaires irréductibles d'un groupe réductif sur un corps quelconque, J. reine angew. Math. 247 (1971), 196–220.
[13] V.E. Voskresenskii, Algebraic Tori, Nauka, Moscow, 1977, 223.
[14] V.E. Voskresenskii, A.A. Klyachko, Toroidal Fano varieties and root system, Math. USSR Izvestija 24 (1985), 221–244.
[15] A. Weil, Algebras with involutions and the classical groups, J. Ind. Math. Soc. 24 (1960), 589–623.

Marina Monsurrò
EPFL, Lausanne
Constructing Algebraic Varieties via Commutative Algebra

Miles Reid

Abstract. Problems on the existence and moduli of abstract varieties in the classification of varieties can often be studied by embedding the variety $X$ into projective space, preferably in terms of an intrinsically determined ample line bundle $L$ such as the (anti-)canonical class or its submultiples. A comparatively modern twist on this old story is to study the graded coordinate ring
$$R(X, L) = \bigoplus_{n \ge 0} H^0(X, L^{\otimes n}),$$
which in interesting cases is a Gorenstein ring; this makes available theoretical and computational tools from commutative algebra and computer algebra. The varieties of interest are curves, surfaces and 3-folds, and historical results of Enriques, Fano and others are sometimes available to serve as a guide. This has been a prominent area of work within European algebraic geometry in recent decades, and the lecture will present the current state of knowledge, together with some recent examples.
EAGER

EAGER builds on the success of the former EU networks AGE and EuroProj that have run since the late 1980s. EAGER is supported by the programme Improving Human Potential and the Socio-economic Knowledge base of the European Commission, Contract No. HPRN-CT-2000-00099.
(1) North Italy, A. Conte (Torino), overall coordinator
(2) Spain, J.-C. Naranjo (Barcelona)
(3) South Germany and Romania, Fabrizio Catanese (Bayreuth)
(4) North Germany, K. Hulek (Hannover)
(5) France, A. Hirschowitz (Nice)
(6) Scandinavia, K. Ranestad (Oslo)
(7) South Italy, C. Ciliberto (Roma)
(8) Israel, M. Teicher (Bar-Ilan)
(9) Benelux, E. Looijenga (Utrecht)
(10) Poland, J. Wisniewski (Warszawa)
(11) United Kingdom, Portugal and Hungary, Miles Reid (Warwick) → link to independent group “Vector bundles on algebraic curves” (VBAC)
(12) Switzerland, Christian Okonek (Zürich)
(13) Program Management node, W. Decker
EAGER objectives
(1) Classification of algebraic varieties
(2) Homological and categorical methods
(3) Moduli stacks of curves
(4) Moduli of vector bundles
(5) Abelian varieties and their moduli
(6) Hodge theory and algebraic cycles
(7) Toric methods and group actions
(8) Computer algebra
(9) Coding theory
(10) Computer Aided Geometric Design
Other. Calabi–Yau manifolds and mirror symmetry. Topology of algebraic surfaces and 4-manifolds. Moduli spaces. Algebraic stacks and their Gromov–Witten invariants. Free resolutions, homological algebra and derived categories. Birational methods. Deformation theory. Analytic and differential geometric methods. Syzygies and homological methods, derived categories.

Today's lecture only treats a small fraction of the first topic, namely: Classification of algebraic varieties via commutative algebra methods.

Classification of varieties

The classification of surfaces goes back to the 19th century.
1846: Cayley and Salmon: 27 lines on $S_3 \subset \mathbb{P}^3$
1860s: Riemann surfaces, Brill–Noether, RR theorem
1890–1910: Castelnuovo, Enriques and others: birational classification of surfaces by their plurigenera
1930s: Enriques and students: surfaces of general type
1930s: Fano: 3-folds $V_{2g-2} \subset \mathbb{P}^{g+1}$
1950s: Kodaira: deformation theory, moduli, classification of complex analytic surfaces
1980s: Mori theory, minimal models of 3-folds. The conclusion that classification is the division $K < 0$, $K = 0$, $K > 0$ plus fibrations, where $K$ is the canonical class.
1980s: Differentiable and symplectic 4-manifolds (Donaldson and others)
1990s: Calabi–Yau 3-folds, orbifolds, mirror symmetry.

EAGERists are involved in all these topics (and many more, of course). Any number of survey lectures could be made out of other EAGER topics.
1. Preliminary philosophical remarks

Surfaces. In what follows the ultimate aim (not necessarily expressed) is the study of regular surfaces of general type, for example, the simply connected Godeaux surfaces (that is, canonical surfaces $S$ with $p_g = 0$, $K_S^2 = 1$). This is a mature subject that involves most other areas of geometry. To study $S$, it may be convenient to know a lot about curves $C \subset S$, possibly passing through singular points of $S$; or it may be convenient to express $S$ as a hypersurface section of some higher-dimensional “key variety”, e.g., a Fano 3-fold or Fano 4-fold, possibly with orbifold singularities. Surprisingly, it turns out to be advantageous in some problems not to worry too much in advance what dimension of variety we are studying: taking a hypersurface section is a known operation.

Commutative algebra. The geometric constructions of Enriques, Horikawa and others can often be interpreted in algebraic terms as constructions of rings by generators and relations. As samples:
(1) The hypersurface $X_d \subset \mathbb{P}^n$ defined by $f_d = 0$ has homogeneous coordinate ring the graded ring $\mathbb{C}[x_0, x_1, \dots, x_n]/(f_d)$.
(2) The geometric idea of projection corresponds algebraically to elimination of variables.
(3) “Key varieties” may have a homological or commutative algebra treatment, such as determinantal form of the equations.

2. Definition of graded ring

A graded ring $R = \bigoplus_{n \ge 0} R_n$ is a (commutative) ring with a grading such that multiplication maps $R_i \times R_j \to R_{i+j}$.

Extra assumptions. The following are often in force:
(1) $R_0 = k$ is a field (often $k = \mathbb{C}$);
(2) The maximal ideal $m = \bigoplus_{n > 0} R_n$ is finitely generated. This implies $R = k[x_0, \dots, x_n]/I_R$, where the generators $x_i \in R_{a_i}$ of $m$ have $\mathrm{wt}\, x_i = a_i$, and $I_R$ is the homogeneous ideal of relations.
(3) $R$ is an integral domain.

Example 2.1. The standard textbooks define a projective variety to be a closed subvariety $X \subset \mathbb{P}^n$ in “straight” projective space $\mathbb{P}^n$ (all the generators of degree 1, so $x_i \in R_1$). Write
$$I_X = \bigoplus_{d \ge 0} \{\text{forms of degree } d \text{ vanishing on } X\}.$$
Then $I_X$ is a homogeneous ideal and $k[X] = k[x_0, \dots, x_n]/I_X$ is the coordinate ring of $X$. Here $R = k[X]$ is generated by its elements of degree 1; we are usually interested in the more general case of varieties in weighted projective space.
For details, see my website + algebraic geometry links + surfaces + graded rings and homework.

3. The Proj construction $R \mapsto \mathrm{Proj}\, R$

As described in [EGA2] or [Hartshorne, Chap. II] or my notes (webloc. cit.), $X = \mathrm{Proj}\, R$ is defined as the quotient $(\mathrm{Spec}\, R \setminus 0)/\mathbb{C}^*$ of the variety $\mathrm{Spec}\, R = V(I) \subset \mathbb{C}^{n+1}$ by the action of the multiplicative group $\mathbb{C}^* = \mathbb{G}_m(k)$ induced by the grading.

In more detail, if $R = k[x_0, \dots, x_n]/I$ with $\mathrm{wt}\, x_i = a_i$ then $\lambda \in \mathbb{C}^*$ acts on $R$ by multiplication by $\lambda^n$ on $R_n$, that is, $\lambda : x_i \mapsto \lambda^{a_i} x_i$. It therefore acts on the affine variety $\mathrm{Spec}\, R = V(I) \subset \mathbb{C}^{n+1}$. Note the philosophy: grading = $\mathbb{C}^*$ action.

The origin $0 \in \mathbb{C}^{n+1}$ is in the closure of every orbit, because $(0, 0, \dots, 0) = \lim_{\lambda \to 0} (\lambda^{a_0} x_0, \dots, \lambda^{a_n} x_n)$; this uses the fact that the grading of $R = \bigoplus R_n$ is by $\mathbb{N}$ with $n > 0$, or $\mathrm{wt}\, x_i = a_i > 0$. Therefore we must exclude the unstable point 0 to be able to take a sensible quotient.

For all $f \in R_d$ homogeneous of degree $d > 0$, form the ring
$$R\left[\frac{1}{f}\right]^0 = \left\{ \frac{g}{f^e} \ \middle|\ \mathrm{wt}\, g = de \right\} \subset \mathrm{Frac}\, R \qquad (3.1)$$
consisting of rational functions that are homogeneous of degree 0 with only $f$ or its powers in the denominator. Then define
$$X_f = \mathrm{Spec}\, R\left[\frac{1}{f}\right]^0 \quad\text{and}\quad X = \bigcup_{f \in R_d} X_f.$$
In other words, on taking the quotient $(\mathrm{Spec}\, R \setminus 0)/\mathbb{C}^*$:
(1) The typical $\mathbb{C}^*$-invariant open set is $(f \ne 0)$ for $f \in R_d$.
(2) The ring (3.1) is the ring of all $\mathbb{C}^*$-invariant regular functions on this open set.
Thus the quotient $\mathrm{Proj}\, R$ is the space of orbits of the $\mathbb{C}^*$ action, with all $\mathbb{C}^*$-invariant functions.

Remark 3.1. $X = \mathrm{Proj}\, R$ is really a stack, and it is sometimes convenient to treat it as an orbifold. It is a projective scheme $(X, \mathcal{O}_X)$, but it has the extra structure of the sheaves $\mathcal{O}_X(k)$ for all $k \in \mathbb{Z}$, defined by
$$\Gamma(X_f, \mathcal{O}_X(k)) = \left\{ \frac{g}{f^e} \ \middle|\ \mathrm{wt}\, g = de + k \right\} \subset \mathrm{Frac}\, R.$$
Then $\bigoplus_{k \in \mathbb{Z}} \mathcal{O}_X(k)$ is a sheaf of graded algebras.
For straight projective space (that is, $\mathrm{wt}\, x_i = 1$ for all $x_i$), $\mathcal{O}_X(1)$ is an ample invertible sheaf, and $\mathcal{O}_X(k) = \mathcal{O}_X(1)^{\otimes k}$. But for weighted projective space we must take the $\mathcal{O}_X(k)$ as extra data. For example, if all the $a_i$ have some common factor $q \mid a_i$ then $R_n = 0$ for all $n$ not divisible by $q$, and so $\mathcal{O}_X(k) = 0$ for such $k$. In this case we say that $X$ has nontrivial orbifold structure in codim 0.

Example 3.2. $C_{2g+2} \subset \mathbb{P}(1, 1, g + 1)$ defined by $y^2 = f_{2g+2}(x_1, x_2)$ is a hyperelliptic curve of genus $g$. $X_{10} \subset \mathbb{P}(1, 1, 2, 5)$ defined by $z^2 = f_{10}(x_1, x_2, y)$ is a famous example of Enriques and Kodaira of a canonical surface with $p_g = 2$, $K^2 = 1$.
4. Hilbert series

It follows from my assumptions on $R$ that $R_n$ is a finite-dimensional vector space over $R_0 = k$ for each $n$. Set
$$P_n(R) = \dim_k R_n \quad\text{and}\quad P_R(t) = \sum_{n \ge 0} P_n t^n.$$
The formal power series $P_R(t)$ is the Hilbert series of $R$. Under our assumptions it is a rational function in $t$; thus
$$R = k[x_0, \dots, x_n]/I_R \quad\text{with } \mathrm{wt}\, x_i = a_i$$
implies that $\prod_{i=0}^{n} (1 - t^{a_i}) \cdot P_R(t)$ is a polynomial in $t$, called the Hilbert numerator; it contains information and hints as to the homological algebra or commutative algebra properties of $R$.

Example 4.1. If $R = k[x_0, \dots, x_n]$ is the weighted polynomial ring then
$$P_R(t) = \frac{1}{\prod_{i=0}^{n} (1 - t^{a_i})}.$$

Example 4.2. If $R = k[x_0, \dots, x_n]/(f_d)$ is the ring of a weighted hypersurface of degree $d$ in $\mathbb{P}(a_0, \dots, a_n)$ then
$$P_R(t) = \frac{1 - t^d}{\prod_{i=0}^{n} (1 - t^{a_i})}.$$
Likewise, a codim 2 complete intersection has Hilbert numerator $(1 - t^{d_1})(1 - t^{d_2})$. See the homework sheet on webloc. cit. for more examples.
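The formulas of Examples 4.1 and 4.2 are easy to expand by machine. The following is a minimal sketch, not part of the lecture: the function name `hilbert_coeffs` and its argument layout are assumptions made for illustration. It returns the first coefficients of the series with numerator the product of the $(1 - t^{d})$ over the given relation degrees and denominator the product of the $(1 - t^{a_i})$ over the given weights, which is the Hilbert series of a weighted complete intersection.

```python
def hilbert_coeffs(weights, degrees, nterms):
    """First `nterms` coefficients of prod(1 - t^d) / prod(1 - t^a).

    `weights` are the a_i of the variables, `degrees` the degrees d of the
    relations (assumed to cut out a complete intersection).
    """
    series = [0] * nterms
    series[0] = 1
    # Multiply by each numerator factor (1 - t^d): new[n] = old[n] - old[n - d].
    # Running n downwards keeps old[n - d] untouched until it is used.
    for d in degrees:
        for n in range(nterms - 1, d - 1, -1):
            series[n] -= series[n - d]
    # Divide by each (1 - t^a): new[n] = old[n] + new[n - a], running upwards.
    for a in weights:
        for n in range(a, nterms):
            series[n] += series[n - a]
    return series

# X_10 in P(1, 1, 2, 5), the surface of Example 3.2.
x10 = hilbert_coeffs([1, 1, 2, 5], [10], 6)
# Straight P^2 with no relations: dimensions of spaces of plane forms.
plane = hilbert_coeffs([1, 1, 1], [], 4)
```

For $X_{10} \subset \mathbb{P}(1, 1, 2, 5)$ the coefficient of $t$ comes out as 2, matching $p_g = 2$ in Example 3.2.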
5. Hilbert series from orbifold RR

From now on, $X$ is a projective variety, and $\mathcal{O}_X(k) = \mathcal{O}_X(kA)$ with $A$ an ample $\mathbb{Q}$-divisor. So $rA$ is an ample Cartier divisor for some $r > 0$. Assume that
$$R = R(X, A) = \bigoplus_{k \ge 0} H^0(X, \mathcal{O}_X(kA)).$$
(This is an extra assumption on $R$, akin to projective normality.) Usually the terms of the Hilbert series $P_n(R) = h^0(X, \mathcal{O}_X(nA))$ are given by RR and vanishing for $n \gg 0$, plus initial assumptions for small $n$. If $A$ is $\mathbb{Q}$-Cartier, the form of RR we need is orbifold RR (also known as equivariant RR or the Atiyah–Singer Lefschetz formula). See [YPG, Chap. III] for details. A simple example gives the flavor.

Example 5.1. $C$ a curve, $A = D + \frac{a}{r} P$ with $D$ an integral divisor, $r > 1$ and $a \in [1, r - 1]$ coprime to $r$. Then $\mathcal{O}_C(nA) = \mathcal{O}_C([nA])$, where we round down the divisor $nA$ to the nearest integer (because a meromorphic function has poles of integral order), so that RR takes the form
$$\chi(C, \mathcal{O}_C(nA)) = \chi(\mathcal{O}_C([nA])) = 1 - g + n \deg A - \left\{ \frac{na}{r} \right\}.$$
Here the fractional part $\left\{ \frac{na}{r} \right\}$ is the small change we lose on rounding down $nA$ to $[nA]$. This introduces the orbifold correction term
$$-\frac{1}{1 - t^r} \cdot \sum_{i=1}^{r-1} \left\{ \frac{ia}{r} \right\} t^i \qquad (5.1)$$
into the Hilbert series. (The effect of multiplying by $\frac{1}{1 - t^r} = 1 + t^r + t^{2r} + \cdots$ is just to repeat the rounding-down errors periodically.)

Remark 5.2. Set $ab \equiv 1 \bmod r$ and let $\varepsilon$ be a primitive $r$th root of 1 (for example, $\varepsilon = \exp(2\pi i/r)$). Then one checks that
$$-\sum_{i=1}^{r-1} \left\{ \frac{ia}{r} \right\} \varepsilon^i = \frac{1}{1 - \varepsilon^b}.$$
Thus the term (5.1) is “cyclotomic” in nature. Generalisations of this idea give very quick and convenient ways of calculating the orbifold contributions to RR. We are in fact close to the proof of the Atiyah–Singer equivariant Lefschetz formula: the denominator is the equivariant Todd class $\det(\varepsilon : T_{X,P})$. See [YPG, Chap. III].
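The cyclotomic identity of Remark 5.2 can be checked in floating point. The sketch below assumes the identity in the form $\sum_{i=1}^{r-1} \{ia/r\}\, \varepsilon^i = -1/(1 - \varepsilon^b)$ with $ab \equiv 1 \bmod r$ and $\varepsilon = \exp(2\pi i/r)$, a sign convention that can be confirmed by hand for $r = 2$, where both sides equal $-1/2$. The function names are hypothetical, and `pow(a, -1, r)` needs Python 3.8 or later.

```python
import cmath
from math import gcd

def fractional_sum(r, a):
    """Sum over i = 1..r-1 of {i*a/r} * eps^i, with eps = exp(2*pi*i/r)."""
    eps = cmath.exp(2j * cmath.pi / r)
    return sum(((i * a) % r) / r * eps**i for i in range(1, r))

def closed_form(r, a):
    """-1 / (1 - eps^b), where b is the inverse of a modulo r."""
    b = pow(a, -1, r)  # modular inverse (Python 3.8+)
    eps = cmath.exp(2j * cmath.pi / r)
    return -1 / (1 - eps**b)

def max_discrepancy(max_r=12):
    """Largest |sum - closed form| over 2 <= r <= max_r and a coprime to r."""
    return max(abs(fractional_sum(r, a) - closed_form(r, a))
               for r in range(2, max_r + 1)
               for a in range(1, r) if gcd(a, r) == 1)
```

Calling `max_discrepancy()` should return a value at the level of rounding error, which is numerical evidence for the identity rather than a proof.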
Example 5.3 (Bauer, Catanese, Pignatelli). $C$ a curve of genus $g = 3$ with points $P, Q \in C$ such that $P + 3Q = K_C$. For example, $C = C_4 \subset \mathbb{P}^2$, with $Q$ a flex and $P$ the 4th point of intersection of the flex line with $C$. I choose the divisor $A = \frac{1}{2} P + Q$. Then
$$h^0(nA) = \begin{cases} 1 & n = 0; \\ 1 & n = 1; \\ 2 & n = 2 \ (P + 2Q = K_C - Q \text{ is a } g^1_3); \\ 3 & n = 3 \ (3A = K_C + \tfrac{1}{2} P \text{ and } g = 3); \\ -2 + \tfrac{3n}{2} & \text{if } n \ge 4 \text{ even}; \\ -2 + \tfrac{3n - 1}{2} & \text{if } n \ge 4 \text{ odd}. \end{cases}$$
Therefore
$$P_{C,A}(t) = 1 + t + 2t^2 + 3t^3 + 4t^4 + 5t^5 + 7t^6 + \cdots$$
$$(1 - t^2) P_{C,A}(t) = 1 + t + t^2 + 2t^3 + 2t^4 + 2t^5 + 3t^6 + 3t^7 + \cdots$$
$$(1 - t)(1 - t^2) P_{C,A}(t) = 1 + t^3 + t^6.$$
Thus
$$P_{C,A}(t) = \frac{1 - t^9}{(1 - t)(1 - t^2)(1 - t^3)}.$$
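The passage from the table of $h^0(nA)$ to the numerator $1 + t^3 + t^6$ in Example 5.3 can be replayed on truncated power series. This is a minimal sketch with hypothetical helper names (`h0`, `times_one_minus_power`); it takes the four small values from the table and uses $1 - g + \deg [nA] = -2 + n + \lfloor n/2 \rfloor$ for $n \ge 4$, which is the same as the even and odd cases listed above.

```python
def h0(n):
    """h^0(C, nA) for A = P/2 + Q on the genus 3 curve of Example 5.3."""
    small = {0: 1, 1: 1, 2: 2, 3: 3}   # values read off the table
    if n in small:
        return small[n]
    return -2 + n + n // 2             # RR for n >= 4: 1 - g + deg [nA]

def times_one_minus_power(series, k):
    """Multiply a truncated power series (list of coefficients) by (1 - t^k)."""
    return [c - (series[n - k] if n >= k else 0) for n, c in enumerate(series)]

N = 20
P = [h0(n) for n in range(N)]
# (1 - t)(1 - t^2) P(t), correct in every degree below N.
numerator = times_one_minus_power(times_one_minus_power(P, 2), 1)
```

The list `numerator` has entries 1 in degrees 0, 3 and 6 and zeros elsewhere up to the truncation, that is $1 + t^3 + t^6 = (1 - t^9)/(1 - t^3)$.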
The formula for $P_{C,A}(t)$ gives $C_9 \subset \mathbb{P}(1, 2, 3)$ as a possible model for $C$. One checks that it works: $C$ has a $\frac{1}{2}(1)$ orbifold point at $(0, 1, 0)$. The linear system $|2A| = |P + 2Q|$ is the $g^1_3$. $R(C, A)$ is a Gorenstein ring because $3A = K_C + \frac{1}{2} P$ is the orbifold canonical class of $C$.

6. Some classes of varieties to study

Regular surfaces of general type (Enriques). Assume that $K_S$ is ample, and that $q = h^1(S, \mathcal{O}_S) = 0$. (We say that $S$ is a regular surface; irregular surfaces with $q > 0$ are studied by different methods.)
$$P_n(S) = \begin{cases} 1 & n = 0; \\ p_g & n = 1 \ (\text{the definition of } p_g); \\ 1 + p_g + \binom{n}{2} K^2 & n \ge 2 \ (\text{by RR and vanishing}). \end{cases}$$
An easy calculation gives
$$P_S(t) = \frac{1 + (p_g - 3)t + (K^2 - 2p_g + 4)t^2 + (p_g - 3)t^3 + t^4}{(1 - t)^3}.$$
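The "easy calculation" for regular surfaces can be confirmed for any chosen invariants by multiplying the plurigenera by $(1 - t)^3$. The sketch below is an illustration with assumed function names (`plurigenera`, `hilbert_numerator`), not the author's code; it only checks the stated numerator for given numerical values of $p_g$ and $K^2$.

```python
from math import comb

def plurigenera(pg, k2, nterms):
    """P_n(S) for a regular surface: 1, p_g, then 1 + p_g + C(n, 2) K^2."""
    return [1 if n == 0 else pg if n == 1 else 1 + pg + comb(n, 2) * k2
            for n in range(nterms)]

def hilbert_numerator(pg, k2, nterms=12):
    """Coefficients of (1 - t)^3 * sum P_n t^n, truncated below t^nterms."""
    series = plurigenera(pg, k2, nterms)
    for _ in range(3):
        # Multiply by (1 - t): new[n] = old[n] - old[n - 1].
        series = [c - (series[n - 1] if n else 0) for n, c in enumerate(series)]
    return series

def expected_numerator(pg, k2, nterms=12):
    """The palindromic numerator claimed in the text, padded with zeros."""
    head = [1, pg - 3, k2 - 2 * pg + 4, pg - 3, 1]
    return head + [0] * (nterms - len(head))
```

For $p_g = 4$, $K^2 = 6$ this gives the numerator $1 + t + 2t^2 + t^3 + t^4$.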
About a dozen important cases were treated geometrically by Enriques, Kodaira, Horikawa and others. Algebraic treatment by Ciliberto, Catanese, Reid and others.
Example 6.1. $p_g = 4$, $K^2 = 6$. The first possible case suggested by the Hilbert series is $S_{3,4} \subset \mathbb{P}(1, 1, 1, 1, 2)$. This really works. There are lots of degenerate cases studied by Horikawa, and recently by [Bauer, Catanese and Pignatelli]; see below. The situation for $p_g = 3$, $K^2 = 2, 3, 4$ or for $p_g = 2$, $K^2 = 1, 2, 3$ is similar. Beyond these initial cases, the calculations get very difficult.

Fano 3-folds. Nonsingular 3-folds $V$ with $-K_V$ ample, usually anticanonically embedded as $V_{2g-2} \subset \mathbb{P}^{g+1}$. These were studied by Fano in the 1930s and Iskovskikh from the 1970s, later Mori and Mukai.

Q-Fano 3-folds. 3-folds $V$ with terminal singularities and $-K_V$ ample (Mori, Reid and others, 1990s). In studying 3-folds, terminal singularities are unavoidable; the most important and interesting singularities are the cyclic quotient singularities $\frac{1}{r}(1, a, r - a)$ with $r \ge 2$ and $a \in [1, r - 1]$ coprime to $r$. Several hundred families of Q-Fano 3-folds are known, for example the “famous 95” Fano hypersurfaces studied in [Corti, Pukhlikov, Reid]. See [DB].

Q-K3s. These are surfaces $X$ with quotient singularities and $K_X = \mathcal{O}_X$, $H^1(\mathcal{O}_X) = 0$, polarised by a $\mathbb{Q}$-divisor. They appear naturally as anticanonical surfaces $X \in |-K_V|$ on a Q-Fano 3-fold $V$.

Remark 6.2. It can happen that a surface of general type $S$ is contained in a Q-Fano 3-fold $V$, for example:
(1) $S \in |-2K_V|$, so adjunction gives $K_S = -K_V|_S$;
(2) or $V$ is a Q-Fano 3-fold of index 2 with $-K_V = 2A$ and $S \in |3A|$, so that $K_S = A|_S$.
A striking fact: the basket of singularities of $V$ (giving the fractional contributions to its Hilbert series) is then already determined by $S$. In the two cases above:
(1) $V$ has basket $(K^2 - 4p_g + 12) \times \frac{1}{2}(1, 1, 1)$. So for example, if $S$ has $p_g = 1$, $K^2 = 1$ then $V$ has $9 \times \frac{1}{2}(1, 1, 1)$ points, whereas if $S$ has $p_g = 1$, $K^2 = 2$ then $V$ has $10 \times \frac{1}{2}(1, 1, 1)$ points. We really meet these cases below.
(2) $V$ has basket $(K^2 - 3p_g + 6) \times \frac{1}{3}(1, 2, 2)$.
This follows automatically from orbifold RR!

7. Appendix: Cohen–Macaulay and Gorenstein

I omit the definitions and treatment by homological algebra, which are standard and not very difficult. In practice, we want $R$ to be Cohen–Macaulay and (better) Gorenstein; otherwise the ring and the variety are very difficult to construct.
Criterion. Let $R = R(X, A)$. Then
• $R$ is Cohen–Macaulay if and only if $H^i(X, \mathcal{O}_X(kA)) = 0$ for all $i$ with $0 < i < \dim X$ and all $k$, for $i = 0$ and $k < 0$, and for $i = \dim X$ and $k \gg 0$.
• $R$ is Gorenstein if and only if it is Cohen–Macaulay and $K_X = kA$ for some $k \in \mathbb{Z}$.

Example 7.1. These conditions hold in most of our cases:
(1) $X$ is a K3 surface with quotient singularities and $A$ an ample Weil divisor;
(2) $X$ is a regular surface of general type and $A = K_X$. Then $H^1(K_X) = 0$ follows from regularity and Serre duality, and $H^1(nK_X) = 0$ for $n \ge 2$ from Kodaira vanishing;
(3) $V$ is a Q-Fano 3-fold of Fano index $f$ and $-K_V = fA$;
(4) $C$ is an orbifold curve (with a point $\frac{1}{r} P$), and we interpret $K_C$ in the criterion as orbi-$K_C = K_C + \frac{r-1}{r} P$.
The cone over a projectively embedded Abelian surface is a simple example of a geometrically interesting variety that is not Cohen–Macaulay.

8. Application 1

Horikawa's study of surfaces with $p_g = 4$, $K^2 = 6$ divides them into several cases, and solves many problems, but leaves the existence of degenerations between cases II and III$_b$ as an open question. [Bauer, Catanese, Pignatelli] have recently proved that such a degeneration does occur.

II. The case assumption is that $|K_X|$ is a free linear system and defines a 3-to-1 morphism $\varphi_{K_X} : X \to Q \subset \mathbb{P}^3$, where $Q$ is the quadric cone $x_1 x_3 = x_2^2$. In this case pulling back the pencil of the quadric cone provides a pencil $|A|$ on the canonical model $X$ with $2A = K_X$. In general $X$ has an orbifold point of type $\frac{1}{2}(1, 1)$ over the vertex of $Q$. Restricting $A$ to a general $C \in |A|$ gives rise to the example treated above of a curve of genus 3 and an orbifold divisor $A = \frac{1}{2} P + Q$, so that $2A = P + 2Q$ is a $g^1_3$. It follows that $X = X_9 \subset \mathbb{P}(1, 1, 2, 3)$. This has all the required properties, and every surface in II is given by this construction.

III$_b$. The case assumption is that $|K_X|$ has a double point as its base locus on the canonical model (or a $-2$-curve as base component on the minimal model), and $\varphi_K : \widetilde{X} \to Q \subset \mathbb{P}^3$ is a 2-to-1 morphism to the quadric cone. Then again $K_X = 2A$ with $A^2 = 3/2$. At the level of a general curve $C \in |A|$, the curve $C$ is a nonsingular hyperelliptic curve of genus 3, and the restriction $A|_C$ is $\frac{3}{2} P$, where $P$ is a Weierstrass point. (Thus $2A = P + g^1_2$ can be viewed as a $g^1_3$ with a fixed point.)
Bauer, Catanese and Pignatelli [BCP2] calculate $R(C, A)$ and $R(X, K_X)$ in case IIIb:
\[ R\bigl(C, \tfrac12 P\bigr) = k[a, b, c]/\bigl(c^2 - f_7(a^4, b)\bigr) \quad\text{with}\quad \operatorname{wt} a, b, c = 1, 4, 14, \]
giving $C = C_{28} \subset \mathbb P(1, 4, 14)$. Then $R(C, A) = R(C, \frac32 P)$ is the third Veronese embedding: it needs generators
\[ x = a^3, \quad y = a^2b, \quad z = ab^2, \quad t = b^3, \quad u = ac, \quad v = bc \]
with $\operatorname{wt} x, y, z, t, u, v = 1, 2, 3, 4, 5, 6$, and relations
\[ \operatorname{rank}\begin{pmatrix} x & y & z & u \\ y & z & t & v \end{pmatrix} \le 1 \tag{8.1} \]
(meaning the $2\times2$ minors $= 0$, which gives 6 equations); and 3 further equations derived from $c^2 = f_7$, of the form
\[ u^2 = [a^2f], \quad uv = [abf], \quad v^2 = [b^2f], \]
where $[a^2f]$ means that we write out the terms $a^{30}, a^{26}b, \dots, a^2b^7$ of $a^2f$ in terms of $x, y, z, t$. If we group together the terms in $f$ as
\[ f = a^{28} + a^{24}b + \cdots + a^4b^6 + b^7 = aA + b^4B \quad\text{with}\quad A = A_9,\ B = B_4 \in k[x, y, z, t] \]
then the 3 final equations become
\[ u^2 = xA + z^2B, \quad uv = yA + ztB, \quad v^2 = zA + t^2B. \tag{8.2} \]
This is the "rolling factors" format of [Dicks]: you go from one relation to the next by replacing an entry in the top row of the matrix of (8.1) by an entry in the bottom.

Equations (8.1) and (8.2) are 9 equations with 16 syzygies defining a codim 4 Gorenstein ring. They can be written as the $4 \times 4$ Pfaffians of the following extrasymmetric matrix:
\[ M = \begin{pmatrix} 0 & z & x & y & u \\ & t & y & z & v \\ & & u & v & A \\ & -\mathrm{sym} & & 0 & Bz \\ & & & & Bt \end{pmatrix} \quad\text{of weights}\quad \begin{pmatrix} 0 & 3 & 1 & 2 & 5 \\ & 4 & 2 & 3 & 6 \\ & & 5 & 6 & 9 \\ & & & 4 & 7 \\ & & & & 8 \end{pmatrix}. \]
The matrix $M$ is skew, with the following extra symmetry: the top right $3 \times 3$ block is symmetric, and the bottom right $3\times3$ block is $B$ times the top left. Thus instead of 15 independent entries it only has 9, and likewise, only 9 independent $4 \times 4$ Pfaffians. The format relates closely to the Segre embedding of $\mathbb P^2 \times \mathbb P^2$ as a (nongeneric) linear section of $\operatorname{Grass}(2, 6)$.
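The count of 9 independent Pfaffians can be checked by hand or by machine. The following is a minimal sketch in plain Python, under the assumption that the strict upper triangle of the $6\times6$ skew matrix reads, row by row, $(0, z, x, y, u)$, $(t, y, z, v)$, $(u, v, A)$, $(0, Bz)$, $(Bt)$; the function names are illustrative only. With this reading the last three Pfaffians come out as $u^2 - xA + Bz^2$, and so on, which agrees with (8.2) after replacing $B$ by $-B$; that may simply be a sign convention, or an artifact of how the matrix is read here.

```python
from fractions import Fraction as F
from itertools import combinations

def extrasymmetric(x, y, z, t, u, v, A, B, lam=0):
    """6x6 skew matrix with the extra symmetry described in the text (assumed layout)."""
    upper = [[lam, z, x, y, u], [t, y, z, v], [u, v, A], [B * lam, B * z], [B * t]]
    m = [[0] * 6 for _ in range(6)]
    for i, row in enumerate(upper):
        for offset, val in enumerate(row):
            j = i + 1 + offset
            m[i][j], m[j][i] = val, -val
    return m

def pf4(m, i, j, k, l):
    """4x4 Pfaffian on rows/columns i<j<k<l."""
    return m[i][j] * m[k][l] - m[i][k] * m[j][l] + m[i][l] * m[j][k]

def all_pfaffians(m):
    return {q: pf4(m, *q) for q in combinations(range(6), 4)}

# 1. At an arbitrary point, 6 of the 15 Pfaffians repeat or are B times another one.
x, y, z, t, u, v, A, B = 2, 3, 5, 7, 11, 13, 17, 19
P = all_pfaffians(extrasymmetric(x, y, z, t, u, v, A, B))
dependent_ok = (
    P[(0, 2, 3, 4)] == P[(0, 1, 3, 5)]
    and P[(1, 2, 3, 4)] == P[(0, 1, 4, 5)]
    and P[(1, 2, 3, 5)] == P[(0, 2, 4, 5)]
    and P[(0, 3, 4, 5)] == B * P[(0, 1, 2, 3)]
    and P[(1, 3, 4, 5)] == B * P[(0, 1, 2, 4)]
    and P[(2, 3, 4, 5)] == B * P[(0, 1, 2, 5)]
)
rolling_ok = (
    P[(0, 2, 3, 5)] == u * u - x * A + B * z * z
    and P[(0, 2, 4, 5)] == u * v - y * A + B * z * t
    and P[(1, 2, 4, 5)] == v * v - z * A + B * t * t
)

# 2. On the curve: pick a, b, c and B_text, then solve c^2 = a*A + b^4*B_text for A.
a, b, c, B_text = F(2), F(3), F(5), F(7)
A_val = (c * c - b**4 * B_text) / a
point = (a**3, a * a * b, a * b * b, b**3, a * c, b * c)
on_curve = all_pfaffians(extrasymmetric(*point, A_val, -B_text))
vanish_ok = all(val == 0 for val in on_curve.values())
```

The second check uses exact rational arithmetic, so the vanishing of all 15 Pfaffians at the chosen point is an identity rather than a floating-point coincidence.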
This format is flexible: it carries its own syzygies with it, so that we can vary the entries as we like and obtain a flat deformation. Replacing $M$ by
\[ M_\lambda = \begin{pmatrix} \lambda & z & x & y & u \\ & t & y & z & v \\ & & u & v & A \\ & -\mathrm{sym} & & B\lambda & Bz \\ & & & & Bt \end{pmatrix} \]
with a constant $\lambda \ne 0$ deforms the hyperelliptic curve to a nonhyperelliptic trigonal curve. Similarly (but with some more work), one can prove that the surfaces in case IIIb have small deformations in II.

9. Appendix: All about Pfaffians

Let $M_0 = \{m_{ij}\}$ be a $2k \times 2k$ skew matrix. Its Pfaffian is
\[ \operatorname{Pf} M_0 = {\sum}' \operatorname{sign}(\sigma) \prod_{i=1}^{k} m_{\sigma(2i-1)\sigma(2i)}; \]
(sum over the symmetric group $S_{2k}$), and $\sum'$ means that we only take 1 occurrence of each repeated factor. Skewsymmetry causes each term to occur $2^k \cdot k!$ times, so the Pfaffian consists of $\frac{(2k)!}{2^k \cdot k!} = 1 \cdot 3 \cdots (2k - 1)$ terms. For example, a $4 \times 4$ Pfaffian is of the form
\[ \operatorname{Pf}_{12.34} = m_{12}m_{34} - m_{13}m_{24} + m_{14}m_{23} \]
which is familiar as the Plücker equations of $\operatorname{Grass}(2, n)$. In fact $\det M_0 = (\operatorname{Pf} M_0)^2$. The Pfaffian is a skew determinant, and every aspect of the theory of determinants extends to Pfaffians. For example, it follows from the definition that a Pfaffian can be expanded along any row exactly like a determinant: thus a $6 \times 6$ Pfaffian is
\[ \operatorname{Pf}_{12.34.56} = m_{12} \cdot \operatorname{Pf}_{34.56} - m_{13} \cdot \operatorname{Pf}_{24.56} + \cdots. \]
If $M$ is a $(2k + 1) \times (2k + 1)$ skew matrix, write $\operatorname{Pf}_i = (-1)^i \operatorname{Pf} M_i$, where $M_i$ is the skew $2k \times 2k$ matrix obtained by deleting the $i$th row and column from $M$. Then the adjoint matrix of $M$ (matrix of $2k \times 2k$ cofactors) is the matrix of rank 1 (or 0)
\[ \operatorname{adj} M = \operatorname{Pf} \cdot {}^t\operatorname{Pf}, \quad\text{where } \operatorname{Pf} = (\operatorname{Pf}_1, \dots, \operatorname{Pf}_{2k+1}). \]
Since $\det M = 0$ we get $\operatorname{Pf} \cdot M = 0$, and if $M$ has rank $2k$ then $\operatorname{Pf}$ generates $\ker M$ (skew Cramer's rule).
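The statements of this appendix are easy to test on small integer matrices. The sketch below, in plain Python with illustrative function names, computes a Pfaffian by expansion along the first row and checks $\det M_0 = (\operatorname{Pf} M_0)^2$ for a $6\times6$ example, and $\operatorname{Pf}\cdot M = 0$ together with the rank 1 form of the cofactor matrix for a $5\times5$ example. It is a brute-force illustration, not an efficient algorithm.

```python
def skew(n, entries):
    """Build an n x n skew matrix from its strict upper triangle, listed row by row."""
    it = iter(entries)
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            m[i][j] = next(it)
            m[j][i] = -m[i][j]
    return m

def delete(m, rows, cols):
    """Submatrix of m with the given row and column indices removed."""
    return [[m[i][j] for j in range(len(m)) if j not in cols]
            for i in range(len(m)) if i not in rows]

def pfaffian(m):
    """Pfaffian by expansion along the first row (0 for odd size)."""
    n = len(m)
    if n == 0:
        return 1
    if n % 2:
        return 0
    return sum((-1) ** (j - 1) * m[0][j] * pfaffian(delete(m, (0, j), (0, j)))
               for j in range(1, n))

def det(m):
    """Determinant by Laplace expansion along the first row."""
    n = len(m)
    if n == 0:
        return 1
    return sum((-1) ** j * m[0][j] * det(delete(m, (0,), (j,))) for j in range(n))

# Even size: det = Pf^2.
M4 = skew(4, [1, 2, 3, 4, 5, 6])
M6 = skew(6, range(1, 16))
pf4_value = pfaffian(M4)          # m12*m34 - m13*m24 + m14*m23 = 6 - 10 + 12
square_ok = det(M6) == pfaffian(M6) ** 2

# Odd size: the vector Pf_i = (-1)^i Pf(M_i), with i counted from 1.
M5 = skew(5, range(1, 11))
pf_vec = [(-1) ** (i + 1) * pfaffian(delete(M5, (i,), (i,))) for i in range(5)]
kernel = [sum(pf_vec[i] * M5[i][j] for i in range(5)) for j in range(5)]
cofactors = [[(-1) ** (i + j) * det(delete(M5, (i,), (j,))) for j in range(5)]
             for i in range(5)]
outer = [[pf_vec[i] * pf_vec[j] for j in range(5)] for i in range(5)]
```

Here `kernel` is the row vector $\operatorname{Pf}\cdot M$ and `cofactors` is the matrix of $4\times4$ cofactors; for a skew matrix of odd size the cofactor matrix is symmetric, so it coincides with the adjoint.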
10. Application 2

Surfaces with $p_g = 1$, $K^2 = 2$ were studied in [Catanese and Debarre], following Enriques; an alternative construction as a section of a higher-dimensional variety was given by Jan Stevens in 1995 (but as far as I know not written down). I start from the graded ring over the canonical curve $C \in |K_S|$: a reasonably general $4 \times 4$ symmetric matrix $M$ of linear forms on $\mathbb P^2_{y_1,y_2,y_3}$ defines an invertible sheaf $\mathcal O_C(A)$ on the plane quartic $C = C_4 : (\det M = 0) \subset \mathbb P^2$, with the resolution
\[ \mathcal O_C(A) \leftarrow 4\mathcal O_{\mathbb P^2}(-1) \xleftarrow{\ M\ } 4\mathcal O_{\mathbb P^2}(-2) \leftarrow 0, \tag{10.1} \]
and satisfying $\mathcal O_C(2A) = K_C$ (in other words, $A$ is an ineffective theta characteristic on $C$). The corresponding graded ring
\[ R(C, A) = k[y_1, y_2, y_3, z_1, z_2, z_3, z_4]/I_C \]
is generated by $y_1, y_2, y_3 \in H^0(\mathcal O_C(2A))$ and $z_1, \dots, z_4 \in H^0(\mathcal O_C(3A)) = H^0(\mathcal O_C(A)(1))$ with relations $(z_1, \dots, z_4)M = 0$ from (10.1) and $z_iz_j = M_{ij}$ (the $ij$th maximal minor of $M$). These equations define a codim 5 embedding $C \subset \mathbb P(2^3, 3^4)$ with Hilbert numerator
\[ 1 - 4t^5 - 10t^6 + 15t^8 + 20t^9 - 20t^{11} - \cdots \]
The same construction starting from a $4 \times 4$ symmetric matrix $M$ over $\mathbb P^3$ leads to a quartic K3 surface $X_4 \subset \mathbb P^3$ carrying an ineffective Weil divisor $A_X$ with a resolution similar to (10.1), and $R(X, A)$ embeds $X$ into $\mathbb P(2^4, 3^4)$. However, now $X$ has 10 nodes at points where $\operatorname{rank} M = 2$. These are $\frac12(1, 1)$ orbifold points at which $\mathcal O_X(A_X)$ is the odd eigensheaf.

The problem is to deform the graded ring $R(C, A)$ or $R(X, A_X)$ with new generators of degree 1. First project $X$ from a chosen node to $X_{6,6} \subset \mathbb P(2, 2, 2, 3, 3)$; the exceptional curve of this projection is $\mathbb P^1 = \mathbb P(1, 1)$ embedded into $\mathbb P(2, 2, 2, 3, 3)$. Since $\mathbb P(2, 2, 2, 3, 3)$ has no forms of degree 1, this embedding is not projectively normal; in coordinates it is
\[ (v, w) \mapsto (v^2, vw, w^2, v^3 + \alpha v^2w, \beta vw^2 + w^3) \quad\text{with } 1 + \alpha\beta = 0. \]
The following result is joint work with Grzegorz Kapustka and Michal Kapustka (who held an EAGER visiting studentship at Warwick in spring 2004).

Claim. General forms of degree 1, 2, 2, 2, 3, 3 define an embedding $\mathbb P^2 \cong \Pi \subset \mathbb P(1, 2, 2, 2, 3, 3)$ with image $\Pi$ contained in 3 sextics. The complete intersection of two general sextics through $\Pi$ is a $\mathbb Q$-Fano 3-fold $V_{6,6}$ with $9 \times \frac12(1, 1, 1)$ orbifold points on $\mathbb P^2_{y_1,y_2,y_3}$, 24 ordinary nodes on $\Pi$, and nonsingular otherwise.
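The 24 nodes of $V_{6,6}$ on $\Pi$ are resolved by the (small) blowup $\widetilde V \to V_{6,6}$ of $\Pi$, and the birational image $E \subset \widetilde V$ of $\Pi$ has $E \cong \mathbb P^2$, $\mathcal O_E(-E) \cong \mathcal O_{\mathbb P^2}(2)$; it contracts to a tenth orbifold point $\frac12(1, 1, 1)$ on a Fano $V \subset \mathbb P(1, 2^4, 3^4)$.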
The proof is a calculation in computer algebra. According to results of Jan Stevens, $V$ actually extends to a Fano 6-fold $W \subset \mathbb P(1^4, 2^4, 3^4)$ of Fano index 4 having 10 isolated orbifold points of type $\frac12(1, \dots, 1)$. (It can be obtained by an immersion $\mathbb P^5 \to \mathbb P(1^4, 2^3, 3^2)$ contained in two sextics, but the computation is quite bulky.)

References
[BCP1] Ingrid Bauer, Fabrizio Catanese and Roberto Pignatelli, Canonical rings of surfaces whose canonical system has base points, in Complex geometry (Göttingen, 2000), Springer, Berlin, 2002, pp. 37–72
[BCP2] Ingrid Bauer, Fabrizio Catanese and Roberto Pignatelli, The moduli space of surfaces with $K^2 = 6$, $p_g = 4$, math.AG/0408062, 17 pp.
[DB] Gavin Brown, Graded rings database, online at http://www.maths.warwick.ac.uk/~gavinb/grdb.html
[CD] F. Catanese and O. Debarre, Surfaces with $K^2 = 2$, $p_g = 1$, $q = 0$, J. reine angew. Math. 395 (1989) 1–55
[CPR] A. Corti, A. Pukhlikov and M. Reid, Birationally rigid Fano hypersurfaces, in Explicit birational geometry of 3-folds, A. Corti and M. Reid (eds.), CUP 2000, pp. 175–258
[Dicks] M. Reid, Surfaces with $p_g = 3$, $K^2 = 4$ according to E. Horikawa and D. Dicks, in Proceedings of Algebraic geometry mini-symposium (Tokyo Univ., Dec 1989, distributed in Japan only), 1–22 (get from my website)
[EGA2] A. Grothendieck, Éléments de géométrie algébrique. II, Étude globale élémentaire de quelques classes de morphismes, IHES Publ. Math. 8 (1961) 222 pp.
[YPG] M. Reid, Young person's guide to canonical singularities, in Algebraic Geometry, Bowdoin 1985, ed. S. Bloch, Proc. of Symposia in Pure Math. 46, A.M.S. (1987), vol. 1, pp. 345–414

Miles Reid
EAGER network
Mathematical Problems of Large Quantum Systems Jan Philip Solovej
1. Introduction

In this lecture I will present work done within the framework of the European network The Analysis of Large Quantum Systems. The research in this network deals with the rigorous mathematical analysis of models from physics. The models studied range from atomic, molecular and condensed matter systems over statistical mechanics models to problems in quantum field theory and even string theory. Many of the questions studied are of current interest in physics; others are fundamental mathematical problems left unanswered for decades.

It will not be possible to discuss in detail all the problems studied within the network. I will describe the general framework for a large class of problems and some general approaches used. Most of the problems are within the theory of many-body quantum mechanics. The different particles may belong to different species of identical particles. These may in turn obey either Bose or Fermi statistics. I will mainly focus on the situation where all the particles belong to the same species of identical particles.

One general approach is to approximate the extremely complex many-body quantum models by effective models. It is of course a very important issue to rigorously control the degree of approximation or to find limits in which the effective models become exact. The effective models may then in turn be studied in their own right.

As an example of a many-body quantum system I will discuss the fundamental example of the charged gas and sketch a recent resolution of a conjecture by Dyson on the charged Bose gas. As we shall explain, one consequence of Dyson's conjecture is that charged Bose systems are unstable. Charged bosons do exist in nature but they always have other significant interactions besides the electromagnetic interactions discussed here.

In Section 2, I introduce the general mathematical framework of many-body quantum mechanics and identical particles obeying Fermi or Bose statistics. In Section 3, I formulate precisely some of the questions studied. In Section 4, I discuss general approaches such as typical estimates and reduction to effective models. I also discuss some of these effective models, such as the fermionic Hartree-Fock model, the bosonic Hartree model, and Bogolubov's model. In Section 5, I introduce the charged Coulomb gas and review some of the classical results known for the charged Fermi gas, e.g., stability of matter. In Section 6, I discuss the proof of Dyson's formula for the charged Bose gas.
2. Many-body quantum systems

A quantum particle is described by a Hilbert space $\mathcal H$ and a Hamilton operator $h$ (often an unbounded operator on $\mathcal H$). The Hamilton operator gives the possible energy levels of the particle and the unitary one-parameter group $e^{-ith}$ describes the time evolution (we are using units in which Planck's constant $\hbar = 1$). If we have $N$ identical copies of this particle, the system is described by the Hilbert space
\[ \mathcal H_N = \bigotimes^N \mathcal H \]
and normally by a Hamilton operator of the form
\[ H_N = \sum_{i=1}^N h_i + \sum_{1\le i<j\le N} W_{ij}. \tag{2.1} \]
Here $h_i$ is the operator $h$ acting on the $i$th tensor factor and $W_{ij}$ is an operator $W$ on $\mathcal H \otimes \mathcal H$ acting on the $i$th and $j$th factors. The operator $W$ represents a two-particle interaction between the particles and is symmetric under the interchange of the tensor factors. There could in principle be more complicated interactions, but we shall not consider those here.

There could also be several species of identical particles. E.g., if we had $N$ particles of a species labeled $a$ and $M$ particles of a species labeled $b$, the Hilbert space would be $\mathcal H_N^{(a)} \otimes \mathcal H_M^{(b)}$ and the Hamiltonian would have the form
\[ H_{N,M} = \sum_{i=1}^N h_i^{(a)} + \sum_{1\le i<j\le N} W_{ij}^{(a)} + \sum_{i=1}^M h_i^{(b)} + \sum_{1\le i<j\le M} W_{ij}^{(b)} + \sum_{i=1}^N \sum_{j=1}^M W_{i,j}^{(a,b)}. \]
Of special interest are identical particles which obey Fermi or Bose statistics. For fermions the Hilbert space is restricted to the subspace of antisymmetric tensor products
\[ \mathcal H_N^F = \bigwedge^N \mathcal H \subset \mathcal H_N. \]
For bosons the Hilbert space is restricted to the subspace of symmetric tensor products
\[ \mathcal H_N^B = \bigotimes^N_{\rm sym} \mathcal H \subset \mathcal H_N. \]
Note that the operator $H_N$ from (2.1) leaves both subspaces $\mathcal H_N^F$ and $\mathcal H_N^B$ invariant.

The situation described above where one fixes the number of particles $N$ is called the canonical picture. It is often convenient to consider what is called the grand canonical picture where one does not fix particle number. The Hilbert spaces are the fermionic and bosonic Fock spaces
\[ \mathcal F^F(\mathcal H) = \bigoplus_{N=0}^\infty \mathcal H_N^F, \qquad \mathcal F^B(\mathcal H) = \bigoplus_{N=0}^\infty \mathcal H_N^B. \]
The Hamiltonian is $H = \bigoplus_{N=0}^\infty H_N$. The convention here is that $\mathcal H_0^{B,F} = \mathbb C$ and $H_0 = 0$. On the Fock spaces the particle number $N$ is an operator. The Hamiltonian $H$ conserves particle number in the sense that it commutes with the particle number operator.

To treat many-body Fermi or Bose systems it is convenient to introduce the notion of second quantization. For each $f \in \mathcal H$ we define the creation operators $a_F(f)^*$ and $a_B(f)^*$ on $\mathcal F^F(\mathcal H)$ and $\mathcal F^B(\mathcal H)$ respectively ($a_B(f)^*$ is an unbounded operator). These operators map $\mathcal H_N^{F,B}$ to $\mathcal H_{N+1}^{F,B}$, and on $\mathcal H_N^{F,B}$ they are defined by
\[ a_{F,B}(f)^*\Psi = \sqrt{N+1}\, P_{F,B}\, f \otimes \Psi \]
where $P_{F,B}$ is the orthogonal projection of $\mathcal H_{N+1}$ onto $\mathcal H_{N+1}^{F,B}$. The formal adjoints of these operators (defined on the subspace of vectors with finite particle number) are the annihilation operators $a_F(f)$ and $a_B(f)$. These operators satisfy the celebrated canonical commutation and anticommutation relations
\[ [a_B(f), a_B(g)^*] = (f, g), \qquad \{a_F(f), a_F(g)^*\} = (f, g). \]
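For a finite-dimensional one-particle space the fermionic relations can be verified directly. The sketch below assumes $\mathcal H = \mathbb R^3$ with basis $u_0, u_1, u_2$, labels the basis of Fock space by occupation bitmasks, and uses the usual sign convention (a factor $-1$ for each occupied mode below the one being created); these choices and all names are illustrative, not taken from the text.

```python
D = 3                 # dimension of the one-particle space (assumption)
DIM = 1 << D          # dimension of the fermionic Fock space

def creation(i):
    """Matrix of a(u_i)^* in the occupation-number basis."""
    m = [[0] * DIM for _ in range(DIM)]
    for s in range(DIM):
        if not (s >> i) & 1:                              # mode i empty in state s
            below = bin(s & ((1 << i) - 1)).count("1")    # occupied modes below i
            m[s | (1 << i)][s] = (-1) ** below
    return m

def transpose(m):
    return [list(r) for r in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(DIM)] for i in range(DIM)]

def scale(c, a):
    return [[c * v for v in row] for row in a]

def anticomm(a, b):
    return add(matmul(a, b), matmul(b, a))

IDENT = [[int(i == j) for j in range(DIM)] for i in range(DIM)]
ZERO = scale(0, IDENT)
a_dag = [creation(i) for i in range(D)]
a = [transpose(m) for m in a_dag]          # real matrices: adjoint = transpose

def smear(ops, f):
    """sum_i f_i * ops[i] for a real vector f."""
    out = ZERO
    for fi, op in zip(f, ops):
        out = add(out, scale(fi, op))
    return out

f, g = [1, 2, 0], [3, -1, 2]
inner = sum(x * y for x, y in zip(f, g))   # (f, g) = 1
car_basis = all(anticomm(a[i], a_dag[j]) == scale(int(i == j), IDENT)
                for i in range(D) for j in range(D))
car_aa = all(anticomm(a[i], a[j]) == ZERO for i in range(D) for j in range(D))
car_fg = anticomm(smear(a, f), smear(a_dag, g)) == scale(inner, IDENT)

number = ZERO
for i in range(D):
    number = add(number, matmul(a_dag[i], a[i]))
number_diag = [number[s][s] for s in range(DIM)]   # particle number of each basis state
```

The diagonal of `number` reproduces the particle number of each basis state, which is the finite-dimensional version of the statement that $N$ is an operator on Fock space.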
In the following we shall drop the index $F$ or $B$ either when the index is unimportant or when it follows from the context which case we consider. We may write the grand canonical operator
\[ H = \sum_{\alpha\beta} h_{\alpha\beta}\, a^*_\alpha a_\beta + \tfrac12 \sum_{\alpha\beta\mu\nu} W_{\alpha\beta\mu\nu}\, a^*_\alpha a^*_\beta a_\nu a_\mu \]
where $a^*_\alpha = a(u_\alpha)^*$ and $u_\alpha$, $\alpha = 0, \dots$ is an orthonormal basis for the one-particle space $\mathcal H$. Here
\[ h_{\alpha\beta} = (u_\alpha, hu_\beta), \qquad W_{\alpha\beta\mu\nu} = (u_\alpha \otimes u_\beta, W u_\mu \otimes u_\nu). \]
We call the two terms in $H$ the quadratic term and the quartic term, referring to the number of creation and annihilation operators. The particle number operator is $N = \sum_\alpha a^*_\alpha a_\alpha$.

Each term in the Hamiltonian $H$ has an equal number of creation and annihilation operators, which simply reflects the fact that it preserves particle number. There are also Hamiltonians for which this is not the case. In relativistic quantum mechanics one is forced to look at more general Hamiltonians that are particle non-conserving. In these Hamiltonians the quartic terms have an unequal number of creation and annihilation operators. We shall not discuss these further here, although many of the issues discussed in the next section are relevant also in this case.

Another type of generalization of the Hamiltonian $H$ is to have two types of particles with non-conserving terms of the form $a^*ab + a^*ab^*$. These types of terms occur for instance when considering the interactions of electrons with quantized electromagnetic fields. The creation and annihilation operators $a, a^*$ correspond to the fermionic electron whereas the $b, b^*$ create and annihilate the bosonic photon.
It is often difficult to give a precise mathematical meaning to Hamiltonians with particle non-conserving terms. A consequence of the non-conserving terms is that one must deal with situations with infinitely many particles. It is part of the problem to define this properly.

Before turning to the questions of interest we introduce the notion of density matrices for the many-body vectors in Fock space. For any normalized $\Psi \in \mathcal F(\mathcal H)$ we define the 1-particle density matrix $\gamma_\Psi$ on the one-particle space $\mathcal H$ by
\[ (f, \gamma_\Psi g) = (\Psi, a(g)^*a(f)\Psi). \]
Then $0 \le \gamma_\Psi \le (\Psi, N\Psi)1$ (as operators) and $\operatorname{Tr}\gamma_\Psi = (\Psi, N\Psi)$. For normalized fermionic $\Psi \in \mathcal F^F(\mathcal H)$ we have
\[ (f, \gamma_\Psi f) = (\Psi, a(f)^*a(f)\Psi) \le (\Psi, \{a(f)^*, a(f)\}\Psi) \le \|f\|^2. \]
Thus for fermions $0 \le \gamma_\Psi \le 1$. For bosons, however, $\gamma_\Psi$ may have an eigenvalue $(\Psi, N\Psi)$. If for large $N$, $\Psi$ is an $N$-particle state with $\gamma_\Psi$ having an eigenvalue of order $N$, one says that $\Psi$ satisfies Bose-Einstein condensation. The ratio of the largest eigenvalue of $\gamma_\Psi$ to $N$ is called the degree of condensation.

3. Issues of interest

The operators $H_N$ (canonical) and $H$ (grand canonical) defined in the previous section are usually unbounded operators defined on dense subspaces $D_N = D(H_N) \subset \mathcal H_N$ and $D = \bigcup_{m=0}^\infty \bigoplus_{N=0}^m D_N \subset \mathcal F(\mathcal H)$.

One important question is to know whether the canonical and grand canonical ground state energies
\[ E_N = \inf_{\Psi\in D_N\setminus\{0\}} \frac{(\Psi, H_N\Psi)}{(\Psi, \Psi)}, \qquad E(\mu) = \inf_{\Psi\in D\setminus\{0\}} \frac{(\Psi, (H + \mu N)\Psi)}{(\Psi, \Psi)}, \tag{3.1} \]
are finite, i.e., whether $E_N > -\infty$ and $E(\mu) > -\infty$ for $\mu$ large enough. In many situations one cannot expect that the grand canonical energy is finite without including a chemical potential $\mu$. (Of course the term $\mu N$ could be considered part of $H$ if we simply replaced the one-particle operator $h$ by $h + \mu$. It is however convenient to display the dependence on $\mu$ explicitly.)

If the ground state energy $E_N$ is finite we say that our system is stable of the first kind. The operator $H_N$ then has a self-adjoint Friedrichs extension. If the ground state energy $E(\mu)$ is finite for some $\mu$ we say that our system is stable of the second kind. In this case $H + \mu N$ has a self-adjoint Friedrichs extension. (But of course even if stability of the second kind fails, but stability of the first kind holds, $H$ has a natural self-adjoint extension.)

If $E_N$ is finite we may ask whether it is an eigenvalue. In this case a corresponding eigenvector is called a ground state. We may in this case want to know different properties of the ground state, such as details about its 1-particle density matrix, e.g., whether the system has Bose condensation.
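The contrast between the fermionic bound $0 \le \gamma_\Psi \le 1$ and Bose condensation is already visible for two particles. The sketch below is a toy computation in plain Python, assuming $\mathcal H = \mathbb R^3$, real coefficients, and the standard partial-trace expression $\gamma_{ij} = N \sum_k \Psi_{ik}\Psi_{jk} / \|\Psi\|^2$ for a two-particle vector $\Psi = \sum \Psi_{ik}\, u_i \otimes u_k$; the vectors and names are illustrative choices.

```python
from fractions import Fraction as F

def density_matrix(psi, n_particles=2):
    """1-particle density matrix of a (not necessarily normalized) 2-particle vector."""
    d = len(psi)
    norm2 = sum(psi[i][k] ** 2 for i in range(d) for k in range(d))
    return [[n_particles * sum(psi[i][k] * psi[j][k] for k in range(d)) / norm2
             for j in range(d)] for i in range(d)]

def matmul(a, b):
    d = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

# Fermions: Slater determinant of two orthonormal vectors.
phi1 = [F(1), F(0), F(0)]
phi2 = [F(0), F(3, 5), F(4, 5)]
slater = [[phi1[i] * phi2[k] - phi2[i] * phi1[k] for k in range(3)] for i in range(3)]
gamma_f = density_matrix(slater)
is_projection = matmul(gamma_f, gamma_f) == gamma_f     # eigenvalues 0 or 1

# Bosons: product state phi ⊗ phi.
phi = [F(3, 5), F(4, 5), F(0)]
product_state = [[phi[i] * phi[k] for k in range(3)] for i in range(3)]
gamma_b = density_matrix(product_state)
gamma_b_phi = [sum(gamma_b[i][j] * phi[j] for j in range(3)) for i in range(3)]
condensed = gamma_b_phi == [2 * x for x in phi]          # eigenvalue N = 2
```

Both density matrices have trace 2, but the fermionic one is a rank 2 projection while the bosonic one has the single eigenvalue 2, i.e. degree of condensation 1.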
Having defined the operator $H_N$ as a self-adjoint operator we may study the positive temperature state defined by the many-body operator $\exp(-\beta H_N)$, where $\beta$ (up to a constant) is the inverse temperature, or we may study the evolution semigroup $\exp(itH_N)$. We have similar operators in the grand canonical picture. In statistical mechanics the issue of interest is the limit of the ground state or positive temperature states as the number of particles tends to infinity.

4. General approach: estimates and effective models

The study of stability of the first or second kind requires different types of estimates. The operators involved are often elliptic differential operators. We shall consider examples of these and the typical estimates in the next section.

The full many-body problem is often too difficult to study directly. A general approach is to identify different limits in which certain simplified effective models become exact. Typically these effective models are obtained by restricting the minimization in (3.1) to certain subsets of wave functions $\Psi$. We will now give examples of such effective models both for fermionic and bosonic systems.

One very simple effective model which illustrates this idea is the bosonic Hartree model, in which the minimization defining $E_N$ is restricted to simple tensor products
\[ \Psi = \underbrace{\phi \otimes \cdots \otimes \phi}_{N}, \qquad \phi \in \mathcal H. \tag{4.1} \]
Assuming that $\phi$ is normalized the energy expectation is
\[ (\Psi, H_N\Psi) = N(\phi, h\phi) + \frac{N(N-1)}{2}(\phi \otimes \phi, W\phi \otimes \phi). \]
Finding the optimal $\phi$ is a non-linear variational problem. Note that wave functions of the form (4.1) are Bose condensed. The 1-particle density matrix $\gamma_\Psi$ is $N$ times the 1-dimensional projection onto $\phi$.

For fermions the corresponding approach would be to take a pure antisymmetric state (a Slater determinant)
\[ \Psi = (N!)^{-1/2}\, \phi_1 \wedge \cdots \wedge \phi_N \tag{4.2} \]
where $\phi_1, \dots, \phi_N$ is an orthonormal family in $\mathcal H$. For such a state the 1-particle density matrix $\gamma_\Psi$ is the orthogonal projection onto the space spanned by $\phi_1, \dots, \phi_N$. The energy expectation is
\[ (\Psi, H_N\Psi) = \operatorname{Tr}_{\mathcal H}(\gamma_\Psi h) + \tfrac12 \operatorname{Tr}_{\mathcal H\otimes\mathcal H}[W \gamma_\Psi \wedge \gamma_\Psi] \]
where in general $\gamma \wedge \gamma(f \otimes g) = \gamma f \wedge \gamma g$. We see again that finding the optimal $N$-dimensional projection is a non-linear variational problem. This is the famous Hartree-Fock model.
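The Hartree energy formula $N(\phi, h\phi) + \frac{N(N-1)}{2}(\phi\otimes\phi, W\phi\otimes\phi)$ can be confirmed by brute force in a toy model. The sketch below assumes $\mathcal H = \mathbb R^2$, $N = 3$, a hypothetical one-body matrix `h`, and a hypothetical two-body operator `W` made of a diagonal pair potential plus a multiple of the swap operator (so that it is symmetric under interchange of the factors). It evaluates $(\Psi, H_N\Psi)$ term by term on the full tensor product and compares with the formula, using exact rational arithmetic.

```python
from fractions import Fraction as F
from itertools import product

D, N = 2, 3                                   # toy sizes (assumptions)
h = [[F(1), F(2)], [F(2), F(-1)]]             # hypothetical one-body operator
v = [[F(2), F(1)], [F(1), F(3)]]              # hypothetical symmetric pair potential
c = F(1, 2)                                   # weight of the swap term

def W(i, j, p, q):
    """Matrix element (u_i ⊗ u_j, W u_p ⊗ u_q)."""
    return v[i][j] * (i == p) * (j == q) + c * (i == q) * (j == p)

phi = [F(3, 5), F(4, 5)]                      # normalized: 9/25 + 16/25 = 1
psi = {idx: phi[idx[0]] * phi[idx[1]] * phi[idx[2]]
       for idx in product(range(D), repeat=N)}

def one_body(k):
    """(Psi, h_k Psi) with h acting on tensor factor k."""
    total = F(0)
    for idx in psi:
        for j in range(D):
            jdx = idx[:k] + (j,) + idx[k + 1:]
            total += psi[idx] * h[idx[k]][j] * psi[jdx]
    return total

def two_body(k, l):
    """(Psi, W_kl Psi) with W acting on tensor factors k < l."""
    total = F(0)
    for idx in psi:
        for p in range(D):
            for q in range(D):
                jdx = list(idx)
                jdx[k], jdx[l] = p, q
                total += psi[idx] * W(idx[k], idx[l], p, q) * psi[tuple(jdx)]
    return total

brute = (sum(one_body(k) for k in range(N))
         + sum(two_body(k, l) for k in range(N) for l in range(k + 1, N)))

e1 = sum(phi[i] * h[i][j] * phi[j] for i in range(D) for j in range(D))
e2 = sum(phi[i] * phi[j] * W(i, j, p, q) * phi[p] * phi[q]
         for i in range(D) for j in range(D) for p in range(D) for q in range(D))
formula = N * e1 + F(N * (N - 1), 2) * e2
```

The agreement is exact because every spectator factor contributes $(\phi, \phi) = 1$, which is the whole content of the formula.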
The Slater determinants (4.2) may be characterized as the wave functions that minimize Hamiltonians of the form
\[ \sum_{\alpha,\beta} A_{\alpha,\beta}\, a^*_\alpha a_\beta. \tag{4.3} \]
If the operator $A$ with $A_{\alpha,\beta} = (u_\alpha, Au_\beta)$ has $N$ negative eigenvalues then the minimizer is an $N$-particle Slater determinant. We may generalize the Slater determinants to what we will call quasi-free fermionic wave functions $\Psi \in \mathcal F^F(\mathcal H)$, that are characterized as the minimizers of more general particle non-conserving quadratic Hamiltonians
\[ \sum_{\alpha,\beta} A_{\alpha,\beta}\, a^*_\alpha a_\beta + \sum_{\alpha,\beta} B_{\alpha,\beta}\, a_\alpha a_\beta + \sum_{\alpha,\beta} B^*_{\alpha,\beta}\, a^*_\alpha a^*_\beta. \tag{4.4} \]
Normalized quasi-free fermionic wave functions satisfy the relation
\[ (\Psi, a_1a_2a_3a_4\Psi) = (\Psi, a_1a_2\Psi)(\Psi, a_3a_4\Psi) - (\Psi, a_1a_3\Psi)(\Psi, a_2a_4\Psi) + (\Psi, a_1a_4\Psi)(\Psi, a_2a_3\Psi), \tag{4.5} \]
and its generalizations to higher-order products. Here $a_i$ refers to any fermionic creation or annihilation operator. The expectation of a product of an odd number of creation and annihilation operators vanishes. Thus a normalized quasi-free wave function is determined by the 1-particle density matrix $\gamma_\Psi$ together with the vector $\delta_\Psi \in \mathcal H \wedge \mathcal H$ defined by
\[ (f \otimes g, \delta_\Psi) = (\Psi, a(f)a(g)\Psi). \tag{4.6} \]
In a given basis we may identify $\delta_\Psi$ with the operator having matrix elements $(u_\alpha, \delta u_\beta) = (\Psi, a_\alpha a_\beta\Psi)$. The following relations between $\gamma_\Psi$ and $\delta$ then hold
\[ \gamma_\Psi^2 + |\delta|^2 = \gamma_\Psi, \qquad \gamma_\Psi\delta = \delta\gamma_\Psi^t. \tag{4.7} \]
Conversely any two matrices satisfying these relations determine a quasi-free wave function $\Psi$. The vector $\delta$ represents the presence of virtual pairs, the so-called Cooper pairs, which according to the theory of Bardeen-Cooper-Schrieffer are responsible for the phenomenon of superconductivity. The energy expectation in a quasi-free wave function is
\[ (\Psi, H\Psi) = \operatorname{Tr}_{\mathcal H}(\gamma_\Psi h) + \tfrac12 \operatorname{Tr}_{\mathcal H\otimes\mathcal H}[W\gamma_\Psi \wedge \gamma_\Psi] + \tfrac12(\delta_\Psi, W\delta_\Psi). \]
This is what in [1] was called the generalized Hartree-Fock functional. It follows (see [1]) that if $W$ is a positive operator then the best choice is to take $\delta_\Psi = 0$. If however $W$ is not positive it may be advantageous to have $\delta_\Psi \ne 0$ even though the Hamiltonian is particle conserving.

For Bose systems the situation is more complicated. If the operator $A$ has negative spectrum then the operator (4.3) has no finite grand canonical ground state energy. When restricted to the particle sector $\mathcal H_N$ it has a ground state of the form (4.1).

On the other hand if we consider a bosonic quadratic Hamiltonian of the form (4.4) (assuming that $A$ with $A_{\alpha,\beta} = (u_\alpha, Au_\beta)$ and $B$ with $B_{\alpha,\beta} = (u_\alpha, Bu_\beta)$ are trace class and $A \ge (B^*B)^{1/2} + (BB^*)^{1/2} + cI$ for some $c > 0$) it has a ground state $\Psi \in \mathcal F^B(\mathcal H)$ which we call a quasi-free bosonic wave function. The relation corresponding to (4.5) for quasi-free bosonic wave functions is
\[ (\Psi, a_1a_2a_3a_4\Psi) = (\Psi, a_1a_2\Psi)(\Psi, a_3a_4\Psi) + (\Psi, a_1a_3\Psi)(\Psi, a_2a_4\Psi) + (\Psi, a_1a_4\Psi)(\Psi, a_2a_3\Psi), \tag{4.8} \]
and similarly for higher-order products. Again the expectation of odd products vanishes. Thus the quasi-free bosonic wave functions are also characterized by the 1-particle density matrix $\gamma_\Psi$ and the vector $\delta_\Psi$ defined as in (4.6). For bosonic wave functions the relation between $\gamma$ and $\delta$ is
\[ \gamma_\Psi^2 + \gamma_\Psi = |\delta|^2, \qquad \gamma_\Psi\delta = \delta\gamma_\Psi^t. \tag{4.9} \]
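The quasi-free bosonic wave functions have a large deviation in particle number, in contrast to the states of type (4.1). In Bogolubov's theory one attempts to combine both types of states as follows. Before restricting to quasi-free bosonic wave functions we perform the unitary transformation that maps $a_\alpha$ to $a_\alpha + z_\alpha$ for some $z_\alpha \in \mathbb C$. If we do this transform on the operator $H$ and then calculate the expectation in a quasi-free bosonic wave function we arrive at the energy expectation in a Bogolubov wave function
\[ \begin{aligned} (\Psi, H\Psi) ={}& \operatorname{Tr}_{\mathcal H}(\gamma_\Psi h) + \tfrac12 \operatorname{Tr}_{\mathcal H\otimes\mathcal H}\bigl[W\gamma_\Psi \otimes_{\rm sym} \gamma_\Psi\bigr] + \tfrac12(\delta_\Psi, W\delta_\Psi) \\ &+ \sum_{\alpha,\beta} h_{\alpha\beta}\,\bar z_\alpha z_\beta + \sum_{\alpha,\beta,\mu,\nu} W_{\alpha\beta\mu\nu}\Bigl[\tfrac12\delta_{\mu\nu}\bar z_\alpha\bar z_\beta + \tfrac12\delta_{\mu\nu}z_\alpha z_\beta \\ &\qquad + (\gamma_\Psi)_{\alpha\mu}\bar z_\beta z_\nu + (\gamma_\Psi)_{\alpha\nu}\bar z_\beta z_\mu + \tfrac12\bar z_\alpha\bar z_\beta z_\mu z_\nu\Bigr]. \end{aligned} \tag{4.10} \]
Thus here the energy is a functional in $\{z_\alpha\}$, $\gamma_\Psi$ and $\delta$. We have introduced the notation $\gamma \otimes_{\rm sym} \gamma$ given by
\[ \gamma \otimes_{\rm sym} \gamma\,(f \otimes g) = \gamma f \otimes_{\rm sym} \gamma g. \]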
In the last section we will give an example of a system where this functional gives the correct energy in a certain limit.

5. A fundamental example: The charged Coulomb gas

The most fundamental example of a quantum system is a charged gas in 3 dimensions. Let us take the special situation where the negatively charged particles are all identical and the positively charged particles are also all identical. Let us assume that we have chosen units in which as before $\hbar = 1$ and the mass and charge of the negatively charged particle are also both 1. The mass and charge of the positively charged particle are denoted $m$ and $Z$. The one-particle Hilbert spaces for the negatively and positively charged particles are $\mathcal H_\pm = L^2(\mathbb R^3; \mathbb C^{q_-})$ and $L^2(\mathbb R^3; \mathbb C^{q_+})$ respectively. Here $q_\pm \in \mathbb N \cup \{0\}$ denote the number of spin states for the two types of particles. The Hamiltonian is then
\[ H_{N,M} = \sum_{i=1}^N -\tfrac12\Delta_{x_i} + \sum_{k=1}^M -\frac{1}{2m}\Delta_{R_k} - \sum_{i=1}^N\sum_{k=1}^M \frac{Z}{|x_i - R_k|} + \sum_{1\le i<j\le N} \frac{1}{|x_i - x_j|} + \sum_{1\le k<\ell\le M} \frac{Z^2}{|R_k - R_\ell|}, \tag{5.1} \]
where $x_i \in \mathbb R^3$ denotes the position of the negatively charged particle $i$ and $R_k \in \mathbb R^3$ the position of the positively charged particle $k$. When $N = M = 1$ and $Z = 1$ this describes a hydrogen atom.

It follows rather easily from the Sobolev inequality that this system is stable of the first kind. The more fundamental result is that if, say, the negatively charged particles are fermions then the system is even stable of the second kind. This result is what is known as Stability of Matter. It was originally proved by Dyson and Lenard [4, 5] and a simple proof with realistic bounds was later given by Lieb and Thirring [11]. The proof of Lieb and Thirring is based on a generalization, the Lieb-Thirring inequality, of the Sobolev inequality to antisymmetric functions. Stability of matter holds uniformly in the mass $m$, i.e., even if we let $m = \infty$. Lieb and Lebowitz [8] used the result on stability of matter to show that charged systems have well-defined thermodynamic behavior.

In the fermionic case with $m = \infty$ it was moreover shown by Lieb and Simon [9] that for neutral atoms (i.e., $M = 1$ and $N = Z$) the ground state energy of the Hamiltonian is to leading order given by the Hartree-Fock energy when $Z \to \infty$. Moreover they showed that the Hartree-Fock variational problem has a minimizer.

Stability of the second kind fails if both the negatively and positively charged particles are bosons. This was proved by Dyson in [3]. In fact, Dyson made a precise conjecture about the asymptotic energy for large particle number in the case where both particles are bosons of the same mass and charge. Dyson's conjecture was based on earlier work of Foldy [6] applying Bogolubov's theory as explained in the previous section to a charged system. Dyson's conjecture was proved in the two papers [10, 12].

Theorem 5.1 (Dyson's formula). Let $m = 1$, $Z = 1$, and $q_\pm = 1$ and consider the Hamiltonian $H_{N,M}$ in (5.1) acting in the bosonic Hilbert space $\mathcal H_N^B \otimes \mathcal H_M^B$.
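If we denote the ground state energy by $E_{N,M}$ we have
\[ \lim_{K\to\infty} \min_{N+M=K} \frac{E_{N,M}}{(N+M)^{7/5}} = \inf\left\{ \tfrac12\int_{\mathbb R^3} |\nabla\Phi|^2 - I_0\int_{\mathbb R^3} \Phi^{5/2} \;\Bigm|\; 0 \le \Phi,\ \int_{\mathbb R^3}\Phi^2 = 1 \right\} \tag{5.2} \]
with $I_0$ given by
\[ I_0 = (2/\pi)^{3/4}\int_0^\infty \Bigl(1 + x^4 - x^2\bigl(x^4 + 2\bigr)^{1/2}\Bigr)\,dx = \frac{2^{3/2}\,\Gamma(3/4)}{5\pi^{1/4}\,\Gamma(5/4)}. \]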
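The value of $I_0$ can be cross-checked numerically. The sketch below, in plain Python, evaluates the integral with a midpoint rule after the substitution $x = t/(1-t)$, using the algebraically equivalent integrand $2/\bigl((x^4+2)^{1/2} + x^2\bigr)^2$ to avoid cancellation at large $x$. The comparison value $2^{3/2}\Gamma(3/4)/(5\pi^{1/4}\Gamma(5/4))$ comes from a Beta-function evaluation of the integral carried out for this check, so it is best treated as something to re-derive rather than as a quotation; both numbers come out at about 0.574.

```python
import math

def integrand(x):
    """1 + x^4 - x^2*sqrt(x^4 + 2), written as 2/(sqrt(x^4 + 2) + x^2)^2."""
    s = math.sqrt(x ** 4 + 2.0)
    return 2.0 / (s + x * x) ** 2

def integral(n=20000):
    """Midpoint rule on t in (0, 1) after substituting x = t / (1 - t)."""
    step = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * step
        x = t / (1.0 - t)
        total += integrand(x) / (1.0 - t) ** 2      # dx = dt / (1 - t)^2
    return total * step

i0_numeric = (2.0 / math.pi) ** 0.75 * integral()
i0_closed = 2.0 ** 1.5 * math.gamma(0.75) / (5.0 * math.pi ** 0.25 * math.gamma(1.25))
```

The rewriting of the integrand uses $\bigl((x^4+2)^{1/2} - x^2\bigr)\bigl((x^4+2)^{1/2} + x^2\bigr) = 2$, so that $1 + x^4 - x^2(x^4+2)^{1/2} = \tfrac12\bigl((x^4+2)^{1/2} - x^2\bigr)^2$.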
This theorem immediately implies that stability of the second kind cannot hold since $7/5 > 1$. In the last section we will explain how a Bogolubov wave function as described in the previous section gives rise to the energy asymptotic in Dyson's formula. The details, which prove that Dyson's formula gives an asymptotic upper bound, can be found in [12]. In [10] it was proved that Dyson's formula gives an asymptotic lower bound.

Before explaining how a Bogolubov wave function gives the correct energy we will first explain how, from a mathematical point of view, we can think of the positively and negatively charged particles as being two states of the same particle. In fact, since in Dyson's formula we do not fix the number of negative and positive particles but only the total number, we may think of the charge of each particle as a variable. Moreover, it turns out that if we do not enforce the bosonic symmetry, but minimize over all wave functions, we will get the same energy as if we minimize over bosonic wave functions.

Thus the ground state energy $\min_{N+M=K} E_{N,M}$ is the ground state energy of the system over the Hilbert space $\bigotimes^K_{\rm sym} \mathcal H$, where $\mathcal H = L^2(\mathbb R^3 \times \{1, -1\})$. Here the set $\{1, -1\}$ refers to the values of the charge variable. If we denote the position of particle $i$ by $x_i$ and the charge of particle $i$ by $e_i = \pm1$ we can write the Hamiltonian in the compact form
\[ \sum_{i=1}^K -\tfrac12\Delta_{x_i} + \sum_{1\le i<j\le K} \frac{e_ie_j}{|x_i - x_j|}. \]
6. Bogolubov's method for the charged Bose gas

In this section we will first describe how to construct a Bogolubov wave function that gives the asymptotically correct energy in Dyson's formula. This means we have to choose the $\{z_\alpha\}$, $\gamma = \gamma_\Psi$ and $\delta$ and evaluate (4.10). A Bogolubov wave function will not have fixed particle number, but we will explain how we may change it to a fixed particle number state without significantly changing the energy.

We will choose an orthonormal basis $u_0, u_1, \dots$ and $z_0 \ne 0$, $z_1 = z_2 = \dots = 0$. (In fact, we do not have to explicitly choose $u_1, \dots$.) We will choose $\gamma$ such that $\gamma u_0 = 0$ and such that $\gamma$ has a real integral kernel. We will set $\delta = \sqrt{\gamma(\gamma + 1)}$ in order to satisfy (4.9). The state $u_0$ is chosen to be
\[ u_0(x, e) = 2^{-1/2}\phi_0(x)(\delta_{e,1} + \delta_{e,-1}), \]
where $\phi_0(x) = K^{3/10}\Phi(K^{1/5}x)$, where $\Phi$ is the minimizer in Dyson's formula (5.2) (minimizers exist, but an almost minimizer would also do). Then again $\int\phi_0^2 = 1$.

A straightforward calculation then shows that the energy expectation (4.10) becomes
\[ z_0^2\int(\nabla\phi_0)^2 + \operatorname{Tr}\bigl(-\tfrac12\Delta\gamma\bigr) + 2z_0^2\operatorname{Tr}\Bigl(\mathcal K\bigl(\gamma - \sqrt{\gamma(\gamma + 1)}\bigr)\Bigr) \tag{6.1} \]
where $\mathcal K$ is the operator with integral kernel
\[ \mathcal K(x, y) = \phi_0(x)|x - y|^{-1}\phi_0(y). \tag{6.2} \]
Moreover, the expected particle number in the state $\Psi$ is $2z_0^2 + \operatorname{Tr}(\gamma)$.

In defining $\gamma$ we use the method of coherent states. Let $\chi$ be a non-negative real and smooth function supported in the unit ball in $\mathbb R^3$, with $\int\chi^2 = 1$. Let $K^{-2/5} \ll \ell \ll K^{-1/5}$ and define $\chi_\ell(x) = \ell^{-3/2}\chi(x/\ell)$. We choose
\[ \gamma = (2\pi)^{-3}\int_{\mathbb R^3\times\mathbb R^3} f(u, |p|)\, P^\perp_{\phi_0}|\theta_{u,p}\rangle\langle\theta_{u,p}|P^\perp_{\phi_0}\,du\,dp \]
where $P^\perp_{\phi_0}$ is the projection orthogonal to $\phi_0$,
\[ \theta_{u,p}(x) = \exp(ip\cdot x)\chi_\ell(x - u), \]
and
\[ f(u, |p|) = \frac12\left(\frac{p^4 + 32\pi z_0^2\phi_0(u)^2}{p^2\bigl(p^4 + 64\pi z_0^2\phi_0(u)^2\bigr)^{1/2}} - 1\right). \]
We note that $\gamma$ is a positive trace class operator, $\gamma\phi_0 = 0$, and that $\gamma$ has a real integral kernel.

We use the following version of the Berezin-Lieb inequality [2, 7]. Assume that $\xi(t)$ is an operator concave function on $\mathbb R_+ \cup \{0\}$ with $\xi(0) \ge 0$. Then if $Y$ is a positive semi-definite operator we have
\[ \operatorname{Tr}(Y\xi(\gamma)) \ge (2\pi)^{-3}\int \xi(f(u, |p|))\,\bigl(\theta_{u,p}, P^\perp_{\phi_0}YP^\perp_{\phi_0}\theta_{u,p}\bigr)\,du\,dp. \tag{6.3} \]
We use this for the function $\xi(t) = \sqrt{t(t + 1)}$. Of course, if $\xi$ is the identity function then (6.3) is an identity. If $Y = I$ then (6.3) holds for all concave $\xi$ with $\xi(0) \ge 0$.

Proving an upper bound on the energy expectation (6.1) is thus reduced to the calculations of explicit integrals. After estimating these integrals one arrives at the leading contribution (for large $z_0$)
\[ \begin{aligned} & z_0^2\int(\nabla\phi_0)^2 + \iint\Bigl[f(u, |p|)\Bigl(\tfrac12p^2 + 2z_0^2\phi_0(u)^2\frac{4\pi}{p^2}\Bigr) - \frac{4\pi}{p^2}\,2z_0^2\phi_0(u)^2\sqrt{f(u, |p|)(f(u, |p|) + 1)}\Bigr]\,dp\,du \\ &\quad = z_0^2\int(\nabla\phi_0)^2 - I_0\int(2z_0^2)^{5/4}\phi_0^{5/2}, \end{aligned} \]
where $I_0$ is as in Theorem 5.1. The function $f$ was chosen so as to minimize the above integral. If we choose $z_0 = \sqrt{K/2}$ we get after a simple rescaling that the energy above is $K^{7/5}$ times the right side of (5.2) (recall that $\Phi$ was chosen as the minimizer). We also note that the expected number of particles is
\[ 2z_0^2 + \operatorname{Tr}(\gamma) = K + O(K^{3/5}), \]
as $K \to \infty$.

The only remaining problem is to show how a similar energy could be achieved with a wave function with a fixed number of particles $K$. We indicate this fairly simple argument here. We construct a trial function $\Psi'$ as above, but with an expected particle number $K'$ chosen appropriately close to, but slightly smaller than $K$. More precisely, $K'$ will be smaller than $K$ by an appropriate lower-order correction. It is easy to see then that the mean deviation of the particle number distribution in the state $\Psi'$ is of lower order than $K$. In fact, it is of order $\sqrt{K'} \sim \sqrt K$. Using that we have a good lower bound on the energy $E_{N,M}$ for all $N, M$ and that $\Psi'$ is sharply localized around its mean particle number, we may, without changing the energy expectation significantly, replace $\Psi'$ by a normalized wave function $\Psi$ that only has particle numbers less than $K$. Since the energy is decreasing as a function of particle number we see that the energy expectation in the state $\Psi$ is, in fact, an upper bound to the energy for $K$ particles.

References
[1] Bach, Volker, Lieb, Elliott H. and Solovej, Jan Philip, Generalized Hartree-Fock theory and the Hubbard model, J. Stat. Phys. 76, 3–90 (1994).
[2] Berezin F.A., Izv. Akad. Nauk, ser. mat., 36 (No. 5) (1972). English translation: USSR Izv. 6 (No. 5) (1972) and Berezin F.A., General concept of quantization, Commun. Math. Phys. 40, 153–174 (1975).
[3] Dyson, Freeman J., Ground state energy of a finite system of charged particles, Jour. Math. Phys. 8, 1538–1545 (1967).
[4] Dyson, Freeman J. and Lenard, Andrew, Stability of matter. I, Jour. Math. Phys. 8, 423–434 (1967).
[5] Dyson, Freeman J. and Lenard, Andrew, Stability of matter. II, Jour. Math. Phys. 9, 698–711 (1968).
[6] Foldy, Leslie L., Charged boson gas, Phys. Rev. 124, 649–651 (1961); Errata ibid 125, 2208 (1962).
[7] Lieb, Elliott H., The classical limit of quantum spin systems, Commun. Math. Phys. 31, 327–340 (1973).
[8] Lieb, Elliott H. and Lebowitz, Joel L., The constitution of matter: Existence of thermodynamics for systems composed of electrons and nuclei, Advances in Math. 9, 316–398 (1972).
[9] Lieb, Elliott H. and Simon, Barry, The Hartree-Fock theory for Coulomb systems, Comm. Math. Phys. 53, no. 3, 185–194 (1977).
[10] Lieb, Elliott H. and Solovej, Jan Philip, Ground state energy of the two-component charged Bose gas, Commun. Math. Phys. 252, 485–534 (2004).
[11] Lieb, Elliott H. and Thirring, Walter E., Bound for the kinetic energy of fermions which proves the stability of matter, Phys. Rev. Lett. 35, 687–689 (1975).
[12] Solovej, Jan Philip, Upper Bounds to the Ground State Energies of the One- and Two-Component Charged Bose Gases. To appear in Commun. Math. Phys.

Jan Philip Solovej
Department of Mathematics
University of Copenhagen
4ECM Stockholm 2004 © 2005 European Mathematical Society
The Grothendieck–Teichmüller Group and Galois Theory of the Rational Numbers
– European Network GTEM –

Jakob Stix

Abstract. GTEM is an acronym for Galois Theory and Explicit Methods. A selection from the activities within the network is presented. We focus on nonabelian Galois action, Grothendieck–Teichmüller theory and anabelian geometry. The style is expository and proofs are omitted.
1. The network

The network is based on nodes from eight European countries. Geographically ordered, the participating universities are located in Nottingham, Leiden, Essen, Lille, Bonn, Paris 6, Heidelberg, Besançon, Bordeaux, Lausanne, Barcelona, Rome, and Tel Aviv. They joined under the acronym GTEM, read je t’aime, which abbreviates the common interest of these different research groups: Galois Theory and Explicit Methods. More precisely, research within the network centers around many interesting topics of arithmetic related to $\mathrm{Gal}_\mathbb{Q}$, the absolute Galois group of the rational numbers, as follows:

(1) (Explicit) finite Galois groups over $\mathbb{Q}$.
(2) The Inverse Galois Problem (IGP).
(3) Dessins d’enfants.
(4) Grothendieck–Teichmüller Theory: $\mathrm{Gal}_\mathbb{Q}$ and GT.
(5) Arithmetic of elliptic curves over number fields.
(6) Algorithms in number theory, in particular class field theory.
(7) Differential Galois Theory.
(8) Arithmetic of covers, arithmetic fundamental groups.
(9) (Birational) anabelian geometry.
(10) Miscellaneous: Iwasawa Theory, Invariant Theory, explicit implementations, ...
The research so far has been very lively and has produced many results. The following list of network-related achievements mentions a few, with no claim to completeness. Any omission is mainly due to the ignorance of the author, for which he offers his apologies. In the sequel, members of the GTEM network are typeset in small caps.
(1) Realization of Galois groups by ‘middle convolution’ and ‘parabolic cohomology’ (Dettweiler–Reiter [8], Völklein [41], Dettweiler–Wewers [9]).
(2) Realization of Galois groups via Galois representations (Dieulefait, Vila, Crespo, see [13]).
(3) A combinatorial description of $\mathrm{Gal}_{\mathbb{Q}_p}$ in $\mathrm{Gal}_\mathbb{Q}$ using $p$-adic GT (André [1]).
(4) New GT variants (Harbater–Schneps [17], Hatcher–Lochak–Schneps [18], Nakamura–Schneps [29]).
(5) Proof of the generalized Kochetkov conjecture (Zapponi [42]).
(6) Advances in the absolute Inverse Galois Problem (IGP) (Haran–Jarden–Pop [15]).
(7) Construction of Hurwitz moduli schemes and applications to IGP (Dèbes, Deschamps, Emsalem, Flon, Romagny, see [13]).
(8) Explicit 2- and 3-descent (Cremona–Stoll [5]).
(9) Proof of the differential Abhyankar Conjecture, treatment of the differential IGP (Bertrand, Matzat–van der Put [26] [27], Hartmann [14]).
(10) Unravelling a connection between the Lamé operator and dessins d’enfants (Litcanu [22], Zapponi [43]).
(11) Results in birational (pro-$\ell$) anabelian geometry (Pop [31], Efrat [10], Efrat–Fesenko [11], Koenigsmann [21]).
(12) Proof of anabelian geometry for nonconstant curves in characteristic $p$ (Stix [37]).
(13) A proof of finiteness of isomorphy classes of smooth proper curves over $\mathbb{F}_p^{\mathrm{alg}}$ with given $\pi_1$ (Raynaud [33], Pop–Saïdi [32] [35], Tamagawa [40]).
(14) Description of the stable reduction of $X(p^n)$ (Bouw–Wewers [4]).
(15) Results about reduction / lifting of curves (Green, Lehr–Matignon, Henrio, Mezard, Wewers, Bouw–Wewers, see [13]).
(16) Determination of the asymptotics for certain Galois groups over $\mathbb{Q}$, database of Galois extensions of polynomials over $\mathbb{Q}$ up to degree 15 (Malle [24, 25], Klüners–Malle [20]).

Given the abundance of mathematics contained in this list, we will have to select a small portion and elaborate on these particular results and questions.
Namely, the rest of this article will be devoted to the GT-geometric aspect of GTEM research, unfortunately leaving aside such exciting areas as differential Galois theory, in which great strides have been made (see the articles above, especially [26]). Again, of course, the choice reflects a personal bias. We will stick to the ground field of the rational numbers, because the knowledgeable reader will easily transfer what is said to more general settings or may have a look at the references.
2. Galois action and $\mathbb{P}^1 - \{0, 1, \infty\}$

In his ‘Esquisse d’un programme’ [12], Grothendieck suggests, among other things, that one should try to give a description of the absolute Galois group $\mathrm{Gal}_\mathbb{Q} = \mathrm{Aut}(\mathbb{Q}^{\mathrm{alg}}/\mathbb{Q})$ of the rational numbers by studying its action on the geometry (more precisely: algebraic invariants) of $\mathbb{Q}$-varieties. As good candidates he proposes the geometric fundamental group of (categories of) moduli spaces of curves with marked points. Special attention should be paid to the moduli space of the Riemann sphere with $n \geq 4$ marked points. The Grothendieck–Teichmüller group $\widehat{GT}$, which will be discussed in Section 3, is the simplest incarnation of this idea. By enlarging and modifying the category of $\mathbb{Q}$-varieties under discussion, one gets variants of $\widehat{GT}$ (which might be equal to $\widehat{GT}$) reflecting more and more arithmetic of the rational numbers.

2.1. Abelian actions. Let us first explain what we mean by action on geometry by means of an example. We clearly do not have in mind the natural action, by definition, of $\mathrm{Gal}_\mathbb{Q}$ on the algebraic closure $\mathbb{Q}^{\mathrm{alg}}$ of $\mathbb{Q}$. This action is tautological and cannot serve to clarify the structure of $\mathrm{Gal}_\mathbb{Q}$. But $\mathrm{Gal}_\mathbb{Q}$ also acts on the multiplicative group $\mathbb{G}_m$, a $\mathbb{Q}$-variety, and moreover respects the group structure. Consequently, the tower of ‘multiplication by $n$ maps’

$$\langle \zeta_n \rangle \subset \mathbb{G}_m \xrightarrow{\ n\cdot\ } \mathbb{G}_m \ni 1$$

yields a compatible action of $\mathrm{Gal}_\mathbb{Q}$ on the kernels, the groups $\mu_n = \langle \zeta_n \rangle$ of $n$th roots of unity. At first sight, roots of unity might belong to the world of $\mathbb{Q}^{\mathrm{alg}}$ that we have just rejected, but we may also canonically identify $\mu_n$ with a group of covering automorphisms, which is clearly a geometric group. We deduce an abelian representation $\mathrm{Gal}_\mathbb{Q} \to \mathrm{Aut}(\mu_n)$; altogether we have a geometric description of the cyclotomic character

$$\chi_{\mathrm{cycl}} : \mathrm{Gal}_\mathbb{Q} \to \widehat{\mathbb{Z}}^*.$$

We understand the abelian part of $\mathrm{Gal}_\mathbb{Q}$ by Class Field Theory, which is the main achievement of abelian mathematics in number theory. The result concerning the tower of $\mathbb{G}_m$ is the following:

Theorem 2.1 (Kronecker–Weber). The cyclotomic character identifies the maximal abelian quotient $(\mathrm{Gal}_\mathbb{Q})^{\mathrm{ab}}$ of the absolute Galois group of the rationals with $\widehat{\mathbb{Z}}^*$. The corresponding maximal abelian extension $\mathbb{Q}^{\mathrm{ab}}$ coincides with the maximal cyclotomic extension $\mathbb{Q}(\bigcup_n \zeta_n)$.
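The compatibility of the actions on the groups $\mu_n$ can be checked at finite level. The following is a minimal sketch, not part of the original text: it assumes the standard description of $\mathrm{Aut}(\mu_n)$ as $(\mathbb{Z}/n)^*$ acting by $\zeta_n^k \mapsto \zeta_n^{ak}$, and the function names `units`, `act` and `compatible` are hypothetical. The last lines illustrate Theorem 2.1 numerically on one example: $\sqrt{2}$, which generates an abelian extension of $\mathbb{Q}$, equals $\zeta_8 + \zeta_8^{-1}$.

```python
import cmath
from math import gcd, sqrt

def units(n):
    """Elements of (Z/n)^*, identified with Aut(mu_n)."""
    return [a for a in range(1, n + 1) if gcd(a, n) == 1]

def act(a, k, n):
    """sigma_a sends zeta_n^k to zeta_n^(a*k); return the new exponent."""
    return (a * k) % n

def compatible(n, m):
    """Check that the actions on mu_n and mu_m agree for m dividing n.

    The transition map mu_n -> mu_m raises to the power n/m; on exponents
    (zeta_n^k -> zeta_m^k) it is reduction of k modulo m.
    """
    assert n % m == 0
    return all(act(a, k, n) % m == act(a % m, k % m, m)
               for a in units(n) for k in range(n))

# Illustration of Kronecker-Weber on one example (floating point check):
zeta8 = cmath.exp(2j * cmath.pi / 8)
sqrt2_from_roots = zeta8 + zeta8 ** -1

print(units(12), compatible(12, 4), abs(sqrt2_from_roots - sqrt(2)) < 1e-12)
```

Passing to the limit over all $n$ in such compatible systems is what produces a single element of $\widehat{\mathbb{Z}}^*$ for each $\sigma$.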
2.2. Nonabelian actions. We called the cyclotomic character an abelian representation. Here the nonabelian counterparts are not the representations of higher rank, i.e., $\ell$-adic representations $\mathrm{Gal}_\mathbb{Q} \to \mathrm{GL}_n(\mathbb{Z}_\ell)$, which generalize the example with the multiplicative group to other (systems of finite) abelian group schemes and still exploit the group structure of the geometric object. For us the nonabelian representations are Galois actions on nonabelian groups.

Due to the absence of real paths in the context of algebraic varieties (over $\mathbb{Q}^{\mathrm{alg}}$), Grothendieck generalized the concept of a fundamental group by the algebraic equivalent of the theory of covers. The (étale) fundamental group $\pi_1^{\mathrm{alg}}$ of a connected variety is defined as the pro-finite group of automorphisms of the tower of all finite étale covers, see [36] Exp. V. Thus a variety $X/\mathbb{Q}$ defined over the rationals leads to a $\mathrm{Gal}_\mathbb{Q}$-action on the fundamental group $\pi_1^{\mathrm{alg}}(X \otimes \mathbb{Q}^{\mathrm{alg}})$ of the corresponding geometric variety, the base change $X \otimes \mathbb{Q}^{\mathrm{alg}}$, just because the Galois group acts on the latter and $\pi_1^{\mathrm{alg}}$ is a functor. Note that although $X$ is defined by equations with coefficients from $\mathbb{Q}$, the action will most likely be nontrivial, as the definition of covers may require non-rational algebraic numbers.

The tower of $\mathbb{G}_m$’s discussed above can be identified with the tower of unramified covers of $\mathbb{G}_m \otimes \mathbb{Q}^{\mathrm{alg}}$, and as $\pi_1^{\mathrm{alg}}(\mathbb{G}_m \otimes \mathbb{Q}^{\mathrm{alg}}) = \widehat{\mathbb{Z}}$ we recover again the cyclotomic character

$$\chi_{\mathrm{cycl}} : \mathrm{Gal}_\mathbb{Q} \to \mathrm{Aut}\big(\pi_1^{\mathrm{alg}}(\mathbb{G}_m \otimes \mathbb{Q}^{\mathrm{alg}})\big) = \widehat{\mathbb{Z}}^*.$$
In the nonabelian case we need to be more precise. The functor $\pi_1^{\mathrm{alg}}$ actually depends on a pair consisting of a space together with a base point. If we want to neglect the base point, we pay the price of knowing everything only up to inner automorphisms. We deduce as nonabelian actions homomorphisms

$$\mathrm{Gal}_\mathbb{Q} \to \mathrm{Out}\big(\pi_1^{\mathrm{alg}}(X \otimes \mathbb{Q}^{\mathrm{alg}})\big),$$

where $\mathrm{Out} = \mathrm{Aut}/\mathrm{Inn}$ is the group of outer automorphisms. We call such a map an exterior representation. The first example of a curve with nonabelian fundamental group is certainly $\mathbb{P}^1 - \{0, 1, \infty\}$. Its collection of étale covers was proven to be very rich.

Theorem 2.2 (Belyi [2]). (a) A proper smooth curve $X/\mathbb{C}$, i.e., a compact Riemann surface, is defined over $\mathbb{Q}^{\mathrm{alg}}$ if and only if there exists a map $\beta : X \to \mathbb{P}^1_\mathbb{C}$ with ramification at most above $0, 1, \infty$.
(b) The exterior representation $\mathrm{Gal}_\mathbb{Q} \to \mathrm{Out}\big(\pi_1^{\mathrm{alg}}(\mathbb{P}^1_{\mathbb{Q}^{\mathrm{alg}}} - \{0, 1, \infty\})\big)$ is faithful (meaning the homomorphism is injective).

A map $\beta$ as in (a) is called a Belyi map. Let us deduce (b) from (a). By (a) the elliptic curve $E_j$ with $j$-invariant $j \in \mathbb{Q}^{\mathrm{alg}}$ is the (canonical) smooth compactification of a finite étale cover of the three punctured projective line.
If $\sigma \in \mathrm{Gal}_\mathbb{Q}$ belongs to the kernel of the map in (b) then $E_{\sigma(j)}$ is isomorphic to $E_j$, whence $\sigma(j) = j$ for all $j$ and so (b) follows from (a).

Because the nonabelian Galois action on $\pi_1^{\mathrm{alg}}(\mathbb{P}^1_{\mathbb{Q}^{\mathrm{alg}}} - \{0, 1, \infty\})$ is faithful, we consequently want to know the structure of the latter group. The structure is determined in two steps. First, by a result of Grauert–Remmert/Grothendieck [36] Exp. XII Thm 5.1, namely GAGA for finite étale covers, finite analytic étale covers of an algebraic variety over $\mathbb{C}$ are in fact algebraic. Thus the algebraic fundamental group coincides via analytification ($X \rightsquigarrow X^{\mathrm{an}}$) with the pro-finite completion of the traditional topological fundamental group. Secondly, at least in characteristic 0, the algebraic fundamental group behaves geometrically and does not change under base change of algebraically closed fields, see [36] Exp. X Cor 1.8 (the proof for the proper case given there works also in general if the characteristic is 0). Both isomorphisms taken together imply isomorphisms

$$\widehat{\pi_1^{\mathrm{top}}(X^{\mathrm{an}})} \xrightarrow{\ \cong\ } \pi_1^{\mathrm{alg}}(X \otimes \mathbb{C}) \xrightarrow{\ \cong\ } \pi_1^{\mathrm{alg}}(X \otimes \mathbb{Q}^{\mathrm{alg}}).$$

Hence, $\pi_1^{\mathrm{alg}}(\mathbb{P}^1_{\mathbb{Q}^{\mathrm{alg}}} - \{0, 1, \infty\})$ is a free pro-finite group $\widehat{F}_2$ on two generators $x, y$, or more symmetrically, on three generators $x, y, z$ modulo the relation $xyz = 1$. The elements $x, y, z$ are loops/inertia generators at $0, 1, \infty$ respectively. Therefore Belyi’s Theorem above states that $\mathrm{Gal}_\mathbb{Q}$ acts faithfully on a group as ‘easy’ as $\widehat{F}_2$.

2.3. Dessins d’enfants – children’s drawings. The part of Belyi’s Theorem that we proved above actually only exploits the action of $\mathrm{Gal}_\mathbb{Q}$ on the set of isomorphy classes of finite étale covers of the three punctured projective line. A combinatorial description of these isomorphy classes is obtained through the notion of a dessin d’enfant, see [12].

Definition 2.3. A dessin d’enfant is a CW-structure on a compact, oriented, topological surface together with a bipartite structure, such that attaching maps of 2-cells are covering maps over the interior of 1-cells and finite to one over 0-cells. A bipartite structure here means that the set of vertices (the 0-skeleton) is labeled with labels from $\{0, 1\}$ such that each attaching map of a 1-cell hits vertices of both labels. Isomorphisms of dessins are defined as cellular homeomorphisms of the underlying surface.

The name dessin d’enfant, or dessin for short, stems from the fact that such an object is encoded in the graph on the surface formed by the 1-skeleton: its shape may happen to resemble the masterpieces of our childhood. The picture on the right gives an example of a dessin on the Riemann sphere with black and white vertices corresponding to those labeled 0 and 1 respectively.

To a Belyi map $\beta : X \to \mathbb{P}^1$ we associate the dessin formed by the graph $\beta^{-1}([0, 1])$ on the topological surface underlying $X^{\mathrm{an}}$. The bipartite structure
is given by labeling a vertex $v \in \beta^{-1}(\{0, 1\})$ by $\beta(v)$. One easily checks that this construction identifies isomorphism classes of étale covers of $\mathbb{P}^1 - \{0, 1, \infty\}$ with isomorphism classes of dessins.

The combinatorial description of covers by dessins allows for a more combinatorial analysis of the respective Galois action. The Galois action preserves certain invariants of the dessin: the genus of the surface, the degree (the number of 1-cells), and the valency list (the ramification indices above $0$, $1$ and $\infty$ respectively). At least conjecturally this is only the tip of an iceberg of combinatorial invariants that describe Galois orbits of $\mathrm{Gal}_\mathbb{Q}$ acting on isomorphy classes of dessins. At least we know that $\mathrm{Gal}_\mathbb{Q}$ acts continuously on the finite sets of isomorphy classes of dessins with fixed genus, degree and valency list. Thus there are number fields corresponding to the stabilizers of this action that are mysteriously attached to the combinatorial data of each dessin d’enfant.

A nontrivial result in the spirit of the above discussion is the following (see [42] for the definition of the dessins called ‘Leila flowers’).

Theorem 2.4 (Zapponi [42]). A generalization of the Kochetkov Conjecture is true, in particular: The 24 ‘Leila flowers’ of type $a < b < c < d < e$ form at least two Galois orbits if $abcde(a + b + c + d + e)$ is a square in $\mathbb{Q}$.

The proof uses Strebel differentials (quadratic differentials) and a stratification of the decorated moduli space following Kontsevich and Penner.

Recently, in his Diplomarbeit, Ronkine obtained the following higher-dimensional analogue of Belyi’s Theorem.

Theorem 2.5 (Ronkine [34]). Let $X/\mathbb{C}$ be a smooth, proper surface of general type. Then the birational class of $X$ is defined over $\mathbb{Q}^{\mathrm{alg}}$ if and only if there exists an $X'$ birational to $X$ and a map $\gamma : X' \to \mathbb{P}^1$ of relative dimension 1 with singular locus at most over $\{0, 1, \infty\}$ and either $\gamma$ truly varying or all nonsingular fibres of $\gamma$ defined over $\mathbb{Q}^{\mathrm{alg}}$.
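Returning to the Galois invariants of a dessin listed in Section 2.3 (genus, degree, valency list), the following sketch shows how they can be computed in practice. It rests on an assumption not spelled out in the text: the standard encoding of a connected dessin of degree $d$ by a pair of permutations $(\sigma_0, \sigma_1)$ of its $d$ edges, the images of the generators $x, y$, with $\sigma_\infty = (\sigma_0 \sigma_1)^{-1}$ coming from the relation $xyz = 1$. The genus then follows from the Euler characteristic $V - E + F$. All function names are hypothetical.

```python
def compose(p, q):
    """Permutation 'first q, then p' on {0, ..., d-1}, stored as tuples."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def cycle_type(p):
    """Cycle lengths of p (fixed points included), in decreasing order."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        lengths.append(length)
    return sorted(lengths, reverse=True)

def dessin_invariants(s0, s1):
    """Degree, valency list and genus of the dessin encoded by (s0, s1).

    Assumes s0 and s1 generate a transitive group (connected dessin).
    """
    d = len(s0)
    s_inf = inverse(compose(s0, s1))
    valency = (cycle_type(s0), cycle_type(s1), cycle_type(s_inf))
    # vertices = cycles of s0 and s1, edges = d, faces = cycles of s_inf
    euler = len(valency[0]) + len(valency[1]) - d + len(valency[2])
    return {"degree": d, "valency": valency, "genus": (2 - euler) // 2}

# Star with 4 edges: the dessin of the Belyi map z -> z^4 on the sphere.
star = dessin_invariants((1, 2, 3, 0), (0, 1, 2, 3))
# One black vertex, one white vertex, three edges, one face: genus 1.
torus = dessin_invariants((1, 2, 0), (1, 2, 0))
print(star, torus)
```

Two dessins in the same Galois orbit must return the same dictionary; the converse fails in general, which is the "tip of the iceberg" remark above.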
From Ronkine’s Theorem we deduce that $\mathrm{Gal}_\mathbb{Q}$ acts continuously on the finite set (Geometric Shafarevich Conjecture) of truly varying, smooth, proper curves of genus $g \geq 2$ parametrized by $\mathbb{P}^1_\mathbb{C} - \{0, 1, \infty\}$. To have a higher-dimensional analogue of dessins would be desirable.

3. Grothendieck–Teichmüller theory

So far we have neglected that the category of covers is governed by a group, the algebraic fundamental group. Instead of only exploiting the action on isomorphy classes we now study what the Galois action does with the group structure.

3.1. The group $\widehat{GT}$. In some sense the Galois action is local. It respects inertia groups of boundary components (cusps) and, moreover, acts cyclotomically on inertia generators up to conjugation.
Recall that we identified $\pi_1^{\mathrm{alg}}(\mathbb{P}^1 - \{0, 1, \infty\})$ with the pro-finite free group $\widehat{F}_2$ on generators $x, y$. For $\lambda \in \widehat{\mathbb{Z}}$ and $f \in \widehat{F}_2$ let $\varphi_{\lambda,f}$ be the following endomorphism of $\widehat{F}_2$:

$$\varphi_{\lambda,f} := \begin{cases} x \mapsto x^{\lambda} \\ y \mapsto f^{-1} y^{\lambda} f. \end{cases}$$

Deligne’s method of tangential base points (see [6]) lifts the exterior Galois representation from Belyi’s theorem to an injective homomorphism

$$\mathrm{Gal}_\mathbb{Q} \to \mathrm{Aut}\,\widehat{F}_2$$

mapping $\sigma$ to $\varphi_{\lambda_\sigma, f_\sigma}$, where $\lambda_\sigma = \chi_{\mathrm{cycl}}(\sigma)$ is the cyclotomic character and $f_\sigma$ is a uniquely determined element from the commutator subgroup $\widehat{F}_2'$ of $\widehat{F}_2$.

Following Drinfel’d [7] and Ihara [19], and in fact using results of Lochak–Schneps [23], we define the pro-finite Grothendieck–Teichmüller group (unhistorically) as

$$\widehat{GT} = \left\{ (\lambda, f) \in \widehat{\mathbb{Z}}^* \times \widehat{F}_2' \ \middle|\ \text{I, II, III and } \varphi_{\lambda,f} \in \mathrm{Aut}(\widehat{F}_2) \right\}$$

with the equations

I: $\theta(f) f = 1$,
II: $\omega^2(f x^m)\, \omega(f x^m)\, f x^m = 1$, where $m = \frac{\lambda - 1}{2}$,
III: $\rho^4(\tilde{f})\, \rho^3(\tilde{f})\, \rho^2(\tilde{f})\, \rho(\tilde{f})\, \tilde{f} = 1$ in $\widehat{\Gamma}_{0,5}$.

Here $\theta$ and $\omega$ (resp. $\rho$) are certain automorphisms of $\widehat{F}_2$ (resp. $\widehat{\Gamma}_{0,5}$), see [23]. The original defining equations differ but are equivalent to the above I, II, III. Note that the group structure on $\widehat{GT}$ is not induced from the product $\widehat{\mathbb{Z}}^* \times \widehat{F}_2'$ but stems from composition of the $\varphi_{\lambda,f}$ within $\mathrm{Aut}(\widehat{F}_2)$. Explicitly, $\varphi_{\lambda,f} \circ \varphi_{\mu,g} = \varphi_{\lambda\mu,\, f \varphi_{\lambda,f}(g)}$, as one checks on the generators $x$ and $y$. Let us mention that it is by no means obvious but nevertheless true that $\widehat{GT}$ is actually a pro-finite group.

The theme of GT was discovered by different people from different perspectives. For example, Drinfel’d was led to consider a pro-algebraic version of GT over a field $k$ when studying the universal way to deform associativity and braiding in a quasi-associative, quasi-braided tensor category, see [7]. To summarize the above we note the following theorem.

Theorem 3.1 (Drinfel’d [7], Ihara [19]; Deligne [6]). We have injective group homomorphisms $\mathrm{Gal}_\mathbb{Q} \to \widehat{GT} \subset \mathrm{Aut}\,\widehat{F}_2$ that map $\sigma$ to $(\lambda_\sigma, f_\sigma)$ and thus yield a parametrization of the set of elements of $\mathrm{Gal}_\mathbb{Q}$ by an invertible element of $\widehat{\mathbb{Z}}$ and a pro-word in the letters $x, y$ from the commutator subgroup $\widehat{F}_2'$.

3.2. Moduli of curves. Let $M_{g,n}$ be the moduli space of smooth proper curves of genus $g$ with $n$ ordered marked points. The double ratio defines an isomorphism $M_{0,4} = \mathbb{P}^1 - \{0, 1, \infty\}$ that ultimately will lead us to a change in perspective. First of all note that $\pi_1^{\mathrm{alg}}(M_{g,n} \otimes \mathbb{Q}^{\mathrm{alg}}) = \widehat{\Gamma}_{g,n}$ is the pro-finite completion of the mapping class group. Here $\pi_1^{\mathrm{alg}}(M_{g,n})$ is the fundamental
group in the sense of algebraic stacks (orbifold-$\pi_1$). The group $\widehat{\Gamma}_{g,n}$ inherits a nonabelian action by $\mathrm{Gal}_\mathbb{Q}$ in the usual way by functoriality.

It was observed by Lochak and Schneps that symmetries of the moduli spaces for small values of $(g, n)$ explain the nature of the equations that define $\widehat{GT}$. More precisely, I and II are the shadow of the natural $S_3$-action permuting the cusps of $M_{0,4}$, and the cyclic permutation action on the cusps of $M_{0,5}$ is responsible for equation III, according to the following theorem.

Theorem 3.2 (Lochak–Schneps [23]). The equations I, II, III are nonabelian cocycle equations. The corresponding nonabelian cohomology classes have natural representatives that lead to parameterizations of pairs $(\lambda, f)$ that belong to $\widehat{GT}$.

Underlying these efforts is the hope to arrive at a combinatorial description of $\mathrm{Gal}_\mathbb{Q}$, resp. its image in $\widehat{GT}$, from geometric Galois theory. By imposing the correct additional constraints one hopes to get hold of a variant of $\widehat{GT}$ that actually coincides with $\mathrm{Gal}_\mathbb{Q}$.

3.3. Actions on towers. Let $V$ be a category of smooth varieties over $\mathbb{Q}$. We define a generalized Grothendieck–Teichmüller group to be the pro-finite group of local automorphisms of the functor $\pi_1^{\mathrm{alg}}$ restricted to $V_{\mathbb{Q}^{\mathrm{alg}}}$, that is, the category of base changes $X \otimes \mathbb{Q}^{\mathrm{alg}}$ of varieties from $V$ and maps defined over $\mathbb{Q}$:

$$\widehat{GT}_V := \mathrm{Aut}^{\mathrm{local}}\big(\pi_1^{\mathrm{alg}} : V_{\mathbb{Q}^{\mathrm{alg}}} \to (\text{groups})\big).$$
Here automorphisms of the functor are invertible natural transformations up to inner automorphisms. Locality refers to cyclotomic action up to conjugation on inertia subgroups coming from boundary components of natural compactifications. The natural nonabelian action of $\mathrm{Gal}_\mathbb{Q}$ is compatible with $\mathbb{Q}$-rational maps of $\mathbb{Q}$-varieties and thus induces a natural homomorphism $\mathrm{Gal}_\mathbb{Q} \to \widehat{GT}_V$.

A particular choice of $V$ is the ‘genus 0 Teichmüller tower’ (together with the compactifications by stable curves)

$$T_0 := \left\{ M_{0,n} \ \middle|\ n \geq 4,\ S_n\text{-action on } M_{0,n},\ \text{forgetful maps } M_{0,n+1} \to M_{0,n} \right\}.$$

By the following theorem we recover the classical pro-finite Grothendieck–Teichmüller group in this generalized setting.

Theorem 3.3 (Drinfel’d, Ihara, Harbater–Schneps [17]). There exists a natural isomorphism

$$\widehat{GT}_{T_0} = \widehat{GT}.$$

The full Teichmüller tower consists of

$$T := \left\{ M_{g,n} \ \middle|\ 2 - 2g - n < 0,\ \mathrm{Aut}(M_{g,n}),\ \text{forgetful maps } M_{g,n+1} \to M_{g,n} \right\}$$

(again with compactifications by stable curves). On the corresponding generalized $\widehat{GT}$ we have the following theorem.
Theorem 3.4 (Hatcher–Lochak–Schneps [18], Nakamura–Schneps [29]). There is an equation IV from $M_{1,2}$ that leads to the group $\widehat{GT}^{\mathrm{new}}$ of elements in $\widehat{GT}$ that satisfy equation IV. The new group acts on the algebraic fundamental groups of the full Teichmüller tower $T$ in a natural way such that its image contains the Galois action:

$$\mathrm{Gal}_\mathbb{Q} \subseteq \widehat{GT}^{\mathrm{new}} \subseteq \widehat{GT}_T.$$

It has been announced that $\widehat{GT}^{\mathrm{new}}$ equals $\widehat{GT}_T$, but it is not known whether the other inclusion in Theorem 3.4 is in fact an equality or a strict inclusion.

The proof uses the curve complex of Hatcher–Thurston. The fact that this complex is simply connected allows us to detect sets of equations that generate all equations that an action on the algebraic fundamental groups of the full Teichmüller tower has to satisfy.

What is more, the above theorem is in accordance with Grothendieck’s ‘first two level philosophy’. The ‘first two level philosophy’ predicts that generators for $\widehat{GT}_T$ come from cases of modular dimension $3g - 3 + n$ equal to 1, namely $M_{0,4}$ and $M_{1,1}$, whereas equations are generated by equations originating in modular dimension 2, namely $M_{0,5}$ and $M_{1,2}$.

3.4. Arithmetic in $\widehat{GT}$. Still the guiding question is the following. How close is $\mathrm{Gal}_\mathbb{Q}$ actually to $\widehat{GT}$? One way to decide whether the two groups fail to coincide consists of disproving group-theoretic properties of $\mathrm{Gal}_\mathbb{Q}$ for $\widehat{GT}$.

What is more, $\mathrm{Gal}_\mathbb{Q}$ is not just a group, it is the group of the arithmetic of the integers. As arithmetic content of $\mathrm{Gal}_\mathbb{Q}$ itself we can consider the family of (conjugacy classes of) decomposition subgroups $\mathrm{Gal}_{\mathbb{Q}_p}$ (resp. $\mathrm{Gal}_\mathbb{R}$) of $\mathrm{Gal}_\mathbb{Q}$ that are parametrized by the finite (resp. infinite) places of $\mathbb{Q}$. These decomposition subgroups are canonically isomorphic to the absolute Galois groups of the completions of $\mathbb{Q}$ at the respective places. It is an important result of F.K. Schmidt and Neukirch that these decomposition subgroups are group-theoretically characteristic among the set of all closed subgroups.

Now, if $\widehat{GT}$ coincides with or at least is close to $\mathrm{Gal}_\mathbb{Q}$, then we should be able to describe arithmetic in $\widehat{GT}$ by geometric Galois theory. Here arithmetic in $\widehat{GT}$ means conjugacy classes of subgroups related to a place of $\mathbb{Q}$.

Let us reconsider how we obtained our knowledge of the group-theoretic structure of $\pi_1^{\mathrm{alg}}(\mathbb{P}^1_{\mathbb{Q}^{\mathrm{alg}}} - \{0, 1, \infty\})$. We used $\mathbb{C}$-analytic methods and knowledge about the topological fundamental group, which could as well be considered as the $\mathbb{C}$-analytic $\pi_1$. This kind of geometry is certainly related to the infinite place of $\mathbb{Q}$. So we are led to think that the various places of $\mathbb{Q}$ should lead to arithmetic within $\widehat{GT}$ through the different ways to complete $\mathbb{Q}$ and then do analysis.
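The bookkeeping behind the ‘first two level philosophy’ of Section 3.3 is elementary and can be checked by enumeration. The sketch below, with the hypothetical helper name `teichmueller_levels`, lists all pairs $(g, n)$ satisfying the stability condition $2 - 2g - n < 0$ whose modular dimension $3g - 3 + n$ lies between 1 and a given bound.

```python
def teichmueller_levels(max_dim):
    """Group the stable pairs (g, n) by modular dimension 3g - 3 + n.

    Only dimensions 1, ..., max_dim are reported. The search ranges are
    large enough: 3g - 3 + n <= max_dim forces g <= max_dim + 1 and
    n <= max_dim + 3.
    """
    levels = {}
    for g in range(0, max_dim + 2):
        for n in range(0, max_dim + 4):
            if 2 - 2 * g - n < 0:          # hyperbolicity / stability
                dim = 3 * g - 3 + n
                if 1 <= dim <= max_dim:
                    levels.setdefault(dim, []).append((g, n))
    return levels

print(teichmueller_levels(2))
```

For `max_dim = 2` this returns exactly the four moduli spaces named in the text: $M_{0,4}$ and $M_{1,1}$ in dimension 1, $M_{0,5}$ and $M_{1,2}$ in dimension 2.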
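Let us check for the infinite place whether we can detect arithmetic in $\widehat{GT}$. We have inclusions

$$\Gamma^{\mathrm{top}}_{0,4} := \pi_1^{\mathrm{top}}(M_{0,4}^{\mathrm{an}}) \subset \pi_1^{\mathrm{alg}}(M_{0,4}) = \widehat{\Gamma}_{0,4},$$

and a compatible nonabelian action $\mathrm{Gal}_\mathbb{R} \to \mathrm{Out}(\Gamma^{\mathrm{top}}_{0,4})$ of the subgroup $\mathrm{Gal}_\mathbb{R} \subset \mathrm{Gal}_\mathbb{Q}$. Let $\overline{\mathrm{Out}(\Gamma^{\mathrm{top}}_{0,4})}$ be the closure of the image under the canonical map $\mathrm{Out}(\Gamma^{\mathrm{top}}_{0,4}) \to \mathrm{Out}(\widehat{\Gamma}_{0,4})$ induced by pro-finite completion. Then we have the following theorem.

Theorem 3.5 (André [1] Thm 3.3.1). The intersection $\overline{\mathrm{Out}(\Gamma^{\mathrm{top}}_{0,4})} \cap \mathrm{Gal}_\mathbb{Q}$ inside $\mathrm{Out}\,\widehat{\Gamma}_{0,4}$ coincides with $\mathrm{Gal}_\mathbb{R}$.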
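The endomorphisms $\varphi_{\lambda,f}$ of Section 3.1 and the action of $\mathrm{Gal}_\mathbb{R}$ can be made concrete in the discrete free group $F_2 = \Gamma^{\mathrm{top}}_{0,4}$. The following is only a sketch under stated assumptions: $\lambda$ is an ordinary integer rather than an element of $\widehat{\mathbb{Z}}$, words are stored as tuples of (generator, exponent) pairs, and complex conjugation is taken to correspond to the pair $(\lambda, f) = (-1, 1)$, a standard fact that the text above does not state. The helper names are hypothetical.

```python
def reduce(word):
    """Freely reduce a word given as a sequence of (generator, exponent)."""
    out = []
    for g, e in word:
        if e == 0:
            continue
        if out and out[-1][0] == g:
            e += out.pop()[1]          # merge with the previous syllable
            if e != 0:
                out.append((g, e))
        else:
            out.append((g, e))
    return tuple(out)

def mul(u, v):
    return reduce(u + v)

def inv(u):
    return tuple((g, -e) for g, e in reversed(u))

def phi(lam, f):
    """The endomorphism x -> x^lam, y -> f^-1 y^lam f of the free group."""
    f = reduce(f)
    def apply(word):
        out = ()
        for g, e in word:
            if g == "x":
                out += (("x", lam * e),)
            else:
                out += inv(f) + (("y", lam * e),) + f
        return reduce(out)
    return apply

x, y = (("x", 1),), (("y", 1),)
z = inv(mul(x, y))                     # so that x y z = 1
comm = mul(mul(x, y), mul(inv(x), inv(y)))

# Assumed model of complex conjugation: (lambda, f) = (-1, 1).
conj = phi(-1, ())
# It sends z to x^-1 z^-1 x, a conjugate of z^lambda with lambda = -1.
z_image_ok = conj(z) == mul(mul(inv(x), inv(z)), x)
# Composition law: phi_{l,f} o phi_{m,g} = phi_{l*m, f * phi_{l,f}(g)}.
lhs = phi(3, comm)(phi(-1, comm)(mul(y, x)))
rhs = phi(-3, mul(comm, phi(3, comm)(comm)))(mul(y, x))
print(z_image_ok, lhs == rhs)
```

The check on `z` is a finite shadow of the locality property of Section 3.1: inertia generators are sent to conjugates of their $\lambda$-th powers. For a general integer $\lambda$ the map is only an endomorphism of $F_2$; invertibility requires passing to $\widehat{F}_2$ with $\lambda \in \widehat{\mathbb{Z}}^*$.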
The theorem suggests that under the canonical map $\widehat{GT} \to \mathrm{Out}(\widehat{\Gamma}_{0,4})$ the preimage of $\overline{\mathrm{Out}(\Gamma^{\mathrm{top}}_{0,4})}$ reflects the arithmetic at the infinite place.

For analysis at $p$ we choose to do $\mathbb{C}_p$-analytic geometry in the sense of Berkovich. But what is the right fundamental group here to replace the question mark in the following table?

    infinite place:      completion $\mathbb{R}$      ↔   $\mathbb{C}$-analytic: $\pi_1^{\mathrm{top}}$
    finite place $p$:    completion $\mathbb{Q}_p$    ↔   $\mathbb{C}_p$-analytic (Berkovich): ?
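3.5. The tempered fundamental group. Let $X$ be a variety over $\mathbb{Q}$. For a fixed isomorphism of fields $\mathbb{C} \cong \mathbb{C}_p$, André defines the tempered $\pi_1$ of the Berkovich $\mathbb{C}_p$-analytic space $X \otimes \mathbb{C}_p$, a topological group together with a homomorphism

$$\pi_1^{\mathrm{temp}}(X \otimes \mathbb{C}_p) \to \pi_1^{\mathrm{alg}}(X \otimes \mathbb{Q}^{\mathrm{alg}})$$

that identifies $\pi_1^{\mathrm{alg}}$ with the pro-finite completion of $\pi_1^{\mathrm{temp}}$. The tempered fundamental group classifies ‘topological by finite étale’ $\mathbb{C}_p$-analytic covers. In a sense, it captures the reduction behavior of covers mod $p$. Namely, for an elliptic curve $E/\mathbb{Q}$ we can compute the tempered fundamental group of the underlying $\mathbb{C}_p$-analytic space as follows:

$$\pi_1^{\mathrm{temp}}(E \otimes \mathbb{C}_p) = \begin{cases} \widehat{\mathbb{Z}} \times \widehat{\mathbb{Z}} & E \text{ good at } p, \\ \mathbb{Z} \times \widehat{\mathbb{Z}} & E \text{ bad at } p. \end{cases}$$

The definition of the tempered fundamental group arose from the study of $p$-adic differential equations. But it also serves for the following analogue of Theorem 3.5. Let $\overline{\mathrm{Out}(\Gamma^{\mathrm{temp}}_{0,4})}$ be the closure of the image under the canonical map $\mathrm{Out}(\Gamma^{\mathrm{temp}}_{0,4}) \to \mathrm{Out}(\widehat{\Gamma}_{0,4})$ induced by pro-finite completion.

Theorem 3.6 (André [1] Thm 7.2.1). The intersection $\overline{\mathrm{Out}(\Gamma^{\mathrm{temp}}_{0,4})} \cap \mathrm{Gal}_\mathbb{Q}$ inside $\mathrm{Out}\,\widehat{\Gamma}_{0,4}$ coincides with $\mathrm{Gal}_{\mathbb{Q}_p}$.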
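If we moreover define tempered analogues of the generalized Grothendieck–Teichmüller groups associated to a category $V$ of $\mathbb{Q}$-varieties by

$$GT^{\mathrm{temp}}_V := \mathrm{Aut}^{\mathrm{local}}\big(\pi_1^{\mathrm{alg}} : V_{\mathbb{C}_p} \to (\text{groups})\big)$$

and let $\widehat{GT}_p$ be the closure of the image of $GT^{\mathrm{temp}}_{T_0} \to \widehat{GT}_{T_0}$, then we obtain more precisely the following result.

Theorem 3.7 (André [1] Thm 8.7.1).

$$\mathrm{Gal}_{\mathbb{Q}_p} = \widehat{GT}_p \cap \mathrm{Gal}_\mathbb{Q} \subset \widehat{GT}.$$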
Hence, we are also able to detect arithmetic at a finite prime within the Grothendieck–Teichmüller group.

4. Anabelian geometry

Anabelian geometry deals with the geometry encoded in the algebraic fundamental group of a variety, as well as the arithmetic encoded in the corresponding outer Galois representations. To my knowledge, the following conjecture of anabelian nature was first raised as a question by Ihara and later given the status of a conjecture by Oda–Matsumoto. It is now a theorem of Pop, which relies on birational anabelian results but remains unpublished to date.

Theorem 4.1 (Pop). The Ihara/Oda–Matsumoto Conjecture holds. Namely, let $V$ be the category of all smooth varieties over $\mathbb{Q}$. Then $\mathrm{Gal}_\mathbb{Q} \to \widehat{GT}_V$ is an isomorphism.

A direct consequence of André’s studies of the tempered fundamental group leads to the following local version of the conjecture.

Theorem 4.2 (André [1] Thm. 9.2.2). Let $V$ be the category of all smooth varieties over $\mathbb{Q}$ as above. Then $\mathrm{Gal}_{\mathbb{Q}_p} \to GT^{\mathrm{temp}}_V$ is an isomorphism.

One may think of these two theorems as astonishing facts because they claim to give an alternative, non-tautological description of the absolute Galois group of the rationals together with its arithmetic. However, first of all the category $V$ above is too large to lead to anything manageable. And secondly, to give the list of varieties in $V$ one still needs to write down equations with rational numbers. Nevertheless, the results guide us to look for interesting $V$ that allow for a concrete combinatorial/geometric/group-theoretical description of $\widehat{GT}_V$ and thus of $\mathrm{Gal}_\mathbb{Q}$. According to Pop, in his approach it is even possible to work with complements of rational hyperplane arrangements in $\mathbb{P}^2$. He then uses Ronkine’s theorem (Theorem 2.5) combined with a clever covering trick to give, in principle, a complex analytic description of the corresponding $\widehat{GT}_V$. The use of rational numbers to describe $\mathrm{Gal}_\mathbb{Q}$ has disappeared!

The anabelian methods of Pop yield even stronger results: it is sufficient to work with pro-$\ell$ completions for a fixed prime $\ell$.
4.1. Birational pro-$\ell$ anabelian geometry. In birational anabelian geometry one deals with absolute Galois groups of function fields. A major breakthrough in anabelian geometry towards Grothendieck’s conjectures in the field was achieved by Pop in the ’90s. Let $K^{\mathrm{ins}}$ denote the purely inseparable closure of a field $K$ (in particular $K^{\mathrm{ins}} = K$ if the characteristic is 0), and let $\mathrm{Isom}^i$ denote isomorphisms of perfect fields up to powers of the Frobenius automorphism (again, if the characteristic is 0 disregard the $i$).

Theorem 4.3 (Pop [30]). Let $K, L$ be infinite, finitely generated fields. Then the natural map

$$\mathrm{Isom}^i(L^{\mathrm{ins}}, K^{\mathrm{ins}}) \to \mathrm{Isom}^{\mathrm{out}}(\mathrm{Gal}_K, \mathrm{Gal}_L)$$

is a bijection.

The birational pro-$\ell$ anabelian conjecture asks for a stronger and more geometric result about pro-$\ell$ completed absolute Galois groups $\mathrm{Gal}_K^{\wedge \ell}$. Let $k$ be an algebraically closed field and $\ell$ a prime different from the characteristic. Then the conjecture claims the following.

Conjecture 4.4. Let $K/k$, $L/k$ be function fields of transcendence degree exceeding 1. Then the natural map

$$\mathrm{Isom}^i(L^{\mathrm{ins}}, K^{\mathrm{ins}}) \to \mathrm{Isom}^{\mathrm{out}}(\mathrm{Gal}_K^{\wedge \ell}, \mathrm{Gal}_L^{\wedge \ell})$$

is a bijection.

This conjecture goes back to Bogomolov and there are already some articles devoted to it. The claim of the conjecture includes in particular the case of $k = \mathbb{C}$, hence a question of complex analysis rather than arithmetic geometry!

4.2. Anabelian phenomena over $\mathbb{F}_p^{\mathrm{alg}}$. So far there has been progress on the conjecture above only in the case of $k$ being the algebraic closure of a finite field. In this case, proofs have recently been given by Bogomolov–Tschinkel for the case of transcendence degree 2 under some further assumptions, and by Pop in the general case (again $k = \mathbb{F}_p^{\mathrm{alg}}$).

Theorem 4.5 (Bogomolov–Tschinkel [3]; Pop [31]). Let $k$ be the algebraic closure of a finite field of characteristic different from $\ell$. Let $K/k$, $L/k$ be function fields of transcendence degree exceeding 1. Then the natural map

$$\mathrm{Isom}^i(L^{\mathrm{ins}}, K^{\mathrm{ins}}) \to \mathrm{Isom}^{\mathrm{out}}(\mathrm{Gal}_K^{\wedge \ell}, \mathrm{Gal}_L^{\wedge \ell})$$

is a bijection.

For a detailed and carefully written survey see the Bourbaki talk by Szamuely [38]. An important step in the proof consists in the treatment of the local theory. By that we mean characterizing decomposition groups, or better, identifying valuations of the field purely from the data of Galois groups. Here one also has results by Ware, Efrat, Efrat–Fesenko, and Koenigsmann.

Let us now turn to more geometric anabelian phenomena. If we replace the absolute Galois group of the function field of a variety by the algebraic
fundamental group of the variety itself, then we get a relatively small quotient that nevertheless still contains room for anabelian geometry.

Let us consider smooth proper curves over an algebraically closed field. In characteristic 0 the respective fundamental group is isomorphic to the pro-finite completion of the topological fundamental group of the corresponding Riemann surface. Hence it does not vary in geometric fibres of a connected family. However, when the characteristic is positive, some covers cease to exist and it turns out that the fundamental group is a very subtle invariant.

Theorem 4.6 (Raynaud [33], Pop/Saïdi [32], Tamagawa [40]). Let $\pi$ be a pro-finite group not isomorphic to $\widehat{\mathbb{Z}}_p \times \prod_{\ell \neq p} \widehat{\mathbb{Z}}_\ell^{\times 2}$ for any prime $p$. Then there are only finitely many isomorphy classes of smooth projective curves of genus $g$ over the algebraic closure of a finite field whose algebraic fundamental group is isomorphic to the given $\pi$.

The exceptions correspond to the infinity of ordinary elliptic curves over the algebraic closure of $\mathbb{F}_p$. All three articles cited above for this theorem contain a wealth of beautiful mathematics. They first of all exploit Raynaud’s $\Theta$-divisor in the Jacobian of the curve and how this divisor behaves in families. The central idea is to compare the number of elementary abelian $p$ covers of cyclic prime-to-$p$ covers of the curve. This number is encoded in the group theory of the fundamental group. When this number is maximal we call the cyclic cover new-ordinary, and this property is obstructed by torsion points on $\Theta$. This leads to general questions of torsion points on divisors of abelian varieties and thus to other ingredients: a generalized Anderson–Indik theorem and Hrushovski’s theorem on relative Mordell–Lang. To get this setup up and running in the general case, delicate studies of families of abelian varieties are necessary. Among the interesting things proven along the way there is a ‘new-Torelli theorem’ (Tamagawa [40]) stating that a family of curves is trivial if and only if a certain family of generalized Prym varieties is trivial.

4.3. Anabelian curves: with Galois action. Of course, we conjecture not only finiteness in Theorem 4.6 but that the sets of $\mathbb{F}_p^{\mathrm{alg}}$-points of $M_g$ with fixed prescribed fundamental group coincide with the orbits under Frobenius. This is obviously the strongest form possible. So far, to get uniqueness up to Frobenius we have to exploit the arithmetic of the Galois action and, unfortunately, also restrict to non-constant curves.
Following earlier results due to Tamagawa for affine, hyperbolic curves over finite fields and over fields finitely generated over $\mathbb{Q}$, see [39], and then soon afterwards by Mochizuki for hyperbolic curves over sub-$p$-adic fields, see [28], we also have the following theorem.
Theorem 4.7 (Stix [37]). Let $k$ be an infinite but finitely generated field of positive characteristic $p$. Let $X$ and $X'$ be smooth, hyperbolic (i.e., with negative
Euler characteristic) curves over $k$, such that $X \otimes k^{\mathrm{alg}}$ is not defined over $\mathbb{F}_p^{\mathrm{alg}}$. Then the following holds.
(1) $X$ and $X'$ have isomorphic exterior Galois representations on $\pi_1^{\mathrm{alg}}$ if and only if there is a purely inseparable map $X \to X'$ (or vice versa).
(2) The canonical map $\mathrm{Aut}_k(X) \to \mathrm{Out}_{\mathrm{Gal}_k}\, \pi_1^{\mathrm{alg}}(X \otimes k^{\mathrm{alg}})$ is an isomorphism of finite groups.
References
[1] André, Y., On a geometric description of $\mathrm{Gal}(\overline{\mathbb{Q}}_p/\mathbb{Q}_p)$, and a $p$-adic avatar of $\widehat{GT}$, Duke Math. J. 119 (2003), 1–39.
[2] Belyi, G.V., Galois extensions of a maximal cyclotomic field, Izv. Akad. Nauk SSSR Ser. Mat. 43 (1979), no. 2, 267–276, 479, English transl. in: Math. USSR-Izv. 14 (1980).
[3] Bogomolov, F., Tschinkel, Y., Reconstruction of function fields, arXiv: math.AG/0303075.
[4] Bouw, I., Wewers, S., Stable reduction of modular curves, in: Modular curves and abelian varieties, Progr. Math. 224, Birkhäuser, Basel, 2004, 1–22.
[5] Cremona, J.E., Stoll, M., Minimal models for 2-coverings of elliptic curves, LMS J. Comput. Math. 5 (2002), 220–243 (electronic).
[6] Deligne, P., Le groupe fondamental de la droite projective moins trois points, in: Galois groups over Q, Publ. MSRI 16 (1989), 79–298.
[7] Drinfel'd, V.G., On quasitriangular quasi-Hopf algebras and on a group that is closely connected with $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$, Algebra i Analiz 2 (1990), 149–181; translated in: Leningrad Math. J. 2 (1991), 829–860.
[8] Dettweiler, M., Reiter, S., An algorithm of Katz and its application to the inverse Galois problem. Algorithmic methods in Galois theory. J. Symbolic Comput. 30 (2000), no. 6, 761–798.
[9] Dettweiler, M., Wewers, S., Variation of local systems and parabolic cohomology, arXiv: math.AG/0310139.
[10] Efrat, I., The local correspondence over absolute fields: an algebraic approach, Internat. Math. Res. Notices 23 (2000), 1213–1223.
[11] Efrat, I., Fesenko, I.B., Fields Galois-equivalent to a local field of positive characteristic, Math. Res. Lett. 6 (1999), 345–356.
[12] Grothendieck, A., Esquisse d'un programme, in: Geometric Galois actions 1, eds. L. Schneps, P. Lochak, London Math. Soc. Lecture Note 242, 5–48, Cambridge Univ. Press, 1997, with an English translation on pp. 243–283.
[13] GTEM preprint server, http://www.math.leidenuniv.nl//gtem/view.php
[14] Hartmann, J., On the Inverse Problem in Differential Galois Theory, thesis, Heidelberg 2002, http://www.ub.uni-heidelberg.de/archiv/3085.
[15] Haran, D., Jarden, M., Pop, F., P-adically projective groups as absolute Galois groups, available for download: GTEM-113.ps, August 2004.
[16] Henrio, Y., Relèvement galoisien des revêtements de courbes nodales, Manuscripta Math. 106 (2001), no. 2, 131–150.
[17] Harbater, D., Schneps, L., Fundamental groups of moduli and the Grothendieck-Teichmüller group. Trans. Amer. Math. Soc. 352 (2000), no. 7, 3117–3148.
[18] Hatcher, A., Lochak, P., Schneps, L., On the Teichmüller tower of mapping class groups. J. Reine Angew. Math. 521 (2000), 1–24.
[19] Ihara, Y., Braids, Galois groups, and some arithmetic functions, in: Proceedings of the ICM, Kyoto, Japan 1990, Mathematical Society of Japan, 1991.
[20] Klüners, J., Malle, G., A database for field extensions of the rationals, LMS J. Comput. Math. 4 (2001), 182–196 (electronic).
[21] Koenigsmann, J., Encoding valuations in absolute Galois groups, in: Valuation theory and its applications, Vol. II (Saskatoon, SK, 1999), 107–132, Fields Inst. Commun., 33, Amer. Math. Soc., Providence, RI, 2003.
[22] Litcanu, R., Lamé operators with finite monodromy – a combinatorial approach, J. Differential Equations 207 (2004), 93–116.
[23] Lochak, P., Schneps, L., A cohomological approach of the Grothendieck–Teichmüller group, Inventiones Math. 127 (1997), 571–600.
[24] Malle, G., On the distribution of Galois groups, J. Number Theory 92 (2002), no. 2, 315–329.
[25] Malle, G., On the distribution of Galois groups II, Experiment. Math. 13 (2004), no. 2, 129–135.
[26] Matzat, B.H., van der Put, M., Iterative differential equations and the Abhyankar conjecture, J. Reine Angew. Math. 557 (2003), 1–52.
[27] Matzat, B.H., van der Put, M., Constructive differential Galois theory, in: Galois groups and fundamental groups, 425–467, Math. Sci. Res. Inst. Publ., 41, Cambridge Univ. Press, Cambridge, 2003.
[28] Mochizuki, S., The local pro-p anabelian geometry of curves, Inventiones Math. 138 (2), (1999), 319–423.
[29] Nakamura, H., Schneps, L., On a subgroup of the Grothendieck-Teichmüller group acting on the tower of profinite Teichmüller modular groups, Invent. Math. 141 (2000), no. 3, 503–560.
[30] Pop, F., On Grothendieck's conjecture of birational anabelian geometry, Ann. of Math. (2) 139 (1994), 145–182.
[31] Pop, F., Pro-$\ell$ birational anabelian geometry over algebraically closed fields I, arXiv: math.AG/0307076.
[32] Pop, F., Saïdi, M., On the specialization homomorphism of fundamental groups of curves in positive characteristic, in: Galois groups and fundamental groups, 107–118, MSRI Publ., 41, Cambridge Univ. Press, Cambridge, 2003.
[33] Raynaud, M., Sur le groupe fondamental d'une courbe complète en caractéristique p > 0, in: Arithmetic fundamental groups and noncommutative algebra (Berkeley, CA, 1999), 335–351, Proc. Sympos. Pure Math., 70, Amer. Math. Soc., Providence, RI, 2002.
[34] Ronkine, I., Eine höherdimensionale Variante des Satzes von Belyi, Bonn 2003, Diplomarbeit.
[35] Saïdi, M., On complete families of curves with a given fundamental group in positive characteristic, arXiv: math.AG/0305120.
[36] Grothendieck, A., Mme. Raynaud, M., Revêtements Étales et Groupe Fondamental (SGA 1), LNM 224, Springer, 1971.
[37] Stix, J., Projective anabelian curves in positive characteristic and descent theory for log étale covers, thesis, Bonner Mathematische Schriften 354 (2002).
[38] Szamuely, T., Groupes de Galois de corps de type fini [d'après Pop], Séminaire Bourbaki no. 923, Astérisque 294 (2004), 403–431.
[39] Tamagawa, A., The Grothendieck conjecture for affine curves, Compositio Mathematica 109 (1997), 135–194.
[40] Tamagawa, A., Finiteness of isomorphism classes of curves in positive characteristic with prescribed fundamental group, J. Algebraic Geometry 13 (2004), 675–724.
[41] Völklein, H., A transformation principle for covers of $\mathbb{P}^1$, J. Reine Angew. Math. 534 (2001), 156–168.
[42] Zapponi, L., Fleurs, arbres et cellules: un invariant galoisien pour une famille d'arbres, Compositio Math. 122 (2000), 113–133.
[43] Zapponi, L., Some arithmetic properties of Lamé operators with dihedral monodromy, arXiv: math.NT/0403287.
Jakob Stix
Mathematisches Institut der Universität Bonn
Plenary Speakers
4ECM Stockholm 2004 © 2005 European Mathematical Society
Hydrodynamic Limits
François Golse
Abstract. This article reviews recent progress on the derivation of the fundamental PDE models in fluid mechanics from the Boltzmann equation.
1. Introduction
The subject of hydrodynamic limits goes back to the work of the founders of the kinetic theory of gases, J. Clerk Maxwell and L. Boltzmann. At a time when the existence of atoms was controversial, kinetic theory could explain how to estimate the size of a gas molecule from macroscopic data such as the viscosity of the gas. Later, D. Hilbert formulated the question of hydrodynamic limits as a mathematical problem, giving an example in his 6th problem on the axiomatization of physics [25]. In Hilbert's own words "[. . . ] Boltzmann's work on the principles of mechanics suggests the problem of developing mathematically the limiting processes [. . . ] which lead from the atomistic view to the laws of motion of continua". Hilbert himself attacked the problem in [26], as an application of his own work on integral equations.
We should mention that there are several interpretations of what is meant by "the atomistic view" in Hilbert's problem. One can start from molecular dynamics (i.e., the $N$-body problem of classical mechanics with elastic collisions, assuming all bodies to be spherical and of equal mass); another possibility is to start from the Boltzmann equation of the kinetic theory of gases (which is what Hilbert himself did in [26]). However, one should be aware that the Boltzmann equation is not itself a "first principle" of physics, but is a low density limit of molecular dynamics, which can be considered as a first principle within the theory of classical, nonrelativistic mechanics.
The problem of hydrodynamic limits is to obtain rigorous derivations of macroscopic models such as the fundamental PDEs of fluid mechanics from a microscopic description of matter, either molecular dynamics or the kinetic theory of gases. The situation can be summarized by the following diagram:
\[
\begin{array}{ccc}
\text{MOLECULAR DYNAMICS} & \longrightarrow & \text{KINETIC THEORY} \\
 & \searrow \qquad \swarrow & \\
 & \text{HYDRODYNAMICS} &
\end{array}
\]
First, we recall that a rigorous derivation of the Boltzmann equation from molecular dynamics on short time intervals (i.e., the horizontal arrow in the diagram above) was obtained by O.E. Lanford in [30]. Hence, although not a first principle itself, the Boltzmann equation can be rigorously derived from first principles and therefore has more physical legitimacy than phenomenological models (such as lattice gases). On the other hand, “formal” derivations of the Euler system for compressible fluids from molecular dynamics were discussed by C.B. Morrey in [37]. Later on, S.R.S. Varadhan and his collaborators considered stochastic variants of molecular gas dynamics and obtained rigorous derivations of macroscopic PDE models from these variants: see for instance [49] and the references therein, notably [39]. In the present work, we shall mostly restrict our attention to derivations of the fundamental PDEs of fluid mechanics from the Boltzmann equation. Perhaps the most complete result in this direction is the derivation of the Navier-Stokes equations for incompressible flows from the Boltzmann equation. Indeed, unlike in the case of other hydrodynamic models, this derivation is valid for all physically admissible data, without any restriction on the regularity or the size of the solutions considered. We conclude this presentation with a quick survey of other recent results and open problems on hydrodynamic limits of kinetic models.
2. The Navier-Stokes equations
The Navier-Stokes equations govern incompressible flows of a viscous fluid. In the sequel, we only consider the case of a fluid with constant density that can be set equal to 1 without loss of generality. The unknown is the velocity field $u \equiv u(t,x) \in \mathbb{R}^3$, where $t \in \mathbb{R}_+$ and $x \in \mathbb{R}^3$ are the time and space variables. In the absence of external forces (such as electromagnetic forces, gravity, ...) the velocity field $u$ satisfies
\[
\operatorname{div}_x u = 0 , \qquad \partial_t u + (u \cdot \nabla_x) u + \nabla_x p = \nu \Delta_x u , \tag{2.1}
\]
where $\nu > 0$ is a constant called the "kinematic viscosity". Here, the notation $(u \cdot \nabla_x) u$ designates the parallel derivative of $u$ along itself, whose coordinates are given by
\[
((u \cdot \nabla_x) u)_i := \sum_{j=1}^{3} u_j \frac{\partial u_i}{\partial x_j} .
\]
In physical terms, the first equality in (2.1) is the incompressibility condition, while the second equality is the motion equation – i.e., Newton’s second law of motion applied to an infinitesimal volume of the fluid.
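Observe that, for any $C^1$ divergence-free vector field $v$ on $\mathbb{R}^3$,
\[
((v \cdot \nabla_x) v)_i = \sum_{j=1}^{3} v_j \frac{\partial v_i}{\partial x_j} = \sum_{j=1}^{3} \frac{\partial (v_i v_j)}{\partial x_j} =: (\operatorname{div}_x (v \otimes v))_i .
\]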
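The identity above can be checked numerically at a sample point. The following is a minimal sketch, not part of the original argument: the field $v = (\sin x_2, \sin x_3, \sin x_1)$, the sample point and the finite-difference step are illustrative assumptions, chosen only because this field is smooth and divergence-free.

```python
import math

def v(x, y, z):
    """Illustrative divergence-free field (an assumption for this sketch):
    each component is independent of its own coordinate."""
    return (math.sin(y), math.sin(z), math.sin(x))

def partial(f, point, j, h=1e-5):
    """Central finite difference of the scalar function f along coordinate j."""
    plus, minus = list(point), list(point)
    plus[j] += h
    minus[j] -= h
    return (f(*plus) - f(*minus)) / (2 * h)

def component(i):
    """Scalar function (x, y, z) -> v_i(x, y, z)."""
    return lambda x, y, z: v(x, y, z)[i]

def product_component(i, j):
    """Scalar function (x, y, z) -> v_i v_j."""
    return lambda x, y, z: v(x, y, z)[i] * v(x, y, z)[j]

def advective_term(point):
    """((v . grad) v)_i = sum_j v_j d(v_i)/dx_j."""
    vp = v(*point)
    return [sum(vp[j] * partial(component(i), point, j) for j in range(3))
            for i in range(3)]

def divergence_form(point):
    """(div (v tensor v))_i = sum_j d(v_i v_j)/dx_j."""
    return [sum(partial(product_component(i, j), point, j) for j in range(3))
            for i in range(3)]

point = (0.3, -1.1, 0.7)
lhs = advective_term(point)
rhs = divergence_form(point)
max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_gap)  # small, limited only by finite-difference error
```

For a field that is not divergence-free the two expressions differ by $v\,\operatorname{div}_x v$, which is why the hypothesis matters.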
The expression $\operatorname{div}_x(v \otimes v)$ defines a (vector-valued) distribution on $\mathbb{R}^3$ if $v \in L^2(\mathbb{R}^3)$, and it coincides with $(v \cdot \nabla_x) v$ if $v$ is divergence-free and of class $C^1$ on $\mathbb{R}^3$. This remark justifies the following notion of weak solution of the Navier-Stokes equations.
Definition 2.1. A weak solution of the Navier-Stokes equations is a vector field¹ $u \in C(\mathbb{R}_+; w\text{-}L^2(\mathbb{R}^3; \mathbb{R}^3))$ which satisfies
\[
\operatorname{div}_x u = 0 , \qquad \partial_t u + \operatorname{div}_x (u \otimes u) - \nu \Delta_x u = -\nabla_x p , \tag{2.2}
\]
in the sense of distributions on $\mathbb{R}_+^* \times \mathbb{R}^3$, for some $p \in \mathcal{D}'(\mathbb{R}_+^* \times \mathbb{R}^3)$.
In fact, the term $-\nabla_x p$ is the Lagrange multiplier associated to the constraint $\operatorname{div}_x u = 0$. In other words, the motion equation in (2.2) should be viewed as
\[
\partial_t u + \operatorname{div}_x (u \otimes u) - \nu \Delta_x u = 0 \quad \text{modulo gradient fields.}
\]
After these preliminaries, we can state Leray's existence result of a global weak solution for the incompressible Navier-Stokes equations.
Theorem 2.2 (J. Leray [31]). For each $u^{in} \in L^2(\mathbb{R}^3; \mathbb{R}^3)$ such that $\operatorname{div}_x u^{in} = 0$, there exists a weak solution of the Navier-Stokes equations satisfying the initial data $u|_{t=0} = u^{in}$. Moreover, this solution verifies the "energy inequality"
\[
\tfrac12 \int_{\mathbb{R}^3} |u(t,x)|^2 \, dx + \nu \int_0^t \int_{\mathbb{R}^3} |\nabla_x u(s,x)|^2 \, dx \, ds \le \tfrac12 \int_{\mathbb{R}^3} |u^{in}(x)|^2 \, dx \tag{2.3}
\]
for each $t > 0$.
Notice that the scalar function $p \equiv p(t,x)$ (the pressure) is not an unknown in the Navier-Stokes equations, since it is defined (modulo a constant) in terms of $u$ by the relation $-\Delta_x p = \operatorname{div}_x ((u \cdot \nabla_x) u)$.
Whether Leray solutions of the Navier-Stokes equations are uniquely determined by their initial data is still unknown. Likewise, it is still unknown whether any Leray solution of the Navier-Stokes equations with smooth initial data remains smooth for all subsequent times. However, if the Cauchy problem (2.1) has a smooth solution $u$ with $\nabla_x u \in L^\infty(\mathbb{R}_+ \times \mathbb{R}^3)$, any Leray solution of (2.1) with the same initial data must coincide with $u$.
Observe that, for smooth solutions of the Navier-Stokes equations decaying sufficiently fast as $|x| \to +\infty$, the energy inequality (2.3) is in fact an equality, as can be seen by taking the scalar product of both sides of the motion equation in (2.1) with $u$ and integrating over $[0,t] \times \mathbb{R}^3$.
¹ The notation $w\text{-}L^p$ designates the $L^p$ space endowed with its weak topology.
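The formal computation behind the energy equality for smooth, rapidly decaying solutions can be written out as follows. This is only a sketch: it assumes enough decay at infinity that every integration by parts produces no boundary term, and all integrals are over $\mathbb{R}^3$.

```latex
\begin{aligned}
\frac{d}{dt}\,\frac12\int |u|^2\,dx
  &= \int u\cdot\partial_t u\,dx
   = -\int u\cdot(u\cdot\nabla_x)u\,dx
     -\int u\cdot\nabla_x p\,dx
     +\nu\int u\cdot\Delta_x u\,dx ,\\[2pt]
\int u\cdot(u\cdot\nabla_x)u\,dx
  &= \int u\cdot\nabla_x\bigl(\tfrac12|u|^2\bigr)\,dx
   = -\int (\operatorname{div}_x u)\,\tfrac12|u|^2\,dx = 0 ,\\[2pt]
\int u\cdot\nabla_x p\,dx
  &= -\int (\operatorname{div}_x u)\,p\,dx = 0 ,\\[2pt]
\nu\int u\cdot\Delta_x u\,dx
  &= -\nu\int |\nabla_x u|^2\,dx .
\end{aligned}
```

Integrating the resulting identity $\frac{d}{dt}\frac12\int |u|^2\,dx = -\nu\int|\nabla_x u|^2\,dx$ over $[0,t]$ gives (2.3) with equality. Both vanishing terms use the incompressibility condition $\operatorname{div}_x u = 0$.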
3. The Boltzmann equation
In kinetic theory, the dynamics of a gas of (like) hard spheres is described by the Boltzmann equation. It governs the evolution of the number density $F \equiv F(t,x,v) \ge 0$, the 1-particle phase-space density of the gas molecules at time $t$. In other words, $F(t,x,v)$ is the density at time $t \ge 0$ (with respect to the Lebesgue measure $dx\,dv$ in $\mathbb{R}^3 \times \mathbb{R}^3$) of the gas molecules located at the position $x \in \mathbb{R}^3$ that have velocity $v \in \mathbb{R}^3$. In the absence of external forces (such as electromagnetic forces, gravity, ...) the Boltzmann equation for $F$ is
\[
\partial_t F + v \cdot \nabla_x F = C(F) \tag{3.1}
\]
where $C(F)$ is the Boltzmann collision integral. Collisions other than binary are neglected in the Boltzmann equation, and these collisions are viewed as purely instantaneous and local. Indeed, in the kinetic theory of gases, the molecular radius is neglected everywhere in the description of the collision process except in the expression of the scattering cross-section. An important consequence of these physical assumptions is that $C$ is a bilinear operator acting only on the $v$-variable in $F$. For a gas of hard spheres, the collision integral is given by the expression²
\[
C(F)(v) = \iint_{\mathbb{R}^3 \times \mathbb{S}^2} \bigl( F(v') F(v'_*) - F(v) F(v_*) \bigr) \, |v - v_*| \, dv_* \, d\sigma , \tag{3.2}
\]
where the velocities $v'$ and $v'_*$ are defined in terms of $v, v_* \in \mathbb{R}^3$ and $\sigma \in \mathbb{S}^2$ by
\[
\begin{aligned}
v' \equiv v'(v, v_*, \sigma) &= \tfrac12 (v + v_*) + \tfrac12 |v - v_*| \sigma , \\
v'_* \equiv v'_*(v, v_*, \sigma) &= \tfrac12 (v + v_*) - \tfrac12 |v - v_*| \sigma .
\end{aligned} \tag{3.3}
\]
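The collision rule (3.3) conserves momentum and kinetic energy for every unit vector $\sigma$. A minimal numerical check is sketched below; the helper names `post_collision` and `random_unit_vector`, the fixed seed and the sampling ranges are assumptions made for this illustration only.

```python
import math
import random

def post_collision(v, v_star, sigma):
    """Velocities v', v'_* given by (3.3) for a unit vector sigma."""
    rel = math.sqrt(sum((a - b) ** 2 for a, b in zip(v, v_star)))
    v_prime = tuple(0.5 * (a + b) + 0.5 * rel * s
                    for a, b, s in zip(v, v_star, sigma))
    v_star_prime = tuple(0.5 * (a + b) - 0.5 * rel * s
                         for a, b, s in zip(v, v_star, sigma))
    return v_prime, v_star_prime

def random_unit_vector(rng):
    """Uniform point on the unit sphere via normalized Gaussian samples."""
    while True:
        w = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm > 1e-12:
            return tuple(c / norm for c in w)

def sq(w):
    """Squared Euclidean norm."""
    return sum(c * c for c in w)

rng = random.Random(0)
momentum_gap = 0.0
energy_gap = 0.0
for _ in range(1000):
    v = tuple(rng.uniform(-2.0, 2.0) for _ in range(3))
    v_star = tuple(rng.uniform(-2.0, 2.0) for _ in range(3))
    vp, vsp = post_collision(v, v_star, random_unit_vector(rng))
    # v' + v'_* = v + v_*   (momentum)
    momentum_gap = max(momentum_gap,
                       max(abs(a + b - c - d)
                           for a, b, c, d in zip(vp, vsp, v, v_star)))
    # |v'|^2 + |v'_*|^2 = |v|^2 + |v_*|^2   (energy)
    energy_gap = max(energy_gap, abs(sq(vp) + sq(vsp) - sq(v) - sq(v_star)))

print(momentum_gap, energy_gap)  # both at rounding-error level
```

Algebraically, the energy identity follows from $|v'|^2 + |v'_*|^2 = \tfrac12|v+v_*|^2 + \tfrac12|v-v_*|^2$, which does not depend on $\sigma$.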
Perhaps the most important result on the structure of the Boltzmann collision integral is
Boltzmann's H Theorem. Assume that $F \equiv F(v) > 0$ a.e. is rapidly decaying and such that $\ln F$ has polynomial growth as $|v| \to +\infty$. Then
\[
R(F) = - \int_{\mathbb{R}^3} C(F) \ln F \, dv \ge 0 .
\]
Moreover, the following conditions are equivalent: $R(F) = 0 \Leftrightarrow C(F) = 0$ a.e. $\Leftrightarrow$ $F$ is a Maxwellian, i.e., there exist $\rho, \theta > 0$ and $u \in \mathbb{R}^3$ such that
\[
F(v) = M_{(\rho,u,\theta)}(v) := \frac{\rho}{(2\pi\theta)^{3/2}} \, e^{-\frac{|v-u|^2}{2\theta}} \quad \text{a.e. in } v \in \mathbb{R}^3 .
\]
From the physical viewpoint, the nonnegative quantity $R(F)$ represents the entropy production rate.
² In formula (3.2), the molecular radius is chosen as the unit of length.
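One direction of the equivalence in the H Theorem is elementary: for a Maxwellian, the gain and loss terms in the integrand of (3.2) cancel pointwise, because $\ln M_{(\rho,u,\theta)}$ is a combination of $1$, $v$ and $|v|^2$, all conserved by (3.3). The sketch below checks this at one collision configuration and contrasts it with a non-Maxwellian density; the parameter values and the comparison density $e^{-|v|^4}$ are illustrative assumptions.

```python
import math

def post_collision(v, v_star, sigma):
    """Velocities v', v'_* given by (3.3) for a unit vector sigma."""
    rel = math.sqrt(sum((a - b) ** 2 for a, b in zip(v, v_star)))
    v_prime = tuple(0.5 * (a + b) + 0.5 * rel * s
                    for a, b, s in zip(v, v_star, sigma))
    v_star_prime = tuple(0.5 * (a + b) - 0.5 * rel * s
                         for a, b, s in zip(v, v_star, sigma))
    return v_prime, v_star_prime

def maxwellian(rho, u, theta):
    """Return the function v -> M_(rho,u,theta)(v)."""
    norm = rho / (2.0 * math.pi * theta) ** 1.5
    def density(v):
        dist2 = sum((a - b) ** 2 for a, b in zip(v, u))
        return norm * math.exp(-dist2 / (2.0 * theta))
    return density

def quartic(v):
    """A non-Maxwellian comparison density (illustrative): exp(-|v|^4)."""
    return math.exp(-sum(c * c for c in v) ** 2)

def gain_minus_loss(F, v, v_star, sigma):
    """Pointwise integrand factor F(v')F(v'_*) - F(v)F(v_*) from (3.2)."""
    vp, vsp = post_collision(v, v_star, sigma)
    return F(vp) * F(vsp) - F(v) * F(v_star)

v, v_star, sigma = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)
M = maxwellian(1.0, (0.3, -0.2, 0.1), 1.5)

maxwellian_gap = gain_minus_loss(M, v, v_star, sigma)
quartic_gap = gain_minus_loss(quartic, v, v_star, sigma)
print(maxwellian_gap)  # zero up to rounding
print(quartic_gap)     # exp(-1/2) - exp(-1), clearly nonzero
```

This only illustrates that Maxwellians annihilate the integrand; the converse implication in the theorem requires the full entropy argument.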
All hydrodynamic limits of the kinetic theory of gases considered in the present work bear on solutions of the Boltzmann equation that are fluctuations of some uniform Maxwellian state. We henceforth choose this uniform equilibrium state to be
\[
M = M_{(1,0,1)} \quad \text{(the centered, reduced Gaussian distribution)}
\]
without loss of generality. The size of the number density fluctuations around the equilibrium state $M$ will be measured in terms of the relative entropy of the number density relative to $M$, whose definition is recalled below.
Definition 3.1. Given two measurable functions $f \ge 0$ and $g > 0$ a.e. on $\mathbb{R}^3 \times \mathbb{R}^3$, the relative entropy of $f$ relative to $g$ is
\[
H(f|g) = \iint_{\mathbb{R}^3 \times \mathbb{R}^3} \left( f \ln \frac{f}{g} - f + g \right) dx \, dv \ge 0 .
\]
(Notice that the integrand is a nonnegative measurable function, so that the integral is a well-defined element of $[0, +\infty]$.)
In [15], R. DiPerna and P.-L. Lions defined the following notion of a weak solution of the Boltzmann equation.
Definition 3.2. A renormalized solution of the Boltzmann equation (3.1) is a nonnegative function $F \in C(\mathbb{R}_+; L^1_{loc}(\mathbb{R}^3 \times \mathbb{R}^3))$ such that
\[
\frac{C(F)}{1+F} \in L^1_{loc}(\mathbb{R}_+ \times \mathbb{R}^3 \times \mathbb{R}^3) ,
\]
and that satisfies the equality
\[
(\partial_t + v \cdot \nabla_x) \ln(1 + F) = \frac{C(F)}{1+F}
\]
in the sense of distributions on $\mathbb{R}_+^* \times \mathbb{R}^3 \times \mathbb{R}^3$.
The motivation for this definition is that the collision integral acts as the convolution of $F$ with itself in the $v$ variable, and as a pointwise product in the $t$ and $x$ variables. Since the natural estimates for solutions of the Boltzmann equation are bounds on
\[
\iint_{\{|x| \le r\} \times \mathbb{R}^3} (1 + |v|^2) F(t,x,v) \, dx \, dv ,
\]
the collision integral $C(F)$ may not be defined as a distribution on $\mathbb{R}_+^* \times \mathbb{R}^3 \times \mathbb{R}^3$ for such $F$'s. But the expression $\frac{C(F)}{1+F}$ is homogeneous of degree one for $F$ large, and happens to be well defined for any number density $F$ that satisfies the natural bounds for solutions of the Boltzmann equation.
Theorem 3.3 (P.-L. Lions [33]). For each $F^{in} \ge 0$ a.e. such that $H(F^{in}|M) < +\infty$, there exists a renormalized solution $F$ of the Boltzmann equation (3.1) with initial data $F|_{t=0} = F^{in}$. This renormalized solution satisfies, for each $t > 0$, the "entropy inequality"
\[
H(F(t)|M) + \int_0^t \int_{\mathbb{R}^3} R(F)(s,x) \, dx \, ds \le H(F^{in}|M) . \tag{3.4}
\]
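The nonnegativity of the relative entropy in Definition 3.1, which is what makes the left-hand side of (3.4) a sum of nonnegative terms, comes from a pointwise inequality: $f \ln(f/g) - f + g \ge 0$ with equality exactly when $f = g$. A minimal sketch of a numerical check follows; the function name `entropy_integrand`, the convention $0 \ln 0 = 0$ and the sampling grid are assumptions made for this illustration.

```python
import math

def entropy_integrand(f, g):
    """Pointwise integrand f*ln(f/g) - f + g of the relative entropy,
    for f >= 0 and g > 0, with the usual convention 0*ln(0) = 0."""
    if g <= 0 or f < 0:
        raise ValueError("expected f >= 0 and g > 0")
    if f == 0:
        return g
    return f * math.log(f / g) - f + g

# Sample f in [0, 5] and g in (0, 5] on a grid of step 0.1.
samples = [(f / 10, g / 10) for f in range(0, 51) for g in range(1, 51)]
min_value = min(entropy_integrand(f, g) for f, g in samples)
min_off_diagonal = min(entropy_integrand(f, g) for f, g in samples if f != g)

print(min_value)         # attained on the diagonal f == g
print(min_off_diagonal)  # strictly positive
```

The inequality is the convexity statement $z \ln z - z + 1 \ge 0$ for $z = f/g \ge 0$, multiplied by $g > 0$.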
If $F$ is a smooth solution of the Boltzmann equation that satisfies the assumptions of Boltzmann's H Theorem for all $t > 0$ and converges to $M$ as $|x| \to +\infty$ rapidly enough, the entropy inequality (3.4) is in fact an equality. This fact alone suggests that there is a deep analogy between Leray solutions of the Navier-Stokes equations in 3 space dimensions and renormalized solutions of the Boltzmann equation. In fact, as we shall see below, Leray's theory can be seen as asymptotic to the DiPerna-Lions theory of renormalized solutions in some appropriate hydrodynamic limit.
4. From Boltzmann to Navier-Stokes
The incompressible Navier-Stokes equations can be formally derived from the Boltzmann equation as follows. According to Hilbert's prescription [26] for the hydrodynamic limit of the Boltzmann equation leading to the Euler system for compressible fluids, the solution of the Boltzmann equation is sought as a formal series
\[
F_\epsilon(t,x,v) = M_{(1, \epsilon u(\epsilon^2 t, \epsilon x), 1)}(v) + \sum_{n \ge 2} \epsilon^n F_n(\epsilon^2 t, \epsilon x, v)
\]
where $u$ solves the incompressible Navier-Stokes equations (2.1) and $F_n$ depends on $t$ and $x$ through $\nabla^k_{t,x} u$, $k = 0, \ldots, n$. In other words, the incompressible Navier-Stokes equations are derived from the Boltzmann equation in a regime of small, slowly varying fluctuations of number density about a uniform Maxwellian state, which, in the present case, is chosen to be the centered reduced Gaussian distribution $M = M_{(1,0,1)}$.
This formal argument was discussed by Y. Sone in [47] for the steady problem, and by C. Bardos, F. Golse and C.D. Levermore [3] for the evolution problem (this latter reference also treated the case of an external conservative force leading to a coupling with a drift-diffusion equation for the temperature field). Later, a rigorous derivation based on a truncated variant of Hilbert's formal solution above, following a method originally used by R. Caflisch for the compressible Euler limit of the Boltzmann equation (see [11]), was sketched by A. DeMasi, R. Esposito and J. Lebowitz in [13]. However, this derivation has the same shortcomings as the original Caflisch method: first, it gives solutions of the Boltzmann equation that fail to be everywhere nonnegative³ and therefore lose physical meaning. Also, this derivation holds only on the time interval on which the limiting solution of the Navier-Stokes equations is smooth. As mentioned above, based on current knowledge of the Navier-Stokes equations, we do not know whether this method leads to a derivation of the Navier-Stokes equations that is valid globally in time.
³ R. Esposito informed the author that this could probably be remedied by supplementing Hilbert's formal solution with initial layer terms, as done by Lachowicz [28] in the context of the compressible Euler limit; however, there is no written account of this so far.
However, if one gives up the idea of working with Hilbert's formal solution and uses instead an energy method based on intrinsic quantities pertaining to the theory of Boltzmann's equation (essentially the relative entropy and the entropy production), one arrives at the following global result.
Theorem 4.1. Let $u^{in} \in L^2(\mathbb{R}^3; \mathbb{R}^3)$ be such that $\operatorname{div}_x u^{in} = 0$. For each $\epsilon > 0$, let $F_\epsilon \equiv F_\epsilon(t,x,v)$ be a renormalized solution of the Boltzmann equation (3.1) with initial data
\[
F_\epsilon(0,x,v) = M_{(1, \epsilon u^{in}(\epsilon x), 1)}(v) .
\]
Then the family of vector fields $u_\epsilon \equiv u_\epsilon(t,x) \in \mathbb{R}^3$ defined by
\[
u_\epsilon(t,x) = \frac{1}{\epsilon} \int_{\mathbb{R}^3} v \, F_\epsilon\!\left( \frac{t}{\epsilon^2}, \frac{x}{\epsilon}, v \right) dv
\]
is weakly relatively compact in $L^1_{loc}(\mathbb{R}_+ \times \mathbb{R}^3; \mathbb{R}^3)$ and each of its limit points as $\epsilon \to 0$ is a Leray solution of the incompressible Navier-Stokes equations (2.1) with initial data $u^{in}$ and viscosity
\[
\nu = \tfrac15 D^*\bigl( v \otimes v - \tfrac13 |v|^2 I \bigr) , \tag{4.1}
\]
where $D^*$ is the Legendre dual of the Dirichlet form of the collision integral $C$ linearized at $M$. The Dirichlet form of the collision integral linearized at $M$ is easily found to be
\[
D(\Phi) = \frac18 \iiint_{\mathbb{R}^3 \times \mathbb{R}^3 \times \mathbb{S}^2} |\Phi + \Phi_* - \Phi' - \Phi'_*|^2 \, |v - v_*| \, M M_* \, dv \, dv_* \, d\sigma .
\]
(Here $\Phi_*$, $\Phi'$ and $\Phi'_*$ designate resp. $\Phi(v_*)$, $\Phi(v')$ and $\Phi(v'_*)$, where $v'$ and $v'_*$ are defined in (3.3).) The formula above holds for $\Phi \in C_c(\mathbb{R}^3_v; M_3(\mathbb{R}))$, with $|\cdot|$ denoting the Hilbert-Schmidt norm on matrices:
\[
|A|^2 = \operatorname{trace}(A^T A) , \quad A \in M_3(\mathbb{R}) .
\]
It can be extended to the form domain of the linearized collision integral, which is $L^2((1 + |v|) M \, dv)$.
Remark. The definition of $u_\epsilon$ consists in intertwining the evolution of the Boltzmann equation with the invariance group of the Navier-Stokes equations: we recall that, if $u \equiv u(t,x)$ is a solution of the Navier-Stokes equations, then $T_\lambda u : (t,x) \mapsto \lambda u(\lambda^2 t, \lambda x)$ is also a solution of the Navier-Stokes equations for each $\lambda > 0$.
The theorem above was proved by F. Golse and L. Saint-Raymond [22] in the case of Maxwell molecules; the extension to all hard potentials with Grad's
cutoff assumption (including the hard sphere case described in the present paper) can be found in [23].
A general strategy for proving global hydrodynamic limits leading to incompressible models was proposed by C. Bardos, F. Golse and C.D. Levermore [5]. This method was based on a priori bounds deduced from the entropy inequality together with some appropriate compactness results. In [5], the incompressible Navier-Stokes limit was obtained under two additional assumptions which, at the time, were left unverified. In addition, only the stationary case was considered in [5]: indeed, high frequency oscillations in time due to the presence of acoustic waves may destroy the compactness of number density fluctuations as $\epsilon \to 0$.
Subsequently, several intermediate results were obtained on this limit. In [34], P.-L. Lions and N. Masmoudi succeeded in controlling the acoustic waves, and proved a result analogous to Theorem 4.1 under the same unverified assumptions as in [5]. In [18], F. Golse and C.D. Levermore went further in the direction of a complete proof by observing that the local conservation laws of momentum and energy could be recovered in the limit $\epsilon \to 0$ instead of being postulated on the renormalized solutions of the Boltzmann equation for each $\epsilon > 0$, as was done in [5]. At the same time, L. Saint-Raymond was able to prove the Navier-Stokes limit for the BGK model of the Boltzmann equation [43], [44]. These contributions contained one important idea used in the proof of Theorem 4.1.
Finally, we should also mention that C. Bardos and S. Ukai [7] obtained a complete derivation of the Navier-Stokes equations for the Boltzmann equation in the case of small initial data for the Navier-Stokes equations. In contrast with the strategy outlined in [5], the proof by Bardos and Ukai rests on the spectral analysis of the linearized equation, instead of energy bounds and compactness estimates. Unlike Theorem 4.1, this method cannot be applied to initial data of arbitrary size.
5. Sketch of the convergence proof
First, we recast the Boltzmann equation (3.1) in the hydrodynamic time and space variables. In other words, consider the relative number density fluctuation $g_\epsilon$ defined by
\[
g_\epsilon(t,x,v) = \frac{F_\epsilon\!\left( \frac{t}{\epsilon^2}, \frac{x}{\epsilon}, v \right) - M(v)}{\epsilon M(v)} , \quad \text{where } M(v) = \frac{1}{(2\pi)^{3/2}} e^{-\frac{|v|^2}{2}} . \tag{5.1}
\]
In terms of $g_\epsilon$, the Boltzmann equation (3.1) becomes
\[
\epsilon \partial_t g_\epsilon + v \cdot \nabla_x g_\epsilon + \frac{1}{\epsilon} L g_\epsilon = Q(g_\epsilon, g_\epsilon) , \tag{5.2}
\]
where the linearized collision operator $L$ and the quadratic operator $Q$ are defined in terms of the collision integral $C$ by the formulas
\[
L g = -M^{-1} DC[M](M g) , \qquad Q(g,g) = \tfrac12 M^{-1} D^2 C[M](M g, M g) . \tag{5.3}
\]
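The passage from (3.1) to (5.2) is a change of variables followed by an exact Taylor expansion of the quadratic operator $C$ at $M$. The computation can be sketched as follows; the auxiliary names $s$ and $y$ for the original (microscopic) time and space variables are introduced only for this illustration.

```latex
\begin{aligned}
&F_\epsilon(s,y,v) = M(v)\bigl(1+\epsilon\, g_\epsilon(\epsilon^2 s,\epsilon y,v)\bigr)
 \quad\text{by (5.1), with } t=\epsilon^2 s,\ x=\epsilon y ,\\[2pt]
&(\partial_s + v\cdot\nabla_y)F_\epsilon
 = M\bigl(\epsilon^3\,\partial_t g_\epsilon+\epsilon^2\, v\cdot\nabla_x g_\epsilon\bigr) ,\\[2pt]
&C(M+\epsilon M g_\epsilon)
 = \underbrace{C(M)}_{=0} + \epsilon\, DC[M](M g_\epsilon)
   + \tfrac12\epsilon^2\, D^2C[M](M g_\epsilon, M g_\epsilon)
 = -\epsilon\, M L g_\epsilon + \epsilon^2 M\, Q(g_\epsilon,g_\epsilon) .
\end{aligned}
```

Equating the last two lines and dividing by $\epsilon^2 M$ gives $\epsilon\,\partial_t g_\epsilon + v\cdot\nabla_x g_\epsilon = -\frac1\epsilon L g_\epsilon + Q(g_\epsilon,g_\epsilon)$, which is (5.2). The expansion of $C$ stops at second order because $C$ is quadratic, and $C(M) = 0$ by the H Theorem.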
Notice that, since $F_\epsilon$ is a renormalized solution of (3.1), its fluctuation $g_\epsilon$ does not satisfy (5.2), but a renormalized form thereof. However, for the sake of simplicity, we proceed as if $g_\epsilon$ did satisfy (5.2). In other words, this amounts to assuming that, for each $\epsilon > 0$, the number density $F_\epsilon$ is a classical solution of the Boltzmann equation, without uniform regularity bounds in the vanishing $\epsilon$ limit. In some sense, this lack of uniformity is the essential difficulty to overcome in this type of problem.
We recall the following important property of the linearized collision operator.
Lemma 5.1 (Hilbert [26]). The operator $L$ is a nonnegative, Fredholm, self-adjoint unbounded operator on $L^2(\mathbb{R}^3; M \, dv)$ with
\[
\ker L = \operatorname{span}\{1, v_1, v_2, v_3, |v|^2\} .
\]
5.1. Step 1: Asymptotic fluctuations. First, we seek the asymptotic form of the number density fluctuations $g_\epsilon$ in the vanishing $\epsilon$ limit. Multiplying the Boltzmann equation (5.2) by $\epsilon$ and letting $\epsilon \to 0$ suggests that $g_\epsilon \to g$ in the sense of distributions on $\mathbb{R}_+ \times \mathbb{R}^3 \times \mathbb{R}^3$ with $L g = 0$. By Hilbert's lemma, $g$ is an infinitesimal Maxwellian, i.e., is of the form
\[
g(t,x,v) = \rho(t,x) + u(t,x) \cdot v + \tfrac12 \theta(t,x) (|v|^2 - 3) . \tag{5.4}
\]
Notice that $g$ is parametrized by its own moments, since
\[
\rho = \langle g \rangle , \quad u = \langle v g \rangle , \quad \text{and } \theta = \langle (\tfrac13 |v|^2 - 1) g \rangle ,
\]
where the bracket notation designates the Gaussian integral:
\[
\langle \phi \rangle = \int_{\mathbb{R}^3} \phi(v) M(v) \, dv .
\]
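The claim that an infinitesimal Maxwellian is parametrized by its own moments can be verified exactly, since Gaussian moments of polynomials are rational numbers. The sketch below represents polynomials in $v$ as dictionaries from exponent triples to coefficients; this representation, the helper names and the sample values of $\rho$, $u$, $\theta$ are assumptions made for this illustration.

```python
from fractions import Fraction

def gaussian_moment(exponents):
    """<v1^a v2^b v3^c> for the standard Gaussian M = M_(1,0,1):
    zero if any exponent is odd, else a product of (e - 1)!! factors."""
    value = 1
    for e in exponents:
        if e % 2 == 1:
            return 0
        for k in range(e - 1, 0, -2):
            value *= k
    return value

def add(p, q):
    """Sum of two polynomials stored as {exponent triple: coefficient}."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + c
    return out

def mul(p, q):
    """Product of two polynomials."""
    out = {}
    for a, ca in p.items():
        for b, cb in q.items():
            mono = tuple(x + y for x, y in zip(a, b))
            out[mono] = out.get(mono, 0) + ca * cb
    return out

def scale(c, p):
    """Scalar multiple of a polynomial."""
    return {mono: c * coeff for mono, coeff in p.items()}

def bracket(p):
    """Gaussian integral <p> of a polynomial p(v)."""
    return sum(coeff * gaussian_moment(mono) for mono, coeff in p.items())

ONE = {(0, 0, 0): Fraction(1)}
V = [{(1, 0, 0): Fraction(1)}, {(0, 1, 0): Fraction(1)}, {(0, 0, 1): Fraction(1)}]
V_SQ = add(add(mul(V[0], V[0]), mul(V[1], V[1])), mul(V[2], V[2]))  # |v|^2

# Sample parameters (arbitrary rational values chosen for the check).
rho, theta = Fraction(2, 7), Fraction(-3, 5)
u = [Fraction(1, 2), Fraction(-4, 3), Fraction(5)]

# g = rho + u.v + (theta/2)(|v|^2 - 3), as in (5.4).
g = scale(rho, ONE)
for u_k, v_k in zip(u, V):
    g = add(g, scale(u_k, v_k))
g = add(g, scale(theta / 2, add(V_SQ, scale(-3, ONE))))

rho_back = bracket(g)
u_back = [bracket(mul(v_k, g)) for v_k in V]
theta_back = bracket(mul(add(scale(Fraction(1, 3), V_SQ), scale(-1, ONE)), g))
print(rho_back, u_back, theta_back)
```

The temperature check rests on $\langle (|v|^2 - 3)^2 \rangle = 15 - 18 + 9 = 6$, so that $\tfrac12 \cdot \tfrac13 \cdot 6 = 1$.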
5.2. Step 2: Local conservation laws. Next, we use an extremely important feature of the Boltzmann collision integral.
Proposition 5.2. For each measurable $f \equiv f(v)$ rapidly decaying at infinity (in the $v$-variable), the collision integral satisfies
\[
\int_{\mathbb{R}^3} C(f) \, dv = \int_{\mathbb{R}^3} v_k C(f) \, dv = \int_{\mathbb{R}^3} |v|^2 C(f) \, dv = 0 , \quad k = 1, 2, 3 . \tag{5.5}
\]
Assuming that, for each $\epsilon > 0$, the solution $F_\epsilon$ satisfies the decay assumption in the above proposition, the first relation entails the continuity equation
\[
\epsilon \partial_t \langle g_\epsilon \rangle + \operatorname{div}_x \langle v g_\epsilon \rangle = 0 .
\]
Passing to the limit in the sense of distributions in this continuity equation, we obtain
\[
\operatorname{div}_x \langle v g \rangle = 0 , \quad \text{or equivalently } \operatorname{div}_x u = 0 , \tag{5.6}
\]
which is the incompressibility condition in the Navier-Stokes equations.
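The step from (5.5) to the continuity equation can be spelled out as follows. This is a sketch under the same decay assumption as in Proposition 5.2, using only the definitions (5.1) and (5.3).

```latex
\begin{aligned}
&M\Bigl(Q(g_\epsilon,g_\epsilon)-\tfrac1\epsilon L g_\epsilon\Bigr)
 = \tfrac12 D^2C[M](Mg_\epsilon,Mg_\epsilon)+\tfrac1\epsilon DC[M](Mg_\epsilon)
 = \epsilon^{-2}\,C\bigl(M(1+\epsilon g_\epsilon)\bigr) ,\\[2pt]
&\Bigl\langle Q(g_\epsilon,g_\epsilon)-\tfrac1\epsilon L g_\epsilon\Bigr\rangle
 = \epsilon^{-2}\int_{\mathbb{R}^3} C\bigl(M(1+\epsilon g_\epsilon)\bigr)\,dv = 0
 \quad\text{by the first relation in (5.5)} ,\\[2pt]
&\text{so integrating (5.2) against } M\,dv \text{ gives}\quad
 \epsilon\,\partial_t\langle g_\epsilon\rangle+\operatorname{div}_x\langle v g_\epsilon\rangle = 0 .
\end{aligned}
```

The same argument with the weights $v_k$ and $|v|^2$ in place of $1$ yields the local conservation laws of momentum and energy.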
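The second relation in (5.5) together with entropy production controls implies that
\[
\partial_t \langle v g_\epsilon \rangle + \operatorname{div}_x ( \langle v g_\epsilon \rangle \otimes \langle v g_\epsilon \rangle ) - \nu \Delta_x \langle v g_\epsilon \rangle \to 0 \quad \text{modulo gradients} \tag{5.7}
\]
in the sense of distributions on $\mathbb{R}_+^* \times \mathbb{R}^3$. This leads to the Navier-Stokes motion equation in the limit as $\epsilon \to 0$.
Indeed, denoting $A(v) = v \otimes v - \tfrac13 |v|^2 I$ (the traceless part of $v \otimes v$), the second relation in (5.5) implies that
\[
\partial_t \langle v g_\epsilon \rangle + \frac{1}{\epsilon} \operatorname{div}_x \langle A(v) g_\epsilon \rangle + \frac{1}{\epsilon} \nabla_x \langle \tfrac13 |v|^2 g_\epsilon \rangle = 0 . \tag{5.8}
\]
Observe that $A \perp \operatorname{span}\{1, v_1, v_2, v_3, |v|^2\}$; by Hilbert's lemma, there exists a unique symmetric matrix field $\hat A$ in the domain of $L$ such that
\[
L \hat A = A , \quad \text{with } \hat A \perp \ker L .
\]
Since $L$ is self-adjoint on $L^2(M \, dv)$,
\[
\frac{1}{\epsilon} \langle A(v) g_\epsilon \rangle = \frac{1}{\epsilon} \langle (L \hat A)(v) g_\epsilon \rangle = \Bigl\langle \hat A(v) \, \frac{1}{\epsilon} L g_\epsilon \Bigr\rangle = \langle \hat A \, Q(g_\epsilon, g_\epsilon) \rangle - \langle \hat A (\epsilon \partial_t + v \cdot \nabla_x) g_\epsilon \rangle . \tag{5.9}
\]
Let $\Pi$ be the orthogonal projection on $\ker L$ in $L^2(\mathbb{R}^3; M \, dv)$: for each $\phi \in L^2(\mathbb{R}^3; M \, dv)$, one has
\[
\Pi \phi = \langle \phi \rangle + v \cdot \langle v \phi \rangle + \tfrac12 (|v|^2 - 3) \langle (\tfrac13 |v|^2 - 1) \phi \rangle .
\]
Because of step 1, one expects that $g_\epsilon$ can be replaced by $\Pi g_\epsilon$ as $\epsilon \to 0$ in the right-hand side of (5.9). Hence
\[
\frac{1}{\epsilon} \langle A(v) g_\epsilon \rangle \simeq \langle \hat A \, Q(\Pi g_\epsilon, \Pi g_\epsilon) \rangle - \langle \hat A \, v \cdot \nabla_x \Pi g_\epsilon \rangle = \langle \hat A \, Q(\Pi g_\epsilon, \Pi g_\epsilon) \rangle - \langle \hat A \otimes A \rangle : \nabla_x \langle v g_\epsilon \rangle
\]
in some sense as $\epsilon \to 0$. The contraction in the last term of the right-hand side of the equality above bears on the indices of $A$ and $\nabla_x \langle v g_\epsilon \rangle$; in other words, with the convention of repeated indices,
\[
( \langle \hat A \otimes A \rangle : \nabla_x \langle v g_\epsilon \rangle )_{ij} = \langle \hat A_{ij} A_{kl} \rangle \, \partial_{x_k} \langle v_l g_\epsilon \rangle .
\]
The nonlinear term is simplified as follows.
Lemma 5.3. For each $\phi \in \ker L$, one has $Q(\phi, \phi) = \tfrac12 L(\phi^2)$.
Proof. Differentiate twice the relation $C(M_{(\rho,u,\theta)}) = 0$ with respect to the parameters $\rho$, $u$ and $\theta$. See [4] for a complete argument.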
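The orthogonality of $A$ to $\ker L$ used above reduces to a handful of Gaussian moment identities, which can be checked exactly. The sketch below also checks the fourth-moment identity $\langle A_{ij} v_k v_l \rangle = \delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk} - \tfrac23 \delta_{ij}\delta_{kl}$, which governs the contraction of $A$ against quadratic expressions in $v$. The helper names `moment`, `bracket_A` and `delta` are assumptions made for this illustration.

```python
from fractions import Fraction

def moment(indices):
    """<v_{i1} ... v_{in}> for the standard Gaussian M; indices in {0, 1, 2}."""
    value = 1
    for axis in range(3):
        e = indices.count(axis)
        if e % 2 == 1:
            return 0
        for k in range(e - 1, 0, -2):
            value *= k
    return value

def delta(a, b):
    """Kronecker symbol."""
    return 1 if a == b else 0

def bracket_A(i, j, extra):
    """<A_ij(v) * prod_{m in extra} v_m>, with A = v (x) v - |v|^2 I / 3."""
    trace_part = sum(moment([m, m] + extra) for m in range(3))
    return moment([i, j] + extra) - Fraction(delta(i, j), 3) * trace_part

R = range(3)

# Orthogonality to the five generators of ker L: 1, v_k, |v|^2.
against_one = [bracket_A(i, j, []) for i in R for j in R]
against_v = [bracket_A(i, j, [k]) for i in R for j in R for k in R]
against_v_sq = [sum(bracket_A(i, j, [n, n]) for n in R) for i in R for j in R]

# Fourth-moment identity for <A_ij v_k v_l>.
fourth_moment_ok = all(
    bracket_A(i, j, [k, l])
    == delta(i, k) * delta(j, l) + delta(i, l) * delta(j, k)
    - Fraction(2, 3) * delta(i, j) * delta(k, l)
    for i in R for j in R for k in R for l in R
)
print(fourth_moment_ok)
```

Contracting the fourth-moment identity with $u_k u_l$ gives $\tfrac12 \langle A_{ij} (u \cdot v)^2 \rangle = u_i u_j - \tfrac13 |u|^2 \delta_{ij}$, the traceless part of $u \otimes u$.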
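Eventually, we arrive at the formula
\[
\begin{aligned}
\frac{1}{\epsilon} \langle A(v) g_\epsilon \rangle
&\simeq \tfrac12 \langle \hat A \, L((\Pi g_\epsilon)^2) \rangle - \langle \hat A \otimes A \rangle : \nabla_x \langle v g_\epsilon \rangle \\
&= \tfrac12 \langle A \, |\Pi g_\epsilon|^2 \rangle - \langle \hat A \otimes A \rangle : \nabla_x \langle v g_\epsilon \rangle \\
&= \langle v g_\epsilon \rangle \otimes \langle v g_\epsilon \rangle - \tfrac13 |\langle v g_\epsilon \rangle|^2 I - \nu D(\langle v g_\epsilon \rangle) ,
\end{aligned} \tag{5.10}
\]
where
\[
\nu = \tfrac{1}{10} \langle \hat A : A \rangle
\]
and, for each vector field $\xi \equiv \xi(x) \in \mathbb{R}^3$,
\[
D(\xi) = \nabla_x \xi + (\nabla_x \xi)^T - \tfrac23 (\operatorname{div}_x \xi) I .
\]
Substituting the formula (5.10) for the momentum flux in (5.8), and taking into account the incompressibility condition (5.6), we arrive at the asymptotic momentum conservation law (5.7).
Actually, we do not know whether renormalized solutions of the Boltzmann equation (3.1) satisfy the local conservation laws of momentum and energy that Proposition 5.2 would entail in the case of classical solutions of (3.1) that are rapidly decaying as $|v| \to +\infty$. Instead of following exactly the argument described above, one must consider an approximate local conservation law of momentum modulo a defect term that vanishes as $\epsilon \to 0$. This leads to technical complications much too intricate to be described here.
5.3. Compactness arguments. The DiPerna-Lions entropy inequality gives a priori bounds on the number density fluctuations that are uniform in $\epsilon$; it was proved in [5] that
\[
(1 + |v|^2) g_\epsilon \text{ is weakly relatively compact in } L^1_{loc}(\mathbb{R}_+ \times \mathbb{R}^3_x; L^1(\mathbb{R}^3_v)) .
\]
Hence, modulo extracting subsequences, for each $\phi \equiv \phi(v) = O(|v|^2)$ as $|v| \to +\infty$, one has
\[
\phi g_\epsilon \to \phi g \quad \text{weakly in } L^1_{loc}(\mathbb{R}_+ \times \mathbb{R}^3_x; L^1(\mathbb{R}^3_v)) ,
\]
and this justifies passing to the limit in expressions that are linear in $g_\epsilon$. It remains to pass to the limit in the nonlinear term, i.e., to justify that
\[
\operatorname{div}_x ( \langle v g_\epsilon \rangle \otimes \langle v g_\epsilon \rangle ) \to \operatorname{div}_x ( \langle v g \rangle \otimes \langle v g \rangle ) \quad \text{modulo gradients as } \epsilon \to 0
\]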
and this requires a.e. pointwise, instead of weak convergence. Perhaps the main compactness argument in the proof is a "velocity averaging" lemma, a typical example of which (in a time-independent situation) is as follows:
Lemma 5.4 (F. Golse, L. Saint-Raymond [21]). Let $f_n \equiv f_n(x,v)$ be a bounded sequence in $L^1(\mathbb{R}^D_x; L^p(\mathbb{R}^D_v))$ for some $p > 1$ such that the sequence $v \cdot \nabla_x f_n$ is bounded in $L^1(\mathbb{R}^D \times \mathbb{R}^D)$. Then
• the sequence $f_n$ is weakly relatively compact in $L^1_{loc}(\mathbb{R}^D \times \mathbb{R}^D)$; and
• for each $\phi \in C_c(\mathbb{R}^D)$, the sequence of moments $\int_{\mathbb{R}^D} f_n(x,v) \phi(v) \, dv$ is strongly relatively compact in $L^1_{loc}(\mathbb{R}^D)$.
[Figure 1. Convergence of the number density fluctuations. Diagram labels: number density fluctuations; vanishing entropy production; ε; infinitesimal Maxwellians; hydrodynamic fluctuations; compactness by velocity averaging.]
With the compactness lemma above, the a.e. pointwise convergence of the number density fluctuations $g_\epsilon$ (modulo extraction of a subsequence) is essentially obtained as follows: first, the entropy production bound inferred from (3.4) implies that $g_\epsilon$ approaches the manifold of infinitesimal Maxwellians, i.e., the class of functions of the form (5.4), a.e. pointwise. Since an infinitesimal Maxwellian $f$ is parametrized by its velocity averages
\[
\int_{\mathbb{R}^3} f M \, dv , \quad \int_{\mathbb{R}^3} v f M \, dv , \quad \int_{\mathbb{R}^3} (\tfrac13 |v|^2 - 1) f M \, dv ,
\]
one concludes by applying Lemma 5.4. The situation is summarized in Figure 1.
The idea of gaining compactness in the strong topology by velocity averaging in the context of transport equations is due to F. Golse, B. Perthame and R. Sentis, and appeared for the first time in [20]. This first result was an $L^2$-variant of the lemma above, and was proved with Fourier techniques, by controlling the small divisors involving the symbol of $v \cdot \nabla_x$. Independently, the regularity of the spherical harmonic coefficients of the solution of the radiative transfer equation was studied in [1]. Later, a systematic study of the regularity and compactness of velocity averages of solutions of transport equations in $L^p$ for all $p \in [1, +\infty)$ appeared in [19]. The $L^1$-variant of velocity averaging contained in [19] was one of the key arguments in the proof by R.J. DiPerna and P.-L. Lions of global existence of a renormalized solution of the Boltzmann equation in [15].
More recently, velocity averaging results have been generalized to cases where $f_n$ is bounded in $L^p(\mathbb{R}^D_x \times \mathbb{R}^D_v)$ and $v \cdot \nabla_x f_n = \operatorname{div}_x g_n$, with $g_n$ relatively compact in $L^p(\mathbb{R}^D_x; W^{-m,p}(\mathbb{R}^D_v))$ for some $p \in (1, +\infty)$: see [16], [41], [14]. These results are proved with various techniques from harmonic analysis: see Chapter 1 in [8] for a survey as of 2000. This class of results is of considerable importance in the so-called "kinetic formulation" of hyperbolic conservation
laws, a topic in some sense analogous to hydrodynamic limits: see [40] for a detailed introduction to this very active research field.

As for the $L^1_x(L^p_v)$ case considered in the lemma above, its proof is based on a representation of the solution in physical space (instead of Fourier space). One of the key ideas in the proof of this result is that the group generated by $v \cdot \nabla_x$, defined by the formula

$$e^{t v \cdot \nabla_x} \phi(x, v) = \phi(x + tv, v),$$

exchanges $x$- and $v$-regularity for $t \neq 0$. This implies dispersion estimates “à la Strichartz” (see [12], and also Chapter 1 in [8]); the proof of the velocity averaging lemma above is based on these dispersion estimates together with an interpolation argument somewhat reminiscent of [32]. A preliminary version of Lemma 5.4 was used in [43].

6. Other hydrodynamic limits

Hydrodynamic models other than the incompressible Navier-Stokes equations can also be derived from the Boltzmann equation. Here are some examples.

6.1. The incompressible Euler limit. Let $u^{in} \equiv u^{in}(x) \in \mathbf{R}^3$ satisfy $u^{in} \in H^3(\mathbf{R}^3, \mathbf{R}^3)$ and $\operatorname{div}_x u^{in} = 0$; let $u \in C([0, T^*); H^3(\mathbf{R}^3, \mathbf{R}^3))$ be the maximal solution of the incompressible Euler equations (see Kato [27])

$$\partial_t u + (u \cdot \nabla_x) u + \nabla_x p = 0, \qquad \operatorname{div}_x u = 0, \qquad u|_{t=0} = u^{in}. \tag{6.1}$$

These equations can be derived from the Boltzmann equation in the following manner.

Theorem 6.1 (L. Saint-Raymond [45]). For each $\epsilon > 0$, let $\delta_\epsilon = \epsilon^a$ with $a \in (0, 1)$ and let $F^{in}_\epsilon$ be defined as

$$F^{in}_\epsilon(x, v) = M_{(1, \delta_\epsilon u^{in}(\epsilon x), 1)}(v).$$

Let $F_\epsilon$ be a renormalized solution of the Boltzmann equation (3.1) with initial data $F_\epsilon|_{t=0} = F^{in}_\epsilon$. Then, in the limit as $\epsilon \to 0$, one has

$$\frac{1}{\delta_\epsilon} \int_{\mathbf{R}^3} v F_\epsilon\Big(\frac{t}{\epsilon \delta_\epsilon}, \frac{x}{\epsilon}, v\Big)\,dv \to u(t, x)$$

in $L^\infty([0, T]; L^1_{loc}(\mathbf{R}^3))$ for each $T \in (0, T^*)$, where $u$ is the maximal solution of (6.1) on $[0, T^*) \times \mathbf{R}^3$.

The proof of this result differs from that of the Navier-Stokes limit. In particular, under the scaling assumption leading to the incompressible Euler equations, the entropy production rate in the Boltzmann equation does not balance the action of the streaming operator on $F_\epsilon$, which makes it impossible to apply the velocity averaging compactness lemma as in the Navier-Stokes limit. Here, the compactness of hydrodynamic fluctuations is obtained as a consequence of the stability (under perturbations of the initial data) of smooth
solutions of the incompressible Euler equations. This theorem is proved by a variant of the relative entropy method (see H.-T. Yau [50] on the hydrodynamic limit of interacting diffusions on a lattice). Preliminary versions of the theorem above can be found in [8] and [34]; see also [42] for the BGK model of the Boltzmann equation. However, the main feature of the relative entropy method is that the target equation (in this case the incompressible Euler equations) should have local smooth solutions.

6.2. The acoustic limit. Here is another example of a hydrodynamic limit of the Boltzmann equation, leading to a model for compressible fluids. Consider the acoustic system

$$\partial_t \rho + \operatorname{div}_x u = 0, \qquad \partial_t u + \nabla_x(\rho + \theta) = 0, \qquad \tfrac{3}{2}\partial_t \theta + \operatorname{div}_x u = 0, \qquad (\rho, u, \theta)|_{t=0} = (\rho^{in}, u^{in}, \theta^{in}). \tag{6.2}$$

The initial data satisfy

$$\rho^{in}, \theta^{in} \in L^2(\mathbf{R}^3), \qquad u^{in} \in L^2(\mathbf{R}^3; \mathbf{R}^3).$$

Clearly, the system above essentially reduces to a system of uncoupled wave equations for $\rho + \theta$ and the potential in the Helmholtz decomposition⁴ of $u$, so that the Cauchy problem has a unique solution $(\rho, u, \theta) \in C(\mathbf{R}; L^2(\mathbf{R}^3) \times L^2(\mathbf{R}^3; \mathbf{R}^3) \times L^2(\mathbf{R}^3))$. Moreover, the solution map $U(t)$ defined by

$$U(t)(\rho^{in}, u^{in}, \theta^{in}) = (\rho(t, \cdot), u(t, \cdot), \theta(t, \cdot))$$

is a unitary group on $L^2(\mathbf{R}^3) \times L^2(\mathbf{R}^3; \mathbf{R}^3) \times L^2(\mathbf{R}^3)$.

Theorem 6.2 (F. Golse – C.D. Levermore [18]). Let $\delta_\epsilon > 0$ satisfy $\delta_\epsilon |\ln \delta_\epsilon|^{1/2} = o(\sqrt{\epsilon})$, and consider, for each $\epsilon > 0$,

$$F^{in}_\epsilon(x, v) = M_{(1 + \delta_\epsilon \rho^{in}(\epsilon x), \delta_\epsilon u^{in}(\epsilon x), 1 + \delta_\epsilon \theta^{in}(\epsilon x))}(v).$$

Let $F_\epsilon$ be a renormalized solution relative to $M$ of the Boltzmann equation (3.1). Then, in the limit as $\epsilon \to 0$, one has

$$\frac{1}{\delta_\epsilon} \int_{\mathbf{R}^3} \Big(F_\epsilon\Big(\frac{t}{\epsilon}, \frac{x}{\epsilon}, v\Big) - M\Big) \begin{pmatrix} 1 \\ v \\ \tfrac{1}{3}|v|^2 - 1 \end{pmatrix} dv \to \begin{pmatrix} \rho(t, x) \\ u(t, x) \\ \theta(t, x) \end{pmatrix}$$

in $L^1_{loc}(\mathbf{R}_+ \times \mathbf{R}^3)$, where $(\rho, u, \theta)$ is the solution of the acoustic system (6.2).

⁴ I.e., $u = u_0 - \nabla_x \phi$ with $\operatorname{div}_x u_0 = 0$.
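The wave structure and the conservation property of the acoustic system (6.2) can be checked on a single Fourier mode. The sketch below is only an illustration under stated assumptions: it takes a mode $e^{ik\cdot x}$, keeps the component $w$ of $\hat u$ along $k/|k|$ (the transverse components do not evolve), and integrates the resulting three-component system with a classical Runge-Kutta scheme. The weights $1, 1, 3/2$ in `weighted_energy` are an assumption about the inner product for which the group is unitary; the text does not spell it out. The constant `SOUND_SPEED` $= \sqrt{5/3}$ follows from $\partial_{tt}(\rho + \theta) = \tfrac{5}{3}\Delta_x(\rho + \theta)$, which is obtained by combining the three equations of (6.2).

```python
import math

# Propagation speed of rho + theta, derived from (6.2): sqrt(1 + 2/3).
SOUND_SPEED = math.sqrt(5.0 / 3.0)


def rhs(state, k):
    """Time derivative of (rho, w, theta) for one Fourier mode of (6.2).

    w is the component of the velocity mode along k/|k|; k stands for |k|.
    """
    rho, w, theta = state
    return (-1j * k * w, -1j * k * (rho + theta), -1j * k * (2.0 / 3.0) * w)


def rk4_step(state, k, dt):
    """One classical Runge-Kutta step for the three-component mode system."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))

    k1 = rhs(state, k)
    k2 = rhs(shifted(state, k1, dt / 2), k)
    k3 = rhs(shifted(state, k2, dt / 2), k)
    k4 = rhs(shifted(state, k3, dt), k)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def evolve(state, k, t_final, steps):
    """Integrate the mode system from time 0 to t_final."""
    dt = t_final / steps
    for _ in range(steps):
        state = rk4_step(state, k, dt)
    return state


def weighted_energy(state):
    """Quadratic form |rho|^2 + |w|^2 + (3/2)|theta|^2 (assumed weights)."""
    rho, w, theta = state
    return abs(rho) ** 2 + abs(w) ** 2 + 1.5 * abs(theta) ** 2


if __name__ == "__main__":
    start = (1.0 + 0j, 0.5j, -0.3 + 0j)
    end = evolve(start, k=2.0, t_final=3.0, steps=3000)
    print(weighted_energy(start), weighted_energy(end))
```

In this reading, the state $(1, 0, -1)$ is stationary, while $(1, \sqrt{5/3}, 2/3)$ is a plane wave that only picks up the phase $e^{-i|k|\sqrt{5/3}\,t}$.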
The proof of this result follows the same pattern as that of the incompressible Navier-Stokes limit. Unfortunately, the condition on the size $\delta_\epsilon$ of the number density fluctuations is not optimal. A formal argument similar to steps 1–2 in the proof of the incompressible Navier-Stokes limit suggests that the same conclusion should hold under the sole assumption that $\delta_\epsilon \to 0$ as $\epsilon \to 0$. Since we do not know whether renormalized solutions of the Boltzmann equation (3.1) satisfy the local conservation laws implied by Proposition 5.2 in the case of classical solutions of (3.1) that are rapidly decaying as $|v| \to +\infty$, the analogue of step 2 in the proof of the incompressible Navier-Stokes limit involves variants of these local conservation laws of momentum and energy modulo defect terms that vanish as $\epsilon \to 0$, provided that $\delta_\epsilon$ satisfies the stronger assumption $\delta_\epsilon |\ln \delta_\epsilon|^{1/2} = o(\sqrt{\epsilon})$.

6.3. Models involving a heat equation. In fact, the result obtained in [22] or in [23] leads to the Navier-Stokes equations coupled with a drift-diffusion equation for (fluctuations of) the temperature field, i.e., the Navier-Stokes-Fourier system

$$\operatorname{div}_x u = 0, \qquad \partial_t u + \operatorname{div}_x(u \otimes u) + \nabla_x p = \nu \Delta_x u, \qquad \partial_t \theta + \operatorname{div}_x(u\theta) = \kappa \Delta_x \theta. \tag{6.3}$$

The heat conductivity $\kappa$ is given by a formula similar to (4.1), i.e.,

$$\kappa = \tfrac{4}{15} D^*\big(\tfrac{1}{2}(|v|^2 - 5)v\big).$$
A rigorous derivation of the linear variant of this system (i.e., the Stokes-Fourier system) from renormalized solutions of the Boltzmann equation can be found in [18]; previously, the evolution Stokes equations (for the velocity field only) had been similarly obtained by P.-L. Lions and N. Masmoudi in [34]. More elaborate asymptotic limits leading to a viscous heating term in the right-hand side of the drift-diffusion equation for the temperature field have been formally derived from the Boltzmann equation in [6], but obtaining a complete mathematical argument justifying this derivation remains a real challenge.

7. Open problems

An outstanding open problem in this field is the derivation of the Euler equations for compressible fluids from the Boltzmann equation. The compressible Euler system (for a perfect monatomic gas) is

$$\begin{aligned} &\partial_t \rho + \operatorname{div}_x(\rho u) = 0, \\ &\partial_t(\rho u) + \operatorname{div}_x(\rho u \otimes u) + \nabla_x(\rho\theta) = 0, \\ &\partial_t\big(\rho(\tfrac{1}{2}|u|^2 + \tfrac{3}{2}\theta)\big) + \operatorname{div}_x\big(\rho u(\tfrac{1}{2}|u|^2 + \tfrac{5}{2}\theta)\big) = 0, \end{aligned} \tag{7.1}$$

where $\rho \equiv \rho(t, x) \geq 0$ is the density of the fluid at time $t$ and position $x$, while $\theta \equiv \theta(t, x) > 0$ is the temperature field and $u \equiv u(t, x) \in \mathbf{R}^3$ the velocity field.
This is a system of conservation laws with an entropy

$$\eta(\rho, u, \theta) = \rho \ln\Big(\frac{\rho}{\theta^{3/2}}\Big)$$

that is a convex function of $\rho$, $\rho u$ and $\rho(\tfrac{1}{2}|u|^2 + \tfrac{3}{2}\theta)$ (the conserved densities). Hence (7.1) is a symmetrizable hyperbolic system, for which the Cauchy problem has local smooth solutions: see for instance the book by A. Majda [36].

It is known that, for a large class of initial data, the solution of (7.1) becomes singular in finite time (see [46]). Yet, the existence of global weak solutions of (7.1) is still unknown, and is a major open problem of the theory of hyperbolic systems. However, in the case where $\rho$, $u$ and $\theta$ only depend upon one space variable (say, $x_1$), global existence of a weak solution to (7.1) for which $\eta$ decreases across shock waves has been proved for initial data with small total variation. This result stems from Glimm’s remarkable paper [17] and is due to T.-P. Liu [35].

So far, solutions of (7.1) have been derived from solutions of the Boltzmann equation (3.1) in the regularity phase: see [38], [11], [28]. The idea is to start from initial data of the form

$$F^{in}_\epsilon(x, v) = M_{(\rho^{in}(\epsilon x), u^{in}(\epsilon x), \theta^{in}(\epsilon x))}$$

parametrized by $\epsilon > 0$. For each $\epsilon > 0$, let $F_\epsilon$ be a solution of (3.1) such that $F_\epsilon|_{t=0} = F^{in}_\epsilon$; then, one shows that the hydrodynamic moments of $F_\epsilon$ satisfy

$$\int_{\mathbf{R}^3} F_\epsilon\Big(\frac{t}{\epsilon}, \frac{x}{\epsilon}, v\Big) \begin{pmatrix} 1 \\ v \\ |v|^2 \end{pmatrix} dv \to \begin{pmatrix} \rho(t, x) \\ \rho u(t, x) \\ \rho(|u|^2 + 3\theta)(t, x) \end{pmatrix}$$

as $\epsilon \to 0$, where $(\rho, u, \theta)$ is the solution of (7.1) with initial data $(\rho^{in}, u^{in}, \theta^{in})$. The convergence above is of course local in time, at best over the lifespan of a smooth solution of (7.1). It would be of considerable interest to derive the global BV solutions constructed by T.-P. Liu from the Boltzmann equation.
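The limiting moments $(\rho, \rho u, \rho(|u|^2 + 3\theta))$ quoted above are exactly the moments of a Maxwellian against $1$, $v$ and $|v|^2$, which can be confirmed numerically. The following sketch assumes the standard normalisation $M_{(\rho,u,\theta)}(v) = \rho\,(2\pi\theta)^{-3/2} \exp(-|v-u|^2/(2\theta))$, which is not restated in this section; the function names and the quadrature parameters are illustrative choices only.

```python
import math


def gaussian_moments_1d(mean, variance, points=4001, half_width=12.0):
    """Trapezoid-rule moments (orders 0, 1, 2) of the normal density N(mean, variance)."""
    sd = math.sqrt(variance)
    lower = mean - half_width * sd
    step = 2.0 * half_width * sd / (points - 1)
    m0 = m1 = m2 = 0.0
    for i in range(points):
        v = lower + i * step
        weight = step * (0.5 if i in (0, points - 1) else 1.0)
        density = math.exp(-(v - mean) ** 2 / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)
        m0 += weight * density
        m1 += weight * density * v
        m2 += weight * density * v * v
    return m0, m1, m2


def maxwellian_moments(rho, u, theta):
    """Integrals over R^3 of the (assumed) Maxwellian against 1, v and |v|^2.

    The Maxwellian is a product of three 1D Gaussians, so every 3D moment
    is assembled from 1D moments along each axis.
    """
    axes = [gaussian_moments_1d(component, theta) for component in u]
    zeroth = [a[0] for a in axes]
    mass = rho * zeroth[0] * zeroth[1] * zeroth[2]
    momentum = []
    second = 0.0
    for i in range(3):
        others = 1.0
        for j in range(3):
            if j != i:
                others *= zeroth[j]
        momentum.append(rho * axes[i][1] * others)
        second += rho * axes[i][2] * others
    return mass, momentum, second


if __name__ == "__main__":
    rho, u, theta = 1.3, (0.2, -0.5, 1.0), 0.8
    mass, momentum, second = maxwellian_moments(rho, u, theta)
    print(mass, rho)
    print(momentum, [rho * c for c in u])
    print(second, rho * (sum(c * c for c in u) + 3.0 * theta))
```

Each printed pair compares the quadrature value with the closed-form expression on the right-hand side of the limit.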
As in the case of the incompressible Euler limit of the Boltzmann equation, the entropy production bound entailed by Boltzmann’s H Theorem does not balance the action of the streaming operator on the number density: the compactness of hydrodynamic moments of the number density is probably to be sought in some stability property of BV solutions of the compressible Euler system. Most likely, such a theory should use Bressan’s remarkable results in that direction (see [9], [10]).

Another open problem would be to improve Theorem 6.2, by relaxing the unphysical assumption made on the size $\delta_\epsilon$ of the number density fluctuations to reach the physically natural condition that $\delta_\epsilon \to 0$ as $\epsilon \to 0$. This will probably require more information on the local conservation laws of momentum and energy for renormalized solutions of the Boltzmann equation. Such information would most likely be an important prerequisite for progress on the compressible Euler limit.
Finally, we have only treated evolution problems in this paper. In fact, steady problems are perhaps even more important for applications (as in aerodynamics). For instance, it is well known that, for any force field $f \equiv f(x) \in L^2(\Omega; \mathbf{R}^3)$ such that $\operatorname{div}_x f = 0$, the steady incompressible Navier-Stokes equations in a smooth, bounded open domain $\Omega \subset \mathbf{R}^3$

$$-\nu \Delta_x u = f - \nabla_x p - (u \cdot \nabla_x) u, \qquad \operatorname{div}_x u = 0, \qquad x \in \Omega, \qquad u|_{\partial\Omega} = 0 \tag{7.2}$$
have at least one classical solution $u \equiv u(x) \in H^2(\Omega, \mathbf{R}^3)$, obtained by a Leray-Schauder fixed point argument (see for instance [29]). Unfortunately, the parallel theory for the Boltzmann equation is not as advanced: see however the classical papers by Guiraud [24], and more recent work by L. Arkeryd and A. Nouri (see for instance [2]). Yet, the fact that the solutions of (7.2) are more regular than in the case of the evolution problem could be of considerable help in the context of the hydrodynamic limit. A rather exhaustive description of these kinds of problems (at the formal level) may be found in the recent monograph by Y. Sone [48].

References

[1] V. Agoshkov: Space of functions with differential difference characteristics and smoothness of solutions of the transport equation, Dokl. Akad. Nauk SSSR 276 (1984), 1289–1293.
[2] L. Arkeryd, A. Nouri: The stationary Boltzmann equation in $\mathbf{R}^n$ with given indata, Ann. Sc. Norm. Super. Pisa Cl. Sci. (5) 1 (2002), 359–385.
[3] C. Bardos, F. Golse, C.D. Levermore: Sur les limites asymptotiques de la théorie cinétique conduisant à la dynamique des fluides incompressibles, C.R. Acad. Sci. 309 (1989), 727–732.
[4] C. Bardos, F. Golse, C.D. Levermore: Fluid dynamic limits of kinetic equations. I. Formal derivations, J. Statist. Phys. 63 (1991), 323–344.
[5] C. Bardos, F. Golse, C.D. Levermore: Fluid Dynamic Limits of Kinetic Equations II: Convergence Proofs for the Boltzmann Equation, Comm. Pure & Appl. Math 46 (1993), 667–753.
[6] C. Bardos, C.D. Levermore: Kinetic equations and an incompressible limit that recovers viscous heating, preprint.
[7] C. Bardos, S. Ukai: The classical incompressible Navier-Stokes limit of the Boltzmann equation, Math. Models and Methods in the Appl. Sci. 1 (1991), 235–257.
[8] F. Bouchut, F. Golse, M. Pulvirenti: “Kinetic Equations and Asymptotic Theory”, L. Desvillettes & B. Perthame ed., Editions scientifiques et médicales Elsevier, Paris, 2000.
[9] A. Bressan: “Hyperbolic systems of conservation laws. The one-dimensional Cauchy problem”, Oxford University Press, Oxford, 2000.
[10] A. Bressan: Hyperbolic systems of conservation laws in one space dimension, in “Proceedings of the International Congress of Mathematicians”, Vol. I (Beijing, 2002), 159–178, Higher Ed. Press, Beijing, 2002.
[11] R.E. Caflisch: The fluid dynamic limit of the nonlinear Boltzmann equation, Comm. on Pure and Appl. Math. 33 (1980), 651–666.
[12] F. Castella, B. Perthame: Estimations de Strichartz pour les équations de transport cinétiques, C.R. Acad. Sci. Sér. I 322 (1996), 535–540.
[13] A. DeMasi, R. Esposito, J. Lebowitz: Incompressible Navier-Stokes and Euler Limits of the Boltzmann Equation, Commun. Pure & Appl. Math. 42 (1990), 1189–1214.
[14] R. DeVore, G. Petrova: The averaging lemma, J. Amer. Math. Soc. 14 (2001), 279–296.
[15] R.J. DiPerna, P.-L. Lions: On the Cauchy problem for the Boltzmann equation: global existence and weak stability results, Ann. of Math. 130 (1990), 321–366.
[16] R.J. DiPerna, P.-L. Lions, Y. Meyer: $L^p$ regularity of velocity averages, Ann. Inst. Henri Poincaré, Anal. Non-linéaire 8 (1991), 271–287.
[17] J. Glimm: Solutions in the large for nonlinear hyperbolic systems of equations, Comm. Pure Appl. Math. 18 (1965), 697–715.
[18] F. Golse, C.D. Levermore: The Stokes-Fourier and Acoustic Limits for the Boltzmann Equation, Comm. on Pure and Appl. Math. 55 (2002), 336–393.
[19] F. Golse, P.-L. Lions, B. Perthame, R. Sentis: Regularity of the moments of the solution of a transport equation, J. Funct. Anal. 76 (1988), 110–125.
[20] F. Golse, B. Perthame, R. Sentis: Un résultat de compacité pour les équations de transport et application au calcul de la limite de la valeur propre principale de l’opérateur de transport, C.R. Acad. Sci. 301 (1985), 341–344.
[21] F. Golse, L. Saint-Raymond: Velocity averaging in $L^1$ for the transport equation, C. R. Acad. Sci. 334 (2002), 557–562.
[22] F. Golse, L. Saint-Raymond: The Navier-Stokes limit of the Boltzmann equation for bounded collision kernels, Invent. Math. 155 (2004), no. 1, 81–161.
[23] F. Golse, L. Saint-Raymond: The Navier-Stokes limit of the Boltzmann equation for hard potentials, in preparation.
[24] J.-P. Guiraud: Problème aux limites intérieures pour l’équation de Boltzmann, (French) in “Actes du Congrès International des Mathématiciens” (Nice, 1970), vol. 3, pp. 115–122. Gauthier-Villars, Paris, 1971.
[25] D. Hilbert: Mathematical Problems, International Congress of Mathematicians, Paris 1900, translated and reprinted in Bull. Amer. Math. Soc. 37 (2000), 407–436.
[26] D. Hilbert: Begründung der kinetischen Gastheorie, Math. Ann. 72 (1912), 562–577.
[27] T. Kato: Nonstationary flows of viscous and ideal fluids in $\mathbf{R}^3$, J. Funct. Anal. 9 (1972), 296–305.
[28] M. Lachowicz: On the initial layer and the existence theorem for the nonlinear Boltzmann equation, Math. Methods Appl. Sci. 9 (1987), no. 3, 342–366.
[29] O.A. Ladyzhenskaya: “The mathematical theory of viscous incompressible flows”, Gordon and Breach, Science Publishers, New York-London-Paris 1969.
[30] O.E. Lanford: Time evolution of large classical systems, in “Dynamical systems, theory and applications” (Rencontres, Battelle Res. Inst., Seattle, Wash., 1974), pp. 1–111. Lecture Notes in Phys., Vol. 38, Springer, Berlin, 1975.
[31] J. Leray: Essai sur le mouvement d’un liquide visqueux emplissant l’espace, Acta Math. 63 (1934), 193–248.
[32] J.-L. Lions: Théorèmes de trace et d’interpolation I, II, Ann. Scuola Norm. di Pisa 13 (1959), pp. 389–403, & 14 (1960), pp. 317–331.
[33] P.-L. Lions: Conditions at infinity for Boltzmann’s equation, Comm. in Partial Differential Equations 19 (1994), 335–367.
[34] P.-L. Lions, N. Masmoudi: From Boltzmann Equations to the Navier-Stokes and Euler Equations I, II, Archive Rat. Mech. & Anal. 158 (2001), 173–193, & 158 (2001), 195–211.
[35] T.-P. Liu: Solutions in the large for the equations of nonisentropic gas dynamics, Indiana Univ. Math. J. 26 (1977), 147–177.
[36] A. Majda: “Compressible fluid flow and systems of conservation laws in several space variables”, Springer-Verlag, New York, 1984.
[37] C.B. Morrey: On the derivation of the equations of hydrodynamics from statistical mechanics, Comm. Pure Appl. Math. 8 (1955), 279–326.
[38] T. Nishida: Fluid dynamical limit of the nonlinear Boltzmann equation to the level of the compressible Euler equation, Comm. Math. Phys. 61 (1978), 119–148.
[39] S. Olla, S.R.S. Varadhan, H.-T. Yau: Hydrodynamical limit for a Hamiltonian system with weak noise, Comm. Math. Phys. 155 (1993), 523–560.
[40] B. Perthame: “Kinetic formulation of conservation laws”, Oxford University Press, Oxford, 2002.
[41] B. Perthame, P. Souganidis: A limiting case for velocity averaging, Ann. Scient. Ecole Norm. Sup. (4) 31 (1998), 591–598.
[42] L. Saint-Raymond: Du modèle BGK de l’équation de Boltzmann aux équations d’Euler des fluides incompressibles, Bull. Sci. Math. 126 (2002), 493–506.
[43] L. Saint-Raymond: Discrete time Navier-Stokes limit for the BGK Boltzmann equation, Comm. Partial Diff. Eq. 27 (2002), 149–184.
[44] L. Saint-Raymond: From the BGK model to the Navier-Stokes equations, Ann. Sci. Ecole Norm. Sup. (4) 36 (2003), 271–317.
[45] L. Saint-Raymond: Convergence of solutions to the Boltzmann equation in the incompressible Euler limit, Arch. Ration. Mech. Anal. 166 (2003), 47–80.
[46] T. Sideris: Formation of Singularities in 3D Compressible Fluids, Commun. Math. Phys. 101 (1985), 475–485.
[47] Y. Sone: Asymptotic Theory of Flow of a Rarefied Gas over a Smooth Boundary II, in “Rarefied Gas Dynamics”, Vol. II, D. Dini ed., Editrice Tecnico Scientifica, Pisa, 1971, 737–749.
[48] Y. Sone: “Kinetic theory and fluid dynamics”, Birkhäuser Boston, Inc., Boston, MA, 2002.
[49] S.R.S. Varadhan: Entropy methods in hydrodynamic scaling, in “Proceedings of the International Congress of Mathematicians”, Vol. 1, (Zürich, 1994), 196–208, Birkhäuser, Basel, 1995.
[50] H.T. Yau: Relative entropy and hydrodynamics of Ginzburg-Landau models, Lett. Math. Phys. 22 (1991), 63–80.

François Golse
Université Paris 7 & I.U.F., Laboratoire J.-L. Lions
Boîte courrier 187
F-75252 Paris cedex 05, France
e-mail: [email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
Mathematical Aspects of Mean Field Spin Glass Theory

Francesco Guerra

Abstract. A comprehensive review will be given of the rich mathematical structure of mean field spin glass theory, mostly developed, until now, in the frame of the methods of theoretical physics, based on deep physical intuition and hints coming from numerical simulation. Central to our treatment is a very simple and yet powerful interpolation method, which allows us to compare different probabilistic schemes by using convexity and positivity arguments. In this way we can prove the existence of the thermodynamic limit for the free energy density of the system, a long-standing open problem. Moreover, in the frame of a generalized variational principle, we can show the emergence of the Derrida-Ruelle random probability cascades, leading to the form of free energy given by the celebrated Parisi Ansatz. All these results seem to be in full agreement with the mechanism of spontaneous replica symmetry breaking as developed by Giorgio Parisi.
1. Introduction

The mean field model for spin glasses, introduced by David Sherrington and Scott Kirkpatrick more than thirty years ago [1], [2], is a celebrated model. Hundreds and hundreds of articles have been devoted to its study over the years, appearing in the theoretical physics literature. The relevance of the model surely stems from the fact that it is intended to represent some important features of physical spin glass systems, which are of great interest for their peculiar properties: they exhibit a new magnetic phase, where magnetic moments are frozen into disordered equilibrium orientations, without any long-range order. See for example [3] for a very readable review of the physical properties of spin glasses. But another important source of interest is connected with the fact that disordered systems of the Sherrington-Kirkpatrick type, and their generalizations, seem to play a very important role in theoretical and practical assessments of hard optimization problems, as is shown for example by Marc Mézard, Giorgio Parisi and Riccardo Zecchina in [4].

It is interesting to remark that the original paper was entitled “Solvable Model of a Spin-Glass”, while a previous draft, as I have it from David Sherrington, contained the even stronger denomination “Exactly Solvable”. However, it turned out that the very natural solution devised by the authors was valid only at high temperatures, or large external magnetic fields. At low temperatures, on the other hand, the proposed solution exhibited a nonphysical drawback given by a negative entropy, as properly recognized by the authors in their very first paper.

It took some years to find an acceptable solution. This was done by Giorgio Parisi in a series of papers, which marked a radical departure from the previous methods. In fact, a very deep method of “spontaneous replica symmetry breaking” was developed. As a consequence, the physical content of the theory was encoded in a functional order parameter of a new type, and a remarkable structure emerged for the pure states of the theory, a kind of hierarchical, ultrametric organization. These very interesting developments, due to Giorgio Parisi and his coworkers, are explained in a lucid way in the classical book [5]. Part of this structure will be recalled in the following.

It is important to remark that the Parisi solution is presented in the form of an ingenious and clever Ansatz. Until a few years ago it was not known whether this Ansatz would give the true solution for the model in the so-called thermodynamic limit, when the size of the system becomes infinite, or whether it would be only a very good approximation to the true solution.

The general structure offered by the Parisi solution, and its possible generalizations for similar models, exhibits an extremely rich and interesting mathematical content. Very appropriately, Michel Talagrand has inserted a strongly suggestive sentence in the title of his recent book [6]: “Spin glasses: a challenge for mathematicians”.

As a matter of fact, how to face this challenge is a very difficult problem. Here we would like to recall the main features of a very powerful method, yet extremely simple in its very essence, based on a comparison and interpolation argument on sets of Gaussian random variables.
The method found its first simple application in [7], where it was shown that the Sherrington-Kirkpatrick replica symmetric approximate solution was a rigorous lower bound for the quenched free energy of the system, uniformly in the size. Then, it was possible to reach a long-awaited result [8]: the convergence of the free energy density in the thermodynamic limit, by an intermediate step where the quenched free energy was shown to be subadditive in the size of the system.

Moreover, still by interpolation on families of Gaussian random variables, the first mentioned result was extended to give a rigorous proof that the expression given by the Parisi Ansatz is also a lower bound for the quenched free energy of the system, uniformly in the size [9]. The method gives not only the bound, but also the explicit form of the correction, in a quite involved form. As a recent and very important result, in the course of facing the challenge, Michel Talagrand has been able to dominate these correction terms, showing that they vanish in the thermodynamic limit. This milestone achievement was first announced in a short note [10], containing only a synthetic sketch of the proof, and then presented with all details in a long paper to be published in Annals of Mathematics [11].
The interpolation method is also at the basis of the far reaching generalized variational principle proven by Michel Aizenman, Robert Sims and Shannon Starr in [12]. In our presentation, we will try to be as self-contained as possible. We will give all definitions, explain the basic structure of the interpolation method, and show how some of the results are obtained. We will concentrate mostly on questions connected with the free energy, its properties of subadditivity, the existence of the infinite volume limit, and the replica bounds. For the sake of comparison, and in order to provide a kind of warm up, we will recall also some features of the standard elementary mean field model of ferromagnetism, the so called Curie-Weiss model. We will concentrate also here on the free energy, and systematically exploit elementary comparison and interpolation arguments. This will show the strict analogy between the treatment of the ferromagnetic model and the developments in the mean field spin glass case. Basic roles will be played in the two cases, but with different expressions, by positivity and convexity properties. The organization of the paper is as follows. In Section 2, we introduce the ferromagnetic model and discuss behavior and properties of the free energy in the thermodynamic limit, by emphasizing, in this very elementary case, the comparison and interpolation methods that will be also exploited, in a different context, in the spin glass case. Section 3 is devoted to the basic features of the mean field spin glass models, by introducing all necessary definitions. In Section 4, we introduce the Gaussian comparison and interpolation method, by giving simple applications to the existence of the infinite volume limit of the quenched free energy [8], and to the proof of general variational bounds, by following the useful strategy developed in [12]. 
Section 5 will briefly recall the main features of the Parisi representation, and will state the main theorem concerning the free energy. Finally, Section 6 will be devoted to conclusions and an outlook for future developments.

It is a pleasure to thank the Organizing Committee, and in particular Professor Ari Laptev, for the kind invitation to talk in such a stimulating cultural atmosphere.

2. The mean field ferromagnetic model. Structure and results

The mean field ferromagnetic model is among the simplest models of statistical mechanics. However, it contains very interesting features, in particular a phase transition, characterized by spontaneous magnetization, at low temperatures. We refer to standard textbooks, for example [13], for a full treatment, and a complete appreciation of the model in the frame of the theory of ferromagnetism. Here we consider only some properties of the free energy, easily obtained through comparison methods.
The generic configuration of the mean field ferromagnetic model is defined through Ising spin variables $\sigma_i = \pm 1$, attached to each site $i = 1, 2, \dots, N$. The Hamiltonian of the model, in some external field of strength $h$, is given by the mean field expression

$$H_N(\sigma, h) = -\frac{1}{N} \sum_{(i,j)} \sigma_i \sigma_j - h \sum_i \sigma_i. \tag{2.1}$$

Here, the first sum extends to all $N(N-1)/2$ site couples, and the second to all sites. For a given inverse temperature $\beta$, let us now introduce the partition function $Z_N(\beta, h)$ and the free energy per site $f_N(\beta, h)$, according to the well-known definitions

$$Z_N(\beta, h) = \sum_{\sigma_1 \dots \sigma_N} \exp(-\beta H_N(\sigma, h)), \tag{2.2}$$

$$-\beta f_N(\beta, h) = N^{-1} \log Z_N(\beta, h). \tag{2.3}$$

It is also convenient to define the average spin magnetization

$$m = \frac{1}{N} \sum_i \sigma_i. \tag{2.4}$$

Then, it is immediately seen that the Hamiltonian in (2.1) can be equivalently written as

$$H_N(\sigma, h) = -\frac{1}{2} N m^2 - h \sum_i \sigma_i, \tag{2.5}$$

where an unessential constant term has been neglected. In fact we have

$$\sum_{(i,j)} \sigma_i \sigma_j = \frac{1}{2} \sum_{i,j;\, i \neq j} \sigma_i \sigma_j = \frac{1}{2} N^2 m^2 - \frac{1}{2} N, \tag{2.6}$$

where the sum over all couples has been equivalently written as one half the sum over all $i, j$ with $i \neq j$, and the diagonal terms with $i = j$ have been added and subtracted out. Notice that they give a constant because $\sigma_i^2 = 1$. Therefore, the partition function in (2.2) can be equivalently replaced by the expression

$$Z_N(\beta, h) = \sum_{\sigma_1 \dots \sigma_N} \exp\Big(\frac{1}{2} \beta N m^2\Big) \exp\Big(\beta h \sum_i \sigma_i\Big), \tag{2.7}$$

which will be our starting point.

Our interest will be in the limit $\lim_{N \to \infty} N^{-1} \log Z_N(\beta, h)$. To this purpose, let us establish the important subadditivity property, holding for the splitting of the big $N$ site system into two smaller $N_1$ site and $N_2$ site systems, respectively, with $N = N_1 + N_2$,

$$\log Z_N(\beta, h) \leq \log Z_{N_1}(\beta, h) + \log Z_{N_2}(\beta, h). \tag{2.8}$$
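For small systems, the rewriting (2.5)–(2.7) and the subadditivity property (2.8) can be checked by direct enumeration. The sketch below is an illustration only, with function names chosen here: `log_z_pairs` sums over all $2^N$ configurations with the pair Hamiltonian (2.1), while `log_z_magnetization` uses the form (2.7), grouping configurations by the number $k$ of spins equal to $-1$, so that $m = (N - 2k)/N$. By (2.6) the constant neglected in (2.5) equals $1/2$, hence the two logarithms should differ by exactly $\beta/2$.

```python
import itertools
import math


def log_z_pairs(n, beta, h):
    """log Z_N by brute force: Hamiltonian (2.1) summed over all 2^n configurations."""
    total = 0.0
    for sigma in itertools.product((-1, 1), repeat=n):
        pair_sum = sum(sigma[i] * sigma[j] for i in range(n) for j in range(i + 1, n))
        energy = -pair_sum / n - h * sum(sigma)
        total += math.exp(-beta * energy)
    return math.log(total)


def log_z_magnetization(n, beta, h):
    """log Z_N in the form (2.7): configurations grouped by their magnetization m."""
    total = 0.0
    for k in range(n + 1):  # k counts the spins equal to -1
        m = (n - 2 * k) / n
        total += math.comb(n, k) * math.exp(0.5 * beta * n * m * m + beta * h * n * m)
    return math.log(total)


def is_subadditive(n1, n2, beta, h, tol=1e-12):
    """Check inequality (2.8) for one splitting N = n1 + n2, using the form (2.7)."""
    whole = log_z_magnetization(n1 + n2, beta, h)
    parts = log_z_magnetization(n1, beta, h) + log_z_magnetization(n2, beta, h)
    return whole <= parts + tol


if __name__ == "__main__":
    beta, h = 0.7, 0.3
    # The two printed numbers should agree: the forms differ by the constant beta/2.
    print(log_z_pairs(6, beta, h), log_z_magnetization(6, beta, h) - beta / 2)
    print(all(is_subadditive(a, b, beta, h) for a in range(1, 8) for b in range(1, 8)))
```

Note that the subadditivity check is run on the form (2.7), which is the one the inequality is stated for; the dropped constant would shift each logarithm by $\beta/2$.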
The proof is very simple. Let us denote, in the most natural way, by $\sigma_1, \dots, \sigma_{N_1}$ the spin variables for the first subsystem, and by $\sigma_{N_1+1}, \dots, \sigma_N$ the $N_2$ spin variables of the second subsystem. Introduce also the subsystem magnetizations $m_1$ and $m_2$, by adapting the definition (2.4) to the smaller systems, in such a way that

$$N m = N_1 m_1 + N_2 m_2. \tag{2.9}$$

Therefore, we see that the large system magnetization $m$ is the linear convex combination of the smaller system ones, according to the obvious

$$m = \frac{N_1}{N} m_1 + \frac{N_2}{N} m_2. \tag{2.10}$$

Since the mapping $m \to m^2$ is convex, we have also the general bound, holding for all values of the $\sigma$ variables,

$$m^2 \leq \frac{N_1}{N} m_1^2 + \frac{N_2}{N} m_2^2. \tag{2.11}$$

Then, it is enough to substitute the inequality in the definition (2.7) of $Z_N(\beta, h)$, and recognize that we achieve factorization with respect to the two subsystems, and therefore the inequality $Z_N \leq Z_{N_1} Z_{N_2}$. So we have established (2.8). From subadditivity, the existence of the limit follows by a simple argument, as explained for example in [14]. In fact, we have

$$\lim_{N \to \infty} N^{-1} \log Z_N(\beta, h) = \inf_N N^{-1} \log Z_N(\beta, h). \tag{2.12}$$

Now we will calculate explicitly this limit, by introducing an order parameter $M$, a trial function, and an appropriate variational scheme. In order to get a lower bound, we start from the elementary inequality $m^2 \geq 2mM - M^2$, holding for any value of $m$ and $M$. By inserting the inequality in the definition (2.7) we arrive at a factorization of the sum over $\sigma$’s. The sum can be explicitly calculated, and we arrive immediately at the lower bound, uniform in the size of the system,

$$N^{-1} \log Z_N(\beta, h) \geq \log 2 + \log \cosh \beta(h + M) - \frac{1}{2} \beta M^2, \tag{2.13}$$

holding for any value of the trial order parameter $M$. Clearly it is convenient to take the supremum over $M$. Then we establish the optimal uniform lower bound

$$N^{-1} \log Z_N(\beta, h) \geq \sup_M \Big(\log 2 + \log \cosh \beta(h + M) - \frac{1}{2} \beta M^2\Big). \tag{2.14}$$
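The supremum in (2.14) is easy to evaluate numerically. Differentiating the trial function in (2.13) with respect to $M$ gives the stationarity condition $M = \tanh \beta(h + M)$; the sketch below simply iterates this map from $M = +1$ and $M = -1$ and keeps the better of the two limits. This is one possible way to locate the maximizer, with hypothetical function names, and not a procedure taken from the text. At $h = 0$ the iteration returns $M = 0$ for $\beta < 1$ and a nonzero value for $\beta > 1$, which is the spontaneous magnetization mentioned at the beginning of this section.

```python
import math


def trial(beta, h, M):
    """Right-hand side of (2.13) for a trial order parameter M."""
    return math.log(2.0) + math.log(math.cosh(beta * (h + M))) - 0.5 * beta * M * M


def stationary_point(beta, h, start=1.0, iterations=2000):
    """Iterate M -> tanh(beta * (h + M)), where the M-derivative of trial() vanishes."""
    M = start
    for _ in range(iterations):
        M = math.tanh(beta * (h + M))
    return M


def supremum(beta, h):
    """Best lower bound (2.14): compare the limits reached from M = +1 and M = -1."""
    candidates = [stationary_point(beta, h, start=s) for s in (1.0, -1.0)]
    best = max(candidates, key=lambda M: trial(beta, h, M))
    return best, trial(beta, h, best)


if __name__ == "__main__":
    for beta in (0.5, 2.0):
        M, value = supremum(beta, 0.0)
        print(beta, M, value)
```

Close to $\beta = 1$ and $h = 0$ the fixed-point iteration converges slowly, so a larger `iterations` value or a different root finder would be advisable there.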
It is simple to realize that the supremum coincides with the limit as N → ∞. To this purpose we follow the following simple procedure. Let us consider all possible values of the variable m. There are N + 1 of them, corresponding to any number K of possible spin flips, starting from a given σ configuration,
724
F. Guerra
K = 0, 1, . . . , N . Let us consider the trivial decomposition of the identity, holding for any m, 1= δmM , (2.15) M
where M in the sum runs over the N + 1 possible values of m, and δ is the Kronecker delta, equal to 1 if M = m, and zero otherwise. Let us now insert (2.15) in the definition (2.7) of the partition function inside the sum over the σ's, and exchange the two sums. Because of the constraint m = M forced by the δ, we can write $m^2 = 2mM - M^2$ inside the sum. Then if we drop the δ, using the trivial bound δ ≤ 1, we obtain an upper bound, where the sum over the σ's can be explicitly performed as before. Then it is enough to bound each term by the supremum with respect to M, and to consider that there are N + 1 terms in the now trivial sum over M, in order to arrive at the upper bound
$$ N^{-1} \log Z_N(\beta, h) \le \sup_M \Bigl( \log 2 + \log\cosh \beta(h + M) - \frac{1}{2}\beta M^2 \Bigr) + N^{-1} \log(N + 1). \tag{2.16} $$
Therefore, by going to the limit as N → ∞, we can collect all our results in the form of the following theorem giving the full characterization of the thermodynamic limit of the free energy.
Theorem 2.1. For the mean field ferromagnetic model we have
$$ \lim_{N\to\infty} N^{-1} \log Z_N(\beta, h) = \inf_N N^{-1} \log Z_N(\beta, h) = \sup_M \Bigl( \log 2 + \log\cosh \beta(h + M) - \frac{1}{2}\beta M^2 \Bigr). \tag{2.17} $$
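The two-sided control behind Theorem 2.1 can be observed numerically. The following sketch again assumes that (2.7) reads $Z_N = \sum_\sigma \exp\bigl(\beta N(\tfrac12 m^2 + h m)\bigr)$ (an assumption, since the definition is not restated here), evaluates $\log Z_N$ exactly by grouping configurations according to their magnetization, and approximates the supremum over M on a grid. Function names and the grid size are illustrative choices.

```python
import math

def log_partition(n, beta, h):
    """Exact log Z_N: sum over the N+1 values of m with binomial weights."""
    terms = []
    for k in range(n + 1):                      # k spins equal to -1
        m = (n - 2 * k) / n
        log_mult = (math.lgamma(n + 1) - math.lgamma(k + 1)
                    - math.lgamma(n - k + 1))   # log of binomial(n, k)
        terms.append(log_mult + beta * n * (0.5 * m * m + h * m))
    top = max(terms)                            # log-sum-exp for stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def trial(big_m, beta, h):
    """Trial function appearing on the right side of (2.13)."""
    return (math.log(2.0) + math.log(math.cosh(beta * (h + big_m)))
            - 0.5 * beta * big_m * big_m)

def variational_sup(beta, h, grid=20001):
    """Grid approximation of the supremum over M in [-1, 1]."""
    return max(trial(-1.0 + 2.0 * i / (grid - 1), beta, h)
               for i in range(grid))
```

For each N, the value `log_partition(N, beta, h) / N` should lie between `variational_sup(beta, h)` and the same quantity plus `log(N + 1) / N`, so the window closes as N grows. Restricting the grid to [-1, 1] is harmless because the maximizer satisfies M = tanh β(h + M).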
This ends our discussion about the free energy in the ferromagnetic model. Now we are ready to attack the much more difficult spin glass model. But it will be surprising to see that, by following a simple extension of the methods described here, we will arrive at similar results.
3. The basic definitions for the mean field spin glass model
As in the ferromagnetic case, the generic configuration of the mean field spin glass model is defined through Ising spin variables $\sigma_i = \pm 1$, attached to each site i = 1, 2, ..., N. But now there is an external quenched disorder given by the N(N − 1)/2 independent and identically distributed random variables $J_{ij}$, defined for each couple of sites. For the sake of simplicity, we assume each $J_{ij}$ to be a centered unit Gaussian with averages $E(J_{ij}) = 0$, $E(J_{ij}^2) = 1$. By quenched disorder we mean that the J have a kind of stochastic external influence on the system, without participating in the thermal equilibrium.
Now the Hamiltonian of the model, in some external field of strength h, is given by the mean field expression
$$ H_N(\sigma, h, J) = -\frac{1}{\sqrt{N}} \sum_{(i,j)} J_{ij}\, \sigma_i \sigma_j - h \sum_i \sigma_i . \tag{3.1} $$
Here, the first sum extends to all site couples, and the second to all sites. Notice the $\sqrt{N}$, necessary to ensure a good thermodynamic behavior of the free energy. For a given inverse temperature β, let us now introduce the disorder dependent partition function $Z_N(\beta, h, J)$ and the quenched average of the free energy per site $f_N(\beta, h)$, according to the definitions
$$ Z_N(\beta, h, J) = \sum_{\sigma_1 \dots \sigma_N} \exp(-\beta H_N(\sigma, h, J)), \tag{3.2} $$
$$ -\beta f_N(\beta, h) = N^{-1} E \log Z_N(\beta, h, J). \tag{3.3} $$
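Definitions (3.1)–(3.3) can be made concrete by brute force for very small N. The sketch below is illustrative only: it enumerates all $2^N$ configurations, replaces the expectation E by an empirical mean over seeded Gaussian samples, and also returns the annealed quantity $N^{-1}\log E Z_N$ for comparison. The sample count, seed and function names are hypothetical choices.

```python
import itertools
import math
import random

def hamiltonian(sigma, couplings, h):
    """H_N of (3.1), with the pair sum taken over i < j."""
    n = len(sigma)
    pair = sum(couplings[i][j] * sigma[i] * sigma[j]
               for i in range(n) for j in range(i + 1, n))
    return -pair / math.sqrt(n) - h * sum(sigma)

def log_partition(couplings, beta, h):
    """log Z_N(beta, h, J) of (3.2) by full enumeration."""
    n = len(couplings)
    exps = [-beta * hamiltonian(s, couplings, h)
            for s in itertools.product((-1, 1), repeat=n)]
    top = max(exps)
    return top + math.log(sum(math.exp(e - top) for e in exps))

def sample_couplings(n, rng):
    """Independent unit Gaussians J_ij for i < j (other entries unused)."""
    return [[rng.gauss(0.0, 1.0) if j > i else 0.0 for j in range(n)]
            for i in range(n)]

def quenched_and_annealed(n, beta, h, samples=200, seed=0):
    """Empirical N^{-1} E log Z_N and N^{-1} log E Z_N on the same samples."""
    rng = random.Random(seed)
    logs = [log_partition(sample_couplings(n, rng), beta, h)
            for _ in range(samples)]
    quenched = sum(logs) / samples / n
    top = max(logs)
    annealed = (top + math.log(sum(math.exp(v - top) for v in logs)
                               / samples)) / n
    return quenched, annealed
```

By Jensen's inequality the quenched value never exceeds the annealed one, and this holds for the empirical averages as well; at β = 0 both reduce to log 2.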
Notice that in (3.3) the average E with respect to the external noise is taken after the log. This procedure is called quenched averaging. It represents the physical idea that the external noise does not participate in the thermal equilibrium. Only the σ's are thermalized. It is also convenient to write the partition function in the following equivalent form. First of all let us introduce a family of centered Gaussian random variables κ(σ), indexed by the configurations σ, and characterized by the covariances
$$ E\bigl(\kappa(\sigma)\kappa(\sigma')\bigr) = q^2(\sigma, \sigma'), \tag{3.4} $$
where $q(\sigma, \sigma')$ are the overlaps between two generic configurations, defined by
$$ q(\sigma, \sigma') = N^{-1} \sum_i \sigma_i \sigma'_i , \tag{3.5} $$
with the obvious bounds $-1 \le q(\sigma, \sigma') \le 1$, and the normalization $q(\sigma, \sigma) = 1$. Then, starting from the definition (3.1), it is immediately seen that the partition function in (3.2) can be also written, by neglecting unessential constant terms, in the form
$$ Z_N(\beta, h, J) = \sum_{\sigma_1 \dots \sigma_N} \exp\Bigl(\beta \sqrt{\tfrac{N}{2}}\, \kappa(\sigma)\Bigr) \exp\Bigl(\beta h \sum_i \sigma_i\Bigr), \tag{3.6} $$
which will be the starting point of our treatment.
4. Gaussian comparison and applications
Our basic comparison argument will be based on the following very simple theorem.
Theorem 4.1. Let $U_i$ and $\hat U_i$, for i = 1, ..., K, be independent families of centered Gaussian random variables, whose covariances satisfy the inequalities for generic configurations
$$ E(U_i U_j) \equiv S_{ij} \ge E(\hat U_i \hat U_j) \equiv \hat S_{ij} , \tag{4.1} $$
and the equalities along the diagonal
$$ E(U_i U_i) \equiv S_{ii} = E(\hat U_i \hat U_i) \equiv \hat S_{ii} , \tag{4.2} $$
then for the quenched averages we have the inequality in the opposite sense
$$ E \log \sum_i w_i \exp(U_i) \le E \log \sum_i w_i \exp(\hat U_i), \tag{4.3} $$
where the $w_i \ge 0$ are the same in the two expressions.
The proof is extremely simple and amounts to a straightforward calculation. In fact, let us consider the interpolating expression
$$ E \log \sum_i w_i \exp\bigl(\sqrt{t}\, U_i + \sqrt{1 - t}\, \hat U_i\bigr), \tag{4.4} $$
where 0 ≤ t ≤ 1. Clearly the right and left hand sides of (4.3) correspond to the values t = 0 and t = 1 respectively. By taking the derivative with respect to t, and then integrating by parts with respect to the Gaussian variables, we immediately see that the interpolating function is nonincreasing in t, and the theorem follows. On the other hand, considerations of this kind are present in the mathematical literature of some years ago. Two typical references are [15] and [16].
We give here some striking applications of the basic comparison Theorem. In [8] we have given a very simple proof of a long-awaited result, about the convergence of the free energy per site in the thermodynamic limit. Let us show the argument. Let us consider a system of size N and two smaller systems of sizes $N_1$ and $N_2$ respectively, with $N = N_1 + N_2$, as before in the ferromagnetic case. Let us now compare
$$ E \log Z_N(\beta, h, J) = E \log \sum_{\sigma_1 \dots \sigma_N} \exp\Bigl(\beta \sqrt{\tfrac{N}{2}}\, \kappa(\sigma)\Bigr) \exp\Bigl(\beta h \sum_i \sigma_i\Bigr), \tag{4.5} $$
with
$$ E \log \sum_{\sigma_1 \dots \sigma_N} \exp\Bigl(\beta \sqrt{\tfrac{N_1}{2}}\, \kappa^{(1)}(\sigma^{(1)})\Bigr) \exp\Bigl(\beta \sqrt{\tfrac{N_2}{2}}\, \kappa^{(2)}(\sigma^{(2)})\Bigr) \exp\Bigl(\beta h \sum_i \sigma_i\Bigr) \equiv E \log Z_{N_1}(\beta, h, J) + E \log Z_{N_2}(\beta, h, J), \tag{4.6} $$
where $\sigma^{(1)}$ are the $(\sigma_i,\ i = 1, \dots, N_1)$, and $\sigma^{(2)}$ are the $(\sigma_i,\ i = N_1 + 1, \dots, N)$. Covariances for $\kappa^{(1)}$ and $\kappa^{(2)}$ are expressed as in (3.4), but now the overlaps are substituted with the partial overlaps of the first and second block, q1 and
q2 respectively. It is very simple to apply the comparison theorem. All one has to do is to observe that the obvious
$$ N q = N_1 q_1 + N_2 q_2 , \tag{4.7} $$
analogous to (2.9), implies, as in (2.11),
$$ q^2 \le \frac{N_1}{N}\, q_1^2 + \frac{N_2}{N}\, q_2^2 . \tag{4.8} $$
Therefore, the comparison gives the superadditivity property, to be compared with (2.8),
$$ E \log Z_N(\beta, h, J) \ge E \log Z_{N_1}(\beta, h, J) + E \log Z_{N_2}(\beta, h, J). \tag{4.9} $$
From the superadditivity property the existence of the limit follows in the form
$$ \lim_{N\to\infty} N^{-1} E \log Z_N(\beta, h, J) = \sup_N N^{-1} E \log Z_N(\beta, h, J), \tag{4.10} $$
to be compared with (2.12).
The second application is in the form of the Aizenman-Sims-Starr generalized variational principle. Here, we will need to introduce some auxiliary system. The denumerable configuration space is given by the values of α = 1, 2, .... We introduce also a probability measure $w_\alpha$ for the α system, and suitably defined overlaps between two generic configurations $p(\alpha, \alpha')$, with $p(\alpha, \alpha) = 1$. A family of centered Gaussian random variables $\hat\kappa(\alpha)$, now indexed by the configurations α, will be defined by the covariances
$$ E\bigl(\hat\kappa(\alpha)\hat\kappa(\alpha')\bigr) = p^2(\alpha, \alpha'). \tag{4.11} $$
We will need also a family of centered Gaussian random variables $\eta_i(\alpha)$, indexed by the sites i of our original system and the configurations α of the auxiliary system, so that
$$ E\bigl(\eta_i(\alpha)\eta_{i'}(\alpha')\bigr) = \delta_{ii'}\, p(\alpha, \alpha'). \tag{4.12} $$
Both the probability measure $w_\alpha$ and the overlaps $p(\alpha, \alpha')$ could depend on some additional external quenched noise, that does not appear explicitly in our notation. In the following, we will denote by E averages with respect to all random variables involved. In order to start the comparison argument, we will consider first the case where the two σ and α systems are not coupled, so that they appear factorized in the form
$$ E \log \sum_{\sigma_1 \dots \sigma_N} \sum_\alpha w_\alpha \exp\Bigl(\beta \sqrt{\tfrac{N}{2}}\, \kappa(\sigma)\Bigr) \exp\Bigl(\beta \sqrt{\tfrac{N}{2}}\, \hat\kappa(\alpha)\Bigr) \exp\Bigl(\beta h \sum_i \sigma_i\Bigr) \equiv E \log Z_N(\beta, h, J) + E \log \sum_\alpha w_\alpha \exp\Bigl(\beta \sqrt{\tfrac{N}{2}}\, \hat\kappa(\alpha)\Bigr). \tag{4.13} $$
In the second case the κ fields are suppressed and the coupling between the two systems will be taken in a very simple form, by allowing the η field
to act as an external field on the σ system. In this way the σ's appear as factorized, and the sums can be explicitly performed. The chosen form for the second term in the comparison is
$$ E \log \sum_{\sigma_1 \dots \sigma_N} \sum_\alpha w_\alpha \exp\Bigl(\beta \sum_i \eta_i(\alpha)\sigma_i\Bigr) \exp\Bigl(\beta h \sum_i \sigma_i\Bigr) \equiv N \log 2 + E \log \sum_\alpha w_\alpha\, (c_1 c_2 \cdots c_N), \tag{4.14} $$
where we have defined
$$ c_i = \cosh \beta(h + \eta_i(\alpha)), \tag{4.15} $$
as arising from the sums over σ's. Now we apply the comparison Theorem. In the first case, the covariances involve the sums of squares of overlaps
$$ \frac{1}{2}\bigl[ q^2(\sigma, \sigma') + p^2(\alpha, \alpha') \bigr]. \tag{4.16} $$
In the second case, a very simple calculation shows that the covariances involve the overlap products
$$ q(\sigma, \sigma')\, p(\alpha, \alpha'). \tag{4.17} $$
Therefore, the comparison is very easy and, by collecting all expressions, we end up with the useful estimate, as in [12], holding for any auxiliary system as defined before,
$$ N^{-1} E \log Z_N(\beta, h, J) \le \log 2 + N^{-1} E \log \sum_\alpha w_\alpha\, (c_1 c_2 \cdots c_N) - N^{-1} E \log \sum_\alpha w_\alpha \exp\Bigl(\beta \sqrt{\tfrac{N}{2}}\, \hat\kappa(\alpha)\Bigr). \tag{4.18} $$
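The comparison arguments of this section rest on elementary facts about covariances, which can be verified exhaustively for small N. The sketch below (function names are hypothetical) checks three things: that the disorder term of (3.1), with the pair sum over i < j, has covariance $\tfrac{N}{2} q^2 - \tfrac12$, which is the constant neglected in (3.6); that the block inequality (4.8) holds for every pair of configurations; and that (4.16) dominates (4.17).

```python
import itertools

def overlap(a, b):
    """Overlap q(a, b) = N^{-1} sum_i a_i b_i, as in (3.5)."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

def pair_covariance(a, b):
    """Covariance of N^{-1/2} sum_{i<j} J_ij s_i s_j between a and b."""
    n = len(a)
    return sum(a[i] * a[j] * b[i] * b[j]
               for i in range(n) for j in range(i + 1, n)) / n

def check_covariances(n, n1):
    """Worst deviations over all pairs of configurations of size n."""
    worst_identity = 0.0
    worst_split = 0.0
    configs = list(itertools.product((-1, 1), repeat=n))
    for a in configs:
        for b in configs:
            q = overlap(a, b)
            # covariance identity behind (3.4) and (3.6)
            deviation = abs(pair_covariance(a, b) - (0.5 * n * q * q - 0.5))
            worst_identity = max(worst_identity, deviation)
            # block inequality (4.8): violation must not be positive
            q1 = overlap(a[:n1], b[:n1])
            q2 = overlap(a[n1:], b[n1:])
            violation = q * q - (n1 / n) * q1 * q1 - ((n - n1) / n) * q2 * q2
            worst_split = max(worst_split, violation)
    return worst_identity, worst_split

def uncoupled_minus_coupled(q, p):
    """(4.16) minus (4.17); equals (q - p)^2 / 2, hence nonnegative."""
    return 0.5 * (q * q + p * p) - q * p
```

On the diagonal (σ = σ', α = α') both (4.16) and (4.17) equal 1, so the hypotheses (4.1) and (4.2) of Theorem 4.1 are met in both applications.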
5. The Parisi representation for the free energy
We refer to the original paper [17], and to the extensive review given in [5], for the general motivations, and the derivation of the broken replica Ansatz, in the frame of the ingenious replica trick. Here we limit ourselves to a synthetic description of its general structure, independently of the replica trick. First of all, let us introduce the convex space X of the functional order parameters x, as nondecreasing functions of the auxiliary variable q, both x and q taking values on the interval [0, 1], i.e.,
$$ X \ni x : [0, 1] \ni q \to x(q) \in [0, 1]. \tag{5.1} $$
Notice that we call x the function, and x(q) its values. We introduce a metric on X through the $L^1([0, 1], dq)$ norm, where dq is the Lebesgue measure. For our purposes, we will consider the case of piecewise constant functional order parameters, characterized by an integer K, and two sequences $q_0, q_1, \dots, q_K$, $m_1, m_2, \dots, m_K$ of numbers satisfying
$$ 0 = q_0 \le q_1 \le \dots \le q_{K-1} \le q_K = 1, \qquad 0 \le m_1 \le m_2 \le \dots \le m_K \le 1, \tag{5.2} $$
such that
$$ x(q) = m_1 \ \text{for}\ 0 = q_0 \le q < q_1, \quad x(q) = m_2 \ \text{for}\ q_1 \le q < q_2, \quad \dots, \quad x(q) = m_K \ \text{for}\ q_{K-1} \le q \le q_K . \tag{5.3} $$
In the following, we will find it convenient to define also $m_0 \equiv 0$, and $m_{K+1} \equiv 1$. The replica symmetric case of Sherrington and Kirkpatrick corresponds to
$$ K = 2, \quad q_1 = \bar q, \quad m_1 = 0, \quad m_2 = 1. \tag{5.4} $$
Let us now introduce the function f, with values f(q, y; x, β), of the variables q ∈ [0, 1], y ∈ R, depending also on the functional order parameter x, and on the inverse temperature β, defined as the solution of the nonlinear antiparabolic equation
$$ (\partial_q f)(q, y) + \frac{1}{2} (\partial_y^2 f)(q, y) + \frac{1}{2}\, x(q)\, (\partial_y f)^2(q, y) = 0, \tag{5.5} $$
with final condition
$$ f(1, y) = \log\cosh(\beta y). \tag{5.6} $$
Here, we have stressed only the dependence of f on q and y. It is very simple to integrate Eq. (5.5) when x is piecewise constant. In fact, consider $x(q) = m_a$, for $q_{a-1} \le q \le q_a$, first with $m_a > 0$. Then, it is immediately seen that the correct solution of Eq. (5.5) in this interval, with the right final boundary condition at $q = q_a$, is given by
$$ f(q, y) = \frac{1}{m_a} \log \int \exp\bigl( m_a f(q_a, y + z\sqrt{q_a - q}) \bigr)\, d\mu(z), \tag{5.7} $$
where dµ(z) is the centered unit Gaussian measure on the real line. On the other hand, if $m_a = 0$, then (5.5) loses the nonlinear part and the solution is given by
$$ f(q, y) = \int f(q_a, y + z\sqrt{q_a - q})\, d\mu(z), \tag{5.8} $$
which can be seen also as deriving from (5.7) in the limit $m_a \to 0$. Starting from the last interval K, and using (5.7) iteratively on each interval, we easily get the solution of (5.5), (5.6), in the case of a piecewise constant order parameter x, as in (5.3).
Now we introduce the following important definitions. The trial auxiliary function, associated to a given mean field spin glass system, as described in Section 3, depending on the functional order parameter x, is defined as
$$ \log 2 + f(0, h; x, \beta) - \frac{\beta^2}{2} \int_0^1 q\, x(q)\, dq . \tag{5.9} $$
Notice that in this expression the function f appears evaluated at q = 0, and y = h, where h is the value of the external magnetic field. This trial expression should be considered as the analog of that appearing in (2.13) for the ferromagnetic case.
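The backward iteration of (5.7) and (5.8) can be sketched numerically for a piecewise constant order parameter. The code below is a minimal illustration, not an efficient solver: it approximates dµ(z) by a normalized uniform grid on [-8, 8] (an assumed discretization), recurses from $q_K = 1$ down to $q_0 = 0$, and then assembles the trial function (5.9). The cost grows like (number of nodes)^K, so it is only practical for small K. All names are hypothetical.

```python
import math

def log_cosh(x):
    """Numerically stable log(cosh(x))."""
    ax = abs(x)
    return ax + math.log1p(math.exp(-2.0 * ax)) - math.log(2.0)

def gaussian_nodes(half_width=8.0, n=161):
    """Uniform grid and normalized weights approximating dmu(z)."""
    step = 2.0 * half_width / (n - 1)
    zs = [-half_width + k * step for k in range(n)]
    ws = [math.exp(-0.5 * z * z) for z in zs]
    total = sum(ws)
    return zs, [w / total for w in ws]

def parisi_f(a, y, qs, ms, beta, nodes):
    """f(q_a, y), with qs = [q_0..q_K] and ms = [m_1..m_K]."""
    if a == len(ms):
        return log_cosh(beta * y)                 # final condition (5.6)
    width = qs[a + 1] - qs[a]
    if width == 0.0:                              # empty interval: skip it
        return parisi_f(a + 1, y, qs, ms, beta, nodes)
    m = ms[a]                                     # x(q) on [q_a, q_{a+1}]
    s = math.sqrt(width)
    zs, ws = nodes
    vals = [parisi_f(a + 1, y + z * s, qs, ms, beta, nodes) for z in zs]
    if m == 0.0:
        return sum(w * v for w, v in zip(ws, vals))            # (5.8)
    return math.log(sum(w * math.exp(m * v)
                        for w, v in zip(ws, vals))) / m        # (5.7)

def parisi_trial(qs, ms, beta, h, nodes=None):
    """Trial function (5.9) for a piecewise constant x."""
    nodes = nodes or gaussian_nodes()
    integral = sum(m * (qs[a + 1] ** 2 - qs[a] ** 2) / 2.0
                   for a, m in enumerate(ms))     # int_0^1 q x(q) dq
    return (math.log(2.0) + parisi_f(0, h, qs, ms, beta, nodes)
            - 0.5 * beta ** 2 * integral)
```

For the replica symmetric parameters (5.4), for instance `parisi_trial([0.0, 0.5, 1.0], [0.0, 1.0], beta, h)`, the inner Gaussian integral can be done in closed form, which gives a convenient consistency check of the recursion.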
The Parisi spontaneously broken replica symmetry expression for the free energy is given by the definition
$$ -\beta f_P(\beta, h) \equiv \inf_x \Bigl( \log 2 + f(0, h; x, \beta) - \frac{\beta^2}{2} \int_0^1 q\, x(q)\, dq \Bigr), \tag{5.10} $$
where the infimum is taken with respect to all functional order parameters x. Notice that the infimum appears here, as compared to the supremum in the ferromagnetic case. In [9], by exploiting a kind of generalized comparison argument, involving a suitably defined interpolation function, we have established the following important result.
Theorem 5.1. For all values of the inverse temperature β, and the external magnetic field h, and for any functional order parameter x, the following bound holds
$$ N^{-1} E \log Z_N(\beta, h, J) \le \log 2 + f(0, h; x, \beta) - \frac{\beta^2}{2} \int_0^1 q\, x(q)\, dq , $$
uniformly in N. Consequently, we have also
$$ N^{-1} E \log Z_N(\beta, h, J) \le \inf_x \Bigl( \log 2 + f(0, h; x, \beta) - \frac{\beta^2}{2} \int_0^1 q\, x(q)\, dq \Bigr), $$
uniformly in N.
However, this result can be understood also in the frame of the generalized variational principle established by Aizenman-Sims-Starr and described before. In fact, one can easily show that there exists an α system such that
$$ N^{-1} E \log \sum_\alpha w_\alpha\, (c_1 c_2 \cdots c_N) \equiv f(0, h; x, \beta), \qquad N^{-1} E \log \sum_\alpha w_\alpha \exp\Bigl(\beta \sqrt{\tfrac{N}{2}}\, \hat\kappa(\alpha)\Bigr) \equiv \frac{\beta^2}{2} \int_0^1 q\, x(q)\, dq , $$
uniformly in N. This result stems from previous work of Derrida, Ruelle, Neveu, Bolthausen, Sznitman, Aizenman, Talagrand, Bovier, and others, and in a sense is implicit in the treatment given in [5]. We plan to deal with this important representation in a forthcoming note. We see that the estimates in Theorem 5.1 are also a consequence of the generalized variational principle. Up to this point we have seen how to obtain upper bounds. The problem arises whether, as in the ferromagnetic case, we can also get lower bounds, so as to pin the thermodynamic limit to the value given by the $\inf_x$ in Theorem 5.1. After a short announcement in [10], Michel Talagrand wrote an extended paper [11], to appear in Annals of Mathematics, where the complete proof of the control of the lower bound is firmly established. We refer to the original paper for the complete details of this remarkable achievement. About the methods, here we only recall that in [9] we have given also the corrections to the bounds
appearing in Theorem 5.1, albeit in a quite complicated form. Talagrand, with great courage, has been able to establish that these corrections do in fact vanish in the thermodynamic limit. In conclusion, we can establish the following extension of Theorem 2.1 to spin glasses.
Theorem 5.2. For the mean field spin glass model we have
$$ \lim_{N\to\infty} N^{-1} E \log Z_N(\beta, h, J) = \sup_N N^{-1} E \log Z_N(\beta, h, J) = \inf_x \Bigl( \log 2 + f(0, h; x, \beta) - \frac{\beta^2}{2} \int_0^1 q\, x(q)\, dq \Bigr). \tag{5.11} $$
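As a worked instance of the expression inside the infimum in (5.11), the replica symmetric choice (5.4) can be inserted into (5.7)–(5.9). The following derivation is a sketch that uses only those formulas together with the Gaussian identity $\int \cosh(a + bz)\, d\mu(z) = \cosh(a)\, e^{b^2/2}$.

```latex
\begin{align*}
% last interval [\bar q, 1], where x = m_2 = 1: apply (5.7) to the final condition (5.6)
f(\bar q, y) &= \log \int \cosh \beta\bigl(y + z\sqrt{1 - \bar q}\bigr)\, d\mu(z)
              = \log\cosh(\beta y) + \frac{\beta^2}{2}(1 - \bar q), \\
% first interval [0, \bar q], where x = m_1 = 0: apply (5.8)
f(0, h) &= \int \log\cosh \beta\bigl(h + z\sqrt{\bar q}\bigr)\, d\mu(z)
           + \frac{\beta^2}{2}(1 - \bar q), \\
% the integral term in (5.9)
\int_0^1 q\, x(q)\, dq &= \int_{\bar q}^1 q\, dq = \frac{1}{2}\bigl(1 - \bar q^{\,2}\bigr), \\
% collecting terms: (1 - \bar q)/2 - (1 - \bar q^2)/4 = (1 - \bar q)^2/4
\log 2 + f(0, h; x, \beta) - \frac{\beta^2}{2}\int_0^1 q\, x(q)\, dq
  &= \log 2 + \int \log\cosh \beta\bigl(h + z\sqrt{\bar q}\bigr)\, d\mu(z)
     + \frac{\beta^2}{4}(1 - \bar q)^2 .
\end{align*}
```

By Theorem 5.1 this expression is an upper bound on $N^{-1} E \log Z_N(\beta, h, J)$ for every $\bar q \in [0, 1]$, uniformly in N.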
6. Conclusion and outlook for future developments
As we have seen, in these last few years there has been impressive progress in the understanding of the mathematical structure of spin glass models, mainly due to the systematic exploration of comparison and interpolation methods. However many important problems are still open. The most important one is to establish rigorously the full hierarchical ultrametric organization of the overlap distributions, as appears in Parisi theory, and to fully understand the decomposition in pure states of the glassy phase, at low temperatures. Moreover, it would be important to extend these methods to other important disordered models, as for example neural networks. Here the difficulty is that the positivity arguments, so essential in comparison methods, do not seem to emerge naturally inside the structure of the theory. We plan to report on these problems in future works.
Acknowledgments. We gratefully acknowledge useful conversations with Michael Aizenman, Pierluigi Contucci, Giorgio Parisi and Michel Talagrand. The strategy explained in this paper grew out of a systematic exploration of comparison and interpolation methods, developed in collaboration with Fabio Lucio Toninelli. This work was supported in part by MIUR (Italian Ministry of Instruction, University and Research), and by INFN (Italian National Institute for Nuclear Physics).
References
[1] D. Sherrington and S. Kirkpatrick, Solvable Model of a Spin-Glass, Phys. Rev. Lett. 35, 1792–1796 (1975).
[2] S. Kirkpatrick and D. Sherrington, Infinite-ranged models of spin-glasses, Phys. Rev. B17, 4384–4403 (1978).
[3] D.L. Stein, Disordered Systems: Mostly Spin Glasses, in: Lectures in the Sciences of Complexity, ed. D.L. Stein, Addison-Wesley, NY, 1989.
[4] M. Mézard, G. Parisi and R. Zecchina, Analytic and Algorithmic Solution of Random Satisfiability Problems, Science 297, 812 (2002).
[5] M. Mézard, G. Parisi and M. A. Virasoro, Spin glass theory and beyond, World Scientific, Singapore, 1987.
[6] M. Talagrand, Spin glasses: a challenge for mathematicians. Mean field models and cavity method, Springer-Verlag, Berlin (2003).
[7] F. Guerra, Sum rules for the free energy in the mean field spin glass model, Fields Institute Communications 30, 161 (2001).
[8] F. Guerra and F.L. Toninelli, The Thermodynamic Limit in Mean Field Spin Glass Models, Commun. Math. Phys. 230, 71–79 (2002).
[9] F. Guerra, Broken Replica Symmetry Bounds in the Mean Field Spin Glass Model, Commun. Math. Phys. 233, 1–12 (2003).
[10] M. Talagrand, The Generalized Parisi Formula, Compte Rendu de l’Académie des Sciences, Paris 337, 111–114 (2003).
[11] M. Talagrand, The Parisi formula, Annals of Mathematics, to appear.
[12] M. Aizenman, R. Sims and S. Starr, Extended variational principle for the Sherrington-Kirkpatrick spin-glass model, Phys. Rev. B68, 214403 (2003).
[13] H.E. Stanley, Introduction to phase transitions and critical phenomena, Oxford University Press, New York and London, 1971.
[14] D. Ruelle, Statistical mechanics. Rigorous results, W.A. Benjamin Inc., New York, 1969.
[15] K. Joag-dev, M.D. Perlman and L.D. Pitt, Association of normal random variables and Slepian’s inequality, Annals of Probability 11, 451–455 (1983).
[16] J.-P. Kahane, Une inégalité du type Slepian and Gordon sur les processus gaussiens, Israel J. Math. 55, 109–110 (1986).
[17] G. Parisi, A sequence of approximate solutions to the S-K model for spin glasses, J. Phys. A13, L-115 (1980).
Francesco Guerra
Dipartimento di Fisica
Università di Roma “La Sapienza” and INFN, Sezione di Roma1
Piazzale A. Moro 2
I-00185 Roma, Italy
e-mail:
[email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
Complexity Theory, Proofs and Approximation
Johan Håstad
Abstract. We give a short introduction to some questions in complexity theory and proceed to describe some recent developments. In particular, we discuss probabilistically checkable proofs and their applications in establishing inapproximability results. In a traditional proof the proof-checker reads the entire proof and decides deterministically whether the proof is correct. In a probabilistically checkable proof the proof-checker randomly verifies only a very small portion of the proof but still cannot be fooled into accepting a false claim except with small probability.
1. Introduction
The question of what can be done in a completely mechanical way has now been studied for at least 70 years. The early studies led to the invention of the Turing machine [32], a formal model of computation in the form of a primitive computer, and to the definition that a task can be solved mechanically iff it can be solved by the Turing machine. Many other definitions of mechanical computability were proposed, but as they were all proved to be equivalent this led to the consensus that indeed the correct model had been found. The (nonmathematical) statement that computability by the Turing machine indeed captures the true spirit of the intuitive notion of “computable by a mechanical procedure” is usually called “Church’s thesis”. With the invention of modern computers it was realized that in practice it does not make a big difference whether a problem cannot be solved at all by a computer or if any general solution requires $10^{200}$ elementary computational steps. In the latter case, even if every atom in the universe were turned into a super-fast computer, we would not see the end of the computation before our sun has long ceased to exist. This realization led to the development of complexity theory, where we do not only care if the problem can be solved mechanically, but where we also study how many elementary computational steps are needed. One basic parameter of complexity theory is the length of the input. Clearly it is reasonable to expect that more operations are needed to factor a 1024-bit number than a 128-bit number. The variable n is usually used to denote the size of the input. Sometimes it is simply the number of bits needed to specify the input but it is also commonly used to denote a more natural size parameter closely related to the number of bits needed to specify the input. In
particular, if one studies graphs, n is usually the number of nodes in the graph, while the number of bits to fully specify a graph on n nodes is $n^2$. The most studied computational problems are problems where “reasonable” size instances can be solved “reasonably” quickly. This informally defined set of problems in practice coincides well with a class that can be defined formally in a simple way: the class P, polynomial time. A computational problem belongs to P if the number of elementary steps needed to solve it on instances of size n can be bounded by a polynomial, i.e., as $O(n^k)$ for some constant k. Many computational problems could be put into the class P, some by straightforward algorithms and some by more sophisticated algorithms, but some natural problems resisted all attempts. Many such problems had the additional property that if indeed the answer was found then it could be verified in polynomial time. An example would be integer factorization. It might be difficult to find the factors but once they are found it is easy to verify that indeed we have a correct solution. This gave birth to the complexity class NP, and the most famous problem of complexity theory is whether all problems in NP also belong to P. This problem is still open and is one of the seven million-dollar millennium problems of the Clay institute. In spite of its importance both in theory and practice, it is our belief that this is the Clay problem with the fewest active attackers. It seems like many people in the area are waiting for a new idea to surface before it will be possible to fruitfully devote time to this problem. Hopefully that new idea will come soon. For most people, from an intuitive standpoint, it would seem obvious that NP ≠ P. If indeed any problem in NP also was in P, this would mean that whenever it is “easy” to verify a found solution it is also “easy” to find the solution.
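The asymmetry between finding and verifying can be illustrated with the factorization example. In the sketch below (function names are hypothetical) the verifier only multiplies the claimed factors, which takes time polynomial in the number of bits, while the naive search by trial division may need on the order of √n steps, which is exponential in the bit length. The verifier checks only that the factorization is nontrivial; it does not test primality of the factors.

```python
def verify_factorization(n, factors):
    """Accept iff `factors` is a nontrivial factorization of n."""
    if len(factors) < 2 or any(f <= 1 for f in factors):
        return False
    product = 1
    for f in factors:
        product *= f
    return product == n

def find_factor_by_trial_division(n):
    """Smallest nontrivial factor of n, or None if none exists."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return None
```

For example, `verify_factorization(91, [7, 13])` accepts after a single multiplication, whereas `find_factor_by_trial_division` has to search for the factor 7.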
One can think of an NP-statement as a “theorem” and the fact that it is easy to verify a solution would translate into having a short proof. The conclusion would now be that it is also easy to find the proof, and for mathematicians spending their lives missing short proofs this would seem especially surprising. The feeling among almost all people working in computational complexity is that the basic intuition is correct and indeed that NP ≠ P. The state of complexity theory is such, however, that we currently have no idea how to prove this. The machinery to prove lower bounds is simply too primitive. To prove that a problem does not lie in P we have to prove that any fast algorithm, no matter how crazy, makes a mistake on some input, and this quantification over all algorithms is problematic. The fact that we cannot handle this basic question of complexity theory is a major stumbling block to the continued development of the theory, because if we cannot tell whether NP = P there are many questions that we cannot answer. One major technique is to relate other questions to the NP/P-question. One prominent example of this is to prove that a given computational problem, X, is NP-complete or NP-hard. In either case, if X belongs to P then
NP = P. This is in many cases the best available evidence that a problem cannot be solved efficiently. The notion of NP-completeness was put forth in 1971 by Cook [11] and soon extended by Karp [23]. Somewhat surprisingly, we have since then seen that very many problems are NP-complete. In fact, there are very few, a handful some would argue, natural problems that are not known either to lie in P or to be NP-hard. Some famous examples of such problems would be integer factorization, discrete logarithms and graph isomorphism. One of the most famous problems in NP is the Traveling Salesperson Problem, TSP, in which we have n cities and are given distances between the cities. The goal is to find a tour that visits all cities once and is of minimal total length. This was one of the first problems shown to be NP-hard, in 1972 [23]. Thus if we believe that NP ≠ P then we cannot solve this efficiently and optimally for all instances of the problem. This does not prevent us from looking for algorithms with interesting properties. We can study algorithms that find the optimum on random instances, algorithms that find a reasonably good solution on all instances or even algorithms that find reasonably good solutions on random instances. To study random instances one defines a probability distribution on the inputs. If one puts no condition on the probability distribution this is not a very interesting notion in that, in this case, one can make sure that random instances behave like worst case instances. If, however, one demands that the probability distribution is simple, this could possibly change the situation. One definition of simple is that instances with the given distribution can be generated by a probabilistic algorithm running in polynomial time. With this notion Levin [28] proved that NP-complete problems exist but they have proved to be rare and rather special.
On the other hand, even for difficult problems, it is often easy to come up with some notion of random instances that makes the problem easy on the average. Whether such problems capture some real property of the computational problem or are simply a consequence of a “friendly” distribution is sometimes mostly a matter of taste. A mathematically more appealing notion is that of an approximation algorithm with a guaranteed approximation ratio. Consider TSP discussed above and suppose that we have the triangle-inequality. In this case there is a very efficient algorithm that finds a tour that is at most twice as long as the optimal tour. There is also a less efficient, but still polynomial time, algorithm by Christofides [10] that finds a tour that is at most a factor 1.5 longer than the optimum. This ratio of approximation is true for any input. This raises the question whether this is the best achievable factor and what can be said for other problems. Do all N P -hard optimization problems allow some nontrivial approximation algorithms? We discuss some of the most famous results on this question in the technical part of this paper. Positive results that prove that some problems can be solved within specified factors are complemented by results that show, based on some assumption,
usually that NP ≠ P, that there is no polynomial time algorithm achieving a given factor. Many of these results are based on the very interesting notion called probabilistically checkable proofs. To describe these let us first describe NP as a proof system. The most typical NP-complete problem is satisfiability of Boolean formulas. We are given a formula ϕ with logical connectives ∧ (and), ∨ (or) and negation and Boolean variables $x_1, x_2, \dots, x_n$. The question is whether there is an assignment that makes the formula evaluate to true. If there is such an assignment, once it is found it can be checked quickly. We can view this assignment as a proof that ϕ is satisfiable. It is checked by a proof-checker, which in computer science traditionally is called a verifier and denoted by V, that simply evaluates the formula on this assignment and verifies that it evaluates to true. This is an excellent proof-system: each true statement of the given type, i.e., “ϕ is satisfiable”, has a proof that is fairly short and can be checked efficiently by V. The proof-system is perfectly sound in that V is never convinced of an incorrect statement. It seems like there is little more we can hope for, but it turns out to be profitable to ask how much of the proof V has to read. Naively one would think that V would have to read the entire proof but this is in fact not true in general. In a PCP we have a statement, still of the form “ϕ is satisfiable”, and a written proof. The verifier V is, however, probabilistic and reads only a very small portion of the proof. In fact, it might decide to look at as little as three bits of the proof. Completeness is as before in that V accepts a correct proof for a correct theorem with probability 1. Soundness has to be relaxed, and in fact all that can be achieved is that any proof for an incorrect statement is accepted with probability at most s, for some constant s < 1. It is a remarkable theorem by Arora et al.
[5], the PCP-theorem, that this is in fact possible. We elaborate on the connection to approximability in the technical part of the paper but let us here give a glimpse of the connection. Given a statement ϕ, for which we do not know whether it is satisfiable, we can consider the optimization problem of finding the “best” proof for it. The quality of the proof is defined as the probability that V accepts it. If ϕ is satisfiable we know that there is a proof that makes V accept with probability 1. On the other hand, if ϕ is not satisfiable no proof makes V accept with probability greater than s. This implies that if we could approximate the optimum of the proof optimization problem within a factor better than 1/s we could in fact determine whether ϕ is satisfiable. Since the latter is an NP-hard problem, so is the former, and thus “all” we have to do is to construct the PCP in such a way that the proof optimization problem is in fact the optimization problem we care about. For the record let us note that there are other ways to use PCPs to get inapproximability results, but the proof optimization problem is the most basic.
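The two kinds of verifier and the gap argument can be sketched in code. This is only a toy illustration and not the construction behind the PCP-theorem: the formula is assumed to be given in conjunctive normal form with three literals per clause (encoded as signed integers, an assumed convention), the "proof" is simply an assignment, and the randomized verifier picks one clause uniformly at random and reads only its three bits. For this naive verifier the soundness is not bounded by a constant s < 1; obtaining such a constant is exactly what the PCP-theorem provides. All names are hypothetical.

```python
from fractions import Fraction

def literal_true(literal, assignment):
    """Literal k means x_k, literal -k means its negation."""
    value = assignment[abs(literal)]
    return value if literal > 0 else not value

def clause_satisfied(clause, assignment):
    return any(literal_true(lit, assignment) for lit in clause)

def np_verifier(formula, assignment):
    """Classical verifier: reads the whole assignment, checks every clause."""
    return all(clause_satisfied(c, assignment) for c in formula)

def acceptance_probability(formula, assignment):
    """Exact acceptance probability of the toy random-clause verifier."""
    satisfied = sum(1 for c in formula if clause_satisfied(c, assignment))
    return Fraction(satisfied, len(formula))

def decide_from_gap(approx_value, s):
    """Gap argument: approx_value is a lower estimate of the optimum that is
    within a factor better than 1/s. It exceeds s exactly when the optimum
    is 1, i.e. when the statement is true."""
    return approx_value > s
```

The proof optimization problem of this toy verifier is maximizing the fraction of satisfied clauses, which shows how the quality of a proof becomes the objective of an optimization problem.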
In the rest of this paper we essentially retell the story told in the introduction but in a more technical way. We give essentially no proofs and some of the definitions are not totally rigorous, but the aim is to give the interested reader enough detail to convey a feeling for the area.
2. Basic definitions
We are studying efficient algorithms for computational problems and thus we should define “algorithm” and “computational problem”. The standard formal definition of an algorithm goes through the notion of a Turing machine which is a bit cumbersome and we choose not to do this. It is also the case that almost any intuitive notion of an algorithm can be formalized in a suitable manner leading to an equivalent notion. Anybody who has written a computer program and is comfortable with formal definitions should easily be able to abstract the notion. The only crucial point is that we want the computer words to be bounded in size. If we bound the size by a constant this makes the model slightly awkward in that indirect addressing does not allow us to access all of the memory in a straightforward manner. If we allow ourselves words that are of a length that is logarithmic in the amount of memory we use, indirect addressing works without problems and we get a robust, simple and intuitive model of computation. The number of operations is defined as the number of machine steps performed. Those who have not programmed have done calculation by hand. There is no essential difference to machine calculation. The fact that the word size of the computer is limited is reflected in the fact that we have a finite number of symbols. Now you can simply count the number of symbols written and erased. The details of the model do affect the number of operations, but not on the level of detail we discuss in this paper. The formal definition of a computational problem is simple but maybe not very informative.
We use the binary alphabet, i.e., Σ = {0, 1}. Inputs and outputs of our algorithms are nonempty finite strings over Σ, and the set of these strings is denoted by $\Sigma^*$. A computational problem is now simply a mapping from $\Sigma^*$ to $\Sigma^*$. To make informal sense of a computational problem we need a more intuitive way of thinking of the input and the output. Most of the time, this is easy; integers are specified by their representation in the binary number system, text by the ASCII values of its characters, etc. Sometimes the situation is more complicated and, in particular if we want an algorithm to deal with more or less arbitrary real numbers, we have to be careful, but this goes beyond the scope of this paper.

For each computational problem we have a parameter n which nicely measures the size of the instance. Let us take an example. Consider the problems of adding and multiplying two integers, each with n bits. It is not difficult to convince oneself that the standard grade-school algorithm for adding two numbers runs in O(n) time
and this is optimal since any algorithm must read its input. The grade-school algorithm for multiplying two numbers multiplies each digit of one number with each digit of the other, resulting in an $O(n^2)$ time algorithm. There are many ways to improve this, and the fastest algorithm, designed by Schönhage and Strassen already in 1971 [31], runs in time $O(n \log n \log\log n)$. A fundamental question that now arises is whether this is optimal, and the honest answer is that we have no idea. There is in fact no lower bound for the complexity of multiplication that goes essentially beyond the fact that we have to read the input and write the output. In particular, it is possible, but in many people’s eyes unlikely, that multiplication can be done in time O(n). Thus in one sense, complexity theory has not left first grade as we do not understand multiplication.

Note, however, that it is not obvious that resolving this question is simpler than deciding the NP/P-question. Proving a lower bound larger than cn for multiplication for any constant c is a fairly subtle issue when we know that the true bound is at most $O(n \log n \log\log n)$. To prove that NP ≠ P we need to prove the lower bound $n^k$ for any k on the number of operations to solve satisfiability, and as we believe the true bound is more like $2^n$ the margin here appears to be larger and more crude methods might apply. To paraphrase, the lower bound for multiplication will probably need something like a very sharp knife while proving NP ≠ P might need something closer to a nuclear bomb.

3. NP and P

Let us give a formal definition of NP, or at least what would have been a formal definition, had we defined our computational model and running time of an algorithm formally. To make formal sense of NP we focus on decision problems. A decision problem is a computational problem where we limit the output to a single bit.
The standard terminology in this case would be that inputs that map to 1 are “accepted” and inputs that map to 0 are “rejected”. Many times one calls the elements of NP “languages” where a language is the subset of $\Sigma^*$ given by the accepted inputs.

Definition 3.1. Let $L \subseteq \Sigma^*$. L ∈ NP iff there is a Turing machine M that runs in time polynomial in the length of its first input such that x ∈ L iff there exists y such that M(x, y) = 1.

We could require that the length of y is polynomial in the length of x but this is assured by the fact that M can only read a polynomial number of bits in polynomial time. Satisfiability is the standard NP-problem. It is the language of (codings of) satisfiable Boolean formulas. The input y is an assignment to the variables occurring in the formula coded by x and M checks whether this assignment satisfies the formula.
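The role of M for satisfiability can be illustrated by a minimal sketch. We assume, purely for illustration, that a formula is given in conjunctive normal form as a list of clauses, each a list of nonzero integers (a positive integer i stands for variable i, a negative one for its negation); the names `verify_cnf` and `is_satisfiable` are hypothetical and not taken from the paper.

```python
from itertools import product


def verify_cnf(formula, assignment):
    """Play the role of M(x, y): check in time linear in the formula size
    whether the assignment y satisfies the CNF formula x.

    formula: list of clauses, each a list of nonzero ints (assumed encoding).
    assignment: dict mapping variable index -> bool.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )


def is_satisfiable(formula):
    """Decide membership by trying every possible y.

    This takes exponential time and is shown only to contrast
    cheap verification with expensive search.
    """
    variables = sorted({abs(lit) for clause in formula for lit in clause})
    for values in product((False, True), repeat=len(variables)):
        if verify_cnf(formula, dict(zip(variables, values))):
            return True
    return False
```

Checking a proposed y takes one pass over the formula, whereas the brute-force search tries up to $2^n$ assignments for n variables; whether this gap can be closed is exactly the NP/P-question.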
Note further that TSP as described in the introduction does not belong to NP as it is not a decision problem. To make it a decision problem we can introduce a parameter K and ask whether there exists a tour of length at most K. The problem now belongs to NP. It is not always important to make the distinction between the optimization problem and the decision problem, but on the formal level this might cause some confusion. As we want to have P ⊆ NP we define also P as a set of decision problems.

Definition 3.2. Let $L \subseteq \Sigma^*$. L ∈ P iff there is a Turing machine M that runs in time polynomial in the length of its input such that x ∈ L iff M(x) = 1.

We proceed to make a formal definition of the property of being NP-complete. We want to capture the idea of having a subroutine that decides a language L. Such a machine, traditionally denoted by $M^L$, is given the ability to ask questions of the type “x ∈ L?” which are answered correctly in one elementary step. Such machines are called “oracle Turing machines” and L is called the “oracle language”.

Definition 3.3. Let $L \subseteq \Sigma^*$. L is NP-complete iff L ∈ NP and for any language L′ ∈ NP there is an oracle Turing machine $M^L$ that runs in time polynomial in the length of its input such that x ∈ L′ iff $M^L(x) = 1$.

Note that if L is NP-complete and L belongs to P then so does any language in NP, as we can replace calls to the oracle with a polynomial time machine deciding L. A language is NP-hard if we drop the requirement that it belongs to NP.

Definition 3.4. Let $L \subseteq \Sigma^*$. L is NP-hard iff for any language L′ ∈ NP there is an oracle Turing machine $M^L$ that runs in time polynomial in the length of its input such that x ∈ L′ iff $M^L(x) = 1$.

We extend the above notion to non-decision problems by saying that, given a subroutine that solves the problem, we can decide an arbitrary language in NP in polynomial time. There are thousands of known NP-hard and NP-complete problems.
Satisfiability is NP-complete, and TSP is NP-complete in its decision form and NP-hard in its optimization form. Thus we expect that none of these problems can be solved in polynomial time. Problems in NP can now be classified to be of three types. They can be NP-complete, belong to P, or neither. Surprisingly, the third category is very rare for natural problems and, with few exceptions, already by the early 1980s most problems were known to be either NP-complete or to belong to P. The main progress on this set of problems in the last decade has been on a more refined measure of hardness.
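The relation between the decision and optimization forms of TSP, and the idea of a machine that asks oracle questions, can be made concrete with a small sketch. Everything here is an illustrative assumption: distances are non-negative integers given as a matrix, the oracle is simulated by brute force (a real oracle would answer in one step), and the function names are hypothetical.

```python
from itertools import permutations


def tour_length(dist, order):
    """Total length of the closed tour visiting the cities in the given order."""
    n = len(order)
    return sum(dist[order[i]][order[(i + 1) % n]] for i in range(n))


def tsp_decision_oracle(dist, K):
    """Stand-in for the oracle: is there a tour of length at most K?

    Implemented by exhaustive search, only so that the sketch can run.
    """
    n = len(dist)
    return any(
        tour_length(dist, (0,) + rest) <= K
        for rest in permutations(range(1, n))
    )


def optimal_tour_length(dist, oracle):
    """Find the optimal value using only questions to the decision oracle.

    Binary search over K; the number of questions is logarithmic in the
    sum of the distances, hence polynomial in the length of the input.
    """
    lo = 0
    # A tour leaves each city exactly once, so this is an upper bound.
    hi = sum(max(row) for row in dist)
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(dist, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo
```

The point of the sketch is only that a polynomial number of oracle questions suffices, so the optimal value is not much harder to obtain than the answers to the decision questions.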
4. Approximation algorithms

Given an NP-hard optimization problem we can study polynomial time heuristics that return good but possibly not optimal solutions. For our model problem TSP, a large number of heuristics are known and many are discussed by Johnson and McGeoch in [21]. Many heuristics are hard to analyze and best evaluated experimentally, but for some, strong and precise statements can be made. Let O be an optimization problem with instances x and solutions y where the objective value is $\mathrm{Val}(x, y)$. For TSP, x is thus a set of distances, y is a proposed order in which to visit the cities and $\mathrm{Val}(x, y)$ is the total length of the tour given by y with distances x. The optimal value for a minimization problem is defined as

$$\mathrm{Opt}(x) = \min_{y} \mathrm{Val}(x, y).$$
Definition 4.1. An algorithm A is a C-approximation for a minimization problem O if for each instance x, $\mathrm{Val}(x, A(x)) \le C \cdot \mathrm{Opt}(x)$.

The approximation ratio for maximization problems is defined in an analogous way. We let

$$\mathrm{Opt}(x) = \max_{y} \mathrm{Val}(x, y).$$
Definition 4.2. An algorithm A is a C-approximation for a maximization problem O if for each instance x, $\mathrm{Val}(x, A(x)) \ge \mathrm{Opt}(x)/C$.

Sometimes one does not require an approximation algorithm to output a solution but only an estimate of the optimal value. It is interesting that almost all lower bounds apply to this weaker model while almost all known upper bounds are given by an algorithm in the stronger model. Sometimes we allow A to be a randomized algorithm. We then study $E[\mathrm{Val}(x, A(x))]$ where the expectation is taken only over the random choices of A, and we emphasize that there is no randomization over the input and the bound is true for worst-case inputs. We now turn to our main example which is of both practical and theoretical interest.

5. Linear systems of equations

Systems of linear equations over different fields appear in many situations. We are given a set of equations

$$\sum_{i=1}^{n} a_{ij} x_i = b_j, \qquad 1 \le j \le m,$$
and we want to find values of $x_i$ that satisfy these equations as well as possible. If one can satisfy all equations then such an assignment can be found in polynomial time by Gaussian elimination, or by even more efficient algorithms in some situations. The most interesting situation for us now is the case when the system is inconsistent.
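For the consistent case, a minimal sketch of Gaussian elimination over the field with two elements may be helpful. The representation (a list of 0/1 coefficient rows and a list of 0/1 right-hand sides) and the name `solve_gf2` are assumptions made for this illustration only.

```python
def solve_gf2(A, b):
    """Return some x with A x = b over GF(2), or None if the system is inconsistent.

    A: list of m rows, each a list of n entries in {0, 1}.
    b: list of m entries in {0, 1}.
    Uses Gauss-Jordan elimination; addition modulo 2 is the XOR of bits.
    """
    m = len(A)
    n = len(A[0]) if A else 0
    # Augmented matrix: coefficients followed by the right-hand side.
    rows = [list(A[j]) + [b[j]] for j in range(m)]
    pivot_cols = []
    r = 0  # index of the next pivot row
    for col in range(n):
        pivot = next((i for i in range(r, m) if rows[i][col] == 1), None)
        if pivot is None:
            continue  # free variable, no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Clear this column in every other row.
        for i in range(m):
            if i != r and rows[i][col] == 1:
                rows[i] = [u ^ v for u, v in zip(rows[i], rows[r])]
        pivot_cols.append(col)
        r += 1
    # Remaining rows have only zero coefficients; they must read 0 = 0.
    if any(rows[i][n] == 1 for i in range(r, m)):
        return None
    x = [0] * n  # free variables are set to 0
    for i, col in enumerate(pivot_cols):
        x[col] = rows[i][n]
    return x
```

The procedure uses $O(m^2 n)$ bit operations in this naive form. When it returns `None` the system is inconsistent, which is the case studied in the rest of this section.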
If we cannot satisfy all equations, there are sometimes several possible definitions of “best solution”. If the field in question is the rational numbers, one common definition of “best” is the least squares approximation, i.e., to minimize

$$\sum_{j=1}^{m} \Big( \sum_{i=1}^{n} a_{ij} x_i - b_j \Big)^{2}$$
and also in this case it is possible to find the best solution in polynomial time. Another extreme is when the field is the field with two elements, GF[2], where the two elements are 0 and 1 and addition is performed modulo 2. In this situation the only possible measure is to maximize the number of satisfied equations and this is the measure we adopt for any field.

Definition 5.1. For a field F let Max-Lin-F be the optimization problem to, given a set of linear equations, simultaneously satisfy the maximal number of equations. If F is the finite field of p elements we call the problem Max-Lin-p.

It is not difficult to classify these problems on the NP-hardness scale and the following theorem is a possible exercise in a basic course on complexity.

Theorem 5.2. For any prime p, Max-Lin-p in its decision form is NP-hard and this is also true for Max-Lin-Q, where Q is the field of rational numbers.

Let us turn to the approximability of Max-Lin-p. Suppose we have m equations. If we pick an assignment to the variables uniformly at random then we satisfy each equation with probability 1/p and thus we expect to satisfy, on the average, m/p equations. This leads to a randomized p-approximation algorithm, but it is not difficult to make a deterministic algorithm that finds a solution that satisfies at least m/p equations. We have the following theorem:

Theorem 5.3. For any prime p one can, in deterministic polynomial time, approximate Max-Lin-p within a factor of p.

This is complemented by the following theorem by Håstad [19].

Theorem 5.4. For any prime p and any ε > 0, it is NP-hard to approximate Max-Lin-p within p − ε.

Thus in particular, even if we know that there is an assignment that satisfies almost all equations, there is no efficient way to find an assignment that does significantly better than a random assignment. The result applies as long as we allow three variables in each equation and has been extended by Engebretsen et al. [13] to apply to any group.
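One standard way to obtain a deterministic algorithm as in Theorem 5.3 is the method of conditional expectations; the paper does not say which method it has in mind, so the following is a sketch of that technique and not necessarily the intended one. We assume that each equation is a pair of a coefficient dictionary and a right-hand side, and that every equation has at least one coefficient that is nonzero modulo p. All names are hypothetical.

```python
from fractions import Fraction


def conditional_expectation(equations, p, partial):
    """Expected number of satisfied equations when the variables in `partial`
    are fixed and all remaining variables are uniformly random in {0, ..., p-1}."""
    total = Fraction(0)
    for coeffs, b in equations:
        active = {v: a for v, a in coeffs.items() if a % p != 0}
        if any(v not in partial for v in active):
            # A random free variable with nonzero coefficient makes the
            # left-hand side uniform, so the equation holds with probability 1/p.
            total += Fraction(1, p)
        else:
            lhs = sum(a * partial[v] for v, a in active.items()) % p
            total += 1 if lhs == b % p else 0
    return total


def derandomized_max_lin(equations, p):
    """Fix the variables one at a time, never letting the conditional
    expectation decrease. The result satisfies at least m/p equations."""
    variables = sorted({v for coeffs, _ in equations for v in coeffs})
    partial = {}
    for v in variables:
        partial[v] = max(
            range(p),
            key=lambda val: conditional_expectation(equations, p, {**partial, v: val}),
        )
    return partial


def count_satisfied(equations, p, assignment):
    """Number of equations satisfied by a complete assignment."""
    return sum(
        sum(a * assignment[v] for v, a in coeffs.items()) % p == b % p
        for coeffs, b in equations
    )
```

The expectation before any variable is fixed is m/p, and the average over the p choices for the next variable equals the current conditional expectation, so the best choice never decreases it. By Theorem 5.4 no efficient algorithm can guarantee essentially more.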
On the other hand, if we only allow two variables in each equation we do get a non-trivial approximation for any p [17, 3].

Over the rational numbers our knowledge is not quite as complete. We can pick a maximal set of linearly independent equations and satisfy these equations disregarding the remaining equations. This does not yield a very
good approximation ratio but we should not hope for too much in view of the following lower bound by Amaldi and Kann [2]:

Theorem 5.5. There is a δ > 0 such that it is NP-hard to approximate Max-Lin-Q within $n^{\delta}$.

The proof of Theorem 5.4 is, in principle, simple. We start with a Boolean formula ϕ and any δ > 0. We construct, in polynomial time, a linear system L of m equations. We make sure that if ϕ is satisfiable then there is an assignment that satisfies (1 − δ)m of the equations of L, while if ϕ is not satisfiable, no assignment satisfies more than $(\frac{1}{p} + \delta)m$ of the equations. It follows that any algorithm that determines the maximal number of simultaneously satisfiable equations within a factor smaller than

$$\frac{1 - \delta}{\frac{1}{p} + \delta}$$

can be used to determine whether ϕ is satisfiable or not and hence it must be an NP-hard task to achieve this approximation ratio. Choosing δ a suitable function of ε now establishes the result.

This reduction of creating L from ϕ is just a computational procedure and could be described by a combinatorial algorithm. It has, however, been profitable to think in terms of proof systems and we turn to probabilistically checkable proofs.

6. Probabilistically Checkable Proofs

First let us phrase NP as a proof system.

Definition 6.1. A Turing machine V running in polynomial time in the length of its first input is a verifier in an NP-proof system for a language L iff
• For x ∈ L there exists a π such that V(x, π) = 1.
• For x ∉ L, for all π, V(x, π) = 0.

The machine V is called the verifier and it is the same as the machine M in Definition 3.1. We are interested in discussing verifiers that read a very small portion of the proof. It is most convenient to use the concept of an oracle Turing machine as already used in Definition 3.3. This time we let V access the proof by asking questions “i?” which are answered by $\pi_i$, the ith bit of the proof.
We also assume that V is probabilistic and this is achieved by having a source of “random coins” which are bits each taking the value 0 with probability $\frac{1}{2}$ independently of each other and the input. We denote the random string by r.

Definition 6.2. Let c and s be real numbers such that 1 ≥ c > s ≥ 0. A probabilistic polynomial time Turing machine V is a verifier in a Probabilistically Checkable Proof (PCP) with soundness s and completeness c for a language L iff
• For x ∈ L there exists an oracle π such that $\Pr_r[V^{\pi}(x, r) = 1] \ge c$.
• For x ∉ L, for all π, $\Pr_r[V^{\pi}(x, r) = 1] \le s$.
In many circumstances one would expect a good verifier to always accept a correct proof of a correct statement and c = 1 is also the most common value, but values slightly below 1 for c are also useful. The famous PCP-theorem [5] can now be stated as follows:

Theorem 6.3. Any L ∈ NP allows a PCP with perfect completeness (c = 1), constant soundness s < 1, where V only accesses three bits of π and uses O(log n) random coins on inputs of length n. The size of π is polynomial in n.

Even a sketch of the proof of this theorem would take us too far. One key idea is to code the satisfying assignment as the outputs of a low degree polynomial over a finite field, a second is to use proof-composition, a type of recursive proof technique. Both were introduced prior to [5] and we refer to that paper for a discussion of the history.

To see the connection to inapproximability we consider the proof optimization problem.

Definition 6.4. Let V be a verifier in a PCP for a language L. The proof optimization problem is, given an input x, to determine the maximal probability with which V accepts x.

We have the following trivial observation.

Theorem 6.5. If the verifier V has soundness s and completeness c then, if we can determine the optimum of the proof optimization problem within a factor smaller than c/s, then we can decide membership in L with the same amount of resources.

Proof. Suppose that we have an algorithm A that determines the value of the proof optimization problem within a factor $k < \frac{c}{s}$. Then, on input x, run A and accept if the value of the obtained solution is greater than s, and otherwise reject. By the soundness condition of the proof system, whenever we accept the input this is the correct decision. The fact that we always accept elements of L is implied by the completeness condition and the assumed approximation ratio.
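The two cases of this proof can be written out as a short worked inequality. Here v denotes the value returned by A on input x (a name introduced only for this sketch), and we assume the convention of Definition 4.2, so that $\mathrm{Opt}(x)/k \le v \le \mathrm{Opt}(x)$, as well as s > 0.

```latex
\begin{align*}
x \in L    &\;\Longrightarrow\; \mathrm{Opt}(x) \ge c
            \;\Longrightarrow\; v \ge \frac{\mathrm{Opt}(x)}{k} \ge \frac{c}{k} > s, \\
x \notin L &\;\Longrightarrow\; \mathrm{Opt}(x) \le s
            \;\Longrightarrow\; v \le \mathrm{Opt}(x) \le s.
\end{align*}
```

The strict inequality c/k > s in the first line is exactly the assumption k < c/s, so comparing v with s separates the two cases.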
The key now to getting interesting inapproximability results is to design a PCP for an NP-complete problem with the property that the proof optimization problem is in fact equivalent to an optimization problem we care about. Let us describe the properties of the PCP that underlies the proof of Theorem 5.4 in the case of p = 2. Given a parameter δ, the proof consists of a polynomial number of bits $(\pi_j)_{j=1}^{n^k}$ and is verified as follows. V flips O(log n) random coins to determine three addresses $j_1$, $j_2$ and $j_3$ and a bit b. The verifier now accepts if the exclusive-or of $\pi_{j_1}$, $\pi_{j_2}$ and $\pi_{j_3}$ equals b. The completeness is 1 − δ and the soundness is $\frac{1}{2} + \delta$.
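The shape of such a verifier can be sketched in a few lines. The function `queries`, which maps a coin string to three addresses and a bit, is a hypothetical stand-in supplied by the caller; the actual rule used in [19] is far more involved and is not reproduced here. Addresses are counted from 0 in this sketch.

```python
from fractions import Fraction
from itertools import product


def acceptance_probability(proof, queries, num_coins):
    """Probability, over all coin strings of length num_coins, that the
    three-query verifier accepts the given proof.

    proof: sequence of bits.
    queries: function from a coin string (tuple of bits) to (j1, j2, j3, b).
    """
    accepted = 0
    for coins in product((0, 1), repeat=num_coins):
        j1, j2, j3, b = queries(coins)
        # The verifier reads three bits and checks one exclusive-or.
        if proof[j1] ^ proof[j2] ^ proof[j3] == b:
            accepted += 1
    return Fraction(accepted, 2 ** num_coins)


def proof_optimum(num_bits, queries, num_coins):
    """Value of the proof optimization problem, by exhaustive search.

    Exponential in num_bits; only meant for toy examples.
    """
    return max(
        acceptance_probability(proof, queries, num_coins)
        for proof in product((0, 1), repeat=num_bits)
    )


# A toy query rule with two coins and a four-bit proof (purely illustrative).
TOY_TABLE = {
    (0, 0): (0, 1, 2, 0),
    (0, 1): (0, 1, 3, 1),
    (1, 0): (1, 2, 3, 1),
    (1, 1): (0, 2, 3, 0),
}


def toy_queries(coins):
    return TOY_TABLE[coins]
```

Since the verifier uses only O(log n) coins, the loop over all coin strings has polynomially many iterations, so the acceptance probability of a given proof can be computed exactly in polynomial time.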
Now we can see that the proof optimization problem is just Max-Lin-2 in disguise. Optimizing over the proof is the same as thinking of the ith bit of the proof as a variable $x_i$ and then optimizing over these variables. Suppose that V flips R coins. Each possible outcome of the random coins leads to a linear equation which determines whether V accepts for this particular set of coin flips. We end up with $2^R$ equations and the maximum fraction of simultaneously satisfiable equations is exactly the maximum probability to convince the verifier. Note that it is important that the verifier does not use too many random coins as the number of different sets of coin flips is the number of resulting equations. Also it is important that the proof is small in that each bit of the proof directly corresponds to a variable in the linear system of equations.

To describe in detail how to construct this PCP is not feasible in these notes and we refer to the original paper [19]. At a very high level, the proof utilizes Theorem 6.3 as a black box and then improves the parameters. This is done by repeating the proof in parallel and then condensing the answers using an interesting binary code called the long code, proposed by Bellare et al. [7]. The long code of an input $v \in \{0,1\}^t$ is indexed by functions $f : \{0,1\}^t \to \{0,1\}$ and the value at position f is f(v). Thus $2^{2^t}$ bits are used to code t bits and it is the longest binary code, disallowing coordinates that are equal for each pair of inputs. This code is extremely long but as it is used for constant size inputs its length does not affect the results except in that the implicit constants are rather weak. Let us now consider some other problems.

7. Independent set and Coloring

Given a graph G, the independent set problem is to find the largest number of nodes of which no two are connected. A related problem is “clique” where we ask for the largest number of nodes all of which are pairwise connected.
These two problems are clearly equivalent as can be seen from changing edges to non-edges. Independent set initially sounds like an innocent problem and for a while it was somewhat surprising that, for graphs with n nodes, the best approximation ratio achieved by any polynomial time algorithm was as poor as $O(n/(\log n)^2)$ [9]. This implies that even for a graph which has an independent set of size linear in the number of nodes the algorithm can guarantee only that we find an independent set of size $\Omega((\log n)^2)$. For graphs with an independent set as large as $O(n/(\log n)^2)$ the algorithm gives no guarantee. This poor performance was explained by subsequent lower bounds. Based on the assumption that NP cannot be solved in probabilistic polynomial time, Håstad [18] proved that for any ε > 0 one cannot approximate independent set within a factor $n^{1-\epsilon}$ in polynomial time. Making stronger, but still almost universally believed assumptions, Khot [25] showed that it is possible to make
ε decrease as $(\log n)^{-\gamma}$ for some γ > 0. Thus what seemed to be trivial upper bounds pointed very much in the correct direction, namely that independent set is indeed a very difficult problem.

To get these inapproximability results, very strong PCPs are needed and the required properties have very natural parameters also when formulated as proof systems. Suppose we restrict V to use O(log n) random coins and to read q bits of the proof, require (almost) perfect completeness and we are looking to minimize the soundness. It was established by Samorodnitsky and Trevisan [30] that if we allow non-perfect completeness one could achieve soundness $2^{-q + O(\sqrt{q})}$. This was later extended to perfect completeness by Håstad and Khot [20]. It is amazing that the probability of being cheated essentially decreases by a factor of 2 for each bit read. Through a sequence of reductions this gives the desired bound for independent set.

A closely related problem is graph coloring. In this case we want to color the nodes in a graph so that any two adjacent nodes are of different colors. The objective function to be minimized is the number of different colors. Note that each color class is an independent set and using this it is possible to prove that a good approximation algorithm for independent set would have yielded an almost as good approximation algorithm for coloring, but no direct reduction is known in the other direction. Feige and Kilian [15] showed, however, that it is possible to extend the lower bounds of independent set to coloring and thus also this problem is very difficult to approximate.

Of special interest are graphs which can be colored with very few colors, the first interesting case being three-colorable graphs. This is one of the major open problems of the area of approximability. By a result of Blum and Karger [8] it is known how to color such a graph in polynomial time with roughly $O(n^{3/14})$ colors while the best lower bound by Khanna et al.
[24] is that unless P = NP it cannot be done with 4 colors. Most people in the area seem to expect the true answer to be of the form $O(n^{\delta})$ for some positive δ but this conjecture must be considered highly speculative.

8. Maximum cut

Maximum cut is the following problem. Given a graph, divide the nodes into two groups $V_1$ and $V_2$ so that a maximum number of edges are cut, i.e., go between the two parts. For a long time, the best approximation algorithm for this problem was a random assignment, giving an approximation ratio of 2 as a random assignment cuts half the edges on the average. A leap forward was made by Goemans and Williamson [17] when semidefinite programming was introduced as a tool to achieve good provable approximation ratios. Linear programming had long been used as a tool for designing heuristics and semidefinite programming is an extension. In a semidefinite program, we have a set of variables organized in a matrix. Apart from linear
conditions on the variables we also have the constraint that the matrix is positive semidefinite. Assuming a linear objective function, the optimum can, by a result of Alizadeh [1], be found to any desired accuracy. One reason to hope for semidefinite programming to be solved efficiently is that the set of positive semidefinite matrices forms a convex set and hence there is no problem with local extrema. Using this method for maximum cut, Goemans and Williamson [17] found a polynomial time approximation algorithm with approximation ratio

$$\max_{\theta} \; \frac{\pi}{2} \cdot \frac{1 - \cos\theta}{\theta} \approx 1.138. \tag{8.1}$$
This algorithm remains the champion while the lower bound on approximability is 17/16 − ε for any ε > 0 [19]. There has recently been work by Khot et al. [26] indicating that the upper bound might be the correct answer. Given two strong, but not unrealistic, conjectures, one can prove lower bounds that match up to an arbitrary ε > 0.

9. Set cover

In set cover we are given a sequence of subsets $(S_i)_{i=1}^{m}$ of a universe X of cardinality n. The goal is to find a minimal size sub-collection that covers X. There is a straightforward greedy algorithm for this problem. Keep picking the set that covers the maximal number of uncovered elements. If the optimal covering contains k sets then it is not difficult to see that at each iteration we cover at least a fraction $\frac{1}{k}$ of the uncovered elements. The number of remaining uncovered elements after t sets have been picked is thus at most

$$\Big(1 - \frac{1}{k}\Big)^{t} n$$
and it follows that after at most k ln n sets have been picked, all elements are covered. We conclude that we get an ln n approximation algorithm, which was first described by Johnson [22]. This is complemented by a lower bound, proved by Feige [14], that says that if NP is not contained in deterministic time $n^{O(\log\log n)}$ then no polynomial time algorithm can approximate set cover within a factor (1 − o(1)) ln n. Slightly weaker results are known if we are only willing to assume NP ≠ P.

10. Vertex cover

Vertex cover is the special case of set cover where each element only appears in two sets. This is most easily visualized as a graph. The edges of the graph correspond to the elements while each node gives a set defined by the edges incident to that node. The task now is to find the minimal number of nodes such that each edge has at least one endpoint in the picked set.
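The greedy algorithm of Section 9 and the view of vertex cover as a set cover instance can be sketched together. The representation (Python sets, nodes numbered from 0, edges as pairs) and the function names are assumptions made for this illustration.

```python
def greedy_set_cover(universe, subsets):
    """Greedy set cover: repeatedly pick the subset that covers the largest
    number of still uncovered elements. Returns the indices of the picked subsets.

    universe: iterable of elements.
    subsets: list of Python sets.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(range(len(subsets)), key=lambda j: len(subsets[j] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


def vertex_cover_as_set_cover(num_nodes, edges):
    """Translate a graph into a set cover instance: the elements are the
    edges (by index) and each node gives the set of its incident edges,
    so every element appears in exactly two sets."""
    universe = set(range(len(edges)))
    subsets = [
        {e for e, (u, v) in enumerate(edges) if node in (u, v)}
        for node in range(num_nodes)
    ]
    return universe, subsets
```

Running the greedy algorithm on the translated instance yields a vertex cover, but only with the logarithmic guarantee of set cover; the special structure of vertex cover allows a much better ratio.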
There are many ways to approximate this problem within a factor 2 and one is to relax it to linear programming. Introduce a variable $x_i$ for each node and minimize

$$\sum_{i=1}^{n} x_i$$
given the constraint $x_i + x_j \ge 1$ for any edge (i, j) as well as $x_i \ge 0$ for any i. Clearly any legitimate solution to vertex cover gives a solution to the linear program by making $x_i = 1$ when i is included in the solution and setting $x_i = 0$ otherwise. Thus we know that the optimum of the linear program is at most the value of the optimal solution to vertex cover. The optimal solution to any linear program can be found in polynomial time, but the optimal solution probably takes values outside {0, 1} and hence does not correspond directly to a vertex cover. To recover a correct solution to vertex cover from a general solution to the linear program one can proceed as follows. For any i with $x_i \ge 1/2$ increase $x_i$ to 1 while otherwise set $x_i = 0$. It is not difficult to see that the cost increases by at most a factor 2 and we get a solution for vertex cover, giving an efficient 2-approximation algorithm.

The strongest known lower bound on approximability for vertex cover, by Dinur and Safra [12], is that it is NP-hard to approximate vertex cover within $10\sqrt{5} - 21 - \epsilon \approx 1.36$ for any ε > 0. Khot and Regev [27] have proved that, again subject to an unproven and slightly speculative conjecture, the lower bounds can be improved to 2 − ε for any ε > 0.

11. Traveling salesperson problem

Let us finally return to TSP. In most reasonable circumstances, instances obey the triangle inequality so let us concentrate on this case. If we only assume the triangle inequality, the algorithm with the best approximation ratio, due to Christofides [10], has been known for over 20 years and it gives a factor 1.5. Here we have a lower bound, but a much weaker one than for other problems. The best lower bound with a fully published proof is 3813/3812 by Böckenhauer and Seibert [6], but stronger results are in the process of being verified. It seems, however, that a ratio of 1.01 is not achievable by the current methods.
One interesting subcase is that the cities are points in the two-dimensional plane and the distances are Euclidean distances. To find the optimal solution in this case was early on proved to be NP-hard by Papadimitriou [29]. For a long time, the algorithm of Christofides remained the best also in this case, but eventually a celebrated result by Arora [4] showed that the Euclidean structure can be used and in fact for any ε > 0 it is possible to find an approximation within a factor (1 + ε) in polynomial time. Thus the Euclidean case is provably simpler than the general case with the triangle inequality.
An interesting extension is that of non-symmetric TSP, i.e., where it is possible that d(i, j) ≠ d(j, i), which is quite possible in many models of reality, even for a modern salesperson, with prevailing westerly winds being a factor on long-distance flights. Clearly any lower bound for the symmetric model also applies to the asymmetric case and in fact the bounds can be strengthened slightly, but no bound beyond 1.01 is currently claimed. More interestingly, all approximation algorithms that give a constant approximation factor rely on the distance function being symmetric and the smallest achievable approximation ratio in polynomial time is currently O(log n), the first such algorithm given by Frieze et al. [16]. It is difficult to guess what the true bound might be and we end with this totally open question.

12. Final words

If most problems were classified as either in P or as NP-hard by the 1980s, we are now closing in on knowing the approximability of most NP-hard optimization problems. Clearly many problems do remain open, but progress since the beginning of the 1990s, when this research started, has been spectacular. One cannot help being amazed that problems keep on turning out to be solvable in polynomial time or to be NP-hard. The in-between case, which one can prove must occur by constructing artificial problems, continues to be rare for natural problems. Why this is so, we can only speculate.

References

[1] F. Alizadeh. Interior point methods in semidefinite programming with applications to combinatorial optimization. SIAM Journal on Optimization, 5:13–51, 1995.
[2] E. Amaldi and V. Kann. The complexity and approximability of finding feasible subsystems of linear relations. Theoretical Computer Science, 147:181–210, 1995.
[3] G. Andersson, L. Engebretsen, and J. Håstad. A new way to use semidefinite programming with applications to linear equations mod p. Journal of Algorithms, 39:162–204, 2001.
[4] S. Arora.
Polynomial-time approximation schemes for Euclidean TSP and other geometric problems. Journal of the ACM, 45:753–782, 1998.
[5] S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy. Proof verification and intractability of approximation problems. Journal of the ACM, 45:501–555, 1998.
[6] H.-J. Böckenhauer and S. Seibert. Improved lower bounds on the approximability of the traveling salesman problem. RAIRO Theoretical Informatics and Applications, 34:213–255, 2000.
[7] M. Bellare, O. Goldreich, and M. Sudan. Free bits, PCPs and non-approximability – towards tight results. SIAM Journal on Computing, 27:804–915, 1998.
[8] A. Blum and D. Karger. An $\tilde{O}(n^{3/14})$-coloring algorithm for 3-colorable graphs. Information Processing Letters, 61:49–53, 1997.
[9] R. Boppana and M. Halldórsson. Approximating maximum independent sets by excluding subgraphs. BIT, 32:180–196, 1992.
[10] N. Christofides. Worst-case analysis of a new heuristic for the traveling salesman problem. Technical report, Graduate School of Industrial Administration, Carnegie-Mellon University, 1976.
[11] S.A. Cook. The complexity of theorem proving procedures. In Proceedings of 3rd Annual ACM Symposium on Theory of Computing, pages 151–158, 1971.
[12] I. Dinur and S. Safra. On the importance of being biased. In Proceedings of 34th Annual ACM Symposium on Theory of Computing, pages 33–42, 2002.
[13] L. Engebretsen, J. Holmerin, and A. Russell. Inapproximability results for equations over finite groups. Theoretical Computer Science, 312:17–45, 2004.
[14] U. Feige. A threshold of ln n for approximating set cover. Journal of the ACM, 45:634–652, 1998.
[15] U. Feige and J. Kilian. Zero-knowledge and the chromatic number. Journal of Computer and System Sciences, 57:187–200, 1998.
[16] A. Frieze, G. Galbiati, and F. Maffioli. On the worst-case performance of some algorithms for the asymmetric traveling salesman problem. Networks, 12:23–39, 1982.
[17] M. Goemans and D. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM, 42:1115–1145, 1995.
[18] J. Håstad. Clique is hard to approximate within $n^{1-\epsilon}$. Acta Mathematica, 182:105–142, 1999.
[19] J. Håstad. Some optimal inapproximability results. Journal of the ACM, 48:798–859, 2001.
[20] J. Håstad and S. Khot. Query efficient PCPs with perfect completeness. In Proceedings of 42nd Annual IEEE Symposium of Foundations of Computer Science, pages 610–619, 2001.
[21] D. Johnson and L. McGeoch. The traveling salesman problem: A case study in local optimization. In E.H.L. Aarts and J.K. Lenstra, editors, Local Search in Combinatorial Optimization, pages 215–310. John Wiley and Sons, Ltd., 1997.
[22] D.S. Johnson.
Approximation algorithms for combinatorial problems. Journal Computer and System Sciences, 1974:256–278, 9. [23] R. Karp. Reducibility among combinatorial problems. In R. Miller and J. Thatcher, editors, Complexity of Computer Computations, pages 85–103. Plenum Press, 1972. [24] S. Khanna, M. Linial, and S. Safra. On the hardness of approximating the chromatic number. In Proceedings of the 2nd Isreal Symposium on Theory of Computing, pages 250–260. IEEE Computer Society, 1993. [25] S. Khot. Improved inapproximability results for maxclique and chromatic number. In Proceedings of 42nd Annual IEEE Symposium of Foundations of Computer Science, pages 600–609, 2001. [26] S. Khot, E. Mossel G. Kindler, and R. O’Donnell. Optimal inapproximability results for max-cut and other 2-variable CSPs? In Proceedings of 45th Annual IEEE Symposium of Foundations of Computer Science, pages 146–154, 2004.
[27] S. Khot and O. Regev. Vertex cover might be hard to approximate to within 2 − ε. In Proceedings of 18th IEEE Annual Conference on Computational Complexity (CCC), pages 379–386, 2003.
[28] L. Levin. Average case complete problems. SIAM Journal on Computing, 15:285–286, 1986.
[29] C. Papadimitriou. Euclidean TSP is NP-complete. Theoretical Computer Science, 4:237–244, 1977.
[30] A. Samorodnitsky and L. Trevisan. A PCP characterization of NP with optimal amortized query complexity. In Proceedings of the 32nd Annual ACM Symposium on Theory of Computing, pages 191–199, 2000.
[31] A. Schönhage and V. Strassen. Schnelle Multiplikation grosser Zahlen. Computing, 7:281–292, 1971.
[32] A. Turing. On computable numbers, with an application to the Entscheidungsproblem. Proc. London Math. Soc. Ser. 2, 42:230–265, 1936.

Johan Håstad
Royal Institute of Technology
Stockholm, Sweden
4ECM Stockholm 2004
© 2005 European Mathematical Society

Random Surfaces Enumerating Algebraic Curves

Andrei Okounkov
1. Overview

The discovery that a relation exists between the two topics in the title was made by physicists, who viewed them as two approaches to the Feynman integral over all surfaces in string theory: one via direct discretization, the other through topological methods. A famous example is the celebrated conjecture of Witten connecting combinatorial tessellations of surfaces (conveniently enumerated by random matrix integrals) with intersection theory on the moduli spaces of curves, see [45]. Several mathematical proofs of this conjecture are now available [22, 36, 31], but the exact mathematical match between the two theories remains miraculous.

The goal of this lecture is to describe an a priori different connection between enumeration of algebraic curves and random surfaces. The underlying mathematical conjectures relating the Gromov-Witten and Donaldson-Thomas theories of a complex projective threefold $X$ were made in [30]. A related physical proposal, first made in [43] and developed in [16], played an important role in the development of these ideas. A link to matrix integrals will be briefly explained at the end of the lecture.

An occasion like this calls for a review, but instead I chose to present views that are largely conjectural, definitely not in their final form, but appealing and with large unifying power. These ideas were developed in collaboration with A. Iqbal, D. Maulik, N. Nekrasov, R. Pandharipande, N. Reshetikhin, and C. Vafa. I would like to thank the organizers for the opportunity to present them here and my coauthors for the joy of joint work.

2. Enumerative geometry of curves

Let $X$ be a smooth complex projective threefold such as, e.g., the projective space $\mathbf{P}^3$. We are interested in algebraic curves $C$ in $X$. For example, (the real locus of) a degree 4 genus 0 curve in $\mathbf{P}^3$ may look like the one plotted in Figure 1. Specifically, we are interested in the enumerative geometry of curves in $X$. For example, we would like to know how many curves of given degree and genus meet given subvarieties of $X$, assuming we expect this number to be finite.

Partially supported by NSF and Packard Foundation.
Figure 1. A degree 4 rational curve in $\mathbf{RP}^3$

2.1. Parametrized curves and stable maps.

2.1.1. A rational curve $C$ in $X = \mathbf{P}^3$ like the one in Figure 1 is the image of the Riemann sphere $\mathbf{P}^1$ under a map
\[ \mathbf{P}^1 \ni z \mapsto f(z) = [f_0(z) : f_1(z) : f_2(z) : f_3(z)] \tag{2.1} \]
given in homogeneous coordinates by polynomials $f_i$ of degree $d$. Modulo reparameterization of $\mathbf{P}^1$, this leaves $4d$ complex parameters for $C$: the four polynomials have $4(d+1)$ coefficients, of which one is lost to the overall rescaling and three to the automorphisms of $\mathbf{P}^1$. To pass through a point in a threefold is a codimension 2 condition on $C$. We, therefore, expect that finitely many degree $d$ rational curves will meet $2d$ points in general position. For example, there is obviously a unique line through two points. Similarly, since any conic lies in a plane, no conic passes through 4 generic points. In general, the number of degree $d = 1, 2, \dots$ rational curves through $2d$ general points of $\mathbf{P}^3$ equals
\[ 1,\ 0,\ 1,\ 4,\ 105,\ 2576,\ 122129,\ \dots, \]
see for example [8, 12] on how to do such computations. An important ingredient is a compactification of the space of maps (2.1) to the moduli space of stable maps, introduced by Kontsevich. The domain of a stable map need not be irreducible; it may sprout off additional $\mathbf{P}^1$'s, as in the case of a smooth conic degenerating to a union of two lines.

2.1.2. In general, the moduli spaces $\overline{M}_{g,n}(X, \beta)$ of pointed stable maps to $X$ (where $X$ may be of any dimension) consist of data $(C, p_1, \dots, p_n, f)$ where $C$ is a complete curve of arithmetic genus $g$ with at worst nodal singularities, $p_1, \dots, p_n$ are smooth marked points of $C$, and $f : C \to X$ is an algebraic
map of given degree
\[ \beta = f_*([C]) \in H_2(X). \]
Two such objects are identified if they differ by a reparameterization of the domain. One further requirement is that the group of automorphisms (that is, self-isomorphisms) should be finite; this is the stability condition.

2.1.3. The space $\overline{M}_{g,n}(X, \beta)$ carries a canonical virtual fundamental class [3, 4, 26] of dimension
\[ \operatorname{vir\,dim} \overline{M}_{g,n}(X, \beta) = -\beta \cdot K_X + (g - 1)(3 - \dim X) + n, \tag{2.2} \]
where $K_X$ is the canonical class of $X$. The Gromov-Witten invariants of $X$ are defined as intersections of cohomology classes on $\overline{M}_{g,n}(X, \beta)$, defined by conditions we impose on $f$ (e.g., by constraining the images $f(p_i)$ of the marked points), against the virtual fundamental class. In exceptionally good cases, for example when $X = \mathbf{P}^3$ and $g = 0$, the virtual fundamental class is the usual fundamental class. Even for $X = \mathbf{P}^3$, the situation with higher genus curves is considerably more involved, both in its foundational aspects and in its combinatorial complexity. It is, therefore, remarkable that the conjectural correspondence with Donaldson-Thomas theory, to be described momentarily, gives all-genera fixed-degree answers with a finite amount of computation.

2.2. Equations of curves and Hilbert scheme. Instead of giving a parameterization, one can describe algebraic curves $C \subset X$ by their equations.

2.2.1. Concretely, if $X \subset \mathbf{P}^N$ for some $N$ and $[x_0 : x_1 : \cdots : x_N]$ are homogeneous coordinates on $\mathbf{P}^N$, then the homogeneous polynomials $f$ vanishing on $C$ form a graded ideal
\[ I(C) \subset \mathbf{C}[x_0, \dots, x_N], \]
containing the ideal $I(X)$ of $X$. This ideal is what replaces the parametrization of $C$ in the world of equations. For example, the curve in Figure 1 is cut out (that is, its ideal is generated) by one quadratic and 3 cubic equations.

2.2.2. Let $I(C)_k \subset \mathbf{C}[x_0, \dots, x_N]_k$ denote the subspaces formed by polynomials of degree $k$. The codimension of $I(C)_k$ is the number of linearly independent degree $k$ polynomials on $C$. By Hilbert's theorem,
\[ \operatorname{codim} I(C)_k = (\beta \cdot h)\, k + \chi(\mathcal{O}_C), \qquad k \gg 0, \tag{2.3} \]
where $\beta \in H_2(X)$ is the class of $C$ and $h$ is the hyperplane class induced from the ambient $\mathbf{P}^N$. The number
\[ \chi(\mathcal{O}_C) = \dim H^0(C, \mathcal{O}_C) - \dim H^1(C, \mathcal{O}_C) \]
is the holomorphic Euler characteristic of $C$. By definition, $g = 1 - \chi$ is the arithmetic genus of $C$. It is easy to see that $C$ is uniquely determined by any $I(C)_k$ provided $k \gg 0$. A natural parameter space for ideals $I$ with given Hilbert function
(2.3) is the Hilbert scheme $\mathrm{Hilb}(X; \beta, \chi)$ constructed by Grothendieck. It is defined by certain natural equations in the Grassmannian of all possible linear subspaces $I_k \subset \mathbf{C}[x_0, \dots, x_N]_k$ of given codimension (2.3).

2.2.3. While $\mathrm{Hilb}(X; \beta, \chi)$ and $\overline{M}_{g,n}(X, \beta)$ play the same role of a compact parameter space in the world of equations and parameterizations, respectively, it should be stressed that there is no direct geometric relation between the two. This is most apparent in the case $\beta = 0$. In the degree 0 case, the stable map moduli spaces become essentially Deligne-Mumford spaces of stable curves, which are very nice and well-understood varieties. The Hilbert scheme of points in a 3-fold $X$, by contrast, seems very complicated. Even the number of its irreducible components, or their dimensions, is not known.

2.2.4. All of what we said so far about the Hilbert scheme applies very generally, in any dimension. The case of curves in a 3-fold, however, is special: in this case $\mathrm{Hilb}(X; \beta, \chi)$ carries a virtual fundamental class constructed by R. Thomas [44]. The technically important thing about 3-folds is that Serre duality limits the number of interesting $\mathrm{Ext}^i$-groups from an ideal sheaf to itself to just $i = 1, 2$. From (2.2) we see that the case $\dim X = 3$ is special for Gromov-Witten theory, too. In fact, we have
\[ \operatorname{vir\,dim} \mathrm{Hilb}(X; \beta, \chi) = \operatorname{vir\,dim} \overline{M}_g(X, \beta) = -\beta \cdot K_X. \tag{2.4} \]
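As a quick numerical illustration of (2.2) and (2.4), the following Python sketch evaluates the virtual dimension for curves in $\mathbf{P}^3$. It assumes the standard fact $K_{\mathbf{P}^3} = -4h$, so that $-\beta \cdot K_X = 4d$ for a curve of degree $d$; the function names are ours and purely illustrative.

```python
def vir_dim_stable_maps(minus_beta_dot_K: int, g: int, dim_X: int, n: int) -> int:
    """Formula (2.2): -beta.K_X + (g - 1)(3 - dim X) + n."""
    return minus_beta_dot_K + (g - 1) * (3 - dim_X) + n


def vir_dim_P3(d: int, g: int, n: int) -> int:
    # Assumption: K_{P^3} = -4h, hence -beta.K_X = 4d for degree d.
    return vir_dim_stable_maps(4 * d, g, 3, n)


# For a threefold the genus drops out of the formula, as in (2.4).
for d in range(1, 5):
    print(d, [vir_dim_P3(d, g, 0) for g in range(4)])
```

With $n = 2d$ marked points the virtual dimension is $6d$, which equals the $3 \cdot 2d$ conditions imposed by sending each marked point to a prescribed point; this agrees with the count of $2d$ points in 2.1.1. On a surface, by contrast, the genus enters: for plane cubics ($-\beta \cdot K = 9$, $g = 1$, $\dim X = 2$) the same formula returns 9.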
As we will see in the next section, it is very fortunate that this dimension depends only on $\beta$.

2.3. Gromov-Witten and Donaldson-Thomas invariants. Choose $\beta \in H_2(X)$ such that $-\beta \cdot K_X \ge 0$. Let $\gamma_1, \dots, \gamma_n \in H_*(X)$ be a collection of cycles in $X$ such that
\[ \sum_i (\operatorname{codim} \gamma_i - 1) = -\beta \cdot K_X. \]
By the dimension formula (2.4), the virtual number of degree $\beta$ curves of some fixed genus meeting all of the $\gamma_i$'s is expected to be finite. The precise technical definition of this virtual number is different for stable maps and the Hilbert scheme.

2.3.1. On the stable maps side, we can use marked points $p_i$ to say "curve meets $\gamma_i$" in the language of cohomology. Namely, imposing the condition $f(p_i) \in \gamma_i$ can be interpreted as pulling back the Poincaré dual class $\gamma_i^\vee$ via the evaluation map
\[ \mathrm{ev}_i : (C, p_1, \dots, p_n, f) \mapsto f(p_i). \tag{2.5} \]
The Gromov-Witten invariants are defined by
\[ \langle \gamma_1, \dots, \gamma_n \rangle^{GW}_{\beta, g} = \int_{[\overline{M}^{\bullet}_{g,n}(X, \beta)]^{vir}} \prod_{i=1}^{n} \mathrm{ev}_i^*(\gamma_i^\vee). \tag{2.6} \]
The bullet here stands for the moduli space with possibly disconnected domain and $[\ ]^{vir}$ is its virtual fundamental class. The disconnected theory contains,
of course, the same information as the connected one, but has slightly better formal properties. Most importantly, since connected curves do not form a component of the Hilbert scheme, we prefer to work with possibly disconnected curves on the Gromov-Witten side as well.

2.3.2. On the Hilbert scheme side, instead of marked points, it is natural to use characteristic classes of the universal ideal sheaf
\[ \mathcal{I} \to \mathrm{Hilb}(X) \times X, \]
which has the property that for any point $I \in \mathrm{Hilb}(X)$, the restriction of $\mathcal{I}$ to $I \times X \cong X$ is $I$ itself. We have $c_1(\mathcal{I}) = 0$ and
\[ c_2(\mathcal{I}) \in H^2(\mathrm{Hilb}(X) \times X) \]
can be interpreted as the class of the locus
\[ \{(I, \text{point of the curve defined by } I)\} \subset \mathrm{Hilb}(X) \times X. \]
The class of curves $I \in \mathrm{Hilb}(X)$ meeting $\gamma \in H_*(X)$ can be described as the coefficient of $\gamma^\vee$ in the Künneth decomposition of $c_2(\mathcal{I})$. We denote this component by
\[ c_2(\gamma) \in H^{\operatorname{codim} \gamma - 1}(\mathrm{Hilb}(X)) \]
and define
\[ \langle \gamma_1, \dots, \gamma_n \rangle^{DT}_{\beta, \chi} = \int_{[\mathrm{Hilb}(X; \beta, \chi)]^{vir}} \prod_{i=1}^{n} c_2(\gamma_i). \tag{2.7} \]
We call these numbers the Donaldson-Thomas invariants of $X$.

2.4. Main conjecture.

2.4.1. As already pointed out, there is no reason for the corresponding invariants (2.6) and (2.7) to agree and, in fact, they don't. For one thing, the moduli spaces are empty and, hence, the integrals vanish if $g, \chi \ll 0$, which goes in opposite directions via $\chi = 1 - g$. Also, the Donaldson-Thomas invariants are integers while the Gromov-Witten invariants are typically fractions (because stable maps can have finite automorphisms). However, a conjecture proposed in [30] equates natural generating functions for the two kinds of invariants after a nontrivial change of variables.

2.4.2. Concretely, set
\[ Z_{GW}(\gamma_1, \dots, \gamma_n; u)_\beta = \sum_g u^{2g-2} \langle \gamma_1, \dots, \gamma_n \rangle^{GW}_{\beta, g} \]
and define the reduced partition function by
\[ Z'_{GW}(\gamma; u)_\beta = Z_{GW}(\gamma; u)_\beta \,\big/\, Z_{GW}(\emptyset; u)_0. \]
This reduced partition function counts maps without collapsed connected components. The degree zero function $Z_{GW}(\emptyset; u)_0$ is known explicitly for any 3-fold $X$ by the results of [11], see below. Define $Z_{DT}(\gamma; q)_\beta$ and its reduced version by the same formula, with $q^\chi$ replacing $u^{2g-2}$.
Conjecture 2.1. The reduced Donaldson-Thomas partition function $Z'_{DT}(\gamma; q)_\beta$ is a rational function of $q$. The change of variables $q = -e^{iu}$ relates it to the Gromov-Witten partition function:
\[ (-iu)^{-\operatorname{vir\,dim}} \, Z'_{GW}(\gamma; u)_\beta = (-q)^{-\operatorname{vir\,dim}/2} \, Z'_{DT}(\gamma; q)_\beta, \]
where $\operatorname{vir\,dim} = -\beta \cdot K_X$ is the virtual dimension.

2.4.3. Conjecture 2.1 has been established when $X$ is either a local curve, that is, an arbitrary rank 2 bundle over a smooth curve [42], or the total space of the canonical bundle over a smooth toric surface [30, 28]. In the local curve case, equivariant theory is needed [6]. In my opinion, this provides substantial evidence for the "GW=DT" correspondence.

2.4.4. Conjecture 2.1 is actually a special case of more general conjectures proposed in [30] that extend the GW/DT correspondence to the relative context and descendent invariants. On the Gromov-Witten side, the descendent insertions are defined by
\[ \tau_k(\gamma_i) = \mathrm{ev}_i^*(\gamma_i^\vee)\, \psi_i^k \in H^{\operatorname{codim} \gamma_i + k}(\overline{M}_{g,n}(X; \beta)), \]
where $\psi_i$ is the first Chern class of the line bundle $L_i$ over $\overline{M}_{g,n}(X; \beta)$ with fiber the cotangent line $T^*_{p_i} C$ to the curve $C$ at the marked point $p_i$. These should correspond to Künneth components of characteristic classes of the universal sheaf $\mathcal{I}$. For example, we conjecture that
\[ \tau_k(\mathrm{pt}) \xrightarrow{\ GW=DT\ } (-1)^{k+1} \, \mathrm{ch}_{k+2}(\mathrm{pt}), \]
provided $\operatorname{codim} \gamma_i > 0$ for all other insertions. Here
\[ \mathrm{ch}_{k+2}(\mathcal{I}) \in H^{k+2}(\mathrm{Hilb}(X) \times X) \]
are the components of the Chern character of $\mathcal{I}$ and $\mathrm{ch}_{k+2}(\mathrm{pt})$ are the coefficients of $\mathrm{pt}^\vee = 1 \in H^*(X)$ in their Künneth decomposition.

2.4.5. In the degree 0 case, which is left out by Conjecture 2.1, we expect the following simple answer, which depends only on characteristic numbers of $X$. Denote the Chern classes of $TX$ by $c_i$ and let
\[ M(q) = \prod_{n>0} (1 - q^n)^{-n} \tag{2.8} \]
be the McMahon function.
Conjecture 2.2.
\[ Z_{DT}(X, q)_0 = M(-q)^{\int_X (c_3 - c_1 c_2)}. \]
This conjecture has been proven for a large class of 3-folds including all toric ones [30].
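To make the exponent in Conjecture 2.2 concrete, here is a small Python sketch for $X = \mathbf{P}^3$. It relies on two standard facts that are not spelled out in the text and should be read as assumptions: $c(T\mathbf{P}^3) = (1+h)^4$ with $\int h^3 = 1$, and the Atiyah-Bott formula, which expresses $\int_X (c_3 - c_1 c_2)$ as a sum over torus-fixed points $p$ of $-(w_1+w_2)(w_1+w_3)(w_2+w_3)/(w_1 w_2 w_3)$, where $w_i$ are the tangent weights at $p$. The weights $s_0, \dots, s_3$ of the torus action are hypothetical generic integers.

```python
from fractions import Fraction
from math import comb


def chern_number_P3() -> int:
    # c(T P^3) = (1 + h)^4, so c_k = C(4, k) h^k and the integral of h^3 is 1.
    c1, c2, c3 = comb(4, 1), comb(4, 2), comb(4, 3)
    return c3 - c1 * c2


def localized_exponent_P3(s) -> Fraction:
    """Sum of local contributions over the 4 fixed points of P^3.

    s = (s_0, ..., s_3) are pairwise distinct torus weights (an assumption);
    the tangent weights at the i-th fixed point are s_j - s_i for j != i.
    """
    total = Fraction(0)
    for i, si in enumerate(s):
        w1, w2, w3 = [Fraction(sj - si) for j, sj in enumerate(s) if j != i]
        total += -(w1 + w2) * (w1 + w3) * (w2 + w3) / (w1 * w2 * w3)
    return total


print(chern_number_P3(), localized_exponent_P3((0, 1, 3, 7)))
```

Both computations give $-20$, so under these assumptions the conjecture predicts $Z_{DT}(\mathbf{P}^3, q)_0 = M(-q)^{-20}$.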
Comparing the asymptotic expansion
\[ \ln M(e^{-u}) \sim \sum_{g=0}^{\infty} \frac{\zeta(3 - 2g)\,\zeta(1 - 2g)}{(2g - 2)!} \, u^{2g-2}, \qquad u \to +0, \tag{2.9} \]
in which the singular $g = 1$ term is understood as the second term in
\[ \frac{\zeta(3 - 2g)\,\zeta(1 - 2g)}{(2g - 2)!} \, u^{2g-2} = \frac{1}{24} \frac{1}{g - 1} + \frac{1}{12} \ln u + \zeta'(-1) + O(g - 1), \]
to the evaluation of $Z_{GW}(X, u)_0$ obtained in [11], we find
\[ \ln Z_{DT}(X, -e^{iu})_0 \sim \cdots + 2 \ln Z_{GW}(X, u)_0, \]
where dots stand for singular or constant terms in the asymptotic expansion. There are some plausible explanations for the unexpected factor of 2 in this formula, but none convincing enough to be presented here.

McMahon's discovery was that the function $M(q)$ is the generating function for 3-dimensional partitions. We will see momentarily how 3-dimensional partitions arise in Donaldson-Thomas theory.
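McMahon's identity is easy to test for small sizes. The Python sketch below expands the product (2.8) and, independently, counts 3-dimensional partitions by brute force, viewing each one as a stack of nested ordinary partitions (its horizontal layers). The helper names are ours; this is only a sanity check, not an efficient algorithm.

```python
from functools import lru_cache


def mcmahon_coefficients(N):
    """Coefficients of q^0..q^N in prod_{n>0} (1 - q^n)^(-n)."""
    coeffs = [1] + [0] * N
    for n in range(1, N + 1):
        for _ in range(n):                 # multiply by 1/(1 - q^n), n times
            for m in range(n, N + 1):
                coeffs[m] += coeffs[m - n]
    return coeffs


def partitions_inside(size, bound):
    """Yield partitions of `size` (non-increasing tuples) with parts[i] <= bound[i]."""
    def rec(remaining, i, prev):
        if remaining == 0:
            yield ()
            return
        if i == len(bound):
            return
        for part in range(min(remaining, prev, bound[i]), 0, -1):
            for rest in rec(remaining - part, i + 1, part):
                yield (part,) + rest
    return rec(size, 0, size)


@lru_cache(maxsize=None)
def count_layers(n, bound):
    """Number of stacks of nonempty nested layers of total size n inside `bound`."""
    if n == 0:
        return 1
    total = 0
    for k in range(1, n + 1):
        for layer in partitions_inside(k, bound):
            total += count_layers(n - k, layer)
    return total


def count_3d_partitions(n):
    return count_layers(n, (n,) * n)


print(mcmahon_coefficients(7))
print([count_3d_partitions(n) for n in range(8)])
```

Both lines print the same list, beginning 1, 1, 3, 6, 13, 24, 48, 86.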
3. Random surfaces

3.1. Localization and dissolving crystals.

3.1.1. For the rest of this lecture, we will assume that $X$ is a smooth toric 3-fold, such as $\mathbf{P}^3$ or $(\mathbf{P}^1)^3$. By definition, this means that the torus $\mathbf{T} = (\mathbf{C}^*)^3$ acts on $X$ with an open orbit. Since anything that acts on $X$ naturally acts on both $\overline{M}_{g,n}(X; \beta)$ and $\mathrm{Hilb}(X; \beta, \chi)$, localization in $\mathbf{T}$-equivariant cohomology [2] can be used to compute intersections on these moduli spaces, see [10, 23, 13]. Localization reduces intersection computations to certain integrals over the loci of $\mathbf{T}$-fixed points. On the Gromov-Witten side, these fixed loci are, essentially, moduli spaces of curves and the integrals in question are the so-called Hodge integrals. While any fixed-genus Hodge integral can, in principle, be evaluated in finite time, a better structural understanding of the totality of these numbers remains an important challenge. By contrast, the $\mathbf{T}$-fixed loci in the Hilbert scheme are isolated points. Together with the conjectural rationality of $Z'_{DT}$, this reduces, for fixed degree, the all-genera answer to a finite sum.

3.1.2. It is the localization sum in the Donaldson-Thomas theory that can be interpreted as the partition function of a certain random surface ensemble. The link is provided by the combinatorial geometry of the $\mathbf{T}$-fixed points in the Hilbert scheme, which is standard and will be quickly reviewed now.
3.1.3. As a warm-up, let us start with surfaces instead of 3-folds and look at the Hilbert scheme $\mathrm{Hilb}(\mathbf{C}^2; d, n)$ formed by ideals $I \subset \mathbf{C}[x, y]$ such that
\[ \operatorname{codim} I_{\le k} = dk + n, \qquad k \gg 0, \tag{3.1} \]
where $I_{\le k}$ stands for the subspace of polynomials of degree $\le k$. The torus $(\mathbf{C}^*)^2$ acts on $\mathrm{Hilb}(\mathbf{C}^2; d, n)$ by rescaling $x$ and $y$. The monomials $x^i y^j$ are eigenvectors of the torus action with distinct eigenvalues. Any torus-fixed linear subspace $I \subset \mathbf{C}[x, y]$ is, therefore, spanned by monomials. Since $I$ is also an ideal, together with any monomial $x^i y^j$ it contains all monomials $x^a y^b$ with $a \ge i$ and $b \ge j$.
[Figure 2 shows the monomials $x^i y^j$, $0 \le i \le 9$, $0 \le j \le 6$, arranged in a grid, with the exponent of $x$ increasing to the right and the exponent of $y$ increasing downward.]
Figure 2. A typical monomial ideal $I \subset \mathbf{C}[x, y]$

See Figure 2 for an image of a typical torus-fixed ideal $I$. Monomials in the ideal $I$ are shaded gray; the generators of $I$ are circled. Monomials not in $I$ form a shape similar to the diagram of a partition, except that it has some infinite rows and columns. The total width of these infinite rows and columns (2, in this example) is the degree $d$ in (3.1). The constant term $\chi$ (= 9 here) can be interpreted as the "renormalized area" of this infinite diagram.

3.1.4. For $\mathrm{Hilb}(\mathbf{C}^3; d, \chi)$, the description of $\mathbf{T}$-fixed points $I$ is similar, but now in terms of 3-dimensional partitions, with possibly infinite legs along the coordinate axes, see Figure 3. The 2D partitions $\lambda_1, \lambda_2, \lambda_3$, on which the infinite legs end, describe the nonreduced structure of $I$ along the coordinate axes. The degree
\[ d = |\lambda_1| + |\lambda_2| + |\lambda_3| \]
is the total cross-section of the infinite legs; the number χ is the renormalized volume of this 3D partition.
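Returning for a moment to the two-dimensional warm-up of 3.1.3, the Hilbert function (3.1) of a monomial ideal can be computed by simply counting monomials. The following Python sketch does this for a hypothetical example, $I = (x^2 y,\ y^3)$, which is not the ideal of Figure 2; the function names are ours.

```python
def in_ideal(a, b, generators):
    """x^a y^b lies in the monomial ideal iff some generator x^i y^j divides it."""
    return any(a >= i and b >= j for i, j in generators)


def codim_up_to_degree(k, generators):
    """Number of monomials x^a y^b with a + b <= k that are not in the ideal."""
    return sum(
        1
        for a in range(k + 1)
        for b in range(k + 1 - a)
        if not in_ideal(a, b, generators)
    )


generators = [(2, 1), (0, 3)]  # hypothetical example: I = (x^2 y, y^3)
values = [codim_up_to_degree(k, generators) for k in range(3, 10)]
print(values)
```

For this ideal the complement has a single infinite row of width 1 along the $x$-axis, and the count equals $k + 5$ for all $k \ge 3$; in the notation of (3.1) this means $d = 1$ and $n = 5$.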
Figure 3. A monomial ideal in $\mathrm{Hilb}(\mathbf{C}^3; d, \chi)$

A general projective toric $X$ corresponds to a lattice polytope $\Delta_X$, with vertices corresponding to $\mathbf{T}$-fixed points, edges to $\mathbf{T}$-invariant $\mathbf{P}^1$'s, et cetera. For example, $(\mathbf{P}^1)^3$ and $\mathbf{P}^3$ correspond to a cube and a simplex, respectively. To specify a $\mathbf{T}$-fixed point in $\mathrm{Hilb}(X; \beta, \chi)$, we place a 3D partition at every vertex of $\Delta_X$. These 3D partitions may have infinite legs along the edges of $\Delta_X$; we require that these legs glue in an obvious way, see Figure 4, left half. We have
\[ \beta = \sum_{\text{edges } E} |\lambda_E| \, [E] \in H_2(X), \]
where $[E]$ is the class of the $\mathbf{T}$-invariant $\mathbf{P}^1$ corresponding to the edge $E$ and $\lambda_E$ is the cross-section profile along $E$. The number $\chi$ is the renormalized volume of this assembly of 3D partitions.

Note that the edge lengths do not have any intrinsic meaning in Figure 4; formally, they have to be viewed as infinitely long. It is an interesting problem to construct a generalization of Donaldson-Thomas theory in which the edge lengths will play a role. This should involve doubling of the degree parameters in the theory.

The right half of Figure 4 shows the complement of the 3D partition structure on the left. Note that it is highly reminiscent of a partially dissolved cubic crystal: some atoms are missing from the corners and along the edges. So, at least as far as the index set is concerned, the localization sum in Donaldson-Thomas theory of $X$ has the shape of a partition function in a random surface model, the surface being the surface of the dissolving crystal. We now move on to the computation of the localization weight.

Figure 4. A $\mathbf{T}$-fixed point in $\mathrm{Hilb}((\mathbf{P}^1)^3; \beta, \chi)$

3.2. Equivariant vertex. The weight of a $\mathbf{T}$-fixed point $I \in \mathrm{Hilb}(X; \beta, \chi)$ in the virtual localization formula for Donaldson-Thomas invariants was computed in [30]. Here, for simplicity, we focus on the case $X = \mathbf{C}^3$ and $\beta = 0$, that is, on the case of a single 3D partition without infinite legs. The general case is parallel.

3.2.1. Let $I_\pi \in \mathrm{Hilb}(\mathbf{C}^3; 0, \chi)$ be a monomial ideal corresponding to a 3D partition $\pi \subset \mathbf{Z}^3_{\ge 0}$. Let $C_\pi \subset \mathbf{Z}^3_{\ge 0}$ denote the complement of $\pi$; we view the elements of $C_\pi$ as the atoms that remain in the crystal. Let $z \in \mathbf{C}^* \subset \mathbf{T}$ act on the coordinates in $\mathbf{C}^3$ by
\[ z \cdot (x_1, x_2, x_3) = (z^{t_1} x_1, z^{t_2} x_2, z^{t_3} x_3). \]
The localization weight $w(\pi)$ of $I_\pi$ will be a rational function of the parameters $t_i$. Let $T$ be the linear function taking the value
\[ T(\square) = t_1 a_1 + t_2 a_2 + t_3 a_3 \]
on a box $\square = (a_1, a_2, a_3) \in \mathbf{Z}^3_{\ge 0}$. For a pair of boxes $\square_1$ and $\square_2$, we define
\[ U(\square_1, \square_2) = \frac{\delta T \, (\delta T + t_1 + t_2)(\delta T + t_1 + t_3)(\delta T + t_2 + t_3)}{(\delta T + t_1)(\delta T + t_2)(\delta T + t_3)(\delta T + t_1 + t_2 + t_3)}, \]
where $\delta T = T(\square_1) - T(\square_2)$.
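The function $U$ is simple enough to evaluate exactly. The Python sketch below implements it with rational arithmetic and checks one consequence of the definition: when $t_1 + t_2 + t_3 = 0$, the product $U(\square_1, \square_2)\, U(\square_2, \square_1)$ equals 1 (this follows from a short cancellation in the formula). The particular weights and boxes are hypothetical, chosen only so that no denominator vanishes.

```python
from fractions import Fraction


def T(box, t):
    """Linear function T(box) = t1*a1 + t2*a2 + t3*a3."""
    return sum(ti * ai for ti, ai in zip(t, box))


def U(box1, box2, t):
    """Pair interaction U(box1, box2) for torus weights t = (t1, t2, t3)."""
    t1, t2, t3 = t
    d = T(box1, t) - T(box2, t)
    num = d * (d + t1 + t2) * (d + t1 + t3) * (d + t2 + t3)
    den = (d + t1) * (d + t2) * (d + t3) * (d + t1 + t2 + t3)
    return num / den  # raises ZeroDivisionError at a pole of U


# Hypothetical weights with t1 + t2 + t3 = 0.
t_cy = (Fraction(1, 3), Fraction(1, 5), Fraction(-8, 15))
pairs = [((2, 1, 0), (0, 0, 0)), ((1, 0, 2), (0, 3, 1))]
products = [U(a, b, t_cy) * U(b, a, t_cy) for a, b in pairs]
print(products)
```

For generic weights the product is a nontrivial rational function of $t_1, t_2, t_3$, and $U$ has poles whenever $\delta T$ hits one of $-t_i$ or $-(t_1 + t_2 + t_3)$, so the sample boxes must be chosen to avoid them.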
Recall that $\chi$ is the number of missing atoms. We would have liked to define $w(\pi)$ by
\[ w(\pi) \ \text{``=''}\ (-q)^{\chi} \prod_{\square_1, \square_2 \in \text{crystal } C_\pi} U(\square_1, \square_2), \tag{3.2} \]
which has a standard grand-canonical Gibbs form with $(-q)$ being the fugacity and
\[ -\log U(\square_1, \square_2)\, U(\square_2, \square_1) \]
being the (translation-invariant) interaction energy between the atoms in positions $\square_1$ and $\square_2$.

3.2.2. Since the product (3.2) is not even close to being well defined or convergent, the following regularization is required. Define
\[ R_\pi(z) = \text{trace of } z \text{ acting on } I_\pi = \sum_{\square \in C_\pi} z^{T(\square)}. \tag{3.3} \]
This can be viewed as a generating function of the set $C_\pi$. One checks that for any 3D partition $\pi$
\[ V_\pi(z) = -R_\pi(z)\, R_\pi(z^{-1}) + R_\emptyset(z)\, R_\emptyset(z^{-1}) \tag{3.4} \]
is a Laurent polynomial in $z^{t_i}$, that is, it has the form
\[ V_\pi(z) = \sum_{a \in \mathbf{Z}^3} v_\pi(a)\, z^{T(a)}, \qquad v_\pi(a) \in \mathbf{Z}, \]
where the sum is finite, that is, $v_\pi(a) = 0$ for all but finitely many $a$. We define the equivariant vertex measure of a 3D partition $\pi$ by
\[ w(\pi) = q^{\chi} \prod_{a \in \mathbf{Z}^3} T(a)^{-v_\pi(a)}. \]
Note that a naive expansion of the $R_\pi(z)\, R_\pi(z^{-1})$ product in (3.4) leads to the infinite product in (3.2).

3.2.3. It is a theorem from [30] that the virtual fundamental class of the Hilbert scheme restricts to the $\mathbf{T}$-fixed point $I_\pi$ as follows:
\[ q^{\chi} \left[ \mathrm{Hilb}(\mathbf{C}^3; 0, \chi) \right]^{vir} \Big|_{I_\pi} = w(\pi). \]
3.2.4. One special case worth noting is when
\[ t_1 + t_2 + t_3 = 0. \tag{3.5} \]
In this case $U(\square_1, \square_2)\, U(\square_2, \square_1) = 1$ and the equivariant vertex measure $w$ becomes uniform on partitions of fixed size. Condition (3.5) is the Calabi-Yau condition: it means restriction to the subtorus in $\mathbf{T}$ preserving the holomorphic 3-form $\Omega = dx_1 \wedge dx_2 \wedge dx_3$ on $\mathbf{C}^3$. This explains why the McMahon function (2.8) appears in Donaldson-Thomas theory. For general $t_i$, the analog of McMahon's identity is the following formula proven in [30]:
\[ \sum_{\pi} w(\pi) = M(-q)^{-\frac{(t_1 + t_2)(t_1 + t_3)(t_2 + t_3)}{t_1 t_2 t_3}}. \tag{3.6} \]
This formula implies Conjecture 2.2 for any toric 3-fold $X$.

3.2.5. If $\pi$ has infinite legs, additional counterterms are needed in (3.4) to make it finite and the measure $w(\pi)$ well-defined [30]. The equivariant vertex is a function of 3 partitions $\lambda, \mu, \nu$ defined by
\[ W(\lambda, \mu, \nu) = \sum_{\pi \text{ ending on } \lambda, \mu, \nu} w(\pi). \tag{3.7} \]
This function, which is the main building block in the localization formula for Donaldson-Thomas invariants, is, in general, rather intricate. Conjecturally it is related to general triple Hodge integrals. In the Calabi-Yau case (3.5) it specializes to the topological vertex [1, 43], which has an expression in terms of Schur functions. The conjectural relation to Hodge integrals is proven in the one-leg case [42]. In the much simpler Calabi-Yau case, it is known in the two-leg case, see [28] and also [27, 40, 25].

3.2.6. Conjecture 2.1 relates the Donaldson-Thomas partition function $Z_{DT}$, which we just interpreted as the partition function of a certain dissolving crystal model, to the Gromov-Witten partition function via the substitution $-q = e^{iu}$. This means that the asymptotic expansion of the free energy $\ln Z_{DT}$ in the thermodynamic limit
\[ -q = \text{fugacity} \to 1 \]
gives a genus-by-genus count of connected curves in Gromov-Witten theory. Letting $q \to -1$ corresponds to letting the energy cost of removal of an
atom from the crystal go to zero. As a result, the expected number of removed atoms
\[ \langle |\pi| \rangle_w \overset{\text{def}}{=} \frac{\sum_\pi |\pi|\, w(\pi)}{\sum_\pi w(\pi)} \sim \frac{(t_1 + t_2)(t_1 + t_3)(t_2 + t_3)}{t_1 t_2 t_3} \, \frac{2\zeta(3)}{\ln(-q)^3} \tag{3.8} \]
diverges. In general, the words "thermodynamic limit" have to be taken with a grain of salt since $w$ is not necessarily a positive measure. However, for example in the uniform measure case (3.5) it is positive for $-q \in (0, 1)$. After scaling by $-\ln(-q)$ in every direction, a macroscopic limit shape emerges. A simulation of the limit shape can be seen in Figure 5.
Figure 5. A random 3D partition of a large number

The limit shape dominates the partition function $Z_{DT}$. The Gromov-Witten partition function $Z_{GW}$ is determined by the fluctuations around the limit shape.

3.2.7. The limit shape of a uniformly random 3D partition of a large number, first determined in [7], is, as it turns out, nothing but the so-called Ronkin function of the simplest plane curve
\[ z + w = 1, \tag{3.9} \]
see [19] for a much more general result.

Surprisingly (or not?) the straight line (3.9) is essentially the Hori-Vafa mirror [14] of $\mathbf{C}^3$, see, e.g., Section 2.5 in [1]. The mirror geometry thus can
be interpreted as the limit shape in the localization formula for the original counting problem. This phenomenon was first observed in [34] in the context of supersymmetric gauge theories on $\mathbf{R}^4$. Namely, in [34] the Seiberg-Witten curve was identified with the limit shape in a certain random partition ensemble originating from localization on the instanton moduli spaces [33]. This limit shape interpretation gave a gauge-theoretic derivation of the Seiberg-Witten prepotential, see [34] and also [32] for a different approach. Via a physical procedure called geometric engineering, supersymmetric gauge theories correspond to the Gromov-Witten theory of certain noncompact toric Calabi-Yau threefolds $X$, see for example [17, 15].

For toric Calabi-Yau $X$, the random surface model can be viewed as a very degenerate limit of the planar dimer model. There is a general method for finding limit shapes in the dimer model, which often gives essentially algebraic answers [18]. In particular, it reproduces the Hori-Vafa mirrors of toric Calabi-Yau 3-folds [20]. It would be extremely interesting to extend the "mirror geometry = limit shape" philosophy to a more general class of varieties and/or theories.

3.2.8. A natural set of observables to average against the equivariant vertex measure is provided by the characteristic classes of the universal sheaf $\mathcal{I}$, see Section 2.3.2, in particular, by the components $\mathrm{ch}_k(\mathcal{I})$ of its Chern character. The restriction $\mathrm{ch}_k(\pi)$ of $\mathrm{ch}_k(\mathcal{I})$ to a fixed point $I_\pi \in \mathrm{Hilb}(\mathbf{C}^3; 0, \chi)$ is determined in terms of the generating function (3.3) by
\[ \sum_k \alpha^k \, \mathrm{ch}_k(\pi) = \frac{R_\pi(e^\alpha)}{R_\emptyset(e^\alpha)}. \]
The algebra generated by $\mathrm{ch}_k(\pi)$ can be viewed as the algebra of symmetric polynomials in $\pi$; this is a 3-dimensional analog of the algebra introduced in [21]. We have $\mathrm{ch}_1(\pi) = 0$, $\mathrm{ch}_2(\pi) = \text{degree} = 0$, and $\mathrm{ch}_3(\pi) = t_1 t_2 t_3 |\pi|$, so from (3.6) we get the evaluation
\[ \langle \mathrm{ch}_3(\pi) \rangle_w = -(t_1 + t_2)(t_1 + t_3)(t_2 + t_3) \, E_3(-q). \]
Here $E_{2k+1}$ are the following "odd weight" analogs of the classical Eisenstein series:
\[ E_{2k+1}(q) = \sum_n q^n \sum_{d | n} d^{2k}, \qquad k = 1, 2, \dots. \tag{3.10} \]
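The step from (3.6) to the evaluation of $\langle \mathrm{ch}_3(\pi) \rangle_w$ rests on the identity $q \frac{d}{dq} \ln M(q) = E_3(q)$, which follows by taking the logarithmic derivative of the product (2.8). In terms of the coefficients $p(n)$ of $M(q)$ it reads $n\, p(n) = \sum_{k=1}^{n} \sigma_2(k)\, p(n-k)$, where $\sigma_2(k) = \sum_{d | k} d^2$. The Python sketch below checks this recursion for small $n$; the helper names are ours.

```python
def divisor_power_sum(power, n):
    """Sum of d**power over the divisors d of n."""
    return sum(d ** power for d in range(1, n + 1) if n % d == 0)


def eisenstein_odd(k, N):
    """Coefficients of q^0..q^N of E_{2k+1}(q) as defined in (3.10)."""
    return [0] + [divisor_power_sum(2 * k, n) for n in range(1, N + 1)]


def mcmahon_coefficients(N):
    """Coefficients of q^0..q^N in prod_{n>0} (1 - q^n)^(-n)."""
    coeffs = [1] + [0] * N
    for n in range(1, N + 1):
        for _ in range(n):
            for m in range(n, N + 1):
                coeffs[m] += coeffs[m - n]
    return coeffs


N = 12
p = mcmahon_coefficients(N)
e3 = eisenstein_odd(1, N)
# q d/dq M(q) = E_3(q) * M(q), compared coefficient by coefficient.
identity_holds = all(
    n * p[n] == sum(e3[k] * p[n - k] for k in range(1, n + 1))
    for n in range(1, N + 1)
)
print(e3[:5], identity_holds)
```

Since $\mathrm{ch}_3(\pi) = t_1 t_2 t_3 |\pi|$ and $|\pi|$ is the exponent of $q$ in $w(\pi)$, applying $q \frac{d}{dq} \ln$ to (3.6) and multiplying by $t_1 t_2 t_3$ reproduces the displayed evaluation.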
One further computes, for example,
\[ \langle \mathrm{ch}_4(\pi) \rangle_w = -\frac{1}{2} (t_1 + t_2)(t_1 + t_3)(t_2 + t_3)(t_1 + t_2 + t_3) \, q \frac{d}{dq} E_3(-q), \]
and the natural conjecture is that all $\langle \mathrm{ch}_k(\pi) \rangle_w$ belong to the differential algebra generated by the functions (3.10) and the operator $q \frac{d}{dq}$. A similar statement for ordinary 2D partitions and the usual even weight Eisenstein series was proven in [5]. Note, in particular, that this conjecture implies that the "thermodynamic" asymptotics of $\langle \mathrm{ch}_k(\pi) \rangle_w$ as $q \to -1$ is completely determined by the first few coefficients of its "low temperature" $q$-expansion. For a complete 3-fold $X$, a similar property is implied by the conjectural rationality of the reduced partition function $Z'_{DT}$.

3.2.9. Recall that on the Gromov-Witten side, the observables $\mathrm{ch}_k(\mathcal{I})$ are supposed to correspond to descendent invariants. While working out an exact match, especially in the equivariant theory, remains an open problem (see the discussion in [30]), there is one case that we understand well.

Let $X = \mathbf{P}^1 \times \mathbf{C}^2$ and let $\beta$ be $d$ times the class of $\mathbf{P}^1 \times \{0\}$. Let $\mathbf{C}^*$ act on $\mathbf{C}^2$ with opposite weights. The $\mathbf{C}^*$-equivariant Gromov-Witten theory of $X$ is the Gromov-Witten theory of $\mathbf{P}^1$ with an additional insertion of two Chern polynomials of the Hodge bundle. Because of our choice of weights and Mumford's relation, these Chern polynomials cancel out, leaving us with the Gromov-Witten theory of $\mathbf{P}^1$.

A complete description of the Gromov-Witten theory of $\mathbf{P}^1$ was obtained in [37, 38, 39]. In particular, we have the following formula for disconnected, degree $d$ descendent invariants of the point class:

[Equation (3.11)]
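where the summation is over partitions $\lambda$ of $d$, $\dim \lambda$ is the dimension of the corresponding representation of the symmetric group, and $p_k$ is the following polynomial of $\lambda$:
\[ p_k(\lambda) = \sum_i \left[ \left(\lambda_i - i + \tfrac12\right)^k - \left(-i + \tfrac12\right)^k \right] + (1 - 2^{-k})\, \zeta(-k) \ \text{``=''}\ \sum_i \left(\lambda_i - i + \tfrac12\right)^k. \]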
Here the first line is the ζ-regularization of the divergent sum in the second line. The weight function in (3.11) is known as the Plancherel measure on partitions of d. Sums of the form (3.11) are distinguished discrete analogs of matrix integrals mentioned at the very beginning of the lecture, see, e.g., the discussion in [35]. What happens on the Donaldson-Thomas side is that with our choice of torus weights the contribution of most T-fixed points to the localization formula vanishes. The only remaining ones are of the form seen in Figure 6, they are pure edges, that is, cylinders over an ordinary partition λ.
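For readers who want to experiment with the Plancherel measure, here is a short Python sketch. It assumes the usual probabilistic normalization, in which a partition $\lambda$ of $d$ gets weight $(\dim \lambda)^2 / d!$ (the normalization used in (3.11) may differ by a factor depending only on $d$), and it computes $\dim \lambda$ by the hook length formula, a standard fact not stated in the text. The function names are ours.

```python
from fractions import Fraction
from math import factorial


def partitions(d, max_part=None):
    """Yield all partitions of d as non-increasing tuples."""
    if max_part is None:
        max_part = d
    if d == 0:
        yield ()
        return
    for first in range(min(d, max_part), 0, -1):
        for rest in partitions(d - first, first):
            yield (first,) + rest


def dim_lambda(lam):
    """Dimension of the symmetric group representation, via hook lengths."""
    d = sum(lam)
    # conj[j] = number of rows of length > j, i.e. the length of column j
    conj = [sum(1 for part in lam if part > j) for j in range(lam[0])] if lam else []
    hooks = 1
    for i, row in enumerate(lam):
        for j in range(row):
            hooks *= (row - j - 1) + (conj[j] - i - 1) + 1  # arm + leg + 1
    return factorial(d) // hooks


def plancherel(lam):
    """Plancherel weight (dim lambda)^2 / d!, assumed normalization."""
    return Fraction(dim_lambda(lam) ** 2, factorial(sum(lam)))


for d in range(1, 6):
    print(d, sum(plancherel(lam) for lam in partitions(d)))
```

The weights sum to 1 for every $d$, which is the classical identity $\sum_{|\lambda| = d} (\dim \lambda)^2 = d!$.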
Figure 6. A pure edge
Sure enough, the localization weight of such a pure edge in this case specializes to the Plancherel weight of its cross-section $\lambda$. Also, the restriction of $\mathrm{ch}_k(\mathcal{I})$ to such a fixed point has a simple linear relation to the numbers $p_k(\lambda)$.

It was noticed by several people, in particular in [24, 29], that the sum (3.11) is closely related to localization expressions in the classical cohomology of the Hilbert scheme of $d$ points in $\mathbf{C}^2$. Perhaps the best explanation for this relation is that it is a specialization of the triangle of equivalences in Figure 7, see [41].
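[Figure 7 is a triangle with vertices: Quantum cohomology of $\mathrm{Hilb}_d(\mathbf{C}^2)$; Gromov-Witten theory of $\mathbf{P}^1 \times \mathbf{C}^2$; Donaldson-Thomas theory of $\mathbf{P}^1 \times \mathbf{C}^2$.]

Figure 7. Three points of view on curves in $\mathbf{P}^1 \times \mathbf{C}^2$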
References

[1] M. Aganagic, A. Klemm, M. Marino, C. Vafa, The topological vertex, hep-th/0305132.
[2] M. Atiyah and R. Bott, The moment map and equivariant cohomology, Topology 23 (1984), 1–28.
[3] K. Behrend, Gromov-Witten invariants in algebraic geometry, Invent. Math. 127 (1997), 601–617.
[4] K. Behrend and B. Fantechi, The intrinsic normal cone, Invent. Math. 128 (1997), 45–88.
[5] S. Bloch and A. Okounkov, The Character of the Infinite Wedge Representation, Adv. Math. 149 (2000), no. 1, 1–60.
[6] J. Bryan and R. Pandharipande, The local Gromov-Witten theory of curves, math.AG/0411037.
[7] R. Cerf and R. Kenyon, The low-temperature expansion of the Wulff crystal in the three-dimensional Ising model, Comm. Math. Phys. 222 (2001), 147–179.
[8] D. Cox and S. Katz, Mirror symmetry and algebraic geometry, American Mathematical Society, Providence, RI, 1999.
[9] S. Donaldson and R. Thomas, Gauge theory in higher dimensions, in The geometric universe: science, geometry, and the work of Roger Penrose, S. Huggett et al., eds., Oxford Univ. Press, 1998.
[10] G. Ellingsrud, S. Strømme, Bott's formula and enumerative geometry, J. Amer. Math. Soc. 9 (1996), no. 1, 175–193.
[11] C. Faber and R. Pandharipande, Hodge integrals and Gromov-Witten theory, Invent. Math. 139 (2000), 173–199.
[12] W. Fulton and R. Pandharipande, Notes on stable maps and quantum cohomology, Algebraic geometry, Santa Cruz 1995, 45–96, Proc. Sympos. Pure Math., 62, Part 2, AMS, Providence, RI, 1997.
[13] T. Graber and R. Pandharipande, Localization of virtual classes, Invent. Math. 135 (1999), 487–518.
[14] K. Hori and C. Vafa, Mirror Symmetry, hep-th/0002222.
[15] A. Iqbal and A.-K. Kashani-Poor, SU(N) Geometries and Topological String Amplitudes, hep-th/0306032.
[16] A. Iqbal, N. Nekrasov, A. Okounkov, C. Vafa, Quantum Foam and Topological Strings, hep-th/0312022.
[17] S. Katz, A. Klemm, C. Vafa, Geometric engineering of quantum field theories, Nuclear Phys. B 497 (1997), no. 1-2, 173–195.
[18] R. Kenyon, A. Okounkov, Limit shapes and complex Burgers equation, in preparation.
[19] R. Kenyon, A. Okounkov, S. Sheffield, Dimers and Amoebae, math-ph/0311005.
[20] R. Kenyon, A. Okounkov, C. Vafa, in preparation.
[21] S. Kerov and G. Olshanski, Polynomial functions on the set of Young diagrams, C. R. Acad. Sci. Paris Sér. I Math. 319 (1994), no. 2, 121–126.
[22] M. Kontsevich, Intersection theory on the moduli space of curves and the matrix Airy function, Comm. Math. Phys. 147 (1992), 1–23.
[23] M. Kontsevich, Enumeration of rational curves via torus actions, The moduli space of curves (Texel Island, 1994), 335–368, Progr. Math., 129, Birkhäuser Boston, Boston, MA, 1995.
[24] W.-P. Li, Zh. Qin, W. Wang, Hilbert schemes, integrable hierarchies, and Gromov-Witten theory, math.AG/0302211.
[25] J. Li, C.-C. Liu, K. Liu, J. Zhou, A Mathematical Theory of the Topological Vertex, math.AG/0408426.
[26] J. Li and G. Tian, Virtual moduli cycles and Gromov-Witten invariants of algebraic varieties, JAMS 11 (1998), 119–174.
[27] C.-C. Liu, K. Liu, J. Zhou, A Proof of a Conjecture of Marino-Vafa on Hodge Integrals, math.AG/0306434.
[28] C.-C. Liu, K. Liu, J. Zhou, A Formula of Two-Partition Hodge Integrals, math.AG/0310272.
[29] A. Losev, A. Marshakov, N. Nekrasov, Small Instantons, Little Strings and Free Fermions, hep-th/0302191.
[30] D. Maulik, N. Nekrasov, A. Okounkov, and R. Pandharipande, Gromov-Witten theory and Donaldson-Thomas theory, I and II, math.AG/0312059, math.AG/0406092.
[31] M. Mirzakhani, Weil-Petersson volumes and intersection theory on the moduli spaces of curves, available from http://abel.math.harvard.edu/~mirzak/.
[32] H. Nakajima, K. Yoshioka, Instanton counting on blowup, I, math.AG/0306198.
[33] N. Nekrasov, Seiberg-Witten Prepotential From Instanton Counting, hep-th/0206161.
[34] N. Nekrasov and A. Okounkov, Seiberg-Witten Theory and Random Partitions, hep-th/0306238.
[35] A. Okounkov, The uses of random partitions, math-ph/0309015.
[36] A. Okounkov and R. Pandharipande, Gromov-Witten theory, Hurwitz numbers, and matrix models, I, math.AG/0101147.
[37] A. Okounkov and R. Pandharipande, Gromov-Witten theory, Hurwitz theory, and completed cycles, math.AG/0204305.
[38] A. Okounkov and R. Pandharipande, The equivariant Gromov-Witten theory of P^1, math.AG/0207233.
[39] A. Okounkov and R. Pandharipande, Virasoro constraints for target curves, math.AG/0308097.
[40] A. Okounkov and R. Pandharipande, Hodge integrals and invariants of the unknot, Geom. Topol. 8 (2004), 675–699.
[41] A. Okounkov and R. Pandharipande, Quantum cohomology of the Hilbert scheme of points in the plane, math.AG/0411210.
[42] A. Okounkov and R. Pandharipande, Gromov-Witten/Donaldson-Thomas correspondence for local curves, in preparation.
[43] A. Okounkov, N. Reshetikhin, and C. Vafa, Quantum Calabi-Yau and classical crystals, hep-th/0310061.
[44] R. Thomas, A holomorphic Casson invariant for Calabi-Yau 3-folds, and bundles on K3 fibrations, JDG 54 (2000), 367–438.
[45] E. Witten, Two-dimensional gravity and intersection theory on moduli space, Surveys in Diff. Geom. 1 (1991), 243–310.
4ECM Stockholm 2004 © 2005 European Mathematical Society
On Heegaard Diagrams and Holomorphic Disks
Peter Ozsváth and Zoltán Szabó
1. Introduction
The aim of this paper is to give a quick introduction to Heegaard Floer homology [31] for three-dimensional manifolds and a related Floer homology invariant for knots [37], [41], and to discuss some recent results.
Let $Y$ be an oriented closed 3-manifold. In its simplest form, Heegaard Floer homology associates to $Y$ a finitely generated Abelian group $\widehat{HF}(Y)$. Note that this functor can be regarded as a (3+1)-dimensional quantum field theory, since a smooth oriented cobordism $W$ between two closed oriented 3-manifolds $Y_1$ and $Y_2$ induces a map
$$F_W : \widehat{HF}(Y_1) \longrightarrow \widehat{HF}(Y_2).$$
Furthermore, if we decompose such a cobordism as $W = W_1 \cup_{Y_2} W_2$, then the induced maps satisfy a multiplicative formula $F_W = F_{W_2} \cdot F_{W_1}$.
Heegaard Floer homology seems closely related to both the instanton Floer homology [11], [4] and Seiberg-Witten Floer homology [25]. Similarly, the four-dimensional invariant is a natural analogue of Donaldson invariants [3], [5], and Seiberg-Witten invariants [44], [30].

2. Heegaard diagrams
The construction of $\widehat{HF}(Y)$ uses both topological and analytical tools, such as Heegaard diagrams and holomorphic disks. Its starting point is a decomposition of $Y$ into more elementary pieces called handlebodies. A genus $g$ handlebody $U$ is diffeomorphic to a regular neighborhood of a bouquet of $g$ circles in $\mathbb{R}^3$. The boundary of $U$ is an oriented surface of genus $g$. If we glue two such handlebodies together along their common boundary, then we get a closed 3-manifold $Y = U_0 \cup_\Sigma U_1$, oriented so that $\Sigma$ is the oriented boundary of $U_0$. This is called a Heegaard decomposition for $Y$. It is easy to describe such a decomposition with the help of Heegaard diagrams.

Peter Ozsváth was partially supported by NSF grant number DMS-0234311. Zoltán Szabó was partially supported by NSF grant number DMS-0107792.
The diagram is an oriented surface $\Sigma$ of genus $g$ together with a collection of closed embedded curves $\alpha_1, \dots, \alpha_g, \beta_1, \dots, \beta_g$ in $\Sigma$, where the $\alpha$ curves are disjoint from each other and $\Sigma_g - \alpha_1 - \cdots - \alpha_g$ is connected, and the $\beta$ curves satisfy the same properties. This diagram uniquely determines a Heegaard decomposition, where the $\alpha$ curves bound disjoint embedded disks in $U_0$, and the $\beta$ circles bound similar disks in the handlebody $U_1$.
According to a classical result of Singer [43], any oriented, closed three-manifold admits a Heegaard decomposition and consequently a Heegaard diagram. It is important to note, however, that a given 3-manifold has many different Heegaard diagrams. For example, the following moves do not change the underlying 3-manifold:
• isotopies: replace $\alpha_i$ by a curve $\alpha_i'$ which is isotopic to it through isotopies which are disjoint from the other $\alpha_j$ ($j \neq i$); or, the same moves for the $\beta$ curves.
• handleslides: replace $\alpha_i$ by $\alpha_i'$, which is a curve with the property that $\alpha_i \cup \alpha_i' \cup \alpha_j$ bound a pair of pants which is disjoint from the remaining $\alpha_k$ ($k \neq i, j$); or, the same moves for the $\beta$ curves.
• stabilizations/destabilizations: A stabilization replaces $\Sigma$ by its connected sum with a genus one surface, $\Sigma' = \Sigma \# E$, and replaces $\{\alpha_1, \dots, \alpha_g\}$ and $\{\beta_1, \dots, \beta_g\}$ by $\{\alpha_1, \dots, \alpha_{g+1}\}$ and $\{\beta_1, \dots, \beta_{g+1}\}$ respectively. Here $\alpha_{g+1}$ and $\beta_{g+1}$ are a pair of curves supported in $E$, meeting transversally in a single point.
In the opposite direction, it can be shown that any two Heegaard diagrams for the same 3-manifold can be connected by a finite sequence of the above Heegaard moves.

3. Construction of $\widehat{HF}(Y)$
Let us fix a diagram $(\Sigma_g, \alpha_1, \dots, \alpha_g, \beta_1, \dots, \beta_g)$ for $Y$ together with an additional basepoint $z \in \Sigma_g - \alpha_1 - \cdots - \alpha_g - \beta_1 - \cdots - \beta_g$. To this data we associate the $g$-fold symmetric product $\mathrm{Sym}^g(\Sigma_g)$. This space is constructed by taking the $g$-fold product of $\Sigma_g$ and dividing it by the symmetric group on $g$ letters; in other words, $\mathrm{Sym}^g(\Sigma_g)$ is the space of unordered $g$-tuples of points in $\Sigma_g$. Fixing a complex structure on $\Sigma_g$ induces a Kähler structure on $\mathrm{Sym}^g(\Sigma_g)$. In this $g$-dimensional complex manifold the $\alpha$ and $\beta$ curves induce a pair of totally real $g$-dimensional tori $T_\alpha = \alpha_1 \times \cdots \times \alpha_g$ and $T_\beta = \beta_1 \times \cdots \times \beta_g$.
Furthermore the basepoint $z$ induces a complex codimension 1 subvariety $V_z = z \times \mathrm{Sym}^{g-1}(\Sigma_g)$ in $\mathrm{Sym}^g(\Sigma_g)$ which is disjoint from $T_\alpha$ and $T_\beta$.
The Heegaard Floer homology of $Y$ is the homology of a chain complex $\widehat{CF}(T_\alpha, T_\beta, z)$ whose generators are the intersection points $T_\alpha \cap T_\beta$, where (informally) the boundary map is given by counting holomorphic disks in the complement of $V_z$. More precisely, given two intersection points $x, y \in T_\alpha \cap T_\beta$ we study Whitney disks:
$$\left\{ u : D \longrightarrow \mathrm{Sym}^g(\Sigma) \;\middle|\; u(-i) = x,\ u(i) = y,\ u(e_1) \subset T_\alpha,\ u(e_2) \subset T_\beta \right\},$$
where $D$ is the unit disk in $\mathbb{C}$, $e_1 \subset \partial D$ denotes the arc where $\mathrm{Re}(z) \geq 0$, and $e_2 \subset \partial D$ denotes the arc where $\mathrm{Re}(z) \leq 0$. We say that two such maps are homotopic if we can connect them by a one-parameter family of maps satisfying the same boundary conditions. Let $\pi_2(x, y)$ be the set of homotopy classes connecting $x$ to $y$.
For a given $\phi \in \pi_2(x, y)$ let $\mathcal{M}(\phi)$ denote the moduli space of holomorphic representatives. Also let $\mu(\phi)$ denote the Maslov index of $\phi$, that is, the expected dimension of $\mathcal{M}(\phi)$ (see also [42]). This means that if we vary the almost complex structure of $\mathrm{Sym}^g(\Sigma_g)$ in an $n$-dimensional family, the corresponding parametrized moduli space has dimension $n + \mu(\phi)$ around solutions that are smoothly cut out by the defining equation. Another useful function is given by computing the algebraic intersection number between $\phi$ and $V_z$. This gives $n_z : \pi_2(x, y) \longrightarrow \mathbb{Z}$. Both of these maps are additive in the sense that $n_z(\phi * \psi) = n_z(\phi) + n_z(\psi)$ and $\mu(\phi * \psi) = \mu(\phi) + \mu(\psi)$, where $*$ denotes the natural juxtaposition operation.
Our aim will be to count pseudo-holomorphic disks. For this to make sense, we need to have a sufficiently generic situation, so that for each $\phi$ the moduli spaces $\mathcal{M}(\phi)$ are cut out transversally. In particular, if $\mathcal{M}(\phi)$ is non-empty then
$$\dim \mathcal{M}(\phi) = \mu(\phi). \tag{3.1}$$
To achieve this, we need to introduce a suitable perturbation of the notion of pseudo-holomorphic disks, see for instance Section 3 of [31], see also [14], [15]. Specifically, for such a perturbation, we can arrange Equation (3.1) to hold for all homotopy classes $\phi$ with $\mu(\phi) \leq 2$. Since there is a one-parameter family of holomorphic automorphisms of the disk which preserve $\pm i$ and the boundary arcs $e_1$ and $e_2$, the moduli space $\mathcal{M}(\phi)$ admits a free action by $\mathbb{R}$, provided that $\phi$ is non-trivial. In particular, if $\phi$ has $\mu(\phi) = 1$, then $\mathcal{M}(\phi)/\mathbb{R}$ is a zero-dimensional manifold.
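We then define the boundary map on $\widehat{CF}(T_\alpha, T_\beta, z)$ by the formula
$$\partial x = \sum_{y} \ \sum_{\{\phi \in \pi_2(x,y) \,\mid\, \mu(\phi) = 1,\ n_z(\phi) = 0\}} \#\left(\frac{\mathcal{M}(\phi)}{\mathbb{R}}\right) \cdot y. \tag{3.2}$$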
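Once the signed counts $\#(\mathcal{M}(\phi)/\mathbb{R})$ are known, Equation (3.2) is simply an integer matrix acting on the free Abelian group generated by $T_\alpha \cap T_\beta$. The following Python sketch illustrates the bookkeeping on a made-up complex with three generators; the counts are hypothetical and do not come from any actual Heegaard diagram, and the helper names are illustrative. It checks that the matrix squares to zero and computes the rank of the homology over $\mathbb{Q}$ (so torsion is ignored).

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two integer matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def rank(mat):
    """Rank over Q by Gaussian elimination with exact fractions."""
    rows = [[Fraction(v) for v in row] for row in mat]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def is_differential(d):
    """True when d composed with itself is the zero matrix."""
    return all(v == 0 for row in matmul(d, d) for v in row)

def homology_rank(d):
    """Total rank of ker(d)/im(d) over Q: (n - rank) - rank."""
    return len(d) - 2 * rank(d)

# Hypothetical counts: d[y][x] is the coefficient of generator y in the
# boundary of generator x.  Here the boundary of x0 is x1, and x1, x2 are cycles.
d = [[0, 0, 0],
     [1, 0, 0],
     [0, 0, 0]]

print(is_differential(d), homology_rank(d))
```

In this toy complex the generators x0 and x1 cancel in homology and only x2 survives, so the homology has rank one.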
After establishing energy bounds for pseudo-holomorphic representatives of $\phi$, we can apply Gromov's compactness result, see for example [21], [12], [45], [39]. This implies that for generic choices of almost complex structures $\mathcal{M}(\phi)/\mathbb{R}$ is a compact zero-dimensional manifold. This moduli space also comes with an orientation, and $\#(\mathcal{M}(\phi)/\mathbb{R})$ denotes the algebraic count of points in it. In the case when $b_1(Y) = 0$ there are only finitely many terms in the above equation. However, when $b_1(Y) > 0$ we have to restrict to a special class of admissible Heegaard diagrams to ensure that only finitely many homotopy classes in Equation (3.2) support holomorphic disks, see [31].
By analyzing Gromov limits of pseudo-holomorphic disks, one can prove that $\partial^2 = 0$, i.e., that $\widehat{CF}(T_\alpha, T_\beta, z)$ is a chain complex.
In [31] we study the effect of the Heegaard moves on the Floer homology and prove the following:
Theorem 3.1. Let $(\Sigma_g, \alpha, \beta, z)$ and $(\Sigma'_{g'}, \alpha', \beta', z')$ be two (admissible) pointed Heegaard diagrams for $Y$. Then the two corresponding chain complexes are chain homotopy equivalent. In particular $\widehat{HF}(\alpha, \beta, z)$ and $\widehat{HF}(\alpha', \beta', z')$ are isomorphic.
The proof can be broken up into parts, where one shows that the homology groups are identified as the Heegaard diagram undergoes the following changes:
(1) the complex structure over $\Sigma$ is varied
(2) the attaching circles are moved by isotopies (in the complement of $z$)
(3) the attaching circles are moved by handle-slides (in the complement of $z$)
(4) the Heegaard diagram is stabilized.
The first step is a direct adaptation of the corresponding fact from Lagrangian Floer theory (independence of the particular compatible almost-complex structure), [12], [13]. To see the second step, we observe that any isotopy of the $\alpha$ and $\beta$ can be realized as a sequence of exact Hamiltonian isotopies and metric changes over $\Sigma$. The third step follows from naturality properties of the Floer homology theories (using a holomorphic triangle construction) and a direct calculation in a special case (where handle-slides are made over a $g$-fold connected sum of $S^1 \times S^2$). The final step can be seen as an invariance of the theory under a natural degeneration of the $(g+1)$-fold symmetric product of the connected sum of $\Sigma$ with $E$, as the connected sum neck is stretched, compare also [28], [22].
As a corollary of Theorem 3.1 we can define $\widehat{HF}(Y) = \widehat{HF}(T_\alpha, T_\beta, z)$ by using any admissible Heegaard diagram for $Y$.
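To make the generators of the chain complex concrete: a point of $T_\alpha \cap T_\beta$ in $\mathrm{Sym}^g(\Sigma_g)$ is an unordered $g$-tuple of points of $\Sigma_g$ in which every $\alpha$ curve and every $\beta$ curve is used exactly once, so the generators can be listed from the pairwise intersections $\alpha_i \cap \beta_j$. The Python sketch below does this. The function name `generators`, the input format (a $g \times g$ table of point labels) and both sample tables are illustrative assumptions; the sketch says nothing about the differential.

```python
from itertools import permutations, product

def generators(points):
    """List the points of T_alpha ∩ T_beta.

    points[i][j] is a list of labels for the intersection points of
    alpha_i with beta_j (hypothetical input format).  A generator picks a
    permutation sigma and one point of alpha_i ∩ beta_sigma(i) for each i.
    """
    g = len(points)
    gens = []
    for sigma in permutations(range(g)):
        choices = [points[i][sigma[i]] for i in range(g)]
        # product() yields nothing as soon as some alpha_i misses beta_sigma(i)
        gens.extend(product(*choices))
    return gens

# Genus one: a diagram whose two curves meet in three points.
genus_one = [[["x0", "x1", "x2"]]]
print(generators(genus_one))

# A made-up genus two table of intersections.
genus_two = [[["a"], ["b", "c"]],
             [["d"], ["e"]]]
print(generators(genus_two))
```

The number of generators is the permanent of the matrix of intersection counts; for the genus two table this is 1·1 + 2·1 = 3.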
As an easy example, note that $S^3$ can be given a genus one Heegaard diagram, with two attaching circles $\alpha_1$ and $\beta_1$, which meet in a unique transverse intersection point. Correspondingly, the complex $\widehat{CF}(S^3)$ in this case has a single generator, and there are no differentials. Hence, $\widehat{HF}(S^3) \cong \mathbb{Z}$. In fact when $H_1(Y) = 0$, the Floer homology has a natural $\mathbb{Z}$ grading and $\widehat{HF}(S^3)$ is supported in degree 0.
As we mentioned in the introduction, $\widehat{HF}$ is the simplest version of Heegaard Floer homology. The other versions $HF^+$ and $HF^-$ are defined by counting all the holomorphic disks in $\mathrm{Sym}^g(\Sigma_g)$ (not just those with $n_z = 0$), and using $V_z$ to give a filtration on the corresponding chain complex. In fact these versions play a prominent role when one studies invariants for smooth closed oriented four-manifolds, see [32].

4. Floer homology for knots in $S^3$
We describe here constructions of Heegaard Floer homology applicable to knots. For simplicity, we restrict attention to knots in $S^3$. This knot invariant was introduced in [37] and also independently by Rasmussen in [40], [41].
Let us consider a Heegaard diagram $(\Sigma, \alpha, \beta)$ for $S^3$ equipped with two basepoints $w$ and $z$. This data gives rise to a knot in $S^3$ as follows. We connect $w$ and $z$ by a curve $a$ in $\Sigma - \alpha_1 - \cdots - \alpha_g$ and also by another curve $b$ in $\Sigma - \beta_1 - \cdots - \beta_g$. By pushing $a$ and $b$ into $U_\alpha$ and $U_\beta$ respectively, we obtain a knot $K \subset S^3$. We call the data $(\Sigma, \alpha, \beta, w, z)$ a two-pointed Heegaard diagram compatible with the knot $K$. Given a knot $K$ one can always find such a Heegaard diagram.
The simplest construction now is to consider a differential on $T_\alpha \cap T_\beta$ defined analogously to Equation (3.2), only now we count holomorphic disks for which $n_z(\phi) = n_w(\phi) = 0$:
$$\partial_K(x) = \sum_{y} \ \sum_{\{\phi \in \pi_2(x,y) \,\mid\, \mu(\phi) = 1,\ n_z(\phi) = n_w(\phi) = 0\}} \#\left(\frac{\mathcal{M}(\phi)}{\mathbb{R}}\right) \cdot y. \tag{4.1}$$
There are two obvious differences between this picture and the usual chain complex $\widehat{CF}(S^3)$. The first one is that we are forced to use a complicated Heegaard diagram of $S^3$, a diagram which is compatible with $K$. (The standard genus 1 Heegaard diagram in which $\alpha$ and $\beta$ intersect each other once is compatible with the unknot only.) On the bright side, however, we are only counting holomorphic disks in the complement of $V_z$ and $V_w$, and as we will see this gives a bigraded theory.
For $x$ and $y$ we can define a difference $f(x, y) = n_z(\phi) - n_w(\phi)$, where $\phi \in \pi_2(x, y)$. It is easy to see that $f(x, y)$ is additive and independent of the choice of $\phi$. It is also clear that $f(x, y) \neq 0$ gives an obstruction for $y$ to appear in $\partial_K(x)$. In particular it follows that $f$ gives a decomposition of the chain complex.
With a little work we can lift $f$ uniquely to a function $F : T_\alpha \cap T_\beta \longrightarrow \mathbb{Z}$ satisfying the relation
$$F(x) - F(y) = f(x, y), \tag{4.2}$$
and the additional symmetry
$$\#\{x \in T_\alpha \cap T_\beta \mid F(x) = i\} \equiv \#\{x \in T_\alpha \cap T_\beta \mid F(x) = -i\} \pmod 2$$
for all $i$ (compare, more generally, Equation (4.3)).
The second grading comes from the Maslov grading. For $x, y \in T_\alpha \cap T_\beta$ let $g(x, y) = \mu(\phi)$, where $\phi \in \pi_2(x, y)$ with $n_w(\phi) = 0$. In itself this gives only a relative grading. However, if we forget the basepoint $z$ we get $(\Sigma, \alpha, \beta, w)$, which is a Heegaard diagram of $S^3$, and according to Theorem 3.1 the homology of $\widehat{CF}(T_\alpha, T_\beta, w)$ is isomorphic to $\mathbb{Z}$. Using the normalization that this homology is supported in grading zero we get a function $G : T_\alpha \cap T_\beta \longrightarrow \mathbb{Z}$ that associates to each intersection point its absolute grading in $\widehat{CF}(T_\alpha, T_\beta, w)$. Clearly $G(x) - G(y) = g(x, y)$.
Using these functions, let $C_{i,j}$ denote the free Abelian group generated by those intersection points $x \in T_\alpha \cap T_\beta$ that satisfy $i = F(x)$, $j = G(x)$. Then the boundary map can be written as
$$\partial_K : \bigoplus_{i,j} C_{i,j} \longrightarrow \bigoplus_{i,j} C_{i,j}.$$
According to our previous discussion $\partial_K$ respects $F$ and decreases the grading $G$ by 1, so it maps $C_{i,j}$ to $C_{i,j-1}$. It is proved in [37], [41] that the homology of this chain complex does not depend on the choice of two-pointed Heegaard diagram for $K$, and so for each $i, j$ we get a knot invariant $H_{i,j}(K)$.
The Floer homology of $K$ is related to the symmetrized Alexander polynomial $\Delta_K(T)$ of $K$. In particular
$$\sum_{x \in T_\alpha \cap T_\beta} (-1)^{G(x)} \cdot T^{F(x)} = \Delta_K(T). \tag{4.3}$$
Clearly this implies
$$\sum_{i} \sum_{j} (-1)^j \cdot \mathrm{rk}(H_{i,j}(K)) \cdot T^i = \Delta_K(T). \tag{4.4}$$
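Equation (4.4) can be checked mechanically once the ranks of the groups $H_{i,j}(K)$ are known. The following Python sketch takes a dictionary of ranks (a hypothetical input format, as is the function name) and returns the coefficients of the graded Euler characteristic. The sample ranks are an assumption chosen for illustration: three groups of rank one, in bidegrees $(1, 0)$, $(0, -1)$ and $(-1, -2)$, for which the result is $T - 1 + T^{-1}$, the symmetrized Alexander polynomial of the trefoil.

```python
from collections import defaultdict

def euler_characteristic(ranks):
    """Graded Euler characteristic of a bigraded group.

    ranks maps a bidegree (i, j) to rk H_{i,j}; the result maps each
    exponent i to the coefficient of T^i, i.e. sum_j (-1)^j rk H_{i,j}.
    """
    coeffs = defaultdict(int)
    for (i, j), rk in ranks.items():
        sign = -1 if j % 2 else 1  # (-1)^j, also valid for negative j
        coeffs[i] += sign * rk
    return {i: c for i, c in sorted(coeffs.items()) if c != 0}

# Illustrative ranks (an assumption, see the text above).
sample_ranks = {(1, 0): 1, (0, -1): 1, (-1, -2): 1}
print(euler_characteristic(sample_ranks))
```

Groups in the same filtration level $i$ whose Maslov gradings $j$ differ by an odd number cancel in this sum, which is why the Euler characteristic carries less information than the groups themselves.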
It is interesting to compare this relation with [1], [29], and [10].

5. Calculations of knot Floer homology
In this section we will study an explicit construction of a two-pointed Heegaard diagram compatible with $K$.
Let us fix an oriented knot projection for $K$. Let $v_1, \dots, v_n$ denote the double points in the projection. Fix an edge $e$ which appears in the closure of the unbounded region $A$ in the planar projection. If we forget the pattern of over and under crossings in the diagram we get an immersed circle $C$ in the plane. Then an $\epsilon$-neighborhood $\mathrm{nd}(C)$ is a handlebody of genus $n + 1$. Clearly $S^3 - \mathrm{nd}(C)$ is also a handlebody, so we get a Heegaard diagram of $S^3$. Let $\Sigma$ be the oriented boundary of $S^3 - \mathrm{nd}(C)$. This will be our Heegaard surface, but we still have to find the $\alpha$ and $\beta$ circles and the two basepoints.
The complement of $C$ in the plane has $n + 2$ components. To each region, except for $A$, we associate an $\alpha$ curve, which is the intersection of the region with $\Sigma$. It is easy to see that $\Sigma - \alpha_1 - \cdots - \alpha_{n+1}$ is connected and all $\alpha_i$ bound disjoint disks in $S^3 - \mathrm{nd}(C)$. Fix a point in the edge $e$ and let $\beta_{n+1}$ be the meridian for $K$ around this point. The curves $\beta_1, \dots, \beta_n$ correspond to the double points $v_1, \dots, v_n$, as described in Figure 1. Note that this is where we use the pattern of over and under crossings. Finally we choose $w$ and $z$ on the two sides of $\beta_{n+1}$. They are placed in such a way that the small arc oriented from $z$ to $w$ points in the same direction as $K$. This arc is in the complement of the $\alpha$ curves. We can also choose a long arc from $w$ to $z$ in the complement of the $\beta$ curves that travels along the knot $K$. It follows that this two-pointed Heegaard diagram is compatible with $K$, so we can use it to compute $H_{i,j}(K)$.
The intersection points $x \in T_\alpha \cap T_\beta$ admit a combinatorial description. To this end let $B$ denote the unique bounded region that contains $e$ in its boundary. Each double point $v_i$ of $C$ is contained in four (not necessarily distinct) regions. Let us recall from [23] that a Kauffman state (for the projection and the distinguished edge $e$) is a map that associates to each double point $v_i$ one of the four in-coming quadrants (corners) in such a way that each component in $S^2 - C - A - B$ gets exactly one corner. Let us write a Kauffman state as $(c_1, \dots, c_n)$, where $c_i$ is a corner for $v_i$.
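The definition of a Kauffman state is a matching problem: the $n$ double points must be matched with the $n$ regions other than $A$ and $B$, each double point choosing a region in which it has a corner. A minimal Python sketch of this enumeration follows. The encoding is an assumption made for illustration: each double point is given by the set of allowed regions touching it, which presumes that the four corners at a double point lie in four distinct regions. The sample data describes the standard three-crossing projection of the trefoil, with central region `C` and lobes `L1`, `L2` (the third lobe plays the role of $B$); the labels are hypothetical.

```python
from itertools import permutations

def kauffman_states(corners, regions):
    """Enumerate Kauffman states.

    corners[v] is the set of regions (other than A and B) having a corner
    at the double point v; regions lists all regions other than A and B.
    A state assigns to every double point one of its corners so that each
    region is used exactly once.
    """
    crossings = sorted(corners)
    states = []
    for assignment in permutations(regions, len(crossings)):
        if all(r in corners[v] for v, r in zip(crossings, assignment)):
            states.append(dict(zip(crossings, assignment)))
    return states

# Standard trefoil projection (hypothetical labels, see the text above).
trefoil_corners = {
    "v1": {"C", "L1", "L2"},
    "v2": {"C", "L2"},
    "v3": {"C", "L1"},
}
states = kauffman_states(trefoil_corners, ["C", "L1", "L2"])
for s in states:
    print(s)
```

For this projection the sketch finds three states, one for each choice of the double point that uses its corner in the central region.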
In order to see the relation between $T_\alpha \cap T_\beta$ and Kauffman states, note that in a neighborhood of each $v_i$ there are at most four intersection points of $\beta_i$ with circles corresponding to the four regions which contain $v_i$, see Figure 1. Clearly these intersection points are in one-to-one correspondence with the corners. (There are fewer than four intersection points with $\beta_i$ if some of the corners are in the unbounded region $A$.) Moreover, there is a unique $\alpha$ curve that intersects $\beta_{n+1}$, and this corresponds to region $B$. Let us denote this curve by $\alpha_{n+1}$. Since $\alpha_{n+1}$ and $\beta_{n+1}$ intersect in a unique point and $\beta_{n+1}$ is disjoint from $\alpha_i$ for $1 \leq i \leq n$, it follows that $T_\alpha \cap T_\beta$ agrees with $(\alpha_1 \times \cdots \times \alpha_n) \cap (\beta_1 \times \cdots \times \beta_n)$ in $\mathrm{Sym}^n(\Sigma)$. In this way to any $x \in T_\alpha \cap T_\beta$ we associate $n$ corners that use all the double points $v_1, \dots, v_n$ and all the regions in $S^2 - C$ except for $A$ and $B$. This gives a one-to-one correspondence between our generators $x \in T_\alpha \cap T_\beta$ and Kauffman states.
We can also describe $F(x)$ and $G(x)$ in terms of the knot projection. Both of these functions will be given as a state sum over the corners of the corresponding Kauffman state. For a given corner $c$ we define two functions $a(c)$ and $b(c)$ by Figure 2 and Figure 3 respectively. The following result is proved in [35].

[Figure 1: a crossing with its four adjacent regions $r_1, \dots, r_4$ (left) and the corresponding piece of the Heegaard surface, carrying the curve $\beta$ and the arcs $\alpha_1, \dots, \alpha_4$ (right).]
Figure 1. Special Heegaard diagram for knot crossings. At each crossing as pictured on the left, we construct a piece of the Heegaard surface on the right (which is topologically a four-punctured sphere). The curve $\beta$ is the one corresponding to the crossing on the left; the four arcs $\alpha_1, \dots, \alpha_4$ will close up.

[Figure 2: the two kinds of crossings, each with its four corners labelled by the values $0$, $0$, $1/2$ and $-1/2$.]
Figure 2. The definition of $a(c_i)$ for both kinds of crossings.

Proposition 5.1. Fix an oriented knot projection for $K$ together with a distinguished edge. Let us fix a two-pointed Heegaard diagram for $K$ as above. For $x \in T_\alpha \cap T_\beta$ let $(c_1, \dots, c_n)$ be the corresponding Kauffman state. Then we have
$$F(x) = \sum_{i=1}^{n} a(c_i), \qquad G(x) = \sum_{i=1}^{n} b(c_i).$$
It is clear from the above formulas that if $K$ has an alternating projection, then $F(x) - G(x)$ is independent of the choice of state $x$. It follows that if we use the chain complex associated to this Heegaard diagram, then there are no differentials in the knot Floer homology, and its rank is determined by its Euler characteristic. By calculating the constant, we get the following result, proved in Theorem 1.3 of [35]:
[Figure 3: the two kinds of crossings; at one kind the four corners are labelled $-1$, $0$, $0$, $0$, at the other $1$, $0$, $0$, $0$.]
Figure 3. Definition of $b(c_i)$.

Theorem 5.2. Let $K \subset S^3$ be an alternating knot in the three-sphere, write its symmetrized Alexander polynomial as
$$\Delta_K(T) = \sum_{i=-n}^{n} a_i T^i,$$
and let $\sigma(K)$ denote its signature. Then $H_{i,j}(K) = 0$ for $j \neq i + \frac{\sigma(K)}{2}$, and
$$H_{i,\, i+\sigma(K)/2} \cong \mathbb{Z}^{|a_i|}.$$
Thus, for alternating knots, this choice of Heegaard diagram is remarkably successful. For a general knot, Proposition 5.1 gives a combinatorial definition of the Abelian groups $C_{i,j}(K)$. However, computing the boundary map $\partial_K$ involves counting holomorphic disks, and at present we do not have a combinatorial description of $\partial_K$ in terms of Kauffman states. Luckily, $\partial_K$ respects certain additional filtrations which can be described in terms of states, and this property, together with some additional tricks, can be used to give calculations of knot Floer homology groups in certain cases, cf. [38], [6]. As a particular example, these filtrations are used in [38] to show that the knot Floer homology of the eleven-crossing Kinoshita-Terasaka knot (a knot whose Alexander polynomial is trivial) differs from that of its Conway mutant.
In a different direction, some knots admit Heegaard diagrams on a genus one surface. For these knots, calculation of the differentials becomes a purely combinatorial matter, cf. Section 6 of [37] and also [40], [41], [17].
Sometimes, it is more convenient to use more abstract methods to calculate knot Floer homology. In particular, there is a relationship between knot Floer homology and the Heegaard Floer homology of three-manifolds obtained by surgery along $K$, cf. [37], [41]. With the help of this relationship, we obtain the following structure for the knot Floer homology of a knot for which some positive surgery is a lens space (proved in Theorem 1.2 of [34]):
Theorem 5.3. Suppose that $K \subset S^3$ is a knot for which there is a positive integer $p$ for which $S^3_p(K)$ is a lens space. Then, there is an increasing sequence of integers $n_{-m} < \cdots < n_m$
with the property that $n_s = -n_{-s}$, with the following significance. For $-m \leq s \leq m$ we let
$$\delta_s = \begin{cases} 0 & \text{if } s = m \\ \delta_{s+1} - 2(n_{s+1} - n_s) + 1 & \text{if } m - s \text{ is odd} \\ \delta_{s+1} - 1 & \text{if } m - s > 0 \text{ is even.} \end{cases}$$
Then for each $s$ with $|s| \leq m$ we have $H_{n_s, \delta_s}(K) = \mathbb{Z}$. Furthermore $H_{i,j}(K) = 0$ for all other values of $i, j$.
For example, all (right-handed) torus knots satisfy the hypothesis of this theorem. (Recall that if $T_{p,q}$ denotes the right-handed $(p, q)$ torus knot, then $S^3_{pq \pm 1}(T_{p,q})$ is a lens space.) The above theorem can be fruitfully thought of from three perspectives: as a source of examples of knot Floer homology calculations (for example, a calculation of the knot Floer homology of torus knots), as a restriction on knots which admit lens space surgery (for example, all the coefficients of the Alexander polynomial of such a knot satisfy $|a_i| \leq 1$), or as a restriction on lens spaces which can arise as surgeries on knots in $S^3$, cf. [34].

5.1. Knot Floer homology and the Seifert genus.
A knot $K \subset S^3$ can be realized as the boundary of an embedded, orientable surface in $S^3$. Such a surface is called a Seifert surface for $K$, and the minimal genus of any Seifert surface for $K$ is called its Seifert genus, denoted $g(K)$. Of course, a knot has $g(K) = 0$ if and only if it is the unknot.
The knot Floer homology of $K$ detects the Seifert genus, according to the following result proved in [36]. (This result is a natural analogue of a theorem of Kronheimer and Mrowka in Seiberg-Witten theory, see [24], [26].) To state it, we first define the degree of the knot Floer homology to be the integer
$$\deg H(K) = \max\{\, i \in \mathbb{Z} \mid \oplus_j H_{i,j}(K) \neq 0 \,\}.$$
Theorem 5.4. For any knot $K \subset S^3$, $g(K) = \deg H(K)$.
Given a Seifert surface of genus $g$ for $K$, one can construct a Heegaard diagram for which all the points in $T_\alpha \cap T_\beta$ have filtration level $\leq g$. This gives at once the bound $\deg H(K) \leq g(K)$ (this result is analogous to a classical bound on the genus of a knot in terms of the degree of its Alexander polynomial). The inequality in the other direction is much more subtle. First, one relates the degree of the knot Floer homology to a similar quantity defined using the Floer homology of the zero-surgery $S^3_0(K)$.
Next, one appeals to a theorem of Gabai [16], according to which if $K$ is a knot with Seifert genus $g > 0$, then $S^3_0(K)$ admits a taut foliation $\mathcal{F}$ whose first Chern class is $g - 1$ times a generator for $H^2(S^3_0(K); \mathbb{Z})$. The taut foliation naturally induces a symplectic structure on $[-1, 1] \times S^3_0(K)$, according to a result of Eliashberg and Thurston [8], which, according to a recent result of Eliashberg [7], [9], can be embedded in a closed symplectic four-manifold $X$ (indeed, one can arrange for $S^3_0(K)$ to divide the four-manifold $X$ into two pieces with $b_2^+(X_i) > 0$). The non-vanishing of the four-manifold invariant $\Phi_{X,k}$ for a symplectic four-manifold can then be used to prove that the Heegaard Floer homology of $S^3_0(K)$ is non-trivial in the $\mathrm{Spin}^c$ structure obtained by restricting the canonical $\mathrm{Spin}^c$ structure $k$ of the ambient symplectic four-manifold, i.e., this is the $\mathrm{Spin}^c$ structure belonging to the foliation $\mathcal{F}$. The details of this argument are given in [36].
As an obvious application of Theorem 5.4 we have the following.
Corollary 5.5. The knot Floer homology $H_{i,j}(K)$ distinguishes every non-trivial knot from the unknot.
Theorem 5.4 has applications in a different direction as well. By applying a relationship between the knot Floer homology of $K$ and $\widehat{HF}$ of three-manifolds obtained by surgery along $K$, we get the following (see [36], [33]):
Theorem 5.6. For a knot $K$ in $S^3$ and a rational number $r$ let $S^3_r(K)$ denote the result of surgery along $K$ with slope $r$. Let $U$ denote the unknot. If for a given $K$ there exists an $r$ so that $\widehat{HF}(S^3_r(K))$ is isomorphic to $\widehat{HF}(S^3_r(U))$ as a graded Abelian group, then $U = K$.
Clearly Theorem 5.6 implies the following conjecture of Gordon [18] (which was proved for $r = 0$ by Gabai [16], for $r$ non-integral by Culler, Gordon, Luecke and Shalen [2], for $r = \pm 1$ by Gordon and Luecke [19], [20], and finally settled in a joint work with Kronheimer and Mrowka using Seiberg-Witten theory [27]):
Conjecture 5.7. Let $K$ be a knot in $S^3$ and let $U$ denote the unknot. If for a given $K$ there exists a rational number $r$ so that $S^3_r(K)$ is orientation preservingly diffeomorphic to $S^3_r(U)$, then $U = K$.

References
[1] S. Akbulut and J.D. McCarthy. Casson's invariant for oriented homology 3-spheres, volume 36 of Mathematical Notes. Princeton University Press, Princeton, NJ, 1990. An exposition.
[2] M. Culler, C. McA. Gordon, J. Luecke, and P.B. Shalen. Dehn surgery on knots. Ann. of Math., 125(2):237–300, 1987.
[3] S.K. Donaldson. Polynomial invariants for smooth four-manifolds. Topology, 29(3):257–315, 1990.
[4] S.K. Donaldson. Floer homology groups in Yang-Mills theory, volume 147 of Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge, 2002. With the assistance of M. Furuta and D. Kotschick.
[5] S.K. Donaldson and P.B. Kronheimer. The Geometry of Four-Manifolds. Oxford Mathematical Monographs. Oxford University Press, 1990.
[6] E. Eftekhary. Knot Floer homologies for pretzel knots. math.GT/0311419.
[7] Y.M. Eliashberg. Few remarks about symplectic filling. Geom. Topol., 8:277–293, 2004.
[8] Y.M. Eliashberg and W.P. Thurston. Confoliations. Number 13 in University Lecture Series. American Mathematical Society, 1998.
[9] J.B. Etnyre. On symplectic fillings. Algebr. Geom. Topol., 4:73–78, 2004.
[10] R. Fintushel and R.J. Stern. Knots, links, and 4-manifolds. Invent. Math., 134(2):363–400, 1998.
[11] A. Floer. An instanton-invariant for 3-manifolds. Comm. Math. Phys., 119:215–240, 1988.
[12] A. Floer. Morse theory for Lagrangian intersections. J. Differential Geometry, 28:513–547, 1988.
[13] A. Floer. The unregularized gradient flow of the symplectic action. Comm. Pure Appl. Math., 41(6):775–813, 1988.
[14] A. Floer, H. Hofer, and D. Salamon. Transversality in elliptic Morse theory for the symplectic action. Duke Math. J., 80(1):251–29, 1995.
[15] K. Fukaya, Y-G. Oh, K. Ono, and H. Ohta. Lagrangian intersection Floer theory – anomaly and obstruction. Kyoto University, 2000.
[16] D. Gabai. Foliations and the topology of 3-manifolds III. J. Differential Geom., 26(3):479–536, 1987.
[17] H. Goda, H. Matsuda, and T. Morifuji. Knot Floer homology of (1, 1)-knots. math.GT/0311084.
[18] C. McA. Gordon. Dehn surgery on knots. In Proceedings of the International Congress of Mathematicians Vol. I (Kyoto, 1990), pages 631–642. Springer-Verlag, 1991.
[19] C. McA. Gordon and J. Luecke. Knots are determined by their complements. J. Amer. Math. Soc., 2(2):371–415, 1989.
[20] C. McA. Gordon and J. Luecke. Knots are determined by their complements. Bull. Amer. Math. Soc. (N.S.), 20(1):83–87, 1989.
[21] M. Gromov. Pseudo holomorphic curves in symplectic manifolds. Inventiones Mathematicae, 82:307–347, 1985.
[22] E. Ionel and T.H. Parker. Relative Gromov-Witten invariants. Ann. of Math. (2), 157(1):45–96, 2003.
[23] L.H. Kauffman. Formal knot theory. Number 30 in Mathematical Notes. Princeton University Press, 1983.
[24] P.B. Kronheimer. Embedded surfaces and gauge theory in three and four dimensions. In Surveys in differential geometry, Vol. III (Cambridge, MA, 1996), pages 243–298. Int. Press, Boston, MA, 1998.
[25] P.B. Kronheimer and T.S. Mrowka. Floer homology for Seiberg-Witten Monopoles. In preparation.
[26] P.B. Kronheimer and T.S. Mrowka. Monopoles and contact structures. Invent. Math., 130(2):209–255, 1997.
[27] P.B. Kronheimer, T.S. Mrowka, P.S. Ozsváth, and Z. Szabó. Monopoles and lens space surgeries. math.GT/0310164.
[28] A-M. Li and Y. Ruan. Symplectic surgery and Gromov-Witten invariants of Calabi-Yau 3-folds. Invent. Math., 145(1):151–218, 2001.
[29] G. Meng and C.H. Taubes. SW=Milnor torsion. Math. Research Letters, 3:661–674, 1996.
[30] J.W. Morgan. The Seiberg-Witten Equations and Applications to the Topology of Smooth Four-Manifold. Number 44 in Mathematical Notes. Princeton University Press, 1996.
[31] P.S. Ozsváth and Z. Szabó. Holomorphic disks and topological invariants for closed three-manifolds. To appear in Annals of Math. math.SG/0101206.
[32] P.S. Ozsváth and Z. Szabó. Holomorphic triangles and invariants for smooth four-manifolds. math.SG/0110169.
[33] P.S. Ozsváth and Z. Szabó. Knot Floer homology and rational surgery. In preparation.
[34] P.S. Ozsváth and Z. Szabó. On knot Floer homology and lens space surgeries. math.GT/0303017.
[35] P.S. Ozsváth and Z. Szabó. Heegaard Floer homology and alternating knots. Geom. Topol., 7:225–254, 2003.
[36] P.S. Ozsváth and Z. Szabó. Holomorphic disks and genus bounds. Geom. Topol., 8:311–334, 2004.
[37] P.S. Ozsváth and Z. Szabó. Holomorphic disks and knot invariants. Adv. Math., 186(1):58–116, 2004.
[38] P.S. Ozsváth and Z. Szabó. Knot Floer homology, genus bounds, and mutation. Topology Appl., 141(1-3):59–85, 2004.
[39] T.H. Parker and J.G. Wolfson. Pseudo-holomorphic maps and bubble trees. J. Geom. Anal., 3(1):63–98, 1993.
[40] J.A. Rasmussen. Floer homology of surgeries on two-bridge knots. Algebr. Geom. Topol., 2:757–789, 2002.
[41] J.A. Rasmussen. Floer homology and knot complements. PhD thesis, Harvard University, 2003. math.GT/0306378.
[42] J. Robbin and D. Salamon. The Maslov index for paths. Topology, 32(4):827–844, 1993.
[43] J. Singer. Three-dimensional manifolds and their Heegaard diagrams. Trans. Amer. Math. Soc., 35(1):88–111, 1933.
[44] E. Witten. Monopoles and four-manifolds. Math. Research Letters, 1:769–796, 1994.
[45] R. Ye. Gromov's compactness theorem for pseudo holomorphic curves. Trans. Amer. Math. Soc., 342(2):671–694, 1994.
Peter Ozsváth, Department of Mathematics, Columbia University, New York 10025, USA. e-mail: [email protected]
Zoltán Szabó, Department of Mathematics, Princeton University, New Jersey 08544, USA. e-mail: [email protected]
Emergence of Symmetry: Conformal Invariance in Scaling Limits of Random Systems
Oded Schramm
Many random systems in two dimensions exhibit approximate conformal invariance at large scales. In fact, conformal invariance may be present when one takes the scaling limit of the process. A scaling limit for a process defined on a grid is obtained by letting the mesh of the grid tend to zero.
In physics, conformal invariance is usually discussed indirectly as some invariance or covariance of correlation functions. This naturally leads to an algebraic setting, and representation theory often plays a significant role. It is usually necessary to introduce some unproven but rather plausible assumptions in order to make this type of analysis work. But these theories have been very successful in making some precise predictions about the long-range behavior of the discrete systems. On the other hand, within probability, conformal invariance usually means the invariance under conformal maps of probabilities of events. The distinction from the physics point of view might seem rather minor, but leads to different types of theories.
The recent progress in the mathematical understanding of these processes is twofold: on the one hand the consequences of conformal invariance are better understood, and on the other hand the number of fundamental processes for which conformal invariance has been mathematically established has very significantly increased.
We now discuss informally the Stochastic Loewner evolution processes and their relevance to random scaling limits in two dimensions. Suppose that we consider some random process in the plane, such as critical percolation. We may choose to focus on a particular large percolation cluster and look at its boundary. Suppose that we fix a piece α of the boundary curve and consider the conditioned distribution of the rest of the curve, α′. We may conformally map the complement of α to the upper half plane H and consider the image of α′ under this map φ. This image will be a random curve in the upper half plane.
It is not hard to see that conditioned on α the curve α′ is obtained as an interface of a percolation cluster in the complement of α, which has appropriate boundary conditions (“open” and “closed”) on the two sides of α. If we assume (for now) conformal invariance, then conditioned on α (and φ) the distribution of φ(α′) is approximately the distribution of an interface in the upper half plane of a percolation cluster with corresponding boundary conditions on the two arcs corresponding under φ to the two sides of α. In fact, we will conveniently
assume that these two arcs are the positive and negative parts of the real axis. This is achieved by an appropriate normalization of φ. (Initially, the collection of all possible φ is 3-dimensional, over the reals.) We may now invoke Loewner’s theorem to get a conformally natural description of the path γ = φ(α′). There is a parameterization of the curve γ such that for every t ≥ 0 the conformal map $g_t : H \setminus \gamma[0,t] \to H$ normalized by
$$g_t(z) = z + o(1), \qquad z \to \infty, \tag{0.1}$$
satisfies
$$\partial_t g_t(z) = \frac{2}{g_t(z) - W_t}, \tag{0.2}$$
where $W_t = g_t(\gamma(t))$. The equation (0.2) is a variant of Loewner’s original equation that is appropriate for H and the normalization (0.1).
If we had started out by conditioning on some larger segment $\hat\alpha \supset \alpha$ instead of α, where $\hat\alpha = \alpha \cup \phi^{-1}(\gamma[0,t])$, then we would obtain the same description for the resulting α′. (This assumes conformal invariance, of course.) If we work out the consequence of this fact for the Loewner driving parameter $W_t$, we find that conditioned on $(W_s : s < t)$ the process $(W_{s+t} - W_t : s > 0)$ has the same distribution as the process $(W_s : s > 0)$. When we take into account that $W_t$ must be continuous (which can be justified) and that the distribution of $W_t$ is the same as that of $-W_t$, it follows that $W_t = B(\kappa t)$ where κ ≥ 0 is some constant and B is one-dimensional Brownian motion. In this way, questions about the poorly understood path α′ in two dimensions are reduced to one-dimensional Brownian motion, which is rather well understood.
The process obtained by solving (0.2) where $W_t = B(\kappa t)$ is called Stochastic Loewner evolution with parameter κ, or just SLE(κ). Using Cardy’s formula (or just some obvious symmetry properties of percolation), one can see that κ = 6 is the only possible choice for the scaling limit of percolation.
The above discussion relied on the assumption of conformal invariance. (Actually, to make this rigorous, some tameness properties for critical percolation are also needed, but these are available in the literature.) Fortunately, Stanislav Smirnov has recently established the conformal invariance of critical percolation, and so we may conclude that SLE(6) does describe boundaries of percolation clusters. Smirnov’s proof is motivated by Carleson’s version of Cardy’s formula.
Subsequently, there have been other processes for which conformal invariance has been established. For these, the SLE point of view was essential.
These processes are the loop-erased random walk (which was the process motivating the introduction of SLE), the uniform spanning tree and its Peano path, the harmonic explorer and the level sets of the Gaussian free field.
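The passage from the driving process to the conformal maps can be tried out numerically. The following is only a minimal sketch under stated assumptions: it samples $W_t = B(\kappa t)$ on a uniform time grid and integrates equation (0.2) for a single starting point z with a plain forward-Euler scheme. The function names `brownian_driver` and `loewner_flow`, the step counts and the absorption threshold are illustrative choices, not part of the original text, and forward Euler is a crude discretization that is not meant for accurate simulation of the curve itself.

```python
import cmath
import math
import random


def brownian_driver(kappa, t_max, n_steps, seed=0):
    """Sample W_t = B(kappa * t) at the times k * t_max / n_steps, k = 0..n_steps."""
    rng = random.Random(seed)
    dt = t_max / n_steps
    sigma = math.sqrt(kappa * dt)  # standard deviation of one increment
    values = [0.0]
    for _ in range(n_steps):
        values.append(values[-1] + rng.gauss(0.0, sigma))
    return values


def loewner_flow(z, driver, t_max):
    """Forward-Euler approximation of g_{t_max}(z) for d/dt g = 2 / (g - W_t).

    Returns None if the point comes numerically too close to the driving
    value, which is used here as a rough sign that it was absorbed.
    """
    n_steps = len(driver) - 1
    dt = t_max / n_steps
    g = complex(z)
    for k in range(n_steps):
        gap = g - driver[k]
        if abs(gap) < 1e-9:  # illustrative threshold, an assumption
            return None
        g += dt * 2.0 / gap
    return g


# Sanity check for kappa = 0: the driver is identically 0 and the
# equation has the explicit solution g_t(z) = sqrt(z^2 + 4t).
flat = brownian_driver(0.0, 1.0, 4000)
z0 = 1 + 1j
print(loewner_flow(z0, flat, 1.0), cmath.sqrt(z0 * z0 + 4.0))

# One sample with the percolation value kappa = 6.
print(loewner_flow(z0, brownian_driver(6.0, 1.0, 4000, seed=1), 1.0))
```

For κ = 0 the two printed values should agree to a few decimal places, which checks the discretization against the explicit solution before any randomness is introduced.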
The bibliography below is only a small sample of the relevant literature. More references can be found in the surveys [LPSA94, Wer01, Law01, SW01, Car03, Wer04, GK04, KN04, Law05].
References
[Car92] John L. Cardy. Critical percolation in finite geometries. J. Phys. A, 25(4):L201–L206, 1992.
[Car03] John Cardy. Conformal invariance in percolation, self-avoiding walks, and related problems. Ann. Henri Poincaré, 4(suppl. 1):S371–S384, 2003, arXiv:cond-mat/0209638.
[GK04] Ilya A. Gruzberg and Leo P. Kadanoff. The Loewner equation: maps and shapes. J. Statist. Phys., 114(5-6):1183–1198, 2004, arXiv:cond-mat/0309292.
[Ken00] Richard Kenyon. The asymptotic determinant of the discrete Laplacian. Acta Math., 185(2):239–286, 2000.
[Kes87] Harry Kesten. Scaling relations for 2D-percolation. Comm. Math. Phys., 109(1):109–156, 1987.
[KN04] Wouter Kager and Bernard Nienhuis. A guide to stochastic Löwner evolution and its applications. J. Statist. Phys., 115(5-6):1149–1229, 2004, arXiv:math-ph/0312056.
[Law01] G.F. Lawler. An introduction to the stochastic Loewner evolution, 2001. Preprint.
[Law05] Gregory F. Lawler. Conformally invariant processes in the plane. Amer. Math. Soc., 2005. To appear.
[Loe23] K. Löwner (C. Loewner). Untersuchungen über schlichte konforme Abbildungen des Einheitskreises, I. Math. Ann., 89:103–121, 1923.
[LPSA94] Robert Langlands, Philippe Pouliot, and Yvan Saint-Aubin. Conformal invariance in two-dimensional percolation. Bull. Amer. Math. Soc. (N.S.), 30(1):1–61, 1994.
[LSW01] Gregory F. Lawler, Oded Schramm, and Wendelin Werner. Values of Brownian intersection exponents. I. Half-plane exponents. II. Plane exponents. Acta Math., 187(2):237–273 & 275–308, 2001, arXiv:math.PR/9911084 & arXiv:math.PR/0003156.
[LSW04] Gregory F. Lawler, Oded Schramm, and Wendelin Werner. Conformal invariance of planar loop-erased random walks and uniform spanning trees. Ann. Probab., 32(1B):939–995, 2004, arXiv:math.PR/0112234.
[RS05] Steffen Rohde and Oded Schramm. Basic properties of SLE. Ann. Math. (to appear), 2005?, arXiv:math.PR/0106036.
[Rus78] Lucio Russo. A note on percolation. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 43(1):39–48, 1978.
[Sch00] Oded Schramm. Scaling limits of loop-erased random walks and uniform spanning trees. Israel J. Math., 118:221–288, 2000.
[Smi01] Stanislav Smirnov. Critical percolation in the plane: conformal invariance, Cardy’s formula, scaling limits. C. R. Acad. Sci. Paris Sér. I Math., 333(3):239–244, 2001.
[SS05] Oded Schramm and Scott Sheffield. The harmonic explorer and its convergence to SLE(4). Ann. Probab. (to appear), 2005?, arXiv:math.PR/0310210.
[SS] Oded Schramm and Scott Sheffield. Contour lines of the 2D Gaussian free field. In preparation.
[SW78] P.D. Seymour and D.J.A. Welsh. Percolation probabilities on the square lattice. Ann. Discrete Math., 3:227–245, 1978. Advances in graph theory (Cambridge Combinatorial Conf., Trinity College, Cambridge, 1977).
[SW01] Stanislav Smirnov and Wendelin Werner. Critical exponents for two-dimensional percolation. Math. Res. Lett., 8(5-6):729–744, 2001, arXiv:math.PR/0109120.
[Wer01] Wendelin Werner. Critical exponents, conformal invariance and planar Brownian motion. In European Congress of Mathematics, Vol. II (Barcelona, 2000), volume 202 of Progr. Math., pages 87–103. Birkhäuser, Basel, 2001, arXiv:math.PR/0007042.
[Wer04] Wendelin Werner. Random planar curves and Schramm-Loewner evolutions. In Lectures on probability theory and statistics, volume 1840 of Lecture Notes in Math., pages 107–195. Springer, Berlin, 2004, arXiv:math.PR/0303354.
Oded Schramm URL: http://research.microsoft.com/~schramm
Recent Progresses in Kähler and Complex Algebraic Geometry
Claire Voisin
0. Introduction
On a complex vector space V, a Hermitian bilinear form h is decomposed into real and imaginary parts as h = g − iω, where g is a symmetric real bilinear form and ω is a real 2-form which is of type (1, 1) for the complex structure on V. Here the notion of (complex valued) form of type (p, q) on V is the following: the space $V^* \otimes \mathbb{C}$ of complex valued forms on V splits as a direct sum $V^{*1,0} \oplus V^{*0,1}$, where $V^{*1,0}$ is the space of $\mathbb{C}$-linear forms and $V^{*0,1}$ is its complex conjugate. Then the forms of type (p, q) are generated by the $\alpha_1 \wedge \dots \wedge \alpha_p \wedge \beta_1 \wedge \dots \wedge \beta_q$, where $\alpha_i \in V^{*1,0}$ and $\beta_j \in V^{*0,1}$. The correspondence $h \mapsto \omega$ is a bijection between Hermitian bilinear forms and real forms of type (1, 1) on V. Thus the notion of (semi)-positivity for Hermitian bilinear forms provides a corresponding notion of (semi)-positivity for real forms of type (1, 1). Note that when h is positive definite, ω is nondegenerate, i.e., $\omega^n \neq 0$, $n = \mathrm{rk}_{\mathbb{C}} V$.
On a complex manifold X, the tangent space $T_{X,x}$ at any point is endowed with a complex structure, and the above correspondence induces a bijective correspondence between Hermitian bilinear forms on $T_X$ and real 2-forms of type (1, 1) on X, that is, of type (1, 1) on $T_{X,x}$ for any x ∈ X. In particular, if h is a Hermitian metric on $T_X$, one can write h = g − iω, where g is a Riemannian metric (compatible with the complex structure), and ω is a positive real (1, 1)-form.
Definition 0.1. The metric h is said to be Kähler if furthermore the 2-form ω is closed.
Since ω is non-degenerate, it will provide in particular a symplectic structure on the Kähler manifold X, thus putting Kähler geometry at the intersection of symplectic geometry and complex geometry. The work of Gromov [20] made the relation between symplectic and Kähler geometry stronger: a
symplectic manifold (X, ω) can be endowed with a compatible almost complex structure (i.e., a complex structure on each tangent space $T_{X,x}$, varying in a smooth way), which is well defined up to deformations. Here “compatible” means that $\omega_x$ has to be of type (1, 1) and positive on each $T_{X,x}$ for the given complex structure.
Assuming X is compact, the main differential-topological consequence of the Kähler assumption is the Hodge decomposition theorem.
Theorem 0.2. If X is compact Kähler, the de Rham cohomology space $H^k(X, \mathbb{C})$ := {closed complex valued k-forms}/{exact ones} splits as
$$H^k(X, \mathbb{C}) = \bigoplus_{p+q=k} H^{p,q}(X), \tag{0.1}$$
where $H^{p,q}(X)$ is the space of classes admitting a representative which is a closed form of type (p, q) (that is, of type (p, q) at any point). Note that by the definition of $H^{p,q}(X)$, one has $H^{p,q}(X) = \overline{H^{q,p}(X)}$, a property which is called Hodge symmetry. The data of the decomposition (0.1), together with the rational (integral) structure of $H^k(X, \mathbb{C})$, that is the isomorphism $H^k(X, \mathbb{C}) = H^k(X, \mathbb{Q}) \otimes \mathbb{C}$ ($H^k(X, \mathbb{C}) = H^k(X, \mathbb{Z}) \otimes \mathbb{C}$), is exactly what is called a rational (integral) Hodge structure of weight k (see [19], [10], [36] I, 7.1.1). A deeper consequence of Hodge theory is the formality theorem [11], which says that the rational homotopy type of a compact Kähler manifold is determined by its rational cohomology ring, but we won’t use it here.
Let us now turn to complex projective manifolds. They are defined as the complex submanifolds of $\mathbb{P}^N(\mathbb{C})$. A classical theorem due to Chow, later generalized by Serre [29], says that they are also the smooth algebraic subvarieties of projective space. The projective space is Kähler, and by restriction of any Kähler metric, it follows that complex projective manifolds are Kähler. The projective space also carries a holomorphic line bundle L = O(1), whose holomorphic sections identify with linear forms on $\mathbb{C}^{N+1}$. Namely, L is defined as the dual of the tautological sub-line bundle whose fiber at $u \in \mathbb{P}^N(\mathbb{C})$ is the line generated by u in $\mathbb{C}^{N+1}$. If $X \subset \mathbb{P}^N(\mathbb{C})$ is a complex submanifold, the induced holomorphic line bundle admits as holomorphic sections the restrictions $\sigma_0, \dots, \sigma_N$ of the coordinate linear forms on $\mathbb{C}^{N+1}$, and these sections have the property that for any point x ∈ X, at least one of these does not vanish on the fiber $L_x$, and that the map $x \mapsto (\sigma_0(x), \dots, \sigma_N(x))$,
(a non-zero (N + 1)-tuple which is well defined up to a multiplicative coefficient, according to the trivialization of $L_x$ chosen), is holomorphic and provides the initial holomorphic embedding to $\mathbb{P}^N(\mathbb{C})$.
Definition 0.3. A line bundle on a compact complex manifold is said to be very ample if its holomorphic sections provide as above an embedding to projective space. It is said to be ample if some power $L^{\otimes k}$ is very ample.
Since the pioneering work of Kodaira [23], line bundles in complex projective geometry can be considered to have as Kähler analogues real (1, 1)-classes in compact Kähler geometry. We survey in this paper classical and recent results which underline both the similarities and the differences between Kähler and complex projective geometries. The first section is devoted to the results by Demailly and his collaborators showing a complete similarity between various notions of positivity for line bundles on projective manifolds and for real closed forms of type (1, 1) on compact Kähler manifolds. Sections 2 and 3 show in contrast strong differences between these geometries. On the analytic side, we show that the Hodge conjecture cannot possibly be extended to compact Kähler manifolds: Hodge classes are not necessarily generated over $\mathbb{Q}$ by Chern classes of coherent sheaves. We also show that coherent sheaves on compact Kähler manifolds do not necessarily admit locally free resolutions, while the existence of locally free resolutions in algebraic geometry plays a key role in the proof of central theorems (see, e.g., [30], [2]). On the topological side, we show that there exist compact Kähler manifolds which do not have the homotopy type of, and a fortiori cannot be deformed to, complex projective manifolds.
It is interesting to note that these differences appear only in higher dimensions. Any compact complex curve is projective (hence Kähler). Any compact Kähler surface has small deformations which are projective (a result due to Kodaira).
The Hodge conjecture is true for degree 2 classes on complex manifolds in the form of the Lefschetz theorem on (1, 1)-classes, and coherent sheaves on compact complex surfaces admit finite locally free resolutions [28].
1. Positivity properties of line bundles and (1, 1)-classes
1.1. Line bundles and their Chern forms and currents. Let L be a holomorphic line bundle on a complex manifold X and h be a Hermitian metric on L. On small open sets U of X, we can choose non-zero holomorphic sections $\sigma_U$ trivializing L. The function $h(\sigma_U)$ is thus positive where defined and we can define the real (1, 1)-form
$$\omega_{L,h,U} := \frac{1}{2i\pi}\,\partial\bar\partial \log h(\sigma_U).$$
It is immediate to see that this form does not depend on the choice of $\sigma_U$ (this follows from the vanishing $\partial\bar\partial \log |g|^2 = 0$ for g an invertible holomorphic
function), so that the $\omega_{L,h,U}$ coincide on the overlaps and we have in fact a globally defined (1, 1)-form $\omega_{L,h}$ called the Chern form of (L, h). If h is changed to $e^u h$, for some real function u, $\omega_{L,h}$ is changed to $\omega_{L,h} + \frac{1}{2i\pi}\partial\bar\partial u$, from which it follows that the forms $\omega_{L,h}$ determine a class c(L) in the space $H^{1,1}_{\partial\bar\partial}(X, \mathbb{R})$, defined as the quotient of the space of d-closed real forms of (1, 1)-type by the space consisting of the forms $i\partial\bar\partial f$, f a real function on X. Note that since the latter space consists of d-exact forms, there is a natural map
$$H^{1,1}_{\partial\bar\partial}(X, \mathbb{R}) \to H^2(X, \mathbb{R}),$$
which is an isomorphism onto the subspace $H^{1,1}_{\mathbb{R}}(X) := H^{1,1}(X) \cap H^2(X, \mathbb{R})$ when X is Kähler, by the $\partial\bar\partial$-lemma (cf. [36] I, 6.1.3). The image of c(L) under this map is the real Chern class $c_1(L)$, which is a topological invariant of L.
As we shall see in the next sections, analysts make deep use of a singular version of this construction. Namely, introduce singular metrics on L, which are locally of the form $h_{sing} = e^{\phi} h$, where h is a smooth metric, and φ is an integrable function. Then one can define locally the closed current $T_{L,h_{sing}}$ by the formula
$$T_{L,h_{sing}} = \omega_{L,h} + \frac{1}{2i\pi}\,\partial\bar\partial \phi.$$
This is a real closed current of type (1, 1), that is a linear form on the space of compactly supported forms of degree 2n − 2 on X, n = dim X, which is real on real forms, and vanishes on forms of type (p, q) ≠ (n − 1, n − 1).
Positivity or semi-positivity of (1, 1)-forms makes sense as explained in the introduction. Similarly, positivity of (1, 1)-currents is defined as follows:
Definition 1.1. A current T is said to be positive if T(α) ≥ 0 for any (n − 1, n − 1)-form α which can be locally written as $\alpha = (i\alpha_1 \wedge \overline{\alpha_1}) \wedge \dots \wedge (i\alpha_{n-1} \wedge \overline{\alpha_{n-1}})$, where the $\alpha_i$’s are of type (1, 0), and more generally, on any combination of such forms with coefficients given by non-negative real functions.
A typical example of a positive (1, 1)-current is the current of integration on an analytic hypersurface of X. There are on the other hand two different notions of positivity for line bundles: that of ampleness (see the introduction), and that of effectivity, where L is said to be effective if there is a non-zero holomorphic section of L on X. This last notion is in fact better behaved if one introduces the notion of pseudo-effectiveness:
Definition 1.2. (see [13]) A line bundle L on X is said to be pseudo-effective if its class c(L) is in the closure of the set of classes c(L′), for L′ effective.
These two notions of positivity are strongly different. Indeed, an effective line bundle may become negative after restriction to the zero locus of one of its sections, hence may be very far from ample. The typical example is
$$L = \mathcal{O}_X(E_p) := \mathcal{I}_{E_p}^{-1}.$$
Here $\tau : X_p \to X$ is the blow-up of a point p ∈ X, and $E_p$ is the exceptional divisor. Its ideal sheaf $\mathcal{I}_{E_p}$ is a holomorphic line bundle, whose inverse admits a canonical section whose 0-divisor is $E_p$. On the other hand $L|_{E_p}$ is negative. It turns out that these two notions correspond respectively to the notions of positivity for (1, 1)-forms and (1, 1)-currents:
Lemma 1.3. If L is ample on X, there exists a Hermitian metric h on L whose Chern form $\omega_{L,h}$ is positive.
This follows from the corresponding statement for projective space. The Fubini-Study Kähler form on projective space is the Chern form of a suitable metric on the line bundle O(1).
Lemma 1.4. If L is pseudo-effective, there exists a singular Hermitian metric $h_{sing}$ on L such that the associated Chern current $T_{L,h_{sing}}$ is positive.
When a multiple of L is effective, let σ be a non-zero section of $L^{\otimes m}$. The metric on L will be defined as $h_m^{1/m}$, where $h_m$ is the singular Hermitian metric on $L^{\otimes m}$ for which $h_m(\sigma) = 1$. The associated current is easily shown to be $\frac{1}{m}\int_D$, where D is the divisor of σ.
The converse statements are central in complex algebraic geometry.
Theorem 1.5. (Kodaira [23]) A line bundle L on a compact complex manifold X is ample if and only if it admits a metric h such that $\omega_{L,h}$ is a positive (1, 1)-form.
Theorem 1.6. (Demailly [13]) A line bundle on a projective complex manifold X is pseudo-effective if and only if it admits a singular Hermitian metric whose associated Chern current is positive.
Kodaira’s theorem has been extended by Siu [31], [32] to the semi-positive case.
Theorem 1.7. Let L be a line bundle on a compact complex manifold X, which admits a Hermitian metric whose Chern form is semi-positive, and satisfies $\int_X c_1(L)^n > 0$, where n = dim X. Then $h^0(X, L^{\otimes m})$ grows like $m^n$ with m and X is Moishezon. (Recall that a Moishezon manifold is a compact complex manifold which is birationally equivalent to a projective manifold.)
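As a sanity check on Lemma 1.3 and on the sign conventions used for the Chern form, the case of the projective line can be worked out by hand. The computation below is a sketch under stated assumptions: homogeneous coordinates $[z_0 : z_1]$, the affine coordinate $z = z_1/z_0 = x + iy$, the section σ of O(1) given by the linear form $z_0$, and the metric h induced by the standard Hermitian metric on $\mathbb{C}^2$ are all choices made here for illustration.

```latex
% Assumptions (not from the text): X = P^1, chart z_0 \neq 0 with z = z_1/z_0 = x + iy,
% \sigma = the linear form z_0, h = metric on O(1) induced by the standard metric on C^2.
\begin{align*}
h(\sigma) &= \frac{|z_0|^2}{|z_0|^2 + |z_1|^2} = \frac{1}{1 + |z|^2},\\
\omega_{L,h} &= \frac{1}{2i\pi}\,\partial\bar\partial \log h(\sigma)
  = \frac{i}{2\pi}\,\partial\bar\partial \log\bigl(1 + |z|^2\bigr)
  = \frac{i}{2\pi}\,\frac{dz \wedge d\bar z}{(1 + |z|^2)^2}
  = \frac{1}{\pi}\,\frac{dx \wedge dy}{(1 + x^2 + y^2)^2},\\
\int_{\mathbb{P}^1} \omega_{L,h} &= \frac{1}{\pi}\cdot 2\pi \int_0^\infty \frac{r\,dr}{(1 + r^2)^2} = 1 .
\end{align*}
```

The form obtained is positive, as Lemma 1.3 predicts, and its integral equals 1, the degree of O(1) on the projective line.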
The assumptions in the above theorems are not of an algebraic nature. The following result, in contrast, gives a purely algebraic criterion for ampleness of line bundles:
Theorem 1.8. (Nakai-Moishezon criterion) A line bundle L on a (complex) projective manifold X is ample if and only if, for any subvariety Y ⊂ X of dimension p, one has
$$\int_Y c_1(L)^p > 0.$$
The proof is by induction on the dimension, using the Riemann-Roch theorem and Serre’s vanishing theorem. Note that, unlike Kodaira’s theorem, one has to assume first that X is projective. One might ask what happens when the inequalities in the Nakai-Moishezon criterion become non-strict. The line bundles which satisfy the conditions
$$\int_Y c_1(L)^p \ge 0, \quad \text{for any } Y \subset X \text{ of dimension } p,$$
are called nef (numerically effective). Applying the Nakai-Moishezon criterion, and assuming X is projective, one sees that their Chern classes lie in the closure of the ample cone generated by the Chern classes of ample line bundles. It is unfortunately not true that we can extend Lemma 1.3 to this case, allowing $\omega_{L,h}$ to be semi-positive (see, e.g., [15] for a counterexample). To conclude, let us mention the Kleiman-Seshadri criterion which says that ampleness can be tested on closed complex curves C ⊂ X only:
Theorem 1.9. Let X be projective and L be a line bundle on X. Then L is ample if and only if its first Chern class $c_1(L)$ belongs to the interior of the subset of $H^{1,1}_{\mathbb{R}}(X)$ defined by the inequalities
$$\int_C \alpha \ge 0, \quad \forall C \subset X.$$
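The numerical criteria above can be tried out on a small example. The sketch below assumes the standard description of the quadric surface $\mathbb{P}^1 \times \mathbb{P}^1$, which is not taken from the text: divisor classes are pairs (a, b) of integers, the two rulings have classes (1, 0) and (0, 1), the intersection pairing is (a, b)·(c, d) = ad + bc, and every curve class is a non-negative combination of the two rulings. The function names are hypothetical.

```python
def intersect(d1, d2):
    """Intersection number of the classes d1 = (a, b) and d2 = (c, d) on P^1 x P^1."""
    a, b = d1
    c, d = d2
    return a * d + b * c


# Classes of the two rulings; they are assumed to generate the cone of curves.
F1 = (1, 0)
F2 = (0, 1)


def is_nef(line_bundle):
    """Non-negative degree on every generator of the cone of curves."""
    return all(intersect(line_bundle, f) >= 0 for f in (F1, F2))


def is_ample(line_bundle):
    """Kleiman-Seshadri style test: strictly positive on the generating curves."""
    return all(intersect(line_bundle, f) > 0 for f in (F1, F2))


def top_self_intersection(line_bundle):
    """The number obtained by taking Y = X, p = 2 in the Nakai-Moishezon criterion."""
    return intersect(line_bundle, line_bundle)


for bundle in [(1, 1), (1, 0), (-1, -1)]:
    print(bundle, top_self_intersection(bundle), is_nef(bundle), is_ample(bundle))
```

The class (−1, −1) has positive top self-intersection but negative degree on both rulings, which illustrates why the Nakai-Moishezon criterion has to be tested on subvarieties of every dimension. The class (1, 0) is nef without being ample.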
1.2. Cones of curves, divisors and (1, 1)-classes. Let us now assume for simplicity that X is Kähler. For a class α in $H^{1,1}_{\mathbb{R}}(X)$, we want to define various notions of positivity, extending the ones introduced in the context of line bundles. One important point is that there might be no proper closed analytic subset of positive dimension in X, so that positivity cannot a priori be tested by integration over analytic subsets.
Definition 1.10. A class $\alpha \in H^{1,1}_{\mathbb{R}}(X)$ is Kähler if it can be represented (in de Rham cohomology) by a Kähler form.
Definition 1.11. A class $\alpha \in H^{1,1}_{\mathbb{R}}(X)$ is pseudo-effective if it can be represented by a real closed positive current of type (1, 1).
Definition 1.12. A class $\alpha \in H^{1,1}_{\mathbb{R}}(X)$ is numerically effective if for any ε > 0, it can be represented by a closed real (1, 1)-form $\tilde\alpha_\epsilon$ such that $\tilde\alpha_\epsilon + \epsilon\omega \ge 0$ as a real (1, 1)-form.
Here ω is a given Kähler form. As the example mentioned in the previous section shows, this does not imply that α can be represented by a semi-positive (1, 1)-form.
Definition 1.13. A Kähler current T is a real current of type (1, 1) such that for some ε > 0, T − εω > 0 as a (1, 1)-current.
The set of Kähler classes is an open cone in $H^{1,1}_{\mathbb{R}}(X)$, called the Kähler cone. The set of pseudo-effective classes is a closed cone, called the pseudo-effective cone, which obviously contains the Kähler cone. It is immediate from the definitions that the closure of the Kähler cone is the numerically effective cone consisting of numerically effective classes, and that the set of classes of Kähler currents is the interior of the pseudo-effective cone.
If X is complex projective, it is natural to restrict these definitions to the $\mathbb{Q}$ or $\mathbb{R}$-vector space NS(X) generated by Chern classes of line bundles, called the rational (or real) Néron-Severi group. Due to the hard Lefschetz theorem, this space is dual via Poincaré duality to the $\mathbb{Q}$ (or $\mathbb{R}$) vector subspace of $H^{n-1,n-1}_{\mathbb{R}}(X)$ generated by cohomology classes [C] of closed complex curves in X. It is clear from Kodaira’s embedding theorem that L is numerically effective in the sense of the previous section if and only if its first Chern class $c_1(L)$ is numerically effective in the sense of Definition 1.12. A consequence of the Kleiman-Seshadri criterion for ampleness is then the following:
Theorem 1.14. Let X be projective and L be a holomorphic line bundle on X. Then $c_1(L)$ is numerically effective if and only if $\int_C c_1(L) \ge 0$ for any complex curve C ⊂ X.
Finally, we have the following easy fact concerning pseudo-effective line bundles (see previous section): for a curve C ⊂ X, consider the Hilbert scheme M parametrizing deformations of C in X. There is a universal subscheme
$$\begin{array}{ccc} \mathcal{C} & \xrightarrow{\ p\ } & X \\ {\scriptstyle q}\downarrow & & \\ M & & \end{array}$$
We have then
Lemma 1.15. If for generic m ∈ M, the curve $C_m$ is irreducible, and the map p is surjective, then for any pseudo-effective line bundle L, we have $\int_C c_1(L) \ge 0$.
The last inequality turns out to be also true more generally for pseudo-effective classes. The proof of the Lemma is as follows. It suffices to show it for line bundles L such that $L^{\otimes k}$ is effective for some k > 0. Next let σ be a section of $L^{\otimes k}$ and D be its divisor. Then by the properties above, the generic curve $C_m$ has no component contained in D. It follows that the intersection number
$C_m \cdot D \ge 0$. But this is equal to $k \int_C c_1(L)$ since C and $C_m$ are homologous and $[D] = k\,c_1(L)$.
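The hypothesis in Lemma 1.15 that the deformations of C cover X cannot be dropped. A minimal worked check, assuming the standard intersection numbers on the blow-up of the projective plane at one point (the symbols H, for the pull-back of the class of a line, and E, for the exceptional curve, are introduced here only for illustration):

```latex
% Assumptions (standard facts, not from the text):
% X = blow-up of P^2 at one point, H = pull-back of a line, E = exceptional curve,
% with H^2 = 1, H \cdot E = 0, E^2 = -1.
% L = O_X(E) is effective, hence pseudo-effective, and c_1(L) = [E].
\begin{align*}
C = E &: \quad \int_C c_1(L) = E \cdot E = -1 < 0,\\
C = H - E &: \quad \int_C c_1(L) = E \cdot (H - E) = 0 - (-1) = 1 \ge 0,\\
C = H &: \quad \int_C c_1(L) = E \cdot H = 0 \ge 0 .
\end{align*}
```

The curve E is rigid, so its deformations do not cover X and the lemma does not apply to it. The strict transforms of lines through the blown-up point (class H − E) and the pull-backs of general lines (class H) do sweep out X, and for them the inequality holds as the lemma predicts.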
1.3. Analytic characterizations of the Kähler and pseudo-effective cones. To complete the parallel between the positivity properties of line bundles and that of (1, 1)-classes, and to have a good picture of how positivity can be tested by restriction to subvarieties, there are two missing statements in the previous sections, which are
(1) A characterization of the Kähler cone analogous to the characterization of the ample cone given by the Nakai-Moishezon criterion.
(2) A converse to Lemma 1.15, providing a characterization of the pseudo-effective cone for projective varieties.
These are precisely the two recent theorems proved by Demailly and his collaborators.
Theorem 1.16. (Demailly-Paun [14]) Let X be a compact Kähler manifold. Then the Kähler cone of X is a connected component of the subset of $H^{1,1}_{\mathbb{R}}(X)$ defined by the inequalities
$$\int_Y \alpha^p > 0, \quad Y \subset X, \ \dim Y = p. \tag{1.1}$$
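The words “connected component” in Theorem 1.16 can be illustrated by a small linear-algebra sketch. It assumes the situation of a general complex torus of dimension 2 discussed in Remark 1.18, where classes correspond to Hermitian forms on $\mathbb{C}^2$ and (1.1) reduces to positivity of a determinant. The encoding of a Hermitian matrix by the triple (a, b, d) and the function names are choices made here, not notation from the text.

```python
def det2(a, b, d):
    """Determinant of the Hermitian matrix [[a, b], [conj(b), d]] with a, d real."""
    return a * d - abs(b) ** 2


def is_positive_definite(a, b, d):
    """Sylvester's criterion for a 2x2 Hermitian matrix."""
    return a > 0 and det2(a, b, d) > 0


def sign_of_component(a, b, d):
    """Label the piece of {det > 0} containing the matrix.

    If det > 0 then a * d > |b|^2 >= 0, so a never vanishes on this set
    and its sign is constant along any path staying inside the set.
    """
    if det2(a, b, d) <= 0:
        raise ValueError("matrix is not in the set {det > 0}")
    return 1 if a > 0 else -1


identity = (1.0, 0.0, 1.0)
minus_identity = (-1.0, 0.0, -1.0)
print(sign_of_component(*identity), is_positive_definite(*identity))
print(sign_of_component(*minus_identity), is_positive_definite(*minus_identity))
```

Both matrices have positive determinant, but only the first is positive definite: in this toy case the set cut out by the determinant inequality has two pieces, and the positive definite forms make up exactly one of them.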
Remark 1.17. It is not clear whether this set is open or not.
Remark 1.18. In contrast to what happens in the projective situation, that is in the Nakai-Moishezon criterion, the Kähler cone cannot be in general equal to the whole subset defined above. Indeed, consider the case of a general complex torus T. Then T does not contain any positive dimensional proper analytic subset. So we just get the inequality $\int_T \alpha^n > 0$. On the other hand, the space $H^{1,1}_{\mathbb{R}}(T)$ identifies with the space of Hermitian forms on $\mathbb{C}^n$, while the Kähler cone identifies with the set of positive Hermitian forms. Since $\int_T \alpha^n$ identifies with the discriminant of the Hermitian form in an adequate basis of $\mathbb{C}^n$, Theorem 1.16 just says that positive Hermitian forms are a component of the set of Hermitian forms with positive discriminant.
Theorem 1.16 had been proved before by Campana and Peternell for X projective and $\alpha \in NS(X)_{\mathbb{R}}$. An extension of this result to the case of classes $\alpha \in H^{1,1}_{\mathbb{R}}(X) \subset H^2(X, \mathbb{R})$ which become rational when pulled back to the universal cover of X was proved by Eyssidieux ([17]). More importantly, it had been established before by Lamari [24] and independently by Buchdahl [5], [6] in the case of surfaces.
The proof of Theorem 1.16 starts as follows: one wants to show that the Kähler cone is both open and closed in the set defined by the inequalities (1.1). It is clearly open. Next consider a class α which is in the closure of the Kähler cone and satisfies these inequalities. So α is numerically effective (see Definition 1.12) and $\alpha^n > 0$.
The first step is then to show:
Theorem 1.19. [14] If a real (1, 1) class α is numerically effective and satisfies the condition $\alpha^n > 0$, then α is representable by a Kähler current (see Definition 1.13).
The second step is then an induction step, which makes use of earlier results of Paun:
Theorem 1.20. [27] Let X be a complex analytic space and $\alpha \in H^{1,1}_{\partial\bar\partial}(X)$ be a real class which is representable by a Kähler current. Then if for any proper closed analytic subset Y of X, the restriction $\alpha|_Y$ is a Kähler class, α is a Kähler class.
Note the shift here from complex manifolds to analytic spaces, necessary in order to make an induction argument.
Next, we have the following theorem, due to Boucksom, Demailly, Paun and Peternell, giving a numerical characterization of the pseudo-effective cone:
Theorem 1.21. [3] Let X be a projective manifold. Then the pseudo-effective cone consisting of pseudo-effective classes (cf. Definition 1.11) $\beta \in NS(X)_{\mathbb{R}}$ is equal to the set
$$\Bigl\{\alpha \in H^{1,1}_{\mathbb{R}}(X), \ \int_C \alpha \ge 0\Bigr\},$$
for all curves C ⊂ X satisfying the assumptions of Lemma 1.15.
The proof of Theorem 1.21 provides another, a priori smaller, set of inequalities characterizing the pseudo-effective cone. Namely, there is the notion of moving intersection of pseudo-effective classes, which is the analytic analogue of the “intersection of the moving part” of an effective divisor. The paper proves that the pseudo-effective cone is equal to the set
$$\bigl\{\alpha \in NS(X)_{\mathbb{R}}, \ \langle \alpha, \beta_m^{n-1} \rangle \ge 0\bigr\},$$
where β runs through the set of pseudo-effective divisors, and
$$\beta_m^{n-1} \in H^{n-1,n-1}(X)$$
is the (n − 1)-th moving intersection of β. (The bracket here is the intersection pairing between $H^{1,1}(X)$ and $H^{n-1,n-1}(X)$.) The proof uses the following: the pseudo-effective cone is certainly contained in the one defined by the above inequalities. So, to show they are equal, it suffices to show that if a pseudo-effective class in $NS(X)_{\mathbb{R}}$ is in the interior of the cone defined by the above inequalities, it is also in the interior of the pseudo-effective cone. This is proved eventually using a criterion due to Boucksom ([4]) characterizing the interior of the pseudo-effective cone as the set of pseudo-effective classes $\beta \in NS(X)_{\mathbb{R}}$ having a positive moving self-intersection:
$$\beta_m^n > 0.$$
2. Hodge classes and analytic geometry
2.1. Constructions of Hodge classes. Let X be a compact complex manifold of dimension n, and k be an integer ≤ n.
Definition 2.1. The space $Hdg^{2k}(X)$ of degree 2k rational Hodge classes is the set of classes $\alpha \in H^{2k}(X, \mathbb{Q})$ which can be represented in de Rham cohomology by a closed form of type (k, k).
It can be shown that this is equivalent to being representable by a closed current of type (k, k). When X is Kähler, classes representable by a closed form of type (k, k) are exactly the elements of the space $H^{k,k}(X) \subset H^{2k}(X, \mathbb{C})$ (see Introduction), so that in that case $Hdg^{2k}(X) = H^{2k}(X, \mathbb{Q}) \cap H^{k,k}(X)$. There are three standard ways of constructing Hodge classes (in fact integral ones).
– The class of an analytic subset. Let Z ⊂ X be a closed analytic subset of codimension k. Then there is a closed analytic subset $Z_{sing} \subset Z \subset X$ which is of codimension ≥ k + 1, such that $Z \setminus Z_{sing} \subset X \setminus Z_{sing}$ is a complex submanifold of codimension k. Thus we have a class $[Z \setminus Z_{sing}] \in H^{2k}(X \setminus Z_{sing}, \mathbb{Z})$, and the isomorphism $H^{2k}(X, \mathbb{Z}) \cong H^{2k}(X \setminus Z_{sing}, \mathbb{Z})$, which comes from the fact that the real codimension of $Z_{sing}$ is ≥ 2k + 2, provides us with the desired class $[Z] \in H^{2k}(X, \mathbb{Z})$. This is a Hodge class, as a consequence of Lelong’s theorem, which says that the current of integration over Z,
$$\int_Z := \int_{Z \setminus Z_{sing}},$$
is well defined and closed. It is then immediate to check that it represents the class [Z], and since it is of type (k, k), this concludes the proof.
– Chern classes of holomorphic vector bundles. If E is a complex vector bundle on a topological manifold X, we have the rational Chern classes $c_i(E) \in H^{2i}(X, \mathbb{Q})$. (Note that the Chern classes are usually defined as integral cohomology classes, $c_i \in H^{2i}(X, \mathbb{Z})$, but in this text, the notation $c_i$ will be used for the rational ones.) If E is now a holomorphic vector bundle on a complex manifold X, the Chern classes of E are Hodge classes. This follows indeed from Chern-Weil theory, which provides de Rham representatives of $c_i(E)$ as follows: If ∇ is a complex connection on E, with curvature operator $R_\nabla \in A^2_X \otimes \mathrm{End}\,E$, then a representative of $c_k(E)$ is given by the degree 2k closed form
$$\sigma_k\Bigl(\frac{i}{2\pi} R_\nabla\Bigr),$$
Recent Progresses in K¨ ahler and Complex Algebraic Geometry
where $\sigma_k$ is the polynomial, invariant under conjugation on the space of matrices, which to a matrix associates the kth symmetric function of its eigenvalues. Now, if E is a holomorphic vector bundle on X, there exists a complex connection ∇ on E such that $R_\nabla$ is of type (1, 1), that is, $R_\nabla \in A^{1,1}_X \otimes \mathrm{End}\, E$. (Given a Hermitian metric h on E, one can take the so-called Chern connection, which is compatible with h and has the property that its (0, 1)-part is equal to the $\bar\partial$-operator of E.) This implies that $\sigma_k\left(\frac{i}{2\pi} R_\nabla\right) \in A^{k,k}(X)$, and shows that $c_k(E)$ is Hodge.

– Chern classes of coherent sheaves. Coherent sheaves $\mathcal{F}$ on a complex manifold X are sheaves of $\mathcal{O}_X$-modules which are locally presented as quotients
$$\mathcal{O}_X^r \xrightarrow{\ \phi\ } \mathcal{O}_X^s \to \mathcal{F} \to 0,$$
where φ is a matrix of holomorphic functions. If X is a smooth projective complex manifold, it is known that coherent sheaves are algebraic and admit a finite locally free resolution
$$0 \to \mathcal{F}_n \to \cdots \to \mathcal{F}_0 \to \mathcal{F} \to 0,$$
where the $\mathcal{F}_i$ are locally free, i.e., locally isomorphic to some $\mathcal{O}_X^s$. Such a locally free sheaf $\mathcal{F}_i$ of $\mathcal{O}_X$-modules is the sheaf of sections of a holomorphic vector bundle $F_i$ on X, and we can define the Chern classes of $\mathcal{F}$ by the Whitney formula:
$$c(\mathcal{F}) := \prod_l c(\mathcal{F}_l)^{\epsilon_l}.$$
Here the total Chern class $c(\mathcal{F})$ determines the Chern classes $c_i(\mathcal{F})$ by the formula $c(\mathcal{F}) = 1 + c_1(\mathcal{F}) + \cdots + c_n(\mathcal{F})$, and we put by definition $c(\mathcal{F}_l) := c(F_l)$. (Here $\epsilon_l = (-1)^l$, and the series can be inverted because the cohomology ring is nilpotent in degree > 0.) The Whitney formula and the case of holomorphic bundles imply that the Chern classes $c_i(\mathcal{F})$ are Hodge classes.

On a general compact complex manifold (even a Kähler one), such a finite locally free resolution does not exist in general (see section 2.3). In order to define the $c_i(\mathcal{F})$, one can use a finite locally free resolution
$$0 \to \mathcal{F}_n \to \cdots \to \mathcal{F}_0 \to \mathcal{F} \otimes \mathcal{H}_X \to 0$$
of $\mathcal{F} \otimes \mathcal{H}_X$ by sheaves of locally free $\mathcal{H}_X$-modules, where $\mathcal{H}_X$ is the sheaf of real analytic complex functions. The $\mathcal{F}_l$ are then the sheaves of real analytic sections of some complex vector bundles $F_l$ of real analytic class, and one can then define (using again the definition $c(\mathcal{F}_l) = c(F_l)$)
$$c(\mathcal{F}) = c(\mathcal{F} \otimes \mathcal{H}_X), \qquad c(\mathcal{F} \otimes \mathcal{H}_X) = \prod_l c(\mathcal{F}_l)^{\epsilon_l}.$$
This defines unambiguously the Chern classes of $\mathcal{F}$, and some further work shows that these classes are Hodge classes.
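To make the Whitney formula concrete, here is a short worked expansion in low degree. It is only a sketch under an assumption not made in the text: the resolution is taken to have two terms, $0 \to \mathcal{F}_1 \to \mathcal{F}_0 \to \mathcal{F} \to 0$, so that $c(\mathcal{F}) = c(\mathcal{F}_0)\, c(\mathcal{F}_1)^{-1}$.

```latex
% Assumption for illustration: a two-term locally free resolution
% 0 -> F_1 -> F_0 -> F -> 0, hence c(F) = c(F_0) c(F_1)^{-1}.
\begin{align*}
c(\mathcal{F}_1)^{-1} &= 1 - c_1(\mathcal{F}_1)
   + \bigl(c_1(\mathcal{F}_1)^2 - c_2(\mathcal{F}_1)\bigr) + \cdots,\\
c_1(\mathcal{F}) &= c_1(\mathcal{F}_0) - c_1(\mathcal{F}_1),\\
c_2(\mathcal{F}) &= c_2(\mathcal{F}_0) - c_2(\mathcal{F}_1)
   - c_1(\mathcal{F}_1)\bigl(c_1(\mathcal{F}_0) - c_1(\mathcal{F}_1)\bigr).
\end{align*}
```

The inverse series terminates because classes of positive degree are nilpotent. As a sanity check in the simplest case, the ideal sheaf of a smooth hypersurface D is the line bundle $\mathcal{O}_X(-D)$, whose first Chern class is $-[D]$; this agrees with relation (2.1) for k = 1.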
2.2. The projective case. The three constructions described above provide us with three subspaces of $\mathrm{Hdg}^{2k}(X)$, namely the Q-vector space generated by the classes [Z], for Z ⊂ X of codimension k, the Q-vector space generated by the Chern classes $c_k(E)$, for all holomorphic vector bundles E on X, and the Q-vector space generated by the Chern classes $c_k(\mathcal{F})$, for all coherent sheaves $\mathcal{F}$ on X. It is always the case that the first space is contained in the last one. Indeed, if Z ⊂ X is a closed analytic subset of codimension k, one can consider its ideal sheaf $\mathcal{I}_Z \subset \mathcal{O}_X$. It is a coherent sheaf, and one has the relation (cf. [2], [18] p. 298):
$$c_k(\mathcal{I}_Z) = (-1)^k (k-1)!\,[Z]. \tag{2.1}$$
In the projective situation, one has furthermore the following result:

Theorem 2.2. If X is a smooth projective complex variety, these three subspaces of $\mathrm{Hdg}^{2k}(X)$ coincide.

That the second and third spaces coincide follows from the above-mentioned fact that coherent sheaves admit finite locally free resolutions. That the Chern classes of a holomorphic vector bundle E are integral combinations of classes of subvarieties follows from the following fact: if L is an ample line bundle on X, the vector bundle $E \otimes L^{\otimes k}$ is generated by global holomorphic sections for large k. It follows that $E \otimes L^{\otimes k}$ is the pull-back via a holomorphic map $\Phi : X \to G$ of the tautological quotient vector bundle Q on G, where G is the Grassmannian of codimension r subspaces of $H^0(X, E \otimes L^{\otimes k})$, r = rank E. (Indeed, Φ is the map which to x ∈ X associates the subspace $V_x \subset H^0(X, E \otimes L^{\otimes k})$ consisting of sections vanishing at x.) So the Chern classes of $E \otimes L^{\otimes k}$ are the pull-backs via Φ of those of Q, and one then uses the fact that the cohomology of G is generated by classes of algebraic subvarieties.

Finally, the Hodge conjecture predicts the following:

Conjecture 2.3. (Hodge) If X is a smooth projective complex variety, the Hodge classes of X are generated over Q by classes [Z] of algebraic subvarieties of X (or equivalently by Chern classes of holomorphic vector bundles or coherent sheaves).

The conjecture is known to be true for degree 2 Hodge classes (it is then known as the Lefschetz theorem on (1, 1)-classes). It is in this case an easy consequence of the exponential exact sequence
$$0 \to \mathbb{Z} \xrightarrow{\ 2i\pi\ } \mathcal{O}_X \xrightarrow{\ \exp\ } \mathcal{O}_X^* \to 0,$$
and the fact that the space $H^1(X, \mathcal{O}_X^*)$ identifies with the set of isomorphism classes of holomorphic line bundles on X.

Note that this proof shows that for X compact Kähler, degree 2 integral Hodge classes are of the form $c_1(L)$, for L a holomorphic line bundle on X.
The Hodge conjecture is also true for degree 2n − 2 Hodge classes. This is a consequence of the above and of the hard Lefschetz theorem, a particular case of which says the following: let L be an ample line bundle on X. Then cup-product with the class $c_1(L)^{n-2}$ induces an isomorphism
$$\cup\, c_1(L)^{n-2} : H^2(X, \mathbb{Q}) \cong H^{2n-2}(X, \mathbb{Q}).$$
One can show that this induces an isomorphism on Hodge classes:
$$\cup\, c_1(L)^{n-2} : \mathrm{Hdg}^2(X, \mathbb{Q}) \cong \mathrm{Hdg}^{2n-2}(X, \mathbb{Q}).$$

Note that this proof already fails in the general Kähler case, since in general there is no longer a Hodge class α of degree 2 inducing a Lefschetz isomorphism $\cup\, \alpha^{n-2}$ as above.

2.3. The Kähler case. In the Kähler case, it was classically known that the constructions of Hodge classes via analytic subsets and via holomorphic vector bundles may not generate the same subspace of Hdg(X). There are examples of Chern classes of holomorphic vector bundles on a compact Kähler manifold X which are not in the Q-vector space generated by classes of analytic subsets. Namely, take for X a complex torus which has $\mathrm{Hdg}^2(X) \cong \mathbb{Q}$ generated by $c_1(L)$, where $c_1(L)$ is represented by a real (1, 1)-form on $\mathbb{C}^n$ which is non-degenerate but neither positive nor negative. (We use here the fact that for a torus $X = \mathbb{C}^n/\Gamma$, the space $H^{1,1}(X)$ identifies naturally with the space of real (1, 1)-forms with constant coefficients on $\mathbb{C}^n$.) Then such a torus contains no complex hypersurface, because such a hypersurface D ⊂ X is the zero set of a holomorphic section $\sigma_D$ of a line bundle $L_D$ on X, and $c_1(L_D)$ is represented by a semi-positive non-zero (1, 1)-form on $\mathbb{C}^n$.

Next, we proved in [34] that on compact Kähler manifolds of dimension ≥ 3, Chern classes of coherent sheaves may generate a subspace of Hdg(X) which is strictly larger than the space generated by Chern classes of vector bundles.

Theorem 2.4. (Voisin [34]) Let X be a compact Kähler manifold which satisfies the assumptions $\mathrm{Hdg}^2(X) = \mathrm{Hdg}^4(X) = 0$. Then any holomorphic vector bundle E on X satisfies the property $c_i(E) = 0$, ∀i > 0.

Note that a general complex torus of dimension ≥ 3 satisfies these assumptions. On the other hand, let X be as in Theorem 2.4, and let x ∈ X. Then the Hodge class $[x] \in \mathrm{Hdg}^{2n}(X)$ is non-zero, and by (2.1) this is, up to a coefficient, the Chern class $c_n(\mathcal{I}_x)$ of the coherent sheaf $\mathcal{I}_x$. This provides the announced example, since in this case no non-zero Hodge class comes from Chern classes of holomorphic vector bundles.
Note that this result also implies the following:

Corollary 2.5. There exist compact Kähler manifolds X and coherent sheaves on them which do not admit a locally free resolution by sheaves of $\mathcal{O}_X$-modules.

Indeed, consider the above example: if the coherent sheaf $\mathcal{I}_x$ admitted a locally free resolution
$$0 \to \mathcal{F}_n \to \cdots \to \mathcal{F}_0 \to \mathcal{I}_x \to 0,$$
the Whitney formula would give
$$c(\mathcal{I}_x) = \prod_l c(\mathcal{F}_l)^{\epsilon_l}.$$
But the theorem says that the right-hand side is equal to 1, while the left-hand side has the non-zero term $c_n(\mathcal{I}_x)$, proportional to [x], in top degree.

Remark 2.6. The existence of locally free resolutions has been proved for coherent sheaves on compact complex surfaces by Schuster [28].

Theorem 2.4 is a consequence of the Bando-Siu extension of the Uhlenbeck-Yau theorem to reflexive sheaves $\mathcal{F}$ on compact Kähler manifolds. A reflexive sheaf is a sheaf which has the Hartogs extension property that any section defined away from a codimension 2 closed analytic subset extends. Equivalently, $\mathcal{F}$ should be equal to its bidual.

Theorem 2.7. (Bando-Siu [1]) Let $\mathcal{F}$ be a reflexive coherent sheaf on a compact Kähler manifold X. Assume $\mathcal{F}$ is stable with respect to some Kähler form ω. Then $\mathcal{F}$ admits a Hermite-Einstein metric relative to ω.

It follows that if furthermore $c_1(\mathcal{F}) = 0 = c_2(\mathcal{F})$, then $\mathcal{F}$ is locally free and admits a flat holomorphic connection, so that $c_i(\mathcal{F}) = 0$, i > 0. We shall explain later on the notions of stability and Hermite-Einstein metrics in the easier context of locally free sheaves. From the above results, one concludes that the only possible way to extend the Hodge conjecture to the Kähler case would be the following:

Question 2.8. Are the Hodge classes on a compact Kähler manifold generated by Chern classes of coherent sheaves?

This question was answered negatively in [34]:

Theorem 2.9. (Voisin [34]) Let X be a compact Kähler manifold of dimension n, with Kähler form ω. Assume the following:
(1) $\mathrm{Hdg}^2(X) = 0$.
(2) $\langle \mathrm{Hdg}^4(X), [\omega]^{n-2} \rangle = 0$. (Here $[\omega] \in H^2(X, \mathbb{R})$ is the de Rham class of ω and the bracket is the Poincaré pairing between $H^4(X)$ and $H^{2n-4}(X)$.)
(3) X does not contain any proper positive-dimensional analytic subset.
Then any coherent sheaf $\mathcal{F}$ on X satisfies the condition $c_2(\mathcal{F}) = 0$. On the other hand, there exist compact complex manifolds X which satisfy the assumptions but have $\mathrm{Hdg}^4(X) \neq 0$.
The examples are general 4-dimensional Weil tori. The algebraic Weil tori were proposed as candidates for a counterexample to the Hodge conjecture even in projective geometry. Weil tori are constructed as follows: one starts with a rank 4n lattice Γ endowed with an endomorphism I such that $I^2 = -\mathrm{Id}_\Gamma$. Let $\Gamma_{\mathbb{C}} := \Gamma \otimes \mathbb{C}$. The torus will be of the form
$$X = \Gamma_{\mathbb{C}} / (W \oplus \Gamma),$$
where $W \subset \Gamma_{\mathbb{C}}$ is a rank 2n complex vector subspace which is stable under I, satisfies the property that $W \oplus \overline{W} = \Gamma_{\mathbb{C}}$, and is such that the eigenvalues of I acting on W consist of n eigenvalues equal to i and n eigenvalues equal to −i.

The Weil classes on such tori are the degree 2n Hodge classes constructed as follows: let $K = \mathbb{Q}[I]$. Then, using the action of I on X, K acts on the space $\Gamma^*_{\mathbb{Q}} = H^1(X, \mathbb{Q})$, and this way $\Gamma^*_{\mathbb{Q}}$ is a K-vector space of rank 2n. There is a natural trace map
$$\bigwedge\nolimits_K^{2n} \Gamma^*_{\mathbb{Q}} \to \bigwedge\nolimits_{\mathbb{Q}}^{2n} \Gamma^*_{\mathbb{Q}} \cong H^{2n}(X, \mathbb{Q}).$$
One shows that the image of this map consists of Hodge classes. (This is a rank 2 Q-vector subspace.) It was known to Zucker [37] that for general Weil tori, the Weil classes are not in the space generated by classes of analytic subsets. The assumptions of Theorem 2.9 were also essentially checked there.

The proof of Theorem 2.9 uses the Uhlenbeck-Yau theorem.

Theorem 2.10. (Uhlenbeck-Yau [33]) Let X be a compact complex manifold of dimension n with Kähler form ω. Let E be a holomorphic vector bundle on X which is stable with respect to ω. Then E admits a Hermite-Einstein metric h relative to ω.

Here the stability condition is the following: denote by $\mathcal{E}$ the sheaf of holomorphic sections of E. Then E is ω-stable if for any subsheaf $\mathcal{F} \subset \mathcal{E}$ such that $0 < \mathrm{rk}\, \mathcal{F} < \mathrm{rk}\, \mathcal{E}$, one has
$$\frac{\langle c_1(\mathcal{F}), [\omega]^{n-1} \rangle}{\mathrm{rk}\, \mathcal{F}} < \frac{\langle c_1(\mathcal{E}), [\omega]^{n-1} \rangle}{\mathrm{rk}\, \mathcal{E}}.$$
The Hermite-Einstein condition on h is the following. Associated to h is the Chern connection $\nabla_h$, with curvature operator $R_{\nabla_h} \in A^{1,1}_X \otimes \mathrm{End}\, E$. Then h is Hermite-Einstein if
$$R_{\nabla_h} = \mu\, \omega\, \mathrm{Id}_E + R^0_{\nabla_h},$$
where $\mu \in \mathbb{C}$ is determined by the equation
$$\mu\, \frac{i}{2\pi}\, \mathrm{rk}\, E\, [\omega]^n = \langle c_1(E), [\omega]^{n-1} \rangle,$$
and the form-valued matrix $R^0_{\nabla_h}$ has primitive coefficients. (A 2-form α on X is said to be primitive if $\omega^{n-1} \wedge \alpha = 0$ everywhere on X.) An important consequence of Uhlenbeck-Yau's theorem is the following:
Corollary 2.11. If E is ω-stable and satisfies the conditions $c_1(E) = 0$, $\langle c_2(E), [\omega]^{n-2} \rangle = 0$, then E admits a flat holomorphic connection and thus the rational Chern classes of E vanish, $c_i(E) = 0$, ∀i > 0.

The corollary shows that under the assumptions of Theorem 2.9, we have $c_2(E) = 0$ for all ω-stable vector bundles on X. Induction on the rank and arguments involving desingularizations of non-locally-free sheaves give the result for all coherent sheaves on X.

3. The topology of projective and Kähler manifolds

3.1. Kodaira's theorem on surfaces. Let X be a compact Kähler manifold. Kodaira's embedding theorem gives the following.

Theorem 3.1. (Kodaira [23]) X is projective if and only if X carries a Kähler form ω whose cohomology class is rational, $[\omega] \in H^2(X, \mathbb{Q})$.

Indeed, by multiplying ω by an integer, we may assume its class is integral. The Lefschetz theorem on (1, 1)-classes then says that $[\omega] = c_1(L)$, for some line bundle L on X. Finally, the isomorphism
$$H^{1,1}_{\mathbb{R}}(X) \cong \frac{\{\text{closed real } (1,1)\text{-forms}\}}{i\partial\bar\partial\, C^\infty(X, \mathbb{R})}$$
and the construction of the Chern forms $\omega_{L,h}$ show that for some metric h on L, we have $\omega = \omega_{L,h}$. Kodaira's Theorem 1.5 then says that L is ample.

Note that, in particular, if X is Kähler and $H^{2,0}(X) = 0$, then X is projective. Indeed, in that case $H^{1,1}_{\mathbb{R}}(X) = H^2(X, \mathbb{R})$. Since the Kähler cone is then open in $H^2(X, \mathbb{R})$, it has to contain rational classes, since they are dense in $H^2(X, \mathbb{R})$.

Starting with a Kähler manifold X, one can deform the complex structure. It is known that small deformations preserve the Kähler property and that the spaces $H^{p,q}$ vary smoothly inside the fixed space $H^{p+q}(X, \mathbb{C})$, which does not depend on the complex structure (see, e.g., [36] I, 9.3.2). Given a family $(X_t)_{t \in B}$ of deformations of the complex structure on X, one can consider the set
$$\bigcup_{t \in B} H^{1,1}_{\mathbb{R}}(X_t) \subset H^2(X, \mathbb{R}),$$
inside which sits as an open set the union of the Kähler cones $K_t \subset H^{1,1}_{\mathbb{R}}(X_t)$. If the union of the $K_t$ contains an open set of $H^2(X, \mathbb{R})$, then by the same density argument it must contain a rational class, which means by Kodaira's Theorem 3.1 that some $X_t$ is projective.
It turns out that this is precisely what happens in the case of Kähler surfaces.

Theorem 3.2. (Kodaira [22]) A compact Kähler surface admits an (arbitrarily small) deformation which is projective.

Kodaira's proof was obtained as a consequence of his classification of surfaces. A more direct proof was given recently by Buchdahl [7], in the case of unobstructed surfaces. His proof uses the following criterion, valid in any dimension, for X to admit small projective deformations.

Proposition 3.3. Assume an unobstructed compact Kähler manifold X has a Kähler class $\omega \in H^{1,1}_{\mathbb{R}}(X) \subset H^{1,1}(X) \cong H^1(X, \Omega_X)$ satisfying the following condition: the interior product (combined with cup-product in cohomology)
$$\omega : H^1(X, T_X) \to H^2(X, \mathcal{O}_X) \tag{3.1}$$
is surjective. Then X admits arbitrarily small deformations which are projective.

Let us explain the relation between this criterion and the previous argument. The space $H^1(X, T_X)$ is the space of first-order deformations of the complex structure up to isomorphism. Assume that there is an actual family of deformations $(X_t)_{t \in B}$ of the complex structure on $X \cong X_0$ such that the tangent space $T_{B,0}$ identifies with $H^1(X, T_X)$ by the Kodaira-Spencer map, which to a tangent vector to B at 0 associates the corresponding infinitesimal deformation of $X_0$. Then the surjectivity of the map (3.1) means exactly that the natural map
$$\bigcup_{t \in B} H^{1,1}_{\mathbb{R}}(X_t) \to H^2(X, \mathbb{R})$$
has a surjective differential at the point ω (see [36] II, 5.3.4). This certainly implies that, even after shrinking B, the union of the Kähler cones of the $X_t$ contains an open set of $H^2(X, \mathbb{R})$, so that we are reduced to the previous situation. However, in general, the family $(X_t)_{t \in B}$ does not exist (there are usually obstructed first-order deformations, which do not extend to all higher orders). This problem limits Buchdahl's proof, which has a more analytic flavor, to the unobstructed case.

3.2. Higher-dimensional case. In higher dimension, the Kodaira theorem left open the question whether a compact Kähler manifold can be deformed to a projective one, a problem known as the Kodaira problem (see [12]), although it is not clear whether the question was asked by Kodaira himself. Here we are considering more generally large deformations, that is, we say that X′ is a deformation of X if there exist connected analytic spaces $\mathcal{X}$, B, a smooth proper holomorphic map $\phi : \mathcal{X} \to B$, and two points t, t′ ∈ B such that $X_t \cong X$, $X_{t'} \cong X'$.

Clearly, if X and X′ are deformations of each other, they are diffeomorphic (although the diffeomorphism between them may not be canonically determined up to isotopy, because of the monodromy group of the fibration given by φ). Indeed, this fibration can be trivialized in a $C^\infty$ way over paths in B, and B is path connected. So, a fortiori, X and X′ are homeomorphic and in particular have the same homotopy type. Hence a weakening of the Kodaira problem asks the following:
Question 3.4. Does any compact Kähler manifold have the homotopy type of a projective complex manifold?

Note that there are no symplectic obstructions, by the work of Donaldson [16] and Muñoz et al. [26] on approximate holomorphic sections of line bundles on symplectic manifolds, which shows that any symplectic manifold can be realized as a symplectic submanifold of projective space. Unfortunately, the answer to this question is negative.

Theorem 3.5. (Voisin [35]) In any dimension ≥ 4, there exist compact Kähler manifolds which do not have the homotopy type of complex projective manifolds. In any dimension ≥ 6 there exist simply connected such examples.

The examples constructed in [35] have the following shape (at least in the non-simply-connected case). One considers complex tori T admitting an endomorphism $\phi_T$. Later on, we will make an assumption on $\phi_T$, but for the moment we just assume that the eigenvalues of $\phi_{T*}$ acting on the tangent space of T at 0 are all different from 0 and 1. It follows that inside T × T the four subtori
$$T_1 := T \times 0, \quad T_2 := 0 \times T, \quad T_3 := T_{\mathrm{diag}} = \{(x, x),\ x \in T\}, \quad T_4 := T_{\mathrm{graph}} = \{(x, \phi_T(x)),\ x \in T\}$$
meet pairwise transversally at finitely many points. We first blow up the finitely many pairwise intersection points of these tori; then the proper transforms $\widetilde{T}_i$ of the $T_i$ are smooth and no longer meet. So we can blow them up in turn. The resulting compact complex manifold is Kähler because the Kähler property is stable under blow-ups. We prove next that for an adequate choice of $(T, \phi_T)$, the manifold X so constructed does not have the homotopy type of a complex projective manifold. More precisely, let us make the following assumptions on $(T, \phi_T)$:

(∗) the dimension n of T is ≥ 2 and the endomorphism $\phi := \phi_{T*}$ of $H_1(T, \mathbb{Z})$ satisfies the properties that all of its eigenvalues are distinct, none is real, and the Galois group of its characteristic polynomial acts as the symmetric group on 2n elements on the set of eigenvalues.
The precise statement is then the following:

Theorem 3.6. Assume that $(T, \phi_T)$ satisfies the assumptions (∗). If X′ is a compact Kähler manifold such that there exists a graded ring isomorphism $\gamma : H^*(X', \mathbb{Z}) \cong H^*(X, \mathbb{Z})$, then X′ is not projective.

The key point is the notion of polarized Hodge structure. Consider a Hodge structure of weight r, that is, a lattice H and a decomposition $H_{\mathbb{C}} = \bigoplus_{p+q=r} H^{p,q}$.

Definition 3.7. A polarization of this Hodge structure is a bilinear form $q : H \times H \to \mathbb{Z}$ which is skew-symmetric if r is odd and symmetric otherwise, and satisfies the conditions:
$$q(\alpha, \bar\beta) = 0, \quad \alpha \in H^{p,q},\ \beta \in H^{p',q'},\ (p, q) \neq (p', q');$$
$$(\alpha, \beta) \mapsto i^{p-q}\, q(\alpha, \bar\beta) \text{ is a positive definite Hermitian form on } H^{p,q}.$$

Hodge theory and the Kähler identities show the following (see [36] I, 6.2.3). Let X be a complex projective manifold, and let $\eta = c_1(L) \in H^2(X, \mathbb{Z})$ be the first Chern class of an ample line bundle on X. Then, defining the primitive cohomology
$$H^r(X, \mathbb{Z})_{\mathrm{prim}} := \mathrm{Ker}\, \bigl(\cup\, \eta^{n-r+1}\bigr) \subset H^r(X, \mathbb{Z}),$$
the form
$$q_\eta(\alpha, \beta) = \int_X \alpha \cup \eta^{n-r} \cup \beta$$
defines up to sign a polarization on $H^r(X, \mathbb{Z})_{\mathrm{prim}}$. (Here we are working with cohomology modulo torsion.) Note that for r = 1, we have $H^1(X, \mathbb{Z})_{\mathrm{prim}} = H^1(X, \mathbb{Z})$, and for r = 2, we have $H^2(X, \mathbb{Q})_{\mathrm{prim}} \oplus \mathbb{Q}\eta = H^2(X, \mathbb{Q})$. In other words, the cohomology groups of a projective complex manifold carry Hodge structures which are compatible with the cup-product, and furthermore, in degrees 1 and 2, these Hodge structures can be polarized.

The proof of Theorem 3.6 consists in showing that if we have X, X′ and γ as stated there, the Hodge structure on $H^1(X', \mathbb{Z})$ (which has to be compatible via the cup-product with the Hodge structures on higher cohomology groups) cannot be polarized.
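As an illustration of the two polarization conditions, here is a short worked check in the simplest possible case. The example is an assumption added for illustration and does not come from the text: E is the elliptic curve $\mathbb{C}/(\mathbb{Z} + \tau\mathbb{Z})$ with $\operatorname{Im} \tau > 0$, so n = r = 1 and no power of η appears. The conditions are read with a complex conjugation in the second argument.

```latex
% Assumed example: E = C/(Z + tau Z), Im(tau) > 0, coordinate z = x + iy,
% q(alpha, beta) = \int_E alpha \wedge beta (the intersection form on H^1).
\begin{align*}
H^1(E,\mathbb{C}) &= H^{1,0} \oplus H^{0,1}
   = \mathbb{C}\,dz \oplus \mathbb{C}\,d\bar z,\\
q\bigl(dz, \overline{d\bar z}\bigr) &= \int_E dz \wedge dz = 0,\\
i^{\,1-0}\, q\bigl(dz, \overline{dz}\bigr) &= i \int_E dz \wedge d\bar z
   = i\,(-2i) \int_E dx \wedge dy = 2 \operatorname{Im}\tau > 0,\\
i^{\,0-1}\, q\bigl(d\bar z, \overline{d\bar z}\bigr) &= -i \int_E d\bar z \wedge dz
   = -i\,(2i) \int_E dx \wedge dy = 2 \operatorname{Im}\tau > 0.
\end{align*}
```

Here $\int_E dx \wedge dy = \operatorname{Im} \tau$ is the area of the fundamental parallelogram spanned by 1 and τ. The form q is skew-symmetric, as required in odd weight.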
References

[1] S. Bando, Y.-T. Siu. Stable sheaves and Einstein-Hermitian metrics, in Geometry and Analysis on Complex Manifolds (T. Mabuchi et al., Eds.), World Scientific, New Jersey (1994), 39–50.
[2] A. Borel, J.-P. Serre. Le théorème de Riemann-Roch, Bull. Soc. Math. France 86 (1958), 97–136.
[3] S. Boucksom, J.-P. Demailly, M. Paun, T. Peternell. Moving intersections and the dual of the pseudo-effective cone, preprint 2004.
[4] S. Boucksom. On the volume of a line bundle, Int. J. Math. 13 (2002), 1043–1063.
[5] N. Buchdahl. On compact Kähler surfaces, Ann. Inst. Fourier (Grenoble) 49 (1999), 287–302.
[6] N. Buchdahl. A Nakai-Moishezon criterion for non-Kähler surfaces, Ann. Inst. Fourier (Grenoble) 50 (2000), no. 5, 1533–1538.
[7] N. Buchdahl. Algebraic deformations of compact Kähler surfaces, preprint 2003.
[8] F. Campana, T. Peternell. Algebraicity of the ample cone of projective varieties, J. Reine Angew. Math. 407 (1990).
[9] O. Debarre. Higher-dimensional Algebraic Geometry, Universitext, Springer.
[10] P. Deligne. Théorie de Hodge II, Publ. Math. IHES 40 (1971), 5–57.
[11] P. Deligne, P. Griffiths, J. Morgan, D. Sullivan. Real Homotopy Theory of Kähler Manifolds, Inventiones Math. 29 (1975), 245–274.
[12] J.-P. Demailly, T. Eckl, T. Peternell. Line bundles on complex tori and a conjecture of Kodaira, to appear in Commentarii Mathematici Helvetici.
[13] J.-P. Demailly. Regularization of closed positive currents and intersection theory, J. Algebraic Geom. 1 (1992), no. 3, 361–409.
[14] J.-P. Demailly, M. Paun. Numerical characterization of the Kähler cone of a compact Kähler manifold, Annals of Math. 159 (2004), no. 2, 1247–1274.
[15] J.-P. Demailly, T. Peternell, M. Schneider. Pseudo-effective line bundles on compact Kähler manifolds, Internat. J. Math. 12 (2001), no. 6, 689–741.
[16] S. Donaldson. Lefschetz pencils on symplectic manifolds, J. Differential Geom. 53 (1999), no. 2, 205–236.
[17] P. Eyssidieux. Théorèmes de Nakai-Moishezon pour certaines classes de type (1, 1) et applications, unpublished.
[18] W. Fulton. Intersection Theory, Ergebnisse der Mathematik und ihrer Grenzgebiete, 3. Folge, Band 2, Springer-Verlag (1984).
[19] Ph. Griffiths. Periods of integrals on algebraic manifolds, I, II, Amer. J. Math. 90 (1968), 568–626, 805–865.
[20] M. Gromov. Pseudo-holomorphic curves in symplectic manifolds, Invent. Math. 82 (1985), 307–347.
[21] S. Kleiman. Towards a numerical theory of ampleness, Ann. Math. 84 (1966), 293–344.
[22] K. Kodaira. On compact complex analytic surfaces, I, Ann. of Math. 71 (1960), 111–152.
[23] K. Kodaira. On Kähler varieties of restricted type (an intrinsic characterization of algebraic varieties), Ann. of Math. 60 (1954), 28–48.
[24] A. Lamari. Le cône Kählerien d'une surface, J. Math. Pures Appl. 78 (1999), 249–263.
[25] R. Lazarsfeld. Positivity in Algebraic Geometry, book to appear in Ergebnisse, Springer-Verlag.
[26] V. Muñoz, F. Presas, I. Sols. Almost holomorphic embeddings in Grassmannians with applications to singular symplectic submanifolds, J. Reine Angew. Math. 547 (2002), 149–189.
[27] M. Paun. Fibrés en droites numériquement effectifs et variétés kählériennes compactes à courbure de Ricci nef, Thèse, Université Grenoble I (1998), 80 p.
[28] H.-W. Schuster. Locally free resolutions of coherent sheaves on surfaces, J. Reine Angew. Math. 337 (1982), 159–165.
[29] J.-P. Serre. Géométrie algébrique et géométrie analytique, Ann. Inst. Fourier 6 (1956), 1–42.
[30] J.-P. Serre. Faisceaux algébriques cohérents, Ann. Math. 61, 197–278.
[31] Y.-T. Siu. A vanishing theorem for semi-positive line bundles over non-Kähler manifolds, J. Differential Geometry 19 (1984), 431–452.
[32] Y.-T. Siu. Some recent results in complex manifold theory for the semi-positive case, in Proceedings of the International Congress of Mathematicians, Bonn 1984.
[33] K. Uhlenbeck, S.-T. Yau. On the existence of Hermitian-Yang-Mills connections in stable vector bundles, Comm. Pure Appl. Math. (1986), 257–293.
[34] C. Voisin. A counterexample to the Hodge conjecture extended to Kähler varieties, IMRN 2002, no. 20, 1057–1075.
[35] C. Voisin. On the homotopy types of compact Kähler and complex projective manifolds, Inventiones Math. 157 (2004), no. 2, 329–343.
[36] C. Voisin. Hodge Theory and Complex Algebraic Geometry I, II, Cambridge Studies in Advanced Mathematics 76–77, Cambridge University Press 2003.
[37] S. Zucker. The Hodge conjecture for cubic fourfolds, Compositio Math. 34 (1977), 199–209.
Prize Lectures
4ECM Stockholm 2004 © 2005 European Mathematical Society
Isoperimetric Inequalities, Probability Measures and Convex Geometry

F. Barthe

Abstract. We survey recent developments of the isoperimetric problem for product probability measures. These questions were considered with motivations coming from different areas (differential geometry, Banach spaces, probability theory) and substantial progress was enabled by combining geometric, probabilistic and analytic approaches.
1. Introduction

We start with the classical isoperimetric inequality in Euclidean space $(\mathbb{R}^n, |\cdot|, \langle \cdot, \cdot \rangle)$ with Lebesgue measure. The goal of this first section is to introduce in this simple situation some of the ideas and techniques which proved useful in more general settings. It is well known that among smooth domains with prescribed measure, Euclidean balls have minimal boundary measure. Denoting
$$B_2^n = \Bigl\{x \in \mathbb{R}^n;\ \sum_{i=1}^n x_i^2 \le 1\Bigr\},$$
this fact reads as follows: if $\mathrm{Vol}_n(A) = \mathrm{Vol}_n(rB_2^n)$ then
$$\mathrm{Vol}_{n-1}(\partial A) \ge \mathrm{Vol}_{n-1}(\partial(rB_2^n)) = n\, \mathrm{Vol}_n(B_2^n)^{\frac{1}{n}}\, \mathrm{Vol}_n(A)^{1-\frac{1}{n}}.$$
A major difficulty in proving such an isoperimetric inequality is probably the existence of a minimizing set. This requires considering a suitable class of sets, preferably stable under limits, together with an extension of the notion of boundary measure to this class. This explains why the first complete proofs are somewhat recent, and coincide with the development of geometric measure theory and the calculus of variations. We refer to the survey [48] for more details and references.

In this paper we shall consider the notion of Minkowski content as an extension of the boundary measure of smooth domains. For h ≥ 0, the h-neighborhood, or h-enlargement, of $A \subset \mathbb{R}^n$ is $A_h = \{x \in \mathbb{R}^n;\ \mathrm{dist}(x, A) \le h\}$. The Minkowski content of A is simply
$$\liminf_{h \to 0^+} \frac{\mathrm{Vol}_n(A_h \setminus A)}{h}.$$
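The definition of the Minkowski content can be tried out numerically in the plane. The following is a minimal sketch, assuming two sets chosen only for illustration (a closed disc and a closed square of equal area) and using the elementary closed formulas for the areas of their enlargements; the function names are hypothetical.

```python
import math


def minkowski_ratio_disc(r: float, h: float) -> float:
    """(Vol_2(A_h) - Vol_2(A)) / h for a closed disc A of radius r.

    The h-enlargement of the disc is the disc of radius r + h.
    """
    return (math.pi * (r + h) ** 2 - math.pi * r ** 2) / h


def minkowski_ratio_square(s: float, h: float) -> float:
    """Same ratio for a closed square of side s.

    The h-enlargement is the square, plus four s-by-h strips,
    plus four quarter-discs of radius h at the corners.
    """
    enlarged_area = s ** 2 + 4 * s * h + math.pi * h ** 2
    return (enlarged_area - s ** 2) / h


r = 1.0
s = math.sqrt(math.pi) * r  # square with the same area as the disc

for h in (1e-1, 1e-2, 1e-3):
    # As h -> 0 the ratios approach the perimeters 2*pi*r and 4*s.
    print(h, minkowski_ratio_disc(r, h), minkowski_ratio_square(s, h))
```

For every h the disc gives the smaller ratio, and the limits are the two perimeters, in line with the isoperimetric inequality stated above.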
Hence the isoperimetric inequality follows, by letting h go to zero, from another minimizing property of balls: if $\mathrm{Vol}_n(A) = \mathrm{Vol}_n(rB_2^n)$ then for all h > 0,
$$\mathrm{Vol}_n(A_h) \ge \mathrm{Vol}_n((rB_2^n)_h) = \mathrm{Vol}_n((r + h)B_2^n).$$
The above fact can be proved by Steiner symmetrization (this operation is a key ingredient in several proofs of the isoperimetric inequality). The Steiner symmetral of a set A with respect to a hyperplane H is defined as follows: let u be a unit vector orthogonal to H. For every x ∈ H, consider the line through x with direction u. If this line meets A with positive measure, on a set denoted $A_x$, then consider the segment with the same length as $A_x$, parallel to u and centered at x. The union of all such segments is the Steiner symmetral with respect to H, denoted $S_H(A)$. Fubini's theorem ensures that $\mathrm{Vol}_n(S_H(A)) = \mathrm{Vol}_n(A)$, and it is not hard to prove that for all h > 0, $\mathrm{Vol}_n(A_h) \ge \mathrm{Vol}_n((S_H(A))_h)$. Thus the symmetral has the same volume and smaller enlargements. Next one can find a sequence of hyperplanes $(H_k)_{k \ge 1}$ such that the iterated symmetrals converge to a ball in Hausdorff distance. Proving that the Steiner symmetral has smaller enlargements strongly relies on an obvious property that we wish to put forward, as it will be crucial in more general situations: on the real line $(\mathbb{R}, |\cdot|, dx)$ the sets [−t, t] for t > 0 are solutions to the isoperimetric problem (for all positive values of the measure) and the family is stable under enlargement, since $[-t, t]_h = [-t - h, t + h]$.

The isoperimetric inequality may also be derived from the famous Brunn-Minkowski inequality on sum sets. It states that for all non-empty compact subsets A, B of $\mathbb{R}^n$, one has
$$\mathrm{Vol}_n(A + B)^{\frac{1}{n}} \ge \mathrm{Vol}_n(A)^{\frac{1}{n}} + \mathrm{Vol}_n(B)^{\frac{1}{n}},$$
where $A + B = \{a + b;\ (a, b) \in A \times B\}$. There is equality when A and B are dilates of a common convex set. In particular, choosing $B = hB_2^n$ and noting that $A_h = A + hB_2^n$ since A is closed, the above inequality yields
$$\mathrm{Vol}_n(A_h) \ge \Bigl(\mathrm{Vol}_n(A)^{\frac{1}{n}} + h\, \mathrm{Vol}_n(B_2^n)^{\frac{1}{n}}\Bigr)^n,$$
with equality when A is a ball. There are several proofs of the Brunn-Minkowski inequality. Some of them involve tricky dissection arguments or the Steiner symmetrization (see, e.g., the survey article [19]). Next we briefly present two related modern approaches, as they turned out to be very flexible and to allow extensions to more general spaces.

The first one is based on mass transport. Computing the volume of a sum set is hard since it lies in $\mathbb{R}^n$ and is parametrized by $A \times B \subset \mathbb{R}^{2n}$. One looks instead for a smaller set still containing enough volume, but parametrized only by A. This is done by selecting for each a ∈ A a point ϕ(a) ∈ B. Obviously
Voln (A+B) ≥ Voln ({a+ϕ(a); a ∈ A}) and the latter volume can be computed by a change of variable formula when Id + ϕ is a diffeomorphism. A successful approach is to choose ϕ as a “monotone measure preserving map”. More precisely if λA denotes the uniform probability measure on A (assuming Voln (A) > 0 otherwise we have nothing to prove), then one chooses ϕ so that the image measure of λA by ϕ is λB . There are several such mappings. Among them one chooses either the so-called Knothe map, which has the additional property that its differential at every point is triangular with positive diagonal [28], or the Brenier map which is the gradient of a convex function [17] (its differential is symmetric positive). In fact, among all maps with ϕλA = λB the Brenier maps minimizes the quadratic transportation cost |x − ϕ(x)|2 dλA (x). A remarkable presentation of this topic is provided in [49]. Using either Knothe or Brenier map one can bound from below Voln ({a + ϕ(a); a ∈ A}) = A det(Id + Dϕ) and prove the Brunn-Minkowski inequality. The Knothe map allowed extensions of the latter inequality [28], and also a very short proof of the isoperimetric inequality [37]. The geometric use of the Brenier map goes back to McCann’s thesis [36]. Monotone transportation maps also played a central role in the development of functional inequalities of Brunn-Minkowski type, see, e.g., [19, 3, 18]. The best known example is known as the Pr´ekopa-Leindler inequality. It can be viewed as a converse of the H¨older inequality: Let t ∈ (0, 1) and f, g, h : Rn → R three measurable functions such that for all x, y ∈ Rn one has h(tx + (1 − t)y) ≥ f (x)t g(y)1−t , then
$$\int_{\mathbb{R}^n} h \ \ge\ \Big(\int_{\mathbb{R}^n} f\Big)^{t}\Big(\int_{\mathbb{R}^n} g\Big)^{1-t}. \tag{1.1}$$
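As a hedged illustration of how Inequality (1.1) relates to the Brunn-Minkowski inequality, the following sketch specializes it to indicator functions. It assumes that $A$ and $B$ are compact sets of positive volume (so that all sets involved are measurable); the rescaled sets $A'$, $B'$ and the abbreviations $a$, $b$ are notation introduced only for this sketch.

```latex
% Sketch: (1.1) applied to indicator functions gives Brunn-Minkowski.
% Assumption (ours): A, B compact with positive volume.
\begin{align*}
&f=\mathbf 1_A,\quad g=\mathbf 1_B,\quad h=\mathbf 1_{tA+(1-t)B}
 \ \Longrightarrow\ h(tx+(1-t)y)\ \ge\ f(x)^t g(y)^{1-t}\ \text{ for all } x,y,\\
&\text{so (1.1) yields}\quad
 \mathrm{Vol}_n\big(tA+(1-t)B\big)\ \ge\ \mathrm{Vol}_n(A)^t\,\mathrm{Vol}_n(B)^{1-t}.\\[4pt]
&\text{Set } a=\mathrm{Vol}_n(A)^{1/n},\quad b=\mathrm{Vol}_n(B)^{1/n},\quad
 A'=A/a,\quad B'=B/b,\quad t=\tfrac{a}{a+b}.\\
&\text{Then } tA'+(1-t)B'=\tfrac{1}{a+b}\,(A+B)
 \ \text{ and }\ \mathrm{Vol}_n(A')=\mathrm{Vol}_n(B')=1,\ \text{ hence}\\
&\frac{\mathrm{Vol}_n(A+B)}{(a+b)^n}\ \ge\ 1
 \quad\Longleftrightarrow\quad
 \mathrm{Vol}_n(A+B)^{1/n}\ \ge\ \mathrm{Vol}_n(A)^{1/n}+\mathrm{Vol}_n(B)^{1/n}.
\end{align*}
```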
2. Isoperimetry in metric measured spaces
We formulate the isoperimetric problem in the general setting of a metric space with a Borel measure, $(X, d, \mu)$. We still denote by $A_h$ the enlargement of a subset $A$ of $X$, $A_h = \{x\in X;\ \mathrm{dist}(x, A)\le h\}$. The boundary measure in the sense of $\mu$, or $\mu$-Minkowski content, is
$$\mu_s(\partial A) = \liminf_{h\to 0}\frac{\mu(A_h\setminus A)}{h}.$$
When the space is a Riemannian manifold and $\mu$ has density $\rho_\mu$ with respect to the volume, then for smooth domains $A$ this notion coincides with the integral of $\rho_\mu$ with respect to the surface measure on $\partial A$. The isoperimetric function, or profile, encodes the minimal $\mu$-boundary measure for a given measure:
$$I_{(X,d,\mu)}(a) = \inf\{\mu_s(\partial A);\ \mu(A) = a\},\qquad a\in[0,\mu(X)].$$
When there is no ambiguity on the metric space we simply write $I_\mu$. With this notation the Euclidean isoperimetric inequality reads
$$I_{(\mathbb{R}^n,|\cdot|,dx)}(a) = n\,\mathrm{Vol}_n(B_2^n)^{1/n}\,a^{1-1/n},\qquad a\ge 0.$$
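The Euclidean profile can be checked numerically against Euclidean balls, for which the formula is an equality. The sketch below is only an illustration; the function names are hypothetical, and the closed form $\pi^{n/2}/\Gamma(n/2+1)$ for the volume of the unit ball is a standard fact that the text does not spell out.

```python
import math


def unit_ball_volume(n: int) -> float:
    """Volume of the Euclidean unit ball B_2^n: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def euclidean_profile(n: int, a: float) -> float:
    """Profile of (R^n, |.|, dx): n * Vol_n(B_2^n)^(1/n) * a^(1 - 1/n)."""
    return n * unit_ball_volume(n) ** (1 / n) * a ** (1 - 1 / n)


def ball_volume(n: int, r: float) -> float:
    """Volume of a Euclidean ball of radius r in R^n."""
    return unit_ball_volume(n) * r ** n


def sphere_area(n: int, r: float) -> float:
    """(n-1)-dimensional area of the boundary of that ball."""
    return n * unit_ball_volume(n) * r ** (n - 1)


if __name__ == "__main__":
    # For balls, the profile evaluated at the volume matches the surface area.
    for n in (2, 3, 5):
        r = 1.5
        print(n, euclidean_profile(n, ball_volume(n, r)), sphere_area(n, r))
```

In dimension 2 this reduces to the familiar statement that a disc of area $a$ has perimeter $2\sqrt{\pi a}$.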
This general setting for the isoperimetric problem provides a common framework for studying questions which are usually considered by different communities: the isoperimetric problem on Riemannian manifolds and the concentration phenomenon for probability measures. We briefly recall the basics of these fields. The reader will find details and complete references in the survey article of Ros [42] and in the book by Ledoux [33].
2.1. Isoperimetry for Riemannian manifolds. The ambient space is a Riemannian manifold $M$, with or without boundary, equipped with its geodesic distance and Riemannian volume. When $M$ is compact, deep results of geometric measure theory and the calculus of variations provide, for any $t\in(0,\mathrm{Vol}(M))$, the existence of a compact region of volume $t$ whose boundary has minimal surface area. These results are complemented by regularity properties of these isoperimetric sets. Characterizing such minimizers is very hard. The class of critical sets is somewhat more accessible and corresponds to sets with boundary of constant mean curvature. The study up to second order (stable surfaces) is also well developed and has led to several striking results. However there are only a few cases where the isoperimetric problem is completely solved. We list the simplest examples: The cases of $\mathbb{R}^n$, of the sphere $S^n$ and of the hyperbolic space $H^n$ have a common answer (geodesic balls are isoperimetric sets), which can be proved in a similar way thanks to the rich symmetry groups of these spaces. The solutions of the isoperimetric problem in the Euclidean ball $B_2^n$ are orthogonal balls or their complements. The case of the slab $\mathbb{R}^{n-1}\times[0,1]$ was solved by Pedrosa and Ritoré for $n\le 8$ [40]. They showed that the solutions are either half-spheres (set on the boundary) or a piece of cylinder touching both sides. However when $n\ge 10$ a third shape (unduloid) is better for some values of the volume. A few more examples are solved, including the projective space $P^3$ [41].
Let us put forward the case of the square $[0,1]^2$. It is easy to check that optimal sets are quarters of discs centered at a corner (or their complements) or slabs of the square ($[0,t]\times[0,1]$). This is a simple example where the shape of the solution changes for some value of the volume. A similar conjecture exists for the cube $[0,1]^3$ (and is supported by experiments with soap bubbles): the solutions are believed to be among pieces of balls centered at a corner of the cube, cylinders with an edge of the cube as axis, or slabs in the cube parallel to a face (together with their complements). This question still resists the efforts of many mathematicians. The conjecture has been confirmed for volume 1/2 by Hadwiger [22] ($[0,1/2]\times[0,1]^2$ is optimal), and a recent work by Hauswirth, Pérez, Romon and Ros [23] proves that balls are best for small volume.
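The change of shape in the square can be made explicit with a small computation. The sketch below assumes, as in the text, that the candidates are quarter discs at a corner, slabs, and their complements, and that only the part of the boundary lying inside the square is counted. A quarter disc of area $a$ has radius $2\sqrt{a/\pi}$ and arc length $\sqrt{\pi a}$, while a slab always has boundary length 1. The function names are hypothetical.

```python
import math


def quarter_disc_boundary(a: float) -> float:
    """Arc length of a quarter disc of area a centered at a corner."""
    radius = 2.0 * math.sqrt(a / math.pi)
    return math.pi * radius / 2.0  # equals sqrt(pi * a)


def square_profile(a: float) -> float:
    """Smallest boundary length among the candidate sets of area a in [0, 1]^2."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("the area must lie in [0, 1]")
    slab = 1.0
    # The complement of a quarter disc of area 1 - a has area a and the same boundary.
    return min(quarter_disc_boundary(a), slab, quarter_disc_boundary(1.0 - a))


if __name__ == "__main__":
    # The quarter disc and the slab tie exactly when sqrt(pi * a) = 1.
    print("transition area:", 1.0 / math.pi)
    for a in (0.1, 0.3, 0.5, 0.9):
        print(a, square_profile(a))
```

Under these assumptions the optimal shape switches from a quarter disc to a slab at area $1/\pi$, and back to the complement of a quarter disc at area $1-1/\pi$.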
2.2. Concentration of measure. Let us state more precisely the isoperimetric inequality on spheres $S^n\subset\mathbb{R}^{n+1}$, which goes back to Lévy and Schmidt.
Theorem 2.1 ([34, 44]). Let $\sigma_n$ denote the uniform probability measure on $S^n$ and let $C$ be a spherical cap (or geodesic ball). If $A\subset S^n$ is such that $\sigma_n(A) = \sigma_n(C)$, then for all $h > 0$, $\sigma_n(A_h)\ge\sigma_n(C_h)$. In particular if $\sigma_n(A)\ge\frac12$ then
$$\sigma_n(A_h)\ \ge\ 1-\sqrt{\frac{\pi}{8}}\,\exp\Big(-\frac{(n-1)h^2}{2}\Big).$$
A striking consequence is that enlarging a set of measure 1/2 by, say, $h = 5/\sqrt{n}$ is enough to capture most of the measure. The above fact can be translated in terms of functions: let $f : S^n\to\mathbb{R}$ be a 1-Lipschitz function. Let $m$ be a median of the law of $f$. Then for $h > 0$,
$$\sigma_n(|f-m| > h)\ \le\ \sqrt{\frac{\pi}{2}}\,\exp\Big(-\frac{(n-1)h^2}{2}\Big).$$
This means that in high dimension, Lipschitz functions are very strongly concentrated around their median. This was formalized by Gromov [20] in terms of observable diameter: that of the sphere is of order $n^{-1/2}$ although its actual diameter is $\pi$. The relevance of the above concentration phenomenon was understood and strongly put forward by V. Milman in the setting of Banach space geometry, see [37]. He first used it to give new insight into the Dvoretzky theorem. This fundamental result asserts that for every Banach space $(X,\|\cdot\|_X)$ of infinite dimension, for every $\varepsilon > 0$ and $d\ge 1$ there exists a $d$-dimensional subspace $E\subset X$ which is $\varepsilon$-close to the Euclidean space $(\mathbb{R}^d,|\cdot|)$ in the Banach-Mazur distance (there exists a linear map $u : (E,\|\cdot\|_X)\to(\mathbb{R}^d,|\cdot|)$ with $\|u\|\cdot\|u^{-1}\|\le 1+\varepsilon$). Milman proved that every $n$-dimensional Banach space $F$ has a subspace of dimension $c(\varepsilon)\log n$ which is $\varepsilon$-close to the Euclidean space of the same dimension. The first step in his proof is to choose a representation of $F$ in $\mathbb{R}^n$ such that the Euclidean ball bounded by $S^{n-1}$ is the largest ellipsoid inside the unit ball of $F$. This inclusion implies that $\|\cdot\|_F$, considered as a function on the sphere, is 1-Lipschitz. By the concentration inequality, this function is almost constant on a subset of $S^{n-1}$ of large measure. In the corresponding directions, the unit ball of $F$ thus looks like a Euclidean ball. The rest of the proof consists in building a linear subspace where this is true. The concentration of measure subsequently became a central tool in the geometry of Banach spaces. As expected it also became an important field in probability theory.
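To see what the bound of Theorem 2.1 gives for the enlargement $h = 5/\sqrt{n}$ mentioned above, one can simply evaluate it. This is a minimal sketch; the function name is hypothetical and the code evaluates only the stated lower bound, not the true measure of $A_h$.

```python
import math


def half_measure_enlargement_bound(n: int, h: float) -> float:
    """Lower bound on sigma_n(A_h) from Theorem 2.1, valid when sigma_n(A) >= 1/2."""
    return 1.0 - math.sqrt(math.pi / 8.0) * math.exp(-(n - 1) * h * h / 2.0)


if __name__ == "__main__":
    for n in (10, 100, 10_000):
        h = 5.0 / math.sqrt(n)
        # h shrinks with n while the guaranteed measure stays close to 1.
        print(n, h, half_measure_enlargement_bound(n, h))
```

The enlargement radius tends to zero as $n$ grows, yet the guaranteed measure of $A_h$ stays close to 1, which is the concentration phenomenon in its simplest form.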
This is due in particular to the profound investigation of this phenomenon by Talagrand, with striking results for product probability measures (via the “induction method”) and for the supremum of a random process, see, e.g., [47, 33]. His contribution was revisited and partly extended by
Ledoux, who pushed forward the relevance of logarithmic Sobolev inequalities for establishing concentration inequalities [31]. See also the works of Massart and his collaborators for further developments of this entropy method and applications to statistics [16]. The topic was recently revived by Bobkov and Houdré, who promoted a functional approach to several isoperimetric inequalities for product probability measures. They use $L^1$-Sobolev type inequalities instead of $L^2$-inequalities (such as log-Sobolev), and improve on several concentration inequalities (their results give precise bounds on the measure of $A_h$ in terms of that of $A$ whatever this value is, whereas concentration inequalities deal with large sets and not too small enlargements).
3. Product probability measures
The general framework presented in the previous section promoted interaction between geometry and probability theory. One illustration is the geometry of Markov diffusion generators developed by Bakry and his collaborators. A notion of Ricci curvature for these operators is defined, and the analogues of several classical comparison theorems of differential geometry hold in this new setting with soft proofs. See, e.g., [32]. Our aim is to present another illustration, related to product measures. The problem we consider in the rest of this paper is the following: assume that we have good information on the isoperimetric problem for a metric probability space $(X, d, \mu)$, that is, a lower bound on $I_\mu$. What can we infer for the isoperimetric question on the product space $(X^n, d_n, \mu^n)$? Here the distance on the product is the $\ell_2$ combination of the ones on the factors (as for Riemannian manifolds):
$$d_n\big((x_i)_{i=1}^n,(y_i)_{i=1}^n\big) = \Big(\sum_{i=1}^n d(x_i,y_i)^2\Big)^{1/2}.$$
For simplicity we denote by $I_{\mu^n}$ the isoperimetric profile of this space. First observe that this sequence of functions is non-increasing in $n$:
$$I_\mu\ \ge\ I_{\mu^2}\ \ge\ I_{\mu^3}\ \ge\ \cdots\ \ge\ I_{\mu^\infty},$$
where by definition $I_{\mu^\infty} = \inf_k I_{\mu^k}$. Indeed if $A\subset X^k$ then the cylinder $A\times X\subset X^{k+1}$ verifies $\mu^k(A) = \mu^{k+1}(A\times X)$ since $\mu$ is a probability measure. It is not hard to check that also $\mu^k_s(\partial A) = \mu^{k+1}_s(\partial(A\times X))$. Thus the isoperimetric problem on $X^k$ consists in restricting the isoperimetric question on $X^{k+1}$ to cylinders. Hence $I_{\mu^k}\ge I_{\mu^{k+1}}$. Next let us try to use our knowledge of $I_\mu$ in order to get information on $I_{\mu^2}$. It is natural to hope for an analogue of the Steiner symmetrization. Let $A\subset X^2$, and for every $y\in X$ denote $A_y = \{x\in X;\ (x,y)\in A\}$. This is a “horizontal” section of $A$. Whenever $A_y$ has positive measure one is tempted to replace it with a solution of the isoperimetric problem in $X$ with the same
measure. Putting all these “symmetrized” slices together provides a new set $A^*$ with the same measure as $A$ by Fubini. Is it true that $\mu^2_s(\partial A)\ge\mu^2_s(\partial(A^*))$? This would allow us to reduce the question to particular sets. However the answer to the above question need not be positive in general. First one needs to fix a family $(B(t))_{t\in(0,1]}$ of (approximate) solutions of the isoperimetric problem in $(X, d, \mu)$ with $\mu(B(t)) = t$. Clearly, if the shape of the sets $B(t)$ changes drastically for some value, then stacking them one onto another creates a lot of “horizontal” boundary measure. One example is given by the uniform measure on the square. Actually one can check that the above symmetrization procedure does reduce the boundary measure provided the family $(B(t))$ is stable by enlargement, that is, for all $t\in(0,1]$ and $h > 0$, $B(t)_h = B(t+\psi(t,h))$ for some function $\psi$. Before pushing further one should wonder whether many spaces enjoy this property. A look back at our list of probability spaces where a complete answer is known provides a sad first answer: only $S^n$ can do, by choosing a family of geodesic balls with common center. This provides a very restricted range of isoperimetric functions. Luckily enough, probability measures on the real line $(\mathbb{R},|\cdot|)$ provide a wide family of spaces with “chained” isoperimetric sets. The following statement combines several results by Bobkov and Houdré [14].
Theorem 3.1. Let $I : [0,1]\to\mathbb{R}^+$. Assume that its restriction to $(0,1)$ is continuous and positive, that $I(0) = 0$ and $I(t) = I(1-t)$ for all $t\in[0,1]$. If for all $a, b\ge 0$ with $a+b\le 1$ one has $I(a+b)\le I(a)+I(b)$, then there exists a probability measure $dm(x) = \rho(x)dx$ on $\mathbb{R}$ such that $\rho$ is even, for all $x\in\mathbb{R}$ the set $(-\infty,x]$ is a solution to the isoperimetric problem for $m$, and $I_m = I$.
This provides a large class of isoperimetric profiles for which the symmetrization procedure works.
Concave profiles are part of this class; they correspond to densities $\rho$ such that $\log\rho$ is concave on its support. Note that the condition $I(t) = I(1-t)$ is natural since in a smooth situation a set and its complement have the same boundary measure. Let us provide a few examples: For $I(t) = \min(t, 1-t)$, which corresponds to the classical Cheeger isoperimetric inequality, the associated measure is the symmetric exponential law $dm(t) = e^{-|t|}dt/2$. If one starts with $I = I_{S^n}$ one gets $dm(t) = c_n\cos(t)^{n-1}\mathbf{1}_{|t|\le\pi/2}\,dt$. A nice consequence of this wide family of spaces with chained isoperimetric sets, called model spaces, is that one can perform an analogue of Steiner symmetrization in product sets. The main difference is that the symmetrized set does not live in the original space but rather in products of the model spaces.
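The first example can be checked by hand or numerically. When half-lines $(-\infty, x]$ are isoperimetric sets, their measure is the distribution function $F(x)$ and their boundary measure is the density $\rho(x)$, so the profile satisfies $I(F(x)) = \rho(x)$. The sketch below verifies this for the symmetric exponential law; the function names are hypothetical, and the finite-difference quotient is only a crude stand-in for the Minkowski content.

```python
import math


def density(x: float) -> float:
    """Density of the symmetric exponential law, exp(-|x|) / 2."""
    return 0.5 * math.exp(-abs(x))


def cdf(x: float) -> float:
    """Distribution function of the symmetric exponential law."""
    return 0.5 * math.exp(x) if x <= 0 else 1.0 - 0.5 * math.exp(-x)


def cheeger_profile(t: float) -> float:
    """The profile I(t) = min(t, 1 - t)."""
    return min(t, 1.0 - t)


def half_line_boundary(x: float, h: float = 1e-6) -> float:
    """Approximate boundary measure of (-inf, x]: m((x, x + h]) / h."""
    return (cdf(x + h) - cdf(x)) / h


if __name__ == "__main__":
    for x in (-3.0, -1.0, 0.5, 2.0):
        print(x, density(x), cheeger_profile(cdf(x)), half_line_boundary(x))
```

The three printed quantities agree up to discretization error, which is the identity $I_m = \min(t, 1-t)$ restricted to half-lines.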
The next statement is a variation on independent results of [5, 42], extending a first result of [8].
Theorem 3.2. Let $(X, d, \mu)$ be a Riemannian manifold with an absolutely continuous probability measure, and let $m$ be a probability measure on $(\mathbb{R},|\cdot|)$ as produced by the latter theorem. If $I_\mu\ge cI_m$ then for all $n\ge 1$, $I_{\mu^n}\ge cI_{m^n}$.
It is important to study the products of such probability measures on $\mathbb{R}$, since they have a minimal behavior under product. We start with the Gaussian measure, which is best understood.
3.1. Gaussian measure. We denote by $\gamma$ the standard Gaussian distribution on the real line,
$$d\gamma(t) = e^{-t^2/2}\,\frac{dt}{\sqrt{2\pi}},\qquad t\in\mathbb{R}.$$
Its $n$-fold product $\gamma^n$ is the standard normal law on $\mathbb{R}^n$. The isoperimetric problem for the Euclidean space with Gaussian measure was solved by Sudakov and Tsirel’son [45] and Borell [15].
Theorem 3.3. Half-spaces solve the isoperimetric problem in $(\mathbb{R}^n,|\cdot|,\gamma^n)$.
In particular half-lines $(-\infty, t]$ solve the problem on the real line. In higher dimensions, the corresponding cylinders $\{x\in\mathbb{R}^n;\ x_1\le t\}$ are isoperimetric sets. Since they have the same Gaussian measure and boundary measure as $(-\infty, t]$, the isoperimetric profile $I_{\gamma^n} = I_\gamma$ does not depend on $n$ (one can check that $I_\gamma(t)\sim t\sqrt{2\log(1/t)}$ as $t\to 0$). The above property actually characterizes the Gaussian: if a symmetric probability measure $\mu$ on $\mathbb{R}$ is such that the half-spaces $\{(x,y);\ x\le t\}$ solve the isoperimetric question for $\mu^2$ then $\mu$ is Gaussian [12, 29, 39]. As a consequence of the previous two results, one recovers a statement first proved by functional methods in [8], which started these developments: if $I_\mu\ge cI_\gamma$ then $I_{\mu^n}\ge cI_{\gamma^n} = cI_\gamma$. Let us give a surprising application of this fact, taken from [4]. One can check that the isoperimetric function of the sphere $(S^n, d, \sigma_n)$ satisfies, for $t\in[0,1]$, $I_{S^n}(t)\ge c_nI_\gamma(t)$, where $c_n = \sqrt{2\pi}\,\mathrm{Vol}_{n-1}(S^{n-1})/\mathrm{Vol}_n(S^n)$ is such that there is equality for $t = 1/2$. It follows that $I_{S^n}(t)\ge I_{(S^n)^k}(t)\ge c_nI_\gamma(t)$.
For $t = 1/2$ the first and third terms coincide, therefore $I_{(S^n)^k}(1/2) = I_{S^n}(1/2)$. This solves the isoperimetric problem for measure 1/2 in a product of $k$ spheres. An optimal set is $S^n_+\times(S^n)^{k-1}$, where $S^n_+ = \{x\in\mathbb{R}^{n+1};\ x\in S^n \text{ and } x_1\le 0\}$ is a half-sphere. This principle may extend to products of compact manifolds. It is an illustration of the interplay we announced in the introduction: a probabilistic object (the Gaussian measure) helps in solving a purely geometric problem.
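Since half-lines $(-\infty, x]$ are Gaussian isoperimetric sets, the Gaussian profile can be written as $I_\gamma = \varphi\circ\Phi^{-1}$, where $\varphi$ and $\Phi$ denote the standard normal density and distribution function. The sketch below uses this to check the equality case $c_nI_\gamma(1/2) = I_{S^n}(1/2)$ in low dimensions and the small-$t$ asymptotics. The function names are hypothetical, and the Gamma-function expression for $c_n$ relies on the standard formula $\mathrm{Vol}_m(S^m) = 2\pi^{(m+1)/2}/\Gamma((m+1)/2)$, which is not stated in the text.

```python
import math
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()


def gaussian_profile(t: float) -> float:
    """I_gamma(t) = phi(Phi^{-1}(t)) on (0, 1), extended by 0 at the endpoints."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return _STANDARD_NORMAL.pdf(_STANDARD_NORMAL.inv_cdf(t))


def sphere_constant(n: int) -> float:
    """c_n = sqrt(2 pi) * Vol_{n-1}(S^{n-1}) / Vol_n(S^n), via Gamma functions."""
    return math.sqrt(2.0) * math.gamma((n + 1) / 2) / math.gamma(n / 2)


if __name__ == "__main__":
    # Equality case t = 1/2: c_n * I_gamma(1/2) is the normalized area of an equator.
    for n in (1, 2, 3):
        print(n, sphere_constant(n) * gaussian_profile(0.5))
    # Slow convergence of I_gamma(t) / (t * sqrt(2 log(1/t))) towards 1.
    for t in (1e-3, 1e-6, 1e-10):
        print(t, gaussian_profile(t) / (t * math.sqrt(2.0 * math.log(1.0 / t))))
```

For $n = 1$ the product $c_1I_\gamma(1/2)$ equals $1/\pi$, the normalized boundary measure of a half-circle, and for $n = 2$ it equals $1/2$, the length of an equator divided by the area of $S^2$.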
3.2. Isoperimetric dimension. By definition $(X, d, \mu)$ has isoperimetric dimension (at most) $n\in[1,\infty)$ if there exists $c > 0$ such that
$$I_\mu\ \ge\ cI_{S^n} = cI_{(\mathbb{R},\,|\cdot|,\,c_n\cos(t)^{n-1}\mathbf{1}_{|t|\le\pi/2}\,dt)}.$$
This essentially means that $I_\mu(t)$ is larger than $kt^{1-1/n}$ when $t\in[0,\varepsilon]$ and is bounded away from zero on $[\varepsilon, 1/2]$ for some $\varepsilon$. By our comparison theorem, if $\mu$ has isoperimetric dimension $n$ then for $k\ge 1$
$$I_{\mu^k}\ \ge\ cI_{(\mathbb{R}^k,\,|\cdot|,\,\bigotimes_{i=1}^k c_n\cos(x_i)^{n-1}\mathbf{1}_{|x_i|\le\pi/2}\,dx_i)}.$$
Thus we have to understand the isoperimetric problem on $\mathbb{R}^k$ with the above explicit product density. Finding an exact solution is probably very hard. However good estimates can be obtained. Just as the Brunn-Minkowski inequality implies the Euclidean isoperimetric inequality, suitable extensions of Inequality (1.1) provide isoperimetric inequalities for measures on $\mathbb{R}^k$ with $\beta$-concave density:
Theorem 3.4 ([5]). Let $\mu$ be a probability measure on $\mathbb{R}^k$. Assume that $\mathrm{support}(\mu)\subset B_2^k$ and that $d\mu(x) = c(x)^\alpha dx$, $x\in\mathbb{R}^k$, where the function $c$ is concave on its support, and $\alpha\ge 0$. Then for $t\in[0,1]$,
$$I_\mu(t)\ \ge\ \frac{d}{2}\Big(t^{1-\frac1d}+(1-t)^{1-\frac1d}-1\Big),$$
where $d = \alpha + k$.
In other words the isoperimetric dimension is the dimension of the space plus the power of the concave function in the density. Applying this result to the previous product measure on $\mathbb{R}^k$ when $\mu$ has isoperimetric dimension $n$, we obtain that $\mu^k$ has isoperimetric dimension $nk$ (and this cannot be improved in general: if $I_\mu(t)\le ct^{1-1/n}$ then considering product sets shows that $I_{\mu^k}(t)\le c_kt^{1-1/(nk)}$). On the other hand $I_\mu\ge cI_{S^n}\ge d_nI_\gamma$, so $I_{\mu^\infty}\ge d_nI_\gamma$. An argument based on the central limit theorem shows that, conversely, $I_{\mu^\infty}\le c_\mu I_\gamma$. Thus the limit of the sequence $(I_{\mu^k})_{k\ge 1}$ is comparable to the Gaussian isoperimetric function, and the speed of convergence is quantified in terms of isoperimetric dimension. In this case we have a good description of the sequence $I_{\mu^k}$. In the next section, we discuss what happens if from the start $I_\mu$ is smaller than the Gaussian isoperimetric function. In some sense this corresponds to infinite isoperimetric dimension.
3.3. Below the Gaussian. Although our argument may be extended to more general distributions, the main example is given by the probability measures $\nu_p$ on $\mathbb{R}$ defined for $p\in[1,2)$ by
$$d\nu_p(t) = c_pe^{-|t|^p}dt,\qquad t\in\mathbb{R}.$$
The first result in this case is due to Bobkov and Houdré [13]. It completes a very precise concentration inequality for the exponential measure by Talagrand [46], which had consequences for log-concave densities.
Theorem 3.5.
$$I_{\nu_1}\ \ge\ I_{\nu_1^\infty}\ \ge\ \frac{I_{\nu_1}}{2\sqrt6}.$$
Geometrically speaking this means that for all $n$, among subsets of $\mathbb{R}^n$ with prescribed measure for $\nu_1^n$, coordinate half-spaces $\{x\in\mathbb{R}^n;\ x_1\le t\}$ have almost minimal boundary measure (up to a factor $2\sqrt6$). As we have seen, these half-spaces are not exact minimizers (otherwise the measure would be Gaussian). The proof of this striking result relies on a Sobolev type inequality on $(\mathbb{R},\nu_1)$ which has the tensorisation property (it implies the same inequality for $(\mathbb{R}^n,\nu_1^n)$ by induction) and yields an isoperimetric inequality. The inequality reads as follows: for every Lipschitz function $f$ on $\mathbb{R}$ with median 0 under $\nu_1$, one has
$$\sqrt{1+\Big(\int f\,d\nu_1\Big)^2}\ \le\ \int\sqrt{1+C(f')^2}\,d\nu_1,$$
where $C$ is a numerical constant. This inequality has common features with a functional version of the Gaussian isoperimetric inequality proposed by Bobkov [9]: for every Lipschitz function $f : \mathbb{R}\to[0,1]$ one has
$$I_\gamma\Big(\int f\,d\gamma\Big)\ \le\ \int\sqrt{I_\gamma(f)^2+(f')^2}\,d\gamma.$$
In particular they both involve a mixture of $L^1$ and $L^2$ behaviors. In view of the results for the Gaussian and the exponential measure, a similar isoperimetric inequality for intermediate measures was conjectured. A natural approach would be to interpolate between the above two inequalities. No success came from this direction. In the rest of the paper we present a softer method which allowed us to prove the following
Theorem 3.6 ([7]). There exists a universal constant $K$ such that for all $p\in(1,2)$,
$$I_{\nu_p}\ \ge\ I_{\nu_p^\infty}\ \ge\ \frac{I_{\nu_p}}{K}.$$
3.4. Analytic techniques. Let $\mu$ be a probability measure on $\mathbb{R}^n$ with $d\mu(x) = e^{-V(x)}dx$ where $V$ is regular and has at least linear growth at infinity. The corresponding $\mu$-Laplacian operator is $L = \Delta-\nabla V\cdot\nabla$. It generates a semigroup $(P_t^\mu)_{t\ge 0}$, known as the heat semigroup of $\mu$. For suitable functions $f$ on $\mathbb{R}^n$, $u(t,x) = P_t^\mu f(x)$ is the solution of the PDE $\partial_tu = Lu$ with initial condition $u(0,x) = f(x)$ for all $x\in\mathbb{R}^n$. Moreover $P_t^\mu$ is self-adjoint in $L^2(\mu)$ and is ergodic.
Inspired by differential geometry techniques, Bakry and Ledoux established a connection between isoperimetry and the integrability improving properties of the Heat semigroup.
A special case of their result is given in
Theorem 3.7 ([2]). If $V''\ge 0$ (the Hessian of $V$, as symmetric operators) then for every Borel set $A\subset\mathbb{R}^n$ and all positive $t$, one has
$$\mu_s(\partial A)\ \ge\ \frac{\mu(A)-\|P^\mu_{t/2}\mathbf{1}_A\|^2_{L^2(\mu)}}{\sqrt{2t}}.$$
Here $\mathbf{1}_A$ is the characteristic function of the set $A$.
Sobolev inequalities for $\mu$ are a convenient tool to understand semigroup properties. Let us give two examples: If $\mu$ satisfies the following Poincaré (or spectral gap) inequality: for smooth functions $f$ with $\int f\,d\mu = 0$, one has
$$\lambda\int f^2\,d\mu\ \le\ \int|\nabla f|^2\,d\mu = \int f\cdot(-Lf)\,d\mu,$$
then the spectrum of $L$ is included in $(-\infty,-\lambda]\cup\{0\}$ (zero corresponds to constant functions). Therefore on functions with mean zero one has $\|P_t^\mu f\|_{L^2(\mu)}\le e^{-\lambda t}\|f\|_{L^2(\mu)}$. As noted by Ledoux [30], combining this inequality with the latter theorem and choosing properly the time parameter yields $\mu_s(\partial A)\ge c\sqrt\lambda\,\min(\mu(A), 1-\mu(A))$, that is $I_\mu\ge c\sqrt\lambda\,I_{\nu_1}$. Combining these observations provides a new soft proof of the Bobkov-Houdré isoperimetric inequality. It is easy to see that (a smooth log-concave perturbation of) the exponential measure $\nu_1$ satisfies a Poincaré inequality. This inequality has the tensorisation property, so the same spectral gap inequality holds for $\nu_1^n$. By the previous remarks this implies a dimension free isoperimetric inequality $I_{\nu_1^n}\ge I_{\nu_1}/K$. A similar reasoning is available for the Gaussian measure. It satisfies a logarithmic Sobolev inequality (see, e.g., [21, 31]): for all $f : \mathbb{R}^n\to\mathbb{R}$,
$$\int f^2\log(f^2)\,d\gamma^n-\Big(\int f^2\,d\gamma^n\Big)\log\Big(\int f^2\,d\gamma^n\Big)\ \le\ 2\int|\nabla f|^2\,d\gamma^n.$$
The celebrated theorem of Gross then ensures that the semigroup is hypercontractive, see [21] (for the Gaussian measure this property of the Ornstein-Uhlenbeck semigroup was discovered by Nelson). Namely for all $p > 1$, $t > 0$ and $f : \mathbb{R}^n\to\mathbb{R}$, one has
$$\|P_t^{\gamma^n}f\|_{L^q(\gamma^n)}\ \le\ \|f\|_{L^p(\gamma^n)}$$
when $q\le 1+(p-1)e^{2t}$. Ledoux also showed that hypercontractivity may be combined with the above Theorem 3.7 in order to recover the Gaussian isoperimetric inequality, up to a constant.
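The step “choosing properly the time parameter” in the spectral gap example can be made explicit. The following is a sketch of one way to combine Theorem 3.7 (so under its hypothesis on $V$) with the exponential decay in $L^2(\mu)$; the abbreviation $a = \mu(A)$, the choice $t = 1/\lambda$ and the resulting numerical constant are ours for illustration, and are not claimed to be optimal or to be those of [30].

```latex
% Sketch: Theorem 3.7 + Poincare inequality => Cheeger-type isoperimetric bound.
% Notation (ours): a = \mu(A). The choice t = 1/\lambda is an illustrative assumption.
% Uses: P^\mu_{t/2} fixes constants and preserves the mean, and
%       \|\mathbf 1_A - a\|_{L^2(\mu)}^2 = a(1-a).
\begin{align*}
\|P^\mu_{t/2}\mathbf 1_A\|_{L^2(\mu)}^2
 &= a^2+\|P^\mu_{t/2}(\mathbf 1_A-a)\|_{L^2(\mu)}^2
 \ \le\ a^2+e^{-\lambda t}\,a(1-a),\\
\mu(A)-\|P^\mu_{t/2}\mathbf 1_A\|_{L^2(\mu)}^2
 &\ge\ a(1-a)\big(1-e^{-\lambda t}\big),\\
\mu_s(\partial A)
 &\ge\ \frac{1-e^{-\lambda t}}{\sqrt{2t}}\,a(1-a)
 \ \overset{t=1/\lambda}{=}\ \frac{1-e^{-1}}{\sqrt2}\,\sqrt\lambda\,a(1-a)
 \ \ge\ \frac{1-e^{-1}}{2\sqrt2}\,\sqrt\lambda\,\min(a,1-a).
\end{align*}
```

The last step only uses $a(1-a)\ge\frac12\min(a,1-a)$.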
Let us mention that Wang also investigated isoperimetric consequences of semigroup properties with other motivations, see, e.g., [50]. In a joint work with Cattiaux and Roberto [7], isoperimetric inequalities for $\nu_p^n$ are derived from this semigroup approach. It turns out that for $p < 2$, the semigroup $P_t^{\nu_p}$ is never continuous from $L^2(\nu_p)$ to $L^{2+\varepsilon}(\nu_p)$. Improving properties happen on a finer scale:
Theorem 3.8. For all $p\in(1,2)$, $n\ge 1$, $t > 0$ and $f : \mathbb{R}^n\to\mathbb{R}$, one has
$$\big\|P_t^{\nu_p^n}f\big\|_{L^{\Psi_{c_1t}}(\nu_p^n)}\ \le\ e^{c_2t}\,\|f\|_{L^2(\nu_p^n)},$$
where $\Psi_q(x) = x^2\exp\big(q\log(1+x^2)^{2-\frac2p}\big)$ and the Orlicz norm is defined by
$$\|g\|_{L^\Psi(\mu)} = \inf\Big\{\lambda > 0;\ \int\Psi\Big(\frac{|g|}{\lambda}\Big)d\mu\le\Psi(1)\Big\}.$$
Note that the exponent $2-2/p$ is strictly less than 1, so that our Orlicz spaces are larger than $L^{2+\varepsilon}$ for $\varepsilon > 0$. The above result is proved by time differentiation. One shows that the ratio of the left-hand side to the right-hand side is non-increasing thanks to the following Sobolev type inequality: for all $p\in(1,2)$ and every smooth $f : \mathbb{R}\to\mathbb{R}$, one has
$$\int f^2\log^{2-\frac2p}(1+f^2)\,d\nu_p-\Big(\int f^2\,d\nu_p\Big)\log^{2-\frac2p}\Big(1+\int f^2\,d\nu_p\Big)\ \le\ C\int(f')^2\,d\nu_p, \tag{3.1}$$
where $C$ is a universal constant. This inequality is a tight version of a result of Rosen [43], who was interested in embeddings. Our version is exact for constant functions, and has the tensorisation property. Hence a similar inequality holds for $\nu_p^n$ with a constant independent of $n$. On the real line, the inequality can be proved by level-set decompositions. This follows earlier works, starting with [11], where it was realized that, given a probability measure on the real line, the best constant (possibly infinite) for which it satisfies a logarithmic Sobolev inequality can be computed up to a numerical factor, recently reduced to 16. In our case, if $d\mu(x) = \rho_\mu(x)dx$ is a probability measure on $\mathbb{R}$ with median $m$, then the best constant $C$ for which $\mu$ satisfies an inequality like (3.1) is comparable to $\max(B_+, B_-)$ where
$$B_+ = \sup_{x>m}\ \mu([x,+\infty))\,\log^{2-\frac2p}\Big(1+\frac{1}{\mu([x,+\infty))}\Big)\int_m^x\frac{dt}{\rho_\mu(t)},$$
and $B_-$ is a similar quantity involving half-lines $(-\infty, x]$ for $x < m$. This criterion is very efficient in practice as the quantities involved are easily estimated. This allows us to reach a much wider class of measures than $\nu_p$. We do not give details on the proof here, but let us note that
$$\Big(\int_m^x\frac{1}{\rho_\mu}\Big)^{-1} = \inf\Big\{\int_m^x(f')^2\,d\mu;\ f(m) = 0 \text{ and } f(t) = 1,\ \forall t\ge x\Big\}$$
is the $\mu$-capacity of $[x,+\infty)$ with respect to $[m,+\infty)$. So the above result is an adaptation to the probabilistic setting of Maz’ja’s capacity theory, see [35, 7]. The results of this section can be extended to more general distributions. The main conclusion is that there exists a wide range of probability measures, between the Gaussian and exponential measures, for which $I_{\mu^\infty}$ is comparable to $I_\mu$. In this case the sequence of isoperimetric functions $(I_{\mu^n})_{n\ge 1}$ is approximately constant. This provides a family of comparison theorems of the following kind: let $\alpha\in[0,1/2]$, and let $(X, d, \mu)$ be a Riemannian manifold with an absolutely continuous probability measure; if for all $t\in[0,1]$, $I_\mu(t)\ge c\min(t,1-t)\log^\alpha(1/\min(t,1-t))$, then for all $n\ge 1$,
$$I_{\mu^n}(t)\ \ge\ \frac{c}{K}\min(t,1-t)\log^\alpha\Big(\frac{1}{\min(t,1-t)}\Big),\qquad t\in[0,1].$$
The reader has probably noticed that the above arguments are restricted to measures which have exponential (or faster) tails: we considered $\nu_p$ for $p\ge 1$ only. Talagrand observed that a dimension free concentration for a probability measure on $\mathbb{R}$ implies that its tails are exponential (or faster) [46]. It follows that for measures with heavier tails $I_{\mu^\infty}$ has to be zero. In the preprint [6], we provide quantitative estimates on the decay to zero of the sequence $(I_{\mu^n})_{n\ge 1}$ when $I_\mu(t)\ll t$ for $t$ small.
4. Perspectives
By now, we have a rather good understanding of the isoperimetric behavior of product measures with density with respect to the Riemannian volume, provided one is not interested in exact but in up-to-constant solutions. Other natural distributions are still a challenge. Recent progress has been made for product Markov chains, or Gibbs measures for example, see, e.g., [25, 51]. Let us point out a very nice open question from convex geometry, related to isoperimetric properties of half-spaces. Let $K\subset\mathbb{R}^n$ be an origin symmetric convex compact set with non-empty interior and with $\mathrm{Vol}_n(K) = 1$.
It is called isotropic if there exists $L_K > 0$ such that for all $\theta\in S^{n-1}$,
$$\int_K\langle x,\theta\rangle^2\,dx = L_K^2.$$
In other words the covariance of the uniform distribution on $K$ is a multiple of the identity. Hensley showed that for every unit vector $u\in S^{n-1}$, $\mathrm{Vol}_{n-1}(u^\perp\cap K)$ is comparable to $1/L_K$, up to universal constants [24]. Denoting by $\lambda_K$ the Lebesgue measure restricted to $K$ and $H_u = \{x\in\mathbb{R}^n;\ \langle x,u\rangle\ge 0\}$, this can be rephrased in terms of boundary measure: for every $u\in S^{n-1}$, one has $\lambda_K(H_u) = 1/2$ and $(\lambda_K)_s(\partial H_u)\ge c/L_K$. Kannan, Lovász and Simonovits [27] proved that for every set $A\subset\mathbb{R}^n$ with $\lambda_K(A) = 1/2$, one has
$$(\lambda_K)_s(\partial A)\ \ge\ \frac{c}{\sqrt n\,L_K}.$$
So among all sets which meet $K$ with measure 1/2, half-spaces have minimal boundary measure (inside $K$) up to this factor $\sqrt n$. This statement is very similar to the previous results on the distributions $\nu_p^n$, apart from the dependence on the dimension. It turns out that removing, or improving, this dependence on $n$ would have spectacular consequences in convex geometry. In particular this would imply a version of the central limit theorem for uniform distributions on convex sets (see, e.g., [1, 10]). This question is also crucial in randomized volume algorithms (see [26]), and, as K. Ball recently understood, it is related to the very challenging hyperplane problem on the uniform boundedness of $L_K$ (see [38] for an introduction).
References
[1] M. Anttila, K. Ball, and I. Perissinaki. The central limit problem for convex bodies. Trans. Amer. Math. Soc., 355(12):4723–4735, 2003.
[2] D. Bakry and M. Ledoux. Lévy-Gromov isoperimetric inequality for an infinite-dimensional diffusion generator. Invent. Math., 123:259–281, 1996.
[3] F. Barthe. On a reverse form of the Brascamp-Lieb inequality. Invent. Math., 134:335–361, 1998.
[4] F. Barthe. Extremal properties of central half-spaces for product measures. J. Funct. Anal., 182:81–107, 2001.
[5] F. Barthe. Log-concave and spherical models in isoperimetry. Geom. Funct. Anal., 12:32–55, 2002.
[6] F. Barthe, P. Cattiaux, and C. Roberto. Concentration for independent random variables with heavy tails. To appear in AMRX.
[7] F. Barthe, P. Cattiaux, and C. Roberto. Interpolated inequalities between exponential and Gaussian, Orlicz hypercontractivity and application to isoperimetry. To appear in Revista Math. Iberoamericana.
[8] F. Barthe and B. Maurey. Some remarks on isoperimetry of Gaussian type. Ann. Inst. H. Poincaré, Probabilités et Statistiques, 36(4):419–434, 2000.
[9] S.G. Bobkov. An isoperimetric inequality on the discrete cube, and an elementary proof of the isoperimetric inequality in Gauss space. Ann. Probab., 25(1):206–214, 1997.
[10] S.G. Bobkov. On concentration of distributions of random weighted sums. Ann. Probab., 31(1):195–215, 2003.
[11] S.G. Bobkov and F. Götze. Exponential integrability and transportation cost related to logarithmic Sobolev inequalities. J. Funct. Anal., 163:1–28, 1999.
[12] S.G. Bobkov and C. Houdré. Characterization of Gaussian measures in terms of the isoperimetric property of half-spaces. Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI), 228:31–38, 1996. (Russian).
[13] S.G. Bobkov and C. Houdré. Isoperimetric constants for product probability measures. Ann. Probab., 25(1):184–205, 1997.
[14] S.G. Bobkov and C. Houdré. Some connections between isoperimetric and Sobolev-type inequalities. Mem. Amer. Math. Soc., 129(616):viii+111, 1997.
[15] C. Borell. The Brunn-Minkowski inequality in Gauss space. Invent. Math., 30:207–216, 1975.
[16] S. Boucheron, G. Lugosi, and P. Massart. Concentration inequalities using the entropy method. Ann. Probab., 31(3):1583–1614, 2003.
[17] Y. Brenier. Polar factorization and monotone rearrangement of vector-valued functions. Comm. Pure Appl. Math., 44:375–417, 1991.
[18] D. Cordero-Erausquin, R.J. McCann, and M. Schmuckenschläger. A Riemannian interpolation inequality à la Borell, Brascamp and Lieb. Invent. Math., 146(2):219–257, 2001.
[19] R.J. Gardner. The Brunn-Minkowski inequality. Bull. Amer. Math. Soc. (N.S.), 3:355–405, 2002.
[20] M. Gromov. Metric structures for Riemannian and non-Riemannian spaces, volume 152 of Progress in Mathematics. Birkhäuser Boston Inc., Boston, MA, 1999.
[21] L. Gross. Logarithmic Sobolev inequalities and contractivity properties of semigroups. In Dirichlet forms (Varenna, 1992), volume 1563 of Lecture Notes in Math., pages 54–88. Springer, Berlin, 1993.
[22] H. Hadwiger. Gitterperiodische Punktmengen und Isoperimetrie. Monatsh. Math., 76:410–418, 1972.
[23] L. Hauswirth, J. Pérez, P. Romon, and A. Ros. The periodic isoperimetric problem. Trans. Amer. Math. Soc., 356(5):2025–2047, 2004.
[24] D. Hensley. Slicing convex bodies – bounds for slice area in terms of body’s covariance. Proc. Amer. Math. Soc., 79(4):619–625, 1980.
[25] C. Houdré and P. Tetali. Isoperimetric invariants for product Markov chains and graph products. Combinatorica, 24(3):359–388, 2004.
[26] R. Kannan, L. Lovász, and M. Simonovits. Random walks and an O∗(n^5) volume algorithm for convex bodies. Random Structures Algorithms, 11(1):1–50, 1997.
[27] R. Kannan, L. Lovász, and M. Simonovits. Isoperimetric problems for convex bodies and a localization lemma. Discrete Comput. Geom., 13(3-4):541–559, 1995.
[28] H. Knothe. Contributions to the theory of convex bodies. Michigan Math. J., 4:39–52, 1957.
[29] S. Kwapień, M. Pycia, and W. Schachermayer. A proof of conjecture of Bobkov and Houdré. Electron. Comm. Probab., 1:no. 2, 7–10 (electronic), 1996.
[30] M. Ledoux. A simple analytic proof of an inequality by P. Buser. Proc. Amer. Math. Soc., 121(3):951–959, 1994.
[31] M. Ledoux. Concentration of measure and logarithmic Sobolev inequalities. In Séminaire de Probabilités, XXXIII, number 1709 in Lecture Notes in Math., pages 120–216, Berlin, 1999. Springer.
[32] M. Ledoux. The geometry of Markov diffusion generators. Ann. Fac. Sci. Toulouse Math. (6), 9(2):305–366, 2000.
[33] M. Ledoux. The concentration of measure phenomenon, volume 89 of Mathematical Surveys and Monographs. American Mathematical Society, Providence, RI, 2001.
[34] P. Lévy. Problèmes concrets d’analyse fonctionnelle. Gauthier-Villars, Paris, 1951.
[35] V.G. Maz’ja. Sobolev spaces. Springer-Verlag, 1985.
[36] R.J. McCann. A Convexity Theory for Interacting Gases and Equilibrium Crystals. PhD thesis, Princeton University, 1994.
[37] V. Milman and G. Schechtman. Asymptotic Theory of Finite-Dimensional Normed Spaces. Number 1200 in Lecture Notes in Math. Springer Verlag, 1986.
[38] V.D. Milman and A. Pajor. Isotropic position and inertia ellipsoids and zonoids of the unit ball of a normed n-dimensional space. In Geometric Aspects of Functional Analysis, number 1376 in Lecture Notes in Math., pages 64–104. Springer, 1989.
[39] K. Oleszkiewicz. On certain characterization of normal distribution. Statist. Probab. Lett., 33(3):277–280, 1997.
[40] R. Pedrosa and M. Ritoré. Isoperimetric domains in the Riemannian product of a circle with a simply connected space form and applications to free boundary problems. Indiana Univ. Math. J., 48(4):1357–1394, 1999.
[41] M. Ritoré and A. Ros. Stable constant mean curvature tori and the isoperimetric problem in three space forms. Comment. Math. Helv., 67(2):293–305, 1992.
[42] A. Ros. The isoperimetric problem. http://www.ugr.es/~aros/isoper.htm, 2001.
[43] J. Rosen. Sobolev inequalities for weight spaces and supercontractivity. Trans. Amer. Math. Soc., 222:367–376, 1976.
[44] E. Schmidt. Die Brunn-Minkowskische Ungleichung und ihr Spiegelbild sowie die isoperimetrische Eigenschaft der Kugel in der euklidischen und nichteuklidischen Geometrie I, II. Math. Nachr., 1:81–157, 1948. 2:171–244, 1949.
[45] V.N. Sudakov and B.S. Tsirel’son. Extremal properties of half-spaces for spherically invariant measures. J. Soviet Math., 9:9–18, 1978. Translated from Zap. Nauchn. Sem. Leningrad. Otdel. Math. Inst. Steklova. 41 (1974) 14–24.
[46] M. Talagrand. A new isoperimetric inequality and the concentration of measure phenomenon. In J. Lindenstrauss and V. D. Milman, editors, Geometric Aspects of Functional Analysis, number 1469 in Lecture Notes in Math., pages 94–124, Berlin, 1991. Springer-Verlag.
[47] M. Talagrand. Concentration of measure and isoperimetric inequalities in product spaces. Inst. Hautes Études Sci. Publ. Math., 81:73–205, 1995.
[48] G. Talenti. The standard isoperimetric theorem. In Handbook of convex geometry, Vol. A, B, pages 73–123. North-Holland, Amsterdam, 1993.
[49] C. Villani. Topics in optimal transportation, volume 58 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 2003.
[50] F.Y. Wang. A generalization of Poincaré and log-Sobolev inequality. Potential analysis, 22:1–15, 2005.
[51] B. Zegarlinski. Isoperimetry for Gibbs measures. Ann. Probab., 29(2):802–819, 2001.

F. Barthe
Institut de Mathématiques
Laboratoire de Statistique et Probabilités
UMR C 5583, Université Toulouse III
118 route de Narbonne
F-31062 Toulouse cedex 04, France
e-mail: [email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
Symplectic Topology and Algebraic Families

Paul Biran

Abstract. In this paper we outline a recent direction of research interrelating symplectic geometry and algebraic geometry. We show how methods and ideas from symplectic geometry can be used to study classical algebro-geometric problems on hyperplane sections and degenerations of algebraic varieties.
1. Hyperplane sections

Let $X$ be a smooth projective variety. A subvariety $\Sigma \subset X$ is called a hyperplane section of $X$ if there exist a projective embedding $X \subset \mathbb{CP}^N$ and a hyperplane $H \subset \mathbb{CP}^N$ transverse to $X$ such that $\Sigma = X \cap H$. Hyperplane sections play an important role in algebraic geometry, as they are “responsible” for projective embeddings of $X$ and also because problems on $X$ can often be studied by considering the geometry of $\Sigma$. The following questions naturally appear in the study of projective embeddings: Given a pair of smooth varieties $X$ and $\Sigma$, can $\Sigma$ appear as a hyperplane section of some projective embedding of $X$? Moving the attention to $\Sigma$, one may also ask: Which algebraic varieties $X$ may contain $\Sigma$ as one of their hyperplane sections?

1.1. Classical restrictions. Restrictions on the possible pairs $(X, \Sigma)$ go back to Lefschetz, who discovered that there exist intimate relations between $\Sigma$ and $X$, both topological and algebro-geometric. For example, the inclusion $\Sigma \subset X$ induces isomorphisms
$$H_i(\Sigma; \mathbb{Z}) \to H_i(X; \mathbb{Z}), \qquad \pi_i(\Sigma) \to \pi_i(X), \qquad \text{for every } i < \dim_{\mathbb{C}} \Sigma.$$
A typical algebro-geometric relation is the isomorphism $\operatorname{Pic}(X) \to \operatorname{Pic}(\Sigma)$, valid whenever $\dim_{\mathbb{C}} \Sigma > 2$, which comes from restricting line bundles from $X$ to $\Sigma$. We refer the reader to [23, 15] for more Lefschetz-type relations.

1.2. Modern restrictions. In a series of papers starting from 1976 (see, e.g., [23, 24]) Sommese established surprising restrictions going far beyond the Lefschetz type. For example:

The author was supported by the Israel Science Foundation (grant No. 205/02 *).
Theorem 1.1 (Sommese [23]).
(1) There exist varieties $\Sigma$ that cannot be hyperplane sections in any smooth variety $X$. For example, any Abelian variety $\Sigma$ of (complex) dimension $\geq 2$ is such a variety.
(2) If a product of two smooth varieties $\Sigma = \Sigma_1 \times \Sigma_2$ is a hyperplane section in a smooth variety, then one of the factors $\Sigma_1$ or $\Sigma_2$ is one-dimensional.

The main techniques in Sommese's work are in the framework of algebraic geometry and complex analysis. Note that the assumption on the smoothness of $X$ is crucial, since any projective variety $\Sigma$ is a hyperplane section of the cone $X$ over $\Sigma$. Finally, let us remark that Sommese in fact proved a slightly stronger statement. Namely, he proved in [23] that the varieties appearing in Theorem 1.1 cannot even be ample divisors in any smooth $X$.

2. Hyperplane sections and Lagrangian spheres

We shall now outline an alternative approach, based on symplectic topology, to the problems of the previous section. Given a projectively embedded variety $X \subset \mathbb{CP}^N$ we denote by $X^\vee \subset (\mathbb{CP}^N)^*$ its dual variety, consisting of all hyperplanes $H \subset \mathbb{CP}^N$ which are somewhere non-transverse to $X$:
$$X^\vee = \{H \in (\mathbb{CP}^N)^* \mid H \text{ is not transverse to } X\}.$$
The dual variety $X^\vee$ is typically a hypersurface in $(\mathbb{CP}^N)^*$ (usually singular), but in special situations its codimension might be larger than 1. We define the defect of $X \subset \mathbb{CP}^N$ to be $\operatorname{def}(X) = \operatorname{codim}_{\mathbb{C}} X^\vee - 1$. Varieties with positive defect are sometimes called varieties with small dual.

Theorem 2.1. Let $X \subset \mathbb{CP}^N$ be a projectively embedded smooth variety and $\Sigma = X \cap H$ be a hyperplane section. Then at least one of the following holds:
(1) $\operatorname{def}(X) > 0$.
(2) $\Sigma$ has a Lagrangian sphere, when viewed as a symplectic manifold endowed with the symplectic structure induced from $\mathbb{CP}^N$.

Outline of the proof. Suppose that $\operatorname{codim}_{\mathbb{C}}(X^\vee) = 1$. Choose a generic line $\ell \subset (\mathbb{CP}^N)^*$ intersecting $X^\vee$ transversely (and only at smooth points of $X^\vee$). Consider the pencil $\{X \cap H\}_{H \in \ell}$ parametrized by $\ell$. Passing to the blow-up $\widetilde{X}$ of $X$ along the base locus of the pencil we obtain a holomorphic map $\pi : \widetilde{X} \to \ell \approx \mathbb{CP}^1$. The critical values of $\pi$ are in 1-1 correspondence with the points of $\ell \cap X^\vee$. Moreover, the fact that $\ell$ intersects $X^\vee$ transversely implies that $\pi$ is a so-called Lefschetz fibration, namely each critical point of $\pi$ has non-degenerate (complex) Hessian (in other words, locally $\pi$ looks like a holomorphic Morse function). The condition $\operatorname{codim}_{\mathbb{C}}(X^\vee) = 1$ ensures that $\ell \cap X^\vee \neq \emptyset$, hence at least one of the fibres of $\pi$ is singular. Let $X_0$ be such a fibre and $p \in X_0$ a critical point of $\pi$. The important point now is that the vanishing cycle (corresponding to $p$) that lies in a nearby smooth fibre can be represented
by a (smooth) Lagrangian sphere. By Moser's argument, all the smooth divisors in the linear system $\{X \cap H\}_{H \in (\mathbb{CP}^N)^*}$ are symplectomorphic. In particular $\Sigma$ has a Lagrangian sphere too.

The existence of Lagrangian vanishing cycles had long been known as folklore. Its importance to symplectic geometry was realized by Arnold [1], Donaldson [8], and Seidel [20]. Using Theorem 2.1 we can apply techniques from symplectic topology to rule out the “typical” situation $\operatorname{def}(X) = 0$. Namely, in various situations we can show that a given variety $\Sigma$ cannot have Lagrangian spheres. Theorem 2.1 then implies that if $\Sigma$ is a hyperplane section in some $X$ then necessarily $X$ has a small dual. In this case we can use the wealth of results of the theory of varieties with small duals (see, e.g., [18, 9, 10, 26]) to obtain further information on $X$. Let us remark that for a variety $X$ to have “small dual” is a very restrictive condition. Moreover, in low dimension (up to 5) such varieties have been completely classified by Ein [9, 10]. In the next two subsections we present applications of Theorem 2.1.

2.1. Lagrangian spheres. The following theorem exhibits examples of symplectic manifolds that do not contain any Lagrangian spheres.

Theorem 2.2 (See [3, 4, 5]). None of the following projective varieties $\Sigma$ has a Lagrangian sphere when endowed with any symplectic structure compatible with its complex structure:
(1) $\Sigma$ = smooth projective variety with $\dim_{\mathbb{C}} \Sigma \geq 2$, $K_\Sigma = 0$ and $b_1(\Sigma) \neq 0$.
(2) $\Sigma$ = smooth projective variety whose universal cover is a Stein domain in $\mathbb{C}^n$, $n \geq 2$.
(3) $\Sigma = \mathbb{CP}^n \times C$, where $C$ is an algebraic curve with genus $\geq 1$. More generally, $\Sigma = Y \times C$ where $Y$ is a smooth Fano variety with Fano index larger than $\frac{1}{2} \dim_{\mathbb{C}} Y + 1$.
(4) $\Sigma = Y \times \mathbb{CP}^1$, where $Y$ is any smooth variety with $\pi_2(Y) = 0$ and $\dim_{\mathbb{C}} Y \not\equiv 2 \pmod 4$. More generally, $\Sigma = Y \times \mathbb{CP}^n$ whenever $\dim_{\mathbb{C}} Y \not\equiv n + 1 \pmod{2n + 2}$.
The proof of this theorem uses homological computations in the framework of Floer theory for Lagrangian submanifolds [11, 19, 12], as well as geometric techniques developed in [3]. We refer the reader to [3, 4] for the proofs.

2.2. Old and new restrictions via symplectic topology. Reconsider the case of an Abelian variety Σ of dimension ≥ 2 mentioned in Theorem 1.1. Let us explain, from a symplectic perspective, why Σ cannot be a hyperplane section in any smooth X. By Theorem 2.2 (statement 1 or 2) Σ does not contain any Lagrangian spheres¹.

¹ Note that in the case of Abelian varieties this also follows from Gromov's theorem on nonexistence of exact Lagrangians in $\mathbb{C}^n$, see [17].

Therefore, if Σ ⊂ X is a hyperplane section then
necessarily $\operatorname{def}(X) > 0$. By results of Kleiman [18], it follows that $X$ contains rational curves, hence $\pi_2(X) \neq 0$. By the Lefschetz theorem $\pi_2(\Sigma) \neq 0$ too, which is a contradiction, $\Sigma$ being an Abelian variety. (The algebro-geometric proof is not more complicated, but it uses completely different methods, e.g., Kodaira's vanishing theorem.) Similarly, using Theorem 2.2 one can prove:

Theorem 2.3. None of the following algebraic varieties $\Sigma$ can be a hyperplane section in any smooth variety:
(1) $\Sigma$ = any smooth variety with $\dim_{\mathbb{C}} \Sigma \geq 2$, $K_\Sigma = 0$ and $b_1(\Sigma) \neq 0$.
(2) $\Sigma$ = any smooth variety with $\dim_{\mathbb{C}} \Sigma \geq 2$, whose universal cover is a Stein domain in $\mathbb{C}^n$.

Ruling out $\operatorname{def}(X) > 0$ in the first statement of the theorem uses slightly more involved arguments from Ein's theory of small dual varieties [9, 10]. We refer the reader to [4] for the proof and for more results in this direction.

Consider now the case of products $\Sigma = \Sigma_1 \times \Sigma_2$. Recall that by Sommese's Theorem 1.1, $\Sigma$ cannot be a hyperplane section unless one of the factors $\Sigma_i$ is one-dimensional. Examples due to Fujita [13] and Silva [22] indeed realize some varieties of this type as hyperplane sections. For example, $\Sigma = \mathbb{CP}^m \times C$, where $C$ is an algebraic curve, can always be realized as a hyperplane section (see [22]). Using our approach we can describe in which $X$ such a $\Sigma$ can be a hyperplane section when $\operatorname{genus}(C) \geq 1$. Indeed, by Theorem 2.2, $\Sigma = \mathbb{CP}^m \times C$ cannot have Lagrangian spheres if $\operatorname{genus}(C) \geq 1$. Therefore, if $\Sigma$ is a hyperplane section in $X$, then $\operatorname{def}(X) > 0$. Using this and Ein's theory of varieties with small dual [9, 10] we prove:

Theorem 2.4. Let $\Sigma = \mathbb{CP}^m \times C \subset X$ be a hyperplane section where $C$ is a curve with $\operatorname{genus}(C) \geq 1$. Then $X$ is a scroll, i.e., $X = \mathbb{P}(E)$ for some vector bundle $E \to C$, and all the fibres of $X = \mathbb{P}(E)$ are embedded in $\mathbb{CP}^N$ linearly. Moreover $\operatorname{def}(X) = m$.

Similarly we prove:

Theorem 2.5. Suppose $\Sigma = Y^m \times \mathbb{CP}^1 \subset X$ is a hyperplane section, where $\pi_2(Y) = 0$, $m = \dim_{\mathbb{C}} Y > 0$. If $m \not\equiv 2 \pmod 4$ then $m = 1$ (i.e., $Y$ is a curve), $\operatorname{def}(X) = 1$, and $X$ is a $\mathbb{P}^2$-bundle over $Y$, embedded in $\mathbb{CP}^N$ as a scroll.

Proofs of Theorems 2.4 and 2.5 can be found in [4].

3. Degenerations of algebraic varieties

The methods described in the previous sections can be used to study degenerations of algebraic varieties. By a degeneration we mean a proper holomorphic map $\pi : W \to D$ from a Kähler manifold $W$ to the unit disc $D \subset \mathbb{C}$ with the following properties:
(1) Every $0 \neq t \in D$ is a regular value of $\pi$. Hence the fibres $W_t = \pi^{-1}(t)$ over $t \neq 0$ are compact Kähler manifolds.
(2) $t = 0$ is a singular value of $\pi$.

3.1. The case of isolated singularities. We say that a smooth variety $V$ can be included in a degeneration with isolated singularities if there exists a degeneration $\pi : W \to D$ with the following properties:
(1) All the critical points of $\pi$ (lying in $W_0$) are isolated.
(2) $V$ is biholomorphic to one of the smooth fibres $W_{t_0}$ of $\pi$, for some $0 \neq t_0 \in D$.

Example 3.1. Let $V = X \cap H$ be a hyperplane section of a projectively embedded smooth variety $X$. As we have seen in the proof of Theorem 2.1, if $\operatorname{codim}_{\mathbb{C}} X^\vee = 1$, then $V$ can be included in a degeneration with isolated singularities. A concrete example of this type is a smooth hypersurface $V \subset \mathbb{CP}^{n+1}$ of degree $d \geq 2$. One can obtain $V$ as a hyperplane section of $X = \mathbb{CP}^{n+1}$ by embedding $X$ into the projectivization of the space of degree $d$ homogeneous polynomials via the Veronese embedding (see [16]). A simple computation shows that $\operatorname{codim}_{\mathbb{C}} X^\vee = 1$ whenever $d \geq 2$.

As indicated in the proof of Theorem 2.1, given a degeneration $\pi : W \to D$ with isolated singularities, the smooth fibres of $\pi$, when viewed as symplectic manifolds, must contain Lagrangian spheres (see [1, 8, 20]). These spheres represent the vanishing cycles of the degeneration. Note that from a symplectic viewpoint all the smooth fibres of $\pi$ are symplectically equivalent. Note also that a degeneration with isolated singularities can be (symplectically) perturbed so that all the singularities become of quadratic type (i.e., ordinary double points). See [21, 4]. In view of this and Theorem 2.2 we obtain:

Theorem 3.2. None of the following algebraic varieties $V$ can be included into a degeneration with isolated singularities:
(1) $V$ = smooth projective variety with $\dim_{\mathbb{C}} V \geq 2$, $K_V = 0$ and $b_1(V) \neq 0$.
(2) $V$ = smooth projective variety whose universal cover is a Stein domain in $\mathbb{C}^n$, $n \geq 2$.
(3) $V = \mathbb{CP}^n \times C$, where $C$ is an algebraic curve with genus $\geq 1$. More generally, $V = Y \times C$ where $Y$ is a smooth Fano variety with Fano index larger than $\frac{1}{2} \dim_{\mathbb{C}} Y + 1$.
(4) $V = Y \times \mathbb{CP}^1$, where $Y$ is any smooth variety with $\pi_2(Y) = 0$ and $\dim_{\mathbb{C}} Y \not\equiv 2 \pmod 4$. More generally, $V = Y \times \mathbb{CP}^n$ whenever $\dim_{\mathbb{C}} Y \not\equiv n + 1 \pmod{2n + 2}$.

3.2. Non-isolated singularities. Let $V$, $S$ be smooth projective varieties. We say that $V$ can be included in a quadratic degeneration with an $S$-singularity if there exists a degeneration $\pi : W \to D$ with the following properties:
(1) The critical points of $\pi$ form a subvariety biholomorphic to $S$.
(2) The complex Hessian $\partial^2 \pi$ is non-degenerate when restricted to the normal bundle of $S$ in $W$.
(3) $V$ is biholomorphic to one of the smooth fibres $W_{t_0}$ of $\pi$, for some $0 \neq t_0 \in D$.

This type of degeneration can be thought of as the complex analogue of Morse-Bott functions. Such degenerations have been recently studied by Jerby [14] using symplectic methods. It turns out that in the case of non-isolated singularities the Lagrangian vanishing cycle is replaced by a bundle of vanishing cycles over the singular locus $S$. The fibres are isotropic spheres, and moreover, given any Lagrangian submanifold $L \subset S$, the total space of the restriction of this bundle to $L$ is a Lagrangian submanifold in $V$ (see [14] for the details). Floer homological computations concerning this Lagrangian submanifold yield restrictions on such degenerations. Before we describe Jerby's results, here is an example of a quadratic degeneration with a non-isolated singularity.

Example 3.3. Consider the pencil in $\mathbb{CP}^n$ spanned by the following two quadrics:
$$Q_0 = \{z_0^2 + \cdots + z_n^2 = 0\}, \qquad Q_1 = \{2z_2^2 + 3z_3^2 + \cdots + n z_n^2 = 0\}.$$
The quadric $Q_1$ is singular along the rational curve $S = \{z_2 = \cdots = z_n = 0\}$. Note that different divisors in this pencil do not intersect transversely. The lack of transversality is at two points (lying in $S$). Blowing up $\mathbb{CP}^n$ along the base locus of the pencil we obtain a pencil of quadrics that intersect (this time transversely) at these two points. Blowing up again at these two points we obtain a well-defined quadratic degeneration with a $\mathbb{CP}^1$-singularity whose smooth fibres are quadrics blown up at two points.

Given a Fano variety $V$, denote by $C_V$ the minimal Chern number of $V$, namely the positive generator of the subgroup $\{c_1^V(A) \mid A \in \pi_2(V)\} \subset \mathbb{Z}$, where $c_1^V = c_1(T(V)) \in H^2(V; \mathbb{Z})$ is the first Chern class of the tangent bundle of $V$.

Theorem 3.4 (Jerby [14]). Let $V$ be a smooth Fano variety with $h^{1,1}(V) = 1$. Assume that $\dim_{\mathbb{C}} V \geq 5$ and $C_V \geq 4$. If $V$ can be included in a quadratic degeneration with a $\mathbb{CP}^1$-singularity then $2C_V \mid \dim_{\mathbb{C}} V$.

We refer the reader to [14] for the proof and for more results in this direction.

3.2.1. An application to linear systems. Let $\mathcal{L}$ be a pencil of divisors on a smooth projective variety $X$. We say that $\mathcal{L}$ is regular if every two divisors in $\mathcal{L}$ intersect transversely. The pencil $\mathcal{L}$ is said to have quadratic singularities if for every singular divisor $D \in \mathcal{L}$, the singular locus $D_{\mathrm{sing}}$ is a smooth subvariety and at every singular point $p \in D_{\mathrm{sing}}$ we can write $D$ locally as the zero set $\{f = 0\}$ of a holomorphic function $f$ with non-degenerate complex Hessian in the normal direction to $D_{\mathrm{sing}}$. Finally, we denote $\mathcal{L}_{\mathrm{sing}} = \cup_{D \in \mathcal{L}} D_{\mathrm{sing}}$ and call it the singular locus of the pencil.
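The singular locus of the pencil of Example 3.3 can be checked by hand, and the following minimal sketch does the bookkeeping. It assumes the pencil is parametrized as $Q_0 + \lambda Q_1$ (with $Q_1$ itself sitting at $\lambda = \infty$) and uses the elementary fact that a diagonal quadric $\sum c_i z_i^2 = 0$ is singular exactly along the projective subspace where all coordinates with $c_i \neq 0$ vanish. The function names are illustrative and not taken from [14].

```python
from fractions import Fraction

def pencil_coefficients(n, lam):
    """Diagonal coefficients of Q0 + lam*Q1 in CP^n, where
    Q0 = z_0^2 + ... + z_n^2 and Q1 = 2 z_2^2 + 3 z_3^2 + ... + n z_n^2."""
    q0 = [1] * (n + 1)
    q1 = [0, 0] + list(range(2, n + 1))
    return [a + lam * b for a, b in zip(q0, q1)]

def singular_locus_dimension(coeffs):
    """Projective dimension of the singular locus of the diagonal quadric
    sum(c_i z_i^2) = 0: (number of zero coefficients) - 1.
    The value -1 means the quadric is smooth."""
    return sum(1 for c in coeffs if c == 0) - 1

n = 5
# Finite parameters with a singular member: 1 + lam*k = 0 for some k = 2..n.
singular_lams = [Fraction(-1, k) for k in range(2, n + 1)]
dims = [singular_locus_dimension(pencil_coefficients(n, lam))
        for lam in singular_lams]
# Q1 itself (the member at lam = infinity) has coefficients (0, 0, 2, ..., n).
q1_dim = singular_locus_dimension([0, 0] + list(range(2, n + 1)))
```

For $n = 5$ each finite singular member has a single singular point (dimension 0), while $Q_1$ is singular along a line (dimension 1), in agreement with the description of $S$ as a rational curve.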
It turns out that certain smooth algebraic varieties $\Sigma$ cannot participate in regular pencils containing some types of singularities. For example, consider the pencil $\mathcal{L}$ of quadrics of Example 3.3. Note that $\mathcal{L}_{\mathrm{sing}}$ contains a rational curve (and some isolated points). However, $\mathcal{L}$ is not a regular pencil, as a straightforward computation shows. This is not a coincidence. The following result was recently obtained by Jerby [14] as a corollary to Theorem 3.4.

Theorem 3.5 (Jerby [14]). Let $X$ be a smooth algebraic variety and $D \subset X$ a smooth effective divisor isomorphic to a hypersurface of degree $d$ in $\mathbb{CP}^{n+1}$, where $2d < n + 4$ and $n \geq 5$. Then there are no regular pencils $\mathcal{L} \subset |D|$ having a quadratic $\mathbb{CP}^1$-singularity (i.e., with $\mathcal{L}_{\mathrm{sing}}$ having a component that is a rational curve).

Note that no assumption is made on how $D$ is embedded in $X$ (e.g., we do not assume that $D \subset X$ is ample). Thus, the above theorem exhibits an intrinsic geometric property of degree $d$ hypersurfaces of $\mathbb{CP}^{n+1}$.

4. Discussion and open problems

The symplectic approach described above shows that the symplectic structure of an algebraic variety carries non-trivial information on the algebro-geometric properties of the variety. The symplectic point of view has some advantages as well as drawbacks in comparison to algebraic approaches. For example, Sommese's methods imply that certain varieties cannot be ample divisors in any smooth variety, whereas our techniques only show that these varieties cannot be hyperplane sections (i.e., very ample divisors). On the other hand, the symplectic approach is very robust with respect to small deformations of the varieties in question. For example, appropriate versions of Theorems 2.3, 2.4 and 2.5 continue to hold for small deformations of the varieties, since a small deformation of the complex structure will still be tamed by the same symplectic structure². Consequently, non-existence of Lagrangian spheres remains valid. Robustness under small deformations seems important in questions of smoothing of singularities (see, e.g., [25]).

4.1. Some (open) problems.

4.1.1. Bounding the number of singularities. Consider hypersurfaces $\Sigma \subset \mathbb{CP}^{n+1}$ of given degree $d$ with isolated singularities. A classical problem is to find the maximal possible number of singular points $S(d, n)$ on such a hypersurface.

² In other words, the symplectic structure is still positive with respect to a small deformation of the complex structure.
Currently the precise answer is known only for special cases of $d$ and $n$, and in general only asymptotic results (in $d$ and $n$) are known (see [2, 6]). A related, somewhat more general, problem is the following. Let $V$ be a smooth projective variety. Denote by $\nu_S(V)$ the maximal possible number of isolated singularities that can occur in the central fibre $W_0$ whenever $V$ is included into a degeneration $\pi : W \to D$ with isolated singularities. Note that in some cases $\nu_S$ may be infinite (e.g., when $V$ is a curve), but in other cases it is finite. Note also that $S(d, n) \leq \nu_S(V)$ where $V \subset \mathbb{CP}^{n+1}$ is a smooth hypersurface of degree $d$.

Bounds on the numbers $S(d, n)$ and $\nu_S(V)$ can, in principle, be obtained by means of symplectic topology. To this end, denote by $\nu_{\mathrm{Lag}}(V)$ the maximal number of disjoint Lagrangian spheres that can simultaneously be embedded into $(V, \omega)$, where $\omega$ is an arbitrary symplectic form compatible with the complex structure of $V$. We claim that $\nu_S(V) \leq \nu_{\mathrm{Lag}}(V)$. Indeed, the vanishing cycles corresponding to different singular points in $W_0$ can be represented by disjoint Lagrangian spheres. It seems likely that symplectic techniques (e.g., in the framework of Floer theory) can lead to computable bounds on $\nu_{\mathrm{Lag}}(V)$, thus bounding $\nu_S(V)$. The simplest non-trivial example seems to be the complex quadric $Q^n = \{z_0^2 + \cdots + z_{n+1}^2 = 0\} \subset \mathbb{CP}^{n+1}$, with $n \geq 2$. When $n$ is even, for topological reasons we have $\nu_S(Q^n) = \nu_{\mathrm{Lag}}(Q^n) = 1$. When $n \geq 3$ is odd, $\nu_{\mathrm{Lag}}(Q^n)$ is currently unknown, but by results from [3] there are good reasons to expect that $\nu_{\mathrm{Lag}}(Q^n) = 1$ too.

4.1.2. Hyperplane sections and Stein fillings. Let $\Sigma$ be a smooth projective variety that cannot be a hyperplane section in any smooth variety. It would be interesting to figure out if $\Sigma$ also cannot be a symplectic hyperplane section in the sense of Donaldson [7]. Namely, is it possible or not to embed $\Sigma$ as a (real) codimension-2 symplectic submanifold of a symplectic manifold $(X^{2n}, \omega)$ in such a way that $[\Sigma] \in H_{2n-2}(X)$ is Poincaré dual to a multiple of $[\omega] \in H^2(X)$? An affirmative answer to this question (namely that $\Sigma$ cannot be a symplectic hyperplane section) would give rise to new examples of contact manifolds that do not have Stein fillings. To see this, consider a circle bundle $P \to \Sigma$ whose Chern class is $[\tau] \in H^2(\Sigma; \mathbb{Z})$, where $\tau$ is an integral symplectic structure on $\Sigma$. Endowing $P$ with a connection 1-form $\alpha$ whose curvature is $\tau$ we obtain a contact structure $\xi = \ker \alpha$ on $P$. If $\Sigma$ cannot be a symplectic hyperplane section in any symplectic manifold $(X, \omega)$ then $(P, \xi)$ does not admit any Stein filling (i.e., $(P, \xi)$ cannot be the boundary of any Stein manifold). Note that the only known examples of Stein non-fillable contact manifolds are either due to topological reasons or, in dimension 3, due to contact-topological reasons (e.g., overtwisted structures). Non-fillability of circle bundles $P$ over $\Sigma$ with $\dim_{\mathbb{R}} \Sigma \geq 4$ would be a new “contact phenomenon”. An interesting example to consider seems to be $P \to \Sigma$ where $\Sigma$ is an Abelian variety of complex dimension $\geq 2$.
References
[1] V. Arnold, Some remarks on symplectic monodromy of Milnor fibrations. The Floer memorial volume, 99–103, Progr. Math., 133, Birkhäuser 1995.
[2] V. Arnold, V. Goryunov, O. Lyashko & V. Vasil'ev, Singularity theory. I. Springer-Verlag, Berlin, 1998.
[3] P. Biran, Lagrangian non-intersections. math.SG/0412110, to appear in Geom. Funct. Anal.
[4] P. Biran, Algebraic families and Lagrangian cycles. In preparation.
[5] P. Biran, Geometry of Symplectic Intersections. Proceedings of the International Congress of Mathematicians (Beijing 2002), Vol. II, 241–255.
[6] A. Dimca, Singularities and topology of hypersurfaces. Universitext. Springer-Verlag, New York, 1992.
[7] S.K. Donaldson, Symplectic submanifolds and almost-complex geometry, J. Differential Geom. 44 (1996), 666–705.
[8] S. Donaldson, Polynomials, vanishing cycles and Floer homology. Mathematics: frontiers and perspectives, 55–64, Amer. Math. Soc., 2000.
[9] L. Ein, Varieties with small dual varieties. I. Invent. Math. 86 (1986), 63–74.
[10] L. Ein, Varieties with small dual varieties. II. Duke Math. J. 52 (1985), 895–907.
[11] A. Floer, Morse theory for Lagrangian intersections. J. Differential Geom. 28 (1988), 513–547.
[12] K. Fukaya, Y.-G. Oh, H. Ohta & K. Ono, Lagrangian intersection Floer theory – anomaly and obstruction. Preprint.
[13] T. Fujita, On the hyperplane section principle of Lefschetz. J. Math. Soc. Japan 32 (1980), no. 1, 153–169.
[14] Y. Jerby, Algebraic families and Lagrangian submanifolds. MSc Thesis, Tel-Aviv University, 2004.
[15] M. Goresky & R. MacPherson, Stratified Morse theory. Ergebnisse der Mathematik und ihrer Grenzgebiete (3), 14. Springer-Verlag, Berlin, 1988.
[16] P. Griffiths & J. Harris, Principles of algebraic geometry. Wiley Interscience Publication, New York, 1978.
[17] M. Gromov, Pseudoholomorphic curves in symplectic manifolds, Invent. Math., 82 (1985), 307–347.
[18] S. Kleiman, About the conormal scheme. Complete intersections (Acireale, 1983), 161–197, Lecture Notes in Math., 1092, Springer 1984.
[19] Y.-G. Oh, Floer cohomology, spectral sequences, and the Maslov class of Lagrangian embeddings. Internat. Math. Res. Notices 1996, 305–346.
[20] P. Seidel, Floer homology and the symplectic isotopy problem, PhD thesis, Oxford University 1997.
[21] P. Seidel, Graded Lagrangian submanifolds. Bull. Soc. Math. France 128 (2000), 103–149.
[22] A. Silva, Relative vanishing theorems. I. Applications to ample divisors. Comment. Math. Helv. 52 (1977), 483–489.
[23] A. Sommese, On manifolds that cannot be ample divisors. Math. Ann. 221 (1976), 55–72.
[24] A. Sommese, Hyperplane sections. Algebraic geometry (Chicago, Ill., 1980), pp. 232–271, Lecture Notes in Math., 862, Springer, Berlin-New York, 1981.
[25] A. Sommese, Nonsmoothable varieties. Comment. Math. Helv. 54 (1979), no. 1, 140–146.
[26] F. Zak, Tangents and secants of algebraic varieties. Mathematical Monographs, 127. American Mathematical Society, 1993.

Paul Biran
School of Mathematical Sciences
Tel-Aviv University
Tel-Aviv 69978, Israel
e-mail:
[email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
Vortices in the Ginzburg–Landau Model of Superconductivity

Sylvia Serfaty
1. Introduction

1.1. Presentation of the Ginzburg–Landau model. The Ginzburg–Landau energy was introduced by Ginzburg and Landau (see [GL]) in the 1950s; it was the first model able to explain superconductivity thoroughly. It started out as a phenomenological theory, but it was later derived (in a certain limit) from the microscopic (quantum) theory of Bardeen–Cooper–Schrieffer. It is now a widely accepted model.¹ The Ginzburg–Landau energy has also proved useful in the modelling of superfluidity (a phenomenon very close to superconductivity, both mathematically and physically) and of rotating Bose–Einstein condensates (Bose–Einstein condensates were predicted by Bose and Einstein in the early 20th century, and first realized experimentally only in the 1990s). All these physical phenomena have in common the appearance of topological vortices, which are the main object of our study.

Superconductivity was first observed in 1911 by Heike Kamerlingh Onnes, who discovered that the electrical resistance of mercury completely disappeared at very low temperature. The striking general feature of superconducting materials (in general they are metallic alloys) is that, at low temperatures (below a critical temperature), they lose their resistivity and permanent currents can circulate without energy dissipation. Moreover, they repel applied magnetic fields (this is called the Meissner effect). For further reference, we refer to Section 1.3 below and to the physics literature, e.g., [T, DG].

Real samples are 3D; however, we will consider only the 2D model for simplicity (it already contains most of the important features). The 2D Ginzburg–Landau energy may be written (after various suitable rescalings) as:
$$G_\varepsilon(u, A) = \frac{1}{2} \int_\Omega |\nabla_A u|^2 + |\operatorname{curl} A - h_{ex}|^2 + \frac{1}{2\varepsilon^2} \left(1 - |u|^2\right)^2. \tag{1.1}$$

Joint work with Etienne Sandier (Université Paris–XII).

¹ It has earned Ginzburg the 2003 Physics Nobel Prize, jointly with Abrikosov for his work on explaining vortex lattices, and Leggett for his modelling of superfluidity. Experimental discoveries on superconductivity and Bose–Einstein condensates have also won other Nobel prizes.
Here $\Omega$ denotes a smooth, bounded and simply connected domain corresponding to the cross-section of the sample (assuming everything is translation-invariant in the third direction). The function $u : \Omega \to \mathbb{C}$ is called the order parameter; $|u(x)|^2 \leq 1$ indicates the local density of superconducting electrons (the “Cooper pairs”), responsible for the superconductivity phenomenon. Where $|u(x)| \sim 1$ the material is in the superconducting phase, and where $|u(x)| \sim 0$ it is in the normal phase. This order parameter is coupled, in a gauge-invariant fashion, to a magnetic potential $A : \Omega \to \mathbb{R}^2$, with $\nabla_A = \nabla - iA$ the covariant derivative, and the function $h = \operatorname{curl} A = \partial_2 A_1 - \partial_1 A_2$ is the induced magnetic field in the sample. The real parameter $h_{ex}$ is the intensity of the external applied magnetic field. Finally, the parameter $1/\varepsilon$ is called the Ginzburg–Landau parameter; it is a dimensionless parameter depending on the material (the ratio of two characteristic lengths). When $1/\varepsilon$ is large enough, we are in the category of “type–II” superconductors; when $\varepsilon \to 0$, they are sometimes called “extreme type–II” (this is also called the “London limit”). This is the asymptotic regime we will be interested in.

The Euler–Lagrange equations associated to $G_\varepsilon$, or Ginzburg–Landau equations, can be written:
$$-\nabla_A^2 u = \frac{u}{\varepsilon^2} (1 - |u|^2) \quad \text{in } \Omega \tag{1.2}$$
$$-\nabla^\perp h = \langle iu, \nabla_A u \rangle \quad \text{in } \Omega \tag{1.3}$$
$$h = h_{ex} \quad \text{on } \partial\Omega \tag{1.4}$$
$$(\nabla u - iAu) \cdot \nu = 0 \quad \text{on } \partial\Omega. \tag{1.5}$$
Here $\langle \cdot, \cdot \rangle$ denotes the scalar product in $\mathbb{C}$ as identified with $\mathbb{R}^2$. The Ginzburg–Landau equations and functional are invariant under $U(1)$-gauge-transformations (it is an Abelian gauge-theory) of the type:
$$u \to u e^{i\Phi}, \qquad A \to A + \nabla\Phi. \tag{1.6}$$
The physically relevant quantities are those that are gauge-invariant, such as the energy $G_\varepsilon$, $|u|$, $h$, etc. For more references on all the results we present here, we refer to the forthcoming monograph [SS6].

We will also mention results on a simplified model, without magnetic field. It consists in taking $A = 0$ and $h_{ex} = 0$; then the energy reduces to
$$E_\varepsilon(u) = \frac{1}{2} \int_\Omega |\nabla u|^2 + \frac{(1 - |u|^2)^2}{2\varepsilon^2} \tag{1.7}$$
with still $u : \Omega \to \mathbb{C}$. Critical points of this energy are solutions of
$$-\Delta u = \frac{u}{\varepsilon^2} (1 - |u|^2). \tag{1.8}$$
The first main study of this functional was done in the book [BBH], where they replace the effect of the applied field $h_{ex}$ by a fixed Dirichlet boundary
condition (see also [BR] for results with analogous boundary conditions for $G_\varepsilon$). Since then, a large literature on this model has developed.

1.2. Vortices. A typical vortex centered at a point $x_0$ will “look like” $u = \rho e^{i\varphi}$ with $\rho = f\left(\frac{|x - x_0|}{\varepsilon}\right)$, where $f(0) = 0$ and $f$ tends to 1 as $r \to +\infty$, i.e., its characteristic core size is $\varepsilon$, and
$$\frac{1}{2\pi} \int_{\partial B(x_0, R\varepsilon)} \frac{\partial \varphi}{\partial \tau} = d \in \mathbb{Z}$$
is an integer, called the degree of the vortex. For example $\varphi = d\theta$, where $\theta$ is the polar angle centered at $x_0$, yields a vortex of degree $d$. We have the important relation
$$\operatorname{curl} \nabla\varphi = 2\pi \sum_i d_i \delta_{a_i} \tag{1.9}$$
where the $a_i$'s are the centers of the vortices, the $d_i$'s their degrees and $\delta$ the Dirac mass. In the limit $\varepsilon \to 0$, vortices become point-like or, more generally, in any dimension, codimension 2 singularities. This is to be compared with the case of phase-transition models where the order parameter $u$ is real-valued, leading to codimension 1 singular sets in the limit (see [MM]).

1.3. Critical fields. When an external magnetic field is applied to a superconductor, several responses can be observed depending on the intensity of the field $h_{ex}$. There are three critical fields $H_{c_1}$, $H_{c_2}$, $H_{c_3}$. When $h_{ex} < H_{c_1}$, the response is essentially the same as without field: the material remains in its superconducting phase, $|u| \simeq 1$ everywhere, there are no vortices, and the magnetic field does not penetrate into the sample (the Meissner effect). At a value $H_{c_1}$ scaling like $O(|\log \varepsilon|)$, the first vortices (zeroes of $u$) appear in the sample, and the magnetic field penetrates through them; as $h_{ex}$ is further raised, the number of vortices increases. Since they repel each other, they tend to become arranged in a triangular array called the “Abrikosov lattice” (see [A]). At a second critical field $H_{c_2}$ scaling like $O\left(\frac{1}{\varepsilon^2}\right)$, bulk superconductivity is destroyed, and only surface superconductivity remains. Above a third critical field $H_{c_3} = O\left(\frac{1}{\varepsilon^2}\right)$, superconductivity is destroyed everywhere and the sample is in the normal state $u \equiv 0$.

1.4. Questions, results and methods.
(1) To understand the vortices: their distribution, their interaction (for which we use some potential theory), and their motion.
(2) To understand the influence of the boundary conditions and/or of the applied field, and to find the asymptotic values of the critical fields (as $\varepsilon \to 0$).
(3) To prove compactness results and derive limiting energies/reduced problems, thus following the strategy of Γ-convergence. This enables us to understand the behavior of global minimizers (or energy minimizers) and
their vortices. In order to achieve this, one needs to find lower bounds for the energy, together with matching upper bounds.
(4) To understand and find local minimizers. This is done through a special “local minimization in energy sectors” method.
(5) To understand the behavior of critical points of the energy (i.e., solutions which are not necessarily stable). The method used here is to pass to the limit in the “stress-energy tensor”.

For most of these questions, we need to capture vortices for arbitrary maps $u$ (not necessarily solutions), and to be able to treat possibly unbounded numbers of vortices (as $\varepsilon \to 0$). In order to achieve this, we introduced two technical tools which we use throughout: the “vortex ball construction”, yielding the lower bounds on the energy, and the vorticity measures, which serve to describe vortex-densities instead of individual vortices.

2. Mathematical tools

2.1. The vortex-ball construction. As we mentioned, this serves to obtain lower bounds for bounded or unbounded numbers of vortices. The idea is that, whatever the map $u$, for topological reasons, a vortex of degree $d$ confined in a ball of radius $R$ should cost at least an energy $\pi d^2 \log \frac{R}{\varepsilon}$. Then, since there may be a large number of these vortices, one must find a way to add up those estimates. This is done following the method of Jerrard [Je] and Sandier [Sa1] of growing and merging of balls. Using this method, we obtain the following result:

Theorem 2.1 (see [SS6]). If $G_\varepsilon(u, A) \leq \varepsilon^{\alpha}$ ($\alpha > -1$), then for $\varepsilon$ small enough and for any $r < 1$ there exists a collection of disjoint balls $(B_i)_{i \in I} = (B(a_i, r_i))_{i \in I}$ (depending on $\varepsilon$), such that
(1) $\sum_{i \in I} r_i \leq r$.
(2) $\left\{ \big| |u| - 1 \big| \geq \frac{1}{2} \right\} \subset \cup_{i \in I} B_i$.
(3) Writing $d_i = \deg(u, \partial B_i)$ if $B_i \subset \Omega$ and $d_i = 0$ otherwise,
$$\frac{1}{2} \int_{\cup_{i \in I} B_i} |\nabla_A u|^2 + r^2 |\operatorname{curl} A|^2 + \frac{(1 - |u|^2)^2}{2\varepsilon^2} \geq \pi \sum_{i \in I} |d_i| \left( \log \frac{r}{\varepsilon \sum_{i \in I} |d_i|} - C \right).$$
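The degrees $d_i = \deg(u, \partial B_i)$ entering the theorem can be computed numerically for a given map. The following is a minimal sketch, not part of the construction of [SS6]: it identifies $\mathbb{R}^2$ with $\mathbb{C}$, samples $u$ on a circle, and sums the phase increments. The profile `model_vortex` with $f = \tanh$ is a hypothetical choice made only for illustration; any $f$ with $f(0) = 0$ and $f \to 1$ would do.

```python
import cmath
import math

def degree(u, center, radius, samples=2000):
    """Winding number of u along the circle |x - center| = radius,
    obtained by summing the phase increments of u between consecutive
    sample points (each increment lies in (-pi, pi])."""
    total = 0.0
    prev = u(center + radius)
    for k in range(1, samples + 1):
        point = center + radius * cmath.exp(2j * math.pi * k / samples)
        cur = u(point)
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2 * math.pi))

def model_vortex(d, eps, x0=0j):
    """Hypothetical model vortex u = f(|x - x0|/eps) * exp(i d theta),
    with the illustrative profile f(s) = tanh(s)."""
    def u(x):
        z = x - x0
        r = abs(z)
        return math.tanh(r / eps) * (z / r) ** d
    return u
```

A circle enclosing the vortex center returns $d$, while a circle that does not enclose it returns 0, which is the reason the degree only detects the vortices inside the ball.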
Remark 2.2. This estimate is in fact completely sharp: examples in which it is sharp can be constructed.
2.2. The vorticity measures. Recall that a complex-valued map $u$ can be written in polar coordinates $u = \rho e^{i\varphi}$, where the phase $\varphi$ can be multi-valued. Given a configuration $(u, A)$, we define its vorticity by
\[ \mu(u, A) = \operatorname{curl} \langle iu, \nabla_A u \rangle + \operatorname{curl} A. \]
Formally,
\[ \langle iu, \nabla u \rangle = \rho^2 \nabla\varphi \simeq \nabla\varphi \tag{2.1} \]
considering that $\rho = |u| \simeq 1$. Taking the curl of this expression and using (1.9), one would get the approximate (formal) relation
\[ \mu(u, A) \simeq 2\pi \sum_i d_i \delta_{a_i} \tag{2.2} \]
where the $a_i$'s are the vortices of $u$ and the $d_i$'s their degrees. Thus we see why the quantity $\mu$ corresponds to a vorticity-measure of the map $u$ (just like the vorticity for fluids). The following theorem gives a rigorous content to (2.2).
Theorem 2.3 (see [JS1] and [SS6]). The $(a_i, d_i)$'s being given by the construction of Theorem 2.1, we have, for all $0 < \gamma < 1$,
\[ \Big\| \mu(u, A) - 2\pi \sum_i d_i \delta_{a_i} \Big\|_{(C_0^{0,\gamma}(\Omega))^*} \le C r^{\gamma} G_\varepsilon(u, A). \]
Therefore, if $r$ is taken small enough, $\mu(u, A)$ and $2\pi \sum_i d_i \delta_{a_i}$ are close in a weak norm.
Remark 2.4. When the second Ginzburg–Landau equation (1.3)–(1.4), relative to the field, is satisfied, taking the curl of (1.3) we find that the vorticity and the induced field are linked by the relation
\[ \begin{cases} -\Delta h + h = \mu(u, A) & \text{in } \Omega \\ h = h_{ex} & \text{on } \partial\Omega. \end{cases} \tag{2.3} \]
Thus knowing the vorticity is equivalent to knowing the induced field $h$.
3. Global minimization (Γ-convergence type) results
3.1. Results for $E_\varepsilon$.
3.1.1. In two dimensions. For the two-dimensional simplified model (1.7), the main result of [BBH] can be written in the following form.
Theorem 3.1 (Bethuel–Brezis–Hélein [BBH]). For any family $u_\varepsilon$ (with $\varepsilon \to 0$) such that $E_\varepsilon(u_\varepsilon) \le C|\log \varepsilon|$ and $u_\varepsilon = g$ on $\partial\Omega$, where $g$ is a map from $\partial\Omega$ to $S^1$ of degree $d$, up to extraction there exists a finite family $(a_i, d_i)$ of $n$ points and degrees such that $\sum_{i=1}^n d_i = d$ and, as $\varepsilon \to 0$,
\[ \mu(u_\varepsilon) := \operatorname{curl} \langle iu_\varepsilon, \nabla u_\varepsilon \rangle \rightharpoonup 2\pi \sum_i d_i \delta_{a_i}, \]
\[ E_\varepsilon(u_\varepsilon) \ge \pi \sum_{i=1}^n |d_i| |\log \varepsilon| + W(a_1, \dots, a_n) + o(1). \]
Here $W$ denotes a function of the points $a_i \in \Omega$ (depending also on the degrees), called the “renormalized energy”, which has the form
\[ W(x_1, \dots, x_n) = -\pi \sum_{i \ne j} d_i d_j \log |x_i - x_j| + \text{interaction with the boundary}. \]
$W$ corresponds to the finite part of the energy, and to the interaction between the vortices (vortices of the same sign repel, those of opposite sign attract). This result is typically a Γ-convergence type result (the upper bound part of the Γ-convergence can also be written down): it exhibits a limiting or reduced energy, which in this case is set on a finite-dimensional space. Thus, Γ-convergence achieves a dimension reduction. Minimizing the energies $E_\varepsilon$ reduces to the easier problem of minimizing the limiting energy (in fact it is proved in [BBH] that minimizers all have vortices of degree $+1$ (or all $-1$ if $d < 0$) which converge to minimizers of $W$).
3.1.2. In higher dimensions. Three-dimensional as well as higher-dimensional versions of that result have been given, see the works of Lin–Rivière [LR1], Jerrard–Soner [JS1], Sandier [Sa2], Bethuel–Brezis–Orlandi [BBO]. Essentially, up to extraction $\mu(u_\varepsilon) \rightharpoonup 2\pi J$, where $J$ is an integer-multiplicity rectifiable current, and
\[ \liminf_{\varepsilon \to 0} \frac{E_\varepsilon(u_\varepsilon)}{|\log \varepsilon|} \ge \pi \|J\| \]
where $\|J\|$, the total mass of the current, corresponds to the length (in 3D) or area (in higher dimensions) of the vortices. Here, the situation is quite different from dimension 2, because the main order $|\log \varepsilon|$ of the energy already gives a nontrivial limiting problem: to find the mass of the limiting object $J$. This is in contrast with the 2D problem, which led only to minimizing the number of points (one then needs to go to the next order (order 1) in the energy to get an interesting problem). In the higher-dimensional case, minimizing this limiting problem is a nontrivial question and yields minimal connections (in 3D) or minimal surfaces/codimension 2 objects (in the sense of area-minimizing currents).
3.2. Global minimization results for $G_\varepsilon$.
3.2.1. Close to the first critical field $H_{c_1}$. Let us introduce $h_0$, the solution of
\[ \begin{cases} -\Delta h_0 + h_0 = 0 & \text{in } \Omega \\ h_0 = 1 & \text{on } \partial\Omega \end{cases} \tag{3.1} \]
and
\[ K_0 = (2 \max |h_0 - 1|)^{-1}. \tag{3.2} \]
We also introduce the set $\Lambda = \{x \in \Omega : h_0(x) = \min h_0\}$ and we will assume here for simplicity that it is reduced to only one point called $p$. With this notation, a first essential result is the asymptotic formula for $H_{c_1}$ (confirming physical predictions that $H_{c_1} = O(|\log \varepsilon|)$):
\[ H_{c_1} = K_0 |\log \varepsilon| + O(1). \tag{3.3} \]
Theorem 3.2 (see [S1, SS6]). Assume $h_{ex} \le H_{c_1} + O(\log |\log \varepsilon|)$. Then for $h_{ex} \in (H_n, H_{n+1})$, where $H_n$ has an expansion of the form
\[ H_n = K_0 \big( |\log \varepsilon| + (n - 1) \log(K_0 |\log \varepsilon|) + \gamma_n \big), \qquad n \in \mathbb{N}, \]
global minimizers of $G_\varepsilon$ have exactly $n$ vortices $a_i^\varepsilon$ of degree 1, $a_i^\varepsilon \to p$ as $\varepsilon \to 0$, and the $\tilde a_i^\varepsilon = \sqrt{\frac{h_{ex}}{n}}\,(a_i^\varepsilon - p)$ converge as $\varepsilon \to 0$ to a minimizer of
\[ w_n(x_1, \dots, x_n) = -\pi \sum_{i \ne j} \log |x_i - x_j| + \pi n \sum_{i=1}^n \langle D^2 h_0(p) x_i, x_i \rangle. \tag{3.4} \]
Through this theorem we see that the behavior is as expected: below $H_{c_1} = H_1$ there are no vortices in energy minimizers, then at $H_{c_1}$ the first vortex becomes energetically preferred, close to the point $p$. Then, there is a sequence of additional critical fields $H_2, H_3, \dots$ separated by increments of $\log |\log \varepsilon|$, at which a second, third, etc., vortex becomes favorable. Each time the optimal vortices are located close to $p$ as $\varepsilon \to 0$, and after blowing up at the scale $\sqrt{\frac{h_{ex}}{n}}$ around $p$, they converge to configurations which minimize $w_n$ in $\mathbb{R}^2$. Now, $w_n$, which appears as a limiting energy (after that rescaling), contains a repulsion term and a confinement term. It is a standard two-dimensional interaction; however, rigorous results on its minimization are hard to obtain as soon as $n \ge 5$. When $D^2 h_0$ has rotational symmetry, numerical minimization yields very regular shapes (regular polygons for $n \le 6$, regular “stars” (regular polygon + its center)) which look very much like the birth of a triangular lattice as $n$ becomes large; their density tends to be uniform, supported in a fixed disc of $\mathbb{R}^2$, as $n \to \infty$. All these results are in very good agreement with experimental observations.
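As a rough numerical illustration of the rotationally symmetric case, the following sketch runs plain gradient descent on $w_n$ for $n = 3$. It assumes (this normalization is not made in the text) that $D^2 h_0(p)$ is the identity matrix; the function names, starting points, step size and iteration count are illustrative choices only. For a regular $n$-gon of radius $R$ centered at the origin, the stationarity condition for this normalized $w_n$ gives $R^2 = (n-1)/(2n)$, which the computed radii can be compared against.

```python
import math


def grad_wn(points):
    """Gradient of w_n, with D^2 h_0(p) replaced by the identity (assumption)."""
    n = len(points)
    grads = []
    for i, (xi, yi) in enumerate(points):
        # Confinement term: gradient of pi * n * |x_i|^2.
        gx = 2.0 * math.pi * n * xi
        gy = 2.0 * math.pi * n * yi
        for j, (xj, yj) in enumerate(points):
            if j != i:
                dx, dy = xi - xj, yi - yj
                r2 = dx * dx + dy * dy
                # Repulsion term: the sum over i != j counts each pair twice.
                gx -= 2.0 * math.pi * dx / r2
                gy -= 2.0 * math.pi * dy / r2
        grads.append((gx, gy))
    return grads


def minimize_wn(points, step=0.01, iterations=5000):
    """Fixed-step gradient descent; a sketch, not a robust optimizer."""
    for _ in range(iterations):
        grads = grad_wn(points)
        points = [(x - step * gx, y - step * gy)
                  for (x, y), (gx, gy) in zip(points, grads)]
    return points


start = [(1.0, 0.0), (-0.3, 0.8), (-0.5, -0.7)]  # arbitrary non-collinear start
final = minimize_wn(start)
radii = [math.hypot(x, y) for x, y in final]
sides = [math.dist(final[i], final[j]) for i in range(3) for j in range(i + 1, 3)]
expected_radius = math.sqrt((3 - 1) / (2 * 3))  # R^2 = (n - 1) / (2n) for n = 3
```

With this normalization the three points should settle on an equilateral triangle centered at the origin; for larger $n$ a fixed-step descent of this kind may stop at a local minimizer rather than a global one.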
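Remark 3.3. It was proved in [S2] that for $h_{ex} < H_{c_1}$, the energy-minimizer is unique and has no vortex.
3.2.2. Global minimizers in the intermediate regime. In the next higher regime of applied field, the result is the following:
Theorem 3.4 ([SS6]). Assume $h_{ex}$ satisfies
\[ \log |\log \varepsilon| \ll h_{ex} - H_{c_1} \ll |\log \varepsilon| \quad \text{as } \varepsilon \to 0. \]
Then there exists $1 \ll n_\varepsilon \ll h_{ex}$ such that
\[ h_{ex} \sim K_0 \left( |\log \varepsilon| + n_\varepsilon \log \frac{|\log \varepsilon|}{n_\varepsilon} \right) \]
and if $(u_\varepsilon, A_\varepsilon)$ minimizes $G_\varepsilon$, then
\[ \frac{\tilde\mu(u_\varepsilon, A_\varepsilon)}{2\pi n_\varepsilon} \rightharpoonup \mu_0 \quad \text{as } \varepsilon \to 0 \]
where $\tilde\mu(u_\varepsilon, A_\varepsilon)$ is the image-measure of $\mu(u_\varepsilon, A_\varepsilon)$ under the blow-up $\phi : x \mapsto \sqrt{\frac{h_{ex}}{n_\varepsilon}}\,(x - p)$, and $\mu_0$ is the unique minimizer over probability measures of
\[ I(\mu) = \iint_{\mathbb{R}^2 \times \mathbb{R}^2} -\log |x - y| \, d\mu(x) \, d\mu(y) + \int_{\mathbb{R}^2} \langle D^2 h_0(p) x, x \rangle \, d\mu(x). \tag{3.5} \]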
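Here, $n_\varepsilon$ corresponds to the expected optimal number of vortices. The problem of minimizing $I$ is a classical one in potential theory (see [ST]). Its minimizer $\mu_0$ is a probability measure of constant density over a subdomain of $\mathbb{R}^2$ (typically a disc or an ellipse). This result connects “continuously” with Theorem 3.2, except that $n_\varepsilon \gg 1$. Again, vortices in the minimizers converge to $p$ as $\varepsilon \to 0$, and when one blows up at the right scale $\sqrt{\frac{h_{ex}}{n_\varepsilon}}$ around $p$, one obtains a uniform density of vortices in a subdomain of $\mathbb{R}^2$ (a disc if $D^2 h_0$ has rotational symmetry).
3.2.3. Global minimizers in the regime $n_\varepsilon$ proportional to $h_{ex}$. This happens in the next regime: $h_{ex} = \lambda |\log \varepsilon|$ with $\lambda > K_0$.
Theorem 3.5 (see [SS2, SS6]). Assume $h_{ex} = \lambda |\log \varepsilon|$ where $\lambda > 0$ is a constant independent of $\varepsilon$. If $(u_\varepsilon, A_\varepsilon)$ minimizes $G_\varepsilon$, then as $\varepsilon \to 0$
\[ \frac{\mu(u_\varepsilon, A_\varepsilon)}{h_{ex}} \rightharpoonup \mu_*, \]
the unique minimizer over $H^{-1}(\Omega) \cap (C_0^0(\Omega))^*$ of
\[ E_\lambda(\mu) = \frac{1}{2\lambda} \int_\Omega |\mu| + \frac{1}{2} \iint_{\Omega \times \Omega} G(x, y) \, d(\mu - 1)(x) \, d(\mu - 1)(y) = \frac{1}{2\lambda} \int_\Omega |\mu| + \frac{1}{2} \int_\Omega |\nabla h_\mu|^2 + |h_\mu - 1|^2 \tag{3.6} \]
where $G(\cdot, y)$ is the solution to $-\Delta G + G = \delta_y$ with $G = 0$ on $\partial\Omega$, and where $h_\mu$ and $\mu$ are related by
\[ \begin{cases} -\Delta h_\mu + h_\mu = \mu & \text{in } \Omega \\ h_\mu = 1 & \text{on } \partial\Omega. \end{cases} \]
Remark 3.6. The result we really obtained is stronger: it is the full Γ-convergence of $\frac{G_\varepsilon}{h_{ex}^2}$ to $E_\lambda$.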
Again, by Γ-convergence, we reduce the problem to minimizing the limiting energy $E_\lambda$ on the space of bounded Radon measures on $\Omega$. It turns out that the problem of minimizing $E_\lambda$ is the dual, in the sense of convex duality, of an obstacle problem:
Proposition 3.7. $\mu$ minimizes $E_\lambda$ if and only if $h_\mu$ is the minimizer for
\[ \min_{\substack{h \ge 1 - \frac{1}{2\lambda} \\ h = 1 \text{ on } \partial\Omega}} \int_\Omega |\nabla h|^2 + h^2. \tag{3.7} \]
Now, the solution of the obstacle problem (3.7) (where the obstacle is the constant function $1 - \frac{1}{2\lambda}$) is well known, and given by a variational inequality (see [KS]). Obstacle problems are a particular type of free-boundary problem, the free boundary here being the boundary of the coincidence set
\[ \omega_\lambda = \left\{ x \in \Omega : h_\mu(x) = 1 - \frac{1}{2\lambda} \right\}. \]
Then $-\Delta h_\mu + h_\mu = 0$ outside of $\omega_\lambda$, so $\omega_\lambda$ is really the support of $\mu_*$, on which $\mu_*$ is equal to the constant density $\left(1 - \frac{1}{2\lambda}\right) dx$. An easy analysis of this obstacle problem yields the following:
(1) $\omega_\lambda = \emptyset$ (hence $\mu_* = 0$) if and only if $\lambda < K_0$, where $K_0$ was given by (3.2). (This corresponds to the case $h_{ex} < H_{c_1}$.)
(2) For $\lambda = K_0$, $\omega_\lambda = \{p\}$. This is the case when $h_{ex} \sim H_{c_1}$ to leading order. In the scaling chosen here $\mu_* = 0$, but the true behavior of the vorticity is ambiguous unless we go to the next order term, as was done in Theorems 3.2 and 3.4.
(3) For $\lambda > K_0$, the measure of $\omega_\lambda$ is nonzero, so the limiting vortex density $\mu_* \ne 0$. Moreover, as $\lambda$ increases (i.e., as $h_{ex}$ does), $\omega_\lambda$ increases. When $\lambda = +\infty$, $\omega_\lambda$ becomes $\Omega$ and $\mu_* = 1$; this corresponds to the case $h_{ex} \gg |\log \varepsilon|$ and was also studied in more detail in [SS6].
To sum up, the optimal limiting vortex densities $\mu_*$ are always uniform densities on a subdomain $\omega_\lambda$ (thus the actual number of vortices is proportional to $h_{ex}$) which is nonempty for $h_{ex} > H_{c_1}$, is first nucleated at $p$, and grows with $h_{ex}$. For applied fields larger than $\frac{1}{\varepsilon^2}$ but below $H_{c_2}$, results showing the decrease of bulk superconductivity were obtained in [SS4]. For applied fields above $H_{c_2}$ and $H_{c_3}$, the situation and phase transitions have been completely studied in an abundant literature (refer to the works of Pan, Lu–Pan, Baumann–Phillips–Tang, Bernoff–Sternberg, Del Pino–Felmer–Sternberg, Helffer–Morame, Helffer–Pan, Giorgi–Phillips, ...).
4. Local minimizers: branches of solutions
After understanding the behavior of global minimizers for all ranges of applied fields, it is of interest to understand the behavior of local minimizers, since they are also stable observable configurations (called “metastable” in physics), and to exhibit some specific ones.
Theorem 4.1 ([S2, SS6]).
For $\varepsilon$ small enough and for every $n \in \mathbb{N}$ and $h_{ex}$ such that $h_{ex} \le C |\log \varepsilon|^q$ for some $q > 0$, $M n^2 \le h_{ex}$, and $n^2 \log \frac{h_{ex}}{n} \ll |\log \varepsilon|$, there exists a local minimizer of $G_\varepsilon$ with exactly $n$ vortices $a_i^\varepsilon$ of degree 1, and as $\varepsilon \to 0$:
(1) If $n$ is fixed and $h_{ex} = O(1)$ converges, up to extraction of a subsequence, to a constant (denoted by $h_{ex}$), the $a_i^\varepsilon$'s converge to a minimizer of
\[ R_{n, h_{ex}} = -\pi \sum_{i \ne j} \log |x_i - x_j| + \pi \sum_{i, j} S(x_i, x_j) + 2\pi h_{ex} \sum_{i=1}^n h_0(x_i), \]
where $S$ is the regular part of the previously defined Green's function $G$.
(2) If $n$ is fixed and $h_{ex} \to \infty$, the $\tilde a_i^\varepsilon = \sqrt{\frac{h_{ex}}{n}}\,(a_i^\varepsilon - p)$ converge to a minimizer of $w_n$ (defined in (3.4)).
(3) If $n \to \infty$ and $h_{ex} \to \infty$ (but still $n \ll h_{ex}$), the $\tilde a_i^\varepsilon = \sqrt{\frac{h_{ex}}{n}}\,(a_i^\varepsilon - p)$ are such that $\frac{1}{n} \sum_i \delta_{\tilde a_i^\varepsilon} \to \mu_0$ in $(C_0^0(\Omega))^*$, where $\mu_0$ is the minimizer of $I$ (defined in (3.5)).
The method of the proof consists in finding these solutions as local minimizers by minimizing $G_\varepsilon$ over some open sets
\[ U_n = \{ (u, A) : \pi(n - 1) |\log \varepsilon| < F_\varepsilon(u, A) < \pi(n + 1) |\log \varepsilon| \} \]
where $F_\varepsilon$ is a free Ginzburg–Landau energy (without applied field). Minimizing over $U_n$ consists, roughly speaking, in minimizing over configurations with $n$ vortices; the difficulty is in proving that the minimum over $U_n$ is achieved at an interior point (this comes from the quantization of the energy cost of vortices). We thus find a multiplicity of locally minimizing solutions, for a given $h_{ex}$ in a wide range (from $h_{ex} = O(1)$ to $h_{ex} \gg |\log \varepsilon|$). Essentially, these are solutions with 0, 1, 2, 3, ... vortices which coexist and are all stable, even if not energy-minimizing. We have also derived multiple “renormalized energies” $R_{n, h_{ex}}$, $w_n$, $I(\mu)$ corresponding to the three regimes above. Observe that $w_n$ corresponds somewhat to the limit of $R_{n, h_{ex}}$ as $h_{ex} \to \infty$, while $I$ is a continuum limit as $n \to \infty$ (but still $n \ll h_{ex}$) of $w_n$. $E_\lambda$ can also be seen as the limit as both $n$ and $h_{ex}$ tend to $\infty$ but with $n / h_{ex}$ not tending to 0. Thus these limiting or renormalized energies are not only valid for global minimization, but also for local minimization.
5. Critical points approach
The issue here is to derive conditions on limiting vortices or vortex-densities just assuming that we start from a family of solutions to (1.2)–(1.5), or critical points of $G_\varepsilon$, not necessarily stable. The strategy consists in passing to the limit $\varepsilon \to 0$, not in (1.2)–(1.5), but in the stationarity relation
\[ \frac{d}{dt}\Big|_{t=0} G_\varepsilon(u \circ \chi_t, A \circ \chi_t) = 0 \]
satisfied for the critical points (with $\chi_t$ a one-parameter family of diffeomorphisms such that $\chi_0 = \mathrm{Id}$).
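That relation is equivalent (by Noether's theorem) to a relation of the form $\operatorname{div} T_\varepsilon = 0$, where $T_\varepsilon$ is called the “stress-energy” or “energy-momentum” tensor. For the present energy functional,
\[ T_\varepsilon = \frac{1}{2} \begin{pmatrix} |\partial_1^A u|^2 - |\partial_2^A u|^2 & 2 \langle \partial_1^A u, \partial_2^A u \rangle \\ 2 \langle \partial_1^A u, \partial_2^A u \rangle & |\partial_2^A u|^2 - |\partial_1^A u|^2 \end{pmatrix} + \left( \frac{h^2}{2} - \frac{(1 - |u|^2)^2}{2\varepsilon^2} \right) \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \]
where $\partial_j^A = \partial_j - i A_j$. This strategy was already implemented for the functional $E_\varepsilon$ in [BBH], leading to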
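Theorem 5.1 (Bethuel–Brezis–Hélein [BBH]). Let $u_\varepsilon$ be a family of critical points of $E_\varepsilon$ (with Dirichlet data) such that $E_\varepsilon(u_\varepsilon) \le C |\log \varepsilon|$. Then there exist $n$ points $a_i$ such that, up to extraction,
\[ \mu(u_\varepsilon) \rightharpoonup 2\pi \sum_{i=1}^n d_i \delta_{a_i} \]
with
\[ \forall i, \quad \nabla_i W(a_1, \dots, a_n) = 0. \tag{5.1} \]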
Recall that $W$ is the renormalized energy; the criticality condition (5.1) expresses the fact that the limiting force acting on each vortex is 0. Thus, the vortices of critical points of $E_\varepsilon$ converge to critical points of the limiting energy $W$. This result corresponds to what we obtain for the vortex-densities for $G_\varepsilon$. In what follows we assume that $(u_\varepsilon, A_\varepsilon)$ are sequences of critical points of $G_\varepsilon$ such that $G_\varepsilon(u_\varepsilon, A_\varepsilon) \le \varepsilon^{-\alpha}$, $\alpha < \frac{1}{3}$, and $N_\varepsilon$ is defined as $\sum_i |d_i|$, where the $d_i$'s are the degrees of the balls of total radius $r = \varepsilon^{2/3}$ given by Theorem 2.1.
Theorem 5.2 ([SS3, SS6]). Let $(u_\varepsilon, A_\varepsilon)$ and $N_\varepsilon$ be as above. If $N_\varepsilon$ vanishes in a neighborhood of 0, then $\mu_\varepsilon := \mu(u_\varepsilon, A_\varepsilon)$ tends to 0 in $W^{-1,p}(\Omega)$ for some $p \in (1, 2)$. If not, then going to a subsequence, we have
\[ \frac{\mu_\varepsilon}{N_\varepsilon} \to \mu \tag{5.2} \]
in $W^{-1,p}(\Omega)$ for some $p \in (1, 2)$, where $\mu$ is a measure. Moreover, one of the two following possibilities occurs.
(1) There exists a subsequence $\varepsilon \to 0$ such that $N_\varepsilon = o(h_{ex})$ along the subsequence. Then we have
\[ \mu \nabla h_0 = 0. \tag{5.3} \]
(2) There exists a subsequence $\varepsilon \to 0$ such that $h_{ex} / N_\varepsilon$ tends to $\lambda \in \mathbb{R}_+$ along the subsequence. Then, letting $h_\mu$ be the solution of $-\Delta h_\mu + h_\mu = \mu$ in $\Omega$ and $h_\mu = \lambda$ on $\partial\Omega$, the symmetric 2-tensor $T_\mu$ with coefficients
\[ T_{ij} = \frac{1}{2} \begin{pmatrix} |\partial_1 h_\mu|^2 - |\partial_2 h_\mu|^2 & 2 \partial_1 h_\mu \partial_2 h_\mu \\ 2 \partial_1 h_\mu \partial_2 h_\mu & |\partial_2 h_\mu|^2 - |\partial_1 h_\mu|^2 \end{pmatrix} - \frac{h_\mu^2}{2} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]
is divergence free in finite part.
In the latter case, if $\mu$ is such that $h_\mu \in H^1(\Omega)$, then $T_\mu$ is in $L^1$ and divergence free in the sense of distributions. Moreover $|\nabla h_\mu|^2$ is in $W^{1,q}_{loc}(\Omega)$ for any $q \in [1, +\infty)$. If, moreover, we assume that $\nabla h_\mu \in C^0(\Omega)$, then $\mu \nabla h_\mu = 0$. Finally, if we assume $\nabla h_\mu \in C^0(\Omega) \cap W^{1,1}(\Omega)$ (this is the case if $\mu$ is in $L^p$ for some $p > 1$, for instance), then $h_\mu$ is in $C^{1,\alpha}(\Omega)$ for any $\alpha \in (0, 1)$ and $0 \le h_\mu \le \lambda$. In this case
\[ \mu = h_\mu \mathbf{1}_{\{|\nabla h_\mu| = 0\}}, \tag{5.4} \]
and thus $\mu$ is a nonnegative $L^\infty$ function.
To sum up, the limiting condition is $\mu \nabla h_0 = 0$ in the first case, meaning that when there are too few of them, the vortices all concentrate at the critical points of $h_0$ in the limit; in case 2 it is a weak form of the relation $\mu \nabla h_\mu = 0$ (which cannot be written as such when $h_\mu$ is not regular enough; counterexamples of that case can be built). We obtained an analogous result for critical points of $E_\varepsilon$ with possibly large numbers of vortices. Also, once more the strategy carries through to higher dimensions for the functional $E_\varepsilon$. It was proved that the vorticities for critical points of $E_\varepsilon$ converge to stationary varifolds, i.e., critical points for the length/area (see [LR1, BBO]). Once limiting energies or limiting conditions have been derived, a natural question is to solve inverse problems: given limiting vortices which satisfy the conditions, does there exist a sequence of solutions or local minimizers converging to that limit? Is there a one-to-one correspondence? Theorem 4.1 is already a result of that type. For other results on inverse problems, see [PR] in dimension 2 and [MSZ] in dimension 3.
6. Further studies: dynamics and stability
The philosophy that has been successful here is clear: one can extract limiting reduced energies (sometimes depending on some parameter regimes). These energies come up as Γ-limits, thus giving the behavior of energy-minimizers, but they are not only relevant for energy-minimizers: they are also relevant for critical points (“critical points converge to critical points”), for local minimizers, and for inverse problems. It then seems natural to try to explore how much further these limiting energies can be relevant. In [SS5, S3], we give criteria to determine when a limiting energy $F$ of a family of energies $E_\varepsilon$ is the limit in a sort of $C^1$ or $C^2$ sense. When these criteria are satisfied, we can pass to the limit for solutions (that is, we can say that critical points converge to critical points).
Likewise, for solutions of the gradient-flow dynamics $\partial_t u_\varepsilon = -\nabla E_\varepsilon(u_\varepsilon)$, we can say that they converge to solutions of the gradient flow of the limiting energy, $\partial_t u = -\nabla F(u)$; and finally we may also pass to the limit in stability/instability relations (saying that stable/unstable solutions converge to stable/unstable solutions). The abstract criteria that we formulate can be verified for the Ginzburg–Landau functionals $E_\varepsilon$ and $G_\varepsilon$, for the case of a finite number of vortices, thus recovering the limiting dynamical laws for the vortices under the heat flow, $\partial_t a_i = -\frac{1}{\pi} \nabla_i W(a_1, \dots, a_n)$, as obtained by PDE methods by Lin [Li], Jerrard–Soner [JS2], Spirn [Sp].
Remark 6.1. The analogous result for $E_\varepsilon$ is also true in higher dimensions, where the limiting energy-density is length/surface. It was established that the limit of the parabolic evolution of $E_\varepsilon$ is a Brakke flow (a weak form of gradient flow for the limiting energy) (see [LR2, BOS]).
References
[A] A. Abrikosov, On the Magnetic Properties of Superconductors of the Second Type, Soviet Phys. JETP 5, (1957), 1174–1182.
[BBH] F. Bethuel, H. Brezis and F. Hélein, Ginzburg–Landau Vortices, Birkhäuser, (1994).
[BBO] F. Bethuel, H. Brezis and G. Orlandi, Asymptotics for the Ginzburg–Landau equation in arbitrary dimensions, J. Funct. Anal., 186 (2001), no. 2, 432–520.
[BOS] F. Bethuel, G. Orlandi and D. Smets, Convergence of the parabolic Ginzburg–Landau equation to motion by mean-curvature, to appear in Annals of Math.
[BR] F. Bethuel and T. Rivière, Vortices for a Variational Problem Related to Superconductivity, Annales IHP, Analyse non linéaire, 12, (1995), 243–303.
[DG] P.G. DeGennes, Superconductivity of Metals and Alloys, Benjamin, New York and Amsterdam, (1966).
[GL] V.L. Ginzburg, L.D. Landau, in Collected papers of L.D. Landau, edited by D. Ter Haar, Pergamon Press, Oxford (1965).
[KS] D. Kinderlehrer, G. Stampacchia, An introduction to variational inequalities and their applications, Pure and Applied Mathematics, Vol. 88, Academic Press, New York (1980).
[Je] R. Jerrard, Lower Bounds for Generalized Ginzburg–Landau Functionals, SIAM J. Math. Anal. 30, No. 4, (1999), 721–746.
[JS1] R.L. Jerrard and H.M. Soner, The Jacobian and the Ginzburg–Landau functional, Calc. Var., 14, (2002), No. 2, 151–191.
[JS2] R.L. Jerrard and H.M. Soner, Dynamics of Ginzburg–Landau vortices, Arch. Rational Mech. Anal. 142, No. 2, (1998), 99–125.
[Li] F.H. Lin, Some Dynamical Properties of Ginzburg–Landau Vortices, Comm. Pure Appl. Math., 49, (1996), 323–359.
[LR1] F.H. Lin and T. Rivière, Complex Ginzburg–Landau equations in high dimensions and codimension two area minimizing currents, J. Eur. Math. Soc., 1, (1999), no. 3, 237–311.
[LR2] F.H. Lin and T. Rivière, A quantization property for moving line vortices, Comm. Pure Appl. Math. 54, No. 7, (2001), 826–850.
[MM] L. Modica and S. Mortola, Il limite nella Γ-convergenza di una famiglia di funzionali ellittici, Boll. Un. Mat. Ital. A (5), 14 (1977), no. 3, 526–529.
[MSZ] A. Montero, P. Sternberg and W. Ziemer, Local minimizers with vortices to the Ginzburg–Landau system in 3D, to appear in Comm. Pure Appl. Math.
[PR] F. Pacard and T. Rivière, Linear and nonlinear aspects of vortices, Progress in Nonlinear PDE's and Their Applications, Vol. 39, Birkhäuser, (2000).
[ST] E. Saff and V. Totik, Logarithmic potentials with external fields, Springer-Verlag, Berlin, (1997).
[Sa1] E. Sandier, Lower Bounds for the Energy of Unit Vector Fields and Applications, J. Functional Analysis, 152, No. 2, (1998), 379–403.
[Sa2] E. Sandier, Ginzburg–Landau minimizers from $\mathbb{R}^{n+1}$ to $\mathbb{R}^n$ and minimal connections, Indiana Univ. Math. J., 50, (2001), no. 4, 1807–1844.
[SS1] E. Sandier and S. Serfaty, On the Energy of Type–II Superconductors in the Mixed Phase, Reviews in Math. Phys., 12, No. 9, (2000), 1219–1257.
[SS2] E. Sandier and S. Serfaty, A Rigorous Derivation of a Free-Boundary Problem Arising in Superconductivity, Annales Scientifiques de L’Ecole Normale Supérieure, 4e ser, 33, (2000), 561–592.
[SS3] E. Sandier and S. Serfaty, Limiting Vorticities for the Ginzburg–Landau equations, Duke Math. J., 117, (2003), no. 3, 403–446.
[SS4] E. Sandier and S. Serfaty, The decrease of bulk-superconductivity close to the second critical field in the Ginzburg–Landau model, SIAM J. Math. Anal., 34 (2003), no. 4, 939–956.
[SS5] E. Sandier and S. Serfaty, Gamma-convergence of gradient flows with applications to Ginzburg–Landau, Comm. Pure Appl. Math. 57 (2004), no. 12, 1627–1672.
[SS6] E. Sandier and S. Serfaty, Vortices in the Magnetic Ginzburg–Landau Model, monograph in preparation.
[S1] S. Serfaty, Local Minimizers for the Ginzburg–Landau Energy near Critical Magnetic Field, part I, Comm. Contemporary Mathematics, 1, No. 2, (1999), 213–254; part II, Comm. Contemporary Mathematics, 1, No. 3, (1999), 295–333.
[S2] S. Serfaty, Stable Configurations in Superconductivity: Uniqueness, Multiplicity and Vortex-Nucleation, Arch. for Rat. Mech. Anal., 149 (1999), 329–365.
[S3] S. Serfaty, Stability in 2D Ginzburg–Landau Passes to the Limit, Indiana Univ. Math. J., 54, No 1, (2005), 199–222.
[Sp] D. Spirn, Vortex dynamics of the full time-dependent Ginzburg–Landau equations, Comm. Pure Appl. Math., 55, (2002), no. 5, 537–581.
[T] M. Tinkham, Introduction to Superconductivity, 2d edition, McGraw–Hill, (1996).
Sylvia Serfaty Courant Institute, NYU
4ECM Stockholm 2004 © 2005 European Mathematical Society
Validated Numerics for Pedestrians
Warwick Tucker
Abstract. The aim of this paper is to give a very brief introduction to the emerging area of validated numerics. This is a rapidly growing field of research faced with the challenge of interfacing computer science and pure mathematics. Most validated numerics is based on interval analysis, which allows its users to account for both rounding and discretization errors in computer-aided proofs. We will illustrate the strengths of these techniques by converting the well-known bisection method into an efficient, validated root finder.
1. Introduction
Since the creation of the digital computer, numerical computations have played an increasingly fundamental role in modeling physical phenomena for science and engineering. With regard to computing speed and memory capacity, the early computers seem almost amusingly crude compared to their modern counterparts. Nevertheless, real-world problems were solved, and the speed-up due to the use of machines pushed the frontier of feasible computing tasks forward. Through a myriad of small developmental increments, we are now on the verge of producing Peta-flop/Peta-byte computers, an incredible feat which must have seemed completely unimaginable fifty years ago. Due to the inherent limitations of any finite-state machine, numerical computations are almost never carried out in a mathematically precise manner. As a consequence, they do not produce exact results, but rather approximate values that usually, but far from always, are near the true ones. In addition to this, external influences, such as an over-simplified mathematical model or a discrete approximation of the same, introduce additional inaccuracies into the calculations. As a result, even a seemingly simple numerical algorithm is virtually impossible to analyze with regard to its accuracy. To do so would involve taking into account every single floating point operation performed throughout the entire computation. It is somewhat amazing that a program performing only two floating point operations can be challenging to analyze! At speeds of one billion operations per second, any medium-sized program is clearly out of reach. This is a particularly valid point for complex systems, which require enormous models and very long computer runs. The grand example in this setting is weather prediction, although much simpler systems display the same kind of inaccessibility.
This state of affairs has led us to the rather awkward position where we can perform formidable computing tasks at very high speed, but where we do not have the capability to judge the validity of the final results. The question “Are we just getting the wrong answers faster?” is therefore a valid one, albeit slightly unkind. Fortunately, there are computational models in which approximate results are automatically provided with guaranteed error bounds. The simplest such model, interval analysis, was developed by Ramon Moore in the 1960’s, see [Mo66]. At the time, however, computers were still at an early stage of development, and the additional costs associated with keeping track of the computational errors were deemed too high. Furthermore, without special care in formulating the numerical algorithms, the produced error bounds would inevitably become overly pessimistic, and therefore quite useless. Today, the development of interval methods has reached a high level of sophistication: tight error bounds can be produced, in many cases even faster than non-rigorous computations can provide an “approximation”. As a testament to this, several highly non-trivial results in pure mathematics have recently been proved using computer-aided methods based on such interval techniques, see, e.g., [Ha95], [Tu02], and [GM03]. We have now reached the stage where we can demand rigor as well as speed from our numerical computations. In light of this, it is clear that the future development of scientific computation must include techniques for performing validated numerics.
2. Interval arithmetic
In this section, we will briefly describe the fundamentals of interval arithmetic. For a concise reference on this topic, see, e.g., [AH83], [KM81], [Mo66], or [Mo79]. For early papers on the topic, see [Yo31], [Wa56], and [Su58]. Let $\mathbb{IR}$ denote the set of closed intervals. For any element $[a] \in \mathbb{IR}$, we adopt the notation $[a] = [\underline{a}, \overline{a}]$.
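If $\star$ is one of the operators $+, -, \times, \div$, we define arithmetic operations on elements of $\mathbb{IR}$ by
\[ [a] \star [b] = \{ a \star b : a \in [a],\ b \in [b] \}, \]
except that $[a] \div [b]$ is undefined if $0 \in [b]$. Working exclusively with closed intervals, we can describe the resulting interval in terms of the endpoints of the operands:
\begin{align*}
[a] + [b] &= [\underline{a} + \underline{b},\ \overline{a} + \overline{b}] \\
[a] - [b] &= [\underline{a} - \overline{b},\ \overline{a} - \underline{b}] \\
[a] \times [b] &= [\min(\underline{a}\underline{b}, \underline{a}\overline{b}, \overline{a}\underline{b}, \overline{a}\overline{b}),\ \max(\underline{a}\underline{b}, \underline{a}\overline{b}, \overline{a}\underline{b}, \overline{a}\overline{b})] \\
[a] \div [b] &= [a] \times [1/\overline{b},\ 1/\underline{b}], \quad \text{if } 0 \notin [b].
\end{align*}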
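The endpoint formulas above translate almost line by line into code. The following is a minimal sketch, assuming ordinary double-precision floats and ignoring outward (directed) rounding, so it is illustrative rather than rigorous; the class name `Interval` and its fields `lo` and `hi` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi]; illustrative only (no directed rounding)."""
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        # Lower endpoint uses the other's upper endpoint, and vice versa.
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        # General four-product formula (no nine-case split).
        products = (self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi)
        return Interval(min(products), max(products))

    def __truediv__(self, other):
        if other.lo <= 0.0 <= other.hi:
            raise ZeroDivisionError("divisor interval contains 0")
        return self * Interval(1.0 / other.hi, 1.0 / other.lo)


a = Interval(1.0, 2.0)
b = Interval(-3.0, 4.0)

total = a + b                          # [1 - 3, 2 + 4]
difference = a - b                     # [1 - 4, 2 + 3]
product = a * b                        # min/max of -3, 4, -6, 8
quotient = a / Interval(2.0, 4.0)      # [1, 2] * [1/4, 1/2]
```

A rigorous implementation would additionally round lower endpoints down and upper endpoints up after every operation.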
To increase speed, it is customary to break the formula for multiplication into nine cases (depending on the signs of the endpoints), where only one case involves more than two multiplications. When computing with finite precision, directed rounding must also be taken into account, see, e.g., [KM81] or [Mo79]. It follows immediately from the definitions that addition and multiplication are both associative and commutative. The distributive law, however, does not always hold. As an example, we have
\[ [-1, 1]([-1, 0] + [3, 4]) = [-1, 1][2, 4] = [-4, 4] \]
whereas
\[ [-1, 1][-1, 0] + [-1, 1][3, 4] = [-1, 1] + [-4, 4] = [-5, 5]. \]
This unusual property is important to keep in mind when representing functions as part of a computer program. Interval arithmetic satisfies a weaker rule than the distributive law, which we shall refer to as sub-distributivity:
\[ [a]([b] + [c]) \subseteq [a][b] + [a][c]. \]
Another key feature of interval arithmetic is that it is inclusion monotonic, i.e., if $[a] \subseteq [a']$ and $[b] \subseteq [b']$, then
\[ [a] \star [b] \subseteq [a'] \star [b'], \]
where we demand that $0 \notin [b']$ for division. Finally, we can turn $\mathbb{IR}$ into a metric space by equipping it with the Hausdorff distance:
\[ d([a], [b]) = \max\{ |\underline{a} - \underline{b}|,\ |\overline{a} - \overline{b}| \}. \tag{2.1} \]
3. Interval-valued functions
One of the main points of studying interval arithmetic is that we want a simple way of enclosing the range of a real-valued function. Let $D \subseteq \mathbb{R}$, and consider a function $f : D \to \mathbb{R}$. We define the range of $f$ over $D$ to be the set
\[ R(f; D) = \{ f(x) : x \in D \}. \]
Except for the most trivial cases, mathematics provides few tools to describe the range of a given function $f$ over a specific domain $D$. Indeed, today there exists an entire branch of mathematics and computer science, Optimization Theory, devoted to “simply” finding the smallest element of the set $R(f; D)$. We shall see that interval arithmetic provides a helping hand in this matter. As a first step, we begin by attempting to extend the real functions to interval functions. By this, we mean functions that take and return intervals rather than real numbers.
We already have the theory to extend rational functions, i.e., functions of the form $f(x) = p(x)/q(x)$, where $p$ and $q$ are polynomials. Simply substituting all occurrences of the real variable $x$ with the interval variable $[x]$ (and the real arithmetic operators with their interval counterparts) produces a rational interval function $F([x])$, called the natural interval extension of $f$. As long as no singularities are encountered, we have $R(f; [x]) \subseteq F([x])$, by the inclusion monotonicity property.
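To make the substitution step concrete, here is a small sketch of a natural interval extension for the rational function $f(x) = x/(1 + x^2)$, which is an example chosen for this illustration and not taken from the text. Intervals are represented as plain `(lo, hi)` tuples, the helper names `add`, `mul` and `div` are hypothetical, and rounding errors are ignored.

```python
def add(a, b):
    return (a[0] + b[0], a[1] + b[1])


def mul(a, b):
    products = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return (min(products), max(products))


def div(a, b):
    if b[0] <= 0.0 <= b[1]:
        raise ZeroDivisionError("divisor interval contains 0")
    return mul(a, (1.0 / b[1], 1.0 / b[0]))


def f(x):
    """The real function f(x) = x / (1 + x^2)."""
    return x / (1.0 + x * x)


def F(X):
    """Natural interval extension: every x and operator replaced by its interval version."""
    one = (1.0, 1.0)
    return div(X, add(one, mul(X, X)))


X = (1.0, 2.0)
enclosure = F(X)  # [1, 2] / [2, 5] = [1, 2] * [0.2, 0.5]

# Sampling only approximates the range from inside, but it must lie in the enclosure.
samples = [f(X[0] + k * (X[1] - X[0]) / 100) for k in range(101)]
sampled_range = (min(samples), max(samples))
```

Here the enclosure is valid but far from sharp: $f$ is decreasing on $[1, 2]$, so its true range there is $[0.4, 0.5]$.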
For future reference, we define the class of standard functions to be the set
\[ S = \{ a^x, \log_a x, x^{p/q}, \operatorname{abs} x, \sin x, \cos x, \tan x, \ldots, \sinh x, \cosh x, \tanh x, \arcsin x, \arccos x, \arctan x \}. \]
By using the fact that these functions are piecewise monotonic, it is possible to extend all standard functions to the interval realm: any f ∈ S has a sharp interval extension F. By sharp, we mean that the interval evaluation F([x]) produces the exact range of f over the domain [x]:
f ∈ S ⇒ R(f; [x]) = F([x]).
Note that, in particular, this implies that F([x, x]) = f(x), i.e., F and f are identical on R. Of course, the class of standard functions is too small for most practical applications. We will use them as building blocks for more complicated functions as follows.

Definition 3.1. Any real-valued function expressed as a finite number of standard functions combined with constants, arithmetic operations, and compositions is called an elementary function. The class of elementary functions is denoted by E.

Thus a representation of an elementary function is defined in terms of its sub-expressions. The leaves of the tree of sub-expressions (sometimes called a Directed Acyclic Graph – or a DAG for short) are either constants or the variable of the function, see Figure 1.

[Figure 1. A DAG for f(x) = (x² + sin x)(3x² + 6). Its nodes are x, x², sin x, 3, 3x², 6, x² + sin x, 3x² + 6, and the root (x² + sin x)(3x² + 6).]

It is important to note that, due to the intrinsic nature of interval arithmetic, the interval extension F depends on the particular representation of f. To illustrate this point, consider the functions f(x) = x − x and g(x) = 0. Their natural interval extensions are F([x]) = [x] − [x] and G([x]) = [0, 0], respectively. Although f and g are identical over R, their extensions differ over IR.
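The dependence on the representation can be seen in a few lines of code. The sketch below uses a hypothetical `Interval` type with plain double endpoints (no directed rounding) and compares the natural extensions of f(x) = x − x and g(x) = 0.

```cpp
// Hypothetical minimal interval type (illustration only, no directed rounding).
struct Interval {
    double lo, hi;
};

// Interval subtraction: [a] - [b] = [a.lo - b.hi, a.hi - b.lo].
Interval operator-(Interval a, Interval b) {
    return {a.lo - b.hi, a.hi - b.lo};
}

// Natural interval extension of f(x) = x - x. The two occurrences of [x]
// are treated as independent, so the result has twice the width of [x].
Interval F(Interval x) {
    return x - x;
}

// Natural interval extension of g(x) = 0.
Interval G(Interval) {
    return {0.0, 0.0};
}
```

For [x] = [0, 1] this gives F([x]) = [−1, 1], while G([x]) = [0, 0]. On a thin interval such as [2, 2] both extensions return [0, 0], as they must.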
Nevertheless, given a real-valued function f, any one of its (well-defined) interval extensions F satisfies R(f; [x]) ⊆ F([x]) due to the inclusion monotonicity property:

Theorem 3.2 (The fundamental theorem of interval analysis). Given an elementary function f, and a natural interval extension F such that F([x]) is well defined for some [x] ∈ IR, we have
(1) [z] ⊆ [z′] ⊆ [x] ⇒ F([z]) ⊆ F([z′]), (inclusion monotonicity)
(2) R(f; [x]) ⊆ F([x]). (range enclosure)
For a proof, see, e.g., [Mo66]. Of course, the enclosure F([x]) is rarely sharp, and may in fact grossly overestimate R(f; [x]). If f is sufficiently regular, however, this overestimation can be made arbitrarily small by subdividing [x] into many smaller intervals, evaluating F over each sub-interval, and then taking the union of all resulting sets. To make this statement more precise, we define E_L to be the set of all (representations of) elementary functions whose sub-expressions are Lipschitz:
E_L = {f ∈ E : each sub-expression of f is Lipschitz}.

Theorem 3.3 (Tight range enclosure). Consider f : I → R with f ∈ E_L, and let F be an inclusion isotonic interval extension of f such that F([x]) is well defined for some [x] ⊆ I. Then there exists a positive real number K, depending on F and [x], such that, if $[x] = \bigcup_{i=1}^{k} [x^{(i)}]$, then
$R(f; [x]) \subseteq \bigcup_{i=1}^{k} F([x^{(i)}]) \subseteq F([x])$
and
$w\Bigl(\bigcup_{i=1}^{k} F([x^{(i)}])\Bigr) \le w\bigl(R(f; [x])\bigr) + K \max_{i=1,\ldots,k} w\bigl([x^{(i)}]\bigr)$.

Here, $w([x]) = \overline{x} - \underline{x}$ denotes the width of [x]. For a proof of this theorem, see, e.g., [Mo66]. In essence, the second part of Theorem 3.3 says that, if the listed conditions are satisfied, then the overestimation of the range tends to zero no slower than linearly as the domain shrinks:
d(R(f; [x]), F([x])) = O(w([x])),
where d(·, ·) is the Hausdorff distance, as defined in (2.1). Since Lipschitz functions satisfy w(R(f; [x])) = O(w([x])), it also follows that w(F([x])) = O(w([x])), i.e., the width of the enclosure scales (at most) linearly with w([x]), see Figure 2.
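The effect of subdivision described by Theorem 3.3 can be observed numerically. The following sketch is an illustration under stated assumptions, not part of the original text: it takes the example function f(x) = x(1 − x) on [0, 1] (chosen here because its exact range [0, 1/4] is known), splits [0, 1] into k equal pieces, and measures by how much the hull of the union of the enclosures is wider than the true range. The `Interval` type and all function names are hypothetical, and no directed rounding is used.

```cpp
#include <algorithm>

// Hypothetical minimal interval type (illustration only, no directed rounding).
struct Interval {
    double lo, hi;
};

Interval operator-(Interval a, Interval b) {
    return {a.lo - b.hi, a.hi - b.lo};
}

Interval operator*(Interval a, Interval b) {
    double p1 = a.lo * b.lo, p2 = a.lo * b.hi;
    double p3 = a.hi * b.lo, p4 = a.hi * b.hi;
    return {std::min({p1, p2, p3, p4}), std::max({p1, p2, p3, p4})};
}

double width(Interval a) {
    return a.hi - a.lo;
}

// Natural interval extension of the example function f(x) = x(1 - x).
Interval F(Interval x) {
    return x * (Interval{1.0, 1.0} - x);
}

// Hull of the union of F over k equal sub-intervals of [0, 1].
Interval subdivided_enclosure(int k) {
    Interval hull = F({0.0, 1.0 / k});
    for (int i = 1; i < k; ++i) {
        Interval piece = F({double(i) / k, double(i + 1) / k});
        hull.lo = std::min(hull.lo, piece.lo);
        hull.hi = std::max(hull.hi, piece.hi);
    }
    return hull;
}

// Excess width relative to the exact range [0, 1/4] of f over [0, 1].
double overestimation(int k) {
    return width(subdivided_enclosure(k)) - 0.25;
}
```

Without subdivision (k = 1) the enclosure is [0, 1], so the excess width is 0.75. For k = 10 it drops to roughly 0.05 and for k = 1000 to roughly 0.0005, which is consistent with the linear bound K · max w([x^(i)]) in the theorem.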
Figure 2. Successively tighter interval enclosures of f(x) = cos³ x + sin x.

4. The bisection method

As a simple illustration of the powers of interval analysis, we will study the bisection method. This is a well-known algorithm for locating a zero of a continuous function. To be precise, let f be continuous on [a, b], and suppose that f(a)f(b) < 0. Then, by the intermediate-value theorem, f has at least one root α ∈ (a, b). The bisection method proceeds as follows: Initially, we set a_0 = a and b_0 = b. At stage k, we compute the midpoint c_k = (a_k + b_k)/2. Now there are three possibilities. If f(c_k) = 0, then we can set α = c_k, and terminate the search. If f(a_k)f(c_k) < 0, we set a_{k+1} = a_k and b_{k+1} = c_k. If f(a_k)f(c_k) > 0, we set a_{k+1} = c_k and b_{k+1} = b_k. The search is guaranteed to converge to a root since we have |a_k − b_k| = 2^{−k}|a_0 − b_0|. When programming the bisection method, it is common to end the search when some predefined tolerance is met, e.g., |a_k − b_k| ≤ tol. A C++ implementation of the real-valued bisection method is presented in Figure 3.

    void bisect(pfcn f, double a, double b, double tol)
    {   // We are assuming that f(a)*f(b) < 0.
      double c = (a + b)/2;
      double fc = f(c);
      if ( (b - a < tol) || (fc == 0) )    // If the tolerance is met, or f(c) = 0
        cout << c << endl;                 // ... print the midpoint.
      else {                               // Otherwise...
        double ff = f(a)*fc;
        if ( ff < 0 )
          bisect(f, a, c, tol);            // ... check the left half, or
        else if ( ff > 0 )
          bisect(f, c, b, tol);            // ... check the right half.
      }
    }

Figure 3. A recursive implementation of the real-valued bisection method.
There are, however, several flaws with the bisection method, when used as a root-finding device. One is the problem of finding points a and b satisfying the starting condition f(a)f(b) < 0. Indeed, if f is of constant sign, with the exception of a very small set, it may be impossible to even start the search. An example is given by the class of functions
$f_\rho(x) = 1 - 2e^{-\rho^2 (x - 1/2)^2}$.
Clearly, f_ρ(1/2) = −1 for all ρ. Nevertheless, when taking ρ large, almost all other function values are very close to +1, see Figure 4(a).
Figure 4. (a) $f_\rho(x) = 1 - 2e^{-\rho^2 (x - 1/2)^2}$ for ρ ∈ {1, 10, 100}. (b) $f(x) = \sin\bigl(\sin(x) + 15/(x^2 + 1)\bigr)$.

A second problem occurs when f has several roots within the search domain. Suppose that f has N simple roots α_1 < α_2 < ··· < α_N in [a, b]. Then the bisection method will find the even-labeled roots with probability zero, whilst the odd-labeled roots are located with uniform probability, see [Co77].

The interval bisection method deals elegantly with both problems. Instead of aiming directly at finding a root of f, it discards subsets of [a, b] that are guaranteed to be root-free. By using the second part of Theorem 3.2, it follows that
0 ∉ F([x]) ⇒ 0 ∉ R(f; [x]).
Therefore, the strategy of the interval version of the bisection scheme is to recursively bisect the search space, retaining only those subintervals [x_i] satisfying 0 ∈ F([x_i]). Such intervals are called feasible, since they may contain roots of f. Once a feasible subinterval has reached a width smaller than the tolerance tol, it is sent to the output. A C++ implementation of the interval-valued bisection method is presented in Figure 5. Note how much clearer the code is compared to its real-valued counterpart in Figure 3. When the search has been exhausted, we are left with a collection of feasible intervals [x_1], …, [x_M] whose union contains all roots of f within [a, b]:
$Z = \{\alpha \in [a, b] : f(\alpha) = 0\} \subseteq \bigcup_{i=1}^{M} [x_i] = S$.
    void bisect(pfcn F, interval X, double tol)
    {
      if ( subset(0.0, F(X)) )               // If zero is contained in F(X)
        if ( width(X) < tol )                // ... and the tolerance is met
          cout << X << endl;                 // ... print the subinterval.
        else {                               // Otherwise, divide and conquer.
          bisect(F, interval(min(X), mid(X)), tol);
          bisect(F, interval(mid(X), max(X)), tol);
        }
    }
Figure 5. A recursive implementation of the interval-valued bisection method.

Of course, the set S may grossly overestimate Z. If, however, f satisfies the assumptions of Theorem 3.3, we can expect a very good agreement between S and Z, as long as the tolerance tol is kept reasonably small. In the case where f has only simple roots in [a, b], there are several ways of determining whether a feasible interval [x_i] contains a unique root or not, see, e.g., [AH83], [HH95], [Mo66], or [Mo79].

5. Examples

We will now provide two concrete examples of the interval-valued bisection method. The first example deals with the family of problematic functions f_ρ, described earlier. We will fix the search region to [x] = [−5, 5], and set the tolerance to tol = 10^{−10}. It is clear that each f_ρ has two simple roots, both approaching 1/2 as ρ increases.

Table 1. Interval enclosures of the roots of $f_\rho(x) = 1 - 2e^{-\rho^2 (x - 1/2)^2}$ for varying ρ.

ρ       α_1                                       α_2
10^0    [−0.3325546111592, −0.3325546110863]    [+1.3325546111445, +1.3325546112174]
10^1    [+0.4167445388156, +0.4167445388885]    [+0.5832554610969, +0.5832554611698]
10^2    [+0.4916744538786, +0.4916744539515]    [+0.5083255461067, +0.5083255461796]
10^3    [+0.4991674453776, +0.4991674454505]    [+0.5008325546077, +0.5008325546806]
10^4    [+0.4999167445203, +0.4999167445931]    [+0.5000832553923, +0.5000832554652]
10^5    [+0.4999916744418, +0.4999916745147]    [+0.5000083255436, +0.5000083256164]
10^6    [+0.4999991674412, +0.4999991675141]    [+0.5000008325441, +0.5000008326170]
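The computation behind Table 1 can be imitated with a self-contained sketch. The code below is not the implementation used for the table: it replaces the interval library of Figure 5 by a hypothetical `Interval` struct with plain double endpoints and no directed rounding, so its output is not a validated enclosure. The names `sqr`, `F`, `contains_zero` and `roots` are likewise assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical minimal interval type (no directed rounding, not rigorous).
struct Interval {
    double lo, hi;
};

double width(Interval x) { return x.hi - x.lo; }
double mid(Interval x) { return 0.5 * (x.lo + x.hi); }
bool contains_zero(Interval x) { return x.lo <= 0.0 && 0.0 <= x.hi; }

// Sharp extension of t -> t^2. Using the product [t][t] instead would
// overestimate the range whenever [t] contains zero.
Interval sqr(Interval t) {
    double a = t.lo * t.lo, b = t.hi * t.hi;
    if (t.lo <= 0.0 && 0.0 <= t.hi) return {0.0, std::max(a, b)};
    return {std::min(a, b), std::max(a, b)};
}

// Interval extension of f_rho(x) = 1 - 2 exp(-rho^2 (x - 1/2)^2).
Interval F(Interval x, double rho) {
    Interval s = sqr({x.lo - 0.5, x.hi - 0.5});
    // exp is increasing and -rho^2 * s reverses the endpoint order.
    double e_hi = std::exp(-rho * rho * s.lo);
    double e_lo = std::exp(-rho * rho * s.hi);
    return {1.0 - 2.0 * e_hi, 1.0 - 2.0 * e_lo};
}

// Recursive interval bisection: discard boxes that cannot contain a root,
// collect feasible boxes once they are narrower than tol.
void bisect(Interval x, double rho, double tol, std::vector<Interval>& out) {
    if (!contains_zero(F(x, rho))) return;
    if (width(x) < tol) {
        out.push_back(x);
        return;
    }
    bisect({x.lo, mid(x)}, rho, tol, out);
    bisect({mid(x), x.hi}, rho, tol, out);
}

// Feasible sub-intervals of the search region [-5, 5].
std::vector<Interval> roots(double rho, double tol) {
    std::vector<Interval> out;
    bisect({-5.0, 5.0}, rho, tol, out);
    return out;
}
```

Since the variable x occurs only once in the expression for f_ρ, this extension is sharp up to rounding, and the feasible intervals returned by `roots(1.0, 1e-6)` cluster around 1/2 ± √(ln 2), the two roots listed in the first row of Table 1. A rigorous version would round the lower endpoints down and the upper endpoints up, as discussed in Section 2.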
In Table 1, we clearly see that the numerically computed root enclosures behave as expected. Note that for very large values of ρ, even starting the real-valued bisection method would require almost as much work as actually finding the roots of f_ρ.

The second example deals with a function having many roots within the search domain. We will study the function
$f(x) = \sin\bigl(\sin(x) + 15/(x^2 + 1)\bigr)$,
which is not completely trivial to analyze by hand. Once again, the search region is fixed to [x] = [−5, 5], and the tolerance is set to tol = 10^{−10}. The graph of Figure 4(b) indicates that f has nine roots in the search region. This is indeed confirmed by the output of the interval-valued bisection method, detailed in Table 2.

Table 2. Interval enclosures of the roots of $f(x) = \sin\bigl(\sin(x) + 15/(x^2 + 1)\bigr)$.

α_1    [−1.61951630492695, −1.61951630485419]
α_2    [−1.04787158852560, −1.04787158845284]
α_3    [−0.69981597283914, −0.69981597276638]
α_4    [−0.39748093411618, −0.39748093404342]
α_5    [+0.49000622362655, +0.49000622369932]
α_6    [+0.85439020273042, +0.85439020280319]
α_7    [+1.35143495448574, +1.35143495455850]
α_8    [+2.29537873135996, +2.29537873143273]
α_9    [+4.12523527877056, +4.12523527884333]
Note, however, that the real-valued bisection method, started on the domain [a, b], would fail to locate the four even-labeled roots for almost any choices of a ∈ [−5, α_1) and b ∈ (α_9, 5].

References

[AH83] Alefeld, G., Herzberger, J., Introduction to Interval Computations. Academic Press, New York, 1983.
[Co77] Corliss, G., Which Root Does the Bisection Algorithm Find?, SIAM Review 19 (1977), 325–327.
[CXSC] CXSC – C++ eXtension for Scientific Computation, version 2.0. Available from http://www.math.uni-wuppertal.de/org/WRST/xsc/cxsc.html
[GM03] Gabai, D., Meyerhoff, G.R., Thurston, N., Homotopy hyperbolic 3-manifolds are hyperbolic, Annals of Mathematics, 157:2 (2003), 335–431.
[Ha95] Hass, J., Hutchings, M., Schlafly, R., The double bubble conjecture, Electronic Research Announcements of the AMS 1 (1995), 98–102.
[HH95] Hammer, R. et al., C++ toolbox for Verified Computing. Springer-Verlag, Berlin, 1995.
[INv4] INTLAB – INTerval LABoratory, version 4.1.2. Available from http://www.ti3.tu-harburg.de/~rump/intlab/
[KM81] Kulisch, U.W., Miranker, W.L., Computer Arithmetic in Theory and Practice. Academic Press, 1981.
[Mo66] Moore, R.E., Interval Analysis. Prentice-Hall, Englewood Cliffs, New Jersey, 1966.
[Mo79] Moore, R.E., Methods and Applications of Interval Analysis. SIAM Studies in Applied Mathematics, Philadelphia, 1979.
[PrBi] PROFIL/BIAS – Programmer’s Runtime Optimized Fast Interval Library/Basic Interval Arithmetic Subroutines. Available from http://www.ti3.tu-harburg.de/Software/PROFILEnglisch.html
[Su58] Sunaga, T., Theory of an Interval Algebra and its Application to Numerical Analysis. RAAG Memoirs, 2 (1958), 29–46.
[Tu02] Tucker, W., A Rigorous ODE Solver and Smale’s 14th Problem. Found. Comp. Math., 2:1 (2002), 53–117.
[Wa56] Warmus, M., Calculus of Approximations. Bulletin de l’Académie Polonaise de Sciences, 4:5 (1956), 253–257.
[Yo31] Young, R.C., The algebra of multi-valued quantities. Mathematische Annalen, 104 (1931), 260–290.

Warwick Tucker
Department of Mathematics
Uppsala University
Box 480
Uppsala, Sweden
e-mail: [email protected]
4ECM Stockholm 2004 © 2005 European Mathematical Society
From Classical to Non-commutative Iwasawa Theory: An Introduction to the GL_2 Main Conjecture

Otmar Venjakob

Abstract. This paper, which is an extended version of my talk ‘The GL_2 main conjecture for elliptic curves without complex multiplication’ given at the 4ECM, aims to give a survey of recent developments in non-commutative Iwasawa theory. It is written mainly for non-experts and contains neither proofs nor any new results, but hopefully serves as an introduction to the original articles [8, 41]. Also, technical details are sometimes placed into footnotes in order to keep the main text as easily accessible as possible.
1. Classical Iwasawa theory

For the motivation of non-commutative Iwasawa theory it might be helpful to first go back to the origin of classical Iwasawa theory, starting in some sense with the work of Kummer on cyclotomic fields. The ideal class group Cl(K) of a number field K measures the failure of unique factorisation into prime elements in its ring of integers $O_K$. It was Kummer who observed that the vanishing of $Cl(\mathbb{Q}(\zeta_p))$, where $\zeta_p$ denotes a primitive pth root of unity for a fixed odd prime p, implies that the famous equation $x^p + y^p = z^p$ only has trivial solutions in Z.¹ In fact, it is even sufficient that only the p-primary part $A_1 := Cl(\mathbb{Q}(\zeta_p))(p)$ of the ideal class group vanishes; in this case p is called regular, otherwise irregular. Thus, if all prime numbers were regular, the proof of Fermat’s last theorem would have been rather easy. Hence, it was important for Kummer to be able to tell the regular from the irregular primes and he found the following criterion which reveals a mysterious relationship between $A_1$ and certain special values of the complex Riemann zeta function
$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p} \frac{1}{1 - p^{-s}}$,
where the Euler product ranges over all prime numbers and converges for ℜ(s) > 1.

Theorem 1.1 (Kummer). $A_1$ is trivial if and only if p does not divide any of the numerators of the values ζ(−1), ζ(−3), …, ζ(4 − p).²

For example, the first four irregular primes are 37, 59, 67, 101 and, in contrast to the regular ones, it is known that there exist infinitely many of them. Now it was Iwasawa’s idea to study more generally the (p-primary) ideal class groups $A_n := Cl(k_n)(p)$ of the fields $k_n = \mathbb{Q}(\mu_{p^n})$ which arise by adjoining the group $\mu_{p^n}$ of $p^n$th roots of unity to Q, both in order to understand better Kummer’s criterion and to see whether the order of $A_n$ could be at least controlled in view of Fermat’s last theorem. Studying the ideal class groups for the whole tower of number fields $k_n$ simultaneously leads naturally to considering the infinite Galois extension $\mathbb{Q}_\infty := \mathbb{Q}(\mu_{p^\infty})$ of Q whose Galois group G is given explicitly by the cyclotomic character $\chi : G \xrightarrow{\ \cong\ } \mathbb{Z}_p^\times$, i.e., $g\zeta = \zeta^{\chi(g)}$ for all g ∈ G and $\zeta \in \mu_{p^\infty}$. We write $\mathbb{Q}_{cyc}$ for the unique subextension whose Galois group is isomorphic to $\mathbb{Z}_p \cong 1 + p\mathbb{Z}_p \subseteq \mathbb{Z}_p^\times$ and denote the corresponding Galois group by Γ. We fix a topological generator γ, say 1 + p, of Γ. Also we set $\Delta := G(k_1/\mathbb{Q})$ and note that G ≅ Γ × Δ.

Iwasawa was not only interested in the size of $A_n$ but also in the finer structure of the ideal class group as Galois module. The natural Galois action on $A_n$ for all n extends naturally to an action of the Iwasawa algebra Λ := Λ(G), i.e., the completed group algebra
$\mathbb{Z}_p[[G]] := \varprojlim_n \mathbb{Z}_p[G(k_n/\mathbb{Q})]$,
on the projective limit
$X := \varprojlim_n A_n$.
In fact, Iwasawa showed that X is a finitely generated Λ-torsion module. Since $\Lambda \cong \mathbb{Z}_p[\Delta][[\Gamma]]$ decomposes into a finite product of rings³ each of which is isomorphic to the power series ring $\mathbb{Z}_p[[T]]$ in one variable T = γ − 1, there is a nice structure theory which assigns to any Λ-torsion module M a characteristic polynomial $F_M$ (with coefficients in $\mathbb{Z}_p[\Delta]$), see 3.2 for more details.

This survey was written during a stay at Centro de Investigacion en Matematicas (CIMAT), Mexico, as a Heisenberg fellow of the Deutsche Forschungsgemeinschaft (DFG) and I want to thank these institutions for their hospitality and financial support, respectively.
¹ Indeed, after adjoining $\zeta_p$ the left-hand side decomposes into the product of pairwise prime elements $x + \zeta_p^i y$, i = 0, …, p − 1, in $\mathbb{Z}[\zeta_p]$; the ideal class group being zero we can now consider the equation prime by prime showing that z must be the product of p pairwise prime numbers $z_i$ which leads to a contradiction as is easily shown, see [45].
² Note that ζ has trivial zeroes at the even negative integers, also the numerator of ζ(2 − p) is never divisible by p. The proof of this theorem relies decisively on the analytic class number formula and a decomposition of $A_1$ and the zeta function of $\mathbb{Q}(\zeta_p)$ into eigenspaces and L-functions with respect to the powers $\omega^i$ of the Teichmüller character $\omega : G(\mathbb{Q}(\zeta_p)/\mathbb{Q}) \to \mu_{p-1}$, respectively.
³ Corresponding to the characters $\omega^i$ of Δ.

Neglecting
for simplicity the $\mathbb{Z}_p$-torsion part of M, $F_M$ can be interpreted as the characteristic polynomial of the endomorphism T acting on the free $\mathbb{Q}_p[\Delta]$ $\bigl(= \prod_{i=1}^{p-1} \mathbb{Q}_p\bigr)$-module $M \otimes_{\mathbb{Z}_p} \mathbb{Q}_p$. This $F_X$ will be used to define the algebraic p-adic zeta function below.

On the other hand, Kummer had shown mysterious congruences between the values of the modified ζ-function
$\zeta_{(p)}(s) := (1 - p^{-s})\zeta(s)$
with the Euler factor at p eliminated, which turn out to be equivalent to the existence of a continuous function $\zeta_{p\text{-adic}} : \mathbb{Z}_p \setminus \{1\} \to \mathbb{Q}_p$ such that $\zeta_{p\text{-adic}}(1 - n) = \zeta_{(p)}(1 - n)$ for all n > 1. Remember that already Euler knew that for n > 1 the values ζ(1 − n) are rational and thus can also be considered as elements of the local field $\mathbb{Q}_p$. Furthermore, Kubota and Leopoldt showed that $\zeta_{p\text{-adic}}$ can be expanded into a p-adic power series, thus being p-adic analytic.

As Iwasawa observed, $\zeta_{p\text{-adic}}$ can also be interpreted as an element of Q(G), the total ring of fractions of Λ.⁴ First note that every continuous character $\psi : G \to \mathbb{Z}_p^\times$ extends linearly to a ring homomorphism $\Lambda \to \mathbb{Z}_p$ which we also call ψ by abuse of language. Apart from some bad denominators this map extends also to Q(G). In particular, elements Z ∈ Q(G) can be considered as functions on certain subsets of the set of continuous characters of G by setting Z(ψ) := ψ(Z), if the latter is defined.

Theorem 1.2 (Iwasawa, Kubota, Leopoldt). There exists a unique element Z ∈ Q(G) such that
$Z(n) := Z(\chi^n) = \zeta_{(p)}(1 - n)$ for all n > 1.

Note that Z(n) is zero for all odd n due to the trivial zeroes of the Riemann zeta function. Also, by decomposing the cyclotomic character into the product of its projections onto $1 + p\mathbb{Z}_p$ and $\mu_{p-1}$, respectively, one can extend Z to p-adic analytic functions $Z_{\omega^i}(s)$, $s \in \mathbb{Z}_p \setminus \{1\}$.⁵ Alternatively, Z is determined by an interpolation property with respect to Dirichlet characters instead of powers of the cyclotomic character. In this case the Dirichlet L-functions are involved.

In some sense generalizing the analytic class number formula, Iwasawa detected a deep relationship between the “p-adic families” of ideal class groups $A_n$, namely X, on the algebraic side and of special values of ζ, namely Z, on the (p-adic) analytic side, which he formulated in the following classical

Main Conjecture (Theorem of Mazur and Wiles). There is the following equality of ideals in Λ:
$(F_{\mathbb{Z}_p(1)} \cdot Z) = (F_{X^+})$.

⁴ Q(G) is isomorphic to the product $\prod_{i=1}^{p-1} Q(\mathbb{Z}_p[[T]])$ of fields of fractions of $\mathbb{Z}_p[[T]]$.
⁵ More precisely, the $Z_{\omega^i}$ are the p-adic versions of the complex L-functions $L(\omega^i, s)$.
Here $\mathbb{Z}_p(1) := \varprojlim_n \mu_{p^n}$ denotes the Tate module and, due to the trivial zeroes of Z, one only has to consider the +1-eigenspace $X^+$ of X with respect to complex conjugation. In particular, the denominator of Z is controlled by $F_{\mathbb{Z}_p(1)}$. More heuristically, the main conjecture should be read as an identity in Q(G)
$Z = \frac{F_{X^+}}{F_{\mathbb{Z}_p(1)}}$
up to units in Λ.⁶ While this classical theory concerned the multiplicative group $\mathbb{G}_m$ – we adjoined the points of its p-primary torsion subgroup to Q and considered the Galois modules $\mathbb{Z}_p(1)$ and $X^+$ which can be interpreted as Galois cohomology groups with coefficients in $\mathbb{Z}_p(1)$ – we will explain the corresponding theory for an elliptic curve E over Q in the following sections.

⁶ This can be read as an alternating product of the characteristic polynomials of the action of γ on certain étale cohomology groups with coefficients in $\mathbb{Z}_p(1)$ which identify with the Λ-modules $\mathbb{Z}_p(1)$ and $X^+$. This is analogous to the function field situation, namely the fact that the zeta function of a curve C over a finite field $\mathbb{F}_l$, l ≠ p, can be expressed by means of the Lefschetz fixed point formula using the action of the Frobenius endomorphism (instead of γ) on the étale cohomology of C. It was this prototype which motivated Iwasawa to find a similar interpretation for the p-adic analytic zeta function. The non-trivial contribution of the étale cohomology comes from the Jacobian of C, which is paralleled by the ideal class group in the number field case. Moreover, the extension from $\mathbb{F}_l$ to its algebraic closure is achieved by adjoining roots of unity, which corresponds to taking the cyclotomic $\mathbb{Z}_p$-extension of Q.

References: [45, 4, 5, 29, 28, 12]

2. Iwasawa theory of elliptic curves – the philosophy

2.1. Arithmetic of elliptic curves. In order to explain the Iwasawa theory of elliptic curves we first recall basic facts on (the arithmetic of) elliptic curves. To this end let E be an elliptic curve over Q, i.e., a smooth projective curve of genus one with a distinguished Q-rational point (the 0 of the underlying abelian group). Every such E can be realized in $\mathbb{P}^2$ by a (non-unique) Weierstrass equation of the form
$E : y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6$, $a_i \in \mathbb{Z}$,
the distinguished point being the point at infinity (since char(Q) ≠ 2, 3 one can even achieve that $a_1 = a_2 = a_3 = 0$, but note that for a minimal model over Z below one cannot assume this simpler form in general). For every prime l this equation also defines a (not necessarily smooth) curve over the local field $\mathbb{Q}_l$ and the finite field $\mathbb{F}_l$, respectively. One of the basic questions concerning the arithmetic of E is to determine the structure and in particular the size of the group
E(K) = ?
of K-rational points of E, i.e., the set of solutions of the above equation with coordinates in K, for K any number field, local field or finite field. Over a number field, e.g., K = Q, there is the famous

Theorem 2.1 (Mordell-Weil). The abelian group E(Q) is finitely generated, i.e., it decomposes into
$E(\mathbb{Q}) = \mathbb{Z}^r \oplus E(\mathbb{Q})_{tors}$,
where $r = \mathrm{rk}_{\mathbb{Z}} E(\mathbb{Q})$ is the rank of the Mordell-Weil group or the algebraic rank of E, while $E(\mathbb{Q})_{tors}$ is the finite torsion subgroup.

While the possible structures of $E(\mathbb{Q})_{tors}$ were determined by Mazur (in particular, the order of this group is bounded by 16), it is not known whether the rank can be arbitrarily large when E/Q varies. The properties of the Mordell-Weil group turn out to be, at least conjecturally, deeply related with L-functions, which we are going to recall now. For every prime l we denote by $\widetilde{E}_l$ the reduction of E modulo l, i.e., the curve which is given by the reduced equation
$\widetilde{E}_l : y^2 + \tilde{a}_1 xy + \tilde{a}_3 y = x^3 + \tilde{a}_2 x^2 + \tilde{a}_4 x + \tilde{a}_6$, $\tilde{a}_i \in \mathbb{F}_l$.
Here we assume that the Weierstrass equation is a (global) minimal model of E over Z, i.e., for all primes l the l-part of the discriminant Δ ($= -16(4a_4^3 + 27a_6^2)$ if $a_1 = a_2 = a_3 = 0$) is minimal with respect to all Weierstrass equations over the l-adic integers $\mathbb{Z}_l$ which give rise to the same isomorphism class of elliptic curves. If $\widetilde{E}_l$ is again a smooth curve over $\mathbb{F}_l$, then E is said to have good (otherwise bad) reduction at l. In the former case the integer $a_l$ is defined by
$\#\widetilde{E}_l(\mathbb{F}_l) = 1 - a_l + l$.
Otherwise $\widetilde{E}_l$ has either a node, i.e., multiplicative reduction, or a cusp, i.e., additive reduction. In the second case we set $a_l = 0$ while in the first case we set $a_l = 1$ if the multiplicative reduction is split, i.e., the tangent lines to the node on $\widetilde{E}_l$ have slopes defined over $\mathbb{F}_l$, and $a_l = -1$ if the reduction is non-split. Then the complex Hasse-Weil L-function of E is defined by the following Euler product
$L(E/\mathbb{Q}, s) := \prod_{l} \bigl(1 - a_l l^{-s} + \varepsilon(l) l^{1-2s}\bigr)^{-1}$, $s \in \mathbb{C}$, ℜ(s) > 3/2,
where ε(l) equals by definition 1, if E has good reduction at l, and 0 otherwise. By the work of Wiles and Taylor-Wiles it is known that L(E/Q, s) has an analytic continuation to the entire complex plane. The following conjecture, which is a generalization of the analytic class number formula for number fields, predicts that the analytic rank of E, i.e., the vanishing order of L(E/Q, s) at s = 1, coincides with the algebraic rank. Moreover, the leading coefficient of the Taylor series expansion of the L-function at s = 1 can be expressed by the most important invariants of E: By Ш(E/Q) we denote the Tate-Shafarevich
group of E, which is conjectured to be finite, though this has been proved only in special cases. If ⟨ , ⟩ denotes the height pairing of E and $P_1, \ldots, P_r$ form some set of generators of $E(\mathbb{Q})/E(\mathbb{Q})_{tors}$, then the regulator $R_E$ of E is defined to be the determinant of the matrix $(\langle P_i, P_j \rangle)_{i,j}$. Further, if we assume that the Weierstrass equation defines a global minimal model of E over Z, then the translation invariant holomorphic differential
$\omega := \frac{dx}{2y + a_1 x + a_3}$
is called the Néron differential of E. The integration of it along a generator $\gamma^+$ of the real part $\pi_1(E(\mathbb{C}), 0)^+ := \pi_1(E(\mathbb{C}), 0)^{G(\mathbb{C}/\mathbb{R})}$ of the fundamental group of the complex manifold E(C) defines the real period
$\Omega_+ = \int_{\gamma^+} \omega$
of E. Similarly, the period $\Omega_-$ is defined via integration along a generator $\gamma^-$ of the −1 eigenspace of the fundamental group with respect to the action of complex conjugation. Finally, for any prime l we call the index
$c_l = [E(\mathbb{Q}_l) : E^{ns}(\mathbb{Q}_l)]$
the Tamagawa number at l, where $E^{ns}(\mathbb{Q}_l)$ is the subgroup of the group of $\mathbb{Q}_l$-rational points $E(\mathbb{Q}_l)$ consisting of those points whose reduction modulo l is non-singular.

Conjecture 2.2 (Birch & Swinnerton-Dyer (BSD) Conjecture).
I. $r := \mathrm{ord}_{s=1} L(E/\mathbb{Q}, s) = \mathrm{rk}_{\mathbb{Z}} E(\mathbb{Q})$.
II. $\lim_{s \to 1} (s - 1)^{-r} L(E/\mathbb{Q}, s) = \Omega_+ R_E \, \frac{\#Ш(E/\mathbb{Q})}{(\#E(\mathbb{Q})_{tors})^2} \prod_{l} c_l$.

Note that the product over the Tamagawa numbers is actually finite as $c_l = 1$ whenever E has good reduction at l. Thus this conjecture describes a mysterious relationship between the complex analytic L-function and the purely algebraically defined Mordell-Weil group. A similar conjecture can be formulated for elliptic curves over arbitrary number fields. The idea of Iwasawa theory is, roughly speaking, to study this deep connection between the values of (complex) L-functions and arithmetic invariants of E for a full tower of number fields simultaneously, as we have already seen in section 1.

References: [35, 36]

2.2. The Selmer group of E in towers of number fields. For technical reasons we make from now on the following

Assumption. p ≥ 5 is a prime such that E has good ordinary reduction at p, i.e., the order of the group of p-division points $\widetilde{E}_p(\overline{\mathbb{F}}_p)[p]$ equals p.
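The integers a_l of Section 2.1 and the ordinarity condition in the assumption above can be checked by brute force for small primes. The sketch below is only an illustration: the struct `Weierstrass` and the functions `count_points`, `trace_of_frobenius` and `is_ordinary` are hypothetical names, the code does not verify that the model is minimal or that the reduction at l is good, and `is_ordinary` relies on the standard criterion that a prime p of good reduction is ordinary exactly when p does not divide a_p.

```cpp
// Coefficients of y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6 over Z.
struct Weierstrass {
    long a1, a2, a3, a4, a6;
};

// Non-negative remainder of v modulo l.
long mod(long v, long l) {
    return ((v % l) + l) % l;
}

// Number of points on the reduced curve over F_l, including the point at
// infinity. Brute force over all pairs (x, y), so only for small l.
long count_points(const Weierstrass& E, long l) {
    long count = 1;  // the point at infinity
    for (long x = 0; x < l; ++x) {
        for (long y = 0; y < l; ++y) {
            long lhs = y * y + E.a1 * x * y + E.a3 * y;
            long rhs = x * x * x + E.a2 * x * x + E.a4 * x + E.a6;
            if (mod(lhs - rhs, l) == 0) ++count;
        }
    }
    return count;
}

// a_l = l + 1 - #E~_l(F_l); meaningful only at primes of good reduction.
long trace_of_frobenius(const Weierstrass& E, long l) {
    return l + 1 - count_points(E, l);
}

// Assumed criterion: good reduction at p is ordinary iff p does not divide a_p.
bool is_ordinary(const Weierstrass& E, long p) {
    return trace_of_frobenius(E, p) % p != 0;
}
```

As an example, for the curve y² + y = x³ − x² (that is, a_1 = 0, a_2 = −1, a_3 = 1, a_4 = a_6 = 0) counting points gives #Ẽ_l(F_l) = 5 for l = 2, 3, 5, hence a_2 = −2, a_3 = −1 and a_5 = 1. Since 5 does not divide a_5, the prime p = 5 is ordinary for this curve.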
To study the Mordell-Weil group of E it is often more convenient to go over to the cohomologically defined (p-primary) Selmer group Sel(E/K) for any finite extension K/Q. Instead of giving the precise definition we just recall that, induced by Kummer theory, the Selmer group fits into the following short exact sequence, being the bridge between the (p-primary) Tate-Shafarevich group and the Mordell-Weil group:
$0 \to E(K) \otimes_{\mathbb{Z}} \mathbb{Q}_p/\mathbb{Z}_p \to \mathrm{Sel}(E/K) \to Ш(E/K)(p) \to 0$.
Assuming #Ш(E/K)(p) < ∞, which can be checked – for fixed p – in many cases, it holds for the Pontryagin dual of the Selmer group
$X(E/K) := \mathrm{Sel}(E/K)^\vee := \mathrm{Hom}(\mathrm{Sel}(E/K), \mathbb{Q}_p/\mathbb{Z}_p)$
that
$\mathrm{rk}_{\mathbb{Z}} E(K) = \mathrm{rk}_{\mathbb{Z}_p} X(E/K)$.
Thus, indeed, the Selmer group (or its dual) bears significant arithmetic information of E.

Now we introduce a canonical tower of number fields associated with E. By
$E[p^n] \cong \mathbb{Z}/p^n\mathbb{Z} \times \mathbb{Z}/p^n\mathbb{Z}$
we denote the $p^n$-division points of E over a fixed algebraic closure $\overline{\mathbb{Q}}$ of Q. The action of the absolute Galois group $G_{\mathbb{Q}}$ on this group induces, after choosing a basis, the representation
$\rho_{p^n} : G_{\mathbb{Q}} \to \mathrm{Aut}(E[p^n]) \cong GL_2(\mathbb{Z}/p^n\mathbb{Z})$.
We define
$K_n := \mathbb{Q}(E[p^n])$, 0 ≤ n < ∞,
to be the maximal subfield of $\overline{\mathbb{Q}}$ fixed under the kernel of $\rho_{p^n}$. Then $K_\infty := \bigcup_{n \ge 0} K_n$ is nothing else than the fixed field under the kernel of the representation
$\rho_{p^\infty} : G_{\mathbb{Q}} \to \mathrm{Aut}_{\mathbb{Z}_p}(T_p E) \cong GL_2(\mathbb{Z}_p)$
of $G_{\mathbb{Q}}$ on the Tate module
$T_p E := \varprojlim_n E[p^n]$,
where the inverse limit is formed with respect to the multiplication by p maps. In particular, $K_\infty$ is a Galois extension of Q with Galois group $G := G(K_\infty/\mathbb{Q})$ isomorphic to a closed subgroup of $GL_2(\mathbb{Z}_p)$. Thus G is a p-adic Lie group. We want to stress that the L-function of E only depends on the Galois representation $\rho_{p^\infty}$, thus the tower of number fields $\{K_n\}_n$ is most natural in order to study the arithmetic of E, in particular, to investigate properties of its L-function.
Note that due to the Weil pairing $\det \circ \rho_{p^\infty}$ is isomorphic to the cyclotomic character $\chi : G_{\mathbb{Q}} \to \mathbb{Z}_p^\times$ which describes the action of $G_{\mathbb{Q}}$ on the p-power roots of unity $\mu_{p^\infty}$: $g\zeta = \zeta^{\chi(g)}$ for all $g \in G_{\mathbb{Q}}$ and $\zeta \in \mu_{p^\infty}$. Thus $K_\infty$ contains the cyclotomic $\mathbb{Z}_p$-extension $\mathbb{Q}_{cyc}$ of Q. We write H for the Galois group $G(K_\infty/\mathbb{Q}_{cyc})$ and obtain the following diagram.

[Diagram: the tower of fields Q ⊆ Q_cyc ⊆ K_∞ and Q ⊆ K_n ⊆ K_∞, labelled with the Galois groups G = G(K_∞/Q) ⊆ GL_2(Z_p), H = G(K_∞/Q_cyc), Γ = G(Q_cyc/Q) ≅ Z_p, and G_n = G(K_n/Q).]

As before the Iwasawa algebra of G,
$\Lambda(G) = \varprojlim_n \mathbb{Z}_p[G_n]$,
is the inverse limit of the group algebras $\mathbb{Z}_p[G_n]$ of $G_n$ with coefficients in $\mathbb{Z}_p$. It is a compact, regular Noetherian ring. In contrast to the classical Iwasawa algebra Λ(Γ) of Γ it is not commutative in general.

Now, for every n ≥ 1, the Galois action makes $X(E/K_n) := \mathrm{Sel}(E/K_n)^\vee$ into a compact $\mathbb{Z}_p[G_n]$-module. To study, on the algebraic side, all these Selmer groups simultaneously for the whole tower of number fields means to go over to the inverse limit
$X := X(E/K_\infty) := \varprojlim_n \mathrm{Sel}(E/K_n)^\vee$,
which turns out to be a finitely generated Λ(G)-module, conjecturally even a torsion Λ(G)-module. Roughly one should think of it as the family of all the Mordell-Weil groups $E(K_n)$ (and Tate-Shafarevich groups $Ш(E/K_n)(p)$). The analytic counterpart of this family will be discussed in the next subsection.

References: [6, 9, 30, 31, 18]
2.3. Twisted L-functions. For every n ≥ 0, let $\mathrm{Irr}(G_n)$ denote the set of isomorphism classes of (absolutely) irreducible representations of $G_n$, realized over an appropriate number field embedded into C or over a local field contained in $\overline{\mathbb{Q}}_l$ (depending on n). Via the canonical projection $G \twoheadrightarrow G_n$ they are also considered as representations of G, to which we shall refer as Artin representations. Let R be the finite set of primes of Q containing p and all primes l at which E has bad reduction. On the analytic side one is searching for a function $\mathcal{L}_E$, the p-adic analytic L-function of E, on the set $\bigcup_n \mathrm{Irr}(G_n)$ which assigns to ρ the value at s = 1
of the complex L-function L(E, ρ, s) of E twisted by ρ or rather its modified version $L_R(E, \rho, s)$ with the Euler factors at primes in R eliminated.⁷ ⁸

Heuristically, summarizing the (generalized) BSD conjecture over all the fields $K_n$ leads directly to the Iwasawa Main Conjecture of E.⁹ Since the (modified) L-function $L_R(E/K_n, s)$ of E over $K_n$ (similarly defined as over Q and without the Euler factors in R) decomposes into the product of twisted L-functions (with multiplicities), the idea is that on the analytic side of the picture the family of special values at s = 1 of $L_R(E, \rho, s)$ can be interpolated p-adically, which should lead to the p-adic analytic L-function. On the other hand, on the algebraic side there should be some procedure to assign to the Λ(G)-module $X = X(E/K_\infty)$ (as for any torsion Λ(G)-module) some characteristic element $F_X$ bearing hopefully much arithmetic information of E. The (heuristic) comparison of the algebraic and analytic aspects when going over to towers of number fields is illustrated in the following diagram.

⁷ We restrict to those primes not lying in R, because the corresponding factors at primes in R usually do not behave well p-adically and thus have to be eliminated from the usual definition of the L-function in order to expect a p-adic L-function in whatever sense.
⁸ For the interested reader we recall the definition of $L_R(E, \rho, s)$. Again it is defined as an Euler product, which converges only for ℜ(s) > 3/2,
$L_R(E, \rho, s) := \prod_{q \notin R} P_q(E, \rho, q^{-s})^{-1}$, $s \in \mathbb{C}$,
where the $P_q(E, \rho, T)$ are polynomials to be defined below. The only thing known about its analytic continuation at present is that it has a meromorphic continuation when ρ factors through a soluble extension of Q. We will assume the analytic continuation of L(E, ρ, s) to s = 1 for all Artin characters ρ of G in what follows. If q is any prime number we write $\mathrm{Frob}_q$ for the Frobenius automorphism of q in $G(\overline{\mathbb{Q}}_q/\mathbb{Q}_q)/I_q$, where, as usual, $I_q$ denotes the inertia subgroup. Assume now that $\rho \in \mathrm{Irr}(G_n)$ is realized on a vector space $V_\rho$ over a number field K of dimension $n_\rho$. For a fixed place λ of K lying above l ≠ q we denote by $K_\lambda$ the completion of K with respect to λ and we set $V_{\rho,\lambda} = V_\rho \otimes_K K_\lambda$. Also we consider the l-adic Tate module $V_l E := H_1(E(\mathbb{C}), \mathbb{Z}) \otimes_{\mathbb{Z}} \mathbb{Q}_l \cong T_l E \otimes_{\mathbb{Z}_l} \mathbb{Q}_l$ and set $H^1_l(E) := \mathrm{Hom}(V_l E, \mathbb{Q}_l)$. Finally we put for any prime l different from q
$P_q(E, \rho, T) := \det\bigl(1 - \mathrm{Frob}_q^{-1} \cdot T \mid (H^1_l(E) \otimes_{\mathbb{Q}_l} V_{\rho,\lambda})^{I_q}\bigr)$.
It can be shown that for ρ the trivial representation the local L-function $P_q(E, q^{-s}) := P_q(E, \rho, q^{-s})$ coincides with the Euler factor at q of the Hasse-Weil L-function of E. In particular, the integers $a_q$ are just the traces of $\mathrm{Frob}_q$ acting on the maximal unramified quotient $(V_l E)_{I_q}$ of the Tate module.
⁹ In fact this can be made precise in the context of the Equivariant Tamagawa Number Conjecture (ETNC), a natural generalisation of the BSD conjecture, see [16, 43], also [2, 3, 15, 21].
algebraic                                  analytic

$X(E/K_n)$ as $G_n$-module                 $L_R(E/K_n) = \prod_{\rho \in \mathrm{Irr}(G_n)} L_R(E, \rho, s)^{n_\rho}$
   ↓ p-adic families                          ↓
$X(E/K_\infty)$                            $(L_R(E, \rho, 1))_{\rho \in \mathrm{Irr}(G_n),\, n < \infty}$
   ↓                                          ↓ p-adic L-functions
$F_E := F_X$                               $L_E$
characteristic element                     analytic p-adic L-function
This comparison culminates in the Main Conjecture

$$F_E \equiv L_E,$$

which says that the characteristic element of E and the p-adic analytic L-function (if they exist at all) are essentially the same, in a sense we have to make precise later. If such a relation should hold, it must be a very deep relationship, since it connects two totally different aspects of E living in completely different worlds and in some sense “explains” the mysterious BSD-conjecture.

We conclude this section by illustrating the analogy between the $\mathbb{G}_m$- and the E-case:

$\mathbb{G}_m$                     E
$\zeta(s)$                         $L(E, s)$
$\mathbb{Q}(\mu_{p^\infty})$       $\mathbb{Q}(E[p^\infty])$
$Cl(k_n)$                          $E(K_n)$
X+                                 $X(E/K_\infty)$
$\zeta_{p\text{-adic}}$            $L_E$

3. Iwasawa theory of elliptic curves - recent developments

3.1. What is new? Before we try to describe different approaches to making the philosophy explained above precise, we would like to mention that we have to distinguish two totally different cases. Consider the following explicit elliptic curves $E_1 : y^2 = x^3 - x$ and $E_2 : y^2 + y = x^3 - x^2$. At first glance, who would expect an essential difference between them? But while the first one has a “big” ring of endomorphisms – one can show that
$\mathrm{End}(E_1) \cong \mathbb{Z}[i] \neq \mathbb{Z}$, i.e., $E_1$ admits complex multiplication (CM) – the second one only has the endomorphisms arising from multiplication by integers: $\mathrm{End}(E_2) \cong \mathbb{Z}$, i.e., $E_2$ does not admit complex multiplication. Now it follows that in the CM-case the group G has the form

$$G \cong \mathbb{Z}_p^2 \times \text{finite abelian group};$$

in particular, it is abelian. This commutative theory is rather well understood: the 2-variable main conjecture¹⁰ is a theorem of Rubin [32] in many cases, see also the end of 3.5. Thus we want to concentrate on the second, the $GL_2$-case. By a deep result of Serre [34], G is now of the form

$$G \subseteq_o GL_2(\mathbb{Z}_p) \quad \text{(open subgroup)};$$

in particular, it is not abelian. In this case it was not even known how to formulate a $GL_2$ main conjecture, and it is this case where the substantial progress we want to describe in these notes has been achieved recently. Unfortunately, this development concerns only the algebraic side of the picture drawn above: the existence of characteristic elements has been established, while the p-adic analytic part remains purely conjectural. We should also mention that there is a well-developed (commutative) Iwasawa theory of elliptic curves over the cyclotomic $\mathbb{Z}_p$-extension $\mathbb{Q}_{cyc}$ or, more generally, $K_{cyc}$ of a number field K, see [26, 17, 27]. We refer to the corresponding main conjecture as the 1-variable main conjecture. For CM elliptic curves this is a consequence of the 2-variable main conjecture. In the non-CM case there are partial results by Kato [22] and recent results by Urban and Skinner [37], which together prove the latter main conjecture in several cases.

¹⁰ Since G has dimension 2 as a p-adic Lie group, the p-adic L-function is a power series in two variables.

3.2. Structure theory. In this section let G be any compact p-adic Lie group without elements of order p (G can always be realized as a closed subgroup of $GL_n(\mathbb{Z}_p)$ for some n).

In order to define the characteristic element of a $\Lambda(G)$-module it is tempting to imitate the approach of classical Iwasawa theory, i.e., the case where $G \cong \mathbb{Z}_p^n$ and thus there is an isomorphism $\Lambda = \Lambda(G) \cong \mathbb{Z}_p[[X_1, \dots, X_n]]$, given for any choice of a minimal system of topological generators $\gamma_1, \dots, \gamma_n$ of G by the identities $X_i = \gamma_i - 1$. In particular, $\Lambda(G)$ is a complete, regular local ring of dimension n + 1. In the case n = 1 it was Iwasawa himself – and more generally for integrally closed (commutative) domains Serre – who established a structure theorem, similar to that concerning modules over principal ideal domains: Every finitely generated $\Lambda(G)$-torsion module M is, up to pseudo-null modules, a direct product of cyclic modules

$$M \sim \prod_i \Lambda/\Lambda f_i^{n_i},$$
where the $n_i$ are uniquely defined integers and the $f_i$ are irreducible elements of $\Lambda(G)$, unique up to units. The pseudo-null¹¹ modules have to be considered as small – in fact, for n = 1 they are precisely the class of all finite modules – and the idea is that they do not contribute essential information in the arithmetic applications. Now one defines the characteristic element using the above invariants attached to M:

$$F_M := \prod_i f_i^{n_i}.$$

Returning to the general case, i.e., to a not necessarily commutative group G, the concept of pseudo-null modules was developed in the author’s thesis [38, 39, 42] by cohomological methods, establishing the fundamental fact that $\Lambda(G)$ is an Auslander regular ring; for details see (loc. cit.). Then Coates, Schneider and Sujatha went on to establish a structure theorem in this non-commutative setting which is almost totally parallel to the theory mentioned above.

Theorem 3.1 (Coates, Schneider, Sujatha). For every torsion Λ-module M there exist left ideals $L_1, \dots, L_r$ such that, up to pseudo-null modules, M decomposes into a product of cyclic modules

$$M \sim \prod_{i=1}^{r} \Lambda/L_i.$$

For details and the mild further technical assumption on G needed in this theorem, see [11] and [7]. Unfortunately, it turned out that the G-Euler characteristic, an important arithmetic invariant if, e.g., applied to the Selmer group, which will be defined later, is not invariant under pseudo-isomorphisms.¹² Moreover, the ideals $L_i$ occurring above need not be principal in general (see [40, appendix] for a counterexample). Thus, at the moment, the theorem cannot be used to attach a characteristic element to M, and it is still not clear which role this astonishing structure result will play in non-commutative Iwasawa theory. In order to circumvent this dilemma we are going to apply techniques from algebraic K-theory and localisation of (possibly non-commutative) rings.

References: [11, 10, 29, 20, 19, 40, 33]

¹¹ A finitely generated $\Lambda(G)$-module M is pseudo-null if its support has codimension at least 2 in the spectrum of $\Lambda(G)$.

¹² A pseudo-isomorphism is a homomorphism of modules whose kernel and cokernel are pseudo-null.

3.3. Localisation of Iwasawa algebras and characteristic elements. The following theory stems from joint work of Coates, Fukaya, Kato and Sujatha with the author [8] and makes heavy use of the following
Assumption. There exists a normal closed subgroup $H \trianglelefteq G$ such that the quotient $\Gamma := G/H$ is isomorphic to $\mathbb{Z}_p$.

Recall that this is satisfied in our application because $K_\infty$ contains the cyclotomic $\mathbb{Z}_p$-extension $\mathbb{Q}_{cyc}$ of $\mathbb{Q}$. In this situation we are able to define a multiplicatively closed subset T¹³ consisting of non-zerodivisors of $\Lambda := \Lambda(G)$, hoping that one can localize Λ with respect to it. While this is always possible for commutative rings, it is a quite subtle issue for non-commutative rings: one has to check that T satisfies the Ore condition, which means, roughly speaking, that every right fraction with denominator in T can also be written as a left fraction, and vice versa. If the localisation with respect to T exists, it should be related – by construction – to the following subcategory of the category of Λ-torsion modules:

$\mathfrak{M}_H(G)$: the category of Λ-modules M such that, modulo $\mathbb{Z}_p$-torsion, M is finitely generated over $\Lambda(H) \subseteq \Lambda(G)$.

Thus, from a technical point of view, the following theorem is the key result of our construction:

Theorem 3.2. The localization $\Lambda_T$ of Λ with respect to T exists¹⁴ and there is a surjective map arising from K-theory¹⁵

$$\partial : (\Lambda_T)^\times \twoheadrightarrow K_0(\mathfrak{M}_H(G))$$

from the group of units $(\Lambda_T)^\times$ of $\Lambda_T$ to the Grothendieck group $K_0(\mathfrak{M}_H(G))$ of $\mathfrak{M}_H(G)$.

This leads directly to the following

Definition 3.3. Any $F_M \in (\Lambda_T)^\times$ with $\partial[F_M] = [M]$ is called a characteristic element of $M \in \mathfrak{M}_H(G)$.

In order to show that this is not just a sophisticated but useless definition, we state some basic properties of our construction. In particular, $F_M$ behaves well with respect to Euler characteristics.

¹³ First define $T := \{\lambda \in \Lambda \mid \Lambda/\Lambda\lambda \text{ finitely generated over } \Lambda(H)\}$ and then saturate it with the powers of p, i.e., $T := \bigcup_{i \ge 0} p^i T \subseteq \Lambda$.

¹⁴ There exists a ring theoretic approach [41], which generalizes to more general subgroups H of G [1], but in this situation the module theoretic proof in [8] is much simpler.

¹⁵ There is an exact localization sequence

$$K_1(\Lambda) \to K_1(\Lambda_T) \xrightarrow{\ \partial\ } K_0(\mathfrak{M}_H(G)) \to 0,$$

together with the natural surjections $\Lambda^\times \twoheadrightarrow K_1(\Lambda)$ and $(\Lambda_T)^\times \twoheadrightarrow K_1(\Lambda_T)$, where the surjectivity claims need a little argument, see [8, §4]. For the computation of the groups $K_1(\Lambda)$, $K_1(\Lambda_T)$ and $K_0(\mathfrak{M}_H(G))$ in case G is a 2-dimensional p-adic Lie group, see [24].
Properties. (i) Any $f \in (\Lambda_T)^\times$ can be interpreted as a map on the isomorphism classes of (continuous) representations $\rho : G \to GL_n(O_K)$, where $O_K$ runs through the rings of integers of finite extensions K of $\mathbb{Q}_p$:

$$\rho \mapsto f(\rho) \in K \cup \{\infty\} \subseteq \overline{\mathbb{Q}}_p \cup \{\infty\}.$$

(ii) The evaluation of $F_M$ at ρ gives the generalized G-Euler characteristic¹⁶ $\chi(G, M(\rho))$:

$$|F_M(\rho)|_p^{-[K:\mathbb{Q}_p]} = \chi(G, M(\rho))$$

if the Euler characteristic is defined. Here the p-adic valuation is normalized as usual by $|p|_p = \frac{1}{p}$.

For more details and proofs, see [8, §3], [41, §8].

3.4. Numerical example. Consider the two elliptic curves

$$E = X_1(11) : y^2 + y = x^3 - x^2,$$
$$A : y^2 + y = x^3 - x^2 - 7820x - 263580$$

and let p = 5. One can show that $X \in \mathfrak{M}_H(G)$, i.e., that $F_X$ exists. Now $G = G(\mathbb{Q}(E(5))/\mathbb{Q})$ has 2 irreducible Artin representations of degree 4:

$$\rho_i = \mathrm{Ind}\,\chi_i : G \to GL_4(\mathbb{Z}_5),$$

which are in fact induced by the characters $\chi_i$, i = 1, 2, corresponding to the cyclic extensions of degree 5 as indicated in the following diagram:

$\mathbb{Q}(E[5])$            $\mathbb{Q}(A[5])$
        \                      /
     $\chi_1$ \              / $\chi_2$
               $\mathbb{Q}(\mu_5)$
                     |
               $\mathbb{Q}$

Calculations show that $\chi(G, X(\rho_i))$ equals $5^3$ and 5 for i = 1 and i = 2, respectively. Thus

$$F_X(\rho_1) \sim 5^3, \quad F_X(\rho_2) \sim 5 \quad \text{up to } \mathbb{Z}_5^\times.$$

¹⁶ Note that with M also every twist $M(\rho) := M \otimes_{\mathbb{Z}_p} O_K^n$ (with diagonal G-action, via ρ on the right factor) belongs to $\mathfrak{M}_H(G)$. By definition,

$$\chi(G, M(\rho)) := \prod_{i \ge 0} \bigl(\# H_i(G, M(\hat{\rho}))\bigr)^{(-1)^i},$$

if all groups are finite, where $\hat{\rho}$ denotes the contragredient representation of ρ.
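Property (ii) can be illustrated on the numbers of this example. The following is only a minimal sketch and is not taken from [8] or [14]: it computes the p-adic absolute value of a rational number and recovers the Euler characteristic as $|F|_p^{-[K:\mathbb{Q}_p]}$. The function names are hypothetical, and the inputs $5^3 \cdot 7/3$ and $5 \cdot 2$ are arbitrary stand-ins for $F_X(\rho_1)$ and $F_X(\rho_2)$, which are only determined up to $\mathbb{Z}_5^\times$. Since the $\rho_i$ take values in $GL_4(\mathbb{Z}_5)$, we take $K = \mathbb{Q}_5$, so that $[K:\mathbb{Q}_p] = 1$.

```python
from fractions import Fraction


def padic_valuation(x, p):
    """Exponent of the prime p in the nonzero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the valuation of 0 is infinite")
    num, den = x.numerator, x.denominator
    v = 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def padic_abs(x, p):
    """p-adic absolute value, normalised by |p|_p = 1/p."""
    return Fraction(1, p) ** padic_valuation(x, p)


def euler_characteristic(value, p, degree=1):
    """Property (ii): chi = |value|_p ** (-[K:Q_p]), with degree = [K:Q_p]."""
    return padic_abs(value, p) ** (-degree)


# Hypothetical stand-ins: a power of 5 times a 5-adic unit.
F_rho1 = Fraction(5**3 * 7, 3)
F_rho2 = Fraction(5 * 2)
chi_1 = euler_characteristic(F_rho1, 5)
chi_2 = euler_characteristic(F_rho2, 5)
```

Multiplying either input by any rational number prime to 5 leaves `chi_1` and `chi_2` unchanged, which is the computational content of the statement "up to $\mathbb{Z}_5^\times$".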
3.5. Analytic p-adic L-function and the $GL_2$-main conjecture. In contrast to the algebraic theory above, the following analytic part is purely conjectural. First of all we have Deligne’s [13] Period Conjecture:

$$\frac{L_R(E, \rho, 1)}{\Omega(E, \rho)} \in \overline{\mathbb{Q}}$$

for a suitable period $\Omega(E, \rho) \in \mathbb{C}$, which permits us to consider $\frac{L_R(E, \rho, 1)}{\Omega(E, \rho)}$ as a value in $\overline{\mathbb{Q}}_p$, i.e., in the same target where the elements of $(\Lambda(G)_T)^\times$, interpreted as functions, take their values. In analogy with classical Iwasawa theory we call an element which interpolates these values a p-adic analytic L-function, though one could criticize that there is no p-adic analysis involved at present.

Conjecture 3.4 (Existence of analytic p-adic L-function). Let p ≥ 5 and assume that E has good ordinary reduction at p. Then there exists $L_E \in (\Lambda(G)_T)^\times$ such that for all Artin representations ρ of G one has $L_E(\rho) \neq \infty$ and

$$L_E(\rho) \sim \frac{L_R(E, \rho, 1)}{\Omega(E, \rho)}$$

up to some modifications of the Euler factor at p.

The precise formula¹⁷ describing the interpolation property can be deduced from Fukaya and Kato’s version¹⁸ of the Equivariant Tamagawa Number Conjecture (ETNC) together with their ε-conjecture and thus follows a precise recipe, whose explanation is unfortunately beyond the scope of this article. In particular, the following version of a non-commutative Iwasawa main conjecture is compatible with the ETNC corresponding to our tower $K_n$ of number fields:

Conjecture 3.5 (Main Conjecture). Assume that p ≥ 5, E has good ordinary reduction at p, and $X(E/K_\infty)$ belongs to $\mathfrak{M}_H(G)$. Granted the existence of the p-adic L-function, $L_E$ is a characteristic element of $X(E/K_\infty)$:

$$\partial[L_E] = [X(E/K_\infty)].$$

¹⁷ Since E is ordinary at p, we have $P_p(E, 1, T) = 1 - a_p T + pT^2 = (1 - uT)(1 - wT)$ with $u \in \mathbb{Z}_p^\times$. We put $p^{f_\rho}$ = p-part of the conductor of ρ, and denote by $e_p(\rho)$ the local ε-factor of ρ at p. Finally we set $P_p(\rho, T) := \det(1 - \mathrm{Frob}_p^{-1} T \mid V_\rho^{I_p})$. Then the interpolation formula is

$$L_E(\rho) = \frac{L_R(E, \rho, 1)}{\Omega(E, \rho)} \cdot e_p(\rho) \cdot \frac{P_p(\hat{\rho}, u^{-1})}{P_p(\rho, w^{-1})} \cdot u^{-f_\rho},$$

where $\Omega(E, \rho) = \Omega_+(E)^{d^+(\rho)} \Omega_-(E)^{d^-(\rho)}$, while $d^+(\rho)$ and $d^-(\rho)$ denote the dimension of the subspace of $V_\rho$ on which complex conjugation acts by +1 and −1, respectively (see [8, 5.7]).

¹⁸ The original ETNC was formulated by Burns and Flach [2, 3, 15] inspired by [23]. A different version of an Iwasawa main conjecture (without p-adic L-functions) was discussed by Huber and Kings [21].
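The ordinarity hypothesis and the factorisation $1 - a_pT + pT^2 = (1 - uT)(1 - wT)$ can be checked by hand for $E = X_1(11)$ and p = 5. The sketch below is an illustration under stated assumptions and is not taken from the cited literature: it counts the points of $y^2 + y = x^3 - x^2$ over $\mathbb{F}_p$ by brute force, sets $a_p = p + 1 - \#E(\mathbb{F}_p)$ (the standard formula at a prime of good reduction), and Hensel-lifts the unit root u of $X^2 - a_pX + p$ modulo a power of p. All function names and the chosen precision are hypothetical.

```python
def count_points(p):
    """Number of points of E: y^2 + y = x^3 - x^2 over F_p, including infinity."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + y - (x**3 - x * x)) % p == 0
    )
    return affine + 1


def trace_of_frobenius(p):
    """a_p = p + 1 - #E(F_p), valid at primes of good reduction (p != 11 here)."""
    return p + 1 - count_points(p)


def unit_root(a_p, p, precision):
    """Root u of X^2 - a_p X + p in Z/p^precision with u = a_p mod p.

    Uses Newton (Hensel) iteration; requires a_p to be prime to p,
    i.e. ordinary reduction at p.
    """
    if a_p % p == 0:
        raise ValueError("a_p is divisible by p: reduction is not ordinary")
    modulus = p**precision
    u = a_p % p  # f(a_p) = p, so this is a root modulo p
    for _ in range(precision):  # more than enough steps: precision doubles each time
        f = u * u - a_p * u + p
        df = (2 * u - a_p) % modulus  # congruent to a_p mod p, hence invertible
        u = (u - f * pow(df, -1, modulus)) % modulus
    return u


p = 5
a5 = trace_of_frobenius(p)
u = unit_root(a5, p, precision=6)
w = (a5 - u) % p**6  # the other reciprocal root, since u + w = a_p
```

For p = 5 this gives $a_5 = 1$, which is prime to 5, so the reduction is ordinary; the lifted u is a unit, while w is divisible by 5 and satisfies $uw \equiv 5$ modulo $5^6$.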
Before we discuss evidence for these conjectures we would like to comment on some of their implications. Assuming the existence of $L_E$, one can show

(i) that the $GL_2$-main conjecture implies the 1-variable main conjecture (over $\mathbb{Q}_{cyc}$);

(ii) that, assuming also the $GL_2$-main conjecture, it holds that

$$\chi(G, X(\rho)) \text{ finite} \iff L_R(E, \rho, 1) \neq 0.$$

In this case one has, with $m_\rho := [K : \mathbb{Q}_p]$,

$$\chi(G, X(\rho)) = |L_E(\rho)|_p^{-m_\rho};$$

(iii) that, if $L(E, 1) \neq 0$, by results of Kolyvagin the groups $E(\mathbb{Q})$ and Ш$(E/\mathbb{Q})$ are finite and the p-part of the BSD conjecture (II.) holds:

$$\frac{L(E/\mathbb{Q}, 1)}{\Omega_+} \sim_p \frac{\#\text{Ш}(E/\mathbb{Q})}{(\#E(\mathbb{Q}))^2} \prod_l c_l \quad \text{up to } \mathbb{Z}_p^\times.$$

We conclude this survey by giving some evidence for the Main Conjecture:

In the CM-case the existence of $L_E$ follows from the existence of the 2-variable p-adic L-function (Manin-Vishik [44], Katz [25], Yager [46]). If $X \in \mathfrak{M}_H(G)$, then the main conjecture follows from the 2-variable main conjecture (Rubin, Yager).

In the $GL_2$-case almost nothing is known! There is only weak numerical evidence from calculations of T. and V. Dokchitser [14]. Let $E = X_1(11)$, p = 5, and let $\rho_i$, i = 1, 2, be the two irreducible Artin representations of degree 4 as before. Then they verify that the relation

$$\chi(G, X(\rho_i)) = |L_E(\rho_i)|_p^{-1}, \quad i = 1, 2,$$

holds, as is predicted by the main conjecture, see above. Here $L_E(\rho_i)$ denotes the term describing the interpolation property of $L_E$, should the p-adic L-function exist.
References

[1] K. Ardakov and K.A. Brown, Primeness, semiprimeness and localisation in Iwasawa algebras, preprint (2004).
[2] D. Burns and M. Flach, Tamagawa numbers for motives with (non-commutative) coefficients, Doc. Math. 6 (2001), 501–570 (electronic).
[3] D. Burns and M. Flach, Tamagawa numbers for motives with (noncommutative) coefficients. II, Amer. J. Math. 125 (2003), no. 3, 475–512.
[4] J. Coates, p-adic L-functions and Iwasawa’s theory, Algebraic number fields: L-functions and Galois properties (Proc. Sympos., Univ. Durham, Durham, 1975), Academic Press, London, 1977, pp. 269–353.
[5] J. Coates, On p-adic L-functions, Astérisque (1989), no. 177-178, Exp. No. 701, 33–59.
[6] J. Coates, Fragments of the $GL_2$ Iwasawa theory of elliptic curves without complex multiplication, Arithmetic theory of elliptic curves. Lectures given at the 3rd session of the Centro Internazionale Matematico Estivo (CIME), Cetraro, Italy, July 12–19, 1997, LNM, vol. 1716, Springer, 1999, pp. 1–50.
[7] J. Coates, Iwasawa algebras and arithmetic, Astérisque (2003), no. 290, Exp. No. 896, vii, 37–52.
[8] J. Coates, T. Fukaya, K. Kato, R. Sujatha, and O. Venjakob, The $GL_2$ main conjecture for elliptic curves without complex multiplication, appears in: Publ. Math. IHES.
[9] J. Coates and S. Howson, Euler characteristics and elliptic curves II, J. Math. Soc. Japan 53 (2001), 175–235.
[10] J. Coates, P. Schneider, and R. Sujatha, Links between cyclotomic and $GL_2$ Iwasawa theory. Kazuya Kato’s fiftieth birthday. Doc. Math. 2003, Extra Vol., 187–215.
[11] J. Coates, P. Schneider, and R. Sujatha, Modules over Iwasawa algebras, J. Inst. Math. Jussieu 2 (2003), no. 1, 73–108.
[12] P. Colmez, Fonctions L p-adiques, Astérisque (2000), no. 266, Exp. No. 851, 3, 21–58.
[13] P. Deligne, Valeurs de fonctions L et périodes d’intégrales, Automorphic forms, representations and L-functions (Proc. Sympos. Pure Math., Oregon State Univ., Corvallis, Ore., 1977), Part 2, Proc. Sympos. Pure Math., XXXIII, Amer. Math. Soc., Providence, R.I., 1979, pp. 313–346.
[14] T. Dokchitser and V. Dokchitser, in preparation, 2004.
[15] M. Flach, The equivariant Tamagawa number conjecture: a survey. With an appendix by C. Greither. Contemp. Math., 358, Stark’s conjectures: recent work and new directions, 79–125, Amer. Math. Soc., Providence, RI, 2004.
[16] T. Fukaya and K. Kato, A formulation of conjectures on p-adic zeta functions in non-commutative Iwasawa theory, preprint (2003).
[17] R. Greenberg, Iwasawa theory for elliptic curves, Arithmetic theory of elliptic curves. Lectures given at the 3rd session of the Centro Internazionale Matematico Estivo (CIME), Cetraro, Italy, July 12–19, 1997, LNM, vol. 1716, Springer, 1999, pp. 51–144.
[18] Y. Hachimori and O. Venjakob, Completely faithful Selmer groups over Kummer extensions. Kazuya Kato’s fiftieth birthday. Doc. Math. 2003, Extra Vol., 443–478.
[19] S. Howson, Euler characteristics as invariants of Iwasawa modules, Proc. London Math. Soc. (3) 85 (2002), no. 3, 634–658.
[20] S. Howson, Structure of central torsion Iwasawa modules, Bull. Soc. Math. France 130 (2002), no. 4, 507–535.
[21] A. Huber and G. Kings, Equivariant Bloch-Kato conjecture and non-abelian Iwasawa main conjecture, Proceedings of the International Congress of Mathematicians, Vol. II (Beijing, 2002) (Beijing), Higher Ed. Press, 2002, pp. 149–162.
[22] K. Kato, p-adic Hodge theory and values of zeta functions of modular forms. Cohomologies p-adiques et applications arithmétiques. III. Astérisque No. 295 (2004), ix, 117–290.
[23] K. Kato, Lectures on the approach to Iwasawa theory for Hasse-Weil L-functions via $B_{dR}$. I, Arithmetic algebraic geometry (Trento, 1991), Lecture Notes in Math., vol. 1553, Springer, Berlin, 1993, pp. 50–163.
[24] K. Kato, $K_1$ of some non-commutative completed group rings, preprint (2004).
[25] N.M. Katz, p-adic interpolation of real analytic Eisenstein series, Ann. of Math. (2) 104 (1976), no. 3, 459–571. MR0506271 (58 #22071)
[26] B. Mazur, Rational points of Abelian varieties with values in towers of number fields, Invent. Math. 18 (1972), 183–266.
[27] B. Mazur and P. Swinnerton-Dyer, Arithmetic of Weil curves, Invent. Math. 25 (1974), 1–61.
[28] B. Mazur and A. Wiles, Class fields of Abelian extensions of Q, Invent. Math. 76 (1984), 179–330.
[29] J. Neukirch, A. Schmidt, and K. Wingberg, Cohomology of number fields, Grundlehren der mathematischen Wissenschaften, vol. 323, Springer, 2000.
[30] Y. Ochi and O. Venjakob, On the structure of Selmer groups over p-adic Lie extensions, J. Algebraic Geom. 11 (2002), no. 3, 547–580.
[31] Y. Ochi and O. Venjakob, On the ranks of Iwasawa modules over p-adic Lie extensions, Math. Proc. Cambridge Philos. Soc. 135 (2003), 25–43.
[32] K. Rubin, On the main conjecture of Iwasawa theory for imaginary quadratic fields, Invent. Math. 93 (1988), no. 3, 701–713.
[33] P. Schneider and O. Venjakob, On the dimension theory of skew power series rings, appears in: J. Pure Appl. Algebra.
[34] J.-P. Serre, Propriétés galoisiennes des points d’ordre fini des courbes elliptiques. (Galois properties of points of finite order of elliptic curves), Invent. Math. 15 (1972), 259–331.
[35] J.H. Silverman, The arithmetic of elliptic curves, Graduate Texts in Mathematics, vol. 106, Springer-Verlag, New York, 199?
[36] J.T. Tate, The arithmetic of elliptic curves, Invent. Math. 23 (1974), 179–206.
[37] E. Urban and C. Skinner, An Eisenstein ideal for GU(2, 2) and the main conjecture for $GL_2$, in progress (2004).
[38] O. Venjakob, Iwasawa theory of p-adic Lie extensions, Ph.D. thesis, Heidelberg: Univ. Heidelberg, Naturwissenschaftlich-Mathematische Gesamtfakultät, 112 p., 2000.
[39] O. Venjakob, On the structure theory of the Iwasawa algebra of a p-adic Lie group, J. Eur. Math. Soc. (JEMS) 4 (2002), no. 3, 271–311.
[40] O. Venjakob, A non-commutative Weierstrass preparation theorem and applications to Iwasawa theory, J. reine angew. Math. 559 (2003), 153–191.
[41] O. Venjakob, Characteristic Elements in Noncommutative Iwasawa Theory, Habilitationsschrift, Ruprecht-Karls-Universität Heidelberg (2003), appears in: J. Reine Angew. Math.
[42] O. Venjakob, Iwasawa Theory of p-adic Lie Extensions, Compos. Math. 138 (2003), no. 1, 1–54.
[43] O. Venjakob, From the Birch and Swinnerton-Dyer Conjecture over the Equivariant Tamagawa Number Conjecture to non-commutative Iwasawa theory, a survey, preprint (2005).
[44] M.M. Višik and Ju.I. Manin, p-adic Hecke series of imaginary quadratic fields, Mat. Sb. (N.S.) 95(137) (1974), 357–383, 471.
[45] L.C. Washington, Introduction to cyclotomic fields, Graduate Texts in Mathematics, vol. 83, Springer-Verlag, New York, 1997.
[46] R.I. Yager, On two variable p-adic L-functions, Ann. Math., II. Ser. 115 (1982), 411–449.
Otmar Venjakob
Universität Heidelberg, Mathematisches Institut
Im Neuenheimer Feld 288, D-69120 Heidelberg, Germany
e-mail: [email protected]
URL: http://www.mathi.uni-heidelberg.de/~otmar/
Index of Authors

Alberti, G., 3
Auroux, D., 23
Barthe, F., 811
Beliaev, D., 41
Bianchini, S., 61
Biran, P., 827
Bonami, A., 549
Borodin, A., 73
Bouchut, F., 95
Bowditch, B.H., 103
Brenier, Y., 555
Csörnyei, M., 3
den Hollander, F., 561
Esterle, J., 573
Friedgut, E., 117
Gérard, P., 121
Golse, F., 699
Guerra, F., 719
Guionnet, A., 141
Håstad, J., 733
Helffer, B., 597
Helmke, S., 155
Holden, H., 173
Keating, J.P., 619
Klein, R., 201
Krajíček, J., 221
Krammer, D., 233
Krattenthaler, C., 625
Lindenstrauss, E., 247
Łuczak, T., 257
Lyons, T., 269
Madsen, I., 283
Massart, P., 309
Mihăilescu, P., 325
Mikusky, E., 201
Monsurrò, M., 643
Mustata, M., 341
O’Grady, K.G., 365
Okounkov, A., 751
Olshanski, G., 73
Owinoh, A., 201
Ozsváth, P., 769
Preiss, D., 3
Reid, M., 655
Ruzsa, I.Z., 381
Schramm, O., 783
Serfaty, S., 837
Shalom, Y., 391
Shcherbina, M., 425
Slodowy, P., 155
Smirnov, S., 41
Sodin, M., 445
Solovej, J.P., 669
Stix, J., 681
Takagi, S., 341
Tolsa, X., 459
Tornberg, A.-K., 477
Totik, V., 501
Tucker, W., 851
Venjakob, O., 861
Voisin, C., 787
Watanabe, K., 341
Weiss, M., 283
Werner, W., 515
Zannier, U., 529