Jeffrey M. Lee
Graduate Studies in Mathematics Volume 107
American Mathematical Society
Manifolds and Differential Geometry Jeffrey M. Lee
American Mathematical Society
Providence, Rhode Island
EDITORIAL COMMITTEE
David Cox (Chair), Steven G. Krantz, Rafe Mazzeo, Martin Scharlemann

2000 Mathematics Subject Classification. Primary 58A05, 58A10, 53C05, 22E15, 53C20, 53B30, 55R10, 53Z05.
For additional information and updates on this book, visit www.ams.org/bookpages/gsm-107
Library of Congress Cataloging-in-Publication Data
Lee, Jeffrey M., 1956-
Manifolds and differential geometry / Jeffrey M. Lee.
p. cm. - (Graduate studies in mathematics; v. 107)
Includes bibliographical references and index.
ISBN 978-0-8218-4815-9 (alk. paper)
1. Geometry, Differential. 2. Topological manifolds. 3. Riemannian manifolds. I. Title.
QA641.L38 2009 516.3'6--dc22 2009012421
Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy a chapter for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for such permission should be addressed to the Acquisitions Department, American Mathematical Society, 201 Charles Street, Providence, Rhode Island 02904-2294 USA. Requests can also be made by e-mail to
[email protected].
© 2009 by the American Mathematical Society. All rights reserved. The American Mathematical Society retains all rights except those granted to the United States Government. Printed in the United States of America.
The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability. Visit the AMS home page at http://www.ams.org/
10 9 8 7 6 5 4 3 2 1
14 13 12 11 10 09
Contents

Preface

Chapter 1. Differentiable Manifolds
§1.1. Preliminaries
§1.2. Topological Manifolds
§1.3. Charts, Atlases and Smooth Structures
§1.4. Smooth Maps and Diffeomorphisms
§1.5. Cut-off Functions and Partitions of Unity
§1.6. Coverings and Discrete Groups
§1.7. Regular Submanifolds
§1.8. Manifolds with Boundary
Problems

Chapter 2. The Tangent Structure
§2.1. The Tangent Space
§2.2. Interpretations
§2.3. The Tangent Map
§2.4. Tangents of Products
§2.5. Critical Points and Values
§2.6. Rank and Level Set
§2.7. The Tangent and Cotangent Bundles
§2.8. Vector Fields
§2.9. 1-Forms
§2.10. Line Integrals and Conservative Fields
§2.11. Moving Frames
Problems

Chapter 3. Immersion and Submersion
§3.1. Immersions
§3.2. Immersed and Weakly Embedded Submanifolds
§3.3. Submersions
Problems

Chapter 4. Curves and Hypersurfaces in Euclidean Space
§4.1. Curves
§4.2. Hypersurfaces
§4.3. The Levi-Civita Covariant Derivative
§4.4. Area and Mean Curvature
§4.5. More on Gauss Curvature
§4.6. Gauss Curvature Heuristics
Problems

Chapter 5. Lie Groups
§5.1. Definitions and Examples
§5.2. Linear Lie Groups
§5.3. Lie Group Homomorphisms
§5.4. Lie Algebras and Exponential Maps
§5.5. The Adjoint Representation of a Lie Group
§5.6. The Maurer-Cartan Form
§5.7. Lie Group Actions
§5.8. Homogeneous Spaces
§5.9. Combining Representations
Problems

Chapter 6. Fiber Bundles
§6.1. General Fiber Bundles
§6.2. Vector Bundles
§6.3. Tensor Products of Vector Bundles
§6.4. Smooth Functors
§6.5. Hom
§6.6. Algebra Bundles
§6.7. Sheaves
§6.8. Principal and Associated Bundles
Problems

Chapter 7. Tensors
§7.1. Some Multilinear Algebra
§7.2. Bottom-Up Approach to Tensor Fields
§7.3. Top-Down Approach to Tensor Fields
§7.4. Matching the Two Approaches to Tensor Fields
§7.5. Tensor Derivations
§7.6. Metric Tensors
Problems

Chapter 8. Differential Forms
§8.1. More Multilinear Algebra
§8.2. Differential Forms
§8.3. Exterior Derivative
§8.4. Vector-Valued and Algebra-Valued Forms
§8.5. Bundle-Valued Forms
§8.6. Operator Interactions
§8.7. Orientation
§8.8. Invariant Forms
Problems

Chapter 9. Integration and Stokes' Theorem
§9.1. Stokes' Theorem
§9.2. Differentiating Integral Expressions; Divergence
§9.3. Stokes' Theorem for Chains
§9.4. Differential Forms and Metrics
§9.5. Integral Formulas
§9.6. The Hodge Decomposition
§9.7. Vector Analysis on $\mathbb{R}^3$
§9.8. Electromagnetism
§9.9. Surface Theory Redux
Problems

Chapter 10. De Rham Cohomology
§10.1. The Mayer-Vietoris Sequence
§10.2. Homotopy Invariance
§10.3. Compactly Supported Cohomology
§10.4. Poincaré Duality
Problems

Chapter 11. Distributions and Frobenius' Theorem
§11.1. Definitions
§11.2. The Local Frobenius Theorem
§11.3. Differential Forms and Integrability
§11.4. Global Frobenius Theorem
§11.5. Applications to Lie Groups
§11.6. Fundamental Theorem of Surface Theory
§11.7. Local Fundamental Theorem of Calculus
Problems

Chapter 12. Connections and Covariant Derivatives
§12.1. Definitions
§12.2. Connection Forms
§12.3. Differentiation Along a Map
§12.4. Ehresmann Connections
§12.5. Curvature
§12.6. Connections on Tangent Bundles
§12.7. Comparing the Differential Operators
§12.8. Higher Covariant Derivatives
§12.9. Exterior Covariant Derivative
§12.10. Curvature Again
§12.11. The Bianchi Identity
§12.12. G-Connections
Problems

Chapter 13. Riemannian and Semi-Riemannian Geometry
§13.1. Levi-Civita Connection
§13.2. Riemann Curvature Tensor
§13.3. Semi-Riemannian Submanifolds
§13.4. Geodesics
§13.5. Riemannian Manifolds and Distance
§13.6. Lorentz Geometry
§13.7. Jacobi Fields
§13.8. First and Second Variation of Arc Length
§13.9. More Riemannian Geometry
§13.10. Cut Locus
§13.11. Rauch's Comparison Theorem
§13.12. Weitzenböck Formulas
§13.13. Structure of General Relativity
Problems

Appendix A. The Language of Category Theory

Appendix B. Topology
§B.1. The Shrinking Lemma
§B.2. Locally Euclidean Spaces

Appendix C. Some Calculus Theorems

Appendix D. Modules and Multilinearity
§D.1. R-Algebras

Bibliography

Index
Preface
Classical differential geometry is the approach to geometry that takes full advantage of the introduction of numerical coordinates into a geometric space. This use of coordinates in geometry was the essential insight of René Descartes that allowed the invention of analytic geometry and paved the way for modern differential geometry. The basic object in differential geometry (and differential topology) is the smooth manifold. This is a topological space on which a sufficiently nice family of coordinate systems or "charts" is defined. The charts consist of locally defined n-tuples of functions. These functions should be sufficiently independent of each other so as to allow each point in their common domain to be specified by the values of these functions. One may start with a topological space and add charts which are compatible with the topology, or the charts themselves can generate the topology. We take the latter approach. The charts must also be compatible with each other so that changes of coordinates are always smooth maps. Depending on what type of geometry is to be studied, extra structure is assumed, such as a distinguished group of symmetries, a distinguished "tensor" such as a metric tensor or symplectic form, or the very basic geometric object known as a connection. Often we find an interplay among many such elements of structure. Modern differential geometers have learned to present much of the subject without constant direct reference to locally defined objects that depend on a choice of coordinates. This is called the "invariant" or "coordinate free" approach to differential geometry. The only way to really see exactly what this all means is by diving in and learning the subject.

The relationship between geometry and the physical world is fundamental on many levels. Geometry (especially differential geometry) clarifies,
codifies and then generalizes ideas arising from our intuitions about certain aspects of our world. Some of these aspects are those that we think of as forming the spatiotemporal background of our activities, while other aspects derive from our experience with objects that have "smooth" surfaces. The Earth is both a surface and a "lived-in space", and so the prefix "geo" in the word geometry is doubly appropriate. Differential geometry is also an appropriate mathematical setting for the study of what we classically conceive of as continuous physical phenomena such as fluids and electromagnetic fields.

Manifolds have dimension. The surface of the Earth is two-dimensional, while the configuration space of a mechanical system is a manifold which may easily have a very high dimension. Stretching the imagination further we can conceive of each possible field configuration for some classical field as being an abstract point in an infinite-dimensional manifold.

The physicists are interested in geometry because they want to understand the way the physical world is in "actuality". But there is also a discovered "logical world" of pure geometry that is in some sense a part of reality too. This is the reality which Roger Penrose calls the Platonic world.¹ Thus the mathematicians are interested in the way worlds could be in principle and geometers are interested in what might be called "possible geometric worlds". Since the inspiration for what we find interesting has its roots in our experience, even the abstract geometries that we study retain a certain physicality. From this point of view, the intuition that guides the pure geometer is fruitfully enhanced by an explicit familiarity with the way geometry plays a role in physical theory.
Knowledge of differential geometry is common among physicists thanks to the success of Einstein's highly geometric theory of gravitation and also because of the discovery of the differential geometric underpinnings of modern gauge theory² and string theory. It is interesting to note that the gauge field concept was introduced into physics within just a few years of the time that the notion of a connection on a fiber bundle (of which a gauge field is a special case) was making its appearance in mathematics. Perhaps the most exciting, as well as challenging, piece of mathematical physics to come along in a while is string theory, mentioned above. The usefulness of differential geometric ideas for physics is also apparent in the conceptual payoff enjoyed when classical mechanics is reformulated in the language of differential geometry. Mathematically, we are led to the subjects of symplectic geometry and Poisson geometry. The applicability of differential geometry is not limited to physics. Differential geometry is also of use in engineering. For example, there is the increasingly popular differential geometric approach to control theory.

¹ Penrose seems to take this Platonic world rather literally, giving it a great deal of ontological weight as it were.
² The notion of a connection on a fiber bundle and the notion of a gauge field are essentially identical concepts discovered independently by mathematicians and physicists.
There is a bit more material in this book than can be comfortably covered in a two-semester course. A course on manifold theory would include Chapters 1, 2, 3, and then a selection of material from Chapters 5, 7, 8, 9, 10, and 11. A course in Riemannian geometry would review material from the first three chapters and then cover at least Chapters 8 and 13. A more leisurely course would also include Chapter 4 before getting into Chapter 13. The book need not be read in a strictly linear manner. We included here a flow chart showing approximate chapter dependence.

There are exercises throughout the text and problems at the end of each chapter. The reader should at least read and think about every exercise. Some exercises are rather easy and only serve to keep the reader alert. Other exercises take a bit more thought.

Differential geometry is a huge field, and even if we had restricted our attention to just manifold theory or Riemannian geometry, only a small fragment of what might be addressed at this level could possibly be included. In choosing what to include in this book, I was guided by personal interest and, more importantly, by the limitations of my own understanding. While preparing this book I used too many books and papers to list here, but a few that stand out as having been especially useful include [A-M-R], [Hicks], [L1], [Lee, John], [ON1], [ON2], and [Poor].
I would like to thank Lance Drager, Greg Friedman, Chris Monico, Mara Neusel, Efton Park, Igor Prokhorenkov, Ken Richardson, Magdalena Toda, and David Weinberg for proofreading various portions of the book. I am especially grateful to Lance Drager for many detailed discussions concerning some of the more difficult topics covered in the text, and also for the many hours he spent helping me with an intensive final proofreading. It should also be noted that certain nice ideas for improvements of a few proofs are due to Lance. Finally, I would like to thank my wife Maria for her support and patience and my father for his encouragement.
Chapter 1
Differentiable Manifolds
"Besides language and music, mathematics is one of the primary manifestations of the free creative power of the human mind." - Hermann Weyl
In this chapter we introduce differentiable manifolds and smooth maps. A differentiable manifold is a topological space on which there are defined coordinates allowing basic notions of differentiability. The theory of differentiable manifolds is a natural result of extending and clarifying notions already familiar from multivariable calculus. Consider the task of writing out clearly, in terms of sets and maps, what is going on when one does calculus in polar coordinates on the plane or in spherical coordinates on a sphere. If this were done with sufficient care about domains and codomains, a good deal of ambiguity in standard notation would be discovered and clarifying the situation would almost inevitably lead to some of the very definitions that we will see shortly. In some sense, a good part of manifold theory is just multivariable calculus done carefully. Unfortunately, this care necessitates an increased notational burden that can be intimidating at first. The reader should always keep in mind the example of surfaces and the goal of doing calculus on surfaces. Another point is that manifolds can have nontrivial topology, and this is one reason the subject becomes so rich. In this chapter we will make some connection with some basic ideas from topology such as covering spaces and the fundamental group. Manifold theory and differential geometry play a role in an increasingly large amount of modern mathematics and have long played an important
role in physics. In Einstein's general theory of relativity, spacetime is taken to be a 4-dimensional manifold. Manifolds of arbitrarily high dimension play a role in many physical theories. For example, in classical mechanics, the set of all possible locations and orientations of a rigid body is a manifold of dimension six, and the phase space of a system of N Newtonian particles moving in 3-dimensional space is a manifold of dimension 6N.
1.1. Preliminaries

To understand the material to follow, it is necessary that the reader have a good background in the following subjects.

1) Linear algebra. The reader should be familiar with the idea of the dual space of a vector space and also with the notion of a quotient vector space. A bit of the language of module theory will also be used and is outlined in Appendix D.

2) Point set topology. We assume familiarity with the notions of subspace topology, compactness and connectedness. The reader should know the definitions of Hausdorff topological spaces, regular spaces and normal spaces. The reader should also have been exposed to quotient topologies. Some of the needed concepts are reviewed in the online supplement [Lee, Jeff].

Convention: A neighborhood of a point in a topological space is often defined to be a set whose interior contains the point. Our convention in the sequel is that the word "neighborhood" will always mean open neighborhood unless otherwise indicated. Nevertheless, we will sometimes write "open neighborhood" for emphasis.

3) Abstract algebra. The reader will need a familiarity with the basics of abstract algebra at least to the level of the basic isomorphism theorems for groups and rings.

4) Multivariable calculus. The reader should be familiar with the idea that the derivative at a point $p$ of a map between open sets of (normed) vector spaces is a linear transformation between the vector spaces. Usually the normed spaces are assumed to be the Euclidean coordinate spaces such as $\mathbb{R}^n$ with the norm $\|x\| = \sqrt{x \cdot x}$. A reader who felt the need for a review could do no better than to study roughly the first half of the classic book "Calculus on Manifolds" by Michael Spivak. Also, the online supplement ([Lee, Jeff]) gives a brief treatment of differential calculus on Banach spaces. Here we simply review a few definitions and notations.
Notation 1.1. The elements of $\mathbb{R}^n$ are n-tuples of real numbers, and we shall write the indices as superscripts, so $v = (v^1, \dots, v^n)$. Furthermore, when using matrix algebra, elements of $\mathbb{R}^n$ are most often written as column vectors, so in this context $v = [v^1, \dots, v^n]^t$. On the other hand, elements of the dual space of $\mathbb{R}^n$ are often written as row vectors and the indices are written as subscripts. With this convention, an element of the dual space $(\mathbb{R}^n)^*$ acts on an element of $\mathbb{R}^n$ by matrix multiplication. We shall only be careful to write elements of $\mathbb{R}^n$ as column vectors if necessary (as when we write $Av$ for some $m \times n$ matrix $A$ and $v \in \mathbb{R}^n$). If $f : U \to \mathbb{R}^m$, then $f = (f^1, \dots, f^m)$ for real-valued functions $f^1, \dots, f^m$.
Definition 1.2. Let $U$ be an open subset of $\mathbb{R}^n$. A map $f : U \to \mathbb{R}^m$ is said to be differentiable at $a \in U$ if and only if there is a linear map $A_a : \mathbb{R}^n \to \mathbb{R}^m$ such that
\[
\lim_{\|h\| \to 0} \frac{\|f(a + h) - f(a) - A_a(h)\|}{\|h\|} = 0.
\]
The map $A_a$ is uniquely determined by $f$ and $a$ and is denoted by $Df(a)$ or $Df|_a$.
Notation 1.3. If $L : V \to W$ is a linear transformation and $v \in V$, we shall often denote $L(v)$ by either $L \cdot v$ or $Lv$ depending on which is clearer in a given context. This applies to $Df(a)$, so we shall have occasion to write things like $Df(a)(v)$, $Df(a) \cdot v$, $Df(a)v$ or $Df|_a v$. The space of linear maps from $V$ to $W$ is denoted by $L(V, W)$ and also by $\mathrm{Hom}(V, W)$. With respect to standard bases, $Df(a)$ is given by the $m \times n$ matrix of partial derivatives (the Jacobian matrix). Thus if $w = Df(a)v$, then
\[
w^i = \sum_j \frac{\partial f^i}{\partial x^j}(a)\, v^j .
\]
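The formula $w^i = \sum_j \frac{\partial f^i}{\partial x^j}(a) v^j$ can be checked numerically. The following is a minimal sketch and not part of the original text: the polar-to-Cartesian map, the point `a`, the vector `v` and all helper names are illustrative assumptions. It compares the Jacobian applied to `v` with the difference quotient $(f(a + tv) - f(a))/t$ for a small $t$.

```python
import math

def f(x):
    """Illustrative map f(r, theta) = (r cos theta, r sin theta)."""
    r, theta = x
    return [r * math.cos(theta), r * math.sin(theta)]

def jacobian(x):
    """Matrix of partial derivatives of f at x, computed by hand."""
    r, theta = x
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def mat_vec(matrix, v):
    """Compute w^i = sum_j matrix[i][j] * v^j."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in matrix]

def difference_quotient(func, a, v, t=1e-6):
    """Approximate Df(a)v by (f(a + t v) - f(a)) / t."""
    shifted = [a_j + t * v_j for a_j, v_j in zip(a, v)]
    return [(p - q) / t for p, q in zip(func(shifted), func(a))]

a = [2.0, math.pi / 6]   # hypothetical base point (r, theta)
v = [0.3, -0.5]          # hypothetical direction vector
w_exact = mat_vec(jacobian(a), v)
w_approx = difference_quotient(f, a, v)
print(w_exact, w_approx)
```

The two printed vectors should agree to several decimal places; the residual difference shrinks with $t$, in line with Definition 1.2.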
Note that for a differentiable real-valued function $f$, we will often denote $\partial f/\partial x^i$ by $\partial_i f$. Recall that if $U$, $V$ and $W$ are vector spaces, then a map $\beta : U \times V \to W$ is called bilinear if for each fixed $u_0 \in U$ and fixed $v_0 \in V$, the maps $v \mapsto \beta(u_0, v)$ and $u \mapsto \beta(u, v_0)$ are linear. If $\beta : V \times V \to W$ is bilinear and $\beta(u, v) = \beta(v, u)$ for all $u, v \in V$, then $\beta$ is said to be symmetric. Similarly, antisymmetry or skewsymmetry is defined by the condition $\beta(u, v) = -\beta(v, u)$. If $V$ is a real (resp. complex) vector space, then a bilinear map $V \times V \to \mathbb{R}$ (resp. $\mathbb{C}$) is called a bilinear form.
If $Df(a)$ is defined for all $a \in U$, then we obtain a map $Df : U \to L(\mathbb{R}^n, \mathbb{R}^m)$, and since $L(\mathbb{R}^n, \mathbb{R}^m) \cong \mathbb{R}^{mn}$, we can consider the differentiability of $Df$. Thus if $Df$ is differentiable at $a$, then we have a second derivative $D^2 f(a) : \mathbb{R}^n \to L(\mathbb{R}^n, \mathbb{R}^m)$, and so if $v, w \in \mathbb{R}^n$, then $(D^2 f(a)v)w \in \mathbb{R}^m$. This allows us to think of $D^2 f(a)$ as a bilinear map:
\[
D^2 f(a)(v, w) := (D^2 f(a)v)\,w \quad \text{for } v, w \in \mathbb{R}^n .
\]
If $u = D^2 f(a)(v, w)$, then with respect to standard bases, we have
\[
u^i = \sum_{j,k} \frac{\partial^2 f^i}{\partial x^j \partial x^k}(a)\, v^j w^k .
\]
Higher derivatives $D^r f$ can be defined similarly as multilinear maps (see D.13 of Appendix D), and if $D^r f$ exists and is continuous on $U$, then we say that $f$ is r-times continuously differentiable, or $C^r$, on $U$. A map $f$ is $C^r$ if and only if all partial derivatives of order less than or equal to $r$ of the component functions $f^i$ exist and are continuous on $U$. The vector space of all such maps is denoted by $C^r(U, \mathbb{R}^m)$, and $C^r(U, \mathbb{R})$ is abbreviated to $C^r(U)$. If $f$ is $C^r$ for all $r \ge 0$, then we say that $f$ is smooth or $C^\infty$. The vector space of all smooth maps from $U$ to $\mathbb{R}^m$ is denoted by $C^\infty(U, \mathbb{R}^m)$ and we thereby include $\infty$ as a possible value for $r$. Also, $f$ is said to be $C^r$ at $p$ if its restriction to some open neighborhood of $p$ is of class $C^r$. If $f$ is $C^2$ near $a$, then $D^2 f(a)$ is symmetric and this is reflected in the fact that we have equality of mixed second order partial derivatives: $\partial^2 f^i/\partial x^j \partial x^k = \partial^2 f^i/\partial x^k \partial x^j$. Similarly, if $f$ is $C^r$ near $a$, then the order of partial differentiations can be rearranged in any mixed partial derivative of order less than or equal to $r$.
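As a small illustration of the symmetry of $D^2 f(a)$ (an example chosen here for concreteness, not taken from the text), consider the smooth function $f(x, y) = x^2 y^3$ on $\mathbb{R}^2$. Its mixed partial derivatives agree, and the resulting bilinear map is visibly symmetric:

```latex
\begin{align*}
\frac{\partial f}{\partial x} &= 2xy^3, &
\frac{\partial f}{\partial y} &= 3x^2y^2, \\
\frac{\partial^2 f}{\partial y\,\partial x} &= 6xy^2, &
\frac{\partial^2 f}{\partial x\,\partial y} &= 6xy^2,
\end{align*}
so for $a = (x, y)$, $v = (v^1, v^2)$ and $w = (w^1, w^2)$,
\[
D^2 f(a)(v, w)
  = 2y^3\, v^1 w^1 + 6xy^2\,(v^1 w^2 + v^2 w^1) + 6x^2 y\, v^2 w^2 ,
\]
which is unchanged when $v$ and $w$ are interchanged.
```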
Definition 1.4. A bijection $f$ between open sets $U \subset \mathbb{R}^n$ and $V \subset \mathbb{R}^m$ is called a $C^r$ diffeomorphism if and only if $f$ and $f^{-1}$ are both $C^r$ differentiable. If $r = \infty$, then we simply call $f$ a diffeomorphism.

Definition 1.5. Let $U$ be open in $\mathbb{R}^n$. A map $f : U \to \mathbb{R}^n$ is called a local $C^r$ diffeomorphism if and only if for every $p \in U$ there is an open set $U_p \subset U$ with $p \in U_p$ such that $f(U_p)$ is open and $f|_{U_p} : U_p \to f(U_p)$ is a $C^r$ diffeomorphism.

We will sometimes think of the derivative of a curve¹ $c : I \subset \mathbb{R} \to \mathbb{R}^m$ at $t_0 \in I$ as a velocity vector, and so we are identifying $Dc|_{t_0} \in L(\mathbb{R}, \mathbb{R}^m)$ with $Dc|_{t_0} \cdot 1 \in \mathbb{R}^m$. Here the number 1 is playing the role of the unit vector in $\mathbb{R}$. Especially in this context, we write the velocity vector using the notation $\dot{c}(t_0)$ or $c'(t_0)$.

Let $f : U \subset \mathbb{R}^n \to \mathbb{R}^m$ be a map and suppose that we write $\mathbb{R}^n = \mathbb{R}^k \times \mathbb{R}^l$. Let $(x, y)$ denote a generic element of $\mathbb{R}^k \times \mathbb{R}^l$. For every $(a, b) \in U \subset \mathbb{R}^k \times \mathbb{R}^l$ the partial maps $f_a : y \mapsto f(a, y)$ and $f_b : x \mapsto f(x, b)$ are defined in some neighborhood of $b$ (resp. $a$). We define the partial derivatives, when they exist, by $D_2 f(a, b) := Df_a(b)$ and $D_1 f(a, b) := Df_b(a)$. These are, of course, linear maps:
\[
D_1 f(a, b) : \mathbb{R}^k \to \mathbb{R}^m, \qquad D_2 f(a, b) : \mathbb{R}^l \to \mathbb{R}^m .
\]

¹ We will often use the letter $I$ to denote a generic (usually open) interval in the real line.

Remark 1.6. Notice that if we consider the maps $i_a : x \mapsto (a, x)$ and $i^b : x \mapsto (x, b)$, then $D_2 f(a, b) = D(f \circ i_a)(b)$ and $D_1 f(a, b) = D(f \circ i^b)(a)$.

Proposition 1.7. If $f$ has continuous partial derivatives $D_i f(x, y)$, $i = 1, 2$, near $(x, y) \in \mathbb{R}^k \times \mathbb{R}^l$, then $Df(x, y)$ exists and is continuous. In this case, we have for $v = (v_1, v_2) \in \mathbb{R}^k \times \mathbb{R}^l$,
\[
Df(x, y) \cdot (v_1, v_2) = D_1 f(x, y) \cdot v_1 + D_2 f(x, y) \cdot v_2 .
\]
Clearly we can consider maps on several factors $f : \mathbb{R}^{k_1} \times \mathbb{R}^{k_2} \times \cdots \times \mathbb{R}^{k_r} \to \mathbb{R}^m$ and then we can define partial derivatives $D_i f : \mathbb{R}^{k_i} \to \mathbb{R}^m$ for $i = 1, \dots, r$ in the obvious way. Notice that the meaning of $D_i f$ depends on how we factor the domain. For example, we have both $\mathbb{R}^3 = \mathbb{R}^2 \times \mathbb{R}$ and also $\mathbb{R}^3 = \mathbb{R} \times \mathbb{R} \times \mathbb{R}$. Let $U$ be an open subset of $\mathbb{R}^n$ and let $f : U \to \mathbb{R}$ be a map. Note that for the factorization $\mathbb{R}^n = \mathbb{R} \times \cdots \times \mathbb{R}$, the linear map $(D_i f)(a)$ is often identified with the number $\partial_i f(a)$.

Theorem 1.8 (Chain Rule). Let $U$ be an open subset of $\mathbb{R}^n$ and $V$ an open subset of $\mathbb{R}^m$. If $f : U \to \mathbb{R}^m$ and $g : V \to \mathbb{R}^d$ are maps such that $f$ is differentiable at $a \in U$ and $g$ is differentiable at $f(a)$, then $g \circ f$ is differentiable at $a$ and
\[
D(g \circ f)(a) = Dg(f(a)) \circ Df(a).
\]
Furthermore, if $f$ and $g$ are $C^r$ at $a$ and $f(a)$ respectively, then $g \circ f$ is $C^r$ at $a$. The chain rule may also be written
\[
D_i(g \circ f)(a) = \sum_j D_j g(f(a))\, D_i f^j(a),
\]
where $f = (f^1, \dots, f^m)$.
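In matrix terms the chain rule says that the Jacobian of $g \circ f$ at $a$ is the product of the Jacobian of $g$ at $f(a)$ with the Jacobian of $f$ at $a$. The sketch below is an illustration under assumptions of our own (the particular maps `f` and `g`, the point `a`, and the helper names are not from the text); it compares that product with a central-difference approximation of the Jacobian of the composite.

```python
import math

def f(p):
    """Illustrative map f : R^2 -> R^2."""
    x, y = p
    return [x * y, x + y * y]

def Df(p):
    """Jacobian of f, computed by hand."""
    x, y = p
    return [[y, x],
            [1.0, 2.0 * y]]

def g(q):
    """Illustrative map g : R^2 -> R^1."""
    u, v = q
    return [math.sin(u) + u * v]

def Dg(q):
    """Jacobian of g, computed by hand."""
    u, v = q
    return [[math.cos(u) + v, u]]

def mat_mul(A, B):
    """Matrix product: (AB)[i][j] = sum_k A[i][k] * B[k][j]."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def numerical_jacobian(func, a, t=1e-6):
    """Central-difference approximation of the Jacobian of func at a."""
    out_dim = len(func(a))
    columns = []
    for j in range(len(a)):
        plus, minus = list(a), list(a)
        plus[j] += t
        minus[j] -= t
        fp, fm = func(plus), func(minus)
        columns.append([(fp[i] - fm[i]) / (2 * t) for i in range(out_dim)])
    # columns[j][i] approximates d func^i / d x^j; transpose into rows.
    return [[columns[j][i] for j in range(len(a))] for i in range(out_dim)]

a = [0.7, -1.2]                      # hypothetical base point
chain = mat_mul(Dg(f(a)), Df(a))     # Dg(f(a)) composed with Df(a)
direct = numerical_jacobian(lambda p: g(f(p)), a)
print(chain, direct)
```

The entry `chain[0][i]` is exactly the sum $\sum_j D_j g(f(a)) D_i f^j(a)$ from the component form of the chain rule.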
Notation 1.9. Einstein Summation Convention. Summations such as
\[
h^i = \sum_{j=1}^{n} \tau^i_j a^j
\]
occur often in differential geometry. It is often convenient to employ a convention whereby summation over repeated indices is implied. This convention is attributed to Einstein and is called the Einstein summation convention. Using this convention, the above equation would be written
\[
h^i = \tau^i_j a^j .
\]
The range of the indices is either determined by context or must be explicitly mentioned. We shall use this convention in some later chapters.
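For readers who prefer to see the implied sum spelled out, here is a minimal sketch (the array values and the function name `contract` are hypothetical, chosen only for illustration). The repeated index $j$ becomes an explicit loop, and its range is read off from the length of `a`, which is the "determined by context" part of the convention.

```python
def contract(tau, a):
    """Return h with h^i = tau^i_j a^j, summing over the repeated index j."""
    n = len(a)  # range of the summation index j
    return [sum(tau[i][j] * a[j] for j in range(n)) for i in range(len(tau))]

# tau[i][j] plays the role of tau^i_j (upper index i, lower index j).
tau = [[1, 2, 0],
       [0, 1, -1]]
a = [3, 4, 5]
h = contract(tau, a)
print(h)  # prints [11, -1]
```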
Finally, we will use the notion of a commutative diagram. The reader unfamiliar with this notion should consult Appendix A.
1.2. Topological Manifolds

We recall a few concepts from point set topology. A cover of a topological space $X$ is a family of sets $\{U_\beta\}_{\beta \in B}$ such that $X = \bigcup_\beta U_\beta$. If all the sets $U_\beta$ are open, we call it an open cover. A refinement of a cover $\{U_\beta\}_{\beta \in B}$ of a topological space $X$ is another cover $\{V_i\}_{i \in I}$ such that every set from the second cover is contained in at least one set from the original cover. This means that if $\{U_\beta\}_{\beta \in B}$ is the given cover of $X$, then a refinement may be described as a cover $\{V_i\}_{i \in I}$ together with a set map $i \mapsto \beta(i)$ of the indexing sets $I \to B$ such that $V_i \subset U_{\beta(i)}$ for all $i$. Two covers $\{U_\alpha\}_{\alpha \in A}$ and $\{U_\beta\}_{\beta \in B}$ have a common refinement. Indeed, we simply let $I = A \times B$ and then let $U_i = U_\alpha \cap U_\beta$ if $i = (\alpha, \beta)$. This common refinement will obviously be open if the two original covers were open. We say that a cover $\{V_i\}_{i \in I}$ of $X$ (by not necessarily open sets) is a locally finite cover if every point of $X$ has a neighborhood that intersects only a finite number of sets from the cover. A topological space $X$ is called paracompact if every open cover of $X$ has a refinement which is a locally finite open cover.

Definition 1.10. Let $X$ be a set. A collection $\mathfrak{B}$ of subsets of $X$ is called a basis of subsets of $X$ if the following conditions are satisfied:

(i) $X = \bigcup_{B \in \mathfrak{B}} B$;

(ii) If $B_1, B_2 \in \mathfrak{B}$ and $x \in B_1 \cap B_2$, then there exists a set $B \in \mathfrak{B}$ with $x \in B \subset B_1 \cap B_2$.
It is a fact of elementary point set topology that if $\mathfrak{B}$ is a basis of subsets of a set $X$, then the family $\mathcal{T}$ of all possible unions of members of $\mathfrak{B}$ is a topology on $X$. In this case, we say that $\mathfrak{B}$ is a basis for the topology $\mathcal{T}$ and $\mathcal{T}$ is said to be generated by the basis. In thinking about bases for topologies, it is useful to introduce a certain technical notion as follows: If $\mathfrak{B}$ is any family of subsets of $X$, then we say that a subset $U \subset X$ satisfies the basis criterion with respect to $\mathfrak{B}$ if for any $x \in U$, there is a $B \in \mathfrak{B}$ with $x \in B \subset U$. Then we have the following technical lemma which is sometimes used without explicit mention.

Lemma 1.11. If $\mathfrak{B}$ is a basis of subsets of $X$, then the topology generated by $\mathfrak{B}$ is exactly the family of all subsets of $X$ which satisfy the basis criterion with respect to the family $\mathfrak{B}$. (See Problem 2.)

A topological space is called second countable if its topology has a countable basis. The space $\mathbb{R}^n$ with the usual topology derived from the Euclidean distance function is second countable since we have a basis for the topology consisting of open balls with rational radii centered at points with rational coordinates.
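On a finite set, the conditions of Definition 1.10 and the statement of Lemma 1.11 can be checked by brute force. The following sketch is only an illustration: the three-point set `X`, the family `basis`, and all function names are assumptions made for this example. It generates the topology as the family of all unions of basis members and compares it with the family of subsets satisfying the basis criterion.

```python
from itertools import combinations

def is_basis(X, family):
    """Check conditions (i) and (ii) of Definition 1.10 on a finite set X."""
    covers = set().union(*family) == X
    refines = all(
        any(x in B and B <= (B1 & B2) for B in family)
        for B1 in family for B2 in family for x in B1 & B2
    )
    return covers and refines

def generated_topology(family):
    """All unions of members of the family (the empty union is the empty set)."""
    members = list(family)
    opens = set()
    for k in range(len(members) + 1):
        for chosen in combinations(members, k):
            opens.add(frozenset().union(*chosen))
    return opens

def satisfies_basis_criterion(U, family):
    """Every x in U lies in some member B of the family with B contained in U."""
    return all(any(x in B and B <= U for B in family) for x in U)

def all_subsets(X):
    """Every subset of the finite set X, as frozensets."""
    items = list(X)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

X = frozenset({1, 2, 3})
basis = [frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3})]
topology = generated_topology(basis)
criterion_sets = {U for U in all_subsets(X)
                  if satisfies_basis_criterion(U, basis)}
print(is_basis(X, basis), topology == criterion_sets)
```

For this small example both printed values are `True`, as Lemma 1.11 predicts. A family such as `{1, 2}`, `{2, 3}` fails condition (ii) at the point 2, and `is_basis` reports this.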
Definition 1.12. An n-dimensional topological manifold is a paracompact Hausdorff topological space, say $M$, such that every point $p \in M$ is contained in some open set $U_p$ that is homeomorphic to an open subset of the Euclidean space $\mathbb{R}^n$. Thus we say that a topological manifold is "locally Euclidean". The integer $n$ is referred to as the dimension of $M$, and we denote it by $\dim(M)$.

Note: At first it may seem that a locally Euclidean space must be Hausdorff, but this is not the case.

Example 1.13. $\mathbb{R}^n$ is trivially a topological manifold of dimension $n$.

Example 1.14. The unit circle $S^1 := \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}$ is a 1-dimensional topological manifold. Indeed, the map $\mathbb{R} \to S^1$ given by $\theta \mapsto (\cos\theta, \sin\theta)$ has restrictions to small open sets which are homeomorphisms. The boundary of a square in the plane is a topological manifold homeomorphic to the circle, and so we say that it is a topological circle. More generally, the n-sphere
\[
S^n := \Big\{(x^1, \dots, x^{n+1}) \in \mathbb{R}^{n+1} : \sum (x^i)^2 = 1\Big\}
\]
is a topological manifold.
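The claim in Example 1.14 that $\theta \mapsto (\cos\theta, \sin\theta)$ restricts to homeomorphisms on small open sets can be made concrete. In the sketch below (function names and the sample angles are illustrative assumptions), the restriction to the open interval of half-width $\pi$ around a chosen `center` is inverted explicitly with `atan2`, while the unrestricted map is seen not to be injective.

```python
import math

def circle_map(theta):
    """The map R -> S^1 given by theta |-> (cos theta, sin theta)."""
    return (math.cos(theta), math.sin(theta))

def local_inverse(point, center):
    """Inverse of circle_map restricted to (center - pi, center + pi)."""
    x, y = point
    # Rotate the circle so that the angle `center` is moved to angle 0,
    # then read off the remaining angle with atan2, which lies in (-pi, pi].
    c, s = math.cos(center), math.sin(center)
    x_rot = x * c + y * s
    y_rot = -x * s + y * c
    return center + math.atan2(y_rot, x_rot)

center = 3.0
samples = [center - 3.0, center - 1.0, center, center + 2.5]
recovered = [local_inverse(circle_map(t), center) for t in samples]
print(samples, recovered)
```

Both the restricted map and this inverse are continuous on their domains, which is what makes the restriction a homeomorphism onto its image.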
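If $M_1$ and $M_2$ are topological manifolds of dimensions $n_1$ and $n_2$ respectively, then $M_1 \times M_2$, with the product topology, is a topological manifold of dimension $n_1 + n_2$. Such a manifold is called a product manifold. $M_1 \times M_2$ is locally Euclidean. Indeed, the required homeomorphisms are constructed in the obvious way from those defined on $M_1$ and $M_2$: If $\phi_p : U_p \subset M_1 \to V_{\phi(p)} \subset \mathbb{R}^{n_1}$ and $\psi_q : U_q \subset M_2 \to V_{\psi(q)} \subset \mathbb{R}^{n_2}$ are homeomorphisms, then $U_p \times U_q$ is a neighborhood of $(p, q)$ and we have a homeomorphism
\[
\phi_p \times \psi_q : U_p \times U_q \to V_{\phi(p)} \times V_{\psi(q)} \subset \mathbb{R}^{n_1} \times \mathbb{R}^{n_2},
\]
where $(\phi_p \times \psi_q)(x, y) := (\phi_p(x), \psi_q(y))$.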
Example 1.15. The n-torus
\[
T^n := S^1 \times S^1 \times \cdots \times S^1 \quad (n \text{ factors})
\]
is a topological manifold of dimension $n$.
Recall that a topological space is connected if it is not the disjoint union of nonempty open subsets. A subset of a topological space is said to be connected if it is connected with the subspace topology. If a space is not connected, then it can be decomposed into components:

Definition 1.16. Let $X$ be a topological space and $x \in X$. The component containing $x$, denoted $C(x)$, is defined to be the union of all connected subsets of $X$ that contain $x$. A subset of a topological space $X$ is a (connected) component if it is $C(x)$ for some $x \in X$.

A topological space is called locally connected if the topology has a basis consisting of connected sets. If a topological space is locally connected, then the connected components of each open set (considered with its subspace topology) are all open. In particular, each connected component of a locally connected space is open (and also closed). This clearly applies to manifolds. A continuous curve $\gamma : [a, b] \to M$ is said to connect a point $p$ to a point $q$ in $M$ if $\gamma(a) = p$ and $\gamma(b) = q$. We define an equivalence relation on a topological space $M$ by declaring $p \sim q$ if and only if there is a continuous curve connecting $p$ to $q$. The equivalence classes are called path components, and if there is only one path component, then we say that $M$ is path connected.

Exercise 1.17. The path components of a manifold $M$ are exactly the connected components of $M$. Thus, a manifold is connected if and only if it is path connected.

We shall not give many examples of topological manifolds at this time because our main concern is with smooth manifolds defined below. We give plenty of examples of smooth manifolds, and every smooth manifold is also a topological manifold. Manifolds are often defined with the requirement of second countability because, when the manifold is Hausdorff, this condition implies paracompactness. Paracompactness is important in connection with the notion of a "partition of unity" discussed later in this book.
It is known that for a locally Euclidean Hausdorff space, paracompactness is equivalent to the property that each connected component is second countable. Thus if a locally Euclidean Hausdorff space has at most a countable number of components, then paracompactness implies second countability. Proposition B.5 of Appendix B gives a list of conditions equivalent to paracompactness for a locally Euclidean Hausdorff space. One theorem that fails if the manifold has an uncountable number of components is Sard's Theorem 2.34. Our approach will be to add in the requirement of second countability when needed.
In defining a topological manifold, one could allow the dimension $n$ of the Euclidean space to depend on the homeomorphism $\phi$ and so on the point $p \in M$. However, it is a consequence of a result of Brouwer called "invariance of domain" that $n$ would be constant on connected components of $M$. This result is rather easy to prove if the manifold has a differentiable structure (defined below), but is more difficult in general. We shall simply record Brouwer's theorem:
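Theorem 1.18 (Invariance of Domain). The image of an open set $U \subset \mathbb{R}^n$ under an injective continuous map $f : U \to \mathbb{R}^n$ is open and $f$ is a homeomorphism from $U$ to $f(U)$.

It follows that if a nonempty open set $U \subset \mathbb{R}^n$ is homeomorphic to an open set $V \subset \mathbb{R}^m$, then $m = n$.

Let us define $n$-dimensional closed Euclidean half-space to be
$$\mathbb{H}^n := \mathbb{R}^n_{u^n \ge 0} := \{(a^1, \dots, a^n) \in \mathbb{R}^n : a^n \ge 0\}.$$
The boundary of $\mathbb{H}^n$ is $\partial\mathbb{H}^n = \mathbb{R}^n_{u^n = 0} = \{(a^1, \dots, a^n) : a^n = 0\}$. The interior of $\mathbb{H}^n$ is $\operatorname{int}(\mathbb{H}^n) = \mathbb{H}^n \setminus \partial\mathbb{H}^n$. There are other half-spaces homeomorphic to $\mathbb{H}^n = \mathbb{R}^n_{u^n \ge 0}$. In fact, for a fixed $c \in \mathbb{R}$ and for any nonzero linear function $\lambda \in (\mathbb{R}^n)^*$, we define
$$\mathbb{R}^n_{\lambda \ge c} = \{a \in \mathbb{R}^n : \lambda(a) \ge c\},$$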
which includes lR~k>c and lR~k
o. However, this is just a convenience that reduces notation. In fact, from-the point of view of "manifold orientation" developed in a later chapter, the spaces of the form lR~l
1. Differentiable Manifolds
Exercise 1.19. Show that $\operatorname{int}(M) \cap \partial M = \emptyset$.
Exercise 1.20. The boundary $\partial M$ of an $n$-dimensional topological manifold with boundary is an $(n-1)$-dimensional topological manifold (without boundary).

From now on, by "manifold" we shall mean a manifold without boundary unless otherwise indicated or implied by context.

Example 1.21. Let $[a, b] \subset \mathbb{R}$ be a closed interval. If $N$ is a topological manifold of dimension $n - 1$, then $N \times [a, b]$ is an $n$-dimensional topological manifold with boundary $\partial(N \times [a, b]) = (N \times \{a\}) \cup (N \times \{b\})$.

Example 1.22. If $N$ is a topological manifold with boundary $\partial N$, then $N \times \mathbb{R}$ is a topological manifold with boundary and $\partial(N \times \mathbb{R}) = \partial N \times \mathbb{R}$.

Example 1.23. The closed cube $C^n = \{x : \max\{|x^1|, \dots, |x^n|\} \le 1\}$ with its subspace topology inherited from $\mathbb{R}^n$ is a topological manifold with boundary. The boundary is (homeomorphic to) an $(n-1)$-dimensional sphere.

Recall that the (topological) boundary of a set $S$ in a topological space $X$ is defined as the set of all points $p \in X$ with the property that every open set containing $p$ also contains elements of both $S$ and $X \setminus S$. This is not quite the same idea as boundary in the sense of manifold with boundary defined above. For example, the subset of $\mathbb{R}^2$ defined by
$$S := \{(x, y) \in \mathbb{R}^2 : 1 \le x^2 + y^2 < 2 \text{ or } 2 < x^2 + y^2 \le 3\}$$
is a manifold with boundary
$$\partial S = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1 \text{ or } x^2 + y^2 = 3\}.$$
On the other hand, the topological boundary of $S$ also contains the circle given by $x^2 + y^2 = 2$.

Topological manifolds, both with or without boundary, are paracompact, Hausdorff and hence also normal. This means that given any pair of disjoint closed sets $F_1, F_2 \subset M$, there are open sets $U_1$ and $U_2$ containing $F_1$ and $F_2$ respectively such that $U_1$ and $U_2$ are also disjoint. Recall that a topological space $X$ (or its topology) is called metrizable if there exists a metric on the space which induces the given topology of $X$. Since they are normal, manifolds are, a fortiori, regular. According to the Urysohn metrization theorem, every second countable regular Hausdorff space is metrizable. Now if every connected component of a space is metrizable, then the whole space is also metrizable by a standard trick which modifies the metric so as to make the distance between points in a given component always less than one while a pair of points from distinct components has distance one. Since
every paracompact Hausdorff manifold has second countable components, we see that all manifolds, as defined here, are metrizable. Thus, the reader may as well think of manifolds as special kinds of metric spaces. For more on manifold topology, consult [Matsu].
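The "standard trick" for gluing component metrics can be sketched in a few lines. This is only an illustration under stated assumptions: the rescaling $d \mapsto d/(1+d)$ is one common way to force distances below one (the text does not say which modification is intended), and the names `bounded`, `glued_metric`, `component_of` and `metric_on` are hypothetical.

```python
def bounded(d):
    """Rescale a distance d >= 0 to d / (1 + d), which lies in [0, 1).

    Assumption: this particular rescaling is our choice; any equivalent
    bounded metric with values below one would serve the same purpose.
    """
    return d / (1.0 + d)


def glued_metric(component_of, metric_on):
    """Build a metric on a disjoint union of metric spaces.

    component_of(p)      -> label of the component containing p
    metric_on[label](p, q) -> distance measured inside that component
    """
    def dist(p, q):
        cp, cq = component_of(p), component_of(q)
        if cp != cq:
            return 1.0  # points in distinct components: distance exactly one
        return bounded(metric_on[cp](p, q))  # same component: strictly below one
    return dist


# Toy space: two disjoint copies of the real line; a point is (label, t).
component_of = lambda p: p[0]
line = lambda p, q: abs(p[1] - q[1])
dist = glued_metric(component_of, {"A": line, "B": line})
```

With this choice each component stays open and closed in the glued metric, since the ball of radius one around a point never leaves its component.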
1.3. Charts, Atlases and Smooth Structures
In this section we introduce the notion of charts or coordinate systems. The existence of such charts is what allows for a well-defined notion of what it means for a function on a manifold to be differentiable and also what it means for a map from one manifold to another to be differentiable.

Definition 1.24. Let $M$ be a set. A chart on $M$ is a bijection of a subset $U \subset M$ onto an open subset of some Euclidean space $\mathbb{R}^n$. We say that the chart takes values in $\mathbb{R}^n$ or simply that the chart is $\mathbb{R}^n$-valued. A chart $x : U \to x(U) \subset \mathbb{R}^n$ is traditionally indicated by the pair $(U, x)$. If $\mathrm{pr}_i : \mathbb{R}^n \to \mathbb{R}$ is the projection onto the $i$-th factor given by $\mathrm{pr}_i(a^1, \dots, a^n) = a^i$, then $x^i$ is the function defined by $x^i = \mathrm{pr}_i \circ x$ and is called the $i$-th coordinate function for the chart $(U, x)$. We often write $x = (x^1, \dots, x^n)$. If $p \in M$ and $(U, x)$ is a chart with $p \in U$ and $x(p) = 0 \in \mathbb{R}^n$, then we say that the chart is centered at $p$.
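Definition 1.25. Let $\mathcal{A} = \{(U_\alpha, x_\alpha)\}_{\alpha \in A}$ be a collection of $\mathbb{R}^n$-valued charts on a set $M$. We call $\mathcal{A}$ an $\mathbb{R}^n$-valued atlas of class $C^r$ if the following conditions are satisfied:

(i) $\bigcup_{\alpha \in A} U_\alpha = M$.

(ii) The sets of the form $x_\alpha(U_\alpha \cap U_\beta)$ for $\alpha, \beta \in A$ are all open in $\mathbb{R}^n$.

(iii) Whenever $U_\alpha \cap U_\beta$ is not empty, the map
$$x_\beta \circ x_\alpha^{-1} : x_\alpha(U_\alpha \cap U_\beta) \to x_\beta(U_\alpha \cap U_\beta)$$
is a $C^r$ diffeomorphism.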
Remark 1.26. Here $x_\beta \circ x_\alpha^{-1}$ is really a shorthand for
$$x_\beta|_{U_\alpha \cap U_\beta} \circ x_\alpha^{-1}|_{x_\alpha(U_\alpha \cap U_\beta)},$$
but this notation is far too pedantic and cluttered for most people's tastes.

The maps $x_\beta \circ x_\alpha^{-1}$ in the definition are called overlap maps or change of coordinate maps. It is exactly the way we have required the overlap maps to be diffeomorphisms that will allow us to have a well-defined and useful notion of what it means for a function on $M$ to be differentiable of class $C^r$. An atlas of class $C^r$ is also called a $C^r$ atlas.
There exist various simplifying but occasionally ambiguous notational conventions regarding coordinates. Consider an arbitrary pair of charts $(U, x)$ and $(V, y)$ and the overlap map $y \circ x^{-1} : x(U \cap V) \to y(U \cap V)$. If we write $x = (x^1, \dots, x^n)$ and $y = (y^1, \dots, y^n)$, then we have
$$y^i(p) = y^i \circ x^{-1}(x^1(p), \dots, x^n(p)) \tag{1.1}$$
for any $p \in U \cap V$, which makes sense, but in the literature we also see
$$y^i = y^i(x^1, \dots, x^n). \tag{1.2}$$
In considering this last expression, one might wonder if the $x^i$ are functions or numbers. But this ambiguity is sort of purposeful. For if (1.1) is true for all $p \in U \cap V$, then (1.2) is true for all $(x^1, \dots, x^n) \in x(U \cap V)$, and so we are unlikely to be led into error.

Definition 1.27. Two $C^r$ atlases $\mathcal{A}_1$ and $\mathcal{A}_2$ on $M$ are said to be equivalent provided that $\mathcal{A}_1 \cup \mathcal{A}_2$ is also a $C^r$ atlas for $M$. A $C^r$ differentiable structure on $M$ is an equivalence class of $C^r$ atlases. A $C^\infty$ differentiable structure will also be called a smooth structure.

Exercise 1.28. Show that the notion of equivalence of atlases given above defines an equivalence relation.

The union of all the $C^r$ atlases in an equivalence class is itself an atlas that is in the equivalence class. Such an atlas is called a maximal $C^r$ atlas since it is not properly contained in any larger atlas. Atlases obtained in this way are precisely those that are maximal with respect to the partial ordering of atlases by inclusion. Thus every $C^r$ atlas is contained in a unique maximal $C^r$ atlas which is the union of all the atlases equivalent to it. We thereby obtain a 1-1 correspondence between the set of $C^r$ differentiable structures and the set of maximal $C^r$ atlases. Thus an
Figure 1.1. Chart overlaps (the images $x(U \cap V)$ and $y(U \cap V)$)
alternative way to define a $C^r$ differentiable structure is as a maximal $C^r$ atlas. We shall use this alternative quite often without comment.

As soon as we have any $C^r$ atlas, we have a determined $C^r$ differentiable structure. Indeed, we just take the equivalence class of this atlas. Alternatively, we take the maximal atlas that contains the given atlas.

A pair of charts, say $(U, x)$ and $(V, y)$, are said to be $C^r$-related if either $U \cap V = \emptyset$ or both $x(U \cap V)$ and $y(U \cap V)$ are open and $x \circ y^{-1}$ and $y \circ x^{-1}$ are $C^r$ maps. Thus an atlas is just a family of mutually $C^r$-related charts whose domains form a cover of the manifold. We say that a chart $(U, x)$ on $M$ is compatible with a $C^r$ atlas $\mathcal{A}$ if $\mathcal{A} \cup \{(U, x)\}$ is also a $C^r$ atlas. This just means that $(U, x)$ is $C^r$-related to every chart in $\mathcal{A}$. The maximal $C^r$ atlas determined by $\mathcal{A}$ is composed of exactly the charts compatible with $\mathcal{A}$. We also say that a chart from the maximal atlas that gives the smooth structure is admissible. Charts will be assumed admissible unless otherwise indicated.

Example 1.29. The space $\mathbb{R}^n$ itself has an atlas consisting of the single chart $(\mathbb{R}^n, \mathrm{id})$, where $\mathrm{id} : \mathbb{R}^n \to \mathbb{R}^n$ is just the identity map. This atlas determines a differentiable structure.

Lemma 1.30. Let $M$ be a set with a $C^r$ structure given by an atlas $\mathcal{A} = \{(U_\alpha, x_\alpha)\}_{\alpha \in A}$. If $(U, x)$ and $(V, y)$ are charts compatible with $\mathcal{A}$ such that $U \cap V \neq \emptyset$, then the charts $(U \cap V, x|_{U \cap V})$ and $(U \cap V, y|_{U \cap V})$ are also compatible with $\mathcal{A}$ and hence are in the maximal atlas generated by $\mathcal{A}$. Furthermore, if $O$ is an open subset of $x(U)$ for some compatible chart $(U, x)$, then taking $V = x^{-1}(O)$ we have that $(V, x|_V)$ is also a compatible chart.

Proof. The assertions of the lemma are almost obvious: If $x \circ x_\alpha^{-1}$, $x_\alpha \circ x^{-1}$, $y \circ x_\alpha^{-1}$, $x_\alpha \circ y^{-1}$ are all $C^r$ diffeomorphisms, then certainly the restrictions $x|_{U \cap V} \circ x_\alpha^{-1}$, $x_\alpha \circ x|_{U \cap V}^{-1}$, $y|_{U \cap V} \circ x_\alpha^{-1}$ and $x_\alpha \circ y|_{U \cap V}^{-1}$ are also. One might just check that the natural domains of these maps are indeed open in $\mathbb{R}^n$. For example, the domain of $x|_{U \cap V} \circ x_\alpha^{-1}$ is $x_\alpha(U_\alpha \cap U \cap V) = x_\alpha(U_\alpha \cap U) \cap x_\alpha(U_\alpha \cap V)$ and both $x_\alpha(U_\alpha \cap U)$ and $x_\alpha(U_\alpha \cap V)$ are open because of what it means for $(U, x)$ and $(V, y)$ to be compatible. The last assertion of the lemma is equally easy to prove. $\square$

It follows that the family of sets which are the domains of charts from a maximal atlas provides a basis for a topology on $M$, which we call the topology induced by the $C^r$ structure on $M$ or simply the manifold topology if the $C^r$ structure is understood. Thus the open sets are exactly the empty set plus arbitrary unions of chart domains from the maximal atlas. Since a $C^r$ atlas determines a $C^r$ structure, we will also call this the topology induced by the atlas. This topology can be characterized as follows: A subset $V \subset M$ is open if and only if $x_\alpha(U_\alpha \cap V)$ is an open subset
of Euclidean space for all charts $(U_\alpha, x_\alpha)$ in any atlas $\{(U_\alpha, x_\alpha)\}_{\alpha \in A}$ giving the $C^r$ structure.
Exercise 1.31. Show that if $\mathcal{A}_1$ is a subatlas of $\mathcal{A}_2$, then they both induce the same topology.

Proposition 1.32. Let $M$ be a set with a $C^r$ structure given by an atlas $\mathcal{A}$. We have the following:

(i) If for every two distinct points $p, q \in M$, we have that either $p$ and $q$ are respectively in disjoint chart domains $U_\alpha$ and $U_\beta$ from the atlas, or they are both in a common chart domain, then the topology induced by the atlas is Hausdorff.

(ii) If $\mathcal{A}$ is countable, or has a countable subatlas, then the topology induced by the atlas is second countable.

(iii) If the collection of chart domains $\{U_\alpha\}_{\alpha \in A}$ from the atlas $\mathcal{A}$ is such that for every fixed $\alpha_0 \in A$ the set $\{\alpha \in A : U_\alpha \cap U_{\alpha_0} \neq \emptyset\}$ is at most countable, then the topology induced by the atlas is paracompact. Thus, if this condition holds and if $M$ is connected, then the topology induced by the atlas is second countable.

Proof. We leave the proofs of (i) and (ii) as a problem, or the reader may consult the online supplement [Lee, Jeff]. We prove (iii). Give $M$ the topology induced by the atlas. It is enough to show that each connected component has a countable basis. Thus we may as well assume that $M$ is connected. By (ii) it suffices to show that $\mathcal{A}$ is countable. Let $U_{\alpha_1}$ be a particular chart domain from the atlas. We proceed inductively to define a sequence of sets starting with $X_1 = U_{\alpha_1}$. Now given $X_{n-1}$, let $X_n$ be the union of those chart domains $U_\alpha$ which intersect $X_{n-1}$. It follows (inductively) that each $X_n$ is a countable union of chart domains and hence the same is true of the union $X = \bigcup_n X_n$. By construction, if some chart domain $U_\alpha$ meets $X$, then it is actually contained in $X$ since to meet $X_{n-1}$ is to be contained in $X_n$. All that is left is to show that $M = X$. We have reduced to the case that $M$ is connected, and since $X$ is open, it will suffice to show that $M \setminus X$ is also open. If $M \setminus X = \emptyset$ we are done. If $p \in M \setminus X$, then it is in some $U_\alpha$, and as we said, $U_\alpha$ cannot meet $X$ without being contained in $X$. Thus it must be the case that $U_\alpha \cap X = \emptyset$ and so $U_\alpha \subset M \setminus X$. We see that $M \setminus X$ is open as is $X$. Since $M$ is connected, we conclude that $M = X$ (and $M \setminus X = \emptyset$ after all). $\square$

This leads us to a principal definition:
Definition 1.33. A differentiable manifold of class $C^r$ is a set $M$ together with a specified $C^r$ structure on $M$ such that the topology induced by the $C^r$ structure is Hausdorff and paracompact. If the charts are $\mathbb{R}^n$-valued, then we say the manifold has dimension $n$. We write $\dim(M)$ for the dimension of $M$. In other words, an $n$-dimensional differentiable manifold of class $C^r$ is a pair $(M, \mathcal{A})$, where $\mathcal{A}$ is a maximal ($\mathbb{R}^n$-valued) $C^r$ atlas and such that the topology induced by the atlas makes $M$ a topological manifold. A differentiable manifold of class $C^r$ is also referred to as a $C^r$ manifold. A differentiable manifold of class $C^\infty$ is also called a smooth manifold and a $C^\infty$ atlas is also called a smooth atlas.$^3$

Notice that if $0 \le r_1 < r_2$, then any $C^{r_2}$ atlas is also a $C^{r_1}$ atlas and so any $C^{r_2}$ manifold is also a $C^{r_1}$ manifold. In fact, it is a result of Hassler Whitney that for $r > 0$, every maximal $C^r$ atlas contains a $C^\infty$ atlas. For this and other reasons, we will be mostly concerned with the $C^\infty$ case.

For every integer $n \ge 0$, the Euclidean space $\mathbb{R}^n$ is a smooth manifold where, as noted above, there is an atlas whose only member is the chart $(\mathbb{R}^n, \mathrm{id})$ and this atlas determines a maximal atlas providing the usual smooth structure for $\mathbb{R}^n$. If $V$ is an $n$-dimensional real vector space, then it is a smooth manifold of dimension $n$ in a natural way. Indeed, for each choice of basis $(e_1, \dots, e_n)$ we obtain a chart whose coordinate functions $x^i$ are defined so that $x^i(v) = a^i$ when $v = \sum a^i e_i$. The overlap maps between any two such charts are linear and hence smooth. Thus we have a smooth atlas which defines a smooth structure.
It is important to notice that if $r > 0$, then a $C^r$ manifold is much more than merely a topological manifold. Note also that we have defined manifolds in such a way that they are necessarily paracompact and Hausdorff. For many purposes, neither assumption is necessary. We could have just defined a $C^r$ manifold to be a set with a $C^r$ structure and with the topology induced by the $C^r$ structure. Proposition 1.32 tells us how to determine, from knowledge about a given atlas, whether the topology is indeed Hausdorff and/or paracompact. In Problem 3, we ask the reader to check that these topological conditions hold for the examples of smooth manifolds that we give in this chapter.

$^3$ In some contexts we will just say "atlas" when we mean "smooth atlas".

Notation 1.34. As defined, a $C^r$ manifold is a pair $(M, \mathcal{A})$. However, we follow the tradition of using the single symbol $M$ itself to denote the differentiable manifold if the atlas is understood.

Now we come to an important point. Suppose that $M$ already has some natural or previously given topology. For example, perhaps it is already known that $M$ is a topological manifold. If $M$ is given a $C^r$ structure, then it is important to know whether this topology is the same as the topology induced by the $C^r$ structure. For this consideration we have the following lemma which we ask the reader to prove in Problem 8:

Lemma 1.35. Let $(M, \mathcal{T})$ be a topological space which also has a $C^r$ atlas. If each chart of the atlas has open domain and is a homeomorphism with respect to this topology, then $\mathcal{T}$ will be the same as the topology induced by the $C^r$ structure.
A good portion of the examples of $C^r$ manifolds that we provide will be of the type described by this lemma. In fact, one often finds differentiable manifolds defined as topological manifolds that have a $C^r$ atlas consisting of charts that are homeomorphisms. In expositions that use this alternative definition, the fact that one can start out with a set, provide charts, and then end up with an appropriate topology is presented as a separate lemma (see for example [Lee, John] or [ONI]).

Exercise 1.36. Let $M$ be a smooth manifold of dimension $n$ and let $p \in M$. Show that for any $r > 0$ there is a chart $(U, x)$ with $p \in U$ and such that $x(U) = B(0, r) := \{x \in \mathbb{R}^n : |x| < r\}$. Show that for any $p \in U$ we may further arrange that $x(p) = 0$.

Remark 1.37. From now on all manifolds in this book will be assumed to be smooth manifolds unless otherwise indicated. Also, let us refer to an $n$-dimensional smooth manifold as an "$n$-manifold".
As mentioned above, it is certainly possible for there to be two different differentiable structures on the same topological manifold. For example, if $\phi : \mathbb{R}^1 \to \mathbb{R}^1$ is given by $\phi(x) = x^3$, then $\{(\mathbb{R}^1, \phi)\}$ is a smooth atlas for $\mathbb{R}^1$, but the resulting smooth structure is different from the usual structure provided by the atlas $\{(\mathbb{R}^1, \mathrm{id})\}$. The problem is that the inverse of $x \mapsto x^3$ is not differentiable (in the usual sense) at the origin. Now we have two differentiable structures on the line $\mathbb{R}^1$. Actually, although the two atlases do give distinct differentiable structures, they are equivalent in another sense (Definition 1.62 below).

If $U$ is some open subset of a smooth manifold $M$ with atlas $\mathcal{A}_M$, then $U$ is itself a differentiable manifold with an atlas of charts being given by all the restrictions $(U_\alpha \cap U, x_\alpha|_{U_\alpha \cap U})$ where $(U_\alpha, x_\alpha) \in \mathcal{A}_M$. We shall refer to such an open subset $U \subset M$ with this differentiable structure as an open submanifold of $M$. Open subsets of $\mathbb{R}^n$ might seem to be very uninteresting manifolds, but in fact they can be quite complex. For example, much can be learned about a knot $K \subset \mathbb{R}^3$ by studying its complement $\mathbb{R}^3 \setminus K$ and the latter is an open subset of $\mathbb{R}^3$.
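The failure of compatibility between the charts $\phi(x) = x^3$ and $\mathrm{id}$ can be seen numerically. The following is a rough sketch, not a proof: it evaluates difference quotients at $0$ of the overlap map $\mathrm{id} \circ \phi^{-1}(t) = t^{1/3}$ and shows that they grow without bound as the step shrinks. The function names are hypothetical.

```python
def phi(x):
    """The chart phi(x) = x**3 on the real line."""
    return x ** 3


def phi_inverse(t):
    """Real cube root, the inverse of phi (valid for negative t as well)."""
    sign = 1.0 if t >= 0 else -1.0
    return sign * abs(t) ** (1.0 / 3.0)


def overlap_quotient(h):
    """Difference quotient at 0 of the overlap map id o phi^{-1}."""
    return (phi_inverse(h) - phi_inverse(0.0)) / h


# The quotients behave like h**(-2/3), so they blow up as h tends to 0.
quotients = [overlap_quotient(10.0 ** (-k)) for k in (3, 6, 9)]
```

The other overlap map, $\phi \circ \mathrm{id}^{-1}(x) = x^3$, is smooth, so it is only the cube root direction that obstructs compatibility.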
We now give several examples of smooth manifolds. All of the examples are easily seen to be Hausdorff and paracompact.
Figure 1.2. Stereographic projection
Example 1.38. Consider the sphere $S^2 \subset \mathbb{R}^3$. We have the usual spherical coordinates $(\varphi, \theta)$, where $\varphi$ is the polar angle measured from the north pole ($z = 1$). We want the domain of this chart to be open so we restrict to the set where $0 < \varphi < \pi$ and $0 < \theta < 2\pi$. We can also use projection onto the coordinate planes as charts. For instance, let $U_i$ be the set of all $(x, y, z) \in S^2$ such that $z > 0$. Then $(x, y, z) \mapsto (x, y)$ provides a chart $U_i \to \mathbb{R}^2$. The various overlap maps of all of these charts are smooth (some can be computed explicitly without much trouble). It is easy to show that the topology induced by the atlas is the usual topology.

Example 1.39. We can also use stereographic projection to give charts on $S^2$. More generally, we can provide the $n$-sphere $S^n \subset \mathbb{R}^{n+1}$ with a smooth structure using two charts $(U_S, \psi_S)$ and $(U_N, \psi_N)$. Here,
$$U_S = \{x = (x_1, \dots, x_{n+1}) \in S^n : x_{n+1} \neq 1\},$$
$$U_N = \{x = (x_1, \dots, x_{n+1}) \in S^n : x_{n+1} \neq -1\}$$
and $\psi_S : U_S \to \mathbb{R}^n$ (resp. $\psi_N : U_N \to \mathbb{R}^n$) is stereographic projection from the north pole $p_N = (0, 0, \dots, 0, 1)$ (resp. south pole $p_S = (0, 0, \dots, 0, -1)$). Note that $\psi_S$ maps from the southern open set containing $p_S$. Explicitly we have
$$\psi_S(x) = \frac{1}{1 - x_{n+1}}(x_1, \dots, x_n) \in \mathbb{R}^n,$$
$$\psi_N(x) = \frac{1}{1 + x_{n+1}}(x_1, \dots, x_n) \in \mathbb{R}^n.$$

Exercise 1.40. Show that $\psi_S(U_N \cap U_S) = \psi_N(U_N \cap U_S) = \mathbb{R}^n \setminus \{0\}$ and that $\psi_S \circ \psi_N^{-1}(y) = y / \|y\|^2 = \psi_N \circ \psi_S^{-1}(y)$ for all $y \in \mathbb{R}^n \setminus \{0\}$. Thus we have an atlas $\{(U_S, \psi_S), (U_N, \psi_N)\}$. Verify that the topology induced by this atlas is the same as the usual topology on $S^n$ (as a subspace of $\mathbb{R}^{n+1}$) and that all the maps involved are smooth.
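As a sanity check on the overlap formula in Exercise 1.40, one can evaluate $\psi_S \circ \psi_N^{-1}$ numerically. The sketch below is not a solution to the exercise; the inverse formula in `psi_N_inverse` is our own derivation (solve $y = x'/(1 + x_{n+1})$ with $\|x\| = 1$), and the function names are hypothetical.

```python
def psi_S(x):
    """Stereographic projection from the north pole; needs x[-1] != 1."""
    return [xi / (1.0 - x[-1]) for xi in x[:-1]]


def psi_N(x):
    """Stereographic projection from the south pole; needs x[-1] != -1."""
    return [xi / (1.0 + x[-1]) for xi in x[:-1]]


def psi_N_inverse(y):
    """Inverse of psi_N (derived here, not quoted from the text).

    With s = |y|^2, the point on the sphere is
    (2y / (1 + s), (1 - s) / (1 + s)).
    """
    s = sum(yi * yi for yi in y)
    return [2.0 * yi / (1.0 + s) for yi in y] + [(1.0 - s) / (1.0 + s)]


def overlap(y):
    """The overlap map psi_S o psi_N^{-1} on R^n minus the origin."""
    return psi_S(psi_N_inverse(y))


y = [0.5, -2.0, 1.5]
s = sum(yi * yi for yi in y)
expected = [yi / s for yi in y]   # the claimed value y / |y|^2
result = overlap(y)
```

Comparing `result` with `expected` for a few sample points is a quick way to catch sign errors before attempting the algebra by hand.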
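If we identify $\mathbb{R}^2$ with $\mathbb{C}$, then the overlap maps for charts on $S^2$ from the last example become
$$\psi_S \circ \psi_N^{-1}(z) = \bar{z}^{-1} = \psi_N \circ \psi_S^{-1}(z) \tag{1.3}$$
for all $z \in \mathbb{C} \setminus \{0\}$. This observation will come in handy later.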
Example 1.41 (Projective spaces). The set of all lines through the origin in $\mathbb{R}^3$ is denoted $\mathbb{R}P^2$ and is called the real projective plane. Let $U_z$ be the set of all lines $\ell \in \mathbb{R}P^2$ not contained in the $x, y$ plane. Every line $\ell \in U_z$ intersects the plane $z = 1$ at exactly one point of the form $(x(\ell), y(\ell), 1)$. We can define a bijection $\varphi_z : U_z \to \mathbb{R}^2$ by letting $\ell \mapsto (x(\ell), y(\ell))$. This is a chart for $\mathbb{R}P^2$, and there are obviously two other analogous charts $(U_x, \varphi_x)$ and $(U_y, \varphi_y)$. These charts cover $\mathbb{R}P^2$ and form a smooth atlas since they have smooth overlap maps.
More generally, the set $\mathbb{R}P^n$ of all lines through the origin in $\mathbb{R}^{n+1}$ is called real projective $n$-space. We have the surjective map $\pi : \mathbb{R}^{n+1} \setminus \{0\} \to \mathbb{R}P^n$ given by letting $\pi(x)$ be the line through $x$ and the origin. We give $\mathbb{R}P^n$ the quotient topology where $U \subset \mathbb{R}P^n$ is open if and only if $\pi^{-1}(U)$ is open. Also $\mathbb{R}P^n$ is given an atlas consisting of charts of the form $(U_i, \varphi_i)$, where
$$U_i = \{\ell \in \mathbb{R}P^n : \ell \text{ is not contained in the hyperplane } x^i = 0\},$$
and $\varphi_i(\ell)$ is the unique $(u^1, \dots, u^n)$ such that $(u^1, \dots, u^{i-1}, 1, u^i, \dots, u^n) \in \ell$. Once again it can be checked that the overlap maps are smooth so that we have a smooth atlas. The topology induced by the atlas is exactly the quotient topology, and we leave it as an exercise to show that it is both paracompact and Hausdorff.

It is often useful to view $\mathbb{R}P^n$ as a quotient of the sphere $S^n$. Consider the map $S^n \to \mathbb{R}P^n$ given by $x \mapsto \ell_x$, where $\ell_x$ is the unique line through the origin in $\mathbb{R}^{n+1}$ which contains $x$. Notice that if $\ell_x = \ell_y$ for $x, y \in S^n$, then $x = \pm y$. It is not hard to show that $\mathbb{R}P^n$ is homeomorphic to $S^n/\!\sim$, where $x \sim y$ if and only if $x = \pm y$. We can use this homeomorphism to transfer the differentiable structure to $S^n/\!\sim$ (see Exercise 1.64). We often identify $S^n/\!\sim$ with $\mathbb{R}P^n$.

Exercise 1.42. Show that the overlap maps for $\mathbb{R}P^n$ are indeed smooth.

Example 1.43. In this example we consider a more general way of getting charts for the projective space $\mathbb{R}P^n$. Let $\alpha : \mathbb{R}^n \to \mathbb{R}^{n+1}$ be an affine map whose image does not contain the origin. Thus $\alpha$ has the form $\alpha(x) = Lx + b$, where $L : \mathbb{R}^n \to \mathbb{R}^{n+1}$ is linear and $b \in \mathbb{R}^{n+1}$ is nonzero. Let $\pi : \mathbb{R}^{n+1} \setminus \{0\} \to \mathbb{R}P^n$ be the projection defined above. The composition $\pi \circ \alpha$ can be easily shown to be a homeomorphism onto its image, and we
call this type of map an affine parametrization. The inverses of these maps form charts for a smooth atlas. The charts described in the last example are essentially special cases of these charts and give the same smooth structure.
Notation 1.44 (Homogeneous coordinates). For $(x_1, \dots, x_{n+1}) \in \mathbb{R}^{n+1} \setminus \{0\}$, let $[x_1, \dots, x_{n+1}]$ denote the unique $\ell \in \mathbb{R}P^n$ such that the line $\ell$ contains the point $(x_1, \dots, x_{n+1})$. The numbers $(x_1, \dots, x_{n+1})$ are said to provide homogeneous coordinates for $\ell$ because $[\lambda x_1, \dots, \lambda x_{n+1}] = [x_1, \dots, x_{n+1}]$ for any nonzero $\lambda \in \mathbb{R}$. In terms of homogeneous coordinates, the chart map $\varphi_i : U_i \to \mathbb{R}^n$ is given by
$$\varphi_i([x_1, \dots, x_{n+1}]) = (x_1 x_i^{-1}, \dots, \hat{1}, \dots, x_{n+1} x_i^{-1}),$$
where the caret symbol $\hat{\ }$ means we have omitted the $1$ in the $i$-th slot to get an element of $\mathbb{R}^n$.
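The chart formula in homogeneous coordinates is easy to experiment with. Here is a minimal sketch using exact rational arithmetic; the helper names `chart`, `chart_inverse` and `overlap` are hypothetical, and a line is represented simply by any nonzero tuple lying on it.

```python
from fractions import Fraction


def chart(i, x):
    """phi_i([x_1, ..., x_{n+1}]) with 1-based index i; requires x_i != 0."""
    xi = Fraction(x[i - 1])
    if xi == 0:
        raise ValueError("the line is not in U_i")
    scaled = [Fraction(c) / xi for c in x]
    # Slot i now holds 1; omit it to land in R^n.
    return tuple(scaled[: i - 1] + scaled[i:])


def chart_inverse(i, u):
    """A representative of the line [u_1, ..., u_{i-1}, 1, u_i, ..., u_n]."""
    return tuple(u[: i - 1]) + (Fraction(1),) + tuple(u[i - 1:])


def overlap(i, j, u):
    """The overlap map phi_i o phi_j^{-1}, defined where slot i is nonzero."""
    return chart(i, chart_inverse(j, u))


# Rescaling the representative does not change the chart value.
same_line = chart(2, (2, 4, 6)) == chart(2, (-1, -2, -3))
```

For instance, `overlap(1, 3, (Fraction(2), Fraction(5)))` passes through the representative $(2, 5, 1)$ and divides by the first entry, which makes the rational (hence smooth) nature of the overlap maps visible.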
Example 1.45. By analogy with the real case we can construct the complex projective $n$-space $\mathbb{C}P^n$. As a set, $\mathbb{C}P^n$ is the family of all 1-dimensional complex subspaces of $\mathbb{C}^{n+1}$ (each of these has real dimension 2). In tight analogy with the real case, $\mathbb{C}P^n$ can be given an atlas consisting of charts of the form $(U_i, \varphi_i)$, where
$$U_i = \{\ell \in \mathbb{C}P^n : \ell \text{ is not contained in the complex hyperplane } z^i = 0\}$$
and $\varphi_i(\ell)$ is the unique $(z^1, \dots, z^n)$ such that $(z^1, \dots, z^{i-1}, 1, z^i, \dots, z^n) \in \ell$. Here $\varphi_i : U_i \to \mathbb{C}^n \cong \mathbb{R}^{2n}$ and so $\mathbb{C}P^n$ is a manifold of (real) dimension $2n$. Homogeneous coordinate notation $[z_1, \dots, z_{n+1}]$ is defined as in the real case, but now the multiplier $\lambda$ is complex.
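Exercise 1.46. In reference to the last example, compute the overlap maps $\varphi_i \circ \varphi_j^{-1} : U_i \cap U_j \to \mathbb{C}^n$. For $\mathbb{C}P^1$, show that $U_1 \cap U_2 = \mathbb{C} \setminus \{0\}$, and that $\varphi_2 \circ \varphi_1^{-1}(z) = z^{-1} = \varphi_1 \circ \varphi_2^{-1}(z)$ for $z \in \mathbb{C} \setminus \{0\}$. Show also that if we define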
'P2
0
Notice that with the atlas {(U I ,
2
$$f(x) = \begin{cases} [z(x), 1] & \text{if } x_3 \neq -1, \\ [1, w(x)] & \text{if } x_3 \neq 1. \end{cases}$$
- 1:;3
Figure 1.3. Double pendulum
Using the fact that $1 - x_3^2 = x_1^2 + x_2^2$, one finds that for $x = (x_1, x_2, x_3) \in S^2$ with $-1 < x_3 < 1$, we have $w(x) = z(x)^{-1}$. It follows that for such $x$, we have $[z(x), 1] = [1, w(x)]$. This means that $f$ is well-defined.
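The identity $w(x) = z(x)^{-1}$ can be checked numerically once explicit formulas are fixed. The formulas below are an assumption made only for this sketch: we take $z(x) = (x_1 + i x_2)/(1 + x_3)$ and $w(x) = (x_1 - i x_2)/(1 - x_3)$, which are defined on the sets $x_3 \neq -1$ and $x_3 \neq 1$ respectively; the text's own definitions may differ. The helper `sphere_point` is likewise hypothetical.

```python
import math


def z(x):
    """Assumed formula (x1 + i x2) / (1 + x3); requires x3 != -1."""
    return complex(x[0], x[1]) / (1.0 + x[2])


def w(x):
    """Assumed formula (x1 - i x2) / (1 - x3); requires x3 != 1."""
    return complex(x[0], -x[1]) / (1.0 - x[2])


def sphere_point(polar, azimuth):
    """A point of the unit sphere S^2 from spherical angles."""
    return (
        math.sin(polar) * math.cos(azimuth),
        math.sin(polar) * math.sin(azimuth),
        math.cos(polar),
    )


x = sphere_point(1.1, 2.3)   # a point with -1 < x3 < 1
product = z(x) * w(x)        # equals (x1^2 + x2^2) / (1 - x3^2)
```

Under these assumed formulas the product is $(x_1^2 + x_2^2)/(1 - x_3^2)$, so the relation $1 - x_3^2 = x_1^2 + x_2^2$ is exactly what makes it equal to $1$.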
Exercise 1.47. Show that the map $f : S^2 \to \mathbb{C}P^1$ defined above is a diffeomorphism.
Exercise 1.48. Show that $\mathbb{R}P^1$ is diffeomorphic to $S^1$.

Example 1.49. The set of all $m \times n$ real matrices $M_{m \times n}(\mathbb{R})$ is an $mn$-manifold. We only need one chart since it is clear that $M_{m \times n}(\mathbb{R})$ is in one-to-one correspondence with $\mathbb{R}^{mn}$ by the map $[a_{ij}] \mapsto (a_{11}, a_{12}, \dots, a_{mn})$. Also, the set of all nonsingular matrices $GL(n, \mathbb{R})$ is an open submanifold of $M_{n \times n} \cong \mathbb{R}^{n^2}$.
If we have two manifolds $M_1$ and $M_2$ of dimensions $n_1$ and $n_2$ respectively, we can form the topological Cartesian product $M_1 \times M_2$. We may give $M_1 \times M_2$ a differentiable structure in the following way: Let $\mathcal{A}_{M_1}$ and $\mathcal{A}_{M_2}$ be atlases for $M_1$ and $M_2$. Take as charts on $M_1 \times M_2$ the maps of the form
$$x \times y : U \times V \to \mathbb{R}^{n_1} \times \mathbb{R}^{n_2},$$
where $(U, x)$ is a chart from $\mathcal{A}_{M_1}$ and $(V, y)$ a chart from $\mathcal{A}_{M_2}$. This gives $M_1 \times M_2$ an atlas called the product atlas, which induces a maximal atlas and hence a differentiable structure. With this product differentiable structure, $M_1 \times M_2$ is called a product manifold. The product of several manifolds is also possible by an obvious iteration. The topology induced by the product atlas is the product topology, and so the underlying topological manifold is the product topological manifold discussed earlier.

Example 1.50. The circle $S^1$ is a 1-manifold, and hence the product $T^2 = S^1 \times S^1$, which is a torus, is a 2-manifold. The set of all configurations of a double pendulum constrained to a plane and where the arms are free to swing past each other can be taken to be modeled by $T^2 = S^1 \times S^1$. See Figure 1.3.
Example 1.51. For any smooth manifold $M$ we can construct the "cylinder" $M \times I$, where $I = (a, b)$ is some open interval in $\mathbb{R}$.
We now discuss an interesting class of examples. Let $G(k, n)$ denote the set of $k$-dimensional subspaces of $\mathbb{R}^n$. We will exhibit a natural differentiable structure on this set. The idea is the following: An alternative way of defining the points of projective space is as equivalence classes of $n$-tuples $(v_1, \dots, v_n) \in \mathbb{R}^n \setminus \{0\}$, where $(v_1, \dots, v_n) \sim (\lambda v_1, \dots, \lambda v_n)$ for any nonzero $\lambda$. This is clearly just a way of specifying a line through the origin. Generalizing, we shall represent a $k$-plane as an $n \times k$ matrix whose column vectors span the $k$-plane. Thus we are putting an equivalence relation on the set of $n \times k$ matrices where $A \sim Ag$ for any nonsingular $k \times k$ matrix $g$. Let $M^{\mathrm{full}}_{n \times k}$ be the set of $n \times k$ matrices with rank $k < n$ (full rank). Two matrices from $M^{\mathrm{full}}_{n \times k}$ are equivalent exactly if their columns span the same $k$-dimensional subspace. Thus the set $G(k, n) := M^{\mathrm{full}}_{n \times k}/\!\sim$ of equivalence classes is in one-to-one correspondence with the set of $k$-dimensional subspaces of $\mathbb{R}^n$.

Let $U$ be the set of all $[A] \in G(k, n)$ such that any representative $A$ has its first $k$ rows linearly independent. This property is independent of the representative $A$ of the equivalence class $[A]$, and so $U$ is a well-defined set. Now every element $[A] \in U \subset G(k, n)$ is an equivalence class that has a unique member $A_0$ of the form
$$A_0 = \begin{pmatrix} I_{k \times k} \\ Z \end{pmatrix},$$
which is obtained by Gaussian column reduction. Thus we have a map on $U$ defined by $\Psi : [A] \mapsto Z \in M_{(n-k) \times k} \cong \mathbb{R}^{k(n-k)}$. We wish to cover $G(k, n)$ with sets similar to $U$ and define similar maps. Consider the set $U_{i_1 \dots i_k}$ of all $[A] \in G(k, n)$ such that any representative $A$ has the property that the $k$ rows indexed by $i_1, \dots, i_k$ are linearly independent. The permutation that puts the $k$ rows indexed by $i_1, \dots, i_k$ into the positions $1, \dots, k$ without changing the relative order of the remaining rows induces an obvious bijection $\sigma_{i_1 \dots i_k}$ from $U_{i_1 \dots i_k}$ onto $U = U_{1 \dots k}$. We now have maps $\Psi_{i_1 \dots i_k} : U_{i_1 \dots i_k} \to M_{(n-k) \times k} \cong \mathbb{R}^{k(n-k)}$ given by composition $\Psi_{i_1 \dots i_k} := \Psi \circ \sigma_{i_1 \dots i_k}$. These maps form an atlas $\{(U_{i_1 \dots i_k}, \Psi_{i_1 \dots i_k})\}$ for $G(k, n)$ that gives it the structure of a smooth manifold called the Grassmann manifold of real $k$-planes in $\mathbb{R}^n$. The topology induced by the atlas is the same as the quotient topology, and one can check that this topology is Hausdorff and paracompact.

We have defined $C^r$ manifolds for $0 \le r \le \infty$, with $r$ an integer or $\infty$. We can also define $C^\omega$ manifolds (analytic manifolds) by requiring that the charts are related by analytic maps. This means that the overlap maps have component functions that may be expressed as convergent power series in
a neighborhood of any point in their domains. For convenience, we agree to take $\infty < \omega$.
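Returning to the Grassmann chart $\Psi$ on $U$ described above: it can be computed directly, since right-multiplying $A$ by the inverse of its top $k \times k$ block is exactly the column reduction that produces $A_0$. The sketch below is restricted to $k = 2$ to keep the matrix inverse elementary; that restriction and the helper names `mat_mul`, `inverse_2x2` and `grassmann_chart` are assumptions made for illustration.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def inverse_2x2(g):
    """Inverse of a nonsingular 2 x 2 matrix, in exact arithmetic."""
    (a, b), (c, d) = g
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def grassmann_chart(A):
    """Psi([A]) for an n x 2 matrix A whose first two rows are independent.

    A times the inverse of its top 2 x 2 block has the identity on top;
    the chart value Z is the remaining (n - 2) x 2 block.
    """
    top, bottom = A[:2], A[2:]
    return mat_mul(bottom, inverse_2x2(top))


A = [[1, 2], [3, 4], [5, 6], [7, 8]]   # columns span a 2-plane in R^4
g = [[2, 1], [1, 1]]                    # a nonsingular change of basis
Z = grassmann_chart(A)
Z_after = grassmann_chart(mat_mul(A, g))  # another representative of [A]
```

That `Z_after` agrees with `Z` reflects the fact that $\Psi$ is well defined on equivalence classes: replacing $A$ by $Ag$ replaces both blocks by their products with $g$, and $g$ cancels.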
1.4. Smooth Maps and Diffeomorphisms

Definition 1.52. Let $M$ and $N$ be smooth manifolds with corresponding maximal atlases $\mathcal{A}_M$ and $\mathcal{A}_N$. We say that a map $f : M \to N$ is of class $C^r$ (or $r$-times continuously differentiable) at $p \in M$ if there exists a chart $(V, y)$ from $\mathcal{A}_N$ with $f(p) \in V$, and a chart $(U, x)$ from $\mathcal{A}_M$ with $p \in U$, such that $f(U) \subset V$ and such that $y \circ f \circ x^{-1}$ is of class $C^r$. If $f$ is of class $C^r$ at every point $p \in M$, then we say that $f$ is of class $C^r$ (or that $f$ is a $C^r$ map). Maps of class $C^\infty$ are called smooth maps.

Exercise 1.53. Show that a $C^r$ map is continuous. [Hint: Consider compositions $y^{-1} \circ (y \circ f \circ x^{-1}) \circ x$.]

Exercise 1.54. Show that a composition of $C^r$ maps is a $C^r$ map.

The family of $C^r$ manifolds together with the family of $C^r$ maps determines a category called the $C^r$ category (see Appendix A). The $C^\infty$ category is called the smooth category. Smooth structures are often tailor made so that certain maps are smooth. For example, given smooth manifolds $M_1$ and $M_2$, the smooth structure on a product manifold $M_1 \times M_2$ is designed to make the projection maps onto $M_1$ and $M_2$ smooth.

Even when dealing with smooth manifolds, we may still be interested in maps which are only of class $C^r$ for some $r < \infty$. This is especially so when one wants to do analysis on smooth manifolds. In fact, one could define what it means for a map to be Lebesgue measurable in a similar way. It is obvious from the way we have formulated the definition that the property of being of class $C^r$ is a local property.

Let $f : M \to N$ be a map and suppose that $(U, x)$ and $(V, y)$ are admissible charts for $M$ and $N$ respectively. If $f^{-1}(V) \cap U$ is not empty, then we have a composition
$$y \circ f \circ x^{-1} : x(f^{-1}(V) \cap U) \to y(V).$$
Maps of this form are called the local representative maps for $f$. Notice that if $f$ is continuous, then $f^{-1}(V) \cap U$ is open. Definition 1.52 does not start out with the assumption that $f$ is continuous, but is constructed carefully so as to imply that a function that is of class $C^r$ (at a point) according to the definition is automatically continuous (at the point). But if $f$ is known to be continuous, then we may check $C^r$ differentiability using representative maps with respect to atlases that are not necessarily maximal:

Proposition 1.55. Let $\{(U_\alpha, x_\alpha)\}_{\alpha \in A}$ and $\{(V_\beta, y_\beta)\}_{\beta \in B}$ be (not necessarily maximal) $C^r$ atlases for $M$ and $N$ respectively. A continuous map $f : M \to N$ is of class $C^r$ if for each $\alpha$ and $\beta$, the representative map $y_\beta \circ f \circ x_\alpha^{-1}$ is $C^r$ on its domain $x_\alpha(f^{-1}(V_\beta) \cap U_\alpha)$.
Proof. Suppose that a continuous $f$ is given and that all the representative maps $y_\beta \circ f \circ x_\alpha^{-1}$ are $C^r$. Let $p \in M$ and choose $(U_\alpha, x_\alpha)$ and $(V_\beta, y_\beta)$ with $p \in U_\alpha$ and $f(p) \in V_\beta$. Letting $U := f^{-1}(V_\beta) \cap U_\alpha$, we have a chart $(U, x_\alpha|_U)$ with $p \in U$ and $f(U) \subset V_\beta$ such that $y_\beta \circ f \circ (x_\alpha|_U)^{-1}$ is $C^r$. Thus $f$ is $C^r$ at $p$ by definition. Since $p$ was arbitrary, we see that $f$ is of class $C^r$. $\square$
If $f$ is continuous, then the condition that $f$ be $C^r$ at $p \in M$ for $r > 0$ can be seen to be equivalent to the condition that for some (and hence every) choice of charts $(U, x)$ from $\mathcal{A}_M$ and $(V, y)$ from $\mathcal{A}_N$ such that $p \in U$ and $f(p) \in V$, the map
$$y \circ f \circ x^{-1} : x(f^{-1}(V) \cap U) \to y(V)$$
is $C^r$. Note the use of the phrase "and hence every" above. The point is that if we choose another pair of charts $(U', x')$ and $(V', y')$ with $p \in U'$ and $f(p) \in V'$, then $y' \circ f \circ x'^{-1}$ must be $C^r$ on some open neighborhood of $x'(p)$ if and only if $y \circ f \circ x^{-1}$ is $C^r$ on some neighborhood$^4$ of $x(p)$. This is true because the overlap maps $x' \circ x^{-1}$ and $y' \circ y^{-1}$ are $C^r$ diffeomorphisms (the chain rule is at work here of course). Without worrying about domains, the point is that
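$$\begin{aligned}
y' \circ f \circ x'^{-1} &= y' \circ (y^{-1} \circ y) \circ f \circ (x^{-1} \circ x) \circ x'^{-1} \\
&= (y' \circ y^{-1}) \circ (y \circ f \circ x^{-1}) \circ (x \circ x'^{-1}) \\
&= (y' \circ y^{-1}) \circ (y \circ f \circ x^{-1}) \circ (x' \circ x^{-1})^{-1}.
\end{aligned}$$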
Now the reader should be able to see quite clearly why we required overlap maps to be diffeomorphisms. A representative map $\bar{f} = y \circ f \circ x^{-1}$ is defined on an open subset of $\mathbb{R}^n$ where $n = \dim(M)$. If $\dim(N) = k$, then $\bar{f} = (\bar{f}^1, \ldots, \bar{f}^k)$ and each $\bar{f}^i$ is a function of $n$ variables. If we denote generic points in $\mathbb{R}^n$ as $(u^1, \ldots, u^n)$, and those in $\mathbb{R}^k$ as $(v^1, \ldots, v^k)$, then we may write $v^i = \bar{f}^i(u^1, \ldots, u^n)$, $1 \le i \le k$. It is also common and sometimes psychologically helpful to simply write $y^i = \bar{f}^i(x^1, \ldots, x^n)$. The bars over the $f$'s are also sometimes dropped. Another common way to indicate $y \circ f \circ x^{-1}$ is with the notation $f_{VU}$, which is very suggestive and tempting, but it has a slight logical defect since there may be many charts with domain $U$ and many charts with domain $V$.

Exercise 1.56. Consider the map $\pi : S^2 \to \mathbb{R}P^2$ given by taking the point $(x, y, z)$ to the line through this point and the origin. Using an atlas on each of these manifolds such as the atlases introduced previously, show that $\pi$ is smooth.

⁴ Recall that our convention is that a neighborhood is assumed to be open unless indicated.
(At least check one of the representative maps with respect to a chart on $S^2$ and a chart on $\mathbb{R}P^2$.)
As a special case of the above, we note that a function $f : M \to \mathbb{R}$ (resp. $\mathbb{C}$) is $C^r$ differentiable at $p \in M$ if and only if it is continuous and $f \circ x^{-1} : x(U) \to \mathbb{R}$ (resp. $\mathbb{C}$) is $C^r$-differentiable for some admissible chart $(U, x)$ with $p \in U$. And, $f$ is of class $C^r$ if it is of class $C^r$ at every $p$. The set of all $C^r$ maps $M \to N$ is denoted $C^r(M, N)$ and $C^r(M, \mathbb{R})$ is abbreviated to $C^r(M)$. Both $C^r(M, \mathbb{R})$ and $C^r(M, \mathbb{C})$ are rings and also algebras over the respective fields $\mathbb{R}$ and $\mathbb{C}$ (Definition in Appendix D). The addition, scaling, and multiplication are defined pointwise so that $(f + g)(p) := f(p) + g(p)$, etc.
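Definition 1.57. Let $(U, x)$ be a chart on an $n$-manifold $M$ with $p \in U$. We write $x = (x^1, \ldots, x^n)$ as usual. For $f \in C^1(M)$, define a function $\partial f / \partial x^i$ on $U$ by
$$\frac{\partial f}{\partial x^i}(p) := \lim_{h \to 0} \frac{f \circ x^{-1}(a^1, \ldots, a^i + h, \ldots, a^n) - f \circ x^{-1}(a^1, \ldots, a^n)}{h},$$
where $x(p) = (a^1, \ldots, a^n)$. In other words,
$$\frac{\partial f}{\partial x^i}(p) := \partial_i (f \circ x^{-1})(x(p)) = \frac{\partial (f \circ x^{-1})}{\partial u^i}(x(p)),$$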
where $(u^1, \ldots, u^n)$ denotes the standard coordinates on $\mathbb{R}^n$. Recall that $D_i$ is the notation for the $i$-th partial derivative with respect to the decomposition $\mathbb{R}^n = \mathbb{R} \times \cdots \times \mathbb{R}$. Thus if $g : U \subset \mathbb{R}^n \to \mathbb{R}$ is differentiable at $a \in \mathbb{R}^n$, then $D_i g(a) : \mathbb{R} \to \mathbb{R}$ is a linear map but is often identified with the single entry $\partial_i g(a)$ of the $1 \times 1$ matrix that represents it with respect to the standard basis on $\mathbb{R}$. Thus one sometimes sees the definition above written as
$$\frac{\partial f}{\partial x^i}(p) := D_i (f \circ x^{-1})(x(p)).$$
If $f$ is a $C^r$ function, then $\partial f / \partial x^i$ is clearly $C^{r-1}$. Notice also that $f$ really only needs to be defined in a neighborhood of $p$ and differentiable at $p$ for the expression $\frac{\partial f}{\partial x^i}(p)$ to make sense. This definition makes precise the notation that is often encountered in calculus courses. For example, if $T$ is the "temperature" on a sphere $S^2$, then $T$ takes as arguments points $p$ on $S^2$. On the other hand, using spherical coordinates, we often consider $\partial T / \partial \varphi$ and $\partial T / \partial \theta$ as being defined on $S^2$ rather than on some open set in a "$\varphi, \theta$-space".
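The definition of a coordinate partial derivative can be tried out numerically. The following is a minimal sketch, not part of the text: it assumes the polar chart $x = (r, \theta)$ on an open subset of the plane, a sample function $f(u, v) = u^2 v$, and a symmetric difference quotient in place of the limit. The names `chart_inverse`, `f`, and `coordinate_partial` are illustrative choices.

```python
import math

def chart_inverse(r, theta):
    """x^{-1} for the polar chart x = (r, theta): returns the point of the plane."""
    return (r * math.cos(theta), r * math.sin(theta))

def f(point):
    """A sample smooth function, given on points p = (u, v) of the manifold."""
    u, v = point
    return u * u * v

def coordinate_partial(func, inverse, coords, i, h=1e-6):
    """Approximate (d func / d x^i)(p), where coords = x(p).

    Uses a symmetric difference quotient of func o x^{-1} in the i-th slot,
    which approximates the limit appearing in the definition.
    """
    plus = list(coords)
    minus = list(coords)
    plus[i] += h
    minus[i] -= h
    return (func(inverse(*plus)) - func(inverse(*minus))) / (2 * h)

a = (2.0, math.pi / 6)                 # x(p) = (r, theta)
df_dr = coordinate_partial(f, chart_inverse, a, 0)
df_dtheta = coordinate_partial(f, chart_inverse, a, 1)
```

Here $f \circ x^{-1}(r, \theta) = r^3 \cos^2\theta \sin\theta$, so the two values can be compared against $3r^2\cos^2\theta\sin\theta$ and $r^3(\cos^3\theta - 2\cos\theta\sin^2\theta)$ evaluated at $(2, \pi/6)$.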
Finally, notice that if $f$ and $g$ are $C^1$ and defined at least on the domain of the chart $(U, x)$, then we easily obtain that on $U$
$$\frac{\partial (af + bg)}{\partial x^i} = a \frac{\partial f}{\partial x^i} + b \frac{\partial g}{\partial x^i} \quad \text{for any } a, b \in \mathbb{R},$$
and
$$\frac{\partial (fg)}{\partial x^i} = f \frac{\partial g}{\partial x^i} + g \frac{\partial f}{\partial x^i} \quad \text{(the product rule)}.$$
Let $(U, x)$ and $(V, y)$ be charts on an $n$-manifold with $p \in U \cap V$. Then it is easy to check using the usual chain rule and the definitions above that for any smooth function $f$ defined at least on a neighborhood of $p$ we have the following version of the chain rule:
$$\frac{\partial f}{\partial y^i}(p) = \sum_{j=1}^{n} \frac{\partial f}{\partial x^j}(p) \frac{\partial x^j}{\partial y^i}(p).$$
A map $f$ which is defined only on some proper open subset of a manifold is said to be $C^r$ if it is $C^r$ as a map of the corresponding open submanifold, but this is again just to say that it is $C^r$ at each point in the open set. We shall often need to consider maps that are defined on subsets $S \subset M$ that are not necessarily open.

Definition 1.58. Let $S$ be an arbitrary subset of a smooth manifold $M$. Let $f : S \to N$ be a continuous map where $N$ is a smooth manifold. The map $f$ is said to be $C^r$ if for every $s \in S$ there is an open set $O \subset M$ containing $s$ and a map $\tilde{f}$ that is $C^r$ on $O$ and such that $\tilde{f}|_{S \cap O} = f|_{S \cap O}$.
In a later exercise we ask the reader to show that a function $f$ with domain $S$ is smooth if and only if it has a smooth extension to some open set containing all of $S$. In particular, a curve defined on a closed interval $[a, b]$ is smooth if it has a smooth extension to an open interval containing $[a, b]$.

We already have the notion of a diffeomorphism between open sets of some Euclidean space $\mathbb{R}^n$. We are now in a position to extend this notion to the realm of smooth manifolds.

Definition 1.59. Let $M$ and $N$ be smooth (or $C^r$) manifolds. A homeomorphism $f : M \to N$ such that $f$ and $f^{-1}$ are $C^r$ differentiable with $r \ge 1$ is called a $C^r$ diffeomorphism. In the case $r = \infty$, we shorten $C^\infty$ diffeomorphism to just diffeomorphism. The set of all $C^r$ diffeomorphisms of a manifold $M$ onto itself is a group under the operation of composition. This group is denoted $\mathrm{Diff}^r(M)$. In the case $r = \infty$, we simply write $\mathrm{Diff}(M)$ and refer to it as the diffeomorphism group of $M$.
We will use the convention that $\mathrm{Diff}^0(M)$ denotes the group of homeomorphisms of $M$ onto itself. Also, it should be pointed out that if we refer to a map between open subsets of manifolds as being a $C^r$ diffeomorphism, we mean that the map is a $C^r$ diffeomorphism of the corresponding open submanifolds.
Example 1.60. The map $r_\theta : S^2 \to S^2$ given by
$$r_\theta(x, y, z) = (x \cos\theta - y \sin\theta,\; x \sin\theta + y \cos\theta,\; z)$$
for $x^2 + y^2 + z^2 = 1$ is a diffeomorphism.
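A quick numerical spot check of this example is sketched below. It is only an illustration under stated assumptions: the function names `rotate` and `norm_sq`, the angle, and the sample point are hypothetical choices, and checking a few values does not prove smoothness. What it does show is that $r_\theta$ keeps a point on the sphere and that $r_{-\theta}$ undoes $r_\theta$, which is the inverse one would use to verify the diffeomorphism property.

```python
import math

def rotate(theta, point):
    """The map r_theta: rotation about the z-axis by the angle theta."""
    x, y, z = point
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta),
            z)

def norm_sq(point):
    """Squared Euclidean norm, equal to 1 exactly on the unit sphere."""
    return sum(c * c for c in point)

theta = 0.7
p = (0.6, 0.0, 0.8)            # a point with x^2 + y^2 + z^2 = 1
q = rotate(theta, p)           # image point, should again lie on the sphere
back = rotate(-theta, q)       # applying r_{-theta} should return p
```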
Exercise 1.61. Let $0 < \theta < 2\pi$. Consider the map $f_\theta : S^2 \to S^2$ given by
$$f_\theta(x, y, z) = \big(x \cos((1 - z^2)\theta) - y \sin((1 - z^2)\theta),\; x \sin((1 - z^2)\theta) + y \cos((1 - z^2)\theta),\; z\big).$$
Is this map a diffeomorphism? Try to picture this map.

Definition 1.62. $C^r$ manifolds $M$ and $N$ will be called ($C^r$) diffeomorphic and then said to be in the same diffeomorphism class if and only if there is a $C^r$ diffeomorphism $f : M \to N$.

Exercise 1.63. Let $M$ and $N$ be smooth manifolds with respective maximal atlases $\mathcal{A}_M$ and $\mathcal{A}_N$. Show that a bijection $f : M \to N$ is a diffeomorphism if and only if the following condition holds:
$$(U, y) \in \mathcal{A}_N \text{ if and only if } (f^{-1}(U), y \circ f) \in \mathcal{A}_M.$$
Exercise 1.64. Show that if $M$ is a $C^r$ manifold and $\phi : M \to X$ is any bijection, then there is a unique $C^r$ structure on $X$ such that $\phi$ is a diffeomorphism. This process is called a transfer of structure. If $X$ is a topological space and $\phi$ is a homeomorphism, then the topology induced by the transferred structure is the original topology.

For another example, consider the famous Cantor set $C \subset [0, 1] \subset \mathbb{R}$. Consider $\mathbb{R}$ as a coordinate axis subspace of $\mathbb{R}^2$ and set $M_C := \mathbb{R}^2 \setminus C$. It can be shown that $M_C$ is diffeomorphic to a surface suggested by Figure 1.4. Once again we see that open sets in a Euclidean space can have interesting differential topology.

Figure 1.4. Interesting surface

In the definition of diffeomorphism, we have suppressed explicit reference to the maximal atlases, but note that whether or not a map is differentiable ($C^r$ or smooth) essentially involves the choice of differentiable structures on the manifolds. Recall that we can put more than one differentiable structure on $\mathbb{R}$ by using the function $x^3$ as a chart. This generalizes in the obvious way: The map $E : (x^1, x^2, \ldots, x^n) \mapsto ((x^1)^3, x^2, \ldots, x^n)$ is a chart for $\mathbb{R}^n$, but is not $C^\infty$-related with the standard (identity) chart. It is globally defined and so provides an atlas that induces the usual topology again, but the resulting maximal atlas is different! Thus we seem to have two smooth manifolds $(\mathbb{R}^n, \mathcal{A}_1)$ and $(\mathbb{R}^n, \mathcal{A}_2)$ both with the same underlying topological space. Indeed, this is true. Technically, they are different. But they are equivalent and therefore the same in another sense. Namely, they are diffeomorphic via the map $E$. So it may be that the same underlying topological space $M$ carries two different differentiable structures, and so we really have two differentiable manifolds with the same underlying set. It remains to ask whether they are nevertheless diffeomorphic. It is an interesting question whether a given topological manifold can carry differentiable structures that are not diffeomorphic. It has been shown that there are 28 pairwise nondiffeomorphic smooth structures on the topological space $S^7$ and more than 16 million on $S^{31}$. Each $\mathbb{R}^k$ for $k \ne 4$ has only one diffeomorphism class compatible with the usual topology. On the other hand, it is a deep result that there exist infinitely many truly different (nondiffeomorphic) differentiable structures on $\mathbb{R}^4$. The existence of exotic differentiable structures on $\mathbb{R}^4$ follows from the results of [Donaldson] and [Freedman]. The reader ought to be wondering what is so special about dimension four. Note that when we mention $\mathbb{R}^4$, $S^7$, $S^{31}$, etc. as smooth manifolds, we shall normally assume the usual smooth structures unless otherwise indicated.

Definition 1.65. Let $N$ and $M$ be smooth manifolds of the same dimension. A map $f : M \to N$ is called a local diffeomorphism if and only if every point $p \in M$ is contained in an open subset $U \subset M$ such that $f|_U : U \to f(U)$ is a diffeomorphism onto an open subset of $N$. For $C^r$ manifolds, a $C^r$ local diffeomorphism is defined similarly.

Example 1.66. The map $\pi : S^2 \to \mathbb{R}P^2$ given by taking the point $(x, y, z)$ to the line through this point and the origin is a local diffeomorphism, but is not a diffeomorphism since it is 2-1 rather than 1-1.

Example 1.67. The map $(x, y) \mapsto (x/z(x, y),\; y/z(x, y))$, where $z(x, y) = \sqrt{1 - x^2 - y^2}$,
is a diffeomorphism from the open disk $B(0, 1) = \{(x, y) : x^2 + y^2 < 1\}$ onto the whole plane. Thus $B(0, 1)$ and $\mathbb{R}^2$ are diffeomorphic and in this sense are the "same" differentiable manifold.

Sometimes it is only important how maps behave near a certain point. Let $M$ and $N$ be smooth manifolds and consider the set $S(p, M, N)$ of all smooth maps into $N$ which are defined on some open neighborhood of the fixed point $p \in M$. Thus,
$$S(p, M, N) := \bigcup_{U \in \mathcal{N}_p} C^\infty(U, N),$$
where $\mathcal{N}_p$ denotes the set of all open neighborhoods of $p \in M$. On this set we define the equivalence relation where $f$ and $g$ are equivalent at $p$ if and only if they agree on a neighborhood of $p$. The equivalence class of $f$ is denoted $[f]$, or by $[f]_p$ if the point in question needs to be made clear. The set of equivalence classes $S(p, M, N)/\!\sim$ is denoted $C_p^\infty(M, N)$.
Definition 1.68. Elements of $C_p^\infty(M, N)$ are called germs, and if $f$ and $g$ are in the same equivalence class, we write $f \sim_p g$ and we say that $f$ and $g$ have the same germ at $p$.
The value of a germ at $p$ is well-defined by $[f](p) = f(p)$. Taking $N = \mathbb{R}$ we see that $C_p^\infty(M, \mathbb{R})$ is a commutative $\mathbb{R}$-algebra if we make the definitions
$$a[f] + b[g] := [af + bg] \quad \text{for } a, b \in \mathbb{R},$$
$$[f][g] := [fg].$$
The $\mathbb{C}$-algebra of complex-valued germs $C_p^\infty(M, \mathbb{C})$ is defined similarly.
1.5. Cut-off Functions and Partitions of Unity

There is a special and extremely useful kind of function called a bump function or cut-off function, which we now take the opportunity to introduce. Recall that given a topological space $X$, the support, $\mathrm{supp}(f)$, of a function $f : X \to \mathbb{R}$ is the closure of the subset on which $f$ takes nonzero values. The same definition applies for vector space-valued functions $f : X \to V$. It is a standard fact that there exist smooth functions defined on $\mathbb{R}$ that have compact support. For example, we have the smooth function $\Psi : \mathbb{R} \to \mathbb{R}$ defined by
$$\Psi(t) = \begin{cases} e^{-1/(1 - t^2)} & \text{for } |t| < 1, \\ 0 & \text{otherwise.} \end{cases}$$
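As a small illustration, the function $\Psi$ can be evaluated directly. The sketch below is only a transcription of the formula into code (the name `psi` is an illustrative choice); it shows that the values vanish outside $(-1, 1)$ and are positive inside, but of course it says nothing about smoothness at $t = \pm 1$, which is the calculus fact being quoted.

```python
import math

def psi(t):
    """The bump function: exp(-1/(1 - t^2)) for |t| < 1, and 0 otherwise."""
    if abs(t) < 1:
        return math.exp(-1.0 / (1.0 - t * t))
    return 0.0

# Sample values on a grid; the support of psi is the interval [-1, 1].
values = [(k / 4, psi(k / 4)) for k in range(-8, 9)]
```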
Lemma 1.69 (Existence of cut-off functions). Let $M$ be a smooth manifold. Let $K$ be a compact subset of $M$ and $O$ an open set containing $K$. There exists a smooth function $\beta$ on $M$ that is identically equal to 1 on $K$, takes values in the interval $[0, 1]$, and has compact support in $O$.
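Proof. Special case 1: Assume that $M = \mathbb{R}^n$ and that $O = B(0, R)$ and $K = \overline{B}(0, r)$ for $0 < r < R$. In this case we may take
$$\varphi(x) = \frac{\int_{|x|}^{R} g(t)\, dt}{\int_{r}^{R} g(t)\, dt}, \qquad \text{where} \qquad g(t) = \begin{cases} e^{-(t - r)^{-1}} e^{(t - R)^{-1}} & \text{if } r < t < R, \\ 0 & \text{otherwise.} \end{cases}$$
It is an exercise in calculus to show that $g$ is a smooth function and thus that $\varphi$ is smooth.

Special case 2: Assume again that $M = \mathbb{R}^n$. Let $K \subset O$ be as in the hypotheses. For each point $p \in K$ let $U_p$ be an open ball centered at $p$ and contained in $O$. Let $K_p$ be the closed ball centered at $p$ of half the radius of $U_p$. The interiors of the $K_p$'s form an open cover for $K$ and so by compactness we can reduce to a finite subcover. Thus we have a finite family $\{K_i\}$ of closed balls of various radii such that $K \subset \bigcup K_i$, and with corresponding concentric open balls $U_i \subset O$. For each $U_i$, let $\varphi_i$ be the corresponding function provided by Special case 1, and put
$$\beta(x) = 1 - \prod_i (1 - \varphi_i(x)).$$

General case: From the second special case above it is clear that we have the result if $K$ is contained in the domain $U$ of a chart $(U, x)$. If $K$ is not contained in such a chart, then we may take a finite number of charts $(U_1, x_1), \ldots, (U_k, x_k)$ and compact sets $K_1, \ldots, K_k$ with $K \subset \bigcup_{i=1}^{k} K_i$, $K_i \subset U_i$, and $\bigcup U_i \subset O$. Now let
$$\beta = 1 - \prod_{i=1}^{k} (1 - \varphi_i),$$
where each $\varphi_i$ is a smooth function that is identically equal to 1 on $K_i$ and has compact support in $U_i$, as provided by the previous case. □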
Let $[f] \in C_p^\infty(M, \mathbb{R})$ (or $\in C_p^\infty(M, \mathbb{C})$) and let $f$ be a representative of the equivalence class $[f]$. We can find an open set $U$ containing $p$ such that $\overline{U}$ is compact and contained in the domain of $f$. If $\beta$ is a cut-off function that is identically equal to 1 on $\overline{U}$, and has support inside the domain of $f$, then $\beta f$ is smooth and it can be extended to a globally defined smooth function that is zero outside of the domain of $f$. Denote this extended function by $(\beta f)_{\mathrm{ext}}$. Then $(\beta f)_{\mathrm{ext}} \in [f]$ (usually, the extended function is just written
as $\beta f$). Thus every element of $C_p^\infty(M, \mathbb{R})$ has a representative in $C^\infty(M, \mathbb{R})$. In short, each germ has a global representative. (The word "global" means defined on, or referring to, the whole manifold.)

A partition of unity is a technical tool that can help one piece together locally defined smooth objects with some desirable properties to obtain a globally defined object that also has the desired properties. For example, we will use this tool to show that on any (paracompact) smooth manifold there exists a Riemannian metric tensor. As we shall see, the metric tensor is the basic object whose existence allows the introduction of notions such as length and volume.
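Definition 1.70. A partition of unity on a smooth manifold $M$ is a collection $\{\varphi_\alpha\}_{\alpha \in A}$ of smooth functions on $M$ such that
(i) $0 \le \varphi_\alpha \le 1$ for all $\alpha$.
(ii) The collection of supports $\{\mathrm{supp}(\varphi_\alpha)\}_{\alpha \in A}$ is locally finite; that is, each point $p$ of $M$ has a neighborhood $W_p$ such that $W_p \cap \mathrm{supp}(\varphi_\alpha) = \emptyset$ for all but a finite number of $\alpha \in A$.
(iii) $\sum_{\alpha \in A} \varphi_\alpha(p) = 1$ for all $p \in M$ (this sum has only finitely many nonzero terms by (ii)).
If $\mathcal{O} = \{O_\alpha\}_{\alpha \in A}$ is an open cover of $M$ and $\mathrm{supp}(\varphi_\alpha) \subset O_\alpha$ for each $\alpha \in A$, then we say that $\{\varphi_\alpha\}_{\alpha \in A}$ is a partition of unity subordinate to $\mathcal{O} = \{O_\alpha\}_{\alpha \in A}$.

Remark 1.71. Let $\mathcal{U} = \{U_\alpha\}_{\alpha \in A}$ be a cover of $M$ and suppose that $\mathcal{W} = \{W_\beta\}_{\beta \in B}$ is a refinement of $\mathcal{U}$. If $\{\psi_\beta\}_{\beta \in B}$ is a partition of unity subordinate to $\mathcal{W}$, then we may obtain a partition of unity $\{\varphi_\alpha\}$ subordinate to $\mathcal{U}$. Indeed, if $f : B \to A$ is such that $W_\beta \subset U_{f(\beta)}$ for every $\beta \in B$, then we may let $\varphi_\alpha := \sum_{\beta \in f^{-1}(\alpha)} \psi_\beta$.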
Our definition of a smooth manifold M includes the requirement that M be paracompact (and Hausdorff). Paracompact Hausdorff spaces are normal spaces, but the following theorem would be true for a normal locally Euclidean space with smooth structure even without the assumption of paracompactness. The reason is that we explicitly assume the local finiteness of the cover. For this reason we put the word "normal" in parentheses as a pedagogical device.
Theorem 1.72. Let $M$ be a (normal) smooth manifold and $\{U_\alpha\}_{\alpha \in A}$ be a locally finite cover of $M$. If each $U_\alpha$ has compact closure, then there is a partition of unity $\{\varphi_\alpha\}_{\alpha \in A}$ subordinate to $\{U_\alpha\}_{\alpha \in A}$.

Proof. We shall use a well-known result about normal spaces sometimes called the "shrinking lemma". Namely, if $\{U_\alpha\}_{\alpha \in A}$ is a locally finite (and
hence "point finite") cover of a normal space $M$, then there exists another cover $\{V_\alpha\}_{\alpha \in A}$ of $M$ such that $\overline{V}_\alpha \subset U_\alpha$. This is Theorem B.4 proved in Appendix B. We do this to our cover and then notice that since each $U_\alpha$ has compact closure, each $\overline{V}_\alpha$ is compact. We apply Lemma 1.69 to obtain nonnegative smooth functions $\psi_\alpha$ such that $\mathrm{supp}\, \psi_\alpha \subset U_\alpha$ and $\psi_\alpha|_{\overline{V}_\alpha} \equiv 1$. Let $\psi := \sum_{\alpha \in A} \psi_\alpha$ and notice that for each $p \in M$, the sum $\sum_{\alpha \in A} \psi_\alpha(p)$ is a finite sum and $\psi(p) > 0$. Let $\varphi_\alpha := \psi_\alpha / \psi$. It is now easy to check that $\{\varphi_\alpha\}_{\alpha \in A}$ is the desired partition of unity. □

If we use the paracompactness assumption, then we can show that there exists a partition of unity that is subordinate to any given cover.
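The normalization step $\varphi_\alpha := \psi_\alpha / \psi$ used in the proof of Theorem 1.72 can be illustrated numerically. The sketch below is a toy version under explicit assumptions: it works on the segment $[0, 3]$ of the real line, uses three hypothetical overlapping intervals as the cover, and uses rescaled copies of the bump function $e^{-1/(1 - s^2)}$ in place of the functions $\psi_\alpha$ supplied by Lemma 1.69 (these bumps are positive on their intervals rather than identically 1 on a shrunken cover, which is enough for $\psi > 0$). The names `bump`, `intervals`, and `partition` are illustrative.

```python
import math

def bump(t, center, radius):
    """Smooth bump: positive on (center - radius, center + radius), zero elsewhere."""
    s = (t - center) / radius
    if abs(s) < 1:
        return math.exp(-1.0 / (1.0 - s * s))
    return 0.0

# Hypothetical cover of [0, 3] by the intervals (-1, 1.5), (0.5, 2.5), (1.5, 4),
# each stored as (center, radius).
intervals = [(0.25, 1.25), (1.5, 1.0), (2.75, 1.25)]

def partition(t):
    """Return the values phi_alpha(t) = psi_alpha(t) / psi(t) for t in [0, 3]."""
    psis = [bump(t, center, radius) for center, radius in intervals]
    total = sum(psis)      # psi(t) > 0 on [0, 3] because the intervals cover it
    return [value / total for value in psis]

samples = [3 * k / 300 for k in range(301)]
sums = [sum(partition(t)) for t in samples]   # each entry should be 1 up to rounding
```

Each $\varphi_\alpha$ computed this way vanishes wherever $\psi_\alpha$ does, so its support stays inside the closure of the corresponding interval.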
Theorem 1.73. Let $M$ be a (paracompact) smooth manifold and $\{U_\alpha\}_{\alpha \in A}$ a cover of $M$. Then there is a partition of unity $\{\varphi_\alpha\}_{\alpha \in A}$ subordinate to $\{U_\alpha\}_{\alpha \in A}$.

Proof. By Remark 1.71 and the fact that $M$ is locally compact we may assume without loss of generality that each $U_\alpha$ has compact closure. Then since $M$ is paracompact, we may find a locally finite refinement of $\{U_\alpha\}_{\alpha \in A}$ which we denote by $\{V_i\}_{i \in I}$. Now use the previous theorem to get a partition of unity subordinate to $\{V_i\}_{i \in I}$. Finally use Remark 1.71 one more time to get a partition of unity subordinate to $\{U_\alpha\}_{\alpha \in A}$. □

Exercise 1.74. Show that if a function is smooth on an arbitrary set $S \subset M$ as defined earlier, then it has a smooth extension to an open set that contains $S$.

Now that we have established the existence of partitions of unity we may show that the analogue of Lemma 1.69 works with $K$ closed but not necessarily compact:

Exercise 1.75. Let $M$ be a smooth manifold. Let $K$ be a closed subset of $M$ and $O$ an open set containing $K$. Show that there exists a smooth function $\beta$ on $M$ that is identically equal to 1 on $K$, takes values in the interval $[0, 1]$, and has support in $O$.
1.6. Coverings and Discrete Groups

1.6.1. Covering spaces and the fundamental group. In this section, and later when we study fiber bundles, many of the results are interesting and true in either the purely topological category or in the smooth category. Let us agree that a $C^0$ manifold is simply a topological manifold. Thus all
relevant maps in this section are to be $C^r$, where if $r = 0$ we just mean continuous and then only require that the spaces be sufficiently nice topological spaces. Also, "$C^0$ diffeomorphism" just means homeomorphism. In the definition of path connectedness and path component given before, we used continuous paths, but it is not hard to show that if two points on a smooth manifold can be connected by a continuous path, then they can be connected by a smooth path. Thus the notion of path component remains unchanged by the use of smooth paths.
Definition 1.76. Let $f_0 : X \to Y$ and $f_1 : X \to Y$ be $C^r$ maps. A $C^r$ homotopy from $f_0$ to $f_1$ is a $C^r$ map $H : X \times [0, 1] \to Y$ such that
$$H(x, 0) = f_0(x) \quad \text{and} \quad H(x, 1) = f_1(x)$$
for all $x$. If there exists such a $C^r$ homotopy, we then say that $f_0$ is $C^r$ homotopic to $f_1$ and write $f_0 \overset{C^r}{\simeq} f_1$. If $A \subset X$ is a closed subset and if $H(a, s) = f_0(a) = f_1(a)$ for all $a \in A$ and all $s \in [0, 1]$, then we say that $f_0$ is $C^r$ homotopic to $f_1$ relative to $A$ and we write $f_0 \overset{C^r}{\simeq} f_1$ (rel $A$). The map $H$ is called a $C^r$ homotopy.
In the above definition, the condition that $H : X \times [0, 1] \to Y$ be $C^r$ for $r > 0$ can be understood by considering $X \times [0, 1]$ as a subset of $X \times \mathbb{R}$. Homotopy is obviously an equivalence relation.

Exercise 1.77. Show that $f_0 : X \to Y$ and $f_1 : X \to Y$ are $C^r$ homotopic (rel $A$) if and only if there exists a $C^r$ map $H : X \times \mathbb{R} \to Y$ such that $H(x, s) = f_0(x)$ for all $x$ and $s \le 0$, $H(x, s) = f_1(x)$ for all $x$ and $s \ge 1$, and $H(a, s) = f_0(a) = f_1(a)$ for all $a \in A$ and all $s$.
At first it may seem that there could be a big difference between $C^\infty$ and $C^0$ homotopies, but if all the spaces involved are smooth manifolds, then the difference is not big at all. In fact, we have the following theorems which we merely state. Proofs may be found in [Lee, John].

Theorem 1.78. If $f : M \to N$ is a continuous map on smooth manifolds, then $f$ is homotopic to a smooth map $f_0 : M \to N$. If the continuous map $f : M \to N$ is smooth on a closed subset $A$, then it can be arranged that $f \overset{C^0}{\simeq} f_0$ (rel $A$).

Theorem 1.79. If $f_0 : M \to N$ and $f_1 : M \to N$ are homotopic smooth maps, then they are smoothly homotopic. If $f_0$ is homotopic to $f_1$ relative to a closed subset $A$, then $f_0$ is smoothly homotopic to $f_1$ relative to $A$.

Because of these last two theorems, we will usually simply write $f \simeq f_0$ instead of $f \overset{C^r}{\simeq} f_0$, the value of $r$ being of little significance in this setting.
Figure 1.5. Coverings of a circle
Definition 1.80. Let $\widetilde{M}$ and $M$ be $C^r$ manifolds. A surjective $C^r$ map $\wp : \widetilde{M} \to M$ is called a $C^r$ covering map if every point $p \in M$ has an open connected neighborhood $U$ such that each connected component $\widetilde{U}_i$ of $\wp^{-1}(U)$ is $C^r$ diffeomorphic to $U$ via the restrictions $\wp|_{\widetilde{U}_i} : \widetilde{U}_i \to U$. In this case, we say that $U$ is evenly covered by $\wp$ (or by the sets $\widetilde{U}_i$). The triple $(\widetilde{M}, \wp, M)$ is called a covering space. We also refer to the space $\widetilde{M}$ (somewhat informally) as a covering space for $M$.

Example 1.81. The map $\mathbb{R} \to S^1$ given by $t \mapsto e^{it}$ is a covering. The set of points $\{e^{it} : \theta - \pi < t < \theta + \pi\}$ is an open set evenly covered by the intervals $I_n$ in the real line given by $I_n := (\theta - \pi + 2\pi n,\; \theta + \pi + 2\pi n)$ for $n \in \mathbb{Z}$.

Exercise 1.82. Explain why the map $(-2\pi, 2\pi) \to S^1$ given by $t \mapsto e^{it}$ is not a covering map.
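Definition 1.83. A continuous map $f$ is said to be proper if $f^{-1}(K)$ is compact whenever $K$ is compact.

Exercise 1.84. Show that a $C^r$ proper map between connected smooth manifolds is a smooth covering map if and only if it is a local $C^r$ diffeomorphism.

The set of all $C^r$ covering spaces is the set of objects of a category. A morphism between $C^r$ covering spaces, say $(\widetilde{M}_1, \wp_1, M_1)$ and $(\widetilde{M}_2, \wp_2, M_2)$, is a pair of $C^r$ maps $(\tilde{f}, f)$ such that the following diagram commutes:
$$\begin{array}{ccc}
\widetilde{M}_1 & \xrightarrow{\ \tilde{f}\ } & \widetilde{M}_2 \\
{\scriptstyle \wp_1} \downarrow & & \downarrow {\scriptstyle \wp_2} \\
M_1 & \xrightarrow{\ f\ } & M_2
\end{array}$$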
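This means that $f \circ \wp_1 = \wp_2 \circ \tilde{f}$. Similarly, the coverings of a fixed space $M$ are the objects of a category where the morphisms are maps $\Phi : \widetilde{M}_1 \to \widetilde{M}_2$ that make the following diagram commute:
$$\begin{array}{ccc}
\widetilde{M}_1 & \xrightarrow{\ \Phi\ } & \widetilde{M}_2 \\
& {\scriptstyle \wp_1} \searrow \quad \swarrow {\scriptstyle \wp_2} & \\
& M &
\end{array}$$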
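meaning that $\wp_1 = \wp_2 \circ \Phi$. [...]

Theorem 1.86. Let $M$ be a $C^r$ manifold with $r > 0$ and suppose that $\wp : \widetilde{M} \to M$ is a $C^0$ covering map with $\widetilde{M}$ paracompact. Then there exists a (unique) $C^r$ structure on $\widetilde{M}$ making $\wp$ a $C^r$ covering map.

Proof. Choose an atlas $\{(U_\alpha, x_\alpha)\}_{\alpha \in A}$ such that each domain $U_\alpha$ is small enough to be evenly covered by $\wp$. Thus we have that $\wp^{-1}(U_\alpha)$ is a disjoint union of open sets $U_\alpha^i$ with each restriction $\wp|_{U_\alpha^i}$ a homeomorphism. We now construct charts on $\widetilde{M}$ using the maps $x_\alpha \circ \wp|_{U_\alpha^i}$ defined on the sets $U_\alpha^i$ (which cover $\widetilde{M}$). The overlap maps are smooth since if $U_\alpha^i \cap U_\beta^j \ne \emptyset$ then
$$\begin{aligned}
\big(x_\alpha \circ \wp|_{U_\alpha^i}\big) \circ \big(x_\beta \circ \wp|_{U_\beta^j}\big)^{-1}
&= x_\alpha \circ \wp|_{U_\alpha^i} \circ \big(\wp|_{U_\beta^j}\big)^{-1} \circ x_\beta^{-1} \\
&= x_\alpha \circ x_\beta^{-1}.
\end{aligned}$$
We leave it to the reader to show that $\widetilde{M}$ is Hausdorff if $M$ is Hausdorff. □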
The following is a special case of Definition 1.76.

Definition 1.87. Let $\alpha : [0, 1] \to M$ and $\beta : [0, 1] \to M$ be two $C^r$ maps (paths) both starting at $p \in M$ and ending at $q$. A $C^r$ fixed endpoint homotopy from $\alpha$ to $\beta$ is a family of $C^r$ maps $H_s : [0, 1] \to M$ parameterized by $s \in [0, 1]$ such that
1) $H : [0, 1] \times [0, 1] \to M$ defined by $H(t, s) := H_s(t)$ is $C^r$;
2) $H_0 = \alpha$ and $H_1 = \beta$;
3) $H_s(0) = p$ and $H_s(1) = q$ for all $s \in [0, 1]$.

Definition 1.88. If there is a $C^r$ homotopy from $\alpha$ to $\beta$, then we say that $\alpha$ is $C^r$ homotopic to $\beta$ and write $\alpha \simeq \beta$ ($C^r$). If $r = 0$, we speak of paths being continuously homotopic.

Remark 1.89. By Theorems 1.79 and 1.78 above we know that in the case of smooth manifolds, if $\alpha$ and $\beta$ are smooth paths, then we have that $\alpha \simeq \beta$ ($C^0$) if and only if $\alpha \simeq \beta$ ($C^r$) for $r > 0$. Thus we can just say that $\alpha$ is homotopic to $\beta$ and write $\alpha \simeq \beta$. In case $\alpha$ and $\beta$ are only continuous, they may be replaced by smooth paths $\alpha'$ and $\beta'$ with $\alpha' \simeq \alpha$ and $\beta' \simeq \beta$.
It is easily checked that homotopy is an equivalence relation. Let $P(p, q)$ denote the set of all continuous (or smooth) paths from $p$ to $q$ defined on $[0, 1]$. Every $\alpha \in P(p, q)$ has a unique inverse (or reverse) path $\alpha^{\leftarrow}$ defined by $\alpha^{\leftarrow}(t) := \alpha(1 - t)$. If $p_1$, $p_2$ and $p_3$ are three points in $M$, then for $\alpha \in P(p_1, p_2)$ and $\beta \in P(p_2, p_3)$ we can "multiply" the paths to get a path $\alpha * \beta \in P(p_1, p_3)$ defined by
$$\alpha * \beta(t) := \begin{cases} \alpha(2t) & \text{for } 0 \le t < 1/2, \\ \beta(2t - 1) & \text{for } 1/2 \le t \le 1. \end{cases}$$
Notice that $\alpha * \beta$ is a path that follows along $\alpha$ and then $\beta$ in that order.⁵ An important observation is that if $\alpha_1 \simeq \alpha_2$ and $\beta_1 \simeq \beta_2$, then $\alpha_1 * \beta_1 \simeq \alpha_2 * \beta_2$.

⁵ In some settings, it is convenient to reverse this convention.
The homotopy between $\alpha_1 * \beta_1$ and $\alpha_2 * \beta_2$ is given in terms of the homotopies $H_\alpha : \alpha_1 \simeq \alpha_2$ and $H_\beta : \beta_1 \simeq \beta_2$ by
$$H(t, s) := \begin{cases} H_\alpha(2t, s) & \text{for } 0 \le t < 1/2, \\ H_\beta(2t - 1, s) & \text{for } 1/2 \le t \le 1, \end{cases} \qquad \text{and } 0 \le s \le 1.$$
Similarly, if $\alpha_1 \simeq \alpha_2$, then $\alpha_1^{\leftarrow} \simeq \alpha_2^{\leftarrow}$. Using this information, we can define a group structure on the set of homotopy equivalence classes of loops, that is, of paths in $P(p, p)$ for some fixed $p \in M$. First of all, we can always form $\alpha * \beta$ for any $\alpha, \beta \in P(p, p)$ since we always start and stop at the same point $p$. Secondly, we have the following result.
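Proposition 1.90. Let $\pi_1(M, p)$ denote the set of fixed endpoint homotopy classes of paths from $P(p, p)$. For $[\alpha], [\beta] \in \pi_1(M, p)$, define $[\alpha] \cdot [\beta] := [\alpha * \beta]$. This is a well-defined multiplication, and with this multiplication $\pi_1(M, p)$ is a group. The identity element of the group is the homotopy class 1 of the constant map $1_p : t \mapsto p$, and the inverse of a class $[\alpha]$ is $[\alpha^{\leftarrow}]$.

Proof. We have already shown that $[\alpha] \cdot [\beta] := [\alpha * \beta]$ is well-defined. One must also show that
1) For any $\alpha$, the paths $\alpha * \alpha^{\leftarrow}$ and $\alpha^{\leftarrow} * \alpha$ are both homotopic to the constant map $1_p$.
2) For any $\alpha \in P(p, p)$, we have $1_p * \alpha \simeq \alpha$ and $\alpha * 1_p \simeq \alpha$.
3) For any $\alpha, \beta, \gamma \in P(p, p)$, we have $(\alpha * \beta) * \gamma \simeq \alpha * (\beta * \gamma)$.

Proof of 1): $1_p$ is homotopic to $\alpha * \alpha^{\leftarrow}$ via
$$H(t, s) = \begin{cases} \alpha(2t) & \text{for } 0 \le 2t \le s, \\ \alpha(s) & \text{for } s \le 2t \le 2 - s, \\ \alpha^{\leftarrow}(2t - 1) & \text{for } 2 - s \le 2t \le 2, \end{cases}$$
where $0 \le s \le 1$. Interchanging the roles of $\alpha$ and $\alpha^{\leftarrow}$ we also get that $1_p$ is homotopic to $\alpha^{\leftarrow} * \alpha$.

Proof of 2): Use the homotopy
$$H(t, s) = \begin{cases} \alpha\!\left(\frac{2}{1 + s}\, t\right) & \text{for } 0 \le t \le 1/2 + s/2, \\ p & \text{for } 1/2 + s/2 \le t \le 1. \end{cases}$$

Proof of 3): Use the homotopy
$$H(t, s) = \begin{cases} \alpha\!\left(\frac{4}{1 + s}\, t\right) & \text{for } 0 \le t \le \frac{1 + s}{4}, \\ \beta\!\left(4\left(t - \frac{1 + s}{4}\right)\right) & \text{for } \frac{1 + s}{4} \le t \le \frac{2 + s}{4}, \\ \gamma\!\left(\frac{4}{2 - s}\left(t - \frac{2 + s}{4}\right)\right) & \text{for } \frac{2 + s}{4} \le t \le 1. \end{cases}$$
□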
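The three homotopies in the proof above can be checked at the endpoints $s = 0$ and $s = 1$ by direct evaluation. The following sketch is only an illustration: it assumes three hypothetical sample loops in the plane based at $p = (1, 0)$, transcribes the formulas for $H$, and compares sampled values. The helper names (`reverse`, `concat`, `cancel`, `unit`, `associate`, `agree`) are illustrative choices, and sampling does not replace the continuity check in the proof.

```python
import math

def reverse(path):
    """The reverse path t |-> path(1 - t)."""
    return lambda t: path(1 - t)

def concat(first, second):
    """The product first * second: `first` on [0, 1/2), then `second` on [1/2, 1]."""
    return lambda t: first(2 * t) if t < 0.5 else second(2 * t - 1)

def cancel(alpha):
    """Proof of 1): homotopy from the constant loop to alpha * reverse(alpha)."""
    back = reverse(alpha)
    def H(t, s):
        if 2 * t <= s:
            return alpha(2 * t)
        if 2 * t <= 2 - s:
            return alpha(s)
        return back(2 * t - 1)
    return H

def unit(alpha, base):
    """Proof of 2): homotopy from alpha * (constant loop at base) to alpha."""
    def H(t, s):
        if t <= 0.5 + s / 2:
            return alpha(2 * t / (1 + s))
        return base
    return H

def associate(alpha, beta, gamma):
    """Proof of 3): homotopy from (alpha*beta)*gamma to alpha*(beta*gamma)."""
    def H(t, s):
        if t <= (1 + s) / 4:
            return alpha(4 * t / (1 + s))
        if t <= (2 + s) / 4:
            return beta(4 * (t - (1 + s) / 4))
        return gamma(4 / (2 - s) * (t - (2 + s) / 4))
    return H

# Hypothetical sample loops in the plane, all based at p = (1, 0).
p = (1.0, 0.0)

def alpha(t):
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

def beta(t):
    return (math.cos(4 * math.pi * t), -math.sin(4 * math.pi * t))

def gamma(t):
    return (2.0 - math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

def constant(t):
    return p

def agree(path_a, path_b, samples=200, tol=1e-9):
    """True if the two paths agree (up to tol) at evenly spaced parameters."""
    for k in range(samples + 1):
        t = k / samples
        if any(abs(u - v) > tol for u, v in zip(path_a(t), path_b(t))):
            return False
    return True

H1 = cancel(alpha)
H2 = unit(alpha, p)
H3 = associate(alpha, beta, gamma)
```

For instance, `agree(lambda t: H3(t, 0), concat(concat(alpha, beta), gamma))` compares the $s = 0$ end of the third homotopy with $(\alpha * \beta) * \gamma$.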
The group $\pi_1(M, p)$ is called the fundamental group of $M$ at $p$. If desired, one can take the equivalence classes in $\pi_1(M, p)$ to be represented
by smooth maps. If $\gamma : [0, 1] \to M$ is a path from $p$ to $q$, then we have a group isomorphism $\pi_1(M, q) \to \pi_1(M, p)$ given by
$$[\alpha] \mapsto [\gamma * \alpha * \gamma^{\leftarrow}].$$
It is easy to show that this prescription is a well-defined group isomorphism. Thus, for any two points $p, q$ in the same path component of $M$, the groups $\pi_1(M, p)$ and $\pi_1(M, q)$ are isomorphic. In particular, if $M$ is connected, then the fundamental groups based at different points are all isomorphic. Because of this, if $M$ is connected, we may simply refer to the fundamental group of $M$, which we write as $\pi_1(M)$.
Definition 1.91. A path connected topological space $M$ is called simply connected if $\pi_1(M) = \{1\}$.

The fundamental group is actually the result of applying a functor (see Appendix A). Consider the category whose objects are pairs $(M, p)$, where $M$ is a $C^r$ manifold and $p$ is a distinguished point (base point), and whose morphisms $f : (M, p) \to (N, q)$ are $C^r$ maps $f : M \to N$ such that $f(p) = q$. The pairs are called pointed $C^r$ spaces and the morphisms are called pointed $C^r$ maps (or base point preserving maps). To every pointed space $(M, p)$, we assign the fundamental group $\pi_1(M, p)$, and to every pointed $C^r$ map $f : (M, p) \to (N, f(p))$ we may assign a group homomorphism $\pi_1(f) : \pi_1(M, p) \to \pi_1(N, f(p))$ by
$$\pi_1(f)([\alpha]) = [f \circ \alpha].$$
It is easy to check that this is a covariant functor, and so for pointed maps $f$ and $g$ that can be composed, $(M, x) \xrightarrow{f} (N, y) \xrightarrow{g} (P, z)$, we have $\pi_1(g \circ f) = \pi_1(g) \circ \pi_1(f)$.

Notation 1.92. To avoid notational clutter, we will often denote $\pi_1(f)$ by $f_\#$.

Definition 1.93. Let $\wp : \widetilde{M} \to M$ be a $C^r$ covering and let $f : P \to M$ be a $C^r$ map. A map $\tilde{f} : P \to \widetilde{M}$ is said to be a lift of the map $f$ if $\wp \circ \tilde{f} = f$.

Theorem 1.94. Let $\wp : \widetilde{M} \to M$ be a $C^r$ covering, let $\gamma : [a, b] \to M$ be a $C^r$ curve and pick a point $y$ in $\wp^{-1}(\gamma(a))$. Then there exists a unique $C^r$ lift $\tilde{\gamma} : [a, b] \to \widetilde{M}$ of $\gamma$ such that $\tilde{\gamma}(a) = y$. Thus the following diagram commutes:
$$\begin{array}{ccc}
& & \widetilde{M} \\
& {\scriptstyle \tilde{\gamma}} \nearrow & \downarrow {\scriptstyle \wp} \\
[a, b] & \xrightarrow{\ \gamma\ } & M
\end{array}$$
Figure 1.6. Lifting a path to a cover
If two paths $\alpha$ and $\beta$ with $\alpha(a) = \beta(a)$ are fixed endpoint homotopic via a homotopy $h$, then for a given point $y$ in $\wp^{-1}(\alpha(a))$, we have the corresponding lifts $\tilde{\alpha}$ and $\tilde{\beta}$ starting at $y$. In this case, the homotopy $h$ lifts to a fixed endpoint homotopy $\tilde{h}$ between $\tilde{\alpha}$ and $\tilde{\beta}$. In short, homotopic paths lift to homotopic paths.
Proof. We just give the basic idea and refer the reader to the extensive literature for details (see [Gre-Hrp]). Figure 1.6 shows the way. Decompose the curve $\gamma$ into segments that lie in evenly covered open sets by using the Lebesgue number lemma. Lift inductively starting by using the inverse of $\wp$ in the first evenly covered open set. It is clear that in order to connect up continuously, each step is forced and so the lifted curve is unique. A similar argument shows how to lift the homotopy $h$. A little thought reveals that if $\wp$ is a $C^r$ covering, then the lifts of $C^r$ maps are $C^r$. □

There are several important corollaries to this result. One is simply that if $\alpha : [0, 1] \to M$ is a path starting at a base point $p \in M$, then since there is one and only one lift $\tilde{\alpha}$ starting at a given $\tilde{p}$ in the fiber $\wp^{-1}(p)$, the endpoint $\tilde{\alpha}(1)$ is completely determined by the path $\alpha$ and by the point $\tilde{p}$ from which we want the lifted path to start. In fact, the endpoint only depends on the homotopy class of $\alpha$ (and the choice of starting point $\tilde{p}$). To see this, note that if $\alpha, \beta : [0, 1] \to M$ are fixed endpoint homotopic paths in $M$ beginning at $p$, and if $\tilde{\alpha}$ and $\tilde{\beta}$ are the corresponding lifts with $\tilde{\alpha}(0) = \tilde{\beta}(0) = \tilde{p}$, then by the second part of the theorem, any homotopy $h_t : \alpha \simeq \beta$ lifts to a unique fixed endpoint homotopy $\tilde{h}_t : \tilde{\alpha} \simeq \tilde{\beta}$. This then
implies that $\tilde{\alpha}(1) = \tilde{\beta}(1)$. Applying these ideas to loops based at $p \in M$, we will next see that the fundamental group $\pi_1(M, p)$ acts on the fiber $\wp^{-1}(p)$ as a group of permutations. (This is a right action as we will see.) In case the covering space $\widetilde{M}$ is simply connected, we will also obtain an isomorphism of the group $\pi_1(M, p)$ with the deck transformation group (which acts from the left on $\widetilde{M}$). Before we delve into these matters, we state, without proof, two more standard results (see [Gre-Hrp]):

Theorem 1.95. Let $\wp : \widetilde{M} \to M$ be a $C^r$ covering. Fix a point $q \in Q$ and a point $\tilde{p} \in \widetilde{M}$. Let $\phi : Q \to M$ be a $C^r$ map with $\phi(q) = \wp(\tilde{p})$. If $Q$ is connected, then there is at most one lift $\tilde{\phi} : Q \to \widetilde{M}$ of $\phi$ such that $\tilde{\phi}(q) = \tilde{p}$. If $\phi_\#(\pi_1(Q, q)) \subset \wp_\#(\pi_1(\widetilde{M}, \tilde{p}))$, then $\phi$ has such a lift. In particular, if $Q$ is simply connected, then the lift exists.
Theorem 1. 96. Every connected topological manifold M has a CO simply connected covering space which is unique up to isomorphism of coverings. This is called the universal cover. Furthermo!!;, if H is any subgroup of 7rl(M,p), then there is a connected covering 8J : M -+ M and a point is EM such that 8J#(7rl(M,iS)) = H. If follows from this and Theorem 1.86 that if M is a C r manifold, then ther!.. is a unique C r structure on the universal covering space M so that 8J: M -+ M is a C r covering.
Since a deck transformation is a lift, we have the following corollary.

Corollary 1.97. Let $\wp : \widetilde M \to M$ be a $C^r$ covering map and choose a base point $p \in M$. If $\widetilde M$ is connected, there is at most one deck transformation $\phi$ that maps a given $p_1 \in \wp^{-1}(p)$ to a given $p_2 \in \wp^{-1}(p)$. If $\widetilde M$ is simply connected, then such a deck transformation exists and is unique.

Theorem 1.98. If $\widetilde M$ is the universal cover of $M$ and $\wp : \widetilde M \to M$ is the corresponding universal covering map, then for any base point $p_0 \in M$, there is an isomorphism $\pi_1(M,p_0) \cong \mathrm{Deck}(\wp)$.
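Proof. Fix a point $\tilde p \in \wp^{-1}(p_0)$. Let $a \in \pi_1(M,p_0)$ and let $\alpha$ be a loop representing $a$. Lift to a path $\tilde\alpha$ starting at $\tilde p$. As we have seen, the point $\tilde\alpha(1)$ depends only on the choice of $\tilde p$ and $a = [\alpha]$. Let $\phi_a$ be the unique deck transformation such that $\phi_a(\tilde p) = \tilde\alpha(1)$. The assignment $a \mapsto \phi_a$ gives a map $\pi_1(M,p_0) \to \mathrm{Deck}(\wp)$. For $a = [\alpha]$ and $b = [\beta]$ chosen from $\pi_1(M,p_0)$, we have the lifts $\tilde\alpha$ and $\tilde\beta$, and we see that $\phi_a \circ \tilde\beta$ is a path from $\phi_a(\tilde p)$ to $\phi_a(\tilde\beta(1)) = \phi_a(\phi_b(\tilde p))$. Thus the path $\tilde\gamma := \tilde\alpha * (\phi_a \circ \tilde\beta)$ is defined. Since $\wp \circ \phi_a = \wp$, we have

$$\wp \circ \tilde\gamma = \wp \circ [\tilde\alpha * (\phi_a \circ \tilde\beta)] = (\wp \circ \tilde\alpha) * (\wp \circ (\phi_a \circ \tilde\beta)) = (\wp \circ \tilde\alpha) * ((\wp \circ \phi_a) \circ \tilde\beta) = (\wp \circ \tilde\alpha) * (\wp \circ \tilde\beta) = \alpha * \beta.$$

Since $\alpha * \beta$ represents the element $ab \in \pi_1(M,p_0)$, we have $\phi_{ab}(\tilde p) = \tilde\gamma(1) = \phi_a(\phi_b(\tilde p))$, and hence $\phi_{ab} = \phi_a \circ \phi_b$ by Corollary 1.97.

It is easy to see that the map $a \mapsto \phi_a$ is onto. Indeed, given $f \in \mathrm{Deck}(\wp)$, we simply take a curve $\tilde\gamma$ from $\tilde p$ to $f(\tilde p)$, and then we have $f = \phi_a$ for $a = [\wp \circ \tilde\gamma]$.

Finally, if $\phi_a = \mathrm{id}$, then we conclude that any loop $\alpha \in [\alpha] = a$ lifts to a loop $\tilde\alpha$ based at $\tilde p$. But $\widetilde M$ is simply connected, and so $\tilde\alpha$ is homotopic to a constant map to $\tilde p$, and its projection $\alpha$ is therefore homotopic to a constant map to $p_0$. Thus $a = [\alpha]$ is the identity element and so the homomorphism is 1-1. $\square$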
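The isomorphism of Theorem 1.98 can be watched in the simplest case. The sketch below assumes the standard covering $\mathbb{R} \to S^1$, $x \mapsto e^{2\pi i x}$ (not treated in this passage), whose deck transformations are the integer translations; the function names are ours and purely illustrative. A loop is lifted step by step with the local inverse of the covering map, as in the lifting argument, and the endpoint of the lift is compared with the corresponding deck transformation.

```python
import cmath
import math

def cover(x):
    """Covering map R -> S^1, x |-> exp(2*pi*i*x) (assumed example)."""
    return cmath.exp(2j * math.pi * x)

def lift_endpoint(loop, start=0.0, steps=2000):
    """Lift a loop in S^1 to a path in R beginning at `start` and return
    the endpoint of the lift. `start` must lie in the fiber over loop(0)."""
    x = start
    prev = loop(0.0)
    for k in range(1, steps + 1):
        cur = loop(k / steps)
        # For small steps the phase of cur/prev is the unique continuous
        # choice of increment, so every step of the lift is forced.
        x += cmath.phase(cur / prev) / (2 * math.pi)
        prev = cur
    return x

def winding_loop(n):
    """A loop based at 1 in S^1 that winds n times around the circle."""
    return lambda t: cover(n * t)

def deck(n):
    """Deck transformation x |-> x + n of the covering R -> S^1."""
    return lambda x: x + n

end_a = lift_endpoint(winding_loop(3))    # lift of a, starting at 0
end_b = lift_endpoint(winding_loop(-2))   # lift of b, starting at 0
# Lifting b from the endpoint of the lift of a gives the endpoint of the
# lift of the concatenated loop, i.e. phi_a(phi_b(0)).
end_ab = lift_endpoint(winding_loop(-2), start=end_a)
print(end_a, end_b, end_ab)
```

Up to rounding, the endpoints agree with `deck(3)(0.0)`, `deck(-2)(0.0)` and `deck(3)(deck(-2)(0.0))`, which is the homomorphism property $\phi_{ab} = \phi_a \circ \phi_b$ in this special case.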
1.6.2. Discrete group actions.

Group actions are ubiquitous in differential geometry and in mathematics generally. Felix Klein emphasized the role of group actions in the classical geometries (see [Klein]). More on classical geometries can be found in the online supplement [Lee, Jeff]. In this section, we discuss discrete group actions on smooth manifolds and show how they give rise to covering spaces.

Definition 1.99. Let $G$ be a group and $M$ a set. A left group action on $M$ is a map $l : G \times M \to M$ such that

1) $l(g_2, l(g_1, x)) = l(g_2 g_1, x)$ for all $g_1, g_2 \in G$ and all $x \in M$;
2) $l(e, x) = x$ for all $x \in M$, where $e$ is the identity element of $G$.

We often write $g \cdot x$ or just $gx$ in place of the more pedantic notation $l(g,x)$. Using this notation, we have $g_2(g_1 x) = (g_2 g_1)x$ and $ex = x$. Similarly, we define a right group action as a map $r : M \times G \to M$ with $r(r(x, g_1), g_2) = r(x, g_1 g_2)$ for all $g_1, g_2 \in G$ and all $x \in M$ and $r(x, e) = x$ for all $x \in M$. In the case of right actions, we write $r(x,g)$ as $x \cdot g$ or $xg$.
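The axioms for left and right actions can be checked mechanically on a small example. The following is only an illustrative sketch: we take $G = S_3$ acting on the three-point set $\{0,1,2\}$, with the left action $l(g,x) = g(x)$ and the right action $r(x,g) = g^{-1}(x)$; these choices, and all function names, are ours rather than the text's.

```python
from itertools import permutations

# Elements of S_3 stored as tuples g with g[i] = image of i.
G = list(permutations(range(3)))
M = range(3)
e = (0, 1, 2)

def mul(g2, g1):
    """Group product g2 g1, meaning: apply g1 first, then g2."""
    return tuple(g2[g1[i]] for i in range(3))

def inv(g):
    """Inverse permutation."""
    out = [0] * 3
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)

def l(g, x):
    """Left action l(g, x) = g(x)."""
    return g[x]

def r(x, g):
    """Right action r(x, g) = g^{-1}(x)."""
    return inv(g)[x]

left_ok = (all(l(g2, l(g1, x)) == l(mul(g2, g1), x)
               for g1 in G for g2 in G for x in M)
           and all(l(e, x) == x for x in M))

right_ok = (all(r(r(x, g1), g2) == r(x, mul(g1, g2))
                for g1 in G for g2 in G for x in M)
            and all(r(x, e) == x for x in M))

# Writing the right action with the group element on the left does NOT
# give a left action, because S_3 is not abelian: the order is reversed.
order_matters = any(r(r(x, g1), g2) != r(x, mul(g2, g1))
                    for g1 in G for g2 in G for x in M)
print(left_ok, right_ok, order_matters)
```

The last check is the order reversal that distinguishes right actions from left actions when the group is not commutative.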
If $l : G \times M \to M$ is a left action, then for every $g \in G$ we have a map $l_g : M \to M$ defined by $l_g(x) = l(g,x)$, and similarly a right action gives for every $g$ a map $r_g : M \to M$. For every result about left actions, there is an analogous result for right actions. However, mathematical conventions are such that while $g \mapsto l_g$ is a group homomorphism from $G$ to the group of permutations of $M$, the map $g \mapsto r_g$ is a group anti-homomorphism, which means that $r_{g_1} \circ r_{g_2} = r_{g_2 g_1}$ for all $g_1, g_2 \in G$ (notice the order reversal).

Given a left action, the sets of the form $Gx = \{gx : g \in G\}$ are called orbits. The set $Gx$ is called the orbit of $x$. Two points $x$ and $y$ are in the same orbit if and only if there is a group element $g$ such that $gx = y$. The orbits are equivalence classes and so they partition $M$. Let $G\backslash M$ be the set of orbits and let $\wp : M \to G\backslash M$ be the projection taking each $x$ to $Gx$. We give $G\backslash M$ the quotient topology. By definition, $U$ is open in $G\backslash M$ if and only if $\wp^{-1}(U)$ is open. This makes $\wp$ continuous, but in this case, it is also an open map. To see this, let $U \subset M$ be open. Then $\wp^{-1}(\wp(U))$ is the union $\bigcup_{g \in G} gU$, which is open, and so $\wp(U)$ is open by the definition of the quotient topology.

Definition 1.100. Suppose $G$ acts on a set $M$ by $l : G \times M \to M$. We say that $G$ acts transitively if for any $x, y \in M$ there is a $g$ such that $gx = y$. Equivalently, the action is transitive if the action has only one orbit. We say that the action is effective provided that $l_g = \mathrm{id}_M$ implies that $g = e$. If the action has the property that $gx = x$ for some $x \in M$ only when $g = e$, then we say that $G$ acts freely (or that the action is free). In other words, an action is free provided that the only element of $G$ that fixes any element of $M$ is the identity element.

Similar statements and definitions apply for right actions except that the orbits have the form $xG$. The quotient space (space of orbits) will then be denoted by $M/G$. Warning: The notational distinction between $G\backslash M$ and $M/G$ is not universal.

Example 1.101. Let $\wp : \widetilde M \to M$ be a covering map. Fix a base point $p_0 \in M$ and a base point $\tilde p_0 \in \wp^{-1}(p_0)$. If $a \in \pi_1(M,p_0)$, then for each $x \in \wp^{-1}(p_0)$, we define $r_a(x) := xa := \tilde\alpha(1)$, where $\tilde\alpha$ is the lift, starting at $x$, of any loop $\alpha$ representing $a$. The reader may check that this defines a right action on the set $\wp^{-1}(p_0)$.

Example 1.102.
Recall that if $\wp : \widetilde M \to M$ is a universal $C^r$ covering map (so that $\widetilde M$ is simply connected), we have an isomorphism $\pi_1(M,p_0) \to \mathrm{Deck}(\wp)$, which we denote by $a \mapsto \phi_a$. This means that $l(a,x) = \phi_a(x)$ defines a left action of $\pi_1(M,p_0)$ on $\widetilde M$.

Let $G$ be a group and endow $G$ with the discrete topology so that, in particular, every point is an open set. In this case, we call $G$ a discrete group. If $M$ is a topological space, then we endow $G \times M$ with the product topology. What does it mean for a map $\alpha : G \times M \to M$ to be continuous? The topology of $G \times M$ is clearly generated by sets of the form $S \times U$, where $S$ is an arbitrary subset of $G$ and $U$ is open in $M$. The map $\alpha : G \times M \to M$ will be continuous if for any point $(g_0, x_0) \in G \times M$ and any open set $U \subset M$ containing $\alpha(g_0, x_0)$ we can find an open set $S \times V$ containing $(g_0, x_0)$ such that $\alpha(S \times V) \subset U$. Since the topology of $G$ is discrete, it is necessary and sufficient that there is an open $V$ containing $x_0$ such that $\alpha(\{g_0\} \times V) \subset U$. It is easy to see that a necessary and sufficient condition for $\alpha$ to be continuous on all of $G \times M$ is that the partial maps $\alpha_g := \alpha(g, \cdot)$ are continuous for every $g \in G$.

Definition 1.103. Let $G$ be a discrete group and $M$ a manifold. A left discrete group action is a group action $l : G \times M \to M$ such that for every $g \in G$ the partial map $l_g(\cdot) := l(g, \cdot)$ is continuous. A right discrete group action is defined similarly.
It follows that if $l : G \times M \to M$ is a discrete action, then each partial map $l_g$ is a homeomorphism with $l_g^{-1} = l_{g^{-1}}$.

Definition 1.104. A discrete group action is $C^r$ if $M$ is a $C^r$ manifold and each $l_g$ (resp. $r_g$) is a $C^r$ map.

Example 1.105. Let $\phi : M \to M$ be a diffeomorphism and let $\mathbb{Z}$ act on $M$ by $n \cdot x := \phi^n(x)$, where

$\phi^0 := \mathrm{id}_M$,
$\phi^n := \phi \circ \cdots \circ \phi$ for $n > 0$,
$\phi^{-n} := (\phi^{-1})^n$ for $n > 0$.

This gives a discrete action of $\mathbb{Z}$ on $M$.

Definition 1.106. A discrete group action $l : G \times M \to M$ is said to be proper if for every two points $x, y \in M$ there are open neighborhoods $U_x$ and $U_y$ respectively such that the set $\{g \in G : gU_x \cap U_y \neq \emptyset\}$ is finite.
There is a more general notion of proper action which we shall meet later. For free and proper discrete actions, we have the following useful characterization. Proposition 1.107. A discrete group action l : G x M -t M is proper and free if and only if the following two conditions hold:
(i) Each $x \in M$ has an open neighborhood $U$ such that $gU \cap U = \emptyset$ for all $g$ except the identity $e$. We shall call such open sets self-avoiding.

(ii) If $x, y \in M$ are not in the same orbit, then they have self-avoiding neighborhoods $U_x$ and $U_y$ such that $gU_x \cap U_y = \emptyset$ for all $g \in G$.
Proof. Suppose that the action $l$ is proper and free. Let $x$ be given. We then know that there is an open $V$ containing $x$ such that $gV \cap V = \emptyset$ except for a finite number of $g$, say $g_1, \ldots, g_k$, which are distinct. One of these, say $g_1$, must be $e$. Since the action is free, we know that for each fixed $i > 1$ we have $g_i x \in M \setminus \{x\}$. By using continuity and then the fact that $M$ is a regular topological space, we can replace $V$ by a smaller open set (called $V$ again) such that $\overline{g_i V} \subset M \setminus \{x\}$ for all $i = 2, \ldots, k$ or, in other words, that $x \notin \overline{g_2 V} \cup \cdots \cup \overline{g_k V}$. Let $U = V \setminus (\overline{g_2 V} \cup \cdots \cup \overline{g_k V})$. Notice that $U$ is open and we have arranged that $U$ contains $x$. We show that $U \cap gU$ is empty unless $g = e$. So suppose $g \neq e = g_1$. Since $U \cap gU \subset V \cap gV$, we know that this is empty for sure in all cases except maybe where $g = g_i$ for $i = 2, \ldots, k$. If $z \in U \cap g_i U$ for such an $i$, then $z \in U$ and so $z \notin g_i V$ by the definition of $U$. But we also have $z \in g_i U \subset g_i V$, which is a contradiction. We conclude that (i) holds.

Now suppose that $x, y \in M$ are not in the same orbit. We know that there exist open sets $U_x$ and $U_y$ with $x \in U_x$, $y \in U_y$ and such that $gU_x \cap U_y$ is empty except possibly for some finite set of elements which we denote by $g_1, \ldots, g_k$. Since the action is free, $g_1 x, \ldots, g_k x$ are distinct. We also know that $y$ is not equal to any of $g_1 x, \ldots, g_k x$, and so since $M$ is a Hausdorff space, there exist pairwise disjoint open sets $O_1, \ldots, O_k, O_y$ with $g_i x \in O_i$ and $y \in O_y$. By continuity, we may shrink $U_x$ so that $g_i U_x \subset O_i$ for all $i = 1, \ldots, k$, and then we also replace $U_y$ with $O_y \cap U_y$ (renaming this $U_y$ again). As a result we now see that $gU_x \cap U_y = \emptyset$ for $g = g_1, \ldots, g_k$ and hence for all $g$. By shrinking the sets $U_x$ and $U_y$ further we may make them self-avoiding.

Next we suppose that (i) and (ii) always hold for a given discrete action $l$. First we show that $l$ is free. Suppose that $x = gx$. Then for every open neighborhood $U$ of $x$ the set $gU \cap U$ is nonempty, which by (i) means that $g = e$. Thus the action is free. Next pick $x, y \in M$. If $x, y$ are not in the same orbit, then by (ii) we may pick $U_x$ and $U_y$ so that $\{g \in G : gU_x \cap U_y \neq \emptyset\}$ is empty and so certainly a finite set. If $x, y$ are in the same orbit, then $y = g_0 x$ for a unique $g_0$ since we now know that the action is free. Choose an open neighborhood $U$ of $x$ so that $gU \cap U = \emptyset$ for $g \neq e$. Let $U_x = U$ and $U_y = g_0 U$. Then $gU_x \cap U_y = gU \cap g_0 U$. If $gU \cap g_0 U \neq \emptyset$, then $g_0^{-1} g U \cap U \neq \emptyset$ and so $g_0^{-1} g = e$ and $g = g_0$. Thus the only way that $gU_x \cap U_y$ is nonempty is if $g = g_0$, and so the set $\{g \in G : gU_x \cap U_y \neq \emptyset\}$ has cardinality one. In either case, we may choose $U_x$ and $U_y$ so that the set is finite, which is what we wanted to show. $\square$
It is easy to see that if U c M is self-avoiding, then any open subset V C U is also self-avoiding. Thus, if the discrete group G acts freely and properly on M, then the open sets of (i) and (ii) in the above proposition can be taken to be connected chart domains.
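To make the notion of a self-avoiding set concrete, here is a small sketch for the translation action $n \cdot x := x + n$ of $\mathbb{Z}$ on $\mathbb{R}$. This action is our choice of example (it is an instance of Example 1.105 with $\phi(x) = x + 1$), and the helper name `translates_meeting` is hypothetical. For an open interval $U = (a,b)$ the function lists the integers $n$, within a finite search window, for which $n \cdot U$ meets $U$.

```python
def translates_meeting(a, b, search=range(-5, 6)):
    """For U = (a, b) and the action n . x = x + n of Z on R, return the
    integers n in `search` for which the translate n + U meets U."""
    hits = []
    for n in search:
        # The open intervals (a + n, b + n) and (a, b) overlap exactly when
        # the larger left endpoint is smaller than the smaller right endpoint.
        if max(a, a + n) < min(b, b + n):
            hits.append(n)
    return hits

short = translates_meeting(0.1, 0.6)     # length 1/2: self-avoiding
smaller = translates_meeting(0.2, 0.4)   # an open subset of the above
long_ = translates_meeting(0.0, 1.5)     # length 3/2: meets its neighbours
print(short, smaller, long_)
```

Only the identity $n = 0$ appears for the two short intervals, in line with the remark that open subsets of self-avoiding sets are self-avoiding, while the long interval also meets its translates by $\pm 1$.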
Proposition 1.108. Let $M$ be an $n$-manifold and let $l : G \times M \to M$ be a smooth discrete action which is free and proper. Then the quotient space $G\backslash M$ has a natural smooth structure such that the quotient map is a smooth covering map.

Proof. Giving $G\backslash M$ the quotient topology makes $\wp : M \to G\backslash M$ continuous. Using (ii) of Proposition 1.107, it is easy to show that the quotient topology on $G\backslash M$ is Hausdorff. By Proposition 1.107, we may cover $M$ by charts whose domains are self-avoiding and connected. Let $(U, \mathrm{x})$ be such a chart and consider the restriction $\wp|_U$. This restricted map is open since, as remarked above, $\wp$ is an open map. If $x, y \in U$ and $\wp(x) = \wp(y)$, then $x$ and $y$ are in the same orbit and so $y = gx$ for some $g$. Therefore $y \in gU \cap U$, which means that $gU \cap U$ is not empty and so $g = e$ since $U$ is self-avoiding. Thus $x = y$ and we conclude that $\wp|_U$ is injective. Hence $\wp|_U$ is a bijection onto $\wp(U)$, and since it is also open, it has a continuous inverse and so it is a homeomorphism onto $\wp(U)$. Since $U$ is connected, the connected components of $\wp^{-1}(\wp(U))$ are exactly the sets $gU$ for $g \in G$. Since $\wp \circ l_g = \wp$ for all $g \in G$, it is easy to see that $\wp$ restricts to a homeomorphism on each connected component $gU$ of $\wp^{-1}(\wp(U))$. Thus, $\wp(U)$ is evenly covered by $\wp$ and so $\wp$ is a covering map. For every such chart $(U, \mathrm{x})$, we have a map

$$\mathrm{x} \circ (\wp|_U)^{-1} : \wp(U) \to \mathrm{x}(U),$$

which is a chart on $G\backslash M$. This map is clearly a homeomorphism. Given any other map constructed in this way, say $\mathrm{y} \circ (\wp|_V)^{-1}$, the domains $\wp(U)$ and $\wp(V)$ only meet if there is a $g \in G$ such that $gU$ meets $V$ and $l_g$ maps an open subset of $U$ diffeomorphically onto a subset of $V$. In fact, by Exercise 1.109 below, the map $(\wp|_V)^{-1} \circ \wp|_U$ is defined on an open set each point of which has a neighborhood on which this map is a restriction of $l_g$ for some $g$. Thus $(\wp|_V)^{-1} \circ \wp|_U$ is smooth and for the overlap map we have

$$\mathrm{y} \circ (\wp|_V)^{-1} \circ \left(\mathrm{x} \circ (\wp|_U)^{-1}\right)^{-1} = \mathrm{y} \circ (\wp|_V)^{-1} \circ \wp|_U \circ \mathrm{x}^{-1},$$
which is smooth. Thus we have an atlas on $G\backslash M$ and the topology induced by the atlas is the same as the quotient topology since we have already established that the charts are homeomorphisms. $\square$

Exercise 1.109. In the context of the proof above, show that $(\wp|_V)^{-1} \circ \wp|_U$ is defined on an open set $O = (\wp|_U)^{-1}(\wp(U) \cap \wp(V))$. Show that each $x \in O$ has a neighborhood on which the map $(\wp|_V)^{-1} \circ \wp|_U$ coincides with a restriction of a map $x \mapsto gx$ for some fixed $g$. Conclude that $(\wp|_V)^{-1} \circ \wp|_U$ is a $C^r$ map. [Outline of solution: For $x \in O$, we must have $(\wp|_V)^{-1} \circ \wp|_U(x) = gx$ for some $g$. Now $O' = U \cap g^{-1}V$ is an open set that contains $x$. But also, $\wp(g^{-1}V) = \wp(V)$ so $O' = U \cap g^{-1}V \subset (\wp|_U)^{-1}(\wp(U) \cap \wp(V))$. Let $x' \in O'$. Then since $\wp(x') \in \wp(U) \cap \wp(V)$, it follows that

$$(\wp|_V)^{-1} \circ \wp|_U(x') = x'',$$

where $x''$ is the unique point in $V$ such that $\wp(x'') = \wp(x')$. But $gx' \in V$ and $\wp(gx') = \wp(x')$, so $gx' = x''$.]

Example 1.110. We have seen the torus previously presented as $T^2 = S^1 \times S^1$. Another presentation that uses a group action is given as follows: Let the group $\mathbb{Z} \times \mathbb{Z} = \mathbb{Z}^2$ act on $\mathbb{R}^2$ by
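$$(m,n) \cdot (x,y) := (x+m, y+n).$$

It is easy to check that Proposition 1.108 applies to give a manifold $\mathbb{R}^2/\mathbb{Z}^2$. This is actually the torus in another guise, and we have the diffeomorphism $\phi : \mathbb{R}^2/\mathbb{Z}^2 \to S^1 \times S^1 = T^2$ given by $[(x,y)] \mapsto (e^{i2\pi x}, e^{i2\pi y})$. The following diagram commutes: [commutative triangle formed by the quotient map $\mathbb{R}^2 \to \mathbb{R}^2/\mathbb{Z}^2$, the map $\phi$, and the map $\mathbb{R}^2 \to S^1 \times S^1$ given by $(x,y) \mapsto (e^{i2\pi x}, e^{i2\pi y})$].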
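As a numerical sanity check on Example 1.110, the sketch below verifies on a few sample points that the map $(x,y) \mapsto (e^{i2\pi x}, e^{i2\pi y})$ is constant on orbits of the $\mathbb{Z}^2$ action, which is exactly what is needed for it to descend to the quotient. The function names `act` and `to_torus` are ours and only illustrative; the check is a floating-point approximation, not a proof.

```python
import cmath
import math

def act(m, n, point):
    """The action (m, n) . (x, y) = (x + m, y + n) of Z^2 on R^2."""
    x, y = point
    return (x + m, y + n)

def to_torus(point):
    """The map R^2 -> S^1 x S^1, (x, y) |-> (e^{2 pi i x}, e^{2 pi i y})."""
    x, y = point
    return (cmath.exp(2j * math.pi * x), cmath.exp(2j * math.pi * y))

def close(z, w, tol=1e-9):
    """Compare two points of S^1 x S^1 up to rounding error."""
    return abs(z[0] - w[0]) < tol and abs(z[1] - w[1]) < tol

samples = [(0.0, 0.0), (0.25, 0.5), (-1.3, 2.7)]
shifts = [(1, 0), (0, 1), (-3, 4), (7, -2)]

# Invariance: points in the same orbit have the same image in the torus.
invariant = all(close(to_torus(act(m, n, p)), to_torus(p))
                for p in samples for (m, n) in shifts)

# Freeness: a nontrivial group element moves every sample point.
free = all(act(m, n, p) != p for p in samples for (m, n) in shifts)
print(invariant, free)
```

Two points not related by an integer shift, such as $(0,0)$ and $(0.25, 0.5)$, have different images, consistent with the induced map being injective on orbits.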
Exercise 1.111. Let $\wp : M \to G\backslash M$ be the covering arising from a free and proper discrete action of $G$ on $M$ and suppose that $M$ is connected. Let $\Gamma_G := \{l_g \in \mathrm{Diff}(M) : g \in G\}$; then $G$ is isomorphic to $\Gamma_G$ by the obvious map $g \mapsto l_g$ and furthermore $\Gamma_G = \mathrm{Deck}(\wp)$.

Covering spaces $\wp : \widetilde M \to M$ that arise from a proper and free discrete group action are special in that if $\widetilde M$ is connected, then the covering is a normal covering, which means that the group $\mathrm{Deck}(\wp)$ acts transitively on each fiber $\wp^{-1}(p)$ (why?).

Example 1.112. Recall the multiplicative abelian group $\mathbb{Z}_2 = \{1, -1\}$ of two elements. Let $\mathbb{Z}_2$ act on the sphere $S^n \subset \mathbb{R}^{n+1}$ by $(\pm 1) \cdot x := \pm x$. Thus the action is generated by letting $-1$ send a point on the sphere to its antipode. This action is also easily seen to be free and proper. The quotient space is the real projective space $\mathbb{R}P^n$ (see Example 1.41), $\mathbb{R}P^n = S^n/\mathbb{Z}_2$.
If $M$ is simply connected and we have a free and proper action by $G$ as above, then we can define a map $\phi : G \to \pi_1(G\backslash M, b_0)$ as follows: Fix a base point $x_0 \in M$ with $\wp(x_0) = b_0$. Given $g \in G$, let $\gamma : [0,1] \to M$ be a path with $\gamma(0) = x_0$ and $\gamma(1) = gx_0$. Then

$$\phi(g) := [\wp \circ \gamma] \in \pi_1(G\backslash M, b_0).$$

This is well-defined because $M$ is simply connected. In fact, we already know from Theorem 1.98 that there is an isomorphism $\pi_1(G\backslash M, b_0) \cong \mathrm{Deck}(\wp)$. But by Exercise 1.111, we know that $G \cong \mathrm{Deck}(\wp)$ by the map $g \mapsto l_g$. Composing, we obtain an isomorphism $\psi : \pi_1(G\backslash M, b_0) \to G$. Recalling the definition of the isomorphism constructed in the proof of Theorem 1.98, we see that the map $\phi$ is the inverse of $\psi$.

The reader may wish to try to prove directly that $\pi_1(\mathbb{R}P^n) \cong \mathbb{Z}_2$ for $n \geq 2$.
1.7. Regular Submanifolds

A subset $S$ of a smooth $n$-manifold $M$ is called a regular submanifold of dimension $k$ if every point $p \in S$ is in the domain of a chart $(U, \mathrm{x})$ that has the following regular submanifold property with respect to $S$:

$$\mathrm{x}(U \cap S) = \mathrm{x}(U) \cap (\mathbb{R}^k \times \{c\}) \quad \text{for some } c \in \mathbb{R}^{n-k}.$$

Usually $c$ is chosen to be $0$, which can always be accomplished by composition with a translation of $\mathbb{R}^n$. The terminology here does not seem to be quite standardized. If a subset $S \subset M$ is covered by charts of $M$ of the above type, then $S$ itself is said to have the (regular) submanifold property. We will refer to such charts as being single-slice charts (adapted to $S$). For every such single-slice chart $(U, \mathrm{x})$, we obtain a chart $(U \cap S, \mathrm{x}_S)$ on $S$, where $\mathrm{x}_S := \mathrm{pr} \circ \mathrm{x}|_{U \cap S}$ and $\mathrm{pr} : \mathbb{R}^k \times \mathbb{R}^{n-k} \to \mathbb{R}^k$ is projection onto the first $k$ coordinates. In other words, if $x^1, \ldots, x^n$ are the coordinate functions of a single-slice chart, then the restrictions of $x^1, \ldots, x^k$ to $U \cap S$ are coordinate functions on $S$. These charts provide an atlas for $S$ (called a submanifold atlas) making it a smooth manifold in its own right. Indeed, one checks that the overlap maps for such charts are smooth.

Exercise 1.115. Prove this last statement.
We will see more general types of submanifolds in the sequel. An important aspect of regular submanifolds is that the topology induced by the smooth structure is the same as the relative topology. The integer $n - k$ is called the codimension of $S$ (in $M$), and we say that $S$ is a regular submanifold of codimension $n - k$.
Figure 1.7. Projection chart
Example 1.116. The unit sphere $S^n \subset \mathbb{R}^{n+1}$ is a regular submanifold of $\mathbb{R}^{n+1}$. To see this, let $W_i^{\pm} := \{(a^1, \ldots, a^{n+1}) \in \mathbb{R}^{n+1} : \pm a^i > 0\}$. Then define $\psi_i^{\pm} : W_i^{\pm} \to \psi_i^{\pm}(W_i^{\pm})$ by

$$\psi_i^{\pm}(a^1, \ldots, a^{n+1}) = (a^1, \ldots, a^{i-1}, a^{i+1}, \ldots, a^{n+1}, |a| - 1).$$

These can easily be checked to give charts on $\mathbb{R}^{n+1}$ smoothly related to the standard chart. If $p \in S^n$, then $p$ is in the domain of one of the charts $\psi_i^{\pm}$. On the other hand, identifying $\mathbb{R}^{n+1}$ with $\mathbb{R}^n \times \mathbb{R}$, we have $\psi_i^{\pm}(W_i^{\pm} \cap S^n) = \psi_i^{\pm}(W_i^{\pm}) \cap (\mathbb{R}^n \times \{0\})$ and so these charts have the submanifold property with respect to $S^n$. Let $\mathrm{pr} : \mathbb{R}^{n+1} = \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}^n$. The resulting charts on $S^n$ given by $\mathrm{pr} \circ \psi_i^{\pm}|_{W_i^{\pm} \cap S^n}$ have the form $(a^1, \ldots, a^{n+1}) \mapsto (a^1, \ldots, a^{i-1}, a^{i+1}, \ldots, a^{n+1})$. These are projections from $S^n$ onto coordinate hyperplanes in $\mathbb{R}^{n+1}$ and are easily checked to give the same smooth structure as the stereographic charts given earlier.

Exercise 1.117. Show that a continuous map $f : N \to M$ that has its image contained in a regular submanifold $S$ is differentiable with respect to the submanifold atlas if and only if it is differentiable as a map into $M$.

Exercise 1.118. Show that the graph of any smooth map $\mathbb{R}^n \to \mathbb{R}^m$ is a regular submanifold of $\mathbb{R}^n \times \mathbb{R}^m$.

Exercise 1.119. Show that if $M$ is a $k$-dimensional regular submanifold of $\mathbb{R}^n$, then for every $p \in M$, there exists at least one $k$-dimensional coordinate plane $P$ such that the orthogonal projection $\mathbb{R}^n \to P \cong \mathbb{R}^k$ restricts to a coordinate chart for $M$ defined on some neighborhood of $p$. [Hint: If $(U, \mathrm{x})$ is a single-slice chart for $M$ so that $x^1, \ldots, x^k$ restrict to coordinates on $M$, then $x^{k+1}, \ldots, x^n$ together give a map $f : U \subset \mathbb{R}^n = \mathbb{R}^k \times \mathbb{R}^{n-k} \to \mathbb{R}^{n-k}$ such that $M \cap U = f^{-1}(0)$. Argue that if $u^1, \ldots, u^n$ are standard coordinates on $\mathbb{R}^n$, then after a suitable renumbering we must have

$$\frac{\partial(x^{k+1}, \ldots, x^n)}{\partial(u^{k+1}, \ldots, u^n)} \neq 0.$$

Now use the implicit mapping theorem to show that $M$ is locally the graph of a smooth function.]
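The charts of Example 1.116 are easy to experiment with. The sketch below is only an illustration under our own naming (`psi`, `sphere_chart`, `sphere_chart_inverse` are hypothetical helpers, with the index $i$ counted from 1 as in the text): it evaluates $\psi_i^{\pm}$, confirms that points of the sphere land in the slice where the last coordinate vanishes, and inverts the resulting projection chart.

```python
import math

def psi(i, a):
    """psi_i: drop the i-th coordinate (1-based) and append |a| - 1.
    The same formula serves for both W_i^+ and W_i^-."""
    norm = math.sqrt(sum(t * t for t in a))
    return tuple(a[:i - 1]) + tuple(a[i:]) + (norm - 1.0,)

def sphere_chart(i, a):
    """pr o psi_i on the sphere: the projection that forgets a^i."""
    return psi(i, a)[:-1]

def sphere_chart_inverse(i, sign, u):
    """Inverse of the projection chart on the part of the sphere where
    the i-th coordinate has the given sign (+1 or -1)."""
    ai = sign * math.sqrt(1.0 - sum(t * t for t in u))
    return tuple(u[:i - 1]) + (ai,) + tuple(u[i - 1:])

p = (0.6, 0.0, 0.8)            # a point of S^2 with third coordinate > 0
image = psi(3, p)              # lies in the slice R^2 x {0} up to rounding
u = sphere_chart(3, p)         # coordinates of p in the projection chart
back = sphere_chart_inverse(3, +1, u)
off_sphere = psi(3, (0.6, 0.0, 1.8))   # not on the sphere: last entry != 0
print(image, u, back, off_sphere)
```

The last coordinate of `psi` measures the signed distance from the sphere, so it vanishes exactly on $S^n$; this is the single-slice property used in the example.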
1.8. Manifolds with Boundary

For the general Stokes theorem, where the notion of flux has its natural setting, we will need to have the concept of a smooth manifold with boundary. We have already introduced the notion of a topological manifold with boundary, but now we want to see how to handle the issue of the smooth structures. Some basic two-dimensional examples to keep in mind are the upper half-plane $\mathbb{R}^2_{y \geq 0} := \{(x,y) \in \mathbb{R}^2 : y \geq 0\}$, the closed unit disk $D = \{(x,y) \in \mathbb{R}^2 : x^2 + y^2 \leq 1\}$, and the closed hemisphere which is the set of all $(x,y,z) \in S^2$ with $z \geq 0$. Recall that in Section 1.2 we defined the closed $n$-dimensional Euclidean half-spaces $\mathbb{R}^n_{\lambda \geq c} := \{a \in \mathbb{R}^n : \lambda(a) \geq c\}$. Of course, $\mathbb{R}^n_{\lambda \geq c} = \mathbb{R}^n_{-\lambda \leq -c}$, so we are including both $\mathbb{R}^n_{x^k \geq 0}$ and $\mathbb{R}^n_{x^k \leq 0}$. We also write $\mathbb{R}^n_{\lambda = c} = \{a \in \mathbb{R}^n : \lambda(a) = c\}$. Give $\mathbb{R}^n_{\lambda \geq c}$ the relative topology as a subset of $\mathbb{R}^n$. Since $\mathbb{R}^n_{\lambda \geq c} \subset \mathbb{R}^n$, we already have a notion of differentiability for a map $U \to \mathbb{R}^m$, where $U$ is a relatively open subset of $\mathbb{R}^n_{\lambda \geq c}$. We just invoke Definition 1.58. We can extend definitions a bit more:

Definition 1.120. Let $U \subset \mathbb{R}^n_{\lambda_1 \geq c_1}$ and $f : U \to \mathbb{R}^m_{\lambda_2 \geq c_2}$. We say that $f$ is $C^r$ if it is $C^r$ as a map into $\mathbb{R}^m$. If both $f : U \to f(U)$ and $f^{-1} : f(U) \to U$ are homeomorphisms of relatively open sets and $C^r$ in this sense, then $f$ is called a $C^r$ diffeomorphism.

For convenience, let us introduce for an open set $U \subset \mathbb{R}^n_{\lambda \geq c}$ (relatively open) the following notations: Let $\partial U$ denote $\mathbb{R}^n_{\lambda = c} \cap U$ and $\mathrm{int}(U)$ denote $U \setminus \partial U$. In particular, $\partial \mathbb{R}^n_{\lambda \geq c} = \mathbb{R}^n_{\lambda = c}$. Notice that $\partial U$ is clearly an $(n-1)$-manifold. We have the following three facts:

(1) First, let $f : U \subset \mathbb{R}^n \to \mathbb{R}^k$ be $C^r$ differentiable (with $r \geq 1$) and $g$ another such map with the same domain. If $f = g$ on $\mathbb{R}^n_{\lambda \geq c} \cap U$, then $Df(x) = Dg(x)$ for all $x \in \mathbb{R}^n_{\lambda \geq c} \cap U$.

(2) If $f : U \subset \mathbb{R}^n \to \mathbb{R}^m_{\lambda \geq c}$ is $C^r$ differentiable (with $r \geq 1$) and $f(x) \in \mathbb{R}^m_{\lambda = c} = \partial \mathbb{R}^m_{\lambda \geq c}$ for all $x \in U$, then $Df(x)$ must have its image in $\mathbb{R}^m_{\lambda = 0}$.
Figure 1.8. Manifold with boundary
(3) Let $f : U_1 \subset \mathbb{R}^n_{\lambda_1 \geq c_1} \to U_2 \subset \mathbb{R}^m_{\lambda_2 \geq c_2}$ be a diffeomorphism (in our new extended sense). Assume that $\partial U_1$ and $\partial U_2$ are not empty. Then $f$ induces diffeomorphisms $\partial U_1 \to \partial U_2$ and $\mathrm{int}(U_1) \to \mathrm{int}(U_2)$.

These three claims are not exactly obvious, but they are very intuitive. On the other hand, none of them is difficult to prove (see Problem 19). We can now form a definition of smooth manifold with boundary in a fashion completely analogous to the definition of a smooth manifold without boundary. A half-space chart $\mathrm{x}$ for a set $M$ is a bijection of some subset $U$ of $M$ onto an open subset of some half-space $\mathbb{R}^n_{\lambda \geq c}$. A $C^r$ half-space atlas is a collection $(U_\alpha, \mathrm{x}_\alpha)$ of half-space charts such that for any two, say $(U_\alpha, \mathrm{x}_\alpha)$ and $(U_\beta, \mathrm{x}_\beta)$, the map $\mathrm{x}_\alpha \circ \mathrm{x}_\beta^{-1}$ is a $C^r$ diffeomorphism on its natural domain. Notice carefully that we allow the half-space to vary from chart to chart, but we will keep $n$ fixed for a given $M$ and refer to the charts as $n$-dimensional half-space charts.
Definition 1.121. An $n$-dimensional $C^r$ manifold with boundary is a pair $(M, \mathcal{A})$ consisting of a set $M$ together with a maximal atlas of $n$-dimensional half-space charts $\mathcal{A}$. The manifold topology is that generated by the domains of the charts in the maximal atlas. The boundary of $M$ is denoted by $\partial M$ and is the set of points whose image under any chart is contained in the boundary of the associated half-space.

The three facts listed above show that the notion of a boundary is a well-defined concept and is a natural notion in the context of smooth manifolds; it is a "differentiable invariant".
Colloquially, one usually just refers to $M$ as a manifold with boundary and forgoes the explicit reference to the atlas. Also we refer to an $n$-dimensional $C^\infty$ manifold with boundary as an $n$-manifold with boundary. The interior of a manifold with boundary is $M \setminus \partial M$. It is a manifold without boundary and is denoted $\mathrm{int}(M)$ or $\mathring{M}$.

Exercise 1.122. Show that $\partial M$ is a closed set in $M$.

If no component of a manifold without boundary is compact, it is called an open manifold. For example, the interior $\mathrm{int}(M)$ of a connected manifold $M$ with nonempty boundary is never compact and is an open manifold in the above sense if every component of $M$ contains part of the boundary.
Remark 1.123. We avoid the phrase "closed manifold", which is sometimes taken to refer to a compact manifold without boundary.
Let $M$ be an $n$-dimensional $C^r$ manifold with boundary and $p \in \partial M$. Then by definition there is a chart $(U, \mathrm{x})$ with $\mathrm{x}(p) \in \partial \mathbb{R}^n_{\lambda \geq c}$. The image of the restriction $\mathrm{x}|_{U \cap \partial M}$ is contained in $\partial \mathbb{R}^n_{\lambda \geq c}$ for some $\lambda$ and $c$ depending on the chart. By composing this restriction with any fixed linear isomorphism $\partial \mathbb{R}^n_{\lambda \geq c} \cong \mathbb{R}^{n-1}$, we obtain a bijection, say $\mathrm{x}_{\partial M}$, of $U \cap \partial M$ onto an open subset of $\mathbb{R}^{n-1}$ which provides a chart $(U \cap \partial M, \mathrm{x}_{\partial M})$ for $\partial M$. The family of charts obtained in this way is an atlas for $\partial M$. The overlaps are smooth and so we have the following:

Proposition 1.124. If $M$ is an $n$-manifold with boundary, then $\partial M$ is an $(n-1)$-manifold.

Exercise 1.125. Show that the overlap maps for the atlas just constructed for $\partial M$ are smooth.

Exercise 1.126. The closed unit ball $\overline{B}(p,1)$ in $\mathbb{R}^n$ is a smooth manifold with boundary $\partial \overline{B}(p,1) = S^{n-1}$. Also, the closed hemisphere $S^n_+ = \{x \in S^n : x^{n+1} \geq 0\}$ is a smooth manifold with boundary.

Exercise 1.127. Is the Cartesian product of two smooth manifolds with boundary necessarily a smooth manifold with boundary?

Exercise 1.128. Show that the concept of smooth partition of unity makes sense for manifolds with boundary. Show that such exist.
Problems (1) Prove Proposition 1.32. The online supplement [Lee, Jeff] outlines the proof.
(2) Prove Lemma 1.11.

(3) Check that the manifolds given as examples are indeed paracompact and Hausdorff.

(4) Let $M_1$, $M_2$ and $M_3$ be smooth manifolds. (a) Show that $(M_1 \times M_2) \times M_3$ is diffeomorphic to $M_1 \times (M_2 \times M_3)$ in a natural way. (b) Show that $f : M \to M_1 \times M_2$ is $C^\infty$ if and only if the composite maps $\mathrm{pr}_1 \circ f : M \to M_1$ and $\mathrm{pr}_2 \circ f : M \to M_2$ are both $C^\infty$.

(5) Show that a $C^r$ manifold $M$ is connected as a topological space if and only if it is $C^r$ path connected in the sense that for any two points $p_1, p_2 \in M$ there is a $C^r$ map $c : [0,1] \to M$ such that $c(0) = p_1$ and $c(1) = p_2$.

(6) A $k$-frame in $\mathbb{R}^n$ is a linearly independent ordered set of vectors $(v_1, \ldots, v_k)$. Show that the set of all $k$-frames in $\mathbb{R}^n$ can be given the structure of a smooth manifold. This kind of manifold is called a Stiefel manifold.

(7) For a product manifold $M \times N$, we have the two projection maps $\mathrm{pr}_1 : M \times N \to M$ and $\mathrm{pr}_2 : M \times N \to N$ defined by $(x,y) \mapsto x$ and $(x,y) \mapsto y$ respectively. Show that if we have smooth maps $f : P \to M$ and $g : P \to N$, then the map $(f,g) : P \to M \times N$ given by $(f,g)(p) = (f(p), g(p))$ is the unique smooth map such that $\mathrm{pr}_1 \circ (f,g) = f$ and $\mathrm{pr}_2 \circ (f,g) = g$.

(8) Prove (i) and (ii) of Lemma 1.35.
(9) Show that the atlas obtained for a regular submanifold induces the relative topology inherited from the ambient manifold.
(10) The topology induced by a smooth structure is not necessarily Hausdorff: Let $S$ be the subset of $\mathbb{R}^2$ given by the union $(\mathbb{R} \times 0) \cup \{(0,1)\}$. Let $U$ be $\mathbb{R} \times 0$ and let $V$ be the set obtained from $U$ by replacing the point $(0,0)$ by $(0,1)$. Define a chart map $\mathrm{x}$ on $U$ by $\mathrm{x}(x,0) = x$ and a chart $\mathrm{y}$ on $V$ by

$\mathrm{y}(x,0) = x$ if $x \neq 0$,
$\mathrm{y}(0,1) = 0$.

Show that these two charts provide a $C^\infty$ atlas on $S$, but that the topology induced by the atlas is not Hausdorff.
(11) As we have defined them, manifolds are not required to be second countable and so may have an uncountable number of connected components. Consider the set $\mathbb{R}^2$ without its usual topology. For each $a \in \mathbb{R}$, define a bijection $\phi_a : \mathbb{R} \times \{a\} \to \mathbb{R}$ by $\phi_a(x,a) = x$. Show that the family of sets of the form $U \times \{a\}$ for $U$ open in $\mathbb{R}$ and $a \in \mathbb{R}$ provide a basis for a paracompact topology on $\mathbb{R}^2$. Show that the maps $\phi_a$ are charts and together provide an atlas for $\mathbb{R}^2$ with this unusual topology. Show that the resulting smooth manifold has an uncountable number of connected components (and so is not second countable).

(12) Show that every connected manifold has a countable atlas consisting of charts whose domains have compact closure and are simply connected. Hint: We are assuming that our manifolds are paracompact, so each connected component is second countable.

(13) Show that every second countable manifold has a countable fundamental group (a solution can be found in [Lee, John] on page 10).

(14) If $\mathbb{C} \times \mathbb{C}$ is identified with $\mathbb{R}^4$ in the obvious way, then $S^3$ is exactly the subset of $\mathbb{C} \times \mathbb{C}$ given by $\{(z_1, z_2) : |z_1|^2 + |z_2|^2 = 1\}$. Let $p, q$ be coprime integers with $p > q \geq 0$. Let $\omega$ be a primitive $p$-th root of unity so that $\mathbb{Z}_p = \{1, \omega, \ldots, \omega^{p-1}\}$. For $(z_1, z_2) \in S^3$, let $\omega \cdot (z_1, z_2) := (\omega z_1, \omega^q z_2)$ and extend this to an action of $\mathbb{Z}_p$ on $S^3$ so that $\omega^k \cdot (z_1, z_2) = (\omega^k z_1, \omega^{qk} z_2)$. Show that this action is free and proper. The quotient space $\mathbb{Z}_p \backslash S^3$ is called a lens space and is denoted by $L(p;q)$.

(15) Let $S^1$ be realized as the set of complex numbers of modulus one. Define a map $\theta : S^1 \times S^1 \to S^1 \times S^1$ by $\theta(z,w) = (-z, w)$ and note that $\theta \circ \theta = \mathrm{id}$. Let $G$ be the group $\{\mathrm{id}, \theta\}$. Show that $M := (S^1 \times S^1)/G$ is a smooth 2-manifold.

(16) Show that if $S$ is a regular $k$-dimensional submanifold of an $n$-manifold $M$, then we may cover $S$ by special single-slice charts from the atlas of $M$ which are of the form $\mathrm{x} : U \to V_1 \times V_2 \subset \mathbb{R}^k \times \mathbb{R}^{n-k} = \mathbb{R}^n$ with

$$\mathrm{x}(U \cap S) = V_1 \times \{0\}$$

for some open sets $V_1 \subset \mathbb{R}^k$, $V_2 \subset \mathbb{R}^{n-k}$. Show that we may arrange for $V_1$ and $V_2$ to both be Euclidean balls or cubes. (This problem should be easy. Experienced readers will likely see it as merely an observation.)

(17) Show that $\pi_1(M \times N, (p,q))$ is isomorphic to $\pi_1(M,p) \times \pi_1(N,q)$.

(18) Suppose that $M = U \cup V$, where $U$ and $V$ are simply connected and open. Show that if $U \cap V$ is path connected, then $M$ is simply connected.

(19) Prove the three properties about maps involving the model half-spaces $\mathbb{R}^n_{\lambda \geq c}$ listed in Section 1.8.
Figure 1.9. Smoothly connecting manifolds
(20) Let $M$ and $N$ be smooth $n$-manifolds with boundaries $\partial M$ and $\partial N$. Let $P$ be a smooth manifold diffeomorphic to both $\partial M$ and $\partial N$ via maps $\alpha$ and $\beta$. Suppose that there are open neighborhoods $U$ and $V$ of $\partial M$ and $\partial N$ respectively and diffeomorphisms

$U \to P \times [0,1)$,
$V \to P \times (-1,0]$

such that ...

... $= \{u \in \mathbb{R}^n : u^i > 0 \text{ for } i = 1, 2, \ldots, n\}$,
$\mathbb{R}^n_+ = \{u \in \mathbb{R}^n : u^i \geq 0 \text{ for } i = 1, 2, \ldots, n\}$.
A boundary point of $\mathbb{R}^n_+$ is a point such that at least one of its coordinates $u^i$ is $0$. A corner point of $\mathbb{R}^n_+$ is a point such that at least two of its coordinates $u^i, u^j$ are $0$. We consider a set $M$. An $\mathbb{R}^n_+$-valued chart on $M$ is a pair $(U, \mathrm{x})$, where $U \subset M$ and $\mathrm{x} : U \to \mathrm{x}(U)$ is a bijection onto an open subset $\mathrm{x}(U)$ of $\mathbb{R}^n_+$, where the latter has the relative topology as a subset of $\mathbb{R}^n$. A smooth atlas for $M$ is a family $\{(U_\alpha, \mathrm{x}_\alpha)\}_{\alpha \in A}$ of $\mathbb{R}^n_+$-valued charts whose domains cover $M$ and such that whenever $U_\alpha \cap U_\beta$ is nonempty, the composite map

$$\mathrm{x}_\beta \circ \mathrm{x}_\alpha^{-1} : \mathrm{x}_\alpha(U_\alpha \cap U_\beta) \to \mathrm{x}_\beta(U_\alpha \cap U_\beta)$$

is smooth. A maximal atlas of $\mathbb{R}^n_+$-valued charts of this type gives $M$ the structure of a smooth manifold with corners (of dimension $n$). We also use the terminology $n$-manifold with corners.

(a) Suppose that $p \in M$ and $\mathrm{x}_\alpha(p)$ is a corner point in $\mathbb{R}^n_+$. Show that if $p$ is in the domain of another chart $(U_\beta, \mathrm{x}_\beta)$ in the atlas (as above), then $\mathrm{x}_\beta(p)$ is also a corner point. Use this to define "corner points" on $M$. Do the same for "boundary points". Thus the boundary contains the set of corner points. Explain why a manifold with corners whose set of corner points is empty is a manifold with (possibly empty) boundary.

(b) Define the notion of smooth functions on a manifold with corners and the notion of smooth maps between manifolds with corners.

(c) Show that the boundary of a manifold with corners is not necessarily a manifold with corners.

(22) Prove Theorem 1.113 and its corollary.
Chapter 2
The Tangent Structure
In this chapter we introduce the notions of tangent space and cotangent space of a smooth manifold. The union of the tangent spaces of a given manifold will be given a smooth structure making this union a manifold in its own right, called the tangent bundle. Similarly we introduce the cotangent bundle of a smooth manifold. We then discuss vector fields and their integral curves together with the associated dynamic notions of Lie derivative and Lie bracket. Finally, we define and discuss the notion of a 1-form (or covector field), which is the notion dual to the notion of a vector field. One can integrate 1-forms along curves. Such an integration is called a line integral. We explore the concept of exact 1-forms and nonexact 1-forms and their relation to the question of path independence of line integrals.
2.1. The Tangent Space
If $c : (-\epsilon, \epsilon) \to \mathbb{R}^N$ is a smooth curve, then it is common to visualize the "velocity vector" $\dot{c}(0)$ as being based at the point $p = c(0)$. It is often desirable to explicitly form a separate $N$-dimensional vector space for each point $p$, whose elements are to be thought of as being based at $p$. One way to do this is to use $\{p\} \times \mathbb{R}^N$ so that a tangent vector based at $p$ is taken to be a pair $(p, v)$ where $v \in \mathbb{R}^N$. The set $\{p\} \times \mathbb{R}^N$ inherits a vector space structure from $\mathbb{R}^N$ in the obvious way. In this context, we provisionally denote $\{p\} \times \mathbb{R}^N$ by $T_p\mathbb{R}^N$ and refer to it as the tangent space at $p$. If we write $c(t) = (x^1(t), \dots, x^N(t))$, then the velocity vector of a curve $c$ at time$^1$ $t = 0$ is $\left(p, \frac{dx^1}{dt}(0), \dots, \frac{dx^N}{dt}(0)\right)$, which is based at $p = c(0)$. Ambiguously, both $\left(p, \frac{dx^1}{dt}(0), \dots, \frac{dx^N}{dt}(0)\right)$ and $\left(\frac{dx^1}{dt}(0), \dots, \frac{dx^N}{dt}(0)\right)$ are often denoted by $\dot{c}(0)$ or $c'(0)$. A bit more generally, if $V$ is a finite-dimensional vector space, then $V$ is a smooth manifold and the tangent space at $p \in V$ can be provisionally taken to be the set $\{p\} \times V$. We use the notation $v_p := (p, v)$. If $v_p := (p, v)$ is a tangent vector at $p$, then $v$ is called the principal part of $v_p$.
$^1$ It is common to refer to the parameter $t$ for a curve as "time", although it may have nothing to do with physical time in a given situation.
We have a natural isomorphism between $\mathbb{R}^N$ and $T_p\mathbb{R}^N$ given by $v \mapsto (p, v)$, for any $p$. Of course we also have a natural isomorphism $T_p\mathbb{R}^N \cong T_q\mathbb{R}^N$ for any pair of points given by $(p, v) \mapsto (q, v)$. This is sometimes referred to as distant parallelism. Here we see the reason that in the context of calculus on $\mathbb{R}^N$, the explicit construction of vectors based at a point is often deemed unnecessary. However, from the point of view of manifold theory, the tangent space at a point is a fundamental construction. We will define the notion of a tangent space at a point of a differentiable manifold, and it will be seen that there is, in general, no canonical way to identify tangent spaces at different points. Actually, we shall give several (ultimately equivalent) definitions of the tangent space. Let us start with the special case of a submanifold of $\mathbb{R}^N$. A tangent vector at $p$ can be variously thought of as the velocity of a curve, as a direction for a directional derivative, and also as a geometric object which has components that depend in a special way on the coordinates used. Let us explore these aspects in the case of a submanifold of $\mathbb{R}^N$.
If $M$ is an $n$-dimensional regular submanifold of $\mathbb{R}^N$, then a smooth curve $c : (-\epsilon, \epsilon) \to M$ is also a smooth curve into $\mathbb{R}^N$ and $\dot{c}(0)$ is normally thought of as a vector based at the point $p = c(0)$. This vector is tangent to $M$. The set of all vectors obtained in this way from curves into $M$ is an $n$-dimensional subspace of the tangent space of $\mathbb{R}^N$ at $p$ (described above). In this special case, this subspace could play the role of the tangent space of $M$ at $p$. Let us tentatively accept this definition of the tangent space at $p$ and denote it by $T_pM$. Let $v_p := (p, v) \in T_pM$. There are three things we should notice about $v_p$.
First, there are many different curves $c : (-\epsilon, \epsilon) \to M$ with $c(0) = p$ which all give the same tangent vector $v_p$, and there is an obvious equivalence relation among these curves: two curves passing through $p$ at $t = 0$ are equivalent if they have the same velocity vector. Already one can see that perhaps this could be turned around so that we can think of a tangent vector as an equivalence class of curves. Curves would be equivalent if they agree infinitesimally in some appropriate sense.
The second thing that we wish to bring out is that a tangent vector can be used to construct a directional derivative operator. If $v_p = (p, v)$ is a tangent vector in $T_p\mathbb{R}^N$, then we have a directional derivative operator at $p$ which is a map $C^\infty(\mathbb{R}^N) \to \mathbb{R}$ given by $f \mapsto Df(p)v$. Now if $v_p$ is tangent to $M$, we would like a similar map $C^\infty(M) \to \mathbb{R}$. If $f$ is only defined on $M$, then we do not have $Df$ to work with but we can just take our directional derivative to be the map given by
$$D_{v_p} : f \mapsto (f \circ c)'(0),$$
where $c : I \to M$ is any curve whose velocity at $t = 0$ is $v_p$. Later we use the abstract properties of such a directional derivative to actually define the notion of a tangent vector.
Finally, notice how $v_p$ relates to charts for the submanifold. If $(U, y)$ is a chart on $M$ with $p \in U$, then by inverting we obtain a map $y^{-1} : V \to M$, which we may then think of as a map into the ambient space $\mathbb{R}^N$. The map $y^{-1}$ parameterizes a portion of $M$. For convenience, let us suppose that $y^{-1}(0) = p$. Then we have the "coordinate curves" $y^i \mapsto y^{-1}(0, \dots, y^i, \dots, 0)$ for $i = 1, \dots, n$. The resulting tangent vectors $E_i$ at $p$ have principal parts given by the partial derivatives so that
$$E_i := \left(p, \frac{\partial y^{-1}}{\partial y^i}(0)\right).$$
It can be shown that $(E_1, \dots, E_n)$ is a basis for $T_pM$. For another coordinate system $\bar{y}$ with $\bar{y}^{-1}(0) = p$, we similarly define a basis $(\bar{E}_1, \dots, \bar{E}_n)$. If $v_p = \sum_{i=1}^n a^i E_i = \sum_{i=1}^n \bar{a}^i \bar{E}_i$, then letting $a = (a^1, \dots, a^n)$ and $\bar{a} = (\bar{a}^1, \dots, \bar{a}^n)$, the chain rule can be used to show that
$$\bar{a} = D(\bar{y} \circ y^{-1})\big|_{y(p)}\, a,$$
which is classically written as
$$\bar{a}^i = \sum_{j=1}^n \frac{\partial \bar{y}^i}{\partial y^j}\, a^j.$$
Both $(a^1, \dots, a^n)$ and $(\bar{a}^1, \dots, \bar{a}^n)$ represent the tangent vector $v_p$, but with respect to different charts. This is a simple example of a transformation law.
The various definitions for the notion of a tangent vector given below in the general setting will be based in turn on the following three ideas:
(1) Equivalence classes of curves through a point.
(2) Transformation laws for the components of a tangent vector with respect to various charts.
(3) The idea of a "derivation" which is a kind of abstract directional derivative.
Of course we will also have to show how to relate these various definitions to see that they are really equivalent.
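The transformation law can be checked numerically in a simple case. The following is only a sketch under stated assumptions: the manifold is the right half-plane in $\mathbb{R}^2$, the first chart is the Cartesian one, the second is polar coordinates, and the curve `curve` is a hypothetical example chosen for illustration. The components of the velocity are computed in each chart by central differences and compared with the prediction $\bar{a} = D(\bar{y} \circ y^{-1})\,a$.

```python
import math

def to_polar(x, y):
    """Second chart: polar coordinates (r, theta) on the right half-plane."""
    return (math.hypot(x, y), math.atan2(y, x))

def jacobian_polar(x, y):
    """Matrix of partial derivatives d(r, theta)/d(x, y) at the point (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def curve(t):
    """A hypothetical smooth curve with curve(0) = p = (1, 1)."""
    return (1.0 + 2.0 * t, 1.0 - t + t * t)

def velocity(path, h=1e-6):
    """Central-difference approximation of the velocity of `path` at t = 0."""
    ahead, behind = path(h), path(-h)
    return [(a - b) / (2 * h) for a, b in zip(ahead, behind)]

# Components of the tangent vector in the Cartesian chart.
a = velocity(curve)
# Components in the polar chart, computed directly from the curve.
a_bar_direct = velocity(lambda t: to_polar(*curve(t)))
# Components predicted by the transformation law (Jacobian times a).
J = jacobian_polar(1.0, 1.0)
a_bar_law = [sum(J[i][j] * a[j] for j in range(2)) for i in range(2)]
```

Up to discretization error, `a_bar_direct` and `a_bar_law` agree, which is the statement that both tuples represent the same tangent vector with respect to different charts.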
2.1.1. Tangent space via curves. Let $p$ be a point in a smooth $n$-manifold $M$. Suppose that we have smooth curves $c_1$ and $c_2$ mapping into $M$, each with open interval domains containing $0 \in \mathbb{R}$ and with $c_1(0) = c_2(0) = p$. We say that $c_1$ is tangent to $c_2$ at $p$ if for all smooth real-valued functions $f$ defined on an open neighborhood of $p$, we have $(f \circ c_1)'(0) = (f \circ c_2)'(0)$. This is an equivalence relation on the set of all such curves. The reader should check that this really is an equivalence relation and also do so when we introduce other simple equivalence relations later. Define a tangent vector at $p$ to be an equivalence class under this relation.
Notation 2.1. The equivalence class of $c$ will be denoted by $[c]$, but we also denote tangent vectors by notation such as $v_p$ or $X_p$, etc. Eventually we will often denote tangent vectors simply as $v$, $w$, etc., but for the discussion to follow we reserve these letters without the subscript for elements of $\mathbb{R}^n$ for some $n$.
If $v_p = [c]$ then we will also write $\dot{c}(0) = v_p$. The tangent space $T_pM$ is defined to be the set of all tangent vectors at $p \in M$. A simple cut-off function argument shows that $c_1$ is equivalent to $c_2$ if and only if $(f \circ c_1)'(0) = (f \circ c_2)'(0)$ for all globally defined smooth functions $f : M \to \mathbb{R}$.
Lemma 2.2. $c_1$ is tangent to $c_2$ at $p$ if and only if $(f \circ c_1)'(0) = (f \circ c_2)'(0)$ for all $\mathbb{R}^k$-valued functions $f$ defined on an open neighborhood of $p$.
Proof. If $f = (f^1, \dots, f^k)$, then $(f \circ c_1)'(0) = (f \circ c_2)'(0)$ if and only if $(f^i \circ c_1)'(0) = (f^i \circ c_2)'(0)$ for $i = 1, \dots, k$. Thus $(f \circ c_1)'(0) = (f \circ c_2)'(0)$ if $c_1$ is tangent to $c_2$ at $p$. Conversely, let $g$ be a smooth real-valued function defined on an open neighborhood of $p$ and consider the map $f = (g, 0, \dots, 0)$. Then the equality $(f \circ c_1)'(0) = (f \circ c_2)'(0)$ implies that $(g \circ c_1)'(0) = (g \circ c_2)'(0)$. $\square$
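The tangency relation can be explored numerically. The sketch below assumes $M = \mathbb{R}^2$ and $p = 0$; the curves `c1`, `c2`, `c3` and the list of test functions are hypothetical choices made for illustration. A finite list of test functions cannot prove tangency, since the definition quantifies over all smooth $f$, but it does show how two different curves with the same velocity give the same derivatives, while a curve with a different velocity is detected by a single coordinate function.

```python
import math

def c1(t):
    """Parabola through the origin with velocity (1, 0) at t = 0."""
    return (t, t * t)

def c2(t):
    """A different curve through the origin, also with velocity (1, 0)."""
    return (math.sin(t), 0.0)

def c3(t):
    """A curve through the origin with velocity (2, 0), not tangent to c1."""
    return (2.0 * t, 0.0)

def derivative_along(f, c, h=1e-5):
    """Central-difference approximation of (f o c)'(0)."""
    return (f(*c(h)) - f(*c(-h))) / (2 * h)

test_functions = [
    lambda x, y: x,
    lambda x, y: y,
    lambda x, y: math.exp(x) * math.cos(y),
    lambda x, y: x * y + 3.0 * x - y * y,
]

# Differences between (f o c1)'(0) and (f o c2)'(0) for each test function.
gaps = [abs(derivative_along(f, c1) - derivative_along(f, c2))
        for f in test_functions]
# The coordinate function x already separates c1 from c3.
gap_c3 = abs(derivative_along(test_functions[0], c1)
             - derivative_along(test_functions[0], c3))
```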
The definition of tangent space just given is very geometric, but it has one disadvantage. Namely, it is not immediately obvious that $T_pM$ is a vector space in a natural way. The following principle is used to obtain a vector space structure:
Proposition 2.3 (Consistent transfer of linear structure). Suppose that $S$ is a set and $\{V_\alpha\}_{\alpha \in A}$ is a family of $n$-dimensional vector spaces. Suppose that for each $\alpha$ we have a bijection $b_\alpha : V_\alpha \to S$. If for every $\alpha, \beta \in A$ the map $b_\beta^{-1} \circ b_\alpha : V_\alpha \to V_\beta$ is a linear isomorphism, then there is a unique vector space structure on the set $S$ such that each $b_\alpha$ is a linear isomorphism.
Proof. Define addition in $S$ by $s_1 + s_2 := b_\alpha\left(b_\alpha^{-1}(s_1) + b_\alpha^{-1}(s_2)\right)$. This definition is independent of the choice of $\alpha$. Indeed, since $b_\alpha^{-1} \circ b_\beta$ is linear,
$$\begin{aligned}
b_\alpha\left(b_\alpha^{-1}(s_1) + b_\alpha^{-1}(s_2)\right)
&= b_\alpha\left[b_\alpha^{-1} \circ b_\beta \circ b_\beta^{-1}(s_1) + b_\alpha^{-1} \circ b_\beta \circ b_\beta^{-1}(s_2)\right] \\
&= b_\alpha \circ b_\alpha^{-1} \circ b_\beta\left[b_\beta^{-1}(s_1) + b_\beta^{-1}(s_2)\right] \\
&= b_\beta\left(b_\beta^{-1}(s_1) + b_\beta^{-1}(s_2)\right).
\end{aligned}$$
The definition of scalar multiplication is $a \cdot s := b_\alpha\left(a\, b_\alpha^{-1}(s)\right)$, and this is shown to be independent of $\alpha$ in a similar way. The axioms of a vector space are satisfied precisely because they are satisfied by each $V_\alpha$. $\square$
We will use the above proposition to show that there is a natural vector space structure on $T_pM$. For every chart $(U_\alpha, x_\alpha)$ with $p \in U_\alpha$, we have a map $b_\alpha : \mathbb{R}^n \to T_pM$ given by $v \mapsto [\gamma_v]$, where $\gamma_v : t \mapsto x_\alpha^{-1}(x_\alpha(p) + tv)$ for $t$ in a sufficiently small but otherwise irrelevant interval containing $0$.
Lemma 2.4. For each chart $(U_\alpha, x_\alpha)$, the map $b_\alpha : \mathbb{R}^n \to T_pM$ is a bijection and $b_\beta^{-1} \circ b_\alpha = D(x_\beta \circ x_\alpha^{-1})(x_\alpha(p))$.
Proof. We have
$$(x_\alpha \circ \gamma_v)'(0) = \frac{d}{dt}\Big|_{t=0} x_\alpha \circ x_\alpha^{-1}(x_\alpha(p) + tv) = \frac{d}{dt}\Big|_{t=0} (x_\alpha(p) + tv) = v.$$
Suppose that $[\gamma_v] = [\gamma_w]$ for $v, w \in \mathbb{R}^n$. Then by Lemma 2.2 we have
$$v = (x_\alpha \circ \gamma_v)'(0) = (x_\alpha \circ \gamma_w)'(0) = w.$$
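This means that $b_\alpha$ is injective. Next we show that $b_\alpha$ is surjective. Let $[c] \in T_pM$ be represented by $c : (-\epsilon, \epsilon) \to M$. Let $v := (x_\alpha \circ c)'(0) \in \mathbb{R}^n$. Then we have $b_\alpha(v) = [\gamma_v]$, where $\gamma_v : t \mapsto x_\alpha^{-1}(x_\alpha(p) + tv)$. But $[\gamma_v] = [c]$ since for any smooth $f$ defined near $p$ we have
$$\begin{aligned}
(f \circ \gamma_v)'(0) &= \frac{d}{dt}\Big|_{t=0} f \circ x_\alpha^{-1}(x_\alpha(p) + tv) = D(f \circ x_\alpha^{-1})(x_\alpha(p)) \cdot v \\
&= D(f \circ x_\alpha^{-1})(x_\alpha(p)) \cdot (x_\alpha \circ c)'(0) = (f \circ c)'(0).
\end{aligned}$$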
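Thus $b_\alpha$ is surjective. From Lemma 2.2 we see that the map $[c] \mapsto (x_\alpha \circ c)'(0)$ is well-defined, and from the above we see that this map is exactly $b_\alpha^{-1}$. Thus
$$b_\beta^{-1} \circ b_\alpha(v) = \frac{d}{dt}\Big|_{t=0} x_\beta \circ x_\alpha^{-1}(x_\alpha(p) + tv) = D(x_\beta \circ x_\alpha^{-1})(x_\alpha(p))\, v. \qquad \square$$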
The above lemma and proposition combine to provide a vector space structure on the set of tangent vectors. Let us temporarily call the tangent space defined above the kinematic tangent space and denote it by $(T_pM)_{\mathrm{kin}}$. Thus, if $C_p$ is the set of smooth curves $c$ defined on some open interval containing $0$ such that $c(0) = p$, then
$$(T_pM)_{\mathrm{kin}} = C_p / \sim,$$
where the equivalence is as described above.
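Proposition 2.3 can be illustrated on a toy example that is not taken from the text. Assume $S = (0, \infty)$ and two hypothetical bijections from the vector space $\mathbb{R}$ onto $S$, namely $b_1(v) = e^{v}$ and $b_2(v) = e^{2v}$. The transition map $b_2^{-1} \circ b_1 : v \mapsto v/2$ is linear, so both bijections should induce the same addition and scalar multiplication on $S$ (here, "addition" turns out to be ordinary multiplication and "scaling by $a$" is raising to the power $a$).

```python
import math

# Two hypothetical bijections from the vector space R onto S = (0, infinity).
def b1(v):
    return math.exp(v)

def b1_inv(s):
    return math.log(s)

def b2(v):
    return math.exp(2.0 * v)

def b2_inv(s):
    return 0.5 * math.log(s)

def add_via(b, b_inv, s1, s2):
    """Transferred addition: s1 + s2 := b(b^-1(s1) + b^-1(s2))."""
    return b(b_inv(s1) + b_inv(s2))

def scale_via(b, b_inv, a, s):
    """Transferred scalar multiplication: a . s := b(a * b^-1(s))."""
    return b(a * b_inv(s))

s1, s2, a = 2.0, 5.0, 3.0
sum_1 = add_via(b1, b1_inv, s1, s2)
sum_2 = add_via(b2, b2_inv, s1, s2)
scaled_1 = scale_via(b1, b1_inv, a, s1)
scaled_2 = scale_via(b2, b2_inv, a, s1)
```

Both choices of bijection give the same results up to rounding, as the independence argument in the proof predicts.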
Exercise 2.5. Let $c_1$ and $c_2$ be smooth curves mapping into a smooth manifold $M$, each with open interval domains containing $0 \in \mathbb{R}$ and with $c_1(0) = c_2(0) = p$. Show that
$$(f \circ c_1)'(0) = (f \circ c_2)'(0)$$
for all smooth $f$ if and only if the curves $x \circ c_1$ and $x \circ c_2$ have the same velocity vector in $\mathbb{R}^n$ for some and hence any chart $(U, x)$.
2.1.2. Tangent space via charts. Let $\mathcal{A}$ be the maximal atlas for an $n$-manifold $M$. For fixed $p \in M$, consider the set $\Gamma_p$ of all triples $(p, v, (U, x)) \in \{p\} \times \mathbb{R}^n \times \mathcal{A}$ such that $p \in U$. Define an equivalence relation on $\Gamma_p$ by requiring that $(p, v, (U, x)) \sim (p, w, (V, y))$ if and only if
$$w = D(y \circ x^{-1})\big|_{x(p)} \cdot v. \tag{2.1}$$
In other words, the derivative at $x(p)$ of the coordinate change $y \circ x^{-1}$ "identifies" $v$ with $w$. The set $\Gamma_p/\sim$ of equivalence classes can be given a vector space structure as follows: For each chart $(U, x)$ containing $p$, we have a map $b_{(U,x)} : \mathbb{R}^n \to \Gamma_p/\sim$ given by $v \mapsto [p, v, (U, x)]$, where $[p, v, (U, x)]$ denotes the equivalence class of $(p, v, (U, x))$. To see that this map is a bijection, notice that if $[p, v, (U, x)] = [p, w, (U, x)]$, then
$$v = D(x \circ x^{-1})\big|_{x(p)} \cdot v = w$$
by definition. By Proposition 2.3 we obtain a vector space structure on $\Gamma_p/\sim$ whose elements are tangent vectors. This is another version of the tangent space at $p$, and we shall (temporarily) denote this by $(T_pM)_{\mathrm{phys}}$. The subscript "phys" refers to the fact that this version of the tangent space is based on a "transformation law" and corresponds to a way of looking at things that has traditionally been popular among physicists. If $v_p = [p, v, (U, x)] \in (T_pM)_{\mathrm{phys}}$, then we say that $v \in \mathbb{R}^n$ represents $v_p$ with respect to the chart $(U, x)$.
This viewpoint takes on a more familiar appearance if we use a more classical notation. Let $(U, x)$ and $(V, y)$ be two charts containing $p$ in their domains. If an $n$-tuple $(v^1, \dots, v^n)$ represents a tangent vector at $p$ from the point of view of $(U, x)$, and if the $n$-tuple $(w^1, \dots, w^n)$ represents the same vector from the point of view of $(V, y)$, then (2.1) is expressed in the form
$$w^i = \sum_{j=1}^n \frac{\partial y^i}{\partial x^j}\Big|_{x(p)} v^j, \tag{2.2}$$
where we write the change of coordinates as $y^i = y^i(x^1, \dots, x^n)$ with $1 \le i \le n$.
Notation 2.6. It is sometimes convenient to index the maximal atlas: $\mathcal{A} = \{(U_\alpha, x_\alpha)\}_{\alpha \in A}$. Then we would consider triples of the form $(p, v, \alpha)$ and let the defining equivalence relation for $(T_pM)_{\mathrm{phys}}$ be $(p, v, \alpha) \sim (p, w, \beta)$ if and only if $D(x_\beta \circ x_\alpha^{-1})\big|_{x_\alpha(p)} \cdot v = w$.
2.1.3. Tangent space via derivations. We abstract the notion of directional derivative for our next approach to the tangent space. There are actually at least two common versions of this approach, and we explain both. Let $M$ be a smooth manifold of dimension $n$. A tangent vector $v_p$ at $p$ is a linear map $v_p : C^\infty(M) \to \mathbb{R}$ with the property that for $f, g \in C^\infty(M)$,
$$v_p(fg) = g(p)\, v_p(f) + f(p)\, v_p(g).$$
This is the Leibniz law. We may say that a tangent vector at $p$ is a derivation of the algebra $C^\infty(M)$ with respect to the evaluation map $\mathrm{ev}_p$ at $p$ defined by $\mathrm{ev}_p(f) := f(p)$. Alternatively, we say that $v_p$ is a derivation at $p$. The set of such derivations at $p$ is easily seen to be a vector space which is called the tangent space at $p$ and is denoted by $T_pM$. We temporarily distinguish this version of the tangent space from $(T_pM)_{\mathrm{kin}}$ and $(T_pM)_{\mathrm{phys}}$ defined previously by denoting it $(T_pM)_{\mathrm{alg}}$ and referring to it as the algebraic tangent space. We could also consider the vector space of derivations of $C^r(M)$ at a point for $r < \infty$, but this would not give a finite-dimensional vector space and so is not a good candidate for the definition of the tangent space (see Problem 18).
Recall that if $(U, x)$ is a chart on an $n$-manifold $M$, we have defined $\partial f/\partial x^i$ by
$$\frac{\partial f}{\partial x^i}(p) := D_i(f \circ x^{-1})(x(p))$$
(see Definition 1.57).
Definition 2.7. Given $(U, x)$ and $p$ as above, define the operator $\frac{\partial}{\partial x^i}\big|_p : C^\infty(M) \to \mathbb{R}$ by
$$\frac{\partial}{\partial x^i}\Big|_p f := \frac{\partial f}{\partial x^i}(p).$$
It is often helpful to use the easily verified fact that if $c_i : (-\epsilon, \epsilon) \to M$ is the curve defined for sufficiently small $\epsilon$ by
$$c_i(t) := x^{-1}(x(p) + t e_i),$$
where $e_i$ is the $i$-th member of the standard basis of $\mathbb{R}^n$, then
$$\frac{\partial}{\partial x^i}\Big|_p f = \lim_{h \to 0} \frac{f(c_i(h)) - f(p)}{h}.$$
From the usual product rule it follows that $\frac{\partial}{\partial x^i}\big|_p$ is a derivation at $p$ and so is an element of $(T_pM)_{\mathrm{alg}}$. We will show that $\left(\frac{\partial}{\partial x^1}\big|_p, \dots, \frac{\partial}{\partial x^n}\big|_p\right)$ is a basis for the vector space $(T_pM)_{\mathrm{alg}}$.
Lemma 2.8. Let $v_p \in (T_pM)_{\mathrm{alg}}$. Then
(i) if $f, g \in C^\infty(M)$ are equal on some neighborhood of $p$, then $v_p(f) = v_p(g)$;
(ii) if $h \in C^\infty(M)$ is constant on some neighborhood of $p$, then $v_p(h) = 0$.
Proof. (i) Since $v_p$ is a linear map, it suffices to show that if $f = 0$ on a neighborhood $U$ of $p$, then $v_p(f) = 0$. Of course $v_p(0) = 0$. Let $\beta$ be a cut-off function with support in $U$ and $\beta(p) = 1$. Then we have that $\beta f$ is identically zero and so
$$0 = v_p(\beta f) = f(p)\, v_p(\beta) + \beta(p)\, v_p(f) = v_p(f)$$
(since $\beta(p) = 1$ and $f(p) = 0$).
(ii) From what we have just shown, it suffices to assume that $h$ is equal to a constant $c$ globally on $M$. In the special case $c = 1$, we have
$$v_p(1) = v_p(1 \cdot 1) = 1 \cdot v_p(1) + 1 \cdot v_p(1) = 2 v_p(1),$$
so that $v_p(1) = 0$. Finally we have $v_p(c) = v_p(1c) = c\,(v_p(1)) = 0$. $\square$
Notation 2.9. We shall often write $v_p f$ or $v_p \cdot f$ in place of $v_p(f)$.
We must now deal with a technical issue. We anticipate that the action of a derivation is really a differentiation and so it seems that a derivation at $p$ should be able to act on a function defined only in some neighborhood $U$ of $p$. It is pretty easy to see how this would work for $\frac{\partial}{\partial x^i}\big|_p$. But the domain of a derivation as defined is the ring $C^\infty(M)$ and not $C^\infty(U)$. There is nothing in the definition that immediately allows an element of $(T_pM)_{\mathrm{alg}}$ to act on $C^\infty(U)$ unless $U = M$. It turns out that we can in fact identify $(T_pU)_{\mathrm{alg}}$ with $(T_pM)_{\mathrm{alg}}$, and the following discussion shows how this is done. Once we reach a fuller understanding of the tangent space, this identification will be natural and automatic.
So, let $p \in U \subset M$ with $U$ open. We construct a rather obvious map $\Phi : (T_pU)_{\mathrm{alg}} \to (T_pM)_{\mathrm{alg}}$ by using the restriction map $C^\infty(M) \to C^\infty(U)$. For each $w_p \in T_pU$, we define $\tilde{w}_p : C^\infty(M) \to \mathbb{R}$ by $\tilde{w}_p(f) := w_p(f|_U)$. It is easy to show that $\tilde{w}_p$ is a derivation of the appropriate type and so $\tilde{w}_p \in (T_pM)_{\mathrm{alg}}$. Thus we get a linear map $\Phi : (T_pU)_{\mathrm{alg}} \to (T_pM)_{\mathrm{alg}}$. We want to show that this map is an isomorphism, but notice that we have not yet established the finite-dimensionality of either $(T_pU)_{\mathrm{alg}}$ or $(T_pM)_{\mathrm{alg}}$. First we show that $\Phi : w_p \mapsto \tilde{w}_p$ has trivial kernel. So suppose that $\tilde{w}_p = 0$, i.e. $\tilde{w}_p(f) = 0$ for all $f \in C^\infty(M)$. Let $h \in C^\infty(U)$. Pick a cut-off function $\beta$ with support in $U$ so that $\beta h$ extends by zero to a smooth function $f$ on all of $M$ that agrees with $h$ on a neighborhood of $p$. Then by the above lemma, $w_p(h) = w_p(f|_U) = \tilde{w}_p(f) = 0$. Thus, since $h$ was arbitrary, we see that $w_p = 0$ and so $\Phi$ has trivial kernel. [...] In either case the formula is the same:
$$\frac{\partial}{\partial x^i}\Big|_p f = \frac{\partial (f \circ x^{-1})}{\partial u^i}(x(p)).$$
Notice that agreeing on a neighborhood of a point is an important relation here and this provides motivation for employing the notion of a germ of a function (Definition 1.68). First we establish the basis theorem:
Theorem 2.10. Let $M$ be an $n$-manifold and $(U, x)$ a chart with $p \in U$. Then the $n$-tuple of vectors (derivations) $\left(\frac{\partial}{\partial x^1}\big|_p, \dots, \frac{\partial}{\partial x^n}\big|_p\right)$ is a basis for $(T_pM)_{\mathrm{alg}}$. Furthermore, for each $v_p \in (T_pM)_{\mathrm{alg}}$ we have
$$v_p = \sum_{i=1}^n v_p(x^i)\, \frac{\partial}{\partial x^i}\Big|_p.$$
Proof. From our discussion above we may assume that $x(U)$ is a convex set such as a ball of radius $\epsilon$ in $\mathbb{R}^n$. By composing with a translation we assume that $x(p) = 0$. This makes no difference for what we wish to prove since $v_p$ applied to a constant is $0$. For any smooth function $g$ defined on the convex set $x(U)$ let
$$g_i(u) := \int_0^1 \frac{\partial g}{\partial u^i}(tu)\, dt \quad \text{for all } u \in x(U).$$
The fundamental theorem of calculus can be used to show that $g = g(0) + \sum g_i u^i$. We see that $g_i(0) = \frac{\partial g}{\partial u^i}\big|_0$. For a function $f \in C^\infty(U)$, we let $g := f \circ x^{-1}$ and $f_i := g_i \circ x$. Using the above, we arrive at the expression $f = f(p) + \sum f_i x^i$, and applying $\frac{\partial}{\partial x^i}\big|_p$ we get $f_i(p) = \frac{\partial f}{\partial x^i}\big|_p$. Now apply the derivation $v_p$ to $f = f(p) + \sum f_i x^i$ to obtain
$$\begin{aligned}
v_p f &= 0 + \sum v_p(f_i x^i) \\
&= \sum v_p(x^i)\, f_i(p) + \sum 0 \cdot v_p f_i \\
&= \sum v_p(x^i)\, \frac{\partial f}{\partial x^i}\Big|_p.
\end{aligned}$$
This shows that $v_p = \sum v_p(x^i)\, \frac{\partial}{\partial x^i}\big|_p$ and thus we have a spanning set.
To see that $\left(\frac{\partial}{\partial x^1}\big|_p, \dots, \frac{\partial}{\partial x^n}\big|_p\right)$ is a linearly independent set, let us assume that $\sum a^i \frac{\partial}{\partial x^i}\big|_p = 0$ (the zero derivation). Applying this to $x^j$ gives $0 = \sum a^i \frac{\partial x^j}{\partial x^i}\big|_p = \sum a^i \delta^j_i = a^j$, and since $j$ was arbitrary, we get the result. $\square$
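As an illustration of the decomposition $g = g(0) + \sum g_i u^i$ used in the proof, here is a worked example on $\mathbb{R}^2$ with the standard chart. The function $g$ is a hypothetical choice made only for illustration; it does not come from the text.

```latex
% Worked example with an illustrative (assumed) function g on R^2.
\begin{align*}
g(u^1,u^2) &= \sin(u^1) + u^1 u^2, \qquad g(0) = 0,\\
g_1(u) &= \int_0^1 \frac{\partial g}{\partial u^1}(tu)\,dt
        = \int_0^1 \bigl(\cos(t u^1) + t u^2\bigr)\,dt
        = \frac{\sin(u^1)}{u^1} + \frac{u^2}{2}
        \quad (\text{read as } 1 + \tfrac{u^2}{2} \text{ when } u^1 = 0),\\
g_2(u) &= \int_0^1 \frac{\partial g}{\partial u^2}(tu)\,dt
        = \int_0^1 t u^1\,dt = \frac{u^1}{2},\\
g(0) + g_1(u)\,u^1 + g_2(u)\,u^2
       &= \sin(u^1) + \tfrac12 u^1 u^2 + \tfrac12 u^1 u^2 = g(u),\\
g_1(0) &= 1 = \frac{\partial g}{\partial u^1}\Big|_0, \qquad
g_2(0) = 0 = \frac{\partial g}{\partial u^2}\Big|_0 .
\end{align*}
```

For a derivation $v_p$ at $p = 0$, the theorem then gives $v_p g = v_p(u^1) \cdot 1 + v_p(u^2) \cdot 0 = v_p(u^1)$, in agreement with the formula $v_p = \sum v_p(x^i)\, \frac{\partial}{\partial x^i}\big|_p$.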
Remark 2.11. On the manifold $\mathbb{R}^n$, we have the identity map $\mathrm{id} : \mathbb{R}^n \to \mathbb{R}^n$ which gives the standard chart. As is often the case, the simplest situations have the most confusing notation because of the various identifications that may exist. On $\mathbb{R}$, there is one coordinate function, which we often denote by either $u$ or $t$. This single function is just $\mathrm{id}_{\mathbb{R}}$. The basis vector at $t_0 \in \mathbb{R}$ associated to this coordinate is $\frac{\partial}{\partial u}\big|_{t_0}$ (or $\frac{\partial}{\partial t}\big|_{t_0}$). If we think of the tangent space at $t_0 \in \mathbb{R}$ as being $\{t_0\} \times \mathbb{R}$, then $\frac{\partial}{\partial u}\big|_{t_0}$ is just $(t_0, 1)$. It is also common to denote $\frac{\partial}{\partial u}\big|_{t_0}$ by "1" regardless of the point $t_0$.
Above, we used the notion of a derivation as one way to define a tangent vector. There is a slight variation of this approach that allows us to worry a bit less about the relation between $(T_pU)_{\mathrm{alg}}$ and $(T_pM)_{\mathrm{alg}}$. Let $\mathcal{F}_p = C^\infty_p(M, \mathbb{R})$ be the algebra of germs of functions defined near $p$. Recall that if $f$ is a representative for the equivalence class $[f] \in \mathcal{F}_p$, then we can unambiguously define the value of $[f]$ at $p$ by $[f](p) = f(p)$. Thus we have an evaluation map $\mathrm{ev}_p : \mathcal{F}_p \to \mathbb{R}$.
Definition 2.12. A derivation (with respect to the evaluation map $\mathrm{ev}_p$) of the algebra $\mathcal{F}_p$ is a map $D_p : \mathcal{F}_p \to \mathbb{R}$ such that $D_p([f][g]) = f(p) D_p[g] + g(p) D_p[f]$ for all $[f], [g] \in \mathcal{F}_p$. The set of all these derivations on $\mathcal{F}_p$ is easily seen to be a real vector space and is sometimes denoted by $\mathrm{Der}(\mathcal{F}_p)$.
Remark 2.13. The notational distinction between a function and its germ at a point is not always maintained; $D_p f$ is taken to mean $D_p[f]$. Let $M$ be a smooth manifold of dimension $n$. Consider the set of all germs of $C^\infty$ functions $\mathcal{F}_p$ at $p \in M$. The vector space $\mathrm{Der}(\mathcal{F}_p)$ of derivations of $\mathcal{F}_p$ with respect to the evaluation map $\mathrm{ev}_p$ could also be taken as the definition of the tangent space at $p$. This would be a slight variation of what we have called the algebraic tangent space.
2.2. Interpretations
We will now show how to move from one definition of tangent vector to the next. Let $M$ be a (smooth) $n$-manifold. Consider a tangent vector $v_p$ as an equivalence class of curves represented by $c : I \to M$ with $c(0) = p$. We obtain a derivation by defining
$$v_p f := \frac{d}{dt}\Big|_{t=0} f \circ c.$$
This gives a map $(T_pM)_{\mathrm{kin}} \to (T_pM)_{\mathrm{alg}}$ which can be shown to be an isomorphism. We also have a natural isomorphism $(T_pM)_{\mathrm{kin}} \to (T_pM)_{\mathrm{phys}}$. Given $[c] \in (T_pM)_{\mathrm{kin}}$, we obtain an element $v_p \in (T_pM)_{\mathrm{phys}}$ by letting $v_p$ be the equivalence class of the triple $(p, v, (U, x))$, where $v^i := \frac{d}{dt}\big|_{t=0} x^i \circ c$ for a chart $(U, x)$ with $p \in U$.
If $v_p$ is a derivation at $p$ and $(U, x)$ an admissible chart with domain containing $p$, then $v_p$, as a tangent vector in the sense of Section 2.1.2, is represented by the triple $(p, v, (U, x))$, where $v = (v^1, \dots, v^n)$ is given by
$$v^i = v_p x^i \quad (v_p \text{ is acting as a derivation}).$$
This gives us an isomorphism $(T_pM)_{\mathrm{alg}} \to (T_pM)_{\mathrm{phys}}$. Next we exhibit the inverse isomorphism $(T_pM)_{\mathrm{phys}} \to (T_pM)_{\mathrm{alg}}$. Suppose that $[(p, v, (U, x))] \in (T_pM)_{\mathrm{phys}}$ where $v \in \mathbb{R}^n$. We obtain a derivation by defining $v_p f = D(f \circ x^{-1})\big|_{x(p)} \cdot v$. In other words,
$$v_p f = \sum_{i=1}^n v^i\, \frac{\partial}{\partial x^i}\Big|_p f$$
for $v = (v^1, \dots, v^n)$. It is an easy exercise that $v_p$ defined in this way is independent of the representative triple $(p, v, (U, x))$.
We now adopt the explicitly flexible attitude of interpreting a tangent vector in any of the ways we have described above depending on the situation. Thus we effectively identify the spaces $(T_pM)_{\mathrm{kin}}$, $(T_pM)_{\mathrm{phys}}$ and $(T_pM)_{\mathrm{alg}}$. Henceforth we use the notation $T_pM$ for the tangent space of a manifold $M$ at a point $p$.
Definition 2.14. The dual space to a tangent space $T_pM$ is called the cotangent space and is denoted by $T_p^*M$. An element of $T_p^*M$ is referred to as a covector.
The basis for $T_p^*M$ that is dual to the coordinate basis $\left(\frac{\partial}{\partial x^1}\big|_p, \dots, \frac{\partial}{\partial x^n}\big|_p\right)$ described above is denoted $(dx^1|_p, \dots, dx^n|_p)$. By definition $dx^i|_p\left(\frac{\partial}{\partial x^j}\big|_p\right) = \delta^i_j$. (Note that $\delta^i_j = 1$ if $i = j$ and $\delta^i_j = 0$ if $i \ne j$. The symbols $\delta_{ij}$ and $\delta^{ij}$ are defined similarly.) The reason for the differential notation $dx^i$ will be explained below. Sometimes one abbreviates $\frac{\partial}{\partial x^j}\big|_p$ and $dx^i|_p$ to $\frac{\partial}{\partial x^j}$ and $dx^i$ respectively, but there is some risk of confusion since later $\frac{\partial}{\partial x^j}$ and $dx^i$ will more properly denote not elements of the vector spaces $T_pM$ and $T_p^*M$, but rather fields defined over a chart domain. More on this shortly.
2.2.1. Tangent space of a vector space. Our provisional definition of the tangent space at a point $p$ in a vector space $V$ was the set $\{p\} \times V$, but this set does not immediately fit any of the definitions of tangent space just given. This is remedied by finding a natural isomorphism $\{p\} \times V \cong T_pV$. One may pick a version of the tangent space and then exhibit a natural isomorphism directly, but we take a slightly different approach. Namely, we first define a natural map $j_p : V \to T_pV$. We think in terms of equivalence classes of curves. For each $v \in V$, let $c_{p,v} : \mathbb{R} \to V$ be the curve $c_{p,v}(t) := p + tv$. Then $j_p(v) := [c_{p,v}] \in T_pV$.
As a derivation, $j_p(v)$ acts according to
$$f \mapsto \frac{d}{dt}\Big|_0 f(p + tv).$$
On the other hand, we have the obvious projection $\mathrm{pr}_2 : \{p\} \times V \to V$. Then our natural isomorphism $\{p\} \times V \cong T_pV$ is just $j_p \circ \mathrm{pr}_2$. The isomorphism between the vector spaces $\{p\} \times V$ and $T_pV$ is so natural that they are often identified. Of course, since $T_pV$ itself has various manifestations ($(T_pV)_{\mathrm{alg}}$, $(T_pV)_{\mathrm{phys}}$, and $(T_pV)_{\mathrm{kin}}$), we now have a multitude of spaces which are potentially being identified in the case of a vector space.
Note: Because of the identification of $\{p\} \times V$ with $T_pV$, we shall often denote by $\mathrm{pr}_2$ the map $T_pV \to V$. Furthermore, in certain contexts, $T_pV$ is identified with $V$ itself. The potential identifications introduced here are often referred to as "canonical" or "natural" and $j_p : V \to T_pV$ is often called the canonical or natural isomorphism. The inverse map $T_pV \to V$ is also referred to as the canonical or natural isomorphism. Context will keep things straight.
2.3. The Tangent Map
The first definition given below of the tangent map at $p \in M$ of a smooth map $f : M \to N$ will be considered our main definition, but the others are actually equivalent. Given $f$ and $p$ as above, we wish to define a linear map $T_pf : T_pM \to T_{f(p)}N$. Since we have several definitions of tangent space, we expect to see several equivalent definitions of the tangent map. For the first definition we think of $T_pM$ as $(T_pM)_{\mathrm{kin}}$.
Figure 2.1. Tangent map
Definition 2.15 (Tangent map I). If we have a smooth function between manifolds
$$f : M \to N,$$
and we consider a point $p \in M$ and its image $q = f(p) \in N$, then we define the tangent map at $p$,
$$T_pf : T_pM \to T_qN,$$
in the following way: Suppose that $v_p \in T_pM$ and we pick a curve $c$ with $c(0) = p$ so that $v_p = [c]$; then by definition
$$T_pf \cdot v_p = [f \circ c] \in T_qN,$$
where $[f \circ c] \in T_qN$ is the vector represented by the curve $f \circ c$. (Recall Notation 1.3.)
Another popular way to denote the tangent map $T_pf$ is $f_{*p}$, but a further abbreviation to $f_*$ is dangerous since it conflicts with a related meaning for $f_*$ introduced later.
Exercise 2.16. Let $f : M \to N$ be a smooth map. Show that $T_pf : T_pM \to T_qN$ is a linear map and that if $f$ is a diffeomorphism, then $T_pf$ is a linear isomorphism. [Hint: Think about how the linear structure on $(T_pM)_{\mathrm{kin}}$ was defined.]
We have the following version of the chain rule for tangent maps:
Theorem 2.17. Let $f : M \to N$ and $g : N \to P$ be smooth maps. For each $p \in M$ we have $T_p(g \circ f) = (T_{f(p)}g) \circ T_pf$.
Proof. Let $v \in T_pM$ be represented by the curve $c$ so that $v = [c]$. Then $f \circ c$ represents $T_pf(v)$ and we have
$$T_p(g \circ f)(v) = [(g \circ f) \circ c] = [g \circ (f \circ c)] = (T_{f(p)}g)(T_pf(v)) = \left((T_{f(p)}g) \circ T_pf\right)(v). \qquad \square$$
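The chain rule for tangent maps can be checked numerically using the kinematic description, where $T_pF \cdot v$ is the velocity at $t = 0$ of the curve $t \mapsto F(p + tv)$. The sketch below assumes $M = N = P = \mathbb{R}^2$; the maps `f` and `g`, the point `p` and the vector `v` are hypothetical choices, and velocities are approximated by central differences.

```python
import math

def f(x, y):
    """Hypothetical smooth map f : R^2 -> R^2."""
    return (x * y, x + math.sin(y))

def g(u, w):
    """Hypothetical smooth map g : R^2 -> R^2."""
    return (math.exp(u) + w, u * w)

def tangent_map(F, p, v, h=1e-6):
    """Approximate T_pF . v as the velocity at t = 0 of t -> F(p + t v)."""
    ahead = F(*[pi + h * vi for pi, vi in zip(p, v)])
    behind = F(*[pi - h * vi for pi, vi in zip(p, v)])
    return tuple((a - b) / (2 * h) for a, b in zip(ahead, behind))

p, v = (1.0, 0.5), (0.3, -2.0)

# Left side: tangent map of the composite g o f applied to v.
lhs = tangent_map(lambda x, y: g(*f(x, y)), p, v)
# Right side: tangent map of g at f(p) applied to (T_pf . v).
rhs = tangent_map(g, f(*p), tangent_map(f, p, v))
```

The two tuples `lhs` and `rhs` agree up to discretization error, in line with $T_p(g \circ f) = (T_{f(p)}g) \circ T_pf$.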
For the next alternative definition of tangent map, we consider $T_pM$ as $(T_pM)_{\mathrm{phys}}$.
Definition 2.18 (Tangent map II). Let $f : M \to N$ be a smooth map and consider a point $p \in M$ with image $q = f(p) \in N$. Choose any chart $(U, x)$ containing $p$ and a chart $(V, y)$ containing $q = f(p)$ so that for $v_p \in T_pM$ we have the representative $(p, v, (U, x))$. Then the tangent map $T_pf : T_pM \to T_{f(p)}N$ is defined by letting the representative of $T_pf \cdot v_p$ in the chart $(V, y)$ be given by $(q, w, (V, y))$, where
$$w = D(y \circ f \circ x^{-1}) \cdot v.$$
This uniquely determines $T_pf \cdot v$, and the chain rule guarantees that this is well-defined (independent of the choice of charts). Another alternative definition of tangent map is given in terms of derivations:
Definition 2.19 (Tangent map III). Let $M$ be a smooth $n$-manifold. Continuing our set up above, we define $T_pf \cdot v_p$ as a derivation by
$$(T_pf \cdot v_p)g = v_p(g \circ f)$$
for each smooth function $g$. It is easy to check that this defines a derivation and so a tangent vector in $T_qN$. This map is yet another version of the tangent map $T_pf$.
In the above definition, one could take $g$ to be the germ of a smooth function defined on a neighborhood of $f(p)$ and then $T_pf \cdot v_p$ would act as a derivation of such germs. One can check the chain rule for tangent maps using the above definition in terms of derivations as follows: If $f : M \to N$ and $g : N \to P$ are smooth maps and $v_p \in T_pM$, then for $h \in C^\infty(P)$ we have
$$\begin{aligned}
(T_p(g \circ f) \cdot v_p)h &= v_p(h \circ (g \circ f)) = v_p((h \circ g) \circ f) \\
&= (T_pf \cdot v_p)(h \circ g) = T_{f(p)}g \cdot (T_pf \cdot v_p)h,
\end{aligned}$$
so since $h$ and $v_p$ were arbitrary, we conclude again that $T_p(g \circ f) = (T_{f(p)}g) \circ T_pf$. We have just proved the same thing using different interpretations of tangent space and tangent map, but the isomorphisms between the versions are so natural that we may use any version convenient for a given purpose and then draw conclusions about all versions. This could be formalized using category theory arguments but we shall forgo the endeavor. Just as with the idea of the tangent space, we will think of the above versions of the tangent map as a single abstract thing with more than one interpretation depending on the interpretation of tangent space in play. Now we introduce the differential of a function.
Definition 2.20. Let $M$ be a smooth manifold and let $p \in M$. For $f \in C^\infty(M)$, we define the differential of $f$ at $p$ as the linear map $df(p) : T_pM \to \mathbb{R}$ given by
$$df(p) \cdot v_p = v_p f$$
for all $v_p \in T_pM$. Thus $df(p) \in T_p^*M$. The notation $df_p$ or $df|_p$ is also used in place of $df(p)$. One may view $df_p(v_p)$ as an "infinitesimal" aspect of the composition $f \circ \gamma$ where $\gamma'(0) = v_p$. It is easy to see that $df(p)$ is just $T_pf$ followed by the natural map $T_{f(p)}\mathbb{R} \to \mathbb{R}$. In a way, $df(p)$ is just a version of the tangent map that takes advantage of the identification of $T_{f(p)}\mathbb{R}$ with $\mathbb{R}$ (recall Remark 2.11).
Let $(U, x)$ be a chart with $x = (x^1, \dots, x^n)$ and let $p \in U$. We previously denoted the basis dual to $\left(\frac{\partial}{\partial x^1}\big|_p, \dots, \frac{\partial}{\partial x^n}\big|_p\right)$ by $(dx^1|_p, \dots, dx^n|_p)$, and now this notation is justified since we can check directly that we really do have
$$dx^i|_p\left(\frac{\partial}{\partial x^j}\Big|_p\right) = \delta^i_j.$$
Definition 2.21. Let $I = (a, b)$ be an interval in $\mathbb{R}$. If $c : I \to M$ is a smooth map (a curve), then the velocity at $t_0 \in I$ is the vector $\dot{c}(t_0) \in T_{c(t_0)}M$ defined by
$$\dot{c}(t_0) := T_{t_0}c \cdot \frac{\partial}{\partial u}\Big|_{t_0},$$
where $\frac{\partial}{\partial u}\big|_{t_0}$ is the coordinate basis vector in $T_{t_0}I = T_{t_0}\mathbb{R}$ associated to the standard coordinate function on $\mathbb{R}$ (denoted here by $u$).
Note: We may also occasionally write $c'$ for $\dot{c}$.
Thus if $f$ is a smooth function defined in a neighborhood of $c(t_0)$, then $\dot{c}(t_0)$ acts as a derivation as follows:
$$\dot{c}(t_0) \cdot f = \frac{d}{dt}\Big|_{t=t_0} f \circ c;$$
$\dot{c}(t_0)$ is also denoted by $\frac{d}{dt}\big|_{t_0} c$. In this notation, $\dot{c} = \frac{d}{dt}c$ is a map that assigns to every $t \in I$ the tangent vector
$$\dot{c}(t) := T_tc \cdot \frac{\partial}{\partial u}\Big|_t,$$
and this is referred to as the velocity field along the curve $c$. Also note that we may view $\dot{c}(t_0)$ as the equivalence class of the curve $t \mapsto c(t + t_0)$.
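A short worked example may make the action of a velocity vector as a derivation concrete. It assumes $M = \mathbb{R}^2$ with the standard chart $(x, y)$; the curve and the function are hypothetical choices made for illustration only.

```latex
% Illustrative (assumed) curve c and function f on R^2.
\begin{align*}
c(t) &= (\cos t, \sin t), \qquad f(x,y) = xy,\\
\dot{c}(t_0)\cdot f &= \frac{d}{dt}\Big|_{t=t_0} (f\circ c)(t)
   = \frac{d}{dt}\Big|_{t=t_0} \cos t\,\sin t = \cos(2t_0),\\
\dot{c}(t_0) &= -\sin(t_0)\,\frac{\partial}{\partial x}\Big|_{c(t_0)}
   + \cos(t_0)\,\frac{\partial}{\partial y}\Big|_{c(t_0)},\\
\dot{c}(t_0)\cdot f &= -\sin(t_0)\,\frac{\partial f}{\partial x}(c(t_0))
   + \cos(t_0)\,\frac{\partial f}{\partial y}(c(t_0))
   = -\sin^2(t_0) + \cos^2(t_0) = \cos(2t_0).
\end{align*}
```

Both computations give the same value, the first directly from the definition and the second from the expansion of $\dot{c}(t_0)$ in the coordinate basis.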
The differential can be generalized:
Definition 2.22. Let $V$ be a vector space. For a smooth $f : M \to V$ with $p \in M$ as above, the differential $df(p) : T_pM \to V$ is the composition of the tangent map $T_pf$ and the canonical map $T_yV \to V$ where $y = f(p)$.
The notational distinction between $T_pf$ and $df_p$ is not universal, and $df_p$ is itself often used to denote $T_pf$.
Exercise 2.23. Let $f : M \to N$ be a smooth map. Show that if $T_pf = 0$ for all $p \in M$, then $f$ is locally constant (constant on connected components of $M$).
We now consider the inclusion map $\iota : U \hookrightarrow M$, where $U$ is open. For $p \in U$, we get the tangent map $T_p\iota : T_pU \to T_pM$. Let us look at this map from several points of view corresponding to the various ways one can define the tangent space. First, consider tangent spaces from the derivation point of view. From this point of view the map $T_p\iota$ is defined for $v_p \in T_pU$ as acting on $C^\infty(M)$ as follows: $T_p\iota(v_p)f = v_p(f \circ \iota) = v_p(f|_U)$. We have seen this map before as a map $(T_pU)_{\mathrm{alg}} \to (T_pM)_{\mathrm{alg}}$; it was observed to be an isomorphism, and we decided to identify $(T_pU)_{\mathrm{alg}}$ with $(T_pM)_{\mathrm{alg}}$. From the point of view of equivalence classes of curves, the map $T_p\iota$ sends $[\gamma]$ to $[\iota \circ \gamma]$. But while $\gamma$ is a curve into $U$, the map $\iota \circ \gamma$ is simply the same curve, but thought of as mapping into $M$. We leave it to the reader to verify the expected fact that $T_p\iota$ is a linear isomorphism. Thus it makes sense to identify $[\gamma]$ with $[\iota \circ \gamma]$ and so again to identify $T_pU$ with $T_pM$ via this isomorphism. Next consider $v_p \in T_pU$ to be represented by a triple $(p, v, (U_\alpha, x_\alpha))$, where $(U_\alpha, x_\alpha)$ is a chart on the open manifold $U$. Since $(U_\alpha, x_\alpha)$ is also a chart on $M$, the triple also represents an element of $T_pM$ which is none other than $T_p\iota \cdot v_p$. The map $T_p\iota$ looks more natural and trivial than ever, and we once again see the motivation for identifying $T_pU$ and $T_pM$.
More generally, when $S$ is a regular submanifold of $M$, then the tangent space $T_pS$ at $p \in S \subset M$ is intuitively a subspace of $T_pM$. Again, this is true as long as one is not bent on distinguishing a curve in $S$ through $p$ from the "same" curve thought of as a map into $M$. If one wants to be pedantic, then we have the inclusion map $\iota : S \hookrightarrow M$, and if $c : I \to S$ is a curve into $S$, then $\iota \circ c : I \to M$ is a map into $M$. At the tangent level this means that $\dot{c}(0) \in T_pS$ while $(\iota \circ c)^{\cdot}(0) \in T_pM$. Thus $T_p\iota : T_pS \to T_p\iota(T_pS) \subset T_pM$.
When convenient, we just identify $T_pS$ with $T_p\iota(T_pS)$ and so think of $T_pS$ as a subspace of $T_pM$ and take $T_p\iota$ to be an inclusion.
Recall that a smooth pointed map $f : (M,p) \to (N,q)$ is a smooth map $f : M \to N$ such that $f(p) = q$. Taking the set of all pairs $(M,p)$ as objects and pointed maps as morphisms, we have an obvious category.
Definition 2.24. The "pointed" version of the tangent functor $T$ takes $(M,p)$ to $T_pM$ and a map $f : (M,p) \to (N,q)$ to the linear map $T_pf : T_pM \to T_qN$.
The following theorem is the inverse mapping theorem for manifolds and will be used repeatedly.
Theorem 2.25. If $f : M \to N$ is a smooth map such that $T_pf : T_pM \to T_qN$ is an isomorphism, then there exists an open neighborhood $O$ of $p$ such that $f(O)$ is open and $f|_O : O \to f(O)$ is a diffeomorphism. If $T_pf$ is an isomorphism for all $p \in M$, then $f : M \to N$ is a local diffeomorphism.
Proof. The proof is a simple application of the inverse mapping theorem, Theorem C.1 found in Appendix C. Let $(U,x)$ be a chart centered at $p$ and let $(V,y)$ be a chart centered at $q = f(p)$ with $f(U) \subset V$. From the fact that $T_pf$ is an isomorphism we easily deduce that $M$ and $N$ have the same dimension, say $n$, and then $D(y \circ f \circ x^{-1})(x(p)) : \mathbb{R}^n \to \mathbb{R}^n$ is an isomorphism. From Theorem C.1 it follows that $y \circ f \circ x^{-1}$ restricts to a diffeomorphism on some neighborhood $O'$ of $0 \in \mathbb{R}^n$. From this we obtain that $f|_O : O \to f(O)$ is a diffeomorphism, where $O = x^{-1}(O')$. The second part follows from the first. □
2.3.1. Tangent spaces on manifolds with boundary.
Recall that a manifold with boundary is modeled on the half-spaces $\mathbb{R}^n_{\lambda \ge c} := \{a \in \mathbb{R}^n : \lambda(a) \ge c\}$. If $M$ is a manifold with boundary, then the tangent space $T_pM$ is defined as before. For instance, even if $p \in \partial M$, the fiber $T_pM$ may still be thought of as consisting of equivalence classes where $(p, v, \alpha) \sim (p, w, \beta)$ if and only if $D(x_\beta \circ x_\alpha^{-1})|_{x_\alpha(p)} \cdot v = w$. Notice that for a given chart $(U_\alpha, x_\alpha)$, the vectors $v$ in $(p, v, \alpha)$ still run through all of $\mathbb{R}^n$, and so $T_pM$ still has dimension $n$ even if $p \in \partial M$. On the other hand, if $p \in \partial M$, then for any half-space chart $x : U \to \mathbb{R}^n_{\lambda \ge c}$ with $p$ in its domain, $Tx^{-1}(T_{x(p)}\mathbb{R}^n_{\lambda = c})$ is a subspace of $T_pM$. This is the subspace of vectors tangent to the boundary and is identified with $T_p\partial M$, the tangent space to $\partial M$ (also a manifold).
Exercise 2.26. Show that this subspace does not depend on the choice of chart.
If one traces back through the definitions, it becomes clear that because of the way charts and differentiability are defined for manifolds with boundary, any smooth function defined on a neighborhood of a boundary point can be thought of as being the restriction of a smooth function defined
Figure 2.2. Tangents at a boundary point
slightly "outside" $M$. More precisely, the representative function always has a smooth extension from a (relatively open) neighborhood in $\mathbb{R}^n_{\lambda \ge c}$ to a neighborhood in $\mathbb{R}^n$. The derivatives of the extended function at points of $\partial\mathbb{R}^n_{\lambda \ge c} = \mathbb{R}^n_{\lambda = c}$ are independent of the extension. These considerations can be used to show that tangent vectors at boundary points of a smooth $n$-manifold with boundary can still be considered as derivations of germs of smooth functions. A closed interval $[a,b]$ is a one-dimensional manifold with boundary, and with only minor modifications in our definitions we can also consider equivalence classes of curves to define the full tangent space at a boundary point. It also follows that if $c$ is a smooth curve with domain $[a,b]$, then we can make sense of the velocities $\dot{c}(a)$ and $\dot{c}(b)$, and this is true even if $c(a)$ or $c(b)$ is a boundary point. The major portion of the theory of manifolds extends in a natural way to manifolds with boundary.
2.4. Tangents of Products
Suppose that $f : M_1 \times M_2 \to N$ is a smooth map. For fixed $p \in M_1$ and fixed $q \in M_2$, consider the "insertion" maps $\iota_p : y \mapsto (p, y)$ and $\iota^q : x \mapsto (x, q)$. Then $f \circ \iota^q$ and $f \circ \iota_p$ are the maps sometimes denoted by $f(\cdot, q)$ and $f(p, \cdot)$.
Definition 2.27. Let $f : M_1 \times M_2 \to N$ be as above. Define the partial tangent maps $\partial_1 f$ and $\partial_2 f$ by
$$(\partial_1 f)(p,q) := T_p(f \circ \iota^q) : T_pM_1 \to T_{f(p,q)}N,$$
$$(\partial_2 f)(p,q) := T_q(f \circ \iota_p) : T_qM_2 \to T_{f(p,q)}N.$$
Next we introduce another natural identification. It is obvious that a curve $c : I \to M_1 \times M_2$ is equivalent to a pair of curves $c_1 : I \to M_1$, $c_2 : I \to M_2$.
The infinitesimal version of this fact gives rise to a natural identification on the tangent level. If $c(t) = (c_1(t), c_2(t))$ and $c(0) = (p,q)$, then the map $T_{(p,q)}\mathrm{pr}_1 \times T_{(p,q)}\mathrm{pr}_2 : T_{(p,q)}(M_1 \times M_2) \to T_pM_1 \times T_qM_2$ is given by $[c] \mapsto ([c_1], [c_2])$, which is quite natural. This map is an isomorphism. Indeed, consider the insertion maps $\iota_p : q \mapsto (p,q)$ and $\iota^q : p \mapsto (p,q)$. We have linear monomorphisms
$$T\iota^q(p) : T_pM_1 \to T_{(p,q)}(M_1 \times M_2) \quad\text{and}\quad T\iota_p(q) : T_qM_2 \to T_{(p,q)}(M_1 \times M_2).$$
Then we have the map
$$T\iota^q + T\iota_p : T_pM_1 \times T_qM_2 \to T_{(p,q)}(M_1 \times M_2),$$
which sends $(v,w) \in T_pM_1 \times T_qM_2$ to $T\iota^q(v) + T\iota_p(w)$. It can be checked that this map is the inverse of $[c] \mapsto ([c_1],[c_2])$. Thus we may identify $T_{(p,q)}(M_1 \times M_2)$ with $T_pM_1 \times T_qM_2$.
Let us say a bit about the naturalness of this identification. In the smooth category, there is a direct product operation. The essential point is that for any two manifolds $M_1$ and $M_2$, the manifold $M_1 \times M_2$ together with the two projection maps serves as the direct product in the technical sense that for any smooth maps $f : N \to M_1$ and $g : N \to M_2$ we always have the unique map $f \times g : N \to M_1 \times M_2$ which makes the following diagram commute:
$$\begin{array}{ccccc} & & N & & \\ & {}^{f}\swarrow & \downarrow{\scriptstyle f \times g} & \searrow{}^{g} & \\ M_1 & \xleftarrow{\ \mathrm{pr}_1\ } & M_1 \times M_2 & \xrightarrow{\ \mathrm{pr}_2\ } & M_2 \end{array}$$
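For a point $x \in N$, write $p = f(x)$ and $q = g(x)$. On the tangent level we have
$$\begin{array}{ccccc} & & T_xN & & \\ & {}^{T_xf}\swarrow & \downarrow{\scriptstyle T_x(f \times g)} & \searrow{}^{T_xg} & \\ T_pM_1 & \xleftarrow{\ T\mathrm{pr}_1\ } & T_{(p,q)}(M_1 \times M_2) & \xrightarrow{\ T\mathrm{pr}_2\ } & T_qM_2 \end{array}$$
which is a diagram in the vector space category. In the category of vector spaces, the product of $T_pM_1$ and $T_qM_2$ is $T_pM_1 \times T_qM_2$ together with the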
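projections onto the two factors. Corresponding to the maps $T_xf$ and $T_xg$ we have the map $T_xf \times T_xg$. But
$$\begin{aligned} (T_{(p,q)}\mathrm{pr}_1 \times T_{(p,q)}\mathrm{pr}_2) \circ T_x(f \times g) &= (T_{(p,q)}\mathrm{pr}_1 \circ T_x(f \times g)) \times (T_{(p,q)}\mathrm{pr}_2 \circ T_x(f \times g)) \\ &= T_x(\mathrm{pr}_1 \circ (f \times g)) \times T_x(\mathrm{pr}_2 \circ (f \times g)) \\ &= T_xf \times T_xg. \end{aligned}$$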
Thus under the identification introduced above the map $T_x(f \times g)$ corresponds to $T_xf \times T_xg$. Now if $v \in T_pM_1$ and $w \in T_qM_2$, then $(v,w)$ represents an element of $T_{(p,q)}(M_1 \times M_2)$, and so it should act as a derivation. In fact, we can discover how this works by writing $(v,w) = (v,0) + (0,w)$. If $c_1$ is a curve that represents $v$ and $c_2$ is a curve that represents $w$, then $(v,0)$ and $(0,w)$ are represented by $t \mapsto (c_1(t), q)$ and $t \mapsto (p, c_2(t))$ respectively. Then for any smooth function $f$ on $M_1 \times M_2$ we have
$$\begin{aligned} (v,w)f = (v,0)f + (0,w)f &= \frac{d}{dt}\Big|_0 f(c_1(t), q) + \frac{d}{dt}\Big|_0 f(p, c_2(t)) \\ &= v[f \circ \iota^q] + w[f \circ \iota_p]. \end{aligned}$$
Lemma 2.28 (Partials lemma). For a map $f : M_1 \times M_2 \to N$, we have
$$T_{(p,q)}f \cdot (v,w) = (\partial_1 f)(p,q) \cdot v + (\partial_2 f)(p,q) \cdot w,$$
where we have used the aforementioned identification $T_{(p,q)}(M_1 \times M_2) = T_pM_1 \times T_qM_2$.
Proving this last lemma is much easier and more instructive than reading the proof so we leave it to the reader in good conscience.
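As a numerical sanity check of the partials lemma (not a proof), the following minimal Python sketch compares both sides for one sample map on $\mathbb{R} \times \mathbb{R}$. The map `f`, the point, the vectors, and all helper names are illustrative assumptions, and derivatives are only approximated by central differences.

```python
import math

def f(x, y):
    # Sample smooth map f : R x R -> R (an illustrative choice, not from the text)
    return x * x * y + math.sin(y)

def derivative(g, t0, h=1e-6):
    # Central difference approximation of g'(t0)
    return (g(t0 + h) - g(t0 - h)) / (2 * h)

def partial1(p, q, v):
    # (d_1 f)(p,q).v : differentiate x -> f(x, q) at p, i.e. T_p(f o iota^q)
    return derivative(lambda x: f(x, q), p) * v

def partial2(p, q, w):
    # (d_2 f)(p,q).w : differentiate y -> f(p, y) at q, i.e. T_q(f o iota_p)
    return derivative(lambda y: f(p, y), q) * w

def full_tangent(p, q, v, w):
    # T_(p,q) f.(v,w) : velocity of t -> f(p + t v, q + t w) at t = 0
    return derivative(lambda t: f(p + t * v, q + t * w), 0.0)

p, q, v, w = 1.5, 0.7, 2.0, -3.0
lhs = full_tangent(p, q, v, w)
rhs = partial1(p, q, v) + partial2(p, q, w)
print(abs(lhs - rhs) < 1e-6)
```

The curve $t \mapsto (p + tv, q + tw)$ represents $(v,w)$ under the identification above, which is why its image velocity plays the role of the left-hand side.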
2.5. Critical Points and Values
Definition 2.29. Let $f : M \to N$ be a $C^r$-map and $p \in M$. We say that $p$ is a regular point for the map $f$ if $T_pf$ is a surjection. Otherwise, $p$ is called a critical point or singular point. A point $q$ in $N$ is called a regular value of $f$ if every point in the inverse image $f^{-1}\{q\}$ is a regular point for $f$. This includes the case where $f^{-1}\{q\}$ is empty. A point of $N$ that is not a regular value is called a critical value.
Most values of a smooth map are regular values. In order to make this precise, we will introduce the notion of measure zero on a second countable smooth manifold. It is actually no problem to define a Lebesgue measure on such a manifold, but for now the notion of measure zero is all we need.
Definition 2.30. A subset $A$ of $\mathbb{R}^n$ is said to be of measure zero if for any $\epsilon > 0$ there is a sequence of cubes $\{W_i\}$ such that $A \subset \bigcup W_i$ and $\sum \mathrm{vol}(W_i) < \epsilon$. Here, $\mathrm{vol}(W_i)$ denotes the volume of the cube.
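To make Definition 2.30 concrete in the case $n = 1$, where cubes are intervals, here is a small Python sketch that covers a finite list of rational points by intervals of total length less than a given $\epsilon$. The halving scheme is the standard one that also works for a countably infinite set; the function names and the sample points are illustrative assumptions.

```python
from fractions import Fraction

def cover(points, eps):
    # The i-th point gets an interval of length eps / 2**(i + 2) centered on it,
    # so the lengths sum to less than eps / 2 no matter how many points there are.
    intervals = []
    for i, x in enumerate(points):
        half = Fraction(eps) / 2 ** (i + 3)
        intervals.append((x - half, x + half))
    return intervals

def total_length(intervals):
    return sum(b - a for a, b in intervals)

# Sample points: rationals a/b in [0, 1] with denominator at most 7 (repeats allowed)
points = [Fraction(a, b) for b in range(1, 8) for a in range(0, b + 1)]
eps = Fraction(1, 1000)
boxes = cover(points, eps)
print(total_length(boxes) < eps)
```

Exact rational arithmetic is used so that the comparison with $\epsilon$ is not affected by rounding.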
In the definition above, if the $W_i$ are taken to be balls, then we arrive at the very same notion of measure zero. It is easy to show that a countable union of sets of measure zero is still of measure zero. Our definition is consistent with the usual definition of Lebesgue measure zero as defined in standard measure theory courses.
Lemma 2.31. Let $U \subset \mathbb{R}^n$ be open and $f : U \to \mathbb{R}^n$ a $C^1$ map. If $A \subset U$ has measure zero, then $f(A)$ has measure zero.
Proof. Since $A$ is certainly contained in a countable union of compact balls (all of which are translates of a ball at the origin), we may as well assume that $U = B(0,r)$ and that $A$ is contained in a slightly smaller ball $B(0, r - \delta) \subset B(0,r)$. By the mean value theorem (see Appendix C), there is a constant $c$ depending only on $f$ and its domain such that for $x, y \in B(0,r)$ we have $\|f(y) - f(x)\| \le c\,\|x - y\|$. Let $\epsilon > 0$ be given. Since $A$ has measure zero, there is a sequence of balls $B(x_i, \epsilon_i)$ such that $A \subset \bigcup B(x_i, \epsilon_i)$ and
$$\sum \mathrm{vol}(B(x_i, \epsilon_i)) < \frac{\epsilon}{2^n c^n}.$$
Thus $f(B(x_i, \epsilon_i)) \subset B(f(x_i), 2c\epsilon_i)$, and while $f(A) \subset \bigcup B(f(x_i), 2c\epsilon_i)$, we also have
$$\mathrm{vol}\Big(\bigcup B(f(x_i), 2c\epsilon_i)\Big) \le \sum \mathrm{vol}(B(f(x_i), 2c\epsilon_i)) \le \sum \mathrm{vol}(B_1)(2c\epsilon_i)^n \le 2^n c^n \sum \mathrm{vol}(B(x_i, \epsilon_i)) \le \epsilon,$$
where $B_1 = B(0,1)$ is the ball of radius one centered at the origin. Since $\epsilon$ was arbitrary, it follows that $f(A)$ has measure zero. □
The previous lemma allows us to make the following definition:
Definition 2.32. Let $M$ be an $n$-manifold that is second countable. A subset $A \subset M$ is said to be of measure zero if for every admissible chart $(U,x)$ the set $x(A \cap U)$ has measure zero in $\mathbb{R}^n$.
In order for this to be a reasonable definition, the manifold must be second countable so that every atlas has a countable subatlas. This way we may be assured that every set that we have defined to be measure zero is the countable union of sets that are measure zero as viewed in some chart. It is not hard to see that in this more general setting it is still true that a countable union of sets of measure zero has measure zero. Also, we still have that the image of a set of measure zero under a smooth map has measure zero.
Proposition 2.33. Let $M$ be second countable as above and $\mathcal{A} = \{(U_\alpha, x_\alpha)\}$ a fixed atlas for $M$. If $x_\alpha(A \cap U_\alpha)$ has measure zero for all $\alpha$, then $A$ has measure zero.
Proof. The atlas $\mathcal{A}$ has a countable subatlas, so we may as well assume from the start that $\mathcal{A}$ is countable. We need to show that given any admissible chart $(U,x)$ the set $x(A \cap U)$ has measure zero. We have
$$x(A \cap U) = \bigcup_\alpha x(A \cap U \cap U_\alpha) \quad \text{(a countable union)}.$$
Since $x_\alpha(A \cap U \cap U_\alpha) \subset x_\alpha(A \cap U_\alpha)$, we see that $x_\alpha(A \cap U \cap U_\alpha)$ has measure zero for all $\alpha$. But $x(A \cap U \cap U_\alpha) = x \circ x_\alpha^{-1} \circ x_\alpha(A \cap U \cap U_\alpha)$, and so by the lemma above, $x(A \cap U \cap U_\alpha)$ also has measure zero. Thus $x(A \cap U)$ has measure zero since it is a countable union of sets of measure zero. □
We now state the famous and useful theorem of Arthur Sard.
Theorem 2.34 (Sard). Let $N$ be an $n$-manifold and $M$ an $m$-manifold, both assumed second countable. For a smooth map $f : N \to M$, the set of critical values has measure zero.
The somewhat technical proof may be found in the online supplement [Lee, Jeff] or in [Bro-Jan].
Corollary 2.35. If $M$ and $N$ are second countable manifolds, then the set of regular values of a smooth map $f : M \to N$ is dense in $N$.
2.5.1. Morse lemma.
If we consider a smooth function $f : M \to \mathbb{R}$, and assume that $M$ is a compact manifold (without boundary), then $f$ must achieve both a maximum at one or more points of $M$ and a minimum at one or more points of $M$. Let $p_e$ be one of these points. The usual argument shows that $df|_{p_e} = 0$. (Recall that under the usual identification of $\mathbb{R}$ with any of its tangent spaces we have $df|_{p_e} = T_{p_e}f$.) Now let $p$ be some point for which $df|_p = 0$, i.e. $p$ is a critical point for $f$. Does $f$ achieve either a maximum or a minimum at $p$? How does the function behave in a neighborhood of $p$? As the reader may well be aware, these questions are easier to answer in case the second derivative of $f$ at $p$ is nondegenerate. But what is the second derivative in this case?
Definition 2.36. The Hessian matrix of $f$ at one of its critical points $p$ and with respect to coordinates $x = (x^1, \dots, x^n)$ is the matrix of second partials:
$$H = \left[ \frac{\partial^2 (f \circ x^{-1})}{\partial x^i \partial x^j}(x_0) \right]_{1 \le i,j \le n},$$
where $x_0 = x(p)$. The critical point $p$ is called nondegenerate if $H$ is nonsingular.
Any such matrix $H$ is symmetric, and by Sylvester's law of inertia, it is congruent to a diagonal matrix whose diagonal entries are either $0$ or $1$ or $-1$. The number of $-1$'s occurring in this diagonal matrix is called the index of the critical point. According to Problem 13 we may define the Hessian $H_{f,p} : T_pM \times T_pM \to \mathbb{R}$, which is a symmetric bilinear form at each critical point $p$ of $f$, by letting $H_{f,p}(v,w) = X_p(Yf) = Y_p(Xf)$ for any vector fields $X$ and $Y$ which respectively take the values $v$ and $w$ at $p$. Thus, we may give a coordinate free definition of a nondegenerate point for $f$. Namely, $p$ is a nondegenerate point for $f$ if and only if $H_{f,p}$ is a nondegenerate bilinear form. The form $H_{f,p}$ is nondegenerate if for each fixed nonzero $v \in T_pM$ the map $H_{f,p}(v, \cdot) : T_pM \to \mathbb{R}$ is a nonzero element of the dual space $T_p^*M$.
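The following Python sketch illustrates the Hessian matrix and the index for one sample function on $\mathbb{R}^2$ with a critical point at the origin. The function `f`, the finite-difference step, and the helper names are illustrative assumptions. The index is computed as the number of negative eigenvalues of the symmetric matrix, which agrees with the number of $-1$'s in its Sylvester normal form.

```python
import math

def f(x, y):
    # Illustrative function; its gradient (2x + 3x^2, -2y) vanishes at the origin
    return x * x - y * y + x ** 3

def hessian(g, x0, y0, h=1e-4):
    # Second partials by central differences in the standard coordinates of R^2
    gxx = (g(x0 + h, y0) - 2 * g(x0, y0) + g(x0 - h, y0)) / (h * h)
    gyy = (g(x0, y0 + h) - 2 * g(x0, y0) + g(x0, y0 - h)) / (h * h)
    gxy = (g(x0 + h, y0 + h) - g(x0 + h, y0 - h)
           - g(x0 - h, y0 + h) + g(x0 - h, y0 - h)) / (4 * h * h)
    return [[gxx, gxy], [gxy, gyy]]

def eigenvalues_sym2(H):
    # Closed-form eigenvalues of a symmetric 2x2 matrix [[a, b], [b, d]]
    a, b, d = H[0][0], H[0][1], H[1][1]
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return (mean - radius, mean + radius)

def index(H, tol=1e-6):
    # Number of negative eigenvalues
    return sum(1 for lam in eigenvalues_sym2(H) if lam < -tol)

H = hessian(f, 0.0, 0.0)
nondegenerate = all(abs(lam) > 1e-6 for lam in eigenvalues_sym2(H))
print(nondegenerate, index(H))
```

Here the Hessian at the origin is approximately $\mathrm{diag}(2, -2)$, so the origin is a nondegenerate critical point of index one (a saddle).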
Exercise 2.37. Show that the nondegeneracy is well-defined by either of the two definitions given above and that the definitions agree.
Exercise 2.38. Show that nondegenerate critical points are isolated. Show by example that this need not be true for general critical points.
The structure of a function near one of its nondegenerate critical points is given by the following famous theorem of M. Morse:
Theorem 2.39 (Morse lemma). Let $f : M \to \mathbb{R}$ be a smooth function and let $x_0$ be a nondegenerate critical point for $f$ of index $\nu$. Then there is a local coordinate system $(U,x)$ containing $x_0$ such that the local representative $f_U := f \circ x^{-1}$ for $f$ has the form
$$f_U(x^1, \dots, x^n) = f(x_0) + \sum_{i,j} h_{ij} x^i x^j$$
and it may be arranged that the matrix $h = (h_{ij})$ is a diagonal matrix of the form $\mathrm{diag}(-1, \dots, -1, 1, \dots, 1)$ for some number (perhaps zero) of ones and minus ones. The number of minus ones is exactly the index $\nu$.
Proof. This is clearly a local problem and so it suffices to assume that $f : U \to \mathbb{R}$ for some open $U \subset \mathbb{R}^n$ and also that $f(0) = 0$. Our task is to show that there exists a diffeomorphism $\phi : \mathbb{R}^n \to \mathbb{R}^n$ such that $f \circ \phi(x) = x^t h x$ for a matrix of the form described. The first step is to observe that if $g : U \subset \mathbb{R}^n \to \mathbb{R}$ is any function defined on a convex open set $U$ and $g(0) = 0$, then
$$g(u_1, \dots, u_n) = \int_0^1 \frac{d}{dt}\, g(tu_1, \dots, tu_n)\, dt = \int_0^1 \sum_{i=1}^n u_i\, \partial_i g(tu_1, \dots, tu_n)\, dt.$$
Thus $g$ is of the form $g = \sum_{i=1}^n u_i g_i$ for certain smooth functions $g_i$, $1 \le i \le n$, with the property that $\partial_i g(0) = g_i(0)$. Now we apply this procedure first to $f$ to get $f = \sum_{i=1}^n u_i f_i$, where $\partial_i f(0) = f_i(0) = 0$, and then apply the procedure to each $f_i$ and substitute back. The result is that
$$f(u_1, \dots, u_n) = \sum_{i,j=1}^n u_i u_j h_{ij}(u_1, \dots, u_n) \qquad (2.3)$$
for some functions $h_{ij}$ with the property that the matrix $(h_{ij})$ is nonsingular at, and therefore near, $0$. Next we symmetrize the matrix $h = (h_{ij})$ by replacing $h_{ij}$ with $\frac{1}{2}(h_{ij} + h_{ji})$ if necessary. This leaves (2.3) untouched. The index of the matrix $(h_{ij}(0))$ is $\nu$, and this remains true in a neighborhood of $0$. The trick is to find a matrix $C(x)$ for each $x$ in the neighborhood that effects the diagonalization guaranteed by Sylvester's theorem, which is a diagonalization by congruence: $D = C(x)^t h(x) C(x)$. The remaining details, including the fact that the matrix $C(x)$ may be chosen to depend smoothly on $x$, are left to the reader. □
2.6. Rank and Level Set
Definition 2.40. The rank of a smooth map $f$ at $p$ is defined to be the rank of $T_pf$.
If $f : M \to N$ is a smooth map that has the same rank at each point, then we say it has constant rank. Similarly, if $f$ has the same rank for each $p$ in an open subset $U$, then we say that $f$ has constant rank on $U$.
Theorem 2.41 (Level submanifold theorem). Let $f : M \to N$ be a smooth map and consider the level set $f^{-1}(q_0)$ for $q_0 \in N$. If $f$ has constant rank $k$ on an open neighborhood of each $p \in f^{-1}(q_0)$, then $f^{-1}(q_0)$ is a closed regular submanifold of codimension $k$.
Proof. Clearly $f^{-1}(q_0)$ is a closed subset of $M$. Let $p_0 \in f^{-1}(q_0)$ and consider a chart $(U, \varphi)$ centered at $p_0$ and a chart $(V, \psi)$ centered at $q_0$ with $f(U) \subset V$. We may choose $U$ small enough that $f$ has rank $k$ on $U$. By Theorem C.5, we may compose with diffeomorphisms to replace $(U, \varphi)$ by a new chart $(U', x)$ also centered at $p_0$ and replace $(V, \psi)$ by a chart $(V', y)$ centered at $q_0$ such that $\bar{f} := y \circ f \circ x^{-1}$ is given by $(a^1, \dots, a^n) \mapsto (a^1, \dots, a^k, 0, \dots, 0)$, where $n = \dim(M)$. We show that
$$U' \cap f^{-1}(q_0) = \{p \in U' : x^1(p) = \cdots = x^k(p) = 0\}.$$
If $p \in U' \cap f^{-1}(q_0)$, then $y \circ f(p) = 0$ and $y \circ f \circ x^{-1}(x^1(p), \dots, x^n(p)) = 0$, or $x^1(p) = \cdots = x^k(p) = 0$. On the other hand, suppose that $p \in U'$ and $x^1(p) = \cdots = x^k(p) = 0$. Then we can reverse the logic to obtain that $y \circ f(p) = 0$ and hence $f(p) = q_0$.
Since $p_0$ was arbitrary, we have verified the existence of a cover of $f^{-1}(q_0)$ by single-slice charts (see Section 1.7). □
Proposition 2.42. Let $M$ and $N$ be smooth manifolds of dimension $m$ and $n$ respectively with $m > n$. Consider any smooth map $f : M \to N$. Then if $q \in N$ is a regular value, the inverse image set $f^{-1}(q)$ is a regular submanifold.
Proof. It is clear that since $f$ must have maximal rank in a neighborhood of $f^{-1}(q)$, it also has constant rank there. We may now apply Theorem 2.41. □
Example 2.43 (The unit sphere). The set $S^{n-1} = \{x \in \mathbb{R}^n : \sum (x^i)^2 = 1\}$ is a codimension 1 submanifold of $\mathbb{R}^n$. For this we apply the above proposition with the map $f : \mathbb{R}^n \to \mathbb{R}$ given by $x \mapsto \sum (x^i)^2$ and with the choice $q = 1 \in \mathbb{R}$.
Example 2.44. The set of all square matrices $M_{n \times n}$ is a manifold by virtue of the obvious isomorphism $M_{n \times n} \cong \mathbb{R}^{n^2}$. The set $\mathrm{sym}(n, \mathbb{R})$ of all symmetric matrices is a smooth $n(n+1)/2$-dimensional manifold by virtue of the obvious 1-1 correspondence $\mathrm{sym}(n, \mathbb{R}) \cong \mathbb{R}^{n(n+1)/2}$ given by using the $n(n+1)/2$ entries in the upper triangle of the matrix as coordinates. It can be shown that the map $f : M_{n \times n} \to \mathrm{sym}(n, \mathbb{R})$ given by $A \mapsto A^tA$ has full rank on $O(n, \mathbb{R}) = f^{-1}(I)$, and so we can apply Proposition 2.42. Thus the set $O(n, \mathbb{R})$ of all $n \times n$ orthogonal matrices is a submanifold of $M_{n \times n}$. We leave the details to the reader, but note that we shall prove a more general theorem later (Theorem 5.107).
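The full-rank claim in Example 2.44 can be spot-checked numerically. The sketch below, for $n = 2$, uses the fact that the derivative of $A \mapsto A^tA$ at $A$ in the direction $H$ is $A^tH + H^tA$, and that for orthogonal $A$ and symmetric $S$ the direction $H = \frac{1}{2}AS$ is sent to $S$. The particular rotation angle, the matrix `S`, and the helper names are illustrative assumptions; this checks surjectivity at one point only and is not the general argument.

```python
import math

def matmul(A, B):
    # Product of two 2x2 matrices stored as nested lists
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def derivative_of_f(A, H):
    # f(A) = A^t A, so the derivative at A in the direction H is A^t H + H^t A
    P = matmul(transpose(A), H)
    Q = matmul(transpose(H), A)
    return [[P[i][j] + Q[i][j] for j in range(2)] for i in range(2)]

theta = 0.3
A = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]   # a point of O(2, R)
S = [[1.0, 2.0],
     [2.0, -5.0]]                           # an arbitrary symmetric target
AS = matmul(A, S)
H = [[0.5 * AS[i][j] for j in range(2)] for i in range(2)]
image = derivative_of_f(A, H)
error = max(abs(image[i][j] - S[i][j]) for i in range(2) for j in range(2))
print(error < 1e-12)
```

Since every symmetric $S$ is hit in this way, the tangent map at an orthogonal $A$ is onto the tangent space of $\mathrm{sym}(n, \mathbb{R})$, which is what regularity of the value $I$ requires.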
The following proposition shows an example of the simultaneous use of Sard's theorem and Proposition 2.42.
Proposition 2.45. Let $S$ be a connected submanifold of $\mathbb{R}^n$ and let $L$ be a codimension one linear subspace of $\mathbb{R}^n$. Then there exists $x \in \mathbb{R}^n$ such that $(x + L) \cap S$ is a submanifold of $S$.
Proof. Start with a line $l$ through the origin that is normal to $L$. Let $\mathrm{pr} : \mathbb{R}^n \to l$ be orthogonal projection onto $l$. The restriction $\pi := \mathrm{pr}|_S : S \to l$ is easily seen to be smooth. If $\pi(S)$ were just a single point $x$, then $\pi^{-1}(x) = (x + L) \cap S$ would be all of $S$, so let us assume that $\pi(S)$ contains more than one point. Now, $\pi(S)$ is a connected subset of $l \cong \mathbb{R}$, so it must contain an open interval. This implies that $\pi(S)$ has positive measure. Thus by Sard's theorem there must be a point $x \in \pi(S) \subset l$ that is a regular value of $\pi$. Then Proposition 2.42 implies that $\pi^{-1}(x)$ is a submanifold of $S$. But this is the conclusion since $\pi^{-1}(x) = (x + L) \cap S$. □
We can generalize Proposition 2.42 using the concept of transversality.
Definition 2.46. Let $f : M \to N$ be a smooth map and $S \subset N$ a submanifold of $N$. We say that $f$ is transverse to $S$ if for every $p \in f^{-1}(S)$ we have $T_{f(p)}N = T_{f(p)}S + T_pf(T_pM)$. If $f$ is transverse to $S$, we write $f \pitchfork S$.
Theorem 2.47. Let $f : M \to N$ be a smooth map and $S \subset N$ a submanifold of $N$ of codimension $k$, and suppose that $f \pitchfork S$ and $f^{-1}(S) \ne \emptyset$. Then $f^{-1}(S)$ is a submanifold of $M$ with codimension $k$. Furthermore we have $T_p(f^{-1}(S)) = (T_pf)^{-1}(T_{f(p)}S)$ for all $p \in f^{-1}(S)$.
Proof. Let $q = f(p) \in S$ and choose a single-slice chart $(V, x)$ centered at $q \in V$ so that $x(S \cap V) = x(V) \cap (\mathbb{R}^{n-k} \times 0)$. Let $U := f^{-1}(V)$ so that $p \in U$. If $\pi : \mathbb{R}^{n-k} \times \mathbb{R}^k \to \mathbb{R}^k$ is the second factor projection, then the transversality condition on $U$ implies that $0$ is a regular value of $\pi \circ x \circ f|_U$. Thus $(\pi \circ x \circ f|_U)^{-1}(0) = f^{-1}(S) \cap U$ is a submanifold of $U$ of codimension $k$. Since this is true for all $p \in f^{-1}(S)$, the result follows. □
We can also define when a pair of maps are transverse to each other:
Definition 2.48. If $f_1 : M_1 \to N$ and $f_2 : M_2 \to N$ are smooth maps, we say that $f_1$ and $f_2$ are transverse at $q \in N$ if
$$T_qN = T_{p_1}f_1(T_{p_1}M_1) + T_{p_2}f_2(T_{p_2}M_2)$$
whenever $f_1(p_1) = f_2(p_2) = q$. (Note that $f_1$ is transverse to $f_2$ at any point not in the image of one of the maps $f_1$ and $f_2$.) If $f_1$ and $f_2$ are transverse for all $q \in N$, then we say that $f_1$ and $f_2$ are transverse and we write $f_1 \pitchfork f_2$.
One can check that if $f : M \to N$ is a smooth map and $S$ is a submanifold of $N$, then $f$ and the inclusion $\iota : S \hookrightarrow N$ are transverse if and only if $f \pitchfork S$ according to Definition 2.46.
If $f_1 : M_1 \to N$ and $f_2 : M_2 \to N$ are smooth maps, then we can consider the set
$$(f_1 \times f_2)^{-1}(\Delta) := \{(p_1, p_2) \in M_1 \times M_2 : f_1(p_1) = f_2(p_2)\},$$
which is the inverse image of the diagonal
$$\Delta := \{(q_1, q_2) \in N \times N : q_1 = q_2\}.$$
Corollary 2.49 (Transverse pullbacks). If $f_1 : M_1 \to N$ and $f_2 : M_2 \to N$ are transverse smooth maps, then $(f_1 \times f_2)^{-1}(\Delta)$ is a submanifold of $M_1 \times M_2$. If $g_1 : P \to M_1$ and $g_2 : P \to M_2$ are any smooth maps with the property $f_1 \circ g_1 = f_2 \circ g_2$, then the map $(g_1, g_2) : P \to (f_1 \times f_2)^{-1}(\Delta)$ given by $(g_1, g_2)(x) = (g_1(x), g_2(x))$ is smooth and is the unique smooth map such that $\mathrm{pr}_1 \circ (g_1, g_2) = g_1$ and $\mathrm{pr}_2 \circ (g_1, g_2) = g_2$.
Proof. We leave the proof as an exercise. Hint: $f_1 \times f_2$ is transverse to $\Delta$ if and only if $f_1 \pitchfork f_2$. □
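A small Python sketch may help in reading the transversality condition of Definition 2.46. It takes $N = \mathbb{R}^2$, $S$ the $x$-axis, and two sample curves $\mathbb{R} \to \mathbb{R}^2$: $f(t) = (t, t^2 - 1)$, which crosses $S$, and $g(t) = (t, t^2)$, which only touches it. The curves, the tolerance, and the helper names are illustrative assumptions.

```python
def spans_plane(u, v, tol=1e-12):
    # Two vectors span R^2 exactly when their determinant is nonzero
    return abs(u[0] * v[1] - u[1] * v[0]) > tol

def velocity(t):
    # Both sample curves t -> (t, t^2 - 1) and t -> (t, t^2) have this velocity,
    # which spans the image of the tangent map at t
    return (1.0, 2.0 * t)

def transverse_to_x_axis(preimage_points):
    # S is the x-axis in N = R^2, so T_q S is spanned by (1, 0) at every q in S.
    # Transversality asks that T_q S plus the image of the tangent map be all of R^2
    # at every parameter value that is mapped into S.
    tangent_to_S = (1.0, 0.0)
    return all(spans_plane(tangent_to_S, velocity(t)) for t in preimage_points)

crossing = transverse_to_x_axis([-1.0, 1.0])  # f(t) = (t, t^2 - 1) meets S at t = -1, 1
touching = transverse_to_x_axis([0.0])        # g(t) = (t, t^2) meets S only at t = 0
print(crossing, touching)
```

In the transverse case the preimage $\{-1, 1\}$ is a zero-dimensional submanifold of $\mathbb{R}$, of codimension one, matching the codimension of $S$ as in Theorem 2.47.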
2.7. The Tangent and Cotangent Bundles
We define the tangent bundle of a manifold $M$ as the (disjoint) union of the tangent spaces: $TM = \bigsqcup_{p \in M} T_pM$. We show in Proposition 2.55 below that $TM$ is a smooth manifold, but first we introduce a couple of definitions.
Definition 2.50. Given a smooth map $f : M \to N$ as above, the tangent maps $T_pf$ on the individual tangent spaces combine to give a map
$$Tf : TM \to TN$$
on the tangent bundle which is linear on each fiber. This map is called the tangent map or sometimes the tangent lift of $f$.
For smooth maps $f : M \to N$ and $g : N \to P$ we have the following simple looking version of the chain rule:
$$T(g \circ f) = Tg \circ Tf.$$
If $U$ is an open set in a finite-dimensional vector space $V$, then the tangent space at $x \in U$ can be viewed as $\{x\} \times V$. For example, recall that an element $v_p = (p, v)$ corresponds to the derivation $f \mapsto v_pf := \frac{d}{dt}\big|_{t=0} f(p + tv)$. Thus the tangent bundle of $U$ can be viewed as the product $U \times V$. Let $U_1$ and $U_2$ be open subsets of vector spaces $V$ and $W$ respectively and let $f : U_1 \to U_2$ be smooth (or at least $C^1$). Then we have the tangent map $Tf : TU_1 \to TU_2$. Viewing $TU_1$ as $U_1 \times V$ and similarly for $TU_2$, the tangent map $Tf$ is given by $(p, v) \mapsto (f(p), Df(p) \cdot v)$.
Definition 2.51. If $f : M \to V$, where $V$ is a finite-dimensional vector space, then we have the differential $df(p) : T_pM \to V$ for each $p$. These maps can be combined to give a single map $df : TM \to V$ (also called the differential) which is defined by $df(v) = df(p)(v)$ when $v \in T_pM$.
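For open subsets of $\mathbb{R}^2$ the formula $(p, v) \mapsto (f(p), Df(p) \cdot v)$ and the chain rule $T(g \circ f) = Tg \circ Tf$ can be checked directly. The Python sketch below does this for two sample maps with hand-written Jacobians and compares the result with a finite-difference velocity of the composite. The maps `f`, `g`, the point, the vector, and the helper names are illustrative assumptions.

```python
import math

def f(p):
    x, y = p
    return (x * y, x + y * y)

def Df(p, v):
    # Jacobian of f at p applied to v
    x, y = p
    return (y * v[0] + x * v[1], v[0] + 2 * y * v[1])

def g(p):
    a, b = p
    return (math.sin(a), a * b)

def Dg(p, v):
    # Jacobian of g at p applied to v
    a, b = p
    return (math.cos(a) * v[0], b * v[0] + a * v[1])

def tangent_map(F, DF):
    # TF : (p, v) -> (F(p), DF(p).v)
    return lambda p, v: (F(p), DF(p, v))

Tf = tangent_map(f, Df)
Tg = tangent_map(g, Dg)

def T_composite_numeric(p, v, h=1e-6):
    # Velocity at t = 0 of t -> g(f(p + t v)), by central differences
    gf = lambda t: g(f((p[0] + t * v[0], p[1] + t * v[1])))
    plus, minus = gf(h), gf(-h)
    return (g(f(p)), tuple((a - b) / (2 * h) for a, b in zip(plus, minus)))

p, v = (0.5, -1.2), (1.0, 2.0)
base1, vel1 = Tg(*Tf(p, v))            # Tg o Tf
base2, vel2 = T_composite_numeric(p, v)  # T(g o f), approximated
print(all(abs(a - b) < 1e-6 for a, b in zip(vel1, vel2)))
```

Composing the pairs $(\text{base point}, \text{vector})$ is all that the chain rule asks for here, which is why `Tg(*Tf(p, v))` needs no extra bookkeeping.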
If we identify $TV$ with the product $V \times V$, then $df = \mathrm{pr}_2 \circ Tf$, where $\mathrm{pr}_2 : TV = V \times V \to V$ is the projection onto the second factor.
Remark 2.52 (Warning). The notation "$df$" is subject to interpretation. Besides the map $df : TM \to V$ described above, it could also refer to the map $df : p \mapsto df(p)$ or to another map on vector fields which we describe later in this chapter.
Definition 2.53. The map $\pi_{TM} : TM \to M$ defined by $\pi_{TM}(v) = p$ if $v \in T_pM$ is called the tangent bundle projection map. (The set $TM$ together with the map $\pi_{TM} : TM \to M$ is an example of a vector bundle, which is defined later.)
Whenever possible, we abbreviate $\pi_{TM}$ to $\pi$. For every chart $(U, x)$ on $M$, we obtain a chart $(\widetilde{U}, \widetilde{x})$ on $TM$ by letting
$$\widetilde{U} := TU = \pi^{-1}(U) \subset TM$$
and by defining $\widetilde{x}$ on $\widetilde{U}$ by the prescription
$$\widetilde{x}(v_p) = (x^1(p), \dots, x^n(p), v^1, \dots, v^n),$$
where $v_p \in T_pM$, and where $v^1, \dots, v^n$ are the (unique) coefficients in the coordinate expression $v_p = \sum v^i \frac{\partial}{\partial x^i}\big|_p$. Thus $\widetilde{x}^{-1}(u^1, \dots, u^n, v^1, \dots, v^n) = \sum v^i \frac{\partial}{\partial x^i}\big|_{x^{-1}(u)}$. Recall that if $v_p = \sum v^i \frac{\partial}{\partial x^i}\big|_p$, then $v^i = dx^i(v_p)$. From this we see that $\widetilde{x} = (x^1 \circ \pi, \dots, x^n \circ \pi, dx^1, \dots, dx^n)$.
For any $(U, x)$, we have the tangent lift $Tx : TU \to TV$ where $V = x(U)$. Since $V \subset \mathbb{R}^n$, we can identify $T_{x(p)}V$ with $\{x(p)\} \times \mathbb{R}^n$. Let us invoke this identification. Now let $v_p \in T_pU$ and let $\gamma$ be a curve that represents $v_p$ so that $\gamma'(0) = v_p$.
Exercise 2.54. Under the identification of $T_{x(p)}V$ with $\{x(p)\} \times \mathbb{R}^n$ we have $T_px \cdot v_p = \left(x(p), \frac{d}{dt}\big|_{t=0}(x \circ \gamma)\right)$. [Hint: Interpret both sides as derivations.]
If $v_p = \frac{\partial}{\partial x^i}\big|_p$, then we can take $\gamma(t) := x^{-1}(x(p) + te_i)$, where $e_i$ is the $i$-th member of the standard basis of $\mathbb{R}^n$. Thus
$$T_px \cdot \frac{\partial}{\partial x^i}\Big|_p = \left(x(p), \frac{d}{dt}\Big|_{t=0}(x(p) + te_i)\right) = (x(p), e_i).$$
Now suppose that $v_p = \sum v^i \frac{\partial}{\partial x^i}\big|_p$. Then
$$T_px \cdot v_p = T_px \cdot \Big(\sum v^i \frac{\partial}{\partial x^i}\Big|_p\Big) = \Big(x(p), \sum v^ie_i\Big) = (x^1(p), \dots, x^n(p), v^1, \dots, v^n).$$
From this we see that $Tx$ is none other than $\widetilde{x}$ defined above, and since $\widetilde{U} = TU$, we see that an alternative and suggestive notation for $(\widetilde{U}, \widetilde{x})$ is $(TU, Tx)$, and we adopt this notation below. This notation reminds one that the charts we have constructed are not just any charts on $TM$, but are each associated naturally with a chart on $M$ and are essentially the tangent lifts of charts on $M$. They are called natural charts.
Proposition 2.55. For any smooth $n$-manifold $M$, the set $TM$ is a smooth $2n$-manifold in a natural way and $\pi_{TM} : TM \to M$ is a smooth map. Furthermore, for a smooth map $f : M \to N$, the tangent map $Tf$ is smooth and
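the following diagram commutes:
$$\begin{array}{ccc} TM & \xrightarrow{\ Tf\ } & TN \\ \downarrow{\scriptstyle \pi_{TM}} & & \downarrow{\scriptstyle \pi_{TN}} \\ M & \xrightarrow{\ f\ } & N \end{array}$$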
Proof. For every chart $(U, x)$, let $TU = \pi^{-1}(U)$ and let $Tx$ be the map $Tx : TU \to x(U) \times \mathbb{R}^n$. The pair $(TU, Tx)$ is a chart on $TM$. Suppose that $(TU, Tx)$ and $(TV, Ty)$ are two such charts constructed as above from two charts $(U, x)$ and $(V, y)$ and that $U \cap V \ne \emptyset$. Then $TU \cap TV \ne \emptyset$ and on the overlap we have the coordinate transitions $Ty \circ Tx^{-1} : (x, v) \mapsto (y, w)$, where
$$y = y \circ x^{-1}(x), \qquad w = D(y \circ x^{-1})\big|_x\, v.$$
Thus the overlap maps are smooth. It is easy to see that $Tx(TU \cap TV)$ and $Ty(TU \cap TV)$ are open. Thus we obtain a smooth atlas on $TM$ from an atlas on $M$ and this generates a topology. It follows from Proposition 1.32 that $TM$ is Hausdorff and paracompact. To test for the smoothness of $\pi$, we look at maps of the form $x \circ \pi \circ (Tx)^{-1}$. We have
$$x \circ \pi \circ (Tx)^{-1}(x, v) = x \circ \pi\Big(\sum v^i \frac{\partial}{\partial x^i}\Big|_{x^{-1}(x)}\Big) = x \circ x^{-1}(x) = x,$$
which is just a projection and so clearly smooth. The remainder is left for the exercise below. □
In the above proof we observed that $Tx(TU \cap TV)$ and $Ty(TU \cap TV)$ are open. This must be checked because of (ii) in Definition 1.25 and is the kind of detail we may leave to the reader as we move forward.
Exercise 2.56. For a smooth map $f : M \to N$, the map $Tf : TM \to TN$ is itself a smooth map.
If $p \in U \cap V$ and $x(p) = (x^1(p), \dots, x^n(p))$, then, as in the proof above, $Ty \circ Tx^{-1}$ sends $(x^1(p), \dots, x^n(p), v^1, \dots, v^n)$ to $(y^1(p), \dots, y^n(p), w^1, \dots, w^n)$, where
$$w^i = \sum_k \frac{\partial (y \circ x^{-1})^i}{\partial x^k}\Big|_{x(p)} v^k.$$
If we abbreviate the $i$-th component of $y \circ x^{-1}(x^1(p), \dots, x^n(p))$ to $y^i = y^i(x^1(p), \dots, x^n(p))$, then we could express the tangent bundle overlap map by the relations
$$y^i = y^i(x^1(p), \dots, x^n(p)) \quad\text{and}\quad w^i = \sum_k \frac{\partial y^i}{\partial x^k} v^k.$$
Since this is true for all $p \in U \cap V$, we can write the very classical looking expressions
$$y^i = y^i(x^1, \dots, x^n) \quad\text{and}\quad w^i = \sum_k \frac{\partial y^i}{\partial x^k} v^k,$$
where we now can interpret $(x^1, \dots, x^n)$ as an $n$-tuple of numbers. Once again we note that a local expression could either be interpreted as living on the manifold in the chart domain or, equally, in Euclidean space on the image of the chart domain. This should not be upsetting since, after all, one could argue that the charts are there to identify chart domains in the manifold with open sets in Euclidean space.
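The classical transformation law $w^i = \sum_k \frac{\partial y^i}{\partial x^k} v^k$ can be tried out on the plane with $x = (r, \theta)$ polar coordinates and $y$ the Cartesian coordinates. The Python sketch below is a minimal illustration under that assumption; the chosen charts, the sample point, and the helper names are not from the text. It compares the Jacobian formula with the finite-difference velocity of a curve pushed through the transition map.

```python
import math

def transition(r, theta):
    # The overlap map y o x^{-1} from polar to Cartesian coordinates
    return (r * math.cos(theta), r * math.sin(theta))

def tangent_transition(r, theta, v_r, v_theta):
    # Ty o Tx^{-1} : (x, v) -> (y, w) with w^i = sum_k (dy^i/dx^k) v^k
    y1, y2 = transition(r, theta)
    w1 = math.cos(theta) * v_r - r * math.sin(theta) * v_theta
    w2 = math.sin(theta) * v_r + r * math.cos(theta) * v_theta
    return (y1, y2, w1, w2)

def velocity_numeric(r, theta, v_r, v_theta, h=1e-6):
    # Velocity at t = 0 of t -> transition(r + t v_r, theta + t v_theta)
    plus = transition(r + h * v_r, theta + h * v_theta)
    minus = transition(r - h * v_r, theta - h * v_theta)
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))

r, theta, v_r, v_theta = 2.0, 0.4, 0.7, -1.3
y1, y2, w1, w2 = tangent_transition(r, theta, v_r, v_theta)
n1, n2 = velocity_numeric(r, theta, v_r, v_theta)
print(abs(w1 - n1) < 1e-6 and abs(w2 - n2) < 1e-6)
```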
Definition 2.57. The tangent functor is defined by assigning to a manifold $M$ its tangent bundle $TM$ and to any map $f : M \to N$ the tangent map $Tf : TM \to TN$. The chain rule shows that this is a covariant functor (see Appendix A). Recall that we also defined a "pointed" tangent functor.
We have seen that if $U$ is an open set in a vector space $V$, then the tangent bundle is often taken to be $U \times V$. Suppose that for some smooth $n$-manifold $M$, there is a diffeomorphism $F : TM \to M \times V$ such that the restriction of $F$ to each tangent space is a linear isomorphism $T_pM \to \{p\} \times V$ and such that the following diagram commutes:
$$\begin{array}{ccc} TM & \xrightarrow{\ F\ } & M \times V \\ {\scriptstyle \pi_{TM}}\searrow & & \swarrow{\scriptstyle \mathrm{pr}_1} \\ & M & \end{array}$$
Then for some purposes, we can identify $TM$ with $M \times V$.
Definition 2.58. A diffeomorphism $F : TM \to M \times V$ such that the map $F|_{T_pM} : T_pM \to \{p\} \times V$ is linear for each $p$ and such that the above diagram commutes is called a (global) trivialization of $TM$. If a (global) trivialization exists, then we say that $TM$ is trivial. For an open set $U \subset M$, a trivialization of $TU$ is called a local trivialization of $TM$ over $U$.
For most manifolds, there does not exist a global trivialization of the tangent bundle. On the other hand, every point $p$ in a manifold $M$ is contained in an open set $U$ so that $TM$ has a local trivialization over $U$. The
existence of these local trivializations is quickly deduced from the existence of the special charts which we constructed above for a tangent bundle. Next we introduce the cotangent bundle. Recall that for each $p \in M$, the tangent space $T_pM$ has a dual space $T_p^*M$ called the cotangent space at $p$.
Definition 2.59. Define the cotangent bundle of a manifold $M$ to be the set
$$T^*M := \bigsqcup_{p \in M} T_p^*M$$
and define the map $\pi_{T^*M} : T^*M \to M$ to be the obvious projection taking elements in each space $T_p^*M$ to the corresponding point $p$.
Remark 2.60. We will denote both the tangent bundle projection and the cotangent bundle projection simply by $\pi$ whenever no confusion is likely.
Remark 2.61. Suppose that $f : M \to N$ is a smooth map. It is important to notice that even though for each $p \in M$, the map $T_pf : T_pM \to T_{f(p)}N$ has a dual map $(T_pf)^* : (T_{f(p)}N)^* \to (T_pM)^*$, these maps do not generally combine to give a map from $T^*N$ to $T^*M$. In general, there is nothing like a "cotangent lift". To see this, just consider the case where $f$ is a constant map.
We now show that $T^*M$ is also a smooth manifold. Let $\mathcal{A}$ be an atlas on $M$. For each chart $(U, x) \in \mathcal{A}$, we obtain a chart $(T^*U, T^*x)$ for $T^*M$ which we now describe. First, $T^*U = \pi_{T^*M}^{-1}(U) = \bigsqcup_{p \in U} T_p^*M$. Secondly, $T^*x$ is a map which we now define directly and then show that, in some sense, it is dual to the map $Tx$. For convenience, consider the map $p_i : \theta_p \mapsto \xi_i$ which just peels off the coefficients in the expansion of any $\theta_p \in T_p^*M$ in the basis $(dx^1|_p, \dots, dx^n|_p)$:
$$\theta_p = \sum_j \xi_j\, dx^j\big|_p.$$
Notice that we have
$$\theta_p\Big(\frac{\partial}{\partial x^i}\Big|_p\Big) = \sum_j \xi_j\, dx^j\big|_p\Big(\frac{\partial}{\partial x^i}\Big|_p\Big) = \sum_j \xi_j \delta^j_i = \xi_i = p_i(\theta_p),$$
and so
$$p_i(\theta_p) = \theta_p\Big(\frac{\partial}{\partial x^i}\Big|_p\Big).$$
With this definition of the $p_i$ in hand, we can define
$$T^*x = (x^1 \circ \pi, \dots, x^n \circ \pi, p_1, \dots, p_n)$$
on $T^*U$. We call $(T^*U, T^*x)$ a natural chart. If $x = (x^1, \dots, x^n)$, then for the natural chart $(T^*U, T^*x)$, we could use the abbreviation $T^*x = (x^1, \dots, x^n, p_1, \dots, p_n)$. Another common notation is $q^i := x^i \circ \pi$. This notation is very popular in applications to mechanics.
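We claim that if we take advantage of the identifications $T_x\mathbb{R}^n = \mathbb{R}^n = (\mathbb{R}^n)^* = T_x^*\mathbb{R}^n$, where $(\mathbb{R}^n)^*$ is the dual space of $\mathbb{R}^n$, then $T^*x$ acts on each fiber $T_p^*M$ as the dual of the inverse of the map $T_px$, i.e. the contragredient of $T_px$:
$$\big((T_px)^{-1}\big)^*(\theta_p) \cdot (v) = \theta_p\big((T_px)^{-1} \cdot v\big).$$
Let us unravel this. If $\theta_p \in T_p^*M$ for some $p \in U$, then we can write
$$\theta_p = \sum \xi_i\, dx^i\big|_p$$
for some numbers $\xi_i$ depending on $\theta_p$, which are what we have called $p_i(\theta_p)$. We have
$$\begin{aligned} \big((T_px)^{-1}\big)^*(\theta_p) \cdot (v) &= \theta_p\big((T_px)^{-1} \cdot v\big) = \sum \xi_i\, dx^i\big|_p \cdot \big((T_px)^{-1} \cdot v\big) \\ &= \sum \xi_i\, dx^i\big|_p\Big(\sum_k v^k \frac{\partial}{\partial x^k}\Big|_p\Big) = \sum \xi_i v^i. \end{aligned}$$
Thus, under the identification of $\mathbb{R}^n$ with its dual we see that $\big((T_px)^{-1}\big)^*(\theta_p)$ is just $(\xi_1, \dots, \xi_n)$. But recall that $T^*x(\theta_p) = (x^1(p), \dots, x^n(p), \xi_1, \dots, \xi_n)$. Thus for $\theta_p \in T_p^*M$ we have
$$T^*x(\theta_p) = \big(x(p), \big((T_px)^{-1}\big)^*(\theta_p)\big).$$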
Suppose that $(T^*U, T^*x)$ and $(T^*V, T^*y)$ are the coordinates constructed as above from two charts $(U, x)$ and $(V, y)$ respectively with $U \cap V \neq \emptyset$. Then on the overlap $T^*U \cap T^*V$ we have
$$T^*y \circ (T^*x)^{-1} : x(U \cap V) \times \mathbb{R}^{n*} \to y(U \cap V) \times \mathbb{R}^{n*}.$$
This last map will send something of the form $(x, \xi) \in x(U \cap V) \times \mathbb{R}^{n*}$ to $(\bar{x}, \bar{\xi}) = (y \circ x^{-1}(x), D(x \circ y^{-1})^* \cdot \xi)$, where $D(x \circ y^{-1})^*$ is the dual map to $D(x \circ y^{-1})$, which is the contragredient of the map $D(y \circ x^{-1})$. If we identify $\mathbb{R}^{n*}$ with $\mathbb{R}^n$ and write $\xi = (\xi_1, \ldots, \xi_n)$ and $\bar{\xi} = (\bar{\xi}_1, \ldots, \bar{\xi}_n)$, then in the classical style we have:
$$\bar{x}^i = y^i(x^1, \ldots, x^n) \quad\text{and}\quad \bar{\xi}_i = \sum_k \xi_k \frac{\partial x^k}{\partial y^i}.$$
This should be compared to the expression (2.2). It is now clear that we have an atlas on $T^*M$ constructed from an atlas on $M$. The topology of $T^*M$ (induced by the above atlas) is easily seen to be paracompact and Hausdorff.
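The classical transformation law for covector components can be checked numerically. The following is a minimal sketch, assuming the two charts are Cartesian coordinates $x = (x^1, x^2)$ and polar coordinates $y = (r, \theta)$ on the punctured plane; the function names and sample values are illustrative only. It verifies that the pairing of a covector with a tangent vector is the same number in either chart.

```python
import math

def polar_from_cartesian(x1, x2):
    """Chart change y o x^{-1}: Cartesian (x1, x2) -> polar (r, theta)."""
    return math.hypot(x1, x2), math.atan2(x2, x1)

def covector_to_polar(xi, r, theta):
    """xi_bar_i = sum_k xi_k * d(x^k)/d(y^i), with x = (r cos(theta), r sin(theta))."""
    dx_dr = (math.cos(theta), math.sin(theta))            # (dx1/dr, dx2/dr)
    dx_dth = (-r * math.sin(theta), r * math.cos(theta))  # (dx1/dtheta, dx2/dtheta)
    return (xi[0] * dx_dr[0] + xi[1] * dx_dr[1],
            xi[0] * dx_dth[0] + xi[1] * dx_dth[1])

def vector_to_polar(v, r, theta):
    """Tangent vectors use the inverse Jacobian: v_bar^i = sum_k d(y^i)/d(x^k) * v^k."""
    dr_dx = (math.cos(theta), math.sin(theta))
    dth_dx = (-math.sin(theta) / r, math.cos(theta) / r)
    return (dr_dx[0] * v[0] + dr_dx[1] * v[1],
            dth_dx[0] * v[0] + dth_dx[1] * v[1])

def pairing(xi, v):
    """The number theta_p(v) computed from components in one chart."""
    return xi[0] * v[0] + xi[1] * v[1]

r, theta = polar_from_cartesian(1.0, 2.0)
xi, v = (3.0, -1.0), (0.5, 4.0)          # components in the Cartesian chart
xi_bar = covector_to_polar(xi, r, theta)
v_bar = vector_to_polar(v, r, theta)
```

The covector components pick up the Jacobian of $x \circ y^{-1}$ while the vector components pick up its inverse, which is why `pairing(xi, v)` and `pairing(xi_bar, v_bar)` agree up to rounding.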
In summary, both T M and T* M are smooth manifolds whose smooth structure is derived from the smooth structure on M in a natural way. In both cases, the charts are derived from charts on the base M and are given by the n coordinates of the base point together with the n components of the element of T M (or T* M) in the corresponding coordinate frame.
2.8. Vector Fields

In this section we introduce vector fields. Roughly, a vector field is a smooth assignment of a tangent vector to each point of a manifold.

Definition 2.62. If $\pi : M \to N$ is a smooth map, then a (global) section of $\pi$ is a map $\sigma : N \to M$ such that $\pi \circ \sigma = \mathrm{id}$. If $\sigma$ is defined only on an open subset $U$ of $N$ and $\pi \circ \sigma = \mathrm{id}_U$, then we call $\sigma$ a local section. In case the section $\sigma$ is a smooth (or $C^r$) map, we call $\sigma$ a smooth (or $C^r$) section. Clearly, if $\pi : M \to N$ has a (global) section, then it must be surjective.

Definition 2.63. A smooth vector field on $M$ is a smooth map $X : M \to TM$ such that $X(p) \in T_pM$ for all $p \in M$. In other words, a vector field on $M$ is a smooth section of the tangent bundle $\pi : TM \to M$. We often write $X_p = X(p)$.

Convention: Obviously the notion of a section or field that is not smooth makes sense. Sometimes one is interested in merely continuous sections or measurable sections. In this book, by "vector field" or "section", we will always mean "smooth vector field" or "smooth section" unless otherwise indicated explicitly or by context.

A local section of $TM$ defined on an open set $U$ is just the same thing as a vector field on the open manifold $U$. If $(U, x)$ is a chart on a smooth $n$-manifold, then writing $x = (x^1, \ldots, x^n)$, we have vector fields defined on $U$ by
$$\frac{\partial}{\partial x^i} : p \mapsto \left.\frac{\partial}{\partial x^i}\right|_p .$$
The ordered set of fields $\left(\frac{\partial}{\partial x^1}, \ldots, \frac{\partial}{\partial x^n}\right)$ is called a coordinate frame field (or also "holonomic frame field"). If $X$ is a smooth vector field defined on some set including this chart domain $U$, then for some smooth functions $X^i$ defined on $U$ we have
$$X(p) = \sum X^i(p) \left.\frac{\partial}{\partial x^i}\right|_p ,$$
or in other words
$$X|_U = \sum X^i \frac{\partial}{\partial x^i}.$$
Notation 2.64. In this context, we will not usually bother to distinguish $X$ from its restrictions to chart domains and so we just write $X = \sum X^i \frac{\partial}{\partial x^i}$.
Lemma 2.65. If $v \in T_pM$ then there exists a vector field $X$ such that $X(p) = v$.

Proof. Write $v = \sum v^i \left.\frac{\partial}{\partial x^i}\right|_p$. Define a field $X_U$ by the formula $\sum v^i \frac{\partial}{\partial x^i}$ where the $v^i$ are taken as constant functions on $U$. Let $\beta$ be a cut-off function with support in $U$ and such that $\beta(p) = 1$. Then let $X := \beta X_U$ on $U$ and extended to zero outside of $U$. □
Let us unravel what the smoothness condition means for a vector field. Let $(TU, Tx)$ be one of the natural charts that we constructed for $TM$ from a corresponding chart $(U, x)$ on $M$. To test the smoothness of $X$, we look at the composition $Tx \circ X \circ x^{-1}$. For $x \in x(U)$, we have
$$Tx \circ X \circ x^{-1}(x) = Tx \circ \left(\sum X^i \frac{\partial}{\partial x^i}\right) \circ x^{-1}(x) = Tx\left(\sum X^i(x^{-1}(x)) \left.\frac{\partial}{\partial x^i}\right|_{x^{-1}(x)}\right)$$
$$= \left(x, \; T_{x^{-1}(x)}x\left(\sum X^i(x^{-1}(x)) \left.\frac{\partial}{\partial x^i}\right|_{x^{-1}(x)}\right)\right) = \left(x, X^1 \circ x^{-1}(x), \ldots, X^n \circ x^{-1}(x)\right).$$
Our chart was arbitrary, and so we see that the smoothness of $X$ is equivalent to the smoothness of the component functions $X^i$ in every chart of an atlas for the smooth structure.
Exercise 2.66. Show that if $X : M \to TM$ is continuous and $\pi \circ X = \mathrm{id}$, then $X$ is smooth if and only if $Xf : p \mapsto X_pf$ is a smooth function for every locally defined smooth function $f$ on $M$. Show that it is enough to consider globally defined smooth functions.

Notation 2.67. The set of all smooth vector fields on $M$ is denoted by $\mathfrak{X}(M)$. Smooth vector fields may at times be defined only on some open set $U \subset M$ so we also have the notation $\mathfrak{X}(U) = \mathfrak{X}_M(U)$ for these fields.
We define the addition of vector fields, say $X$ and $Y$, by
$$(X + Y)(p) := X(p) + Y(p),$$
and scaling by real numbers, by
$$(cX)(p) := cX(p).$$
Then the set $\mathfrak{X}(M)$ is a real vector space. If we define multiplication of a smooth vector field $X$ by a smooth function $f$ by
$$(fX)(p) := f(p)X(p),$$
then the expected algebraic properties hold making $\mathfrak{X}(M)$ a module over the ring $C^\infty(M)$ (see Appendix D). It should be clear how to define vector fields of class $C^r$ on $M$ and the set of these is denoted $\mathfrak{X}^r(M)$ (a module over $C^r(M)$). The notion of a vector field along a map is often useful.
Definition 2.68. Let $f : N \to M$ be a smooth map. A vector field along $f$ is a smooth map $X : N \to TM$ such that $\pi_{TM} \circ X = f$. A vector field along a regular submanifold $S \subset M$ is a vector field along the inclusion map $S \hookrightarrow M$. (Note that we include the case where $S$ is an open submanifold.) We let $\mathfrak{X}_f$ denote the space of vector fields along $f$. It is easy to check that for a smooth map $f : N \to M$, the set $\mathfrak{X}_f$ is a $C^\infty(N)$-module in a natural way.
We have seen how individual tangent vectors in TpM can be identified as derivations at p. The derivation idea can be globalized. We explain how we may view vector fields as derivations.
Definition 2.69. Let $M$ be a smooth manifold. A (global) derivation on $C^\infty(M)$ is a linear map $\mathcal{D} : C^\infty(M) \to C^\infty(M)$ such that
$$\mathcal{D}(fg) = \mathcal{D}(f)g + f\mathcal{D}(g).$$
We denote the set of all such derivations of $C^\infty(M)$ by $\mathrm{Der}(C^\infty(M))$. Notice the difference between a derivation in this sense and a derivation at a point.
Definition 2.70. To a vector field $X$ on $M$, we associate the map $\mathcal{L}_X : C^\infty(M) \to C^\infty(M)$ defined by
$$(\mathcal{L}_Xf)(p) := X_pf.$$
$\mathcal{L}_X$ is called the Lie derivative on functions.
It is important to notice that $(\mathcal{L}_Xf)(p) = X_p \cdot f = df(X_p)$ for any $p$ and so $\mathcal{L}_Xf = df \circ X$. If $X$ is a vector field on an open set $U$, and if $f$ is a function on a domain $V \subset U$, then we take $\mathcal{L}_Xf$ to be the function defined on $V$ by $p \mapsto X_pf$ for all $p \in V$. It is easy to see that we have $\mathcal{L}_{aX+bY} = a\mathcal{L}_X + b\mathcal{L}_Y$ for $a, b \in \mathbb{R}$ and $X, Y \in \mathfrak{X}(M)$.
Lemma 2.71. Let $U \subset M$ be an open set and $X \in \mathfrak{X}(M)$. If $\mathcal{L}_Xf = 0$ for all $f \in C^\infty(U)$, then $X|_U = 0$.

Proof. Let $p \in U$ be given. Working locally in a chart $(V, x)$, let $X = \sum X^i \partial/\partial x^i$. We may assume $p \in V \subset U$. Using a cut-off function we may find functions $f^i$ defined on $U$ such that $f^i$ coincides with $x^i$ on a neighborhood of $p$. Then we have $X^i(p) = X_px^i = X_pf^i = (\mathcal{L}_Xf^i)(p) = 0$. Thus $X(p) = 0$ for an arbitrary $p \in U$. □
The next result is a very important characterization of smooth vector fields. In particular, it paves the way for the definition of the bracket of vector fields which plays a central role in differential geometry.

Theorem 2.72. For $X \in \mathfrak{X}(M)$, we have $\mathcal{L}_X \in \mathrm{Der}(C^\infty(M))$, and if $\mathcal{D} \in \mathrm{Der}(C^\infty(M))$, then $\mathcal{D} = \mathcal{L}_X$ for a uniquely determined $X \in \mathfrak{X}(M)$.

Proof. That $\mathcal{L}_X$ is in $\mathrm{Der}(C^\infty(M))$ follows from the Leibniz law, in other words, from the fact that $X_p$ is a derivation at $p$ for each $p$. If we are given a derivation $\mathcal{D}$, we define a derivation $X_p$ at $p$ (i.e. a tangent vector) by the rule $X_pf := (\mathcal{D}f)(p)$. We need to show that the assignment $p \mapsto X_p$ is smooth. Recall that any locally defined function can be extended to a global one by using a cut-off function. Because of this, it suffices to show that $p \mapsto X_pf$ is smooth for any $f \in C^\infty(M)$. But this is clear since $X_pf := (\mathcal{D}f)(p)$ and $\mathcal{D}f \in C^\infty(M)$. Suppose now that $\mathcal{D} = \mathcal{L}_{X_1} = \mathcal{L}_{X_2}$. Notice that $\mathcal{L}_{X_1} - \mathcal{L}_{X_2} = \mathcal{L}_{X_1 - X_2}$ and so $\mathcal{L}_{X_1 - X_2}$ is the zero derivation on $C^\infty(M)$. By Lemma 2.71, we have $X_1 - X_2 = 0$. □
Because of this theorem, we can identify $\mathrm{Der}(C^\infty(M))$ with $\mathfrak{X}(M)$ and we can and often will write $Xf$ in place of $\mathcal{L}_Xf$:
$$Xf := \mathcal{L}_Xf.$$
The derivation law (also called the Leibniz law) $\mathcal{L}_X(fg) = g\mathcal{L}_Xf + f\mathcal{L}_Xg$ becomes simply $X(fg) = gXf + fXg$. Another thing worth noting is that if we have a derivation of $C^\infty(M)$, then from our discussion above we know that it corresponds to a vector field. As such, it can be restricted to any open set $U \subset M$, and thus we get a derivation of $C^\infty(U)$. If $f \in C^\infty(U)$ we write $Xf$ instead of the more pedantic $X|_Uf$. While it makes sense to talk of vector fields on $M$ of differentiability $r$ where $0 < r < \infty$ and these do act as derivations on $C^r(M)$, it is only in the smooth case ($r = \infty$) that we can say that vector fields account for all derivations of $C^r(M)$.

Theorem 2.73. If $\mathcal{D}_1, \mathcal{D}_2 \in \mathrm{Der}(C^\infty(M))$, then $[\mathcal{D}_1, \mathcal{D}_2] \in \mathrm{Der}(C^\infty(M))$ where
$$[\mathcal{D}_1, \mathcal{D}_2] := \mathcal{D}_1 \circ \mathcal{D}_2 - \mathcal{D}_2 \circ \mathcal{D}_1 .$$
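Proof. We compute
$$\mathcal{D}_1(\mathcal{D}_2(fg)) = \mathcal{D}_1(\mathcal{D}_2(f)g + f\mathcal{D}_2(g)) = (\mathcal{D}_1\mathcal{D}_2f)g + \mathcal{D}_2f\,\mathcal{D}_1g + \mathcal{D}_1f\,\mathcal{D}_2g + f\mathcal{D}_1\mathcal{D}_2g.$$
Writing out the similar expression for $\mathcal{D}_2(\mathcal{D}_1(fg))$ and then subtracting we obtain, after a cancellation,
$$[\mathcal{D}_1, \mathcal{D}_2](fg) = (\mathcal{D}_1\mathcal{D}_2f)g + f\mathcal{D}_1\mathcal{D}_2g - \left((\mathcal{D}_2\mathcal{D}_1f)g + f\mathcal{D}_2\mathcal{D}_1g\right) = ([\mathcal{D}_1, \mathcal{D}_2]f)g + f[\mathcal{D}_1, \mathcal{D}_2]g.$$
□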
Corollary 2.74. If $X, Y \in \mathfrak{X}(M)$, then there is a unique vector field $[X, Y]$ such that $\mathcal{L}_{[X,Y]} = \mathcal{L}_X \circ \mathcal{L}_Y - \mathcal{L}_Y \circ \mathcal{L}_X$.

Since $\mathcal{L}_Xf$ is also written $Xf$, we have $[X, Y]f = X(Yf) - Y(Xf)$ or
$$[X, Y] = XY - YX.$$
Definition 2.75. The vector field $[X, Y]$ from the previous corollary is called the Lie bracket of $X$ and $Y$.

Proposition 2.76. The map $(X, Y) \mapsto [X, Y]$ is bilinear over $\mathbb{R}$, and for $X, Y, Z \in \mathfrak{X}(M)$ we have

(i) $[X, Y] = -[Y, X]$;

(ii) $[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0$ (Jacobi Identity);

(iii) $[fX, gY] = fg[X, Y] + f(Xg)Y - g(Yf)X$ for all $f, g \in C^\infty(M)$.

Proof. These results follow from direct calculation and the previously mentioned fact that $\mathcal{L}_{aX+bY} = a\mathcal{L}_X + b\mathcal{L}_Y$ for $a, b \in \mathbb{R}$ and $X, Y \in \mathfrak{X}(M)$. □
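The map $(X, Y) \mapsto [X, Y]$ is bilinear over $\mathbb{R}$, but by (iii) above, it is not bilinear over $C^\infty(M)$. Also notice that in (ii) above, $X, Y, Z$ are permuted cyclically. We ought to see what the local formula for the Lie derivative looks like in conventional "index" notation. Suppose we have $X = \sum X^i \frac{\partial}{\partial x^i}$ and $Y = \sum Y^i \frac{\partial}{\partial x^i}$. Then we have the local formula
$$[X, Y] = \sum_{i,j} \left(X^j \frac{\partial Y^i}{\partial x^j} - Y^j \frac{\partial X^i}{\partial x^j}\right) \frac{\partial}{\partial x^i}.$$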
Exercise 2.77. Verify this last formula.
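As a numerical sanity check (not a substitute for the exercise), the standard component formula $[X, Y]^i = \sum_j (X^j \partial_j Y^i - Y^j \partial_j X^i)$ can be evaluated with central differences. This is a sketch under the assumption that the fields are given by their component functions in a single chart on $\mathbb{R}^n$; the helper names `lie_bracket`, `rotation` and `d_dx` are illustrative.

```python
def lie_bracket(X, Y, p, h=1e-5):
    """Components of [X, Y] at p from [X,Y]^i = sum_j (X^j dY^i/dx^j - Y^j dX^i/dx^j)."""
    n = len(p)

    def partial(F, j):
        # central difference of all components of F along coordinate j
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        return [(a - b) / (2 * h) for a, b in zip(F(plus), F(minus))]

    Xp, Yp = X(p), Y(p)
    dX = [partial(X, j) for j in range(n)]  # dX[j][i] approximates dX^i/dx^j
    dY = [partial(Y, j) for j in range(n)]
    return [sum(Xp[j] * dY[j][i] - Yp[j] * dX[j][i] for j in range(n))
            for i in range(n)]

def rotation(q):
    """X = -y d/dx + x d/dy"""
    return [-q[1], q[0]]

def d_dx(q):
    """Y = d/dx"""
    return [1.0, 0.0]

point = [0.3, -1.2]
bracket = lie_bracket(rotation, d_dx, point)      # approximately [0, -1], i.e. -d/dy
reverse = lie_bracket(d_dx, rotation, point)      # approximately [0, 1]
```

For these two fields a hand computation gives $[X, Y] = -\partial/\partial y$, and swapping the arguments flips the sign, in line with Proposition 2.76(i).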
The $\mathbb{R}$-vector space $\mathfrak{X}(M)$ together with the $\mathbb{R}$-bilinear map $(X, Y) \mapsto [X, Y]$ is an example of an extremely important abstract algebraic structure:
Definition 2.78 (Lie algebra). A vector space $\mathfrak{a}$ (over a field $\mathbb{F}$) is called a Lie algebra if it is equipped with a bilinear map $\mathfrak{a} \times \mathfrak{a} \to \mathfrak{a}$ (a multiplication) denoted $(v, w) \mapsto [v, w]$ such that
$$[v, w] = -[w, v]$$
and such that we have the Jacobi identity
$$[x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0$$
for all $x, y, z \in \mathfrak{a}$.
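A familiar finite-dimensional instance of this definition is $\mathbb{R}^3$ with the cross product as bracket. The sketch below, with illustrative helper names and integer sample vectors so that the arithmetic is exact, checks antisymmetry and the Jacobi identity on one triple; it is a spot check, not a proof.

```python
def cross(v, w):
    """Bracket on R^3 given by the cross product."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def add(*vectors):
    """Componentwise sum of any number of vectors."""
    return tuple(sum(components) for components in zip(*vectors))

x, y, z = (1, 2, 3), (-4, 0, 5), (2, -7, 1)

# [x,[y,z]] + [y,[z,x]] + [z,[x,y]]
jacobi = add(cross(x, cross(y, z)), cross(y, cross(z, x)), cross(z, cross(x, y)))

# [x,y] + [y,x]
antisymmetry = add(cross(x, y), cross(y, x))
```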
Definition 2.79. A Lie algebra $\mathfrak{a}$ is called abelian (or commutative) if $[v, w] = 0$ for all $v, w \in \mathfrak{a}$. A subspace $\mathfrak{h}$ of $\mathfrak{a}$ is called a Lie subalgebra if it is closed under the bracket operation, and it is called an ideal if $[v, w] \in \mathfrak{h}$ for any $v \in \mathfrak{a}$ and $w \in \mathfrak{h}$. (We indicate this by writing $[\mathfrak{a}, \mathfrak{h}] \subset \mathfrak{h}$.)

Notice that the Jacobi identity may be restated as $[x, [y, z]] = [[x, y], z] + [y, [x, z]]$, which just says that for fixed $x$ the map $y \mapsto [x, y]$ is a derivation of the Lie algebra $\mathfrak{a}$. This is significant mathematically and also an easy way to remember the Jacobi identity. The Lie algebra $\mathfrak{X}(M)$ is infinite-dimensional (unless $M$ is zero-dimensional), but later we will be very interested in certain finite-dimensional Lie algebras which are subalgebras of $\mathfrak{X}(M)$.

Given a diffeomorphism $\phi : M \to N$, we define the pull-back $\phi^*Y \in \mathfrak{X}(M)$ for $Y \in \mathfrak{X}(N)$ and the push-forward $\phi_*X \in \mathfrak{X}(N)$ of $X \in \mathfrak{X}(M)$ by $\phi$ by
$$\phi^*Y = T\phi^{-1} \circ Y \circ \phi \quad\text{and}\quad \phi_*X = T\phi \circ X \circ \phi^{-1}.$$
In other words, $(\phi^*Y)(p) = T\phi^{-1} \cdot Y_{\phi(p)}$ and $(\phi_*X)(p) = T\phi \cdot X_{\phi^{-1}(p)}$. Notice that $\phi^*Y$ and $\phi_*X$ are both smooth vector fields.

Warning: Since many authors use the notation $f_*$ for the tangent map $Tf$, the notation $f_*X$ might be interpreted to mean $Tf \circ X$, which is actually a vector field along the map $f$ rather than an element of $\mathfrak{X}(N)$. We shall not use $f_*$ as a notation for $Tf$.
To summarize a bit, if $f : M \to N$ is a smooth map, then for each $p$ we have the tangent map $T_pf : T_pM \to T_{f(p)}N$, the tangent lift $Tf : TM \to TN$ (a "bundle map"), and if $f$ is a diffeomorphism, we have the induced maps on the level of fields $f_* : \mathfrak{X}(M) \to \mathfrak{X}(N)$ and $f^* : \mathfrak{X}(N) \to \mathfrak{X}(M)$. Notice that if $\phi : M \to N$ and $\psi : N \to P$ are diffeomorphisms, then we have
$$(\psi \circ \phi)_* = \psi_* \circ \phi_* : \mathfrak{X}(M) \to \mathfrak{X}(P),$$
$$(\psi \circ \phi)^* = \phi^* \circ \psi^* : \mathfrak{X}(P) \to \mathfrak{X}(M).$$
We have right and left actions of the diffeomorphism group $\mathrm{Diff}(M)$ on the space of vector fields. The left action $\mathrm{Diff}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$ is given by $(\phi, X) \mapsto \phi_*X$, and the right action $\mathfrak{X}(M) \times \mathrm{Diff}(M) \to \mathfrak{X}(M)$ is given by $(X, \phi) \mapsto \phi^*X$.
On functions, the pull-back is defined by $\phi^*g := g \circ \phi$ for any smooth map, but if $\phi$ is a diffeomorphism, then we can also define a push-forward $\phi_* := (\phi^{-1})^*$. With this notation we have the following proposition.

Proposition 2.80. The Lie derivative on functions is natural with respect to pull-back and push-forward by diffeomorphisms. In other words, if $\phi : M \to N$ is a diffeomorphism and $f \in C^\infty(M)$, $g \in C^\infty(N)$, $X \in \mathfrak{X}(M)$ and $Y \in \mathfrak{X}(N)$, then
$$\mathcal{L}_{\phi^*Y}\phi^*g = \phi^*\mathcal{L}_Yg$$
and
$$\mathcal{L}_{\phi_*X}\phi_*f = \phi_*\mathcal{L}_Xf.$$
Proof. We use Definition 2.19. For any $p$ we have
$$(\mathcal{L}_{\phi^*Y}\phi^*g)(p) = (\phi^*Y)_p\,\phi^*g = (T\phi^{-1} \circ Y \circ \phi)_p\,[g \circ \phi] = (T\phi^{-1} \cdot Y_{\phi(p)})[g \circ \phi]$$
$$= T\phi\left(T\phi^{-1}Y_{\phi(p)}\right)g = Y_{\phi(p)}g = (\mathcal{L}_Yg)(\phi(p)) = (\phi^*\mathcal{L}_Yg)(p).$$
The second statement follows from the first since $\phi_* = (\phi^{-1})^*$. □
Even if $f : M \to N$ is not a diffeomorphism, it may still be that there is a vector field $Y \in \mathfrak{X}(N)$ such that
$$Tf \circ X = Y \circ f.$$
In other words, it may happen that $Tf \cdot X_p = Y_{f(p)}$ for all $p$ in $M$. In this case, we say that $Y$ is $f$-related to $X$ and write $X \sim_f Y$. It is not hard to check that if $X_i$ is $f$-related to $Y_i$ for $i = 1, 2$, then $aX_1 + bX_2$ is $f$-related to $aY_1 + bY_2$.
Example 2.81. Let $M$ and $N$ be smooth manifolds and consider the projections $\mathrm{pr}_1 : M \times N \to M$ and $\mathrm{pr}_2 : M \times N \to N$. Since $T_{(p,q)}(M \times N)$ can be identified with $T_pM \times T_qN$, we see that for $X \in \mathfrak{X}(M)$, $Y \in \mathfrak{X}(N)$ we obtain a vector field $X \times Y \in \mathfrak{X}(M \times N)$ defined by $(X \times Y)(p, q) = (X(p), Y(q))$. Then one can check that $X \times Y$ and $X$ are $\mathrm{pr}_1$-related and $X \times Y$ and $Y$ are $\mathrm{pr}_2$-related.

Exercise 2.82. Let $M$, $N$, $X$, $Y$ and $X \times Y$ be as in the example above. Show that if $\iota_q : M \to M \times N$ is the insertion map $p \mapsto (p, q)$, then $X$ and $X \times Y$ are $\iota_q$-related if and only if $Y(q) = 0$.
Lemma 2.83. Suppose that $f : M \to N$ is a smooth map, $X \in \mathfrak{X}(M)$ and $Y \in \mathfrak{X}(N)$. Then $X$ and $Y$ are $f$-related if and only if $X(g \circ f) = (Yg) \circ f$ for all $g \in C^\infty(N)$.

Proof. Let $p \in M$ and let $g \in C^\infty(N)$. Then
$$X(g \circ f)(p) = X_p(g \circ f) = (T_pf \cdot X_p)g$$
and
$$(Yg \circ f)(p) = Y_{f(p)}g,$$
so that $X(g \circ f) = (Yg) \circ f$ for all such $g$ if and only if $T_pf \cdot X_p = Y_{f(p)}$. □
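The criterion of Lemma 2.83 is easy to test numerically on a map that is not a diffeomorphism. The sketch below assumes the illustrative choices $f(x, y) = x^2 + y^2$ from $\mathbb{R}^2$ to $\mathbb{R}$, the Euler field $X = x\,\partial/\partial x + y\,\partial/\partial y$, and the candidate $Y = 2s\,d/ds$ on $\mathbb{R}$; derivatives are approximated by central differences.

```python
import math

def f(p):
    """Smooth map f : R^2 -> R, f(x, y) = x^2 + y^2 (an illustrative choice)."""
    return p[0] ** 2 + p[1] ** 2

def X(p):
    """Euler field X = x d/dx + y d/dy on R^2."""
    return (p[0], p[1])

def Y(s):
    """Candidate f-related field Y = 2s d/ds on R (single component)."""
    return 2.0 * s

def apply_X(func, p, h=1e-6):
    """X_p(func), using central differences for the partial derivatives of func."""
    total = 0.0
    for j, component in enumerate(X(p)):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        total += component * (func(plus) - func(minus)) / (2 * h)
    return total

def apply_Y(g, s, h=1e-6):
    """Y_s(g), using a central difference for g'."""
    return Y(s) * (g(s + h) - g(s - h)) / (2 * h)

g = math.sin
p = (0.7, -0.4)
lhs = apply_X(lambda q: g(f(q)), p)   # X(g o f)(p)
rhs = apply_Y(g, f(p))                # (Y g)(f(p))
```

Here $T_pf \cdot X_p = 2x \cdot x + 2y \cdot y = 2f(p)$, which is exactly $Y_{f(p)}$, so the two numbers should agree up to discretization error.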
Proposition 2.84. If $f : M \to N$ is a smooth map and $X_i$ is $f$-related to $Y_i$ for $i = 1, 2$, then $[X_1, X_2]$ is $f$-related to $[Y_1, Y_2]$. In particular, if $\phi$ is a diffeomorphism, then $[\phi_*X_1, \phi_*X_2] = \phi_*[X_1, X_2]$ for all $X_1, X_2 \in \mathfrak{X}(M)$.

Proof. We use the previous lemma: Let $g \in C^\infty(N)$. Then $X_1X_2(g \circ f) = X_1((Y_2g) \circ f) = (Y_1Y_2g) \circ f$. In the same way, $X_2X_1(g \circ f) = (Y_2Y_1g) \circ f$ and subtracting we obtain
$$[X_1, X_2](g \circ f) = X_1X_2(g \circ f) - X_2X_1(g \circ f) = (Y_1Y_2g) \circ f - (Y_2Y_1g) \circ f = ([Y_1, Y_2]g) \circ f.$$
Using the lemma one more time, we have the result. □
If $S$ is a submanifold of $M$ and $X \in \mathfrak{X}(M)$, then the restriction $X|_S \in \mathfrak{X}(S)$ defined by $X|_S(p) = X(p)$ for all $p \in S$ is $\iota$-related to $X$ where $\iota : S \hookrightarrow M$ is the inclusion map. Thus for $X, Y \in \mathfrak{X}(M)$ we always have that $[X|_S, Y|_S]$ is $\iota$-related to $[X, Y]$. This just means that $[X, Y](p) = [X|_S, Y|_S](p)$ for all $p \in S$. We also have
Proposition 2.85. Let $f : M \to N$ be a smooth map and suppose that $X \sim_f Y$. Then we have $\mathcal{L}_X(f^*g) = f^*\mathcal{L}_Yg$ for any $g \in C^\infty(N)$.
The proof is similar to what we did above and is left to the reader.

2.8.1. Integral curves and flows. Recall that if $c : I \to M$ is a smooth curve, then the velocity at "time" $t$ is
$$\frac{d}{dt}c(t) = \dot{c}(t) = T_tc \cdot \left.\frac{\partial}{\partial u}\right|_t ,$$
where $\frac{\partial}{\partial u}$ is the standard field on $\mathbb{R}$ given at $a \in \mathbb{R}$ as the equivalence class of the curve $t \mapsto a + t$ or by the derivation $\left.\frac{\partial}{\partial u}\right|_a f = f'(a)$.
Definition 2.86. Let $X$ be a smooth vector field on $M$. A curve $c : I \to M$ is called an integral curve for $X$ if for all $t \in I$, the velocity of $c$ at time $t$ is equal to $X(c(t))$, that is, if
$$\dot{c} = X \circ c.$$
Thus if $c$ is an integral curve for $X$ and $f$ is a smooth function, then $X_{c(t)}f = (f \circ c)'(t)$ for all $t$ in the domain of $c$. If the image of an integral curve $c$ lies in $U$ for a chart $(U, x)$, and if $X = \sum X^i \frac{\partial}{\partial x^i}$, then $\dot{c} = X \circ c$ gives the local expressions
$$\frac{d}{dt}\left(x^i \circ c\right) = X^i \circ c \quad\text{for } i = 1, \ldots, n,$$
which constitute a system of ordinary differential equations for the functions $x^i \circ c$. These equations are classically written as $\frac{dx^i}{dt} = X^i$.
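In a single chart, finding an integral curve is therefore the same as solving an initial value problem, which can be approximated numerically. The following sketch assumes the rotation field $X = -y\,\partial/\partial x + x\,\partial/\partial y$ on $\mathbb{R}^2$ in standard coordinates and uses a classical fourth-order Runge-Kutta step; the helper names and step count are illustrative choices, not part of the text.

```python
import math

def X(p):
    """Components (X^1, X^2) of the rotation field -y d/dx + x d/dy."""
    return (-p[1], p[0])

def rk4_step(field, p, dt):
    """One classical Runge-Kutta step for the system dx^i/dt = X^i."""
    def shifted(q, k, s):
        return tuple(qi + s * ki for qi, ki in zip(q, k))
    k1 = field(p)
    k2 = field(shifted(p, k1, dt / 2))
    k3 = field(shifted(p, k2, dt / 2))
    k4 = field(shifted(p, k3, dt))
    return tuple(pi + dt * (a + 2 * b + 2 * c + d) / 6
                 for pi, a, b, c, d in zip(p, k1, k2, k3, k4))

def integral_curve(field, p0, t_end, steps=1000):
    """Approximate c(t_end) for the integral curve with c(0) = p0."""
    p, dt = p0, t_end / steps
    for _ in range(steps):
        p = rk4_step(field, p, dt)
    return p

end = integral_curve(X, (1.0, 0.0), 1.0)
```

For this field the integral curve through $(1, 0)$ is $c(t) = (\cos t, \sin t)$, so `end` should be close to $(\cos 1, \sin 1)$.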
A (complete) flow is a map
X~(p) =
:t
10
E
TpM for p E M.
If one computes the velocity vector C(O) of the curve c : t H 0, an open set V with Xo EVe U, and a smooth map
such that $t \mapsto c_x(t) := \varphi(t, x)$ is a curve satisfying $c_x'(t) = F(c_x(t))$ for all $t \in (-a, a)$ and $c_x(0) = x$.
Example 2.88. Consider the differential equation on the line given by
$$c'(t) = (c(t))^{2/3}.$$
There are two distinct solutions with initial condition $c(0) = 0$. Namely,
$$c(t) = 0 \text{ for all } t$$
and
$$c(t) = \frac{1}{27}t^3 \text{ for all } t.$$
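Both solutions can be checked numerically. In the sketch below, $x^{2/3}$ is read as the real number $(x^2)^{1/3}$ so that it makes sense for negative $x$ (an assumption about the intended meaning), and the derivative $c'$ is approximated by a central difference; the sample times are arbitrary.

```python
def F(x):
    """x^(2/3), interpreted as (x^2)^(1/3) so it is defined for x < 0 as well."""
    return abs(x) ** (2.0 / 3.0)

def c_zero(t):
    return 0.0

def c_cubic(t):
    return t ** 3 / 27.0

def residual(c, t, h=1e-5):
    """|c'(t) - F(c(t))| with c'(t) replaced by a central difference."""
    return abs((c(t + h) - c(t - h)) / (2 * h) - F(c(t)))

sample_times = [-2.0, -0.5, 0.0, 0.5, 2.0]
worst = max(residual(c, t) for c in (c_zero, c_cubic) for t in sample_times)
```

A small value of `worst` indicates that both curves satisfy the equation at the sampled times, even though they share the initial value $c(0) = 0$ and differ for $t \neq 0$.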
The reason uniqueness fails is the fact that the function $F(x) = x^{2/3}$ is not differentiable at $x = 0$. Now let $X \in \mathfrak{X}(M)$ and consider a point $p$ in the domain of a chart $(U, x)$. The local expression for the integral curve equation $\dot{c}(t) = X(c(t))$ is of the form treated in the last theorem, and so we see that there certainly exists an integral curve for $X$ through $p$ defined on at least some small interval $(-\epsilon, \epsilon)$. We will now use this theorem to obtain similar but more global results on smooth manifolds. First of all, we can get a more global version of uniqueness:

Lemma 2.89. If $c_1 : (-\epsilon_1, \epsilon_1) \to M$ and $c_2 : (-\epsilon_2, \epsilon_2) \to M$ are integral curves of a vector field $X$ with $c_1(0) = c_2(0)$, then $c_1 = c_2$ on the intersection of their domains.

Proof. Let $K = \{t \in (-\epsilon_1, \epsilon_1) \cap (-\epsilon_2, \epsilon_2) : c_1(t) = c_2(t)\}$. The set $K$ is closed since $M$ is Hausdorff. It follows from Theorem 2.87 that $K$ contains a (small) open interval $(-\epsilon, \epsilon)$. Let $t_0$ be any point in $K$ and consider the translated curves $c_1^{t_0}(t) = c_1(t_0 + t)$ and $c_2^{t_0}(t) = c_2(t_0 + t)$. These are also integral curves of $X$ and they agree at $t = 0$, and by Theorem 2.87 again we see that $c_1^{t_0} = c_2^{t_0}$ on some open neighborhood of $0$. But this means that $c_1$ and $c_2$ agree in this neighborhood, so in fact this neighborhood is contained in $K$ implying that $K$ is also open since $t_0$ was an arbitrary point in $K$. Thus, since $I = (-\epsilon_1, \epsilon_1) \cap (-\epsilon_2, \epsilon_2)$ is connected, it must be that $I = K$ and so $c_1$ and $c_2$ agree on $I = (-\epsilon_1, \epsilon_1) \cap (-\epsilon_2, \epsilon_2)$. □
Let X be a Coo vector field on M. A flow box for X at a point p E M is a triple (U, a,
(1) U is an open set in M containing p. (2)
(4) The map
$$\varphi_t^X \circ \varphi_s^X = \varphi_{t+s}^X = \varphi_s^X \circ \varphi_t^X$$
whenever defined. This is the local group property, so called because if
Notice that whereas $\frac{\partial}{\partial x}$ is a complete vector field on $\mathbb{R}^2$ (using standard coordinates $x, y$), this vector field restricted to $\mathbb{R}^2 \setminus \{0\}$ is not complete on the manifold $\mathbb{R}^2 \setminus \{0\}$. The reason is that integral curves starting at points on the $x$-axis will run up against the missing origin in finite time. This "running up to missing points" is not the only way a vector field can fail to be complete. Consider the vector field $(1 + x^2)\frac{\partial}{\partial x}$ on $\mathbb{R}$. The integral curve that is at the origin at time zero is obtained by solving the initial value problem $x' = 1 + x^2$, $x(0) = 0$. The unique solution is $x(t) = \tan t$, and since $\lim_{t \to \pm\pi/2} \tan t = \pm\infty$, the solution cannot be extended beyond $\pm\pi/2$.
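The finite-time escape can be observed numerically. This is a rough sketch using a hand-written Runge-Kutta integrator (the function names and step counts are illustrative assumptions): it integrates $x' = 1 + x^2$ from $x(0) = 0$ and compares with $\tan t$ at a moderate time and at a time close to $\pi/2$.

```python
import math

def rk4(f, x0, t_end, steps):
    """Integrate the autonomous scalar ODE x' = f(x) from t = 0 to t = t_end."""
    x, dt = x0, t_end / steps
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + dt * k1 / 2)
        k3 = f(x + dt * k2 / 2)
        k4 = f(x + dt * k3)
        x += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x

def field(x):
    """Coefficient of d/dx in the vector field (1 + x^2) d/dx."""
    return 1.0 + x * x

x_at_1 = rk4(field, 0.0, 1.0, 2000)           # compare with tan(1.0)
x_near_blowup = rk4(field, 0.0, 1.5, 20000)   # compare with tan(1.5), already large
```

Since $1.5$ is only about $0.07$ short of $\pi/2$, the second value is already much larger than the first, reflecting the blow-up of $\tan t$ as $t \to \pi/2$.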
Exercise 2.90. Show that on $\mathbb{R}^2$ the vector fields $y^2\frac{\partial}{\partial x}$ and $x^2\frac{\partial}{\partial y}$ are complete, but $y^2\frac{\partial}{\partial x} + x^2\frac{\partial}{\partial y}$ is not complete. In particular, the set of complete vector fields is not generally a vector space (but this is true if $M$ is compact).
Theorem 2.91 (Flow box). Let X be a Coo vector field on an n-manifold M with r ~ 1. Then for every point Po E M there exists a flow box for X at Po. If (U I , aI,
to be smaller so that the flow stays within the range of the chart map $x$. In this setting, a vector field can be taken to be a map $U \to \mathbb{R}^n$, so Theorem 2.87 provides us with the flow box data $(V, a, \varphi)$, with $a > 0$ small enough that $V_t = \varphi(t, V) \subset U$ for all $t \in (-a, a)$. Now the flow box is transferred back to the manifold via $x$,
$$U = x^{-1}(V), \qquad \varphi^X(t, p) = x^{-1}[\varphi(t, x(p))].$$
If we have two such flow boxes $(U_1, a_1, \varphi_1^X)$ and $(U_2, a_2, \varphi_2^X)$, then by Lemma 2.89, we see that for any $x \in U_1 \cap U_2$ we must have $\varphi_1^X(t, x) = \varphi_2^X(t, x)$ for all $t \in (-a_1, a_1) \cap (-a_2, a_2)$.

Finally, since $\varphi_t^X = \varphi^X(t, \cdot)$ and $\varphi_{-t}^X = \varphi^X(-t, \cdot)$ are both smooth and inverses of each other, we see that $\varphi_t^X$ is a diffeomorphism onto its image $U_t = x^{-1}(V_t)$. □

Lemma 2.92. Suppose that $X_1, \ldots, X_k$ are smooth vector fields on $M$ and let $p_0 \in M$ be given and $O$ be an open set containing $p_0$. If $\varphi^{X_1}, \ldots, \varphi^{X_k}$ are the local flows corresponding to flow boxes whose domains $U_1, \ldots, U_k$ all contain $p_0$, then there is an open set $U \subset U_1 \cap \cdots \cap U_k$ and an $\epsilon > 0$ such that the composition $\varphi_{t_k}^{X_k} \circ \cdots \circ \varphi_{t_1}^{X_1}$ is defined on $U$ and maps $U$ into $O$ whenever $t_1, \ldots, t_k \in (-\epsilon, \epsilon)$.
Proof. If the flow box corresponding to $\varphi^{X_i}$ is $(U_i, \epsilon_i, \varphi^{X_i})$, then by shrinking $U_1$ further we may arrange things so that $\varphi_t^{X_1}$ maps $U_1$ into $O$ for all $t \in (-\epsilon_1, \epsilon_1)$, and then inductively we arrange for $\varphi_t^{X_i}$ to map $U_i$ into $U_{i-1}$ for all $t \in (-\epsilon_i, \epsilon_i)$. Now let $\epsilon = \min\{\epsilon_1, \ldots, \epsilon_k\}$. □

Remark 2.93. When making compositions of local flows, we will not always make careful statements about domains, but the previous lemma will be invoked implicitly.

If $c_p(t)$ is an integral curve of $X$ defined on some interval $(a, b)$ containing $0$ and $c_p(0) = p$, then we may consider the limit
$$\lim_{t \to b^-} c_p(t).$$
If this limit exists as a point $p_1 \in M$, then we may consider the integral curve $c_{p_1}$ beginning at $p_1$. One may now use Lemma 2.89 to combine $t \mapsto c_p(t)$ with $t \mapsto c_{p_1}(t - b)$ to produce an extended integral curve beginning at $p$. We may repeat this process as long as the limit exists. We may do a similar thing in the negative direction. This suggests that there is a maximal integral curve defined on a maximal interval $J_p^X := (T_{p,X}^-, T_{p,X}^+)$, where $T_{p,X}^-$ might be $-\infty$ and $T_{p,X}^+$ might be $+\infty$. We produce this maximal integral curve as follows: Consider the collection $\mathcal{J}_p$ of all pairs $(J, \alpha)$, where $J$ is an open interval containing $0$ and $\alpha : J \to M$ is an integral curve of $X$ with $\alpha(0) = p$. Then let $J_p^X = \bigcup_{(J, \alpha) \in \mathcal{J}_p} J$ and define $c_{\max}(t) := \alpha(t)$ whenever $t \in J$ for $(J, \alpha) \in \mathcal{J}_p$. By existence and uniqueness, this definition is unambiguous. The curve $c_{\max} : J_p^X \to M$ is the desired maximal integral curve and is easily seen to be unique.
Definition 2.94. Let $X$ be a $C^\infty$ vector field on $M$. For any given $p \in M$, let $J_p^X := (T_{p,X}^-, T_{p,X}^+) \subset \mathbb{R}$ be the domain of the maximal integral curve $c : J_p^X \to M$ of $X$ with $c(0) = p$. The maximal flow $\varphi^X$ is defined on the set (called the maximal flow domain)
$$\mathcal{D}_X = \bigcup_{p \in M} J_p^X \times \{p\}$$
by the prescription that $t \mapsto \varphi^X(t, p)$ is the maximal integral curve of $X$ such that $\varphi^X(0, p) = p$.

Thus by definition, $X$ is a complete vector field if and only if $\mathcal{D}_X = \mathbb{R} \times M$. We will abbreviate $(T_{p,X}^-, T_{p,X}^+)$ to $(T_p^-, T_p^+)$.
Theorem 2.95. For $X \in \mathfrak{X}(M)$, the set $\mathcal{D}_X$ is an open neighborhood of $\{0\} \times M$ in $\mathbb{R} \times M$ and the map $\varphi^X : \mathcal{D}_X \to M$ is smooth. Furthermore,
$$\varphi^X(t + s, p) = \varphi^X(t, \varphi^X(s, p)) \tag{2.4}$$
whenever both sides are defined. If the right hand side is defined, then the left hand side is defined. Suppose that $t, s \geq 0$ or $t, s \leq 0$. Then if the left hand side is defined so is the right hand side.

Proof. Let $q = \varphi^X(s, p)$. If the right hand side is defined, then $s \in (T_p^-, T_p^+)$ and $t \in (T_q^-, T_q^+)$. The curve $\psi : \tau \mapsto \varphi^X(s + \tau, p)$ is defined for $\tau \in (T_p^- - s, T_p^+ - s)$, and this is the maximal domain for $\psi$. We have
$$\frac{d}{d\tau}\psi(\tau) = \frac{d}{d\tau}\varphi^X(s + \tau, p) = \left.\frac{d}{du}\right|_{u = s + \tau}\varphi^X(u, p) = X(\varphi^X(s + \tau, p)) = X(\psi(\tau)).$$
We also have $\psi(0) = \left.\varphi^X(\tau + s, p)\right|_{\tau = 0} = \varphi^X(s, p)$, so $\psi$ is an integral curve starting at $q = \varphi^X(s, p)$. Thus $(T_p^- - s, T_p^+ - s) \subset (T_q^-, T_q^+)$ and $\psi = \varphi^X(\cdot, q)$ on $(T_p^- - s, T_p^+ - s)$. But the maximal domain for $\psi$ is $(T_p^- - s, T_p^+ - s)$ and so in fact $(T_p^- - s, T_p^+ - s) = (T_q^-, T_q^+)$, for otherwise $\varphi^X(\cdot, q)$ would be a proper extension. But then, since $t \in (T_q^-, T_q^+)$, we have that $t \in (T_p^- - s, T_p^+ - s)$ and so $\psi(t) = \varphi^X(t + s, p)$ is defined and
$$\varphi^X(t + s, p) = \varphi^X(t, \varphi^X(s, p)).$$
Now let us assume that $t, s \geq 0$ and that $\varphi^X(s + t, p)$ is defined (the case of $t, s \leq 0$ is similar). Then since $s, t \geq 0$, we have $s, t, t + s \in (T_p^-, T_p^+)$. Let $q = \varphi^X(s, p)$ as before and let $\theta(u) = \varphi^X(s + u, p)$ be defined for $u$ with $0 \leq u \leq t$. But $\theta(u)$ is an integral curve with $\theta(0) = q$. Thus we have that $\varphi^X(u, q)$ must also be defined for $u = t$ and $\theta(t) = \varphi^X(t, q)$. But $\varphi^X(t, q) = \varphi^X(t, \varphi^X(s, p))$, which is thereby defined, and we have
$$\varphi^X(s + t, p) = \theta(t) = \varphi^X(t, \varphi^X(s, p)).$$
Now we will show that $\mathcal{D}_X$ is open in $\mathbb{R} \times M$ and that $\varphi^X : \mathcal{D}_X \to M$ is smooth. We carefully define a subset $S \subset \mathcal{D}_X$ by the condition that $(t, p) \in S$ exactly if there exists an interval $J$ containing $0$ and $t$ and also an open set $U \subset M$ such that the restriction of $\varphi^X$ to $J \times U$ is smooth. Notice that $S$ is open by construction. We intend to show that $S = \mathcal{D}_X$. Suppose not. Then let $(t_0, p_0) \in \mathcal{D}_X \cap S^c$. We will assume that $t_0 > 0$ since the case $t_0 < 0$ is proved in a similar way. Now let $\tau := \sup\{t : (t, p_0) \in S\}$. We know that $(0, p_0)$ is contained in some flow box and so $\tau > 0$. We also have $\tau \leq t_0$ by the definition of $t_0$. Thus $\tau \in J_{p_0}^X$ and we define $q_0 := \varphi^X(\tau, p_0)$. Now applying the local theory we know that $q_0$ is contained in an open set $U_0$ such that $\varphi^X$ is defined and smooth on $(-\epsilon, \epsilon) \times U_0$ for some $\epsilon > 0$. We will now show that $\varphi^X$ is actually defined and smooth on a set of the form $(-\delta, r) \times O$ where $O$ is open, $\delta, r > 0$, and $(\tau, p_0) \in (-\delta, r) \times O$. Since this contradicts the definition of $\tau$, we will be done. We may choose $t_1 > 0$ so that $\tau \in (t_1, t_1 + \epsilon)$ and so that $\varphi^X(t_1, p_0) \in U_0$. Note that $(t_1, p_0) \in S$ since $t_1 < \tau$. So on some neighborhood $(-\delta, t_1 + \delta) \times U_1$ of $(t_1, p_0)$ the flow $\varphi^X$ is smooth. By choosing $U_1$ smaller if necessary we can arrange that $\varphi^X(\{t_1\} \times U_1) \subset U_0$. Now consider the equation
$$\varphi^X(t, p) = \varphi^X(t - t_1, \varphi^X(t_1, p)).$$
If $|t - t_1| < \epsilon$ and $p \in U_1$, then both sides are defined and the right hand side is smooth near such $(t, p)$. But $\varphi^X$ is already known to be smooth on $(-\delta, t_1 + \delta) \times U_1$. We now see that $\varphi^X$ is smooth on $(-\delta, t_1 + \epsilon) \times U_1$, which contains $(\tau, p_0)$, contradicting the definition of $\tau$. Thus $S = \mathcal{D}_X$. □

Remark 2.96. In this text, $\varphi^X$ will either refer to the unique maximal flow defined on $\mathcal{D}_X$ or to its restriction to the domain of a flow box. In the latter case we call $\varphi^X$ a local flow. We could have introduced notation such as $\varphi_{\max}^X$, but prefer not to clutter up the notation to that extent unless necessary. We hope that the reader will be able to tell from context what we are referring to when we write $\varphi^X$.
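The statement about when the two sides of (2.4) are defined can be illustrated on a field whose maximal flow is known in closed form. The sketch below assumes the example $X = x^2\,\frac{d}{dx}$ on $\mathbb{R}$, whose maximal integral curve through $p$ is $t \mapsto p/(1 - tp)$ on the set where $1 - tp > 0$; the function names are illustrative.

```python
def in_domain(t, p):
    """(t, p) lies in the maximal flow domain of X = x^2 d/dx iff 1 - t*p > 0."""
    return 1.0 - t * p > 0.0

def flow(t, p):
    """Maximal flow of X = x^2 d/dx: solves x' = x^2 with x(0) = p."""
    if not in_domain(t, p):
        raise ValueError("(t, p) is outside the maximal flow domain")
    return p / (1.0 - t * p)

p, s, t = 1.0, 0.25, 0.5
lhs = flow(t + s, p)          # phi(t + s, p)
rhs = flow(t, flow(s, p))     # phi(t, phi(s, p))

# With opposite signs the left hand side can exist while the right hand side
# does not: for p = 1 the maximal interval is (-infinity, 1).
s_bad, t_bad = 2.0, -1.5
left_defined = in_domain(t_bad + s_bad, p)   # t + s = 0.5 lies in the interval
inner_defined = in_domain(s_bad, p)          # s = 2 does not
```

With $t, s \geq 0$ both sides exist and agree, while the second pair shows why the sign hypothesis in the theorem cannot simply be dropped.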
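If $\varphi^X$ is a flow of $X$, then we write $\varphi_t^X$ for the map $p \mapsto \varphi^X(t, p)$. The (maximal) domain of this map is $\mathcal{D}_t^X = \{p : t \in (T_{p,X}^-, T_{p,X}^+)\}$. Note that, in general, the domain of $\varphi_t^X$ depends on $t$. Also, writing $\varphi_p^X := \varphi^X(\cdot, p)$, we have the tangent map $T_0\varphi_p^X : T_0\mathbb{R} \to T_pM$ and
$$\left.\frac{d}{dt}\right|_{t=0}\varphi_p^X(t) = T_0\varphi_p^X \left.\frac{\partial}{\partial u}\right|_0 = X_p,$$
where $\left.\frac{\partial}{\partial u}\right|_0$ is the vector at $0$ associated to the standard coordinate function on $\mathbb{R}$ (denoted by $u$ here).

Exercise 2.97. Let $s$ and $t$ be real numbers. Show that the domain of $\varphi_s^X \circ \varphi_t^X$ is contained in $\mathcal{D}_{s+t}^X$ and show that for each $t$, $\mathcal{D}_t^X$ is open. Show that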
Definition 2.98. The support of a vector field $X$ is the closure of the set $\{p : X(p) \neq 0\}$ and is denoted $\mathrm{supp}(X)$.

Lemma 2.99. Every vector field that has compact support is a complete vector field. In particular, if $M$ is compact, then every vector field is complete.
Proof. Let $c_p^X$ be the maximal integral curve through $p$ and $J_p^X = (T^-, T^+)$ its domain. If $X(p) = 0$, then the constant curve, $c(t) = p$ for all $t$, is the unique integral curve through $p$ and is defined for all $t$ so $J_p^X = \mathbb{R}$. Now suppose $X(p) \neq 0$. If $t \in (T^-, T^+)$, then the image point $c_p^X(t)$ must always lie in the support of $X$. Indeed, since $X(p)$ is not zero, $c_p^X$ is not constant. If $c_p^X(t)$ were in $M \setminus \mathrm{supp}(X)$, then $c_p^X(t)$ would be contained in some ball on which $X$ vanishes and then by uniqueness this implies that $c_p^X$ is constantly equal to $c_p^X(t)$ for all time, a contradiction. But we show that if $T^+ < \infty$, then given any compact set $K \subset M$, for example the support of $X$, there is an $\epsilon > 0$ such that for all $t \in (T^+ - \epsilon, T^+)$, the image $c_p^X(t)$ is outside $K$. If not, then we may take a sequence $t_i$ converging to $T^+$ such that $c_p^X(t_i) \in K$. But then going to a subsequence if necessary, we have $x_i := c_p^X(t_i) \to x \in K$. Now there must be a flow box $(U, a, \varphi^X)$ at $x$, so that for large enough $k$, we have that $t_k$ is within distance $a$ of $T^+$ and $x_k = c_p^X(t_k)$ is inside $U$. We are then guaranteed to have an integral curve $c_{x_k}^X(t)$ of $X$ that continues beyond $T^+$ and thus can be used to extend $c_p^X$, which is a contradiction of the maximality of $T^+$. Hence we must have $T^+ = \infty$. A similar argument gives the result that $T^- = -\infty$. □
Exercise 2.100. Let $a > 0$ be any fixed positive real number. Show that if for a given vector field $X$, the flow $\varphi^X$ is defined on $(-a, a) \times M$, then in fact the (maximal) flow is defined on $\mathbb{R} \times M$ and so $X$ is a complete vector field.
Figure 2.3. Isolated vanishing points

Figure 2.4. Straightening
If a vector field is zero at some point but nonzero elsewhere in a neighborhood of that point, then we say that the field has an isolated zero. The structure of a vector field near such an isolated zero can be quite complex and interesting. The qualitative structure of three such possibilities is shown in Figure 2.3 for dimension two. Near zeros that are not isolated, the situation is also potentially complex. On the other hand, at non-vanishing points, all vector fields are the same up to a local diffeomorphism. This is the content of the following theorem which is sometimes called the straightening theorem.

Theorem 2.101. Let $X$ be a smooth vector field on $M$ with $X(p) \neq 0$ for some $p \in M$. Then there is a chart $(U, x)$ with $p \in U$ such that
$$X = \frac{\partial}{\partial x^1}$$
on $U$.
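Proof. Since this is clearly a local problem, it will suffice to assume that $M = \mathbb{R}^n$ and $p = 0$. Let $(u^1, \ldots, u^n)$ be standard coordinates on $\mathbb{R}^n$. By a rotation and translation if necessary, we may assume that $X(0) = \left.\frac{\partial}{\partial u^1}\right|_0$. The idea is that there must be a unique integral curve through each point of the hyperplane $\{u^1 = 0\}$. We wish to arrange for the new coordinates of $q \in \mathbb{R}^n$ to be such that if an integral curve of $X$ passes through $(0, a^2, \ldots, a^n)$ at time zero and hits $q$ at time $t$, then $x^i(q) = a^i$ for $i = 2, \ldots, n$ while $x^1(q) = t$. Let $\varphi$ be a local flow for $X$ near $0$, and define $\chi$ in some sufficiently small neighborhood of $0$ by
$$\chi(a^1, \ldots, a^n) := \varphi_{a^1}(0, a^2, \ldots, a^n).$$
For $a = (a^1, a^2, \ldots, a^n)$ in the domain of $\chi$, and $f \in C^\infty(M)$, we have
$$T\chi \cdot \left.\frac{\partial}{\partial u^1}\right|_a f = \left.\frac{\partial}{\partial u^1}\right|_a (f \circ \chi) = \lim_{h \to 0}\frac{1}{h}\left[f(\chi(a^1 + h, a^2, \ldots, a^n)) - f(\chi(a^1, a^2, \ldots, a^n))\right]$$
$$= \lim_{h \to 0}\frac{1}{h}\left[f(\varphi_{a^1 + h}(0, a^2, \ldots, a^n)) - f(\chi(a^1, a^2, \ldots, a^n))\right] = \lim_{h \to 0}\frac{1}{h}\left[f(\varphi_h(\chi(a))) - f(\chi(a))\right] = (Xf)(\chi(a)).$$
In particular, $T\chi \cdot \left.\frac{\partial}{\partial u^1}\right|_0 = \left.\frac{\partial}{\partial u^1}\right|_0$. If $i > 1$, then at $0$ we have
$$T\chi \cdot \left.\frac{\partial}{\partial u^i}\right|_0 f = \left.\frac{\partial}{\partial u^i}\right|_0 (f \circ \chi) = \lim_{h \to 0}\frac{1}{h}\left[f(\chi(0, \ldots, h, \ldots)) - f(0)\right] = \lim_{h \to 0}\frac{1}{h}\left[f(0, \ldots, h, \ldots) - f(0)\right] = \left.\frac{\partial}{\partial u^i}\right|_0 f.$$
Thus $T_0\chi = \mathrm{id}$ and so by the inverse mapping theorem (Theorem 2.25) we see that after restricting $\chi$ to a smaller neighborhood of zero, the map $x := \chi^{-1}$ is a chart map. We have already seen that $T\chi \cdot \frac{\partial}{\partial u^1} = X \circ \chi$. But then for $f \in C^\infty(M)$ we have
$$\left.\frac{\partial}{\partial x^1}\right|_p f = \left.\frac{\partial}{\partial u^1}\right|_{x(p)} (f \circ x^{-1}) = \left.\frac{\partial}{\partial u^1}\right|_{x(p)} (f \circ \chi) = T\chi \cdot \left.\frac{\partial}{\partial u^1}\right|_{x(p)} f = (X \circ \chi)(x(p)) f = X_pf,$$
so that $\frac{\partial}{\partial x^1} = X$. □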
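The construction in the proof can be carried out by hand in a simple case. The sketch below assumes the example field $X = \frac{\partial}{\partial u^1} + u^1\frac{\partial}{\partial u^2}$ on $\mathbb{R}^2$, whose flow is available in closed form, builds $\chi$ as in the proof, inverts it explicitly, and checks by central differences that $X(x^1) = 1$ and $X(x^2) = 0$, which says that $X = \frac{\partial}{\partial x^1}$ in the new chart. All names are illustrative.

```python
def X(u):
    """X = d/du1 + u1 d/du2 on R^2; note X(0) = d/du1."""
    return (1.0, u[0])

def flow(t, u):
    """Closed-form flow of X: solves u1' = 1, u2' = u1."""
    return (u[0] + t, u[1] + u[0] * t + 0.5 * t * t)

def chi(a):
    """chi(a1, a2) := flow_{a1}(0, a2), as in the proof."""
    return flow(a[0], (0.0, a[1]))

def chart(u):
    """x = chi^{-1}; for this example the inverse can be written down directly."""
    return (u[0], u[1] - 0.5 * u[0] ** 2)

def X_applied_to(func, u, h=1e-6):
    """X_u(func), with partial derivatives approximated by central differences."""
    total = 0.0
    for j, component in enumerate(X(u)):
        plus, minus = list(u), list(u)
        plus[j] += h
        minus[j] -= h
        total += component * (func(plus) - func(minus)) / (2 * h)
    return total

u = (0.8, -0.3)
# Components of X in the new coordinate frame are X(x^1) and X(x^2).
components = [X_applied_to(lambda q, i=i: chart(q)[i], u) for i in range(2)]
round_trip = chart(chi((0.4, 1.1)))
```

The list `components` should be close to `[1.0, 0.0]`, and `round_trip` should return the point `(0.4, 1.1)` up to rounding, confirming that `chart` inverts `chi`.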
2.8.2. Lie derivative. We now introduce the important concept of the Lie derivative of a vector field extending the previous definition. The Lie derivative will be extended further to tensor fields.

Definition 2.102. Given a vector field $X$, we define a map $\mathcal{L}_X : \mathfrak{X}(M) \to \mathfrak{X}(M)$ by
$$\mathcal{L}_XY := [X, Y].$$
This map is called the Lie derivative (with respect to $X$).
The Jacobi identity for the Lie bracket easily implies the following two identities for any $X, Y, Z \in \mathfrak{X}(M)$:
$$\mathcal{L}_X [Y, Z] = [\mathcal{L}_X Y, Z] + [Y, \mathcal{L}_X Z],$$
$$\mathcal{L}_{[X,Y]} = [\mathcal{L}_X, \mathcal{L}_Y] \quad (\text{i.e. } \mathcal{L}_{[X,Y]} = \mathcal{L}_X \circ \mathcal{L}_Y - \mathcal{L}_Y \circ \mathcal{L}_X).$$
We will see below that $\mathcal{L}_X Y$ measures the rate of change of $Y$ in the direction $X$. To be a bit more specific, $(\mathcal{L}_X Y)(p)$ measures how $Y$ changes in comparison with the field obtained by "dragging" $Y_p$ along the flow of $X$. Recall our definition of the Lie derivative of a function (Definition 2.70). The following is an alternative characterization in terms of flows: For a smooth function $f : M \to \mathbb{R}$ and a smooth vector field $X \in \mathfrak{X}(M)$, the Lie derivative $\mathcal{L}_X f$ of $f$ with respect to $X$ is given by
$$\mathcal{L}_X f(p) = \left.\frac{d}{dt}\right|_0 f \circ \varphi^X_t(p).$$
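Exercise 2.103. Explain why the above formula is compatible with Definition 2.70.

We will also characterize the Lie derivative on vector fields in terms of flows. First, we need a technical lemma:

Lemma 2.104. Let $X \in \mathfrak{X}(M)$ and $f \in C^\infty(U)$ with $U$ open and $p \in U$. There is an interval $I_\delta := [-\delta, \delta]$ and an open set $V$ containing $p$ such that $\varphi^X(I_\delta \times V) \subset U$, together with a smooth function $g$ on $I_\delta \times V$, such that
$$f(\varphi^X(t, q)) = f(q) + t\,g(t, q)$$
for all $(t, q) \in I_\delta \times V$ and such that $g(0, q) = X_q f$ for all $q \in V$.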
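Proof. The existence of the set $I_\delta \times V$ with $\varphi^X(I_\delta \times V) \subset U$ follows from our study of flows. Consider the function $r(\tau, q) := f(\varphi^X(\tau, q)) - f(q)$, which satisfies $r(0, q) = 0$, and let
$$g(t, q) := \int_0^1 \frac{\partial r}{\partial \tau}(st, q)\,ds,$$
so that
$$t\,g(t, q) = \int_0^1 \frac{\partial r}{\partial \tau}(st, q)\,t\,ds = \int_0^1 \frac{\partial}{\partial s} r(st, q)\,ds = r(t, q).$$
Then $f(\varphi^X(t, q)) = f(q) + t\,g(t, q)$. Also
$$g(0, q) = \lim_{t \to 0} \frac{1}{t} r(t, q) = \lim_{t \to 0} \frac{f(\varphi^X(t, q)) - f(q)}{t} = X_q f. \qquad \Box$$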
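Proposition 2.105. Let $X$ and $Y$ be smooth vector fields on $M$. Let $\varphi = \varphi^X$ be the flow. The function $t \mapsto T\varphi_{-t} \cdot Y_{\varphi_t(p)}$ is differentiable at $t = 0$ and
$$\left.\frac{d}{dt}\right|_{t=0} T\varphi_{-t} \cdot Y_{\varphi_t(p)} = [X, Y]_p = (\mathcal{L}_X Y)(p). \tag{2.5}$$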
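Proof. Let $f \in C^\infty(U)$ with $p \in U$ as in the previous lemma. We have
$$\left(\left.\frac{d}{dt}\right|_{t=0} T\varphi_{-t} \cdot Y_{\varphi_t(p)}\right) f = \lim_{t \to 0} \frac{(T\varphi_{-t} \cdot Y_{\varphi_t(p)}) f - Y_p f}{t} = \lim_{t \to 0} \frac{Y_p f - (T\varphi_t \cdot Y_{\varphi_{-t}(p)}) f}{t},$$
where the second equality replaces $t$ by $-t$, and
$$\frac{Y_p f - (T\varphi_t \cdot Y_{\varphi_{-t}(p)}) f}{t} = \frac{Y_p f - Y_{\varphi_{-t}(p)} (f \circ \varphi_t)}{t} = \frac{Y_p f - Y_{\varphi_{-t}(p)} (f + t g_t)}{t},$$
where $g$ is as in the lemma and $g_t(q) = g(t, q)$. Continuing, we have
$$\frac{Y_p f - Y_{\varphi_{-t}(p)} f}{t} - Y_{\varphi_{-t}(p)} g_t.$$
Taking the limit as $t \to 0$ and recalling that $g_0 = Xf$ on $V$, the right hand side above becomes
$$\begin{aligned}
\lim_{t \to 0} \frac{(Yf)(p) - (Yf)(\varphi_{-t}(p))}{t} - \lim_{t \to 0} Y_{\varphi_{-t}(p)} g_t
&= \lim_{t \to 0} \frac{(Yf)(\varphi_t(p)) - (Yf)(p)}{t} - Y_p X f \\
&= X_p Y f - Y_p X f = [X, Y]_p f.
\end{aligned}$$
All told we have
$$\left(\left.\frac{d}{dt}\right|_{t=0} T\varphi_{-t} \cdot Y_{\varphi_t(p)}\right) f = [X, Y]_p f$$
for all $f \in C^\infty(U)$. If we let $U$ be the domain of a chart $(U, x)$, then letting $f$ be each of the coordinate functions we see that each component of the $T_pM$-valued function $t \mapsto T\varphi_{-t} \cdot Y_{\varphi_t(p)}$ is differentiable at $t = 0$, with derivative given by the corresponding component of $[X, Y]_p$. $\Box$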
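Formula (2.5) can be checked numerically in a simple case. The Python sketch below uses the fields $X = y\,\partial/\partial x$ and $Y = x\,\partial/\partial y$ on $\mathbb{R}^2$; this pair, the hand-computed bracket $[X, Y] = -x\,\partial/\partial x + y\,\partial/\partial y$, and all function names are assumptions made for the illustration. Because the flow of $X$ is linear, its tangent map is written down directly rather than approximated.

```python
def flow_X(t, p):
    """Flow of X = y d/dx: (x, y) -> (x + t*y, y)."""
    x, y = p
    return (x + t * y, y)

def tangent_flow_X(t, v):
    """Tangent map of flow_X(t, .); the flow is linear with matrix [[1, t], [0, 1]]."""
    return (v[0] + t * v[1], v[1])

def Y(p):
    """The field Y = x d/dy."""
    x, _ = p
    return (0.0, x)

def dragged(t, p):
    """T(phi_{-t}) applied to Y at phi_t(p): a vector in the tangent space at p."""
    q = flow_X(t, p)
    return tangent_flow_X(-t, Y(q))

def lie_derivative(p, h=1e-5):
    """Central-difference approximation of d/dt at t = 0 of dragged(t, p)."""
    a, b = dragged(h, p), dragged(-h, p)
    return tuple((ai - bi) / (2 * h) for ai, bi in zip(a, b))

def bracket(p):
    """[X, Y] = -x d/dx + y d/dy, computed by hand for this pair of fields."""
    x, y = p
    return (-x, y)

p = (2.0, 3.0)
numeric = lie_derivative(p)
exact = bracket(p)
```

The two tuples should agree up to rounding, in line with $(\mathcal{L}_X Y)(p) = [X, Y]_p$.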
We see from this characterization that in order for $(\mathcal{L}_X Y)(p)$ to make sense, $Y$ only needs to be defined along the integral curve of $X$ that passes through $p$.

Discussion: Notice that if $X$ is a complete vector field, then for each $t \in \mathbb{R}$ the map $\varphi^X_t$ is a diffeomorphism $M \to M$ and
$$(\varphi^X_t)^* Y = (T\varphi^X_t)^{-1} \circ Y \circ \varphi^X_t. \tag{2.6}$$
One may write
$$\mathcal{L}_X Y = \left.\frac{d}{dt}\right|_0 (\varphi^X_t)^* Y.$$
On the other hand, if $X$ is not complete, then there exists no $t$ such that [...] It follows that for all $0 \le t \le \epsilon$ the map [...]

Theorem 2.106. Let $X, Y$ be vector fields on a smooth manifold $M$. Then
$$\frac{d}{dt} (\varphi^X_t)^* Y = (\varphi^X_t)^* \mathcal{L}_X Y.$$
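Proof. Let $\varphi := \varphi^X$. Suppressing the point $p$ we have
$$\left.\frac{d}{dt}\right|_t \varphi_t^* Y = \left.\frac{d}{ds}\right|_0 \varphi_{t+s}^* Y = \left.\frac{d}{ds}\right|_0 \varphi_t^* (\varphi_s^* Y) = \varphi_t^* \left.\frac{d}{ds}\right|_0 (\varphi_s^* Y) = \varphi_t^* \mathcal{L}_X Y. \qquad \Box$$
Exercise 2.107. Show that $\left.\frac{d}{dt}\right|_0 \big((\varphi^X_{-t})^* Y\big)(p) = -(\mathcal{L}_X Y)(p) = -[X, Y](p)$.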
Proposition 2.108. Let $X \in \mathfrak{X}(M)$ and $Y \in \mathfrak{X}(N)$ be $f$-related vector fields for a smooth map $f : M \to N$. Then
$$f \circ \varphi^X_t = \varphi^Y_t \circ f$$
whenever both sides are defined. Suppose that $f : M \to N$ is a diffeomorphism and $X \in \mathfrak{X}(M)$. Then the flow of $f_* X = (f^{-1})^* X$ is $f \circ \varphi^X_t \circ f^{-1}$ and the flow of $f^* X$ is $f^{-1} \circ \varphi^X_t \circ f$.

Proof. For any $p \in M$, we have $\frac{d}{dt}(f \circ \varphi^X_t)(p) = Tf \cdot \frac{d}{dt}\varphi^X_t(p) = Tf \circ X \circ \varphi^X_t(p) = Y \circ f \circ \varphi^X_t(p)$. But $f \circ \varphi^X_0(p) = f(p)$ and so $t \mapsto f \circ \varphi^X_t(p)$ is an integral curve of $Y$ starting at $f(p)$. By uniqueness we have $f \circ \varphi^X_t(p) = \varphi^Y_t(f(p))$. The second part follows from the first. $\Box$

Theorem 2.109. For $X, Y \in \mathfrak{X}(M)$, the following are equivalent:
(i) $\mathcal{L}_X Y = [X, Y] = 0$.
(ii) $(\varphi^X_t)^* Y = Y$ whenever defined.
(iii) The flows of $X$ and $Y$ commute: $\varphi^X_t \circ \varphi^Y_s = \varphi^Y_s \circ \varphi^X_t$ whenever defined.
Proof. (Sketch) The equivalence of (i) and (ii) follows easily from Proposition 2.105 and Theorem 2.106. Using Proposition 2.108, the equivalence of (ii) and (iii) can be seen by noticing that $\varphi^X_t \circ \varphi^Y_s = \varphi^Y_s \circ \varphi^X_t$ is defined and true exactly when $\varphi^Y_s = \varphi^X_{-t} \circ \varphi^Y_s \circ \varphi^X_t$ is defined and true, which happens exactly when
$$\varphi^Y_s = \varphi^{(\varphi^X_t)^* Y}_s$$
is defined and true. This happens, in turn, exactly when $Y = (\varphi^X_t)^* Y$. $\Box$
Example 2.110. On $\mathbb{R}^2$ we have the flows given by $\phi(t, (x, y)) = \phi_t(x, y) := (x + ty, y)$ and $\psi(t, (x, y)) = \psi_t(x, y) := (x, y + t)$. We have $\psi_1 \circ \phi_1(0, 0) = (0, 1)$ while $\phi_1 \circ \psi_1(0, 0) = (1, 1)$. These noncommuting flows correspond to the noncommuting vector fields $X = y\,\partial/\partial x$ and $Y = \partial/\partial y$.

Exercise 2.111. Find global noncommuting flows on $S^2$.
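The two compositions in Example 2.110 are easy to evaluate directly. The following Python sketch does so; the function names `phi` and `psi` simply mirror the flows of the example, and the variable names are illustrative.

```python
def phi(t, p):
    """Flow of X = y d/dx: (x, y) -> (x + t*y, y)."""
    x, y = p
    return (x + t * y, y)

def psi(t, p):
    """Flow of Y = d/dy: (x, y) -> (x, y + t)."""
    x, y = p
    return (x, y + t)

origin = (0.0, 0.0)
psi_after_phi = psi(1.0, phi(1.0, origin))  # apply phi_1 first, then psi_1
phi_after_psi = phi(1.0, psi(1.0, origin))  # apply psi_1 first, then phi_1
flows_commute_here = (psi_after_phi == phi_after_psi)
```

Applying $\phi_1$ first leaves the origin fixed, so only the shift by $\psi_1$ is seen; applying $\psi_1$ first moves the point off the $x$-axis, after which $\phi_1$ shears it sideways.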
Figure 2.6. Bracket measures lack of commutativity of flows
The Lie derivative and the Lie bracket are essentially the same object and are defined for local sections $X \in \mathfrak{X}_M(U)$ as well as global sections. This is obvious anyway since open subsets are themselves manifolds. As is so often the case for operators in differential geometry, the Lie derivative is natural with respect to restriction, so we have the commutative diagram
$$\begin{array}{ccc}
\mathfrak{X}(U) & \xrightarrow{\ \mathcal{L}_{X|_U}\ } & \mathfrak{X}(U) \\
{\scriptstyle r^U_V}\downarrow & & \downarrow{\scriptstyle r^U_V} \\
\mathfrak{X}(V) & \xrightarrow{\ \mathcal{L}_{X|_V}\ } & \mathfrak{X}(V)
\end{array}$$
where $X|_U$ denotes the restriction of $X \in \mathfrak{X}(M)$ to the open set $U$ and $r^U_V$ is the map that restricts from $U$ to $V \subset U$.

If $X$ and $Y$ are smooth vector fields with flows $\varphi^X$ and $\varphi^Y$, then starting at some $p \in M$, if we flow with $X$ for time $\sqrt{t}$, then flow with $Y$ for time $\sqrt{t}$, and then flow backwards along $X$ and then $Y$ for time $\sqrt{t}$, we arrive at a point $\alpha(t)$ given by
$$\alpha(t) := \varphi^Y_{-\sqrt{t}} \circ \varphi^X_{-\sqrt{t}} \circ \varphi^Y_{\sqrt{t}} \circ \varphi^X_{\sqrt{t}}(p).$$
It turns out that $\alpha(t)$ is usually not $p$ (see Figure 2.6). In fact, we have the following theorem:

Theorem 2.112. With $\alpha(t)$ as above we have
$$\left.\frac{d}{dt}\right|_{t=0} \alpha(t) = [X, Y](p).$$
Proof. See Problem 8. $\Box$
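Theorem 2.112 can be tested on the fields $X = y\,\partial/\partial x$ and $Y = \partial/\partial y$ on $\mathbb{R}^2$, whose flows are known in closed form. The Python sketch below is only an illustration: the function names and the sample point are assumptions, and the bracket $[X, Y] = -\partial/\partial x$ for this pair was computed by hand. Since $\alpha(t)$ is defined for $t \ge 0$, a one-sided difference quotient is used.

```python
import math

def flow_X(t, p):
    """Flow of X = y d/dx."""
    x, y = p
    return (x + t * y, y)

def flow_Y(t, p):
    """Flow of Y = d/dy."""
    x, y = p
    return (x, y + t)

def alpha(t, p):
    """Flow along X, then Y, then backwards along X, then backwards along Y, each for time sqrt(t)."""
    s = math.sqrt(t)
    q = flow_X(s, p)
    q = flow_Y(s, q)
    q = flow_X(-s, q)
    q = flow_Y(-s, q)
    return q

p = (0.5, 2.0)
t = 1e-4
end = alpha(t, p)
# One-sided difference quotient approximating the derivative of alpha at t = 0.
quotient = ((end[0] - p[0]) / t, (end[1] - p[1]) / t)
bracket_at_p = (-1.0, 0.0)  # [X, Y] = -d/dx for this pair, computed by hand
```

For this pair one finds $\alpha(t) = (x - t, y)$ exactly, so the quotient matches the bracket up to rounding.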
We know by direct calculation that if $(x^1, \ldots, x^n)$ are the coordinate functions of a chart, then
$$\left[\frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}\right] = 0$$
for all $i, j = 1, \ldots, n$. The converse is also locally true. This follows from the next theorem, which can be thought of as saying that we can simultaneously "straighten" commuting local fields.
Theorem 2.113. Let $M$ be an $n$-manifold. Suppose that on a neighborhood $V$ of a point $p \in M$ we have vector fields $X_1, \ldots, X_k$ such that $X_1(x), \ldots, X_k(x)$ are linearly independent for all $x \in V$. If $[X_i, X_j] = 0$ on $V$ for all $i, j$, then there exist a possibly smaller neighborhood $U$ of $p$ and a chart $x : U \to \mathbb{R}^n$ such that
$$\frac{\partial}{\partial x^i} = X_i \quad \text{on } U \text{ for } i = 1, \ldots, k$$
and such that for the corresponding flows we have
$$\varphi^{X_i}_t \circ x^{-1}(u^1, \ldots, u^i, \ldots, u^n) = x^{-1}(u^1, \ldots, u^i + t, \ldots, u^n)$$
for $i = 1, \ldots, k$.

Proof. Let $p \in V$ and choose a chart $(O, y)$ centered at $p$ with $O \subset V$. By rearranging the coordinate functions if necessary and using a simple linear independence argument, we may assume that the vectors
$$X_1(p), \ldots, X_k(p), \left.\frac{\partial}{\partial y^{k+1}}\right|_p, \ldots, \left.\frac{\partial}{\partial y^n}\right|_p$$
form a basis for $T_pM$. Let $\varphi^{X_1}_t, \ldots, \varphi^{X_k}_t$ be local flows for $X_1, \ldots, X_k$ and let $W$ be an open neighborhood of $p$ contained in $O$ such that the composition $\varphi^{X_k}_{t_k} \circ \cdots \circ \varphi^{X_1}_{t_1}$ is defined on $W$ and maps $W$ into $O$ for all $t_1, \ldots, t_k \in (-\epsilon, \epsilon)$ as in Lemma 2.92. Define
$$S := \{(a^{k+1}, \ldots, a^n) : y^{-1}(0, \ldots, 0, a^{k+1}, \ldots, a^n) \in W\}$$
and define the map $\psi : (-\epsilon, \epsilon)^k \times S \to O$ by
$$\psi(u^1, \ldots, u^k, u^{k+1}, \ldots, u^n) = \varphi^{X_k}_{u^k} \circ \cdots \circ \varphi^{X_1}_{u^1} \circ y^{-1}(0, \ldots, 0, u^{k+1}, \ldots, u^n).$$
Since $[X_i, X_j] = 0$ for $1 \le i, j \le k$, we know that the flows in the composition above commute. Hence for any $a$ in the domain of $\psi$ and smooth function $f$, we have for $1 \le i \le k$,
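$$\begin{aligned}
T_a\psi \cdot \left.\frac{\partial}{\partial u^i}\right|_a f
&= \left.\frac{\partial}{\partial u^i}\right|_a f \circ \psi(u^1, \ldots, u^n) \\
&= \left.\frac{\partial}{\partial u^i}\right|_a f\big(\varphi^{X_k}_{u^k} \circ \cdots \circ \varphi^{X_1}_{u^1} \circ y^{-1}(0, \ldots, 0, u^{k+1}, \ldots, u^n)\big) \\
&= \left.\frac{\partial}{\partial u^i}\right|_a f\big(\varphi^{X_i}_{u^i} \circ \varphi^{X_k}_{u^k} \circ \cdots \circ \widehat{\varphi^{X_i}_{u^i}} \circ \cdots \circ \varphi^{X_1}_{u^1} \circ y^{-1}(0, \ldots, 0, u^{k+1}, \ldots, u^n)\big) \\
&= X_i(\psi(a)) f,
\end{aligned}$$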
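where the caret indicates omission. Thus we have
$$T_a\psi \cdot \left.\frac{\partial}{\partial u^i}\right|_a = X_i(\psi(a)) \quad \text{for } 1 \le i \le k,$$
and in particular,
$$T_0\psi \cdot \left.\frac{\partial}{\partial u^i}\right|_0 = X_i(p) \quad \text{for } 1 \le i \le k.$$
For $k + 1 \le i \le n$, we have
$$\psi(0, \ldots, 0, u^{k+1}, \ldots, u^n) = y^{-1}(0, \ldots, 0, u^{k+1}, \ldots, u^n),$$
so
$$T_0\psi \cdot \left.\frac{\partial}{\partial u^i}\right|_0 = \left.\frac{\partial}{\partial y^i}\right|_p.$$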
Thus $T_0\psi$ maps a basis of $T_0\mathbb{R}^n$ to a basis of $T_pM$. We can use the inverse mapping theorem (Theorem 2.25) to conclude that $\psi$ is a diffeomorphism from some open neighborhood of $0 \in \mathbb{R}^n$ onto an open neighborhood $U$ of $p$ in $M$. It is now straightforward to check that $x = \psi^{-1}$ is a chart map of the desired type. $\Box$

For later reference, we note that if $(U, x)$ is a chart of the type produced in the theorem above, then we can easily arrange that $x(U)$ is of the form $V \times W \subset \mathbb{R}^k \times \mathbb{R}^{n-k}$, so that the fields $\frac{\partial}{\partial x^1}, \ldots, \frac{\partial}{\partial x^k}$ are tangent to the submanifolds of the form $S := \{p \in U : x^{k+1}(p) = a^{k+1}, \ldots, x^n(p) = a^n\}$.
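To see the map $\psi$ from the proof of Theorem 2.113 in action, the Python sketch below takes $k = n = 2$ on $\mathbb{R}^2 \setminus \{0\}$ with the commuting fields $X_1 = x\,\partial/\partial x + y\,\partial/\partial y$ (radial scaling) and $X_2 = -y\,\partial/\partial x + x\,\partial/\partial y$ (rotation), based at $p = (1, 0)$. This choice of fields, the base point, and all function names are assumptions for the illustration. The resulting chart is a logarithmic version of polar coordinates, and the check is that each flow acts as a translation in the corresponding coordinate.

```python
import math

def flow_X1(t, p):
    """Flow of the radial field x d/dx + y d/dy: scaling by e^t."""
    x, y = p
    return (math.exp(t) * x, math.exp(t) * y)

def flow_X2(t, p):
    """Flow of the rotation field -y d/dx + x d/dy: rotation by angle t."""
    x, y = p
    c, s = math.cos(t), math.sin(t)
    return (c * x - s * y, s * x + c * y)

def psi(u1, u2):
    """psi(u1, u2) = flow_X2(u2) o flow_X1(u1) applied to the base point (1, 0)."""
    return flow_X2(u2, flow_X1(u1, (1.0, 0.0)))

def close(a, b, tol=1e-12):
    """Componentwise comparison of two points up to a tolerance."""
    return all(abs(ai - bi) < tol for ai, bi in zip(a, b))

u1, u2, t = 0.3, 0.8, 0.25
x1_shifts = close(flow_X1(t, psi(u1, u2)), psi(u1 + t, u2))  # X1 translates u1
x2_shifts = close(flow_X2(t, psi(u1, u2)), psi(u1, u2 + t))  # X2 translates u2
flows_commute = close(flow_X1(t, flow_X2(t, (2.0, 1.0))),
                      flow_X2(t, flow_X1(t, (2.0, 1.0))))
```

Here $\psi(u^1, u^2) = (e^{u^1}\cos u^2, e^{u^1}\sin u^2)$, so $x = \psi^{-1}$ is locally $(\log r, \theta)$.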
The reader has probably surmised that the mathematics of vector fields and flows can be applied to fluid mechanics. This is quite true, but one needs to deal with time dependent vector fields. There is a trick that allows time dependent vector fields to be treated as ordinary vector fields on a manifold of one higher dimension, but it is not always best to think in those terms. The author has included a bit about time dependent vector fields in the online supplement [Lee, Jeff].
2.9. 1-Forms

Definition 2.114. A smooth (resp. $C^r$) section of the cotangent bundle is called a smooth (resp. $C^r$) 1-form or also a smooth (resp. $C^r$) covector field. The set of all $C^r$ 1-forms is denoted by $\mathfrak{X}^{r*}(M)$ and the set of smooth 1-forms is denoted by $\mathfrak{X}^*(M)$. The set $\mathfrak{X}^{r*}(M)$ is a module over $C^r(M)$. Later we will have reason to denote $\mathfrak{X}^*(M)$ also by $\Omega^1(M)$.

The analogue of Lemma 2.65 is true. That is, if $\alpha_p \in T_p^*M$, then there is a 1-form $\alpha$ such that $\alpha(p) = \alpha_p$. The proof is again a simple cut-off function argument.
which is interpreted to mean that at each $p \in U_\alpha$ we have
$$df(p) = \sum_i \left.\frac{\partial f}{\partial x^i}\right|_p \left.dx^i\right|_p.$$
The covector fields $dx^i$ form what is called a coordinate coframe field or holonomic coframe field$^2$ over $U$. Note that the component functions $X^i$ of a vector field with respect to the chart above are given by $X^i = dx^i(X)$, where by definition $dx^i(X)$ is the function $p \mapsto dx^i|_p(X_p)$. Thus
$$X|_U = \sum_i dx^i(X) \frac{\partial}{\partial x^i}.$$
Note. If $\alpha$ is a 1-form on $M$ and $p \in M$, then one can always find many functions $f$ such that $df(p) = \alpha(p)$, but there may not be a single function $f$ such that this is true for all points in a neighborhood, let alone all points on $M$. If in fact $df = \alpha$ for some $f \in C^\infty(M)$, then we say that $\alpha$ is exact. More on this later.

Let us try to picture 1-forms. As a warm up, let us recall how we might picture a tangent vector $v_p$ at a point $p \in \mathbb{R}^n$. Let $\gamma$ be a curve with $\gamma'(0) = v_p$. If we zoom in on the curve near $p$, then it appears to straighten out and so begins to look like the curve $t \mapsto p + t v_p$. So one might say that a tangent vector is the "infinitesimal representation" of a (parameterized) curve. At each point a 1-form gives a linear functional in that tangent space, and as we know, the level sets of a linear functional are parallel affine subspaces or hyperplanes. We should imagine level sets as being labeled by the values of the function. A covector puts a ruling in the tangent space that measures tangent vectors stretching across this ruling. For example, the 1-form $df$ in $\mathbb{R}^3$ gives a ruling in each tangent space as suggested by Figure 2.7a. For a given $p$, the level sets of $df_p$ are what we see if we zoom in on the level sets of $f$ near $p$. The fact that the individual $df_p$'s living in the tangent spaces somehow coalesce into the level sets of the global function $f$, as shown in Figure 2.7b, is due to the fact that the 1-form $df$ is the differential of the function
$f$. A more general 1-form $\alpha$ is still pictured as straight parallel hyperplanes in each tangent space. Because these level sets live in the tangent space, we might call them infinitesimal level sets. These (value labeled) level sets may not coalesce into the level sets of any global smooth function on the manifold. There are various, increasingly severe ways coalescing may fail to happen. The least severe situation is when $\alpha$ is not the differential of a global function but is still locally a differential near each point. For example, if $M = \mathbb{R}^2 \setminus \{0\}$, then the familiar 1-form $\alpha = (x^2 + y^2)^{-1}(-y\,dx + x\,dy)$ is locally equal to $d\theta$ for some "angle function" $\theta$ which measures the angle from some fixed ray such as the positive $x$ axis. But there is no such single smooth angle function defined on all of $\mathbb{R}^2 \setminus \{0\}$. Thus, globally speaking, $\alpha$ is not the differential of any function. In Figure 2.8, we see the coalesced result of "integrating" the infinitesimal level sets which live in the tangent spaces. While these suggest an angular function, we see that if we try to picture rising as we travel around the origin, we find that we do not return to the same level in one full circulation, but rather keep rising. Locally, however, we really do have level sets of smooth functions.

Figure 2.7. The form $df$ follows level sets

Figure 2.8. Level sets of overlapping angle functions

$^2$ The word holonomic comes from mechanics and just means that the frame field derives from a chart. A related fact is that $\left[\frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}\right] = 0$.

Now the second more severe way that a 1-form may fail to be the differential of a function is where there is not even a local function whose differential agrees with the 1-form near a point. The infinitesimal level sets do not coalesce to the level sets of a smooth function, even in small neighborhoods. This is much harder to represent, but Figure 2.9 is meant to at least be suggestive. Nearby curves cross inconsistent numbers of level sets. As an example consider the 1-form $\beta = -y\,dx + x\,dy$.
The astute reader may object that surely radial rays do match up with the directions described by this 1-form. However, the point is that a covector in a tangent space is not completely described by the level sets as such, but rather the level sets are to be thought of as labeled according to the values they represent in the individual tangent spaces. Here we have a case where the infinitesimal level sets coalesce, but the values assigned to them do not; they are 1-dimensional submanifolds that fit the 1-form but they are not level sets of even a local smooth function whose differential agrees with the 1-form. This brings us to the most severe case which only happens in dimension 3 or greater. It can be the case that there is no nice family of $(n - 1)$-dimensional submanifolds that line up in any reasonable sense with the 1-form (either globally or locally). This is the topic of the Frobenius integrability theory for tangent distributions that we study in Chapter 11, and we shall forgo any further discussion until then except to say that the reader should be ready to understand much of that chapter, including the Frobenius theorem, after finishing the next chapter.

Figure 2.9. Suggestive representation of a form which is not closed
Definition 2.116. If $\phi : M \to N$ is a $C^\infty$ map, the pull-back of a 1-form $\alpha \in \mathfrak{X}^*(N)$ by $\phi$ is defined by
$$(\phi^*\alpha)_p \cdot v = \alpha_{\phi(p)}(T_p\phi \cdot v)$$
for $v \in T_pM$. This extends the notion of the pull-back of a function defined earlier. If we view a 1-form on $M$ as a map $TM \to \mathbb{R}$, then the pull-back is given by $\phi^*\alpha = \alpha \circ T\phi$.
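Definition 2.116 can be applied literally in a small computation. The Python sketch below pulls back the 1-form $\alpha = -y\,dx + x\,dy$ on $\mathbb{R}^2$ by the polar coordinate map $\phi(r, \theta) = (r\cos\theta, r\sin\theta)$; the choice of map, form, sample point, and function names are assumptions for the illustration. A hand computation gives $\phi^*\alpha = r^2\,d\theta$, and the code evaluates $\alpha_{\phi(p)}(T_p\phi \cdot v)$ on the two coordinate vectors to compare.

```python
import math

def phi(r, theta):
    """Polar coordinate map (r, theta) -> (x, y)."""
    return (r * math.cos(theta), r * math.sin(theta))

def T_phi(r, theta, v):
    """Tangent map of phi at (r, theta) applied to v = (v_r, v_theta)."""
    v_r, v_theta = v
    c, s = math.cos(theta), math.sin(theta)
    return (c * v_r - r * s * v_theta, s * v_r + r * c * v_theta)

def alpha(point, w):
    """The covector alpha = -y dx + x dy at `point`, evaluated on w."""
    x, y = point
    return -y * w[0] + x * w[1]

def pullback_alpha(r, theta, v):
    """(phi^* alpha)_p(v) = alpha_{phi(p)}(T_p phi . v), as in the definition."""
    return alpha(phi(r, theta), T_phi(r, theta, v))

r, theta = 2.0, 0.3
on_d_dr = pullback_alpha(r, theta, (1.0, 0.0))      # expected close to 0
on_d_dtheta = pullback_alpha(r, theta, (0.0, 1.0))  # expected close to r**2
```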
Exercise 2.117. The pull-back is contravariant in the sense that if $\phi_1 : M_1 \to M_2$ and $\phi_2 : M_2 \to N$, then for $\alpha \in \mathfrak{X}^*(N)$ we have $(\phi_2 \circ \phi_1)^* = \phi_1^* \circ \phi_2^*$.

Next we describe the local expression for the pull-back of a 1-form. Let $(U, x)$ be a chart on $M$ and $(V, y)$ a coordinate chart on $N$ with $\phi(U) \subset V$. A typical 1-form has a local expression on $V$ of the form $\alpha = \sum_i a_i\,dy^i$ for $a_i \in C^\infty(V)$. The local expression for $\phi^*\alpha$ on $U$ is $\phi^*\alpha = \sum_i (a_i \circ \phi)\,d(y^i \circ \phi) = \sum_{i,j} (a_i \circ \phi)\,\frac{\partial (y^i \circ \phi)}{\partial x^j}\,dx^j$.$^3$ Thus we get a local pull-back formula convenient for computations:
$$\phi^*\Big(\sum_i a_i\,dy^i\Big) = \sum_{i,j} (a_i \circ \phi)\,\frac{\partial (y^i \circ \phi)}{\partial x^j}\,dx^j. \tag{2.7}$$
The pull-back of a function or 1-form is defined whether $\phi : M \to N$ happens to be a diffeomorphism or not. On the other hand, the pull-back of a vector field only works in special circumstances such as where $\phi$ is a diffeomorphism. Let $\phi : M \to N$ be a $C^r$ diffeomorphism with $r \ge 1$. Recall that the push-forward of a function $f \in C^\infty(M)$ is denoted $\phi_* f$ and defined by $\phi_* f(p) := f(\phi^{-1}(p))$. We can also define the push-forward of a 1-form as $\phi_*\alpha = \alpha \circ T\phi^{-1}$.

Exercise 2.118. Find the local expression for $\phi_* f$ and $\phi_*\alpha$. Explain why we need $\phi$ to be a diffeomorphism.

Lemma 2.119. The differential is natural with respect to pull-back. In other words, if $\phi : M \to N$ is a $C^\infty$ map and $f : N \to \mathbb{R}$ a $C^\infty$ function, then $d(\phi^* f) = \phi^* df$. Consequently, the differential is also natural with respect to restrictions.

Proof. We wish to show that $(\phi^* df)_p = d(\phi^* f)_p$ for all $p \in M$. Let $v \in T_pM$ and write $q = \phi(p)$. Then
$$(\phi^* df)|_p\,v = df|_q(T_p\phi \cdot v) = (T_p\phi \cdot v) f = v(f \circ \phi) = d(\phi^* f)|_p\,v.$$
The second statement is obvious from local coordinate expressions, but also notice that if $U$ is open in $M$ and $\iota : U \hookrightarrow M$ is the inclusion map (i.e. the identity map $\mathrm{id}_M$ restricted to $U$), then $f|_U = \iota^* f$ and $df|_U = \iota^* df$. So the statement about restrictions is just a special case of the first part. $\Box$

The tangent and cotangent bundles $TM$ and $T^*M$ are themselves manifolds and so have their own tangent and cotangent bundles. Among other things, this means that there exist 1-forms and vector fields on these manifolds. Here we introduce the canonical 1-form on $T^*M$. We denote this form by $\theta_{\mathrm{can}}$ and note that it is a section of $T^*(T^*M)$. Let $\alpha \in T^*M$ and suppose that $\alpha$ is based at $p$ so that $\alpha \in T_p^*M$. Consider a vector $u_\alpha \in T_\alpha(T^*M)$.

$^3$ To ensure clarity we have not used the Einstein summation convention here.
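Notice that since $\pi : T^*M \to M$, we have $T_\alpha\pi : T_\alpha(T^*M) \to T_pM$. Thus $T_\alpha\pi \cdot u_\alpha \in T_pM$. We define
$$\theta_{\mathrm{can}}(u_\alpha) = \alpha(T_\alpha\pi \cdot u_\alpha).$$
The definition makes sense because $\alpha \in T_p^*M$ and $T_\alpha\pi \cdot u_\alpha \in T_pM$. Let $(U, x)$ be a chart containing $p$ and let $(x^1 \circ \pi, \ldots, x^n \circ \pi, p_1, \ldots, p_n) = (q^1, \ldots, q^n, p_1, \ldots, p_n)$ be the associated natural coordinates for $T^*M$.

Exercise 2.120. It is geometrically clear that $T_\alpha\pi \cdot \left.\frac{\partial}{\partial p_i}\right|_\alpha = 0$ since $\left.\frac{\partial}{\partial p_i}\right|_\alpha$ is tangent to the fiber $T^*_{\pi(\alpha)}M$ along which $\pi$ is constant. Deduce this directly from the definitions. Hint: $\left.\frac{\partial}{\partial p_i}\right|_\alpha$ can be represented by a curve in $T^*_{\pi(\alpha)}M$.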
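We wish to show that locally $\theta_{\mathrm{can}} = \sum_i p_i\,dq^i$. It will suffice to show that $\theta_{\mathrm{can}}\left(\left.\frac{\partial}{\partial p_i}\right|_\alpha\right) = 0$ and $\theta_{\mathrm{can}}\left(\left.\frac{\partial}{\partial q^i}\right|_\alpha\right) = p_i(\alpha)$ for all $i$. We have
$$\theta_{\mathrm{can}}\left(\left.\frac{\partial}{\partial p_i}\right|_\alpha\right) = \alpha\left(T_\alpha\pi \cdot \left.\frac{\partial}{\partial p_i}\right|_\alpha\right) = 0$$
since in fact $T_\alpha\pi \cdot \left.\frac{\partial}{\partial p_i}\right|_\alpha = 0$ by Exercise 2.120. Also, we have
$$\theta_{\mathrm{can}}\left(\left.\frac{\partial}{\partial q^i}\right|_\alpha\right) = \alpha\left(T_\alpha\pi \cdot \left.\frac{\partial}{\partial q^i}\right|_\alpha\right) = \alpha\left(\left.\frac{\partial}{\partial x^i}\right|_p\right) = p_i(\alpha),$$
where we have used the fact that $T_\alpha\pi \cdot \left.\frac{\partial}{\partial q^i}\right|_\alpha = \left.\frac{\partial}{\partial x^i}\right|_p$, which follows from the definition $q^i = x^i \circ \pi$. Indeed, we know that $T_\alpha\pi \cdot \left.\frac{\partial}{\partial q^i}\right|_\alpha = \sum_k c^k_i \left.\frac{\partial}{\partial x^k}\right|_p$ for some constants $c^k_i$, but we have
$$c^k_i = dx^k\left(T_\alpha\pi \cdot \left.\frac{\partial}{\partial q^i}\right|_\alpha\right) = \pi^* dx^k\left(\left.\frac{\partial}{\partial q^i}\right|_\alpha\right) = d(x^k \circ \pi)\left(\left.\frac{\partial}{\partial q^i}\right|_\alpha\right) = dq^k\left(\left.\frac{\partial}{\partial q^i}\right|_\alpha\right) = \delta^k_i.$$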
This 1-form plays a basic role in symplectic geometry and classical mechanics. For more about symplectic geometry see the online supplement [Lee, Jeff].
2.10. Line Integrals and Conservative Fields

Just as in calculus on Euclidean space, we can consider line integrals on manifolds, and it is exactly the 1-forms that are the appropriate objects to integrate. First notice that all 1-forms on open sets in $\mathbb{R}^1$ must be of the form $f\,dt$ for some smooth function $f$, where $t$ is the coordinate function on $\mathbb{R}^1$. We begin by defining the line integral of a 1-form defined and smooth on an interval $[a, b] \subset \mathbb{R}^1$. If $\beta = f\,dt$ is such a 1-form, then
$$\int_{[a,b]} \beta := \int_a^b f(t)\,dt.$$
Any smooth map $\gamma : [a, b] \to M$ is the restriction of a smooth map on some larger open interval $(a - \epsilon, b + \epsilon)$, and so there is no problem defining the pull-back $\gamma^*\alpha$. If $\gamma : [a, b] \to M$ is a smooth curve, then we define the line integral of a 1-form $\alpha$ along $\gamma$ to be
$$\int_\gamma \alpha := \int_{[a,b]} \gamma^*\alpha = \int_a^b f(t)\,dt,$$
where $\gamma^*\alpha = f\,dt$. Now if $t = \phi(s)$ is a smooth increasing function, then we obtain a positive reparametrization $\tilde\gamma = \gamma \circ \phi : [c, d] \to M$, where $\phi(c) = a$ and $\phi(d) = b$. With such a reparametrization we have
$$\begin{aligned}
\int_{[c,d]} \tilde\gamma^*\alpha &= \int_{[c,d]} \phi^*\gamma^*\alpha = \int_{[c,d]} \phi^*(f\,dt) = \int_{[c,d]} (f \circ \phi)\,\frac{d\phi}{ds}\,ds \\
&= \int_c^d f(\phi(s))\,\phi'(s)\,ds = \int_a^b f(t)\,dt,
\end{aligned}$$
where the last line is the standard change of variable formula and where we have used $\phi^*(f\,dt) = (f \circ \phi)\,\frac{d\phi}{ds}\,ds$, which is a special case of the pull-back formula mentioned above. We see now that we get the same result as before. This is just as in ordinary multivariable calculus. We have just transferred the usual calculus ideas to the manifold setting.

Definition 2.121. A continuous curve $\gamma : [a, b] \to M$ into a smooth manifold is called piecewise smooth if there exists a partition $a = t_0 < t_1 < \cdots < t_k = b$ such that $\gamma$ restricted to $[t_i, t_{i+1}]$ is smooth for $0 \le i \le k - 1$ (in the sense of Definition 1.58).

It is convenient to extend the definitions a bit to include integration along piecewise smooth curves. Thus if $\gamma : [a, b] \to M$ is such a curve, then we define for a 1-form $\alpha$
$$\int_\gamma \alpha := \sum_{i=0}^{k-1} \int_{\gamma_i} \alpha,$$
where $\gamma_i$ is the restriction of $\gamma$ to the interval $[t_i, t_{i+1}]$.
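The invariance of the line integral under positive reparametrization can be observed numerically. The Python sketch below is a rough illustration and not part of the text's development: the 1-form $\alpha = -y\,dx + x\,dy$, the quarter circle $\gamma(t) = (\cos t, \sin t)$ on $[0, \pi/2]$, the reparametrization $\phi(s) = \tfrac{\pi}{2}s^2$ on $[0, 1]$, and the midpoint-rule helper `line_integral` are all assumptions made for the example.

```python
import math

def alpha(x, y):
    """Coefficients (a, b) of the 1-form a dx + b dy; here -y dx + x dy."""
    return (-y, x)

def line_integral(form, curve, a, b, n=2000, h=1e-6):
    """Midpoint-rule approximation of the integral over [a, b] of curve^* form."""
    total = 0.0
    dt = (b - a) / n
    for k in range(n):
        t = a + (k + 0.5) * dt
        x, y = curve(t)
        # Velocity of the curve, approximated by central differences.
        x_hi, y_hi = curve(t + h)
        x_lo, y_lo = curve(t - h)
        vx, vy = (x_hi - x_lo) / (2 * h), (y_hi - y_lo) / (2 * h)
        ca, cb = form(x, y)
        total += (ca * vx + cb * vy) * dt
    return total

def gamma(t):
    """Quarter of the unit circle for t in [0, pi/2]."""
    return (math.cos(t), math.sin(t))

def phi(s):
    """Increasing reparametrization of [0, 1] onto [0, pi/2]."""
    return 0.5 * math.pi * s * s

def gamma_tilde(s):
    return gamma(phi(s))

I_gamma = line_integral(alpha, gamma, 0.0, 0.5 * math.pi)
I_gamma_tilde = line_integral(alpha, gamma_tilde, 0.0, 1.0)
```

Both values approximate $\pi/2$, since $\gamma^*\alpha = dt$ on this curve.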
Just as in ordinary multivariable calculus we have the following:

Proposition 2.122. Let $\gamma : [a, b] \to M$ be a piecewise smooth curve with $\gamma(a) = p_1$ and $\gamma(b) = p_2$. If $\alpha = df$, then
$$\int_\gamma \alpha = \int_\gamma df = f(p_2) - f(p_1).$$
In particular, $\int_\gamma \alpha$ is path independent in the sense that it is equal to $\int_c \alpha$ for any other piecewise smooth path $c$ that also begins at $p_1$ and ends at $p_2$.

Definition 2.123. If $\alpha$ is a 1-form on a smooth manifold $M$ such that $\int_c \alpha = 0$ for all closed piecewise smooth curves $c$, then we say that $\alpha$ is conservative.
We will need a lemma on differentiability.

Lemma 2.124. Suppose $f$ is a function defined on a smooth manifold $M$, and let $\alpha$ be a smooth 1-form on $M$. Suppose that for any $p \in M$, $v_p \in T_pM$ and smooth curve $c$ with $c'(0) = v_p$, the derivative $\left.\frac{d}{dt}\right|_0 f(c(t))$ exists and
$$\left.\frac{d}{dt}\right|_0 f(c(t)) = \alpha(v_p).$$
Then $f$ is smooth and $df = \alpha$.

Proof. We work in a chart $(U, x)$. If we take $c(t) := x^{-1}(x(p) + t e_i)$, then the hypotheses lead to the conclusion that all the first order partial derivatives of $f \circ x^{-1}$ exist and are continuous. Thus $f$ is $C^1$. But then also $df_p \cdot v_p = \left.\frac{d}{dt}\right|_0 f(c(t)) = \alpha_p(v_p)$ for all $v_p$, and it follows that $df = \alpha$, and this also implies that $f$ is actually smooth. $\Box$

Proposition 2.125. If $\alpha$ is a 1-form on a smooth manifold $M$, then $\alpha$ is conservative if and only if it is exact.
conservative if and only if it is exact. Proof. We know already that if a = df, then a is conservative. Now suppose a is conservative. Fix Po E M. Then we can define f(p) = I-ya, where, is any curve beginning at Po and ending at p. Given any vp E TpM, we pick a curve c : [-1, c) with c > 0 such that c( -1) = Po, c(O) = p and d (0) = vp. Then
-d f(C(T)) == - 1 id dT 0 dT 0 =
j
e[
1,7']
a
d~ 10 11[-1,0] a + ddT 10 1 [0,7'] a
= 0+
1
d~ 10 foT c*a - :1' 10 fo7' g(t) dt
= g(O),
2.10. Line Integrals and Conservative Fields
119
where c*o: = 9 dt. On the other hand,
O:(Vp ) = o:(c'(O)) =
= c*o: (:t
0:
IJ
(Toc.
:t IJ
= g(O) dtl o (:t
IJ
= g(O),
iT
where t is the standard coordinate on R Thus 10 f(C(T)) = o:(vp ) for any vp E TpM and any p E M. Now the result follows from the previous lemma. D
It is important to realize that when we say that a form is conservative in this context, we mean that it is globally conservative. It may also be the case that a form is locally conservative. This means that if we restrict the 1-form to an open set which is diffeomorphic to a Euclidean ball, then the result is conservative on that ball. The following examples explore these issues in simple terms.
Example 2.126. Let $\alpha = (x^2 + y^2)^{-1}(-y\,dx + x\,dy)$. Consider the small circular path $c$ given by $(x, y) = (x_0 + \epsilon\cos t, y_0 + \epsilon\sin t)$ with $0 \le t \le 2\pi$ and $\epsilon > 0$. If $(x_0, y_0) = (0, 0)$, we obtain
$$\begin{aligned}
\int_c \alpha &= \int_0^{2\pi} \frac{1}{\epsilon^2}\left(-y(t)\frac{dx}{dt} + x(t)\frac{dy}{dt}\right) dt \\
&= \int_0^{2\pi} \frac{1}{\epsilon^2}\big(-(\epsilon\sin t)(-\epsilon\sin t) + (\epsilon\cos t)(\epsilon\cos t)\big)\,dt = 2\pi.
\end{aligned}$$
Thus $\alpha$ is not conservative and hence not exact. On the other hand, if $(x_0, y_0) \neq (0, 0)$, then we pick a ray $R_0$ that does not pass through $(x_0, y_0)$ and a function $\theta(x, y)$ which gives the angle of the ray $R$ passing through $(x, y)$ measured counterclockwise from $R_0$. This angle function is smooth and defined on $U = \mathbb{R}^2 \setminus R_0$. If $\epsilon < \frac{1}{2}\sqrt{x_0^2 + y_0^2}$, then $c$ has image inside the domain of $\theta$ and we have that $\alpha|_U = d\theta$. Thus $\int_c \alpha = \theta(c(2\pi)) - \theta(c(0)) = 0$. We see that $\alpha$ is locally conservative.

Example 2.127. Consider $\beta = y\,dx - x\,dy$ on $\mathbb{R}^2$. If it were the case that for some small open set $U \subset \mathbb{R}^2$ we had $\beta|_U = df$, then for a closed path $c$ with image in that set, we would expect that $\int_c \beta = f(c(2\pi)) - f(c(0)) = 0$. However, if $c$ is the curve going around a circle of radius $\epsilon$ centered at $(x_0, y_0)$, then we have
$$\begin{aligned}
\int_c \beta &= \int_0^{2\pi} \left(y(t)\frac{dx}{dt} - x(t)\frac{dy}{dt}\right) dt \\
&= \int_0^{2\pi} \big((y_0 + \epsilon\sin t)(-\epsilon\sin t) - (x_0 + \epsilon\cos t)(\epsilon\cos t)\big)\,dt = -2\pi\epsilon^2,
\end{aligned}$$
so we do not get zero no matter what the point $(x_0, y_0)$ and no matter how small $\epsilon$. We conclude that $\beta$ is not even locally conservative. The distinction between (globally) conservative and locally conservative is often not made sufficiently clear in the physics and engineering literature.

Example 2.128. In classical physics, the static electric field set up by a fixed point charge of magnitude $q$ can be described, with an appropriate choice of units, by the 1-form
$$\frac{q}{\rho^3}x\,dx + \frac{q}{\rho^3}y\,dy + \frac{q}{\rho^3}z\,dz,$$
where we have imposed Cartesian coordinates centered at the point charge and where $\rho = \sqrt{x^2 + y^2 + z^2}$. Notice that the domain of the form is the punctured space $\mathbb{R}^3 \setminus \{0\}$. In spherical coordinates $(\rho, \theta, \phi)$, this same form is
$$\frac{q}{\rho^2}\,d\rho = d\left(\frac{-q}{\rho}\right),$$
so we see that the form is exact and the field is conservative.
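The circle integrals in Examples 2.126 and 2.127 are easy to reproduce numerically. The Python sketch below is an illustration under stated assumptions: the helper `circle_integral`, the sample centers and radii, and the use of an equally spaced Riemann sum (which is very accurate for smooth periodic integrands) are choices made here and are not taken from the text.

```python
import math

def alpha(x, y):
    """Coefficients of alpha = (x^2 + y^2)^(-1) (-y dx + x dy)."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def beta(x, y):
    """Coefficients of beta = y dx - x dy."""
    return (y, -x)

def circle_integral(form, x0, y0, eps, n=4000):
    """Integral of `form` around the circle of radius eps centered at (x0, y0)."""
    total = 0.0
    dt = 2 * math.pi / n
    for k in range(n):
        t = k * dt
        x, y = x0 + eps * math.cos(t), y0 + eps * math.sin(t)
        dx, dy = -eps * math.sin(t), eps * math.cos(t)  # velocity of the circle
        a, b = form(x, y)
        total += (a * dx + b * dy) * dt
    return total

alpha_around_origin = circle_integral(alpha, 0.0, 0.0, 0.5)  # close to 2*pi
alpha_away = circle_integral(alpha, 3.0, 0.0, 1.0)           # close to 0
beta_small = circle_integral(beta, 1.0, 2.0, 0.1)            # close to -2*pi*0.1**2
```

The first value reflects the failure of $\alpha$ to be globally conservative, the second its local conservativity, and the third the fact that $\beta$ picks up $-2\pi\epsilon^2$ around every small circle.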
2.11. Moving Frames

It is important to realize that it is possible to get a family of locally defined vector (resp. covector) fields that are linearly independent at each point in their mutual domain and yet are not necessarily of the form $\frac{\partial}{\partial x^i}$ (resp. $dx^i$) for any coordinate chart. In fact, this may be achieved by carefully choosing $n^2$ smooth functions $f^i_k$ (resp. $a^k_i$) and then letting $E_k := \sum_i f^i_k \frac{\partial}{\partial x^i}$ (resp. $\theta^k := \sum_i a^k_i\,dx^i$).
Definition 2.129. Let $E_1, E_2, \ldots, E_n$ be smooth vector fields defined on some open subset $U$ of a smooth $n$-manifold $M$. If $E_1(p), E_2(p), \ldots, E_n(p)$ form a basis for $T_pM$ for each $p \in U$, then we say that $(E_1, E_2, \ldots, E_n)$ is a (non-holonomic) moving frame or a frame field over $U$.

If $E_1, E_2, \ldots, E_n$ is a moving frame over $U \subset M$ and $X$ is a vector field defined on $U$, then we may write
$$X = \sum_i X^i E_i \quad \text{on } U,$$
for some functions $X^i$ defined on $U$. If the moving frame $(E_1, \ldots, E_n)$ is not identical to some frame field $\left(\frac{\partial}{\partial x^1}, \ldots, \frac{\partial}{\partial x^n}\right)$ arising from a coordinate chart on $U$, then we say that the moving frame is non-holonomic. It is often possible to find such moving frame fields with domains that could never be the domain of any chart (consider a torus).
Definition 2.130. If $E_1, E_2, \ldots, E_n$ is a frame field with domain equal to the whole manifold $M$, then we call it a global frame field.

Most manifolds do not have global frame fields. Taking the basis dual to $(E_1(p), \ldots, E_n(p))$ in $T_p^*M$ for each $p \in U$ we get a moving coframe field $(\theta^1, \ldots, \theta^n)$. The $\theta^i$ are 1-forms defined on $U$. Any 1-form $\alpha$ can be expanded in terms of these basic 1-forms as $\alpha = \sum_i a_i \theta^i$. Actually it is the restriction of $\alpha$ to $U$ that is being expressed in terms of the $\theta^i$, but we shall not be so pedantic as to indicate this in the notation. In a manner similar to the case of a coordinate frame, we have that for a vector field $X$ defined at least on $U$, the components with respect to $(E_1, \ldots, E_n)$ are given by $\theta^i(X)$:
$$X = \sum_i \theta^i(X) E_i.$$
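Let us consider an important special situation. If $M \times N$ is a product manifold and $(U, x)$ is a chart on $M$ and $(V, y)$ is a chart on $N$, then we have a chart $(U \times V, x \times y)$ on $M \times N$ where the individual coordinate functions are $x^1 \circ pr_1, \ldots, x^m \circ pr_1, y^1 \circ pr_2, \ldots, y^n \circ pr_2$, which we temporarily denote by $\bar{x}^1, \ldots, \bar{x}^m, \bar{y}^1, \ldots, \bar{y}^n$. Now we consider what is the relation between the coordinate frame fields $\left(\frac{\partial}{\partial x^1}, \ldots, \frac{\partial}{\partial x^m}\right)$, $\left(\frac{\partial}{\partial y^1}, \ldots, \frac{\partial}{\partial y^n}\right)$ and the frame field $\left(\frac{\partial}{\partial \bar{x}^1}, \ldots, \frac{\partial}{\partial \bar{y}^n}\right)$. The latter set of $n + m$ vector fields is certainly a linearly independent set at each point $(p, q) \in U \times V$. The crucial relations are $\frac{\partial}{\partial x^i} f = \frac{\partial}{\partial \bar{x}^i}(f \circ pr_1)$ and $\frac{\partial}{\partial y^i} f = \frac{\partial}{\partial \bar{y}^i}(f \circ pr_2)$.

Exercise 2.131. Show that $\left.\frac{\partial}{\partial x^i}\right|_p = T pr_1 \left.\frac{\partial}{\partial \bar{x}^i}\right|_{(p,q)}$ and $\left.\frac{\partial}{\partial y^i}\right|_q = T pr_2 \left.\frac{\partial}{\partial \bar{y}^i}\right|_{(p,q)}$.

Remark 2.132. In some circumstances, it is safe to abuse notation and denote $x^i \circ pr_1$ by $x^i$ and $y^i \circ pr_2$ by $y^i$. Of course we are then denoting $\frac{\partial}{\partial \bar{x}^i}$ by $\frac{\partial}{\partial x^i}$ and so on.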
A warning (The second fundamental confusion of calculus$^4$): For a chart $(U, x)$ with $x = (x^1, \ldots, x^n)$, we have defined $\frac{\partial f}{\partial x^i}$ for any appropriately defined smooth (or $C^1$) function $f$. However, this notation can be ambiguous. For example, the meaning of $\frac{\partial f}{\partial x^1}$ is not determined by the coordinate function $x^1$ alone, but implicitly depends on the rest of the coordinate functions. For example, in thermodynamics we see the following situation. We have three functions $P$, $V$ and $T$ which are not independent but may be interpreted as functions on some 2-dimensional manifold. Then it may be the case that any two of the three functions can serve as a coordinate system. The meaning of $\frac{\partial f}{\partial P}$ depends on whether we are using the coordinate functions $(P, V)$ or alternatively $(P, T)$. We must know not only which function we are allowing to vary, but also which other functions are held fixed. To get rid of the ambiguity, one can use the notations $\left(\frac{\partial f}{\partial P}\right)_V$ and $\left(\frac{\partial f}{\partial P}\right)_T$. In the first case, the coordinates are $(P, V)$, and $V$ is held fixed, while in the second case, we use coordinates $(P, T)$, and $T$ is held fixed. Another way to avoid ambiguity would be to use different names for the same functions depending on the chart of which they are considered coordinate functions. For example, consider the following change of coordinates:
$$y^1 = x^1 + x^2, \qquad y^2 = x^1 - x^2 + x^3, \qquad y^3 = x^3.$$
Here $y^3 = x^3$ as functions on the underlying manifold, but we use different symbols. Thus $\frac{\partial}{\partial y^3}$ may not be the same as $\frac{\partial}{\partial x^3}$. The chain rule shows that in fact $\frac{\partial}{\partial x^3} = \frac{\partial}{\partial y^2} + \frac{\partial}{\partial y^3}$. This latter method of destroying ambiguity is not very helpful in our thermodynamic example since the letters $P$, $V$ and $T$ are chosen to stand for the physical quantities of pressure, volume and temperature. Giving these functions more than one name would only be confusing.

$^4$ In [Pen], Penrose attributes this cute terminology to Nick Woodhouse.
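The warning above can be made concrete with the change of coordinates $y^1 = x^1 + x^2$, $y^2 = x^1 - x^2 + x^3$, $y^3 = x^3$. The Python sketch below is illustrative only: the test function $f = x^1 x^3$, the sample point, and all helper names are assumptions. It compares the partial derivative with respect to $x^3$ (holding $x^1, x^2$ fixed) with the partial derivative with respect to $y^3$ (holding $y^1, y^2$ fixed), even though $x^3 = y^3$ as functions, and checks the chain rule relation $\frac{\partial}{\partial x^3} = \frac{\partial}{\partial y^2} + \frac{\partial}{\partial y^3}$.

```python
def x_to_y(x):
    """The change of coordinates from the text."""
    x1, x2, x3 = x
    return (x1 + x2, x1 - x2 + x3, x3)

def y_to_x(y):
    """Inverse change of coordinates, solved by hand."""
    y1, y2, y3 = y
    return ((y1 + y2 - y3) / 2, (y1 - y2 + y3) / 2, y3)

def f_in_x(x):
    """An illustrative function, written in the x-coordinates: f = x1 * x3."""
    return x[0] * x[2]

def f_in_y(y):
    """The same function on the manifold, written in the y-coordinates."""
    return f_in_x(y_to_x(y))

def partial(func, point, i, h=1e-6):
    """Central-difference partial derivative in slot i, other slots held fixed."""
    up, down = list(point), list(point)
    up[i] += h
    down[i] -= h
    return (func(up) - func(down)) / (2 * h)

x_point = (1.0, 2.0, 3.0)
y_point = x_to_y(x_point)

df_dx3 = partial(f_in_x, x_point, 2)  # x1, x2 held fixed
df_dy3 = partial(f_in_y, y_point, 2)  # y1, y2 held fixed
df_dy2 = partial(f_in_y, y_point, 1)
```

At this point a hand computation gives $\partial f/\partial x^3 = x^1 = 1$ while $\partial f/\partial y^3 = x^1 - x^3/2 = -0.5$, and the difference is accounted for by $\partial f/\partial y^2 = x^3/2 = 1.5$.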
Problems

(1) Show that if $f : M \to N$ is a diffeomorphism, then for each $p \in M$ the tangent map $T_pf : T_pM \to T_{f(p)}N$ is a vector space isomorphism.

(2) Let $M$ and $N$ be smooth manifolds, and $f : M \to N$ a $C^\infty$ map. Suppose that $M$ is compact and that $N$ is connected. If $f$ is injective and $T_pf$ is an isomorphism for each $p \in M$, then show that $f$ is a diffeomorphism. (Use the inverse mapping theorem.)

(3) Find the integral curves in $\mathbb{R}^2$ of the vector field $X = e^{-x}\frac{\partial}{\partial x} + \frac{\partial}{\partial y}$ and determine if $X$ is complete or not.

(4) Which integral curves of the field $X = x^2\frac{\partial}{\partial x} + y\frac{\partial}{\partial y}$ are defined for all times $t$?
Problems
123
(5) Find a concrete description of the tangent bundle for each of the following manifolds: (a) Projective space IRpn. (b) The Grassmann manifold G (k, n).
(6) Recall that we have charts on IRP2 given by [x,y,z]1-t (Ul,U2) = (x/z,y/z) on U3 = {z i= O}, [x, y, z]1-t (VI, V2) = (x/y, z/y) on U2 = {y i= O}, [x, y, z] I-t (WI, W2) = (y/x, z/x) on Ul = {x i= O}. Show that there is a vector field on IRP2 which in the last coordinate chart above has the following coordinate expression:
a
a
WI--W2-· aWl aW2
What are the expressions for this vector field in the other two charts? (Caution: Your guess may be wrong!). (7) Show that the graph r(f) = {(p, f(p)) E M x N : p E M} of a smooth map f : M -+ N is a smooth manifold and that we have an isomorphism T(p,/(p)) (M x N) ~ T(p,j(p))r(f) EB Tf(p)N. (8) Prove Theorem 2.112. (9) Show that a manifold supports a frame field defined on the whole of M exactly when there is a trivialization of TM (see Definitions 2.130 and 2.58). (10) Prove Proposition 2.85. (11) Find natural coordinates for the double tangent bundle TTM. Show that there is a nice map s : TT M -+ TT M such that s 0 s = idTTM and such that T1f 0 S = T1fTM and T1fTM 0 s = T7r. Here 1f : TM -+ M and 1fTM : TTM -+ TM are the appropriate tangent bundle projection maps. (12) Let N be the subset of IR n+1 x IRn +l defined by N = {(x, y) : Ilxll = 1 and x . y = O} is a smooth manifold that is diffeomorphic to Tsn. (13) (Hessian) Suppose that f E COO(M) and that dfp = 0 for some p E M. Show that for any smooth vector fields X and Y on M we have that Yp(Xf) = Xp(Yf). Let Hf,p(v,w) := Xp(Yf) , where X and Y are such that Xp = V and Yp = w. Show that Hf,p(v, w) is independent of the extension vector fields X and Y and that the resulting map Hf,p : TpM x TpM -+ IR is bilinear. Hf,p is called the Hessian of f at p. Show that the assumption djp = 0 is needed. (14) Show that for a smooth map F : M -+ N, the (bundle) tangent map T F : T M -+ TN is smooth. Sometimes it is supposed that one can
obtain a well-defined map $F_* : \mathfrak{X}(M) \to \mathfrak{X}(N)$ by thinking of vector fields as derivations on functions and then letting $(F_*X)f = X(f \circ F)$ for $f \in C^\infty(N)$. Show why this is misguided. Recall that the proper definition of $F_* : \mathfrak{X}(M) \to \mathfrak{X}(N)$ would be $F_*X := TF \circ X \circ F^{-1}$ and is defined in case $F$ is a diffeomorphism. What if $F$ is merely surjective?
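(15) Show that if $\psi : M' \to M$ is a smooth covering map, then so also is $T\psi : TM' \to TM$.
(16) Define the map $f : M_{n\times n}(\mathbb{R}) \to \mathrm{sym}(M_{n\times n}(\mathbb{R}))$ by $f(A) := A^T A$, where $M_{n\times n}(\mathbb{R})$ and $\mathrm{sym}(M_{n\times n}(\mathbb{R}))$ are the manifolds of $n \times n$ matrices and $n \times n$ symmetric matrices respectively. Identify $T_A(M_{n\times n}(\mathbb{R}))$ with $M_{n\times n}(\mathbb{R})$ and $T_{f(A)}\mathrm{sym}(M_{n\times n}(\mathbb{R}))$ with $\mathrm{sym}(M_{n\times n}(\mathbb{R}))$ in the natural way for each $A$. Calculate $T_I f : M_{n\times n}(\mathbb{R}) \to \mathrm{sym}(M_{n\times n}(\mathbb{R}))$ using these identifications.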
(17) Let $f_1, \ldots, f_N$ be a set of smooth functions defined on an open subset $U$ of a smooth manifold $M$. Show that if $df_1(p), \ldots, df_N(p)$ spans $T_p^*M$ for some $p \in U$, then some ordered subset of $\{f_1, \ldots, f_N\}$ provides a coordinate system on some open subset $V$ of $U$ containing $p$.
(18) Let $\mathcal{D}_r$ be the vector space of derivations on $C^r(M)$ at $p \in M$, where $r \le \infty$ is a positive integer or $\infty$. Fill in the details in the following outline which studies $\mathcal{D}_r$. It will be shown that $\mathcal{D}_r$ is not finite-dimensional unless $r = \infty$.
(a) We may assume that $M = \mathbb{R}^n$ and $p = 0$ is the origin. Let $m_r := \{f \in C^r(\mathbb{R}^n) : f(0) = 0\}$ and let $m_r^2$ be the subspace spanned by the functions of the form $fg$ for $f, g \in m_r$. We form the quotient space $m_r/m_r^2$ and consider its vector space dual $(m_r/m_r^2)^*$. Show that if $\delta \in \mathcal{D}_r$, then $\delta$ restricts to a linear functional on $m_r$ and is zero on all elements of $m_r^2$. Conclude that $\delta$ gives a linear functional on $m_r/m_r^2$. Thus we have a linear map $\mathcal{D}_r \to (m_r/m_r^2)^*$.
(b) Show that the map $\mathcal{D}_r \to (m_r/m_r^2)^*$ given above has an inverse. Hint: For a $\lambda \in (m_r/m_r^2)^*$, consider $\delta_\lambda(f) := \lambda([f - f(0)])$, where $f \in C^r(\mathbb{R}^n)$ and hence $[f - f(0)] \in m_r/m_r^2$. Conclude that by taking $r = \infty$ we have $T_0\mathbb{R}^n = \mathcal{D}_\infty \cong (m_\infty/m_\infty^2)^*$. The case $r < \infty$ is different as we see next.
(c) Let $r < \infty$. The goal from here on is to show that $m_r/m_r^2$ and hence $(m_r/m_r^2)^*$ are infinite-dimensional. We start out with the case $\mathbb{R}^n = \mathbb{R}$. First show that if $f \in m_r$, then $f(x) = xg(x)$ for $g \in C^{r-1}(\mathbb{R})$. Also if $f \in m_r^2$, then $f(x) = x^2 g(x)$ for $g \in C^{r-1}(\mathbb{R})$.
(d) For each $r \in \{1, 2, 3, \ldots\}$ and each $\varepsilon \in (0,1)$, define
\[ g^r_\varepsilon(x) := \begin{cases} x^{r+\varepsilon} & \text{for } x > 0, \\ 0 & \text{for } x \le 0. \end{cases} \]
Then $g^r_\varepsilon \in m_r$, but $g^r_\varepsilon \notin C^{r+1}(\mathbb{R})$. Show that for any fixed $r \in \{1, 2, 3, \ldots\}$, the set of elements of the form $[g^r_\varepsilon] := g^r_\varepsilon + m_r^2$ for $\varepsilon \in (0,1)$ is linearly independent in the quotient. Hint: Use induction on $r$. In the case of $r = 1$, it would suffice to show that if we are given $0 < \varepsilon_1 < \cdots < \varepsilon_l < 1$ and if $\sum_{j=1}^{l} a_j g^1_{\varepsilon_j} \in m_1^2$, then $a_j = 0$ for all $j$. (Thanks to Lance Drager for donating this problem and its solution.)
(19) Find the integral curves of the vector field on $\mathbb{R}^2$ given by
\[ X(x, y) := x^2 \frac{\partial}{\partial x} + xy \frac{\partial}{\partial y}. \]
(20) Show that it is possible that a vector field defined on an open subset of a smooth manifold $M$ may have no smooth extension to all of $M$.
(21) Find the integral curves (and hence the flow) for the vector field on $\mathbb{R}^2$ given by $X(x, y) := -y \frac{\partial}{\partial x} + x \frac{\partial}{\partial y}$.
(22) Let $N$ be a point in the unit sphere $S^2$. Find a vector field on $S^2 \setminus \{N\}$ that is not complete and one that is complete.
(23) Using the usual spherical coordinates $(\phi, \theta)$ on $S^2$, calculate the bracket $[\phi \frac{\partial}{\partial \theta}, \theta \frac{\partial}{\partial \phi}]$.
(24) Show that if $X$ and $Y$ are (time independent) vector fields that have flows $\varphi^X_t$ and $\varphi^Y_t$, then if $[X, Y] = 0$, the flow of $X + Y$ is $\varphi^X_t \circ \varphi^Y_t$.
(25) Recall that the tangent bundle of the open set $GL(n, \mathbb{R})$ in $M_{n\times n}(\mathbb{R})$ is identified with $GL(n, \mathbb{R}) \times M_{n\times n}(\mathbb{R})$. Consider the vector field on $GL(n, \mathbb{R})$ given by $X : g \mapsto (g, g^2)$. Find the flow of $X$.
(26) Let
\[ t \mapsto Q_t = \begin{pmatrix} \cos t & -\sin t & 0 \\ \sin t & \cos t & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
for $t \in \mathbb{R}$. Let $\phi(t, P) := Q_t P$, where $P$ is a plane in $\mathbb{R}^3$. Show that this defines a flow on the Grassmann manifold $G(3,2)$. Find the local expression in some coordinate system of the vector field $X_Q$ that gives this flow. Do the same thing for the flow
\[ t \mapsto R_t = \begin{pmatrix} \cos t & 0 & -\sin t \\ 0 & 1 & 0 \\ \sin t & 0 & \cos t \end{pmatrix} \]
and find the vector field $X_R$. Find the bracket $[X_R, X_Q]$.
(27) Develop definitions for tangent bundle and cotangent bundle for manifolds with corners. (See Problem 21.) [Hint: A curve into an $n$-manifold with corners should be considered smooth only if, when viewed in a chart, it has an extension to a map into $\mathbb{R}^n$. Similarly, a function is smooth at a corner (or boundary) point only if its local representative
in some chart containing the point can be extended to an open set in $\mathbb{R}^n$.]
(28) Show that if $p(x) = p(x_1, \ldots, x_n)$ is a homogeneous polynomial, so that for some $m \in \mathbb{N}$,
\[ p(tx_1, \ldots, tx_n) = t^m p(x_1, \ldots, x_n), \]
then as long as $c \neq 0$, the set $p^{-1}(c)$ is an $(n-1)$-dimensional submanifold of $\mathbb{R}^n$.
(29) Suppose that $g : M \to N$ is transverse to a submanifold $W \subset N$. For another smooth map $f : Y \to M$, show that $f \pitchfork g^{-1}(W)$ if and only if $(g \circ f) \pitchfork W$.
(30) Let $M \times N$ be a product manifold. Show that for each $X \in \mathfrak{X}(M)$ there is a vector field $\widetilde{X} \in \mathfrak{X}(M \times N)$ that is $\mathrm{pr}_1$-related to $X$ and $\mathrm{pr}_2$-related to the zero field on $N$. We call $\widetilde{X}$ the lift of $X$. Similarly, we may lift a field on $N$ to $M \times N$.
Chapter 3
Immersion and Submersion
Suppose we are given a smooth map $f : M \to N$. Near a point $p \in M$, the tangent map $T_pf : T_pM \to T_{f(p)}N$ is a linear approximation of $f$. A very important invariant of a linear map is its rank, which is the dimension of its image. Recall that the rank of a smooth map $f$ at $p$ is defined to be the rank of $T_pf$. It turns out that under certain conditions on the rank of $f$ at $p$, or near $p$, we can draw conclusions about the behavior of $f$ near $p$. The basic idea is that $f$ behaves very much like $T_pf$. If $L : V \to W$ is a linear map of finite-dimensional vector spaces, then $\operatorname{Ker} L$ and $L(V)$ are subspaces (and hence submanifolds). We study the extent to which something similar happens for smooth maps between manifolds. In this chapter we make heavy use of some basic theorems of multivariable calculus such as the implicit and inverse mapping theorems as well as the constant rank theorem. These can be found in Appendix C (see Theorems C.1, C.2 and C.5). More on calculus, including a proof of the constant rank theorem, can be found in the online supplement to this text [Lee, Jeff].

3.1. Immersions

Definition 3.1. A map $f : M \to N$ is called an immersion at $p \in M$ if $T_pf : T_pM \to T_{f(p)}N$ is an injection. A map $f : M \to N$ is called an immersion if it is an immersion at every $p \in M$.

Note that $T_pf : T_pM \to T_{f(p)}N$ is an injection if and only if its rank is equal to $\dim(M)$. Thus an immersion has constant rank equal to the dimension of its domain.
Immersions of open subsets of $\mathbb{R}^2$ into $\mathbb{R}^3$ appear as surfaces that may self-intersect, or periodically retrace themselves, or approach themselves in various limiting ways. The map $\mathbb{R}^2 \to \mathbb{R}^3$ given by $(u, v) \mapsto (\cos u, \sin u, v)$ is an immersion, as is the map $(u, v) \mapsto (\cos u \sin v, \sin u \sin v, (1 - 2\cos^2 v)\cos v)$ on the open set where $\sin v \neq 0$ (at points with $\sin v = 0$ its derivative in the $u$ direction vanishes). The map $S^2 \to \mathbb{R}^3$ given by $(x, y, z) \mapsto (x, y, z - 2z^3)$ is also an immersion. By contrast, the map $f : S^2 \to \mathbb{R}^3$ given by $(x, y, z) \mapsto (x, y, 0)$ is not an immersion at any point on the equator $S^2 \cap \{z = 0\}$.
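As a quick check of Definition 3.1 for the first map above, one can compute the Jacobian directly. The following is only a sketch; the name $F$ is introduced here for the map $(u, v) \mapsto (\cos u, \sin u, v)$ and does not appear in the text.

```latex
% F is a label introduced here for illustration only.
\[
F(u,v) = (\cos u,\ \sin u,\ v), \qquad
DF(u,v) =
\begin{pmatrix}
-\sin u & 0 \\
 \cos u & 0 \\
 0      & 1
\end{pmatrix}.
\]
% The two columns are orthogonal and each has Euclidean norm 1,
% so DF(u,v) has rank 2 = dim R^2 at every point: F is an immersion.
% F is not injective, since F(u + 2\pi, v) = F(u, v); its image is a
% cylinder that is retraced periodically.
```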
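Example 3.2. We describe an immersion of the torus $T^2 := S^1 \times S^1$ into $\mathbb{R}^3$. We can represent points in $T^2$ as pairs $(e^{i\theta_1}, e^{i\theta_2})$. It is easy to see that, for fixed $a, b > 0$, the following map is well-defined: $(e^{i\theta_1}, e^{i\theta_2}) \mapsto (x(e^{i\theta_1}, e^{i\theta_2}), y(e^{i\theta_1}, e^{i\theta_2}), z(e^{i\theta_1}, e^{i\theta_2}))$, where
\[ x(e^{i\theta_1}, e^{i\theta_2}) = (a + b\cos\theta_1)\cos\theta_2, \]
\[ y(e^{i\theta_1}, e^{i\theta_2}) = (a + b\cos\theta_1)\sin\theta_2, \]
\[ z(e^{i\theta_1}, e^{i\theta_2}) = b\sin\theta_1. \]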
Exercise 3.3. Show that the map of the above example is an immersion. Give conditions on $a$ and $b$ that guarantee that the map is a 1-1 immersion.

Theorem 3.4. Let $f : M \to N$ be a smooth map that is an immersion at $p$. Then for any chart $(U, x)$ centered at $p$, there is a chart $(V, y)$ centered at $f(p)$ such that $f(U) \subset V$ and such that the corresponding coordinate expression for $f$ is $(x^1, \ldots, x^k) \mapsto (x^1, \ldots, x^k, 0, \ldots, 0) \in \mathbb{R}^n$. Here, $n$ is the dimension of $N$ and $k = \dim(M)$ is the rank of $T_pf$.

Proof. Follows easily from Corollary C.3. □
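To make the normal form of Theorem 3.4 concrete, here is a small worked instance. It is a sketch under assumptions made only for this illustration: the map $f(t) = (t, t^2)$ and the charts below are chosen here and are not taken from the text.

```latex
% Illustrative choice (not from the text): f : R -> R^2, f(t) = (t, t^2),
% which is an immersion because f'(t) = (1, 2t) never vanishes.
\[
x = \mathrm{id}_{\mathbb{R}}, \qquad
y(a,b) := (a,\ b - a^2), \qquad
y^{-1}(a,c) = (a,\ c + a^2).
\]
% y is a diffeomorphism of R^2 with y(f(0)) = y(0,0) = (0,0),
% so (R^2, y) is a chart centered at f(0). In these charts
\[
(y \circ f \circ x^{-1})(t) = y(t, t^2) = (t,\ t^2 - t^2) = (t,\ 0),
\]
% which is exactly the form (x^1) |-> (x^1, 0) with k = 1 and n = 2.
```

The chart $y$ simply shears the parabola flat, which is the typical way such adapted charts arise from the inverse mapping theorem.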
Theorem 3.5. If $f : M \to N$ is an immersion (so an immersion at every point), and if $f$ is a homeomorphism onto its image $f(M)$ (using the relative topology on $f(M)$), then $f(M)$ is a regular submanifold of $N$.

Proof. Let $k$ be the dimension of $M$ and let $n$ be the dimension of $N$. Clearly $f$ is injective since it is a homeomorphism. Let $f(p) \in f(M)$ for a unique $p$. By the previous theorem, there are charts $(U, x)$ with $p \in U$ and $(V, y)$ with $f(p) \in V$ such that the corresponding coordinate expression for $f$ is $(x^1, \ldots, x^k) \mapsto (x^1, \ldots, x^k, 0, \ldots, 0) \in \mathbb{R}^n$. We arrange to have $f(U) \subset V$. But $f(U)$ is open in the relative topology on $f(M)$, so there is an open set $O \subset V$ in $N$ such that $f(U) = f(M) \cap O$. Now it is clear that $(O, y|_O)$ is a chart with the regular submanifold property, and so since $p$ was arbitrary, we conclude that $f(M)$ is a regular submanifold. □

If $f : M \to N$ is an immersion that is a homeomorphism onto its image (as in the theorem above), then we say that $f$ is an embedding.
Exercise 3.6. Show that every injective immersion of a compact manifold is an embedding.

Exercise 3.7. Show that if $f : M \to N$ is an immersion and $p \in M$, then there is an open $U$ containing $p$ such that $f|_U$ is an embedding.

Exercise 3.8. Recall the definition of a vector field along a map (Definition 2.68). Let $X$ be a vector field along $f : N \to M$. Show that if $f$ is an embedding, then there is an open neighborhood $U$ of $f(N)$ and a vector field $\widetilde{X} \in \mathfrak{X}(U)$ such that $X = \widetilde{X} \circ f$.

Recall that a continuous map $f$ is said to be proper if $f^{-1}(K)$ is compact whenever $K$ is compact.

Exercise 3.9. Show that a proper 1-1 immersion is an embedding. [Hint: This is mainly a topological argument. You may assume (without loss of generality) that the spaces involved are Hausdorff and second countable. The slightly more general case of paracompact Hausdorff spaces follows.]

Definition 3.10. Let $S$ and $M$ be smooth manifolds. A smooth map $f : S \to M$ will be called smoothly universal if for any smooth manifold $N$, a mapping $g : N \to S$ is smooth if and only if $f \circ g$ is smooth.
Definition 3.11. A weak embedding is a 1-1 immersion which is smoothly universal.

Let $f : S \to M$ be a weak embedding and let $\mathcal{A}$ be the maximal atlas that gives the differentiable structure on $S$. Suppose we consider a different differentiable structure on $S$ given by a maximal atlas $\mathcal{A}_2$. Now suppose that $f : S \to M$ is also a weak embedding with respect to $\mathcal{A}_2$. Resorting to seldom used pedantic notation, we are supposing that both $f : (S, \mathcal{A}) \to M$ and $f : (S, \mathcal{A}_2) \to M$ are weak embeddings. From this it is easy to show that the identity map gives smooth maps $(S, \mathcal{A}) \to (S, \mathcal{A}_2)$ and $(S, \mathcal{A}_2) \to (S, \mathcal{A})$. This means that in fact $\mathcal{A} = \mathcal{A}_2$, so that the smooth structure of $S$ is uniquely determined by the fact that $f$ is a weak embedding.

Exercise 3.12. Show that every embedding is a weak embedding.
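The uniqueness remark following Definition 3.11 says the identity map is smooth in both directions "easily"; the short argument can be spelled out as follows. This is a sketch, and the labels $f_1$, $f_2$ for the same set map $f$ viewed with the two atlases are introduced here only for bookkeeping.

```latex
% f_1 and f_2 are bookkeeping labels (assumptions of this sketch) for the
% same underlying map f, with the two smooth structures on S.
\[
f_1 := f : (S,\mathcal{A}) \to M, \qquad
f_2 := f : (S,\mathcal{A}_2) \to M .
\]
% Apply smooth universality of f_2 to g = id : (S, A) -> (S, A_2):
\[
f_2 \circ \mathrm{id} = f_1 \ \text{is smooth}
\quad\Longrightarrow\quad
\mathrm{id} : (S,\mathcal{A}) \to (S,\mathcal{A}_2) \ \text{is smooth}.
\]
% Exchanging the roles of A and A_2 gives smoothness of the inverse, so id
% is a diffeomorphism. Hence every chart of A is compatible with every chart
% of A_2, and maximality of both atlases forces A = A_2.
```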
Figure 3.1. Figure eight immersions
In terms of 1-1 immersions, we have the following inclusions:
\[ \{\text{proper embeddings}\} \subset \{\text{embeddings}\} \subset \{\text{weak embeddings}\} \subset \{\text{1-1 immersions}\}. \]
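Each of the inclusions above is strict. The following standard examples, which are not taken from the text and are offered only as a hedged illustration, separate the classes; the names $\beta$, $\gamma$ and $g$ are introduced here.

```latex
% Standard examples (assumptions of this sketch, not from the text).
% (1) An embedding that is not proper: the inclusion (0,1) -> R.
%     The preimage of the compact set [0,1] is (0,1), which is not compact.
% (2) A weak embedding that is not an embedding: for irrational \alpha,
\[
\gamma : \mathbb{R} \to S^1 \times S^1, \qquad
\gamma(t) = \bigl(e^{it},\ e^{i\alpha t}\bigr),
\]
%     whose image is dense, so \gamma is not a homeomorphism onto its image
%     in the relative topology.
% (3) A 1-1 immersion that is not a weak embedding (the figure eight):
\[
\beta : (-\pi,\pi) \to \mathbb{R}^2, \qquad
\beta(t) = (\sin 2t,\ \sin t).
\]
%     Define g : (-\pi,\pi) -> (-\pi,\pi) by g(s) = s + \pi for s < 0,
%     g(0) = 0, and g(s) = s - \pi for s > 0. Then
\[
(\beta \circ g)(s) = (\sin 2s,\ -\sin s) \quad \text{for all } s ,
\]
%     which is smooth, while g itself is discontinuous at s = 0.
%     So \beta is not smoothly universal.
```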
3.2. Immersed and Weakly Embedded Submanifolds

We have already seen the definition of a regular submanifold. The more general notion of a submanifold is supposed to realize the "subobject" in the category of smooth manifolds and smooth maps. Submanifolds are to manifolds what subsets are to sets in general. However, what exactly should be the definition of a submanifold? The fact is that there is some disagreement on this point. From the category-theoretic point of view it seems natural that a submanifold of $M$ should be some kind of smooth map $f : S \to M$. This is not quite in line with our definition of regular submanifold, which is, after all, a type of subset of $M$. There is considerable motivation to define submanifolds in general as certain subsets; perhaps the images of certain nice smooth maps. We shall follow this route.

Definition 3.13. Let $S$ be a subset of a smooth manifold $M$. If $S$ is a smooth manifold such that the inclusion map $\iota : S \to M$ is an injective immersion, then we call $S$ an immersed submanifold.

Notice that in the above definition, $S$ certainly need not have the subspace topology! Its topology is that induced by its own smooth structure. The reader may rightfully wonder just how $S$ could acquire such a smooth structure in the first place. If $f : N \to M$ is an injective immersion, then $S := f(N)$ can be given a smooth structure so that it is an immersed submanifold. Indeed, we can simply transfer the structure from $N$ via the bijection $f : N \to f(N)$. However, this may not be the only possible smooth structure on $f(N)$ which makes it an immersed submanifold. Thus it is imperative to specify what smooth structure is being used. Simply looking at
Figure 3.2. Immersions can approach themselves
the set is not enough. For example, in Figure 3.1 we see the same figure eight shaped subset drawn twice, but with arrows suggesting that it is the image of two quite different immersions which provide two quite different smooth structures.

Suppose that $S$ is a $k$-dimensional immersed submanifold of a smooth $n$-manifold $M$, and let $p \in S$. Then using Theorem 3.4, we see that there is a chart $(O, x)$ on $S$, and a chart $(V, y)$ on $M$, with $p \in O \subset V$, such that
\[ y \circ \iota \circ x^{-1} = y \circ x^{-1} : x(O) \to y(V) \]
has the form $(a^1, \ldots, a^k) \mapsto (a^1, \ldots, a^k, 0, \ldots, 0)$. This means that $y(O) = y \circ x^{-1}(x(O))$ is a relatively open subset of $\mathbb{R}^k \times \{0\}$. Thus there is an open subset $W$ of $y(V) \subset \mathbb{R}^n$ such that $y(O) = W \cap (\mathbb{R}^k \times \{0\})$. Letting $U_1 := y^{-1}(W)$, we see that
\[ y(U_1 \cap O) = y(U_1) \cap (\mathbb{R}^k \times \{0\}). \]
Thus the chart $(y|_{U_1}, U_1)$ has the submanifold property with respect to $O$ (but not necessarily with respect to $S$). The set $O$ has a smooth structure as an open submanifold of $S$. But this is the same smooth structure $O$ has as a regular submanifold. To see this note that the restrictions $y^1|_O, \ldots, y^k|_O$ combine to give an admissible chart on $S$. Indeed, using the functions $y^1|_O, \ldots, y^k|_O$ we obtain a bijection of $O$ with an open subset of $\mathbb{R}^k$. We only need to show that this bijection is smoothly related to the chart $(O, x)$, and this amounts to showing that $y^1|_O \circ x^{-1}, \ldots, y^k|_O \circ x^{-1}$ are smooth. But this follows immediately from the fact that $y \circ x^{-1}$ is smooth. Notice that unlike the case of a regular submanifold, it may be that no matter how small $V$,
\[ y(V \cap O) \neq y(V) \cap (\mathbb{R}^k \times \{0\}), \]
as indicated in Figure 3.2. So in summary, each point of an immersed submanifold has a neighborhood that is a regular submanifold.

Proposition 3.14. Let $S \subset M$ be an immersed submanifold of dimension $k$ and let $f : N \to S$ be a map. Suppose that $\iota \circ f : N \to M$ is smooth, where $\iota : S \hookrightarrow M$ is the inclusion. Then if $f : N \to S$ is continuous, it is also smooth.
Proof. We wish to show that $f : N \to S$ is smooth if $f^{-1}(O)$ is open for every set $O \subset S$ that is open in the manifold topology on $S$. Let $p \in N$ and choose a chart $(V, y)$ for $M$ centered at $\iota \circ f(p)$, so that
\[ U = \{q \in V : y^{k+1}(q) = \cdots = y^n(q) = 0\} \]
is an open neighborhood of $f(p)$ in $S$ and such that $y^1|_U, \ldots, y^k|_U$ are coordinates for $S$ on $U$. By assumption $f^{-1}(U)$ is open. Thus $(\iota \circ f)(f^{-1}(U)) \subset U$. In other words, $\iota \circ f$ maps an open neighborhood of $p$ into $U$. To test for the smoothness of $f$, we consider the functions $(y^i|_U) \circ f$ on the set $f^{-1}(U)$. But $(y^i|_U) \circ f = y^i \circ \iota \circ f$, and these are clearly smooth by the assumption that $\iota \circ f$ is smooth. □

Sometimes the previous result is stated differently (and somewhat imprecisely): Suppose that $f : N \to M$ is a smooth map with image inside an immersed submanifold $S$; then $f$ is smooth as a map into $S$ if it is continuous as a map into $S$. The lack of a notational distinction between $f$ as a map into $S$ and $f$ as a map into $M$ is what makes this way of stating things less desirable.

Let $S \subset M$ and suppose that $S$ has a smooth structure. To say that an inclusion $S \hookrightarrow M$ is an embedding is easily seen to be the same as saying that $S$ is a regular submanifold, and so we also say that $S$ is embedded in $M$.
Corollary 3.15. Suppose that $S \subset M$ is a regular submanifold. Let $f : N \to S$ be a map such that $\iota \circ f : N \to M$ is smooth. Then $f : N \to S$ is smooth.

Proof. The map $\iota : S \hookrightarrow M$ is certainly an immersion, and so by Proposition 3.14 we need only check that $f : N \to S$ is continuous. Let $O$ be open in $S$. Then since $S$ has the relative topology, $O = U \cap S$ for some open set $U$ in $M$. Then $f^{-1}(O) = f^{-1}(U \cap S) = f^{-1}(\iota^{-1}(U)) = (\iota \circ f)^{-1}(U)$, which is open since $\iota \circ f$ is continuous. Thus $f$ is continuous. □

Definition 3.16. Let $S$ be a subset of a smooth manifold $M$. If $S$ is a smooth manifold such that the inclusion map $\iota : S \to M$ is a weak embedding, then we say that $S$ is a weakly embedded submanifold.

From the properties of weak embeddings we know that for any given subset $S \subset M$ there is at most one smooth structure on $S$ that makes it a weakly embedded submanifold.
Corresponding to each type of injective immersion considered so far we have in their images different notions of submanifold:
\[ \{\text{proper submanifolds}\} \subset \{\text{regular submanifolds}\} \subset \{\text{weakly embedded submanifolds}\} \subset \{\text{immersed submanifolds}\}. \]
We wish to further characterize the weakly embedded submanifolds.

Definition 3.17. Let $S$ be any subset of a smooth manifold $M$. For any $x \in S$, denote by $C_x(S)$ the set of all points of $S$ that can be connected to $x$ by a smooth curve with image entirely inside $S$.

It is important to be clear that $C_x(S)$ is not necessarily the connected component of $S$ with its relative topology since, for example, $S$ could be the image of an injective nowhere differentiable curve. In the latter case, $C_x(S) = \{x\}$ for all $x \in S$!

Definition 3.18. We say that a subset $S$ of an $n$-manifold $M$ has property W($k$) if for each $s_0 \in S$ there exists a chart $(U, x)$ centered at $s_0$ such that $x(C_{s_0}(U \cap S)) = x(U) \cap (\mathbb{R}^k \times \{0\})$. Here $\mathbb{R}^n = \mathbb{R}^k \times \mathbb{R}^{n-k}$.

Together, the next two propositions show that weakly embedded submanifolds are exactly those subsets that have property W($k$) for some $k$. Our proof follows that of Michor [Mich], who refers to subsets with property W($k$), for some $k$, as initial submanifolds. With Michor's terminology, the result will be that the initial submanifolds are the same as the weakly embedded submanifolds.

Proposition 3.19. If an injective immersion $f : S \to M$ is smoothly universal, then the image $f(S)$ has property W($k$) where $k = \dim(S)$. In particular, if $S \subset M$ is a weakly embedded submanifold of $M$, then it has property W($k$) where $k = \dim(S)$.

Proof. Let $\dim(S) = k$ and $\dim(M) = n$. Choose $s_0 \in S$. Since $f$ is an immersion, we may pick a coordinate chart $(W, w)$ for $S$ centered at $s_0$ and a chart $(V, v)$ for $M$ centered at $f(s_0)$ such that
\[ v \circ f \circ w^{-1}(y) = (y, 0) = (y^1, \ldots, y^k, 0, \ldots, 0). \]
Choose an $r > 0$ small enough that $B^k(0, 2r) \subset w(W)$ and $B^n(0, 2r) \subset v(V)$. Let $U = v^{-1}(B^n(0, r))$ and $W_1 = w^{-1}(B^k(0, r))$. Let $x := v|_U$. We show that the coordinate chart $(U, x)$ satisfies the conditions of Definition 3.18:
\[ x^{-1}(x(U) \cap (\mathbb{R}^k \times \{0\})) = x^{-1}\{(y, 0) : \|y\| < r\} = f \circ w^{-1} \circ (x \circ f \circ w^{-1})^{-1}(\{(y, 0) : \|y\| < r\}) = f \circ w^{-1}(\{y : \|y\| < r\}) = f(W_1). \]
Clearly $f(W_1) \subset f(S)$, but we also have
\[ v \circ f(W_1) \subset v \circ f \circ w^{-1}(B^k(0, r)) = B^n(0, r) \cap (\mathbb{R}^k \times \{0\}) \subset B^n(0, r), \]
so that $f(W_1) \subset v^{-1}(B^n(0, r)) = U$. Thus $f(W_1) \subset U \cap f(S)$. Since $f(W_1)$ is smoothly contractible to $f(s_0)$, every point of $f(W_1)$ is connected to $f(s_0)$ by a smooth curve completely contained in $f(W_1) \subset U \cap f(S)$. This implies that $f(W_1) \subset C_{f(s_0)}(U \cap f(S))$. Thus $x^{-1}(x(U) \cap (\mathbb{R}^k \times \{0\})) \subset C_{f(s_0)}(U \cap f(S))$ or $x(U) \cap (\mathbb{R}^k \times \{0\}) \subset x(C_{f(s_0)}(U \cap f(S)))$.

Conversely, let $z \in C_{f(s_0)}(U \cap f(S))$. By definition there must be a smooth curve $c : [0, 1] \to M$ starting at $f(s_0)$ and ending at $z$ with $c([0, 1]) \subset U \cap f(S)$. Since $f : S \to M$ is injective and smoothly universal, there is a unique smooth curve $c_1 : [0, 1] \to S$ with $f \circ c_1 = c$.

Claim: $c_1([0, 1]) \subset W_1$. Assume not. Then there is a number $t \in [0, 1]$ with $c_1(t) \in w^{-1}(\{r \le \|y\| < 2r\})$. Therefore,
\[ (v \circ f)(c_1(t)) \in (v \circ f \circ w^{-1})(\{r \le \|y\| < 2r\}) = \{(y, 0) : r \le \|y\| < 2r\} \subset \{z \in \mathbb{R}^n : r \le \|z\| < 2r\}. \]
This implies that $(v \circ f \circ c_1)(t) = (v \circ c)(t) \in \{z \in \mathbb{R}^n : r \le \|z\| < 2r\}$, which in turn implies the contradiction $c(t) \notin U$. The claim is proven.

The fact that $c_1([0, 1]) \subset W_1$ implies $c_1(1) = f^{-1}(z) \in W_1$, and so $z \in f(W_1)$. As a result we have $C_{f(s_0)}(U \cap f(S)) = f(W_1)$ which together with the first half of the proof gives the result:
\[ f(W_1) = x^{-1}(x(U) \cap (\mathbb{R}^k \times \{0\})) \subset C_{f(s_0)}(U \cap f(S)) = f(W_1) \]
\[ \Longrightarrow \quad x^{-1}(x(U) \cap (\mathbb{R}^k \times \{0\})) = C_{f(s_0)}(U \cap f(S)) \]
\[ \Longrightarrow \quad x(U) \cap (\mathbb{R}^k \times \{0\}) = x(C_{f(s_0)}(U \cap f(S))). \]
□
Proposition 3.20. If $S \subset M$ has property W($k$), then there is a unique smooth structure on $S$ which makes it a $k$-dimensional weakly embedded submanifold of $M$.

Proof. (Sketch) We are given that for every $s \in S$, there exists a chart $(U_s, x_s)$ with $x_s(s) = 0$ and with $x_s(C_s(U_s \cap S)) = x_s(U_s) \cap (\mathbb{R}^k \times \{0\})$. The charts on $S$ will be the restrictions of the charts $(U_s, x_s)$ to the sets $C_s(U_s \cap S)$. The overlap maps are smooth because they are restrictions of overlap maps on $M$ to subsets of the form $V \cap (\mathbb{R}^k \times \{0\})$ for $V$ open in $\mathbb{R}^n$. If $(U_{s_1}, x_{s_1})$ and $(U_{s_2}, x_{s_2})$ are two such charts with corresponding sets $S_1 := C_{s_1}(U_{s_1} \cap S)$ and $S_2 := C_{s_2}(U_{s_2} \cap S)$, then we need to check that $x_{s_1}(S_1 \cap S_2)$ is open in $\mathbb{R}^k \times \{0\}$ (recall the definition of smooth atlas). For each $p \in U_{s_1} \cap U_{s_2} \cap S$ consider the set $C(p) := C_p(U_{s_1} \cap U_{s_2} \cap S)$. We leave it to the reader to show that $C(p) \subset S_1 \cap S_2$ and that for any $p$ and $q$ the sets $C(p)$ and $C(q)$ are either equal or disjoint. Thus the sets $C(p)$ form a partition of $U_{s_1} \cap U_{s_2} \cap S$. It is not hard to see that each $C(p)$ maps onto a connected path component of $x_{s_1}(U_{s_1} \cap U_{s_2} \cap S)$ and that every path component of this set is the image of some $C(p)$. But this implies that $x_{s_1}(U_{s_1} \cap U_{s_2} \cap S)$ is open.

Notice however that the topology induced by the smooth structure thus obtained on $S$ is finer than the relative topology that $S$ inherits from $M$. This is because the sets of the form $C_s(U \cap S)$ are not necessarily open in the relative topology. Since it is a finer topology, it is also Hausdorff. It is clear that with this smooth structure on $S$, the inclusion $\iota : S \hookrightarrow M$ is an injective immersion.

We now show that the inclusion map $\iota : S \hookrightarrow M$ is smoothly universal and hence a weak embedding. By the comments following Definition 3.11 the smooth structure on $S$ is unique. Let $g : N \to S$ be a map and suppose that $\iota \circ g$ is smooth. Given $x \in N$, choose a chart $(U_s, x_s)$ where $s = g(x)$. The set $g^{-1}(U_s)$ is open since $\iota \circ g$ is continuous and $(\iota \circ g)^{-1}(U_s) = g^{-1} \circ \iota^{-1}(U_s) = g^{-1}(U_s)$. We may choose a chart $(V, y)$ centered at $x$ with $V \subset g^{-1}(U_s)$ and we may arrange that $y(V)$ is a ball centered at the origin. This means that $\iota \circ g(V)$ is smoothly contractible in $U_{g(x)} \cap S$ and hence $g(V) \subset C_{g(x)}(U_{g(x)} \cap S)$. But then
\[ x_s|_{C_s(U_s \cap S)} \circ g \circ y^{-1} = x_s \circ (\iota \circ g) \circ y^{-1}, \]
and so $g$ is smooth because $\iota \circ g$ is smooth.

To be completely finished, we need to show that with the topology induced by the atlas, each connected component of $S$ is second countable. We can give a quick proof, but it depends on Riemannian metrics which we have yet to discuss. The idea is that on any paracompact smooth manifold, there are plenty of Riemannian metrics. A choice of Riemannian metric gives a notion of distance making every connected component a second countable metric space. If we put such a Riemannian metric on $M$, then it induces one on $S$ (by restriction). This means that each component of $S$ is also a separable metric space and hence a second countable Hausdorff topological space. □

We say that two immersions $f_1 : N_1 \to M$ and $f_2 : N_2 \to M$ are equivalent if there exists a diffeomorphism $\Phi : N_1 \to N_2$ such that $f_2 \circ \Phi = f_1$, that is, such that the evident triangle of maps into $M$ commutes.
Figure 3.3. Tori converge to a point
If $f : N \to M$ is a weak embedding (resp. embedding), then there is a unique smooth structure on $S = f(N)$ such that $S$ is a weakly embedded (resp. embedded) submanifold and $f : N \to M$ is equivalent to the inclusion $\iota : S \hookrightarrow M$ in the above sense.

We shall follow the convention that the word "submanifold", when used without a qualifier such as "immersed" or "weakly embedded", is to mean a regular submanifold unless otherwise indicated. What R. Sharpe [Shrp] calls a "submanifold" refers to something more restrictive than the weakly embedded submanifolds, but still less restrictive than the regular submanifolds. Sharpe's definition of "submanifold" seems designed to exclude examples like that shown in Figure 3.3. Here the tori converge to a point on the plane (which is taken to be part of the manifold). Every neighborhood of that point will contain an infinite number of tori. Such a behavior is excluded by Sharpe's definition, but this is still a weakly embedded submanifold. One can also imagine the tori flattening while only the holes converge to a point. This example can easily be modified to be path connected and yet, it could never be the maximal integral manifold of a tangent distribution (see Chapter 11 for definitions).

The celebrated Whitney embedding theorem states that any second-countable $n$-manifold can be embedded in a Euclidean space of dimension $2n$. We do not prove the full theorem, but we will settle for the following easier result.

Theorem 3.21. Suppose that $M$ is an $n$-manifold that has a finite atlas. Then there exists an injective immersion of $M$ into $\mathbb{R}^{2n+1}$. Consequently, every compact $n$-manifold can be embedded into $\mathbb{R}^{2n+1}$.

Proof. Let $M$ be a smooth manifold with a finite atlas. In particular, $M$ is second countable. Initially, we will settle for an immersion into $\mathbb{R}^D$ for
some possibly very large dimension $D$. Let $\{(O_i, \varphi_i)\}_{i \in I}$ be an atlas with cardinality $N < \infty$. By applying Lemma B.4 twice, the cover $\{O_i\}$ may be refined to two other covers $\{U_i\}_{i \in I}$ and $\{V_i\}_{i \in I}$ such that $\overline{U_i} \subset V_i \subset \overline{V_i} \subset O_i$. Also, we may find smooth functions $f_i : M \to [0, 1]$ with $\operatorname{supp}(f_i) \subset O_i$ and such that $f_i(x) = 1$ for all $x \in U_i$ and $f_i(x) < 1$ for $x \notin V_i$. Next we write $\varphi_i = (x_i^1, \ldots, x_i^n)$ so that $x_i^j : O_i \to \mathbb{R}$ is the $j$-th coordinate function of the $i$-th chart, and then form the product
\[ f_{ij} := f_i x_i^j, \]
which is defined and smooth on all of $M$ after extension by zero. Now we put the functions $f_i$ together with the functions $f_{ij}$ to get a map $f : M \to \mathbb{R}^{N+Nn}$:
\[ f = (f_1, \ldots, f_N, f_{11}, f_{12}, \ldots, f_{21}, \ldots, f_{Nn}). \]
Now we show that $f$ is injective. Suppose that $f(x) = f(y)$. Note that $f_k(x)$ must be 1 for some $k$ since $x \in U_k$ for some $k$. But then $f_k(y) = 1$ also, and this means that $y \in V_k$ (why?). Since $f_k(x) = f_k(y) = 1$, it follows that $f_{kj}(x) = f_{kj}(y)$ for all $j$. Remembering how things were defined, we see that $x$ and $y$ have the same image under $\varphi_k : O_k \to \mathbb{R}^n$ and thus $x = y$. To show that $T_xf$ is injective for all $x \in M$, we fix an arbitrary such $x$; then $x \in U_k$ for some $k$. But then near this $x$, the functions $f_{k1}, f_{k2}, \ldots, f_{kn}$ are equal to $x_k^1, \ldots, x_k^n$ and so the rank of $f$ must be at least $n$ and in fact equal to $n$ since $\dim T_xM = n$.

So far we have an injective immersion of $M$ into $\mathbb{R}^D$ where $D = N + Nn$. We show that there is a projection $\pi : \mathbb{R}^D \to L \subset \mathbb{R}^D$, where $L \cong \mathbb{R}^{2n+1}$ is a $(2n+1)$-dimensional subspace of $\mathbb{R}^D$ such that $\pi \circ f$ is an injective immersion. The proof of this will be inductive. So suppose that there is an injective immersion $f$ of $M$ into $\mathbb{R}^d$ for some $d$ with $D \ge d > 2n + 1$. We show that there is a projection $\pi_d : \mathbb{R}^d \to L^{d-1} \cong \mathbb{R}^{d-1}$ such that $\pi_d \circ f$ is still an injective immersion. To this end, define a map $h : M \times M \times \mathbb{R} \to \mathbb{R}^d$ by $h(x, y, t) := t(f(x) - f(y))$. Since $d > 2n + 1$, Sard's theorem (Theorem 2.34) implies that there is a vector $z \in \mathbb{R}^d$ which is neither in the image of the map $h$ nor in the image of the map $df : TM \to \mathbb{R}^d$. This $z$ cannot be 0 since 0 is certainly in the image of both of these maps. If $\mathrm{pr}_{\perp z}$ is projection onto the orthogonal complement of $z$, then $\mathrm{pr}_{\perp z} \circ f$ is injective; for if $\mathrm{pr}_{\perp z} \circ f(x) = \mathrm{pr}_{\perp z} \circ f(y)$, then $f(x) - f(y) = az$ for some $a \in \mathbb{R}$. But suppose $x \neq y$. Since $f$ is injective, we must have $a \neq 0$. This state of affairs is impossible since it results in the equation $h(x, y, 1/a) = z$, which contradicts our choice of $z$. Thus $\mathrm{pr}_{\perp z} \circ f$ is injective.

Next we examine $T_x(\mathrm{pr}_{\perp z} \circ f)$ for an arbitrary $x \in M$. Suppose that $T_x(\mathrm{pr}_{\perp z} \circ f)v = 0$ for some $v \neq 0$. Then $d(\mathrm{pr}_{\perp z} \circ f)|_x v = 0$, and since $\mathrm{pr}_{\perp z}$ is linear, this amounts to $\mathrm{pr}_{\perp z} \circ df|_x v = 0$, which gives $df|_x v = az$ for some number $a \in \mathbb{R}$, and $a$ cannot be 0 since $f$ is assumed to be an immersion. But then $df|_x \frac{1}{a}v = z$, which also contradicts our choice of $z$. We conclude that $\mathrm{pr}_{\perp z} \circ f$ is an injective immersion. Repeating this process inductively we finally get a composition of projections $\mathrm{pr} : \mathbb{R}^D \to \mathbb{R}^{2n+1}$ such that $\mathrm{pr} \circ f : M \to \mathbb{R}^{2n+1}$ is an injective immersion. The final statement for compact manifolds follows from Exercise 3.6. □
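The appeal to Sard's theorem in the proof above rests on a dimension count that is worth writing out. The following is a sketch of that count only; it assumes the usual consequence of Sard's theorem that a smooth map from a second countable manifold of dimension less than $d$ into $\mathbb{R}^d$ has image of measure zero.

```latex
% Dimension count behind the choice of z (sketch; uses d > 2n + 1).
\[
\dim(M \times M \times \mathbb{R}) = 2n + 1 < d,
\qquad
\dim(TM) = 2n < d .
\]
% For a smooth map into R^d from a manifold of dimension < d, every point
% of the domain is critical, so by Sard's theorem the image has measure zero:
\[
h(M \times M \times \mathbb{R}) \ \text{and}\ df(TM)
\ \text{have measure zero in } \mathbb{R}^d .
\]
% The union of two measure-zero sets has measure zero, so its complement
% is nonempty (indeed dense), and any z in the complement will do.
```

The count also shows why the induction stops at $d = 2n + 1$: once $d \le 2n + 1$ the image of $h$ is no longer forced to have measure zero.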
3.3. Submersions

Definition 3.22. A map $f : M \to N$ is called a submersion at $p \in M$ if $T_pf : T_pM \to T_{f(p)}N$ is a surjection. $f : M \to N$ is called a submersion if $f$ is a submersion at every $p \in M$.

Example 3.23. The map of the punctured space $\mathbb{R}^3 \setminus \{0\}$ onto the sphere $S^2$ given by $x \mapsto x/\|x\|$ is a submersion. To see this, use any spherical coordinates $(\rho, \phi, \theta)$ on $\mathbb{R}^3 \setminus \{0\}$ and the induced submanifold coordinates $(\phi, \theta)$ on $S^2$. Expressed with respect to these coordinates, the map becomes $(\rho, \phi, \theta) \mapsto (\phi, \theta)$ on the domain of the spherical coordinate chart. Here we ended up locally with a projection onto a second factor $\mathbb{R} \times \mathbb{R}^2 \to \mathbb{R}^2$, but this is clearly good enough to prove the point.

As in the last example, to show that a map is a submersion at some $p$ it is enough to find charts containing $p$ and $f(p)$ so that the coordinate representative of the map is just a projection. Conversely, we have

Theorem 3.24. Let $M$ be an $m$-manifold and $N$ a $k$-manifold and let $f : M \to N$ be a smooth map that is a submersion at $p$. Then for any chart $(V, y)$ centered at $f(p)$ there is a chart $(U, x)$ centered at $p$ with $f(U) \subset V$ such that $y \circ f \circ x^{-1}$ is given by $(x^1, \ldots, x^k, \ldots, x^m) \mapsto (x^1, \ldots, x^k) \in \mathbb{R}^k$. Here $k$ is both the dimension of $N$ and the rank of $T_pf$.

Proof. Follows directly from Theorem C.4 of Appendix C. □
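A small worked instance may help fix the normal form of Theorem 3.24. It is only a sketch: the map $f(u, v) = u + v^2$ and the chart $x$ below are chosen here for illustration and do not come from the text.

```latex
% Illustrative choice (not from the text): f : R^2 -> R, f(u,v) = u + v^2,
% with df = (1, 2v), which is never zero, so f is a submersion everywhere.
% Take p = (0,0), so f(p) = 0, and let y = id_R (a chart centered at f(p)).
\[
x(u,v) := (u + v^2,\ v), \qquad
x^{-1}(a,b) = (a - b^2,\ b).
\]
% x is a diffeomorphism of R^2 with x(p) = (0,0), hence a chart centered
% at p, and in these charts
\[
(y \circ f \circ x^{-1})(a,b) = (a - b^2) + b^2 = a ,
\]
% which is the projection (x^1, x^2) |-> x^1 with m = 2 and k = 1.
```

The first coordinate of the adapted chart is $f$ itself; this is the general pattern behind the proof via the inverse mapping theorem.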
In certain contexts, submersions, especially surjective submersions, are referred to as projections. We often denote such a map by the letter $\pi$. Recall that if $\pi : M \to N$ is a smooth map, then a smooth local section of $\pi$ is a smooth map $\sigma : V \to M$ defined on an open set $V \subset N$ and such that $\pi \circ \sigma = \mathrm{id}_V$. Also, we adopt the terminology that subsets of $M$ of the form $\pi^{-1}(q)$ are called fibers of the submersion.

Proposition 3.25. If $\pi : M \to N$ is a submersion, then it is an open map and every point $p \in M$ is in the image of a smooth local section.
Proof. Let $p \in M$ be arbitrary. We choose a chart $(U, x)$ centered at $p$ and a chart $(V, y)$ centered at $\pi(p)$ such that $y \circ \pi \circ x^{-1}$ is of the form $(x^1, \ldots, x^k, x^{k+1}, \ldots, x^m) \mapsto (x^1, \ldots, x^k)$, where $\dim M = m$ and $\dim N = k$. By shrinking the domains if necessary, we can arrange that $x(U)$ has the form $A \times B \subset \mathbb{R}^k \times \mathbb{R}^{m-k}$ and $y(V) = A \subset \mathbb{R}^k$. Then we may transfer the section $i_b : a \mapsto (a, b)$, where $b$ is the $\mathbb{R}^{m-k}$-component of $x(p)$. More precisely, the desired local section is $\sigma := x^{-1} \circ i_b \circ y$, defined on $V$. We now use the existence of local sections to show that $\pi$ is an open map. Let $O$ be any open set in $M$. To show that $\pi(O)$ is open, we pick any $q \in \pi(O)$ and choose $p \in O$ with $p \in \pi^{-1}(q)$. Now we choose a chart $(U, x)$ as above with $p \in U \subset O$. Then $q$ lies in the domain $V$ of a local section $\sigma$ with $\sigma(V) \subset U \subset O$, so $V = \pi(\sigma(V))$ is an open set containing $q$ and contained in $\pi(O)$. □

Proposition 3.26. Let $\pi : M \to N$ be a surjective submersion. If $f : N \to P$ is any map, then $f$ is smooth if and only if $f \circ \pi$ is smooth. (The accompanying diagram is the commutative triangle formed by $\pi : M \to N$, $f : N \to P$ and $f \circ \pi : M \to P$.)

Proof. One direction is trivial. For the other direction, assume that $f \circ \pi$ is smooth. We check for smoothness of $f$ about an arbitrary point $q \in N$. Pick $p \in \pi^{-1}(q)$. By the previous proposition $p$ is in the image of a smooth section $\sigma : V \to M$. This means that $f$ and $(f \circ \pi) \circ \sigma$ agree on a neighborhood of $q$, and since the latter is smooth, we are done. □

Next suppose that we have a surjective submersion $\pi : M \to N$ and consider a smooth map $g : M \to P$ which is constant on fibers. That is, we assume that if $p_1, p_2 \in \pi^{-1}(q)$ for some $q \in N$, then $g(p_1) = g(p_2)$. Clearly there is a unique induced map $f : N \to P$ so that $g = f \circ \pi$. By the above proposition $f$ must be smooth. This we record as a corollary:
Corollary 3.27. II g : M -+ P is a smooth map which is constant on the fibers 01 a surjective submersion 71" : M -+ N, then there is a unique smooth map I : N -+ P such that g = I 0 71" • The following technical lemma is needed later and represents one more situation where second count ability is needed. Lemma 3.28. Suppose that M is a second countable smooth manilold. II I: M -+ N is a smooth map with constant rank that is also surjective, then
it is a submersion. Proof. Let dim M = m, dim N = nand rank(f) = k and choose p E M. Suppose that I is not a submersion so that k < n. We can cover M
3. Immersion and Submersion
140
by a countable collection of charts (Uo , xo ) and cover N by charts (Vi, y,) such that for every a, there is an i = i(a) with f (Ua,) C Vi and Yt 0 f 0 x~l(xI, ... , xn) = (xl, ... , xk, 0, ... ,0). But this means that f (Uo,) has measure zero. However, f(M) = Uof (Uo ) and so f(M) is also of measure zero which contradicts the surjectivity of f. This contradiction means that f must be a submersion after all. 0
Problems

(1) Let $0 < a < b$. Show that the subset of $\mathbb{R}^3$ described by the equation
\[ \left(\sqrt{x^2 + y^2} - b\right)^2 + z^2 = a^2 \]
is a submanifold. Show that the resulting manifold is diffeomorphic to $S^1 \times S^1$.

(2) Show that the map $S^2 \to \mathbb{R}^3$ given by $(x, y, z) \mapsto (x, y, z - 2z^3)$ is an immersion. Try to determine what the image of this map looks like.

(3) Show that if $M$ is compact and $N$ is connected, then a submersion $f : M \to N$ must be surjective.

(4) Let $f : M \to N$ be an immersion.
(a) Let $(U, x)$ be a chart for $M$ with $p \in U$, and let $(V, y)$ be a chart for $N$ with $f(p) \in V$ such that $y \circ f \circ x^{-1}(a^1, a^2, \ldots, a^n) = (a^1, a^2, \ldots, a^n, 0, \ldots, 0)$. Show that
\[ T_p f \cdot \left.\frac{\partial}{\partial x^i}\right|_p = \left.\frac{\partial}{\partial y^i}\right|_{f(p)} \quad \text{for } i = 1, \ldots, n. \]
(b) Show that if $f$ is as in part (a) and $Y \in \mathfrak{X}(N)$ is such that $Y(f(p)) \in T_p f(T_p M)$ for all $p \in M$, then there is a unique $X \in \mathfrak{X}(M)$ such that $X$ is $f$-related to $Y$.

(5) Define a function $s : \mathbb{R}^{n+1} \setminus \{0\} \to \mathbb{R}P^n$ by the rule that $s(x)$ is the line through $x$ and the origin. Show that $s$ is a submersion.

(6) Show that there is a continuous map $f : \mathbb{R}^2 \to \mathbb{R}^2$ such that $f(B(0,1)) \subset B(0,1)$, $f(\mathbb{R}^2 \setminus B(0,1)) \subset \mathbb{R}^2 \setminus B(0,1)$ and $f|_{\partial B(0,1)} = \mathrm{id}_{\partial B(0,1)}$, and with the properties that $f$ is $C^\infty$ on $B(0,1)$ and on $\mathbb{R}^2 \setminus B(0,1)$, while $f$ is not $C^\infty$ on $\mathbb{R}^2$.

(7) Construct an embedding of $\mathbb{R} \times S^n$ into $\mathbb{R}^{n+1}$.

(8) Embed the Stiefel manifold of $k$-frames in $\mathbb{R}^n$ into a Euclidean space $\mathbb{R}^N$ for some large $N$.

(9) Construct an embedding of $G(n, k)$ into $G(n, k + l)$ for each $l \geq 1$.

(10) Show that the map $f : \mathbb{R}P^2 \to \mathbb{R}^3$ defined by $f([x, y, z]) = (yz, xz, xy)$ is an immersion at all but six points $p \in \mathbb{R}P^2$. The image is called the Roman surface, and nice images can be found on the web. Show that it is a topological immersion (locally a topological embedding). Show that the map $g : \mathbb{R}P^2 \to \mathbb{R}^4$ given by $g([x, y, z]) = (yz, xz, xy, x^2 + 2y^2 + 3z^2)$ is a smooth embedding.

(11) Let $h : M \to \mathbb{R}^n$ be smooth and let $N \subset \mathbb{R}^n$ be a regular submanifold. Prove that for each $\varepsilon > 0$ there exists a $v \in \mathbb{R}^n$, with $|v| < \varepsilon$, such that the map $p \mapsto h(p) + v$ is transverse to $N$. (Think about the map $M \times N \to \mathbb{R}^n$ given by $(p, y) \mapsto y - h(p)$.)

(12) Define $\phi : S^1 \to \mathbb{R}$ by $e^{i\theta} \mapsto \theta$ for $0 \leq \theta < 2\pi$. Define $\lambda : \mathbb{R} \to S^1$ by $\theta \mapsto e^{i\theta}$. Show that $\lambda$ is an immersion, that $\lambda \circ \phi$ is smooth, but that $\phi$ is not differentiable (it is not even continuous).
Chapter 4

Curves and Hypersurfaces in Euclidean Space
So far we have been studying manifold theory, which is foundational for modern differential geometry. In this chapter we change direction a bit to introduce some ideas from classical differential geometry. We do this for pedagogical reasons. The reader will see several ideas introduced here that will only be treated in generality later in the book. These ideas include parallelism, covariant derivative, metric and curvature. We concentrate on the geometry of one-dimensional submanifolds of $\mathbb{R}^n$ (geometric curves) and submanifolds of codimension one in $\mathbb{R}^n$, which we refer to as hypersurfaces.$^1$

$^1$ Unless otherwise stated, we will assume that a hypersurface is a manifold without boundary.

Recall that a vector in $T_p\mathbb{R}^n$ can be viewed as a pair $(p, v) \in \{p\} \times \mathbb{R}^n$, and we sometimes write $v_p$ or $(v^1, \ldots, v^n)_p$ for $(p, v)$. The element $v$ is called the principal part of $v_p$. Recall also that in $\mathbb{R}^n$ we have a natural notion of what it means for vectors in different tangent spaces to be parallel. By definition, $v_p \in T_p\mathbb{R}^n$ is parallel to $w_q \in T_q\mathbb{R}^n$ if $v = w \in \mathbb{R}^n$. For any $w_p \in T_p\mathbb{R}^n$, there is a unique vector $w_q \in T_q\mathbb{R}^n$ which is parallel to $w_p$. We call $w_q$ the parallel translate of $w_p$ to the tangent space $T_q\mathbb{R}^n$. We often identify $T_p\mathbb{R}^n$ with $\mathbb{R}^n$. If $e_1, \ldots, e_n$ is the standard orthonormal basis for $\mathbb{R}^n$, then $\mathbf{e}_1, \ldots, \mathbf{e}_n$ will denote the standard global frame field on $\mathbb{R}^n$ defined by $\mathbf{e}_i(p) := \left.\frac{\partial}{\partial x^i}\right|_p$ for $i = 1, \ldots, n$, where $x^1, \ldots, x^n$ are the standard coordinate functions on $\mathbb{R}^n$. The reason for this separate notation is to emphasize the fiducial role of this frame field. If $v_p \in T_p\mathbb{R}^n$ and $w_q \in T_q\mathbb{R}^n$, then $v_p$ is parallel to $w_q$ if $v^i = w^i$ for all $i$, where $v_p = \sum v^i \mathbf{e}_i(p)$ and $w_q = \sum w^i \mathbf{e}_i(q)$.
Recall that if $c : I \to M$ is a smooth curve into an $n$-manifold $M$, then a smooth vector field along $c$ is a smooth map $Y : I \to TM$ such that $\pi \circ Y = c$. The velocity of the map $Y$ is an element of $T(TM)$ rather than $TM$. On the other hand, if $c : I \to \mathbb{R}^n$, then the special structure of $\mathbb{R}^n$ allows us to take a derivative of a vector field along $c$ and end up with another vector field along $c$. If $Y$ is such a vector field along $c$, then it can be written $Y = \sum Y^i\, \mathbf{e}_i \circ c$ for some smooth functions $Y^i : I \to \mathbb{R}$. In other words, $Y(t) = \sum Y^i(t)\, \mathbf{e}_i(c(t))$ for $t \in I$. For such a field, we define
\[ \frac{dY}{dt}(t) = Y'(t) = \sum_i \frac{dY^i}{dt}(t)\, \mathbf{e}_i(c(t)). \]
In notation that keeps track of base points, we can write $Y = (Y^1, \ldots, Y^n)_c$ so that $Y(t) = (Y^1(t), \ldots, Y^n(t))_{c(t)}$, and then
\[ \frac{dY}{dt}(t) = \left( \frac{dY^1}{dt}(t), \ldots, \frac{dY^n}{dt}(t) \right)_{c(t)}. \]
Since $\frac{dY}{dt}$ is obviously also a vector field along $c$, we can repeat the process to obtain higher derivatives $\frac{d^k Y}{dt^k} = Y^{(k)}$. In particular, we have the velocity $c'$, acceleration $c''$, and higher derivatives $c^{(k)}$. Except for the current emphasis on the base point of vectors, these definitions are the usual definitions from multivariable calculus.

Another thing that is special about $\mathbb{R}^n$ is the availability of the natural inner product (the dot product) in every tangent space $T_p\mathbb{R}^n$. This inner product on $T_p\mathbb{R}^n$ is denoted $\langle \cdot, \cdot \rangle_p$ and is given by
\[ (v_p, w_p) \mapsto \langle v_p, w_p \rangle = \sum_i v^i w^i, \]
where $v_p = (p, v)$ and $w_p = (p, w)$ as explained above. This gives what is called a Riemannian metric on $\mathbb{R}^n$, and it is obtained from the canonical identification of the tangent spaces with $\mathbb{R}^n$ itself. We study Riemannian metrics on general manifolds later in the book. For smooth vector fields on $\mathbb{R}^n$, say $X$ and $Y$, the function $\langle X, Y \rangle$ defined by $p \mapsto \langle X_p, Y_p \rangle$ is smooth. If $X$ and $Y$ are fields along a curve $c : I \to \mathbb{R}^n$, then $\langle X, Y \rangle$ is a function on $I$ and we clearly have
\[ \frac{d}{dt} \langle X, Y \rangle = \left\langle \frac{dX}{dt}, Y \right\rangle + \left\langle X, \frac{dY}{dt} \right\rangle. \]
Remark 4.1. We shall sometimes leave out the subscript $p$ in the notation $v_p$ if the base point is understood from the context. In fact, we often just identify $v_p$ with its principal part $v \in \mathbb{R}^n$.
An ordered basis $v_1, \ldots, v_n$ of $\mathbb{R}^n$ is positively oriented if $\det(v_1, \ldots, v_n)$ is positive. For $v_p, w_p \in T_p\mathbb{R}^3$, we can obviously define a cross product $v_p \times w_p := (v \times w)_p$, which results in a vector based at $p$ which is then orthogonal to both $v_p$ and $w_p$. If $v_p$ and $w_p$ are orthonormal, then $(v_p, w_p, (v \times w)_p)$ is a positively oriented (right-handed) orthonormal basis for $T_p\mathbb{R}^3$. The following lemma allows us to do something similar in higher dimensions.

Lemma 4.2. If $v_1, \ldots, v_{n-1} \in \mathbb{R}^n$, then there is a unique vector $N(v_1, \ldots, v_{n-1})$ such that

(i) $N(v_1, \ldots, v_{n-1})$ is orthogonal to each of $v_1, \ldots, v_{n-1}$.

(ii) If $\{v_1, \ldots, v_{n-1}\}$ is an orthonormal set of vectors, then the list $(v_1, \ldots, v_{n-1}, N)$ is a positively oriented orthonormal basis.

(iii) $N(v_1, \ldots, v_{n-1})$ depends smoothly on $v_1, \ldots, v_{n-1}$.

Proof. Let $L$ be the linear functional defined by $L(v) := \det(v_1, \ldots, v_{n-1}, v)$. There is a unique $N \in \mathbb{R}^n$ such that $L(v) = \langle N, v \rangle$ for all $v$. Now (i), (ii) and (iii) follow from the properties of the determinant. □
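The proof of Lemma 4.2 is constructive: the $i$-th component of $N$ is $L(e_i) = \det(v_1, \ldots, v_{n-1}, e_i)$. The following Python sketch illustrates this under stated assumptions. The helper names `det` and `generalized_cross` are hypothetical (they are not from the text), and the determinant is computed by naive cofactor expansion, which is only reasonable for small $n$.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def generalized_cross(vectors):
    """Return N in R^n with <N, v> = det(v_1, ..., v_{n-1}, v) for all v.

    `vectors` is a list of n-1 vectors in R^n. The i-th component of N is
    det(v_1, ..., v_{n-1}, e_i). The vectors are used as rows rather than
    columns, which does not change the determinant.
    """
    n = len(vectors) + 1
    rows = [list(v) for v in vectors]
    normal = []
    for i in range(n):
        e_i = [1.0 if k == i else 0.0 for k in range(n)]
        normal.append(det(rows + [e_i]))
    return normal
```

For $n = 3$ this reduces to the usual cross product, so `generalized_cross([v, w])` agrees with $v \times w$.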
4.1. Curves

If $C$ is a one-dimensional submanifold of $\mathbb{R}^n$ and $p \in C$, then there is a chart $(V, y)$ of $C$ containing $p$ such that $y(V)$ is a connected open interval $I \subset \mathbb{R}$. The inverse map $y^{-1} : I \to V \subset C$ is a local parametrization. Thus for local properties, we are reduced to studying curves into $\mathbb{R}^n$ which are embeddings of intervals. We can be even more general and study immersions. The idea is to extract information that is appropriately independent of the parametrization. If $\gamma : I \to \mathbb{R}^n$ and $c : J \to \mathbb{R}^n$ are curves with the same image, then we say that $c$ is a positive reparametrization of $\gamma$ if there is a smooth function $h : J \to I$ with $h' > 0$ such that $c = \gamma \circ h$. In this case, we say that $\gamma$ and $c$ have the same sense and provide the same orientation on the image. We assume that $\gamma : I \to \mathbb{R}^n$ has $\|\gamma'\| > 0$, which is the case of interest. Such a curve is called regular, which just means that the curve is an immersion.
Definition 4.3. If $\gamma : I \to \mathbb{R}^n$ is a regular curve, then $T(t) := \gamma'(t)/\|\gamma'(t)\|$ defines the unit tangent field along $\gamma$. (Of course, we then have $\|T\| = 1$.)

We have the familiar notion of the length of a curve defined on a closed interval $\gamma : [t_1, t_2] \to \mathbb{R}^n$:
\[ L = \int_{t_1}^{t_2} \|\gamma'(t)\| \, dt. \]
One can define an arc length function for a curve $\gamma : I \to \mathbb{R}^n$ by choosing $t_0 \in I$ and then defining
\[ s = h(t) := \int_{t_0}^{t} \|\gamma'(\tau)\| \, d\tau. \]
Notice that $s$ takes on negative values if $t < t_0$, so it does not always represent the length in the ordinary sense. If the curve is smooth and regular, then $h'(t) = \|\gamma'(t)\| > 0$, and so by the inverse function theorem, $h$ has a smooth inverse. We then have the familiar fact that if $c(s) = \gamma \circ h^{-1}(s)$, then $\|c'\|(s) := \|c'(s)\| = 1$ for all $s$. Curves which are parametrized in terms of arc length are referred to as unit speed curves. For a unit speed curve, $\frac{dc}{ds}(s) = T(s)$. Since parametrization by arc length eliminates any component of acceleration in the direction of the curve, the acceleration must be due only to the shape of the curve.

Definition 4.4. Let $c : I \to \mathbb{R}^n$ be a unit speed curve. The vector-valued function
\[ \boldsymbol{\kappa}(s) := \frac{dT}{ds}(s) \]
is called the curvature vector. The function $\kappa$ defined by
\[ \kappa(s) := \|\boldsymbol{\kappa}(s)\| = \left\| \frac{dT}{ds}(s) \right\| \]
is called the curvature function. If $\kappa(s) > 0$, then we also define the principal normal
\[ N(s) := \left\| \frac{dT}{ds}(s) \right\|^{-1} \frac{dT}{ds}(s), \]
so that $\frac{dT}{ds} = \kappa N$.
Let $\gamma : I \to \mathbb{R}^n$ be a regular curve. An adapted orthonormal moving frame along $\gamma$ is a list $(E_1, \ldots, E_n)$ of smooth vector fields along $\gamma$ such that $E_1(t) = \gamma'(t)/\|\gamma'(t)\|$ and such that $(E_1(t), \ldots, E_n(t))$ is an orthonormal basis of $T_{\gamma(t)}\mathbb{R}^n$ for each $t \in I$. Identifying $E_1(t), \ldots, E_n(t)$ with elements of $\mathbb{R}^n$ written as column vectors, we say that the orthonormal moving frame is positively oriented if
\[ Q(t) = [E_1(t), \ldots, E_n(t)] \]
is an orthogonal matrix of determinant one for each $t$.

Definition 4.5. A moving frame $E_1(t), \ldots, E_n(t)$ along a curve $\gamma : I \to \mathbb{R}^n$ is a Frenet frame for $\gamma$ if $\gamma^{(k)}(t)$ is in the span of $E_1(t), \ldots, E_k(t)$ for all $t$ and $1 \leq k \leq n$.

As we have defined them, Frenet frames are not unique. However, under certain circumstances we may single out special Frenet frames. For example,
if $c : I \to \mathbb{R}^3$ is a unit speed curve with $\kappa > 0$, then the principal normal $N$ is defined. By letting
\[ B = T \times N \]
we obtain a Frenet frame $T, N, B$. It is an easy exercise to show that we obtain
\[ \frac{dT}{ds} = \kappa N, \qquad \frac{dN}{ds} = -\kappa T + \tau B, \qquad \frac{dB}{ds} = -\tau N \]
for some function $\tau$ called the torsion. This is the familiar form presented in many calculus texts. In matrix notation,
\[ \frac{d}{ds}[T, N, B] = [T, N, B] \begin{bmatrix} 0 & -\kappa & 0 \\ \kappa & 0 & -\tau \\ 0 & \tau & 0 \end{bmatrix}. \]
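As an illustration of the frame $T, N, B$, the following Python sketch evaluates it for a unit speed helix $c(s) = (a\cos(s/L), a\sin(s/L), bs/L)$ with $L = \sqrt{a^2 + b^2}$. The helix and the helper names (`helix_derivatives`, `frenet_data`) are assumptions chosen for the example, not part of the text. The torsion is computed from the triple product $\langle c' \times c'', c''' \rangle / \kappa^2$, which is valid for unit speed curves with $\kappa > 0$.

```python
import math


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def helix_derivatives(a, b, s):
    """c', c'', c''' for the unit speed helix with radius a and rise b."""
    L = math.hypot(a, b)
    u = s / L
    c1 = (-a * math.sin(u) / L, a * math.cos(u) / L, b / L)
    c2 = (-a * math.cos(u) / L ** 2, -a * math.sin(u) / L ** 2, 0.0)
    c3 = (a * math.sin(u) / L ** 3, -a * math.cos(u) / L ** 3, 0.0)
    return c1, c2, c3


def frenet_data(a, b, s):
    """Return (T, N, B, kappa, tau) at parameter s; requires a > 0."""
    c1, c2, c3 = helix_derivatives(a, b, s)
    T = c1                                   # unit speed, so T = c'
    kappa = norm(c2)                         # kappa = ||dT/ds||
    N = tuple(x / kappa for x in c2)         # principal normal
    B = cross(T, N)
    tau = dot(cross(c1, c2), c3) / kappa ** 2
    return T, N, B, kappa, tau
```

For this helix one finds by hand that $\kappa = a/(a^2 + b^2)$ and $\tau = b/(a^2 + b^2)$, both constant, which gives a quick check of the formulas.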
Notice that for a regular curve, $\kappa \geq 0$, while $\tau$ may assume any real value. Another special feature of the frame $T, N, B$ is that it is positively oriented. If $\gamma$ is injective, then we can think of $\kappa$ and $\tau$ as defined on the geometric image $\gamma(I)$. Thus, if $p = \gamma(s_0)$, then $\kappa(p)$ is defined to be equal to $\kappa(s_0)$.

Exercise 4.6. Let $c : I \to \mathbb{R}^3$ be a unit speed regular curve with $\kappa > 0$. Show that
\[ \tau(s) = \frac{\langle c'(s) \times c''(s), c'''(s) \rangle}{\kappa(s)^2} \quad \text{for all } s \in I. \]

Exercise 4.7. Let $c : I \to \mathbb{R}^3$ be a unit speed curve with $\kappa > 0$ and let $s_0 \in I$. Show that we have a Taylor expansion of the form
\[ \begin{aligned} c(s) - c(s_0) ={}& \left( (s - s_0) - \tfrac{1}{6}(s - s_0)^3 \kappa^2(s_0) \right) T(s_0) \\ &+ \left( \tfrac{1}{2}(s - s_0)^2 \kappa(s_0) + \tfrac{1}{6}(s - s_0)^3 \frac{d\kappa}{ds}(s_0) \right) N(s_0) \\ &+ \left( \tfrac{1}{6}(s - s_0)^3 \kappa(s_0)\tau(s_0) \right) B(s_0) + o((s - s_0)^3). \end{aligned} \]
We wish to generalize the special properties of the Frenet frame $T, N, B$ to higher dimensions, thereby obtaining a notion of a distinguished Frenet frame. For maximum generality, we do not assume that the curve is unit speed. A curve $\gamma$ in $\mathbb{R}^n$ is called $k$-regular if $\{\gamma'(t), \gamma''(t), \ldots, \gamma^{(k)}(t)\}$ is a linearly independent set for each $t$ in the domain of the curve. For an $(n-1)$-regular curve, the existence of a special orthonormal moving frame can be easily proved. One applies the Gram-Schmidt process: If $E_1(t), \ldots, E_k(t)$ are already defined for some $k < n - 1$, then
\[ E_{k+1}(t) := c_k \left[ \gamma^{(k+1)}(t) - \sum_{j=1}^{k} \left\langle \gamma^{(k+1)}(t), E_j(t) \right\rangle E_j(t) \right], \]
where $c_k = c_k(t)$ is the positive scalar chosen so that $\|E_{k+1}(t)\| = 1$. Inductively, this gives us $E_1(t), \ldots, E_{n-1}(t)$, and it is clear that the $E_k(t)$ are all smooth. Now we choose $E_n(t)$ to complete our frame by letting it be of unit length and orthogonal to $E_1(t), \ldots, E_{n-1}(t)$. By making one possible adjustment of sign on $E_n(t)$ we obtain a moving frame that is positively oriented. In fact, $E_n(t)$ is given by the construction of Lemma 4.2, from which it follows that $E_n(t)$ is smooth in $t$. By construction we have a nice list of properties:
(1) For $1 \leq k \leq n$, the vectors $E_1(t), \ldots, E_k(t)$ have the same linear span as $\gamma'(t), \ldots, \gamma^{(k)}(t)$, so that there is an $n \times n$ upper triangular matrix function $U(t)$ such that
\[ [\gamma'(t), \ldots, \gamma^{(n)}(t)]\, U(t) = [E_1(t), \ldots, E_n(t)]. \]

(2) For $1 \leq k \leq n - 1$, the vectors $E_1(t), \ldots, E_k(t)$ have the same orientation as $\gamma'(t), \ldots, \gamma^{(k)}(t)$. Thus $U(t)$ has diagonal elements which are all positive except possibly the last one.

(3) $(E_1(t), \ldots, E_n(t))$ is positively oriented as a basis of $T_{\gamma(t)}\mathbb{R}^n \cong \mathbb{R}^n$.

Exercise 4.8. Show that the moving frame we have constructed is the unique one with these properties.

We call a moving frame satisfying the above properties a distinguished Frenet frame along $\gamma$. For any orthonormal moving frame, the derivative of each $E_i(t)$ is certainly expressible as a linear combination of the basis $E_1(t), \ldots, E_n(t)$, and so we may write
\[ \frac{d}{dt} E_j(t) = \sum_{i=1}^{n} \omega_{ij}(t) E_i(t). \]
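Of course, $\omega_{ij}(t) = \langle E_i(t), \frac{d}{dt} E_j(t) \rangle$, but since $\langle E_i(t), E_j(t) \rangle = \delta_{ij}$, we conclude that $\omega_{ij}(t) = -\omega_{ji}(t)$, i.e., the matrix $\omega(t) = [\omega_{ij}(t)]$ is antisymmetric. However, for a distinguished Frenet frame, more is true. Indeed, if $(E_1(t), \ldots, E_n(t))$ is such a distinguished Frenet frame, then for $1 \leq j < n$ we have $E_j(t) = \sum_{k=1}^{j} U_{kj}\gamma^{(k)}(t)$, where $U(t) = [U_{kj}(t)]$ is the upper triangular matrix mentioned above. Using the fact that $U$, $\frac{d}{dt}U$, and $U^{-1}$ are all upper triangular, we have
\[ \frac{d}{dt} E_j(t) = \sum_{k=1}^{j} \left( \frac{d}{dt} U_{kj} \right) \gamma^{(k)}(t) + \sum_{k=1}^{j} U_{kj} \gamma^{(k+1)}(t). \]
But $\gamma^{(k+1)}(t) = \sum_{r=1}^{k+1} (U^{-1})_{r,k+1} E_r(t)$ and $\gamma^{(k)}(t) = \sum_{r=1}^{k} (U^{-1})_{rk} E_r(t)$, so that
\[ \frac{d}{dt} E_j(t) = \sum_{k=1}^{j} \left( \frac{d}{dt} U_{kj} \right) \sum_{r=1}^{k} (U^{-1})_{rk} E_r(t) + \sum_{k=1}^{j} U_{kj} \sum_{r=1}^{k+1} (U^{-1})_{r,k+1} E_r(t). \]
From this we see that $\frac{d}{dt} E_j(t)$ is in the span of $\{E_r(t)\}_{1 \leq r \leq j+1}$. Thus $\omega(t) = (\omega_{ij}(t))$ can have no nonzero entries below the subdiagonal. But $\omega$ is antisymmetric, so we conclude that $\omega$ has the form
\[ \omega(t) = \begin{bmatrix} 0 & -\omega_{21}(t) & & & \\ \omega_{21}(t) & 0 & -\omega_{32}(t) & & \\ & \omega_{32}(t) & 0 & \ddots & \\ & & \ddots & \ddots & -\omega_{n,n-1}(t) \\ & & & \omega_{n,n-1}(t) & 0 \end{bmatrix}. \]
We define the $i$-th generalized curvature function by
\[ \kappa_i(t) := \frac{\omega_{i+1,i}(t)}{\|\gamma'(t)\|}. \]
Thus if $\gamma : I \to \mathbb{R}^n$ is a unit speed curve, we have
\[ \omega(s) = \begin{bmatrix} 0 & -\kappa_1(s) & & & \\ \kappa_1(s) & 0 & -\kappa_2(s) & & \\ & \kappa_2(s) & 0 & \ddots & \\ & & \ddots & \ddots & -\kappa_{n-1}(s) \\ & & & \kappa_{n-1}(s) & 0 \end{bmatrix}. \]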
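The Gram-Schmidt construction of the distinguished Frenet frame and the definition $\kappa_i = \omega_{i+1,i}/\|\gamma'\|$ can be sketched numerically for $n = 3$. The example below is only an illustration under stated assumptions: the curve is the helix $\gamma(t) = (a\cos t, a\sin t, bt)$ (not unit speed), the constants `A_RADIUS`, `B_RISE` and all function names are hypothetical, $E_3$ is taken to be $E_1 \times E_2$ (the $n = 3$ case of Lemma 4.2), and $\frac{d}{dt}E_j$ is approximated by a central difference rather than computed exactly.

```python
import math

A_RADIUS, B_RISE = 3.0, 4.0  # illustrative helix parameters a and b


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def scale(c, u):
    return [c * x for x in u]


def sub(u, v):
    return [x - y for x, y in zip(u, v)]


def normalize(u):
    return scale(1.0 / math.sqrt(dot(u, u)), u)


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def gamma1(t):
    """First derivative of the helix."""
    return [-A_RADIUS * math.sin(t), A_RADIUS * math.cos(t), B_RISE]


def gamma2(t):
    """Second derivative of the helix."""
    return [-A_RADIUS * math.cos(t), -A_RADIUS * math.sin(t), 0.0]


def frenet_frame(t):
    """Gram-Schmidt on (gamma', gamma''), completed by E3 = E1 x E2."""
    e1 = normalize(gamma1(t))
    g2 = gamma2(t)
    e2 = normalize(sub(g2, scale(dot(g2, e1), e1)))
    return [e1, e2, cross(e1, e2)]


def omega(t, h=1e-5):
    """Approximate omega_ij = <E_i, dE_j/dt> by central differences."""
    frame = frenet_frame(t)
    fwd, bwd = frenet_frame(t + h), frenet_frame(t - h)
    deriv = [scale(1.0 / (2.0 * h), sub(fwd[j], bwd[j])) for j in range(3)]
    return [[dot(frame[i], deriv[j]) for j in range(3)] for i in range(3)]


def generalized_curvatures(t):
    """Return [kappa_1, kappa_2] with kappa_i = omega_{i+1,i} / ||gamma'||."""
    w = omega(t)
    speed = math.sqrt(dot(gamma1(t), gamma1(t)))
    return [w[i + 1][i] / speed for i in range(2)]
```

For this helix the values should be close to $a/(a^2+b^2)$ and $b/(a^2+b^2)$, and the computed matrix should be antisymmetric and tridiagonal up to discretization error.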
Note: Our matrix $\omega$ is the transpose of the $\omega$ presented in some other expositions. The source of the difference is that we write a basis as a formal row matrix of vectors.

Lemma 4.9. If $\gamma : I \to \mathbb{R}^n$ ($n \geq 3$) is $(n-1)$-regular, then for $1 \leq i \leq n - 2$, the generalized curvatures $\kappa_i$ are positive.
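Proof. By construction, for $1 \leq i \leq n - 1$,
\[ E_i(t) = \sum_{j=1}^{i} U_{ji}(t)\gamma^{(j)}(t), \qquad \gamma^{(i)}(t) = \sum_{j=1}^{i} (U^{-1})_{ji}(t) E_j(t), \]
with $U_{ii} > 0$, and hence $(U^{-1})_{ii} > 0$. Thus if $1 \leq i \leq n - 2$, we have
\[ \begin{aligned} \omega_{i+1,i}(t) &= \left\langle E_{i+1}, \frac{d}{dt} E_i \right\rangle = \left\langle E_{i+1}, \frac{d}{dt} \sum_{j=1}^{i} U_{ji}(t)\gamma^{(j)}(t) \right\rangle \\ &= \left\langle E_{i+1}, \sum_{j=1}^{i} \gamma^{(j)}(t) \frac{d}{dt} U_{ji}(t) \right\rangle + \left\langle E_{i+1}, \sum_{j=1}^{i} U_{ji}(t)\gamma^{(j+1)}(t) \right\rangle \\ &= U_{ii} \left\langle E_{i+1}(t), \gamma^{(i+1)}(t) \right\rangle = U_{ii}\,(U^{-1})_{i+1,i+1} > 0. \end{aligned} \]
In passing from the second to the third line above, we have used the fact that $E_{i+1}$ is orthogonal to all $\gamma^{(j)}$ for $j \leq i$, since these are in the span of $\{E_j\}_{j=1,\ldots,i}$. Since $\kappa_i = \omega_{i+1,i}/\|\gamma'\|$, the claim follows. □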
The last generalized curvature function $\kappa_{n-1}$ is sometimes called the torsion. It may take on negative values.

Exercise 4.10. If $\gamma : I \to \mathbb{R}^n$ is $(n-1)$-regular, show that $E_1 = T$ and $E_2 = N$. If $\gamma$ is parametrized by arc length, then $\frac{d}{ds}\gamma = E_1$ and $\frac{d^2}{ds^2}\gamma = \kappa_1 E_2$. Conclude that $\kappa_1 = \kappa$ (the curvature defined earlier).

The orthogonal group $O(\mathbb{R}^n)$ is the group of linear transformations $A : \mathbb{R}^n \to \mathbb{R}^n$ such that $\langle Av, Aw \rangle = \langle v, w \rangle$ for all $v, w \in \mathbb{R}^n$. The group $O(\mathbb{R}^n)$ is identified with the group of orthogonal $n \times n$ matrices denoted $O(n)$. The Euclidean group $\mathrm{Euc}(\mathbb{R}^n)$ is generated by translations and elements of $O(\mathbb{R}^n)$. Every element $\phi \in \mathrm{Euc}(\mathbb{R}^n)$ can be represented by a pair $(A, b)$, where $A \in O(\mathbb{R}^n)$ and $b \in \mathbb{R}^n$ and where $\phi(v) = Av + b$. Note that in this case, $D\phi = A$ (the derivative of $\phi$ is $A$). The elements of the Euclidean group are called Euclidean motions or isometries of $\mathbb{R}^n$. If $\phi \in \mathrm{Euc}(\mathbb{R}^n)$, then for each $p \in \mathbb{R}^n$, the tangent map $T_p\phi : T_p\mathbb{R}^n \to T_{\phi(p)}\mathbb{R}^n$ is a linear isometry. In other words, $\langle T_p\phi \cdot v_p, T_p\phi \cdot w_p \rangle_{\phi(p)} = \langle v_p, w_p \rangle_p$ for all $v_p, w_p \in T_p\mathbb{R}^n$. The group $SO(\mathbb{R}^n)$ is the special orthogonal group on $\mathbb{R}^n$ and consists of the elements of $O(\mathbb{R}^n)$ which preserve orientation. The corresponding matrix group is $SO(n)$ and is the subgroup of $O(n)$ consisting of elements of determinant 1. The subgroup $\mathrm{SEuc}(\mathbb{R}^n) \subset \mathrm{Euc}(\mathbb{R}^n)$ is the group generated by translations and elements of $SO(\mathbb{R}^n)$. It is called the special Euclidean group.
Theorem 4.11. Let $\gamma : I \to \mathbb{R}^n$ and $\tilde{\gamma} : I \to \mathbb{R}^n$ be two $(n-1)$-regular curves with corresponding curvature functions $\kappa_i$ and $\tilde{\kappa}_i$ ($1 \leq i \leq n - 1$). If $\|\gamma'(t)\| = \|\tilde{\gamma}'(t)\|$ and $\kappa_i(t) = \tilde{\kappa}_i(t)$ for all $t \in I$ and $1 \leq i \leq n - 1$, then there exists a unique isometry $\phi \in \mathrm{SEuc}(\mathbb{R}^n)$ such that
\[ \tilde{\gamma} = \phi \circ \gamma. \]

Proof. Let $(E_1(t), \ldots, E_n(t))$ and $(\tilde{E}_1(t), \ldots, \tilde{E}_n(t))$ be the distinguished Frenet frames for $\gamma$ and $\tilde{\gamma}$ respectively and let $\omega_{ij}$ and $\tilde{\omega}_{ij}$ be the corresponding matrix elements as above. Fix $t_0 \in I$ and consider the unique isometry $\phi$ represented by $(A, b)$ such that $\phi(\gamma(t_0)) = \tilde{\gamma}(t_0)$ and such that $A(E_i(t_0)) = \tilde{E}_i(t_0)$ for $1 \leq i \leq n$. Since $\|\gamma'(t)\| = \|\tilde{\gamma}'(t)\|$ and $\kappa_i(t) = \tilde{\kappa}_i(t)$, we have that $\omega_{ij}(t) = \tilde{\omega}_{ij}(t)$ for all $i, j$ and $t$. Thus we have both
\[ \frac{d}{dt} \tilde{E}_i(t) = \sum_{j=1}^{n} \omega_{ji}(t) \tilde{E}_j(t) \]
and
\[ \frac{d}{dt} A E_i(t) = \sum_{j=1}^{n} \omega_{ji}(t) A E_j(t). \]
$\tilde{E}_i$ and $AE_i$ satisfy the same linear differential equation, and since $A(E_i(t_0)) = \tilde{E}_i(t_0)$, we conclude that $A(E_i(t)) = \tilde{E}_i(t)$ for all $t$ and $1 \leq i \leq n$. In particular, $A\gamma'(t) = \|\gamma'(t)\| A E_1(t) = \|\tilde{\gamma}'(t)\| \tilde{E}_1(t) = \tilde{\gamma}'(t)$. Hence
\[ \phi(\gamma(t)) - \phi(\gamma(t_0)) = \int_{t_0}^{t} (\phi \circ \gamma)'(\tau)\, d\tau = \int_{t_0}^{t} D\phi \cdot \gamma'(\tau)\, d\tau = \int_{t_0}^{t} A\gamma'(\tau)\, d\tau = \int_{t_0}^{t} \tilde{\gamma}'(\tau)\, d\tau = \tilde{\gamma}(t) - \tilde{\gamma}(t_0), \]
from which we conclude that $\phi(\gamma(t)) = \tilde{\gamma}(t)$. For uniqueness, we argue as follows. Suppose that $\psi \circ \gamma = \tilde{\gamma}$ for $\psi \in \mathrm{Euc}(\mathbb{R}^n)$ and suppose that $\psi$ is represented by $(B, c)$. The fact that $D\psi$ must take the Frenet frame of $\gamma$ to that of $\tilde{\gamma}$ means that $B = D\psi = D\phi = A$. The fact that $\psi(\gamma(t_0)) = \tilde{\gamma}(t_0)$ implies that $b = c$ and so $\psi = \phi$. □

Conversely, we have

Theorem 4.12. If $\kappa_1, \ldots, \kappa_{n-1}$ are smooth functions on a neighborhood of $s_0 \in \mathbb{R}$ such that $\kappa_i > 0$ for $i < n - 1$, then there exists an $(n-1)$-regular unit speed curve $\gamma$ defined on some interval containing $s_0$ such that $\kappa_1, \ldots, \kappa_{n-1}$ are the curvature functions of $\gamma$.
Proof. We merely sketch the proof: Let
\[ A(s) := \begin{bmatrix} 0 & -\kappa_1 & & & \\ \kappa_1 & 0 & -\kappa_2 & & \\ & \kappa_2 & 0 & \ddots & \\ & & \ddots & \ddots & -\kappa_{n-1} \\ & & & \kappa_{n-1} & 0 \end{bmatrix} \]
and consider the matrix initial value problem
\[ X' = XA, \qquad X(s_0) = I. \]
This has a unique smooth solution $X$ on some interval $I$ which contains $s_0$. The skew-symmetry of $A$ implies that $X(s)$ is orthogonal for all $s \in I$. If we let $x_1$ be the first column of $X$, then
\[ \gamma(s) := \int_{s_0}^{s} x_1(t)\, dt \]
defines a unit speed $(n-1)$-regular curve with the required curvature functions. □
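The construction in this proof sketch can be imitated numerically. The following Python sketch integrates $X' = XA$, $X(s_0) = I$ together with $\gamma' = x_1$, $\gamma(s_0) = 0$. It is an illustration only: the function names (`curvature_matrix`, `derivative`, `reconstruct_curve`) are hypothetical, and the classical fourth-order Runge-Kutta scheme is a choice made for the example, not something prescribed by the text.

```python
def curvature_matrix(kappas):
    """Skew-symmetric A with kappa_i on the subdiagonal, -kappa_i above it."""
    n = len(kappas) + 1
    A = [[0.0] * n for _ in range(n)]
    for i, k in enumerate(kappas):
        A[i + 1][i] = k
        A[i][i + 1] = -k
    return A


def derivative(s, state, curvature_fns):
    """Right-hand side of gamma' = x_1, X' = X A(s).

    `state` is gamma (n entries) followed by X stored row by row.
    """
    n = len(curvature_fns) + 1
    X = [state[n + i * n: n + (i + 1) * n] for i in range(n)]
    A = curvature_matrix([fn(s) for fn in curvature_fns])
    d_gamma = [X[i][0] for i in range(n)]  # first column of X
    d_X = [sum(X[i][k] * A[k][j] for k in range(n))
           for i in range(n) for j in range(n)]
    return d_gamma + d_X


def reconstruct_curve(curvature_fns, s0, s1, steps=2000):
    """Integrate from s0 to s1 with RK4; return (gamma(s1), X(s1))."""
    n = len(curvature_fns) + 1
    state = [0.0] * n + [1.0 if i == j else 0.0
                         for i in range(n) for j in range(n)]
    h = (s1 - s0) / steps
    s = s0
    for _ in range(steps):
        k1 = derivative(s, state, curvature_fns)
        k2 = derivative(s + h / 2, [y + h / 2 * d for y, d in zip(state, k1)], curvature_fns)
        k3 = derivative(s + h / 2, [y + h / 2 * d for y, d in zip(state, k2)], curvature_fns)
        k4 = derivative(s + h, [y + h * d for y, d in zip(state, k3)], curvature_fns)
        state = [y + h / 6 * (a + 2 * b + 2 * c + d)
                 for y, a, b, c, d in zip(state, k1, k2, k3, k4)]
        s += h
    gamma = state[:n]
    X = [state[n + i * n: n + (i + 1) * n] for i in range(n)]
    return gamma, X
```

With $\kappa_1 \equiv 1$ and $\kappa_2 \equiv 0$ in $\mathbb{R}^3$, the exact solution is the unit circle $\gamma(s) = (\sin s, 1 - \cos s, 0)$, which provides a simple sanity check; the computed $X(s)$ should also remain orthogonal up to integration error.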
Exercise 4.13. Fill in the details of the previous proof.

If $n > 3$, then a regular curve $\gamma$ need not have a Frenet frame. However, a regular curve still has a curvature function, and if the curve is 2-regular, then we have a principal normal $N$ which is defined so that $T$ and $N$ are the Gram-Schmidt orthogonalization of $\gamma'$ and $\gamma''$. We will sometimes denote this principal normal by $E_2$ in order to avoid confusion with the normal to a hypersurface, which is denoted below by $N$.
4.2. Hypersurfaces

Suppose that $Y$ is a vector field on an open set in $\mathbb{R}^n$. For $p$ in the domain of $Y$, and $v_p \in T_p\mathbb{R}^n$, let $c$ be a curve with $c(0) = p$ and $\dot{c}(0) = v_p$. For any $t$ near 0, we can look at the value of $Y$ at $c(t)$. We let
\[ \nabla_{v_p} Y := (Y \circ c)'(0), \]
which is defined since $Y \circ c$ is a vector field along $c$. Note that in this context $(Y \circ c)'(0)$ is taken to be based at $p = c(0)$. If $X$ is a vector field, then a vector field $\nabla_X Y$ is given by $\nabla_X Y : p \mapsto \nabla_{X_p} Y$. In fact, it is easy to see that if $X = \sum X^i \mathbf{e}_i$ and $Y = \sum Y^i \mathbf{e}_i$, then
\[ \nabla_X Y = \sum_i (X Y^i)\, \mathbf{e}_i, \]
since $(XY^i)(p) = X_p Y^i = \left.\frac{d}{dt}\right|_0 Y^i \circ c$. We have presented things as we have because we wish to prime the reader for the general concept of a covariant derivative that we will meet in later chapters. However, it must be confessed that under the canonical identification of $\mathbb{R}^n$ with each tangent space, $\nabla_{X_p} Y$ is just the directional derivative of $Y$ in the direction $X_p$. The map $(X, Y) \mapsto \nabla_X Y$ is $C^\infty(\mathbb{R}^n)$-linear in $X$ but not in $Y$. Rather, it is $\mathbb{R}$-linear in $Y$ and we have a product rule:
\[ \nabla_X fY = (Xf)\, Y + f \nabla_X Y. \]
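Since $\nabla_{v_p} Y$ is a directional derivative, the product rule above can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the curve is the straight line $c(t) = p + tv$, the derivative is approximated by a central difference, and the names `covariant_derivative`, `directional_derivative`, and the sample fields `Y_field` and `f_func` are hypothetical choices for the example.

```python
def covariant_derivative(Y, p, v, h=1e-6):
    """Approximate (Y o c)'(0) for c(t) = p + t v (principal part only)."""
    fwd = Y([pi + h * vi for pi, vi in zip(p, v)])
    bwd = Y([pi - h * vi for pi, vi in zip(p, v)])
    return [(a - b) / (2.0 * h) for a, b in zip(fwd, bwd)]


def directional_derivative(f, p, v, h=1e-6):
    """Approximate v_p f for a scalar function f."""
    fwd = f([pi + h * vi for pi, vi in zip(p, v)])
    bwd = f([pi - h * vi for pi, vi in zip(p, v)])
    return (fwd - bwd) / (2.0 * h)


def Y_field(q):
    """Sample vector field on R^2 (principal part)."""
    return [q[0] * q[1], q[0] ** 2 - q[1]]


def f_func(q):
    """Sample smooth function on R^2."""
    return q[0] + 2.0 * q[1]


def fY_field(q):
    """The product f Y."""
    return [f_func(q) * y for y in Y_field(q)]


def product_rule_sides(p, v):
    """Return (nabla_v (fY), (v f) Y + f nabla_v Y) at the point p."""
    lhs = covariant_derivative(fY_field, p, v)
    vf = directional_derivative(f_func, p, v)
    dY = covariant_derivative(Y_field, p, v)
    rhs = [vf * y + f_func(p) * d for y, d in zip(Y_field(p), dY)]
    return lhs, rhs
```

The two sides returned by `product_rule_sides` should agree up to the finite-difference error.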
Because of these properties, the operator $\nabla_X : Y \mapsto \nabla_X Y$, which is given for any $X$, is called a covariant derivative (or Koszul connection). In Chapter 12 we study covariant derivatives in a more general context. Since we shall soon consider covariant derivatives on submanifolds, let us refer to $\nabla$ as the ambient covariant derivative. Notice that we have $\nabla_X Y - \nabla_Y X = [X, Y]$. There is another property that our ambient covariant derivative $\nabla$ satisfies. Namely, it respects the metric:
\[ X \langle Y, Z \rangle = \langle \nabla_X Y, Z \rangle + \langle Y, \nabla_X Z \rangle. \]
Similarly, if $v_p \in T_p\mathbb{R}^n$, then $v_p \langle Y, Z \rangle = \langle \nabla_{v_p} Y, Z_p \rangle + \langle Y_p, \nabla_{v_p} Z \rangle$.

Consider a hypersurface $M$ in $\mathbb{R}^n$. By definition, $M$ is a regular $(n-1)$-dimensional submanifold of $\mathbb{R}^n$. (If $n = 3$, then such a submanifold has dimension two and we also just refer to it as a surface in $\mathbb{R}^3$.) A vector field along an open set $O \subset M$ is a map $X : O \to T\mathbb{R}^n$ such that the following diagram commutes:
\[ \begin{array}{ccc} & & T\mathbb{R}^n \\ & \overset{X}{\nearrow} & \downarrow \\ O & \hookrightarrow & \mathbb{R}^n \end{array} \]
Here the horizontal map is inclusion of $O$ into $\mathbb{R}^n$. If $X(p) \in T_pM$ for all $p \in O$, then $X$ is nothing more than a tangent vector field on $O$. If $N$ is a field along $O$ such that $\langle N(p), v_p \rangle = 0$ for all $v_p \in T_pM$ and all $p \in O$, then we call $N$ a (smooth) normal field. If $N$ is a normal field such that $\langle N(p), N(p) \rangle = 1$ for all $p \in O$, then $N$ is called a unit normal field. Note that because of examples such as embedded Möbius bands in $\mathbb{R}^3$, it is not always the case that there exists a globally defined smooth unit normal field.

Definition 4.14. A hypersurface $M$ in $\mathbb{R}^n$ is called orientable if there exists a smooth global unit normal vector field $N$ defined along $M$. We say that $M$ is oriented by $N$.

We will come to a more general and sophisticated notion of orientable manifold later. That definition will be consistent with the one above.

Exercise 4.15. Show that for a connected orientable hypersurface, there are exactly two choices of unit normal vector field.
In this chapter, we study mainly local geometry. We focus attention near a point $p \in M$. We consider a chart $(\tilde{O}, \tilde{u})$ for $\mathbb{R}^n$ that is a single-slice chart centered at $p$ and adapted to $M$. Thus if $\tilde{u} = (\tilde{u}^1, \ldots, \tilde{u}^n)$, then the restrictions of the functions $\tilde{u}^1, \ldots, \tilde{u}^{n-1}$ give coordinates for $M$ on the set $O = \tilde{O} \cap M = \{\tilde{u}^n = 0\}$. We denote these restrictions by $u^1, \ldots, u^{n-1}$. Thus if we write $u := (u^1, \ldots, u^{n-1})$, then $(O, u)$ is a chart on $M$. We may further arrange that $\tilde{u}(\tilde{O})$ is a cube centered at the origin in $\mathbb{R}^n$ and so, in particular, $O$ is connected and orientable. Let us temporarily call such charts special. The coordinate vector fields $\frac{\partial}{\partial \tilde{u}^i}$ for $i = 1, \ldots, n$ are defined on $\tilde{O}$, while the vector fields $\frac{\partial}{\partial u^i}$ are defined on $O$. We have
\[ \frac{\partial}{\partial \tilde{u}^i}(p) = \frac{\partial}{\partial u^i}(p) \quad \text{for all } p \in O \text{ and } i = 1, \ldots, n - 1. \]
If $\tilde{X}$ is a vector field on an open set $\tilde{O}$, then its restriction to $O = \tilde{O} \cap M$ is a vector field along $O$, which certainly need not be tangent to $M$. If $X$ is a vector field along $O$, then there must be smooth functions $X^i$ on $O$ such that
\[ X(p) = \sum_{i=1}^{n} X^i(p) \frac{\partial}{\partial \tilde{u}^i}(p) \]
for all $p \in O$. Then $X$ is a tangent vector field on $O$ precisely when the last component $X^n$ is identically zero on $O$ so that
\[ X = \sum_{i=1}^{n-1} X^i \frac{\partial}{\partial u^i}. \]
Now if $X$ is a field along $O$, then it can be extended to a field $\tilde{X}$ on $\tilde{O} \subset \mathbb{R}^n$ by considering the component functions $X^i$ as functions on $\tilde{O}$ which happen to be constant with respect to the last coordinate variable $\tilde{u}^n$. In other words, if $\pi$ denotes the map $(a^1, \ldots, a^n) \mapsto (a^1, \ldots, a^{n-1}, 0)$ on the cube $\tilde{u}(\tilde{O})$, then
\[ \tilde{X} = \sum_{i=1}^{n} \tilde{X}^i \frac{\partial}{\partial \tilde{u}^i}, \]
where $\tilde{X}^i = X^i \circ \tilde{u}^{-1} \circ \pi \circ \tilde{u}$. This last composition makes good sense because $(\tilde{O}, \tilde{u})$ is a special single-slice chart as described above. We will refer to $\tilde{X}$ as an extension of $X$, but note that the extension is based on a particular special choice of single-slice chart. The constructions on the hypersurface that we consider below do not depend on the extension.

Given a choice of unit normal $N$ along a neighborhood of $p \in M$, $\nabla_{v_p} N$ is defined for any $v_p \in T_pM$ by virtue of the fact that $N$ only needs to be defined along a curve with tangent $v_p$. Alternatively, we can define $\nabla_{v_p} N$ to be equal to $\nabla_{v_p} \tilde{N}$ for an extension $\tilde{N}$ of $N$ in a special single-slice chart adapted to $M$ and containing $p$. We note that $\langle \nabla_{v_p} N, N(p) \rangle = 0$. Indeed, since $\langle N, N \rangle = 1$, we have
\[ 0 = v_p \langle N, N \rangle = 2 \langle \nabla_{v_p} N, N \rangle. \]
Thus $\nabla_{v_p} N \in T_pM$ since $T_pM$ is exactly the set of vectors in $T_p\mathbb{R}^n$ perpendicular to $N_p = N(p)$.

Figure 4.1. Shape operator

Exercise 4.16. Suppose we are merely given a unit normal vector at $p$. Show that we can extend it to a smooth normal field near $p$. For dimensional reasons, any two such extensions must agree on some neighborhood of $p$.

Definition 4.17. Given a choice of unit normal $N_p$ at $p \in M$, the map $S_{N_p} : T_pM \to T_pM$ defined by
\[ S_{N_p}(v_p) := -\nabla_{v_p} N \]
for any local unit normal field $N$ with $N_p = N(p)$, is called the shape operator or Weingarten map at $p$.

From the definitions it follows that if $c$ is a curve with $c(t_0) = p$ and $\dot{c}(t_0) = v_p$, then $S_{N_p}(v_p) = -(N \circ c)'(t_0)$. If $O \subset M$ is an open set oriented by a choice of unit normal field $N$ along $O$, then we obtain a map $S_N : TO \to TO$ by $S_N|_{T_pM} := S_{N_p}$. This map is also called the shape operator (on $O$).

We have the inner product $\langle \cdot, \cdot \rangle_p$ on each tangent space $T_p\mathbb{R}^n$ and $T_pM \subset T_p\mathbb{R}^n$. For each $p \in M$, the restriction of this inner product to each tangent space $T_pM$ is also denoted by $\langle \cdot, \cdot \rangle_p$ or just $\langle \cdot, \cdot \rangle$. We also denote this inner product on $T_pM$ by $g_p$ so that $g_p(\cdot, \cdot) = \langle \cdot, \cdot \rangle_p$. The map $p \mapsto g_p$ is smooth in the sense that $p \mapsto g_p(X(p), Y(p))$ is smooth whenever $X$ and $Y$ are smooth vector fields on $M$. Such a smooth assignment of inner product to the tangent spaces of $M$ provides a Riemannian metric on $M$, a concept studied in more generality in later chapters. We denote the function $p \mapsto g_p(X(p), Y(p))$ simply by $g(X, Y)$ or $\langle X, Y \rangle$. In short, a Riemannian metric is a smooth assignment of an inner product to each tangent space.

Definition 4.18. A diffeomorphism $f : M_1 \to M_2$ between hypersurfaces in $\mathbb{R}^n$ is called an isometry if $T_pf : T_pM_1 \to T_{f(p)}M_2$ is an isometry of inner product spaces for all $p \in M_1$. In this case, we say that $M_1$ is isometric to $M_2$.
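Proposition 4.19. $S_{N_p} : T_pM \to T_pM$ is self-adjoint with respect to $\langle \cdot, \cdot \rangle_p$.

Proof. Let $(\tilde{O}, \tilde{u})$ be a special chart centered at $p$ and let $O = \tilde{O} \cap M$ as above. Let $X_p$ and $Y_p$ be elements of $T_pM$ and extend them to vector fields $X$ and $Y$ on $O$. Then extend $X$ and $Y$ to fields $\tilde{X}$ and $\tilde{Y}$ on $\tilde{O}$. Similarly, extend $N_p$ to $N$ and then $\tilde{N}$. Note that $X_p \langle \tilde{N}, \tilde{Y} \rangle = X_p \langle N, Y \rangle = 0$ and $Y_p \langle \tilde{N}, \tilde{X} \rangle = Y_p \langle N, X \rangle = 0$, since $X_p$ and $Y_p$ are tangent to $M$ and $\langle N, X \rangle = \langle N, Y \rangle = 0$ on $O$. Using this, we have
\[ \begin{aligned} -\langle S_{N_p} X_p, Y_p \rangle + \langle X_p, S_{N_p} Y_p \rangle &= \langle \nabla_{X_p} N, Y_p \rangle - \langle X_p, \nabla_{Y_p} N \rangle = \langle \nabla_{\tilde{X}} \tilde{N}, \tilde{Y} \rangle_p - \langle \tilde{X}, \nabla_{\tilde{Y}} \tilde{N} \rangle_p \\ &= X_p \langle \tilde{N}, \tilde{Y} \rangle - \langle \tilde{N}, \nabla_{\tilde{X}} \tilde{Y} \rangle_p - Y_p \langle \tilde{N}, \tilde{X} \rangle + \langle \tilde{N}, \nabla_{\tilde{Y}} \tilde{X} \rangle_p \\ &= \langle \nabla_{\tilde{Y}} \tilde{X} - \nabla_{\tilde{X}} \tilde{Y}, \tilde{N} \rangle_p = \langle [\tilde{Y}, \tilde{X}], \tilde{N} \rangle_p \\ &= \langle [\tilde{Y}, \tilde{X}](p), \tilde{N}(p) \rangle = \langle [Y, X](p), N(p) \rangle = 0, \end{aligned} \]
since $[\tilde{Y}, \tilde{X}](p) = [Y, X](p) \in T_pM$ by Proposition 2.84 applied to the inclusion map. □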
Definition 4.20. The symmetric bilinear form $II_p$ on $T_pM$ defined by
\[ II_p(v_p, w_p) := \langle v_p, S_{N_p} w_p \rangle = \langle S_{N_p} v_p, w_p \rangle \]
is called the second fundamental form at $p$.

If $N$ is a unit normal on an open subset of $M$, then for smooth tangent vector fields $X, Y$, the function $II(X, Y)$ defined by $p \mapsto II_p(X_p, Y_p)$ is smooth. The assignment $(X, Y) \mapsto II(X, Y)$ defined on pairs of tangent vector fields is also called the second fundamental form. The form $II$ is bilinear over $C^\infty(O)$, where $O$ is the domain of $N$. As we shall see, the shape operator can be recovered from the second fundamental form $II$ together with the metric $g = \langle \cdot, \cdot \rangle$ (first fundamental form). If the reader keeps the definition in mind, he or she will recognize that the second fundamental form appears implicitly in much that follows. We return to the second fundamental form again explicitly later.

Exercise 4.21. Let $X_p \in T_p\mathbb{R}^n$ and let $Y$ be any smooth vector field on $\mathbb{R}^n$. Show that if $f : \mathbb{R}^n \to \mathbb{R}^n$ is a Euclidean motion, then we have
\[ Tf \cdot \nabla_{X_p} Y = \nabla_{Tf \cdot X_p} f_*Y. \]
More generally, show that this is true if $f$ is affine (i.e. if $f$ is of the form $f(x) = Ax + b$ for some linear map $A$ and $b \in \mathbb{R}^n$).
Theorem 4.22. If $M_1$ is a connected hypersurface in $\mathbb{R}^n$ and if $f : \mathbb{R}^n \to \mathbb{R}^n$ is a Euclidean motion, then $M_2 = f(M_1)$ is also a hypersurface and

(i) the induced map $f|_{M_1} : M_1 \to M_2$ is an isometry;

(ii) if $M_1$ and $M_2$ are oriented by unit normals $N_1$ and $N_2$ respectively, then, after replacing $N_1$ or $N_2$ by its negative if necessary, we have
\[ Tf \circ S_{N_1} = S_{N_2} \circ Tf. \]

Proof. That $f(M_1)$ is a hypersurface is an easy exercise, which we leave to the reader. After noticing that $f|_{M_1} : M_1 \to M_2$ is smooth (why?), we argue as follows: Let $\phi := f|_{M_1}$ and notice that $T\phi \cdot v = Tf \cdot v$ for all $v$ tangent to $M_1$. Since $Tf$ preserves the inner products on $T\mathbb{R}^n$, we see that $T_p\phi$ is an isometry for each $p \in M_1$. Since $\phi$ is clearly a bijection we conclude that (i) holds. Note that we must have $T_pf \cdot N_1(p) = \pm N_2(f(p))$ and, since $M_1$ is connected, one possible change of sign on unit normals gives
\[ Tf \circ N_1 \circ f^{-1} = N_2. \]
Then by Exercise 4.21,
\[ Tf \cdot S_{N_1} v = -Tf \cdot \nabla_v N_1 = -\nabla_{Tf \cdot v} f_*N_1 = -\nabla_{Tf \cdot v} N_2 = S_{N_2}(Tf \cdot v), \]
or $Tf \circ S_{N_1} = S_{N_2} \circ Tf$. □
Notice that an arbitrary isometry $M_1 \to M_2$ need not be the restriction of an isometry of the ambient $\mathbb{R}^n$ and need not preserve shape operators.

Definition 4.23. Let $M$ be a hypersurface in $\mathbb{R}^n$, take a point $p \in M$, and let $N_p$ be a unit normal at $p$. The mean curvature $H(p)$ at $p$ in the direction $N_p$ is defined by
\[ H(p) := \frac{1}{n-1} \operatorname{trace}(S_{N_p}). \]
Notice that changing $N_p$ to $-N_p$ changes $H(p)$ to $-H(p)$. If $N$ is a unit normal field along an open set $U \subset M$, then the function $p \mapsto H(p)$ is smooth. It is called a mean curvature function. If $M$ is an orientable hypersurface, then it has a global mean curvature function for each unit normal field.

Definition 4.24. Let $M$ be a hypersurface in $\mathbb{R}^n$. Let a unit normal $N_p$ be given at $p$. The Gauss curvature $K(p)$ at $p$ is defined by
\[ K(p) := \det(S_{N_p}). \]
If $N$ is a unit normal field along an open set $U \subset M$, then the function $p \mapsto K(p) = \det(S_{N(p)})$ is smooth. It is called a Gauss curvature function associated to the normal field, and always exists locally. If $M$ is orientable, then it has a global Gauss curvature function for every choice of unit normal field. Notice that changing $N_p$ to $-N_p$ changes $K(p)$ to $(-1)^{n-1}K(p)$. It follows that if $n - 1$ is even, then there is a unique global Gauss curvature function regardless of whether $M$ is orientable or not. In particular, this is the case for surfaces in $\mathbb{R}^3$.
The shape operator $S_{N_p}$ encodes the local geometry of the submanifold at $p$ and measures the way $M$ bends and twists through the ambient Euclidean space. Since $S_{N_p}$ is self-adjoint, there is a basis for $T_pM$ consisting of eigenvectors of $S_{N_p}$. An eigenvector for $S_{N_p}$ is called a principal vector, and a unit principal vector is called a principal direction or a direction of curvature. The eigenvalues are called principal curvatures at $p$. If $k_1, \dots, k_{n-1}$ are the principal curvatures at $p$, then
\[ H(p) = \frac{1}{n-1}\sum_{i=1}^{n-1} k_i \quad\text{and}\quad K(p) = \prod_{i=1}^{n-1} k_i. \]
If $u \in T_pM$ is a tangent vector with $\langle u, u\rangle = 1$, then $k(u) = \langle S_{N_p}u, u\rangle$ is the normal curvature in the direction $u$. Of course, if $u$ is a unit length eigenvector (principal direction), then the corresponding principal curvature $k$ is just the normal curvature in that direction. Notice that $k(-u) = k(u)$. A vector $v \in T_pM$ is called asymptotic if $\langle S_{N_p}v, v\rangle = 0$.
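To make these definitions concrete, here is a minimal numerical sketch for a surface ($n - 1 = 2$). It assumes that the shape operator at $p$ has already been written as a symmetric $2 \times 2$ matrix with respect to an orthonormal basis of $T_pM$; the particular matrix `S` and the function names are hypothetical choices made only for illustration.

```python
import math

def mean_and_gauss(S):
    """Return (H, K) for a symmetric 2x2 shape-operator matrix S = [[a, b], [b, c]]."""
    (a, b), (b2, c) = S
    H = (a + c) / 2.0      # H = trace(S) / (n - 1) with n - 1 = 2
    K = a * c - b * b2     # K = det(S)
    return H, K

def normal_curvature(S, u):
    """k(u) = <S u, u> for a unit vector u = (u1, u2) in the chosen orthonormal basis."""
    Su = (S[0][0] * u[0] + S[0][1] * u[1],
          S[1][0] * u[0] + S[1][1] * u[1])
    return Su[0] * u[0] + Su[1] * u[1]

# Hypothetical shape operator with principal curvatures 2 and -1.
S = [[2.0, 0.0], [0.0, -1.0]]
H, K = mean_and_gauss(S)

# Normal curvature in a generic unit direction, and in the opposite direction.
theta = math.pi / 3
u = (math.cos(theta), math.sin(theta))
k_u = normal_curvature(S, u)
k_minus_u = normal_curvature(S, (-u[0], -u[1]))

# An asymptotic direction for this S: 2 cos^2(t) - sin^2(t) = 0, i.e. tan^2(t) = 2.
t_asym = math.atan(math.sqrt(2.0))
k_asym = normal_curvature(S, (math.cos(t_asym), math.sin(t_asym)))
```

In this example `k_u` and `k_minus_u` agree, as they must, and `k_asym` vanishes up to rounding, so the corresponding direction is asymptotic.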
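Proposition 4.25. Let $M$ be a hypersurface and $N$ a unit normal field. Let $\gamma : I \to M \subset \mathbb{R}^n$ be a regular curve with image in the domain of $N$. Then
\[ \langle S_N\gamma'(t), \gamma'(t)\rangle = \langle N(\gamma(t)), \gamma''(t)\rangle. \]
Proof. Since $\langle N(\gamma(t)), \gamma'(t)\rangle = 0$, differentiation gives
\[ \langle N(\gamma(t)), \gamma''(t)\rangle + \left\langle \frac{d}{dt}N(\gamma(t)), \frac{d}{dt}\gamma(t)\right\rangle = 0. \]
Thus
\[ \langle S_N\gamma'(t), \gamma'(t)\rangle = \langle -\bar\nabla_{\gamma'(t)}N, \gamma'(t)\rangle = \left\langle -\frac{d}{dt}N(\gamma(t)), \frac{d}{dt}\gamma(t)\right\rangle = \langle N(\gamma(t)), \gamma''(t)\rangle. \quad\square \]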
In particular, if $u$ is a unit vector at $p$ and $u = c'(0)$ for some unit speed curve $c$, then $k(u) = \langle N_p, c''(0)\rangle$. This shows that all unit speed curves with a given velocity $u$ have the same normal component of acceleration, given by the normal curvature in that direction. This curvature is forced by the shape of $M$ and we see how normal curvatures measure the shape of $M$.
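Corollary 4.26. Let $c : I \to M \subset \mathbb{R}^n$ be a unit speed curve. If $\kappa(s_0) = 0$ for some $s_0 \in I$, then $k(c'(s_0)) = 0$. If $\kappa(s) > 0$ and $E_2$ is the principal normal defined near $s$, then $k(c'(s)) = \kappa(s)\cos\theta(s)$, where $\theta(s)$ is the angle between $N(c(s))$ and $E_2(s)$.
Proof. Suppose $\kappa(s_0) = 0$. Then $c''(s_0) = 0$, since $\|c''\| = \kappa$ for a unit speed curve, and so $k(c'(s_0)) = \langle N(c(s_0)), c''(s_0)\rangle = 0$. If $\kappa(s) > 0$, then $E_2$ is defined on an interval containing $s$. We have
\[ k(c'(s)) = \langle N(c(s)), c''(s)\rangle = \langle N(c(s)), \kappa(s)E_2(s)\rangle = \kappa(s)\langle N(c(s)), E_2(s)\rangle = \kappa(s)\cos\theta(s). \quad\square \]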
If $P$ is a 2-plane containing $p \in M$ and such that $N_p$ is tangent to $P$, then for a small enough open neighborhood $O$ of $p$, the set $C = O \cap P \cap M$ is a regular one-dimensional submanifold of $\mathbb{R}^n$. We parameterize $C$ by a unit speed curve $c$ with $c(0) = p$. This curve $c$ is called a normal section at $p$. Notice that in this case $E_2(0) = \pm N_p$. Then $k(c'(0)) = \pm\kappa(0)$, where the "$-$" sign is chosen in case $N_p = -E_2(0)$. Thus we see that $k(c'(0))$ is positive if the normal section $c$ bends toward $N_p$ and negative if it bends away from $N_p$.
Definition 4.27 (Curve types). Let $M$ be a hypersurface in $\mathbb{R}^n$ and let $\gamma : I \to M$ be a regular curve.
(i) $\gamma$ is called a geodesic if the acceleration $\gamma''(t)$ is normal to $M$ for all $t \in I$.
(ii) $\gamma$ is called a principal curve (or line of curvature) if $\gamma'(t)$ is a principal vector for all $t \in I$.
(iii) $\gamma$ is called an asymptotic curve if $\gamma'(t)$ is an asymptotic vector for all $t \in I$.
Proposition 4.28. Let $M$ be a surface in $\mathbb{R}^3$. Suppose that a regular curve $\gamma : I \to M$ is contained in the intersection of $M$ and a plane $P$. If the angle between $M$ and $P$ is constant along $\gamma$, then $\gamma$ is a principal curve.
Proof. The result is local, and so we assume that $M$ is oriented by a unit normal $N$. Let $\nu$ be a unit normal along $P$. Since $P$ is a plane, $\nu$ is constant. By assumption, $\langle N, \nu\rangle$ is constant along $\gamma$. Thus
\[ 0 = \frac{d}{dt}\langle N \circ \gamma, \nu \circ \gamma\rangle = \langle \bar\nabla_{\gamma'}N, \nu \circ \gamma\rangle, \]
so $\bar\nabla_{\gamma'}N$ is orthogonal to $\nu$ along $\gamma$. By the same token, $\bar\nabla_{\gamma'}N$ is orthogonal to $N$ since $\langle N, N\rangle = 1$. Since $\gamma'$ is also orthogonal to both $N$ and $\nu$, the vector $\bar\nabla_{\gamma'}N$ must be collinear with $\gamma'$. In other words, $S_N\gamma' = -\bar\nabla_{\gamma'}N = \lambda\gamma'$ for some scalar function $\lambda$. $\square$
Example 4.29 (Surface of revolution). Let $t \mapsto (g(t), h(t))$ be a regular curve in $\mathbb{R}^2$ defined on an open interval $I$. Assume that $h > 0$. Call this curve the profile curve. Define $x : I \times \mathbb{R} \to \mathbb{R}^3$ by
\[ x(u, v) = (g(u), h(u)\cos v, h(u)\sin v). \]
This is periodic in $v$ and its image is a surface. The curves of constant $u$ and curves of constant $v$ are contained in planes of the form $\{x = c\}$ and $\{z = my\}$ (or $\{y = mz\}$), respectively. By the symmetry of the surface, each such plane meets the surface at a constant angle along the curve, and so by Proposition 4.28 these curves are principal curves. For a surface of revolution, the circles generated by rotating a fixed point of the profile are called parallels. These are the constant $u$ curves in the example above. The curves that are copies of the profile curve are called the meridians. In the example above, the meridians are the constant $v$ curves.
We know from standard linear algebra that principal directions at a point in a hypersurface corresponding to distinct principal curvatures are orthogonal. Generically, each eigenspace will be one-dimensional, but in general they may be of higher dimension. In fact, if $S_{N_p}$ is a multiple of the identity operator, then there is only one eigenvalue and the eigenspace is all of $T_pM$. In this case, every direction is a principal direction. It may even be the case that $S_{N_p} = 0$.
If $c : I \to M$ is a unit speed curve into a hypersurface, then rather than use the distinguished Frenet frame of $c$, we can use a frame which incorporates the unit normal to the hypersurface:
Definition 4.30. Let $c : I \to M$ be a unit speed curve into a surface in $\mathbb{R}^3$ and let $N$ be a unit normal defined at least on an open set containing the image of $c$. A frame field $(D_1, D_2, D_3)$ (along $c$) such that $D_1 = T$ $(= c')$, $D_3 = N \circ c$ and $D_2 = D_3 \times D_1$ is called a Darboux frame.
Exercise 4.31. Let $c : I \to M$ be a unit speed curve into a surface $M$ in $\mathbb{R}^3$ and let $D_1, D_2, D_3$ be an associated Darboux frame. Show that there exist smooth functions $g_1$, $g_2$ and $g_3$ such that
\[ \begin{aligned} dD_1/ds &= g_1D_2 + g_2D_3, \\ dD_2/ds &= -g_1D_1 + g_3D_3, \\ dD_3/ds &= -g_2D_1 - g_3D_2. \end{aligned} \]
Show that $g_1 = 0$ along $c$ if and only if $c$ is a geodesic. Show that $g_2 = 0$ along $c$ if and only if $c$ is asymptotic, and $g_3 = 0$ along $c$ if and only if $c$ is principal.
The function $g_1$ from the previous exercise is called the geodesic curvature function and is often denoted $\kappa_g$. Let us obtain a formula for $\kappa_g$. Define $J : T_pM \to T_pM$ by
\[ Jv_p := N_p \times v_p. \]
Figure 4.2. Curvature vectors
Notice that $\|Jv_p\| = \|v_p\|$ and so $J$ is an isometry of the 2-dimensional inner product space $(T_pM, \langle\cdot,\cdot\rangle_p)$. But $J$ also satisfies $J^2 = -\mathrm{id}$, and in dimension 2 this determines $J$ up to sign since it must be a rotation by $\pm\pi/2$. We now assume that $N$ is globally defined so that the map $J$ extends to a smooth map $TM \to TM$. For a unit speed curve $c : I \to M$, the geodesic curvature is given by
\[ \kappa_g = \langle c'', Jc'\rangle = \left\langle \frac{dT}{ds}, JT\right\rangle. \tag{4.1} \]
Indeed, abbreviating $N \circ c$ to $N$ we have from the first equation in Exercise 4.31 the following:
\[ \frac{dT}{ds} = c'' = \kappa_g N \times c' + g_2N = \kappa_g Jc' + g_2N. \]
Taking inner products with $Jc'$ gives formula (4.1).
Let $c : I \to M$ be a curve in a surface that is parametrized by arc length. The curvature vector $\boldsymbol{\kappa}(s)$ can be decomposed into a component $\boldsymbol{\kappa}_g(s)$ tangent to $M$ and a component $\boldsymbol{\kappa}_n(s)$ normal to $M$:
\[ \boldsymbol{\kappa}(s) = \boldsymbol{\kappa}_g(s) + \boldsymbol{\kappa}_n(s). \]
The curve will be a geodesic if and only if $\boldsymbol{\kappa}_g(s) = 0$ for all $s$. The vector $\boldsymbol{\kappa}_g(s)$ is the geodesic curvature vector at $c(s)$. It is easy to show that $\kappa_g = \pm\|\boldsymbol{\kappa}_g(s)\|$. Figure 4.2 depicts a sphere of radius $R$ with a conical "hat". The cone intersects the sphere in a curve of latitude. Since the cone is tangent to the sphere, the vectors $\boldsymbol{\kappa}$, $\boldsymbol{\kappa}_g$ and $\boldsymbol{\kappa}_n$ apply equally well to both surfaces at least along the curve. Using the Pythagorean theorem and similar triangles, it is possible to show that $\|\boldsymbol{\kappa}_g\| = 1/a$, where $a$ is the distance from the curve to the vertex of the cone. Now imagine cutting the cone along a generating line and unrolling it as shown in Figure 4.3. The result is a planar region with a circular arc of radius $a$ and curvature whose
magnitude is $1/a$. The reader might want to try to give a reason why this is to be expected. We will answer this later in this chapter.
Figure 4.3. Unrolling a cone
Definition 4.32. If $S_{N_p}$ is a multiple of the identity operator, then $p$ is called an umbilic point. If $S_{N_p} = 0$, then $p$ is called a flat point. If every point of a hypersurface is umbilic, then we say that the hypersurface is totally umbilic.
Example 4.33. If $a_1x^1 + a_2x^2 + \cdots + a_nx^n = 0$ is the equation of a hyperplane $P$ in $\mathbb{R}^n$, then
\[ N = \sum_{i=1}^{n} a_ie_i \]
is a normal field when restricted to $P$. Since $N$ is constant, we have $S_{N_p} = 0$ for all $p \in P$, so every point of $P$ is flat and the Gauss and mean curvatures are identically zero.
Example 4.34. Let $S^{n-1}$ be the unit sphere in $\mathbb{R}^n$. Then the map $p = (a^1, \dots, a^n) \mapsto N(p) := \sum_{i=1}^{n} a^ie_i$ is a unit normal field along $S^{n-1}$. We can calculate $S_N(v) = -\bar\nabla_vN$. Let $c$ be a curve in $S^{n-1}$ with $c'(0) = v$. Then
\[ -\bar\nabla_vN = -\left.\frac{d}{dt}\right|_{t=0} N(c(t)) = -\left.\frac{d}{dt}\right|_{t=0}\sum_{i=1}^{n} c^i(t)e_i = -v. \]
We are really just using the fact that up to a change in base point we have $N(c(t)) = c(t)$. Thus $S_N = -\mathrm{id}$ and we see that every point is umbilic and every principal curvature is $-1$. Also, $K = (-1)^{n-1}$ (so that $K = 1$ for $S^2 \subset \mathbb{R}^3$) and the mean curvature is $H = -1$ everywhere.
Exercise 4.35. What is the shape operator on a sphere of radius $r$?
Composition with a Euclidean motion preserves local geometry so if we want to study a hypersurface near a point $p$, then we may as well assume
that $p$ is the origin and that $N_p = e_n(0) = \left.\frac{\partial}{\partial x^n}\right|_0$. Locally, $M$ is then the graph of a function $f : \mathbb{R}^{n-1} \to \mathbb{R}$ with $\frac{\partial f}{\partial x^i}(0) = 0$ for $i = 1, \dots, n-1$. A normal field $N$ which extends $N_0 = e_n(0)$ is given by
\[ N = \left(1 + \sum_{i=1}^{n-1}\left(\partial f/\partial x^i\right)^2\right)^{-1/2}\left(e_n - \sum_{i=1}^{n-1}\frac{\partial f}{\partial x^i}e_i\right). \]
Let $Z = e_n - \sum_{i=1}^{n-1}\frac{\partial f}{\partial x^i}e_i$ so that $N = gZ$, where $g$ is the first factor in the formula above. For $v = \sum_{i=1}^{n-1}v^ie_i$ tangent to $M$ at the origin we have
\[ \bar\nabla_v(gZ) = (vg)Z + g\bar\nabla_vZ, \]
where $vg$ means that $v$ acts on $g$ as a derivation. But $vg$ vanishes at the origin since $g$ takes on a maximum there. Since $g(0) = 1$, we have $\bar\nabla_v(gZ) = \bar\nabla_vZ$. Thus we may compute $S_{N_0}$ using $Z$:
\[ S_{N_0}v = -\bar\nabla_vZ. \]
We have
\[ S_{N_0}v = \sum_{i=1}^{n-1} v\!\left(\partial f/\partial x^i\right)e_i(0) = \sum_{i=1}^{n-1}\sum_{j=1}^{n-1} v^j\left.\frac{\partial^2 f}{\partial x^i\partial x^j}\right|_0 e_i(0). \]
We conclude that the shape operator at the origin is represented by the $(n-1)\times(n-1)$ matrix
\[ [D^2f](0) = \left[\frac{\partial^2 f}{\partial x^i\partial x^j}(0)\right]_{1\le i,j\le n-1}. \]
This is only valid at the origin. In other words, $[D^2f]$ does not give us a representation of the shape operator except at the origin where $M$ is tangent to $\mathbb{R}^{n-1}$. We can arrange, by rotating further if necessary, that $e_1, \dots, e_{n-1}$ are directions of curvature. In this case, the above matrix is diagonal with the principal curvatures $k_1, \dots, k_{n-1}$ down the diagonal.
Example 4.36. Let $M$ be the graph of the function $f(x, y) = xy$. We look at $0 \in M$. We have
\[ [D^2f] = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \]
which diagonalizes to
\[ \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \]
with corresponding eigenvectors $\frac{1}{\sqrt{2}}(e_1 + e_2)$ and $\frac{1}{\sqrt{2}}(e_1 - e_2)$. Thus the Gauss curvature at the origin is $-1$ and the mean curvature is $0$. We can understand this graph geometrically. The normal sections created by intersecting the graph with the planes $y = 0$ and $x = 0$ are straight lines, and so the normal curvatures in those directions
are zero. However, the normal section given by intersecting with the plane $y = x$ is concave up, while that created by the plane $y = -x$ is concave down.
Proposition 4.37. Let $M \subset \mathbb{R}^3$ be a surface. If $v_1, v_2 \in T_pM$ are linearly independent, then for a given unit normal $N_p$ we have
\[ \begin{aligned} S_{N_p}v_1 \times S_{N_p}v_2 &= K(p)\,(v_1 \times v_2), \\ S_{N_p}v_1 \times v_2 + v_1 \times S_{N_p}v_2 &= 2H(p)\,(v_1 \times v_2). \end{aligned} \]
Proof. We prove the first equation and leave the second as an easy exercise. Let $(s^i_j)$ be the matrix of $S_{N_p}$ with respect to the basis $v_1, v_2$. Then
\[ S_{N_p}v_1 \times S_{N_p}v_2 = (v_1s^1_1 + v_2s^2_1) \times (v_1s^1_2 + v_2s^2_2) = (s^1_1s^2_2 - s^2_1s^1_2)\,(v_1 \times v_2) = \det(S_{N_p})\,(v_1 \times v_2). \quad\square \]
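The two identities of Proposition 4.37 can be checked numerically on the saddle of Example 4.36. The sketch below assumes that the tangent plane at the origin is identified with the $xy$-plane of $\mathbb{R}^3$, so that the shape operator acts there through the Hessian of $f(x, y) = xy$; the helper names and the test vectors `v1`, `v2` are hypothetical.

```python
def cross(v, w):
    """Cross product in R^3."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def apply_shape(S, v):
    """Apply the 2x2 matrix S to a tangent vector v = (v1, v2, 0) at the origin."""
    return (S[0][0] * v[0] + S[0][1] * v[1],
            S[1][0] * v[0] + S[1][1] * v[1],
            0.0)

# Hessian of f(x, y) = x*y at the origin (Example 4.36).
S = [[0.0, 1.0], [1.0, 0.0]]
K = S[0][0] * S[1][1] - S[0][1] * S[1][0]   # Gauss curvature, det S
H = (S[0][0] + S[1][1]) / 2.0               # mean curvature, trace S / 2

# Two linearly independent tangent vectors at the origin (arbitrary choice).
v1 = (1.0, 2.0, 0.0)
v2 = (3.0, -1.0, 0.0)
base = cross(v1, v2)

# First identity: S v1 x S v2 = K (v1 x v2).
lhs_K = cross(apply_shape(S, v1), apply_shape(S, v2))
rhs_K = tuple(K * x for x in base)

# Second identity: S v1 x v2 + v1 x S v2 = 2 H (v1 x v2).
first = cross(apply_shape(S, v1), v2)
second = cross(v1, apply_shape(S, v2))
lhs_H = tuple(x + y for x, y in zip(first, second))
rhs_H = tuple(2.0 * H * x for x in base)
```

Both sides agree componentwise for this choice of vectors, consistent with $K = -1$ and $H = 0$ at the origin.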
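We remind the reader of the easily checked Lagrange identity:
\[ \langle v \times w, a \times b\rangle = \begin{vmatrix} \langle v, a\rangle & \langle v, b\rangle \\ \langle w, a\rangle & \langle w, b\rangle \end{vmatrix} \]
for any $a, b, v, w \in \mathbb{R}^3$ (or in any $T_p\mathbb{R}^3$). If $N$ is a unit normal field defined over an open set in the surface $M$, and if $X$ and $Y$ are linearly independent vector fields over the same domain, then we can apply the Lagrange identity and the above proposition at each point to obtain
\[ K = \frac{\begin{vmatrix} \langle S_NX, X\rangle & \langle S_NX, Y\rangle \\ \langle S_NY, X\rangle & \langle S_NY, Y\rangle \end{vmatrix}}{\begin{vmatrix} \langle X, X\rangle & \langle X, Y\rangle \\ \langle Y, X\rangle & \langle Y, Y\rangle \end{vmatrix}} \]
and also
\[ H = \frac{1}{2}\,\frac{\begin{vmatrix} \langle S_NX, X\rangle & \langle S_NX, Y\rangle \\ \langle Y, X\rangle & \langle Y, Y\rangle \end{vmatrix} + \begin{vmatrix} \langle X, X\rangle & \langle X, Y\rangle \\ \langle S_NY, X\rangle & \langle S_NY, Y\rangle \end{vmatrix}}{\begin{vmatrix} \langle X, X\rangle & \langle X, Y\rangle \\ \langle Y, X\rangle & \langle Y, Y\rangle \end{vmatrix}}. \]
These formulas show clearly the smoothness of $K$ and $H$ over any region where a smooth unit normal is defined. Also, the formula
\[ k^{\pm} = H \pm \sqrt{H^2 - K} \]
gives the two functions $k^+$ and $k^-$ such that $k^+(p)$ and $k^-(p)$ are principal curvatures at $p$. These functions are clearly smooth on any region where $k^+ > k^-$ and are continuous on all of the surface. Furthermore, if every point of an open set in $M$ is an umbilic point, then $k(p) := k^+(p) = k^-(p)$ defines a smooth function on this open set. This continuity is important for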
obtaining some global results on compact surfaces. Notice that for a surface in $\mathbb{R}^3$ the set of nonumbilic points is exactly the set where $k^+ > k^-$, and so this is an open set.
Definition 4.38. A frame field $E_1, \dots, E_{n-1}$ on an open region in a hypersurface $M$ is called an orthonormal frame field if $E_1(p), \dots, E_{n-1}(p)$ is an orthonormal basis for $T_pM$ for each $p$ in the region. An orthonormal frame field is called a principal frame field if each $E_i(p)$ is a principal vector at each point in the region.
Theorem 4.39. Let $M$ be a surface in $\mathbb{R}^3$. If $p \in M$ is a point that is not umbilic, then there is a principal frame field defined on a neighborhood of $p$.
Proof. The set of nonumbilic points is open, and so we can start with any frame field (say a coordinate frame field) on a neighborhood of $p$. We may take this open set to be oriented by a unit normal $N$. We then apply the Gram-Schmidt orthogonalization process simultaneously over the open set to obtain a frame field $F_1, F_2$. Since $p$ is not umbilic, we can multiply by an orthogonal matrix to assure that $F_1, F_2$ are not principal at $p$ and hence not principal in a neighborhood of $p$. On this smaller neighborhood we have
\[ \begin{aligned} S_NF_1 &= aF_1 + bF_2, \\ S_NF_2 &= bF_1 + cF_2 \end{aligned} \]
for functions $a, b, c$ with $b \neq 0$. Now define $G_1, G_2$ by
\[ \begin{aligned} G_1 &= bF_1 + (k^+ - a)F_2, \\ G_2 &= (k^- - c)F_1 + bF_2 \end{aligned} \]
and check by direct computation that $S_NG_1 = k^+G_1$ and $S_NG_2 = k^-G_2$. We have used a standard linear algebra technique for changing to an eigenbasis. Since $b \neq 0$, we see that $\|G_1\|$ and $\|G_2\|$ are not zero. Finally, let $E_1 = G_1/\|G_1\|$ and $E_2 = G_2/\|G_2\|$. $\square$
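The "direct computation" in the proof above can be spot-checked numerically at a single point. In the sketch below the values of $a$, $b$, $c$ are hypothetical, vectors are stored as coefficient pairs with respect to the orthonormal frame $(F_1, F_2)$, and $k^{\pm}$ are computed from $H$ and $K$ as in the formula $k^{\pm} = H \pm \sqrt{H^2 - K}$.

```python
import math

def principal_frame_coeffs(a, b, c):
    """Given S F1 = a F1 + b F2 and S F2 = b F1 + c F2 with b != 0, return
    (k_plus, k_minus, G1, G2), where G1 and G2 are coefficient pairs
    with respect to (F1, F2)."""
    H = (a + c) / 2.0
    K = a * c - b * b
    root = math.sqrt(H * H - K)
    k_plus, k_minus = H + root, H - root
    G1 = (b, k_plus - a)       # G1 = b F1 + (k+ - a) F2
    G2 = (k_minus - c, b)      # G2 = (k- - c) F1 + b F2
    return k_plus, k_minus, G1, G2

def apply_S(a, b, c, G):
    """Apply S to x F1 + y F2, using S F1 = a F1 + b F2 and S F2 = b F1 + c F2."""
    x, y = G
    return (a * x + b * y, b * x + c * y)

# Hypothetical values of a, b, c at one point, with b != 0.
a, b, c = 1.0, 2.0, -1.0
k_plus, k_minus, G1, G2 = principal_frame_coeffs(a, b, c)

SG1 = apply_S(a, b, c, G1)
SG2 = apply_S(a, b, c, G2)

# Residuals of the eigenvector equations S G1 = k+ G1 and S G2 = k- G2.
res1 = max(abs(SG1[i] - k_plus * G1[i]) for i in range(2))
res2 = max(abs(SG2[i] - k_minus * G2[i]) for i in range(2))

# G1 and G2 are orthogonal, as principal vectors for distinct curvatures must be.
inner = G1[0] * G2[0] + G1[1] * G2[1]
```

Both residuals and the inner product vanish up to rounding for this choice of values.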
4.3. The Levi-Civita Covariant Derivative
The Levi-Civita covariant derivative is studied here only in the special case of a hypersurface in $\mathbb{R}^n$. The more general case is studied in Chapters 12 and 13. We derive some central equations, which include the Gauss formula, the Gauss curvature equation and the Codazzi-Mainardi equation. Let $M$ be a hypersurface. For $X_p \in T_pM$, and $Y$ a tangent vector field on $M$ (or an open subset of $M$), define $\nabla_{X_p}Y$ by
\[ \nabla_{X_p}Y := \operatorname{proj}_{T_pM}\bar\nabla_{X_p}Y, \]
where $\operatorname{proj}_{T_pM} : T_p\mathbb{R}^n \to T_pM$ is orthogonal projection onto $T_pM$. For convenience, let us agree to denote the orthogonal projection of a vector
$v \in T_p\mathbb{R}^n$ onto $T_pM$ by $v^\top$ and the orthogonal projection onto the normal direction by $v^\perp$. We call $v^\top$ the tangent part of $v$ and $v^\perp$ the normal part. Then $\nabla_{X_p}Y = (\bar\nabla_{X_p}Y)^\top$.
Exercise 4.40. Show that for $f$ a smooth function and $X_p$ and $Y$ as above, we have $\nabla_{X_p}fY = (X_pf)Y(p) + f(p)\nabla_{X_p}Y$.
If $N$ is any unit normal field defined near $p$, then
\[ \bar\nabla_{X_p}Y = \nabla_{X_p}Y + \langle N_p, \bar\nabla_{X_p}Y\rangle N_p. \]
Since
\[ 0 = X_p\langle N, Y\rangle = \langle -S_NX_p, Y\rangle + \langle N, \bar\nabla_{X_p}Y\rangle, \]
we obtain the Gauss formula:
\[ \bar\nabla_{X_p}Y = \nabla_{X_p}Y + \langle S_NX_p, Y\rangle N_p. \tag{4.2} \]
Notice that the term $\langle S_NX_p, Y\rangle N_p$ on the right hand side of the above equation is unchanged if $N$ is replaced by $-N$. It follows that if $X$ and $Y$ are smooth tangent vector fields, then $p \mapsto \nabla_{X_p}Y$ is smooth and we may then define the field $\nabla_XY$ by $(\nabla_XY)(p) := \nabla_{X_p}Y$. By construction $(\nabla_XY)(p) = (\nabla_ZY)(p)$ if $X(p) = Z(p)$. It is also straightforward to check that the map $(X, Y) \mapsto \nabla_XY$ is $C^\infty(M)$-linear in $X$, but not in $Y$. Rather, like $\bar\nabla$, it is $\mathbb{R}$-linear in $Y$ and we have the product rule
\[ \nabla_XfY = (Xf)Y + f\nabla_XY, \]
which follows directly from Exercise 4.40. Thus $\nabla$ is a covariant derivative on $M$ and $\nabla_XY$ is defined for $X, Y \in \mathfrak{X}(M)$. We remind the reader that $\mathfrak{X}(M)$ is the space of tangent vector fields and is not to be confused with vector fields along $M$, which may not be tangent to $M$.
Let $Y$ and $Z$ be smooth tangent vector fields on $M$ and take $X_p \in T_pM$. We study the situation locally near $p$. Let $(\tilde O, \tilde u)$ be a special chart centered at $p$ and let $(O, u)$ be the chart obtained by restriction, where $O = \tilde O \cap M$ as before. If $\tilde Y$ and $\tilde Z$ are extensions of $Y$ and $Z$ to $\tilde O$, then since $S_N$ is self-adjoint, we have
\[ (\nabla_YZ - \nabla_ZY)(p) = (\bar\nabla_YZ - \bar\nabla_ZY)(p) = (\bar\nabla_{\tilde Y}\tilde Z - \bar\nabla_{\tilde Z}\tilde Y)(p) = [\tilde Y, \tilde Z]_p = [Y, Z]_p. \]
Thus $\nabla_YZ - \nabla_ZY = [Y, Z]$ for all $Y, Z \in \mathfrak{X}(M)$. This fact is expressed by saying that $\nabla$ is torsion free. Also,
\[ \begin{aligned} X_p\langle Y, Z\rangle = X_p\langle \tilde Y, \tilde Z\rangle &= \langle \bar\nabla_{X_p}\tilde Y, Z_p\rangle + \langle Y_p, \bar\nabla_{X_p}\tilde Z\rangle \\ &= \langle \nabla_{X_p}Y, Z_p\rangle + \langle Y_p, \nabla_{X_p}Z\rangle, \end{aligned} \]
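so that if $X \in \mathfrak{X}(M)$, then $X\langle Y, Z\rangle = \langle \nabla_XY, Z\rangle + \langle Y, \nabla_XZ\rangle$. We express this latter fact by saying that $\nabla$ is a metric covariant derivative on $M$. There is only one covariant derivative on $M$ that satisfies these last two properties, and it is called the Levi-Civita covariant derivative (or Levi-Civita connection). We prove the uniqueness later in this chapter.
With coordinates $u^1, \dots, u^{n-1}$ as above we have functions
\[ g_{ij} := \left\langle \frac{\partial}{\partial u^i}, \frac{\partial}{\partial u^j}\right\rangle. \]
If
\[ X = \sum_{i=1}^{n-1} X^i\frac{\partial}{\partial u^i} \quad\text{and}\quad Y = \sum_{j=1}^{n-1} Y^j\frac{\partial}{\partial u^j}, \]
then
\[ \langle X, Y\rangle = \sum_{i,j} g_{ij}X^iY^j. \]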
The length of a curve in $M$ is just the same as its length as a curve in the ambient Euclidean space. One may define a distance function on $M$ by $\operatorname{dist}(p, q) := \inf\{L(c)\}$, where the infimum is over all curves connecting $p$ and $q$. This gives $M$ a metric space structure whose topology is the same as the underlying topology. This will be proved in more generality in Chapter 13. For now, the point is that metric aspects of $M$ are determined by $\langle\cdot,\cdot\rangle$ and locally by the $g_{ij}$ associated to each chart of an atlas for $M$. For example, the length of a curve $c : [a, b] \to O \subset M$ is given by
\[ L(c) = \int_a^b \left(\sum_{i,j=1}^{n-1} g_{ij}(c(t))\frac{dc^i}{dt}\frac{dc^j}{dt}\right)^{1/2} dt, \tag{4.3} \]
where $c^i(t) := u^i \circ c$.
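Formula (4.3) is easy to evaluate numerically. The sketch below uses a midpoint rule and, purely as an assumed example, the unit sphere $S^2$ in spherical angles $(\varphi, \theta)$, for which the metric components are $g = \operatorname{diag}(1, \sin^2\varphi)$; the function names and the choice of latitude circle are hypothetical.

```python
import math

def curve_length(g, c, dc, a, b, steps=2000):
    """Midpoint-rule approximation of formula (4.3):
    L(c) = integral over [a, b] of sqrt(sum_ij g_ij(c(t)) dc^i/dt dc^j/dt) dt.
    g(u) returns the matrix (g_ij) at the chart point u, c(t) gives the chart
    coordinates of the curve, and dc(t) gives their t-derivatives."""
    h = (b - a) / steps
    total = 0.0
    for n in range(steps):
        t = a + (n + 0.5) * h
        G = g(c(t))
        v = dc(t)
        speed_sq = sum(G[i][j] * v[i] * v[j]
                       for i in range(len(v)) for j in range(len(v)))
        total += math.sqrt(speed_sq) * h
    return total

# Assumed chart: spherical angles (phi, theta) on the unit sphere,
# where the metric is g = diag(1, sin(phi)^2).
def sphere_metric(u):
    phi, _theta = u
    return [[1.0, 0.0], [0.0, math.sin(phi) ** 2]]

# Latitude circle phi = phi0, traversed once by theta = t in [0, 2*pi].
phi0 = math.pi / 3

def latitude(t):
    return (phi0, t)

def latitude_velocity(t):
    return (0.0, 1.0)

L = curve_length(sphere_metric, latitude, latitude_velocity, 0.0, 2.0 * math.pi)
```

The computed value agrees with the Euclidean circumference $2\pi\sin\varphi_0$ of this circle, as Exercise 4.41 leads one to expect.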
Exercise 4.41. Deduce the above local formula for length from the formula for the length of the curve in the ambient Euclidean space.
Returning to our covariant derivative $\nabla$, we have
\[ \nabla_{\frac{\partial}{\partial u^i}}\frac{\partial}{\partial u^j} = \sum_{k=1}^{n-1}\Gamma^k_{ij}\frac{\partial}{\partial u^k} \]
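for smooth functions $\Gamma^k_{ij}$ known as the Christoffel symbols of $\nabla$. For $X$ and $Y$ expressed as above, we have
\[ \nabla_XY = \sum_{i,j} X^i\frac{\partial Y^j}{\partial u^i}\frac{\partial}{\partial u^j} + \sum_{i,j} X^iY^j\,\nabla_{\frac{\partial}{\partial u^i}}\frac{\partial}{\partial u^j}, \]
and so we have
\[ \nabla_XY = \sum_{k=1}^{n-1}\left(\sum_{i=1}^{n-1}\frac{\partial Y^k}{\partial u^i}X^i + \sum_{i,j=1}^{n-1}\Gamma^k_{ij}X^iY^j\right)\frac{\partial}{\partial u^k}. \tag{4.4} \]
Thus the functions $\Gamma^k_{ij}$ determine $\nabla$ in the coordinate chart. Also note that
\[ 0 = \left[\frac{\partial}{\partial u^i}, \frac{\partial}{\partial u^j}\right] = \nabla_{\frac{\partial}{\partial u^i}}\frac{\partial}{\partial u^j} - \nabla_{\frac{\partial}{\partial u^j}}\frac{\partial}{\partial u^i} = \sum_k\left(\Gamma^k_{ij} - \Gamma^k_{ji}\right)\frac{\partial}{\partial u^k}. \]
It follows that
\[ \Gamma^k_{ij} = \Gamma^k_{ji} \quad\text{for all } i, j, k = 1, \dots, n-1. \tag{4.5} \]
Since $\nabla$ is a metric covariant derivative, differentiating $g_{ij} = \langle \partial/\partial u^i, \partial/\partial u^j\rangle$ gives
\[ \frac{\partial g_{ij}}{\partial u^k} = \sum_{s=1}^{n-1}\left(\Gamma^s_{ki}g_{sj} + \Gamma^s_{kj}g_{si}\right). \]
The matrix $(g_{ij})$ is invertible, and it is traditional to denote the components of the inverse by $g^{ij}$, so that $\sum_r g^{jr}g_{ri} = \delta^j_i$. One may solve to obtain
\[ \Gamma^k_{ij} = \frac{1}{2}\sum_s g^{ks}\left(\frac{\partial g_{si}}{\partial u^j} - \frac{\partial g_{ij}}{\partial u^s} + \frac{\partial g_{js}}{\partial u^i}\right). \tag{4.6} \]
Exercise 4.42. Prove formula (4.6) above. [Hint: First write the formula $\frac{\partial g_{ij}}{\partial u^k} = \sum_s\left(\Gamma^s_{ki}g_{sj} + \Gamma^s_{kj}g_{si}\right)$ two more times, but cyclically permuting $i, j, k$ to obtain three expressions. Subtract the second expression from the sum of the first and third. Use equality (4.5).]
Proposition 4.43. The Levi-Civita connection on a hypersurface is determined uniquely by the properties of being a torsion free metric connection.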
Proof. In deriving the local formula for the Christoffel symbols, we only used the fact that $\nabla$ is a torsion free metric connection. We do this again in Chapter 13 in a more satisfying way. $\square$
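Since formula (4.6) involves only the $g_{ij}$ and their first derivatives, it can be evaluated directly from the metric. The following sketch approximates the partial derivatives by central differences in a 2-dimensional chart. As an assumed test case it uses the unit sphere in spherical angles $(\varphi, \theta)$ with $g = \operatorname{diag}(1, \sin^2\varphi)$, for which a hand computation from (4.6) gives $\Gamma^{\varphi}_{\theta\theta} = -\sin\varphi\cos\varphi$ and $\Gamma^{\theta}_{\varphi\theta} = \cot\varphi$; the function names and the step size `h` are hypothetical.

```python
import math

def christoffel(g, u, h=1e-5):
    """Approximate Gamma[k][i][j] at the chart point u from formula (4.6),
    for a 2-dimensional chart. g(u) returns the 2x2 matrix (g_ij)."""
    n = 2
    # dg[s][i][j] approximates the partial derivative of g_ij with respect to u^s.
    dg = []
    for s in range(n):
        up = list(u)
        dn = list(u)
        up[s] += h
        dn[s] -= h
        Gp, Gm = g(up), g(dn)
        dg.append([[(Gp[i][j] - Gm[i][j]) / (2.0 * h) for j in range(n)]
                   for i in range(n)])
    # Inverse of the 2x2 matrix (g_ij).
    G = g(u)
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    Ginv = [[G[1][1] / det, -G[0][1] / det],
            [-G[1][0] / det, G[0][0] / det]]
    Gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # (1/2) sum_s g^{ks} (d_j g_si - d_s g_ij + d_i g_js)
                Gamma[k][i][j] = 0.5 * sum(
                    Ginv[k][s] * (dg[j][s][i] - dg[s][i][j] + dg[i][j][s])
                    for s in range(n))
    return Gamma

# Assumed chart: spherical angles (phi, theta) on the unit sphere.
def sphere_metric(u):
    phi, _theta = u
    return [[1.0, 0.0], [0.0, math.sin(phi) ** 2]]

phi0 = math.pi / 3
Gamma = christoffel(sphere_metric, [phi0, 0.4])
# Index 0 stands for phi and index 1 for theta.
```

The output also exhibits the symmetry (4.5) in the lower indices, up to the finite difference error.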
If an object is determined completely by the metric on $M$, then we say that the object is intrinsic. Equivalently, if all local coordinate expressions for an object can be written in terms of the metric coefficients $g_{ij}$ (and their derivatives, etc.), then that object is intrinsic. We have just seen that the connection $\nabla$ is intrinsic. It follows that if $f : M_1 \to M_2$ is an isometry of hypersurfaces and $\overset{1}{\nabla}$ and $\overset{2}{\nabla}$ are the respective Levi-Civita covariant derivatives, then
\[ f_*\overset{1}{\nabla}_XY = \overset{2}{\nabla}_{f_*X}f_*Y \]
for all vector fields $X, Y \in \mathfrak{X}(M_1)$. One way to see this is to examine the situation using a chart $(V, u)$ on $M_1$ and the chart $(f(V), u \circ f^{-1})$ on $M_2$. A better way is to show that $(X, Y) \mapsto (f^{-1})_*\overset{2}{\nabla}_{f_*X}f_*Y$ defines a torsion free metric connection on $M_1$ and then use Proposition 4.43. (Exercise!)
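If $Y : I \to TM$ is a vector field along a curve $c : I \to M \subset \mathbb{R}^n$, then we define
\[ \nabla_{\left.\frac{\partial}{\partial t}\right|_{t_0}}Y := \left(Y'(t_0)\right)^\top \]
for any $t_0 \in I$ and then define $\nabla_{\partial/\partial t}Y$ by
\[ \left(\nabla_{\partial/\partial t}Y\right)(t) := \nabla_{\left.\frac{\partial}{\partial t}\right|_{t}}Y \quad\text{for } t \in I. \]
Suppose that $c : I \to M$ is such that $c(J) \subset V$ for some subinterval $J \subset I$ and some chart $(V, u)$. We can then write
\[ Y(t) = \sum_{i=1}^{n-1} Y^i(t)\left.\frac{\partial}{\partial u^i}\right|_{c(t)} \quad\text{for all } t \in J. \]
Let us focus attention on a $t_0$ such that $c'(t_0) \neq 0$ and let $J$ be an open interval with $t_0 \in J$. By restricting $J$ if necessary we may also assume that $c|_J$ is an embedding. Then there is a smooth field $\tilde Y$ on a neighborhood of $c(J)$ such that $\tilde Y \circ c = Y$. In fact, a simple partition of unity argument shows that we may arrange that $\tilde Y$ be defined on all of $V$ (in fact, on all of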
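$M$). For $t \in J$ we have
\[ \begin{aligned} \left(\nabla_{\partial/\partial t}Y\right)(t) &= Y'(t)^\top = \left((\tilde Y \circ c)'(t)\right)^\top = \left(\bar\nabla_{c'(t)}\tilde Y\right)^\top = \nabla_{c'(t)}\tilde Y \\ &= \sum_{k=1}^{n-1}\left(\sum_{i=1}^{n-1}\frac{\partial \tilde Y^k}{\partial u^i}(c(t))\frac{dc^i}{dt}(t) + \sum_{i,j=1}^{n-1}\Gamma^k_{ij}(c(t))\frac{dc^i}{dt}(t)\,\tilde Y^j(c(t))\right)\left.\frac{\partial}{\partial u^k}\right|_{c(t)} \\ &= \sum_{k=1}^{n-1}\left(\frac{dY^k}{dt}(t) + \sum_{i,j=1}^{n-1}(\Gamma^k_{ij} \circ c)(t)\frac{dc^i}{dt}(t)\,Y^j(t)\right)\left.\frac{\partial}{\partial u^k}\right|_{c(t)}. \end{aligned} \]
We arrive at
\[ \nabla_{\partial/\partial t}Y = \sum_{k=1}^{n-1}\left(\frac{dY^k}{dt} + \sum_{i,j=1}^{n-1}(\Gamma^k_{ij} \circ c)\frac{dc^i}{dt}Y^j\right)\frac{\partial}{\partial u^k}. \]
We would like to argue that the above formula holds for general curves. If $c'(t_0) \neq 0$, then there is an interval around $t_0$ so that the formula holds as we have just seen. If $c'(t_0) = 0$, we consider two cases. If there exists a sequence $t_i$ converging to $t_0$ such that $c'(t_i) \neq 0$ for all $i$, then the formula holds for each $t_i$ and hence by continuity at $t_0$. Otherwise there must be an interval $J$ containing $t_0$ such that $c$ is constant on $J$, say $c(t) = p$ for all $t \in J$, and then in this interval $Y$ is just a map into the vector space $T_pM$. In this case we have
\[ \left(\nabla_{\partial/\partial t}Y\right)(t) = Y'(t)^\top = \left(\frac{d}{dt}\sum_k Y^k(t)\left.\frac{\partial}{\partial u^k}\right|_p\right)^\top = \sum_k\frac{dY^k}{dt}(t)\left.\frac{\partial}{\partial u^k}\right|_p. \]
But this agrees with the formula since $\frac{dc^i}{dt} = 0$.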
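Exercise 4.44. Show that for a vector field $X$ along $c : I \to M$ and smooth function $h \in C^\infty(I)$ we have
\[ \nabla_{\partial/\partial t}hX = h\nabla_{\partial/\partial t}X + h'X. \]
Exercise 4.45. Show that for vector fields $X, Y$ along $c$ we have
\[ \frac{d}{dt}\langle X, Y\rangle = \langle \nabla_{\partial/\partial t}X, Y\rangle + \langle X, \nabla_{\partial/\partial t}Y\rangle. \]
The operator $\nabla_{\partial/\partial t}$ involves the curve $c$ despite the fact that the latter is not indicated in the notation.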
We pause to consider again the question posed earlier about the circular arc on the unrolled cone in Figure 4.3. The key lies in the fact that the absolute geodesic curvature $|\kappa_g|$ is intrinsic. We remarked earlier that the operator $J$ is intrinsic up to sign (the latter being determined by orientation). On the other hand, in Problem 11 the reader is asked to derive the formula
\[ \kappa_g = \frac{\langle \gamma'', J\gamma'\rangle}{\|\gamma'\|^3}. \]
But
\[ |\kappa_g| = \frac{|\langle \gamma'', \pm J\gamma'\rangle|}{\|\gamma'\|^3} = \frac{|\langle \nabla_{\partial/\partial t}\gamma', \pm J\gamma'\rangle|}{\|\gamma'\|^3}, \]
and since both $\nabla$ and the pair $\pm J$ are intrinsic, we see that $|\kappa_g|$ is intrinsic. A little thought should convince the reader that if a curve in one surface is carried to a curve in another surface by an isometry, then the curves will have equal absolute geodesic curvatures at corresponding points. The unrolling of the cone in Figure 4.3 can be thought of as inducing an isometry between the cone (minus a line segment) and a region in a planar surface in $\mathbb{R}^3$. Thus we expect $|\kappa_g|$ to be the same for both curves. But $|\kappa_g|$ for a circular arc in a plane is just the reciprocal of the radius.
Definition 4.46. A tangent vector field $Y$ along a curve $c : I \to M$ is said to be parallel (in $M$) along $c$ if $\nabla_{\partial/\partial t}Y = 0$ for all $t \in I$.
If $c'$ is self-parallel, i.e. $\nabla_{\partial/\partial t}c'(t) = 0$ for all $t \in I$, then it is easy to see that $c$ is a geodesic (in $M$) and in fact, this could serve as an alternative definition of geodesic curve, which will be the basis of later generalizations. Notice that if $c$ is regarded as a curve in $\mathbb{R}^n$, then $Y$ can be considered as taking values in $T\mathbb{R}^n$. However, $Y$ being parallel in $M$ is not the same as $Y$ being parallel as a $T\mathbb{R}^n$-valued vector field along $c$. In particular, a geodesic in $M$ certainly need not be a straight line in $\mathbb{R}^n$. For example, constant speed parametrizations of great circles on $S^2 \subset \mathbb{R}^3$ are geodesics.
Exercise 4.47. Given a smooth curve $c : I \to M$, show that $c'' = 0$ if and only if $c$ is a geodesic in $M$ such that $\langle S_Nc'(t), c'(t)\rangle = 0$ for all $t$ and choice of unit normal at $c(t)$. Suppose that $c''$ is never zero. Show that $c$ is a geodesic in $M$ if and only if $c''$ is normal to $M$ (i.e. $c''(t) \perp T_{c(t)}M$ for all $t \in I$).
The following simple result follows from the preceding exercise:
Proposition 4.48. Let $M_1$ and $M_2$ be hypersurfaces in $\mathbb{R}^n$. Suppose that $c : I \to \mathbb{R}^n$ is such that $c(t) \in M_1 \cap M_2$ for all $t$ and that $c$ is a geodesic in both $M_1$ and $M_2$. If $c''$ is not zero on any subinterval of $I$, then $T_{c(t)}M_1 = T_{c(t)}M_2$ for all $t \in I$.
Proof. By Exercise 4.47,
\[ T_{c(t)}M_1 = c''(t)^\perp = T_{c(t)}M_2 \]
for all $t$ with $c''(t) \neq 0$. Such $t$ are dense in $I$, and the general case follows by continuity. $\square$
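Proposition 4.49. If vector fields $X, Y$ along a curve $c : I \to M$ are parallel, then $\langle X, Y\rangle(t) := \langle X(t), Y(t)\rangle$ is constant in $t$. In particular, a parallel vector field has constant length.
Proof. $\frac{d}{dt}\langle X, Y\rangle = \langle \nabla_{\partial/\partial t}X, Y\rangle + \langle X, \nabla_{\partial/\partial t}Y\rangle = 0$. $\square$
Corollary 4.50. A geodesic has a velocity vector of constant length.
Definition 4.51. Suppose $Y \in \mathfrak{X}(M)$ is a smooth vector field on $M$; then $Y$ is called a parallel vector field on $M$ if $\nabla_XY = 0$ at all points of $M$ and for all smooth vector fields $X \in \mathfrak{X}(M)$.
Obviously, $Y$ is a parallel field if and only if $Y \circ c$ is parallel along $c$ for all curves $c$.
We now move on to prove two basic identities and introduce the curvature tensor. It is easy to check by direct computation that for vector fields $X, Y, Z$ on an open subset of $\mathbb{R}^n$, we have
\[ \bar\nabla_X\bar\nabla_YZ - \bar\nabla_Y\bar\nabla_XZ - \bar\nabla_{[X,Y]}Z = 0. \tag{4.8} \]
If $M$ is a hypersurface and $X, Y, Z$ are tangent vector fields on a neighborhood of an arbitrary $p \in M$, then we may extend these to fields $\tilde X, \tilde Y, \tilde Z$ on a neighborhood $\tilde O$ in $\mathbb{R}^n$. Then we have
\[ \left(\bar\nabla_X\bar\nabla_YZ - \bar\nabla_Y\bar\nabla_XZ - \bar\nabla_{[X,Y]}Z\right)(p) = \left(\bar\nabla_{\tilde X}\bar\nabla_{\tilde Y}\tilde Z - \bar\nabla_{\tilde Y}\bar\nabla_{\tilde X}\tilde Z - \bar\nabla_{[\tilde X,\tilde Y]}\tilde Z\right)(p) = 0, \]
and so
\[ \bar\nabla_X\bar\nabla_YZ - \bar\nabla_Y\bar\nabla_XZ - \bar\nabla_{[X,Y]}Z = 0 \tag{4.9} \]
wherever the fields are all defined on $M$. Suppose that $N$ is a unit normal field defined on the same domain in $M$. We apply the Gauss formula (4.2) to equation (4.9) above and then decompose it into tangent and normal parts:
\[ \begin{aligned} 0 &= \bar\nabla_X\left(\nabla_YZ + \langle S_NY, Z\rangle N\right) - \bar\nabla_Y\left(\nabla_XZ + \langle S_NX, Z\rangle N\right) - \bar\nabla_{[X,Y]}Z \\ &= \nabla_X\nabla_YZ + \langle S_NX, \nabla_YZ\rangle N + X\langle S_NY, Z\rangle N - \langle S_NY, Z\rangle S_NX \\ &\quad - \nabla_Y\nabla_XZ - \langle S_NY, \nabla_XZ\rangle N - Y\langle S_NX, Z\rangle N + \langle S_NX, Z\rangle S_NY \\ &\quad - \nabla_{[X,Y]}Z - \langle S_N[X, Y], Z\rangle N. \end{aligned} \]
Equating the tangential parts of the above gives the Gauss curvature equation:
\[ \nabla_X\nabla_YZ - \nabla_Y\nabla_XZ - \nabla_{[X,Y]}Z = \langle S_NY, Z\rangle S_NX - \langle S_NX, Z\rangle S_NY. \tag{4.10} \]
The normal parts give
\[ 0 = \langle S_NX, \nabla_YZ\rangle + X\langle S_NY, Z\rangle - \langle S_NY, \nabla_XZ\rangle - Y\langle S_NX, Z\rangle - \langle S_N[X, Y], Z\rangle, \]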
or $\langle \nabla_XS_NY, Z\rangle - \langle \nabla_YS_NX, Z\rangle - \langle S_N[X, Y], Z\rangle = 0$ for all $Z$. From this we obtain the Codazzi-Mainardi equation:
\[ \nabla_XS_NY - \nabla_YS_NX - S_N[X, Y] = 0. \tag{4.11} \]
Let us give an application of the Codazzi-Mainardi equation and then return to the Gauss curvature equation.
Proposition 4.52. Let $M$ be a connected hypersurface in $\mathbb{R}^n$ oriented by a unit normal field $N$. If every point of $M$ is umbilic (i.e. $M$ is totally umbilic), then the normal curvatures are all equal and constant on $M$. Furthermore, $M$ is an open subset of a hyperplane or a sphere according to whether the normal curvatures are zero or nonzero. In particular, if $M$ is a closed subset of $\mathbb{R}^n$, then it is a sphere or a hyperplane according to whether it is compact or not.
Proof. There is a function $k$ such that $S_N = kI$, where $I$ is the identity on each tangent space. This function is smooth since $k = \frac{1}{n-1}\operatorname{trace}S_N$. Let $X_p \in T_pM$ and pick $Y_p \in T_pM$ so that $X_p$ and $Y_p$ are linearly independent. Extend these to tangent fields $X$ and $Y$ on a neighborhood of $p$. By the Codazzi-Mainardi equation we have
\[ \begin{aligned} 0 &= \nabla_XkY - \nabla_YkX - k[X, Y] \\ &= (Xk)Y + k\nabla_XY - \left((Yk)X + k\nabla_YX\right) - k[X, Y] \\ &= (Xk)Y - (Yk)X, \end{aligned} \]
where we have used $\nabla_XY - \nabla_YX = [X, Y]$. In particular, at $p$ we have $(X_pk)Y_p - (Y_pk)X_p = 0$. Since $X_p$ and $Y_p$ are linearly independent, $X_pk = 0$. Since $p$ and $X_p$ were arbitrary and $M$ is connected, we see that $k$ is in fact constant. If the constant is $k = 0$, then $S_N = 0$ on $M$, and this means that $N = N_0$ is constant along $M$. This implies that $M$ is in a hyperplane normal to $N_0$. If $k \neq 0$, then (changing $N$ to $-N$ if necessary) we may assume that $k > 0$. Define a function $f : M \to \mathbb{R}^n$ by $f(p) = p + \frac{1}{k}N(p)$. We identify tangent spaces with subspaces of $\mathbb{R}^n$ and calculate $Df(p)$. Let $v \in T_pM$ and choose a curve $c : (-a, a) \to M$ with $c'(0) = v$. Then we have
\[ Df(p) \cdot v = \left.\frac{d}{dt}\right|_{t=0} f \circ c = c'(0) + \frac{1}{k}\left.\frac{d}{dt}\right|_{t=0} N \circ c = v - \frac{1}{k}S_Nv = v - \frac{1}{k}kv = 0. \]
Thus $Df(p) = 0$ for all $p \in M$, and since $M$ is connected, $f$ is constant (Exercise 2.23). Thus $p + \frac{1}{k}N(p) = q$ for some fixed $q$ and all $p$. In other
words, all $p \in M$ are at a distance $1/k$ from $q \in \mathbb{R}^n$, and so $M$ is contained in the sphere of radius $1/k$ centered at $q$. $\square$
The left hand side of the Gauss curvature equation (4.10) is given its own notation
\[ R(X, Y)Z := \nabla_X\nabla_YZ - \nabla_Y\nabla_XZ - \nabla_{[X,Y]}Z, \]
and looking at the right side of (4.10) we see that $(R(X, Y)Z)(p)$ depends only on the values of $X, Y, Z$ at the point $p$, which means that we obtain a map $R_p : T_pM \times T_pM \times T_pM \to T_pM$ defined by the formula
\[ R_p(X_p, Y_p)Z_p := (R(X, Y)Z)(p). \]
We say that $R$ is a tensor since it is linear in each variable separately. We study tensors systematically in Chapter 7. The tensor $R$ is called the Riemannian curvature tensor and the map $p \mapsto R_p$ is smooth in the sense that if $X$, $Y$, and $Z$ are smooth tangent vector fields, then $p \mapsto R_p(X_p, Y_p)Z_p$ is a smooth vector field. We often omit the subscript $p$ and just write $R(X_p, Y_p)Z_p$. (Notice that equation (4.8) just says that the curvature of $\mathbb{R}^n$ associated to the ambient covariant derivative $\bar\nabla$ is identically zero.) Using (4.10) again, it is also easy to check that $R_p$ is linear in each slot separately on $T_pM$. In particular, for fixed $X_p, Y_p \in T_pM$ we have a linear map $R(X_p, Y_p) : T_pM \to T_pM$.
Theorem 4.53. If $M$ is a surface in $\mathbb{R}^3$ and $(X_p, Y_p)$ is an orthonormal basis for $T_pM$, then
\[ K(p) = \langle R(X_p, Y_p)Y_p, X_p\rangle. \]
Proof. Using (4.10), and abbreviating $S_{N_p}$ to $S$, we have
\[ \langle R(X_p, Y_p)Y_p, X_p\rangle = \langle SY_p, Y_p\rangle\langle SX_p, X_p\rangle - \langle SX_p, Y_p\rangle\langle SY_p, X_p\rangle = \det S = K(p). \quad\square \]
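As a quick consistency check, consider the unit sphere $S^2 \subset \mathbb{R}^3$ with the outward normal, for which Example 4.34 (with $n = 3$) gives $S = -\mathrm{id}$. The following worked computation is only a sketch; it takes $(X_p, Y_p)$ orthonormal and substitutes $Z = Y$ in the Gauss curvature equation (4.10).

```latex
\begin{aligned}
R(X_p, Y_p)Y_p
  &= \langle S Y_p, Y_p\rangle\, S X_p - \langle S X_p, Y_p\rangle\, S Y_p \\
  &= \langle -Y_p, Y_p\rangle\,(-X_p) - \langle -X_p, Y_p\rangle\,(-Y_p) \\
  &= (-1)(-X_p) - 0 = X_p, \\
\langle R(X_p, Y_p)Y_p, X_p\rangle
  &= \langle X_p, X_p\rangle = 1 = \det(-\mathrm{id}) = K(p).
\end{aligned}
```

So the curvature tensor recovers $K = 1$ on the unit sphere, in agreement with Example 4.34.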
We have shown that $\nabla$ is intrinsic and thus $R$ is also intrinsic. Thus the previous theorem implies that the Gauss curvature $K$ for a surface in $\mathbb{R}^3$ is intrinsic. This is the content of Gauss's Theorema Egregium, which we prove (again) below using local parametric notation. If $M$ is a $k$-dimensional submanifold in $\mathbb{R}^n$, and $(V, u)$ is a chart on $M$ with $U = u(V)$, then $u^{-1} : U \to M$ is a parametrization of a portion of $M$, and we denote this map by $x : U \to M$.
This notation is traditional in surface theory. Composing with the inclusion $\iota : M \hookrightarrow \mathbb{R}^n$, we obtain an immersion $\iota \circ x : U \to \mathbb{R}^n$, but we normally identify $\iota \circ x$ and $x$ when possible. One may study immersions that are not
necessarily one-to-one. A reason for this extension is that it might be the case that while $x : U \to \mathbb{R}^n$ is not one-to-one, its image is a submanifold $M$ and so we can still study $M$ via such a map. For example, consider the map $x : \mathbb{R}^2 \to \mathbb{R}^3$ given by
\[ x(u, v) = ((a + b\cos u)\cos v, (a + b\cos u)\sin v, b\sin u) \tag{4.12} \]
for $0 < b < a$. This map is periodic and its image is an embedded torus. Notice that the restrictions of $x$ to sets of the form $(u_0 - \pi/2, u_0 + \pi/2) \times (v_0 - \pi/2, v_0 + \pi/2)$ are parametrizations whose inverses are charts on the torus. Another example is the map $x : \mathbb{R}^2 \to S^2 \subset \mathbb{R}^3$ given by
\[ x(\varphi, \theta) = (\cos\theta\sin\varphi, \sin\theta\sin\varphi, \cos\varphi). \]
The restriction of this map to $(0, \pi) \times (0, 2\pi)$ parametrizes all of $S^2$ but a set of measure zero.
If $x : U \to M$ is a parametrization of a hypersurface in $\mathbb{R}^n$, then for $u \in U$, the vectors $\frac{\partial x}{\partial u^i}(u)$ are tangent to $M$ at $p = x(u)$ and in fact form the coordinate basis at $p$ in somewhat different notation. In fact, if we abuse notation and write $u^i$ for $u^i \circ x^{-1}$, we obtain a chart $(V, u)$ with $x(U) = V$. Then $\frac{\partial x}{\partial u^i}(u)$ is essentially just $\left.\frac{\partial}{\partial u^i}\right|_{x(u)}$. We have $\left\langle \frac{\partial x}{\partial u^i}, \frac{\partial x}{\partial u^j}\right\rangle = g_{ij} \circ x$, but in the current context of viewing things in terms of the parametrization, we just change our notations slightly so that $\left\langle \frac{\partial x}{\partial u^i}, \frac{\partial x}{\partial u^j}\right\rangle = g_{ij}$. These $g_{ij}$ are the components of the metric with respect to the parametrization. In these terms, some of the calculations look a bit different. For example, if $\gamma : [a, b] \to M$ is a curve whose image is in the range of $x$, then the length of the curve can be computed in terms of the $g_{ij}$, which are now functions of the parameters $u^i$. First, $\gamma$ must be of the form $t \mapsto x(u^1(t), \dots, u^{n-1}(t))$, where $u : t \mapsto (u^1(t), \dots, u^{n-1}(t))$ is a smooth curve in $U$. Then we have
\[ L(\gamma) = \int_a^b \left(\sum_{i,j=1}^{n-1} g_{ij}(u(t))\frac{du^i}{dt}\frac{du^j}{dt}\right)^{1/2} dt, \]
which is only notationally different from formula (4.3) due to our current parametric viewpoint.
Exercise 4.54. Show that there always exists a local parametrization of a hypersurface $M \subset \mathbb{R}^n$ around each of its points such that $g_{ij}(0) = \delta_{ij}$ and $\frac{\partial g_{ij}}{\partial u^k}(0) = 0$ for all $i, j, k$. [Hint: Argue that we may assume that $M$ is written as a graph of a function $f : \mathbb{R}^{n-1} \to \mathbb{R}$ with $f(0) = 0$ and $Df(0) = 0$. Then let $x : \mathbb{R}^{n-1} \to \mathbb{R}^n$ be defined by $(u^1, \dots, u^{n-1}) \mapsto (u^1, \dots, u^{n-1}, f(u^1, \dots, u^{n-1}))$.]
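Theorem 4.55 (Gauss's Theorema Egregium). Let $M$ be a surface in $\mathbb{R}^3$ and let $p \in M$. There exists a parametrization $x : U \to M$ with $x(0, 0) = p$ such that $g_{ij} = \delta_{ij}$ to first order at $0$ and for which we have
\[ K(p) = \frac{\partial^2 g_{12}}{\partial u\, \partial v}(0) - \frac{1}{2} \frac{\partial^2 g_{22}}{\partial u^2}(0) - \frac{1}{2} \frac{\partial^2 g_{11}}{\partial v^2}(0). \]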
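Proof. In the coordinates of the exercise above, which give the parametrization $(u, v) \mapsto (u, v, f(u, v))$, where $p$ is the origin of $\mathbb{R}^3$, we have
\[ \begin{bmatrix} g_{11}(u,v) & g_{12}(u,v) \\ g_{21}(u,v) & g_{22}(u,v) \end{bmatrix} = \begin{bmatrix} 1 + \bigl(\frac{\partial f}{\partial u}\bigr)^2 & \frac{\partial f}{\partial u}\frac{\partial f}{\partial v} \\ \frac{\partial f}{\partial u}\frac{\partial f}{\partial v} & 1 + \bigl(\frac{\partial f}{\partial v}\bigr)^2 \end{bmatrix}, \]
from which we find, after a bit of straightforward calculation, that
\[ \frac{\partial^2 g_{12}}{\partial u\, \partial v}(0) - \frac{1}{2} \frac{\partial^2 g_{22}}{\partial u^2}(0) - \frac{1}{2} \frac{\partial^2 g_{11}}{\partial v^2}(0) = \frac{\partial^2 f}{\partial u^2} \frac{\partial^2 f}{\partial v^2} - \Bigl( \frac{\partial^2 f}{\partial u\, \partial v} \Bigr)^2 = \det D^2 f(0) = \det S(p) = K(p). \qquad \Box \]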
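The "straightforward calculation" in the proof can be spelled out as follows. This is only a sketch, written with subscripts for partial derivatives (a shorthand not used in the text) and using nothing beyond $f_u(0) = f_v(0) = 0$ from Exercise 4.54.

```latex
% Shorthand (an assumption of this sketch): f_u = \partial f/\partial u, etc.
% All terms containing a factor f_u or f_v vanish at the origin.
\begin{align*}
g_{12} = f_u f_v
  &\;\Longrightarrow\;
  \frac{\partial^2 g_{12}}{\partial u\,\partial v}
  = f_{uuv} f_v + f_{uv}^2 + f_{uu} f_{vv} + f_u f_{uvv}
  \;\xrightarrow{\;(u,v)=0\;}\; f_{uv}^2 + f_{uu} f_{vv}, \\
g_{22} = 1 + f_v^2
  &\;\Longrightarrow\;
  \frac{1}{2}\frac{\partial^2 g_{22}}{\partial u^2}
  = f_{uv}^2 + f_v f_{uuv}
  \;\xrightarrow{\;(u,v)=0\;}\; f_{uv}^2, \\
g_{11} = 1 + f_u^2
  &\;\Longrightarrow\;
  \frac{1}{2}\frac{\partial^2 g_{11}}{\partial v^2}
  = f_{uv}^2 + f_u f_{uvv}
  \;\xrightarrow{\;(u,v)=0\;}\; f_{uv}^2 .
\end{align*}
% Combining the three lines at the origin:
\[
\bigl(f_{uv}^2 + f_{uu} f_{vv}\bigr) - f_{uv}^2 - f_{uv}^2
  = f_{uu} f_{vv} - f_{uv}^2 .
\]
```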
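Let us introduce some traditional notation. For $x : U \to M \subset \mathbb{R}^3$, and denoting coordinates in $U$ again by $(u, v)$, and in $\mathbb{R}^3$ by $(x, y, z)$, we have
\[ x_u = \Bigl( \frac{\partial x}{\partial u}, \frac{\partial y}{\partial u}, \frac{\partial z}{\partial u} \Bigr), \quad x_v = \Bigl( \frac{\partial x}{\partial v}, \frac{\partial y}{\partial v}, \frac{\partial z}{\partial v} \Bigr), \quad x_{uv} = \Bigl( \frac{\partial^2 x}{\partial u\, \partial v}, \frac{\partial^2 y}{\partial u\, \partial v}, \frac{\partial^2 z}{\partial u\, \partial v} \Bigr), \]
and so on. In this context, we always take the unit normal to be given, as a function of $u$ and $v$, by
\[ N(u, v) = \frac{x_u \times x_v}{\| x_u \times x_v \|} \quad \text{(based at } x(u, v)\text{)}. \]
A careful look at the definitions gives
\[ -\frac{\partial N}{\partial u} = S_N(x_u) \quad \text{and} \quad -\frac{\partial N}{\partial v} = S_N(x_v). \]
Recall that the second fundamental form is defined by $II(v, w) = \langle S_N v, w \rangle$, and this makes sense if the tangent vectors $v, w$ are replaced by fields along $x$ defined on the domain of $N$. The traditional notation we wish to introduce is
\[ E = \langle x_u, x_u \rangle, \quad F = \langle x_u, x_v \rangle, \quad G = \langle x_v, x_v \rangle, \]
\[ l = \langle S_N x_u, x_u \rangle, \quad m = \langle S_N x_u, x_v \rangle, \quad n = \langle S_N x_v, x_v \rangle. \]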
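Thus the matrix of the metric $\langle \cdot, \cdot \rangle$ (sometimes called the first fundamental form) with respect to $x_u, x_v$ is
\[ \begin{bmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{bmatrix} = \begin{bmatrix} E & F \\ F & G \end{bmatrix}, \]
while that of the second fundamental form $II$ is
\[ \begin{bmatrix} l & m \\ m & n \end{bmatrix}. \]
The reader can check that $\| x_u \times x_v \|^2 = EG - F^2$. The formula for the length of a curve written as $t \mapsto x(u(t), v(t))$ on the interval $[a, b]$ is
\[ \int_a^b \sqrt{ E \Bigl( \frac{du}{dt} \Bigr)^2 + 2F \frac{du}{dt} \frac{dv}{dt} + G \Bigl( \frac{dv}{dt} \Bigr)^2 }\ dt. \]
For this reason the classical notation for the metric or first fundamental form is
\[ ds^2 = E\, du^2 + 2F\, du\, dv + G\, dv^2, \]
where $ds$ is taken to be an "infinitesimal element of arc length". Consider the map $g : \mathbb{R}^3 \to (\mathbb{R}^3)^*$ given by $v \mapsto \langle v, \cdot \rangle$. With respect to the standard basis and its dual basis, the matrix for this map is $\begin{bmatrix} E & F \\ F & G \end{bmatrix}$. Similarly, we can consider the second fundamental form as a map $II : \mathbb{R}^3 \to (\mathbb{R}^3)^*$ given by $v \mapsto II(v, \cdot)$, and the matrix for this transformation is $\begin{bmatrix} l & m \\ m & n \end{bmatrix}$. Then since $II(v, w) = \langle S_N v, w \rangle$, we have $II = g \circ S_N$ and so $S_N = g^{-1} \circ II$. We conclude that $S_N$ is represented by the matrix
\[ \begin{bmatrix} E & F \\ F & G \end{bmatrix}^{-1} \begin{bmatrix} l & m \\ m & n \end{bmatrix} = \frac{1}{EG - F^2} \begin{bmatrix} Gl - Fm & Gm - Fn \\ Em - Fl & En - Fm \end{bmatrix}. \]
This matrix may not be symmetric even though the shape operator is symmetric with respect to the inner products on the tangent spaces. Taking the determinant and half the trace of this matrix we arrive at the formulas
\[ K = \frac{ln - m^2}{EG - F^2}, \qquad H = \frac{Gl + En - 2Fm}{2(EG - F^2)}. \]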
Exercise 4.56. Show that $l = \langle N, x_{uu} \rangle$, $m = \langle N, x_{uv} \rangle$ and $n = \langle N, x_{vv} \rangle$.

Exercise 4.57. Consider the surface of revolution given parametrically by
\[ x(u, v) = (g(u),\ h(u)\cos v,\ h(u)\sin v) \]
with $h > 0$. Denote the principal curvature for the meridians through a point with parameters $(u, v)$ by $k_\mu$ and that of the parallels by $k_\pi$. Show that these are functions of $u$ only given by
\[ k_\mu = - \frac{ \begin{vmatrix} g' & g'' \\ h' & h'' \end{vmatrix} }{ \bigl( (g')^2 + (h')^2 \bigr)^{3/2} }, \qquad k_\pi = \frac{g'}{ h \bigl( (g')^2 + (h')^2 \bigr)^{1/2} }. \]
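The formulas for $K$ and $H$ in terms of $E, F, G, l, m, n$ lend themselves to a numerical sanity check. The following is a minimal sketch, not part of the text: it approximates the partial derivatives by central differences, uses the identities of Exercise 4.56 for $l, m, n$, and compares the result with the expressions for $k_\mu$ and $k_\pi$ of Exercise 4.57 on the torus written as a surface of revolution. The function names (`curvatures`, `torus`, `revolution_formulas`), the radii, and the step size are all illustrative assumptions.

```python
import math


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def curvatures(x, u, v, h=1e-3):
    """Approximate (K, H) at (u, v) for a parametrization x(u, v) -> (x, y, z)."""
    # Central finite differences for first and second partial derivatives.
    p, pu, mu = x(u, v), x(u + h, v), x(u - h, v)
    pv, mv = x(u, v + h), x(u, v - h)
    xu = [(a - b) / (2 * h) for a, b in zip(pu, mu)]
    xv = [(a - b) / (2 * h) for a, b in zip(pv, mv)]
    xuu = [(a - 2 * c + b) / h**2 for a, c, b in zip(pu, p, mu)]
    xvv = [(a - 2 * c + b) / h**2 for a, c, b in zip(pv, p, mv)]
    xuv = [(a - b - c + d) / (4 * h**2)
           for a, b, c, d in zip(x(u + h, v + h), x(u + h, v - h),
                                 x(u - h, v + h), x(u - h, v - h))]
    # Unit normal N = (x_u x x_v) / ||x_u x x_v||.
    w = cross(xu, xv)
    norm = math.sqrt(dot(w, w))
    N = [c / norm for c in w]
    # First fundamental form, and second fundamental form via Exercise 4.56.
    E, F, G = dot(xu, xu), dot(xu, xv), dot(xv, xv)
    l, m, n = dot(N, xuu), dot(N, xuv), dot(N, xvv)
    det = E * G - F * F
    return (l * n - m * m) / det, (G * l + E * n - 2 * F * m) / (2 * det)


# Illustrative radii with 0 < B_RADIUS < A_RADIUS (assumed values).
A_RADIUS, B_RADIUS = 2.0, 1.0


def torus(u, v):
    """Torus as a surface of revolution: g(u) = b sin u, h(u) = a + b cos u."""
    g = B_RADIUS * math.sin(u)
    hh = A_RADIUS + B_RADIUS * math.cos(u)
    return (g, hh * math.cos(v), hh * math.sin(v))


def revolution_formulas(u):
    """k_mu and k_pi from Exercise 4.57 for the torus profile curve."""
    g1, g2 = B_RADIUS * math.cos(u), -B_RADIUS * math.sin(u)
    h0 = A_RADIUS + B_RADIUS * math.cos(u)
    h1, h2 = -B_RADIUS * math.sin(u), -B_RADIUS * math.cos(u)
    s = g1 * g1 + h1 * h1
    k_mu = -(g1 * h2 - g2 * h1) / s**1.5
    k_pi = g1 / (h0 * math.sqrt(s))
    return k_mu, k_pi


if __name__ == "__main__":
    K, H = curvatures(torus, 0.3, 1.1)
    k_mu, k_pi = revolution_formulas(0.3)
    print(K, k_mu * k_pi)
    print(H, (k_mu + k_pi) / 2)
```

Finite differences are used here only to keep the sketch dependency-free; a computer algebra system would give exact expressions instead.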
4.4. Area and Mean Curvature

In this section we give a result that provides more geometric insight into the nature of mean curvature. The basic idea is that we wish to deform a surface and keep track of how the area of the surface changes. Let $M \subset \mathbb{R}^3$ be a surface and let $x : U \to V \subset M$ be a parametrization of a portion $V$ of $M$. We suppose that $V$ has compact closure. The area of $V$ is defined by
\[ A(V) := \int_U \| x_u \times x_v \| \, du\, dv. \]
The total area of $M$ (if it is finite) can be obtained by breaking $M$ up into pieces of this sort whose closures only overlap in sets of measure zero. However, our current study is local and it suffices to consider the areas of small pieces of $M$ as above. Suppose $x : U \times (-\epsilon, \epsilon) \to \mathbb{R}^3$ is a smooth map such that for each fixed $t \in (-\epsilon, \epsilon)$, the partial map $x(\cdot, \cdot, t) : (u, v) \mapsto x(u, v, t)$ is an embedding and such that $\frac{\partial x}{\partial t}$ is normal to $x_u$ and $x_v$ for all $t$. For each $t$, the image $V_t = x(U, t)$ is a surface. We have in mind the case where $x(\cdot, \cdot, 0)$ is a parametrization of a portion of a given surface $M$ so that $V_0 = V \subset M$ (see Figure 4.4). The normal $N = x_u \times x_v / \| x_u \times x_v \|$ depends on $t$ and at time $t$ provides a unit normal to the surface $V_t$. Thus $V_t$ is a one parameter family of surfaces.

Theorem 4.58. Let $x : U \times (-\epsilon, \epsilon) \to \mathbb{R}^3$ and let $V_t = x(U, t)$ be as above so that $\frac{\partial x}{\partial t}$ is normal to the surface $V_t$. Let $H(t)$ denote the mean curvature of the surface $V_t$. Then
\[ \frac{d}{dt} A(V_t) = -2 \int_{V_t} \Bigl\| \frac{\partial x}{\partial t} \Bigr\| H(t)\, dA. \]
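Figure 4.4. Deformation of a patch

Proof. Since $\frac{\partial x}{\partial t}$ is parallel to $N$, we have $\frac{\partial x}{\partial t} = \bigl\| \frac{\partial x}{\partial t} \bigr\| N$. Thus
\[ \frac{\partial x_u}{\partial t} = \frac{\partial}{\partial u} \frac{\partial x}{\partial t} = \frac{\partial}{\partial u} \Bigl( \Bigl\| \frac{\partial x}{\partial t} \Bigr\| N \Bigr) = - \Bigl\| \frac{\partial x}{\partial t} \Bigr\| S_N(x_u) + \Bigl( \frac{\partial}{\partial u} \Bigl\| \frac{\partial x}{\partial t} \Bigr\| \Bigr) N, \]
and so
\[ \Bigl\langle N, \frac{\partial x_u}{\partial t} \times x_v \Bigr\rangle = - \Bigl\| \frac{\partial x}{\partial t} \Bigr\| \langle N, S_N(x_u) \times x_v \rangle. \]
Similarly,
\[ \Bigl\langle N, x_u \times \frac{\partial x_v}{\partial t} \Bigr\rangle = - \Bigl\| \frac{\partial x}{\partial t} \Bigr\| \langle N, x_u \times S_N(x_v) \rangle. \]
Now we calculate using the second formula of Proposition 4.37. The integral containing $\frac{dN}{dt}$ vanishes because $\frac{dN}{dt}$ is orthogonal to $N$ while $x_u \times x_v$ is parallel to $N$:
\begin{align*}
\frac{d}{dt} A &= \frac{d}{dt} \int \| x_u \times x_v \| \, du\, dv = \frac{d}{dt} \int \langle N, x_u \times x_v \rangle \, du\, dv \\
&= \int \Bigl\langle \frac{dN}{dt}, x_u \times x_v \Bigr\rangle du\, dv + \int \Bigl\langle N, \frac{\partial x_u}{\partial t} \times x_v + x_u \times \frac{\partial x_v}{\partial t} \Bigr\rangle du\, dv \\
&= \int \Bigl\langle N, \frac{\partial x_u}{\partial t} \times x_v + x_u \times \frac{\partial x_v}{\partial t} \Bigr\rangle du\, dv \\
&= - \int \Bigl\| \frac{\partial x}{\partial t} \Bigr\| \bigl\langle N, S_N(x_u) \times x_v + x_u \times S_N(x_v) \bigr\rangle du\, dv \\
&= -2 \int \Bigl\| \frac{\partial x}{\partial t} \Bigr\| H \, \| x_u \times x_v \| \, du\, dv = -2 \int \Bigl\| \frac{\partial x}{\partial t} \Bigr\| H \, dA. \qquad \Box
\end{align*}
In particular, we can arrange that $\bigl\| \frac{\partial x}{\partial t} \bigr\| = 1$ at time $t = 0$, and then we have
\[ \left. \frac{d}{dt} \right|_{t=0} A = -2 \int H \, dA. \]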
Thus, $H$ is a measure of the rate of change in area under perturbations of the surface.

Definition 4.59. A hypersurface in $\mathbb{R}^n$ for which $H$ is identically zero is called a minimal hypersurface (or minimal surface if $n = 3$ so that $\dim M = 2$).
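As a quick sanity check of Theorem 4.58, one can test it on concentric spheres, where both sides are known in closed form. The sketch below is an illustration rather than part of the text; it assumes the sign convention $S_N(x_u) = -\partial N / \partial u$ used in the proof, so that a sphere of radius $\rho$ with outward normal has $H = -1/\rho$.

```latex
% Assumed setup for this sketch: r > 0 fixed, (u, v) = (\varphi, \theta).
\[
x(\varphi, \theta, t) = (r + t)\,(\cos\theta \sin\varphi,\ \sin\theta \sin\varphi,\ \cos\varphi),
\qquad
\frac{\partial x}{\partial t} = N \ \text{(outward unit normal)},
\qquad
\Bigl\| \frac{\partial x}{\partial t} \Bigr\| = 1 .
\]
% Left side: differentiate the area of the sphere of radius r + t.
\[
A(V_t) = 4\pi (r + t)^2
\quad\Longrightarrow\quad
\frac{d}{dt} A(V_t) = 8\pi (r + t).
\]
% Right side: N = x/(r+t) gives S_N = -\tfrac{1}{r+t}\,\mathrm{id}, hence H(t) = -\tfrac{1}{r+t}.
\[
-2 \int_{V_t} \Bigl\| \frac{\partial x}{\partial t} \Bigr\| H(t)\, dA
= \frac{2}{r + t} \cdot 4\pi (r + t)^2
= 8\pi (r + t).
\]
```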
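Example 4.60. For $c > 0$, the catenoid is parametrized by
\[ x(u, v) = (u,\ c\cosh(u/c)\cos v,\ c\cosh(u/c)\sin v) \]
and is a minimal surface. Indeed, a straightforward calculation gives
\[ \begin{bmatrix} E & F \\ F & G \end{bmatrix} = \begin{bmatrix} \cosh^2(u/c) & 0 \\ 0 & c^2 \cosh^2(u/c) \end{bmatrix}, \qquad \begin{bmatrix} l & m \\ m & n \end{bmatrix} = \begin{bmatrix} -1/c & 0 \\ 0 & c \end{bmatrix}. \]
It follows that $H = 0$.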
Example 4.61. For each $b \neq 0$, the map $x(u, v) = (bv,\ u\cos v,\ u\sin v)$ is a parametrization of a surface called a helicoid and this is also a minimal surface. The reader may enjoy plotting this and other surfaces using a computer algebra system such as Maple or Mathematica.
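The claim that the helicoid is minimal can be checked directly from the formula $H = (Gl + En - 2Fm)/(2(EG - F^2))$. The following worked computation is a sketch under the conventions above, with $l, m, n$ computed as in Exercise 4.56.

```latex
\begin{align*}
x_u &= (0,\ \cos v,\ \sin v), &
x_v &= (b,\ -u \sin v,\ u \cos v), \\
E &= 1, \quad F = 0, \quad G = b^2 + u^2, &
x_u \times x_v &= (u,\ b \sin v,\ -b \cos v), \\
N &= \frac{(u,\ b \sin v,\ -b \cos v)}{\sqrt{u^2 + b^2}}, &
x_{uu} &= 0, \quad x_{uv} = (0,\ -\sin v,\ \cos v), \quad x_{vv} = (0,\ -u \cos v,\ -u \sin v), \\
l &= \langle N, x_{uu} \rangle = 0, &
n &= \langle N, x_{vv} \rangle = \frac{-ub \sin v \cos v + ub \cos v \sin v}{\sqrt{u^2 + b^2}} = 0, \\
m &= \langle N, x_{uv} \rangle = \frac{-b}{\sqrt{u^2 + b^2}} .
\end{align*}
% With l = n = 0 and F = 0 the numerator of H vanishes identically:
\[
H = \frac{G l + E n - 2 F m}{2 (EG - F^2)} = 0,
\qquad
K = \frac{l n - m^2}{EG - F^2} = -\frac{b^2}{(u^2 + b^2)^2} .
\]
```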
4.5. More on Gauss Curvature

In this section we construct surfaces of revolution with prescribed Gauss curvature and also prove that an oriented compact surface of constant curvature must be a sphere. Let
\[ x(u, v) = (g(u),\ h(u)\cos v,\ h(u)\sin v) \]
be a parametrization of a surface of revolution. By a reparametrization of the profile curve $(g(u), h(u))$ we may assume that it is a unit speed curve so that $(g')^2 + (h')^2 = 1$. When this is done, we say that we have a canonical parametrization of the surface of revolution. A straightforward calculation using the results of Exercise 4.57 shows that for any surface of revolution given as above we have
\[ K = - \frac{ g' \begin{vmatrix} g' & g'' \\ h' & h'' \end{vmatrix} }{ h \bigl( (g')^2 + (h')^2 \bigr)^2 }. \]
If we assume that it is a canonical parametrization, then we have
\[ K = \frac{ -(g')^2 h'' + g' g'' h' }{ h }. \]
On the other hand, differentiation of $(g')^2 + (h')^2 = 1$ leads to $g' g'' = -h' h''$, and so we arrive at
\[ K = \frac{-h''}{h}, \]
which shows the expected result that $K$ is constant on parallels (curves along which $u$ is constant). Now suppose we are given a smooth function $K$ defined on some interval $I$, which we may as well assume to contain $0$.
We would like $K$ to be our Gauss curvature and so we wish to solve the differential equation $h'' + Kh = 0$ subject to $h(0) > 0$ and $|h'(0)| < 1$. The first condition simplifies the analysis, while the second condition allows us to obtain the canonical situation $(g')^2 + (h')^2 = 1$. In fact, we let
\[ g(u) = \int_0^u \sqrt{1 - (h'(t))^2}\ dt, \]
and this will give the desired solution defined on the largest open subinterval $J$ of $I$ such that $h > 0$ and $|h'| < 1$. With this solution, our surface of revolution will be defined, but only for $u \in J$. Suppose we try to obtain a surface of revolution with constant positive Gauss curvature $K = 1/c^2$ for some constant $c$. Then a solution of the equation for $h$ will be $h(u) = a\cos(u/c)$ for an appropriate $a > 0$. Then
\[ g(u) = \int_0^u \sqrt{1 - \frac{a^2}{c^2} \sin^2(t/c)}\ dt, \]
and with the resulting profile curve, we obtain a surface of revolution with Gauss curvature $1/c^2$ at all points. There are three cases to consider. First, if $a = c$, then the interval $J$ on which $h > 0$ and $|h'| < 1$ is easily seen to be $(-\pi c/2, \pi c/2)$ and we have
\[ h(u) = c\cos(u/c) \quad \text{and} \quad g(u) = c\sin(u/c). \]
This gives a semicircle which revolves to make a sphere minus the two points on the axis of revolution. We already know that these two points can be added to give the sphere of radius $c = 1/\sqrt{K}$, which is a compact surface of constant positive Gauss curvature $K$. It turns out that the spheres are the only compact surfaces (without boundary) of constant positive Gauss curvature.

Now consider the case $0 < a < c$. The interval $J$ is the same as before, but the surface extends in the $x$ direction between $x_1 = \lim_{u \to -\pi c/2} g(u)$ and $x_2 = \lim_{u \to \pi c/2} g(u)$. It is easy to show that $x_1 < -a$ and $x_2 = -x_1 > a$. Since the maximum value of $h$ is now smaller than $c$, the profile curve is shallower and wider than the semicircle of the $a = c$ case above. Although $\lim_{u \to \pm\pi c/2} h(u) = 0$ as before, we now have the profile curve tangents at the endpoints given by
\[ \lim_{u \to \pm\pi c/2} (g'(u), h'(u)) = \lim_{u \to \pm\pi c/2} \Bigl( \sqrt{1 - \frac{a^2}{c^2} \sin^2(u/c)},\ -\frac{a}{c} \sin(u/c) \Bigr) = \Bigl( \sqrt{1 - \frac{a^2}{c^2}},\ \mp\frac{a}{c} \Bigr). \]
This shows that the revolved surface is pointed at its extremes and forms an American football shape. In this case, there is no way to add in the missing points on the axis of revolution to obtain a smooth surface. The two principal curvatures are no longer equal, but their product is still $1/c^2$.
The third case $a > c$ gives a surface that has an extension to a surface with boundary.

Exercise 4.62. Analyze the case $a > c$.
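The construction above (solve $h'' + Kh = 0$, then integrate $g' = \sqrt{1 - (h')^2}$) can be tried out numerically. Below is a minimal sketch that is not from the text: it integrates both equations with a classical fourth-order Runge-Kutta step and compares against the closed forms $h(u) = a\cos(u/c)$ and, in the case $a = c$, $g(u) = c\sin(u/c)$. The name `profile_curve`, its parameters, and the step count are hypothetical choices made for illustration.

```python
import math


def profile_curve(K, h0, dh0, u_end, steps=2000):
    """Integrate h'' + K(u) h = 0 and g' = sqrt(1 - (h')^2) from u = 0 to u_end.

    Returns (h(u_end), h'(u_end), g(u_end)) with h(0) = h0, h'(0) = dh0, g(0) = 0.
    Only meaningful while h > 0 and |h'| < 1, as required in the text.
    """
    def rhs(u, y):
        h, dh, _g = y
        # max(...) guards against tiny negative values caused by rounding.
        return (dh, -K(u) * h, math.sqrt(max(0.0, 1.0 - dh * dh)))

    def shifted(y, k, s):
        return tuple(a + s * b for a, b in zip(y, k))

    y = (h0, dh0, 0.0)
    u = 0.0
    step = u_end / steps
    for _ in range(steps):
        # Classical RK4 step.
        k1 = rhs(u, y)
        k2 = rhs(u + step / 2, shifted(y, k1, step / 2))
        k3 = rhs(u + step / 2, shifted(y, k2, step / 2))
        k4 = rhs(u + step, shifted(y, k3, step))
        y = tuple(a + step / 6 * (p + 2 * q + 2 * r + s)
                  for a, p, q, r, s in zip(y, k1, k2, k3, k4))
        u += step
    return y


if __name__ == "__main__":
    c = 2.0
    # Football-shaped case 0 < a < c.
    h, dh, g = profile_curve(lambda u: 1 / c**2, 1.0, 0.0, 1.0)
    print(h, math.cos(1.0 / c))
    # Spherical case a = c.
    h, dh, g = profile_curve(lambda u: 1 / c**2, c, 0.0, 1.0)
    print(g, c * math.sin(1.0 / c))
```

Passing a non-constant function for `K` gives a profile curve for a prescribed, varying Gauss curvature, subject to the same restriction to the interval where $h > 0$ and $|h'| < 1$.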
From above we see that there is an infinite family of surfaces with constant curvature (only one extending to a compact surface). The situation for constant negative curvature is similar, but we obtain no compact surfaces. One particular constant negative curvature surface is of special interest:

Example 4.63 (Bugle surface). This surface has Gauss curvature $-1/c^2$ and is given by
\[ x(u, v) = (u,\ h(u)\cos v,\ h(u)\sin v), \]
where $h$ is the solution of the differential equation
\[ h' = \frac{-h}{\sqrt{c^2 - h^2}}, \]
subject to initial condition $\lim_{u \to 0} h(u) = c$. The function $h$ is defined on $(0, \infty)$.

Lemma 4.64. Let $M$ be a surface in $\mathbb{R}^3$. Let $p \in M$ be a nonumbilic point and $E_1, E_2$ a principal frame on a neighborhood of $p$ (oriented by $N$) so that $S_N E_1 = k^+ E_1$ and $S_N E_2 = k^- E_2$. If we define functions
\[ h_1 = \frac{-E_2 k^+}{k^+ - k^-} \quad \text{and} \quad h_2 = \frac{E_1 k^-}{k^+ - k^-}, \]
then
\[ K = -E_1 h_2 - E_2 h_1 - h_1^2 - h_2^2. \]

Proof. Since $\langle E_2, E_2 \rangle = 1$, we have $\langle \nabla_{E_1} E_2, E_2 \rangle = 0$ and so there is some function $h_1$ such that $\nabla_{E_1} E_2 = h_1 E_1$. Similarly, $\nabla_{E_2} E_1 = h_2 E_2$ for some function $h_2$. We find formulas for each of these functions. We have
\[ 0 = E_1 \langle E_1, E_1 \rangle = 2 \langle \nabla_{E_1} E_1, E_1 \rangle \]
and
\[ 0 = E_1 \langle E_1, E_2 \rangle = \langle \nabla_{E_1} E_1, E_2 \rangle + \langle E_1, \nabla_{E_1} E_2 \rangle, \]
from which it follows that $\nabla_{E_1} E_1 = -h_1 E_2$. Similarly, $\nabla_{E_2} E_2 = -h_2 E_1$. We have
\[ [E_1, E_2] = \nabla_{E_1} E_2 - \nabla_{E_2} E_1 = h_1 E_1 - h_2 E_2. \]
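We now apply the Codazzi-Mainardi equations (4.11):
\begin{align*}
0 &= \nabla_{E_1} S_N E_2 - \nabla_{E_2} S_N E_1 - S_N [E_1, E_2] \\
&= \nabla_{E_1} (k^- E_2) - \nabla_{E_2} (k^+ E_1) - S_N (h_1 E_1 - h_2 E_2) \\
&= \bigl( (E_1 k^-) E_2 + k^- \nabla_{E_1} E_2 \bigr) - \bigl( (E_2 k^+) E_1 + k^+ \nabla_{E_2} E_1 \bigr) - (h_1 k^+ E_1 - h_2 k^- E_2) \\
&= \bigl( (E_1 k^-) E_2 + k^- h_1 E_1 \bigr) - \bigl( (E_2 k^+) E_1 + k^+ h_2 E_2 \bigr) - (h_1 k^+ E_1 - h_2 k^- E_2) \\
&= (k^- h_1 - E_2 k^+ - h_1 k^+) E_1 + (E_1 k^- + h_2 k^- - k^+ h_2) E_2.
\end{align*}
Setting the coefficients of $E_1$ and $E_2$ equal to zero we obtain
\[ h_1 = \frac{-E_2 k^+}{k^+ - k^-} \quad \text{and} \quad h_2 = \frac{E_1 k^-}{k^+ - k^-}. \]
We compute as follows:
\begin{align*}
R(E_1, E_2) E_2 &= \nabla_{E_1} \nabla_{E_2} E_2 - \nabla_{E_2} \nabla_{E_1} E_2 - \nabla_{[E_1, E_2]} E_2 \\
&= \nabla_{E_1} (-h_2 E_1) - \nabla_{E_2} (h_1 E_1) - \nabla_{(h_1 E_1 - h_2 E_2)} E_2 \\
&= -(E_1 h_2) E_1 - h_2 \nabla_{E_1} E_1 - (E_2 h_1) E_1 - h_1 \nabla_{E_2} E_1 - h_1 \nabla_{E_1} E_2 + h_2 \nabla_{E_2} E_2 \\
&= -(E_1 h_2) E_1 - (E_2 h_1) E_1 - h_2 (-h_1 E_2) - h_1 (h_2 E_2) - h_1 (h_1 E_1) + h_2 (-h_2 E_1) \\
&= -(E_1 h_2) E_1 - (E_2 h_1) E_1 - h_1^2 E_1 - h_2^2 E_1.
\end{align*}
Using the Gauss curvature equation (4.10) we arrive at
\[ K = \langle R(E_1, E_2) E_2, E_1 \rangle = -E_1 h_2 - E_2 h_1 - h_1^2 - h_2^2. \tag{4.13} \]
$\Box$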
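Corollary 4.65. If $p$ is a nonumbilic point for which both $k^+$ and $k^-$ are critical, then
\[ K(p) = \frac{E_2^2 k^+ - E_1^2 k^-}{k^+ - k^-}(p). \]

Proof. Using equation (4.13), we have
\begin{align*}
K &= -E_1 h_2 - E_2 h_1 - h_1^2 - h_2^2 \\
&= -E_1 \Bigl( \frac{E_1 k^-}{k^+ - k^-} \Bigr) - E_2 \Bigl( \frac{-E_2 k^+}{k^+ - k^-} \Bigr) - \Bigl( \frac{-E_2 k^+}{k^+ - k^-} \Bigr)^2 - \Bigl( \frac{E_1 k^-}{k^+ - k^-} \Bigr)^2.
\end{align*}
At $p$ we have $E_2 k^+ = E_1 k^- = 0$, and so (suppressing evaluations at $p$) we obtain
\[ K(p) = \frac{ -(k^+ - k^-) E_1^2 k^- + (k^+ - k^-) E_2^2 k^+ }{ (k^+ - k^-)^2 } = \frac{E_2^2 k^+ - E_1^2 k^-}{k^+ - k^-}. \qquad \Box \]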
Corollary 4.66 (Hilbert). Let $M \subset \mathbb{R}^3$ be a surface. Suppose that $K$ is a positive constant on $M$. Then $k^+$ cannot have a relative maximum at a nonumbilic point. Similarly, $k^-$ cannot have a relative minimum at a nonumbilic point.

Proof. Since $K = k^+ k^- > 0$ is constant, $k^+$ has a relative maximum exactly when $k^-$ has a relative minimum, so the second statement follows from the first. For the first statement, suppose that $k^+$ has a relative maximum at a nonumbilic point $p$. Notice that $X^2 k^+ \le 0$ and $X^2 k^- \ge 0$ at $p$ for any tangent vector field $X$ defined near $p$. Near $p$, there is a principal frame as above, and if we use Corollary 4.65, we have
\[ K(p) = \frac{E_2^2 k^+ - E_1^2 k^-}{k^+ - k^-}(p) \le 0, \]
contradicting our assumption about $K$. $\Box$

We are now able to use the above technical results to obtain a nice theorem.

Theorem 4.67. If $M \subset \mathbb{R}^3$ is an oriented connected compact surface with constant positive Gauss curvature $K$, then $M$ is a sphere of radius $1/\sqrt{K}$.

Proof. The hypotheses give us the existence of a global unit normal field $N$. Note that $k^+ \ge \sqrt{K}$ at every point, and since $M$ is compact, $k^+$ must have an absolute maximum at some point $p$. By Corollary 4.66, $p$ is an umbilic point and so $k^+(p) = k^-(p)$. But then $(k^+(p))^2 = k^+(p) k^-(p) = K$, and thus the maximum value of $k^+$ is $\sqrt{K}$. We have both $k^+ \ge \sqrt{K}$ and $k^+ \le \sqrt{K}$ on all of $M$, and so $k^+ = k^- = \sqrt{K}$ everywhere ($M$ is totally umbilic). By Proposition 4.52 we see that $M$ is a sphere of radius $1/\sqrt{K}$. $\Box$
We already know from our surface of revolution examples that there are many noncompact surfaces of constant positive Gauss curvature, but now we see that the sphere is the only compact example. Actually, more is true: If a surface of constant positive Gauss curvature is a closed subset of $\mathbb{R}^3$, then it is a sphere. This follows from a theorem of Myers (see Theorem 13.143 in Chapter 13).
4.6. Gauss Curvature Heuristics

The reader may be left wondering about the geometric meaning of the Gauss curvature. We will learn more about the Gauss curvature in Section 9.9 and much more about curvature in general in Chapter 13. For now, we will simply pursue an informal understanding of the Gauss curvature.

Figure 4.5. Curvature bends geodesics (panels: negative curvature, zero curvature, positive curvature)
On a plane, geodesics are straight lines parametrized with constant speed. If two insects start off in parallel directions and maintain a policy of not turning either left or right, then they will travel on straight lines, and if their speeds are the same, then the distance between them remains constant. On a sphere or on any surface with positive curvature, the situation is different. In this case, geodesics tend to curve toward each other. Particles (or insects) moving along geodesics that start out near each other and roughly parallel will bend toward each other if they travel at the same speed. The second derivative of the distance between them will be negative. For example, two airplanes traveling due north from the equator at constant speed and altitude will be drawn closer and, if they continue, will eventually meet at the north pole. It is the curvature of the earth that "pulls" them together. On a surface of negative curvature, initially parallel motions along geodesics will bend away from each other. The second derivative of the distance between them will be positive. These three situations are depicted in Figure 4.5. For a more precise formulation of these ideas it is best to consider a parametrized family of curves, and we will do this in Chapter 13. It should be mentioned that according to Einstein's theory, it is the curvature of spacetime that accounts for those aspects of gravity (such as tidal forces) that cannot be nullified by a choice of frame (accelerating frames cause gravity-like effects even in a flat spacetime). For example, an initially spherical cluster of particles in free fall near the earth will be deformed into an egg shape. For a wonderful popular account of gravity as curvature, see [Wh].

Another way to see the effects of curvature is by considering triangles on surfaces whose sides are geodesic segments. These geodesic triangles are affected by the Gauss curvature. For instance, consider the geodesic triangle on the sphere in Figure 4.6. The angles shown are actually measured in the tangent spaces at each point based on the tangents of the curves at the endpoints of each segment. We have the following Gauss-Bonnet formula involving the interior angles:
\[ \beta_1 + \beta_2 + \beta_3 = \pi + \int_D K \, dS, \]
where $D$ is the region interior to the geodesic triangle.

Figure 4.6. Curvature and triangle

This formula is true for geodesic triangles on any surface $M$. For a sphere of radius $a$, this becomes $\beta_1 + \beta_2 + \beta_3 = \pi + A/a^2$, where $A$ is the area of the region interior to the geodesic triangle. The fact that the sum of the interior angles for a triangle in a plane is equal to $\pi$ regardless of the area inside the triangle is exactly due to the fact that a plane has zero Gauss curvature. If one starts at $p$ and moves counterclockwise around the triangle, then at the corners one must turn through angles $\epsilon_1, \epsilon_2, \epsilon_3$. In terms of these turning angles, the statement is
\[ \sum_{i=1}^{3} \epsilon_i = 2\pi - \int_D K \, dS. \]
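The sphere case of the interior angle formula can be checked numerically on the unit sphere, where $K = 1$ and $\int_D K\, dS$ is just the area of the triangle. The sketch below is illustrative only: the interior angles are measured between tangent vectors, as described above, and the area is computed independently from the Van Oosterom-Strackee solid angle formula, which is a standard result not taken from the text. All function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(a):
    n = math.sqrt(dot(a, a))
    return tuple(c / n for c in a)


def interior_angle(p, q, r):
    """Angle at vertex p between the great-circle arcs from p to q and p to r."""
    # Tangent directions at p: project q and r onto the tangent plane at p.
    tq = normalize(tuple(qi - dot(p, q) * pi for qi, pi in zip(q, p)))
    tr = normalize(tuple(ri - dot(p, r) * pi for ri, pi in zip(r, p)))
    return math.acos(max(-1.0, min(1.0, dot(tq, tr))))


def triangle_area(a, b, c):
    """Area of the geodesic triangle with unit-vector vertices a, b, c.

    Uses tan(area / 2) = |a . (b x c)| / (1 + a.b + b.c + c.a).
    """
    num = abs(dot(a, cross(b, c)))
    den = 1.0 + dot(a, b) + dot(b, c) + dot(c, a)
    return 2.0 * math.atan2(num, den)


def angle_excess(a, b, c):
    """beta_1 + beta_2 + beta_3 - pi for the geodesic triangle abc."""
    total = (interior_angle(a, b, c) + interior_angle(b, c, a)
             + interior_angle(c, a, b))
    return total - math.pi


if __name__ == "__main__":
    # Octant triangle: three right angles.
    a, b, c = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
    print(angle_excess(a, b, c), triangle_area(a, b, c))
```

For a sphere of radius $a$ the area scales by $a^2$ and $K$ by $1/a^2$, so the comparison is unchanged.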
By an argument that involves triangulating a surface, it can be shown that if one integrates the Gauss curvature over a whole surface (without boundary) $M \subset \mathbb{R}^3$, then something amazing occurs. We obtain
\[ \int_M K \, dS = 2\pi \chi(M). \]
This result is called the Gauss-Bonnet theorem. Here $\chi(M)$ is a topological invariant called the Euler characteristic of the surface and is equal to $2 - 2g$, where $g$ is the genus of the surface (see [Arm]). Thus while the left hand side of the above equation involves the shape dependent curvature $K$, the right hand side is a purely topological invariant! A presentation of a very general version of the Gauss-Bonnet theorem may be found in [Poor] (also see [Lee, Jeff]).
Problems

(1) Let $\gamma : I \to \mathbb{R}^2$ be a regular plane curve. Let $J : \mathbb{R}^2 \to \mathbb{R}^2$ be the rotation $(x, y) \mapsto (-y, x)$. Show that the signed curvature function
\[ \kappa_2(t) := \frac{\gamma''(t) \cdot J\gamma'(t)}{\| \gamma'(t) \|^3} \]
determines $\gamma$ up to reparametrization and Euclidean motion.

(2) Let $f, g : (a, b) \to \mathbb{R}$ be differentiable functions with $f^2 + g^2 = 1$. Let $t_0 \in (a, b)$ and suppose that $f(t_0) = \cos\theta_0$ and $g(t_0) = \sin\theta_0$ for some $\theta_0 \in \mathbb{R}$. Show that there is a unique continuous function $\theta : (a, b) \to \mathbb{R}$ such that $\theta(t_0) = \theta_0$ and
\[ f(t) = \cos\theta(t), \quad g(t) = \sin\theta(t) \quad \text{for } t \in (a, b). \]

(3) Let $\gamma : (a, b) \to \mathbb{R}^2$ be a regular plane curve.
(a) Given $t_0 \in (a, b)$ and $\theta_0$ with
\[ \frac{\gamma'(t_0)}{\| \gamma'(t_0) \|} = (\cos\theta_0, \sin\theta_0), \]
show that there is a unique continuous turning angle function $\theta_\gamma : (a, b) \to \mathbb{R}$ such that $\theta_\gamma(t_0) = \theta_0$ and
\[ \frac{\gamma'(t)}{\| \gamma'(t) \|} = (\cos\theta_\gamma(t), \sin\theta_\gamma(t)) \quad \text{for } t \in (a, b). \]
(b) Show that $\theta_\gamma'(t) = \| \gamma'(t) \| \, \kappa_2(t)$, where $\kappa_2$ is as in Problem 1.

(4) Show that if $\kappa_2$ for a curve $\gamma : (a, b) \to \mathbb{R}^2$ is $1/r$, then $\gamma$ parametrizes a portion of a circle of radius $r$.

(5) Let $\gamma : \mathbb{R} \to \mathbb{R}^3$ be the elliptical helix given by $t \mapsto (a\cos t, b\sin t, ct)$.
(a) Calculate the torsion and curvature of $\gamma$.
(b) Define a map $F : \mathbb{R} \to SO(3)$ by letting $F$ have columns given by the Frenet frame of $\gamma$. Show that $F$ is a periodic parametrization of a closed curve in $SO(3)$.

(6) Calculate the curvature and torsion for the twisted cubic $\gamma(t) := (t, t^2, t^3)$. Examine the behavior of the curvature, torsion, and Frenet frame as $t \to \pm\infty$.

(7) Find a unit speed parametrization of the catenary curve given by $c(t) := (a\cosh(t/a), t)$. Revolve the resulting profile curve to obtain a canonical parametrization of a catenoid and find the Gauss curvature in these terms.
(8) (Four vertex theorem) Show that the signed curvature function $\kappa_2$ for a simple, closed plane curve is either constant or has at least two local maxima and at least two local minima.

(9) If $\gamma : I \to \mathbb{R}^3$ is a regular space curve (not necessarily unit speed), then show that
\[ B(t) = \frac{\gamma' \times \gamma''}{\| \gamma' \times \gamma'' \|}, \quad \tau = \frac{(\gamma' \times \gamma'') \cdot \gamma'''}{\| \gamma' \times \gamma'' \|^2} \quad \text{and} \quad \kappa = \frac{\| \gamma' \times \gamma'' \|}{\| \gamma' \|^3}. \]

(10) With $\gamma$ as in Problem 1, show that
\[ \gamma'' = \Bigl( \frac{d}{dt} \| \gamma' \| \Bigr) T + \| \gamma' \|^2 \kappa N. \]

(11) Let $\gamma : I \to M \subset \mathbb{R}^3$ be a curve which is not necessarily unit speed and suppose $M$ is oriented by a unit normal field $N$. Show that
\[ \kappa_g = \frac{\langle \gamma'', J\gamma' \rangle}{\| \gamma' \|^3}. \]

(12) Show that if a surface $M \subset \mathbb{R}^3$ is either the image by an immersion of an open domain in $\mathbb{R}^2$ or is the zero set of a real-valued function $f$ (such that $df \neq 0$ on $M$), then it is orientable.

(13) Calculate the shape operator at a generic point on the cylinder $\{(x, y, z) : x^2 + y^2 = r^2\}$.

(14) (Euler's formula) Let $p$ be a point on a surface $M$ in $\mathbb{R}^3$ and let $u_1, u_2$ be principal directions with corresponding principal curvatures $k_1, k_2$ at $p$. If $u = (\cos\theta) u_1 + (\sin\theta) u_2$, show that $k(u) = k_1 \cos^2\theta + k_2 \sin^2\theta$.

(15) Show that a point on a hypersurface in $\mathbb{R}^n$ is umbilic if and only if there is a constant $k_0$ such that $k(u) = k_0$ for all $u \in T_pM$ with $\| u \| = 1$.

(16) Let $Z$ be a nonvanishing (not necessarily unit) normal field on a surface $M \subset \mathbb{R}^3$. If $X$ and $Y$ are tangent fields such that $X \times Y = Z$, then show that
\[ K = \frac{\langle Z, \nabla_X Z \times \nabla_Y Z \rangle}{\| Z \|^4} \quad \text{and} \quad H = \frac{\langle Z, \nabla_X Z \times Y + X \times \nabla_Y Z \rangle}{2 \| Z \|^3}. \]

(17) Find the Gauss curvature $K$ at $(x, y, z)$ on the ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$.

(18) Show that a surface in $\mathbb{R}^3$ is minimal if and only if there are orthogonal asymptotic vectors at each point.

(19) Compute $E, F, G, l, m, n$ as well as $H$ and $K$ for the following surfaces:
(a) Paraboloid: $(u, v) \mapsto (u, v, au^2 + bv^2)$ with $a, b > 0$.
(b) Monkey Saddle: $(u, v) \mapsto (u, v, u^3 - 3uv^2)$.
(c) Torus: $(u, v) \mapsto ((a + b\cos v)\cos u,\ (a + b\cos v)\sin u,\ b\sin v)$ with $a > b > 0$.

(20) (Enneper's surface) Show that the following is a minimal but not 1-1 immersion:
\[ x(u, v) := \Bigl( u - \frac{u^3}{3} + uv^2,\ \ldots \Bigr) \]
Chapter 5
Lie Groups
One approach to geometry is to view it as the study of invariance and symmetry. In our case, we are interested in studying symmetries of smooth manifolds, Riemannian manifolds, symplectic manifolds, etc. The usual way to deal with symmetry in mathematics is by the use of the notion of a transformation group. The wonderful thing for us is that the groups that arise in the study of geometric symmetries are often themselves smooth manifolds. Such "group manifolds" are called Lie groups. In physics, Lie groups play a big role in connection with physical symmetries and conservation laws (Noether's theorem). Within physics, perhaps the most celebrated role played by Lie groups is in particle physics and gauge theory. In mathematics, Lie groups play a prominent role in harmonic analysis (generalized Fourier theory), group representations, differential equations, and in virtually every branch of geometry including Riemannian geometry, Cartan geometry, algebraic geometry, Kähler geometry, and symplectic geometry.
5.1. Definitions and Examples

Definition 5.1. A smooth manifold $G$ is called a Lie group if it is a group (abstract group) such that the multiplication map $\mu : G \times G \to G$ and the inverse map $\operatorname{inv} : G \to G$, given respectively by $\mu(g, h) = gh$ and $\operatorname{inv}(g) = g^{-1}$, are $C^\infty$ maps. If the group is abelian, we sometimes opt to use the additive notation $g + h$ for the group operation.

We will usually denote the identity element of any Lie group by the same letter $e$. Exceptions include the case of matrix or linear groups where we use the letter $I$ or $\mathrm{id}$. The map $\operatorname{inv} : G \to G$ given by $g \mapsto g^{-1}$ is called inversion and is easily seen to be a diffeomorphism.
Example 5.2. $\mathbb{R}$ is a one-dimensional (abelian) Lie group, where the group multiplication is the usual addition $+$. Similarly, any real or complex vector space is a Lie group under vector addition.

Example 5.3. The circle $S^1 = \{z \in \mathbb{C} : |z|^2 = 1\}$ is a 1-dimensional (abelian) Lie group under complex multiplication. It is also traditional to denote this group by $U(1)$.

Example 5.4. Let $\mathbb{R}^* = \mathbb{R} \setminus \{0\}$, $\mathbb{C}^* = \mathbb{C} \setminus \{0\}$ and $\mathbb{H}^* = \mathbb{H} \setminus \{0\}$ (here $\mathbb{H}$ is the quaternion division ring discussed in detail later). Then, using multiplication, $\mathbb{R}^*$, $\mathbb{C}^*$, and $\mathbb{H}^*$ are Lie groups. The Lie group $\mathbb{H}^*$ is not abelian.

The group of all invertible real $n \times n$ matrices is a Lie group denoted $GL(n, \mathbb{R})$. A global chart on $GL(n, \mathbb{R})$ is given by the $n^2$ functions $x^i_j$, where if $A \in GL(n, \mathbb{R})$ then $x^i_j(A)$ is the $ij$-th entry of $A$. We study this group and some of its subgroups below. Showing that $GL(n, \mathbb{R})$ is a Lie group is straightforward. Multiplication is clearly smooth. For the inversion map one appeals to the usual formula for the inverse of a matrix, $A^{-1} = \operatorname{adj}(A)/\det(A)$. Here $\operatorname{adj}(A)$ is the adjoint matrix (whose entries are the cofactors). This shows that $A^{-1}$ depends smoothly on the entries of $A$. Similarly, the group $GL(n, \mathbb{C})$ of invertible $n \times n$ complex matrices is a Lie group.
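The smoothness argument rests on the fact that every entry of $A^{-1} = \operatorname{adj}(A)/\det(A)$ is a polynomial in the entries of $A$ divided by the nonvanishing polynomial $\det(A)$. The following minimal sketch (hypothetical helper names, exact rational arithmetic, cofactor expansion chosen for clarity rather than efficiency) makes the formula concrete; note that entry $(i, j)$ of the adjoint matrix is the $(j, i)$ cofactor.

```python
from fractions import Fraction


def minor(a, i, j):
    """Matrix a with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by cofactor expansion along the first row."""
    if not a:
        return 1  # determinant of the empty (0 x 0) matrix
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j))
               for j in range(len(a)))


def adjugate(a):
    """Adjoint (adjugate) matrix: entry (i, j) is the (j, i) cofactor of a."""
    n = len(a)
    return [[(-1) ** (i + j) * det(minor(a, j, i)) for j in range(n)]
            for i in range(n)]


def inverse(a):
    """A^{-1} = adj(A) / det(A); each entry is a rational function of A."""
    d = det(a)
    if d == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(c) / d for c in row] for row in adjugate(a)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


if __name__ == "__main__":
    A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
    print(det(A))
    print(matmul(A, inverse(A)))
```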
Exercise 5.5. Let $H$ be a subgroup of $G$ and consider the cosets $gH$, $g \in G$. Recall that $G$ is the disjoint union of the cosets of $H$. Show that if $H$ is open, then so are all the cosets. Conclude that the complement $H^c$ is also open and hence $H$ is closed.

Theorem 5.6. If $G$ is a connected Lie group and $U$ is a neighborhood of the identity element $e$, then $U$ generates the group. In other words, every element of $G$ is a product of elements of $U$.

Proof. First note that $V = \operatorname{inv}(U) \cap U$ is an open neighborhood of the identity with the property that $\operatorname{inv}(V) = V$. We say that $V$ is symmetric. We show that $V$ generates $G$. For any open $W_1$ and $W_2$ in $G$, the set $W_1 W_2 = \{w_1 w_2 : w_1 \in W_1 \text{ and } w_2 \in W_2\}$ is an open set, being the union $\bigcup_{g \in W_1} gW_2$ of open sets. Thus, in particular, the inductively defined sets
\[ V^n = V V^{n-1}, \quad n = 1, 2, 3, \ldots, \]
are open. We have
\[ e \in V \subset V^2 \subset \cdots \subset V^n \subset \cdots. \]
It is easy to check that each $V^n$ is symmetric and so also is the union
\[ V^\infty := \bigcup_{n=1}^{\infty} V^n. \]
Moreover, $V^\infty$ is not only closed under inversion, but also obviously closed under multiplication. Thus $V^\infty$ is an open subgroup. From Exercise 5.5, $V^\infty$ is also closed, and since $G$ is connected, we obtain $V^\infty = G$. $\Box$

In general, the connected component of a Lie group $G$ that contains the identity is a Lie group denoted $G_0$, and it is generated by any open neighborhood of the identity. We call $G_0$ the identity component of $G$.

Definition 5.7. For a Lie group $G$ and a fixed element $g \in G$, the maps $L_g : G \to G$ and $R_g : G \to G$ are defined by
\[ L_g x = gx \ \text{ for } x \in G, \qquad R_g x = xg \ \text{ for } x \in G, \]
and are called left translation and right translation (by $g$) respectively.

It is easy to see that $L_g$ and $R_g$ are diffeomorphisms with $L_g^{-1} = L_{g^{-1}}$ and $R_g^{-1} = R_{g^{-1}}$. If $G$ and $H$ are Lie groups, then so is the product manifold $G \times H$, where multiplication is $(g_1, h_1) \cdot (g_2, h_2) = (g_1 g_2, h_1 h_2)$. The Lie group $G \times H$ is called the product Lie group. For example, the product group $S^1 \times S^1$ is called the 2-torus group. More generally, the higher torus groups are defined by $T^n = S^1 \times \cdots \times S^1$ ($n$ factors).

Definition 5.8. Let $H$ be an abstract subgroup of a Lie group $G$. If $H$ is a Lie group such that the inclusion map $H \hookrightarrow G$ is an immersion, then we say that $H$ is a Lie subgroup of $G$.

Proposition 5.9. If $H$ is an abstract subgroup of a Lie group $G$ that is also a regular submanifold, then $H$ is a closed Lie subgroup.

Proof. The multiplication and inversion maps, $H \times H \to H$ and $H \to H$, are the restrictions of the multiplication and inversion maps on $G$, and since $H$ is a regular submanifold, we obtain the needed smoothness of these maps. The harder part is to show that $H$ is closed. So let $x_0 \in \overline{H}$ be arbitrary. Let $(U, x)$ be a single-slice chart adapted to $H$ whose domain contains $e$. Let $\delta : G \times G \to G$ be the map $\delta(g_1, g_2) = g_1^{-1} g_2$, and choose an open set $V$ such that $e \in V \subset \overline{V} \subset U$. By continuity of the map $\delta$ we can find an open neighborhood $O$ of the identity element such that $O \times O \subset \delta^{-1}(V)$. Now if $\{h_i\}$ is a sequence in $H$ converging to $x_0 \in \overline{H}$, then $x_0^{-1} h_i \to e$ and $x_0^{-1} h_i \in O$ for all sufficiently large $i$. Since $h_j^{-1} h_i = (x_0^{-1} h_j)^{-1} x_0^{-1} h_i$, we have that $h_j^{-1} h_i \in V$ for sufficiently large $i, j$. For any sufficiently large fixed $j$, we have
\[ \lim_{i \to \infty} h_j^{-1} h_i = h_j^{-1} x_0 \in \overline{V} \subset U. \]
Since $U$ is the domain of a single-slice chart, $U \cap H$ is closed in $U$. Thus since each $h_j^{-1} h_i$ is in $U \cap H$, we see that $h_j^{-1} x_0 \in U \cap H \subset H$ for all sufficiently large $j$. This shows that $x_0 \in H$, and since $x_0$ was arbitrary, we are done. $\Box$

By a closed Lie subgroup we shall always mean one that is a regular submanifold as in the previous proposition. It is a nontrivial fact that an abstract subgroup of a Lie group that is also a closed subset is automatically a closed Lie subgroup in this sense (see Theorem 5.81).
Example 5.10. $S^1$ embedded as $S^1 \times \{1\}$ in the torus $S^1 \times S^1$ is a closed subgroup.

Example 5.11. Let $S^1$ be considered as the set of unit modulus complex numbers. The image in the torus $T^2 = S^1 \times S^1$ of the map $\mathbb{R} \to S^1 \times S^1$ given by $t \mapsto (e^{i2\pi t}, e^{i2\pi a t})$ is a Lie subgroup. This map is a homomorphism. If $a$ is a rational number, then the image is an embedded copy of $S^1$ wrapped around the torus several times depending on $a$. If $a$ is irrational, then the image is still a Lie subgroup but is now dense in $T^2$. The last example is important since it shows that a Lie subgroup might actually be a dense subset of the containing Lie group.
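The two properties used in Example 5.11, that $t \mapsto (e^{i2\pi t}, e^{i2\pi a t})$ is a homomorphism and that its image closes up when $a$ is rational, are easy to illustrate numerically. The sketch below is an illustration only; the names `phi` and `torus_mult` and the sample values of $a$ are assumptions, and floating point arithmetic can of course only suggest, not prove, the density statement for irrational $a$.

```python
import cmath
import math


def phi(t, a):
    """The map t -> (e^{i 2 pi t}, e^{i 2 pi a t}) into S^1 x S^1."""
    return (cmath.exp(2j * math.pi * t), cmath.exp(2j * math.pi * a * t))


def torus_mult(p, q):
    """Componentwise multiplication, the group operation on S^1 x S^1."""
    return (p[0] * q[0], p[1] * q[1])


def distance(p, q):
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


if __name__ == "__main__":
    identity = (1 + 0j, 1 + 0j)
    s, t = 0.37, 1.25
    # Homomorphism property: phi(s + t) = phi(s) phi(t).
    print(distance(phi(s + t, 1.5), torus_mult(phi(s, 1.5), phi(t, 1.5))))
    # Rational a = 3/2: the curve returns to the identity at t = 2.
    print(distance(phi(2.0, 1.5), identity))
    # Irrational a = sqrt(2): no return to the identity at small integer times.
    print(min(distance(phi(float(k), math.sqrt(2)), identity)
              for k in range(1, 51)))
```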
5.2. Linear Lie Groups

Let $V$ be an $n$-dimensional vector space over $\mathbb{F}$, where $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$. The space $L(V, V)$ of linear maps from $V$ to $V$ is a vector space and therefore a smooth manifold. A global chart for $L(V, V)$ may be obtained by first choosing a basis for $V$ and then defining $n^2$ functions $\{x^i_j\}_{1 \le i, j \le n}$
the coordinate functions just introduced provide a global chart for $GL(V)$. Let $GL(n, \mathbb{F})$ denote the group of invertible matrices with entries from $\mathbb{F}$. We obtain an isomorphism of $GL(V)$ with the matrix group $GL(n, \mathbb{F})$ by choosing a basis and then simply sending each element of $GL(V)$ to its matrix representative with respect to that basis. If we write $\operatorname{mat}(A)$ for the matrix that represents $A \in GL(V)$ with respect to a fixed basis, then $A \mapsto \operatorname{mat}(A)$ is a group isomorphism and is clearly smooth. It follows that $GL(V)$ is a Lie group, and for each choice of basis we have an isomorphism of Lie groups $GL(V) \cong GL(n, \mathbb{F})$. In practice it is common to work with the matrix group $GL(n, \mathbb{F})$ and its subgroups. The Lie group $GL(V)$ is called the general linear group of $V$ and is also denoted $GL(V, \mathbb{F})$ when we want to make the field apparent. The matrix group $GL(n, \mathbb{F})$ is also referred to as a general linear (matrix) group and is often identified with $GL(\mathbb{F}^n)$. More specifically, $GL(n, \mathbb{R})$ is called the real general linear group, and $GL(n, \mathbb{C})$ is called the complex general linear group. Lie groups that are subgroups of $GL(V)$ for some vector space $V$ are referred to as linear Lie groups and are often realized as matrix subgroups of $GL(n, \mathbb{F})$ for some $n$.

Definition 5.12. Let $V$ be an $n$-dimensional vector space over the field $\mathbb{F}$ which we take to be either $\mathbb{R}$ or $\mathbb{C}$. Then the group $SL(V)$ defined by
\[ SL(V) = SL(V, \mathbb{F}) := \{A \in GL(V) : \det(A) = 1\} \]
is called the special linear group for $V$.
A bilinear form $\beta : V \times V \to \mathbb{F}$ on an $\mathbb{F}$-vector space $V$ is called nondegenerate if the maps $V \to V^*$ given by $\beta_R : v \mapsto \beta(v, \cdot)$ and $\beta_L : v \mapsto \beta(\cdot, v)$ are both linear isomorphisms. If $V$ is finite-dimensional, then $\beta_R$ is an isomorphism if and only if $\beta_L$ is an isomorphism and then $\beta$ is nondegenerate provided it has the property that if $\beta(v, w) = 0$ for all $w \in V$, then $v = 0$.

Definition 5.13. A (real) scalar product on a (real) finite-dimensional vector space $V$ is a nondegenerate symmetric bilinear form $\beta : V \times V \to \mathbb{R}$. A (real) scalar product space is a pair $(V, \beta)$ where $V$ is a real vector space and $\beta$ is a scalar product. (As usual we refer to $V$ itself as the scalar product space when $\beta$ is given.)

Definition 5.14. Let $V$ be a complex vector space. An $\mathbb{R}$-bilinear map $\beta : V \times V \to \mathbb{C}$ that satisfies $\beta(av, w) = \bar{a} \beta(v, w)$ and $\beta(v, aw) = a \beta(v, w)$ for all $a \in \mathbb{C}$ and $v, w \in V$ is called a sesquilinear form. If also $\beta(v, w) = \overline{\beta(w, v)}$, we call $\beta$ a Hermitian form. If a Hermitian form is nondegenerate, we call it a Hermitian scalar product, and then $(V, \beta)$ a Hermitian scalar product space.
5. Lie Groups
The sesquilinear conditions imposed are described by saying that $\beta$ is to be conjugate linear in the first slot and linear in the second slot. Nondegeneracy for a sesquilinear form is defined as for bilinear forms. Many authors define sesquilinear and Hermitian forms to be conjugate linear in the second slot and linear in the first. Obviously, $\beta(v, v)$ is always real for a Hermitian form.
Definition 5.15. Let $\beta$ be a (real) scalar product or Hermitian scalar product on a vector space V. Then
(i) $\beta$ is positive (resp. negative) definite if $\beta(v, v) \geq 0$ (resp. $\beta(v, v) \leq 0$) for all $v \in V$ and $\beta(v, v) = 0 \implies v = 0$;
(ii) $\beta$ is positive (resp. negative) semidefinite if $\beta(v, v) \geq 0$ (resp. $\beta(v, v) \leq 0$) for all $v \in V$.
What is called an inner product (a term we have already used) is a positive definite scalar product (or positive definite Hermitian scalar product in the complex case). In this book the term "inner product" always implies positive definiteness. Let us generalize the notion of orthonormal basis for an inner product space to include indefinite scalar product spaces. A basis $(e_1, \dots, e_n)$ for a scalar product space $(V, \beta)$ is called an orthonormal basis if $\beta(e_i, e_j) = 0$ when $i \neq j$, and $\beta(e_i, e_i) = \pm 1$ for all i. An orthonormal basis always exists for a finite-dimensional scalar product space.
Definition 5.16. Let V be an n-dimensional vector space over the field $\mathbb{F}$, which we take to be either $\mathbb{R}$ or $\mathbb{C}$. If $\beta$ is a bilinear form or sesquilinear form on V, then $\mathrm{Aut}(V, \beta)$ is the subgroup of GL(V) defined by
$$\mathrm{Aut}(V, \beta) := \{A \in GL(V) : \beta(Av, Aw) = \beta(v, w) \text{ for all } v, w \in V\}.$$
If $\beta$ is a scalar product, then the elements of $\mathrm{Aut}(V, \beta)$ are called isometries of V.
Theorem 5.17. SL(V) is a closed Lie subgroup of GL(V). If $\beta$ is a bilinear or sesquilinear form as above, then $\mathrm{Aut}(V, \beta)$ and $\mathrm{SAut}(V, \beta) := \mathrm{Aut}(V, \beta) \cap SL(V)$ are closed Lie subgroups of GL(V). (The form $\beta$ is most often taken to be nondegenerate.)
Proof. It is easy to check that the sets in question are subgroups. They are clearly closed. For example, $\mathrm{Aut}(V, \beta) = \bigcap_{v,w} F_{v,w}$, where
$$F_{v,w} := \{A \in GL(V) : \beta(Av, Aw) = \beta(v, w)\}.$$
The fact that they are Lie subgroups follows from Theorem 5.81 below. However, as we shall see, most of the specific cases arising from various choices of $\beta$ can be proved to be Lie groups by other means. That they are Lie subgroups follows from Proposition 5.9 once we show that they are
5.2. Linear Lie Groups
regular submanifolds of the appropriate group $GL(V, \mathbb{F})$. We will return to this later, once we have introduced another powerful theorem that will allow us to verify this without the use of Theorem 5.81. $\square$
Let dim V = n. After choosing a basis, SL(V) gives the matrix version $SL(n, \mathbb{F}) := \{A \in M_{n \times n}(\mathbb{F}) : \det A = 1\}$. Notice that even when $\mathbb{F} = \mathbb{C}$, it may be that $\beta$ is only required to be $\mathbb{R}$-linear. Depending on whether $\mathbb{F} = \mathbb{C}$ or $\mathbb{R}$ and on the nature of $\beta$, the notation for the linear groups takes on special conventional forms introduced below. When choosing a basis in order to represent one of the groups associated to a form $\beta$ in a matrix version, it is usually the case that one uses a basis under which the matrix that represents $\beta$ takes on a canonical form.
Example 5.18 (The (semi) orthogonal groups). Let $(V, \beta)$ be a real scalar product space. In this case we write $\mathrm{Aut}(V, \beta)$ as $O(V, \beta)$ and refer to it as the semiorthogonal group associated to $\beta$. With respect to an appropriately ordered orthonormal basis, $\beta$ is represented by a diagonal matrix of the form
$$\eta_{p,q} = \mathrm{diag}(1, \dots, 1, -1, \dots, -1),$$
where there are p ones and q minus ones down the diagonal. The group of matrices arising from $O(V, \beta)$ with such a choice of basis is denoted O(p, q) and consists exactly of the real matrices Q satisfying $Q \eta_{p,q} Q^t = \eta_{p,q}$. These groups are called the semiorthogonal matrix groups. With such an orthonormal choice of basis as above, the bilinear form (scalar product) is given as a canonical form on $\mathbb{R}^n$ (where p + q = n):
$$\langle x, y \rangle := \sum_{i=1}^{p} x^i y^i - \sum_{i=p+1}^{n} x^i y^i,$$
and we have the alternative description
$$O(p, q) = \{Q \in GL(n) : \langle Qx, Qy \rangle = \langle x, y \rangle \text{ for all } x, y \in \mathbb{R}^n\}.$$
If $\beta$ is positive definite, we then have q = 0, and $O(V, \beta)$ is referred to as a real orthogonal group. We write O(n, 0) as O(n) and refer to it as the real orthogonal (matrix) group; $Q \in O(n) \iff Q^t Q = I$.
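The defining condition $Q \eta_{p,q} Q^t = \eta_{p,q}$ is easy to test numerically. The following is a minimal sketch in plain Python, not part of the text's development; the helper names (`eta`, `is_semiorthogonal`, `boost`, `rotation`) are hypothetical, and the hyperbolic rotation is chosen here only as a sample element of O(1, 1).

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]

def eta(p, q):
    """Diagonal matrix with p ones followed by q minus ones."""
    n = p + q
    return [[(1.0 if i < p else -1.0) if i == j else 0.0
             for j in range(n)] for i in range(n)]

def is_semiorthogonal(Q, p, q, tol=1e-12):
    """Check the condition Q eta Q^t = eta entrywise, up to tol."""
    e = eta(p, q)
    lhs = matmul(matmul(Q, e), transpose(Q))
    n = p + q
    return all(abs(lhs[i][j] - e[i][j]) < tol
               for i in range(n) for j in range(n))

def boost(s):
    """Hyperbolic rotation: an assumed sample element of O(1, 1)."""
    return [[math.cosh(s), math.sinh(s)],
            [math.sinh(s), math.cosh(s)]]

def rotation(t):
    """Ordinary rotation: an element of O(2) = O(2, 0)."""
    return [[math.cos(t), -math.sin(t)],
            [math.sin(t), math.cos(t)]]
```

Under these assumptions, `is_semiorthogonal(boost(0.7), 1, 1)` and `is_semiorthogonal(rotation(0.3), 2, 0)` should both hold, while a generic rotation fails the test for signature (1, 1).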
Example 5.19. There are also complex orthogonal groups (not to be confused with unitary groups). In matrix representation, we have $O(n, \mathbb{C}) := \{Q \in GL(n, \mathbb{C}) : Q^t Q = I\}$.
Example 5.20. Let $(V, \beta)$ be a Hermitian scalar product space. In this case, we write $\mathrm{Aut}(V, \beta)$ as $U(V, \beta)$ and refer to it as the semiunitary group associated to $\beta$. If $\beta$ is positive definite, then we call it a unitary group. Again we may choose a basis for V such that $\beta$ is represented by the Hermitian form on $\mathbb{C}^n$ (with p + q = n) given by
$$\langle x, y \rangle := \sum_{i=1}^{p} \bar{x}^i y^i - \sum_{i=p+1}^{n} \bar{x}^i y^i.$$
We then obtain the semiunitary matrix group
$$U(p, q) = \{A \in GL(n, \mathbb{C}) : \langle Ax, Ay \rangle = \langle x, y \rangle \text{ for all } x, y \in \mathbb{C}^n\}.$$
We write U(n, 0) as U(n) and refer to it as the unitary (matrix) group. In particular, $U(1) = S^1 = \{z \in \mathbb{C} : |z| = 1\}$.
Definition 5.21. For $(V, \beta)$ a real scalar product space, we have the special orthogonal group of $(V, \beta)$ given by
$$SO(V, \beta) = O(V, \beta) \cap SL(V, \mathbb{R}).$$
For $(V, \beta)$ a Hermitian scalar product space, we have the special unitary group of $(V, \beta)$ given by
$$SU(V, \beta) = U(V, \beta) \cap SL(V, \mathbb{C}).$$
Definition 5.22. The group of $n \times n$ complex matrices of determinant one is the complex special linear matrix group $SL(n, \mathbb{C})$. We also have the similarly defined real special linear group $SL(n, \mathbb{R})$. The special orthogonal and special semiorthogonal matrix groups, SO(n) and SO(p, q), are the matrix groups defined by $SO(n) = O(n) \cap SL(n, \mathbb{R})$ and $SO(p, q) = O(p, q) \cap SL(n, \mathbb{R})$. The special unitary and special semiunitary matrix groups SU(n) and SU(p, q) are defined similarly.
The group SO(3) is the familiar matrix representation of the proper rotation group of Euclidean space and plays a prominent role in classical physics. Here "proper" refers to the fact that SO(3) does not contain any reflections. In the problems we ask the reader to show that SO(3) is the connected component of the identity in O(3).
Exercise 5.23. Show that SU(2) is simply connected while SO(3) is not.
Example 5.24 (Symplectic groups). We will describe both the real and the complex symplectic groups. Suppose that $\beta$ is a nondegenerate skew-symmetric $\mathbb{C}$-bilinear (resp. $\mathbb{R}$-bilinear) form on a 2n-dimensional complex
(resp. real) vector space V. The group $\mathrm{Aut}(V, \beta)$ is called the complex (resp. real) symplectic group and is denoted by $Sp(V, \mathbb{C})$ (resp. $Sp(V, \mathbb{R})$). There exists a basis $\{f_i\}$ for V such that $\beta$ is represented in the canonical form by
$$(v, w) = \sum_{i=1}^{n} v^i w^{n+i} - \sum_{j=1}^{n} v^{n+j} w^j.$$
The symplectic matrix groups are given by
$$Sp(2n, \mathbb{C}) := \{A \in M_{2n \times 2n}(\mathbb{C}) : (Av, Aw) = (v, w)\},$$
$$Sp(2n, \mathbb{R}) := \{A \in M_{2n \times 2n}(\mathbb{R}) : (Av, Aw) = (v, w)\},$$
where (v, w) is given as above.
Exercise 5.25. For $\mathbb{F} = \mathbb{C}$ or $\mathbb{R}$, show that $A \in Sp(2n, \mathbb{F})$ if and only if $A^t J A = J$, where
$$J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}.$$
Much of the above can be generalized somewhat more. Recall that the algebra of quaternions $\mathbb{H}$ is a copy of $\mathbb{R}^4$ endowed with a multiplication described as follows: First let generic elements of $\mathbb{R}^4$ be denoted by $x = (x^0, x^1, x^2, x^3)$, $y = (y^0, y^1, y^2, y^3)$, etc. Thus we are using {0, 1, 2, 3} as our index set. Let the standard basis be denoted by 1, i, j, k. We define a multiplication by taking these basis elements as generators and insisting on the following relations:
$$i^2 = j^2 = k^2 = -1, \quad ij = -ji = k, \quad jk = -kj = i, \quad ki = -ik = j.$$
Of course, $\mathbb{H}$ is a vector space over $\mathbb{R}$ since it is just $\mathbb{R}^4$ with some extra structure. As a ring, $\mathbb{H}$ is a division algebra which is very much like a field, lacking only the property of commutativity. In particular, we shall see that every nonzero element of $\mathbb{H}$ has a multiplicative inverse. Elements of the form $a1$ for $a \in \mathbb{R}$ are identified with the corresponding real numbers, and such quaternions are called real quaternions. By analogy with complex numbers, quaternions of the form $x^1 i + x^2 j + x^3 k$ are called imaginary quaternions. For a given quaternion $x = x^0 1 + x^1 i + x^2 j + x^3 k$, the quaternion $x^1 i + x^2 j + x^3 k$ is called the imaginary part of x, and $x^0 1 = x^0$ is called the real part of x. We also have a conjugation defined by
$$x \mapsto \bar{x} := x^0 1 - x^1 i - x^2 j - x^3 k.$$
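Notice that $x\bar{x} = \bar{x}x$ is real and equal to $(x^0)^2 + (x^1)^2 + (x^2)^2 + (x^3)^2$. We denote the positive square root of this by $|x|$ so that $x\bar{x} = |x|^2$.
Exercise 5.26. Verify the following for $x, y \in \mathbb{H}$ and $a, b \in \mathbb{R}$:
$$\overline{ax + by} = a\bar{x} + b\bar{y}, \quad \overline{(\bar{x})} = x, \quad |xy| = |x||y|, \quad |\bar{x}| = |x|, \quad \overline{xy} = \bar{y}\bar{x}.$$
Now we can write down the inverse of a nonzero $x \in \mathbb{H}$:
$$x^{-1} = \frac{1}{|x|^2}\bar{x}.$$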
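The relations among 1, i, j, k and the formula for the inverse can be checked by direct computation. The following is a small sketch in plain Python, assuming a quaternion $x^0 + x^1 i + x^2 j + x^3 k$ is stored as the 4-tuple $(x^0, x^1, x^2, x^3)$; the function names are hypothetical and chosen only for this illustration.

```python
import math

ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

def qmul(x, y):
    """Quaternion product determined by i^2 = j^2 = k^2 = -1, ij = k, jk = i, ki = j."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)

def qconj(x):
    """Conjugation: keep the real part, negate the imaginary part."""
    return (x[0], -x[1], -x[2], -x[3])

def qnorm(x):
    """|x|, the positive square root of x times its conjugate."""
    return math.sqrt(sum(c * c for c in x))

def qinv(x):
    """Inverse of a nonzero quaternion: conjugate divided by |x|^2."""
    n2 = sum(c * c for c in x)
    return tuple(c / n2 for c in qconj(x))
```

With these definitions one can confirm, for sample values, that `qmul(I, J)` equals `K` while `qmul(J, I)` equals its negative, that conjugation reverses products, and that `qmul(x, qinv(x))` agrees with `ONE` up to rounding.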
Notice the strong analogy with complex number arithmetic.
Example 5.27. The set of unit quaternions is $U(1, \mathbb{H}) := \{x \in \mathbb{H} : |x| = 1\}$. This set is closed under multiplication. As a manifold it is (diffeomorphic to) $S^3$. With quaternionic multiplication, $S^3 = U(1, \mathbb{H})$ is a compact Lie group. Compare this to Example 5.3 where we saw that $U(1, \mathbb{C}) = S^1$. For the future, we unify things by letting $U(1, \mathbb{R}) := \mathbb{Z}_2 = S^0 \subset \mathbb{R}$. In other words, we take the 0-sphere to be the subset {-1, 1} with its natural structure as a multiplicative group.
$$U(1, \mathbb{H}) = S^3, \quad U(1, \mathbb{C}) = S^1, \quad U(1, \mathbb{R}) := \mathbb{Z}_2 = S^0.$$
Exercise 5.28. Prove the assertions in the last example.
We now consider the n-fold product $\mathbb{H}^n$, which, as a real vector space (and a smooth manifold), is $\mathbb{R}^{4n}$. However, let us think of elements of $\mathbb{H}^n$ as column vectors with quaternion entries. We want to treat $\mathbb{H}^n$ as a vector space over $\mathbb{H}$ with addition defined just as for $\mathbb{R}^n$ and $\mathbb{C}^n$, but since $\mathbb{H}$ is not commutative, we are not properly dealing with a vector space. In particular, we should decide whether scalars should multiply column vectors on the right or on the left. We choose to multiply on the right, and this could take some getting used to, but there is a good reason for our choice. This puts us into the category of right $\mathbb{H}$-modules, where elements of $\mathbb{H}$ are the "scalars". The reader should have no trouble catching on, and so we do not make formal definitions at this time (but see Appendix D). For $v, w \in \mathbb{H}^n$ and $a, b \in \mathbb{H}$, we have
$$v(a + b) = va + vb, \quad (v + w)a = va + wa, \quad (va)b = v(ab).$$
A map $A : \mathbb{H}^n \to \mathbb{H}^n$ is said to be $\mathbb{H}$-linear if $A(va) = A(v)a$ for all $v \in \mathbb{H}^n$ and $a \in \mathbb{H}$. There is no problem with doing matrix algebra with matrices with quaternion entries, as long as one respects the noncommutativity of $\mathbb{H}$. For example, if $A = (a^i_j)$ and $B = (b^i_j)$ are matrices with quaternion entries, then writing C = AB we have
$$c^i_j = \sum_k a^i_k b^k_j,$$
but we cannot expect that $\sum_k a^i_k b^k_j = \sum_k b^k_j a^i_k$. For any $A = (a^i_j)$, the map $\mathbb{H}^n \to \mathbb{H}^n$ defined by $v \mapsto Av$ is $\mathbb{H}$-linear since $A(va) = (Av)a$.
Definition 5.29. The set of all $m \times n$ matrices with quaternion entries is denoted $M_{m \times n}(\mathbb{H})$. The subset $GL(n, \mathbb{H})$ is defined as the set of all $Q \in M_{n \times n}(\mathbb{H})$ such that the map $v \mapsto Qv$ is a bijection.
We will now see that $GL(n, \mathbb{H})$ is a Lie group isomorphic to a subgroup of $GL(2n, \mathbb{C})$. First we define a map $\iota : \mathbb{C}^2 \to \mathbb{H}$ as follows: For $(z^1, z^2) \in \mathbb{C}^2$ with $z^1 = x^0 + x^1 i$ and $z^2 = x^2 + x^3 i$, we let $\iota(z^1, z^2) = (x^0 + x^1 i) + (x^2 + x^3 i)j$, where on the right hand side we interpret i as a quaternion. Note that $(x^0 + x^1 i) + (x^2 + x^3 i)j = x^0 + x^1 i + x^2 j + x^3 k$. It is easily shown that this map is an $\mathbb{R}$-linear bijection, and we use this map to identify $\mathbb{C}^2$ with $\mathbb{H}$. Another way of looking at this is that we identify $\mathbb{C}$ with the span of 1 and i in $\mathbb{H}$, and then every quaternion has a unique representation as $z^1 + z^2 j$ for $z^1, z^2 \in \mathbb{C} \subset \mathbb{H}$. We extend this idea to quaternionic matrices; we can write every $Q \in M_{m \times n}(\mathbb{H})$ in the form A + Bj for $A, B \in M_{m \times n}(\mathbb{C})$ in a unique way. This representation makes it clear that $M_{m \times n}(\mathbb{H})$ has a natural complex vector space structure, where the scalar multiplication is $z(A + Bj) = zA + zBj$. Direct computation shows that
$$(A + Bj)(C + Dj) = (AC - B\bar{D}) + (AD + B\bar{C})j$$
for $A + Bj \in M_{m \times n}(\mathbb{H})$ and $C + Dj \in M_{n \times k}(\mathbb{H})$, where we have used the fact that for $Q \in M_{m \times n}(\mathbb{C})$ we have $Qj = j\bar{Q}$. From this it is not hard to show that the map $\vartheta_{m \times n} : M_{m \times n}(\mathbb{H}) \to M_{2m \times 2n}(\mathbb{C})$ given by
$$\vartheta_{m \times n} : A + Bj \mapsto \begin{pmatrix} A & B \\ -\bar{B} & \bar{A} \end{pmatrix}$$
is an injective $\mathbb{R}$-linear map which respects matrix multiplication and thus is an $\mathbb{R}$-algebra isomorphism onto its image. We may identify $M_{m \times n}(\mathbb{H})$ with the subspace of $M_{2m \times 2n}(\mathbb{C})$ consisting of all matrices of the form $\begin{pmatrix} A & B \\ -\bar{B} & \bar{A} \end{pmatrix}$, where $A, B \in \mathbb{C}^{m \times n}$. In particular, if m = n, then we obtain an injective $\mathbb{R}$-linear algebra homomorphism $\vartheta_{n \times n} : M_{n \times n}(\mathbb{H}) \to M_{2n \times 2n}(\mathbb{C})$, and thus the image of this map in $M_{2n \times 2n}(\mathbb{C})$ is another realization of the matrix algebra $M_{n \times n}(\mathbb{H})$. If we specialize to the case of n = 1, we get a realization of $\mathbb{H}$ as the set of all $2 \times 2$ complex matrices of the form $\begin{pmatrix} z & w \\ -\bar{w} & \bar{z} \end{pmatrix}$. This set of matrices
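is closed under multiplication and forms an algebra over the field $\mathbb{R}$. Let us denote this algebra of matrices by the symbol $\mathfrak{R}^4$ since it is diffeomorphic to $\mathbb{H} \cong \mathbb{R}^4$. We now have an algebra isomorphism $\vartheta : \mathbb{H} \to \mathfrak{R}^4$ under which the quaternions 1, i, j and k correspond to the matrices
$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}$$
respectively. Since $\mathbb{H}$ is a division algebra, each of its nonzero elements has a multiplicative inverse. Thus $\mathfrak{R}^4$ must contain the matrix inverse of each of its nonzero elements. This can be seen directly:
$$\begin{pmatrix} z & w \\ -\bar{w} & \bar{z} \end{pmatrix}^{-1} = \frac{1}{|z|^2 + |w|^2} \begin{pmatrix} \bar{z} & -w \\ \bar{w} & z \end{pmatrix}.$$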
Consider again the group of unit quaternions $U(1, \mathbb{H})$. We have already seen that as a smooth manifold, $U(1, \mathbb{H})$ is $S^3$. However, under the isomorphism of $\mathbb{H}$ with the algebra of matrices in $M_{2 \times 2}(\mathbb{C})$ just mentioned, $U(1, \mathbb{H})$ manifests itself as SU(2). Thus we obtain a smooth map $U(1, \mathbb{H}) \to SU(2)$ that is a group isomorphism. We record this as a proposition:
Proposition 5.30. The map $U(1, \mathbb{H}) \to SU(2)$ given by
$$x = z + wj \mapsto \begin{pmatrix} z & w \\ -\bar{w} & \bar{z} \end{pmatrix},$$
where $x = x^0 + x^1 i + x^2 j + x^3 k$, $z = x^0 + x^1 i$ and $w = x^2 + x^3 i$, is a group isomorphism. Thus $S^3 = U(1, \mathbb{H}) \cong SU(2)$.
Proof. The first equality has already been established. Notice that we then have
$$|x|^2 = |z|^2 + |w|^2 = \det \begin{pmatrix} z & w \\ -\bar{w} & \bar{z} \end{pmatrix},$$
and so $x \in U(1, \mathbb{H})$ if and only if $\begin{pmatrix} z & w \\ -\bar{w} & \bar{z} \end{pmatrix}$ has determinant one. But such matrices account for all elements of SU(2) (verify this). We leave it to the reader to check that the map $U(1, \mathbb{H}) \to SU(2)$ is a group isomorphism. $\square$
Exercise 5.31. Show that $Q \in GL(n, \mathbb{H})$ if and only if $\det(\vartheta_{n \times n}(Q)) \neq 0$.
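The realization of a quaternion $z + wj$ as a $2 \times 2$ complex matrix can be explored numerically. The sketch below, in plain Python, is only an illustration under stated assumptions: quaternions are stored as 4-tuples $(x^0, x^1, x^2, x^3)$, and the names `to_matrix`, `matmul2`, `det2` and `is_unitary` are hypothetical helpers introduced here.

```python
def qmul(x, y):
    """Quaternion product on 4-tuples (x0, x1, x2, x3) ~ x0 + x1 i + x2 j + x3 k."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)

def to_matrix(x):
    """Send x = z + w j to the 2x2 complex matrix [[z, w], [-conj(w), conj(z)]]."""
    z = complex(x[0], x[1])
    w = complex(x[2], x[3])
    return [[z, w], [-w.conjugate(), z.conjugate()]]

def matmul2(a, b):
    """Product of two 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def is_unitary(m, tol=1e-12):
    """Check that m times its conjugate transpose is the identity, up to tol."""
    dagger = [[m[j][i].conjugate() for j in range(2)] for i in range(2)]
    p = matmul2(m, dagger)
    return all(abs(p[i][j] - (1 if i == j else 0)) < tol
               for i in range(2) for j in range(2))
```

For sample inputs, `to_matrix(qmul(x, y))` should agree with `matmul2(to_matrix(x), to_matrix(y))`, and a unit quaternion such as (0.5, 0.5, 0.5, 0.5) should map to a unitary matrix of determinant one, as Proposition 5.30 asserts.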
The set of all elements of $GL(2n, \mathbb{C})$ which are of the form $\begin{pmatrix} A & B \\ -\bar{B} & \bar{A} \end{pmatrix}$ is a subgroup of $GL(2n, \mathbb{C})$ and in fact a Lie group. Using the last exercise, we see that we may identify $GL(n, \mathbb{H})$ as a Lie group with this subgroup of $GL(2n, \mathbb{C})$. We want to find a quaternionic analogue of $U(n, \mathbb{C})$, and so we define $b : \mathbb{H}^n \times \mathbb{H}^n \to \mathbb{H}$ by
$$b(v, w) := \bar{v}^t w.$$
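Explicitly, if
$$v = \begin{pmatrix} v^1 \\ \vdots \\ v^n \end{pmatrix} \quad \text{and} \quad w = \begin{pmatrix} w^1 \\ \vdots \\ w^n \end{pmatrix},$$
then
$$b(v, w) = \begin{pmatrix} \bar{v}^1 & \cdots & \bar{v}^n \end{pmatrix} \begin{pmatrix} w^1 \\ \vdots \\ w^n \end{pmatrix} = \sum_i \bar{v}^i w^i.$$
Note that b is obviously $\mathbb{R}$-bilinear. But if $a \in \mathbb{H}$, then we have $b(va, w) = \bar{a}\,b(v, w)$ and $b(v, wa) = b(v, w)a$. Notice that we consistently use right multiplication by quaternionic scalars. Thus b is the quaternionic analogue of a Hermitian scalar product.
Definition 5.32. We define $U(n, \mathbb{H})$:
$$U(n, \mathbb{H}) := \{Q \in GL(n, \mathbb{H}) : b(Qv, Qw) = b(v, w) \text{ for all } v, w \in \mathbb{H}^n\}.$$
$U(n, \mathbb{H})$ is called the quaternionic unitary group.
The group $U(n, \mathbb{H})$ is sometimes called the symplectic group and is denoted Sp(n), but we will avoid this since we want no confusion with the symplectic groups we have already defined. The group $U(n, \mathbb{H})$ is in fact a Lie group (Theorem 5.17 generalizes to the quaternionic setting). The image of $U(n, \mathbb{H})$ in $M_{2n \times 2n}(\mathbb{C})$ under the map $\vartheta_{n \times n}$ is denoted $USp(2n, \mathbb{C})$. Since it is easily established that $\vartheta_{n \times n}|_{U(n, \mathbb{H})}$ is a group homomorphism, the image $USp(2n, \mathbb{C})$ is a subgroup of $GL(2n, \mathbb{C})$.
Exercise 5.33. Show that $\vartheta_{n \times n}(\bar{A}^t) = (\overline{\vartheta_{n \times n}(A)})^t$. Show that $USp(2n, \mathbb{C})$ is a Lie subgroup of $GL(2n, \mathbb{C})$.
Exercise 5.34. Show that $USp(2n, \mathbb{C}) = U(2n) \cap Sp(2n, \mathbb{C})$. Hint: Show that $\vartheta_{n \times n}(M_{n \times n}(\mathbb{H})) = \{A \in GL(2n, \mathbb{C}) : J\bar{A}J^{-1} = A\}$, where $J = \begin{pmatrix} 0 & \mathrm{id} \\ -\mathrm{id} & 0 \end{pmatrix}$. Next show that if $A \in U(2n)$, then $J\bar{A}J^{-1} = A$ if and only if $A^t J A = J$.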
5.3. Lie Group Homomorphisms
Definition 5.35. Let G and H be Lie groups. A smooth map $f : G \to H$ that is a group homomorphism is called a Lie group homomorphism. A Lie group homomorphism is called a Lie group isomorphism in case it has an inverse that is also a Lie group homomorphism. A Lie group isomorphism $G \to G$ is called a Lie group automorphism of G.
If $f : G \to H$ is a Lie group homomorphism, then by definition $f(g_1 g_2) = f(g_1) f(g_2)$ for all $g_1, g_2 \in G$, and it follows that $f(e) = e$ and also that $f(g^{-1}) = f(g)^{-1}$ for all $g \in G$.
Example 5.36. The inclusion $SO(n, \mathbb{R}) \hookrightarrow GL(n, \mathbb{R})$ is a Lie group homomorphism.
Example 5.37. The circle $S^1 \subset \mathbb{C}$ is a Lie group under complex multiplication and the map
$$z = e^{i\theta} \mapsto \begin{pmatrix} \cos(\theta) & \sin(\theta) & 0 \\ -\sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & I_{n-2} \end{pmatrix}$$
is a Lie group homomorphism of $S^1$ into SO(n).
Example 5.38. The map $U(1, \mathbb{H}) \to SU(2)$ of Proposition 5.30 is a Lie group isomorphism.
Example 5.39. The conjugation map $C_g : G \to G$ given by $x \mapsto gxg^{-1}$ is a Lie group automorphism. Note that $C_g = L_g \circ R_{g^{-1}}$.
Proposition 5.40. Let $f_1 : G \to H$ and $f_2 : G \to H$ be Lie group homomorphisms that agree in a neighborhood of the identity. If G is connected, then $f_1 = f_2$.
Proof. By Theorem 5.6, any $g \in G$ is a product of elements in the set on which $f_1$ and $f_2$ agree, so the homomorphism property forces $f_1 = f_2$. $\square$
Exercise 5.41. Show that the multiplication map $\mu : G \times G \to G$ has tangent map at $(e, e) \in G \times G$ given as $T_{(e,e)}\mu(v, w) = v + w$. Recall that we identify $T_{(e,e)}(G \times G)$ with $T_e G \times T_e G$.
Exercise 5.42. $GL(n, \mathbb{R})$ is an open subset of the vector space of all $n \times n$ matrices $M_{n \times n}(\mathbb{R})$. Using the natural identification of $T_e GL(n, \mathbb{R})$ with $M_{n \times n}(\mathbb{R})$, show that $T_e C_g(x) = gxg^{-1}$, where $g \in GL(n, \mathbb{R})$ and $x \in M_{n \times n}(\mathbb{R})$.
Example 5.43. The map $t \mapsto e^{it}$ is a Lie group homomorphism from $\mathbb{R}$ to $S^1 \subset \mathbb{C}$.
Definition 5.44. A Lie group homomorphism from the additive group $\mathbb{R}$ into a Lie group is called a one-parameter subgroup. (Note that despite the use of the word "subgroup", a one-parameter subgroup is actually a map.)
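The homomorphism property that defines a one-parameter subgroup can be spot-checked numerically. The following is a minimal sketch in plain Python, assuming the two sample maps $t \mapsto e^{it}$ and $t \mapsto$ (the $2 \times 2$ rotation block used above); the names `phi`, `rot` and `close` are hypothetical.

```python
import cmath
import math

def phi(t):
    """The map t -> e^{it} from the additive reals into the unit circle."""
    return cmath.exp(1j * t)

def rot(t):
    """The 2x2 block [[cos t, sin t], [-sin t, cos t]] as a function of t."""
    c, s = math.cos(t), math.sin(t)
    return [[c, s], [-s, c]]

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices, up to tol."""
    return all(abs(x - y) < tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

For sample parameters s and t, both `phi(s + t)` versus `phi(s) * phi(t)` and `rot(s + t)` versus `matmul(rot(s), rot(t))` should agree up to rounding, which is exactly the statement that addition in $\mathbb{R}$ is carried to multiplication in the target group.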
Example 5.45. We have seen that the torus $S^1 \times S^1$ is a Lie group under multiplication given by $(e^{i\tau_1}, e^{i\theta_1})(e^{i\tau_2}, e^{i\theta_2}) = (e^{i(\tau_1 + \tau_2)}, e^{i(\theta_1 + \theta_2)})$. Every homomorphism of $\mathbb{R}$ into $S^1 \times S^1$, that is, every one-parameter subgroup of $S^1 \times S^1$, is of the form $t \mapsto (e^{tai}, e^{tbi})$ for some pair of real numbers $a, b \in \mathbb{R}$.
Example 5.46. The map $R : \mathbb{R} \to SO(3)$ given by
$$t \mapsto \begin{pmatrix} \cos t & -\sin t & 0 \\ \sin t & \cos t & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
is a one-parameter subgroup. Also, the map
$$t \mapsto \begin{pmatrix} \cos t & -\sin t & \cdots \\ \sin t & \cos t & \cdots \\ 0 & 0 & \cdots \end{pmatrix}$$
is a one-parameter subgroup of GL(3).
Recall that an $n \times n$ complex matrix A is called Hermitian (resp. skew-Hermitian) if $\bar{A}^t = A$ (resp. $\bar{A}^t = -A$). Let $\mathfrak{su}(2)$ denote the vector space of skew-Hermitian matrices with zero trace. We will later identify $\mathfrak{su}(2)$ as the "Lie algebra" of SU(2).
Example 5.47. Given $g \in SU(2)$, we define the map $\mathrm{Ad}_g : \mathfrak{su}(2) \to \mathfrak{su}(2)$ by $\mathrm{Ad}_g : x \mapsto gxg^{-1}$. The skew-Hermitian matrices of zero trace can be identified with $\mathbb{R}^3$ by using the following matrices as a basis:
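$$\begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}.$$
These are just $-i$ times the Pauli matrices $\sigma_1, \sigma_2, \sigma_3$, and so the correspondence $\mathfrak{su}(2) \cong \mathbb{R}^3$ is given by $-xi\sigma_1 - yi\sigma_2 - iz\sigma_3 \mapsto (x, y, z)$. Under this correspondence, the inner product on $\mathbb{R}^3$ becomes the inner product $(A, B) = \frac{1}{2}\mathrm{trace}(A\bar{B}^t) = -\frac{1}{2}\mathrm{trace}(AB)$. But then
$$(\mathrm{Ad}_g A, \mathrm{Ad}_g B) = -\frac{1}{2}\mathrm{trace}(gAg^{-1}gBg^{-1}) = -\frac{1}{2}\mathrm{trace}(AB) = (A, B).$$
So, $\mathrm{Ad}_g$ can be thought of as an element of O(3). More is true; $\mathrm{Ad}_g$ acts as an element of SO(3), and the map $g \mapsto \mathrm{Ad}_g$ is then a homomorphism from SU(2) to $SO(\mathfrak{su}(2)) \cong SO(3)$. This is a special case of the adjoint map studied later. (This example is related to the notion of "spin". For more, see the online supplement.)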
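The claim that $\mathrm{Ad}_g$ acts as an element of SO(3) can be illustrated by computing its matrix in the basis $-i\sigma_1, -i\sigma_2, -i\sigma_3$. The following plain Python sketch does this for one sample g; the sample matrix `G_SAMPLE` and all helper names are assumptions made for this illustration, and the code relies on $g^{-1} = \bar{g}^t$ for $g \in SU(2)$.

```python
# Basis of su(2): -i times the Pauli matrices sigma_1, sigma_2, sigma_3.
BASIS = [
    [[0, -1j], [-1j, 0]],
    [[0, -1], [1, 0]],
    [[-1j, 0], [0, 1j]],
]

# An assumed sample element of SU(2), of the form [[z, w], [-conj(w), conj(z)]]
# with |z|^2 + |w|^2 = 1.
G_SAMPLE = [[complex(0.5, 0.5), complex(0.5, 0.5)],
            [complex(-0.5, 0.5), complex(0.5, -0.5)]]

def mul(a, b):
    """Product of two 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose; equals the inverse for unitary matrices."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def inner(a, b):
    """The inner product (A, B) = -1/2 trace(AB) on su(2)."""
    p = mul(a, b)
    return (-0.5 * (p[0][0] + p[1][1])).real

def ad_matrix(g):
    """3x3 real matrix of x -> g x g^{-1} with respect to BASIS."""
    ginv = dagger(g)
    images = [mul(mul(g, e), ginv) for e in BASIS]
    return [[inner(BASIS[i], images[j]) for j in range(3)] for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
```

Writing `R = ad_matrix(G_SAMPLE)`, one expects $R R^t$ to be the identity and `det3(R)` to equal one up to rounding, in line with the example.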
Definition 5.48. If a Lie group homomorphism $p : \tilde{G} \to G$ is also a covering map, then we say that $\tilde{G}$ is a covering group and p is a covering homomorphism. If $\tilde{G}$ is simply connected, then $\tilde{G}$ (resp. p) is called the universal covering group (resp. universal covering homomorphism) of G.
Exercise 5.49. Show that if $p : M \to G$ is a smooth covering map and G is a Lie group, then M can be given a unique Lie group structure such that p becomes a covering homomorphism. (You may assume that M is paracompact.)
Example 5.50. The group Mob of Möbius transformations of the complex plane given by $T_A : z \mapsto \frac{az + b}{cz + d}$ for $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{C})$ can be given the structure of a Lie group. The map $p : SL(2, \mathbb{C}) \to \mathrm{Mob}$ given by $p : A \mapsto T_A$ is onto but not injective. In fact, it is a (two fold) covering homomorphism. When do two elements of $SL(2, \mathbb{C})$ map to the same element of Mob?
5.4. Lie Algebras and Exponential Maps
Definition 5.51. A vector field $X \in \mathfrak{X}(G)$ is called left invariant if and only if $(L_g)_* X = X$ for all $g \in G$. A vector field $X \in \mathfrak{X}(G)$ is called right invariant if and only if $(R_g)_* X = X$ for all $g \in G$. The set of left invariant (resp. right invariant) vector fields is denoted $\mathfrak{X}_L(G)$ (resp. $\mathfrak{X}_R(G)$).
Recall that by definition $(L_g)_* X = TL_g \circ X \circ L_g^{-1}$, and so left invariance means that $TL_g \circ X \circ L_g^{-1} = X$ or that given any $x \in G$ we have $T_x L_g \cdot X(x) = X(gx)$ for all $g \in G$. Thus $X \in \mathfrak{X}(G)$ is left invariant if and only if the following diagram commutes for every $g \in G$:
$$\begin{array}{ccc} TG & \xrightarrow{\;TL_g\;} & TG \\ {\scriptstyle X}\uparrow & & \uparrow{\scriptstyle X} \\ G & \xrightarrow{\;L_g\;} & G \end{array}$$
There is a similar diagram for right invariance.
Lemma 5.52. $\mathfrak{X}_L(G)$ is closed under the Lie bracket operation.
Proof. Suppose that $X, Y \in \mathfrak{X}_L(G)$. Then by Proposition 2.84 we have
$$(L_g)_*[X, Y] = [L_{g*}X, L_{g*}Y] = [X, Y]. \qquad \square$$
Given a vector $v \in T_e G$, we can define a smooth left (resp. right) invariant vector field $L^v$ (resp. $R^v$) such that $L^v(e) = v$ (resp. $R^v(e) = v$) by the simple prescription
$$L^v(g) = TL_g \cdot v \quad (\text{resp. } R^v(g) = TR_g \cdot v).$$
A bit more precisely, $L^v(g) = T_e(L_g) \cdot v$. The proof that this prescription gives smooth invariant vector fields is left to the reader (see Problem 7). Given a vector in $T_e G$ there are various notations for denoting the corresponding left (or right) invariant vector field, and we shall have occasion to use some different notation later on. We will also write L(v) for $L^v$ and R(v) for $R^v$. The map $v \mapsto L(v)$ (resp. $v \mapsto R(v)$) is a linear isomorphism from $T_e G$ onto $\mathfrak{X}_L(G)$ (resp. $\mathfrak{X}_R(G)$):
Exercise 5.53. Show that $v \mapsto L^v$ gives a linear isomorphism $T_e G \cong \mathfrak{X}_L(G)$. Similarly, $T_e G \cong \mathfrak{X}_R(G)$ by $v \mapsto R(v)$.
We now restrict attention to the left invariant fields but keep in mind that essentially all of what we say for this case has analogies in the right invariant case. We will discover a conduit (the adjoint map) between the two cases. The linear isomorphism $T_e G \cong \mathfrak{X}_L(G)$ just discovered shows that $\mathfrak{X}_L(G)$ is, in fact, a vector space of finite dimension equal to the dimension of G. From this and Lemma 5.52 we immediately obtain the following:
Proposition 5.54. If G is a Lie group of dimension n, then $\mathfrak{X}_L(G)$ is an n-dimensional Lie algebra under the bracket of vector fields (see Definition 2.75).
Using the isomorphism $T_e G \cong \mathfrak{X}_L(G)$, we can transfer the Lie algebra structure to $T_e G$. This is the content of the following:
Definition 5.55. For a Lie group G, define the bracket of any two elements $v, w \in T_e G$ by
$$[v, w] := [L^v, L^w](e).$$
With this bracket, the vector space $T_e G$ becomes a Lie algebra (see Definition 2.78), and so we now have two Lie algebras, $\mathfrak{X}_L(G)$ and $T_e G$, which are isomorphic by construction. The abstract Lie algebra isomorphic to either/both of them is often referred to as the Lie algebra of the Lie group G and denoted variously by $\mathcal{L}(G)$ or $\mathfrak{g}$. Of course, we are implying that $\mathcal{L}(H)$ is denoted $\mathfrak{h}$ and $\mathcal{L}(K)$ by $\mathfrak{k}$, etc. In some computations we will have to use a specific realization of $\mathfrak{g}$. Our default convention will be that $\mathfrak{g} = \mathcal{L}(G) := T_e G$ with the bracket defined above.
Definition 5.56. Given a Lie algebra $\mathfrak{g}$, we can associate to every basis $v_1, \dots, v_n$ for $\mathfrak{g}$ the structure constants $c^k_{ij}$, which are defined by
$$[v_i, v_j] = \sum_k c^k_{ij} v_k \quad \text{for } 1 \leq i, j, k \leq n.$$
It follows from the skew symmetry of the Lie bracket and the Jacobi identity that the structure constants satisfy
(5.1)
i) $c^k_{ij} = -c^k_{ji}$,
ii) $\sum_k \left( c^k_{rs} c^i_{kt} + c^k_{st} c^i_{kr} + c^k_{tr} c^i_{ks} \right) = 0$.
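As an illustration of the identities (5.1), structure constants can be computed for a concrete three-dimensional example. The sketch below, in plain Python, assumes the Lie algebra of $3 \times 3$ skew-symmetric matrices under the commutator bracket with a standard basis chosen here for convenience; the names `BASIS`, `coords` and `C` are hypothetical, and `C[i][j][k]` stores $c^k_{ij}$ with indices starting at 0.

```python
# An assumed basis of the 3x3 skew-symmetric matrices.
BASIS = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]

def matmul(a, b):
    """Product of two 3x3 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def bracket(a, b):
    """Commutator ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def coords(m):
    """Coefficients of a skew-symmetric matrix m with respect to BASIS."""
    return (m[2][1], m[0][2], m[1][0])

# C[i][j][k] is the coefficient of BASIS[k] in [BASIS[i], BASIS[j]].
C = [[coords(bracket(BASIS[i], BASIS[j])) for j in range(3)] for i in range(3)]

def skew_ok(c):
    """Identity (5.1) i): antisymmetry in the lower indices."""
    return all(c[i][j][k] == -c[j][i][k]
               for i in range(3) for j in range(3) for k in range(3))

def jacobi_ok(c):
    """Identity (5.1) ii): the Jacobi identity in terms of structure constants."""
    return all(
        sum(c[r][s][k] * c[k][t][i] + c[s][t][k] * c[k][r][i]
            + c[t][r][k] * c[k][s][i] for k in range(3)) == 0
        for r in range(3) for s in range(3) for t in range(3) for i in range(3))
```

For this basis the bracket of the first two basis elements is the third, so `C[0][1]` is `(0, 0, 1)`, and both `skew_ok(C)` and `jacobi_ok(C)` should return `True`.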
The structure constants characterize the Lie algebra, and structure constants are sometimes used to actually define a Lie algebra once a basis is chosen. We will meet the structure constants again later.
Let $\mathfrak{a}$ and $\mathfrak{b}$ be Lie algebras. For $(a_1, b_1)$ and $(a_2, b_2)$ elements of the vector space $\mathfrak{a} \times \mathfrak{b}$, define
$$[(a_1, b_1), (a_2, b_2)] := ([a_1, a_2], [b_1, b_2]).$$
With this bracket, $\mathfrak{a} \times \mathfrak{b}$ is a Lie algebra called the Lie algebra product of $\mathfrak{a}$ and $\mathfrak{b}$. Recall the definition of an ideal in a Lie algebra (Definition 2.79). The subspaces $\mathfrak{a} \times \{0\}$ and $\{0\} \times \mathfrak{b}$ are ideals in $\mathfrak{a} \times \mathfrak{b}$ that are clearly isomorphic to $\mathfrak{a}$ and $\mathfrak{b}$ respectively. We often identify $\mathfrak{a}$ with $\mathfrak{a} \times \{0\}$ and $\mathfrak{b}$ with $\{0\} \times \mathfrak{b}$.
Exercise 5.57. Show that if G and H are Lie groups, then the Lie algebra $\mathfrak{g} \times \mathfrak{h}$ is (up to identifications) the Lie algebra of $G \times H$.
Definition 5.58. Given two Lie algebras over a field $\mathbb{F}$, say $(\mathfrak{a}, [\,,]_{\mathfrak{a}})$ and $(\mathfrak{b}, [\,,]_{\mathfrak{b}})$, an $\mathbb{F}$-linear map $\alpha : \mathfrak{a} \to \mathfrak{b}$ is called a Lie algebra homomorphism if and only if
$$\alpha([v, w]_{\mathfrak{a}}) = [\alpha v, \alpha w]_{\mathfrak{b}}$$
for all $v, w \in \mathfrak{a}$. A Lie algebra isomorphism is defined in the obvious way. A Lie algebra isomorphism $\mathfrak{g} \to \mathfrak{g}$ is called an automorphism of $\mathfrak{g}$.
It is not hard to show that the set of all automorphisms of $\mathfrak{g}$, denoted $\mathrm{Aut}(\mathfrak{g})$, forms a Lie group (actually a Lie subgroup of $GL(\mathfrak{g})$).
Let V be a finite-dimensional real vector space. Then GL(V) is an open subset of the linear space L(V, V), and we identify the tangent bundle of GL(V) with $GL(V) \times L(V, V)$ (recall Definition 2.58). The tangent space at $A \in GL(V)$ is then $\{A\} \times L(V, V)$. Now the Lie algebra is $T_I GL(V)$, and it has a Lie algebra structure derived from the Lie algebra structure on $\mathfrak{X}_L(GL(V))$. The natural isomorphism of $T_I GL(V)$ with L(V, V) puts a Lie algebra structure on L(V, V). We now show that the resulting bracket on L(V, V) is just the commutator bracket given by $[A, B] := A \circ B - B \circ A$. In the following discussion we let $\tilde{X}$ denote the left invariant vector field corresponding to $X \in L(V, V) \cong T_I(GL(V))$. By definition we have $\widetilde{[A, B]} = [\tilde{A}, \tilde{B}]$. We will need some simple results from the following easy exercises.
Exercise 5.59. For a fixed $A \in GL(V)$, the map $L_A : GL(V) \to GL(V)$ given by $B \mapsto A \circ B$ has tangent map given by $(B, X) \mapsto (A \circ B, A \circ X)$, where $(B, X) \in GL(V) \times L(V, V) \cong T(GL(V))$. Show that if $\tilde{X}$ denotes the left invariant vector field corresponding to $X \in L(V, V)$, then $\tilde{X}(A) = (A, AX)$.
We are going to consider functions on GL(V) that are restrictions of linear functionals on the vector space L(V, V). We will not notationally distinguish the functional from its restriction. If f is such a linear function and $X \in L(V, V)$, let $f_{,X}$ be given by $f_{,X}(A) := f(A \circ X)$. It is easy to check that $f_{,X}$ is also a linear functional, and so for $Y \in L(V, V)$ we also have $(f_{,X})_{,Y}$. It is clear that $(f_{,X})_{,Y} = f_{,Y \circ X}$ (notice the reversal of order).
Exercise 5.60. Show that the map $L(V, V) \to (L(V, V))^*$ given by $X \mapsto f_{,X}$ is linear over $\mathbb{R}$.
Exercise 5.61. Let $\tilde{X}$ be the left invariant field corresponding to $X \in L(V, V)$ as above. If f is the restriction to GL(V) of a linear functional on L(V, V) as above, then $\tilde{X}f = f_{,X}$. Solution/Hint: $(\tilde{X}f)(A) = df|_A(\tilde{X}_A) = f(A \circ X)$. Recall that if one picks a basis for V, then we obtain a global chart on
GL(V) which is given by coordinate functions $x^i_j$ defined by letting $x^i_j(A)$ be the ij-th entry of the matrix of A with respect to the chosen basis. These coordinate functions are restrictions of linear functions on L(V, V). Using this we see that if $f_{,X} = f_{,Y}$ for all linear f, then $f(X) = f_{,X}(I) = f_{,Y}(I) = f(Y)$ for all linear f, and this in turn implies X = Y.
Proposition 5.62. The Lie algebra bracket on L(V, V) induced by the isomorphisms $\mathfrak{X}_L(GL(V)) \cong T_I GL(V) \cong L(V, V)$ is the commutator bracket
$$[X, Y] := X \circ Y - Y \circ X.$$
Proof. Let $X, Y \in L(V, V)$ and let [X, Y] denote the bracket induced on L(V, V). Then for any linear f we have
$$f_{,[X,Y]} = \widetilde{[X, Y]}f = [\tilde{X}, \tilde{Y}]f = \tilde{X}\tilde{Y}f - \tilde{Y}\tilde{X}f = \tilde{X}(f_{,Y}) - \tilde{Y}(f_{,X}) = (f_{,Y})_{,X} - (f_{,X})_{,Y} = f_{,X \circ Y} - f_{,Y \circ X} = f_{,(X \circ Y - Y \circ X)}.$$
Since f was an arbitrary linear functional, we conclude that $[X, Y] = X \circ Y - Y \circ X$. $\square$
A choice of basis gives the algebra isomorphism $L(V, V) \cong M_{n \times n}(\mathbb{R})$ where n = dim(V). This isomorphism preserves brackets and restricts to a group isomorphism $GL(V) \cong GL(n, \mathbb{R})$. It is now easy to arrive at the following corollary.
Corollary 5.63. The linear isomorphism of $\mathfrak{gl}(n, \mathbb{R}) = T_I GL(n, \mathbb{R})$ with $M_{n \times n}(\mathbb{R})$ induces a Lie algebra structure on $M_{n \times n}(\mathbb{R})$ such that the bracket is given by [A, B] := AB - BA.
From now on we follow the practice of identifying the Lie algebra $\mathfrak{gl}(V)$ of GL(V) with the commutator Lie algebra L(V, V). Similarly for the matrix group, the Lie algebra of $GL(n, \mathbb{R})$ is taken to be $M_{n \times n}(\mathbb{R})$ with commutator bracket. If $G \subset GL(n)$ is some matrix group, then $T_I G$ may be identified with a linear subspace of $M_{n \times n}$. This linear subspace will be closed under the commutator bracket, and so we actually have an identification of Lie algebras: $\mathfrak{g}$ is identified with a subspace of $M_{n \times n}$. It is often the case that G is defined by some matrix equation or equations. By differentiating these equations we find the defining equations for $\mathfrak{g}$ (as a subspace of $M_{n \times n}$). We first prove a general result for Lie algebras of closed subgroups, and then we apply this to some matrix groups. Recall that if N is a submanifold of M and $\iota : N \hookrightarrow M$ is the inclusion map, then we identify $T_p N$ with $T_p\iota(T_p N)$ and $T_p\iota$ is an inclusion. If H is a closed Lie subgroup of a Lie group G, and $v \in \mathfrak{h} = T_e H$, then v corresponds to a left invariant vector field on G which is obtained by using the left translation in G. But v also corresponds to a left invariant vector field on H obtained from left translations in H. The notation we have been using so far is not sensitive to this distinction, so let us introduce an alternative notation.
Notation 5.64. For a Lie group G, we have the alternative notation $v^G$ for the left invariant vector field whose value at e is v. If H is a closed Lie subgroup of G and $v \in T_e H$, then $v^G \in \mathfrak{X}_L(G)$ while $v^H \in \mathfrak{X}_L(H)$.
Proposition 5.65. Let H be a closed Lie subgroup of a Lie group G. Let
$$\tilde{\mathfrak{X}}_L(H) := \{X \in \mathfrak{X}_L(G) : X(e) \in T_e H\}.$$
Then the restrictions of elements of $\tilde{\mathfrak{X}}_L(H)$ to the submanifold H are the elements of $\mathfrak{X}_L(H)$. This induces an isomorphism of Lie algebras of vector fields $\tilde{\mathfrak{X}}_L(H) \cong \mathfrak{X}_L(H)$. For $v, w \in \mathfrak{h}$, we have
$$[v, w]_{\mathfrak{h}} = [v^H, w^H]_e = [v^G, w^G]_e = [v, w]_{\mathfrak{g}},$$
and so $\mathfrak{h}$ is a Lie subalgebra of $\mathfrak{g}$; the bracket on $\mathfrak{h}$ is the same as that inherited from $\mathfrak{g}$.
Proof. The Lie bracket on $\mathfrak{h} = T_e H$ is given by $[v, w] := [v^H, w^H](e)$. Notice that if H is a closed Lie subgroup of a Lie group G, then for $h \in H$ we have left translation by h as a map $G \to G$ and also as a map $H \to H$. The latter is the restriction of the former. To avoid notational clutter, let us denote
$L_h|_H$ by $l_h$. If $\iota : H \hookrightarrow G$ is the inclusion, then we have $\iota \circ l_h = L_h \circ \iota$, and so $T\iota \circ Tl_h = TL_h \circ T\iota$. If $v^H \in \mathfrak{X}_L(H)$, then we have
$$T_h\iota(v^H(h)) = T_h\iota(T_e l_h(v)) = T\iota \circ Tl_h(v) = TL_h \circ T\iota(v) = T_e L_h(T_e\iota(v)) = T_e L_h(v) = v^G(h) = (v^G \circ \iota)(h),$$
so that $v^H$ and $v^G$ are $\iota$-related for any $v \in T_e H \subset T_e G$. Thus for $v, w \in T_e H$ we have $[v, w]_{\mathfrak{h}} = [v^H, w^H]_e = [v^G, w^G]_e = [v, w]_{\mathfrak{g}}$, the formula we wanted. Next, notice that if we take $T_h\iota$ as an inclusion so that $T_h\iota(v^H(h)) = v^H(h)$ for all h, then we have really shown that if $v \in T_e H$, then $v^H$ is the restriction of $v^G$ to H. Also it is easy to see that
$$\tilde{\mathfrak{X}}_L(H) = \{v^G : v \in T_e H\},$$
and so the restrictions of elements of $\tilde{\mathfrak{X}}_L(H)$ are none other than the elements of $\mathfrak{X}_L(H)$. From what we have shown, the restriction map $\tilde{\mathfrak{X}}_L(H) \to \mathfrak{X}_L(H)$ is given by $v^G \mapsto v^H$ and is a surjective Lie algebra homomorphism. It also has kernel zero since if $v^H$ is the zero vector field, then v = 0, which implies that $v^G$ is the zero vector field. $\square$
Because $[v, w]_{\mathfrak{h}} = [v, w]_{\mathfrak{g}}$, the inclusion $\mathfrak{h} \hookrightarrow \mathfrak{g}$ is a Lie algebra homomorphism. Examining the details of the previous proof we see that we have a commutative diagram
$$\begin{array}{ccc} \tilde{\mathfrak{X}}_L(H) & \longrightarrow & \mathfrak{X}_L(H) \\ & \nwarrow \quad \nearrow & \\ & \mathfrak{h} & \end{array}$$
of Lie algebra homomorphisms where the top horizontal map is a restriction to H, the left diagonal map is $v \mapsto v^G$ and the right diagonal map is $v \mapsto v^H$. In practice, what this last proposition shows is that in order to find the Lie algebra of a closed subgroup $H \subset G$, we only need to find the subspace $\mathfrak{h} = T_e H$, since the bracket on $\mathfrak{h}$ is just the restriction of the bracket on $\mathfrak{g}$. The following is also easily seen to be a commutative diagram of Lie algebra homomorphisms:
$$\begin{array}{ccc} \tilde{\mathfrak{X}}_L(H) & \hookrightarrow & \mathfrak{X}_L(G) \\ \uparrow & & \uparrow \\ \mathfrak{h} & \hookrightarrow & \mathfrak{g} \end{array}$$
where both vertical maps are $v \mapsto v^G$.
210
5. Lie Groups
The correspondence between Lie algebras and Lie groups works in the other direction too:

Theorem 5.66. Let $G$ be a Lie group with Lie algebra $\mathfrak{g}$. If $\mathfrak{h}$ is a Lie subalgebra of $\mathfrak{g}$, then there is a unique connected Lie subgroup $H$ of $G$ whose Lie algebra is $\mathfrak{h}$.

The above theorem uses the Frobenius integrability theorem, so we defer the proof until we have that theorem in hand. Since the Lie algebra of $GL(n)$ is the set of all $n \times n$ matrices with the commutator bracket, and since we have just shown that the bracket for the Lie algebra of a subgroup is just the restriction of the bracket on the containing group, we see that the bracket on the Lie algebra of a matrix subgroup can also be taken to be the commutator bracket if that Lie algebra is represented by the appropriate space of matrices. We record this as a proposition:

Proposition 5.67. If $G \subset GL(V)$ is a linear Lie group, then the Lie algebra of $G$ may be identified with a subalgebra of $\mathrm{End}(V)$ with the commutator bracket. A similar statement holds for matrix Lie algebras.

Example 5.68. Consider the orthogonal group $O(n) \subset GL(n)$. Given a curve of orthogonal matrices $Q(t)$ with $Q(0) = I$ and $\frac{d}{dt}\big|_{t=0} Q(t) = A$, we compute by differentiating the defining equation $I = Q^tQ$:
$$0 = \frac{d}{dt}\Big|_{t=0} Q^tQ = \Big(\frac{d}{dt}\Big|_{t=0} Q\Big)^t Q(0) + Q^t(0)\Big(\frac{d}{dt}\Big|_{t=0} Q\Big) = A^t + A,$$
so that the tangent space $T_IO(n)$ is contained in the space of skew-symmetric matrices. But both $T_IO(n)$ and the space of skew-symmetric matrices have dimension $n(n-1)/2$, so they are equal. This means that we can identify the Lie algebra $\mathfrak{o}(n) = \mathcal{L}(O(n))$ with the space of skew-symmetric matrices with the commutator bracket. One can easily check that the commutator bracket of two such matrices is skew-symmetric, as expected.

We have considered matrix groups as subgroups of $GL(n)$, but it is often more convenient to consider subgroups of $GL(n, \mathbb{C})$. Since $GL(n, \mathbb{C})$ can be identified with a subgroup of $GL(2n)$, this is only a slight change in viewpoint. The essential parts of our discussion go through for $GL(n, \mathbb{C})$ without any significant change.

Example 5.69. Consider the unitary group $U(n) \subset GL(n, \mathbb{C})$. Given a curve of unitary matrices $Q(t)$ with $Q(0) = I$ and $\frac{d}{dt}\big|_{t=0} Q(t) = A$, we compute by differentiating the defining equation $I = \bar{Q}^tQ$. We have
$$0 = \frac{d}{dt}\Big|_{t=0} \bar{Q}^tQ = \Big(\frac{d}{dt}\Big|_{t=0} \bar{Q}\Big)^t Q(0) + \bar{Q}^t(0)\Big(\frac{d}{dt}\Big|_{t=0} Q\Big) = \bar{A}^t + A.$$
Examining dimensions as before, we see that we can identify $\mathfrak{u}(n)$ with the space of skew-Hermitian matrices ($\bar{A}^t = -A$) under the commutator bracket.

Along the lines of the above examples, each of the familiar matrix Lie groups has a Lie algebra presented as a matrix Lie algebra with commutator bracket. This subalgebra is defined in terms of simple conditions as in the examples above, and the chart below lists some of the more common examples.

Group                  Lie algebra                      Conditions defining the Lie algebra
$SL(n, \mathbb{R})$    $\mathfrak{sl}(n, \mathbb{R})$   $\mathrm{Trace}(A) = 0$
$O(n)$                 $\mathfrak{o}(n)$                $A^t = -A$
$SO(n)$                $\mathfrak{so}(n)$               $A^t = -A$
$U(n)$                 $\mathfrak{u}(n)$                $\bar{A}^t = -A$
$SU(n)$                $\mathfrak{su}(n)$               $\bar{A}^t = -A$, $\mathrm{Trace}(A) = 0$
$Sp(2n, \mathbb{F})$   $\mathfrak{sp}(2n, \mathbb{F})$  $JA^tJ = A$

In the chart above the matrix $J$ is given by
$$J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}.$$
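The closure claims behind the chart can be spot-checked by machine. The following is a minimal Python sketch, using helper names of our own choosing (`matmul`, `bracket`, `is_skew`, `trace`) and arbitrary integer test matrices; it checks that the commutator of two skew-symmetric matrices is again skew-symmetric and that a commutator always has trace zero.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bracket(A, B):
    """Commutator bracket [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]


def is_skew(A):
    """True if A^t = -A, the condition defining o(n) in the chart."""
    n = len(A)
    return all(A[i][j] == -A[j][i] for i in range(n) for j in range(n))


def trace(A):
    """Sum of the diagonal entries."""
    return sum(A[i][i] for i in range(len(A)))


# Two arbitrary skew-symmetric integer matrices (illustrative values only).
A = [[0, 1, -2], [-1, 0, 3], [2, -3, 0]]
B = [[0, 4, 5], [-4, 0, -6], [-5, 6, 0]]
C = bracket(A, B)

# An arbitrary matrix that is not skew-symmetric, to test the trace condition.
X = [[1, 2, 0], [3, 4, 5], [0, 6, 7]]

print("A, B, [A, B] skew-symmetric:", is_skew(A), is_skew(B), is_skew(C))
print("trace of [X, A]:", trace(bracket(X, A)))
```

Integer entries keep the arithmetic exact, so the checks do not depend on floating-point tolerances. A check on examples is of course not a proof; the general statement follows from $[A, B]^t = [B^t, A^t]$.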
Exercise 5.70. Find the defining conditions for the Lie algebra of the semiorthogonal group $O(p, q)$.

We would like to relate Lie group homomorphisms to Lie algebra homomorphisms.

Proposition 5.71. Let $f : G_1 \to G_2$ be a Lie group homomorphism. The map $T_ef : \mathfrak{g}_1 \to \mathfrak{g}_2$ is a Lie algebra homomorphism called the Lie differential, which is denoted in this context by $df : \mathfrak{g}_1 \to \mathfrak{g}_2$.

Proof. For $v \in \mathfrak{g}_1$ and $x \in G_1$, we have
$$T_xf \cdot L^v(x) = T_xf \cdot (T_eL_x \cdot v) = T_e(f \circ L_x) \cdot v = T_e(L_{f(x)} \circ f) \cdot v = T_eL_{f(x)}(T_ef \cdot v) = L^{df(v)}(f(x)),$$
so $L^v \sim_f L^{df(v)}$. Thus by Proposition 2.84 we have that for any $v, w \in \mathfrak{g}_1$,
$$L^{[v,w]} \sim_f [L^{df(v)}, L^{df(w)}].$$
In other words, $[L^{df(v)}, L^{df(w)}] \circ f = Tf \circ L^{[v,w]}$, which at $e$ gives
$$[df(v), df(w)] = df([v, w]). \qquad \square$$
Theorem 5.72. Invariant vector fields are complete. The integral curves through the identity element are the one-parameter subgroups.

Proof. We prove the left invariant case since the right invariant case is similar. Let $X$ be a left invariant vector field and $c : (a, b) \to G$ be an integral curve of $X$ with $\dot{c}(0) = X(p)$. Let $a < t_1 < t_2 < b$ and choose an element $g \in G$ such that $gc(t_1) = c(t_2)$. Let $\Delta t = t_2 - t_1$, and define $\bar{c} : (a + \Delta t, b + \Delta t) \to G$ by $\bar{c}(t) = gc(t - \Delta t)$. Then we have
$$\frac{d}{dt}\bar{c}(t) = TL_g \cdot \dot{c}(t - \Delta t) = TL_g \cdot X(c(t - \Delta t)) = X(gc(t - \Delta t)) = X(\bar{c}(t)),$$
and so $\bar{c}$ is also an integral curve of $X$. On the intersection $(a + \Delta t, b)$ of their domains, $c$ and $\bar{c}$ are equal since they are both integral curves of the same field and since $\bar{c}(t_2) = gc(t_1) = c(t_2)$. Thus we can concatenate the curves to get a new integral curve defined on the larger domain $(a, b + \Delta t)$. Since this extension can be done again for the same fixed $\Delta t$, we see that $c$ can be extended to $(a, \infty)$. A similar argument shows that we can extend in the negative direction to get the needed extension of $c$ to $(-\infty, \infty)$.

Next assume that $c$ is the integral curve with $c(0) = e$. The proof that $c(s + t) = c(s)c(t)$ proceeds by considering $\gamma(t) = c(s)^{-1}c(s + t)$. Then $\gamma(0) = e$ and also
$$\dot{\gamma}(t) = TL_{c(s)^{-1}} \cdot \dot{c}(s + t) = TL_{c(s)^{-1}} \cdot X(c(s + t)) = X(c(s)^{-1}c(s + t)) = X(\gamma(t)).$$
By the uniqueness of integral curves we must have $c(s)^{-1}c(s + t) = c(t)$, which implies the result.

Conversely, suppose $c : \mathbb{R} \to G$ is a one-parameter subgroup, and let $X_e = \dot{c}(0)$. There is a left invariant vector field $X$ such that $X(e) = X_e$, namely $X = L^{X_e}$. We must show that the integral curve through $e$ of the field $X$ is exactly $c$. But for this we only need that $\dot{c}(t) = X(c(t))$ for all $t$. We have $c(t + s) = c(t)c(s)$ or $c(t + s) = L_{c(t)}c(s)$. Thus
$$\dot{c}(t) = \frac{d}{ds}\Big|_{s=0} c(t + s) = (T_eL_{c(t)}) \cdot \dot{c}(0) = X(c(t)). \qquad \square$$
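Proposition 5.73. Let $v \in \mathfrak{g} = T_eG$. We have the corresponding left invariant field $L^v$ and flow $\varphi^{L^v}_t$. Writing $\varphi^v(t) := \varphi^{L^v}_t(e)$, we have
$$\varphi^v(st) = \varphi^{sv}(t). \tag{5.2}$$
A similar statement holds with $R^v$ replacing $L^v$.

Proof. Let $u = st$. We have that
$$\frac{d}{dt}\Big|_{t=0} \varphi^v(st) = \frac{du}{dt}\,\frac{d}{du}\Big|_{u=0} \varphi^v(u) = sv. \qquad \square$$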
Theorem 5.74. Let $G$ be a Lie group. For a smooth curve $c : \mathbb{R} \to G$ with $c(0) = e$ and $\dot{c}(0) = v$, the following are all equivalent:

(i) $c$ is a one-parameter subgroup with $\dot{c}(0) = v$.
(ii) $c(t) =$

$$\frac{d}{dt}\Big|_{t=0} gc(t) = \frac{d}{dt}\Big|_{t=0} L_g(c(t)) = TL_g v = L^v(g)$$
for any $g$. In other words,
$$\frac{d}{dt}\Big|_{t=0} R_{c(t)}g = L^v(g)$$
for any $g$. But also, $R_{c(0)}g = ge = g$ and
Thus for $s, t \in \mathbb{R}$ and $v \in \mathfrak{g}$ we have
$$\exp((s + t)v) = \exp(sv)\exp(tv), \qquad \exp(-tv) = (\exp(tv))^{-1}.$$
Note that we usually use the same symbol "exp" for the exponential map of any Lie group, but we may also write $\exp_G$ to indicate that the group is $G$. By Proposition 5.73 we have
$$\exp(tv) = \varphi^{tv}(1) = \varphi^v(t) = \varphi^{L^v}_t(e).$$
Thus by Theorem 5.74 we obtain the following:

Proposition 5.76. For $v \in \mathfrak{g}$, the map $\mathbb{R} \to G$ given by $t \mapsto \exp(tv)$ is the one-parameter subgroup that is the integral curve of $L^v$.

Lemma 5.77. The map $\exp : \mathfrak{g} \to G$ is smooth.

Proof. Consider the map $\mathbb{R} \times G \times \mathfrak{g} \to G \times \mathfrak{g}$ given by
$$(t, g, v) \mapsto (g \cdot \exp(tv), v).$$
This map is easily seen to be the flow on $G \times \mathfrak{g}$ of the vector field $X : (g, v) \mapsto (L^v(g), 0)$ and so is smooth. The restriction of this smooth flow to the submanifold $\{1\} \times \{e\} \times \mathfrak{g}$ is $(1, e, v) \mapsto (\exp(v), v)$ and is also smooth. This clearly implies that exp is smooth also. $\square$

Note that $\exp(0) = e$. In the following theorem, we use the canonical identification of the tangent space of $T_eG$ at the zero element (that is, $T_0(T_eG)$) with $T_eG$ itself.

Theorem 5.78. The tangent map of the exponential map $\exp : \mathfrak{g} \to G$ is the identity at $0 \in T_eG = \mathfrak{g}$,
$$T_0\exp = \mathrm{id} : T_eG \to T_eG,$$
and exp is a diffeomorphism of some neighborhood of the origin onto its image in $G$.

Proof. By Lemma 5.77, we know that $\exp : \mathfrak{g} \to G$ is a smooth map. Also, $\frac{d}{dt}\big|_{t=0}\exp(tv) = v$, which means that the tangent map is $v \mapsto v$. The second assertion then follows from the inverse function theorem. If the reader thinks through the definitions carefully, he or she will discover that we have here used the identification of $\mathfrak{g}$ with $T_0\mathfrak{g}$. $\square$

From the definitions and Theorem 5.74 we have
$$\varphi^{L^v}_t(p) = p\exp tv, \qquad \varphi^{R^v}_t(p) = (\exp tv)\,p$$
for all $v \in \mathfrak{g}$, all $t \in \mathbb{R}$ and all $p \in G$.
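Proposition 5.79. For a (Lie group) homomorphism $f : G_1 \to G_2$, the following diagram commutes:
$$\begin{array}{ccc} \mathfrak{g}_1 & \xrightarrow{\;df\;} & \mathfrak{g}_2 \\ \exp_{G_1}\downarrow & & \downarrow\exp_{G_2} \\ G_1 & \xrightarrow{\;f\;} & G_2 \end{array}$$

Proof. For $v$ in the Lie algebra of $G_1$, the curve $t \mapsto f(\exp_{G_1}(tv))$ is clearly a one-parameter subgroup. Also,
$$\frac{d}{dt}\Big|_{t=0} f(\exp_{G_1}(tv)) = T_ef \cdot v = df(v),$$
and so by uniqueness of integral curves $f(\exp_{G_1}(tv)) = \exp_{G_2}(t\,df(v))$. $\square$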
The Lie algebra of a Lie group and the group itself are closely related in many ways, and the exponential map is often the key to understanding the connection. One simple observation is that by Theorem 5.6, if $G$ is a connected Lie group, then for any open neighborhood $V \subset \mathfrak{g}$ of $0$ the group generated by $\exp(V)$ is all of $G$. If $H$ is a Lie subgroup of $G$, then the inclusion $\iota : H \hookrightarrow G$ is an injective homomorphism and Proposition 5.79 tells us that the exponential map on $\mathfrak{h} \subset \mathfrak{g}$ is the restriction of the exponential map on $\mathfrak{g}$. Thus, to understand the exponential map for linear Lie groups, we must understand the exponential map for the general linear group.

Let $V$ be a finite-dimensional vector space. It will be convenient to pick an inner product $\langle \cdot, \cdot \rangle$ on $V$ and define the norm of $v \in V$ by $\|v\| := \sqrt{\langle v, v \rangle}$. In case $V$ is a complex vector space, we use a Hermitian inner product. We put a norm on the set of linear transformations $L(V, V)$ by
$$\|A\| = \sup_{\|v\| \neq 0} \frac{\|Av\|}{\|v\|}.$$
We have $\|A \circ B\| \leq \|A\|\,\|B\|$, which implies that $\|A^k\| \leq \|A\|^k$. If we use the identification of $\mathfrak{gl}(V)$ with $L(V, V)$ (or equivalently the identification of $\mathfrak{gl}(n, \mathbb{R})$ with the vector space of $n \times n$ matrices $M_{n \times n}$), then the exponential map is given by a power series
$$\exp(A) = \sum_{k=0}^{\infty} \frac{1}{k!}A^k,$$
which can be seen from the following argument: The sequence of partial sums $S_N := \sum_{k=0}^{N} \frac{1}{k!}A^k$ is a Cauchy sequence in the normed space $\mathfrak{gl}(V)$, since for $M < N$
$$\|S_N - S_M\| = \Big\|\sum_{k=M+1}^{N} \frac{1}{k!}A^k\Big\| \leq \sum_{k=M+1}^{N} \frac{1}{k!}\|A\|^k.$$
From this we see that $\|S_N - S_M\| \to 0$ as $M, N \to \infty$, because the right-hand side is a piece of the tail of the convergent series for $e^{\|A\|}$, and so $\{S_N\}$ is a Cauchy sequence. Since $\mathfrak{gl}(V)$ together with the given norm is known to be complete, we see that $\sum_{k=0}^{\infty} \frac{1}{k!}A^k$ converges. For a fixed $A \in \mathfrak{gl}(V)$, the function $\alpha : t \mapsto \alpha(t) = \exp(tA)$ is the unique solution of the initial value problem
$$\alpha'(t) = A\alpha(t), \qquad \alpha(0) = I.$$
This can be seen by differentiating term by term:
$$\frac{d}{dt}\exp(tA) = \frac{d}{dt}\sum_{k=0}^{\infty} \frac{1}{k!}t^kA^k = \sum_{k=1}^{\infty} \frac{1}{(k-1)!}t^{k-1}A^k = A\sum_{k=1}^{\infty} \frac{1}{(k-1)!}t^{k-1}A^{k-1} = A\exp(tA).$$
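The power series is easy to experiment with. The following Python sketch uses only the standard library and helper names of our own (`expm`, `matmul`, `scale`, `max_abs_diff`); the truncation at 40 terms and the test matrices are arbitrary illustrative choices. It sums the partial sums $S_N$ and compares $\exp((s+t)A)$ with $\exp(sA)\exp(tA)$, and $\exp(\theta J)$ with the rotation through the angle $\theta$.

```python
import math


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, A):
    """Scalar multiple c * A."""
    return [[c * a for a in row] for row in A]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def expm(A, terms=40):
    """Partial sum S_N = sum_{k=0}^{N} A^k / k! with N = terms."""
    n = len(A)
    total = identity(n)
    term = identity(n)
    for k in range(1, terms + 1):
        term = scale(1.0 / k, matmul(term, A))  # term is now A^k / k!
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return total


def max_abs_diff(A, B):
    """Largest entrywise difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


A = [[0.2, -1.0], [0.5, 0.3]]  # arbitrary test matrix
s, t = 0.4, 1.1

lhs = expm(scale(s + t, A))
rhs = matmul(expm(scale(s, A)), expm(scale(t, A)))
subgroup_defect = max_abs_diff(lhs, rhs)

# exp(theta * J) should be the rotation through theta, since J^2 = -I.
J = [[0.0, -1.0], [1.0, 0.0]]
theta = 0.7
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
rotation_defect = max_abs_diff(expm(scale(theta, J)), rotation)

print("exp((s+t)A) vs exp(sA)exp(tA):", subgroup_defect)
print("exp(theta J) vs rotation:", rotation_defect)
```

Both defects should be at the level of floating-point roundoff. For matrices of large norm a plain truncated series is a poor numerical method, so this is meant only as an illustration of the formula.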
Under our identifications, this says that $\alpha$ is the integral curve corresponding to the left invariant vector field determined by $A$. Thus we have a concrete realization of the exponential map for $\mathfrak{gl}(V)$ and, by restriction, for the Lie algebra of each Lie subgroup of $GL(V)$. Applying what we know about exponential maps in the abstract setting of a general Lie group we have in this concrete case $\exp((s + t)A) = \exp(sA)\exp(tA)$ and $\exp(-tA) = (\exp(tA))^{-1}$. Let $A, B \in \mathfrak{gl}(V)$. Then

On the other hand, suppose that $A \circ B = B \circ A$. Then we have
$$\exp(t(A + B)) = \sum_{m=0}^{\infty} \frac{1}{m!}t^m(A + B)^m = \sum_{m=0}^{\infty} \frac{1}{m!}t^m \sum_{j+k=m} \frac{m!}{j!\,k!}A^jB^k = \sum_{j=0}^{\infty}\sum_{k=0}^{\infty} \frac{1}{j!\,k!}t^{j+k}A^jB^k = \Big(\sum_{j=0}^{\infty} \frac{t^j}{j!}A^j\Big)\Big(\sum_{k=0}^{\infty} \frac{t^k}{k!}B^k\Big).$$
Thus in the case where $A$ commutes with $B$, we have
$$\exp(A + B) = \exp(A)\exp(B).$$

Since most Lie groups of interest in practice are linear Lie groups, it will pay to understand the exponential map a bit better in this case. Let $V$ be a finite-dimensional vector space equipped with an inner product as before, and take the induced norm on $\mathfrak{gl}(V)$. By Problem 18 we can define a map $\log : U \to \mathfrak{gl}(V)$, where
$$U = \{B \in GL(V) : \|B - I\| < 1\},$$
by using the power series:
$$\log B := \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k}(B - I)^k.$$
If we compute formally, then for $A \in \mathfrak{gl}(V)$,
$$\log(\exp A) = \Big(A + \frac{1}{2!}A^2 + \cdots\Big) - \frac{1}{2}\Big(A + \frac{1}{2!}A^2 + \cdots\Big)^2 + \frac{1}{3}\Big(A + \frac{1}{2!}A^2 + \cdots\Big)^3 + \cdots$$
$$= A + \Big(\frac{1}{2!}A^2 - \frac{1}{2}A^2\Big) + \Big(\frac{1}{3!}A^3 - \frac{1}{2}A^3 + \frac{1}{3}A^3\Big) + \cdots.$$
We will argue that the above makes sense if $\|A\| < \log 2$ and that there must be cancellations in the last line so that $\log(\exp A) = A$. In fact, $\|\exp A - I\| \leq e^{\|A\|} - 1$, and so the double series on the first line for $\log(\exp A)$ must converge absolutely if $e^{\|A\|} - 1 < 1$, that is, if $\|A\| < \log 2$. This means that we may freely rearrange terms and expect the same cancellations as we find for the analogous calculation of $\log(\exp z)$ for complex $z$ with $|z| < \log 2$. But since $\log(\exp z) = z$ for such $z$, we have the desired conclusion. Similarly one may argue that $\exp(\log B) = B$ if $\|B - I\| < 1$.
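Both the role of commutativity and the identity $\log(\exp A) = A$ can be illustrated numerically. The Python sketch below uses only the standard library; the helper names (`expm`, `logm`, `add`, `max_abs_diff`), the truncation orders and the test matrices are our own illustrative choices. The matrix `A_small` has Frobenius norm about $0.38$, hence operator norm below $\log 2$.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def expm(A, terms=40):
    """Truncated series sum_{k=0}^{terms} A^k / k!."""
    n = len(A)
    total, term = identity(n), identity(n)
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = add(total, term)
    return total


def logm(B, terms=200):
    """Truncated series sum_{k=1}^{terms} (-1)^(k+1) (B - I)^k / k."""
    n = len(B)
    X = [[B[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    total = [[0.0] * n for _ in range(n)]
    power = identity(n)
    for k in range(1, terms + 1):
        power = matmul(power, X)  # power is now (B - I)^k
        sign = 1.0 if k % 2 == 1 else -1.0
        total = [[total[i][j] + sign * power[i][j] / k for j in range(n)]
                 for i in range(n)]
    return total


def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


# log(exp A) = A for a matrix of norm less than log 2.
A_small = [[0.1, 0.2], [-0.3, 0.05]]
log_exp_defect = max_abs_diff(logm(expm(A_small)), A_small)

# A commuting pair (both of the form a*I + b*N with N nilpotent).
A1 = [[0.1, 0.2], [0.0, 0.1]]
B1 = [[0.5, 0.4], [0.0, 0.5]]
commuting_defect = max_abs_diff(expm(add(A1, B1)), matmul(expm(A1), expm(B1)))

# A non-commuting pair: exp(N1 + N2) differs from exp(N1) exp(N2).
N1 = [[0.0, 1.0], [0.0, 0.0]]
N2 = [[0.0, 0.0], [1.0, 0.0]]
noncommuting_defect = max_abs_diff(expm(add(N1, N2)),
                                   matmul(expm(N1), expm(N2)))

print("log(exp A) - A:", log_exp_defect)
print("commuting pair defect:", commuting_defect)
print("non-commuting pair defect:", noncommuting_defect)
```

The first two defects should be at roundoff level, while the third is of order one, which shows that the hypothesis $A \circ B = B \circ A$ cannot simply be dropped.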
Next, we prove a remarkable theorem that shows how an algebraic assumption can have implications in the differentiable category. First we need some notation.

Notation 5.80. If $S$ is any subset of a Lie group $G$, then we define
$$S^{-1} = \{s^{-1} : s \in S\},$$
and for any $x \in G$ we define
$$xS = \{xs : s \in S\}.$$

Theorem 5.81. An abstract subgroup $H$ of a Lie group $G$ is a (regular) submanifold and hence a closed Lie subgroup if and only if $H$ is a closed set in $G$.

Proof. First suppose that $H$ is a (regular) submanifold. Then $H$ is locally closed. That is, every point $x \in H$ has an open neighborhood $U$ such that $U \cap H$ is a relatively closed set in $U$. Let $U$ be such a neighborhood of the identity element $e$. We seek to show that $H$ is closed in $G$. Let $y \in \bar{H}$ and $x \in yU^{-1} \cap H$ (such an $x$ exists since $yU^{-1}$ is an open neighborhood of $y$). Thus $x \in H$ and $y \in xU$. This means that $y \in \bar{H} \cap xU$, and hence $x^{-1}y \in \bar{H} \cap U = H \cap U$. So $y \in H$, and we have shown that $H$ is closed.

Conversely, suppose that $H$ is a closed abstract subgroup of $G$. Since we can always use left/right translation to translate any point to the identity, it suffices to find a single-slice chart at $e$. This will show that $H$ is a regular submanifold. The strategy is to first find $\mathcal{L}(H) = \mathfrak{h}$ and then to exponentiate a neighborhood of $0 \in \mathfrak{h}$. First choose any inner product on $T_eG$ so we may take norms of vectors in $T_eG$. Choose a small neighborhood $\tilde{U}$ of $0 \in T_eG = \mathfrak{g}$ on which exp is a diffeomorphism, say $\exp : \tilde{U} \to U$, and denote the inverse by $\log_U : U \to \tilde{U}$. Define the set $\tilde{H}$ in $\tilde{U}$ by $\tilde{H} = \log_U(H \cap U)$.

Claim. If $h_n$ is a sequence in $\tilde{H}$ converging to zero and such that $u_n = h_n/\|h_n\|$ converges to $v \in \mathfrak{g}$, then $\exp(tv) \in H$ for all $t \in \mathbb{R}$.

Proof of the claim: Note that $th_n/\|h_n\| \to tv$ while $\|h_n\|$ converges to zero. But since $\|h_n\| \to 0$, we must be able to find a sequence $k(n) \in \mathbb{Z}$ such that $k(n)\|h_n\| \to t$. From this we have
$$\exp(k(n)h_n) = \exp\Big(k(n)\|h_n\|\,\frac{h_n}{\|h_n\|}\Big) \to \exp(tv).$$
But by the properties of exp proved previously, we have that $\exp(k(n)h_n) = (\exp(h_n))^{k(n)}$. But also $\exp(h_n) \in H \cap U \subset H$ and so $(\exp(h_n))^{k(n)} \in H$. Since $H$ is closed, we have
$$\exp(tv) = \lim_{n \to \infty} (\exp(h_n))^{k(n)} \in H.$$
Claim. If $W$ is the set of all $sv$ where $s \in \mathbb{R}$ and $v$ can be obtained as a limit $h_n/\|h_n\| \to v$ with $h_n \in \tilde{H}$ and $h_n \to 0$, then $W$ is a vector space.

Proof of the claim: We just need to show that if $h_n/\|h_n\| \to v$ and $h'_n/\|h'_n\| \to w$ with $h'_n, h_n \in \tilde{H}$, then there is a sequence of elements $h''_n$ from $\tilde{H}$ with $h''_n \to 0$ such that
$$\frac{h''_n}{\|h''_n\|} \to \frac{v + w}{\|v + w\|}.$$
Using the previous claim, observe that
$$h(t) := \log_U(\exp(tv)\exp(tw)) = (\log_U \circ \mu)(\exp(tv), \exp(tw)).$$
Here $\mu$ is the group multiplication map. But, by the first claim, $\exp(tv)$ and $\exp(tw)$ are in $H$ for all $t$, and so $h(t)$ is in $\tilde{H}$ for small $t$. By Exercise 5.41 and the fact that $T_e\log = \mathrm{id}$, we have that
$$\lim_{t \to 0} h(t)/t = h'(0) = v + w.$$
Thus,
$$\frac{h(t)}{\|h(t)\|} = \frac{h(t)/t}{\|h(t)/t\|} \to \frac{v + w}{\|v + w\|} \quad \text{as } t \downarrow 0.$$
Now just let $t_n \downarrow 0$ and let $h''_n := h(t_n)$. Notice that by the first claim, $\exp(W) \subset H$.

Claim. Let $W$ be the set from the last claim. Then $\exp(W)$ contains an open neighborhood of $e$ in $H$.

Proof of the claim: Let $W^{\perp}$ be the orthogonal complement of $W$ with respect to the inner product chosen above. Then we have $T_eG = W^{\perp} \oplus W$. It is not difficult to show that the map $\psi : W \oplus W^{\perp} \to G$ defined by
$$w + x \mapsto \exp(w)\exp(x)$$
is a diffeomorphism in a neighborhood of the origin in $T_eG$. Now suppose that $\exp(W)$ does not contain an open neighborhood of $e$ in $H$. Then we may choose a sequence $h_n \to e$ such that $h_n$ is in $H$ but not in $\exp(W)$. But this means that we can choose a corresponding sequence $(w_n, x_n) \in W \oplus W^{\perp}$ with $(w_n, x_n) \to 0$ and $\exp(w_n)\exp(x_n) \in H$ and yet $x_n \neq 0$. The space $W^{\perp}$ is closed and the unit sphere in $W^{\perp}$ is compact. After passing to a subsequence, we may assume that $x_n/\|x_n\| \to x \in W^{\perp}$, and of course $\|x\| = 1$. Now $\exp(w_n) \in H$ since $\exp(W) \subset H$, and $\exp(w_n)\exp(x_n) \in H$. Since $H$ is at least an algebraic subgroup, it must be that $\exp(x_n) \in H$ also, and so $x_n \in \tilde{H}$. But $x_n \to 0$ and $x_n/\|x_n\| \to x$, and so, since we now know that $x_n \in \tilde{H}$, we have that $x \in W$ by definition. This contradicts the fact that $\|x\| = 1$ and $x \in W^{\perp}$. Thus $\exp(W)$ must contain a neighborhood of $e$ in $H$.

Finally, we let $O \subset \exp(W)$ be a neighborhood of $e$ in $H$. The set $O$ must be of the form $O = H \cap V$ for some open $V$ in $T_eG$ containing $0$. By shrinking $V$ further we obtain a diffeomorphism $\psi|_V$. The inverse of this diffeomorphism,
0
~~~~W
5.5. The Adjoint Representation of a Lie Group

Definition 5.82. Fix an element $g \in G$. The map $C_g : G \to G$ defined by $C_g(x) = gxg^{-1}$ is a Lie group automorphism called the conjugation map, and the tangent map $T_eC_g : \mathfrak{g} \to \mathfrak{g}$, denoted $\mathrm{Ad}_g$, is called the adjoint map.

Proposition 5.83. The map $C_g : G \to G$ is a Lie group homomorphism. The map $C : g \mapsto C_g$ is a Lie group homomorphism $G \to \mathrm{Aut}(G)$.

Proof. See Problem 4. $\square$

Using Proposition 5.71, we get the following

Corollary 5.84. The map $\mathrm{Ad}_g : \mathfrak{g} \to \mathfrak{g}$ is a Lie algebra homomorphism.

Lemma 5.85. Let $f : M \times N \to N$ be a smooth map and define the partial map at $x \in M$ by $f_x(y) = f(x, y)$. Suppose that for every $x \in M$ the point $y_0$ is fixed by $f_x$: $f_x(y_0) = y_0$ for all $x$. Then the map $A_{y_0} : x \mapsto T_{y_0}f_x$ is a smooth map from $M$ to $GL(T_{y_0}N)$.

Proof. It suffices to show that $A_{y_0}$ composed with an arbitrary coordinate function from an atlas of charts on $GL(T_{y_0}N)$ is smooth. But $GL(T_{y_0}N)$ has an atlas consisting of a single chart. Namely, choose a basis $v_1, v_2, \ldots, v_n$ of $T_{y_0}N$ and let $v^1, v^2, \ldots, v^n$ be the dual basis of $T^*_{y_0}N$. Then $x^i_j : A \mapsto v^i(Av_j)$ is a typical coordinate function. Now we compose:
$$x^i_j \circ A_{y_0}(x) = v^i(A_{y_0}(x)v_j) = v^i(T_{y_0}f_x \cdot v_j).$$
It is enough to show that $T_{y_0}f_x \cdot v_j$ is smooth in $x$. But this is just the composition of the smooth maps $M \to TM \times TN \cong T(M \times N) \to TN$ given by
$$x \mapsto (0_x, v_j) \mapsto (\partial_1 f)(x, y_0) \cdot 0_x + (\partial_2 f)(x, y_0) \cdot v_j = T_{y_0}f_x \cdot v_j.$$
(Recall the discussion leading up to Lemma 2.28.) $\square$
Proposition 5.86. The map $\mathrm{Ad} : g \mapsto \mathrm{Ad}_g$ is a Lie group homomorphism $G \to GL(\mathfrak{g})$, which is called the adjoint representation of $G$.

Proof. We have
$$\mathrm{Ad}(g_1g_2) = T_eC_{g_1g_2} = T_e(C_{g_1} \circ C_{g_2}) = T_eC_{g_1} \circ T_eC_{g_2} = \mathrm{Ad}_{g_1} \circ \mathrm{Ad}_{g_2},$$
which shows that Ad is a group homomorphism. The smoothness follows from the previous lemma applied to the map $C : (g, x) \mapsto C_g(x)$. $\square$

Recall that for $v \in \mathfrak{g}$ we have the associated left invariant vector field $L^v$ as well as the right invariant field $R^v$. Using this notation, we have

Lemma 5.87. Let $v \in \mathfrak{g}$. Then $L^v(x) = R^{\mathrm{Ad}_x v}(x)$.

Proof. We calculate as follows:
$$L^v(x) = T_e(L_x) \cdot v = T(R_x)T(R_{x^{-1}})T_e(L_x) \cdot v = T(R_x)T(R_{x^{-1}} \circ L_x) \cdot v = R^{\mathrm{Ad}(x)v}(x). \qquad \square$$
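We now go one step further and take the differential of Ad.

Definition 5.88. For a Lie group $G$ with Lie algebra $\mathfrak{g}$, define the adjoint representation of $\mathfrak{g}$ as the map $\mathrm{ad} : \mathfrak{g} \to \mathfrak{gl}(\mathfrak{g})$ given by
$$\mathrm{ad} = T_e\mathrm{Ad} = d(\mathrm{Ad}).$$

Proposition 5.89. $\mathrm{ad}(v)w = [v, w]$ for all $v, w \in \mathfrak{g}$.

Proof. Let $v_1, \ldots, v_n$ be a basis for $\mathfrak{g}$ so that $\mathrm{Ad}(x)w = \sum_i a_i(x)v_i$ for some functions $a_i$. Let $c$ be a curve with $\dot{c}(0) = v$. Then we have
$$\mathrm{ad}(v)w = \frac{d}{dt}\Big|_{t=0} \mathrm{Ad}(c(t))w = \sum_i \Big(\frac{d}{dt}\Big|_{t=0} a_i(c(t))\Big)v_i = \sum_i (va_i)\,v_i.$$
On the other hand, by Lemma 5.87 we have
$$L^w(x) = R^{\mathrm{Ad}(x)w}(x) = R^{\sum_i a_i(x)v_i}(x) = \sum_i a_i(x)R^{v_i}(x).$$
From the fact that the bracket of any left invariant vector field with any right invariant vector field is zero, we have
$$[L^v, L^w] = \Big[L^v, \sum_i a_iR^{v_i}\Big] = \sum_i L^v(a_i)R^{v_i}.$$
Finally, using the equation for $\mathrm{ad}(v)w$ derived above, we have
$$[v, w] = [L^v, L^w](e) = \sum_i L^v(a_i)(e)R^{v_i}(e) = \sum_i L^v(a_i)(e)\,v_i = \sum_i (va_i)\,v_i = \mathrm{ad}(v)w. \qquad \square$$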
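The map $\mathrm{ad} : \mathfrak{g} \to \mathfrak{gl}(\mathfrak{g}) = \mathrm{End}(T_eG)$ is given as the tangent map at the identity of the map Ad, which is a Lie group homomorphism. Thus by Proposition 5.71 we obtain the following:

Proposition 5.90. $\mathrm{ad} : \mathfrak{g} \to \mathfrak{gl}(\mathfrak{g})$ is a Lie algebra homomorphism.

Since ad is defined as the Lie differential of Ad, Proposition 5.79 tells us that the following diagram commutes for any Lie group $G$:
$$\begin{array}{ccc} \mathfrak{g} & \xrightarrow{\;\mathrm{ad}\;} & \mathfrak{gl}(\mathfrak{g}) \\ \exp\downarrow & & \downarrow\exp \\ G & \xrightarrow{\;\mathrm{Ad}\;} & GL(\mathfrak{g}) \end{array}$$
On the other hand, for any $g \in G$, the map $C_g : x \mapsto gxg^{-1}$ is also a homomorphism, and so Proposition 5.79 applies again, giving the following commutative diagram:
$$\begin{array}{ccc} \mathfrak{g} & \xrightarrow{\;\mathrm{Ad}_g\;} & \mathfrak{g} \\ \exp\downarrow & & \downarrow\exp \\ G & \xrightarrow{\;C_g\;} & G \end{array}$$
In other words, $\exp(t\,\mathrm{Ad}_g v) = g\exp(tv)g^{-1}$ for any $g \in G$, $v \in \mathfrak{g}$ and $t \in \mathbb{R}$.

In the case of linear Lie groups $G \subset GL(V)$, we have identified $\mathfrak{g}$ with a subspace of $\mathfrak{gl}(V)$, which is in turn identified with $L(V, V)$. In this case, the exponential map is given by the power series as explained above. It is easy to show from the power series that $B \circ \exp(tA) \circ B^{-1} = \exp(tB \circ A \circ B^{-1})$ for any $A \in \mathfrak{gl}(V)$ and $B \in GL(V)$. In this special set of circumstances, we have
$$\mathrm{Ad}_B A = B \circ A \circ B^{-1}.$$
This is seen as follows:
$$\mathrm{Ad}_B A = \frac{d}{dt}\Big|_{t=0} B \circ \exp(tA) \circ B^{-1} = \frac{d}{dt}\Big|_{t=0} \exp(tB \circ A \circ B^{-1}) = B \circ A \circ B^{-1}.$$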
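For $2 \times 2$ matrices the two facts just used can be tested directly. The following Python sketch relies only on the standard library; the helpers `expm`, `inverse2` and `conjugate`, the truncation order and the test matrices are our own illustrative assumptions. It compares $B \exp(tA) B^{-1}$ with $\exp(tBAB^{-1})$ and approximates $\frac{d}{dt}\big|_{t=0} B\exp(tA)B^{-1}$ by a central difference.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, A):
    return [[c * a for a in row] for row in A]


def expm(A, terms=60):
    """Truncated series sum_{k=0}^{terms} A^k / k!."""
    n = len(A)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms + 1):
        term = scale(1.0 / k, matmul(term, A))
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return total


def inverse2(B):
    """Inverse of an invertible 2 x 2 matrix."""
    (a, b), (c, d) = B
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def conjugate(B, A):
    """B A B^{-1}, which the text identifies with Ad_B A."""
    return matmul(matmul(B, A), inverse2(B))


def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


A = [[0.3, -1.2], [0.7, 0.1]]   # arbitrary element of gl(2, R)
B = [[2.0, 1.0], [1.0, 1.0]]    # arbitrary invertible matrix
t = 0.8

# B exp(tA) B^{-1} versus exp(t B A B^{-1}).
conjugation_defect = max_abs_diff(conjugate(B, expm(scale(t, A))),
                                  expm(scale(t, conjugate(B, A))))

# Central difference approximation of d/dt at t = 0 of B exp(tA) B^{-1}.
h = 1e-4
plus = conjugate(B, expm(scale(h, A)))
minus = conjugate(B, expm(scale(-h, A)))
derivative = [[(p - m) / (2 * h) for p, m in zip(rp, rm)]
              for rp, rm in zip(plus, minus)]
derivative_defect = max_abs_diff(derivative, conjugate(B, A))

print("B exp(tA) B^-1 vs exp(t B A B^-1):", conjugation_defect)
print("finite difference vs B A B^-1:", derivative_defect)
```

The finite difference only approximates the derivative, so the second defect is limited by the step size $h$ rather than by roundoff.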
Earlier we noted that for a general Lie group we always have $\mathrm{Ad} \circ \exp = \exp \circ\, \mathrm{ad}$. In the current context of linear Lie groups, this can be written as
$$\exp(A) \circ B \circ \exp(-A) = \sum_{k=0}^{\infty} \frac{1}{k!}(\mathrm{ad}(A))^k B$$
for any $A \in \mathfrak{gl}(V)$ and any $B \in GL(V)$.

We end this section with a statement of the useful Campbell-Baker-Hausdorff (CBH) formula. A proof may be found in [Helg] or [Mich]. First notice that if $G$ is a Lie group with Lie algebra $\mathfrak{g}$, then for each $X \in \mathfrak{g}$ we have $\mathrm{ad}\,X \in L(\mathfrak{g}, \mathfrak{g})$. We choose a norm on $\mathfrak{g}$, and then we have a natural operator norm and consequent notion of convergence in $L(\mathfrak{g}, \mathfrak{g})$. The analytic function $\frac{\ln z}{z - 1}$ is defined by the power series
$$\frac{\ln z}{z - 1} := \sum_{n=0}^{\infty} \frac{(-1)^n}{n + 1}(z - 1)^n,$$
which converges in a small ball about $z = 1$. We may then use the corresponding power series to make sense of $(\ln A)(A - I)^{-1}$ for $A \in L(\mathfrak{g}, \mathfrak{g})$ sufficiently close to the identity map $I = \mathrm{id}_{\mathfrak{g}}$.

Theorem 5.91 (CBH Formula). Let $G$ be a Lie group and $\mathfrak{g}$ its Lie algebra. Let $f(z) = \frac{\ln z}{z - 1}$ be defined by a power series as above. Then for sufficiently small $x, y \in \mathfrak{g}$,
$$\exp(x)\exp(y) = \exp(C(x, y)),$$
where
$$C(x, y) = y + \int_0^1 f\big(e^{t\,\mathrm{ad}\,x}e^{\mathrm{ad}\,y}\big) \cdot x\,dt = x + y + \sum_{n=1}^{\infty} \frac{(-1)^n}{n + 1}\int_0^1 \Big(\sum_{\substack{k, l \geq 0 \\ k + l \geq 1}} \frac{t^k}{k!\,l!}(\mathrm{ad}\,x)^k(\mathrm{ad}\,y)^l\Big)^n x\,dt.$$

It follows from the above that
$$C(x, y) = \sum_{n=1}^{\infty} C_n(x, y),$$
where $C_1(x, y) = x + y$, $C_2(x, y) = \frac{1}{2}[x, y]$ and
$$C_3(x, y) = \frac{1}{12}\big([[x, y], y] + [[y, x], x]\big).$$
In particular,
$$\exp(tx)\exp(ty) = \exp\big(t(x + y) + O(t^2)\big)$$
for any $x, y$ and sufficiently small $t$. One can show, using the above results, that a connected Lie group $G$ is abelian if and only if $\mathfrak{g}$ is abelian (i.e., $[x, y] = 0$ for all $x, y \in \mathfrak{g}$). The CBH formula shows that the Lie algebra structure on $\mathfrak{g}$ locally determines the multiplicative structure on $G$.
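The first three terms $C_1, C_2, C_3$ can be compared with $\log(\exp(x)\exp(y))$ for small matrices. The Python sketch below is a numerical illustration only; the helpers (`expm`, `logm`, `bracket`, `lincomb`), the truncation orders and the choice of $x, y \in \mathfrak{sl}(2, \mathbb{R})$ scaled by $\varepsilon = 0.1$ are our own assumptions. It computes the error of the approximations $C_1$, $C_1 + C_2$ and $C_1 + C_2 + C_3$.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(*pairs):
    """Linear combination sum(c * M) over (c, M) pairs of equal-size matrices."""
    n = len(pairs[0][1])
    return [[sum(c * M[i][j] for c, M in pairs) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def expm(A, terms=40):
    """Truncated series sum_{k=0}^{terms} A^k / k!."""
    total, term = identity(len(A)), identity(len(A))
    for k in range(1, terms + 1):
        term = lincomb((1.0 / k, matmul(term, A)))
        total = lincomb((1.0, total), (1.0, term))
    return total


def logm(B, terms=60):
    """Truncated series sum_{k=1}^{terms} (-1)^(k+1) (B - I)^k / k."""
    n = len(B)
    X = lincomb((1.0, B), (-1.0, identity(n)))
    total = [[0.0] * n for _ in range(n)]
    power = identity(n)
    for k in range(1, terms + 1):
        power = matmul(power, X)
        sign = 1.0 if k % 2 == 1 else -1.0
        total = lincomb((1.0, total), (sign / k, power))
    return total


def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    return lincomb((1.0, matmul(A, B)), (-1.0, matmul(B, A)))


def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


eps = 0.1
x = [[0.0, eps], [0.0, 0.0]]   # eps times an upper triangular nilpotent
y = [[0.0, 0.0], [eps, 0.0]]   # eps times a lower triangular nilpotent

Z = logm(matmul(expm(x), expm(y)))  # plays the role of C(x, y)

C1 = lincomb((1.0, x), (1.0, y))
C2 = lincomb((0.5, bracket(x, y)))
C3 = lincomb((1.0 / 12, bracket(bracket(x, y), y)),
             (1.0 / 12, bracket(bracket(y, x), x)))

err1 = max_abs_diff(Z, C1)
err2 = max_abs_diff(Z, lincomb((1.0, C1), (1.0, C2)))
err3 = max_abs_diff(Z, lincomb((1.0, C1), (1.0, C2), (1.0, C3)))

print("error of C1:", err1)
print("error of C1 + C2:", err2)
print("error of C1 + C2 + C3:", err3)
```

Each additional term should reduce the error by roughly a factor of $\varepsilon$, in agreement with $C_n(x, y)$ being homogeneous of degree $n$ in $(x, y)$.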
5.6. The Maurer-Cartan Form

Let $G$ be a Lie group, and for each $g \in G$, define $\omega_G(g) : T_gG \to \mathfrak{g}$ and $\omega^{\mathrm{right}}_G(g) : T_gG \to \mathfrak{g}$ by
$$\omega_G(g)(X_g) = TL_{g^{-1}} \cdot X_g$$
and
$$\omega^{\mathrm{right}}_G(g)(X_g) = TR_{g^{-1}} \cdot X_g.$$
The maps $\omega_G : g \mapsto \omega_G(g)$ and $\omega^{\mathrm{right}}_G : g \mapsto \omega^{\mathrm{right}}_G(g)$ are $\mathfrak{g}$-valued 1-forms called the left Maurer-Cartan form and right Maurer-Cartan form respectively. We can view $\omega_G$ and $\omega^{\mathrm{right}}_G$ as maps $TG \to \mathfrak{g}$. For example, $\omega_G(X_g) := \omega_G(g)(X_g)$.

As we have seen, $GL(n)$ is an open set in a vector space, and so its tangent bundle is trivial, $TGL(n) \cong GL(n) \times M_{n \times n}$ (recall Definition 2.58). A general abstract Lie group $G$ is not an open subset of a vector space, but we are still able to show that $TG$ is trivial. There are two such trivializations obtained from the Maurer-Cartan forms. These are $\mathrm{triv}_L : TG \to G \times \mathfrak{g}$ and $\mathrm{triv}_R : TG \to G \times \mathfrak{g}$ defined by
$$\mathrm{triv}_L(v_g) = (g, \omega_G(v_g)), \qquad \mathrm{triv}_R(v_g) = (g, \omega^{\mathrm{right}}_G(v_g))$$
for $v_g \in T_gG$. Observe that $\mathrm{triv}_L^{-1}(g, v) = L^v(g)$ and $\mathrm{triv}_R^{-1}(g, v) = R^v(g)$. It is easy to check that $\mathrm{triv}_L$ and $\mathrm{triv}_R$ are trivializations in the sense of Definition 2.58. Thus we have the following:

Proposition 5.92. The tangent bundle of a Lie group is trivial: $TG \cong G \times \mathfrak{g}$.

We will refer to $\mathrm{triv}_L$ and $\mathrm{triv}_R$ as the (left and right) Maurer-Cartan trivializations. How do these two trivializations compare? There is no special reason to prefer left multiplication. We could have used right invariant vector fields as our means of producing the Lie algebra, and the whole theory would work "on the other side", so to speak. The bridge between left and right is the adjoint map:

Lemma 5.93 (Left-right lemma). For any $v \in \mathfrak{g}$ and $g \in G$ we have
$$\mathrm{triv}_R \circ \mathrm{triv}_L^{-1}(g, v) = (g, \mathrm{Ad}_g(v)).$$

Proof. We compute:
$$\mathrm{triv}_R \circ \mathrm{triv}_L^{-1}(g, v) = (g, TR_{g^{-1}}TL_g v) = (g, T(R_{g^{-1}}L_g) \cdot v) = (g, TC_g \cdot v) = (g, \mathrm{Ad}_g(v)). \qquad \square$$
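For a matrix group the left-right lemma can be made completely explicit. The short computation below is a worked illustration rather than part of the original argument; it assumes $G \subset GL(n)$ and identifies each tangent space $T_gG$ with a space of $n \times n$ matrices, so that $TL_g$ and $TR_g$ are left and right matrix multiplication by $g$.

```latex
% Worked instance of the left-right lemma for a matrix group G in GL(n).
% Assumption: tangent vectors are identified with n x n matrices, so that
% TL_g . A = gA and TR_g . A = Ag.
\begin{align*}
  \operatorname{triv}_L(v_g) &= (g,\; g^{-1} v_g), &
  \operatorname{triv}_R(v_g) &= (g,\; v_g\, g^{-1}), \\
  \operatorname{triv}_L^{-1}(g, A) &= gA, &
  \operatorname{triv}_R \circ \operatorname{triv}_L^{-1}(g, A)
    &= (g,\; (gA)\, g^{-1}) = (g,\; g A g^{-1}).
\end{align*}
% This agrees with Ad_g(A) = g A g^{-1}, the formula for the adjoint map
% of a linear Lie group.
```

The two trivializations therefore agree exactly on those pairs $(g, A)$ for which $A$ commutes with $g$.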
It is often convenient to identify the tangent bundle $TG$ of a Lie group $G$ with $G \times \mathfrak{g}$. Of course we must specify which of the two trivializations described above is being invoked. Unless indicated otherwise we shall use the "left version" described above: $v_g \mapsto (g, \omega_G(v_g)) = (g, TL_g^{-1}(v_g))$.

Warning: It must be realized that we now have three natural ways to trivialize the tangent bundle of the general linear group. In fact, the usual one which we introduced earlier is actually the restriction to $TGL(n)$ of the Maurer-Cartan trivialization of the abelian Lie group $(M_{n \times n}, +)$.

In order to use the (left) Maurer-Cartan trivialization as an identification effectively, we need to find out how a few basic operations look when this identification is imposed. The picture obtained from using the trivialization produced by the left Maurer-Cartan form is as follows:

(1) The tangent map of the left translation $TL_g : TG \to TG$ takes the form "$TL_g$" $: (x, v) \mapsto (gx, v)$. Indeed, the following diagram commutes:
$$\begin{array}{ccc} TG & \xrightarrow{\;TL_g\;} & TG \\ \downarrow & & \downarrow \\ G \times \mathfrak{g} & \xrightarrow{\;\text{``}TL_g\text{''}\;} & G \times \mathfrak{g} \end{array}$$
where elementwise we have
$$\begin{array}{ccc} v_x & \longmapsto & TL_g \cdot v_x \\ \downarrow & & \downarrow \\ (x, v) & \longmapsto & (gx, v) \end{array}$$

(2) The tangent map of multiplication: This time we will invoke two identifications. First, group multiplication is a map $\mu : G \times G \to G$, and so on the tangent level we have a map $T(G \times G) \to TG$. Recall that we have a natural isomorphism $T(G \times G) \cong TG \times TG$ given by $T\pi_1 \times T\pi_2 : v_{(x,y)} \mapsto (T\pi_1 \cdot v_{(x,y)}, T\pi_2 \cdot v_{(x,y)})$. If we also identify $TG$ with $G \times \mathfrak{g}$, then $TG \times TG \cong (G \times \mathfrak{g}) \times (G \times \mathfrak{g})$, and we end up with the following "version" of $T\mu$:
$$\text{``}T\mu\text{''} : (G \times \mathfrak{g}) \times (G \times \mathfrak{g}) \to G \times \mathfrak{g}, \qquad \text{``}T\mu\text{''} : ((x, v), (y, w)) \mapsto (xy, TR_y v + TL_x w)$$
(see Exercise 5.94).

(3) The (left) Maurer-Cartan form is a map $\omega_G : TG \to T_eG = \mathfrak{g}$, and so there must be a "version", "$\omega_G$", that uses the identification $TG \cong G \times \mathfrak{g}$. In fact, the map we seek is just projection:
$$\text{``}\omega_G\text{''} : (x, v) \mapsto v.$$

(4) The right Maurer-Cartan form is a little more complicated since we are currently using the isomorphism $TG \cong G \times \mathfrak{g}$ obtained from the left Maurer-Cartan form. From Lemma 5.93 we obtain:
$$\text{``}\omega^{\mathrm{right}}_G\text{''} : (x, v) \mapsto \mathrm{Ad}_x(v).$$
The adjoint map is nearly the same thing as the right Maurer-Cartan form if we decide to use the left trivialization $TG \cong G \times \mathfrak{g}$ as an identification.

(5) A vector field $X \in \mathfrak{X}(G)$ should correspond to a section of the projection $G \times \mathfrak{g} \to G$ which must have the form $\tilde{X} : x \mapsto (x, F^X(x))$ for some smooth $\mathfrak{g}$-valued function $F^X \in C^{\infty}(G; \mathfrak{g})$. It is an easy consequence of the definitions that $F^X(x) = \omega_G(X(x)) = TL_x^{-1} \cdot X(x)$. Under this identification, a left invariant vector field becomes a constant section of $G \times \mathfrak{g}$. For example, if $X$ is left invariant, then the corresponding constant section is $x \mapsto (x, X(e))$.

Exercise 5.94. Refer to (2) above. Show that the map "$T\mu$" defined so that the diagram below commutes is $((x, v), (y, w)) \mapsto (xy, TR_y v + TL_x w)$.
$$\begin{array}{ccc} T(G \times G) & \xrightarrow{\;T\mu\;} & TG \\ \downarrow & & \downarrow \\ (G \times \mathfrak{g}) \times (G \times \mathfrak{g}) & \xrightarrow{\;\text{``}T\mu\text{''}\;} & G \times \mathfrak{g} \end{array}$$

We have already seen that, using the available identifications, the Lie algebra of a matrix group and the associated formulas take a concrete form. We now consider the Maurer-Cartan form in the case of matrix groups. Suppose that $G$ is a Lie subgroup of $GL(n)$ and consider the coordinate functions $x^i_j$ on $GL(n)$ defined by $x^i_j(A) = a^i_j$ where $A = [a^i_j]$. We have the associated 1-forms $dx^i_j$. Both the functions $x^i_j$ and the forms $dx^i_j$ restrict to $G$. Denoting these restrictions by the same symbols, the Maurer-Cartan form can be expressed as
$$\omega_G = [x^i_j]^{-1}[dx^i_j].$$
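One sometimes sees the shorthand $g^{-1}dg$, which is a bit cryptic. Our goal is to understand this expression better. First, from a practical point of view we think of it as follows: An element $v_g$ of the tangent space $T_gG$ is also an element of $T_gGL(n)$ and so can be expressed in terms of the vectors $\partial/\partial x^i_j\big|_g$, say
$$v_g = \sum_{i,j} v^i_j\,\frac{\partial}{\partial x^i_j}\Big|_g.$$
Then,
$$\omega_G(v_g) = [x^i_j]^{-1}[dx^i_j](v_g) = [g^i_j]^{-1}[v^i_j],$$
where $g = [g^i_j]$ and the matrix $[g^i_j]^{-1}[v^i_j]$ is interpreted as an element of the Lie algebra $\mathfrak{g}$. For instance, if $G = SO(2)$, then the Maurer-Cartan form is given by
$$\begin{bmatrix} x^2_2 & -x^1_2 \\ -x^2_1 & x^1_1 \end{bmatrix}\begin{bmatrix} dx^1_1 & dx^1_2 \\ dx^2_1 & dx^2_2 \end{bmatrix} = \begin{bmatrix} x^2_2\,dx^1_1 - x^1_2\,dx^2_1 & x^2_2\,dx^1_2 - x^1_2\,dx^2_2 \\ -x^2_1\,dx^1_1 + x^1_1\,dx^2_1 & -x^2_1\,dx^1_2 + x^1_1\,dx^2_2 \end{bmatrix}$$
$$= \begin{bmatrix} x^1_1\,dx^1_1 - x^1_2\,dx^2_1 & x^1_1\,dx^1_2 - x^1_2\,dx^2_2 \\ x^1_2\,dx^1_1 + x^1_1\,dx^2_1 & x^1_2\,dx^1_2 + x^1_1\,dx^2_2 \end{bmatrix},$$
since on $SO(2)$ we have $x^1_1 = x^2_2$, $x^1_2 = -x^2_1$ and $x^1_1x^2_2 - x^1_2x^2_1 = 1$. But this can be further simplified. If we let $v_g \in T_gSO(2)$, then $\omega_G(v_g)$ is in $\mathfrak{so}(2)$ and so must be antisymmetric. We conclude that
$$\omega_{SO(2)} = \begin{bmatrix} 0 & x^1_1\,dx^1_2 - x^1_2\,dx^1_1 \\ -x^1_1\,dx^1_2 + x^1_2\,dx^1_1 & 0 \end{bmatrix}.$$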
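The $SO(2)$ computation can be checked along the curve of rotations $g(\theta)$. The Python sketch below uses only the standard `math` module; the function names and the parametrization by the angle $\theta$ are our own choices, not notation from the text. It evaluates $g^{-1}\,dg/d\theta$ and also the single independent entry $x^1_1\,dx^1_2 - x^1_2\,dx^1_1$ of the simplified form.

```python
import math


def rotation(theta):
    """The element g(theta) of SO(2), as a list of rows."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def rotation_derivative(theta):
    """Entrywise derivative dg/dtheta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[-s, -c], [c, -s]]


def maurer_cartan_so2(theta):
    """g(theta)^{-1} dg/dtheta; the inverse of a rotation is its transpose."""
    g, dg = rotation(theta), rotation_derivative(theta)
    g_inv = [[g[0][0], g[1][0]], [g[0][1], g[1][1]]]
    return [[sum(g_inv[i][k] * dg[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def upper_right_entry(theta):
    """x^1_1 dx^1_2 - x^1_2 dx^1_1 evaluated on d/dtheta."""
    g, dg = rotation(theta), rotation_derivative(theta)
    return g[0][0] * dg[0][1] - g[0][1] * dg[0][0]


for theta in (0.0, 0.3, 1.0, 2.5):
    print(theta, maurer_cartan_so2(theta), upper_right_entry(theta))
```

For every $\theta$ the matrix should be, up to roundoff, the constant skew-symmetric matrix with entries $0, -1, 1, 0$, so along this curve the Maurer-Cartan form is that fixed generator of $\mathfrak{so}(2)$ times $d\theta$.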
Let us try to understand things a bit more thoroughly. If $G$ is a subgroup of $GL(n)$, then consider the inclusion map $j : G \hookrightarrow M_{n \times n}$, where $M_{n \times n}$ is the set of $n \times n$ matrices. Then we have the differential $dj : TG \to M_{n \times n}$ and the two left multiplications $L_g : G \to G$ and $\mathscr{L}_g : M_{n \times n} \to M_{n \times n}$ for $g \in G$. We have the following commutativity relations:
$$j \circ L_g = \mathscr{L}_g \circ j, \qquad dj|_{gh} \circ T_hL_g = \mathscr{L}_g \circ dj|_h$$
for any $h \in G$. We have used the fact that since $\mathscr{L}_g$ is linear, $D\mathscr{L}_g(A) = \mathscr{L}_g$ for all $A \in M_{n \times n}$. Then for $v_g \in T_gG$ we have
$$dj|_e(\omega_G(v_g)) = dj|_e(TL_{g^{-1}}v_g) = \mathscr{L}_{g^{-1}}\,dj|_g(v_g) = g^{-1}\,dj|_g(v_g) = (j \circ \pi(v_g))^{-1}\,dj(v_g),$$
where $\pi : TG \to G$ is the tangent bundle projection. Notice that the effect of $dj|_e$ is simply to interpret elements of $T_eG$ as matrices, and so it can be suppressed from the notation. Taking this into account and applying a reasonable abbreviation for $(j \circ \pi(v_g))^{-1}\,dj(v_g)$, we arrive at
$$\omega_G = j^{-1}dj.$$
But notice that for a matrix group, $j$ is none other than the map $[x^i_j] : A \mapsto [x^i_j(A)] = [a^i_j]$, so that we are returned to the expression $\omega_G = [x^i_j]^{-1}[dx^i_j] =$ "$g^{-1}dg$". This tells us the meaning of $g$ in the expression $g^{-1}dg$.
5.7. Lie Group Actions The basic definitions for group actions were given earlier in Definitions 1.99 and 1.6.2. As before we give most of our definitions and results for left actions and ask the reader to notice that analogous statements can be made for right actions. Definition 5.95. Let 1 : G x M ~ M be a left action, where G is a Lie group and M is a smooth manifold. If 1 is a smooth map, then we say that 1 is a (smooth) Lie group action. As before, we also use any ofthe notations gp, g.p or 19(P) for l(g,p). We will need this notational flexibility. Recall that for p E M, the orbit of pis denoted Gp or G .p, and the action is transitive if Gp = M. Recall also that an action is effective if 19(P) = p for all p only if 9 = e. For right actions T : M x G ~ M, similar definitions apply and we write pg = Tg(P) = T(P,g). A right action corresponds to a left action by the rule gp := pg-I. Definition 5.96. Let 1 be a Lie group action as above. For a fixed p E M, the isotropy group of p is defined to be Gp:={gEG.gp=p}.
The isotropy group of p is also called the stabilizer of p. Exercise 5.97. Show that Gp is a closed subset and an abstract subgroup of G. This means that Gp is a closed Lie subgroup. Recalling the definition of a free action (Definition 1.100), it is easy to see that an action is free if and only if the isotropy subgroup of every point is the trivial subgroup consisting of the identity element alone. Definition 5.98. Suppose that we have a Lie group action of G on M. If N is a subset of M and Gx c N for all x E N, then we say that N is an invariant subset. If N is also a submanifold, then it is called an invariant submanifold. In this definition, we include the possibility that N is an open submanifold. If N is an invariant subset of M, then it is easy to see that gN = N,
229
i.7. Lie Group Actions
where $gN = l_g(N)$ for any $g$. Furthermore, if $N$ is a submanifold, then the action restricts to a Lie group action $G \times N \to N$.

If $G$ is zero-dimensional, then by definition it is just a group with discrete topology, and we recover the definition of a discrete group action. We have already seen several examples of discrete group actions, and now we list a few examples of more general Lie group actions.

Example 5.99. The maps $G \times G \to G$ given by $(g,x) \mapsto L_g x$ and $(g,x) \mapsto R_g x$ are Lie group actions.

Example 5.100. In case $M = \mathbb{R}^n$, the Lie group $\mathrm{GL}(n,\mathbb{R})$ acts on $\mathbb{R}^n$ by matrix multiplication. Similarly, $\mathrm{GL}(n,\mathbb{C})$ acts on $\mathbb{C}^n$. More abstractly, $\mathrm{GL}(V)$ acts on the vector space $V$. This action is smooth since $Ax$ depends smoothly (polynomially) on the components of $A$ and on the components of $x \in \mathbb{R}^n$.
Example 5.101. Any Lie subgroup of $\mathrm{GL}(n,\mathbb{R})$ acts on $\mathbb{R}^n$ also by matrix multiplication. For example, $\mathrm{O}(n,\mathbb{R})$ acts on $\mathbb{R}^n$. For every $x \in \mathbb{R}^n$, the orbit of $x$ is the sphere of radius $\|x\|$. This is trivially true if $x = 0$. In general, if $\|x\| \neq 0$, then $\|gx\| = \|x\|$ for any $g \in \mathrm{O}(n,\mathbb{R})$, so the orbit is contained in the sphere. On the other hand, if $x, y \in \mathbb{R}^n$ and $\|x\| = \|y\| = r$, then let $\tilde{x} := x/r$ and $\tilde{y} := y/r$. Let $e_1 := \tilde{x}$ and $f_1 := \tilde{y}$ and then extend to orthonormal bases $(e_1,\ldots,e_n)$ and $(f_1,\ldots,f_n)$. Then there exists an orthogonal matrix $S$ such that $Se_i = f_i$ for $i = 1,\ldots,n$. In particular, $S\tilde{x} = \tilde{y}$, and so $Sx = y$.
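The transitivity argument of Example 5.101 can be tried out numerically. The sketch below is only an illustration and does not follow the basis-extension construction of the text: it swaps in a Householder reflection, which is another standard way to produce an orthogonal matrix $S$ with $Sx = y$ when $\|x\| = \|y\|$. The function names are hypothetical.

```python
def householder_matrix(x, y):
    """Return an orthogonal matrix S (as a list of rows) with S x = y.

    Assumes x and y have the same Euclidean norm. S is the reflection
    across the hyperplane orthogonal to v = x - y.
    """
    n = len(x)
    v = [xi - yi for xi, yi in zip(x, y)]
    vv = sum(vi * vi for vi in v)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    if vv < 1e-24:  # x and y already coincide, so the identity works
        return identity
    return [[identity[i][j] - 2.0 * v[i] * v[j] / vv for j in range(n)]
            for i in range(n)]


def apply(S, x):
    """Matrix-vector product S x."""
    return [sum(S[i][j] * x[j] for j in range(len(x))) for i in range(len(S))]


def is_orthogonal(S, tol=1e-12):
    """Check that S^T S is the identity up to the tolerance."""
    n = len(S)
    return all(
        abs(sum(S[k][i] * S[k][j] for k in range(n)) - (1.0 if i == j else 0.0)) < tol
        for i in range(n) for j in range(n)
    )


# Two vectors of norm 5 in R^3.
x = [3.0, 0.0, 4.0]
y = [0.0, 5.0, 0.0]
S = householder_matrix(x, y)
Sx = apply(S, x)
```

A reflection has determinant $-1$, so this particular $S$ lies in $\mathrm{O}(n,\mathbb{R})$ but not in $\mathrm{SO}(n,\mathbb{R})$.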
Exercise 5.102. From the last example we can restrict the action of $\mathrm{O}(n,\mathbb{R})$ to a transitive action on $S^{n-1}$. Now $\mathrm{SO}(n,\mathbb{R})$ also acts on $S^{n-1}$ by restriction. Show that this action on $S^{n-1}$ is transitive as long as $n > 1$.

Example 5.103. If $H$ is a Lie subgroup of a Lie group $G$, then we can consider $L_h$ for any $h \in H$ and thereby obtain a Lie group action of $H$ on $G$.

Recall that a subgroup $H$ of a group $G$ is called a normal subgroup if $gkg^{-1} \in H$ for any $k \in H$ and all $g \in G$. In other words, $H$ is normal if $gHg^{-1} \subset H$ for all $g \in G$, and it is easy to see that in this case we always have $gHg^{-1} = H$.

Example 5.104. If $H$ is a normal Lie subgroup of $G$, then $G$ acts on $H$ by conjugation: $g \cdot h := C_g h = ghg^{-1}$. Notice that the notation $g \cdot h$ cannot reasonably be abbreviated to $gh$ in this example.

Suppose that a Lie group $G$ acts on smooth manifolds $M$ and $N$. For simplicity, we take both actions to be left actions, which we denote by $l$ and
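$\lambda$, respectively. A map $f : M \to N$ such that $f \circ l_g = \lambda_g \circ f$ for all $g \in G$ is said to be an equivariant map (equivariant with respect to the given actions). This means that for all $g$ the following diagram commutes:

$$
\begin{array}{ccc}
M & \xrightarrow{\;f\;} & N \\
{\scriptstyle l_g}\downarrow & & \downarrow{\scriptstyle \lambda_g} \\
M & \xrightarrow{\;f\;} & N
\end{array}
\tag{5.3}
$$

If $f$ is also a diffeomorphism, then $f$ is an equivalence of Lie group actions.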
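The condition $f \circ l_g = \lambda_g \circ f$ can be checked numerically in a simple case. The sketch below assumes $G = \mathbb{R}$ acting on $M = \mathbb{R}$ by $l_t(x) = x + tK$ and on the unit circle $N = S^1 \subset \mathbb{C}$ by $\lambda_t(z) = e^{itK}z$, with $f(x) = e^{ix}$. The constant `K` and the function names are chosen only for illustration.

```python
import cmath

K = 0.75  # hypothetical fixed real number defining both actions


def act_line(t, x):
    """l_t(x) = x + t*K, the action of R on the real line."""
    return x + t * K


def act_circle(t, z):
    """lambda_t(z) = exp(i*t*K) * z, the action of R on the unit circle."""
    return cmath.exp(1j * t * K) * z


def f(x):
    """The map R -> S^1, x -> exp(i*x)."""
    return cmath.exp(1j * x)


def equivariance_defect(t, x):
    """|f(l_t(x)) - lambda_t(f(x))|, which vanishes when f is equivariant."""
    return abs(f(act_line(t, x)) - act_circle(t, f(x)))


defect = equivariance_defect(1.3, -2.0)
```

The defect is zero up to rounding for every choice of `t` and `x`, because $e^{i(x + tK)} = e^{itK}e^{ix}$.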
Example 5.105. If $\phi : G \to H$ is a Lie group homomorphism, then we can define an action of $G$ on $H$ by $\lambda(g,h) = \lambda_g(h) = L_{\phi(g)}h$. We leave it to the reader to verify that this is indeed a Lie group action. In this situation, $\phi$ is equivariant with respect to the actions $L$ (left translation on $G$) and $\lambda$.

Example 5.106. Let $T^n = S^1 \times \cdots \times S^1$ be the $n$-torus, where we identify $S^1$ with the complex numbers of unit modulus. Fix $k = (k_1,\ldots,k_n) \in \mathbb{R}^n$. Then $\mathbb{R}$ acts on $\mathbb{R}^n$ by $r_k(t,x) = t \cdot x := x + tk$. On the other hand, $\mathbb{R}$ acts on $T^n$ by $t \cdot (z^1,\ldots,z^n) = (e^{itk_1}z^1,\ldots,e^{itk_n}z^n)$. The map $\mathbb{R}^n \to T^n$ given by $(x^1,\ldots,x^n) \mapsto (e^{ix^1},\ldots,e^{ix^n})$ is equivariant with respect to these actions.

Theorem 5.107 (Equivariant rank theorem). Suppose that $f : M \to N$ is smooth and that a Lie group $G$ acts on both $M$ and $N$ with the action on $M$ being transitive. If $f$ is equivariant, then it has constant rank. In particular, each level set of $f$ is a closed regular submanifold.

Proof. Let the actions on $M$ and $N$ be denoted by $l$ and $\lambda$ respectively as before. Pick any two points $p_1, p_2 \in M$. Since $G$ acts transitively on $M$, there is a $g$ with $l_g p_1 = p_2$. By hypothesis, we have $f \circ l_g = \lambda_g \circ f$, which corresponds to the commutative diagram (5.3). Upon application of the tangent functor we have the commutative diagram

$$
\begin{array}{ccc}
T_{p_1}M & \xrightarrow{\;T_{p_1}f\;} & T_{f(p_1)}N \\
{\scriptstyle T_{p_1}l_g}\downarrow & & \downarrow{\scriptstyle T_{f(p_1)}\lambda_g} \\
T_{p_2}M & \xrightarrow{\;T_{p_2}f\;} & T_{f(p_2)}N
\end{array}
$$
Since the maps $T_{p_1}l_g$ and $T_{f(p_1)}\lambda_g$ are linear isomorphisms, we see that $T_{p_1}f$ must have the same rank as $T_{p_2}f$. Since $p_1$ and $p_2$ were arbitrary, we see that the rank of $f$ is constant on $M$. Apply Theorem C.5. $\square$

There are several corollaries of this nice theorem. For example, we know that $\mathrm{O}(n,\mathbb{R})$ is the level set $f^{-1}(I)$, where $f : \mathrm{GL}(n,\mathbb{R}) \to \mathfrak{gl}(n,\mathbb{R}) = M_{n \times n}$ is given by $f(A) = A^TA$. The group $\mathrm{GL}(n,\mathbb{R})$ acts on itself on the right via right translation, $A \cdot Q := AQ$, and we also let $\mathrm{GL}(n,\mathbb{R})$ act on $\mathfrak{gl}(n,\mathbb{R})$ on the right by $B \cdot Q := Q^TBQ$. One checks easily that $f$ is equivariant with respect to these actions, since $f(AQ) = Q^TA^TAQ = Q^Tf(A)Q$. Since the first action (right translation) is certainly transitive, the right-action version of Theorem 5.107 shows that $\mathrm{O}(n,\mathbb{R})$ is a closed regular submanifold of $\mathrm{GL}(n,\mathbb{R})$. It follows from Proposition 5.9 that $\mathrm{O}(n,\mathbb{R})$ is a closed Lie subgroup of $\mathrm{GL}(n,\mathbb{R})$. Similar arguments apply for $\mathrm{U}(n,\mathbb{C}) \subset \mathrm{GL}(n,\mathbb{C})$ and other linear Lie groups. In fact, we have the following general corollary to Theorem 5.107 above.

Corollary 5.108. If $\phi : G \to H$ is a Lie group homomorphism, then the kernel $\operatorname{Ker}(\phi)$ is a closed Lie subgroup of $G$.

Proof. Let $G$ act on itself and on $H$ as in Example 5.105. Then $\phi$ is equivariant, and $\phi^{-1}(e) = \operatorname{Ker}(\phi)$ is a closed Lie subgroup by Theorem 5.107 and Proposition 5.9. $\square$

Corollary 5.109. Let $l : G \times M \to M$ be a Lie group action, and let $G_p$ be the isotropy subgroup of some $p \in M$. Then $G_p$ is a closed Lie subgroup
of $G$.

Proof. The orbit map $\theta_p : G \to M$ given by $\theta_p(g) = gp$ is equivariant with respect to left translation on $G$ and the given action on $M$. Thus by the equivariant rank theorem, $G_p = \theta_p^{-1}(p)$ is a regular submanifold of $G$, and then by Proposition 5.9 it is a closed Lie subgroup. $\square$

Proper Actions and Quotients. At several points in this section, such as the proof of Proposition 5.111 below, we follow [Lee, John].

Definition 5.110. Let $l : G \times M \to M$ be a smooth (or merely continuous) group action. If the map $P : G \times M \to M \times M$ given by $(g,p) \mapsto (l_g p, p)$ is proper, we say that the action is a proper action.

It is important to notice that a proper action is not defined to be an action such that the defining map $l : G \times M \to M$ is proper. We now give a useful characterization of a proper action. For any subset $K \subset M$, let $g \cdot K := \{gx : x \in K\}$.
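To make the sets $g \cdot K$ concrete, consider the discrete group $\mathbb{Z}$ acting on $\mathbb{R}$ by translation, $n \cdot x = x + n$ (an example chosen here for illustration). For a compact interval $K = [a,b]$, the set of group elements $n$ with $(n \cdot K) \cap K \neq \emptyset$, which is the set used in the characterization that follows, is finite. The sketch below lists it by brute force over a finite search window; the function name and the window size are hypothetical.

```python
def translates_meeting(a, b, search_radius):
    """Integers n with |n| <= search_radius such that [a+n, b+n] meets [a, b].

    Two closed intervals [a+n, b+n] and [a, b] intersect exactly when
    a + n <= b and a <= b + n.
    """
    return [n for n in range(-search_radius, search_radius + 1)
            if a + n <= b and a <= b + n]


# K = [0, 2.5]: only the translations with |n| <= 2 move K onto a set meeting K.
G_K = translates_meeting(0.0, 2.5, search_radius=1000)
```

In general the list consists of the integers $n$ with $|n| \le b - a$, a finite and hence compact subset of the discrete group $\mathbb{Z}$.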
Proposition 5.111. Let $l : G \times M \to M$ be a smooth (or merely continuous) group action. Then $l$ is a proper action if and only if the set
$$G_K := \{ g \in G : (g \cdot K) \cap K \neq \emptyset \}$$
is compact whenever $K$ is compact.

Proof. Suppose that $l$ is proper so that the map $P$ is a proper map. Let $\pi_G$ be the first factor projection $G \times M \to G$. Then
$$
\begin{aligned}
G_K &= \{ g : \text{there exists an } x \in K \text{ such that } gx \in K \} \\
&= \{ g : \text{there exists an } x \in M \text{ such that } P(g,x) \in K \times K \} \\
&= \pi_G(P^{-1}(K \times K)),
\end{aligned}
$$
and so $G_K$ is compact.
Next we assume that $G_K$ is compact for all compact $K$. If $C \subset M \times M$ is compact, then letting $K = \pi_1(C) \cup \pi_2(C)$, where $\pi_1$ and $\pi_2$ are the first and second factor projections $M \times M \to M$ respectively, we have
$$P^{-1}(C) \subset P^{-1}(K \times K) \subset \{ (g,x) : gx \in K \text{ and } x \in K \} \subset G_K \times K.$$
Since $P^{-1}(C)$ is a closed subset of the compact set $G_K \times K$, it is compact. This means that $P$ is proper since $C$ was an arbitrary compact subset of $M \times M$. $\square$

Using this proposition, one can show that Definition 1.106 for discrete actions is consistent with Definition 5.110 above.

Proposition 5.112. If $G$ is compact, then any smooth action $l : G \times M \to M$ is proper.

Proof. Let $B \subset M \times M$ be compact. We find a compact subset $K \subset M$ such that $B \subset K \times K$ as in the proof of Proposition 5.111. Claim: $P^{-1}(B)$ is compact. Indeed,
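$$
\begin{aligned}
P^{-1}(B) \subset P^{-1}(K \times K) &= \bigcup_{k \in K} P^{-1}(K \times \{k\}) \\
&= \bigcup_{k \in K} \{ (g,p) : (gp,p) \in K \times \{k\} \} \\
&= \bigcup_{k \in K} \{ (g,k) : gk \in K \} \\
&\subset \bigcup_{k \in K} (G \times \{k\}) = G \times K.
\end{aligned}
$$
Thus $P^{-1}(B)$ is a closed subset of the compact set $G \times K$ and hence is compact. $\square$

Exercise 5.113. Prove the following:

(i) If $l : G \times M \to M$ is a proper action and $H \subset G$ is a closed subgroup, then the restricted action $H \times M \to M$ is proper.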
(ii) If $N$ is an invariant submanifold for a proper action $l : G \times M \to M$, then the restricted action $G \times N \to N$ is also proper.

Let us consider a Lie group action $l : G \times M \to M$ that is both proper and free. The orbit map at $p$ is the map $\theta_p : G \to M$ given by $\theta_p(g) = g \cdot p$. It is easily seen to be smooth, and its image is obviously $G \cdot p$. In fact, since the action is free, each orbit map is injective. Also, $\theta_p$ is equivariant with respect to the left action of $G$ on itself and the action $l$:
$$\theta_p(gx) = (gx) \cdot p = g \cdot (x \cdot p) = g \cdot \theta_p(x)$$
for all $x, g \in G$. It follows from Theorem 5.107 (the equivariant rank theorem) that $\theta_p$ has constant rank, and since it is injective, it must be an
Figure 5.1. Action-adapted chart
immersion. Not only that, but it is a proper map. Indeed, for any compact $K \subset M$ the set $\theta_p^{-1}(K)$ is a closed subset of the set $G_{K \cup \{p\}}$, and since the latter set is compact by Proposition 5.111, $\theta_p^{-1}(K)$ is compact. By Exercise 3.9, $\theta_p$ is an embedding, so each orbit is a regular submanifold of $M$.

It will be very convenient to have charts on $M$ which fit the action of $G$ in a nice way. See Figure 5.1.

Definition 5.114. Let $M$ be an $n$-manifold and $G$ a Lie group of dimension $k$. If $l : G \times M \to M$ is a Lie group action, then an action-adapted chart on $M$ is a chart $(U, \mathbf{x})$ such that

(i) $\mathbf{x}(U)$ is a product open set $V_1 \times V_2 \subset \mathbb{R}^k \times \mathbb{R}^{n-k} = \mathbb{R}^n$;

(ii) if an orbit has nonempty intersection with $U$, then that intersection has the form
$$\{ x^{k+1} = c^1, \ldots, x^n = c^{n-k} \}$$
for some constants $c^1, \ldots, c^{n-k}$.

Theorem 5.115. If $l : G \times M \to M$ is a free and proper Lie group action, then for every $p \in M$ there is an action-adapted chart centered at $p$.

Proof. Let $p \in M$ be given. Since $G \cdot p$ is a regular submanifold, we may choose a regular submanifold chart $(W, \mathbf{y})$ centered at $p$ so that $(G \cdot p) \cap W$ is exactly given by $y^{k+1} = \cdots = y^n = 0$ in $W$. Let $S$ be the complementary slice in $W$ given by $y^1 = \cdots = y^k = 0$. Note that $S$ is a regular submanifold. The tangent space $T_pM$ decomposes as
$$T_pM = T_p(G \cdot p) \oplus T_pS.$$
Let $\varphi : G \times S \to M$ be the restriction of the action $l$ to the set $G \times S$. Also, let $i_p : G \to G \times S$ be the insertion map $g \mapsto (g,p)$ and let $j_e : S \to G \times S$ be the insertion map $s \mapsto (e,s)$. (See Figure 5.2.) These insertion maps
are embeddings, and we have $\theta_p = \varphi \circ i_p$ and also $\varphi \circ j_e = \iota$, where $\iota$ is the inclusion $S \hookrightarrow M$. Now $T_e\theta_p(T_eG) = T_p(G \cdot p)$ since $\theta_p$ is an embedding. On the other hand, $T\theta_p = T\varphi \circ Ti_p$, and so the image of $T_{(e,p)}\varphi$ must contain $T_p(G \cdot p)$. Similarly, from the composition $\varphi \circ j_e = \iota$ we see that the image of $T_{(e,p)}\varphi$ must contain $T_pS$. It follows that $T_{(e,p)}\varphi : T_{(e,p)}(G \times S) \to T_pM$ is surjective, and since $T_{(e,p)}(G \times S)$ and $T_pM$ have the same dimension, it is also injective. By the inverse mapping theorem, there is a neighborhood $O$ of $(e,p)$ such that $\varphi|_O$ is a diffeomorphism onto its image. By shrinking $O$ further if necessary we may assume that $\varphi(O) \subset W$. We may also arrange that $O$ has the form of a product $O = A \times B$ for $A$ open in $G$ and $B$ open in $S$. In fact, we can assume that there are diffeomorphisms $\alpha : I^k \to A$ and $\beta : I^{n-k} \to B$, where $I^k$ and $I^{n-k}$ are the open cubes in $\mathbb{R}^k$ and $\mathbb{R}^{n-k}$ given respectively by $I^k = (-1,1)^k$ and $I^{n-k} = (-1,1)^{n-k}$, and where $\alpha^{-1}(e) = 0 \in \mathbb{R}^k$ and $\beta^{-1}(p) = 0 \in \mathbb{R}^{n-k}$. Let $U := \varphi(A \times B)$. The map $\varphi \circ (\alpha \times \beta) : I^k \times I^{n-k} \to U$ is a diffeomorphism, and so its inverse is a chart.

We now show that $B$ can be chosen small enough that the intersection of each orbit with $B$ is either empty or a single point. If this were not true, then there would be a sequence of open sets $\{B_i\}$ shrinking to $p$, with compact closure and $\overline{B}_{i+1} \subset B_i$ (and with corresponding diffeomorphisms $\beta_i : I^{n-k} \to B_i$ as above), such that for every $i$ there is a pair of distinct points $p_i, p_i' \in B_i$ with $g_ip_i = p_i'$ for some sequence $\{g_i\} \subset G$. (We have used the fact that manifolds are first countable and normal.) This forces both $p_i$ and $p_i' = g_ip_i$ to converge to $p$. From this we see that the set $K = \{(g_ip_i, p_i)\} \cup \{(p,p)\} \subset M \times M$ is compact. Recall that by definition, the map $P : (g,x) \mapsto (gx,x)$ is proper. Since $(g_i,p_i) \in P^{-1}(g_ip_i,p_i)$, we see that $\{(g_i,p_i)\}$ is a subset of the compact set $P^{-1}(K)$. Thus after passing to a subsequence, we have that $(g_i,p_i)$ converges to $(g,p)$ for some $g$, and hence $g_i \to g$ and $g_ip_i \to gp$. But this means that we have
$$gp = \lim_{i \to \infty} g_ip_i = \lim_{i \to \infty} p_i' = p,$$
and since the action is free, we conclude that $g = e$. However, this is impossible since it would mean that for large enough $i$ we would have $g_i \in A$, and this in turn would imply that
$$\varphi(g_i,p_i) = l_{g_i}(p_i) = p_i' = l_e(p_i') = \varphi(e,p_i'),$$
contradicting the injectivity of $\varphi$ on $A \times B$. Thus after shrinking $B$ we may assume that the intersection of each orbit with $B$ is either empty or a single point. One may now check that with $\mathbf{x} := (\varphi \circ (\alpha \times \beta))^{-1} : U \to I^k \times I^{n-k} \subset \mathbb{R}^n$, we obtain a chart $(U, \mathbf{x})$ with the desired properties. Write $\mathbf{x} = (x,y)$, where $y$ takes values in $I^{n-k} \subset \mathbb{R}^{n-k}$ and $x$ takes values in $I^k \subset \mathbb{R}^k$. Each $y = c$ slice is of the form $\varphi(A \times \{q\}) \subset Gq$ for $q \in B$ and so is contained in a single orbit. We see that the intersection of an orbit with $U$ must be
Figure 5.2. Construction of action-adapted charts
a union of such slices. But each orbit can only intersect $B$ in at most one point, and so it is clear that each orbit intersects $U$ in one slice or does not intersect $U$ at all. $\square$

Notice that we have actually constructed action-adapted charts that have image the cube $I^n$. For the next lemma, we continue with the convention that $I$ is the interval $(-1,1)$.

Lemma 5.116. Let $\mathbf{x} := (\varphi \circ (\alpha \times \beta))^{-1} : U \to I^k \times I^{n-k} = I^n \subset \mathbb{R}^n$ be an action-adapted chart map obtained as in the proof of Theorem 5.115 above. Then given any $p_1 \in U$, there exists a diffeomorphism $\psi : I^n \to I^n$ such that $\psi \circ \mathbf{x}$ is an action-adapted chart centered at $p_1$ and with image $I^n$. Furthermore, $\psi$ can be decomposed as $(a,b) \mapsto (\psi_1(a), \psi_2(b))$, where $\psi_1 : I^k \to I^k$ and $\psi_2 : I^{n-k} \to I^{n-k}$ are diffeomorphisms.

Proof. We need to show that for any $a \in I^n$, there is a diffeomorphism $\psi : I^n \to I^n$ such that $\psi(a) = 0$. Let $a_i$ be the $i$-th component of $a$. For each $i$, let $\psi_i : I \to I$ be a diffeomorphism with $\psi_i(a_i) = 0$. Applying these maps componentwise gives the desired $\psi$, which in particular splits into a map of the first $k$ coordinates and a map of the last $n-k$ coordinates, as in the statement. $\square$
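One explicit choice of the one-variable diffeomorphisms used in the proof of Lemma 5.116 can be sketched as follows. This is an assumption made for illustration rather than a formula taken from the text: conjugate a translation of $\mathbb{R}$ by the diffeomorphism $s \mapsto \tan(\pi s/2)$ from $(-1,1)$ onto $\mathbb{R}$. All function names are hypothetical.

```python
import math


def phi(s):
    """A diffeomorphism from (-1, 1) onto R."""
    return math.tan(math.pi * s / 2.0)


def phi_inv(u):
    """Inverse of phi, mapping R onto (-1, 1)."""
    return 2.0 * math.atan(u) / math.pi


def recenter(a_i):
    """Return a diffeomorphism of (-1, 1) onto itself that sends a_i to 0."""
    shift = phi(a_i)
    return lambda s: phi_inv(phi(s) - shift)


def recenter_cube(a):
    """Componentwise map of the open cube (-1, 1)^n sending the point a to 0."""
    maps = [recenter(a_i) for a_i in a]
    return lambda point: tuple(m(s) for m, s in zip(maps, point))


psi = recenter_cube((0.5, -0.25, 0.9))
origin = psi((0.5, -0.25, 0.9))   # the chosen point is sent to the origin
image = psi((-0.99, 0.0, 0.3))    # any other point stays inside the cube
```

Because the map acts on each coordinate separately, it automatically has the product form $(a,b) \mapsto (\psi_1(a), \psi_2(b))$ required in the lemma.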
We now discuss quotients. If $l : G \times M \to M$ is a Lie group action, then there is a natural equivalence relation on $M$ whereby the equivalence classes are exactly the orbits of the action. The quotient space (or orbit space) is denoted $G \backslash M$, and we have the quotient map $\pi : M \to G \backslash M$. We put the quotient topology on $G \backslash M$ so that $A \subset G \backslash M$ is open if and only if $\pi^{-1}(A)$ is open in $M$. The quotient map is also open. Indeed, let $U \subset M$ be open. We want to show that $\pi(U)$ is open, and for this it suffices to show that $\pi^{-1}(\pi(U))$ is open. But $\pi^{-1}(\pi(U))$ is the union $\bigcup_g l_g(U)$, and this is open since each $l_g(U)$ is open.
Proposition 5.117. Let $G$ act smoothly on $M$. Then $G \backslash M$ is a Hausdorff space if the set $\Gamma := \{ (gp,p) : g \in G,\ p \in M \}$ is a closed subset of $M \times M$. If $M$ is second countable, then $G \backslash M$ is also.

Proof. Let $\bar{p}, \bar{q} \in G \backslash M$ with $\pi(p) = \bar{p}$ and $\pi(q) = \bar{q}$. If $\bar{p} \neq \bar{q}$, then $p$ and $q$ are not in the same orbit. This means that $(p,q) \notin \Gamma$, and so there must be a product open set $U \times V$ such that $(p,q) \in U \times V$ and $U \times V$ is disjoint from $\Gamma$. Since $\pi$ is an open map, this means that $\pi(U)$ and $\pi(V)$ are disjoint neighborhoods of $\bar{p}$ and $\bar{q}$ respectively. Finally, if $\{U_i\}$ is a countable basis for the topology on $M$, then $\{\pi(U_i)\}$ is a countable basis for the topology on $G \backslash M$. $\square$

Proposition 5.118. If $l : G \times M \to M$ is a free and proper action, then $G \backslash M$ is Hausdorff.

Proof. To show that $G \backslash M$ is Hausdorff, we use the previous proposition. We must show that $\Gamma$ is closed. By Problem 2, proper continuous maps are closed. Thus $\Gamma = P(G \times M)$ is closed since $P$ is proper. $\square$
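We will shortly show that if the action is free and proper, then $G \backslash M$ has a smooth structure which makes the quotient map $\pi : M \to G \backslash M$ a submersion. Before coming to this let us note that if such a smooth structure exists, then it is unique and is determined by the smooth structure on $M$. Indeed, if $(G \backslash M)_{\mathcal{A}}$ is $G \backslash M$ with a smooth structure given by a maximal atlas $\mathcal{A}$, and similarly for $(G \backslash M)_{\mathcal{B}}$ for another atlas $\mathcal{B}$, then we have the following commutative diagram:

$$
\begin{array}{ccc}
 & M & \\
 & {\scriptstyle \pi}\swarrow \quad \searrow{\scriptstyle \pi} & \\
(G \backslash M)_{\mathcal{A}} & \xrightarrow{\;\mathrm{id}\;} & (G \backslash M)_{\mathcal{B}}
\end{array}
$$

Since $\pi$ is a surjective submersion, Proposition 3.26 applies to show that $\mathrm{id} : (G \backslash M)_{\mathcal{A}} \to (G \backslash M)_{\mathcal{B}}$ is smooth, as is its inverse. This means that $\mathcal{A} = \mathcal{B}$.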
Theorem 5.119. If $l : G \times M \to M$ is a free and proper Lie group action, then there is a unique smooth structure on the quotient $G \backslash M$ such that

(i) the induced topology is the quotient topology, and $G \backslash M$ is a smooth manifold;

(ii) the projection $\pi : M \to G \backslash M$ is a submersion;

(iii) $\dim(G \backslash M) = \dim(M) - \dim(G)$.

Proof. Let $\dim(M) = n$ and $\dim(G) = k$. We have already shown that $G \backslash M$ is a Hausdorff space. All that is left is to exhibit an atlas such that the charts are homeomorphisms with respect to this quotient topology. Let $q \in G \backslash M$ and choose $p$ with $\pi(p) = q$. Let $(U, \mathbf{x})$ be an action-adapted chart centered at $p$ and constructed exactly as in Theorem 5.115. Let $\pi(U) = V \subset G \backslash M$ and let $B$ be the slice $x^1 = \cdots = x^k = 0$. By construction $\pi|_B : B \to V$ is a bijection. In fact, it is easy to check that $\pi|_B$ is a homeomorphism, and $\sigma := (\pi|_B)^{-1}$ is the corresponding local section. Consider the map $\mathbf{y} := \pi_2 \circ \mathbf{x} \circ \sigma$, where $\pi_2$ is the second factor projection $\pi_2 : \mathbb{R}^k \times \mathbb{R}^{n-k} \to \mathbb{R}^{n-k}$. This is a homeomorphism since $(\pi_2 \circ \mathbf{x})|_B$ is a homeomorphism and $\pi_2 \circ \mathbf{x} \circ \sigma = (\pi_2 \circ \mathbf{x})|_B \circ \sigma$. We now have a chart $(V, \mathbf{y})$.

Given two such charts $(V, \mathbf{y})$ and $(\bar{V}, \bar{\mathbf{y}})$, we must show that $\bar{\mathbf{y}} \circ \mathbf{y}^{-1}$ is smooth. The charts $(V, \mathbf{y})$ and $(\bar{V}, \bar{\mathbf{y}})$ are constructed from associated action-adapted charts $(U, \mathbf{x})$ and $(\bar{U}, \bar{\mathbf{x}})$ on $M$. Let $q \in V \cap \bar{V}$. As in the proof of Lemma 5.116, we may find diffeomorphisms $\psi$ and $\bar{\psi}$ such that $(U, \psi \circ \mathbf{x})$ and $(\bar{U}, \bar{\psi} \circ \bar{\mathbf{x}})$ are action-adapted charts centered at points $p_1 \in \pi^{-1}(q)$ and $p_2 \in \pi^{-1}(q)$ respectively. Correspondingly, the charts $(V, \mathbf{y})$ and $(\bar{V}, \bar{\mathbf{y}})$ are modified to charts $(V, \mathbf{y}_\psi)$ and $(\bar{V}, \bar{\mathbf{y}}_{\bar{\psi}})$ centered at $q$, where
$$\mathbf{y}_\psi := \pi_2 \circ \mathbf{x}_\psi \circ \sigma', \qquad \bar{\mathbf{y}}_{\bar{\psi}} := \pi_2 \circ \bar{\mathbf{x}}_{\bar{\psi}} \circ \bar{\sigma}',$$
with $\mathbf{x}_\psi := \psi \circ \mathbf{x}$ and similarly for $\bar{\mathbf{x}}_{\bar{\psi}}$. Also, the sections $\sigma'$ and $\bar{\sigma}'$ are constructed to map into the zero slices of $\mathbf{x}_\psi$ and $\bar{\mathbf{x}}_{\bar{\psi}}$. However, it is not hard to see that $\mathbf{y}_\psi$ and $\bar{\mathbf{y}}_{\bar{\psi}}$ are unchanged if we replace $\sigma'$ and $\bar{\sigma}'$ by the sections $\sigma$ and $\bar{\sigma}$ corresponding to the zero slices of $\mathbf{x}$ and $\bar{\mathbf{x}}$. Now, recall that $\psi$ was chosen to have the form $(a,b) \mapsto (\psi_1(a), \psi_2(b))$. Using the above observations, one checks that $\mathbf{y}_\psi \circ \mathbf{y}^{-1} = \psi_2$, and similarly for $\bar{\mathbf{y}}_{\bar{\psi}} \circ \bar{\mathbf{y}}^{-1}$. From this it follows that the overlap map $\bar{\mathbf{y}}_{\bar{\psi}} \circ \mathbf{y}_\psi^{-1}$ will be smooth if and only if $\bar{\mathbf{y}} \circ \mathbf{y}^{-1}$ is smooth. Thus we have reduced to the case where $(U, \mathbf{x})$ and $(\bar{U}, \bar{\mathbf{x}})$ are centered at $p_1 \in \pi^{-1}(q)$ and $p_2 \in \pi^{-1}(q)$ respectively. This entails that both $(V, \mathbf{y})$ and $(\bar{V}, \bar{\mathbf{y}})$ are centered at $q \in V \cap \bar{V}$. If we choose a $g \in G$ such that $l_g(p_1) = p_2$, then by composing with the diffeomorphism $l_g$ we can reduce further to the case where $p_1 = p_2$. Here we use the fact that $l_g$ takes the set of orbits to the set of orbits in a bijective manner and the special nature of our action-adapted charts with respect to these orbits. In this case, the overlap map $\bar{\mathbf{x}} \circ \mathbf{x}^{-1}$ must have the form $(a,b) \mapsto (f(a,b), g(b))$ for some smooth functions $f$ and $g$. It follows that $\bar{\mathbf{y}} \circ \mathbf{y}^{-1}$ has the form $b \mapsto g(b)$.
Finally, we give an argument that $G \backslash M$ is paracompact. Since $G \backslash M$ is Hausdorff and locally Euclidean, we will be done once we show that every connected component of $G \backslash M$ is second countable (see Proposition B.5). Let $Q_0$ be any such connected component of $G \backslash M$. Let $X_0$ be a connected component of $\pi^{-1}(Q_0)$. Then $X_0$ is a second countable manifold and open in $M$. We now argue that $\pi(X_0) = Q_0$. To this end we show that the connected set $\pi(X_0)$ is open and closed in $Q_0$. It is open since $\pi$ is an open map. Let $\bar{x}$ be in the closure of $\pi(X_0)$ and choose $x \in M$ with $\pi(x) = \bar{x}$. Let $U$ be the domain of an action-adapted chart centered at $x$ and let $S$ be the slice of $U$ that maps diffeomorphically onto an open neighborhood $O$ of $\bar{x}$. Now we may find $\bar{y} \in O \cap \pi(X_0)$ and a corresponding $y \in S$ such that $\pi(y) = \bar{y}$. Since $\bar{y}$ is in the image of $X_0$, there is a $y' \in X_0$ such that $\pi(y') = \bar{y}$. But then there exists $g \in G$ such that $gy = y'$. The open set $gU$ is diffeomorphic to $U$ under $l_g$ and hence is path connected. It contains $y'$ and $gx$. Since $X_0$ is a path component, $gU \subset X_0$ and so $gx \in X_0$. But $\bar{x} = \pi(gx)$, so $\bar{x} \in \pi(X_0)$. We conclude that $\pi(X_0)$ is closed (and open) and connected. Hence $\pi(X_0) = Q_0$. Now since $X_0$ is a second countable manifold, we argue as in Proposition 5.117 that $Q_0$ is second countable. Conclusion: $G \backslash M$ is paracompact. $\square$

Similar results hold for right actions. Some of the most important examples of proper actions are usually presented as right actions. In fact, principal bundle actions studied in Chapter 6 are usually presented as right actions. We shall also encounter situations where there are both a right and a left action in play.

Example 5.120. Consider $S^{2n-1}$ as the subset of $\mathbb{C}^n$ given by $S^{2n-1} = \{ \xi \in \mathbb{C}^n : |\xi| = 1 \}$. Here $\xi = (z^1,\ldots,z^n)$ and $|\xi|^2 = \sum_i \bar{z}^i z^i$. Let $S^1$ act on $S^{2n-1}$ by $(a,\xi) \mapsto a\xi = (az^1,\ldots,az^n)$. This action is free and proper. The quotient is the complex projective space $\mathbb{C}P^{n-1}$, with quotient map $S^{2n-1} \to \mathbb{C}P^{n-1}$.
These maps (one for each $n$) are called the Hopf maps. In this context, $S^1$ is usually denoted by $\mathrm{U}(1)$. In what follows, we will consider the similar right action $S^{2n-1} \times \mathrm{U}(1) \to S^{2n-1}$. In this case, we think of $\mathbb{C}^n$ as a set of column vectors, and the action is given by $(\xi, a) \mapsto \xi a$. Of course, since $\mathrm{U}(1)$ is abelian, this makes essentially no difference, but in the next example we consider the quaternionic analogue where keeping track of order is important.
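For $n = 2$ the Hopf map can be made explicit. The sketch below uses the standard coordinate formula $(z^1,z^2) \mapsto (2\,\mathrm{Re}(z^1\bar{z}^2),\ 2\,\mathrm{Im}(z^1\bar{z}^2),\ |z^1|^2 - |z^2|^2)$ from $S^3 \subset \mathbb{C}^2$ to $S^2 \subset \mathbb{R}^3$. This formula is not given in the text and is brought in here as an assumption. The code checks numerically that the map is constant on $\mathrm{U}(1)$-orbits and lands on the unit sphere; the function names are hypothetical.

```python
import cmath


def hopf(z1, z2):
    """Coordinate formula for the Hopf map S^3 -> S^2 (assumed, not from the text)."""
    w = z1 * z2.conjugate()
    return (2.0 * w.real, 2.0 * w.imag, abs(z1) ** 2 - abs(z2) ** 2)


def act(a, z1, z2):
    """Action of a unit complex number a on the pair (z1, z2)."""
    return a * z1, a * z2


# A point of S^3 in C^2 (0.6^2 + 0.8^2 = 1) and an element of U(1).
z1, z2 = complex(0.6, 0.0), complex(0.0, 0.8)
a = cmath.exp(1j * 1.234)

p = hopf(z1, z2)
q = hopf(*act(a, z1, z2))
norm_squared = sum(c * c for c in p)
```

Invariance holds because $(az^1)\overline{(az^2)} = |a|^2 z^1\bar{z}^2$ and $|a| = 1$, so the map factors through the orbit space.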
The quaternionic projective space $\mathbb{H}P^{n-1}$ is defined by analogy with $\mathbb{C}P^{n-1}$. The elements of $\mathbb{H}P^{n-1}$ are 1-dimensional subspaces of the right $\mathbb{H}$-vector space $\mathbb{H}^n$. Let us refer to these as $\mathbb{H}$-lines. Each of these is of real dimension 4. Each element of $\mathbb{H}^n \setminus \{0\}$ determines an $\mathbb{H}$-line, and the $\mathbb{H}$-line determined by $(\xi^1,\ldots,\xi^n)^t$ will be the same as that determined by $(\tilde{\xi}^1,\ldots,\tilde{\xi}^n)^t$ if and only if there is a nonzero element $a \in \mathbb{H}$ such that $(\tilde{\xi}^1,\ldots,\tilde{\xi}^n)^t = (\xi^1,\ldots,\xi^n)^t a = (\xi^1a,\ldots,\xi^na)^t$. This defines an equivalence relation $\sim$ on $\mathbb{H}^n \setminus \{0\}$, and thus we may also think of $\mathbb{H}P^{n-1}$ as $(\mathbb{H}^n \setminus \{0\})/\sim$. The element of $\mathbb{H}P^{n-1}$ determined by $(\xi^1,\ldots,\xi^n)^t$ is denoted by $[\xi^1,\ldots,\xi^n]$. Notice that the subset $\{ \xi \in \mathbb{H}^n : |\xi| = 1 \}$ is $S^{4n-1}$. Just as for the complex projective spaces, we observe that all such $\mathbb{H}$-lines contain points of $S^{4n-1}$, and two points $\xi, \zeta \in S^{4n-1}$ determine the same $\mathbb{H}$-line if and only if $\xi = \zeta a$ for some $a$ with $|a| = 1$. Thus we can think of $\mathbb{H}P^{n-1}$ as a quotient of $S^{4n-1}$. When viewed in this way, we also denote the equivalence class of $\xi = (\xi^1,\ldots,\xi^n)^t \in S^{4n-1}$ by $[\xi] = [\xi^1,\ldots,\xi^n]$. The equivalence classes are clearly the orbits of an action as described in the following example.
Example 5.121. Consider $S^{4n-1}$ regarded as the subset of $\mathbb{H}^n$ given by $\{ \xi \in \mathbb{H}^n : |\xi| = 1 \}$. Here $\xi = (\xi^1,\ldots,\xi^n)^t$ and $|\xi|^2 = \sum_i \bar{\xi}^i \xi^i$. Now we define a right action of $\mathrm{U}(1,\mathbb{H})$ on $S^{4n-1}$ by $(\xi, a) \mapsto \xi a = (\xi^1a,\ldots,\xi^na)^t$. This action is free and proper. The quotient is the quaternionic projective space $\mathbb{H}P^{n-1}$, and we have the quotient map denoted by $\wp$,
$$\wp : S^{4n-1} \to \mathbb{H}P^{n-1}.$$
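This map is also referred to as a Hopf map. Recall that $\mathbb{Z}_2 = \{1,-1\}$ acts on $S^{n-1} \subset \mathbb{R}^n$ on the right (or left) by multiplication, and the action is a (discrete) proper and free action with quotient $\mathbb{R}P^{n-1}$, and the examples above generalize this.

For completeness, we describe an atlas for $\mathbb{H}P^{n-1}$. View $\mathbb{H}P^{n-1}$ as the quotient $S^{4n-1}/\sim$ described above. Let
$$U_k := \{ [\xi] : \xi \in S^{4n-1} \subset \mathbb{H}^n,\ \xi_k \neq 0 \},$$
and define $\varphi_k : U_k \to \mathbb{H}^{n-1}$ by
$$\varphi_k([\xi]) = (\xi_1\xi_k^{-1}, \ldots, \widehat{1}, \ldots, \xi_n\xi_k^{-1}),$$
where as before, the caret symbol $\widehat{\ }$ indicates that we have omitted the 1 in the $k$-th slot to obtain an element of $\mathbb{H}^{n-1}$. Notice that we insist that the $\xi_k^{-1}$ in this expression multiply from the right. The general pattern for the overlap maps becomes clear from the example $\varphi_3 \circ \varphi_2^{-1}$. Here we have
$$\varphi_3 \circ \varphi_2^{-1}(y_1, y_3, \ldots, y_n) = \varphi_3([y_1, 1, y_3, \ldots, y_n]) = (y_1y_3^{-1},\ y_3^{-1},\ y_4y_3^{-1},\ \ldots,\ y_ny_3^{-1}).$$

In the case $n = 2$, that is, for $\mathbb{H}P^1$, we have an atlas of just two charts $\{(U_1,\varphi_1), (U_2,\varphi_2)\}$. In close analogy with the complex case we have $\varphi_1(U_1 \cap U_2) = \varphi_2(U_1 \cap U_2) = \mathbb{H} \setminus \{0\}$ and $\varphi_1 \circ \varphi_2^{-1}(y) = y^{-1} = \varphi_2 \circ \varphi_1^{-1}(y)$ for $y \in \mathbb{H} \setminus \{0\}$.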
Exercise 5.122. Show that by identifying $\mathbb{H}$ with $\mathbb{R}^4$ and modifying the stereographic charts on $S^4 \subset \mathbb{R}^5$ we can obtain an atlas for $S^4$ with overlap maps of the same form as for $\mathbb{H}P^1$ given above. Use this to show that $\mathbb{H}P^1 \cong S^4$.

Combining the last exercise with previous results we have
$$\mathbb{R}P^1 \cong S^1, \qquad \mathbb{C}P^1 \cong S^2, \qquad \mathbb{H}P^1 \cong S^4.$$
5.8. Homogeneous Spaces

Let $H$ be a closed Lie subgroup of a Lie group $G$. Then we have a right action of $H$ on $G$ given by right multiplication $r : G \times H \to G$. The orbits of this right action are exactly the left cosets, that is, the elements of the quotient $G/H$. The action is clearly free, and we would like to show that it is also proper. Since we are now talking about a right action, and $G$ is the manifold on which we are acting, we need to show that the map $P_{\mathrm{right}} : G \times H \to G \times G$ given by $(p,h) \mapsto (p,ph)$ is a proper map. The characterization of proper action becomes the condition that
$$H_K := \{ h \in H : (K \cdot h) \cap K \neq \emptyset \}$$
is a compact subset of $H$ whenever $K$ is compact in $G$. Let $K$ be any compact subset of $G$. It will suffice to show that $H_K$ is sequentially compact. To this end, let $(h_i)_{i \in \mathbb{N}}$ be a sequence in $H_K$. Then there must be sequences $(a_i)$ and $(b_i)$ in $K$ such that $a_ih_i = b_i$. Since $K$ is compact and hence sequentially compact, we can pass to subsequences $(a_{i(j)})_{j \in \mathbb{N}}$ and $(b_{i(j)})_{j \in \mathbb{N}}$ such that $\lim_{j \to \infty} a_{i(j)} = a$ and $\lim_{j \to \infty} b_{i(j)} = b$. Here $j \mapsto i(j)$ is a monotonic map on positive integers, $\mathbb{N} \to \mathbb{N}$. This means that $\lim_{j \to \infty} h_{i(j)} = \lim_{j \to \infty} a_{i(j)}^{-1}b_{i(j)} = a^{-1}b$. The limit $a^{-1}b$ lies in $H$ because $H$ is closed, and it lies in $H_K$ because $a(a^{-1}b) = b \in K$. Thus the original sequence $\{h_i\}$ is shown to have a convergent subsequence. Since by Theorem 5.81, $H$ is an embedded submanifold, this sequence converges in the topology of $H$. We conclude that the right action is proper. Using Theorem 5.119 (or its analogue for right actions), we obtain the following proposition.
Proposition 5.123. Let $H$ be a closed Lie subgroup of a Lie group $G$. Then,

(i) the right action $G \times H \to G$ is free and proper;

(ii) the orbit space is the left coset space $G/H$, and this has a unique smooth manifold structure such that the quotient map $\pi : G \to G/H$ is a submersion. Furthermore, $\dim(G/H) = \dim(G) - \dim(H)$.
If $K$ is a normal Lie subgroup of $G$, then the quotient is a group with multiplication defined by $[g_1][g_2] = (g_1K)(g_2K) = g_1g_2K$. In this case, we may ask whether $G/K$ is a Lie group. If $K$ is closed, then we know from the considerations above that $G/K$ is a smooth manifold and that the quotient map is smooth. In fact, we have the following:

Proposition 5.124 (Quotient Lie groups). If $K$ is a closed normal subgroup of a Lie group $G$, then $G/K$ is a Lie group and the quotient map $G \to G/K$ is a Lie group homomorphism. Furthermore, if $f : G \to H$ is a surjective Lie group homomorphism, then $\operatorname{Ker}(f)$ is a closed normal subgroup, and the induced map $\bar{f} : G/\operatorname{Ker}(f) \to H$ is a Lie group isomorphism.

Proof. We have already observed that $G/K$ is a smooth manifold and that the quotient map is smooth. After taking into account what we know from standard group theory, the only thing we need to prove for the first part is that the multiplication and inversion in the quotient are smooth. It is an easy exercise using Corollary 3.27 to show that both of these maps are smooth.

Consider a Lie group homomorphism $f$ as in the hypothesis of the proposition. It is standard that $\operatorname{Ker}(f)$ is a normal subgroup, and it is clearly closed. It is also easy to verify the fact that the induced map $\bar{f}$ is an isomorphism. One can then use Corollary 3.27 to show that the induced map $\bar{f}$ is smooth. $\square$

If a Lie group $G$ acts smoothly and transitively on $M$ (on the right or left), then $M$ is called a homogeneous space with respect to that action. Of course it is possible that a single group $G$ may act on $M$ in more than one way, and so $M$ may be a homogeneous space in more than one way. We will give a few concrete examples shortly, but we already have an abstract example on hand.
Theorem 5.125. If $H$ is a closed Lie subgroup of a Lie group $G$, then the map $l : G \times G/H \to G/H$, given by $l : (g, g_1H) \mapsto gg_1H$, is a transitive Lie group action. Thus $G/H$ is a homogeneous space with respect to this action.

Proof. The fact that $l$ is well-defined follows since if $g_1H = g_2H$, then $g_2^{-1}g_1 \in H$, and so $gg_2H = gg_2g_2^{-1}g_1H = gg_1H$. We already know that $G/H$ is a smooth manifold and $\pi : G \to G/H$ is a surjective submersion.
We can form another submersion $\mathrm{id}_G \times \pi : G \times G \to G \times G/H$ making the following diagram commute:

$$
\begin{array}{ccc}
G \times G & \longrightarrow & G \\
{\scriptstyle \mathrm{id}_G \times \pi}\downarrow & \searrow & \downarrow{\scriptstyle \pi} \\
G \times G/H & \xrightarrow{\;\;l\;\;} & G/H
\end{array}
$$

Here the upper horizontal map is group multiplication and the lower horizontal map is the action $l$. Since the diagonal map is smooth, it follows from Proposition 3.26 that $l$ is smooth. We see that $l$ is transitive by observing that if $g_1H, g_2H \in G/H$, then $l_{g_2g_1^{-1}}(g_1H) = g_2H$. $\square$

It turns out that up to appropriate equivalence, the examples of the above type account for all homogeneous spaces. Before proving this let us look at some concrete examples.

Example 5.126. Let $M = \mathbb{R}^n$ and let $G = \operatorname{Euc}(n,\mathbb{R})$ be the group of Euclidean motions. We realize $\operatorname{Euc}(n,\mathbb{R})$ as a matrix group
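$$\operatorname{Euc}(n,\mathbb{R}) = \left\{ \begin{bmatrix} 1 & 0 \\ v & Q \end{bmatrix} : v \in \mathbb{R}^n \text{ and } Q \in \mathrm{O}(n) \right\}.$$
The action of $\operatorname{Euc}(n,\mathbb{R})$ on $\mathbb{R}^n$ is given by the rule
$$\begin{bmatrix} 1 & 0 \\ v & Q \end{bmatrix} \cdot x = Qx + v,$$
where $x$ is written as a column vector. Notice that this action is not given by a matrix multiplication, but one can use the trick of representing the points $x$ of $\mathbb{R}^n$ by the $(n+1) \times 1$ column vectors $\begin{bmatrix} 1 \\ x \end{bmatrix}$, and then we have
$$\begin{bmatrix} 1 & 0 \\ v & Q \end{bmatrix} \begin{bmatrix} 1 \\ x \end{bmatrix} = \begin{bmatrix} 1 \\ Qx + v \end{bmatrix}.$$
The action is easily seen to be transitive.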
Example 5.127. As in the previous example we take $M = \mathbb{R}^n$, but this time, the group acting is the affine group $\operatorname{Aff}(n,\mathbb{R})$ realized as a matrix group:
$$\operatorname{Aff}(n,\mathbb{R}) = \left\{ \begin{bmatrix} 1 & 0 \\ v & A \end{bmatrix} : v \in \mathbb{R}^n \text{ and } A \in \mathrm{GL}(n,\mathbb{R}) \right\}.$$
The action is
$$\begin{bmatrix} 1 & 0 \\ v & A \end{bmatrix} \cdot x = Ax + v,$$
and this is again a transitive action.

Comparing these first two examples, we see that we have made $\mathbb{R}^n$ into a homogeneous space in two different ways. It is sometimes desirable to give different names and/or notations for $\mathbb{R}^n$ to distinguish how we are acting on the space. In the first example we might write $\mathbb{E}^n$ (Euclidean space), and
in the second case we write $\mathbb{A}^n$ and refer to it as affine space. Note that, roughly speaking, the action by $\operatorname{Euc}(n,\mathbb{R})$ preserves all metric properties of figures such as curves defined in $\mathbb{E}^n$. On the other hand, $\operatorname{Aff}(n,\mathbb{R})$ always sends lines to lines, planes to planes, etc.

Example 5.128. Let $M = H := \{ z \in \mathbb{C} : \operatorname{Im} z > 0 \}$. This is the upper half-plane. The group acting on $H$ will be $\mathrm{SL}(2,\mathbb{R})$, and the action is given by
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \cdot z = \frac{az + b}{cz + d}.$$
This action is transitive.
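The claims of Example 5.128 can be probed numerically. The sketch below checks that the formula defines an action (the matrix product acts as the composition), that the upper half-plane is preserved, and that the point $i$ can be moved to any $x + iy$ with $y > 0$. The explicit matrix used for transitivity is a standard choice brought in here as an assumption, and all function names are hypothetical.

```python
import math


def mobius(m, z):
    """Apply m = ((a, b), (c, d)) to z by z -> (a*z + b) / (c*z + d)."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)


def matmul(m1, m2):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = m1
    (e, f), (g, h) = m2
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def to_point(x, y):
    """A determinant-one matrix sending i to x + i*y, assuming y > 0."""
    s = math.sqrt(y)
    return ((s, x / s), (0.0, 1.0 / s))


g = ((2.0, 1.0), (3.0, 2.0))   # determinant 2*2 - 1*3 = 1
h = to_point(-1.5, 0.25)
z = complex(0.3, 1.7)

composed = mobius(matmul(g, h), z)     # (g h) . z
iterated = mobius(g, mobius(h, z))     # g . (h . z)
target = mobius(h, 1j)                 # should equal -1.5 + 0.25i
```

Transitivity follows from the last check: any two points of $H$ can be joined by first moving one of them to $i$ with the inverse of such a matrix and then moving $i$ to the other.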
Example 5.129. We have already seen in Example 5.101 and Exercise 5.102 that both $\mathrm{O}(n)$ and $\mathrm{SO}(n)$ act transitively on the sphere $S^{n-1} \subset \mathbb{R}^n$, so $S^{n-1}$ is a homogeneous space in at least two (slightly) different ways. Also, both $\mathrm{SU}(n)$ and $\mathrm{U}(n)$ act transitively on $S^{2n-1} \subset \mathbb{C}^n$.

Example 5.130. Let $\tilde{V}_{n,k}$ denote the set of all $k$-frames for $\mathbb{R}^n$, where by a $k$-frame we mean an ordered set of $k$ linearly independent vectors. Thus an $n$-frame is just an ordered basis for $\mathbb{R}^n$. This set can easily be given a smooth manifold structure. This manifold is called the (real) Stiefel manifold of $k$-frames. The Lie group $\mathrm{GL}(n,\mathbb{R})$ acts (smoothly) on $\tilde{V}_{n,k}$ by $g \cdot (e_1,\ldots,e_k) = (ge_1,\ldots,ge_k)$. To see that this action is transitive, let $(e_1,\ldots,e_k)$ and $(f_1,\ldots,f_k)$ be two $k$-frames. Extend each to $n$-frames $(e_1,\ldots,e_k,\ldots,e_n)$ and $(f_1,\ldots,f_k,\ldots,f_n)$. Since we consider elements of $\mathbb{R}^n$ as column vectors, these two $n$-frames can be viewed as invertible $n \times n$ matrices $E$ and $F$. If we let $g := FE^{-1}$, then $gE = F$, or $g \cdot (e_1,\ldots,e_k) = (ge_1,\ldots,ge_k) = (f_1,\ldots,f_k)$.

Example 5.131. Let $V_{n,k}$ denote the set of all orthonormal $k$-frames for $\mathbb{R}^n$, where by an orthonormal $k$-frame we mean an ordered set of $k$ orthonormal vectors. Thus an orthonormal $n$-frame is just an orthonormal basis for $\mathbb{R}^n$. This set can easily be given a smooth manifold structure and is called the Stiefel manifold of orthonormal $k$-frames. The group $\mathrm{O}(n,\mathbb{R})$ acts transitively on $V_{n,k}$ for reasons similar to those given in the last example.

Theorem 5.132. Let $M$ be a homogeneous space via the transitive action $l : G \times M \to M$, and let $G_p$ be the isotropy subgroup of a point $p \in M$. Recall that $G$ acts on $G/G_p$. If $G/G_p$ is second countable (in particular if $G$ is second countable), then there is an equivariant diffeomorphism $\phi : G/G_p \to M$ given by $\phi(gG_p) = g \cdot p$.
Proof. The map $\phi : gG_p \mapsto g \cdot p$ is well defined since $gh \cdot p = g \cdot p$ for every $h \in G_p$. This map is surjective by the transitivity of the action $l$. It is also injective since if $\phi(g_1G_p) = \phi(g_2G_p)$, then $g_1 \cdot p = g_2 \cdot p$ or $(g_1^{-1}g_2) \cdot p = p$, which by definition means that $g_1^{-1}g_2 \in G_p$ and $g_1G_p = g_2G_p$. Notice that the following diagram commutes:

$$
\begin{array}{ccc}
G & & \\
{\scriptstyle \pi}\downarrow & \searrow{\scriptstyle \theta_p} & \\
G/G_p & \xrightarrow{\;\phi\;} & M
\end{array}
$$

From Corollary 3.27 we see that $\phi$ is smooth. To show that $\phi$ is a diffeomorphism, it suffices to show that the rank of $\phi$ is equal to $\dim M$, or in other words that $\phi$ is a submersion. Since $\phi(gg_1G_p) = (gg_1) \cdot p = g\phi(g_1G_p)$, the map $\phi$ is equivariant and so has constant rank. By Lemma 3.28, $\phi$ is a submersion and hence in the present case a diffeomorphism. $\square$

Without the technical assumption on second countability, the proof shows that we still have that $\phi : G/G_p \to M$ is a smooth equivariant bijection.

Exercise 5.133. Show that if instead of the hypothesis of second countability in the last theorem we assume that $\theta_p$ has full rank at the identity, then $\phi : G/G_p \to M$ is a diffeomorphism.
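Let $l : G \times M \to M$ be a left Lie group action and fix $p_0 \in M$. Denote the projection onto cosets by $\pi$ and also write $\theta_{p_0} : g \mapsto gp_0$ as before. Then we have the following equivalence of maps:
$$\begin{array}{ccc} G & = & G \\ {\scriptstyle \pi}\downarrow & & \downarrow{\scriptstyle \theta_{p_0}} \\ G/G_{p_0} & \cong & M \end{array}$$

Exercise 5.134. Let $G$ act on $M$ as above. Show that if $p_2 = gp_1$ for some $g \in G$ and $p_1, p_2 \in M$, then there is a natural Lie group isomorphism $G_{p_1} \cong G_{p_2}$ and a natural equivariant diffeomorphism $G/G_{p_1} \cong G/G_{p_2}$.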
We now look again at some of our examples of homogeneous spaces and apply the above theorem.

Example 5.135. Consider again Example 5.126. The isotropy group of the origin in $\mathbb{R}^n$ is the subgroup consisting of matrices of the form
$$\begin{pmatrix} 1 & 0 \\ 0 & Q \end{pmatrix},$$
where $Q \in O(n)$. This group is clearly isomorphic to $O(n,\mathbb{R})$, and so by the above theorem we have an equivariant diffeomorphism
$$\mathbb{R}^n \cong \frac{\mathrm{Euc}(n,\mathbb{R})}{O(n,\mathbb{R})}.$$

Example 5.136. Consider again Example 5.127. The isotropy group of the origin in $\mathbb{R}^n$ is the subgroup consisting of matrices of the form
$$\begin{pmatrix} 1 & 0 \\ 0 & A \end{pmatrix},$$
where $A \in GL(n,\mathbb{R})$. This group is clearly isomorphic to $GL(n,\mathbb{R})$, and so by the above theorem we have an equivariant diffeomorphism
$$\mathbb{R}^n \cong \frac{\mathrm{Aff}(n,\mathbb{R})}{GL(n,\mathbb{R})}.$$
It is important to realize that there is an implied action on $\mathbb{R}^n$, which is different from that in the previous example.

Example 5.137. Consider the action of $SL(2,\mathbb{R})$ on the complex upper half-plane $H = \mathbb{C}_+$ as in Example 5.128. We determine the isotropy subgroup for the point $i = \sqrt{-1}$. A matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is in this subgroup if and only if
$$\frac{ai + b}{ci + d} = i.$$
Equating real and imaginary parts in $ai + b = i(ci + d) = -c + di$ shows that this is true exactly if $a = d$ and $b = -c$. Since $\det A = ad - bc = 1$, we then have $a^2 + b^2 = 1$, and so the isotropy subgroup is $SO(2,\mathbb{R})$ ($\cong S^1 = U(1,\mathbb{C})$). Thus we have an equivariant diffeomorphism
$$H = \mathbb{C}_+ \cong \frac{SL(2,\mathbb{R})}{SO(2,\mathbb{R})}.$$

Example 5.138. From Example 5.129 we obtain
$$S^{n-1} \cong \frac{O(n)}{O(n-1)}, \qquad S^{n-1} \cong \frac{SO(n)}{SO(n-1)},$$
$$S^{2n-1} \cong \frac{U(n)}{U(n-1)}, \qquad S^{2n-1} \cong \frac{SU(n)}{SU(n-1)}.$$
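The isotropy computation of Example 5.137 can be sanity-checked numerically. The sketch below assumes the action of Example 5.128 is the usual one by linear fractional transformations, $z \mapsto (az + b)/(cz + d)$; the function names `mobius`, `rotation` and `to_point` are illustrative and not taken from the text.

```python
import math

def mobius(A, z):
    """Action of A = ((a, b), (c, d)) on z by z -> (az + b)/(cz + d)."""
    (a, b), (c, d) = A
    return (a * z + b) / (c * z + d)

def rotation(t):
    """An element of SO(2, R): a = d = cos t, b = -c = -sin t."""
    return ((math.cos(t), -math.sin(t)), (math.sin(t), math.cos(t)))

def to_point(x, y):
    """A determinant-one matrix sending i to x + iy (requires y > 0)."""
    s = math.sqrt(y)
    return ((s, x / s), (0.0, 1.0 / s))

i = 1j

# Every rotation fixes i, as the conditions a = d, b = -c predict.
fixed = all(abs(mobius(rotation(t), i) - i) < 1e-12 for t in (0.0, 0.3, 1.0, 2.5))

# A determinant-one matrix that is not a rotation moves i.
shear = ((1.0, 1.0), (0.0, 1.0))
moved = mobius(shear, i)

# Transitivity on the upper half-plane: reach 2 + 3i from i.
target = mobius(to_point(2.0, 3.0), i)

print(fixed, moved, target)
```

The last check illustrates why the orbit map from $SL(2,\mathbb{R})$ to the upper half-plane is onto, which is what makes the quotient description applicable.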
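Example 5.139. Let $(e_1, \ldots, e_n)$ be the standard basis for $\mathbb{R}^n$. Under the action of $GL(n,\mathbb{R})$ on $V'_{n,k}$ given in Example 5.130, the isotropy group of the $k$-frame $(e_{n-k+1}, \ldots, e_n)$ is the subgroup of $GL(n,\mathbb{R})$ of the form
$$\begin{pmatrix} A & 0 \\ 0 & \mathrm{id} \end{pmatrix} \quad \text{for } A \in GL(n-k,\mathbb{R}).$$
We identify this group with $GL(n-k,\mathbb{R})$ and then we obtain
$$V'_{n,k} \cong \frac{GL(n,\mathbb{R})}{GL(n-k,\mathbb{R})}.$$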
Example 5.140. A similar analysis leads to an equivariant diffeomorphism
$$V_{n,k} \cong \frac{O(n,\mathbb{R})}{O(n-k,\mathbb{R})},$$
where $V_{n,k}$ is the Stiefel manifold of orthonormal $k$-frames of Example 5.131. Notice that taking $k = 1$ we recover the first diffeomorphism of Example 5.138.

Exercise 5.141. Show that if $k < n$, then we have $V_{n,k} \cong SO(n,\mathbb{R})/SO(n-k,\mathbb{R})$.
Next we introduce a couple of standard results concerning connectivity.

Proposition 5.142. Let $G$ be a Lie group acting smoothly on $M$. Let the action be a left (resp. right) action. If both $G$ and $M\backslash G$ (resp. $M/G$) are connected, then $M$ is connected.

Proof. Assume for concreteness that the action is a left action and that $G$ and $M\backslash G$ are connected. Suppose by way of contradiction that $M$ is not connected. Then there are disjoint nonempty open sets $U$ and $V$ whose union is $M$. Each orbit $G \cdot p$ is the image of the connected space $G$ under the orbit map $g \mapsto g \cdot p$ and so is connected. This means that each orbit must be contained in one and only one of $U$ and $V$. Since the quotient map $\pi$ is an open map, $\pi(U)$ and $\pi(V)$ are open, and from what we have just observed they must be disjoint and $\pi(U) \cup \pi(V) = M\backslash G$. This contradicts the assumption that $M\backslash G$ is connected. □

Corollary 5.143. Let $H$ be a closed Lie subgroup of $G$. If both $H$ and $G/H$ are connected, then $G$ is connected.

Corollary 5.144. For each $n \geq 1$, the groups $SO(n)$, $SU(n)$ and $U(n)$ are connected while the group $O(n)$ has exactly two components: $SO(n)$ and the subset of $O(n)$ consisting of elements with determinant $-1$.

Proof. The groups $SO(1)$ and $SU(1)$ are both connected since they each contain only one element. The group $U(1)$ is the circle, and so it too is connected. We use induction. Suppose that $SO(k)$, $SU(k)$ and $U(k)$ are connected for $1 \leq k \leq n-1$. We show that this implies that $SO(n)$, $SU(n)$ and $U(n)$ are connected. From Example 5.138 we know that $S^{n-1} = SO(n)/SO(n-1)$. Since $S^{n-1}$ and $SO(n-1)$ are connected (the second by the induction hypothesis), we see that $SO(n)$ is connected. The same argument works for $SU(n)$ and $U(n)$.

Every element of $O(n)$ has determinant either $1$ or $-1$. The subset $SO(n) \subset O(n)$ is closed since it is exactly $\{g \in O(n) : \det g = 1\}$. Fix an element $a_0$ with $\det a_0 = -1$. It is easy to show that $a_0SO(n)$ is exactly the set of elements of $O(n)$ with determinant $-1$ so that $SO(n) \cup a_0SO(n) = O(n)$ and $SO(n) \cap a_0SO(n) = \emptyset$. Indeed, by the multiplicative property of determinants, each element of $a_0SO(n)$ has determinant $-1$. But $a_0SO(n)$
also contains every element of determinant $-1$ since for any such $g$ we have $g = a_0(a_0^{-1}g)$ and $a_0^{-1}g \in SO(n)$. Since $SO(n)$ and $a_0SO(n)$ are both closed and are complements of each other, they are also both open. Both sets are connected, since $g \mapsto a_0g$ is a diffeomorphism which maps the first to the second. Thus we see that $SO(n)$ and $a_0SO(n)$ are the connected components of $O(n)$. □

We close this chapter by relating the notion of a Lie group action with that of a Lie group representation. We give just a few basic definitions, some of which will be used in the next chapter.

Definition 5.145. A linear action of a Lie group $G$ on a finite-dimensional vector space $V$ is a left Lie group action $\lambda : G \times V \to V$ such that for each $g \in G$ the map $\lambda_g : v \mapsto \lambda(g, v)$ is linear. The map $G \to GL(V)$ given by $g \mapsto \lambda(g) := \lambda_g$ is a Lie group homomorphism and will be denoted by the same letter $\lambda$ as the action so that $\lambda(g)v := \lambda(g, v)$. A Lie group homomorphism $\lambda : G \to GL(V)$ is called a representation of $G$. Given such a representation, we obtain a linear action by letting $\lambda(g, v) := \lambda(g)v$. Thus a linear action of a Lie group is basically the same as a Lie group representation. The kernel of the action is the kernel of the associated homomorphism (representation). An effective linear action is one such that the associated homomorphism has trivial kernel, which, in turn, is the same as saying that the representation is faithful. Two representations $\lambda : G \to GL(V)$ and $\lambda' : G \to GL(V')$ are equivalent if there exists a linear isomorphism $T : V \to V'$ such that $T \circ \lambda_g = \lambda'_g \circ T$ for all $g$.

Exercise 5.146. Show that if $\lambda : G \times V \to V$ is a map such that $\lambda_g : v \mapsto \lambda(g, v)$ is linear for all $g$, then $\lambda$ is smooth if and only if the map $G \to GL(V)$ given by $g \mapsto \lambda_g$ is smooth. (Assume that $V$ is finite-dimensional as usual.)

We have already seen one important example of a Lie group representation, namely, the adjoint representation.
The adjoint representation came from first considering the action of $G$ on itself given by conjugation, which leaves the identity element fixed. The idea can be generalized:

Theorem 5.147. Let $l : G \times M \to M$ be a (left) Lie group action. Suppose that $p_0 \in M$ is a fixed point of the action ($l_g(p_0) = p_0$ for all $g$). The map
$$l^{(p_0)} : G \to GL(T_{p_0}M) \quad \text{given by} \quad l^{(p_0)}(g) := T_{p_0}l_g$$
is a Lie group representation.
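Proof. Since
$$l^{(p_0)}(g_1g_2) = T_{p_0}(l_{g_1g_2}) = T_{p_0}(l_{g_1} \circ l_{g_2}) = T_{p_0}l_{g_1} \circ T_{p_0}l_{g_2} = l^{(p_0)}(g_1)\, l^{(p_0)}(g_2),$$
we see that $l^{(p_0)}$ is a homomorphism. We must show that $l^{(p_0)}$ is smooth. It will be enough to show that $g \mapsto \alpha(T_{p_0}l_g \cdot v)$ is smooth for any $v \in T_{p_0}M$ and any $\alpha \in T^*_{p_0}M$. This will follow if we can show that for fixed $v_0 \in T_{p_0}M$, the map $G \to TM$ given by $g \mapsto T_{p_0}l_g \cdot v_0$ is smooth. This map is a composition
$$G \to TG \times TM \cong T(G \times M) \xrightarrow{\ Tl\ } TM,$$
where the first map is $g \mapsto (0_g, v_0)$, which is clearly smooth. By Exercise 5.146 this implies that the map $G \times T_{p_0}M \to T_{p_0}M$ given by $(g, v) \mapsto T_{p_0}l_g \cdot v$ is smooth. □

Definition 5.148. For a Lie group action $l : G \times M \to M$ with fixed point $p_0$, the representation $l^{(p_0)}$ from the last theorem is called the isotropy representation for the fixed point.

Now let us consider a transitive Lie group action $l : G \times M \to M$ and a point $p_0$. For notational convenience, denote the isotropy subgroup $G_{p_0}$ by $H$. Then $H$ acts on $M$ by restriction. We denote this action by $\lambda : H \times M \to M$, $\lambda : (h, p) \mapsto hp$ for $h \in H = G_{p_0}$.

Notice that $p_0$ is a fixed point of this action, and so we have an isotropy representation $\lambda^{(p_0)} : H \times T_{p_0}M \to T_{p_0}M$. On the other hand, we have another action $C : H \times G \to G$, where $C_h : G \to G$ is given by $g \mapsto hgh^{-1}$ for $h \in H$. The Lie differential of $C_h$ is the adjoint map $\mathrm{Ad}_h : \mathfrak{g} \to \mathfrak{g}$. The map $C_h$ maps $H$ to itself, and $\mathrm{Ad}_h$ maps $\mathfrak{h}$ to itself. Thus the map $\mathrm{Ad}_h : \mathfrak{g} \to \mathfrak{g}$ descends to a map $\overline{\mathrm{Ad}}_h : \mathfrak{g}/\mathfrak{h} \to \mathfrak{g}/\mathfrak{h}$. We are going to show that there is a natural isomorphism $T_{p_0}M \cong \mathfrak{g}/\mathfrak{h}$ such that for each $h \in H$ the following diagram commutes:
$$\begin{array}{ccc} \mathfrak{g}/\mathfrak{h} & \xrightarrow{\ \overline{\mathrm{Ad}}_h\ } & \mathfrak{g}/\mathfrak{h} \\ \downarrow & & \downarrow \\ T_{p_0}M & \xrightarrow{\ \lambda^{(p_0)}(h)\ } & T_{p_0}M \end{array} \tag{5.4}$$
One way to state the meaning of this result is to say that $h \mapsto \overline{\mathrm{Ad}}_h$ is a representation of $H$ on the vector space $\mathfrak{g}/\mathfrak{h}$, which is equivalent to the linear isotropy representation. The isomorphism $T_{p_0}M \cong \mathfrak{g}/\mathfrak{h}$ is given in the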
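following very natural way: Let $\xi \in \mathfrak{h}$ and consider $T_e\pi(\xi) \in T_{p_0}M$. We have
$$T_e\pi(\xi) = \left.\frac{d}{dt}\right|_{t=0} \pi(\exp \xi t) = 0$$
since $\exp \xi t \in H$ for all $t$. Thus $\mathfrak{h} \subset \mathrm{Ker}(T_e\pi)$. On the other hand, $\dim \mathfrak{h} = \dim H = \dim(\mathrm{Ker}(T_e\pi))$, so in fact $\mathfrak{h} = \mathrm{Ker}(T_e\pi)$ and we obtain an isomorphism $\mathfrak{g}/\mathfrak{h} \cong T_{p_0}M$ induced from $T_e\pi$. Let us see why the diagram (5.4) commutes. First, $L_h$ is well-defined as a map from $G/H$ to itself and the following diagram clearly commutes:
$$\begin{array}{ccc} G & \xrightarrow{\ C_h\ } & G \\ {\scriptstyle \pi}\downarrow & & \downarrow{\scriptstyle \pi} \\ G/H & \xrightarrow{\ L_h\ } & G/H \end{array}$$
Using our equivariant diffeomorphism $\phi : G/H \to M$, we obtain an equivalent commutative diagram
$$\begin{array}{ccc} G & \xrightarrow{\ C_h\ } & G \\ {\scriptstyle \theta_{p_0}}\downarrow & & \downarrow{\scriptstyle \theta_{p_0}} \\ M & \xrightarrow{\ \lambda_h\ } & M \end{array}$$
Applying these maps to $\exp t\xi$ for $\xi \in \mathfrak{g}$, we have
$$\begin{array}{ccc} \exp t\xi & \overset{C_h}{\longmapsto} & h(\exp t\xi)h^{-1} \\ {\scriptstyle \theta_{p_0}}\downarrow & & \downarrow{\scriptstyle \theta_{p_0}} \\ (\exp t\xi)p_0 & \overset{\lambda_h}{\longmapsto} & h(\exp t\xi)p_0 \end{array}$$
Applying the tangent functor (looking at the differential), we get the commutative diagram
$$\begin{array}{ccc} \mathfrak{g} & \xrightarrow{\ \mathrm{Ad}_h\ } & \mathfrak{g} \\ \downarrow & & \downarrow \\ T_{p_0}M & \xrightarrow{\ \lambda^{(p_0)}(h)\ } & T_{p_0}M \end{array}$$
and, taking quotients, this gives the desired commutative diagram (5.4).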
5.9. Combining Representations

We close this chapter with a bit about constructing new linear representations from old ones. Suppose that $V$ is an $\mathbb{F}$-vector space and let $B = (v_1, \ldots, v_n)$ be a basis for $V$. Then denoting the matrix representative of $\lambda_g$ with respect to $B$ by $[\lambda_g]_B$ we obtain a homomorphism $G \to GL(n, \mathbb{F})$
given by $g \mapsto [\lambda_g]_B$. In general, a Lie group homomorphism of a Lie group $G$ into $GL(n, \mathbb{F})$ is called a matrix representation of $G$. We have already seen that any Lie subgroup of $GL(\mathbb{R}^n)$ acts on $\mathbb{R}^n$ by matrix multiplication, and the corresponding homomorphism is the inclusion map $G \hookrightarrow GL(\mathbb{R}^n)$. More generally, a Lie subgroup $G$ of $GL(V)$ acts on $V$ in the obvious way simply by employing the definition of $GL(V)$ as a set of linear transformations of $V$. We call this the standard action of the linear Lie subgroup of $GL(V)$ on $V$, and the corresponding homomorphism is just the inclusion map $G \hookrightarrow GL(V)$. Choosing a basis, the subgroup corresponds to a matrix group, and the standard action becomes matrix multiplication on the left of $\mathbb{F}^n$, where the latter is viewed as a space of column vectors. This action of a matrix group on column vectors is also referred to as a standard action.

Given a representation $\lambda$ of $G$ in a vector space $V$, we have a dual representation $\lambda^*$ of $G$ in the dual space $V^*$ by defining $\lambda^*(g) := (\lambda(g^{-1}))^* : V^* \to V^*$. Recall that if $L : V \to V$ is linear, then $L^* : V^* \to V^*$ is defined by $L^*(\alpha)(v) = \alpha(Lv)$ for $\alpha \in V^*$ and $v \in V$. This dual representation is also sometimes called the contragredient representation (especially when $\mathbb{F} = \mathbb{R}$).

Now let $\lambda^V$ and $\lambda^W$ be representations of a Lie group $G$ in $\mathbb{F}$-vector spaces $V$ and $W$ respectively. We can then form the direct sum representation $\lambda^V \oplus \lambda^W$ by $(\lambda^V \oplus \lambda^W)_g := \lambda^V_g \oplus \lambda^W_g$ for $g \in G$, where we have $(\lambda^V_g \oplus \lambda^W_g)(v, w) = (\lambda^V_g v, \lambda^W_g w)$.

We will not pursue a serious study of Lie group representations but simply note that a major goal in the subject is the identification and classification of irreducible representations. A representation $\lambda : G \to GL(V)$ is said to be irreducible if there is no nonzero proper subspace $W$ of $V$ such that $\lambda_g(W) \subset W$ for all $g$. A large class of Lie groups known as semisimple Lie groups have the property that their representations break into direct sums of irreducible representations.
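The inverse in the definition $\lambda^*(g) := (\lambda(g^{-1}))^*$ is what makes $\lambda^*$ a homomorphism rather than an anti-homomorphism. A minimal sketch of this point, assuming the standard representation of $2 \times 2$ invertible matrices and the dual basis, in which the matrix of $\lambda^*(g)$ is the transpose of $g^{-1}$; the helper names below are illustrative only.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(A):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def dual(A):
    """Matrix of the dual representation in the dual basis: (A^{-1})^T."""
    return transpose(inverse(A))

def apply(A, v):
    return [sum(A[i][k] * v[k] for k in range(2)) for i in range(2)]

def pairing(alpha, v):
    """Evaluate the covector alpha on the vector v."""
    return sum(x * y for x, y in zip(alpha, v))

g = [[1, 1], [0, 1]]
h = [[1, 0], [1, 1]]

# With the inverse, g -> dual(g) respects products.
dual_is_hom = dual(matmul(g, h)) == matmul(dual(g), dual(h))

# Without the inverse, the transpose alone reverses the order of products.
transpose_is_hom = transpose(matmul(g, h)) == matmul(transpose(g), transpose(h))

# The natural pairing is invariant: (dual(g) alpha)(g v) = alpha(v).
alpha, v = [3, -2], [5, 7]
invariant = pairing(apply(dual(g), alpha), apply(g, v)) == pairing(alpha, v)

print(dual_is_hom, transpose_is_hom, invariant)
```

Exact rational arithmetic via `Fraction` is used so that the equalities are tested without floating-point tolerance.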
Example 5.149. A homogeneous polynomial of degree $d$ on $\mathbb{C}^2$ is a linear combination of monomials of total degree $d$. Let $H_j$ denote the vector space of homogeneous polynomials of degree $2j$, where $j$ is a nonnegative "half-integer" ($j = k/2$ for some nonnegative integer $k$). Define $\lambda_j : SU(2) \to GL(H_j)$ by $(\lambda_j(g)f)(z) := f(g^{-1}z)$ for $z = (z_1, z_2) \in \mathbb{C}^2$. Then $\lambda_j$ is an irreducible representation called the spin-$j$ representation of $SU(2)$. The spin-$1/2$ representation turns out to be equivalent to the standard representation of $SU(2)$ in $\mathbb{C}^2$. These spin representations play an important role in quantum physics.

One can also form the tensor product of representations. The definitions and basic facts about tensor products are given in the more general context of
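module theory in Appendix D. Here we give a quick recounting of the notion of a tensor product of vector spaces, and then we define tensor products of representations. Given two vector spaces $V$ and $W$ over some field $\mathbb{F}$, consider the class $C_{V \times W}$ consisting of all bilinear maps $V \times W \to X$, where $X$ varies over all $\mathbb{F}$-vector spaces, but $V$ and $W$ are fixed. We take members of $C_{V \times W}$ as the objects of a category (see Appendix A). A morphism from, say, $\mu_1 : V \times W \to X$ to $\mu_2 : V \times W \to Y$ is defined to be a linear map $l : X \to Y$ such that the diagram
$$\begin{array}{ccc} & & X \\ & {\scriptstyle \mu_1}\nearrow & \\ V \times W & & \downarrow{\scriptstyle l} \\ & {\scriptstyle \mu_2}\searrow & \\ & & Y \end{array}$$
commutes. There exists a vector space $T_{V,W}$ together with a bilinear map $\otimes : V \times W \to T_{V,W}$ that has the following universal property: For every bilinear map $\mu : V \times W \to X$, there is a unique linear map $\tilde{\mu} : T_{V,W} \to X$ such that the following diagram commutes:
$$\begin{array}{ccc} V \times W & \xrightarrow{\ \otimes\ } & T_{V,W} \\ & {\scriptstyle \mu}\searrow & \downarrow{\scriptstyle \tilde{\mu}} \\ & & X \end{array}$$
If such a pair $(T_{V,W}, \otimes)$ exists with this property, then it is unique up to isomorphism in $C_{V \times W}$. In other words, if $\hat{\otimes} : V \times W \to \hat{T}_{V,W}$ also has this universal property, then there is a linear isomorphism $T_{V,W} \cong \hat{T}_{V,W}$ such that the following diagram commutes:
$$\begin{array}{ccc} & & T_{V,W} \\ & {\scriptstyle \otimes}\nearrow & \\ V \times W & & \downarrow{\scriptstyle \cong} \\ & {\scriptstyle \hat{\otimes}}\searrow & \\ & & \hat{T}_{V,W} \end{array}$$
We refer to such a universal object as a tensor product of $V$ and $W$. We will indicate the construction of a specific tensor product that we denote by $V \otimes W$ with corresponding bilinear map $\otimes : V \times W \to V \otimes W$. The idea is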
simple: We let $V \otimes W$ be the set of all linear combinations of symbols of the form $v \otimes w$ for $v \in V$ and $w \in W$, subject to the relations
$$(v_1 + v_2) \otimes w = v_1 \otimes w + v_2 \otimes w,$$
$$v \otimes (w_1 + w_2) = v \otimes w_1 + v \otimes w_2,$$
$$r(v \otimes w) = rv \otimes w = v \otimes rw,$$
for $r \in \mathbb{F}$. The map $\otimes$ is then simply $\otimes : (v, w) \mapsto v \otimes w$. Somewhat more pedantically, let $F(V \times W)$ denote the free vector space generated by the set $V \times W$ (the elements of $V \times W$ are treated as a basis for the space, and so the free space has dimension equal to the cardinality of the set $V \times W$). Next consider the subspace $R$ of $F(V \times W)$ generated by the set of all elements of the form
$$(av, w) - a(v, w), \qquad (v, aw) - a(v, w),$$
$$(v_1 + v_2, w) - (v_1, w) - (v_2, w), \qquad (v, w_1 + w_2) - (v, w_1) - (v, w_2),$$
for $v_1, v_2, v \in V$, $w_1, w_2, w \in W$, and $a \in \mathbb{F}$. Then we let $V \otimes W$ be defined as the quotient vector space $F(V \times W)/R$, and we have a corresponding quotient map $F(V \times W) \to V \otimes W$. The set $V \times W$ is contained in $F(V \times W)$, and the map $\otimes : V \times W \to V \otimes W$ is then defined to be the restriction of the quotient map to $V \times W$. The image of $(v, w)$ under the quotient map is denoted by $v \otimes w$.

Tensor products of several vector spaces at a time are constructed similarly to be a universal space in a category of multilinear maps (Definition D.13). We may also form the tensor products two at a time and then use the easily proved fact that $(V \otimes W) \otimes U \cong V \otimes (W \otimes U)$, which is then denoted by $V \otimes W \otimes U$. Again the reader is referred to Appendix D for more about tensor products.

Elements of the form $v \otimes w$ generate $V \otimes W$, and in fact, if $(e_1, \ldots, e_r)$ is a basis for $V$ and $(f_1, \ldots, f_s)$ is a basis for $W$, then
$$\{e_i \otimes f_j : 1 \leq i \leq r,\ 1 \leq j \leq s\}$$
is a basis for $V \otimes W$, which therefore has dimension $rs = \dim V \dim W$.

One more observation: If $A : V \to X$ and $B : W \to Y$ are linear maps, then we can define a linear map $A \otimes B : V \otimes W \to X \otimes Y$. First note that the map $(v, w) \mapsto Av \otimes Bw$ is bilinear. Thus, by the universal property, there is a unique map $A \otimes B$ such that
$$(A \otimes B)(v \otimes w) = Av \otimes Bw, \quad \text{for all } v \in V,\ w \in W.$$
Notice that if $A$ and $B$ are invertible, then $A \otimes B$ is invertible with $(A \otimes B)^{-1}(v \otimes w) = A^{-1}v \otimes B^{-1}w$. Pick bases for $V$ and $W$ as above and bases $\{e'_1, \ldots, e'_{r'}\}$ and $\{f'_1, \ldots, f'_{s'}\}$ for $X$ and $Y$ respectively. For $\tau \in V \otimes W$, we can write $\tau = \tau^{ij} e_i \otimes f_j$ using the Einstein summation convention. We have
$$(A \otimes B)(\tau) = (A \otimes B)(\tau^{ij} e_i \otimes f_j) = \tau^{ij} Ae_i \otimes Bf_j = \tau^{ij} A^k_i e'_k \otimes B^l_j f'_l = \tau^{ij} A^k_i B^l_j\, (e'_k \otimes f'_l),$$
so that the matrix of $A \otimes B$ is given by $(A \otimes B)^{kl}_{ij} = A^k_i B^l_j$.

Let $\lambda^V$ and $\lambda^W$ be representations of a Lie group $G$ in $\mathbb{F}$-vector spaces $V$ and $W$, respectively. We can form a representation of $G$ in the tensor product space $V \otimes W$ by letting $(\lambda^V \otimes \lambda^W)_g := \lambda^V_g \otimes \lambda^W_g$ for all $g \in G$. There is a variation on the tensor product that is useful when we have two groups involved. If $\lambda^V$ is a representation of a Lie group $G_1$ in the $\mathbb{F}$-vector space $V$ and $\lambda^W$ is a representation of a Lie group $G_2$ in the $\mathbb{F}$-vector space $W$, then we can form a representation of the Lie group $G_1 \times G_2$, also called the tensor product representation and denoted $\lambda^V \otimes \lambda^W$ as before. In this case, the definition is $(\lambda^V \otimes \lambda^W)_{(g_1, g_2)} := \lambda^V_{g_1} \otimes \lambda^W_{g_2}$. Of course if it happens that $G_1 = G_2$, then we have an ambiguity since $\lambda^V \otimes \lambda^W$ could be a representation of $G$ or of $G \times G$. One can usually determine which version is meant from the context. Alternatively, one can use pairs to denote actions so that an action $\lambda : G \times V \to V$ is denoted $(G, \lambda)$. Then the two tensor product representations would be $(G \times G, \lambda^V \otimes \lambda^W)$ and $(G, \lambda^V \otimes \lambda^W)$, respectively.
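In coordinates, the matrix $(A \otimes B)^{kl}_{ij} = A^k_i B^l_j$ is the Kronecker product, and the fact that a tensor product of representations is again a homomorphism comes down to the identity $(A_1A_2) \otimes (B_1B_2) = (A_1 \otimes B_1)(A_2 \otimes B_2)$. The sketch below checks this identity on sample integer matrices; it assumes the basis $e_i \otimes f_j$ is ordered lexicographically in $(i, j)$, and the function names `matmul` and `kron` are illustrative.

```python
def matmul(A, B):
    """Product of matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    """Matrix of A (x) B: the entry in row (k, l), column (i, j) is A[k][i] * B[l][j]."""
    return [[A[k][i] * B[l][j]
             for i in range(len(A[0])) for j in range(len(B[0]))]
            for k in range(len(A)) for l in range(len(B))]

# Sample matrices standing in for lambda^V(g1), lambda^V(g2), lambda^W(g1), lambda^W(g2).
A1 = [[1, 2], [3, 4]]
A2 = [[0, 1], [1, 1]]
B1 = [[2, 0], [1, 1]]
B2 = [[1, 1], [0, 2]]

lhs = kron(matmul(A1, A2), matmul(B1, B2))
rhs = matmul(kron(A1, B1), kron(A2, B2))
mixed_product_holds = lhs == rhs

# dim(V (x) W) = dim V * dim W
dimension = len(kron(A1, B1))

print(mixed_product_holds, dimension)
```

The same identity covers both versions of the tensor product representation: for $G$ one uses the same group element in both factors, while for $G_1 \times G_2$ the two factors vary independently.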
Problems

(1) Verify that each of the groups described in Section 5.2 is (isomorphic to) a Lie subgroup of an appropriate Lie group of linear automorphisms.

(2) Show that proper continuous maps are closed maps.

(3) Show that $SL(2, \mathbb{C})$ is simply connected and that $\wp : SL(2, \mathbb{C}) \to \text{Mob}$ is a universal covering homomorphism. See Example 5.50.

(4) Prove Proposition 5.83.

(5) Let $\mathfrak{g}$ be a Lie algebra of a Lie group $G$. Show that the set of all automorphisms of $\mathfrak{g}$, denoted $\mathrm{Aut}(\mathfrak{g})$, forms a Lie group (actually a Lie subgroup of $GL(\mathfrak{g})$).
(6) Show that if we consider $SL(2, \mathbb{R})$ as a subset of $SL(2, \mathbb{C})$ in the obvious way, then $SL(2, \mathbb{R})$ is a Lie subgroup of $SL(2, \mathbb{C})$ and $\wp(SL(2, \mathbb{R}))$ is a Lie subgroup of Mob. Show that if $T \in \wp(SL(2, \mathbb{R}))$, then $T$ maps the upper half-plane of $\mathbb{C}$ onto itself (bijectively).

(7) Show that for $v \in T_eG$, the field defined by $g \mapsto L^v(g) := TL_g \cdot v$ is automatically smooth.

(8) Determine explicitly the map $T_I\mathrm{inv} : T_IGL(n, \mathbb{R}) \to T_IGL(n, \mathbb{R})$, where $\mathrm{inv} : GL(n, \mathbb{R}) \to GL(n, \mathbb{R})$ is defined by $\mathrm{inv}(A) = A^{-1}$.

(9) Let $H$ be the set of real $3 \times 3$ matrices of the form
$$A = \begin{bmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{bmatrix}.$$
Find a global chart for $H$ and show that this and the usual matrix multiplication gives $H$ the structure of a Lie group.

(10) If $G$ is a connected Lie group and $h : G \to H$ is a Lie group homomorphism with discrete kernel $K$, then $K \subset Z(G)$, where $Z(G) = \{x \in G : xg = gx \text{ for all } g \in G\}$ is the center of $G$.

(11) Show that for a Lie group $G$, the conjugation map $C_g : G \to G$ defined by $x \mapsto gxg^{-1}$ is a Lie group isomorphism. Show that the map $C : G \to \mathrm{Diff}(G)$ is a group homomorphism. Note that we have not defined any Lie group structure on $\mathrm{Diff}(G)$.

(12) Consider the map $T_eC_g : T_eG \to T_eG$. Show that $g \mapsto T_eC_g$ is a Lie group homomorphism from $G$ into $GL(T_eG)$.

(13) Show that $SO(3)$ is the connected component of the identity in $O(3)$. Show that the special Lorentz group $SO(1,3)$ is not connected. Show that the first entry of elements of $SO(1,3)$ must have absolute value greater than or equal to 1. Define $SO(1,3)^{\uparrow}$ as the subset of $SO(1,3)$ consisting of matrices with positive first entry (which must be greater than or equal to 1). Show that $SO(1,3)^{\uparrow}$ is connected (and hence the connected component of the identity in $O(1,3)$).

(14) Let $A \in \mathfrak{gl}(V) = L(V, V)$ for some finite-dimensional vector space $V$. Show that if $A$ has eigenvalues $\{\lambda_i\}_{i=1,\ldots,n}$, then $\mathrm{ad}(A)$ has eigenvalues $\{\lambda_j - \lambda_k\}_{j,k=1,\ldots,n}$. Hint: Choose a basis for $V$ such that $A$ is represented by an upper triangular matrix. Show that this induces a basis for $\mathfrak{gl}(V)$ such that with the appropriate ordering, $\mathrm{ad}(A)$ is upper triangular.

(15) Fix a nonzero vector $w \in \mathbb{R}^3$ with length $\theta = \|w\|$. Let $L_w : \mathbb{R}^3 \to \mathbb{R}^3$ denote the linear transformation $v \mapsto w \times v$, where $\times$ is the vector cross product. Show that for any right handed orthonormal basis $\{e_1, e_2, e_3\}$
with $e_3$ parallel to $w$ we have
$$\exp(L_w)e_1 = \cos\theta\, e_1 + \sin\theta\, e_2, \qquad \exp(L_w)e_2 = -\sin\theta\, e_1 + \cos\theta\, e_2, \qquad \exp(L_w)e_3 = e_3.$$

(16) Let $L_w$ be as in the previous problem. Show that
$$\exp L_w = I + \frac{\sin\theta}{\theta} L_w + \frac{1 - \cos\theta}{\theta^2} L_w^2,$$
where $\frac{\sin\theta}{\theta}$ and $\frac{1 - \cos\theta}{\theta^2}$ are defined in the obvious way using power series.

(17) Let $A, B \in \mathfrak{gl}(V)$, where $V$ is a finite-dimensional vector space over the field $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$, and show that the following statements are equivalent:
(a) $[A, B] = 0$.
(b) $\exp sA$ and $\exp tB$ commute for all $s, t \in \mathbb{F}$.
(c) $\exp(sA + tB) = \exp(sA)\exp(tB)$ for all $s, t \in \mathbb{F}$.

(18) Let $V$ be a finite-dimensional normed space over the field $\mathbb{R}$ (resp. $\mathbb{C}$). Show that if $\sum_{n=0}^{\infty} a_n x^n$ is an absolutely convergent real (resp. complex) power series with radius of convergence $R$, then
$$\sum_{n=0}^{\infty} a_n A^n$$
converges (absolutely) in the normed space $\mathfrak{gl}(V)$ for $\|A\| < R$.

(19) Show that if $G$ is a connected Lie group, then $\pi_1(G)$ is abelian.

(20) Let $G$ be a Lie group and denote by $\mu : G \times G \to G$ the multiplication map.
(a) Identify $T(G \times G)$ with $TG \times TG$ in the usual way. Show that the tangent map $T\mu : TG \times TG \to TG$ defines a Lie group structure on $TG$ and show that if $(v_g, w_h) \in T_gG \times T_hG$, then
$$T\mu(v_g, w_h) = TR_h v_g + TL_g w_h,$$
where $R_h$ and $L_g$ are right and left multiplications respectively.
(b) Show that under the isomorphism $G \times \mathfrak{g}$ with $TG$, the Lie group multiplication takes the form
$$(g, A) \cdot (h, B) = (gh, \mathrm{Ad}_{h^{-1}} A + B).$$

(21) Let $M$ be a smooth manifold, let $G$ be a Lie group, and let $r : M \times G \to M$ be a right Lie group action. Recall from the previous problem that $TG$ is naturally a Lie group.
(a) Show that $Tr : TM \times TG \to TM$ defines a right Lie group action.
(b) For each $A \in \mathfrak{g} = T_eG$, define a vector field $\sigma(A)$ on $M$ by $\sigma(A)(p) := \left.\frac{d}{dt}\right|_{t=0} p\exp(At)$. Show that for $A, B \in \mathfrak{g}$, we have
$$\sigma([A, B]) = [\sigma(A), \sigma(B)].$$
(c) Show that if $A^L$ denotes the left invariant vector field generated by $A$, then for $v_p \in T_pM$,
$$Tr(v_p, A^L(g)) = TR_g(v_p) + \sigma(A)(pg).$$
Chapter 6

Fiber Bundles

The notion of a bundle is basic in both topology and geometry. The reader need not master everything in this chapter before going on to later chapters and should skip forward rather than become too bogged down. The definition and basic examples of a vector bundle are most important. In this chapter we also introduce the more advanced notion of a structure group for a bundle. There is more than one approach to structure groups. We start out with an approach that takes the notion of a $G$-atlas as basic. This is essentially the approach of Steenrod [St]. One may also approach $G$-bundle structures by first introducing the notion of a principal $G$-bundle (see [Hus]). We discuss principal bundles near the end of this chapter.

6.1. General Fiber Bundles

Definition 6.1. Let $F$, $M$, and $E$ be $C^r$ manifolds and let $\pi : E \to M$ be a $C^r$ map. The quadruple $(E, \pi, M, F)$ is called a (locally trivial) $C^r$ fiber bundle if for each point $p \in M$ there is an open set $U$ containing $p$ and a $C^r$ diffeomorphism $\phi : \pi^{-1}(U) \to U \times F$ such that the following diagram commutes:
$$\begin{array}{ccc} \pi^{-1}(U) & \xrightarrow{\ \phi\ } & U \times F \\ & {\scriptstyle \pi}\searrow \quad \swarrow{\scriptstyle \mathrm{pr}_1} & \\ & U & \end{array}$$
In differential geometry, attention is usually focused on $C^\infty$ fiber bundles (smooth fiber bundles), but the continuous case is also of interest. We will restrict ourselves to the smooth case, but the reader should keep in mind that most of the definitions and theorems have analogous $C^0$ versions where
Figure 6.1. Schematic for fiber bundle
the spaces are assumed merely to be sufficiently nice topological spaces and the maps are only assumed to be continuous. In this chapter, all maps and spaces will be smooth unless otherwise indicated.

Definition 6.2. If $(E, \pi, M, F)$ is a smooth fiber bundle, then $E$ is called the total space, $\pi$ is called the bundle projection, $M$ is called the base space and $F$ is called the typical fiber. For each $p \in M$, the set $E_p := \pi^{-1}(p)$ is called the fiber over $p$.

Because the quadruple notation is cumbersome, it is common to denote a fiber bundle by a single symbol. For example, we could write $\xi = (E, \pi, M, F)$. In the literature, it is common to see $E$ refer both to the total space and to the fiber bundle itself (an abuse of notation). The map $\pi$ is also a common way to reference the fiber bundle.

Example 6.3. For smooth manifolds $M$ and $F$, we have the projection $\mathrm{pr}_1 : M \times F \to M$. Then, $(M \times F, \mathrm{pr}_1, M, F)$ is a fiber bundle called a product bundle (or trivial bundle).

Exercise 6.4. Show that if $\xi = (E, \pi, M, F)$ is a (smooth) fiber bundle, then $\pi : E \to M$ is a submersion and each fiber $\pi^{-1}(p)$ is a regular submanifold which is diffeomorphic to $F$. Show that if both $F$ and $M$ are connected, then $E$ is connected.

There are various categories of bundles with corresponding notions of morphism. We give two very general definitions and modify them as needed.

Definition 6.5 (Bundle morphism (type I)). Let $\xi_1 = (E_1, \pi_1, M, F_1)$ and $\xi_2 = (E_2, \pi_2, M, F_2)$ be smooth fiber bundles with the same base space $M$. A (type I) bundle morphism over $M$ from $\xi_1$ to $\xi_2$ is a smooth map
$h : E_1 \to E_2$ such that the following diagram commutes:
$$\begin{array}{ccc} E_1 & \xrightarrow{\ h\ } & E_2 \\ & {\scriptstyle \pi_1}\searrow \quad \swarrow{\scriptstyle \pi_2} & \\ & M & \end{array}$$
This type of morphism is also called an $M$-morphism or a morphism over $M$. If $h$ is also a diffeomorphism, then $h$ is called a bundle isomorphism over $M$ and in this case the bundles are said to be isomorphic (over $M$) or equivalent. A bundle isomorphism from a bundle to itself is called a bundle automorphism.

Definition 6.6 (Bundle morphism (type II)). Let $\xi_1 = (E_1, \pi_1, M_1, F_1)$ and $\xi_2 = (E_2, \pi_2, M_2, F_2)$ be smooth fiber bundles. A (type II) bundle morphism from $\xi_1$ to $\xi_2$ is a pair of smooth maps $\bar{f} : E_1 \to E_2$ and $f : M_1 \to M_2$ such that the following diagram commutes:
$$\begin{array}{ccc} E_1 & \xrightarrow{\ \bar{f}\ } & E_2 \\ {\scriptstyle \pi_1}\downarrow & & \downarrow{\scriptstyle \pi_2} \\ M_1 & \xrightarrow{\ f\ } & M_2 \end{array}$$
We write $(\bar{f}, f) : \xi_1 \to \xi_2$ and say that $\bar{f}$ is a bundle morphism along $f$. If both $\bar{f}$ and $f$ are diffeomorphisms, then we call $(\bar{f}, f)$ a bundle isomorphism. In this case, we say that the bundles are isomorphic over $f$.

Note that as a fiber preserving map, $\bar{f}$ determines $f$ and so it is also proper to refer to $\bar{f}$ as the bundle morphism and we sometimes say that $\bar{f}$ is a bundle morphism along (or over) $f$.

Warning: The definitions of bundle morphism above are quite relaxed. There are a variety of definitions in the literature that require more than the definitions above, especially when structure groups (discussed below) are emphasized.

Definition 6.7. A (global) smooth section of a fiber bundle $\xi = (E, \pi, M, F)$ is a smooth map $\sigma : M \to E$ such that $\pi \circ \sigma = \mathrm{id}_M$ (i.e., $\sigma(p) \in E_p$). A local smooth section over an open set $U$ is a smooth map $\sigma : U \to E$ such that $\pi \circ \sigma = \mathrm{id}_U$. The set of smooth sections of $\xi$ is denoted by $\Gamma(\xi)$ or sometimes by $\Gamma(E)$ or $\Gamma(\pi)$.

A very important point is that a fiber bundle may not have any global smooth sections. If two bundles are equivalent, via a bundle isomorphism $h$ (of type I), then there is a natural bijection between the spaces of sections given by $\sigma \mapsto h \circ \sigma$. This means that one quick way to conclude that two
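bundles are not equivalent is by showing that one bundle has global sections while the other does not.

A bundle chart essentially gives a local type I bundle isomorphism, but it is sometimes more natural to consider bundle charts which are local type II isomorphisms. We will call these type II bundle charts. These are of the form $(\phi, x)$, where $\phi : \pi^{-1}U \to V \times F$ and $x : U \to V$ are smooth diffeomorphisms such that the following diagram commutes:
$$\begin{array}{ccc} \pi^{-1}U & \xrightarrow{\ \phi\ } & V \times F \\ \downarrow & & \downarrow \\ U & \xrightarrow{\ x\ } & V \end{array}$$
Usually, the pair $(U, x)$ is a chart on the base manifold. The two types of bundle charts are equivalent since one may always compose a type II chart with $(x^{-1}, \mathrm{id}_F)$ to obtain a type I bundle chart. More restricted notions of bundle morphism can be obtained by making requirements such as that the induced maps on fibers $h|_{\pi_1^{-1}(p)} : \pi_1^{-1}(p) \to \pi_2^{-1}(p)$ are $C^\infty$ diffeomorphisms.

The maps $\phi : \pi^{-1}(U) \to U \times F$ occurring in the definition of a fiber bundle are said to be local trivializations of the bundle. It is easy to see that such a local trivialization must be a map of the form $\phi = (\pi|_{\pi^{-1}(U)}, \Phi)$, where $\Phi : \pi^{-1}(U) \to F$ is a smooth map whose restriction $\Phi|_{E_p} : E_p \to F$ to each fiber is a diffeomorphism; we write $\phi = (\pi, \Phi)$. A pair $(U, \phi)$, where $\phi$ is a local trivialization over $U \subset M$, is called a bundle chart. (Clearly, a local trivialization and a bundle chart are essentially the same thing.) A family $\{(U_\alpha, \phi_\alpha)\}_{\alpha \in A}$ of bundle charts such that $\{U_\alpha\}_{\alpha \in A}$ is a cover of $M$ is said to be a bundle atlas. Given two such bundle charts $(U_\alpha, \phi_\alpha)$ and $(U_\beta, \phi_\beta)$, we have
$$\phi_\alpha = (\pi, \Phi_\alpha) : \pi^{-1}(U_\alpha) \to U_\alpha \times F$$
and similarly for $\phi_\beta = (\pi, \Phi_\beta)$. Consider the restriction
$$\phi_\alpha \circ \phi_\beta^{-1} : (U_\alpha \cap U_\beta) \times F \to (U_\alpha \cap U_\beta) \times F.$$
Since $\Phi_\alpha|_{E_p}$ and $\Phi_\beta|_{E_p}$ are diffeomorphisms of $E_p$ onto $F$ for each $p \in U_\alpha \cap U_\beta$, the composition $\Phi_{\alpha\beta}(p) := \Phi_\alpha|_{E_p} \circ (\Phi_\beta|_{E_p})^{-1}$ is a diffeomorphism of $F$.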
It follows that $\phi_\alpha \circ \phi_\beta^{-1}(p, y) = (p, \Phi_{\alpha\beta}(p)(y))$. The functions $\Phi_{\alpha\beta} : U_\alpha \cap U_\beta \to \mathrm{Diff}(F)$ are called transition maps or transition functions. Given a bundle atlas, the corresponding transition functions clearly satisfy the following "cocycle conditions":
$$\Phi_{\alpha\alpha}(p) = \mathrm{id} \quad \text{for } p \in U_\alpha,$$
$$\Phi_{\alpha\beta}(p) = (\Phi_{\beta\alpha}(p))^{-1} \quad \text{for } p \in U_\alpha \cap U_\beta,$$
$$\Phi_{\alpha\beta}(p) \circ \Phi_{\beta\gamma}(p) \circ \Phi_{\gamma\alpha}(p) = \mathrm{id} \quad \text{for } p \in U_\alpha \cap U_\beta \cap U_\gamma,$$
for all $\alpha, \beta, \gamma$.

Notation 6.8. We will often denote $\Phi_{\alpha\beta}(p)(y)$ by $\Phi_{\alpha\beta}|_p(y)$, which is, in many contexts, more transparent.

$\mathrm{Diff}(F)$ is a group, and we have a group action $\mathrm{Diff}(F) \times F \to F$ given by $(\psi, y) \mapsto \psi(y)$. However, $\mathrm{Diff}(F)$ is too big for many purposes, and we have certainly not attempted to give $\mathrm{Diff}(F)$ a Lie group structure. Even if we were to somehow extend the notion of Lie group sufficiently to include $\mathrm{Diff}(F)$, it would be infinite-dimensional and thereby take us out of the circle of ideas we have been developing. Because of this, the transition functions $\Phi_{\alpha\beta}$ above, which could be called "raw transition functions", might not be appropriate for our needs. We remedy this below by bringing Lie groups into the picture. First we give a simple example of a nontrivial bundle.

Example 6.9. The circle $S^1$ can be considered as a quotient $\mathbb{R}/\!\sim$ where $x$ is equivalent to $y$ if and only if $x - y$ is an integer multiple of $2\pi$. For this example, we put an equivalence relation on $\mathbb{R} \times (-1, 1)$ according to the prescription $(x, t) \sim (x + 2\pi n, (-1)^n t)$ for any integer $n$. The quotient $(\mathbb{R} \times (-1, 1))/\!\sim$ can easily be seen to be a smooth manifold and is none other than the familiar Möbius band which we denote by $MB$. Define a map $\pi : MB \to \mathbb{R}/\!\sim\; = S^1$ by $\pi([x, t]) = [x]$. We show that this is a fiber bundle by exhibiting an atlas consisting of three bundle charts. We call it the Möbius band bundle. We use three bundle charts instead of two in order that the overlaps be connected sets. Let $U_1 = \{[x] \in \mathbb{R}/\!\sim\; : -2\pi/3 < x < 2\pi/3\}$ and $U_2 = \{[x] \in \mathbb{R}/\!\sim\; : 0 < x < 4\pi/3\}$ and $U_3 = \{[x] \in \mathbb{R}/\!\sim\; : 2\pi/3 < x < 2\pi\}$. Then $U_1 \cup U_2 \cup U_3 = \mathbb{R}/\!\sim\; = S^1$. For $i = 2, 3$ define $\phi_i : \pi^{-1}(U_i) \to U_i \times (-1, 1)$ by
$$\phi_i([x, t]) = ([x], t),$$
where $(x, t)$ is the unique representative of $[x, t]$ in the set $(0, 2\pi) \times (-1, 1)$. For $\phi_1 : \pi^{-1}(U_1) \to U_1 \times (-1, 1)$, we define $\phi_1([x, t]) = ([x], t)$, where $(x, t)$ is the unique representative of $[x, t]$ in the set $(-2\pi/3, 2\pi/3) \times (-1, 1)$. One can check that $\phi_2 \circ \phi_3^{-1} = \phi_3 \circ \phi_2^{-1} = \mathrm{id}$ on the overlap $\pi^{-1}(U_2) \cap \pi^{-1}(U_3)$. Now consider the overlap $\pi^{-1}(U_1) \cap \pi^{-1}(U_2)$. If $[x, t] \in \pi^{-1}(U_1) \cap \pi^{-1}(U_2)$, then
Figure 6.2. Möbius band
[x, tj is uniquely represented by some (x, t) E (0, 27r/3) x (-1,1) and in vieW' of the definitions we see that ¢20¢3l = id also. Finally we consider ¢lO¢"3 1• If [x, tj E 7r- l (Ul) n 7r- 1 (U3), then it has a unique representative (x, t) in (47r /3, 27r) x (-1,1) and then ¢;l([xj, t) = [x, tj. For ¢1, we need to represent [x, tj properly. We use the fact that [x, tj = [x - 27r, -tj and (x - 27r, -t) E (-27r /3,0) x (-1,1) so that ¢l([X - 27r, -t]) = ([x - 27rl, -t) = ([x], -t). In short, we have ¢l 0 ¢;l([x, t]) = ([x], -t). From these considerations and the fact that in general ¢a 0 ¢"'il(p, y) = (p, ~a{3 (P) (y)) we see that
Ul n U2, ~23(P) = id(-l,l) E Diff(-1, 1) for p E U2 n U3, ~13(P) = -id(-l,l) E Diff(-1, 1) for p E Ul n U3.
~12(P) = id(-l,l) E Diff( -1,1) for p E
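The computation of the three raw transition functions in Example 6.9 can be replayed numerically. The following is only a sketch under stated assumptions: base points are represented by an angle `p`, and the helper names `lift`, `chart_fiber_coordinate` and `raw_transition` are hypothetical, introduced here for illustration. The code does not check that `p` actually lies in the relevant overlap; that is left to the caller.

```python
import math

TWO_PI = 2 * math.pi

# Chart windows from Example 6.9: phi_1 uses representatives with
# x in (-2*pi/3, 2*pi/3); phi_2 and phi_3 use representatives with x in (0, 2*pi).
# Only the left endpoint of each window is needed below.
WINDOW_START = {1: -2 * math.pi / 3, 2: 0.0, 3: 0.0}


def lift(p, start):
    """Return the real number congruent to p mod 2*pi in [start, start + 2*pi)."""
    return p - math.floor((p - start) / TWO_PI) * TWO_PI


def chart_fiber_coordinate(alpha, x, t):
    """Fiber component of phi_alpha([x, t]).

    Replaces (x, t) by the equivalent pair (x - 2*pi*n, (-1)**n * t) whose
    first entry lies in the window of chart alpha, and returns the second entry.
    """
    n = math.floor((x - WINDOW_START[alpha]) / TWO_PI)
    return t if n % 2 == 0 else -t


def raw_transition(alpha, beta, p, y):
    """Phi_{alpha beta}(p)(y): fiber part of phi_alpha o phi_beta^{-1}(p, y)."""
    x_beta = lift(p, WINDOW_START[beta])  # phi_beta^{-1}(p, y) = [x_beta, y]
    return chart_fiber_coordinate(alpha, x_beta, y)
```

Evaluating `raw_transition(1, 3, p, y)` at an angle `p` between $4\pi/3$ and $2\pi$ flips the sign of `y`, while the other two overlaps leave `y` unchanged, in agreement with the table of $\Phi_{12}$, $\Phi_{23}$, $\Phi_{13}$ above.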
A "twist" occurs on the overlap $\pi^{-1}(U_1) \cap \pi^{-1}(U_3)$. There is no way to construct an atlas for this bundle without having such a twist on at least one of the overlaps. Notice that if we define an action $\lambda$ of $\mathbb{Z}_2 = \{1, -1\}$ on the interval $(-1, 1)$ by $\lambda(g, x) := gx$, then we can describe the transition functions in the last example by

$$\Phi_{\alpha\beta}(p)(x) = \lambda(g_{\alpha\beta}(p), x),$$

where $g_{\alpha\beta} : U_\alpha \cap U_\beta \to \mathbb{Z}_2$ is given by

$$g_{12} = 1 \text{ on } U_1 \cap U_2, \qquad g_{23} = 1 \text{ on } U_2 \cap U_3, \qquad g_{13} = -1 \text{ on } U_1 \cap U_3,$$

and in this case the $g_{\alpha\beta}$ satisfy a cocycle condition like the $\Phi_{\alpha\beta}$. This is convenient since we understand $\mathbb{Z}_2$ very well. It is a zero-dimensional Lie group. Inspired by this, we seek to put Lie groups into the formalism. This
will alleviate our concerns about the group $\mathrm{Diff}(F)$ mentioned above. We are also led to the theory of $G$-bundles that involves the group in subtle ways.

Definition 6.10. Let $\{U_\alpha\}$ be an indexed open cover of a smooth manifold $M$ and let $G$ be a Lie group. A $G$-cocycle on $\{U_\alpha\}$ is the assignment of a smooth map $g_{\alpha\beta} : U_\alpha \cap U_\beta \to G$ to every nonempty intersection $U_\alpha \cap U_\beta$ such that the cocycle conditions hold:

$$\begin{aligned}
g_{\alpha\alpha}(p) &= e && \text{for } p \in U_\alpha, \\
g_{\alpha\beta}(p) &= (g_{\beta\alpha}(p))^{-1} && \text{for } p \in U_\alpha \cap U_\beta, \\
g_{\alpha\beta}(p)\, g_{\beta\gamma}(p)\, g_{\gamma\alpha}(p) &= e && \text{for } p \in U_\alpha \cap U_\beta \cap U_\gamma,
\end{aligned}$$

where $e$ is the identity in $G$. The family of maps $\{g_{\alpha\beta}\}$ forms a cocycle.

The idea that we wish to pursue is that of representing the action of the raw transition maps by using Lie group actions. There is a subtle point here that the reader should not miss. Consider the following fact: If $\lambda : G \times F \to F$ is a group action, then by letting $K = \{g : \lambda_g(y) = y \text{ for all } y \in F\}$ (the kernel of the action) we obtain an effective action of $G/K$ on $F$. Things are not so simple on the global level of bundles, as becomes clear when dealing with the notion of spin structure (see [L-M]). The best way to explain what is at stake is by the use of the notion of a principal bundle, which we introduce later. Even before we get to that point, we will mention some things that will provide some idea as to why we need to be careful about ineffective actions. We start out assuming that the action is effective (see Definition 1.100):

Definition 6.11. Let $\xi = (E, \pi, M, F)$ be a fiber bundle and $G$ a Lie group. Suppose that we have an effective left action $\lambda : G \times F \to F$. Let $\{(\phi_\alpha, U_\alpha)\}$ be a bundle atlas for $\xi$. Suppose that for every nonempty intersection $U_\alpha \cap U_\beta$ there exists a smooth map $g_{\alpha\beta} : U_\alpha \cap U_\beta \to G$ such that $\lambda(g_{\alpha\beta}(p), y) = \Phi_{\alpha\beta}|_p(y)$ for all $p \in U_\alpha \cap U_\beta$ and $y \in F$. Then the atlas $\{(\phi_\alpha, U_\alpha)\}$ is called a $(G, \lambda)$-bundle atlas. If the action $\lambda$ is understood or standard in some way, one also speaks of a $G$-bundle atlas.

Because the action $\lambda$ in the above definition is assumed effective, it follows that the family $\{g_{\alpha\beta}\}$ satisfies the cocycle conditions of Definition 6.10. Notice that if we had not assumed the action to be effective, then the maps $g_{\alpha\beta}$ would not be unique and might not satisfy a cocycle condition (although they would do so modulo the kernel of the action). Thus if we do not assume effectiveness, then we have to make the cocycle $\{g_{\alpha\beta}\}$ part of the definition. We return to this below.
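To make the warning about ineffective actions concrete, here is a small sketch. The group, the action and the chosen values are assumptions made purely for illustration and do not come from the text: $\mathbb{Z}_4$ (written additively as `0, 1, 2, 3`) acts on the fiber $(-1, 1)$ through its quotient $\mathbb{Z}_2$, so the element `2` acts trivially. At a single point of a hypothetical triple overlap, the lifts of the raw transition maps to the group then agree with the raw cocycle relation but fail the group-level cocycle condition by an element of the kernel.

```python
# Hypothetical illustration (not from the text): Z_4 = {0, 1, 2, 3} under
# addition mod 4 acts on the fiber (-1, 1) by lam(g, y) = (-1)**g * y.
# The action is ineffective because g = 2 acts as the identity.

def lam(g, y):
    """Action of g in Z_4 on y in (-1, 1)."""
    return (-1) ** (g % 4) * y


SAMPLE_POINTS = [-0.5, 0.25, 0.75]


def same_diffeo(g, h):
    """True if g and h act identically on the sample points."""
    return all(lam(g, y) == lam(h, y) for y in SAMPLE_POINTS)


# Kernel of the action: the elements acting like the identity element 0.
kernel = [g for g in range(4) if same_diffeo(g, 0)]

# At one point p of a triple overlap, suppose the raw transition maps are
# Phi_12 = -id, Phi_23 = -id, Phi_13 = +id, so that Phi_12 o Phi_23 = Phi_13.
# One admissible (but non-unique) choice of group elements representing them:
g12, g23, g13 = 1, 1, 0

# The raw maps satisfy the cocycle relation ...
raw_cocycle_holds = all(
    lam(g12, lam(g23, y)) == lam(g13, y) for y in SAMPLE_POINTS
)

# ... but the chosen group elements satisfy it only modulo the kernel:
# g12 + g23 - g13 should be 0 in Z_4 for a genuine G-cocycle.
defect = (g12 + g23 - g13) % 4
```

Replacing `g13 = 0` by `g13 = 2` represents the same diffeomorphism and removes the defect, which is why for ineffective actions the cocycle has to be specified as part of the data rather than read off from the raw transition maps.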
The basic definition in the case of an effective action can be formulated as follows:
Definition 6.12. Let $\xi = (E, \pi, M, F)$ be a fiber bundle and $G$ a Lie group. Suppose that we have an effective left action $\lambda : G \times F \to F$. Two $(G, \lambda)$-bundle atlases for $\xi$, say $\{(\phi_\alpha, U_\alpha)\}$ and $\{(\phi'_\beta, U'_\beta)\}$, are strictly equivalent if the union of the atlases is also a $(G, \lambda)$-bundle atlas. A strict equivalence class of atlases is referred to as an effective $(G, \lambda)$-bundle structure on $\xi$, and we say that $\xi$ together with this $(G, \lambda)$-bundle structure is an effective $(G, \lambda)$-bundle. Again, if the action is standard or understood, then it is common to speak of a $G$-bundle structure and refer to $\xi$ as a $G$-bundle.
Actually, there is a tiny point to be made. To keep things neat we should always arrange that the indexing map $\alpha \mapsto (\phi_\alpha, U_\alpha)$ for any atlas is injective. Thus when taking the union of two atlases per the definition of strict equivalence, one may need to reindex so that the $g_{\alpha\beta}$ are notationally unambiguous. For example, if we take the union of an atlas $\{(\phi_1, U_1), (\phi_2, U_2)\}$ with an atlas $\{(\psi_1, V_1), (\psi_2, V_2)\}$, then how should the transition maps for the bigger atlas be denoted? What would $g_{11}$ mean? Sometimes the traditional indexing scheme is wisely dropped. Instead one uses the set of trivializing maps itself as the index set, so that a chart is written as $(\phi, U_\phi)$ or $(\psi, U_\psi)$, and then one denotes transition maps by $g_{\phi\psi}$, etc. The notion of maximal $(G, \lambda)$-bundle atlas is defined in the obvious way by direct analogy with the notion of maximal atlas for a smooth structure.

Our main emphasis will be on the effective $(G, \lambda)$-bundles, but as mentioned above, if we wish to allow ineffective actions, then the notion of atlas should include the cocycle as part of the data. But even then we have to be careful. Indeed, for an ineffective action it is conceivable that there could be a different cocycle $\{g'_{\alpha\beta}\}$ such that $\lambda(g'_{\alpha\beta}(p), y) = \Phi_{\alpha\beta}|_p(y)$. For two such sets of data $\{(\phi_\alpha, U_\alpha), (g_{\alpha\beta}), \lambda\}$ and $\{(\phi'_j, U'_j), (g'_{ij}), \lambda\}$ to define the same $(G, \lambda)$-bundle, it must be the case that both of the cocycles are contained in a larger cocycle that gives the transitions for the atlas obtained as the union of the collection of charts from $\{(\phi_\alpha, U_\alpha), (g_{\alpha\beta}), \lambda\}$ and $\{(\phi'_j, U'_j), (g'_{ij}), \lambda\}$. We handle ineffective actions this way so as to keep aligned with the notion of associated bundle introduced later. If there is no chance of confusion, we will drop the adjective effective.

The reader is warned that some standard expositions on fiber bundles allow ineffective actions right from the start, but in some cases assertions are made that would only be true in the effective case!
It is interesting to note that in his famous book on the subject [St], Norman Steenrod restricts himself
to effective actions, although he announces this restriction in one easily overlooked sentence early in the book. Notice that an alternative way to say that $\lambda(g_{\alpha\beta}(p), y) = \Phi_{\alpha\beta}|_p(y)$ is

$$\phi_\alpha \circ \phi_\beta^{-1}(p, y) = (p, g_{\alpha\beta}(p) \cdot y).$$

(Actually, we shall at first avoid the notation $g \cdot y$ for the action of a group element $g$ on $y$ since we do not want the beginner to forget the role of the choice of action.) The maps $g_{\alpha\beta}$ are also called transition functions for the $(G, \lambda)$-bundle atlas. If $G$ is literally a subgroup of $\mathrm{Diff}(F)$ and the action is simply (
Definition 6.13. Let $\xi_1 = (E_1, \pi_1, M_1, F)$ be a $(G, \lambda)$-bundle with its $(G, \lambda)$-bundle structure determined by the strict equivalence class of the $(G, \lambda)$-atlas $\{(\phi_\alpha, U_\alpha)\}_{\alpha \in A}$. Let $\xi_2 = (E_2, \pi_2, M_2, F)$ be a $(G, \lambda)$-bundle with its $(G, \lambda)$-bundle structure determined by the strict equivalence class of the $(G, \lambda)$-atlas $\{(\psi_\beta, V_\beta)\}_{\beta \in B}$. Then a type II bundle morphism $(\hat h, h) : \xi_1 \to \xi_2$ is called a $(G, \lambda)$-bundle morphism along $h$ if

(i) $\hat h$ carries each fiber of $E_1$ diffeomorphically onto the corresponding fiber of $E_2$;

(ii) whenever $U_\alpha \cap h^{-1}(V_\beta)$ is not empty, there is a smooth map $h_{\alpha\beta} : U_\alpha \cap h^{-1}(V_\beta) \to G$ such that for each $p \in U_\alpha \cap h^{-1}(V_\beta)$ we have

$$\left(\Psi_\beta \circ \hat h \circ (\Phi_\alpha|_{E_p})^{-1}\right)(y) = \lambda(h_{\alpha\beta}(p), y) \quad \text{for all } y \in F,$$

where, as usual, $\phi_\alpha = (\pi_1, \Phi_\alpha)$ and $\psi_\beta = (\pi_2, \Psi_\beta)$.

If $M_1 = M_2$ and $h = \mathrm{id}_M$, then we call $\hat h$ a $(G, \lambda)$-bundle equivalence over $M$. (In this case, $\hat h$ is a diffeomorphism.)

Condition (ii) simply says that $\hat h$ must be given by the action on each fiber when viewed in $(G, \lambda)$-charts. For this definition to be good it must be shown to be well-defined. That is, one must show that condition (ii) is independent of the choice of representatives $\{(\phi_\alpha, U_\alpha)\}_{\alpha \in A}$ and $\{(\psi_\beta, V_\beta)\}_{\beta \in B}$ of the strict equivalence classes of atlases that define the $(G, \lambda)$-bundle structures. We leave this as an exercise. Later we will discover another, perhaps better, way to talk about equivalence of $(G, \lambda)$-bundles.

The product bundle $\mathrm{pr}_1 : M \times F \to M$ has a trivial $(G, \lambda)$-bundle structure for any $\lambda$ acting on $F$. Indeed, we just take the structure given
by the single bundle chart $(\mathrm{id}_{M \times F}, M)$, where we have the resulting cocycle $\{g_{11}\}$ with $g_{11}(x) := e$ for all $x$. In fact, if $\{U_\alpha\}_{\alpha \in A}$ is any open cover of $M$, then $\{(\mathrm{id}_{U_\alpha \times F}, U_\alpha)\}_{\alpha \in A}$ is a $(G, \lambda)$-atlas strictly equivalent to the atlas $\{(\mathrm{id}_{M \times F}, M)\}$ and so defines the same trivial $(G, \lambda)$-bundle. By a trivial $(G, \lambda)$-bundle over $M$ we will mean either this product $(G, \lambda)$-bundle or one that is $(G, \lambda)$-equivalent to it.

Remark 6.14. The special case of a $(G, \lambda)$-bundle equivalence in the case that $E_1 = E_2$ and $\pi_1 = \pi_2$ is interesting but easily misunderstood. Suppose that {(
Let $\xi = (E, \pi, M, F)$ have a $(G, \lambda)$-bundle structure given by a maximal $(G, \lambda)$-bundle atlas {(
Let $U_1 = \{e^{i\theta} \in S^1 : 0 < \theta < 2\pi\}$ and $U_2 = \{e^{i\theta} \in S^1 : -\pi < \theta < \pi\}$. Then $U_1 \cap U_2$ is a disjoint union of open sets $V$ and $W$ where $i \in V$ and $-i \in W$. Now we define maps with values in the multiplicative group of two elements $\{1, -1\} = \mathbb{Z}_2$. Let $g_{11}(x) = 1$ for all $x \in U_1$, $g_{22}(x) = 1$ for all $x \in U_2$, and then let $g_{12}$ and $g_{21}$ be defined on $U_1 \cap U_2$ by

$$g_{12}(x) := \begin{cases} 1 & \text{on } V, \\ -1 & \text{on } W, \end{cases}$$

and $g_{21} := g_{12}^{-1}$. Let $\mathbb{Z}_2$ act on the symmetric interval $(-1, 1) \subset \mathbb{R}$ by multiplication (so that $-1 \cdot x := -x$). Using the cocycle $\{g_{11}, g_{22}, g_{12}, g_{21}\}$ and this action, Theorem 6.15 above gives a $\mathbb{Z}_2$-bundle which can be shown to be equivalent to the Möbius band bundle described in Example 6.9.

Example 6.17. Use the same cocycle on $S^1$ as in the last example but with typical fiber $S^1$ and action $\mathbb{Z}_2 \times S^1 \to S^1$ given by letting $1$ act as the identity and $-1$ act by a rotation of $S^1$ by $\pi$, that is, $-1 \cdot e^{i\theta} := e^{i(\theta + \pi)} = -e^{i\theta}$. Then the bundle obtained is a $\mathbb{Z}_2$-bundle, sometimes called the twisted torus. It is of the utmost importance to realize that there is a fiber preserving diffeomorphism between the twisted torus and the trivial bundle $S^1 \times S^1 \to S^1$, and yet there is no $\mathbb{Z}_2$-bundle equivalence between the twisted torus and the trivial $\mathbb{Z}_2$-bundle $S^1 \times S^1 \to S^1$. Thus the twisted torus is trivial as a general bundle but not as a $\mathbb{Z}_2$-bundle (see Problem 5). This shows how much the involvement of the group matters.

Notice that the previous theorem is true even without the assumption that the action is effective, and we still end up with a genuine (ineffective) $(G, \lambda)$-bundle since there is no problem about the cocycle existing. A quotient by the kernel $K$ of the action would give an effective structure group action. Thinking of the action as a homomorphism $G \to \mathrm{Diff}(F)$, we have an induced homomorphism $G/K \to \mathrm{Diff}(F)$ such that the following diagram commutes:

$$\begin{array}{ccc}
G & \longrightarrow & \mathrm{Diff}(F) \\
\downarrow & \nearrow & \\
G/K & &
\end{array}$$
Definition 6.18. Let $\xi = (E, \pi, M, F)$ be a smooth fiber bundle and $f : N \to M$ a smooth map. The pull-back bundle $f^*\xi = (f^*E, \pi_1, N, F)$ (or induced bundle) is defined as follows: The total space $f^*E$ is the set

$$f^*E := \{(q, \epsilon) \in N \times E : f(q) = \pi(\epsilon)\}.$$

Then we define $\pi_1$ as the restriction to $f^*E$ of the projection $\mathrm{pr}_1 : N \times E \to N$.
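Notice that the second factor projection map $\mathrm{pr}_2 : N \times E \to E$ restricts to a map $\tilde f : f^*E \to E$ which is a bundle morphism over the map $f$:

$$\begin{array}{ccc}
f^*E & \xrightarrow{\ \tilde f\ } & E \\
\downarrow & & \downarrow \\
N & \xrightarrow{\ f\ } & M
\end{array}$$

The map $\tilde f$ restricts to a diffeomorphism on each fiber. If $\phi = (\pi, \Phi)$ is a trivialization of the bundle $\xi$ over the open set $U$, then $(\pi_1, \tilde\Phi) := (\pi_1, \Phi \circ \tilde f)$ is a trivialization of $f^*\xi$ over the open set $f^{-1}(U)$. Thus a bundle atlas on $\xi$ induces a bundle atlas on $f^*\xi$. If $\{\Phi_{\alpha\beta}\}$ are (raw) transition maps for $\xi$ corresponding to a bundle atlas $\{(\pi, \Phi_\alpha)\}_{\alpha \in A}$, then $\{\Phi_{\alpha\beta} \circ f\}$ are transition maps for $f^*\xi$ corresponding to the atlas $\{(\pi_1, \Phi_\alpha \circ \tilde f)\}_{\alpha \in A}$. In fact,

$$\tilde\Phi_\alpha\big|_{\{q\} \times E_{f(q)}}(q, \epsilon) = \Phi_\alpha \circ \tilde f\big|_{\{q\} \times E_{f(q)}}(q, \epsilon) = \Phi_\alpha\big|_{E_{f(q)}}(\epsilon),$$

so the inverse is

$$\tilde\Phi_\alpha\big|_{\{q\} \times E_{f(q)}}^{-1} : y \mapsto \left(q, \Phi_\alpha\big|_{E_{f(q)}}^{-1}(y)\right).$$

Thus $\tilde\Phi_\alpha\big|_{\{q\} \times E_{f(q)}} \circ \tilde\Phi_\beta\big|_{\{q\} \times E_{f(q)}}^{-1}$ is given by

$$y \mapsto \left(q, \Phi_\beta\big|_{E_{f(q)}}^{-1}(y)\right) \mapsto \Phi_\alpha\big|_{E_{f(q)}} \circ \Phi_\beta\big|_{E_{f(q)}}^{-1}(y),$$

and so

$$\tilde\Phi_{\alpha\beta}(q) = \tilde\Phi_\alpha\big|_{\{q\} \times E_{f(q)}} \circ \tilde\Phi_\beta\big|_{\{q\} \times E_{f(q)}}^{-1} = \Phi_\alpha\big|_{E_{f(q)}} \circ \Phi_\beta\big|_{E_{f(q)}}^{-1} = \Phi_{\alpha\beta} \circ f(q).$$

Furthermore, if $\{(\pi, \Phi_\alpha)\}_{\alpha \in A}$ is a $(G, \lambda)$-atlas for $\xi$, then since $\Phi_{\alpha\beta}|_p(y) = \lambda(g_{\alpha\beta}(p), y)$, we have $\tilde\Phi_{\alpha\beta}|_q(y) = \Phi_{\alpha\beta}(f(q))(y) = \lambda(g_{\alpha\beta}(f(q)), y)$. We see that $f^*\xi$ has a $(G, \lambda)$-bundle structure with cocycle $\{g_{\alpha\beta} \circ f\}$. Note, however, that because of the composition with $f$, this structure may be reducible to a smaller group.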
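The rule that the pull-back cocycle is $g_{\alpha\beta} \circ f$ can be sketched in a few lines. This is an illustration under assumptions not made in the text: the cocycle is the $\mathbb{Z}_2$-valued one of Example 6.9 with base points given by an angle, and the map `f` (angle doubling on the circle) is a hypothetical choice of smooth map $N \to M$ with $N = M = S^1$.

```python
import math

TWO_PI = 2 * math.pi

# Overlaps of the cover in Example 6.9, as open angle intervals, together
# with the constant value of the Z_2-valued cocycle on each of them.
OVERLAPS = {
    (1, 2): (0.0, 2 * math.pi / 3, 1),
    (2, 3): (2 * math.pi / 3, 4 * math.pi / 3, 1),
    (1, 3): (4 * math.pi / 3, TWO_PI, -1),
}


def g(alpha, beta, p):
    """Cocycle g_{alpha beta} of Example 6.9 at the base point with angle p."""
    lo, hi, value = OVERLAPS[(alpha, beta)]
    p = p % TWO_PI
    if not lo < p < hi:
        raise ValueError("p does not lie in the overlap of the two chart domains")
    return value


def f(q):
    """Hypothetical smooth map N -> M: angle doubling on the circle."""
    return (2 * q) % TWO_PI


def pulled_back_g(alpha, beta, q):
    """Cocycle of the pull-back bundle: g_{alpha beta} composed with f."""
    return g(alpha, beta, f(q))


def pulled_back_transition(alpha, beta, q, y):
    """Raw transition map of the pull-back, lambda(g_{alpha beta}(f(q)), y)."""
    return pulled_back_g(alpha, beta, q) * y
```

The pulled-back maps are defined on the sets $f^{-1}(U_\alpha) \cap f^{-1}(U_\beta)$, so each overlap of the original cover has two preimage arcs under the doubling map, and the cocycle takes the same value on both.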
Definition 6.19. Let $\xi = (E, \pi, M, F)$ be a smooth fiber bundle and let $f : N \to M$ be a smooth map. A section of $\xi$ along $f$ is a map $\sigma : N \to E$ such that $\pi \circ \sigma = f$. The set of sections along $f$ will be denoted $\Gamma_f(\xi)$ or $\Gamma_f(E)$.

If $\sigma : N \to E$ is a section of $\xi$ along $f$, then the map $\sigma_1 : N \to f^*E$ given by $p \mapsto (p, \sigma(p))$ is a section of the pull-back bundle $f^*\xi$. It is not hard to show that all sections of $f^*\xi$ have this form.

Proposition 6.20. Let $\xi = (E, \pi, M, F)$ be a smooth fiber bundle and $f : N \to M$ a smooth map. Then there is a natural bijection between $\Gamma_f(\xi)$ and $\Gamma(f^*\xi)$.
Proof. Let $s \in \Gamma(f^*\xi)$. For each $p \in N$, we have $s(p) \in (f^*E)_p$, which must have the form $(p, y)$ for some $y \in E_{f(p)}$. Therefore the smooth map $\sigma := \mathrm{pr}_2 \circ s : N \to E$ has the property that $\pi \circ \sigma = f$. Thus we obtain a map $\Gamma(f^*\xi) \to \Gamma_f(\xi)$ given by $s \mapsto \sigma$. But the inverse of this map is clearly $\sigma \mapsto (\mathrm{id}_N, \sigma)$. □
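The two maps in this proof are simple enough to write down directly. The sketch below assumes, purely for illustration, a product bundle $E = M \times F$ with points stored as pairs `(m, y)`, and a hypothetical map `f` and section `sigma`; none of these specific choices come from the text.

```python
def proj(e):
    """Bundle projection pi : E -> M for the product bundle E = M x F."""
    return e[0]


def f(q):
    """Hypothetical smooth map N -> M (here N = M = the real line)."""
    return q * q


def sigma(q):
    """A section along f: it satisfies proj(sigma(q)) == f(q)."""
    return (f(q), 3.0 * q)


def to_pullback_section(sig):
    """sigma |-> (id_N, sigma), a section of the pull-back bundle f*E."""
    return lambda q: (q, sig(q))


def to_section_along_f(s):
    """s |-> pr_2 o s, a section of E along f."""
    return lambda q: s(q)[1]


s = to_pullback_section(sigma)
round_trip = to_section_along_f(s)
```

Each value `s(q)` is a pair `(q, e)` with `f(q) == proj(e)`, which is exactly the condition defining a point of $f^*E$, and `round_trip` agrees with `sigma`.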
6.2. Vector Bundles

The tangent and cotangent bundles are examples of a general type of fiber bundle called a vector bundle. Roughly speaking, a vector bundle is a parametrized family of vector spaces. We shall need both complex vector bundles and real vector bundles, and so to facilitate definitions we let $\mathbb{F}$ denote either $\mathbb{R}$ or $\mathbb{C}$. Let $V$ be a finite-dimensional $\mathbb{F}$-vector space. The simplest examples of vector bundles over a manifold $M$ are the product vector bundles, which consist of a Cartesian product $M \times V$ together with the projection onto the first factor $\mathrm{pr}_1 : M \times V \to M$. Each set of the form $\{x\} \times V \subset M \times V$ inherits an $\mathbb{F}$-vector space structure from that of $V$ in the obvious way: $a(p, v) + b(p, w) := (p, av + bw)$. We think of $M \times V$ as copies of $V$ parametrized by $M$.

Definition 6.21. Let $V$ be a finite-dimensional $\mathbb{F}$-vector space. A smooth $\mathbb{F}$-vector bundle with typical fiber $V$ is a fiber bundle $(E, \pi, M, V)$ such that: (i) for each $x \in M$ the set $E_x := \pi^{-1}(x)$ has the structure of a vector space over the field $\mathbb{F}$, isomorphic to the fixed vector space $V$; (ii) every $p \in M$ is in the domain of some bundle chart (U,
convert the $V$-valued VB-charts into $\mathbb{C}^k$- or $\mathbb{R}^k$-valued VB-charts. Thus we could have assumed from the start that we were dealing with one of these standard vector spaces, but it is not always natural to do so since our vector space may arise in a specific way (it could be a Lie algebra or perhaps a space of algebraic tensors) and may not have a preferred choice of basis.

Exercise 6.24. Show that the tangent and cotangent bundles of an $n$-manifold are vector bundles with typical fiber $\mathbb{R}^n$ (the cotangent bundle may be viewed as having typical fiber $(\mathbb{R}^n)^*$).

Let $\mathbb{F}$ be $\mathbb{R}$ or $\mathbb{C}$ as above. The space of smooth sections $\Gamma(\xi)$ of an $\mathbb{F}$-vector bundle $\xi = (E, \pi, M, V)$ has the structure of a module over the ring $C^\infty(M; \mathbb{F})$; for $\sigma, \sigma_1, \sigma_2 \in \Gamma(\xi)$ and $f \in C^\infty(M; \mathbb{F})$, we define

$$\begin{aligned}
(\sigma_1 + \sigma_2)(p) &:= \sigma_1(p) + \sigma_2(p) && \text{for all } p \in M, \\
(f\sigma)(p) &:= f(p)\sigma(p) && \text{for all } p \in M.
\end{aligned}$$

Definition 6.25. Let $\xi_1$ and $\xi_2$ be $\mathbb{F}$-vector bundles with respective bundle projections $\pi_1$ and $\pi_2$. A bundle morphism $(\hat f, f) : \xi_1 \to \xi_2$ is called a vector bundle morphism if the restrictions to fibers, $\hat f|_{\pi_1^{-1}(p)} : \pi_1^{-1}(p) \to \pi_2^{-1}(f(p))$, are $\mathbb{F}$-linear.

If $\xi_1$ and $\xi_2$ have the same base space $M$, then we obtain the definition of a vector bundle morphism over $M$ by specializing to the case $f = \mathrm{id}_M$. We then also have the corresponding notions of vector bundle isomorphism and automorphism (for both type I and II bundle morphisms). A vector bundle $(E, \pi, M, V)$ is said to be trivial if it is vector bundle isomorphic to the product vector bundle $\mathrm{pr}_1 : M \times V \to M$. This happens exactly when there is a vector bundle trivialization over the entire manifold $M$, which we call a global vector bundle trivialization (a notion already introduced for tangent bundles).

Definition 6.26. Let $(E, \pi, M, V)$ be a rank $k$ vector bundle with typical fiber $V$ and fix an $l$-dimensional subspace $V'$ of $V$. If $E' \subset E$ is a submanifold with the property that for every $p \in M$ there is a VB-chart $(U, \phi)$ with $p \in U$ such that

$$\phi(\pi^{-1}(U) \cap E') = U \times V' \subset U \times V,$$

then $(E', \pi|_{E'}, M, V')$ is called a rank $l$ vector subbundle of $(E, \pi, M, V)$. Charts with this property are said to be adapted to the subbundle.

The quadruple $(E', \pi|_{E'}, M, V')$ is a vector bundle, and every adapted VB-chart $(U, \phi)$ on $E$ gives rise to a chart on $(E', \pi|_{E'}, M, V')$; namely, $(U, \phi')$, where $\phi'$ is the restriction of $\phi$ to $\pi^{-1}(U) \cap E' = (\pi|_{E'})^{-1}(U)$. By picking a basis for $V'$ and extending to a basis for $V$, one may take $V$ to be $\mathbb{R}^k$ and $V'$ to be $\mathbb{R}^l$ embedded in $\mathbb{R}^k$ as $\mathbb{R}^l \times \{0\} \subset \mathbb{R}^k$.
Exercise 6.27. Let $E \to M$ be a vector bundle as above. Suppose that a subspace $E'_p$ of $E_p$ is given for each $p \in M$ and consider the set $E' = \bigcup_{p \in M} E'_p$. Show that $E'$ is the total space of a rank $l$ vector subbundle if and only if for each $p \in M$, there is an open neighborhood $U$ of $p$ on which smooth sections $\sigma_1, \ldots, \sigma_l$ are defined such that for each $q \in U$ the set $\{\sigma_1(q), \ldots, \sigma_l(q)\}$ is a basis of the subspace $E'_q$.

If $h : E_1 \to E_2$ is a vector bundle morphism over $M$, then

$$\operatorname{Ker} h := \bigcup_{p \in M} \operatorname{Ker} h|_{E_{1p}}$$

is a subset of $E_1$. This subset is not necessarily (the total space of) a subbundle, at least in the ordinary sense. However, if the rank $r$ of $h|_{E_{1p}}$ is independent of $p$, then we say that the bundle map has rank $r$ and, in this case, $\operatorname{Ker} h$ is a vector subbundle. Similarly, if $h$ has constant rank in this sense, then the image $\operatorname{Im} h$ is a vector subbundle of $E_2$. Both of these facts follow from

Proposition 6.28. Suppose that $h : E_1 \to E_2$ is a vector bundle morphism over $M$ of constant rank $r$ and that $E_1$ and $E_2$ have typical fibers $V_1$ and $V_2$ respectively. Fix a rank $r$ linear map $A : V_1 \to V_2$. Then for every $p \in M$ there is a VB-chart $(U, \phi)$ for $E_1$ with $p \in U$ and a VB-chart $(U, \psi)$ for $E_2$ such that $\psi \circ h \circ \phi^{-1} : U \times V_1 \to U \times V_2$ has the form

$$(p, v) \mapsto (p, Av).$$

It follows that $\operatorname{Ker} h$ is a vector subbundle with typical fiber $\operatorname{Ker} A$ and $\operatorname{Im} h$ is a vector subbundle of $E_2$ with typical fiber $\operatorname{Im} A$.
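A small worked instance may help to see what the normal form in Proposition 6.28 amounts to. The sketch below is an illustration on an assumed example, not the author's construction verbatim: it takes the family $h_p$ with matrix $\begin{pmatrix} 1 & p \\ p & p^2 \end{pmatrix}$, which has rank $1$ for every $p$, and uses fiberwise changes of coordinates `phi_inverse(p)` and `psi(p)` built from the blocks of $h_p$ (with $A_{11} = 1$, $A_{12} = p$, $A_{21} = p$) to bring it to the constant matrix $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$. All function names are hypothetical.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def h(p):
    """Matrix of h_p; the second row is p times the first, so the rank is 1."""
    return [[Fraction(1), p], [p, p * p]]


def phi_inverse(p):
    """Inverse of phi_p = [[A11, A12], [0, 1]] with A11 = 1 and A12 = p."""
    return [[Fraction(1), -p], [Fraction(0), Fraction(1)]]


def psi(p):
    """psi_p = [[1, 0], [-M_p, 1]] with M_p = A21 * A11^{-1} = p."""
    return [[Fraction(1), Fraction(0)], [-p, Fraction(1)]]


def normal_form(p):
    """Matrix of psi_p o h_p o phi_p^{-1}."""
    return matmul(psi(p), matmul(h(p), phi_inverse(p)))
```

Since `phi_inverse(p)` and `psi(p)` depend polynomially on `p`, they define smooth changes of VB-chart, and `normal_form(p)` no longer depends on `p` at all.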
Proof. Let us first make an observation. Notice that only the rank of the linear map $A : V_1 \to V_2$ in this last proposition is important, and we may replace $A$ by any linear map of the same rank. The reason for this is that if $B : V_1 \to V_2$ is any other linear map with the same rank as $A$, then there exist linear isomorphisms $\alpha$ and $\beta$ such that $B = \beta A \alpha^{-1}$. In particular, if one has chosen bases and identified $V_1$ with $\mathbb{F}^{k_1}$ and $V_2$ with $\mathbb{F}^{k_2}$, then we may take $A$ to be a map of the form

$$(x^1, \ldots, x^{k_1}) \mapsto (x^1, \ldots, x^r, 0, \ldots, 0),$$

so that $\operatorname{Ker} A$ is a copy of $\mathbb{F}^{k_1 - r}$ and $\operatorname{Im} A$ is a copy of $\mathbb{F}^r$. What we need to prove is entirely local. Thus our task is to show that for any smooth map $h : U \times \mathbb{F}^{k_1} \to U \times \mathbb{F}^{k_2}$ of the form $(p, v) \mapsto (p, h_p v)$, with $h_p$ a linear map of rank $r$ and where $p \mapsto h_p$ is smooth, we may find maps $\psi$ and $\phi$ such that $\psi \circ h \circ \phi^{-1} : U \times \mathbb{F}^{k_1} \to U \times \mathbb{F}^{k_2}$ is given by $(p, x^1, \ldots, x^{k_1}) \mapsto (p, x^1, \ldots, x^r, 0, \ldots, 0)$. Fix $p_0 \in U$. There exist linear
isomorphisms $\alpha : \mathbb{F}^{k_1} \to \mathbb{F}^{k_1}$ and $\beta : \mathbb{F}^{k_2} \to \mathbb{F}^{k_2}$ such that $\beta \circ h_p \circ \alpha^{-1}$ is given by a $k_2 \times k_1$ matrix of the form

$$\begin{bmatrix} A_{11}(p) & A_{12}(p) \\ A_{21}(p) & A_{22}(p) \end{bmatrix},$$

where $A_{11}(p_0)$ is an $r \times r$ matrix which is invertible. By shrinking $U$ if needed, we may assume that $A_{11}(p)$ is invertible for all $p \in U$. Thus we may as well assume from the start that $h_p$ is represented by a matrix of this form. Now consider the map $\phi_p : \mathbb{F}^{k_1} \to \mathbb{F}^{k_1}$ whose matrix is given by

$$\begin{bmatrix} A_{11}(p) & A_{12}(p) \\ 0 & I_{(k_1 - r) \times (k_1 - r)} \end{bmatrix}_{k_1 \times k_1}.$$

Then $h_p \circ \phi_p^{-1}$ has a matrix of the form

$$\begin{bmatrix} I_{r \times r} & 0 \\ A_{21}(p) A_{11}^{-1}(p) & C \end{bmatrix},$$

and since this matrix must have rank $r$, we see that $C = 0$. Let $M_p := A_{21}(p) A_{11}^{-1}(p)$ and let $\psi_p$ be the linear map $\mathbb{F}^{k_2} \to \mathbb{F}^{k_2}$ with matrix

$$\begin{bmatrix} I_{r \times r} & 0 \\ -M_p & I_{(k_2 - r) \times (k_2 - r)} \end{bmatrix}.$$

Then $\psi_p \circ h_p \circ \phi_p^{-1}$ has the form $\begin{bmatrix} I_{r \times r} & 0 \\ 0 & 0 \end{bmatrix}$. Now define $\phi(p, x) := (p, \phi_p x)$, $h(p, x) := (p, h_p x)$ and $\psi(p, v) := (p, \psi_p v)$ for $p \in U$, $x \in \mathbb{F}^{k_1}$, and $v \in \mathbb{F}^{k_2}$. Notice that $\psi_p$, $h_p$ and $\phi_p^{-1}$ each depend smoothly on $p$. The map $\psi \circ h \circ \phi^{-1}$ has the required form. □

Proposition 6.29. Let $\lambda_0 : GL(V) \times V \to V$ be the standard action of $GL(V)$ on the $\mathbb{F}$-vector space $V$. A fiber bundle with typical fiber $V$ has an $\mathbb{F}$-vector bundle structure if and only if it admits a $\lambda_0$-bundle atlas (a $GL(V)$-bundle atlas). Furthermore, if $\lambda : G \times V \to V$ is any effective action of a Lie group $G$ which acts linearly, then any fiber bundle $(E, \pi, M, V)$ that has a $\lambda$-atlas is a vector bundle in a natural way.

Proof. That a vector bundle has a $GL(V)$-bundle structure follows directly from the definition. All that remains to show is the second part of the proposition, since this will imply the remainder of the first part. Let $\lambda : G \times V \to V$ be any effective Lie group action which acts linearly and suppose that $(E, \pi, M, V)$ has a $\lambda$-bundle structure. Let $(U_\alpha, \phi_\alpha)$ and $(U_\beta, \phi_\beta)$ be $\lambda$-compatible bundle charts and let $\phi_\alpha = (\pi, \Phi_\alpha)$ and $\phi_\beta = (\pi, \Phi_\beta)$. Fix
$p \in U_\alpha \cap U_\beta$. For $v, w \in V$, and $a, b \in \mathbb{F}$, we have

$$\begin{aligned}
\Phi_{\alpha\beta}(p)(av + bw) &= \lambda(g_{\alpha\beta}(p), av + bw) \\
&= a\lambda(g_{\alpha\beta}(p), v) + b\lambda(g_{\alpha\beta}(p), w) \\
&= a\Phi_{\alpha\beta}(p)(v) + b\Phi_{\alpha\beta}(p)(w),
\end{aligned}$$

which shows that $\Phi_{\alpha\beta}(p) \in GL(V)$ for all $p \in U_\alpha \cap U_\beta$. We transfer the vector space structure from $V$ to $E_p$ via $\Phi_\alpha|_{E_p}$ and note that this is well-defined by Proposition 2.3. With this linear structure on the fibers it is easy to verify that $(E, \pi, M, V)$ is a vector bundle. □

Theorem 6.30 (Vector bundle construction theorem). Let $\{U_\alpha\}_{\alpha \in A}$ be a cover of $M$ and let $\{g_{\alpha\beta}\}$ be a $G$-cocycle for a Lie group $G$. If $G$ acts linearly on the vector space $V$ (by, say, $\lambda$), then there exists a vector bundle over $M$ with a VB-atlas $\{(U_\alpha, \phi_\alpha)\}$ satisfying $\phi_\alpha \circ \phi_\beta^{-1}(p, v) = (p, g_{\alpha\beta}(p) \cdot v)$ on nonempty overlaps $U_\alpha \cap U_\beta$. In other words, there exists a vector bundle with a $(G, \lambda)$-atlas.

Proof. This is essentially a special case of Theorem 6.15. One only needs to check linearity of the $\phi_\alpha$ on fibers. □

Perhaps some clarification is in order. In the case of a vector bundle, the raw transition maps $\Phi_{\alpha\beta}$ take values in the general linear group $GL(V)$, which is a Lie group. They correspond to a $\lambda_0$-bundle structure, where $\lambda_0$ is the standard linear action of $GL(V)$ on $V$ (the standard representation), and they automatically satisfy the cocycle condition. The more general transition maps that define a $(G, \lambda)$-bundle structure ($G$-bundle structure) are $G$-valued. It is important to note that $G$ may be small compared to $GL(V)$ and certainly need not be thought of as a subset of $GL(V)$. For example, the tensor bundles have (possibly ineffective) $GL(V)$-bundle structures coming from tensor representations, but the tensor bundles themselves generally have rank greater than $k = \dim(V)$. Since in the vector bundle case the $\Phi_{\alpha\beta}$ arise directly from a VB-atlas and act by the standard action, we will call these standard transition maps, and the corresponding $GL(V)$-bundle structure will be called the standard $GL(V)$-bundle structure. The standard $GL(V)$-bundle structure is the structure that a vector bundle has simply by virtue of being an $\mathbb{F}$-vector bundle with typical fiber $V$.

Remark 6.31. We have previously mentioned that the notion of a representation is equivalent to that of a left linear action. When dealing with vector bundles it is perhaps more common to use the representation terminology and notation, and this we shall do as convenient. So if $\lambda$ is a left linear action, then the map $G \to GL(V)$ given by $g \mapsto \lambda(g) := \lambda_g$ is a representation of $G$. Conversely, if $\lambda$ is such a representation, we obtain a linear action by letting $\lambda(g, v) := \lambda(g)v$.
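The remark about tensor representations can be made concrete with the simplest case. The following sketch is an assumption-laden illustration: it takes $V = \mathbb{R}^2$, represents elements of $GL(V)$ as integer matrices, and uses the Kronecker product to model the representation of $GL(V)$ on $V \otimes V$. The names `kron` and `rho` are hypothetical helpers, not notation from the text.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def kron(A, B):
    """Kronecker product: the matrix of A (x) B in the basis e_i (x) e_j."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def rho(g):
    """Tensor representation of GL(V) on V (x) V, sending g to g (x) g."""
    return kron(g, g)


g = [[1, 2], [0, 1]]
h = [[2, 0], [1, 1]]
minus_identity = [[-1, 0], [0, -1]]
identity_4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
```

Here `rho` is a homomorphism into $4 \times 4$ matrices, so the associated bundle has rank $4 > 2 = \dim V$, and `rho(minus_identity)` is the identity, which shows that this particular action of $GL(V)$ is not effective.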
We already know what it means for two vector bundles over $M$ to be equivalent. Of course any two vector bundles that are equivalent in a natural way can be thought of as the same. Since we can and often do construct our bundles according to the above recipe, it will pay to know something about when two vector bundles over $M$ are isomorphic, based on their respective transition functions. Notice that the standard transition functions are easily recovered from every $(G, \lambda)$-atlas by the formula $\lambda(g_{\alpha\beta}(p))\, y = \Phi_{\alpha\beta}|_p(y)$. Proposition 6.32. Two vector bundles $\pi : E \to M$ and $\pi' : E' \to M$ with standard transition maps {
Proof. (Sketch) Given a vector bundle isomorphism $f : E \to E'$ over $M$, let $f_\alpha(x)$ :=
To :
The conditions (6.1) ensure that $f$ is well-defined on the overlaps $E|_{U_\alpha} \cap E|_{U_\beta} = E|_{U_\alpha \cap U_\beta}$. One easily checks that this is a vector bundle isomorphism. □

We can use this construction to arrive at several common vector bundles.

Example 6.33. Given an atlas $\{(U_\alpha, x_\alpha)\}$ for a smooth manifold $M$, we let $g_{\alpha\beta}(p) = T_p x_\alpha \circ (T_p x_\beta)^{-1}$ for all $p \in U_\alpha \cap U_\beta$. The bundle constructed according to the recipe of Theorem 6.30 is a vector bundle which is (naturally isomorphic to) the tangent bundle $TM$. If we let $g^*_{\alpha\beta}(p) = \left(T_p x_\beta \circ (T_p x_\alpha)^{-1}\right)^*$, then we obtain the cotangent bundle $T^*M$.

Proposition 6.34. Let $\pi : E \to M$ be an $\mathbb{F}$-vector bundle with typical fiber $V$ and with VB-atlas $\{(U_\alpha, \phi_\alpha)\}$. Let $s_\alpha : U_\alpha \to V$ be a collection of maps such that whenever $U_\alpha \cap U_\beta \neq \emptyset$, we have $s_\alpha(p)$ =
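This gives a well-defined section $s$ because for $p \in U_\alpha \cap U_\beta$ we have

$$\begin{aligned}
\phi_\alpha^{-1}(p, s_\alpha(p)) &= \phi_\alpha^{-1}(p, \Phi_{\alpha\beta}(p) s_\beta(p)) \\
&= \phi_\alpha^{-1} \circ \phi_\alpha \circ \phi_\beta^{-1}(p, s_\beta(p)) \\
&= \phi_\beta^{-1}(p, s_\beta(p)).
\end{aligned}$$

□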
Suppose we have two vector bundles, $\pi_1 : E_1 \to M$ and $\pi_2 : E_2 \to M$. We give two constructions of the Whitney sum bundle $\pi_1 \oplus \pi_2 : E_1 \oplus E_2 \to M$. This is a globalization of the direct sum construction of vector spaces. In fact, the first construction simply takes $E_1 \oplus E_2 = \bigcup_{p \in M} E_{1p} \oplus E_{2p}$. Now, we have a vector bundle atlas $\{(\phi_\alpha, U_\alpha)\}$ for $\pi_1$ and a vector bundle atlas $\{(\psi_\alpha, U_\alpha)\}$ for $\pi_2$. Assume that both atlases have the same family of open sets (we can arrange this by taking a common refinement). Now let $\phi_\alpha \oplus \psi_\alpha : (v_p, w_p) \mapsto (p, \mathrm{pr}_2 \circ \phi_\alpha(v_p), \mathrm{pr}_2 \circ \psi_\alpha(w_p))$ for all $(v_p, w_p) \in (E_1 \oplus E_2)|_{U_\alpha}$. Then $\{(\phi_\alpha \oplus \psi_\alpha, U_\alpha)\}$ is a VB-atlas for $\pi_1 \oplus \pi_2 : E_1 \oplus E_2 \to M$.

Another method of constructing this bundle is to take the cocycle $\{g_{\alpha\beta}\}$ for $\pi_1$ and the cocycle $\{h_{\alpha\beta}\}$ for $\pi_2$ and then let $g_{\alpha\beta} \oplus h_{\alpha\beta} : U_\alpha \cap U_\beta \to GL(\mathbb{F}^{k_1} \times \mathbb{F}^{k_2})$ be defined by $(g_{\alpha\beta} \oplus h_{\alpha\beta})(x) = g_{\alpha\beta}(x) \oplus h_{\alpha\beta}(x) : (v, w) \mapsto (g_{\alpha\beta}(x)v, h_{\alpha\beta}(x)w)$. The maps $g_{\alpha\beta} \oplus h_{\alpha\beta}$ form a cocycle which determines a bundle by the construction of Theorem 6.30, which is (isomorphic to) $\pi_1 \oplus \pi_2 : E_1 \oplus E_2 \to M$.

The pull-back of a vector bundle $\pi : E \to M$ by a smooth map $f : N \to M$ is naturally a vector bundle whose linear structure on each fiber $(f^*E)_q = \{q\} \times E_p$, where $p = f(q)$, is the obvious one induced from $E_p$. Put another way, we give the unique linear structure to each fiber that makes the bundle map $\tilde f : f^*E \to E$ linear on fibers. When given this vector bundle structure, we call $f^*E$ the pull-back vector bundle.
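The second construction of the Whitney sum rests on the fact that taking block-diagonal sums is compatible with matrix multiplication, so that cocycle relations for the $g_{\alpha\beta}$ and the $h_{\alpha\beta}$ pass to the $g_{\alpha\beta} \oplus h_{\alpha\beta}$. A minimal sketch of this check at a single point, with hypothetical sample matrices standing in for the values of two cocycles of ranks $2$ and $1$:

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def direct_sum(A, B):
    """Block-diagonal matrix of A (+) B for square matrices A and B."""
    n, m = len(A), len(B)
    top = [row + [0] * m for row in A]
    bottom = [[0] * n + row for row in B]
    return top + bottom


# Hypothetical values at one point x of a triple overlap:
# g_ab, g_bc for the first bundle (rank 2), h_ab, h_bc for the second (rank 1).
g_ab, g_bc = [[1, 1], [0, 1]], [[2, 0], [1, 1]]
h_ab, h_bc = [[3]], [[-1]]

# The cocycle relation forces g_ac = g_ab g_bc and h_ac = h_ab h_bc.
g_ac, h_ac = matmul(g_ab, g_bc), matmul(h_ab, h_bc)

sum_of_products = direct_sum(g_ac, h_ac)
product_of_sums = matmul(direct_sum(g_ab, h_ab), direct_sum(g_bc, h_bc))
```

The two matrices `sum_of_products` and `product_of_sums` coincide, which is the pointwise content of the claim that the maps $g_{\alpha\beta} \oplus h_{\alpha\beta}$ again form a cocycle.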
Example 6.35. Let $\pi_1 : E_1 \to M$ and $\pi_2 : E_2 \to M$ be vector bundles and let $\Delta : M \to M \times M$ be the diagonal map $x \mapsto (x, x)$. From $\pi_1$ and $\pi_2$ one can construct a bundle $\pi_{E_1 \times E_2} : E_1 \times E_2 \to M \times M$ by $\pi_{E_1 \times E_2}(\epsilon_1, \epsilon_2) := (\pi_1(\epsilon_1), \pi_2(\epsilon_2))$. The Whitney sum bundle $E_1 \oplus E_2$ defined previously is naturally isomorphic to the pull-back $\Delta^*\pi_{E_1 \times E_2} : \Delta^*(E_1 \times E_2) \to M$ (Problem 10).

Exercise 6.36. Recall the space $\Gamma_f(\xi)$ from Definition 6.19. Show that if $\pi : E \to M$ is an $\mathbb{F}$-vector bundle and $f : N \to M$ is a smooth map, then both $\Gamma_f(\xi)$ and $\Gamma(f^*\xi)$ are modules over $C^\infty(N; \mathbb{F})$, and that the natural correspondence between $\Gamma_f(\xi)$ and $\Gamma(f^*\xi)$ is a module isomorphism.

Every vector bundle has global sections. An obvious example is the zero section, which maps each $x \in M$ to the zero element $0_x$ of the fiber
$E_x$. The image of the zero section is also referred to as the zero section and is often identified with $M$. (Of course, the image of any global section is a submanifold diffeomorphic to the base manifold.) We have the following simple analogue of Lemma 2.65:

Lemma 6.37. Let $\pi : E \to M$ be an $\mathbb{F}$-vector bundle with typical fiber $V$. If $v \in \pi^{-1}(p)$, then there exists a global section $\sigma \in \Gamma(\xi)$ such that $\sigma(p) = v$. Furthermore, if $s$ is a local section defined on $U$, and $V$ is an open set with compact closure with $V \subset \overline{V} \subset U$, then there is a section $\sigma \in \Gamma(\xi)$ such that $\sigma = s$ on $V$.

Proof. Using a local trivialization one can easily get a local section $\sigma_{loc}$ defined near $p$ such that $\sigma_{loc}(p) = v$. Now just use a cut-off function as in the proof of Lemma 2.65. For the second part we just choose a cut-off function $\beta$ with support in $U$ and such that $\beta = 1$ on $V$. Then $\beta s$ extends by zero to the desired global section. □

If a section of a vector bundle takes the zero value in some fiber, we say that it vanishes at that point. Global smooth sections that never vanish do not always exist; such sections are called nowhere vanishing or nonvanishing. However, there is one case where it is easy to see that nonvanishing smooth sections exist:

Proposition 6.38. Any bundle equivalent to a (trivial) product bundle must have a nowhere vanishing smooth global section.

It is a fact that the tangent bundle of $S^2$ does not have any such nowhere vanishing smooth sections. In other words, all smooth (or even continuous) vector fields on $S^2$ must vanish at some point. This is a result from algebraic topology called the "hairy sphere theorem" (see Theorem 10.15). If one fancifully imagines a vector field on a sphere to be hair, then the theorem suggests that one cannot comb the hair neatly "flat" without creating a cowlick somewhere. More generally, the analogous result holds for $S^{2n}$ if $n \geq 1$.

Exercise 6.39. Modify either the construction of Example 6.9 or Example 6.16 to obtain a rank one vector bundle version of the Möbius band and give an argument proving that every global continuous section of this bundle must vanish somewhere.
Definition 6.40. If $\xi = (E, \pi, M, V)$ is a vector bundle and $p \in M$, then a vector space basis for the fiber $E_p$ is called a frame at $p$.

Definition 6.41. Let $\pi : E \to M$ be a rank $k$ vector bundle. A $k$-tuple $\sigma = (\sigma_1, \ldots, \sigma_k)$ of sections of $E$ over an open set $U$ is called a (local) frame field over $U$ if for all $p \in U$, $(\sigma_1(p), \ldots, \sigma_k(p))$ is a frame at $p$.
If we choose a fixed basis $\{e_i\}_{i=1,\ldots,k}$ for the typical fiber $V$, then a choice of a local frame field over an open set $U \subset M$ is equivalent to a local trivialization (a vector bundle chart). Namely, if
=
3
= LA{(9a{3(P))0"3(P). j
Thus the smooth matrix-valued function $(\lambda^i_j \circ g_{\alpha\beta})$ defined on $U_\alpha \cap U_\beta$ gives the change of frame and embodies the transition map on $U_\alpha \cap U_\beta$. In practice, this is often a good way to look at things. Consider the common situation where $V = \mathbb{R}^k$ and where $\lambda_0$ is the standard representation of $GL(\mathbb{R}^k)$. If
we identify $GL(\mathbb{R}^k)$ with the matrix group $GL(k)$, then $((\lambda_0)^i_j \circ g_{\alpha\beta}(p))$ is just the matrix $g_{\alpha\beta}(p)$ itself.

Metric differential geometry begins if we have a scalar product on the fibers of a vector bundle. We introduce the concept at this point so as to have an example of a reduction of the structure group.

Definition 6.42. A Riemannian metric on a real vector bundle $\pi : E \to M$ is a map $p \mapsto g_p(\cdot, \cdot)$ which assigns to each $p \in M$ a positive definite scalar product $g_p(\cdot, \cdot)$ on the fiber $E_p$ that is smooth in the sense that $p \mapsto g_p(s_1(p), s_2(p))$ is smooth for all smooth sections $s_1$ and $s_2$. A real vector bundle together with a Riemannian metric is referred to as a Riemannian vector bundle.

For example, a Riemannian metric on the tangent bundle of a smooth manifold is what one means by a Riemannian metric on the manifold. A smooth manifold with a Riemannian metric is called a Riemannian manifold, and such will be studied later in this book.

If a rank $k$ real vector bundle $\pi : E \to M$ has a Riemannian metric, then it is convenient to assume that a fixed inner product is chosen on the typical fiber $V$ and that a distinguished orthonormal basis $(e_1, \ldots, e_k)$ has been chosen. In most applications, $V$ is $\mathbb{R}^k$, the inner product is the standard dot product, and the distinguished basis is the usual standard basis. Recall that the orthogonal group $O(V)$ is the subgroup of $GL(V)$ consisting of elements that preserve the inner product. Once the inner product and distinguished orthonormal basis are fixed, every choice of Riemannian metric on $E$ corresponds to a reduction of the standard structure group $GL(V)$ to the subgroup $O(V)$ as follows. Let us first show how a metric leads to a reduction. We start with an arbitrary VB-atlas $\{(U_\alpha, \phi_\alpha)\}$. Since we have fixed a basis for $V$, each chart $(U_\alpha, \phi_\alpha)$ defines a frame field $(\sigma_1, \ldots, \sigma_k)$ on $U_\alpha$ by $\sigma_i(p)$ :=
Make this replacement for each $(U_\alpha, \phi_\alpha)$ to obtain a new atlas $\{(U_\alpha,$
are related by
$$e_j^\beta(p) = \sum_i Q_j^i(p)\, e_i^\alpha(p)$$
for some smooth orthogonal matrix function $Q = (Q_j^i)$. One now checks that the transition maps for this new atlas (which is still a subatlas of the maximal VB-atlas) take values in $O(V)$. Indeed,
$$\begin{aligned}
(p, \Phi'_{\alpha\beta}(p)(v)) &= \phi'_\alpha \circ \phi'^{-1}_\beta(p, v) = \phi'_\alpha\Big(\sum_j v^j e_j^\beta(p)\Big) \\
&= \phi'_\alpha\Big(\sum_{i,j} v^j Q_j^i(p)\, e_i^\alpha(p)\Big) = \Big(p, \sum_{i,j} v^j Q_j^i(p)\, \Phi'_\alpha|_p\, e_i^\alpha(p)\Big) \\
&= \Big(p, \sum_{i,j} v^j Q_j^i(p)\, e_i\Big),
\end{aligned}$$
from which we see that $\Phi'_{\alpha\beta}(p)(v) = \Phi'_{\alpha\beta}(p)\big(\sum_j v^j e_j\big) = \sum_{i,j} v^j Q_j^i(p)\, e_i$. Since $(Q_j^i(p))$ is an orthogonal matrix for all $p$, we have $\Phi'_{\alpha\beta}(p) \in O(V)$ for all $p$.

The converse is also true. Namely, a reduction to the structure group $O(V)$ (acting in the standard way) is tantamount to the introduction of a Riemannian metric. The correspondence presumes the prior choice of inner product and distinguished orthonormal basis on $V$.

Exercise 6.43. Prove the converse statement referred to above.

Exercise 6.44. Let $E$ be a complex vector bundle of rank $k$. Define, by analogy with Riemannian metric, the notion of a Hermitian metric on $E$ and show that every Hermitian metric on $E$ corresponds to a reduction of the standard $GL(k, \mathbb{C})$-bundle structure to a $U(k)$-bundle structure.

Proposition 6.45. On every real vector bundle $E$ there can be defined a Riemannian metric. Similarly, on any complex vector bundle there exists a Hermitian metric.

Proof. We prove the Riemannian case; the Hermitian case is entirely analogous. The proof uses the fact that a convex combination of positive definite scalar products is a positive definite scalar product. This allows us to use a partition of unity argument. Endow $V$ with an inner product. Let $\{(U_\alpha, \phi_\alpha)\}$ be a VB-atlas and let $(U_\alpha, \phi_\alpha)$ be a given VB-chart. On the trivial bundle $U_\alpha \times V \to U_\alpha$ there certainly exists a Riemannian metric given on each fiber by $\langle (p, v), (p, w) \rangle_\alpha := \langle v, w \rangle$. We may transfer this to the bundle $\pi^{-1}(U_\alpha) \to U_\alpha$ by using the map $\phi_\alpha^{-1}$, thus obtaining a metric $g_\alpha$ on this restricted bundle over $U_\alpha$. We do this for every VB-chart in the atlas. The trick is to piece these together in a smooth way. For that, we take a smooth partition of unity $\{(U_\alpha, \rho_\alpha)\}$ subordinate to the cover $\{U_\alpha\}$. Let
$$g(p) = \sum_\alpha \rho_\alpha(p)\, g_\alpha(p).$$
The sum is finite at each $p \in M$ since the partition of unity is locally finite and the functions $\rho_\alpha g_\alpha$ are extended to be zero outside of the corresponding $U_\alpha$. The fact that $\rho_\alpha \ge 0$ and $\rho_\alpha > 0$ at $p$ for at least one $\alpha$ easily gives the result that $g$ is positive definite at each $p$, and so it is a Riemannian metric on $E$. $\square$

Example 6.46 (Tautological line bundle). Recall that $\mathbb{RP}^n$ is the set of all lines through the origin in $\mathbb{R}^{n+1}$. Define the subset $\mathbb{L}(\mathbb{RP}^n)$ of $\mathbb{RP}^n \times \mathbb{R}^{n+1}$ consisting of all pairs $(l, v)$ such that $v \in l$ (think about this). This set, together with the map $\pi_{\mathbb{RP}^n} : \mathbb{L}(\mathbb{RP}^n) \to \mathbb{RP}^n$ given by $(l, v) \mapsto l$, is a rank one vector bundle.

Example 6.47 (Tautological bundle). Let $G(n, k)$ denote the Grassmann manifold of $k$-planes in $\mathbb{R}^n$. Let $\gamma_{n,k}$ be the subset of $G(n, k) \times \mathbb{R}^n$ consisting of pairs $(P, v)$ where $P$ is a $k$-plane ($k$-dimensional subspace) and $v$ is a vector in the plane $P$. The projection $\pi_{n,k} : \gamma_{n,k} \to G(n, k)$ is simply $(P, v) \mapsto P$. The result is a vector bundle $(\gamma_{n,k}, \pi_{n,k}, G(n, k), \mathbb{R}^k)$. We leave it to the reader to discover an appropriate VB-atlas (see Problem 12).

These tautological vector bundles are not just trivial bundles, and in fact their topology or twistedness (for large $n$) is of the utmost importance for classifying vector bundles (see [Bo-Tu]). One may take the inclusions $\mathbb{R}^n \subset \mathbb{R}^{n+1} \subset \cdots \subset \mathbb{R}^\infty$ to construct inclusions $G(n, k) \subset G(n+1, k) \subset \cdots$ and $\gamma_{n,k} \subset \gamma_{n+1,k} \subset \cdots$. Given a rank $k$ vector bundle $\pi : E \to M$, there is an $n$ such that $\pi : E \to M$ is (isomorphic to) the pull-back of $\gamma_{n,k}$ by some map $f : M \to G(n, k)$:
$$\begin{array}{ccc}
E \cong f^*\gamma_{n,k} & \longrightarrow & \gamma_{n,k} \\
\downarrow & & \downarrow \\
M & \xrightarrow{\;\;f\;\;} & G(n, k)
\end{array}$$
Exercise 6.48. To each point on a unit sphere in $\mathbb{R}^n$, attach the space of all vectors normal to the sphere at that point. Show that this normal bundle is in fact a (smooth) vector bundle. Generalize to define the normal bundle of a hypersurface in $\mathbb{R}^n$. When is such a normal bundle trivial?

Exercise 6.49. Fix a nonnegative integer $j$. Let $Y = \mathbb{R} \times (-1, 1)$ and let $(x_1, y_1) \sim (x_2, y_2)$ if and only if $x_1 = x_2 + jk$ and $y_1 = (-1)^{jk} y_2$ for some integer $k$. Show that $E := Y/\!\sim$ is a vector bundle of rank 1 that is trivial if and only if $j$ is even. Prove or at least convince yourself that this is the Möbius band when $j$ is odd.
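The key fact behind the proof of Proposition 6.45 is that a convex combination of positive definite scalar products is again positive definite, even when some of the weights vanish. The following is a minimal numerical sketch of that fact on a single fiber, not part of the original text. The $2 \times 2$ matrices standing in for the local metrics $g_\alpha(p)$, the weights standing in for $\rho_\alpha(p)$, and the helper names are all hypothetical choices made for illustration.

```python
# Hypothetical illustration: symmetric 2x2 matrices play the role of the
# local scalar products g_alpha(p) on one fiber, and rho holds the values
# rho_alpha(p) of a partition of unity at the point p.

def is_positive_definite_2x2(m):
    """Sylvester's criterion for a symmetric 2x2 matrix [[a, b], [b, d]]."""
    (a, b), (c, d) = m
    return a > 0 and a * d - b * c > 0

def convex_combination(weights, forms):
    """Return sum_alpha rho_alpha * g_alpha for a list of 2x2 matrices."""
    return [[sum(w * g[i][j] for w, g in zip(weights, forms))
             for j in range(2)] for i in range(2)]

g_alphas = [
    [[2.0, 1.0], [1.0, 3.0]],
    [[1.0, -0.5], [-0.5, 1.0]],
    [[5.0, 0.0], [0.0, 0.1]],
]
# Nonnegative weights summing to 1; one weight vanishes at this point.
rho = [0.25, 0.0, 0.75]

g = convex_combination(rho, g_alphas)
print(g, is_positive_definite_2x2(g))
```

Only the weights that are strictly positive at $p$ contribute, which is why the proof needs $\rho_\alpha(p) > 0$ for at least one $\alpha$.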
6.3. Tensor Products of Vector Bundles

Given two vector bundles $\pi_1 : E_1 \to M$ and $\pi_2 : E_2 \to M$ with respective typical fibers $V_1$ and $V_2$, we let
$$E_1 \otimes E_2 := \bigsqcup_{p \in M} E_{1p} \otimes E_{2p} \quad \text{(a disjoint union).}$$
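Then we have a projection map $\pi : E_1 \otimes E_2 \to M$ given by mapping any element in a fiber $E_{1p} \otimes E_{2p}$ to the base point $p$. We show how to construct a VB-atlas for $E_1 \otimes E_2$ from an atlas on each of $E_1$ and $E_2$. The smooth structure and topology can be derived from the atlas as usual in such a way as to make all the relevant maps smooth. We leave the verification of this to the reader. The resulting bundle is the tensor product bundle. As usual we can assume that the atlases are based on the same open cover. Thus suppose that $\{(U_\alpha, \phi_\alpha)\}$ is a VB-atlas for $E_1$ while $\{(U_\alpha, \psi_\alpha)\}$ is a VB-atlas for $E_2$. Now let $\Phi_\alpha \otimes \Psi_\alpha : (E_1 \otimes E_2)|_{U_\alpha} \to V_1 \otimes V_2$ be defined by $(\Phi_\alpha \otimes \Psi_\alpha)|_{E_{1p} \otimes E_{2p}} := \Phi_\alpha|_{E_{1p}} \otimes \Psi_\alpha|_{E_{2p}}$ for $p \in U_\alpha$. Then let
$$\phi_\alpha \otimes \psi_\alpha : (E_1 \otimes E_2)|_{U_\alpha} \to U_\alpha \times (V_1 \otimes V_2)$$
be defined by $\phi_\alpha \otimes \psi_\alpha := (\pi, \Phi_\alpha \otimes \Psi_\alpha)$. To clarify, the map $\Phi_\alpha|_{E_{1p}} \otimes \Psi_\alpha|_{E_{2p}} : E_{1p} \otimes E_{2p} \to V_1 \otimes V_2$ is the tensor product map of two linear maps as described at the end of Chapter 5. To see what the transition maps look like, we compute:
$$\begin{aligned}
(\Phi_\alpha \otimes \Psi_\alpha)|_{E_{1p} \otimes E_{2p}} \circ (\Phi_\beta \otimes \Psi_\beta)|_{E_{1p} \otimes E_{2p}}^{-1}
&= \big(\Phi_\alpha|_{E_{1p}} \otimes \Psi_\alpha|_{E_{2p}}\big) \circ \big(\Phi_\beta|_{E_{1p}}^{-1} \otimes \Psi_\beta|_{E_{2p}}^{-1}\big) \\
&= \big(\Phi_\alpha|_{E_{1p}} \circ \Phi_\beta|_{E_{1p}}^{-1}\big) \otimes \big(\Psi_\alpha|_{E_{2p}} \circ \Psi_\beta|_{E_{2p}}^{-1}\big) \\
&= \Phi_{\alpha\beta}(p) \otimes \Psi_{\alpha\beta}(p).
\end{aligned}$$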
Thus the transition maps are given by $p \mapsto \Phi_{\alpha\beta}(p) \otimes \Psi_{\alpha\beta}(p)$, which is a map from $U_\alpha \cap U_\beta$ to $GL(V_1 \otimes V_2)$. The group $GL(V_1 \otimes V_2)$ acts on $V_1 \otimes V_2$ in a standard way, and this is the standard effective structure group of the bundle as we have just seen. However, it is also true that the bundle $E_1 \otimes E_2$ has (ineffective) structure group $GL(V_1) \times GL(V_2)$ via a tensor product representation. Indeed, if $\iota_1$ denotes the standard representation of $GL(V_1)$ in $V_1$ and $\iota_2$ denotes the standard representation of $GL(V_2)$ in $V_2$, then we have a tensor product representation $\iota_1 \otimes \iota_2$ of $GL(V_1) \times GL(V_2)$ in $V_1 \otimes V_2$. This is usually not a faithful representation. Using the $GL(V_1) \times GL(V_2)$-valued cocycle $p \mapsto h_{\alpha\beta}(p) := (\Phi_{\alpha\beta}(p), \Psi_{\alpha\beta}(p))$, together with $\iota_1 \otimes \iota_2$, we see that by definition
Furthermore, if $V_1 = V_2 = V$, then the tensor product representation is usually defined as a representation of $GL(V)$ rather than $GL(V) \times GL(V)$, and so $E_1 \otimes E_2$ would have a $(GL(V), \iota \otimes \iota)$-bundle structure where $\iota$ is the standard representation. In this case $\iota \otimes \iota$ is still not a faithful representation since $-\mathrm{id}_V$ is in the kernel. We can reconstruct the same vector bundle using any of these representation-cocycle pairs via Lemma 6.30.

In fact, it is quite common that we have different representations by one group. Suppose that we have two faithful representations $\lambda_1$ and $\lambda_2$ of a Lie group $G$ acting on $V_1$ and $V_2$ respectively. If $\{g_{\alpha\beta}\}$ is a cocycle of transition maps, then we can use the pair $(\{g_{\alpha\beta}\}, \lambda_1)$ in Lemma 6.30 to form a vector bundle $E_1$ that has a $(G, \lambda_1)$-bundle structure by construction. Similarly, we can construct a vector bundle $E_2$ with $(G, \lambda_2)$-bundle structure. If we use $\lambda_1 \otimes \lambda_2$ and the same cocycle $\{g_{\alpha\beta}\}$, then we obtain a bundle which, as a vector bundle, is $E_1 \otimes E_2$. But by construction, it has a $(G, \lambda_1 \otimes \lambda_2)$-structure (possibly ineffective). This is the case in the following exercise:

Exercise 6.50. Suppose that $E$ is a vector bundle with a $(G, \lambda)$-bundle structure given by a $(G, \lambda)$-atlas with a corresponding cocycle of transition functions. Show how one may use Lemma 6.30 to construct bundles isomorphic to $E^*$, $E \otimes E$ and $E \otimes E^*$ which will have a $(G, \lambda^*)$-bundle structure, a $(G, \lambda \otimes \lambda)$-bundle structure and a $(G, \lambda \otimes \lambda^*)$-bundle structure respectively.
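Two claims from this section lend themselves to a quick matrix check: the tensor product transition maps $\Phi_{\alpha\beta}(p) \otimes \Psi_{\alpha\beta}(p)$ again satisfy the cocycle condition, and $-\mathrm{id}_V \otimes -\mathrm{id}_V$ is the identity, so the tensor product representation is not faithful. The sketch below is an illustration added here, with the tensor product of linear maps modeled by the Kronecker product of matrices; the sample matrices and helper names are hypothetical.

```python
# Illustrative sketch: all matrices below are hypothetical sample values.

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product: the matrix of a (x) b in the basis e_i (x) f_k."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Sample values of two cocycles at a point p of a triple overlap.
phi_ab, phi_bc = [[1, 2], [0, 1]], [[2, 0], [1, 1]]
psi_ab, psi_bc = [[0, 1], [-1, 0]], [[1, 1], [0, 1]]

# Cocycle condition for the tensor product transition maps:
# (phi_ab (x) psi_ab)(phi_bc (x) psi_bc) = (phi_ab phi_bc) (x) (psi_ab psi_bc).
lhs = matmul(kron(phi_ab, psi_ab), kron(phi_bc, psi_bc))
rhs = kron(matmul(phi_ab, phi_bc), matmul(psi_ab, psi_bc))
cocycle_ok = lhs == rhs

# (-id) (x) (-id) is the identity, so g -> g (x) g has -id in its kernel.
minus_id = [[-1, 0], [0, -1]]
kernel_ok = kron(minus_id, minus_id) == identity(4)

print(cocycle_ok, kernel_ok)
```

Integer entries are used so that both comparisons are exact rather than subject to floating-point rounding.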
6.4. Smooth Functors

We have seen that various new vector bundles can be constructed starting with one or more vector bundles. Most of the operations of linear algebra extend to the vector bundle category. We can unify our thinking on these matters by introducing the notion of a $C^\infty$ functor (or smooth functor). With $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$, the set of all $\mathbb{F}$-vector spaces together with linear maps is a category that we denote by $\mathrm{Lin}(\mathbb{F})$. The set of morphisms from $V$ to $W$ is the space of $\mathbb{F}$-linear maps $L(V, W)$ (also denoted $\mathrm{Hom}(V, W)$).

Definition 6.51. A covariant $C^\infty$ functor $\mathcal{F}$ of one variable on $\mathrm{Lin}(\mathbb{F})$ consists of a map, denoted again by $\mathcal{F}$, that assigns to every $\mathbb{F}$-vector space $V$ an $\mathbb{F}$-vector space $\mathcal{F}V$, and a map, also denoted by $\mathcal{F}$, which assigns to every linear map $A \in L(V, W)$ a linear map $\mathcal{F}A \in L(\mathcal{F}V, \mathcal{F}W)$ such that

(i) $\mathcal{F} : L(V, W) \to L(\mathcal{F}V, \mathcal{F}W)$ is smooth;

(ii) $\mathcal{F}(\mathrm{id}_V) = \mathrm{id}_{\mathcal{F}V}$ for all $\mathbb{F}$-vector spaces $V$;

(iii) $\mathcal{F}(A \circ B) = \mathcal{F}A \circ \mathcal{F}B$ for all $A \in L(V, W)$ and $B \in L(U, V)$ and vector spaces $U$, $V$ and $W$.

As an example we have the $C^\infty$ functor which assigns to each $V$ the $k$-fold direct sum $\bigoplus^k V = V \oplus \cdots \oplus V$ and to each linear map $A \in L(V, W)$
the map
$$\bigoplus^k A : \bigoplus^k V \to \bigoplus^k W$$
given by $\bigoplus^k A(v_1, \dots, v_k) := (Av_1, \dots, Av_k)$. Similarly there is the functor which assigns to each $V$ the $k$-fold tensor product $\bigotimes^k V = V \otimes \cdots \otimes V$ and to each $A \in L(V, W)$ the map $\bigotimes^k A : \bigotimes^k V \to \bigotimes^k W$ given on homogeneous elements by $(\bigotimes^k A)(v_1 \otimes \cdots \otimes v_k) := Av_1 \otimes \cdots \otimes Av_k$.

One can also consider $C^\infty$ covariant functors of several variables. For example, we may assign to each pair of vector spaces $(V, W)$ the tensor product $V \otimes W$, and to each pair $(A, B) \in L(V, V') \times L(W, W')$, the map $A \otimes B : V \otimes W \to V' \otimes W'$. There is also a similar notion of contravariant $C^\infty$ functor:

Definition 6.52. A contravariant $C^\infty$ functor $\mathcal{F}$ of one variable on $\mathrm{Lin}(\mathbb{F})$ consists of a map, denoted again by $\mathcal{F}$, which assigns to every $\mathbb{F}$-vector space $V$ an $\mathbb{F}$-vector space $\mathcal{F}V$, and a map, also denoted by $\mathcal{F}$, which assigns to every linear map $A \in L(V, W)$ a linear map $\mathcal{F}A \in L(\mathcal{F}W, \mathcal{F}V)$ (notice the reversal) such that
(i) $\mathcal{F} : L(V, W) \to L(\mathcal{F}W, \mathcal{F}V)$ is smooth;

(ii) $\mathcal{F}(\mathrm{id}_V) = \mathrm{id}_{\mathcal{F}V}$ for all $\mathbb{F}$-vector spaces $V$;

(iii) $\mathcal{F}(A \circ B) = \mathcal{F}B \circ \mathcal{F}A$ for all $A \in L(V, W)$ and $B \in L(U, V)$ and vector spaces $U$, $V$ and $W$.

The map that assigns to each vector space its dual and to each map its dual map (transpose) is a contravariant $C^\infty$ functor. One may define the notion of a $C^\infty$ functor of several variables which may be covariant in some variables and contravariant in others. For example, consider the functor of two variables that assigns to each pair $(V, W)$ the space $V \otimes W^*$ and to each pair $(A, B) \in L(V_1, V_2) \times L(W_1, W_2)$ the map $A \otimes B^* : V_1 \otimes W_2^* \to V_2 \otimes W_1^*$.

Theorem 6.53. Let $\mathcal{F}$ be a $C^\infty$ functor of $m$ variables on $\mathrm{Lin}(\mathbb{F})$ and let $E_1, \dots, E_m$ be $\mathbb{F}$-vector bundles with respective typical fibers $V_1, \dots, V_m$. Then the set
$$E := \mathcal{F}(E_1, \dots, E_m) := \bigcup_{p} \mathcal{F}(E_1|_p, \dots, E_m|_p)$$
together with the map $\pi : E \to M$ which takes elements of $\mathcal{F}(E_1|_p, \dots, E_m|_p)$ to $p$ is naturally a vector bundle with typical fiber $\mathcal{F}(V_1, \dots, V_m)$.
Proof. We will only prove the case of m = 2 with covariant first variable and contravariant second variable. This should make it clear how the general case would go while keeping the notational complexity under control.
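Given vector bundles $\pi_1 : E_1 \to M$ and $\pi_2 : E_2 \to M$, the total space of the constructed bundle is $\bigcup_p \mathcal{F}(E_1|_p, E_2|_p)$ with the obvious projection, which we call $\pi$. Let $\{(\phi_\alpha, U_\alpha)\}$ be a VB-atlas for $E_1$ and $\{(\psi_\alpha, U_\alpha)\}$ a VB-atlas for $E_2$ (we have arranged that both atlases use the same cover by going to a common refinement as usual). For each $p$, let $E_p$ denote the fiber $\mathcal{F}(E_1|_p, E_2|_p)$. Fix $\alpha$ and for each $p \in U_\alpha$ define $\Theta_\alpha|_p \in L(E_p, \mathcal{F}(V_1, V_2))$ by $\Theta_\alpha|_p := \mathcal{F}(\Phi_\alpha|_p, \Psi_\alpha|_p^{-1})$, where $\phi_\alpha = (\pi_1, \Phi_\alpha)$ and $\psi_\alpha = (\pi_2, \Psi_\alpha)$. Then define $\Theta_\alpha : \pi^{-1}(U_\alpha) \to \mathcal{F}(V_1, V_2)$ by $\Theta_\alpha(\epsilon) = \Theta_\alpha|_p(\epsilon)$ whenever $\epsilon \in \mathcal{F}(E_1|_p, E_2|_p)$. Next define
$$\theta_\alpha = (\pi, \Theta_\alpha) : \pi^{-1}U_\alpha \to U_\alpha \times \mathcal{F}(V_1, V_2).$$
The family $\{(\theta_\alpha, U_\alpha)\}$ is to be a VB-atlas for $E$. We check the transition maps:
$$\begin{aligned}
\Theta_{\alpha\beta}(p) = \Theta_\alpha|_p \circ \Theta_\beta|_p^{-1}
&= \mathcal{F}(\Phi_\alpha|_p, \Psi_\alpha|_p^{-1}) \circ \mathcal{F}(\Phi_\beta|_p, \Psi_\beta|_p^{-1})^{-1} \\
&= \mathcal{F}(\Phi_\alpha|_p, \Psi_\alpha|_p^{-1}) \circ \mathcal{F}(\Phi_\beta|_p^{-1}, \Psi_\beta|_p) \\
&= \mathcal{F}(\Phi_\alpha|_p \circ \Phi_\beta|_p^{-1},\; \Psi_\beta|_p \circ \Psi_\alpha|_p^{-1}) \\
&= \mathcal{F}(\Phi_{\alpha\beta}(p), \Psi_{\beta\alpha}(p)).
\end{aligned}$$
(Remember that the functor is contravariant in the second variable.) Now we can see from the properties of $\Phi_{\alpha\beta}$, $\Psi_{\beta\alpha}$ and the definition of $C^\infty$ functor that $\mathcal{F}(\Phi_{\alpha\beta}(p), \Psi_{\beta\alpha}(p)) \in GL(\mathcal{F}(V_1, V_2))$ and the maps $\Theta_{\alpha\beta} : U_\alpha \cap U_\beta \to GL(\mathcal{F}(V_1, V_2))$ are smooth. $\square$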
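The order reversal in the second slot of the transition map computation can be checked on a concrete mixed functor. The sketch below is an added illustration, not the author's construction: it assumes the model $\mathcal{F}(V, W) = \mathrm{Hom}(W, V) \cong V \otimes W^*$, with $\mathcal{F}(A, B)(T) = A \circ T \circ B$, which is covariant in the first variable and contravariant in the second. The matrices and helper names are hypothetical, and exact rational arithmetic is used so the comparison is not affected by rounding.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inv2(m):
    """Exact inverse of an invertible 2x2 integer matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def F(a, b):
    """Action on morphisms of the assumed model F(V, W) = Hom(W, V)."""
    return lambda t: matmul(matmul(a, t), b)

# Hypothetical values of Phi_alpha|_p, Phi_beta|_p, Psi_alpha|_p, Psi_beta|_p.
phi_a, phi_b = [[1, 1], [0, 1]], [[2, 1], [1, 1]]
psi_a, psi_b = [[1, 0], [3, 1]], [[0, -1], [1, 2]]

theta_a = F(phi_a, inv2(psi_a))        # Theta_alpha|_p
theta_b_inv = F(inv2(phi_b), psi_b)    # inverse of Theta_beta|_p

phi_ab = matmul(phi_a, inv2(phi_b))    # Phi_{alpha beta}(p)
psi_ba = matmul(psi_b, inv2(psi_a))    # Psi_{beta alpha}(p): note the order

t = [[1, 2], [3, 4]]                   # a sample element of F(V_1, V_2)
lhs = theta_a(theta_b_inv(t))
rhs = F(phi_ab, psi_ba)(t)
print(lhs == rhs)
```

Replacing `psi_ba` by $\Psi_{\alpha\beta}(p)$ would in general break the equality, which is the content of the parenthetical reminder about contravariance.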
6.5. Hom

Let $\xi_1 := (E_1, \pi_1, M, V)$ and $\xi_2 := (E_2, \pi_2, M, V)$ be smooth $\mathbb{F}$-vector bundles. The bundle whose fiber over $p \in M$ is $L(E_{1p}, E_{2p}) = \mathrm{Hom}(E_{1p}, E_{2p})$ is denoted by $\mathrm{Hom}(\xi_1, \xi_2)$ or, less precisely, by referring to the total space $\mathrm{Hom}(E_1, E_2)$. Here $\mathrm{Hom}(E_{1p}, E_{2p})$ denotes $\mathbb{F}$-linear maps. If $f : E_1 \to E_2$ is a vector bundle homomorphism over $M$, then we may obtain a section $s$ of $\mathrm{Hom}(E_1, E_2)$ by defining $s : p \mapsto f|_{E_{1p}}$. Conversely, given $s \in \Gamma(\mathrm{Hom}(E_1, E_2))$ we define $f : E_1 \to E_2$ by requiring that $f|_{E_{1p}} = s(p)$. Thus every element of $\Gamma(\mathrm{Hom}(E_1, E_2))$ can be identified with a vector bundle homomorphism over $M$.

Exercise 6.54. Let $E_1 \to M_1$ and $E_2 \to M_2$ be smooth vector bundles. Show that the set of vector bundle homomorphisms along a smooth
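map $g : M_1 \to M_2$ is in natural bijection with the sections of the bundle $\mathrm{Hom}(E_1, g^*E_2)$.

Since $\Gamma(E_1)$ and $\Gamma(E_2)$ are $C^\infty(M, \mathbb{F})$-modules, we can look at the $C^\infty(M, \mathbb{F})$-module $\mathrm{Hom}(\Gamma(E_1), \Gamma(E_2))$. Then we have

Proposition 6.55. Let $E_1 \to M$ and $E_2 \to M$ be smooth $\mathbb{F}$-vector bundles. Then $\Gamma(\mathrm{Hom}(E_1, E_2))$ and $\mathrm{Hom}(\Gamma(E_1), \Gamma(E_2))$ are naturally isomorphic as $C^\infty(M, \mathbb{F})$-modules.

Proof. To each section $s \in \Gamma(\mathrm{Hom}(E_1, E_2))$ we assign a map $\phi_s : \Gamma(E_1) \to \Gamma(E_2)$ defined by the formula
$$\phi_s(\sigma)(p) = s(p)\sigma(p) \quad \text{for } \sigma \in \Gamma(E_1).$$
Then we obtain a map $s \mapsto \phi_s$ from $\Gamma(\mathrm{Hom}(E_1, E_2))$ to $\mathrm{Hom}(\Gamma(E_1), \Gamma(E_2))$. Conversely, given $\phi \in \mathrm{Hom}(\Gamma(E_1), \Gamma(E_2))$, we define a section $s$ by
$$s(p)(v_p) := \phi(\sigma)(p) \quad \text{for } v_p \in E_{1p},$$
where $\sigma$ is any section in $\Gamma(E_1)$ such that $\sigma(p) = v_p$. We need to show that this is well-defined. It suffices to show that if $\sigma(p) = 0$ then $\phi(\sigma)(p) = 0$. Let $(X_1, \dots, X_k)$ be a local frame field for $E_1$ defined over an open set $U$ containing $p$. Choose $g \in C^\infty(M)$ with support in $U$ and $g(p) = 1$. Define fields $\tilde{X}_i := gX_i$ and extend by zero outside of $U$. Then there exist functions $f^i$ such that
$$g\sigma = \sum_{i=1}^k f^i \tilde{X}_i,$$
and since $\sigma(p) = 0$, we must have $f^i(p) = 0$ for all $i$. Then we have
$$\begin{aligned}
(\phi\sigma)(p) &= g(p)(\phi(\sigma))(p) = (g\phi(\sigma))(p) = \phi(g\sigma)(p) = \phi\Big(\sum_{i=1}^k f^i \tilde{X}_i\Big)(p) \\
&= \sum_{i=1}^k \big(f^i \phi(\tilde{X}_i)\big)(p) = \sum_{i=1}^k f^i(p)\,\phi(\tilde{X}_i)(p) = 0.
\end{aligned}$$
The constructed map is easily checked to be the inverse of the map $s \mapsto \phi_s$.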
If $(\sigma_1, \dots, \sigma_{k_1})$ is a local frame field for $E_1 \to M$ over an open set $U$, and if (
$$(\tilde{\sigma}_1, \dots, \tilde{\sigma}_{k_1}) = (\sigma_1, \dots, \sigma_{k_1})C,$$
then $\tilde{A}^i_j = \sum_{r,s} (D^{-1})^i_r A^r_s C^s_j$ on $U \cap V$. Of special importance is $\mathrm{End}(E) := \mathrm{Hom}(E, E) \to M$ for a given vector bundle $E \to M$. Here if $(\tilde{\sigma}_1, \dots, \tilde{\sigma}_k) = (\sigma_1, \dots, \sigma_k)C$ is a change of frame field, then we take
$$\tilde{A}^i_j = \sum_{r,s} (C^{-1})^i_r A^r_s C^s_j \tag{6.2}$$
which is the signature transformation law for sections of $\mathrm{End}(E)$. This bundle is important in the study of covariant derivatives and curvature. Notice how everything is formally similar to operations in linear algebra, but here we are dealing with fields and functions (often only locally defined).
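The transformation law $\tilde{A} = D^{-1} A C$ for the components of a section of $\mathrm{Hom}(E_1, E_2)$, and its special case $\tilde{A} = C^{-1} A C$ for $\mathrm{End}(E)$, can be checked at a single point with matrices. The sketch below is an added illustration under stated assumptions: frames are stored as matrices whose columns are the frame vectors in some ambient coordinates, and every numerical value and helper name is hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inv2(m):
    """Exact inverse of an invertible 2x2 integer matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

# Columns of S and T are the frame vectors sigma_j and e_i (sample values).
S = [[1, 1], [0, 1]]     # frame for the fiber of E_1 at p
T = [[2, 1], [1, 1]]     # frame for the fiber of E_2 at p
A = [[1, 2], [3, 4]]     # components A^i_j with respect to these frames
C = [[1, 0], [2, 1]]     # change of frame: sigma~ = sigma C
D = [[0, 1], [-1, 1]]    # change of frame: e~ = e D

A_new = matmul(matmul(inv2(D), A), C)   # A~ = D^{-1} A C

# The underlying linear map, in ambient coordinates, must be unchanged.
L_old = matmul(matmul(T, A), inv2(S))
L_new = matmul(matmul(matmul(T, D), A_new), inv2(matmul(S, C)))

# End(E) case: same frame change on both sides, so A~ = C^{-1} A C.
A_end = matmul(matmul(inv2(C), A), C)
trace_old = A[0][0] + A[1][1]
trace_new = A_end[0][0] + A_end[1][1]
print(L_old == L_new, trace_old == trace_new)
```

In the $\mathrm{End}(E)$ case the components transform by conjugation, so frame-independent quantities such as the trace are well-defined functions on the base.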
6.6. Algebra Bundles

Let $\mathbb{F}$ be $\mathbb{C}$ or $\mathbb{R}$. Recall that an $\mathbb{F}$-algebra is a vector space $V$ with a bilinear map $V \times V \to V$ giving a product on $V$. Such a bilinear map is uniquely specified by the associated linear map $V \otimes V \to V$ (see Definition D.17).

Definition 6.56. Let $V$ be an $\mathbb{F}$-algebra. An $\mathbb{F}$-vector bundle $\xi = (E, \pi, M, V)$ is called an $\mathbb{F}$-algebra bundle if each fiber $E_p$ has an $\mathbb{F}$-algebra structure in such a way that the associated maps $E_p \otimes E_p \to E_p$ combine to give a vector bundle homomorphism $E \otimes E \to E$ and $\xi$ has a VB-atlas $\{(U_\alpha,$

Example 6.57. The endomorphism bundle $\mathrm{End}(E) \to M$ of a vector bundle $\xi = (E, \pi, M, V)$ is the bundle whose fiber at $p$ is $\mathrm{End}(E_p) = \mathrm{Hom}(E_p, E_p)$. The space $\mathrm{End}(E_p)$ has an $\mathbb{F}$-algebra structure given by composition of linear maps.
Example 6.58. Let $\xi = (E, \pi, M, V)$ be an $\mathbb{F}$-vector bundle. The corresponding general linear Lie algebra bundle is the bundle whose fiber at $p$ is $L(E_p, E_p)$ but with the product being $(f, g) \mapsto [f, g] := f \circ g - g \circ f$. When given this product, the fiber is denoted $\mathfrak{gl}(E_p)$ and is the Lie algebra of $GL(E_p)$. The total space of the general linear Lie algebra bundle is then denoted by $\mathfrak{gl}(E)$. This provides an example of a Lie algebra bundle.

Recall that when a direct sum has an infinite number of summands, we require that each element have finite support. In other words, $\bigoplus_{i=1}^\infty V_i$ is the vector space of all formal sums $\sum_{i=1}^\infty v_i$, where $v_i \in V_i$ and all but finitely many of the $v_i$ are zero.
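Returning to the product of Example 6.58, the commutator $[f, g] = f \circ g - g \circ f$ is what turns each fiber $L(E_p, E_p)$ into a Lie algebra: it is antisymmetric and satisfies the Jacobi identity. A small fiberwise check, added here as an illustration with hypothetical sample matrices and helper names, might look as follows.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def bracket(f, g):
    """The product on gl(E_p): [f, g] = f o g - g o f."""
    return sub(matmul(f, g), matmul(g, f))

# Three hypothetical endomorphisms of a 2-dimensional fiber E_p.
f = [[1, 2], [0, 1]]
g = [[0, 1], [1, 0]]
h = [[2, 0], [1, 3]]
zero = [[0, 0], [0, 0]]

antisymmetric = add(bracket(f, g), bracket(g, f)) == zero
jacobi = add(add(bracket(f, bracket(g, h)),
                 bracket(g, bracket(h, f))),
             bracket(h, bracket(f, g))) == zero
print(antisymmetric, jacobi)
```

Unlike composition in $\mathrm{End}(E_p)$, this product is not associative, which is why $\mathfrak{gl}(E)$ is treated as a different algebra bundle on the same underlying vector bundle.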
Example 6.59. The definition of tensor algebra is given as Definition D.46 of Appendix D. Let $\xi = (E, \pi, M, V)$ be an $\mathbb{F}$-vector bundle and consider the bundle $\bigotimes E \to M$ whose fiber at $p$ is the $\mathbb{F}$-tensor algebra $\bigotimes E_p = \mathbb{F} \oplus E_p \oplus (E_p \otimes E_p) \oplus \cdots$. If we consider the disjoint union
$$\bigotimes E = \bigsqcup_{p \in M} \bigotimes E_p,$$
with the obvious projection $\bigotimes E \to M$, then it seems that we have a bundle with algebra fibers. However, each fiber $\bigotimes E_p$ is infinite-dimensional. On the other hand, we do at least have a nested sequence of vector bundles
$$\bigotimes^{\le k} E = \mathbb{F} \oplus E \oplus (E \otimes E) \oplus \cdots \oplus \Big(\bigotimes^k E\Big).$$

6.7. Sheaves

Let $E \to M$ be a vector bundle. We have seen that $\Gamma(M, E)$ is a module over the smooth functions $C^\infty(M)$. It is important to realize that having a vector bundle at hand not only provides a module, but a family of modules $\{\Gamma(U, E)\}$ parametrized by the open subsets $U$ of $M$. How are these modules related to each other? Consider a section $\sigma : M \to E$. Given any open set $U \subset M$, we may always produce the restricted section $\sigma|_U : U \to E$. This gives us a family of sections, one for each open set $U$. Conversely, suppose that we have a family of sections $\sigma_U : U \to E$ where $U$ varies over the open sets (or just a cover of $M$). When is it the case that such a family is just the family of restrictions of some section $\sigma : M \to E$? To help with these kinds of questions, and to provide a language that will occasionally be convenient, we will introduce another formalism. This is the formalism of sheaves and presheaves. The formalism of sheaf theory is used for studying the interplay between the local and the global, for example, using sheaf cohomology theory. It is especially
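useful in complex geometry. Sheaf theory also provides a very good framework within which to develop the foundations of supergeometry, which is an extension of differential geometry that incorporates the important notion of "fermionic variables". A deep understanding of sheaf theory is not necessary for what we do here, and it would be enough to acquire a basic familiarity with the definitions since we only want the convenience of the language.

Definition 6.60. A presheaf of abelian groups (resp. rings, etc.) on a manifold (or more generally a topological space) $M$ assigns to each open set $U \subset M$ an abelian group (resp. ring, etc.) $\mathcal{M}(U)$ and assigns to each nested pair $V \subset U$ of open sets a homomorphism $r^U_V : \mathcal{M}(U) \to \mathcal{M}(V)$ of abelian groups (resp. rings, etc.) such that

(i) $r^V_W \circ r^U_V = r^U_W$ whenever $W \subset V \subset U$;

(ii) $r^V_V = \mathrm{id}_{\mathcal{M}(V)}$ for all open $V \subset M$.

Definition 6.61. Let $\mathcal{M}$ be a presheaf and $\mathcal{R}$ a presheaf of rings over $M$. If for each open $U \subset M$ we have that $\mathcal{M}(U)$ is a module over the ring $\mathcal{R}(U)$, and if multiplication commutes with restriction, that is, if the following diagram commutes for each nested pair $V \subset U$,
$$\begin{array}{ccc}
\mathcal{R}(U) \times \mathcal{M}(U) & \longrightarrow & \mathcal{M}(U) \\
{\scriptstyle r^U_V \times r^U_V}\downarrow & & \downarrow{\scriptstyle r^U_V} \\
\mathcal{R}(V) \times \mathcal{M}(V) & \longrightarrow & \mathcal{M}(V)
\end{array}$$
then we say that $\mathcal{M}$ is a presheaf of modules over $\mathcal{R}$.

Definition 6.62. Let $\mathcal{M}_1$ and $\mathcal{M}_2$ be presheaves over $M$. A presheaf morphism $h : \mathcal{M}_1 \to \mathcal{M}_2$ over $M$ is a collection of morphisms $h_U : \mathcal{M}_1(U) \to \mathcal{M}_2(U)$, one for each open set, such that whenever $V \subset U$, the following diagram commutes:
$$\begin{array}{ccc}
\mathcal{M}_1(U) & \xrightarrow{\;h_U\;} & \mathcal{M}_2(U) \\
{\scriptstyle r^U_V}\downarrow & & \downarrow{\scriptstyle r^U_V} \\
\mathcal{M}_1(V) & \xrightarrow{\;h_V\;} & \mathcal{M}_2(V)
\end{array}$$
Note that we have used the same notation for the restriction maps of both presheaves.

Definition 6.63. Let $\mathcal{M}$ be a presheaf. A family $\{s_\alpha\}$ with $s_\alpha \in \mathcal{M}(U_\alpha)$ is called consistent if $r^{U_\alpha}_{U_\alpha \cap U_\beta} s_\alpha = r^{U_\beta}_{U_\alpha \cap U_\beta} s_\beta$ whenever $U_\alpha \cap U_\beta \neq \emptyset$.

Definition 6.64. We will call a presheaf $\mathcal{M}$ a sheaf if the following properties hold whenever $U = \bigcup_{U_\alpha \in \mathcal{U}} U_\alpha$ for some collection of open sets $\mathcal{U}$.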
(i) If $s_1, s_2 \in \mathcal{M}(U)$ and $r^U_{U_\alpha} s_1 = r^U_{U_\alpha} s_2$ for all $U_\alpha \in \mathcal{U}$, then $s_1 = s_2$.

(ii) Given a consistent family $\{s_\alpha : s_\alpha \in \mathcal{M}(U_\alpha)\}$, there exists $s \in \mathcal{M}(U)$ such that $r^U_{U_\alpha} s = s_\alpha$.

(iii) $\mathcal{M}(\emptyset)$ is the trivial group (resp. ring, etc.).

If we need to indicate the space $M$ involved, we will write $\mathcal{M}_M$ instead of $\mathcal{M}$.

Definition 6.65. A morphism of sheaves is a morphism of the underlying presheaf.

The assignment $C^\infty(\cdot) : U \mapsto C^\infty(U)$ is a sheaf of rings. This sheaf will also be denoted by $C^\infty_M$. The best and most important example of a sheaf of modules over $C^\infty_M$ is the assignment $\Gamma(\cdot, E) : U \mapsto \Gamma(U, E)$ for some vector bundle $E \to M$, where by definition $r^U_V(s) = s|_V$ for $s \in \Gamma(U, E)$. In other words, $r^U_V$ is just the restriction map. Let us denote this (pre)sheaf by $\Gamma_E : U \mapsto \Gamma_E(U) := \Gamma(U, E)$.

Exercise 6.66. For each open set $U$ in a manifold $M$, let $B(U)$ denote the ring of bounded smooth functions defined on $U$. Show that $U \mapsto B(U)$ defines a presheaf that is not a sheaf.

Many if not most of the constructions and operations we introduce for sections of vector bundles are really also operations appropriate to the (pre)sheaf category. Naturality with respect to restrictions is one of the features that is often not even mentioned (precisely because it seems obvious). This is the inspiration for a slight twist on our notation.
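                       Global               Local                Sheaf
Functions on M         $C^\infty(M)$        $C^\infty(U)$        $C^\infty_M$
Vector fields on M     $\mathfrak{X}(M)$    $\mathfrak{X}(U)$    $\mathfrak{X}_M$
Sections of E          $\Gamma(E)$          $\Gamma(U, E)$       $\Gamma_E$

where $C^\infty_M : U \mapsto C^\infty_M(U) := C^\infty(U)$, $\mathfrak{X}_M : U \mapsto \mathfrak{X}_M(U) := \mathfrak{X}(U)$ and so on.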
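Notation 6.67. For example, when we say that $D : C^\infty_M \to C^\infty_M$ is a derivation we mean that $D$ is actually a family of algebra derivations $D_U : C^\infty_M(U) \to C^\infty_M(U)$ indexed by open sets $U$ such that we have naturality with respect to restrictions; i.e., diagrams of the form below for $V \subset U$ commute:
$$\begin{array}{ccc}
C^\infty_M(U) & \xrightarrow{\;D_U\;} & C^\infty_M(U) \\
{\scriptstyle r^U_V}\downarrow & & \downarrow{\scriptstyle r^U_V} \\
C^\infty_M(V) & \xrightarrow{\;D_V\;} & C^\infty_M(V)
\end{array}$$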
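The two sheaf conditions of Definition 6.64, uniqueness and gluing of consistent families, can be made concrete in a toy model. The sketch below is an illustration added here and makes strong simplifying assumptions: the "space" is a finite set of points, every subset is treated as open, and a section over $U$ is simply a function on $U$ stored as a dictionary. The names `restrict`, `is_consistent` and `glue` are hypothetical.

```python
def restrict(section, subset):
    """Restriction map r^U_V: keep only the values at points of V."""
    return {x: section[x] for x in subset}

def is_consistent(family):
    """family is a list of (open_set, section) pairs.

    The family is consistent when any two members agree on the overlap
    of their domains, as in Definition 6.63.
    """
    return all(restrict(s, u & v) == restrict(t, u & v)
               for u, s in family for v, t in family)

def glue(family):
    """Return the section on the union that restricts to each member."""
    if not is_consistent(family):
        raise ValueError("family is not consistent on overlaps")
    glued = {}
    for _, s in family:
        glued.update(s)
    return glued

U1, U2 = frozenset({0, 1}), frozenset({1, 2})
s1, s2 = {0: 10, 1: 20}, {1: 20, 2: 30}

s = glue([(U1, s1), (U2, s2)])
print(s, restrict(s, U1) == s1, restrict(s, U2) == s2)
```

In this model the glued section is forced by its restrictions, which is condition (i); a presheaf defined by a global constraint such as boundedness can fail condition (ii) because the glued function need not satisfy the constraint.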
It is easy to see that all of the following examples are sheaves. In each case, the maps $r^U_V$ are just the restriction maps.

Example 6.68 (Sheaf of holomorphic functions). Sheaf theory really shows its strength in complex analysis. This example is one of the most studied. However, we have not studied the notion of a complex manifold, and so this example is for those readers with some exposure to complex manifolds (see the online supplement). Let $M$ be a complex manifold and let $\mathcal{O}_M(U)$ be the algebra of holomorphic functions defined on $U$. Here too, $\mathcal{O}_M$ is a sheaf of modules over itself. Whereas the sheaf $C^\infty_M$ always has an abundance of global sections, the same is not true for $\mathcal{O}_M$: on a compact connected complex manifold, for instance, the only global holomorphic functions are the constants. The sheaf-theoretic approach to the study of obstructions to the existence of global holomorphic functions has been very successful.

For a bit more on sheaves, see the online supplement [Lee, Jeff].
6.8. Principal and Associated Bundles

Let $\pi : E \to M$ be a vector bundle with typical fiber $V$ and for every $p \in M$ let $GL(V, E_p)$ denote the set of linear isomorphisms from $V$ to $E_p$. If we choose a fixed basis $(e_1, \dots, e_k)$ for $V$, then each frame $(u_1, \dots, u_k)$ at $p$ gives an element $u \in GL(V, E_p)$ defined by
$$u(v) := \sum_i v^i u_i,$$
where $v = \sum_i v^i e_i$. We identify $u$ with $(u_1, \dots, u_k)$ and refer to it as a frame. With this identification, notice that if $\sigma_\alpha := \sigma_{\phi_\alpha}$ is the local frame field coming from a VB-chart $(U_\alpha, \phi_\alpha)$ as described above, then we have $\sigma_\alpha(p) = \Phi_\alpha|_{E_p}^{-1}$ for $p \in U_\alpha$.
Now let
$$F(E) := \bigsqcup_{p \in M} GL(V, E_p) \quad \text{(disjoint union).}$$
It will shortly be clear that $F(E)$ is a smooth manifold and the total space of a fiber bundle. Let $\wp : F(E) \to M$ be the projection map defined by $\wp(u) = p$ for $u \in GL(V, E_p)$. Observe that $GL(V)$ acts on the right of the set $F(E)$; the action $F(E) \times GL(V) \to F(E)$ is given by $r : (u, g) \mapsto ug = u \circ g$. If we pick a fixed basis for $V$ as above, then we may view $g$ as a matrix and an element $u \in GL(V, E_p)$ as a basis $(u_1, \dots, u_k)$. In this case, we have
$$ug = \Big(\sum_i u_i g^i_1, \dots, \sum_i u_i g^i_k\Big).$$
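The formula $ug = (\sum_i u_i g^i_1, \dots, \sum_i u_i g^i_k)$ can be checked to define a right action, meaning $(ug)h = u(gh)$ and $u \cdot \mathrm{id} = u$. The sketch below is an added illustration: a frame is stored as a list of $k$ coordinate vectors, the group element as a $k \times k$ matrix of rows, and all sample values and helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def act(u, g):
    """Right action of a matrix g on a frame u = (u_1, ..., u_k).

    The j-th vector of the new frame is sum_i u_i * g[i][j].
    """
    k, n = len(u), len(u[0])
    return [[sum(u[i][c] * g[i][j] for i in range(k)) for c in range(n)]
            for j in range(k)]

u = [[1, 0], [1, 1]]       # a frame: u_1 = (1, 0), u_2 = (1, 1)
g = [[2, 1], [1, 1]]       # an element of GL(2)
h = [[0, 1], [-1, 3]]      # another element of GL(2)
identity = [[1, 0], [0, 1]]

right_action = act(act(u, g), h) == act(u, matmul(g, h))
unit = act(u, identity) == u
print(right_action, unit, act(u, g))
```

Because a frame is a basis, $ug = u$ forces $g = \mathrm{id}$, which is the freeness of the action.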
It is easy to see that the orbit of a frame at $p$ is exactly the set $\wp^{-1}(p) = GL(V, E_p)$ and that the action is free. For each VB-chart $(U, \phi)$ for $E$, let $\sigma_\phi$ be the associated frame field. Define $f_\phi : U \times GL(V) \to \wp^{-1}(U)$
by $f_\phi(p, g) = \sigma_\phi(p)g$. It is easy to check that this is a bijection. Let $\tilde{\phi} : \wp^{-1}(U) \to U \times GL(V)$ be the inverse of this map. We have $\tilde{\phi} = (\wp, \tilde{\Phi})$, where $\tilde{\Phi}$ is uniquely determined by $\phi$. Starting with a VB-atlas $\{(U_\alpha, \phi_\alpha)\}$ for $E$, we obtain a family $\{\tilde{\phi}_\alpha : \wp^{-1}(U_\alpha) \to U_\alpha \times GL(V)\}$ of trivializations which gives a fiber bundle atlas $\{(U_\alpha, \tilde{\phi}_\alpha)\}$ for $F(E) \to M$ and simultaneously induces the smooth structure.

Definition 6.69. Let $\pi : E \to M$ be a vector bundle with typical fiber $V$. The fiber bundle $(F(E), \wp, M, GL(V))$ constructed above is called the linear frame bundle of $E$ and is usually denoted simply by $F(E)$. The frame bundle for the tangent bundle of a manifold $M$ is often denoted by $F(M)$ rather than by $F(TM)$.

Notation 6.70. It would be perhaps more appropriate to refer to the frame bundle of a vector bundle $\xi = (E, \pi, M, V)$ as $F(\xi)$ since the notation $F(E)$ is inconsistent with the notation $F(M)$ above. After all, $E$ is itself a manifold. Despite this we will continue with the dangerous notation.
Notice that any VB-atlas for E induces an atlas on F(E) according to our considerations above. We have
$$\tilde{\phi}_\alpha \circ \tilde{\phi}_\beta^{-1}(p, g) = \tilde{\phi}_\alpha(\sigma_\beta(p)g) = \tilde{\phi}_\alpha\big(\Phi_\beta|_{E_p}^{-1} g\big) = (p, \Phi_{\alpha\beta}(p)g).$$
Thus the transition functions of $F(E)$ are given by the standard transition functions of $E$ acting by left multiplication on $GL(V)$. The cocycle corresponding to the bundle atlas for $F(E)$ that we constructed from a VB-atlas $\{(U_\alpha, \phi_\alpha)\}$ for the original vector bundle $E$ is the very cocycle $\Phi_{\alpha\beta}$ deriving from this atlas on $E$.

We need to make one more observation concerning the right action of $GL(V)$ on $F(E)$. Take a VB-chart for $E$, say $(U, \phi)$, and let us look again at the associated chart $(U, \tilde{\phi})$ for $F(E)$. First, consider the trivial bundle $\mathrm{pr}_1 : U \times GL(V) \to U$ and define the obvious right action on the total space $\tilde{\phi}(\wp^{-1}(U)) = U \times GL(V)$ by $((p, g_1), g) \mapsto (p, g_1 g) =: (p, g_1) \cdot g$. Then this action is transitive on the fibers of this trivial bundle. Of course, since $GL(V)$ acts on $F(E)$ and preserves fibers, it also acts by restriction on $\wp^{-1}(U)$.

Proposition 6.71. The bundle map $\tilde{\phi} : \wp^{-1}(U) \to U \times GL(V)$ is equivariant with respect to the right actions described above.

Proof. We first look at the inverse:
$$\tilde{\phi}^{-1}(p, g_1)g = \sigma_\phi(p) g_1 g = \tilde{\phi}^{-1}(p, g_1 g).$$
To see this from the point of view of $\tilde{\phi}$ rather than its inverse, take $u \in \wp^{-1}(U) \subset F(E)$ and let $(p, g_1)$ be the unique pair such that $u = \tilde{\phi}^{-1}(p, g_1)$.
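Then
$$\tilde{\phi}(ug) = \tilde{\phi}\big(\tilde{\phi}^{-1}(p, g_1)g\big) = \tilde{\phi}\tilde{\phi}^{-1}(p, g_1 g) = (p, g_1 g) = (p, g_1) \cdot g = \tilde{\phi}(u) \cdot g. \qquad \square$$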
A section of $F(E)$ over an open set $U$ in $M$ is just a frame field over $U$. A global frame field is a global section of $F(E)$, and clearly a global section exists if and only if $E$ is trivial. For a frame bundle $F(E)$, the following things stand out: The typical fiber is the structure group $GL(V)$, and we constructed an atlas which showed that $F(E)$ has a $GL(V)$-bundle structure where the action is left multiplication. Furthermore, there is a right action of $GL(V)$ on the total space $F(E)$ which has the fibers as orbits. The charts constructed were of the form $(U, \tilde{\phi})$, where $\tilde{\phi}$ is equivariant in the sense that $\tilde{\phi}(ug) = \tilde{\phi}(u)g$, where if $\tilde{\phi}(u) = (p, g_1)$, then $(p, g_1)g := (p, g_1 g)$ by definition. These facts motivate the concept of a principal bundle:

Definition 6.72. Let $\wp : P \to M$ be a smooth fiber bundle with typical fiber a Lie group $G$. The bundle $(P, \wp, M, G)$ is called a principal $G$-bundle if there is a smooth free right action of $G$ on $P$ such that

(i) The action preserves fibers; $\wp(ug) = \wp(u)$ for all $u \in P$ and $g \in G$.

(ii) For each $p \in M$, there exists a bundle chart $(U, \phi)$ with $p \in U$ and such that if $\phi = (\wp, \Phi)$, then
$$\Phi(ug) = \Phi(u)g$$
for all $u \in \wp^{-1}(U)$ and $g \in G$.

If the group $G$ is understood, then we may refer to $(P, \wp, M, G)$ simply as a principal bundle. Define a right action on $U \times G$ by $(p, g_1)g = (p, g_1 g)$. Then $U \times G \to U$ is a trivial principal bundle. If $\phi = (\wp, \Phi)$, then using this right action on $U \times G$ we have that $\phi(ug) = \phi(u)g$ if and only if $\Phi(ug) = \Phi(u)g$. Charts of the form described in (ii) of the definition are called principal bundle charts, and an atlas consisting of principal bundle charts is called a principal bundle atlas.
Proposition 6.73. If $(P, \wp, M, G)$ is a principal $G$-bundle, then the fibers are exactly the orbits of the right $G$-action.

Proof. The definition makes it clear that each orbit is contained in some fiber. Suppose that $u_1$ and $u_2$ are in the same fiber so that $\wp(u_1) = \wp(u_2)$. We wish to find a $g \in G$ such that $u_1 = u_2 g$. Let $g := \Phi(u_2)^{-1}\Phi(u_1)$. Then $\Phi(u_1) = \Phi(u_2)g$ and so
$$\phi(u_1) = (\wp(u_1), \Phi(u_1)) = (\wp(u_2), \Phi(u_2)g) = (\wp(u_2 g), \Phi(u_2 g)) = \phi(u_2 g).$$
Since $\phi$ is bijective, we see that $u_1 = u_2 g$. The conclusion can be expressed by saying that the action is transitive on fibers. $\square$

The definition of a principal $G$-bundle above does not use the notion of a $G$-atlas or strict equivalence. The definition of principal bundle atlas is given after the notion of a principal bundle is already defined. However, once we have singled out this type of atlas, we can find out what the transition functions are and thereby make connection with earlier developments. Notice that if $(\phi_\alpha, U_\alpha)$ and $(\phi_\beta, U_\beta)$ are overlapping principal bundle charts with $\phi_\alpha = (\wp,$
$$g_{\alpha\beta}(p) =$$
where $u$ is any element in the fiber at $p$.

Lemma 6.74. Let $(\phi_\alpha, U_\alpha)$ and $(\phi_\beta, U_\beta)$ be overlapping principal bundle charts. For each $p \in U_\alpha \cap U_\beta$,
$$\Phi_\alpha|_{\wp^{-1}(p)} \circ \big(\Phi_\beta|_{\wp^{-1}(p)}\big)^{-1}(g) = g_{\alpha\beta}(p)g,$$
where the $g_{\alpha\beta}$ are given as above.

Proof. Let (
From this lemma we see that the structure group of a principal bundle is $G$ acting on itself by left translation. Conversely, if $(P, \wp, M, G)$ is a fiber bundle with a $G$-atlas with $G$ acting by left translation, then $(P, \wp, M, G)$ is a principal bundle. To see this, we only need to exhibit the free right action. Let $u \in P$ and choose a chart $(\phi_\alpha, U_\alpha)$ from the $G$-atlas. Then let $ug := \phi_\alpha^{-1}(p,$
$= \phi_\alpha\phi_\beta^{-1}(p,$
so that $u_1 = \phi_\alpha^{-1}(p,$
we see that $\Phi_\alpha(ug) = \Phi_\alpha(u)g$ as required by the definition of principal bundle. Obviously the frame bundles of vector bundles are examples of principal bundles.

Definition 6.75. Suppose that $\pi : E \to M$ is a rank $k$ vector bundle with typical fiber $V$. Let $G$ be a Lie subgroup of $GL(V)$ and suppose that $\pi : E \to M$ has a $G$-bundle structure where $G$ acts on $V$ by the standard action as a subgroup of $GL(V)$. Thus we have a reduction of the structure group to $G$. Let $\mathcal{A}_G = \{(U_\alpha, \phi_\alpha)\}$ be the maximal $G$-atlas that defines this structure. For each $p \in M$, let
$$F_G(E_p) := \{u \in GL(V, E_p) : u = \phi^{-1}(p, \cdot) \text{ for some } (U, \phi) \in \mathcal{A}_G\}.$$
Elements of $F_G(E_p)$ are called $G$-frames at $p$ (associated to $\mathcal{A}_G$). The group $G$ acts on the right on $F_G(E_p)$ by $(u, g) \mapsto u \circ g$, and letting $F_G(E) := \bigcup_{p \in M} F_G(E_p)$ we see that $G$ acts on the right on $F_G(E)$. It is not hard to show that $F_G(E)$ has the structure of a principal $G$-bundle with this action. It is a subprincipal bundle of the frame bundle.

Definition 6.76. The bundle $F_G(E)$ is called the bundle of $G$-frames associated to the $G$-bundle structure on $\pi : E \to M$.

Actually, $G$ acts on the whole frame bundle of $E$ by the same formula and on the set of all frames $F(E_p)$ at a fixed $p$. Then $F_G(E_p)$ is an orbit of this action on $F(E)$. In fact, one may simply define a $G$-bundle structure on a vector bundle to be a subbundle of the frame bundle such that $G$ acts transitively on each fiber. It is not hard to see that the notion of a bundle of $G$-frames gives us another way to describe the notion of reduction of the structure group. It is also important to notice that since there may be more than one $G$-bundle structure on $E$, the notation $F_G(E)$ is ambiguous. If one must deal with two different $G$-bundle structures on $E$, then one must resort to another notation such as $F^1_G(E)$ and $F^2_G(E)$.

Exercise 6.77. Show that a choice of metric on a vector bundle $E \to M$ is equivalent to a specification of a subbundle of the frame bundle such that $O(k)$ acts transitively on each fiber of the subbundle.

As another example of a principal bundle we also have the Hopf bundles described in the next example and the following exercise.

Example 6.78 (Hopf bundles). Recall the Hopf map $\wp : S^{2n-1} \to \mathbb{CP}^{n-1}$ defined in Example 5.120. The quadruple $(S^{2n-1}, \wp, \mathbb{CP}^{n-1}, U(1))$ is a principal fiber bundle. We have already defined the left action of $U(1)$ on $S^{2n-1}$ in Example 5.120. Since $U(1)$ is abelian, we may take this action to also be a right action. Recall that in this context, we have $S^{2n-1} = \{\xi \in \mathbb{C}^n : |\xi| = 1\}$,
where for $\xi = (z^1, \ldots, z^n)$, we have $|\xi|^2 = \sum \bar{z}^i z^i$. The right action of $U(1) = S^1$ on $S^{2n-1}$ is $(\xi, g) \mapsto \xi g = (z^1 g, \ldots, z^n g)$. It is clear that $\wp(\xi g) = \wp(\xi)$. To finish the verification that $(S^{2n-1}, \wp, \mathbb{CP}^{n-1}, U(1))$ is a principal bundle, we exhibit appropriate principal bundle charts. For each $k = 1, 2, \ldots, n$, we let $U_k := \{[z^1, \ldots, z^n] \in \mathbb{CP}^{n-1} : z^k \neq 0\}$ and we let $\psi_k : \wp^{-1}(U_k) \to U_k \times U(1)$ be defined by $\psi_k := (\wp, \Psi_k)$, where
$$\Psi_k(\xi) = \Psi_k(z^1, \ldots, z^n) := |z^k|^{-1} z^k.$$
We leave it to the reader to show that $\psi_k := (\wp, \Psi_k)$ is a diffeomorphism. For $g \in U(1)$, we have
$$\Psi_k(\xi g) = |z^k g|^{-1}(z^k g) = |z^k|^{-1}(z^k g) = (|z^k|^{-1} z^k) g = \Psi_k(\xi) g,$$
as desired. Let us compute the transition cocycle $\{g_{ij}\}$. For $p = [\xi] \in U_i \cap U_j$, we have
$$g_{ij}(p) = \Psi_i(\xi)\,\Psi_j(\xi)^{-1} = |z^i|^{-1} z^i (z^j)^{-1} |z^j|.$$

Exercise 6.79. By analogy with the above example, show that we have principal bundles $(S^{n-1}, \wp, \mathbb{RP}^{n-1}, \mathbb{Z}_2)$ and $(S^{4n-1}, \wp, \mathbb{HP}^{n-1}, U(1, \mathbb{H}))$. Show that in the quaternionic case $g_{ij}(p) = |q^i|^{-1} q^i (q^j)^{-1} |q^j|$ for $p = [q^1, \ldots, q^n]$ and that the order matters in this case.
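The chart computations for the complex Hopf bundle in Example 6.78 can be checked numerically. The following is only an illustrative sketch, not part of the text: the function names (`random_sphere_point`, `Psi`, `transition`) are hypothetical, indices are 0-based, and the transition function is assumed to be $g_{ij}([\xi]) = \Psi_i(\xi)\Psi_j(\xi)^{-1}$.

```python
import numpy as np

def random_sphere_point(n, rng):
    """Return a random unit vector in C^n, i.e. a point of S^(2n-1)."""
    z = rng.normal(size=n) + 1j * rng.normal(size=n)
    return z / np.linalg.norm(z)

def Psi(k, z):
    """Fiber part of the k-th chart: Psi_k(z) = |z^k|^(-1) z^k (k is 0-based)."""
    return z[k] / abs(z[k])

def transition(i, j, z):
    """Assumed transition function g_ij([z]) = Psi_i(z) * Psi_j(z)^(-1)."""
    return Psi(i, z) / Psi(j, z)

rng = np.random.default_rng(0)
z = random_sphere_point(3, rng)   # a point of S^5 over CP^2
g = np.exp(1j * 0.7)              # an element of U(1)

# Equivariance of the chart: Psi_k(z g) = Psi_k(z) g.
equivariant = np.isclose(Psi(1, z * g), Psi(1, z) * g)

# g_ij depends only on the class [z], not on the representative.
well_defined = np.isclose(transition(0, 2, z * g), transition(0, 2, z))

# Cocycle condition g_01 g_12 = g_02 on the triple overlap.
cocycle = np.isclose(transition(0, 1, z) * transition(1, 2, z),
                     transition(0, 2, z))
```

Because $U(1)$ is abelian the order of the factors is irrelevant here; in the quaternionic case of Exercise 6.79 the same check would require keeping the factors in the stated order.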
If (U,
Proposition 6.80. If $\wp : P \to M$ is a surjective submersion and a Lie group $G$ acts freely on $P$ so that for each $u \in P$ the orbit of $u$ is exactly the fiber $\wp^{-1}(\wp(u))$, then $(P, \wp, M, G)$ is a principal bundle.

Proof. Let us assume (without loss of generality) that the action is a right action since it can always be converted into such by group inversion if needed. We use Proposition 3.25: For each point $p \in M$, there is a local section $\sigma : U \to P$ on some neighborhood $U$ containing $p$. Consider the map $f_\sigma : U \times G \to \wp^{-1}(U)$ given by $f_\sigma(p, g) = \sigma(p)g$. One can check that this map is bijective and has an invertible tangent map at each point of $U \times G$. Now let $\phi := f_\sigma^{-1}$. Then we have $\phi = (\wp, \Phi)$ for a map $\Phi : \wp^{-1}(U) \to G$. If $p = \wp(u)$, we have $\phi(ug) = (p, \Phi(ug))$ and so
$$ug = \phi^{-1}(p, \Phi(ug)),$$
while
$$\phi^{-1}(p, \Phi(u)g) = f_\sigma(p, \Phi(u)g) = \sigma(p)(\Phi(u)g) = (\sigma(p)\Phi(u))g = f_\sigma(p, \Phi(u))g = \phi^{-1}(p, \Phi(u))g = ug = \phi^{-1}(p, \Phi(ug)).$$
Since $\phi^{-1}$ is a bijection, we have $\Phi(ug) = \Phi(u)g$. Thus the section $\sigma$ gives rise to a principal bundle chart $(U, \phi)$, where $\phi = (\wp, \Phi)$. $\square$
Combining this with our results on proper free actions from Chapter 5, we obtain the following corollary:

Corollary 6.81. If a Lie group $G$ acts properly and freely on $M$ (on the right), then $(M, \pi, M/G, G)$ is a principal bundle. In particular, if $H$ is a closed subgroup of a Lie group $G$, then $(G, \pi, G/H, H)$ is a principal bundle (with structure group $H$).

Definition 6.82. Let $(P_1, \wp_1, M_1, G)$ and $(P_2, \wp_2, M_2, G)$ be two principal $G$-bundles. A (type II) bundle morphism $\tilde{f} : P_1 \to P_2$ along a smooth map $f : M_1 \to M_2$ is called a principal $G$-bundle morphism (along $f$) if
$$\tilde{f}(u \cdot g) = \tilde{f}(u) \cdot g$$
for all $g \in G$ and $u \in P_1$. If $M_1 = M_2$ and $f = \mathrm{id}_M$, then we say that $\tilde{f}$ is a principal $G$-bundle morphism over $M$.

Exercise 6.83. Show that if $(P_1, \wp_1, M_1, G)$ and $(P_2, \wp_2, M_2, G)$ are principal $G$-bundles and $\tilde{f} : P_1 \to P_2$ is a principal $G$-bundle morphism along a diffeomorphism $f$, then $\tilde{f}$ is a diffeomorphism.

If $\tilde{f}$ is a principal bundle morphism over $M$, then it is a diffeomorphism and hence a bundle equivalence (or bundle isomorphism over $M$) with the property $\tilde{f}(u \cdot g) = \tilde{f}(u) \cdot g$ for all $g \in G$ and $u \in P_1$. In this case, we call $\tilde{f}$ a principal $G$-bundle equivalence and the two bundles are equivalent principal $G$-bundles over $M$. A principal $G$-bundle equivalence from a principal bundle to itself is called a principal bundle automorphism or also a (global) gauge transformation.
The classification problem for principal bundles (in the topological category) is regrettably beyond the scope of this volume and can be found in [Hus]. We can only offer the following comments: In the topological category, the notion of a Lie group is replaced by that of a topological group,
but the reader may continue to think of Lie groups. For every topological group $G$, there is a principal bundle $\xi(G) = (E(G), \wp_\infty, B(G), G)$ called a universal bundle with the property that all the homotopy groups of $E(G)$ are trivial. There is then a classification theorem that states that the equivalence classes of principal bundles with a fixed sufficiently nice$^1$ base space $M$ are in one-to-one correspondence with the set of homotopy classes of maps from $M$ to $B(G)$. The correspondence is given by assigning to the homotopy class $[f]$ the pull-back principal bundle $f^*\xi(G)$.

The notion of a principal bundle morphism over $M$ can be generalized to the situation where we have two groups in play.

Definition 6.84. Let $(P_1, \wp_1, M, G_1)$ and $(P_2, \wp_2, M, G_2)$ be principal bundles and let $h : G_1 \to G_2$ be a Lie group homomorphism. A bundle morphism $\tilde{f} : P_1 \to P_2$ over $M$ is called a principal bundle homomorphism with respect to $h$ if
$$\tilde{f}(u \cdot g) = \tilde{f}(u) \cdot h(g).$$
If $G_1 \subset G_2$ and the homomorphism $h$ is the inclusion, then we call $\tilde{f}$ a reduction of $(P_2, \wp_2, M, G_2)$ to $(P_1, \wp_1, M, G_1)$.
In the most common case of a reduction, $P_1$ is a submanifold of $P_2$ and $\tilde{f}$ is the inclusion $P_1 \hookrightarrow P_2$. For example, this is the case when one chooses a metric on a vector bundle and thereby obtains a bundle of orthonormal frames $F_{O(n)}(E)$. The inclusion $F_{O(n)}(E) \hookrightarrow F(E)$ is then a reduction, and we just say that $F_{O(n)}(E)$ is a reduction of the frame bundle $F(E)$.

We have seen that a principal $G$-bundle atlas $\{(U_\alpha, \phi_\alpha)\}$ is associated to a cocycle $\{g_{\alpha\beta}\}$. From this cocycle and the left action of $G$ on itself we may construct a bundle which has $\{g_{\alpha\beta}\}$ as a transition cocycle. In fact, recall that in the construction we formed the total space by putting an equivalence relation on the set $\Sigma := \bigcup_\alpha \{\alpha\} \times U_\alpha \times G$, where $(\alpha, p, g) \in \{\alpha\} \times U_\alpha \times G$ is equivalent to $(\beta, p', g') \in \{\beta\} \times U_\beta \times G$ if and only if $p = p'$ and $g' = g_{\beta\alpha}(p) \cdot g$. If we define a right action on the total space of the constructed bundle by $[\alpha, p, g_1] \cdot g = [\alpha, p, g_1 g]$, then this is well-defined, smooth, and makes the constructed bundle a principal $G$-bundle equivalent to the original principal $G$-bundle.

Exercise 6.85. Prove the last assertion above.

Thus we see that $G$-cocycles on a smooth manifold $M$ give rise to principal $G$-bundles and conversely. If we start with two $G$-cocycles on $M$, then we may ask whether the principal $G$-bundles constructed from these cocycles are equivalent or not. First notice that the constructed bundles will have principal bundle atlases with the respective original transition cocycles. Thus we are led to the following related question: What conditions on the transition cocycles arising from principal bundle atlases on two principal $G$-bundles will ensure that the bundles are equivalent principal $G$-bundles? By restricting the trivializing maps to open sets of a common refinement, we obtain new atlases and so we may as well assume from the start that the respective principal bundle atlases are defined on the same cover of $M$.

(Footnote 1: $M$ should be a CW-complex.)

Theorem 6.86. Let $(P_1, \wp_1, M, G)$ and $(P_2, \wp_2, M, G)$ be principal $G$-bundles with principal bundle atlases $\{(\phi_\alpha, U_\alpha)\}$ and $\{(\phi'_\alpha, U_\alpha)\}$ respectively. Then $(P_1, \wp_1, M, G)$ is equivalent to $(P_2, \wp_2, M, G)$ if and only if there exists a family of (smooth) maps $\tau_\alpha : U_\alpha \to G$ such that
$$g'_{\alpha\beta}(p) = (\tau_\alpha(p))^{-1} g_{\alpha\beta}(p)\,\tau_\beta(p)$$
for all $p \in U_\alpha \cap U_\beta$ and for all nonempty intersections $U_\alpha \cap U_\beta$. (Here $\{g_{\alpha\beta}\}$ is the cocycle associated to $\{(\phi_\alpha, U_\alpha)\}$ and $\{g'_{\alpha\beta}\}$ is the cocycle associated to $\{(\phi'_\alpha, U_\alpha)\}$.)

Sketch of proof. First suppose that $P_1$ and $P_2$ are equivalent principal $G$-bundles and let $\tilde{f} : P_1 \to P_2$ be an equivalence. Let $p \in U_\alpha$ and choose some $u \in \wp_1^{-1}(p)$, so $\tilde{f}(u) \in \wp_2^{-1}(p)$. Write $\phi_\alpha = (\wp_1, \Phi_\alpha)$ and $\phi'_\alpha = (\wp_2, \Phi'_\alpha)$, and define
$$\tau_\alpha(p) := \Phi_\alpha(u)\,\bigl(\Phi'_\alpha(\tilde{f}(u))\bigr)^{-1}.$$
One checks that this is independent of the choice of $u \in \wp_1^{-1}(p)$ and that the maps $\tau_\alpha$ satisfy the required relation.

Conversely, given the maps $\tau_\alpha : U_\alpha \to G$ satisfying $g'_{\alpha\beta}(p) = (\tau_\alpha(p))^{-1} g_{\alpha\beta}(p)\,\tau_\beta(p)$, we define, for each $\alpha$, a map $f_\alpha : \wp_1^{-1}(U_\alpha) \to \wp_2^{-1}(U_\alpha)$ by
$$f_\alpha(u) := (\phi'_\alpha)^{-1}\bigl(p, (\tau_\alpha(p))^{-1}\Phi_\alpha(u)\bigr), \qquad p = \wp_1(u).$$
Check that $f_\alpha(u) = f_\beta(u)$ when $\wp_1(u) \in U_\alpha \cap U_\beta$ so that there is a well-defined map $\tilde{f} : P_1 \to P_2$ such that $f_\alpha(u) = \tilde{f}(u)$ whenever $\wp_1(u) \in U_\alpha$. Finally, check that $\tilde{f}(u \cdot g) = \tilde{f}(u) \cdot g$. $\square$
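The algebra behind Theorem 6.86 can be illustrated at a single point $p$ of a triple overlap. The sketch below is a hedged numerical check, not part of the text: it assumes $G = GL(2, \mathbb{R})$, uses hypothetical names (`random_gl`, `is_cocycle`), and builds a sample cocycle as $g_{xy} = h_x h_y^{-1}$. It then confirms that twisting by maps $\tau_x$ as in the theorem again yields a cocycle.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_gl(n=2):
    """Return a random invertible n x n matrix (a sample element of GL(n, R))."""
    while True:
        a = rng.normal(size=(n, n))
        if abs(np.linalg.det(a)) > 0.1:
            return a

labels = ["a", "b", "c"]          # three chart indices, all evaluated at one point p
inv = np.linalg.inv

# A sample cocycle at p: g_xy = h_x h_y^{-1} automatically satisfies g_xy g_yz = g_xz.
h = {x: random_gl() for x in labels}
g = {(x, y): h[x] @ inv(h[y]) for x in labels for y in labels}

# Values tau_x(p) of the maps in Theorem 6.86, and the twisted family
# g'_xy = tau_x^{-1} g_xy tau_y.
tau = {x: random_gl() for x in labels}
g_new = {(x, y): inv(tau[x]) @ g[(x, y)] @ tau[y] for x in labels for y in labels}

def is_cocycle(c):
    """Check c_xy c_yz = c_xz for all index triples."""
    return all(np.allclose(c[(x, y)] @ c[(y, z)], c[(x, z)])
               for x in labels for y in labels for z in labels)
```

The check only exercises the pointwise algebra; smoothness of the maps $\tau_\alpha$ and the construction of the equivalence $\tilde{f}$ are not modelled.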
Let $\wp : P \to M$ be a principal $G$-bundle and suppose that we are given a smooth left action $\lambda : G \times F \to F$ on some smooth manifold $F$. Define a right action of $G$ on $P \times F$ according to
$$(u, y) \cdot g := (ug, g^{-1}y) = (ug, \lambda(g^{-1}, y)).$$
Denote the orbit space of this action by $P \times_\lambda F$ (or $P \times_G F$) and let $\widetilde{\wp}$ denote the quotient map. Also denote the equivalence class of $(u, y)$ by $[u, y]$ so $\widetilde{\wp}(u, y) = [u, y]$. One may check that there is a unique map $\pi : P \times_\lambda F \to M$ such that $\pi([u, y]) = \wp(u)$, and so we have a commutative diagram:
$$\begin{array}{ccc} P \times F & \xrightarrow{\ \mathrm{pr}_1\ } & P \\ \downarrow{\scriptstyle \widetilde{\wp}} & & \downarrow{\scriptstyle \wp} \\ P \times_\lambda F & \xrightarrow{\ \ \pi\ \ } & M \end{array}$$
Next we show that $(P \times_\lambda F, \pi, M, F)$ is a fiber bundle (a $(G, \lambda)$-bundle). It is said to be associated to the principal bundle $P$. Bundles constructed in this way are called associated bundles. More precisely, if the action $\lambda$ is not effective, we should say that $P \times_\lambda F$ is weakly associated to $P$. This is what lies behind our previously introduced notion of "ineffective $(G, \lambda)$ bundle".

Theorem 6.87. Referring to the above diagram and notations, $P \times_\lambda F$ is a smooth manifold and the following hold:
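(i) $(P \times_\lambda F, \pi, M, F)$ is a fiber bundle, and for every principal bundle atlas $\{(U_\alpha, \phi_\alpha)\}$, there is a corresponding bundle atlas $\{(U_\alpha, \tilde{\phi}_\alpha)\}$ for $P \times_\lambda F$ such that
$$\tilde{\phi}_\alpha \circ \tilde{\phi}_\beta^{-1}(p, y) = (p, \lambda(g_{\alpha\beta}(p), y)) \quad \text{if } p \in U_\alpha \cap U_\beta \text{ and } y \in F,$$
where the $g_{\alpha\beta}$ are defined by equation (6.3).

(ii) $(P \times F, \widetilde{\wp}, P \times_\lambda F, G)$ is a principal bundle with the right action given by $(u, y) \cdot g := (ug, g^{-1}y)$.

(iii) $P \times F \xrightarrow{\ \mathrm{pr}_1\ } P$ is a principal bundle morphism along $\pi$.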
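Proof. Let $\{(U_\alpha, \phi_\alpha)\}$ be a principal bundle atlas for $\wp : P \to M$, and write $\phi_\alpha = (\wp, \Phi_\alpha)$. Note that $\widetilde{\wp}(\wp^{-1}(U_\alpha) \times F) = \pi^{-1}(U_\alpha)$. For each $\alpha$, define $\tilde{\Phi}_\alpha : \pi^{-1}(U_\alpha) \to F$ by requiring that $\tilde{\Phi}_\alpha \circ \widetilde{\wp}(u, y) = \Phi_\alpha(u) \cdot y$ for all $(u, y) \in \wp^{-1}(U_\alpha) \times F$ and then let $\tilde{\phi}_\alpha := (\pi, \tilde{\Phi}_\alpha)$ on $\pi^{-1}(U_\alpha)$. We want to show that $\tilde{\phi}_\alpha$ is bijective by defining an inverse for $\tilde{\phi}_\alpha$. For every $p \in U_\alpha$, let $\sigma_\alpha(p) := \phi_\alpha^{-1}(p, e)$, where $e$ is the identity element in $G$. Then for $u \in \wp^{-1}(p)$ we have
$$\sigma_\alpha(p) \cdot \Phi_\alpha(u) = \phi_\alpha^{-1}(p, e) \cdot \Phi_\alpha(u) = \phi_\alpha^{-1}(p, \Phi_\alpha(u)) = u.$$
Define $\eta_\alpha : U_\alpha \times F \to \pi^{-1}(U_\alpha)$ by $\eta_\alpha(p, y) := \widetilde{\wp}(\sigma_\alpha(p), y)$. With $p = \wp(u)$ we have
$$\eta_\alpha \circ \tilde{\phi}_\alpha(\widetilde{\wp}(u, y)) = \eta_\alpha(p, \Phi_\alpha(u) \cdot y) = \widetilde{\wp}(\sigma_\alpha(p), \Phi_\alpha(u) \cdot y) = \widetilde{\wp}(\sigma_\alpha(p) \cdot \Phi_\alpha(u), y) = \widetilde{\wp}(u, y).$$
Thus $\eta_\alpha$ is a left inverse for $\tilde{\phi}_\alpha$ and so $\tilde{\phi}_\alpha$ is injective. It is easily checked that $\eta_\alpha$ is also a right inverse for $\tilde{\phi}_\alpha$. To see this first note that $(p, \Phi_\alpha(\sigma_\alpha(p))) = \phi_\alpha(\sigma_\alpha(p)) = (p, e)$ so $\Phi_\alpha(\sigma_\alpha(p)) = e$. Thus we have
$$\tilde{\phi}_\alpha \circ \eta_\alpha(p, y) = \tilde{\phi}_\alpha(\widetilde{\wp}(\sigma_\alpha(p), y)) = (p, \tilde{\Phi}_\alpha(\widetilde{\wp}(\sigma_\alpha(p), y))) = (p, \Phi_\alpha(\sigma_\alpha(p)) \cdot y) = (p, y).$$
Thus $\tilde{\phi}_\alpha$ is a bijection. Next we check the overlaps. We use Lemma 6.74:
$$\tilde{\phi}_\alpha \circ \tilde{\phi}_\beta^{-1}(p, y) = \tilde{\phi}_\alpha(\widetilde{\wp}(\sigma_\beta(p), y)) = (p, \Phi_\alpha(\sigma_\beta(p)) \cdot y) = (p, \Phi_\alpha(\phi_\beta^{-1}(p, e)) \cdot y)$$
$$= (p, \Phi_\alpha|_p \circ \Phi_\beta|_p^{-1}(e) \cdot y) = (p, g_{\alpha\beta}(p) \cdot e \cdot y) = (p, g_{\alpha\beta}(p)y).$$
This shows that the transition mappings have the stated form and that the $\tilde{\phi}_\alpha \circ \tilde{\phi}_\beta^{-1}$ are smooth. The family $\{(U_\alpha, \tilde{\phi}_\alpha)\}$ gives the induced smooth structure and is also a bundle atlas. Since $\tilde{\phi}_\alpha \circ \widetilde{\wp}(u, y) = (\wp(u), \Phi_\alpha(u)y)$ in the domain of every bundle chart $(U_\alpha, \tilde{\phi}_\alpha)$, it follows that $\widetilde{\wp}$ is smooth.

We leave it to the reader to verify that $(P \times F, \widetilde{\wp}, P \times_\lambda F, G)$ is a principal $G$-bundle. Notice that while the map $\mathrm{pr}_1 : P \times F \to P$ is clearly a bundle map along $\pi$, we also have
$$\mathrm{pr}_1((u, y) \cdot g) = \mathrm{pr}_1((u \cdot g, g^{-1}y)) = u \cdot g = \mathrm{pr}_1(u, y) \cdot g,$$
and so $\mathrm{pr}_1$ is in fact a principal bundle morphism. $\square$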
Clearly what we have is another way of looking at bundle construction. The principal bundle takes the place of the cocycle of transition maps.

Exercise 6.88. Construct a principal $\mathbb{Z}_2$-bundle $P$ and left actions $\lambda_1$ and $\lambda_2$ of $\mathbb{Z}_2$ on $S^1$ and $\mathbb{R}$ respectively, such that $P \times_{\lambda_1} S^1$ is the twisted torus and $P \times_{\lambda_2} \mathbb{R}$ is the Möbius band line bundle.
We have seen that given a principal G-bundle, one may construct various fiber bundles with G-bundle structures. Let us look at the converse situation. Suppose that (E, 71", M, F) is a fiber bundle. Suppose that this bundle has a (G, A)-atlas {(Uo:,
If $(E, \pi, M, V)$ is a vector bundle and we use the standard $GL(V)$-cocycle $\{\Phi_{\alpha\beta}\}$ associated to a VB-atlas, then the principal bundle obtained by the above construction is (equivalent to) the linear frame bundle $F(E)$. Letting $GL(V)$ act on $V$ according to the standard action we have $F(E) \times_{GL(V)} V$, which is equivalent to the original bundle $(E, \pi, M, V)$. More generally, if
$\lambda : G \to GL(V)$ is a Lie group representation, then by treating $\lambda$ as a linear action we can form $P \times_\lambda V$.

Proposition 6.89. Let $P$ be a principal $G$-bundle and let $\lambda : G \to GL(V)$ be a representation. Then $P \times_\lambda V$ has a natural vector bundle structure with typical fiber $V$.

Proof. This follows from Theorem 6.87, but we can argue more directly. Let us denote the total space of $P \times_\lambda V$ by $B$ and let $B_p$ be the fiber over some point $p \in M$. Then for each $u \in P_p$ there is a map $\psi_u := [u, \cdot] : V \to B_p$ given by $v \mapsto [u, v]$. We compare $\psi_u$ with $\psi_{ug}$ for $g \in G$ and $u \in P_p$. Since $[ug, v] = [u, \lambda(g)v]$ for all $v \in V$, we have $\psi_{ug} = \psi_u \circ \lambda(g)$; that is, the following diagram commutes:
$$\begin{array}{ccc} V & \xrightarrow{\ \psi_{ug}\ } & B_p \\ {\scriptstyle \lambda(g)}\downarrow & \nearrow{\scriptstyle \psi_u} & \\ V & & \end{array}$$
From this it follows that $\psi_u$ transfers the linear structure of $V$ to $B_p$ independently of the choice of $u \in P_p$. We leave it to the reader to show that the local trivializations of $P \times_\lambda V$ constructed as in the proof of Theorem 6.87 are linear on each fiber. $\square$

Example 6.90. Let $M$ be an $n$-manifold and let $F(M)$ be the frame bundle of $M$. Then, if $\lambda_0$ is the standard action of $GL(n, \mathbb{R})$ on $\mathbb{R}^n$, we have the following vector bundle isomorphisms:
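$$F(M) \times_{\lambda_0} \mathbb{R}^n \cong TM, \qquad F(M) \times_{\lambda_0^*} (\mathbb{R}^n)^* \cong T^*M, \qquad F(M) \times_{\lambda_0 \otimes \lambda_0^*} \mathbb{R}^n \otimes (\mathbb{R}^n)^* \cong TM \otimes T^*M,$$
where $\lambda_0^*$ denotes the dual representation on $(\mathbb{R}^n)^*$.

If $E = P \times_\lambda V$ is an associated vector bundle for $\lambda$ a representation, then we can map $P$ into the frame bundle of $E$. Indeed, the map is just $\psi : u \mapsto \psi(u) = \psi_u$, where $\psi_u := [u, \cdot]$ as above. Furthermore, $\psi(ug) = \psi(u) \circ \lambda(g)$ and so we have a principal bundle morphism with respect to the homomorphism $\lambda$:
$$\begin{array}{ccc} P & \xrightarrow{\ \ \psi\ \ } & F(E) \\ & \searrow \quad \swarrow & \\ & M & \end{array}$$
The map $\psi : P \to F(E)$ is only injective if the action $\lambda$ is effective.

Based on what we have seen above we can say that the theory of principal bundles and associated bundles is an alternative and "invariant" approach to $G$-bundles. By "invariant" we mean that the foundations can be laid out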
without recourse to strict equivalence classes of $G$-atlases or the use of cocycles (of course, these notions can be brought in as convenient). According to this approach, the central notion is the principal bundle, and one recovers the other $G$-bundles of interest as associated bundles. Developing the theory in this way has the advantage that much can be accomplished without the direct need of bundle atlases. It is a more "intrinsic" approach. This approach seems to have originated with Ehresmann and is the approach followed by [Hus].
Problems

(1) Show that $S^n \times \mathbb{R}$ and $S^n \times S^1$ are parallelizable.

(2) Let $X := [0, 1] \times \mathbb{R}^n$. Fix a linear isomorphism $L : \mathbb{R}^n \to \mathbb{R}^n$ and consider the quotient space $E = X/{\sim}$, where the equivalence relation is given by $(0, v) \sim (1, Lv)$. Show that $E$ is the total space of a smooth vector bundle over the circle $S^1$.

(3) Exhibit the vector bundle charts for the pull-back bundle construction of Definition 6.18.

(4) Let $\xi = (E, \pi, M, F)$ be a $G$-bundle. Let $g_{\alpha\beta}$ be cocycles associated to a $G$-atlas $\{(U_\alpha, \phi_\alpha)\}$ for $\xi$. Show that $\xi$ is $G$-equivalent to a product bundle if and only if there exist functions $\lambda_\alpha : U_\alpha \to G$ such that
$$g_{\beta\alpha}(x) = \lambda_\beta(x)\lambda_\alpha^{-1}(x)$$
for all $x \in U_\alpha \cap U_\beta$ and all $\alpha, \beta$.

(5) Show that the twisted torus of Example 6.17 is trivial as a fiber bundle but not trivial as a $\mathbb{Z}_2$-bundle. (Use Problem 4.)

(6) Show that the space of sections of a vector bundle over a compact base is a finitely generated module. Show that if the bundle is trivial, then the space of sections is a finitely generated free module.

(7) Let $E_1 \to M_1$ and $E_2 \to M_2$ be smooth vector bundles. Show that if $F : E_1 \to E_2$ is a vector bundle homomorphism along a map $f : M_1 \to M_2$ such that $F$ is an isomorphism of fibers, then $E_1 \to M_1$ is isomorphic to the pull-back bundle $f^*E_2 \to M_1$.

(8) Suppose that $\xi = (E, \pi, M)$ is a vector bundle with a positive definite metric. Show that the metric induces a vector bundle isomorphism $E \cong E^*$.

(9) Show that the tangent bundle of the real projective space $\mathbb{RP}^n$ is a vector bundle isomorphic to $\mathrm{Hom}(\mathbb{L}(\mathbb{RP}^n), \mathbb{L}(\mathbb{RP}^n)^\perp)$, where $\mathbb{L}(\mathbb{RP}^n) \to \mathbb{RP}^n$
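is the tautological line bundle and $\mathbb{L}(\mathbb{RP}^n)^\perp \to \mathbb{RP}^n$ is the rank $n$ vector bundle whose fiber at $l \in \mathbb{RP}^n$ is $\{(l, v) \in \mathbb{RP}^n \times \mathbb{R}^{n+1} : v \perp l\}$.

(10) Recall Example 6.35. Show that the Whitney sum bundle $E_1 \oplus E_2$ is naturally isomorphic to the pull-back $\Delta^*\pi_{E_1 \times E_2} : \Delta^*(E_1 \times E_2) \to M$.

(11) (a) Let $\pi : E \to M$ be an $\mathbb{F}$-vector bundle. We wish to show that $T\pi : TE \to TM$ is naturally a vector bundle. Consider the maps $a : E \oplus E \to E$ and $\mu_s : E \to E$ for each $s \in \mathbb{F}$ given by
$$a(v_p, w_p) := v_p + w_p \ \text{ for } v_p, w_p \in E_p, \qquad \mu_s(e_p) := s e_p \ \text{ for } e_p \in E_p.$$
Show that we may identify $T(E \oplus E)$ with the submanifold of $TE \times TE$ given by
$$\{(v, w) \in TE \times TE : T\pi \cdot v = T\pi \cdot w\}.$$
Now suppose that for $v, w \in TE$ with $T\pi \cdot v = T\pi \cdot w$ we define $v \boxplus w := Ta \cdot (v, w)$ and for $s \in \mathbb{F}$ and $v \in TE$ we define $s \cdot v := T\mu_s \cdot v$. Show that with these definitions of addition and scalar multiplication, $T\pi : TE \to TM$ is indeed an $\mathbb{F}$-vector bundle.

(b) Let $E$ be as above but assume for simplicity that $\mathbb{F} = \mathbb{R}$. Let $x^1, \ldots, x^n$ be coordinates on $U \subset M$. Suppose that $e_1, \ldots, e_k$ is a frame field over $U$. Let $\xi^1, \ldots, \xi^k$ be defined on $E|_U$ by $y = \sum \xi^i(y) e_i(\pi(y))$ for any $y \in E|_U$. Then, identifying $x^i$ with $x^i \circ \pi$, the functions $x^1, \ldots, x^n, \xi^1, \ldots, \xi^k$ are a coordinate system for $E$ defined on $E|_U$ and such that the $\frac{\partial}{\partial \xi^i}$ are in the kernel of $T\pi$. Now if $v, w \in TE$ are such that $T\pi \cdot v = T\pi \cdot w$, then we may express $v$ and $w$ as
$$v = \sum_i a^i \frac{\partial}{\partial x^i}\Big|_{y} + \sum_\alpha b^\alpha \frac{\partial}{\partial \xi^\alpha}\Big|_{y} \quad \text{and} \quad w = \sum_i a^i \frac{\partial}{\partial x^i}\Big|_{\tilde{y}} + \sum_\alpha \tilde{b}^\alpha \frac{\partial}{\partial \xi^\alpha}\Big|_{\tilde{y}}.$$
Here $y$ and $\tilde{y}$ are the base points of $v$ and $w$, and the fact that the $a$'s are the same for both $v$ and $w$ is a result of the condition $T\pi \cdot v = T\pi \cdot w$. Show that
$$v \boxplus w = \sum_i a^i \frac{\partial}{\partial x^i}\Big|_{y + \tilde{y}} + \sum_\alpha (b^\alpha + \tilde{b}^\alpha) \frac{\partial}{\partial \xi^\alpha}\Big|_{y + \tilde{y}},$$
where in $v \boxplus w$, the $\boxplus$ refers to the addition described in part (a).

(12) Exhibit a VB-atlas for the tautological bundle of Example 6.47.

(13) Show that the tautological bundle over $\mathbb{RP}^1$ is a Möbius band.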
(14) Let $P \to M$ be a principal bundle with group $G$. If $H$ is a Lie subgroup of $G$, then the quotient $P/H$ is an $H$-principal bundle. Show that $P/H \to M$ admits a global section if and only if the structure group of $P \to M$ is reducible to $H$.

(15) Show that the notions of smooth fiber bundle and vector bundle make sense when the base space is allowed to be a manifold with boundary. What issues arise if one considers allowing both the base space and typical fiber to have boundary?
Chapter 7
Tensors
In this chapter we shall employ the Einstein summation convention. For example, r;kaivk is taken to be shorthand for
where the range of summation is understood from the context. Normally, the repeated indices that are summed over occur once as a subscript and once as a superscript. For example, if $A = (a^i_j)$ is an $n \times m$ matrix and $B = (b^i_j)$ is an $m \times k$ matrix, where in this case we use upper indices to indicate rows and lower indices for columns, then $C = AB$ corresponds to
$$c^i_j = \sum_{k=1}^{m} a^i_k b^k_j.$$
This is reduced by the summation convention to $c^i_j = a^i_k b^k_j$. We will occasionally include the summation symbol $\sum$ for emphasis, or to meet the demands of clarity.

Tensor fields (often referred to simply as tensors) can be introduced in a rough and ready way by describing their local expressions in charts and then going on to explain how such expressions are transformed under a change of coordinates. With this approach one can gain proficiency with tensor calculations in short order, and this is usually the way physicists and engineers are introduced to tensors. However, since this approach hides much of the underlying algebraic and geometric structure, we will not pursue it here. Instead, we present tensors in terms of multilinear maps.
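The summation convention for the matrix product described at the start of this chapter has a direct computational counterpart. The following sketch is an illustration only, assuming NumPy is available; the array names and sizes are arbitrary choices, not taken from the text. The repeated index `k` in the `einsum` subscript string is summed over, just as in $c^i_j = a^i_k b^k_j$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(2, 3))   # entries a^i_k: first index = row, second = column
B = rng.normal(size=(3, 4))   # entries b^k_j

# Summation convention: the repeated index k is summed, i and j are free.
C_einsum = np.einsum("ik,kj->ij", A, B)

# The same sum written out with an explicit summation over k.
C_loops = np.zeros((2, 4))
for i in range(2):
    for j in range(4):
        for k in range(3):
            C_loops[i, j] += A[i, k] * B[k, j]
```

NumPy does not distinguish upper from lower indices, so the placement of indices that the convention relies on has to be tracked by the reader.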
7.1. Some Multilinear Algebra

It will be convenient to define the notion of an algebraic tensor on a vector space or module. The reader who has looked over the material in Appendix D will find this chapter easier to understand. In particular, we assume the definition of "multilinear" (Definition D.13). In this chapter, if we say that a module is finite-dimensional,$^1$ we mean that it is free and finitely generated and thus has a basis. All modules in this chapter are assumed to be over a commutative ring with unity.
Definition 7.1. Let $V$ and $W$ be modules over a commutative ring $R$ with unity. Then, an algebraic $W$-valued tensor on $V$ is a multilinear mapping of the form
$$\tau : V_1 \times V_2 \times \cdots \times V_m \to W,$$
where each factor $V_i$ is either $V$ or $V^*$. If the number of $V^*$ factors occurring is $r$ and the number of $V$ factors is $s$, then we say that the tensor is $r$-contravariant and $s$-covariant. We also say that the tensor is of total type $\binom{r}{s}$. The most common situation is where $W$ is the ring $R$ itself, in which case we often drop the adjective "$R$-valued".

Notice that if $\tau : V^* \times V \times V^* \times V \to R$ is a tensor, then we can define a tensor $\tilde{\tau} : V^* \times V \times V \times V^* \to R$ by
$$\tilde{\tau}(a_1, v_1, v_2, a_2) := \tau(a_1, v_1, a_2, v_2).$$
Although these two tensors clearly contain the same information, they are nevertheless different. We indicate this with a more specific notation. We say that $\tau$ is a tensor of type $({}^{1}{}_{1}{}^{1}{}_{1})$, while $\tilde{\tau}$ is of type $({}^{1}{}_{2}{}^{1})$. More generally, a tensor might, for example, be specified to be of type
$$({}^{r_1}{}_{s_1}{}^{r_2}{}_{s_2} \cdots {}^{r_a}{}_{s_b}) \quad \text{or} \quad ({}_{s_1}{}^{r_1}{}_{s_2} \cdots {}_{s_b}{}^{r_a}).$$
The general pattern should be clear. If $r = r_1 + \cdots + r_a$ and $s = s_1 + \cdots + s_b$, then the tensor would be of total type $\binom{r}{s}$, which we also write as $(r, s)$. The set of all tensors of fixed type (as above) is easily seen to be an $R$-module with the scalar multiplication and addition defined as is usual for spaces of functions. As another example, a multilinear map
$$\Upsilon : V \times V^* \times V \times V^* \times V^* \to W$$
is a $W$-tensor which is of type $({}_{1}{}^{1}{}_{1}{}^{2})$ and total type $\binom{3}{2}$. The set of all $W$-valued tensors on $V$ of type $({}_{1}{}^{1}{}_{1}{}^{2})$ is denoted $T_{1}{}^{1}{}_{1}{}^{2}(V; W)$, and we have analogous notations for other types.

In many, if not most, circumstances we agree to associate to each tensor of total type $\binom{r}{s}$ a unique element of $T^{r}{}_{s}(V; W)$ by simply keeping the relative order among the $V$ variables and among the $V^*$ variables separately, but shifting all $V$ variables to the right of the $V^*$ variables. Following this procedure, we have, for example, the map $T_{1}{}^{1}{}_{1}{}^{2}(V; W) \to T^{3}{}_{2}(V; W)$. Maps like this will be called consolidation maps or consolidation isomorphisms.

(Footnote 1: For modules, what we mean by dimension is what is usually called the rank.)

Definition 7.2. A tensor
$$\tau : \underbrace{V^* \times V^* \times \cdots \times V^*}_{r \text{ times}} \times \underbrace{V \times V \times \cdots \times V}_{s \text{ times}} \to W,$$
where all the $V$ factors occur last, is said to be in consolidated form. The set of all such (consolidated) $W$-valued tensors on $V$ will be denoted $T^{r}{}_{s}(V; W)$. As a special case we have $T^{0}{}_{1}(V; R) = V^*$. We will often abbreviate $T^{r}{}_{s}(V; R)$ to $T^{r}{}_{s}(V)$. For example, elements of $T^{3}{}_{2}(V; W)$ are in consolidated form, while tensors from $T_{1}{}^{1}{}_{1}{}^{2}(V; W)$ are unconsolidated.

Remark 7.3. Some authors consolidate by putting all $V$ arguments first. Also, sometimes it is appropriate to forgo the consolidation especially in connection with the "type changing" operations introduced later. Our policy will be to work with tensors in consolidated form whenever convenient.

Example 7.4. One always has the special tensor $\delta \in T^{1}{}_{1}(V; R)$ defined by
$$\delta(a, v) = a(v)$$
for $a \in V^*$ and $v \in V$. This tensor is sometimes referred to as the Kronecker delta tensor.

There is a natural map from $V$ to $V^{**}$ given by $v \mapsto \tilde{v}$, where $\tilde{v} : \alpha \mapsto \alpha(v)$. If this map is an isomorphism, we say that $V$ is a reflexive module and we identify $V$ with $V^{**}$. Finite-dimensional vector spaces are reflexive.

Exercise 7.5. Show that the $C^\infty(M)$ module of sections of a vector bundle $E \to M$ is a reflexive module. (It is important here that we are only considering vector bundles with finite-dimensional fibers.)

We now consider the relationship between tensors as defined above and the abstract tensor product spaces described in Appendix D. We restrict our discussion to tensors in consolidated form since the implications for the general situation will be obvious. We specialize to the case of $R$-valued tensors where $R$ is the ring. Recall that the $k$-th tensor power of an $R$-module $V$ is denoted by $\otimes^k V := V \otimes \cdots \otimes V$. We always have a module homomorphism
$$(\otimes^r V) \otimes (\otimes^s V^*) \to T^{r}{}_{s}(V; R) \tag{7.1}$$
whereby an element $u_1 \otimes \cdots \otimes u_r \otimes \beta^1 \otimes \cdots \otimes \beta^s \in (\otimes^r V) \otimes (\otimes^s V^*)$ corresponds to the multilinear map given by
$$(\alpha^1, \ldots, \alpha^r, v_1, \ldots, v_s) \mapsto \alpha^1(u_1) \cdots \alpha^r(u_r)\,\beta^1(v_1) \cdots \beta^s(v_s).$$
We will identify $u_1 \otimes \cdots \otimes u_r \otimes \beta^1 \otimes \cdots \otimes \beta^s$ with this multilinear map. In particular, this entails identifying $v \in V$ with the element $\tilde{v} \in V^{**}$ where $\tilde{v} : \alpha \mapsto \alpha(v)$. If $V$ is a finite-dimensional vector space, then the map (7.1) is an isomorphism. In fact, it is also true that if $V$ is the space of sections of some vector bundle over $M$ (with finite-dimensional fibers), then $V$ is a $C^\infty(M)$-module and the map (7.1) is still an isomorphism. A tensor which can be written in the form $u_1 \otimes \cdots \otimes \beta^s$ is called a simple or decomposable tensor. Note well that not all tensors are simple.
Remark 7.6. The reader should take careful notice of how we treat the orders of the factors: An element of $V \otimes V^* \otimes V^*$ corresponds to a multilinear map $V^* \times V \times V \to R$ and not to a map $V \times V^* \times V^* \to R$.

Since the map (7.1) is not always an isomorphism for general modules, and since no analogous isomorphism exists in the case of tangent spaces to infinite-dimensional manifolds such as those discussed in [L1], it becomes important to ask to what extent the map (7.1) is needed in differential geometry. Serge Lang has written a very fine differential geometry book for manifolds modeled on Banach spaces [L1] without the help of such an isomorphism. In any case, we still can and will consider $u_1 \otimes \cdots \otimes u_r \otimes \beta^1 \otimes \cdots \otimes \beta^s$ to be an element of $T^{r}{}_{s}(V)$ as described above.

Another thing to notice is that if (7.1) is an isomorphism for all $r$ and $s$, then in particular $V \cong V^{**}$, that is, $V$ must be reflexive. Corollary D.33 of Appendix D states that for a finitely generated free module, being reflexive is enough to ensure that (7.1) is an isomorphism for all $r$ and $s$. In the latter case, the consolidation maps introduced earlier can be described in terms of simple tensors. For example, the consolidation map $T_{1}{}^{2}{}_{2}(V) \to T^{2}{}_{3}(V)$ is given on simple tensors by
$$\alpha \otimes v \otimes w \otimes \beta \otimes \gamma \mapsto v \otimes w \otimes \alpha \otimes \beta \otimes \gamma.$$
Now let us consider the spaces $V \otimes V^*$ and $V \otimes V^* \otimes V^*$. By a straightforward argument using the universal property of tensor product spaces, one can construct a bilinear map
$$(V \otimes V^*) \times (V \otimes V^* \otimes V^*) \to V \otimes V^* \otimes V \otimes V^* \otimes V^*$$
such that $(v \otimes \alpha, w \otimes \beta^1 \otimes \beta^2)$ is mapped to $v \otimes \alpha \otimes w \otimes \beta^1 \otimes \beta^2$. This corresponds to a product map
$$\otimes : T^{1}{}_{1}(V) \times T^{1}{}_{2}(V) \to T^{1}{}_{1}{}^{1}{}_{2}(V)$$
such that $(S, T) \mapsto S \otimes T$, where
$$(S \otimes T)(\alpha, v, \beta, w_1, w_2) := S(\alpha, v)\,T(\beta, w_1, w_2).$$
The general pattern should be clear, but writing down the general case is notationally onerous. This product is the (unconsolidated) tensor product of tensors. Note carefully the order of the factors. To simplify the notation, the tensor product is often defined in a slightly different way when dealing with tensors which are in consolidated form:

Definition 7.7. For tensors $S \in T^{r_1}{}_{s_1}(V)$ and $T \in T^{r_2}{}_{s_2}(V)$, we define the (consolidated) tensor product $S \otimes T \in T^{r_1 + r_2}{}_{s_1 + s_2}(V)$ by
$$(S \otimes T)(\theta^1, \ldots, \theta^{r_1 + r_2}, v_1, \ldots, v_{s_1 + s_2}) := S(\theta^1, \ldots, \theta^{r_1}, v_1, \ldots, v_{s_1})\,T(\theta^{r_1 + 1}, \ldots, \theta^{r_1 + r_2}, v_{s_1 + 1}, \ldots, v_{s_1 + s_2}).$$
Whether or not a tensor product is the consolidated version will normally be clear from the context, and so we will drop the word "consolidated". We can also extend to products of several tensors at a time. While it is easy to see that the tensor product defined above is associative, it is not commutative since the order of the slots is an issue. Let $T^*(V)$ denote the direct sum of all spaces of the form $T^{r}{}_{0}(V)$, where we take $T^{0}{}_{0}(V) := R$. The tensor product gives $T^*(V)$ the structure of an algebra over $R$ as long as we make the definition that $r \otimes A := rA$ for $r \in R$.
is a basis for T r8 (V). If 7 E T r8 (V), then
Proof. If 7 E T r8 (V; R) and we define 7 i1 "·}1 ... j. = 7(e~1, ... , e~r, eJ1 , • •• , ej.), then it is easy to check that
and so, in particular, our indexed set spans Trs(V; R). Indeed, if we denote the right hand side of the above equation by 7', we obtain (using the
7. Tensors
312
summation convention throughout) T'( ekl , ... , e kr , eli, ... , el. )
= Ti1 ...ir)1 ...)_. ei1 (e k1 ) ... eir (e kr )e31(el 1 ) ... ej• (ez • ) = Ti1 ... ir .
)1 ... ).
[/1 ... 8kr831 ... 8j8 ~1 ~r II z.
_
Tk1 ... kr
1I ... l.
= T (ekl , ... , e kr , elI , ... , eI. ) . Thus T' and T agree on basis elements, and by multilinearity T' = T. For independence, suppose Til "·~;I ... j. e'l 0 ... 0 e~r 0 ej1 0 ... 0 e)' - 0 for some n T +S elements T i1 ... ir )1 ... j. of R. This is an equality of multilinear maps, and if we apply both sides to (e k1 , ... , ekr , elI, ... , ez.), then we obtain T k1 .. .l_ - O. Since our choices were arbitrary, we see that all n T +S elements T'I ... ~r.)1 ...)_. are equal to O. 0
t.
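The expansion in Proposition 7.8 can be checked numerically for a tensor of type $(1,1)$ on $V = \mathbb{R}^3$. This is a minimal sketch under stated assumptions rather than anything taken from the text: the tensor is assumed to be given by a matrix `M` via $\tau(\alpha, v) = \alpha M v$, the basis vectors are the columns of a hypothetical invertible matrix `E`, and the dual basis consists of the rows of its inverse.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
M = rng.normal(size=(n, n))

# Columns of E form a basis e_1..e_n (unit upper triangular, hence invertible).
E = np.eye(n) + np.triu(rng.normal(size=(n, n)), k=1)
E_dual = np.linalg.inv(E)   # rows form the dual basis e^1..e^n, so e^i(e_j) = delta^i_j

def tau(alpha, v):
    """A sample (1,1)-tensor: tau(alpha, v) = alpha(M v)."""
    return alpha @ M @ v

# Components tau^i_j = tau(e^i, e_j), as in Proposition 7.8.
comp = np.array([[tau(E_dual[i], E[:, j]) for j in range(n)] for i in range(n)])

def tau_from_components(alpha, v):
    """Evaluate tau^i_j (e_i tensor e^j)(alpha, v) = tau^i_j alpha(e_i) e^j(v)."""
    a = alpha @ E      # a[i] = alpha(e_i)
    b = E_dual @ v     # b[j] = e^j(v)
    return np.einsum("ij,i,j->", comp, a, b)

alpha = rng.normal(size=n)   # a covector, written in standard coordinates
v = rng.normal(size=n)       # a vector, written in standard coordinates
```

In this model the component array equals $E^{-1} M E$, which is the familiar change-of-basis formula for the matrix of a linear map.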
As a special case we see that if $A \in T^{1}{}_{1}(V; R)$, then $A = A^{i}{}_{j}\, e_i \otimes e^j$, where $A^{i}{}_{j} = A(e^i, e_j)$. This theorem is a special case of Theorem D.29 of Appendix D, which we will also invoke below for spaces like $W \otimes V^*$. If we are dealing with tensors that are not in consolidated form, it should still be clear how to obtain a basis. For example, $\{e_i \otimes e^j \otimes e_k\}$ is a basis for $T^{1}{}_{1}{}^{1}(V; R)$, and a typical element $A$ would have an expansion
$$A = A^{i}{}_{j}{}^{k}\, e_i \otimes e^j \otimes e_k.$$
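Notice the purposeful staggered positioning of the indices in $A^{i}{}_{j}{}^{k}$.

Definition 7.9. The elements $\tau^{i_1 \ldots i_r}{}_{j_1 \ldots j_s}$ from the previous proposition are called the components of $\tau$ with respect to the basis $e_1, \ldots, e_n$.

Example 7.10. If $V = T_pM$ for some smooth manifold $M$, then we can use any basis of $T_pM$ we please. That said, we realize that if $p$ is in a coordinate chart $(U, x)$, then the vectors $\frac{\partial}{\partial x^1}\big|_p, \ldots, \frac{\partial}{\partial x^n}\big|_p$ form a basis for $T_pM$, and we may form a basis for $T^{r}{}_{s}(T_pM)$ consisting of all tensors of the form
$$\frac{\partial}{\partial x^{i_1}}\Big|_p \otimes \cdots \otimes \frac{\partial}{\partial x^{i_r}}\Big|_p \otimes dx^{j_1}\big|_p \otimes \cdots \otimes dx^{j_s}\big|_p.$$
For example, an element $A_p$ of $T^{1}{}_{1}(T_pM)$ can be expressed in coordinate form as
$$A_p = A^{i}{}_{j}\, \frac{\partial}{\partial x^{i}}\Big|_p \otimes dx^{j}\big|_p.$$
An element $A_p$ of $T^{r}{}_{s}(T_pM)$ can be written as
$$A_p = A^{i_1 \ldots i_r}{}_{j_1 \ldots j_s}\, \frac{\partial}{\partial x^{i_1}}\Big|_p \otimes \cdots \otimes \frac{\partial}{\partial x^{i_r}}\Big|_p \otimes dx^{j_1}\big|_p \otimes \cdots \otimes dx^{j_s}\big|_p,$$
and this is called the coordinate expression for $A_p$.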
The components of a tensor depend on the basis chosen, and a different choice will give new components related to the first by a transformation law. This is the content of the following exercise:

Exercise 7.11. Let $e_1, \ldots, e_n$ be a basis for $V$ and let $e^1, \ldots, e^n$ be the corresponding dual basis for $V^*$. If $\bar{e}_1, \ldots, \bar{e}_n$ is another basis for $V$ with
$$\bar{e}_i = C^{k}{}_{i}\, e_k,$$
then the dual basis $\bar{e}^1, \ldots, \bar{e}^n$ is related to $e^1, \ldots, e^n$ by $\bar{e}^i = (C^{-1})^{i}{}_{k}\, e^k$, where $C = (C^{i}{}_{j})$. Show that if $\tau^{i}{}_{jk}$ are the components of $\tau$ with respect to the first basis (and its dual) and if $\bar{\tau}^{i}{}_{jk}$ are the components with respect to the second basis, then
$$\bar{\tau}^{i}{}_{jk} = (C^{-1})^{i}{}_{a}\, \tau^{a}{}_{bc}\, C^{b}{}_{j}\, C^{c}{}_{k} \quad (\text{sum over } a, b, c).$$
This is a transformation law. What is the analogous statement for $\tau \in T^{r}{}_{s}(V; R)$?

Example 7.12. It is easy to show that for any basis (with corresponding dual basis) as above, the Kronecker delta tensor $\delta$ has components $\delta^i_j$, where $\delta^i_j = 0$ if $i \neq j$ and $\delta^i_i = 1$.

It is easy to show that if $S \in T^{1}{}_{2}(V)$ and $T \in T^{2}{}_{2}(V)$, then $S \otimes T$ has components given by
$$(S \otimes T)^{abc}{}_{defg} = S^{a}{}_{de}\, T^{bc}{}_{fg}.$$
More generally, if $S \in T^{r_1}{}_{s_1}(V)$ and $T \in T^{r_2}{}_{s_2}(V)$, then
$$(S \otimes T)^{a_1 \ldots a_{r_1} \alpha_1 \ldots \alpha_{r_2}}{}_{b_1 \ldots b_{s_1} \beta_1 \ldots \beta_{s_2}} = S^{a_1 \ldots a_{r_1}}{}_{b_1 \ldots b_{s_1}}\, T^{\alpha_1 \ldots \alpha_{r_2}}{}_{\beta_1 \ldots \beta_{s_2}}. \tag{7.2}$$
Notice the consolidation. If we choose not to employ consolidation, then the (unconsolidated) tensor product would be expressed differently in component form. For example,
$$(S \otimes T)^{a}{}_{de}{}^{bc}{}_{fg} = S^{a}{}_{de}\, T^{bc}{}_{fg}.$$
This way of treating the position of the indices is a convention that is called (naturally enough) "positional notation". For more on positional notation, see [Pe], [Dod-Pos] and [Stern].

If $V$ is a finitely generated free module, then we have a natural isomorphism $V \otimes V^* \cong L(V, V)$. We can be a bit more general. Let $W$ be another finitely generated free module. Consider the map $\iota : W \times V^* \to L(V, W)$ given by $(w, \alpha) \mapsto \iota(w, \alpha)$, where $\iota(w, \alpha)(v) := \alpha(v)w$.
This map is multilinear, and so using the universal property of tensor products we get a map $\tilde\Phi : W \otimes V^* \to L(V, W)$ such that $\tilde\Phi : w \otimes \alpha \mapsto \Phi(w, \alpha) \in L(V, W)$. Let us show that this map is an isomorphism. A given element $A \in W \otimes V^*$ may be written as $A = \sum_i w_i \otimes \alpha^i$, where the $w_i$ are linearly independent. Indeed, we just write $A$ in terms of a basis and collect terms. Then if $\tilde\Phi(A) = 0$, we have $\sum_i \alpha^i(v)\, w_i = 0$ for all $v$. Since the $w_i$ are linearly independent, $\alpha^i(v) = 0$ for all $v$ and all $i$. It follows that $A = 0$. We see that $\tilde\Phi$ is injective. Both $V$ and $W$ are finite-dimensional, and $W \otimes V^*$ and $L(V, W)$ have the same dimension. Indeed, $L(V, W)$ is isomorphic to the space of $m \times n$ matrices with entries from $R$, where $m$ and $n$ are the dimensions (or ranks) of $W$ and $V$ respectively. If $V$ and $W$ were vector spaces, this would imply that $\tilde\Phi$ is onto. It remains true that $\tilde\Phi$ is onto in the case where $W$ and $V$ are finite-dimensional free modules, but we must argue differently. In Problem 2 we ask the reader to show that if $e_1, \dots, e_n$ is a basis for $V$ and $f_1, \dots, f_m$ is a basis for $W$, then $\{\tilde\Phi(f_i \otimes e^j)\}$ is a basis for $L(V, W)$. Thus $\tilde\Phi : W \otimes V^* \to L(V, W)$ is an isomorphism. If $\tau = \tau^i_j\, f_i \otimes e^j$, then
\[ \tilde\Phi(\tau) : v \mapsto \tau^i_j\, e^j(v)\, f_i = \tau^i_j\, v^j f_i. \]
The components $\tau^i_j$ are exactly the entries of the matrix that represents $\tilde\Phi(\tau)$. When the above conditions hold, so that $\tilde\Phi$ is an isomorphism, we often identify $W \otimes V^*$ with $L(V, W)$ and write $\tau$ even when we mean $\tilde\Phi(\tau)$. We say that $\tau$ has two "interpretations". Under this identification
\[ (w \otimes \alpha)(v) = \alpha(v)\, w. \]
In component form, the two interpretations of $\tau$ show up as
\[ v \mapsto w, \quad\text{where } w^i = \tau^i_j\, v^j, \]
and
\[ (\alpha, v) \mapsto \tau(\alpha, v), \quad\text{where } \tau(\alpha, v) = \tau^i_j\, v^j \alpha_i. \]
In particular, we identify $T^1{}_1(V)$ with $L(V, V)$.

Remark 7.13 (Tensions of conventions). Both $W \otimes V^*$ and $V^* \otimes W$ can be identified with $L(V, W)$. For example, we may also interpret $\alpha \otimes w \in V^* \otimes W$ as the map $(\alpha \otimes w)(v) = \alpha(v)w$. If the reader looks at how we have consolidated the spaces in the definition of $T^r{}_s(V)$, it will be apparent that we have preferred $W \otimes V^*$ over $V^* \otimes W$. However, if one considers the case where the underlying ring is not commutative, it becomes clear that the isomorphism $V^* \otimes W \cong L(V, W)$ is correct for left modules while the other is correct for right modules ($V^*$ is a right module if $V$ is a left module, and vice versa). On the other hand, the identification $W \otimes V^* \cong L(V, W)$ is more natural for the conventions of matrix multiplication. For example, if we think of elements of $\mathbb{R}^n$ as column vectors and elements of $(\mathbb{R}^n)^*$ as row
vectors, then for $w \in \mathbb{R}^n$ and $\alpha \in (\mathbb{R}^n)^*$, the linear map $w \otimes \alpha$ is indeed given by the matrix $w\alpha$. The tension between standard matrix conventions and left modules is well known to algebraists. We think of modules over commutative rings as simultaneously both left and right modules.

If $V$ is finite-dimensional, then one can make various other reinterpretations of tensors:

Example 7.14. Suppose that $V$ is finite-dimensional. Then elements of $T^r{}_s(V)$ can be interpreted as members of $T^0{}_s(V; T^r{}_0(V))$ according to the prescription that $\tau(v_1, \dots, v_s)$ acts by
\[ \tau(v_1, \dots, v_s)(\alpha^1, \dots, \alpha^r) := \tau(\alpha^1, \dots, \alpha^r, v_1, \dots, v_s). \]
Similarly, elements of $T^r{}_s(V)$ can be interpreted as members of $T^r{}_0(V; T^0{}_s(V))$.

Example 7.15. Let $V$ be as in the previous example. Elements of $T^0{}_{s_1+s_2}(V)$ can be interpreted as members of $T^0{}_{s_1}(V; T^0{}_{s_2}(V))$ by the prescription
\[ \tau(v_1, \dots, v_{s_1})(w_1, \dots, w_{s_2}) := \tau(v_1, \dots, v_{s_1}, w_1, \dots, w_{s_2}). \]
One can easily see from the above examples that many reinterpretations are possible. One of the most common is where one interprets elements of $T^0{}_2(V, R)$ as elements of $L(V, V^*)$ according to
\[ \tau(v)(u) = \tau(v, u) \quad\text{for } u, v \in V. \]

Exercise 7.16. Show that under the identification of $T^1{}_1(V, R)$ with $L(V, V)$ we can interpret the Kronecker delta tensor as the identity map.

Definition 7.17. A covariant tensor $\tau \in T^0{}_s(V, W)$ is said to be symmetric if
\[ \tau(v_1, \dots, v_s) = \tau(v_{\sigma(1)}, \dots, v_{\sigma(s)}) \]
for all $v_1, \dots, v_s$ and all permutations $\sigma$ of the letters $\{1, 2, \dots, s\}$. We define a symmetric contravariant tensor similarly.

Definition 7.18. A covariant tensor $\tau \in T^0{}_s(V, W)$ is said to be alternating if
\[ \tau(v_1, \dots, v_s) = \operatorname{sgn}(\sigma)\, \tau(v_{\sigma(1)}, \dots, v_{\sigma(s)}) \]
for all $v_1, \dots, v_s$ and all permutations $\sigma$ of the letters $\{1, 2, \dots, s\}$, where $\operatorname{sgn}(\sigma) = 1$ if $\sigma$ is an even permutation and $-1$ if it is an odd permutation. $\operatorname{sgn}(\sigma)$ is called the sign of the permutation $\sigma$. We define an alternating contravariant tensor similarly.
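As an informal illustration of Definitions 7.17 and 7.18, the following Python sketch tests the two conditions for scalar-valued covariant tensors on $\mathbb{R}^n$. The helper names (`sign`, `is_symmetric`, `is_alternating`) are hypothetical and not notation from the text. The sketch assumes the tensor is given as a multilinear Python function, so that it is enough to test the conditions on basis vectors.

```python
from itertools import permutations, product


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based indices."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def basis(n):
    """Standard basis of R^n as tuples."""
    return [tuple(1 if i == j else 0 for j in range(n)) for i in range(n)]


def is_symmetric(tau, s, n):
    """Check tau(v_1..v_s) == tau(v_sigma(1)..v_sigma(s)) on basis vectors."""
    return all(
        tau(*vs) == tau(*(vs[k] for k in perm))
        for vs in product(basis(n), repeat=s)
        for perm in permutations(range(s))
    )


def is_alternating(tau, s, n):
    """Check tau(v_1..v_s) == sgn(sigma) tau(v_sigma(1)..) on basis vectors."""
    return all(
        tau(*vs) == sign(perm) * tau(*(vs[k] for k in perm))
        for vs in product(basis(n), repeat=s)
        for perm in permutations(range(s))
    )


def dot(u, v):
    """The standard inner product, a symmetric element of T^0_2(R^n)."""
    return sum(a * b for a, b in zip(u, v))


def det2(u, v):
    """The 2x2 determinant, an alternating element of T^0_2(R^2)."""
    return u[0] * v[1] - u[1] * v[0]
```

With these definitions, `is_symmetric(dot, 2, 2)` and `is_alternating(det2, 2, 2)` both hold, while the inner product is not alternating and the determinant is not symmetric.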
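If $\ell : V \to \bar V$ is a linear map, then the map $\ell^* : T^0{}_s(\bar V; W) \to T^0{}_s(V; W)$ is defined by
\[ (\ell^*\tau)(u_1, \dots, u_s) := \tau(\ell(u_1), \dots, \ell(u_s)). \]
$\ell^*\tau$ is called the pull-back of $\tau$ by $\ell$. It is easy to show that $\ell^*$ is linear. If we have linear maps $\ell : V_1 \to V_2$ and $\lambda : V_2 \to V_3$, then
\[ (\lambda \circ \ell)^* : T^0{}_s(V_3; W) \to T^0{}_s(V_1; W) \quad\text{and}\quad (\lambda \circ \ell)^* = \ell^* \circ \lambda^*. \]
Thus the pull-back $\ell \mapsto \ell^*$ defines a contravariant functor in the category of $W$-valued covariant tensors. (Because of this, one might wish that covariant tensors were called contravariant and vice versa, and indeed some authors have reversed the traditional terminology.)

Suppose that $(e_1, \dots, e_n)$ is a basis for $V$ and that $(f_1, \dots, f_m)$ is a basis for $\bar V$. If $\ell(e_i) = \ell^k_i f_k$, then for scalar-valued $\tau$ we have
\begin{align*}
(\ell^*\tau)_{i_1 \dots i_s} &= (\ell^*\tau)(e_{i_1}, \dots, e_{i_s}) = \tau(\ell(e_{i_1}), \dots, \ell(e_{i_s})) \\
&= \tau(\ell^{k_1}_{i_1} f_{k_1}, \dots, \ell^{k_s}_{i_s} f_{k_s}) = \ell^{k_1}_{i_1} \cdots \ell^{k_s}_{i_s}\, \tau(f_{k_1}, \dots, f_{k_s}) \\
&= \tau_{k_1 \dots k_s}\, \ell^{k_1}_{i_1} \cdots \ell^{k_s}_{i_s},
\end{align*}
which gives the component form of the pull-back operation in terms of the matrix $(\ell^k_i)$.

Proposition 7.19. If $\lambda \in L(V, \bar V)$, $\alpha \in T^0{}_{s_1}(\bar V)$ and $\beta \in T^0{}_{s_2}(\bar V)$, then
\[ \lambda^*(\alpha \otimes \beta) = \lambda^*\alpha \otimes \lambda^*\beta. \]

Proof. We have
\begin{align*}
\lambda^*(\alpha \otimes \beta)(u_1, \dots, u_{s_1}, u_{s_1+1}, \dots, u_{s_1+s_2})
&= (\alpha \otimes \beta)(\lambda u_1, \dots, \lambda u_{s_1}, \lambda u_{s_1+1}, \dots, \lambda u_{s_1+s_2}) \\
&= \alpha(\lambda u_1, \dots, \lambda u_{s_1})\, \beta(\lambda u_{s_1+1}, \dots, \lambda u_{s_1+s_2}) \\
&= \lambda^*\alpha(u_1, \dots, u_{s_1})\, \lambda^*\beta(u_{s_1+1}, \dots, u_{s_1+s_2}) \\
&= (\lambda^*\alpha \otimes \lambda^*\beta)(u_1, \dots, u_{s_1}, u_{s_1+1}, \dots, u_{s_1+s_2}). \qquad\square
\end{align*}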
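The component formula for the pull-back and the identity of Proposition 7.19 can be checked numerically in a small case. The following is only a sketch under stated assumptions: covariant tensors are modeled as Python functions of tuples, the map $\ell : \mathbb{R}^2 \to \mathbb{R}^3$ and the tensors `tau` and `alpha` are arbitrary illustrative choices, and the helper names (`components`, `apply_map`, `pullback`, `tensor`) are hypothetical.

```python
from itertools import product


def components(tau, basis_vectors, s):
    """Components tau_{i_1...i_s} = tau(e_{i_1}, ..., e_{i_s}) as a dict."""
    n = len(basis_vectors)
    return {
        idx: tau(*(basis_vectors[i] for i in idx))
        for idx in product(range(n), repeat=s)
    }


def apply_map(matrix, v):
    """Apply a linear map; matrix[k][i] plays the role of l^k_i."""
    return tuple(sum(row[i] * v[i] for i in range(len(v))) for row in matrix)


def pullback(ell, tau):
    """(l^* tau)(u_1, ..., u_s) := tau(l(u_1), ..., l(u_s))."""
    return lambda *us: tau(*(apply_map(ell, u) for u in us))


def tensor(alpha, s1, beta):
    """(alpha (x) beta)(u_1..u_{s1+s2}) = alpha(u_1..u_{s1}) * beta(rest)."""
    return lambda *us: alpha(*us[:s1]) * beta(*us[s1:])


ELL = [[1, 2], [0, 1], [3, -1]]  # an arbitrary linear map R^2 -> R^3
E2 = [(1, 0), (0, 1)]
E3 = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]


def tau(u, v):
    """An arbitrary bilinear form on R^3."""
    return u[0] * v[1] + 2 * u[2] * v[2]


def alpha(u):
    """An arbitrary linear form on R^3."""
    return u[0] + u[2]


# Components of l^* tau computed directly from the definition ...
direct = components(pullback(ELL, tau), E2, 2)

# ... and from the formula (l^* tau)_{i1 i2} = tau_{k1 k2} l^{k1}_{i1} l^{k2}_{i2}.
tau_k = components(tau, E3, 2)
by_formula = {
    (i1, i2): sum(
        tau_k[(k1, k2)] * ELL[k1][i1] * ELL[k2][i2]
        for k1 in range(3)
        for k2 in range(3)
    )
    for i1 in range(2)
    for i2 in range(2)
}

# Both sides of Proposition 7.19 for alpha (x) tau.
lhs = components(pullback(ELL, tensor(alpha, 1, tau)), E2, 3)
rhs = components(tensor(pullback(ELL, alpha), 1, pullback(ELL, tau)), E2, 3)
```

Here `direct` and `by_formula` agree entry by entry, and `lhs == rhs`, as the proposition predicts. Working with integer entries keeps the comparison exact.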
Before going on to study tensor fields, we introduce one more notion from multilinear algebra referred to as contraction.

Definition 7.20. Let $(e_1, \dots, e_n)$ be a basis for $V$ and $(e^1, \dots, e^n)$ the dual basis. If $\tau \in T^r{}_s(V)$, then for $k \le r$ and $l \le s$, we define $C^k_l\tau \in T^{r-1}{}_{s-1}(V)$ by
\[ C^k_l\tau(\theta^1, \dots, \theta^{r-1}, w_1, \dots, w_{s-1}) := \sum_{a=1}^n \tau(\theta^1, \dots, \underbrace{e^a}_{k\text{-th position}}, \dots, \theta^{r-1}, w_1, \dots, \underbrace{e_a}_{l\text{-th position}}, \dots, w_{s-1}). \]
This process is called contraction. Write the components of $\tau$ with respect to our basis as $\tau^{i_1 \dots i_r}{}_{j_1 \dots j_s}$. If we pick out an upper index, say $i_k$, and also a lower index, say $j_l$, then we obtain the components of the contracted tensor $C^k_l\tau$ by:
\[ (C^k_l\tau)^{i_1 \dots \widehat{i_k} \dots i_r}{}_{j_1 \dots \widehat{j_l} \dots j_s} := \tau^{i_1 \dots a \dots i_r}{}_{j_1 \dots a \dots j_s} \quad (\text{sum over } a). \]
Here the caret means omission. In practice, one often just writes
\[ \tau^{i_1 \dots \widehat{i_k} \dots i_r}{}_{j_1 \dots \widehat{j_l} \dots j_s} \]
instead of $(C^k_l\tau)^{i_1 \dots \widehat{i_k} \dots i_r}{}_{j_1 \dots \widehat{j_l} \dots j_s}$ as long as it has been made clear how the contraction was carried out. For example, one often sees expressions like $R_{ij} := R^{a}{}_{iaj}$. Notice that we always contract an upper index with a lower index. The map $C^k_l$ contracts the $k$-th upper index with the $l$-th lower index.

Consider a tensor of the form $v \otimes w \otimes \eta \otimes \theta \in T^2{}_2(V)$. One can show that
\[ C^1_1(v \otimes w \otimes \eta \otimes \theta) = \eta(v)\, w \otimes \theta. \]
Similarly,
\[ C^1_2(v \otimes w \otimes \eta \otimes \theta) = \theta(v)\, w \otimes \eta. \]
In general, $C^k_l$ acts on simple tensors $v_1 \otimes v_2 \otimes \cdots \otimes v_r \otimes \eta^1 \otimes \eta^2 \otimes \cdots \otimes \eta^s$ by an obvious extension of the above. Universal mapping properties can be invoked to give a basis-free definition of contraction. Contraction generalizes the notion of the trace of a linear transformation.

A common use of contraction involves first taking the tensor product of two tensors, and then performing a contraction of a contravariant slot of one with a covariant slot of the other. One often performs several contractions. For example, we may form a tensor that is given in components as
\[ \tau^{ac}{}_{efg} = S^{a}{}_{ke}\, T^{kc}{}_{fg} \quad (\text{sum over } k). \]
Evaluation of a tensor on its arguments is the result of a repeated contraction. For example, let $V$ have a basis $e_1, \dots, e_n$ and let $e^1, \dots, e^n$ be the dual basis for $V^*$ as above. If $v = v^i e_i$, $w = w^i e_i$, and $\alpha = \alpha_i e^i$, then for $\tau \in T^1{}_2(V)$ we easily deduce that
\[ \tau(\alpha, v, w) = \tau^i{}_{jk}\, \alpha_i v^j w^k, \tag{7.3} \]
which is the result of a repeated contraction on the tensor $\tau \otimes \alpha \otimes v \otimes w$. More generally, if we express elements $v_1, \dots, v_s \in V$ and $\alpha^1, \dots, \alpha^r \in V^*$ in terms of our basis and its dual, then for $\tau \in T^r{}_s(V)$, we have an analogous general expression for $\tau(\alpha^1, \dots, \alpha^r, v_1, \dots, v_s)$ in terms of the components of the tensor and its arguments.
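To make the index bookkeeping concrete, here is a small Python sketch of contraction and of formula (7.3) for a $(1,2)$-tensor on a two-dimensional space. The component values are arbitrary, the nested-list storage `tau[i][j][k]` for $\tau^i{}_{jk}$ is an assumption of the sketch, and the function names are hypothetical.

```python
from itertools import product

N = 2  # dimension of V

# Arbitrary components tau^i_{jk} of a (1,2)-tensor, stored as tau[i][j][k].
tau = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]


def contract_1_1(t):
    """(C^1_1 tau)_k = tau^a_{ak}: upper index with the first lower index."""
    return [sum(t[a][a][k] for a in range(N)) for k in range(N)]


def contract_1_2(t):
    """(C^1_2 tau)_j = tau^a_{ja}: upper index with the second lower index."""
    return [sum(t[a][j][a] for a in range(N)) for j in range(N)]


def evaluate(t, alpha, v, w):
    """tau(alpha, v, w) = tau^i_{jk} alpha_i v^j w^k, as in (7.3)."""
    return sum(
        t[i][j][k] * alpha[i] * v[j] * w[k]
        for i, j, k in product(range(N), repeat=3)
    )


def trace(a):
    """Contraction of a (1,1)-tensor a^i_j, i.e. the trace of its matrix."""
    return sum(a[i][i] for i in range(len(a)))
```

For instance, `contract_1_1(tau)` returns `[8, 10]` and `contract_1_2(tau)` returns `[7, 11]`, which shows that the choice of lower index matters. Evaluating on basis elements recovers a component: `evaluate(tau, (0, 1), (1, 0), (0, 1))` equals `tau[1][0][1]`.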
7.2. Bottom-Up Approach to Tensor Fields

There are two approaches to tensor fields on smooth manifolds that turn out to be equivalent (at least for finite-dimensional manifolds). We start with the "bottom-up" approach where we apply multilinear algebra first to individual tangent spaces. The second approach directly defines tensors on $M$ as tensors on the module $\mathfrak{X}(M)$.

Roughly speaking, a smooth $(r, s)$-tensor field on a manifold $M$ assigns to each $p \in M$ an element of $T^r{}_s(T_pM)$ in a smooth way. We are interested in making sense of smoothness for tensor fields, so we wish to view a tensor field as a section of an appropriate vector bundle (a tensor bundle). Let us start out being a bit more general by considering a real rank $k$ vector bundle $\xi = (E, \pi, M)$. For convenience, we take the typical fiber to be $\mathbb{R}^k$. Let $T^r{}_s(E) := \bigcup_{p \in M} T^r{}_s(E_p)$. We wish to construct a bundle $T^r{}_s(\xi)$ which has $T^r{}_s(E)$ as total space, $M$ as base space, and $T^r{}_s(E_p)$ as fiber over $p$. If $(U, \phi)$ is a VB-chart for $\xi$, then we construct a VB-chart for $T^r{}_s(\xi)$ in the following way: Recall that $\phi$ has the form $\phi = (\pi, \Phi)$, where $\Phi : \pi^{-1}U \to \mathbb{R}^k$ and where $\Phi_p := \Phi|_{E_p} : E_p \to \mathbb{R}^k$ is a linear isomorphism for each $p$. We obtain a map $\Phi^{r,s}_p : T^r{}_s(E_p) \to T^r{}_s(\mathbb{R}^k)$ by
\[ (\Phi^{r,s}_p \tau_p)(\alpha^1, \dots, \alpha^r, v_1, \dots, v_s) := \tau_p\big((\Phi_p)^*\alpha^1, \dots, (\Phi_p)^*\alpha^r, \Phi_p^{-1}v_1, \dots, \Phi_p^{-1}v_s\big). \]
These maps combine to give a map $\Phi^{r,s} : \pi^{-1}U \to T^r{}_s(\mathbb{R}^k)$ which is smooth (exercise). Our chart for $T^r{}_s(\xi)$ is
\[ \phi^{r,s} := (\pi, \Phi^{r,s}) : \pi^{-1}U \to U \times T^r{}_s(\mathbb{R}^k). \]
If desired, one can choose, once and for all, an isomorphism $T^r{}_s(\mathbb{R}^k) \cong \mathbb{R}^{k^{r+s}}$. A VB-atlas $\{(U_\alpha, \phi_\alpha)\}$ for $\xi = (E, \pi, M)$ gives a VB-atlas $\{(U_\alpha, \phi^{r,s}_\alpha)\}$ for $T^r{}_s(\xi)$.

Exercise 7.21. Show that there is a natural vector bundle isomorphism
\[ T^r{}_s(E) \cong (\otimes^r E) \otimes (\otimes^s E^*). \]

We leave it to the interested reader to prove the following useful theorem.

Proposition 7.22. Let $\xi = (E, \pi, M)$ be a vector bundle as above and let $\Upsilon : M \to T^r{}_s(E)$ be a map which assigns to each $p \in M$ an element of $T^r{}_s(E_p)$. Then $\Upsilon$ is smooth if and only if
\[ p \mapsto \Upsilon(p)(\alpha^1(p), \dots, \alpha^r(p), X_1(p), \dots, X_s(p)) \]
is smooth for all smooth sections $p \mapsto \alpha^i(p)$ and $p \mapsto X_i(p)$ of $E^* \to M$ and $E \to M$ respectively. The same statement is true if we use local sections.
The set of smooth sections of $T^r{}_s(\xi)$ is denoted $\Gamma(T^r{}_s(\xi))$. If $\Upsilon \in \Gamma(T^r{}_s(\xi))$, then for $X_1, \dots, X_s$ any smooth sections of $E \to M$ and $\alpha^1, \dots, \alpha^r$ smooth sections of $E^* \to M$, define $\Upsilon(\alpha^1, \dots, \alpha^r, X_1, \dots, X_s) \in C^\infty(M)$ by
\[ \Upsilon(\alpha^1, \dots, \alpha^r, X_1, \dots, X_s)(p) := \Upsilon_p(\alpha^1(p), \dots, \alpha^r(p), X_1(p), \dots, X_s(p)). \]
Now we have a map $\Upsilon : (\Gamma E^*)^r \times (\Gamma E)^s \to C^\infty(M)$. This map is clearly multilinear over $C^\infty(M)$, and we see that we can interpret elements of $\Gamma(T^r{}_s(\xi))$ as such maps when convenient. This extends the idea of thinking of a 1-form $\alpha$ as a $C^\infty(M)$-linear map $\mathfrak{X}(M) \to C^\infty(M)$.

Like most linear algebraic structures existing at the level of a single fiber $E_p$, the notion of tensor product is easily extended to the level of sections: For $\tau \in \Gamma(T^{r_1}{}_{s_1}(\xi))$ and $\eta \in \Gamma(T^{r_2}{}_{s_2}(\xi))$, we define the (consolidated) tensor product $\tau \otimes \eta \in \Gamma(T^{r_1+r_2}{}_{s_1+s_2}(\xi))$ by $(\tau \otimes \eta)(p) := \tau_p \otimes \eta_p$. Thus
\[ (\tau \otimes \eta)(p)(\alpha^1, \dots, \alpha^{r_1+r_2}, v_1, \dots, v_{s_1+s_2}) = \tau_p(\alpha^1, \dots, \alpha^{r_1}, v_1, \dots, v_{s_1})\, \eta_p(\alpha^{r_1+1}, \dots, \alpha^{r_1+r_2}, v_{s_1+1}, \dots, v_{s_1+s_2}) \]
for all $\alpha^i \in E_p^*$ and $v_i \in E_p$. Let $(s_1, \dots, s_k)$ be a local frame field for $\xi$ over an open set $U$ and let $\sigma^1, \dots, \sigma^k$ be the dual frame field of the dual bundle $E^* \to M$, so that $\sigma^i(s_j) = \delta^i_j$. Consider the set
\[ \{ s_{i_1} \otimes \cdots \otimes s_{i_r} \otimes \sigma^{j_1} \otimes \cdots \otimes \sigma^{j_s} : i_1, \dots, i_r, j_1, \dots, j_s = 1, \dots, k \}. \]
If $\tau \in \Gamma(T^r{}_s(\xi))$, then we have functions $\tau^{i_1 \dots i_r}{}_{j_1 \dots j_s} \in C^\infty(U)$ defined by $\tau^{i_1 \dots i_r}{}_{j_1 \dots j_s} = \tau(\sigma^{i_1}, \dots, \sigma^{i_r}, s_{j_1}, \dots, s_{j_s})$. It follows from Proposition 7.8 that $\tau$ (restricted to $U$) has the expansion
\[ \tau = \tau^{i_1 \dots i_r}{}_{j_1 \dots j_s}\, s_{i_1} \otimes \cdots \otimes s_{i_r} \otimes \sigma^{j_1} \otimes \cdots \otimes \sigma^{j_s}. \]
Also, applying equation (7.2) in each fiber $E_p$, we see that the component functions for $\tau \otimes \eta$ are given by
\[ (\tau \otimes \eta)^{i_1 \dots i_{r_1+r_2}}{}_{j_1 \dots j_{s_1+s_2}} = \tau^{i_1 \dots i_{r_1}}{}_{j_1 \dots j_{s_1}}\, \eta^{i_{r_1+1} \dots i_{r_1+r_2}}{}_{j_{s_1+1} \dots j_{s_1+s_2}}. \]
Here, and wherever convenient, we use the consolidated tensor product.

Notation 7.23. Whenever there is no chance of confusion, we will refer to $T^r{}_s(\xi)$ by $T^r{}_s(E) \to M$ or even just $T^r{}_s(E)$ (the latter is the notation for the total space of the bundle).

In the case of the tangent bundle $TM$, we have special terminology and notation:

Definition 7.24. The bundle $T^r{}_s(TM) \to M$ is called the $(r, s)$-tensor bundle on $M$.
By Exercise 7.21, $T^r{}_s(TM) \cong (\otimes^r TM) \otimes (\otimes^s T^*M)$, and this natural isomorphism is taken as an identification, so the latter bundle is also referred to as a tensor bundle. We now restrict ourselves to the case of the tangent bundle of a manifold but note that much of what follows makes sense for general vector bundles.

Definition 7.25. The space of sections $\Gamma(T^r{}_s(TM))$ is denoted by $\mathcal{T}^r{}_s(M)$ and its elements are referred to as $r$-contravariant $s$-covariant tensor fields or just type $(r, s)$-tensor fields. The space $\mathcal{T}^r{}_0(M)$ is denoted by $\mathcal{T}^r(M)$ and $\mathcal{T}^0{}_s(M)$ by $\mathcal{T}_s(M)$.

In summary, a smooth tensor field $A$ is a smooth assignment of a multilinear map on each tangent space of the manifold. Thus for each $p$, $A(p)$ is a multilinear map
\[ A(p) : (T_p^*M)^r \times (T_pM)^s \to \mathbb{R}, \]
or in other words, an element of $T^r{}_s(T_pM)$. Elements of $T^r{}_s(T_pM)$ are called tensors at $p$. We also write $A_p$ for $A(p)$.

Example 7.26. In Definition 6.42, we introduced the notion of a Riemannian metric on a real vector bundle. We saw that such metrics always exist. The most important case is where the bundle is the tangent bundle $TM$ of a manifold $M$. In this case, we say that we have a Riemannian metric on $M$. Thus a Riemannian metric on $M$ is an element of $\mathcal{T}_2(M)$ which is symmetric and positive definite at each point.

Of course the manifold in question could be an open submanifold $U$ of $M$, so we have a $C^\infty(U)$-module of $(r, s)$-tensor fields over that set, denoted $\mathcal{T}^r{}_s(U)$. The open subsets are partially ordered by inclusion $V \subset U$ and the tensor fields on these are related by restriction. Let $r^U_V : \mathcal{T}^r{}_s(U) \to \mathcal{T}^r{}_s(V)$ denote the restriction map. The assignment $U \mapsto \mathcal{T}^r{}_s(U)$ is an example of a presheaf and in fact a sheaf.

We will also sometimes deal with tensors with values in $TM$ (or in $T^*M$). First note that the space $T^r{}_s(T_pM; T_pM)$ of all multilinear maps $(T_p^*M)^r \times (T_pM)^s \to T_pM$ is a vector space. The set $T^r{}_s(TM; TM) := \bigcup_p T^r{}_s(T_pM; T_pM)$ can be given a smooth vector bundle structure in a way that is closely analogous to $T^r{}_s(TM) \to M$.

Definition 7.27. The space of sections $\Gamma(T^r{}_s(TM; TM))$ is denoted by $\mathcal{T}^r{}_s(M; TM)$ and its elements are referred to as $r$-contravariant $s$-covariant $TM$-valued tensor fields. Similarly, we may define $T^*M$-valued tensor fields.

Note that $TM$-valued tensor fields can be associated in a natural manner with ordinary tensor fields. For example, if $A \in \mathcal{T}^0{}_2(M; TM)$, then using
the same letter $A$ by abuse of notation, we may define an element $A \in \mathcal{T}^1{}_2(M)$ by
\[ A_p(\theta_p, v_p, w_p) = \theta_p(A_p(v_p, w_p)) \quad\text{for } \theta_p \in T_p^*M \text{ and } v_p, w_p \in T_pM. \]
Many such reinterpretations are possible. For this reason, we shall stick to studying ordinary tensors and tensor fields in what follows.

We shall define several operations on spaces of tensor fields. We would like each of these to be natural with respect to restriction. We already have one such operation: the tensor product. If $A \in \mathcal{T}^{r_1}{}_{s_1}(U)$ and $B \in \mathcal{T}^{r_2}{}_{s_2}(U)$ and $V \subset U$, then $r^U_V(A \otimes B) = r^U_V A \otimes r^U_V B$.

An $(r, s)$-tensor field $A$ may generally be expressed in a chart $(U, x)$ as
\[ A = A^{i_1 \dots i_r}{}_{j_1 \dots j_s}\, \frac{\partial}{\partial x^{i_1}} \otimes \cdots \otimes \frac{\partial}{\partial x^{i_r}} \otimes dx^{j_1} \otimes \cdots \otimes dx^{j_s}, \tag{7.4} \]
where $A^{i_1 \dots i_r}{}_{j_1 \dots j_s}$ are functions, $\frac{\partial}{\partial x^i} \in \mathfrak{X}(U)$ and $dx^j \in \mathfrak{X}^*(U)$. Actually, it is the restriction of $A$ to $U$ that can be written in this way, but because of the naturality of all the operations we introduce, it is generally safe to use the same letter to denote a tensor field and its restriction to an open set such as a chart domain. It is easy to show that
\[ A^{i_1 \dots i_r}{}_{j_1 \dots j_s} = A\Big(dx^{i_1}, \dots, dx^{i_r}, \frac{\partial}{\partial x^{j_1}}, \dots, \frac{\partial}{\partial x^{j_s}}\Big), \]
and so the components of a smooth tensor field are $C^\infty$ for every choice of coordinates $(U, x)$ by Proposition 7.22. Conversely, one can obviously define tensors that are not necessarily smooth sections of the appropriate tensor bundle, and then a tensor will be smooth exactly when its components with respect to every chart in an atlas are smooth. Evaluating the expression (7.4) above at a point $p \in U$ results in an expression such as that given at the end of Example 7.10.

Exercise 7.28 (Transformation laws). Suppose that we have two charts $(U, x)$ and $(V, \bar x)$. If $A \in \mathcal{T}^1{}_2(M)$ has components $A^i{}_{jk}$ in the first chart and $\bar A^i{}_{jk}$ in the second chart, then on the overlap $U \cap V$ we have
\[ \bar A^i{}_{jk} = A^a{}_{bc}\, \frac{\partial \bar x^i}{\partial x^a}\, \frac{\partial x^b}{\partial \bar x^j}\, \frac{\partial x^c}{\partial \bar x^k}, \]
where
\[ \frac{\partial}{\partial \bar x^i} = \frac{\partial x^b}{\partial \bar x^i}\, \frac{\partial}{\partial x^b} \quad\text{and}\quad d\bar x^i = \frac{\partial \bar x^i}{\partial x^a}\, dx^a. \]

This last exercise reveals the transformation law for tensor fields in $\mathcal{T}^1{}_2(M)$, and there is obviously an analogous law for tensor fields from $\mathcal{T}^r{}_s(M)$ for any values of $r$ and $s$. In some presentations, tensor fields are defined in terms of such transformation laws (see [L-R] for this approach). It should be emphasized again that there are two slightly different ways of reading
local expressions like the above. We may think of all of these functions as living on the manifold in the domain $U \cap V$. In this interpretation, we read the above as
\[ \bar A^i{}_{jk}(p) = A^a{}_{bc}(p)\, \frac{\partial \bar x^i}{\partial x^a}(p)\, \frac{\partial x^b}{\partial \bar x^j}(p)\, \frac{\partial x^c}{\partial \bar x^k}(p) \quad\text{for each } p \in U \cap V. \]
This is the default modern viewpoint. Alternatively, we could take $\frac{\partial \bar x^i}{\partial x^a}$ to be functions on $x(U \cap V)$ and write $\frac{\partial \bar x^i}{\partial x^a}(x^1, \dots, x^n)$. Then $\bar A^i{}_{jk}$ would refer to $\bar A^i{}_{jk} \circ x^{-1}(x^1, \dots, x^n)$, so that both sides of the equation are functions of variables which we abusively write as $(x^1, \dots, x^n)$. The first version seems theoretically pleasing, but for specific calculations that use familiar coordinates such as polar coordinates, the second version is often convenient. For example, suppose that a tensor $\tau$ has components with respect to rectangular coordinates on $\mathbb{R}^2$ given by $A_{jk}$ and we wish to find the components $\bar A_{jk}$ in polar coordinates. For indexing purposes, we take $(x, y) = (u^1, u^2)$ and $(r, \theta) = (v^1, v^2)$. Then we have
\[ \bar A_{jk} = A_{ab}\, \frac{\partial u^a}{\partial v^j}\, \frac{\partial u^b}{\partial v^k}, \tag{7.5} \]
which can be read so that both sides are functions of $(v^1, v^2)$ by writing $u^1$ and $u^2$ as functions of $(v^1, v^2)$, etc. Of course, the charts are there to "identify" open sets in Euclidean space with open sets on the manifold, so these viewpoints are really somehow the same after all. Using $(x, y)$ and $(r, \theta)$, the transformation (7.5) is given in matrix form as
\[ \begin{bmatrix} \bar A_{11} & \bar A_{12} \\ \bar A_{12} & \bar A_{22} \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -r\sin\theta & r\cos\theta \end{bmatrix} \begin{bmatrix} A_{11} & A_{12} \\ A_{12} & A_{22} \end{bmatrix} \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix}. \]

We now introduce the pull-back of a covariant tensor field, which will play a big role in the next chapter.

Definition 7.29. If $f : M \to N$ is a smooth map and $\tau \in \mathcal{T}_s(N)$, then we define the pull-back $f^*\tau \in \mathcal{T}_s(M)$ by
\[ f^*\tau(v_1, \dots, v_s)(p) = \tau(Tf \cdot v_1, \dots, Tf \cdot v_s) \]
for all $v_1, \dots, v_s \in T_pM$ and any $p \in M$.

Notice the connection of this with the pull-back defined earlier in a purely algebraic context. It is not hard to see that $f^* : \mathcal{T}_s(N) \to \mathcal{T}_s(M)$ is linear over $\mathbb{R}$, and for any $h \in C^\infty(N)$ and $\tau \in \mathcal{T}_s(N)$ we have $f^*(h\tau) = (h \circ f)\, f^*\tau$. If $f : M \to N$ and $g : N \to P$ are smooth maps, then of course $(g \circ f)^* = f^* \circ g^*$. Let us discover the local expression for pull-back. Choose a chart $(U, x)$ on $M$ and a chart $(V, y)$ on $N$ and assume that $f(U) \subset V$. Let us denote
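$\frac{\partial (y^i \circ f)}{\partial x^j}$ by $\frac{\partial y^i}{\partial x^j}$ for simplicity. We have
\[ T_pf \cdot \frac{\partial}{\partial x^i}\Big|_p = \frac{\partial y^k}{\partial x^i}(p)\, \frac{\partial}{\partial y^k}\Big|_{f(p)} \]
and
\begin{align*}
(f^*\tau)_{i_1 \dots i_s}(p) &= (f^*\tau)\Big(\frac{\partial}{\partial x^{i_1}}\Big|_p, \dots, \frac{\partial}{\partial x^{i_s}}\Big|_p\Big) = \tau\Big(Tf\, \frac{\partial}{\partial x^{i_1}}\Big|_p, \dots, Tf\, \frac{\partial}{\partial x^{i_s}}\Big|_p\Big) \\
&= \tau\Big(\frac{\partial y^{k_1}}{\partial x^{i_1}}(p)\, \frac{\partial}{\partial y^{k_1}}\Big|_{f(p)}, \dots, \frac{\partial y^{k_s}}{\partial x^{i_s}}(p)\, \frac{\partial}{\partial y^{k_s}}\Big|_{f(p)}\Big) \\
&= \tau\Big(\frac{\partial}{\partial y^{k_1}}\Big|_{f(p)}, \dots, \frac{\partial}{\partial y^{k_s}}\Big|_{f(p)}\Big)\, \frac{\partial y^{k_1}}{\partial x^{i_1}}(p) \cdots \frac{\partial y^{k_s}}{\partial x^{i_s}}(p) \\
&= \tau_{k_1 \dots k_s}(f(p))\, \frac{\partial y^{k_1}}{\partial x^{i_1}}(p) \cdots \frac{\partial y^{k_s}}{\partial x^{i_s}}(p).
\end{align*}
Thus we have
\[ (f^*\tau)_{i_1 \dots i_s} = (\tau_{k_1 \dots k_s} \circ f)\, \frac{\partial y^{k_1}}{\partial x^{i_1}} \cdots \frac{\partial y^{k_s}}{\partial x^{i_s}}. \]
This looks similar to a transformation law for a tensor, but here $f$ is not a change of coordinates and need not even be a diffeomorphism. Pull-back respects tensor products:

Exercise 7.30. Let $f : M \to N$ be as above. Show that for $\tau_1 \in \mathcal{T}_{s_1}(N)$ and $\tau_2 \in \mathcal{T}_{s_2}(N)$ we have $f^*(\tau_1 \otimes \tau_2) = f^*\tau_1 \otimes f^*\tau_2$.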
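As a numerical sanity check on (7.5) and on the local expression for pull-back, the following Python sketch transforms the components of the Euclidean metric $dx \otimes dx + dy \otimes dy$ on $\mathbb{R}^2$ to polar coordinates at one sample point. The function names and the sample point are illustrative assumptions, and floating-point arithmetic means the result is only approximately $\operatorname{diag}(1, r^2)$.

```python
import math


def jacobian(r, theta):
    """J[a][j] = d u^a / d v^j for (u^1, u^2) = (r cos(theta), r sin(theta))."""
    return [
        [math.cos(theta), -r * math.sin(theta)],
        [math.sin(theta), r * math.cos(theta)],
    ]


def transform(a, jac):
    """Abar_{jk} = A_{ab} J^a_j J^b_k, i.e. the matrix product J^T A J."""
    n = len(a)
    return [
        [
            sum(
                a[p][q] * jac[p][j] * jac[q][k]
                for p in range(n)
                for q in range(n)
            )
            for k in range(n)
        ]
        for j in range(n)
    ]


EUCLIDEAN = [[1.0, 0.0], [0.0, 1.0]]  # components of dx(x)dx + dy(x)dy

r, theta = 2.0, 0.7  # an arbitrary sample point
polar = transform(EUCLIDEAN, jacobian(r, theta))
```

Up to rounding, `polar` is the matrix with diagonal entries $1$ and $r^2 = 4$ and vanishing off-diagonal entries, which is the familiar expression $dr \otimes dr + r^2\, d\theta \otimes d\theta$.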
In the case that $f : M \to N$ is a diffeomorphism, the notion of pull-back can be extended to contravariant tensors and tensors of mixed covariance. For such a diffeomorphism, let $(Tf^{-1})^* : T_p^*M \to T_{f(p)}^*N$ denote the dual of the map $Tf^{-1} : T_{f(p)}N \to T_pM$.

Definition 7.31. If $f : M \to N$ is a diffeomorphism and $\tau$ is an $(r, s)$-tensor field on $N$, then define the pull-back $f^*\tau \in \mathcal{T}^r{}_s(M)$ by
\[ f^*\tau(\alpha^1, \dots, \alpha^r, v_1, \dots, v_s)(p) := \tau\big((Tf^{-1})^*\alpha^1, \dots, (Tf^{-1})^*\alpha^r, Tf \cdot v_1, \dots, Tf \cdot v_s\big) \]
for all $v_1, \dots, v_s \in T_pM$ and $\alpha^1, \dots, \alpha^r \in T_p^*M$ and any $p \in M$. The push-forward is then defined for $\tau \in \mathcal{T}^r{}_s(M)$ as $f_*\tau := (f^{-1})^*\tau$.
7.3. Top-Down Approach to Tensor Fields

Specializing what we learned from the discussion following Proposition 7.22 to the case of the tangent bundle, we see that a tensor field gives us a $C^\infty(M)$-multilinear map based on the module $\mathfrak{X}(M)$. This observation leads to an alternative definition of a tensor field over $M$. In this "top-down" view, we simply define an $(r, s)$-tensor field to be a $C^\infty(M)$-multilinear map
\[ \mathfrak{X}^*(M)^r \times \mathfrak{X}(M)^s \to C^\infty(M). \]
In this view, a tensor field is an element of $T^r{}_s(\mathfrak{X}(M))$. For example, a global covariant 2-tensor field on a manifold $M$ is a map $\tau : \mathfrak{X}(M) \times \mathfrak{X}(M) \to C^\infty(M)$ such that
\begin{align*}
\tau(f_1X_1 + f_2X_2, Y) &= f_1\tau(X_1, Y) + f_2\tau(X_2, Y), \\
\tau(Y, f_1X_1 + f_2X_2) &= f_1\tau(Y, X_1) + f_2\tau(Y, X_2)
\end{align*}
for all $f_1, f_2 \in C^\infty(M)$ and all $X_1, X_2, Y \in \mathfrak{X}(M)$.

As we shall see, it turns out that such $C^\infty(M)$-multilinear maps determine tensor fields in the sense of the previous section. If we take a top-down approach to tensor fields, then we must work to recover the presheaf/sheaf aspects. Indeed, it is not obvious what the relation is between $T^r{}_s(\mathfrak{X}(M))$ and $T^r{}_s(\mathfrak{X}(U))$ for some proper open subset $U \subset M$. Thinking purely in terms of modules makes the issue clear. The module $\mathfrak{X}(M)$ is not the same module as $\mathfrak{X}(U)$ unless $U = M$. A priori, there is no immediate reason to think that a multilinear map with arguments from the module $\mathfrak{X}(M)$ should be able to take elements of $\mathfrak{X}(U)$ as arguments! For instance, from the top-down viewpoint, how can we insert coordinate fields $\frac{\partial}{\partial x^i}$ and $dx^i$ into an element of $T^r{}_s(\mathfrak{X}(M))$ to get coordinate expressions if the chart domain is not all of $M$? We address this in the next section indirectly by showing how the top-down approach gives back tensors as sections (the bottom-up approach).

Another comment is that both $\mathfrak{X}(U)$ and $T^r{}_s(\mathfrak{X}(U))$ are finite-dimensional free modules over the ring $C^\infty(U)$ whenever $U$ is a chart domain or, more generally, the domain of a frame field. The reason is that a local frame field and its dual frame field provide a module basis for $\mathfrak{X}(U)$ and $\mathfrak{X}^*(U)$, and the latter really is the dual of the first in the module sense. On the other hand, the $C^\infty(M)$-modules $\mathfrak{X}(M)$ and $T^r{}_s(\mathfrak{X}(M))$ are not generally free unless $M$ is parallelizable.
7.4. Matching the Two Approaches to Tensor Fields

If we define a tensor field as we first did, that is, as a field of tensors in tangent spaces, then we immediately obtain a tensor as defined in the top-down approach. On the other hand, if $\tau$ is initially defined as a $C^\infty(M)$-multilinear map, then how should we recover a field of tensors on the tangent spaces?² Answering this is our next goal.

² This is exactly where things might not go so well if the manifold is not finite-dimensional. What we need is the existence of smooth cut-off functions. Some Banach manifolds support cut-off functions but not all do.
Proposition 7.32. Let $p \in M$ and $\tau \in T^r{}_s(\mathfrak{X}(M))$. Let $\theta_1, \dots, \theta_r$ and $\bar\theta_1, \dots, \bar\theta_r$ be smooth 1-forms such that $\theta_i(p) = \bar\theta_i(p)$ for $1 \le i \le r$; also let $X_1, \dots, X_s$ and $\bar X_1, \dots, \bar X_s$ be smooth vector fields such that $X_i(p) = \bar X_i(p)$ for $1 \le i \le s$. Then we have that
\[ \tau(\theta_1, \dots, \theta_r, X_1, \dots, X_s)(p) = \tau(\bar\theta_1, \dots, \bar\theta_r, \bar X_1, \dots, \bar X_s)(p). \]

Proof. The result will follow easily if we can show that
\[ \tau(\theta_1, \dots, \theta_r, X_1, \dots, X_s)(p) = 0 \]
whenever one of $\theta_1(p), \dots, \theta_r(p), X_1(p), \dots, X_s(p)$ is zero. We shall assume for simplicity of notation that $r = 1$ and $s = 2$. Now suppose that $X_1(p) = 0$. If $(U, x)$, with $x = (x^1, \dots, x^n)$, is a chart with $p \in U$, then $X_1|_U = \sum_i \xi^i \frac{\partial}{\partial x^i}$ for some smooth functions $\xi^i \in C^\infty(U)$. Let $\beta$ be a cut-off function with support in $U$ and $\beta(p) = 1$. Then for any smooth vector field $X$ defined on $U$ we can consider both $\beta X$ and $\beta^2 X$ to be globally defined and zero outside of $U$. Similarly, if $f$ is a smooth function defined on $U$, then $\beta f$ can be taken to be globally defined on $M$ and zero outside of $U$. Now $\beta^2 X_1 = \sum_i (\beta\xi^i)\big(\beta \frac{\partial}{\partial x^i}\big)$. (Notice that in this last expression we have used $\beta$ to extend both the functions $\xi^i$ and the coordinate fields $\frac{\partial}{\partial x^i}$, which is why we used $\beta^2$ rather than just $\beta$.) Thus
\begin{align*}
\beta^2\tau(\theta_1, X_1, X_2) &= \tau(\theta_1, \beta^2 X_1, X_2) = \tau\Big(\theta_1, \sum_i \beta^2\xi^i \frac{\partial}{\partial x^i}, X_2\Big) \\
&= \tau\Big(\theta_1, \sum_i (\beta\xi^i)\, \beta\frac{\partial}{\partial x^i}, X_2\Big) = \sum_i \beta\xi^i\, \tau\Big(\theta_1, \beta\frac{\partial}{\partial x^i}, X_2\Big).
\end{align*}
(Notice that the point of the above expression is that, at this moment, $\tau$ is defined only for global sections and is linear over global functions. For example, $\frac{\partial}{\partial x^i}$ is not a global section while $\beta\frac{\partial}{\partial x^i}$ is a global section.) Since $X_1(p) = 0$, we must have $\xi^i(p) = 0$ for all $i$. Also recall that $\beta(p) = 1$. Plugging $p$ into the formula above we obtain
\[ \tau(\theta_1, X_1, X_2)(p) = 0. \]
A similar argument holds when $X_2(p) = 0$ or $\theta_1(p) = 0$. Assume that $\bar\theta_1(p) = \theta_1(p)$, $\bar X_1(p) = X_1(p)$ and $\bar X_2(p) = X_2(p)$. Then we have
\begin{align*}
\tau(\bar\theta_1, \bar X_1, \bar X_2) - \tau(\theta_1, X_1, X_2)
= {}& \tau(\bar\theta_1 - \theta_1, \bar X_1, \bar X_2) + \tau(\theta_1, \bar X_1 - X_1, \bar X_2) \\
& + \tau(\theta_1, X_1, \bar X_2 - X_2).
\end{align*}
Since $\bar\theta_1 - \theta_1$, $\bar X_1 - X_1$, and $\bar X_2 - X_2$ are all zero at $p$, we obtain the result that $\tau(\bar\theta_1, \bar X_1, \bar X_2)(p) = \tau(\theta_1, X_1, X_2)(p)$. $\square$

Thus we have a natural correspondence between $T^r{}_s(\mathfrak{X}(M))$ and $\mathcal{T}^r{}_s(M)$ (the latter being smooth sections of the bundle $T^r{}_s(TM) \to M$). For example, if $A \in \mathcal{T}^1{}_3(M)$, then we obtain an element of $T^1{}_3(\mathfrak{X}(M))$, also denoted by $A$, by defining a smooth function $A(\theta, X, Y, Z)$ for given fields $(\theta, X, Y, Z)$ by
\[ A(\theta, X, Y, Z)(p) := A(p)(\theta(p), X(p), Y(p), Z(p)). \]
Conversely, if $A \in T^1{}_3(\mathfrak{X}(M))$, then we can use the above proposition to define an element of $\mathcal{T}^1{}_3(M)$, which we denote by the same letter. Given $A \in T^1{}_3(\mathfrak{X}(M))$, define $A(p) \in T^1{}_3(T_pM)$ for each $p$ as follows: For $X_p, Y_p, Z_p \in T_pM$ and $\theta_p \in T_p^*M$ we let
\[ A(p)(\theta_p, X_p, Y_p, Z_p) := A(\theta, X, Y, Z)(p), \]
where $\theta, X, Y, Z$ are any fields chosen so that $\theta(p) = \theta_p$, $X(p) = X_p$, $Y(p) = Y_p$, and $Z(p) = Z_p$. By Lemma 6.37 we can always find such extensions, and by Proposition 7.32 above, $A$ is well-defined. That $A$ so defined is smooth follows from Proposition 7.22. The general case should be clear and, all said, we end up with a natural isomorphism of $C^\infty(M)$-modules:
\[ T^r{}_s(\mathfrak{X}(M)) \cong \mathcal{T}^r{}_s(M). \]

Similar reasoning shows that there is a correspondence between fields of $TM$-valued tensors and $\mathfrak{X}(M)$-valued tensors on $\mathfrak{X}(M)$. For example, we have
\[ T^0{}_2(\mathfrak{X}(M); \mathfrak{X}(M)) \cong \mathcal{T}^0{}_2(M; TM). \tag{7.6} \]
Elements of $T^0{}_2(\mathfrak{X}(M); \mathfrak{X}(M))$ are $C^\infty(M)$-bilinear maps $\mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$, while elements of $\mathcal{T}^0{}_2(M; TM)$ are sections of the bundle whose fiber at $p$ is $T^0{}_2(T_pM; T_pM)$. So, if $A \in \mathcal{T}^0{}_2(M; TM)$, then for each $p$, $A(p)$ is an $\mathbb{R}$-multilinear map $T_pM \times T_pM \to T_pM$. Similarly, we have
\[ T^0{}_3(\mathfrak{X}(M); \mathfrak{X}(M)) \cong \mathcal{T}^0{}_3(M; TM). \tag{7.7} \]
In fact, later, when we define the curvature tensor on a semi-Riemannian manifold, it will initially be given as a multilinear map on modules of fields with values in $\mathfrak{X}(M)$. The correspondence is then invoked to get a tensor field (as defined in the bottom-up approach) with values in $TM$. It is just as easy to give a similar correspondence between $T^r{}_s(\Gamma(\xi))$ and $\Gamma(T^r{}_s(\xi))$ for some vector bundle $\xi = (E, \pi, M)$, where we view $\Gamma(\xi)$ as a $C^\infty(M)$-module.

Exercise 7.33. Exhibit the isomorphism (7.7) in detail.
Exercise 7.34. Suppose that $S$ and $T$ are tensors of the same type and we wish to show that they are equal. Then it is enough to check equality under the assumption that the vector fields inserted into the slots of $S$ and $T$ are locally defined and have vanishing Lie brackets. Hint: Think about coordinate vector fields.

We end this section with some warnings. It may seem that there is a simple way to obtain a pull-back by a smooth map $f : M \to N$ entirely from the top-down or module-theoretic view. In fact, one often sees expressions like
\[ f^*\tau(X_1, \dots, X_s) = \tau(f_*X_1, \dots, f_*X_s) \quad (\text{problematic expression!}). \]
This looks cute, but invites misunderstanding. The left hand side takes fields $X_1, \dots, X_s$ as arguments, while on the right hand side, if we consider $\tau$ as a multilinear map $\mathfrak{X}(N) \times \cdots \times \mathfrak{X}(N) \to C^\infty(N)$, then the $f_*X_i$ must be fields. But the push-forward map $f_*$ is generally not defined on fields, and even if it were, the above expression would seem to be an equality of a function on $M$ with a function on $N$. Note that $\mathfrak{X}(M)$ is a $C^\infty(M)$-module, while $\mathfrak{X}(N)$ is a $C^\infty(N)$-module. The above expression may be taken to mean something like $f^*\tau(X_1, \dots, X_s)(p) = \tau(Tf \cdot X_1(p), \dots, Tf \cdot X_s(p))$, but now the right hand side has tangent vectors as arguments, and we are back to the bottom-up approach! A correct statement is the following:

Proposition 7.35. Let $f : M \to N$ be a smooth map. Let $\tau$ be a $(0, s)$-tensor field on $N$. If $\tau$ and $f^*\tau$ are interpreted as elements of $T^0{}_s(\mathfrak{X}(N))$ and $T^0{}_s(\mathfrak{X}(M))$ respectively, then
\[ f^*\tau(X_1, \dots, X_s) = \tau(Y_1, \dots, Y_s) \circ f \]
whenever $Y_i$ is $f$-related to $X_i$ for $i = 1, \dots, s$.

Of course, we can use Definition 7.31 to make sense of both push-forward and pull-back in the case that $f$ is a diffeomorphism.
7.5. Tensor Derivations

We would like to be able to differentiate tensor fields. In particular, we would like to extend the Lie derivative to tensor fields. For this purpose we introduce the following definition, which will be useful not only for extending the Lie derivative, but also in several other contexts. Recall the presheaf of tensor fields $U \mapsto \mathcal{T}^r{}_s(U)$ on a manifold $M$.

Definition 7.36. A tensor derivation is a collection of maps $\mathcal{D}^r{}_s|_U : \mathcal{T}^r{}_s(U) \to \mathcal{T}^r{}_s(U)$, all denoted by $\mathcal{D}$ for convenience, such that
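(1) $\mathcal{D}$ is a presheaf map for $\mathcal{T}^r{}_s$ considered as a presheaf of vector spaces over $\mathbb{R}$. In particular, for all open $U$ and $V$ with $V \subset U$ we have
\[ \mathcal{D}A|_V = \mathcal{D}(A|_V) \]
for all $A \in \mathcal{T}^r{}_s(U)$, i.e., the restriction of $\mathcal{D}A$ to $V$ is just $\mathcal{D}(A|_V)$.

(2) $\mathcal{D}$ commutes with contractions.

(3) $\mathcal{D}$ satisfies a derivation law. Specifically, for $A \in \mathcal{T}^{r_1}{}_{s_1}(U)$ and $B \in \mathcal{T}^{r_2}{}_{s_2}(U)$ we have
\[ \mathcal{D}(A \otimes B) = \mathcal{D}A \otimes B + A \otimes \mathcal{D}B. \]

For smooth $n$-manifolds, the conditions (2) and (3) imply that for $A \in \mathcal{T}^r{}_s(U)$, $\alpha_1, \dots, \alpha_r \in \mathfrak{X}^*(U)$ and $X_1, \dots, X_s \in \mathfrak{X}(U)$, we have
\begin{align*}
\mathcal{D}(A(\alpha_1, \dots, \alpha_r, X_1, \dots, X_s)) = {}& (\mathcal{D}A)(\alpha_1, \dots, \alpha_r, X_1, \dots, X_s) \\
& + \sum_{i=1}^r A(\alpha_1, \dots, \mathcal{D}\alpha_i, \dots, \alpha_r, X_1, \dots, X_s) \tag{7.8} \\
& + \sum_{i=1}^s A(\alpha_1, \dots, \alpha_r, X_1, \dots, \mathcal{D}X_i, \dots, X_s).
\end{align*}
This follows by noticing that
\[ A(\alpha_1, \dots, \alpha_r, X_1, \dots, X_s) = C(A \otimes (\alpha_1 \otimes \cdots \otimes \alpha_r \otimes X_1 \otimes \cdots \otimes X_s)) \]
(where $C$ is the repeated contraction) and then applying (2) and (3). Note that $\mathcal{D}$ stands for a family of maps whose domains $\mathcal{T}^r{}_s(U)$ depend not only on $r$ and $s$, but also on $U$. The next proposition considers the situation where we only have derivations defined for $U = M$ (the global case).

Proposition 7.37. Let $M$ be a smooth manifold and suppose we have a map on globally defined tensor fields $\mathcal{D} : \mathcal{T}^r{}_s(M) \to \mathcal{T}^r{}_s(M)$ for all nonnegative integers $r, s$ such that (2) and (3) above hold for the case $U = M$. Then there is a unique induced tensor derivation that agrees with $\mathcal{D}$ on global sections, that is, on the various $\mathcal{T}^r{}_s(M)$.

Proof. We need to define $\mathcal{D} : \mathcal{T}^r{}_s(U) \to \mathcal{T}^r{}_s(U)$ for arbitrary open $U$ as a derivation. Let $\delta$ be a function in $C^\infty(U)$ that vanishes on a neighborhood $V$ of $p \in U$. We claim that $(\mathcal{D}\delta)(p) = 0$. To see this, let $\beta$ be a cut-off function equal to 1 on a neighborhood of $p$ and zero outside of $V$. Then $\delta = (1 - \beta)\delta$ and so
\[ \mathcal{D}\delta(p) = \mathcal{D}((1 - \beta)\delta)(p) = \delta(p)\, \mathcal{D}(1 - \beta)(p) + (1 - \beta(p))\, \mathcal{D}\delta(p) = 0. \]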
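Now given $\tau \in \mathcal{T}^r_s(U)$, let $\beta$ be a cut-off function with support in $U$ and equal to 1 on a neighborhood of $p \in U$. Then $\beta\tau \in \mathcal{T}^r_s(M)$ after extending by zero. Define $(\mathcal{D}\tau)(p) := \mathcal{D}(\beta\tau)(p)$. To show that this is well-defined, let $\beta_2$ be any other cut-off function with support in $U$ and equal to 1 on a neighborhood of $p \in U$. Then we have
$$\mathcal{D}(\beta\tau)(p) - \mathcal{D}(\beta_2\tau)(p) = \bigl(\mathcal{D}(\beta\tau) - \mathcal{D}(\beta_2\tau)\bigr)(p) = \mathcal{D}((\beta - \beta_2)\tau)(p) = 0,$$
where the last equality follows from our claim above with $\delta = \beta - \beta_2$. Thus $\mathcal{D}$ is well-defined on $\mathcal{T}^r_s(U)$. We now show that $\mathcal{D}\tau$ so defined is an element of $\mathcal{T}^r_s(U)$. Let $(U', x)$ be a chart with $p \in U' \subset U$. Then we can write $\tau|_{U'} \in \mathcal{T}^r_s(U')$ as
$$\tau|_{U'} = \tau^{i_1 \dots i_r}{}_{j_1 \dots j_s}\, \frac{\partial}{\partial x^{i_1}} \otimes \cdots \otimes \frac{\partial}{\partial x^{i_r}} \otimes dx^{j_1} \otimes \cdots \otimes dx^{j_s}.$$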
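We can use this to show that $\mathcal{D}\tau$ as defined agrees with a global section in a neighborhood $O$ of $p$ and so must be a smooth section itself since the choice of $p \in U$ was arbitrary. To save on notation, let us take the case $r = 1$, $s = 1$. Then $\tau|_{U'} = \tau^i_j\, dx^j \otimes \frac{\partial}{\partial x^i}$. Let $\beta$ be a cut-off function equal to 1 in the neighborhood $O$ of $p$ and zero outside of $U'$. Extend each of the sections $\beta\tau^i_j$, $\beta\, dx^j$, and $\beta \frac{\partial}{\partial x^i}$ to global sections and apply $\mathcal{D}$ to $\beta^3\tau = (\beta\tau^i_j)(\beta\, dx^j) \otimes (\beta \frac{\partial}{\partial x^i})$ to get
$$\mathcal{D}(\beta^3\tau) = \mathcal{D}(\beta\tau^i_j)\,(\beta\, dx^j) \otimes \Bigl(\beta \frac{\partial}{\partial x^i}\Bigr) + (\beta\tau^i_j)\,\mathcal{D}(\beta\, dx^j) \otimes \Bigl(\beta \frac{\partial}{\partial x^i}\Bigr) + (\beta\tau^i_j)\,(\beta\, dx^j) \otimes \mathcal{D}\Bigl(\beta \frac{\partial}{\partial x^i}\Bigr).$$
By assumption, $\mathcal{D}$ takes smooth global sections to smooth global sections, so both sides of the above equation are smooth. On the other hand, we have $\mathcal{D}(\beta^3\tau)(q) = (\mathcal{D}\tau)(q)$ by definition, valid for all $q \in O$. Thus $\mathcal{D}\tau$ is smooth and is the restriction of a smooth global section. This gives a unique derivation $\mathcal{D} : \mathcal{T}^r_s(U) \to \mathcal{T}^r_s(U)$ for all $U$ satisfying the naturality conditions (1), (2) and (3). We leave it to the reader to check this last statement. □

Exercise 7.38. Let $\mathcal{D}_1$ and $\mathcal{D}_2$ be two tensor derivations (so satisfying conditions (1), (2) and (3) of Definition 7.36) that agree on functions and vector fields. Then $\mathcal{D}_1 = \mathcal{D}_2$. [Hint: If $\alpha \in \mathfrak{X}^*(U) = \mathcal{T}^0_1(U)$, we must have $(\mathcal{D}_i\alpha)(X) = \mathcal{D}_i(\alpha(X)) - \alpha(\mathcal{D}_iX)$ for $i = 1, 2$. Then both $\mathcal{D}_1$ and $\mathcal{D}_2$ must obey formula (7.8) above.]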
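The identity used in the hint can be recovered directly from conditions (2) and (3) of Definition 7.36. The following short derivation is only a sketch, writing $C$ for the contraction $\mathcal{T}^1_1(U) \to C^\infty(U)$ and $\mathcal{D}$ for either of the two derivations:

```latex
\begin{align*}
\mathcal{D}(\alpha(X)) &= \mathcal{D}\bigl(C(\alpha \otimes X)\bigr)
   && \text{since } \alpha(X) = C(\alpha \otimes X) \\
 &= C\bigl(\mathcal{D}(\alpha \otimes X)\bigr)
   && \text{by (2)} \\
 &= C\bigl(\mathcal{D}\alpha \otimes X + \alpha \otimes \mathcal{D}X\bigr)
   && \text{by (3)} \\
 &= (\mathcal{D}\alpha)(X) + \alpha(\mathcal{D}X).
\end{align*}
```

Rearranging gives $(\mathcal{D}\alpha)(X) = \mathcal{D}(\alpha(X)) - \alpha(\mathcal{D}X)$, so the action of a tensor derivation on 1-forms is determined by its action on functions and vector fields.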
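Theorem 7.39. If $\mathcal{D}_U$ can be defined on $C^\infty(U)$ and $\mathfrak{X}(U)$ for each open $U \subset M$ so that

(1) $\mathcal{D}_U(fg) = (\mathcal{D}_Uf)\,g + f\,\mathcal{D}_Ug$ for all $f, g \in C^\infty(U)$,
(2) $(\mathcal{D}_Mf)|_U = \mathcal{D}_U(f|_U)$ for each $f \in C^\infty(M)$,
(3) $\mathcal{D}_U(fX) = (\mathcal{D}_Uf)\,X + f\,\mathcal{D}_UX$ for all $f \in C^\infty(U)$ and $X \in \mathfrak{X}(U)$,
(4) $(\mathcal{D}_MX)|_U = \mathcal{D}_U(X|_U)$ for each $X \in \mathfrak{X}(M)$,

then there is a unique tensor derivation $\mathcal{D}$ that is equal to $\mathcal{D}_U$ on $C^\infty(U)$ and $\mathfrak{X}(U)$ for all $U$.

Sketch of proof. We wish to define $\mathcal{D}$ on $\mathfrak{X}^*(U)$ so that
$$\mathcal{D}_U(\alpha \otimes X) = \mathcal{D}_U\alpha \otimes X + \alpha \otimes \mathcal{D}_UX. \tag{7.9}$$
By contraction we see that we must have $(\mathcal{D}_U\alpha)(X) = \mathcal{D}_U(\alpha(X)) - \alpha(\mathcal{D}_UX)$, which we take as the definition. Then check that (7.9) holds. Now define $\mathcal{D}_U$ by formula (7.8) and verify that we really have a map $\mathcal{T}^r_s(U) \to \mathcal{T}^r_s(U)$. Check that $\mathcal{D}_U$ commutes with contraction $C : \mathcal{T}^1_1(U) \to C^\infty(U)$ for simple tensors $\alpha \otimes X \in \mathcal{T}^1_1(U)$. Use the fact that, locally, every element of $\mathcal{T}^1_1$ can be written as a sum of simple tensors. Next extend to $\mathcal{T}^r_s$ along the lines exemplified by the case of $\mathcal{T}^1_2(U)$ and the contraction $C^1_2$ as follows: For $\tau \in \mathcal{T}^1_2(U)$, we have
$$\begin{aligned}
(\mathcal{D}_UC^1_2\tau)(X) &= \mathcal{D}_U\bigl((C^1_2\tau)(X)\bigr) - (C^1_2\tau)(\mathcal{D}_UX) \\
 &= \mathcal{D}_U\bigl(C(\tau(\cdot, X, \cdot))\bigr) - C\bigl(\tau(\cdot, \mathcal{D}_UX, \cdot)\bigr) \\
 &= C\bigl(\mathcal{D}_U(\tau(\cdot, X, \cdot)) - \tau(\cdot, \mathcal{D}_UX, \cdot)\bigr) \\
 &= C\bigl((\mathcal{D}_U\tau)(\cdot, X, \cdot)\bigr) = (C^1_2\mathcal{D}_U\tau)(X).
\end{aligned}$$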
The general case would involve an inconvenient profusion of parentheses. Uniqueness follows from Exercise 7.38. Finally check by direct calculation that (3) of Definition 7.36 holds.

Corollary 7.40. The Lie derivative $\mathcal{L}_X$ can be extended to a tensor derivation for any $X \in \mathfrak{X}(M)$.

The last corollary extends the Lie derivative to tensor fields. It follows from formula (7.8) that we have
$$(\mathcal{L}_XS)(Y_1, \dots, Y_s) = X\bigl(S(Y_1, \dots, Y_s)\bigr) - \sum_{i=1}^{s} S(Y_1, \dots, Y_{i-1}, \mathcal{L}_XY_i, Y_{i+1}, \dots, Y_s). \tag{7.10}$$
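As an illustration of (7.10), take $s = 2$ and recall that the Lie derivative of a vector field is the bracket, $\mathcal{L}_XY = [X, Y]$. The following display is a sketch of the resulting formula for a covariant 2-tensor field $S$; the names $Y$ and $Z$ for the two arguments are chosen here only for readability:

```latex
\[
(\mathcal{L}_X S)(Y, Z) = X\bigl(S(Y, Z)\bigr) - S([X, Y], Z) - S(Y, [X, Z]).
\]
```

In particular, when $Y$ and $Z$ both commute with $X$, the formula reduces to $(\mathcal{L}_XS)(Y, Z) = X(S(Y, Z))$.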
We now present a different way of extending the Lie derivative to tensor fields that is equivalent to what we have just done. First let $A \in \mathcal{T}^r_s(M)$ and recall that if $f : M \to M$ is a diffeomorphism, then we can define $f^*A \in \mathcal{T}^r_s(M)$ by
$$(f^*A)(p)(\alpha^1, \dots, \alpha^r, v_1, \dots, v_s) = A(f(p))\bigl((T_pf^{-1})^*(\alpha^1), \dots, (T_pf^{-1})^*(\alpha^r), T_pf(v_1), \dots, T_pf(v_s)\bigr)$$
for all $\alpha^1, \dots, \alpha^r \in (T_pM)^*$ and $v_1, \dots, v_s \in T_pM$. If $X$ is a vector field on $M$ (possibly locally defined), we can define
$$\mathcal{L}_XA = \frac{d}{dt}\Big|_{t=0} (\varphi^X_t)^*A, \tag{7.11}$$
where $\varphi^X_t$ denotes the flow of $X$, just as we did for vector fields. We leave it as a project for the reader to show that this definition agrees with our first definition of the Lie derivative of a tensor field. The Lie derivative on tensor fields is natural with respect to diffeomorphisms in the sense that for any diffeomorphism $f : M \to N$ and any vector field $X$ we have
$$\mathcal{L}_{f^*X}\, f^*A = f^*\mathcal{L}_XA. \tag{7.12}$$
This property is not shared by some other important derivations such as the covariant derivative, which we define later in this book.

Exercise 7.41. Show that the Lie derivative on tensor fields is natural with respect to diffeomorphisms in the above sense of equation (7.12) by using the fact that it is natural on functions and vector fields.
7.6. Metric Tensors

We start out again considering some linear algebra that we wish to globalize. Thus the vector space $V$ that we discuss next should be thought of as a tangent space of a manifold or a fiber of some vector bundle. We recall the following definitions: A symmetric bilinear form $g$ on a finite-dimensional vector space $V$ is nondegenerate if and only if $g(v, w) = 0$ for all $w \in V$ implies that $v = 0$. A (real) scalar product on a (real) finite-dimensional vector space $V$ is a nondegenerate symmetric bilinear form $g : V \times V \to \mathbb{R}$. A scalar product space is a pair $(V, g)$ where $V$ is a vector space and $g$ is a scalar product. We say that $g$ is positive (resp. negative) definite if $g(v, v) \geq 0$ (resp. $g(v, v) \leq 0$) for all $v \in V$ and $g(v, v) = 0 \implies v = 0$. In case the scalar product is positive definite, we also refer to it as an inner product and the pair $(V, g)$ as an inner product space. Otherwise we say that the scalar product is indefinite. A scalar product on $V$ is sometimes called a metric tensor on $V$. We now need to introduce quite a few more definitions.
Definition 7.42. The index of a symmetric bilinear form $g$ on $V$ is the largest dimension of a subspace $W \subset V$ such that the restriction $g|_W$ is negative definite. The index is denoted $\operatorname{ind}(g)$.

Definition 7.43. Let $(V, g)$ be a scalar product space. We say that $v$ and $w$ are mutually orthogonal if and only if $g(v, w) = 0$. Furthermore, given two subspaces $W_1$ and $W_2$ of $V$ we say that $W_1$ is orthogonal to $W_2$ and write $W_1 \perp W_2$ if and only if every element of $W_1$ is orthogonal to every element of $W_2$.

Since, in general, $g$ is not necessarily positive definite or negative definite, there may be nonzero elements that are orthogonal to themselves.

Definition 7.44. Given a subspace $W$ of a scalar product space $V$, we define the orthogonal complement as $W^\perp = \{v \in V : g(v, w) = 0 \text{ for all } w \in W\}$.

Exercise 7.45. We always have $\dim(W) + \dim(W^\perp) = \dim(V)$, but unless $g$ is definite, we may not have $W \cap W^\perp = \{0\}$.

Definition 7.46. A subspace $W$ of a scalar product space $(V, g)$ is called nondegenerate if $g|_W$ is nondegenerate.

Lemma 7.47. A subspace $W \subset (V, g)$ is nondegenerate if and only if $V = W \oplus W^\perp$ (inner direct sum).

Proof. This is an easy exercise. One uses the standard fact that
$$\dim W + \dim W^\perp = \dim(W + W^\perp) + \dim(W \cap W^\perp).$$
□
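A small worked example may help to see how Exercise 7.45 and Lemma 7.47 play out in the indefinite case. The choice of $V = \mathbb{R}^2$ with $g(x, y) = -x^1y^1 + x^2y^2$ is an assumption made here purely for illustration:

```latex
\begin{align*}
W &= \operatorname{span}\{(1,1)\}, \qquad g\bigl((1,1),(1,1)\bigr) = -1 + 1 = 0, \\
W^{\perp} &= \{ v \in V : -v^1 + v^2 = 0 \} = W, \\
\dim W + \dim W^{\perp} &= 1 + 1 = \dim V, \qquad W \cap W^{\perp} = W \neq \{0\}.
\end{align*}
```

Here $g|_W = 0$, so $W$ is a degenerate subspace and $V \neq W \oplus W^\perp$, in agreement with Lemma 7.47.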
It is a standard fact from linear algebra, already mentioned in Chapter 5, that if $g$ is a scalar product, then there exists a basis $e_1, \dots, e_n$ for $V$ such that the matrix representative of $g$ with respect to this basis is a diagonal matrix with ones or minus ones along the diagonal. Such a basis is called an orthonormal basis for $(V, g)$. The number of minus ones appearing is the index $\operatorname{ind}(g)$ and so is independent of the orthonormal basis chosen. It is easy to see that the index $\operatorname{ind}(g)$ is zero if and only if $g$ is positive definite.

Definition 7.48. For each $v \in V$ with $\langle v, v \rangle \neq 0$, let $\epsilon(v) := \operatorname{sgn}\langle v, v \rangle$. Then if $e_1, \dots, e_n$ are orthonormal, we have $\epsilon_i = \epsilon(i) := \epsilon(e_i)$.

Thus if $e_1, \dots, e_n$ is an orthonormal basis for $(V, g)$, then $g(e_i, e_j) = \epsilon_i\delta_{ij}$, where $\epsilon_i = g(e_i, e_i) = \pm 1$ are the entries of the diagonal matrix, $\operatorname{ind}(g)$ of which are equal to $-1$ and the remaining are equal to 1. Let us refer to the list of $\pm 1$'s given by $(\epsilon_1, \dots, \epsilon_n)$ as the signature. We may arrange for the $-1$'s to come first by permuting the elements of the basis. For example, if $(-1, -1, 1, 1)$ is the signature, then the index is 2.
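The index and signature can be computed from the matrix of $g$ in any basis by diagonalizing with simultaneous row and column operations, which amounts to a change of basis. The following is a minimal sketch in exact rational arithmetic; the function names are hypothetical, and the pivoting strategy is one simple choice among many rather than a method taken from the text.

```python
from fractions import Fraction


def _make_pivot_nonzero(a, k):
    """Change basis so that a[k][k] != 0, keeping the matrix symmetric."""
    n = len(a)
    # Case 1: some later diagonal entry is nonzero; swap basis vectors k and j.
    for j in range(k + 1, n):
        if a[j][j] != 0:
            a[k], a[j] = a[j], a[k]
            for row in a:
                row[k], row[j] = row[j], row[k]
            return
    # Case 2: all remaining diagonal entries vanish; replace e_k by e_k + e_j,
    # which makes the new diagonal entry equal to 2 * a[k][j].
    for j in range(k + 1, n):
        if a[k][j] != 0:
            for c in range(n):
                a[k][c] += a[j][c]
            for r in range(n):
                a[r][k] += a[r][j]
            return
    raise ValueError("the form is degenerate")


def index_and_signature(g):
    """Return (ind(g), signature) for a nondegenerate symmetric matrix g."""
    n = len(g)
    a = [[Fraction(x) for x in row] for row in g]
    for k in range(n):
        if a[k][k] == 0:
            _make_pivot_nonzero(a, k)
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for c in range(n):  # row operation: e_i -> e_i - f * e_k
                a[i][c] -= f * a[k][c]
            for r in range(n):  # matching column operation
                a[r][i] -= f * a[r][k]
    # Only the signs of the diagonal entries matter; list the -1's first.
    signs = sorted(1 if a[i][i] > 0 else -1 for i in range(n))
    return signs.count(-1), tuple(signs)


hyperbolic_plane = [[0, 1], [1, 0]]  # g(x, y) = x^1 y^2 + x^2 y^1
print(index_and_signature(hyperbolic_plane))
```

For the diagonal matrix with entries $-1, -1, 1, 1$ the function returns index 2, matching the example in the text.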
Remark 7.49. From now on, whenever context allows, we shall always assume that by "orthonormal basis" we mean an orthonormal basis that is arranged so that the $-1$'s come first as described above. The convention of putting the minus signs first is not universal, and in fact we used the opposite convention in Chapter 5. The negative-signs-first convention is popular in relativity theory and semi-Riemannian geometry, but the reverse convention is perhaps more common in Lie group theory and quantum field theory. It makes no difference in the final analysis as long as one is consistent, but it can be confusing when comparing references in the literature.

Another difference between the theory of positive definite scalar products and indefinite scalar products is the appearance of the $\epsilon_i$'s from the signature in formulas that would be familiar in the positive definite case. For example, we have the following:

Proposition 7.50. Let $e_1, \dots, e_n$ be an orthonormal basis for $(V, g)$. For any $v \in V$, we have a unique expansion given by $v = \sum_i \epsilon_i \langle v, e_i \rangle e_i$.

Proof. The usual proof works. One just has to notice the appearance of the $\epsilon_i$'s. □

Definition 7.51. If $v \in V$, then let $\|v\|$ denote the nonnegative number $|g(v, v)|^{1/2}$ and call this the (absolute or positive) length or norm of $v$.

Some authors call $g(v, v)$ or $g(v, v)^{1/2}$ the norm, which would make it possible for the norm to be negative or even complex-valued. We will avoid this.

Just as for positive definite inner product spaces, we call a linear isomorphism $\Phi : (V_1, g_1) \to (V_2, g_2)$ from one scalar product space to another an isometry if $g_1(v, w) = g_2(\Phi v, \Phi w)$. It is not hard to show that if such an isometry exists, then $g_1$ and $g_2$ have the same index and signature.

Let $(V_i, g_i)$ be scalar product spaces for $i = 1, \dots, k$. By Corollary D.35 of Appendix D, there is a unique bilinear form
$$\langle \cdot, \cdot \rangle : \bigotimes_{i=1}^{k} V_i \times \bigotimes_{i=1}^{k} V_i \to \mathbb{R}$$
such that for $v_i \in V_i$ and $w_i \in V_i$,
$$\langle v_1 \otimes \cdots \otimes v_k, w_1 \otimes \cdots \otimes w_k \rangle = g_1(v_1, w_1) \cdots g_k(v_k, w_k).$$
The form
= g(v, w).
Denote the inverse by $g^\sharp : V^* \to V$. We force this to be an isometry by defining the scalar product on $V^*$ to be
$$g^*(\alpha, \beta) = g(g^\sharp(\alpha), g^\sharp(\beta)).$$
Under this prescription, the dual basis $(e^1, \dots, e^n)$ corresponding to an orthonormal basis $(e_1, \dots, e_n)$ for $V$ will also be orthonormal. The signatures (and hence the indexes) of $g^*$ and $g$ are the same.

Notation 7.52. When convenient, we shall also denote $g_\flat(v)$ by either $\flat v$ or $v^\flat$ and similarly for $g^\sharp(\alpha)$.

The above procedure now applies to give a scalar product on any tensor space $T^r_s(V)$. For example, consider $T^1_1(V) = V \otimes V^*$. Then there is a unique scalar product $g^1_1$ on $V \otimes V^*$ such that for $v_1 \otimes \alpha_1$ and $v_2 \otimes \alpha_2 \in V \otimes V^*$ we have $g^1_1(v_1 \otimes \alpha_1, v_2 \otimes \alpha_2) = g(v_1, v_2)\, g^*(\alpha_1, \alpha_2)$. One can then see that for orthonormal $e_1, \dots, e_n$ we have that
$$\{e_i \otimes e^j\}_{1 \leq i, j \leq n}$$
is an orthonormal basis for $(T^1_1(V), g^1_1)$. In general, if we endow $T^r_s(V)$ with a scalar product as above, then the natural basis for $T^r_s(V)$ formed from the orthonormal basis $(e_1, \dots, e_n)$ (and its dual $(e^1, \dots, e^n)$) will also be orthonormal.

Notation 7.53. In order to reduce notational clutter, let us reserve the option to denote all these scalar products coming from $g$ by the same letter $g$ or, even more conveniently, by $\langle \cdot, \cdot \rangle$. So, for example, $\langle v_1 \otimes \alpha_1, v_2 \otimes \alpha_2 \rangle = \langle v_1, v_2 \rangle \langle \alpha_1, \alpha_2 \rangle$ by definition.

Exercise 7.54. Show that under the natural identification of $V \otimes V^*$ with $L(V, V)$, the scalar product of linear transformations $A$ and $B$ is given by $\langle A, B \rangle = \operatorname{trace}(A^tB)$.

The maps $g_\flat$ and $g^\sharp$ are called musical isomorphisms. Let us see how things look in terms of components. Let $f_1, \dots, f_n$ be an arbitrary basis of $V$ and let $f^1, \dots, f^n$ be the dual basis for $V^*$. The components of $g$ are given by $g_{ij} := g(f_i, f_j)$. So if $v = v^if_i$ and $w = w^if_i$, then
$$g(v, w) = g(v^if_i, w^jf_j) = v^iw^jg(f_i, f_j) = g_{ij}v^iw^j,$$
where we continue to use the Einstein summation convention. There must be a matrix $(A_{ij})$ such that $\flat f_i = A_{ki}f^k$. On the other hand,
$$g_{ij} = g(f_i, f_j) = (\flat f_i)(f_j) = A_{ki}f^k(f_j) = A_{ki}\delta^k_j = A_{ji}.$$
So we have $\flat f_i = g_{ik}f^k$.

Thus if $v = v^if_i$, then $\flat v = v^j\, \flat f_j = v^jg_{ji}f^i$, so the components of $\flat v$ are $(\flat v)_i = v^jg_{ji} = g_{ij}v^j$. It is a common convention that if $v^i$ are the components of $v$ with respect to a basis, then the components of $\flat v$ are denoted simply by lowering the index:
$$v_i := g_{ij}v^j.$$
The map $g_\flat$ is called the flatting operator and the effect of this operator is sometimes described as "index lowering". If we write $\sharp f^i = g^{ij}f_j$ for some matrix $(g^{ij})$, then $(g^{ij}) = (g_{ij})^{-1}$ so that
$$g^{ik}g_{kj} = \delta^i_j.$$
This follows from $f^i = \flat\sharp f^i = g^{ik}\, \flat f_k = g^{ik}g_{kj}f^j$. If $\omega \in V^*$ is written as $\omega = \omega_if^i$, then an easy calculation shows that the components of $\sharp\omega$ are given by $\omega^i := (\sharp\omega)^i = g^{ik}\omega_k$. The isomorphism $g^\sharp$ is called the sharping operator and its effect is referred to as "index raising". The scalar product $g^*$ on $V^*$ introduced above has components $g^{ij}$ with respect to the dual basis. Indeed,
$$g^*(f^i, f^j) = g(\sharp f^i, \sharp f^j) = g(g^{ik}f_k, g^{jl}f_l) = g^{ik}g^{jl}g_{kl} = g^{jl}\delta^i_l = g^{ji} = g^{ij}.$$
Next we see how to extend the notions of index raising and lowering to tensors. Suppose we have a tensor $A \in T^2_2(V)$ and we wish to obtain a new related tensor $A'$ whose final slot takes elements of $V^*$ rather than elements of $V$. Thus we want $A'$ to be of type $({}^{2}{}_{1}{}^{1})$. The trick is to define $A'$ using $\sharp$ as follows: $A'(\omega, \eta, v, \alpha) := A(\omega, \eta, v, \sharp\alpha)$. Let us compute the components of $A'$ with respect to our basis. We have
$$(A')^{ij}{}_{k}{}^{l} = A'(f^i, f^j, f_k, f^l) = A(f^i, f^j, f_k, \sharp f^l) = A(f^i, f^j, f_k, g^{lr}f_r) = g^{lr}A^{ij}{}_{kr}.$$
It is common to write $A^{ij}{}_{k}{}^{l}$ in place of $(A')^{ij}{}_{k}{}^{l}$ when the context makes the meaning clear. In other words, we use the same root symbol but reposition the indices. This is an instance of index raising. Similarly we might use the flatting operation to obtain a new tensor from $A$. For $\omega \in V^*$ and $u, v, w \in V$ we could define $A'$ by
$$A'(u, \omega, v, w) := A(\flat u, \omega, v, w)$$
and the components of $A'$ would be given by
$$(A')_{i}{}^{j}{}_{kl} := g_{ir}A^{rj}{}_{kl},$$
and again it is common to see simply $A_{i}{}^{j}{}_{kl}$. This process of raising and lowering indices is called type changing. Notice that we often obtain tensors that are not in consolidated form. However, one may simply apply consolidation as desired. But notice that the staggering of position makes the relation of $A_{i}{}^{j}{}_{kl}$ to the original tensor $A^{ij}{}_{kl}$ clear. This is where positional index notation excels.

The above can be approached in a slightly different way. We can take tensor products of various combinations of $g_\flat$, $g^\sharp$ and the identity maps $\mathrm{id}_V$ and $\mathrm{id}_{V^*}$. For example,
$$g_\flat \otimes \mathrm{id}_V \otimes g_\flat \otimes g^\sharp \otimes \mathrm{id}_{V^*} : V \otimes V \otimes V \otimes V^* \otimes V^* \to V^* \otimes V \otimes V^* \otimes V \otimes V^*.$$
Depending on convenience, this map might then be followed by the consolidation isomorphism $V^* \otimes V \otimes V^* \otimes V \otimes V^* \to V^* \otimes V^* \otimes V^* \otimes V \otimes V$.

Exercise 7.55. Show that the map $g_\flat \otimes \mathrm{id}_V \otimes g_\flat \otimes g^\sharp \otimes \mathrm{id}_{V^*}$ effects three iterated type changes and is given in component form by
$$A^{ijk}{}_{lm} \mapsto A_{i}{}^{j}{}_{k}{}^{l}{}_{m} := A^{ajb}{}_{cm}\, g_{ia}\, g_{kb}\, g^{cl}.$$
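The component formulas for the flatting and sharping operators, $(\flat v)_i = g_{ij}v^j$ and $(\sharp\omega)^i = g^{ij}\omega_j$, can be checked numerically. The sketch below uses exact rational arithmetic; the function names and the particular indefinite matrix `g` are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction


def inverse(g):
    """Gauss-Jordan inverse in exact arithmetic (assumes g is invertible)."""
    n = len(g)
    a = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(g)]
    for c in range(n):
        # Find a row with a nonzero entry in column c and move it into place.
        pivot = next(r for r in range(c, n) if a[r][c] != 0)
        a[c], a[pivot] = a[pivot], a[c]
        p = a[c][c]
        a[c] = [x / p for x in a[c]]
        for r in range(n):
            if r != c and a[r][c] != 0:
                f = a[r][c]
                a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return [row[n:] for row in a]


def flat(g, v):
    """Lower an index: (flat v)_i = g_ij v^j."""
    n = len(g)
    return [sum(g[i][j] * v[j] for j in range(n)) for i in range(n)]


def sharp(g, w):
    """Raise an index: (sharp w)^i = g^ij w_j, with (g^ij) the inverse matrix."""
    g_inv = inverse(g)
    n = len(g)
    return [sum(g_inv[i][j] * w[j] for j in range(n)) for i in range(n)]


def dual_product(g, a, b):
    """g*(a, b) = g^ij a_i b_j, the induced scalar product on the dual space."""
    g_inv = inverse(g)
    n = len(g)
    return sum(g_inv[i][j] * a[i] * b[j] for i in range(n) for j in range(n))


g = [[2, 1], [1, -1]]  # symmetric with determinant -3, hence indefinite
v, w = [1, 2], [3, -1]
print(flat(g, v))            # components v_i
print(sharp(g, flat(g, v)))  # raising the lowered index recovers v
```

Because $g^\sharp$ is defined to be an isometry, `dual_product(g, flat(g, v), flat(g, w))` agrees with $g(v, w) = g_{ij}v^iw^j$.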
In the presence of a scalar product, a type-changed tensor is considered to be just a different manifestation of the original tensor. We say that it is metrically equivalent. The reader should expect to see some slight variability with regard to how index positioning and order of slots is handled when type changing is done. For example, there is nothing stopping us from raising the $a$-th lower index into the $b$-th upper position while keeping everything in consolidated form:
$$(A')^{i_1 \dots i_{r+1}}{}_{j_1 \dots j_{s-1}} := g^{i_bk}\, A^{i_1 \dots \widehat{i_b} \dots i_{r+1}}{}_{j_1 \dots j_{a-1}\, k\, j_a \dots j_{s-1}}.$$
Invariantly, this is described as
$$A'(\alpha^1, \dots, \alpha^{r+1}, v_1, \dots, v_{s-1}) := A(\alpha^1, \dots, \widehat{\alpha^b}, \dots, \alpha^{r+1}, v_1, \dots, v_{a-1}, g^\sharp(\alpha^b), v_a, \dots, v_{s-1}).$$
Notice that if we raise all the lower indices and lower all the upper indices on a tensor, then we can "completely contract" against another tensor of the original type. We leave it to the reader to show that the result is the scalar product of tensors defined earlier. For example, let $\chi = \sum \chi_{ij}f^i \otimes f^j$ and $\tau = \sum \tau_{ij}f^i \otimes f^j$. We may apply two type changes to $\tau$ that are given in component form as
$$\tau_{ij} \mapsto \tau^{i}{}_{j} \mapsto \tau^{ij}.$$
In other words,
$$\tau^{ij} = g^{ik}g^{jl}\tau_{kl}.$$
Then
$$\langle \chi, \tau \rangle = \chi_{ij}\tau^{ij}.$$
See Problem 18.

7.6.1. Metrics on manifolds. If $g \in \mathcal{T}^0_2(M)$ is nondegenerate, symmetric and positive definite at every tangent space, we call $g$ a Riemannian metric (tensor). If $g$ is a Riemannian metric, then we call the pair $(M, g)$ a Riemannian manifold. For example, in Chapter 4, we saw how a hypersurface in $\mathbb{R}^n$ inherits a Riemannian metric. This works just as well for a regular submanifold $M \subset \mathbb{R}^n$ of arbitrary codimension. In Riemannian geometry, it is the metric that is the basis for generalizations of length, volume and so on. Motivated by a desire to generalize and to include the mathematics needed for general relativity, we also allow the metric to be indefinite. In this case, some nonzero tangent vectors $v$ might have zero or negative self-scalar product $\langle v, v \rangle$. If $g \in \mathcal{T}^0_2(M)$ is a symmetric tensor field, then we say that it is nondegenerate if $g_p$ is nondegenerate on $T_pM$ for every $p$. If furthermore $g_p$ has the same index for all $p$, then we say it has constant index.

Definition 7.56. If $g \in \mathcal{T}^0_2(M)$ is symmetric, nondegenerate and has constant index on $M$, then we call $g$ a semi-Riemannian metric and $(M, g)$ a semi-Riemannian manifold or pseudo-Riemannian manifold. The index is called the index of $(M, g)$ and denoted $\operatorname{ind}(g)$ or $\operatorname{ind}(M)$. The signature is also constant and so the manifold has a signature also. If the signature of a semi-Riemannian manifold (with $\dim(M) \geq 2$) is $(-1, +1, +1, +1, \dots)$ (or according to some conventions $(1, -1, -1, -1, \dots)$), then the manifold is called a Lorentz manifold.

The simplest semi-Riemannian manifolds are the spaces $\mathbb{R}^n_\nu$, which are the spaces $\mathbb{R}^n$ endowed with the scalar products given by
$$\langle x, y \rangle_\nu = -\sum_{i=1}^{\nu} x^iy^i + \sum_{i=\nu+1}^{n} x^iy^i.$$
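The scalar product $\langle x, y \rangle_\nu$ is easy to evaluate directly, and doing so shows the three possible signs of the self-scalar product of a nonzero vector. The sketch below is illustrative only; the function name and the sample vectors are assumptions made here.

```python
def scalar_product(x, y, nu):
    """<x, y>_nu on R^n_nu: the first nu coordinates enter with a minus sign."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum((-1 if i < nu else 1) * a * b
               for i, (a, b) in enumerate(zip(x, y)))


# Three nonzero vectors in R^4_1 (index nu = 1).
negative = (1, 0, 0, 0)
zero = (1, 1, 0, 0)
positive = (0, 1, 0, 0)

for v in (negative, zero, positive):
    print(v, scalar_product(v, v, 1))
```

With `nu = 0` the same function returns the usual Euclidean inner product.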
Since ordinary Euclidean geometry does not use indefinite scalar products, we shall call the spaces $\mathbb{R}^n_\nu$ semi-Euclidean spaces when the index $\nu = \operatorname{ind}(g)$ is not zero. If we write just $\mathbb{R}^n$, then either we are not concerned with a scalar product at all, or the scalar product is assumed to be the usual inner product ($\nu = 0$). Thus a Riemannian metric is just the special case of index 0. The space $\mathbb{R}^4_1$ is called the Minkowski space. We will usually write $\langle X_p, Y_p \rangle$ or $g(X_p, Y_p)$ in place of $g(p)(X_p, Y_p)$. Also, for a pair of vector fields $X$ and $Y$, we define the function $\langle X, Y \rangle$ which is given by $\langle X, Y \rangle(p) = \langle X_p, Y_p \rangle$. In local coordinates $(x^1, \dots, x^n)$ on $U \subset M$,
we have that $g|_U = g_{ij}\, dx^i \otimes dx^j$, where $g_{ij} = \langle \frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j} \rangle$. Thus if $X = X^i \frac{\partial}{\partial x^i}$ and $Y = Y^i \frac{\partial}{\partial x^i}$ on $U$, then
$$\langle X, Y \rangle = g_{ij}X^iY^j, \tag{7.13}$$
which is a smooth function defined on $U$. The expression $\langle X, Y \rangle = g_{ij}X^iY^j$ means that for all $p \in U$ we have $\langle X(p), Y(p) \rangle = g_{ij}(p)X^i(p)Y^j(p)$. As we know, the functions $X^i$ and $Y^i$ are given by $X^i = dx^i(X)$ and $Y^i = dx^i(Y)$.

On a semi-Riemannian manifold, the musical isomorphisms are globalized in the obvious way to act on tensor fields. We simply apply the type change at each point in the domain of a given tensor field. For example, if $A$ is a tensor field of type (2,2), then we may obtain a new metrically equivalent tensor field $A'$ of type (1,3) by the rule that for any $p \in M$, we have $A'(p)(\alpha, u, v, w) := A(p)(\alpha, \flat u, v, w)$ for $\alpha \in T^*_pM$ and $u, v, w \in T_pM$. If we choose a chart $(U, x)$, then in terms of the coordinate frames and the corresponding $g_{ij}$, we have
$$A' = A^{i}{}_{jkl}\, \frac{\partial}{\partial x^i} \otimes dx^j \otimes dx^k \otimes dx^l, \tag{7.14}$$
where $A^{i}{}_{jkl} = g_{ja}A^{ia}{}_{kl}$. Of course, each of the possible conventions for consolidation and index position globalize accordingly. The reader is invited to compare our treatment with those found in [Pe], [Dod-Pos], [ON1] and [Stern].

We have been using coordinate frame fields, but there is nothing preventing us from giving local components of tensor fields with respect to arbitrary smooth frame fields. For example, if we choose a frame field $(E_1, \dots, E_n)$ and the corresponding dual frame field, then we may define $g_{ij} := \langle E_i, E_j \rangle$ with the corresponding $g^{ij}$, and then local expressions analogous to those above hold with the $\frac{\partial}{\partial x^i}$'s and $dx^j$'s replaced by the $E_i$'s and $E^j$'s, where the components of the tensor are obtained by evaluating on these frame fields. In particular, if the frame field is an orthonormal frame field, then $g_{ij} = \pm 1$ for $i = j$ and $g_{ij} = 0$ for $i \neq j$. This can result in a good deal of simplification.
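To make formula (7.13) and the remark about orthonormal frames concrete, consider the Euclidean plane in polar coordinates $(r, \theta)$, where the metric components are $g_{rr} = 1$, $g_{\theta\theta} = r^2$ and $g_{r\theta} = 0$. This example is an assumption introduced here for illustration; the sketch evaluates $g_{ij}X^iY^j$ at a single point and compares the coordinate frame with the frame $E_1 = \partial/\partial r$, $E_2 = \frac{1}{r}\, \partial/\partial\theta$.

```python
from fractions import Fraction


def polar_metric(r):
    """Components g_ij of the Euclidean plane in polar coordinates, r > 0."""
    return [[1, 0], [0, r * r]]


def pairing(g, X, Y):
    """<X, Y> = g_ij X^i Y^j at one point, as in formula (7.13)."""
    n = len(g)
    return sum(g[i][j] * X[i] * Y[j] for i in range(n) for j in range(n))


r = Fraction(2)
g = polar_metric(r)

d_theta = [0, 1]   # coordinate vector field d/d(theta), components (X^r, X^theta)
E1 = [1, 0]        # E_1 = d/dr
E2 = [0, 1 / r]    # E_2 = (1/r) d/d(theta)

print(pairing(g, d_theta, d_theta))  # equals r^2, so the coordinate frame is not orthonormal
print([[pairing(g, a, b) for b in (E1, E2)] for a in (E1, E2)])  # components in the frame (E1, E2)
```

In the frame $(E_1, E_2)$ the component matrix is the identity, which is the simplification referred to above.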
We now say a few words about the appropriate notion of equivalence of semi-Riemannian manifolds.

Definition 7.57. Let $(M, g)$ and $(N, h)$ be two semi-Riemannian manifolds. A diffeomorphism $\Phi : M \to N$ is called an isometry if $\Phi^*h = g$. Thus for an isometry $\Phi : M \to N$ we have $g(v, w) = h(T\Phi \cdot v, T\Phi \cdot w)$ for all $v, w \in T_pM$ and all $p \in M$. If $\Phi : M \to N$ is a local diffeomorphism such that $\Phi^*h = g$, then $\Phi$ is called a local isometry. If there is an isometry $\Phi : M \to N$, then we say that $(M, g)$ and $(N, h)$ are isometric.
Definition 7.58. The set of all isometries of a semi-Riemannian manifold $M$ to itself is a group called the isometry group. It is denoted by $\mathrm{Isom}(M)$.

The isometry group of a generic manifold is most likely trivial, but examples of manifolds with relatively large isometry groups are easy to find using Lie group theory. Also, Myers and Steenrod showed that the isometry group of a compact Riemannian manifold is a Lie group (see [My-St]). Recall from Chapter 5 that associated to $\mathbb{R}^n_\nu$ we have the matrix groups $O(\nu, n - \nu)$ and $SO(\nu, n - \nu)$. The isometry group of $\mathbb{R}^n_\nu$ is given by
$$\mathrm{Iso}(\nu, n - \nu) = \{L : L(x) = Qx + x_0 \text{ for some } Q \in O(\nu, n - \nu) \text{ and } x_0 \in \mathbb{R}^n_\nu\}.$$
This is the group of semi-Euclidean motions.

Example 7.59. We have seen that a regular submanifold of a Euclidean space $\mathbb{R}^n$ is a Riemannian manifold with the metric inherited from $\mathbb{R}^n$. In particular, the sphere $S^{n-1} \subset \mathbb{R}^n$ is a Riemannian manifold. Every isometry of $S^{n-1}$ is the restriction to $S^{n-1}$ of an isometry of $\mathbb{R}^n$ that fixes the origin.
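For the semi-Euclidean motions described above, membership of $Q$ in $O(\nu, n - \nu)$ amounts to the matrix identity $Q^t \eta Q = \eta$, where $\eta$ is the diagonal matrix of the scalar product. The following sketch checks this for $\mathbb{R}^2_1$; the specific matrices (a boost with rational entries and a quarter-turn rotation) and all function names are assumptions chosen for illustration.

```python
from fractions import Fraction


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def preserves_form(Q, eta):
    """True when Q^T eta Q == eta, i.e. <Qx, Qy> = <x, y> for all x, y."""
    return matmul(matmul(transpose(Q), eta), Q) == eta


def motion(Q, x0, x):
    """The semi-Euclidean motion L(x) = Qx + x0."""
    return [sum(q * xi for q, xi in zip(row, x)) + c for row, c in zip(Q, x0)]


def interval(x, y):
    """<x - y, x - y> in R^2_1 (minus sign on the first coordinate)."""
    d = [a - b for a, b in zip(x, y)]
    return -d[0] * d[0] + d[1] * d[1]


eta = [[-1, 0], [0, 1]]
boost = [[Fraction(5, 4), Fraction(3, 4)], [Fraction(3, 4), Fraction(5, 4)]]
rotation = [[0, -1], [1, 0]]  # a Euclidean isometry, but not in O(1, 1)

print(preserves_form(boost, eta), preserves_form(rotation, eta))

x, y, x0 = [1, 2], [0, 0], [7, -3]
print(interval(x, y), interval(motion(boost, x0, x), motion(boost, x0, y)))
```

The translation part $x_0$ cancels in differences, so only $Q$ matters for whether $L$ preserves the scalar product of displacement vectors.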
Definition 7.60. Let $M$ and $\widetilde{M}$ be semi-Riemannian manifolds. If $\wp : \widetilde{M} \to M$ is a covering map such that $\wp$ is a local isometry, we call $\wp : \widetilde{M} \to M$ a semi-Riemannian covering.

If we have a local isometry $\phi : N \to M$, then any lift $\widetilde{\phi} : N \to \widetilde{M}$ is also a local isometry (Problem 16). Deck transformations are lifts of the identity map $M \to M$, and so are diffeomorphisms which are local isometries. Thus deck transformations are, in fact, isometries. We conclude that the group of deck transformations of a semi-Riemannian cover is a subgroup of the group of isometries $\mathrm{Isom}(\widetilde{M})$.

Let us consider here the case of a discrete group $G$ and a discrete group action $\lambda : G \times M \to M$ that is smooth, proper, and free. We have already seen that the quotient space $M/G$ has a unique structure as a smooth manifold such that the projection $\kappa : M \to M/G$ is a covering. Let us now assume that $G$ acts by isometries so that $\lambda_g^*\langle \cdot, \cdot \rangle = \langle \cdot, \cdot \rangle$ for all $g \in G$. The tangent map $T_p\kappa : T_pM \to T_{\kappa(p)}(M/G)$ is onto. For $x \in M/G$, let $v_1, v_2 \in T_x(M/G)$. Define $h_x(v_1, v_2) = \langle \bar{v}_1, \bar{v}_2 \rangle$, where $\bar{v}_1$ and $\bar{v}_2$ are chosen at the same point and such that $T\kappa \cdot \bar{v}_i = v_i$. We wish to show that this is well-defined. Indeed, if $\bar{v}_i \in T_pM$ and $\bar{w}_i \in T_qM$ are such that $T_p\kappa \cdot \bar{v}_i = T_q\kappa \cdot \bar{w}_i = v_i$ for $i = 1, 2$, then there is an isometry $\lambda_g$ with $\lambda_gp = q$. Furthermore, since $\lambda_g$ is a deck transformation and curves representing $\bar{v}_i$ and $\bar{w}_i$ must be related by this deck transformation, we also have $T_p\lambda_g \cdot \bar{v}_i = \bar{w}_i$. Thus
$$\langle \bar{v}_1, \bar{v}_2 \rangle = \langle T_p\lambda_g\bar{v}_1, T_p\lambda_g\bar{v}_2 \rangle = \langle \bar{w}_1, \bar{w}_2 \rangle,$$
which means $h_x$ is well-defined. It is easy to show that $x \mapsto h_x$ is smooth and defines a metric on $M/G$ with the same signature as that of $\langle \cdot, \cdot \rangle$ and that further, $\kappa^*h = \langle \cdot, \cdot \rangle$. In fact, we will use the same notation for either the metric on $M/G$ or on $M$.

Definition 7.61. A lattice of rank $k$ in $\mathbb{R}^n$ is a set of the form
$$\Gamma := \Bigl\{x \in \mathbb{R}^n : x = \sum_{i=1}^{k} n_if_i \text{ where } n_i \in \mathbb{Z}\Bigr\},$$
where $f_1, \dots, f_k$ are linearly independent elements of $\mathbb{R}^n$. The $f_1, \dots, f_k$ are called the generators of the lattice.

The lattice $\mathbb{Z}^n \subset \mathbb{R}^n$ is the standard rank $n$ lattice, and it is generated by the standard basis. A lattice is a subgroup of $\mathbb{R}^n$ and so acts on $\mathbb{R}^n$ by $a \mapsto a + v$ for $v \in \Gamma$. This is a discrete, free and proper action, and so the quotient $\mathbb{R}^n/\Gamma$ provides a simple example of the above construction and so has a metric induced from $\mathbb{R}^n$. If the lattice has full rank $n$, then $\mathbb{R}^n/\Gamma$ is called a flat torus (or flat $n$-torus) and is diffeomorphic to the product of $n$ copies of the circle $S^1$. Any two of these $n$-dimensional flat tori are locally isometric, but they may not be globally isometric. To be more precise, suppose that $f_1, f_2, \dots, f_n$ is a basis for $\mathbb{R}^n$ which is not necessarily orthonormal. Let $\Gamma_f$ be the lattice consisting of integer linear combinations of $f_1, f_2, \dots, f_n$. Now suppose we have two such lattices $\Gamma_f$ and $\Gamma_{\bar{f}}$. When is $\mathbb{R}^n/\Gamma_f$ isometric to $\mathbb{R}^n/\Gamma_{\bar{f}}$? It may seem that, since these are clearly diffeomorphic and since they are locally isometric, they must be (globally) isometric. But this is not the case (see Problem 17). The study of the global geometry of flat tori is quite interesting and even has deep connections with fields outside of geometry such as arithmetic, which we shall not have the space to pursue.

We know from Chapter 4 that there are surfaces in $\mathbb{R}^3$ that are diffeomorphic to a torus $S^1 \times S^1$. Such surfaces inherit a metric from the ambient space, but it turns out that the Riemannian surface obtained in this way cannot be isometric to one of the flat 2-tori introduced here. In Chapter 13 we will see how each metric on a manifold gives rise to an associated curvature tensor. The reason the tori just introduced are referred to as flat is because (being locally isometric to some $\mathbb{R}^n$) they have vanishing curvature tensor.

If we have semi-Riemannian manifolds $(M, g)$ and $(N, h)$, then we can consider the product manifold $M \times N$ and the projections $\mathrm{pr}_1 : M \times N \to M$ and $\mathrm{pr}_2 : M \times N \to N$.
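The tensor $g \times h = \mathrm{pr}_1^*g + \mathrm{pr}_2^*h$ provides a semi-Riemannian metric on the manifold $M \times N$, which is then called the semi-Riemannian product of $(M, g)$ and $(N, h)$. Let $(U_1, x)$ and $(U_2, y)$ denote charts on $M$ and $N$ respectively. Then we may form a product chart for $M \times N$ defined on $U_1 \times U_2$. The coordinate functions of this chart are given by $\bar{x}^i = x^i \circ \mathrm{pr}_1$ and $\bar{y}^i = y^i \circ \mathrm{pr}_2$. We have the associated frame fields $\frac{\partial}{\partial \bar{x}^i}$ and $\frac{\partial}{\partial \bar{y}^i}$. The components of $g \times h = \mathrm{pr}_1^*g + \mathrm{pr}_2^*h$ in these coordinates are discovered by choosing a point $(p, q) \in U_1 \times U_2$ and then calculating. We have
$$\begin{aligned}
(g \times h)\Bigl(\tfrac{\partial}{\partial \bar{x}^i}\Big|_{(p,q)}, \tfrac{\partial}{\partial \bar{y}^j}\Big|_{(p,q)}\Bigr)
 &= \mathrm{pr}_1^*g\Bigl(\tfrac{\partial}{\partial \bar{x}^i}\Big|_{(p,q)}, \tfrac{\partial}{\partial \bar{y}^j}\Big|_{(p,q)}\Bigr) + \mathrm{pr}_2^*h\Bigl(\tfrac{\partial}{\partial \bar{x}^i}\Big|_{(p,q)}, \tfrac{\partial}{\partial \bar{y}^j}\Big|_{(p,q)}\Bigr) \\
 &= g\Bigl(T\mathrm{pr}_1 \tfrac{\partial}{\partial \bar{x}^i}\Big|_{(p,q)}, T\mathrm{pr}_1 \tfrac{\partial}{\partial \bar{y}^j}\Big|_{(p,q)}\Bigr) + h\Bigl(T\mathrm{pr}_2 \tfrac{\partial}{\partial \bar{x}^i}\Big|_{(p,q)}, T\mathrm{pr}_2 \tfrac{\partial}{\partial \bar{y}^j}\Big|_{(p,q)}\Bigr) \\
 &= g\Bigl(\tfrac{\partial}{\partial x^i}\Big|_{p}, 0_p\Bigr) + h\Bigl(0_q, \tfrac{\partial}{\partial y^j}\Big|_{q}\Bigr) \\
 &= 0 + 0 = 0,
\end{aligned}$$
and (abbreviating a bit)
$$(g \times h)\Bigl(\tfrac{\partial}{\partial \bar{x}^i}\Big|_{(p,q)}, \tfrac{\partial}{\partial \bar{x}^j}\Big|_{(p,q)}\Bigr) = g\Bigl(\tfrac{\partial}{\partial x^i}\Big|_{p}, \tfrac{\partial}{\partial x^j}\Big|_{p}\Bigr) + h(0_q, 0_q) = g_{ij}(p).$$
Similarly $(g \times h)\bigl(\frac{\partial}{\partial \bar{y}^i}, \frac{\partial}{\partial \bar{y}^j}\bigr)(p, q) = h_{ij}(q)$. In practice, the coordinate functions constructed above are often abusively denoted by $(x^1, \dots, x^{n_1}, y^1, \dots, y^{n_2})$ and the frame fields by $\frac{\partial}{\partial x^1}, \dots, \frac{\partial}{\partial x^{n_1}}, \frac{\partial}{\partial y^1}, \dots, \frac{\partial}{\partial y^{n_2}}$. So with respect to these coordinates, the matrix of $g \times h$ is of the form
$$\begin{pmatrix} G & 0 \\ 0 & H \end{pmatrix},$$
where $G = (g_{ij} \circ \mathrm{pr}_1)$ and $H = (h_{ij} \circ \mathrm{pr}_2)$.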
Notation 7.62. The product metric is often denoted by $g + h$ or in coordinates by $ds^2 = g_{ij}\, dx^i\, dx^j + h_{kl}\, dy^k\, dy^l$.
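As a quick illustration of the product construction and of the $ds^2$ notation, one can recover Minkowski space as a semi-Riemannian product. The factors below are chosen here only as an example, with $G$ and $H$ denoting the component matrices of $g$ and $h$ in the standard coordinates $t$ and $(x, y, z)$:

```latex
\[
(M, g) = (\mathbb{R}, -dt^2), \qquad (N, h) = (\mathbb{R}^3, dx^2 + dy^2 + dz^2),
\]
\[
ds^2 = -dt^2 + dx^2 + dy^2 + dz^2, \qquad
\begin{pmatrix} G & 0 \\ 0 & H \end{pmatrix} = \operatorname{diag}(-1, 1, 1, 1).
\]
```

The product has index $1 + 0 = 1$ and is the Minkowski space $\mathbb{R}^4_1$.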
Every smooth manifold that admits partitions of unity also admits at least one (in fact infinitely many) Riemannian metric. This includes all (finite-dimensional) paracompact manifolds. The reason for this is that the set of all Riemannian metric tensors is, in an appropriate sense, convex. We record this as a proposition.

Proposition 7.63. Every smooth (paracompact) manifold admits a Riemannian metric.

Proof. This is a special case of Proposition 6.45. □
If $M$ is a regular submanifold of a Riemannian manifold $(N, h)$, then $M$ inherits a Riemannian metric $g := \iota^*h$, where $\iota : M \hookrightarrow N$ is the inclusion map. We have already used this idea for submanifolds of the Euclidean space $\mathbb{R}^d$. More generally, if $f : M \to N$ is an immersion, then $(M, f^*h)$ is a Riemannian manifold. In particular, if $f : M \to \mathbb{R}^d$ is an immersion, then we obtain a Riemannian metric on $M$. It turns out that every Riemannian metric on $M$ can be obtained in this way. Actually, more is true! For any Riemannian manifold $(M, g)$ there is an embedding $f : M \to \mathbb{R}^d$, for some $d$, such that $g = f^*g_0$, where $g_0$ denotes the standard metric on $\mathbb{R}^d$. What this means is that $f(M)$ is a regular submanifold, and if we give $f(M)$ the metric induced from the ambient space $\mathbb{R}^d$, then $f$ becomes an isometry when viewed as a map into $f(M)$. We say that such an $f$ is an isometric embedding of $M$ into $\mathbb{R}^d$. In short, the result is that every Riemannian manifold can be isometrically embedded into some Euclidean space of sufficiently high dimension. This difficult theorem is called the Nash embedding theorem and is due to John Forbes Nash (see [Nash1] and [Nash2]). Note that $d$ must be quite large in general ($d = (\dim M)^2 + 5(\dim M) + 3$ is sufficient).

For an indefinite semi-Riemannian manifold $(N, h)$, the pull-back $f^*h$ by a smooth map $f : M \to N$ may not be a metric because there may be points $p \in M$ such that $T_pf(T_pM)$ is a degenerate subspace of $T_{f(p)}N$. In particular, not every embedding of a manifold $M$ into a semi-Euclidean space $\mathbb{R}^d_\nu$ (with $1 < \nu < d$) induces a metric on $M$. Nevertheless, every metric on $M$ of any index can be obtained using an appropriate embedding into some $\mathbb{R}^d_\nu$ (see [Clark]).
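The failure of a pull-back to be a metric can be seen already for curves. The following worked example is a sketch under the assumption that $h = \langle \cdot, \cdot \rangle_2$ is the standard scalar product on $\mathbb{R}^3_2$, so that $h(x, y) = -x^1y^1 - x^2y^2 + x^3y^3$; the three embeddings $f_1, f_2, f_3 : \mathbb{R} \to \mathbb{R}^3_2$ are chosen here for illustration:

```latex
\begin{align*}
f_1(t) &= (0, 0, t): & f_1^*h &= dt^2 && \text{(a Riemannian metric on } \mathbb{R}\text{)}, \\
f_2(t) &= (t, 0, 0): & f_2^*h &= -dt^2 && \text{(a metric of index 1)}, \\
f_3(t) &= (t, 0, t): & f_3^*h &= (-1 + 1)\,dt^2 = 0 && \text{(degenerate, so not a metric)}.
\end{align*}
```

In the third case the tangent line of the curve is a degenerate subspace at every point, which is exactly the obstruction described above.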
Problems
(1) Show that if $\tau \in V \otimes V^*$ has the same components $\tau^i_j$ with respect to every basis, then $\tau^i_j = a\delta^i_j$ for some $a \in \mathbb{R}$.

(2) If $e_1, \dots, e_n$ is a basis for $V$ and $f_1, \dots, f_m$ is a basis for $W$, then $\{E^i_j\}_{i=1,\dots,n;\ j=1,\dots,m}$ is a basis for $L(V, W)$, where $E^i_j(v) := e^i(v)f_j$. Show this directly without assuming the isomorphism of $W \otimes V^*$ with $L(V, W)$.

(3) Let $(V_i, g_i)$ be scalar product spaces for $i = 1, \dots, k$. By Corollary D.35 of Appendix D, there is a unique bilinear form
$$\varphi : \bigotimes_{i=1}^{k} V_i \times \bigotimes_{i=1}^{k} V_i \to \mathbb{R}$$
such that for $v_i \in V_i$ and $w_i \in V_i$,
$$\varphi(v_1 \otimes \cdots \otimes v_k, w_1 \otimes \cdots \otimes w_k) = g_1(v_1, w_1) \cdots g_k(v_k, w_k).$$
Show that $\varphi$ is nondegenerate and that it is positive definite if each $g_i$ is positive definite.
(4) Define $T : \mathfrak{X}(M) \times \mathfrak{X}(M) \to C^\infty(M)$ by $T(X, Y) = XYf$. Show that $T$ does not define a tensor field.

(5) Let $b_{ij}$ and $b'_{ij}$ be the components of a bilinear form $b$ with respect to bases $e_1, \dots, e_n$ and $e'_1, \dots, e'_n$ respectively. Show that in general $\det(b_{ij})$ does not equal $\det(b'_{ij})$. Show that if $\det(b_{ij})$ is nonzero, then the same is true of $\det(b'_{ij})$.

(6) Show that while a single algebraic tensor $\tau_p$ at a point on a manifold can always be extended to a smooth tensor field, it is not the case that one may always extend a (smooth) tensor field defined on an open subset to a smooth tensor field on the whole manifold.

(7) Let $\phi : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by $(x, y) \mapsto (x + 2y, y)$. Let $\tau := x \frac{\partial}{\partial x} \otimes dy + y \frac{\partial}{\partial y} \otimes dy$. Compute $\phi_*\tau$ and $\phi^*\tau$.

(8) Prove Proposition 7.35.

(9) Let $\mathcal{D}$ be a tensor derivation on $M$ and suppose that in a local chart we have $\mathcal{D}(\frac{\partial}{\partial x^i}) = \sum_j D^j_i \frac{\partial}{\partial x^j}$ for smooth functions $D^j_i$. Show that $\mathcal{D}(dx^j) = -\sum_i D^j_i\, dx^i$. Let $X$ be a fixed vector field with components $X^i$ in our chart. Find the $D^j_i$ in the case that $\mathcal{D} = \mathcal{L}_X$.

(10) Let $A \in \mathcal{T}^2_0(M)$. Show that the component form of the Lie derivative with respect to a chart is given as
$$(\mathcal{L}_XA)^{ab} = \frac{\partial A^{ab}}{\partial x^h}X^h - \frac{\partial X^a}{\partial x^h}A^{hb} - \frac{\partial X^b}{\partial x^h}A^{ah}$$
(where we use the Einstein summation convention). Show that if $A \in \mathcal{T}^0_2(M)$, then the formula becomes
$$(\mathcal{L}_XA)_{ab} = \frac{\partial A_{ab}}{\partial x^h}X^h + \frac{\partial X^h}{\partial x^a}A_{hb} + \frac{\partial X^h}{\partial x^b}A_{ah}.$$
Find a formula for $A \in \mathcal{T}^1_1(M)$.

(11) Show that our two definitions of the Lie derivative of a tensor field agree with each other.

(12) In some chart $(U, (x, y))$ on a 2-manifold, let $A = x \frac{\partial}{\partial y} \otimes dx \otimes dy + \frac{\partial}{\partial x} \otimes dy \otimes dy$ and let $X = \frac{\partial}{\partial x} + x \frac{\partial}{\partial y}$. Compute the coordinate expression for $\mathcal{L}_XA$.

(13) Suppose that for every chart $(U, x)$ in an atlas for a smooth $n$-manifold $M$ we have assigned $n^3$ smooth functions $\Gamma^k_{ij}$, which we call Christoffel symbols. Suppose that rather than obeying the transformation law expected for a tensor, we have the following horrible formula relating the Christoffel symbols $\Gamma'^k_{ij}$ on a chart $(U', y)$ to the symbols $\Gamma^k_{ij}$:
$$\Gamma'^k_{ij} = \frac{\partial^2 x^l}{\partial y^i \partial y^j} \frac{\partial y^k}{\partial x^l} + \Gamma^t_{rs} \frac{\partial x^r}{\partial y^i} \frac{\partial x^s}{\partial y^j} \frac{\partial y^k}{\partial x^t} \quad \text{(sum)}.$$
Assume that such a transformation law holds between the Christoffel symbol functions for all pairs of intersecting charts. For any pair of vector fields $X, Y \in \mathfrak{X}(M)$, consider the functions $(D_X Y)^k$ given in every chart by the formula $(D_X Y)^k := \frac{\partial Y^k}{\partial x^i} X^i + \Gamma^k_{ij} X^i Y^j$. Show that the local vector fields of the form $(D_X Y)^k \frac{\partial}{\partial x^k}$, defined on each chart, are the restrictions of a single global vector field $D_X Y$. Show that $D_X : Y \mapsto D_X Y$ is a derivation of $\mathfrak{X}(M)$ and that with $D_X f := Xf$ for smooth functions, we may extend to a tensor derivation; $D_X$ is called a covariant derivative with respect to $X$. There are many possible covariant derivatives.
(14) Continuing on the last problem, show that $D_{fX+gY} T = f D_X T + g D_Y T$ for all $f, g \in C^\infty(M)$, $X, Y \in \mathfrak{X}(M)$ and $T \in T^r_s(M)$.
(15) Show that if $\frac{\partial}{\partial x^1}, \dots, \frac{\partial}{\partial x^n}$ are coordinate vector fields from some chart, then $[\frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}] \equiv 0$. Consider the vector fields $\frac{\partial}{\partial x}$ and $\frac{\partial}{\partial y}$ arising from standard coordinates on $\mathbb{R}^2$ and also the $\frac{\partial}{\partial r}$ and $\frac{\partial}{\partial \theta}$ from polar coordinates. Show that $[\frac{\partial}{\partial x}, \frac{\partial}{\partial r}]$ is not identically zero by explicit computation.
(16) Let $\widetilde{M} \to M$ be a semi-Riemannian cover. Prove that if we have a local isometry $\phi : N \to M$, then any lift $\widetilde{\phi} : N \to \widetilde{M}$ is also a local isometry.
(17) Suppose we have two lattices $\Gamma_1$ and $\Gamma_2$ in $\mathbb{R}^n$. Let $\mathbb{R}^n$ have the standard metric. Describe the induced metrics on $\mathbb{R}^n/\Gamma_1$ and $\mathbb{R}^n/\Gamma_2$ and provide a necessary and sufficient condition for the existence of an isometry $\mathbb{R}^n/\Gamma_1 \to \mathbb{R}^n/\Gamma_2$.
(18) Show that if $\alpha, \beta \in T^k(V)$ where $V$ is a scalar product space, then the scalar product on $T^k(V)$ is given in terms of index raising and contraction by
Chapter 8
Differential Forms
In one guise, a differential form is nothing but an alternating (antisymmetric) tensor field. What is new is the introduction of an antisymmetrized version of the tensor product and also a natural differential operator called the exterior derivative. We start off with some more multilinear algebra.
8.1. More Multilinear Algebra
Definition 8.1. Let $V$ and $W$ be real finite-dimensional vector spaces. A $k$-multilinear map $\alpha : V \times \cdots \times V \to W$ is called alternating if $\alpha(v_1, \dots, v_k) = 0$ whenever $v_i = v_j$ for some $i \neq j$. The space of all alternating $k$-multilinear maps into $W$ will be denoted by $L^k_{\mathrm{alt}}(V; W)$ or by $L^k_{\mathrm{alt}}(V)$ if $W = \mathbb{R}$. By convention, $L^0_{\mathrm{alt}}(V; W)$ is taken to be $W$ and in particular, $L^0_{\mathrm{alt}}(V) = \mathbb{R}$.
Since we are dealing with the field $\mathbb{R}$ (which has characteristic zero), it is easy to see that alternating $k$-multilinear maps are the same as (completely) antisymmetric $k$-multilinear maps, which are defined by the property that for any permutation $\sigma$ of the letters $1, 2, \dots, k$ we have
$$\omega(v_1, v_2, \dots, v_k) = \operatorname{sgn}(\sigma)\, \omega(v_{\sigma(1)}, v_{\sigma(2)}, \dots, v_{\sigma(k)}).$$
Let us denote the group of permutations of the $k$ letters $1, 2, \dots, k$ by $S_k$. In what follows, we will occasionally write $\sigma_i$ in place of $\sigma(i)$.
Definition 8.2. The antisymmetrization map $\operatorname{Alt}^k : T^0_k(V) \to L^k_{\mathrm{alt}}(V)$ is defined by
$$\operatorname{Alt}^k(\omega)(v_1, v_2, \dots, v_k) := \frac{1}{k!} \sum_{\sigma \in S_k} \operatorname{sgn}(\sigma)\, \omega(v_{\sigma_1}, v_{\sigma_2}, \dots, v_{\sigma_k}).$$
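Lemma 8.3. For $\alpha \in T^0_{k_1}(V)$ and $\beta \in T^0_{k_2}(V)$, we have
$$\operatorname{Alt}^{k_1+k_2}(\operatorname{Alt}^{k_1}\alpha \otimes \beta) = \operatorname{Alt}^{k_1+k_2}(\alpha \otimes \beta)$$
and
$$\operatorname{Alt}^{k_1+k_2}(\alpha \otimes \operatorname{Alt}^{k_2}\beta) = \operatorname{Alt}^{k_1+k_2}(\alpha \otimes \beta).$$
Proof. For a permutation $\sigma \in S_k$ and any $T \in T^0_k(V)$, let $\sigma T$ denote the element of $T^0_k(V)$ given by $(\sigma T)(v_1, \dots, v_k) := T(v_{\sigma(1)}, \dots, v_{\sigma(k)})$. We then have $\operatorname{Alt}^k(\sigma T) = \operatorname{sgn}(\sigma) \operatorname{Alt}^k(T)$, as may easily be checked. Also, by definition $\operatorname{Alt}^k(T) = \frac{1}{k!}\sum_{\sigma \in S_k} \operatorname{sgn}(\sigma)\, \sigma T$. We have
$$\operatorname{Alt}^{k_1+k_2}(\operatorname{Alt}^{k_1}(\alpha) \otimes \beta) = \operatorname{Alt}^{k_1+k_2}\Big(\Big(\frac{1}{k_1!} \sum_{\sigma \in S_{k_1}} \operatorname{sgn}\sigma\, (\sigma\alpha)\Big) \otimes \beta\Big)$$
$$= \operatorname{Alt}^{k_1+k_2}\Big(\frac{1}{k_1!} \sum_{\sigma \in S_{k_1}} \operatorname{sgn}\sigma\, (\sigma\alpha \otimes \beta)\Big)$$
$$= \frac{1}{k_1!} \sum_{\sigma \in S_{k_1}} \operatorname{sgn}\sigma\, \operatorname{Alt}^{k_1+k_2}(\sigma\alpha \otimes \beta).$$
Let us examine the expression $\operatorname{sgn}\sigma\, \operatorname{Alt}^{k_1+k_2}(\sigma\alpha \otimes \beta)$. If we extend each $\sigma \in S_{k_1}$ to a corresponding element $\sigma' \in S_{k_1+k_2}$ by letting $\sigma'(i) = \sigma(i)$ for $i \leq k_1$ and $\sigma'(i) = i$ for $i > k_1$, then we have $\sigma\alpha \otimes \beta = \sigma'(\alpha \otimes \beta)$ and also $\operatorname{sgn}(\sigma) = \operatorname{sgn}(\sigma')$. Thus $\operatorname{sgn}\sigma\, \operatorname{Alt}^{k_1+k_2}(\sigma\alpha \otimes \beta) = \operatorname{sgn}\sigma'\, \operatorname{Alt}^{k_1+k_2}(\sigma'(\alpha \otimes \beta))$ and so
$$\operatorname{sgn}\sigma\, \operatorname{Alt}^{k_1+k_2}(\sigma\alpha \otimes \beta) = (\operatorname{sgn}\sigma')^2 \operatorname{Alt}^{k_1+k_2}(\alpha \otimes \beta) = \operatorname{Alt}^{k_1+k_2}(\alpha \otimes \beta).$$
Averaging over the $k_1!$ elements of $S_{k_1}$, we arrive at $\operatorname{Alt}^{k_1+k_2}(\operatorname{Alt}^{k_1}(\alpha) \otimes \beta) = \operatorname{Alt}^{k_1+k_2}(\alpha \otimes \beta)$. In a similar way, $\operatorname{Alt}^{k_1+k_2}(\alpha \otimes \operatorname{Alt}^{k_2}\beta) = \operatorname{Alt}^{k_1+k_2}(\alpha \otimes \beta)$, and so the last part of the lemma follows. □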
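Given $\omega \in L^{k_1}_{\mathrm{alt}}(V)$ and $\eta \in L^{k_2}_{\mathrm{alt}}(V)$, we define their exterior product or wedge product $\omega \wedge \eta \in L^{k_1+k_2}_{\mathrm{alt}}(V)$ by the formula
$$\omega \wedge \eta := \frac{(k_1 + k_2)!}{k_1!\, k_2!} \operatorname{Alt}^{k_1+k_2}(\omega \otimes \eta).$$
Written out, this is
$$\omega \wedge \eta(v_1, \dots, v_{k_1+k_2}) = \frac{1}{k_1!\, k_2!} \sum_{\sigma \in S_{k_1+k_2}} \operatorname{sgn}(\sigma)\, \omega(v_{\sigma_1}, \dots, v_{\sigma_{k_1}})\, \eta(v_{\sigma_{k_1+1}}, \dots, v_{\sigma_{k_1+k_2}}). \tag{8.1}$$
Warning: The factor in front of Alt in the definition of the exterior product is a convention but not the only convention in use. This choice has an effect on many of the formulas to follow, which differ by a factor from the corresponding formulas written by authors following other conventions.
It is an exercise in combinatorics that we also have
$$\omega \wedge \eta(v_1, \dots, v_{k_1+k_2}) = \sum_{(k_1, k_2)\text{-shuffles } \sigma} \operatorname{sgn}(\sigma)\, \omega(v_{\sigma_1}, \dots, v_{\sigma_{k_1}})\, \eta(v_{\sigma_{k_1+1}}, \dots, v_{\sigma_{k_1+k_2}}).$$
In the latter formula, we sum over all permutations such that $\sigma(1) < \sigma(2) < \cdots < \sigma(k_1)$ and $\sigma(k_1 + 1) < \sigma(k_1 + 2) < \cdots < \sigma(k_1 + k_2)$. This kind of permutation is called a $(k_1, k_2)$-shuffle, as indicated in the summation. The most important case of (8.1) is for $\omega, \eta \in L^1_{\mathrm{alt}}(V)$, in which case
$$(\omega \wedge \eta)(v, w) = \omega(v)\eta(w) - \omega(w)\eta(v).$$
This clearly defines an antisymmetric bilinear map.
Proposition 8.4. For $\alpha \in L^{k_1}_{\mathrm{alt}}(V)$, $\beta \in L^{k_2}_{\mathrm{alt}}(V)$, and $\gamma \in L^{k_3}_{\mathrm{alt}}(V)$, we have
(i) $\wedge : L^{k_1}_{\mathrm{alt}}(V) \times L^{k_2}_{\mathrm{alt}}(V) \to L^{k_1+k_2}_{\mathrm{alt}}(V)$ is $\mathbb{R}$-bilinear;
(ii) $\alpha \wedge \beta = (-1)^{k_1 k_2} \beta \wedge \alpha$;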
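(iii) $\alpha \wedge (\beta \wedge \gamma) = (\alpha \wedge \beta) \wedge \gamma$.
Proof. We leave the proof of (i) as an easy exercise. For (ii), we consider the special permutation $\tau$ given by $(\tau(1), \tau(2), \dots, \tau(k_1+k_2)) = (k_1+1, \dots, k_1+k_2, 1, \dots, k_1)$. We have that $\alpha \otimes \beta = \tau(\beta \otimes \alpha)$. Also $\operatorname{sgn}(\tau) = (-1)^{k_1 k_2}$. So we have
$$\operatorname{Alt}^{k_1+k_2}(\alpha \otimes \beta) = \operatorname{Alt}^{k_1+k_2}(\tau(\beta \otimes \alpha)) = (-1)^{k_1 k_2} \operatorname{Alt}^{k_1+k_2}(\beta \otimes \alpha),$$
which gives (ii).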
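For (iii), we compute
$$\alpha \wedge (\beta \wedge \gamma) = \frac{(k_1 + k_2 + k_3)!}{k_1!\,(k_2 + k_3)!} \operatorname{Alt}(\alpha \otimes (\beta \wedge \gamma))$$
$$= \frac{(k_1 + k_2 + k_3)!}{k_1!\,(k_2 + k_3)!} \cdot \frac{(k_2 + k_3)!}{k_2!\, k_3!} \operatorname{Alt}(\alpha \otimes \operatorname{Alt}(\beta \otimes \gamma))$$
$$= \frac{(k_1 + k_2 + k_3)!}{k_1!\, k_2!\, k_3!} \operatorname{Alt}(\alpha \otimes \operatorname{Alt}(\beta \otimes \gamma)).$$
By Lemma 8.3, we know that $\operatorname{Alt}(\alpha \otimes \operatorname{Alt}(\beta \otimes \gamma)) = \operatorname{Alt}(\alpha \otimes (\beta \otimes \gamma))$, and so we arrive at
$$\alpha \wedge (\beta \wedge \gamma) = \frac{(k_1 + k_2 + k_3)!}{k_1!\, k_2!\, k_3!} \operatorname{Alt}(\alpha \otimes (\beta \otimes \gamma)).$$
By a symmetric computation, we also have
$$(\alpha \wedge \beta) \wedge \gamma = \frac{(k_1 + k_2 + k_3)!}{k_1!\, k_2!\, k_3!} \operatorname{Alt}((\alpha \otimes \beta) \otimes \gamma),$$
and so by the associativity of the tensor product we obtain the result. □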
Example 8.5. Let $V$ have a basis $e_1, e_2, e_3$ with dual basis $e^1, e^2, e^3$. Let $\alpha = 2e^1 \wedge e^2 + e^1 \wedge e^3$ and $\beta = e^1 - e^3$. Then as a sample calculation we have
$$\alpha \wedge \beta = (2e^1 \wedge e^2 + e^1 \wedge e^3) \wedge (e^1 - e^3)$$
$$= 2e^1 \wedge e^2 \wedge e^1 + e^1 \wedge e^3 \wedge e^1 - 2e^1 \wedge e^2 \wedge e^3 - e^1 \wedge e^3 \wedge e^3$$
$$= -2e^1 \wedge e^2 \wedge e^3,$$
where we have used that $e^1 \wedge e^2 \wedge e^1 = -e^1 \wedge e^1 \wedge e^2 = 0$, etc.
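The sample calculation in Example 8.5 can be checked mechanically. The following is only a sketch and not part of the text: it stores a form as a Python dictionary from increasing index tuples to coefficients, and the helper names `sort_sign` and `wedge` are hypothetical. It relies on the bilinearity, associativity, and sign rule of Proposition 8.4 applied to basis covectors rather than on the definition via Alt.

```python
from itertools import product


def sort_sign(indices):
    """Sort an index tuple; return (sign, sorted tuple), or (0, ()) if an index repeats."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()
    sign = 1
    # Bubble sort: every adjacent transposition flips the sign.
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)


def wedge(a, b):
    """Wedge product of forms stored as {increasing index tuple: coefficient}."""
    out = {}
    for (I, x), (J, y) in product(a.items(), b.items()):
        sign, K = sort_sign(I + J)
        if sign:
            out[K] = out.get(K, 0) + sign * x * y
    return {K: c for K, c in out.items() if c != 0}


alpha = {(1, 2): 2, (1, 3): 1}  # 2 e^1 ^ e^2 + e^1 ^ e^3
beta = {(1,): 1, (3,): -1}      # e^1 - e^3
print(wedge(alpha, beta))       # {(1, 2, 3): -2}
```

Since $k_1 k_2 = 2$ is even here, swapping the factors gives the same result, in agreement with Proposition 8.4(ii).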
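Lemma 8.6. Let $\alpha^1, \dots, \alpha^k$ be elements of $V^* = L^1_{\mathrm{alt}}(V)$ and let $v_1, \dots, v_k \in V$. Then we have
$$\alpha^1 \wedge \cdots \wedge \alpha^k(v_1, \dots, v_k) = \det A,$$
where $A = (a^i_j)$ is the $k \times k$ matrix whose $ij$-th entry is $a^i_j = \alpha^i(v_j)$.
Proof. From the proof of the last proposition we have
$$\alpha \wedge (\beta \wedge \gamma) = \frac{(k_1 + k_2 + k_3)!}{k_1!\, k_2!\, k_3!} \operatorname{Alt}(\alpha \otimes (\beta \otimes \gamma)).$$
By inductive application of this we have
$$\alpha^1 \wedge \cdots \wedge \alpha^k = k!\, \operatorname{Alt}(\alpha^1 \otimes \cdots \otimes \alpha^k).$$
Thus
$$\alpha^1 \wedge \cdots \wedge \alpha^k(v_1, \dots, v_k) = \sum_{\sigma} \operatorname{sgn}(\sigma)\, \alpha^1(v_{\sigma_1}) \cdots \alpha^k(v_{\sigma_k}) = \det A.$$
□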
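Lemma 8.6 lends itself to a direct numerical check. The sketch below is an illustration under stated assumptions, not the book's method: forms are modelled as Python functions of vectors in $\mathbb{R}^3$, `wedge` implements formula (8.1) literally with exact rational arithmetic, and `covector` and `det3` are hypothetical helpers introduced only for this example.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def sgn(p):
    """Sign of a permutation of 0..n-1, computed by counting inversions."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def wedge(w, k1, eta, k2):
    """Formula (8.1): the (k1+k2)-form w ^ eta, returned as a function of k1+k2 vectors."""
    def product_form(*v):
        total = 0
        for p in permutations(range(k1 + k2)):
            total += sgn(p) * w(*(v[i] for i in p[:k1])) * eta(*(v[i] for i in p[k1:]))
        return Fraction(total) / (factorial(k1) * factorial(k2))
    return product_form


def covector(components):
    """The 1-form v -> sum_i components[i] * v[i]."""
    return lambda v: sum(c * x for c, x in zip(components, v))


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


alphas = [covector(c) for c in [(1, 2, 0), (0, 1, 3), (2, 0, 1)]]
vectors = [(1, 0, 2), (0, 1, 1), (3, 1, 0)]

form = wedge(wedge(alphas[0], 1, alphas[1], 1), 2, alphas[2], 1)
matrix = [[a(v) for v in vectors] for a in alphas]  # entries alpha^i(v_j)
value = form(*vectors)
print(value == det3(matrix))  # True
```

The comparison goes through an independent cofactor expansion so that the check is not circular.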
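Let us define
$$\epsilon^{i_1 \dots i_k}_{j_1 \dots j_k} := \begin{cases} 1 & \text{if } j_1, \dots, j_k \text{ is an even permutation of } i_1, \dots, i_k, \\ -1 & \text{if } j_1, \dots, j_k \text{ is an odd permutation of } i_1, \dots, i_k, \\ 0 & \text{otherwise.} \end{cases} \tag{8.2}$$
Then we have
Corollary 8.7. Let $e_1, \dots, e_n$ be a basis for $V$ and $e^1, e^2, \dots, e^n$ the dual basis for $V^*$. Then we have
$$e^{i_1} \wedge \cdots \wedge e^{i_k}(e_{j_1}, \dots, e_{j_k}) = \epsilon^{i_1 \dots i_k}_{j_1 \dots j_k}.$$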
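Since any $\alpha \in L^k_{\mathrm{alt}}(V)$ is also a member of $T^0_k(V)$, we may write
$$\alpha = \sum a_{i_1 \dots i_k}\, e^{i_1} \otimes \cdots \otimes e^{i_k},$$
where $a_{i_1 \dots i_k} = \alpha(e_{i_1}, \dots, e_{i_k})$. By $\operatorname{Alt}(\alpha) = \alpha$ and the linearity of Alt we have
$$\alpha = \sum a_{i_1 \dots i_k} \operatorname{Alt}(e^{i_1} \otimes \cdots \otimes e^{i_k}) = \frac{1}{k!} \sum a_{i_1 \dots i_k}\, e^{i_1} \wedge \cdots \wedge e^{i_k}.$$
We conclude that the set of elements of the form $e^{i_1} \wedge \cdots \wedge e^{i_k}$ spans $L^k_{\mathrm{alt}}(V)$. Furthermore, if we use the fact that both $a_{i_{\sigma 1} \dots i_{\sigma k}} = \operatorname{sgn}\sigma\; a_{i_1 \dots i_k}$ and $e^{i_{\sigma 1}} \wedge \cdots \wedge e^{i_{\sigma k}} = \operatorname{sgn}\sigma\; e^{i_1} \wedge \cdots \wedge e^{i_k}$ for any permutation $\sigma \in S_k$, we see that we can permute the indices into increasing order and collect terms to get
$$\alpha = \frac{1}{k!} \sum a_{i_1 \dots i_k}\, e^{i_1} \wedge \cdots \wedge e^{i_k} = \sum_{i_1 < i_2 < \cdots < i_k} a_{i_1 i_2 \dots i_k}\, e^{i_1} \wedge e^{i_2} \wedge \cdots \wedge e^{i_k},$$
where in the last expression we sum only over strictly increasing indices. We can check that the set of $\binom{n}{k}$ elements of the form $e^{i_1} \wedge e^{i_2} \wedge \cdots \wedge e^{i_k}$ with $1 \leq i_1 < i_2 < \cdots < i_k \leq n$ is linearly independent as follows: Suppose $\alpha = \sum_{i_1 < \cdots < i_k} a_{i_1 \dots i_k}\, e^{i_1} \wedge \cdots \wedge e^{i_k} = 0$. Then for any increasing $j_1 < \cdots < j_k$ we have
$$0 = \alpha(e_{j_1}, \dots, e_{j_k}) = \sum_{i_1 < i_2 < \cdots < i_k} a_{i_1 \dots i_k}\, e^{i_1} \wedge \cdots \wedge e^{i_k}(e_{j_1}, \dots, e_{j_k}) = \sum_{i_1 < i_2 < \cdots < i_k} a_{i_1 \dots i_k}\, \epsilon^{i_1 \dots i_k}_{j_1 \dots j_k} = a_{j_1 \dots j_k}.$$
We have used that $\epsilon^{i_1 \dots i_k}_{j_1 \dots j_k}$ is zero unless $j_1 \dots j_k$ is a permutation of $i_1 \dots i_k$, and then in this case, since both are increasing, we must have $i_r = j_r$ for $r = 1, \dots, k$. Thus we get $0 = a_{j_1 j_2 \dots j_k}$, and since the choice of $j$'s was arbitrary, we have shown independence. Thus, we have proved the following theorem.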
Theorem 8.8. If $(e^1, e^2, \dots, e^n)$ is a basis for $V^*$, then the set of elements $\{e^{i_1} \wedge e^{i_2} \wedge \cdots \wedge e^{i_k} : 1 \leq i_1 < i_2 < \cdots < i_k \leq n\}$ is a basis for $L^k_{\mathrm{alt}}(V)$. Thus $\dim(L^k_{\mathrm{alt}}(V)) = \binom{n}{k} = \frac{n!}{k!\,(n-k)!}$. In particular, $L^k_{\mathrm{alt}}(V) = 0$ for $k > n$.
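The dimension count in Theorem 8.8 amounts to enumerating strictly increasing multi-indices. As a small illustrative sketch (the function name `basis_multi_indices` is hypothetical), the standard library can list them and confirm the binomial count, including the vanishing for $k > n$:

```python
from itertools import combinations
from math import comb


def basis_multi_indices(n, k):
    """All strictly increasing k-tuples drawn from 1..n, one per basis element."""
    return list(combinations(range(1, n + 1), k))


n = 4
dims = [len(basis_multi_indices(n, k)) for k in range(n + 2)]
print(dims)  # [1, 4, 6, 4, 1, 0]
```

The single empty tuple for $k = 0$ matches the convention $L^0_{\mathrm{alt}}(V) = \mathbb{R}$, and the dimensions for $0 \leq k \leq n$ add up to $2^n$.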
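Corollary 8.9. If $\alpha^1, \dots, \alpha^k \in V^*$, then $\alpha^1, \dots, \alpha^k$ are linearly independent if and only if
$$\alpha^1 \wedge \cdots \wedge \alpha^k \neq 0.$$
Proof. If $\alpha^1, \dots, \alpha^k$ are independent, then there are elements $\alpha^{k+1}, \dots, \alpha^n$ such that $\alpha^1, \dots, \alpha^n$ is a basis for $V^*$. Let $v_1, \dots, v_n$ be the basis for $V$ dual to the above basis. Then since $\alpha^1 \wedge \cdots \wedge \alpha^k$ is a basis element for $L^k_{\mathrm{alt}}(V)$, it cannot be zero. For the other direction, suppose that $\alpha^1, \dots, \alpha^k$ are dependent, so that, after rearranging if needed, we have
$$\alpha^1 = c_2\alpha^2 + \cdots + c_k\alpha^k.$$
Then we would have
$$\alpha^1 \wedge \cdots \wedge \alpha^k = (c_2\alpha^2 + \cdots + c_k\alpha^k) \wedge \alpha^2 \wedge \cdots \wedge \alpha^k = 0.$$
Thus we see that $\alpha^1 \wedge \cdots \wedge \alpha^k \neq 0$ implies that $\alpha^1, \dots, \alpha^k$ are linearly independent. □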
Notation 8.10. In order to facilitate notation, we will sometimes abbreviate a sequence of $k$ integers, say $i_1, i_2, \dots, i_k$, from the set $\{1, 2, \dots, \dim(V)\}$ as $I$, and $e^{i_1} \wedge e^{i_2} \wedge \cdots \wedge e^{i_k}$ will be written as $e^I$ and $\epsilon^I_J = \epsilon^{i_1 \dots i_k}_{j_1 \dots j_k}$. Also, if we require that $i_1 < i_2 < \cdots < i_k$, then we will write $\vec{I}$. We will freely use similar self-explanatory notation as we go along without further comment. For example, we may write
$$\alpha = \sum a_{\vec{I}}\, e^{\vec{I}}$$
to mean
$$\alpha = \sum_{i_1 < i_2 < \cdots < i_k} a_{i_1 i_2 \dots i_k}\, e^{i_1} \wedge e^{i_2} \wedge \cdots \wedge e^{i_k}.$$
Whenever we have $\alpha = \sum a_{\vec{I}}\, e^{\vec{I}}$, where $a_{\vec{I}} = a_{i_1 \dots i_k}$ and $i_1 < \cdots < i_k$, we can define $a_J$ for any $k$-tuple of indices $J = (j_1, \dots, j_k)$ by requiring that $a_J = \sum \epsilon^{\vec{I}}_J a_{\vec{I}}$. Then we have $a_J = 0$ when the entries of $J$ are indices that are not distinct. Otherwise, $a_J = \epsilon^{\vec{J}}_J a_{\vec{J}}$, where $\vec{J}$ is $J$ rearranged in increasing order. But it is also easy to see that $\alpha(e_{j_1}, \dots, e_{j_k}) = \sum \epsilon^{\vec{I}}_J a_{\vec{I}}$. We then have
$$\alpha = \sum a_{\vec{I}}\, e^{\vec{I}} = \frac{1}{k!} \sum a_I e^I = \frac{1}{k!} \sum a_{i_1 \dots i_k}\, e^{i_1} \wedge \cdots \wedge e^{i_k}.$$
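Exercise 8.11. Show that for $\alpha = \frac{1}{k!} \sum a_I e^I$ and $\beta = \frac{1}{\ell!} \sum b_J e^J$ with $I$ and $J$ as above, we have
$$\alpha \wedge \beta = \frac{1}{(k + \ell)!} \sum (\alpha \wedge \beta)_{i_1 \dots i_{k+\ell}}\, e^{i_1} \wedge \cdots \wedge e^{i_{k+\ell}},$$
where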
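(8.3)
Definition 8.12. Let $v \in V$ and $\omega \in L^k_{\mathrm{alt}}(V)$. Define $i_v\omega \in L^{k-1}_{\mathrm{alt}}(V)$ by
$$i_v\omega(v_2, \dots, v_k) := \omega(v, v_2, \dots, v_k).$$
$i_v\omega$ is called the interior product of $v$ with $\omega$ or the contraction of $\omega$ by $v$. By convention $i_v a = 0$ for $a \in L^0_{\mathrm{alt}}(V) := \mathbb{R}$. Thus we obtain a linear map $i_v : L^k_{\mathrm{alt}}(V) \to L^{k-1}_{\mathrm{alt}}(V)$. It is clear that $i_v$ depends linearly on $v$ and that $i_v i_w \omega = -i_w i_v \omega$ for all $v, w \in V$, and hence $i_v \circ i_v = 0$. With $L^1_{\mathrm{alt}}(V) = V^*$ and $L^0_{\mathrm{alt}}(V) = \mathbb{R}$, the sum
$$L_{\mathrm{alt}}(V) = \bigoplus_{k=0}^{\dim(V)} L^k_{\mathrm{alt}}(V)$$
is made into an $\mathbb{R}$-algebra via the exterior product just defined. Since $\alpha \wedge \beta \in L^{k_1+k_2}_{\mathrm{alt}}(V)$ whenever $\alpha \in L^{k_1}_{\mathrm{alt}}(V)$ and $\beta \in L^{k_2}_{\mathrm{alt}}(V)$, the algebra $L_{\mathrm{alt}}(V)$ is a graded algebra (see Definition D.43). The contraction map $i_v$ for $v \in V$ extends to a map $L_{\mathrm{alt}}(V) \to L_{\mathrm{alt}}(V)$. Since $i_v(L^k_{\mathrm{alt}}(V)) \subset L^{k-1}_{\mathrm{alt}}(V)$, we say that $i_v$ is a map of degree $-1$.
Proposition 8.13. For $v \in V$, the contraction map $i_v$ satisfies the product rule
$$i_v(\alpha \wedge \beta) = (i_v\alpha) \wedge \beta + (-1)^k \alpha \wedge (i_v\beta) \quad \text{for } \alpha \in L^k_{\mathrm{alt}}(V).$$
The map $i_v : L_{\mathrm{alt}}(V) \to L_{\mathrm{alt}}(V)$ is the unique degree $-1$ map satisfying the above product rule and satisfying $i_v a = 0$ for $a \in \mathbb{R}$ and $i_v\theta = \theta(v)$ for $\theta \in V^* = L^1_{\mathrm{alt}}(V)$ and $v \in V$.
Proof. Let $\alpha \in L^k_{\mathrm{alt}}(V)$ and $\beta \in L^\ell_{\mathrm{alt}}(V)$. In the following computation we use the permutation $\rho$ which is given by
$$(2, 3, \dots, k + 1, 1, k + 2, \dots, k + \ell) \mapsto (1, 2, \dots, k + \ell).$$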
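The sign of this permutation, call it $\rho$, is $(-1)^k$. We compute as follows:
$$(i_v\alpha \wedge \beta + (-1)^k \alpha \wedge i_v\beta)(v_2, \dots, v_{k+\ell})$$
$$= \frac{(k + \ell - 1)!}{(k - 1)!\, \ell!} \operatorname{Alt}(i_v\alpha \otimes \beta)(v_2, \dots, v_{k+\ell}) + (-1)^k \frac{(k + \ell - 1)!}{k!\, (\ell - 1)!} \operatorname{Alt}(\alpha \otimes i_v\beta)(v_2, \dots, v_{k+\ell})$$
$$= \frac{1}{(k - 1)!\, \ell!} \sum_{\sigma} \operatorname{sgn}(\sigma)\, \alpha(v, v_{\sigma_2}, \dots, v_{\sigma_k})\, \beta(v_{\sigma_{k+1}}, \dots, v_{\sigma_{k+\ell}}) + \frac{(-1)^k}{k!\, (\ell - 1)!} \sum_{\sigma} \operatorname{sgn}(\sigma)\, \alpha(v_{\sigma_2}, \dots, v_{\sigma_{k+1}})\, \beta(v, v_{\sigma_{k+2}}, \dots, v_{\sigma_{k+\ell}}),$$
where $\sigma$ runs over the permutations of $\{2, \dots, k + \ell\}$. On the other hand, in the sum (8.1) for $(\alpha \wedge \beta)(v, v_2, \dots, v_{k+\ell})$ the vector $v$ lands either in one of the $k$ slots of $\alpha$ or in one of the $\ell$ slots of $\beta$. Since $\alpha$ and $\beta$ are alternating, $v$ can be moved to the first slot of the factor containing it. The terms of the first kind then add up to the first sum above, and those of the second kind, after the arguments are reordered by $\rho$ at the cost of the sign $(-1)^k$, add up to the second. Hence the expression above equals
$$\alpha \wedge \beta\,(v, v_2, \dots, v_{k+\ell}) = i_v(\alpha \wedge \beta)(v_2, \dots, v_{k+\ell}).$$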
We leave the proof of the remaining statements of the proposition to the reader (see Problem 14). □
It follows from the above that if $\theta_1, \dots, \theta_k \in V^*$ and $v \in V$, then
$$i_v(\theta_1 \wedge \cdots \wedge \theta_k) = \sum_{\ell=1}^{k} (-1)^{\ell+1}\, \theta_\ell(v)\, \theta_1 \wedge \cdots \wedge \widehat{\theta_\ell} \wedge \cdots \wedge \theta_k,$$
where the caret over $\theta_\ell$ denotes omission.
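The omission formula gives a convenient way to experiment with the interior product. The sketch below is illustrative only and rests on assumptions not in the text: forms are dictionaries from increasing index tuples to coefficients, a vector is a dictionary from index to component, and `sort_sign`, `wedge`, `add` and `interior` are hypothetical helper names. It checks the product rule of Proposition 8.13 and $i_v \circ i_v = 0$ on one example.

```python
from itertools import product


def sort_sign(indices):
    """Sort an index tuple; return (sign, sorted tuple), or (0, ()) if an index repeats."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()
    sign = 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)


def wedge(a, b):
    """Wedge product of forms stored as {increasing index tuple: coefficient}."""
    out = {}
    for (I, x), (J, y) in product(a.items(), b.items()):
        sign, K = sort_sign(I + J)
        if sign:
            out[K] = out.get(K, 0) + sign * x * y
    return {K: c for K, c in out.items() if c != 0}


def add(a, b, scale=1):
    """The form a + scale * b."""
    out = dict(a)
    for K, c in b.items():
        out[K] = out.get(K, 0) + scale * c
    return {K: c for K, c in out.items() if c != 0}


def interior(v, form):
    """i_v(form) via the omission formula; v maps an index i to the component v^i."""
    out = {}
    for I, c in form.items():
        for pos, i in enumerate(I):
            coeff = (-1) ** pos * v.get(i, 0) * c  # pos is 0-based, so this is (-1)^(l+1)
            if coeff:
                J = I[:pos] + I[pos + 1:]
                out[J] = out.get(J, 0) + coeff
    return {J: c for J, c in out.items() if c != 0}


alpha = {(1, 2): 2, (1, 3): 1}  # a 2-form, so k = 2
beta = {(1,): 1, (3,): -1}      # a 1-form
v = {1: 1, 2: 4, 3: -2}

lhs = interior(v, wedge(alpha, beta))
rhs = add(wedge(interior(v, alpha), beta), wedge(alpha, interior(v, beta)), scale=(-1) ** 2)
print(lhs == rhs)  # True
```

A 0-form is represented by the empty index tuple, so `interior(v, beta)` is the dictionary `{(): 3}`, that is, the number $\beta(v) = 3$.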
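Proposition 8.14. If $\lambda \in L(V, V)$, $\alpha \in L^{k_1}_{\mathrm{alt}}(V)$ and $\beta \in L^{k_2}_{\mathrm{alt}}(V)$, then
$$\lambda^*(\alpha \wedge \beta) = \lambda^*\alpha \wedge \lambda^*\beta.$$
Proof. This follows from Proposition 1.83 and the definition of the exterior product. □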
We look more closely at $L^n_{\mathrm{alt}}(V)$ where $n = \dim V$. The dimension of $L^n_{\mathrm{alt}}(V)$ is one, and any nonzero element of $L^n_{\mathrm{alt}}(V)$ provides a basis. If $\lambda \in L(V, V)$, then $\lambda^* : L^n_{\mathrm{alt}}(V) \to L^n_{\mathrm{alt}}(V)$ is a linear transformation between one-dimensional vector spaces, and so it must be multiplication by an element of $\mathbb{R}$. Thus there is a unique number $\det(\lambda) \in \mathbb{R}$ such that
$$\lambda^*\omega = \det(\lambda)\,\omega$$
for any $\omega \in L^n_{\mathrm{alt}}(V)$. This number is called the determinant of $\lambda$. This provides a definition of determinant that does not involve a choice of basis. We will show that if $A = (a^i_j)$ represents $\lambda \in L(V, V)$ with respect to a basis for $V$, then $\det(\lambda) = \det(A)$, where the determinant of a matrix is given by the standard definition. Let $\lambda \in L(V, V)$ and suppose that $\lambda(e_i) = \sum_j a^j_i e_j$ for some basis $(e_1, \dots, e_n)$ with dual $(e^1, \dots, e^n)$. Then $A = (a^i_j)$ represents $\lambda$, and we have
$$\lambda^*(e^1 \wedge \cdots \wedge e^n)(e_1, \dots, e_n) = (e^1 \wedge \cdots \wedge e^n)(\lambda e_1, \dots, \lambda e_n) = \det(e^i(\lambda e_j)) = \det A.$$
On the other hand, $\lambda^*(e^1 \wedge \cdots \wedge e^n) = (\det \lambda)(e^1 \wedge \cdots \wedge e^n)$, and since $e^1 \wedge \cdots \wedge e^n(e_1, \dots, e_n) = 1$, it must be that $\det(\lambda) = \det(A)$.
Exercise 8.15. Show (without using a basis) that if $\tau, \lambda \in L(V, V)$, then
(i) $\det(\tau \circ \lambda) = \det \tau \det \lambda$;
(ii) $\det(\mathrm{id}) = 1$;
(iii) $\lambda \in GL(V)$ if and only if $\det \lambda \neq 0$;
(iv) if $\lambda \in GL(V)$, then $\det(\lambda^{-1}) = (\det \lambda)^{-1}$.
We now briefly discuss orientations of real vector spaces. In what follows, let $V$ be a real vector space with $n = \dim V$. The nonzero elements of $L^n_{\mathrm{alt}}(V)$ are sometimes referred to as volume elements (although this term will also apply to a global object later on). Two nonzero elements $\omega_1$ and $\omega_2$ of $L^n_{\mathrm{alt}}(V)$ are said to be equivalent if there is a scalar $c > 0$ such that $\omega_1 = c\,\omega_2$. The equivalence class of $\omega$ will be denoted $[\omega]$. There are clearly exactly two equivalence classes. An equivalence class is referred to as an orientation for $V$. An oriented vector space is a vector space with a choice of orientation and is sometimes written as a pair $(V, [\omega])$. An ordered basis $(e_1, \dots, e_n)$ for an oriented real vector space $(V, [\omega])$ is said to be positive with respect to the orientation if $\omega(e_1, \dots, e_n) > 0$ for some and hence any $\omega \in [\omega]$. Equivalently, $(e_1, \dots, e_n)$ is positive if $e^1 \wedge \cdots \wedge e^n \in [\omega]$, where $e^1, \dots, e^n$ is the dual basis. Actually, a choice of ordered basis for a real vector space determines an orientation with respect to which it is positive. Indeed, if $e^1, \dots, e^n$ is dual to such a basis, then choose $[\omega]$ where $\omega = e^1 \wedge \cdots \wedge e^n$.
Definition 8.16. Let $(V_1, [\omega_1])$ and $(V_2, [\omega_2])$ be oriented real vector spaces. A linear isomorphism $\lambda : V_1 \to V_2$ is said to be orientation preserving if $\lambda^*\omega_2 = c\,\omega_1$ for some $c > 0$ and some, and hence any, choices $\omega_1 \in [\omega_1]$ and $\omega_2 \in [\omega_2]$.
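The basis-free description of the determinant given above, as the factor by which $\lambda^*$ scales a top-degree form, can be tried out numerically. This is a sketch under assumptions of my own: $V = \mathbb{R}^3$ with the standard basis, maps given by integer matrices, and hypothetical helper names `top_form`, `apply`, `pullback`, `det_of` and `matmul`. The top form is evaluated with the permutation sum from Lemma 8.6.

```python
from itertools import permutations


def sgn(p):
    """Sign of a permutation of 0..n-1, computed by counting inversions."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def top_form(*vectors):
    """e^1 ^ ... ^ e^n evaluated on n vectors of R^n (Lemma 8.6 with alpha^i = e^i)."""
    n = len(vectors)
    total = 0
    for p in permutations(range(n)):
        term = sgn(p)
        for i in range(n):
            term *= vectors[p[i]][i]  # e^(i+1) applied to the vector in slot p[i]
        total += term
    return total


def apply(A, v):
    """Matrix-vector product."""
    return tuple(sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A)))


def pullback(A, form):
    """The form (v_1, ..., v_n) -> form(A v_1, ..., A v_n)."""
    return lambda *vs: form(*(apply(A, v) for v in vs))


def det_of(A):
    """det(A) read off as the value of the pulled-back top form on the standard basis."""
    n = len(A)
    basis = [tuple(1 if i == j else 0 for i in range(n)) for j in range(n)]
    return pullback(A, top_form)(*basis)


def matmul(B, A):
    """Matrix product B A, representing the composition of the two maps."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A))) for j in range(len(A[0]))]
            for i in range(len(B))]


A = [[2, 1, 0], [0, 1, 3], [1, 0, 1]]
B = [[1, 2, 0], [0, 1, 3], [2, 0, 1]]
u, v, w = (1, 0, 2), (0, 1, 1), (3, 1, 0)

# The scaling factor does not depend on the vectors the forms are evaluated on.
print(pullback(A, top_form)(u, v, w) == det_of(A) * top_form(u, v, w))  # True
```

The same helpers illustrate part (i) of Exercise 8.15 on this example: `det_of(matmul(B, A))` agrees with `det_of(B) * det_of(A)`.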
When one talks about an element $\lambda \in GL(V)$ being orientation preserving, one means that $\det \lambda > 0$, and this is tantamount to $\lambda : (V, [\omega]) \to (V, [\omega])$ being orientation preserving for any choice of orientation $[\omega]$.
8.1.1. The abstract Grassmann algebra. We now take a very abstract approach to constructing an algebra that will be seen to be isomorphic to $L_{\mathrm{alt}}(V)$. We wish to construct a space that is universal with respect to alternating multilinear maps. We work in the category of real vector spaces, although much of what we do here makes sense for modules. Consider the tensor space $T^k(V) := \bigotimes^k V$ (take any realization of the abstract tensor product as in Definition D.17). Let $A$ be the submodule of $T^k(V)$ generated by elements of the form $v_1 \otimes \cdots \otimes v_i \otimes \cdots \otimes v_i \otimes \cdots \otimes v_k$. In other words, $A$ is generated by simple tensors with two (or more) equal factors. Recall that associated to $T^k(V)$ we have the canonical map $\otimes : V \times \cdots \times V \to T^k(V)$ defined so that $\otimes(v_1, \dots, v_k) = v_1 \otimes \cdots \otimes v_k$. We define the space of $k$-vectors to be
$$V \wedge \cdots \wedge V := \textstyle\bigwedge^k V := T^k(V)/A.$$
Let $\wedge^k : V \times \cdots \times V \to \bigwedge^k V$ be the composition of the canonical map $\otimes$ with the quotient map of $T^k(V)$ onto $\bigwedge^k V$. This map turns out to be an alternating multilinear map. We will denote $\wedge^k(v_1, \dots, v_k)$ by $v_1 \wedge \cdots \wedge v_k$. Using the universal property of $T^k(V)$ as described in Appendix D, one can show that the pair $(\bigwedge^k V, \wedge^k)$ is universal with respect to alternating $k$-multilinear maps: That is, given any alternating $k$-multilinear map $\alpha : V \times \cdots \times V \to W$, there is a unique linear map $\alpha_\wedge : \bigwedge^k V \to W$ such that $\alpha = \alpha_\wedge \circ \wedge^k$; that is, the following diagram commutes:
$$\begin{array}{ccc} V \times \cdots \times V & \xrightarrow{\ \alpha\ } & W \\ {\scriptstyle \wedge^k} \big\downarrow & \nearrow {\scriptstyle \alpha_\wedge} & \\ \bigwedge^k V & & \end{array}$$
Notice that we also have that $v_1 \wedge \cdots \wedge v_k$ is the image of $v_1 \otimes \cdots \otimes v_k$ under the quotient map $T^k(V) \to \bigwedge^k V$. Next we define $\bigwedge V := \bigoplus_{k=0}^{\infty} \bigwedge^k V$, which is a direct sum, and we take $\bigwedge^0 V := \mathbb{R}$. We impose on $\bigwedge V$ the multiplication generated by the rule
$$(v_1 \wedge \cdots \wedge v_i) \times (v'_1 \wedge \cdots \wedge v'_j) \mapsto v_1 \wedge \cdots \wedge v_i \wedge v'_1 \wedge \cdots \wedge v'_j \in \textstyle\bigwedge^{i+j} V.$$
The resulting graded algebra is called the Grassmann algebra or exterior algebra. (Of course, the definition of $\wedge$ here is different from what we defined previously.) If we need to have a $\mathbb{Z}$ grading rather than an $\mathbb{N}$ grading, we may define $\bigwedge^k V := 0$ for $k < 0$ and extend the multiplication in the obvious way. Elements of $\bigwedge V$ are called multivectors and specifically, elements of $\bigwedge^k V$ are called $k$-multivectors.
Notice that since $(v + w) \wedge (w + v) = 0$, it follows that $v \wedge w = -w \wedge v$. In fact, any transposition of the factors in a simple element such as $v_1 \wedge \cdots \wedge v_k$ introduces a change of sign:
$$v_1 \wedge \cdots \wedge v_i \wedge \cdots \wedge v_j \wedge \cdots \wedge v_k = -v_1 \wedge \cdots \wedge v_j \wedge \cdots \wedge v_i \wedge \cdots \wedge v_k.$$
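Lemma 8.17. If $V$ has dimension $n$, then $\bigwedge^k V = 0$ for $k > n$. If $e_1, \dots, e_n$ is a basis for $V$, then the set
$$\{e_{i_1} \wedge \cdots \wedge e_{i_k} : 1 \leq i_1 < \cdots < i_k \leq n\}$$
is a basis for $\bigwedge^k V$, where we agree that $e_{i_1} \wedge \cdots \wedge e_{i_k} = 1$ if $k = 0$.
Proof. The first statement is easy and we leave it to the reader. We will show that the set above is indeed a basis. First note that $\bigwedge^n V$ is spanned by $e_1 \wedge \cdots \wedge e_n$. To see that $e_1 \wedge \cdots \wedge e_n$ is not zero, we let $\det : V \times \cdots \times V \to \mathbb{R}$ be the multilinear map given by representing the arguments as column vectors of components with respect to the given basis and then taking the determinant of the $n \times n$ matrix built from these column vectors. Then $\det(e_1, \dots, e_n) = 1$. But by the universal property above there is a linear map $\det_\wedge$ such that $\det = \det_\wedge \circ \wedge^n$, and so
$$\det\nolimits_\wedge(e_1 \wedge \cdots \wedge e_n) = \det\nolimits_\wedge \circ \wedge^n(e_1, \dots, e_n) = \det(e_1, \dots, e_n) = 1;$$
thus, we conclude that $e_1 \wedge \cdots \wedge e_n$ is not zero (and is a basis for $\bigwedge^n V$). Now it is easy to see that the elements of the form $e_{i_1} \wedge \cdots \wedge e_{i_k}$ span $\bigwedge^k V$. To see that we have linear independence, suppose that
$$\sum_{1 \leq i_1 < \cdots < i_k \leq n} a_{i_1 \dots i_k}\, e_{i_1} \wedge \cdots \wedge e_{i_k} = 0.$$
Fix an increasing multi-index $i_1 < \cdots < i_k$ and let $j_1 < \cdots < j_{n-k}$ be the complementary indices. Wedging the sum with $e_{j_1} \wedge \cdots \wedge e_{j_{n-k}}$ kills every term except the one with indices $i_1, \dots, i_k$, so that
$$\pm a_{i_1 \dots i_k}\, e_1 \wedge \cdots \wedge e_n = 0,$$
from which we conclude that $a_{i_1 \dots i_k} = 0$, and since $i_1, \dots, i_k$ was arbitrary, we are done. □
Exercise 8.18. Show that $v_1, \dots, v_k \in V$ are linearly independent if and only if $v_1 \wedge \cdots \wedge v_k \neq 0$. Compare this with Corollary 8.9.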
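Definition 8.19. An element $\xi \in \bigwedge^k V$ is called decomposable if $\xi = v_1 \wedge \cdots \wedge v_k$ for some $v_1, \dots, v_k \in V$.
Let us gain a little practice dealing with multivectors by proving the following proposition:
Proposition 8.20. Let $\xi \in \bigwedge^2 V$ with $\xi \neq 0$. Then there is a basis $v_1, \dots, v_n$ of $V$ such that
$$\xi = v_1 \wedge v_2 + v_3 \wedge v_4 + \cdots + v_{2r-1} \wedge v_{2r}.$$
Furthermore, in this case the $r$-fold product $\xi \wedge \cdots \wedge \xi$ is nonzero and decomposable while the $(r+1)$-fold product is zero.
Proof. First we prove that there exists such a decomposition for $\xi$. Let $e_1, \dots, e_n$ be a basis for $V$. We have
$$\xi = \sum_{i<j} a_{ij}\, e_i \wedge e_j = a_{12}\, e_1 \wedge e_2 + a_{13}\, e_1 \wedge e_3 + \cdots + a_{1n}\, e_1 \wedge e_n + a_{23}\, e_2 \wedge e_3 + a_{24}\, e_2 \wedge e_4 + \cdots + a_{2n}\, e_2 \wedge e_n + \xi',$$
where $\xi'$ does not involve $e_1$ or $e_2$. By renumbering if necessary we may assume that $a_{12}$ is not zero. But then if
$$v_1 := e_1 - (a_{23}/a_{12})\, e_3 - \cdots - (a_{2n}/a_{12})\, e_n,$$
$$v_2 := a_{12} e_2 + a_{13} e_3 + \cdots + a_{1n} e_n,$$
we have that $v_1, v_2, e_3, \dots, e_n$ are linearly independent and
$$\xi = v_1 \wedge v_2 + \nu,$$
where $\nu$ does not involve $e_1$ or $e_2$. (The signs in $v_1$ are chosen so that the terms $-(a_{2j}/a_{12})\, e_j \wedge a_{12} e_2 = a_{2j}\, e_2 \wedge e_j$ reproduce the second row of the expansion.) If $\nu = 0$, then we are done. Otherwise we may repeat the process with
$$\nu = \sum_{\substack{i<j \\ i, j \geq 3}} b_{ij}\, e_i \wedge e_j.$$
Clearly a simple induction gives
$$\xi = v_1 \wedge v_2 + v_3 \wedge v_4 + \cdots + v_{2r-1} \wedge v_{2r}$$
for some $v_1, \dots, v_{2r}$ such that $v_1, \dots, v_{2r}, e_{2r+1}, \dots, e_n$ is the desired basis.
Next we consider the $r$-fold product. If we set $\xi_k := v_{2k-1} \wedge v_{2k}$ for $k = 1, \dots, r$, then
$$\xi = \sum_{i=1}^{r} \xi_i \quad \text{and} \quad \xi_i \wedge \xi_i = 0.$$
On the other hand, if $i \neq j$, then
$$\xi_i \wedge \xi_j = \xi_j \wedge \xi_i.$$
Thus
$$\xi^2 = \xi \wedge \xi = 2 \sum_{i<j} \xi_i \wedge \xi_j = 2 \sum_{i<j} v_{2i-1} \wedge v_{2i} \wedge v_{2j-1} \wedge v_{2j},$$
and inductively
$$\xi^r = r!\; \xi_1 \wedge \cdots \wedge \xi_r = r!\; v_1 \wedge v_2 \wedge \cdots \wedge v_{2r} \neq 0,$$
while $\xi^{r+1} = 0$. □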
The number $r$ from the previous proposition is called the rank of the element $\xi \in \bigwedge^2 V$, and the definition works just as well for elements of $\bigwedge^2 V^*$. It can be shown that if
$$\xi = \sum_{i<j} a_{ij}\, e_i \wedge e_j = \frac{1}{2} \sum_{i,j} a_{ij}\, e_i \wedge e_j,$$
where $(a_{ij})$ is an antisymmetric matrix, then the rank of the matrix $(a_{ij})$ is $2r$, twice the rank of $\xi$. The following proposition follows easily from the universal properties of the exterior product.
Remark 8.21. There is a natural isomorphism
$$L^k_{\mathrm{alt}}(V; W) \cong L\big(\textstyle\bigwedge^k V; W\big).$$
Lemma 8.22. In particular,
$$L^k_{\mathrm{alt}}(V) \cong \big(\textstyle\bigwedge^k V\big)^*.$$
We would now like to embed $\bigwedge^k V^*$ into $\bigotimes^k V^*$, and this involves a choice. For each $k$, let $A_k : V^* \times \cdots \times V^* \to \bigotimes^k V^*$ be defined by
$$A_k(\alpha_1, \dots, \alpha_k) := \sum_{\sigma} \operatorname{sgn}(\sigma)\, \alpha_{\sigma_1} \otimes \cdots \otimes \alpha_{\sigma_k}.$$
By the universal property of $\bigwedge^k V^*$ we obtain an induced map
$$A_k : \textstyle\bigwedge^k V^* \to \bigotimes^k V^*.$$
If we identify $\bigotimes^k V^*$ with $T^0_k(V)$, then we get a map $A_k : \bigwedge^k V^* \to T^0_k(V)$.
Proposition 8.23. The map $A_k : \bigwedge^k V^* \to T^0_k(V)$ is a linear isomorphism onto its image, which is equal to $L^k_{\mathrm{alt}}(V)$, and it satisfies
$$A_k(\alpha_1 \wedge \cdots \wedge \alpha_k)(v_1, \dots, v_k) = \det(\alpha_i(v_j)).$$
Proof. We leave the proof as Problem 5. □
We combine these maps to obtain a linear isomorphism
$$A : \textstyle\bigwedge V^* \to L_{\mathrm{alt}}(V).$$
Now both $L_{\mathrm{alt}}(V)$ and $\bigwedge V^*$ have already independently been given exterior algebra structures via their respective wedge products. One may check that $A$ has been defined in such a way as to be an isomorphism of these algebras:
$$\textstyle\bigwedge V^* \cong L_{\mathrm{alt}}(V) \quad \text{(as exterior algebras)}.$$
Also notice that
$$\textstyle\bigwedge^k V^* \cong L^k_{\mathrm{alt}}(V) \cong \big(\bigwedge^k V\big)^*,$$
which allows us to think of $\bigwedge^k V^*$ as dual to $\bigwedge^k V$ in such a way that
$$(\alpha_1 \wedge \cdots \wedge \alpha_k)(v_1 \wedge \cdots \wedge v_k) = \det(\alpha_i(v_j)).$$
Remark 8.24 (An identification). In what follows, we will identify $\bigwedge^k V^*$ with $L^k_{\mathrm{alt}}(V)$ and hence $\bigwedge V^*$ with $L_{\mathrm{alt}}(V)$ whenever convenient. In other words, we freely treat elements of $\bigwedge^k V^*$ as alternating multilinear forms.
8.2. Differential Forms
Let $M$ be an $n$-manifold. We now bundle together the various spaces $L^k_{\mathrm{alt}}(T_pM)$. That is, we form the natural bundle $L^k_{\mathrm{alt}}(TM)$ that has as its fiber at $p$ the space $L^k_{\mathrm{alt}}(T_pM)$. Thus $L^k_{\mathrm{alt}}(TM) = \bigcup_{p \in M} L^k_{\mathrm{alt}}(T_pM)$.
Exercise 8.25. Exhibit the smooth structure and vector bundle structure on $L^k_{\mathrm{alt}}(TM) = \bigcup_{p \in M} L^k_{\mathrm{alt}}(T_pM)$. [Hint: Let $(U, x)$ be a chart of $M$ and $x^1, \dots, x^n$ the coordinate functions. Let $\widetilde{U} := \bigcup_{p \in U} L^k_{\mathrm{alt}}(T_pM)$ and let $d = \binom{n}{k}$. Then we have a map $\widetilde{U} \to U \times \mathbb{R}^d$ given by $\alpha_p \mapsto (p, a)$, where $a$ is the $d$-tuple of components (in some fixed order) of $\alpha_p$ given by its local coordinate representation.]
Let the smooth sections of this bundle be denoted by
$$\Omega^k(M) = \Gamma(M; L^k_{\mathrm{alt}}(TM)), \tag{8.4}$$
and sections over $U \subset M$ by $\Omega^k_M(U)$.
Definition 8.26. Elements of $\Omega^k(M)$ are called differential $k$-forms or just $k$-forms. The space $\Omega^k(M)$ is a module over the algebra of smooth functions $C^\infty(M) = \mathcal{F}(M)$. If $n = \dim M$ then we have the direct sum
$$\Omega(M) = \bigoplus_{k=0}^{n} \Omega^k(M),$$
with a similar decomposition for any open $U \subset M$.
Exercise 8.27. Show that there is a module isomorphism
$$\bigoplus_{k=0}^{n} \Omega^k(M) \cong \Gamma\Big(\bigoplus_{k=0}^{n} L^k_{\mathrm{alt}}(TM)\Big).$$
Definition 8.28. Let $M$ be an $n$-manifold. The elements of $\Omega(M)$ are called differential forms on $M$. We identify $\Omega^k(M)$ with the obvious subspace of $\Omega(M) = \bigoplus_k \Omega^k(M)$. A differential form in $\Omega^k(M)$ is said to be homogeneous of degree $k$. If $\omega \in \Omega(M)$, then we can uniquely write $\omega = \sum_{k=0}^{n} \omega_k$, where $\omega_k \in \Omega^k(M)$, and the $\omega_k$ are called homogeneous components of $\omega$.
Definition 8.29. For $\omega \in \Omega^{k_1}(M)$ and $\eta \in \Omega^{k_2}(M)$, we define the exterior product $\omega \wedge \eta \in \Omega^{k_1+k_2}(M)$ by
$$(\omega \wedge \eta)(p) := \omega(p) \wedge \eta(p).$$
It is easy to see that $\Omega(M)$ is a ring under the wedge product and, in fact, a $C^\infty(M)$-algebra. Whenever convenient, we may extend this to a sum over all $k \in \mathbb{Z}$ by defining (as before) $\Omega^k(M) := 0$ for $k < 0$ and $\Omega^k(M) := 0$ if $k > \dim(M)$. We have made the trivial extension of $\wedge$ to a $\mathbb{Z}$-graded algebra by declaring that $\omega \wedge \eta = 0$ if either $\eta$ or $\omega$ is homogeneous of negative degree. Thus $\Omega(M)$ is a graded algebra and is said to be graded commutative because $\alpha \wedge \beta = (-1)^{k\ell} \beta \wedge \alpha$ for $\alpha \in \Omega^k(M)$ and $\beta \in \Omega^\ell(M)$. Sometimes one sees this notion referred to as skew-commutativity.
Just as a tangent vector is the infinitesimal version of a (parametrized) curve through a point $p \in M$, so a covector at $p \in M$ is the infinitesimal version of a function defined near $p$. At this point one must be careful. It is true that for any single covector $\alpha_p \in T^*_pM$ there always exists a function $f$ such that $df|_p = \alpha_p$. But as we saw in Chapter 2, if $\alpha \in \Omega^1(M)$, then it is not necessarily true that there is a function $f \in C^\infty(M)$ such that $df = \alpha$. If $f_1, f_2, \dots, f_k$ are smooth functions, then one way to picture $df_1 \wedge \cdots \wedge df_k$ is by thinking of the intersecting family of level sets of the functions $f_1, f_2, \dots, f_k$, which in some cases can be pictured as a sort of "egg crate" structure. For a 2-form in a 3-manifold, one obtains "flux tubes" as shown in Figure 8.1. The infinitesimal version of this is a sort of straightened out "linear egg crate structure," which may be thought of as existing in the tangent space at a point. This is the rough intuition for $df_1|_p \wedge \cdots \wedge df_k|_p$, and the $k$-form $df_1 \wedge \cdots \wedge df_k$ is a field of such structures which somehow fit the level sets of the family $f_1, f_2, \dots, f_k$. Of course, $df_1 \wedge \cdots \wedge df_k$ is a very special kind of $k$-form. In general, a $k$-form over $U$ may not arise from a family of functions.
Figure 8.1. 2-form as "flux tubes"
In local coordinates, calculation is often quite easy and formal. For example, in $\mathbb{R}^3$ with standard coordinates $x, y, z$, a simple wedge product calculation is as follows:
$$(xy\,dx + z\,dy + dz) \wedge (x\,dy + z\,dz)$$
$$= xy\,dx \wedge x\,dy + xy\,dx \wedge z\,dz + z\,dy \wedge x\,dy + z\,dy \wedge z\,dz + dz \wedge x\,dy + dz \wedge z\,dz$$
$$= x^2y\,dx \wedge dy + xyz\,dx \wedge dz + z^2\,dy \wedge dz + x\,dz \wedge dy$$
$$= x^2y\,dx \wedge dy + xyz\,dx \wedge dz + (z^2 - x)\,dy \wedge dz.$$
An equally trivial calculation shows that
$$(xyz^2\,dx \wedge dy + dy \wedge dz) \wedge (dx + x\,dy + z\,dz) = (xyz^3 + 1)\,dx \wedge dy \wedge dz.$$
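Both coordinate calculations above can be spot-checked at a point. The sketch below makes assumptions that are not in the text: the coefficient functions are evaluated at the arbitrarily chosen point $(x, y, z) = (2, 3, 5)$, the indices 1, 2, 3 stand for $dx, dy, dz$, and `sort_sign` and `wedge` are hypothetical helpers acting on dictionaries from increasing index tuples to numbers. Agreement at one point is a sanity check, not a proof of the identities.

```python
from itertools import product


def sort_sign(indices):
    """Sort an index tuple; return (sign, sorted tuple), or (0, ()) if an index repeats."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()
    sign = 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)


def wedge(a, b):
    """Pointwise wedge product of forms stored as {increasing index tuple: value}."""
    out = {}
    for (I, p), (J, q) in product(a.items(), b.items()):
        sign, K = sort_sign(I + J)
        if sign:
            out[K] = out.get(K, 0) + sign * p * q
    return {K: c for K, c in out.items() if c != 0}


x, y, z = 2, 3, 5  # sample point, chosen arbitrarily

first = wedge({(1,): x * y, (2,): z, (3,): 1}, {(2,): x, (3,): z})
expected_first = {(1, 2): x**2 * y, (1, 3): x * y * z, (2, 3): z**2 - x}

second = wedge({(1, 2): x * y * z**2, (2, 3): 1}, {(1,): 1, (2,): x, (3,): z})
expected_second = {(1, 2, 3): x * y * z**3 + 1}

print(first == expected_first, second == expected_second)  # True True
```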
8.2.1. Pull-back of a differential form. Since we treat differential forms as alternating covariant tensor fields, we have a notion of pull-back already defined. It is easy to see that the pull-back of an alternating tensor field is also an alternating tensor field, and so given any smooth map $f : M \to N$, we get a map $f^* : \Omega^k(N) \to \Omega^k(M)$. We recall here the definition:
$$(f^*\eta)(v_1, \dots, v_k) := \eta(T_pf \cdot v_1, \dots, T_pf \cdot v_k)$$
for tangent vectors $v_1, \dots, v_k \in T_pM$. The pull-back extends in the obvious way to a map $f^* : \Omega(N) \to \Omega(M)$.
Proposition 8.30. Let $f : M \to N$ be a smooth map and let $\eta_1, \eta_2 \in \Omega(N)$. Then we have
$$f^*(\eta_1 \wedge \eta_2) = f^*\eta_1 \wedge f^*\eta_2.$$
Proof. This follows directly from Proposition 8.14. □
Proposition 8.31. Let $g : M \to N$ and $f : N \to P$ be smooth maps. Then for any smooth differential form $\eta \in \Omega(P)$ we have $(f \circ g)^*\eta = g^*(f^*\eta)$. Thus $(f \circ g)^* = g^* \circ f^*$.
Proof. We prove only the case $\eta \in \Omega^1(P)$. The general case is entirely similar and is left to the reader. For $v \in T_pM$, we have
$$(f \circ g)^*\eta(v) = \eta(T(f \circ g)v) = \eta(Tf \circ Tg(v)) = f^*\eta(Tg \cdot v) = g^*(f^*\eta)(v),$$
which completes the proof for the considered case. □
From the above propositions we see that we have a contravariant functor from the category of smooth manifolds and smooth maps to the category of rings, which assigns to each smooth manifold $M$ the space $\Omega(M)$ and to each smooth map $f$ the ring homomorphism $f^*$.
In case $S$ is a regular submanifold of $M$, we have the inclusion map $\iota : S \hookrightarrow M$, which maps $p \in S$ to the very same point $p \in M$. As mentioned before, it is natural to identify $T_pS$ with $T_p\iota(T_pS)$ for any $p \in S$. In other words, we often do not distinguish between a vector $v_p$ and $T_p\iota(v_p)$. Thus we view the tangent bundle of $S$ as a subset of $TM$. With this in mind we must realize that for any $\alpha \in \Omega^k(M)$ the form $\iota^*\alpha$ is just the restriction of $\alpha$ to vectors tangent to $S$. In particular, if $U \subset M$ is open and $\iota : U \hookrightarrow M$, then $\iota^*\alpha = \alpha|_U$.
The local expression for the pull-back is described as follows. Let us abbreviate: $\epsilon^I_J = \epsilon^{i_1 \dots i_k}_{j_1 \dots j_k}$ (recall the definition given by equation (8.2)). Let $(U, x)$ be a chart on $M$ and $(V, y)$ a chart on $N$ with $f(U) \subset V$. Then,
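writing $\beta = \sum b_{\vec{J}}\, dy^{\vec{J}}$ and abbreviating $\frac{\partial (y^j \circ f)}{\partial x^i}$ to simply $\frac{\partial y^j}{\partial x^i}$, etc., we have
$$f^*\beta = \sum_{\vec{I}} \Big( \sum_{\vec{J}} (b_{\vec{J}} \circ f)\, \frac{\partial y^{\vec{J}}}{\partial x^{\vec{I}}} \Big)\, dx^{\vec{I}},$$
where
$$\frac{\partial y^{\vec{J}}}{\partial x^{\vec{I}}} := \frac{\partial (y^{j_1}, \dots, y^{j_k})}{\partial (x^{i_1}, \dots, x^{i_k})} = \det \begin{bmatrix} \dfrac{\partial y^{j_1}}{\partial x^{i_1}} & \cdots & \dfrac{\partial y^{j_1}}{\partial x^{i_k}} \\ \vdots & & \vdots \\ \dfrac{\partial y^{j_k}}{\partial x^{i_1}} & \cdots & \dfrac{\partial y^{j_k}}{\partial x^{i_k}} \end{bmatrix}.$$
Since the above formula is a bit intimidating at first sight, we work out the case where $\dim M = 2$, $\dim N = 3$ and $k = 2$. As a warm up, notice that since $dx^i \wedge dx^i = 0$, we have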
Remark 8.32. Notice that the space $\Omega^0(M)$ is just the space of smooth functions $C^\infty(M)$, and so unfortunately we now have several notations for the same space: $C^\infty(M) = \Omega^0(M) = T^0_0(M)$.
8.3. Exterior Derivative
Here we will define and study the exterior derivative $d$. For 0-forms, exterior differentiation is just the operation of taking the differential: $f \mapsto df$. Let us start right out with giving an idea of what the exterior derivative looks like for $k$-forms defined on an open set $U \subset \mathbb{R}^n$. Using standard rectangular coordinates $x^i$, all $k$-forms can be written as sums of terms of the form $f\,dx^{i_1} \wedge \cdots \wedge dx^{i_k}$ for some $f \in C^\infty(U)$. We know that the differential of a 0-form is a 1-form: $d : f \mapsto \frac{\partial f}{\partial x^i} dx^i$. We inductively extend the definition of $d$ to an operator that takes $k$-forms to $(k+1)$-forms. We declare $d$ to be linear over the real numbers and then define $d(f\,dx^{i_1} \wedge \cdots \wedge dx^{i_k}) = df \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_k}$. It is an easy exercise to show that if the latter formula holds for increasing indices $i_1 < \cdots < i_k$, then it holds for all choices of indices. For example, if in $\mathbb{R}^2$ we have a 1-form $\alpha = x^2\,dx + xy\,dy$, then
$$d\alpha = d(x^2\,dx + xy\,dy) = d(x^2) \wedge dx + d(xy) \wedge dy = 2x\,dx \wedge dx + (y\,dx + x\,dy) \wedge dy = y\,dx \wedge dy.$$
We now develop the general theory on manifolds. For the next theorem we think of $\Omega_M$ as an assignment $\Omega_M : U \mapsto \Omega_M(U) = \Omega(U)$ for open $U \subset M$. Thus we are simply thinking in terms of (pre)sheaves. We will drop the subscript $M$ on $\Omega_M$ when there is no chance of confusion.
Definition 8.33. A (natural) graded derivation of degree $r$ on $\Omega_M$ is a family of maps, one for each open set $U \subset M$, denoted $\mathcal{D}_U : \Omega_M(U) \to \Omega_M(U)$, such that for each $U \subset M$,
$$\mathcal{D}_U : \Omega^k_M(U) \to \Omega^{k+r}_M(U)$$
and such that
(1) $\mathcal{D}_U$ is $\mathbb{R}$-linear;
(2) $\mathcal{D}_U(\alpha \wedge \beta) = \mathcal{D}_U\alpha \wedge \beta + (-1)^{kr} \alpha \wedge \mathcal{D}_U\beta$ for $\alpha \in \Omega^k(U)$ and $\beta \in \Omega_M(U)$;
(3) $\mathcal{D}_U$ is natural with respect to restriction: for open $V \subset U$ the following diagram, whose vertical arrows are restriction, commutes:
$$\begin{array}{ccc} \Omega^k(U) & \xrightarrow{\ \mathcal{D}_U\ } & \Omega^{k+r}(U) \\ \downarrow & & \downarrow \\ \Omega^k(V) & \xrightarrow{\ \mathcal{D}_V\ } & \Omega^{k+r}(V) \end{array}$$
As usual we will denote all of the maps by a single symbol $\mathcal{D}$. In summary, we have a map of (pre)sheaves $\mathcal{D} : \Omega_M \to \Omega_M$. Along lines similar to our study of tensor derivations, one can show that a graded derivation of $\Omega_M$ is completely determined by, and can be defined by, its action on 0-forms (functions) and 1-forms. In fact, since every form can be locally built out of functions and exact 1-forms, i.e. differentials, we only need to know the action on 0-forms and exact 1-forms to determine the graded derivation. Recall that an element of $\Omega^1(U)$ is said to be exact if it is the differential of a smooth function.

Remark 8.34. If one has a map $\mathcal{D} : \Omega(M) \to \Omega(M)$ that satisfies (1) and (2) of the previous definition, then it is just called a graded derivation of degree $r$. But, when we meet derivations below, they will also be defined on open submanifolds and will all give natural derivations.

Proposition 8.35. If $\mathcal{D}_1$ and $\mathcal{D}_2$ are (natural) graded derivations of degrees $r_1$ and $r_2$ respectively, then the operator
$$[\mathcal{D}_1, \mathcal{D}_2] := \mathcal{D}_1 \circ \mathcal{D}_2 - (-1)^{r_1 r_2}\,\mathcal{D}_2 \circ \mathcal{D}_1$$
is a (natural) graded derivation of degree $r_1 + r_2$.

Proof. See Problem 15. $\Box$
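As a quick illustration of the sign convention in Proposition 8.35, the following sketch specializes the graded commutator to two operators treated in this chapter: the exterior derivative $d$ (degree $1$) and the interior product $i_X$ (degree $-1$). It only unwinds the definition and is not a substitute for the proof requested in Problem 15.

```latex
% Case r_1 = 1 (for d) and r_2 = -1 (for i_X):
% the sign is (-1)^{r_1 r_2} = (-1)^{-1} = -1.
\[
  [d, i_X] \;=\; d \circ i_X - (-1)^{(1)(-1)}\, i_X \circ d
          \;=\; d \circ i_X + i_X \circ d ,
  \qquad \deg [d, i_X] = 1 + (-1) = 0 .
\]
% Case r_1 = r_2 = 1 (both operators equal to d):
% the sign is (-1)^{1} = -1.
\[
  [d, d] \;=\; d \circ d - (-1)^{1}\, d \circ d \;=\; 2\, d \circ d ,
  \qquad \deg [d, d] = 2 .
\]
```

The first case is the combination that appears in Cartan's formula (Theorem 8.56); the second vanishes because $d \circ d = 0$.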
Lemma 8.36. Suppose $\mathcal{D}_1 : \Omega^k_M(U) \to \Omega^{k+r}_M(U)$ and $\mathcal{D}_2 : \Omega^k_M(U) \to \Omega^{k+r}_M(U)$ are defined for each open set $U \subset M$ and both satisfy (1), (2) and (3) of Definition 8.33 above. If $\mathcal{D}_1$ and $\mathcal{D}_2$ agree when applied to functions and exact 1-forms, then $\mathcal{D}_1 = \mathcal{D}_2$.

Proof. By (3), if $\mathcal{D}_1$ and $\mathcal{D}_2$ agree on chart domains, then they agree globally. Let $x^1, \ldots, x^n$ be local coordinates on $U$. Then every element of $\Omega^k_M(U)$ is a sum of elements of the form $f\,dx^{i_1} \wedge \cdots \wedge dx^{i_k}$. But by (2) we have
$$\mathcal{D}_1(f\,dx^{i_1} \wedge \cdots \wedge dx^{i_k}) = \mathcal{D}_1 f \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_k} \pm f\,\mathcal{D}_1(dx^{i_1} \wedge \cdots \wedge dx^{i_k}) = \mathcal{D}_2 f \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_k} \pm f\,\mathcal{D}_1(dx^{i_1} \wedge \cdots \wedge dx^{i_k}).$$
The last term can be expanded using (2), and then the elements $\mathcal{D}_1 dx^{i_j}$ can be replaced by $\mathcal{D}_2 dx^{i_j}$. The result is equal to $\mathcal{D}_2(f\,dx^{i_1} \wedge \cdots \wedge dx^{i_k})$. $\Box$

The differential $d$ defined by
(8.5) $df(X) = Xf$ for $X \in \mathfrak{X}_M(U)$ and $f \in C^\infty(U)$
gives a map $\Omega^0_M \to \Omega^1_M$. Next we show that this map can be extended to a degree one graded derivation.
Theorem 8.37. Let $M$ be a smooth manifold. There is a unique degree one graded derivation $d : \Omega_M \to \Omega_M$ such that $d \circ d = 0$ and such that for each open $U \subset M$ and $f \in C^\infty(U) = \Omega^0_M(U)$, the 1-form $df$ coincides with the usual differential. Furthermore, for any chart $(U, x)$ for $M$, we have the following local formula:
$$d \sum_I a_I\,dx^I = \sum_I (da_I) \wedge dx^I.$$

Proof. We define an operator $d_x$ for each chart $(U, x)$. For a 0-form on $U$ (i.e. a smooth function), we just define $d_x f$ to be the usual differential given by $df = \sum_i \frac{\partial f}{\partial x^i}\,dx^i$. For $\alpha \in \Omega^k_M(U)$, we have $\alpha = \sum a_I\,dx^I$ and we define $d_x\alpha = \sum da_I \wedge dx^I$. To show the product rule ((2) of Definition 8.33), consider $\alpha = \sum a_I\,dx^I \in \Omega^k_M(U)$ and $\beta = \sum b_J\,dx^J \in \Omega^\ell_M(U)$. Then
$$d_x(\alpha \wedge \beta) = d_x\Big(\sum a_I\,dx^I \wedge \sum b_J\,dx^J\Big) = d_x\Big(\sum a_I b_J\,dx^I \wedge dx^J\Big) = \sum \big((da_I)\,b_J + a_I\,(db_J)\big) \wedge dx^I \wedge dx^J$$
$$= \Big(\sum_I da_I \wedge dx^I\Big) \wedge \sum_J b_J\,dx^J + \sum_I a_I\,dx^I \wedge \Big((-1)^k \sum_J db_J \wedge dx^J\Big),$$
since $db_J \wedge dx^I = (-1)^k\,dx^I \wedge db_J$ due to the $k$ interchanges of the basic differentials $dx^i$. This means that the product rule holds for each $d_x$. For any function $f$, we have
$$d_x d_x f = d_x\,df = \sum_{i,j} \frac{\partial^2 f}{\partial x^i \partial x^j}\,dx^i \wedge dx^j = 0$$
since $\frac{\partial^2 f}{\partial x^i \partial x^j}$ is symmetric in $i, j$ and $dx^i \wedge dx^j$ is antisymmetric in $i, j$. More generally, for any functions $f, g \in C^\infty(U)$ we have $d_x(df \wedge dg) = 0$ because of the graded commutativity. Inductively we get $d_x(df_1 \wedge df_2 \wedge \cdots \wedge df_k) = 0$ for any functions $f_i \in C^\infty(U)$. From this it follows that for any $\alpha = \sum a_I\,dx^I \in \Omega^k_M(U)$ we have $d_x d_x \sum a_I\,dx^I = d_x \sum da_I \wedge dx^I = 0$ since $d_x\,da_I = 0$ and $d_x\,dx^I = 0$. We have now defined, for each coordinate chart $(U, x)$, an operator $d_x$ that clearly has the desired properties on that chart. Consider two different charts $(U, x)$ and $(V, y)$ such that $U \cap V \neq \emptyset$. We need to show that $d_x$ restricted to $U \cap V$ coincides with $d_y$ restricted to $U \cap V$, but it is
clear that these restrictions of $d_x$ and $d_y$ satisfy the hypothesis of Lemma 8.36 and so they must agree on $U \cap V$. It is now clear that the individual operators on coordinate charts fit together to give a well-defined operator with the desired properties. $\Box$

Definition 8.38. The degree one graded derivation just introduced is called the exterior derivative.

Another approach to the existence of the exterior derivative is to exhibit a global coordinate free formula. Let $\omega \in \Omega^k(M)$ and view $\omega$ as an alternating multilinear map on $\mathfrak{X}(M)$. Then for $X_0, X_1, \ldots, X_k \in \mathfrak{X}(M)$, define
$$d\omega(X_0, X_1, \ldots, X_k) = \sum_{0 \le i \le k} (-1)^i X_i\big(\omega(X_0, \ldots, \widehat{X_i}, \ldots, X_k)\big) + \sum_{0 \le i < j \le k} (-1)^{i+j}\,\omega([X_i, X_j], X_0, \ldots, \widehat{X_i}, \ldots, \widehat{X_j}, \ldots, X_k).$$
One can check that $d\omega$ is an alternating $C^\infty(M)$-multilinear map on $\mathfrak{X}(M)$ and so defines a differential form of degree $k + 1$. By applying this formula to coordinate fields one obtains the same local operator defined previously.

Lemma 8.39. Given any smooth map $f : M \to N$, we have that $d$ is natural with respect to the pull-back:
$$f^*(d\eta) = d(f^*\eta).$$
Proof. By Lemma 2.119 we know the result is true if $\eta$ is a 1-form. Because $d$ is natural with respect to restriction, we need only prove the formula for a differential form defined in the domain of a chart $(U, x)$. By linearity we may assume that $\eta = g\,dx^{i_1} \wedge \cdots \wedge dx^{i_k}$ since an arbitrary form on $U$ is a sum of forms of this type:
$$f^*(d\eta) = f^*\big(d(g\,dx^{i_1} \wedge \cdots \wedge dx^{i_k})\big) = f^*(dg \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_k})$$
$$= d(f^*g) \wedge d(f^*x^{i_1}) \wedge \cdots \wedge d(f^*x^{i_k})$$
$$= d(g \circ f) \wedge d(x^{i_1} \circ f) \wedge \cdots \wedge d(x^{i_k} \circ f)$$
$$= d\big((g \circ f)\,d(x^{i_1} \circ f) \wedge \cdots \wedge d(x^{i_k} \circ f)\big) = d(f^*\eta). \qquad \Box$$
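To make the coordinate free formula for $d\omega$ concrete, here is a sketch of the case $k = 1$, followed by a check that it reproduces $d(df) = 0$. The letters $X, Y$ for the two vector fields are chosen here only for readability.

```latex
% k = 1: the first sum has terms i = 0, 1; the second has the single term (i,j) = (0,1).
\[
  d\omega(X, Y) \;=\; X\big(\omega(Y)\big) \;-\; Y\big(\omega(X)\big) \;-\; \omega([X, Y]) .
\]
% Apply this to an exact 1-form \omega = df, using df(Z) = Zf:
\[
  d(df)(X, Y) \;=\; X(Yf) - Y(Xf) - [X, Y]f \;=\; 0 ,
\]
% because [X, Y]f = X(Yf) - Y(Xf) by the definition of the Lie bracket.
```

So the bracket term is exactly what is needed to cancel the failure of $X$ and $Y$ to commute.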
Definition 8.40. A smooth differential form $\alpha$ is called closed if $d\alpha = 0$ and exact if $\alpha = d\beta$ for some differential form $\beta$.

Notice that since $d \circ d = 0$, every exact form is closed. In general, the converse is not true. The extent to which the converse fails is a topological property of the manifold. This is the point of the de Rham cohomology (named for Georges de Rham, 1903-1990)
to be studied in detail in Chapter 10. Here we just give the following basic definitions. The set of closed forms of degree $k$ on a smooth manifold $M$ is the kernel of $d : \Omega^k(M) \to \Omega^{k+1}(M)$ and is denoted $Z^k(M)$. The set of exact $k$-forms is the image of the map $d : \Omega^{k-1}(M) \to \Omega^k(M)$ and is denoted $B^k(M)$. Since $d \circ d = 0$, we have $B^k(M) \subset Z^k(M)$.

Definition 8.41. The $k$-th de Rham cohomology group (actually a vector space) is given by
(8.6) $$H^k(M) = \frac{Z^k(M)}{B^k(M)}.$$
In other words, we look at closed forms and identify any two whose difference is an exact form.

If $\alpha \in Z^k(M) \subset \Omega^k(M)$, then the equivalence class that contains $\alpha$ is denoted $[\alpha]$ and called the cohomology class of $\alpha$. If $f : M \to N$ is smooth and $[\beta] \in H^k(N)$, then it is easy to see that $f^*\beta$ is closed since $\beta$ is closed. Thus we obtain a cohomology class $[f^*\beta]$. Also, $[f^*\beta]$ depends only on the equivalence class of $\beta$. Indeed, if $\beta - \beta' = d\eta$, then $f^*\beta - f^*\beta' = d f^*\eta$. Thus we may define a linear map $f^* : H^k(N) \to H^k(M)$ by $f^*[\beta] := [f^*\beta]$. We return to this topic later.
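A standard example may help to see that closed does not imply exact. The sketch below uses the punctured plane $M = \mathbb{R}^2 \setminus \{0\}$ and borrows the line integral of a 1-form along a curve, which is not developed until later, so it should be read as an informal preview rather than part of the formal development.

```latex
% The "angle form" on M = R^2 \ {0}:
\[
  \omega \;=\; \frac{-y\,dx + x\,dy}{x^2 + y^2} .
\]
% Closedness: with P = -y/(x^2+y^2) and Q = x/(x^2+y^2),
\[
  \frac{\partial Q}{\partial x} = \frac{y^2 - x^2}{(x^2+y^2)^2}
  = \frac{\partial P}{\partial y}
  \quad\Longrightarrow\quad
  d\omega = \Big(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\Big)\,dx \wedge dy = 0 .
\]
% Non-exactness: pull back along the unit circle \gamma(t) = (\cos t, \sin t), 0 \le t \le 2\pi.
\[
  \gamma^*\omega = \big(\sin^2 t + \cos^2 t\big)\,dt = dt ,
  \qquad \int_0^{2\pi} \gamma^*\omega = 2\pi \neq 0 .
\]
% If \omega = df, then \gamma^*\omega = d(f \circ \gamma) would integrate to
% f(\gamma(2\pi)) - f(\gamma(0)) = 0, a contradiction.
```

Hence $[\omega]$ is a nonzero class in $H^1(\mathbb{R}^2 \setminus \{0\})$.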
8.4. Vector-Valued and Algebra-Valued Forms

Now we consider a straightforward generalization. Let $V$ and $W$ be real vector spaces so that we have $L^k_{\mathrm{alt}}(V; W)$ (Definition 8.1). For $\omega \in L^{k_1}_{\mathrm{alt}}(V; W)$ and $\eta \in L^{k_2}_{\mathrm{alt}}(V; W)$, we define the exterior product using the same formula as before except that we use the tensor product so that $\omega \wedge \eta$ is an element of $L^{k_1+k_2}_{\mathrm{alt}}(V; W \otimes W)$:
$$(\omega \wedge \eta)(v_1, v_2, \ldots, v_{k_1}, v_{k_1+1}, \ldots, v_{k_1+k_2}) = \frac{1}{k_1!\,k_2!} \sum_{\sigma \in S_{k_1+k_2}} \operatorname{sgn}(\sigma)\,\omega(v_{\sigma(1)}, \ldots, v_{\sigma(k_1)}) \otimes \eta(v_{\sigma(k_1+1)}, \ldots, v_{\sigma(k_1+k_2)}).$$
We want to globalize this algebra. Let $M$ be a smooth $n$-manifold and consider the set
$$L^k_{\mathrm{alt}}(TM; W) = \bigcup_{p \in M} L^k_{\mathrm{alt}}(T_pM; W).$$
This set can easily be given a rather obvious vector bundle structure. In this setting, it is convenient to identify $L^k_{\mathrm{alt}}(T_pM; W)$ with $W \otimes (\wedge^k T_p^*M)$, so that our bundle can be identified with the vector bundle $W \otimes (\wedge^k T^*M)$ whose fiber at $p$ is $W \otimes (\wedge^k T_p^*M)$. The $C^\infty(M)$-module of sections of this bundle is denoted $\Omega^k(M, W)$. Elements of $\Omega^k(M, W)$ are called (smooth) $W$-valued $k$-forms.
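We obtain an exterior product $\Omega^k(M, W) \times \Omega^\ell(M, W) \to \Omega^{k+\ell}(M, W \otimes W)$ as usual by $(\alpha \wedge \beta)(p) := \alpha(p) \wedge \beta(p)$. We give an alternative definition of $\wedge$ and leave it to the assiduous reader to show that the result is the same. Let $w_1, \ldots, w_m$ be a basis for $W$ and $\omega \in \Omega^k(M, W)$ and $\eta \in \Omega^\ell(M, W)$. We have
$$\omega = \sum_{i=1}^m w_i\,\omega^i \quad\text{and}\quad \eta = \sum_{j=1}^m w_j\,\eta^j$$
for some $\omega^i \in \Omega^k(M)$ and $\eta^j \in \Omega^\ell(M)$, where we write $w_i\,\omega^i$ rather than $w_i \otimes \omega^i$, etc. Then,
$$\omega \wedge \eta = \sum_{i=1}^m \sum_{j=1}^m w_i \otimes w_j\;\omega^i \wedge \eta^j.$$
One can show that this definition is independent of the choices and is consistent with the basis free definition. We still have a pull-back operation defined as before so that if $f : N \to M$ is smooth and $\omega = \sum_{i=1}^m w_i\,\omega^i \in \Omega^k(M, W)$ for smooth $k$-forms $\omega^i$, then
$$f^*\omega = \sum_{i=1}^m f^*(w_i\,\omega^i) = \sum_{i=1}^m w_i\,f^*\omega^i$$
is an element of $\Omega^k(N, W)$. We also define a natural exterior derivative
$$d : \Omega^k(M, W) \to \Omega^{k+1}(M, W)$$
as follows: If $\omega \in \Omega^k(M, W)$, then choose a basis as above and write $\omega = \sum_{i=1}^m w_i\,\omega^i$. Then
$$d\omega := \sum_{i=1}^m w_i\,d\omega^i.$$
If $\omega = \sum_{i=1}^m \bar{w}_i\,\bar{\omega}^i$ where $\bar{w}_i = \sum_j w_j A^j_i$, then we must have
$$\sum_{j=1}^m w_j\,\omega^j = \sum_{i=1}^m \bar{w}_i\,\bar{\omega}^i = \sum_{j=1}^m w_j \sum_{i=1}^m A^j_i\,\bar{\omega}^i,$$
from which it follows that
$$\omega^j = \sum_{i=1}^m A^j_i\,\bar{\omega}^i,$$
and so
$$\sum_{i=1}^m \bar{w}_i\,d\bar{\omega}^i = \sum_{j=1}^m w_j \sum_{i=1}^m A^j_i\,d\bar{\omega}^i = \sum_{j=1}^m w_j\,d\omega^j.$$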
Thus the definition does not depend on the choices. It turns out that the invariant definition is
$$d\omega(X_0, X_1, \ldots, X_k) = \sum_{0 \le i \le k} (-1)^i X_i\big(\omega(X_0, \ldots, \widehat{X_i}, \ldots, X_k)\big) + \sum_{0 \le i < j \le k} (-1)^{i+j}\,\omega([X_i, X_j], X_0, \ldots, \widehat{X_i}, \ldots, \widehat{X_j}, \ldots, X_k),$$
where now $\omega(X_0, \ldots, \widehat{X_i}, \ldots, X_k)$ is a $W$-valued function. (A vector field $X$ acts on a $W$-valued function as $Xf := df(X)$.)

If $W$ happens to be an algebra, then the algebra product $W \times W \to W$ is bilinear, and so it gives rise to a linear map $m : W \otimes W \to W$. We compose the exterior product with this map to get an exterior product
$$\overset{m}{\wedge} : \Omega^k(M, W) \times \Omega^\ell(M, W) \to \Omega^{k+\ell}(M, W).$$
If $\omega \in \Omega^k(M, W)$ and $\eta \in \Omega^\ell(M, W)$, then as above we may write
$$\omega = \sum_{i=1}^m w_i\,\omega^i \quad\text{and}\quad \eta = \sum_{j=1}^m w_j\,\eta^j,$$
for some $\omega^i \in \Omega^k(M)$ and $\eta^j \in \Omega^\ell(M)$ and
$$\omega \overset{m}{\wedge} \eta = \sum_{i=1}^m \sum_{j=1}^m m(w_i \otimes w_j)\;\omega^i \wedge \eta^j.$$
Using a dot for the multiplication, we also have
$$\omega \overset{m}{\wedge} \eta = \sum_{i=1}^m \sum_{j=1}^m (w_i \cdot w_j)\;\omega^i \wedge \eta^j.$$
In this case $d$ is also defined as before. A particularly important case is when $W$ is a Lie algebra $\mathfrak{g}$ with bracket $[\cdot, \cdot]$. Then we write the resulting product $\overset{m}{\wedge}$ as $[\cdot, \cdot]^\wedge$ or just $[\cdot, \cdot]$ when there is no risk of confusion. Thus if $\omega, \eta \in \Omega^1(U, \mathfrak{g})$ are Lie algebra-valued 1-forms, then
$$[\omega, \eta]^\wedge(X, Y) = [\omega(X), \eta(Y)] + [\eta(X), \omega(Y)].$$
In particular, $\tfrac{1}{2}[\omega, \omega]^\wedge(X, Y) = [\omega(X), \omega(Y)]$, which might not be zero in general!

Example 8.42. The Maurer-Cartan forms are $\mathfrak{g}$-valued 1-forms.
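The remark that $[\omega, \omega]^\wedge$ need not vanish can be checked on a very small case. In the sketch below, $A$ and $B$ are two fixed elements of a Lie algebra $\mathfrak{g}$ (hypothetical constants chosen for illustration) and $\omega$ is a $\mathfrak{g}$-valued 1-form on $\mathbb{R}^2$ with coordinates $x, y$.

```latex
% A constant-coefficient g-valued 1-form on R^2:
\[
  \omega \;=\; A\,dx + B\,dy , \qquad A, B \in \mathfrak{g} .
\]
% Basis definition of the product, using [A,A] = [B,B] = 0 and dy \wedge dx = - dx \wedge dy:
\[
  [\omega, \omega]^\wedge
  \;=\; [A,B]\,dx \wedge dy + [B,A]\,dy \wedge dx
  \;=\; 2\,[A,B]\,dx \wedge dy .
\]
% Invariant formula, evaluated on the coordinate fields:
\[
  \tfrac12 [\omega, \omega]^\wedge(\partial_x, \partial_y)
  \;=\; [\omega(\partial_x), \omega(\partial_y)] \;=\; [A, B] .
\]
```

Both computations agree, and the result is nonzero whenever $A$ and $B$ do not commute, in contrast with the scalar case where $\alpha \wedge \alpha = 0$ for every 1-form $\alpha$.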
8.5. Bundle-Valued Forms

It is convenient in several contexts to have on hand the notion of a differential form with values in a vector bundle. Let $\xi = (E, \pi, M)$ be a smooth real vector bundle of rank $r$. We can consider the vector bundle $L^k_{\mathrm{alt}}(TM, E)$ over $M$ whose fiber at $p$ is $L^k_{\mathrm{alt}}(T_pM, E_p)$. We identify $L^k_{\mathrm{alt}}(T_pM, E_p)$ with $E_p \otimes \wedge^k T_p^*M$ and thus the bundle is identified with $E \otimes \wedge^k T^*M$.

Definition 8.43. Let $\xi = (E, \pi, M)$ be a smooth vector bundle. A smooth differential $k$-form with values in $\xi$ (or values in $E$) is a smooth section of the bundle $E \otimes \wedge^k T^*M$. These are denoted by $\Omega^k(M; E)$.

Remark 8.44. The reader should avoid confusion between $\Omega^k(M; E)$ and the space of sections $\Gamma(M, \wedge^k E)$.

Theorem 8.45. There is a natural $C^\infty(M)$-module isomorphism $\Omega^k(M; E) \cong L^k_{\mathrm{alt}}(\mathfrak{X}(M), \Gamma(E))$. If this isomorphism is taken as an identification, then
$$\mu(X_1, \ldots, X_k)(p) = \mu(X_1(p), \ldots, X_k(p))$$
for $\mu \in \Omega^k(M; E)$ and $X_1, \ldots, X_k \in \mathfrak{X}(M)$.

Proof. We use Proposition 6.55. The reader should check each of the following:
$$\Omega^k(M; E) \cong \Gamma(E \otimes \wedge^k T^*M) \cong \Gamma\big(L^k_{\mathrm{alt}}(TM, E)\big) \cong \Gamma\big(\operatorname{Hom}(\wedge^k TM, E)\big)$$
$$\cong \operatorname{Hom}\big(\Gamma(\wedge^k TM), \Gamma(E)\big) \cong \operatorname{Hom}\big(\wedge^k \mathfrak{X}(M), \Gamma(E)\big) \cong L^k_{\mathrm{alt}}(\mathfrak{X}(M), \Gamma(E)). \qquad \Box$$
In order to get a grip on the meaning of $\Omega^k(M; E)$, let us exhibit transition functions. For a vector bundle, knowing the transition functions is tantamount to knowing how local expressions with respect to a frame transform as we change frames. A local frame for $E \otimes \wedge^k T^*M$ can be given by combining a local frame for $E$ with a local frame for $\wedge^k T^*M$. Let $(e_1, \ldots, e_r)$ be a frame field for $E$ defined on an open set $U$. We may as well take $U$ to also be a chart domain for the manifold $M$. Then any local section of $\Omega^k(M; E)$ defined on $U$ has the form
$$s = \sum a^j_I\,e_j \otimes dx^I$$
for some smooth functions $a^j_I = a^j_{i_1 \ldots i_k}$ defined in $U$. Then for a new local set up with frames $(f_1, \ldots, f_r)$ and $dy^I = dy^{i_1} \wedge \cdots \wedge dy^{i_k}$ $(i_1 < \cdots < i_k)$ we have
$$s = \sum \bar{a}^j_I\,f_j \otimes dy^I$$
for some $\bar{a}^j_I$ and the transformation law
$$\bar{a}^j_I = \sum a^i_J\,C^j_i\,\frac{\partial x^J}{\partial y^I},$$
where $C^j_i$ is defined by $f_s C^s_j = e_j$.

Exercise 8.46. Derive the above transformation law.

Note that if we write $\omega^j = \sum a^j_I\,dx^I$, then we may write $s$ in $U$ as
$$s = \sum e_j \otimes \omega^j$$
for $\omega^j \in \Omega^k(U)$.
Example 8.47. If $E$ is a trivial product bundle $M \times V \to M$, then $\Omega^k(M; E)$ is canonically isomorphic to $\Omega^k(M; V)$.

Example 8.48. For an $n$-manifold $M$, the bundle map $I : TM \to TM$ which is the identity on each fiber can be interpreted as an element of $\Omega^1(M; TM)$.

Example 8.49. If $f : M \to N$ is a smooth map, then the tangent map $Tf : TM \to TN$ can be interpreted as an element of $\Omega^1(M; f^*TN)$, where $f^*TN$ is the pull-back bundle (Definition 6.18).

Now we want to define an important graded module structure on the direct sum
$$\Omega(M; E) = \bigoplus_k \Omega^k(M; E).$$
This will be a module over the graded algebra $\Omega(M)$. The action of $\Omega(M)$ on $\Omega(M; E)$ is given by maps $\wedge : \Omega^k(M) \times \Omega^\ell(M; E) \to \Omega^{k+\ell}(M; E)$, which in turn are defined by extending the following rule linearly:
$$\omega^1 \wedge (s \otimes \omega^2) := s \otimes \omega^1 \wedge \omega^2 \quad\text{for } \omega^1 \in \Omega^k(M),\ \omega^2 \in \Omega^\ell(M) \text{ and } s \in \Gamma(E)$$
(with the same formula for local sections). Actually we can also define the analogous right multiplication $\wedge : \Omega^\ell(M; E) \times \Omega^k(M) \to \Omega^{k+\ell}(M; E)$,
$$(s \otimes \omega^1) \wedge \omega^2 := s \otimes \omega^1 \wedge \omega^2,$$
and then we have $\eta \wedge \mu = (-1)^{k\ell}\,\mu \wedge \eta$ for $\mu \in \Omega^\ell(M; E)$, $\eta \in \Omega^k(M)$.
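If the vector bundle is actually an algebra bundle, say $\mathcal{A} \to M$, then we may turn $\mathcal{A} \otimes \wedge T^*M := \sum_{p=0}^n \mathcal{A} \otimes \wedge^p T^*M$ into an algebra bundle whose sections can be multiplied: For $\omega^1 \in \Omega^k(M)$, $\omega^2 \in \Omega^\ell(M)$, and $s_1, s_2 \in \Gamma(\mathcal{A})$, define
$$(s_1 \otimes \omega^1) * (s_2 \otimes \omega^2) := (s_1 \cdot s_2) \otimes \omega^1 \wedge \omega^2,$$
where $\cdot$ is the product in $\mathcal{A}$. This extends linearly on (possibly locally defined) sections:
$$\Big(\sum a^i_j\,s_i \otimes \omega^j\Big) * \Big(\sum b^k_l\,s_k \otimes \omega^l\Big) := \sum a^i_j b^k_l\;s_i \cdot s_k \otimes \omega^j \wedge \omega^l,$$
where $a^i_j$ and $b^k_l$ are smooth functions. We obtain a product that is natural with respect to restriction to open sets. From this the sections $\Omega(M, \mathcal{A}) = \Gamma(M, \mathcal{A} \otimes \wedge T^*M)$ become an algebra over the ring of smooth functions. We can think of elements of $\Omega^k(M, \mathcal{A})$ as $C^\infty(M)$-multilinear maps on $\mathfrak{X}(M)$. Then the invariant formula for $\phi \in \Omega^k(M, \mathcal{A})$ and $\psi \in \Omega^\ell(M, \mathcal{A})$ is
$$\phi * \psi(X_1, \ldots, X_{k+\ell}) = \frac{1}{k!\,\ell!} \sum_\sigma \operatorname{sgn}(\sigma)\,\phi(X_{\sigma(1)}, \ldots, X_{\sigma(k)}) \cdot \psi(X_{\sigma(k+1)}, \ldots, X_{\sigma(k+\ell)})$$
for $X_1, \ldots, X_{k+\ell} \in \mathfrak{X}(M)$. Depending on the context, the symbol $*$ may be chosen to be the same as that for the product in $\mathcal{A}$, or it may just be $\wedge$ (especially if $\mathcal{A}$ is commutative), or it may be some hybrid symbol.

Another important example is where $\mathcal{A} = \operatorname{End}(E)$. Locally, say on $U$, sections $\mu_1$ and $\mu_2$ of $\Omega(M, \operatorname{End}(E))$ take the form $\mu_1 = \sum A_i \otimes \alpha^i$ and $\mu_2 = \sum B_i \otimes \beta^i$, where $A_i$ and $B_i$ are maps $U \to \operatorname{End}(E)$. Thus for each $x \in U$, the $A_i$ and $B_i$ evaluate to give $A_i(x), B_i(x) \in \operatorname{End}(E_x)$. The multiplication is then
$$\Big(\sum A_i \otimes \alpha^i\Big) \wedge \Big(\sum B_j \otimes \beta^j\Big) = \sum_{i,j} A_i B_j \otimes \alpha^i \wedge \beta^j,$$
where the $A_i B_j : U \to \operatorname{End}(E)$ are local sections given by composition:
$$A_i B_j : x \mapsto A_i(x) \circ B_j(x).$$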
Exercise 8.50. Show that $\Omega(M, \operatorname{End}(E))$ acts on $\Omega(M, E)$ making $\Omega(M, E)$ a module over the algebra $\Omega(M, \operatorname{End}(E))$.

Perhaps it would help to think as follows: We have a cover of a manifold $M$ by open sets $\{U_\alpha\}$ that simultaneously locally trivialize both $E$ and $T^*M$. Then these also give local trivializations over these open sets of the bundles $\operatorname{End}(E)$ and $\wedge T^*M$. Associated with each local trivialization is a frame field for $E \to M$, say $(e_1, \ldots, e_r)$, which allows us to associate with each section $\sigma \in \Omega^k(M, E)$ an $r$-tuple of $k$-forms $\sigma_U = (\sigma^i_U)$ for each $U$ such that $\sigma = \sum \sigma^i_U e_i$. Similarly, a section $A \in \Omega^\ell(M, \operatorname{End}(E))$ is equivalent to assigning to each open set $U \in \{U_\alpha\}$ a matrix of $\ell$-forms $A_U$. The algebra structure on $\Omega(M, \operatorname{End}(E))$ is then just matrix multiplication, where the entries are multiplied using the exterior product $A_U \wedge B_U$,
$$(A_U \wedge B_U)^i_j = \sum_k A^i_k \wedge B^k_j.$$
The module structure of the above exercise is given locally by $\sigma_U \mapsto A_U \wedge \sigma_U$. Where did the bundle go? The global topology is now encoded in the transformation laws, which tell us what the same section looks like when we change to a new frame field on an overlap $U_\alpha \cap U_\beta$. In this sense, the bundle is a combinatorial recipe for pasting together local objects. Recall that with the product $[A, B] := A \circ B - B \circ A$, the bundle $\operatorname{Hom}(E, E)$ is written as $\mathfrak{gl}(E)$ rather than $\operatorname{End}(E)$, and then our general construction gives $\Omega(M, \mathfrak{gl}(E))$ an algebra structure, whose product is denoted $[\phi, \psi]^\wedge$ or something similar.
8.6. Operator Interactions

The Lie derivative acts on differential forms since the latter are, from one viewpoint, alternating tensor fields. When we apply the Lie derivative to a differential form, we get a differential form, so we should think about the Lie derivative in the context of differential forms.

Lemma 8.51. For any $X \in \mathfrak{X}(M)$ and any $f \in \Omega^0(M)$, we have $\mathcal{L}_X df = d\mathcal{L}_X f$.

Proof. For a function $f$, we compute as
$$(\mathcal{L}_X df)(Y) = \Big(\frac{d}{dt}\Big|_0 (\varphi^X_t)^* df\Big)(Y) = \frac{d}{dt}\Big|_0 df(T\varphi^X_t \cdot Y) = \frac{d}{dt}\Big|_0 Y\big((\varphi^X_t)^* f\big)$$
$$= Y\Big(\frac{d}{dt}\Big|_0 (\varphi^X_t)^* f\Big) = Y(\mathcal{L}_X f) = d(\mathcal{L}_X f)(Y),$$
where $Y \in \mathfrak{X}(M)$ is arbitrary. $\Box$
We now have two ways to differentiate sections in $\Omega(M)$. First, there is the Lie derivative $\mathcal{L}_X : \Omega^i(M) \to \Omega^i(M)$, which turns out to be a graded derivation of degree zero,
(8.7) $$\mathcal{L}_X(\alpha \wedge \beta) = \mathcal{L}_X\alpha \wedge \beta + \alpha \wedge \mathcal{L}_X\beta.$$
We may apply $\mathcal{L}_X$ to elements of $\Omega(U)$ for $U \subset M$, and it is easy to see that we obtain a natural derivation in the sense of Definition 8.33.

Exercise 8.52. Prove the above product rule.

Second, there is the exterior derivative $d$ which is a graded derivation of degree 1. In order to relate the two operations, we need a third map, which, like the Lie derivative, is taken with respect to a given field $X \in \mathfrak{X}(M)$. This map is defined using the interior product given in Definition 8.12 by letting
(8.8) $$i_X\omega(X_1, \ldots, X_{i-1})(p) := i_{X_p}\omega_p(X_1(p), \ldots, X_{i-1}(p)).$$
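Alternatively, if $\omega \in \Omega^i(M)$ is viewed as a skew-symmetric multilinear map from $\mathfrak{X}(M) \times \cdots \times \mathfrak{X}(M)$ to $C^\infty(M)$, then we simply define
$$i_X\omega(X_1, \ldots, X_{i-1}) := \omega(X, X_1, \ldots, X_{i-1}).$$
By convention $i_X f = 0$ for $f \in C^\infty(M)$. We will call this operator the interior product or contraction operator. This operator is clearly linear over $\mathbb{R}$. Notice that for any $f \in C^\infty(M)$ we have
$$i_{fX}\omega = f\,i_X\omega, \quad\text{and}\quad i_X df = df(X) = \mathcal{L}_X f.$$

Proposition 8.53. $i_X$ is a graded derivation of $\Omega(M)$ of degree $-1$:
$$i_X(\alpha \wedge \beta) = (i_X\alpha) \wedge \beta + (-1)^k\,\alpha \wedge (i_X\beta) \quad\text{for } \alpha \in \Omega^k(M).$$
It is the unique degree $-1$ graded derivation of $\Omega(M)$ such that $i_X f = 0$ for $f \in \Omega^0(M)$ and $i_X\theta = \theta(X)$ for $\theta \in \Omega^1(M)$ and $X \in \mathfrak{X}(M)$.

Proof. This follows from Proposition 8.13. $\Box$

Actually, $i_X$ is natural with respect to restriction, so it is a natural graded derivation in the sense of Definition 8.33. Formulas developed for the interior product in the vector space category also hold for vector fields and differential forms. For example, if $\theta^1, \ldots, \theta^k \in \Omega^1(M)$ and $X \in \mathfrak{X}(M)$, then
$$i_X(\theta^1 \wedge \cdots \wedge \theta^k) = \sum_{\ell=1}^k (-1)^{\ell+1}\,\theta^\ell(X)\;\theta^1 \wedge \cdots \wedge \widehat{\theta^\ell} \wedge \cdots \wedge \theta^k.$$

Notation 8.54. Other notations for $i_X\omega$ include $X \lrcorner\, \omega$ and $\langle X, \omega \rangle$. These notations make the following theorem look more natural:

Theorem 8.55. The Lie derivative is a derivation with respect to the pairing $(X, \omega) \mapsto \langle X, \omega \rangle$. That is,
$$\mathcal{L}_X(i_Y\omega) = i_{\mathcal{L}_X Y}\omega + i_Y\mathcal{L}_X\omega,$$
or in alternative notations,
$$\mathcal{L}_X(Y \lrcorner\, \omega) = (\mathcal{L}_X Y) \lrcorner\, \omega + Y \lrcorner\, (\mathcal{L}_X\omega), \qquad \mathcal{L}_X\langle Y, \omega \rangle = \langle \mathcal{L}_X Y, \omega \rangle + \langle Y, \mathcal{L}_X\omega \rangle.$$

Proof. Exercise. $\Box$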
Now we can relate the Lie derivative, the exterior derivative and the contraction operator.
Theorem 8.56. Let $X \in \mathfrak{X}(M)$. Then we have Cartan's formula,
(8.9) $$\mathcal{L}_X = d \circ i_X + i_X \circ d.$$
Proof. Both sides of the equation define derivations of degree zero (use Proposition 8.35). So by Lemma 8.36 we just have to check that they agree on functions and exact 1-forms. On functions we have $i_X f = 0$ and $i_X df = Xf = \mathcal{L}_X f$ so the formula holds. On differentials of functions we have
$$(d \circ i_X + i_X \circ d)\,df = (d \circ i_X)\,df = d\mathcal{L}_X f = \mathcal{L}_X df,$$
where we have used Lemma 8.51 in the last step. $\Box$
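A small computation on $\mathbb{R}^2$ may make Cartan's formula more tangible. The vector field $X = y\,\partial/\partial x$ and the 1-form $\omega = x\,dy$ below are arbitrary choices made for illustration; the sketch computes $\mathcal{L}_X\omega$ once through the formula and once directly from the derivation property together with Lemma 8.51.

```latex
% Data on R^2 with coordinates (x, y):
\[
  X = y\,\frac{\partial}{\partial x}, \qquad \omega = x\,dy, \qquad d\omega = dx \wedge dy .
\]
% Via Cartan's formula:
\[
  i_X\omega = x\,dy(X) = 0, \qquad
  i_X\,d\omega = dx(X)\,dy - dy(X)\,dx = y\,dy ,
\]
\[
  \mathcal{L}_X\omega = d(i_X\omega) + i_X\,d\omega = 0 + y\,dy = y\,dy .
\]
% Directly, using L_X(f\,\theta) = (Xf)\,\theta + f\,L_X\theta and L_X dy = d(Xy):
\[
  \mathcal{L}_X(x\,dy) = (Xx)\,dy + x\,d(Xy) = y\,dy + x\,d(0) = y\,dy .
\]
```

Both routes give $\mathcal{L}_X\omega = y\,dy$.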
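As a corollary, we can extend Lemma 8.51:

Corollary 8.57. $d \circ \mathcal{L}_X = \mathcal{L}_X \circ d$.

Proof. We have
$$d\mathcal{L}_X\alpha = d(d\,i_X + i_X d)(\alpha) = d\,i_X\,d\alpha = d\,i_X\,d\alpha + i_X\,d\,d\alpha = (\mathcal{L}_X \circ d)\,\alpha. \qquad \Box$$

Corollary 8.58. We have the following formulas:
(i) $i_{[X,Y]} = \mathcal{L}_X \circ i_Y - i_Y \circ \mathcal{L}_X$;
(ii) $\mathcal{L}_{fX}\omega = f\,\mathcal{L}_X\omega + df \wedge i_X\omega$ for all $\omega \in \Omega(M)$.

Proof. We leave (i) as Problem 9. For (ii), we compute:
$$\mathcal{L}_{fX}\omega = i_{fX}\,d\omega + d(i_{fX}\omega) = i_{fX}\,d\omega + d(f\,i_X\omega) = f\,i_X\,d\omega + df \wedge i_X\omega + f\,d(i_X\omega)$$
$$= f\big(i_X\,d\omega + d(i_X\omega)\big) + df \wedge i_X\omega = f\,\mathcal{L}_X\omega + df \wedge i_X\omega. \qquad \Box$$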
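Formula (ii) of Corollary 8.58 can be sanity-checked on a simple example. The choices $f = y$, $X = \partial/\partial x$ and $\omega = x\,dy$ on $\mathbb{R}^2$ below are made here purely for illustration.

```latex
% Right-hand side of (ii): f L_X \omega + df \wedge i_X \omega
\[
  \mathcal{L}_X\omega = (Xx)\,dy + x\,d(Xy) = dy, \qquad i_X\omega = x\,dy(\partial_x) = 0 ,
\]
\[
  f\,\mathcal{L}_X\omega + df \wedge i_X\omega = y\,dy + dy \wedge 0 = y\,dy .
\]
% Left-hand side of (ii), via Cartan's formula for the field fX = y \partial_x:
\[
  \mathcal{L}_{fX}\omega = d(i_{fX}\omega) + i_{fX}\,d\omega
  = d(0) + i_{y\partial_x}(dx \wedge dy) = y\,dy .
\]
```

The two sides agree, as the corollary predicts.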
8.7. Orientation

A vector bundle $E \to M$ is called oriented if every fiber $E_p$ is given a smooth choice of orientation. There are several equivalent ways to make a rigorous definition:

Proposition 8.59. Let $E \to M$ be a rank $k$ real vector bundle with typical fiber $V$. The following are equivalent:
(i) There is a smooth global section $\omega$ of the bundle $\wedge^k E^* \cong L^k_{\mathrm{alt}}(E) \to M$ such that $\omega$ is nowhere vanishing.
(ii) There is a smooth global section $s$ of the bundle $\wedge^k E \to M$ such that $s$ is nowhere vanishing.
(iii) The vector bundle has an atlas of VB-charts (local trivializations) such that the corresponding transition maps take values in $GL^+(V)$, the group of positive determinant elements of $GL(V)$. This means that the standard $GL(V)$-structure on $E \to M$ can be reduced to a $GL^+(V)$-structure (refer to Chapter 6).
Proof. We show that (i) is equivalent to (iii) and leave the rest as an easy exercise. Suppose that (i) holds and that $\omega$ is a nonvanishing section of $L^k_{\mathrm{alt}}(E)$. Now fix a basis $(\mathbf{e}_1, \ldots, \mathbf{e}_k)$ on $V$ and recall that with this basis fixed, each VB-chart $(U, \phi)$ for $E \to M$ corresponds to a local frame field. Indeed, we let $e_i(p) := \phi^{-1}(p, \mathbf{e}_i)$. The transition maps between two charts will have values in $GL^+(V)$ exactly when the matrix function that relates the corresponding frame fields has positive determinant (check this). Given a VB-atlas we construct a new atlas. We retain those VB-charts $(U, \phi)$ whose corresponding frame field $e_1, \ldots, e_k$ satisfies $\omega(e_1, \ldots, e_k) > 0$. For the charts for which $\omega(e_1, \ldots, e_k) < 0$, we replace $e_1, \ldots, e_k$ by $-e_1, e_2, \ldots, e_k$, and the resulting chart will be included in our new atlas. Now if $e_1, \ldots, e_k$ and $f_1, \ldots, f_k$ are two frame fields coming from this atlas, then $f_i = \sum_j C^j_i e_j$ and
$$\omega(f_1, \ldots, f_k) = (\det C)\,\omega(e_1, \ldots, e_k).$$
We conclude that $\det C > 0$.

Conversely, suppose the vector bundle has an atlas $\{(U_\alpha, \phi_\alpha)\}$ taking values in $GL^+(V)$. We will use the frame fields coming from this atlas to construct a nowhere vanishing section of $L^k_{\mathrm{alt}}(E) \to M$. If $e_1, \ldots, e_k$ and $f_1, \ldots, f_k$ are two frame fields coming from this atlas, then let $f^1, \ldots, f^k$ be dual to $f_1, \ldots, f_k$ and consider $f^1 \wedge \cdots \wedge f^k$. We have $e_i = \sum_j C^j_i f_j$ and
$$(f^1 \wedge \cdots \wedge f^k)(e_1, \ldots, e_k) = \det C > 0.$$
For each chart $(U_\alpha, \phi_\alpha)$ in our VB-atlas, let $f^\alpha_1, \ldots, f^\alpha_k$ be the corresponding frame field and let $(f^1_\alpha, \ldots, f^k_\alpha)$ be the dual frame field. Then let $\{\rho_\alpha\}$ be a partition of unity subordinate to the cover $\{U_\alpha\}$. Let
$$\omega := \sum_\alpha \rho_\alpha\,f^1_\alpha \wedge \cdots \wedge f^k_\alpha.$$
Then $\omega$ is nowhere vanishing. To see this let $p \in M$ and suppose that $p \in U_\beta$ for some chart $(U_\beta, \phi_\beta)$ from the $GL^+(V)$-valued atlas. Then
$$\omega(f^\beta_1, \ldots, f^\beta_k)(p) = \sum_\alpha \rho_\alpha(p)\,\det C_{\beta\alpha}(p) > 0,$$
where $C_{\beta\alpha}$ is the matrix that relates $f^\beta_1, \ldots, f^\beta_k$ and $f^\alpha_1, \ldots, f^\alpha_k$. $\Box$
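The identity $\omega(f_1, \ldots, f_k) = (\det C)\,\omega(e_1, \ldots, e_k)$ used in the proof can be seen by hand in rank $2$. The entries $a, b, c, d$ below are hypothetical smooth functions introduced only for this sketch.

```latex
% Rank k = 2, with f_i = \sum_j C^j_i e_j:
\[
  f_1 = a\,e_1 + c\,e_2, \qquad f_2 = b\,e_1 + d\,e_2,
  \qquad C = \begin{pmatrix} a & b \\ c & d \end{pmatrix} .
\]
% Expand by bilinearity and use \omega(e_i, e_i) = 0, \omega(e_2, e_1) = -\omega(e_1, e_2):
\[
  \omega(f_1, f_2) = ad\,\omega(e_1, e_2) + cb\,\omega(e_2, e_1)
                   = (ad - bc)\,\omega(e_1, e_2) = (\det C)\,\omega(e_1, e_2) .
\]
% Replacing e_1 by -e_1 flips the sign of \omega(e_1, e_2):
\[
  \omega(-e_1, e_2) = -\,\omega(e_1, e_2) .
\]
```

This is why flipping the sign of a single frame vector is enough to turn a chart with $\omega(e_1, \ldots, e_k) < 0$ into one with $\omega(e_1, \ldots, e_k) > 0$.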
Definition 8.60. If any one (and hence all) of the conditions in Proposition 8.59 hold, then $E \to M$ is said to be orientable. A VB-atlas that satisfies (iii) will be called an oriented atlas.

If $E \to M$ is orientable as above, then two nowhere vanishing sections of $L^k_{\mathrm{alt}}(E)$, say $\omega_1$ and $\omega_2$, are said to be equivalent if $\omega_1 = f\omega_2$, where $f$ is a smooth positive function. We denote the equivalence class of such a nowhere vanishing $\omega$ by $[\omega]$.
Definition 8.61. An orientation for an orientable vector bundle of rank $k$ is an equivalence class $[\omega]$ of nowhere vanishing sections of $\wedge^k E^*$. If such an orientation is chosen, then the vector bundle is said to be oriented by $[\omega]$.

Notice that if we have two oriented VB-atlases on a vector bundle, then we know what it means for them to determine the same $GL^+(V)$-structure. This was the notion of strict equivalence from Chapter 6. The next exercise shows that the notion of a reduction to a $GL^+(V)$-structure is equivalent to the notion of an orientation as we have defined it.

Exercise 8.62. Recall the construction of a nowhere vanishing section $\omega$ in the proof of Proposition 8.59. Show that the class $[\omega]$ does not depend on the partition of unity used in the construction. Show that if two oriented VB-atlases determine the same $GL^+(V)$-structure, then the constructed sections are equivalent and so determine the same orientation. Conversely, show that an orientation as we have defined it determines a unique reduction to a $GL^+(V)$-structure on the vector bundle.

Let $E \to M$ be oriented by $[\omega]$. A frame $(v_1, \ldots, v_k)$ of a fiber $E_p$ is positively oriented (or just positive) with respect to $[\omega]$ if and only if $\omega(p)(v_1, \ldots, v_k) > 0$. This condition is independent of the choice of representative $\omega$ for the class $[\omega]$.

Definition 8.63. Let $\pi : E \to M$ be an oriented vector bundle. A frame field $(f_1, \ldots, f_k)$ over an open set $U$ is called a positively oriented frame field if $(f_1(p), \ldots, f_k(p))$ is a positively oriented basis of $E_p$ for each $p \in U$.

Exercise 8.64. Let $\pi : E \to M$ be an oriented vector bundle. Show that if $M$ is connected, then there are exactly two possible orientations for the vector bundle.

Exercise 8.65. If $\pi_1 : E_1 \to M$ and $\pi_2 : E_2 \to M$ are orientable, then so is the Whitney sum $\pi_1 \oplus \pi_2 : E_1 \oplus E_2 \to M$.

Definition 8.66. A smooth manifold $M$ is said to be orientable if $TM$ is orientable. An orientation for the vector bundle $TM$ is also called an orientation for $M$. A manifold $M$ together with an orientation for $M$ is said to be an oriented manifold.

Definition 8.67. An atlas $\{(U_\alpha, x_\alpha)\}$ for $M$ is said to be an oriented atlas if the associated frame fields are positively oriented.

If this atlas is positively oriented with respect to an orientation on $M$ (an orientation $[\omega]$ of $TM$), then we call $\{(U_\alpha, x_\alpha)\}$ a positively oriented atlas. It follows from the definitions that an oriented atlas induces an orientation for which it is a positively oriented atlas. For this reason, a choice of
oriented atlas is equivalent to a choice of orientation, and so one often sees an orientation of an orientable manifold defined as simply being given by a choice of oriented atlas.

Now let $M$ be an $n$-manifold. Consider a top form, i.e. an $n$-form $\varpi \in \Omega^n(M)$, and assume that $\varpi$ is nowhere vanishing. Thus $M$ must be orientable. We call such a nonvanishing $\varpi$ a volume form for $M$, and every such volume form obviously determines an orientation for $M$. If $\varphi : M \to M$ is a diffeomorphism, then we must have that $\varphi^*\varpi = \delta\varpi$ for some $\delta \in C^\infty(M)$, which we will call the Jacobian determinant of $\varphi$ with respect to the volume element $\varpi$:
$$\varphi^*\varpi = J_\varpi(\varphi)\,\varpi.$$
Clearly $J_\varpi(\varphi)$ is a nowhere vanishing smooth function.

Proposition 8.68. Let $(M, [\varpi])$ be an oriented $n$-manifold. The sign of $J_\varpi(\varphi)$ is independent of the choice of volume form $\varpi$ in the orientation class $[\varpi]$.

Proof. Let $\varpi' \in \Omega^n(M)$ be another volume form in the class $[\varpi]$. We have $\varpi = a\varpi'$ for some function $a$ that is never zero on $M$ (indeed positive, since $\varpi' \in [\varpi]$). Furthermore,
$$J_\varpi(\varphi)\,\varpi = \varphi^*\varpi = (a \circ \varphi)(\varphi^*\varpi') = (a \circ \varphi)\,J_{\varpi'}(\varphi)\,\varpi' = \frac{a \circ \varphi}{a}\,J_{\varpi'}(\varphi)\,\varpi,$$
and since $\frac{a \circ \varphi}{a} > 0$ and $\varpi$ is nonzero, the conclusion follows. $\Box$
Let us consider a very important special case of this: Suppose that $\varphi : U \to U$ is a diffeomorphism and $U \subset \mathbb{R}^n$. Then letting $\varpi_0 = du^1 \wedge \cdots \wedge du^n$ we have for any $x \in U$,
$$\varphi^*\varpi_0(x) = \varphi^*du^1 \wedge \cdots \wedge \varphi^*du^n(x) = \Big(\sum_{i_1} \frac{\partial(u^1 \circ \varphi)}{\partial u^{i_1}}\Big|_x du^{i_1}\Big) \wedge \cdots \wedge \Big(\sum_{i_n} \frac{\partial(u^n \circ \varphi)}{\partial u^{i_n}}\Big|_x du^{i_n}\Big)$$
$$= \det\Big(\frac{\partial(u^i \circ \varphi)}{\partial u^j}(x)\Big)\,\varpi_0(x) = J\varphi(x)\,\varpi_0(x).$$
So in this case, $J_{\varpi_0}(\varphi)$ is just the usual Jacobian determinant of $\varphi$.

More generally, let a nonvanishing top form $\varpi$ be defined on $M$ and let $\varpi'$ be another such form defined on $N$. Then we say that a diffeomorphism $\varphi : M \to N$ is orientation preserving (or positive) with respect to the orientations determined by $\varpi$ and $\varpi'$ if the unique function $J_{\varpi,\varpi'}$ such that $\varphi^*\varpi' = J_{\varpi,\varpi'}\,\varpi$ is strictly positive on $M$.

Exercise 8.69. An open subset of an oriented manifold $M$ inherits the orientation from $M$ since we can just restrict a defining volume form. Show that a chart $(U, x)$ on an oriented manifold is positive if and only if
$x : U \to x(U)$ is orientation preserving. Here, $x(U)$ inherits its orientation from the ambient Euclidean space with its standard orientation.

We now construct a two-fold covering manifold $M_{or}$ for any manifold $M$. The orientation cover will itself always be orientable. Recall that the zero section of a vector bundle over $M$ is a submanifold of the total space diffeomorphic to $M$. Consider the vector bundle whose total space is $\wedge^n T^*M$ and remove the zero section to obtain
$$(\wedge^n T^*M)^\times := (\wedge^n T^*M) \setminus \{\text{zero section}\}.$$
Define an equivalence relation on $(\wedge^n T^*M)^\times$ by declaring $v_1 \sim v_2$ if and only if $v_1$ and $v_2$ are in the same fiber and $v_1 = a v_2$ with $a > 0$. The space of equivalence classes is denoted $M_{or}$ and we will show that it is a smooth manifold. Let $q : (\wedge^n T^*M)^\times \to M_{or}$ be the quotient map and give $M_{or}$ the quotient topology. There is a unique smooth map $\pi_{or}$ making the following diagram commute:
$$\begin{array}{ccc} (\wedge^n T^*M)^\times & \xrightarrow{\ q\ } & M_{or} \\ & \searrow & \downarrow \pi_{or} \\ & & M \end{array}$$
It is easy to see that for each $p \in M$, the set $\pi_{or}^{-1}(p)$ contains exactly two elements, which are the two orientations of $T_pM$. We give the set $M_{or}$ a smooth structure. First let $[\mu_0]$ be the standard orientation of $\mathbb{R}^n$ and choose a fixed orientation reversing linear involution $\tau_0 : \mathbb{R}^n \to \mathbb{R}^n$. Let $\{(U_\alpha, x_\alpha)\}$ be an atlas for $M$. By composing some of the charts with $\tau_0$ and adding the resulting charts to the atlas, we may suppose that $\{(U_\alpha, x_\alpha)\}$ has the property that for every chart $(U, x)$ in the atlas there is a chart $(U, y)$ in the atlas with the same domain such that $y \circ x^{-1}$ is orientation reversing. Let us say that such an atlas is "balanced". (The maximal atlas is obviously balanced.) Now for each chart $(U, x)$, where $x = (x^1, \ldots, x^n)$, define a map $\phi_x : x(U) \to M_{or}$ by
$$\phi_x(u) := \big[dx^1(x^{-1}(u)) \wedge \cdots \wedge dx^n(x^{-1}(u))\big] \in M_{or}.$$
Because we have assumed that the atlas is balanced, it is easy to see that each element of $M_{or}$ is in the image of some
for some $\lambda > 0$. Indeed, we must have $\lambda = \det\big(D(x_\alpha \circ x_\beta^{-1})(u)\big)$. Then,
$$\tilde{x}_\beta \circ \tilde{x}_\alpha^{-1}(u) = \tilde{x}_\beta\big(\big[dx^1_\alpha(x_\alpha^{-1}(u)) \wedge \cdots \wedge dx^n_\alpha(x_\alpha^{-1}(u))\big]\big) = \tilde{x}_\beta\big(\big[dx^1_\beta(x_\beta^{-1}(w)) \wedge \cdots \wedge dx^n_\beta(x_\beta^{-1}(w))\big]\big) = w = x_\beta \circ x_\alpha^{-1}(u).$$
Thus $\tilde{x}_\beta \circ \tilde{x}_\alpha^{-1} = x_\beta \circ x_\alpha^{-1}$ on $\tilde{x}_\alpha(\tilde{U}_\alpha \cap \tilde{U}_\beta)$. Note that for each $\alpha$ and $\beta$ the set $\tilde{x}_\alpha(\tilde{U}_\alpha \cap \tilde{U}_\beta)$ consists of exactly those connected components of $x_\alpha(U_\alpha \cap U_\beta)$ on which $\det\big(D(x_\beta \circ x_\alpha^{-1})\big)$ is positive. Thus $\tilde{x}_\alpha(\tilde{U}_\alpha \cap \tilde{U}_\beta)$ is open. One may check that the topology induced by this atlas coincides with the quotient topology and that the quotient map is smooth. Also, note that if $\tilde{U}_\alpha \cap \tilde{U}_\beta \neq \emptyset$, then $\det\big(D(x_\beta \circ x_\alpha^{-1})\big) > 0$ and so our atlas is oriented! We thus have a canonical orientation of $M_{or}$. For each admissible chart $(U, x)$, we obtain a section $U \to M_{or}$ given by
$$p \mapsto [dx^1(p) \wedge \cdots \wedge dx^n(p)].$$
It is easy to see that such sections are smooth. From the existence of these sections, one surmises that the connected components of the chart domains are evenly covered and so $\pi_{or} : M_{or} \to M$ is a two-fold covering map (but $M_{or}$ may not be connected). This two-fold covering map is called the orientation covering map. The space $M_{or}$ itself is called the orientation (double) cover.

Exercise 8.70. Let $\mathbb{R}^+$ be the multiplicative Lie group of positive real numbers. Show that $\mathbb{R}^+$ acts freely and properly on $(\wedge^n T^*M)^\times$ and that the quotient manifold is $M_{or}$. Show that the canonical orientation on $M_{or}$ can be described as follows: Each $y \in M_{or}$ is an orientation of $T_{\pi_{or}(y)}M$. Since $T_y\pi_{or} : T_yM_{or} \to T_{\pi_{or}(y)}M$ is an isomorphism, we may transfer the orientation $y$ on $T_{\pi_{or}(y)}M$ to an orientation for $T_yM_{or}$. This gives a canonical orientation on each fiber of $TM_{or}$ which agrees with the orientation derived from the oriented atlas described above. How can we get a smooth global nonvanishing top form that induces this orientation?

Exercise 8.71. A manifold $M$ is orientable if and only if $M_{or}$ is disconnected. If $M$ is oriented, then $M_{or}$ has exactly two components and an orientation of $M$ corresponds to a choice of connected component of $M_{or}$.

8.7.1. Orientation induced on a boundary. Now let $M$ be a manifold with boundary. According to Problem 15, we can consider vector bundles over $M$. The definitions of orientable and orientation make sense for $M$. Here we wish to consider orientations of $\partial M$. If $[\omega]$ is an orientation for an orientable vector bundle $\pi : E \to M$, then $[\omega|_{\partial M}]$ is an orientation on $E|_{\partial M} \to \partial M$ where $E|_{\partial M} = \pi^{-1}(\partial M)$. In particular, if $M$ is oriented by
$[\omega]$, then we may obtain an orientation on $TM|_{\partial M} \to \partial M$. But what we would really like to obtain at this time is an orientation on $\partial M$. In other words, we need an orientation on the bundle $T(\partial M)$. First notice that by our conventions, an atlas for $M$ consists of charts that take values in half-spaces of the form $\mathbb{R}^n_{\lambda \ge c}$.

Exercise 8.72. Consider $v_p \in T_pM$ for some boundary point $p \in \partial M$. Let $(U, x)$ be a half-space chart containing $p$ and taking values in $\mathbb{R}^n_{\lambda \ge c}$ and let $(V, y)$ be another chart containing $p$, but taking values in $\mathbb{R}^n_{\mu \ge d}$. Then $dx \circ dy^{-1}$ maps $\mathbb{R}^n$ … $\lambda \circ dx(v_p) < 0$ if and only if $\mu \circ dy(v_p) < 0$. Similarly, $\lambda \circ dx(v_p) > 0$ if and only if $\mu \circ dy(v_p) > 0$.

In light of the exercise above, the following definition makes sense:

Definition 8.73. Let $p \in \partial M$. A vector $v_p \in T_pM \subset TM|_{\partial M}$ is called outward pointing if in some $\mathbb{R}^n_{\lambda \ge c}$-valued chart $(U, x)$ we have $\lambda \circ dx(v_p) < 0$. A smooth section $X$ of $TM|_{\partial M}$ is called outward pointing if $X(p)$ is outward pointing for each $p$. Inward pointing is defined analogously.

To clarify the situation, let us consider the special case of an $\mathbb{R}^n$ … $= -u^k$). If $p \in \partial U$, then $v_p$ is outward pointing exactly when $dx^k(v_p) > 0$. In particular, $\frac{\partial}{\partial x^k}$ is outward pointing. For an $\mathbb{R}^n_{u^k \ge 0}$-valued chart $(U, x)$, the reverse is true; $v_p$ is outward pointing exactly when $dx^k(v_p) < 0$. We shall see that $\mathbb{R}^n$ … in mind.
Figure 8.2. Outward pointing
Notice that the notion of outward pointing on $\partial M$ does not depend on $M$ being orientable (see Figure 8.3). In fact, we have the following:

Lemma 8.74. Outward pointing sections of $TM|_{\partial M}$ always exist.
Figure 8.3. Outward at boundary of Mobius band
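Proof. We may assume that we are dealing with an $\mathbb{R}^n_{u^1 \leq 0}$-valued atlas $\{(U_\alpha, x_\alpha)\}$. Let $\{\rho_\alpha\}$ be a smooth partition of unity subordinate to the cover $\{U_\alpha\}$. Then for $p \in \partial M$ we set
\[
X(p) = \sum_\alpha \rho_\alpha(p) \left.\frac{\partial}{\partial x^1_\alpha}\right|_p .
\]
To see if $X(p)$ is outward pointing, let $(U_\beta, x_\beta)$ be a given $\mathbb{R}^n_{u^1 \leq 0}$-valued chart with $p \in U_\beta$ and write $y = x_\beta$. Then
\[
dy^1(X(p)) = \sum_\alpha \rho_\alpha(p)\, dy^1\!\left(\left.\frac{\partial}{\partial x^1_\alpha}\right|_p\right) > 0,
\]
since $\rho_\alpha(p) \geq 0$ for all $\alpha$, $\rho_\alpha(p) > 0$ for at least one $\alpha$, and $dy^1\!\left(\left.\frac{\partial}{\partial x^1_\alpha}\right|_p\right) > 0$ whenever $p \in U_\alpha$. □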
In the case that $M$ is oriented we can use the orientation plus the notion of outward to define an orientation on the boundary. Let $(M, [\omega])$ be an oriented smooth manifold with boundary and suppose that $X$ is an outward pointing section of $TM|_{\partial M}$. Let $\omega \in [\omega]$; then we define $\iota_X\omega \in \Omega^{n-1}(\partial M)$ by
\[
\iota_X\omega(p)(v_2, \ldots, v_n) = \omega(p)(X(p), v_2, \ldots, v_n).
\]
It is clear that $\iota_X\omega$ is nowhere vanishing. If also $\omega_1 \in [\omega]$, then it is easy to see that $\iota_X\omega = f\,\iota_X\omega_1$ for a positive function $f$. Thus $[\iota_X\omega]$ depends only on $[\omega]$ and provides an orientation on $\partial M$.

Definition 8.75. The orientation on $\partial M$ defined above is called the induced orientation.

It follows that if $(f_1, f_2, \ldots, f_n)$ is a positively oriented frame field on an open set $U \subset M$ with nonempty intersection with $\partial M$ and if $f_1(p)$ is
outward pointing for all $p \in U \cap \partial M$, then $(f_2, \ldots, f_n)$ restricted to $U \cap \partial M$ is a positively oriented frame with respect to the induced orientation on $\partial M$.

The case $n = 1$ needs some interpretation. Here $\partial M$ is a discrete set of points $\partial M = \{p_1, \ldots, p_k\}$, and an orientation is an assignment of $+1$ or $-1$ to each point. For $p_i \in \partial M$, we assign $+1$ if $\omega(p_i)(X(p_i)) > 0$ for some outward pointing vector $X(p_i)$, and $-1$ otherwise.

If every chart in an atlas takes values in a fixed half-space $\mathbb{R}^n_{\lambda \geq c}$, then we say the atlas is $\mathbb{R}^n_{\lambda \geq c}$-valued. It is not true generally that an oriented $n$-manifold has an atlas of positively oriented charts with values in a fixed half-space. However, it is almost true:
Lemma 8.76. If $M$ is an oriented $n$-manifold with nonempty boundary and $n \geq 2$, then there is an $\mathbb{R}^n_{u^1 \leq 0}$-valued atlas of positively oriented charts.

Proof. If $(U, x)$ is not positively oriented, then replace $x = (x^1, x^2, \ldots, x^n)$ by $y = (y^1, y^2, \ldots, y^n) := (x^1, -x^2, \ldots, x^n)$ to obtain a positively oriented chart. Notice that this does not work if $n = 1$. □

Exercise 8.77. Show that although, as a manifold with boundary, the interval $M = [0, 1]$ is oriented in the standard way, there is no atlas consisting of positively oriented $\mathbb{R}^1_{u^1 \leq 0}$-valued charts, and that using $\mathbb{R}^1_{u^1 \geq 0}$ instead merely pushes the problem to the other endpoint.

This last exercise exhibits a fact that is an annoyance if one wants to work with an atlas taking values in a fixed half-space such as the ever popular upper half-space $\mathbb{R}^n_{u^n \geq 0}$. It is an issue often overlooked in the literature.

Definition 8.78. A nice chart on a smooth manifold (possibly with boundary) is a chart $(U, x)$ such that $x(U) = \mathbb{R}^n_{u^1 \leq 0}$ or $x(U) = \mathbb{R}^n$.

The following statements hold:
(i) Every smooth manifold has an atlas consisting of nice charts. (ii) Every oriented smooth manifold without boundary has an atlas consisting of positively oriented nice charts.
(iii) Every oriented smooth manifold with boundary of dimension $n \geq 2$ has an atlas consisting of positively oriented nice charts.
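Proof. (i) If $(U, x)$ is a chart with range in the interior of the left half-space $\mathbb{R}^n_{u^1 \leq 0}$, then for each $p \in U$ we can find a ball $B$ in $x(U)$ centered at $x(p)$. Composing the restriction of $x$ to $x^{-1}(B)$ with a diffeomorphism of $B$ onto $\mathbb{R}^n$ gives a nice chart around $p$ (the diffeomorphism $B \to \mathbb{R}^n$ may be taken to be an orientation preserving diffeomorphism). If $(U, x)$ is a chart with range meeting the boundary of the left half-space $\mathbb{R}^n_{u^1 \leq 0}$, then we can find a half-ball $B_-$ in $\mathbb{R}^n_{u^1 \leq 0}$ centered at $x(p)$ and contained in $x(U)$, and we compose with a diffeomorphism of $B_-$ onto $\mathbb{R}^n_{u^1 \leq 0}$.

(ii) For this, we repeat the procedure of (i) and notice that since $\mathbb{R}^n$ is diffeomorphic to $B$ by an orientation preserving diffeomorphism, the new nice chart will be positively oriented if $(U, x)$ is positive.

(iii) If $(U, x)$ is a chart with range in the interior of the left half-space $\mathbb{R}^n_{u^1 \leq 0}$, then we proceed as in (ii). If the range meets the boundary, then by Lemma 8.76 we may assume that $(U, x)$ is positively oriented, and we proceed as in (i), using an orientation preserving diffeomorphism of $B_-$ onto $\mathbb{R}^n_{u^1 \leq 0}$. □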
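As a quick illustration of the induced orientation of Definition 8.75, here is a sketch for the closed unit disk in the plane. The choice of the form $dx \wedge dy$ and of the radial field $X$ is ours and is made only for the sake of the example.

```latex
% Closed unit disk D = { x^2 + y^2 <= 1 }, oriented by \omega = dx \wedge dy.
% The radial field X = x \partial_x + y \partial_y points out of D along S^1.
\begin{align*}
\iota_X \omega &= \omega(X, \cdot\,) = dx(X)\,dy - dy(X)\,dx = x\,dy - y\,dx, \\
x = \cos\theta,\ y = \sin\theta \ \Longrightarrow\ 
x\,dy - y\,dx &= (\cos^2\theta + \sin^2\theta)\,d\theta = d\theta .
\end{align*}
```

So, under these assumptions, the induced orientation on the boundary circle is the counterclockwise one.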
8.8. Invariant Forms

Throughout this section $G$ is an $n$-dimensional Lie group.

Definition 8.81. An element $\omega \in \Omega^k(G)$ is called left invariant if $L_g^*\omega = \omega$ for all $g \in G$.

It is easy to see that a left invariant $k$-form is determined by its value $\omega(e)$ at the identity element. Now suppose that $\omega^1, \ldots, \omega^n$ are left invariant 1-forms such that $\omega^1(e), \ldots, \omega^n(e)$ is a basis of $T_e^*G$. Then $\omega^1, \ldots, \omega^n$ are independent everywhere (why?). If $X_1, \ldots, X_n$ is the frame field dual to $\omega^1, \ldots, \omega^n$, then each $X_i$ is a left invariant vector field as may be easily checked. By definition, a left invariant form satisfies
\[
\omega(x)(v_1, \ldots, v_k) = \omega(gx)(TL_g v_1, \ldots, TL_g v_k)
\]
for all $x \in G$ and $g \in G$. This would make sense even if $\omega$ were not smooth, but $\omega(X_{i_1}, \ldots, X_{i_k})$ is not only smooth but constant, so any left invariant form must be smooth after all. Every left invariant $k$-form $\omega$ can be written as
\[
\omega = \sum_{i_1 < \cdots < i_k} a_{i_1 \ldots i_k}\, \omega^{i_1} \wedge \cdots \wedge \omega^{i_k} \tag{8.10}
\]
for unique constants $a_{i_1 \ldots i_k}$.
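The exterior derivative of a left invariant form is left invariant since for any $g \in G$, we have
\[
L_g^* d\omega = dL_g^*\omega = d\omega.
\]
Furthermore, for any left invariant 1-form $\omega$ and left invariant vector fields $X, Y$, the functions $\omega(X)$ and $\omega(Y)$ are constant, and so we have
\begin{align*}
d\omega(X, Y) &= X\omega(Y) - Y\omega(X) - \omega([X, Y]) \tag{8.11} \\
&= -\omega([X, Y]), \tag{8.12}
\end{align*}
and so in particular
\[
d\omega|_e(v, w) = -\omega_e([v, w]) \tag{8.13}
\]
for any $v, w \in \mathfrak{g}$. A form is called right invariant if $R_g^*\omega = \omega$ for all $g \in G$. Of course, if $\omega$ is right invariant, then so is $d\omega$. If $\mathrm{inv} : x \mapsto x^{-1}$ denotes the inversion map on $G$, then $\mathrm{inv} \circ R_g = L_{g^{-1}} \circ \mathrm{inv}$, and so
\[
R_g^* \circ \mathrm{inv}^* = \mathrm{inv}^* \circ L_{g^{-1}}^*.
\]
As a consequence, if $\omega$ is left invariant, then
\[
R_g^* \circ \mathrm{inv}^*\omega = \mathrm{inv}^* \circ L_{g^{-1}}^*\omega = \mathrm{inv}^*\omega,
\]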
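so $\mathrm{inv}^*\omega$ is right invariant. Similarly, if $\omega$ is right invariant, then $\mathrm{inv}^*\omega$ is left invariant.

Lemma 8.82. $T_e\mathrm{inv} = -\mathrm{id} : \mathfrak{g} \to \mathfrak{g}$.

Proof. Let $v \in \mathfrak{g}$. Then the curve $t \mapsto \exp(tv)$ has tangent $v$ at $t = 0$. Thus $t \mapsto (\exp(tv))^{-1}$ has tangent $T_e\mathrm{inv} \cdot v$. But $(\exp(tv))^{-1} = \exp(-tv)$, so we must have $T_e\mathrm{inv} \cdot v = -v$. □

Proposition 8.83. Let $\mathrm{inv} : x \mapsto x^{-1}$ be the inversion map on $G$.

(i) If $\omega \in \Omega^k(G)$, then $(\mathrm{inv}^*\omega)_e = (-1)^k\omega_e$.

(ii) If $\omega$ is left and right invariant, then $d\omega = 0$.

Proof. (i) It suffices to assume that $\omega = f\,\omega^{i_1} \wedge \cdots \wedge \omega^{i_k}$ for certain 1-forms $\omega^{i_1}, \ldots, \omega^{i_k}$ that may be taken to be left invariant. But the result will follow if we can show that $(\mathrm{inv}^*\omega)_e = -\omega_e$ for any 1-form. So let $v \in \mathfrak{g}$. Then, by Lemma 8.82, $(\mathrm{inv}^*\omega)_e(v) = \omega_e(T_e\mathrm{inv} \cdot v) = -\omega_e(v)$.

(ii) From (i) we have $(\mathrm{inv}^*\omega)_e = (-1)^k\omega_e$. If $\omega$ is left and right invariant, then both $\mathrm{inv}^*\omega$ and $\omega$ are left invariant, and so this equality continues to hold globally:
\[
\mathrm{inv}^*\omega = (-1)^k\omega;
\]
$d\omega$ is also left and right invariant, so
\[
\mathrm{inv}^*d\omega = (-1)^{k+1}d\omega.
\]
On the other hand,
\[
\mathrm{inv}^*d\omega = d\,\mathrm{inv}^*\omega = (-1)^k d\omega,
\]
so $(-1)^{k+1}d\omega = (-1)^k d\omega$ and then $d\omega = 0$. □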
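To see Proposition 8.83 at work in the simplest possible case, one can take $G$ to be the multiplicative group of positive reals. This group and the form $dx/x$ are our own choice of example and are not taken from the text above.

```latex
% G = (R^+, \cdot) with coordinate x; \omega = dx / x.
\begin{align*}
L_g^*\omega &= \frac{d(gx)}{gx} = \frac{dx}{x} = \omega
  && \text{(left invariant; also right invariant, since $G$ is abelian)}, \\
\mathrm{inv}^*\omega &= \frac{d(x^{-1})}{x^{-1}} = -\frac{dx}{x} = (-1)^1\,\omega
  && \text{(part (i) with $k = 1$, here even globally)}, \\
d\omega &= d\!\left(\tfrac{1}{x}\right) \wedge dx = -\tfrac{1}{x^2}\,dx \wedge dx = 0
  && \text{(part (ii))}.
\end{align*}
```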
Corollary 8.84. If $G$ is abelian, then $\mathfrak{g}$ is abelian.

Proof. If $G$ is abelian, then every left invariant form is also right invariant, so $d\omega = 0$ for all left invariant forms $\omega$. But then by equation (8.13), for any $v, w \in \mathfrak{g}$ we have $\omega_e([v, w]) = 0$ for all $\omega_e \in \mathfrak{g}^*$. Thus $[v, w] = 0$ for any $v, w$. □

If $X_1, \ldots, X_n$ are left invariant vector fields with $X_1(e), \ldots, X_n(e)$ dual to the basis $\omega^1(e), \ldots, \omega^n(e)$, then
\[
[X_i(e), X_j(e)] = \sum_k c^k_{ij} X_k(e) \quad \text{for } 1 \leq i, j \leq n,
\]
where $c^k_{ij}$ are the structure constants from Definition 5.56. We also have $[X_i, X_j] = \sum_k c^k_{ij} X_k$ for $1 \leq i, j \leq n$, so by the equation above and equation (8.13) we have for $i < j$,
\[
d\omega^k(X_i, X_j) = -\omega^k([X_i, X_j]) = -c^k_{ij}.
\]
Since $d\omega^k(X_i, X_j)$ gives the components of $d\omega^k$, we have
\[
d\omega^k = -\sum_{i<j} c^k_{ij}\, \omega^i \wedge \omega^j, \tag{8.14}
\]
which are called the equations of structure, or the structural equations. We now use the concept of Lie algebra-valued forms and the product $[\ ,\ ]^\wedge$ introduced earlier. Recall the left Maurer-Cartan form $\omega_G$. If $\omega^1, \ldots, \omega^n$ is a basis of left invariant 1-forms dual to $X_1, \ldots, X_n$, then (omitting tensor product signs),
\[
\omega_G = \sum_{i=1}^n X_i(e)\,\omega^i.
\]
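Indeed, if $X_g = \sum_{i=1}^n a^i X_i(g) \in T_gG$, then $X = \sum_{i=1}^n a^i X_i$ is the unique left invariant vector field such that $X(g) = X_g$ and we have
\[
\sum_{i=1}^n X_i(e)\,\omega^i(X_g) = \sum_{i=1}^n a^i X_i(e) = X(e) = TL_{g^{-1}} \cdot X(g) = \omega_G(X_g).
\]
Moreover,
\begin{align*}
d\omega_G &= \sum_{k=1}^n X_k(e)\, d\omega^k = -\sum_{k=1}^n X_k(e) \Big( \sum_{i<j} c^k_{ij}\, \omega^i \wedge \omega^j \Big) \\
&= -\sum_{k=1}^n \sum_{i<j} c^k_{ij} X_k(e)\, \omega^i \wedge \omega^j.
\end{align*}
On the other hand we have
\begin{align*}
[\omega_G, \omega_G]^\wedge &= \Big[ \sum_{i=1}^n X_i(e)\,\omega^i, \sum_{j=1}^n X_j(e)\,\omega^j \Big]^\wedge
= \sum_{i,j=1}^n [X_i(e), X_j(e)]\, \omega^i \wedge \omega^j \\
&= \sum_{i,j=1}^n \sum_k c^k_{ij} X_k(e)\, \omega^i \wedge \omega^j
= 2 \sum_k \sum_{i<j} c^k_{ij} X_k(e)\, \omega^i \wedge \omega^j,
\end{align*}
so that an alternative and concise form of the equations of structure is the single equation
\[
d\omega_G = -\tfrac{1}{2}[\omega_G, \omega_G]^\wedge. \tag{8.15}
\]
Using what we learned at the end of Section 8.4, this equation may be written as
\[
d\omega_G(X, Y) = -[\omega_G(X), \omega_G(Y)]
\]
for any $X, Y \in \mathfrak{X}(G)$. If $G$ is a matrix group, then the structural equation can be written as
\[
d\omega_G = -\omega_G \wedge \omega_G.
\]
Indeed, if we abbreviate $\omega_G$ to just $\omega$, then we have $(d\omega)^i_j = d\omega^i_j$ and
\begin{align*}
[\omega(X), \omega(Y)]^i_j &= (\omega(X)\omega(Y) - \omega(Y)\omega(X))^i_j \\
&= \omega(X)^i_k\,\omega(Y)^k_j - \omega(Y)^i_k\,\omega(X)^k_j \\
&= (\omega^i_k \wedge \omega^k_j)(X, Y).
\end{align*}
Equation (8.15) is also called the Maurer-Cartan equation.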
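The matrix form of the Maurer-Cartan equation can be checked by hand on a small example. The sketch below uses the group of $2 \times 2$ matrices with rows $(a, b)$ and $(0, 1)$, $a > 0$, and the standard identification of the Maurer-Cartan form of a matrix group with $g^{-1}dg$; the choice of group is ours.

```latex
% G = { g = [[a, b], [0, 1]] : a > 0 },  \omega = g^{-1} dg.
\begin{align*}
g^{-1} &= \begin{pmatrix} 1/a & -b/a \\ 0 & 1 \end{pmatrix}, \qquad
dg = \begin{pmatrix} da & db \\ 0 & 0 \end{pmatrix}, \qquad
\omega = g^{-1}dg = \begin{pmatrix} da/a & db/a \\ 0 & 0 \end{pmatrix}, \\[4pt]
d\omega &= \begin{pmatrix} 0 & -\tfrac{1}{a^2}\,da \wedge db \\ 0 & 0 \end{pmatrix}, \qquad
\omega \wedge \omega = \begin{pmatrix} 0 & \tfrac{1}{a^2}\,da \wedge db \\ 0 & 0 \end{pmatrix},
\end{align*}
% where (\omega \wedge \omega)^1_2 = \omega^1_1 \wedge \omega^1_2 + \omega^1_2 \wedge \omega^2_2
%                                  = (da/a) \wedge (db/a) + 0.
```

Comparing the two matrices gives $d\omega = -\omega \wedge \omega$ in this case, as expected.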
Problems

(1) Let $S_N$ denote the group of permutations of the set $\{1, 2, \ldots, N\}$. Now let $G_{k,\ell}$ be the subgroup of $S_{k+\ell}$ which consists of permutations that leave the sets $\{1, \ldots, k\}$ and $\{k+1, \ldots, k+\ell\}$ each invariant. A cross section of $G_{k,\ell}$ is a subset $K$ of $S_{k+\ell}$ that contains exactly one element from each coset in $S_{k+\ell}/G_{k,\ell}$. Show that for any such cross section we have
\[
\omega \wedge \eta(v_1, \ldots, v_k, v_{k+1}, \ldots, v_{k+\ell}) = \sum_{\sigma \in K} \mathrm{sgn}(\sigma)\, \omega(v_{\sigma_1}, \ldots, v_{\sigma_k})\, \eta(v_{\sigma_{k+1}}, \ldots, v_{\sigma_{k+\ell}}).
\]
Also, show that the set of all $(k, \ell)$-shuffle permutations is a cross section of $G_{k,\ell}$.

(2) Show that if $v_1, \ldots, v_k \in V$, then $v_1 \wedge \cdots \wedge v_k \neq 0$ if and only if $v_1, \ldots, v_k$ are linearly independent.

(3) Let $f_1, \ldots, f_n$ be smooth functions on an open set in an $n$-manifold. Let $p$ be in their common domain. Then $df_1 \wedge \cdots \wedge df_n$ is nonzero at $p$ if and only if $f_1, \ldots, f_n$ agree with the coordinate functions of a chart whose domain is a neighborhood of $p$.

(4) (a) Let $v \in \bigwedge^k V$. Show that if $v \wedge w = 0$ for all $w \in \bigwedge^{n-k} V$, where $n = \dim V$, then $v = 0$. (b) Using part (a), show that more generally, if $v \wedge w = 0$ for all $w \in \bigwedge^m V$, where $m \leq n - k$, then $v = 0$. [Hints: If $v \wedge w = 0$ for all $w \in \bigwedge^m V$, then $v \wedge (w \wedge x) = 0$ for all $w \in \bigwedge^m V$ and all $x \in \bigwedge^{n-k-m} V$. Elements of the form $w \wedge x$ as above span $\bigwedge^{n-k} V$.]

(5) Prove Proposition 8.23.

(6) Show that the sphere is orientable.

(7) Prove (i) of Proposition 8.4.

(8) Prove Proposition 8.30.

(9) Prove equation (i) of Corollary 8.58.

(10) Prove Cartan's lemma: Let $k \leq n = \dim M$ and let $\omega^1, \ldots, \omega^k$ be 1-forms on $M$ which are linearly independent at each point. Suppose that there are 1-forms $\theta_1, \ldots, \theta_k$ such that
\[
\sum_{i=1}^k \theta_i \wedge \omega^i = 0 \quad \text{(identically)}.
\]
Then there exists a symmetric $k \times k$ matrix of smooth functions $(A_{ij})$ such that
\[
\theta_i = \sum_{j=1}^k A_{ij}\,\omega^j \quad \text{for } i = 1, \ldots, k.
\]

(11) Let $M = \mathbb{R}^3 \setminus \{0\}$ and let
\[
\omega = \frac{x\,dy \wedge dz + y\,dz \wedge dx + z\,dx \wedge dy}{(x^2 + y^2 + z^2)^{3/2}}.
\]
Find $d\omega$ and determine whether $\omega$ is closed and if so, whether it is exact. Find the expression for $\omega$ in spherical coordinates.

(12) Show that every simply connected manifold is orientable.

(13) (a) Let $V$ and $W$ be vector spaces with $\dim V = n$ and $\dim W = m$. Show that if $A \in L(V, W)$, then there is a unique map $\wedge A : \bigwedge^k V \to \bigwedge^k W$ such that $\wedge A(v_1 \wedge \cdots \wedge v_k) = Av_1 \wedge \cdots \wedge Av_k$ whenever $v_1, \ldots, v_k \in V$.

(b) Show that if $e_1, \ldots, e_n$ is a basis for $V$ and $f_1, \ldots, f_m$ is a basis for $W$ and if $Ae_i = \sum_j a_i^j f_j$, then for $1 \leq i_1 < \cdots < i_k \leq n$ we have
\[
\wedge A\,(e_{i_1} \wedge \cdots \wedge e_{i_k}) = \sum_{1 \leq j_1 < \cdots < j_k \leq m} a^{j_1 \ldots j_k}_{i_1 \ldots i_k}\, f_{j_1} \wedge \cdots \wedge f_{j_k},
\]
where $a^{j_1 \ldots j_k}_{i_1 \ldots i_k}$ is the $k \times k$ minor determinant of the matrix $(a_i^j)$ given by
\[
a^{j_1 \ldots j_k}_{i_1 \ldots i_k} = \sum_{\sigma \in S_k} \mathrm{sgn}(\sigma)\, a^{j_{\sigma(1)}}_{i_1} \cdots a^{j_{\sigma(k)}}_{i_k}.
\]

(14) Finish the proof of Proposition 8.13.

(15) Prove Proposition 8.35.
Chapter 9

Integration and Stokes' Theorem
Let $M$ be a smooth $n$-manifold possibly with boundary $\partial M$ and assume that $M$ is oriented and that $\partial M$ has the induced orientation (Definition 8.75).

Definition 9.1. The support of a differential form $\alpha \in \Omega(M)$ is the closure of the set $\{p \in M : \alpha(p) \neq 0\}$ and is denoted by $\mathrm{supp}(\alpha)$. The set of all $k$-forms that have compact support is denoted $\Omega^k_c(M)$, and the set of all $k$-forms with compact support contained in $U \subset M$ is denoted by $\Omega^k_c(U)$.

Let us consider the case of an $n$-form $\alpha$ on an open subset $U$ of $\mathbb{R}^n$. Let $(u^1, \ldots, u^n)$ be standard coordinates on $U$. We may write $\alpha = a\,du^1 \wedge \cdots \wedge du^n$ for some function $a$. If $\alpha$ has compact support in $U$, we may define the integral $\int_U \alpha$ by
\[
\int_U \alpha = \int_U a\,du^1 \wedge \cdots \wedge du^n := \int_U a(u)\,du^1 \cdots du^n,
\]
where this latter integral is the Riemann (or Lebesgue) integral of $a(\cdot)$. For this we extend $a(\cdot)$ by zero to all of $\mathbb{R}^n$ and integrate over a sufficiently large closed $n$-cube containing the support of $a(\cdot)$ in its interior. We could have written $|du^1 \cdots du^n|$ instead of $du^1 \cdots du^n$ to emphasize that the order of the $du^i$ does not matter as it does for $du^1 \wedge \cdots \wedge du^n$. If $U$ is an open subset of the half-space $\mathbb{R}^n_{\lambda \geq c}$, then we define $\int_U \alpha$ by the same formula. If $\phi : V \to U$ is an orientation preserving diffeomorphism of open sets in $\mathbb{R}^n$, then $\det D\phi > 0$. Let $u^1, \ldots, u^n$ denote standard coordinates on $U$, and let
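$v^1, \ldots, v^n$ denote standard coordinates on $V$. Then by the classical change of variable formula,
\begin{align*}
\int_U \alpha = \int_U a(u)\,du^1 \cdots du^n &= \int_V a \circ \phi(v)\,|\det D\phi|\,dv^1 \cdots dv^n \\
&= \int_V a \circ \phi(v) \det D\phi\,dv^1 \cdots dv^n \\
&= \int_V a \circ \phi(v) \det D\phi\,dv^1 \wedge \cdots \wedge dv^n = \int_V \phi^*\alpha.
\end{align*}
So
\[
\int_U \alpha = \int_V \phi^*\alpha. \tag{9.1}
\]
Next consider an oriented $n$-manifold $M$ without boundary and let $\alpha \in \Omega^n(M)$. If $\alpha$ has compact support inside $U$ for some positively oriented chart $(U, x)$, then $x^{-1} : x(U) \to U$ and $(x^{-1})^*\alpha$ has compact support in $x(U) \subset \mathbb{R}^n$. We define
\[
\int_U \alpha := \int_{x(U)} (x^{-1})^*\alpha.
\]
The change of variables formula (9.1) shows that this definition is independent of the positively oriented chart chosen. In fact, if $(V, y)$ is another positively oriented chart and $\alpha$ has support inside $U \cap V$, then $(x^{-1})^*\alpha$ has support inside $x(U \cap V)$ and $(y^{-1})^*\alpha$ has support in $y(U \cap V)$. Thus since $x \circ y^{-1}$ is orientation preserving, we have
\begin{align*}
\int_{x(U)} (x^{-1})^*\alpha = \int_{x(U \cap V)} (x^{-1})^*\alpha &= \int_{y(U \cap V)} (x \circ y^{-1})^*(x^{-1})^*\alpha \\
&= \int_{y(U \cap V)} (y^{-1})^*\alpha = \int_{y(V)} (y^{-1})^*\alpha.
\end{align*}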
The same definition works fine in case $M$ has nonempty boundary, but there is a small technicality. Namely, suppose we wish to work only with positive charts taking values in a fixed half-space $\mathbb{R}^n_{\lambda \geq c}$. Then as long as $n \geq 2$ there is no problem, but if $n = 1$, we are faced with the fact that there may be no positively oriented $\mathbb{R}^1_{\lambda \geq c}$-valued atlas at all even if $M$ is orientable. Some authors define manifold with boundary completely in terms of a fixed half-space and seem unaware of this little glitch. Since we allow multiple half-spaces, this is not a problem for us, but in any case, we could modify the definition slightly in a way that works in all dimensions and for any chart. Let $M$ be an oriented manifold with boundary. If $\alpha$ has compact support inside $U$ for some chart $(U, x)$, then
\[
\int_U \alpha = \mathrm{sgn}(x) \int_{x(U)} (x^{-1})^*\alpha, \tag{9.2}
\]
where
\[
\mathrm{sgn}(x) = \begin{cases} 1 & \text{if } (U, x) \text{ is positively oriented}, \\ -1 & \text{if } (U, x) \text{ is not positively oriented}. \end{cases}
\]
This could be taken as a definition. Once again, the standard change of variables formula shows that this definition is independent of the chart chosen.

Remark 9.2. Because we have used the $\mathrm{sgn}(x)$ factor in the definition, we can use arbitrary charts. But the manifold still must be oriented so that $\mathrm{sgn}(x)$ makes sense!

If $\alpha \in \Omega^n_c(M)$ has compact support but does not have support contained in some chart domain, then we choose a positively oriented atlas $\{(x_i, U_i)\}$ for $M$ and a smooth partition of unity $\{(\rho_i, U_i)\}$ subordinate to the atlas, and consider the sum
\[
\sum_i \int_{U_i} \rho_i\alpha. \tag{9.3}
\]

Proposition 9.3. In the sum above, only a finite number of terms are nonzero. The sum is independent of the choice of atlas and smooth partition of unity $\{(\rho_i, U_i)\}$.

Proof. First, for any $p \in M$ there is an open set $O$ containing $p$ such that only a finite number of the $\rho_i$ are nonzero on $O$. But $\alpha$ has compact support, and so a finite number of such open sets cover the support. This means that only a finite number of the $\rho_i$ are nonzero on this support. Now let $\{(\bar{x}_j, V_j)\}$ be another positive atlas and $\{\bar{\rho}_j\}$ a partition of unity subordinate to it. Then we have
\begin{align*}
\sum_i \int_{U_i} \rho_i\alpha &= \sum_i \int_{U_i} \rho_i \sum_j \bar{\rho}_j\alpha = \sum_i \sum_j \int_{U_i \cap V_j} \rho_i\bar{\rho}_j\alpha \\
&= \sum_j \int_{V_j} \bar{\rho}_j \sum_i \rho_i\alpha = \sum_j \int_{V_j} \bar{\rho}_j\alpha.
\end{align*}
□

Since the sum (9.3) above is the same independently of the allowed choices, we make the following definition:

Definition 9.4. Let $(M, [\varpi])$ be an oriented smooth manifold with or without boundary. Let $\alpha \in \Omega^n(M)$ have compact support. Choose a positively oriented atlas $\{(x_i, U_i)\}$ and a smooth partition of unity $\{(\rho_i, U_i)\}$ subordinate to $\{U_i\}$. Then we define
\[
\int_{(M, [\varpi])} \alpha := \sum_i \int_{U_i} \rho_i\alpha.
\]
We usually omit the explicit reference to the orientation $[\varpi]$ and simply write $\int_M \alpha$.

Remark 9.5. Of course if we take (9.2) as a definition, then we may take
\[
\int_{U_i} \rho_i\alpha := \mathrm{sgn}(x_i) \int_{x_i(U_i)} (x_i^{-1})^*(\rho_i\alpha)
\]
and $\int_{(M, [\varpi])} \alpha = \sum_i \int_{U_i} \rho_i\alpha$ even for an arbitrary unoriented atlas. Once again, we still need $M$ to be oriented. In the online supplement we introduce twisted $n$-forms, and these may be integrated even on nonorientable manifolds!

In case $M$ is zero-dimensional, and therefore a discrete set of points $M = \{p_1, p_2, \ldots\}$, an orientation $[\varpi]$ is an assignment of $+1$ or $-1$ to each point. Then, if $\alpha = f \in \Omega^0(M)$, we have
\[
\int_{(M, [\varpi])} f = \sum_i \pm f(p_i),
\]
where we choose the $\pm$ according to the orientation at the point.
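The role of the sign factor in (9.2) can be seen in a one-dimensional sketch. Here we take $M = (0, 1)$ with its standard orientation and compare the identity chart with the orientation reversing chart $y = 1 - u$; the specific charts and the function $f$ are assumptions made for illustration.

```latex
% M = (0,1) oriented by [du];  \alpha = f\,du with f compactly supported in (0,1).
% Chart y : M \to (0,1),  y(u) = 1 - u,  so y^{-1}(v) = 1 - v and sgn(y) = -1.
\begin{align*}
(y^{-1})^*\alpha &= f(1 - v)\,d(1 - v) = -f(1 - v)\,dv, \\
\mathrm{sgn}(y)\int_{y(M)} (y^{-1})^*\alpha
  &= (-1)\int_0^1 \bigl(-f(1 - v)\bigr)\,dv
   = \int_0^1 f(1 - v)\,dv
   = \int_0^1 f(u)\,du .
\end{align*}
```

The result agrees with what the identity chart gives, as formula (9.2) predicts.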
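9.1. Stokes' Theorem

In this section we take up the main theorem of the chapter. It is the fundamental theorem of exterior calculus known as Stokes' theorem. Our definition of integration works for any (positive) atlas, but we can use a specific atlas for theoretical purposes. We will employ nice charts, which take values in $\mathbb{R}^n$ or in $\mathbb{R}^n_{u^1 \leq 0}$, that is, in the half-space $\mathbb{R}^n_{\lambda \geq c}$ where $c = 0$ and $\lambda = -u^1$. Let us consider two special cases of integration.

Case 1. This is the case of a compactly supported $(n-1)$-form on $\mathbb{R}^n$. Let $\omega_j = f\,du^1 \wedge \cdots \wedge \widehat{du^j} \wedge \cdots \wedge du^n$ be a smooth $(n-1)$-form with compact support in $\mathbb{R}^n$, where the caret symbol over the $du^j$ means that this $j$-th factor is omitted. Then we have
\begin{align*}
\int_{\mathbb{R}^n} d\omega_j &= \int_{\mathbb{R}^n} d\bigl(f\,du^1 \wedge \cdots \wedge \widehat{du^j} \wedge \cdots \wedge du^n\bigr) \\
&= \int_{\mathbb{R}^n} \bigl(df \wedge du^1 \wedge \cdots \wedge \widehat{du^j} \wedge \cdots \wedge du^n\bigr) \\
&= \int_{\mathbb{R}^n} \Bigl(\sum_k \frac{\partial f}{\partial u^k}\,du^k \wedge du^1 \wedge \cdots \wedge \widehat{du^j} \wedge \cdots \wedge du^n\Bigr) \\
&= \int_{\mathbb{R}^n} (-1)^{j-1} \frac{\partial f}{\partial u^j}\,du^1 \wedge \cdots \wedge du^n \\
&= \int_{\mathbb{R}^n} (-1)^{j-1} \frac{\partial f}{\partial u^j}\,du^1 \cdots du^n = 0
\end{align*}
by the fundamental theorem of calculus and the fact that $f$ has compact support. All compactly supported $(n-1)$-forms $\omega$ on $\mathbb{R}^n$ are sums of forms of this type and so we have
\[
\int_{\mathbb{R}^n} d\omega = 0.
\]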
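Case 2. This is the case of a compactly supported $(n-1)$-form on $\mathbb{R}^n_{u^1 \leq 0}$. Let $\omega_j = f\,du^1 \wedge \cdots \wedge \widehat{du^j} \wedge \cdots \wedge du^n$ be a smooth $(n-1)$-form with compact support in $\mathbb{R}^n_{u^1 \leq 0}$. If $j \neq 1$, then the restriction of $\omega_j$ to $\partial\mathbb{R}^n_{u^1 \leq 0}$ vanishes, since $du^1$ restricts to zero there, and
\begin{align*}
\int_{\mathbb{R}^n_{u^1 \leq 0}} d\omega_j &= \int_{\mathbb{R}^n_{u^1 \leq 0}} d\bigl(f\,du^1 \wedge \cdots \wedge \widehat{du^j} \wedge \cdots \wedge du^n\bigr) \\
&= \int_{\mathbb{R}^{n-1}_{u^1 \leq 0}} (-1)^{j-1} \Bigl(\int_{-\infty}^{\infty} \frac{\partial f}{\partial u^j}\,du^j\Bigr)\,du^1 \cdots \widehat{du^j} \cdots du^n = 0.
\end{align*}
If $j = 1$, then
\begin{align*}
\int_{\mathbb{R}^n_{u^1 \leq 0}} d\omega_1 &= \int_{\mathbb{R}^{n-1}} \Bigl(\int_{-\infty}^{0} \frac{\partial f}{\partial u^1}\,du^1\Bigr)\,du^2 \cdots du^n \\
&= \int_{\mathbb{R}^{n-1}} f(0, u^2, \ldots, u^n)\,du^2 \cdots du^n \\
&= \int_{\partial\mathbb{R}^n_{u^1 \leq 0}} f(0, u^2, \ldots, u^n)\,du^2 \wedge \cdots \wedge du^n = \int_{\partial\mathbb{R}^n_{u^1 \leq 0}} \omega_1.
\end{align*}
For this last equality, it is important that we have set things up so that $du^2 \wedge \cdots \wedge du^n$ be positive on $\partial\mathbb{R}^n_{u^1 \leq 0}$. This is the case for the induced orientation, since $\frac{\partial}{\partial u^1}$ is outward pointing along $\partial\mathbb{R}^n_{u^1 \leq 0}$. Thus in either case we have $\int_{\mathbb{R}^n_{u^1 \leq 0}} d\omega_j = \int_{\partial\mathbb{R}^n_{u^1 \leq 0}} \omega_j$. All compactly supported $(n-1)$-forms $\omega$ on $\mathbb{R}^n_{u^1 \leq 0}$ are sums of forms of this type, and so summing we have
\[
\int_{\mathbb{R}^n_{u^1 \leq 0}} d\omega = \int_{\partial\mathbb{R}^n_{u^1 \leq 0}} \omega.
\]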
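Let us assume that $M$ is an oriented smooth manifold of dimension $n \geq 2$. Then there is a positively oriented atlas $\{(U_\alpha, x_\alpha)\}_{\alpha \in A}$ consisting of nice charts, so either $x_\alpha : U_\alpha \cong \mathbb{R}^n$ or $x_\alpha : U_\alpha \cong \mathbb{R}^n_{u^1 \leq 0}$. Let $\{\rho_\alpha\}$ be a smooth partition of unity subordinate to $\{U_\alpha\}$ and let $\omega \in \Omega^{n-1}_c(M)$. Since $\sum_\alpha \rho_\alpha = 1$, we have $d\omega = \sum_\alpha d(\rho_\alpha\omega)$, and by the two cases above,
\begin{align*}
\int_M d\omega &= \int_M \sum_\alpha d(\rho_\alpha\omega) = \sum_\alpha \int_{U_\alpha} d(\rho_\alpha\omega) \\
&= \sum_\alpha \int_{x_\alpha(U_\alpha)} (x_\alpha^{-1})^* d(\rho_\alpha\omega) = \sum_\alpha \int_{x_\alpha(U_\alpha)} d\bigl((x_\alpha^{-1})^*\rho_\alpha\omega\bigr) \\
&= \sum_\alpha \int_{\partial\{x_\alpha(U_\alpha)\}} (x_\alpha^{-1})^*\rho_\alpha\omega = \sum_\alpha \int_{\partial U_\alpha} \rho_\alpha\omega = \int_{\partial M} \omega,
\end{align*}
where we have used the fact that $x_\alpha^{-1} : x_\alpha(\partial U_\alpha) \to \partial U_\alpha$ is orientation preserving. We have now proved Stokes' theorem stated below for the $n > 1$ case. The $n = 1$ case is easily proved directly and amounts to the familiar fundamental theorem of calculus.

Theorem 9.6 (Stokes' theorem). Let $M$ be an oriented smooth manifold with boundary (possibly empty) and give $\partial M$ the induced orientation. Then for any $\omega \in \Omega^{n-1}_c(M)$ (i.e. with compact support) we have
\[
\int_M d\omega = \int_{\partial M} \omega.
\]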
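A small sanity check of Theorem 9.6 can be done on the closed unit disk $D \subset \mathbb{R}^2$ with the 1-form $x\,dy$. This example is ours; it assumes the standard orientation $dx \wedge dy$ on $D$, for which the induced orientation on the boundary circle is counterclockwise.

```latex
% M = D (closed unit disk), \omega = x\,dy, so d\omega = dx \wedge dy.
\begin{align*}
\int_D d\omega &= \int_D dx \wedge dy = \operatorname{area}(D) = \pi, \\
\int_{\partial D} \omega &= \int_0^{2\pi} \cos\theta \,\frac{d(\sin\theta)}{d\theta}\,d\theta
  = \int_0^{2\pi} \cos^2\theta \,d\theta = \pi .
\end{align*}
```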
Note that $\int_{\partial M} \omega := 0$ if $\partial M = \emptyset$. It can be shown that if $(U, x)$ is a positively oriented chart for a manifold such that $M \setminus U$ has measure zero, then
\[
\int_M \omega = \int_{x(U)} (x^{-1})^*\omega.
\]
The definition of integration using partitions of unity is fine for the theoretical purposes we intend to pursue (such as cohomology), but for actually calculating integrals it is nearly useless. Consider the task of integrating the form $\omega := z\,dy \wedge dz + x\,dx \wedge dy$ over the sphere $S^2 \subset \mathbb{R}^3$. Technically speaking, we are integrating the restriction of $\omega$ to $S^2$. Let $\sigma(u, v) = (\cos u \cos v, \sin u \cos v, \sin v)$ for $(u, v) \in (0, 2\pi) \times (-\pi/2, \pi/2)$. This gives a parametrization of a portion of the sphere. Thus $\sigma$ plays the role of $x^{-1}$. The image of our parametrization is the domain of the chart $x$, and the complement of the domain of the chart has measure zero in $S^2$. Because of this it seems plausible that we would get the correct answer by calculating $\int_{(0, 2\pi) \times (-\pi/2, \pi/2)} \sigma^*\omega$. But $\sigma^*\omega$ does not have compact support in $(0, 2\pi) \times (-\pi/2, \pi/2)$, and so this is not justified in terms of the theory we have developed. Let us try it anyway. We pull the form back to the $u, v$ space and then integrate just as one normally would in calculus of several variables:
\begin{align*}
\int_{S^2} \omega := \int_{(0, 2\pi) \times (-\pi/2, \pi/2)} \sigma^*\omega
&= \int_{(0, 2\pi) \times (-\pi/2, \pi/2)} \Bigl( z(u, v)\frac{\partial(y, z)}{\partial(u, v)} + x(u, v)\frac{\partial(x, y)}{\partial(u, v)} \Bigr)\,du\,dv \\
&= \int_{-\pi/2}^{\pi/2} \int_0^{2\pi} 2\cos^2 v \cos u \sin v\,du\,dv = 0.
\end{align*}
We are led to a nice integral in this case, but in general, proceeding in this way may lead to improper Riemann integrals, and anyway, it is not immediately clear how to connect this with our theory given in terms of partitions of unity. That the answer we obtained above is indeed the correct answer can be seen by applying Stokes' theorem:
\[
\int_{S^2} \omega \overset{\text{Stokes}}{=} \int_B d\omega = \int_B 0 = 0.
\]
Here $B$ is the unit ball and we used the fact that $d\omega = 0$. A practical way of calculating integrals of forms is to break up the manifold in a nice way and add up the integrals over the pieces. In Problem 10 we ask the reader to prove the following theorem:

Theorem 9.7. Let $M$ be an oriented $n$-manifold with possibly nonempty boundary. Suppose that there are subsets $D_i \subset \mathbb{R}^n$ for $i = 1, \ldots, K$ and smooth maps $\phi_i : D_i \to M$ such that the following assertions hold:

(i) Each $D_i$ is compact and has a boundary of measure zero.

(ii) Each ... $\bigcup_i \phi_i(D_i)$ ..., we have
\[
\int_M \omega = \sum_{i=1}^K \int_{D_i} \phi_i^*\omega.
\]
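9.2. Differentiating Integral Expressions; Divergence

Suppose that $S \subset M$ is a $k$-dimensional regular submanifold with boundary $\partial S$ (possibly empty) and $\phi_t$ is the flow of some complete vector field $X \in \mathfrak{X}(M)$. In this case, $\phi_t(S)$ is also a regular submanifold with boundary. We then consider $\frac{d}{dt} \int_S \phi_t^*\eta$ for some $k$-form $\eta$ with compact support. We have
\begin{align*}
\frac{d}{dt} \int_S \phi_t^*\eta &= \lim_{h \to 0} \frac{1}{h} \Bigl[ \int_S \phi_{t+h}^*\eta - \int_S \phi_t^*\eta \Bigr] = \lim_{h \to 0} \Bigl[ \int_S \frac{1}{h}\phi_t^*(\phi_h^*\eta - \eta) \Bigr] \\
&= \int_S \phi_t^* \lim_{h \to 0} \frac{1}{h}(\phi_h^*\eta - \eta) = \int_S \phi_t^*\mathcal{L}_X\eta.
\end{align*}
Since $\int_S \phi_t^*\eta = \int_{\phi_t S} \eta$, this says that
\[
\frac{d}{dt} \int_{\phi_t S} \eta = \int_{\phi_t S} \mathcal{L}_X\eta.
\]
As a special case ($t = 0$), using $\mathcal{L}_X = \iota_X d + d\iota_X$ and Stokes' theorem, we have
\begin{align*}
\left.\frac{d}{dt}\right|_{t=0} \int_{\phi_t S} \eta = \int_S \mathcal{L}_X\eta &= \int_S \iota_X d\eta + \int_S d\iota_X\eta \\
&= \int_S \iota_X d\eta + \int_{\partial S} \iota_X\eta.
\end{align*}
This formula is particularly interesting in the case when $S = \Omega$ is an open submanifold of $M$ with compact closure and smooth boundary $\partial\Omega$, and where $\eta = \mu$ is a volume form on $M$. Let $\Omega_t := \phi_t\Omega$. Since $d\mu = 0$, we have $\frac{d}{dt} \int_{\Omega_t} \mu = \int_{\partial\Omega_t} \iota_X\mu$ and then
\[
\left.\frac{d}{dt}\right|_{t=0} \int_{\Omega_t} \mu = \int_{\partial\Omega} \iota_X\mu.
\]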
Definition 9.8. If $\mu$ is a volume form orienting a manifold $M$, then $\mathcal{L}_X\mu = (\operatorname{div} X)\,\mu$ for a unique function $\operatorname{div} X$ called the divergence of $X$ with respect to the volume form $\mu$.

We have
\[
\left.\frac{d}{dt}\right|_{t=0} \int_{\Omega_t} \mu = \int_{\partial\Omega} \iota_X\mu
\]
and
\[
\left.\frac{d}{dt}\right|_{t=0} \int_{\Omega_t} \mu = \int_\Omega \mathcal{L}_X\mu = \int_\Omega (\operatorname{div} X)\,\mu.
\]
From this we conclude that
\[
\int_\Omega (\operatorname{div} X)\,\mu = \int_{\partial\Omega} \iota_X\mu.
\]
This formula helps to give geometric meaning to $\operatorname{div} X$. The above formula is a version of Gauss' theorem and can be easily proven not just for domains in a manifold with a volume form, but for a general manifold with boundary with a volume form:

Theorem 9.9. Let $M$ be a manifold with boundary and let $\mu$ be a volume form on $M$. In particular, $M$ is oriented by $[\mu]$. Then we have
\[
\int_M (\operatorname{div} X)\,\mu = \int_{\partial M} \iota_X\mu
\]
for any compactly supported smooth vector field $X$.

Proof. The proof uses the trick we used above:
\begin{align*}
\int_M (\operatorname{div} X)\,\mu = \int_M \mathcal{L}_X\mu &= \int_M d\iota_X\mu + \int_M \iota_X d\mu \\
&= \int_M d\iota_X\mu = \int_{\partial M} \iota_X\mu \quad \text{(Stokes' theorem)}.
\end{align*}
□
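As an illustration of Theorem 9.9, one can check both sides for the closed unit ball in $\mathbb{R}^3$ and the radial vector field. The choice of $M$, $\mu$ and $X$ below is ours; the only outside fact used is that the unit sphere has area $4\pi$ and that, for the unit outward normal $N$, the form $\iota_N\mu$ restricts to the area form of the sphere.

```latex
% M = closed unit ball B in R^3,  \mu = dx \wedge dy \wedge dz,
% X = x \partial_x + y \partial_y + z \partial_z  (equal to the unit outward normal on S^2).
\begin{align*}
\mathcal{L}_X \mu &= d\iota_X\mu
   = d\,(x\,dy \wedge dz + y\,dz \wedge dx + z\,dx \wedge dy) = 3\,\mu
   \quad\Longrightarrow\quad \operatorname{div} X = 3, \\
\int_B (\operatorname{div} X)\,\mu &= 3 \cdot \tfrac{4}{3}\pi = 4\pi, \\
\int_{\partial B} \iota_X \mu &= \operatorname{area}(S^2) = 4\pi .
\end{align*}
```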
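Now let $E_1, \ldots, E_n$ be a local frame field on $U \subset M$ and $\theta^1, \ldots, \theta^n$ the dual frame field. Then for some smooth function $\rho$ we have $\mu = \rho\,\theta^1 \wedge \cdots \wedge \theta^n$, and a simple calculation gives
\[
\iota_X(\rho\,\theta^1 \wedge \cdots \wedge \theta^n) = \sum_{k=1}^n (-1)^{k-1}\rho X^k\,\theta^1 \wedge \cdots \wedge \widehat{\theta^k} \wedge \cdots \wedge \theta^n.
\]
Then, provided the forms $\theta^1 \wedge \cdots \wedge \widehat{\theta^k} \wedge \cdots \wedge \theta^n$ are closed (as is the case for a coordinate frame), we have
\begin{align*}
\mathcal{L}_X\mu &= \mathcal{L}_X(\rho\,\theta^1 \wedge \cdots \wedge \theta^n) = d\iota_X(\rho\,\theta^1 \wedge \cdots \wedge \theta^n) \\
&= d\sum_{k=1}^n (-1)^{k-1}\rho X^k\,\theta^1 \wedge \cdots \wedge \widehat{\theta^k} \wedge \cdots \wedge \theta^n \\
&= \sum_{k=1}^n (-1)^{k-1} d(\rho X^k) \wedge \theta^1 \wedge \cdots \wedge \widehat{\theta^k} \wedge \cdots \wedge \theta^n \\
&= \sum_{k=1}^n (-1)^{k-1} \sum_{i=1}^n (\rho X^k)_i\,\theta^i \wedge \theta^1 \wedge \cdots \wedge \widehat{\theta^k} \wedge \cdots \wedge \theta^n \\
&= \sum_{k=1}^n \Bigl(\frac{1}{\rho}(\rho X^k)_k\Bigr)\rho\,\theta^1 \wedge \cdots \wedge \theta^n = \sum_{k=1}^n \Bigl(\frac{1}{\rho}(\rho X^k)_k\Bigr)\mu,
\end{align*}
where $(\rho X^k)_i := d(\rho X^k)(E_i)$. Thus we end up with a nice formula
\[
\operatorname{div} X = \sum_{k=1}^n \frac{1}{\rho}(\rho X^k)_k.
\]
In particular, if $E_k = \frac{\partial}{\partial x^k}$ for some chart $(U, x) = (x^1, \ldots, x^n)$, then
\[
\operatorname{div} X = \sum_{k=1}^n \frac{1}{\rho}\frac{\partial}{\partial x^k}(\rho X^k). \tag{9.4}
\]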
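Formula (9.4) can be tried out in polar coordinates on the punctured plane. The sketch below assumes the volume form $\mu = dx \wedge dy$, so that $\rho = r$ in the chart $(r, \theta)$; the test field $\partial/\partial r$ is our own choice.

```latex
% \mu = dx \wedge dy = r\,dr \wedge d\theta, so \rho = r in the chart (r, \theta).
\begin{align*}
\operatorname{div} X
  &= \frac{1}{r}\frac{\partial}{\partial r}\bigl(r X^r\bigr)
   + \frac{1}{r}\frac{\partial}{\partial \theta}\bigl(r X^\theta\bigr)
   = \frac{1}{r}\frac{\partial}{\partial r}\bigl(r X^r\bigr)
   + \frac{\partial X^\theta}{\partial \theta}. \\
\intertext{For $X = \partial/\partial r$ (that is, $X^r = 1$, $X^\theta = 0$) this gives}
\operatorname{div} X &= \frac{1}{r},
\end{align*}
% which matches the Cartesian computation
% \partial_x (x/r) + \partial_y (y/r) = 2/r - (x^2 + y^2)/r^3 = 1/r.
```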
If we were to replace the volume form $\mu$ by $-\mu$, then divergence with respect to that volume form would be given locally by $\sum_{k=1}^n \frac{1}{-\rho}\frac{\partial}{\partial x^k}(-\rho X^k) = \sum_{k=1}^n \frac{1}{\rho}\frac{\partial}{\partial x^k}(\rho X^k)$ and so the orientation seems superfluous! What if $M$ is not even orientable? In fact, since divergence is a local concept and orientation is a global concept, it seems that we should replace the volume form in the definition by something else that makes sense even on a nonorientable manifold. On nonorientable manifolds, pseudoforms can be used in place of forms for many purposes. Alternatively, many situations can be handled by going to the two-fold orientation cover. See the online supplement [Lee, Jeff] for a discussion.
9.3. Stokes' Theorem for Chains

Here we give another version of Stokes' theorem that is very useful in connection with topology. We say that distinct points $p_0, \ldots, p_k \in \mathbb{R}^n$ are in general position if they are not contained in any $(k-1)$-dimensional affine subspace of $\mathbb{R}^n$. Equivalently, $p_0, \ldots, p_k$ are in general position if the vectors $p_1 - p_0, \ldots, p_k - p_0$ are linearly independent. A set of the form
\[
\Bigl\{ \sum_{i=0}^k t_i p_i : 0 \leq t_i \leq 1 \text{ and } \sum_{i=0}^k t_i = 1 \Bigr\}
\]
for $p_0, \ldots, p_k$ in general position is called a geometric $k$-simplex. The set above is a closed convex set and is the smallest such set containing the points $p_0, \ldots, p_k$. The integer $k$ is the dimension of the simplex, and a geometric 0-simplex is just a point. Let us denote the geometric $k$-simplex determined by the points $p_0, \ldots, p_k$ by $\langle p_0, \ldots, p_k \rangle$. If $\langle p_0, \ldots, p_k \rangle$ is such a geometric $k$-simplex and $\{p_{i_1}, \ldots, p_{i_j}\} \subset \{p_0, \ldots, p_k\}$, then $p_{i_1}, \ldots, p_{i_j}$ are in general position and the geometric simplex $\langle p_{i_1}, \ldots, p_{i_j} \rangle$ is called a face of $\langle p_0, \ldots, p_k \rangle$. In particular, the geometric $(k-1)$-simplices obtained by omitting one of $p_0, \ldots, p_k$ are called boundary faces. The $i$-th boundary face of $\langle p_0, \ldots, p_k \rangle$ is the simplex obtained by omitting $p_i$ and is denoted $\partial_i\langle p_0, \ldots, p_k \rangle = \langle p_0, \ldots, \widehat{p_i}, \ldots, p_k \rangle$. Geometric simplices can be combined to create geometric simplicial spaces, but we take a slightly more flexible approach.

Definition 9.10. For $k \geq 0$, the set $\Delta^k := \{a \in \mathbb{R}^k : a^i \geq 0 \text{ and } \sum a^i \leq 1\}$ is called the standard $k$-simplex in $\mathbb{R}^k$. Note that $\Delta^0 = \mathbb{R}^0$ consists of just a single point denoted $0$. In other words, $\Delta^k$ is $\langle e_0, \ldots, e_k \rangle$, where $e_0 = (0, \ldots, 0)$, $e_1 = (1, 0, \ldots, 0)$, $e_2 = (0, 1, \ldots, 0)$, etc.
Definition 9.11. Let $M$ be an $n$-manifold. A $C^r$ map $\sigma : \Delta^k \to M$ is called a $C^r$ singular $k$-simplex. Let $G$ be an abelian group. A formal sum $c = \sum_\sigma c_\sigma\sigma$, where the sum is over all singular simplices $\sigma$, $c_\sigma \in G$, and $c_\sigma = 0$ for all but finitely many simplices $\sigma$, is called a $C^r$ singular $k$-chain with coefficients in $G$. A 0-simplex $\sigma : \Delta^0 \to M$ is often identified with the image $\sigma(0) = p \in M$. By definition, $c_\sigma = 0$ for all but finitely many simplices; we say that a singular $k$-chain has finite support. The set of all $C^r$ $k$-chains with coefficients in $G$ is an abelian group denoted $C_k(M, G)^r$, where we use additive notation for the group operation. Addition is given by
\[
\sum_\sigma a_\sigma\sigma + \sum_\sigma b_\sigma\sigma = \sum_\sigma (a_\sigma + b_\sigma)\sigma.
\]
The cases $r = 0$ and $r = \infty$ are the most important. We will restrict to the $r = \infty$ case and drop the superscript in $C_k(M, G)^\infty$. We now define the notion of the $i$-th face of a singular simplex and then the boundary of a simplex. First, if $q_0, \ldots, q_k$ are distinct points in $\mathbb{R}^n$, then there is a unique affine map $\mathbb{R}^k \to \mathbb{R}^n$ which maps $e_i$ to $q_i$ for $0 \leq i \leq k$. This map restricts to a map from $\Delta^k$ onto the convex hull of $q_0, \ldots, q_k$, which will be a geometric $k$-simplex if $q_0, \ldots, q_k$ are in general position. For given $q_0, \ldots, q_k$, denote this map by $\alpha(q_0, \ldots, q_k)$. For $0 \leq i \leq k$, let $f_i^k : \Delta^{k-1} \to \Delta^k$ be $\alpha(e_0, \ldots, \widehat{e_i}, \ldots, e_k)$. Explicitly, $f_0^1(0) = 1$, $f_1^1(0) = 0$ and for $k > 1$,
\begin{align*}
f_0^k(a^1, \ldots, a^{k-1}) &= \Bigl(1 - \sum_{i=1}^{k-1} a^i,\ a^1, \ldots, a^{k-1}\Bigr), \\
f_i^k(a^1, \ldots, a^{k-1}) &= (a^1, \ldots, a^{i-1}, 0, a^i, \ldots, a^{k-1}) \quad \text{for } 1 \leq i \leq k.
\end{align*}
Thus $f_i^k$ is a homeomorphism of $\Delta^{k-1}$ onto its image, which is the $i$-th boundary face of $\Delta^k$. Thus it parametrizes the face of $\Delta^k$ opposite $e_i$ while keeping track of the order of the vertices. The $i$-th face of a singular $k$-simplex $\sigma$ is the singular $(k-1)$-simplex $\sigma_i$ given by
\[
\sigma_i := \sigma \circ f_i^k.
\]

Definition 9.12. The boundary operator $\partial_k : C_k(M, G) \to C_{k-1}(M, G)$ is defined on a simplex $\sigma$ by
\[
\partial_k\sigma := \sum_{i=0}^k (-1)^i\sigma_i
\]
and then extended to be a group homomorphism $\partial_k \sum_\sigma c_\sigma\sigma = \sum_\sigma c_\sigma\partial_k\sigma$.

Figure 9.1. A face of $\sigma$

If $\sigma$ is a 0-simplex, then we define $\partial_0\sigma = 0$ (the group identity) so that for a 0-chain $\sum_\sigma c_\sigma\sigma$ we have $\partial_0 \sum_\sigma c_\sigma\sigma = 0$. We will denote all the maps $\partial_k$ simply by $\partial$ for all $k$. It can be shown that
\[
\partial \circ \partial = 0. \tag{9.5}
\]

Exercise 9.13. Prove that $\partial \circ \partial = 0$. Hint: First show that $f_i^k \circ f_j^{k-1} = f_j^k \circ f_{i-1}^{k-1}$ for $k > 1$ and $i > j$. The maps $f_i^k$ are defined above.
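To get a feel for why $\partial \circ \partial = 0$, it may help to write out the case of a single singular 2-simplex $\sigma$. The sketch below only uses the face maps and the identity from the hint; the general case is left as the exercise intends.

```latex
% Write (\sigma_i)_j := \sigma \circ f_i^2 \circ f_j^1 for the j-th face of the i-th face.
\begin{align*}
\partial\sigma &= \sigma_0 - \sigma_1 + \sigma_2, \\
\partial\partial\sigma
  &= \bigl[(\sigma_0)_0 - (\sigma_0)_1\bigr]
   - \bigl[(\sigma_1)_0 - (\sigma_1)_1\bigr]
   + \bigl[(\sigma_2)_0 - (\sigma_2)_1\bigr].
\end{align*}
% The identity f_i^2 \circ f_j^1 = f_j^2 \circ f_{i-1}^1 for i > j gives
%   (\sigma_1)_0 = (\sigma_0)_0,   (\sigma_2)_0 = (\sigma_0)_1,   (\sigma_2)_1 = (\sigma_1)_1,
% so the six terms cancel in pairs and \partial\partial\sigma = 0.
```

Concretely, the three pairs are the 0-simplices at the vertices $\sigma(e_2)$, $\sigma(e_1)$ and $\sigma(e_0)$, each of which appears once with each sign.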
It is convenient to define $C_k(M, G)$ to be the trivial group $\{0\}$ for all $k < 0$ and define $\partial_k = 0$ for all $k < 0$. Then $\partial \circ \partial = 0$ remains true. The sequence of spaces and maps
\[
\cdots \xrightarrow{\ \partial\ } C_{k+1}(M, G) \xrightarrow{\ \partial\ } C_k(M, G) \xrightarrow{\ \partial\ } C_{k-1}(M, G) \xrightarrow{\ \partial\ } \cdots
\]
is called the singular chain complex with coefficients in $G$. Let $Z_k(M, G) := \operatorname{Ker}\partial_k$ and $B_k(M, G) := \operatorname{Im}\partial_{k+1}$. Then because of equation (9.5) we have $B_k(M, G) \subset Z_k(M, G)$. The $k$-th singular homology group $H_k(M; G)$ with coefficients in $G$ is defined by
\[
H_k(M; G) = \frac{Z_k(M, G)}{B_k(M, G)}.
\]
If $c \in Z_k(M, G)$, then the equivalence class of $c$ is denoted $[c]$. If $c_1$ and $c_2$ are $k$-chains in the same class, then $c_1 = c_2 + \partial c$ for some $c \in C_{k+1}(M, G)$, and in this case we say that $c_1$ and $c_2$ are homologous (or in the same homology class). The most important choices for $G$ are $\mathbb{R}$ and $\mathbb{Z}$.

Exercise 9.14. Check that if $G = \mathbb{R}$, then $C_k(M, \mathbb{R})$, $Z_k(M, \mathbb{R})$, $B_k(M, \mathbb{R})$ and $H_k(M; \mathbb{R})$ are all vector spaces in a natural way and the boundary maps $\partial$ extend to linear maps.

We define the integral of a $k$-form $\alpha$ over a smooth singular $k$-simplex $\sigma$ as
\[
\int_\sigma \alpha := \int_{\Delta^k} \sigma^*\alpha,
\]
and then for a chain $c = \sum_\sigma c_\sigma\sigma \in C_k(M, \mathbb{R})$ we can define
\[
\int_c \alpha = \sum_\sigma c_\sigma \int_\sigma \alpha.
\]
If $\sigma$ is a 0-simplex and if $f \in \Omega^0(M) = C^\infty(M)$, then $\int_\sigma f = f(\sigma(0))$. We state without proof the following version of Stokes' theorem:

Theorem 9.15 (Stokes' theorem for chains). Let $M$ be a smooth manifold. For $c \in C_{k+1}(M, \mathbb{R})$, and $\alpha$ a $k$-form on $M$, we have
\[
\int_c d\alpha = \int_{\partial c} \alpha.
\]
For a proof see [War]. Notice that in this version, $M$ is not assumed to be orientable. Also, $\alpha$ need not have compact support since the chain $c$ has finite support. If $\sigma$ is a 1-simplex and $f \in C^\infty(M)$, then the above reduces to
\[
\int_\sigma df = f(\sigma(1)) - f(\sigma(0)).
\]
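Now we define the de Rham map. First, if $\alpha \in Z^k(M) \subset \Omega^k(M)$, then we can define an element $\int\alpha$ of $H_k(M; \mathbb{R})^*$, the dual space of $H_k(M; \mathbb{R})$: For $[c] \in H_k(M; \mathbb{R})$ represented by $c \in Z_k(M; \mathbb{R})$ we define
\[
\Bigl(\int\alpha\Bigr)([c]) = \int_c \alpha.
\]
This is well-defined since if $c + \partial c' \in [c]$, then
\begin{align*}
\Bigl(\int\alpha\Bigr)(c + \partial c') = \int_{c + \partial c'} \alpha &= \int_c \alpha + \int_{\partial c'} \alpha \\
&= \int_c \alpha + \int_{c'} d\alpha \quad \text{(Stokes)} \\
&= \int_c \alpha \quad \text{(since } d\alpha = 0\text{)}.
\end{align*}
This gives a linear map $Z^k(M) \to H_k(M; \mathbb{R})^*$. Now if $\alpha \in B^k(M)$ (image of $d$), then $\alpha = d\beta$ and
\[
\Bigl(\int\alpha\Bigr)([c]) = \int_c \alpha = \int_c d\beta = \int_{\partial c} \beta = 0 \quad \text{(since } \partial c = 0\text{)}.
\]
Thus we obtain a linear map called the de Rham map
\[
H^k_{deR}(M) \to H_k(M; \mathbb{R})^*,
\]
where $H^k_{deR}(M) = Z^k(M)/B^k(M)$ is the de Rham cohomology defined earlier. The content of the celebrated de Rham theorem is in part that this map is an isomorphism. The theorem is fairly difficult to prove.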
9. Integration and Stokes' Theorem
404
Theorem 9.16 (de Rham). The de Rham map defined above is an isomorphism, $H^k_{\mathrm{deR}}(M) \cong H_k(M;\mathbb{R})^*$.

Proof. See [Bo-Tu] or [War]. □
We have defined $H_k(M;\mathbb{R})$ using smooth singular chains, but we could have used continuous chains. The result is isomorphic to $H_k(M;\mathbb{R})$ as we have defined it (see [War] or [Bo-Tu]).
9.4. Differential Forms and Metrics

Let $(V,g)$ be a real scalar product space (not necessarily positive definite). We wish to induce a scalar product on $L^k_{\mathrm{alt}}(V) \cong \bigwedge^k(V^*)$. Even though elements of $L^k_{\mathrm{alt}}(V) \cong \bigwedge^k(V^*)$ can be thought of as tensors of type $(0,k)$ that just happen to be alternating, we will give a scalar product to this space in such a way that the basis
\[ \{e^{i_1}\wedge\cdots\wedge e^{i_k}\}_{i_1<\cdots<i_k} \tag{9.6} \]
is orthonormal if $e^1,\ldots,e^n$ is orthonormal. Recall that if $\alpha,\beta \in V^*$, then by definition $\langle\alpha,\beta\rangle = g(\alpha^\sharp,\beta^\sharp)$. Now suppose that $\alpha = \alpha^1\wedge\alpha^2\wedge\cdots\wedge\alpha^k$ and $\beta = \beta^1\wedge\beta^2\wedge\cdots\wedge\beta^k$, where the $\alpha^i$ and $\beta^i$ are 1-forms. Then we want to have
\[ \langle\alpha|\beta\rangle = \langle\alpha^1\wedge\alpha^2\wedge\cdots\wedge\alpha^k \,|\, \beta^1\wedge\beta^2\wedge\cdots\wedge\beta^k\rangle = \det[\langle\alpha^i,\beta^j\rangle], \]
where $[\langle\alpha^i,\beta^j\rangle]$ is a $k\times k$ matrix. Notice that we use $\langle\alpha|\beta\rangle$ rather than $\langle\alpha,\beta\rangle$ since the latter could be taken to be the natural inner product of $\alpha$ and $\beta$ as tensors, which differs from the former by a factor. We want to extend this bilinearly to all $k$-forms. We could just declare the basis (9.6) above to be orthonormal and thus define a scalar product. Of course, one must then show that this scalar product does not depend on the choice of basis. For completeness, we now show how to arrive at the appropriate scalar product using universal mapping properties and obtain some formulas.

Fix $\beta^1,\beta^2,\ldots,\beta^k \in V^*$ and consider the map $\mu_{\beta^1,\ldots,\beta^k} : V^*\times\cdots\times V^* \to \mathbb{R}$ given by
\[ \mu_{\beta^1,\ldots,\beta^k} : (\alpha^1,\alpha^2,\ldots,\alpha^k) \mapsto \det[\langle\alpha^i,\beta^j\rangle]. \]
Since this is an alternating multilinear map, we can use the universal property of $\bigwedge^k(V^*)$ to see that this map defines a unique linear map $\tilde\mu_{\beta^1,\ldots,\beta^k} : \bigwedge^k(V^*) \to \mathbb{R}$ such that
\[ \tilde\mu_{\beta^1,\ldots,\beta^k}(\alpha^1\wedge\alpha^2\wedge\cdots\wedge\alpha^k) = \det[\langle\alpha^i,\beta^j\rangle]. \]
Similarly, for fixed $\alpha \in \bigwedge^k(V^*)$, the map $m_\alpha : (\beta^1,\beta^2,\ldots,\beta^k) \mapsto \tilde\mu_{\beta^1,\ldots,\beta^k}(\alpha)$ is alternating multilinear and so gives rise to a linear map $\tilde m_\alpha : \bigwedge^k(V^*) \to \mathbb{R}$ such that
\[ \tilde m_\alpha(\beta^1\wedge\beta^2\wedge\cdots\wedge\beta^k) = \tilde\mu_{\beta^1,\ldots,\beta^k}(\alpha). \]
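Lemma 9.17. The map
\[ \langle\cdot|\cdot\rangle : \bigwedge\nolimits^k(V^*)\times\bigwedge\nolimits^k(V^*) \to \mathbb{R} \]
defined by $\langle\alpha|\beta\rangle := \tilde m_\alpha(\beta)$ is symmetric and bilinear. We have
\[ \langle\alpha^1\wedge\alpha^2\wedge\cdots\wedge\alpha^k \,|\, \beta^1\wedge\beta^2\wedge\cdots\wedge\beta^k\rangle = \det[\langle\alpha^i,\beta^j\rangle]. \]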
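Proof. By construction, the map is linear in the second slot for each fixed $\alpha$. Fix $\beta$ and write $\beta$ as a sum $\beta = \sum b_{i_1\ldots i_k}\beta^{i_1}\wedge\cdots\wedge\beta^{i_k}$ in any way. Then
\[ \tilde m_\alpha(\beta) = \sum b_{i_1\ldots i_k}\,\tilde m_\alpha(\beta^{i_1}\wedge\cdots\wedge\beta^{i_k}) = \sum b_{i_1\ldots i_k}\,\tilde\mu_{\beta^{i_1},\ldots,\beta^{i_k}}(\alpha). \]
But each map $\alpha \mapsto \tilde\mu_{\beta^{i_1},\ldots,\beta^{i_k}}(\alpha)$ is linear and so $\alpha \mapsto \tilde m_\alpha(\beta)$ is also linear. Now let $\varphi(\cdot,\cdot) : W\times W \to \mathbb{R}$ be any bilinear map on a real vector space $W$. If $S \subset W$ spans $W$ and if $\varphi(s_1,s_2) = \varphi(s_2,s_1)$ for all $s_1,s_2 \in S$, then $\varphi$ is symmetric. The set of all elements $\beta \in \bigwedge^k(V^*)$ of the form $\beta = \beta^1\wedge\cdots\wedge\beta^k$ for 1-forms $\beta^1,\ldots,\beta^k$ is a spanning set. Since
\[ \langle\alpha^1\wedge\cdots\wedge\alpha^k \,|\, \beta^1\wedge\cdots\wedge\beta^k\rangle = \det[\langle\alpha^i,\beta^j\rangle] = \det[\langle\beta^i,\alpha^j\rangle] = \langle\beta^1\wedge\cdots\wedge\beta^k \,|\, \alpha^1\wedge\cdots\wedge\alpha^k\rangle, \]
we conclude that $\langle\cdot|\cdot\rangle$ is symmetric. □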
The bilinear map defined in the previous lemma is a scalar product for the vector space $\bigwedge^k(V^*)$. Notice the vertical bar rather than a comma in the notation. If $e^1,\ldots,e^n$ is an orthonormal basis for $V^*$, then $\{e^{i_1}\wedge\cdots\wedge e^{i_k}\}_{i_1<\cdots<i_k}$ is an orthonormal basis for $\bigwedge^k(V^*)$. Indeed, by Lemma 9.17,
\[ \langle e^{i_1}\wedge\cdots\wedge e^{i_k} \,|\, e^{j_1}\wedge\cdots\wedge e^{j_k}\rangle = \det[\langle e^{i_a},e^{j_b}\rangle], \]
which is zero if $i_1,\ldots,i_k$ is not a permutation of $j_1,\ldots,j_k$, while on the other hand, if $(j_1,\ldots,j_k) = (\sigma(i_1),\ldots,\sigma(i_k))$, then every column of the above determinant has exactly one nonzero entry, which is either $1$ or $-1$. Abbreviating $e^{i_1}\wedge\cdots\wedge e^{i_k}$ to $e^I$ and so on, we have in the orthonormal case
\[ \langle e^I|e^I\rangle = \pm 1. \]
So, if $\alpha = \sum a_I e^I$, then $\langle e^I|\alpha\rangle = \langle e^I|e^I\rangle a_I$ and $a_I = \langle e^I|e^I\rangle\langle e^I|\alpha\rangle = \pm\langle e^I|\alpha\rangle$.

We see that the scalar product need not be positive definite. Unless we give a specific linear order to the basis $\{e^I\}$, the signature is perhaps best thought of as the indexed set $\{\epsilon(I)\}_I$, where $\epsilon(I) := \langle e^I|e^I\rangle$, so that
\[ \alpha = \sum_I \epsilon(I)\langle e^I|\alpha\rangle e^I. \]
We are very much interested in $\bigwedge^k(V^*)$, but note that $\bigwedge^k(V)$ is also a scalar product space in the analogous way.
Exercise 9.18. Let $\mathbb{R}^4_1$ denote Minkowski space. Determine the index of the scalar product on $\bigwedge^2(\mathbb{R}^4_1)$ described above.

We already have a scalar product denoted $\langle\cdot,\cdot\rangle$ for tensors. We wish to compare it with the scalar product $\langle\cdot|\cdot\rangle$ just defined on forms. For simplicity we consider the positive definite case. If $\alpha = \sum a_I e^I$ and $\beta = \sum b_I e^I$ are $k$-forms considered as covariant tensor fields, then $\langle\alpha,\beta\rangle$ can be computed in components by Problem 18 of Chapter 7. Evaluating both scalar products on the basis elements $e^I$ built from an orthonormal basis $e^1,\ldots,e^n$ for $V^*$, we conclude that
\[ \langle\alpha|\beta\rangle = \frac{1}{k!}\langle\alpha,\beta\rangle, \]
so the two scalar products differ by a factor of $k!$. Now let $\theta^1,\ldots,\theta^n$ be any basis for $V^*$ (not necessarily orthonormal). For a given $\alpha \in \bigwedge^k(V^*)$, we can also write $\alpha = \sum a_I\theta^I$ and then as a tensor
\[ \alpha = \frac{1}{k!}\sum a_{i_1\ldots i_k}\,\theta^{i_1}\wedge\cdots\wedge\theta^{i_k} = \sum a_{i_1\ldots i_k}\,\theta^{i_1}\otimes\cdots\otimes\theta^{i_k}. \]
But when $\alpha = \sum a_I\theta^I$ and $\beta = \sum b_I\theta^I$ are viewed as $k$-forms, we must have
\[ \langle\alpha|\beta\rangle = \frac{1}{k!}\langle\alpha,\beta\rangle = \frac{1}{k!}\sum a_{i_1\ldots i_k}b^{i_1\ldots i_k} = \sum_{i_1<\cdots<i_k} a_{i_1\ldots i_k}b^{i_1\ldots i_k}. \]

Definition 9.19. We defined the scalar product on $\bigwedge^k V^* \cong L^k_{\mathrm{alt}}(V)$ by first using the above formula for exterior products of 1-forms and then extending (bi)linearly to all of $\bigwedge^k V^*$. We can also extend to the whole Grassmann algebra $\bigwedge V^* = \bigoplus \bigwedge^k V^*$ by declaring forms of different degree to be orthogonal. We also have the obvious similar definition for $\bigwedge^k V$ and $\bigwedge V$.

If we have an orthonormal basis $e^1,\ldots,e^n$ for $V^*$, then $e^1\wedge\cdots\wedge e^n \in \bigwedge^n V^*$. But $\bigwedge^n V^*$ is one-dimensional, and if $\ell : V \to V$ is any isometry of $(V,g)$, then the dual map $\ell^* : V^* \to V^*$ is an isometry of $(V^*,g^*)$. Thus $\ell^*e^1\wedge\cdots\wedge\ell^*e^n = \pm e^1\wedge\cdots\wedge e^n$. In particular, for any permutation $\sigma$ of the letters $\{1,2,\ldots,n\}$ we have $e^1\wedge\cdots\wedge e^n = \operatorname{sgn}(\sigma)\,e^{\sigma(1)}\wedge\cdots\wedge e^{\sigma(n)}$. For a given ordered orthonormal basis $(e_1,\ldots,e_n)$ for $V$, with dual basis $(e^1,\ldots,e^n)$, we have
\[ \langle e^1\wedge\cdots\wedge e^n \,|\, e^1\wedge\cdots\wedge e^n\rangle = \epsilon_1\epsilon_2\cdots\epsilon_n = \pm 1. \]
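The factor $k!$ and the sign $\epsilon_1\cdots\epsilon_n$ can be checked by hand in the smallest case. The following sketch assumes $n = k = 2$, an orthonormal basis with $\langle e^i,e^i\rangle = \epsilon_i$, and the identification $e^1\wedge e^2 = e^1\otimes e^2 - e^2\otimes e^1$, which is the convention consistent with the tensor expression for $\alpha$ given above.

```latex
\begin{align*}
\langle e^1\wedge e^2 \,|\, e^1\wedge e^2\rangle
 &= \det\begin{pmatrix}\epsilon_1 & 0\\ 0 & \epsilon_2\end{pmatrix}
  = \epsilon_1\epsilon_2, \\
\langle e^1\wedge e^2,\ e^1\wedge e^2\rangle
 &= \langle e^1\otimes e^2 - e^2\otimes e^1,\ e^1\otimes e^2 - e^2\otimes e^1\rangle \\
 &= \langle e^1,e^1\rangle\langle e^2,e^2\rangle + \langle e^2,e^2\rangle\langle e^1,e^1\rangle
  = 2\,\epsilon_1\epsilon_2 .
\end{align*}
```

The two values differ by the factor $2! = 2$, and for a Lorentz-type plane ($\epsilon_1 = -1$, $\epsilon_2 = 1$) the top form $e^1\wedge e^2$ has scalar square $-1$.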
Exercise 9.20. Let $(e_1,\ldots,e_n)$ be orthonormal. Show that the only elements $\omega$ of $\bigwedge^n V^*$ with $|\langle\omega|\omega\rangle| = 1$ are $e^1\wedge\cdots\wedge e^n$ and $-e^1\wedge\cdots\wedge e^n$.

Given a fixed orthonormal basis $(e_1,\ldots,e_n)$, all ordered orthonormal bases for $V$ fall into two classes. Namely, those bases $(f_1,\ldots,f_n)$ for which $f^1\wedge\cdots\wedge f^n = e^1\wedge\cdots\wedge e^n$ and those for which $f^1\wedge\cdots\wedge f^n = -e^1\wedge\cdots\wedge e^n$. Thus for each orientation of $V$ there is a corresponding element of $\bigwedge^n V^*$ called a metric volume element for $(V,g)$. The metric volume element corresponding to a basis $(e_1,\ldots,e_n)$ is just $e^1\wedge\cdots\wedge e^n$. On the other hand,
we have seen that any nonzero top form $\omega$ determines an orientation. If we have an orientation given by a top form $\omega$, then obviously, $(e_1,\ldots,e_n)$ determines the same orientation if and only if $\omega(e_1,\ldots,e_n) > 0$ since this means that $\omega = c\,e^1\wedge\cdots\wedge e^n$ for some $c > 0$.

Proposition 9.21. Let an orientation be chosen on the scalar product space $(V,g)$ and let $E = (e_1,\ldots,e_n)$ be an oriented orthonormal frame so that $\mathrm{vol} := e^1\wedge\cdots\wedge e^n$ is the corresponding volume element. Then if $F = (f_1,\ldots,f_n)$ is a positively oriented basis for $V$ with dual basis $F^* = (f^1,\ldots,f^n)$, then
\[ \mathrm{vol} = \sqrt{|\det(g_{ij})|}\, f^1\wedge\cdots\wedge f^n, \]
where $g_{ij} = \langle f_i,f_j\rangle$. If $g$ is positive definite, then $\det(g_{ij}) > 0$ and so $\mathrm{vol} = \sqrt{\det(g_{ij})}\, f^1\wedge\cdots\wedge f^n$.

Proof. Let $e^i = \sum a^i_j f^j$. Then we have
\[ \epsilon_i\delta^{ij} = \pm\delta^{ij} = \langle e^i,e^j\rangle = \Bigl\langle \sum a^i_k f^k, \sum a^j_m f^m\Bigr\rangle = \sum_{k,m} a^i_k a^j_m\langle f^k,f^m\rangle = \sum_{k,m} a^i_k a^j_m g^{km}, \]
so that
\[ \pm 1 = \det([a^i_k])^2\det([g^{km}]) = (\det([a^i_k]))^2(\det(g_{ij}))^{-1}. \]
Thus, $\pm\sqrt{|\det(g_{ij})|} = \det([a^i_k])$. But since $E$ and $F$ are both positively oriented, we must have $\det([a^i_k]) > 0$ and so
\[ \sqrt{|\det(g_{ij})|} = \det([a^i_k]). \]
On the other hand,
\[ \mathrm{vol} = e^1\wedge\cdots\wedge e^n = \Bigl(\sum a^1_{k_1}f^{k_1}\Bigr)\wedge\cdots\wedge\Bigl(\sum a^n_{k_n}f^{k_n}\Bigr) = \det([a^i_k])\, f^1\wedge\cdots\wedge f^n, \]
and the result follows. If $g$ is positive definite, then $\det([a^i_k])^2\det([g^{km}]) = 1$ so $\det(g_{ij}) > 0$. □

Fix an orientation and let $(e_1,\ldots,e_n)$ be an orthonormal basis in that orientation class. Then we have the corresponding volume element $\mathrm{vol} = e^1\wedge\cdots\wedge e^n$. Now we define the Hodge star operator $* : \bigwedge^k V^* \to \bigwedge^{n-k}V^*$ for $1 \le k \le n$, where $n = \dim(V)$.
Theorem 9.22. Let $(V,g)$ be a scalar product space with $\dim(V) = n$ and the corresponding volume element vol. For each $k$, there is a unique linear isomorphism $* : \bigwedge^k V^* \to \bigwedge^{n-k}V^*$ such that
\[ \alpha\wedge *\beta = \langle\alpha|\beta\rangle\,\mathrm{vol} \]
for all $\alpha,\beta \in \bigwedge^k V^*$.

Proof. Given $\gamma \in \bigwedge^{n-k}V^*$, define a linear map $L_\gamma : \bigwedge^k V^* \to \mathbb{R}$ by requiring that $L_\gamma(\alpha)\,\mathrm{vol} = \alpha\wedge\gamma$. By Problem 3, if $L_\gamma(\alpha) = 0$ for all $\alpha$, then $\gamma = 0$, and so $\gamma \mapsto L_\gamma$ gives a one-to-one linear map $\bigwedge^{n-k}V^* \to (\bigwedge^k V^*)^*$. But since $\dim\bigwedge^{n-k}V^* = \dim(\bigwedge^k V^*)^*$, this must be a linear isomorphism. Thus, for each $\beta \in \bigwedge^k V^*$, there is a unique element $*\beta \in \bigwedge^{n-k}V^*$ such that $L_{*\beta}(\alpha) = \langle\alpha|\beta\rangle$ for all $\alpha \in \bigwedge^k V^*$. This defines a map $* : \bigwedge^k V^* \to \bigwedge^{n-k}V^*$ such that
\[ \alpha\wedge *\beta = L_{*\beta}(\alpha)\,\mathrm{vol} = \langle\alpha|\beta\rangle\,\mathrm{vol}. \]
This map is easily seen to be linear. The equation above also shows that it is one-to-one and hence an isomorphism. □

Proposition 9.23. Let $(V,g)$ be a scalar product space with $\dim(V) = n$ and corresponding volume element vol. Let $(e_1,\ldots,e_n)$ be a positively oriented orthonormal basis with dual $(e^1,\ldots,e^n)$. Let $\sigma$ be a permutation of $(1,2,\ldots,n)$. On the basis elements $e^{\sigma(1)}\wedge\cdots\wedge e^{\sigma(k)}$ for $\bigwedge^k V^*$ we have
\[ *(e^{\sigma(1)}\wedge\cdots\wedge e^{\sigma(k)}) = \epsilon_{\sigma(1)}\epsilon_{\sigma(2)}\cdots\epsilon_{\sigma(k)}\operatorname{sgn}(\sigma)\, e^{\sigma(k+1)}\wedge\cdots\wedge e^{\sigma(n)}. \]
In other words, if we let $\{i_{k+1},\ldots,i_n\} = \{1,2,\ldots,n\}\setminus\{i_1,\ldots,i_k\}$, then
\[ *(e^{i_1}\wedge\cdots\wedge e^{i_k}) = \pm(\epsilon_{i_1}\epsilon_{i_2}\cdots\epsilon_{i_k})\, e^{i_{k+1}}\wedge\cdots\wedge e^{i_n}, \tag{9.7} \]
where we take the $+$ sign if and only if $e^{i_1}\wedge\cdots\wedge e^{i_k}\wedge e^{i_{k+1}}\wedge\cdots\wedge e^{i_n} = e^1\wedge\cdots\wedge e^n$.
Proof. Formula (9.7) above actually defines a map $\bigwedge^k V^* \to \bigwedge^{n-k}V^*$. For this proof we denote this map by $\tilde{*}$ and show it satisfies the same defining equations as the Hodge star. It is enough to check that the defining formula $\alpha\wedge\tilde{*}\beta = \langle\alpha|\beta\rangle\,\mathrm{vol}$ holds for typical (orthonormal) basis elements $\alpha = e^{m_1}\wedge\cdots\wedge e^{m_k}$ and $\beta = e^{i_1}\wedge\cdots\wedge e^{i_k}$. We have
\[ (e^{m_1}\wedge\cdots\wedge e^{m_k})\wedge\tilde{*}(e^{i_1}\wedge\cdots\wedge e^{i_k}) = e^{m_1}\wedge\cdots\wedge e^{m_k}\wedge(\pm e^{i_{k+1}}\wedge\cdots\wedge e^{i_n}). \tag{9.8} \]
The last expression is zero unless $\{m_1,\ldots,m_k\}\cup\{i_{k+1},\ldots,i_n\} = \{1,2,\ldots,n\}$, or in other words, unless $\{i_1,\ldots,i_k\} = \{m_1,\ldots,m_k\}$. But this is also true for
\[ \langle e^{m_1}\wedge\cdots\wedge e^{m_k} \,|\, e^{i_1}\wedge\cdots\wedge e^{i_k}\rangle\,\mathrm{vol}. \tag{9.9} \]
On the other hand, if $\{i_1,\ldots,i_k\} = \{m_1,\ldots,m_k\}$, then both (9.8) and (9.9) give $\pm\mathrm{vol}$. So the proposition is proved up to a sign. We leave it to the reader to show that the definitions are such that the signs match. □
Remark 9.24. In case the scalar product is positive definite, $\epsilon_1 = \epsilon_2 = \cdots = \epsilon_n = 1$ and so the formulas are a bit less cluttered.
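The Hodge star operators for each $k$ can be combined to give a Hodge star operator $\bigwedge V^* \to \bigwedge V^*$. We also note that the scalar product on $\bigwedge^0 V^* = \mathbb{R}$ is just $\langle a|b\rangle = ab$. Then, we still have the formula $\alpha\wedge *\beta = \langle\alpha|\beta\rangle\,\mathrm{vol}$ for any $\alpha,\beta$ in the algebra $\bigwedge V^*$.

Proposition 9.25. Let $V$ be as above with $\dim(V) = n$. The following identities hold for the star operator:

(1) $*1 = \mathrm{vol}$;
(2) $*\mathrm{vol} = (-1)^{\operatorname{ind}(g)}$;
(3) $**\alpha = (-1)^{\operatorname{ind}(g)}(-1)^{k(n-k)}\alpha$ for $\alpha \in \bigwedge^k V^*$;
(4) $\langle *\alpha|*\beta\rangle = (-1)^{\operatorname{ind}(g)}\langle\alpha|\beta\rangle$.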
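Proof. (1) and (2) follow directly from the definitions. For (3), it suffices to let $\alpha = e^{\sigma(1)}\wedge\cdots\wedge e^{\sigma(k)}$ for some permutation $\sigma \in S_n$. We first compute $*(e^{\sigma(k+1)}\wedge\cdots\wedge e^{\sigma(n)})$. We must have $*(e^{\sigma(k+1)}\wedge\cdots\wedge e^{\sigma(n)}) = c\,e^{\sigma(1)}\wedge\cdots\wedge e^{\sigma(k)}$ for some constant $c$. On the other hand,
\begin{align*}
\epsilon_{\sigma(k+1)}\cdots\epsilon_{\sigma(n)}\,\mathrm{vol}
&= \langle e^{\sigma(k+1)}\wedge\cdots\wedge e^{\sigma(n)} \,|\, e^{\sigma(k+1)}\wedge\cdots\wedge e^{\sigma(n)}\rangle\,\mathrm{vol} \\
&= (e^{\sigma(k+1)}\wedge\cdots\wedge e^{\sigma(n)})\wedge *(e^{\sigma(k+1)}\wedge\cdots\wedge e^{\sigma(n)}) \\
&= (e^{\sigma(k+1)}\wedge\cdots\wedge e^{\sigma(n)})\wedge c\,e^{\sigma(1)}\wedge\cdots\wedge e^{\sigma(k)} \\
&= (-1)^{k(n-k)}c\operatorname{sgn}(\sigma)\,\mathrm{vol},
\end{align*}
so that $c = \epsilon_{\sigma(k+1)}\cdots\epsilon_{\sigma(n)}(-1)^{k(n-k)}\operatorname{sgn}(\sigma)$. Using this and equation (9.7), we have
\begin{align*}
**(e^{\sigma(1)}\wedge\cdots\wedge e^{\sigma(k)})
&= *\,\epsilon_{\sigma(1)}\epsilon_{\sigma(2)}\cdots\epsilon_{\sigma(k)}\operatorname{sgn}(\sigma)\,e^{\sigma(k+1)}\wedge\cdots\wedge e^{\sigma(n)} \\
&= \epsilon_{\sigma(1)}\cdots\epsilon_{\sigma(k)}\epsilon_{\sigma(k+1)}\cdots\epsilon_{\sigma(n)}(\operatorname{sgn}(\sigma))^2(-1)^{k(n-k)}\,e^{\sigma(1)}\wedge\cdots\wedge e^{\sigma(k)} \\
&= (-1)^{\operatorname{ind}(g)}(-1)^{k(n-k)}\,e^{\sigma(1)}\wedge\cdots\wedge e^{\sigma(k)},
\end{align*}
which implies the result. For (4) we compute as follows:
\begin{align*}
\langle *\alpha|*\beta\rangle\,\mathrm{vol} &= *\alpha\wedge **\beta = (-1)^{\operatorname{ind}(g)}(-1)^{k(n-k)}\,{*\alpha}\wedge\beta \\
&= (-1)^{\operatorname{ind}(g)}\,\beta\wedge *\alpha = (-1)^{\operatorname{ind}(g)}\langle\alpha|\beta\rangle\,\mathrm{vol}.
\end{align*}
□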
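To see the signs in Propositions 9.23 and 9.25 at work, here is a small worked case. It is only a sketch under stated assumptions: $n = 2$, an oriented orthonormal basis with $\epsilon_1 = -1$ and $\epsilon_2 = +1$ (so $\operatorname{ind}(g) = 1$), and $\mathrm{vol} = e^1\wedge e^2$.

```latex
\begin{align*}
*e^1 &= \epsilon_1\operatorname{sgn}(\mathrm{id})\,e^2 = -e^2,
 & *e^2 &= \epsilon_2\operatorname{sgn}\bigl((1\,2)\bigr)\,e^1 = -e^1, \\
e^1\wedge *e^1 &= -e^1\wedge e^2 = \langle e^1|e^1\rangle\,\mathrm{vol},
 & e^2\wedge *e^2 &= -e^2\wedge e^1 = \langle e^2|e^2\rangle\,\mathrm{vol}, \\
**e^1 &= *(-e^2) = e^1 = (-1)^{1}(-1)^{1\cdot 1}e^1,
 & *\mathrm{vol} &= -1 = (-1)^{\operatorname{ind}(g)} .
\end{align*}
```

In the positive definite plane the same computation gives $*e^1 = e^2$, $*e^2 = -e^1$ and $** = -\mathrm{id}$ on 1-forms, in agreement with identity (3).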
We obtain a formula for the star operator that uses a basis that is not necessarily orthonormal. Recall $\epsilon^{i_1\ldots i_n}_{j_1\ldots j_n}$ defined by equation (8.2). If we let $\epsilon_{i_1\ldots i_n} := \epsilon^{1\,2\ldots n}_{i_1\ldots i_n}$, then $\epsilon_{i_1\ldots i_n}$ is just the sign of the permutation $\begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix}$. Using this notation, we have the following theorem.
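Theorem 9.26. Let $(V,g)$ be an oriented scalar product space with corresponding volume element vol. Let $e^1,\ldots,e^n$ be a positively oriented basis of $V^*$. If $\omega = \frac{1}{k!}\sum\omega_{i_1\ldots i_k}e^{i_1}\wedge\cdots\wedge e^{i_k}$ (or $\omega = \sum_{i_1<\cdots<i_k}\omega_{i_1\ldots i_k}e^{i_1}\wedge\cdots\wedge e^{i_k}$), then
\[ *\omega = \sqrt{|\det[g_{ij}]|}\sum_{j_{k+1}<\cdots<j_n}\frac{1}{k!}\,\omega^{i_1\ldots i_k}\,\epsilon_{i_1\ldots i_k j_{k+1}\ldots j_n}\, e^{j_{k+1}}\wedge\cdots\wedge e^{j_n}, \tag{9.10} \]
where the indices of $\omega$ are raised using $g$ and summation over $i_1,\ldots,i_k$ is understood.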
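Proof. Let us use the common abbreviation $g := \det[g_{ij}]$. Let $\tilde{*}$ denote the operator defined so that $\tilde{*}\omega$ is given by the right hand side of equation (9.10) above. Our task is to show that $\tilde{*} = *$. For this we need to show that $\alpha\wedge\tilde{*}\beta = \langle\alpha|\beta\rangle\,\mathrm{vol}$ for all $\alpha,\beta \in \bigwedge^k V^*$. First note that for fixed $j_1,\ldots,j_n$ we have $\epsilon_{j_1\ldots j_n}\epsilon^{j_1\ldots j_n}_{i_1\ldots i_n} = \epsilon_{i_1\ldots i_n}$ (no sum). We compute the coefficients of $\alpha\wedge\tilde{*}\beta$ using formula (8.3):
\begin{align*}
(\alpha\wedge\tilde{*}\beta)_{i_1\ldots i_n}
&= \frac{1}{k!(n-k)!}\sum\alpha_{j_1\ldots j_k}(\tilde{*}\beta)_{j_{k+1}\ldots j_n}\,\epsilon^{j_1\ldots j_n}_{i_1\ldots i_n} \\
&= \frac{1}{k!(n-k)!}\,\frac{1}{k!}\sum|g|^{1/2}\,\alpha_{j_1\ldots j_k}\beta^{m_1\ldots m_k}\,\epsilon_{m_1\ldots m_k j_{k+1}\ldots j_n}\,\epsilon^{j_1\ldots j_n}_{i_1\ldots i_n} \\
&= |g|^{1/2}\frac{1}{k!(n-k)!}\sum\alpha_{j_1\ldots j_k}\beta^{j_1\ldots j_k}\,\epsilon_{j_1\ldots j_k j_{k+1}\ldots j_n}\,\epsilon^{j_1\ldots j_n}_{i_1\ldots i_n} \\
&= |g|^{1/2}\frac{1}{k!}\sum\alpha_{j_1\ldots j_k}\beta^{j_1\ldots j_k}\,\epsilon_{i_1\ldots i_n}
 = (\langle\alpha|\beta\rangle\,\mathrm{vol})_{i_1\ldots i_n},
\end{align*}
and so $\alpha\wedge\tilde{*}\beta = \langle\alpha|\beta\rangle\,\mathrm{vol}$; thus by uniqueness we have $\tilde{*} = *$. □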
The definitions we gave for volume element, star operator, and the musical isomorphisms all apply to each tangent space $T_pM$ of a semi-Riemannian manifold and these concepts globalize to the whole of $TM$ accordingly. Let $M$ be oriented. The metric volume form induced by the metric tensor $g$ is defined to be the $n$-form vol such that $\mathrm{vol}_p$ is the metric volume form on $T_pM$ matching the orientation. If $(U,x)$ is a positively oriented chart on $M$, then we have
\[ \mathrm{vol}|_U = \sqrt{|\det[g_{ij}]|}\,dx^1\wedge\cdots\wedge dx^n. \]
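As a quick check of this coordinate formula, consider polar coordinates on the plane. The sketch below assumes the Euclidean metric on an open subset of $\mathbb{R}^2\setminus\{0\}$ on which $(r,\theta)$ is a positively oriented chart.

```latex
\begin{align*}
g &= dr\otimes dr + r^2\,d\theta\otimes d\theta, \qquad \det[g_{ij}] = r^2, \qquad
\mathrm{vol} = r\,dr\wedge d\theta, \\
dx\wedge dy &= (\cos\theta\,dr - r\sin\theta\,d\theta)\wedge(\sin\theta\,dr + r\cos\theta\,d\theta)
 = r(\cos^2\theta + \sin^2\theta)\,dr\wedge d\theta = r\,dr\wedge d\theta .
\end{align*}
```

Both expressions agree, as they must, since $dx\wedge dy$ is the metric volume form in the standard coordinates.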
If $f$ is a smooth function with compact support, then we may integrate $f$ over $M$ by using the volume form:
\[ \int_M f\,\mathrm{vol}. \]
The volume of a compact oriented Riemannian manifold is
\[ \mathrm{vol}(M) := \int_M \mathrm{vol}. \]
The volume of an open set $D \subset M$ is defined as
\[ \mathrm{vol}(D) := \sup\Bigl\{\int_M f\,\mathrm{vol} : \operatorname{supp}(f)\subset D \text{ with } f \in C_c^\infty(M) \text{ and } 0 \le f \le 1\Bigr\}, \]
where $C_c^\infty(M)$ denotes the space of smooth functions with compact support.
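Example 9.27. The metric volume form $\mathrm{vol}_{\mathbb{R}^n}$ on $\mathbb{R}^n$ is $dx^1\wedge\cdots\wedge dx^n$, where $x^1,\ldots,x^n$ are standard coordinates. Also,
\[ \mathrm{vol}_{\mathbb{R}^n}(v_{1p},\ldots,v_{np}) = \det(v_1,\ldots,v_n), \]
where $v_{ip} = (p,v_i) \in T_p\mathbb{R}^n = \{p\}\times\mathbb{R}^n$.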
Once again assume that $M$ is oriented by a metric volume form "vol". The star operator gives two types of maps, which are both denoted by $*$. Namely, the bundle maps $* : \bigwedge^k T^*M \to \bigwedge^{n-k}T^*M$, which have the obvious definition in terms of the star operators on each fiber, and the maps on forms $* : \Omega^k(M) \to \Omega^{n-k}(M)$, which are also induced in the obvious way. The definitions are set up so that $*$ is natural with respect to restriction and so that for any oriented orthonormal frame field $\{E_1,\ldots,E_n\}$ with dual $\{\theta^1,\ldots,\theta^n\}$ we have
\[ *(\theta^{i_1}\wedge\cdots\wedge\theta^{i_k}) = \pm(\epsilon_{i_1}\epsilon_{i_2}\cdots\epsilon_{i_k})\,\theta^{j_1}\wedge\cdots\wedge\theta^{j_{n-k}}, \]
where, as before, we use the $+$ sign if and only if $\theta^{i_1}\wedge\cdots\wedge\theta^{i_k}\wedge\theta^{j_1}\wedge\cdots\wedge\theta^{j_{n-k}} = \theta^1\wedge\cdots\wedge\theta^n = \mathrm{vol}$. As expected we have
\begin{align*}
*1 &= \mathrm{vol}, \\
*\mathrm{vol} &= (-1)^{\operatorname{ind}(g)}, \\
**\alpha &= (-1)^{\operatorname{ind}(g)}(-1)^{k(n-k)}\alpha \quad\text{for } \alpha \in \Omega^k(M).
\end{align*}
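Example 9.28. On open sets in $\mathbb{R}^3$ the star operator associated with the standard metric and volume is given by
\begin{align*}
f &\mapsto f\,dx\wedge dy\wedge dz, \\
f_1\,dx + f_2\,dy + f_3\,dz &\mapsto f_1\,dy\wedge dz + f_2\,dz\wedge dx + f_3\,dx\wedge dy, \\
g_1\,dy\wedge dz + g_2\,dz\wedge dx + g_3\,dx\wedge dy &\mapsto g_1\,dx + g_2\,dy + g_3\,dz, \\
f\,dx\wedge dy\wedge dz &\mapsto f.
\end{align*}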
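Combined with the exterior derivative, these rules reproduce the classical operators of vector calculus. The following computation is a sketch for a 1-form $\omega = f_1\,dx + f_2\,dy + f_3\,dz$ on an open subset of $\mathbb{R}^3$; the vector field notation $F = (f_1,f_2,f_3)$ is introduced here only for comparison.

```latex
\begin{align*}
d\omega &= (\partial_y f_3 - \partial_z f_2)\,dy\wedge dz
        + (\partial_z f_1 - \partial_x f_3)\,dz\wedge dx
        + (\partial_x f_2 - \partial_y f_1)\,dx\wedge dy, \\
*d\omega &= (\partial_y f_3 - \partial_z f_2)\,dx
        + (\partial_z f_1 - \partial_x f_3)\,dy
        + (\partial_x f_2 - \partial_y f_1)\,dz
        \quad\text{(the components of } \operatorname{curl} F\text{)}, \\
*d{*\omega} &= *\bigl((\partial_x f_1 + \partial_y f_2 + \partial_z f_3)\,dx\wedge dy\wedge dz\bigr)
        = \partial_x f_1 + \partial_y f_2 + \partial_z f_3 = \operatorname{div} F .
\end{align*}
```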
For an oriented Riemannian manifold, div will always be the divergence with respect to the metric volume form (giving the orientation). From equation (9.4) we see that the local coordinate expression for the divergence of $X = \sum X^k\frac{\partial}{\partial x^k}$ is
\[ \operatorname{div}X = \frac{1}{\sqrt{\det g}}\sum_{k=1}^n\frac{\partial}{\partial x^k}\bigl(\sqrt{\det g}\,X^k\bigr), \]
where $\det g = \det(g_{ij})$.
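For instance, in polar coordinates on the Euclidean plane the formula can be evaluated directly. The sketch below assumes $X = X^r\,\partial_r + X^\theta\,\partial_\theta$ with coordinate (not unit-frame) components, and uses $\sqrt{\det g} = r$.

```latex
\begin{align*}
\operatorname{div}X &= \frac{1}{r}\Bigl(\frac{\partial}{\partial r}\bigl(r X^r\bigr)
   + \frac{\partial}{\partial\theta}\bigl(r X^\theta\bigr)\Bigr)
 = \frac{\partial X^r}{\partial r} + \frac{X^r}{r} + \frac{\partial X^\theta}{\partial\theta}, \\
X &= r\,\partial_r \;(= x\,\partial_x + y\,\partial_y): \qquad
\operatorname{div}X = \frac{1}{r}\,\frac{\partial}{\partial r}\bigl(r^2\bigr) = 2 .
\end{align*}
```

The value $2$ agrees with the Cartesian computation $\partial_x x + \partial_y y$.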
Definition 9.29. The gradient of a smooth function $f$ on a Riemannian manifold $(M,g)$ is defined to be
\[ \operatorname{grad}(f) := \sharp\,df, \]
and the Laplacian $\Delta$ is defined by $\Delta f := -\operatorname{div}(\operatorname{grad}(f))$ if $M$ is oriented. If $\Delta f = 0$, then $f$ is called a harmonic function.
The minus sign in the above definition is a matter of convention, and this choice of sign is popular with differential geometers. Ironically, this choice of sign makes the Laplacian a positive operator.

Lemma 9.30. If $M$ is an oriented semi-Riemannian manifold with metric volume element vol, then $i_X\mathrm{vol} = *\flat X$ for $X \in \mathfrak{X}(M)$.

Proof. We show that for fixed $p \in M$ we have $i_{v_1}\mathrm{vol} = *\flat v_1$ for all $v_1 \in T_pM$. First consider the case where $v_1$ is not a null vector and $\langle v_1,v_1\rangle = \epsilon_1 = \pm 1$. Since the maps $(v_2,\ldots,v_n) \mapsto i_{v_1}\mathrm{vol}(v_2,\ldots,v_n)$ and $(v_2,\ldots,v_n) \mapsto (*\flat v_1)(v_2,\ldots,v_n)$ are both alternating multilinear maps, it suffices to check that they are equal on some basis. We extend to an orthonormal basis $(v_1,v_2,\ldots,v_n)$ with dual basis $e^1,\ldots,e^n$. Then we wish to show that
\[ i_{v_1}\mathrm{vol}(v_{i_1},\ldots,v_{i_{n-1}}) = (*\flat v_1)(v_{i_1},\ldots,v_{i_{n-1}}) \]
for $1 \le i_1 < \cdots < i_{n-1} \le n$. But if $v_{i_1} = v_1$, then
\[ i_{v_1}\mathrm{vol}(v_{i_1},\ldots,v_{i_{n-1}}) = \mathrm{vol}(v_1,v_1,\ldots) = 0 \]
and
\[ (*\flat v_1)(v_{i_1},\ldots,v_{i_{n-1}}) = \epsilon_1(*e^1)(v_{i_1},\ldots,v_{i_{n-1}}) = \epsilon_1^2\,e^2\wedge\cdots\wedge e^n(v_1,\ldots) = 0. \]
On the other hand, if $v_{i_1} \ne v_1$, then $(v_{i_1},\ldots,v_{i_{n-1}}) = (v_2,\ldots,v_n)$ and
\[ i_{v_1}\mathrm{vol}(v_2,\ldots,v_n) = \mathrm{vol}(v_1,v_2,\ldots,v_n) = 1 \]
while
\[ (*\flat v_1)(v_2,\ldots,v_n) = \epsilon_1(*e^1)(v_2,\ldots,v_n) = \epsilon_1^2\,e^2\wedge\cdots\wedge e^n(v_2,\ldots,v_n) = 1. \]
Thus $i_{v_1}\mathrm{vol} = *\flat v_1$ for all nonnull vectors $v_1$. But since both sides are linear in $v_1$, this establishes the result. □
Proposition 9.31. If $M$ is an oriented semi-Riemannian manifold with metric volume element vol, then
\[ \operatorname{div}X = (-1)^{\operatorname{ind}g}\,{*d*\flat X} \]
for $X \in \mathfrak{X}(M)$.

Proof. By Lemma 9.30 and Cartan's formula (8.56), we have $\mathcal{L}_X\mathrm{vol} = d\,i_X\mathrm{vol} = d*\flat X$ so that $*(\operatorname{div}X\,\mathrm{vol}) = *d*\flat X$, or
\[ \operatorname{div}X\;{*\mathrm{vol}} = *d*\flat X, \]
or
\[ (-1)^{\operatorname{ind}g}\operatorname{div}X = *d*\flat X. \]
□
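Proposition 9.32. If $M$ is an oriented semi-Riemannian manifold with metric volume element vol, then
\[ \operatorname{div}fX = \langle\operatorname{grad}f,X\rangle + f\operatorname{div}X. \]

Proof. Write $\mathrm{vol} = \mu$ and compute as follows:
\begin{align*}
(\operatorname{div}fX)\,\mu &= \mathcal{L}_{fX}\mu = d\,i_{fX}\mu + i_{fX}\,d\mu = d(f\,i_X\mu) \\
&= df\wedge i_X\mu + f\,d\,i_X\mu = df\wedge i_X\mu + f(\operatorname{div}X)\,\mu \\
&= -i_X(df\wedge\mu) + (i_X df)\,\mu + f(\operatorname{div}X)\,\mu \\
&= df(X)\,\mu + f(\operatorname{div}X)\,\mu = (\langle X,\operatorname{grad}f\rangle + f\operatorname{div}X)\,\mu,
\end{align*}
where in the third line we use the fact that $i_X$ is a graded derivation and $df\wedge\mu = 0$. □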
Exercise 9.33. Show that the local expression for $\Delta f$ is
\[ \Delta f = -\frac{1}{\sqrt{\det g}}\sum_{j,k}\frac{\partial}{\partial x^j}\Bigl(g^{jk}\sqrt{\det g}\,\frac{\partial f}{\partial x^k}\Bigr). \]
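To get a feeling for this local expression and for the sign convention $\Delta = -\operatorname{div}\operatorname{grad}$, one can evaluate it on the unit sphere. The sketch below assumes the round metric on $S^2$ in spherical coordinates $(\theta,\varphi)$ away from the poles; the test function $f = \cos\theta$ is chosen here only for illustration.

```latex
\begin{align*}
g &= d\theta\otimes d\theta + \sin^2\theta\,d\varphi\otimes d\varphi, \qquad
\sqrt{\det g} = \sin\theta, \qquad g^{\theta\theta} = 1,\quad g^{\varphi\varphi} = \frac{1}{\sin^2\theta}, \\
\Delta f &= -\frac{1}{\sin\theta}\Bigl[\frac{\partial}{\partial\theta}\Bigl(\sin\theta\,\frac{\partial f}{\partial\theta}\Bigr)
   + \frac{\partial}{\partial\varphi}\Bigl(\frac{1}{\sin\theta}\,\frac{\partial f}{\partial\varphi}\Bigr)\Bigr], \\
f = \cos\theta:\quad
\Delta f &= -\frac{1}{\sin\theta}\,\frac{\partial}{\partial\theta}\bigl(-\sin^2\theta\bigr)
 = 2\cos\theta = 2f .
\end{align*}
```

So $\cos\theta$ is an eigenfunction with the positive eigenvalue $2$, consistent with the remark that this sign convention makes the Laplacian a positive operator.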
9.5. Integral Formulas

Definition 9.34. Let $S$ be a regular codimension one submanifold of an $n$-dimensional Riemannian manifold $(M,g)$. A smooth vector field $N$ along $S$ is called a unit normal field if $\langle N,N\rangle = 1$ and if $N(p) \perp T_pS$ for every $p \in S$.

Let $(M,g)$ be oriented by a metric volume form vol and suppose that a regular $(n-1)$-dimensional submanifold $S$ of $M$ has a (global) unit normal field $N$ along $S$. If $N$ is extended to a field $\tilde N$ on a neighborhood of $S$, then $i_{\tilde N}\mathrm{vol}$ is an $(n-1)$-form on this neighborhood. Clearly, the restriction of $i_{\tilde N}\mathrm{vol}$ to $S$ is independent of this extension, and so we can denote this restriction by $i_N\mathrm{vol}|_S$. Then we have
\[ (i_N\mathrm{vol}|_S)_p(v_1,\ldots,v_{n-1}) = \mathrm{vol}_p(N(p),v_1,\ldots,v_{n-1}) \]
for $v_1,\ldots,v_{n-1} \in T_pS$. This is a volume form on $S$ which orients $S$ and is clearly the metric volume form for the induced metric $\iota^*g$ on $S$. Note that $i_N\mathrm{vol}|_S$ is sometimes denoted by $i_N\mathrm{vol}$, but this is an abuse of notation since $i_N\mathrm{vol}(p)(v_1,\ldots,v_{n-1})$ makes sense for $v_1,\ldots,v_{n-1} \in T_pM$, but this is not what we want. On the other hand, $\int_S i_N\mathrm{vol}|_S$ is $\int_S\iota^*(i_N\mathrm{vol})$, where $\iota : S \hookrightarrow M$ is the inclusion map. In turn, this means the same thing as $\int_S i_N\mathrm{vol}$ by convention.
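Proposition 9.35. Let $S$ be a regular codimension one submanifold of an $n$-dimensional oriented Riemannian manifold $(M,g)$ and let $\iota : S \hookrightarrow M$ be the inclusion. Let $N$ be a smooth unit normal vector field along $S$. Let $\mathrm{vol}_M$ be the volume form of $M$ and $\mathrm{vol}_S$ the volume form of $S$ with respect to the orientation induced by $N$ and the metric $\iota^*g$. If $X$ is a smooth vector field along $S$, then $i_X\mathrm{vol}_M = \langle X,N\rangle\,\mathrm{vol}_S$.

Proof. Let $X^T := X - \langle X,N\rangle N$. Then
\[ i_X\mathrm{vol}_M = i_{X^T}\mathrm{vol}_M + i_{\langle X,N\rangle N}\mathrm{vol}_M = i_{X^T}\mathrm{vol}_M + \langle X,N\rangle\,i_N\mathrm{vol}_M, \]
and since $i_N\mathrm{vol}_M|_S = \mathrm{vol}_S$, it remains to show that $i_{X^T}\mathrm{vol}_M$ vanishes on $S$. But for $v_1,\ldots,v_{n-1} \in T_pS$,
\[ i_{X^T}\mathrm{vol}_M(v_1,\ldots,v_{n-1}) = \mathrm{vol}_M(X^T,v_1,\ldots,v_{n-1}) = 0 \]
since $X^T,v_1,\ldots,v_{n-1}$ all lie in the $(n-1)$-dimensional space $T_pS$ and so cannot be linearly independent. □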
Now suppose that an oriented $(M,g)$ has a boundary $\partial M$. Since $\partial M$ is basically a codimension one regular submanifold of $M$, the above constructions can be applied to $\partial M$ so that $\partial M$ is Riemannian with induced metric. Since $\partial M$ has a global outward pointing vector field, we may normalize to obtain a global outward pointing unit normal field $N$ along $\partial M$. The orientation induced on $\partial M$ by the metric volume form $i_N\mathrm{vol}_M|_{\partial M}$ is exactly the induced orientation on $\partial M$ described previously. We will denote $i_N\mathrm{vol}_M|_{\partial M}$ by $\mathrm{vol}_{\partial M}$ when the orientation and normal field are understood.

Exercise 9.36. Let $x : V \to U \subset S$ be a parametrization of an open subset of a surface $S$ in $\mathbb{R}^3$ and $N := (x_{u^1}\times x_{u^2})/\|x_{u^1}\times x_{u^2}\|$. Show that if $dS := i_N\mathrm{vol}|_S$, where vol is the usual volume form on $\mathbb{R}^3$, then
\[ x^*dS = \|x_{u^1}\times x_{u^2}\|\,du^1\wedge du^2. \]
We now introduce a special $(n-1)$-form $\sigma$ on $\mathbb{R}^n$ (we use the same symbol for all $n$ if no confusion arises). This form is used in many constructions. If $x^1,\ldots,x^n$ denote standard coordinates, then
\[ \sigma := \sum_{i=1}^n(-1)^{i-1}x^i\,dx^1\wedge\cdots\wedge\widehat{dx^i}\wedge\cdots\wedge dx^n, \tag{9.11} \]
where, as usual, the caret denotes omission. For example, on $\mathbb{R}^3$
\[ \sigma = x\,dy\wedge dz - y\,dx\wedge dz + z\,dx\wedge dy. \]

Proposition 9.37. If we use the outward pointing normal field $N$ on $S^{n-1}$ and the metric induced from $\mathbb{R}^n$, then the induced volume form on $S^{n-1}$ (the area form) is given by $\mathrm{vol}_{S^{n-1}} = \iota^*\sigma$, where $\iota : S^{n-1} \hookrightarrow \mathbb{R}^n$ is inclusion.
Proof. Let $v_i = \sum_j v_i^j\,\partial/\partial x^j|_x \in T_xS^{n-1}$. Identify elements of $T_xS^{n-1}$ with vectors in $\mathbb{R}^n$ so that $v_i$ is the column vector $(v_i^1,\ldots,v_i^n)^T$. Then we use expansion along the first column by cofactors to obtain
\begin{align*}
\mathrm{vol}_{S^{n-1}}(v_1,\ldots,v_{n-1}) &= \det(x,v_1,\ldots,v_{n-1}) \\
&= \sum_{i=1}^n(-1)^{i-1}x^i\det(M_i) \\
&= \sum_{i=1}^n(-1)^{i-1}x^i\bigl(dx^1\wedge\cdots\wedge\widehat{dx^i}\wedge\cdots\wedge dx^n\bigr)(v_1,\ldots,v_{n-1}),
\end{align*}
where $M_i$ is the submatrix obtained by deleting the first column and $i$-th row. □
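The proposition can be verified by hand on $S^2$. The following is a sketch assuming the standard spherical parametrization $x = \sin\theta\cos\varphi$, $y = \sin\theta\sin\varphi$, $z = \cos\theta$ away from the poles, which is compatible with the orientation induced by the outward normal.

```latex
\begin{align*}
x\,dy\wedge dz &= \sin^3\theta\cos^2\varphi\;d\theta\wedge d\varphi, \\
-\,y\,dx\wedge dz &= \sin^3\theta\sin^2\varphi\;d\theta\wedge d\varphi, \\
z\,dx\wedge dy &= \sin\theta\cos^2\theta\;d\theta\wedge d\varphi, \\
\iota^*\sigma &= \bigl(\sin^3\theta + \sin\theta\cos^2\theta\bigr)\,d\theta\wedge d\varphi
 = \sin\theta\,d\theta\wedge d\varphi, \qquad
\int_{S^2}\iota^*\sigma = \int_0^{2\pi}\!\!\int_0^{\pi}\sin\theta\,d\theta\,d\varphi = 4\pi .
\end{align*}
```

This is the familiar area element of the unit sphere, and the total area $4\pi$ comes out as expected.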
Exercise 9.38. Show that the volume form on $S^{n-1}(R) = \{\sum_{i=1}^n(x^i)^2 = R^2\}$ is $\frac{1}{R}\iota^*\sigma$, where $\iota : S^{n-1}(R) \hookrightarrow \mathbb{R}^n$ is the inclusion map.
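Theorem 9.39. Let $M$ be an oriented Riemannian $n$-manifold with boundary. For any $X \in \mathfrak{X}(M)$ with compact support,
\[ \int_M\operatorname{div}X\,\mathrm{vol}_M = \int_{\partial M}\langle X,N\rangle\,\mathrm{vol}_{\partial M}, \]
where $N$ is the outward pointing unit normal field on $\partial M$.

Proof. We have $\int_M\operatorname{div}X\,\mathrm{vol}_M = \int_M d(i_X\mathrm{vol}_M) = \int_{\partial M}i_X\mathrm{vol}_M$, which is equal to $\int_{\partial M}\langle X,N\rangle\,\mathrm{vol}_{\partial M}$ by Proposition 9.35. □

Corollary 9.40. Let $M$ be a compact oriented Riemannian $n$-manifold with boundary and $X \in \mathfrak{X}(M)$. Then for $f \in C^\infty(M)$ we have
\[ \int_M\langle X,\operatorname{grad}f\rangle\,\mathrm{vol}_M + \int_M f\operatorname{div}X\,\mathrm{vol}_M = \int_{\partial M}f\langle X,N\rangle\,\mathrm{vol}_{\partial M}. \]

Proof. We have
\[ \int_M\bigl(\langle\operatorname{grad}f,X\rangle + f\operatorname{div}X\bigr)\,\mathrm{vol}_M = \int_M\operatorname{div}(fX)\,\mathrm{vol}_M = \int_{\partial M}f\langle X,N\rangle\,\mathrm{vol}_{\partial M}. \]
□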
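A minimal sanity check of the divergence theorem: the sketch below assumes $M$ is the closed unit disk in $\mathbb{R}^2$ with the Euclidean metric and takes the radial field $X = x\,\partial_x + y\,\partial_y$, a choice made here purely for illustration.

```latex
\begin{align*}
\operatorname{div}X &= 2, &
\int_M \operatorname{div}X\,\mathrm{vol}_M &= 2\cdot\operatorname{area}(M) = 2\pi, \\
N &= X|_{\partial M}, \quad \langle X,N\rangle = 1, &
\int_{\partial M}\langle X,N\rangle\,\mathrm{vol}_{\partial M} &= \operatorname{length}(S^1) = 2\pi .
\end{align*}
```

Both sides equal $2\pi$, as Theorem 9.39 predicts.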
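Theorem 9.41. Let $M$ be an oriented Riemannian $n$-manifold with boundary and $f,g \in C^\infty(M)$. Then
\[ \int_M\langle\operatorname{grad}f,\operatorname{grad}g\rangle\,\mathrm{vol}_M - \int_M f\Delta g\,\mathrm{vol}_M = \int_{\partial M}f\langle\operatorname{grad}g,N\rangle\,\mathrm{vol}_{\partial M} \]
and
\[ \int_M(-f\Delta g + g\Delta f)\,\mathrm{vol}_M = \int_{\partial M}\bigl[f\langle\operatorname{grad}g,N\rangle - g\langle\operatorname{grad}f,N\rangle\bigr]\,\mathrm{vol}_{\partial M}, \]
where $N$ is the outward unit normal field.

Proof. We have
\[ -\int_M f\Delta g\,\mathrm{vol}_M = \int_M f\operatorname{div}(\operatorname{grad}g)\,\mathrm{vol}_M = \int_{\partial M}f\langle\operatorname{grad}g,N\rangle\,\mathrm{vol}_{\partial M} - \int_M\langle\operatorname{grad}g,\operatorname{grad}f\rangle\,\mathrm{vol}_M \]
by Corollary 9.40. The second formula follows from the first by interchanging the roles of $f$ and $g$ and subtracting. □

The integral equations of the previous theorem are called Green's first and second formulas respectively (when comparing with other versions in the literature, do not forget our convention: $\Delta f := -\operatorname{div}\operatorname{grad}f$). If the boundary of $M$ is empty, then the boundary terms are zero, so that for instance $\int_M f\Delta g\,\mathrm{vol}_M = \int_M g\Delta f\,\mathrm{vol}_M$. We close this section with an application of Green's formulas.

Theorem 9.42 (Hopf). Let $(M,g)$ be a compact, connected and oriented Riemannian $n$-manifold without boundary. If $f \in C^\infty(M)$ and $\Delta f \ge 0$ (or $\Delta f \le 0$), then $f$ is a constant function.

Proof. Suppose $\Delta f \ge 0$. Then integration gives
\[ \int_M\Delta f\,\mathrm{vol}_M = -\int_M\operatorname{div}(\operatorname{grad}f)\,\mathrm{vol}_M = 0 \]
by Theorem 9.39, since $\partial M$ is empty, which means that $\Delta f \equiv 0$ on $M$. Now use Green's first formula with $f = g$ to get
\[ \int_M\|\operatorname{grad}f\|^2\,\mathrm{vol}_M = \int_M f\Delta f\,\mathrm{vol}_M = 0, \]
so that $\|\operatorname{grad}f\|^2 = 0$ on $M$. Thus $f$ is constant since $M$ is connected. □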
9.6. The Hodge Decomposition

In this section we follow [War]. Let $(M,g)$ be an oriented semi-Riemannian $n$-manifold. (However, we soon restrict to the compact Riemannian case.) Let vol denote the metric volume form. For each $k$, the scalar products induced by $g_p$ on $\bigwedge^k T_p^*M$ for each $p$ combine to give a symmetric $C^\infty(M)$-bilinear map
\[ \langle\cdot|\cdot\rangle : \Omega^k(M)\times\Omega^k(M) \to C^\infty(M) \]
with $\langle\alpha|\beta\rangle(p) := \langle\alpha(p)|\beta(p)\rangle_p = g_p(\alpha(p),\beta(p))$. We see that $\langle\alpha|\beta\rangle\,\mathrm{vol} = \alpha\wedge *\beta$. We can put a scalar product $(\cdot|\cdot)$ on the space $\Omega(M) = \sum_k\Omega^k(M)$ by letting $(\alpha|\beta) = 0$ for $\alpha \in \Omega^{k_1}(M)$ and $\beta \in \Omega^{k_2}(M)$ with $k_1 \ne k_2$ and letting
\[ (\alpha|\beta) := \int_M\alpha\wedge *\beta = \int_M\langle\alpha|\beta\rangle\,\mathrm{vol} \]
for $\alpha,\beta \in \Omega^k(M)$ for which the integral is defined.

Definition 9.43. Let $(M,g)$ be an oriented semi-Riemannian $n$-manifold. For each $k$ with $0 \le k \le n = \dim M$, the codifferential $\delta : \Omega^k(M) \to \Omega^{k-1}(M)$ is defined by
\[ \delta := (-1)^{\operatorname{ind}(g)}(-1)^{n(k+1)+1}\,{*d*} \]
and $\delta = 0$ on $\Omega^0(M)$. These operators combine to give a linear map $\Omega(M) \to \Omega(M)$, which is also denoted $\delta$.

Remark 9.44. Notice that if $M$ is nonorientable, then $*$ is still defined locally by choosing an orientation valid on a chart domain. But $*$ occurs twice in the formula for $\delta$, so any sign ambiguities cancel and thus $\delta$ is globally well-defined even if $M$ is nonorientable!

Proposition 9.45. The codifferential $\delta$ is the formal adjoint of the exterior derivative on $\Omega(M)$. That is,
\[ (d\alpha|\beta) = (\alpha|\delta\beta) \]
for all $\alpha,\beta$ in $\Omega(M)$ with compact support in the interior of $M$.

Proof. It suffices to check that $(d\alpha|\beta) = (\alpha|\delta\beta)$ for $\alpha$ a $(k-1)$-form and $\beta$ a $k$-form. In this case, $d*\beta$ is an $(n-k+1)$-form and so
\begin{align*}
*\delta\beta &= *\bigl((-1)^{\operatorname{ind}(g)}(-1)^{n(k+1)+1}\,{*d*\beta}\bigr) = (-1)^{\operatorname{ind}(g)}(-1)^{n(k+1)+1}\,{**d*\beta} \\
&= (-1)^{n(k+1)+1}(-1)^{(n-k+1)(k-1)}\,d*\beta = (-1)^{k^2}\,d*\beta,
\end{align*}
where we have used that $n(k+1)+1+(n-k+1)(k-1) = 2k+2kn-k^2 \equiv k^2 \bmod 2$. Then we have
\begin{align*}
d(\alpha\wedge *\beta) &= d\alpha\wedge *\beta + (-1)^{k-1}\alpha\wedge d*\beta \\
&= d\alpha\wedge *\beta + (-1)^{k-1}(-1)^{k^2}\alpha\wedge *\delta\beta \\
&= d\alpha\wedge *\beta - \alpha\wedge *\delta\beta
\end{align*}
since $k-1+k^2 = k(k+1)-1$, which is always odd. Now we integrate and use Stokes' theorem to get the desired result:
\[ \int_M d\alpha\wedge *\beta = \int_M\alpha\wedge *\delta\beta. \]
□
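For orientation, the sign in the definition of $\delta$ can be unwound in the most familiar setting. The sketch below assumes $M$ is an open subset of $\mathbb{R}^3$ with the standard metric (so $\operatorname{ind}(g) = 0$, $n = 3$) and $\omega = f_1\,dx + f_2\,dy + f_3\,dz$, so that $k = 1$ and $(-1)^{n(k+1)+1} = -1$.

```latex
\begin{align*}
\delta\omega &= -\,{*d*\omega}
 = -\,{*}\,d\bigl(f_1\,dy\wedge dz + f_2\,dz\wedge dx + f_3\,dx\wedge dy\bigr)
 = -\Bigl(\frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z}\Bigr), \\
\delta\,df &= -\Bigl(\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}\Bigr)
 = -\operatorname{div}\operatorname{grad}f .
\end{align*}
```

So on 1-forms $\delta$ is minus the divergence of the corresponding vector field, and $\delta d$ on functions agrees with the Laplacian $\Delta f = -\operatorname{div}\operatorname{grad}f$ of Definition 9.29.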
Obviously, $\delta$ is defined on $\Omega^k(U)$ for open sets $U$ in $M$ and $\delta$ is natural with respect to restriction.

Definition 9.46. For $0 \le k \le n$, the differential operator $\Delta : \Omega(M) \to \Omega(M)$ defined by $\Delta = \delta d + d\delta$ is called the Laplace-Beltrami operator. For each $k$, the restriction of $\Delta$ to $\Omega^k(M)$ is also called the Laplace-Beltrami operator (on $k$-forms). If $\Delta\omega = 0$, we call $\omega$ a harmonic form.

Notice that by Remark 9.44 the Laplace-Beltrami operator is defined whether or not $M$ is orientable. On $\Omega^0(M) = C^\infty(M)$ the operator $\Delta$ reduces to the Laplacian defined earlier.

Proposition 9.47. The following assertions hold:

(i) For all $\alpha,\beta \in \Omega(M)$, we have $(\Delta\alpha|\beta) = (d\alpha|d\beta) + (\delta\alpha|\delta\beta)$.
(ii) $\Delta$ is formally self-adjoint. That is, $(\Delta\alpha|\beta) = (\alpha|\Delta\beta)$ for all $\alpha,\beta \in \Omega(M)$ with compact support in the interior of $M$.

Proof. By Proposition 9.45 we have
\[ (\Delta\alpha|\beta) = (\delta d\alpha + d\delta\alpha|\beta) = (\delta d\alpha|\beta) + (d\delta\alpha|\beta) = (d\alpha|d\beta) + (\delta\alpha|\delta\beta), \]
which proves (i). But by symmetry,
\[ (\Delta\alpha|\beta) = (d\alpha|d\beta) + (\delta\alpha|\delta\beta) = (\alpha|\delta d\beta + d\delta\beta) = (\alpha|\Delta\beta), \]
which proves (ii). □
In the remainder of this section, we restrict attention to an oriented compact Riemannian $n$-manifold $(M,g)$ without boundary. In this case, $(\alpha|\beta) = \int_M\alpha\wedge *\beta$ defines a positive definite inner product on $\Omega(M)$. We write $\|\alpha\| := (\alpha|\alpha)^{1/2}$.

Proposition 9.48. If $(M,g)$ is compact, oriented and Riemannian, then $\Delta\omega = 0$ if and only if $d\omega = 0$ and $\delta\omega = 0$.
Proof. This follows from $(\Delta\omega|\omega) = (d\omega|d\omega) + (\delta\omega|\delta\omega)$ since $(\cdot|\cdot)$ is positive definite in the Riemannian case. □
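As a small illustration of this criterion, one can determine the harmonic forms on the circle. The sketch assumes $M = S^1$ with its standard metric, angular coordinate $\theta$, and $\mathrm{vol} = d\theta$, so that $*1 = d\theta$, $*d\theta = 1$, and $\delta = -{*d*}$ on 1-forms (here $n = k = 1$).

```latex
\begin{align*}
\omega = h(\theta)\,d\theta:\qquad d\omega &= 0, \qquad
\delta\omega = -\,{*d*}\bigl(h\,d\theta\bigr) = -\,{*}\,(h'\,d\theta) = -h'(\theta), \\
\Delta\omega = 0 &\iff h' = 0 \iff \omega = c\,d\theta \ \text{ for a constant } c, \\
f \in C^\infty(S^1):\qquad \Delta f &= \delta\,df = -f'', \qquad
\Delta f = 0 \iff f \ \text{is constant (by periodicity)} .
\end{align*}
```

Thus the harmonic 0-forms and the harmonic 1-forms on $S^1$ each form a one-dimensional space, matching the de Rham cohomology of the circle.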
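Now we wish to consider the equation $\Delta\omega = \alpha$ for $\alpha \in \Omega^k(M)$. First notice that if $\Delta\omega = \alpha$ holds, then for any $\beta \in \Omega^k(M)$ we have $(\Delta\omega|\beta) = (\omega|\Delta\beta) = (\alpha|\beta)$. Then since $|(\omega|\Delta\beta)| = |(\alpha|\beta)| \le \|\alpha\|\,\|\beta\|$, the map $\ell_\omega : \beta \mapsto (\omega|\Delta\beta)$ is a bounded linear functional. Employing a common idea from the theory of differential equations, we make the following definition:

Definition 9.49. A bounded linear functional $\ell : \Omega^k(M) \to \mathbb{R}$ is called a weak solution of the equation $\Delta\omega = \alpha$ if
\[ \ell(\Delta\beta) = (\alpha|\beta) \]
for all $\beta \in \Omega^k(M)$.

Notice that if a weak solution $\ell$ is represented by $\omega$, that is, $\ell(\beta) = (\omega|\beta)$ for all $\beta$, then $\omega$ must be an ordinary solution since in that case we have
\[ (\Delta\omega|\beta) = (\omega|\Delta\beta) = \ell(\Delta\beta) = (\alpha|\beta) \]
for all $\beta$, which means that $\Delta\omega = \alpha$. We will use the following powerful regularity result (see [War] for a proof):

Theorem 9.50. Let $\alpha \in \Omega^k(M)$. If $\ell$ is a weak solution of $\Delta\omega = \alpha$, then there exists $\omega \in \Omega^k(M)$ such that
\[ \ell(\beta) = (\omega|\beta) \]
for all $\beta \in \Omega^k(M)$ and hence $\Delta\omega = \alpha$.

We will also need a rather technical result whose proof can also be found in [War].

Proposition 9.51. Let $\{\alpha_i\}_i$ be a sequence in $\Omega^k(M)$. Suppose that for some $C > 0$ we have
\[ \|\alpha_i\| \le C \quad\text{and}\quad \|\Delta\alpha_i\| \le C \]
for all $i$. Then $\{\alpha_i\}_i$ has a Cauchy subsequence.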
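Let $\mathcal{H}^k = \{\omega \in \Omega^k(M) : \Delta\omega = 0\}$ and let $(\mathcal{H}^k)^\perp$ denote $\{\beta \in \Omega^k(M) : (\beta|\omega) = 0 \text{ for all } \omega \in \mathcal{H}^k\}$.

Lemma 9.52. There exists a constant $C > 0$ such that $\|\beta\| \le C\|\Delta\beta\|$ for all $\beta \in (\mathcal{H}^k)^\perp$.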
Proof. Suppose there is no such constant $C$. Then we may find a sequence $\{\beta_i\}_i \subset (\mathcal{H}^k)^\perp$ such that $\|\beta_i\| = 1$ while $\lim_{i\to\infty}\|\Delta\beta_i\| = 0$. By Proposition 9.51, $\{\beta_i\}_i$ has a Cauchy subsequence, which we may as well assume to be $\{\beta_i\}_i$. Hence we have that for any fixed $\theta \in \Omega^k(M)$, the sequence $\{(\beta_i|\theta)\}$ is Cauchy in $\mathbb{R}$ and so converges to some number. Now we define $\ell : \Omega^k(M) \to \mathbb{R}$ by
\[ \ell(\theta) := \lim_{i\to\infty}(\beta_i|\theta). \]
Then,
\[ \ell(\Delta\theta) = \lim_{i\to\infty}(\beta_i|\Delta\theta) = \lim_{i\to\infty}(\Delta\beta_i|\theta) = 0, \]
and since $\ell$ is clearly bounded, it is a weak solution to $\Delta\omega = 0$. By Theorem 9.50, there must exist $\beta \in \Omega^k(M)$ such that $\ell(\theta) = (\beta|\theta)$ for all $\theta \in \Omega^k(M)$. It follows that $(\beta|\theta) = \lim_{i\to\infty}(\beta_i|\theta)$ for all $\theta$ and so $\beta_i \to \beta$ since
\[ \|\beta_i - \beta\|^2 = (\beta_i - \beta|\beta_i - \beta) = (\beta_i|\beta_i) - 2(\beta_i|\beta) + (\beta|\beta) \to 0. \]
Since $\|\beta_i\| = 1$, we must have $\|\beta\| = 1$ and of course $\beta \in (\mathcal{H}^k)^\perp$. But by Theorem 9.50, $\Delta\beta = 0$ so $\beta \in \mathcal{H}^k\cap(\mathcal{H}^k)^\perp = 0$, which contradicts $\|\beta\| = 1$. We conclude that $C$ exists after all. □

Theorem 9.53 (Hodge decomposition). Let $(M,g)$ be compact and oriented (and without boundary). For each $k$ with $0 \le k \le n = \dim M$, the space
of harmonic $k$-forms $\mathcal{H}^k$ is finite-dimensional. Furthermore, we have an orthogonal decomposition of $\Omega^k(M)$,
\begin{align*}
\Omega^k(M) &= \Delta(\Omega^k(M))\oplus\mathcal{H}^k \\
&= d\delta(\Omega^k(M))\oplus\delta d(\Omega^k(M))\oplus\mathcal{H}^k \\
&= d(\Omega^{k-1}(M))\oplus\delta(\Omega^{k+1}(M))\oplus\mathcal{H}^k.
\end{align*}

Proof. If $\mathcal{H}^k$ were infinite-dimensional, then it would contain an infinite orthonormal sequence $\{\omega_i\}_i$. In this case, we would have
\[ \|\omega_i - \omega_j\|^2 = 2 \]
for all $i,j$ with $i \ne j$. By Proposition 9.51, this sequence would contain a Cauchy subsequence. But this contradicts the above equation. Thus $\mathcal{H}^k$ must be finite-dimensional.

We now prove the orthogonal decomposition $\Omega^k(M) = \Delta(\Omega^k(M))\oplus\mathcal{H}^k$. The other two decompositions can be derived from the first, and we leave this as a problem for the reader (Problem 9). Choose an orthonormal basis $\omega_1,\ldots,\omega_d$ for $\mathcal{H}^k$. If $\alpha \in \Omega^k(M)$, then we may write
\[ \alpha = \beta + \sum_{i=1}^d(\omega_i|\alpha)\,\omega_i, \]
where $\beta \in (\mathcal{H}^k)^\perp$. It is easy to show that this decomposition is unique, so we have the orthogonal decomposition $\Omega^k(M) = (\mathcal{H}^k)^\perp\oplus\mathcal{H}^k$ and our task is to show that $(\mathcal{H}^k)^\perp = \Delta(\Omega^k(M))$. Since $(\Delta\alpha|\omega) = (\alpha|\Delta\omega) = 0$ whenever $\omega \in \mathcal{H}^k$, we see that $\Delta(\Omega^k(M)) \subset (\mathcal{H}^k)^\perp$. Now let $\alpha \in (\mathcal{H}^k)^\perp$ and define a linear functional $\ell$ on $\Delta(\Omega^k(M))$ by
\[ \ell(\Delta\theta) := (\alpha|\theta). \]
This is well-defined since if $\Delta\theta_1 = \Delta\theta_2$, then $\theta_1 - \theta_2 \in \mathcal{H}^k$ and so $(\alpha|\theta_1) - (\alpha|\theta_2) = (\alpha|\theta_1 - \theta_2) = 0$. We show that $\ell$ is bounded. Let $\phi := \theta - H^k(\theta)$, where $H^k : \Omega^k(M) \to \mathcal{H}^k$ is the orthogonal projection. Then using Lemma 9.52 we have
\[ |\ell(\Delta\theta)| = |\ell(\Delta\phi)| = |(\alpha|\phi)| \le \|\alpha\|\,\|\phi\| \le C\|\alpha\|\,\|\Delta\phi\| = C\|\alpha\|\,\|\Delta\theta\|. \]
By the Hahn-Banach theorem, the functional $f$ extends to a bounded functional $\ell$ defined on all of $\Omega^k(M)$, which is then a weak solution of $\Delta\omega = \alpha$. By Theorem 9.50, there is an $\omega \in \Omega^k(M)$ with $\Delta\omega = \alpha$, so $(\mathcal{H}^k)^\perp \subset \Delta(\Omega^k(M))$ and hence $(\mathcal{H}^k)^\perp = \Delta(\Omega^k(M))$. $\square$

In order to take full advantage of the Hodge decomposition, we now introduce a so-called "Green's operator" $G_k : \Omega^k(M) \to (\mathcal{H}^k)^\perp$. We simply define $G_k(\alpha)$ to be the unique solution in $(\mathcal{H}^k)^\perp$ of $\Delta\omega = \alpha - H_k(\alpha)$, where $H_k : \Omega^k(M) \to \mathcal{H}^k$ is the orthogonal projection as above. The $G_k$ combine to give a map $G : \Omega(M) \to \bigoplus_k (\mathcal{H}^k)^\perp$, also called the Green's operator.

Lemma 9.54. Let $G_k : \Omega^k(M) \to (\mathcal{H}^k)^\perp$ be the Green's operator defined above for each $k$. Then
(i) $G_k$ is formally self-adjoint;
(ii) if $L : \Omega^k(M) \to \Omega^r(M)$ is linear and commutes with $\Delta$, then $G$ commutes with $L$ (that is, $L \circ G_k = G_r \circ L$). In particular, $G$ commutes with $d$ and $\delta$.

Proof. We have
\[ (G_k(\alpha)|\beta) = (G_k(\alpha)|\beta - H_k(\beta)) = (G_k(\alpha)|\Delta G_k(\beta)) = (\Delta G_k(\alpha)|G_k(\beta)) = (\alpha - H_k(\alpha)|G_k(\beta)) = (\alpha|G_k(\beta)), \]
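so $G$ is self-adjoint. For each $j$, let $\pi_j : \Omega^j(M) \to (\mathcal{H}^j)^\perp$ denote orthogonal projection (thus $\pi_j + H_j = \mathrm{id}_{\Omega^j}$). Now suppose that $L : \Omega^k(M) \to \Omega^r(M)$ is linear and commutes with $\Delta$. Notice that by definition we have $G_k = (\Delta|_{(\mathcal{H}^k)^\perp})^{-1} \circ \pi_k$. The fact that $L\Delta = \Delta L$ implies that $L(\mathcal{H}^k) \subset \mathcal{H}^r$. Also, since $\Delta(\Omega^k(M)) = (\mathcal{H}^k)^\perp$, we have $L((\mathcal{H}^k)^\perp) \subset (\mathcal{H}^r)^\perp$. Thus
\[ L \circ \pi_k = \pi_r \circ L, \qquad L \circ (\Delta|_{(\mathcal{H}^k)^\perp}) = (\Delta|_{(\mathcal{H}^r)^\perp}) \circ L|_{(\mathcal{H}^k)^\perp}, \]
and
\[ (\Delta|_{(\mathcal{H}^r)^\perp})^{-1} \circ L|_{(\mathcal{H}^k)^\perp} = L|_{(\mathcal{H}^k)^\perp} \circ (\Delta|_{(\mathcal{H}^k)^\perp})^{-1}. \]
It follows that $G$ commutes with $L$. $\square$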
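To make the operators $H$ and $G$ concrete, here is a small worked example that is not part of the text above: functions ($0$-forms) on the circle $S^1$ with its standard metric. It assumes the sign convention $\Delta = d\delta + \delta d$, for which $\Delta f = -f''$ on functions, and it uses the Fourier expansion of a smooth function; the coefficients $a_n$ are notation introduced only for this illustration.

```latex
\[
f(\theta) = \sum_{n \in \mathbb{Z}} a_n e^{in\theta},
\qquad
\Delta f = -f'' = \sum_{n \in \mathbb{Z}} n^2 a_n e^{in\theta}.
\]
% Harmonic functions are the constants, so H_0 picks out the mean value:
\[
H_0(f) = a_0,
\qquad
G_0(f) = \sum_{n \neq 0} \frac{a_n}{n^2}\, e^{in\theta},
\qquad
\Delta G_0(f) = \sum_{n \neq 0} a_n e^{in\theta} = f - H_0(f).
\]
% For beta orthogonal to the constants (a_0 = 0), since n^4 >= 1 for n != 0:
\[
\|\beta\|^2 = 2\pi \sum_{n \neq 0} |a_n|^2
\le 2\pi \sum_{n \neq 0} n^4 |a_n|^2 = \|\Delta\beta\|^2 .
\]
```

In this special case the constant of Lemma 9.52 can be taken to be $C = 1$, and $G_0$ visibly commutes with differentiation because both act diagonally on the exponentials.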
We note in passing that $G$ maps bounded sequences into sequences that have Cauchy subsequences. Indeed, suppose that $\{\alpha_i\}_{i=1}^\infty \subset \Omega^k(M)$ is a sequence with $\|\alpha_i\| \le K$ for some constant $K$. If $\beta_i := G(\alpha_i)$, then using Lemma 9.52 (with its constant $C$) we have
\[ \|\beta_i\| \le C\,\|\Delta\beta_i\| = C\,\|\alpha_i - H(\alpha_i)\| \le C\,\|\alpha_i\| \le CK, \]
and so by Proposition 9.51, $\{\beta_i\}$ has a Cauchy subsequence.

Theorem 9.55. Let $(M, g)$ be a compact Riemannian manifold (without boundary). Then each de Rham cohomology class contains a unique harmonic representative.

Proof. First assume that $M$ is orientable and fix an orientation. Let $\alpha \in \Omega^k(M)$ and use the Hodge decomposition and the definition of $G$ to obtain
\[ \alpha = \Delta G_k(\alpha) + H_k(\alpha) = d\delta G_k(\alpha) + \delta d G_k(\alpha) + H_k(\alpha). \]
Then since $G$ commutes with $d$, we have
\[ \alpha = d\delta G_k(\alpha) + \delta G_{k+1}(d\alpha) + H_k(\alpha). \]
So if $d\alpha = 0$, then $\alpha - H_k(\alpha) = d\delta G_k(\alpha)$, and so $H_k(\alpha)$ represents the same cohomology class as $\alpha$. To show uniqueness, suppose that $\alpha_1$ and $\alpha_2$ are both harmonic and in the same class so that $\alpha_2 - \alpha_1 = d\beta$ for some $\beta$. Then we have $0 = d\beta + (\alpha_1 - \alpha_2)$. But $\alpha_1 - \alpha_2$ is orthogonal to $d\beta$ since by Proposition 9.48
\[ (d\beta|\alpha_1 - \alpha_2) = (\beta|\delta\alpha_1 - \delta\alpha_2) = (\beta|0) = 0. \]
Hence $\|d\beta\|^2 = -(d\beta|\alpha_1 - \alpha_2) = 0$, so $d\beta = 0$ and $\alpha_1 = \alpha_2$.

Next suppose that $M$ is nonorientable. If $\pi : M_{or} \to M$ is the 2-fold orientation cover of $M$ (Section 8.7), then there is a unique metric on $M_{or}$ such that $\pi$ restricts to an isometry on sufficiently small open sets ($\pi$ is a Riemannian covering). Now suppose that $c \in H^k(M)$ is a de Rham cohomology class and consider the class $\pi^* c \in H^k(M_{or})$ (see the discussion after Definition 8.41). Since $M_{or}$ is orientable, $\pi^* c$ is uniquely represented
by a harmonic form $\bar\eta$ on $M_{or}$. If $\tau : M_{or} \to M_{or}$ is the involution which transposes the two points in each fiber, then it is easy to see that $\tau$ is an isometry so that $\tau^*\bar\eta$ is also harmonic. It follows from the identity $\pi \circ \tau = \pi$ that $\tau^*\bar\eta$ is also a representative of the cohomology class $\pi^* c$. Indeed, $[\tau^*\bar\eta] = \tau^*[\bar\eta] = \tau^*\pi^* c = \pi^* c$. So by uniqueness, we must have $\tau^*\bar\eta = \bar\eta$, which is exactly the condition that guarantees that $\bar\eta = \pi^*\eta$ for some $\eta \in \Omega^k(M)$. But $\pi$ is a local isometry and so by Problem 5, $\eta$ must be harmonic. We show that $\eta$ represents $c$. Suppose that $c = [\mu]$ for $\mu \in \Omega^k(M)$. Then both $\pi^*\mu$ and $\pi^*\eta$ represent the class $\pi^* c$, so $\pi^*(\eta - \mu) = d\beta$ for some $\beta$. But then, since $\pi \circ \tau = \pi$, the form $d\beta = \pi^*(\eta - \mu)$ is $\tau$-invariant, which means that $d\beta = \tau^* d\beta = d\tau^*\beta$. Then
\[ \pi^*(\eta - \mu) = d\beta = \tfrac{1}{2}(d\beta + d\tau^*\beta) = d\left(\tfrac{1}{2}(\beta + \tau^*\beta)\right). \]
Since $\tau$ is an involution, $\beta + \tau^*\beta$ is $\tau$-invariant, that is, $\tau^*(\beta + \tau^*\beta) = \beta + \tau^*\beta$, so there exists $\alpha \in \Omega^{k-1}(M)$ with $\pi^*\alpha = \tfrac{1}{2}(\beta + \tau^*\beta)$. Thus
\[ \pi^*(\eta - \mu) = d\pi^*\alpha = \pi^* d\alpha, \]
and since $\pi$ is a local diffeomorphism, we must have $\eta - \mu = d\alpha$, which means that $[\eta] = [\mu] = c$. The uniqueness of the harmonic representative $\eta$ is clear. $\square$

Notice that if $\tau$ is the involution of $M_{or}$ introduced in the above proof, then $\pi^*\alpha = \tau^*\pi^*\alpha$ for any $\alpha \in \Omega^k(M)$.

Corollary 9.56. If $M$ is compact, then its cohomology spaces $H^k(M)$ are finite-dimensional.

Proof. Any manifold can be given a Riemannian metric, and so if $M$ is compact and oriented, then Theorems 9.53 and 9.55 combine to give the result. Now suppose that $M$ is not orientable. Let $\alpha \in \Omega^k(M)$ be closed and suppose that $\pi^*\alpha = d\beta$ so that $\pi^*[\alpha] = 0$. Then $\pi^*\alpha = \tau^*\pi^*\alpha = \tau^* d\beta = d\tau^*\beta$. Now $\pi^*\theta = \tfrac{1}{2}(\beta + \tau^*\beta)$ for some $\theta \in \Omega^{k-1}(M)$ and so
\[ \pi^*\alpha = d\left(\tfrac{1}{2}(\beta + \tau^*\beta)\right) = d\pi^*\theta = \pi^* d\theta. \]
Since $\pi$ is a local diffeomorphism, we have $\alpha = d\theta$ or $[\alpha] = 0$. Thus the map $\pi^* : H^k(M) \to H^k(M_{or})$ has trivial kernel and is injective. The finite-dimensionality of $H^k(M)$ now follows from that of $H^k(M_{or})$. $\square$

The results above allow us to give a quick proof of Poincaré duality for de Rham cohomology. Choose an orientation for $M$. We define a bilinear pairing $H^k(M) \times H^{n-k}(M) \to \mathbb{R}$ as follows:
\[ (([\omega], [\eta])) := \int_M \omega \wedge \eta. \]
This is well-defined since if $\omega_1 = \omega + d\alpha$ and $\eta_1 = \eta + d\beta$ (with $\omega$, $\eta$ closed and $k = \deg\omega$), then by Stokes' theorem
\begin{align*}
\int_M \omega_1 \wedge \eta_1 &= \int_M \omega \wedge \eta + \int_M d\alpha \wedge \eta + \int_M \omega \wedge d\beta + \int_M d\alpha \wedge d\beta \\
&= \int_M \omega \wedge \eta + \int_M d(\alpha \wedge \eta) + (-1)^k \int_M d(\omega \wedge \beta) + \int_M d(\alpha \wedge d\beta) = \int_M \omega \wedge \eta.
\end{align*}
We wish to show that the pairing defined above is nondegenerate. Thus, given any nonzero $[\omega] \in H^k(M)$ we wish to produce an $[\eta] \in H^{n-k}(M)$ such that $(([\omega], [\eta])) \neq 0$. Choose a Riemannian metric and metric volume element for $M$. We may assume that $\omega$ is the harmonic representative for $[\omega]$. But it is easily checked that $\Delta$ commutes with $*$ and so $*\omega$ is also harmonic. In particular, $*\omega$ is closed and so represents a cohomology class $[*\omega]$. Then $[*\omega]$ is the desired class since
\[ (([\omega], [*\omega])) = \int_M \omega \wedge *\omega = \int_M \langle \omega, \omega \rangle \operatorname{vol} = \|\omega\|^2 > 0. \]
For each fixed $[\omega] \in H^k(M)$, we have the linear map $\iota_{[\omega]} \in (H^{n-k}(M))^*$ given by $\iota_{[\omega]}([\eta]) := (([\omega], [\eta]))$. Since the pairing is nondegenerate, the map $[\omega] \mapsto \iota_{[\omega]}$ defines an isomorphism from $H^k(M)$ to $(H^{n-k}(M))^*$. Thus we have proved the following:

Theorem 9.57 (Poincaré duality). If $M$ is an orientable compact $n$-manifold without boundary, then we have an isomorphism
\[ H^k(M) \cong (H^{n-k}(M))^* \]
coming from the pairing defined above.
We will take up Poincare duality again in Chapter 10.
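As a quick sanity check of the pairing, here is a standard example that does not appear in the text above: the flat torus $T^2 = S^1 \times S^1$ with angle coordinates $\theta^1, \theta^2$ (each of period $2\pi$), oriented by $d\theta^1 \wedge d\theta^2$. The sketch assumes the well-known fact that the classes of the harmonic forms $d\theta^1$ and $d\theta^2$ form a basis of $H^1(T^2)$.

```latex
\[
\int_{T^2} d\theta^1 \wedge d\theta^2 = (2\pi)^2 ,
\qquad
\Big( (([d\theta^i],[d\theta^j])) \Big)_{i,j=1,2}
= \begin{pmatrix} 0 & (2\pi)^2 \\ -(2\pi)^2 & 0 \end{pmatrix}.
\]
% The matrix has determinant (2 pi)^4, so the pairing on H^1 x H^1 is nondegenerate.
% For the flat metric, *d theta^1 = d theta^2, which reproduces the argument in the proof:
\[
(([d\theta^1],[*d\theta^1])) = \int_{T^2} d\theta^1 \wedge d\theta^2
= \|d\theta^1\|^2 = (2\pi)^2 > 0 .
\]
```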
9.7. Vector Analysis on $\mathbb{R}^3$

In $\mathbb{R}^3$, the 1-forms may all be written (even globally) in the form $\theta = f_1\,dx + f_2\,dy + f_3\,dz$ for some smooth functions $f_1$, $f_2$ and $f_3$, and all 2-forms $\beta$ may be written $\beta = g_1\,dy \wedge dz + g_2\,dz \wedge dx + g_3\,dx \wedge dy$. The forms $dy \wedge dz$, $dz \wedge dx$, $dx \wedge dy$ constitute a basis (in the module sense) for the space of 2-forms on $\mathbb{R}^3$ just as $dx$, $dy$, $dz$ form a basis for the 1-forms. The single form $dx \wedge dy \wedge dz$ provides a module basis for the 3-forms on $\mathbb{R}^3$. Suppose that $x(u, v)$ parametrizes a surface $S \subset \mathbb{R}^3$ so that we have a map $x : U \to \mathbb{R}^3$. Then the surface is
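oriented by this parametrization, and the integral of $\beta$ over $S$ is
\begin{align*}
\int_S \beta &= \int_S g_1\,dy \wedge dz + g_2\,dz \wedge dx + g_3\,dx \wedge dy \\
&= \int_U \left( g_1(x(u,v))\,\frac{\partial(y,z)}{\partial(u,v)} + g_2(x(u,v))\,\frac{\partial(z,x)}{\partial(u,v)} + g_3(x(u,v))\,\frac{\partial(x,y)}{\partial(u,v)} \right) du\,dv.
\end{align*}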
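The formula above can be tried out numerically. The following is only a rough sketch under stated assumptions: the function names `integrate_2form` and `unit_sphere`, the midpoint rule, and the finite-difference step `h` are all choices made for this illustration, and the test form $x\,dy \wedge dz + y\,dz \wedge dx + z\,dx \wedge dy$ is deliberately not the one from the exercise that follows.

```python
import math

def integrate_2form(g, param, u_range, v_range, n=200, h=1e-6):
    """Midpoint-rule approximation of the integral of
    g1 dy^dz + g2 dz^dx + g3 dx^dy over the surface param(u, v)."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            # central-difference partial derivatives of the parametrization
            pu = [(a - b) / (2 * h) for a, b in zip(param(u + h, v), param(u - h, v))]
            pv = [(a - b) / (2 * h) for a, b in zip(param(u, v + h), param(u, v - h))]
            # Jacobian determinants d(y,z)/d(u,v), d(z,x)/d(u,v), d(x,y)/d(u,v)
            jyz = pu[1] * pv[2] - pu[2] * pv[1]
            jzx = pu[2] * pv[0] - pu[0] * pv[2]
            jxy = pu[0] * pv[1] - pu[1] * pv[0]
            g1, g2, g3 = g(*param(u, v))
            total += (g1 * jyz + g2 * jzx + g3 * jxy) * du * dv
    return total

def unit_sphere(polar, azimuth):
    """Unit sphere; the parameter order (polar, azimuth) gives the outward orientation."""
    return (math.sin(polar) * math.cos(azimuth),
            math.sin(polar) * math.sin(azimuth),
            math.cos(polar))

# Flux-type integral of x dy^dz + y dz^dx + z dx^dy over the unit sphere.
flux = integrate_2form(lambda x, y, z: (x, y, z), unit_sphere,
                       (0.0, math.pi), (0.0, 2.0 * math.pi))
```

For this particular form the exact value is $4\pi$ (three times the volume of the unit ball, by the divergence theorem), which gives a convenient check on the orientation convention.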
Here and in the following we disregard technical issues about integration (but recall Theorem 9.7).

Exercise 9.58. Find the integral of $\beta = x\,dy \wedge dz + dz \wedge dx + xz\,dx \wedge dy$ over the sphere oriented by the parametrization given by the usual spherical coordinates $\phi, \theta, \rho$.

If $\omega = h\,dx \wedge dy \wedge dz$ has support in a bounded open subset $U \subset \mathbb{R}^3$, which we may take to be given the usual orientation implied by the rectangular coordinates $x, y, z$, then
\[ \int_U \omega = \int_U h\,dx \wedge dy \wedge dz = \int_U h\,dx\,dy\,dz. \]
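In order to relate differential forms on $\mathbb{R}^3$ to vector calculus on $\mathbb{R}^3$, we will need some ways to relate forms to vector fields. To a 1-form $\theta = f_1\,dx + f_2\,dy + f_3\,dz$, we can obviously associate the vector field $\sharp\theta = f_1\mathbf{i} + f_2\mathbf{j} + f_3\mathbf{k}$. But recall that this association depends on the notion of orthonormality provided by the dot product. If $\theta$ is expressed in, say, spherical coordinates $\theta = \tilde f_1\,d\rho + \tilde f_2\,d\theta + \tilde f_3\,d\phi$, then it is not true that $\sharp\theta = \tilde f_1\mathbf{i} + \tilde f_2\mathbf{j} + \tilde f_3\mathbf{k}$. Neither is it generally true that $\sharp\theta = \tilde f_1\hat\rho + \tilde f_2\hat\theta + \tilde f_3\hat\phi$, where $\hat\rho, \hat\theta, \hat\phi$ are unit vector fields in the coordinate directions¹ and where the $\tilde f_i$ are just the $f_i$ expressed in polar coordinates. Rather, our general formulas give
\[ \sharp\theta = \tilde f_1\,\hat\rho + \tilde f_2\,\frac{1}{\rho}\,\hat\theta + \tilde f_3\,\frac{1}{\rho\sin\theta}\,\hat\phi. \]
In rectangular coordinates $x, y, z$, we have
\[ \sharp : \quad dx \mapsto \mathbf{i}, \qquad dy \mapsto \mathbf{j}, \qquad dz \mapsto \mathbf{k}, \]
while in spherical coordinates we have
\[ \sharp : \quad d\rho \mapsto \hat\rho, \qquad \rho\,d\theta \mapsto \hat\theta, \qquad \rho\sin\theta\,d\phi \mapsto \hat\phi. \]

¹ Here $\theta$ is the polar angle ranging from $0$ to $\pi$.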
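As an example, we can derive the familiar formula for the gradient in spherical coordinates by first just writing $f$ in the new coordinates, $\tilde f(\rho, \theta, \phi) := f(x(\rho, \theta, \phi), y(\rho, \theta, \phi), z(\rho, \theta, \phi))$, and then sharping the differential
\[ d\tilde f = \frac{\partial \tilde f}{\partial\rho}\,d\rho + \frac{\partial \tilde f}{\partial\theta}\,d\theta + \frac{\partial \tilde f}{\partial\phi}\,d\phi \]
to get
\begin{align*}
\operatorname{grad} f = \sharp\,d\tilde f &= \frac{\partial \tilde f}{\partial\rho}\frac{\partial}{\partial\rho} + \frac{1}{\rho^2}\frac{\partial \tilde f}{\partial\theta}\frac{\partial}{\partial\theta} + \frac{1}{\rho^2\sin^2\theta}\frac{\partial \tilde f}{\partial\phi}\frac{\partial}{\partial\phi} \\
&= \frac{\partial \tilde f}{\partial\rho}\,\hat\rho + \frac{1}{\rho}\frac{\partial \tilde f}{\partial\theta}\,\hat\theta + \frac{1}{\rho\sin\theta}\frac{\partial \tilde f}{\partial\phi}\,\hat\phi,
\end{align*}
where $\hat\rho = \frac{\partial}{\partial\rho}$, $\hat\theta = \frac{1}{\rho}\frac{\partial}{\partial\theta}$, $\hat\phi = \frac{1}{\rho\sin\theta}\frac{\partial}{\partial\phi}$, and where we have used
\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & \rho^2 & 0 \\ 0 & 0 & \rho^2\sin^2\theta \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{\rho^2} & 0 \\ 0 & 0 & \frac{1}{\rho^2\sin^2\theta} \end{bmatrix}. \]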
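In order to proceed to the point of including the curl and divergence of traditional vector calculus, we need a way to relate 2-forms with vector fields. This part definitely depends on the fact that we are talking about forms in $\mathbb{R}^3$. We associate to a 2-form $\eta$ the vector field $\sharp(*\eta)$. Thus $g_1\,dy \wedge dz + g_2\,dz \wedge dx + g_3\,dx \wedge dy$ gives the vector field $X = g_1\mathbf{i} + g_2\mathbf{j} + g_3\mathbf{k}$. Now we can see how the usual divergence of a vector field comes about. First flat the vector field, say $X = f_1\mathbf{i} + f_2\mathbf{j} + f_3\mathbf{k}$, to obtain $\flat X = f_1\,dx + f_2\,dy + f_3\,dz$ and then apply the star operator to obtain $f_1\,dy \wedge dz + f_2\,dz \wedge dx + f_3\,dx \wedge dy$. Finally, we apply exterior differentiation to obtain
\begin{align*}
d(f_1\,dy \wedge dz + f_2\,dz \wedge dx + f_3\,dx \wedge dy) &= df_1 \wedge dy \wedge dz + df_2 \wedge dz \wedge dx + df_3 \wedge dx \wedge dy \\
&= \left( \frac{\partial f_1}{\partial x}\,dx + \frac{\partial f_1}{\partial y}\,dy + \frac{\partial f_1}{\partial z}\,dz \right) \wedge dy \wedge dz + \text{the other two terms} \\
&= \frac{\partial f_1}{\partial x}\,dx \wedge dy \wedge dz + \frac{\partial f_2}{\partial y}\,dx \wedge dy \wedge dz + \frac{\partial f_3}{\partial z}\,dx \wedge dy \wedge dz \\
&= \left( \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z} \right) dx \wedge dy \wedge dz.
\end{align*}
Now we see the divergence appearing. In fact, if we apply the star operator one more time, we get the function $\operatorname{div} X = \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z}$. We are thus led to $*d*(\flat X) = \operatorname{div} X$, which agrees with the definition of divergence given earlier for a general semi-Riemannian manifold.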
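What about the curl? For this, we just take $d(\flat X)$ to get
\begin{align*}
d(f_1\,dx + f_2\,dy + f_3\,dz) &= df_1 \wedge dx + df_2 \wedge dy + df_3 \wedge dz \\
&= \left( \frac{\partial f_1}{\partial x}\,dx + \frac{\partial f_1}{\partial y}\,dy + \frac{\partial f_1}{\partial z}\,dz \right) \wedge dx + \text{the obvious other two terms} \\
&= \left( \frac{\partial f_3}{\partial y} - \frac{\partial f_2}{\partial z} \right) dy \wedge dz + \left( \frac{\partial f_1}{\partial z} - \frac{\partial f_3}{\partial x} \right) dz \wedge dx + \left( \frac{\partial f_2}{\partial x} - \frac{\partial f_1}{\partial y} \right) dx \wedge dy
\end{align*}
and then apply the star operator and sharping to get back to vector fields, obtaining
\[ \left( \frac{\partial f_3}{\partial y} - \frac{\partial f_2}{\partial z} \right)\mathbf{i} + \left( \frac{\partial f_1}{\partial z} - \frac{\partial f_3}{\partial x} \right)\mathbf{j} + \left( \frac{\partial f_2}{\partial x} - \frac{\partial f_1}{\partial y} \right)\mathbf{k}. \]
In short, we have
\[ \sharp * d(\flat X) = \operatorname{curl} X. \]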
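The recipes for divergence and curl can be checked mechanically on polynomial examples. The sketch below is an illustration only: the representation (a polynomial as a dictionary from exponent triples to coefficients, a form as a dictionary from increasing index tuples to polynomials, with indices 0, 1, 2 standing for $x, y, z$) and the names `pdiff`, `sort_sign` and `d` are assumptions made here, not notation from the text.

```python
from collections import defaultdict

def pdiff(poly, i):
    """Partial derivative of a polynomial {exponent triple: coeff} in variable i."""
    out = defaultdict(int)
    for exps, c in poly.items():
        if exps[i] > 0:
            e = list(exps)
            e[i] -= 1
            out[tuple(e)] += c * exps[i]
    return {k: v for k, v in out.items() if v != 0}

def sort_sign(idx):
    """Sort an index tuple; return (sorted tuple, permutation sign), or (None, 0) on repeats."""
    idx = list(idx)
    if len(set(idx)) < len(idx):
        return None, 0
    sign = 1
    for a in range(len(idx)):  # bubble sort, flipping the sign at every swap
        for b in range(len(idx) - 1 - a):
            if idx[b] > idx[b + 1]:
                idx[b], idx[b + 1] = idx[b + 1], idx[b]
                sign = -sign
    return tuple(idx), sign

def d(form):
    """Exterior derivative of a form {increasing index tuple: polynomial} on R^3."""
    out = {}
    for idx, poly in form.items():
        for i in range(3):
            key, sign = sort_sign((i,) + idx)  # dx^i wedge dx^idx, reordered
            if sign == 0:
                continue
            acc = out.setdefault(key, defaultdict(int))
            for e, c in pdiff(poly, i).items():
                acc[e] += sign * c
    cleaned = {}
    for key, poly in out.items():
        poly = {e: c for e, c in poly.items() if c != 0}
        if poly:
            cleaned[key] = poly
    return cleaned

# X = xy i + yz j + zx k: flat X, and star(flat X) written in the basis
# dy^dz = (1,2), dx^dz = (0,2) (note dz^dx = -dx^dz), dx^dy = (0,1).
flat_X = {(0,): {(1, 1, 0): 1}, (1,): {(0, 1, 1): 1}, (2,): {(1, 0, 1): 1}}
star_flat_X = {(1, 2): {(1, 1, 0): 1}, (0, 2): {(0, 1, 1): -1}, (0, 1): {(1, 0, 1): 1}}

curl_form = d(flat_X)      # -y dy^dz + z dx^dz - x dx^dy, i.e. curl X = (-y, -z, -x)
div_form = d(star_flat_X)  # (x + y + z) dx^dy^dz, i.e. div X = x + y + z
f = {(): {(2, 1, 1): 1}}   # the 0-form x^2 y z
```

Applying `d` twice to `f` or to `flat_X` returns the empty dictionary (the zero form), which is the computational shadow of $dd = 0$ used in the exercise that follows.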
Exercise 9.59. Show that the fact that dd = 0 leads to both of the following familiar facts:
\[ \operatorname{curl}(\operatorname{grad} f) = 0, \qquad \operatorname{div}(\operatorname{curl} X) = 0. \]
The 3-form $dx \wedge dy \wedge dz$ is the (oriented) volume element of $\mathbb{R}^3$. Every 3-form is a function times this volume form, and integration of a 3-form over a sufficiently nice subset $D$ (say a compact region) is given by $\int_D \omega = \int_D f\,dx \wedge dy \wedge dz = \int_D f\,dx\,dy\,dz$ (usual Riemann integral). Let us denote $dx \wedge dy \wedge dz$ by $dV$. Of course, $dV$ is not to be considered as the exterior derivative of some object $V$. Let $u^1, u^2, u^3$ be curvilinear coordinates on an open set $U \subset \mathbb{R}^3$ and let $g$ denote the determinant of the matrix $[g_{ij}]$, where $g_{ij} = \left\langle \frac{\partial}{\partial u^i}, \frac{\partial}{\partial u^j} \right\rangle$. Then $dV = \sqrt{g}\,du^1 \wedge du^2 \wedge du^3$. A familiar example is the case when $(u^1, u^2, u^3)$ are spherical coordinates $\rho, \theta, \phi$, in which case
\[ dV = \rho^2 \sin\theta\,d\rho \wedge d\theta \wedge d\phi, \]
and if $D \subset \mathbb{R}^3$ is parametrized by these coordinates, then
\[ \int_D f\,dV = \int_D f(\rho, \theta, \phi)\,\rho^2 \sin\theta\,d\rho \wedge d\theta \wedge d\phi = \int_D f(\rho, \theta, \phi)\,\rho^2 \sin\theta\,d\rho\,d\theta\,d\phi. \]
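For completeness, the factor $\rho^2\sin\theta$ can be recovered directly from $dV = \sqrt{g}\,du^1 \wedge du^2 \wedge du^3$. The short computation below assumes the usual parametrization $x = \rho\sin\theta\cos\phi$, $y = \rho\sin\theta\sin\phi$, $z = \rho\cos\theta$ with $\theta$ the polar angle, and it ends with the volume of a ball of radius $R$ as a check.

```latex
\[
[g_{ij}] =
\begin{bmatrix}
1 & 0 & 0 \\
0 & \rho^2 & 0 \\
0 & 0 & \rho^2 \sin^2\theta
\end{bmatrix},
\qquad
g = \rho^4 \sin^2\theta,
\qquad
\sqrt{g} = \rho^2 \sin\theta \quad (0 < \theta < \pi).
\]
\[
\int_{\{\rho \le R\}} dV
= \int_0^R \!\! \int_0^{\pi} \!\! \int_0^{2\pi} \rho^2 \sin\theta \, d\phi \, d\theta \, d\rho
= 2\pi \cdot 2 \cdot \frac{R^3}{3} = \frac{4}{3}\pi R^3 .
\]
```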
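If we go to the trouble of writing Stokes' theorem for curves, surfaces and domains in $\mathbb{R}^3$ in terms of vector fields associated to the forms in an appropriate way, we obtain the following familiar theorems (using standard notation):
\begin{align*}
\int_c \nabla f \cdot d\mathbf{r} &= f(\mathbf{r}(b)) - f(\mathbf{r}(a)), \\
\iint_S \operatorname{curl}(X) \cdot d\mathbf{S} &= \oint_{\partial S} X \cdot d\mathbf{r} \quad \text{(Stokes' theorem)}, \\
\iiint_D \operatorname{div}(X)\,dV &= \iint_{\partial D} X \cdot d\mathbf{S} \quad \text{(Divergence theorem)}.
\end{align*}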
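Similar and simpler things can be done in $\mathbb{R}^2$, leading for example to the following version of Green's theorem for a planar domain $D$ with (oriented) boundary $c = \partial D$:
\[ \int_D \left( \frac{\partial N}{\partial x} - \frac{\partial M}{\partial y} \right) dx \wedge dy = \int_D d(M\,dx + N\,dy) = \int_c M\,dx + N\,dy. \]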
All of the standard integral theorems from vector calculus are special cases of the general Stokes' theorem.
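Green's theorem in the form just stated is easy to test numerically. The following sketch is only an illustration: the unit disk, the choice $M = -x^2 y$, $N = x y^2$, the function names and the midpoint rule are all assumptions made here. For this choice $\partial N/\partial x - \partial M/\partial y = x^2 + y^2$, and both sides equal $\pi/2$.

```python
import math

def green_line_integral(M, N, n=2000):
    """Midpoint-rule value of the integral of M dx + N dy over the unit circle,
    traversed counterclockwise."""
    dt = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t) * dt, math.cos(t) * dt
        total += M(x, y) * dx + N(x, y) * dy
    return total

def green_area_integral(integrand, n_r=400, n_t=400):
    """Midpoint-rule value of the integral of integrand(x, y) dx dy over the
    unit disk, computed in polar coordinates (area element r dr dt)."""
    dr, dt = 1.0 / n_r, 2.0 * math.pi / n_t
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_t):
            t = (j + 0.5) * dt
            total += integrand(r * math.cos(t), r * math.sin(t)) * r * dr * dt
    return total

boundary_side = green_line_integral(lambda x, y: -x * x * y, lambda x, y: x * y * y)
interior_side = green_area_integral(lambda x, y: x * x + y * y)
```

The two numbers `boundary_side` and `interior_side` agree up to discretization error, as the theorem predicts.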
9.8. Electromagnetism

In this subsection we take a short trip into physics. Consider Maxwell's equations²:
\begin{align*}
\nabla \cdot \mathbf{B} &= 0, & \nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} &= 0, \\
\nabla \cdot \mathbf{E} &= \varrho, & \nabla \times \mathbf{B} - \frac{\partial \mathbf{E}}{\partial t} &= \mathbf{j}.
\end{align*}
Here $\mathbf{E}$ and $\mathbf{B}$, the electric and magnetic fields, are functions of space and time. We write $\mathbf{E} = \mathbf{E}(t, x)$, $\mathbf{B} = \mathbf{B}(t, x)$. The notation suggests that we have conceptually separated space and time as if we were stuck in the conceptual framework of the Galilean spacetime. Our purpose is to slowly discover how much better the theory becomes when we combine space and time in Minkowski spacetime $\mathbb{R}^4_1$. Recall that $\mathbb{R}^4_1$ is treated as a semi-Riemannian manifold, which is $\mathbb{R}^4$ endowed with the indefinite metric $\langle x, y \rangle_1 = -x^0 y^0 + \sum_{i=1}^{3} x^i y^i$. Here the standard coordinates are conventionally denoted by $(x^0, x^1, x^2, x^3)$, and $x^0$ is to be thought of as a time coordinate and is also denoted by $t$ (we take units so that the speed of light $c$ is unity). In what follows, we let $\mathbf{r} = (x^1, x^2, x^3)$, so that $(x^0, x^1, x^2, x^3) = (t, \mathbf{r})$.

² Actually, this is the form of Maxwell's equations after a certain convenient choice of units, and we are ignoring the somewhat subtle distinction between the two types of electric fields $\mathbf{E}$ and $\mathbf{D}$ and the two types of magnetic fields $\mathbf{B}$ and $\mathbf{H}$ and their relation in terms of dielectric constants.
The electric field $\mathbf{E}$ is produced by the presence of charged particles. Under normal conditions a generic material is composed of a large number of atoms. To simplify matters, we will think of the atoms as being composed of just three types of particle: electrons, protons and neutrons. Protons carry a positive charge, electrons carry a negative charge and neutrons carry no charge. Normally, each atom will have a zero net charge since it will have an equal number of electrons and protons. If a relatively small percent of the electrons in a material body are stripped from their atoms and conducted away, then there will be a net positive charge on the body. In the vicinity of the body, there will be an electric field which exerts a force on charged bodies. Let us assume for simplicity that the charged body which has the larger, positive charge is a point particle and stationary at $\mathbf{r}_0$ with respect to a rigid rectangular coordinate system that is stationary with respect to the laboratory. We must assume that our test particle carries a sufficiently small charge, so that the electric field that it creates contributes negligibly to the field we are trying to detect (think of a single electron). Let the test particle be located at $\mathbf{r}$. Careful experiments show that when both charges are positive, the force experienced by the test particle is directed away from the charged body located at $\mathbf{r}_0$ and has magnitude proportional to $qe/r^2$, where $r = |\mathbf{r} - \mathbf{r}_0|$ is the distance between the charged body and the test particle, and where $q$ and $e$ are positive numbers which represent the amount of charge carried by the stationary body and the test particle respectively. If the units are chosen in an appropriate way, we can say that the force $\mathbf{F}$ is given by
\[ \mathbf{F} = qe\,\frac{\mathbf{r} - \mathbf{r}_0}{|\mathbf{r} - \mathbf{r}_0|^3}. \]
By definition, the electric field at the location $\mathbf{r}$ of the test particle is
\[ \mathbf{E} = q\,\frac{\mathbf{r} - \mathbf{r}_0}{|\mathbf{r} - \mathbf{r}_0|^3}. \tag{9.12} \]
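Formula (9.12) translates directly into code. The sketch below is a minimal illustration in the units used in the text; the function names `electric_field` and `force_on_test_charge` are made up for this example.

```python
import math

def electric_field(q, r, r0):
    """Field at position r of a point charge q sitting at r0, as in (9.12)."""
    diff = [a - b for a, b in zip(r, r0)]
    dist = math.sqrt(sum(c * c for c in diff))
    if dist == 0.0:
        raise ValueError("the field is not defined at the location of the source")
    return tuple(q * c / dist ** 3 for c in diff)

def force_on_test_charge(e, q, r, r0):
    """Force F = e E on a test charge e placed at r."""
    return tuple(e * c for c in electric_field(q, r, r0))

# A source charge q = 2 at (1, 0, 0), observed at (3, 0, 0): distance 2.
E = electric_field(2.0, (3.0, 0.0, 0.0), (1.0, 0.0, 0.0))
F = force_on_test_charge(-1.0, 2.0, (3.0, 0.0, 0.0), (1.0, 0.0, 0.0))
```

With these inputs the field has magnitude $q/r^2 = 2/4$ and points away from the source, while a test charge of opposite sign feels a force directed toward the source.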
If the test particle has charge opposite to that of the source body, then one of $q$ or $e$ will be negative and the force is directed toward the source. The test particle could have been placed anywhere in space, and so the electric field is implicitly defined at each point in space and so gives a vector field on $\mathbb{R}^3$. If the charge is modeled as a smoothly distributed charge density $\rho$ which is nonzero in some region $U \subset \mathbb{R}^3$, then the total charge is given by integration, $Q = \int_U \rho(t, \mathbf{r})\,dV_{\mathbf{r}}$, and the field at $\mathbf{r}$ is now given by $\mathbf{E}(t, \mathbf{r}) = \int_U \rho(t, \mathbf{y})\,\frac{\mathbf{r} - \mathbf{y}}{|\mathbf{r} - \mathbf{y}|^3}\,dV_{\mathbf{y}}$. Since the source particle is stationary at $\mathbf{r}_0$, the electric field will be independent of time $t$. A magnetic field is produced by circulating charge (a current). If charge $e$ is located at $\mathbf{r}$ and moving with velocity $\mathbf{v}$ in a magnetic field $\mathbf{B}(t, \mathbf{r})$, then the force felt by the charge is $\mathbf{F} = e\mathbf{E} + \frac{e}{c}\,\mathbf{v} \times \mathbf{B}$, where $\mathbf{v}$ is the velocity of the test particle. The test
particle has to be moving to feel the magnetic part of the field! At this point it is worth pointing out that from the point of view of spacetime, we are not staying true to the spirit of differential geometry since a vector field should have a geometric reality that is independent of its expression in a coordinate system. But a change of inertial frame can make $\mathbf{B}$ zero. Only by treating $\mathbf{E}$ and $\mathbf{B}$ together as aspects of a single field can we obtain the proper view. Our next task is to write Maxwell's equations in terms of differential forms. We already have a way to convert (time dependent) vector fields $\mathbf{E}$ and $\mathbf{B}$ on $\mathbb{R}^3$ into (time dependent) differential forms on $\mathbb{R}^3$. Namely, we use the flatting operation with respect to the standard metric on $\mathbb{R}^3$. For the electric field we have
\[ \mathbf{E} = E_x\mathbf{i} + E_y\mathbf{j} + E_z\mathbf{k} \;\mapsto\; \mathcal{E} = E_x\,dx + E_y\,dy + E_z\,dz. \]
For the magnetic field we do something a bit different. Namely, we flat and then apply the star operator. In rectangular coordinates, we have
\[ \mathbf{B} = B_x\mathbf{i} + B_y\mathbf{j} + B_z\mathbf{k} \;\mapsto\; \mathcal{B} = B_x\,dy \wedge dz + B_y\,dz \wedge dx + B_z\,dx \wedge dy. \]
If we stick to rectangular coordinates (as we have been), the matrix of the standard metric is just $I = (\delta_{ij})$, and so we see that the above operations do not numerically change the components of the fields. Thus in any rectangular coordinate system we have
\[ \mathcal{E}_x = E_x, \qquad \mathcal{E}_y = E_y, \qquad \mathcal{E}_z = E_z, \]
and similarly for the $B$'s. It is not hard to check that in the static case where $\mathbf{E}$ and $\mathbf{B}$ are time independent, the first pair of (static) Maxwell's equations are equivalent to
\[ d\mathcal{E} = 0 \quad \text{and} \quad d\mathcal{B} = 0. \]
This is nice, but if we put time dependence back into the picture, we need to do a couple more things to get a nice viewpoint. So assume now that $\mathbf{E}$ and $\mathbf{B}$ and hence the forms $\mathcal{E}$ and $\mathcal{B}$ are time dependent, and let us view these as differential forms on spacetime $\mathbb{R}^4_1$. In fact, let us combine $\mathcal{E}$ and $\mathcal{B}$ into a single 2-form on $\mathbb{R}^4_1$ by setting
\[ F = \mathcal{B} + \mathcal{E} \wedge dt. \]
Since $F$ is a 2-form, it can be written in the form $F = \frac{1}{2} F_{\mu\nu}\,dx^\mu \wedge dx^\nu$, where $F_{\mu\nu} = -F_{\nu\mu}$ and where the Greek indices are summed over $\{0, 1, 2, 3\}$. It is traditional in physics to let the Greek indices run over this set and to let Latin indices run over just the "space indices" 1, 2, 3. We will follow this convention for a while. If we compare $F = \mathcal{B} + \mathcal{E} \wedge dt$ with $\frac{1}{2} F_{\mu\nu}\,dx^\mu \wedge dx^\nu$,
we see that the $F_{\mu\nu}$ form an antisymmetric matrix which is none other than
\[ \begin{bmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & B_z & -B_y \\ E_y & -B_z & 0 & B_x \\ E_z & B_y & -B_x & 0 \end{bmatrix}. \]
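The bookkeeping behind this matrix can be sketched in a few lines of code. This is an illustration only: the names `field_strength` and `evaluate` are hypothetical, the index order is $(t, x, y, z)$, and the entries are read off from $F = \mathcal{B} + \mathcal{E} \wedge dt$ using $F(u, v) = F_{\mu\nu} u^\mu v^\nu$.

```python
def field_strength(E, B):
    """Matrix (F_mu_nu) of F = B + E ^ dt in the coordinate order (t, x, y, z)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, Bz, -By],
        [Ey, -Bz, 0.0, Bx],
        [Ez, By, -Bx, 0.0],
    ]

def evaluate(F, u, v):
    """Value F(u, v) = sum over mu, nu of F_mu_nu u^mu v^nu on two 4-vectors."""
    return sum(F[m][n] * u[m] * v[n] for m in range(4) for n in range(4))

F = field_strength((1.0, 2.0, 3.0), (4.0, 5.0, 6.0))
d_t, d_x, d_y, d_z = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
```

For instance, `evaluate(F, d_x, d_t)` returns the $x$-component of the electric field, in agreement with $(\mathcal{E} \wedge dt)(\partial_x, \partial_t) = E_x$, and `evaluate(F, d_y, d_z)` returns $B_x$.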
Our goal now is to show that the first pair of Maxwell's equations are equivalent to the single differential form equation
\[ dF = 0. \]
Let $N$ be an $n$-manifold and let $M = (a, b) \times N$ for some interval $(a, b)$. Let the coordinate on $(a, b)$ be $t = x^0$ (time). Let $(x^1, \dots, x^n)$ be a coordinate system on $N$. With the usual abuse of notation, $(x^0, x^1, \dots, x^n)$ is a coordinate system on $(a, b) \times N$. One can easily show that the local expression $d\omega = \partial_\mu f_{\mu_1 \cdots \mu_k}\,dx^\mu \wedge dx^{\mu_1} \wedge \cdots \wedge dx^{\mu_k}$ for the exterior derivative of a form $\omega = f_{\mu_1 \cdots \mu_k}\,dx^{\mu_1} \wedge \cdots \wedge dx^{\mu_k}$ can be written as
\[ d\omega = \sum_{i=1}^{n} \partial_i f_{\mu_1 \cdots \mu_k}\,dx^i \wedge dx^{\mu_1} \wedge \cdots \wedge dx^{\mu_k} + \partial_0 f_{\mu_1 \cdots \mu_k}\,dx^0 \wedge dx^{\mu_1} \wedge \cdots \wedge dx^{\mu_k}, \tag{9.13} \]
where the $\mu_i$ sum over $\{0, 1, 2, \dots, n\}$. Thus we may consider the spatial part $d_S$ of the exterior derivative operator $d$ on $(a, b) \times N = M$. That is, we think of a given form $\omega$ on $(a, b) \times N$ as a time dependent form on $N$ so that $d_S\omega$ is exactly the first term in the expression (9.13) above. Then we may write $d\omega = d_S\omega + dt \wedge \partial_t\omega$ as a compact version of the expression (9.13). The part $d_S\omega$ contains no $dt$'s. By definition $F = \mathcal{B} + \mathcal{E} \wedge dt$ on $\mathbb{R} \times \mathbb{R}^3 = \mathbb{R}^4_1$, and so
\begin{align*}
dF &= d\mathcal{B} + d(\mathcal{E} \wedge dt) = d_S\mathcal{B} + dt \wedge \partial_t\mathcal{B} + (d_S\mathcal{E} + dt \wedge \partial_t\mathcal{E}) \wedge dt \\
&= d_S\mathcal{B} + (\partial_t\mathcal{B} + d_S\mathcal{E}) \wedge dt.
\end{align*}
The part $d_S\mathcal{B}$ is the spatial part and contains no $dt$'s. It follows that $dF$ is zero if and only if both $d_S\mathcal{B}$ and $\partial_t\mathcal{B} + d_S\mathcal{E}$ are zero. Unraveling the definitions shows that the pair of equations $d_S\mathcal{B} = 0$ and $\partial_t\mathcal{B} + d_S\mathcal{E} = 0$ (which we just showed to be equivalent to $dF = 0$) are Maxwell's first two equations disguised in a new notation. In summary, we have
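\[ dF = 0 \quad \Longleftrightarrow \quad \left\{ \begin{aligned} d_S\mathcal{B} &= 0 \\ \partial_t\mathcal{B} + d_S\mathcal{E} &= 0 \end{aligned} \right. \quad \Longleftrightarrow \quad \left\{ \begin{aligned} \nabla \cdot \mathbf{B} &= 0 \\ \nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} &= 0 \end{aligned} \right. \]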
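The "unraveling" can be written out in components. The following is a sketch in rectangular coordinates, assuming $\mathcal{E} = E_x\,dx + E_y\,dy + E_z\,dz$ and $\mathcal{B} = B_x\,dy \wedge dz + B_y\,dz \wedge dx + B_z\,dx \wedge dy$ as above.

```latex
\begin{align*}
d_S \mathcal{B} &= \left( \frac{\partial B_x}{\partial x} + \frac{\partial B_y}{\partial y}
  + \frac{\partial B_z}{\partial z} \right) dx \wedge dy \wedge dz , \\
d_S \mathcal{E} &= \left( \frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z} \right) dy \wedge dz
  + \left( \frac{\partial E_x}{\partial z} - \frac{\partial E_z}{\partial x} \right) dz \wedge dx
  + \left( \frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} \right) dx \wedge dy , \\
\partial_t \mathcal{B} &= \frac{\partial B_x}{\partial t}\, dy \wedge dz
  + \frac{\partial B_y}{\partial t}\, dz \wedge dx
  + \frac{\partial B_z}{\partial t}\, dx \wedge dy .
\end{align*}
% Hence d_S B = 0 is div B = 0, and the three components of
% d_t B + d_S E = 0 are exactly the three components of curl E + dB/dt = 0.
```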
Below we rewrite the last pair of Maxwell's equations, where the advantage of combining time and space together manifests itself to an even greater degree. Let us first pause to notice an interesting aspect of the
first pair. Suppose that the electric and magnetic fields were really all along most properly thought of as differential forms. Then we see that the equation $dF = 0$ has nothing to do with the metric on Minkowski space at all. In fact, if $\phi : \mathbb{R}^4_1 \to \mathbb{R}^4_1$ is any diffeomorphism at all, we have $dF = 0$ if and only if $d(\phi^* F) = 0$, and so the truth of the equation $dF = 0$ is really a differential topological fact; a certain form $F$ is closed. The metric structure of Minkowski space is irrelevant. The same will not be true for the second pair. Even if we start out with the form $F$ on spacetime, it will turn out that the metric will necessarily be implicit in the differential forms version of the second pair of Maxwell's equations. In fact, what we will show is that if we use the star operator for the Minkowski metric, then the second pair can be rewritten as the single equation $*d*F = \mathcal{J}$, where $\mathcal{J}$ is formed from $\mathbf{j} = (j^1, j^2, j^3)$ and $\rho$ as follows: First we form the 4-vector field $J = \rho\,\partial_t + j^1\partial_x + j^2\partial_y + j^3\partial_z$ (called the 4-current) and then using the flatting operation we produce $\mathcal{J} = -\rho\,dt + j^1\,dx + j^2\,dy + j^3\,dz = J_0\,dt + J_1\,dx + J_2\,dy + J_3\,dz$, which is the covariant form of the 4-current. We will only outline the passage from $*d*F = \mathcal{J}$ to the pair $\nabla \cdot \mathbf{E} = \varrho$ and $\nabla \times \mathbf{B} - \frac{\partial \mathbf{E}}{\partial t} = \mathbf{j}$. Let $*_S$ be the operator one gets by viewing differential forms on $\mathbb{R}^4_1$ as time dependent forms on $\mathbb{R}^3$ and then acting by the star operator with respect to the standard metric on $\mathbb{R}^3$. The first step is to verify that the pair $\nabla \cdot \mathbf{E} = \varrho$ and $\nabla \times \mathbf{B} - \frac{\partial \mathbf{E}}{\partial t} = \mathbf{j}$ is equivalent to the pair $*_S d_S *_S \mathcal{E} = \varrho$ and $-\partial_t\mathcal{E} + *_S d_S *_S \mathcal{B} = \mathfrak{j}$, where $\mathfrak{j} := j^1\,dx + j^2\,dy + j^3\,dz$ and $\mathcal{B}$ and $\mathcal{E}$ are as before. Next we verify that
\[ *F = *_S\mathcal{E} - *_S\mathcal{B} \wedge dt. \]
So the next goal is to get from $*d*F = \mathcal{J}$ to the pair $*_S d_S *_S \mathcal{E} = \varrho$ and $-\partial_t\mathcal{E} + *_S d_S *_S \mathcal{B} = \mathfrak{j}$. The following exercise finishes things off.

Exercise 9.60. Show that $*d*F = -\partial_t\mathcal{E} - (*_S d_S *_S \mathcal{E}) \wedge dt + *_S d_S *_S \mathcal{B}$ and then use this and what we did above to show that $*d*F = \mathcal{J}$ is equivalent to the pair $*_S d_S *_S \mathcal{E} = \varrho$ and $-\partial_t\mathcal{E} + *_S d_S *_S \mathcal{B} = \mathfrak{j}$.

We have arrived at the following formulation of Maxwell's equations:
\[ dF = 0, \qquad *d*F = \mathcal{J}. \]
If we just think of this as a pair of equations to be satisfied by a 2-form $F$ where the 1-form $\mathcal{J}$ is given, then this last version of Maxwell's equations makes sense on any semi-Riemannian manifold. In fact, on a Lorentz manifold that can be written as $(a, b) \times S = M$ with the product metric $-dt \otimes dt + g$, for some Riemannian metric $g$ on $S$, we can write $F = \mathcal{B} + \mathcal{E} \wedge dt$,
which allows us to identify the electric and magnetic fields in their covariant form.
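9.9. Surface Theory Redux

Consider a surface $M \subset \mathbb{R}^3$. In what follows, we take advantage of the natural identifications of the tangent spaces of $\mathbb{R}^3$ with $\mathbb{R}^3$ itself. Let $e_1, e_2$ be an oriented orthonormal frame field defined on some open subset $U$ of $M$ and let $e_3 := e_1 \times e_2$. We can think of each of $e_1$, $e_2$ and $e_3$ as vector fields along $U$. Using the identifications mentioned above, we also think of them as $\mathbb{R}^3$-valued 0-forms. Note that the identity map $\mathrm{id}_{\mathbb{R}^3}$ may be considered as an $\mathbb{R}^3$-valued 1-form on $\mathbb{R}^3$ (it sends each tangent vector to itself). Let $I := \iota^*(\mathrm{id}_{\mathbb{R}^3})$, where $\iota : U \subset M \hookrightarrow \mathbb{R}^3$ is inclusion. Then $dI = \iota^* d(\mathrm{id}_{\mathbb{R}^3}) = 0$ since $\mathrm{id}_{\mathbb{R}^3}$ is constant (as a 1-form it has constant coefficients: $\mathrm{id}_{\mathbb{R}^3} = \mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz$). Thus, $I$ is an $\mathbb{R}^3$-valued 1-form, and we may write $I = e_1\theta^1 + e_2\theta^2$, where $(\theta^1, \theta^2)$ is the frame field dual to $(e_1, e_2)$. Note that $\operatorname{vol}_M = \theta^1 \wedge \theta^2$. We have the $\mathbb{R}^3$-valued 1-forms $de_j$ and
\[ de_j = \sum_{k=1}^{3} e_k\,\omega^k_j \]
for some matrix of 1-forms $(\omega^i_j)$. Note that if $v \in T_pM$, then $de_j(v) = \nabla_v e_j$, where $\nabla$ is the flat Levi-Civita derivative on $\mathbb{R}^3$. (In fact, we should point out that if $\mathbf{i}, \mathbf{j}, \mathbf{k}$ is the standard basis on $\mathbb{R}^3$, then any field $X$ along $U$ may be written as $X = f_1\mathbf{i} + f_2\mathbf{j} + f_3\mathbf{k}$ for some smooth functions $f_1, f_2, f_3$ defined on $U$ and may be considered as an $\mathbb{R}^3$-valued 0-form. Then $dX(v) = \nabla_v X = df_1(v)\mathbf{i} + df_2(v)\mathbf{j} + df_3(v)\mathbf{k}$.) For an arbitrary tangent vector $v$ we have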
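\[ \omega^i_j(v) = \langle de_j(v), e_i \rangle = -\langle e_j, de_i(v) \rangle = -\omega^j_i(v), \]
and so it follows that the matrix $(\omega^i_j)$ is antisymmetric. We write $\mathbb{R}^3$-valued forms on $U$ as $\sum_{k=1}^{3} e_k\eta^k$ where $\eta^k \in \Omega(U)$. This is to conform with the order of matrix multiplication.

Theorem 9.61. Let $M \subset \mathbb{R}^3$, $e_1$, $e_2$, $\theta^1$, $\theta^2$ and $(\omega^i_j)$ be as above. Then the following structure equations hold:
\begin{align*}
d\theta^i &= -\sum_{k=1}^{2} \omega^i_k \wedge \theta^k, \\
\omega^3_1 \wedge \theta^1 + \omega^3_2 \wedge \theta^2 &= 0, \\
d\omega^i_j &= -\sum_{k=1}^{3} \omega^i_k \wedge \omega^k_j.
\end{align*}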
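Proof. We calculate as follows:
\begin{align*}
0 = dI = d\sum_{k=1}^{2} e_k\theta^k &= \sum_{k=1}^{2} de_k \wedge \theta^k + \sum_{k=1}^{2} e_k\,d\theta^k \\
&= \sum_{k=1}^{2} \Big( \sum_{j=1}^{3} e_j\,\omega^j_k \Big) \wedge \theta^k + \sum_{k=1}^{2} e_k\,d\theta^k \\
&= \sum_{k=1}^{2} \Big( \sum_{j=1}^{2} e_j\,\omega^j_k + e_3\,\omega^3_k \Big) \wedge \theta^k + \sum_{k=1}^{2} e_k\,d\theta^k \\
&= \sum_{j=1}^{2}\sum_{k=1}^{2} e_j\,\omega^j_k \wedge \theta^k + \sum_{j=1}^{2} e_j\,d\theta^j + \sum_{k=1}^{2} e_3\,\omega^3_k \wedge \theta^k \\
&= \sum_{j=1}^{2} e_j \Big( d\theta^j + \sum_{k=1}^{2} \omega^j_k \wedge \theta^k \Big) + e_3 \Big( \sum_{k=1}^{2} \omega^3_k \wedge \theta^k \Big),
\end{align*}
and it follows that $d\theta^i = -\sum_{k=1}^{2} \omega^i_k \wedge \theta^k$ and $\omega^3_1 \wedge \theta^1 + \omega^3_2 \wedge \theta^2 = 0$. Also,
\begin{align*}
0 = dde_j = d\sum_{k=1}^{3} e_k\,\omega^k_j &= \sum_{k=1}^{3} de_k \wedge \omega^k_j + \sum_{k=1}^{3} e_k\,d\omega^k_j \\
&= \sum_{k=1}^{3} \Big( \sum_{i=1}^{3} e_i\,\omega^i_k \Big) \wedge \omega^k_j + \sum_{i=1}^{3} e_i\,d\omega^i_j \\
&= \sum_{i=1}^{3} e_i \Big( \sum_{k=1}^{3} \omega^i_k \wedge \omega^k_j + d\omega^i_j \Big),
\end{align*}
and so $d\omega^i_j = -\sum_{k=1}^{3} \omega^i_k \wedge \omega^k_j$. $\square$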
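Notice that for $v \in T_pM$ we have $de_3(v) = \nabla_v e_3 = -S(v)$ and so $II(v, w) = \langle -de_3(v), w \rangle$ (recall Definition 4.20). Therefore,
\[ II(e_i, e_j) = \langle -de_3(e_i), e_j \rangle = \Big\langle -\sum_{k=1}^{3} e_k\,\omega^k_3(e_i),\, e_j \Big\rangle = -\omega^j_3(e_i) = \omega^3_j(e_i), \]
and so
\[ \begin{bmatrix} II(e_1, e_1) & II(e_1, e_2) \\ II(e_2, e_1) & II(e_2, e_2) \end{bmatrix} = \begin{bmatrix} \omega^3_1(e_1) & \omega^3_2(e_1) \\ \omega^3_1(e_2) & \omega^3_2(e_2) \end{bmatrix}. \]
Since $(e_1, e_2)$ is an orthonormal frame field, the matrix of the first fundamental form is the identity matrix. It follows that the matrix which represents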
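the shape operator is the same as that which represents the second fundamental form. Thus
\[ K = \det \begin{bmatrix} II(e_1, e_1) & II(e_1, e_2) \\ II(e_2, e_1) & II(e_2, e_2) \end{bmatrix} = \det \begin{bmatrix} \omega^1_3(e_1) & \omega^2_3(e_1) \\ \omega^1_3(e_2) & \omega^2_3(e_2) \end{bmatrix}. \]
Notice that $de_3 = \sum_{k=1}^{3} e_k\,\omega^k_3$ reduces to $de_3 = e_1\omega^1_3 + e_2\omega^2_3$ since $\omega^3_3 = 0$. We have
\begin{align*}
\omega^1_3 &= \omega^1_3(e_1)\,\theta^1 + \omega^1_3(e_2)\,\theta^2, \\
\omega^2_3 &= \omega^2_3(e_1)\,\theta^1 + \omega^2_3(e_2)\,\theta^2,
\end{align*}
so that
\[ \omega^1_3 \wedge \omega^3_2 = -K\,\theta^1 \wedge \theta^2, \qquad \omega^1_3 \wedge \omega^2_3 = K\,\theta^1 \wedge \theta^2, \qquad d\omega^1_2 = K\,\theta^1 \wedge \theta^2. \]
Recall that for $v, w \in T_xS^2$ we have $\operatorname{vol}_{S^2}(v, w) = \langle v \times w, x \rangle$. If we consider $e_3$ as a map $e_3 : U \subset M \to S^2$, then it is called the Gauss map. If $M$ is assumed oriented, then we may take $e_3$ to be equal to a global normal field. If $v, w \in T_pM$, then we have
\begin{align*}
e_3^*\operatorname{vol}_{S^2}(v, w) &= \langle de_3(v) \times de_3(w), e_3 \rangle \\
&= \big\langle (\omega^1_3(v)e_1 + \omega^2_3(v)e_2) \times (\omega^1_3(w)e_1 + \omega^2_3(w)e_2),\, e_3 \big\rangle \\
&= \omega^1_3 \wedge \omega^2_3(v, w) = K\,\theta^1 \wedge \theta^2(v, w) = K\operatorname{vol}_M(v, w).
\end{align*}
Thus we obtain $e_3^*\operatorname{vol}_{S^2} = K\operatorname{vol}_M$. This shows that the Gauss curvature is a measure of distortion of the signed volume under the Gauss map. In particular, if $e_3 : U \subset M \to S^2$ is orientation reversing at $p$, then the curvature at $p$ is negative. Let $p \in M$ with $p$ in the domain $U$ of $e_3$. If $A$ is a nice domain in $U$, then $e_3$ maps $A$ to a set in $S^2$. Without worrying about measure-theoretic technicalities, we have
\[ \operatorname{vol}(e_3(A)) = \int_{e_3(A)} \operatorname{vol}_{S^2} = \int_A e_3^*\operatorname{vol}_{S^2} = \int_A K\operatorname{vol}_M. \]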
One can get an idea of the curvature near a point on a surface by visualizing the Gauss map (see Figure 9.2). For example, it is clear that the right circular cylinder has Gauss curvature zero since its Gauss map sends every region of the cylinder onto a set with zero area. It is also fairly clear that the saddle surface in the diagram has negative curvature.
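The qualitative statements about the cylinder and the saddle can be checked numerically. The sketch below does not use the moving-frame machinery of this section; it assumes instead the standard formula $K = (f_{xx}f_{yy} - f_{xy}^2)/(1 + f_x^2 + f_y^2)^2$ for the Gauss curvature of a graph $z = f(x, y)$, with derivatives approximated by central differences. The function name and the sample surfaces are choices made for this illustration.

```python
import math

def gauss_curvature_graph(f, x, y, h=1e-4):
    """Gauss curvature of the graph z = f(x, y) at (x, y), using
    K = (f_xx f_yy - f_xy^2) / (1 + f_x^2 + f_y^2)^2 and central differences."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    return (fxx * fyy - fxy ** 2) / (1 + fx ** 2 + fy ** 2) ** 2

saddle = lambda x, y: x * x - y * y                       # saddle surface
cylinder_piece = lambda x, y: math.sqrt(1.0 - x * x)      # part of a right circular cylinder
sphere_piece = lambda x, y: math.sqrt(1.0 - x * x - y * y)  # part of the unit sphere

K_saddle = gauss_curvature_graph(saddle, 0.0, 0.0)
K_cylinder = gauss_curvature_graph(cylinder_piece, 0.3, 0.5)
K_sphere = gauss_curvature_graph(sphere_piece, 0.1, 0.2)
```

The saddle comes out with negative curvature, the cylinder with curvature zero, and the unit sphere with curvature one, matching the picture given by the Gauss map.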
Figure 9.2. Gauss map
Problems

(1) Let $\iota : S^2 \hookrightarrow \mathbb{R}^3 \setminus \{0\}$ be the inclusion map. Let $\tau = \iota^*\omega$ where
\[ \omega = \frac{x\,dy \wedge dz + y\,dz \wedge dx + z\,dx \wedge dy}{(x^2 + y^2 + z^2)^{3/2}}. \]
Compute $\int_{S^2} \tau$ where $S^2$ is given the orientation induced by $\tau$ itself.

(2) Let $M$ be an oriented smooth compact manifold with boundary $\partial M$ and suppose that $\partial M$ has two connected components $N_0$ and $N_1$. Let $\iota_i : N_i \hookrightarrow M$ be the inclusion map for $i = 0, 1$. Suppose that $\alpha$ is a $p$-form with $\iota_0^*\alpha = 0$ and $\beta$ an $(n - p - 1)$-form with $\iota_1^*\beta = 0$. Prove that in this case
\[ \int_M d\alpha \wedge \beta = (-1)^{p+1} \int_M \alpha \wedge d\beta. \]

(3) Consider the setup in the proof of Theorem 9.22 where $\gamma \in \bigwedge^{n-k} V^*$ and $L_\gamma : \bigwedge^k V^* \to \mathbb{R}$ is defined so that $L_\gamma(\alpha)\operatorname{vol} = \alpha \wedge \gamma$. Show that $\gamma \mapsto L_\gamma \in (\bigwedge^k V^*)^*$ is linear. Show that if $L_\gamma(\alpha) = 0$ for all $\alpha \in \bigwedge^k V^*$, then $\gamma = 0$. Thus $\gamma \mapsto L_\gamma$ is injective (and hence an isomorphism). See the related Problem 4.
(4) Recall the notion of a manifold with corners as in Problem 21 in Chapter 1. Define orientation on manifolds with corners. Let $M$ be an $n$-manifold with corners. Show that if $C_M$ is the set of corner points of $M$, then $M \setminus C_M$ is a manifold with boundary. Develop integration theory and Stokes' theorem for manifolds with corners. [Hint: If $\omega$ is an $(n - 1)$-form with compact support in the domain of a chart $(U, x)$, then define
where Fi := {x E ]R~ : xi = O}
is given the induced orientation as a subset of the boundary of {x E lRn xi 20}.]
:
(5) Show that if $f : (M, g) \to (N, h)$ is an isometry or a local isometry and $\omega$ is a $k$-form on $N$, then $f^*\omega$ is harmonic if and only if $\omega$ is harmonic.

(6) Let $M$ be a connected oriented compact Riemannian manifold with Laplace operator $\Delta$. A smooth nonzero function $f$ is called an eigenfunction for $\Delta$ with eigenvalue $\lambda$ if $\Delta f = \lambda f$.
(a) Show that zero is an eigenvalue and that all other eigenvalues are strictly positive.
(b) Show that if $\Delta f_1 = \lambda_1 f_1$ and $\Delta f_2 = \lambda_2 f_2$ for $\lambda_1 \neq \lambda_2$, then $(f_1|f_2) = \int_M f_1 f_2 \operatorname{vol}_M = 0$.
(c) Show that if $p : \widetilde{M} \to M$ is a smooth covering space of multiplicity $m$, then for any compactly supported $\omega \in \Omega^n(M)$ we have $\int_{\widetilde{M}} p^*\omega = m \int_M \omega$.
(7) Let $M \subset \mathbb{R}^{n+1}$ be an oriented hypersurface. If $N$ is a positively oriented unit normal field along $M$ with $N = \sum N^i\,\partial/\partial x^i$, then the following formula gives the volume form corresponding to the induced metric on $M$:
\[ \operatorname{vol}_M := \sum_i (-1)^{i-1} N^i\,dx^1 \wedge \cdots \wedge \widehat{dx^i} \wedge \cdots \wedge dx^{n+1}, \]
where $x^1, \dots, x^{n+1}$ are the standard coordinates on $\mathbb{R}^{n+1}$ and the $dx^i$ are restricted to $M$.

(8) Let $U$ be a starshaped open set in $\mathbb{R}^n$ and let $x^1, \dots, x^n$ be standard coordinates. Given
\[ \omega = \sum_{i_1 < \cdots < i_k} \omega_{i_1, \dots, i_k}\,dx^{i_1} \wedge \cdots \wedge dx^{i_k}, \]
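define
\[ I\omega := \sum_{i_1 < \cdots < i_k} \sum_{r=1}^{k} (-1)^{r-1} \left( \int_0^1 t^{k-1}\,\omega_{i_1, \dots, i_k}(tx)\,dt \right) x^{i_r}\,dx^{i_1} \wedge \cdots \wedge \widehat{dx^{i_r}} \wedge \cdots \wedge dx^{i_k}, \]
where the caret means omission (and $U$ is taken to be starshaped with respect to the origin). Show that if $d\omega = 0$, then $dI\omega = \omega$.

(9) Derive the decompositions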
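\[ \Omega^k(M) = d\delta(\Omega^k(M)) \oplus \delta d(\Omega^k(M)) \oplus \mathcal{H}^k \]
and
\[ \Omega^k(M) = d(\Omega^{k-1}(M)) \oplus \delta(\Omega^{k+1}(M)) \oplus \mathcal{H}^k \]
from
\[ \Omega^k(M) = \Delta(\Omega^k(M)) \oplus \mathcal{H}^k. \]

(10) Prove Theorem 9.7.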
Chapter 10

De Rham Cohomology
We have already given the definition of de Rham cohomology (Definition 8.41) and the de Rham theorem, which says that for a compact oriented smooth manifold $M$, the de Rham cohomology is isomorphic with the singular cohomology. In this chapter we introduce just a few of the most basic tools proper to the subject, such as the Mayer-Vietoris sequence. We shall also introduce the compactly supported de Rham cohomology and revisit Poincaré duality. In this chapter all manifolds will be without boundary. For a given $n$-manifold $M$, we have the sequence of maps called the de Rham complex
\[ 0 \to \Omega^0(M) \xrightarrow{d} \Omega^1(M) \xrightarrow{d} \cdots \xrightarrow{d} \Omega^n(M) \to 0, \]
and we have defined the de Rham cohomology groups (actually vector spaces) as the quotients
\[ H^k(M) = \frac{Z^k(M)}{B^k(M)}, \]
where $Z^k(M) := \operatorname{Ker}(d : \Omega^k(M) \to \Omega^{k+1}(M))$ and $B^k(M) := \operatorname{Im}(d : \Omega^{k-1}(M) \to \Omega^k(M))$. The elements of $Z^k(M)$ are called closed $k$-forms or $k$-cocycles, and the elements of $B^k(M)$ are called exact $k$-forms or $k$-coboundaries. If $f : M \to N$ is smooth and $[\beta] \in H^k(N)$, then since $d$ commutes with pull-back, it is easy to see that $f^*\beta$ is closed because $\beta$ is closed. Thus we obtain a cohomology class $[f^*\beta]$. Also, $[f^*\beta]$ depends only on the equivalence class of $\beta$. Indeed, if $\beta - \beta' = d\eta$, then $f^*\beta - f^*\beta' = df^*\eta$. Thus we may define a linear map $f^* : H^k(N) \to H^k(M)$ by $f^*[\beta] := [f^*\beta]$. Let us immediately review a simple situation from Section 2.10, which will help the reader better see what the de Rham cohomologies are all about.
10. De Rham Cohomology
Let $M = \mathbb{R}^2 \setminus \{0\}$ and consider the 1-form
$$\vartheta = \frac{x\,dy - y\,dx}{x^2 + y^2}.$$
We got this 1-form by taking the exterior derivative of $\theta = \arctan(y/x)$. This function is not defined as a single-valued smooth function on all of $\mathbb{R}^2 \setminus \{0\}$, but $\vartheta$ is well-defined on all of $\mathbb{R}^2 \setminus \{0\}$. One can extend $\theta$ to be defined on the plane minus the ray $\{x \le 0,\ y = 0\}$ as follows:
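$$\theta = \begin{cases} \arctan(y/x) & \text{if } x > 0, \\ \arccos\left(x/\sqrt{x^2 + y^2}\right) & \text{if } x \le 0 \text{ and } y > 0, \\ -\arccos\left(x/\sqrt{x^2 + y^2}\right) & \text{if } x \le 0 \text{ and } y < 0. \end{cases} \tag{10.1}$$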
See Problem 1. However, this is still not defined on all of $\mathbb{R}^2 \setminus \{0\}$. One may also check that $d\vartheta = 0$ and so $\vartheta$ is closed. Using Proposition 2.125 and Example 2.126 of Section 2.10, it is easy to see that we have the following situation:
(1) $\vartheta := \frac{x\,dy - y\,dx}{x^2 + y^2}$ is a smooth 1-form on $\mathbb{R}^2 \setminus \{0\}$ with $d\vartheta = 0$.
(2) There is no function $f$ defined on (all of) $\mathbb{R}^2 \setminus \{0\}$ such that $\vartheta = df$.
(3) From Section 2.10 we know that for any ball $B(p, \varepsilon)$ in $\mathbb{R}^2 \setminus \{0\}$ there is a function $f \in C^\infty(B(p, \varepsilon))$ such that $\vartheta|_{B(p,\varepsilon)} = df$.
Assertion (1) says that $\vartheta$ is globally well-defined and closed, while (2) says that $\vartheta$ is not exact. Assertion (3) says that $\vartheta$ is what we might call locally exact or locally conservative. What prevents us from finding a (global) function with $\vartheta = df$? Could the same kind of situation occur if the manifold is $\mathbb{R}^2$? The answer is no, and the difference between $\mathbb{R}^2$ and $\mathbb{R}^2 \setminus \{0\}$ is that $H^1(\mathbb{R}^2) = 0$ while $H^1(\mathbb{R}^2 \setminus \{0\}) \neq 0$.
Exercise 10.1. Verify (1) and (3) above.
We recall from Chapter 2 that this example has something to do with path independence. In fact, if we could show that for a given 1-form $\alpha$, the path integral $\int_c \alpha$ only depended on the beginning and ending points of the curve $c$, then we could define a function $f(x) := \int_{x_0}^{x} \alpha$, where $\int_{x_0}^{x} \alpha$ is just the path integral for any path beginning at a fixed $x_0$ and ending at $x$. With this definition one can show that $df = \alpha$ and so $\alpha$ would be exact. In our example, the form $\vartheta$ is not exact, and so there must be a failure of path independence.
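The failure of path independence for $\vartheta$ can be seen numerically. The following is a minimal sketch, assuming a midpoint-rule approximation of the line integral; the names `theta_form` and `loop_integral` are our own and purely illustrative. A loop around the origin should give approximately $2\pi$, while a loop contained in a ball that misses the origin should give approximately $0$, in line with assertions (2) and (3).

```python
import math

def theta_form(x, y):
    """Coefficients (P, Q) of the 1-form P dx + Q dy = (x dy - y dx) / (x^2 + y^2)."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def loop_integral(cx, cy, radius, steps=20000):
    """Midpoint-rule line integral of the form over a counterclockwise circle
    with center (cx, cy) and the given radius."""
    total = 0.0
    dt = 2.0 * math.pi / steps
    for k in range(steps):
        t = (k + 0.5) * dt
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        dx = -radius * math.sin(t) * dt   # derivative of x(t) times dt
        dy = radius * math.cos(t) * dt    # derivative of y(t) times dt
        p, q = theta_form(x, y)
        total += p * dx + q * dy
    return total

# A loop that winds once around the origin.
around_origin = loop_integral(0.0, 0.0, 1.0)
# A loop inside the ball B((3, 0), 2), which does not contain the origin.
away_from_origin = loop_integral(3.0, 0.0, 1.0)
```

If the form were exact on all of $\mathbb{R}^2 \setminus \{0\}$, both integrals would vanish; the nonzero value of `around_origin` reflects the nontrivial class of $\vartheta$.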
Exercise 10.2. A smooth fixed endpoint homotopy between a path $c_0 : [a, b] \to M$ and $c_1 : [a, b] \to M$ is a one parameter family of paths $h_s$ such that $h_0 = c_0$ and $h_1 = c_1$ and such that the map $H(t, s) := h_s(t)$ is smooth on $[a, b] \times [0, 1]$. Show that if $\alpha$ is an exact 1-form, then $\frac{d}{ds} \int_{h_s} \alpha = 0$.
The de Rham complex and its cohomology can be viewed in terms of differential equations. For example, the task of finding a closed 1-form $f\,dx + g\,dy$ on an open set $U \subset \mathbb{R}^2$ amounts to finding a solution of the differential equation
$$\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} = 0.$$
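As an illustration of this differential equation, one can check numerically that the coefficients of $\vartheta = (x\,dy - y\,dx)/(x^2 + y^2)$ satisfy it away from the origin. This is only a sketch under the assumption that central finite differences are an adequate stand-in for the partial derivatives; the names `f_coeff`, `g_coeff` and `closedness_defect` are hypothetical.

```python
def f_coeff(x, y):
    """Coefficient of dx in (x dy - y dx) / (x^2 + y^2)."""
    return -y / (x * x + y * y)

def g_coeff(x, y):
    """Coefficient of dy in (x dy - y dx) / (x^2 + y^2)."""
    return x / (x * x + y * y)

def closedness_defect(x, y, h=1e-5):
    """Central-difference approximation of dg/dx - df/dy at the point (x, y)."""
    dg_dx = (g_coeff(x + h, y) - g_coeff(x - h, y)) / (2.0 * h)
    df_dy = (f_coeff(x, y + h) - f_coeff(x, y - h)) / (2.0 * h)
    return dg_dx - df_dy

# Sample points in the punctured plane, all well away from the origin.
sample_points = [(1.0, 0.5), (-2.0, 1.5), (0.3, -0.7), (-1.2, -0.4)]
defects = [closedness_defect(x, y) for (x, y) in sample_points]
```

Each entry of `defects` should be close to zero, up to discretization and rounding error.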
The "trivial" solutions are the exact forms since they are automatically closed. Thus the de Rham cohomology $H^1(U)$ is a space of solutions modulo the "uninteresting" solutions. Similar statements apply to the higher cohomology groups. Since we have found a closed 1-form on $\mathbb{R}^2 \setminus \{0\}$ that is not exact, we know that $H^1(\mathbb{R}^2 \setminus \{0\}) \neq 0$. We are not yet in a position to determine $H^1(\mathbb{R}^2 \setminus \{0\})$ completely. We will start out with even simpler spaces and eventually develop the machinery to bootstrap our way up to more complicated situations. First, let $M = \{p\}$. That is, $M$ consists of a single point and is hence a 0-dimensional manifold. In this case,
$$\Omega^k(\{p\}) = Z^k(\{p\}) = \begin{cases} \mathbb{R} & \text{for } k = 0, \\ 0 & \text{for } k > 0. \end{cases}$$
Furthermore, $B^k(\{p\}) = 0$ and so
$$H^k(\{p\}) = \begin{cases} \mathbb{R} & \text{for } k = 0, \\ 0 & \text{for } k > 0. \end{cases}$$
Next we consider the case $M = \mathbb{R}$. Here, $Z^0(\mathbb{R})$ is clearly just the constant functions and so is (isomorphic to) $\mathbb{R}$. On the other hand, $B^0(\mathbb{R}) = 0$ and so $H^0(\mathbb{R}) = \mathbb{R}$. Now since $d : \Omega^1(\mathbb{R}) \to \Omega^2(\mathbb{R}) = 0$, we see that $Z^1(\mathbb{R}) = \Omega^1(\mathbb{R})$. If $g(x)\,dx \in \Omega^1(\mathbb{R})$, then letting
$$f(x) := \int_0^x g(t)\,dt$$
we get $df = g(x)\,dx$. Thus, every element of $\Omega^1(\mathbb{R})$ is exact; $B^1(\mathbb{R}) = \Omega^1(\mathbb{R})$. We are led to $H^1(\mathbb{R}) = 0$. From this modest beginning we will be able to compute the de Rham cohomology for a large class of manifolds. Our first goal is to compute $H^k(\mathbb{R}^n)$ for all $k$. In order to accomplish this, we will need some preparation. The methods are largely algebraic, and so we will need to introduce a bit of "homological algebra".
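The construction of a primitive for a 1-form on the line can be tried out numerically. The sketch below assumes a particular sample coefficient `g` (our own choice, not from the text), approximates $f(x) = \int_0^x g$ with Simpson's rule, and compares a finite-difference derivative of $f$ with $g$.

```python
import math

def g(x):
    """Hypothetical coefficient of the 1-form g(x) dx, chosen only for illustration."""
    return math.cos(x) + x * x

def f(x, steps=2000):
    """Simpson's rule approximation of the integral of g from 0 to x (steps must be even)."""
    h = x / steps
    total = g(0.0) + g(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3.0

def derivative(func, x, h=1e-4):
    """Central-difference approximation of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

# df/dx should reproduce g, which is the statement df = g(x) dx.
check_points = [-2.0, 0.5, 1.3]
errors = [abs(derivative(f, x) - g(x)) for x in check_points]
```

For this choice of `g` the exact primitive is $\sin x + x^3/3$, so `f` can also be compared against a closed form.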
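Definition 10.3. Let $R$ be a commutative ring. A differential $R$-complex is a direct sum of modules $C = \bigoplus_{k \in \mathbb{Z}} C^k$ together with a linear map $d : C \to C$ such that $d \circ d = 0$ and such that $d(C^k) \subset C^{k+1}$. Thus we have a sequence of linear maps
$$\cdots \to C^{k-1} \xrightarrow{d} C^k \xrightarrow{d} C^{k+1} \to \cdots,$$
where we have denoted the restrictions $d_k = d|_{C^k}$ all simply by the single letter $d$. Let $A = \bigoplus_{k \in \mathbb{Z}} A^k$ and $B = \bigoplus_{k \in \mathbb{Z}} B^k$ be differential complexes. A map $f : A \to B$ is called a chain map if $f$ is a (degree 0) graded map such that $d \circ f = f \circ d$. In other words, if we let $f|_{A^k} := f_k$, then we require that $f_k(A^k) \subset B^k$ and that the following diagram commutes for all $k$:
$$\begin{array}{ccccccccc} \cdots & \xrightarrow{d} & A^{k-1} & \xrightarrow{d} & A^k & \xrightarrow{d} & A^{k+1} & \xrightarrow{d} & \cdots \\ & & \downarrow f_{k-1} & & \downarrow f_k & & \downarrow f_{k+1} & & \\ \cdots & \xrightarrow{d} & B^{k-1} & \xrightarrow{d} & B^k & \xrightarrow{d} & B^{k+1} & \xrightarrow{d} & \cdots \end{array}$$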
Notice that if $f : A \to B$ is a chain map, then $\operatorname{Ker}(f)$ and $\operatorname{Im}(f)$ are complexes with $\operatorname{Ker}(f) = \bigoplus_{k \in \mathbb{Z}} \operatorname{Ker}(f_k)$ and $\operatorname{Im}(f) = \bigoplus_{k \in \mathbb{Z}} \operatorname{Im}(f_k)$. Thus the notion of exact sequence of chain maps may be defined in the obvious way.
Definition 10.4. The $k$-th cohomology of the complex $C = \bigoplus_{k \in \mathbb{Z}} C^k$ is
$$H^k(C) := \frac{\operatorname{Ker}(d|_{C^k})}{\operatorname{Im}(d|_{C^{k-1}})}.$$
The elements of $\operatorname{Ker}(d|_{C^k})$ (also denoted $Z^k(C)$) are called $k$-cocycles, while the elements of $\operatorname{Im}(d|_{C^{k-1}})$ (also denoted $B^k(C)$) are called $k$-coboundaries. If $f : A \to B$ is a chain map, then it is easy to see that there is a natural (degree 0) graded map $f_* : H(A) \to H(B)$ defined by $f_*([x]) := [f(x)]$ for $x \in A^k$.
Definition 10.5. An exact sequence of chain maps of the form
$$0 \to A \xrightarrow{f} B \xrightarrow{g} C \to 0$$
is called a short exact sequence.
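Associated to every short exact sequence of chain maps is a long exact sequence of cohomology groups:
$$\cdots \to H^k(A) \xrightarrow{f_*} H^k(B) \xrightarrow{g_*} H^k(C) \xrightarrow{d_*} H^{k+1}(A) \xrightarrow{f_*} H^{k+1}(B) \to \cdots.$$
The maps $f_*$ and $g_*$ are the maps induced by $f$ and $g$, and the "coboundary map" or connecting homomorphism $d_* : H^k(C) \to H^{k+1}(A)$ is defined as follows: Let $c \in Z^k(C) \subset C^k$ represent the class $[c] \in H^k(C)$ so that $dc = 0$. Starting with this we hope to end up with a well-defined element of $H^{k+1}(A)$, but we must refer to the following diagram with exact rows to explain how we arrive at a choice of representative of our class $d_*([c])$:
$$\begin{array}{ccccccccc} 0 & \to & A^{k+1} & \xrightarrow{f} & B^{k+1} & \xrightarrow{g} & C^{k+1} & \to & 0 \\ & & \uparrow d & & \uparrow d & & \uparrow d & & \\ 0 & \to & A^k & \xrightarrow{f} & B^k & \xrightarrow{g} & C^k & \to & 0 \end{array}$$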
By the surjectivity of $g$ there is a $b \in B^k$ with $g(b) = c$. Also, since $g(db) = d(g(b)) = dc = 0$, it must be that $db = f(a)$ for some $a \in A^{k+1}$. The scheme of the process is $c \to b \to a$. Certainly $f(da) = d(f(a)) = ddb = 0$, and so since $f$ is 1-1, we must have $da = 0$, which means that $a \in Z^{k+1}(A)$. We would like to define $d_*([c])$ to be $[a]$, but we must show that this is well-defined. Suppose that we repeat this process starting with $c' = c - d\alpha$ for some $\alpha \in C^{k-1}$. In our first step, we find $b' \in B^k$ with $g(b') = c'$ and then $a'$ with $f(a') = db'$. We wish to show that $[a] = [a']$. We have $g(b - b') = c - c' = d\alpha$. But there must be an element $\beta \in B^{k-1}$ such that $g(\beta) = \alpha$. Now we have
$$g(b - b' - d\beta) = g(b) - g(b') - g(d\beta) = g(b) - g(b') - dg(\beta) = c - c' - d\alpha = 0.$$
By exactness at $B^k$, there must be a $\gamma \in A^k$ such that $f(\gamma) = b - b' - d\beta$. Now we have
$$f(a - a' - d\gamma) = f(a) - f(a') - df(\gamma) = db - db' - d(b - b' - d\beta) = 0,$$
and since $f$ is injective, we have $a - a' - d\gamma = 0$, which means that $[a] = [a']$. Thus our definition $d_*([c]) := [a]$ is independent of the choices. We leave it to the reader to check (if there is any doubt) that $d_*$, so defined, is linear.
Let us review the situation we have with de Rham cohomology, noticing how it fits the abstract homological algebra above. We let $\Omega^k(M) := 0$ for $k < 0$ and we have a differential complex $d : \Omega(M) \to \Omega(M)$, where $d$ is the exterior derivative and $\Omega(M)$ is the direct sum of the $\Omega^k(M)$. In this case, $H^k(\Omega(M)) = H^k(M)$ by definition. If $f : M \to N$ is a $C^\infty$ map, then we have $f^* : \Omega(N) \to \Omega(M)$. Since pull-back commutes with exterior differentiation $d$ and preserves the degree of differential forms, $f^*$ is a chain map. Thus we have the induced map on cohomology, denoted by $f^*$, so that
$$f^* : H^*(N) \to H^*(M), \qquad f^* : [\alpha] \mapsto [f^*\alpha],$$
where we have used $H^*(M)$ to denote the direct sum $\bigoplus_i H^i(M)$. Notice that $f \mapsto f^*$ together with $M \mapsto H^*(M)$ is a contravariant functor, which means that for $f : M \to N$ and $g : N \to P$ we have
$$(g \circ f)^* = f^* \circ g^*.$$
(This is why we have now put the stars in as superscripts.) In particular, if $\iota_U : U \to M$ is the inclusion of an open set $U$ in $M$, then $\iota_U^*\alpha$ is the same as the restriction of the form $\alpha$ to $U$. If $[\alpha] \in H^*(M)$, then $\iota_U^*([\alpha]) \in H^*(U)$; $\iota_U^* : H^*(M) \to H^*(U)$.
Remark 10.6. If $\alpha \in Z^k(M)$ and $\beta \in Z^l(M)$, then for any $\alpha' \in \Omega^{k-1}(M)$ and any $\beta' \in \Omega^{l-1}(M)$ we have
$$(\alpha + d\alpha') \wedge \beta = \alpha \wedge \beta + d\alpha' \wedge \beta = \alpha \wedge \beta + d(\alpha' \wedge \beta) - (-1)^{k-1} \alpha' \wedge d\beta = \alpha \wedge \beta + d(\alpha' \wedge \beta)$$
and similarly $\alpha \wedge (\beta + d\beta') = \alpha \wedge \beta + (-1)^k d(\alpha \wedge \beta')$. Thus we may define a product $H^k(M) \times H^l(M) \to H^{k+l}(M)$ by $[\alpha] \wedge [\beta] := [\alpha \wedge \beta]$. This gives $H^*(M)$ a graded ring structure.
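10.1. The Mayer-Vietoris Sequence
Suppose that $M = U \cup V$ for open sets $U$ and $V$. We have the following commutative diagram of inclusions:
$$\begin{array}{ccccc} & & U & & \\ & \overset{i_1}{\nearrow} & & \overset{j_1}{\searrow} & \\ U \cap V & & & & M \\ & \underset{i_2}{\searrow} & & \underset{j_2}{\nearrow} & \\ & & V & & \end{array} \tag{10.2}$$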
This gives rise to the Mayer-Vietoris short exact sequence
$$0 \to \Omega(M) \xrightarrow{j^*} \Omega(U) \oplus \Omega(V) \xrightarrow{\partial i^*} \Omega(U \cap V) \to 0,$$
where
$$j^*(\omega) := (j_1^*\omega, j_2^*\omega)$$
and
$$\partial i^*(\alpha, \beta) := i_2^*(\beta) - i_1^*(\alpha) = \beta|_{U \cap V} - \alpha|_{U \cap V}.$$
Note that $j_1^*\omega \in \Omega(U)$ while $j_2^*\omega \in \Omega(V)$. Note also that $i_1^*(\alpha) = \alpha|_{U \cap V}$ and $i_2^*(\beta) = \beta|_{U \cap V}$ live in $\Omega(U \cap V)$. Let us write $j^*$ suggestively as $(j_1^*, j_2^*)$. Let us show that this sequence is exact. First, if $(j_1^*, j_2^*)(\omega) := (j_1^*\omega, j_2^*\omega) = (0, 0)$, then $\omega|_U = \omega|_V = 0$ and so $\omega = 0$ on $M = U \cup V$. Thus $(j_1^*, j_2^*)$ is 1-1 and the exactness at $\Omega(M)$ is demonstrated. Next, if $\eta \in \Omega(U \cap V)$, then we take a smooth partition of unity $\{\rho_U, \rho_V\}$ subordinate to the cover $\{U, V\}$ and let $\xi := (-(\rho_V \eta)_U, (\rho_U \eta)_V)$, where $(\rho_V \eta)_U$ is the extension of the form $\rho_V|_{U \cap V}\, \eta$ by zero to $U$, and $(\rho_U \eta)_V$ likewise extends $\rho_U|_{U \cap V}\, \eta$ to $V$. This may look backwards at first, so note carefully that we use $\rho_V$, which has support in $V$, to get a function on $U$, and similarly $\rho_U$ is used to define $(\rho_U \eta)_V$ on $V$! Figure 10.1 shows the circle as the union of two open sets $U$ and $V$ and serves as a schematic of what we have in mind. In the figure, $\eta \in \Omega^0$ is 1 on one connected component of $U \cap V$ and 0 on the other component. Now we have
$$\partial i^*(-(\rho_V \eta)_U, (\rho_U \eta)_V) = (\rho_U \eta)_V|_{U \cap V} + (\rho_V \eta)_U|_{U \cap V} = \rho_U \eta|_{U \cap V} + \rho_V \eta|_{U \cap V} = (\rho_U + \rho_V)\eta = \eta.$$
Perhaps the notation is too pedantic. If we let the restrictions and extensions by zero take care of themselves, so to speak, then the idea is expressed by saying that $\partial i^*$ maps the element $(-\rho_V \eta, \rho_U \eta) \in \Omega(U) \oplus \Omega(V)$ to $\rho_U \eta - (-\rho_V \eta) = \eta \in \Omega(U \cap V)$. Thus we see that $\partial i^*$ is surjective.
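It is easy to see that $\partial i^* \circ (j_1^*, j_2^*) = 0$ so that $\operatorname{Im}(j_1^*, j_2^*) \subset \operatorname{Ker}(\partial i^*)$. Now let $(\alpha, \beta) \in \Omega(U) \oplus \Omega(V)$ and suppose that $\partial i^*(\alpha, \beta) = 0$. This translates to $\alpha|_{U \cap V} = \beta|_{U \cap V}$, which means that there is a form $\omega \in \Omega(U \cup V) = \Omega(M)$ such that $\omega$ coincides with $\alpha$ on $U$ and with $\beta$ on $V$. Thus
$$(j_1^*, j_2^*)(\omega) = (\alpha, \beta),$$
so that $\operatorname{Ker}(\partial i^*) \subset \operatorname{Im}(j_1^*, j_2^*)$, which, together with the reverse inclusion, gives $\operatorname{Ker}(\partial i^*) = \operatorname{Im}(j_1^*, j_2^*)$. Following the general algebraic pattern, the Mayer-Vietoris short exact sequence gives rise to the Mayer-Vietoris long exact sequence:
$$\cdots \to H^k(M) \xrightarrow{j^*} H^k(U) \oplus H^k(V) \xrightarrow{\partial i^*} H^k(U \cap V) \xrightarrow{d^*} H^{k+1}(M) \to \cdots.$$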
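Since our description of the coboundary map in the algebraic case was rather abstract, we will do well to take a closer look at the connecting homomorphism $d^*$ in the present context. Referring to the diagram below, $\eta \in \Omega^k(U \cap V)$ represents a cohomology class $[\eta] \in H^k(U \cap V)$ so that in particular $d\eta = 0$:
$$\begin{array}{ccccccccc} 0 & \to & \Omega^{k+1}(M) & \xrightarrow{j^*} & \Omega^{k+1}(U) \oplus \Omega^{k+1}(V) & \xrightarrow{\partial i^*} & \Omega^{k+1}(U \cap V) & \to & 0 \\ & & \uparrow d & & \uparrow d \oplus d & & \uparrow d & & \\ 0 & \to & \Omega^k(M) & \xrightarrow{j^*} & \Omega^k(U) \oplus \Omega^k(V) & \xrightarrow{\partial i^*} & \Omega^k(U \cap V) & \to & 0 \end{array}$$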
We will abbreviate the map $d \oplus d$ to just $d$ so that $d(\eta_1, \eta_2) = (d\eta_1, d\eta_2)$. By the exactness of the rows we can find a form $\xi \in \Omega^k(U) \oplus \Omega^k(V)$ which maps to $\eta$. In fact, we may take $\xi = (-\rho_V \eta, \rho_U \eta)$ as before. Since $d\eta = 0$ and the diagram commutes, we see that $d\xi$ must map to 0 in $\Omega^{k+1}(U \cap V)$. This just tells us that $-d(\rho_V \eta)$ and $d(\rho_U \eta)$ agree on the intersection $U \cap V$. (Refer again to Figure 10.1.) Thus there is a well-defined form in $\Omega^{k+1}(M)$ which maps to $d\xi$. This global form $\gamma$ is given by
$$\gamma = \begin{cases} -d(\rho_V \eta) & \text{on } U, \\ d(\rho_U \eta) & \text{on } V, \end{cases}$$
and then by definition $d^*[\eta] = [\gamma] \in H^{k+1}(M)$.
Figure 10.1. Scheme for $d^*$ on the circle
Exercise 10.7. Let the circle $S^1$ be parametrized by the angle $\theta$ in the usual way. Let $U$ be the part of the circle with $-\pi/6 < \theta < 7\pi/6$ and let $V$ be given by $5\pi/6 < \theta < 13\pi/6$.
(a) Show that $H^0(U) \cong H^0(V) \cong \mathbb{R}$.
(b) Show that the "difference map" $H^0(U) \oplus H^0(V) \to H^0(U \cap V)$ has 1-dimensional image.
(c) What is the cohomology of $S^1$?
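One way to sanity-check an answer to this exercise is to do the linear algebra of the Mayer-Vietoris sequence by machine. The sketch below is an illustration only and rests on stated assumptions: $U$ and $V$ are arcs with $H^0 \cong \mathbb{R}$ and $H^1 = 0$, the intersection $U \cap V$ has two components so that $H^0(U \cap V) \cong \mathbb{R}^2$, and the difference map sends constants $(a, b)$ to $b - a$ on each component. The helper `rank` and the matrix `difference_map` are our own names.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix with rational entries, computed by Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# Rows: the two components of U ∩ V. Columns: the constants (a, b) on U and V.
# Each component sees the difference b - a.
difference_map = [[-1, 1],
                  [-1, 1]]

rk = rank(difference_map)
# Exactness: H^0(S^1) is the kernel of the difference map, and, since
# H^1(U) and H^1(V) are assumed to vanish, H^1(S^1) is its cokernel.
dim_H0 = 2 - rk
dim_H1 = 2 - rk
```

Under these assumptions the dimensions come out as `dim_H0 == 1` and `dim_H1 == 1`.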
10.2. Homotopy Invariance
Now we come to a result about the relation between $H^*(M)$ and $H^*(M \times \mathbb{R})$ which provides the leverage needed to compute the cohomology of some higher-dimensional manifolds based on that of lower-dimensional manifolds. One of our main goals in this section is to prove the homotopy invariance of de Rham cohomology and also the Poincaré lemma. We start with a theorem which is of interest in its own right. Let $M$ be an $n$-manifold and also consider the product manifold $M \times \mathbb{R}$. We then have the projection $\pi : M \times \mathbb{R} \to M$, and for a fixed $a \in \mathbb{R}$ we have the section $s_a : M \to M \times \mathbb{R}$ given by $x \mapsto (x, a)$. We can apply the cohomology functor to this pair of maps to obtain the maps $\pi^*$ and $s_a^*$:
$$\pi^* : H^*(M) \to H^*(M \times \mathbb{R}), \qquad s_a^* : H^*(M \times \mathbb{R}) \to H^*(M).$$
Theorem 10.8. Given $M$ and the maps defined above, we have that $\pi^* : H^*(M) \to H^*(M \times \mathbb{R})$ and $s_a^* : H^*(M \times \mathbb{R}) \to H^*(M)$ are mutual inverses for each $a$. In particular, $H^*(M \times \mathbb{R}) \cong H^*(M)$.
Proof. In what follows we let id denote the identity map on $\Omega(M \times \mathbb{R})$, $H^*(M)$ and $H^*(M \times \mathbb{R})$ as determined by the context. We already know that $\mathrm{id} = s_a^* \circ \pi^*$. The main idea of the proof is the use of a so-called homotopy operator, which in the present case is a degree $-1$ map $K : \Omega(M \times \mathbb{R}) \to \Omega(M \times \mathbb{R})$ with the property that
$$\mathrm{id} - \pi^* \circ s_a^* = \pm(d \circ K - K \circ d). \tag{10.3}$$
The point is that $d \circ K - K \circ d$ must send closed forms to exact forms. Thus on the level of cohomology $\pm(d \circ K - K \circ d) = 0$, and hence $\mathrm{id} - \pi^* \circ s_a^*$ must be the zero map. Thus we also have $\mathrm{id} = \pi^* \circ s_a^*$ on $H^*(M \times \mathbb{R})$, which gives the result. So let us then prove equation (10.3) above. Let $\partial_t$ denote the obvious vector field that is related to the standard coordinate field on $\mathbb{R}$ under the projection $M \times \mathbb{R} \to \mathbb{R}$, and let $i_{\partial_t}$ denote interior product with respect to $\partial_t$. The map $K$ is defined on $\Omega^k(M \times \mathbb{R})$ by
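$$K(\omega) = (-1)^{k-1} \int_a^t \pi^* s_\tau^* (i_{\partial_t}\omega)\, d\tau = (-1)^{k-1} \int_a^t (s_\tau \circ \pi)^* (i_{\partial_t}\omega)\, d\tau.$$
More explicitly, for $v_1, \ldots, v_{k-1} \in T_{(q,t)}(M \times \mathbb{R})$,
$$K(\omega)|_{(q,t)}(v_1, \ldots, v_{k-1}) = \int_a^t \omega\left(Ts_\tau \circ T\pi(v_1), \ldots, Ts_\tau \circ T\pi(v_{k-1}), \partial_t|_{(q,\tau)}\right) d\tau.$$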
This operator and $d$ are both local and linear over $\mathbb{R}$, and so if $\mathrm{id} - \pi^* \circ s_a^* = \pm(d \circ K - K \circ d)$ is true on charts, then it is true in general. Thus we may as well assume that $M$ is an open set $U$ in $\mathbb{R}^n$. For any $\omega \in \Omega^k(U \times \mathbb{R})$, we can find a pair of functions $f_1(x, t)$ and $f_2(x, t)$ such that
$$\omega = f_1(x, t)\pi^*\alpha + f_2(x, t)\pi^*\beta \wedge dt$$
for some forms $\alpha \in \Omega^k(U)$ and $\beta \in \Omega^{k-1}(U)$. This decomposition is unique in the sense that if $f_1(x, t)\pi^*\alpha + f_2(x, t)\pi^*\beta \wedge dt = 0$, then $f_1(x, t)\pi^*\alpha = 0$ and $f_2(x, t)\pi^*\beta \wedge dt = 0$. In what follows we will abuse notation a bit. The standard coordinates on $U$ will be denoted by $x^1, \ldots, x^n$, and we write $\pi^* x^i$ simply as $x^i$ so that with $t$ the standard coordinate of $\mathbb{R}$, we have coordinates $(x^1, \ldots, x^n, t)$ on $U \times \mathbb{R}$. Using the decomposition above, one can see that $K(f_1(x, t)\pi^*\alpha)$ is zero and in general
$$K(\omega) = \left(\int_a^t f_2(x, \tau)\, d\tau\right) \pi^*\beta.$$
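This map is our proposed homotopy operator. Let us now check that $K$ has the required properties. It is clear from what we have said that we may check the action of $K$ separately on forms of the types $f_1(x, t)\pi^*\alpha$ and $f_2(x, t)\pi^*\beta \wedge dt$.
Case I (type $f(x, t)\pi^*\alpha$). If $\omega = f(x, t)\pi^*\alpha$, then $K\omega = 0$ and so $(d \circ K - K \circ d)\omega = -K(d\omega)$. Then
$$\begin{aligned} K(d\omega) &= K(d(f(x, t)\pi^*\alpha)) = K(df(x, t) \wedge \pi^*\alpha + f(x, t)\pi^* d\alpha) \\ &= K\left(\sum_i \frac{\partial f}{\partial x^i}\, dx^i \wedge \pi^*\alpha + \frac{\partial f}{\partial t}\, dt \wedge \pi^*\alpha + f(x, t)\pi^* d\alpha\right) \\ &= (-1)^k K\left(\frac{\partial f}{\partial t}(x, t)\pi^*\alpha \wedge dt\right) \\ &= (-1)^k \left(\int_a^t \frac{\partial f}{\partial t}(x, \tau)\, d\tau\right) \pi^*\alpha = (-1)^k (f(x, t) - f(x, a))\pi^*\alpha. \end{aligned}$$
On the other hand, since
$$\pi^* s_a^* \left(f(x, t)\pi^*\alpha\right) = \pi^*\left[f(x, a)(s_a^* \circ \pi^*)\alpha\right] = \pi^*\left[(f \circ s_a)\alpha\right] = (f \circ s_a \circ \pi)\pi^*\alpha = f(x, a)\pi^*\alpha,$$
we have
$$(\mathrm{id} - \pi^* \circ s_a^*)\omega = (\mathrm{id} - \pi^* \circ s_a^*) f(x, t)\pi^*\alpha = f(x, t)\pi^*\alpha - f(x, a)\pi^*\alpha = (f(x, t) - f(x, a))\pi^*\alpha$$
as above. So in this case we get $(d \circ K - K \circ d)\omega = \pm(\mathrm{id} - \pi^* \circ s_a^*)\omega$.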
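Case II (type $\omega = f(x, t)\pi^*\beta \wedge dt$). We have
$$\begin{aligned} d \circ K(\omega) &= d \circ K(f(x, t)\pi^*\beta \wedge dt) = d\left(\pi^*\beta \int_a^t f(x, \tau)\, d\tau\right) \\ &= \pi^* d\beta \left(\int_a^t f(x, \tau)\, d\tau\right) + (-1)^{k-1}\pi^*\beta \wedge d\left(\int_a^t f(x, \tau)\, d\tau\right) \\ &= \pi^* d\beta \left(\int_a^t f(x, \tau)\, d\tau\right) + (-1)^{k-1}\pi^*\beta \wedge f(x, t)\, dt + (-1)^{k-1}\pi^*\beta \wedge \sum_i \left(\int_a^t \frac{\partial f}{\partial x^i}(x, \tau)\, d\tau\right) dx^i \end{aligned}$$
and
$$\begin{aligned} K \circ d\omega &= K d(f\pi^*\beta \wedge dt) = K d(\pi^*\beta \wedge f\, dt) \\ &= K\left(\pi^* d\beta \wedge f\, dt + (-1)^{k-1}\pi^*\beta \wedge df \wedge dt\right) \\ &= K\left(\pi^* d\beta \wedge f\, dt + (-1)^{k-1}\pi^*\beta \wedge \sum_i \frac{\partial f}{\partial x^i}\, dx^i \wedge dt\right) \\ &= \left(\int_a^t f(x, \tau)\, d\tau\right)\pi^* d\beta + (-1)^{k-1}\pi^*\beta \wedge \sum_i \left(\int_a^t \frac{\partial f}{\partial x^i}(x, \tau)\, d\tau\right) dx^i. \end{aligned}$$
Thus $(d \circ K - K \circ d)\omega = (-1)^{k-1}\pi^*\beta \wedge f(x, t)\, dt = (-1)^{k-1}\omega$. On the other hand, we also have $(\mathrm{id} - \pi^* \circ s_a^*)\omega = \omega$ since $s_a^*\, dt = 0$, and so $(d \circ K - K \circ d) = \pm(\mathrm{id} - \pi^* \circ s_a^*)$, which is equation (10.3). As explained at the beginning of the proof, this implies that $\pi^*$ and $s_a^*$ are inverses and so $H^*(M \times \mathbb{R}) \cong H^*(M)$. $\square$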
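Corollary 10.9 (Poincaré lemma).
$$H^k(\mathbb{R}^n) = H^k(\text{point}) = \begin{cases} \mathbb{R} & \text{if } k = 0, \\ 0 & \text{otherwise.} \end{cases}$$
Proof. One first verifies that the statement is true for $H^k(\text{point})$. Then the remainder of the proof is a simple induction:
$$H^k(\text{point}) \cong H^k(\text{point} \times \mathbb{R}) = H^k(\mathbb{R}) \cong H^k(\mathbb{R} \times \mathbb{R}) = H^k(\mathbb{R}^2) \cong \cdots \cong H^k(\mathbb{R}^n). \qquad \square$$
Corollary 10.10 (Homotopy axiom). If $f : M \to N$ and $g : M \to N$ are homotopic, then the induced maps $f^* : H^*(N) \to H^*(M)$ and $g^* : H^*(N) \to H^*(M)$ are equal.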
Proof. By extending the homotopy as in Exercise 1.77 we may assume that we have a map $F : M \times \mathbb{R} \to N$ such that
$$F(x, t) = f(x) \text{ for } t \ge 1, \qquad F(x, t) = g(x) \text{ for } t \le 0.$$
If $s_1(x) := (x, 1)$ and $s_0(x) := (x, 0)$, then $f = F \circ s_1$ and $g = F \circ s_0$, and so
$$f^* = s_1^* \circ F^*, \qquad g^* = s_0^* \circ F^*.$$
It is easy to check that $s_1^*$ and $s_0^*$ are one-sided inverses of $\pi^*$, where $\pi : M \times \mathbb{R} \to M$ is the projection as before. But we have shown that $\pi^*$ is an isomorphism. It follows that $s_1^* = s_0^*$ in cohomology, and then from the above we have $f^* = g^*$. $\square$
Homotopy plays a central role in algebraic topology, and so the last corollary is very important. Recall that if there exist maps $f : M \to N$ and $g : N \to M$ such that both $f \circ g$ and $g \circ f$ are defined and homotopic to $\mathrm{id}_N$ and $\mathrm{id}_M$ respectively, then $f$ (or $g$) is called a homotopy equivalence, and $M$ and $N$ are said to have the same homotopy type. In particular, if a topological space has the same homotopy type as a single point, then we say that the space is contractible. If we are dealing with smooth manifolds, we may take the maps to be smooth. In fact, any continuous map between two smooth manifolds is continuously homotopic to a smooth map. We shall use this fact often without comment. The following corollaries follow easily.
Corollary 10.11 (Homotopy invariance). If $M$ and $N$ are smooth manifolds which are of the same homotopy type, then $H^*(M) \cong H^*(N)$.
Corollary 10.12. If $M$ is a contractible $n$-manifold, then $H^0(M) \cong \mathbb{R}$ and $H^k(M) = 0$ for $0 < k \le n$.
Next consider the situation where $A$ is a subset of $M$ and $i : A \hookrightarrow M$ is the inclusion map. If there exists a map $r : M \to A$ such that $r \circ i = \mathrm{id}_A$, then we say that $r$ is a retraction of $M$ onto $A$. If $A$ is a regular submanifold of a smooth manifold $M$, then in case there is a retraction $r$ of $M$ onto $A$, we may assume that $r$ is smooth. If we can find a smooth retraction $r$ such that $i \circ r$ is smoothly homotopic to the identity $\mathrm{id}_M$, then we say that $r$ is a (smooth) deformation retraction, and this homotopy itself is also called a deformation retraction. In this case, $A$ is said to be a (smooth) deformation retract of $M$.
Corollary 10.13. If $A$ is a smooth deformation retract of $M$, then $A$ and $M$ have isomorphic cohomologies.
Exercise 10.14. Let $U_+$ and $U_-$ be open subsets of the sphere $S^n \subset \mathbb{R}^{n+1}$ given by
$$U_+ := \{(x^i) \in S^n : -\varepsilon < x^{n+1} \le 1\}, \qquad U_- := \{(x^i) \in S^n : -1 \le x^{n+1} < \varepsilon\},$$
where $0 < \varepsilon < 1/2$. Show that there is a deformation retraction of $U_+ \cap U_-$ onto the equator $x^{n+1} = 0$ in $S^n$. Notice that the equator is a two point set in case $n = 1$. Show that $S^n$ is not contractible. (See also Problem 6.)
Theorem 10.15 (Hairy sphere theorem). A nowhere vanishing smooth vector field exists on the sphere $S^n$ if and only if $n$ is odd.
Proof. If $X$ is a nowhere vanishing vector field on $S^n$, then define $\nu := X/\|X\|$. We introduce a map $F : S^n \times [0, 1] \to S^n$ by
$$F(x, t) := (\cos \pi t)\, x + (\sin \pi t)\, \nu(x).$$
This is a smooth homotopy between the identity map and the antipodal map $a : x \mapsto -x$. Now if $\mathrm{vol}_{S^n}$ is the volume form on $S^n$, then $[\mathrm{vol}_{S^n}]$ is not zero in $H^n(S^n)$. Indeed, suppose that $\mathrm{vol}_{S^n} = d\omega$. Then
$$0 \neq \mathrm{vol}(S^n) = \int_{S^n} \mathrm{vol}_{S^n} = \int_{S^n} d\omega = \int_{\emptyset} \omega = 0,$$
which is a contradiction. Recall the special $n$-form on $\mathbb{R}^{n+1}$ given in Cartesian coordinates by
$$\sigma := \sum_{i=1}^{n+1} (-1)^{i-1} x^i\, dx^1 \wedge \cdots \wedge \widehat{dx^i} \wedge \cdots \wedge dx^{n+1}.$$
It was shown previously that $\mathrm{vol}_{S^n}$ is the restriction of $\sigma$ to $S^n$. But if $n$ is even, then $n+1$ is odd, and it is clear from the above formula that $a^*\sigma = -\sigma$. Since $\mathrm{vol}_{S^n}$ is obtained from $\sigma$ by taking the restriction, we have $a^* \mathrm{vol}_{S^n} = -\mathrm{vol}_{S^n}$, and so also on the level of cohomology:
$$a^*[\mathrm{vol}_{S^n}] = -[\mathrm{vol}_{S^n}].$$
But this is impossible since if the homotopy exists, we must have $a^* = \mathrm{id}^*$ on $H^n(S^n)$. We conclude that a nonvanishing vector field does not exist on $S^n$ if $n$ is even. If $n$ is odd, then we can easily construct a nowhere vanishing vector field on $S^n$. For example, if $n = 3$, then let
$$X = x_2 \frac{\partial}{\partial x_1} - x_1 \frac{\partial}{\partial x_2} + x_4 \frac{\partial}{\partial x_3} - x_3 \frac{\partial}{\partial x_4},$$
where $x_1, x_2, x_3, x_4$ are the standard coordinates on $\mathbb{R}^4$, and restrict $X$ to $S^3$. Notice that $X$ is indeed tangent to $S^3$. The generalization to higher odd dimensions should be clear. $\square$
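The construction for odd $n$ can be checked numerically. The sketch below assumes the natural generalization in which consecutive coordinate pairs are rotated, $(x_1, x_2, \ldots) \mapsto (x_2, -x_1, \ldots)$; the names `rotation_field` and `random_point_on_sphere` are hypothetical. Tangency to the sphere amounts to $X(x) \cdot x = 0$, and nonvanishing follows from $\|X(x)\| = \|x\| = 1$.

```python
import math
import random

def rotation_field(x):
    """X(x) = (x2, -x1, x4, -x3, ...) for a point x of even-dimensional Euclidean space."""
    if len(x) % 2 != 0:
        raise ValueError("the ambient dimension n + 1 must be even")
    out = []
    for i in range(0, len(x), 2):
        out.extend([x[i + 1], -x[i]])
    return out

def random_point_on_sphere(dim, rng):
    """A random unit vector in R^dim, obtained by normalizing a Gaussian sample."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

rng = random.Random(0)
results = []
for dim in (4, 6, 8):            # ambient dimensions for S^3, S^5, S^7
    for _ in range(100):
        p = random_point_on_sphere(dim, rng)
        field = rotation_field(p)
        dot = sum(a * b for a, b in zip(p, field))      # tangency defect
        length = math.sqrt(sum(c * c for c in field))   # should equal 1
        results.append((dot, length))
```

For $n$ even the ambient dimension $n + 1$ is odd and this pairing of coordinates is not available, which is consistent with the theorem.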
A trivial consequence of the Poincaré lemma is that the cohomology spaces of a Euclidean space are finite-dimensional. Below we use an induction that shows that this is true for a large class of manifolds which includes all compact manifolds. For this and later purposes, we introduce a technical condition. An open cover $\{U_\alpha\}_{\alpha \in A}$ of an $n$-manifold $M$ is called a good cover if for every choice $\alpha_0, \ldots, \alpha_k$ the set $U_{\alpha_0} \cap \cdots \cap U_{\alpha_k}$ is diffeomorphic to $\mathbb{R}^n$ (or empty).
Proposition 10.16. If $\{O_\beta\}_{\beta \in B}$ is any open cover of an $n$-manifold $M$, then there exists a good cover $\{U_\alpha\}_{\alpha \in A}$ such that each $U_\alpha$ is contained in some $O_\beta$.
Proof. The proof requires a result from Riemannian geometry (see Problem 2 of Chapter 13). We know that every manifold can be given a Riemannian metric. Each point of a Riemannian manifold has a neighborhood system made up of small geodesically convex sets. What matters to us now is that the intersection of any finite number of such sets is geodesically convex and hence diffeomorphic to $\mathbb{R}^n$. Actually, the assertion that an open geodesically convex set in a Riemannian manifold is diffeomorphic to $\mathbb{R}^n$ is common in the literature, but it is a more subtle issue than it may seem, and references to a complete proof are hard to find (but see [Grom]). Granted this claim, the result follows. $\square$
Many of the results we develop below will be proved for orientable manifolds that possess a finite good cover. This includes all orientable compact manifolds.
Theorem 10.17. If an $n$-manifold has a finite good cover, then its de Rham cohomology spaces are all finite-dimensional.
Proof. The proof is an induction argument that uses the Mayer-Vietoris sequence. Our k-th induction hypothesis is the following: P(k): Every smooth manifold that has a good cover consisting of k open sets has finite-dimensional de Rham cohomologies.
Since we know that an open set diffeomorphic to $\mathbb{R}^n$ has finite-dimensional de Rham cohomology spaces, the statement $P(1)$ is true by Corollary 10.9. Suppose that $P(k)$ is true. We wish to show that this implies $P(k+1)$. So suppose that $M$ has a good cover $\{U_1, \ldots, U_{k+1}\}$ and let $M_k := U_1 \cup \cdots \cup U_k$ and $M_{k+1} := M_k \cup U_{k+1} = M$.
Since we are assuming $P(k)$, $M_k$ has finite-dimensional de Rham cohomology spaces. Note that $M_k \cap U_{k+1}$ has a good cover, which is just $\{U_1 \cap U_{k+1}, \ldots, U_k \cap U_{k+1}\}$. From the long Mayer-Vietoris sequence we have for a given $q$
$$\to H^q(M_k \cap U_{k+1}) \xrightarrow{d^*} H^{q+1}(M_{k+1}) \xrightarrow{\iota_0^* + \iota_1^*} H^{q+1}(M_k) \oplus H^{q+1}(U_{k+1}) \to,$$
which gives the exact sequence
$$0 \to \operatorname{Im} d^* \to H^{q+1}(M_{k+1}) \xrightarrow{\iota_0^* + \iota_1^*} \operatorname{Im}(\iota_0^* + \iota_1^*) \to 0.$$
Since $H^q(M_k \cap U_{k+1})$ and $H^{q+1}(M_k) \oplus H^{q+1}(U_{k+1})$ are finite-dimensional by hypothesis, the same is true of $\operatorname{Im} d^*$ and $\operatorname{Im}(\iota_0^* + \iota_1^*)$. It follows that $H^{q+1}(M_{k+1})$ is finite-dimensional, and since $q$ was arbitrary the induction is complete. $\square$
10.3. Compactly Supported Cohomology
Let $\Omega_c(M)$ denote the algebra of compactly supported differential forms on a manifold $M$. Obviously, if $M$ is compact, then $\Omega_c(M) = \Omega(M)$, and so our main interest here is the case where $M$ is not compact. We now have a slightly different complex
$$\cdots \xrightarrow{d} \Omega_c^k(M) \xrightarrow{d} \Omega_c^{k+1}(M) \xrightarrow{d} \cdots,$$
which has a corresponding cohomology $H_c^k(M)$ called the de Rham cohomology with compact support. By definition
$$H_c^k(M) = \frac{Z_c^k(M)}{B_c^k(M)},$$
where $Z_c^k(M)$ is the vector space of closed $k$-forms with compact support and $B_c^k(M)$ is the space of all $k$-forms $d\omega$ where $\omega$ has compact support. Note carefully that $B_c^k(M)$ is not the set of exact $k$-forms with compact support. To drive the point home, consider $f \in C^\infty(\mathbb{R}^n)$ with compact support and with $f \ge 0$ and $f > 0$ at some point. Then
$$\omega = f\, dx^1 \wedge \cdots \wedge dx^n$$
is exact since every closed form on $\mathbb{R}^n$ is exact. However, $\omega$ cannot be $d\alpha$ for an $\alpha$ with compact support, since then we would have
$$\int_{\mathbb{R}^n} \omega = \int_{\mathbb{R}^n} d\alpha = \int_{\partial \mathbb{R}^n} \alpha = 0,$$
which contradicts the assumption that $f$ is nonnegative with $f > 0$ at some point. This already shows that $H_c^n(\mathbb{R}^n) \neq 0$. Using a bump function with support inside a chart, one can similarly show that $H_c^n(M) \neq 0$ for any orientable manifold $M$.
Exercise 10.18. Let $M$ be an oriented $n$-manifold. If $\omega \in \Omega_c^n(M)$ and $\int_M \omega \neq 0$, then $[\omega] \neq 0$ in $H_c^n(M)$.
Exercise 10.19. Show that $H_c^1(\mathbb{R}) \cong \mathbb{R}$ and that $H_c^0(M) = 0$ whenever $\dim M > 0$ and $M$ is connected and noncompact.
If we look at the behavior of differential forms under the operation of pull-back, we immediately realize that the pull-back of a differential form with compact support may not have compact support. In order to get desired functorial properties, we consider the class of smooth proper maps. Recall that a smooth map $f : P \to M$ is called a proper map if $f^{-1}(K)$ is compact whenever $K$ is compact. It is easy to verify that the set of all smooth manifolds together with proper smooth maps is a category and the assignments $M \mapsto \Omega_c(M)$ and $f \mapsto \{\alpha \mapsto f^*\alpha\}$ give a contravariant functor. In plain language, this means that if $f : P \to M$ is a proper map, then $f^* : \Omega_c(M) \to \Omega_c(P)$ and for two such maps we have $(f \circ g)^* = g^* \circ f^*$ as before, but now the assumption that $f$ and $g$ are proper maps is essential.
We will use a different functorial behavior associated with forms of compact support. The first thing we need is a new category (which is fortunately easy to describe). The category we have in mind has as objects the set of all open subsets of a fixed manifold $M$. The morphisms are the inclusion maps $j_{V,U} : V \hookrightarrow U$, which are only defined in case $V \subset U$. For any such inclusion $j_{V,U}$, we define a map $(j_{V,U})_* : \Omega_c(V) \to \Omega_c(U)$ according to the following simple prescription: For $\alpha \in \Omega_c(V)$, let $(j_{V,U})_*\alpha$ be the form in $\Omega_c(U)$ which is equal to $\alpha$ at all points in $V$ and equal to zero otherwise (extension by zero). Since the support of $\alpha$ is neatly inside the open set $V$, the extension $(j_{V,U})_*\alpha$ is smooth. In what follows, we take this category whenever we employ the functor $\Omega_c$.
Let $U$ and $V$ be open subsets which together cover $M$. Recall the commutative diagram of inclusion maps (10.2). Now for each $k$, let us define a map $(-i_{1*}, i_{2*}) : \Omega_c^k(U \cap V) \to \Omega_c^k(U) \oplus \Omega_c^k(V)$ by $\alpha \mapsto (-i_{1*}\alpha, i_{2*}\alpha)$. Now we also have the map $j_{1*} + j_{2*} : \Omega_c^k(U) \oplus \Omega_c^k(V) \to \Omega_c^k(M)$ given by $(\alpha_1, \alpha_2) \mapsto j_{1*}\alpha_1 + j_{2*}\alpha_2$.
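Notice that if $\{\phi_U, \phi_V\}$ is a partition of unity subordinate to $\{U, V\}$, then for any $\omega \in \Omega_c^k(M)$ we can define $\omega_U := \phi_U \omega|_U$ and $\omega_V := \phi_V \omega|_V$ so that we have
$$(j_{1*} + j_{2*})(\omega_U, \omega_V) = j_{1*}\omega_U + j_{2*}\omega_V = \phi_U \omega + \phi_V \omega = \omega.$$
Notice that $\omega_U$ and $\omega_V$ have compact support. For example,
$$\operatorname{Supp}(\omega_U) = \operatorname{Supp}(\phi_U \omega) \subset \operatorname{Supp}(\phi_U) \cap \operatorname{Supp}(\omega).$$
Since $\operatorname{Supp}(\phi_U) \cap \operatorname{Supp}(\omega)$ is compact, $\operatorname{Supp}(\omega_U)$ is also compact. Thus $j_{1*} + j_{2*}$ is surjective. We can associate to the diagram (10.2) the new sequence
$$0 \to \Omega_c^k(U \cap V) \xrightarrow{(-i_{1*},\, i_{2*})} \Omega_c^k(U) \oplus \Omega_c^k(V) \xrightarrow{j_{1*} + j_{2*}} \Omega_c^k(M) \to 0. \tag{10.4}$$
This is the short Mayer-Vietoris sequence for differential forms with compact support.
Theorem 10.20. The sequence (10.4) is exact.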
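We have shown the surjectivity of $j_{1*} + j_{2*}$ above. The rest of the proof is also easy and left as Problem 4.
Corollary 10.21. There is a long exact sequence
$$\cdots \to H_c^k(U \cap V) \xrightarrow{(-i_{1*},\, i_{2*})} H_c^k(U) \oplus H_c^k(V) \xrightarrow{j_{1*} + j_{2*}} H_c^k(M) \xrightarrow{d_*} H_c^{k+1}(U \cap V) \to \cdots,$$
which is called the (long) Mayer-Vietoris sequence for cohomology with compact supports.
Notice that we have denoted the connecting homomorphism by $d_*$. We will need to have a more explicit understanding of $d_*$: If $[\omega] \in H_c^k(M)$, then using a partition of unity $\{\phi_U, \phi_V\}$ as above we write $\omega = j_{1*}\omega_U + j_{2*}\omega_V$ and then $d\, j_{1*}\omega_U = -d\, j_{2*}\omega_V$ on $U \cap V$. Then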
$$d_*[\omega] = \left[-d\omega_U|_{U \cap V}\right] = \left[d\omega_V|_{U \cap V}\right]. \tag{10.5}$$
Next we prove a version of the Poincaré lemma for compactly supported cohomology. For a given $n$-manifold $M$, we consider the projection $\pi : M \times \mathbb{R} \to M$. We immediately notice that $\pi^*$ does not map $\Omega_c^k(M)$ into $\Omega_c^k(M \times \mathbb{R})$. What we need is a map $\pi_* : \Omega_c^k(M \times \mathbb{R}) \to \Omega_c^{k-1}(M)$ called integration along the fiber. Before giving the definition of $\pi_*$ we first note that every element of $\Omega_c^k(M \times \mathbb{R})$ is locally a sum of forms of the following types:
Type I: $f\, \pi^*\lambda$,
Type II: $f\, \pi^*\varphi \wedge dt$,
where $\lambda \in \Omega_c^k(M)$, $\varphi \in \Omega_c^{k-1}(M)$ and $f$ is a smooth function with compact support on $M \times \mathbb{R}$. By definition $\pi_*$ sends all forms of Type I to zero, and for Type II forms, we define
$$\pi_*(\pi^*\varphi \cdot f \wedge dt) = \varphi \int_{-\infty}^{\infty} f(\cdot, t)\, dt.$$
By linearity this defines $\pi_*$ on all forms. In Problem 2 we ask the reader to show that $\pi_* \circ d = d \circ \pi_*$ so that $\pi_*$ is a chain map. Thus we get a map on cohomology:
$$\pi_* : H_c^k(M \times \mathbb{R}) \to H_c^{k-1}(M).$$
Next choose $e \in \Omega_c^1(\mathbb{R})$ with $\int_{\mathbb{R}} e = 1$ and introduce the map $e_* : \Omega_c^k(M) \to \Omega_c^{k+1}(M \times \mathbb{R})$ by
$$e_* : \omega \mapsto \pi^*\omega \wedge \pi_2^* e,$$
where $\pi_2 : M \times \mathbb{R} \to \mathbb{R}$ is the projection on the second factor. It is easy to check that $e_*$ commutes with $d$, so once again we get a map on the level of cohomology: $e_* : H_c^k(M) \to H_c^{k+1}(M \times \mathbb{R})$ for all $k$. Our immediate goal is to show that $e_* \circ \pi_*$ and $\pi_* \circ e_*$ are both identity operators on the level of cohomology. In fact, it is not hard to see that $\pi_* \circ e_* = \mathrm{id}$ already on $\Omega_c^k(M)$. We need to construct a homotopy operator $K$ between $e_* \circ \pi_*$ and id. The map $K : \Omega_c^k(M \times \mathbb{R}) \to \Omega_c^{k-1}(M \times \mathbb{R})$ is given by requiring that $K$ is linear, maps Type I forms to zero, and if $\omega = \pi^*\varphi \cdot f \wedge dt$ is a Type II form, then
$$K(\omega) = K(\pi^*\varphi \cdot f \wedge dt) := \pi^*\varphi \left(\int_{-\infty}^{t} f(x, u)\, du - T(t) \int_{-\infty}^{\infty} f(x, u)\, du\right),$$
where $T(t) = \int_{-\infty}^{t} e$. In Problem 3, we ask the reader to show that
$$\mathrm{id} - e_* \circ \pi_* = (-1)^{k-1}(dK - Kd) \tag{10.6}$$
on $\Omega_c^k(M \times \mathbb{R})$. It follows that $\mathrm{id} = e_* \circ \pi_*$ on $H_c^k(M \times \mathbb{R})$, and so finally we have the following result:
Theorem 10.22. With notation as above the following maps are isomorphisms and mutual inverses:
$$\pi_* : H_c^k(M \times \mathbb{R}) \to H_c^{k-1}(M), \qquad e_* : H_c^{k-1}(M) \to H_c^k(M \times \mathbb{R}).$$
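Corollary 10.23 (Poincaré lemma for compactly supported cohomology). We have
$$H_c^n(\mathbb{R}^n) = \mathbb{R}, \qquad H_c^k(\mathbb{R}^n) = 0 \text{ if } k \neq n.$$
Proof. By Exercise 10.19 we have $H_c^1(\mathbb{R}) \cong \mathbb{R}$. But from the previous theorem $H_c^n(\mathbb{R}^n) \cong H_c^{n-1}(\mathbb{R}^{n-1}) \cong \cdots \cong H_c^1(\mathbb{R})$. Also, trivially, $H_c^k(\mathbb{R}^n) = 0$ for $k > n$, and if $k < n$, then
$$H_c^k(\mathbb{R}^n) \cong H_c^{k-1}(\mathbb{R}^{n-1}) \cong \cdots \cong H_c^0(\mathbb{R}^{n-k}) = 0$$
by Exercise 10.19. $\square$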
The isomorphism $H_c^n(\mathbb{R}^n) \cong \mathbb{R}$ is given by repeated application of $\pi_*$, which is just integration over the last coordinate of $\mathbb{R}^n$.
Remark: Notice that $\mathbb{R}^n$ is homotopy equivalent to $\mathbb{R}^{n-1}$ and yet $H_c^n(\mathbb{R}^n) \neq H_c^n(\mathbb{R}^{n-1})$. This contrasts the compactly supported cohomology with the ordinary cohomology, since the latter is a homotopy invariant.
10.4. Poincaré Duality

Looking back at our results for $\mathbb{R}^n$ we notice that $H^k(\mathbb{R}^n) \cong H_c^{n-k}(\mathbb{R}^n)^*$. This is a special case of Poincaré duality, which is proved in this section. Recall that we have already met Poincaré duality for compact manifolds in Section 9.6. The version we develop here will be valid for noncompact manifolds also. So let $M$ be an oriented $n$-manifold which is not necessarily compact. For each $k$, we have a bilinear pairing
$$\Omega^k(M) \times \Omega_c^{n-k}(M) \to \mathbb{R}, \qquad (\omega_1, \omega_2) \mapsto \int_M \omega_1 \wedge \omega_2,$$
and as in Section 9.6 this defines a pairing on cohomology:
$$H^k(M) \times H_c^{n-k}(M) \to \mathbb{R}, \qquad ([\omega_1], [\omega_2]) \mapsto \int_M \omega_1 \wedge \omega_2.$$
In turn, this provides a linear map $\mathrm{PD}_k : H^k(M) \to (H_c^{n-k}(M))^*$ defined as
$$\mathrm{PD}_k([\omega_1])([\omega_2]) := \int_M \omega_1 \wedge \omega_2.$$
Our goal is to show that this map is an isomorphism. We prove this for orientable manifolds with a finite good cover, but the result remains valid more generally (see [Madsen]). Notice also that unless the cohomologies are finite-dimensional, we may have $H^k(M)^* \not\cong H_c^{n-k}(M)$. The reason is that for infinite-dimensional vector spaces, $V^* \cong W$ does not imply $V \cong W^*$. In what follows, we will denote all the duality maps $\mathrm{PD}_k$ by the single designation $\mathrm{PD}$. We use a theorem from algebra called the five lemma. Consider the following diagram of vector spaces (or abelian groups) and homomorphisms:

[Diagram: two rows of five spaces each, joined by vertical maps $\alpha, \beta, \gamma, \delta, \epsilon$; the horizontal maps are $h_i$ in the top row and $h_i'$ in the bottom row.]

Lemma 10.24 (Five lemma). Suppose that the above diagram commutes up to sign (so for example $h_1' \circ \alpha = \pm \beta \circ h_1$). Suppose also that the rows are exact and that $\alpha$, $\beta$, $\delta$, and $\epsilon$ are isomorphisms. Then the middle map $\gamma$ is also an isomorphism.
The five lemma is usually stated for the case when the diagram commutes, but a simple modification of the usual proof (see [L2]) gives the version above. If $\{U, V\}$ is an open cover of the oriented $n$-manifold $M$, then consider the following diagram:
$$\begin{array}{ccccccccc}
\cdots \to & H^{k-1}(U) \oplus H^{k-1}(V) & \to & H^{k-1}(U \cap V) & \xrightarrow{\;d^*\;} & H^k(M) & \to & H^k(U) \oplus H^k(V) & \to \cdots \\
 & \downarrow & & \downarrow & & \downarrow & & \downarrow & \\
\cdots \to & H_c^{n-k+1}(U)^* \oplus H_c^{n-k+1}(V)^* & \to & H_c^{n-k+1}(U \cap V)^* & \xrightarrow{\;(d_*)^*\;} & H_c^{n-k}(M)^* & \to & H_c^{n-k}(U)^* \oplus H_c^{n-k}(V)^* & \to \cdots
\end{array}$$
Here the bottom row is the sequence of dual maps from the Mayer-Vietoris long exact sequence and is exact itself (easy check). The vertical maps are the duality maps $\mathrm{PD}$ (or obvious direct sums of such).

Lemma 10.25. The above diagram commutes up to sign.
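Proof. First let us consider the diagram
$$\begin{array}{ccc}
H^k(M) & \xrightarrow{\;j_1^* + j_2^*\;} & H^k(U) \oplus H^k(V) \\
\downarrow{\scriptstyle \mathrm{PD}} & & \downarrow{\scriptstyle \mathrm{PD} \oplus \mathrm{PD}} \\
H_c^{n-k}(M)^* & \xrightarrow{\;(j_{1*})^* + (j_{2*})^*\;} & H_c^{n-k}(U)^* \oplus H_c^{n-k}(V)^*
\end{array}$$
In order to show that $(\mathrm{PD} \oplus \mathrm{PD}) \circ (j_1^* + j_2^*) = ((j_{1*})^* + (j_{2*})^*) \circ \mathrm{PD}$, it is enough to show that $\mathrm{PD} \circ j_1^* = (j_{1*})^* \circ \mathrm{PD}$ and $\mathrm{PD} \circ j_2^* = (j_{2*})^* \circ \mathrm{PD}$. For a given $[\omega] \in H^k(M)$, the linear form $\mathrm{PD} \circ j_1^*([\omega])$ takes an element $[\theta] \in H_c^{n-k}(U)$ to
$$\int_U j_1^*\omega \wedge \theta.$$
On the other hand, $((j_{1*})^* \circ \mathrm{PD})([\omega])$ maps $[\theta]$ to
$$\int_M \omega \wedge j_{1*}\theta.$$
But $\omega \wedge j_{1*}\theta$ and $j_1^*\omega \wedge \theta$ both have support in $U$ and agree on this set. Therefore the above two integrals are equal. Similarly, $\mathrm{PD} \circ j_2^* = (j_{2*})^* \circ \mathrm{PD}$.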
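Next we show that the following square commutes up to sign:
$$\begin{array}{ccc}
H^{k-1}(U \cap V) & \xrightarrow{\;d^*\;} & H^k(M) \\
\downarrow{\scriptstyle \mathrm{PD}} & & \downarrow{\scriptstyle \mathrm{PD}} \\
H_c^{n-k+1}(U \cap V)^* & \xrightarrow{\;(d_*)^*\;} & H_c^{n-k}(M)^*
\end{array}$$
Let $\{\rho_U, \rho_V\}$ be a partition of unity subordinate to $\{U, V\}$ as before. If $[\omega] \in H^{k-1}(U \cap V)$, then $d^*[\omega]$ is represented by a form (which we denote by $d^*\omega$) that has the properties
$$d^*\omega|_U = -d(\rho_V\omega) \text{ on } U, \qquad d^*\omega|_V = d(\rho_U\omega) \text{ on } V.$$
On the other hand, if $[\tau] \in H_c^{n-k}(M)$, then $d_*[\tau]$ is represented by a form $d_*\tau$ which has the properties
$$-j_{1*}d_*\tau = d(\rho_U\tau) \text{ on } U, \qquad j_{2*}d_*\tau = d(\rho_V\tau) \text{ on } V.$$
Using the fact that $\omega$ is closed, we have
$$\mathrm{PD} \circ d^*([\omega])([\tau]) = \int_M d^*\omega \wedge \tau = \int_{U \cap V} d^*\omega \wedge \tau = \int_{U \cap V} d(\rho_U\omega) \wedge \tau = \int_{U \cap V} d\rho_U \wedge \omega \wedge \tau,$$
where we have used the fact that $d^*\omega \wedge \tau$ has support in $U \cap V$. Meanwhile, since $\tau$ is closed,
$$(d_*)^* \circ \mathrm{PD}([\omega])([\tau]) = \int_{U \cap V} \omega \wedge d_*\tau = -\int_{U \cap V} \omega \wedge d(\rho_U\tau) = -\int_{U \cap V} \omega \wedge d\rho_U \wedge \tau,$$
so the two integrals are equal up to sign. We leave the sign commutativity of the remaining square to the reader. $\square$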
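The sign relating the two composites can be made explicit. The following is a sketch, assuming only the two integral expressions obtained in the proof and graded commutativity of the wedge product; here $\omega$ has degree $k-1$ and $d\rho_U$ has degree 1.

```latex
\begin{align*}
\omega \wedge d\rho_U &= (-1)^{k-1}\, d\rho_U \wedge \omega, \\
(d_*)^* \circ \mathrm{PD}([\omega])([\tau])
  &= -\int_{U \cap V} \omega \wedge d\rho_U \wedge \tau
   = (-1)^{k} \int_{U \cap V} d\rho_U \wedge \omega \wedge \tau \\
  &= (-1)^{k}\, \mathrm{PD} \circ d^*([\omega])([\tau]).
\end{align*}
```

So, in this sketch, the square commutes when $k$ is even and anticommutes when $k$ is odd, which is all that the sign version of the five lemma requires.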
Theorem 10.26 (Poincaré duality). Let $M$ be an oriented $n$-manifold with a finite good cover. Then for each $k$ the map
$$\mathrm{PD}_k : H^k(M) \to (H_c^{n-k}(M))^*$$
is an isomorphism.

Proof. We will prove that $\mathrm{PD}_k$ is an isomorphism by induction. The inductive statement $P(N)$ is that $\mathrm{PD}_k$ is an isomorphism for all orientable manifolds which have a good cover consisting of at most $N$ open sets. By the two Poincaré lemmas we already know that if $U$ is diffeomorphic to $\mathbb{R}^n$, then $H^k(U) \cong (H_c^{n-k}(U))^*$, where both sides are either zero or isomorphic to $\mathbb{R}$. It is then an easy exercise to show that in fact $\mathrm{PD}_k : H^k(U) \to (H_c^{n-k}(U))^*$ is an isomorphism. This verifies the statement $P(1)$. Now assume that
$P(N)$ is true and suppose we are given an oriented manifold $M$ with a good cover $\{U_1, \dots, U_{N+1}\}$. Let $M_N := U_1 \cup \cdots \cup U_N$ and use Lemma 10.25 with $U := M_N$, $V := U_{N+1}$ and $U \cup V = M$. Then by assumption $\mathrm{PD}$ is an isomorphism for $U$, $V$ and $U \cap V$ (note that $U \cap V$ has the good cover $\{U_i \cap U_{N+1}\}_{i \le N}$), so the hypotheses of Lemma 10.24 are satisfied. But then the middle homomorphism $H^k(M) \to H_c^{n-k}(M)^*$ is an isomorphism ($k$ was arbitrary). $\square$

Corollary 10.27. If $M$ is a connected oriented $n$-manifold with finite good cover, then $H_c^n(M) \cong \mathbb{R}$. This isomorphism is given by integration over $M$.

This last corollary allows us to define the important concept of degree of a proper map. Let $M$ and $N$ be connected oriented $n$-manifolds and suppose that $f : M \to N$ is a proper map. Then a pull-back of a compactly supported form is compactly supported so that we obtain a map $f^* : H_c^n(N) \to H_c^n(M)$ and the following commutative diagram:
$$\begin{array}{ccc}
H_c^n(N) & \xrightarrow{\;f^*\;} & H_c^n(M) \\
\downarrow{\scriptstyle \int_N} & & \downarrow{\scriptstyle \int_M} \\
\mathbb{R} & \xrightarrow{\;\mathrm{Deg}_f\;} & \mathbb{R}
\end{array}$$
The induced map $\mathrm{Deg}_f : \mathbb{R} \to \mathbb{R}$ must be multiplication by a real number. This number is called the degree of $f$ and is denoted $\deg(f)$. If $[\omega] \in H_c^n(N)$ is chosen so that $\int_N \omega = 1$, then
$$\int_M f^*\omega = \deg(f).$$
Such a form is called a generator and can be chosen to have support in an arbitrary open subset $V$ of $N$. To see this, just choose a chart $(U, x)$ in $N$ with $U \subset \overline{U} \subset V$ and let $\phi$ be a cut-off function with $\operatorname{supp} \phi \subset U$. Then let $\omega := \phi\, dx^1 \wedge \cdots \wedge dx^n$ and then scale $\phi$ so that $\int_N \omega = 1$. Of course, $\omega$ is closed, but also $[\omega] \neq 0$. Indeed, if we had $[\omega] = 0$, then there would be an $(n-1)$-form $\eta$ with compact support and with $\omega = d\eta$. But then Stokes' theorem applies, and so $\int_N \omega = \int_N d\eta = \int_{\partial N} \eta = 0$ since $\partial N = \emptyset$. This would contradict $\int_N \omega = 1$.

Theorem 10.28. Let $M$ and $N$ be connected $n$-manifolds oriented by volume forms $\mu_M$ and $\mu_N$ respectively. If $f : M \to N$ is a proper map and $y \in N$ is a regular value, then
$$\deg(f) = \sum_{x \in f^{-1}(y)} \operatorname{sign}_x f,$$
where $\operatorname{sign}_x f = 1$ if $(f^*\mu_N)(x)$ is a positive multiple of $\mu_M(x)$ and $-1$ otherwise. In particular, $\deg(f)$ is an integer.
Proof. Using Sard's Theorem 2.34 and the fact that $f$ is proper, we may choose a neighborhood $V$ of $y$ such that $f^{-1}(V) = \bigcup_{i=1}^{m} U_i$, where $U_i \cap U_j = \emptyset$ for $i \neq j$ and such that $f|_{U_i}$ is a diffeomorphism onto $V$ for each $i$. Choose $[\omega] \in H_c^n(N)$ with $\int_N \omega = 1$, and such that $\omega$ has support in $V$. We may arrange that $V$ and all $U_i$ are connected so that $\operatorname{sign}_x f$ is constant on each $U_i$. In this case, $f|_{U_i}$ is orientation preserving or reversing according to whether $\operatorname{sign}_x f$ is $1$ or $-1$. Let $x_i \in U_i$ be the inverse image of $y$ in $U_i$. We have
$$\deg(f) = \int_M f^*\omega = \sum_{i=1}^{m} \int_{U_i} f^*\omega = \sum_{i=1}^{m} (\operatorname{sign}_{x_i} f) \int_V \omega = \sum_{x \in f^{-1}(y)} \operatorname{sign}_x f. \qquad \square$$
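As a quick sanity check of the counting formula, here is a sketch for two proper maps $\mathbb{R} \to \mathbb{R}$, with $\mathbb{R}$ oriented by $dx$. The maps $f(x) = x^3$ and $g(x) = x^2$, the regular value $y = 1$, and the generator $\phi(y)\,dy$ are illustrative choices, not taken from the text.

```latex
\begin{align*}
f(x) = x^3:&\quad f^{-1}(1) = \{1\},\quad (f^*dx)\big|_{1} = 3\,dx
   &&\Rightarrow\ \deg(f) = +1, \\
g(x) = x^2:&\quad g^{-1}(1) = \{-1, 1\},\quad (g^*dx)\big|_{\pm 1} = \pm 2\,dx
   &&\Rightarrow\ \deg(g) = (-1) + (+1) = 0.
\end{align*}
% Cross-check for f with an assumed generator \omega = \phi(y)\,dy, \int \phi = 1:
\[
\int_{\mathbb{R}} f^*\omega = \int_{-\infty}^{\infty} \phi(x^3)\,3x^2\,dx
  = \int_{-\infty}^{\infty} \phi(u)\,du = 1 .
\]
```

The value $\deg(g) = 0$ is consistent with the fact that $g$ is not surjective: the regular value $y = -1$ has empty preimage, so the sum over $g^{-1}(-1)$ is empty.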
We close this chapter with a quick explanation of the Poincaré dual of a submanifold. Let $Y$ be an oriented regular submanifold of an oriented $n$-manifold $M$. Let the dimension of $Y$ be $n - k$. Then we obtain a linear map
$$\int_Y : \Omega_c^{n-k}(M) \to \mathbb{R}, \qquad \omega \mapsto \int_Y \iota^*\omega,$$
where $\iota : Y \hookrightarrow M$ is the inclusion map. Since this map is zero on exact forms (by Stokes' theorem), it passes to the quotient giving a linear functional
$$\int_Y : H_c^{n-k}(M) \to \mathbb{R}.$$
By Poincaré duality $\mathrm{PD} : H^k(M) \cong (H_c^{n-k}(M))^*$, there must be a unique class $[\theta_Y] \in H^k(M)$ such that this linear form is given by
$$[\omega] \mapsto \int_M \theta_Y \wedge \omega \quad \text{for all } [\omega] \in H_c^{n-k}(M).$$
This class (or a given representative $\theta_Y$) is called the Poincaré dual of the submanifold $Y$. For more on this and other important topics such as the Thom isomorphism, the Leray-Hirsch theorem, and Čech cohomology see [Bo-Tu].
Problems

(1) Show that formula (10.1) defines a smooth function $\theta$ on the open set $U := \mathbb{R}^2 \setminus \{x \le 0,\ y = 0\}$ and show that $d\theta = \dfrac{x\,dy - y\,dx}{x^2 + y^2}$ on $U$.

(2) Show that $\pi_* : \Omega_c^k(M \times \mathbb{R}) \to \Omega_c^{k-1}(M)$ as defined in the discussion leading to Theorem 10.22 is a chain map.

(3) Prove the homotopy formula (10.6) (or see [Bo-Tu], pages 38-39).

(4) Prove exactness of (10.4).

(5) Prove Corollary 10.11.

(6) Use Exercise 10.14 and the long Mayer-Vietoris sequence to show that
$$H^k(S^n) = \begin{cases} \mathbb{R} & \text{if } k = 0 \text{ or } n, \\ 0 & \text{otherwise.} \end{cases}$$

(7) Show that $\omega \in \Omega^n(S^n)$ is exact if and only if $\int_{S^n} \omega = 0$.

(8) Determine the cohomology spaces for the punctured Euclidean spaces $\mathbb{R}^n \setminus \{0\}$. Show that if $B$ is a ball, then $\mathbb{R}^n \setminus B$ has the same cohomology.

(9) Let $B_x$ be a ball centered at $x \in \mathbb{R}^n$. Show that a closed $(n-1)$-form $\omega$ on $\mathbb{R}^n \setminus B_x$ is exact if and only if $\int_S \iota^*\omega = 0$ for some sphere $S$ centered at $x$. Here $\iota : S \hookrightarrow \mathbb{R}^n$ is the inclusion map. Show that this is then true for all spheres centered at $x$.

(10) Let $n \ge 2$ and suppose that $\omega$ is a compactly supported $n$-form on $\mathbb{R}^n$ such that $\int_{\mathbb{R}^n} \omega = 0$.
(a) Let $B_1$ and $B_2$ be open balls centered at the origin of $\mathbb{R}^n$ with $B_1 \subset \overline{B}_1 \subset B_2$ and $\operatorname{supp} \omega \subset B_1$. By the (first) Poincaré lemma there is a form $\alpha$ such that $d\alpha = \omega$. Show that $\int_{\partial B_2} \alpha = 0$.
(b) Continuing from (a), show that $\alpha$ is exact on $\mathbb{R}^n \setminus \overline{B}_1$ (use Problem 9 above).
(c) Let $\rho \in \Omega^{n-2}(\mathbb{R}^n \setminus \overline{B}_1)$ be such that $d\rho = \alpha$ where $\alpha$ is as above. Show that there is a function $g$ such that if $\beta := \alpha - d(g\rho)$, then $\beta$ is smooth, compactly supported and $d\beta = \omega$. Deduce that integration induces an isomorphism $H_c^n(\mathbb{R}^n) \cong \mathbb{R}$.

(11) Find $H^*(M)$ where $M$ is $\mathbb{R}^n \setminus \{p, q\}$ for some distinct points $p, q \in \mathbb{R}^n$.
Chapter 11

Distributions and Frobenius' Theorem

The theory of distributions is a geometric formulation of the classical theory of certain systems of partial differential equations. The solutions are immersed submanifolds called integral manifolds. The Frobenius theorem gives necessary and sufficient conditions for the existence of such integral manifolds. One of the most important applications is to show that a subalgebra of the Lie algebra of a Lie group corresponds to a Lie subgroup. We give this application near the end of this chapter. We also develop a classic version of the Frobenius theorem and use it to prove the basic existence theorem for surfaces.

One can view the theory studied in this chapter as a higher-dimensional analogue of the study of vector fields and integral curves. If $X$ is a smooth vector field on an $n$-manifold $M$, then we know that integral curves exist through each point, and if $X$ never vanishes, then the integral curves are immersions. Nonvanishing vector fields do not even exist on many manifolds, and so we consider a somewhat different question. A one-dimensional distribution assigns to each $p \in M$ a one-dimensional subspace $E_p$ of the tangent space $T_pM$. We say that the assignment is smooth if for each $p \in M$ there is a smooth vector field $X$ defined in an open set $U$ containing $p$ such that $X(q)$ spans $E_q$ for each $q \in U$. It is easy to see that a one-dimensional distribution is essentially the same thing as a rank one subbundle of the tangent bundle. A curve $c : (a, b) \to M$ with $\dot{c} \neq 0$ is called an integral curve of the one-dimensional distribution if $\dot{c}(t) = T_t c \cdot \frac{\partial}{\partial t}\big|_t$ is contained in $E_{c(t)}$ for each $t \in (a, b)$. The curve is an immersion, and its restrictions to small enough subintervals are integral curves of vector fields that locally span the distribution. Since integral curves cannot cross themselves or each other, the curve is an injective immersion. The images of such curves are called integral manifolds (they are immersed submanifolds). Of course, every nonvanishing global vector field defines a one-dimensional distribution, but some one-dimensional distributions do not arise in this way. For example, the tangent spaces to the fibers of the Möbius band bundle $MB \to S^1$ form a distribution not spanned by any globally defined vector field (see Example 6.9). The integral manifolds are the fibers themselves.

Figure 11.1. No integral manifold through the origin
11.1. Definitions

We wish to generalize the above idea to higher-dimensional distributions.

Definition 11.1. A smooth rank $k$ distribution on an $n$-manifold $M$ is a (smooth) rank $k$ vector subbundle $E \to M$ of the tangent bundle.

Sometimes we refer to a distribution as a tangent distribution to distinguish it from the notion of a distributional function on a manifold. Recalling the definition of a vector subbundle (Definition 6.26) and the criteria given in Exercise 6.27, we see that a smooth rank $k$ distribution on an $n$-manifold $M$ gives a $k$-dimensional subspace $E_p \subset T_pM$ for each $p \in M$ such that for each fixed $p \in M$ there is a family of smooth vector fields $X_1, \dots, X_k$ defined on some neighborhood $U_p$ of $p$ and such that $X_1(q), \dots, X_k(q)$ are linearly independent and span $E_q$ for each $q \in U_p$. We say that such a set of vector fields locally spans the distribution. In other words, $X_1, \dots, X_k$ can be viewed as a local frame field for the subbundle $E$.

Consider the punctured 3-space $M = \mathbb{R}^3 \setminus \{0\}$. The level sets of the function $c : (x, y, z) \mapsto x^2 + y^2 + z^2$ are spheres whose union is all of $\mathbb{R}^3 \setminus \{0\}$. Now define a distribution on $M$ by the rule that $E_p$ is the tangent space at $p$ to the sphere containing $p$. Dually, we can define this distribution to be the one given by the rule
$$E_p = \{v \in T_pM : dc(v) = 0\}.$$
Thus we see that a distribution of rank $n-1$ can arise as the family of tangent spaces to the level sets of nondegenerate smooth functions. More generally, rank $k$ distributions can arise from the level sets of sufficiently nondegenerate $\mathbb{R}^{n-k}$-valued functions. On the other hand, not all distributions arise in this way, even locally.

Definition 11.2. Let $E \to M$ be a distribution. An immersed submanifold $S$ of $M$ is called an integral manifold (or integral submanifold) for the distribution if for each $p \in S$ we have $T\iota(T_pS) = E_p$, where $\iota : S \hookrightarrow M$ is the inclusion map. The distribution is called integrable if each point of $M$ is contained in an integral manifold of the distribution.$^1$

If $\iota : S \hookrightarrow M$ is the inclusion map, then we usually identify $T\iota(T_pS)$ with $T_pS$ so the condition which makes $S$ an integral manifold is stated simply as $E_p = T_pS$ for all $p \in S$. Consider the distribution in $\mathbb{R}^3$ suggested by Figure 11.1. For each $p$, $E_p$ is the subspace of $T_p\mathbb{R}^3$ orthogonal to
$$y \frac{\partial}{\partial x}\bigg|_p - x \frac{\partial}{\partial y}\bigg|_p + \frac{\partial}{\partial z}\bigg|_p.$$
If there were an integral manifold through the origin, then it would clearly be tangent to the $x,y$ plane at the origin. But if one tries to imagine following a closed path on such an integral manifold around the $z$ axis, a problem appears. Namely, the curve would have to be tangent to the distribution, but this would clearly force the curve to spiral upward and never return to the same point.

A distribution as defined above is sometimes called a regular distribution to distinguish the concept from that of a singular distribution, which is defined in a similar manner but allows for the dimension of the subspaces to vary from point to point. Singular distributions are studied in the online supplement [Lee, Jeff].
Definition 11.3. Let $E \to M$ be a distribution on an $n$-manifold $M$ and let $X$ be a vector field defined on an open subset $U \subset M$. We say that $X$ lies in the distribution if $X(p) \in E_p$ for each $p$ in the domain of $X$; that is, if $X$ takes values in $E$. In this case, we sometimes write $X \in E$ (a slight abuse of notation).

$^1$ In other contexts, integral manifold might be defined by the weaker condition $T\iota(T_pS) \subset E_p$. We will always require the stronger condition $T\iota(T_pS) = E_p$.
Definition 11.4. If for every pair of locally defined vector fields $X$ and $Y$ with common domain that lie in a distribution $E \to M$, the bracket $[X, Y]$ also lies in the distribution, then we say that the distribution is involutive.

It is easy to see that a vector field $X$ lies in a distribution if and only if for any spanning local frame field $X_1, \dots, X_k$ we have $X = \sum_{i=1}^{k} f^i X_i$ for smooth functions $f^i$ defined on the common domain of $X, X_1, \dots, X_k$.
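To see involutivity fail concretely, consider the distribution of Figure 11.1, namely the planes orthogonal to $y\,\partial_x - x\,\partial_y + \partial_z$. The frame $X$, $Y$ and the 1-form $\alpha$ below are one convenient choice made for this sketch (they are not named in the text); the computation suggests that the bracket leaves the distribution at every point.

```latex
% Assumed notation: \alpha is the 1-form dual to the normal field, so E = \ker\alpha.
\begin{align*}
\alpha &= y\,dx - x\,dy + dz, \\
X &= \frac{\partial}{\partial x} - y\,\frac{\partial}{\partial z}, \qquad
Y = \frac{\partial}{\partial y} + x\,\frac{\partial}{\partial z}, \qquad
\alpha(X) = \alpha(Y) = 0, \\
[X, Y] &= X(x)\,\frac{\partial}{\partial z} - Y(-y)\,\frac{\partial}{\partial z}
        = 2\,\frac{\partial}{\partial z}, \qquad
\alpha([X, Y]) = 2 \neq 0 .
\end{align*}
```

Since $[X, Y]$ is not annihilated by $\alpha$, it does not lie in $E$, so under these assumptions the distribution is not involutive.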
Locally, finding integral curves of vector fields is the same as solving a system of ordinary differential equations. The local model for finding integral manifolds of a distribution is that of solving a system of partial differential equations. Consider the following system of partial differential equations defined on some neighborhood $U$ of the origin in $\mathbb{R}^2$:
$$z_x = F(x, y, z), \qquad z_y = G(x, y, z). \tag{11.1}$$
Here the functions $F$ and $G$ are assumed to be smooth and defined on $U \times \mathbb{R}$. We might attempt to find a solution defined near the origin by integrating twice. Suppose we want a solution with $z(0, 0) = z_0$. First, solve $f'(x) = F(x, 0, f(x))$ with $f(0) = z_0$. Then for each fixed $x$ in the domain of $f$ solve $\partial_y z(x, y) = G(x, y, z(x, y))$ with $z(x, 0) = f(x)$. This function $z(x, y)$ may not be a $C^2$ solution of our system of partial differential equations since, if it were, we would have $z_{xy} = z_{yx}$. But
$$z_{xy} = (z_x)_y = \frac{\partial}{\partial y} F(x, y, z) = F_y + G F_z,$$
and similarly, $z_{yx} = G_x + F G_z$. Thus the equality of mixed partials gives a necessary integrability condition:
$$F_y + G F_z = G_x + F G_z. \tag{11.2}$$
We shall see that this condition is also sufficient for the local solvability of our system.

Exercise 11.5. (a) Show that solving the differential equations above is equivalent to finding the integral curves of the following pair of vector fields:
$$X = \frac{\partial}{\partial x} + F(x, y, z) \frac{\partial}{\partial z} \quad \text{and} \quad Y = \frac{\partial}{\partial y} + G(x, y, z) \frac{\partial}{\partial z}.$$
(b) Show that the graph of a solution to the system (11.1) is an integral manifold of the distribution spanned by $X$ and $Y$. (c) Show that $[X, Y] = 0$ is the same condition as (11.2).

There are no integrability conditions for finding integral curves of a vector field. But now we see that we should expect integrability conditions for the existence of integral manifolds of higher rank distributions.
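Two small instances may help make the integrability condition concrete. The sketch below uses the condition $F_y + G F_z = G_x + F G_z$ obtained by equating the mixed partials $z_{xy}$ and $z_{yx}$; the particular choices of $F$ and $G$ are illustrative and do not come from the text.

```latex
% Case 1 (compatible): F = z, G = z.
\begin{align*}
F_y + G F_z &= 0 + z \cdot 1 = z, & G_x + F G_z &= 0 + z \cdot 1 = z,
\end{align*}
% The condition holds; indeed z(x,y) = z_0 e^{x+y} solves z_x = z, z_y = z, z(0,0) = z_0.

% Case 2 (incompatible): F = y, G = 0.
\begin{align*}
F_y + G F_z &= 1 + 0 = 1, & G_x + F G_z &= 0 + 0 = 0,
\end{align*}
% The condition fails; z_y = 0 forces z = z(x), which contradicts z_x = y.
```

In the second case the two-step integration described above still produces a function, but it cannot satisfy both equations at once.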
11.2. The Local Frobenius Theorem

We will relate the notion of involutivity to the existence of integral manifolds.

Lemma 11.6. Let $E \to M$ be a distribution (tangent subbundle) on an $n$-manifold $M$. Then $E \to M$ is involutive if and only if for every local frame field $X_1, \dots, X_k$ for the subbundle $E \to M$ we have
$$[X_i, X_j] = \sum_{s=1}^{k} c_{ij}^s X_s$$
for some smooth functions $c_{ij}^s$ ($1 \le i, j, s \le k$).

Proof. It is clear that such functions exist if $E$ is involutive. Now, suppose that such functions exist. If $X$ and $Y$ lie in the distribution, then
$$X = \sum_{i=1}^{k} f^i X_i, \qquad Y = \sum_{j=1}^{k} g^j X_j$$
for smooth functions $f^i$ and $g^j$. We wish to show that $[X, Y]$ lies in the distribution. This follows by direct calculation:
$$[X, Y] = \Big[\sum_i f^i X_i, \sum_j g^j X_j\Big] = \sum_{i,j} f^i g^j [X_i, X_j] + \sum_{i,j} f^i (X_i g^j) X_j - \sum_{i,j} g^j (X_j f^i) X_i. \qquad \square$$
We will show that a distribution is involutive if and only if it is integrable. In fact, we shall see that the integral manifolds fit together nicely. In the following discussion we identify $\mathbb{R}^k \times \mathbb{R}^{n-k}$ with $\mathbb{R}^n$.

Definition 11.7. A rank $k$ distribution $E \to M$ on an $n$-manifold $M$ is called completely integrable if for each $p \in M$ there is a chart $(U, x)$ centered at $p$ such that $x(U) = V \times W \subset \mathbb{R}^k \times \mathbb{R}^{n-k}$ and such that each set of the form $x^{-1}(V \times \{y\})$ for $y \in W$ is an integral manifold of the distribution.

If $x(p) = (a^1, \dots, a^n)$, then
$$x^{-1}(V \times \{y\}) = \{q \in U : x^{k+1}(q) = a^{k+1}, \dots, x^n(q) = a^n\},$$
where $y = (a^{k+1}, \dots, a^n)$. For a chart as in the last definition we have
$$T\big(x^{-1}(V \times \{y\})\big) = E\big|_{x^{-1}(V \times \{y\})}$$
for each $y \in W$. We can always arrange that $V$ and $W$ in the above definition are connected (say open cubes or open balls). In this case, we refer to such a chart as a distinguished chart for the distribution. The integral manifolds of the form $x^{-1}(V \times \{y\})$ for a distinguished chart are called plaques. Notice that since $V$ and $W$ are assumed path connected for a distinguished chart, the plaques are path connected.

The reader was asked to prove the following result in Problem 4 of Chapter 3, but we will state and prove the result here for the sake of completeness.

Lemma 11.8. Suppose that $f : N \to M$ is a one-to-one immersion and that $Y$ is a vector field on $M$ such that $Y(f(p)) \in T_pf(T_pN)$ for every $p \in N$. Then there is a unique vector field $X$ on $N$ that is $f$-related to $Y$.

Proof. It is clear that we can define a field $X$ such that $X(p)$ is the unique element of $T_pN$ with $Y(f(p)) = T_pf \cdot X(p)$. The task is to show that $X$ so defined is smooth. For $p \in N$, let $(U, x)$ be a chart centered at $p$ and $(V, y)$ a chart centered at $f(p)$ such that $y \circ f \circ x^{-1}$ has the form
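$$(a^1, \dots, a^n) \mapsto (a^1, \dots, a^n, 0, \dots, 0).$$
Then it is easy to check that
$$T_pf \cdot \frac{\partial}{\partial x^i}\bigg|_p = \frac{\partial}{\partial y^i}\bigg|_{f(p)}.$$
So if $Y = \sum Y^i \frac{\partial}{\partial y^i}$ for smooth functions $Y^i$, then
$$X = \sum X^i \frac{\partial}{\partial x^i},$$
where we must have $X^i = Y^i \circ f$. This shows that the $X^i$ are smooth and hence the field $X$ is smooth. $\square$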
We apply the previous lemma to the situation where $S$ is an immersed submanifold of $M$ and $\iota : S \hookrightarrow M$ is the inclusion. In this case, if $Y$ is a vector field on $M$ which is tangent to $S$, then there is a unique smooth vector field on $S$, denoted $Y|_S$, such that $Y|_S(q) = Y(q)$ for all $q \in S$. The vector field $Y|_S$ is called the restriction of $Y$ to $S$ and it is $\iota$-related to $Y$.

Theorem 11.9 (Local Frobenius theorem). A distribution is involutive if and only if it is completely integrable and if and only if it is integrable.
Proof. First, completely integrable clearly implies integrable. Now assume that $E \to M$ is a rank $k$ distribution which is integrable and let $\dim(M) = n$. Let $X$ and $Y$ be smooth vector fields defined on $U$ which both lie in the distribution. Let $p \in U$ and let $S$ be an integral manifold that contains $p$. Then $X_q, Y_q \in T_qS$ for all $q \in U \cap S$. Thus the vector fields are tangent to $U \cap S$ and hence if $\iota : U \cap S \hookrightarrow U$ is the inclusion map, then $X$ and $Y$ are $\iota$-related to their restrictions to $U \cap S$ as in the previous lemma. This means that $[X|_{U \cap S}, Y|_{U \cap S}]$ is $\iota$-related to $[X, Y]$ and hence $[X, Y]_q \in T_qS = E_q$ for all $q \in U \cap S$ and in particular $[X, Y]_p \in E_p$. Since $p$ was an arbitrary element of $U$, we conclude that the distribution is involutive.

Now suppose that $E \to M$ is involutive. Pick a point $p_0 \in M$ and let $(U, x)$ be a chart centered at $p_0$. By shrinking $U$ if necessary we can find fields $X_1, \dots, X_k$ defined on $U$ that span the distribution. Notice that if we have a chart of the type produced by Theorem 2.113, then we can easily arrange that the image of the chart is of the form $V \times W \subset \mathbb{R}^k \times \mathbb{R}^{n-k}$, and it is then easy to see that we have exactly the kind of chart needed (see the comments immediately following the proof of Theorem 2.113). Thus the strategy is to replace the frame field $X_1, \dots, X_k$ by a commuting frame field. Construct a chart $(U, x)$ centered at $p_0$ such that $X_j(p_0) = \frac{\partial}{\partial x^j}\big|_{p_0}$ for $1 \le j \le k$. Let $\pi : \mathbb{R}^n \to \mathbb{R}^k$ be the map $(a^1, \dots, a^n) \mapsto (a^1, \dots, a^k)$ and let $f := \pi \circ x$. We have
$$(T_{p_0}f)(X_j(p_0)) = T_{p_0}f\, \frac{\partial}{\partial x^j}\bigg|_{p_0} = \frac{\partial}{\partial t^j}\bigg|_0,$$
where $(t^1, \dots, t^k)$ are standard coordinates on $\mathbb{R}^k$. Then $T_{p_0}f : T_{p_0}M \to T_0\mathbb{R}^k$ maps $E_{p_0}$ isomorphically onto $T_0\mathbb{R}^k$. It follows that $T_pf$ maps $E_p$ isomorphically onto $T_{f(p)}\mathbb{R}^k$ for all $p$ in some neighborhood of $p_0$. Hence, for each such $p$, there are unique vectors $Y_1(p), \dots, Y_k(p)$ in $E_p$ with $T_pf(Y_j(p)) = \frac{\partial}{\partial t^j}\big|_{f(p)}$. Thus we get vector fields $Y_1, \dots, Y_k$ whose smoothness we leave to the reader to check. By shrinking $U$ if necessary, we may assume that these are defined on $U$. It is clear that $Y_1, \dots, Y_k$ form a local frame for $E$ and $Y_j$ is $f$-related to $\frac{\partial}{\partial t^j}$. Thus $[Y_i, Y_j]$ is $f$-related to $\big[\frac{\partial}{\partial t^i}, \frac{\partial}{\partial t^j}\big]$ and in particular
$$T_pf\big([Y_i, Y_j]_p\big) = \Big[\frac{\partial}{\partial t^i}, \frac{\partial}{\partial t^j}\Big]_{f(p)} = 0.$$
Since the distribution $E$ is involutive, we have $[Y_i, Y_j]_p \in E_p$ for each $p$. Then since $T_pf$ is injective on $E_p$, we see that $[Y_i, Y_j]_p = 0$ for all $p$. Thus we arrive at a local frame field $Y_1, \dots, Y_k$ for $E$ with $[Y_i, Y_j] = 0$. By Theorem 2.113 and the comments above, we are done. $\square$
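The replacement of a spanning frame by a commuting one can be seen in a toy case. In this sketch the distribution on $\mathbb{R}^3$ and the frame $X_1, X_2$ are hypothetical choices, and the chart is the identity chart centered at the origin, so that $f = \pi \circ x$ is the map $(x, y, z) \mapsto (x, y)$.

```latex
\begin{align*}
X_1 &= \frac{\partial}{\partial x}, \qquad
X_2 = e^{x}\,\frac{\partial}{\partial y}, \qquad
[X_1, X_2] = e^{x}\,\frac{\partial}{\partial y} = X_2 \in E, \\
% Y_j is the unique field in E with Tf(Y_j) = \partial/\partial t^j:
Y_1 &= X_1 = \frac{\partial}{\partial x}, \qquad
Y_2 = e^{-x} X_2 = \frac{\partial}{\partial y}, \qquad
[Y_1, Y_2] = 0 .
\end{align*}
```

The original frame is involutive but does not commute, while the rescaled frame does; in this example the plaques are the planes $z = \text{const}$.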
11.3. Differential Forms and Integrability

One can use differential forms to effectively discuss distributions and involutivity. What is interesting is the way the exterior derivative comes into play. The first thing to notice is that a smooth distribution is locally defined by 1-forms. We need a bit of creative terminology:

Definition 11.10. If $\alpha^1, \dots, \alpha^j$ are 1-forms on an $n$-manifold $M$ and $S$ is a subset of $M$, then we say that $\alpha^1, \dots, \alpha^j$ (defined at least on some open set containing $S$) are independent on $S$ if $\{\alpha^1(p), \dots, \alpha^j(p)\}$ is an independent set in $T_p^*M$ for each $p \in S$.

Proposition 11.11. Let $E \to M$ be a rank $k$ distribution on an $n$-manifold $M$. Every $p \in M$ has an open neighborhood $U$ and 1-forms $\alpha^1, \dots, \alpha^{n-k}$ independent on $U$ such that
$$E_q = \operatorname{Ker} \alpha^1(q) \cap \cdots \cap \operatorname{Ker} \alpha^{n-k}(q) \quad \text{for all } q \in U.$$

Proof. Let $p \in M$ and let $X_1, \dots, X_k$ span $E$ on a neighborhood $O$ of $p$. We may extend to a frame field $X_1, \dots, X_k, \dots, X_n$ defined on some possibly smaller neighborhood $U$ of $p$. Indeed, choose any other frame field $Y_1, \dots, Y_n$ and then rearrange indices so that $X_1(p), \dots, X_k(p), Y_{k+1}(p), \dots, Y_n(p)$ are linearly independent. The vectors $X_1(q), \dots, X_k(q), Y_{k+1}(q), \dots, Y_n(q)$ must be independent for all $q$ in some neighborhood of $p$, which we may as well take to be $O$. Indeed, $q \mapsto X_1(q) \wedge \cdots \wedge X_k(q) \wedge Y_{k+1}(q) \wedge \cdots \wedge Y_n(q)$ is nonzero at $p$ and so remains nonzero in a neighborhood. In any case, let $X_1, \dots, X_k, \dots, X_n$ be our extended local frame and let $\theta^1, \dots, \theta^n$ be the dual frame. Then it is easy to check that $v \in T_qM$ is in $E_q$ if and only if $\theta_q^{k+1}(v) = \cdots = \theta_q^n(v) = 0$. Thus we just let $(\alpha^1, \dots, \alpha^{n-k}) := (\theta^{k+1}, \dots, \theta^n)$. $\square$

The forms described in the previous proposition are sometimes called local defining forms for the distribution. The set of all differential forms vanishing on a distribution has a convenient algebraic structure. Recall that $\Omega(M)$ is a ring under the wedge product. If $\mathcal{J}$ is an ideal in $\Omega(M)$ and $U \subset M$ is open, then $\mathcal{J}|_U$ is the set of restrictions to $U$ of elements of $\mathcal{J}$ and is an ideal in $\Omega(U)$ since the wedge product is natural with respect to restrictions (that is, with respect to pull-back by inclusion maps).

Definition 11.12. Let $\mathcal{J}$ be an ideal in $\Omega(M)$; we say that $\mathcal{J}$ is locally generated by $n-k$ smooth 1-forms if there is an open cover of $M$ such that for each open $U$ from the cover, there are 1-forms $\alpha^1, \dots, \alpha^{n-k}$ independent on $U$ such that $\mathcal{J}|_U$ is an ideal of $\Omega(U)$ generated by $\alpha^1, \dots, \alpha^{n-k}$.

Exercise 11.13. Show that if $\mathcal{J}$ is locally generated by $n-k$ smooth 1-forms as above, then it defines a unique smooth rank $k$ distribution $E$ on $M$ such that for each $U$ in the cover, and corresponding $\alpha^1, \dots, \alpha^{n-k}$, we have
$$E_q = \operatorname{Ker} \alpha^1(q) \cap \cdots \cap \operatorname{Ker} \alpha^{n-k}(q) \quad \text{for all } q \in U.$$
Definition 11.14. Let $E \to M$ be a rank $k$ distribution on an $n$-manifold $M$. Let $U$ be open in $M$. A $j$-form $\omega \in \Omega^j(U)$ is said to annihilate the distribution on $U$ if for each $p \in U$
$$\omega_p(v_1, \dots, v_j) = 0 \quad \text{whenever } v_1, \dots, v_j \in E_p.$$
An element $\omega \in \Omega(U)$ is said to annihilate the distribution if each homogeneous component of $\omega$ annihilates the distribution. The subset of all elements of $\Omega(U)$ that annihilate the distribution $E \to M$ is denoted $\mathcal{I}(U)$. It is easy to check that $\mathcal{I}(U)$ is an ideal in $\Omega(U)$. Clearly, local defining forms for a distribution annihilate the distribution.
Lemma 11.15. The following conditions on a distribution $E \to M$ are equivalent:

(i) $E \to M$ is an involutive distribution.

(ii) For every open $U \subset M$, and 1-form $\omega \in \Omega^1(U)$ that annihilates the distribution, the form $d\omega$ also annihilates the distribution on $U$.
Proof. The proof that these two conditions are equivalent follows easily from the formula
$$d\omega(X, Y) = X(\omega(Y)) - Y(\omega(X)) - \omega([X, Y]). \tag{11.3}$$
Suppose that (i) holds. If $\omega$ annihilates $E$ and $X, Y$ lie in $E$, then the above formula becomes $d\omega(X, Y) = -\omega([X, Y])$, which shows that $d\omega$ vanishes on $E$ since $[X, Y] \in E$ by condition (i). Conversely, suppose that (ii) holds and that $X, Y \in E$. Let $\alpha^1, \dots, \alpha^{n-k}$ be local defining 1-forms. Then each form $d\alpha^i$ annihilates the distribution and also $\alpha^j(X) = \alpha^j(Y) = 0$ for all $j$. Using equation (11.3) again, we have $\alpha^i([X, Y]) = -d\alpha^i(X, Y) = 0$ for all $i$ so that $[X, Y] \in E$. $\square$

Lemma 11.16. Let $E \to M$ be a rank $k$ distribution on the $n$-manifold $M$. A $j$-form $\omega$ annihilates $E$ if and only if, whenever $\alpha^1, \dots, \alpha^{n-k}$ are local defining 1-forms on some open $U$, we have
$$\omega|_U = \sum_{i=1}^{n-k} \alpha^i \wedge \beta^i \tag{11.4}$$
for some $\beta^1, \dots, \beta^{n-k} \in \Omega^{j-1}(U)$.
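Proof. Recalling the definition of exterior product makes it clear that if $\omega$ satisfies (11.4), then it annihilates $E$ on $U$. If $\omega$ satisfies such an equation near each point for some generating 1-forms, then $\omega$ annihilates $E$ on all of $M$.

Suppose that $\alpha^1, \dots, \alpha^{n-k}$ are local defining forms for the distribution defined on an open set $U$ and let $p \in U$. Then on some neighborhood $V$ of $p$ we can extend the coframe field $\alpha^1, \dots, \alpha^{n-k}$ to $\alpha^1, \dots, \alpha^n$ in such a way that $\alpha^1, \dots, \alpha^n$ is independent on $V$. Define $X_1, \dots, X_n$ as dual to the frame $\alpha^1|_V, \dots, \alpha^n|_V$ so that $X_{n-k+1}, \dots, X_n$ span $E$ on $V$. Now for any $\omega \in \Omega^j(U)$, we have
$$\omega|_V = \sum_{i_1 < \cdots < i_j} \omega_{i_1 \cdots i_j}\, \alpha^{i_1}|_V \wedge \cdots \wedge \alpha^{i_j}|_V,$$
where $\omega_{i_1 \cdots i_j} = \omega(X_{i_1}, \dots, X_{i_j})$ on $V$. Thus $\omega$ annihilates $E$ on $V$ if and only if $\omega(X_{i_1}, \dots, X_{i_j}) = 0$ on $V$ whenever $n - k + 1 \le i_1 < \cdots < i_j \le n$. So if $\omega$ annihilates $E$, then in the expansion of $\omega$ we need only include those $\omega_{i_1 \cdots i_j}$ such that $i_1 < \cdots < i_j$ and such that at least one index is less than or equal to $n - k$. But in the latter case we would have $i_1 \le n - k$. Thus we can write
$$\omega|_V = \sum_{\substack{i_1 < \cdots < i_j \\ i_1 \le n-k}} \omega_{i_1 \cdots i_j}\, \alpha^{i_1}|_V \wedge \cdots \wedge \alpha^{i_j}|_V = \sum_{i=1}^{n-k} \alpha^i|_V \wedge \Big( \sum_{i < i_2 < \cdots < i_j} \omega_{i\, i_2 \cdots i_j}\, \alpha^{i_2}|_V \wedge \cdots \wedge \alpha^{i_j}|_V \Big) =: \sum_{i=1}^{n-k} \alpha^i|_V \wedge \beta_V^i.$$
This expression holds on $V \subset U$, but there is a cover of $U$ by such $V$ on which similar expressions hold. Denote this cover by $\mathcal{V}$. By using a partition of unity $\{\phi_V : V \in \mathcal{V}\}$ subordinate to this cover we can patch these expressions together. Notice that for any $V \in \mathcal{V}$ we have $\phi_V (\alpha^i|_V) = \phi_V \alpha^i$ for $1 \le i \le n - k$, so
$$\omega|_U = \sum_{V \in \mathcal{V}} \phi_V\, \omega|_V = \sum_{V \in \mathcal{V}} \phi_V \sum_{i=1}^{n-k} \alpha^i|_V \wedge \beta_V^i = \sum_{V \in \mathcal{V}} \sum_{i=1}^{n-k} (\phi_V \alpha^i|_V) \wedge \beta_V^i = \sum_{i=1}^{n-k} \alpha^i \wedge \sum_{V \in \mathcal{V}} \phi_V \beta_V^i = \sum_{i=1}^{n-k} \alpha^i \wedge \beta^i,$$
where $\beta^i := \sum_{V \in \mathcal{V}} \phi_V \beta_V^i$. $\square$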
Corollary 11.17. A distribution $E \to M$ is involutive if and only if for every local defining set of 1-forms $\alpha^1, \dots, \alpha^{n-k}$ we have
$$d\alpha^i = \sum_{j=1}^{n-k} \alpha^j \wedge \gamma_j^i$$
for some 1-forms $\gamma_j^i$ with $1 \le i, j \le n - k$.

Proof. Just use the above lemma together with Lemma 11.15. $\square$
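The criterion is easy to test on rank 2 distributions in $\mathbb{R}^3$, which are cut out by a single 1-form $\alpha$ (so $n - k = 1$). In the sketch below the two forms are illustrative choices, and the test $\alpha \wedge d\alpha = 0$ is used as a consequence of $d\alpha = \alpha \wedge \gamma$, since $\alpha \wedge \alpha = 0$.

```latex
% Involutive example (assumed): \alpha = dz - z\,dx.
\begin{align*}
d\alpha &= -\,dz \wedge dx = (dz - z\,dx) \wedge (-dx) = \alpha \wedge \gamma,
  \qquad \gamma = -dx .
\end{align*}
% Non-involutive example (assumed): \alpha = y\,dx - x\,dy + dz.
\begin{align*}
d\alpha &= -2\,dx \wedge dy, \qquad
\alpha \wedge d\alpha = -2\,dx \wedge dy \wedge dz \neq 0 .
\end{align*}
% In the second case d\alpha cannot be written as \alpha \wedge \gamma.
```

In the first example the integral manifolds are the surfaces $z = C e^{x}$; the second example is the distribution suggested by Figure 11.1.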
Theorem 11.18. A distribution $E \to M$ on $M$ is involutive (and hence completely integrable) if and only if $d\mathcal{I}(E) \subset \mathcal{I}(E)$.

The condition $d\mathcal{I}(E) \subset \mathcal{I}(E)$ is expressed by saying that $\mathcal{I}(E)$ is a differential ideal.

Proof. Suppose that $d\mathcal{I}(E) \subset \mathcal{I}(E)$. If $\omega$ is any 1-form that annihilates $E$ on an open set $U$, then pick an arbitrary point $p \in U$ and a smooth cut-off function $\phi$ with support in $U$ and equal to one on a neighborhood of $p$. Then $\phi\omega$ extends by zero to an element of $\mathcal{I}(E)$, so $d(\phi\omega) \in \mathcal{I}(E)$. But $d(\phi\omega)(p) = d\omega(p)$, and since $p$ was arbitrary, we see that $d\omega$ annihilates $E$ on $U$. The open set $U$ was also arbitrary so we can apply Lemma 11.15.

Now suppose that $E \to M$ is involutive and choose $\eta \in \mathcal{I}(E)$. Without loss of generality we may assume that $\eta$ is homogeneous of degree $j$. Notice that the case $j = 0$ is trivial, while the case $j = 1$ is Lemma 11.15. For $j \ge 2$, consider a local defining set of 1-forms $\alpha^1, \dots, \alpha^{n-k}$ on an open set $U$. Since by assumption $\eta$ annihilates $E$, we can use Lemma 11.16 and Corollary 11.17 to write
$$d\eta|_U = d\Big(\sum_{i=1}^{n-k} \alpha^i \wedge \beta^i\Big) = \sum_{i=1}^{n-k} d\alpha^i \wedge \beta^i - \sum_{i=1}^{n-k} \alpha^i \wedge d\beta^i = \sum_{i=1}^{n-k} \sum_{j=1}^{n-k} \alpha^j \wedge \gamma_j^i \wedge \beta^i - \sum_{i=1}^{n-k} \alpha^i \wedge d\beta^i.$$
Now use Lemma 11.16 again to conclude that $d\eta$ annihilates $E$ and so is in $\mathcal{I}(E)$. $\square$

We close this section with a convenient application of vector-valued forms.

Proposition 11.19. Let $\omega$ be a 1-form on an $n$-manifold $M$ with values in a finite-dimensional vector space $V$. Then $E_x = \operatorname{Ker} \omega_x$ defines a distribution $E$ if $\dim E_x = k$ is constant for $x \in M$. This distribution is involutive and hence integrable if and only if $d\omega(X, Y) = 0$ whenever $X, Y$ lie in $E$.
Proof. The first statement is clear once we choose a basis $e_1, \dots, e_m$ for $V$, for then we can write $\omega = \sum e_i \omega^i$ for some smooth 1-forms $\omega^1, \dots, \omega^m$, and $E_x$ is the common kernel of the $\omega_x^i$; since $\dim E_x$ is constant, these forms define a smooth distribution.

Choose local spanning fields $X_1, \dots, X_k$ defined on an open set $U$. Then $E$ is integrable on $U$ if and only if
$$[X_i, X_j] \in \operatorname{span}\{X_1, \dots, X_k\} \quad \text{for } 1 \le i, j \le k,$$
which is true if and only if $\omega([X_i, X_j]) = 0$ for $1 \le i, j \le k$. But as before,
$$d\omega(X_i, X_j) = X_i(\omega(X_j)) - X_j(\omega(X_i)) - \omega([X_i, X_j]) = -\omega([X_i, X_j]).$$
Thus $E$ is integrable on $U$ if and only if $d\omega(X_i, X_j) = 0$ for $1 \le i, j \le k$. This implies the result. $\square$
11.4. Global Frobenius Theorem First notice that if (U, x) is a distinguished chart for a distribution on an n-manifold with x(U) = V x W C jRk x jRn-k, then the plaques x-I (V x {y}) for each yEW are embedded integral manifolds. If 0 c V and YEW, then x-l(Ox {y}) is an open subset of the plaque x-l(Vx{y}) (in its submanifold topology). Clearly, all open subsets of a plaque are of this form. Proposition 11.20. Let E -+ M be an integrable distribution of rank k on the n-manifold M. Let (U, x) be a fixed distinguished chart for E where x(U) = V x W C ]Rk x jRn-k. If S is a connected integral manifold for E, then S n U is a countable disjoint union of connected open subsets of the plaques of (U, x). These open connected subsets are open in S and embedded inM. Proof. Since the inclusion {, : S y M is an immersion, the sets S n U = (,-l(U) are open in S. The set S n U is a disjoint union of connected components each of which is open in S. Since S is second countable, this union must be a countable union. Let C be a connected component of S n U. We show first that C c x-l(V x {y}) for some yEW. Since (U,x) is a distinguished chart, the 1-forms dx k+1, ... , dxn are local defining forms for the distribution and so dx k+1 = ... = dxn = 0 on S n U and hence on C. But C is connected, so xk+1, ... ,xn are all constant on C, which puts it in a single plaque, say x-I(V x {y}).
We know that $C \hookrightarrow M$ is smooth. Since $x^{-1}(V \times \{y\})$ is embedded in $M$, the inclusion $C \hookrightarrow x^{-1}(V \times \{y\})$ is also smooth by Corollary 3.15 and is thus an injective immersion. Since $C$ and $x^{-1}(V \times \{y\})$ are manifolds of the same dimension, the inclusion $C \hookrightarrow x^{-1}(V \times \{y\})$ is a local diffeomorphism and hence an open map. It follows that $C \hookrightarrow x^{-1}(V \times \{y\})$ is an embedding. Since the inclusion $C \hookrightarrow M$ is the composition of embeddings $C \hookrightarrow x^{-1}(V \times \{y\}) \hookrightarrow M$, it is itself an embedding. $\square$

If $E \to M$ is an integrable distribution of rank $k$ on the $n$-manifold $M$ and $p \in M$, then we would like to consider the union
$$N = \bigcup_{S \in \mathcal{F}(p)} S,$$
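where $\mathcal{F}(p)$ is the family of all connected integral manifolds containing $p$. We would like to say that $N$ is the maximal integral manifold that contains $p$, but we must show that $N$ is indeed an integral manifold. Let $\{(U_\alpha, x_\alpha)\}_{\alpha \in A}$ be an atlas of distinguished charts for the distribution where $x_\alpha(U_\alpha) = V_\alpha \times W_\alpha \subset \mathbb{R}^k \times \mathbb{R}^{n-k}$. On each plaque $x_\alpha^{-1}(V_\alpha \times \{y\})$ of $(U_\alpha, x_\alpha)$ which is in $N$, we define
$$\operatorname{pr}_1 \circ x_\alpha : x_\alpha^{-1}(V_\alpha \times \{y\}) \to V_\alpha \subset \mathbb{R}^k,$$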
and this map is taken as a chart on $N$ whose domain is a plaque. Charts obtained in this way will be called plaque charts and the family of all such charts provides $N$ with an atlas. The question of the smoothness of overlaps is routine and is left to the reader. Thus we have a smooth atlas on $N$ which induces a topology such that each plaque, and indeed, each member of $\mathcal{F}(p)$, is open. The topology induced by this smooth structure is often called the leaf topology. It is also evident that $N$ is connected since it is the union of connected sets containing a fixed point. We can also see directly that $N$ is path connected. Indeed, if $q$ is in $N$, then it must be in some $S_1 \in \mathcal{F}(p)$. But also $p \in S_1$ by definition, and so since $S_1$ is connected (and hence smoothly path connected), there is a smooth curve connecting $p$ and $q$.

The questions that remain are whether the topology is Hausdorff and whether it is paracompact. Once again the proof that $N$ is Hausdorff is easy and is left to the reader. The harder question is paracompactness. But this follows from the fact that $N$ clearly satisfies property W(k) of Definition 3.18, and so $N$ is a weakly embedded submanifold by Proposition 3.20. On the other hand, the part of the proof of that proposition that established paracompactness appealed to induced Riemannian metrics and material about topologies induced by metrics that we still have not covered. For this reason we offer a direct and traditional proof.

First, since $N$ is connected, the goal is to prove that it is second countable. The inclusion $N \hookrightarrow M$ is continuous, so $N$ is contained in a connected component of $M$. Thus we may assume without loss of generality that $M$ is second countable, and so also that the atlas $\{(U_\alpha, x_\alpha)\}_{\alpha \in A}$ of distinguished charts is countable. This latter fact is one of the key points. Notice that each plaque is a regular submanifold, so each such plaque is itself second countable. Observe that by construction, if a set $N \cap U_\alpha$ intersects a plaque in $U_\alpha$, then that plaque is actually contained in $N \cap U_\alpha$. Indeed, if a plaque of $U_\alpha$ intersects $N$, then there is an element $S \in \mathcal{F}(p)$ such that $S$ meets the plaque in an open set in the topology of the plaque (Proposition 11.20). But then the union of this plaque with $S$ is another element of $\mathcal{F}(p)$ and so occurs in the union that defines $N$. This reduces the problem to that of showing that $N$ contains only a countable number of such plaques.

Figure 11.2. Slice accessible from point p

We consider all sequences of the form $(U_{\alpha(1)}, P_1), \dots, (U_{\alpha(m)}, P_m)$, where $P_i$ is a plaque for a chart $(U_{\alpha(i)}, x_{\alpha(i)})$ taken from our countable atlas of distinguished charts, such that $P$ is a plaque contained in $U_{\alpha(m)} \cap N$, $p \in P_1$, $P_m = P$, $P_i \subset N$ and
$$P_i \cap P_{i+1} \neq \emptyset \quad \text{for } i = 1, \dots, m-1.$$
Notice that $m$ is not fixed, but must be finite. Let us call such a sequence a finite connection from $p$ to the plaque $P$. We first show that for each plaque $P$ contained in some $U_\alpha \cap N$, there is at least one such finite connection from $p$ to $P$. To see this, consider a point $q \in U_\alpha \cap N$ and suppose that $q \in P$. Let $c : [0,1] \to N$ be a smooth curve with $c(0) = p$ and $c(1) = q$. Now $c([0,1])$ is a compact set, so there is a sequence $t_1 < t_2 < \dots < t_m$ such that for each $i$, $c([t_i, t_{i+1}])$ is contained in a distinguished chart, say $(U_{\alpha(i)}, x_{\alpha(i)})$. Of course, each $c([t_i, t_{i+1}])$ is also connected and so must be contained in a connected component of $U_{\alpha(i)} \cap N$ and thus in some plaque which we call $P_i$. Clearly we have created a finite connection from $p$ to $P$.

Now consider the family $\mathcal{C}$ of all finite connections of $p$ to the plaques contained in the various $U_\alpha \cap N$. Each finite connection can be mapped to the plaque on which it terminates, and we have just seen that this map is surjective. Thus the set of plaques of the various $U_\beta \cap N$ that are finitely connectable to $p$ must have cardinality less than or equal to that of $\mathcal{C}$. It may already seem that since the atlas is countable, $\mathcal{C}$ must also be countable. However, while the set of all finite sequences $U_{\alpha(1)}, \dots, U_{\alpha(m)}$ is certainly countable, there is still the question of how many ways there are to choose the $P_i$ inside each $U_{\alpha(i)}$ to build a connection of $p$ to a plaque. However, since each plaque $P_i$ is certainly an integral manifold, Proposition 11.20 tells us that it can only intersect $U_{\alpha(i+1)}$ in a countable number of plaques. Thus for a given sequence $U_{\alpha(1)}, \dots, U_{\alpha(m)}$ there is only a countable number of ways to choose the corresponding $P_i$. It follows that $N$ contains only a countable number of plaque charts and so is a second countable connected integral manifold for the distribution. It is clearly maximal by construction. One final thing to notice is that unless the rank of the distribution is equal to the dimension of the manifold, $M$ has an uncountable number of maximal integral manifolds, and they partition $M$. We have now proved the following:

Theorem 11.21. Let $E \to M$ be an integrable distribution of rank $k$ on the $n$-manifold $M$. There is a set of maximal integral manifolds for $E$ which partition $M$. These are weakly embedded submanifolds.

There is another slightly different way of looking at the above theorem. Namely, the set of all plaque charts puts a new smooth structure on the set $M$. We have seen that the overlaps are only nonempty if the plaques are on the same integral manifold. This smooth structure and topology gives us a new manifold which we might denote by $M(E)$. It has the same underlying set as $M$, but has a finer topology. The connected components of $M(E)$ are exactly the maximal integral manifolds.

Proposition 11.22. If $L_1$ and $L_2$ are connected integral manifolds of an integrable distribution, then either $L_1 \cap L_2$ is empty or it is an open subset of the maximal integral manifold containing any point of $L_1 \cap L_2$. The set $L_1 \cap L_2$ is also open in the integral manifolds $L_1$ and $L_2$.

Proof. Assume that $L_1 \cap L_2$ is not empty and let $p \in L_1 \cap L_2$. Then there is a distinguished chart $(U, x)$ for the distribution containing $p$ such that both $L_1 \cap U$ and $L_2 \cap U$ are plaques containing $p$.
Clearly, the plaques must be the same. Since plaques are open by definition in the maximal integral manifold, and since $p$ was an arbitrary point in $L_1 \cap L_2$, it follows that $L_1 \cap L_2$ is open in this maximal integral manifold. The last statement is obvious. $\square$

Corollary 11.23. Suppose that $M$ and $N$ are smooth manifolds and that there is an integrable distribution $E$ on $M \times N$. Suppose that for some connected open subset $U$ of $M$ there are smooth functions $f_1 : U \to N$ and $f_2 : U \to N$ such that the graphs of $f_1$ and $f_2$ are integral manifolds. Then these graphs are either disjoint or equal. In the latter case, $f_1 = f_2$.
Proof. Let $\Gamma_{f_1}$ and $\Gamma_{f_2}$ be the graphs and suppose that $\Gamma_{f_1} \cap \Gamma_{f_2}$ is not empty. Let $(p_0, f_1(p_0)) = (p_0, f_2(p_0)) \in \Gamma_{f_1} \cap \Gamma_{f_2}$ and consider the set
$$S = \{p \in U : f_1(p) = f_2(p)\}.$$
Then $S$ is nonempty and clearly closed. We show that it is also open. Pick $p \in S$. Then $\Gamma_{f_1}$ and $\Gamma_{f_2}$ intersect at the point $(p, f_1(p)) = (p, f_2(p))$. But since $\Gamma_{f_1}$ and $\Gamma_{f_2}$ are integral manifolds of an integrable distribution on $M \times N$, the previous proposition tells us that $\Gamma_{f_1} \cap \Gamma_{f_2}$ is open in $\Gamma_{f_1}$, and so, since $\operatorname{pr}_1 : U \times N \to U$ restricts to a diffeomorphism on $\Gamma_{f_1}$ (as for all graphs of smooth functions), we see that $S = \operatorname{pr}_1(\Gamma_{f_1} \cap \Gamma_{f_2})$ is an open set in $U$. Since $U$ is connected, we have $S = U$. Thus $f_1 = f_2$ and the graphs are equal. $\square$
If we consider what we have discovered about integrable distributions, we arrive naturally at the following definition (and new terminology).

Definition 11.24. Let $M$ be an $n$-manifold. A $k$-dimensional foliation $\mathcal{F}_M$ of $M$ (or on $M$) is a partition of $M$ into a family of disjoint connected subsets $\{\mathcal{L}_\alpha\}_{\alpha \in A}$ such that for every $p \in M$, there is a chart $(U, x)$ centered at $p$ of the form $x : U \to V \times W \subset \mathbb{R}^k \times \mathbb{R}^{n-k}$ for connected $V$ and $W$ with the property that for each $\mathcal{L}_\alpha$ the connected components $(U \cap \mathcal{L}_\alpha)_\beta$ of $U \cap \mathcal{L}_\alpha$ are given by $x((U \cap \mathcal{L}_\alpha)_\beta) = V \times \{c_{\alpha,\beta}\}$, where $c_{\alpha,\beta} \in W \subset \mathbb{R}^{n-k}$ are constants. These charts are called distinguished charts for the foliation or foliation charts. The connected sets $\mathcal{L}_\alpha$ are called the leaves of the foliation while for a given chart as above, the connected components $(U \cap \mathcal{L}_\alpha)_\beta$ are called plaques.

It is easy to see that if $\mathcal{F}_M$ is a foliation as above, then there is a unique integrable distribution $E \to M$ on $M$ such that if $p$ is in the domain of a foliation chart $(U, x)$, then
$$E_p = \operatorname{Ker} dx^{k+1}\big|_p \cap \dots \cap \operatorname{Ker} dx^n\big|_p.$$
Furthermore the leaves of the foliation are exactly the maximal integral manifolds of the distribution. We call this the distribution generated by the foliation. Combining this observation with Theorem 11.21 we obtain

Theorem 11.25. Every integrable rank $k$ distribution gives rise to a unique $k$-dimensional foliation whose leaves are the maximal integral manifolds. Conversely, a foliation generates an integrable distribution whose maximal integral manifolds are exactly the leaves of the original foliation.

Thus the theory of foliations is essentially the study of integrable distributions. Now we see that the leaves are weakly embedded submanifolds. The connected components $(U \cap \mathcal{L}_\alpha)_\beta$ of $U \cap \mathcal{L}_\alpha$ are of the form $C_x(U \cap \mathcal{L}_\alpha)$ for some $x \in \mathcal{L}_\alpha$ (recall Definition 3.17). An important point anticipated by our concern above is that a fixed leaf $\mathcal{L}_\alpha$ may intersect a given chart domain $U$ in many, even a countably infinite number of disjoint connected pieces no matter how small $U$ is taken to be. In fact, it may be that $U \cap \mathcal{L}_\alpha$ is dense in $U$. On the other hand, each $\mathcal{L}_\alpha$ is connected by definition. The usual first example of this behavior is given by the irrationally foliated torus. Here we take $M = T^2 := S^1 \times S^1$ and let the leaves be given as the image of the immersions $\iota_a : t \mapsto (e^{i a \pi t}, e^{i \pi t})$, where $a$ is a real number. If $a$ is irrational, then the image $\iota_a(\mathbb{R})$ is a (connected) dense subset of $S^1 \times S^1$. On the other hand, even in this case there are infinitely many distinct leaves.

It may seem that a foliated manifold is just a special manifold, but from one point of view, a foliation is a generalization of a manifold. For instance, we can think of a manifold $M$ as a foliation where the points are the leaves. This is called the discrete foliation on $M$. At the other extreme a manifold may be thought of as a foliation with a single leaf $\mathcal{L} = M$ (the trivial foliation). We also have the following special cases:

Example 11.26. On a product manifold, say $M \times N$, we have two complementary foliations: $\{\{p\} \times N\}_{p \in M}$ and $\{M \times \{q\}\}_{q \in N}$.

Example 11.27. Given any submersion $f : M \to N$, the level sets $\{f^{-1}(q)\}_{q \in N}$ form the leaves of a foliation.
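Returning to the foliated torus described above, the contrast between rational and irrational $a$ can be explored numerically. The curve $t \mapsto (e^{i a \pi t}, e^{i \pi t})$ crosses the circle $S^1 \times \{1\}$ exactly when $t = 2n$, and at that moment the first factor has angle $2\pi (na)$. The sketch below is only an illustration under assumed names (`transversal_hits`, `largest_gap`) and assumed sample values of $a$; it records the crossings for $n \ge 0$ as fractions of a full turn.

```python
import math


def transversal_hits(a, n_returns):
    """Positions (as fractions of a full turn) where the curve
    t -> (e^{i a pi t}, e^{i pi t}) meets the circle S^1 x {1}.
    The crossings happen at t = 2n, where the first factor is e^{2 pi i n a}."""
    return [(n * a) % 1.0 for n in range(n_returns)]


def largest_gap(points):
    """Largest arc (as a fraction of the circle) free of the given points."""
    pts = sorted(set(points))
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(1.0 - pts[-1] + pts[0])  # wrap-around arc
    return max(gaps)


rational = transversal_hits(0.5, 1000)            # a = 1/2: the leaf closes up
irrational = transversal_hits(math.sqrt(2), 1000)  # a = sqrt(2): the leaf keeps winding

print(len(set(rational)), largest_gap(rational))
print(largest_gap(irrational))
```

For $a = 1/2$ only two crossing points ever occur, so the leaf is a closed curve, whereas for $a = \sqrt{2}$ the crossings fill the transversal circle more and more finely, which is the numerical shadow of the density of the leaf.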
Example 11.28. The fibers of any vector bundle foliate the total space.

Example 11.29 (Reeb foliation). Consider the strip in the plane given by $\{(x, y) : |x| \le 1\}$. For $a \in \mathbb{R} \cup \{\pm\infty\}$, we form leaves $\mathcal{L}_a$ as follows:
$$\mathcal{L}_a := \{(x, a + f(x)) : |x| < 1\} \quad \text{for } a \in \mathbb{R}, \qquad \mathcal{L}_{\pm\infty} := \{(\pm 1, y) : y \in \mathbb{R}\},$$
where $f(x) := \exp\left(\frac{x^2}{1 - x^2}\right) - 1$. By rotating this symmetric foliation about the $y$ axis we obtain a foliation of the solid cylinder. This foliation is such that translation of the solid cylinder $C$ in the $y$ direction maps leaves diffeomorphically onto leaves, and so we may let $\mathbb{Z}$ act on $C$ by $(x, y, z) \mapsto (x, y + n, z)$ and then $C/\mathbb{Z}$ is a solid torus with a foliation called the Reeb foliation.

Example 11.30. The one point compactification of $\mathbb{R}^3$ is homeomorphic to $S^3 \subset \mathbb{R}^4$. Thus $S^3 - \{p\} \cong \mathbb{R}^3$ and so there is a copy of the Reeb foliated solid torus inside $S^3$. The complement of a solid torus in $S^3$ is another solid torus. It is possible to put another Reeb foliation on this complement and thus foliate all of $S^3$. The only compact leaf is the torus that is the common boundary of the two complementary solid tori.
Theorem 11.31. Let $\mathcal{F}_M$ be a foliation on $M$ and $\mathcal{L}$ a leaf of the foliation. Then for any smooth map $f : N \to M$ with $f(N) \subset \mathcal{L}$ the map $f : N \to \mathcal{L}$ is smooth.

Proof. Just use Theorem 3.14. $\square$
11.5. Applications to Lie Groups

First we fulfill the promise of proving Theorem 5.66, which we repeat here for convenience.

Theorem 11.32. Let $G$ be a Lie group with Lie algebra $\mathfrak{g}$. If $\mathfrak{h}$ is a Lie subalgebra of $\mathfrak{g}$, then there is a unique connected Lie subgroup $H$ of $G$ whose Lie algebra is $\mathfrak{h}$.

Proof. For $a \in G$, we let $\Delta_a$ denote the subspace of $T_a G$ that is the set of all $X(a)$ for left invariant vector fields $X$ with $X(e) \in \mathfrak{h}$. Thus $v_a \in \Delta_a$ if and only if $v_a = T_e L_a \cdot v$ for some $v \in \mathfrak{h}$. If $X_1, \dots, X_k$ are left invariant fields such that $X_1(e), \dots, X_k(e)$ is a basis for $\mathfrak{h}$, then $X_1, \dots, X_k$ span the distribution which is therefore smooth. Since $\mathfrak{h}$ is a subalgebra, it follows that $\Delta : a \mapsto \Delta_a$ is an integrable distribution. Let $H$ be the maximal (connected) integral manifold containing $e$. Note that for any $b \in G$, we have $T_a L_b(\Delta_a) = \Delta_{ba}$, so that $T L_b : TG \to TG$ leaves the distribution invariant. Thus $L_b$ induces a permutation of the set of maximal integral manifolds and takes the maximal integral manifold through $a$ diffeomorphically to that through $ba$. Thus if $h \in H$, then $L_{h^{-1}}$ takes $H$ to the maximal integral manifold containing $e$, which is just $H$. In other words, $L_{h^{-1}}(H) = H$. This shows that $H$ is a subgroup. It remains to show that the multiplication map $\mu : H \times H \to H$ is smooth. But the multiplication map into $G$, $H \times H \to G$, is smooth and so by Theorem 11.31 the map $H \times H \to H$ is smooth. $\square$

Theorem 11.33. Let $G$ and $H$ be Lie groups with respective Lie algebras $\mathfrak{g}$ and $\mathfrak{h}$. If $h : \mathfrak{g} \to \mathfrak{h}$ is a Lie algebra homomorphism, then there is a neighborhood $U$ of the identity $e \in G$ and a smooth map $f : U \to H$ such that $f(xy) = f(x) f(y)$ whenever $x, y \in U$ and $xy \in U$, and such that
$$T_e f \cdot v = h(v)$$
for every $v \in \mathfrak{g}$.
Proof. Let $\mathfrak{k} \subset \mathfrak{g} \times \mathfrak{h}$ be defined by $\mathfrak{k} := \{(v, h(v)) : v \in \mathfrak{g}\}$. The fact that $h$ is a homomorphism implies that $\mathfrak{k}$ is a Lie subalgebra of $\mathfrak{g} \times \mathfrak{h}$. Thus by Theorem 11.32, there is a connected Lie subgroup $K$ of $G \times H$ with Lie algebra $\mathfrak{k}$. Now let $\iota : K \hookrightarrow G \times H$ be inclusion and define a homomorphism $p : K \to G$ by $p := \operatorname{pr}_1 \circ \iota$. If $v \in \mathfrak{g}$, then
$$Tp \cdot (v, h(v)) = v,$$
and this means that $Tp : T_{(e,e)} K \to T_e G$ is a linear isomorphism. Thus by the inverse mapping theorem, there is a neighborhood $V$ of $(e, e) \in K$ such that $p|_V$ is a diffeomorphism onto an open neighborhood $U$ of $e \in G$. Define the homomorphism $\psi : K \to H$ by $\psi := \operatorname{pr}_2 \circ \iota$, where $\operatorname{pr}_2 : G \times H \to H$ is the second factor projection. Notice that $T_{(e,e)} \psi \cdot (v, h(v)) = h(v)$. Now let
$$f := \psi \circ (p|_V)^{-1}.$$
A straightforward diagram chase argument shows that $f(xy) = f(x) f(y)$ if $x, y \in U$ and $xy \in U$.

If $v \in \mathfrak{g}$, then $Tp \cdot (v, h(v)) = v$ implies that $T(p|_V)^{-1} \cdot v = (v, h(v))$, so
$$Tf \cdot v = T\psi \circ T(p|_V)^{-1} \cdot v = T\psi \cdot (v, h(v)) = h(v). \qquad \square$$
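The first step of the proof above, that the graph $\{(v, h(v))\}$ of a Lie algebra homomorphism is closed under the bracket of $\mathfrak{g} \times \mathfrak{h}$, amounts to the identity $[h(v), h(w)] = h([v, w])$. A small sketch of this check follows, under illustrative assumptions not taken from the text: both algebras are $\mathfrak{so}(3)$ realized as $3 \times 3$ matrices, and $h$ is conjugation by a fixed permutation matrix $R$, which is a rotation.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def bracket(A, B):
    """Matrix commutator AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(AB, BA)]


# Standard basis of so(3) (skew-symmetric 3x3 matrices).
E1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
E2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
E3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

# Assumed example: R is a cyclic permutation matrix (a rotation),
# and h(v) = R v R^T is the induced Lie algebra homomorphism.
R = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]


def h(v):
    return matmul(matmul(R, v), transpose(R))


def graph_bracket(v, w):
    """Bracket in g x h of (v, h(v)) and (w, h(w)), computed componentwise."""
    return bracket(v, w), bracket(h(v), h(w))


first, second = graph_bracket(E1, E2)
print(first, second, second == h(first))
```

The bracket of two graph elements has the form $(u, h(u))$ with $u = [v, w]$, so it lies in the graph again. Integer entries keep the comparison exact.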
Theorem 11.34. If $f_1 : G \to H$ and $f_2 : G \to H$ are Lie group homomorphisms with $df_1 = df_2 : \mathfrak{g} \to \mathfrak{h}$ and $G$ is connected, then $f_1 = f_2$.

Proof. Let $h := df_1 = df_2$ and define $\mathfrak{k} := \{(v, h(v)) : v \in \mathfrak{g}\}$ and $K \subset G \times H$ as in the proof of the previous theorem. Now define $\theta : G \to G \times H$ by $\theta(x) := (x, f_1(x))$. The image of $\theta$ is a subgroup $K_1 \subset G \times H$. For $v \in \mathfrak{g}$, we have $T_e \theta \cdot v = (v, h(v))$ so the Lie algebra of $K_1$ must be $\mathfrak{k}$. Since $G$ is connected, we must have $K = K_1$ which implies that $f_1 = f$ on $U$, where $f$ and $U$ are constructed from $h$ as in the last theorem. But equally, $f_2 = f$ on $U$, and so by Proposition 5.40 we have $f_1 = f_2$. $\square$

We say that two Lie groups $G$ and $H$ are locally isomorphic if there is a diffeomorphism $f$ from a neighborhood $U$ of the identity of $G$ onto a neighborhood $V$ of the identity of $H$ such that $f(xy) = f(x) f(y)$ whenever $x$, $y$ and $xy$ are contained in $U$.

Corollary 11.35. The following assertions hold:

(i) Two Lie groups with isomorphic Lie algebras are locally isomorphic.

(ii) A connected Lie group with abelian Lie algebra is abelian.
Proof. (i) Let $h : \mathfrak{g} \to \mathfrak{h}$ be a Lie algebra isomorphism. If $f$ is the map constructed in Theorem 11.33, then $f$ is a diffeomorphism on some possibly smaller neighborhood of the identity since $T_e f = h$ is an isomorphism.

(ii) By (i) a connected Lie group $G$ of dimension $n$ with abelian Lie algebra must be locally isomorphic to the (additive) abelian Lie group $\mathbb{R}^n$. But a neighborhood of the identity generates the whole group, and so $G$ is abelian. $\square$
11.6. Fundamental Theorem of Surface Theory

In this section we state and outline the proof of a fundamental theorem concerning the existence of surfaces with prescribed first and second fundamental form. Our proof of the main theorem follows [Pa2]. To begin with, we need a few results about certain systems of partial differential equations. The first is equivalent to the local Frobenius theorem. For an open set $U \subset \mathbb{R}^k \times \mathbb{R}^m$, denote standard coordinates by $(x, z) = (x^1, \dots, x^k, z^1, \dots, z^m)$.

Proposition 11.36. Let $U$ be an open set in $\mathbb{R}^k \times \mathbb{R}^m$ and let $(A^i_j)$ be an $m \times k$ matrix of smooth functions on $U$. Then the following assertions are equivalent:

(i) For every $(x_0, z_0) \in U$, there is a neighborhood $V$ of $x_0$ in $\mathbb{R}^k$ and a unique smooth map $f : V \to \mathbb{R}^m$ with $f(x_0) = z_0$ such that
$$\frac{\partial f^i}{\partial x^j}(x) = A^i_j(x, f(x)) \tag{11.5}$$
for all $i, j$.

(ii) The functions $A^i_j$ satisfy the following system of equations on $U$:
$$\frac{\partial A^i_j}{\partial x^k} + \sum_l A^l_k \frac{\partial A^i_j}{\partial z^l} = \frac{\partial A^i_k}{\partial x^j} + \sum_l A^l_j \frac{\partial A^i_k}{\partial z^l} \tag{11.6}$$
for all $i, j, k$.

Proof. If (i) is true, then we obtain (ii) by equality of mixed partials of the $f^i$ and the chain rule (see the comments following the proof).
Conversely, consider the vector fields on $U$ defined by
$$X_j\big|_p := \frac{\partial}{\partial x^j}\Big|_p + \sum_r A^r_j(p) \frac{\partial}{\partial z^r}\Big|_p.$$
A bit of linear algebra tells us that these are everywhere independent. Let $E \to U$ be the distribution spanned by these fields. A straightforward check using (11.6) shows that $[X_i, X_j] = 0$, so there is an integral manifold through each $p$. Let $N$ be the integral manifold through $(x_0, z_0)$. Using the last $m$ coordinate functions of some distinguished chart, we obtain a map $\Phi : U' \to \mathbb{R}^m$ for some connected open $U' \subset U$ so that the level sets of $\Phi$ are integral manifolds. The tangent map $T_p \Phi$ has kernel $E_p$ at each $p \in N$, and since $\frac{\partial}{\partial z^j}\big|_p$ never lies in $E$, we see that
$$\frac{\partial \Phi}{\partial z^j} \neq 0 \quad \text{on } N \cap U' \text{ for all } j.$$
In particular, this holds at $(x_0, z_0)$, and the implicit mapping theorem tells us that a neighborhood of $(x_0, z_0)$ in a plaque of $N$ is the graph of a function $f : V \to \mathbb{R}^m$ with $f(x_0) = z_0$. Define a function $F : V \to \mathbb{R}^k \times \mathbb{R}^m$ by $F(x) := (x, f(x))$. Writing $p = F(x)$, we see that for each $i$ the vector
$$TF \cdot \frac{\partial}{\partial x^i}\Big|_x = \frac{\partial}{\partial x^i}\Big|_p + \sum_r \frac{\partial f^r}{\partial x^i}(x) \frac{\partial}{\partial z^r}\Big|_p$$
is a linear combination of vectors $X_j$ defined above:
$$\frac{\partial}{\partial x^i}\Big|_{(x, f(x))} + \sum_r \frac{\partial f^r}{\partial x^i}(x) \frac{\partial}{\partial z^r}\Big|_{(x, f(x))} = \sum_{s=1}^k c^s_i \left( \frac{\partial}{\partial x^s}\Big|_{(x, f(x))} + \sum_r A^r_s(p) \frac{\partial}{\partial z^r}\Big|_{(x, f(x))} \right).$$
Collecting terms and comparing we see that $c^s_i = \delta^s_i$ and
$$\frac{\partial f^r}{\partial x^j}(x) = A^r_j(x, f(x)).$$
It follows from Corollary 11.23 that $f$ is uniquely determined on a sufficiently small connected neighborhood of $x_0$. $\square$

It is often convenient to be able to come up with the integrability conditions for a given application without trying to match indexing and notation with the above theorem. The basis of the procedure is to set mixed partials equal to each other. We demonstrate this using the notation of the theorem. We start with
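$$\frac{\partial}{\partial x^k} A^i_j(x, f(x)) = \frac{\partial}{\partial x^j} A^i_k(x, f(x)).$$
Apply the chain rule:
$$\frac{\partial A^i_j}{\partial x^k}(x, f(x)) + \sum_r \frac{\partial A^i_j}{\partial z^r}(x, f(x)) \frac{\partial f^r}{\partial x^k}(x) = \frac{\partial A^i_k}{\partial x^j}(x, f(x)) + \sum_s \frac{\partial A^i_k}{\partial z^s}(x, f(x)) \frac{\partial f^s}{\partial x^j}(x).$$
Finally, substitute back using the original equations (11.5) and replace all occurrences of $f(x)$ in the arguments with the independent variable $z$. We arrive at the integrability conditions:
$$\frac{\partial A^i_j}{\partial x^k}(x, z) + \sum_l A^l_k(x, z) \frac{\partial A^i_j}{\partial z^l}(x, z) = \frac{\partial A^i_k}{\partial x^j}(x, z) + \sum_l A^l_j(x, z) \frac{\partial A^i_k}{\partial z^l}(x, z).$$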
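The integrability conditions just derived can be checked numerically for a concrete system. The sketch below treats the smallest interesting case, one unknown function of two variables ($m = 1$, $k = 2$), so that only the pair $j = 1$, $k = 2$ matters. The two sample systems, the sample point and the helper name `defect` are assumptions chosen for illustration; partial derivatives are approximated by central differences.

```python
def defect(A1, A2, x1, x2, z, h=1e-5):
    """Left side minus right side of the integrability condition
    dA1/dx2 + A2 * dA1/dz = dA2/dx1 + A1 * dA2/dz
    for the system  df/dx1 = A1(x1, x2, f),  df/dx2 = A2(x1, x2, f)."""
    dA1_dx2 = (A1(x1, x2 + h, z) - A1(x1, x2 - h, z)) / (2 * h)
    dA1_dz = (A1(x1, x2, z + h) - A1(x1, x2, z - h)) / (2 * h)
    dA2_dx1 = (A2(x1 + h, x2, z) - A2(x1 - h, x2, z)) / (2 * h)
    dA2_dz = (A2(x1, x2, z + h) - A2(x1, x2, z - h)) / (2 * h)
    lhs = dA1_dx2 + A2(x1, x2, z) * dA1_dz
    rhs = dA2_dx1 + A1(x1, x2, z) * dA2_dz
    return lhs - rhs


# Assumed integrable system: df/dx1 = x2 * f, df/dx2 = x1 * f,
# solved by f = c * exp(x1 * x2).
A1_good = lambda x1, x2, z: x2 * z
A2_good = lambda x1, x2, z: x1 * z

# Assumed non-integrable system: df/dx1 = f, df/dx2 = x1.
A1_bad = lambda x1, x2, z: z
A2_bad = lambda x1, x2, z: x1

point = (0.5, 0.2, 2.0)  # arbitrary sample point (x1, x2, z)
print(defect(A1_good, A2_good, *point))
print(defect(A1_bad, A2_bad, *point))
```

For the second system the defect equals $x^1 - 1$, which is nonzero at the sample point, so no solution $f$ can have equal mixed partials there.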
The convenience of this may not be clear yet, but we shall shortly demonstrate the usefulness of this method.

Proposition 11.37. Let $U$ be open in $\mathbb{R}^k \times \mathbb{R}^m$ and $(A^i_j)$ an $m \times k$ matrix of smooth functions on $U$. Let $(x_0, z_0) \in U$ and suppose that for some connected open set $V$, both $f_1 : V \to \mathbb{R}^m$ and $f_2 : V \to \mathbb{R}^m$ are solutions of
$$\frac{\partial f^i}{\partial x^j}(x) = A^i_j(x, f(x)) \text{ for all } i, j, \qquad f(x_0) = z_0.$$
Then $f_1 = f_2$.

Proof. This follows from Corollary 11.23 and the considerations in the proof of the previous proposition. $\square$

Lemma 11.38. Let $V$ be an open set in $\mathbb{R}^k$ and let $(A^i_j)$ be an $m \times k$ matrix of smooth functions on $V \times \mathbb{R}^m$ that are linear in the second argument and satisfy the integrability conditions (11.6) on $V \times \mathbb{R}^m$. Then for any $x_0 \in V$ there is a ball $B_{x_0} \subset V$ such that for any $(a, b) \in B_{x_0} \times \mathbb{R}^m$ there is a solution defined on $B_{x_0}$ with $f(a) = b$.

Proof. Let $f_i$ be the solution with $f_i(x_0) = e_i$, where $e_i$ is the $i$-th standard basis vector of $\mathbb{R}^m$. Then $f_1, \dots, f_m$ are defined and linearly independent on some ball $B_{x_0}$ containing $x_0$ and contained in the intersection of the domains of the $f_i$. Choose $(a, b) \in B_{x_0} \times \mathbb{R}^m$ and note that $b = \sum_r b^r f_r(a)$ for some uniquely defined numbers $b^r$. Now define $f = \sum_r b^r f_r$ on $B_{x_0}$. Then writing $f = (f^1, \dots, f^m)$, we have for any $x \in B_{x_0}$, by linearity of $A^i_j$ in the second argument,
$$\frac{\partial f^i}{\partial x^j}(x) = \sum_r b^r \frac{\partial f_r^i}{\partial x^j}(x) = \sum_r b^r A^i_j(x, f_r(x)) = A^i_j\Big(x, \sum_r b^r f_r(x)\Big) = A^i_j(x, f(x)),$$
and
$$f(a) = \sum_r b^r f_r(a) = b. \qquad \square$$
Corollary 11.39. Let $V$ be a simply connected open set in $\mathbb{R}^k$ and let $(A^i_j)$ be an $m \times k$ matrix of smooth functions on $U = V \times \mathbb{R}^m$. Suppose that each $A^i_j$ is linear in its second argument. If
$$\frac{\partial A^i_j}{\partial x^k} + \sum_l A^l_k \frac{\partial A^i_j}{\partial z^l} = \frac{\partial A^i_k}{\partial x^j} + \sum_l A^l_j \frac{\partial A^i_k}{\partial z^l} \quad \text{for all } i, j, k$$
on $V \times \mathbb{R}^m$, then given any $(x_0, z_0) \in V \times \mathbb{R}^m$, there exists a unique smooth map $f : V \to \mathbb{R}^m$ such that
$$\frac{\partial f^i}{\partial x^j}(x) = A^i_j(x, f(x)) \text{ for all } i, j, \qquad f(x_0) = z_0.$$

Proof. Let $X_j|_p := \frac{\partial}{\partial x^j}\big|_p + \sum_r A^r_j(p) \frac{\partial}{\partial z^r}\big|_p$ be the fields that span an integrable distribution on $V \times \mathbb{R}^m$ as in Proposition 11.36. Let $L_{(x_0, z_0)}$ be the maximal integral manifold through the point $(x_0, z_0)$. Let $\wp$ denote the restriction of the projection $\operatorname{pr}_1 : V \times \mathbb{R}^m \to V$ to $L_{(x_0, z_0)}$. Let $(a_1, b_1) \in L_{(x_0, z_0)}$ and consider the set $F_{a_1} = \wp^{-1}(a_1)$.
By Lemma 11.38 above, there is a fixed open set $U$ containing $a_1$ such that for every $(a_1, b) \in F_{a_1}$ there is a solution $f_b : U \to \mathbb{R}^m$ with $f_b(a_1) = b$. By Corollary 11.23, the graphs of these solutions are all disjoint open sets in $L_{(x_0, z_0)}$ and $\wp$ restricts to a diffeomorphism on each such graph. Thus $\wp : L_{(x_0, z_0)} \to V$ is a smooth covering map. The local solutions guaranteed to exist by Proposition 11.36 are local sections of this covering. Thus since $V$ is simply connected, we know from Theorem 1.95 that there is a smooth lift $\rho$ of $\operatorname{id}_V : V \to V$ such that $\rho(x_0) = (x_0, z_0)$, which in this case means that we have a global section: $\wp \circ \rho = \operatorname{id}_V$. Now let $f := \operatorname{pr}_2 \circ \rho : V \to \mathbb{R}^m$. Then $\rho(x) = (x, f(x))$ and $f$ must be smooth and for every $a \in V$ the function $f$ must agree with the unique local solution which takes the value $f(a)$ at $a$. $\square$

We return to the situation studied in Chapter 4. Consider an immersion $x : V \to \mathbb{R}^3$, where $V$ is an open set in $\mathbb{R}^2$ whose standard coordinates will be denoted by $u^1, u^2$. Let $(f_1, f_2, f_3)$ be the frame fields along $x$ defined by
$$f_1 := x_{u^1}, \qquad f_2 := x_{u^2}, \qquad f_3 := N = \frac{x_{u^1} \times x_{u^2}}{\|x_{u^1} \times x_{u^2}\|},$$
where $x_{u^1} = \partial x / \partial u^1$, etc. The first fundamental form is given by the matrix entries
$$g_{ij} = \langle x_{u^i}, x_{u^j} \rangle \quad \text{for } 1 \le i, j \le 2,$$
while the second fundamental form is given by the matrix entries
$$\ell_{ij} = -\langle N_{u^i}, x_{u^j} \rangle = \langle N, x_{u^i u^j} \rangle = \langle f_3, x_{u^i u^j} \rangle.$$
Let us consider $f = (f_1, f_2, f_3)$ as a matrix function of $(u^1, u^2)$ that takes values in $GL(3)$. We have
$$(f_i)_{u^1} = \sum_{r=1}^3 P^r_i f_r \quad \text{and} \quad (f_i)_{u^2} = \sum_{r=1}^3 Q^r_i f_r$$
for some matrix functions $P$ and $Q$. In matrix notation, we have
$$f_{u^1} = fP, \qquad f_{u^2} = fQ, \tag{11.7}$$
and these are called the frame equations. For convenience, we define a $3 \times 3$ matrix function $G$ by $G_{ij} := \langle f_i, f_j \rangle$ for $1 \le i, j \le 3$ so that
$$G = \begin{pmatrix} g_{11} & g_{12} & 0 \\ g_{21} & g_{22} & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{11.8}$$
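For any given $x = \sum_k x^k f_k$, we have
$$\xi_i := \langle x, f_i \rangle = \Big\langle \sum_k x^k f_k, f_i \Big\rangle = \sum_k x^k \langle f_k, f_i \rangle = \sum_k G_{ki} x^k,$$
and so
$$\xi_i = \sum_k (G^t)_{ik} x^k = \sum_k G_{ik} x^k.$$
Now let $x = (f_j)_{u^1} = \sum_k f_k P^k_j$. Then $x^k = P^k_j$, so if we define $B_{ij} := \langle f_i, (f_j)_{u^1} \rangle$, then we have
$$\begin{aligned}
\langle f_i, (f_1)_{u^1} \rangle &= B_{i1} = \sum_k G_{ik} P^k_1, \\
\langle f_i, (f_2)_{u^1} \rangle &= B_{i2} = \sum_k G_{ik} P^k_2, \\
\langle f_i, (f_3)_{u^1} \rangle &= B_{i3} = \sum_k G_{ik} P^k_3,
\end{aligned} \tag{11.9}$$
or $B = GP$. Similarly, if $C = (C_{ij}) = \langle f_i, (f_j)_{u^2} \rangle$, then
$$\begin{aligned}
\langle f_i, (f_1)_{u^2} \rangle &= C_{i1} = \sum_k G_{ik} Q^k_1, \\
\langle f_i, (f_2)_{u^2} \rangle &= C_{i2} = \sum_k G_{ik} Q^k_2, \\
\langle f_i, (f_3)_{u^2} \rangle &= C_{i3} = \sum_k G_{ik} Q^k_3,
\end{aligned} \tag{11.10}$$
or $C = GQ$. We arrive at
$$P = G^{-1} B, \qquad Q = G^{-1} C.$$
We denote the entries of $G^{-1}$ by $g^{ij}$ so that
$$G^{-1} = \begin{pmatrix} g^{11} & g^{12} & 0 \\ g^{21} & g^{22} & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{11.11}$$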
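Lemma 11.40. With $G$, $B$ and $C$ as above, we have
$$B = \begin{pmatrix} \tfrac12 (g_{11})_{u^1} & \tfrac12 (g_{11})_{u^2} & -\ell_{11} \\ (g_{12})_{u^1} - \tfrac12 (g_{11})_{u^2} & \tfrac12 (g_{22})_{u^1} & -\ell_{12} \\ \ell_{11} & \ell_{12} & 0 \end{pmatrix} \tag{11.12}$$
and
$$C = \begin{pmatrix} \tfrac12 (g_{11})_{u^2} & (g_{12})_{u^2} - \tfrac12 (g_{22})_{u^1} & -\ell_{12} \\ \tfrac12 (g_{22})_{u^1} & \tfrac12 (g_{22})_{u^2} & -\ell_{22} \\ \ell_{12} & \ell_{22} & 0 \end{pmatrix}. \tag{11.13}$$
In particular, $P$ and $Q$ can be written in terms of the matrix entries of the first and second fundamental forms.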
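Proof. The proof is just a calculation, and we only do part. For example, if $i$ is either 1 or 2, then
$$B_{ii} = \langle f_i, (f_i)_{u^1} \rangle = \langle x_{u^i}, x_{u^i u^1} \rangle = \tfrac12 \langle x_{u^i}, x_{u^i} \rangle_{u^1} = \tfrac12 (g_{ii})_{u^1}.$$
Similarly, for $i = 1$ or 2 we have
$$C_{ii} = \langle f_i, (f_i)_{u^2} \rangle = \langle x_{u^i}, x_{u^i u^2} \rangle = \tfrac12 (g_{ii})_{u^2}.$$
Now, $(g_{12})_{u^1} = \langle x_{u^1}, x_{u^2} \rangle_{u^1} = \langle x_{u^1 u^1}, x_{u^2} \rangle + \tfrac12 (g_{11})_{u^2}$ from above, and so
$$B_{21} = \langle f_2, (f_1)_{u^1} \rangle = \langle x_{u^1 u^1}, x_{u^2} \rangle = (g_{12})_{u^1} - \tfrac12 (g_{11})_{u^2}.$$
The entries $B_{12}$, $C_{12}$, $C_{21}$ are calculated similarly. Next consider $B_{i3}$ for $i = 1$ or 2. We have
$$B_{i3} = \langle f_i, (f_3)_{u^1} \rangle = \langle x_{u^i}, N_{u^1} \rangle = -\ell_{1i}, \qquad B_{3i} = \langle f_3, (f_i)_{u^1} \rangle = \langle N, x_{u^i u^1} \rangle = \ell_{i1},$$
and so $B_{3i} = -B_{i3}$. The entries $C_{3i}$ and $C_{i3}$ are obtained in the same way. Lastly, since $\langle f_3, f_3 \rangle = 1$,
$$0 = \tfrac12 \langle f_3, f_3 \rangle_{u^k} = \langle (f_3)_{u^k}, f_3 \rangle = \begin{cases} B_{33} & \text{if } k = 1, \\ C_{33} & \text{if } k = 2. \end{cases} \qquad \square$$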
We record an observation to be used later:
$$B + B^t = G_{u^1}, \qquad C + C^t = G_{u^2}. \tag{11.14}$$
The frame equations (11.7) are a system to which Proposition 11.36 applies. Rather than trying to rewrite the equations in a form that matches that proposition we obtain the integrability conditions by setting
$$(f_{u^1})_{u^2} = (f_{u^2})_{u^1}$$
and then, using (11.7) and the product rule,
$$f_{u^2} P + f P_{u^2} = f_{u^1} Q + f Q_{u^1}.$$
Substituting from the frame equations we obtain
$$f \left( P_{u^2} - Q_{u^1} - (PQ - QP) \right) = 0.$$
Now $f$ is a nonsingular matrix, so we have the equivalent integrability equation
$$P_{u^2} - Q_{u^1} - (PQ - QP) = 0. \tag{11.15}$$
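To get a feeling for the integrability equation (11.15), one can evaluate it in the simplest situation: a constant first fundamental form equal to the identity and a constant second fundamental form. The sketch below makes exactly these assumptions (they are not taken from the text), so that $G$ is the identity, $P = B$, $Q = C$, and only the $\ell_{ij}$ entries of $B$ and $C$ survive, with signs following the definitions $B_{ij} = \langle f_i, (f_j)_{u^1} \rangle$ and $C_{ij} = \langle f_i, (f_j)_{u^2} \rangle$. The function names are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def frame_matrices(l11, l12, l22):
    """P = B and Q = C under the assumption that (g_ij) is the identity and
    (l_ij) is constant, so that all derivatives of g_ij vanish and G = I."""
    P = [[0, 0, -l11], [0, 0, -l12], [l11, l12, 0]]
    Q = [[0, 0, -l12], [0, 0, -l22], [l12, l22, 0]]
    return P, Q


def integrability_defect(l11, l12, l22):
    """Left side of P_{u^2} - Q_{u^1} - (PQ - QP) = 0.
    P and Q are constant here, so only the commutator term remains."""
    P, Q = frame_matrices(l11, l12, l22)
    PQ, QP = matmul(P, Q), matmul(Q, P)
    return [[-(a - b) for a, b in zip(r, s)] for r, s in zip(PQ, QP)]


# Flat metric with one nonzero principal curvature (as for a round cylinder).
print(integrability_defect(1, 0, 0))
# Flat metric paired with the second fundamental form of a unit sphere.
print(integrability_defect(1, 0, 1))
```

In the first case the defect vanishes, as it must for a cylinder. In the second it does not, reflecting the fact that a flat metric cannot coexist with $\det(\ell_{ij}) \neq 0$.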
At this point we pause to appreciate an important fact. Namely, direct calculation reveals that these equations are equivalent to the combination of the Codazzi-Mainardi equation and the Gauss curvature equation, which we now see are integrability conditions (see Problem 7). We thus refer to the above integrability equations (11.15) as the Gauss-Codazzi equations with apologies to Gaspare Mainardi (1800-1879).

We now turn things around. Rather than assuming that we have a surface, we take the $(g_{ij})$ and $(\ell_{ij})$ as some given symmetric smooth matrix-valued functions defined on a connected open $V \subset \mathbb{R}^2$ with the assumption that $(g_{ij})$ is positive definite. Furthermore, we now assume that $G$, $P$, and $Q$ are actually defined in terms of these by the formulas above, which we found to be true in the case where we started with a surface. We will show that we can obtain a surface with these as first and second fundamental form.

Theorem 11.41 (Fundamental existence theorem for surfaces). The following assertions hold:

(i) Let $V$ be an open set in $\mathbb{R}^2$ diffeomorphic to an open disk and let $x : V \to \mathbb{R}^3$ be an immersion with the corresponding first and second fundamental forms given in matrix form as $(g_{ij})$ and $(\ell_{ij})$. Let $y : V \to \mathbb{R}^3$ be another immersion with the corresponding forms $(\tilde{g}_{ij})$ and $(\tilde{\ell}_{ij})$. If
$$y = f \circ x$$
for some proper Euclidean motion $f : \mathbb{R}^3 \to \mathbb{R}^3$, then
$$g_{ij} = \tilde{g}_{ij}, \qquad \ell_{ij} = \tilde{\ell}_{ij}.$$
Conversely, if the last equations hold, then $y = f \circ x$ for some Euclidean motion $f$.

(ii) Suppose that $(g_{ij})$ and $(\ell_{ij})$ are given symmetric matrix-valued functions defined on $V$ with $(g_{ij})$ positive definite and suppose that $G$, $B$ and $C$ are defined in terms of the entries of $(g_{ij})$ and $(\ell_{ij})$ as in formulas (11.8), (11.11), (11.12) and (11.13). Then if
$$P = G^{-1} B, \qquad Q = G^{-1} C,$$
and if $P_{u^2} - Q_{u^1} - (PQ - QP) = 0$, then there exists an embedding $x : V \to \mathbb{R}^3$ such that $(g_{ij})$ and $(\ell_{ij})$ are the corresponding first and second fundamental forms.
Proof. We leave the proof of the first part of (i) to the reader, but note that it can be proved using direct calculation or it can be derived from Theorem 4.22. For the rest of (i), note that by composing with a translation we may assume that both $x$ and $y$ map some fixed point $u \in V$ to the origin in $\mathbb{R}^3$. Let $(f_1, f_2, f_3)$ be the natural frame for $x$ as above and let $(\tilde{f}_1, \tilde{f}_2, \tilde{f}_3)$ be that of $y$. By making a rotation we may assume that these two frames agree at $u$. But since we are assuming that $\tilde{g}_{ij} = g_{ij}$ and $\tilde{\ell}_{ij} = \ell_{ij}$, it follows that both frames satisfy the same frame equations and so by Proposition 11.37 they must agree on the connected set $V$. In particular, $x_{u^i} = y_{u^i}$ for $i = 1, 2$. Thus $x$ and $y$ only differ by a constant, which must be zero since $x(u) = y(u)$.

Next we consider (ii) where $(g_{ij})$ and $(\ell_{ij})$ are given. We want to construct a surface, but first we construct the frame for the desired surface. Since it is assumed that $g = (g_{ij})$ is positive definite, $g$ and the extended matrix $G$ are both invertible and positive definite. Thus $P$ and $Q$ are well-defined. Since we assume that the integrability equations $P_{u^2} - Q_{u^1} - (PQ - QP) = 0$ hold, Proposition 11.36 tells us that we can solve the frame equations locally, near any point $u \in V$ and with any initial conditions $f(u) = f_0$ holding as desired. But the system is linear and our domain is diffeomorphic to a disk so we can use Corollary 11.39 to obtain a solution on all of $V$. Since $G$ is positive definite, we may choose these initial conditions so that $\langle f_i(u), f_j(u) \rangle = G_{ij}(u)$ (the $ij$-th entry of $G$ at $u$). Having obtained the $f_i$, we now wish to obtain a surface. This means solving the system
$$x_{u^1} = f_1, \qquad x_{u^2} = f_2, \tag{11.16}$$
and this time the integrability conditions are derived from
$$(f_1)_{u^2} = (f_2)_{u^1}.$$
Using the frame equations, we obtain integrability conditions
$$\sum_j Q^j_1 f_j = \sum_j P^j_2 f_j.$$
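This just says that the second column of $P$ is equal to the first column of $Q$, which is true. Thus we can find $x : V \to \mathbb{R}^3$ with $x(u) = 0$ so that (11.16) holds. Next we show that $\langle f_i, f_j \rangle = G_{ij}$ on all of $V$. We compute as follows:
$$\langle f_i, f_j \rangle_{u^1} = \langle (f_i)_{u^1}, f_j \rangle + \langle f_i, (f_j)_{u^1} \rangle = \sum_r P^r_i \langle f_r, f_j \rangle + \sum_s P^s_j \langle f_i, f_s \rangle,$$
while by $B = GP$ and equations (11.14),
$$(G_{ij})_{u^1} = (B^t + B)_{ij} = ((GP)^t + GP)_{ij} = \sum_r P^r_i G_{rj} + \sum_s P^s_j G_{is}.$$
Similarly for $u^2$, with $Q$ and $C$ in place of $P$ and $B$. Thus the matrix functions $(\langle f_i, f_j \rangle)$ and $G$ satisfy the same linear first-order system, and since they agree at $u$ they agree on all of $V$. From $\langle f_i, f_j \rangle = G_{ij}$ it follows that $\langle f_3, f_3 \rangle = 1$ and that $f_1, f_2$ are independent and orthogonal to $f_3$. It remains to show that $\langle (f_3)_{u^i}, f_j \rangle = -\ell_{ij}$. We compute as follows:
$$\begin{aligned}
-\langle (f_3)_{u^1}, f_j \rangle &= (g^{11} \ell_{11} + g^{12} \ell_{12}) \langle f_1, f_j \rangle + (g^{12} \ell_{11} + g^{22} \ell_{12}) \langle f_2, f_j \rangle \\
&= (g^{11} \ell_{11} + g^{12} \ell_{12}) g_{1j} + (g^{12} \ell_{11} + g^{22} \ell_{12}) g_{2j} \\
&= \ell_{11} (g^{11} g_{1j} + g^{12} g_{2j}) + \ell_{12} (g^{21} g_{1j} + g^{22} g_{2j}) \\
&= \ell_{11} \delta^1_j + \ell_{12} \delta^2_j.
\end{aligned}$$
This shows that $\langle (f_3)_{u^1}, f_j \rangle = -\ell_{1j}$ for $j = 1, 2$. The computation of $-\langle (f_3)_{u^2}, f_j \rangle$ is similar and left for the reader. $\square$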
11. 7. Local Fundamental Theorem of Calculus Recall the structure equations (8.15) satisfied by the Maurer-Cartan form WG for a Lie group G:
dwG =
-~ [wG,wG]".
If VI = Xl (e), ... ,Vn = Xn (e) is a basis for the Lie algebra 9 which extends to left invariant vector fields Xl, ... , X n , then the above equation is equivalent to i 1\ wj = dw k = - 'L" ck,w 1\ wj , ~, 2 L- c~,wi ~,
-! '"
ct
i<j
i,j
where the $c^k_{ij}$ are the structure constants associated to $\omega^1, \ldots, \omega^n$, which is the left invariant frame field dual to $X_1, \ldots, X_n$. If $M$ is some $m$-manifold and $f : M \to G$ is a smooth map, then $\omega_f = f^*\omega_G$ is a $\mathfrak{g}$-valued 1-form on $M$. By naturality we have
\[ d\omega_f = -\tfrac{1}{2}[\omega_f, \omega_f]^{\wedge}, \]
or equivalently
\[ d\omega_f^k = -\frac{1}{2}\sum_{i,j} c^k_{ij}\,\omega_f^i \wedge \omega_f^j, \]
where $\omega_f^i = f^*\omega^i$ for $i = 1, \ldots, n$. The $\mathfrak{g}$-valued 1-form $\omega_f$ is sometimes called the (left) Darboux derivative of $f$. The right Darboux derivative is defined similarly using the right Maurer-Cartan form.
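As a concrete illustration, here is a sketch under the assumption (not required by the text) that $G$ is a matrix Lie group, so that the Maurer-Cartan form can be written $\omega_G = g^{-1}dg$. The bracket convention $[\alpha,\beta]^{\wedge}(X,Y) = [\alpha(X),\beta(Y)] - [\alpha(Y),\beta(X)]$ is also assumed. Under these assumptions the Darboux derivative and its structure equation can be checked by hand:

```latex
% Assumption: G \subset GL(n,\mathbb{R}) is a matrix group, so \omega_G = g^{-1}\,dg.
\begin{align*}
\omega_f &= f^*\omega_G = f^{-1}\,df, \\
d\omega_f &= d(f^{-1}) \wedge df = -f^{-1}\,df\,f^{-1} \wedge df = -\,\omega_f \wedge \omega_f, \\
[\omega_f,\omega_f]^{\wedge}(X,Y) &= [\omega_f(X),\omega_f(Y)] - [\omega_f(Y),\omega_f(X)]
   = 2\,(\omega_f \wedge \omega_f)(X,Y), \\
\text{hence}\quad d\omega_f &= -\tfrac{1}{2}[\omega_f,\omega_f]^{\wedge}.
\end{align*}
% Left translation does not change the result: for F = gf with g constant,
\[ \omega_F = (gf)^{-1}\,d(gf) = f^{-1}g^{-1}g\,df = f^{-1}\,df = \omega_f . \]
```

In this matrix picture the Darboux derivative is a "logarithmic derivative": for $G = (\mathbb{R}_{>0}, \cdot)$ it reduces to $d(\log f)$.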
If we think of a $\mathfrak{g}$-valued 1-form on a manifold $M$ as a map $TM \to \mathfrak{g}$, then $\omega_f = f^*\omega_G = \omega_G \circ Tf$. From this point of view we can understand why $\omega_f$ is a kind of derivative of $f$ by considering the special case where $G$ is a vector space $V$ with its abelian (additive) Lie group structure. In this case, the Lie algebra is $V$ itself and the Maurer-Cartan form is just the canonical map $\mathrm{pr}_2 : TV = V \times V \to V$, and so for a smooth map $f : M \to V$, the Darboux derivative is the differential $df = \mathrm{pr}_2 \circ Tf$. Just as for the differential, the Darboux derivative embodies less information than the tangent map since the values that the map takes are "forgotten" and only tangential information is retained. Indeed, notice that if $L_g : G \to G$ is a left translation and $F := L_g \circ f$, then
\[ \omega_F = F^*\omega_G = f^*L_g^*\omega_G = f^*\omega_G = \omega_f \]
since $\omega_G$ is left invariant. Hence two smooth maps into $G$ that differ by left translation have the same (left) Darboux derivative. This generalizes the fact that two functions that differ by an additive constant have the same differential.

For a smooth 1-form $\vartheta = g\,dt$ on $\mathbb{R}$, we can always find a smooth function $f$ with $df = \vartheta$ since by the fundamental theorem of calculus one need only choose $f(t) = \int_0^t g(\tau)\,d\tau$. More generally, if $M$ is simply connected and $G$ is a vector space $V$, then the fact that $H^1(M) = 0$ means that every closed $V$-valued 1-form is the differential of some smooth $f : M \to V$. For a general $G$, if $\vartheta$ is a $\mathfrak{g}$-valued 1-form on $M$, then we may ask for an $f$ such that $\vartheta = \omega_f$. But there is no reason to expect a general $\vartheta$ to satisfy the above structure equation that $\omega_f$ satisfies. Now if we choose a basis $\{v_i\}$ for $\mathfrak{g}$, then there must be 1-forms $\vartheta^1, \ldots, \vartheta^n \in \Omega^1(M)$ such that
\[ \vartheta = \sum_{i=1}^n v_i\,\vartheta^i. \]
Then $d\vartheta = -\tfrac{1}{2}[\vartheta, \vartheta]^{\wedge}$ is equivalent to
\[ d\vartheta^k = -\frac{1}{2}\sum_{i,j} c^k_{ij}\,\vartheta^i \wedge \vartheta^j, \]
where $c^k_{ij}$ are the structure constants. As we said, this may or may not hold. These equations are the integrability conditions for the existence of an $f$ such that $\vartheta = \omega_f$. More precisely, we have the following theorem.
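Theorem 11.42. Let $M$ be an $m$-manifold and $G$ an $n$-dimensional Lie group. If $\vartheta$ is a $\mathfrak{g}$-valued 1-form on $M$ that satisfies the structural equation $d\vartheta = -\tfrac{1}{2}[\vartheta, \vartheta]^{\wedge}$, then for every $p_0 \in M$ there is a neighborhood $U_{p_0}$ of $p_0$ such that given any $(a, b) \in U_{p_0} \times G$ there is a smooth function $f : U_{p_0} \to G$ with $f(a) = b$ and $\vartheta = \omega_f$.

Proof. Let $\mathrm{pr}_1 : M \times G \to M$ and $\mathrm{pr}_2 : M \times G \to G$ be the canonical projections and define a $\mathfrak{g}$-valued 1-form on $M \times G$ by $\Omega := \mathrm{pr}_1^*\vartheta - \mathrm{pr}_2^*\omega_G$. For each $(p, g) \in M \times G$, let $E_{(p,g)} = \operatorname{Ker} \Omega_{(p,g)}$. Now define a vector bundle homomorphism $T(M \times G) \to (M \times G) \times \mathfrak{g}$ by $v_{(p,g)} \mapsto ((p, g), \Omega_{(p,g)}(v_{(p,g)}))$. By Proposition 6.28, if this homomorphism has constant rank, then the kernel is a subbundle which clearly has fiber $E_{(p,g)}$ at $(p, g)$. By linear algebra, this is equivalent to showing that the dimension of $E_{(p,g)}$ is independent of $(p, g)$. This will follow if we show that $T\mathrm{pr}_1|_{E_{(p,g)}} : E_{(p,g)} \to T_pM$ is an isomorphism for all $(p, g)$. If we identify $T_{(p,g)}(M \times G)$ with $T_pM \times T_gG$, then $T\mathrm{pr}_1$ is just the projection $(v, w) \mapsto v$, and similarly for $T\mathrm{pr}_2$. Now if $(v, w) \in E_{(p,g)}$ and $T\mathrm{pr}_1 \cdot (v, w) = 0$, then $v = 0$. But, since $\vartheta(v) = \omega_G(w)$, we have $w = 0$ also. Thus $T\mathrm{pr}_1|_{E_{(p,g)}}$ is injective. It is also surjective since for any $v \in T_pM$, we clearly have $(v, T_eL_g(\vartheta(v))) \in E_{(p,g)}$ and this has $v$ as its image. Now we use Proposition 11.19 to show that $E$ is integrable:
\begin{align*}
d\Omega &= \mathrm{pr}_1^*\,d\vartheta - \mathrm{pr}_2^*\,d\omega_G = \mathrm{pr}_1^*\big(-\tfrac{1}{2}[\vartheta, \vartheta]^{\wedge}\big) - \mathrm{pr}_2^*\big(-\tfrac{1}{2}[\omega_G, \omega_G]^{\wedge}\big) \\
&= -\tfrac{1}{2}[\mathrm{pr}_1^*\vartheta, \mathrm{pr}_1^*\vartheta]^{\wedge} + \tfrac{1}{2}[\mathrm{pr}_2^*\omega_G, \mathrm{pr}_2^*\omega_G]^{\wedge}.
\end{align*}
But $\mathrm{pr}_1^*\vartheta = \Omega + \mathrm{pr}_2^*\omega_G$, so
\begin{align*}
d\Omega &= -\tfrac{1}{2}[(\Omega + \mathrm{pr}_2^*\omega_G), (\Omega + \mathrm{pr}_2^*\omega_G)]^{\wedge} + \tfrac{1}{2}[\mathrm{pr}_2^*\omega_G, \mathrm{pr}_2^*\omega_G]^{\wedge} \\
&= -\tfrac{1}{2}[\Omega, \Omega]^{\wedge} - \tfrac{1}{2}[\Omega, \mathrm{pr}_2^*\omega_G]^{\wedge} - \tfrac{1}{2}[\mathrm{pr}_2^*\omega_G, \Omega]^{\wedge},
\end{align*}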
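which makes it clear that $d\Omega(X, Y) = 0$ whenever $\Omega(X) = 0$ and $\Omega(Y) = 0$. Now we use the leaves of the distribution to construct the solution. Let $p_0 \in M$ and fix $g_0 \in G$. Then let $L_{(p_0,g_0)}$ be the maximal integral manifold through $(p_0, g_0)$. The map $T\mathrm{pr}_1|_{E_{(p_0,g_0)}} : E_{(p_0,g_0)} \to T_{p_0}M$ is an isomorphism, so the inverse mapping theorem tells us that $\mathrm{pr}_1|_{L_{(p_0,g_0)}}$ restricts to a diffeomorphism on some neighborhood $O$ of $(p_0, g_0)$ in $L_{(p_0,g_0)}$. Let $U := \mathrm{pr}_1(O)$ and let $\Phi : U \to O$ be the inverse of this diffeomorphism, so that $\Phi = (\mathrm{id}_U, f)$ for a smooth map $f : U \to G$, that is,
\[ \Phi(p) = (p, f(p)) \]
for all $p \in U$. Observe that $\Phi^*(\Omega) = 0$ since the image of $T\Phi$ lies in the distribution by construction. Thus we have
\[ 0 = \Phi^*(\Omega) = \Phi^*(\mathrm{pr}_1^*\vartheta - \mathrm{pr}_2^*\omega_G) = \Phi^*\mathrm{pr}_1^*\vartheta - \Phi^*\mathrm{pr}_2^*\omega_G = \vartheta - f^*\omega_G, \]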
or $\vartheta|_U = f^*\omega_G = \omega_f$. Let $(a, b) \in U \times G$. Now we argue that we may modify $f$ so that we still have $\vartheta|_U = \omega_f$ while now $f(a) = b$. In fact, if $f(a) = b_1$, then let $g = b\,b_1^{-1}$ and replace $f$ by $L_g \circ f$. □

Proposition 11.43. Let $M$ be an $m$-manifold and $G$ an $n$-dimensional Lie group. Let $\vartheta$ be a $\mathfrak{g}$-valued 1-form on $M$ and suppose that $f_1 : U \to G$ and $f_2 : U \to G$ are smooth maps such that $\vartheta|_U = f_i^*\omega_G$ for $i = 1, 2$. Then if $U$ is connected and $f_1(p_0) = f_2(p_0)$ for some $p_0 \in U$, then $f_1 = f_2$.

Proof. Let $\Phi_i := (\mathrm{id}_U, f_i)$. Then
\[ \Phi_i^*(\Omega) = \Phi_i^*(\mathrm{pr}_1^*\vartheta - \mathrm{pr}_2^*\omega_G) = \Phi_i^*\mathrm{pr}_1^*\vartheta - \Phi_i^*\mathrm{pr}_2^*\omega_G = \vartheta - f_i^*\omega_G = 0, \]
so $\Phi_i : U \to M \times G$ is an embedding for $i = 1, 2$ whose image is the graph of $f_i$ and an integral manifold of the distribution generated by $\Omega$ that contains $(p_0, f_i(p_0))$. Corollary 11.23 applies to show that $\Phi_1(U) = \Phi_2(U)$ and $f_1 = f_2$. □

Corollary 11.44. Let $M$, $G$, $\vartheta$ and $\Omega$ be as above and suppose that $d\vartheta = -\tfrac{1}{2}[\vartheta, \vartheta]^{\wedge}$. Then the restriction of the map $\mathrm{pr}_1 : M \times G \to M$ to any leaf of the distribution given by $\Omega$ is a covering map.

Proof. Let $L$ be a leaf and choose $p_0 \in M$. Choose $(p_0, g_0) \in L$ and let $\Phi : U \to O \subset L_{(p_0,g_0)} = L$ be the diffeomorphism constructed as in the proof of the previous theorem, where $U$ is connected. Now $\Phi = (\mathrm{id}_U, f)$ where $f(p_0) = g_0$. If $(p_0, g_1)$ is any other point in the leaf, then for $g = g_1g_0^{-1}$ the map $\Phi_1 := (\mathrm{id}_U, L_g \circ f)$ is a diffeomorphism $U \to O_1$ where $(p_0, g_1) \in O_1$ and $\mathrm{pr}_1(O_1) = U$. In fact, it is easy to see that $\Phi_1 \circ \Phi^{-1} : O \to O_1$ is a diffeomorphism. If $(p_0, g_1)$ and $(p_0, g_2)$ are distinct points in the leaf, then we construct diffeomorphisms $\Phi_1 : O \to O_1$ and $\Phi_2 : O \to O_2$ as above, and Corollary 11.23 applies to show that $O_1$ and $O_2$ are disjoint and each maps diffeomorphically onto $U$ under $\mathrm{pr}_1$. It is now clear that $M$ is evenly covered by $\mathrm{pr}_1|_L$. □

Corollary 11.45. Let $M$ be a simply connected $m$-manifold and $G$ an $n$-dimensional Lie group. If $\vartheta$ is a $\mathfrak{g}$-valued 1-form on $M$ that satisfies the structural equation $d\vartheta = -\tfrac{1}{2}[\vartheta, \vartheta]^{\wedge}$, then for every $(p_0, g_0) \in M \times G$ there is a smooth function $f : M \to G$ with $f(p_0) = g_0$ and $\vartheta = \omega_f$.
Proof. Let $L$ be the leaf of the distribution determined by $\Omega$ that contains $(p_0, g_0)$. Since $M$ is simply connected, we can lift $\mathrm{id}_M : M \to M$ to a section $\Phi : M \to L$ of $\mathrm{pr}_1|_L$ such that $\Phi(p_0) = (p_0, g_0)$. Once again $\Phi = (\mathrm{id}_M, f)$ for some smooth function $f$ with $f(p_0) = g_0$, and we argue as before to conclude that $\vartheta = \omega_f$ (this time globally). □
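For orientation, consider the simplest simply connected case $M = \mathbb{R}$, with $G$ assumed to be a matrix group (an assumption made only for this sketch). Every 2-form on $\mathbb{R}$ vanishes, so the structural equation holds automatically for any $\vartheta = A(t)\,dt$ with $A : \mathbb{R} \to \mathfrak{g}$ smooth. The corollary then amounts to global solvability of a linear initial value problem:

```latex
% Assumptions: M = \mathbb{R}, G a matrix group, \vartheta = A(t)\,dt with A smooth.
\begin{align*}
\omega_f &= f^{-1}\,df = f(t)^{-1} f'(t)\,dt, \\
\vartheta = \omega_f \quad &\Longleftrightarrow \quad f'(t) = f(t)\,A(t), \qquad f(t_0) = g_0 .
\end{align*}
% One-dimensional instance: G = (\mathbb{R}_{>0},\cdot) and A(t) = a(t) real-valued, where
\[ f(t) = g_0 \exp\Big( \int_{t_0}^{t} a(\tau)\,d\tau \Big). \]
```

The one-dimensional instance recovers the familiar integral formula, which is the sense in which the result is a "fundamental theorem of calculus" for group-valued maps.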
Problems
(1) Show that the following vector fields define a rank 2 distribution on $\mathbb{R}^3$ which is not involutive (and hence not integrable):
\[ X = \frac{\partial}{\partial x} + y\frac{\partial}{\partial z}, \qquad Y = \frac{\partial}{\partial y}. \]
Draw a picture of the portion of this distribution that sits at points in the $x,y$-plane and try to see geometrically why the distribution is not integrable.

(2) Show that the distribution on $\mathbb{R}^4$ given by $X = \frac{\partial}{\partial y} + x\frac{\partial}{\partial z}$ and $Y = \frac{\partial}{\partial x} + y\frac{\partial}{\partial w}$, where $(x, y, z, w)$ are standard coordinates, has no integral manifolds.

(3) Let $\theta$ be a 1-form. Show that a 2-form $\eta$ is in the ideal generated by $\theta$ if and only if $\eta \wedge \theta = 0$.

(4) Consider again the system of partial differential equations:
\begin{align*}
z_x &= F(x, y, z), \\
z_y &= G(x, y, z).
\end{align*}
Show that the graphs of solutions of these equations are integral manifolds of the distribution defined by the 1-form $\theta = dz - F\,dx - G\,dy$. Use Theorem 11.18 to deduce the integrability conditions for this system. [Hint: Use Problem 3.]

(5) Let $H$ be the Heisenberg group consisting of all matrices of the form
\[ A = \begin{bmatrix} 1 & x_{12} & x_{13} \\ 0 & 1 & x_{23} \\ 0 & 0 & 1 \end{bmatrix}. \]
The $x_{ij}$ give global coordinates and a diffeomorphism with $\mathbb{R}^3$. Let $V_{12}, V_{13}, V_{23}$ be the left invariant vector fields on $H$ that have values at the identity with components with respect to the coordinate fields given by $(1,0,0)$, $(0,1,0)$, and $(0,0,1)$, respectively. Let $\Delta_{\{V_{12},V_{13}\}}$ and $\Delta_{\{V_{12},V_{23}\}}$ be the 2-dimensional distributions generated by the indicated pairs of vector fields. Show that $\Delta_{\{V_{12},V_{13}\}}$ is integrable and $\Delta_{\{V_{12},V_{23}\}}$ is not.
(6) Prove (i) of Theorem 11.41.

(7) Show that for a given surface, equations (11.15) are equivalent to a combination of the Codazzi-Mainardi equations and the Gauss curvature equations defined in Chapter 4.
Chapter 12
Connections and Covariant Derivatives
The terms "covariant derivative" and "connection" are sometimes treated as synonymous. In fact, a covariant derivative is sometimes called a Koszul connection. From one point of view, the central idea is that of measuring the rate of change of sections of bundles in the direction of a vector or vector field on the base manifold. Here the derivative viewpoint is prominent. From another related point of view, a connection provides an extra structure that gives a principled way of lifting curves from the base to the total space. The lifts are parallel sections along the curve. In this chapter, we will always take the typical fiber of an $\mathbb{F}$-vector bundle of rank $k$ to be $\mathbb{F}^k$. We let $(\mathbf{e}_1, \ldots, \mathbf{e}_k)$ be the standard basis of $\mathbb{F}^k$. Thus every vector bundle chart $(U, \phi)$ is associated with a frame field $(e_1, \ldots, e_k)$ where $e_i(x) := \phi^{-1}(x, \mathbf{e}_i)$.
12.1. Definitions

Let $\pi : E \to M$ be a smooth $\mathbb{F}$-vector bundle of rank $k$ over a manifold $M$. A covariant derivative can either be defined as a map $\nabla : \mathfrak{X}(M) \times \Gamma(M, E) \to \Gamma(M, E)$ with certain properties from which one derives a well-defined map $\nabla : TM \times \Gamma(M, E) \to \Gamma(M, E)$ with nice properties, or the other way around. We rather arbitrarily start with the first of these definitions.
Definition 12.1. A covariant derivative or Koszul connection on a smooth $\mathbb{F}$-vector bundle $E \to M$ is a map $\nabla : \mathfrak{X}(M) \times \Gamma(M, E) \to \Gamma(M, E)$ (where $\nabla(X, s)$ is written as $\nabla_X s$) satisfying the following four properties:

(i) $\nabla_{fX}(s) = f\nabla_X s$ for all $f \in C^\infty(M)$, $X \in \mathfrak{X}(M)$ and $s \in \Gamma(M, E)$;

(ii) $\nabla_{X_1 + X_2}s = \nabla_{X_1}s + \nabla_{X_2}s$ for all $X_1, X_2 \in \mathfrak{X}(M)$ and $s \in \Gamma(M, E)$;

(iii) $\nabla_X(s_1 + s_2) = \nabla_X s_1 + \nabla_X s_2$ for all $X \in \mathfrak{X}(M)$ and $s_1, s_2 \in \Gamma(M, E)$;

(iv) $\nabla_X(fs) = (Xf)s + f\nabla_X s$ for all $f \in C^\infty(M; \mathbb{F})$, $X \in \mathfrak{X}(M)$ and $s \in \Gamma(M, E)$.

For a fixed $X \in \mathfrak{X}(M)$, the map $\nabla_X : \Gamma(M, E) \to \Gamma(M, E)$ is called the covariant derivative with respect to $X$.
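A minimal example may help fix ideas. Take the trivial bundle $E = M \times \mathbb{F}^k$, so that a section is a $k$-tuple $s = (s^1, \ldots, s^k)$ of smooth functions, and let $A$ be any $k \times k$ matrix of 1-forms on $M$. Both the bundle and $A$ are hypothetical choices made for illustration. Setting $\nabla_X s := Xs + A(X)s$, where $Xs$ is taken componentwise, gives a map satisfying (i)-(iv); only (iv) needs a computation:

```latex
% Assumption: E = M \times \mathbb{F}^k, A a k \times k matrix of 1-forms, \nabla_X s := Xs + A(X)s.
\begin{align*}
\nabla_X(fs) &= X(fs) + A(X)(fs) \\
             &= (Xf)\,s + f\,(Xs) + f\,A(X)s \\
             &= (Xf)\,s + f\,\nabla_X s .
\end{align*}
% (i) and (ii): X \mapsto Xs and X \mapsto A(X) are C^\infty(M)-linear in X.
% (iii): both Xs and A(X)s are additive in s.
```

The choice $A = 0$ gives ordinary componentwise differentiation, and different choices of $A$ give different covariant derivatives on the same bundle.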
As we will see below, this definition is enough to imply that $\nabla$ induces maps $\nabla^U : \mathfrak{X}(U) \times \Gamma(U, E) \to \Gamma(U, E)$, one for each open $U \subset M$, that are naturally related in a sense we make precise below (this is not necessarily true for infinite-dimensional manifolds). Furthermore, we also prove that for a fixed $p \in M$, the value $(\nabla_X s)(p)$ depends only on the value of $X$ at $p$ and only on the values of $s$ along any smooth curve $c$ representing $X_p$. Thus we get a well-defined map $\nabla : TM \times \Gamma(M, E) \to \Gamma(M, E)$ such that $\nabla_v s = (\nabla_X s)(p)$ for any extension of $v \in T_pM$ to a vector field $X$ with $X_p = v$. The resulting properties are

(i') $\nabla_{av}(s) = a\nabla_v s$ for all $a \in \mathbb{R}$, $v \in TM$ and $s \in \Gamma(M, E)$;

(ii') for all $p \in M$ we have $\nabla_{v_1 + v_2}s = \nabla_{v_1}s + \nabla_{v_2}s$ for all $v_1, v_2 \in T_pM$ and $s \in \Gamma(M, E)$;

(iii') $\nabla_v(s_1 + s_2) = \nabla_v s_1 + \nabla_v s_2$ for all $v \in TM$ and $s_1, s_2 \in \Gamma(M, E)$;

(iv') for all $p \in M$ we have $\nabla_v(fs) = (vf)s(p) + f(p)\nabla_v s$ for all $v \in T_pM$, $s \in \Gamma(M, E)$ and $f \in C^\infty(M; \mathbb{F})$;

(v') $p \mapsto \nabla_{X_p}s$ is smooth for all smooth vector fields $X$.

A map satisfying these properties is also called a covariant derivative (or Koszul connection). Note that it is easy to obtain a Koszul connection in the first sense since we just let $(\nabla_X s)(p) := \nabla_{X_p}s$.
Definition 12.2. A covariant derivative on the tangent bundle $TM$ of an $n$-manifold $M$ is usually referred to as a covariant derivative on $M$.

In Chapter 4 we have already met the Levi-Civita connection $\nabla$ on $\mathbb{R}^n$, which is, from the current point of view, a Koszul connection on the tangent bundle of $\mathbb{R}^n$. The definition of this connection takes advantage of the natural identification of tangent spaces which makes taking the difference quotient possible:
\[ \nabla_v X = \lim_{t \to 0} \frac{X(p + tv) - X(p)}{t}. \]
In that same chapter we obtained, by a projection, a covariant derivative on (the tangent bundle of) any hypersurface in $\mathbb{R}^n$. A covariant derivative on a submanifold of arbitrary codimension can be obtained in the same way. Let $M$ be a submanifold of $\mathbb{R}^n$ and let $X \in \mathfrak{X}(M)$ and $v \in T_pM$. We have
\[ \nabla_v X := \Big( \frac{d}{dt}\Big|_0 X \circ c \Big)^{\top} \in T_pM, \]
where $c$ is a smooth curve in $M$ with $\dot{c}(0) = v$ and $\top$ denotes orthogonal projection onto $T_pM$.
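As a sketch of how the projection works, take $M = S^{n-1} \subset \mathbb{R}^n$, the unit sphere (chosen here only for illustration). The normal line at $p$ is spanned by $p$ itself, so the tangential part of a vector $w$ is $w - \langle w, p\rangle p$:

```latex
% Assumption: M = S^{n-1}, c a smooth curve in M with c(0) = p and \dot c(0) = v.
\[
\nabla_v X = \frac{d}{dt}\Big|_0 (X\circ c)
  - \Big\langle \frac{d}{dt}\Big|_0 (X\circ c),\, p \Big\rangle\, p .
\]
% Since \langle X(c(t)), c(t) \rangle = 0 for all t, differentiating at t = 0 gives
% \langle \tfrac{d}{dt}|_0 (X\circ c), p \rangle = -\langle X(p), v \rangle, hence
\[
\nabla_v X = \frac{d}{dt}\Big|_0 (X\circ c) + \langle X(p), v\rangle\, p .
\]
```

The correction term $\langle X(p), v\rangle\,p$ is exactly what is needed to bring the ambient derivative back into the tangent space of the sphere.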
One may easily verify that $\nabla$, so defined, is a covariant derivative (in the second sense above). Returning to the case of a general vector bundle, let us consider how covariant differentiation behaves with respect to restriction to open subsets of our manifold. Recall the restriction map $r^U_V : \Gamma(U, E) \to \Gamma(V, E)$ given by $r^U_V : \sigma \mapsto \sigma|_V$, where $V \subset U$.

Definition 12.3. A natural covariant derivative $\nabla$ on a smooth $\mathbb{F}$-vector bundle $E \to M$ is an assignment to each open set $U \subset M$ of a map $\nabla^U : \mathfrak{X}(U) \times \Gamma(U, E) \to \Gamma(U, E)$, written as $\nabla^U : (X, \sigma) \mapsto \nabla^U_X\sigma$, such that the following assertions hold:

(i) For every open $U \subset M$, the map $\nabla^U$ is a Koszul connection on the restricted bundle $E|_U \to U$.

(ii) For nested open sets $V \subset U$, we have $r^U_V(\nabla^U_X\sigma) = \nabla^V_{r^U_V X}\,r^U_V\sigma$ (naturality with respect to restrictions).

(iii) For $X \in \mathfrak{X}(U)$ and $\sigma \in \Gamma(U, E)$ the value $(\nabla^U_X\sigma)(p)$ only depends on the value of $X$ at $p \in U$.
Here $\nabla^U_X\sigma$ is called the covariant derivative of $\sigma$ with respect to $X$. We will denote all of the maps $\nabla^U$ by the single symbol $\nabla$ when there is no chance of confusion. We have explicitly worked the naturality conditions (ii) and (iii) into the definition of a natural covariant derivative, so this definition is also appropriate for infinite-dimensional manifolds. The definition of Koszul connection did not specifically include these naturality features and was only defined for global sections. We shall now see that, in the case of finite-dimensional manifolds, a Koszul connection gives a natural covariant derivative for free.

Lemma 12.4. Suppose $\nabla : \mathfrak{X}(M) \times \Gamma(E) \to \Gamma(E)$ is a Koszul connection for the vector bundle $E \to M$. If for some open $U$ either $X|_U = 0$ or $\sigma|_U = 0$, then $(\nabla_X\sigma)(p) = 0$ for all $p \in U$.

Proof. We prove the case of $\sigma|_U = 0$ and leave the case of $X|_U = 0$ to the reader.
Let $q \in U$. Then there is some relatively compact open set $V$ with $q \in V \subset U$ and a smooth function $f$ that is identically one on $V$ and zero outside of $U$. Thus $f\sigma \equiv 0$ on $M$, and so since $\nabla$ is linear, we have $\nabla_X(f\sigma) \equiv 0$ on $M$. Since (iv) of Definition 12.1 holds for global fields, and since $f(q) = 1$ and $X_qf = 0$, we have
\[ (\nabla_X\sigma)(q) = f(q)(\nabla_X\sigma)(q) + (X_qf)\sigma(q) = \nabla_X(f\sigma)(q) = 0. \]
Since $q \in U$ was arbitrary, we have the result. □
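We now define a natural covariant derivative derived from a given Koszul connection $\nabla$. Given any open set $U \subset M$, we define $\nabla^U : \mathfrak{X}(U) \times \Gamma(E|_U) \to \Gamma(E|_U)$ by
\[ (\nabla^U_X\sigma)(p) := (\nabla_{\widetilde{X}}\widetilde{\sigma})(p), \qquad p \in U, \tag{12.1} \]
for $\widetilde{X} \in \mathfrak{X}(M)$ and $\widetilde{\sigma} \in \Gamma(E)$ chosen to be any sections which agree with $X$ and $\sigma$ on some open $V$ with $p \in V \subset \overline{V} \subset U$. By the above lemma this definition does not depend on the choices of $\widetilde{X}$ and $\widetilde{\sigma}$.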
Proposition 12.5. Let $E \to M$ be a rank $k$ vector bundle and suppose that $\nabla : \mathfrak{X}(M) \times \Gamma(E) \to \Gamma(E)$ is a Koszul connection. If for each open $U$ we define $\nabla^U$ as in (12.1) above, then the assignment $U \mapsto \nabla^U$ is a natural covariant derivative as in Definition 12.3.

Proof. We must show that (i), (ii) and (iii) of Definition 12.3 hold. It is easily checked that (i) holds, that is, that $\nabla^U$ is a Koszul connection for each $U$. The demonstration that (ii) holds is also easy and we leave it for the reader to check. Now since $X \mapsto \nabla_X\sigma$ is linear over $C^\infty(U)$, (iii) follows by familiar arguments ($\nabla_X\sigma$ is linear over functions in the argument $X$). □

Because of this last result, we may define $\nabla_{v_p}\sigma$ for $v_p \in T_pM$ by
\[ \nabla_{v_p}\sigma := (\nabla_X\sigma)(p), \]
where $X$ is any vector field with $X(p) = v_p \in T_pM$. We say that $\nabla_X\sigma$ is "tensorial" in the variable $X$. The result can be seen as a special case of Proposition 6.55. We now see that it is safe to write expressions not directly justified by the definition of Koszul connection. For example, if $X \in \mathfrak{X}(U)$ and $\sigma \in \Gamma(V, E)$, where $U \cap V \neq \emptyset$, then $\nabla_X\sigma$ is taken to be an element of $\Gamma(U \cap V, E)$ defined by
\[ (\nabla_X\sigma)(p) = \nabla_{X_p}\sigma := \nabla^{U \cap V}_{X_p}\sigma \quad \text{for all } p \in U \cap V. \]
This is a particularly useful convention when $U$ is the domain of a chart $(U, x)$ and $X = \frac{\partial}{\partial x^i}$, and also when $\sigma$ is a member of a frame field of the vector bundle defined on some open set. In the same way that one extends a derivation on vector fields to a tensor derivation, one may show that a covariant derivative on a vector bundle induces naturally related covariant derivatives on all the multilinear bundles.
In particular, if $\pi^* : E^* \to M$ denotes the dual bundle to $\pi : E \to M$, we may define connections on $\pi^* : E^* \to M$ and on $\pi \otimes \pi^* : E \otimes E^* \to M$. We do this in such a way that for $s \in \Gamma(M, E)$ and $s^* \in \Gamma(M, E^*)$ we have
\[ \nabla^{E \otimes E^*}_X(s \otimes s^*) = \nabla_X s \otimes s^* + s \otimes \nabla^{E^*}_X s^*, \]
and
\[ (\nabla^{E^*}_X s^*)(s) = X(s^*(s)) - s^*(\nabla_X s). \]
Of course, the second formula follows from the requirement that covariant differentiation commutes with contraction:
\begin{align*}
X(s^*(s)) &= \nabla_X C(s \otimes s^*) = C\big(\nabla^{E \otimes E^*}_X(s \otimes s^*)\big) \\
&= C\big(\nabla_X s \otimes s^* + s \otimes \nabla^{E^*}_X s^*\big) = s^*(\nabla_X s) + (\nabla^{E^*}_X s^*)(s),
\end{align*}
where $C$ denotes the contraction given by $s \otimes \alpha \mapsto \alpha(s)$. All this works like the tensor derivation extension procedure discussed previously, and we often write all of these covariant derivatives with the single symbol $\nabla$. The bundle $E \otimes E^* \to M$ is naturally isomorphic to $\operatorname{End}(E)$, and by this isomorphism we get a connection on $\operatorname{End}(E)$. If we identify elements of $\Gamma(\operatorname{End}(E))$ with $\operatorname{End}(\Gamma E)$ (see Proposition 6.55), then we may use the following formula for the definition of the connection on $\operatorname{End}(E)$:
\[ (\nabla_X A)(s) := \nabla_X(A(s)) - A(\nabla_X s). \]
Indeed, since $C : s \otimes A \mapsto A(s)$ is a contraction, we must have
\[ \nabla_X(A(s)) = C(\nabla_X s \otimes A + s \otimes \nabla_X A) = A(\nabla_X s) + (\nabla_X A)(s). \]
Notice that if we fix $s \in \Gamma(E)$, then for each $p \in M$, we have an element $(\nabla s)(p)$ of $L(T_pM, E_p) \cong E_p \otimes T_p^*M$ given by
\[ (\nabla s)(p) : v_p \mapsto \nabla_{v_p}s \quad \text{for all } v_p \in T_pM. \]
Thus we obtain a section $\nabla s$ of $E \otimes T^*M$ given by $p \mapsto (\nabla s)(p)$, which is easily shown to be smooth. In this way, we can also think of a connection as giving a map $\nabla : \Gamma(E) \to \Gamma(E \otimes T^*M)$ with the property that
\[ \nabla(fs) = f\nabla s + s \otimes df \]
for all $s \in \Gamma(E)$ and $f \in C^\infty(M)$. Since, by definition, $\Gamma(E) = \Omega^0(E)$ and $\Gamma(E \otimes T^*M) = \Omega^1(E)$, we really have a map $\Omega^0(E) \to \Omega^1(E)$. Later we will extend to a map $\Omega^k(E) \to \Omega^{k+1}(E)$ for all integers $k \geq 0$. Now if $X$ is a smooth vector field, then $X \mapsto \nabla_X s \in \Gamma(E)$, so we may also view $\nabla s$ as an element of $\operatorname{Hom}(\mathfrak{X}(M), \Gamma(E))$.
12.2. Connection Forms

Let $\pi : E \to M$ be a rank $k$ vector bundle with a connection $\nabla$. Recall that a choice of a local frame field over an open set $U \subset M$ is equivalent to a trivialization of the restriction $\pi_U : E|_U \to U$. Namely, if $\phi = (\pi, \Phi)$ is such a trivialization over $U$, then defining $e_i(x) = \phi^{-1}(x, \mathbf{e}_i)$, where $(\mathbf{e}_i)$ is the standard basis of $\mathbb{F}^k$, we have a frame field $(e_1, \ldots, e_k)$. We now examine the expression for a given covariant derivative from the viewpoint of such a local frame. It is not hard to see that for every such frame field there must be a matrix of 1-forms $\omega = (\omega^i_j)_{1 \leq i,j \leq k}$ such that for $X \in \mathfrak{X}(U)$ we may write
\[ \nabla_X e_j = \sum_{i=1}^k \omega^i_j(X)\,e_i. \]
The forms $\omega^i_j$ are called connection forms.

Proposition 12.6. If $s = \sum_i s^i e_i$ is the local expression of a section $s \in \Gamma(E)$ in terms of a local frame field $(e_1, \ldots, e_k)$, then the following local expression holds:
\[ \nabla_X s = \sum_i \Big( Xs^i + \sum_r \omega^i_r(X)\,s^r \Big) e_i. \]

Proof. We simply compute:
\begin{align*}
\nabla_X s &= \nabla_X\Big(\sum_i s^i e_i\Big) = \sum_i (Xs^i)\,e_i + \sum_i s^i\,\nabla_X e_i \\
&= \sum_i (Xs^i)\,e_i + \sum_{i,j} s^i\,\omega^j_i(X)\,e_j \\
&= \sum_i (Xs^i)\,e_i + \sum_{i,r} s^r\,\omega^i_r(X)\,e_i. \qquad \square
\end{align*}
So the $i$-th component of $\nabla_X s$ is
\[ (\nabla_X s)^i = Xs^i + \sum_r \omega^i_r(X)\,s^r. \tag{12.2} \]
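To see the component formula in action, consider the tangent bundle of $\mathbb{R}^2 \setminus \{0\}$ with the difference-quotient connection of Definition 12.2, together with the rotating frame $e_1 = \cos\theta\,\partial_x + \sin\theta\,\partial_y$, $e_2 = -\sin\theta\,\partial_x + \cos\theta\,\partial_y$, where $\theta$ is the polar angle. This frame is an illustrative choice, not one used in the text. Since $\partial_x$ and $\partial_y$ have zero covariant derivative, only the coefficients are differentiated:

```latex
% Assumption: flat connection on T(\mathbb{R}^2\setminus\{0\}), frame rotated by the polar angle \theta.
\begin{align*}
\nabla_X e_1 &= X(\cos\theta)\,\partial_x + X(\sin\theta)\,\partial_y = d\theta(X)\,e_2, \\
\nabla_X e_2 &= -X(\sin\theta)\,\partial_x + X(\cos\theta)\,\partial_y = -d\theta(X)\,e_1, \\
\omega = (\omega^i_j) &= \begin{pmatrix} 0 & -d\theta \\ d\theta & 0 \end{pmatrix}
   \qquad (\text{row } i,\ \text{column } j), \\
(\nabla_X s)^1 &= Xs^1 - d\theta(X)\,s^2, \qquad
(\nabla_X s)^2 = Xs^2 + d\theta(X)\,s^1 .
\end{align*}
```

Even though the connection is flat, the connection forms are nonzero in this frame; they record how the frame itself turns.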
We may surely choose $U$ small enough that it is also the domain of a coordinate frame $\{\partial_\mu\}$ for $M$. Thus we have
\[ \nabla_{\partial_\mu} e_j = \sum_k \omega^k_{\mu j}\,e_k, \tag{12.3} \]
where $\omega^k_{\mu j} = \omega^k_j(\partial_\mu)$. We now have the local formula
\[ \nabla_X s = \sum_{i=1}^k \Big( \sum_{\mu=1}^n X^\mu \partial_\mu s^i + \sum_{\mu=1}^n \sum_{r=1}^k X^\mu \omega^i_{\mu r}\,s^r \Big) e_i. \tag{12.4} \]
Or, using the summation convention,
\[ \nabla_X s = X^\mu\big(\partial_\mu s^i + \omega^i_{\mu r}\,s^r\big)\,e_i. \]
Now suppose that we have two moving frames whose domains overlap, say $u = (e_1, \ldots, e_k)$ and $u' = (e'_1, \ldots, e'_k)$. Let us examine how the matrix of 1-forms $\omega = (\omega^i_j)$ is related to the corresponding $\omega' = (\omega'^i_j)$ defined in terms of the frame $u'$. The change of frame is
\[ e'_i = \sum_j g^j_i\,e_j, \]
which in matrix notation is $u' = ug$ for some smooth $g : U \cap U' \to \mathrm{GL}(k, \mathbb{F})$. (We treat $u$ as a row vector of fields.) For a given moving frame, let $\nabla u := [\nabla e_1, \ldots, \nabla e_k]$, so that $\nabla u = u\omega$. Differentiating both sides of $u' = ug$ and using matrix notation we have
\begin{align*}
u' &= ug, \\
\nabla u' &= \nabla(ug), \\
u'\omega' &= (\nabla u)g + u\,dg, \\
u'\omega' &= ugg^{-1}\omega g + ugg^{-1}dg, \\
u'\omega' &= u'g^{-1}\omega g + u'g^{-1}dg,
\end{align*}
and so we obtain the transformation law for connection forms:
\[ \omega' = g^{-1}\omega g + g^{-1}dg. \]
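Two quick consistency checks of the transformation law, offered as a sketch. Both the rank-one bundle and the rotation matrix $g$ below are illustrative assumptions rather than examples from the text:

```latex
% Check 1 (rank one, \mathbb{F} = \mathbb{R}): g is a nowhere-zero function and 1x1 matrices commute, so
\[ \omega' = g^{-1}\omega g + g^{-1}dg = \omega + d(\log|g|). \]
% Check 2 (rank two): start from a frame u with \omega = 0 and rotate it by an angle function \theta.
\[
g = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},
\qquad
g^{-1}dg
= \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
  \begin{pmatrix} -\sin\theta & -\cos\theta \\ \cos\theta & -\sin\theta \end{pmatrix} d\theta
= \begin{pmatrix} 0 & -d\theta \\ d\theta & 0 \end{pmatrix},
\]
% so \omega' = g^{-1}dg: the rotated frame satisfies
% \nabla e'_1 = d\theta \otimes e'_2 and \nabla e'_2 = -d\theta \otimes e'_1.
```

In the second check the inhomogeneous term $g^{-1}dg$ is the whole answer, which shows that connection forms do not transform as tensors: they can vanish in one frame and not in another.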
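12.3. Differentiation Along a Map

Once again let $\pi : E \to M$ be a vector bundle with a Koszul connection $\nabla$. Consider a smooth map $f : N \to M$ and a section $\sigma : N \to E$ along $f$. Let $e_1, \ldots, e_k$ be a frame field defined over $U \subset M$. Since $f$ is continuous, $O = f^{-1}(U)$ is open and $e_1 \circ f, \ldots, e_k \circ f$ are fields along $f$ defined on $O$. We may write $\sigma = \sum_{a=1}^k \sigma^a\,e_a \circ f$ for some functions $\sigma^a : O \subset N \to \mathbb{F}$. For any $p \in O$ and $v \in T_pN$, we define
\[ \nabla^f_v\sigma := \sum_{a=1}^k (v\sigma^a)\,e_a(f(p)) + \sum_{a=1}^k \sigma^a(p)\,\nabla_{Tf \cdot v}\,e_a. \tag{12.5} \]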
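A direct albeit tedious calculation shows that this result is independent of the choice of frame field. Thus we obtain a global map
\[ \nabla^f : TN \times \Gamma_f(E) \to \Gamma_f(E). \]
The map $\nabla^f : TN \times \Gamma_f(E) \to \Gamma_f(E)$ satisfies properties that qualify it as a covariant derivative along $f$. Namely, we have

Definition 12.7. Let $\pi : E \to M$ and $f : N \to M$ be as above. A covariant derivative along $f$ is a map $\nabla^f : TN \times \Gamma_f(E) \to \Gamma_f(E)$ that satisfies

(i) $\nabla^f : TN \times \Gamma_f(E) \to \Gamma_f(E)$ is fiberwise linear in the first argument: $\nabla^f_{au + bv}\sigma = a\nabla^f_u\sigma + b\nabla^f_v\sigma$ for all $\sigma \in \Gamma_f(E)$, scalars $a, b$, and $u, v \in T_pN$ ($p$ is arbitrary).

(ii) $\nabla^f_u(\sigma_1 + \sigma_2) = \nabla^f_u\sigma_1 + \nabla^f_u\sigma_2$ for any $u \in TN$ and any $\sigma_1, \sigma_2 \in \Gamma_f(E)$.

(iii) For $v \in T_pN$, $h \in C^\infty(N; \mathbb{F})$, and $\sigma \in \Gamma_f(E)$, we have $\nabla^f_v(h\sigma) = h(p)\nabla^f_v\sigma + v(h)\sigma(p)$.

(iv) If $U \in \mathfrak{X}(N)$, then $p \mapsto \nabla^f_{U(p)}\sigma$ is smooth for all $\sigma \in \Gamma_f(E)$.

(v) If $g : P \to N$ and $f : N \to M$ are smooth, then $\nabla^{f \circ g}$ is related to $\nabla^f$ by the following chain rule:
\[ \nabla^{f \circ g}_u(\sigma \circ g) = \nabla^f_{Tg \cdot u}\sigma \]
for $u \in TP$; see the following diagram:

[Diagram: the maps $g : P \to N$ and $f : N \to M$, with $\sigma \circ g$ and $\sigma$ mapping into $E$.]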
In the next section we give a more geometric view of covariant differentiation. Among other things, we obtain a more geometric way of producing $\nabla^f$ that does not appeal to a frame field. Also, we usually omit the superscript on $\nabla^f$ since it will be clear from context. As a special case, consider a section $\sigma$ of $E$ along a smooth curve $c : (a, b) \to M$. Then we can form the operator $\nabla_{\partial/\partial t}\sigma$, which is also denoted by $\frac{\nabla\sigma}{\partial t}$ or $\nabla_{\partial_t}\sigma$. We have the formula
\[ \nabla_{\partial_t}\sigma = \sum_i \frac{d\sigma^i}{dt}\,e_i + \sum_i \sigma^i\,\nabla_{\partial_t}e_i, \]
where $e_i \circ c$ is abbreviated to $e_i$.
12.4. Ehresmann Connections

One way to get a natural covariant derivative on a vector bundle is through the notion of connection in another sense. What we will define in this section is a special case of what is called an Ehresmann connection. An Ehresmann connection is a structure that can be defined in the category of general smooth fiber bundles, but the two most common instances are defined for vector bundles and for principal bundles, and in each case extra hypotheses are added to the definition. We work with vector bundles here and give an explanation of the principal bundle approach in the online supplement [Lee, Jeff].

We first describe the construction in words since the notation tends to obscure what is really a simple geometric idea. First, notice that an ordinary function of one variable is constant on an interval if its graph is horizontal over that interval. For sections of a vector bundle, there is no a priori notion of "horizontal", so we have no principled way to decide what sections should be considered constant. Consider a moving frame field $e_1, \ldots, e_k$ on some vector bundle. If we had a reason to declare these frame fields to be constant, then we could differentiate a general section $s = \sum s^i e_i$ in a direction $v$ by the rule $D_v s = \sum (vs^i)\,e_i$. So it seems clear that having a geometrically motivated way of picking out which sections should be constant will lead to a way of differentiating sections.

The first step is to notice that we have a natural notion of vertical for a vector bundle $\pi : E \to M$. The tangent spaces of the fibers of $E$ are to be considered vertical. Thus the vertical subspace of $T_yE$ is $T_yE_p \subset T_yE$, where $\pi(y) = p$. Now if we have a vector in $T_yE$ for some $y \in E$, we would like to project it onto the vertical direction. However, this entails having a complementary horizontal space along which we project. The idea is then to assume that there is a distribution on $E$, that is, a subbundle of $TE$, that is everywhere complementary to the vertical directions. Thus we obtain a subspace complementary to the vertical, which allows projection onto the vertical. Once we have this we can say that a section $s \in \Gamma(E)$ is horizontal (i.e. constant) along a curve $c : I \to M$ if $Ts \cdot \dot{c}(t)$ has a vertical projection of zero for all $t$. Now we can define a covariant derivative as follows. If $v \in T_pM$ and $s$ is a smooth section, then we can project $T_ps \cdot v$ onto the vertical space $T_{s(p)}E_p$, obtaining an element $\nabla_v s$. But $E_p$ is a vector space, and so we may identify $T_{s(p)}E_p$ with $E_p$ and take $\nabla_v s$ to be an element of $E_p$.
The resulting map $v \mapsto \nabla_v s$ turns out to define a covariant derivative. We will also consider sections along general smooth maps $f : N \to M$ and obtain covariant differentiation for sections along $f$.

There is a canonical "parallelism" on $\mathbb{R}^n$. Recall the canonical map $J_p : \mathbb{R}^n \to T_p\mathbb{R}^n$. A vector $v_1 \in T_p\mathbb{R}^n$ is parallel to a vector $v_2 \in T_q\mathbb{R}^n$ exactly when $(J_q \circ J_p^{-1})(v_1) = v_2$. The map $J_q \circ J_p^{-1}$ is a special example of what is called "parallel translation" and in this case establishes what is sometimes called "distant parallelism". In the presence of a connection, we obtain a map between fibers of a vector bundle which is called parallel translation. But this map may depend on a choice of smooth curve connecting the base points. Locally, this path dependence of parallel translation is due to the curvature of the connection.

Figure 12.1. Vertical space

We now proceed more formally. We give the definition of vertical bundle not just for a vector bundle, but also for a general fiber bundle. First a lemma:
Lemma 12.8. Let $\pi : E \to M$ be a fiber bundle with typical fiber a $k$-manifold $F$. Fix $p \in M$ and let $\iota : E_p \hookrightarrow E$ be the inclusion. For all $y \in E_p$, we have
\[ T_y\iota\,(T_yE_p) = \operatorname{Ker}[T_y\pi : T_yE \to T_pM] = (T_y\pi)^{-1}(0_p) \subset T_yE, \]
where $0_p \in T_pM$ is the zero vector. If

Note: As usual we identify $T_y\iota\,(T_yE_p)$ with $T_yE_p$.
Definition 12.9. Let $\pi : E \to M$ be a fiber bundle with typical fiber $F$ and $\dim F = k$. Let $V_yE := (T_y\pi)^{-1}(0_p)$ where $\pi(y) = p$. The vertical bundle on $\pi : E \to M$ is the real vector bundle $\pi_V : VE \to E$ with total space defined by the disjoint union
\[ VE := \bigsqcup_{y \in E} V_yE \subset TE. \]
The projection map is defined by the restriction $\pi_V := \pi_{TE}|_{VE}$. A vector bundle atlas on $VE$ is given by vector bundle charts of the form
\[ (\pi_V, dx \circ T\Phi) : \pi_V^{-1}\big(\pi^{-1}(U) \cap \Phi^{-1}(V)\big) \to \big(\pi^{-1}(U) \cap \Phi^{-1}(V)\big) \times \mathbb{R}^k, \]
where $\phi = (\pi, \Phi)$ is a bundle chart on $E$ over $U$ and $(V, x)$ a chart in $F$.
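For the following, refer to the diagram:

[Diagram: the pull-back square $f^*E \to E$ lying over $f : N \to M$, with $f^*E \subset N \times E$, together with the square $Vf^*E \to VE$ lying over $f^*E \to E$.]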
Exercise 12.10. Prove: Let $f : N \to M$ be a smooth map and $\pi : E \to M$ a fiber bundle with typical fiber $F$. Then $Vf^*E \to f^*E$ is bundle isomorphic to $\tilde{f}^*VE \to f^*E$, where $\tilde{f} := \mathrm{pr}_2|_{f^*E} : f^*E \to E$ and $\mathrm{pr}_2 : N \times E \to E$.

Now consider the pull-back bundle $f^*E$ where $f$ is as above. Since $f^*E$ is the submanifold of $N \times E$ defined by the condition that $(q, y)$ is in $f^*E$ if and only if $f(q) = \pi(y)$, a curve in $f^*E$ must be of the form $(c_1, c_2)$, where $c_1$ is a curve in $N$ and $c_2$ is a curve in $E$, and it must also be the case that $f \circ c_1 = \pi \circ c_2$. Now if $\mathrm{pr}_1$ and $\mathrm{pr}_2$ are the first and second factor projections from $N \times E$, then $(T\mathrm{pr}_1, T\mathrm{pr}_2)$ gives a vector bundle isomorphism of the bundle $T(N \times E) \to N \times E$ with the bundle $TN \times TE \to N \times E$, and so we expect that under this isomorphism $T(f^*E)$ corresponds to a subbundle of $TN \times TE \to N \times E$.

Exercise 12.11. Show that under the bundle isomorphism $(T\mathrm{pr}_1, T\mathrm{pr}_2) : T(N \times E) \cong TN \times TE$, the tangent bundle $T(f^*E)$ corresponds to $\{(v, w) \in TN \times TE : Tf \cdot v = T\pi \cdot w\}$. Under this isomorphism $(Vf^*E)_{(q,y)}$ corresponds to $\{0_q\} \times V_yE$.

We need to make an observation. If $V$ is a complex vector space and $x \in V$, then the tangent space $T_xV$ has a natural complex structure. Indeed, the map $J_x : V \to T_xV$ is used to transfer the complex structure of $V$ to that of $T_xV$. In particular, if $\pi : E \to M$ is a complex vector bundle, then we may view $T_yE_p = (T_y\pi)^{-1}(0_p)$ as a complex vector space. Thus $VE$ is a complex vector bundle.
The vertical vector bundle $VE$ is isomorphic to the vector bundle $\pi^*E$ over $E$ (we say that $VE$ is isomorphic to $E$ along $\pi$). To see this, note that if $(v, w) \in \pi^*E$, then $\pi(v + tw)$ is constant in $t$. From this we see that the map from $\pi^*E$ to $TE$ given by $(v, w) \mapsto \frac{d}{dt}\big|_0 (v + tw)$ maps into $VE$. It is easy to see that this map is a vector bundle isomorphism:
\[ J : \pi^*E \to VE, \qquad J : (y, w) \mapsto J_y w := \frac{d}{dt}\Big|_0 (y + tw) = w_y. \]
The meaning of the symbol $J$ depends on the bundle we have in mind, but it is consistent with our previous use in the sense that if $E$ is the tangent bundle of an open set in a vector space, then $J_y$ is the canonical isomorphism as before.

Definition 12.12. A (linear Ehresmann) connection on a vector bundle $\pi : E \to M$ is a smooth distribution $\mathcal{H}$ on the total space $E$ such that

(i) $\mathcal{H}$ is complementary to the vertical bundle: $TE = \mathcal{H} \oplus VE$.

(ii) $\mathcal{H}$ is homogeneous: $T_y\mu_r(\mathcal{H}_y) = \mathcal{H}_{ry}$ for all $y \in E$, $r \in \mathbb{R}$, where $\mu_r : E \to E$ is the multiplication map given by $\mu_r : y \mapsto ry$.

The subbundle $\mathcal{H}$ is called the horizontal distribution (or horizontal subbundle).

The statement $TE = \mathcal{H} \oplus VE$ means that for every $y \in E$ we have the internal direct sum decomposition $T_yE = \mathcal{H}_y \oplus V_yE$. Any $v \in T_yE$ has a corresponding decomposition $v = p_v v + p_h v$. Here, $p_v : v \mapsto p_v v$ and $p_h : v \mapsto p_h v$ are the obvious projections onto $V_yE$ and $\mathcal{H}_y$, referred to respectively as the vertical and horizontal projections. Note well that without a choice of horizontal distribution $\mathcal{H}$ there is no vertical projection $p_v = 1 - p_h$. For $y \in E$, an individual element $w \in T_yE$ is horizontal if $w \in \mathcal{H}_y$ and vertical if $w \in V_yE$. A vector field $X \in \mathfrak{X}(E)$ is said to be a horizontal vector field (resp. vertical vector field) if $X(y) \in \mathcal{H}_y$ (resp. $X(y) \in V_yE$) for all $y \in E$. On the right hand side of Figure 12.2 we see a schematic representation of the field of horizontal spaces. Notice that these spaces are tangent to the zero section (which we identify with $M$). This must be the case and is a consequence of the homogeneity condition (ii) from the definition. On the left hand side we see a particular tangent space to $E$ together with a vector and its vertical projection.

Exercise 12.13. Show that $\mathcal{H} \cong \pi^*TM$.
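The definition can be made concrete on a trivial bundle. In the sketch below, $E = M \times \mathbb{F}^k$ and $A$ is an arbitrary $k \times k$ matrix of 1-forms on $M$; both are hypothetical data chosen for illustration. A section is treated as a map $s : M \to \mathbb{F}^k$, so that $Ts \cdot v = (v, ds(v))$:

```latex
% Assumptions: E = M \times \mathbb{F}^k, T_{(p,y)}E = T_pM \times \mathbb{F}^k,
% A a k \times k matrix of 1-forms on M, \mu_r(p,y) = (p, ry).
\begin{align*}
V_{(p,y)}E &= \{0_p\} \times \mathbb{F}^k, \qquad
\mathcal{H}_{(p,y)} := \{ (v, -A_p(v)\,y) : v \in T_pM \}, \\
T\mu_r\,(v, -A_p(v)\,y) &= (v, -A_p(v)\,(ry)) \in \mathcal{H}_{(p,ry)}, \\
Ts \cdot v = (v, ds(v)) &= \underbrace{(v, -A_p(v)\,s(p))}_{\text{horizontal}}
   + \underbrace{(0,\ ds(v) + A_p(v)\,s(p))}_{\text{vertical}} .
\end{align*}
% Identifying the vertical space with \mathbb{F}^k, the vertical projection of Ts \cdot v is
\[ \nabla_v s = ds(v) + A_p(v)\,s(p). \]
```

The first line shows complementarity, the second shows homogeneity, and the last recovers the familiar "derivative plus connection matrix" form of a covariant derivative. At $y = 0$ the horizontal space is $T_pM \times \{0\}$, tangent to the zero section.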
Theorem 12.14. Every vector bundle admits a connection.
Figure 12.2. Horizontal distribution
Proof. First notice that we may easily define a connection on a trivial bundle $\mathrm{pr}_1 : M \times V \to M$. Given a fixed $v \in V$, let $i_v : M \to M \times V$ be defined by $i_v(p) := (p, v)$. Next define $\mathcal{H}_{(p,v)} := Ti_v(T_pM)$ for each $p$. We then have $T\mathrm{pr}_1(\mathcal{H}_{(p,v)}) = T_pM$. Since for any scalar $a$ we have $\mu_a \circ i_v = i_{av}$, we also have $T\mu_a \circ Ti_v = Ti_{av}$, so that
\[ T\mu_a(\mathcal{H}_{(p,v)}) = T\mu_a(Ti_v(T_pM)) = Ti_{av}(T_pM) = \mathcal{H}_{(p,av)}. \]
For a general vector bundle $\pi : E \to M$, let $\{U_\alpha\}$ be a locally finite cover of $M$ such that the bundle is trivial over each $U_\alpha$. Then we may choose a connection $\mathcal{H}^\alpha$ on each $\pi^{-1}(U_\alpha)$. Let $\{\rho_\alpha\}$ be a partition of unity subordinate to $\{U_\alpha\}$. For each $y \in E$, define $L_y : T_{\pi(y)}M \to T_yE$ by
\[ L_y(v) := \sum_\alpha \rho_\alpha(\pi(y))\,w_\alpha, \]
where for each $\alpha$, the vector $w_\alpha$ is the unique vector in $\mathcal{H}^\alpha_y$ such that $T\pi \cdot w_\alpha = v$. It is easy to check that $L_y$ is linear and satisfies $T_y\pi \circ L_y = \mathrm{id}_{T_pM}$. The distribution we seek is then defined by $L_y(T_{\pi(y)}M)$ for each $y$. We leave it to the reader to check that this distribution is smooth and satisfies the required conditions. □
Theorem 12.15. If $\mathcal{H}$ is a connection on $\pi : E \to M$ and $f : N \to M$ is a smooth map, then the distribution $f^*\mathcal{H} := (T\tilde f)^{-1}(\mathcal{H})$, where $\tilde f := \mathrm{pr}_2|_{f^*E}$, defines a connection on the pull-back bundle $f^*E \to N$. This is referred to as the pull-back connection:
$$f^*\mathcal{H} \subset T(f^*E) \xrightarrow{\;T\tilde f\;} TE.$$
Proof. First note that $\tilde f$ is the restriction of the projection $N \times E \to E$, and so $f^*\mathcal{H} := (T\tilde f)^{-1}(\mathcal{H})$ can be defined: for $(q, y) \in f^*E$, we let $(f^*\mathcal{H})_{(q,y)} = (T_{(q,y)}\tilde f)^{-1}\mathcal{H}_y$. By Exercise 12.11 we can identify $T(f^*E)$ with $\{(v, w) \in TN \times TE : Tf \cdot v = T\pi \cdot w\}$. Under this isomorphism, $(Vf^*E)_{(q,y)}$ corresponds to $\{0_q\} \times V_yE$, while $(f^*\mathcal{H})_{(q,y)}$ corresponds to $\{(v, w) \in TN \times \mathcal{H} : Tf \cdot v = T\pi \cdot w\}$. Thus we have the decomposition
$$(v, w) = (v, p_h w) + (0, p_v w),$$
which is easily seen to be unique. Hence the distribution $f^*\mathcal{H}$ is complementary to $Vf^*E$. Under the same identification, multiplication $m_a$ on $f^*E$ is $m_a(q, y) = (q, \mu_a y)$ and $Tm_a(v, w) = (v, T\mu_a w)$, which makes it clear that the second defining condition for a connection holds for $f^*\mathcal{H}$ since it holds for $\mathcal{H}$. $\square$

Given a vector $v \in T_pM$ and a choice of $y \in E_p$, there is a unique vector $\tilde v_y \in \mathcal{H}_y \subset T_yE$ such that $T_y\pi \cdot \tilde v_y = v$. This vector is called the horizontal lift of $v$ to $T_yE$. The idea works for fields too. Given a vector field $X \in \mathfrak{X}(M)$, there is a unique vector field $\tilde X \in \mathfrak{X}(E)$ such that $\tilde X(y)$ is horizontal for all $y \in E$ and $T_y\pi \cdot \tilde X(y) = X(\pi(y))$. Thus $\tilde X \in \Gamma(\mathcal{H}) \subset \mathfrak{X}(E)$. This horizontal vector field on the total space $E$ is called the horizontal lift of $X$. Clearly $\tilde X$ is $\pi$-related to $X$.

Proposition 12.16. Let $\pi : E \to M$ be a vector bundle with connection $\mathcal{H}$. If $X, Y \in \mathfrak{X}(M)$ and $f \in C^\infty(M)$, then
(i) $\widetilde{aX + bY} = a\tilde X + b\tilde Y$ for all $a, b \in \mathbb{R}$;
(ii) $\widetilde{fX} = (f \circ \pi)\tilde X$;
(iii) $\widetilde{[X, Y]} = p_h[\tilde X, \tilde Y]$.

Proof. (i) and (ii) are obvious. (iii) follows from the easy to check equalities $\pi_*[\tilde X, \tilde Y] = [X, Y] = \pi_*\widetilde{[X, Y]}$ and the uniqueness of horizontal lifts. $\square$
Definition 12.17. Let $\sigma : N \to E$ be a section of $E$ along a map $f : N \to M$. We say that $\sigma$ is a parallel section if $T\sigma \cdot v$ is horizontal for all $v \in TN$. If $s$ is a section of $E$ and $c : I \to M$ is a curve, then we say that $s$ is parallel along $c$ provided $s \circ c$ is parallel.

Exercise 12.18. Recall that a section of $f^*E$ must have the form $s : q \mapsto (q, \sigma_s(q))$, where $\sigma_s$ is a section along $f$. Show that if $s$ is parallel with respect to the pull-back connection on $f^*E$, then $\sigma_s$ is parallel.

Exercise 12.19. Let $[0, b]$ be an interval and let $t$ denote the standard coordinate on $[0, b]$. Suppose that $\pi : E \to [0, b]$ is a vector bundle over $[0, b]$ with connection. Let $\tilde\partial$ denote the horizontal lift of $\partial/\partial t$.
(a) Show that if $c : [0, a] \to E$ is an integral curve of $\tilde\partial$ with $c(0) \in E_0$, then $c(a) \in E_a$. [Hint: Show that $\pi \circ c$ is an integral curve of $\partial/\partial t$.]

(b) Let $0 \le t_0 < b$. Show that there is a fixed $\epsilon > 0$ depending only on $t_0$ such that all maximal integral curves of $\tilde\partial$ originating in the fixed fiber $E_{t_0}$ are defined at least on $[t_0, t_0 + \epsilon)$. [Hint: Endow $E$ with a bundle metric and consider all integral curves originating in the unit sphere in $E_{t_0}$ and then use property (ii) of Definition 12.12.]

(c) Using (a) and (b), show that the maximal integral curves of $\tilde\partial$ originating in $E_0$ all have domain equal to $[0, b]$.
Finding a horizontal lift of a vector field in $\mathfrak{X}(M)$ is trivial and automatic. On the other hand, finding a parallel section along a given map with prescribed value $\sigma(q)$ for some $q$ is generally nontrivial, and such a section may not exist. However, we have the following

Theorem 12.20. Let $\pi : E \to M$ be a (smooth) vector bundle with a connection $\mathcal{H}$. Suppose that $c : [a, b] \to M$ is a smooth curve. For each $u \in E_{c(a)}$, there is a unique parallel section $\sigma_{c,u}$ along $c$ such that $\sigma_{c,u}(a) = u$. If $P_c : E_{c(a)} \to E_{c(b)}$ denotes the map which takes $u \in E_{c(a)}$ to $\sigma_{c,u}(b)$, then $P_c$ is linear.
Proof. Without loss of generality we may take $a = 0$. Let $\partial = \partial/\partial t$ denote the standard coordinate vector field on the interval $[0, b]$ and let $\tilde\partial$ denote its horizontal lift in the pull-back bundle $c^*E$ with respect to the pull-back connection $c^*\mathcal{H}$. Let $\tilde c_u$ denote the maximal integral curve of $\tilde\partial$ in $c^*E$ with $\tilde c_u(0) = (0, u) \in c^*E$. We have
$$\frac{d}{dt}(\mathrm{pr}_1 \circ \tilde c_u) = T\mathrm{pr}_1 \circ \dot{\tilde c}_u = T\mathrm{pr}_1 \circ \tilde\partial \circ \tilde c_u = \partial \circ \mathrm{pr}_1 \circ \tilde c_u.$$
Thus $\mathrm{pr}_1 \circ \tilde c_u$ is an integral curve of $\partial = \partial/\partial t$ and so $\mathrm{pr}_1 \circ \tilde c_u(t) = t$. From this we see that $\tilde c_u(t) = (t, \mathrm{pr}_2 \circ \tilde c_u(t))$. By Exercise 12.19 we know that $\tilde c_u$ is defined on $[0, b]$ and that $\tilde c_u(b) \in (c^*E)_b$. Let $\sigma_{c,u} := \mathrm{pr}_2 \circ \tilde c_u$ on $[0, b]$. Then $\sigma_{c,u}$ is a section of $E \to M$ along $c$ which is parallel since $\tilde c_u$ is horizontal (see Exercise 12.18). We define $P_c u := \sigma_{c,u}(b)$ for $u \in E_{c(0)}$. The uniqueness follows from the uniqueness of integral curves and we leave this to the reader.

For any $r \in \mathbb{R}$, the field $r\sigma_{c,u}$ is parallel. Indeed, $(r\sigma_{c,u})^{\cdot} = T\mu_r \circ \dot\sigma_{c,u}$ is horizontal since $T\mu_r$ preserves $\mathcal{H}$. But then $P_c(ru) = rP_c(u)$, so $P_c$ is homogeneous. We aim to show that $P_c = J_0^{-1} \circ T_0P_c \circ J_0$, which is a composition of linear maps. Let $v_0 \in T_0E_{c(0)}$ so that $v_0 = \dot\gamma(0)$, where $\gamma$ is defined by $\gamma(t) = tv$ for an appropriate $v \in E_{c(0)}$. Then $J_0^{-1}v_0 = v$. We have
$$T_0P_c v_0 = \frac{d}{dt}(P_c \circ \gamma)(0).$$
But also $P_c \circ \gamma(t) = P_c(tv) = tP_c(v)$, so that $T_0P_c v_0 = J_0(P_c(v)) = J_0 \circ P_c \circ J_0^{-1} v_0$. Thus $J_0 \circ P_c \circ J_0^{-1} = T_0P_c$, or $P_c = J_0^{-1} \circ T_0P_c \circ J_0$, and we conclude that $P_c$ is linear. $\square$

Since $P_c$ has inverse $P_{c^\leftarrow}$, where $c^\leftarrow(t) := c(b - t)$, we see that it is a linear isomorphism.
Definition 12.21. Let $c : [a, b] \to M$ be as in the theorem above. The map $P_c$ is called parallel translation along $c$ from $c(a)$ to $c(b)$. Let $c$ be any smooth curve in $M$. For $t_1$ and $t_2$ in the domain of $c$, let $P(c)^{t_2}_{t_1} : E_{c(t_1)} \to E_{c(t_2)}$ be defined as $P(c)^{t_2}_{t_1} := P_{c|[t_1,t_2]}$ if $t_2 \ge t_1$ and $P(c)^{t_2}_{t_1} := P^{-1}_{c|[t_2,t_1]}$ if $t_1 \ge t_2$. We also say that the curve $\sigma_{c,u}$ of Theorem 12.20 is a parallel lift or horizontal lift of the curve $c$. The map $P_c$ is also sometimes called parallel transport.

Suppose that $c : [a, b] \to M$ is a (continuous) piecewise smooth curve (Definition 2.121). Thus we may find a monotonic sequence $t_0, t_1, \ldots, t_j = t$ such that $c_i := c|_{[t_{i-1},t_i]}$ (or $c|_{[t_i,t_{i-1}]}$) is smooth. (It may be that $t < t_0$.) In this case we define
$$P(c)^{t}_{t_0} := P(c)^{t}_{t_{j-1}} \circ \cdots \circ P(c)^{t_1}_{t_0}.$$

Exercise 12.22. The map $P(c)^{t}_{t_0} : E_{c(t_0)} \to E_{c(t)}$ is a linear isomorphism for all $t$ with inverse $P(c)^{t_0}_{t}$.

Parallel transport behaves nicely with respect to reparametrization. This is the content of the next theorem, which we ask the reader to prove in Problem 1.

[Figure: Parallel transport along a curve from $c(a)$ to $c(b)$]

Theorem 12.23. Let $c : [a, b] \to M$ be a smooth curve. If $r : [a', b'] \to [a, b]$ is a smooth map with $dr/dt > 0$, then for $\gamma := c \circ r$ we have $P_c = P_\gamma$.
It is often convenient to use special vector bundle charts constructed using parallel translation. Let $(U, x)$ be a chart on $M$ centered at $p \in M$ and such that $x(U)$ is a ball $B_R(0)$ of radius $R$. Consider the family of curves $c_u : [0, 1) \to M$ given for each $u \in S^{n-1}(R) = \partial B_R(0)$ by
$$c_u(t) := x^{-1}(tu).$$
Define a frame field $(e_1, \ldots, e_k)$ for $E$ over $U$ as follows: Let $(e_1(0), \ldots, e_k(0))$ be an ordered basis of $E_p$. If $q \in U$ and $q \ne p$, then let $e_i(q)$ be defined as
$$e_i(q) := P(c_u)^{t}_{0}(e_i(0)),$$
where $(u, t)$ is the unique element of $S^{n-1}(R) \times (0, 1)$ such that $c_u(t) = q$; at $q = p$ we take $e_i(p) := e_i(0)$. We say that $(e_1, \ldots, e_k)$ is radially parallel with respect to the spherical chart $(U, x)$. Notice that by composing with a dilation of $\mathbb{R}^n$ if necessary, we may choose $R$ to be any positive number. Also, if we use a radially parallel frame field to define a vector bundle chart $\phi = (\pi, \Phi)$, then the principal part $\Phi \circ \sigma$ of any section $\sigma$ that is parallel along a curve $c_u$ will be constant along $c_u$.

We still have more to learn about the geometric meaning of a connection. We will eventually be led to the notion of curvature. At this point let us just consider what it means when a connection is integrable as a distribution.

Definition 12.24. A connection on a vector bundle is called flat if it is integrable as a distribution.

Let us agree to call a connection on a vector bundle $E$ a trivial connection if given any $u \in E$ there is a parallel section $s$ such that $s(\pi(u)) = u$. Finding such sections is not always possible (even locally). In fact, we have the following characterization.

Theorem 12.25. Let $\pi : E \to M$ be a vector bundle with connection $\mathcal{H}$. The following assertions are equivalent:
(i) For any simply connected open set $U \subset M$, the restriction of $\mathcal{H}$ to $\pi^{-1}(U)$ is a trivial connection on the restricted vector bundle $\pi^{-1}(U) \to U$.
(ii) $\mathcal{H}$ is flat.

Proof. If (i) holds, then given any $u \in E$ there is a parallel section $s$, defined on a simply connected open neighborhood $U$ of $\pi(u)$, with $s(\pi(u)) = u$. Then it is easy to see that $s(U)$ is an integral manifold of the distribution $\mathcal{H}$. Since we can find such an integral manifold through any $u$, we see that $\mathcal{H}$ is integrable (i.e. flat). If $\mathcal{H}$ is flat, then certainly $\mathcal{H}|_U$ is flat for any open $U$. Suppose that $U$ is simply connected. By the Frobenius theorem there is a maximal integral manifold $L_u$ of $\mathcal{H}|_U$ through any $u \in \pi^{-1}(U)$. By Theorem 12.20, we see that any smooth path in $U$ can be lifted uniquely to $L_u$. This is enough to
imply that $\pi|_{L_u} : L_u \to U$ is a covering space (see [Span], 2.4.10). Since $U$ is simply connected, the lifting theorem for covering spaces (Theorem 1.95) implies that $\pi|_{L_u}$ has an inverse. The desired parallel section is then
$$s := (\pi|_{L_u})^{-1}. \qquad \square$$
We now come to the task of relating covariant derivatives with Ehresmann connections on vector bundles. Denote the vector bundle isomorphism from $VE$ to $E$ along $\pi$ by $\mathrm{p}$:
$$\mathrm{p} : VE \to E, \qquad \mathrm{p} : w_y \mapsto w.$$
For each $y$, the map $\mathrm{p}$ just gives the canonical identification of $T_yE_p$ with $E_p$, and on each fiber, it is the inverse of $J$. If we have a connection on $\pi : E \to M$, then we have an associated connector (or connection map), which is the map $\kappa : TE \to E$ defined by
$$\kappa(v) := \mathrm{p}(p_v v) = J_y^{-1}(p_v v)$$
for $v \in T_yE$. The connector is a vector bundle homomorphism along the map $\pi : E \to M$:
$$\begin{array}{ccc} TE & \xrightarrow{\;\kappa\;} & E \\ \downarrow & & \downarrow \\ E & \xrightarrow{\;\pi\;} & M \end{array}$$
An interesting fact is that given the appropriate definition of vector space structure on the fibers, $TE$ is also a vector bundle over $TM$ via the map $T\pi$ (Problem 11 from Chapter 6). Recall that the addition and scalar multiplication on a fiber $(T\pi)^{-1}(x)$ of this bundle are defined by
$$u \boxplus v := T\alpha \cdot (u, v) \quad \text{for } u, v \in TE \text{ with } T\pi \cdot u = T\pi \cdot v = x,$$
$$c \odot v := T\mu_c \cdot v \quad \text{for } v \in TE \text{ and } c \in \mathbb{F},$$
where $\alpha(y_1, y_2) := y_1 + y_2$ for $(y_1, y_2) \in E \oplus E$ and $\mu_c y := cy$ for $y \in E$ and $c \in \mathbb{F}$.
Lemma 12.26. Suppose that $f : \mathbb{R}^K \to \mathbb{R}^k$ is a smooth map such that $f(av) = af(v)$ for all $v \in \mathbb{R}^K$ and $a \in \mathbb{R}$. Then $f$ is linear. Similarly, if $f : \mathbb{C}^K \to \mathbb{C}^k$ is a smooth map such that $f(av) = af(v)$ for all $v \in \mathbb{C}^K$ and $a \in \mathbb{C}$, then $f$ is linear.

Proof. Let $f : \mathbb{R}^K \to \mathbb{R}^k$ be as in the statement. Then
$$Df(0)v = \frac{d}{dt}\Big|_{t=0} f(tv) = \frac{d}{dt}\Big|_{t=0} tf(v) = f(v).$$
Thus $f = Df(0)$ and so $f$ is linear. If $f : \mathbb{C}^K \to \mathbb{C}^k$ is a smooth map such that $f(av) = af(v)$ for all $v \in \mathbb{C}^K$ and $a \in \mathbb{C}$, then the first part shows that $f : \mathbb{C}^K \to \mathbb{C}^k$ is linear over $\mathbb{R}$. But by hypothesis $f(iv) = if(v)$, so $f$ is actually complex linear. $\square$
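The smoothness hypothesis in Lemma 12.26 cannot simply be dropped. As an illustrative sketch (the particular function below is our own choice, not taken from the text), the following map on $\mathbb{R}^2$ is homogeneous of degree one and continuous, yet it is not linear; the proof above breaks down because the map is not differentiable at the origin:

```latex
\[
  f(x,y) =
  \begin{cases}
    \dfrac{x^{3}}{x^{2}+y^{2}}, & (x,y) \neq (0,0),\\[1ex]
    0, & (x,y) = (0,0),
  \end{cases}
  \qquad
  f(ax,ay) = \frac{a^{3}x^{3}}{a^{2}(x^{2}+y^{2})} = a\,f(x,y) \quad (a \neq 0),
\]
\[
  f(1,0) + f(0,1) = 1 + 0 = 1 \;\neq\; \tfrac{1}{2} = f(1,1).
\]
```

If $f$ were differentiable at the origin, the argument of the lemma would force $f = Df(0)$, contradicting the failure of additivity shown in the second line.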
Corollary 12.27. Let $\pi_1 : E_1 \to M_1$ and $\pi_2 : E_2 \to M_2$ be $\mathbb{F}$-vector bundles. Let $\tilde f : E_1 \to E_2$ be a fiber bundle morphism over $f : M_1 \to M_2$. If $\tilde f$ is homogeneous on each fiber, so that $\tilde f(av) = a\tilde f(v)$ for all $v \in E_1$ and $a \in \mathbb{F}$, then $\tilde f$ is linear on fibers, and so it is a vector bundle morphism.

Proof. Use vector bundle charts and Lemma 12.26. $\square$
Lemma 12.28. Let $\mu_r : E \to E$ be multiplication by $r$. Then for any $p \in M$ and $y, w \in E_p$ we have
$$T\mu_r(J_y w) = J_{ry}(rw) = rJ_{ry}w.$$

Proof. We have
$$T\mu_r(J_y w) = \frac{d}{dt}\Big|_{t=0}\mu_r(y + tw) = \frac{d}{dt}\Big|_{t=0}(ry + trw) = J_{ry}(rw) = rJ_{ry}w. \qquad \square$$
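The connector $\kappa$ gives a vector bundle homomorphism along $\pi_{TM} : TM \to M$. More precisely, we have

Theorem 12.29. If $\kappa$ is the connector for a connection on a vector bundle $\pi : E \to M$, then $\kappa$ gives a vector bundle homomorphism from the bundle $T\pi : TE \to TM$ to the bundle $\pi : E \to M$ along the map $\pi_{TM} : TM \to M$:
$$\begin{array}{ccc} TE & \xrightarrow{\;\kappa\;} & E \\ {\scriptstyle T\pi}\downarrow & & \downarrow{\scriptstyle \pi} \\ TM & \xrightarrow{\;\pi_{TM}\;} & M \end{array}$$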
Proof. It is easy to check that the above diagram commutes. Thus we must show that $\kappa$ is linear on fibers. Let $Y_y \in (T\pi)^{-1}(X_p)$, where $\pi(y) = p$. We may write $Y_y = H_y + V_y$, where $H_y$ and $V_y$ are horizontal and vertical respectively. Since $X_p = T\pi(Y_y) = T\pi(H_y)$, we see that $H_y$ is the horizontal lift $\tilde X_y$ of $X_p$ to $T_yE$. Also, $V_y = J_y w$ for a unique $w \in E_p$. So the decomposition may be written as $Y_y = \tilde X_y + J_y w$. By definition we have $\kappa(Y_y) = w$. Using the homogeneity property of the horizontal distribution together with Lemma 12.28 above we have
$$T\mu_r(Y_y) = T\mu_r(\tilde X_y) + T\mu_r(J_y w) = \tilde X_{ry} + J_{ry}rw.$$
Thus $\kappa(T\mu_r(Y_y)) = rw$. Therefore, we have $\kappa(T\mu_r(Y_y)) = r\kappa(Y_y)$. If we denote the scaling operation for the vector bundle structure on $TE \to TM$ using $\odot$ as before, then this last statement is just $\kappa(r \odot Y_y) = r\kappa(Y_y)$. The result now follows from Corollary 12.27. $\square$
Once we have a connection on our bundle, the addition in the vector bundle $TE \to TM$ can be described in a convenient form. Any element of $TE$ that lies over the element $X_p \in TM$ can be written in the form $\tilde X_y + J_y w$ for some $w \in E_p$, where $\tilde X_y$ is the horizontal lift of $X_p$ to the point $y$. Let $\tilde X_{y_1} + J_{y_1}w_1$ and $\tilde X_{y_2} + J_{y_2}w_2$ be two such expressions. Then the sum of these two elements using the addition in the vector bundle $TE \to TM$ is given by $\tilde X_{y_1+y_2} + J_{y_1+y_2}(w_1 + w_2)$, where $\tilde X_{y_1+y_2}$ is the horizontal lift of $X_p$ to the point $y_1 + y_2$.

Exercise 12.30. Prove the last statement.

Exercise 12.31. Deduce from the previous theorem that $(\pi_{TE}, \kappa) : TE \to E \oplus E$ is a vector bundle isomorphism along the tangent bundle projection $\pi_{TM} : TM \to M$.

Using this notion of connection with associated connector $\kappa$ we can get a covariant derivative. If $\mathcal{H}$ is a connection on a vector bundle $E \to M$ with connector $\kappa$, then for a section $\sigma \in \Gamma_f(E)$ along a smooth map $f : N \to M$ we make the definition
$$\nabla^f_v\sigma := \kappa(T_p\sigma \cdot v) \quad \text{for } v \in T_pN.$$
If $V$ is a vector field on $N$, then $(\nabla^f_V\sigma)(p) := \nabla^f_{V(p)}\sigma$.
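To see what this definition produces in the simplest case, here is a worked sketch for the trivial bundle $\mathrm{pr}_1 : M \times V \to M$ with the connection $\mathcal{H}_{(p,v)} = Ti_v(T_pM)$ used in the proof of Theorem 12.14, taking $f = \mathrm{id}_M$. The identification $T_{(p,v)}(M \times V) \cong T_pM \times V$ and the name $s$ for the principal part of the section are notational assumptions made only for this illustration:

```latex
\begin{align*}
  \sigma(p) &= (p, s(p)), \qquad s : M \to V \text{ smooth},\\
  T_p\sigma \cdot u &= \bigl(u,\; Ds(p)\,u\bigr)
     = \underbrace{(u,\,0)}_{\text{horizontal}}
     + \underbrace{\bigl(0,\; Ds(p)\,u\bigr)}_{\text{vertical}},
     \qquad u \in T_pM,\\
  \nabla_u \sigma &= \kappa(T_p\sigma \cdot u)
     = J_{\sigma(p)}^{-1}\bigl(0,\; Ds(p)\,u\bigr) = Ds(p)\,u .
\end{align*}
```

Under these assumptions the covariant derivative is just the ordinary directional derivative of the principal part, and the parallel sections are exactly the constant ones.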
Theorem 12.32. Let $\pi : E \to M$ be a vector bundle. Suppose that the vector bundle is endowed with a connection $\mathcal{H}$ and associated connector $\kappa$. Then for each smooth map $f : N \to M$, the map $\nabla^f$ defined above is a covariant derivative along $f$ (Definition 12.7). If $f = \mathrm{id}_M$, then we obtain a Koszul connection. Conversely, if $\nabla$ is a Koszul connection on $\pi : E \to M$, then we may define a connection by
$$\mathcal{H}_y := \{Ts \cdot u - J_y\nabla_u s : s \in \Gamma(E),\ s(\pi(y)) = y,\ u \in T_{\pi(y)}M\}.$$
The resulting connection, in turn, gives back the Koszul connection according to $\nabla_v s := \kappa(T_ps \cdot v)$ for $v \in T_pM$.
Proof. We write $\nabla$ for $\nabla^f$ when no confusion is likely. We start by simply noting that (i) and (iv) of Definition 12.7 are easily seen to be true and leave these as an exercise. Notice that if $g : P \to N$ and $f : N \to M$ are smooth and $u \in TP$, then for each $\sigma \in \Gamma_f(E)$, we have $\nabla_u(\sigma \circ g) = \nabla_{Tg \cdot u}\sigma$. Indeed,
$$\nabla_u(\sigma \circ g) = (\kappa \circ T(\sigma \circ g))u = \kappa(T\sigma(Tg \cdot u)) = \nabla_{Tg \cdot u}\sigma.$$
This gives (v) of Definition 12.7. Next we claim that if $\sigma_1, \sigma_2 \in \Gamma_f(E)$ and $u \in T_pN$, then $\nabla_u(\sigma_1 + \sigma_2) = \nabla_u\sigma_1 + \nabla_u\sigma_2$, which implies (ii) of Definition 12.7.
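Proof of the claim: Recall the definition of addition in the vector bundle $T\pi : TE \to TM$ in terms of the tangent lift of $\alpha : (u, v) \mapsto u + v$. If $u = \dot\gamma(0)$ for a smooth curve $\gamma$ in $N$ with $\gamma(0) = p$, then
$$T\sigma_1 \cdot u \boxplus T\sigma_2 \cdot u := T\alpha(T\sigma_1 \cdot u, T\sigma_2 \cdot u) = \frac{d}{dt}\Big|_0(\sigma_1 \circ \gamma + \sigma_2 \circ \gamma) = \frac{d}{dt}\Big|_0(\sigma_1 + \sigma_2) \circ \gamma = T(\sigma_1 + \sigma_2) \cdot u.$$
Now we use the above together with the fact that $\kappa$ is a bundle homomorphism along $\pi_{TM}$. We have
$$\nabla_u(\sigma_1 + \sigma_2) = \kappa(T(\sigma_1 + \sigma_2)u) = \kappa(T\sigma_1 u \boxplus T\sigma_2 u) = \nabla_u\sigma_1 + \nabla_u\sigma_2.$$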
Claim: If $h \in C^\infty(N, \mathbb{F})$ and $\sigma \in \Gamma_f(E)$, then $\nabla_u h\sigma = u(h)\sigma(p) + h(p)\nabla_u\sigma$, which is (iii) of Definition 12.7.

Proof: Let $u \in T_pN$ and let $\sigma : N \to E$ be a section along a smooth map $f : N \to M$. We begin by calculating $T\mu : T\mathbb{R} \times TE \to TE$, where $\mu : \mathbb{R} \times E \to E$ is scalar multiplication in the vector bundle $E \to M$. So let $(a, y) \in \mathbb{R} \times E$ and $(b\,\frac{d}{dt}|_a, v_y) \in T_a\mathbb{R} \times T_yE$. To find $T\mu \cdot (b\,\frac{d}{dt}|_a, v_y)$ we calculate $T\mu \cdot (0_a, v_y)$ and $T\mu \cdot (b\,\frac{d}{dt}|_a, 0_y)$ separately. Let $c$ be a smooth curve with $c(0) = y$ and $\dot c(0) = v_y$. Then
$$T\mu \cdot (0_a, v_y) = \frac{d}{dt}\Big|_0\mu(a, c(t)) = \frac{d}{dt}\Big|_0\mu_a(c(t)) = T\mu_a \cdot v_y =: a \odot v_y,$$
where $\odot$ refers to multiplication in the vector bundle structure of $TE \to TM$. Next, let $c$ be the curve in the manifold $\mathbb{R}$ given by $c(t) := a + tb$, so that $\dot c(0) = b\,\frac{d}{dt}|_a$. Then
$$T\mu \cdot \Big(b\,\frac{d}{dt}\Big|_a, 0_y\Big) = \frac{d}{dt}\Big|_0\mu(c(t), y) = \frac{d}{dt}\Big|_0((a + bt)y) = \frac{d}{dt}\Big|_0(ay + tby) = J_{ay}(by).$$
We now have
$$T\mu \cdot \Big(b\,\frac{d}{dt}\Big|_a, v_y\Big) = a \odot v_y + J_{ay}(by).$$
Next we let $c$ be a curve in $N$ with $c(0) = p$ and $\dot c(0) = u \in T_pN$. Then
$$T_ph \cdot u = \frac{d}{dt}\Big|_0 h(c(t)) = (h \circ c)'(0)\,\frac{d}{dt}\Big|_{h(c(0))} = u[h]\,\frac{d}{dt}\Big|_{h(p)}.$$
Now let $h \times \sigma : N \to \mathbb{R} \times E$ denote the map defined by $(h \times \sigma)(x) = (h(x), \sigma(x))$ and let $\tilde\sigma := h\sigma$. Then using the linearity of $\kappa$ and what we developed above we have
$$\nabla_u h\sigma = \kappa(T\tilde\sigma \cdot u) = \kappa[T\mu \circ T(h \times \sigma)(u)] = \kappa\,T\mu\Big(u[h]\,\frac{d}{dt}\Big|_{h(p)}, T_p\sigma \cdot u\Big)$$
$$= \kappa\{h(p) \odot (T\sigma \cdot u) + J_{h(p)\sigma(p)}(u[h]\sigma(p))\} = h(p)\kappa(T\sigma \cdot u) + u[h]\sigma(p) = h(p)\nabla_u\sigma + u[h]\sigma(p).$$
The proof that
$$\mathcal{H}_y := \{Ts \cdot u - J_y\nabla_u s \mid s \in \Gamma(E),\ s(\pi(y)) = y,\ u \in T_{\pi(y)}M\}$$
defines a connection that gives back the original covariant derivative is left to the reader as Problem 9. $\square$

Example 12.33. In Chapter 4 we saw that each hypersurface in $\mathbb{R}^{n+1}$ has a natural covariant derivative (the Levi-Civita covariant derivative). This means that there is a corresponding horizontal distribution. Let us consider the canonical connection on $S^n \subset \mathbb{R}^{n+1}$. We may identify $TS^n$ with the subset of $S^n \times \mathbb{R}^{n+1}$ given by $\{(p, u) \in S^n \times \mathbb{R}^{n+1} : \langle p, u\rangle = 0\}$. Under this identification, the velocity of a curve $c$ has the form $\dot c = (c, c')$. Now let us find a representation of $T(TS^n)$ as a subset of $TS^n \times \mathbb{R}^{2n+2}$. A section of $TS^n$ along a curve $c$ has the form $\sigma : t \mapsto (c(t), x(t))$ and so we take $\dot\sigma = (c, x, c', x') \in TS^n \times \mathbb{R}^{2n+2}$. Note that since $\langle c, x\rangle = 0$, we have $\langle c, x'\rangle = -\langle c', x\rangle$. Also, since $\langle c, c\rangle = 1$, we have $\langle c, c'\rangle = 0$. It follows that we may make the identification
$$T(TS^n) = \{((p, u), (v, w)) \in TS^n \times \mathbb{R}^{2n+2} : \langle p, v\rangle = 0,\ \langle p, w\rangle + \langle u, v\rangle = 0\}.$$
The vertical space $V_{(p,u)}$ in $T_{(p,u)}(TS^n)$ is
$$V_{(p,u)} = \{((p, u), (0, w)) \in T_{(p,u)}(TS^n)\}.$$
Recall that in Chapter 4 we obtained the Levi-Civita covariant derivative by projection of the ambient derivative back into the tangent space. In the present context, this means that $((p, u), (v, w))$ is horizontal if $w$ is a multiple of $p$. But since we also have $\langle p, w\rangle + \langle u, v\rangle = 0$, the criterion for being horizontal becomes $w = -\langle u, v\rangle p$. A little thought reveals that we should take $\mathcal{H}$ to be defined by
$$\mathcal{H}_{(p,u)} := \{((p, u), (v, -\langle u, v\rangle p)) \in T_{(p,u)}TS^n\}.$$
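The horizontality criterion of Example 12.33 says that a section $t \mapsto (c(t), x(t))$ along a curve $c$ in $S^n$ is parallel exactly when $x' = -\langle x, c'\rangle c$. The following is a minimal numerical sketch of that statement for $S^2$, integrating the equation with a classical Runge-Kutta step along great circle arcs. The function names, the step count, and the choice of loop (the boundary of one octant) are our own illustrative assumptions, not part of the text.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def axpy(x, h, k):
    """Return x + h*k componentwise."""
    return tuple(xi + h * ki for xi, ki in zip(x, k))

def transport_great_circle(x, p, q, t_end, steps=2000):
    """Integrate x' = -<x, c'> c along c(t) = cos(t) p + sin(t) q, 0 <= t <= t_end.

    p and q are assumed to be orthonormal vectors in R^3 (hypothetical inputs),
    and x is assumed tangent to the sphere at p.
    """
    def rhs(t, x):
        c = tuple(math.cos(t) * pj + math.sin(t) * qj for pj, qj in zip(p, q))
        dc = tuple(-math.sin(t) * pj + math.cos(t) * qj for pj, qj in zip(p, q))
        lam = -dot(x, dc)          # horizontality: x' is the multiple -<x, c'> of c
        return tuple(lam * cj for cj in c)

    h = t_end / steps
    for i in range(steps):
        t = i * h
        k1 = rhs(t, x)
        k2 = rhs(t + h / 2, axpy(x, h / 2, k1))
        k3 = rhs(t + h / 2, axpy(x, h / 2, k2))
        k4 = rhs(t + h, axpy(x, h, k3))
        x = tuple(xi + h / 6 * (a + 2 * b + 2 * c + d)
                  for xi, a, b, c, d in zip(x, k1, k2, k3, k4))
    return x

# Loop: north pole N -> A -> B -> N along three quarter great circles.
N, A, B = (0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
quarter = math.pi / 2
x0 = (1.0, 0.0, 0.0)                       # tangent vector at N
x1 = transport_great_circle(x0, N, A, quarter)
x2 = transport_great_circle(x1, A, B, quarter)
x3 = transport_great_circle(x2, B, N, quarter)
print(x3)
```

Along each arc the transported vector stays either tangent or perpendicular to the arc, so the result can be checked by hand: the vector returns to the north pole rotated by a quarter turn, to $(0, 1, 0)$ up to integration error. Parallel translation around this closed loop is therefore not the identity.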
Exercise 12.34. Let $p, q \in S^2 \subset \mathbb{R}^3$ be such that $\langle p, q\rangle = 0$. The great circle containing $p$ and $q$ can be parametrized as
$$c(t) := (\cos t)\,p + (\sin t)\,q$$
for $0 \le t \le 2\pi$. Using the notation of the previous example, show that if $\sigma = (c, x)$ is a parallel section along $c$, then both $\langle x, x\rangle$ and $\langle c', x\rangle$ are constant along $c$.

Figure 12.3. Parallel transport around path shows holonomy

Figure 12.3 shows parallel translation of a vector at the north pole around a loop comprised of segments of great circles. Notice that parallel translation around this loop results in a map which is not the identity map. In general, if $\pi : E \to M$ is a vector bundle with connection and $c : [a, b] \to M$ is a piecewise smooth closed curve with $p = c(a) = c(b)$, then we obtain an isomorphism $P_c \in GL(E_p)$. This map is called the holonomy of the curve.

Definition 12.35. The subset $G_p \subset GL(E_p)$ consisting of the maps $P_c$ as $c$ ranges over all piecewise smooth closed curves $c$ with $p = c(a) = c(b)$ is called the holonomy group at $p$ for the connection.

It is easy to see that any piecewise smooth curve $c : [a, b] \to M$ induces a group isomorphism between $G_{c(a)}$ and $G_{c(b)}$ given by
$$g \mapsto P_c \circ g \circ P_c^{-1}.$$
Thus for a connected manifold we may speak of the holonomy group of the manifold, and this is well-defined up to group isomorphism.

One may use parallel translation to recover the covariant derivative, and this is a very natural viewpoint:

Theorem 12.36. Let $\pi : E \to M$ be a vector bundle with connection $\mathcal{H}$. Let $f : N \to M$ be a smooth map. If $u \in T_pN$ and $\sigma$ is a section of $E$ along $f$, then for any smooth curve $c : (-\epsilon, \epsilon) \to N$ with $\dot c(0) = u$ we have
$$\nabla^f_u\sigma = \lim_{t \to 0}\frac{(P(f \circ c)^t_0)^{-1}\sigma(c(t)) - \sigma(c(0))}{t},$$
where $P(f \circ c)^t_0$ is the parallel transport along $f \circ c$ from $(f \circ c)(0)$ to $(f \circ c)(t)$.
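Proof. Use parallel transport to obtain parallel frame fields $e_1, \ldots, e_k$ along $f \circ c$ with $e_1(0), \ldots, e_k(0)$ a basis of $E_{(f \circ c)(0)}$. Then we may write
$$\sigma \circ c(t) = \sum \sigma^i(t)e_i(t)$$
for unique smooth functions $\sigma^i$ defined on $(-\epsilon, \epsilon)$. Then
$$(P(f \circ c)^t_0)^{-1}\sigma \circ c(t) = (P(f \circ c)^t_0)^{-1}\sum \sigma^i(t)e_i(t) = \sum \sigma^i(t)e_i(0).$$
Let $\partial_0$ denote $\frac{\partial}{\partial t}\big|_0$. Then using the chain rule for connections and the fact that $\nabla_{\partial_t}e_i(t) = 0$ we have
$$\nabla^f_u\sigma = \nabla^f_{\dot c(0)}\sigma = \nabla^f_{Tc \cdot \partial_0}\sigma = \nabla^{f \circ c}_{\partial_0}(\sigma \circ c) = \nabla^{f \circ c}_{\partial_0}\sum \sigma^i(t)e_i(t)$$
$$= \sum \frac{d\sigma^i}{dt}\Big|_0 e_i(0) = \frac{d}{dt}\Big|_0\big[(P(f \circ c)^t_0)^{-1}\sigma \circ c(t)\big] = \lim_{t \to 0}\frac{(P(f \circ c)^t_0)^{-1}\sigma(c(t)) - \sigma(c(0))}{t}. \qquad \square$$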
The most important special cases are the case of f = idM, where we recover a basic Koszul connection, and also the case where f itself is a curve c : I -+ M. However, recall that we often omit the superscripts indicating maps on the various covariant derivative operators.
Corollary 12.37. Let $\pi : E \to M$ be a vector bundle with connection $\mathcal{H}$. If $v \in T_pM$ and $s$ is a section of $E$, then for any smooth curve $c : (-\epsilon, \epsilon) \to M$ with $\dot c(0) = v$ we have
$$\nabla_v s = \lim_{t \to 0}\frac{(P(c)^t_0)^{-1}s(c(t)) - s(c(0))}{t},$$
where $P(c)^t_0$ is the parallel transport along $c$ from $c(0)$ to $c(t)$.

Now recall that if $c : I \to M$ is a smooth curve and $\sigma : I \to E$ is a section along $c$, we define $\nabla_{\partial/\partial t}\sigma$ to be a section along $c$ given by
$$(\nabla_{\partial/\partial t}\sigma)(t) := \nabla_{\frac{\partial}{\partial t}|_t}\sigma.$$
We have

Corollary 12.38. Let $\pi : E \to M$ be a vector bundle with connection $\mathcal{H}$. Let $c : I \to M$ be a smooth curve and suppose that $\sigma : I \to E$ is a section along $c$. Then we have
$$(\nabla_{\partial/\partial t}\sigma)(t) = \lim_{\epsilon \to 0}\frac{(P(c)^{t+\epsilon}_t)^{-1}\sigma(t + \epsilon) - \sigma(t)}{\epsilon}.$$
Exercise 12.39. Derive the above two corollaries from Theorem 12.36.

Corollary 12.40. Let $\pi : E \to M$ be a vector bundle with connection $\mathcal{H}$. Let $f : N \to M$ be a smooth map. If $\sigma$ is a section of $E$ along $f$, then $\sigma$ is parallel if and only if $\nabla^f_u\sigma = 0$ for all $u \in TN$. In particular, a section $s \in \Gamma(E)$ is parallel along a curve $c$ if and only if $\nabla_{\partial/\partial t}\,s \circ c \equiv 0$.
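It may help to see what the parallelism condition of Corollary 12.40 looks like in a local frame. The following is a sketch under stated assumptions: $e_1, \ldots, e_k$ is a local frame field for $E$ defined along the image of the curve $c$, and $\omega^i_j$ denote connection forms, normalized here so that $\nabla_X e_j = \sum_i \omega^i_j(X)e_i$ (this index convention is an assumption made for the illustration).

```latex
\begin{align*}
  \sigma(t) &= \sum_i \sigma^i(t)\, e_i(c(t)),\\
  \nabla_{\partial/\partial t}\,\sigma
     &= \sum_i \Bigl( \frac{d\sigma^i}{dt}
        + \sum_j \omega^i_j\bigl(\dot c(t)\bigr)\,\sigma^j(t) \Bigr)\, e_i(c(t)),\\
  \sigma \text{ parallel along } c
     &\iff \frac{d\sigma^i}{dt} + \sum_j \omega^i_j\bigl(\dot c(t)\bigr)\,\sigma^j(t) = 0,
        \qquad i = 1, \ldots, k .
\end{align*}
```

The last line is a linear system of ordinary differential equations with smooth coefficients. Its solutions exist on the whole parameter interval and depend linearly on the initial value, which is consistent with the existence, uniqueness and linearity statements of Theorem 12.20.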
If one approaches parallelism solely via covariant derivatives, then one could take $\nabla^f_u\sigma \equiv 0$ as the definition of parallel. This is a common equivalent approach.

Warning: There is a subtle point to be made here. If $s$ is a section of $E$ and $c : (-\epsilon, \epsilon) \to M$ is a curve, then $\nabla_{\dot c(t)}s$ is zero whenever $\dot c(t) = 0$, but for a section $\sigma : I \to E$ along $c$ it is possible that $(\nabla_{\partial/\partial t}\sigma)(t)$ is nonzero even when $\dot c(t) = 0$. For example, if $c_p : I \to M$ is a constant curve taking the fixed value $p \in M$, and $\sigma : I \to T_pM$ is a smooth map, then $\sigma$ can be considered a section along $c_p$ and then $\nabla_{\partial/\partial t}\sigma = \sigma'$ (the ordinary derivative of a curve in the vector space $T_pM$).

Exercise 12.41. Show that $\sigma \in \Gamma(M, E)$ is a parallel section if and only if $\sigma \circ c$ is parallel along $c$ for every curve $c : I \to M$.

Exercise 12.42. Show that if $t \mapsto \sigma(t)$ is a curve in $E_p$, then we can consider $\sigma$ as a section along the constant map $c_p : t \mapsto p$ and then $\nabla_{\partial_t}\sigma(t) = \sigma'(t) \in E_p$.

Exercise 12.43. Let $\nabla$ be a connection on $E \to M$ and let $\alpha : [a, b] \to M$ and $\beta : [a, b] \to E_{\alpha(t_0)}$. If $X(t) := P(\alpha)^t_{t_0}(\beta(t))$, then
$$(\nabla_{\partial_t}X)(t) = P(\alpha)^t_{t_0}(\beta'(t)).$$
Note: $\beta'(t) \in E_{\alpha(t_0)}$ for all $t$. [Hint: Use a parallel frame field along the curve.]
12.5. Curvature

An important fact about covariant derivatives is that they do not need to commute. If $\sigma : M \to E$ is a section and $X \in \mathfrak{X}(M)$, then $\nabla_X\sigma$ is a section also, and so we may take its covariant derivative $\nabla_Y\nabla_X\sigma$ with respect to some $Y \in \mathfrak{X}(M)$. In general, $\nabla_Y\nabla_X\sigma \ne \nabla_X\nabla_Y\sigma$. A measure of this
lack of commutativity is the curvature operator that is defined for a pair $X, Y \in \mathfrak{X}(M)$ to be the map $F(X, Y) : \Gamma(E) \to \Gamma(E)$ given by
$$F(X, Y)\sigma := \nabla_X\nabla_Y\sigma - \nabla_Y\nabla_X\sigma - \nabla_{[X,Y]}\sigma,$$
i.e. $F(X, Y) := [\nabla_X, \nabla_Y] - \nabla_{[X,Y]}$.

Theorem 12.44. For fixed $\sigma$, the map $(X, Y) \mapsto F(X, Y)\sigma$ is $C^\infty(M)$ bilinear and antisymmetric. Also, $F(X, Y) : \Gamma(E) \to \Gamma(E)$ is a $C^\infty(M)$ module homomorphism; that is, it is linear over the smooth functions,
$$F(X, Y)(f\sigma) = fF(X, Y)(\sigma).$$

Proof. We leave the proof of the first part as an easy exercise. For the second part, we calculate:
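$$F(X, Y)(f\sigma) = \nabla_X\nabla_Y f\sigma - \nabla_Y\nabla_X f\sigma - \nabla_{[X,Y]}f\sigma$$
$$= \nabla_X(f\nabla_Y\sigma + (Yf)\sigma) - \nabla_Y(f\nabla_X\sigma + (Xf)\sigma) - f\nabla_{[X,Y]}\sigma - ([X, Y]f)\sigma$$
$$= f\nabla_X\nabla_Y\sigma + (Xf)\nabla_Y\sigma + (Yf)\nabla_X\sigma + X(Yf)\sigma - f\nabla_Y\nabla_X\sigma - (Yf)\nabla_X\sigma - (Xf)\nabla_Y\sigma - Y(Xf)\sigma - f\nabla_{[X,Y]}\sigma - ([X, Y]f)\sigma$$
$$= f[\nabla_X, \nabla_Y]\sigma - f\nabla_{[X,Y]}\sigma = fF(X, Y)\sigma. \qquad \square$$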
Thus we also have $F$ as a map $F : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \Gamma(M, \mathrm{End}(E))$. But $F$ is $C^\infty(M)$ bilinear in the first two slots also.

Exercise 12.45. Show that $F$ is $C^\infty(M)$ bilinear in the first two slots. Let $v_p, w_p \in T_pM$ and $\sigma_p \in E_p$ be given. Let $X, Y \in \mathfrak{X}(M)$ and $\sigma \in \Gamma(E)$ be such that $X(p) = v_p$, $Y(p) = w_p$ and $\sigma(p) = \sigma_p$. Argue that $F(X_p, Y_p)\sigma_p$ is well-defined and that the map $(v_p, w_p, \sigma_p) \mapsto F(v_p, w_p)\sigma_p$ is multilinear.

In light of this exercise we see that for each $v_p, w_p \in T_pM$, $F(v_p, w_p)$ is a linear map $E_p \to E_p$. In other words, $F(X_p, Y_p) \in \mathrm{End}(E_p)$ (a.k.a. $L(E_p, E_p)$). But $F(X_p, Y_p)$ is antisymmetric, and so we obtain, for each $p$, an element of $\mathrm{End}(E_p) \otimes \wedge^2T^*_pM$. Finally, we see that $F$ may be considered as a section of $\mathrm{End}(E) \otimes \wedge^2T^*M$. The space of such sections is denoted $\Omega^2(M, \mathrm{End}(E))$.

Curvature is an $\mathrm{End}(E)$-valued 2-form.

We will have many occasions to consider differentiation along maps, and so we should also look at the curvature operator for sections along maps.
Figure 12.4. Parallel translation around loop (from $p$ out to $f(s, t)$ and back)
Let $f : N \to M$ be a smooth map and $\sigma$ a section of $E \to M$ along $f$. For $U, V \in \mathfrak{X}(N)$, we define a map $F^f(U, V) : \Gamma_f(E) \to \Gamma_f(E)$ by
$$F^f(U, V)\sigma := \nabla_U\nabla_V\sigma - \nabla_V\nabla_U\sigma - \nabla_{[U,V]}\sigma$$
for all $\sigma \in \Gamma_f(E)$.

Note that if $s \in \Gamma(E)$, then $s \circ f \in \Gamma_f(E)$, and if $U \in \mathfrak{X}(N)$, then $Tf \circ U \in \Gamma_f(TM) =: \mathfrak{X}_f(M)$ is a vector field along $f$. As a matter of notation, we let $F(Tf \circ U, Tf \circ V)s$ denote the map $p \mapsto F(Tf \cdot U_p, Tf \cdot V_p)s(p)$, which makes sense because of Theorem 12.44. Thus $F(Tf \circ U, Tf \circ V)s \in \Gamma_f(E)$. Then we have the following useful fact:

Proposition 12.46. Let $X \in \Gamma_f(E)$ and $U, V \in \mathfrak{X}(N)$. Then
$$F^f(U, V)X = F(Tf \circ U, Tf \circ V)X.$$

Proof. Exercise! $\square$
The next theorem is an important step in understanding the geometric meaning of curvature in terms of parallel transport. Refer to Figure 12.4.

Theorem 12.47. Let $\pi : E \to M$ be a vector bundle with connection. Let $u, v \in T_pM$ and $y \in E_p$. Let $U$ be a neighborhood of the origin in $\mathbb{R}^2$ and denote standard coordinate frame fields on $\mathbb{R}^2$ by $\partial_1, \partial_2$ and their values at the origin by $\partial_1(0)$ and $\partial_2(0)$. Suppose that we have a smooth map $f : U \to M$ with $f(0) = p$, $T_0f \cdot \partial_1(0) = u$ and $T_0f \cdot \partial_2(0) = v$. For small $s, t$, let $y_{s,t}$ denote the element of $E_p$ obtained by parallel translation of $y$ around the piecewise smooth loop obtained by successively tracing the following four curves:
(1) $c^1_{s,t} : \sigma \mapsto f(\sigma, 0)$, $0 \le \sigma \le s$;
(2) $c^2_{s,t} : \tau \mapsto f(s, \tau)$, $0 \le \tau \le t$;
(3) $c^3_{s,t} : \sigma \mapsto f(s - \sigma, t)$, $0 \le \sigma \le s$;
(4) $c^4_{s,t} : \tau \mapsto f(0, t - \tau)$, $0 \le \tau \le t$.
Then,
$$F(u, v)y = -\lim_{s,t \to 0}\frac{y_{s,t} - y}{st}.$$
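Proof. Let $Y$ be a section along $f$ defined as follows: If we choose $\epsilon > 0$ sufficiently small, then for each $(s, t) \in [0, \epsilon] \times [0, \epsilon]$ the four curves described above will be defined. For each such $(s, t)$, let $Y(s, t)$ be the result of parallel translation of $y$ along the first two curves $c^1_{s,t}$ and $c^2_{s,t}$. We leave it to the reader to argue that $Y$ is smooth. Notice that $y = Y(0, 0)$. By Proposition 12.46 we have
$$F(u, v)y = F^f(\partial_1(0), \partial_2(0))Y = \nabla_{\partial_1(0)}\nabla_{\partial_2}Y - \nabla_{\partial_2(0)}\nabla_{\partial_1}Y$$
since $[\partial_1, \partial_2] = 0$. Since $Y$ is parallel along the curves $c^2_{s,t}$, we have $\nabla_{\partial_2}Y = 0$ and so $F(u, v)y = -\nabla_{\partial_2(0)}\nabla_{\partial_1}Y$. If we let $P_t$ denote parallel translation along $\tau \mapsto f(0, \tau)$ from $p$ to $f(0, t)$, then, since $Y$ is also parallel along $\sigma \mapsto f(\sigma, 0)$, so that $(\nabla_{\partial_1}Y)(0, 0) = 0$, we have
$$\nabla_{\partial_2(0)}\nabla_{\partial_1}Y = \lim_{t \to 0}\frac{P_t^{-1}(\nabla_{\partial_1}Y)(0, t) - (\nabla_{\partial_1}Y)(0, 0)}{t} = \lim_{t \to 0}\frac{1}{t}P_t^{-1}(\nabla_{\partial_1(0,t)}Y).$$
On the other hand, if $P_{s,t}$ denotes parallel translation along $\sigma \mapsto f(\sigma, t)$ from $f(0, t)$ to $f(s, t)$, then
$$\nabla_{\partial_1(0,t)}Y = \lim_{s \to 0}\frac{P_{s,t}^{-1}Y(s, t) - Y(0, t)}{s}.$$
Putting things together we have
$$F(u, v)y = -\lim_{s,t \to 0}\frac{P_t^{-1}P_{s,t}^{-1}Y(s, t) - P_t^{-1}Y(0, t)}{st} = -\lim_{s,t \to 0}\frac{y_{s,t} - y}{st}. \qquad \square$$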
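As a numerical illustration of Theorem 12.47, the sketch below parallel translates a tangent vector around a small coordinate rectangle on the unit sphere $S^2$, using the four curves of the theorem with $f(a, b)$ the point with spherical coordinates $(\theta_0 + a, \phi_0 + b)$. It assumes the connection on $TS^2$ whose parallel sections satisfy $x' = -\langle x, c'\rangle c$ in the embedding $S^2 \subset \mathbb{R}^3$; the function names, the base point and the step count are hypothetical choices made only for this example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def point(theta, phi):
    """Point of S^2 with spherical coordinates (theta, phi)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def velocity(theta, phi, dtheta, dphi):
    """Derivative of point(theta, phi) along a path with coordinate speeds (dtheta, dphi)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return (dtheta * ct * cp - dphi * st * sp,
            dtheta * ct * sp + dphi * st * cp,
            -dtheta * st)

def transport(x, start, end, steps=2000):
    """RK4 integration of x' = -<x, c'> c along the coordinate segment start -> end."""
    (th0, ph0), (th1, ph1) = start, end
    dth, dph = th1 - th0, ph1 - ph0

    def rhs(tau, x):
        th, ph = th0 + tau * dth, ph0 + tau * dph
        lam = -dot(x, velocity(th, ph, dth, dph))
        return tuple(lam * cj for cj in point(th, ph))

    h = 1.0 / steps
    for i in range(steps):
        tau = i * h
        k1 = rhs(tau, x)
        k2 = rhs(tau + h / 2, tuple(a + h / 2 * b for a, b in zip(x, k1)))
        k3 = rhs(tau + h / 2, tuple(a + h / 2 * b for a, b in zip(x, k2)))
        k4 = rhs(tau + h, tuple(a + h * b for a, b in zip(x, k3)))
        x = tuple(a + h / 6 * (b1 + 2 * b2 + 2 * b3 + b4)
                  for a, b1, b2, b3, b4 in zip(x, k1, k2, k3, k4))
    return x

theta0, phi0, s, t = 1.0, 0.3, 0.1, 0.1
corners = [(theta0, phi0), (theta0 + s, phi0), (theta0 + s, phi0 + t),
           (theta0, phi0 + t), (theta0, phi0)]
p = point(theta0, phi0)
y = velocity(theta0, phi0, 1.0, 0.0)        # unit tangent vector at p
y_st = y
for a, b in zip(corners, corners[1:]):      # the four curves of the loop, in order
    y_st = transport(y_st, a, b)

# Signed rotation angle from y to y_st in the tangent plane at p.
angle = math.atan2(dot(cross(y, y_st), p), dot(y, y_st))
area = (math.cos(theta0) - math.cos(theta0 + s)) * t   # area enclosed by the loop
print(abs(angle), area, abs(angle) / (s * t), math.sin(theta0))
```

On the unit sphere the rotation produced by this loop has magnitude equal to the enclosed area, so the first two printed numbers should agree up to integration error. Dividing by $st$ gives a quantity close to $\sin\theta_0$, which is the size one expects for the difference quotient $(y_{s,t} - y)/(st)$ of the theorem when $s$ and $t$ are small.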
Next we will try to understand curvature in terms of the horizontal distribution. Let us define an object that in some sense measures how far the horizontal distribution is from being integrable. First note that the horizontal lifts of local frame fields on the base manifold $M$ give vector fields on the total space that span the distribution. If the bracket of these lifted fields were always horizontal, then the distribution would be integrable; the connection would be flat. This suggests the following: For every pair of horizontal fields $Z_1, Z_2$ on $E$, let
$$C(Z_1, Z_2) := p_v([Z_1, Z_2]).$$

Lemma 12.48. Let $\Gamma(\mathcal{H})$ denote the $C^\infty(E)$-module of horizontal fields and let $\Gamma(V)$ denote the $C^\infty(E)$-module of vertical fields. The map $C : \Gamma(\mathcal{H}) \times \Gamma(\mathcal{H}) \to \Gamma(V)$ given by $(Z_1, Z_2) \mapsto C(Z_1, Z_2)$ is a module homomorphism.

Proof. Let $f \in C^\infty(E)$. Then, since $Z_1$ is horizontal, we have
$$C(fZ_1, Z_2) = p_v([fZ_1, Z_2]) = p_v(f[Z_1, Z_2] - (Z_2f)Z_1) = fp_v([Z_1, Z_2]) - p_v((Z_2f)Z_1) = fp_v([Z_1, Z_2]) = fC(Z_1, Z_2).$$
The analogous result for the second factor follows because $C$ is clearly skew-symmetric. Additivity is obvious. $\square$

Corollary 12.49. For $y \in E$, the value $C(Z_1, Z_2)(y)$ depends only on the values $Z_1(y)$ and $Z_2(y)$. Also, $C$ is well-defined on locally defined horizontal fields.

Proof. The corollary says that $C$ is "tensorial". The proof is just like the proof of Theorem 7.32 and can also be derived from Proposition 6.55. $\square$

Because of this corollary, we may think of $C$ as a map $\mathcal{H} \times \mathcal{H} \to V$.
Theorem 12.50. Let $\pi : E \to M$ be a vector bundle with connection. Then for $u, v \in T_pM$ and $y \in E_p$ we have
$$F(u, v)y = -J_y^{-1}(C(\tilde u_y, \tilde v_y)),$$
where $\tilde u_y, \tilde v_y$ denote the horizontal lifts of $u$ and $v$ to the point $y$. Here $J_y : E_p \to T_yE_p$ is just the canonical map as usual.

Proof. Choose $U, V \in \mathfrak{X}(M)$ with $U(p) = u$ and $V(p) = v$. We may assume that $[U, V] = 0$ near $p$. Let $\varphi_s$ and $\psi_t$ be local flows of $U$ and $V$ respectively. Also, let $\tilde\varphi_s$ and $\tilde\psi_t$ be the flows of the horizontal lifts $\tilde U$ and $\tilde V$. Observe that since $\varphi_s \circ \pi = \pi \circ \tilde\varphi_s$, we have that $\tilde\varphi_s(y)$ is the parallel translation of $y$ along the curve $\sigma \mapsto \varphi_\sigma \circ \pi(y)$, $0 \le \sigma \le s$. Similarly, $\tilde\psi_t(y)$ is the parallel translation of $y$ along the curve $\tau \mapsto \psi_\tau \circ \pi(y)$, $0 \le \tau \le t$. Now consider the curve $c$ defined for small $t$ by
$$c(t) := \tilde\psi_{-\sqrt t} \circ \tilde\varphi_{-\sqrt t} \circ \tilde\psi_{\sqrt t} \circ \tilde\varphi_{\sqrt t}(y).$$
Then, by Theorem 12.47, we have
$$F(u, v)y = -\lim_{s,t \to 0}\frac{y_{s,t} - y}{st} = -\lim_{t \to 0^+}\frac{y_{\sqrt t,\sqrt t} - y}{t} = -\lim_{t \to 0^+}\frac{c(t) - c(0)}{t}.$$
But by Theorem 2.112,
$$-\lim_{t \to 0^+}\frac{c(t) - c(0)}{t} = -J_y^{-1}\dot c(0) = -J_y^{-1}[\tilde U, \tilde V](y).$$
Finally, notice that $[U, V](p) = 0$, so $[\tilde U, \tilde V](y)$ is vertical and is equal to $p_v([\tilde U, \tilde V])(y) = C(\tilde U, \tilde V)(y) = C(\tilde u_y, \tilde v_y)$. $\square$

Unwinding the definitions, the result of the previous theorem can be written as
$$(F(U, V)s)(p) = -\kappa([\tilde U, \tilde V]_{s(p)}),$$
where $s \in \Gamma(E)$ and $\kappa$ is the connector associated to the connection.
Theorem 12.51. Let $\pi : E \to M$ be a vector bundle with connection. If the curvature $F$ vanishes, then the connection is flat. In particular, given any $y \in E$, there is a locally defined parallel section $s$ with $s(p) = y$, where $\pi(y) = p$.

Proof. Let $Z_1, Z_2$ be locally defined horizontal vector fields. Then for any $y$ in the common domain of $Z_1, Z_2$, let $u, v$ be the images of $Z_1(y), Z_2(y)$ under $T\pi$. Thus $Z_1(y)$ and $Z_2(y)$ are the horizontal lifts of $u$ and $v$. Then we have
$$-J_y^{-1}(p_v([Z_1, Z_2]_y)) = F(u, v)y = 0.$$
But $J_y^{-1}$ is an isomorphism, and so $p_v([Z_1, Z_2]_y) = 0$, which means that $[Z_1, Z_2]_y$ is horizontal. Since $y$ was arbitrary, $[Z_1, Z_2]$ is horizontal. We conclude that $\mathcal{H}$ is integrable by the Frobenius theorem. Now apply Theorem 12.25. $\square$
12.6. Connections on Tangent Bundles

A connection on the tangent bundle $TM$ of a smooth manifold $M$ is called a linear connection on $M$. It is traditional to use special notation for the components of the connection form when using coordinate frames $\{\partial/\partial x^i\}$. In this case, formula (12.3) becomes
\[ \nabla_{\partial/\partial x^i} \frac{\partial}{\partial x^j} = \sum_k \Gamma^k_{ij} \frac{\partial}{\partial x^k}. \tag{12.6} \]
The $\Gamma^k_{ij}$ are a special case of the components of the connection forms. If $X = X^j \frac{\partial}{\partial x^j}$ and $Y = Y^j \frac{\partial}{\partial x^j}$, then
\[ \nabla_X Y = \left( \frac{\partial Y^k}{\partial x^j} X^j + \Gamma^k_{ij} X^i Y^j \right) \frac{\partial}{\partial x^k} \quad \text{(summation convention)}, \tag{12.7} \]
which is formula (12.4) in the current context. The functions $\Gamma^k_{ij}$ are called Christoffel symbols.

It is a consequence of Proposition 7.37 that for each $X \in \mathfrak{X}(M)$ there is a unique tensor derivation $\nabla_X$ on $\mathcal{T}^r_s(M)$ such that $\nabla_X$ commutes with contraction and coincides with the given covariant derivative on $\mathfrak{X}(M)$ (also denoted $\nabla_X$) and with $\mathcal{L}_X$ on $C^\infty(M)$.
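As a small worked illustration of (12.6) and (12.7), consider the punctured plane with polar coordinates $(r, \theta)$ and the linear connection for which the Cartesian coordinate fields $\partial_x$, $\partial_y$ are parallel. This choice of manifold and connection is an assumption made only for the example. Differentiating the Cartesian components of $\partial_r$ and $\partial_\theta$ gives:

```latex
\begin{align*}
\partial_r &= \cos\theta\,\partial_x + \sin\theta\,\partial_y, &
\partial_\theta &= -r\sin\theta\,\partial_x + r\cos\theta\,\partial_y,\\
\nabla_{\partial_r}\partial_r &= 0, &
\nabla_{\partial_r}\partial_\theta &= \nabla_{\partial_\theta}\partial_r
   = \tfrac{1}{r}\,\partial_\theta,\\
\nabla_{\partial_\theta}\partial_\theta &= -r\,\partial_r, &&\\
\Gamma^{\theta}_{r\theta} &= \Gamma^{\theta}_{\theta r} = \tfrac{1}{r}, &
\Gamma^{r}_{\theta\theta} &= -r
   \quad\text{(all other } \Gamma^k_{ij} \text{ vanish)}.
\end{align*}
```

For instance, formula (12.7) with $X = Y = \partial_\theta$ has constant components, so only the Christoffel term survives and $\nabla_X Y = \Gamma^k_{\theta\theta}\,\partial_k = -r\,\partial_r$.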
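To describe the covariant derivative on tensors more explicitly, consider $\omega \in \mathcal{T}^0_1(M)$. Since we have the contraction $Y \otimes \omega \mapsto C(Y \otimes \omega) = \omega(Y)$, we should have
\[ \nabla_X(\omega(Y)) = \nabla_X C(Y \otimes \omega) = C(\nabla_X(Y \otimes \omega)) = C(\nabla_X Y \otimes \omega + Y \otimes \nabla_X \omega) = \omega(\nabla_X Y) + (\nabla_X \omega)(Y). \]
So we define $(\nabla_X \omega)(Y) := \nabla_X(\omega(Y)) - \omega(\nabla_X Y)$. This implies that for a local frame field $E_1, \ldots, E_n$ with dual frame field $\theta^1, \ldots, \theta^n$ we have
\[ \nabla_X \theta^i = -\sum_j \omega^i_j(X)\,\theta^j, \]
where $\omega^i_j$ are the connection forms satisfying $\nabla_X E_j = \sum_k \omega^k_j(X) E_k$. More generally, if $\Upsilon \in \mathcal{T}^r_s$, then
\[ (\nabla_X \Upsilon)(\omega_1, \ldots, \omega_r, Y_1, \ldots, Y_s) = X(\Upsilon(\omega_1, \ldots, \omega_r, Y_1, \ldots, Y_s)) - \sum_{j=1}^{r} \Upsilon(\omega_1, \ldots, \nabla_X \omega_j, \ldots, \omega_r, Y_1, \ldots, Y_s) - \sum_{i=1}^{s} \Upsilon(\omega_1, \ldots, \omega_r, Y_1, \ldots, \nabla_X Y_i, \ldots, Y_s). \tag{12.8} \]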
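Definition 12.52. The covariant differential of a tensor field $\Upsilon \in \mathcal{T}^r_s$ is denoted by $\nabla\Upsilon$ and is defined to be the element of $\mathcal{T}^r_{s+1}$ given by
\[ \nabla\Upsilon(\omega_1, \ldots, \omega_r, X, Y_1, \ldots, Y_s) := \nabla_X \Upsilon(\omega_1, \ldots, \omega_r, Y_1, \ldots, Y_s). \]
For any fixed frame field $E_1, \ldots, E_n$, we denote the components of $\nabla\Upsilon$ by
\[ \nabla_i \Upsilon^{i_1 \ldots i_r}_{j_1 \ldots j_s}. \]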
Remark 12.53. We have placed the new variable at the beginning as suggested by our notation $\nabla_i \Upsilon^{i_1 \ldots i_r}_{j_1 \ldots j_s}$ for the components of $\nabla\Upsilon$ but in opposition to the equally common notation $\Upsilon^{i_1 \ldots i_r}_{j_1 \ldots j_s;i}$. This has the advantage of meshing well with exterior differentiation, making the statement of Theorem 12.56 as simple as possible.

The reader is asked in Problem 2 to show that $T(X, Y) := \nabla_X Y - \nabla_Y X - [X, Y]$ defines a tensor and so $T(X_p, Y_p)$ is well-defined for $X_p, Y_p \in T_pM$. This tensor is called the torsion tensor. If $T$ vanishes identically, then we say that the connection is torsion free. We have already noted that the Levi-Civita connection defined in Chapter 4 for a hypersurface is torsion free. We shall have more to say about torsion further on, but for now, notice that if the connection is torsion free, then
\[ 0 = \nabla_{\partial_i}\partial_j - \nabla_{\partial_j}\partial_i = \sum_k \left(\Gamma^k_{ij} - \Gamma^k_{ji}\right)\partial_k, \]
and so the Christoffel symbols are symmetric with respect to the lower two indices:
\[ \Gamma^k_{ij} = \Gamma^k_{ji}. \]
However, even for torsion free connections, this is generally not true for the $\omega^k_{ij}$ defined by $\nabla_{e_i} e_j = \sum_k \omega^k_{ij} e_k$ unless the elements $e_1, \ldots, e_n$ of the frame field have pairwise vanishing Lie brackets. But this amounts to saying that they are locally coordinate frame fields for some chart.
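A concrete instance of this failure of symmetry may help. The following sketch assumes the punctured plane with the torsion free connection for which $\partial_x$, $\partial_y$ are parallel, and uses the orthonormal polar frame $e_1 = \partial_r$, $e_2 = \tfrac{1}{r}\partial_\theta$; both choices are assumptions made for the illustration.

```latex
\begin{align*}
e_1 &= \cos\theta\,\partial_x + \sin\theta\,\partial_y, &
e_2 &= -\sin\theta\,\partial_x + \cos\theta\,\partial_y,\\
\nabla_{e_1} e_2 &= \partial_r(e_2) = 0, &
\nabla_{e_2} e_1 &= \tfrac{1}{r}\,\partial_\theta(e_1) = \tfrac{1}{r}\,e_2,\\
\omega^2_{12} &= 0, &
\omega^2_{21} &= \tfrac{1}{r},\\
[e_1, e_2] &= \nabla_{e_1} e_2 - \nabla_{e_2} e_1 = -\tfrac{1}{r}\,e_2 \neq 0. &&
\end{align*}
```

The coefficients $\omega^2_{12}$ and $\omega^2_{21}$ differ exactly because $[e_1, e_2] \neq 0$, which is consistent with the frame not being a coordinate frame.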
12.7. Comparing the Differential Operators

On a smooth manifold we have the Lie derivative $\mathcal{L}_X : \mathcal{T}^r_s(M) \to \mathcal{T}^r_s(M)$ and the exterior derivative $d : \Omega^k(M) \to \Omega^{k+1}(M)$, and in case we have a torsion free covariant derivative $\nabla$, that makes three differential operators which we would like to compare. To this end, we restrict attention to purely covariant tensor fields $\mathcal{T}^0_s(M)$. The extended covariant derivative on tensor fields $\nabla_X : \mathcal{T}^0_s(M) \to \mathcal{T}^0_s(M)$ respects the subspace consisting of alternating tensors, and so for each $k$ it restricts to a map on alternating $k$-tensors. These maps combine to give a degree preserving map, which in the notation of differential forms reads $\nabla_X : \Omega(M) \to \Omega(M)$. It is also easily seen that not only do we have $\nabla_X(\alpha \otimes \beta) = \nabla_X \alpha \otimes \beta + \alpha \otimes \nabla_X \beta$, but also
\[ \nabla_X(\alpha \wedge \beta) = \nabla_X \alpha \wedge \beta + \alpha \wedge \nabla_X \beta. \]
Recall (12.8) and the similar formula (7.10) for the Lie derivative. If $\nabla$ is torsion free so that $\mathcal{L}_X Y_i = [X, Y_i] = \nabla_X Y_i - \nabla_{Y_i} X$, then we obtain the following modification of formula (7.10) which incorporates $\nabla$.
Proposition 12.54. For a torsion free connection, we have the following equality for any $S \in \mathcal{T}^0_s(M)$:
\[ (\mathcal{L}_X S)(Y_1, \ldots, Y_s) = (\nabla_X S)(Y_1, \ldots, Y_s) + \sum_{i=1}^{s} S(Y_1, \ldots, Y_{i-1}, \nabla_{Y_i} X, Y_{i+1}, \ldots, Y_s). \tag{12.9} \]

Proof. See Problem 10. □

Corollary 12.55. $\mathcal{L}_X S - \nabla_X S$ is a tensor.
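To see where the extra term in (12.9) comes from, it may help to write out the simplest case $s = 1$, where $S = \omega$ is a 1-form. The sketch below assumes only the torsion free identity $[X, Y] = \nabla_X Y - \nabla_Y X$ and the definition $(\nabla_X\omega)(Y) = X(\omega(Y)) - \omega(\nabla_X Y)$ used earlier in the chapter.

```latex
\begin{align*}
(\mathcal{L}_X \omega)(Y)
  &= X(\omega(Y)) - \omega([X, Y]) \\
  &= X(\omega(Y)) - \omega(\nabla_X Y) + \omega(\nabla_Y X) \\
  &= (\nabla_X \omega)(Y) + \omega(\nabla_Y X).
\end{align*}
```

The general case repeats this substitution once in each of the $s$ slots.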
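For $\omega \in \Omega^k(M)$, we have that $\nabla\omega$ is a covariant tensor field but now not necessarily alternating.² We search for a way to fix this. By antisymmetrizing, we get a map $\Omega^k(M) \to \Omega^{k+1}(M)$ which turns out to be none other than our old friend the exterior derivative as will be shown below.

Theorem 12.56. If $\nabla$ is a torsion free covariant derivative on $M$, then
\[ d = (k+1)\,\mathrm{Alt} \circ \nabla, \]
or in other words, if $\omega \in \Omega^k(M)$, then
\[ d\omega(X_0, X_1, \ldots, X_k) = \sum_{i=0}^{k} (-1)^i (\nabla_{X_i}\omega)(X_0, \ldots, \widehat{X_i}, \ldots, X_k). \]

Proof. First we show that the two statements agree, that is,
\[ (k+1)(\mathrm{Alt} \circ \nabla\omega)(X_0, \ldots, X_k) = \sum_{i=0}^{k} (-1)^i (\nabla_{X_i}\omega)(X_0, \ldots, \widehat{X_i}, \ldots, X_k). \]
Consider the subgroup $H$ consisting of the permutations of $\{0, 1, \ldots, k\}$ that fix $0$. The cosets of $H$ are $H = H_0, H_1, \ldots, H_k$, where $H_j$ is the set of permutations that send $0$ to $j$. If we decompose the group of permutations into these cosets, then
\[ (\mathrm{Alt} \circ \nabla\omega)(X_0, X_1, \ldots, X_k) = \frac{1}{(k+1)!} \sum_{\sigma} \mathrm{sgn}(\sigma)(\nabla_{X_{\sigma 0}}\omega)(X_{\sigma 1}, \ldots, X_{\sigma k}) = \frac{1}{(k+1)!} \sum_{j=0}^{k} \sum_{\sigma \in H_j} \mathrm{sgn}(\sigma)(\nabla_{X_j}\omega)(X_{\sigma 1}, \ldots, X_{\sigma k}) = \frac{1}{k+1} \sum_{j=0}^{k} (-1)^j (\nabla_{X_j}\omega)(X_0, \ldots, \widehat{X_j}, \ldots, X_k), \]
where the last step uses that each $\nabla_{X_j}\omega$ is alternating and that $H_j$ has $k!$ elements.

² However, $\nabla\omega$ can be viewed as being in $\Omega^k(M) \otimes_{C^\infty} \Omega^1(M)$.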
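To finish we calculate as follows:
\[ d\omega(X_0, X_1, \ldots, X_k) = \sum_{i=0}^{k} (-1)^i X_i(\omega(X_0, \ldots, \widehat{X_i}, \ldots, X_k)) + \sum_{0 \le r < s \le k} (-1)^{r+s} \omega([X_r, X_s], X_0, \ldots, \widehat{X_r}, \ldots, \widehat{X_s}, \ldots, X_k) \]
\[ = \sum_{i=0}^{k} (-1)^i X_i(\omega(X_0, \ldots, \widehat{X_i}, \ldots, X_k)) + \sum_{0 \le r < s \le k} (-1)^{r+s} \omega(\nabla_{X_r} X_s - \nabla_{X_s} X_r, X_0, \ldots, \widehat{X_r}, \ldots, \widehat{X_s}, \ldots, X_k), \]
which is equal to
\[ \sum_{i=0}^{k} (-1)^i X_i(\omega(X_0, \ldots, \widehat{X_i}, \ldots, X_k)) - \sum_{0 \le r < s \le k} (-1)^{r} \omega(X_0, \ldots, \widehat{X_r}, \ldots, \nabla_{X_r} X_s, \ldots, X_k) - \sum_{0 \le r < s \le k} (-1)^{s} \omega(X_0, \ldots, \nabla_{X_s} X_r, \ldots, \widehat{X_s}, \ldots, X_k) \]
\[ = \sum_{i=0}^{k} (-1)^i (\nabla_{X_i}\omega)(X_0, \ldots, \widehat{X_i}, \ldots, X_k) \quad \text{(by using (12.8))}. \]
□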
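The case $k = 1$ of Theorem 12.56 can be checked by hand in two lines. The sketch below assumes only the definition $(\nabla_X\omega)(Y) = X(\omega(Y)) - \omega(\nabla_X Y)$ and the torsion free identity $\nabla_X Y - \nabla_Y X = [X, Y]$.

```latex
\begin{align*}
(\nabla_X \omega)(Y) - (\nabla_Y \omega)(X)
  &= X(\omega(Y)) - \omega(\nabla_X Y) - Y(\omega(X)) + \omega(\nabla_Y X) \\
  &= X(\omega(Y)) - Y(\omega(X)) - \omega([X, Y]) \\
  &= d\omega(X, Y).
\end{align*}
```

If the connection had torsion $T$, the same computation would leave an extra term $-\omega(T(X, Y))$, which is why the torsion free hypothesis is needed.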
12.8. Higher Covariant Derivatives

Now let us suppose that we have a connection $\nabla^{E_i}$ on every vector bundle $E_i \to M$ in some family $\{E_i\}_{i \in I}$. We then also have the connections $\nabla^{E_i^*}$ induced on the duals $E_i^* \to M$. By demanding a product formula be satisfied as usual we can form a related family of connections on all bundles formed from tensor products of the bundles in the family $\{E_i, E_i^*\}_{i \in I}$. In this situation, it might be convenient to denote any and all of these connections by a single symbol as long as the context makes confusion unlikely. In particular, we have the following common situation: By the definition of a connection we have that $X \mapsto \nabla^E_X \sigma$ is $C^\infty(M)$ linear and so $\nabla^E \sigma$ is a section of the bundle $T^*M \otimes E$. We can use the Levi-Civita connection or any torsion free connection $\nabla$ on $M$ together with $\nabla^E$ to define a connection on $E \otimes T^*M$. To get a clear picture of this connection, we first notice that a section $\xi$ of the bundle $E \otimes T^*M$ can be written locally in terms of a local frame field $\{\theta^i\}$ on $T^*M$ and a local frame field $\{e_i\}$ on $E$. Namely, we may write $\xi = \sum \xi^i_j\, e_i \otimes \theta^j$. Then the connection $\nabla^{E \otimes T^*M}$ on $E \otimes T^*M$ is defined so that a product rule holds. Let $\gamma^\mu_\nu$ and $\omega^i_j$ be the connection forms for $\nabla$ and $\nabla^E$ respectively, so that locally, using the summation convention, we have
\[ \nabla^{E \otimes T^*M}_X \xi = \nabla^E_X(\xi^i_\mu e_i) \otimes \theta^\mu + \xi^i_\mu e_i \otimes \nabla_X \theta^\mu \]
\[ = (X\xi^i_\mu\, e_i + \xi^i_\mu \nabla^E_X e_i) \otimes \theta^\mu + \xi^i_\mu e_i \otimes \nabla_X \theta^\mu \]
\[ = (X\xi^r_\nu\, e_r + e_r\, \omega^r_i(X)\xi^i_\nu) \otimes \theta^\nu - \gamma^\mu_\nu(X)\xi^r_\mu\, e_r \otimes \theta^\nu \]
\[ = \left(X\xi^r_\nu + \omega^r_i(X)\xi^i_\nu - \gamma^\mu_\nu(X)\xi^r_\mu\right) e_r \otimes \theta^\nu. \]
Now let $\xi = \nabla^E \sigma$ for a given $\sigma \in \Gamma(E)$. The map $X \mapsto \nabla^{E \otimes T^*M}_X(\nabla^E \sigma)$ is $C^\infty(M)$ linear, and $\nabla^{E \otimes T^*M}(\nabla^E \sigma)$ is an element of $\Gamma(E \otimes T^*M \otimes T^*M)$, which can again be given the obvious connection. The process continues, and denoting all the connections by the same symbol we may consider the $k$-th covariant derivative $\nabla^k \sigma \in \Gamma(E \otimes T^*M^{\otimes k})$ for each $\sigma \in \Gamma(E)$. It is sometimes convenient to change viewpoints just slightly and define the covariant derivative operators $\nabla_{X_1, X_2, \ldots, X_k} : \Gamma(E) \to \Gamma(E)$. The definition is given inductively as
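\[ \nabla_{X_1, X_2}\sigma := \nabla_{X_1}\nabla_{X_2}\sigma - \nabla_{\nabla_{X_1} X_2}\sigma, \]
\[ \nabla_{X_1, \ldots, X_k}\sigma := \nabla_{X_1}(\nabla_{X_2, \ldots, X_k}\sigma) - \nabla_{\nabla_{X_1} X_2, X_3, \ldots, X_k}\sigma - \cdots - \nabla_{X_2, X_3, \ldots, \nabla_{X_1} X_k}\sigma. \]
Then we have the following convenient formula, which is true by definition:
\[ \nabla^{(k)}\sigma(X_1, \ldots, X_k) = \nabla_{X_1, \ldots, X_k}\sigma. \]
Warning: $\nabla_{\partial_i}\nabla_{\partial_j}\sigma$ is a section of $E$, but it is not the same section as $\nabla_{\partial_i, \partial_j}\sigma$ since in general
\[ \nabla_{\partial_i}\nabla_{\partial_j}\sigma \neq \nabla_{\partial_i}\nabla_{\partial_j}\sigma - \nabla_{\nabla_{\partial_i}\partial_j}\sigma. \]
Now we have that
\[ \nabla_{X_1, X_2}\sigma - \nabla_{X_2, X_1}\sigma = \nabla_{X_1}\nabla_{X_2}\sigma - \nabla_{\nabla_{X_1} X_2}\sigma - \left(\nabla_{X_2}\nabla_{X_1}\sigma - \nabla_{\nabla_{X_2} X_1}\sigma\right) \]
\[ = \nabla_{X_1}\nabla_{X_2}\sigma - \nabla_{X_2}\nabla_{X_1}\sigma - \nabla_{(\nabla_{X_1} X_2 - \nabla_{X_2} X_1)}\sigma = F(X_1, X_2)\sigma - \nabla_{T(X_1, X_2)}\sigma. \]
So, if $\nabla$ (the connection on the base $M$) is torsion free, then we recover the curvature
\[ \nabla_{X_1, X_2}\sigma - \nabla_{X_2, X_1}\sigma = F(X_1, X_2)\sigma. \]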
One thing that is quite important to realize is that $F$ depends only on the connection on $E$, while the operators $\nabla_{X_1, X_2}$ involve a torsion free connection on the tangent bundle $TM$.
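A simple special case shows both effects at once. The sketch below assumes $E = M \times \mathbb{R}$ is the trivial line bundle with $\nabla^E_X f = Xf$ (so its curvature vanishes) and lets the linear connection $\nabla$ on $M$ have possibly nonzero torsion $T$; this setup is an assumption chosen for illustration.

```latex
\begin{align*}
\nabla_{X,Y} f &= X(Yf) - (\nabla_X Y) f, \\
\nabla_{X,Y} f - \nabla_{Y,X} f
  &= [X, Y] f - (\nabla_X Y - \nabla_Y X) f \\
  &= -T(X, Y) f .
\end{align*}
```

This agrees with the general formula $F(X, Y)f - \nabla_{T(X,Y)} f$ with $F = 0$: the second covariant derivative of a function is symmetric exactly when the connection on the base is torsion free.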
12.9. Exterior Covariant Derivative

The exterior covariant derivative essentially antisymmetrizes the higher covariant derivatives just defined in such a way that the dependence on the auxiliary torsion free linear connection on the base cancels out. Of course this means that there must be a definition that does not involve this connection on the base at all. We give the definitions below, but first we point out a little more algebraic structure. We can give the space of $E$-valued forms $\Omega(M, E)$ the structure of a $(\Omega(M), \Omega(M))$-"bi-module". This means that we define a product $\wedge : \Omega(M) \times \Omega(M, E) \to \Omega(M, E)$ and another product $\wedge : \Omega(M, E) \times \Omega(M) \to \Omega(M, E)$ which are compatible in the sense that
\[ (\alpha \wedge \omega) \wedge \beta = \alpha \wedge (\omega \wedge \beta) \]
for $\omega \in \Omega(M, E)$ and $\alpha, \beta \in \Omega(M)$. These products are defined by extending linearly the rules
\[ \alpha \wedge (\sigma \otimes \omega) := \sigma \otimes \alpha \wedge \omega \quad \text{for } \sigma \in \Gamma(E) \text{ and } \alpha, \omega \in \Omega(M), \]
\[ (\sigma \otimes \omega) \wedge \alpha := \sigma \otimes \omega \wedge \alpha \quad \text{for } \sigma \in \Gamma(E) \text{ and } \alpha, \omega \in \Omega(M), \]
\[ \alpha \wedge \sigma = \sigma \wedge \alpha = \sigma \otimes \alpha \quad \text{for } \alpha \in \Omega(M) \text{ and } \sigma \in \Gamma(E). \]
More precisely, we extend by using the universal properties of multilinear products. It follows that
\[ \alpha \wedge \omega = (-1)^{kl}\,\omega \wedge \alpha \quad \text{for } \alpha \in \Omega^k(M) \text{ and } \omega \in \Omega^l(M, E). \]
In the situation where $\sigma \in \Gamma(E) = \Omega^0(M, E)$ and $\omega \in \Omega(M)$, we have all three of the following conventional equalities:
\[ \sigma\omega = \sigma \wedge \omega = \sigma \otimes \omega \quad \text{(special case)}. \]
The proof of the following theorem is analogous to the proof of the existence of the exterior derivative.
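Theorem 12.57. Given a connection $\nabla$ on a vector bundle $\pi : E \to M$, there exists a unique operator $d^\nabla : \Omega(M, E) \to \Omega(M, E)$ such that

(i) $d^\nabla(\Omega^k(M, E)) \subset \Omega^{k+1}(M, E)$;

(ii) For $\alpha \in \Omega^k(M)$ and $\omega \in \Omega^l(M, E)$ we have
\[ d^\nabla(\alpha \wedge \omega) = d\alpha \wedge \omega + (-1)^k \alpha \wedge d^\nabla\omega, \]
\[ d^\nabla(\omega \wedge \alpha) = d^\nabla\omega \wedge \alpha + (-1)^l \omega \wedge d\alpha; \]

(iii) $d^\nabla\sigma = \nabla\sigma$ for $\sigma \in \Gamma(E)$.

In particular, if $\alpha \in \Omega^k(M)$ and $\sigma \in \Omega^0(M, E) = \Gamma(E)$, we have
\[ d^\nabla(\sigma \otimes \alpha) = d^\nabla\sigma \wedge \alpha + \sigma \otimes d\alpha. \]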
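It can be shown that if we use $\Omega^k(M; E) \cong L^k_{\mathrm{alt}}(\mathfrak{X}(M), \Gamma(E))$, then we have the following formula:
\[ d^\nabla\omega(X_0, \ldots, X_k) = \sum_{i=0}^{k} (-1)^i \nabla_{X_i}\bigl(\omega(X_0, \ldots, \widehat{X_i}, \ldots, X_k)\bigr) + \sum_{0 \le i < j \le k} (-1)^{i+j} \omega([X_i, X_j], X_0, \ldots, \widehat{X_i}, \ldots, \widehat{X_j}, \ldots, X_k). \]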
Definition 12.58. The operator $d^\nabla$ whose existence is given by the previous theorem is called the exterior covariant derivative.

Exercise 12.59. We shall sometimes use the notation $d^E$ rather than $d^\nabla$. This is especially useful when two possibly unrelated connections on possibly different bundles are involved in the discussion. For example, we may deal with a connection $\nabla^{TM}$ on $TM$ and at the same time deal with a connection $\nabla^E$ on $E \to M$. Then it is conceivable that we may have to work with both $d^E$ and $d^{TM}$. Failure to notice these notational possibilities can result in serious confusion.
Note that we have the following special case for $\mu \in \Omega^1(M, E)$:
\[ d^\nabla\mu(X, Y) = \nabla_X(\mu(Y)) - \nabla_Y(\mu(X)) - \mu([X, Y]) = d^\nabla(\mu(Y))(X) - d^\nabla(\mu(X))(Y) - \mu([X, Y]). \tag{12.10} \]
Example 12.60. Let us consider the case where $E = TM$ with connection $\nabla^{TM}$. Bundle maps $TM \to TM$ may be regarded as elements of $\Omega^1(M; TM)$ (think about this). In particular, the identity map $\mathrm{id}_{TM}(v) = v$ can be viewed as a $TM$-valued 1-form. If we take the exterior covariant differential of $\theta := \mathrm{id}_{TM}$, we obtain an element $d^{TM}\theta$ of $\Omega^2(M; TM)$. For $X, Y \in \mathfrak{X}(M)$, we compute
\[ d^{TM}\theta(X, Y) = d^{TM}(\theta(Y))(X) - d^{TM}(\theta(X))(Y) - \theta([X, Y]) = d^{TM}(Y)(X) - d^{TM}(X)(Y) - [X, Y] = \nabla_X Y - \nabla_Y X - [X, Y] = T(X, Y). \]
So we see that $d^{TM}\theta$ gives the torsion of the connection $\nabla^{TM}$.

If $\nabla^{TM}$ is a torsion free covariant derivative on $M$ and $\nabla^E$ is a connection on the vector bundle $E \to M$, then as before we get a covariant derivative $\nabla$ on all the bundles $E \otimes \wedge^k T^*M$, and as for the ordinary exterior derivative we have the formula
\[ d^\nabla = d^E = (k+1)\,\mathrm{Alt} \circ \nabla, \]
or in other words, if $\omega \in \Omega^k(M, E)$, then
\[ d^\nabla\omega(X_0, X_1, \ldots, X_k) = \sum_{i=0}^{k} (-1)^i (\nabla_{X_i}\omega)(X_0, \ldots, \widehat{X_i}, \ldots, X_k). \]
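Let us look at the local expressions with respect to a moving frame. Let $\{e_i\}$ be a frame field for $E$ on $U \subset M$. Then locally, for $\omega \in \Omega^k(M, E)$, we may write $\omega = \sum e_i \otimes \omega^i$, where $\omega^i \in \Omega^k(M)$. Using the summation convention, we have
\[ d^\nabla\omega = d^\nabla(e_i \otimes \omega^i) = e_i \otimes d\omega^i + d^\nabla e_i \wedge \omega^i \]
\[ = e_j \otimes d\omega^j + \omega^j_i\, e_j \wedge \omega^i = e_j \otimes d\omega^j + e_j \otimes \omega^j_i \wedge \omega^i \]
\[ = e_j \otimes \left(d\omega^j + \omega^j_i \wedge \omega^i\right). \]
Thus the "$(k+1)$-form coefficients" of $d^\nabla\omega$ with respect to the frame $\{e_j\}$ are given by $d\omega^j + \omega^j_i \wedge \omega^i$, where $\omega^j_i$ are the connection forms. We can extend our wedge operation to a bilinear operation between $\Omega(M, \mathrm{End}(E))$ and $\Omega(M, E)$ in such a way that
\[ (A \otimes \alpha) \wedge (\sigma \otimes \beta) = A(\sigma) \otimes \alpha \wedge \beta. \]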
To understand what is going on a little better, let us consider how the action of $\Omega(M, \mathrm{End}(E))$ on $\Omega(M, E)$ comes about from a slightly different point of view. We can identify $\Omega(M, \mathrm{End}(E))$ with $\Omega(M, E \otimes E^*)$, and then the action of $\Omega(M, E \otimes E^*)$ on $\Omega(M, E)$ is given by tensoring and then contracting: For $\alpha, \beta \in \Omega(M)$, $s, \sigma \in \Gamma(E)$, and $s^* \in \Gamma(E^*)$, we have
\[ (s \otimes s^* \otimes \alpha) \wedge (\sigma \otimes \beta) := C(s \otimes s^* \otimes \sigma \otimes (\alpha \wedge \beta)) = s^*(\sigma)\,s \otimes (\alpha \wedge \beta). \]
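From $\nabla^E$ we get related connections on $E^*$, $E \otimes E^*$ and $E \otimes E^* \otimes E$. The connection on $E \otimes E^*$ is also a connection on $\mathrm{End}(E)$. Of course, we have
\[ \nabla^{\mathrm{End}(E) \otimes E}_X (L \otimes \sigma) = (\nabla^{\mathrm{End}(E)}_X L) \otimes \sigma + L \otimes \nabla^E_X \sigma, \]
and after contraction,
\[ \nabla^E_X(L(\sigma)) = \nabla^E_X C(L \otimes \sigma) = C\bigl(\nabla^{\mathrm{End}(E)}_X L \otimes \sigma + L \otimes \nabla^E_X \sigma\bigr) = (\nabla^{\mathrm{End}(E)}_X L)(\sigma) + L(\nabla^E_X \sigma). \]
But $X \mapsto L(\nabla^E_X \sigma)$ is just $L \wedge \nabla^E \sigma$. So we have
\[ \nabla^E(L(\sigma)) = (\nabla^{\mathrm{End}(E)} L)(\sigma) + L \wedge \nabla^E \sigma. \]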
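The connections on $\mathrm{End}(E)$ and $\mathrm{End}(E) \otimes E$ give the corresponding exterior covariant derivative operators
\[ d^{\mathrm{End}(E)} : \Omega^k(M, \mathrm{End}(E)) \to \Omega^{k+1}(M, \mathrm{End}(E)) \]
and
\[ d^{\mathrm{End}(E) \otimes E} : \Omega^k(M, \mathrm{End}(E) \otimes E) \to \Omega^{k+1}(M, \mathrm{End}(E) \otimes E). \]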
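It must be expected that many readers will find this formalism a bit daunting but it is not really as bad as it seems. Local calculations turn out to be quite natural as we shall see below. We encourage the reader to seek as much exposure as possible. To this end, we recommend [BaMu], [Poor], and [Dar]. Also, part of what might be intimidating is really just the bulkiness of the notations. The following notational convention makes things more palatable:

Notation 12.61. Let us now agree that, whenever convenient, we may write simply $\nabla$ for $\nabla^E$, $\nabla^{\mathrm{End}(E)}$, $\nabla^{\mathrm{End}(E) \otimes E}$, etc. and $d^\nabla$ for what we have called $d^E$, $d^{\mathrm{End}(E)}$, etc.

Proposition 12.62. For $\Phi \in \Omega^k(M, \mathrm{End}(E))$ and $\omega \in \Omega(M, E)$, we have
\[ d^\nabla(\Phi \wedge \omega) = d^\nabla\Phi \wedge \omega + (-1)^k \Phi \wedge d^\nabla\omega. \]

Proof. For $A \in \Gamma(\mathrm{End}(E))$, $\alpha \in \Omega^k(M)$, $\sigma \in \Gamma(E)$ and $\beta \in \Omega(M)$ we have
\[ d^E((A \otimes \alpha) \wedge (\sigma \otimes \beta)) := d^E(A(\sigma) \otimes (\alpha \wedge \beta)) \]
\[ = \nabla^E(A(\sigma)) \wedge (\alpha \wedge \beta) + A(\sigma) \otimes d(\alpha \wedge \beta) \]
\[ = (-1)^k \left(\alpha \wedge \nabla^E(A(\sigma)) \wedge \beta\right) + A(\sigma) \otimes d(\alpha \wedge \beta) \]
\[ = (-1)^k \left(\alpha \wedge \{\nabla^{\mathrm{End}(E)} A \wedge \sigma + A \wedge \nabla^E \sigma\} \wedge \beta\right) + A(\sigma) \otimes d\alpha \wedge \beta + (-1)^k A(\sigma) \otimes \alpha \wedge d\beta \]
\[ = (\nabla^{\mathrm{End}(E)} A) \wedge \alpha \wedge \sigma \wedge \beta + (-1)^k A \wedge \alpha \wedge (\nabla^E \sigma) \wedge \beta + A(\sigma) \otimes d\alpha \wedge \beta + (-1)^k A(\sigma) \otimes \alpha \wedge d\beta \]
\[ = (\nabla^{\mathrm{End}(E)} A \wedge \alpha + A \otimes d\alpha) \wedge (\sigma \otimes \beta) + (-1)^k (A \otimes \alpha) \wedge (\nabla^E \sigma \wedge \beta + \sigma \otimes d\beta) \]
\[ = d^{\mathrm{End}(E)}(A \otimes \alpha) \wedge (\sigma \otimes \beta) + (-1)^k (A \otimes \alpha) \wedge d^E(\sigma \otimes \beta). \]
By linearity we conclude that for $\Phi \in \Omega^k(M, \mathrm{End}(E))$ and $\omega \in \Omega(M, E)$ we have
\[ d^E(\Phi \wedge \omega) = d^{\mathrm{End}(E)}\Phi \wedge \omega + (-1)^k \Phi \wedge d^E\omega. \]
So in light of our notational conventions we are done. □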
Remark 12.63. In the literature, it seems that the different natures of $(d^\nabla)^k$ and $\nabla^k$ are not always appreciated. For example, the higher derivatives given by $(d^\nabla)^k$ are not appropriate for defining $k$-th order Sobolev spaces since $(d^\nabla)^2$ is zero for any flat connection.
12.10. Curvature Again

The space $\Omega(M, \mathrm{End}(E))$ is an algebra over $C^\infty(M)$ where the multiplication is according to $(L_1 \otimes \omega_1) \wedge (L_2 \otimes \omega_2) = (L_1 \circ L_2) \otimes \omega_1 \wedge \omega_2$. Now $\Omega(M, E)$ is a module over this algebra because we can multiply (using the symbol $\wedge$) as follows:
\[ (L \otimes \alpha) \wedge (\sigma \otimes \beta) = L\sigma \otimes (\alpha \wedge \beta). \]
As usual, this definition on simple elements is sufficient. If $X, Y \in \mathfrak{X}(M)$ and $\Psi \in \Omega^2(M, \mathrm{End}(E))$, then using $\Omega^2(M; E) \cong L^2_{\mathrm{alt}}(\mathfrak{X}(M), \Gamma(E))$ we have
\[ (\Psi \wedge \sigma)(X, Y) = \Psi(X, Y)\sigma \]
for any $\sigma \in \Omega^0(M, E)$.
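Proposition 12.64. The map $d^\nabla \circ d^\nabla : \Omega^k(M, E) \to \Omega^{k+2}(M, E)$ is given by the action of $F$, the curvature 2-form of $\nabla^E$:
\[ d^\nabla \circ d^\nabla \mu = F \wedge \mu \quad \text{for } \mu \in \Omega^k(M, E). \]

Proof. Let us check the case where $k = 0$ first. From formula (12.10) above, applied to the 1-form $d^\nabla\sigma = \nabla\sigma$, we have for $\sigma \in \Omega^0(M, E)$,
\[ (d^\nabla \circ d^\nabla \sigma)(X, Y) = \nabla_X(d^\nabla\sigma(Y)) - \nabla_Y(d^\nabla\sigma(X)) - d^\nabla\sigma([X, Y]) \]
\[ = \nabla_X \nabla_Y \sigma - \nabla_Y \nabla_X \sigma - \nabla_{[X, Y]}\sigma = F(X, Y)\sigma = (F \wedge \sigma)(X, Y). \]
More generally, we just check $d^\nabla \circ d^\nabla$ on elements of the form $\mu = \sigma \otimes \theta$:
\[ d^\nabla \circ d^\nabla \mu = d^\nabla \circ d^\nabla(\sigma \otimes \theta) = d^\nabla(d^\nabla\sigma \wedge \theta + \sigma \otimes d\theta) \]
\[ = (d^\nabla d^\nabla \sigma) \wedge \theta - d^\nabla\sigma \wedge d\theta + d^\nabla\sigma \wedge d\theta + 0 \]
\[ = (F \wedge \sigma) \wedge \theta = F \wedge (\sigma \wedge \theta) = F \wedge (\sigma \otimes \theta) = F \wedge \mu. \]
□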
Let us take a look at how curvature appears in a local frame field. As before restrict to an open set $U$ on which $e_U = (e_1, \ldots, e_r)$ is a given local frame field and then write a typical element $s \in \Omega^k(M, E)$ as $s = e_U\eta_U$, where $\eta_U = (\eta^1, \ldots, \eta^r)^t$ is a column vector of smooth $k$-forms. With $\omega_U = (\omega^i_j)$ the connection forms we have
\[ d^\nabla d^\nabla s = d^\nabla d^\nabla(e_U \eta_U) = d^\nabla(e_U\, d\eta_U + d^\nabla e_U \wedge \eta_U) \]
\[ = d^\nabla\bigl(e_U(d\eta_U + \omega_U \wedge \eta_U)\bigr) \]
\[ = d^\nabla e_U \wedge (d\eta_U + \omega_U \wedge \eta_U) + e_U \wedge d(d\eta_U + \omega_U \wedge \eta_U) \]
\[ = e_U\omega_U \wedge (d\eta_U + \omega_U \wedge \eta_U) + e_U \wedge d\omega_U \wedge \eta_U - e_U \wedge \omega_U \wedge d\eta_U \]
\[ = e_U\, d\omega_U \wedge \eta_U + e_U\omega_U \wedge \omega_U \wedge \eta_U \]
\[ = e_U(d\omega_U + \omega_U \wedge \omega_U) \wedge \eta_U. \]
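The matrix $d\omega_U + \omega_U \wedge \omega_U$ represents a section of $\mathrm{End}(E)_U \otimes \wedge^2 T^*U$. In fact, we will now check that these local sections paste together to give a global section of $\mathrm{End}(E) \otimes \wedge^2 T^*M$, or in other words, an element of $\Omega^2(M, \mathrm{End}(E))$, which is clearly the curvature form: $F : (X, Y) \mapsto F(X, Y) \in \Gamma(\mathrm{End}(E))$. Let $F_U = d\omega_U + \omega_U \wedge \omega_U$ and let $F_V = d\omega_V + \omega_V \wedge \omega_V$ be the corresponding form for a different moving frame $e_V := e_U g$, where $g : U \cap V \to GL(\mathbb{F}^r)$, and $r$ is the rank of $E$. What we need to verify is the transformation law
\[ F_V = g^{-1} F_U g, \]
which we met earlier in equation (6.2). Recall that $\omega_V = g^{-1}\omega_U g + g^{-1}dg$. Using $d(g^{-1}) = -g^{-1}\,dg\,g^{-1}$, we have
\[ F_V = d\omega_V + \omega_V \wedge \omega_V = d(g^{-1}\omega_U g + g^{-1}dg) + (g^{-1}\omega_U g + g^{-1}dg) \wedge (g^{-1}\omega_U g + g^{-1}dg) \]
\[ = d(g^{-1}) \wedge \omega_U g + g^{-1}d\omega_U\, g - g^{-1}\omega_U \wedge dg + d(g^{-1}) \wedge dg + g^{-1}\omega_U \wedge \omega_U g + g^{-1}dg\,g^{-1} \wedge \omega_U g + g^{-1}\omega_U \wedge dg + g^{-1}dg \wedge g^{-1}dg \]
\[ = g^{-1}d\omega_U\, g + g^{-1}\omega_U \wedge \omega_U g = g^{-1}F_U g, \]
where the remaining terms cancel in pairs: by $d(g^{-1}) = -g^{-1}\,dg\,g^{-1}$ we have $d(g^{-1}) \wedge \omega_U g = -g^{-1}dg\,g^{-1} \wedge \omega_U g$ and $d(g^{-1}) \wedge dg = -g^{-1}dg \wedge g^{-1}dg$.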
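The rank one case makes the transformation law especially transparent. The sketch below assumes $E$ is a line bundle, so that $\omega_U$ is a single 1-form and $g$ is a nowhere vanishing function on $U \cap V$; this specialization is an assumption made for illustration.

```latex
\begin{align*}
\omega_V &= g^{-1}\omega_U\, g + g^{-1}dg = \omega_U + d(\log g)
   \quad\text{(locally, for a choice of logarithm)},\\
\omega_U \wedge \omega_U &= 0 \quad\text{(a 1-form wedged with itself)},\\
F_V &= d\omega_V = d\omega_U + d\,d(\log g) = d\omega_U = F_U .
\end{align*}
```

Here conjugation by $g$ is trivial, so the local curvature forms agree on overlaps and define a global 2-form, in agreement with $F_V = g^{-1}F_U g$.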
12.11. The Bianchi Identity

In this section we give several versions of the so-called Bianchi identity for a connection $\nabla$ on a vector bundle $E \to M$. Perhaps the simplest version to understand is the following: If $U, V, W \in \mathfrak{X}(M)$, then
\[ [\nabla_U, [\nabla_V, \nabla_W]] + [\nabla_V, [\nabla_W, \nabla_U]] + [\nabla_W, [\nabla_U, \nabla_V]] = 0 \quad \text{(Bianchi identity)}, \]
where $[\nabla_U, \nabla_V] := \nabla_U \circ \nabla_V - \nabla_V \circ \nabla_U$. This identity follows trivially once we observe that the set of linear operators on any vector space is a Lie algebra under the commutator bracket operation $[A, B] := A \circ B - B \circ A$. So, in this form, the Bianchi identity is just an instance of the Jacobi identity. Let $(U, x)$ be a chart on $M$, and let $F_{\mu\nu} : \Gamma(E)|_U \to \Gamma(E)|_U$ be the local curvature operator defined by
\[ F_{\mu\nu} = [\nabla_{\partial_\mu}, \nabla_{\partial_\nu}] = F(\partial_\mu, \partial_\nu), \]
where $\partial_\mu = \partial/\partial x^\mu$, etc. Then we have the following version of the Bianchi identity:
\[ [\nabla_{\partial_\mu}, F_{\nu\lambda}] + [\nabla_{\partial_\nu}, F_{\lambda\mu}] + [\nabla_{\partial_\lambda}, F_{\mu\nu}] = 0 \quad \text{(Bianchi identity)}. \]
Another revealing form of the Bianchi identity depends on our discussions in the last section and in particular on Proposition 12.62. We have, for any $E$-valued form $\eta$,
\[ (d^\nabla)^3\eta = d^\nabla\bigl((d^\nabla)^2\eta\bigr) = d^\nabla(F \wedge \eta) = d^\nabla F \wedge \eta + F \wedge d^\nabla\eta. \]
But equally,
\[ (d^\nabla)^3\eta = (d^\nabla)^2(d^\nabla\eta) = F \wedge d^\nabla\eta, \]
so it must be that $d^\nabla F \wedge \eta = 0$ for any $\eta$ and so we obtain
\[ d^\nabla F = 0 \quad \text{(Bianchi identity)}. \]
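In a local frame the identity $d^\nabla F = 0$ can be verified directly from $F_U = d\omega_U + \omega_U \wedge \omega_U$. The sketch below assumes the standard local expression of the exterior covariant derivative of an $\mathrm{End}(E)$-valued 2-form, namely $dF_U + \omega_U \wedge F_U - F_U \wedge \omega_U$; that expression is not derived in the text above and is used here as an assumption.

```latex
\begin{align*}
dF_U &= d(d\omega_U + \omega_U \wedge \omega_U)
      = d\omega_U \wedge \omega_U - \omega_U \wedge d\omega_U, \\
\omega_U \wedge F_U - F_U \wedge \omega_U
     &= \omega_U \wedge d\omega_U + \omega_U \wedge \omega_U \wedge \omega_U
        - d\omega_U \wedge \omega_U - \omega_U \wedge \omega_U \wedge \omega_U \\
     &= \omega_U \wedge d\omega_U - d\omega_U \wedge \omega_U, \\
dF_U + \omega_U \wedge F_U - F_U \wedge \omega_U &= 0 .
\end{align*}
```

The cubic terms cancel identically and the remaining terms cancel against $dF_U$, so no property of the connection beyond the definition of $F_U$ is used.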
Exercise 12.65. Use a calculation in local coordinates and with respect to a local frame $e_1, \ldots, e_k$ to show that the above versions of the Bianchi identity are equivalent.

12.12. G-Connections

If the vector bundle $\pi : E \to M$ has a $G$-bundle structure, then there should be certain connections that respect this structure. These are the $G$-connections. It is time to confess that we have come face to face with a weakness of our approach to connections. It is much easier and more natural to define the notion of $G$-connection if one first defines connections in terms of horizontal distributions on principal bundles. We treat this in the online supplement [Lee, Jeff]. However, we can still say what a $G$-connection should be from the current point of view without too much trouble.
Recall that if $\pi : E \to M$ is a rank $k$ vector bundle with typical fiber $V$ that has a $G$-bundle structure where $G$ acts on $V$ by the standard action as a subgroup of $GL(V)$, then we have the bundle of $G$-frames
\[ F_G(E) := \bigcup_{p \in M} F_G(E_p), \]
where $F_G(E_p) := \{u \in GL(V, E_p) : u = \phi^{-1}(p, \cdot) \text{ for some } (U, \phi) \in \mathcal{A}_G\}$ and $\mathcal{A}_G$ is the maximal $G$-atlas defining the structure. The elements of $F_G(E_p)$ are called $G$-frames. Having fixed a basis $(e_1, \ldots, e_k)$ for $V$, each element of $F_G(E_p)$ can be identified as a basis $(u_1, \ldots, u_k)$ for $E_p$ according to $u : x \mapsto \sum x^i u_i$. If $F(E_p)$ is the set of all frames at $p$, then $F_G(E_p) \subset F(E_p)$ and $F_G(E)$ is a subbundle of $F(E)$. Now let $c : [a, b] \to M$ be a smooth curve. If $(u_1, \ldots, u_k)$ is a frame, then $(P_c u_1, \ldots, P_c u_k)$ is also a frame. Thus we get a map $P_c : F(E_{c(a)}) \to F(E_{c(b)})$. The following is then a workable definition of $G$-connection:

Definition 12.66. Let $\pi : E \to M$ be a vector bundle with a $G$-bundle structure as above. A connection on the bundle is called a $G$-connection if parallel transport $P_c$ takes $G$-frames to $G$-frames for all piecewise smooth curves $c$.
A $G$-frame field is a frame field which is a $G$-frame at each point of its domain.

Exercise 12.67. Show that if $\nabla$ is the covariant derivative associated to a $G$-connection on $E$, then for each $G$-frame field the associated connection forms take values in the Lie algebra of $G$ thought of as a matrix subgroup of $GL(n, \mathbb{F})$.

Let $\pi : E \to M$ be a vector bundle with metric $h = \langle\cdot,\cdot\rangle$ and standard fiber $\mathbb{R}^n$. The metric $h$ gives a reduction of the structure group to $O(n)$. In this case, an $O(n)$-frame is an orthonormal frame. A connection on $E$ is an $O(n)$-connection if and only if the associated covariant derivative $\nabla$ satisfies
\[ v\langle s_1, s_2\rangle = \langle\nabla_v s_1, s_2\rangle + \langle s_1, \nabla_v s_2\rangle \tag{12.11} \]
for all $s_1, s_2 \in \Gamma(E)$ and all $v \in TM$.

Exercise 12.68. Prove this last assertion.

One may also consider metrics which are nondegenerate but not necessarily positive definite. In this case, the structure group reduces to one of the semiorthogonal groups $O(k, n-k)$.

Definition 12.69. A covariant derivative satisfying (12.11) above for some metric $h$ on a vector bundle $E \to M$ is called a metric covariant derivative (or metric connection).

A simple partition of unity argument shows that if $h$ is a given metric, then there exists a (nonunique) metric connection for $h$ (Problem 11). For a metric connection, parallel transport is an isometry (Problem 12). Furthermore, if $h$ is the metric and $\nabla$ is metric with respect to $h$, then it is easy to check that $\nabla h = 0$.
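Proposition 12.70. Suppose that $h = \langle\cdot,\cdot\rangle$ is a metric on a vector bundle $E \to M$. If $\nabla$ is a metric covariant derivative, then the corresponding curvature satisfies
\[ \langle F(X, Y)\sigma_1, \sigma_2\rangle + \langle\sigma_1, F(X, Y)\sigma_2\rangle = 0 \]
for all $X, Y \in \mathfrak{X}(M)$ and all $\sigma_1, \sigma_2 \in \Gamma(E)$.

Proof. Let $X$ and $Y$ be fixed vector fields. Without loss of generality we may assume that $[X, Y] = 0$ (recall Exercise 7.34). It is also enough to show that $\langle F(X, Y)\sigma, \sigma\rangle = 0$ for all $\sigma$. We have
\[ \langle F(X, Y)\sigma, \sigma\rangle = \langle\nabla_X\nabla_Y\sigma, \sigma\rangle - \langle\nabla_Y\nabla_X\sigma, \sigma\rangle \]
\[ = X\langle\nabla_Y\sigma, \sigma\rangle - \langle\nabla_Y\sigma, \nabla_X\sigma\rangle - Y\langle\nabla_X\sigma, \sigma\rangle + \langle\nabla_X\sigma, \nabla_Y\sigma\rangle \]
\[ = \tfrac{1}{2}\bigl(XY\langle\sigma, \sigma\rangle - YX\langle\sigma, \sigma\rangle\bigr) = 0 \]
since $[X, Y] = 0$. □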
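In matrix terms, Proposition 12.70 says that the curvature matrix is skew-symmetric with respect to an orthonormal frame. The sketch below assumes a positive definite metric, an orthonormal local frame $e_U = (e_1, \ldots, e_n)$, and the local formula $F_U = d\omega_U + \omega_U \wedge \omega_U$; the frame and the sign conventions are assumptions made for the illustration.

```latex
\begin{align*}
0 &= X\langle e_i, e_j\rangle
   = \langle\nabla_X e_i, e_j\rangle + \langle e_i, \nabla_X e_j\rangle
   = \omega^j_i(X) + \omega^i_j(X)
   \quad\Longrightarrow\quad \omega_U^{\,t} = -\omega_U, \\
(\omega_U \wedge \omega_U)^j_i
  &= \sum_k \omega^j_k \wedge \omega^k_i
   = \sum_k \omega^k_j \wedge \omega^i_k
   = -\sum_k \omega^i_k \wedge \omega^k_j
   = -(\omega_U \wedge \omega_U)^i_j, \\
F_U^{\,t} &= (d\omega_U)^t + (\omega_U \wedge \omega_U)^t
          = -d\omega_U - \omega_U \wedge \omega_U = -F_U .
\end{align*}
```

So the connection forms and the curvature forms both take values in the Lie algebra of $O(n)$, which is the infinitesimal counterpart of parallel transport being an isometry.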
We shall study metric connections on tangent bundles in the next chapter.
Problems

(1) Prove Theorem 12.23.

(2) Let $M$ have a linear connection $\nabla$ and let $T(X, Y) := \nabla_X Y - \nabla_Y X - [X, Y]$. Show that $T$ is $C^\infty(M)$-bilinear (tensorial). The resulting tensor is called the torsion tensor for the connection.

(3) Show that a holonomy group is indeed a group. Show that the holonomy at any point of a sphere is isomorphic to the special orthogonal group $SO(2)$.

Second order differential equations and sprays

(4) A second order differential equation on a smooth manifold $M$ is a vector field on $TM$, that is, a section $X$ of the bundle $TTM$ (second tangent bundle) such that every integral curve $\alpha$ of $X$ is the velocity curve of its projection on $M$. In other words, $\alpha = \dot\gamma$ where $\gamma := \pi_{TM} \circ \alpha$. A solution curve $\gamma : I \to M$ for a second order differential equation $X$ is, by definition, a curve with $\ddot\gamma(t) = X(\dot\gamma(t))$ for all $t \in I$.
In the case $M = \mathbb{R}^n$, show that this concept corresponds to the usual system of equations of the form
\[ y' = v, \qquad v' = f(y, v), \]
which is the reduction to a first order system of the second order equation $y'' = f(y, y')$. What is the vector field on $T\mathbb{R}^n = \mathbb{R}^n \times \mathbb{R}^n$ which corresponds to this system?

Notation: For a second order differential equation $X$, the maximal integral curve through $v \in TM$ will be denoted by $\alpha_v$ and its projection will be denoted by $\gamma_v := \pi_{TM} \circ \alpha_v$.
(5) A spray on $M$ is a second order differential equation, that is, a section $X$ of $TTM$ as in the previous problem, such that for $v \in TM$ and $s \in \mathbb{R}$, a number $t \in \mathbb{R}$ belongs to the domain of $\gamma_{sv}$ if and only if $st$ belongs to the domain of $\gamma_v$, and in this case
\[ \gamma_{sv}(t) = \gamma_v(st). \]
Show that there are infinitely many sprays on any smooth manifold $M$. [Hint: (i) Show that a vector field $X \in \mathfrak{X}(TM)$ is a second order differential equation if and only if $T\pi \circ X = \mathrm{id}_{TM}$, where $\pi = \pi_{TM}$. (ii) Show that a second order differential equation $X \in \mathfrak{X}(TM)$ is a spray if and only if for any $s \in \mathbb{R}$ and any $v \in TM$, $X_{sv} = T\mu_s(sX_v)$, where $\mu_s : v \mapsto sv$ is the multiplication map. (iii) Show the existence of a spray on an open ball in a Euclidean space. (iv) Show that if $X_1$ and $X_2$ both satisfy one of the two characterizations above, then so does any convex combination of $X_1$ and $X_2$.]

(6) Show that if one has a linear connection on $M$, then there is a spray whose solutions are the geodesics of the connection.

(7) Show that given a spray on a manifold there is a (not unique) linear connection $\nabla$ on the manifold such that $\gamma : I \to M$ is a solution curve of the spray if and only if $\nabla_{\partial_t}\dot\gamma = 0$ for all $t \in I$. Note: $\gamma$ is called a geodesic for the linear connection or the spray. Does the stipulation that the connection be torsion free force uniqueness?
(8) Let $X$ be a spray on $M$. Equivalently, we may start with a connection on $M$ (i.e. a connection on $TM$) which induces a spray. Show that the set $O_X := \{v \in TM : \gamma_v(1) \text{ is defined}\}$ is an open neighborhood of the zero section of $TM$.

(9) Finish the proof of Theorem 12.32.

(10) Prove formula (12.9).

(11) Let $h$ be a metric on a vector bundle $E \to M$. Show that there exists a metric connection for $h$. [Hint: Use orthonormal frames defined on each open set of a locally finite cover.]

(12) Show that parallel transport along curves with respect to a metric connection in a vector bundle with metric is an isometry of scalar product spaces.
Chapter 13
Riemannian and Semi-Riemannian Geometry
"The most beautiful thing we can experience is the mysterious. It is the source of all true art and science."
- Albert Einstein
In this chapter we take up the subject of semi-Riemannian geometry, which includes Riemannian geometry and Lorentz geometry as important special cases. The exposition is inspired by [ON1], which we follow quite closely in some places (also see [L1]). Recall that by definition, a semi-Riemannian manifold $(M, g)$ has a well-defined index denoted $\mathrm{ind}(M)$ or $\mathrm{ind}(g)$. In the case of an indefinite metric ($\mathrm{ind}(M) > 0$), we will need a classification:

Definition 13.1. Let $(V, \langle\cdot,\cdot\rangle)$ be a scalar product space. A nonzero vector $v \in V$ is called
(1) spacelike if $\langle v, v\rangle > 0$;
(2) lightlike or null if $\langle v, v\rangle = 0$;
(3) timelike if $\langle v, v\rangle < 0$;
(4) nonnull if $v$ is either timelike or spacelike.

Figure 13.1. Lightcone (regions labeled timelike, lightlike, and spacelike).

The terms spacelike, null (lightlike), and timelike indicate the causal character of a vector. The word causal comes from relativity theory and is most apropos in the context of Lorentz manifolds defined below. If $(M, g)$ is a semi-Riemannian manifold, then each tangent space is a scalar product space and the above definition applies. Recall that we define $\|v\| = |\langle v, v\rangle|^{1/2}$, which we call the length of $v$.
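The classification in Definition 13.1 is easy to mechanize. The following Python sketch assumes the standard scalar product of a given index on $\mathbb{R}^n$ in which the first `index` coordinates carry the minus sign; that sign convention, the tolerance `tol`, and the function names are hypothetical choices made for illustration and do not come from the text.

```python
import math


def scalar_product(u, v, index=1):
    """Scalar product on R^n of the given index.

    Assumption: the first `index` coordinates contribute with a minus sign,
    the remaining coordinates with a plus sign.
    """
    return sum((-1.0 if i < index else 1.0) * a * b
               for i, (a, b) in enumerate(zip(u, v)))


def causal_character(v, index=1, tol=1e-12):
    """Classify a nonzero vector as 'spacelike', 'null', or 'timelike'."""
    if all(abs(a) <= tol for a in v):
        # Definition 13.1 only covers nonzero vectors.
        raise ValueError("causal character is defined here only for nonzero vectors")
    q = scalar_product(v, v, index)
    if q > tol:
        return "spacelike"
    if q < -tol:
        return "timelike"
    return "null"


def length(v, index=1):
    """The length |<v, v>|^(1/2) of a vector."""
    return math.sqrt(abs(scalar_product(v, v, index)))
```

With index 1 on $\mathbb{R}^4$ this is the usual Minkowski setting: `causal_character((1.0, 1.0, 0.0, 0.0))` classifies a vector on the lightcone, and `length` of that vector is zero even though the vector is nonzero.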
Note: We have so far left the causal character of the zero vector undefined. It may seem reasonable that it should be considered null. A second possibility is that the zero vector should have all three causal characters. Actually, we shall see that if the index of the scalar product is one, then it is convenient to consider the zero vector as being spacelike.

Definition 13.2. The set of all null vectors in a scalar product space is called the nullcone or lightcone. If $(M, g)$ is a semi-Riemannian manifold, then the nullcone in $T_pM$ is called the nullcone at $p$.

Definition 13.3. Let $I \subset \mathbb{R}$ be some interval. A curve $c : I \to (M, g)$ is called spacelike, null, timelike, or nonnull, according as $\dot c(t) \in T_{c(t)}M$ is spacelike, null, timelike, or nonnull, respectively, for all $t \in I$.

While every smooth manifold supports Riemannian metrics by Proposition 6.45, the existence of an indefinite metric on a given smooth manifold has an obstruction:
Theorem 13.4. A compact smooth manifold admits a continuous ($C^0$) indefinite metric of index $k$ if and only if its tangent bundle has a $C^0$ rank $k$ subbundle.

This result is Theorem 40.11 of [St].
Definition 13.5. Let $(M, g)$ be semi-Riemannian. If $c : [a, b] \to M$ is a piecewise smooth curve, then
\[ L_{c(a),c(b)}(c) = \int_a^b |\langle\dot c(t), \dot c(t)\rangle|^{1/2}\,dt \]
is called the arc length or simply length of the curve.
The word length could cause confusion since the length of a null curve is zero. Thus for indefinite metrics, arc length can have some properties that are decidedly not like our ordinary notion of length. In particular, a curve may connect two different points and the arc length might still be zero! The word length is therefore sometimes reserved for timelike or spacelike curves.
Definition 13.6. A positive reparametrization of a smooth curve $c : [a, b] \to M$ is a curve defined by composition $c \circ h : [a', b'] \to M$, where $h : [a', b'] \to [a, b]$ is a smooth monotonically increasing bijection. Similarly, a negative reparametrization is given by composition with a smooth monotonically decreasing bijection $h : [a', b'] \to [a, b]$. By a reparametrization we shall mean either a positive or negative reparametrization.

The above definition can be extended to piecewise smooth curves. Suppose $c : [a, b] \to M$ is a continuous curve such that, for some partition $a = t_0 < t_1 < \cdots < t_k = b$, we have that $c$ is smooth on each $[t_{i-1}, t_i]$. A positive reparametrization of $c$ is a curve $c \circ h : [a', b'] \to M$, where $h : [a', b'] \to [a, b]$ is a monotonically increasing continuous bijection that is smooth on each interval $h^{-1}([t_{i-1}, t_i])$. Negative reparametrization is defined similarly.
Remark 13.7 (Important fact). The integrals above are well-defined since $\dot c(t)$ is defined and continuous except for a finite number of points in $[a, b]$. Also, it is important to notice that by standard change of variable arguments, a reparametrization $\gamma = c \circ h$ does not change the arc length of the curve:
\[ \int_a^b |\langle\dot c(t), \dot c(t)\rangle|^{1/2}\,dt = \int_{h^{-1}(a)}^{h^{-1}(b)} |\langle\dot\gamma(u), \dot\gamma(u)\rangle|^{1/2}\,du. \]
Thus the arc length of a piecewise smooth curve is a geometric property of the curve; i.e. a semi-Riemannian invariant.
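The invariance in Remark 13.7 can be checked numerically on a simple example. The sketch below assumes the plane $\mathbb{R}^2$ with the index 1 scalar product $\langle u, v\rangle = -u_0v_0 + u_1v_1$, the curve $c(t) = (t, \sin t)$ on $[0, 1]$, and the positive reparametrization $h(u) = u^2$; the curve, the sign convention, the midpoint rule, and all function names are assumptions chosen for illustration.

```python
import math


def minkowski(u, v):
    """Index 1 scalar product on R^2 (assumed convention: minus sign first)."""
    return -u[0] * v[0] + u[1] * v[1]


def arc_length(velocity, a, b, n=20000):
    """Midpoint-rule approximation of the integral of |<c'(t), c'(t)>|^(1/2) over [a, b]."""
    step = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * step
        w = velocity(t)
        total += math.sqrt(abs(minkowski(w, w))) * step
    return total


def c_velocity(t):
    """Velocity of c(t) = (t, sin t)."""
    return (1.0, math.cos(t))


def gamma_velocity(u):
    """Velocity of gamma(u) = c(u**2), computed with the chain rule."""
    return (2.0 * u, 2.0 * u * math.cos(u * u))


length_c = arc_length(c_velocity, 0.0, 1.0)
length_gamma = arc_length(gamma_velocity, 0.0, 1.0)
# A null curve, c(t) = (t, t), has velocity (1, 1) and arc length zero.
length_null = arc_length(lambda t: (1.0, 1.0), 0.0, 1.0)
```

Here $\langle\dot c, \dot c\rangle = -\sin^2 t$, so both `length_c` and `length_gamma` approximate the exact value $1 - \cos 1 \approx 0.4597$, while `length_null` is zero although the curve joins two distinct points.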
Definition 13.8. Let $(M, g)$ be semi-Riemannian. Let $I$ be an interval (possibly infinite). If $c : I \to M$ is a smooth curve with $\|\dot c\| = 1$, then we say that $c$ is a unit speed curve. If $c : I \to M$ is a curve such that $\|\dot c\|$ is never zero, then choosing a reference $t_0 \in I$, we may define an arc length function $\ell : I \to I' \subset \mathbb{R}$ by
$$\ell(t) := \int_{t_0}^{t} |\langle \dot c(\tau), \dot c(\tau)\rangle|^{1/2}\, d\tau. \tag{13.1}$$
For a finite interval of definition $[a, b]$, the reference $t_0$ is most often taken to be the left endpoint $a$. Since $d\ell/dt = |\langle \dot c(t), \dot c(t)\rangle|^{1/2} > 0$, we may invert to find $\ell^{-1}$ and then reparametrize:
$$\gamma(s) := c(\ell^{-1}(s)).$$
It is easy to see from the chain rule that the resulting curve is a unit speed curve. Conversely, if $\gamma$ is a unit speed curve, then the arc length from $\gamma(s_1)$ to $\gamma(s_2)$ is $s_2 - s_1$. Often one abuses notation by writing $s = \ell(t)$ and then $ds/dt$ instead of $d\ell/dt$. The use of the letter $s$ for the parameter of a unit speed curve is traditional, and we say that the curve is parametrized by arc length. In the case of timelike curves, people sometimes use the letter $\tau$ instead of $s$ and refer to it as a proper time parameter.
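The two facts above, that a null curve joining distinct points has arc length zero and that arc length is unchanged by reparametrization, can be checked numerically. The following is a minimal sketch, assuming a constant diagonal metric on the plane (here the Minkowski metric with signs $(-1, +1)$); the function names, the sample curves, and the midpoint-rule integration are illustrative choices and not part of the text.

```python
import math

MINKOWSKI = (-1.0, 1.0)  # assumed example: diagonal metric on R^2 of index 1


def arc_length(curve, signs, a, b, n=2000):
    """Approximate int_a^b |<c'(t), c'(t)>|^(1/2) dt by the midpoint rule.

    curve: function t -> tuple of coordinates
    signs: diagonal entries of a constant diagonal metric
    """
    h = (b - a) / n
    eps = 1e-6
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        p, q = curve(t - eps), curve(t + eps)
        # central-difference approximation of the velocity c'(t)
        vel = [(qi - pi) / (2 * eps) for pi, qi in zip(p, q)]
        speed_sq = sum(s * v * v for s, v in zip(signs, vel))
        total += math.sqrt(abs(speed_sq)) * h
    return total


def null_line(t):
    """A null curve: its velocity (1, 1) satisfies <v, v> = -1 + 1 = 0."""
    return (t, t)


def spacelike(t):
    """A spacelike curve: <c', c'> = 1 - 0.25 cos^2(t) > 0."""
    return (0.5 * math.sin(t), t)


def spacelike_reparam(u):
    """Positive reparametrization c(h(u)) with h(u) = u^2 mapping [1, 2] onto [1, 4]."""
    return spacelike(u * u)


if __name__ == "__main__":
    print(arc_length(null_line, MINKOWSKI, 0.0, 1.0))
    print(arc_length(spacelike, MINKOWSKI, 1.0, 4.0))
    print(arc_length(spacelike_reparam, MINKOWSKI, 1.0, 2.0))
```

The last two printed values should agree up to discretization error, while the first is zero even though the null line connects the distinct points $(0,0)$ and $(1,1)$.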
13.1. Levi-Civita Connection
In this chapter we will use the term "connection" to be synonymous with covariant derivative. Let $(M, g)$ be a semi-Riemannian manifold and $\nabla$ a metric connection for $M$ (see Definition 12.11). By definition, we have
$$X\langle Y, Z\rangle = \langle \nabla_X Y, Z\rangle + \langle Y, \nabla_X Z\rangle$$
for all $X, Y, Z \in \mathfrak{X}(M)$. It is easy to show that the same formula holds for locally defined fields. Recall that the operator $T : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$ defined by $T(X, Y) = \nabla_X Y - \nabla_Y X - [X, Y]$ is a tensor called the torsion tensor of $\nabla$. From the previous chapter we know that $(X, Y) \mapsto T(X, Y)$ defines a $C^\infty(M)$-bilinear map $\mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$. The isomorphism (7.6) implies that $T$ gives a section of $T^0_2(TM; TM)$. That is, $T$ can be thought of as defining a $T_pM$-valued 2-tensor at each $p$, so if $X_p, Y_p \in T_pM$, then $T(X_p, Y_p)$ is a well-defined element of $T_pM$. Recall from our study of tensor fields that $T(X_p, Y_p)$ is defined to be $T(X, Y)(p)$ for any fields $X, Y$ such that $X(p) = X_p$ and $Y(p) = Y_p$. Requiring that a connection be both metric and torsion free pins down the connection completely.
Theorem 13.9. For a given semi-Riemannian manifold $(M, g)$, there is a unique metric connection $\nabla$ such that its torsion is zero, $T \equiv 0$. This unique connection is called the Levi-Civita connection for $(M, g)$.
Proof. We will derive a formula that must be satisfied by $\nabla$ and that can be used to actually define $\nabla$. Let $X, Y, Z$ be arbitrary vector fields on $M$. If $\nabla$ exists as stated, then we must have
$$X\langle Y, Z\rangle = \langle \nabla_X Y, Z\rangle + \langle Y, \nabla_X Z\rangle,$$
$$Y\langle Z, X\rangle = \langle \nabla_Y Z, X\rangle + \langle Z, \nabla_Y X\rangle,$$
$$Z\langle X, Y\rangle = \langle \nabla_Z X, Y\rangle + \langle X, \nabla_Z Y\rangle.$$
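Now add the first two equations and subtract the third to get
$$X\langle Y, Z\rangle + Y\langle Z, X\rangle - Z\langle X, Y\rangle = \langle \nabla_X Y, Z\rangle + \langle Y, \nabla_X Z\rangle + \langle \nabla_Y Z, X\rangle + \langle Z, \nabla_Y X\rangle - \langle \nabla_Z X, Y\rangle - \langle X, \nabla_Z Y\rangle.$$
If we assume the torsion zero hypothesis, then this reduces to
$$X\langle Y, Z\rangle + Y\langle Z, X\rangle - Z\langle X, Y\rangle = \langle Y, [X, Z]\rangle + \langle X, [Y, Z]\rangle - \langle Z, [X, Y]\rangle + 2\langle \nabla_X Y, Z\rangle.$$
Solving, we see that $\nabla_X Y$ must satisfy
$$2\langle \nabla_X Y, Z\rangle = X\langle Y, Z\rangle + Y\langle Z, X\rangle - Z\langle X, Y\rangle + \langle Z, [X, Y]\rangle - \langle Y, [X, Z]\rangle - \langle X, [Y, Z]\rangle. \tag{13.2}$$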
Since knowing $\langle \nabla_X Y, Z\rangle$ for all $Z$ is tantamount to knowing $\nabla_X Y$, we conclude that if $\nabla$ exists, then it is unique. On the other hand, the patient reader can check that if we actually define $\langle \nabla_X Y, Z\rangle$ and hence $\nabla_X Y$ by this equation, then all of the defining properties of a covariant derivative are satisfied and furthermore $T$ will be zero. □
Formula (13.2) above, which serves to determine the Levi-Civita connection, is called the Koszul formula. It is easy to see that the restriction of a Levi-Civita connection to any open submanifold is just the Levi-Civita connection on that open submanifold with the induced metric. It is a straightforward matter to show that the Christoffel symbols for the Levi-Civita connection in some chart are given by
$$\Gamma^k_{ij} = \frac{1}{2} g^{kl}\left(\frac{\partial g_{jl}}{\partial x^i} + \frac{\partial g_{li}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^l}\right),$$
where $g_{jk}g^{ki} = \delta^i_j$. (Recall that $g_{ij} = \langle \partial_i, \partial_j\rangle$ and $\Gamma^k_{ij}$ are given by formula (12.6).)
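As an illustration of the chart formula for the Christoffel symbols, the following sketch evaluates it numerically for the round unit 2-sphere in coordinates $(\theta, \phi)$, where $g = \mathrm{diag}(1, \sin^2\theta)$. The choice of example, the finite-difference derivatives, and all function names are assumptions made for this illustration; the helper `inverse_2x2` restricts the sketch to two dimensions.

```python
import math


def sphere_metric(x):
    """Assumed example: round unit 2-sphere, x = (theta, phi), g = diag(1, sin^2 theta)."""
    theta, _ = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix (this sketch only handles dimension 2)."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def metric_derivative(metric, x, i, eps=1e-5):
    """Central-difference approximation of d g_{jl} / d x^i as a matrix in (j, l)."""
    xp, xm = list(x), list(x)
    xp[i] += eps
    xm[i] -= eps
    gp, gm = metric(xp), metric(xm)
    n = len(x)
    return [[(gp[j][l] - gm[j][l]) / (2 * eps) for l in range(n)] for j in range(n)]


def christoffel(metric, x):
    """Gamma[k][i][j] = (1/2) g^{kl} (d_i g_{jl} + d_j g_{li} - d_l g_{ij})."""
    n = len(x)
    ginv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, i) for i in range(n)]  # dg[i][j][l] = d_i g_{jl}
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                gamma[k][i][j] = 0.5 * sum(
                    ginv[k][l] * (dg[i][j][l] + dg[j][l][i] - dg[l][i][j])
                    for l in range(n)
                )
    return gamma


if __name__ == "__main__":
    theta = 1.0
    G = christoffel(sphere_metric, (theta, 0.3))
    print(G[0][1][1], -math.sin(theta) * math.cos(theta))  # Gamma^theta_{phi phi}
    print(G[1][0][1], math.cos(theta) / math.sin(theta))   # Gamma^phi_{theta phi}
```

Each printed pair compares the numerical value with the familiar closed forms $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \cot\theta$.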
We know from the study in the last chapter that we may take the covariant derivative of vector fields along maps. The most important cases for this chapter are fields along curves and fields along maps of the form $h : (a, b) \times (c, d) \to M$.
Exercise 13.10. Show that if $\alpha : I \to M$ is a smooth curve and $X, Y$ are vector fields along $\alpha$, then $\frac{d}{dt}\langle X, Y\rangle = \langle \nabla_{\partial/\partial t} X, Y\rangle + \langle X, \nabla_{\partial/\partial t} Y\rangle$.
Exercise 13.11. If $h : (a, b) \times (c, d) \to M$ is smooth, then $\partial h/\partial t$ and $\partial h/\partial s$ are vector fields along $h$. Show that $\nabla_{\partial/\partial t}\, \partial h/\partial s = \nabla_{\partial/\partial s}\, \partial h/\partial t$. [Hint: Use local coordinates and the fact that $\nabla$ is torsion free.]
13.1.1. Covariant differentiation of tensor fields. Let $\nabla$ be any natural covariant derivative on $M$. It is a consequence of Proposition 7.37 that for each $X \in \mathfrak{X}(U)$ there is a unique tensor derivation $\nabla_X$ on $\mathcal{T}^r_s(U)$ such that $\nabla_X$ commutes with contraction and coincides with the given covariant derivative on $\mathfrak{X}(U)$ (also denoted $\nabla_X$) and with $L_X f$ on $C^\infty(U)$. Recall that if $T \in \mathcal{T}^1_s$, then
$$(\nabla_X T)(Y_1, \ldots, Y_s) = \nabla_X(T(Y_1, \ldots, Y_s)) - \sum_{i=1}^{s} T(\ldots, \nabla_X Y_i, \ldots).$$
If $Z \in \mathcal{T}^1_0$, we apply this to $\nabla Z \in \mathcal{T}^1_1$ and get
$$(\nabla_X \nabla Z)(Y) = \nabla_X(\nabla Z(Y)) - \nabla Z(\nabla_X Y) = \nabla_X(\nabla_Y Z) - \nabla_{\nabla_X Y} Z,$$
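from which we get the following definition:
Definition 13.12. The second covariant derivative of a vector field $Z \in \mathcal{T}^1_0$ is
$$\nabla^2 Z : (X, Y) \mapsto \nabla^2_{X,Y}(Z) := \nabla_X(\nabla_Y Z) - \nabla_{\nabla_X Y} Z.$$
Definition 13.13. A tensor field $T$ is said to be parallel if $\nabla_\xi T = 0$ for all $\xi \in TM$. Similarly, if $\sigma : I \to T^r_s(M)$ is a tensor field along a curve $c : I \to M$ that satisfies $\nabla_{\partial_t} \sigma = 0$ on $I$, then we say that $\sigma$ is parallel along $c$. Just as in the case of a general connection on a vector bundle we then have a parallel transport map $P(c)^t_{t_0} : T^r_s(M)_{c(t_0)} \to T^r_s(M)_{c(t)}$. From the previous chapter we know that
$$\nabla_{\partial_t} \sigma(t) = \lim_{\epsilon \to 0} \frac{P(c)^t_{t+\epsilon}\, \sigma(t + \epsilon) - \sigma(t)}{\epsilon}.$$
It is also true that if $T \in \mathcal{T}^r_s$, and if $c_X$ is the curve $t \mapsto \varphi^X_t(p)$, then
$$\nabla_X T(p) = \lim_{\epsilon \to 0} \frac{P(c_X)^0_{\epsilon}\,(T \circ \varphi^X_\epsilon(p)) - T(p)}{\epsilon}.$$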
It is easy to see that the space of parallel tensor fields of type $(r, s)$ is a vector space over $\mathbb{R}$.
Exercise 13.14. Show that if $T$ is parallel, then for any smooth curve $c : [a, b] \to M$ such that $c(a) = p$ and $c(b) = q$ we have $P(c)^b_a T_p = T_q$. Deduce that if $M$ is connected, then the space of parallel tensor fields of type $(r, s)$ has dimension less than or equal to $\dim T^r_s(M)_p$ for any fixed $p$.
The map $\nabla_X : \mathcal{T}^r_s M \to \mathcal{T}^r_s M$ just defined commutes with contraction by construction. Furthermore, if the connection we are extending is the Levi-Civita connection for a semi-Riemannian manifold $(M, g)$, then $\nabla_\xi g = 0$ for all $\xi \in TM$.
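To see this, recall that
$$\nabla_\xi(g \otimes X \otimes Y) = \nabla_\xi g \otimes X \otimes Y + g \otimes \nabla_\xi X \otimes Y + g \otimes X \otimes \nabla_\xi Y,$$
which upon contraction yields
$$\nabla_\xi(g(X, Y)) = (\nabla_\xi g)(X, Y) + g(\nabla_\xi X, Y) + g(X, \nabla_\xi Y),$$
$$\xi\langle X, Y\rangle = (\nabla_\xi g)(X, Y) + \langle \nabla_\xi X, Y\rangle + \langle X, \nabla_\xi Y\rangle.$$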
We see that $\nabla_\xi g \equiv 0$ for all $\xi$ if and only if $\xi\langle X, Y\rangle = \langle \nabla_\xi X, Y\rangle + \langle X, \nabla_\xi Y\rangle$ for all $\xi, X, Y$. In other words, the statement that the metric tensor is parallel (constant) with respect to $\nabla$ is the same as saying that the connection is a metric connection.
Exercise 13.15. Let $\nabla$ be the Levi-Civita connection for a semi-Riemannian manifold $(M, g)$. Prove the formula
$$(\mathcal{L}_X g)(Y, Z) = g(\nabla_Y X, Z) + g(Y, \nabla_Z X) \tag{13.3}$$
for vector fields $X, Y, Z \in \mathfrak{X}(M)$.
13.2. Riemann Curvature Tensor
For $(M, g)$ a Riemannian manifold with associated Levi-Civita connection $\nabla$, the associated curvature tensor field is called the Riemann curvature tensor: For $X, Y \in \mathfrak{X}(M)$ we have the map $R(X, Y) : \mathfrak{X}(M) \to \mathfrak{X}(M)$ defined by
$$R(X, Y)Z := R_{X,Y}Z := \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z.$$
Exercise 13.16. Show that $\nabla^2_{X,Y}(Z) - \nabla^2_{Y,X}(Z) = R(X, Y)Z$ (recall Definition 13.12).
By direct calculation, or by appealing to Theorem 12.44 and Exercise 12.45 from the previous chapter, we find that $(X, Y, Z) \mapsto R(X, Y)Z$ is $C^\infty(M)$-multilinear (tensorial). Appealing to the isomorphism (7.7), we conclude that $R$ gives a section of $T^0_3(TM; TM)$. That is, $R$ can be thought of as defining a $T_pM$-valued tensor at each $p$. In other words, if $X_p, Y_p, Z_p \in T_pM$, then $R(X_p, Y_p)Z_p$ is a well-defined element of $T_pM$ and $(X_p, Y_p, Z_p) \mapsto R(X_p, Y_p)Z_p$ gives a multilinear map. Here, $R(X_p, Y_p)Z_p$ is defined to be $(R(X, Y)Z)(p)$ for any fields $X, Y, Z$ such that $X(p) = X_p$, $Y(p) = Y_p$, and $Z(p) = Z_p$. Many interpretations of $R$ arise. From the previous chapter we know that $R$ is also to be thought of as a $TM$-valued 2-form. From this point on we will freely interpret elements of $\mathcal{T}^r_s(M)$ as elements of $T^r_s(\mathfrak{X}(M))$ when convenient. Notice that we will use both the notation $R(X, Y)$ as well as $R_{X,Y}$.
Definition 13.17. A semi-Riemannian manifold $(M, g)$ is called flat if the curvature tensor is identically zero.
Recall that if $f : (M, g) \to (N, h)$ is a local diffeomorphism between semi-Riemannian manifolds such that $f^*h = g$, then $f$ is called a local isometry and we say that the manifolds are locally isometric.
Theorem 13.18. Let $(M, g)$ be a semi-Riemannian manifold of dimension $n$ and index $\nu$. If $(M, g)$ is flat, that is, if the curvature tensor is identically zero, then $(M, g)$ is locally isometric to the semi-Euclidean space $\mathbb{R}^n_\nu$.
Proof. Let $p \in M$ be given. If the curvature tensor vanishes, then by Theorem 12.51 we can find local parallel vector fields defined in a neighborhood of $p$ with prescribed values at $p$. So we may find parallel fields $X_1, \ldots, X_n$ such that $X_1(p), \ldots, X_n(p)$ is an orthonormal basis. But since parallel translation preserves the various scalar products $\langle X_i, X_j\rangle$, we see that we actually have an orthonormal frame field in a neighborhood of $p$. Next we use the fact that the Levi-Civita connection is symmetric (torsion zero). We have $\nabla_{X_i} X_j - \nabla_{X_j} X_i - [X_i, X_j] = 0$ for all $i, j$. But since the $X_i$ are parallel, this means that $[X_i, X_j] \equiv 0$. Therefore there exist coordinates $x^1, \ldots, x^n$ on a possibly smaller open set such that $\partial/\partial x^i = X_i$ for all $i$. The result is that these coordinates give a chart which is an isometry of a neighborhood of $p$ with an open subset of the semi-Riemannian space $\mathbb{R}^n_\nu$. □
For another proof of the previous theorem see the online supplement [Lee, Jeff]. The next theorem exhibits the symmetries of the Riemann curvature tensor:
Theorem 13.19. The map $X, Y, Z, W \mapsto \langle R_{X,Y}Z, W\rangle$ is tensorial in all variables. Furthermore, the following identities hold for all $X, Y, Z, W \in \mathfrak{X}(M)$:
(i) $R_{X,Y} = -R_{Y,X}$.
(ii) $\langle R_{X,Y}Z, W\rangle = -\langle R_{X,Y}W, Z\rangle$.
(iii) $R_{X,Y}Z + R_{Y,Z}X + R_{Z,X}Y = 0$ (First Bianchi identity).
(iv) $\langle R_{X,Y}Z, W\rangle = \langle R_{Z,W}X, Y\rangle$.
Proof. Tensoriality is immediate from our previous observations. Also, (i) is immediate from the definition of $R$ and (ii) is just a special case of Proposition 12.70 from the previous chapter. For (iii) we calculate (assuming, as we may by tensoriality, that all brackets among $X, Y, Z$ vanish):
$$R_{X,Y}Z + R_{Y,Z}X + R_{Z,X}Y = \nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z + \nabla_Y\nabla_Z X - \nabla_Z\nabla_Y X + \nabla_Z\nabla_X Y - \nabla_X\nabla_Z Y = 0.$$
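The final equality in the calculation for (iii) uses the vanishing torsion once more. A short sketch of the regrouping, under the assumption that the fields have been chosen with $[X, Y] = [Y, Z] = [Z, X] = 0$ so that $\nabla_Y Z = \nabla_Z Y$ and so on:

```latex
\begin{aligned}
R_{X,Y}Z + R_{Y,Z}X + R_{Z,X}Y
  &= \nabla_X(\nabla_Y Z - \nabla_Z Y)
   + \nabla_Y(\nabla_Z X - \nabla_X Z)
   + \nabla_Z(\nabla_X Y - \nabla_Y X) \\
  &= \nabla_X [Y, Z] + \nabla_Y [Z, X] + \nabla_Z [X, Y] \\
  &= 0 .
\end{aligned}
```

Each parenthesis equals a bracket because the torsion $T(X, Y) = \nabla_X Y - \nabla_Y X - [X, Y]$ is zero.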
The proof of (iv) is rather unenlightening and is just some combinatorics. Since $R$ is a tensor, we may assume without loss of generality that $[X, Y] = 0$. For any $X, Y, Z$, let $(\mathcal{C}R)_{X,Y}Z$ be defined by
$$(\mathcal{C}R)_{X,Y}Z := R_{X,Y}Z + R_{Y,Z}X + R_{Z,X}Y.$$
By (iii) we have $\langle (\mathcal{C}R)_{Y,Z}X, W\rangle = 0$ for any $W$. Summing over all cyclic permutations of $Y, Z, X, W$, we obtain
$$0 = \langle (\mathcal{C}R)_{Y,Z}X, W\rangle + \langle (\mathcal{C}R)_{W,Y}Z, X\rangle + \langle (\mathcal{C}R)_{X,W}Y, Z\rangle + \langle (\mathcal{C}R)_{Z,X}W, Y\rangle.$$
Expand this expression using the definition of $\mathcal{C}R$, and we have twelve terms. Four pairs of terms cancel due to (i) and (ii), resulting in
$$2\langle R_{X,Y}Z, W\rangle + 2\langle R_{W,Z}X, Y\rangle = 0.$$
Using (i) we obtain the result. □
Theorem 13.20 (Second Bianchi identity). For $X, Y, Z \in \mathfrak{X}(M)$, we have
$$(\nabla_Z R)(X, Y) + (\nabla_X R)(Y, Z) + (\nabla_Y R)(Z, X) = 0.$$
Proof. This is the Bianchi identity for the Levi-Civita connection and in this context is also called the second Bianchi identity. We give an independent proof here. Since this is a tensor equation, we only need to prove it under the assumption that all brackets among the $X, Y, Z$ are zero (recall Exercise 7.34). First we have
$$(\nabla_Z R)(X, Y)W = \nabla_Z(R_{X,Y}W) - R(\nabla_Z X, Y)W - R(X, \nabla_Z Y)W - R_{X,Y}\nabla_Z W = [\nabla_Z, R_{X,Y}]W - R(\nabla_Z X, Y)W - R(X, \nabla_Z Y)W.$$
Using this, we calculate as follows:
$$(\nabla_Z R)(X, Y)W + (\nabla_X R)(Y, Z)W + (\nabla_Y R)(Z, X)W$$
$$= [\nabla_Z, R_{X,Y}]W + [\nabla_X, R_{Y,Z}]W + [\nabla_Y, R_{Z,X}]W - R(\nabla_Z X, Y)W - R(X, \nabla_Z Y)W - R(\nabla_X Y, Z)W - R(Y, \nabla_X Z)W - R(\nabla_Y Z, X)W - R(Z, \nabla_Y X)W$$
$$= [\nabla_Z, R_{X,Y}]W + [\nabla_X, R_{Y,Z}]W + [\nabla_Y, R_{Z,X}]W + R([X, Z], Y)W + R([Z, Y], X)W + R([Y, X], Z)W$$
$$= [\nabla_Z, [\nabla_X, \nabla_Y]]W + [\nabla_X, [\nabla_Y, \nabla_Z]]W + [\nabla_Y, [\nabla_Z, \nabla_X]]W = 0.$$
The last identity is the Jacobi identity for commutators and is true for purely algebraic reasons (see the next exercise). □
Note: Given a semi-Riemannian manifold $(M, g)$, the tensor $R$ defined by $R(X, Y, Z, W) := \langle R_{X,Y}Z, W\rangle$ for all $X, Y, Z, W$, is also called the Riemann curvature tensor. Often this tensor is defined with a different ordering of the slots and one should always check which conventions are in use. One traditional ordering is $R(W, Z, X, Y) := \langle R_{X,Y}Z, W\rangle$.
Exercise 13.21. Show that if $L_i$, $i = 1, 2, 3$, are linear operators, and the commutator is defined as usual ($[A, B] = AB - BA$), then we always have the Jacobi identity $[L_1, [L_2, L_3]] + [L_2, [L_3, L_1]] + [L_3, [L_1, L_2]] = 0$.
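Before proving the Jacobi identity of Exercise 13.21 algebraically, it can be reassuring to see it hold on concrete operators. The sketch below checks it exactly for random integer matrices; the matrix helpers and the choice of $3 \times 3$ integer matrices are illustrative assumptions, and a numerical check is of course not a proof.

```python
import random


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def commutator(a, b):
    """[A, B] = AB - BA."""
    return sub(matmul(a, b), matmul(b, a))


def jacobi_sum(l1, l2, l3):
    """[L1,[L2,L3]] + [L2,[L3,L1]] + [L3,[L1,L2]]."""
    return add(
        add(commutator(l1, commutator(l2, l3)),
            commutator(l2, commutator(l3, l1))),
        commutator(l3, commutator(l1, l2)),
    )


def random_matrix(n, rng):
    """Random n x n matrix with small integer entries (exact arithmetic)."""
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    a, b, c = (random_matrix(3, rng) for _ in range(3))
    print(jacobi_sum(a, b, c))  # the zero matrix
```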
We now introduce several objects which hold all or part of the information in the curvature tensor in different forms. First we mention that the reader should keep an eye out for expressions of the form $\langle R(v, w)v, w\rangle$ or $\langle R(v, w)w, v\rangle$ for $v, w$ in the tangent space of a point on the semi-Riemannian manifold $(M, g)$ under study.
It will be convenient to introduce a little linear algebra at this point. Recall that if $(V, \langle\cdot,\cdot\rangle)$ is a finite-dimensional scalar product space, then there is an associated natural scalar product on $\bigwedge^2 V$ defined so that for an orthonormal basis $\{e_i\}$, the basis $\{e_i \wedge e_j\}_{i<j}$ for $\bigwedge^2 V$ is also orthonormal. On simple elements we have
$$g(v_1 \wedge v_2, v_3 \wedge v_4) = \det\begin{pmatrix} \langle v_1, v_3\rangle & \langle v_1, v_4\rangle \\ \langle v_2, v_3\rangle & \langle v_2, v_4\rangle \end{pmatrix}.$$
We will use the angle brackets $\langle\cdot,\cdot\rangle$ for this scalar product also. The quantity $\langle v \wedge w, v \wedge w\rangle$ is important and needs special attention in the case of an indefinite scalar product. If $\langle\cdot,\cdot\rangle$ on $V$ is indefinite, then the induced scalar product is also indefinite and $\langle v \wedge w, v \wedge w\rangle$ may be zero, even when $v$ and $w$ are linearly independent. For the next lemma, recall that a subspace $W$ of a scalar product space $(V, \langle\cdot,\cdot\rangle)$ is called nondegenerate if $\langle\cdot,\cdot\rangle$ restricted to $W$ is nondegenerate.
Lemma 13.22. Let $(V, \langle\cdot,\cdot\rangle)$ be a finite-dimensional scalar product space. Let $P$ be a plane spanned by $v, w$. Then,
(i) $P$ is nondegenerate if and only if $\langle v \wedge w, v \wedge w\rangle \neq 0$.
(ii) $\langle v \wedge w, v \wedge w\rangle > 0$ if and only if $\langle\cdot,\cdot\rangle$ restricted to $P$ is definite.
(iii) $\langle v \wedge w, v \wedge w\rangle < 0$ if and only if $\langle\cdot,\cdot\rangle$ restricted to $P$ is indefinite.
Proof. Exercise. □
If $v$ and $w$ span a nondegenerate plane, then $|\langle v \wedge w, v \wedge w\rangle|$ is the squared area of the parallelogram spanned by $v$ and $w$.
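The three cases of Lemma 13.22 are easy to see in a concrete example. The following sketch evaluates $\langle v \wedge w, v \wedge w\rangle = \langle v, v\rangle\langle w, w\rangle - \langle v, w\rangle^2$ in Minkowski 3-space with signs $(-1, +1, +1)$; the choice of space, the sample vectors, and the function names are assumptions made for illustration.

```python
SIGNS = (-1, 1, 1)  # assumed example: Minkowski 3-space R^3_1


def inner(v, w, signs=SIGNS):
    """Scalar product for a diagonal metric with the given signs."""
    return sum(s * a * b for s, a, b in zip(signs, v, w))


def wedge_norm(v, w, signs=SIGNS):
    """<v ^ w, v ^ w> = <v,v><w,w> - <v,w>^2 (the 2x2 determinant with v1=v3=v, v2=v4=w)."""
    return inner(v, v, signs) * inner(w, w, signs) - inner(v, w, signs) ** 2


def plane_type(v, w, signs=SIGNS):
    """Classify span{v, w} following Lemma 13.22 (v, w assumed linearly independent)."""
    q = wedge_norm(v, w, signs)
    if q == 0:
        return "degenerate"
    return "definite" if q > 0 else "indefinite"


if __name__ == "__main__":
    print(plane_type((0, 1, 0), (0, 0, 1)))  # spacelike plane
    print(plane_type((1, 0, 0), (0, 1, 0)))  # timelike plane
    print(plane_type((1, 1, 0), (0, 0, 1)))  # contains the null vector (1, 1, 0)
```

In the last case $v = (1, 1, 0)$ and $w = (0, 0, 1)$ are linearly independent, yet $\langle v \wedge w, v \wedge w\rangle = 0$, so the plane they span is degenerate.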
Lemma 13.23. Let $(V, \langle\cdot,\cdot\rangle)$ be a finite-dimensional scalar product space. If $v, w \in V$ are any two vectors, then there exist vectors $v', w' \in V$ arbitrarily close to $v$ and $w$ respectively such that $v', w'$ span a nondegenerate plane.
Proof. Assume that $\langle v \wedge w, v \wedge w\rangle = 0$, or there is nothing to prove. Any pair of vectors is close to a pair of linearly independent vectors, so we may assume that $v$ and $w$ are linearly independent. There exists a vector $x$ such that $\langle v \wedge x, v \wedge x\rangle < 0$. Indeed, if $v$ is null, then we can pick $x$ so that $\langle v, x\rangle \neq 0$, which means that $\langle v \wedge x, v \wedge x\rangle = -\langle v, x\rangle^2 < 0$. If $v$ is not null, then pick $x$ of opposite causal character. That is, pick $x$ to be spacelike if $v$ is timelike and vice versa. Now if $w_\epsilon := w + \epsilon x$ for small $\epsilon > 0$, then $\langle v \wedge w_\epsilon, v \wedge w_\epsilon\rangle = 2\epsilon b + \epsilon^2\langle v \wedge x, v \wedge x\rangle$ for some number $b$ independent of $\epsilon$. If $b = 0$, then $\langle v \wedge w_\epsilon, v \wedge w_\epsilon\rangle < 0$, and we are done by Lemma 13.22. If $b \neq 0$, then $\langle v \wedge w_\epsilon, v \wedge w_\epsilon\rangle$ is nonzero in case $\epsilon$ is sufficiently small, and then $v, w_\epsilon$ span a nondegenerate plane by Lemma 13.22 again. □
Note that in the previous lemma, closeness is measured in the standard topology of a finite-dimensional vector space. One does not try to use an indefinite scalar product to define the topology!
The symmetry properties for the Riemann curvature tensor allow that we have a well-defined map
$$\mathfrak{R} : \textstyle\bigwedge^2(TM) \to \bigwedge^2(TM), \tag{13.4}$$
which is symmetric with respect to the natural extension of $g$ to $\bigwedge^2(TM)$. The map $\mathfrak{R}$ is defined implicitly as follows:
$$g(\mathfrak{R}(v_1 \wedge v_2), v_3 \wedge v_4) := \langle R(v_1, v_2)v_4, v_3\rangle.$$
Notice the switch in the indices 3 and 4. This hides a sign and must be remembered to avoid confusion later. Another commonly used quantity is the sectional curvature $K$. If $v$ and $w$ span a nondegenerate plane in $T_pM$, then define
$$K(v \wedge w) := \frac{\langle R(v, w)w, v\rangle}{\langle v, v\rangle\langle w, w\rangle - \langle v, w\rangle^2} = \frac{\langle \mathfrak{R}(v \wedge w), v \wedge w\rangle}{\langle v \wedge w, v \wedge w\rangle}.$$
The value $K(v \wedge w)$ only depends on the oriented plane spanned by the vectors $v$ and $w$; therefore if $P = \mathrm{span}\{v, w\}$ is such a nondegenerate plane, we also write $K(P)$ instead of $K(v \wedge w)$. The set of all planes in $T_pM$ is denoted $\mathrm{Gr}_p(2)$. We remark that if $M$ is 2-dimensional, then $K$ is a scalar function on $M$. It turns out that this function is exactly the Gauss curvature introduced in Chapter 4. There we showed that the Gauss curvature is intrinsic and we found an expression for it in terms of the metric, which is still valid in this situation.
In the following definition, $V$ is an $R$-module. The two cases we have in mind are (1) $V$ is $\mathfrak{X}(M)$, $R = C^\infty(M)$, and (2) $V$ is $T_pM$, $R = \mathbb{R}$.
Definition 13.24. A multilinear function $F : V \times V \times V \times V \to R$ is said to be curvature-like if it satisfies the symmetries proved for the curvature $R$ above; namely, if for all $x, y, z, w \in V$ we have
(i) $F(x, y, z, w) = -F(y, x, z, w)$;
(ii) $F(x, y, z, w) = -F(x, y, w, z)$;
(iii) $F(x, y, z, w) + F(y, z, x, w) + F(z, x, y, w) = 0$;
(iv) $F(x, y, z, w) = F(z, w, x, y)$.
Exercise 13.25. Define the tensor $C_g$ by
$$C_g(X, Y, Z, W) := g(Y, Z)g(X, W) - g(X, Z)g(Y, W).$$
Show that $C_g$ is curvature-like.
Proposition 13.26. If $F$ is curvature-like and $F(v, w, v, w) = 0$ for all $v, w \in V$, then $F \equiv 0$.
Proof. From (iv) it follows that for each $v$, the bilinear map $(w, z) \mapsto F(v, w, v, z)$ is symmetric, and so if $F(v, w, v, w) = 0$ for all $v, w \in V$, then $F(v, w, v, z) = 0$ for all $v, w, z \in V$. Now it is a simple matter to show that (i) and (ii) imply that $F \equiv 0$. □
Proposition 13.27. If $\langle R_{v,w}v, w\rangle$ is known for all $v, w \in T_pM$, then $R$ itself is determined at $p$. If $K(P)$ is known for all nondegenerate planes in $T_pM$, then $R$ itself is determined at $p$.
Proof. Let $R_2(v, w) := \langle R_{v,w}v, w\rangle$ for $v, w \in T_pM$. Using an orthonormal basis for $T_pM$, we see that $K$ and $R_2$ contain the same information, so we will just show that $R_2$ determines $R$:
$$\left.\frac{\partial^2}{\partial s\,\partial t}\right|_{0,0}\big(R_2(v + tz, w + su) - R_2(v + tu, w + sz)\big)$$
$$= \left.\frac{\partial^2}{\partial s\,\partial t}\right|_{0,0}\big\{g(R(v + tz, w + su)(v + tz), w + su) - g(R(v + tu, w + sz)(v + tu), w + sz)\big\} = 6R(v, w, z, u).$$
The second part follows by continuity and the fact that $\langle R(v, w)w, v\rangle = \langle v \wedge w, v \wedge w\rangle K(v \wedge w)$ for $v, w$ spanning a nondegenerate plane $P$. □
For each $v \in T_pM$, the tidal operator $R_v : T_pM \to T_pM$ is defined by
$$R_v(w) := R_{v,w}v.$$
We are now in a position to prove the following important theorem.
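Theorem 13.28. The following assertions are all equivalent ($\kappa$ is a constant):
(i) $K(P) = \kappa$ for all $P \in \mathrm{Gr}_p(2)$, $P$ nondegenerate.
(ii) $\langle R_{v_1,v_2}v_3, v_4\rangle = \kappa\, C_g(v_1, v_2, v_3, v_4)$ for all $v_1, v_2, v_3, v_4 \in T_pM$.
(iii) $-R_v(w) = \kappa(w - \langle w, v\rangle v)$ for all $w, v \in T_pM$ with $\|v\| = 1$.
(iv) $\mathfrak{R}(\xi) = \kappa\xi$ for all $\xi \in \bigwedge^2 T_pM$.
Proof. Let $p \in M$. The proof that (ii)$\Rightarrow$(iii) and that (iii)$\Rightarrow$(i) is left as an easy exercise. We prove that (i)$\Rightarrow$(ii)$\Rightarrow$(iv)$\Rightarrow$(i).
(i)$\Rightarrow$(ii): Let $R$ be defined by $R(v_1, v_2, v_3, v_4) := \langle R_{v_1,v_2}v_3, v_4\rangle$ and let $T_g := R - \kappa C_g$. Then $T_g$ is curvature-like and $T_g(v, w, v, w) = 0$ for all $v, w \in T_pM$ by assumption. It follows from Proposition 13.26 that $T_g \equiv 0$.
(ii)$\Rightarrow$(iv): Let $\{e_1, \ldots, e_n\}$ be an orthonormal basis for $T_pM$. Then $\{e_i \wedge e_j\}_{i<j}$ is an orthonormal basis for $\bigwedge^2 T_pM$. Using (ii) and the definition of $\mathfrak{R}$ (with its switch of the last two slots), we see that
$$\langle \mathfrak{R}(e_i \wedge e_j), e_k \wedge e_l\rangle = \langle R(e_i, e_j)e_l, e_k\rangle = \kappa\, C_g(e_i, e_j, e_l, e_k) = \kappa\langle e_i \wedge e_j, e_k \wedge e_l\rangle \quad \text{for all } k, l.$$
This implies that $\mathfrak{R}(e_i \wedge e_j) = \kappa\, e_i \wedge e_j$.
(iv)$\Rightarrow$(i): This follows because if $v, w$ are orthonormal, then we have $\kappa = \langle \mathfrak{R}(v \wedge w), v \wedge w\rangle = K(v \wedge w)$. □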
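The relation between (i) and (ii) in Theorem 13.28 can be tested on a model. The sketch below takes $\langle R_{x,y}z, u\rangle := \kappa\, C_g(x, y, z, u)$ on Minkowski 3-space, verifies the four curvature-like symmetries of Definition 13.24 on a basis (which suffices by multilinearity), and confirms that the sectional curvature of a nondegenerate plane equals $\kappa$. The ambient space, the value of $\kappa$, and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

SIGNS = (-1, 1, 1)  # assumed example: Minkowski 3-space R^3_1


def inner(v, w):
    """Scalar product for the diagonal metric with entries SIGNS."""
    return sum(s * a * b for s, a, b in zip(SIGNS, v, w))


def C_g(x, y, z, w):
    """C_g(X, Y, Z, W) = g(Y, Z) g(X, W) - g(X, Z) g(Y, W), as in Exercise 13.25."""
    return inner(y, z) * inner(x, w) - inner(x, z) * inner(y, w)


def is_curvature_like(F, basis):
    """Check symmetries (i)-(iv) of Definition 13.24 on all 4-tuples of basis vectors."""
    for x, y, z, w in product(basis, repeat=4):
        if F(x, y, z, w) != -F(y, x, z, w):
            return False
        if F(x, y, z, w) != -F(x, y, w, z):
            return False
        if F(x, y, z, w) + F(y, z, x, w) + F(z, x, y, w) != 0:
            return False
        if F(x, y, z, w) != F(z, w, x, y):
            return False
    return True


def sectional_curvature(v, w, kappa):
    """K(v ^ w) = <R(v,w)w, v> / (<v,v><w,w> - <v,w>^2) with <R(x,y)z, u> = kappa * C_g."""
    num = kappa * C_g(v, w, w, v)
    den = inner(v, v) * inner(w, w) - inner(v, w) ** 2
    if den == 0:
        raise ValueError("v and w span a degenerate plane")
    return Fraction(num) / den


if __name__ == "__main__":
    basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
    print(is_curvature_like(C_g, basis))
    print(sectional_curvature((1, 2, 0), (0, 1, 3), Fraction(5, 2)))
```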
Definition 13.29. Let $(M, g)$ be a semi-Riemannian manifold. The Ricci curvature is the $(0, 2)$-tensor $\mathrm{Ric}$ defined by
$$\mathrm{Ric}(v, w) := \sum_{i=1}^{n} \varepsilon_i \langle R_{e_i,v}w, e_i\rangle,$$
where $(e_1, \ldots, e_n)$ is any orthonormal basis of $T_pM$ and $\varepsilon_i := \langle e_i, e_i\rangle$. We say that the Ricci curvature $\mathrm{Ric}$ is bounded from below by $k$ and write $\mathrm{Ric} \geq k$ if $\mathrm{Ric}(v, w) \geq k\langle v, w\rangle$ for all $v, w \in TM$. Similar and obvious definitions can be given for $\mathrm{Ric} \leq k$ and the strict bounds $\mathrm{Ric} > k$ and $\mathrm{Ric} < k$. Actually, it is often the case that the bound on Ricci curvature is given in the form $\mathrm{Ric} \geq \kappa(n - 1)$, where $n = \dim(M)$. In passing, let us mention that there is a very important and interesting class of manifolds called Einstein manifolds. A semi-Riemannian manifold $(M, g)$ is called an Einstein manifold with Einstein constant $k$ if and only if $\mathrm{Ric}(v, w) = k\langle v, w\rangle$ for all $v, w \in TM$. We write this as $\mathrm{Ric} = kg$ or even $\mathrm{Ric} = k$. For example, if $(M, g)$ has constant sectional curvature $\kappa$, then it is an Einstein manifold with Einstein constant $k = \kappa(n - 1)$. The effect of this
condition depends on the signature of the metric. Particularly interesting is the case where the index is 0 (Riemannian) and also the case where the index is 1 (Lorentz manifold). Perhaps the first question one should ask is whether there exist any Einstein manifolds that do not have constant sectional curvature. It turns out that there are many interesting Einstein manifolds that do not have constant sectional curvature. For manifolds of dimension > 2 the Einstein manifold condition is natural and fruitful. Unfortunately, we do not have space to explore this fascinating topic (but see [Be]).
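The claim that constant sectional curvature $\kappa$ gives Einstein constant $\kappa(n - 1)$ can be checked by a direct contraction. The following sketch assumes the curvature has the constant-curvature form $\langle R_{x,y}z, u\rangle = \kappa\, C_g(x, y, z, u)$ on a Lorentzian scalar product space of dimension 4, and it assumes the contraction convention $\mathrm{Ric}(v, w) = \sum_i \varepsilon_i\langle R_{e_i,v}w, e_i\rangle$; both the convention and the names below are assumptions for illustration, and other sign conventions are in use elsewhere.

```python
SIGNS = (-1, 1, 1, 1)  # assumed example: index 1 (Lorentzian), dimension n = 4


def inner(v, w):
    """Scalar product for the diagonal metric with entries SIGNS."""
    return sum(s * a * b for s, a, b in zip(SIGNS, v, w))


def C_g(x, y, z, w):
    """C_g(X, Y, Z, W) = g(Y, Z) g(X, W) - g(X, Z) g(Y, W)."""
    return inner(y, z) * inner(x, w) - inner(x, z) * inner(y, w)


def ricci(v, w, kappa):
    """Ric(v, w) = sum_i eps_i <R(e_i, v) w, e_i> with <R(x,y)z, u> = kappa * C_g(x,y,z,u).

    The standard basis is orthonormal for a diagonal metric with entries +-1,
    and eps_i = <e_i, e_i> = SIGNS[i].
    """
    n = len(SIGNS)
    basis = [tuple(1 if i == j else 0 for j in range(n)) for i in range(n)]
    return sum(SIGNS[i] * kappa * C_g(basis[i], v, w, basis[i]) for i in range(n))


if __name__ == "__main__":
    v, w, kappa = (1, 2, 0, 3), (2, -1, 4, 1), 3
    n = len(SIGNS)
    print(ricci(v, w, kappa), kappa * (n - 1) * inner(v, w))
```

The two printed numbers agree, which is the statement $\mathrm{Ric} = \kappa(n - 1)g$ evaluated on one pair of vectors.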
Exercise 13.30. Show that if $M$ is connected and $\dim(M) > 2$, and $\mathrm{Ric}(\cdot,\cdot) = f\langle\cdot,\cdot\rangle$, where $f \in C^\infty(M)$, then $f \equiv k$ for some $k \in \mathbb{R}$ (so $(M, g)$ is Einstein).
13.3. Semi-Riemannian Submanifolds
Let $M$ be a $d$-dimensional submanifold of a semi-Riemannian manifold $\bar M$ of dimension $n$, where $d < n$. The metric $\bar g(\cdot,\cdot) = \langle\cdot,\cdot\rangle$ on $\bar M$ restricts to a tensor on $M$, which we denote by $h$. Since $h$ is a restriction of $\bar g$, we shall also use the notation $\langle\cdot,\cdot\rangle$ for $h$. If the restriction $h$ is nondegenerate on each space $T_pM$ and has the same index for all $p$, then $h$ is a metric tensor on $M$ and we say that $M$ is a semi-Riemannian submanifold of $\bar M$. If $\bar M$ is Riemannian, then this nondegeneracy condition is automatic and the metric $h$ is automatically Riemannian. More generally, if $\phi : N \to (M, g)$ is an immersion, we can consider the pull-back tensor $\phi^*g$ defined by
$$\phi^*g(X, Y) = g(T\phi \cdot X, T\phi \cdot Y).$$
If $\phi^*g$ is nondegenerate on each tangent space, then it is a metric on $N$ called the pull-back metric and we call $\phi$ a semi-Riemannian immersion. If $N$ is already endowed with a metric $g_N$, and if $\phi^*g = g_N$, then we say that $\phi : (N, g_N) \to (M, g)$ is an isometric immersion. Of course, if $\phi^*g$ is a metric at all, as it always is if $(M, g)$ is Riemannian, then the map $\phi : (N, \phi^*g) \to (M, g)$ is automatically an isometric immersion. Every immersion restricts locally to an embedding, and for the questions we study here there is not much loss in focusing on the case of a submanifold $M \subset \bar M$.
There is an obvious bundle on $M$ which is the restriction of $T\bar M$ to $M$. This is the bundle $T\bar M|_M = \bigsqcup_{p \in M} T_p\bar M$. Recalling Lemma 7.47, we see that each tangent space $T_p\bar M$ decomposes as
$$T_p\bar M = T_pM \oplus (T_pM)^\perp,$$
where $(T_pM)^\perp = \{v \in T_p\bar M : \langle v, w\rangle = 0 \text{ for all } w \in T_pM\}$. Then $TM^\perp = \bigsqcup_{p \in M} (T_pM)^\perp$, with its natural structure as a smooth vector bundle, is called
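the normal bundle to $M$ in $\bar M$. The smooth sections of the normal bundle will be denoted by $\Gamma(TM^\perp)$ or $\mathfrak{X}(M)^\perp$. The orthogonal decomposition above is globalized as
$$T\bar M|_M = TM \oplus TM^\perp.$$
A vector field on $M$ is always the restriction of some (not unique) vector field on a neighborhood of $M$ in $\bar M$. The same is true of any, not necessarily tangent, vector field along $M$. The set of all vector fields along $M$ will be denoted by $\mathfrak{X}(\bar M)|_M$. Since any function on $M$ is also the restriction of some function on $\bar M$, we may consider $\mathfrak{X}(M)$ as a submodule of $\mathfrak{X}(\bar M)|_M$. If $X \in \mathfrak{X}(\bar M)$, then we denote its restriction to $M$ by $X|_M$ or sometimes just $X$. Notice that $\mathfrak{X}(M)^\perp$ is a submodule of $\mathfrak{X}(\bar M)|_M$. We have two projection maps, $\mathrm{nor} : T_p\bar M \to T_pM^\perp$ and $\tan : T_p\bar M \to T_pM$, which in turn give module projections $\mathrm{nor} : \mathfrak{X}(\bar M)|_M \to \mathfrak{X}(M)^\perp$ and $\tan : \mathfrak{X}(\bar M)|_M \to \mathfrak{X}(M)$. We also have the pair of naturally related restrictions $C^\infty(\bar M) \xrightarrow{\text{restriction}} C^\infty(M)$ and $\mathfrak{X}(\bar M) \xrightarrow{\text{restriction}} \mathfrak{X}(\bar M)|_M$. Note that $\mathfrak{X}(\bar M)$ is a $C^\infty(\bar M)$-module, while $\mathfrak{X}(\bar M)|_M$ is a $C^\infty(M)$-module. We have an exact sequence of modules:
$$0 \to \mathfrak{X}(M)^\perp \to \mathfrak{X}(\bar M)|_M \xrightarrow{\tan} \mathfrak{X}(M) \to 0.$$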
Now we shall obtain a sort of splitting of the Levi-Civita connection of $\bar M$ along the submanifold $M$. First we notice that the Levi-Civita connection $\bar\nabla$ on $\bar M$ restricts nicely to a connection on the bundle $T\bar M|_M \to M$. The reader should be sure to realize that the space of sections of this bundle is exactly $\mathfrak{X}(\bar M)|_M$. We wish to obtain a restricted covariant derivative $\bar\nabla|_M : \mathfrak{X}(M) \times \mathfrak{X}(\bar M)|_M \to \mathfrak{X}(\bar M)|_M$. If $X \in \mathfrak{X}(M)$ and $W \in \mathfrak{X}(\bar M)|_M$, then $\bar\nabla_X W$ does not seem to be defined since $X$ and $W$ are not elements of $\mathfrak{X}(\bar M)$. But we may extend $X$ and $W$ to elements of $\mathfrak{X}(\bar M)$, use $\bar\nabla$, and then restrict again to get an element of $\mathfrak{X}(\bar M)|_M$. Then recalling the local properties of a connection we see that the result does not depend on the extension.
Exercise 13.31. Use local coordinates to prove that $\bar\nabla_X W$ does not depend on the extensions used.
It is also important to observe that the restricted covariant derivative is exactly the covariant derivative obtained by the methods of Section 12.3 of the previous chapter. Namely, it is the covariant derivative along the inclusion map $M \hookrightarrow \bar M$. Thus, it is defined even without an appeal to the process of extending fields described above.
We shall write simply $\bar\nabla$ in place of $\bar\nabla|_M$ since the context will make it clear when the latter is meant. Thus for all $p \in M$, we have $\bar\nabla_X W(p) = \bar\nabla_{\bar X}\bar W(p)$, where $\bar X$ and $\bar W$ are any extensions of $X$ and $W$ respectively. Clearly we have $X\langle Y_1, Y_2\rangle = \langle \bar\nabla_X Y_1, Y_2\rangle + \langle Y_1, \bar\nabla_X Y_2\rangle$ and so $\bar\nabla$ is a metric connection on $T\bar M|_M$. For fixed $X, Y \in \mathfrak{X}(M)$, we have the decomposition of $\bar\nabla_X Y$ into tangent and normal parts. Similarly, for $V \in \mathfrak{X}(M)^\perp$, we can consider the decomposition of $\bar\nabla_X V$ into tangent and normal parts. Thus we have
$$\bar\nabla_X Y = (\bar\nabla_X Y)^{\tan} + (\bar\nabla_X Y)^\perp,$$
$$\bar\nabla_X V = (\bar\nabla_X V)^{\tan} + (\bar\nabla_X V)^\perp.$$
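Proposition 13.32. For a semi-Riemannian submanifold $M \subset \bar M$, we have
$$\nabla_X Y = (\bar\nabla_X Y)^{\tan} \quad \text{for all } X, Y \in \mathfrak{X}(M),$$
where $\nabla$ is the Levi-Civita covariant derivative on $M$ with its induced metric.
Proof. It is straightforward to show that if we extend fields $X, Y, Z$ to $\bar X, \bar Y, \bar Z$, then the Koszul formula (13.2) for $\bar\nabla_{\bar X}\bar Y$ implies that
$$2\langle (\bar\nabla_X Y)^{\tan}, Z\rangle = X\langle Y, Z\rangle + Y\langle Z, X\rangle - Z\langle X, Y\rangle + \langle Z, [X, Y]\rangle - \langle Y, [X, Z]\rangle - \langle X, [Y, Z]\rangle \tag{13.5}$$
for all $Z$. But the Koszul formula which determines $\nabla_X Y$ shows that $(\bar\nabla_X Y)^{\tan} = \nabla_X Y$. □
Definition 13.33. Define maps $II : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)^\perp$, $\widetilde{II} : \mathfrak{X}(M) \times \mathfrak{X}(M)^\perp \to \mathfrak{X}(M)$ and $\nabla^\perp : \mathfrak{X}(M) \times \mathfrak{X}(M)^\perp \to \mathfrak{X}(M)^\perp$ according to
$$II(X, Y) := (\bar\nabla_X Y)^\perp \quad \text{for all } X, Y \in \mathfrak{X}(M),$$
$$\widetilde{II}(X, V) := (\bar\nabla_X V)^{\tan} \quad \text{for all } X \in \mathfrak{X}(M),\ V \in \mathfrak{X}(M)^\perp,$$
$$\nabla^\perp_X V := (\bar\nabla_X V)^\perp \quad \text{for all } X \in \mathfrak{X}(M),\ V \in \mathfrak{X}(M)^\perp.$$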
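It is easy to show that $\nabla^\perp$ defines a metric covariant derivative on the normal bundle $TM^\perp$. The map $(X, Y) \mapsto II(X, Y)$ is clearly $C^\infty(M)$-linear in the first slot. If $X, Y \in \mathfrak{X}(M)$ and $V \in \mathfrak{X}(M)^\perp$, then $0 = \langle Y, V\rangle$ and we have
$$0 = X\langle Y, V\rangle = \langle \bar\nabla_X Y, V\rangle + \langle Y, \bar\nabla_X V\rangle = \langle (\bar\nabla_X Y)^\perp, V\rangle + \langle Y, (\bar\nabla_X V)^{\tan}\rangle = \langle II(X, Y), V\rangle + \langle Y, \widetilde{II}(X, V)\rangle.$$
It follows that
$$\langle II(X, Y), V\rangle = -\langle Y, \widetilde{II}(X, V)\rangle. \tag{13.6}$$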
From this we see that $II(X, Y)$ is not only $C^\infty(M)$-linear in $X$, but also in $Y$. This means that $II$ is tensorial, and so $II(X_p, Y_p)$ is a well-defined element of $T_pM^\perp$ for each $X_p, Y_p \in T_pM$. Thus $II$ is a $TM^\perp$-valued tensor field and for each $p$ we have an $\mathbb{R}$-bilinear map $II_p : T_pM \times T_pM \to T_pM^\perp$. (We often suppress the subscript $p$.)
Proposition 13.34. $II$ is symmetric.
Proof. For any $X, Y \in \mathfrak{X}(M)$ we have
$$II(X, Y) - II(Y, X) = (\bar\nabla_X Y - \bar\nabla_Y X)^\perp = ([X, Y])^\perp = 0.$$
□
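We can also easily deduce that $\widetilde{II}$ is a $C^\infty(M)$-bilinear form with values in $\mathfrak{X}(M)$ and is similarly tensorial. So $\widetilde{II}(X_p, V_p)$ is a well-defined element of $T_pM$ for each fixed $X_p \in T_pM$ and $V_p \in T_pM^\perp$. We thus obtain a bilinear form $\widetilde{II}_p : T_pM \times T_pM^\perp \to T_pM$. In summary, $II$ and $\widetilde{II}$ are tensorial, $\nabla^\perp$ is a metric covariant derivative and we have the following formulas:
$$\bar\nabla_X Y = \nabla_X Y + II(X, Y),$$
$$\bar\nabla_X V = \nabla^\perp_X V + \widetilde{II}(X, V)$$
for $X, Y \in \mathfrak{X}(M)$ and $V \in \mathfrak{X}(M)^\perp$. Recall that if $(V, \langle\cdot,\cdot\rangle_1)$ and $(W, \langle\cdot,\cdot\rangle_2)$ are scalar product spaces, then a linear map $A : V \to W$ has a metric transpose $A^t : W \to V$ uniquely defined by the requirement that
$$\langle Av, w\rangle_2 = \langle v, A^t w\rangle_1$$
for all $v \in V$ and $w \in W$.
Definition 13.35. For $v \in T_pM$, we define the linear map $B_v(\cdot) := II(v, \cdot)$.
Formula (13.6) shows that the map $\widetilde{II}(v, \cdot) : T_pM^\perp \to T_pM$ is equal to $-B_v^t : T_pM^\perp \to T_pM$. Writing any $Y \in \mathfrak{X}(\bar M)|_M$ as a column vector $(Y^{\tan}, Y^\perp)^t$, we can write the map $\bar\nabla_X : \mathfrak{X}(\bar M)|_M \to \mathfrak{X}(\bar M)|_M$ as a matrix of operators:
$$\begin{bmatrix} \nabla_X & -B_X^t \\ B_X & \nabla^\perp_X \end{bmatrix}.$$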
Next we define the shape operator, also called the Weingarten map. We have already met a special case of the shape operator in Chapter 4. The shape operator is sometimes defined with the opposite sign.
Definition 13.36. Let $p \in M$. For each unit vector $u$ normal to $M$ at $p$, we have a map called the shape operator $S_u$ associated to $u$ defined by
$$S_u(v) := -(\bar\nabla_v U)^{\tan},$$
where $U$ is any unit normal field defined near $p$ such that $U(p) = u$.
Exercise 13.37. Show that the definition is independent of the choice of normal field $U$ that extends $u$.
The family of shape operators $\{S_u : u \text{ a unit normal}\}$ contains essentially the same information as the second fundamental tensor $II$ or the associated map $B$. This is because for any $X, Y \in \mathfrak{X}(M)$ and $U \in \mathfrak{X}(M)^\perp$, we have
$$\langle S_U X, Y\rangle = \langle (-\bar\nabla_X U)^{\tan}, Y\rangle = \langle U, \bar\nabla_X Y\rangle = \langle U, (\bar\nabla_X Y)^\perp\rangle = \langle U, II(X, Y)\rangle.$$
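In the case of a hypersurface, we have (locally) only two choices of unit normal. Once we have chosen a unit normal $u$, the shape operator is denoted simply by $S$ rather than $S_u$.
Theorem 13.38. Let $M$ be a semi-Riemannian submanifold of $\bar M$. For any $V, W, X, Y \in \mathfrak{X}(M)$, we have
$$\langle R_{VW}X, Y\rangle = \langle \bar R_{VW}X, Y\rangle - \langle II(V, X), II(W, Y)\rangle + \langle II(V, Y), II(W, X)\rangle.$$
This equation is called the Gauss equation or Gauss curvature equation.
Proof. Since this is clearly a tensor equation, we may assume that $[V, W] = 0$ (see Exercise 7.34). We have $\langle \bar R_{VW}X, Y\rangle = \langle \bar\nabla_V\bar\nabla_W X, Y\rangle - \langle \bar\nabla_W\bar\nabla_V X, Y\rangle$. We calculate:
$$\langle \bar\nabla_V\bar\nabla_W X, Y\rangle = \langle \bar\nabla_V\nabla_W X, Y\rangle + \langle \bar\nabla_V(II(W, X)), Y\rangle$$
$$= \langle \nabla_V\nabla_W X, Y\rangle + \langle \bar\nabla_V(II(W, X)), Y\rangle$$
$$= \langle \nabla_V\nabla_W X, Y\rangle + V\langle II(W, X), Y\rangle - \langle II(W, X), \bar\nabla_V Y\rangle$$
$$= \langle \nabla_V\nabla_W X, Y\rangle - \langle II(W, X), \bar\nabla_V Y\rangle.$$
Since
$$\langle II(W, X), \bar\nabla_V Y\rangle = \langle II(W, X), (\bar\nabla_V Y)^\perp\rangle = \langle II(W, X), II(V, Y)\rangle,$$
we have $\langle \bar\nabla_V\bar\nabla_W X, Y\rangle = \langle \nabla_V\nabla_W X, Y\rangle - \langle II(W, X), II(V, Y)\rangle$. Interchanging the roles of $V$ and $W$ and subtracting we get the desired conclusion. □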
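For a surface in Euclidean $\mathbb{R}^3$ the ambient curvature vanishes, so the Gauss equation expresses $\langle R_{v,w}w, v\rangle$ purely through $II$. The sketch below works this out numerically for a round sphere of radius $r$ centered at the origin, where the expected sectional curvature is $1/r^2$. It assumes the outward unit normal extended as $U(x) = x/|x|$, approximates the flat derivative of $U$ by central differences, and uses $II(v, w) = \langle S_U v, w\rangle U$, which follows from $\langle S_U X, Y\rangle = \langle U, II(X, Y)\rangle$ in the Riemannian hypersurface case. All function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit_normal(x):
    """Assumed extension of the outward unit normal of a sphere about 0: U(x) = x / |x|."""
    r = math.sqrt(dot(x, x))
    return [xi / r for xi in x]


def shape_operator(p, v, eps=1e-6):
    """S(v) = -(D_v U)^tan, with the flat derivative D_v U taken by central differences."""
    up = unit_normal([pi + eps * vi for pi, vi in zip(p, v)])
    um = unit_normal([pi - eps * vi for pi, vi in zip(p, v)])
    dU = [(a - b) / (2 * eps) for a, b in zip(up, um)]
    u = unit_normal(p)
    normal_part = dot(dU, u)
    return [-(d - normal_part * ui) for d, ui in zip(dU, u)]


def second_fundamental_form(p, v, w):
    """II(v, w) = <S v, w> U at p (Riemannian hypersurface, <U, U> = 1)."""
    u = unit_normal(p)
    c = dot(shape_operator(p, v), w)
    return [c * ui for ui in u]


def gauss_curvature(p, v, w):
    """K from the Gauss equation with flat ambient space:
    <R(v,w)w, v> = <II(v,v), II(w,w)> - <II(v,w), II(v,w)>."""
    num = (dot(second_fundamental_form(p, v, v), second_fundamental_form(p, w, w))
           - dot(second_fundamental_form(p, v, w), second_fundamental_form(p, v, w)))
    den = dot(v, v) * dot(w, w) - dot(v, w) ** 2
    return num / den


if __name__ == "__main__":
    p = (1.0, 2.0, 2.0)                         # point on the sphere of radius 3
    v, w = (2.0, -1.0, 0.0), (2.0, 0.0, -1.0)   # tangent vectors at p
    print(gauss_curvature(p, v, w), 1.0 / 9.0)
```

The tangent vectors need not be orthonormal, since the denominator $\langle v, v\rangle\langle w, w\rangle - \langle v, w\rangle^2$ normalizes by the squared area of the parallelogram they span.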
The second fundamental form contains information about how the semi-Riemannian submanifold $M$ bends in $\bar M$.
Definition 13.39. Let $M$ be a semi-Riemannian submanifold of $\bar M$ and $N$ a semi-Riemannian submanifold of $\bar N$. A pair isometry $\Phi : (\bar M, M) \to (\bar N, N)$ consists of an isometry $\Phi : \bar M \to \bar N$ such that $\Phi(M) = N$ and such that $\Phi|_M : M \to N$ is an isometry.
Proposition 13.40. A pair isometry $\Phi : (\bar M, M) \to (\bar N, N)$ preserves the second fundamental tensor:
$$T_p\Phi \cdot II(v, w) = II(T_p\Phi \cdot v, T_p\Phi \cdot w)$$
for all $v, w \in T_pM$ and all $p \in M$.
Proof. Let $p \in M$ and extend $v, w \in T_pM$ to smooth vector fields $V$ and $W$. Since isometries respect Levi-Civita connections, we have $\Phi_*\bar\nabla_V W = \bar\nabla_{\Phi_*V}\Phi_*W$. Since $\Phi$ is a pair isometry, we have $T_p\Phi(T_pM) \subset T_{\Phi(p)}N$ and $T_p\Phi(T_pM^\perp) \subset (T_{\Phi(p)}N)^\perp$. This means that $\Phi_* : \mathfrak{X}(\bar M)|_M \to \mathfrak{X}(\bar N)|_N$ preserves normal and tangential components: $\Phi_*(\mathfrak{X}(M)) \subset \mathfrak{X}(N)$ and $\Phi_*(\mathfrak{X}(M)^\perp) \subset \mathfrak{X}(N)^\perp$. We have
$$T_p\Phi \cdot II(v, w) = \Phi_*II(V, W)(\Phi(p)) = \Phi_*(\bar\nabla_V W)^\perp(\Phi(p)) = (\Phi_*\bar\nabla_V W)^\perp(\Phi(p))$$
$$= (\bar\nabla_{\Phi_*V}\Phi_*W)^\perp(\Phi(p)) = II(\Phi_*V, \Phi_*W)(\Phi(p)) = II(T_p\Phi \cdot v, T_p\Phi \cdot w).$$
□
The following exercise gives a simple but conceptually important example.

Exercise 13.41. Let $M$ be the 2-dimensional strip $\{(x, y, 0) : -\pi/2 < x < \pi/2\}$ considered as a submanifold of $\mathbb{R}^3$. Let $N$ be the subset of $\mathbb{R}^3$ given by $\{(x, y, \sqrt{1 - x^2}) : -1 < x < 1\}$. Show that $M$ is isometric to $N$. Show that there is no pair isometry $(\mathbb{R}^3, M) \to (\mathbb{R}^3, N)$.

Definition 13.42. Let $M$ be a semi-Riemannian submanifold of a semi-Riemannian manifold $\overline{M}$. Then for any $V, W, Z \in \mathfrak{X}(M)$ define $(\nabla_V II)$ by
$$(\nabla_V II)(W,Z) := \nabla^{\perp}_V(II(W,Z)) - II(\nabla_V W, Z) - II(W, \nabla_V Z).$$

Theorem 13.43. With $M$, $\overline{M}$, and $V, W, Z \in \mathfrak{X}(M)$ as in the previous definition we have the following identity:
$$(\overline{R}_{VW}Z)^{\perp} = (\nabla_V II)(W,Z) - (\nabla_W II)(V,Z) \qquad \text{(Codazzi equation)}.$$
Proof. Since both sides are tensorial, we may assume that $[V,W] = 0$. Then $(\overline{R}_{VW}Z)^{\perp} = (\overline{\nabla}_V\overline{\nabla}_W Z)^{\perp} - (\overline{\nabla}_W\overline{\nabla}_V Z)^{\perp}$. We have
$$(\overline{\nabla}_V\overline{\nabla}_W Z)^{\perp} = (\overline{\nabla}_V(\nabla_W Z))^{\perp} + (\overline{\nabla}_V(II(W,Z)))^{\perp} = II(V, \nabla_W Z) + \nabla^{\perp}_V(II(W,Z)).$$
Now recall the definition of $(\nabla_V II)(W,Z)$ and find that
$$(\overline{\nabla}_V\overline{\nabla}_W Z)^{\perp} = II(V, \nabla_W Z) + (\nabla_V II)(W,Z) + II(\nabla_V W, Z) + II(W, \nabla_V Z).$$
Now compute $(\overline{\nabla}_V\overline{\nabla}_W Z)^{\perp} - (\overline{\nabla}_W\overline{\nabla}_V Z)^{\perp}$ and use the fact that $\nabla_V W - \nabla_W V = [V,W] = 0$. □

The Gauss equation and the Codazzi equation belong together. If we have an isometric embedding $f : N \to \overline{M}$, then the Gauss and Codazzi equations on $f(N) \subset \overline{M}$ pull back to equations on $N$ and the resulting equations are still called the Gauss and Codazzi equations. Obviously, these two equations simplify if the ambient manifold $\overline{M}$ is a Euclidean space. For a hypersurface existence theorem featuring these equations as integrability conditions see [Pe].

13.3.1. Semi-Riemannian hypersurfaces. A semi-Riemannian submanifold of codimension one is called a semi-Riemannian hypersurface. Let $M$ be a semi-Riemannian hypersurface in $\overline{M}$. By definition, each tangent space $T_pM$ is a nondegenerate subspace of $T_p\overline{M}$. The complementary spaces $(T_pM)^{\perp}$ are easily seen to be nondegenerate, and $\operatorname{ind}(T_pM^{\perp})$ is constant on $M$ since we assume that $\operatorname{ind}(T_pM)$ is constant on $M$. The number $\operatorname{ind}(T_pM^{\perp})$ is called the co-index of $M$.

Exercise 13.44. Show that the co-index of a semi-Riemannian hypersurface must be either 0 or 1.

Definition 13.45. The sign $\varepsilon$ of a hypersurface $M$ is defined to be $+1$ if the co-index of $M$ is 0 and is defined to be $-1$ if the co-index is 1. We denote it by $\operatorname{sgn} M$.

Notice that if $\varepsilon = 1$, then $\operatorname{ind}(M) = \operatorname{ind}(\overline{M})$, while if $\varepsilon = -1$, then $\operatorname{ind}(M) = \operatorname{ind}(\overline{M}) - 1$.
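Proposition 13.46. Let $f \in C^{\infty}(\overline{M})$ and $M := f^{-1}(c)$ for some $c \in \mathbb{R}$. Suppose that $M \neq \emptyset$ and that $\operatorname{grad} f \neq 0$ on $M$. Then $M$ is a semi-Riemannian submanifold if and only if either $\langle \operatorname{grad} f, \operatorname{grad} f\rangle > 0$ on $M$, or $\langle \operatorname{grad} f, \operatorname{grad} f\rangle < 0$ on $M$. The sign of $M$ is the sign of $\langle \operatorname{grad} f, \operatorname{grad} f\rangle$, and $\operatorname{grad} f / \|\operatorname{grad} f\|$ restricts to a unit normal field along $M$.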
Proof. The relation $\langle \operatorname{grad} f, \operatorname{grad} f\rangle \neq 0$ on $M$ ensures that $df \neq 0$ on $M$, and it follows that $M$ is a regular submanifold of codimension one. Now if $v \in TM$, then
$$\langle \operatorname{grad} f, v\rangle = df(v) = v(f) = v(f|_M) = 0,$$
so $\operatorname{grad} f$ is normal to $M$. Thus for any $p \in M$ the space $(T_pM)^{\perp}$ is nondegenerate, and so the orthogonal complement $T_pM$ is also nondegenerate. The rest is clear. □

We now consider certain exemplary hypersurfaces in $\mathbb{R}^{n+1}_{\nu}$. Let $q : \mathbb{R}^{n+1}_{\nu} \to \mathbb{R}$ be the quadratic form defined by
$$q(x) := \langle x, x\rangle = -\sum_{i=1}^{\nu}(x^i)^2 + \sum_{i=\nu+1}^{n+1}(x^i)^2 = \sum_{i=1}^{n+1}\varepsilon_i (x^i)^2,$$
where the reader will recall that $\varepsilon_i = -1$ or $\varepsilon_i = 1$ as $1 \le i \le \nu$ or $1 + \nu \le i \le n+1$. Hypersurfaces in $\mathbb{R}^{n+1}_{\nu}$ defined by $Q(n, r, \varepsilon) := \{x \in \mathbb{R}^{n+1}_{\nu} : q(x) = \varepsilon r^2\}$, where $\varepsilon = -1$ or $\varepsilon = 1$, are called hyperquadrics.

Exercise 13.47. Let $Q(n, r, \varepsilon)$ be a hyperquadric as defined above. Let $P := \sum x^i\partial_i$ be the position vector field in $\mathbb{R}^{n+1}_{\nu}$. Show that the restriction of $P/r$ to $Q(n, r, \varepsilon)$ is a unit normal field along $Q(n, r, \varepsilon)$.

Exercise 13.48. Show that a hyperquadric $Q(n, r, \varepsilon)$ as defined above is a semi-Riemannian hypersurface with sign $\varepsilon$.
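The claims in the two exercises can be spot-checked numerically before attempting a proof. The following is a minimal sketch, assuming the hyperquadric $Q(2, r, -1)$ in $\mathbb{R}^3_1$ (one sheet of a hyperboloid); the parametrization by `s` and `phi` and all function names are hypothetical choices made only for this illustration, and a numerical check at one point is of course not a proof.

```python
import math

def scalar_product(x, y, nu):
    """Semi-Euclidean scalar product of index nu: the first nu terms carry a minus sign."""
    return sum((-1.0 if i < nu else 1.0) * a * b
               for i, (a, b) in enumerate(zip(x, y)))

def hyperboloid_point(r, s, phi):
    """A point of Q(2, r, -1) in R^3_1, i.e. q(x) = -r^2 (hypothetical parametrization)."""
    return (r * math.cosh(s),
            r * math.sinh(s) * math.cos(phi),
            r * math.sinh(s) * math.sin(phi))

def tangent_vectors(r, s, phi):
    """Partial derivatives of the parametrization with respect to s and phi."""
    d_s = (r * math.sinh(s),
           r * math.cosh(s) * math.cos(phi),
           r * math.cosh(s) * math.sin(phi))
    d_phi = (0.0,
             -r * math.sinh(s) * math.sin(phi),
             r * math.sinh(s) * math.cos(phi))
    return d_s, d_phi

r, s, phi, nu = 2.0, 0.7, 1.1, 1
x = hyperboloid_point(r, s, phi)
unit_normal = tuple(c / r for c in x)          # the restriction of P/r at x
d_s, d_phi = tangent_vectors(r, s, phi)

q_value = scalar_product(x, x, nu)                       # expected: -r^2
normal_norm = scalar_product(unit_normal, unit_normal, nu)  # expected: epsilon = -1
normal_dot_ds = scalar_product(unit_normal, d_s, nu)        # expected: 0
normal_dot_dphi = scalar_product(unit_normal, d_phi, nu)    # expected: 0
```

Here `normal_norm` plays the role of the sign $\varepsilon$: the normal direction is timelike, so the induced metric on this sheet is positive definite.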
13.4. Geodesics

In this section, $I$ will denote a nonempty interval assumed to be open unless otherwise indicated by the context. Usually, it would be enough to assume that $I$ has nonempty interior. We also allow $I$ to be infinite or "half-infinite". Let $(M, g)$ be a semi-Riemannian manifold. Suppose that $\gamma : I \to M$ is a smooth curve that is self-parallel in the sense that
$$\nabla_{\partial_t}\dot{\gamma} = 0$$
along $\gamma$. We call $\gamma$ a geodesic. To be precise, one should distinguish various cases as follows: If $\gamma : [a, b] \to M$ is a curve which is the restriction of a geodesic defined on an open interval containing $[a, b]$, then we call $\gamma$ a (parametrized) closed geodesic segment or just a geodesic for short. If $\gamma : [a, \infty) \to M$ (resp. $\gamma : (-\infty, a] \to M$) is the restriction of a geodesic, then we call $\gamma$ a positive (resp. negative) geodesic ray.
If the domain of a geodesic is $\mathbb{R}$, then we call $\gamma$ a complete geodesic. If $M$ is an $n$-manifold and the image of a geodesic $\gamma$ is contained in the domain of some chart with coordinate functions $x^1, \ldots, x^n$, then the condition for $\gamma$ to be a geodesic is
$$\frac{d^2(x^i \circ \gamma)}{dt^2}(t) + \sum_{j,k}\Gamma^i_{jk}(\gamma(t))\,\frac{d(x^j \circ \gamma)}{dt}(t)\,\frac{d(x^k \circ \gamma)}{dt}(t) = 0 \qquad (13.7)$$
for all $t \in I$ and $1 \le i \le n$. This follows from formula (12.7) and this is a system of $n$ second order equations often abbreviated to $\frac{d^2x^i}{dt^2} + \sum_{j,k}\Gamma^i_{jk}\frac{dx^j}{dt}\frac{dx^k}{dt} = 0$, $1 \le i \le n$. These are the local geodesic equations. Now consider a smooth curve $\gamma$ whose image is not necessarily contained in the domain of a chart. For every $t_0 \in I$, there is an $\epsilon > 0$ such that $\gamma|_{(t_0-\epsilon, t_0+\epsilon)}$ is contained in the domain of a chart, and thus it is not hard to see that $\gamma$ is a geodesic if and only if each such restriction satisfies the corresponding local geodesic equations for each chart which meets the image of $\gamma$. We can convert the local geodesic equations (13.7) into a system of $2n$ first order equations by the usual reduction of order trick. We let $v$ denote a new dependent variable and then we get
$$\frac{dx^i}{dt} = v^i, \qquad \frac{dv^i}{dt} = -\sum_{j,k}\Gamma^i_{jk}\,v^j v^k, \qquad 1 \le i \le n.$$
We can think of $x^i$ and $v^i$ as coordinates on $TM$. Once we do this, we recognize that the first order system above is the local expression of the equations for the integral curves of a vector field on $TM$.

Exercise 13.49. Show that there is a vector field $G \in \mathfrak{X}(TM)$ such that $\alpha$ is an integral curve of $G$ if and only if $\gamma := \pi_{TM} \circ \alpha$ is a geodesic. Show that the local expression for $G$ is
$$G = \sum_{i} v^i\frac{\partial}{\partial x^i} - \sum_{i,j,k}\Gamma^i_{jk}\,v^j v^k\frac{\partial}{\partial v^i}.$$
The vector field $G$ from this exercise is an example of a spray (see Problems 4-8 from Chapter 12). The flow of $G$ in the manifold $TM$ is called the geodesic flow.

Lemma 13.50. For each $v \in T_pM$, there is an open interval $I$ containing 0 and a unique geodesic $\gamma : I \to M$, such that $\dot{\gamma}(0) = v$ (and hence $\gamma(0) = p$).

Proof. This follows from standard existence and uniqueness results for differential equations. One may also deduce this result from the facts about flows since, as the exercise above shows, geodesics are projections of integral curves of the vector field $G$. The reader who did not do the problems on sprays in Chapter 12 would do well to look at those problems at this time. □

Lemma 13.51. Let $\gamma_1$ and $\gamma_2$ be geodesics $I \to M$. If $\dot{\gamma}_1(t_0) = \dot{\gamma}_2(t_0)$ for some $t_0 \in I$, then $\gamma_1 = \gamma_2$.

Proof. If not, there must be $t' \in I$ such that $\gamma_1(t') \neq \gamma_2(t')$. Let us assume that $t' > t_0$ since the proof of the other case is similar. The set $A = \{t \in I : t > t_0 \text{ and } \gamma_1(t) \neq \gamma_2(t)\}$ has an infimum $b = \inf A$. Note that $b \ge t_0$. Claim: $\dot{\gamma}_1(b) = \dot{\gamma}_2(b)$. Indeed, if $b = t_0$, there is nothing to prove. If $b > t_0$, then $\dot{\gamma}_1(t) = \dot{\gamma}_2(t)$ on the interval $(t_0, b)$. By continuity $\dot{\gamma}_1(b) = \dot{\gamma}_2(b)$. Now $t \mapsto \gamma_1(b + t)$ and $t \mapsto \gamma_2(b + t)$ are clearly geodesics with initial velocity $\dot{\gamma}_1(b) = \dot{\gamma}_2(b)$. Thus by Lemma 13.50, $\gamma_1 = \gamma_2$ on some open interval containing $b$. But this contradicts the definition of $b$ as the infimum of $A$. □
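The first-order form of the local geodesic equations (13.7) is exactly what a numerical integrator consumes. The sketch below is an illustration under stated assumptions, not part of the text's development: it takes the unit 2-sphere in spherical coordinates $(\theta, \phi)$ with metric $d\theta^2 + \sin^2\theta\, d\phi^2$, uses its standard nonzero Christoffel symbols, and advances the system with a classical fourth-order Runge-Kutta step. All function names and the initial data are hypothetical. The quantity $\langle\dot\gamma, \dot\gamma\rangle$ is monitored because it must stay constant along a geodesic.

```python
import math

def geodesic_rhs(state):
    """Right-hand side of dx/dt = v, dv/dt = -Gamma(v, v) on the unit sphere.

    state = (theta, phi, v_theta, v_phi). The nonzero Christoffel symbols are
    Gamma^theta_{phi phi} = -sin(theta) cos(theta) and
    Gamma^phi_{theta phi} = Gamma^phi_{phi theta} = cot(theta).
    """
    theta, phi, v_theta, v_phi = state
    a_theta = math.sin(theta) * math.cos(theta) * v_phi ** 2
    a_phi = -2.0 * (math.cos(theta) / math.sin(theta)) * v_theta * v_phi
    return (v_theta, v_phi, a_theta, a_phi)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h for the geodesic system."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, t_final, steps):
    """Follow the integral curve through `state` for time t_final."""
    h = t_final / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

def speed_squared(state):
    """<gamma', gamma'> in spherical coordinates."""
    theta, _, v_theta, v_phi = state
    return v_theta ** 2 + math.sin(theta) ** 2 * v_phi ** 2

initial = (math.pi / 3, 0.0, 0.4, 0.9)   # a point away from the poles and a velocity
final = integrate(initial, 2.0, 2000)
speed_drift = abs(speed_squared(final) - speed_squared(initial))
```

The chart is singular at the poles, so the initial data were chosen to keep the curve away from $\theta = 0$ and $\theta = \pi$; a geodesic through a pole would need a different chart, in line with the remark that a geodesic is characterized chart by chart.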
A geodesic $\gamma : I \to M$ is called maximal if there is no other geodesic with open interval domain $J$ strictly containing $I$ that agrees with $\gamma$ on $I$.

Theorem 13.52. For any $v \in TM$, there is a unique maximal geodesic $\gamma_v$ with $\dot{\gamma}_v(0) = v$.

Proof. Take the class $\mathcal{G}_v$ of all geodesics with initial velocity $v$. This is not empty by Lemma 13.50. If $\alpha, \beta \in \mathcal{G}_v$ and the respective domains $I_\alpha$ and $I_\beta$ have nonempty intersection, then $\alpha$ and $\beta$ agree on this intersection by Lemma 13.51. From this we see that the geodesics in $\mathcal{G}_v$ fit together to form a manifestly maximal geodesic with domain $I = \bigcup_{\gamma \in \mathcal{G}_v} I_\gamma$. Obviously this geodesic has initial velocity $v$. □

Definition 13.53. If the domain of every maximal geodesic emanating from a point $p \in M$ is all of $\mathbb{R}$, then we say that $M$ is geodesically complete at $p$. A semi-Riemannian manifold is said to be geodesically complete if and only if it is geodesically complete at each of its points.

Exercise 13.54. Let $\mathbb{R}^n_\nu$ be the semi-Euclidean space of index $\nu$. Show that all geodesics are of the form $t \mapsto x_0 + tw$ for $w \in \mathbb{R}^n_\nu$.

Definition 13.55. A continuous curve $\gamma : [a, b] \to M$ is called a broken geodesic segment if it is a piecewise smooth curve whose smooth segments are geodesic segments. If $t_*$ is a point of $[a, b]$ where $\gamma$ is not smooth, we call $\gamma(t_*)$ a break point. (A smooth geodesic segment is considered a special case.)

Exercise 13.56. Prove that a semi-Riemannian manifold is connected if and only if every pair of its points can be joined by a broken geodesic $\gamma : [a, b] \to M$.
Exercise 13.57. Show that if $\gamma$ is a geodesic, then a reparametrization $c := \gamma \circ f$ is a geodesic if and only if $f(t) := at + b$ for some $a, b \in \mathbb{R}$ and $a \neq 0$. Show that if $\dot{\gamma}$ is never null, then we may choose $a, b$ so that the geodesic is unit speed and hence parametrized by arc length.
The existence of geodesics passing through a point $p \in M$ at parameter value zero with any specified velocity allows us to define a very important map. Let $\widetilde{\mathcal{D}}_p$ denote the set of all $v \in T_pM$ such that the geodesic $\gamma_v$ is defined at least on the interval $[0, 1]$. The exponential map, $\exp_p : \widetilde{\mathcal{D}}_p \to M$, is defined by $\exp_p v := \gamma_v(1)$.

Lemma 13.58. If $\gamma_v$ is the maximal geodesic with $\dot{\gamma}_v(0) = v \in T_pM$, then for any $c, t \in \mathbb{R}$, we have that $\gamma_{cv}(t)$ is defined if and only if $\gamma_v(ct)$ is defined. When either side is defined, we have
$$\gamma_{cv}(t) = \gamma_v(ct).$$

Proof. Let $J_{v,c}$ be the maximal interval for which $\gamma_v(ct)$ is defined for all $t \in J_{v,c}$. Certainly $0 \in J_{v,c}$. Use the chain rule for covariant derivatives or calculate locally to see that $t \mapsto \gamma_v(ct)$ is a geodesic with initial velocity $cv$. But then by uniqueness and the maximality of $\gamma_{cv}$, the interval $J_{v,c}$ must be contained in the domain of $\gamma_{cv}$ and for $t \in J_{v,c}$ we must have
$$\gamma_{cv}(t) = \gamma_v(ct).$$
In other words, if the right hand side is defined, then so is the left and we have equality. Now let $u = cv$, $s = ct$ and $b = 1/c$. Then we just as well have that
$$\gamma_{bu}(s) = \gamma_u(bs),$$
where if the right hand side is defined, then so is the left. But this is just $\gamma_v(ct) = \gamma_{cv}(t)$. So left and right have reversed and we conclude that if either side is defined, then so is the other. □

Corollary 13.59. If $\gamma_v$ is the maximal geodesic with $\dot{\gamma}_v(0) = v \in T_pM$, then

(i) $t$ is in the domain of $\gamma_v$ if and only if $tv$ is in the domain of $\exp_p$;

(ii) $\gamma_v(t) = \exp_p(tv)$ for all $t$ in the domain of $\gamma_v$.

Proof. Suppose that $tv$ is in the domain of $\exp_p$. Then $\gamma_{tv}(1)$ is defined. But $\gamma_{tv}(1) = \gamma_v(t)$ and $t$ is in the domain of $\gamma_v$ by the previous lemma. The converse is proved similarly and so we obtain $\exp_p(tv) = \gamma_{tv}(1) = \gamma_v(t)$. □
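A concrete case may help fix the notation. The sketch below assumes the unit sphere in $\mathbb{R}^3$, where the exponential map has the well-known closed form $\exp_p(v) = \cos(\|v\|)\,p + \sin(\|v\|)\,v/\|v\|$ for $v$ tangent at $p$; the function names are hypothetical. It writes $\gamma_v(t)$ as $\exp_p(tv)$, as in Corollary 13.59, and checks that this curve stays on the sphere, has initial velocity $v$ (via a central difference), and satisfies the scaling rule of Lemma 13.58. In this closed form the scaling rule reduces to $t(cv) = (ct)v$, so the last check is an illustration rather than independent evidence.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def exp_sphere(p, v):
    """Exponential map of the unit sphere at p, for v tangent at p (p . v = 0)."""
    length = norm(v)
    if length == 0.0:
        return tuple(p)
    return tuple(math.cos(length) * a + math.sin(length) * b / length
                 for a, b in zip(p, v))

def gamma(p, v, t):
    """The geodesic gamma_v evaluated at t, written as exp_p(t v)."""
    return exp_sphere(p, tuple(t * c for c in v))

p = (0.0, 0.0, 1.0)
v = (0.3, -0.4, 0.0)          # tangent at p because its last component is 0
c, t = 2.5, 0.8

lhs = gamma(p, tuple(c * a for a in v), t)   # gamma_{cv}(t)
rhs = gamma(p, v, c * t)                      # gamma_v(ct)
scaling_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
on_sphere_gap = abs(norm(lhs) - 1.0)

# Central-difference estimate of the initial velocity of gamma_v.
h = 1e-5
velocity = tuple((a - b) / (2 * h)
                 for a, b in zip(gamma(p, v, h), gamma(p, v, -h)))
velocity_gap = max(abs(a - b) for a, b in zip(velocity, v))
```

Since the formula is defined for every tangent vector, the domain of $\exp_p$ here is all of $T_pM$, which reflects the geodesic completeness of the sphere.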
Now we have a very convenient situation. The maximal geodesic through $p$ with initial velocity $v$ can always be written in the form $t \mapsto \exp_p tv$. Straight lines through $0_p \in T_pM$ are mapped by $\exp_p$ onto geodesics which we sometimes refer to as radial geodesics through $p$. Similarly, we have radial geodesic segments and radial geodesic rays emanating from $p$. The result of the following exercise is a fundamental observation.

Exercise 13.60. Show that $\langle \dot{\gamma}_v(t), \dot{\gamma}_v(t)\rangle = \langle v, v\rangle$ for all $t$ in the domain of $\gamma_v$.
The exponential map has many uses. For example, it is used in comparing semi-Riemannian manifolds with each other. Also, it provides special coordinate charts. The basic theorem is the following:

Theorem 13.61. Let $(M, g)$ be a semi-Riemannian manifold and $p \in M$. There exists an open neighborhood $\widetilde{U}_p \subset \widetilde{\mathcal{D}}_p$ containing $0_p$ such that $\exp_p|_{\widetilde{U}_p}$ is a diffeomorphism onto its image $U_p$.

Proof. The tangent space $T_pM$ is a vector space, which is isomorphic to $\mathbb{R}^n$ and so has a standard differentiable structure. Using the results about smooth dependence on initial conditions for differential equations, we can easily see that $\exp_p$ is well-defined and smooth in some neighborhood of $0_p \in T_pM$. The main point is that the tangent map $T\exp_p : T_{0_p}(T_pM) \to T_pM$ is an isomorphism and so the inverse mapping theorem gives the result. To see that $T\exp_p$ is an isomorphism, let $v_{0_p} \in T_{0_p}(T_pM)$ be the velocity of the curve $t \mapsto tv$ in $T_pM$. Then, unraveling definitions, we have $T\exp_p v_{0_p} = \frac{d}{dt}\big|_{0}\exp_p tv = v$. Thus $T\exp_p$ is just the canonical map $v_{0_p} \mapsto v$. □

Definition 13.62. A subset $C$ of a vector space $V$ that contains 0 is called star-shaped about 0 if whenever $v \in C$, $tv \in C$ for all $t \in [0, 1]$.

Definition 13.63. If $\widetilde{U} \subset \widetilde{\mathcal{D}}_p$ is a star-shaped open set about $0_p$ in $T_pM$ such that $\exp_p|_{\widetilde{U}}$ is a diffeomorphism as in the theorem above, then the image $\exp_p(\widetilde{U}) = U$ is called a normal neighborhood of $p$. In this case, $U$ is also referred to as star-shaped.

Theorem 13.64. If $U \subset M$ is a normal neighborhood about $p$ with corresponding preimage $\widetilde{U} \subset T_pM$, then for every point $q \in U$ there is a unique geodesic $\gamma : [0, 1] \to U \subset M$ such that $\gamma(0) = p$, $\gamma(1) = q$, $\dot{\gamma}(0) \in \widetilde{U}$ and $\exp_p\dot{\gamma}(0) = q$. (Note that uniqueness here means unique among geodesics with image in $U$.)

Proof. The preimage $\widetilde{U}$ corresponds diffeomorphically to $U$ under $\exp_p$. Let $v = \exp_p|_{\widetilde{U}}^{-1}(q)$ so that $v \in \widetilde{U}$. By assumption $\widetilde{U}$ is star-shaped and so the map $\rho : [0, 1] \to T_pM$ given by $t \mapsto tv$ has image in $\widetilde{U}$. But then, the
geodesic segment $\gamma : t \mapsto \exp_p tv$, $t \in [0, 1]$ has its image inside $U$. Clearly, $\gamma(0) = p$ and $\gamma(1) = q$. Since $\dot{\rho} = v$, we get
$$\dot{\gamma}(0) = T\exp_p\dot{\rho}(0) = T\exp_p v = v$$
under the usual identifications in $T_pM$. Now assume that $\gamma_1 : [0, 1] \to U \subset M$ is some geodesic with $\gamma_1(0) = p$ and $\gamma_1(1) = q$. If $\dot{\gamma}_1(0) = w$, then $\gamma_1(t) = \exp_p tw$.

Claim: The ray $\rho_1 : t \mapsto tw$ ($t \in [0, 1]$) stays inside $\widetilde{U}$. If not, then the set $A = \{t : tw \notin \widetilde{U}\}$ is nonempty. Let $t_* = \inf A$ and consider the set $\widetilde{C} := \{tw : t \in (0, t_*)\}$. Then $\widetilde{C} \subset \widetilde{U}$ and $\widetilde{U} \setminus \widetilde{C}$ is contractible (check this). But its image $\exp_p|_{\widetilde{U}}(\widetilde{U} \setminus \widetilde{C})$ is $U \setminus C$, where $C$ is the image of $(0, t_*)$ under $\gamma_1$. Since $U \setminus C$ is certainly not contractible, we have a contradiction. Thus the claim is true, and in particular $w = \rho_1(1) \in \widetilde{U}$. Therefore, both $w$ and $v$ are in $\widetilde{U}$. On the other hand,
$$\exp_p w = \gamma_1(1) = q = \exp_p v.$$
Thus, since $\exp_p|_{\widetilde{U}}$ is a diffeomorphism and hence injective, we conclude that $w = v$. By the basic uniqueness theorem for geodesics, the segments $\gamma$ and $\gamma_1$ are equal and both given by $t \mapsto \exp_p tv$. □

Let $(M, g)$ be a semi-Riemannian manifold of dimension $n$. Let $p_0 \in M$ and pick any orthonormal basis $(e_1, \ldots, e_n)$ for the semi-Euclidean scalar product space $(T_{p_0}M, \langle\cdot,\cdot\rangle_{p_0})$. This basis induces an isometry $I : \mathbb{R}^n_\nu \to T_{p_0}M$ by $(x^i) \mapsto \sum_i x^i e_i$. If $U$ is a normal neighborhood centered at $p_0 \in M$, then $x_{\mathrm{norm}} := I^{-1} \circ \exp_{p_0}|_{\widetilde{U}}^{-1} : U \to \mathbb{R}^n_\nu = \mathbb{R}^n$ is a coordinate chart with domain $U$. These coordinates are referred to as normal coordinates centered at $p_0$. Normal coordinates have some very nice properties:

Theorem 13.65. If $x_{\mathrm{norm}} = (x^1, \ldots, x^n)$ are normal coordinates defined on $U$ and centered at $p_0$, then
$$g_{ij}(p_0) = \left\langle \frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}\right\rangle_{p_0} = \varepsilon_i\delta_{ij} \quad\text{for all } i, j,$$
$$\Gamma^i_{jk}(p_0) = 0 \quad\text{for all } i, j, k.$$
(When using normal coordinates, it should not be forgotten that the $\Gamma^i_{jk}$ are only guaranteed to vanish at $p_0$.)

Proof. Let $v \in T_{p_0}M$ and let $\{e^i\}$ be the basis of $T^*_{p_0}M$ dual to $\{e_i\}$. We write $v = \sum_i a^i e_i$. We have that $e^i \circ \exp_{p_0}|_{\widetilde{U}}^{-1} = x^i$. Now $\gamma_v(t) = \exp_{p_0} tv$ and so $x^i(\gamma_v(t)) = e^i(tv) = ta^i$.
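So we see that $v = \dot{\gamma}_v(0) = \sum_i a^i\frac{\partial}{\partial x^i}\big|_{p_0}$. In particular, if $a^i = \delta^i_j$, then $e_j = \frac{\partial}{\partial x^j}\big|_{p_0}$, and so $\left\langle \frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}\right\rangle_{p_0} = \varepsilon_i\delta_{ij}$. Since $\gamma_v$ is a geodesic and $x^i(\gamma_v(t)) = ta^i$, the coordinate expression for the geodesic equations reduces to
$$\sum_{j,k}\Gamma^i_{jk}(\gamma_v(t))\,a^j a^k = 0$$
for all $i$. In particular, this holds at $p_0 = \gamma_v(0)$. But $v$ is arbitrary, and hence the $n$-tuple $(a^i)$ is arbitrary. Thus the quadratic form defined on $\mathbb{R}^n$ by $Q^i(u) = \sum_{j,k}\Gamma^i_{jk}(p_0)\,u^j u^k$ is identically zero, and by polarization, the bilinear form $Q^i : (u, v) \mapsto \sum_{j,k}\Gamma^i_{jk}(p_0)\,u^j v^k$ is identically zero. Of course this means that $\Gamma^i_{jk}(p_0) = 0$ for all $j, k$ and arbitrary $i$. □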
Notation 13.66. From now on, whenever we write $\exp_p^{-1}$, we must have in mind an open set $\widetilde{U}$ and an open set $U$ such that $\exp_p|_{\widetilde{U}} : \widetilde{U} \to U$ is a diffeomorphism. Thus, $\exp_p^{-1}$ is an abbreviation for $\exp_p|_{\widetilde{U}}^{-1}$. Usually, $\widetilde{U}$ will be star-shaped and thereby $U$ is a normal neighborhood of the point $p$.

Definition 13.67. Let $U$ be a normal neighborhood of a point $p_0$ in a semi-Riemannian manifold $M$. The radius function $r : U \to \mathbb{R}$ is defined by
$$r_{p_0}(p) := \|\exp_{p_0}^{-1}(p)\|$$
for $p \in U$. We often write simply $r$ if the central point $p_0$ is understood.

If $(x^1, \ldots, x^n)$ are normal coordinates defined on $U$ and centered at $p_0$, then the radius function is given by
$$r_{p_0} = \left| -\sum_{i=1}^{\nu}(x^i)^2 + \sum_{i=\nu+1}^{n}(x^i)^2 \right|^{1/2}.$$
The radius function is smooth except on the set where it is zero. This zero set is called the local nullcone and is the image of the intersection of the nullcone in $T_{p_0}M$ with $\widetilde{U} = \exp_{p_0}^{-1}(U)$. In the Riemannian case, where the metric is definite, the radius function $r$ is smooth except at the center point $p_0$. Note that in this case, $r^2$ is smooth even at $p_0$. Now suppose that $\gamma : [0, 1] \to U \subset M$ is the geodesic with $\gamma(0) = p_0$, $\gamma(1) = p$ and $\dot{\gamma}(0) = v$. Then we have the useful and often used fact that
$$L(\gamma) = r(p).$$
To see this, first note that $v = \exp_{p_0}^{-1}(p)$. Then, since $\|\dot{\gamma}\|$ is constant (by Exercise 13.60), we have
$$L(\gamma) = \int_0^1 \|\dot{\gamma}\|\,dt = \int_0^1 \|v\|\,dt = \|v\| = r(p).$$
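To see how the indefinite case differs from the Riemannian one, it may help to write the radius function out in the simplest example. The sketch below assumes $M = \mathbb{R}^2_1$ with $p_0 = 0$, where geodesics are straight lines (Exercise 13.54), so that the standard coordinates serve as normal coordinates; the sample points are chosen only for illustration.

```latex
\begin{align*}
r(x) &= \bigl| -(x^1)^2 + (x^2)^2 \bigr|^{1/2}, \\
r(1,2) &= |-1 + 4|^{1/2} = \sqrt{3}, \qquad r(2,1) = |-4 + 1|^{1/2} = \sqrt{3}, \\
r(1,1) &= 0 \quad\text{although } (1,1) \neq p_0, \\
L(\gamma) &= \int_0^1 \|v\|\,dt = \|v\| = r(p)
   \quad\text{for } \gamma(t) = tv,\ p = v.
\end{align*}
```

Here the local nullcone is the pair of lines $x^2 = \pm x^1$, and $r$ fails to be smooth exactly there; a radial geodesic ending on one of these lines has length zero even though its endpoints differ.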
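More generally, if $r > 0$ is such that $\gamma_v : t \mapsto \exp_p tv$ is defined for $0 \le t \le r$, then
$$L(\gamma_v|_{[0,r]}) = \int_0^r \|\dot{\gamma}_v\|\,dt = r\|v\|.$$
In particular, if $v$ is a unit vector, then the length of the geodesic $\gamma_v|_{[0,r]}$ is equal to $r$.

Let
$$\widetilde{\mathcal{D}} := \bigcup_p \widetilde{\mathcal{D}}_p.$$
We can gather the maps $\exp_p : \widetilde{\mathcal{D}}_p \subset T_pM \to M$ together to get a map $\exp : \widetilde{\mathcal{D}} \to M$ defined by $\exp(v) := \exp_{\pi(v)}(v)$. The set $\widetilde{\mathcal{D}}$ is the set of $v \in TM$ such that the geodesic $\gamma_v$ is defined at least on $[0, 1]$.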
Proposition 13.68. $\widetilde{\mathcal{D}}$ is open and for each $p \in M$, $\widetilde{\mathcal{D}}_p$ is open and star-shaped. Thus $\mathcal{D}_p := \exp_p(\widetilde{\mathcal{D}}_p)$ is a (maximal) normal neighborhood of $p$.

Proof. Let $W \subset \mathbb{R} \times TM$ be the domain of the geodesic flow $(s, v) \mapsto \dot{\gamma}_v(s)$. This is the flow of a vector field on $TM$, and so $W$ is open. $W$ is also the domain of the map $(s, v) \mapsto \pi \circ \dot{\gamma}_v(s) = \gamma_v(s)$. The map $(1, v) \mapsto v$ is a diffeomorphism $\{1\} \times TM \to TM$. Under this diffeomorphism, $\widetilde{\mathcal{D}}$ corresponds to the set $W \cap (\{1\} \times TM)$, and so it must be open in $TM$. It also follows that $\widetilde{\mathcal{D}}_p = \widetilde{\mathcal{D}} \cap T_pM$ is open in $T_pM$ and $\mathcal{D}_p$ is open in $M$.

To see that $\widetilde{\mathcal{D}}_p$ is star-shaped, let $v \in \widetilde{\mathcal{D}}_p$. Then $\gamma_v$ is defined for all $t \in [0, 1]$. On the other hand, $\gamma_{tv}(1)$ is defined and equal to $\gamma_v(t)$ for all $t \in [0, 1]$ by Lemma 13.58. Thus, by definition, $tv \in \widetilde{\mathcal{D}}_p$ for all $t \in [0, 1]$. □

Let $\Delta = \{(p, p) : p \in M\}$ be the diagonal subset of $M \times M$. Let $\mathrm{EXP} : \widetilde{\mathcal{D}} \subset TM \to M \times M$ be defined by
$$\mathrm{EXP} : v \mapsto (\pi(v), \exp_{\pi(v)} v),$$
where $\pi : TM \to M$ is the tangent bundle projection.
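Theorem 13.69. If $0_p$ is the zero element of $T_pM$, then there is a neighborhood $W$ of $0_p$ in $TM$ such that $\mathrm{EXP}|_W$ is a diffeomorphism onto a neighborhood of $(p, p) \in \Delta \subset M \times M$.

Proof. We first show that if $T_x\exp_p$ is nonsingular for some $x \in \widetilde{\mathcal{D}}_p \subset T_pM$, then $T_x\mathrm{EXP}$ is also nonsingular at $x$. So assume that $T_x\exp_p$ is nonsingular and suppose that $T_x\mathrm{EXP}(v_x) = 0$. We have $\pi = \mathrm{pr}_1 \circ \mathrm{EXP}$ and so $T\pi(v_x) = T\mathrm{pr}_1(T_x\mathrm{EXP}(v_x)) = 0$. This means that $v_x$ is tangent to $\widetilde{\mathcal{D}}_p \subset T_pM$. But the restricted map $\mathrm{EXP}|_{\widetilde{\mathcal{D}}_p}$ is related to $\exp_p$ by trivial diffeomorphisms:
$$\begin{array}{ccc}
\widetilde{\mathcal{D}}_p & \xrightarrow{\ \mathrm{EXP}|_{\widetilde{\mathcal{D}}_p}\ } & \{p\} \times M \\
\downarrow{\scriptstyle \mathrm{id}} & & \downarrow{\scriptstyle \mathrm{pr}_2} \\
\widetilde{\mathcal{D}}_p & \xrightarrow{\ \exp_p\ } & M
\end{array}$$
Thus $T_x\exp_p(v_x) = 0$ and hence $v_x = 0$. Since $T_{0_p}\exp_p$ is nonsingular at each point $0_p$ of the zero section, we see that the same is true for $T_{0_p}\mathrm{EXP}$, and the result follows from the inverse mapping theorem. □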
Definition 13.70. An open subset $U$ of a semi-Riemannian manifold will be said to be totally star-shaped if it is a normal neighborhood of each of its points.

Notice that $U$ being totally star-shaped according to the above definition implies that for any two points $p, q \in U$, there is a geodesic segment $\gamma : [0, 1] \to U$ such that $\gamma(0) = p$ and $\gamma(1) = q$, and this is the unique such geodesic with image in $U$. (One may always make an affine change of parameter, but then we have a different interval as the domain.) Thinking about the sphere makes it clear that even if $U$ is totally star-shaped, there may be geodesic segments connecting $p$ and $q$ whose images do not lie in $U$.
Theorem 13.71. Every $p \in M$ has a totally star-shaped neighborhood.

Proof. Let $p \in M$ and choose a neighborhood $W$ of $0_p$ in $TM$ such that $\mathrm{EXP}|_W$ is a diffeomorphism onto a neighborhood of $(p, p) \in M \times M$. By a simple continuity argument we may assume that $\mathrm{EXP}|_W(W)$ is of the form $U(\delta) \times U(\delta)$ for $U(\delta) := \{q : \sum_{i=1}^n (x^i(q))^2 < \delta\}$, where $x = (x^1, \ldots, x^n)$ is a normal coordinate system. Now consider the tensor $b$ on $U(\delta)$ whose components with respect to $x$ are $b_{ij} = \delta_{ij} - \sum_k \Gamma^k_{ij}x^k$. This is clearly symmetric and positive definite at $p$, and so by choosing $\delta$ smaller if necessary we may assume that this tensor is positive definite on $U(\delta)$. Let us show that $U(\delta)$ is a normal neighborhood of each of its points $q$. Let $W_q := W \cap T_qM$. We know that $\mathrm{EXP}|_{W_q}$ is a diffeomorphism onto $\{q\} \times U(\delta)$, and it is easy to see that this means that $\exp_q|_{W_q}$ is a diffeomorphism onto $U(\delta)$. We now show that $W_q$ is star-shaped about $0_q$. Let $q' \in U(\delta)$, $q' \neq q$, and $v = \mathrm{EXP}|_{W_q}^{-1}(q, q')$. This means that $\gamma_v : [0, 1] \to M$ is a geodesic from $q$ to $q'$. If $\gamma_v([0, 1]) \subset U(\delta)$, then $tv \in W_q$ for all $t \in [0, 1]$ and so we could conclude that $W_q$ is star-shaped. Let us assume that $\gamma_v([0, 1])$ is not contained in $U(\delta)$ and work for a contradiction.

If in fact $\gamma_v$ leaves $U(\delta)$, then the function $f : t \mapsto \sum_{i=1}^n (x^i(\gamma_v(t)))^2$ has a maximum at some $t_0 \in (0, 1)$. Thus the second derivative of $f$ cannot be positive at $t_0$. We have
$$\frac{d^2f}{dt^2} = 2\sum_{i=1}^n\left( \left(\frac{d(x^i \circ \gamma_v)}{dt}\right)^2 + (x^i \circ \gamma_v)\,\frac{d^2(x^i \circ \gamma_v)}{dt^2} \right).$$
But $\gamma_v$ is a geodesic, and so using the geodesic equations we get
$$\frac{d^2f}{dt^2} = 2\sum_{i,j}\left( \delta_{ij} - \sum_k \Gamma^k_{ij}x^k \right)\frac{d(x^i \circ \gamma_v)}{dt}\,\frac{d(x^j \circ \gamma_v)}{dt}.$$
Plugging in $t_0$ we get
$$\frac{d^2f}{dt^2}(t_0) = 2b(\dot{\gamma}_v(t_0), \dot{\gamma}_v(t_0)) > 0,$$
which contradicts $f$ having a maximum at $t_0$. □
Note: It follows from the proof that given $p \in M$, there is a $\delta > 0$ such that $\exp(\{v_p \in T_pM : \|v_p\| < \epsilon\})$ is totally star-shaped for all $\epsilon < \delta$.

Warning: Clearly a totally star-shaped open set is, in some sense, "convex". Indeed, some authors define convexity in this manner. However, notice that on the circle, open intervals are totally star-shaped while the intersection of two such intervals need not even be connected. For proper Riemannian manifolds, there is another definition of convexity that does not suffer from this defect (see Problem 2). Actually, there are several notions of convexity on manifolds and the terminology does not seem to be quite standardized.
Theorem 13.72 (Gauss lemma). Let $p \in M$, $x \in T_pM$, with $x \neq 0_p$ in the domain of $\exp_p$. Choose $v_x, w_x \in T_x(T_pM)$, where $v_x, w_x$ correspond to $v, w \in T_pM$ under the canonical isomorphism between $T_x(T_pM)$ and $T_pM$. If $v_x$ is radial, i.e. if $v$ is a scalar multiple of $x$, then
$$\langle T_x\exp_p v_x, T_x\exp_p w_x\rangle = \langle v_x, w_x\rangle.$$

Proof. Clearly we may suppose that $x = v$. For small $\epsilon > 0$, we define $\widetilde{h} : [0, 1 + \epsilon) \times (-\epsilon, \epsilon) \to T_pM$ by
$$\widetilde{h}(t, s) = t(v + sw).$$
If we take $\epsilon$ sufficiently small, then on the same domain we may define $h$ by
$$h(t, s) := \exp_p(t(v + sw)).$$
Then
$$\frac{\partial h}{\partial t}(t, s) := T_{(t,s)}h \cdot \frac{\partial}{\partial t}, \qquad \frac{\partial h}{\partial s}(t, s) := T_{(t,s)}h \cdot \frac{\partial}{\partial s}.$$
Figure 13.2. Gauss lemma
We have that
~~ (1,0)
=
Vv and ~~ (1,0)
oh
ot (1,0)
=
=
wv , so that
Tvexpp Vv,
oh
os (1, 0) = Tvexpp
W v·
We wish to show that (~, ~~)(1, 0) = (vv, wv ) = (v, w). Since the curve t I-t expp(t(v + sw)) is a geodesic with initial velocity v + sw, we have
Oh) (Oh Oh) ( Oh ot 'ot (t, s) = ot' ot (0, s) = (v + sw, v + sw). We have
.?-(Oh Oh) = (V' a oh Oh) + (Oh ot at' os
at
= = =
ot ' os
(~~, V' It ~:)
ot '
(since t
V' a Oh) at os I-t
h( t, s) is a geodesic)
Oh) . ( Oh ot ' V' %8 ot (by ExercIse 13.11) 1 0 (Oh Oh) 20s at' ot = (v,w).
Since h(O, s) = p for all s, we have (~~, ~~)(O, 0) t(v, w). The result follows by letting t = 1.
= 0 and so (~~, ~~)(t, 0) = 0
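The Gauss lemma, and the need for its radial hypothesis, can be observed numerically. The following is a minimal sketch under stated assumptions: it uses the unit sphere in $\mathbb{R}^3$ with the closed-form exponential map $\exp_p(x) = \cos(\|x\|)\,p + \sin(\|x\|)\,x/\|x\|$, approximates $T_x\exp_p$ by central differences, and measures lengths with the Euclidean dot product, which induces the metric of the sphere. The function names, the step size and the sample vectors are hypothetical choices for this illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def exp_sphere(p, x):
    """Exponential map of the unit sphere at p, for x tangent at p."""
    length = math.sqrt(dot(x, x))
    if length == 0.0:
        return tuple(p)
    return tuple(math.cos(length) * a + math.sin(length) * b / length
                 for a, b in zip(p, x))

def tangent_map(p, x, w, h=1e-5):
    """Central-difference approximation of T_x exp_p applied to w."""
    forward = exp_sphere(p, tuple(a + h * b for a, b in zip(x, w)))
    backward = exp_sphere(p, tuple(a - h * b for a, b in zip(x, w)))
    return tuple((f - b) / (2 * h) for f, b in zip(forward, backward))

p = (0.0, 0.0, 1.0)
x = (1.0, 0.0, 0.0)              # base point in T_pM, with |x| = 1
v_radial = (2.0, 0.0, 0.0)       # a scalar multiple of x
v_tangential = (0.0, 1.0, 0.0)   # orthogonal to x, hence not radial
w = (0.3, 0.5, 0.0)

# Radial case: the two sides of the Gauss lemma should agree.
radial_lhs = dot(tangent_map(p, x, v_radial), tangent_map(p, x, w))
radial_rhs = dot(v_radial, w)

# Non-radial case: the equality is not expected to hold.
nonradial_lhs = dot(tangent_map(p, x, v_tangential),
                    tangent_map(p, x, v_tangential))
nonradial_rhs = dot(v_tangential, v_tangential)
```

In this example the non-radial direction is shrunk by the factor $\sin(1)$, so `nonradial_lhs` is close to $\sin^2(1)$ rather than 1, while the radial pairing is preserved up to discretization error.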
If Vx is not assumed to be radial, then the above equality does not always hold. We need to have the dimension of the manifold greater than 3 in order to see what can go wrong. Figure 13.3 shows a unit sphere in the tangent space of a Riemannian manifold and a pair of orthogonal vectors tangent to this sphere. Under the exponential map these vectors map to vectors which need not be orthogonal.
Figure 13.3. Distortions under the exponential map
We now introduce the position vector fields associated with a normal neighborhood of a point $p$. First, let us consider a vector space $V$ with scalar product $\langle\cdot,\cdot\rangle$. Then $V$ is a semi-Riemannian manifold with metric defined by $\langle v_x, w_x\rangle := \langle v, w\rangle$. Let $\nabla$ denote the associated Levi-Civita connection. Because of the canonical isomorphisms $T_vV \cong V$, every vector field $Y$ on $V$ can be identified with a map $Y : V \to V$. Under this identification, $\nabla_X Y$ is just the directional derivative of $Y$ in the $X$ direction. The position vector field on $V$ is defined by $P : v \mapsto v_v$, and it is easy to check that $\nabla_X P = X$ for any vector field $X$. Now consider the quadratic form $q$ defined by $q(v) = \langle v, v\rangle$. Unraveling the definitions we see that $q = \langle P, P\rangle$. We have for any $X$
$$\langle \operatorname{grad} q, X\rangle = Xq = X\langle P, P\rangle = 2\langle \nabla_X P, P\rangle = \langle X, 2P\rangle,$$
and we conclude that $\operatorname{grad} q = 2P$. It follows that $P$ is normal to every hyperquadric $q^{-1}(c)$, $c \in \mathbb{R}$, $c \neq 0$.

Now we want a similar result in a normal neighborhood $U$ of a point $p$ on a semi-Riemannian manifold $M$. We consider $T_pM$ as a semi-Riemannian manifold in its own right. We have the position vector field $\widetilde{P} : v \mapsto v_v$, and we have the quadratic form $\widetilde{q}$ defined by $\widetilde{q}(v) = \langle v, v\rangle$. Let $\widetilde{U} = \exp_p^{-1}(U)$ and for each $c \neq 0$ let $\widetilde{Q}_c := \widetilde{q}^{-1}(c)$ and $Q_c := \exp_p(\widetilde{U} \cap \widetilde{Q}_c)$. Corresponding to $\widetilde{q}$ we have a function $q$ defined on $U$ by
$$q := \widetilde{q} \circ \exp_p^{-1}.$$
Definition 13.73. For $c \neq 0$, the sets $Q_c = q^{-1}(c)$ are called local hyperquadrics associated to the normal neighborhood centered at $p$. The set $\Lambda = q^{-1}(0) = (\widetilde{q} \circ \exp_p^{-1})^{-1}(0)$ is called the local nullcone at $p$.

On the normal neighborhood $U$ there is a unique vector field $P$ that is $\exp_p$-related to $\widetilde{P}$. We refer to $P$ as the local position vector field for the normal neighborhood at $p$,
$$P = T\exp_p \circ \widetilde{P} \circ \exp_p^{-1}.$$

Proposition 13.74. Let $U$ be a normal neighborhood of $p$ and let $\widetilde{q}$, $\widetilde{Q}_c$, $Q_c$, $\widetilde{P}$, and $P$ be as above. Then

(i) $P$ is normal to each $Q_c$;

(ii) $\langle P, P\rangle \circ \exp_p = \langle \widetilde{P}, \widetilde{P}\rangle$;

(iii) $\operatorname{grad} q = 2P$.

Proof. (i) follows from the Gauss lemma (Theorem 13.72) and the corresponding fact that $\widetilde{P}$ is normal to each $\widetilde{Q}_c$ in $T_pM$. The Gauss lemma also immediately gives (ii). (iii) For any $v$, let $\widetilde{v}$ be such that $T\exp_p\widetilde{v} = v$. Then
$$\begin{aligned}
\langle \operatorname{grad} q, v\rangle = v(q) &= (T\exp_p\widetilde{v})\,q = \widetilde{v}(q \circ \exp_p) = \widetilde{v}(\widetilde{q}) \\
&= \langle \operatorname{grad}\widetilde{q}, \widetilde{v}\rangle = 2\langle \widetilde{P}, \widetilde{v}\rangle = 2\langle P, v\rangle,
\end{aligned}$$
where we used the Gauss lemma in the last step. Since $v$ was arbitrary, we obtain (iii). □

Consider the unit sphere $S^{n-1}$ and the map $\mathbb{R}^n \setminus \{0\} \to (0, \infty) \times S^{n-1}$ given by $x \mapsto (\|x\|, x/\|x\|)$. Now put coordinates on the sphere, say $\theta^1, \ldots, \theta^{n-1}$. Composing, we obtain coordinates $(r, \theta^1, \ldots, \theta^{n-1})$ on an open subset of $\mathbb{R}^n \setminus \{0\}$, where $r$ gives the distance to the origin and the $\theta$ directions are normal to the $r$ direction. A standard method of choosing the angle functions $\theta^1, \ldots, \theta^{n-1}$ leads to what is sometimes called hyperspherical coordinates. If $(M, g)$ is Riemannian, and if $(r, \theta^1, \ldots, \theta^{n-1})$ are "spherical" coordinates
on $\mathbb{R}^n$ as above, then we can compose with normal coordinates centered at $p$ to obtain coordinate functions on our normal neighborhood, which we again denote by $(r, \theta^1, \ldots, \theta^{n-1})$. These coordinates are called geodesic spherical coordinates or geodesic polar coordinates. As usual, the function $r$ is extended to be zero at the center, and in the case of hyperspherical coordinates, the angle functions are extended to be multivalued. The resulting "coordinates" are not really proper coordinates on the normal neighborhood since they suffer from the usual defects. For example, $r$ is not smooth at the center point where it is zero, and the angles $\theta^1, \ldots, \theta^{n-1}$ become ambiguous when $r = 0$. Thus a little care is needed when using spherical coordinates. No matter how we choose the $\theta^1, \ldots, \theta^{n-1}$, the function $r$ is the radial function introduced earlier. Whether or not angle functions are introduced, one often uses the notation $\partial/\partial r$ to denote the unit vector field defined as follows: If $v$ is a unit vector in $T_pM$, then
$$\frac{\partial}{\partial r}\Big|_q := \frac{d}{dt}\Big|_{t=t_0}\exp_p(tv),$$
where $q = \exp_p(t_0v)$ (and $p$ is the center point of the normal coordinates). In fact, it is not hard to see that $\partial/\partial r = P/\|P\|$. One might use this last equation to define $\partial/\partial r$ in the case of an indefinite metric, but note that $P/\|P\|$ is undefined when $\|P\| = 0$ and so is undefined on the local nullcone. We refer to $\partial/\partial r$ as the unit radial vector field. If $(r, \theta^1, \ldots, \theta^{n-1})$ are geodesic spherical coordinates centered at some point $p_0$ of a Riemannian manifold, then by the Gauss lemma
$$\left\langle \frac{\partial}{\partial r}, \frac{\partial}{\partial\theta^i}\right\rangle = 0 \quad\text{for } i = 1, 2, \ldots, n - 1.$$

Exercise 13.75. Show that if a geodesic $\gamma : [a, b) \to M$ is extendable to a continuous map $\overline{\gamma} : [a, b] \to M$, then there is an $\epsilon > 0$ such that $\gamma : [a, b) \to M$ is extendable further to a geodesic $\widetilde{\gamma} : [a, b + \epsilon) \to M$ with $\widetilde{\gamma}|_{[a,b]} = \overline{\gamma}$.
Under certain conditions, geodesics can help us draw conclusions about maps. The following result is an example and a main ingredient in the proof of the Hadamard theorem to be given later.

Theorem 13.76. Let $f : (M, g) \to (N, h)$ be a local isometry of semi-Riemannian manifolds with $N$ connected. Suppose that $f$ has the property that given any geodesic $\gamma : [0, 1] \to N$ and $p \in M$ with $f(p) = \gamma(0)$, there is a curve $\widetilde{\gamma} : [0, 1] \to M$ such that $p = \widetilde{\gamma}(0)$ and $\gamma = f \circ \widetilde{\gamma}$. Then $f$ is a semi-Riemannian covering.

Proof. Since any two points of $N$ can be joined by a broken geodesic, it is easy to see that the hypotheses imply that $f$ is onto.
Let $U$ be a normal neighborhood of an arbitrary point $q \in N$ and let $\widetilde{U} \subset T_qN$ be the open set such that $\exp_q(\widetilde{U}) = U$. We will show that $U$ is evenly covered by $f$. Choose $p \in f^{-1}(q)$. Observe that $T_pf : T_pM \to T_qN$ is a linear isometry (the metrics on $T_pM$ and $T_qN$ are given by the scalar products $g(p)$ and $h(q)$). Thus $\widetilde{V}_p := T_pf^{-1}(\widetilde{U})$ is star-shaped about $0_p \in T_pM$. Now if $v \in \widetilde{V}_p$, then by hypothesis, the geodesic $\gamma(t) := \exp_q(t(T_pf(v)))$ has a lift to a curve $\widetilde{\gamma} : [0, 1] \to M$ with $\widetilde{\gamma}(0) = p$. But since $f$ is a local isometry, this curve must be a geodesic. It is also easy to see that $T_pf(\widetilde{\gamma}'(0)) = \gamma'(0) = T_pf(v)$. It follows that $v = \widetilde{\gamma}'(0)$ and then $\exp_p(v) = \widetilde{\gamma}(1)$. Thus $\exp_p$ is defined on all of $\widetilde{V}_p$. In fact, it is clear that $f(\exp_p v) = \exp_{f(p)}(Tf(v))$ and so we see that $f$ maps $V_p := \exp_p(\widetilde{V}_p)$ onto the set $\exp_q(\widetilde{U}) = U$. We show that $V_p$ is a normal neighborhood of $p$. From $f \circ \exp_p = \exp_{f(p)} \circ Tf$ we see that $f \circ \exp_p$ is a diffeomorphism on $\widetilde{V}_p$. But then $\exp_p : \widetilde{V}_p \to V_p$ is bijective. Combining this with the fact that $Tf \circ T\exp_p$ is a linear isomorphism at each $v \in \widetilde{V}_p$ and the fact that $Tf$ is a linear isomorphism, it follows that $T_v\exp_p$ is a linear isomorphism. It follows that $V_p$ is open and $\exp_p : \widetilde{V}_p \to V_p$ is a diffeomorphism. Composing, we obtain
$$f|_{V_p} = \exp_{f(p)}|_{\widetilde{U}} \circ Tf \circ \exp_p|_{\widetilde{V}_p}^{-1},$$
which is a diffeomorphism taking $V_p$ onto $U$.

Now we show that if $p_i, p_j \in f^{-1}(q)$ and $p_i \neq p_j$, then the sets $V_{p_i}$ and $V_{p_j}$ (obtained for these points as we did for a generic $p$ above) are disjoint. Suppose to the contrary that $m \in V_{p_i} \cap V_{p_j}$ and let $\gamma_{p_im}$ and $\gamma_{p_jm}$ be the reverse radial geodesics from $m$ to $p_i$ and $p_j$ respectively. Then $f \circ \gamma_{p_im}$ and $f \circ \gamma_{p_jm}$ are both reversed radial geodesics from $f(m)$ to $q$, and so they must be equal. But then $\gamma_{p_im}$ and $\gamma_{p_jm}$ are equal since they are both lifts of the same curve and start at the same point. It follows that $p_i = p_j$ after all. It remains to prove that $f^{-1}(U) \subset \bigcup_{p \in f^{-1}(q)} V_p$ since the reverse inclusion is obvious. Let $x \in f^{-1}(U)$ and let $\alpha : [0, 1] \to U$ be the reverse radial geodesic from $f(x)$ to the center point $q$. Now let $\gamma$ be the lift of $\alpha$ starting at $x$ and let $p = \gamma(1)$. Then $f(p) = \alpha(1) = q$, which means that $p \in f^{-1}(q)$. On the other hand, the image of $\gamma$ must lie in $V_p$ and so $x \in V_p$. □
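13.4.1. Geodesics on submanifolds. Let $M$ be a semi-Riemannian submanifold of $\overline{M}$. For a smooth curve $\gamma : I \to M$, it is easy to show using Proposition 13.32 that $\nabla_{\partial_t} Y = (\overline{\nabla}_{\partial_t} Y)^{\tan}$ and that we have
$$\overline{\nabla}_{\partial_t} Y = \nabla_{\partial_t} Y + II(\dot\gamma, Y)$$
for any vector field $Y$ along $\gamma$. If $Y$ is a vector field in $\mathfrak{X}(\overline{M})|_M$ or in $\mathfrak{X}(M)$, then $Y \circ \gamma$ is a vector field along $\gamma$. In this case we shall still write $\overline{\nabla}_{\partial_t} Y = \nabla_{\partial_t} Y + II(\dot\gamma, Y)$ rather than $\overline{\nabla}_{\partial_t}(Y \circ \gamma) = \nabla_{\partial_t}(Y \circ \gamma) + II(\dot\gamma, Y \circ \gamma)$.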
Figure 13.4. Semi-Riemannian covering
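Recall that $\dot\gamma$ is a vector field along $\gamma$. We also have $\overline{\nabla}_{\partial_t}\dot\gamma$, which in this context will be called the extrinsic acceleration (or acceleration in $\overline{M}$). The intrinsic acceleration (acceleration in $M$) is $\nabla_{\partial_t}\dot\gamma$. Thus we have
$$\overline{\nabla}_{\partial_t}\dot\gamma = \nabla_{\partial_t}\dot\gamma + II(\dot\gamma, \dot\gamma).$$
Since $II(\dot\gamma, \dot\gamma)$ is the normal part of $\overline{\nabla}_{\partial_t}\dot\gamma$, we immediately obtain the following:

Proposition 13.77. If $\gamma : I \to M$ is a smooth curve where $M$ is a semi-Riemannian submanifold of $\overline{M}$, then $\gamma$ is a geodesic in $M$ if and only if $\overline{\nabla}_{\partial_t}\dot\gamma$ is normal to $M$ for every $t \in I$.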
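The criterion of Proposition 13.77 can be tested numerically in the simplest case, the round sphere in Euclidean space, where the normal line at a point $x$ is spanned by $x$ itself. The following is only a sketch under that assumption: the function names, the finite-difference step, and the sample curves are illustrative choices and do not come from the text. It compares a great circle, whose ambient acceleration should be purely normal, with a circle of latitude, whose ambient acceleration should have a nonzero tangential part.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ambient_acceleration(curve, t, h=1e-4):
    """Central-difference approximation of the second derivative in R^3."""
    lo, mid, hi = curve(t - h), curve(t), curve(t + h)
    return [(a - 2.0 * b + c) / h**2 for a, b, c in zip(lo, mid, hi)]

def tangential_norm(curve, t):
    """Norm of the part of the ambient acceleration tangent to the sphere."""
    x = curve(t)
    acc = ambient_acceleration(curve, t)
    coeff = dot(acc, x) / dot(x, x)  # the normal line at x is spanned by x
    tangent = [a - coeff * b for a, b in zip(acc, x)]
    return math.sqrt(dot(tangent, tangent))

# Hypothetical sample data: an orthonormal pair spanning a plane through 0.
E1 = [1.0, 0.0, 0.0]
E2 = [0.0, 0.6, 0.8]

def great_circle(t, r=2.0):
    """Constant speed great circle of radius r in span{E1, E2}."""
    return [r * (math.cos(t) * a + math.sin(t) * b) for a, b in zip(E1, E2)]

def latitude_circle(t):
    """Circle of latitude on the unit sphere at height 0.8 (not a great circle)."""
    return [0.6 * math.cos(t), 0.6 * math.sin(t), 0.8]

great = tangential_norm(great_circle, 0.7)
latitude = tangential_norm(latitude_circle, 0.7)
```

Up to discretization error, `great` should vanish while `latitude` should not, which is consistent with the proposition: only the great circle is a geodesic of the sphere.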
Exercise 13.78. A constant speed parametrization of a great circle in $S^n(r)$ is a geodesic. Every geodesic in $S^n(r)$ is of this form.

Definition 13.79. A semi-Riemannian submanifold $M \subset \overline{M}$ is called totally geodesic if every geodesic in $M$ is a geodesic in $\overline{M}$.

Theorem 13.80. For a semi-Riemannian submanifold $M \subset \overline{M}$, the following conditions are equivalent:

(i) $M$ is totally geodesic;
(ii) $II \equiv 0$;
(iii) For all $v \in TM$, the geodesic $\gamma_v$ in $\overline{M}$ with initial velocity $v$ is such that $\gamma_v([0, \varepsilon]) \subset M$ for $\varepsilon > 0$ sufficiently small;
(iv) For any curve $\alpha : I \to M$, parallel translation along $\alpha$ induced by $\overline{\nabla}$ in $\overline{M}$ is equal to parallel translation along $\alpha$ induced by $\nabla$ in $M$.

Proof. (i)$\Rightarrow$(iii) follows from the uniqueness of geodesics with a given initial velocity.

(iii)$\Rightarrow$(ii): Let $v \in TM$. Applying Proposition 13.77 to $\gamma_v$ we see that $II(v, v) = 0$. Since $v$ was arbitrary and $II$ is symmetric, we conclude by polarization that $II \equiv 0$.
(ii)$\Rightarrow$(iv): Suppose $v \in T_pM$. If $V$ is a parallel vector field with respect to $\nabla$ that is defined near $p$ such that $V(p) = v$, then $\overline{\nabla}_{\partial_t} V = \nabla_{\partial_t} V + II(\dot\gamma, V) = 0$ for any $\gamma$ with $\gamma(0) = p$, so that $V$ is a parallel vector field with respect to $\overline{\nabla}$.

(iv)$\Rightarrow$(i): Assume (iv). If $\gamma$ is a geodesic in $M$, then $\dot\gamma$ is parallel along $\gamma$ with respect to $\nabla$. Then by assumption $\dot\gamma$ is parallel along $\gamma$ with respect to $\overline{\nabla}$. Thus $\gamma$ is also an $\overline{M}$ geodesic. $\Box$

From Proposition 13.46, it follows that if $M = f^{-1}(c)$ is a semi-Riemannian hypersurface, then $U = \nabla f / \|\nabla f\|$ is a unit normal for $M$ and $\langle U, U\rangle = \varepsilon = \operatorname{sgn} M$. Notice that this implies that $M = f^{-1}(c)$ is orientable if $\overline{M}$ is orientable. Thus not every semi-Riemannian hypersurface is of the form $f^{-1}(c)$. On the other hand every hypersurface is locally of this form. We are already familiar with the sphere $S^n(r)$, which is $f^{-1}(r^2)$ where $f(x) = \langle x, x\rangle = \sum_{i=1}^{n+1} (x^i)^2$.
Definition 13.81. For $n > 1$ and $0 \le \nu \le n$, we define
$$S^n_\nu(r) = \{x \in \mathbb{R}^{n+1}_\nu : \langle x, x\rangle_\nu = r^2\}.$$
$S^n_\nu(r)$ is called the pseudo-sphere of radius $r$ and index $\nu$.
Definition 13.82. For $n > 1$ and $0 \le \nu \le n$, we define
$$H^n_\nu(r) = \{x \in \mathbb{R}^{n+1}_{\nu+1} : \langle x, x\rangle_\nu = -r^2\}.$$
$H^n_\nu(r)$ is called the pseudo-hyperbolic space of radius $r$ and index $\nu$.
If $\Pi$ is a two-dimensional plane through the origin in $\mathbb{R}^{n+1}_\nu$, and if $C \subset \Pi$ is a conic section (ellipse, straight line, hyperbola, etc.), then we shall say that $C$ is a conic section in $\mathbb{R}^{n+1}_\nu$. If $Q \subset \mathbb{R}^{n+1}_\nu$ is a hyperquadric, then it is easy to show that $\Pi \cap Q$ is a conic section in $\Pi$ and hence in $\mathbb{R}^{n+1}_\nu$. Problems 4 and 5 show that geodesics in hyperquadrics can be understood once we understand the case of $S^n_\nu(1)$. With this in mind, we have

Proposition 13.83. All geodesics in $S^n_\nu(r)$ are parametrizations of the connected components of sets of the form $\Pi \cap S^n_\nu(r)$, where $\Pi$ is a plane.

a) If $\gamma$ is a timelike geodesic in $S^n_\nu(r)$, then it is a parametrization of one branch of a hyperbola.
b) If $\gamma$ is a null geodesic in $S^n_\nu(r)$, then it is a parametrization of a straight line.
c) If $\gamma$ is a spacelike geodesic in $S^n_\nu(r)$, then it is a parametrization of an ellipse (and hence periodic).
Figure 13.5. Geodesics in $S^n_1(1)$ (timelike, spacelike, and lightlike geodesics)
Proof. We follow [ON1]. Let $p \in S^n_\nu(r)$ be given and let $\Pi$ be a plane in $\mathbb{R}^{n+1}_\nu$ containing $0$ and $p$. We will show that the conic section $\Pi \cap S^n_\nu(r)$ can be parametrized as a geodesic. We identify the type of conic section and we argue that these account for all geodesics on $S^n_\nu(r)$. We restrict the scalar product $g$ on $\mathbb{R}^{n+1}_\nu$ to $\Pi$. Since $p$ is spacelike from the definition of $S^n_\nu$, we only have three possibilities for $g|_\Pi$. We handle these in turn:

(1) $g|_\Pi$ is positive definite. Choose an orthonormal basis $e_1, e_2$ for $\Pi$. Then a point $ae_1 + be_2$ of $\Pi$ is also on $S^n_\nu$ if and only if $a^2 + b^2 = r^2$. Thus $S^n_\nu(r) \cap \Pi$ is a circle in $\Pi$ and hence an ellipse in $\mathbb{R}^{n+1}_\nu$. Now $\gamma(t) := r\cos t\, e_1 + r\sin t\, e_2$ is a parametrization of $S^n_\nu(r) \cap \Pi$, and since $\langle \dot\gamma, \dot\gamma\rangle_\nu = r^2$, it is a constant speed spacelike curve. But also $\overline{\nabla}_{\partial_t}\dot\gamma = -P \circ \gamma$, where $P$ is the position vector field, so $\overline{\nabla}_{\partial_t}\dot\gamma$ is normal to $S^n_\nu$ and thus $\gamma$ is a geodesic.

(2) $g|_\Pi$ is nondegenerate with index 1. Choose an orthonormal basis $e_0, e_1$ for $\Pi$ such that $\langle e_0, e_0\rangle = -1$ and $re_1 = p$. Observe that a point $ae_0 + be_1$ of $\Pi$ is also on $S^n_\nu$ if and only if $-a^2 + b^2 = r^2$. Thus $S^n_\nu(r) \cap \Pi$ is both branches of a hyperbola. We can parametrize the branch through $p$ as
$$\gamma(t) := (r \sinh t)\, e_0 + (r \cosh t)\, e_1.$$
This time $\langle \dot\gamma, \dot\gamma\rangle_\nu = -r^2$, so $\gamma(t)$ is timelike. Furthermore, $\overline{\nabla}_{\partial_t}\dot\gamma = P \circ \gamma$, so $\overline{\nabla}_{\partial_t}\dot\gamma$ is normal to $S^n_\nu$ and thus $\gamma$ is a geodesic.

(3) $g|_\Pi$ is degenerate. In this case, the null space of $g|_\Pi$ must be of dimension 1. We choose a nonzero null vector $v$ in this null space so that $p, v$ is a basis for $\Pi$. Then a point $ap + bv$ of $\Pi$ is also on $S^n_\nu$ if and only if $a = \pm 1$, which means that $S^n_\nu(r) \cap \Pi$ is a pair of lines. The line through $p$ is parametrized as $t \mapsto p + tv$ and is a geodesic of $\mathbb{R}^{n+1}_\nu$ contained in $S^n_\nu$ and so is certainly a geodesic of $S^n_\nu$.
Finally, we argue that, up to reparametrization, this accounts for all geodesics in $S^n_\nu$. Indeed, if $\gamma$ is such a geodesic, then $v = \dot\gamma(0)$ is based at $p = \gamma(0)$ and there is a unique plane $\Pi$ through the origin containing $p$ and $v$. By uniqueness, $\gamma$ must be a reparametrization of one of the geodesics already discovered above. $\Box$
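The three cases of the proof can be checked numerically on a low-dimensional example. The sketch below assumes the pseudo-sphere of radius 2 in $\mathbb{R}^3_1$ with the first coordinate timelike; the radius, the choice of planes, the function names, and the finite-difference step are all illustrative assumptions rather than anything taken from the text. For each curve it checks that the curve stays on the pseudo-sphere, reads off the causal character from $\langle \dot\gamma, \dot\gamma\rangle$, and verifies that the ambient acceleration has no tangential part.

```python
import math

def lorentz(u, v):
    """Scalar product of index 1 on R^3 (first coordinate timelike)."""
    return -u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

R = 2.0  # radius, chosen for illustration

def spacelike_geodesic(t):
    # Case (1): the plane spanned by e1, e2 is positive definite.
    return (0.0, R * math.cos(t), R * math.sin(t))

def timelike_geodesic(t):
    # Case (2): the plane spanned by e0, e1 has index 1.
    return (R * math.sinh(t), R * math.cosh(t), 0.0)

def null_geodesic(t):
    # Case (3): p + t v with p = (0, R, 0) and v = (1, 0, 1) null, orthogonal to p.
    return (t, R, t)

def velocity(curve, t, h=1e-4):
    return tuple((a - b) / (2.0 * h) for a, b in zip(curve(t + h), curve(t - h)))

def acceleration(curve, t, h=1e-4):
    return tuple((a - 2.0 * b + c) / h**2
                 for a, b, c in zip(curve(t + h), curve(t), curve(t - h)))

def speed_squared(curve, t):
    v = velocity(curve, t)
    return lorentz(v, v)

def tangential_acceleration(curve, t):
    """Ambient acceleration minus its component along the normal direction x."""
    x, acc = curve(t), acceleration(curve, t)
    coeff = lorentz(acc, x) / lorentz(x, x)
    return tuple(a - coeff * b for a, b in zip(acc, x))

curves = [spacelike_geodesic, timelike_geodesic, null_geodesic]
on_sphere = [lorentz(c(0.5), c(0.5)) for c in curves]
speeds = [speed_squared(c, 0.5) for c in curves]
residuals = [max(abs(a) for a in tangential_acceleration(c, 0.5)) for c in curves]
```

Up to discretization error, `speeds` should come out close to $r^2$, $-r^2$ and $0$ respectively, matching the spacelike, timelike and null cases, while `residuals` should be close to zero in all three.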
13.5. Riemannian Manifolds and Distance

In this section we consider only Riemannian manifolds (definite metrics). Then we have the notion of the length of a curve (Definition 13.5). Using this we can then define a distance function (a metric in the sense of "metric space") as follows: Let $p, q \in M$. Consider the set $\mathrm{Path}(p, q)$ consisting of all piecewise smooth¹ curves that begin at $p$ and end at $q$. We define the Riemannian distance from $p$ to $q$ as
$$\operatorname{dist}(p, q) = \inf\{L(c) : c \in \mathrm{Path}(p, q)\}. \tag{13.8}$$
On a general Riemannian manifold, $\operatorname{dist}(p, q) = r$ does not necessarily mean that there must be a curve connecting $p$ to $q$ having length $r$. To see this, just consider the points $(-1, 0)$ and $(1, 0)$ on the punctured plane $\mathbb{R}^2 \setminus \{0\}$.
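The punctured plane example can be made concrete with a family of broken lines that dodge the origin. This is a minimal sketch, assuming the flat metric on $\mathbb{R}^2 \setminus \{0\}$; the function name and the particular detour through $(0, \epsilon)$ are illustrative choices. The lengths approach 2 but stay strictly above it, while the only curve of length exactly 2 in the plane, the straight segment, passes through the deleted origin.

```python
import math

def detour_length(eps):
    """Length of the broken line (-1,0) -> (0,eps) -> (1,0), which avoids 0 for eps > 0."""
    return 2.0 * math.hypot(1.0, eps)

# Lengths for eps = 1e-1, 1e-2, ..., 1e-5.
lengths = [detour_length(10.0 ** (-k)) for k in range(1, 6)]
```

The list `lengths` decreases toward 2, so the infimum in (13.8) equals 2 here even though no admissible curve attains it.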
Definition 13.84. If $p \in M$ is a point in a Riemannian manifold and $R > 0$, then the set $B_R(p)$ (also denoted $B(p, R)$) defined by $B_R(p) = \{q \in M : \operatorname{dist}(p, q) < R\}$ is called an open geodesic ball centered at $p$ with radius $R$.

It is important to notice that unless $R$ is small enough, $B_R(p)$ may not be homeomorphic to a ball in a Euclidean space. To see this just consider a ball of large radius on a circular cylinder of small diameter.

Proposition 13.85. Let $U$ be a normal neighborhood of a point $p$ in a Riemannian manifold $(M, g)$. If $q \in U$ and if $\gamma : [0, 1] \to M$ is the radial geodesic such that $\gamma(0) = p$ and $\gamma(1) = q$, then $\gamma$ is the unique shortest curve in $U$ (up to reparametrization) connecting $p$ to $q$.

Proof. Let $\alpha$ be a curve connecting $p$ to $q$ (refer to Figure 13.6). Without loss of generality we may take the domain of $\alpha$ to be $[0, b]$. Let $\frac{\partial}{\partial r}$ be the radial unit vector field in $U$. Then if we define the vector field $R$ along $\alpha$ by $t \mapsto \frac{\partial}{\partial r}\big|_{\alpha(t)}$, we may write $\dot\alpha = \langle R, \dot\alpha\rangle R + N$ for some field $N$ normal to
¹ Recall that by our conventions, a piecewise smooth curve is assumed to be continuous.
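$R$ (but note that $N(0) = 0$). We now have
$$L(\alpha) = \int_0^b \langle \dot\alpha, \dot\alpha\rangle^{1/2}\, dt = \int_0^b \left[\langle R, \dot\alpha\rangle^2 + \langle N, N\rangle\right]^{1/2} dt \ge \int_0^b |\langle R, \dot\alpha\rangle|\, dt \ge \int_0^b \langle R, \dot\alpha\rangle\, dt = \int_0^b \frac{d}{dt}(r \circ \alpha)\, dt = r(\alpha(b)) = r(q).$$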
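On the other hand, if $v = \dot\gamma(0)$, then $r(q) = \int_0^1 \|v\|\, dt = \int_0^1 \langle \dot\gamma, \dot\gamma\rangle^{1/2}\, dt$ so $L(\alpha) \ge L(\gamma)$. Now we show that if $L(\alpha) = L(\gamma)$, then $\alpha$ is a reparametrization of $\gamma$. Indeed, if $L(\alpha) = L(\gamma)$, then all of the above inequalities must be equalities so that $N$ must be identically zero and $\frac{d}{dt}(r \circ \alpha) = \langle R, \dot\alpha\rangle = |\langle R, \dot\alpha\rangle|$. It follows that $\dot\alpha = \langle R, \dot\alpha\rangle R = \left(\frac{d}{dt}(r \circ \alpha)\right) R$, and so $\alpha$ travels radially from $p$ to $q$ and must be a reparametrization of $\gamma$. $\Box$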
Figure 13.6. Normal neighborhood of p
It is important to notice that the uniqueness assertion of Proposition 13.85 only refers to curves with image in $U$. This is in contrast to the proposition below.

Proposition 13.86. Let $p_0$ be a point in a Riemannian manifold $M$. There exists a number $\varepsilon_0(p_0) > 0$ such that for all $\varepsilon$ with $0 < \varepsilon \le \varepsilon_0(p_0)$ we have the following:

(i) The open geodesic ball $B(p_0, \varepsilon)$ is normal and has the form
$$B(p_0, \varepsilon) = \exp_{p_0}\{v \in T_{p_0}M : \|v\| < \varepsilon\}.$$
(ii) For any $p \in B(p_0, \varepsilon)$, the radial geodesic segment connecting $p_0$ to $p$ is the shortest curve in $M$, up to parametrization, from $p_0$ to $p$. (Note carefully that we now mean the shortest curve among curves into $M$ rather than just the shortest among curves with image in $B(p_0, \varepsilon)$.)
Proof. Let $\tilde U \subset T_{p_0}M$ be chosen so that $U = \exp_{p_0}\tilde U$ is a normal neighborhood of $p_0$. Then for sufficiently small $\varepsilon > 0$ the ball
$$\tilde B(0, \varepsilon) := \{v \in T_{p_0}M : \|v\| < \varepsilon\}$$
is a starshaped open set in $\tilde U$, and so $A_{p_0,\varepsilon} = \exp_{p_0}(\tilde B(0, \varepsilon))$ is a normal neighborhood of $p_0$. From Proposition 13.85 we know that the radial geodesic segment $\sigma$ from $p_0$ to $p$ is the shortest curve in $A_{p_0,\varepsilon}$ from $p_0$ to $p$. This curve has length less than $\varepsilon$. We claim that any curve from $p_0$ to $p$ whose image leaves $A_{p_0,\varepsilon}$ must have length at least $\varepsilon$. Once this claim is proved, it is easy to see that
$$A_{p_0,\varepsilon} = B(p_0, \varepsilon) = \{p \in M : \operatorname{dist}(p_0, p) < \varepsilon\}$$
and that (ii) holds. Now suppose that $\alpha : [a, b] \to M$ is a curve from $p_0$ to $p$ which leaves $A_{p_0,\varepsilon}$. Then for any $r > 0$ with $r < \varepsilon$, the curve $\alpha$ must meet the set $S(r) := \exp_{p_0}(\{v \in T_{p_0}M : \|v\| = r\})$ at some first parameter value $t_1 \in [a, b]$. Then $\alpha_1 := \alpha|_{[a, t_1]}$ lies in $A_{p_0,\varepsilon}$, and Proposition 13.85 tells us that $L(\alpha) \ge L(\alpha|_{[a, t_1]}) \ge r$. Since this is true for all $r < \varepsilon$, we have $L(\alpha) \ge \varepsilon$, which is what was claimed. $\Box$
Theorem 13.87 (Distance topology). Given a Riemannian manifold, define the distance function $\operatorname{dist}$ as before. Then $(M, \operatorname{dist})$ is a metric space, and the metric topology coincides with the manifold topology on $M$.

Proof. To show that $\operatorname{dist}$ is a true distance function (metric) we must prove that

(1) $\operatorname{dist}$ is symmetric: $\operatorname{dist}(p, q) = \operatorname{dist}(q, p)$;
(2) $\operatorname{dist}$ satisfies the triangle inequality $\operatorname{dist}(p, q) \le \operatorname{dist}(p, x) + \operatorname{dist}(x, q)$;
(3) $\operatorname{dist}(p, q) \ge 0$; and
(4) $\operatorname{dist}(p, q) = 0$ if and only if $p = q$.

Now, (1) is obvious, and (2) and (3) are clear from the properties of the integral and the metric tensor. For (4) suppose that $p \ne q$. Then since $M$ is Hausdorff, we can find a normal neighborhood $U$ of $p$ that does not contain $q$. In fact, by the previous proposition, we may take $U$ to be of the form $B(p, \varepsilon)$. Since (by the proof of the previous proposition) every curve starting at $p$ and leaving $B(p, \varepsilon)$ must have length at least $\varepsilon/2$, we see that $\operatorname{dist}(p, q) \ge \varepsilon/2$. $\Box$

By definition a curve segment in a Riemannian manifold, say $c : [a, b] \to M$, is a shortest curve if $L(c) = \operatorname{dist}(c(a), c(b))$. We say that such a curve is (absolutely) length minimizing. Such curves must be geodesics.

Proposition 13.88. Let $M$ be a Riemannian manifold. A length minimizing curve $c : [a, b] \to M$ must be an (unbroken) geodesic.
Proof. There exist numbers $t_i$ with $a = t_0 < t_1 < \cdots < t_k = b$ such that for each subinterval $[t_i, t_{i+1}]$, the restricted curve $c|_{[t_i, t_{i+1}]}$ has image in a totally star-shaped open set. Thus since $c|_{[t_i, t_{i+1}]}$ is minimizing, it must be a reparametrization of a unit speed geodesic (use the uniqueness part of Proposition 13.85). Thus there is a reparametrization of $c$ that is a broken geodesic. But this new reparametrized curve is also length minimizing, and so by Problem 1 it is smooth. $\Box$
13.6. Lorentz Geometry

In this section we define and discuss a few aspects of Lorentz manifolds. Lorentz manifolds play a prominent role in physics and are often singled out for special study. We discuss the local length maximizing property of timelike geodesics in a Lorentz manifold and derive the Lorentzian analogue of Proposition 13.85.

Definition 13.89. A Lorentz vector space is a scalar product space with index equal to one and dimension greater than or equal to 2. A Lorentz manifold is a semi-Riemannian manifold such that each tangent space is a Lorentz space with the scalar product given by the metric tensor.

Under our conventions, the signature of a Lorentz manifold is of the form $(-1, 1, \ldots, 1, 1)$.² Each tangent space of a Lorentz manifold is a Lorentz vector space, and so we first take a closer look at some of the distinctive features of Lorentz vector spaces. Let us now agree to classify the zero vector in a Lorentz space as spacelike. For Lorentz spaces, we may classify subspaces into three categories:

Definition 13.90. Let $V$ be a Lorentz vector space such as a tangent space of a Lorentz manifold. A subspace $W \subset V$ is called

(1) spacelike if $g|_W$ is positive definite (or if $W$ is the zero subspace);
(2) timelike if $g|_W$ is nondegenerate with index 1;
(3) lightlike if $g|_W$ is degenerate.

Thus a subspace falls into one of the three types, which we refer to as its causal character. If we take a timelike vector $v$ in a Lorentz space $V$, then $\mathbb{R}v$, the space spanned by $v$, is nondegenerate and has index 1. By Lemma 7.47, $v^\perp$ is nondegenerate and $V = \mathbb{R}v \oplus v^\perp$. Since $1 = \operatorname{ind}(V) = \operatorname{ind}(\mathbb{R}v) + \operatorname{ind}(v^\perp)$, it follows that $\operatorname{ind}(v^\perp) = 0$, so that $v^\perp$ is spacelike. This little observation is useful enough to set out as a proposition.

² Some authors use $(1, -1, \ldots, -1, -1)$, but this does not really change the geometry.
Figure 13.7. Causal character of a subspace
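For two-dimensional subspaces of a Lorentz vector space, the trichotomy of Definition 13.90 can be read off from the Gram determinant of a basis: a positive determinant forces the restricted form to be positive definite (negative definite is impossible on a plane when the index of the ambient space is 1), a negative determinant means index 1, and a vanishing determinant means degeneracy. The following is a small sketch of that observation; the function names, the tolerance, and the choice of Minkowski space with the first coordinate timelike are assumptions made for illustration.

```python
def lorentz(u, v):
    """Scalar product of index 1 with the first coordinate timelike."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def causal_character_of_plane(u, w, tol=1e-12):
    """Causal character of span{u, w}; u and w are assumed linearly independent."""
    guu, gww, guw = lorentz(u, u), lorentz(w, w), lorentz(u, w)
    det = guu * gww - guw * guw  # Gram determinant of the restricted form
    if abs(det) <= tol:
        return "lightlike"
    return "spacelike" if det > 0 else "timelike"

e0, e1, e2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
null_vector = (1.0, 1.0, 0.0)  # e0 + e1

examples = {
    "span{e1, e2}": causal_character_of_plane(e1, e2),
    "span{e0, e1}": causal_character_of_plane(e0, e1),
    "span{e0 + e1, e2}": causal_character_of_plane(null_vector, e2),
}
```

The tolerance only matters for floating point input; for a general subspace of higher dimension one would instead compute the signature of the full Gram matrix.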
Proposition 13.91. If $V$ is a Lorentz vector space and $v$ is a timelike element, then $v^\perp$ is spacelike, and we have the orthogonal direct sum $V = \mathbb{R}v \oplus v^\perp$.

Exercise 13.92. Show that if $W$ is a subspace of a Lorentz space, then $W$ is timelike if and only if $W^\perp$ is spacelike.

Exercise 13.93. Suppose that $v, w$ are linearly independent null vectors in a Lorentz space $V$. Show that $\langle v, w\rangle \ne 0$. [Hint: Use an orthonormal basis to orthogonally decompose $V = \mathbb{R}e_0 \oplus P$, where $e_0$ is timelike and where $\langle \cdot, \cdot\rangle$ is positive definite on $P$. Suppose $\langle v, w\rangle = 0$; write $v = \alpha e_0 + p_1$ and $w = \beta e_0 + p_2$. Then show that $\langle p_1, p_2\rangle = \alpha\beta$, $\langle p_1, p_1\rangle = \alpha^2$, $\langle p_2, p_2\rangle = \beta^2$ and $|\langle p_1, p_2\rangle| = \|p_1\|\|p_2\|$.]
Lemma 13.94. Let $W$ be a subspace of a Lorentz space. Then the following conditions are equivalent:

(i) $W$ is timelike and so a Lorentz space in its own right.
(ii) There exist null vectors $v, w \in W$ that are linearly independent.
(iii) $W$ contains a timelike vector.

Proof. Suppose (i) holds. Let $e_1, \ldots, e_m$ be an orthonormal basis for $W$ with $e_1$ timelike. Then $e_1 + e_2$ and $e_1 - e_2$ are both null and, taken together, are a linearly independent pair so that (ii) holds. Now suppose that (ii) holds and let $v, w$ be a linearly independent pair of null vectors. By Exercise 13.93 above, either $v + w$ or $v - w$ must be timelike so we have (iii). Finally, suppose (iii) holds and $v \in W$ is timelike. Since $v^\perp$ is spacelike and $W^\perp \subset v^\perp$, we
see that $W^\perp$ is spacelike. But then $W$ is timelike by Exercise 13.92 so that (i) holds. $\Box$

Exercise 13.95. Use the above lemma to prove that if $W$ is a nontrivial subspace of a Lorentz space, then the following three conditions are equivalent:

(i) $W$ contains a nonzero null vector but no timelike vector.
(ii) $W$ is lightlike.
(iii) The intersection of $W$ with the nullcone is one-dimensional.

Definition 13.96. The timecone determined by a timelike vector $v$ is the set $C(v) := \{w \in V : w \text{ is timelike and } \langle v, w\rangle < 0\}$.

In Problem 6 we ask the reader to show that timelike vectors $v$ and $w$ in a Lorentz space $V$ are in the same timecone if and only if $\langle v, w\rangle < 0$.

Exercise 13.97. Show that there are exactly two timecones in a Lorentz vector space whose union is the set of all nonzero timelike vectors. Describe the relation of the nullcone to the timecones.

Now we come to an aspect of Lorentz spaces that underlies the twins paradox of special relativity.

Proposition 13.98. If $v, w$ are timelike elements of a Lorentz vector space then we have the backward Schwarz inequality
$$|\langle v, w\rangle| \ge \|v\|\|w\|,$$
with equality only if $v$ is a scalar multiple of $w$. Also, if $v$ and $w$ are in the same timecone, then there is a uniquely determined number $\alpha \ge 0$, called the hyperbolic angle between $v$ and $w$, such that
$$\langle v, w\rangle = -\|v\|\|w\| \cosh\alpha.$$
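Note: The minus sign appears because of our convention that $\langle v, v\rangle < 0$ for timelike vectors.

Proof. We may write $w = av + z$ where $z \in v^\perp$. We have
$$a^2\langle v, v\rangle + \langle z, z\rangle = \langle w, w\rangle < 0.$$
Using this and recalling that $\langle v, v\rangle < 0$ and that $z$ is spacelike, we have
$$\langle v, w\rangle^2 = a^2\langle v, v\rangle^2 = \left(\langle w, w\rangle - \langle z, z\rangle\right)\langle v, v\rangle \ge \langle v, v\rangle\langle w, w\rangle = \|v\|^2\|w\|^2.$$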
Equality holds exactly when $\langle z, z\rangle = 0$, but since $z \in v^\perp$, this implies that $z = 0$ so $w = av$. Using Problem 6, we see that since $v$ and $w$ are in the same timecone, we have $\langle v, w\rangle < 0$, and hence
$$\frac{\langle v, w\rangle}{-\|v\|\|w\|} \ge 1.$$
The properties of the function $\cosh$ now give a unique number $\alpha \ge 0$ such that $\langle v, w\rangle = -\|v\|\|w\| \cosh\alpha$ as required. $\Box$
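The hyperbolic angle of Proposition 13.98 is easy to compute in coordinates. The sketch below assumes Minkowski space with the first coordinate timelike and takes $\|v\| = |\langle v, v\rangle|^{1/2}$; the function names and the sample vectors are illustrative assumptions. The second vector is obtained from the first by a boost of rapidity 0.8, so the computed angle should be 0.8.

```python
import math

def lorentz(u, v):
    """Scalar product of index 1 with the first coordinate timelike."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def norm(v):
    return math.sqrt(abs(lorentz(v, v)))

def hyperbolic_angle(v, w):
    """The number a >= 0 with <v, w> = -|v||w| cosh(a), for v, w in one timecone."""
    if lorentz(v, v) >= 0 or lorentz(w, w) >= 0:
        raise ValueError("both vectors must be timelike")
    if lorentz(v, w) >= 0:
        raise ValueError("vectors must lie in the same timecone")
    ratio = -lorentz(v, w) / (norm(v) * norm(w))
    return math.acosh(max(ratio, 1.0))  # clamp guards against rounding below 1

v = (1.0, 0.0, 0.0, 0.0)
w = (math.cosh(0.8), math.sinh(0.8), 0.0, 0.0)
angle = hyperbolic_angle(v, w)
backward_schwarz_holds = abs(lorentz(v, w)) >= norm(v) * norm(w) - 1e-12
```

Scaling either vector by a positive constant leaves the angle unchanged, as the defining formula suggests.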
Corollary 13.99. If $v, w$ are timelike elements of a Lorentz vector space which are in the same timecone, then we have the backward triangle inequality:
$$\|v\| + \|w\| \le \|v + w\|.$$
Equality holds only if $v$ is a scalar multiple of $w$.

Proof. Since $\langle v, w\rangle < 0$ by hypothesis, we have $-\langle v, w\rangle \ge \|v\|\|w\|$ by the proposition. Then
$$(\|v\| + \|w\|)^2 = \|v\|^2 + 2\|w\|\|v\| + \|w\|^2 \le \|v\|^2 - 2\langle v, w\rangle + \|w\|^2 = \|v + w\|^2.$$
Equality happens only if $-\langle v, w\rangle = \|v\|\|w\|$, which, by the previous proposition, means that $v$ is a scalar multiple of $w$. $\Box$
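The backward triangle inequality is the algebraic core of the twins paradox mentioned earlier: a broken timelike path from one event to another accumulates less proper time than the straight one. Here is a minimal numerical illustration in $\mathbb{R}^3_1$, with the first coordinate timelike; the vectors and the function names are hypothetical sample choices, not taken from the text.

```python
import math

def lorentz(u, v):
    """Scalar product of index 1 with the first coordinate timelike."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def proper_time(v):
    """Norm ||v|| = sqrt(-<v, v>) of a timelike vector v."""
    q = lorentz(v, v)
    if q >= 0:
        raise ValueError("vector is not timelike")
    return math.sqrt(-q)

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

# Outbound and return legs of a traveller; both are future pointing timelike.
v = (2.0, 1.0, 0.0)
w = (2.0, -1.0, 0.0)

broken = proper_time(v) + proper_time(w)   # traveller's elapsed proper time
direct = proper_time(add(v, w))            # stay-at-home elapsed proper time
```

With these sample vectors, `broken` is $2\sqrt{3}$ and `direct` is 4, so the travelling twin ages less.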
In relativity theory, spacetime (the set of all "idealized" possible events) is modeled as a 4-dimensional Lorentz manifold and the paths of massive bodies are to be timelike curves. But we have yet to talk about what distinguishes the past from the future! In each tangent space we have two timecones and we could arbitrarily choose one of them to be the future timecone. But what we really want is a smooth way of choosing a future timecone in each tangent space. This leads to the notion of time orientability. First we say that a vector field $X$ on a Lorentz manifold $M$ is timelike if $X(p)$ is timelike for each $p \in M$.

Definition 13.100. A Lorentz manifold $M$ is said to be time-orientable if and only if there exists a timelike vector field $X \in \mathfrak{X}(M)$. A time orientation of $M$ is a choice of timecone $C(p) \subset T_pM$ for each $p$ such that there exists a timelike $X \in \mathfrak{X}(M)$ with $X_p \in C(p)$ for each $p$. In the latter case, $C(p)$ is referred to as the positive or future timecone at $p$. The other timecone in $T_pM$ is called the negative timecone. Timelike vectors in the positive timecone are said to be future pointing (and those in the negative timecone are past pointing).

Definition 13.101. A lightlike vector in a tangent space $T_pM$ of a time-oriented Lorentz manifold $M$ is said to be future pointing if it is the limit of
a sequence of future pointing timelike vectors in $T_pM$. Thus the light cone (nullcone) in $T_pM$ is partitioned into future lightcone and past lightcone.

Based on these definitions we can speak of timelike or lightlike curves as being either "future pointing" or "past pointing". Time orientability is certainly a global condition since we need a choice of timecone in every tangent space and this choice must be made smoothly. However, the following exercise shows that the smoothness condition can be described in terms of local vector fields:

Exercise 13.102. Suppose that it is possible to choose a timecone $C(p)$ in every tangent space $T_pM$ of a Lorentz manifold $M$ in such a way that in a neighborhood of each point there is a local smooth vector field with values in these timecones. Show that this implies that $M$ is time-orientable. [Hint: Use a partition of unity argument.]

Exercise 13.103. Show that $S^n_1(r) := \{p \in \mathbb{R}^{n+1}_1 : \langle p, p\rangle = r^2\}$ is a time-orientable Lorentz manifold. [Hint: consider the restriction to $S^n_1(r)$ of the first coordinate vector field from $\mathbb{R}^{n+1}_1$.]
Now let us consider timelike curves in a Lorentz manifold $M$. Since we want to include piecewise smooth curves, we have to decide what timelike should mean. If $\gamma : I \to M$ is a piecewise smooth curve and $t_i \in I$ is a parameter value at which $\gamma$ is not smooth, then what condition is appropriate if we are to refer to the curve as a timelike curve? We consider the one-sided limits
$$\dot\gamma(t_i^+) := \lim_{\varepsilon \downarrow 0} \dot\gamma(t_i + \varepsilon) \quad \text{and} \quad \dot\gamma(t_i^-) := \lim_{\varepsilon \downarrow 0} \dot\gamma(t_i - \varepsilon).$$
For many purposes, the following definition is appropriate:

Definition 13.104. Let $M$ be a Lorentz manifold. A piecewise smooth curve $\gamma : I \to M$ is called timelike if

(i) $\dot\gamma$ is timelike where it is smooth, and
(ii) for every $t_i$ where $\gamma$ is not smooth, we have that $\dot\gamma(t_i^+)$ and $\dot\gamma(t_i^-)$ are timelike and the following further condition holds at each such $t_i$:
$$\langle \dot\gamma(t_i^+), \dot\gamma(t_i^-)\rangle < 0.$$

Thus, for timelike curves, $\dot\gamma(t_i^+)$ and $\dot\gamma(t_i^-)$ are in the same timecone. Following [ON1], we next prove a useful technical lemma.

Lemma 13.105. Let $p$ be a point in a Lorentz manifold $M$ and $U$ a normal neighborhood of $p$. Let $\tilde U$ be the corresponding starshaped open set in $T_pM$ with $U = \exp_p(\tilde U)$. Let $\gamma : [0, b] \to \tilde U \subset T_pM$ be a piecewise smooth curve such that $\alpha := \exp_p \circ \gamma : [0, b] \to M$ is timelike (in the sense of Definition
13.104). Then the image of $\gamma$ is contained in a single timecone of $T_pM$ and $\langle \dot\alpha, P\rangle < 0$.

Proof. Let us first handle the case where $\gamma$ is smooth. Then since $\dot\gamma(0)$ is timelike, $\gamma$ is initially in one of the timecones, which we denote by $C(p)$. Here and below, "initially" is taken to mean "for all sufficiently small positive parameter values $t$". Let $\tilde P$ be the position vector field in $T_pM$ and let $P$ be the local position vector field which is $\exp_p$-related to $\tilde P$ and defined on the normal neighborhood $U$. Note that $\tilde P$ is timelike and outward radial at each point of $\tilde U \cap C(p)$. Thus $\langle \dot\gamma, \tilde P\rangle$ is initially negative. Letting $\tilde q(x) := \langle x, x\rangle$ for $x \in T_pM$ and considering $T_pM$ as a Lorentz manifold itself, we have $\operatorname{grad} \tilde q = 2\tilde P$ and hence $\frac{d}{dt}(\tilde q \circ \gamma) = 2\langle \dot\gamma, \tilde P\rangle$. The Gauss lemma (Lemma 13.72) gives
$$\langle \dot\gamma, \tilde P\rangle = \langle \dot\alpha, P\rangle,$$
which implies that $\langle \dot\alpha, P\rangle$ and hence $\frac{d}{dt}(\tilde q \circ \gamma)$ are initially negative. For any $t > 0$ such that $\gamma(t)$ is in $C(p)$, the vector $\tilde P(\gamma(t))$, and hence $P(\alpha(t))$, must be timelike. For such $t$, $\langle \dot\alpha, P\rangle < 0$, which implies that $\langle \dot\gamma, \tilde P\rangle < 0$ and hence $\frac{d}{dt}(\tilde q \circ \gamma) < 0$. So $\tilde q \circ \gamma$ starts out negative and goes downhill as long as $\gamma(t)$ is in $C(p)$. Since $\gamma$ can only exit $C(p)$ by reaching the nullcone (or $0$) where $\tilde q$ vanishes, we see that $\gamma$ must remain inside $C(p)$.

Now we consider what happens if $\gamma$ is timelike but merely piecewise smooth. The first segment remains in $C(p)$, and at the first parameter value $t_1$ where $\gamma$ fails to be smooth, we must have $\langle \dot\gamma(t_1^-), \tilde P\rangle < 0$. But then by the Gauss lemma again $\langle \dot\alpha(t_1^-), P\rangle < 0$. The technical restriction of Definition 13.104 forces $\dot\alpha(t_1^+)$ into the same timecone as $\dot\alpha(t_1^-)$, so that $\langle \dot\alpha(t_1^+), P\rangle < 0$. Applying the Gauss lemma gives $\langle \dot\gamma(t_1^+), \tilde P\rangle < 0$ and so $\frac{d}{dt}(\tilde q \circ \gamma)$ cannot change sign at $t_1$. We are now set up to repeat the argument for the next segment. The result follows inductively. $\Box$
The following proposition for Lorentz manifolds should be compared to Proposition 13.85 proved for Riemannian manifolds. In this proposition we find that timelike geodesics are locally longer than nearby curves.

Proposition 13.106. Let $U$ be a normal neighborhood of a point $p$ in a Lorentz manifold. If the radial geodesic $\gamma$ connecting $p$ to $q \in U$ is timelike, then it is the unique longest timelike curve segment in $U$ that connects $p$ to $q$. Once again, uniqueness is up to reparametrization.

Proof. Let $\tilde U$ be related to $U$ as usual. Take any timelike curve segment $\alpha : [0, b] \to U$ in $U$ that connects $p$ to $q$. By the previous lemma, $\beta := \exp_p^{-1} \circ \alpha$ stays inside a single timecone $C(p)$ and so also inside $C(p) \cap \tilde U$. Thus $\alpha$ stays inside $\exp_p(C(p) \cap \tilde U)$ where $P$ is timelike and where the field $R = (P/r) \circ \alpha$ is
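a unit timelike field along $\alpha$. We now seek to imitate the proof of Proposition 13.85. We may decompose $\dot\alpha$ as
$$\dot\alpha = -\langle R, \dot\alpha\rangle R + N,$$
where $N$ is a spacelike field along $\alpha$ that is orthogonal to $R$. We have
$$\|\dot\alpha\| = \sqrt{-\langle \dot\alpha, \dot\alpha\rangle} = \sqrt{\langle R, \dot\alpha\rangle^2 - \langle N, N\rangle} \le |\langle R, \dot\alpha\rangle|.$$
Recall that $\tilde q(\cdot) = \langle \cdot, \cdot\rangle$ and $q := \tilde q \circ \exp_p^{-1}$. Since $r = \sqrt{-q}$ and so $\operatorname{grad} r = -P/r$, we have $(\operatorname{grad} r) \circ \alpha = -R$. By the previous lemma $\langle \dot\alpha, R\rangle$ is negative. Then
$$|\langle \dot\alpha, R\rangle| = -\langle R, \dot\alpha\rangle = \frac{d(r \circ \alpha)}{dt}.$$
Thus we have
$$L(\alpha) = \int_0^b \|\dot\alpha(t)\|\, dt \le r(q) = L(\gamma).$$
If $L(\alpha) = L(\gamma)$, then $N = 0$ and we argue as in the Riemannian case to conclude that $\alpha$ is the same as $\gamma$ up to reparametrization. $\Box$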
Recall that the arc length of a timelike curve is often called the curve's proper time and is thought of as the intrinsic duration of the curve. We may reparametrize a timelike geodesic to have unit speed. The parameter is then an arc length parameter, which is often referred to as a proper time parameter. We may restate the previous proposition to say that the unit speed timelike geodesic connecting $p$ to $q$ in $U$ is the unique curve of maximum proper time among timelike curves connecting $p$ to $q$ in $U$.
13.7. Jacobi Fields

Once again we consider a semi-Riemannian manifold $(M, g)$ of arbitrary index. We shall be dealing with smooth two-parameter maps $h : (-\epsilon, \epsilon) \times [a, b] \to M$. The partial maps $t \mapsto h_s(t) = h(s, t)$ are called the longitudinal curves, and the curves $s \mapsto h(s, t)$ are called the transverse curves. Let $\alpha$ be the center longitudinal curve $t \mapsto h_0(t)$. The vector field along $\alpha$ defined by $V(t) = \frac{\partial}{\partial s}\big|_{s=0} h_s(t)$ is called the variation vector field along $\alpha$. We will use the following important result more than once:

Lemma 13.107. Let $Y$ be a vector field along the smooth map $h : (-\epsilon, \epsilon) \times [a, b] \to M$. Then
$$\nabla_{\partial_s}\nabla_{\partial_t} Y - \nabla_{\partial_t}\nabla_{\partial_s} Y = R(\partial_s h, \partial_t h)\, Y.$$

Proof. If one computes in a local chart, the result falls out after a mildly tedious computation, which we leave to the curious reader. $\Box$
Suppose we have the special situation that, for each $s$, the partial maps $t \mapsto h_s(t)$ are geodesics. In this case, let us denote the center geodesic $t \mapsto h_0(t)$ by $\gamma$. We call $h$ a variation of $\gamma$ through geodesics. Let $h$ be such a special variation and $V$ the variation vector field. Using Lemma 13.107 and the result of Exercise 13.11 we compute
$$\nabla_{\partial_t}\nabla_{\partial_t} V = \nabla_{\partial_t}\nabla_{\partial_t}\partial_s h = \nabla_{\partial_t}\nabla_{\partial_s}\partial_t h = \nabla_{\partial_s}\nabla_{\partial_t}\partial_t h + R(\partial_t h, \partial_s h)\partial_t h = R(\partial_t h, \partial_s h)\partial_t h,$$
and evaluating at $s = 0$ we get $\nabla_{\partial_t}\nabla_{\partial_t} V(t) = R(\dot\gamma(t), V(t))\dot\gamma(t)$. This equation is important and shows that $V$ is a Jacobi field as per the following definition:
Definition 13.108. Let $\gamma : [a, b] \to M$ be a geodesic and let $J \in \mathfrak{X}_\gamma(M)$ be a vector field along $\gamma$. The field $J$ is called a Jacobi field if
$$\nabla_{\partial_t}\nabla_{\partial_t} J = R(\dot\gamma(t), J(t))\dot\gamma(t)$$
for all $t \in [a, b]$.
In local coordinates, we recognize the above as a second order system of linear differential equations and we easily arrive at the following

Theorem 13.109. Let $(M, g)$ and the geodesic $\gamma : [a, b] \to M$ be as above. Given $w_1, w_2 \in T_{\gamma(a)}M$, there is a unique Jacobi field $J^{w_1, w_2} \in \mathfrak{X}_\gamma(M)$ such that $J(a) = w_1$ and $\nabla_{\partial_t} J(a) = w_2$. The set $\operatorname{Jac}(\gamma)$ of all Jacobi fields along $\gamma$ is a vector space isomorphic to $T_{\gamma(a)}M \times T_{\gamma(a)}M$.

We now examine the more general case of a Jacobi field $J^{w_1, w_2}$ along a geodesic $\gamma : [a, b] \to M$. First notice that for any curve $\alpha : [a, b] \to M$ with $|\langle \dot\alpha(t), \dot\alpha(t)\rangle| > 0$ for all $t \in [a, b]$, any vector field $Y$ along $\alpha$ decomposes into an orthogonal sum $Y^\top + Y^\perp$. This means that $Y^\top$ is a multiple of $\dot\alpha$ and that $Y^\perp$ is normal to $\dot\alpha$. If $\gamma : [a, b] \to M$ is a geodesic, then $\nabla_{\partial_t} Y^\perp$ is also normal to $\dot\gamma$ since $0 = \frac{d}{dt}\langle Y^\perp, \dot\gamma\rangle = \langle \nabla_{\partial_t} Y^\perp, \dot\gamma\rangle + \langle Y^\perp, \nabla_{\partial_t}\dot\gamma\rangle = \langle \nabla_{\partial_t} Y^\perp, \dot\gamma\rangle$. Similarly, $\nabla_{\partial_t} Y^\top$ is parallel to $\dot\gamma$ all along $\gamma$.
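Theorem 13.109 says that a Jacobi field is determined by a linear initial value problem, so in simple situations it can be integrated numerically. The sketch below makes strong simplifying assumptions that are not part of the text: the manifold is Riemannian with constant sectional curvature $K$, the geodesic has unit speed, and $J = fE$ for a parallel unit field $E$ normal to $\dot\gamma$. Under those assumptions the Jacobi equation reduces to the scalar equation $f'' = -Kf$. The solver, its step count, and the function names are illustrative choices.

```python
import math

def solve_normal_jacobi(K, t_end, steps=2000):
    """Integrate f'' = -K f with f(0) = 0, f'(0) = 1 by classical RK4.

    Returns a list of samples (t, f(t)).
    """
    h = t_end / steps
    t, f, df = 0.0, 0.0, 1.0
    samples = [(t, f)]
    for _ in range(steps):
        # First-order system: (f, df)' = (df, -K f).
        k1 = (df, -K * f)
        k2 = (df + 0.5 * h * k1[1], -K * (f + 0.5 * h * k1[0]))
        k3 = (df + 0.5 * h * k2[1], -K * (f + 0.5 * h * k2[0]))
        k4 = (df + h * k3[1], -K * (f + h * k3[0]))
        f += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        df += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        t += h
        samples.append((t, f))
    return samples

def first_zero(samples):
    """First positive zero of f, located by linear interpolation, or None."""
    for (t0, f0), (t1, f1) in zip(samples[1:], samples[2:]):
        if f0 > 0.0 and f1 <= 0.0:
            return t0 + (t1 - t0) * f0 / (f0 - f1)
    return None

zero_unit_sphere = first_zero(solve_normal_jacobi(1.0, 4.0))
zero_small_sphere = first_zero(solve_normal_jacobi(4.0, 4.0))
zero_flat = first_zero(solve_normal_jacobi(0.0, 4.0))
```

For $K > 0$ the first zero should appear near $\pi/\sqrt{K}$, which is where a Jacobi field vanishing at the start of a geodesic on a round sphere vanishes again; for $K \le 0$ no zero is found on the sampled interval.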
Theorem 13.110. Let $\gamma : [a, b] \to M$ be a geodesic segment.

(i) If $Y \in \mathfrak{X}_\gamma(M)$ is tangent to $\gamma$, then $Y$ is a Jacobi field if and only if $\nabla^2_{\partial_t} Y = 0$ along $\gamma$. In this case, $Y(t) = (at + b)\dot\gamma(t)$.
(ii) If $J$ is a Jacobi field along $\gamma$ and there are some distinct $t_1, t_2 \in [a, b]$ with $J(t_1) \perp \dot\gamma(t_1)$ and $J(t_2) \perp \dot\gamma(t_2)$, then $J(t) \perp \dot\gamma(t)$ for all $t \in [a, b]$.
(iii) If $J$ is a Jacobi field along $\gamma$ and there is some $t_0 \in [a, b]$ with $J(t_0) \perp \dot\gamma(t_0)$ and $\nabla_{\partial_t} J(t_0) \perp \dot\gamma(t_0)$, then $J(t) \perp \dot\gamma(t)$ for all $t \in [a, b]$.
(iv) If $\gamma$ is not a null geodesic, then $Y$ is a Jacobi field if and only if both $Y^\top$ and $Y^\perp$ are Jacobi fields.
Proof. (i) Let $Y = f\dot\gamma$. Then the Jacobi equation reads
$$\nabla^2_{\partial_t}(f\dot\gamma)(t) = R(\dot\gamma(t), f\dot\gamma(t))\dot\gamma(t) = 0.$$
Since $\gamma$ is a geodesic, this implies that $f'' = 0$ and (i) follows.

(ii) and (iii) We have $\frac{d^2}{dt^2}\langle J, \dot\gamma\rangle = \langle R(\dot\gamma(t), J(t))\dot\gamma(t), \dot\gamma(t)\rangle = 0$ (from the symmetries of the curvature tensor). Thus $\langle J(t), \dot\gamma(t)\rangle = at + b$ for some $a, b \in \mathbb{R}$. The reader can now easily deduce both (ii) and (iii).

(iv) The operator $\nabla^2_{\partial_t}$ preserves the normal and tangential parts of $Y$. We now show that the same is true of the map $Y \mapsto R(\dot\gamma(t), Y)\dot\gamma(t)$. Since we assume that $\gamma$ is not null, we have $Y^\top = f\dot\gamma$ for some function $f$. Thus $R(\dot\gamma(t), Y^\top)\dot\gamma(t) = R(\dot\gamma(t), f\dot\gamma(t))\dot\gamma(t) = 0$, which is trivially tangent to $\dot\gamma(t)$. On the other hand, $\langle R(\dot\gamma(t), Y^\perp(t))\dot\gamma(t), \dot\gamma(t)\rangle = 0$ by symmetries of the curvature tensor. We have
$$(\nabla^2_{\partial_t} Y)^\top + (\nabla^2_{\partial_t} Y)^\perp = \nabla^2_{\partial_t} Y = R(\dot\gamma(t), Y(t))\dot\gamma(t) = R(\dot\gamma(t), Y^\top(t))\dot\gamma(t) + R(\dot\gamma(t), Y^\perp(t))\dot\gamma(t) = 0 + R(\dot\gamma(t), Y^\perp(t))\dot\gamma(t).$$
So the Jacobi equation $\nabla^2_{\partial_t} Y(t) = R(\dot\gamma(t), Y(t))\dot\gamma(t)$ splits into two equations
$$\nabla^2_{\partial_t} Y^\top(t) = 0, \qquad \nabla^2_{\partial_t} Y^\perp(t) = R(\dot\gamma(t), Y^\perp(t))\dot\gamma(t),$$
and the result follows from this. $\Box$
The proof of the last result shows that a Jacobi field decomposes into a parallel vector field along I, which is just a multiple of the velocity "r, and a "normal Jacobi field" J..L, which is normal to I at each of its points. Of course, the important part is the normal part since the tangential part is merely the infinitesimal model for a variation through geodesics which are merely reparametrizations of the s = 0 geodesic. Thus we focus attention on the Jacobi fields that are normal to the geodesics along which they are defined. Thus we consider the Jacobi equation 'V~J(t) = R("f(t), J(t)) "r(t) with initial conditions such as in (ii) or (iii) of Theorem 13.110. Exercise 13.112. For v E TpM, let v..L := {w E TpM: (w,v) = O}. Prove that the tidal operator Rv : w H Rv,wv maps v..L to itself.
In light of this exercise, we make the following definition.
Definition 13.113. For $v \in T_pM$, the (restricted) tidal force operator $F_v : v^\perp \to v^\perp$ is the restriction of $R_v$ to $v^\perp \subset T_pM$.

Notice that in terms of the tidal force operator the Jacobi equation for normal Jacobi fields is
$$\nabla^2_{\partial_t} J(t) = F_{\dot\gamma(t)}(J(t)) \quad \text{for all } t.$$
If $J$ is the variation vector field of a geodesic variation, then it is an infinitesimal model of the separation of nearby geodesics. In general relativity, one thinks of a one-parameter family of freely falling particles. Then $\nabla_{\partial_t} J$ is the relative velocity field and $\nabla_{\partial_t}^2 J$ is the relative acceleration. Thus the Jacobi equation can be thought of as a version of Newton's second law with the curvature term playing the role of a force.

Proposition 13.114. For $v \in T_pM$, the tidal force operator $F_v : v^\perp \to v^\perp$ is self-adjoint and $\operatorname{Trace}(F_v) = -\operatorname{Ric}(v, v)$.

Proof. First, $\langle F_v w_1, w_2\rangle = \langle R_{v,w_1}v, w_2\rangle = \langle R_{v,w_2}v, w_1\rangle = \langle F_v w_2, w_1\rangle$ by (iv) of Theorem 13.19. The proof that $\operatorname{Trace} F_v = -\operatorname{Ric}(v, v)$ is easy for definite metrics, but for indefinite metrics the possibility that $v$ may be a null vector involves a little extra work. If $v$ is not null, then letting $e_2, \ldots, e_n$ be an orthonormal basis for $v^\perp$ and $\varepsilon_i := \langle e_i, e_i\rangle$, we have
$$\operatorname{Ric}(v,v) = -\sum_{i=2}^{n} \varepsilon_i \langle R_{v,e_i}v, e_i\rangle = -\sum_{i=2}^{n} \varepsilon_i \langle F_v e_i, e_i\rangle = -\operatorname{Trace} F_v.$$
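If $v$ is null, then we can find a null vector $w$ such that $\langle w, v\rangle = -1$ and $w, v$ span a Lorentz plane $L$ in $T_pM$. Define $e_1 := (v + w)/\sqrt{2}$ and $e_2 := (v - w)/\sqrt{2}$. One checks that $e_1$ is timelike while $e_2$ is spacelike. Now choose an orthonormal basis $e_3, \ldots, e_n$ for $L^\perp \subset v^\perp$ so that $e_1, \ldots, e_n$ is an orthonormal basis for $T_pM$. Then we have
$$\operatorname{Ric}(v,v) = \langle R_{v,e_1}v, e_1\rangle - \langle R_{v,e_2}v, e_2\rangle - \sum_{i>2} \varepsilon_i \langle R_{v,e_i}v, e_i\rangle.$$
But $\langle R_{v,e_1}v, e_1\rangle = \tfrac{1}{2}\langle R_{v,w}v, w\rangle = \langle R_{v,e_2}v, e_2\rangle$, and so we are left with
$$\operatorname{Ric}(v,v) = -\sum_{i>2} \varepsilon_i \langle R_{v,e_i}v, e_i\rangle = -\sum_{i>2} \varepsilon_i \langle F_v e_i, e_i\rangle.$$
Since $v, e_3, \ldots, e_n$ is a basis for $v^\perp$ and $F_v v = 0$, we have
$$\operatorname{Ric}(v,v) = -\sum_{i>2} \varepsilon_i \langle F_v e_i, e_i\rangle = -\langle F_v v, v\rangle - \sum_{i>2} \varepsilon_i \langle F_v e_i, e_i\rangle = -\operatorname{Trace} F_v. \qquad \Box$$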
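As a sanity check on Proposition 13.114, the following sketch evaluates the tidal force operator in the simplest model case. It assumes a space of constant sectional curvature $K$ (here $K = 1$, $n = 3$), for which the curvature tensor is taken to be $R(x,y)z = K(\langle y,z\rangle x - \langle x,z\rangle y)$ and $\operatorname{Ric}(v,v) = (n-1)K\langle v,v\rangle$; these formulas, the tangent space model $\mathbb{R}^3$, and all function names are assumptions made for illustration and are not taken from the text.

```python
def dot(a, b):
    """Euclidean inner product on the model tangent space R^3."""
    return sum(x * y for x, y in zip(a, b))

def curvature(K, x, y, z):
    """R(x, y)z for constant sectional curvature K (assumed model formula)."""
    return [K * (dot(y, z) * xi - dot(x, z) * yi) for xi, yi in zip(x, y)]

def tidal(K, v, w):
    """Tidal operator R_v(w) = R(v, w)v."""
    return curvature(K, v, w, v)

K = 1.0
v = [2.0, 0.0, 0.0]
basis = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # orthonormal basis of v-perp

# Matrix of F_v in this basis: F[i][j] = <F_v e_j, e_i>.
F = [[dot(tidal(K, v, ej), ei) for ej in basis] for ei in basis]
trace_F = F[0][0] + F[1][1]

# Ricci curvature of the constant curvature model: (n - 1) K <v, v> with n = 3.
ric_vv = (3 - 1) * K * dot(v, v)

print(trace_F, -ric_vv)
```

In this model $F_v$ is $-K\langle v,v\rangle$ times the identity on $v^\perp$, so self-adjointness and the trace identity can be read off directly.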
Definition 13.115. Let $\gamma : [a, b] \to M$ be a geodesic. Let $J_0(\gamma, a, b)$ denote the set of all Jacobi fields $J$ such that $J(a) = J(b) = 0$.
Definition 13.116. Let $\gamma : [a, b] \to M$ be a geodesic. If there exists a nonzero Jacobi field $J \in J_0(\gamma, a, b)$, then we say that $\gamma(a)$ is conjugate to $\gamma(b)$ along $\gamma$.
From standard considerations in the theory of linear differential equations it follows that the set $J_0(\gamma, a, b)$ is a vector space. The dimension of the vector space $J_0(\gamma, a, b)$ is the order of the conjugacy. Since the Jacobi fields in $J_0(\gamma, a, b)$ vanish twice, and, as we have seen, this means that such fields are normal to $\dot\gamma$ all along $\gamma$, it follows that the dimension of $J_0(\gamma, a, b)$ is at most $n - 1$, where $n = \dim M$. We have seen that the variation vector field of a variation through geodesics is a Jacobi field; so if we can find a nontrivial variation $h$ of a geodesic $\gamma$ such that all of the longitudinal curves $t \mapsto h_s(t)$ begin and end at the same points $\gamma(a)$ and $\gamma(b)$, then the variation vector field will be a nonzero element of $J_0(\gamma, a, b)$. Thus we conclude that $\gamma(a)$ is conjugate to $\gamma(b)$. We will see that we may obtain a Jacobi field by more general variations, where the endpoints of the curves meet at time $b$ only to first order.

Let us bring the exponential map into play. Let $\gamma : [0, b] \to M$ be a geodesic as above. Let $v = \dot\gamma(0) \in T_pM$. Then $\gamma : t \mapsto \exp_p tv$ is exactly our geodesic, which begins at $p$ and ends at $q$ at $t = b$. Now we create a variation of $\gamma$ by $h(s, t) = \exp_p t(v + sw)$, where $w \in T_pM$ and $s$ ranges in $(-\epsilon, \epsilon)$ for some sufficiently small $\epsilon$. We know that $J(t) = \frac{\partial}{\partial s}\big|_{s=0} h(s, t)$ is a Jacobi field, and it is clear that $J(0) := \frac{\partial}{\partial s}\big|_{s=0} h(s, 0) = 0$. If $w_{bv}$ is the tangent vector in $T_{bv}(T_pM)$ which canonically corresponds to $w$, in other words, if $w_{bv}$ is the velocity vector at $s = 0$ for the curve $s \mapsto b(v + sw)$ in $T_pM$, then
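$$J(b) = \frac{\partial}{\partial s}\Big|_{s=0} h(s, b) = \frac{\partial}{\partial s}\Big|_{s=0} \exp_p b(v + sw) = T_{bv}\exp_p(w_{bv}).$$
(We have just calculated the tangent map of $\exp_p$ at $x = bv$!) Also,
$$\nabla_{\partial_t} J(0) = \nabla_{\partial_t} \frac{\partial}{\partial s} \exp_p t(v + sw)\Big|_{s=0,\,t=0} = \nabla_{\partial_s}\Big|_{s=0} \frac{\partial}{\partial t}\Big|_{t=0} \exp_p t(v + sw).$$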
But $X(s) := \frac{\partial}{\partial t}\big|_{t=0} \exp_p t(v + sw) = v + sw$ is a vector field along the constant curve $t \mapsto p$, and so by Exercise 12.42 we have $\nabla_{\partial_s}\big|_{s=0} X(s) = X'(0) = w$. The equality $J(b) = T_{bv}\exp_p(w_{bv})$ is important because it shows that if $T_{bv}\exp_p : T_{bv}(T_pM) \to T_{\gamma(b)}M$ is not an isomorphism, then we can find a nonzero vector $w_{bv} \in T_{bv}(T_pM)$ such that $T_{bv}\exp_p(w_{bv}) = 0$. But then if $w$ is the vector in $T_pM$ which corresponds to $w_{bv}$ as above, then for this choice of $w$,
the Jacobi field constructed above is such that $J(0) = J(b) = 0$ so that $\gamma(0)$ is conjugate to $\gamma(b)$ along $\gamma$. Also, if $J$ is a Jacobi field with $J(0) = 0$ and $\nabla_{\partial_t} J(0) = w$, then this uniquely determines $J$ and it must have the form $\frac{\partial}{\partial s}\big|_{s=0} \exp_p t(v + sw)$ as above.

Theorem 13.117. Let $\gamma : [0, b] \to M$ be a geodesic. Then the following are equivalent:
(i) $\gamma(0)$ is conjugate to $\gamma(b)$ along $\gamma$.
(ii) There is a nontrivial variation $h$ of $\gamma$ through geodesics that all start at $p = \gamma(0)$ such that $J(b) := \frac{\partial h}{\partial s}(0, b) = 0$.
(iii) If $v = \dot\gamma(0)$, then $T_{bv}\exp_p$ is singular.

Proof. (ii) $\Rightarrow$ (i): We have already seen that a variation through geodesics gives a Jacobi field $J$ and that if (ii) holds, then by assumption $J(0) = J(b) = 0$, and so we have (i).
(i) $\Rightarrow$ (iii): If (i) is true, then there is a nonzero Jacobi field $J$ with $J(0) = J(b) = 0$. Now let $w = \nabla_{\partial_t} J(0)$ and $h(s, t) = \exp_p t(v + sw)$. Then $h(s, t)$ is a variation through geodesics and $0 = J(b) = \frac{\partial}{\partial s}\big|_{0} \exp_p b(v + sw) = T_{bv}\exp_p(w_{bv})$ so that $T_{bv}\exp_p$ is singular.
(iii) $\Rightarrow$ (ii): Let $v = \dot\gamma(0)$. If $T_{bv}\exp_p$ is singular, then there is a $w$ with $T_{bv}\exp_p w_{bv} = 0$. Thus the variation $h(s, t) = \exp_p t(v + sw)$ does the job. $\Box$
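To make the notion of a conjugate point concrete, here is a minimal numerical sketch under a simplifying assumption that is not part of the text: on a surface of constant curvature $K$, a normal Jacobi field along a unit speed geodesic with $J(0) = 0$ can be written $J(t) = f(t)E(t)$ with $E$ a parallel unit normal field, and the Jacobi equation reduces to the scalar equation $f'' = -Kf$. The function name `first_conjugate_time`, the step size, and the search interval are hypothetical choices.

```python
import math

def first_conjugate_time(K, t_max=10.0, dt=1e-3):
    """Integrate f'' = -K f, f(0) = 0, f'(0) = 1 with classical RK4.

    Returns the first t > 0 at which f changes sign from positive to
    nonpositive (located by linear interpolation), or None if there is
    no such t in (0, t_max].
    """
    def rhs(f, g):
        return g, -K * f

    t, f, g = 0.0, 0.0, 1.0
    while t < t_max:
        k1f, k1g = rhs(f, g)
        k2f, k2g = rhs(f + 0.5 * dt * k1f, g + 0.5 * dt * k1g)
        k3f, k3g = rhs(f + 0.5 * dt * k2f, g + 0.5 * dt * k2g)
        k4f, k4g = rhs(f + dt * k3f, g + dt * k3g)
        f_new = f + dt * (k1f + 2 * k2f + 2 * k3f + k4f) / 6
        g_new = g + dt * (k1g + 2 * k2g + 2 * k3g + k4g) / 6
        if f > 0.0 and f_new <= 0.0:
            return t + dt * f / (f - f_new)
        t, f, g = t + dt, f_new, g_new
    return None

print(first_conjugate_time(1.0))   # close to pi on the unit sphere model
print(first_conjugate_time(0.0))   # None: no conjugate point in the flat model
print(first_conjugate_time(-1.0))  # None: no conjugate point for K = -1
```

For $K = 1$ the located zero approximates the parameter value at which the antipodal point is reached, in agreement with the equivalence of (i) and (iii): the exponential map of the unit sphere is singular on the sphere of radius $\pi$ in $T_pM$.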
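13.8. First and Second Variation of Arc Length

Let us restrict attention to the case where $\alpha$ is either spacelike or timelike (not necessarily geodesic). This is just the condition that $|\langle \dot\alpha(t), \dot\alpha(t)\rangle| > 0$. Let $\varepsilon = +1$ if $\alpha$ is spacelike and $\varepsilon = -1$ if $\alpha$ is timelike. We call $\varepsilon$ the sign of $\alpha$ and write $\varepsilon = \operatorname{sgn} \alpha$. Consider the arc length functional defined by
$$L(\alpha) = \int_a^b \big(\varepsilon \langle \dot\alpha(t), \dot\alpha(t)\rangle\big)^{1/2}\, dt = \int_a^b |\langle \dot\alpha(t), \dot\alpha(t)\rangle|^{1/2}\, dt.$$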
If $h : (-\epsilon, \epsilon) \times [a, b] \to M$ is a variation of $\alpha$ as above with variation vector field $V$, then formally $V$ is a tangent vector at $\alpha$ in the space of curves $[a, b] \to M$. By a simple continuity and compactness argument we may choose a real number $\epsilon > 0$ small enough that $|\langle \dot h_s(t), \dot h_s(t)\rangle| > 0$ for all $s \in (-\epsilon, \epsilon)$. Then we have the variation of the arc length functional defined by
$$\delta L|_\alpha(V) := \frac{d}{ds}\Big|_{s=0} L(h_s) := \frac{d}{ds}\Big|_{s=0} \int_a^b \big(\varepsilon \langle \dot h_s(t), \dot h_s(t)\rangle\big)^{1/2}\, dt.$$
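Thus, we are interested in studying the critical points of $L(s) := L(h_s)$, and so we need to find $L'(0)$ and $L''(0)$. For the proof of the following proposition we use the result of Exercise 13.11 to the effect that $\nabla_{\partial_s} \partial_t h = \nabla_{\partial_t} \partial_s h$.

Proposition 13.118. Let $h : (-\epsilon, \epsilon) \times [a, b] \to M$ be a variation of a curve $\alpha := h_0$ such that $|\langle \dot h_s(t), \dot h_s(t)\rangle| > 0$ for all $s \in (-\epsilon, \epsilon)$. Then
$$L'(s) = \int_a^b \varepsilon \langle \nabla_{\partial_s} \partial_t h(s, t), \partial_t h(s, t)\rangle \big(\varepsilon \langle \partial_t h(s, t), \partial_t h(s, t)\rangle\big)^{-1/2}\, dt.$$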
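Proof. We have
$$\begin{aligned}
L'(s) &= \frac{d}{ds} \int_a^b \|\dot h_s(t)\|\, dt = \int_a^b \frac{d}{ds} \big(\varepsilon \langle \dot h_s(t), \dot h_s(t)\rangle\big)^{1/2}\, dt \\
&= \int_a^b 2\varepsilon \langle \nabla_{\partial_s} \dot h_s(t), \dot h_s(t)\rangle\, \tfrac{1}{2} \big(\varepsilon \langle \dot h_s(t), \dot h_s(t)\rangle\big)^{-1/2}\, dt \\
&= \int_a^b \varepsilon \langle \nabla_{\partial_s} \partial_t h(s, t), \partial_t h(s, t)\rangle \big(\varepsilon \langle \partial_t h(s, t), \partial_t h(s, t)\rangle\big)^{-1/2}\, dt \\
&= \varepsilon \int_a^b \langle \nabla_{\partial_t} \partial_s h(s, t), \partial_t h(s, t)\rangle \big(\varepsilon \langle \partial_t h(s, t), \partial_t h(s, t)\rangle\big)^{-1/2}\, dt. \qquad \Box
\end{aligned}$$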
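Corollary 13.119. We have
$$\delta L|_\alpha(V) = L'(0) = \varepsilon \int_a^b \langle \nabla_{\partial_t} V(t), \dot\alpha(t)\rangle \big(\varepsilon \langle \dot\alpha(t), \dot\alpha(t)\rangle\big)^{-1/2}\, dt.$$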
Let us now consider a more general situation where $\alpha : [a, b] \to M$ is only piecewise smooth (but still continuous). Let us be specific by saying that there is a partition $a = t_0 < t_1 < \cdots < t_k < t_{k+1} = b$ so that $\alpha$ is smooth on each $[t_i, t_{i+1}]$. A variation appropriate to this situation is a continuous map $h : (-\epsilon, \epsilon) \times [a, b] \to M$ with $h(0, t) = \alpha(t)$ such that $h$ is smooth on each set of the form $(-\epsilon, \epsilon) \times [t_i, t_{i+1}]$. This is what we mean by a piecewise smooth variation of a piecewise smooth curve. The velocity $\dot\alpha$ and the variation vector field $V(t) := \frac{\partial h}{\partial s}(0, t)$ are only piecewise smooth. At each "kink" point $t_i$ we have the jump vector $\triangle\dot\alpha(t_i) := \dot\alpha(t_i+) - \dot\alpha(t_i-)$, which measures the discontinuity of $\dot\alpha$ at $t_i$. Using this notation, we have the following theorem which gives the first variation formula:

Theorem 13.120. Let $h : (-\epsilon, \epsilon) \times [a, b] \to M$ be a piecewise smooth variation of a piecewise smooth curve $\alpha : [a, b] \to M$ with variation vector field $V$. If $\alpha$ has constant speed $c = (\varepsilon \langle \dot\alpha, \dot\alpha\rangle)^{1/2}$, then
$$\delta L|_\alpha(V) = L'(0) = -\frac{\varepsilon}{c} \int_a^b \langle \nabla_{\partial_t} \dot\alpha, V\rangle\, dt - \frac{\varepsilon}{c} \sum_{i=1}^{k} \langle \triangle\dot\alpha(t_i), V(t_i)\rangle + \frac{\varepsilon}{c} \langle \dot\alpha, V\rangle \Big|_a^b.$$
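Proof. Since $c = (\varepsilon \langle \dot\alpha, \dot\alpha\rangle)^{1/2}$, Corollary 13.119 (applied on each subinterval) gives
$$L'(0) = \frac{\varepsilon}{c} \sum_{i=0}^{k} \int_{t_i}^{t_{i+1}} \langle \nabla_{\partial_t} V(t), \dot\alpha(t)\rangle\, dt.$$
Since we have $\langle \dot\alpha, \nabla_{\partial_t} V\rangle = \frac{d}{dt}\langle \dot\alpha, V\rangle - \langle \nabla_{\partial_t} \dot\alpha, V\rangle$, we can employ integration by parts: On each interval $[t_i, t_{i+1}]$ we have
$$\frac{\varepsilon}{c} \int_{t_i}^{t_{i+1}} \langle \nabla_{\partial_t} V, \dot\alpha\rangle\, dt = \frac{\varepsilon}{c} \langle \dot\alpha, V\rangle \Big|_{t_i}^{t_{i+1}} - \frac{\varepsilon}{c} \int_{t_i}^{t_{i+1}} \langle \nabla_{\partial_t} \dot\alpha, V\rangle\, dt.$$
We sum from $i = 0$ to $i = k$ to get
$$L'(0) = \frac{\varepsilon}{c} \langle \dot\alpha, V\rangle \Big|_a^b - \frac{\varepsilon}{c} \sum_{i=1}^{k} \langle \triangle\dot\alpha(t_i), V(t_i)\rangle - \frac{\varepsilon}{c} \int_a^b \langle \nabla_{\partial_t} \dot\alpha, V\rangle\, dt,$$
which is the required result. $\Box$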
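The role of the jump term can be checked numerically in the flat case. The sketch below is an illustration under assumptions not made in the text: $M$ is the Euclidean plane (so $\varepsilon = 1$ and geodesics are straight lines), $\alpha$ is the unit speed broken path $(0,0) \to (1,0) \to (1,1)$ on $[0, 2]$ with a single kink at $t = 1$, and the fixed endpoint variation is $h(s,t) = \alpha(t) + sV(t)$ with a piecewise linear $V$. Only the kink term of the first variation formula survives, since each segment is a geodesic and $V$ vanishes at the endpoints.

```python
import math

def alpha(t):
    """Unit speed broken path in the plane with a kink at t = 1."""
    return (t, 0.0) if t <= 1.0 else (1.0, t - 1.0)

def V(t):
    """Piecewise linear variation field, zero at t = 0 and t = 2."""
    bump = 1.0 - abs(t - 1.0)
    return (-bump, bump)

def length(s, n=2000):
    """Length of the varied curve h_s(t) = alpha(t) + s V(t), sampled as a polyline.

    n is even so that the kink t = 1 is a sample point.
    """
    ts = [2.0 * i / n for i in range(n + 1)]
    pts = [(alpha(t)[0] + s * V(t)[0], alpha(t)[1] + s * V(t)[1]) for t in ts]
    return sum(math.hypot(q[0] - p[0], q[1] - p[1]) for p, q in zip(pts, pts[1:]))

# Central difference approximation of L'(0).
step = 1e-5
numeric = (length(step) - length(-step)) / (2 * step)

# First variation formula with c = 1, epsilon = 1: only the kink term remains.
jump = (0.0 - 1.0, 1.0 - 0.0)            # velocity after the kink minus velocity before
formula = -(jump[0] * V(1.0)[0] + jump[1] * V(1.0)[1])

print(numeric, formula)
```

Here $L'(0) \neq 0$, so the broken path is not critical; pushing the corner in the direction of the jump vector shortens the curve, which is the mechanism behind the corollary that follows.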
A variation $h : (-\epsilon, \epsilon) \times [a, b] \to M$ of $\alpha$ is called a fixed endpoint variation if $h(s, a) = \alpha(a)$ and $h(s, b) = \alpha(b)$ for all $s \in (-\epsilon, \epsilon)$. In this situation, the variation vector field $V$ is zero at $a$ and $b$.
Corollary 13.121. A piecewise smooth curve $\alpha : [a, b] \to M$ with constant speed $c > 0$ on each subinterval where $\alpha$ is smooth is a (nonnull) geodesic if and only if $\delta L|_\alpha(V) = 0$ for all fixed endpoint variations of $\alpha$. In particular, if $M$ is a Riemannian manifold and $\alpha : [a, b] \to M$ minimizes length among nearby curves, then $\alpha$ is an (unbroken) geodesic.

Proof. If $\alpha$ is a geodesic, then it is smooth and so $\triangle\dot\alpha(t_i) = 0$ for all $t_i$ (even though $\alpha$ is smooth, the variation still only needs to be piecewise smooth). It follows that $L'(0) = 0$. Now if we suppose that $\alpha$ is a piecewise smooth curve and that $L'(0) = 0$ for any fixed endpoint variation, then we can conclude that $\alpha$ is a geodesic by picking some clever variations. As a first step we show that $\alpha|_{[t_i, t_{i+1}]}$ is a geodesic for each segment $[t_i, t_{i+1}]$. Let $t \in (t_i, t_{i+1})$ be arbitrary and let $y$ be any nonzero vector in $T_{\alpha(t)}M$. Let $\beta$ be a cut-off function on $[a, b]$ with support in $(t - \delta, t + \delta)$ and $\delta$ chosen sufficiently small. Then let $V := \beta Y$, where $Y$ is the parallel translation of $y$ along $\alpha$. We can now easily produce a fixed endpoint variation with variation vector field $V$ by the formula
$$h(s, t) := \exp_{\alpha(t)} sV(t).$$
With this variation the last theorem gives
$$L'(0) = -\frac{\varepsilon}{c} \int_a^b \langle \nabla_{\partial_t} \dot\alpha, V\rangle\, dt = -\frac{\varepsilon}{c} \int_{t-\delta}^{t+\delta} \langle \nabla_{\partial_t} \dot\alpha, \beta Y\rangle\, dt,$$
which must vanish no matter what our choice of $y$ and for any $\delta > 0$. From this it is straightforward to show that $\nabla_{\partial_t} \dot\alpha(t) = 0$, and since $t$ was an arbitrary
element of $(t_i, t_{i+1})$, we conclude that $\alpha|_{[t_i, t_{i+1}]}$ is a geodesic. All that is left is to show that there can be no discontinuities of $\dot\alpha$. Once again we choose a vector $y$, but this time $y \in T_{\alpha(t_i)}M$, where $t_i$ is a potential kink point. Take another cut-off function $\beta$ with $\operatorname{supp} \beta \subset [t_{i-1}, t_{i+1}] = [t_{i-1}, t_i] \cup [t_i, t_{i+1}]$, $\beta(t_i) = 1$, and $i$ a fixed but arbitrary element of $\{1, 2, \ldots, k\}$. Extend $y$ to a field $Y$ as before and let $V = \beta Y$. Since we now have that $\alpha$ is a geodesic on each segment, and we are assuming that the first variation is zero, the first variation formula for any variation with variation vector field $V$ reduces to
$$0 = L'(0) = -\frac{\varepsilon}{c} \langle \triangle\dot\alpha(t_i), y\rangle$$
for all $y$. This means that $\triangle\dot\alpha(t_i) = 0$, and since $i$ was arbitrary, we are done. $\Box$
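We now see that, for fixed endpoint variations, $L'(0) = 0$ implies that $\alpha$ is a geodesic. The geodesics are the critical "points" (or curves) of the arc length functional restricted to all curves with fixed endpoints. In order to classify the critical curves, we look at the second variation, but we only need the formula for variations of geodesics. For a variation $h$ of a geodesic $\gamma$, we have the variation vector field $V$ as before, but now we also consider the transverse acceleration vector field $A(t) := \nabla_{\partial_s} \partial_s h(0, t)$. Recall that for a curve $\gamma$ with $|\langle \dot\gamma, \dot\gamma\rangle| > 0$, a vector field $Y$ along $\gamma$ has an orthogonal decomposition $Y = Y^\top + Y^\perp$ (tangent and normal to $\gamma$). Also we have $(\nabla_{\partial_t} Y)^\perp = \nabla_{\partial_t} Y^\perp$, and so we can use $\nabla_{\partial_t} Y^\perp$ to denote either of these without ambiguity. We now have the second variation formula of Synge:

Theorem 13.122. Let $\gamma : [a, b] \to M$ be a (nonnull) geodesic of speed $c > 0$. Let $\varepsilon$ be the sign of $\gamma$ as before. If $h : (-\epsilon, \epsilon) \times [a, b] \to M$ is a variation of $\gamma$ with variation vector field $V$ and acceleration vector field $A$, then the second variation of $L(s) := L(h_s)$ at $s = 0$ is
$$L''(0) = \frac{\varepsilon}{c} \int_a^b \big( \langle \nabla_{\partial_t} V^\perp, \nabla_{\partial_t} V^\perp\rangle + \langle \dot\gamma, R(V, \dot\gamma)V\rangle \big)\, dt + \frac{\varepsilon}{c} \langle \dot\gamma, A\rangle \Big|_a^b.$$

Proof. Let $H(s, t) := \big|\big\langle \frac{\partial h}{\partial t}(s, t), \frac{\partial h}{\partial t}(s, t)\big\rangle\big|^{1/2} = \big(\varepsilon \big\langle \frac{\partial h}{\partial t}(s, t), \frac{\partial h}{\partial t}(s, t)\big\rangle\big)^{1/2}$. We have $L(s) = \int_a^b H(s, t)\, dt$, so that $L''(s) = \int_a^b \frac{\partial^2 H}{\partial s^2}(s, t)\, dt$. Computing as before, we see that
$$\frac{\partial H}{\partial s}(s, t) = \frac{\varepsilon}{H} \Big\langle \frac{\partial h}{\partial t}(s, t), \nabla_{\partial_s} \frac{\partial h}{\partial t}(s, t)\Big\rangle.$$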
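Taking another derivative, we have
$$\begin{aligned}
\frac{\partial^2 H}{\partial s^2} &= \frac{\varepsilon}{H^2} \Big( H \frac{\partial}{\partial s} \Big\langle \frac{\partial h}{\partial t}, \nabla_{\partial_s} \frac{\partial h}{\partial t}\Big\rangle - \Big\langle \frac{\partial h}{\partial t}, \nabla_{\partial_s} \frac{\partial h}{\partial t}\Big\rangle \frac{\partial H}{\partial s} \Big) \\
&= \frac{\varepsilon}{H} \Big( \Big\langle \nabla_{\partial_s} \frac{\partial h}{\partial t}, \nabla_{\partial_s} \frac{\partial h}{\partial t}\Big\rangle + \Big\langle \frac{\partial h}{\partial t}, \nabla_{\partial_s}^2 \frac{\partial h}{\partial t}\Big\rangle - \frac{1}{H} \Big\langle \frac{\partial h}{\partial t}, \nabla_{\partial_s} \frac{\partial h}{\partial t}\Big\rangle \frac{\partial H}{\partial s} \Big) \\
&= \frac{\varepsilon}{H} \Big( \Big\langle \nabla_{\partial_s} \frac{\partial h}{\partial t}, \nabla_{\partial_s} \frac{\partial h}{\partial t}\Big\rangle + \Big\langle \frac{\partial h}{\partial t}, \nabla_{\partial_s}^2 \frac{\partial h}{\partial t}\Big\rangle - \frac{\varepsilon}{H^2} \Big\langle \frac{\partial h}{\partial t}, \nabla_{\partial_s} \frac{\partial h}{\partial t}\Big\rangle^2 \Big).
\end{aligned}$$
Using $\nabla_{\partial_t} \partial_s h = \nabla_{\partial_s} \partial_t h$ and Lemma 13.107, we obtain
$$\nabla_{\partial_s} \nabla_{\partial_s} \frac{\partial h}{\partial t} = \nabla_{\partial_s} \nabla_{\partial_t} \frac{\partial h}{\partial s} = R\Big(\frac{\partial h}{\partial s}, \frac{\partial h}{\partial t}\Big) \frac{\partial h}{\partial s} + \nabla_{\partial_t} \nabla_{\partial_s} \frac{\partial h}{\partial s},$$
and then
$$\frac{\partial^2 H}{\partial s^2} = \frac{\varepsilon}{H} \Big\{ \Big\langle \nabla_{\partial_t} \frac{\partial h}{\partial s}, \nabla_{\partial_t} \frac{\partial h}{\partial s}\Big\rangle + \Big\langle \frac{\partial h}{\partial t}, R\Big(\frac{\partial h}{\partial s}, \frac{\partial h}{\partial t}\Big) \frac{\partial h}{\partial s}\Big\rangle + \Big\langle \frac{\partial h}{\partial t}, \nabla_{\partial_t} \nabla_{\partial_s} \frac{\partial h}{\partial s}\Big\rangle - \frac{\varepsilon}{H^2} \Big\langle \frac{\partial h}{\partial t}, \nabla_{\partial_t} \frac{\partial h}{\partial s}\Big\rangle^2 \Big\}.$$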
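Now we let $s = 0$ and get
$$\frac{\partial^2 H}{\partial s^2}(0, t) = \frac{\varepsilon}{c} \Big\{ \langle \nabla_{\partial_t} V, \nabla_{\partial_t} V\rangle + \langle \dot\gamma, R(V, \dot\gamma)V\rangle + \langle \dot\gamma, \nabla_{\partial_t} A\rangle - \frac{\varepsilon}{c^2} \langle \dot\gamma, \nabla_{\partial_t} V\rangle^2 \Big\}.$$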
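Before we integrate the above expression, we use the fact that $\langle \dot\gamma, \nabla_{\partial_t} A\rangle = \frac{d}{dt}\langle \dot\gamma, A\rangle$ ($\gamma$ is a geodesic) and the fact that the orthogonal decomposition of $\nabla_{\partial_t} V$ is
$$\nabla_{\partial_t} V = \frac{\varepsilon}{c^2} \langle \dot\gamma, \nabla_{\partial_t} V\rangle \dot\gamma + \nabla_{\partial_t} V^\perp,$$
so that $\langle \nabla_{\partial_t} V, \nabla_{\partial_t} V\rangle = \frac{\varepsilon}{c^2} \langle \dot\gamma, \nabla_{\partial_t} V\rangle^2 + \langle \nabla_{\partial_t} V^\perp, \nabla_{\partial_t} V^\perp\rangle$. Plugging these identities in, observing the cancellation, and integrating, we get
$$L''(0) = \int_a^b \frac{\partial^2 H}{\partial s^2}(0, t)\, dt = \frac{\varepsilon}{c} \int_a^b \big( \langle \nabla_{\partial_t} V^\perp, \nabla_{\partial_t} V^\perp\rangle + \langle \dot\gamma, R(V, \dot\gamma)V\rangle \big)\, dt + \frac{\varepsilon}{c} \langle \dot\gamma, A\rangle \Big|_a^b. \qquad \Box$$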
The right hand side of the main equation of the second variation formula just proved depends only on $V$ except for the last term. But if the variation is a fixed endpoint variation, then this dependence drops out. It is traditional to think of the set $\Omega_{a,b}(p, q)$ of all piecewise smooth curves $\alpha : [a, b] \to M$ from $p$ to $q$ as an infinite-dimensional manifold. Then a variation vector field $V$ along a curve $\alpha \in \Omega(p, q)$ which is zero at the endpoints is the "tangent vector" at $\alpha$ to the curve in $\Omega_{a,b}(p, q)$ given by the corresponding fixed endpoint variation $h$. Thus the "tangent space"
$T_\alpha\Omega = T_\alpha(\Omega_{a,b}(p, q))$ at $\alpha$ is the set of all piecewise smooth vector fields $V$ along $\alpha$ such that $V(a) = V(b) = 0$. We then think of $L$ as being a function on $\Omega_{a,b}(p, q)$ whose constant speed and nonnull critical points we have discovered to be nonnull geodesics beginning at $p$ and ending at $q$ at times $a$ and $b$ respectively. Further thinking along these lines leads to the idea of the index form. Let us abbreviate $\Omega_{a,b}(p, q)$ to $\Omega_{a,b}$ or even to $\Omega$. For our present purposes, we will not lose anything by assuming that $a = 0$ whenever convenient. On the other hand, it will pay to refrain from assuming that $b = 1$.

Definition 13.123. For a given nonnull geodesic $\gamma : [0, b] \to M$, the index form $I_\gamma : T_\gamma\Omega \times T_\gamma\Omega \to \mathbb{R}$ is defined by $I_\gamma(V, V) = L_\gamma''(0)$, where $L_\gamma(s) = \int_0^b |\langle \dot h_s(t), \dot h_s(t)\rangle|^{1/2}\, dt$ and $\partial_s h(0, t) = V$.
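Of course this definition makes sense because $L_\gamma''(0)$ only depends on $V$ and not on $h$ itself. Also, we have defined the quadratic form $I_\gamma(V, V)$, but not directly $I_\gamma(V, W)$. Of course, polarization gives $I_\gamma(V, W)$, but if $V, W \in T_\gamma\Omega$, then it is not hard to see from the second variation formula that
$$I_\gamma(V, W) = \frac{\varepsilon}{c} \int_0^b \big\{ \langle \nabla_{\partial_t} V^\perp, \nabla_{\partial_t} W^\perp\rangle + \langle R(\dot\gamma, V)\dot\gamma, W\rangle \big\}\, dt. \tag{13.9}$$
It is important to notice that the right hand side of the above equation is in fact symmetric in $V$ and $W$.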
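It is important to remember that the variations and variation vector fields we are dealing with are allowed to be only piecewise smooth even if the center curve is smooth. So let $0 = t_0 < t_1 < \cdots < t_k < t_{k+1} = b$ as before and let $V$ and $W$ be vector fields along a geodesic $\gamma$. We now derive another formula for $I_\gamma(V, W)$. Rewrite formula (13.9) as
$$I_\gamma(V, W) = \frac{\varepsilon}{c} \sum_{i=0}^{k} \int_{t_i}^{t_{i+1}} \big\{ \langle \nabla_{\partial_t} V^\perp, \nabla_{\partial_t} W^\perp\rangle + \langle R(\dot\gamma, V)\dot\gamma, W\rangle \big\}\, dt.$$
On each interval $[t_i, t_{i+1}]$ we have
$$\langle \nabla_{\partial_t} V^\perp, \nabla_{\partial_t} W^\perp\rangle = \frac{d}{dt}\langle \nabla_{\partial_t} V^\perp, W^\perp\rangle - \langle \nabla_{\partial_t}^2 V^\perp, W^\perp\rangle,$$
and substituting this into the above formula we obtain
$$I_\gamma(V, W) = \frac{\varepsilon}{c} \sum_{i=0}^{k} \int_{t_i}^{t_{i+1}} \Big\{ \frac{d}{dt}\langle \nabla_{\partial_t} V^\perp, W^\perp\rangle - \langle \nabla_{\partial_t}^2 V^\perp, W^\perp\rangle + \langle R(\dot\gamma, V)\dot\gamma, W\rangle \Big\}\, dt.$$
As for the last term, we use symmetries of the curvature tensor to see that
$$\langle R(\dot\gamma, V)\dot\gamma, W\rangle = \langle R(\dot\gamma, V^\perp)\dot\gamma, W^\perp\rangle.$$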
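Substituting we get
$$I_\gamma(V, W) = \frac{\varepsilon}{c} \sum_{i=0}^{k} \int_{t_i}^{t_{i+1}} \Big\{ \frac{d}{dt}\langle \nabla_{\partial_t} V^\perp, W^\perp\rangle - \langle \nabla_{\partial_t}^2 V^\perp, W^\perp\rangle + \langle R(\dot\gamma, V^\perp)\dot\gamma, W^\perp\rangle \Big\}\, dt.$$
Using the fundamental theorem of calculus on each interval $[t_i, t_{i+1}]$, and the fact that $W$ vanishes at $a$ and $b$, we obtain the following alternative formula: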
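Proposition 13.124 (Formula for index form). Let $\gamma : [0, b] \to M$ be a nonnull geodesic. Then for $V, W \in T_\gamma\Omega_{a,b}$,
$$I_\gamma(V, W) = -\frac{\varepsilon}{c} \int_0^b \langle \nabla_{\partial_t}^2 V^\perp - R(\dot\gamma, V^\perp)\dot\gamma, W^\perp\rangle\, dt - \frac{\varepsilon}{c} \sum_{i=1}^{k} \langle \triangle \nabla_{\partial_t} V^\perp(t_i), W^\perp(t_i)\rangle.$$
Letting $V = W$ we have
$$I_\gamma(V, V) = -\frac{\varepsilon}{c} \int_0^b \langle \nabla_{\partial_t}^2 V^\perp - R(\dot\gamma, V^\perp)\dot\gamma, V^\perp\rangle\, dt - \frac{\varepsilon}{c} \sum_{i=1}^{k} \langle \triangle \nabla_{\partial_t} V^\perp(t_i), V^\perp(t_i)\rangle,$$
and the presence of the term $\nabla_{\partial_t}^2 V^\perp - R(\dot\gamma, V^\perp)\dot\gamma$, which vanishes exactly when $V^\perp$ satisfies the Jacobi equation, indicates a connection with Jacobi fields.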
Definition 13.125. A geodesic segment $\gamma : [a, b] \to M$ is said to be relatively length minimizing (resp. relatively length maximizing) if for all piecewise smooth fixed endpoint variations $h$ of $\gamma$ the function $L(s) := \int_a^b |\langle \dot h_s(t), \dot h_s(t)\rangle|^{1/2}\, dt$ has a local minimum (resp. local maximum) at $s = 0$ (where $\gamma = h_0(t) := h(0, t)$).

If $\gamma : [a, b] \to M$ is a relatively length minimizing nonnull geodesic, then $L''(0) \geq 0$, which means that $I_\gamma(V, V) \geq 0$ for any $V \in T_\gamma\Omega_{a,b}$. The adverb "relatively" is included in the terminology because of the possibility that there may be curves in $\Omega_{a,b}$ which are "far away" from $\gamma$ and which have smaller length than $\gamma$. A simple example of this is depicted in Figure 13.8, where $\gamma_2$ has greater length than $\gamma$ even though $\gamma_2$ is relatively length minimizing. We assume that the metric on $(0, 1) \times S^1$ is the usual definite metric $dx^2 + dy^2$ induced from that on $\mathbb{R} \times (0, 1)$, where we identify $S^1 \times (0, 1)$ with the quotient $\mathbb{R} \times (0, 1)/((x, y) \sim (x + 2\pi, y))$. On the other hand, one sometimes hears the statement that geodesics in a Riemannian manifold are locally length minimizing. This means that for any geodesic $\gamma$ :
Figure 13.8. Geodesic segments on a cylinder $S^1 \times (0, 1)$
$[a, b] \to M$, the restrictions to small intervals are always relatively length minimizing. But note that this is only true for Riemannian manifolds. For a semi-Riemannian manifold with indefinite metric, a small geodesic segment can have nearby curves that decrease the length. For example, consider the metric $-dx^2 + dy^2$ on $\mathbb{R} \times (0, 1)$ and the induced metric on the quotient $S^1 \times (0, 1) = \mathbb{R} \times (0, 1)/\!\sim$. In this case, the geodesic $\gamma$ in the figure has length greater than all nearby geodesics; the index form $I_\gamma$ is now negative semidefinite.

Exercise 13.126. Prove the above statement concerning $I_\gamma$ for $S^1 \times (0, 1)$ with the index 1 metric $-dx^2 + dy^2$.

It is not hard to see that if even one of $V$ or $W$ is tangent to $\dot\gamma$, then $I_\gamma(V, W) = 0$ and so $I_\gamma(V, W) = I_\gamma(V^\perp, W^\perp)$. Thus, we may as well restrict $I_\gamma$ to $T_\gamma^\perp\Omega = \{V \in T_\gamma\Omega : V \perp \dot\gamma\}$.
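Notation 13.127. The restriction of $I_\gamma$ to $T_\gamma^\perp\Omega$ will be called the restricted index and will be denoted by $I_\gamma^\perp$. The nullspace $N(I_\gamma^\perp)$ is then defined by
$$N(I_\gamma^\perp) := \{V \in T_\gamma^\perp\Omega : I_\gamma^\perp(V, W) = 0 \text{ for all } W \in T_\gamma^\perp\Omega\}.$$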
Theorem 13.128. Let $\gamma : [0, b] \to M$ be a nonnull geodesic. The nullspace $N(I_\gamma^\perp)$ of $I_\gamma^\perp : T_\gamma^\perp\Omega \to \mathbb{R}$ is exactly the space $J_0(\gamma, 0, b)$ of Jacobi fields vanishing at $\gamma(0)$ and $\gamma(b)$.

Proof. The formula of Proposition 13.124 makes it clear that $J_0(\gamma, 0, b) \subset N(I_\gamma^\perp)$.
Suppose that $V \in N(I_\gamma^\perp)$. Let $t \in (t_i, t_{i+1})$, where the $t_i$ determine a partition of $[0, b]$ such that $V$ is potentially nonsmooth at the $t_i$ as before. Pick an arbitrary nonzero element $y \in (\dot\gamma(t))^\perp \subset T_{\gamma(t)}M$ and let $Y$ be the unique parallel field along $\gamma|_{[t_i, t_{i+1}]}$ such that $Y(t) = y$. Picking a cut-off
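function $\beta$ with support in $[t - \delta, t + \delta] \subset (t_i, t_{i+1})$ as before we extend $\beta Y$ to a field $W$ along $\gamma$ with $W(t) = y$. Now $V$ is normal to the geodesic and so $I_\gamma(V, W) = I_\gamma^\perp(V, W) = 0$ and
$$I_\gamma(V, W) = -\frac{\varepsilon}{c} \int_{t-\delta}^{t+\delta} \langle \nabla_{\partial_t}^2 V - R(\dot\gamma, V)\dot\gamma, \beta Y\rangle\, dt.$$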
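For small $\delta$, $\beta Y$ is approximately the arbitrary nonzero $y$ and it follows that $\nabla_{\partial_t}^2 V - R(\dot\gamma, V)\dot\gamma$ is zero at $t$. Since $t$ was arbitrary, $\nabla_{\partial_t}^2 V - R(\dot\gamma, V)\dot\gamma$ is identically zero on $(t_i, t_{i+1})$. Thus $V$ is a Jacobi field on each interval $(t_i, t_{i+1})$, and since $V$ is continuous on $[0, b]$, it follows from the standard theory of differential equations that $V$ is a smooth Jacobi field along all of $\gamma$. Since $V \in T_\gamma\Omega$, we already have $V(0) = V(b) = 0$. We conclude that $V \in J_0(\gamma, 0, b)$. $\Box$

Proposition 13.129. Let $(M, g)$ be a semi-Riemannian manifold of index $\operatorname{ind}(g)$ and $\gamma : [a, b] \to M$ a nonnull geodesic. If the index form $I_\gamma$ is positive semidefinite, then $\operatorname{ind}(g) = 0$ or $n$ (thus the metric is definite and so, up to the sign of $g$, the manifold is Riemannian). On the other hand, if $I_\gamma$ is negative semidefinite, then $\operatorname{ind}(g) = 1$ or $n - 1$ (so that up to the sign convention, $M$ is a Lorentz manifold).

Proof. For simplicity we assume that $a = 0$ so that $\gamma : [0, b] \to M$. Let $I_\gamma$ be positive semidefinite and assume that $0 < \nu < n$ ($\nu = \operatorname{ind}(M)$). In this case, there must be a unit vector $u$ in $T_{\gamma(0)}M$ which is normal to $\dot\gamma(0)$ and has the opposite causal character of $\dot\gamma(0)$. This means that if $\varepsilon = \langle \dot\gamma(0), \dot\gamma(0)\rangle / |\langle \dot\gamma(0), \dot\gamma(0)\rangle|$, then $\varepsilon \langle u, u\rangle = -1$. Let $U$ be the field along $\gamma$ which is the parallel translation of $u$. By choosing $\delta > 0$ appropriately we can arrange that $\delta$ is as small as we like and simultaneously that $\sin(t/\delta)$ is zero at $t = 0$ and $t = b$. Let $V := \delta \sin(t/\delta) U$ and make the harmless assumption that $\|\dot\gamma\| = 1$. Notice that by construction $V \perp \dot\gamma$. We compute:
$$\begin{aligned}
I_\gamma(V, V) &= \varepsilon \int_0^b \big\{ \langle \nabla_{\partial_t} V, \nabla_{\partial_t} V\rangle + \langle R(\dot\gamma, V)\dot\gamma, V\rangle \big\}\, dt \\
&= \varepsilon \int_0^b \big\{ \langle \nabla_{\partial_t} V, \nabla_{\partial_t} V\rangle - \langle R(\dot\gamma, V)V, \dot\gamma\rangle \big\}\, dt \\
&= \varepsilon \int_0^b \big\{ \langle \nabla_{\partial_t} V, \nabla_{\partial_t} V\rangle - K(V \wedge \dot\gamma) \langle V \wedge \dot\gamma, V \wedge \dot\gamma\rangle \big\}\, dt \\
&= \varepsilon \int_0^b \big\{ \langle \nabla_{\partial_t} V, \nabla_{\partial_t} V\rangle - K(V \wedge \dot\gamma) \langle V, V\rangle \varepsilon \big\}\, dt,
\end{aligned}$$
where
$$K(V \wedge \dot\gamma) := \frac{\langle \mathfrak{R}(V \wedge \dot\gamma), V \wedge \dot\gamma\rangle}{\langle V \wedge \dot\gamma, V \wedge \dot\gamma\rangle} = \frac{\langle \mathfrak{R}(V \wedge \dot\gamma), V \wedge \dot\gamma\rangle}{\varepsilon \langle V, V\rangle}$$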
as defined earlier. Continuing the computation we have
$$\begin{aligned}
I_\gamma(V, V) &= \varepsilon \int_0^b \big\{ \langle U, U\rangle \cos^2(t/\delta) + K(V \wedge \dot\gamma)\, \delta^2 \sin^2(t/\delta) \big\}\, dt \\
&= \int_0^b \big\{ -\cos^2(t/\delta) + \varepsilon K(V \wedge \dot\gamma)\, \delta^2 \sin^2(t/\delta) \big\}\, dt \\
&= -b/2 + \delta^2 \int_0^b \varepsilon K(V \wedge \dot\gamma) \sin^2(t/\delta)\, dt.
\end{aligned}$$
Now as we said, we can choose $\delta$ as small as we like, and since $K(V(t) \wedge \dot\gamma(t))$ is bounded on the (compact) interval $[0, b]$, this clearly means that $I_\gamma(V, V) < 0$, which contradicts the fact that $I_\gamma$ is positive semidefinite. Thus our assumption that $0 < \nu < n$ is impossible. Now let $I_\gamma$ be negative semidefinite. Suppose that, contrary to what we wish to show, $\nu$ is not $1$ or $n - 1$. In this case, one can find a unit vector $u \in T_{\gamma(0)}M$ normal to $\dot\gamma(0)$ such that $\varepsilon \langle u, u\rangle = +1$. The same sort of calculation as we just did shows that $I_\gamma$ cannot be negative semidefinite; again a contradiction. $\Box$

By changing the sign of the metric the cases handled by this last result boil down to the two important cases: 1) where $(M, g)$ is Riemannian and $\gamma$ is arbitrary, and 2) where $(M, g)$ is Lorentz and $\gamma$ is timelike. We consolidate these two cases by a definition:

Definition 13.130. A geodesic $\gamma : [a, b] \to M$ is cospacelike if the subspace $\dot\gamma(s)^\perp \subset T_{\gamma(s)}M$ is spacelike for some (and consequently all) $s \in [a, b]$.

Exercise 13.131. Show that if $\gamma : [a, b] \to M$ is cospacelike, then $\gamma$ is nonnull, $\dot\gamma(s)^\perp \subset T_{\gamma(s)}M$ is spacelike for all $s \in [a, b]$, and also show that $(M, g)$ is either Riemannian or Lorentz.
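A useful observation about Jacobi fields along a geodesic is the following:

Lemma 13.132. If we have two Jacobi fields $J_1$ and $J_2$ along a geodesic $\gamma$, then $\langle \nabla_{\partial_t} J_1, J_2\rangle - \langle J_1, \nabla_{\partial_t} J_2\rangle$ is constant along $\gamma$.

Proof. To see this, we note that
$$\begin{aligned}
\frac{d}{dt}\langle \nabla_{\partial_t} J_1, J_2\rangle &= \langle \nabla_{\partial_t}^2 J_1, J_2\rangle + \langle \nabla_{\partial_t} J_1, \nabla_{\partial_t} J_2\rangle \\
&= \langle R(\dot\gamma, J_1)\dot\gamma, J_2\rangle + \langle \nabla_{\partial_t} J_1, \nabla_{\partial_t} J_2\rangle \\
&= \langle R(\dot\gamma, J_2)\dot\gamma, J_1\rangle + \langle \nabla_{\partial_t} J_2, \nabla_{\partial_t} J_1\rangle = \frac{d}{dt}\langle \nabla_{\partial_t} J_2, J_1\rangle.
\end{aligned}$$
Similarly, we compute $\frac{d}{dt}\langle J_1, \nabla_{\partial_t} J_2\rangle$ and subtract the result from the above to obtain the conclusion. $\Box$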
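The conservation law of Lemma 13.132 is easy to observe numerically. The sketch below assumes a two-dimensional Riemannian setting, not spelled out in the text, in which normal Jacobi fields along a unit speed geodesic take the form $J_i = f_i E$ with $E$ a parallel unit normal field and $f_i'' = -K(t) f_i$, where $K(t)$ is the Gaussian curvature along the geodesic. The curvature profile `K` is hypothetical. The conserved quantity then becomes the Wronskian $f_1' f_2 - f_1 f_2'$.

```python
import math

def K(t):
    """Hypothetical curvature profile along the geodesic (illustration only)."""
    return 1.0 + 0.5 * math.sin(t)

def rhs(t, y):
    """First order system for two solutions of f'' = -K(t) f; y = (f1, g1, f2, g2)."""
    f1, g1, f2, g2 = y
    return (g1, -K(t) * f1, g2, -K(t) * f2)

def rk4_step(t, y, dt):
    """One classical Runge-Kutta step."""
    k1 = rhs(t, y)
    k2 = rhs(t + dt / 2, tuple(a + dt / 2 * b for a, b in zip(y, k1)))
    k3 = rhs(t + dt / 2, tuple(a + dt / 2 * b for a, b in zip(y, k2)))
    k4 = rhs(t + dt, tuple(a + dt * b for a, b in zip(y, k3)))
    return tuple(a + dt / 6 * (b + 2 * c + 2 * d + e)
                 for a, b, c, d, e in zip(y, k1, k2, k3, k4))

def wronskian_samples(t_end=5.0, dt=1e-3):
    """Sample <J1', J2> - <J1, J2'> = f1' f2 - f1 f2' along the geodesic."""
    t, y = 0.0, (0.0, 1.0, 1.0, 0.0)  # f1(0)=0, f1'(0)=1, f2(0)=1, f2'(0)=0
    samples = []
    for i in range(int(round(t_end / dt))):
        if i % 500 == 0:
            f1, g1, f2, g2 = y
            samples.append(g1 * f2 - f1 * g2)
        y = rk4_step(t, y, dt)
        t += dt
    return samples

print(wronskian_samples())
```

With these initial conditions the quantity starts at $1$ and stays there up to integration error, even though the curvature varies along the geodesic.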
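In particular, if $\langle \nabla_{\partial_t} J_1, J_2\rangle = \langle J_1, \nabla_{\partial_t} J_2\rangle$ at $t = 0$, then $\langle \nabla_{\partial_t} J_1, J_2\rangle - \langle J_1, \nabla_{\partial_t} J_2\rangle = 0$ for all $t$. We need another simple technical lemma:

Lemma 13.133. If $J_1, \ldots, J_k$ are Jacobi fields along a geodesic $\gamma$ such that $\langle \nabla_{\partial_t} J_i, J_j\rangle = \langle J_i, \nabla_{\partial_t} J_j\rangle$ for all $i, j \in \{1, \ldots, k\}$, then any field $Y$ which can be written as $Y = \sum \varphi^i J_i$ has the property that
$$\langle \nabla_{\partial_t} Y, \nabla_{\partial_t} Y\rangle + \langle R(Y, \dot\gamma)Y, \dot\gamma\rangle = \langle (\partial_t \varphi^i) J_i, (\partial_t \varphi^i) J_i\rangle + \partial_t \langle Y, \varphi^r (\nabla_{\partial_t} J_r)\rangle.$$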
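Proof. We have $\nabla_{\partial_t} Y = (\partial_t \varphi^i) J_i + \varphi^r (\nabla_{\partial_t} J_r)$ and so using the summation convention,
$$\begin{aligned}
\partial_t \langle Y, \varphi^r (\nabla_{\partial_t} J_r)\rangle &= \langle \nabla_{\partial_t} Y, \varphi^r (\nabla_{\partial_t} J_r)\rangle + \langle Y, \nabla_{\partial_t} [\varphi^r (\nabla_{\partial_t} J_r)]\rangle \\
&= \langle (\partial_t \varphi^i) J_i, \varphi^r (\nabla_{\partial_t} J_r)\rangle + \langle \varphi^r (\nabla_{\partial_t} J_r), \varphi^k (\nabla_{\partial_t} J_k)\rangle \\
&\quad + \langle Y, (\partial_t \varphi^r) \nabla_{\partial_t} J_r\rangle + \langle Y, \varphi^r \nabla_{\partial_t}^2 J_r\rangle.
\end{aligned}$$
The last term $\langle Y, \varphi^r \nabla_{\partial_t}^2 J_r\rangle$ equals $\langle R(Y, \dot\gamma)Y, \dot\gamma\rangle$ by the Jacobi equation. Using this and the fact that $\langle Y, (\partial_t \varphi^r) \nabla_{\partial_t} J_r\rangle = \langle (\partial_t \varphi^i) J_i, \varphi^r (\nabla_{\partial_t} J_r)\rangle$, which follows from a short calculation using the hypotheses on the $J_i$, we arrive at
$$\partial_t \langle Y, \varphi^r (\nabla_{\partial_t} J_r)\rangle = 2\langle (\partial_t \varphi^i) J_i, \varphi^r (\nabla_{\partial_t} J_r)\rangle + \langle \varphi^r (\nabla_{\partial_t} J_r), \varphi^r (\nabla_{\partial_t} J_r)\rangle + \langle R(Y, \dot\gamma)Y, \dot\gamma\rangle.$$
Using the last equation together with $\nabla_{\partial_t} Y = (\partial_t \varphi^i) J_i + \varphi^r (\nabla_{\partial_t} J_r)$ gives the result (check it!). $\Box$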
Exercise 13.134. Work through the details of the proof of the lemma above.
Throughout the following discussion, $\gamma : [0, b] \to M$ will be a cospacelike geodesic with sign $\varepsilon$ and speed $c$. Suppose that there are no conjugate points of $p = \gamma(0)$ along $\gamma$. There exist Jacobi fields $J_1, \ldots, J_{n-1}$ along $\gamma$ which vanish at $t = 0$ and are such that the vectors $\nabla_{\partial_t} J_1(0), \ldots, \nabla_{\partial_t} J_{n-1}(0) \in T_pM$ are a basis for the space $\dot\gamma(0)^\perp \subset T_{\gamma(0)}M$.

Claim: $J_1(t), \ldots, J_{n-1}(t)$ are linearly independent for each $t > 0$. Indeed, suppose that $c_1 J_1(t) + \cdots + c_{n-1} J_{n-1}(t) = 0$ for some $t > 0$. Then $Z := \sum_{i=1}^{n-1} c_i J_i$ is a normal Jacobi field with $Z(0) = Z(t) = 0$. But then, since there are no conjugate points, $Z = 0$ identically and so $0 = \nabla_{\partial_t} Z(0) = \sum_{i=1}^{n-1} c_i \nabla_{\partial_t} J_i(0)$. Since the $\nabla_{\partial_t} J_i(0)$ are linearly independent, we conclude that $c_i = 0$ for all $i$ and the claim is proved.

It follows that at each $t$ with $0 < t \leq b$ the vectors $J_1(t), \ldots, J_{n-1}(t)$ form a basis of $\dot\gamma(t)^\perp \subset T_{\gamma(t)}M$. Now let $Y \in T_\gamma^\perp\Omega$ be a piecewise smooth
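variation vector field along $\gamma$ and write $Y = \sum \varphi^i J_i$ for some piecewise smooth functions $\varphi^i$ on $(0, b]$, which can be shown to extend continuously to $[0, b]$ (see Problem 3). Since $\langle \nabla_{\partial_t} J_i, J_j\rangle = \langle J_i, \nabla_{\partial_t} J_j\rangle = 0$ at $t = 0$, we have $\langle \nabla_{\partial_t} J_i, J_j\rangle - \langle J_i, \nabla_{\partial_t} J_j\rangle = 0$ for all $t$ by Lemma 13.132. This allows for the use of Lemma 13.133 to arrive at
$$\langle \nabla_{\partial_t} Y, \nabla_{\partial_t} Y\rangle + \langle R(Y, \dot\gamma)Y, \dot\gamma\rangle = \Big\langle \sum (\partial_t \varphi^i) J_i, \sum (\partial_t \varphi^i) J_i\Big\rangle + \partial_t \Big\langle Y, \sum \varphi^r (\nabla_{\partial_t} J_r)\Big\rangle$$
and then
$$\varepsilon I_\gamma(Y, Y) = \frac{1}{c} \int_0^b \Big\langle \sum (\partial_t \varphi^i) J_i, \sum (\partial_t \varphi^i) J_i\Big\rangle\, dt + \frac{1}{c} \big\langle Y, \varphi^r (\nabla_{\partial_t} J_r)\big\rangle \Big|_0^b. \tag{13.10}$$
On the other hand, $Y$ is zero at $0$ and $b$ and so the last term above vanishes. Now we notice that since $\gamma$ is cospacelike and the $J_i$ are normal to the geodesic, we must have that the integrand in equation (13.10) above is nonnegative. We conclude that $\varepsilon I_\gamma(Y, Y) \geq 0$. On the other hand, if $I_\gamma(Y, Y) = 0$, then $\int_0^b \langle \sum (\partial_t \varphi^i) J_i, \sum (\partial_t \varphi^i) J_i\rangle\, dt = 0$ and $\langle \sum (\partial_t \varphi^i) J_i, \sum (\partial_t \varphi^i) J_i\rangle = 0$. In turn, this implies that $\sum (\partial_t \varphi^i) J_i \equiv 0$ and that each $\varphi^i$ is constant, in fact zero, and finally that $Y$ itself is identically zero along $\gamma$. All we have assumed about $Y$ is that it is in the domain of the restricted index $I_\gamma^\perp$ and so we have proved the following:

Proposition 13.135. If $\gamma \in \Omega$ is cospacelike and there are no conjugate points to $p = \gamma(0)$ along $\gamma$, then $\varepsilon I_\gamma^\perp(Y, Y) \geq 0$ and $Y = 0$ along $\gamma$ if and only if $I_\gamma^\perp(Y, Y) = 0$.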
We may paraphrase the above result as follows: For a cospacelike geodesic $\gamma$ without conjugate points, the restricted index $I_\gamma^\perp$ is definite; it is positive definite if $\varepsilon = +1$ and negative definite if $\varepsilon = -1$. The first case ($\varepsilon = +1$) is exactly the case where $(M, g)$ is Riemannian (Exercise 13.131).

Next we consider the situation where the cospacelike geodesic $\gamma : [0, b] \to M$ is such that $\gamma(b)$ is the only point conjugate to $p = \gamma(0)$ along $\gamma$. In this case, Theorem 13.128 tells us that $I_\gamma^\perp$ has a nontrivial nullspace and so $I_\gamma$ cannot be definite.

Claim: $I_\gamma$ is semidefinite. To see this, let $Y \in T_\gamma\Omega$ and write $Y$ in the form $(b - t) Z(t)$ for some (continuous) piecewise smooth $Z$. Let $b_i \to b$ with $b_i < b$ and define $Y_i$ to be $(b_i - t) Z(t)$ on $[0, b_i]$. Our last proposition applied to $\gamma_i := \gamma|_{[0, b_i]}$ shows that $\varepsilon I_{\gamma_i}(Y_i, Y_i) \geq 0$. Now $\varepsilon I_{\gamma_i}(Y_i, Y_i) \to \varepsilon I_\gamma(Y, Y)$ (some uninteresting details are omitted) and so the claim is true.

Now we consider the case where there is a conjugate point to $p$ before $\gamma(b)$. Suppose that $J$ is a nonzero Jacobi field along $\gamma|_{[0, r]}$ with $0 < r < b$
such that $J(0) = J(r) = 0$. We can extend $J$ to a field $J_{ext}$ on $[0, b]$ by defining it to be $0$ on $[r, b]$. Notice that $\nabla_{\partial_t} J_{ext}(r-)$ is equal to $\nabla_{\partial_t} J(r)$, which is not $0$ since otherwise $J$ would be identically zero (over determination). On the other hand, $\nabla_{\partial_t} J_{ext}(r+) = 0$ and so the "kink" $\triangle J'_{ext}(r) := \nabla_{\partial_t} J_{ext}(r+) - \nabla_{\partial_t} J_{ext}(r-)$ is not zero. Notice that $\triangle J'_{ext}(r)$ is normal to $\gamma$ (why?). We will now show that if $W \in T_\gamma\Omega$ is such that $W(r) = \triangle J'_{ext}(r)$ (and there are plenty of such fields), then $\varepsilon I_\gamma(J_{ext} + \delta W, J_{ext} + \delta W) < 0$ for small enough $\delta > 0$. This will allow us to conclude that $I_\gamma$ cannot be semidefinite, since by Proposition 13.135 we can always find a $Z$ with $\varepsilon I_\gamma(Z, Z) > 0$. We have
$$\varepsilon I_\gamma(J_{ext} + \delta W, J_{ext} + \delta W) = \varepsilon I_\gamma(J_{ext}, J_{ext}) + 2\delta \varepsilon I_\gamma(J_{ext}, W) + \varepsilon \delta^2 I_\gamma(W, W).$$
It is not hard to see from the formula of Proposition 13.124 that $I_\gamma(J_{ext}, J_{ext})$ is zero since $J_{ext}$ is piecewise Jacobi and is zero at the single kink point $r$. But using the formula again, $\varepsilon I_\gamma(J_{ext}, W)$ reduces to
$$-\frac{1}{c} \langle \triangle J'_{ext}(r), W(r)\rangle = -\frac{1}{c} |\triangle J'_{ext}(r)|^2 < 0,$$
and so taking $\delta$ small enough gives the desired conclusion.
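The three regimes discussed above (no conjugate point, conjugate endpoint, interior conjugate point) can be seen in a model computation. The sketch assumes, beyond what the text states, a unit speed geodesic of length $b$ on a surface of constant curvature $K$ with $\varepsilon = c = 1$, and a normal test field $V = f E$ with $E$ parallel and unit, $f(t) = \sin(\pi t / b)$. Under these assumptions formula (13.9) reduces to $I_\gamma(V, V) = \int_0^b (f'^2 - K f^2)\, dt$, whose exact value is $\tfrac{b}{2}\big((\pi/b)^2 - K\big)$. The function name `index_form` and the quadrature rule are illustrative choices.

```python
import math

def index_form(b, K=1.0, n=20000):
    """Midpoint rule for the integral of f'(t)^2 - K f(t)^2 over [0, b],
    with f(t) = sin(pi t / b), which vanishes at both endpoints."""
    h = b / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        f = math.sin(math.pi * t / b)
        df = (math.pi / b) * math.cos(math.pi * t / b)
        total += (df * df - K * f * f) * h
    return total

print(index_form(2.0))       # b < pi: positive
print(index_form(math.pi))   # b = pi: approximately zero
print(index_form(4.0))       # b > pi: negative
```

For $K = 1$ the first conjugate point along the geodesic occurs at $t = \pi$, so the sign change of this single test value at $b = \pi$ is consistent with the discussion: positive before the conjugate point, zero on the Jacobi field $\sin(t)E$ when $b = \pi$, and negative once the conjugate point lies in the interior.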
Summarizing the conclusions of the above discussion (together with the result of Proposition 13.135) yields the following nice theorem:

Theorem 13.136. If $\gamma : [0, b] \to M$ is a cospacelike geodesic of sign $\varepsilon$, then $(M, g)$ is either Riemannian or Lorentz and we have the following three cases:
(i) If there are no points conjugate to $\gamma(0)$ along $\gamma$, then $\varepsilon I_\gamma^\perp$ is positive definite.
(ii) If $\gamma(b)$ is the only conjugate point to $\gamma(0)$ along $\gamma$, then $I_\gamma$ is not definite, but must be semidefinite.
(iii) If there is a point $\gamma(r)$ conjugate to $\gamma(0)$ with $0 < r < b$, then $I_\gamma$ is not semidefinite (or definite).

As we mentioned, the Jacobi equation can be written in terms of the tidal force operator $R_v : T_pM \to T_pM$ as
$$\nabla_{\partial_t}^2 J(t) = R_{\dot\gamma(t)}(J(t)).$$
The meaning of the term force here is that $R_{\dot\gamma(t)}$ controls the way nearby families of geodesics attract or repel each other. Attraction tends to create conjugate points, while repulsion tends to prevent conjugate points. If $\gamma$ is cospacelike, then we take any unit vector $u$ normal to $\dot\gamma(t)$ and look at the component of $R_{\dot\gamma(t)}(u)$ in the $u$ direction. Up to sign this is
$$\langle R_{\dot\gamma(t)}(u), u\rangle u = \langle R_{\dot\gamma(t), u}(\dot\gamma(t)), u\rangle u = -\langle \mathfrak{R}(\dot\gamma(t) \wedge u), \dot\gamma(t) \wedge u\rangle u.$$
In terms of sectional curvature, $\langle R_{\dot\gamma(t)}(u), u\rangle = -K(\dot\gamma(t) \wedge u) \langle \dot\gamma(t), \dot\gamma(t)\rangle$. It follows from the Jacobi equation that if $\langle R_{\dot\gamma(t)}(u), u\rangle \geq 0$, i.e., if $K(\dot\gamma(t) \wedge u) \langle \dot\gamma(t), \dot\gamma(t)\rangle \leq 0$, then we have repulsion, and if this happens everywhere along $\gamma$, we expect that $\gamma(0)$ has no conjugate point along $\gamma$. This intuition is indeed correct.

Proposition 13.137. Let $\gamma : [0, b] \to M$ be a cospacelike geodesic. If for every $t$ and every vector $v \in \dot\gamma(t)^\perp$ we have $\langle R_{\dot\gamma(t)}(v), v\rangle \geq 0$ (i.e. if $K(\dot\gamma(t) \wedge v) \langle \dot\gamma(t), \dot\gamma(t)\rangle \leq 0$), then $\gamma(0)$ has no conjugate point along $\gamma$. In particular, a Riemannian manifold with sectional curvature $K \leq 0$ has no conjugate pairs of points. Similarly, a Lorentz manifold with sectional curvature $K \geq 0$ has no conjugate pairs along any timelike geodesics.
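Proof. Take $J$ to be a Jacobi field along $\gamma$ such that $J(0) = 0$ and $J \perp \dot\gamma$. We have $\frac{d}{dt}\langle J, J\rangle = 2\langle \nabla_{\partial_t} J, J\rangle$ and
$$\begin{aligned}
\frac{d^2}{dt^2}\langle J, J\rangle &= 2\langle \nabla_{\partial_t} J, \nabla_{\partial_t} J\rangle + 2\langle \nabla_{\partial_t}^2 J, J\rangle \\
&= 2\langle \nabla_{\partial_t} J, \nabla_{\partial_t} J\rangle + 2\langle R_{\dot\gamma(t), J}(\dot\gamma(t)), J\rangle \\
&= 2\langle \nabla_{\partial_t} J, \nabla_{\partial_t} J\rangle + 2\langle R_{\dot\gamma(t)}(J), J\rangle,
\end{aligned}$$
and by the hypotheses $\frac{d^2}{dt^2}\langle J, J\rangle \geq 0$. On the other hand, $\langle J(0), J(0)\rangle = 0$ and $\frac{d}{dt}\big|_0 \langle J, J\rangle = 0$. It follows that since $\langle J, J\rangle$ is not identically zero we must have $\langle J, J\rangle > 0$ for all $t \in (0, b]$ and the result follows. $\Box$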
13.9. More Riemannian Geometry

Recall that a manifold is geodesically complete at $p$ if and only if $\exp_p$ is defined on all of $T_pM$. The following lemma is the essential ingredient in the proof of the Hopf-Rinow theorem stated and proved below. Note that this is a theorem about Riemannian manifolds.
Lemma 13.138. Let (M,g) be a connected Riemannian manifold. Suppose that expp is defined on the ball of radius p > 0 centered at 0 E TpM. Let Bp(p):= {x: dist(p,x) < pl. Then each point q E Bp(p) can be connected to p by an absolutely minimizing geodesic. In particular, if M is geodesically complete at p EM, then each point q E M can be connected to p by an absolutely minimizing geodesic.
Proof. Let $q \in B_\rho(p)$ with $p \ne q$ and let $R = \operatorname{dist}(p,q)$. Choose $\epsilon > 0$ small enough that $B_{2\epsilon}(p)$ is the domain of a normal coordinate system. (Refer to Figure 13.9.) By Lemma 13.85, we already know the theorem is true if $B_\rho(p) \subset B_\epsilon(p)$, so we will assume that $\epsilon < R < \rho$. Because $\partial B_\epsilon(p)$ is diffeomorphic to $S^{n-1} \subset \mathbb{R}^n$, it is compact and so there is a point $p_\epsilon \in \partial B_\epsilon(p)$ such that $x \mapsto \operatorname{dist}(x,q)$ achieves its minimum at $p_\epsilon$. This means that
$$\operatorname{dist}(p,q) = \operatorname{dist}(p,p_\epsilon) + \operatorname{dist}(p_\epsilon,q) = \epsilon + \operatorname{dist}(p_\epsilon,q).$$
Let $\gamma : [0,\rho] \to M$ be the geodesic with $|\dot\gamma| = 1$, $\gamma(0) = p$, and $\gamma(\epsilon) = p_\epsilon$.

Figure 13.9
It is not difficult to see that the set
$$T = \{t \in [0,R] : \operatorname{dist}(p,\gamma(t)) + \operatorname{dist}(\gamma(t),q) = \operatorname{dist}(p,q)\}$$
is closed in $[0,R]$ and is nonempty since $\epsilon \in T$. Let $t_{\sup} = \sup T > 0$. We will show that $t_{\sup} = R$, from which it will follow that $\gamma|_{[0,R]}$ is a minimizing geodesic from $p$ to $q$ since then $\operatorname{dist}(\gamma(R),q) = 0$ and so $\gamma(R) = q$. With an eye toward a contradiction, assume that $t_{\sup} < R$. Let $x := \gamma(t_{\sup})$ and choose $\epsilon_1$ with $0 < \epsilon_1 < R - t_{\sup}$ and small enough that $B_{2\epsilon_1}(x) \subset B_\rho(p)$ is the domain of normal coordinates about $x$. Arguing as before we see that there must be a point $x_{\epsilon_1} \in \partial B_{\epsilon_1}(x)$ such that
$$\operatorname{dist}(x,q) = \operatorname{dist}(x,x_{\epsilon_1}) + \operatorname{dist}(x_{\epsilon_1},q) = \epsilon_1 + \operatorname{dist}(x_{\epsilon_1},q).$$
Now let $\gamma_1$ be the unit speed geodesic such that $\gamma_1(0) = x$ and $\gamma_1(\epsilon_1) = x_{\epsilon_1}$. But since $t_{\sup} \in T$ and $x = \gamma(t_{\sup})$, we also have
$$\operatorname{dist}(p,x) + \operatorname{dist}(x,q) = \operatorname{dist}(p,q).$$
Combining, we now have $\operatorname{dist}(p,q) = \operatorname{dist}(p,x) + \operatorname{dist}(x,x_{\epsilon_1}) + \operatorname{dist}(x_{\epsilon_1},q)$. By the triangle inequality, $\operatorname{dist}(p,q) \le \operatorname{dist}(p,x_{\epsilon_1}) + \operatorname{dist}(x_{\epsilon_1},q)$ and so
$$\operatorname{dist}(p,x) + \operatorname{dist}(x,x_{\epsilon_1}) \le \operatorname{dist}(p,x_{\epsilon_1}).$$
But also $\operatorname{dist}(p,x_{\epsilon_1}) \le \operatorname{dist}(p,x) + \operatorname{dist}(x,x_{\epsilon_1})$ and so
$$\operatorname{dist}(p,x_{\epsilon_1}) = \operatorname{dist}(p,x) + \operatorname{dist}(x,x_{\epsilon_1}).$$
Examining the implications of this last equality, we see that the concatenation of $\gamma|_{[0,t_{\sup}]}$ with $\gamma_1$ forms a curve from $p$ to $x_{\epsilon_1}$ of length $\operatorname{dist}(p,x_{\epsilon_1})$, which must therefore be a minimizing curve. By Problem 1, this potentially broken geodesic must in fact be smooth and so must actually be the geodesic $\gamma|_{[0,t_{\sup}+\epsilon_1]}$. Then $t_{\sup}+\epsilon_1 \in T$, which contradicts the definition of $t_{\sup}$. This contradiction forces us to conclude that $t_{\sup} = R$ and we are done. □

Theorem 13.139 (Hopf-Rinow). If $(M,g)$ is a connected Riemannian manifold, then the following statements are equivalent:
(i) The metric space $(M, \operatorname{dist})$ is complete. That is, every Cauchy sequence is convergent.
(ii) There is a point $p \in M$ such that $M$ is geodesically complete at $p$.
(iii) $M$ is geodesically complete.
(iv) Every closed and bounded subset of $M$ is compact.

Proof. (iv)$\Rightarrow$(i): The set of points of a Cauchy sequence is bounded and so has compact closure. Thus there is a subsequence converging to some point. Since the original sequence was Cauchy, it must converge to this point.
(i)$\Rightarrow$(iii): Let $p$ be arbitrary and let $\gamma_v(t)$ be the geodesic with $\dot\gamma_v(0) = v \in T_pM$ and $J$ its maximal domain of definition. We can assume without loss of generality that $\langle v, v\rangle = 1$ so that $L(\gamma_v|_{[t_1,t_2]}) = t_2 - t_1$ for all relevant $t_1, t_2$. We want to show that there can be no upper bound for the set $J$. We argue by contradiction: Assume that $t_+ = \sup J$ is finite. Let $\{t_n\} \subset J$ be a Cauchy sequence such that $t_n \to t_+ < \infty$. Since $\operatorname{dist}(\gamma_v(t), \gamma_v(s)) \le |t - s|$, it follows that $\gamma_v(t_n)$ is a Cauchy sequence in $M$, which by assumption must converge. Let $q := \lim_{n\to\infty}\gamma_v(t_n)$ and choose a ball $B_\epsilon(q)$ which is small enough to be a normal neighborhood. Take $t_1$ with $0 < t_+ - t_1 < \epsilon/2$ and let $\gamma_1$ be the (maximal) geodesic with initial velocity $\dot\gamma_v(t_1)$. Then in fact $\gamma_1(t) = \gamma_v(t_1 + t)$ and so $\gamma_v$ is defined up to $t_1 + \epsilon/2 > t_+$, and this is a contradiction.

(iii)$\Rightarrow$(ii) is a trivial implication.

(ii)$\Rightarrow$(iv): Let $K$ be a closed and bounded subset of $M$. For $x \in M$, Lemma 13.138 tells us that there is a minimizing geodesic $\alpha_x : [0,1] \to M$ connecting $p$ to $x$. Then $\|\dot\alpha_x(0)\| = \operatorname{dist}(p,x)$ and $\exp(\dot\alpha_x(0)) = x$. Using the triangle inequality, one sees that
$$\sup_{x\in K}\{\|\dot\alpha_x(0)\|\} \le r$$
for some $r < \infty$. From this we obtain $\{\dot\alpha_x(0) : x \in K\} \subset B_r := \{v \in T_pM : \|v\| \le r\}$. The set $B_r$ is compact. Now $\exp(B_r)$ is compact and contains the closed set $K$, so $K$ is also compact.

(ii)$\Rightarrow$(i): Suppose $M$ is geodesically complete at $p$. Now let $\{x_n\}$ be any Cauchy sequence in $M$. For each $x_n$, there is (by assumption) a minimizing geodesic from $p$ to $x_n$, which we denote by $\gamma_{px_n}$. We may assume that each $\gamma_{px_n}$ is unit speed. It is easy to see that the sequence $\{l_n\}$, where $l_n := L(\gamma_{px_n}) = \operatorname{dist}(p,x_n)$, is a Cauchy sequence in $\mathbb{R}$ with some limit, say $l$. The key fact is that the initial velocities $\dot\gamma_{px_n}(0)$ are all unit vectors in $T_pM$ and so form a sequence in the (compact) unit sphere in $T_pM$. Replacing $\{\gamma_{px_n}\}$ by a subsequence if necessary we have $\dot\gamma_{px_n}(0) \to u \in T_pM$ for some unit vector $u$. Continuous dependence on initial velocities implies that $\{x_n\} = \{\gamma_{px_n}(l_n)\}$ has the limit $\gamma_u(l)$. □

Let $(M,g)$ be a complete connected Riemannian manifold with sectional curvature $K \le 0$. By Proposition 13.137, for each point $p \in M$, the geodesics emanating from $p$ have no conjugate points and so $T_{v_p}\exp_p : T_{v_p}T_pM \to T_{\exp_p(v_p)}M$ is nonsingular for each $v_p \in T_pM$. This means that $\exp_p$ is a local diffeomorphism. If we give $T_pM$ the metric $\exp_p^*(g)$, then $\exp_p$ is a local isometry. It now follows from Theorem 13.76 that $\exp_p : T_pM \to M$ is a Riemannian covering. Thus we arrive at the Hadamard theorem.

Theorem 13.140 (Hadamard). If $(M,g)$ is a complete simply connected Riemannian manifold with sectional curvature $K \le 0$, then $\exp_p : T_pM \to M$ is a diffeomorphism and each two points of $M$ can be connected by a unique geodesic segment.

Definition 13.141. If $(M,g)$ is a Riemannian manifold, then the diameter of $M$ is defined to be
$$\operatorname{diam}(M) := \sup\{\operatorname{dist}(p,q) : p, q \in M\}.$$
The injectivity radius at $p \in M$, denoted $\operatorname{inj}(p)$, is the supremum over all $\epsilon > 0$ such that $\exp_p : B(0_p,\epsilon) \to B(p,\epsilon)$ is a diffeomorphism. The injectivity radius of $M$ is $\operatorname{inj}(M) := \inf_{p\in M}\{\operatorname{inj}(p)\}$.

The Hadamard theorem above has as a hypothesis that the sectional curvature is nonpositive. A bound on the sectional curvature is stronger than a bound on Ricci curvature since the latter is a sort of average sectional curvature. In the sequel, statements like $\operatorname{Ric} \ge C$ should be interpreted to mean $\operatorname{Ric}(v,v) \ge C\langle v, v\rangle$ for all $v \in TM$.

Lemma 13.142. Let $(M,g)$ be an $n$-dimensional Riemannian manifold and let $\gamma : [0,L] \to M$ be a unit speed geodesic. Suppose that $\operatorname{Ric} \ge (n-1)\kappa > 0$ for some constant $\kappa > 0$ (at least along $\gamma$). If the length $L$ of $\gamma$ is greater than or equal to $\pi/\sqrt{\kappa}$, then there is a point conjugate to $\gamma(0)$ along $\gamma$.
Proof. Suppose $0 < \pi/\sqrt{\kappa} \le L$. If we can show that $I_\gamma^{\perp}$ is not positive definite, then Theorem 13.136 implies the result. To show that $I_\gamma^{\perp}$ is not positive definite, we find an appropriate vector field $V \ne 0$ along $\gamma$ such that $I(V,V) \le 0$. Choose orthonormal parallel fields $E_2,\ldots,E_n$ so that $\dot\gamma, E_2,\ldots,E_n$ is an orthonormal frame along $\gamma$. For a function $f : [0,\pi/\sqrt{\kappa}] \to \mathbb{R}$ that vanishes at endpoints, we form the fields $fE_j$. Using (13.9), we have
$$I_\gamma(fE_j, fE_j) = \int_0^{\pi/\sqrt{\kappa}} \left\{ f'(s)^2 + f(s)^2\langle R_{E_j,\dot\gamma}(E_j(s)), \dot\gamma(s)\rangle \right\} ds,$$
and then
$$\sum_{j=2}^n I_\gamma(fE_j, fE_j) = \int_0^{\pi/\sqrt{\kappa}} \left\{ (n-1)(f')^2 - f^2\operatorname{Ric}(\dot\gamma,\dot\gamma) \right\} ds \le (n-1)\int_0^{\pi/\sqrt{\kappa}} \left( (f')^2 - \kappa f^2 \right) ds.$$
Letting $f(s) = \sin(\sqrt{\kappa}\,s)$, we get
$$\sum_{j=2}^n I(fE_j, fE_j) \le (n-1)\int_0^{\pi/\sqrt{\kappa}} \kappa\left( \cos^2(\sqrt{\kappa}\,s) - \sin^2(\sqrt{\kappa}\,s) \right) ds = 0,$$
and so $I(fE_j, fE_j) \le 0$ for some $j$. □
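The vanishing of the last integral can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `index_bound` and the midpoint-rule discretization are illustrative assumptions. It evaluates $\int_0^{\pi/\sqrt{\kappa}} ((f')^2 - \kappa f^2)\,ds$ for $f(s) = \sin(\sqrt{\kappa}\,s)$.

```python
import math

def index_bound(kappa, n_steps=20000):
    """Midpoint-rule approximation of the integral of f'(s)^2 - kappa*f(s)^2
    over [0, pi/sqrt(kappa)] for the test function f(s) = sin(sqrt(kappa)*s).
    (Hypothetical helper, introduced only for this illustration.)"""
    root = math.sqrt(kappa)
    length = math.pi / root
    h = length / n_steps
    total = 0.0
    for i in range(n_steps):
        s = (i + 0.5) * h
        f = math.sin(root * s)
        df = root * math.cos(root * s)
        total += (df * df - kappa * f * f) * h
    return total

# The integrand equals kappa*cos(2*sqrt(kappa)*s), which integrates to zero
# over one full period, so each printed value should be numerically negligible.
for kappa in (0.25, 1.0, 4.0):
    print(kappa, index_bound(kappa))
```

Since the bound is exactly zero for this choice of $f$, the sum of the index values is nonpositive, which is all the proof requires.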
The next theorem also assumes only a bound on the Ricci curvature and is one of the most celebrated theorems of Riemannian geometry. A weaker version involving sectional curvature was first proved by Ossian Bonnet (see [Hicks], page 165).
Theorem 13.143 (Myers). Let $(M,g)$ be a complete connected Riemannian manifold of dimension $n$. If $\operatorname{Ric} \ge (n-1)\kappa > 0$, then
(i) $\operatorname{diam}(M) \le \pi/\sqrt{\kappa}$, $M$ is compact, and
(ii) $\pi_1(M)$ is finite.

Proof. Since $M$ is complete, there is always a shortest geodesic $\gamma_{pq}$ between any two given points $p$ and $q$. We can assume that $\gamma_{pq}$ is parametrized by arc length: $\gamma_{pq} : [0,\operatorname{dist}(p,q)] \to M$. It follows that $\gamma_{pq}|_{[0,a]}$ is arc length minimizing for all $a \in [0,\operatorname{dist}(p,q)]$. From Proposition 13.129 we see that the only possible conjugate to $p$ along $\gamma_{pq}$ is $q$. The preceding lemma shows that $\operatorname{dist}(p,q) > \pi/\sqrt{\kappa}$ is impossible. Since the points $p$ and $q$ were arbitrary, we must have $\operatorname{diam}(M) \le \pi/\sqrt{\kappa}$. It follows from the Hopf-Rinow theorem that $M$ is compact.
For (ii) we consider the simply connected covering $\wp : \widetilde{M} \to M$ (which is a local isometry). Since $\wp$ is a local diffeomorphism, it follows that $\wp^{-1}(p)$ has no accumulation points for any $p \in M$. But also, because $\widetilde{M}$ is complete and has the same Ricci curvature bound as $M$, it is compact. It follows that $\wp^{-1}(p)$ is finite for any $p \in M$, which implies (ii). □

The reader may check that if $S^n(R)$ is a sphere of radius $R$ in $\mathbb{R}^{n+1}$, then $S^n(R)$ has constant sectional curvature $\kappa = 1/R^2$ and the distance from any point to its cut locus (defined below) is $\pi/\sqrt{\kappa}$. A result of S. Y. Cheng states that with the curvature bound of the theorem above, if $\operatorname{diam}(M) = \pi/\sqrt{\kappa}$, then $M$ is a sphere of constant sectional curvature $\kappa$. See [Cheng].
13.10. Cut Locus
In this section we consider Riemannian manifolds. Related to the notion of conjugate point is the notion of a cut point. For a point $p \in M$ and a geodesic $\gamma$ emanating from $p = \gamma(0)$, a cut point of $p$ along $\gamma$ is the first point $q = \gamma(t')$ along $\gamma$ such that for any point $r = \gamma(t'')$ beyond $q$ (i.e. $t'' > t'$) there is a geodesic shorter than $\gamma|_{[0,t'']}$ which connects $p$ with $r$. To see the difference between this notion and that of a point conjugate to $p$, it suffices to consider the example of a cylinder $S^1 \times \mathbb{R}$ with the obvious flat metric. If $p = (1,0) \in S^1 \times \mathbb{R}$, then for any $x \in \mathbb{R}$, the point $(e^{i\pi}, x)$ is a cut point of $p$ along the geodesic $\gamma(t) := (e^{it\pi}, tx)$. We know that beyond a conjugate point, a geodesic is not (locally) minimizing. In our cylinder example, for any $\epsilon > 0$, the point $q = \gamma(1+\epsilon)$ can be reached by the geodesic segment $\gamma_2 : [0,1-\epsilon] \to S^1 \times \mathbb{R}$ given by $\gamma_2(t) := (e^{-it\pi}, atx)$, where $a = (1+\epsilon)/(1-\epsilon)$. It can be checked that $\gamma_2$ is shorter than $\gamma|_{[0,1+\epsilon]}$. However, the last example shows that a cut point need not be a conjugate point. In fact, $S^1 \times \mathbb{R}$ has no conjugate points along any geodesic. Let us agree that all geodesics referred to in this section are parametrized by arc length unless otherwise indicated.
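The length comparison in the cylinder example can be sketched numerically. This is only an illustration under stated assumptions: the competing segment is taken to wind around the circle in the opposite direction, $\gamma_2(t) = (e^{-it\pi}, atx)$ on $[0, 1-\epsilon]$ with $a = (1+\epsilon)/(1-\epsilon)$, and the helper names are hypothetical.

```python
import cmath
import math

def length_gamma(x, eps):
    """Length of gamma(t) = (e^{i t pi}, t x) on [0, 1 + eps] in the flat
    product metric; the speed is the constant sqrt(pi^2 + x^2)."""
    return (1.0 + eps) * math.hypot(math.pi, x)

def length_gamma2(x, eps):
    """Length of the assumed competitor gamma_2(t) = (e^{-i t pi}, a t x)
    on [0, 1 - eps], where a = (1 + eps) / (1 - eps)."""
    a = (1.0 + eps) / (1.0 - eps)
    return (1.0 - eps) * math.hypot(math.pi, a * x)

def endpoints_agree(x, eps, tol=1e-12):
    """Check that both segments end at the same point of S^1 x R."""
    a = (1.0 + eps) / (1.0 - eps)
    end1 = (cmath.exp(1j * (1.0 + eps) * math.pi), (1.0 + eps) * x)
    end2 = (cmath.exp(-1j * (1.0 - eps) * math.pi), a * (1.0 - eps) * x)
    return abs(end1[0] - end2[0]) < tol and abs(end1[1] - end2[1]) < tol

x, eps = 2.0, 0.1
print(endpoints_agree(x, eps))
print(length_gamma2(x, eps) < length_gamma(x, eps))
```

Both segments cover the same height $(1+\epsilon)x$, but $\gamma_2$ travels an angle $(1-\epsilon)\pi$ around the circle instead of $(1+\epsilon)\pi$, which is why it is shorter.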
Definition 13.144. Let $(M,g)$ be a complete Riemannian manifold and let $p \in M$. The set $C(p)$ of all cut points to $p$ along geodesics emanating from $p$ is called the cut locus of $p$.

For a point $p \in M$, the situation is summarized by the fact that if $q = \gamma(t')$ is a cut point of $p$ along a geodesic $\gamma$, then for any $t'' > t'$ there is a geodesic connecting $p$ with $\gamma(t'')$ which is shorter than $\gamma|_{[0,t'']}$, while if $t'' < t'$, then not only is there no geodesic connecting $p$ and $\gamma(t'')$ with shorter length but there is no geodesic connecting $p$ and $\gamma(t'')$ whose length is even equal to that of $\gamma|_{[0,t'']}$. (Why?) Consider the following two conditions:

(C1): $\gamma(t_0)$ is the first conjugate point of $p = \gamma(0)$ along $\gamma$.
(C2): There is a unit speed geodesic $\alpha$ from $\gamma(0)$ to $\gamma(t_0)$ that is different from $\gamma|_{[0,t_0]}$ such that $L(\alpha) = L(\gamma|_{[0,t_0]})$.

Proposition 13.145. Let $M$ be a complete Riemannian manifold.
(i) If for a given unit speed geodesic $\gamma$, either condition (C1) or (C2) holds, then there is a $t_1 \in (0,t_0]$ such that $\gamma(t_1)$ is the cut point of $p$ along $\gamma$.
(ii) If $\gamma(t_0)$ is the cut point of $p = \gamma(0)$ along the unit speed geodesic ray $\gamma$, then either condition (C1) or (C2) holds.

Proof. (i) This is already clear from our discussion: For suppose (C1) holds; then $\gamma|_{[0,t']}$ cannot minimize for $t' > t_0$ and so the cut point must be $\gamma(t_1)$ for some $t_1 \in (0,t_0]$. Now if (C2) holds, then choose $\epsilon > 0$ small enough that $\alpha(t_0-\epsilon)$ and $\gamma(t_0+\epsilon)$ are both contained in a convex neighborhood of $\gamma(t_0)$. The concatenation of $\alpha|_{[0,t_0]}$ and $\gamma|_{[t_0,t_0+\epsilon]}$ is a curve, say $c$, that has a kink at $\gamma(t_0)$. But there is a unique minimizing geodesic $\tau$ joining $\alpha(t_0-\epsilon)$ to $\gamma(t_0+\epsilon)$, and we can concatenate the geodesic $\alpha|_{[0,t_0-\epsilon]}$ with $\tau$ to get a curve with arc length strictly less than $L(c) = t_0+\epsilon$. It follows that the cut point to $p$ along $\gamma$ must occur at $\gamma(t')$ for some $t' \le t_0+\epsilon$. But $\epsilon$ can be taken arbitrarily small and so the result (i) follows.
(ii) Suppose that $\gamma(t_0)$ is the cut point of $p = \gamma(0)$ along a unit speed geodesic ray $\gamma$. We let $\epsilon_i \to 0$ and consider a sequence $\{\alpha_i\}$ of minimizing geodesics with $\alpha_i$ connecting $p$ to $\gamma(t_0+\epsilon_i)$. We have a corresponding sequence of initial velocities $u_i := \dot\alpha_i(0)$ in the unit sphere of $T_pM$. The unit sphere in $T_pM$ is compact, so replacing $u_i$ by a subsequence we may assume that $u_i \to u$ for some unit vector $u \in T_pM$. Let $\alpha$ be the unit speed segment joining $p$ to $\gamma(t_0)$ with initial velocity $u$. Arguing from continuity, we see that $\alpha$ is also minimizing and $L(\alpha) = L(\gamma|_{[0,t_0]})$. If $\alpha \ne \gamma|_{[0,t_0]}$, then we are done. If $\alpha = \gamma|_{[0,t_0]}$, then since $\gamma|_{[0,t_0]}$ is minimizing, it will suffice to show that $T_{t_0\dot\gamma(0)}\exp_p$ is singular because that would imply that condition (C1) holds. The proof of this last statement is by contradiction: Suppose that $\alpha = \gamma|_{[0,t_0]}$ (so that $\dot\gamma(0) = u$) and that $T_{t_0\dot\gamma(0)}\exp_p$ is not singular. Take $U$ to be an open neighborhood of $t_0\dot\gamma(0)$ in $T_pM$ such that $\exp_p|_U$ is a diffeomorphism. Now $\alpha_i(t_0+\epsilon_i') = \gamma(t_0+\epsilon_i)$ for some $0 < \epsilon_i' \le \epsilon_i$ since the $\alpha_i$ are minimizing. We now restrict attention to $i$ such that $\epsilon_i$ is small enough that $(t_0+\epsilon_i')u_i$ and $(t_0+\epsilon_i)u$ are in $U$. Then we have
$$\exp_p (t_0+\epsilon_i)u = \gamma(t_0+\epsilon_i) = \alpha_i(t_0+\epsilon_i') = \exp_p (t_0+\epsilon_i')u_i,$$
and so $(t_0+\epsilon_i)u = (t_0+\epsilon_i')u_i$, and then, since $\epsilon_i \to 0$ and both $u$ and $u_i$ are unit vectors, we have $\dot\gamma(0) = u = u_i$ for sufficiently large $i$. But then for such $i$, we have $\alpha_i = \gamma$ on $[0,t_0+\epsilon_i]$, which contradicts the fact that $\gamma|_{[0,t_0+\epsilon_i]}$ is not minimizing. □
Exercise 13.146. Show that if $q$ is the cut point of $p$ along $\gamma$, then $p$ is the cut point of $q$ along $\gamma^{\leftarrow}$ (where $\gamma^{\leftarrow}(t) := \gamma(L-t)$ and $L = L(\gamma)$).
It follows from the development so far that if $q \in M \setminus C(p)$, then there is a unique minimizing geodesic joining $p$ to $q$, and that if $B(p,R)$ is the ball of radius $R$ centered at $p$, then $\exp_p$ is a diffeomorphism on $B(p,R)$ provided $R \le d(p,C(p))$. In fact, an alternative definition of the injectivity radius at $p$ is $d(p,C(p))$ and the injectivity radius of $M$ is
$$\operatorname{inj}(M) = \inf_{p\in M}\{d(p,C(p))\}.$$
Intuitively, the complexities of the topology of $M$ begin at the cut locus of a given point. Let $T^1M$ denote the unit tangent bundle of the Riemannian manifold:
$$T^1M = \{u \in TM : \|u\| = 1\}.$$
Define a function $c_M : T^1M \to (0,\infty]$ by
$$c_M(u) := \begin{cases} t_0 & \text{if } \gamma_u(t_0) \text{ is the cut point of } \pi_{TM}(u) \text{ along } \gamma_u, \\ \infty & \text{if there is no cut point in the direction } u. \end{cases}$$
Recall that the topology on $(0,\infty]$ is such that a sequence $t_k$ converges to the point $\infty$ if $\lim_{k\to\infty} t_k = \infty$ in the usual sense. It can be shown that if $(M,g)$ is a complete Riemannian manifold, then the function $c_M : T^1M \to (0,\infty]$ is continuous (see [Kob]).
13.11. Rauch's Comparison Theorem
In this section we deal strictly with Riemannian manifolds.

Definition 13.147. Let $\gamma : [a,b] \to M$ be a smooth curve. For $X, Y$ piecewise smooth vector fields along $\gamma$, define
$$I_\gamma(X,Y) := \int_a^b \langle\nabla_{\partial_t}X, \nabla_{\partial_t}Y\rangle + \langle R_{\dot\gamma,X}\dot\gamma, Y\rangle\, dt.$$
The map $X, Y \mapsto I_\gamma(X,Y)$ is symmetric and bilinear. In defining $I_\gamma$ we have used a formula for the index form that is valid for fields which vanish at the endpoints and with $\gamma$ a nonnull geodesic. Thus when $\gamma$ is a nonnull geodesic, the restriction of $I_\gamma$ to variation vector fields which vanish at the endpoints is the index form. Thus $I_\gamma$ is a sort of extended index form.
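Corollary 13.148. Let $\gamma : [0,b] \to M$ be a cospacelike geodesic of sign $\varepsilon$ with no points conjugate to $\gamma(0)$ along $\gamma$. Suppose that $Y$ is a piecewise smooth vector field along $\gamma$ and that $J$ is a Jacobi field along $\gamma$ such that
$$Y(0) = J(0), \quad Y(b) = J(b), \quad \text{and} \quad (Y - J) \perp \dot\gamma.$$
Then $\varepsilon I_\gamma(J,J) \le \varepsilon I_\gamma(Y,Y)$.

Proof of the corollary. From Theorem 13.136, we have $0 \le \varepsilon I_\gamma^{\perp}(Y-J, Y-J) = \varepsilon I_\gamma(Y-J, Y-J)$ and so
$$0 \le \varepsilon I_\gamma(Y,Y) - 2\varepsilon I_\gamma(J,Y) + \varepsilon I_\gamma(J,J).$$
Integrating by parts, we have
$$\varepsilon I_\gamma(J,Y) = \varepsilon\langle\nabla_{\partial_t}J, Y\rangle\big|_0^b - \varepsilon\int_0^b \left( \langle\nabla^2_{\partial_t}J, Y\rangle - \langle R_{\dot\gamma,J}\dot\gamma, Y\rangle \right) dt = \varepsilon\langle\nabla_{\partial_t}J, Y\rangle\big|_0^b = \varepsilon\langle\nabla_{\partial_t}J, J\rangle\big|_0^b = \varepsilon I_\gamma(J,J)$$
(since $J$ is a Jacobi field). Thus
$$0 \le \varepsilon I_\gamma(Y,Y) - 2\varepsilon I_\gamma(J,Y) + \varepsilon I_\gamma(J,J) = \varepsilon I_\gamma(Y,Y) - \varepsilon I_\gamma(J,J).$$
□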
Recall that for a Riemannian manifold $M$, the sectional curvature $K_M(P)$ of a 2-plane $P \subset T_pM$ is
$$\langle \mathfrak{R}(e_1\wedge e_2), e_1\wedge e_2\rangle$$
for any orthonormal pair $e_1, e_2$ that spans $P$.

Definition 13.149. Let $M, g$ and $N, h$ be Riemannian manifolds and let $\gamma^M : [a,b] \to M$ and $\gamma^N : [a,b] \to N$ be unit speed geodesics defined on the same interval $[a,b]$. We say that $K_M \ge K_N$ along the pair $(\gamma^M, \gamma^N)$ if $K_M(Q_{\gamma^M(t)}) \ge K_N(P_{\gamma^N(t)})$ for all $t \in [a,b]$ and every pair of 2-planes $Q_t \subset T_{\gamma^M(t)}M$, $P_t \subset T_{\gamma^N(t)}N$.

We develop some notation to be used in the proof of Rauch's theorem. Let $M$ be a given Riemannian manifold. If $Y$ is a piecewise smooth vector field along a unit speed geodesic $\gamma^M$ such that $Y(a) = 0$, then let
$$I_s^M(Y,Y) := \int_a^s \langle\nabla_{\partial_t}Y(t), \nabla_{\partial_t}Y(t)\rangle + \langle R_{\dot\gamma^M,Y}\dot\gamma^M, Y\rangle(t)\, dt = \int_a^s -\langle\nabla^2_{\partial_t}Y(t), Y(t)\rangle + \langle R_{\dot\gamma^M,Y}\dot\gamma^M, Y\rangle(t)\, dt + \langle\nabla_{\partial_t}Y, Y\rangle(s).$$
If $Y$ is an orthogonal Jacobi field, then the integrand in the last expression vanishes and
$$I_s^M(Y,Y) = \langle\nabla_{\partial_t}Y, Y\rangle(s).$$
Theorem 13.150 (Rauch). Let $M, g$ and $N, h$ be Riemannian manifolds of the same dimension and let $\gamma^M : [a,b] \to M$ and $\gamma^N : [a,b] \to N$ be unit speed geodesics defined on the same interval $[a,b]$. Let $J^M$ and $J^N$ be Jacobi fields along $\gamma^M$ and $\gamma^N$ respectively and orthogonal to their respective curves. Suppose that the following four conditions hold:
(i) $J^M(a) = J^N(a) = 0$ and neither of $J^M(t)$ or $J^N(t)$ is zero for $t \in (a,b]$.
(ii) $\|\nabla_{\partial_t}J^M(a)\| = \|\nabla_{\partial_t}J^N(a)\|$.
(iii) $L(\gamma^M) = \operatorname{dist}(\gamma^M(a), \gamma^M(b))$.
(iv) $K_M \ge K_N$ along the pair $(\gamma^M, \gamma^N)$.
Then $\|J^M(t)\| \le \|J^N(t)\|$ for all $t \in [a,b]$.

Proof. Let $f_M$ be defined by $f_M(s) := \|J^M(s)\|^2$ and $h_M$ by $h_M(s) := I_s^M(J^M,J^M)/\|J^M(s)\|^2$ for $s \in (a,b]$. Define $f_N$ and $h_N$ analogously. We have
$$f_M'(s) = 2I_s^M(J^M,J^M) \quad \text{and} \quad f_M'/f_M = 2h_M$$
and the analogous equalities for $f_N$ and $h_N$. If $c \in (a,b)$, then
$$\ln(\|J^M(s)\|^2) = \ln(\|J^M(c)\|^2) + 2\int_c^s h_M(s')\,ds'$$
with the analogous equation for $N$. Thus
$$\ln\left(\frac{\|J^M(s)\|^2}{\|J^N(s)\|^2}\right) = \ln\left(\frac{\|J^M(c)\|^2}{\|J^N(c)\|^2}\right) + 2\int_c^s [h_M(s') - h_N(s')]\,ds'.$$
From the assumptions (i) and (ii) and L'Hôpital's rule, we have
$$\lim_{c\to a+}\frac{\|J^M(c)\|^2}{\|J^N(c)\|^2} = 1,$$
and so
$$\ln\left(\frac{\|J^M(s)\|^2}{\|J^N(s)\|^2}\right) = 2\lim_{c\to a+}\int_c^s [h_M(s') - h_N(s')]\,ds'.$$
If we can show that $h_M(s) - h_N(s) \le 0$ for $s \in (a,b]$, then the result will follow. So fix $r \in (a,b]$ and let $Z^M(s) := J^M(s)/\|J^M(r)\|$ and $Z^N(s) := J^N(s)/\|J^N(r)\|$. Notice the $r$ in the denominator; $Z^N(s)$ is not necessarily of unit length for $s \ne r$. We now define parametrized families of subspaces of the tangent spaces along $\gamma^M$ by $W_M(s) := \dot\gamma^M(s)^{\perp} \subset T_{\gamma^M(s)}M$ and similarly for $W_N(s)$. We can choose a linear isometry $L_r : W_N(r) \to W_M(r)$ such that $L_r(Z^N(r)) = Z^M(r)$. We now want to extend $L_r$ to a family of linear isometries $L_s : W_N(s) \to W_M(s)$. We do this using parallel transport by
$$L_s := P(\gamma^M)_r^s \circ L_r \circ P(\gamma^N)_s^r.$$
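Define a vector field $Y$ along $\gamma^M$ by $Y(s) := L_s(Z^N(s))$. Check that
$$Y(a) = Z^M(a) = 0, \quad Y(r) = Z^M(r), \quad \|Y\|^2 = \|Z^N\|^2, \quad \|\nabla_{\partial_t}Y\|^2 = \|\nabla_{\partial_t}Z^N\|^2.$$
The last equality is a result of Exercise 12.43 where in the notation of that exercise $\beta(t) := P(\gamma^M)_t^r \circ Y(t)$. Since (iii) holds, there can be no conjugates along $\gamma^M$ up to $r$. Now $Y - Z^M$ is orthogonal to the geodesic $\gamma^M$ and so by Corollary 13.148 we have $I_r^M(Z^M,Z^M) \le I_r^M(Y,Y)$ and in fact, using (iv) and the list of equations above, and writing $R^M(\dot\gamma^M, Y, \dot\gamma^M, Y) := \langle R_{\dot\gamma^M,Y}\dot\gamma^M, Y\rangle$, we have
$$\begin{aligned}
I_r^M(Z^M,Z^M) \le I_r^M(Y,Y) &= \int_a^r \|\nabla_{\partial_t}Y\|^2 + R^M(\dot\gamma^M, Y, \dot\gamma^M, Y) \\
&= \int_a^r \|\nabla_{\partial_t}Y\|^2 - K_M(\dot\gamma^M\wedge Y)\,\|\dot\gamma^M\|^2\|Y\|^2 \\
&\le \int_a^r \|\nabla_{\partial_t}Z^N\|^2 - K_N(\dot\gamma^N\wedge Z^N)\,\|\dot\gamma^N\|^2\|Z^N\|^2 \quad \text{(by (iv))} \\
&= \int_a^r \|\nabla_{\partial_t}Z^N\|^2 + R^N(\dot\gamma^N, Z^N, \dot\gamma^N, Z^N) = I_r^N(Z^N,Z^N).
\end{aligned}$$
Recalling the definition of $Z^M$ and $Z^N$ we obtain
$$\frac{I_r^M(J^M,J^M)}{\|J^M(r)\|^2} \le \frac{I_r^N(J^N,J^N)}{\|J^N(r)\|^2}$$
and so $h_M(r) - h_N(r) \le 0$. But $r$ was arbitrary, and so we are done. □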
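As an illustration of the conclusion of Rauch's theorem, one can compare normal Jacobi fields in the constant curvature model spaces, where they are known in closed form. The sketch below is not from the text: it assumes the standard formulas $\|J(t)\| = \sin(\sqrt{K}t)/\sqrt{K}$, $t$, and $\sinh(\sqrt{-K}t)/\sqrt{-K}$ for $K > 0$, $K = 0$, and $K < 0$, with $J(0) = 0$ and $\|\nabla_{\partial_t}J(0)\| = 1$, and the function name `jacobi_norm` is hypothetical.

```python
import math

def jacobi_norm(K, t):
    """Norm at time t of a normal Jacobi field with J(0) = 0 and unit initial
    derivative along a unit speed geodesic in a space of constant curvature K.
    (Closed-form model-space formulas, assumed for this illustration.)"""
    if K > 0:
        root = math.sqrt(K)
        return math.sin(root * t) / root
    if K < 0:
        root = math.sqrt(-K)
        return math.sinh(root * t) / root
    return t

# Higher curvature should give smaller Jacobi fields on (0, pi), where the
# geodesic on the unit sphere is still minimizing (condition (iii)).
ts = [0.01 * k for k in range(1, 314)]
sphere_below_plane = all(jacobi_norm(1.0, t) <= jacobi_norm(0.0, t) for t in ts)
plane_below_hyperbolic = all(jacobi_norm(0.0, t) <= jacobi_norm(-1.0, t) for t in ts)
print(sphere_below_plane, plane_below_hyperbolic)
```

Here the unit sphere plays the role of $M$ and the plane the role of $N$ in the first comparison, and the plane and the hyperbolic plane play those roles in the second.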
Corollary 13.151. Let $M, g$ and $N, h$ be Riemannian manifolds of the same dimension and let $\gamma^M : [a,b] \to M$ and $\gamma^N : [a,b] \to N$ be unit speed geodesics defined on the same interval $[a,b]$. Assume that $K_M \ge K_N$ along the pair $(\gamma^M, \gamma^N)$. Then if $\gamma^M(a)$ has no conjugate point along $\gamma^M$, then $\gamma^N(a)$ has no conjugate point along $\gamma^N$.

The above corollary is easily deduced from the Rauch comparison theorem above and we invite the reader to prove it. The following famous theorem is also proved using the Rauch comparison theorem but the proof is quite difficult and also uses Morse theory. The proof may be found in [C-E].
Theorem 13.152 (The sphere theorem). Let $M, g$ be a complete simply connected Riemannian manifold with sectional curvature satisfying the condition
$$\frac{1}{4}\frac{1}{R^2} < K \le \frac{1}{R^2}$$
for some $R > 0$. Then $M$ is homeomorphic to the sphere $S^n$.
13.12. Weitzenböck Formulas
The divergence of a vector field in terms of the Levi-Civita connection is given by
$$\operatorname{div}(X) = \operatorname{trace}(\nabla X).$$
Thus if $(e_1,\ldots,e_n)$ and $(\theta^1,\ldots,\theta^n)$ are dual bases for $T_pM$ and $T_p^*M$, then
$$(\operatorname{div} X)_p = \sum_{j=1}^n \theta^j(\nabla_{e_j}X).$$
Exercise 13.153. Show that this definition is compatible with our previous definition by showing that, with the above definition, we have $L_X \operatorname{vol} = (\operatorname{div} X)\operatorname{vol}$, where vol is the metric volume element for $M$. Hint: Use $L_X = d\,i_X + i_X d$.

Definition 13.154. Let $\nabla$ be a torsion free covariant derivative on $M$. The divergence of a $(k,\ell)$ tensor field $A$ is a $(k-1,\ell)$ tensor field defined by
$$(\operatorname{div} A)_p(\alpha^1,\ldots,\alpha^{k-1}, v_1,\ldots,v_\ell) = \sum_{j=1}^n (\nabla_{e_j}A)(\theta^j, \alpha^1,\ldots,\alpha^{k-1}, v_1,\ldots,v_\ell),$$
where $(e_1,\ldots,e_n)$ and $(\theta^1,\ldots,\theta^n)$ are dual as above.

Notice that the above definition depends on the choice of covariant derivative but if $M$ is semi-Riemannian, then we will use the Levi-Civita connection. In index notation the definition is quite simple in appearance. For example, if $A^{ij}{}_{kl}$ are the components of a $(2,2)$ tensor field, then we have $(\operatorname{div} A)^{j}{}_{kl} = \nabla_r A^{rj}{}_{kl}$, where $\nabla_a A^{ij}{}_{kl}$ are the components of $\nabla A$. Also, the definition given is really for the divergence with respect to the first contravariant slot, but we could use other slots (the last slot being popular). Thus $\nabla_r A^{jr}{}_{kl}$ is also a divergence. If we are to define a divergence with respect to a covariant slot (i.e., a lower index), then we must use a metric to raise the index. This leads to the following definition appropriate in the presence of a metric.
Definition 13.155. Let $\nabla$ be the Levi-Civita covariant derivative on a manifold $M$. The (metric) divergence of a function is defined to be zero and the divergence of a $(0,\ell)$ tensor field $A$ is a $(0,\ell-1)$ tensor field defined at any $p \in M$ by
$$(\operatorname{div} A)_p(v_1,\ldots,v_{\ell-1}) = \sum_{j=1}^n \langle e_j, e_j\rangle (\nabla_{e_j}A)(e_j, v_1,\ldots,v_{\ell-1}),$$
where $(e_1,\ldots,e_n)$ is an orthonormal basis for $T_pM$. Note that the factors $\langle e_j, e_j\rangle$ are equal to 1 in the case of a definite metric. One may check that if $A_{i_1\ldots i_\ell}$ are the components of $A$ in some chart, then the components of $\operatorname{div} A$ are given by
$$(\operatorname{div} A)_{i_1\ldots i_{\ell-1}} = g^{rs}\nabla_r A_{s\,i_1\ldots i_{\ell-1}}.$$
Recall that the formal adjoint $\delta : \Omega(M) \to \Omega(M)$ of the exterior derivative on a Riemannian manifold $M$ is given on $\Omega^k(M)$ by $\delta := (-1)^{n(k+1)+1} * d\, *$. In Problem 8 we ask the reader to show that the restriction of div to $\Omega^k(M)$ is $-\delta$:
$$\operatorname{div} = -\delta \quad \text{on } \Omega^k(M) \text{ for each } k. \tag{13.11}$$
In the same problem the reader is asked to show that if $\mu \in \Omega^k(M)$ is parallel ($\nabla\mu = 0$), then it is harmonic (recall Definition 9.46). In what follows we simplify calculations by the use of a special kind of orthonormal frame field. If $(M,g)$ is a Riemannian manifold and $p \in M$, choose an orthonormal basis $(e_1,\ldots,e_n)$ in the tangent space $T_pM$ and parallel translate each $e_i$ along the radial geodesics $t \mapsto \exp_p(tv)$ for $v \in T_pM$. This results in an orthonormal frame field $(E_1,\ldots,E_n)$ on some normal neighborhood centered at $p$. Smoothness is easy to prove. The resulting fields are radially parallel and satisfy $E_i(p) = e_i$ and $(\nabla E_i)(p) = 0$ for every $i$. Furthermore we have
$$[E_i,E_j](p) = \nabla_{E_i}E_j(p) - \nabla_{E_j}E_i(p) = 0 \quad \text{for all } i, j.$$
We refer to an orthonormal frame field with these properties as an adapted orthonormal frame field centered at $p$. Before proceeding we need an exercise to set things up.

Exercise 13.156. Describe how a connection $\nabla$ (say the Levi-Civita connection) on $M$ extends to a connection on the bundle $\wedge T^*M$ in such a way that
$$\nabla_X(\alpha^1\wedge\cdots\wedge\alpha^k) = \sum_{i=1}^k \alpha^1\wedge\cdots\wedge\nabla_X\alpha^i\wedge\cdots\wedge\alpha^k$$
for $\alpha^i \in \Omega^1(M)$. Show that the curvature of the extended connection is given by
$$R(X,Y)\mu = \nabla_X\nabla_Y\mu - \nabla_Y\nabla_X\mu - \nabla_{[X,Y]}\mu.$$
Relate this curvature operator to the curvature operator on $\mathfrak{X}(M)$ and show that the extended connection is flat if the original connection on $M$ is flat.

Let $M$ be Riemannian and $\mu \in \Omega^k(M)$. Define $R_\mu$ by
$$R_\mu(v_1,\ldots,v_k) = \sum_{i=1}^n\sum_{j=1}^k (R(e_i,v_j)\mu)(v_1,\ldots,v_{j-1}, e_i, v_{j+1},\ldots,v_k).$$
Theorem 13.157 (Weitzenböck formulas). Let $M$ be Riemannian and $\mu \in \Omega^k(M)$. Then we have
$$\Delta\mu = -\operatorname{div}\nabla\mu + R_\mu,$$
$$\langle\Delta\mu|\mu\rangle = \tfrac{1}{2}\Delta\|\mu\|^2 + \|\nabla\mu\|^2 + \langle R_\mu|\mu\rangle,$$
where $\|\nabla\mu\|^2(p) := \sum_i \langle\nabla_{e_i}\mu|\nabla_{e_i}\mu\rangle$ for any orthonormal basis $(e_1,\ldots,e_n)$ for $T_pM$.
Proof. Using the formula of Theorem 12.56, we see that for $U, V_1,\ldots,V_k \in \mathfrak{X}(M)$, we have
$$(\nabla\mu - d\mu)(U,V_1,\ldots,V_k) = \nabla\mu(U,V_1,\ldots,V_k) - d\mu(U,V_1,\ldots,V_k) = \sum_{j=1}^k (\nabla_{V_j}\mu)(V_1,\ldots,V_{j-1},U,V_{j+1},\ldots,V_k).$$
With this in mind, we fix $p \in M$ and $v_1,\ldots,v_k \in T_pM$. We may choose $V_1,\ldots,V_k \in \mathfrak{X}(M)$ such that $V_i(p) = v_i$ and may assume that $(\nabla V_i)(p) = 0$. Now choose an adapted orthonormal frame field $E_1,\ldots,E_n$ centered at $p$ so that $(\nabla E_i)(p) = 0$ for all $i$ and $[E_i,E_j](p) = 0$ for all $i, j$. In the following calculation, several steps may appear at first to be wrong. However, if one begins to write out the missing terms, one sees that they vanish because of how the fields were chosen to behave at $p$. Using equation (13.11) we have
$$(\delta d\mu + \operatorname{div}\nabla\mu)(v_1,\ldots,v_k) = \sum_{i=1}^n (\nabla_{e_i}(\nabla\mu - d\mu))(e_i,v_1,\ldots,v_k) = \sum_{i=1}^n e_i\left[(\nabla\mu - d\mu)(E_i,V_1,\ldots,V_k)\right]$$
and using what we know about the fields at $p$, this is equal to
$$\begin{aligned}
&\sum_{i=1}^n e_i\left(\sum_{j=1}^k (\nabla_{V_j}\mu)(V_1,\ldots,V_{j-1},E_i,V_{j+1},\ldots,V_k)\right) \\
&= \sum_{i=1}^n\sum_{j=1}^k e_i\left((\nabla_{V_j}\mu)(V_1,\ldots,V_{j-1},E_i,V_{j+1},\ldots,V_k)\right) \\
&= \sum_{i=1}^n\sum_{j=1}^k (\nabla_{E_i}(\nabla_{V_j}\mu))(V_1,\ldots,V_{j-1},E_i,V_{j+1},\ldots,V_k)(p) \\
&= \sum_{i=1}^n\sum_{j=1}^k (\nabla_{e_i}\nabla_{V_j}\mu)(v_1,\ldots,v_{j-1},e_i,v_{j+1},\ldots,v_k).
\end{aligned}$$
We also have
$$d\delta\mu(v_1,\ldots,v_k) = \sum_{j=1}^k (-1)^{j+1}(\nabla_{v_j}\delta\mu)(v_1,\ldots,\widehat{v_j},\ldots,v_k) = -\sum_{j=1}^k\sum_{i=1}^n (\nabla_{v_j}\nabla_{E_i}\mu)(v_1,\ldots,v_{j-1},e_i,v_{j+1},\ldots,v_k).$$
Since $[E_i,V_j] = \nabla_{E_i}V_j - \nabla_{V_j}E_i = 0$ at $p$, we can add the above results to obtain $\Delta\mu + \operatorname{div}\nabla\mu = R_\mu$. For the second part we again take advantage of our arrangement $\nabla E_i = 0$, $\nabla V_j = 0$ at $p$; we have
$$(\operatorname{div}\nabla\mu)(v_1,\ldots,v_k) = \sum_{i=1}^n (\nabla_{e_i}\nabla\mu)(e_i,v_1,\ldots,v_k) = \sum_{i=1}^n e_i\left((\nabla_{E_i}\mu)(V_1,\ldots,V_k)\right) = \sum_{i=1}^n (\nabla_{e_i}\nabla_{E_i}\mu)(v_1,\ldots,v_k).$$
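From this we obtain
$$\langle -\operatorname{div}\nabla\mu|\mu\rangle(p) = -\sum_{i=1}^n \langle\nabla_{e_i}\nabla_{E_i}\mu|\mu\rangle(p) = -\sum_{i=1}^n e_i\langle\nabla_{E_i}\mu|\mu\rangle + \sum_{i=1}^n \langle\nabla_{e_i}\mu|\nabla_{e_i}\mu\rangle = \tfrac{1}{2}\Delta\|\mu\|^2(p) + \|\nabla\mu\|^2(p).$$
□

We give only one application (but see [Pe] or [Poor]).

Proposition 13.158. If $(M,g)$ is a flat connected compact Riemannian manifold (without boundary), then a form $\mu \in \Omega^k(M)$ is parallel if and only if it is harmonic.

Proof. We have observed the inclusion $\{\text{parallel } k\text{-forms}\} \subset \{\text{harmonic } k\text{-forms}\}$ (Problem 8). On the other hand, if $(M,g)$ is flat, then $R_\mu = 0$ and so by Theorem 13.157 we have $\langle\Delta\mu|\mu\rangle = \tfrac{1}{2}\Delta\|\mu\|^2 + \|\nabla\mu\|^2$. So if $\Delta\mu = 0$, we have
$$\int_M \|\nabla\mu\|^2\,dV = -\frac{1}{2}\int_M \Delta\|\mu\|^2\,dV = 0$$
by Stokes' theorem. Thus $\nabla\mu = 0$. □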
Corollary 13.159. Let $M$ be a connected compact $n$-manifold. If $M$ admits a flat Riemannian metric $g$, then $\dim H^k(M) \le \binom{n}{k}$.

Proof. Pick any $p \in M$. From the Hodge theorem, and the previous proposition, we have
$$\dim H^k(M) = \dim\{\text{harmonic } k\text{-forms}\} = \dim\{\text{parallel } k\text{-forms}\} \le \dim \wedge^k T_p^*M = \binom{n}{k}.$$
The inequality follows from Exercise 13.14. □
13.13. Structure of General Relativity
The reader is now in a position to appreciate the basic structure of Einstein's general theory of relativity. We can only say a few words about this wonderful part of physics. General Relativity is a theory of gravity based on the mathematics of semi-Riemannian geometry. In a nutshell, the theory models spacetime as a four dimensional Lorentz manifold $M^4$ that is usually assumed to be time oriented. The points of spacetime are idealized events. The motion of a test particle subject only to gravity is along a geodesic in spacetime and the metric is subject to a nonlinear tensor differential equation that involves the curvature and a physical tensor $T$ that describes the local flow of energy-momentum. We consider the following equations to be central:

(13.12) $\operatorname{Ric} - \tfrac{1}{2}Rg = 8\pi\kappa T$ (Einstein's equation),
(13.13) $\operatorname{div} T = 0$ (continuity),
(13.14) $\nabla_{\dot\alpha}\dot\alpha = 0$ (geodesic equation),
(13.15) $\nabla_{\dot\alpha}\nabla_{\dot\alpha}V + R(V,\dot\alpha)\dot\alpha = 0$ (Jacobi equation),

where in the geodesic equation and Jacobi equation, $\lambda \mapsto \alpha(\lambda)$ is a parametrized curve that represents the career of a test particle subject only to gravitation. In the Jacobi equation we are to imagine a smooth family of geodesic curves $t \mapsto h_s(t) = h(t,s)$ such that $h_0 = \alpha$. Then $V = \frac{\partial h}{\partial s}$ is the variation vector field. The first equation, Einstein's equation, is the centerpiece of Einstein's theory. On the right hand side of this equation we see a tensor $T$ called the stress-energy-momentum tensor; it represents the matter and energy that generate the gravitational field. The constant $\kappa$ is Newton's gravitational constant, which is also often denoted by the letter $G$. The left hand side is the Einstein curvature tensor and is built from the Riemann curvature tensor and so ultimately from the metric tensor. There, "Ric" is the Ricci curvature whose index form is written $R_{\mu\nu}$; $R$ is the scalar curvature defined by contraction $R := g^{\mu\nu}R_{\mu\nu}$, and $g$ is the metric tensor. Einstein's equation can be seen as an equation for the metric of spacetime; it shows how the distribution of matter and energy influences the metric and the resulting curvature of spacetime. We have already studied the last two equations, but we will say something below about their role in gravitational theory. We will base our explanations on the following two-pronged incantation:

Matter tells spacetime how to curve.
Spacetime tells matter how to move.
Newtonian gravity. To fully appreciate Einstein's theory of gravity one must compare it to Newton's theory. In Newton's theory, the equation of motion of a test particle moving in (flat Euclidean) space and subject to a gravitational field $\mathbf{g}$ is
$$m_I\frac{d^2\mathbf{x}(t)}{dt^2} = m_G\,\mathbf{g}(\mathbf{x}(t)). \tag{13.16}$$
Here $\mathbf{x}(t)$ is a vector-valued function of $t$ that gives the location of the test particle relative to a fixed inertial frame (which entails a choice of origin and
system of rectangular coordinates). The field $\mathbf{g}$ is a vector-valued function of position in space. Here $m_I$ is the inertial mass of the particle, which is the $m$ in Newton's $F = ma$. The constant $m_G$ is the gravitational mass, which plays the role of gravitational charge. As demonstrated by Galileo and later by Eötvös, we actually have equality $m_I = m_G$. This equality is in fact one of the key influences on Einstein's thinking, and it led him to assert that, for sufficiently small regions of spacetime, gravitational forces and inertial forces (as perceived in an accelerating frame) are indistinguishable. If $\rho$ represents the mass density in a region of space, then the gravitational potential $\phi$ produced by this matter is given by
$$\nabla^2\phi = 4\pi\kappa\rho.$$
The gravitational field is then $\mathbf{g} = -\operatorname{grad}\phi$. Thus we have the pair of equations
$$\nabla^2\phi = 4\pi\kappa\rho, \tag{13.17}$$
$$\frac{d^2\mathbf{x}}{dt^2} = -\operatorname{grad}\phi, \tag{13.18}$$
where $\nabla^2\phi = \operatorname{div}(\operatorname{grad}\phi)$ and where the right hand side is evaluated at $x(t)$. The first equation tells how matter creates the Newtonian gravity field, and the second describes how the field tells matter how to move. This is the Newtonian analogue of the two-pronged incantation above. In Newton's picture, gravity is unambiguously treated as a field created by mass that induces a force on test particles.

Free fall. Let $\alpha$ be a curve parametrized by proper time $\tau$ that represents the path of a test particle. In Einstein's theory, what corresponds to equation (13.18) is the geodesic equation $\nabla_{\dot\alpha}\dot\alpha = 0$. According to Einstein, if the particle is subject only to gravity, then $\alpha$ is a geodesic. If we choose a coordinate system $(x^0, x^1, x^2, x^3)$, then the geodesic equation gives four differential equations (the geodesic equations), which can be written as
$$\frac{d^2 x^\mu}{d\tau^2} = -\frac{1}{2} g^{\mu\delta}\left(\frac{\partial g_{\beta\delta}}{\partial x^\alpha} + \frac{\partial g_{\delta\alpha}}{\partial x^\beta} - \frac{\partial g_{\alpha\beta}}{\partial x^\delta}\right)\frac{dx^\alpha}{d\tau}\frac{dx^\beta}{d\tau}, \quad \mu = 0, 1, 2, 3, \tag{13.19}$$
where we have written out the formula for $\Gamma^\mu_{\alpha\beta}$ explicitly. (We use the common convention that the Greek indices run over 0, 1, 2, 3, while the Latin indices run over 1, 2, 3.) The above equations have a similarity to (13.18) when the latter are written in the form
$$\frac{d^2 x^i}{dt^2} = -\frac{\partial\phi}{\partial x^i}, \quad i = 1, 2, 3.$$
From this point of view, (13.19) looks like a force law and the metric components $g_{\alpha\beta}$ play a role analogous to the potential $\phi$ in the Newtonian theory. From the point of view of the chosen coordinate system, these equations appear to tell the particle how to accelerate with respect to the coordinate system. However, if we choose normal coordinates at an event $p \in M^4$, then at $p$ the left hand side of (13.19) is zero! In fact, from the intrinsic point of view, the law is simply that the career of the particle in spacetime is a geodesic and therefore represents a state of zero intrinsic acceleration. Rephrasing the second prong of our incantation, we say that spacetime tells free test particles how to curve or accelerate. Namely, not at all. This is a wonderfully simple geometric law of motion. Freely falling bodies are described by geodesics in spacetime.

Tidal forces. Imagine yourself in free fall in a uniform gravitational field. Imagine that you are surrounded by a spherical array of apples in free fall which are initially stationary with respect to you. All the apples appear motionless against the starry background of space. If the field were truly uniform, the spherical swarm would remain spherical. However, this situation corresponds to no spacetime curvature, and so from the intrinsic point of view there is no gravitational field at all. A realistic gravitational field such as that produced by the Earth is not uniform, and our sphere of apples would deform, becoming elongated along the line that passes through the center of mass of the gravitating body (the Earth, say) and the center of the array. If you were in free fall at the center of the spherical array with your feet toward the Earth, then apples that are roughly in a plane perpendicular to the axis of your body would be seen to accelerate towards you, while those below your feet and above your head would be seen to recede.
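The stretching and squeezing just described can be illustrated in the Newtonian picture. The following is a minimal sketch, assuming the point-mass potential $\phi = -\kappa M / r$ (an assumption made for illustration, as are the function name `tidal_tensor` and the parameter `kM`). It evaluates the tidal tensor $-\partial_i\partial_j\phi$, whose action on a small separation vector gives the relative acceleration of neighboring freely falling particles.

```python
import math

def tidal_tensor(x, y, z, kM=1.0):
    """Return the matrix -d_i d_j phi for phi = -kM / r (hypothetical helper).

    For this potential, d_i d_j phi = kM * (delta_ij / r^3 - 3 x_i x_j / r^5),
    so the tidal tensor is kM * (3 x_i x_j / r^5 - delta_ij / r^3).
    """
    pos = (x, y, z)
    r = math.sqrt(x * x + y * y + z * z)
    return [[kM * (3.0 * pos[i] * pos[j] / r**5 - (1.0 if i == j else 0.0) / r**3)
             for j in range(3)]
            for i in range(3)]

# Observer on the z-axis at distance 2 from the gravitating body.
T = tidal_tensor(0.0, 0.0, 2.0)
radial = T[2][2]        # positive: apples above and below recede
transverse = T[0][0]    # negative: apples to the side approach
trace = T[0][0] + T[1][1] + T[2][2]
print(radial, transverse, trace)
```

In this sketch the radial entry is positive and the two transverse entries are negative, in agreement with the description above. The trace vanishes because $\phi$ is harmonic away from the mass.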
Each apple follows a geodesic in spacetime (not in space), and so we have a family of geodesics. This situation, properly idealized, is described by the Jacobi equation (13.15). The vector field $V$ can be thought of as describing the separation of nearby geodesics, and the Jacobi equation describes the relative acceleration of nearby geodesics. Curvature is the "force" behind this relative acceleration. It is this relative acceleration, positive in some directions and negative in others, that is responsible for the distortion of our initially spherical array of freely falling apples. If we neglect the attraction that the apples have for each other, then the volume of the array remains constant. This is because we are in a region where the Einstein tensor (and hence the Ricci tensor) vanishes.

Energy-momentum tensor. Fields and particles carry 4-momentum. Let $\alpha$ be a unit speed timelike curve giving the career of a particle of (rest) mass $m$. The 4-momentum of the particle is $p = m\dot\alpha$. In a flat spacetime we may choose Lorentz coordinates $(x^0, x^1, x^2, x^3)$ so that the corresponding coordinate frame field $(\partial_\mu)$ is oriented orthonormal with $\partial_0$ timelike. We choose units so that the speed of light is unity, $c = 1$, so that $x^0 = t$ is coordinate time. Let $\vec v := \left(\frac{dx^1}{dt}, \frac{dx^2}{dt}, \frac{dx^3}{dt}\right)$ be the "ordinary" spatial velocity or "3-velocity" as viewed in this Lorentz frame and let $v := |\vec v| = \left[\sum_{i=1}^{3}(dx^i/dt)^2\right]^{1/2}$. Then the 4-momentum has components $(E, \gamma\vec p)$, where $E$ is the relativistic energy of the particle, $\gamma = (1 - v^2)^{-1/2}$, and we shall refer to $\gamma\vec p$ as the relativistic 3-momentum. Notice that if we change to a different Lorentz coordinate system, then we will have a new energy and 3-momentum, but the geometric 4-momentum vector $m\dot\alpha$ is an "invariant" notion defined without reference to a specific coordinate frame. Thus 4-momentum unifies the notions of momentum and energy. Furthermore, this relativistic energy is really mass-energy. Indeed, if we choose a frame in which the particle is (momentarily) at rest, then the 4-momentum has components $(m, 0)$.

Now when we consider a region in spacetime filled with particles and fields, it is appropriate to go to the continuum approximation. The right hand side of Einstein's equation features the $(0, 2)$-tensor $T$ that keeps track of the flow of 4-momentum produced by all the matter and non-gravitational energy. It will pay to first think about charge and then consider the meaning of $T$ in the setting of special relativity. In this setting we are assuming that $T$ does not produce curvature (contrary to fact). In standard Lorentz coordinates $(x^0, x^1, x^2, x^3)$, the tensor $T$ has 16 components $T_{\mu\nu}$. Let us do a type change, $T^\mu_\nu := g^{\mu\alpha}T_{\alpha\nu}$. Then $(T^0_0, T^0_1, T^0_2, T^0_3)$ represents the density of 4-momentum and $(T^i_0, T^i_1, T^i_2, T^i_3)$ represents the flux of 4-momentum in the spatial direction $i$ (so $i = 1, 2$, or $3$). By this we mean that the result of a flux integral should be a 4-vector quantity, while the flux of a vector field in the usual calculus sense is a scalar (such as charge or mass per unit time).
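The frame dependence of energy and 3-momentum, and the invariance of the 4-momentum itself, can be checked numerically. The sketch below is illustrative only. It assumes the standard expressions $E = \gamma m$ and relativistic 3-momentum $\gamma m \vec v$ (with $c = 1$) and the signature $(-,+,+,+)$. The helper names `four_momentum`, `minkowski_norm2`, and `boost_x` are hypothetical.

```python
import math

def four_momentum(m, v):
    """Components (E, p1, p2, p3) for rest mass m and 3-velocity v, with c = 1.

    Assumes E = gamma * m and 3-momentum gamma * m * v.
    """
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in v))
    return (gamma * m,) + tuple(gamma * m * c for c in v)

def minkowski_norm2(p):
    """The scalar product <p, p> in signature (-, +, +, +)."""
    return -p[0] ** 2 + p[1] ** 2 + p[2] ** 2 + p[3] ** 2

def boost_x(p, u):
    """Components of p in a Lorentz frame moving with speed u along x^1."""
    g = 1.0 / math.sqrt(1.0 - u * u)
    return (g * (p[0] - u * p[1]), g * (p[1] - u * p[0]), p[2], p[3])

p = four_momentum(2.0, (0.6, 0.0, 0.0))
p_rest = boost_x(p, 0.6)   # frame comoving with the particle
print(p, p_rest, minkowski_norm2(p), minkowski_norm2(p_rest))
```

Under these assumptions the components change from one frame to the other, while $\langle p, p\rangle = -m^2$ in both, and in the comoving frame the components reduce to $(m, 0)$.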
In electrodynamics we describe the flow of a charge by a time dependent vector field $\mathbf{J}$, and if $\rho$ is the charge density, then local conservation of charge is given by the continuity equation
$$\frac{\partial\rho}{\partial t} = -\operatorname{div}\mathbf{J}.$$
We can combine $\rho$ and $\mathbf{J}$ into a unified notion of 4-current $J$ that has components in a Lorentz frame given by $(J^0, \ldots, J^3) = (\rho, \mathbf{J})$, and the corresponding covector field $\mathcal{J}$ has components $(J_0, \ldots, J_3) = (-\rho, \mathbf{J})$. The continuity equation can then be written as
$$d{*}\mathcal{J} = 0,$$
and follows from Maxwell's equations. The corresponding integral version of conservation of charge is simply
$$\int_{\partial R} {*}\mathcal{J} = 0,$$
where $\partial R$ is the boundary of a region of spacetime $R$. For a general volume $\Omega$ that does not necessarily bound an open region of spacetime, the integral $\int_\Omega {*}\mathcal{J}$ gives the total charge "crossing" $\Omega$. It is the fact that charge is a scalar quantity that allows us to do a continuous sum over $\Omega$.
In the Newtonian theory, mass is the analogue of charge and we have a similar continuity equation. However, in relativity, rest mass is not a conserved quantity and energy is not a scalar. Somehow charge is to be replaced by energy-momentum as we cross the bridge of analogy to the land of gravity. If spacetime is Minkowski space, then there is a quantity we can integrate to get a total energy-momentum. The integral implies a sum, and we can add tensors located at different points if we take advantage of distant parallelism (see [M-T-W]). If we express $T$ in a Lorentz frame, then we can integrate to get a total energy-momentum $P_{\mathrm{tot}}$ crossing a "3-volume" $V$. The covariant components of $P_{\mathrm{tot}}$ in the Lorentz frame are given by
$$P^{\mathrm{tot}}_\mu = \int_V T^\nu_\mu \, d\Sigma_\nu,$$
where $d\Sigma_\nu = {*}dx^\nu = \frac{1}{3!}\epsilon_{\nu\alpha\beta\gamma}\, dx^\alpha \wedge dx^\beta \wedge dx^\gamma$. Then, the conservation law says that $\int_V T^\nu_\mu \, d\Sigma_\nu = 0$ if $V = \partial\Omega$ is the boundary of a spacetime 4-volume $\Omega$. The differential form of the conservation law is $\operatorname{div} T = 0$, which in a Lorentz frame is just $\partial_\mu T^\mu_\nu = 0$.
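Now, for general relativity we retain the local version of conservation by assuming that $\operatorname{div} T = 0$, which is now defined in terms of the covariant derivative. In general coordinates we have
$$\nabla_\mu T^\mu_\nu = 0.$$
However, we must give up the integral version, although it would still hold approximately for sufficiently small regions of spacetime since the latter would be approximately flat. We shall find that this continuity equation is automatically satisfied if Einstein's equation holds since it turns out that the divergence of the left hand side is zero for purely geometric reasons!

The Einstein tensor. The left hand side of Einstein's equation features the tensor $\operatorname{Ric} - \frac{1}{2}Rg$, given in index notation as $R_{\mu\nu} - \frac{1}{2}Rg_{\mu\nu}$. Here, $R_{\mu\nu} = R^\alpha{}_{\mu\alpha\nu}$ and $R = R^\mu{}_\mu$ (sum). This tensor is denoted by the letter $G$ and is called the Einstein tensor. Let us show that $\operatorname{div} G = 0$. This fact shows that the conservation law is forced by the geometry (assuming that Einstein's equation holds). In this sense we may take this to be another manifestation of the second prong of our incantation in that the geometry tells matter to behave in accordance with local conservation. We have not seen many examples of tensor calculations using index notation, so we take this opportunity to do a calculation. In index notation, what we wish to show is that $\nabla_\mu G^{\mu\nu} = \nabla^\mu G_{\mu\nu} = 0$. We start with the Bianchi identity, make a switch in the first two indices of the first term and then raise indices and contract:
$$\begin{aligned}
0 &= \nabla_\mu R_{\alpha\beta\gamma\delta} + \nabla_\alpha R_{\beta\mu\gamma\delta} + \nabla_\beta R_{\mu\alpha\gamma\delta},\\
0 &= -\nabla_\mu R_{\beta\alpha\gamma\delta} + \nabla_\alpha R_{\beta\mu\gamma\delta} + \nabla_\beta R_{\mu\alpha\gamma\delta},\\
0 &= -\nabla^\gamma R^\delta{}_{\alpha\gamma\delta} + \nabla_\alpha R^{\delta\gamma}{}_{\gamma\delta} + \nabla^\delta R^\gamma{}_{\alpha\gamma\delta},\\
0 &= -\nabla^\gamma R^\delta{}_{\alpha\gamma\delta} - \nabla_\alpha R^{\gamma\delta}{}_{\gamma\delta} + \nabla^\delta R^\gamma{}_{\alpha\gamma\delta},\\
0 &= \nabla^\gamma R_{\alpha\gamma} - \nabla_\alpha R + \nabla^\delta R_{\alpha\delta}.
\end{aligned}$$
Since the first and last terms of the final line agree after relabeling the summation index, this last equation gives $\nabla^\gamma R_{\alpha\gamma} - \frac{1}{2}\nabla_\alpha R = 0$. But since the metric is parallel, $\nabla g = 0$, we have
$$\begin{aligned}
\nabla^\mu G_{\mu\nu} &= \nabla^\mu\left(R_{\mu\nu} - \tfrac{1}{2}Rg_{\mu\nu}\right) = \nabla^\mu R_{\mu\nu} - \tfrac{1}{2}\nabla^\mu(Rg_{\mu\nu})\\
&= \nabla^\mu R_{\mu\nu} - \tfrac{1}{2}g^{\mu\alpha}\nabla_\alpha(Rg_{\mu\nu}) = \nabla^\mu R_{\mu\nu} - \tfrac{1}{2}g^{\mu\alpha}g_{\mu\nu}\nabla_\alpha R\\
&= \nabla^\mu R_{\mu\nu} - \tfrac{1}{2}\delta^\alpha_\nu\nabla_\alpha R = \nabla^\gamma R_{\nu\gamma} - \tfrac{1}{2}\nabla_\nu R = 0,
\end{aligned}$$
where the last step uses the symmetry of the Ricci tensor.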
The Schwarzschild metric. If we contract both sides of Einstein's equation, we obtain $R = -8\pi\kappa T^\mu_\mu$. Plugging this back into Einstein's equation and rearranging we obtain an equivalent form of Einstein's equation,
$$R_{\mu\nu} = 8\pi\kappa\left(T_{\mu\nu} - \tfrac{1}{2}T^\alpha_\alpha\, g_{\mu\nu}\right).$$
Thus in case the tensor $T$ vanishes in the region of interest, we obtain the vacuum field equation
$$R_{\mu\nu} = 0.$$
There is a famous metric defined in terms of spherical coordinates $(t, r, \theta, \phi)$, given by
$$ds^2 = -\left(1 - \frac{2\kappa M}{r}\right)dt^2 + \left(1 - \frac{2\kappa M}{r}\right)^{-1}dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\phi^2),$$
where $\theta$ is the polar angle, $0 \le \theta \le \pi$. Notice that this expression for our metric is undefined at both $r = 0$ and $r = 2\kappa M$. But this is only the coordinate expression for a metric that intrinsically may be quite nice at $r = 2\kappa M$ and/or $r = 0$. It can be shown that there is a metric perfectly well-defined on $r = 2\kappa M$ whose expression in $(t, r, \theta, \phi)$ just happens to be the above away from $r = 2\kappa M$ and $r = 0$. The fact that the above expression blows up as we approach $r = 2\kappa M$ is a failure of the coordinates and not a feature of the intrinsic metric. On the other hand, the scalar curvature vanishes identically here because the metric is Ricci flat, but if one calculates a curvature invariant such as $R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta}$, then it can be seen to blow up as $r \to 0$, which makes $r = 0$ a true singularity. This metric is called the Schwarzschild metric and describes a spherically symmetric solution of Einstein's equation that is taken to be due to a spherical distribution of matter concentrated near $r = 0$ (such as a star). We should mention that this metric only satisfies Einstein's equation in the vacuum away from the star. In most cases the radius of the star is larger than $2\kappa M$, and then the Schwarzschild metric is not a solution inside the star anyway. If the radius of the star is less than $2\kappa M$, then we have a nonrotating black hole, and $r = 2\kappa M$ defines the famous event horizon.
Problems

(1) Show that in a Riemannian manifold, a length minimizing piecewise smooth curve must be a smooth geodesic. [Hint: Each potential kink point has a totally star-shaped neighborhood. Use Proposition 13.85.]

(2) A set $U$ in a Riemannian manifold $M$ is said to be geodesically convex if for each pair of points $p, q \in U$, there is a unique length minimizing geodesic segment connecting them, and this unique geodesic segment lies completely in $U$. (a) Show that the intersection of geodesically convex sets is geodesically convex. (b) Show that given $p \in M$ there is a $\delta > 0$ such that $\exp(\{v_p \in T_pM : \|v_p\| < \varepsilon\})$ is geodesically convex for all $\varepsilon < \delta$.

(3) Referring to the discussion leading up to Proposition 13.135, let $Y \in T_\gamma(\Omega)$ be a piecewise smooth variation vector field along $\gamma$ and write $Y = \ldots$

(4) A smooth map $f : (M, g) \to (N, h)$ of semi-Riemannian manifolds is called a homothety if $f^*h = cg$ for some constant $c \neq 0$. The case of $c = -1$ is called an anti-isometry. Show that an anti-isometry preserves covariant derivatives and geodesics.

(5) Show that the mapping $\sigma : \mathbb{R}^{n+1}_\nu \to \mathbb{R}^{n+1}_{n-\nu+1}$ given by
$$\sigma(a_1, \ldots, a_{n+1}) := (a_{\nu+1}, \ldots, a_{n+1}, a_1, \ldots, a_\nu)$$
is an anti-isometry (see above) and that its restriction to $S^n_\nu(r)$ is an anti-isometry from $S^n_\nu(r)$ onto $H^n_{n-\nu}(r)$.

(6) Show that timelike vectors $v$ and $w$ in a Lorentz space $V$ are in the same timecone if and only if $\langle v, w\rangle < 0$.

(7) Construct examples sufficient to make the point that time orientability (of Lorentz manifolds) and orientability are unrelated.

(8) Show that the restriction of div to $\Omega^k(M)$ is $-\delta$. Show that if $\mu \in \Omega^k(M)$ is parallel ($\nabla_X\mu = 0$ for all $X \in \mathfrak{X}(M)$), then it is harmonic.

(9) Let $H := \{(u, v) \in \mathbb{R}^2 : v > 0\}$ be the upper half-plane endowed with the metric
$$g := \frac{1}{v^2}(du \otimes du + dv \otimes dv).$$
Show that $H$ has constant curvature $K = -1$. Find which curves are geodesics.

(10) (Killing fields) On a semi-Riemannian manifold $(M, g)$, a vector field $X$ is called a Killing field if $L_Xg = 0$. Show that the local flows of a Killing field are isometries. Show that $X$ is a Killing field if and only if $X\langle V, W\rangle = \langle L_XV, W\rangle + \langle V, L_XW\rangle$ for all $V, W \in \mathfrak{X}(M)$. Show that $X$ is a Killing field if and only if $\langle\nabla_VX, W\rangle = -\langle\nabla_WX, V\rangle$ for all $V, W \in \mathfrak{X}(M)$.

(11) Show that if $\gamma$ is a geodesic in $M$ and $X$ is a Killing field (see the previous problem), then $X \circ \gamma$ is a Jacobi field along $\gamma$ and that $\langle X \circ \gamma, \dot\gamma\rangle$ is constant.

(12) Show that an $\mathbb{R}$-linear combination of Killing fields is a Killing field and that if $X$ and $Y$ are Killing fields, then $L_{[X,Y]} = [L_X, L_Y]$. Deduce that the space of Killing fields is a real Lie algebra. What are the Killing fields of $\mathbb{R}^3$?

(13) Let $(M, g)$ and $(N, h)$ be semi-Riemannian manifolds. Let $f$ be a positive smooth function on $M$. The warped product metric on $M \times N$ is defined by $g \times_f h := \mathrm{pr}_1^*g + (f \circ \mathrm{pr}_1)\,\mathrm{pr}_2^*h$, where $\mathrm{pr}_1 : M \times N \to M$ and $\mathrm{pr}_2 : M \times N \to N$ are the first and second factor projections. (a) Show that this is indeed a metric. (b) Show that for each $p \in M$, the map $\mathrm{pr}_2|_{p \times N}$ is a homothety. (c) Show that each $M \times q$ is normal to each $p \times N$. (d) Let $R^M$ be the curvature on $(M, g)$ and $R^{M \times N}$ be the curvature tensor on $(M \times N, g \times_f h)$. If $\tilde X$, $\tilde Y$, and $\tilde Z$ are lifts of $X, Y, Z \in \mathfrak{X}(M)$ as described in Problem 30 of Chapter 2, then what is the relationship between $R^{M \times N}_{\tilde X, \tilde Y}\tilde Z$ and $R^M_{X,Y}Z$?
Appendix A
The Language of Category Theory
Category theory provides a powerful means of organizing our thinking in mathematics. Some readers may be put off by the abstract nature of category theory. To such readers, I can only say that it is not really difficult to catch on to the spirit of category theory and the payoff in terms of organizing mathematical thinking is considerable. I encourage these readers to give it a chance. In any case, it is not strictly necessary for the reader to be completely at home with category theory before going further into the book. In particular, physics and engineering students may not be used to this kind of abstraction and should simply try to gradually become accustomed to the language. Feel free to defer reading this appendix on category theory until it seems necessary. Roughly speaking, category theory is an attempt at clarifying structural similarities that tie together different parts of mathematics. A category has "objects" and "morphisms". The prototypical category is just the category Set which has for its objects ordinary sets and for its morphisms maps between sets. The most important category for differential geometry is what is sometimes called the "smooth category" consisting of smooth manifolds and smooth maps. (The definition of these terms is given in the text proper, but roughly speaking, smooth means differentiable.) Now on to the formal definition of a category. Definition A.I. A category
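$\mathfrak{C}$ is a collection of objects $\operatorname{Ob}(\mathfrak{C}) = \{X, Y, Z, \ldots\}$ and, for every pair of objects $X, Y$, a set $\operatorname{Hom}_{\mathfrak{C}}(X, Y)$ called the set of morphisms from $X$ to $Y$. The family of all such morphisms will be denoted $\operatorname{Mor}(\mathfrak{C})$. In addition, a category is required to have a composition law, which is defined as a map $\circ : \operatorname{Hom}_{\mathfrak{C}}(X, Y) \times \operatorname{Hom}_{\mathfrak{C}}(Y, Z) \to \operatorname{Hom}_{\mathfrak{C}}(X, Z)$ for every three objects $X, Y, Z \in \operatorname{Ob}(\mathfrak{C})$, such that the following axioms hold:

(1) $\operatorname{Hom}_{\mathfrak{C}}(X, Y)$ and $\operatorname{Hom}_{\mathfrak{C}}(Z, W)$ are disjoint unless $X = Z$ and $Y = W$, in which case $\operatorname{Hom}_{\mathfrak{C}}(X, Y) = \operatorname{Hom}_{\mathfrak{C}}(Z, W)$.

(2) The composition law is associative: $f \circ (g \circ h) = (f \circ g) \circ h$.

(3) Each set of morphisms of the form $\operatorname{Hom}_{\mathfrak{C}}(X, X)$ must contain a necessarily unique element $\mathrm{id}_X$, the identity element, such that $f \circ \mathrm{id}_X = f$ for any $f \in \operatorname{Hom}_{\mathfrak{C}}(X, Y)$ (and any $Y$), and $\mathrm{id}_X \circ f = f$ for any $f \in \operatorname{Hom}_{\mathfrak{C}}(Y, X)$.

Notation A.2. A morphism is sometimes written using an arrow. For example, if $f \in \operatorname{Hom}_{\mathfrak{C}}(X, Y)$ we would indicate this by writing $f : X \to Y$ or by $X \xrightarrow{f} Y$.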
The notion of category is typified by the case where the objects are sets and the morphisms are maps between the sets. In fact, subject to putting extra structure on the sets and the maps, this will be almost the only type of category we shall need to talk about. On the other hand there are plenty of interesting categories of this type. Examples include the following.

(1) Grp: The objects are groups and the morphisms are group homomorphisms.

(2) Rng: The objects are rings and the morphisms are ring homomorphisms.

(3) $\mathrm{Lin}_{\mathbb{F}}$: The objects are vector spaces over the field $\mathbb{F}$ and the morphisms are linear maps. This category is referred to as the linear category or the vector space category (over the field $\mathbb{F}$).

(4) Top: The objects are topological spaces and the morphisms are continuous maps.

(5) $\mathrm{Man}^r$: The category of $C^r$ differentiable manifolds and $C^r$ maps. This is one of the main categories discussed in this book. It is also called the smooth or differentiable category, especially when $r = \infty$.
Notation A.3. If for some morphisms $f_i : X_i \to Y_i$ ($i = 1, 2$), $g_X : X_1 \to X_2$ and $g_Y : Y_1 \to Y_2$ we have $g_Y \circ f_1 = f_2 \circ g_X$, then we express this by saying that the following diagram "commutes":
$$\begin{array}{ccc}
X_1 & \xrightarrow{\;f_1\;} & Y_1\\
{\scriptstyle g_X}\downarrow & & \downarrow{\scriptstyle g_Y}\\
X_2 & \xrightarrow{\;f_2\;} & Y_2
\end{array}$$
Similarly, if $h \circ f = g$, we say that the diagram
$$\begin{array}{ccc}
X & \xrightarrow{\;f\;} & Y\\
 & {\scriptstyle g}\searrow & \downarrow{\scriptstyle h}\\
 & & Z
\end{array}$$
commutes. More generally, tracing out a path of arrows in a diagram corresponds to composition of morphisms, and to say that such a diagram commutes is to say that the compositions arising from two paths of arrows that begin and end at the same objects are equal.

Definition A.4. Suppose that $f : X \to Y$ is a morphism from some category $\mathfrak{C}$. If $f$ has the property that for any two (parallel) morphisms $g_1, g_2 : Z \to X$ we always have that $f \circ g_1 = f \circ g_2$ implies $g_1 = g_2$, i.e. if $f$ is "left cancellable", then we call $f$ a monomorphism. Similarly, if $f : X \to Y$ is "right cancellable", we call $f$ an epimorphism. A morphism that is both a monomorphism and an epimorphism is called an isomorphism. If the category needs to be specified, then we talk about a $\mathfrak{C}$-monomorphism, $\mathfrak{C}$-epimorphism and so on.
In some cases we will use other terminology. For example, an isomorphism in the smooth category is called a diffeomorphism. In the linear category, we speak of linear maps and linear isomorphisms. Morphisms which comprise $\operatorname{Hom}_{\mathfrak{C}}(X, X)$ are also called endomorphisms and so we also write $\operatorname{End}_{\mathfrak{C}}(X) := \operatorname{Hom}_{\mathfrak{C}}(X, X)$. The set of all isomorphisms in $\operatorname{Hom}_{\mathfrak{C}}(X, X)$ is sometimes denoted by $\operatorname{Aut}_{\mathfrak{C}}(X)$, and these morphisms are called automorphisms.

We single out the following construction. In many categories such as the above, we can form a new category that uses the notion of pointed space and pointed map. For example, we have the "pointed topological category". A pointed topological space is a topological space $X$ together with a distinguished point $p$. Thus a typical object in the pointed topological category would be written as $(X, p)$. A morphism $f : (X, p) \to (W, q)$ is a continuous map such that $f(p) = q$.

A functor $F$ is a pair of maps, both denoted by the same letter $F$, that map objects and morphisms from one category to those of another,
$$F : \operatorname{Ob}(\mathfrak{C}_1) \to \operatorname{Ob}(\mathfrak{C}_2), \qquad F : \operatorname{Mor}(\mathfrak{C}_1) \to \operatorname{Mor}(\mathfrak{C}_2),$$
so that composition and identity morphisms are respected. This means that for a morphism $f : X \to Y$, the morphism
$$F(f) : F(X) \to F(Y)$$
is a morphism in the second category and we must have

(1) $F(\mathrm{id}_{\mathfrak{C}_1}) = \mathrm{id}_{\mathfrak{C}_2}$.

(2) If $f : X \to Y$ and $g : Y \to Z$, then $F(f) : F(X) \to F(Y)$, $F(g) : F(Y) \to F(Z)$ and
$$F(g \circ f) = F(g) \circ F(f).$$
Example A.5. Let $\mathrm{Lin}_{\mathbb{R}}$ be the category whose objects are real vector spaces and whose morphisms are real linear maps. Similarly, let $\mathrm{Lin}_{\mathbb{C}}$ be the category of complex vector spaces with complex linear maps. To each real vector space $V$, we can associate the complex vector space $\mathbb{C} \otimes_{\mathbb{R}} V$, called the complexification of $V$, and to each linear map of real vector spaces $f : V \to W$ we associate the complex extension $f_{\mathbb{C}} : \mathbb{C} \otimes_{\mathbb{R}} V \to \mathbb{C} \otimes_{\mathbb{R}} W$. Here, $\mathbb{C} \otimes_{\mathbb{R}} V$ is easily thought of as the vector space $V$ where now complex scalars are allowed. Elements of $\mathbb{C} \otimes_{\mathbb{R}} V$ are generated by elements of the form $c \otimes v$, where $c \in \mathbb{C}$, $v \in V$ and we have $i(c \otimes v) = ic \otimes v$, where $i = \sqrt{-1}$. The map $f_{\mathbb{C}} : \mathbb{C} \otimes_{\mathbb{R}} V \to \mathbb{C} \otimes_{\mathbb{R}} W$ is defined by the requirement $f_{\mathbb{C}}(c \otimes v) = c \otimes f(v)$. Now the assignments
$$f \mapsto f_{\mathbb{C}}, \qquad V \mapsto \mathbb{C} \otimes_{\mathbb{R}} V$$
define a functor from $\mathrm{Lin}_{\mathbb{R}}$ to $\mathrm{Lin}_{\mathbb{C}}$. In practice, complexification amounts to simply allowing complex scalars. For instance, we might just write $cv$ instead of $c \otimes v$.

Actually, what we have defined here is a covariant functor. A contravariant functor is defined similarly except that the order of composition is reversed so that instead of (2) above we would have $F(g \circ f) = F(f) \circ F(g)$. An example of a contravariant functor is the dual vector space functor, which is a functor from the category of vector spaces $\mathrm{Lin}_{\mathbb{R}}$ to itself that sends each space $V$ to its dual $V^*$ and each linear map to its dual (or transpose). Under this functor a morphism $V \xrightarrow{A} W$ is sent to the morphism
$$V^* \xleftarrow{A^*} W^*.$$
Notice the arrow reversal. One of the most important functors for our purposes is the tangent functor defined in Chapter 2. Roughly speaking this functor replaces differentiable maps and spaces by their linear parts.

Example A.6. Consider the category of real vector spaces and linear maps. To every vector space $V$, we can associate the dual of the dual $V^{**}$. This is a covariant functor which is the composition of the dual functor with itself:
$$\left(V \xrightarrow{A} W\right) \;\mapsto\; \left(V^* \xleftarrow{A^*} W^*\right) \;\mapsto\; \left(V^{**} \xrightarrow{A^{**}} W^{**}\right).$$
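The arrow reversal of the dual functor, and its disappearance in the double dual, can be seen concretely with matrices. The sketch below assumes finite-dimensional spaces with fixed bases, so that a linear map is a matrix and its dual is represented by the transpose with respect to the dual bases. The helpers `matmul` and `transpose` are hypothetical names written for this illustration.

```python
def matmul(A, B):
    """Matrix product A B, representing the composition A after B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    """Matrix of the dual map with respect to dual bases (an assumption of this sketch)."""
    return [list(row) for row in zip(*A)]

f = [[1, 2], [3, 4], [5, 6]]   # f : R^2 -> R^3
g = [[1, 0, 2], [0, 1, 3]]     # g : R^3 -> R^2

# Contravariance: (g o f)* = f* o g*
lhs = transpose(matmul(g, f))
rhs = matmul(transpose(f), transpose(g))
print(lhs == rhs)

# Applying the dual twice restores the original order of composition.
double_dual = transpose(transpose(matmul(g, f)))
print(double_dual == matmul(transpose(transpose(g)), transpose(transpose(f))))
```

Both comparisons hold, reflecting $F(g \circ f) = F(f) \circ F(g)$ for the dual functor and $F(g \circ f) = F(g) \circ F(f)$ for its composition with itself.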
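Now suppose we have two functors,
$$F_1 : \operatorname{Ob}(\mathfrak{C}_1) \to \operatorname{Ob}(\mathfrak{C}_2), \qquad F_1 : \operatorname{Mor}(\mathfrak{C}_1) \to \operatorname{Mor}(\mathfrak{C}_2)$$
and
$$F_2 : \operatorname{Ob}(\mathfrak{C}_1) \to \operatorname{Ob}(\mathfrak{C}_2), \qquad F_2 : \operatorname{Mor}(\mathfrak{C}_1) \to \operatorname{Mor}(\mathfrak{C}_2).$$
A natural transformation $\tau$ from $F_1$ to $F_2$ is given by assigning to each object $X$ of $\mathfrak{C}_1$ a morphism $\tau(X) : F_1(X) \to F_2(X)$ such that for every morphism $f : X \to Y$ of $\mathfrak{C}_1$, the following diagram commutes:
$$\begin{array}{ccc}
F_1(X) & \xrightarrow{\;\tau(X)\;} & F_2(X)\\
{\scriptstyle F_1(f)}\downarrow & & \downarrow{\scriptstyle F_2(f)}\\
F_1(Y) & \xrightarrow{\;\tau(Y)\;} & F_2(Y)
\end{array}$$
A common first example is the natural transformation $\iota$ between the identity functor $I : \mathrm{Lin}_{\mathbb{R}} \to \mathrm{Lin}_{\mathbb{R}}$ and the double dual functor ${**} : \mathrm{Lin}_{\mathbb{R}} \to \mathrm{Lin}_{\mathbb{R}}$:
$$\begin{array}{ccc}
V & \xrightarrow{\;\iota(V)\;} & V^{**}\\
{\scriptstyle f}\downarrow & & \downarrow{\scriptstyle f^{**}}\\
W & \xrightarrow{\;\iota(W)\;} & W^{**}
\end{array}$$
The map $V \to V^{**}$ sends a vector $v$ to the linear function $\tilde v : V^* \to \mathbb{R}$ defined by $\tilde v(\alpha) := \alpha(v)$ (the hunted becomes the hunter, so to speak). If there is an inverse natural transformation $\tau^{-1}$ in the obvious sense, then we say that $\tau$ is a natural isomorphism, and for any object $X \in \mathfrak{C}_1$ we say that $F_1(X)$ is naturally isomorphic to $F_2(X)$. When the spaces involved are finite-dimensional, the natural transformation just defined is easily checked to have an inverse, so in that setting it is a natural isomorphism. The point here is not just that $V$ is isomorphic to $V^{**}$ in the category $\mathrm{Lin}_{\mathbb{R}}$, but that the isomorphism exhibited is natural. It works for all the spaces $V$ in a uniform way that involves no special choices. This is to be contrasted with the fact that $V$ is isomorphic to $V^*$, where the construction of such an isomorphism involves an arbitrary choice of a basis.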
Appendix B
Topology
B.1. The Shrinking Lemma We first state and prove a simple special case of the shrinking lemma since it makes clear the main idea at the root of the fancier versions.
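Lemma B.1. Let $X$ be a normal topological space and $\{U_1, U_2\}$ an open cover of $X$. There exists an open set $V$ with $\overline{V} \subset U_1$ such that $\{V, U_2\}$ is still a cover of $X$.

Proof. Since $U_1 \cup U_2 = X$, we have $(X \setminus U_1) \cap (X \setminus U_2) = \emptyset$. Using normality, we find disjoint open sets $O$ and $V$ such that $X \setminus U_1 \subset O$ and $X \setminus U_2 \subset V$. Then it follows that $X \setminus O \subset U_1$ and $X \setminus V \subset U_2$ and so $X = U_2 \cup V$. But $O \cap V = \emptyset$ so $V \subset X \setminus O$. Since $X \setminus O$ is closed, we conclude that $\overline{V} \subset X \setminus O \subset U_1$. $\square$

Proposition B.2. Let $X$ be a normal topological space and $\{U_1, U_2, \ldots, U_n\}$ a finite open cover of $X$. Then there exists an open cover $\{V_1, V_2, \ldots, V_n\}$ such that $\overline{V_i} \subset U_i$ for $i = 1, 2, \ldots, n$.

Proof. Simple induction using Lemma B.1. $\square$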
The proof we give of the shrinking lemma uses transfinite induction. A different proof may be found in the online supplement [Lee, Jeff]. It is a fact that any set $A$ can be well ordered, which means that we may impose a partial order $\prec$ on the set so that each nonempty subset $S \subset A$ has a least element. Every well-ordered set is in an order preserving isomorphism with an ordinal $\omega$ (which, by definition, is itself the set of ordinals strictly less than $\omega$). For purposes of transfinite induction, we may as well assume that the given indexing set is such an ordinal. Let $A$ be the ordinal which is the indexing set. Each $\alpha \in A$ has a unique successor, which is written as $\alpha + 1$. The successor of $\alpha + 1$ is written as $\alpha + 2$, and so on. If $\beta \in A$ has the form $\beta = \{\alpha, \alpha + 1, \alpha + 2, \ldots\}$, then we say that $\beta$ is a limit ordinal and of course $\alpha \prec \beta$ for all $\alpha \in \beta$. (This may seem confusing if one is not familiar with ordinals.) Suppose we have a statement $P(\alpha)$ for all $\alpha \in A$. Let $0$ denote the first element of $A$. The principle of transfinite induction on $A$ says that if $P(0)$ is true and if the truth of $P(\alpha)$ for all $\alpha \prec \beta$ can be shown to imply the truth of $P(\beta)$ for arbitrary $\beta$, then $P(\alpha)$ is in fact true for all $\alpha \in A$. In most cases, a transfinite induction proof has three steps:

(1) Zero case: Prove that $P(0)$ is true.

(2) Successor case: Prove that for any successor ordinal $\alpha + 1$, the assumption that $P(\delta)$ is true for all $\delta \prec \alpha + 1$ implies that $P(\alpha + 1)$ is true.

(3) Limit case: Prove that for any limit ordinal $\omega$, $P(\omega)$ follows from the assumption that $P(\alpha)$ is true for all $\alpha \prec \omega$.

Definition B.3. A cover $\{U_\alpha\}_{\alpha \in A}$ of a topological space $X$ is called point finite if for every $p \in X$ the family $A(p) = \{\alpha : p \in U_\alpha\}$ is finite.

Clearly, a locally finite cover is point finite.

Theorem B.4. Let $X$ be a normal topological space and $\{U_\alpha\}_{\alpha \in A}$ a point finite open cover of $X$. Then there exists an open cover $\{V_\alpha\}_{\alpha \in A}$ of $X$ such that $\overline{V_\alpha} \subset U_\alpha$ for all $\alpha \in A$.

Proof. Assume that $A$ is an ordinal. The goal is to construct a cover $\{V_\alpha\}_{\alpha \in A}$ of $X$ such that $\overline{V_\alpha} \subset U_\alpha$ for all $\alpha \in A$. We do transfinite induction on $A$. Let $P(\alpha)$ be the statement
(*) For all 0 -< ex there exists Vo with Vo C Uo such that {Vo }O-
F = X\ { (UO-
U
(Ua+1jOVO)}.
Clearly, F is closed and F C Ua. By normality there is a set Va with
F
C
Va
C
Va
C
Ua'
Then {VO}O-
there is an element x E X which is not in the union of this family of open sets. We know that there exists a finite collection of sets Uall ... , Uan' each containing x and such that no other Ua contains x. We have that ai --< w for each i = 1, ... , n, and since w is a limit ordinal, there exists an ordinal S such that ai --< S --< w for all i. Then the point x is in the union {Vo hr-
x
E
(Uo-<w Vo) U (Uw::soUo) ,
which contradicts our assumption that {Vo}o-<w U {Uo}w-
B.2. Locally Euclidean Spaces

If every point of a topological space $X$ has an open neighborhood that is homeomorphic to an open set in a Euclidean space, then we say that $X$ is locally Euclidean. A locally Euclidean space need not be Hausdorff. For example, if we take the spaces $\mathbb{R} \times \{0\}$ and $\mathbb{R} \times \{1\}$ and give them the relative topologies as subsets of $\mathbb{R} \times \mathbb{R}$, then they are both homeomorphic to $\mathbb{R}$. Now on the (disjoint) union $(\mathbb{R} \times \{0\}) \cup (\mathbb{R} \times \{1\})$ define an equivalence relation by requiring $(x, 0) \sim (x, 1)$ except when $x = 0$. The quotient topological space thus obtained is locally Euclidean, but not Hausdorff. Indeed, the two points $[(0, 0)]$ and $[(0, 1)]$ are distinct but cannot be separated. It is as if they both occupy the origin.

A refinement of a cover $\{U_\beta\}_{\beta \in B}$ of a topological space $X$ is another cover $\{V_i\}_{i \in I}$ such that every set from the second cover is contained in at least one set from the original cover. We say that a cover $\{V_i\}_{i \in I}$ of a topological space $X$ is a locally finite cover if every point of $X$ has a neighborhood that intersects only a finite number of sets from the cover. A topological space $X$ is called paracompact if every open cover of $X$ has a refinement which is a locally finite open cover.

Proposition B.5. If $X$ is a locally Euclidean Hausdorff space, then the following properties are equivalent:

(1) $X$ is paracompact.

(2) $X$ is metrizable.

(3) Each connected component of $X$ is second countable.

(4) Each connected component of $X$ is $\sigma$-compact.

(5) Each connected component of $X$ is separable.

For a proof see [Spv, volume I] and [Dug].
Appendix C
Some Calculus Theorems
For a review of multivariable calculus and the proofs of the theorems below, see the online supplement [Lee, Jeff].

Theorem C.1 (Inverse mapping theorem). Let $U$ be an open subset of $\mathbb{R}^n$ and let $f : U \to \mathbb{R}^n$ be a $C^r$ mapping for $1 \le r \le \infty$. Suppose that $x_0 \in U$ and that $Df(x_0) : \mathbb{R}^n \to \mathbb{R}^n$ is a linear isomorphism. Then there exists an open set $V \subset U$ with $x_0 \in V$ such that $f(V)$ is open and $f : V \to f(V) \subset \mathbb{R}^n$ is a $C^r$ diffeomorphism. Furthermore the derivative of $f^{-1}$ at $y$ is given by $Df^{-1}|_y = (Df|_{f^{-1}(y)})^{-1}$.

Theorem C.2 (Implicit mapping theorem). Let $O \subset \mathbb{R}^k \times \mathbb{R}^l$ be open. Let $f : O \to \mathbb{R}^m$ be a $C^r$ mapping such that $f(x_0, y_0) = 0$. If $D_2f(x_0, y_0) : \mathbb{R}^l \to \mathbb{R}^m$ is an isomorphism, then there exist open sets $U_1 \subset \mathbb{R}^k$ and $U_2 \subset \mathbb{R}^l$ such that $U_1 \times U_2 \subset O$ with $x_0 \in U_1$ and a $C^r$ mapping $g : U_1 \to U_2$ with $g(x_0) = y_0$ such that for all $(x, y) \in U_1 \times U_2$ we have
$$f(x, y) = 0 \text{ if and only if } y = g(x).$$
We may take $U_1$ to be connected.

The function $g$ in the theorem satisfies $f(x, g(x)) = 0$, which says that the graph of $g$ is contained in $(U_1 \times U_2) \cap f^{-1}(0)$, but the conclusion of the theorem is stronger since it says that in fact the graph of $g$ is exactly equal to $(U_1 \times U_2) \cap f^{-1}(0)$.
Corollary C.3. If $U$ is an open neighborhood of $0 \in \mathbb{R}^k$ and $f : U \subset \mathbb{R}^k \to \mathbb{R}^n$ is a smooth map with $f(0) = 0$ such that $Df(0)$ has rank $k$, then there is an open neighborhood $V$ of $0 \in \mathbb{R}^n$, an open neighborhood $W$ of $0 \in \mathbb{R}^n$, and a diffeomorphism $g : V \to W$ such that $g \circ f : f^{-1}(V) \to W$ is of the form $(a^1, \ldots, a^k) \mapsto (a^1, \ldots, a^k, 0, \ldots, 0)$.

Corollary C.4. If $U$ is an open neighborhood of $0 \in \mathbb{R}^n$ and $f : U \subset \mathbb{R}^k \times \mathbb{R}^{n-k} \to \mathbb{R}^k$ is a smooth map with $f(0) = 0$ and if the partial derivative $D_1f(0, 0)$ is a linear isomorphism, then there exists a diffeomorphism $h : V \subset \mathbb{R}^n \to U_1$, where $V$ is an open neighborhood of $0 \in \mathbb{R}^n$ and $U_1 \subset U$ is an open neighborhood of $0 \in \mathbb{R}^n$, such that the composite map $f \circ h$ is of the form
$$(a^1, \ldots, a^n) \mapsto (a^1, \ldots, a^k).$$

Theorem C.5 (The constant rank theorem). Let $f : (\mathbb{R}^n, p) \to (\mathbb{R}^m, q)$ be a local map such that $Df$ has constant rank $r$ in an open set containing $p$. Then there are local diffeomorphisms $g_1 : (\mathbb{R}^n, p) \to (\mathbb{R}^n, 0)$ and $g_2 : (\mathbb{R}^m, q) \to (\mathbb{R}^m, 0)$ such that $g_2 \circ f \circ g_1^{-1}$ has the form $(x^1, \ldots, x^n) \mapsto (x^1, \ldots, x^r, 0, \ldots, 0)$ on a sufficiently small neighborhood of $0$.

Theorem C.6 (Mean value). Let $U$ be an open subset of $\mathbb{R}^n$ and let $f : U \to \mathbb{R}^m$ be of class $C^1$. Suppose that for $x, z \in U$ the line segment $L$ given by $x + t(z - x)$ (for $0 \le t \le 1$) is contained in $U$. Then
$$\|f(z) - f(x)\| \le \|z - x\| \sup\{\|Df(y)\| : y \in L\},$$
where $\|Df(y)\| := \sup_{\|v\| = 1}\{\|Df(y) \cdot v\|\}$.
Appendix D
Modules and Multilinearity
A module is an algebraic object that shows up quite a bit in differential geometry and analysis (at least implicitly). A module is a generalization of a vector space where the field IF is replaced by a ring or an algebra over a field. For the definition of ring and field consult any book on abstract algebra. The definition of algebra is given below. The modules that occur in differential geometry are almost always finitely generated projective modules over the algebra of C r functions, and these correspond to the spaces of C r sections of vector bundles. We give the abstract definitions but we ask the reader to keep two cases in mind. The first is just the vector spaces which are the fibers of vector bundles. In this case, the ring in the definition below is the field IF (the real numbers lR or the complex numbers q, and the module is just a vector space. The second case, already mentioned, is where the ring is the algebra Cr(M) for a C r manifold and the module is the set of C r sections of a vector bundle over M. As we have indicated, a module is similar to a vector space with the differences stemming from the use of elements of a ring R as the scalars rather than the field of complex C or real numbers R For an element v of a module V, one still has Ov = 0, and if the ring is a ring with unity 1, then we usually require Iv = v. Of course, every vector space is also a module since the latter is a generalization of the notion of vector space. We also have maps between modules, the module homomorphisms (see Definition D.5 below), which make the class of modules and module homomorphisms into a category.
Definition D.1. Let $R$ be a ring. A left $R$-module (or a left module over $R$) is an abelian group $(V, +)$ together with an operation $R \times V \to V$ written as $(a, v) \mapsto av$ and such that
1) $(a + b)v = av + bv$ for all $a, b \in R$ and all $v \in V$;
2) $a(v_1 + v_2) = av_1 + av_2$ for all $a \in R$ and all $v_2, v_1 \in V$;
3) $(ab)v = a(bv)$ for all $a, b \in R$ and all $v \in V$.
A right $R$-module is defined similarly with the multiplication on the right so that
1) $v(a + b) = va + vb$ for all $a, b \in R$ and all $v \in V$;
2) $(v_1 + v_2)a = v_1 a + v_2 a$ for all $a \in R$ and all $v_2, v_1 \in V$;
3) $v(ab) = (va)b$ for all $a, b \in R$ and all $v \in V$.
If R has an identity and 1v = v (or v1 = v) for all v E V, then we say that V is a unitary R-module. If R has an identity, then by "R-module" we shall always mean a unitary R-module unless otherwise indicated. Also, if the ring is commutative (the usual case for us), then we may write av = va and consider any right module as a left module and vice versa. Even if the ring is not commutative, we will usually stick to left modules in this appendix and so we drop the reference to "left" and refer to such as R-modules. We do also use right modules in the text. For example, we consider right modules over the quaternions.
Remark D.2. We shall often refer to the elements of $R$ as scalars.

Example D.3. An abelian group $(A, +)$ is a $\mathbb{Z}$-module, and a $\mathbb{Z}$-module is none other than an abelian group. Here we take the product of $n \in \mathbb{Z}$ with $x \in A$ to be $nx := x + \cdots + x$ if $n \ge 0$ and $nx := -(x + \cdots + x)$ if $n < 0$ (in either case we are adding $|n|$ terms).

Example D.4. The set of all $m \times n$ matrices with entries that are elements of a commutative ring $R$ is an $R$-module under entrywise addition and scalar multiplication.

Definition D.5. Let $V_1$ and $V_2$ be modules over a ring $R$. A map $L : V_1 \to V_2$ is called a module homomorphism or a linear map if
$$L(av + bw) = aL(v) + bL(w)$$
for all $a, b \in R$ and all $v, w \in V_1$.
By analogy with the case of vector spaces we often characterize a module homomorphism $L$ by saying that $L$ is linear over $R$.

Example D.6. The set of all module homomorphisms of a module $V$ over a commutative ring $R$ to another $R$-module $M$ is also a (left) $R$-module in its own right and is denoted $\mathrm{Hom}_R(V, M)$ or $L_R(V, M)$ (we mainly use the latter). The scalar multiplication and addition in $L_R(V, M)$ are defined by
$$(f + g)(v) := f(v) + g(v) \quad \text{for } f, g \in L_R(V, M) \text{ and all } v \in V;$$
$$(af)(v) := af(v) \quad \text{for } a \in R.$$
Note that $((ab)f)(v) := (ab)f(v) = a(bf(v)) = a((bf)(v)) = (a(bf))(v)$. Also, in order to show that $af$ is linear we argue as follows: $(af)(cv) = af(cv) = acf(v) = caf(v) = c(af)(v)$. This argument fails if $R$ is not commutative! Indeed, if $R$ is not commutative, then $L_R(V, M)$ is not a module but rather only an abelian group.

Example D.7. Let $V$ be a vector space and $f : V \to V$ a linear operator. Using $f$, we may consider $V$ as a module over the ring of polynomials $\mathbb{R}[t]$ by defining the "scalar" multiplication by the rule $p(t)v := p(f)v$ for $p \in \mathbb{R}[t]$, $v \in V$. Here, if $p(t) = \sum a_n t^n$, then $p(f)$ is the linear map $\sum a_n f^n$.
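The module structure of Example D.7 can be tried out numerically. The following is only a minimal sketch under stated assumptions: we take $V = \mathbb{R}^2$, represent the operator $f$ by a matrix `F`, and store a polynomial as its list of coefficients $[a_0, a_1, \ldots]$; the helper name `poly_act` is ours and not part of the text.

```python
import numpy as np

def poly_act(coeffs, F, v):
    """Return p(F) v, where p(t) = sum_n coeffs[n] * t**n (Horner's scheme)."""
    result = np.zeros_like(v, dtype=float)
    for a in reversed(coeffs):
        result = F @ result + a * v
    return result

F = np.array([[0.0, 1.0],
              [0.0, 0.0]])      # a nilpotent operator on R^2 (illustrative choice)
v = np.array([0.0, 1.0])

p = [1.0, 2.0, 3.0]             # p(t) = 1 + 2t + 3t^2
q = [0.0, 1.0]                  # q(t) = t
pq = np.convolve(p, q)          # coefficients of the product polynomial p(t) q(t)

# Module axiom (pq) v = p (q v):
assert np.allclose(poly_act(pq, F, v), poly_act(p, F, poly_act(q, F, v)))
# The nonzero "scalar" t^2 kills every vector, since F @ F = 0:
assert np.allclose(poly_act([0.0, 0.0, 1.0], F, v), 0.0)
```

With this choice of `F` the polynomial $t^2$ acts as zero on all of $V$, so a nonzero scalar can annihilate a nonzero module element; this cannot happen in a vector space.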
Since the ring is usually fixed, we often omit mentioning the ring. In particular, we often abbreviate $L_R(V, W)$ to $L(V, W)$. Similar omissions will be made without further mention.

Remark D.8. If the modules are infinite-dimensional topological vector spaces such as Banach spaces, then we must distinguish between the bounded linear maps and simply linear maps. If $E$ and $F$ are infinite-dimensional Banach spaces, then $L(E; F)$ would normally denote bounded linear maps.

A submodule is defined in the obvious way as a subset $S \subset V$ that is closed under the operations inherited from $V$ so that $S$ itself is a module. The intersection of all submodules containing a subset $A \subset V$ is called the submodule generated by $A$ and is denoted $\langle A \rangle$. In this case, $A$ is called a generating set. If $\langle A \rangle = V$ for a finite set $A$, then we say that $V$ is finitely generated.

Let $S$ be a submodule of $V$ and consider the quotient abelian group $V/S$ consisting of cosets, that is, sets of the form $[v] := v + S = \{v + x : x \in S\}$ with addition given by $[v] + [w] = [v + w]$. We define scalar multiplication by elements of the ring $R$ by $a[v] := [av]$ for $a \in R$. In this way, $V/S$ is a module called a quotient module.

Many of the operations that exist for vector spaces have analogues in the module category. For example, if $V$ and $W$ are $R$-modules, then the set $V \times W$ can be made into an $R$-module by defining
$$(v_1, w_1) + (v_2, w_2) := (v_1 + v_2, w_1 + w_2) \quad \text{for } (v_1, w_1) \text{ and } (v_2, w_2) \text{ in } V \times W$$
and
$$a(v, w) := (av, aw) \quad \text{for } a \in R \text{ and } (v, w) \in V \times W.$$
This module is sometimes written as $V \oplus W$, especially when taken together with the injections $V \to V \times W$ and $W \to V \times W$ given by $v \mapsto (v, 0)$ and $w \mapsto (0, w)$ respectively. Also, for any module homomorphism $L : V_1 \to V_2$
we have the usual notions of kernel and image:
$$\operatorname{Ker} L = \{v \in V_1 : L(v) = 0\} \subset V_1,$$
$$\operatorname{Im}(L) = L(V_1) = \{w \in V_2 : w = Lv \text{ for some } v \in V_1\} \subset V_2.$$
These are submodules of $V_1$ and $V_2$ respectively.

On the other hand, modules are generally not as simple to study as vector spaces. For example, there are several notions of dimension. The following notions for a vector space all lead to the same notion of dimension. For a completely general module, these are all potentially different notions:
(1) The length $n$ of the longest chain of submodules $0 = V_n \subsetneq \cdots \subsetneq V_1 \subsetneq V$.
(2) The cardinality of the largest linearly independent set (see below).
(3) The cardinality of a basis (see below).
For simplicity, in our study of dimension, let us now assume that $R$ is commutative.
Definition D.9. A set of elements $\{e_1, \ldots, e_k\}$ of a module is said to be linearly dependent if there exist ring elements $r_1, \ldots, r_k \in R$, not all zero, such that $r_1 e_1 + \cdots + r_k e_k = 0$. Otherwise, they are said to be linearly independent. We also speak of the set $\{e_1, \ldots, e_k\}$ as being a linearly independent set.

So far so good, but it is important to realize that just because $e_1, \ldots, e_k$ are linearly dependent does not mean that we may write each of these $e_i$ as a linear combination of the others. It may even be that some single element $v$ forms a linearly dependent set since there may be a nonzero $r$ such that $rv = 0$ (such a $v$ is said to be a torsion element). If a linearly independent set $\{e_1, \ldots, e_k\}$ is maximal in size, then we say that the module has rank $k$. Another strange possibility is that a maximal linearly independent set may not be a generating set for the module and hence may not be a basis in the sense to be defined below. The point is that although for an arbitrary $w \in V$ we must have that $\{e_1, \ldots, e_k\} \cup \{w\}$ is linearly dependent and hence there must be a nontrivial expression $rw + r_1 e_1 + \cdots + r_k e_k = 0$, it does not follow that we may solve for $w$ since $r$ may not be an invertible element of the ring. In other words, it may not be a unit.
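Both phenomena just described already occur for $\mathbb{Z}$-modules. The following is a small sketch, assuming we model $\mathbb{Z}$ by Python integers and $\mathbb{Z}/6$ by residues modulo 6; the helper `order` is a name introduced here only for illustration.

```python
def order(v, n):
    """Smallest positive integer r with r*v = 0 in the Z-module Z/n."""
    return next(r for r in range(1, n + 1) if (r * v) % n == 0)

# In the Z-module Z, the set {2, 3} is linearly dependent ...
r1, r2 = 3, -2
assert (r1, r2) != (0, 0) and r1 * 2 + r2 * 3 == 0
# ... yet neither element is an integer multiple of the other.
assert 2 % 3 != 0 and 3 % 2 != 0

# In Z/6 every element is a torsion element: some nonzero integer kills it.
orders = {v: order(v, 6) for v in range(6)}
```

In particular $\mathbb{Z}/6$ contains no nonempty linearly independent set over $\mathbb{Z}$, so it has rank $0$ even though it is not the zero module.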
Definition D.10. If $B$ is a generating set for a module $V$ such that every element of $V$ has a unique expression as a finite $R$-linear combination of elements of $B$, then we say that $B$ is a basis for $V$.

Definition D.11. If an $R$-module has a basis, then it is referred to as a free module. If this basis is finite we indicate this by referring to the module as a finitely generated free module.

It turns out that just as for vector spaces the cardinality of a basis for a finitely generated free module $V$ is the same as that of every other basis for $V$. If a module over a (commutative) ring $R$ has a basis, then the number of elements in the basis is called the dimension and must in this case be the same as the rank (the size of a maximal linearly independent set). Thus a finitely generated free module is also called a finite-dimensional free module.

Exercise D.12. Show that every finitely generated $R$-module is the homomorphic image of a finitely generated free module.

If $R$ is a field, then every module is free and is a vector space by definition. In this case, the current definitions of dimension and basis coincide with the usual ones.
The ring $R$ is itself a free $R$-module with standard basis given by $\{1\}$. Also, $R^n := R \oplus \cdots \oplus R$ is a finitely generated free module with standard basis $\{e_1, \ldots, e_n\}$, where, as usual, $e_i := (0, \ldots, 1, \ldots, 0)$; the only nonzero entry is in the $i$-th position. Up to isomorphism, these account for all finitely generated free modules: If a module $V$ is free with basis $e_1, \ldots, e_n$, then we have an isomorphism $R^n \cong V$ given by
$$(r_1, \ldots, r_n) \mapsto r_1 e_1 + \cdots + r_n e_n.$$
Definition D.13. Let $V_i$, $i = 1, \ldots, k$, and $W$ be modules over a ring $R$. A map $\mu : V_1 \times \cdots \times V_k \to W$ is called multilinear ($k$-multilinear) if for each $i$, $1 \le i \le k$, and each fixed $(v_1, \ldots, \widehat{v_i}, \ldots, v_k) \in V_1 \times \cdots \times \widehat{V_i} \times \cdots \times V_k$ we have that the map
$$v \mapsto \mu(v_1, \ldots, v_{i-1}, \underbrace{v}_{i\text{-th}}, v_{i+1}, \ldots, v_k),$$
obtained by fixing all but the $i$-th variable, is a module homomorphism. In other words, we require that $\mu$ be $R$-linear in each slot separately. The set of all multilinear maps $V_1 \times \cdots \times V_k \to W$ is denoted $L_R(V_1, \ldots, V_k; W)$. If $V_1 = \cdots = V_k = V$, then we abbreviate this to $L_R^k(V; W)$. If $R$ is commutative, then the space of multilinear maps $L_R(V_1, \ldots, V_k; W)$ is itself an $R$-module in a fairly obvious way: If $a, b \in R$ and $\mu_1, \mu_2 \in L_R(V_1, \ldots, V_k; W)$, then $a\mu_1 + b\mu_2$ is defined pointwise in the usual way.
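To see what "linear in each slot separately" means in practice, here is a brief sketch using one concrete bilinear map, the $2 \times 2$ determinant on $\mathbb{Z}^2 \times \mathbb{Z}^2$. The choice of map and the helper names `mu`, `add` and `scale` are ours, made only for illustration.

```python
def mu(u, v):
    """A bilinear map Z^2 x Z^2 -> Z: the 2x2 determinant with columns u, v."""
    return u[0] * v[1] - u[1] * v[0]

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def scale(a, u):
    return (a * u[0], a * u[1])

u, u2, v = (1, 2), (3, -1), (4, 5)
a, b = 2, -3

# Linear in the first slot with the second slot held fixed:
assert mu(add(scale(a, u), scale(b, u2)), v) == a * mu(u, v) + b * mu(u2, v)
# Linear in the second slot with the first slot held fixed:
assert mu(v, add(scale(a, u), scale(b, u2))) == a * mu(v, u) + b * mu(v, u2)
# Not linear on the product module: scaling both slots by 2 scales mu by 4.
assert mu(scale(2, u), scale(2, v)) == 4 * mu(u, v)
```

The last assertion is the reason a bilinear map is not a module homomorphism on $V \times W$, and hence the reason a separate universal object (the tensor product, introduced below) is needed.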
Note: For the remainder of this chapter, all modules will be taken to be over a fixed commutative ring R.
Let us agree to use the following abbreviation: $V^k = V \times \cdots \times V$ ($k$-fold Cartesian product).
Definition D.14. The dual of an $R$-module $V$ is the module $V^* := L_R(V, R)$ of all $R$-linear functionals on $V$.

Any element $w \in V$ can be thought of as an element of $V^{**} := L_R(V^*, R)$ according to $w(\alpha) := \alpha(w)$ where $\alpha \in V^*$. This provides a map $V \to V^{**}$, and if this map is an isomorphism, then we say that $V$ is reflexive. If $V$ is reflexive, then we are free to identify $V$ with $V^{**}$.

Exercise D.15. Show that if $V$ is a finitely generated free module, then $V$ is reflexive.
For completeness, we include the definition of a projective module but what is important for us is that the finitely generated projective modules over $C^\infty(M)$ correspond to spaces of sections of smooth vector bundles. These modules are not necessarily free but are reflexive and have many other good properties such as being "locally free".

Definition D.16. A module $V$ is projective if, whenever $V$ is a quotient of a module $W$, there exists a module $U$ such that the direct sum $V \oplus U$ is isomorphic to $W$.
Given two modules $V$ and $W$ over some commutative ring $R$, consider the class $\mathcal{C}_{V \times W}$ consisting of all bilinear maps $V \times W \to X$ where $X$ varies over all $R$-modules, but $V$ and $W$ are fixed. We take members of $\mathcal{C}_{V \times W}$ as the objects of a category (see Appendix A). A morphism from, say, $\mu_1 : V \times W \to X_1$ to $\mu_2 : V \times W \to X_2$ is defined to be a homomorphism $\ell : X_1 \to X_2$ such that $\mu_2 = \ell \circ \mu_1$. There exists a module $T_{V,W}$ together with a bilinear map $\otimes : V \times W \to T_{V,W}$ that has the following universal property: For every bilinear map $\mu : V \times W \to X$, there is a unique linear map $\tilde{\mu} : T_{V,W} \to X$ such that $\mu = \tilde{\mu} \circ \otimes$. If a pair $(T_{V,W}, \otimes)$ with this property exists, then it is unique up to isomorphism in $\mathcal{C}_{V \times W}$. We refer to such a universal object as a tensor product of $V$ and $W$. We will indicate the construction of a specific tensor product that we denote by $V \otimes W$ with the corresponding map $\otimes : V \times W \to V \otimes W$. The idea is simple: We let $V \otimes W$ be the set of all linear combinations of symbols of the form $v \otimes w$ for $v \in V$ and $w \in W$, subject to the relations
$$(v_1 + v_2) \otimes w = v_1 \otimes w + v_2 \otimes w,$$
$$v \otimes (w_1 + w_2) = v \otimes w_1 + v \otimes w_2,$$
$$r(v \otimes w) = rv \otimes w = v \otimes rw,$$
for $r \in R$.
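The map $\otimes$ is then simply $\otimes : (v, w) \mapsto v \otimes w$. Let us generalize this idea to tensor products of several modules at a time, and also, let us be a bit more pedantic about the construction. We seek a universal object for the category of $k$-multilinear maps of the form $\mu : V_1 \times \cdots \times V_k \to W$ with $V_1, \ldots, V_k$ fixed.

Definition D.17. A module $T = T_{V_1, \ldots, V_k}$ together with a multilinear map $\otimes : V_1 \times \cdots \times V_k \to T$ is called universal for $k$-multilinear maps on $V_1 \times \cdots \times V_k$ if for every multilinear map $\mu : V_1 \times \cdots \times V_k \to W$ there is a unique linear map $\tilde{\mu} : T \to W$ such that the following diagram commutes:
$$\begin{array}{ccc} V_1 \times \cdots \times V_k & \xrightarrow{\ \mu\ } & W \\ {\scriptstyle \otimes} \downarrow & \nearrow {\scriptstyle \tilde{\mu}} & \\ T & & \end{array}$$
i.e. we must have $\mu = \tilde{\mu} \circ \otimes$. If such a universal object exists, it will be called a tensor product of $V_1, \ldots, V_k$, and the module itself $T = T_{V_1, \ldots, V_k}$ is also referred to as a tensor product of the modules $V_1, \ldots, V_k$.

The tensor product is again unique up to isomorphism:

Proposition D.18. If $(T_1, \otimes_1)$ and $(T_2, \otimes_2)$ are both universal for $k$-multilinear maps on $V_1 \times \cdots \times V_k$, then there is a unique isomorphism $\Phi : T_1 \to T_2$ such that $\Phi \circ \otimes_1 = \otimes_2$:
$$\begin{array}{ccc} & V_1 \times \cdots \times V_k & \\ {\scriptstyle \otimes_1} \swarrow & & \searrow {\scriptstyle \otimes_2} \\ T_1 & \xrightarrow{\ \Phi\ } & T_2 \end{array}$$

Proof. By the assumption of universality, there are maps $\Phi : T_1 \to T_2$ and $\bar{\Phi} : T_2 \to T_1$ such that $\Phi \circ \otimes_1 = \otimes_2$ and $\bar{\Phi} \circ \otimes_2 = \otimes_1$. Thus $\bar{\Phi} \circ \Phi \circ \otimes_1 = \otimes_1$, and by the uniqueness part of the definition of universality of $\otimes_1$ we must have $\bar{\Phi} \circ \Phi = \mathrm{id}$. The same argument applied to $\otimes_2$ gives $\Phi \circ \bar{\Phi} = \mathrm{id}$, so $\bar{\Phi} = \Phi^{-1}$. $\square$

The usual specific realization of the tensor product of modules $V_1, \ldots, V_k$ is, roughly, the set of all linear combinations of symbols of the form $v_1 \otimes \cdots \otimes v_k$ subject to the obvious multilinear relations:
$$v_1 \otimes \cdots \otimes a v_i \otimes \cdots \otimes v_k = a(v_1 \otimes \cdots \otimes v_i \otimes \cdots \otimes v_k)$$
and
$$v_1 \otimes \cdots \otimes (v_i + v_i') \otimes \cdots \otimes v_k = v_1 \otimes \cdots \otimes v_i \otimes \cdots \otimes v_k + v_1 \otimes \cdots \otimes v_i' \otimes \cdots \otimes v_k.$$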
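This space is denoted by $V_1 \otimes \cdots \otimes V_k$ or by $\bigotimes_{i=1}^k V_i$ and called the tensor product of $V_1, \ldots, V_k$. Also, we will use $V^{\otimes k}$ or $\otimes^k V$ to denote $V \otimes \cdots \otimes V$ ($k$-fold tensor product of $V$). The associated map $\otimes : V_1 \times \cdots \times V_k \to V_1 \otimes \cdots \otimes V_k$ is simply
$$\otimes : (v_1, \ldots, v_k) \mapsto v_1 \otimes \cdots \otimes v_k.$$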
A more pedantic description is as follows. We take $V_1 \otimes \cdots \otimes V_k := F(V_1 \times \cdots \times V_k)/U_0$, where $F(V_1 \times \cdots \times V_k)$ is the free module on the set $V_1 \times \cdots \times V_k$ and $U_0$ is the submodule generated by the set of all elements of the form
$$(v_1, \ldots, a v_i, \ldots, v_k) - a(v_1, \ldots, v_i, \ldots, v_k)$$
and
$$(v_1, \ldots, (v_i + v_i'), \ldots, v_k) - (v_1, \ldots, v_i, \ldots, v_k) - (v_1, \ldots, v_i', \ldots, v_k),$$
where $v_i, v_i' \in V_i$, and $a \in R$. Each element $(v_1, \ldots, v_k)$ of the set $V_1 \times \cdots \times V_k$ is naturally identified with a generator of the free module $F(V_1 \times \cdots \times V_k)$ and we have the obvious injection $V_1 \times \cdots \times V_k \hookrightarrow F(V_1 \times \cdots \times V_k)$. Its equivalence class is denoted $v_1 \otimes \cdots \otimes v_k$ and the map $\otimes$ is then the composition $V_1 \times \cdots \times V_k \hookrightarrow F(V_1 \times \cdots \times V_k) \to F(V_1 \times \cdots \times V_k)/U_0$.

Proposition D.19. $\otimes : V_1 \times \cdots \times V_k \to V_1 \otimes \cdots \otimes V_k$ is universal for multilinear maps on $V_1 \times \cdots \times V_k$.

Exercise D.20. Prove the above proposition.

Proposition D.21. If $f : V_1 \to W_1$ and $g : V_2 \to W_2$ are module homomorphisms, then there is a unique homomorphism $f \otimes g : V_1 \otimes V_2 \to W_1 \otimes W_2$, the tensor product, which has the characterizing properties that $f \otimes g$ is linear and that $(f \otimes g)(v_1 \otimes v_2) = (f v_1) \otimes (g v_2)$ for all $v_1 \in V_1$, $v_2 \in V_2$. Similarly, if $f_i : V_i \to W_i$, we may obtain $\bigotimes_i f_i : \bigotimes_{i=1}^k V_i \to \bigotimes_{i=1}^k W_i$.

Proof. Exercise. $\square$
Definition D.22. Elements of $\bigotimes_{i=1}^k V_i$ that may be written as $v_1 \otimes \cdots \otimes v_k$ for some $v_i$ are called simple or decomposable.

Remark D.23. It is clear from our specific realization of $\bigotimes_{i=1}^k V_i$ that elements in the image of $\otimes : V_1 \times \cdots \times V_k \to \bigotimes_{i=1}^k V_i$ span $\bigotimes_{i=1}^k V_i$. I.e., decomposable elements span the space.

Exercise D.24. Not all elements are decomposable but the decomposable elements generate $V_1 \otimes \cdots \otimes V_k$.

It may be that the $V_i$ are modules over more than one ring. For example, any complex vector space is a module over both $\mathbb{R}$ and $\mathbb{C}$. Also, the module of smooth vector fields $\mathfrak{X}_M(U)$ is a module over $C^\infty(U)$ and a module (actually a vector space) over $\mathbb{R}$. Thus it is sometimes important to indicate the ring involved, and so we write the tensor product of two $R$-modules $V$ and $W$ as $V \otimes_R W$. For instance, there is a big difference between $\mathfrak{X}_M(U) \otimes_{C^\infty(U)} \mathfrak{X}_M(U)$ and $\mathfrak{X}_M(U) \otimes_{\mathbb{R}} \mathfrak{X}_M(U)$.
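A much smaller instance of the same phenomenon may help fix ideas. The comparison below is our own illustration rather than part of the text; it takes $V = W = \mathbb{C}$ and uses only the relation $r(v \otimes w) = rv \otimes w = v \otimes rw$, the isomorphism $V \otimes R \cong V$, and the basis theorem for tensor products of free modules, both of which are stated later in this appendix.

```latex
% Over \mathbb{C}: every complex scalar moves across the tensor sign, so
i \otimes 1 = 1 \otimes i \quad \text{in } \mathbb{C} \otimes_{\mathbb{C}} \mathbb{C},
\qquad
\mathbb{C} \otimes_{\mathbb{C}} \mathbb{C} \cong \mathbb{C}
\quad (\dim_{\mathbb{R}} = 2).

% Over \mathbb{R}: only real scalars move across, and \{1, i\} is a real basis of \mathbb{C}, so
\{\, 1 \otimes 1,\ 1 \otimes i,\ i \otimes 1,\ i \otimes i \,\}
\text{ is a basis of } \mathbb{C} \otimes_{\mathbb{R}} \mathbb{C}
\quad (\dim_{\mathbb{R}} = 4).
```

So the same pair of abelian groups yields tensor products of different sizes depending on which ring is allowed to pass through the symbol $\otimes$.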
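Lemma D.25. There are the following natural isomorphisms:
(1) $(V \otimes W) \otimes U \cong V \otimes (W \otimes U) \cong V \otimes W \otimes U$, and under these isomorphisms, $(v \otimes w) \otimes u \leftrightarrow v \otimes (w \otimes u) \leftrightarrow v \otimes w \otimes u$.
(2) $V \otimes W \cong W \otimes V$, and under this isomorphism $v \otimes w \leftrightarrow w \otimes v$.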
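Proof. We prove (1) and leave (2) as an exercise. Elements of the form $(v \otimes w) \otimes u$ generate $(V \otimes W) \otimes U$, so any map that sends $(v \otimes w) \otimes u$ to $v \otimes (w \otimes u)$ for all $v, w, u$ must be unique. Now we have compositions
$$(V \times W) \times U \xrightarrow{\ \otimes \times \mathrm{id}_U\ } (V \otimes W) \times U \xrightarrow{\ \otimes\ } (V \otimes W) \otimes U$$
and
$$V \times (W \times U) \xrightarrow{\ \mathrm{id}_V \times \otimes\ } V \times (W \otimes U) \xrightarrow{\ \otimes\ } V \otimes (W \otimes U).$$
It is a simple matter to check that these composite maps have the same universal property as the map $V \times W \times U \to V \otimes W \otimes U$. The result now follows from the existence and essential uniqueness (Propositions D.19 and D.18). $\square$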
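We shall use the first isomorphism and the obvious generalizations to identify $V_1 \otimes \cdots \otimes V_k$ with all legal parenthetical constructions such as $(((V_1 \otimes V_2) \otimes \cdots \otimes V_j) \otimes \cdots) \otimes V_k$ and so forth. In short, we may construct $V_1 \otimes \cdots \otimes V_k$ by tensoring spaces two at a time. In particular, we assume the isomorphisms (as identifications)
$$(V_1 \otimes \cdots \otimes V_k) \otimes (W_1 \otimes \cdots \otimes W_k) \cong V_1 \otimes \cdots \otimes V_k \otimes W_1 \otimes \cdots \otimes W_k,$$
where $(v_1 \otimes \cdots \otimes v_k) \otimes (w_1 \otimes \cdots \otimes w_k)$ maps to $v_1 \otimes \cdots \otimes v_k \otimes w_1 \otimes \cdots \otimes w_k$.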
Proposition D.26. If $V$ is an $R$-module, then we have natural isomorphisms
$$V \otimes R \cong V \cong R \otimes V$$
given on decomposable elements as $v \otimes r \mapsto rv \mapsto r \otimes v$. (Recall that we are assuming that $R$ is commutative.)
The proof is left to the reader. The following proposition gives a basic and often used isomorphism.
Proposition D.27. For $R$-modules $W, V, U$, we have $L_R(W \otimes V, U) \cong L(W, V; U)$. More generally,
$$L_R(W_1 \otimes \cdots \otimes W_k, U) \cong L(W_1, \ldots, W_k; U).$$
Proof. This is more or less just a restatement of the universal property of $W \otimes V$. One should check that this association is indeed an isomorphism. $\square$

Exercise D.28. Show that if $W$ is free with basis $(f_1, \ldots, f_n)$, then $W^*$ is also free and has a dual basis $(f^1, \ldots, f^n)$, that is, $f^i(f_j) = \delta^i_j$.

Theorem D.29. If $V_1, \ldots, V_k$ are free $R$-modules and if $(e^j_1, \ldots, e^j_{n_j})$ is a basis for $V_j$, then the set of all decomposable elements of the form $e^1_{i_1} \otimes \cdots \otimes e^k_{i_k}$ is a basis for $V_1 \otimes \cdots \otimes V_k$.
Proof. We prove this for the case of $k = 2$. The general case is similar. We wish to show that if $(e_1, \ldots, e_{n_1})$ is a basis for $V_1$ and $(f_1, \ldots, f_{n_2})$ is a basis for $V_2$, then $\{e_i \otimes f_j\}$ is a basis for $V_1 \otimes V_2$. Define $\phi^{lk} : V_1 \times V_2 \to R$ by $\phi^{lk}(e_i, f_j) = \delta^l_i \delta^k_j 1$, where $1$ is the identity in $R$ and
$$\delta^l_i \delta^k_j 1 := \begin{cases} 1 & \text{if } (l, k) = (i, j), \\ 0 & \text{otherwise.} \end{cases}$$
Extend this definition bilinearly. These maps are linearly independent in $L(V_1, V_2; R)$ since if $\sum_{l,k} a_{lk} \phi^{lk} = 0$ in $L(V_1, V_2; R)$, then for any $i, j$ we have
$$0 = \sum_{l,k} a_{lk} \phi^{lk}(e_i, f_j) = \sum_{l,k} a_{lk} \delta^l_i \delta^k_j 1 = a_{ij}.$$
Thus $\dim(V_1 \otimes V_2) = \dim((V_1 \otimes V_2)^*) = \dim L(V_1, V_2; R) \ge n_1 n_2$. On the other hand, $\{e_i \otimes f_j\}$ spans the set of all decomposable elements and hence the whole space $V_1 \otimes V_2$, so that $\dim(V_1 \otimes V_2) \le n_1 n_2$ and it follows that $\{e_i \otimes f_j\}$ is a basis. $\square$

Proposition D.30. There is a unique $R$-module map $\iota : L(V_1, W_1) \otimes \cdots \otimes L(V_k, W_k) \to L(V_1 \otimes \cdots \otimes V_k, W_1 \otimes \cdots \otimes W_k)$ such that if $f_1 \otimes \cdots \otimes f_k$ is a (decomposable) element of $L(V_1, W_1) \otimes \cdots \otimes L(V_k, W_k)$ then
$$\iota(f_1 \otimes \cdots \otimes f_k)(v_1 \otimes \cdots \otimes v_k) = f_1(v_1) \otimes \cdots \otimes f_k(v_k).$$
If the modules are all finitely generated and free, then this is an isomorphism.
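For free modules of finite rank over $\mathbb{Z}$ or $\mathbb{R}$, the map of Proposition D.30 can be checked on coordinates. The sketch below assumes that vectors are coordinate arrays, that linear maps are matrices, and that the basis $\{e_i \otimes f_j\}$ is ordered lexicographically; under those assumptions the tensor product of two maps is represented by the Kronecker product `numpy.kron`. The function name `tensor_of_maps` is ours.

```python
import numpy as np

def tensor_of_maps(A, B):
    """Matrix of A (x) B with respect to the lexicographically ordered bases e_i (x) f_j."""
    return np.kron(A, B)

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(2, 3))   # a linear map Z^3 -> Z^2
B = rng.integers(-3, 4, size=(4, 2))   # a linear map Z^2 -> Z^4
v = rng.integers(-3, 4, size=3)
w = rng.integers(-3, 4, size=2)

lhs = tensor_of_maps(A, B) @ np.kron(v, w)   # (A (x) B)(v (x) w)
rhs = np.kron(A @ v, B @ w)                  # (A v) (x) (B w)
assert np.array_equal(lhs, rhs)
```

The matrix of $A \otimes B$ has $2 \cdot 4$ rows and $3 \cdot 2$ columns, in agreement with the count of basis elements given by Theorem D.29.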
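Proof. If such a map exists, it must be unique since the decomposable elements span $L(V_1, W_1) \otimes \cdots \otimes L(V_k, W_k)$. To show the existence, we define a multilinear map
$$\vartheta : L(V_1, W_1) \times \cdots \times L(V_k, W_k) \times V_1 \times \cdots \times V_k \to W_1 \otimes \cdots \otimes W_k$$
by the recipe
$$(f_1, \ldots, f_k, v_1, \ldots, v_k) \mapsto f_1(v_1) \otimes \cdots \otimes f_k(v_k).$$
By the universal property there must be a linear map
$$\tilde{\vartheta} : L(V_1, W_1) \otimes \cdots \otimes L(V_k, W_k) \otimes V_1 \otimes \cdots \otimes V_k \to W_1 \otimes \cdots \otimes W_k$$
such that $\tilde{\vartheta} \circ \otimes = \vartheta$, where $\otimes$ is the universal map. Now define
$$\iota(f_1 \otimes \cdots \otimes f_k)(v_1 \otimes \cdots \otimes v_k) := \tilde{\vartheta}(f_1 \otimes \cdots \otimes f_k \otimes v_1 \otimes \cdots \otimes v_k).$$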
The fact that $\iota$ is an isomorphism in case the $V_i$ are all free follows easily from Exercise D.28 and Theorem D.29. $\square$

Since $R \otimes R = R$, we obtain

Corollary D.31. There is a unique $R$-module map $\iota : V_1^* \otimes \cdots \otimes V_k^* \to (V_1 \otimes \cdots \otimes V_k)^*$ such that if $\alpha_1 \otimes \cdots \otimes \alpha_k$ is a (decomposable) element of $V_1^* \otimes \cdots \otimes V_k^*$, then
$$\iota(\alpha_1 \otimes \cdots \otimes \alpha_k)(v_1 \otimes \cdots \otimes v_k) = \alpha_1(v_1) \cdots \alpha_k(v_k).$$
If the modules are all finitely generated and free, then this is an isomorphism.

Corollary D.32. There is a unique module map $\iota_0 : W \otimes V^* \to L(V, W)$ such that if $w \otimes \beta$ is a (decomposable) element of $W \otimes V^*$, then
$$\iota_0(w \otimes \beta)(v) = \beta(v) w.$$
If $V$ and $W$ are finitely generated free modules, then this is an isomorphism.

Proof. If we associate to every $w \in W$ the map $w^{\mathrm{map}} \in L(R, W)$ given by $w^{\mathrm{map}}(r) := rw$, then we obtain an isomorphism $W \cong L(R, W)$. Use this and then compose
$$W \otimes V^* \to L(R, W) \otimes L(V, R) \to L(R \otimes V, W \otimes R) \cong L(V, W),$$
$$V^* \otimes W \to L(V, R) \otimes L(R, W) \to L(V \otimes R, R \otimes W) \cong L(V, W).$$
$\square$
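In coordinates, the map $w \otimes \beta \mapsto (v \mapsto \beta(v) w)$ of Corollary D.32 is the familiar outer product. The following sketch assumes $W = \mathbb{Z}^3$ and $V = \mathbb{Z}^2$ with their standard bases, and represents a functional $\beta$ by its row of coefficients; the name `iota0` is chosen here only to mirror the corollary.

```python
import numpy as np

def iota0(w, beta):
    """Matrix of the linear map v -> beta(v) * w, i.e. the outer product of w and beta."""
    return np.outer(w, beta)

w = np.array([1, 2, 3])      # an element of W = Z^3
beta = np.array([4, -1])     # a functional on V = Z^2, acting by beta(v) = beta @ v
v = np.array([2, 5])

assert np.array_equal(iota0(w, beta) @ v, (beta @ v) * w)

# A decomposable element always gives a matrix of rank at most 1 ...
assert np.linalg.matrix_rank(iota0(w, beta)) == 1
# ... so a sum such as e_1 (x) e^1 + e_2 (x) e^2 (the 2x2 identity) is not decomposable.
identity = iota0(np.array([1, 0]), np.array([1, 0])) + iota0(np.array([0, 1]), np.array([0, 1]))
assert np.linalg.matrix_rank(identity) == 2
```

The rank computation gives a concrete instance of the earlier remark that not every element of a tensor product is decomposable, even though the decomposable elements span.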
By combining Corollary D.31 with Proposition D.27 and taking U = R we obtain the following assertion.
Corollary D.33. There is a unique $R$-module map $\iota : V_1^* \otimes \cdots \otimes V_k^* \to L(V_1, \ldots, V_k; R)$ such that if $\alpha_1 \otimes \cdots \otimes \alpha_k$ is a (decomposable) element of $V_1^* \otimes \cdots \otimes V_k^*$, then
$$\iota(\alpha_1 \otimes \cdots \otimes \alpha_k)(v_1, \ldots, v_k) = \alpha_1(v_1) \cdots \alpha_k(v_k).$$
If the modules are all finitely generated and free, then this is an isomorphism.

Theorem D.34. If $\varphi_i : V_i \times W_i \to U_i$ are bilinear maps for $i = 1, \ldots, k$, then there is a unique bilinear map
$$\varphi : \bigotimes_{i=1}^k V_i \times \bigotimes_{i=1}^k W_i \to \bigotimes_{i=1}^k U_i$$
such that for $v_i \in V_i$ and $w_i \in W_i$,
$$\varphi(v_1 \otimes \cdots \otimes v_k, w_1 \otimes \cdots \otimes w_k) = \varphi_1(v_1, w_1) \otimes \cdots \otimes \varphi_k(v_k, w_k).$$

Proof. We sketch the proof in the $k = 2$ case. If $\varphi$ exists, it is unique since elements of the form $v_1 \otimes v_2$ span $V_1 \otimes V_2$ and similarly for $W_1 \otimes W_2$. Now by the universal property of tensor products, associated to $\varphi_i$ for $i = 1, 2$, we have unique linear maps $f_i : V_i \otimes W_i \to U_i$ with $f_i \circ \otimes = \varphi_i$. Then we obtain the linear map $f_1 \otimes f_2 : (V_1 \otimes W_1) \otimes (V_2 \otimes W_2) \to U_1 \otimes U_2$. On the other hand we have the natural isomorphism $S : (V_1 \otimes V_2) \otimes (W_1 \otimes W_2) \to (V_1 \otimes W_1) \otimes (V_2 \otimes W_2)$ induced by the obvious switching of factors of simple elements. Now define $\varphi : (V_1 \otimes V_2) \times (W_1 \otimes W_2) \to U_1 \otimes U_2$ by $\varphi(x, y) = (f_1 \otimes f_2)(S(x \otimes y))$ for $x \in V_1 \otimes V_2$ and $y \in W_1 \otimes W_2$. Then we have
$$\varphi(v_1 \otimes v_2, w_1 \otimes w_2) = (f_1 \otimes f_2)(v_1 \otimes w_1 \otimes v_2 \otimes w_2) = f_1(v_1 \otimes w_1) \otimes f_2(v_2 \otimes w_2) = \varphi_1(v_1, w_1) \otimes \varphi_2(v_2, w_2).$$
$\square$
Corollary D.35. Let $V_i$ and $W_i$ be $R$-modules for $i = 1, \ldots, k$. If $\varphi_i : V_i \times W_i \to R$ are bilinear maps, $i = 1, \ldots, k$, then there is a unique bilinear map
$$\varphi : \bigotimes_{i=1}^k V_i \times \bigotimes_{i=1}^k W_i \to R$$
such that for $v_i \in V_i$ and $w_i \in W_i$,
$$\varphi(v_1 \otimes \cdots \otimes v_k, w_1 \otimes \cdots \otimes w_k) = \varphi_1(v_1, w_1) \cdots \varphi_k(v_k, w_k).$$

Proof. Imitate the proof of the previous theorem or just use the previous theorem together with the natural isomorphism $R \otimes \cdots \otimes R \cong R$. $\square$
D.1. R-Algebras

Definition D.36. Let $R$ be a commutative ring. An (associative) $R$-algebra $\mathfrak{A}$ is a unitary $R$-module that is also a ring with identity $1_{\mathfrak{A}}$, where the ring addition and the module addition coincide and where $r(a_1 a_2) = (r a_1) a_2 = a_1 (r a_2)$ for all $a_1, a_2 \in \mathfrak{A}$ and all $r \in R$.
As defined above, an algebra is associative. However, one can also define nonassociative algebras, and a Lie algebra is an example of such.
Definition D.37. Let $\mathfrak{A}$ and $\mathfrak{B}$ be $R$-algebras. A module homomorphism $h : \mathfrak{A} \to \mathfrak{B}$ that is also a ring homomorphism is called an $R$-algebra homomorphism. Epimorphism, monomorphism, and isomorphism are defined in the obvious way. If a submodule $J$ of an algebra $\mathfrak{A}$ is also a two-sided ideal with respect to the ring structure on $\mathfrak{A}$, then $\mathfrak{A}/J$ is also an algebra.
Example D.38. Let $U$ be an open subset of a $C^r$ manifold $M$. The set $C^r(U)$ of all $C^r$ functions on $U$ is an $\mathbb{R}$-algebra ($\mathbb{R}$ is the real numbers) with unity being the function constantly equal to $1$.

Example D.39. The set of all complex $n \times n$ matrices is an algebra over $\mathbb{C}$ with the product being matrix multiplication.

Example D.40. The set of all complex $n \times n$ matrices with real polynomial entries is an algebra over the ring of polynomials $\mathbb{R}[x]$.

Definition D.41. The set of all endomorphisms of an $R$-module $W$ is an $R$-algebra denoted $\mathrm{End}_R(W)$ and called the endomorphism algebra of $W$. Here, the sum and scalar multiplication are defined as usual and the product is composition. Note that for $r \in R$
$$r(f \circ g) = (rf) \circ g = f \circ (rg),$$
where $f, g \in \mathrm{End}_R(W)$.
Definition D.42. A set $A$ together with a binary operation $* : A \times A \to A$ is called a monoid if the operation is associative and there exists an element $e$ (the identity) such that $a * e = e * a = a$ for all $a \in A$. With the operation of addition, $\mathbb{N}$, $\mathbb{Z}$ and $\mathbb{Z}_2$ are all commutative monoids.
Definition D.43. Let $(A, *)$ be a monoid and $R$ a ring. An $A$-graded $R$-algebra is an $R$-algebra with a direct sum decomposition $\mathfrak{A} = \sum_{i \in A} \mathfrak{A}_i$ such that $\mathfrak{A}_i \mathfrak{A}_j \subset \mathfrak{A}_{i * j}$. An $\mathbb{N}$-graded algebra is sometimes simply referred to as a graded algebra. A superalgebra is a $\mathbb{Z}_2$-graded algebra.

Definition D.44. Let $\mathfrak{A} = \sum_{i \in \mathbb{Z}} \mathfrak{A}_i$ and $\mathfrak{B} = \sum_{i \in \mathbb{Z}} \mathfrak{B}_i$ be $\mathbb{Z}$-graded algebras. An $R$-algebra homomorphism $h : \mathfrak{A} \to \mathfrak{B}$ is called a $\mathbb{Z}$-graded homomorphism if $h(\mathfrak{A}_i) \subset \mathfrak{B}_i$ for each $i \in \mathbb{Z}$.

We now construct the tensor algebra on a fixed $R$-module $W$. This algebra is important because using it we may construct by quotients many important algebras. Consider the following situation: $\mathfrak{A}$ is an $R$-algebra, $W$ an $R$-module, and $\phi : W \to \mathfrak{A}$ is a module homomorphism. If $h : \mathfrak{A} \to \mathfrak{B}$ is an algebra homomorphism, then of course $h \circ \phi : W \to \mathfrak{B}$ is an $R$-module homomorphism.
Definition D.45. Let $W$ be an $R$-module. An $R$-algebra $\mathfrak{U}$ together with a map $\phi : W \to \mathfrak{U}$ is called universal with respect to $W$ if for any $R$-module homomorphism $\psi : W \to \mathfrak{B}$ to an $R$-algebra $\mathfrak{B}$ there is a unique algebra homomorphism $h : \mathfrak{U} \to \mathfrak{B}$ such that $h \circ \phi = \psi$.

Again if such a universal object exists, it is unique up to isomorphism. We now exhibit the construction of this type of universal algebra. First we define $\otimes^0 W := R$ and $\otimes^1 W := W$. Then we define $\otimes^k W := W^{\otimes k} = W \otimes \cdots \otimes W$. The next step is to form the direct sum $\otimes W := \sum_{i=0}^{\infty} \otimes^i W$. In order to make this a $\mathbb{Z}$-graded algebra, we define $\otimes^i W := 0$ for $i < 0$ and then define a product on $\otimes W := \sum_{i \in \mathbb{Z}} \otimes^i W$ as follows: We know that for $i, j > 0$ there is an isomorphism $W^{\otimes i} \otimes W^{\otimes j} \to W^{\otimes (i+j)}$ and so a bilinear map $W^{\otimes i} \times W^{\otimes j} \to W^{\otimes (i+j)}$ such that
$$(w_1 \otimes \cdots \otimes w_i) \times (w_1' \otimes \cdots \otimes w_j') \mapsto w_1 \otimes \cdots \otimes w_i \otimes w_1' \otimes \cdots \otimes w_j'.$$
Similarly, we define $\otimes^0 W \times W^{\otimes i} = R \times W^{\otimes i} \to W^{\otimes i}$ by scalar multiplication. Also, $W^{\otimes i} \times W^{\otimes j} \to 0$ if either $i$ or $j$ is negative. Now we may use the symbol $\otimes$ to denote these multiplications without contradiction and put them together to form a product on $\otimes W := \sum_{i \in \mathbb{Z}} \otimes^i W$. It is now clear that
$$\otimes^i W \times \otimes^j W \to \otimes^{i+j} W,$$
where we make the needed trivial definitions for the negative powers: $\otimes^i W = 0$ for $i < 0$.

Definition D.46. The algebra $\otimes W$ is a graded algebra called the $R$-tensor algebra.
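When $W$ is free, the product just constructed can be modelled very concretely. The sketch below rests on assumptions of our own: $W$ is free over $\mathbb{Z}$ with basis vectors labelled by integers, a basis element $e_{i_1} \otimes \cdots \otimes e_{i_k}$ of $\otimes^k W$ is stored as the word `(i1, ..., ik)`, and an element of $\otimes W$ is a dictionary from words to coefficients. The function names are hypothetical and serve only the illustration.

```python
from collections import defaultdict

def tensor_product(x, y):
    """Multiply two elements of the tensor algebra of a free module.

    An element is a dict mapping a word (tuple of basis labels) to its
    coefficient; the empty word () spans the degree-0 part R.
    """
    result = defaultdict(int)
    for word_x, a in x.items():
        for word_y, b in y.items():
            result[word_x + word_y] += a * b   # concatenate words, multiply scalars
    return {word: c for word, c in result.items() if c != 0}

def degrees(x):
    """Set of degrees k for which x has a nonzero component in the k-th tensor power."""
    return {len(word) for word in x}

e1 = {(1,): 1}
e2 = {(2,): 1}
x = {(): 2, (1,): 3}             # the element 2 + 3 e_1, of mixed degree

# The product is not commutative: e_1 (x) e_2 and e_2 (x) e_1 are different basis elements.
assert tensor_product(e1, e2) != tensor_product(e2, e1)
# Degree i times degree j lands in degree i + j.
assert degrees(tensor_product(tensor_product(e1, e2), e1)) == {3}
```

Multiplying by an element of degree $0$ is just scalar multiplication in this model, matching the definition of $\otimes^0 W \times W^{\otimes i} \to W^{\otimes i}$ above.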
Bibliography
[Arm]
M.A. Armstrong, Basic Topology, Springer-Verlag (1983).
[At]
M.F. Atiyah, K-Theory, W.A. Benjamin (1967).
[A-M-R]
R. Abraham, J.E. Marsden, and T. Ratiu, Manifolds, Tensor Analysis, and Applications, Addison Wesley, Reading, (1983).
[Am]
V.I. Arnold, Mathematical Methods of Classical Mechanics, Graduate Texts in Math. 60, Springer-Verlag, 2nd edition (1989).
[BaMu]
John Baez and Javier P. Muniain, Gauge Fields, Knots and Gravity, World Scientific (1994).
[Be]
Arthur L. Besse, Einstein Manifolds, Classics in Mathematics, SpringerVerlag (1987).
[BishCr]
R. L. Bishop and R. J. Crittenden, Geometry of Manifolds, Academic Press (1964).
[Bo-Tu]
R. Bott and L. Tu, Differential Forms in Algebraic Topology, Springer-Verlag GTM 82 (1982).
[Bre]
G. Bredon, Topology and Geometry, Springer-Verlag GTM 139, (1993).
[BrCI]
F. Brickell and R. Clark, Differentiable Manifolds: An Introduction, Van Nostran Rienhold (1970).
[Bro-Jan]
Th. Brocker and K. Janich, Introduction to Differential Topology, Cambridge University Press (1982).
[Brou]
L.E.J. Brouwer, tiber Abbildung von Mannigfaltigkeiten, Mathematische Annalen 71 (1912),97-115.
[Chavel]
I. Chavel, Eigenvalues in Riemannian Geometry, Academic Press, (1984).
[C-E]
J. Cheeger and D. Ebin, Comparison Theorems in Riemannian Geometry, North-Holland (1975).
[Cheng]
S.Y. Cheng, Eigenvalue comparison theorems and its geometric applications, Math. Z. 143 (1975), 289-297.
[Clark]
C.J .S. Clark, On the global isometric embedding of pseudo-Riemannian manifolds, Proc. Roy. Soc. A314 (1970), 417-428.
-
663
664
Bibliography
[Dar]
R.W.R. Darling, Differential Forms and Connections, Cambridge University Press (1994).
[Dieu]
J. Dieudonne, A History of Algebraic and Differential Topology 1900-1960, Birkhiiuser (1989).
[Dod-Pos]
C.T.J. Dodson and T. Poston, Tensor Geometry: The Geometric Viewpoint and Its Uses, Springer-Verlag (2000).
[Dol]
A. Dold, Lectures on Algebraic Topology, Springer-Verlag (1980).
[Donaldson] S. Donaldson, An application of gauge theory to the topology of 4-manifolds, J. Diff. Geo. 18 (1983), 269~316. [Dug]
J. Dugundji, Topology, Allyn & Bacon (1966).
[Eil-St]
S. Eilenberg and N. Steenrod, Foundations of Algebraic Topology, Princeton Univ. Press (1952).
[Fen]
R. Fenn, Techniques of Geometric Topology, Cambridge Univ. Press (1983).
[Freedman] M. Freedman, The topology of four-dimensional manifolds, J. Diff. Geo. 17 (1982), 357~454. [Fr-Qu]
M. Freedman and F. Quinn, Topology of 4-Manifolds, Princeton Univ. Press (1990).
[Fult]
W. Fulton, Algebraic Topology: A First Course, Springer-Verlag (1995).
[G2]
V. Guillemin, and S. Sternberg, Symplectic Techniques in Physics, Cambridge Univ. Press (1984).
[Gray]
B. Gray, Homotopy Theory, Academic Press (1975).
[Gre-Hrp]
M. Greenberg and J. Harper, Algebraic Topology: A First Course, AddisonWesley (1981).
[Grom]
M. Gromov, Convex sets and Kahler manifolds, in Advances in Differential Geometry and Topology, World Sci. Pub!. (1990), pp. 1~38.
[Helg]
S. Helgason, Differential Geometry, Lie Groups and Symmetric Spaces, Amer. Math. Soc. (2001).
[Hicks]
Noel J. Hicks, Notes on Differential Geometry, D. Van Nostrand Company Inc (1965).
[Hilt 1]
P.J. Hilton, An Introduction to Homotopy Theory, Cambridge University Press (1953).
[Hilt2]
P.J. Hilton and U. Stammbach, A Course in Homological Algebra, SpringerVerlag (1970).
[Hus]
D. Husemoller, Fibre Bundles, McGraw-Hill, (1966) (later editions by Springer-Verlag).
[Kirb-Seib] R. Kirby and L. Siebenmann, Foundational Essays on Topological Manifolds, Smoothings, and Triangulations, Ann. of Math. Studies 88 (1977). [Klein]
Felix Klein, Gesammelte Abhandlungen, Vo!' I, Springer, Berlin (1971).
[Kob]
S. Kobayashi, On conjugate and cut Loci, in Global Differential Geometry, M.A.A. Studies in Math, Vo!' 27, S.S. Chern, Editor, Prentice Hall (1989).
[K-N]
S. Kobayashi and K. Nomizu, Foundations of Differential Geometry. I, II, J. Wiley Interscience (1963), (1969).
[Ll]
S. Lang, Fundamentals of Differential Geometry, Springer-Verlag GTM Vo!' 191 (1999).
[L2]
S. Lang, Algebra, Springer-Verlag GTM Vo!' 211 (2002).
Bibliography
665
[L-M]
H. Lawson and M. Michelsohn, Spin Geometry, Princeton University Press, (1989).
[Lee, Jeff]
Jeffrey M. Lee, Online Supplement to the present text, Internet: http://webpages . acs . ttu. edu/ j lee/Supp. pdf (see also http://www.ams.org/bookpages/gsm-l07).
[Lee, John] John Lee, Introduction to Smooth Manifolds, Springer-Verlag GTM Vol. 218 (2002). [L-R]
David Lovelock and Hanno Rund, Tensors, Differential Forms, and Variational Principles, Dover Publications (1989).
[Madsen]
Id Madsen and J¢rgen Tornehave, From Calculus to Cohomology, Cambridge University Press (1997).
[Matsu]
Y. Matsushima, Differentiable Manifolds, Marcel Dekker (1972).
[M-T-W]
C. Misner, J. Wheeler, and K. Thorne, Gravitation, Freeman (1974).
[MacL]
S. MacLane, Categories for the Working Mathematician, Springer-Verlag GTM Vol. 5 (1971).
[Mass]
W. Massey, Algebraic Topology: An Introduction, Harcourt, Brace & World (1967) (reprinted by Springer-Verlag).
[Mass2]
W. Massey, A Basic Course in Algebraic Topology, Springer-Verlag (1993).
[Maun]
C.R.F. Maunder, Algebraic Topology, (reprinted by Dover Publications).
[Mich]
P. Michor, Topics in Differential Geometry, Graduate Studies in Mathematics, Vol. 93, Amer. Math. Soc. (2008)
[Mil]
J. Milnor, Morse Theory, Annals of Mathematics Studies 51, Princeton University Press (1963).
[Milnl]
J. Milnor, Topology from the Differentiable Viewpoint, University Press of Virginia (1965).
[Mil-St]
J. Milnor and J. Stasheff, Characteristic Classes, Ann. of Math. Studies 76, Princeton University Press (1974).
[Molino]
P. Molino, Riemannian foliations, Progress in Mathematics, Birkhäuser Boston (1988).
[My-St]
S.B. Myers, N.E. Steenrod, The group of isometries of a Riemannian manifold, Annals of Mathematics 40 (1939), no. 2.
[Nash1]
John Nash, C^1-isometric imbeddings, Annals of Mathematics 60 (1954), 383-396.
[Nash2]
John Nash, The imbedding problem for Riemannian manifolds, Annals of Mathematics 63 (1956), 20-63.
[ON1]
B. O'Neill, Semi-Riemannian Geometry, Academic Press (1983).
[ON2]
B. O'Neill, Elementary differential geometry, Academic Press (1997).
[Pa1]
Richard Palais, Natural Operations on Differential Forms, Trans. Amer. Math. Soc. 92 (1959), 125-141.
[Pa2]
Richard Palais, A Modern Course on Curves and Surfaces, Online Book at http://www.math.uci.edu/~cterng/NotesByPalais.pdf.
[Pen]
Roger Penrose, The Road to Reality, Alfred A. Knopf (2005).
[Pe]
Peter Petersen, Riemannian Geometry, Springer-Verlag (1991).
[Poor]
Walter Poor, Differential Geometric Structures, McGraw Hill (1981).
Cambridge Univ. Press (1980)
[Roe]
J. Roe, Elliptic Operators, Topology and Asymptotic methods, Longman (1988).
[Ros]
W. Rossmann, Lie Groups. An Introduction through Linear Groups, Oxford University Press (2002).
[Shrp]
R. Sharpe, Differential Geometry; Cartan's Generalization of Klein's Erlangen Program, Springer-Verlag (1997).
[Span]
Edwin H. Spanier, Algebraic Topology, Springer-Verlag (1966).
[Spv]
M. Spivak, A Comprehensive Introduction to Differential Geometry (5 volumes), Publish or Perish Press (1979).
[St]
N. Steenrod, Topology of Fiber Bundles, Princeton University Press (1951).
[Stern]
Shlomo Sternberg, Lectures on Differential Geometry, Prentice Hall (1964).
[Tond]
Ph. Tondeur, Foliations on Riemannian manifolds, Springer-Verlag (1988).
[Tond2]
Ph. Tondeur, Geometry of Foliations, Monographs in Mathematics, Vol. 90, Birkhäuser (1997).
[Wal]
Gerard Walschap, Metric Structures in Differential Geometry, Springer-Verlag (2004).
[War]
Frank Warner, Foundations of Differentiable Manifolds and Lie Groups, Springer-Verlag (1983).
[Wh]
John A. Wheeler, A Journey into Gravity and Spacetime, Scientific American Library, A division of HPHLP, New York (1990).
[Wei]
A. Weinstein, Lectures on Symplectic Manifolds, Regional Conference Series in Mathematics 29, Amer. Math. Soc. (1977).
Index
adapted chart, 46
adjoint map, 220
adjoint representation, 220, 221
admissible chart, 13
algebra bundle, 287
alternating, 345
associated bundle, 300
asymptotic curve, 159
asymptotic vector, 158
atlas, 11
  submanifold, 46
automorphism, 201
base space, 258
basis criterion, 6
bilinear, 3
boundary, 9
  manifold with, 48, 49
bundle atlas, 260, 263
bundle chart, 260
bundle morphism, 258
bundle-valued forms, 370
canonical parametrization, 180
Cartan's formula, 374
causal character, 547
chart, 11
  centered, 11
Christoffel symbols, 168
closed form, 366
closed Lie subgroup, 192
coboundary, 444
cocycle, 263, 444
  conditions, 263
Codazzi-Mainardi equation, 173
codimension, 46
coframe, 121
complete vector field, 97
complex, 444
  orthogonal group, 196
conjugation, 202
connected, 8
connection forms, 506
conservative, 118
  locally, 119
consolidated tensor product, 311
consolidation maps, 309
contraction, 317
coordinate frame, 87
cotangent bundle, 85
cotangent space, 65
covariant derivative, 501, 503
covector field, 110
cover, 6
covering, 33
  map, 33
  space, 33
critical point, 74
critical value, 74
curvature function, 146
curvature vector, 146
cut-off function, 28
Darboux frame, 160
de Rham cohomology, 367
deck transformation, 34
decomposable, 656
deformation retraction, 453
degree, 463
derivation, 61, 89
  of germs, 64
determinant, 353
diffeomorphism, 4, 25
differentiable, 3
  manifold, 14
  structure, 12
differential, 81, 111
  Lie, 211
differential complex, 444
differential form, 359
discrete group action, 42
distant parallelism, 56
distinguished Frenet frame, 148
distribution (tangent), 468
divergence, 398, 623
effective action, 41
embedding, 128
endomorphism algebra, 661
equivariant rank theorem, 230
exact 1-form, 112
exact form, 366
exponential map, 213
exterior derivative, 363, 366
exterior product, 347
  wedge product, 359
faithful representation, 247
fiber, 138, 258
  bundle, 257
flat point, 162
flatting, 334
flow, 95
  box, 96
  complete, 95
  local, 97
  maximal, 100
foliation, 482
  chart, 482
form, 1-form, 110
frame, 277
  bundle, 292
  field, 120, 277
free action, 41
free module, 653
Frenet frame, 146
functor, 640
fundamental group, 36
G-structure, 264
gauge transformation, 297
Gauss curvature, 157, 557
Gauss curvature equation, 172
Gauss formula, 166
Gauss' theorem, 399
Gauss-Bonnet, 186
general linear group, 193
geodesic, 567
geodesic curvature, 160
  vector, 161
geodesically complete, 569
geodesically convex, 634
germ, 28
global, 30
good cover, 455
graded algebra, 661
graded commutative, 359
Grassmann manifold, 21
group action, 40
half-space, 9
  chart, 49
harmonic form, 419
Hermitian metric, 280
Hessian, 76, 123
homogeneous component, 359
homogeneous coordinates, 19
homogeneous space, 241
homologous, 402
homomorphism presheaf, 289
homothety, 634
homotopy, 32
Hopf bundles, 295
Hopf map, 238, 239
hypersurface, 143, 153
identity component, 191
immersed submanifold, 130
immersion, 127
implicit mapping theorem, 647
index, 332, 337
  of a critical point, 77
  raising and lowering, 335
induced orientation, 382
integral curve, 95
interior, 9
interior product, 374
intrinsic, 169
inversion, 190
isometry, 338
isotropy group, 228
isotropy representation, 248
Jacobi identity, 92
Jacobian, 3
Koszul connection, 501
Lagrange identity, 164
left invariant, 204
left translation, 191
length, 333
lens space, 52
Levi-Civita derivative, 550
Lie algebra, 92
  automorphism, 206
  homomorphism, 206
Lie bracket, 91
Lie derivative, 89, 330
  of a vector field, 103
  on functions, 89
Lie differential, 211
Lie group, 189
  homomorphism, 201
Lie subgroup, 191
  closed, 192
lift, 37, 126
lightcone, 548
lightlike curve, 548
lightlike subspace, 588
line bundle, 281
line integral, 117
linear frame bundle, 292
linear Lie group, 193
local flow, 97
local section, 259
local trivialization, 260
locally connected, 8
locally finite, 6
long exact, 445
Lorentz manifold, 337
Möbius band, 261
manifold topology, 13
manifold with corners, 54
Maurer-Cartan equation, 387
Maurer-Cartan form, 224
maximal flow, 99
maximal integral curve, 98
Mayer-Vietoris, 447
mean curvature, 157
measure zero, 75
meridians, 160
metric, 279
  connection, 543
minimal hypersurface, 179
Minkowski space, 337
module, 649
morphism, 638
moving frame, 120
multilinear, 653
musical isomorphism, 334
n-manifold, 16
neighborhood, 2
nice chart, 383
nonnull curve, 548
nonnull vector, 547
normal coordinates, 572
normal curvature, 158
normal field, 153
normal section, 159
nowhere vanishing, 277
null vector, 547
one-parameter subgroup, 202
open manifold, 50
open submanifold, 16
orbit, 41
  map, 232
orientable manifold, 377
orientation cover, 380
orientation for a vector space, 353
orientation of a vector bundle, 375
oriented manifold, 377
orthogonal group, 195
orthonormal frame field, 165, 279
outward pointing, 381
overlap maps, 11
paracompact, 6, 645
parallel, 171
  translation, 516
parallelizable, 278
parallels, 160
partial tangent map, 72
partition of unity, 30
path component, 8
path connected, 8
Pauli matrices, 203
piecewise smooth, 117
plaque, 482
point derivation, 61
point finite cover, 644
presheaf, 289
principal bundle, 293
  atlas, 293
  morphism, 297
principal curve, 159
principal frame field, 165
principal normal, 146
principal part, 56
principal vector, 158
product group, 191
product manifold, 20
projective plane, 18
projective space, 18
proper action, 231
proper map, 33
proper time, 550
property W, 133
pull-back, 92, 114, 322, 323
  bundle, 268
  vector bundle, 276
push-forward, 92, 115, 323
R-algebra, 660
radial geodesic, 571
radially parallel, 517
rank, 78
  of a linear map, 127
real projective space, 18
refinement, 6
reflexive, 654
reflexive module, 309
regular point, 74
regular submanifold, 46
regular value, 74
related vector fields, 93
Riemannian manifold, 337
Riemannian metric, 279
Sard's theorem, 76
scalar product, 193, 331
  space, 193
second fundamental form, 156
section, 87, 259
  along a map, 269
sectional curvature, 557
self-avoiding, 42
semi-Euclidean motion, 339
semi-Riemannian, 337
semiorthogonal, 195
shape operator, 155
sharping, 334
sheaf, 289
short exact, 444
shuffle, 347
sign, 599
simple tensor, 656
simply connected, 37
single-slice chart, 46
singular homology, 402
singular point, 74
smooth functor, 283
smooth manifold, 15
smooth map, 22
smooth structure, 12
smoothly universal, 129
spacelike curve, 548
spacelike subspace, 588
spacelike vector, 547
sphere theorem, 622
spin-j, 250
spray, 544
stabilizer, 228
standard action, 250
standard transition maps, 274
stereographic projection, 17
Stiefel manifold, 51, 243
Stokes' theorem, 396
straightening, 102
structural equations, 386
structure constants, 205
subgroup (Lie), 191
submanifold, 46
  property, 46
submersion, 138
submodule, 651
summation convention, 5
support, 28, 101, 391
surface of revolution, 160
symplectic group, 196
tangent bundle, 81
tangent functor, 71, 84
tangent map, 67, 68, 81
tangent space, 58, 61, 65
tangent vector, 58, 60, 61
tautological bundle, 281
tensor (algebraic), 308
  bundle, 319
  derivation, 327
  field, 320
  map, 308
tensor product, 251, 319, 654, 655
  bundle, 282
  of tensor fields, 319
  of tensor maps, 311
  of tensors, 311
theorema egregium, 176
tidal operator, 558
time dependent vector field, 110
timelike curve, 548
timelike subspace, 588
timelike vector, 547
TM-valued tensor field, 320
top form, 378
topological manifold, 7
torsion (of curve), 150
total space, 258
totally geodesic, 582
totally umbilic, 162
transition maps, 261, 265
  standard, 274
transitive, 228
transitive action, 41
transversality, 80
trivialization, 84, 260
typical fiber, 258
umbilic, 162
unitary group, 196
universal, 655
  cover, 39
  property, bilinear, 251, 654
VB-chart, 270
vector bundle, 270
  morphism, 271
vector field, 87
  along, 89
vector subbundle, 271
velocity, 69
volume form, 378
weak embedding, 129
weakly embedded, 132
wedge product, 347
Whitney sum bundle, 276
zero section, 276
Titles in This Series
108 Enrique Outerelo and Jesus M. Ruiz, Mapping degree theory, 2009
107 Jeffrey M. Lee, Manifolds and differential geometry, 2009
106 Robert J. Daverman and Gerard A. Venema, Embeddings in manifolds, 2009
105 Giovanni Leoni, A first course in Sobolev spaces, 2009
104 Paolo Aluffi, Algebra: Chapter 0, 2009
103 Branko Grünbaum, Configurations of points and lines, 2009
102 Mark A. Pinsky, Introduction to Fourier analysis and wavelets, 2009
101 Ward Cheney and Will Light, A course in approximation theory, 2009
100 I. Martin Isaacs, Algebra: A graduate course, 2009
99 Gerald Teschl, Mathematical methods in quantum mechanics: With applications to Schrödinger operators, 2009
98 Alexander I. Bobenko and Yuri B. Suris, Discrete differential geometry: Integrable structure, 2008
97 David C. Ullrich, Complex made simple, 2008
96 N. V. Krylov, Lectures on elliptic and parabolic equations in Sobolev spaces, 2008
95 Leon A. Takhtajan, Quantum mechanics for mathematicians, 2008
94 James E. Humphreys, Representations of semisimple Lie algebras in the BGG category O, 2008
93 Peter W. Michor, Topics in differential geometry, 2008
92 I. Martin Isaacs, Finite group theory, 2008
91 Louis Halle Rowen, Graduate algebra: Noncommutative view, 2008
90 Larry J. Gerstein, Basic quadratic forms, 2008
89 Anthony Bonato, A course on the web graph, 2008
88 Nathanial P. Brown and Narutaka Ozawa, C*-algebras and finite-dimensional approximations, 2008
87 Srikanth B. Iyengar, Graham J. Leuschke, Anton Leykin, Claudia Miller, Ezra Miller, Anurag K. Singh, and Uli Walther, Twenty-four hours of local cohomology, 2007
86 Yulij Ilyashenko and Sergei Yakovenko, Lectures on analytic differential equations, 2007
85 John M. Alongi and Gail S. Nelson, Recurrence and topology, 2007
84 Charalambos D. Aliprantis and Rabee Tourky, Cones and duality, 2007
83 Wolfgang Ebeling, Functions of several complex variables and their singularities (translated by Philip G. Spain), 2007
82 Serge Alinhac and Patrick Gerard, Pseudo-differential operators and the Nash-Moser theorem (translated by Stephen S. Wilson), 2007
81 V. V. Prasolov, Elements of homology theory, 2007
80 Davar Khoshnevisan, Probability, 2007
79 William Stein, Modular forms, a computational approach (with an appendix by Paul E. Gunnells), 2007
78 Harry Dym, Linear algebra in action, 2007
77 Bennett Chow, Peng Lu, and Lei Ni, Hamilton's Ricci flow, 2006
76 Michael E. Taylor, Measure theory and integration, 2006
75 Peter D. Miller, Applied asymptotic analysis, 2006
74 V. V. Prasolov, Elements of combinatorial and differential topology, 2006
73 Louis Halle Rowen, Graduate algebra: Commutative view, 2006
72 R. J. Williams, Introduction to the mathematics of finance, 2006
71 S. P. Novikov and I. A. Taimanov, Modern geometric structures and fields, 2006
70 Sean Dineen, Probability theory in finance, 2005
69 Sebastian Montiel and Antonio Ros, Curves and surfaces, 2005
68 Luis Caffarelli and Sandro Salsa, A geometric approach to free boundary problems, 2005
67 T.Y. Lam, Introduction to quadratic forms over fields, 2004
66 Yuli Eidelman, Vitali Milman, and Antonis Tsolomitis, Functional analysis, An introduction, 2004
65 S. Ramanan, Global calculus, 2004
64 A. A. Kirillov, Lectures on the orbit method, 2004
63 Steven Dale Cutkosky, Resolution of singularities, 2004
62 T. W. Körner, A companion to analysis: A second first and first second course in analysis, 2004
61 Thomas A. Ivey and J. M. Landsberg, Cartan for beginners: Differential geometry via moving frames and exterior differential systems, 2003
60 Alberto Candel and Lawrence Conlon, Foliations II, 2003
59 Steven H. Weintraub, Representation theory of finite groups: algebra and arithmetic, 2003
58 Cedric Villani, Topics in optimal transportation, 2003
57 Robert Plato, Concise numerical mathematics, 2003
56 E. B. Vinberg, A course in algebra, 2003
55 C. Herbert Clemens, A scrapbook of complex curve theory, second edition, 2003
54 Alexander Barvinok, A course in convexity, 2002
53 Henryk Iwaniec, Spectral methods of automorphic forms, 2002
52 Ilka Agricola and Thomas Friedrich, Global analysis: Differential forms in analysis, geometry and physics, 2002
51 Y. A. Abramovich and C. D. Aliprantis, Problems in operator theory, 2002
50 Y. A. Abramovich and C. D. Aliprantis, An invitation to operator theory, 2002
49 John R. Harper, Secondary cohomology operations, 2002
48 Y. Eliashberg and N. Mishachev, Introduction to the h-principle, 2002
47 A. Yu. Kitaev, A. H. Shen, and M. N. Vyalyi, Classical and quantum computation, 2002
46 Joseph L. Taylor, Several complex variables with connections to algebraic geometry and Lie groups, 2002
45 Inder K. Rana, An introduction to measure and integration, second edition, 2002
44 Jim Agler and John E. McCarthy, Pick interpolation and Hilbert function spaces, 2002
43 N. V. Krylov, Introduction to the theory of random processes, 2002
42 Jin Hong and Seok-Jin Kang, Introduction to quantum groups and crystal bases, 2002
41 Georgi V. Smirnov, Introduction to the theory of differential inclusions, 2002
40 Robert E. Greene and Steven G. Krantz, Function theory of one complex variable, third edition, 2006
39 Larry C. Grove, Classical groups and geometric algebra, 2002
38 Elton P. Hsu, Stochastic analysis on manifolds, 2002
37 Hershel M. Farkas and Irwin Kra, Theta constants, Riemann surfaces and the modular group, 2001
36 Martin Schechter, Principles of functional analysis, second edition, 2002
35 James F. Davis and Paul Kirk, Lecture notes in algebraic topology, 2001
34 Sigurdur Helgason, Differential geometry, Lie groups, and symmetric spaces, 2001
33 Dmitri Burago, Yuri Burago, and Sergei Ivanov, A course in metric geometry, 2001
32 Robert G. Bartle, A modern theory of integration, 2001
For a complete list of titles in this series, visit the AMS Bookstore at www.ams.org/bookstore/.