a" .. . ,a,) is a piecewise C' path between points P = PI and Q = P d I' where Pi is the beginning point of ai (or the end point of ai-I)' By definition and by what we have just seen, we find
400    FUNCTIONS ON n-SPACE    [XV, §4]
that

    ∫_α F = ∫_{α_1} F + ⋯ + ∫_{α_r} F
          = φ(P_2) − φ(P_1) + φ(P_3) − φ(P_2) + ⋯ + φ(P_{r+1}) − φ(P_r)
          = φ(P_{r+1}) − φ(P_1) = φ(Q) − φ(P).
Hence the same result holds in the general case. In particular, if α is a closed path, then P = Q and we find
    ∫_α F = φ(P) − φ(P) = 0.
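Both conclusions — path-independence and the vanishing of closed-path integrals for a field with a potential — can be checked numerically. The sketch below is our own illustration (the potential φ and the sample paths are our choices, not from the text); it approximates ∫_γ F by a midpoint Riemann sum of F(γ(t))·γ′(t).

```python
import math

# Sketch (our own example): for F = grad(phi) with phi(x, y) = x^2 + x*y,
# the curve integral of F depends only on the end points, and vanishes on
# closed paths.

def F(x, y):
    return (2*x + y, x)          # grad(phi), computed by hand

def integral(curve, d_curve, a, b, n=20000):
    # midpoint Riemann sum of F(curve(t)) . curve'(t) over [a, b]
    total, dt = 0.0, (b - a) / n
    for k in range(n):
        t = a + (k + 0.5) * dt
        fx, fy = F(*curve(t))
        dx, dy = d_curve(t)
        total += (fx*dx + fy*dy) * dt
    return total

# a closed path: the unit circle
circle = lambda t: (math.cos(t), math.sin(t))
d_circle = lambda t: (-math.sin(t), math.cos(t))
print(abs(integral(circle, d_circle, 0, 2*math.pi)) < 1e-6)   # True

# two paths from (0, 0) to (1, 1); both give phi(1,1) - phi(0,0) = 2
seg, d_seg = (lambda t: (t, t)), (lambda t: (1.0, 1.0))
para, d_para = (lambda t: (t, t*t)), (lambda t: (1.0, 2*t))
print(round(integral(seg, d_seg, 0, 1), 6), round(integral(para, d_para, 0, 1), 6))  # 2.0 2.0
```

The two paths from (0, 0) to (1, 1) give the same value because both equal φ(1, 1) − φ(0, 0).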
Thus we have shown that condition (1) implies both (2) and (3). It is obvious that (2) implies (3). Conversely, assume that the integral of F along any closed path is equal to 0. We shall prove (2). Intuitively, given two points P, Q and two paths α, β from P to Q, we go from P to Q along α, and back along the inverse of β. The integral must be equal to 0. To see this formally, let β⁻ be the path opposite to β. Then Q is the beginning point of β⁻ and P is its end point. Hence the path {α, β⁻} is a closed path, and by hypothesis,

    ∫_{α,β⁻} F = 0.

However,

    ∫_{α,β⁻} F = ∫_α F + ∫_{β⁻} F = ∫_α F − ∫_β F.

Hence

    ∫_α F = ∫_β F,

thus proving that (3) implies (2). There remains to prove that if we assume (2), that is, if the integral is independent of the path, then F admits a potential function. Let P be a fixed point of U. It is natural to define for any point Q of U the value

    φ(Q) = ∫_α F,
[XV, §4]    CURVE INTEGRALS    401
taken along any path α, since this value is independent of the path. We now contend that φ is a potential function for F. To verify this, we must compute the partial derivatives of φ. If F is expressed in terms of its coordinate functions f_i, so that F = (f_1, …, f_n), we must show that

    D_i φ = f_i    for i = 1, …, n.

Let Q = (x_1, …, x_n), and let e_i be the i-th unit vector. We must show that

    lim_{h→0} [φ(Q + he_i) − φ(Q)]/h = f_i(Q).
such that the image of each small interval [a_i, a_{i+1}] is contained in a disc D_i ⊂ U. Then F has a potential on D_i. We replace the curve γ on the interval [a_i, a_{i+1}] by the rectangular curve drawn in Figure 13. This proves the lemma by using Lemma 1.2.

Figure 13

In the figure, we let P_i = γ(a_i). If γ is a closed path, then it is clear that the rectangular path constructed in the lemma is also a closed path, looking like this:
434    WINDING NUMBER AND GLOBAL POTENTIAL FUNCTIONS    [XVI, §3]
Figure 14

By definition of homologous, the lemma states that γ and η are homologous in U. The lemma reduces the proof of the integrability theorem to the case when γ is a rectangular closed chain. We shall now reduce the theorem to the case of rectangles by stating and proving a theorem having nothing to do with vector fields. We need a little more terminology. Let γ be a curve in an open set U, defined on an interval [a, b]. Let
    a = a_0 ≤ a_1 ≤ ⋯ ≤ a_n = b

be a partition of the interval. Let

    γ_i: [a_i, a_{i+1}] → U

be the restriction of γ to the smaller interval [a_i, a_{i+1}]. Then we agree to call the chain

    γ_1 + γ_2 + ⋯ + γ_n

a subdivision of γ. Furthermore, if η_i is obtained from γ_i by another parametrization, we again agree to call the chain

    η_1 + η_2 + ⋯ + η_n

a subdivision of γ. For any practical purposes, the chains γ and

    η_1 + η_2 + ⋯ + η_n
[XVI, §3]    PROOF OF THE GLOBAL INTEGRABILITY THEOREM    435
do not differ from each other. In Figure 15 we illustrate such a chain γ and a subdivision η_1 + η_2 + η_3 + η_4.

Figure 15

Similarly, if γ = Σ_i m_i γ_i is a chain, and {η_ij} is a subdivision of γ_i, we call

    Σ_{i,j} m_i η_ij

a subdivision of γ. The next theorem is the heart of Artin's proof.

Theorem 3.2. Let γ be a rectangular closed chain in U, and assume that γ is homologous to 0 in U, i.e.

    W(γ, P) = 0

for every point P not in U. Then there exist closed rectangles R_1, …, R_N contained in U such that, if ∂R_i is the boundary of R_i oriented counterclockwise, then a subdivision of γ is equal to

    Σ_{i=1}^N m_i ∂R_i

for some integers m_i.
Lemma 3.1 and Theorem 3.2 make the Integrability Theorem 2.4 obvious, because we know that for any locally integrable vector field F on U we have

    ∫_{∂R_i} F = 0.

Hence the integral of F over the subdivision of γ is also equal to 0, whence the integral of F over γ is also equal to 0. We now prove the theorem. Given the rectangular chain γ, we draw all vertical and horizontal lines passing through the sides of the chain, as illustrated in Figure 16.
Figure 16
Then these vertical and horizontal lines decompose the plane into rectangles, and rectangular regions extending to infinity in the vertical and horizontal direction. Let R_i be one of the rectangles, and let P_i be a point inside R_i. Let

    m_i = W(γ, P_i).

For some rectangles we have m_i = 0, and for some rectangles we have m_i ≠ 0. We let R_1, …, R_N be those rectangles such that m_1, …, m_N are not 0, and we let ∂R_i be the boundary of R_i for i = 1, …, N, oriented counterclockwise. We shall prove the following two assertions:

1. Every rectangle R_i such that m_i ≠ 0 is contained in U.
2. Some subdivision of γ is equal to

    Σ_{i=1}^N m_i ∂R_i.
This will prove the desired theorem.

Assertion 1. By assumption, P_i must be in U, because W(γ, P) = 0 for every point P outside of U. Since the winding number is constant on connected sets, it is constant on the interior of R_i, hence ≠ 0, and the interior of R_i is contained in U. If a boundary point of R_i is on γ, then it is in U. If a boundary point of R_i is not on γ, then the winding number with respect to γ is defined, and is equal to m_i ≠ 0 by continuity
(Lemma 2.1). This proves that the whole rectangle R_i, including its boundary, is contained in U, and proves the first assertion.

Assertion 2. We now replace γ by an appropriate subdivision. The vertical and horizontal lines cut γ in various points. We can then find a subdivision η of γ such that every curve occurring in η is some side of a rectangle, or the finite side of one of the infinite rectangular regions. The subdivision η is the sum of such sides, taken with appropriate multiplicities. If a finite side of an infinite rectangle occurs in the subdivision, then after inserting one more horizontal or vertical line passing through the infinite rectangular region, the finite side will be the side of a rectangle R′ of the grid, and its winding number m′ will be equal to zero. Thus without loss of generality, we may assume that every side of the subdivision is also the side of one of the finite rectangles in the grid formed by the horizontal and vertical lines. It will now suffice to prove that

    η = Σ m_i ∂R_i.

Suppose η − Σ m_i ∂R_i is not the 0 chain. Then it contains some horizontal or vertical segment σ, so that we can write

    η − Σ m_i ∂R_i = mσ + C′,

where m is an integer, and C′ is a chain of vertical and horizontal segments other than σ. Then σ is the side of a finite rectangle R_k. We take σ with the orientation arising from the counterclockwise orientation of the boundary of the rectangle R_k. Then the closed chain

    C = η − Σ m_i ∂R_i − m ∂R_k
does not contain σ. Let P_k be a point interior to R_k, and let P′ be a point near σ but on the opposite side from P_k, as shown on the figure.

Figure 17

Since η − Σ m_i ∂R_i − m ∂R_k does not contain σ, the points P_k and P′ are connected by a line segment which does not intersect C. Therefore

    W(C, P_k) = W(C, P′).
But W(η, P_k) = m_k, and W(∂R_i, P_k) = 0 unless i = k, in which case W(∂R_k, P_k) = 1. Similarly, if P′ is inside some finite rectangle R_j, then j ≠ k because P′ is on the other side of σ, and hence W(∂R_k, P′) = 0. If P′ is in an infinite rectangle, then W(∂R_j, P′) = 0. Hence:

    W(C, P_k) = W(η − Σ m_i ∂R_i − m ∂R_k, P_k) = m_k − m_k − m = −m;
    W(C, P′) = W(η − Σ m_i ∂R_i − m ∂R_k, P′) = m_j − m_j = 0

if P′ is in some finite rectangle R_j, and 0 − 0 = 0 otherwise. This proves that m = 0, and concludes the proof that

    η − Σ m_i ∂R_i = 0.
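The winding numbers m_i = W(γ, P_i) that drive this proof can be computed concretely for a closed polygonal curve by accumulating signed angle increments; the following is our own sketch (the text defines W analytically, not by this procedure).

```python
import math

# Sketch (ours): winding number W(gamma, P) of a closed polygonal curve,
# given by its vertex list, around a point P not on the curve.  Each edge
# contributes the change of the angle of gamma(t) - P, normalized to
# (-pi, pi]; the total is 2*pi times the winding number.

def winding_number(vertices, p):
    px, py = p
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        d = math.atan2(y1 - py, x1 - px) - math.atan2(y0 - py, x0 - px)
        if d <= -math.pi:        # normalize the increment to (-pi, pi]
            d += 2*math.pi
        elif d > math.pi:
            d -= 2*math.pi
        total += d
    return round(total / (2*math.pi))

square = [(1, 1), (-1, 1), (-1, -1), (1, -1)]   # counterclockwise boundary
print(winding_number(square, (0, 0)))   # 1
print(winding_number(square, (5, 0)))   # 0
```

For the counterclockwise boundary of a rectangle this gives 1 at interior points and 0 at exterior points, matching W(∂R_k, P_k) = 1 in the argument above.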
XVI, §4. THE INTEGRAL OVER CONTINUOUS PATHS

To go further, it is now convenient to extend our notion of local integrability, and to deal with continuous paths rather than piecewise C¹ paths. We do this as follows. Let U be an open connected set in R². Let F = (f_1, f_2) be a continuous vector field on U. We say that F is locally integrable on U if given a point P there exists a disc D centered at P such that F has a potential function on D. For the rest of this chapter, we assume that F is locally integrable. Furthermore, by a curve and a path from now on, we mean a continuous curve or continuous path. We do not require further differentiability. We shall define the integral of F along continuous paths. Let γ_1: [a_1, a_2] → U be a (continuous) curve whose image is contained in a disc D ⊂ U, and suppose F has a potential g on D. Then we define the integral

    ∫_{γ_1} F = g(P_2) − g(P_1),

where P_1 = γ_1(a_1) and P_2 = γ_1(a_2). Since a potential is uniquely determined up to an additive constant, it follows that the value g(P_2) − g(P_1) is independent of the choice of potential g for F on the disc. If the curve γ_1 happens to be C¹, then the above value coincides with the value we gave in Chapter XV, Theorem 4.2. Suppose now that γ: [a, b] → U is a continuous curve, without restriction on its image. We have seen that we can find a partition of [a, b], and a sequence of discs D_0, …, D_{n−1} connected by the curve along the partition such that each D_i ⊂ U. We next formulate a stronger result.

Lemma 4.1. There exists a partition {a_0 ≤ a_1 ≤ ⋯ ≤ a_n} and a sequence of discs {D_0, …, D_{n−1}} connected along the partition such that F has a potential g_i on D_i.
[XVI, §4]    THE INTEGRAL OVER CONTINUOUS PATHS    439
Proof. For each P in the image of γ, there is a disc D_P centered at P such that F has a potential on D_P. Let r_P be the radius of D_P. We cover the image of γ by the discs D′_P of radius r_P/2. Since the image of γ is compact, there is a finite subcovering, say by discs D′_j centered at P_j of radius r_j/2, j = 0, …, m. Let

    ε = min_j r_j/2.

There exists δ such that if t_1, t_2 ∈ [a, b] and |t_1 − t_2| < δ, then |γ(t_1) − γ(t_2)| < ε. We choose a partition {a_0 ≤ a_1 ≤ ⋯ ≤ a_n} of [a, b] into subintervals of equal length less than δ. Then γ(a_i) is contained in some disc D′_{j(i)}, depending on i, and γ([a_i, a_{i+1}]) is contained in a disc of radius at most ε centered at γ(a_i). Therefore γ([a_i, a_{i+1}]) is contained in the disc D_{j(i)} centered at P_{j(i)} of radius r_{j(i)}. We can then use the sequence of discs

    D_{j(0)}, …, D_{j(n−1)}
to conclude the proof of the lemma. Let γ_i: [a_i, a_{i+1}] → U be the restriction of γ to the smaller interval [a_i, a_{i+1}]. Then

    ∫_γ F = Σ_{i=0}^{n−1} ∫_{γ_i} F.

Let γ(a_i) = P_i, and let g_i be a potential of F on the disc D_i. We define

    ∫_{γ_i} F = g_i(P_{i+1}) − g_i(P_i).

Thus even though F may not have a potential on the whole open set U, its integral can nevertheless be expressed in terms of local potentials by decomposing the curve as a sum of sufficiently smaller curves. The same formula then applies to a path. This procedure allows us to define the integral of F along any continuous curve; we do not need to assume any differentiability property of the curve. We need only apply the above procedure, but then we must show that the expression
    Σ_{i=0}^{n−1} [g_i(P_{i+1}) − g_i(P_i)]

is independent of the choice of partition of the interval [a, b] and of the choices of the discs D_i containing γ([a_i, a_{i+1}]). Then this sum can be taken as the definition of the integral

    ∫_γ F.
We state this independence formally, repeating the construction.

Lemma 4.2. Let γ: [a, b] → U be a continuous curve. Let

    {a_0 ≤ a_1 ≤ ⋯ ≤ a_n}

be a partition of [a, b] such that the image γ([a_i, a_{i+1}]) is contained in a disc D_i, and D_i is contained in U. Let F be locally integrable on U and let g_i be a potential of F on D_i. Let P_i = γ(a_i). Then the sum

    Σ_{i=0}^{n−1} [g_i(P_{i+1}) − g_i(P_i)]

is independent of the choices of partition, discs D_i, and potentials g_i on D_i subject to the stated conditions.
Proof. First let us work with the given partition, but let B_i be another disc containing the image γ([a_i, a_{i+1}]), with B_i contained in U. Let h_i be a potential of F on B_i. Then both g_i, h_i are potentials of F on the intersection B_i ∩ D_i, which is open and connected. Hence there exists a constant C_i such that g_i = h_i + C_i on B_i ∩ D_i. Therefore the differences are equal:

    g_i(P_{i+1}) − g_i(P_i) = h_i(P_{i+1}) − h_i(P_i).

Thus we have proved that, given the partition, the value of the sum is independent of the choices of potentials and choices of discs. Given two partitions, we can always find a common refinement, as in elementary calculus. Recall that a partition

    𝒬 = {b_0 ≤ b_1 ≤ ⋯ ≤ b_m}

is called a refinement of the partition 𝒫 if every point of 𝒫 is among the points of 𝒬, that is, if each a_i is equal to some b_j. Two partitions always have a common refinement, which we obtain by inserting all the points of one partition into the other. Furthermore, we can obtain a refinement
of a partition by inserting one point at a time. Thus it suffices to prove that if the partition 𝒬 is a refinement of the partition 𝒫 obtained by inserting one point, then Lemma 4.2 is valid in this case. So we can suppose that 𝒬 is obtained by inserting some point c in some interval [a_k, a_{k+1}] for some k; that is, 𝒬 is the partition

    {a_0 ≤ ⋯ ≤ a_k ≤ c ≤ a_{k+1} ≤ ⋯ ≤ a_n}.

We have already shown that, given a partition, the value of the sum as in the statement of the lemma is independent of the choice of discs and potentials as described in the lemma. Hence for this new partition 𝒬, we can take the same discs D_i for all the old intervals [a_i, a_{i+1}] when i ≠ k, and we take the disc D_k for the intervals [a_k, c] and [c, a_{k+1}]. Similarly, we take the potential g_i on D_i as before, and g_k on D_k. Then the sum with respect to the new partition is the same as for the old one, except that the single term

    g_k(P_{k+1}) − g_k(P_k)

is now replaced by the two terms

    [g_k(P_{k+1}) − g_k(γ(c))] + [g_k(γ(c)) − g_k(P_k)].
This does not change the value, and concludes the proof of Lemma 4.2.

For any continuous path γ: [a, b] → U we may thus define

    ∫_γ F = Σ_{i=0}^{n−1} [g_i(P_{i+1}) − g_i(P_i)]

for any partition {a_0 ≤ a_1 ≤ ⋯ ≤ a_n} of [a, b] such that γ([a_i, a_{i+1}]) is contained in a disc D_i, D_i ⊂ U, and g_i is a potential of F on D_i. We have just proved that the expression on the right-hand side is independent of the choices made, and we had seen previously that if γ is piecewise C¹ then the expression on the right-hand side gives the same value as the definition used in Chapter XV, Theorem 4.2. It is often convenient to have the additional flexibility provided by arbitrary continuous paths. As an application, we shall now see that if two paths lie "close together," and have the same beginning point and the same end point, then the integrals of F along the two paths have the same value. We must define precisely what we mean by "close together." After a reparametrization, we may assume that the two paths are defined over the same interval [a, b]. We say that they are close together if there exists a partition

    a = a_0 ≤ a_1 ≤ ⋯ ≤ a_n = b
and for each i = 0, …, n − 1 there exists a disc D_i contained in U such that the images of each segment [a_i, a_{i+1}] under the two paths γ, η are contained in D_i, that is,

    γ([a_i, a_{i+1}]) ⊂ D_i    and    η([a_i, a_{i+1}]) ⊂ D_i.

Given the locally integrable vector field F, we say that the paths are F-close together if they satisfy the above conditions, and also if F has a potential function on each disc D_i.

Lemma 4.3. Let γ, η be two paths in an open set U, and assume that they have the same beginning point and the same end point. Let F be a locally integrable vector field on U, and assume that the paths are F-close together. Then

    ∫_γ F = ∫_η F.
Proof. We suppose that the paths are defined on the same interval [a, b], and we choose a partition and discs D_i as above. Let g_i be a potential of F on D_i. Let

    P_i = γ(a_i)    and    Q_i = η(a_i).

We illustrate the paths and their partition in Figure 18.
Figure 18
The functions g_{i+1} and g_i are potentials of F on the connected open set D_{i+1} ∩ D_i, so g_{i+1} − g_i is constant on D_{i+1} ∩ D_i. But D_{i+1} ∩ D_i contains P_{i+1} and Q_{i+1}. Consequently

    g_{i+1}(P_{i+1}) − g_{i+1}(Q_{i+1}) = g_i(P_{i+1}) − g_i(Q_{i+1}).
Then we find

    ∫_γ F − ∫_η F = Σ_{i=0}^{n−1} [(g_i(P_{i+1}) − g_i(Q_{i+1})) − (g_i(P_i) − g_i(Q_i))]
                  = 0,

because the two paths have the same beginning point P_0 = Q_0 and the same end point P_n = Q_n. This proves the lemma.

One can also formulate an analogous lemma for closed paths.

Lemma 4.4. Let γ, η be closed paths in the open set U, say defined on the same interval [a, b]. Assume that they are F-close together. Then

    ∫_γ F = ∫_η F.
Proof. The proof is the same as above, except that the reason why we find 0 in the last step is now slightly different. Since the paths are closed, we have

    P_0 = P_n    and    Q_0 = Q_n,

as illustrated in Figure 19.

Figure 19

The two potentials g_{n−1} and g_0 differ by a constant on some disc contained in U and containing P_0, Q_0. Hence the
last expression obtained in the proof of Lemma 4.3 is again equal to 0, as was to be shown.
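The machinery of this section — telescoping local potential differences g_i(P_{i+1}) − g_i(P_i) — can be made concrete with the standard example F = (−y/(x²+y²), x/(x²+y²)) on R² with the origin deleted, which is locally integrable (local potentials are branches of the polar angle) but has no global potential. The following sketch is our own implementation of the procedure, with our own choice of arcs and angle branches:

```python
import math

# Sketch (ours): integrate the angle field F = (-y/r^2, x/r^2) over the unit
# circle by cutting the circle into n short arcs, choosing on each arc a local
# potential g_i (a branch of the polar angle continuous near that arc), and
# summing g_i(P_{i+1}) - g_i(P_i).

def integral_over_unit_circle(n=12):
    total = 0.0
    for i in range(n):
        t0 = 2*math.pi * i / n
        t1 = 2*math.pi * (i + 1) / n
        tm = (t0 + t1) / 2          # direction of the arc's midpoint

        def g(p, tm=tm):
            # branch of the angle continuous on a disc around the midpoint
            # direction: rotate by -tm, take atan2, then shift back by tm
            x, y = p
            c, s = math.cos(tm), math.sin(tm)
            return tm + math.atan2(-s*x + c*y, c*x + s*y)

        p0 = (math.cos(t0), math.sin(t0))
        p1 = (math.cos(t1), math.sin(t1))
        total += g(p1) - g(p0)      # g_i(P_{i+1}) - g_i(P_i)
    return total

print(abs(integral_over_unit_circle() - 2*math.pi) < 1e-9)   # True
```

The total is 2π rather than 0, consistent with the fact that F has no global potential on R² minus the origin even though each arc sees a perfectly good local one.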
XVI, §5. THE HOMOTOPY FORM OF THE INTEGRABILITY THEOREM

Let γ, η be two paths in an open set U. After a reparametrization if necessary, we assume that they are defined over the same interval [a, b]. We shall say that γ is homotopic to η if there exists a continuous function

    ψ: [a, b] × [c, d] → U

defined on a rectangle [a, b] × [c, d], such that

    ψ(t, c) = γ(t)    and    ψ(t, d) = η(t)

for all t ∈ [a, b]. For each number s in the interval [c, d], we may view the function ψ_s such that

    ψ_s(t) = ψ(t, s)

as a continuous curve, defined on [a, b], and we may view the family of continuous curves ψ_s as a deformation of the path γ to the path η. The picture is drawn in Figure 20. The paths have been drawn with the same end points because that's what we are going to use in practice. Formally, we say that the homotopy ψ leaves the end points fixed if we have

    ψ(a, s) = γ(a)    and    ψ(b, s) = γ(b)
for all values of s in [c, d]. In the sequel it will always be understood that when we speak of a homotopy of paths having the same end points, the homotopy leaves the end points fixed, unless otherwise specified.

Figure 20
[XVI, §5]    THE HOMOTOPY FORM OF THE INTEGRABILITY THEOREM    445
If γ is homotopic to η (by a homotopy leaving the end points fixed), we denote this property by γ ≈ η (relative to end points). In line with our convention, we might omit the reference to the end points. Similarly, when we speak of a homotopy of closed paths, we assume always that each path ψ_s is a closed path. These additional requirements are now regarded as part of the definition of homotopy and will not be repeated each time.

Theorem 5.1. Let γ, η be paths in an open set U having the same beginning point and the same end point. Assume that they are homotopic in U relative to the end points. Let F be locally integrable on U. Then

    ∫_γ F = ∫_η F.
Theorem 5.2. Let γ, η be closed paths in U, and assume that they are homotopic in U. Let F be locally integrable on U. Then

    ∫_γ F = ∫_η F.

In particular, if γ is homotopic to a point in U, then

    ∫_γ F = 0.
If γ, η are closed paths in U and are homotopic, then they are homologous. We prove Theorem 5.2 in detail, and leave Theorem 5.1 to the reader; the proof is entirely similar, using Lemma 4.3 instead of Lemma 4.4. The idea is that the homotopy gives us a finite sequence of paths close to each other in the sense of these lemmas, so that the integral of F over each successive path is unchanged. The formal proof runs as follows. Let

    ψ: [a, b] × [c, d] → U

be the homotopy. The image of ψ is compact, and hence has distance > 0 from the complement of U. By uniform continuity we can therefore find partitions

    a = a_0 ≤ a_1 ≤ ⋯ ≤ a_n = b,
    c = c_0 ≤ c_1 ≤ ⋯ ≤ c_m = d,
of these intervals, such that if

    S_ij = small rectangle [a_i, a_{i+1}] × [c_j, c_{j+1}],

then the image ψ(S_ij) is contained in a disc D_ij which is itself contained in U and such that F has a potential g_ij on D_ij. Let ψ_j be the continuous curve defined by

    ψ_j(t) = ψ(t, c_j),    j = 0, …, m.
Then the continuous curves ψ_j, ψ_{j+1} are F-close together, and we can apply Lemma 4.4 to conclude that

    ∫_{ψ_j} F = ∫_{ψ_{j+1}} F.

Since ψ_0 = γ and ψ_m = η, we see that the theorem is proved.
Remark. It is usually not difficult, although sometimes tedious, to exhibit a homotopy between continuous curves. Most of the time, one can achieve this homotopy by simple formulas when the curves are given explicitly.

Example. Let P, Q be two points in R². The segment between P, Q, denoted by [P, Q], is the set of points

    P + t(Q − P),    0 ≤ t ≤ 1,

or equivalently,

    (1 − t)P + tQ,    0 ≤ t ≤ 1.
A set S in R² is called convex if, whenever P, Q ∈ S, the segment [P, Q] is also contained in S. We observe that a disc and a rectangle are convex.

Lemma 5.3. Let S be a convex set, and let γ, η be continuous closed curves in S. Then γ, η are homotopic in S.
Proof. We define

    ψ(t, s) = sγ(t) + (1 − s)η(t).

It is immediately verified that each curve ψ_s defined by ψ_s(t) = ψ(t, s) is a
closed curve, and ψ is continuous. Also

    ψ(t, 0) = η(t)    and    ψ(t, 1) = γ(t),

so the curves are homotopic. Note that the homotopy is given by a linear function, so if γ, η are smooth curves, that is C¹ curves, then each curve ψ_s is also of class C¹.

We say that an open set U is simply connected if it is connected and if every closed path in U is homotopic to a point. By Lemma 5.3 a convex open set is simply connected. Other examples of simply connected open sets will be given in the exercises. From Theorem 5.2, we conclude at once:

Theorem 5.4. Let F be a locally integrable vector field on a simply connected open set U. Then F has a potential function on U.
Proof. Theorem 5.2 shows that the third condition of Theorem 4.2 in Chapter XV is satisfied, and so the potential function may be defined by the integral of F from a fixed point P_0 to a variable point P in U, independently of the path in U from P_0 to P.

Thus we have derived one useful sufficient condition on an open set U for the global existence of a potential for F, namely simple connectedness.

Corollary 5.5. Let F be a locally integrable vector field on an open set U. Then F admits a potential function on every disc and every rectangle contained in U. If R is a closed rectangle contained in U, then

    ∫_{∂R} F = 0.

Proof. The first assertion comes from the fact that a disc or a rectangle is convex. As to the second, since a closed rectangle is compact, there exists an open rectangle W containing R and contained in U (take W with sides parallel to those of R, and only slightly bigger). Then W is simply connected, and we can apply Theorem 5.4 to conclude the proof.

Corollary 5.6. If two paths are close together in U, then they are F-close together for every locally integrable vector field F on U.
We can also give a proof of Lemma 2.2 based on a different principle. Indeed, the closed path γ being compact, it is contained in some disc D. It is therefore homotopic to a point Q in D. If P lies outside D (which is the case when P is at a sufficiently large distance from the curve), it follows
from Theorem 5.2 that

    W(γ, P) = 0,

because that integral is the same as the trivial integral over the constant curve with value Q.

Although in this chapter we are principally concerned with open sets in the plane, it is useful to develop a general formalism of homotopy for more general spaces. A metric space S is called pathwise connected if any two points in the space can be joined by a continuous curve in the space. Given such a space S, let P, Q ∈ S. Let Path(P, Q) be the set of all continuous curves α: [0, 1] → S such that α(0) = P and α(1) = Q. We define a homotopy between two such curves α, β relative to the end points (i.e. relative to P, Q) to be a continuous map
    h: [0, 1] × [0, 1] → S

such that h(t, 0) = α(t) and h(t, 1) = β(t). We denote the property that α is homotopic to β relative to the end points by α ≈ β. For homotopies, the interval of definition of a curve and the interval for the parameter are chosen to be [0, 1] for convenience. One may use other intervals also, and then use the fact that given two intervals, there is a polynomial of degree 1 which maps one onto the other. One can also use the following remark.
Lemma 5.8.
(a) Let [a, b] and [c, d] be two intervals, and let f: [c, d] → [a, b] be continuous such that f(c) = a and f(d) = b. Let α, β: [a, b] → S be continuous curves in a metric space S, from P to Q. If α ≈ β, then α∘f ≈ β∘f.
(b) Let α: [0, 1] → S be a continuous curve in a metric space S. Let f: [0, 1] → [0, 1] be a continuous function such that f(0) = 0 and f(1) = 1. Then α ≈ α∘f.
Proof. We leave the first assertion to the reader. As to the second, a homotopy leaving the end points fixed is given by

    h(t, u) = α((1 − u)t + uf(t)).
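A quick numerical check of this formula may be reassuring; the sample curve α and reparametrization f below are our own choices. At u = 0 the homotopy is α, at u = 1 it is α∘f, and the end points stay fixed for every u.

```python
# Sketch: the homotopy h(t, u) = alpha((1-u)*t + u*f(t)) from Lemma 5.8(b),
# checked at the boundary values of u and at the end points t = 0, 1.

def h(alpha, f, t, u):
    return alpha((1 - u)*t + u*f(t))

alpha = lambda t: (t, t*t)        # a sample curve in R^2
f = lambda t: t*t*(3 - 2*t)       # continuous, f(0) = 0, f(1) = 1

assert h(alpha, f, 0.25, 0.0) == alpha(0.25)       # u = 0 gives alpha
assert h(alpha, f, 0.25, 1.0) == alpha(f(0.25))    # u = 1 gives alpha∘f
assert h(alpha, f, 0.0, 0.5) == alpha(0.0)         # beginning point fixed
assert h(alpha, f, 1.0, 0.5) == alpha(1.0)         # end point fixed
print("Lemma 5.8(b) homotopy checks pass")
```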
Theorem 5.9. Let U be open in R² and let F be a locally integrable vector field on U. Let α: [a, b] → U be a continuous curve, and let f: [c, d] → [a, b] be continuous such that f(c) = a and f(d) = b. Then

    ∫_{α∘f} F = ∫_α F.

Proof. This is an immediate consequence of what we have already done. First, after a translation and a linear function, one reduces the proposition to the case when both intervals are [0, 1]. Then one applies Lemma 5.8 as well as Theorem 5.1, which tells us that the integrals of F over two homotopic paths from P to Q have the same value.

Note that in Theorem 5.9 the map f is a very general kind of reparametrization of the interval. We put no condition of any kind on f except continuity, and the value at the end points. We have now finished our discussion of the integrability theorem in the context of homotopies. The next, and final, section of this chapter continues with properties of homotopies.
XVI, §5. EXERCISES

1. Let A be a closed annulus bounded by two circles |X| = r_1 and |X| = r_2 with 0 < r_1 < r_2. Let F be a locally integrable vector field on an open set containing the annulus. Let γ_1 and γ_2 be the two circles, oriented counterclockwise. Show that

    ∫_{γ_1} F = ∫_{γ_2} F.

2. A set S is called star-shaped if there exists a point P_0 in S such that the line segment between P_0 and any point P in S is contained in S. Prove that a star-shaped set is simply connected, that is, every closed path is homotopic to a point.

3. Let U be the open set obtained from R² by deleting the set of real numbers ≥ 0. Prove that U is simply connected.

4. Let V be the open set obtained from R² by deleting the set of real numbers ≤ 0. Prove that V is simply connected.
XVI, §6. MORE ON HOMOTOPIES

In this section we deal with homotopies for their own sake, to complement §4, and give more criteria for paths to be homotopic in various ways. The section can be used for further study, but it will not be used in the rest of this book. Let T be the triangle with vertices (0, 0), (0, 1), (1, 0), as shown on the figure, and let h: T → S be a continuous map of the triangle into a metric space S.

Figure 21
Then the map

    γ_0: t ↦ h(t, 0),    0 ≤ t ≤ 1,

is a curve γ_0 in S. If we look at the restriction of h to the diagonal, from (0, 1) to (1, 0), then we may also view the image of this diagonal as a curve in S, but the domain of definition is not the interval [0, 1]. By a simple device, we can change the parametrization to a standardized one defined on a square. Indeed, let

    φ: [0, 1] × [0, 1] → T
be a continuous map of the unit square onto the triangle which keeps the left vertical side fixed, and also the bottom horizontal side fixed, and maps the top side of the square onto the diagonal, namely (t, 1) ↦ (t, 1 − t).

Figure 22
[XVI, §6]    MORE ON HOMOTOPIES    451
For instance, one could take φ(t, u) = (t, (1 − t)u). Then the composite h∘φ gives a homotopy of the image of the bottom curve with the image of the diagonal under h. The homotopy h∘φ is defined on the unit square. For each fixed u with 0 ≤ u ≤ 1, the map

    t ↦ h(φ(t, u))

is one of the curves in the homotopy. Since h∘φ(t, 1) = h(t, 1 − t), we see that the image of the top line of the square under h∘φ is precisely the image of the diagonal under h. In practice, up to a point, one gets away with not writing down the homotopy by a formula, but just drawing pictures which convince people that the formulas can be written down. The rest of this section consists of exercises, which we state as propositions, although we give occasional hints. Throughout we let S be a pathwise connected metric space.
Proposition 6.1. Let P, Q ∈ S. If α, β, γ ∈ Path(P, Q) and α ≈ β, β ≈ γ, then α ≈ γ. If α ≈ β, then β ≈ α.
Since α ≈ α, it follows that homotopy in Path(P, Q) is an equivalence relation. Given three points P, P′, P″ in S, let α ∈ Path(P, P′) and let β ∈ Path(P′, P″). Define α # β ∈ Path(P, P″) by the formula:

    (α # β)(t) = α(2t)       if 0 ≤ t ≤ 1/2,
                 β(2t − 1)   if 1/2 < t ≤ 1.
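The formula can be transcribed directly; the helper names below are our own. The path α is traversed at double speed on [0, 1/2], then β on [1/2, 1].

```python
# Sketch: the concatenation alpha # beta as defined above.

def concat(alpha, beta):
    def path(t):
        if t <= 0.5:
            return alpha(2*t)     # alpha at double speed
        return beta(2*t - 1)      # then beta at double speed
    return path

alpha = lambda t: (t, 0.0)     # a path from (0, 0) to (1, 0)
beta = lambda t: (1.0, t)      # a path from (1, 0) to (1, 1)
g = concat(alpha, beta)
print(g(0.0), g(0.5), g(1.0))  # (0.0, 0.0) (1.0, 0.0) (1.0, 1.0)
```

The half-open split at t = 1/2 in the text is immaterial here, since α(1) = β(0) at the junction.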
Proposition 6.2. If α ≈ α_1 and β ≈ β_1, then α # β ≈ α_1 # β_1.

[Hint: Let h(t, u) = h_1(2t, u) for 0 ≤ t ≤ 1/2; h(t, u) = h_2(2t − 1, u) for 1/2 ≤ t ≤ 1.]
Let Hot(P, Q) denote the set of homotopy equivalence classes of paths between P and Q as in Proposition 6.1. Then we can define a "product"

    Hot(P, P′) × Hot(P′, P″) → Hot(P, P″)

by

    (α, β) ↦ α # β.

Proposition 6.3.
(a) This product is associative, that is,

    (α # β) # γ ≈ α # (β # γ).
(b) The equivalence class of the constant path ε: [0, 1] → S such that ε(t) = P for all t is a left unit; that is,

    ε # α ≈ α    for all α ∈ Path(P, Q).

(c) Every class has an "inverse"; that is, if α⁻ denotes the path such that α⁻(t) = α(1 − t) between Q and P, then

    α # α⁻ ≈ ε.

Thus the "product" is associative, elements have right and left inverses, and there is a multiplicative unit on the right and on the left. We give the homotopies used in Proposition 6.3 for the record, but you should find them yourself (or others to do the same job) before you copy what follows.

(a)
Let λ = (α # β) # γ and ρ = α # (β # γ). Define

    f(t) = 2t          if 0 ≤ t ≤ 1/4,
           t + 1/4     if 1/4 ≤ t ≤ 1/2,
           (t + 1)/2   if 1/2 ≤ t ≤ 1.

Verify that λ(t) = ρ(f(t)), and apply Lemma 5.8(b).

(b) To show ε # α ≈ α, define

    h(t, u) = P                           if 0 ≤ t ≤ (1 − u)/2,
              α((2t + u − 1)/(1 + u))     if (1 − u)/2 ≤ t ≤ 1.

(c) To show α # α⁻ ≈ ε, define

    h(t, u) = α(2tu)        if 0 ≤ t ≤ 1/2,
              α(2u(1 − t))  if 1/2 ≤ t ≤ 1.
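Piecewise formulas like these are easy to get wrong, so a numerical spot check of (a) may be reassuring. The code is our own; tagging each curve lets equality of values pin down both the curve and the parameter.

```python
# Sketch: verify lambda(t) == rho(f(t)) for the reparametrization f of (a).

def concat(a, b):
    return lambda t: a(2*t) if t <= 0.5 else b(2*t - 1)

def f(t):
    if t <= 0.25:
        return 2*t
    if t <= 0.5:
        return t + 0.25
    return (t + 1) / 2

alpha = lambda t: ('alpha', t)
beta = lambda t: ('beta', t)
gamma = lambda t: ('gamma', t)

lam = concat(concat(alpha, beta), gamma)   # (alpha # beta) # gamma
rho = concat(alpha, concat(beta, gamma))   # alpha # (beta # gamma)

for t in [0.0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1.0]:
    assert lam(t) == rho(f(t)), t
print("lambda(t) == rho(f(t)) at all sample points")
```

The sample points are dyadic rationals, so every comparison is exact in floating point.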
Remark. In §5 we dealt with homotopies leaving the end points fixed, and also with homotopies of closed curves which do not leave end points fixed, and are sometimes called free homotopies. The question arises: If two closed curves in S are freely homotopic, are they homotopic by a homotopy leaving a point fixed? The answer is yes, and with Proposition 6.3 we are now able to prove it. We state this application formally as a theorem, which would otherwise not be immediately obvious.
Theorem 6.4. Let α, β ∈ Path(P, Q), and suppose the path α # β⁻ is homotopic to the point P, with a homotopy leaving P fixed. Then α and β are homotopic by a homotopy leaving the end points P, Q fixed.

Proof. One can see this from the associativity

    β ≈ (α # β⁻) # β ≈ α # (β⁻ # β) ≈ α,

using the fact that α # β⁻ represents the trivial homotopy class. One can also see this from the following figure. We let h be a homotopy shrinking the path α # β⁻ to the point P. We have drawn curves in the square, and the images of these curves under h constitute a continuous family leaving P, Q fixed, with the beginning curve being α and the end curve being β. Of course, it's a pain to write down the formulas. However, the discussion of this section indicates how to do this, by decomposing the square into triangles and composing homotopies.
Figure 23 Proposition 6.5. Let Yo E Path(P, P) be a closed curve in S. Suppose that Yo is homotopic in S to a closed curve y" by a homotopy which does not necessarily leave the point P fixed. Let h: [0, 1] x [0, 1] -+ S be the homotopy, defined on the square shown below, and let ce be the curve as shown, i.e. ce(u) = h(O, u) = h(l, u). Then
[the homotopy square for h: bottom edge γ₀, top edge γ₁, left and right edges α, with corners h(0, 0) = P = h(1, 0) and h(0, 1) = Q = h(1, 1)]

Figure 24
454
WINDING NUMBER AND GLOBAL POTENTIAL FUNCTIONS
[XVI, §6]
We can define a continuous family of curves σ_s, as shown on the next figure, where the end points of the top segment come closer and closer to the top corners of the square. Of course, there are many possible variations of this idea. We have drawn two of them.
[two squares showing the dashed paths whose images under h are the curves σ_s; edges labeled P, γ₀, γ₁]

The curve σ_s.    Possible curves σ_s.
Figure 25

A curve σ_s is really the image under h of the solid path shown on the figure. Thus σ₀ = γ₀ is deformed continuously to α # γ₁ # α⁻. The parameter s of the homotopy can range over any interval [0, b], say, whatever end point is convenient when actual formulas are written down. But if one insists on having the homotopy parametrized by the interval [0, 1], then one simply makes a final linear change of variables. However, mathematicians find the above pictures convincing, and usually do not require writing down the actual formulas in such a straightforward case.

Proposition 6.6. Let P ∈ S and let γ ∈ Path(P, P) be a closed curve in S. Suppose that γ is homotopic to a point Q in S, by a homotopy which does not necessarily leave the point P fixed. Then γ is also homotopic to P itself, by a homotopy which leaves P fixed. [Hint: Use Proposition 6.5 when γ₁ has the constant value Q. Then α # γ₁ # α⁻ simply consists of first going along α, and then retracing your steps backward. You can then use Proposition 6.3(c).]

Remark. In connection with this section, you can look up elementary discussions of homotopy in M. Greenberg and J. Harper, Algebraic Topology: A First Course, Benjamin-Cummings, 1992; and also W. Massey, A Basic Course in Algebraic Topology, Springer-Verlag, 1991.
CHAPTER
XVII
Derivatives in Vector Spaces
XVII, §1. THE SPACE OF CONTINUOUS LINEAR MAPS

Let E, F be normed vector spaces. Let λ: E → F be a linear map. The following two conditions on λ are equivalent:

(1) λ is continuous.
(2) There exists C > 0 such that for all v ∈ E we have

    |λ(v)| ≤ C|v|.

Indeed, if we assume (2), then we find for all x, y ∈ E:

    |λ(x) - λ(y)| = |λ(x - y)| ≤ C|x - y|,

so that λ is even uniformly continuous. Conversely, assume that λ is continuous at 0. Given 1, there exists δ > 0 such that if x ∈ E and |x| ≤ δ then |λ(x)| < 1. Let v be an element of E, v ≠ 0. Then |δv/|v|| ≤ δ, and hence

    |λ(δv/|v|)| < 1.

This implies that

    |λ(v)| < (1/δ)|v|,

and we can take C = 1/δ.
We observe that a linear map λ: R^n → F into a normed vector space is always continuous. In fact, if e_i is the i-th unit vector, and

    x = x_1 e_1 + ... + x_n e_n

is an element of R^n expressed in terms of its coordinates, then λ(x) = x_1 λ(e_1) + ... + x_n λ(e_n), whence

    |λ(x)| ≤ |x_1||λ(e_1)| + ... + |x_n||λ(e_n)| ≤ n max |x_i| max |λ(e_i)|.

If we let C = n max |λ(e_i)|, we see that λ is continuous, using say the sup norm on R^n. (Cf. also Exercise 1.)

A number C as in condition (2) above is called a bound for the linear map. It is related to the notion of bound for an arbitrary map on a set as follows. Note that if we view λ as a map on all of E, there cannot possibly be a number B such that |λ(x)| ≤ B for all x ∈ E, unless λ = 0. In fact, if v is a fixed vector in E, and t a positive number, then

    |λ(tv)| = |t||λ(v)|.

If λ(v) ≠ 0, taking t large shows that such a number B cannot exist. However, let us view λ as a map on the unit sphere of E. Then for all vectors v ∈ E such that |v| = 1 we find |λ(v)| ≤ C if C satisfies condition (2). Thus the bound we have defined for the linear map is a bound for that map in the old sense of the word, if we view the map as restricted to the unit sphere.

We denote the space of continuous linear maps from E into F by L(E, F). It is a vector space. We recall that if λ_1, λ_2 are continuous linear maps then λ_1 + λ_2 is defined by

    (λ_1 + λ_2)(x) = λ_1(x) + λ_2(x),

and if c ∈ R then

    (cλ)(x) = cλ(x).

We shall now use the norms on E and F to define a norm on L(E, F). Let λ: E → F be a continuous linear map. Define the norm of λ, denoted by |λ|, to be the greatest lower bound of all numbers C > 0 such that

    |λ(x)| ≤ C|x|

for all x ∈ E. The reader will verify at once that this norm
is equal to the least upper bound of all values |λ(v)| taken with v ∈ E and |v| = 1. (If v ≠ 0, consider λ(v)/|v|.) Because of this, we see that the norm of λ is nothing but the sup norm if we view λ as a map defined only on the unit sphere. Thus by restriction of λ to the unit sphere, we may view L(E, F) as a subspace of the space of all bounded maps B(S, F), where S is the unit sphere of E (centered at the origin, of course).
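For a matrix acting on R^2 with the euclidean norm, this characterization can be probed numerically: sampling |λ(v)| over unit vectors approaches |λ| from below, and the crude bound C = n max |λ(e_i)| from the computation above sits above it. A rough sketch, where the particular 2 × 2 matrix is an arbitrary example:

```python
import math

A = [[3.0, 1.0], [0.0, 2.0]]  # an arbitrary 2x2 example

def apply(A, v):
    # the linear map lambda_A: v -> Av
    return [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]

def norm(v):
    return math.sqrt(v[0] ** 2 + v[1] ** 2)

# estimate |A| = sup |Av| over unit vectors v by sampling the unit circle
samples = [(math.cos(2 * math.pi * k / 1000), math.sin(2 * math.pi * k / 1000))
           for k in range(1000)]
op_norm_est = max(norm(apply(A, v)) for v in samples)

# the crude bound C = n * max_i |A e_i| from the text is a valid bound
e1, e2 = (1.0, 0.0), (0.0, 1.0)
C = 2 * max(norm(apply(A, e1)), norm(apply(A, e2)))

assert all(norm(apply(A, v)) <= op_norm_est + 1e-9 for v in samples)
assert op_norm_est <= C  # the sup over the unit sphere lies below any bound C
```

The sampled sup underestimates |A| slightly, but the ordering "sup over the unit sphere ≤ any bound C" is exactly the greatest-lower-bound/least-upper-bound duality stated in the text.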
Theorem 1.1. The normed vector space L(E, F) is complete if F is complete.
Proof. Let {λ_n} be a Cauchy sequence of continuous linear maps from E into F. We shall first prove that for each v ∈ E the sequence {λ_n(v)} of elements of F is a Cauchy sequence in F. Given ε, there exists N such that for m, n ≥ N we have |λ_m - λ_n| < ε. This means that

    |λ_m(v) - λ_n(v)| ≤ |λ_m - λ_n||v| < ε|v|,

and (λ_m - λ_n)(v) = λ_m(v) - λ_n(v). This proves that {λ_n(v)} is Cauchy. Since F is complete, the sequence converges to an element of F, which we denote by λ(v). In other words, we define λ: E → F by the condition

    λ(v) = lim_{n→∞} λ_n(v).

If v, v' ∈ E, then

    λ(v + v') = lim λ_n(v + v') = lim (λ_n(v) + λ_n(v')) = lim λ_n(v) + lim λ_n(v') = λ(v) + λ(v').

If c is a number, then

    λ(cv) = lim λ_n(cv) = lim cλ_n(v) = c lim λ_n(v) = cλ(v).

Hence λ is linear. Furthermore, for each n we have

    |λ_n(v)| ≤ |λ_n||v|,
whence taking limits, and using the properties of limits of inequalities together with the fact that the norm is a continuous function, we find that

    |λ(v)| ≤ C|v|,  where C = lim |λ_n|.
Finally, the sequence {λ_n} converges to λ in the norm prescribed on L(E, F). Indeed, given ε, there exists N such that for m, n ≥ N and all v with |v| = 1 we have

    |λ_m(v) - λ_n(v)| < ε.

Since we have seen that λ_n(v) → λ(v) as n → ∞, we take n sufficiently large so that

    |λ_n(v) - λ(v)| < ε.

We then obtain for all m ≥ N the inequality

    |λ_m(v) - λ(v)| < 2ε.

This is true for every v with |v| = 1 and our theorem is proved.

To compute explicitly certain linear maps from R^n into R^m, one uses their representation by matrices. We recall this here briefly. We write a vector x in R^n as a column vector with components x_1, ..., x_n. If A = (a_ij) is an m × n matrix, we define the product Ax to be the column vector whose i-th component is

    a_i1 x_1 + ... + a_in x_n.
Let

    λ_A: R^n → R^m

be the map defined by λ_A(x) = Ax. Then it is immediately verified that λ_A is a linear map. Conversely, suppose given a linear map λ: R^n → R^m. We have the unit vectors e_i (i = 1, ..., n) of R^n, which we view as column vectors, and we can write

    x = x_1 e_1 + ... + x_n e_n

in terms of its coordinates x_1, ..., x_n. Let e'_1, ..., e'_m be the unit vectors of R^m. Then there exist numbers a_ij (i = 1, ..., m and j = 1, ..., n) such that

    λ(e_j) = a_1j e'_1 + ... + a_mj e'_m.

Hence

    λ(x_1 e_1 + ... + x_n e_n) = x_1 λ(e_1) + ... + x_n λ(e_n)
                              = (x_1 a_11 + ... + x_n a_1n) e'_1 + ... + (x_1 a_m1 + ... + x_n a_mn) e'_m.

The vector λ(x) is thus nothing but the multiplication of the matrix A = (a_ij) by the column vector x, that is we have λ = λ_A,

    λ(x) = λ_A(x) = Ax.

The space of linear maps L(R^n, R^m) is nothing else but the space of m × n matrices, addition being defined componentwise. In other words, if A = (a_ij), B = (b_ij) and c ∈ R, then

    A + B = (a_ij + b_ij)    and    cA = (c a_ij).

One has by an immediate verification:

    λ_{A+B} = λ_A + λ_B    and    λ_{cA} = c λ_A.
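The correspondence between matrices and linear maps, and the rule that matrix addition matches addition of the maps, can be spelled out in a few lines of code; the helper functions below are generic illustrations, not part of the text:

```python
def mat_vec(A, x):
    # the linear map lambda_A: x -> Ax
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def mat_add(A, B):
    # componentwise addition of m x n matrices
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))] for i in range(len(A))]

A = [[1, 2, 0], [3, -1, 4]]   # arbitrary 2 x 3 examples
B = [[0, 1, 1], [2, 2, -3]]
x = [1, -2, 3]

# lambda_{A+B}(x) = lambda_A(x) + lambda_B(x)
lhs = mat_vec(mat_add(A, B), x)
rhs = [a + b for a, b in zip(mat_vec(A, x), mat_vec(B, x))]
assert lhs == rhs
# the j-th column of A consists of the coordinates of lambda_A(e_j)
e2 = [0, 1, 0]
assert mat_vec(A, e2) == [A[0][1], A[1][1]]
```

The second assertion is exactly the formula λ(e_j) = a_1j e'_1 + ... + a_mj e'_m read off column by column.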
We hope that the reader has had an introduction to matrices and linear maps, and the brief summary which has preceded is mainly intended to remind the reader of the facts which we shall use.

Example 1. What are the linear maps of R into R? They are easily determined. Let λ: R → R be a linear map. Then for all x ∈ R we have

    λ(x) = λ(x·1) = xλ(1).

Let a = λ(1). Then

    λ(x) = ax.

Thus we can write λ = λ_a where λ_a: R → R is multiplication by the number a.
Example 2. Let A = (a_1, ..., a_n) be a row vector, and x a column vector, corresponding to the coordinates (x_1, ..., x_n). We still define A·x as a_1 x_1 + ... + a_n x_n. We have a linear map

    λ_A: R^n → R

such that

    λ_A(x) = A·x

for all x ∈ R^n. Our discussion concerning matrices shows that any linear map of R^n into R is equal to some λ_A, for some vector A.

Example 3. Let F be an arbitrary vector space. We can determine all linear maps of R into F easily. Indeed, let w be an element of F. The map

    x ↦ xw
for x ∈ R is obviously a linear map of R into F. We may denote it by λ_w, so that λ_w(x) = xw. Conversely, suppose that λ: R → F is a linear map. Then for all x ∈ R we have

    λ(x) = λ(x·1) = xλ(1).

Now λ(1) is a vector in F. Let w_0 = λ(1). We see that λ = λ_{w_0}. In this way we have described all linear maps of R into F by the elements of F itself. To each such element corresponds a linear map, and conversely; namely to the element w corresponds the linear map λ_w: R → F such that

    λ_w(x) = xw

for all x ∈ R.
Observe that a linear map into R^m can be viewed in terms of its coordinate functions.

Theorem 1.2. Let E be a normed vector space, and λ: E → R^m. Let λ = (λ_1, ..., λ_m) be its expression in terms of coordinate functions λ_i. Then λ is a continuous linear map if and only if each λ_i is continuous linear for i = 1, ..., m.

Proof. This is obvious from the definitions.
Remark. One need not restrict consideration to maps into R^m. More generally, if F_1, ..., F_m are normed vector spaces, we can consider maps λ: E → F_1 × ... × F_m into the product space consisting of all m-tuples of elements (x_1, ..., x_m) with x_i ∈ F_i. We take the sup norm on this space, and Theorem 1.2 applies as well.

Let us reconsider the case of R^n → R^m as a special case of Theorem 1.2. Let λ: R^n → R^m be a linear map, and λ = λ_A for some matrix A = (a_ij). Let (λ_1, ..., λ_m) be the coordinate functions of λ. By what we have seen concerning the product Ax of A and a column vector x, we now conclude that if A_1, ..., A_m are the row vectors of A, then

    λ_i(x) = A_i · x

is the ordinary dot product with A_i. Thus we may write

    λ(x) = (A_1·x, ..., A_m·x).
Finally, let E, F, G be normed vector spaces and let

    ω: E → F    and    λ: F → G

be continuous linear maps. Then the composite map λ∘ω is a linear map. Indeed, for v, v_1, v_2 ∈ E and c ∈ R we have

    λ(ω(v_1 + v_2)) = λ(ω(v_1) + ω(v_2)) = λ(ω(v_1)) + λ(ω(v_2))

and

    λ(ω(cv)) = λ(cω(v)) = cλ(ω(v)).

A composite of continuous maps is continuous, so λ∘ω is continuous. In terms of matrices, if E = R^n, F = R^m, and G = R^s, then we can represent ω and λ by matrices A and B respectively. The matrix A is m × n and the matrix B is s × m. Then λ∘ω is represented by BA. One verifies this directly from the definitions.
XVII, §1. EXERCISES

1. Let E be a vector space and let v_1, ..., v_n ∈ E. Assume that every element of E has a unique expression as a linear combination x_1 v_1 + ... + x_n v_n with x_i ∈ R. That is, given v ∈ E, there exist unique numbers x_i ∈ R such that

    v = x_1 v_1 + ... + x_n v_n.

Show that any linear map λ: E → F into a normed vector space is continuous.

2. Let Mat_{m,n} be the vector space of all m × n matrices with components in R. Show that Mat_{m,n} has elements e_ij (i = 1, ..., m and j = 1, ..., n) such that every element A of Mat_{m,n} can be written in the form

    A = Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij e_ij,

with the numbers a_ij uniquely determined by A.

3. Let E, F be normed vector spaces. Show that the association

    L(E, F) × E → F
given by

    (λ, y) ↦ λ(y)

is a product in the sense of Chapter VII, §1.

4. Let E, F, G be normed vector spaces. A map

    λ: E × F → G
is said to be bilinear if it satisfies the conditions

    λ(v, w_1 + w_2) = λ(v, w_1) + λ(v, w_2),
    λ(v_1 + v_2, w) = λ(v_1, w) + λ(v_2, w),
    λ(cv, w) = cλ(v, w) = λ(v, cw)

for all v, v_i ∈ E, w, w_i ∈ F, and c ∈ R.

(a) Show that a bilinear map λ is continuous if and only if there exists C > 0 such that for all (v, w) ∈ E × F we have

    |λ(v, w)| ≤ C|v||w|.

(b) Let v ∈ E be fixed. Show that if λ is continuous, then the map λ_v: F → G given by w ↦ λ(v, w) is a continuous linear map.
For the rest of this chapter, we let E, F, G be euclidean spaces, that is R^n or R^m. The reader will notice however that in the statements and proofs of theorems, vectors occur independently of coordinates, and that these proofs apply to the more general situation of complete normed vector spaces. We shall always accompany the theorems with an explicit determination of the statement involving the coordinates, which are useful for computations. The theory which is independent of the coordinates gives, however, a more faithful rendition of the geometric flavor of the objects involved.
XVII, §2. THE DERIVATIVE AS A LINEAR MAP

Let U be open in E, and let x ∈ U. Let f: U → F be a map. We shall say that f is differentiable at x if there exists a continuous linear map λ: E → F and a map ψ defined for all sufficiently small h in E, with values in F, such that

    lim_{h→0} ψ(h) = 0,

and such that

(*)    f(x + h) = f(x) + λ(h) + |h|ψ(h).

Setting h = 0 shows that we may assume that ψ is defined at 0 and that ψ(0) = 0. The preceding formula still holds. Equivalently, we could replace the term |h|ψ(h) by a term φ(h) where φ is a map such that

    lim_{h→0} φ(h)/|h| = 0.
The limit is taken of course for h ≠ 0, otherwise the quotient does not make sense. A mapping φ having the preceding limiting property is said to be o(h) for h → 0. (One reads this "little oh of h.") We view the definition of the derivative as stating that near x, the values of f can be approximated by a linear map λ, except for the additive term f(x), of course, with an error term described by the limiting properties of ψ or φ described above.

It is clear that if f is differentiable at x, then it is continuous at x. We contend that if the continuous linear map λ exists satisfying (*), then it is uniquely determined by f and x. To prove this, let λ_1, λ_2 be continuous linear maps having property (*). Let v ∈ E. Let t have real values > 0 and so small that x + tv lies in U. Let h = tv. We have

    f(x + h) - f(x) = λ_1(h) + |h|ψ_1(h)
                    = λ_2(h) + |h|ψ_2(h)

with

    lim_{h→0} ψ_j(h) = 0

for j = 1, 2. Subtracting the two expressions for f(x + tv) - f(x), we find

    λ_1(h) - λ_2(h) = |h|(ψ_2(h) - ψ_1(h)),

and setting h = tv, using the linearity of λ_1, λ_2,

    t(λ_1(v) - λ_2(v)) = t|v|(ψ_2(tv) - ψ_1(tv)).

We divide by t and find

    λ_1(v) - λ_2(v) = |v|(ψ_2(tv) - ψ_1(tv)).

Take the limit as t → 0. The limit of the right side is equal to 0. Hence λ_1(v) - λ_2(v) = 0 and λ_1(v) = λ_2(v). This is true for every v ∈ E, whence λ_1 = λ_2, as was to be shown.

In view of the uniqueness of the continuous linear map λ, we call it the derivative of f at x and denote it by f'(x) or Df(x). Thus f'(x) is a continuous linear map, and we can write

    f(x + h) - f(x) = f'(x)h + |h|ψ(h)

with

    lim_{h→0} ψ(h) = 0.

We have written f'(x)h instead of f'(x)(h) for simplicity, omitting a set of
parentheses. In general we shall often write

    λh

instead of λ(h) when λ is a linear map. If f is differentiable at every point x of U, then we say that f is differentiable on U. In that case, the derivative f' is a map

    f': U → L(E, F)

from U into the space of continuous linear maps L(E, F), and thus to each x ∈ U, we have associated the linear map f'(x) ∈ L(E, F).

We shall now see systematically how the definition of the derivative as a linear map actually includes the cases which we have studied previously. We have three cases.

Case 1. We consider a map f: J → R from an open interval J into R. This is the first case ever studied. Suppose f is differentiable at a number x ∈ J in the present sense, so that there is a linear map λ: R → R such that
    f(x + h) - f(x) = λ(h) + |h|ψ(h)

with lim_{h→0} ψ(h) = 0. We know that there is a number a such that λ(h) = ah for all h, that is λ = λ_a. Hence

    f(x + h) - f(x) = ah + |h|ψ(h).

We can divide by h because h is a number, and we find

    (f(x + h) - f(x))/h = a + (|h|/h)ψ(h).

But |h|/h = 1 or -1. The limit of (|h|/h)ψ(h) exists as h → 0 and is equal to 0. Hence we see that f is differentiable in the old sense, and that its derivative in the old sense is a. In this special case, the number a in the old definition corresponds to the linear map "multiplication by a" in the new definition. (For differentiable maps over closed intervals, cf. the exercises.)
Case 2. Let U be open in R^n and let f: U → R be a map, differentiable at a point x ∈ U. This is the case studied in Chapter XV, §1. There is a linear map λ: R^n → R such that

    f(x + h) - f(x) = λ(h) + |h|ψ(h)

with lim_{h→0} ψ(h) = 0. We know that λ corresponds to a vector A, that is λ = λ_A, where λ_A(h) = A·h. Thus

    f(x + h) - f(x) = A·h + |h|ψ(h).

This is precisely the notion of differentiability studied in Chapter XV, and we proved there that

    A = grad f(x) = (∂f/∂x_1, ..., ∂f/∂x_n).

In the present case, the old "derivative" A corresponds to the new derivative, the linear map "dot product with A."

Case 3. Let J be an interval in R, and let f: J → F be a map into any normed vector space. This case was studied in Chapter X, §5. Again suppose that f is differentiable at the number x ∈ J, so that

    f(x + h) - f(x) = λ(h) + |h|ψ(h)

for some linear map λ: R → F. We know that λ corresponds to a vector w ∈ F, that is λ = λ_w, where λ_w(h) = hw. Hence

    f(x + h) - f(x) = hw + |h|ψ(h).

In the present case, h is a number and we can divide by h, so that

    (f(x + h) - f(x))/h = w + (|h|/h)ψ(h).

The right-hand side has a limit as h → 0, namely w. Thus in the present case, the old derivative, which was the vector w, corresponds to the new derivative, the linear map λ_w, which is "multiplication by w on the right."

We have now identified our new derivative with all the old derivatives, and we shall go through the differential calculus for the fourth and last time, in the most general context. Let us consider mappings into R^m.
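The defining property f(x + h) = f(x) + f'(x)h + o(h) can also be observed numerically: after subtracting the linear term, the error shrinks faster than |h|. A sketch for a map R^2 → R^2, where the particular f and its hand-computed Jacobian are illustrative choices:

```python
import math

def f(x, y):
    # an illustrative differentiable map from R^2 to R^2
    return (x * x * y, math.sin(x) + y)

def df(x, y):
    # its Jacobian matrix, computed by hand
    return [[2 * x * y, x * x], [math.cos(x), 1.0]]

x0, y0 = 1.0, 2.0
J = df(x0, y0)

def error_ratio(h1, h2):
    # |f(p + h) - f(p) - J h| / |h|, which tends to 0 with |h|
    fx, fy = f(x0 + h1, y0 + h2)
    gx, gy = f(x0, y0)
    lx = J[0][0] * h1 + J[0][1] * h2
    ly = J[1][0] * h1 + J[1][1] * h2
    return math.hypot(fx - gx - lx, fy - gy - ly) / math.hypot(h1, h2)

assert error_ratio(1e-3, 1e-3) < 1e-2
assert error_ratio(1e-5, 1e-5) < 1e-4
```

Dividing the error by |h| rather than |h|^2 is the point: mere continuity would only make the numerator small, while differentiability makes the quotient small.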
Theorem 2.1. Let U be an open set of R^n, and let f: U → R^m be a map which is differentiable at x. Then the continuous linear map f'(x) is represented by the matrix

    J_f(x) = (∂f_i/∂x_j),    1 ≤ i ≤ m,  1 ≤ j ≤ n,

where f_i is the i-th coordinate function of f.

Proof. Essentially this comes from putting together Case 2 discussed above, and Theorem 1.2. We go through the proof once more from scratch. We have, using Case 2:
    f(x + h) - f(x) = (f_1(x + h) - f_1(x), ..., f_m(x + h) - f_m(x))
                    = (A_1·h + φ_1(h), ..., A_m·h + φ_m(h)),

where

    A_i = grad f_i(x) = (∂f_i/∂x_1, ..., ∂f_i/∂x_n)

and φ_i(h) = o(h). It is clear that the vector φ(h) = (φ_1(h), ..., φ_m(h)) is o(h), and hence, by definition of f'(x), we see that it is represented by the matrix of partial derivatives, as was to be shown.

The matrix

    J_f(x) = (∂f_i/∂x_j)

is called the Jacobian matrix of f at x. We see that if f is differentiable at every point of U, then x ↦ J_f(x) is a map from U into the space of matrices, which may be viewed as a space of dimension mn.
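Since the (i, j) entry of the Jacobian is the partial derivative D_j f_i, each entry can be approximated by a difference quotient. A generic numerical helper in this spirit (not part of the text's formal development; the sample map is arbitrary):

```python
def jacobian_fd(f, x, eps=1e-6):
    # approximate the Jacobian matrix (D_j f_i) by one-sided difference quotients
    fx = f(x)
    m, n = len(fx), len(x)
    J = []
    for i in range(m):
        row = []
        for j in range(n):
            xe = list(x)
            xe[j] += eps
            row.append((f(xe)[i] - fx[i]) / eps)
        J.append(row)
    return J

def f(v):
    x, y = v
    return [x * y, x + y * y]

J = jacobian_fd(f, [2.0, 3.0])
# analytic Jacobian at (2, 3): [[y, x], [1, 2y]] = [[3, 2], [1, 6]]
expected = [[3.0, 2.0], [1.0, 6.0]]
assert all(abs(J[i][j] - expected[i][j]) < 1e-4 for i in range(2) for j in range(2))
```

A helper like this is also a convenient way to check hand computations such as the exercises below.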
We defined f to be differentiable on U if f is differentiable at every point of U. We shall say that f is of class C^1 on U, or is a C^1 map, if f is differentiable on U and if in addition the derivative

    f': U → L(E, F)

is continuous. From the fact that a map into a product is continuous if and only if its coordinate maps are continuous, we conclude from Theorem 2.1:

Corollary. The map f: U → R^m is of class C^1 if and only if the partial derivatives ∂f_i/∂x_j exist and are continuous functions, or, put another way, if and only if the partial derivatives D_j f_i: U → R exist and are continuous.
XVII, §2. EXERCISES

1. Find explicitly the Jacobian matrix of the polar coordinate map

    x = r cos θ    and    y = r sin θ.

2. Find the Jacobian matrix of the map (u, v) = F(x, y) where

    u = e^x cos y,    v = e^x sin y.

Compute the determinants of these 2 × 2 matrices. The determinant of the matrix

    ( a  b )
    ( c  d )

is by definition ad - bc.

3. Let λ: R^n → R^m be a linear map. Show that λ is differentiable at every point, and that λ'(x) = λ for all x ∈ R^n.
XVII, §3. PROPERTIES OF THE DERIVATIVE

Sum. Let U be open in E. Let f, g: U → F be maps which are differentiable at x ∈ U. Then f + g is differentiable at x and

    (f + g)'(x) = f'(x) + g'(x).

If c is a number, then

    (cf)'(x) = cf'(x).
Proof. Let λ_1 = f'(x) and λ_2 = g'(x), so that

    f(x + h) - f(x) = λ_1 h + |h|ψ_1(h),
    g(x + h) - g(x) = λ_2 h + |h|ψ_2(h),

where lim_{h→0} ψ_i(h) = 0. Then

    (f + g)(x + h) - (f + g)(x) = f(x + h) + g(x + h) - f(x) - g(x)
                                = λ_1 h + λ_2 h + |h|(ψ_1(h) + ψ_2(h))
                                = (λ_1 + λ_2)(h) + |h|(ψ_1(h) + ψ_2(h)).

Since lim_{h→0} (ψ_1(h) + ψ_2(h)) = 0, it follows by definition that

    λ_1 + λ_2 = (f + g)'(x),

as was to be shown. The statement with the constant is equally clear.

Product. Let F_1 × F_2 → G be a product, as defined in Chapter VII, §1. Let U be open in E and let f: U → F_1 and g: U → F_2 be maps differentiable at x ∈ U. Then the product map fg is differentiable at x and

    (fg)'(x) = f'(x)g(x) + f(x)g'(x).
Before giving the proof, we make some comments on the meaning of the product formula. The linear map represented by the right-hand side is supposed to mean the map

    v ↦ (f'(x)v)g(x) + f(x)(g'(x)v).

Note that f'(x): E → F_1 is a linear map of E into F_1, and when applied to v ∈ E yields an element of F_1. Furthermore, g(x) lies in F_2, and so we can take the product

    (f'(x)v)g(x) ∈ G.

Similarly for f(x)(g'(x)v). In practice we omit the extra set of parentheses, and write simply f'(x)vg(x).
Proof. We have

    f(x + h)g(x + h) - f(x)g(x)
      = f(x + h)g(x + h) - f(x + h)g(x) + f(x + h)g(x) - f(x)g(x)
      = f(x + h)(g(x + h) - g(x)) + (f(x + h) - f(x))g(x)
      = f(x + h)(g'(x)h + |h|ψ_2(h)) + (f'(x)h + |h|ψ_1(h))g(x)
      = f(x + h)g'(x)h + |h|f(x + h)ψ_2(h) + f'(x)hg(x) + |h|ψ_1(h)g(x)
      = f(x)g'(x)h + f'(x)hg(x) + (f(x + h) - f(x))g'(x)h
        + |h|f(x + h)ψ_2(h) + |h|ψ_1(h)g(x).

The map

    h ↦ f(x)g'(x)h + f'(x)hg(x)

is the linear map of E into G which is supposed to be the desired derivative. It remains to be shown that each of the other three terms appearing on the right is of the desired type, namely o(h). This is immediate. For instance,

    |(f(x + h) - f(x))g'(x)h| ≤ |f(x + h) - f(x)||g'(x)||h|

and

    lim_{h→0} |f(x + h) - f(x)||g'(x)| = 0

because f is continuous, being differentiable. The others are equally obvious, and our property is proved.

Example. Let J be an open interval in R and let
    t ↦ A(t) = (a_ij(t))    and    t ↦ X(t)

be two differentiable maps from J into the space of m × n matrices, and into R^n respectively. Thus for each t, A(t) is an m × n matrix, and X(t) is a column vector of dimension n. We can form the product A(t)X(t), and thus the product map

    t ↦ A(t)X(t),

which is differentiable. Our rule in this special case asserts that

    (d/dt) A(t)X(t) = A'(t)X(t) + A(t)X'(t),
where differentiation with respect to t is taken componentwise both on the matrix A(t) and the vector X(t). Actually, this case is covered by the case treated in Chapter X, §5, since our maps go from an interval into vector spaces with a product between them. The product here is the product of a matrix times a vector. If m = 1, then we deal with the even more special case where we take the dot product between two vectors.

Chain rule. Let U be open in E and let V be open in F. Let f: U → V and g: V → G be maps. Let x ∈ U. Assume that f is differentiable at x and g is differentiable at f(x). Then g∘f is differentiable at x and

    (g∘f)'(x) = g'(f(x)) ∘ f'(x).

Before giving the proof, we make explicit the meaning of the usual formula. Note that f'(x): E → F is a linear map, and g'(f(x)): F → G is a linear map, and so these linear maps can be composed, and the composite is a linear map, which is continuous because both g'(f(x)) and f'(x) are continuous. The composed linear map goes from E into G, as it should.
Proof. Let k(h) = f(x + h) - f(x). Then

    g(f(x + h)) - g(f(x)) = g'(f(x))k(h) + |k(h)|ψ_1(k(h))

with lim_{k→0} ψ_1(k) = 0. But

    k(h) = f(x + h) - f(x) = f'(x)h + |h|ψ_2(h),

with lim_{h→0} ψ_2(h) = 0. Hence

    g(f(x + h)) - g(f(x)) = g'(f(x))f'(x)h + |h|g'(f(x))ψ_2(h) + |k(h)|ψ_1(k(h)).

The first term has the desired shape, and all we need to show is that each of the next two terms on the right is o(h). This is obvious. For instance, we have the estimate

    |k(h)| ≤ |f'(x)||h| + |h||ψ_2(h)|

and

    lim_{h→0} ψ_1(k(h)) = 0,

from which we see that |k(h)|ψ_1(k(h)) = o(h). We argue similarly for the other term.
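Numerically, the chain rule can be checked by comparing a difference quotient for g∘f with the product of the two hand-computed Jacobians; all functions below are illustrative choices:

```python
import math

def f(v):
    x, y = v
    return [x + y, x * y]          # f: R^2 -> R^2

def g(v):
    u, w = v
    return [math.exp(u), u - w]    # g: R^2 -> R^2

def Jf(v):
    x, y = v
    return [[1.0, 1.0], [y, x]]

def Jg(v):
    u, w = v
    return [[math.exp(u), 0.0], [1.0, -1.0]]

def mat_mul(B, A):
    return [[sum(B[i][k] * A[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

p = [0.5, -1.0]
J_chain = mat_mul(Jg(f(p)), Jf(p))   # J_g(f(p)) J_f(p)

# compare with difference quotients for g o f
eps = 1e-6
for j in range(2):
    pe = list(p)
    pe[j] += eps
    base, shifted = g(f(p)), g(f(pe))
    for i in range(2):
        dq = (shifted[i] - base[i]) / eps
        assert abs(dq - J_chain[i][j]) < 1e-4
```

Note the order of the factors: the derivative of the outer map, evaluated at f(p), multiplies the derivative of the inner map on the left, exactly as in the composite-of-matrices discussion of §1.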
The chain rule of course can be expressed in terms of matrices when the vector spaces are taken to be R^n, R^m, and R^s respectively. In that case, in terms of the Jacobian matrices we have

    J_{g∘f}(x) = J_g(f(x)) J_f(x),

the multiplication being that of matrices.

Maps with coordinates. Let U be open in E, let f: U → F_1 × ... × F_m, and let f = (f_1, ..., f_m) be its expression in terms of coordinate maps. Then f is differentiable at x if and only if each f_i is differentiable at x, and if this is the case, then

    f'(x) = (f'_1(x), ..., f'_m(x)).
Proof. This follows as usual by considering the coordinate expression

    f(x + h) - f(x) = (f_1(x + h) - f_1(x), ..., f_m(x + h) - f_m(x)).

Assume that each f'_i(x) exists, so that

    f_i(x + h) - f_i(x) = f'_i(x)h + |h|ψ_i(h),

where lim_{h→0} ψ_i(h) = 0. Then

    f(x + h) - f(x) = (f'_1(x)h, ..., f'_m(x)h) + (|h|ψ_1(h), ..., |h|ψ_m(h)),

and it is clear that this last term in F_1 × ... × F_m is o(h). (As always, we use the sup norm on F_1 × ... × F_m.) This proves that f'(x) is what we said it was. The converse is equally easy and is left to the reader.

Theorem 3.1. Let λ: E → F be a continuous linear map. Then λ is differentiable at every point of E and λ'(x) = λ for every x ∈ E.

Proof. This is obvious, because

    λ(x + h) - λ(x) = λ(h) + 0.
Note therefore that the derivative of λ is constant on E.

Corollary 3.2. Let f: U → F be a differentiable map, and let λ: F → G be a continuous linear map. Then

    (λ∘f)'(x) = λ∘f'(x).
For every v ∈ E we have

    (λ∘f)'(x)v = λ(f'(x)v).

Proof. This follows from Theorem 3.1 and the chain rule. Of course, one can also give a direct proof, considering

    λ(f(x + h)) - λ(f(x)) = λ(f(x + h) - f(x))
                          = λ(f'(x)h + |h|ψ(h))
                          = λ(f'(x)h) + |h|λ(ψ(h)),

and noting that lim_{h→0} λ(ψ(h)) = 0.
XVII, §3. EXERCISES

1. Let U be open in E. Assume that any two points of U can be connected by a continuous curve. Show that any two points can be connected by a piecewise differentiable curve.

2. Let f: U → F be a differentiable map such that f'(x) = 0 for all x ∈ U. Assume that any two points of U can be connected by a piecewise differentiable curve. Show that f is constant on U.
XVII, §4. MEAN VALUE THEOREM

The mean value theorem essentially relates the values of a map at two different points by means of the intermediate values of the map on the line segment between these two points. In vector spaces, we give an integral form for it.

We shall be integrating curves in the space of continuous linear maps L(E, F). This is a complete normed vector space, and we have known how to do this since Chapter X. We shall also deal with the association

    L(E, F) × E → F

given by

    (λ, y) ↦ λy
for λ ∈ L(E, F) and y ∈ E. Note that this is a product in the sense of Chapter VII, §1. In fact, the condition on the norm

    |λy| ≤ |λ||y|

is true by the very nature of the definition of the norm of a linear map.

Let α: J → L(E, F) be a continuous map from a closed interval J = [a, b] into L(E, F). For each t ∈ J, we see that α(t) ∈ L(E, F) is a linear map. We can apply it to an element y ∈ E to get α(t)y ∈ F. On the other hand, we can integrate the curve α, and

    ∫_a^b α(t) dt

is an element of L(E, F). If α is differentiable, then dα(t)/dt is identified with an element of L(E, F). If we deal with the case of matrices, then integration and differentiation are performed componentwise. Let us use the notation

    A: J → Mat_{m,n},

so that A(t) is an m × n matrix for each t ∈ J, A(t) = (a_ij(t)). Then

    ∫_a^b A(t) dt = ( ∫_a^b a_ij(t) dt )    and    dA(t)/dt = ( da_ij(t)/dt ).

In this case of course, the a_ij are functions.

Example. Let
    A(t) = ( cos t    t   )
           ( sin t    t^2 ).

Then

    A'(t) = dA(t)/dt = ( -sin t    1  )
                       ( cos t     2t ).
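Componentwise differentiation of this same matrix can be confirmed with difference quotients; the evaluation point and step size are arbitrary choices:

```python
import math

def A(t):
    # the matrix from the example above
    return [[math.cos(t), t], [math.sin(t), t * t]]

def A_prime(t):
    # its componentwise derivative
    return [[-math.sin(t), 1.0], [math.cos(t), 2.0 * t]]

t0, eps = 0.7, 1e-6
for i in range(2):
    for j in range(2):
        dq = (A(t0 + eps)[i][j] - A(t0)[i][j]) / eps
        assert abs(dq - A_prime(t0)[i][j]) < 1e-4
```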
Lemma 4.1. Let α: J → L(E, F) be a continuous map from a closed interval J = [a, b] into L(E, F). Let y ∈ E. Then

    ∫_a^b α(t)y dt = ( ∫_a^b α(t) dt ) · y,

where the dot on the right means the application of the linear map ∫_a^b α(t) dt to the vector y.

Proof. Here y is fixed, and the map

    λ ↦ λ(y) = λy

is a continuous linear map of L(E, F) into F. Hence our lemma is a special case of Exercise 2 of Chapter X, §6.

If readers visualize the lemma in terms of matrices, they will see that they can also derive a direct proof reducing it to coordinates. For instance, if A_1(t), ..., A_m(t) are the rows of A(t), and y is a fixed column vector, then

    A(t)y = ( A_1(t)·y, ..., A_m(t)·y ),

and the a_ij(t), y_j are numbers. One can then integrate componentwise and term by term in the expression on the right, taking the y_j in or out of the integrals. Similarly,
    d(A(t)y)/dt = ( A'_1(t)·y, ..., A'_m(t)·y )
                = ( a'_11(t)y_1 + ... + a'_1n(t)y_n, ..., a'_m1(t)y_1 + ... + a'_mn(t)y_n ),

where we differentiate componentwise.

Theorem 4.2. Let U be open in E and let x ∈ U. Let y ∈ E. Let f: U → F be a C^1 map. Assume that the line segment x + ty with 0 ≤ t ≤ 1 is contained in U. Then

    f(x + y) - f(x) = ∫_0^1 f'(x + ty)y dt = ∫_0^1 f'(x + ty) dt · y.
Proof. Let g(t) = f(x + ty). Then g'(t) = f'(x + ty)y. By the fundamental theorem of calculus (Theorem 6.2 of Chapter X) we find that

    g(1) - g(0) = ∫_0^1 g'(t) dt.

But g(1) = f(x + y) and g(0) = f(x). Our theorem is proved, taking into account the lemma which allows us to pull the y out of the integral.

Corollary 4.3. Let U be open in E and let x, z ∈ U be such that the line segment between x and z is contained in U (that is, the segment x + t(z - x) with 0 ≤ t ≤ 1). Let f: U → F be of class C^1. Then

    |f(z) - f(x)| ≤ |z - x| sup |f'(v)|,

the sup being taken for all v in the segment.

Proof. We estimate the integral, letting x + y = z. We find

    | ∫_0^1 f'(x + ty)y dt | ≤ (1 - 0) sup |f'(x + ty)| |y|,

using the standard estimate for the integral, that is Theorem 4.5 of Chapter X. Our corollary follows. (Note. The sup of the norms of the derivative exists because the segment is compact and the map t ↦ |f'(x + ty)| is continuous.)

Corollary 4.4. Let U be open in E and let x, z, x_0 ∈ U. Assume that the line segment between x and z lies in U. Then

    |f(z) - f(x) - f'(x_0)(z - x)| ≤ |z - x| sup |f'(v) - f'(x_0)|,

the sup being taken for all v on the segment between x and z.

Proof. We can either apply Corollary 4.3 to the map g such that g(x) = f(x) - f'(x_0)x, or argue directly with the integral:

    f(z) - f(x) = ∫_0^1 f'(x + t(z - x))(z - x) dt.

We write

    f'(x + t(z - x)) = f'(x + t(z - x)) - f'(x_0) + f'(x_0),
and find

    f(z) - f(x) = f'(x_0)(z - x) + ∫_0^1 [f'(x + t(z - x)) - f'(x_0)](z - x) dt.

We then estimate the integral on the right as usual.

We shall call Theorem 4.2 or either one of its two corollaries the Mean Value Theorem in vector spaces. In practice, the integral form of the remainder is always preferable and should be used as a conditioned reflex. One big advantage it has over the others is that the integral, as a function of y, is just as smooth as f', and this is important in some applications. In others, one only needs an intermediate value estimate, and then Corollary 4.3, or especially Corollary 4.4, may suffice.
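Corollary 4.3 can be checked numerically for a sample C^1 map: approximate sup |f'(v)| over the segment by sampling, using the Frobenius norm of the Jacobian, which dominates the operator norm |f'(v)|. All of the choices below (the map, the end points, the sampling) are illustrative:

```python
import math

def f(v):
    x, y = v
    return (math.sin(x * y), x - y * y)

def J(v):
    # hand-computed Jacobian of f
    x, y = v
    return [[y * math.cos(x * y), x * math.cos(x * y)], [1.0, -2.0 * y]]

def frob(M):
    # Frobenius norm, an upper bound for the operator norm of M
    return math.sqrt(sum(M[i][j] ** 2 for i in range(2) for j in range(2)))

p, q = (0.2, 0.1), (1.1, 0.9)
# sampled sup of |f'(v)| over the segment from p to q
seg_sup = max(frob(J((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))))
              for t in [k / 100 for k in range(101)])

fq, fp = f(q), f(p)
lhs = math.hypot(fq[0] - fp[0], fq[1] - fp[1])          # |f(q) - f(p)|
rhs = math.hypot(q[0] - p[0], q[1] - p[1]) * seg_sup    # |q - p| sup |f'(v)|
assert lhs <= rhs
```

Sampling only approximates the sup, but since the Frobenius norm already overestimates |f'(v)|, the inequality holds with a comfortable margin here.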
XVII, §4. EXERCISE

1. Let f: [0, 1] → R^n and g: [0, 1] → R have continuous derivatives. Suppose |f'(t)| ≤ g'(t) for all t. Prove that

    |f(1) - f(0)| ≤ |g(1) - g(0)|.

The following sections on higher derivatives will not be used in an essential way in what follows and may be omitted, especially in what concerns the next chapter. Readers may therefore skip from here immediately to the inverse mapping theorem as a natural continuation of the study of maps of class C^1. They should then take p = 1 in all statements of the next chapter. Reference will however be made to the theorem concerning partial derivatives in §7.
XVII, §5. THE SECOND DERIVATIVE

Let U be open in E and let f: U → F be differentiable. Then

    Df = f': U → L(E, F),

and we know that L(E, F) is again a complete normed vector space. Thus we are in a position to define the second derivative

    D^2 f = f^(2): U → L(E, L(E, F))

if it exists. This leads us to make some remarks on this iterated space of linear maps. Let v, w be elements of E, i.e. vectors, and let λ ∈ L(E, L(E, F)). Applying λ to v yields an element of L(E, F), that is λ(v) is a continuous linear map of E into F. We can therefore apply it to w and find an element of F, which we denote by

    λ(v)(w) = λ(v, w)    or also    λv·w,
using this last notation when too many parentheses are accumulating. By definition, fixing v, we see that the preceding expression is linear in the variable w. However, fixing w, we see that it is also linear in v, because if v_1, v_2 ∈ E then

    λ(v_1 + v_2)(w) = (λ(v_1) + λ(v_2))(w) = λ(v_1)(w) + λ(v_2)(w)
                    = λ(v_1, w) + λ(v_2, w).

Also trivially,

    λ(cv)(w) = cλ(v)(w) = cλ(v, w).

This now looks very much like a product as in Chapter VII, §1, and in fact it essentially is. Indeed, we have the first two conditions of a product E × E → F satisfied if we define the product between v and w to be λ(v, w). On the other hand,
(∗)    |λ(v)(w)| ≤ |λ(v)| |w| ≤ |λ| |v| |w|,

so the third condition is almost satisfied, except for the constant factor |λ|. Of course, constant factors do not matter when studying continuity and limits. Actually, we can also view the association

    (λ, v, w) ↦ λ(v)(w)

as a triple product, which is linear and continuous, satisfying in fact the inequality (∗). Cf. Exercise 1.

In general, a map

    f: E_1 × ··· × E_n → F

is said to be multilinear if each partial map

    v_i ↦ f(v_1, ..., v_i, ..., v_n)

is linear. This means:

    f(v_1, ..., v_i + v_i′, ..., v_n) = f(v_1, ..., v_i, ..., v_n) + f(v_1, ..., v_i′, ..., v_n),
    f(v_1, ..., cv_i, ..., v_n) = cf(v_1, ..., v_i, ..., v_n),

for v_i, v_i′ ∈ E_i and c ∈ R. In this section, we study the case n = 2, in which case the map is said to be bilinear.

Examples. The examples we gave previously for a product (as in Chapter VII, §1) are also examples of continuous bilinear maps. We leave
it to the reader to verify for bilinear maps (or multilinear maps) the condition analogous to that proved in §1 for linear maps. Cf. Exercise 1. Thus the dot product of vectors in R^n is continuous bilinear. The product of complex numbers is continuous bilinear, and so is the cross product in R^3. Other examples: The map

    L(E, F) × E → F    given by    (λ, v) ↦ λ(v)

that we just considered. Also, if E, F, G are three spaces, then

    L(E, F) × L(F, G) → L(E, G),

given by composition, is continuous bilinear. The proof is easy and is left as Exercise 4. Finally, if A = (a_ij) is a matrix of n^2 numbers a_ij (i = 1, ..., n; j = 1, ..., n), then A gives rise to a continuous bilinear map by the formula

    (X, Y) ↦ ᵗX A Y,

where X, Y are column vectors, and ᵗX = (x_1, ..., x_n) is the row vector called the transpose of X. We study these later in the section.

Theorem 5.1. Let ω: E_1 × E_2 → F be a continuous bilinear map. Then ω is differentiable, and for each (x_1, x_2) ∈ E_1 × E_2 and every h = (h_1, h_2) ∈ E_1 × E_2
we have

    Dω(x_1, x_2)(h_1, h_2) = ω(x_1, h_2) + ω(h_1, x_2),

so that Dω: E_1 × E_2 → L(E_1 × E_2, F) is linear. Hence D^2 ω is constant, and D^3 ω = 0.

Proof. We have by definition

    ω(x_1 + h_1, x_2 + h_2) − ω(x_1, x_2) = ω(x_1, h_2) + ω(h_1, x_2) + ω(h_1, h_2),

and the last term satisfies |ω(h_1, h_2)| ≤ |ω| |h_1| |h_2|, so that ω(h_1, h_2) = o(h). This proves the first assertion, and also the second, since each term on the right is linear in both (x_1, x_2) = x and h = (h_1, h_2). We know that the derivative of a linear map is constant, and the derivative of a constant map is 0, so the rest is obvious.

We consider especially a bilinear map

    λ: E × E → F

and say that λ is symmetric if we have

    λ(v, w) = λ(w, v)

for all v, w ∈ E.
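The formula of Theorem 5.1 can be checked numerically. The sketch below takes ω to be the dot product on R^3 (a continuous bilinear map); the base point and the increments are arbitrary sample values.

```python
# Check Dω(x1, x2)(h1, h2) = ω(x1, h2) + ω(h1, x2) for the dot product
# ω(v, w) = v · w on R^3.
def omega(v, w):                      # a continuous bilinear map
    return sum(a * b for a, b in zip(v, w))

x1, x2 = [1.0, 2.0, -1.0], [0.5, -3.0, 2.0]
h1, h2 = [1e-5, -2e-5, 3e-5], [-1e-5, 4e-5, 2e-5]

increment = omega([a + b for a, b in zip(x1, h1)],
                  [a + b for a, b in zip(x2, h2)]) - omega(x1, x2)
linear_part = omega(x1, h2) + omega(h1, x2)

# The difference is exactly ω(h1, h2), which is of second order in h.
assert abs(increment - linear_part) < 1e-8
```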
In general, a multilinear map

    λ: E × ··· × E → F

is said to be symmetric if

    λ(v_1, ..., v_n) = λ(v_{σ(1)}, ..., v_{σ(n)})

for any permutation σ of the indices 1, ..., n. In this section we look at the symmetric bilinear case in connection with the second derivative. We see that we may view a second derivative D^2 f(x) as a continuous bilinear map. Our next theorem will be that this map is symmetric. We need a lemma.

Lemma 5.2. Let λ: E × E → F be a bilinear map, and assume that there exists a map ψ defined for all sufficiently small pairs (v, w) ∈ E × E with values in F such that

    lim_{(v,w)→(0,0)} ψ(v, w) = 0,

and that

    |λ(v, w)| ≤ |ψ(v, w)| |v| |w|.

Then λ = 0.
Proof. This is like the argument which gave us the uniqueness of the derivative. Take v, w ∈ E arbitrary, and let s be a positive real number sufficiently small so that ψ(sv, sw) is defined. Then

    |λ(sv, sw)| ≤ |ψ(sv, sw)| |sv| |sw|,

whence

    s^2 |λ(v, w)| ≤ s^2 |ψ(sv, sw)| |v| |w|.

Divide by s^2 and let s → 0. We conclude that λ(v, w) = 0, as desired.
Theorem 5.3. Let U be open in E and let f: U → F be twice differentiable, and such that D^2 f is continuous. Then for each x ∈ U, the bilinear map D^2 f(x) is symmetric, that is

    D^2 f(x)(v, w) = D^2 f(x)(w, v)

for all v, w ∈ E.

Proof. Let x ∈ U and suppose that the open ball of radius r in E centered at x is contained in U. Let v, w ∈ E have lengths < r/2. We shall imitate the proof of Theorem 1.1, Chapter XV. Let

    g(x) = f(x + v) − f(x).

Then

    f(x + v + w) − f(x + w) − f(x + v) + f(x)
        = g(x + w) − g(x)
        = ∫_0^1 g′(x + tw)w dt
        = ∫_0^1 [Df(x + v + tw) − Df(x + tw)]w dt
        = ∫_0^1 ∫_0^1 D^2 f(x + sv + tw)v ds · w dt.

Let

    ψ(sv, tw) = D^2 f(x + sv + tw) − D^2 f(x).
Then

    g(x + w) − g(x) = ∫_0^1 ∫_0^1 D^2 f(x)(v, w) ds dt + ∫_0^1 ∫_0^1 ψ(sv, tw)v · w ds dt
                    = D^2 f(x)(v, w) + φ(v, w),

where φ(v, w) is the second integral on the right, and satisfies the estimate

    |φ(v, w)| ≤ sup_{s,t} |ψ(sv, tw)| |v| |w|.

The sup is taken for 0 ≤ s ≤ 1 and 0 ≤ t ≤ 1. If we had started with

    g_1(x) = f(x + w) − f(x)

and considered g_1(x + v) − g_1(x), we would have found another expression for the expression

    f(x + v + w) − f(x + w) − f(x + v) + f(x),

namely

    D^2 f(x)(w, v) + φ_1(v, w),

where

    |φ_1(v, w)| ≤ sup_{s,t} |ψ_1(sv, tw)| |v| |w|.

But then

    D^2 f(x)(w, v) − D^2 f(x)(v, w) = φ(v, w) − φ_1(v, w).

By the lemma, and the continuity of D^2 f which shows that sup |ψ(sv, tw)| and sup |ψ_1(sv, tw)| satisfy the limit condition of the lemma, we now conclude that

    D^2 f(x)(w, v) = D^2 f(x)(v, w),

as was to be shown.
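The second difference quotient used in the proof is visibly symmetric in v and w, which is the source of the symmetry of D^2 f(x). The sketch below checks the resulting equality of mixed partials for an arbitrary sample function f(x, y) = x^2 y + sin(xy).

```python
import math

# Finite-difference sketch of Theorem 5.3: the second difference quotient
#   f(x+h, y+h) - f(x+h, y) - f(x, y+h) + f(x, y)
# approximates the mixed partial, and equals its transpose by inspection.
def f(x, y):
    return x * x * y + math.sin(x * y)

x, y, h = 0.7, -0.4, 1e-4

d12 = (f(x + h, y + h) - f(x + h, y) - f(x, y + h) + f(x, y)) / (h * h)

# Exact mixed partial of x^2 y + sin(xy):
#   D2 D1 f = 2x + cos(xy) - xy sin(xy).
exact = 2 * x + math.cos(x * y) - x * y * math.sin(x * y)

assert abs(d12 - exact) < 1e-3
```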
We now give an interpretation of the second derivative in terms of matrices. Let λ: R^n × R^n → R be a bilinear map, and let e_1, ..., e_n be the unit vectors of R^n. If

    v = v_1 e_1 + ··· + v_n e_n    and    w = w_1 e_1 + ··· + w_n e_n

are vectors, with coordinates v_i, w_j ∈ R (so these are numbers as coordinates), then

    λ(v, w) = λ(v_1 e_1 + ··· + v_n e_n, w_1 e_1 + ··· + w_n e_n)
            = Σ_{i,j} v_i w_j λ(e_i, e_j),

the sum being taken for all values of i, j = 1, ..., n. Let a_ij = λ(e_i, e_j). Then a_ij is a number, and we let A = (a_ij) be the matrix formed with these numbers. Then we see that

    λ(v, w) = Σ_{i,j} a_ij v_i w_j.

Let us view v, w as column vectors, and denote by ᵗv (the transpose of v) the row vector ᵗv = (v_1, ..., v_n) arising from the column vector v. Then we see from the definition of multiplication of matrices that

    λ(v, w) = ᵗv A w,

which written out in full looks like

    (v_1, ..., v_n) [ a_11 ··· a_1n ] [ w_1 ]
                    [  ⋮         ⋮ ] [  ⋮  ] = Σ_{i,j} a_ij v_i w_j.
                    [ a_n1 ··· a_nn ] [ w_n ]

We say that the matrix A represents the bilinear map λ. It is obvious conversely that given an n × n matrix A, we can define a bilinear map by letting

    (v, w) ↦ ᵗv A w.
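The representation λ(v, w) = ᵗv A w can be checked in a short sketch; the matrix entries and vectors below are arbitrary sample values.

```python
# A bilinear map on R^2 represented by a matrix A = (a_ij):
#   λ(v, w) = Σ a_ij v_i w_j  =  ᵗv A w.
A = [[1.0, 2.0],
     [-3.0, 0.5]]

def lam(v, w):
    return sum(A[i][j] * v[i] * w[j] for i in range(2) for j in range(2))

v, w = [2.0, -1.0], [0.5, 4.0]

# Evaluate ᵗv A w as the row vector ᵗv times the column vector A w.
Aw = [sum(A[i][j] * w[j] for j in range(2)) for i in range(2)]
tvAw = sum(v[i] * Aw[i] for i in range(2))

assert abs(lam(v, w) - tvAw) < 1e-12
```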
Let

    λ_ij: R^n × R^n → R

be the map such that

    λ_ij(v, w) = v_i w_j.

Then we see that the arbitrary bilinear map λ can be written uniquely in the form

    λ = Σ_{i,j} a_ij λ_ij.

In the terminology of linear algebra, this means that the bilinear maps {λ_ij} (i = 1, ..., n and j = 1, ..., n) form a basis for L^2(R^n, R). We also sometimes write λ_ij = λ_i ⊗ λ_j, where λ_i is the coordinate function of R^n given by λ_i(v_1, ..., v_n) = v_i.

Now let U be open in R^n and let g: U → L^2(R^n, R) be a map which to each x ∈ U associates a bilinear map

    g(x): R^n × R^n → R.

We can write g(x) uniquely as a linear combination of the λ_ij. That is, there are functions g_ij of x such that

    g(x) = Σ_{i,j} g_ij(x) λ_ij.

Thus the matrix which represents g(x) is the matrix (g_ij(x)), whose coordinates depend on x.
Theorem 5.4. Let U be open in R^n and let f: U → R be a function. Then f is of class C^2 if and only if all the partial derivatives of f of order ≤ 2 exist and are continuous. If this is the case, then D^2 f(x) is represented by the matrix (D_i D_j f(x)).

Proof. The first derivative Df(x) is represented by the vector (1 × n matrix)

    grad f(x) = (D_1 f(x), ..., D_n f(x)),

namely

    Df(x)v = D_1 f(x)v_1 + ··· + D_n f(x)v_n

if v = (v_1, ..., v_n) is given in terms of its coordinates v_i ∈ R. Thus we can write

    Df(x) = D_1 f(x)λ_1 + ··· + D_n f(x)λ_n,

where λ_i is the i-th coordinate function of R^n, that is λ_i(v) = v_i. Thus we can view Df as a map of U into an n-dimensional vector space. In the case of such a map, we know that it is of class C^1 if and only if the partial derivatives of its coordinate functions exist and are continuous. In the present case, the coordinate functions of Df are D_1 f, ..., D_n f. This proves our first assertion.

As to the statement concerning the representation of D^2 f(x) by the matrix of double partial derivatives, let w ∈ R^n and write w in terms of its coordinates (w_1, ..., w_n), w_i ∈ R. It is as easy as anything to go back to the definitions. We have

    Df(x + h) − Df(x) = D^2 f(x)h + φ(h),

where φ(h) = o(h). Hence

    D^2 f(x)h · w + φ(h)w = Df(x + h)w − Df(x)w
        = Σ_{i=1}^n (D_i f(x + h) − D_i f(x)) w_i
        = Σ_{i=1}^n ( Σ_{j=1}^n D_j D_i f(x) h_j + φ_i(h) ) w_i,

where as usual φ_i(h) = o(h) for each i = 1, ..., n. Fixing w and letting h → 0, we see that for each w the effect of the second derivative D^2 f(x)h · w on h is given by the desired matrix. In other words, for any v, w ∈ R^n we have

    D^2 f(x)v · w = Σ_{i,j} D_j D_i f(x) v_j w_i,

thereby proving the desired formula.
Note. Instead of going back to the definitions, one could also write

    D^2 f(x)(v, w) = D(D_1 f)(x)v w_1 + ··· + D(D_n f)(x)v w_n,

    D(D_j f)(x)v = Σ_{i=1}^n D_i D_j f(x) v_i,

and substitute in the preceding expression to obtain what we want.

The matrix representing D^2 f(x) is called the Hessian of f at x, following the same notation as for the Jacobian. The symmetry condition that D^2 f(x)(v, w) = D^2 f(x)(w, v) is reflected in the matrix representation by the fact that

    D_i D_j f(x) = D_j D_i f(x)

for the partial derivatives D_i, D_j. So everything fits together. We can also use the same notation as that of Chapter XV, §5, namely

    D^2 f(x)(v, w) = (v · ∇)(w · ∇)f(x),

where

    v · ∇ = v_1 D_1 + ··· + v_n D_n

are differential operators. This is simply a notational reformulation of the theorem.

The reader should note: One is torn between trying to avoid the abstraction of the bilinear maps without coordinates, which follow a simple but abstract formalism, and the annoyance of the coordinates, which make formulas look messy. We have described above the notations which emphasize various aspects of the theory, and which may be used alternatively according to the taste of the user or the requirements of the problems at hand. For bilinear maps, things still look reasonably simple, but indices become much worse in the multilinear case.
XVII, §5. EXERCISES

1. Let E_1, ..., E_n, F be normed vector spaces and let

    λ: E_1 × ··· × E_n → F

be a multilinear map. Show that λ is continuous if and only if there exists a number C > 0 such that for all v_i ∈ E_i we have

    |λ(v_1, ..., v_n)| ≤ C|v_1| |v_2| ··· |v_n|.

2. Denote the space of continuous multilinear maps as above by L(E_1, ..., E_n; F). If λ is in this space, define |λ| to be the greatest lower bound of all numbers C > 0 such that

    |λ(v_1, ..., v_n)| ≤ C|v_1| ··· |v_n|

for all v_i ∈ E_i. Show that this defines a norm.

3. Consider the case of bilinear maps. We denote by L^2(E, F) the space of continuous bilinear maps of E × E → F. If λ ∈ L(E, L(E, F)), denote by f_λ the bilinear map such that f_λ(v, w) = λ(v)(w). Show that |λ| = |f_λ|.

4. Let E, F, G be normed vector spaces. Show that the composition of mappings

    L(E, F) × L(F, G) → L(E, G)

given by (λ, ω) ↦ ω ∘ λ is continuous and bilinear. Show that the constant C of Exercise 1 is equal to 1.

5. Let f be a function of class C^2 on some open ball U in R^n centered at A. Show that

    f(X) = f(A) + Df(A) · (X − A) + g(X)(X − A, X − A),

where g: U → L^2(R^n, R) is a continuous map of U into the space of bilinear maps of R^n into R. Show that one can select g(X) to be symmetric for each X ∈ U.
XVII, §6. HIGHER DERIVATIVES AND TAYLOR'S FORMULA

We may now consider higher derivatives. We define

    D^p f(x) = D(D^{p−1} f)(x).

Thus D^p f(x) is an element of L(E, L(E, ..., L(E, F)...)), which we denote by L^p(E, F). We say that f is of class C^p on U, or is a C^p map, if D^k f(x) exists for each x ∈ U, and if

    D^k f: U → L^k(E, F)

is continuous for each k = 0, ..., p.
We have trivially D^q D^r f(x) = D^p f(x) if q + r = p and if D^p f(x) exists. Also the p-th derivative D^p is linear in the sense that

    D^p(f + g) = D^p f + D^p g    and    D^p(cf) = cD^p f.

If λ ∈ L^p(E, F) we write

    λ(v_1, ..., v_p) = λ(v_1)(v_2)···(v_p).

If q + r = p, we can evaluate λ(v_1, ..., v_p) in two steps, namely

    λ(v_1, ..., v_p) = λ(v_1, ..., v_q)(v_{q+1}, ..., v_p).

We regard λ(v_1, ..., v_q) as the element of L^{p−q}(E, F) given by

    (v_{q+1}, ..., v_p) ↦ λ(v_1, ..., v_q)(v_{q+1}, ..., v_p).

Lemma 6.1. Let v_2, ..., v_p be fixed elements of E. Assume that f is p times differentiable on U. Let

    g(x) = D^{p−1} f(x)(v_2, ..., v_p).

Then g is differentiable on U and

    Dg(x)(v) = D^p f(x)(v, v_2, ..., v_p).

Proof. The map g: U → F is a composite of the maps

    D^{p−1} f: U → L^{p−1}(E, F)    and    λ: L^{p−1}(E, F) → F,

where λ is given by the evaluation at (v_2, ..., v_p). Thus λ is continuous and linear. It is an old theorem that

    D(λ ∘ D^{p−1} f) = λ ∘ D(D^{p−1} f),

namely the corollary of Theorem 3.1. Thus

    Dg(x)v = (D^p f(x)v)(v_2, ..., v_p),

which is precisely what we wanted to prove.

Theorem 6.2. Let f be of class C^p on U. Then for each x ∈ U the map D^p f(x) is multilinear symmetric.
Proof. By induction on p ≥ 2. For p = 2 this is Theorem 5.3. In particular, if we let g = D^{p−2} f we know that for v_1, v_2 ∈ E,

    D^2 g(x)(v_1, v_2) = D^2 g(x)(v_2, v_1),

and since D^p f = D^2 D^{p−2} f we conclude that

(∗)    D^p f(x)(v_1, ..., v_p) = (D^2 D^{p−2} f(x))(v_1, v_2) · (v_3, ..., v_p)
                              = (D^2 D^{p−2} f(x))(v_2, v_1) · (v_3, ..., v_p)
                              = D^p f(x)(v_2, v_1, v_3, ..., v_p).

Let σ be a permutation of (2, ..., p). By induction,

    D^{p−1} f(x)(v_2, ..., v_p) = D^{p−1} f(x)(v_{σ(2)}, ..., v_{σ(p)}).

By the lemma, we conclude that

(∗∗)    D^p f(x)(v_1, v_2, ..., v_p) = D^p f(x)(v_1, v_{σ(2)}, ..., v_{σ(p)}).

From (∗) and (∗∗) we conclude that D^p f(x) is symmetric, because any permutation of (1, ..., p) can be expressed as a composition of the permutations considered in (∗) or (∗∗). This proves the theorem.

For the higher derivatives, we have similar statements to those obtained with the first derivative in relation to linear maps. Observe that if ω ∈ L^p(E, F) is a multilinear map, and λ ∈ L(F, G) is linear, we may compose these,

    E × ··· × E → F → G,

to get λ ∘ ω, which is a multilinear map of E × ··· × E → G. Furthermore, ω and λ being continuous, it is clear that λ ∘ ω is also continuous. Finally, the map

    λ_∗: L^p(E, F) → L^p(E, G),

given by "composition with λ", namely ω ↦ λ ∘ ω, is immediately verified to be a continuous linear map; that is, for ω_1, ω_2 ∈ L^p(E, F) and c ∈ R we have

    λ ∘ (ω_1 + ω_2) = λ ∘ ω_1 + λ ∘ ω_2    and    λ ∘ (cω_1) = c(λ ∘ ω_1),
and for the continuity,

    |λ ∘ ω| ≤ |λ| |ω|.

Theorem 6.3. Let f: U → F be p-times differentiable and let λ: F → G be a continuous linear map. Then for every x ∈ U we have

    D^p(λ ∘ f)(x) = λ ∘ D^p f(x).

Proof. Consider the map x ↦ D^{p−1}(λ ∘ f)(x). By induction,

    D^{p−1}(λ ∘ f)(x) = λ ∘ D^{p−1} f(x).

By the corollary of Theorem 3.1 concerning the derivative D(λ ∘ D^{p−1} f), namely the derivative of the composite map, we get the assertion of our theorem.

If one wishes to omit the x from the notation in Theorem 6.3, then one must write

    D^p(λ ∘ f) = λ_∗ ∘ D^p f.

Occasionally, one omits the lower ∗ and writes simply D^p(λ ∘ f) = λ ∘ D^p f.
Taylor's formula. Let U be open in E and let f: U → F be of class C^p. Let x ∈ U and let y ∈ E be such that the segment x + ty, 0 ≤ t ≤ 1, is contained in U. Denote by y^{(k)} the k-tuple (y, y, ..., y). Then

    f(x + y) = f(x) + Df(x)y/1! + ··· + D^{p−1} f(x)y^{(p−1)}/(p − 1)! + R_p,

where

    R_p = ∫_0^1 ((1 − t)^{p−1}/(p − 1)!) D^p f(x + ty)y^{(p)} dt.
Proof. We can give a proof by integration by parts as usual, starting with the mean value theorem,

    f(x + y) = f(x) + ∫_0^1 Df(x + ty)y dt.

We consider the map t ↦ Df(x + ty)y of the interval into F, and the usual product

    R × F → F,

which consists in multiplying vectors of F by numbers. We let

    u = Df(x + ty)y    and    dv = dt,  v = −(1 − t).

This gives the next term, and then we proceed by induction, letting

    u = D^p f(x + ty)y^{(p)}    and    dv = ((1 − t)^{p−1}/(p − 1)!) dt

at the p-th stage. Integration by parts yields the next term of Taylor's formula, plus the next remainder term.

The remainder term R_p can also be written in the form

    R_p = ∫_0^1 ((1 − t)^{p−1}/(p − 1)!) D^p f(x + ty) dt · y^{(p)}.

The mapping

    y ↦ ∫_0^1 ((1 − t)^{p−1}/(p − 1)!) D^p f(x + ty) dt
is continuous. If f is infinitely differentiable, then this mapping is infinitely differentiable, since we shall see later that one can differentiate under the integral sign as in the case studied in Chapter X.

Estimate of the remainder. Notation as in Taylor's formula, we can also write

    f(x + y) = f(x) + Df(x)y/1! + ··· + D^p f(x)y^{(p)}/p! + θ(y),

where

    |θ(y)| ≤ sup_{0≤t≤1} |D^p f(x + ty) − D^p f(x)| |y|^p / p!

and

    lim_{y→0} θ(y)/|y|^p = 0.

Proof. We write

    D^p f(x + ty) − D^p f(x) = ψ(ty).

Since D^p f is continuous, it is bounded in some ball containing x, and

    lim_{y→0} ψ(ty) = 0

uniformly in t. On the other hand, the remainder R_p given above can be written as

    ∫_0^1 ((1 − t)^{p−1}/(p − 1)!) D^p f(x)y^{(p)} dt + ∫_0^1 ((1 − t)^{p−1}/(p − 1)!) ψ(ty)y^{(p)} dt.

We integrate the first integral to obtain the desired p-th term, and estimate the second integral by

    sup_{0≤t≤1} |ψ(ty)| |y|^p ∫_0^1 ((1 − t)^{p−1}/(p − 1)!) dt,

where we can again perform the integration to get the estimate for the error term θ(y).
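Taylor's formula with the integral remainder can be verified numerically. The sketch below takes f = exp (so all derivatives are exp), p = 3, and arbitrary sample values of x and y; the remainder integral is evaluated by a midpoint Riemann sum.

```python
import math

# Taylor's formula with integral remainder for f = exp, p = 3:
#   f(x+y) = f(x) + f'(x) y + f''(x) y^2/2! + R_3,
#   R_3 = ∫_0^1 ((1-t)^2 / 2!) f'''(x + t y) y^3 dt.
x, y, p = 0.2, 0.5, 3

taylor_part = sum(math.exp(x) * y ** k / math.factorial(k) for k in range(p))

n = 100_000
R_p = sum((1 - (i + 0.5) / n) ** (p - 1) / math.factorial(p - 1)
          * math.exp(x + (i + 0.5) / n * y) * y ** p
          for i in range(n)) / n

assert abs(taylor_part + R_p - math.exp(x + y)) < 1e-8
```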
Theorem 6.4. Let U be open in E and let f: U → F_1 × ··· × F_m be a map with coordinate maps f = (f_1, ..., f_m). Then f is of class C^p if and only if each f_i is of class C^p, and if that is the case, then

    D^p f = (D^p f_1, ..., D^p f_m).

Proof. We proved this for p = 1 in §3, and the general case follows by induction.

Theorem 6.5. Let U be open in E and V open in F. Let f: U → V and g: V → G be C^p maps. Then g ∘ f is of class C^p.
Proof. We have

    D(g ∘ f)(x) = Dg(f(x)) ∘ Df(x).

Thus D(g ∘ f) is obtained by composing a number of maps, namely

    x ↦ (Dg(f(x)), Df(x)) ↦ Dg(f(x)) ∘ Df(x),

that is the map

    U → L(F, G) × L(E, F) → L(E, G),

where the last arrow is composition of linear maps. If p = 1, then all mappings occurring on the right are continuous, and so D(g ∘ f) is continuous. By induction, Dg and Df are of class C^{p−1}, and all the maps used to obtain D(g ∘ f) are of class C^{p−1} (the last one on the right is a composition of linear maps, and is continuous bilinear, so infinitely differentiable by Theorem 5.1). Hence D(g ∘ f) is of class C^{p−1}, whence g ∘ f is of class C^p, as was to be shown.

We shall now give explicit formulas for the higher derivatives in terms of coordinates when these are available. We consider multilinear maps
    λ: R^n × ··· × R^n → R

(taking the product of R^n with itself p times). If

    v_i = v_{i1} e_1 + ··· + v_{in} e_n,

where the v_{ij} ∈ R are the coordinates of v_i, then

    λ(v_1, ..., v_p) = Σ v_{1 j_1} ··· v_{p j_p} λ(e_{j_1}, ..., e_{j_p}),

the sum being taken over all p-tuples of integers j_1, ..., j_p between 1 and n. If we let

    λ_{j_1 ··· j_p}: R^n × ··· × R^n → R

be the map such that

    λ_{j_1 ··· j_p}(v_1, ..., v_p) = v_{1 j_1} ··· v_{p j_p},

then we see that λ_{j_1 ··· j_p} is multilinear; and if we let

    a_{j_1 ··· j_p} = λ(e_{j_1}, ..., e_{j_p}),

then we can express λ as a unique linear combination

    λ = Σ_{(j)} a_{(j)} λ_{(j)},

where we use the abbreviated symbols (j) = (j_1, ..., j_p). Thus the multilinear maps λ_{(j)} form a basis of L^p(R^n, R).

If g: U → L^p(R^n, R) is a map, then for each x ∈ U we can write

    g(x) = Σ_{(j)} g_{(j)}(x) λ_{(j)},

where the g_{(j)} are the coordinate functions of g. This applies in particular when g = D^p f for some p-times differentiable function f. In that case, induction and the same procedure given in the bilinear case yield:

Theorem 6.6. Let U be open in R^n and let f: U → R be a function. Then f is of class C^p if and only if all the partial derivatives of f of order ≤ p exist and are continuous. If this is the case, then

    D^p f(x) = Σ_{(j)} D_{j_1} ··· D_{j_p} f(x) λ_{j_1 ··· j_p},

and for any vectors v_1, ..., v_p ∈ R^n we have

    D^p f(x)(v_1, ..., v_p) = Σ_{(j)} D_{j_1} ··· D_{j_p} f(x) v_{1 j_1} ··· v_{p j_p}.

Observe that there is no standard terminology generalizing the notion of matrix to an indexed set

    {a_{j_1 ··· j_p}}

(which could be called a multimatrix) representing the multilinear map. The multimatrix

    {D_{j_1} ··· D_{j_p} f(x)}

represents the p-th derivative D^p f(x). In the notation of Chapter XV, §5, we can also write

    D^p f(x)(v_1, ..., v_p) = (v_1 · ∇) ··· (v_p · ∇)f(x),

where

    v_i · ∇ = v_{i1} D_1 + ··· + v_{in} D_n

is a partial differential operator with constant coefficients v_{i1}, ..., v_{in}, which
XVII, §6. EXERCISE

1. Let U be open in E and V open in F. Let

    f: U → V    and    g: V → G

be of class C^p. Let x_0 ∈ U. Assume that D^k f(x_0) = 0 for all k = 0, ..., p. Show that D^k(g ∘ f)(x_0) = 0 for 0 ≤ k ≤ p. [Hint: Induction.] Also prove that if D^k g(f(x_0)) = 0 for 0 ≤ k ≤ p, then D^k(g ∘ f)(x_0) = 0 for 0 ≤ k ≤ p.
XVII, §7. PARTIAL DERIVATIVES

Consider a product E = E_1 × ··· × E_n of complete normed vector spaces. Let U_i be open in E_i and let

    f: U_1 × ··· × U_n → F

be a map. We write an element x ∈ U_1 × ··· × U_n in terms of its "coordinates," namely x = (x_1, ..., x_n) with x_i ∈ U_i. We can form partial derivatives just as in the simple case when E = R^n. Indeed, for x_1, ..., x_{i−1}, x_{i+1}, ..., x_n fixed, we consider the partial map

    x_i ↦ f(x_1, ..., x_i, ..., x_n)

of U_i into F. If this map is differentiable, we call its derivative the partial derivative of f and denote it by D_i f(x) at the point x. Thus, if it exists,

    D_i f(x): E_i → F

is the unique continuous linear map λ ∈ L(E_i, F) such that

    f(x_1, ..., x_i + h, ..., x_n) − f(x_1, ..., x_n) = λ(h) + o(h),

for h ∈ E_i and small enough that the left-hand side is defined.

Theorem 7.1. Let U_i be open in E_i (i = 1, ..., n) and let

    f: U_1 × ··· × U_n → F

be a map. This map is of class C^p if and only if each partial derivative

    D_i f: U_1 × ··· × U_n → L(E_i, F)
is of class C^{p−1}. If this is the case, and v = (v_1, ..., v_n), then

    Df(x)v = Σ_{i=1}^n D_i f(x)v_i.

Proof. We shall give the proof just for n = 2, to save space. We assume that the partial derivatives are continuous, and want to prove that the derivative of f exists and is given by the formula of the theorem. We let (x, y) be the point at which we compute the derivative, and let h = (h_1, h_2). We have

    f(x + h_1, y + h_2) − f(x, y)
        = f(x + h_1, y + h_2) − f(x + h_1, y) + f(x + h_1, y) − f(x, y)
        = ∫_0^1 D_2 f(x + h_1, y + th_2)h_2 dt + ∫_0^1 D_1 f(x + th_1, y)h_1 dt.

Since D_2 f is continuous, the map ψ given by

    ψ(h_1, th_2) = D_2 f(x + h_1, y + th_2) − D_2 f(x, y)

satisfies

    lim_{h→0} ψ(h_1, th_2) = 0.

Thus we can write the first integral as

    ∫_0^1 D_2 f(x + h_1, y + th_2)h_2 dt = ∫_0^1 D_2 f(x, y)h_2 dt + ∫_0^1 ψ(h_1, th_2)h_2 dt
                                        = D_2 f(x, y)h_2 + ∫_0^1 ψ(h_1, th_2)h_2 dt.

Estimating the error term given by this last integral, we find

    |∫_0^1 ψ(h_1, th_2)h_2 dt| ≤ sup_{0≤t≤1} |ψ(h_1, th_2)| |h_2|
                               ≤ |h| sup |ψ(h_1, th_2)| = o(h).
Similarly, the second integral yields

    D_1 f(x, y)h_1 + o(h).

Adding these terms, we find that Df(x, y) exists and is given by the formula, which also shows that the map Df = f′ is continuous, so f is of class C^1. If each partial is of class C^p, then it is clear that f is C^p. We leave the converse to the reader.

Example. Let E_1 be an arbitrary space and let E_2 = R^m for some m, so that elements of E_2 can be viewed as having coordinates (y_1, ..., y_m). Let F = R^s, so that elements of F can also be viewed as having coordinates (z_1, ..., z_s). Let U be open in E_1 × R^m and let

    f: U → R^s

be a C^1 map. Then the partial derivative

    D_2 f(x, y): R^m → R^s

may be represented by a Jacobian matrix. If (f_1, ..., f_s) are the coordinate functions of f, this Jacobian may be denoted by J_f^{(2)}(x, y), and we have

    J_f^{(2)}(x, y) = [ ∂f_1/∂y_1  ···  ∂f_1/∂y_m ]
                      [     ⋮              ⋮     ]
                      [ ∂f_s/∂y_1  ···  ∂f_s/∂y_m ].

For instance, let f(x, y, z) = (x^2 y, sin z). We view R^3 as the product R × R^2, so that D_2 f is taken with respect to the space of the last two coordinates. Then D_2 f(x, y, z) is represented by the matrix

    J_f^{(2)}(x, y, z) = [ x^2    0   ]
                         [ 0    cos z ].

Of course, if we split R^3 as a product in another way, and compute D_2 f with respect to the second factor in another product representation, then the matrix will change. We could for instance split R^3 as R^2 × R and thus take the second partial with respect to the second factor R. In that case, the matrix would be simply the column vector

    [   0   ]
    [ cos z ].
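The matrix of the example can be confirmed by finite differences; the base point below is an arbitrary sample value.

```python
import math

# For f(x, y, z) = (x^2 y, sin z), with R^3 = R x R^2, the partial
# D_2 f(x, y, z) with respect to (y, z) is represented by
# [[x^2, 0], [0, cos z]].
def f(x, y, z):
    return (x * x * y, math.sin(z))

x, y, z, h = 1.3, 0.4, 0.9, 1e-6

# Columns of the Jacobian with respect to y and z, by central differences.
col_y = [(a - b) / (2 * h) for a, b in zip(f(x, y + h, z), f(x, y - h, z))]
col_z = [(a - b) / (2 * h) for a, b in zip(f(x, y, z + h), f(x, y, z - h))]

expected = [[x * x, 0.0], [0.0, math.cos(z)]]
J2 = [[col_y[0], col_z[0]], [col_y[1], col_z[1]]]

for i in range(2):
    for j in range(2):
        assert abs(J2[i][j] - expected[i][j]) < 1e-6
```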
It will be useful to have a notation for linear maps of products into products. We treat the special case of two factors. We wish to describe linear maps

    λ: E_1 × E_2 → F_1 × F_2.

We contend that such a linear map can be represented by a matrix

    [ λ_11  λ_12 ]
    [ λ_21  λ_22 ],

where each λ_ij: E_j → F_i is itself a linear map. We thus take matrices whose components are not numbers any more but are themselves linear maps. This is done as follows. Suppose we are given four linear maps λ_ij as above. An element of E_1 × E_2 may be viewed as a pair of elements (v_1, v_2) with v_1 ∈ E_1 and v_2 ∈ E_2. We now write such a pair as a column vector and define λ(v_1, v_2) to be

    [ λ_11  λ_12 ] [ v_1 ]   [ λ_11 v_1 + λ_12 v_2 ]
    [ λ_21  λ_22 ] [ v_2 ] = [ λ_21 v_1 + λ_22 v_2 ],

so that we multiply just as we would with numbers. Then it is clear that λ is a linear map of E_1 × E_2 into F_1 × F_2.

Conversely, let λ: E_1 × E_2 → F_1 × F_2 be a linear map. We write an element (v_1, v_2) ∈ E_1 × E_2 in the form

    (v_1, v_2) = (v_1, 0) + (0, v_2).

We also write λ in terms of its coordinate maps λ = (λ_1, λ_2), where λ_1: E_1 × E_2 → F_1 and λ_2: E_1 × E_2 → F_2 are linear. Then

    λ(v_1, v_2) = (λ_1(v_1, v_2), λ_2(v_1, v_2))
                = (λ_1(v_1, 0) + λ_1(0, v_2), λ_2(v_1, 0) + λ_2(0, v_2)).

The map v_1 ↦ λ_1(v_1, 0) is a linear map of E_1 into F_1, which we call λ_11. Similarly, we let

    λ_11(v_1) = λ_1(v_1, 0),    λ_12(v_2) = λ_1(0, v_2),
    λ_21(v_1) = λ_2(v_1, 0),    λ_22(v_2) = λ_2(0, v_2).

Then we can represent λ as the matrix (λ_ij) as explained in the preceding discussion, and we see that λ(v_1, v_2) is given by the multiplication of the above matrix with the vertical vector formed with v_1 and v_2. Finally, we observe that if all λ_ij are continuous, then the map λ is also continuous, and conversely. We can apply this to the case of partial derivatives, and we formulate the result as a corollary.

Corollary 7.2. Let U be open in E_1 × E_2 and let f: U → F_1 × F_2 be a C^1 map. Let

    f = (f_1, f_2)

be represented by its coordinate maps

    f_1: U → F_1    and    f_2: U → F_2.

Then for any x ∈ U, the linear map Df(x) is represented by the matrix

    [ D_1 f_1(x)  D_2 f_1(x) ]
    [ D_1 f_2(x)  D_2 f_2(x) ].

Proof. This follows by applying Theorem 7.1 to each one of the maps f_1 and f_2, and using the definitions of the preceding discussion.

Observe that except for the fact that we deal with linear maps, all that precedes was treated in a completely analogous way for functions on open sets of n-space, where the derivative followed exactly the same formalism with respect to the partial derivatives.
XVII, §8. DIFFERENTIATING UNDER THE INTEGRAL SIGN

The proof given previously for the analogous statement goes through in the same way. We need a uniform continuity property which is slightly stronger than the uniform continuity on compact sets, but which is proved in the same way. We thus repeat this property in a lemma.
Lemma 8.1. Let A be a compact subset of a normed vector space, and let S be a subset of this normed vector space containing A. Let f be a continuous map defined on S. Given ε, there exists δ such that if x ∈ A and y ∈ S, and |x − y| < δ, then |f(x) − f(y)| < ε.

Proof. Given ε, for each x ∈ A we let r(x) be such that if y ∈ S and |y − x| < r(x), then |f(y) − f(x)| < ε. Using the finite covering property of compact sets, we can cover A by a finite number of open balls B_i of radius δ_i = r(x_i)/2, centered at x_i (i = 1, ..., n). We let

    δ = min δ_i.

If x ∈ A, then for some i we have |x − x_i| < r(x_i)/2. If |y − x| < δ, then |y − x_i| < r(x_i), so that

    |f(y) − f(x)| ≤ |f(y) − f(x_i)| + |f(x_i) − f(x)| < 2ε,

thus proving the lemma.

The only difference between our lemma and uniform continuity is that we allow the point y to be in S, not necessarily in A.

Theorem 8.2. Let U be open in E and let J = [a, b] be an interval. Let f: J × U → F be a continuous map such that D_2 f exists and is continuous. Let

    g(x) = ∫_a^b f(t, x) dt.

Then g is differentiable on U and

    Dg(x) = ∫_a^b D_2 f(t, x) dt.
Selecting a sufficiently small open neighborhood V of x, we can assume that D 2 f is bounded on J x V. Let A. be the linear map
A. =
1:
D2 f(t,x) dt.
[XVII, §8]
DIFFERENTIATING UNDER THE INTEGRAL SIGN
We investigate g(x
+ h) -
g(x) - J.h =
=
r[/(/, r[f f {J~
x
+ h) -
D 2 !(/, x
!(/, x) - D,f(/, x)h] dl
+ uh)h du -
[D2!(t, x
+ uh) -
g(x) - J.hl ~ max ID,f(/, x
+ uh) -
=
501
D,f(/, X)hJ dl
D2!(t,x)]h dU} dt.
We estimate: Ig(x
+ h) -
D,f(I, x)llhl(b - a)
the maximum being taken for a ~ u ~ b and a ~ t ~ b. By the lemma applied to D 2 ! on the compact set J x {x}, we conclude that given £ there exists (j such that whenever Ih I < (j then this maximum is < Eo This proves that J. is the derivative g'(x), as desired.
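Theorem 8.2 can be illustrated numerically. The sketch below uses the arbitrary sample kernel f(t, x) = sin(tx) on J = [0, 1]; the left side is a central difference of g, the right side a midpoint Riemann sum of ∫ D_2 f.

```python
import math

# Numerical sketch of differentiation under the integral sign:
#   g(x) = ∫_0^1 sin(t x) dt,   Dg(x) = ∫_0^1 t cos(t x) dt.
def integral(h, n=20_000):
    return sum(h((i + 0.5) / n) for i in range(n)) / n

def g(x):
    return integral(lambda t: math.sin(t * x))

x, eps = 1.7, 1e-5
lhs = (g(x + eps) - g(x - eps)) / (2 * eps)
rhs = integral(lambda t: t * math.cos(t * x))

assert abs(lhs - rhs) < 1e-6
```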
CHAPTER XVIII

Inverse Mapping Theorem

XVIII, §1. THE SHRINKING LEMMA

The main results of this section and of the next chapter are based on a simple geometric lemma.

Shrinking lemma. Let M be a closed subset of a complete normed vector space. Let f: M → M be a mapping, and assume that there exists a number K, 0 < K < 1, such that for all x, y ∈ M we have

    |f(x) − f(y)| ≤ K|x − y|.

Then f has a unique fixed point, that is, there exists a unique point x_0 ∈ M such that f(x_0) = x_0. If x ∈ M, then the sequence {f^n(x)} (iteration of f repeated n times) is a Cauchy sequence which converges to the fixed point.

Proof. We have for a fixed x ∈ M,

    |f^2(x) − f(x)| = |f(f(x)) − f(x)| ≤ K|f(x) − x|.

By induction,

    |f^{n+1}(x) − f^n(x)| ≤ K|f^n(x) − f^{n−1}(x)| ≤ K^n|f(x) − x|.

In particular, we see that the set of elements {f^n(x)} is bounded, because

    |f^n(x) − x| ≤ |f^n(x) − f^{n−1}(x)| + |f^{n−1}(x) − f^{n−2}(x)| + ··· + |f(x) − x|
                ≤ (K^{n−1} + K^{n−2} + ··· + K + 1)|f(x) − x|,

and the geometric series converges.
Now by induction again, for any integers m ≥ 1 and k ≥ 1 we have

    |f^{m+k}(x) − f^m(x)| ≤ K^m|f^k(x) − x|.

We have just seen that the term f^k(x) − x is bounded, independently of k. Hence there exists N such that if m, n ≥ N and say n = m + k, we have

    |f^n(x) − f^m(x)| < ε,

because K^m → 0 as m → ∞. Hence the sequence {f^n(x)} is a Cauchy sequence. Let x_0 be its limit. Select N such that for all n ≥ N we have

    |x_0 − f^n(x)| < ε.

Then

    |f(x_0) − f^{n+1}(x)| ≤ K|x_0 − f^n(x)| < ε.

This proves that the sequence {f^{n+1}(x)} converges to f(x_0). Hence

    f(x_0) = x_0,

and x_0 is a fixed point. Finally, suppose x_1 is also a fixed point, that is f(x_1) = x_1. Then

    |x_1 − x_0| = |f(x_1) − f(x_0)| ≤ K|x_1 − x_0|.

Since 0 < K < 1, it follows that x_1 − x_0 = 0 and x_1 = x_0. This proves the uniqueness, and the theorem.

A map as in the theorem is called a shrinking map. We shall apply the theorem in §3, and also in the next chapter in cases when the space is a space of functions with sup norm. Examples of this are also given in the exercises.
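The iteration {f^n(x)} of the shrinking lemma is easy to run. The sketch below uses f(x) = cos x on M = [0, 1]: f maps [0, 1] into itself, and since |cos′| = |sin| ≤ sin 1 on [0, 1], f is a shrinking map with K = sin 1 ≈ 0.84 < 1.

```python
import math

# Fixed-point iteration for the shrinking map f(x) = cos x on [0, 1].
x = 0.0
for _ in range(200):
    x = math.cos(x)

# x is now the unique fixed point of cos (x ≈ 0.739085...).
assert abs(math.cos(x) - x) < 1e-12
```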
XVIII, §1. EXERCISES

1. (Tate) Let E, F be complete normed vector spaces. Let f: E → F be a map having the following property. There exists a number C > 0 such that for all x, y ∈ E we have

    |f(x + y) − f(x) − f(y)| ≤ C.

(a) Show that there exists a unique additive map g: E → F such that g − f is bounded for the sup norm. [Hint: Show that the limit

    g(x) = lim_{n→∞} f(2^n x)/2^n

exists and satisfies g(x + y) = g(x) + g(y).]
(b) If f is continuous, prove that g is continuous and linear.

2. Generalize Exercise 1 to the bilinear case. In other words, let f: E × F → G be a map and assume that there is a constant C such that

    |f(x_1 + x_2, y) − f(x_1, y) − f(x_2, y)| ≤ C,
    |f(x, y_1 + y_2) − f(x, y_1) − f(x, y_2)| ≤ C,

for all x, x_1, x_2 ∈ E and y, y_1, y_2 ∈ F. Show that there exists a unique bi-additive map g: E × F → G such that f − g is bounded for the sup norm. If f is continuous, then g is continuous and bilinear.

3. Prove the following statement. Let B_r be the closed ball of radius r centered at 0 in E. Let f: B_r → E be a map such that:
(a) |f(x) − f(y)| ≤ b|x − y| with 0 < b < 1,
(b) |f(0)| ≤ r(1 − b).
Show that there exists a unique point x ∈ B_r such that f(x) = x.

4. Notation as in Exercise 3, let g be another map of B_r into E and let ε > 0 be such that |g(x) − f(x)| ≤ ε for all x. Assume that g has a fixed point x_2, and let x_1 be the fixed point of f. Show that |x_2 − x_1| ≤ ε/(1 − b).
5. Let K be a continuous function of two variables, defined for (x, y) in the square a ≤ x ≤ b and a ≤ y ≤ b. Assume that ‖K‖ ≤ C for some constant C > 0. Let f be a continuous function on [a, b] and let r be a real number satisfying the inequality

    |r| < 1/(C(b − a)).

Show that there is one and only one function g continuous on [a, b] such that

    f(x) = g(x) + r ∫_a^b K(t, x)g(t) dt.
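A discretized version of this exercise can be solved by the shrinking-lemma iteration g_{n+1}(x) = f(x) − r ∫ K(t, x) g_n(t) dt. In the sketch below the kernel, f, and r are arbitrary choices satisfying |r| C (b − a) < 1, and the integral is a midpoint sum on a grid.

```python
import math

# Solve f(x) = g(x) + r ∫_a^b K(t, x) g(t) dt by fixed-point iteration
# on a grid (a contraction, since |r| * C * (b - a) = 0.3 < 1).
a, b, n = 0.0, 1.0, 200
ts = [a + (i + 0.5) * (b - a) / n for i in range(n)]

K = lambda t, x: math.sin(t + x)        # |K| <= C = 1
f = lambda x: x
r = 0.3

g = [0.0] * n
for _ in range(60):
    g = [f(x) - r * sum(K(t, x) * gt for t, gt in zip(ts, g)) * (b - a) / n
         for x in ts]

# Residual of the integral equation on the grid.
residual = max(abs(f(x) - gx
                   - r * sum(K(t, x) * gt for t, gt in zip(ts, g)) * (b - a) / n)
               for x, gx in zip(ts, g))
assert residual < 1e-9
```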
6. (Newton's method) This method serves the same purpose as the shrinking lemma but sometimes is more efficient and converges more rapidly. It is used to find zeros of mappings. Let B_r be a ball of radius r centered at a point x_0 ∈ E. Let f: B_r → E be a C^2 mapping, and assume that f″ is bounded by some number C ≥ 1 on B_r. Assume that f′(x) is invertible for all x ∈ B_r and that |f′(x)^{−1}| ≤ C for all x ∈ B_r. Show that there exists a number δ depending only on C and r such that if |f(x_0)| ≤ δ, then the sequence defined by

    x_{n+1} = x_n − f′(x_n)^{−1} f(x_n)
[XVIII, §IJ
505
THE SHRINKING LEMMA
lies in B, and converges to an element x such that fix) inductively that
= O.
[Hillt: Show
Ix.+, - x.1 ;;; Clf(x.)I, ,C
If(x.+I)I;;; Ix.+. -x.1 2' and hence putting rl = (C 30/2),
If(x,,)1 ;;; (C 3/2)'+2+ ...+2'-'0 2 '
Ix"+1 - xol ;;; qr,
+ r,2 + ... + rj,. ).
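The quadratic convergence asserted in the hint is visible after only a few iterations. The sketch below is an illustration; the particular map f (a zero lies at (1, 1)) and the starting point are arbitrary choices, not from the text.

```python
import numpy as np

def newton(f, jac, x0, steps=8):
    """Newton iteration x_{n+1} = x_n - f'(x_n)^{-1} f(x_n); records |f(x_n)|."""
    x, residuals = np.array(x0, dtype=float), []
    for _ in range(steps):
        residuals.append(np.linalg.norm(f(x)))
        x = x - np.linalg.solve(jac(x), f(x))
    return x, residuals

# Example map f: R^2 -> R^2 with a zero at (1, 1).
f = lambda x: np.array([x[0]**2 + x[1]**2 - 2, x[0] - x[1]])
jac = lambda x: np.array([[2 * x[0], 2 * x[1]], [1.0, -1.0]])

x, res = newton(f, jac, [1.3, 0.8])
print(res)  # the residuals shrink roughly like |f(x_{n+1})| ~ const * |f(x_n)|^2
```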
7. Apply Newton's method to prove the following statement. Assume that f: U → E is of class C² and that for some point x₀ ∈ U we have f(x₀) = 0 and f′(x₀) is invertible. Show that given y sufficiently close to 0, there exists x close to x₀ such that f(x) = y. [Hint: Consider the map g(x) = f(x) − y.]
[Note. The point of the Newton method is that it often gives a procedure which converges much faster than the procedure of the shrinking lemma. Indeed, the shrinking lemma converges more or less like a geometric series. The Newton method converges with an exponent of 2ⁿ.]
8. The following is a reformulation due to Tate of a theorem of Michael Shub.
(a) Let n be a positive integer, and let f: R → R be a differentiable function such that f′(x) ≥ r > 1 for all x. Assume that f(x + 1) = f(x) + n. Show that there exists a strictly increasing continuous map α: R → R satisfying

α(x + 1) = α(x) + 1

such that

f(α(x)) = α(nx).

[Hint: Follow Tate's proof. Show that f is continuous, strictly increasing, and let g be its inverse function. You want to solve α(x) = g(α(nx)). Let M be the set of all continuous functions which are increasing (not necessarily strictly) and satisfy α(x + 1) = α(x) + 1. On M, define the norm

‖α‖ = sup |α(x)|,

the sup being taken over 0 ≤ x ≤ 1. Let T: M → M be the map such that

(Tα)(x) = g(α(nx)).

Show that T maps M into M and is a shrinking map. Show that M is complete, and that a fixed point for T solves the problem.] Since one can write

nx = α⁻¹(f(α(x))),

one says that the map x ↦ nx is conjugate to f. Interpreting this on the circle, one gets the statement originally due to Shub that a differentiable function on the circle, with positive derivative, is conjugate to the n-th power map for some n.
(b) Show that the differentiability condition can be replaced by the weaker condition: there exist numbers λ₁, λ₂ with 1 < λ₁ < λ₂ such that for all s ≥ 0 we have

λ₁s ≤ f(x + s) − f(x) ≤ λ₂s.
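Tate's fixed-point construction in Exercise 8 can be carried out numerically. In the sketch below, everything specific — the sample map f(x) = 2x + 0.1 sin 2πx (so n = 2 and min f′ > 1), the grid size, and the iteration counts — is an illustration choice, not from the text. The conjugating map α is represented by its values on [0, 1] and extended by α(x + 1) = α(x) + 1.

```python
import numpy as np

n = 2
f = lambda x: n * x + 0.1 * np.sin(2 * np.pi * x)    # f(x + 1) = f(x) + n
fp = lambda x: n + 0.2 * np.pi * np.cos(2 * np.pi * x)  # f' >= 2 - 0.2*pi > 1

def g(y):
    """Inverse function of f, computed by Newton's method (f is increasing)."""
    x = y / n
    for _ in range(50):
        x = x - (f(x) - y) / fp(x)
    return x

N = 512
xs = np.linspace(0.0, 1.0, N + 1)

def evaluate(alpha, t):
    """Evaluate alpha at any real t using alpha(x + 1) = alpha(x) + 1."""
    k = np.floor(t)
    return np.interp(t - k, xs, alpha) + k

alpha = xs.copy()                 # start the iteration at the identity
for _ in range(80):               # T: a -> g(a(nx)) is a shrinking map
    alpha = g(evaluate(alpha, n * xs))

err = np.max(np.abs(f(alpha) - evaluate(alpha, n * xs)))
print(err)  # small: alpha conjugates x -> 2x to f, up to grid error
```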
Further problems involving similar ideas, combined with another technique, will be found at the end of the next section. It is also recommended that the first theorem on differential equations be considered simultaneously with these problems.
XVIII, §2. INVERSE MAPPINGS, LINEAR CASE

Let λ: E → F be a continuous linear map. We continue to assume throughout that E, F are euclidean spaces, but what we say holds for complete normed spaces. We shall say that λ is invertible if there exists a continuous linear map ω: F → E such that ω ∘ λ = id_E and λ ∘ ω = id_F, where id_E and id_F denote the identity mappings of E and F respectively. We usually omit the index E or F on id and write simply id or I. No confusion can really arise, because for instance ω ∘ λ is a map of E into itself, and thus if it is equal to the identity mapping it must be that of E. Thus we have for every x ∈ E and y ∈ F:

ω(λ(x)) = x   and   λ(ω(y)) = y

by definition. We write λ⁻¹ for the inverse of λ.
Consider invertible elements of L(E, E). If λ, ω are invertible in L(E, E), then it is clear that ω ∘ λ is also invertible, because (ω ∘ λ)⁻¹ = λ⁻¹ ∘ ω⁻¹. For simplicity, from now on we shall write ωλ instead of ω ∘ λ.
Consider the special case λ: Rⁿ → Rⁿ. The linear map λ is represented by a matrix A = (a_ij). One knows that λ is invertible if and only if A is invertible (as a matrix), and the inverse of A, if it exists, is given by a formula, namely

A⁻¹ = (1/Det(A)) Ã,

where Ã is a matrix whose components are polynomial functions of the components of A. In fact, the components of Ã are subdeterminants of A. The reader can find this in any text on linear algebra. Thus in this case, A is invertible if and only if its determinant is unequal to 0.
Note that the determinant Det: Mat_{n×n} → R is a continuous function, being a polynomial in the n² coordinates of a matrix, and hence the set of invertible n × n matrices is open in Mat_{n×n}. The next theorem gives a useful formula whose proof does not depend on coordinates.

Theorem 2.1. The set of invertible elements of L(E, E) is open in L(E, E). If u ∈ L(E, E) is such that |u| < 1, then I − u is invertible, and its inverse is given by the convergent series

(I − u)⁻¹ = I + u + u² + ⋯ = Σ_{n=0}^{∞} uⁿ.

Proof. Since |u| < 1, and since for u, v ∈ L(E, E) we have |uv| ≤ |u||v|, we conclude that |uⁿ| ≤ |u|ⁿ. Hence the series converges, being comparable to the geometric series. Now we have

(I − u)(I + u + u² + ⋯ + uⁿ) = I − u^{n+1} = (I + u + ⋯ + uⁿ)(I − u).

Taking the limit as n → ∞ and noting that u^{n+1} → 0 as n → ∞, we see that the inverse of I − u is the value of the convergent series, as promised.
We can reformulate what we have just proved by stating that the open ball of radius 1 in L(E, E) centered at the identity I consists of invertible elements. Indeed, if λ ∈ L(E, E) is such that |λ − I| < 1, then we write λ = I − (I − λ) and apply our result.
Let u₀ be any invertible element of L(E, E). We wish to show that there exists an open ball of invertible elements centered at u₀. Let

|u − u₀| < 1/|u₀⁻¹|.

Then |uu₀⁻¹ − I| = |(u − u₀)u₀⁻¹| ≤ |u − u₀||u₀⁻¹| < 1. By what we have just seen, it follows that uu₀⁻¹ is invertible, and hence uu₀⁻¹u₀ = u is invertible, as was to be shown.
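For matrices, the geometric series of Theorem 2.1 can be summed numerically. In the sketch below, the random matrix, its scaling to operator norm 0.5, and the truncation order are all choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal((4, 4))
u *= 0.5 / np.linalg.norm(u, 2)      # scale so the operator norm |u| = 0.5 < 1

# Partial sums of I + u + u^2 + ... converge to (I - u)^{-1}.
partial, power = np.eye(4), np.eye(4)
for _ in range(60):
    power = power @ u
    partial = partial + power

exact = np.linalg.inv(np.eye(4) - u)
print(np.linalg.norm(partial - exact))  # error comparable to |u|^61 / (1 - |u|)
```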
Remark. If u is sufficiently close to u₀, then u⁻¹ is bounded, as one sees by writing

|u⁻¹| = |u⁻¹u₀u₀⁻¹| ≤ |u⁻¹u₀||u₀⁻¹|.

Denote by Inv(E, E) the open set of invertible elements of L(E, E). The map of Inv(E, E) → Inv(E, E) given by

u ↦ u⁻¹

is easily seen to be continuous. Indeed, if u₀ is invertible and u is close to u₀, then

u⁻¹ − u₀⁻¹ = u⁻¹(u₀ − u)u₀⁻¹.

Taking norms shows that

|u⁻¹ − u₀⁻¹| ≤ |u⁻¹||u₀⁻¹||u − u₀|,
whence the continuity. However, much more is true, as stated in the next theorem.

Theorem 2.2. Let φ: Inv(E, E) → Inv(E, E) be the map u ↦ u⁻¹. Then φ is infinitely differentiable, and its derivative is given by

φ′(u)v = −u⁻¹vu⁻¹.

Proof. We can write for small h ∈ L(E, E):

(u + h)⁻¹ − u⁻¹ = (u(I + u⁻¹h))⁻¹ − u⁻¹
               = (I + u⁻¹h)⁻¹u⁻¹ − u⁻¹
               = [(I + u⁻¹h)⁻¹ − I]u⁻¹.

By Theorem 2.1 there is some power series g(h), convergent for h so small that |u⁻¹h| < 1, for which

(I + u⁻¹h)⁻¹ = I − u⁻¹h + (u⁻¹h)²g(h),

and consequently

(u + h)⁻¹ − u⁻¹ = [−u⁻¹h + (u⁻¹h)²g(h)]u⁻¹ = −u⁻¹hu⁻¹ + (u⁻¹h)²g(h)u⁻¹.

The first term on the right is φ′(u)h in view of the estimate

|(u⁻¹h)²g(h)u⁻¹| ≤ C|h|²
for some constant C. Thus the derivative is given by the formula stated in the theorem. The fact that the map u ↦ u⁻¹ is infinitely differentiable follows because the derivative is composed of inverses and continuous bilinear maps (composition), so that by induction φ is of class Cᵖ for every positive integer p. The theorem is proved.
Remark. In the case of matrices, the map

A(x) ↦ A(x)⁻¹,

where (x) = (x_ij) are the n² components of the matrix A(x), can be seen to be C^∞ because the components of A(x)⁻¹ are given as polynomials in (x), divided by the determinant, which is not 0 and is also a polynomial. Thus one sees that this map is infinitely differentiable using the partial derivative criterion. However, even seeing this does not give the formula of the theorem describing the derivative of the inverse map, and this formula really would not be proved otherwise, even in the case of matrices. Note that the formula contains the usual −u⁻², except that the noncommutativity of the product has separated this and placed u⁻¹ on each side of the variable v.
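The derivative formula −u⁻¹vu⁻¹ can be checked against a finite difference for matrices. In the sketch below, the matrices, the step size t, and the random seed are arbitrary illustration choices.

```python
import numpy as np

rng = np.random.default_rng(1)
u = np.eye(3) + 0.2 * rng.standard_normal((3, 3))    # invertible, near I
v = rng.standard_normal((3, 3))                      # direction of variation

t = 1e-6
finite_diff = (np.linalg.inv(u + t * v) - np.linalg.inv(u)) / t
derivative = -np.linalg.inv(u) @ v @ np.linalg.inv(u)   # phi'(u)v = -u^{-1} v u^{-1}
print(np.linalg.norm(finite_diff - derivative))  # O(t), hence small
```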
XVIII, §2. EXERCISES

1. Let E be the space of n × n matrices with the usual norm |A|, such that

|AB| ≤ |A||B|.

Everything that follows would also apply to an arbitrary complete normed vector space with an associative product E × E → E into itself, and an element I which acts like a multiplicative identity, such that |I| = 1.
(a) Show that the series

exp(A) = Σ_{n=0}^{∞} Aⁿ/n!

converges absolutely, and that |exp(A) − I| < 1 if |A| is sufficiently small.
(b) Show that the series

log(I + B) = B − B²/2 + ⋯ + (−1)^{n+1}Bⁿ/n + ⋯

converges absolutely if |B| < 1, and that in that case,

|log(I + B)| ≤ |B|/(1 − |B|).

If |C − I| < 1, show that the series

log C = (C − I) − (C − I)²/2 + ⋯ + (−1)^{n+1}(C − I)ⁿ/n + ⋯

converges absolutely.
(c) If |A| is sufficiently small, show that log exp(A) = A, and if |C − I| < 1, show that exp log C = C. [Hint: Approximate exp and log by the polynomials of the usual Taylor series, estimating the error terms.]
(d) Show that if A, B commute, that is AB = BA, then

exp(A + B) = exp A exp B.

State and prove the similar theorem for the log.
(e) Let C be a matrix sufficiently close to I. Show that given an integer m > 0, there exists a matrix X such that Xᵐ = C, and that one can choose X so that XC = CX.
2. Let U be the open ball of radius 1 centered at I. Show that the map log: U → E is differentiable.
3. Let V be the open ball of radius 1 centered at 0. Show that the map exp: V → E is differentiable.
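The series in Exercises 1–3 are straightforward to evaluate by truncation. In the sketch below, the test matrix and the truncation orders are illustration choices.

```python
import numpy as np

def mat_exp(a, terms=30):
    """exp(A) = sum of A^n / n!, truncated."""
    result, power = np.eye(len(a)), np.eye(len(a))
    for k in range(1, terms):
        power = power @ a / k
        result = result + power
    return result

def mat_log(c, terms=60):
    """log C = sum of (-1)^{n+1} (C - I)^n / n, valid for |C - I| < 1."""
    b = c - np.eye(len(c))
    result, power = np.zeros_like(b), np.eye(len(b))
    for k in range(1, terms):
        power = power @ b
        result = result + ((-1) ** (k + 1)) * power / k
    return result

a = np.array([[0.0, 0.2], [-0.1, 0.1]])
err_log_exp = np.linalg.norm(mat_log(mat_exp(a)) - a)   # log exp(A) = A

# If A and B commute, exp(A + B) = exp(A) exp(B); here B = 2A commutes with A.
b = 2 * a
err_comm = np.linalg.norm(mat_exp(a + b) - mat_exp(a) @ mat_exp(b))
print(err_log_exp, err_comm)
```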
4. Let K be a continuous function of two variables, defined for (x, y) in the square a ≤ x ≤ b and a ≤ y ≤ b. Assume that ‖K‖ ≤ C for some constant C > 0. Let f be a continuous function on [a, b] and let r be a real number satisfying the inequality

|r| < 1/(C(b − a)).

Show that there is one and only one function g continuous on [a, b] such that

f(x) = g(x) + r ∫ₐᵇ K(t, x)g(t) dt.

(This exercise was also given in the preceding section. Solve it here by using Theorem 2.1.)
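The mechanism of Theorem 2.1 also solves this equation numerically: after discretization the equation reads (I + rK)g = f, with ‖rK‖ < 1, so g is the sum of the geometric series Σ (−rK)ⁿf. In the sketch below, the kernel, the function f, and the grid are illustration choices.

```python
import numpy as np

a, b, m = 0.0, 1.0, 200
t = np.linspace(a, b, m)
w = (b - a) / m                                # simple quadrature weight

K = np.exp(-np.abs(t[:, None] - t[None, :]))   # kernel with ||K|| <= 1
f = np.sin(np.pi * t)
r = 0.5                                        # |r| < 1/(C(b - a)) with C = 1

# Sum the geometric (Neumann) series g = f - rKf + (rK)^2 f - ...
g, term = f.copy(), f.copy()
for _ in range(100):
    term = -r * (K @ term) * w                 # apply -rK to the last term
    g = g + term

residual = np.max(np.abs(g + r * (K @ g) * w - f))
print(residual)  # g solves the discretized equation f = g + r*int K g
```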
5. Exercises 5 and 6 develop a special case of a theorem of Anosov, by a proof due to Moser.
First we make some definitions. Let A: R² → R² be a linear map. We say that A is hyperbolic if there exist numbers b > 1, c < 1, and two linearly independent vectors v, w in R² such that Av = bv and Aw = cw. As an example, show that the matrix (linear map)

A = ( 2  1 )
    ( 1  1 )

has this property.
Next we introduce the C¹ norm. If f is a C¹ map such that both f and f′ are bounded, we define the C¹ norm to be

‖f‖₁ = max(‖f‖, ‖f′‖),

where ‖ ‖ is the usual sup norm. In this case, we also say that f is C¹-bounded.
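Hyperbolicity of the example matrix is easy to confirm numerically. The matrix with rows (2, 1) and (1, 1) used below is an assumption for the illustration (the entries are unreadable in the source); its eigenvalues are (3 ± √5)/2.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 1.0]])       # assumed example matrix
eigenvalues, eigenvectors = np.linalg.eig(A)

b, c = max(eigenvalues), min(eigenvalues)
print(b, c)  # b > 1 and 0 < c < 1, with independent eigenvectors v, w
```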
The theorem we are after runs as follows:

Theorem. Let A: R² → R² be a hyperbolic linear map. There exists δ having the following property. If f: R² → R² is a C¹ map such that

‖f − A‖₁ < δ,

then there exists a continuous bounded map h: R² → R² satisfying the equation

f ∘ h = h ∘ A.

First prove a lemma.

Lemma. Let M be the vector space of continuous bounded maps of R² into R². Let T: M → M be the map defined by Tp = p − A⁻¹ ∘ p ∘ A. Then T is a continuous linear map, and is invertible.

To prove the lemma, write

p = p⁺v + p⁻w,

where p⁺ and p⁻ are functions, and note that symbolically,

Tp⁺ = (I − S)p⁺   with ‖S‖ < 1,

so find an inverse for T on the p⁺ component. Analogously, show that Tp⁻ = (I − S₀⁻¹)p⁻ with ‖S₀‖ < 1, so that S₀T = S₀ − I is invertible on the p⁻ component. Hence T can be inverted componentwise, as it were.
To prove the theorem, write f = A + g where g is C¹-small. We want to solve for h = I + p with p ∈ M, satisfying f ∘ h = h ∘ A. Show that this is equivalent to solving

Tp = −A⁻¹ ∘ g ∘ h,

or equivalently,

p = −T⁻¹(A⁻¹ ∘ g ∘ (I + p)).

This is then a fixed point condition for the map R: M → M given by

R(p) = −T⁻¹(A⁻¹ ∘ g ∘ (I + p)).

Show that R is a shrinking map to conclude the proof.
6. One can formulate a variant of the preceding exercise (actually the very case dealt with by Anosov–Moser). Assume that the matrix A with respect to the standard basis of R² has integer coefficients. A vector z ∈ R² is called an integral vector if its coordinates are integers. A map p: R² → R² is said to be periodic if
p(x + z) = p(x)

for all x ∈ R² and all integral vectors z. Prove:

Theorem. Let A be hyperbolic, with integer coefficients. There exists δ having the following property. If g is a C¹ periodic map with ‖g‖₁ < δ, and if f = A + g, then there exists a periodic continuous map h satisfying the equation

f ∘ h = h ∘ A.

Note. With a bounded amount of extra work, one can show that the map h itself is C⁰-invertible, and so f = h ∘ A ∘ h⁻¹.
XVIII, §3. THE INVERSE MAPPING THEOREM

Let U be open in E and let f: U → F be a C¹ map. We shall say that f is C¹-invertible on U if the image of f is an open set V in F, and if there is a C¹ map g: V → U such that f and g are inverse to each other, that is, for all x ∈ U and y ∈ V we have

g(f(x)) = x   and   f(g(y)) = y.
In considering mappings between sets, we used the same notion of invertibility, without the requirement that the inverse map g be C¹. All that was required when dealing with sets in general is that f, g are inverse to each other simply as maps. Of course, one can make other requirements besides the C¹ requirement. One can say that f is C⁰-invertible if the inverse map exists and is continuous. One can say that f is Cᵖ-invertible if f is itself Cᵖ and the inverse map g is also Cᵖ. In the linear case, we dealt with linear invertibility, which in some sense is the strongest requirement which we can make. It will turn out that if f is a C¹ map which is C¹-invertible, and if f happens to be Cᵖ, then its inverse map is also Cᵖ. This is the reason why we emphasize C¹ at this point.
However, it may happen that a C¹ map has a continuous inverse without this inverse map being differentiable. For example, let f: R → R be the map f(x) = x³. Then certainly f is infinitely differentiable. Furthermore, f is strictly increasing, and hence has an inverse mapping g: R → R which is nothing else but the cube root: g(y) = y^{1/3}. The inverse map g is not differentiable at 0, but is continuous at 0.
Let

f: U → V   and   g: V → W

be invertible Cᵖ maps. We assume that V is the image of f and W is the image of g. We denote the inverse of f by f⁻¹ and that of g by g⁻¹. Then
it is clear that g ∘ f is Cᵖ-invertible, and that (g ∘ f)⁻¹ = f⁻¹ ∘ g⁻¹, because we know that a composite of Cᵖ maps is also Cᵖ.
Let f: U → F be a Cᵖ map, and let x₀ ∈ U. We shall say that f is locally Cᵖ-invertible at x₀ if there exists an open subset U₁ of U containing x₀ such that f is Cᵖ-invertible on U₁. By our definition, this means that there is an open set V₁ of F and a Cᵖ map g: V₁ → U₁ such that f ∘ g and g ∘ f are the respective identity mappings of V₁ and U₁. It is clear that a composite of locally invertible maps is locally invertible. In other words, if

f: U → V   and   g: V → W

are Cᵖ maps, x₀ ∈ U, f(x₀) = y₀, if f is locally Cᵖ-invertible at x₀, and if g is locally Cᵖ-invertible at y₀, then g ∘ f is locally Cᵖ-invertible at x₀.
It is useful to have a terminology which allows us to specify what is the precise image of an invertible map. For this purpose, we shall use a word which is now standard in mathematics. Let U be open in E and let V be open in F. A map

φ: U → V

will be called a Cᵖ-isomorphism if it is Cᵖ, and if there exists a Cᵖ map

ψ: V → U

such that φ, ψ are inverse to each other. Thus φ is Cᵖ-invertible on U, and V is the image φ(U) on which the Cᵖ inverse of φ is defined. We often write the inverse as ψ = φ⁻¹. If

f: U → V   and   g: V → W

are Cᵖ-isomorphisms, then the composite g ∘ f is also a Cᵖ-isomorphism, whose inverse is given by f⁻¹ ∘ g⁻¹.
The word isomorphism is also used in connection with continuous linear maps. In fact, a continuous linear map

λ: E → F

is said to be an isomorphism if it is invertible. Thus the word isomorphism always means invertible, and the kind of invertibility is then made explicit in the context. When it is used in relation to Cᵖ maps, invertibility means Cᵖ-invertibility. When it is used in connection with continuous linear maps, invertibility means continuous linear invertibility. These are the only two examples with which we deal in this chapter. There are other examples in mathematics, however.
Let ψ: U → V be a continuous map which has a continuous inverse φ: V → U. In other words, ψ is a C⁰-invertible map. If U₁ is an open subset of U, then ψ(U₁) = V₁ is an open subset of V, because ψ = φ⁻¹
and φ is continuous. Thus open subsets of U and open subsets of V correspond to each other under the associations

U₁ ↦ ψ(U₁)   and   V₁ ↦ φ(V₁).

Let U be open in E. A Cᵖ map

ψ: U → V

which is Cᵖ-invertible on U is also called a Cᵖ chart. If Q is a point of U, we call ψ a chart at Q. If ψ is not invertible on all of U but is Cᵖ-invertible on an open subset U₁ of U containing Q, then we say that ψ is a local Cᵖ-isomorphism at Q.
If E = Rⁿ = F and the coordinates of Rⁿ are denoted by x₁, …, xₙ, then we may view ψ as also having coordinate functions,

ψ = (ψ₁, …, ψₙ).

In this case we say that ψ₁, …, ψₙ are local coordinates (of the chart) at Q, and that they form a Cᵖ-coordinate system at Q. We interpret ψ as a change of coordinate system from (x₁, …, xₙ) to (ψ₁(x), …, ψₙ(x)), of class Cᵖ. This terminology is in accord with the change from polar to rectangular coordinates as given in examples following the inverse mapping theorem, with which the reader is probably already acquainted.
We give here another example of a chart which is actually defined on all of E. These are translations. We let

τ_v: E → E

be the map such that τ_v(x) = x + v. Then the derivative of τ_v is obviously given by

Dτ_v(x) = I,

where I is the identity mapping. Observe that if U is an open set in E and v ∈ E, then τ_v(U) is an open set, which is called the translation of U by v. It is sometimes denoted by U_v, and consists of all elements x + v with x ∈ U. We have

τ_w ∘ τ_v = τ_{w+v}

if w, v ∈ E, and

τ_v⁻¹ = τ_{−v}.

A map τ_v is called the translation by v. For instance, if U is the open ball centered at the origin, of radius r, then τ_v(U) = U_v is the open ball centered at v, of radius r.
When considering real valued functions of one variable, we used the derivative as a test for invertibility. From the ordering properties of the real numbers, we deduced invertibility from the fact that the derivative was positive (say) over an interval. Furthermore, at a given point, if the derivative is not equal to 0, then the inverse function exists, and one has a formula for its derivative. We shall now extend this result to the general case, the derivative being a linear map.

Theorem 3.1 (Inverse mapping theorem). Let U be open in E, let x₀ ∈ U, and let f: U → F be a C¹ map. Assume that the derivative f′(x₀): E → F is invertible. Then f is locally C¹-invertible at x₀. If φ is its local inverse, and y = f(x), then φ′(y) = f′(x)⁻¹.

Proof. We shall first make some reductions of the problem. To begin with, let λ = f′(x₀), so that λ is an invertible continuous linear map of E into F. If we form the composite

λ⁻¹ ∘ f: U → E,

then the derivative of λ⁻¹ ∘ f at x₀ is λ⁻¹ ∘ f′(x₀) = I. If we can prove that λ⁻¹ ∘ f is locally invertible, then it will follow that f is locally invertible, because f = λ ∘ (λ⁻¹ ∘ f). This reduces the problem to the case where f maps U into E itself, and where f′(x₀) = I.
Next, let f(x₀) = y₀. Let f₁(x) = f(x + x₀) − y₀. Then f₁ is defined on an open set containing 0, and f₁(0) = 0. In fact, f₁ is the composite map

U − x₀ → U → E → E,

the three maps being τ_{x₀}, f, and τ_{−y₀}. It will suffice to prove that f₁ is locally invertible, because f₁ = τ_{−y₀} ∘ f ∘ τ_{x₀}, and then

f = τ_{y₀} ∘ f₁ ∘ τ_{−x₀}

is the composite of locally invertible maps, and is therefore locally invertible. We have thus reduced the proof to the case when x₀ = 0, f(0) = 0, and f′(0) = I, which we assume from now on.
Let g(x) = x − f(x). Then g′(0) = 0, and by continuity there exists r > 0 such that if |x| ≤ r, then

|g′(x)| ≤ 1/2.

Furthermore, by the continuity of f′ and Theorem 2.1, f′(x) is invertible for |x| ≤ r.
From the mean value theorem (applied between 0 and x) we see that |g(x)| ≤ ½|x|, and hence g maps the closed ball B_r(0) into B_{r/2}(0). We contend that given y ∈ B_{r/2}(0) there exists a unique element x ∈ B_r(0) such that f(x) = y. We prove this by considering the map

g_y(x) = y + x − f(x).

If |y| ≤ r/2 and |x| ≤ r, then |g_y(x)| ≤ r, and hence g_y may be viewed as a mapping of the complete metric space B_r(0) into itself. The bound of ½ on the derivative, together with the mean value theorem, shows that g_y is a shrinking map, namely

|g_y(x₁) − g_y(x₂)| ≤ ½|x₁ − x₂|

for x₁, x₂ ∈ B_r(0). By the shrinking lemma, it follows that g_y has a unique fixed point, which is precisely the solution of the equation f(x) = y. This proves our contention.
Let U₁ be the set of all elements x in the open ball B_r(0) such that |f(x)| < r/2. Then U₁ is open, and we let V₁ be its image. By what we have just seen, the map f: U₁ → V₁ is injective, and hence we have inverse maps

f: U₁ → V₁,   f⁻¹ = φ: V₁ → U₁.

We must prove that V₁ is open and that φ is of class C¹. Let x₁ ∈ U₁ and let y₁ = f(x₁), so that |y₁| < r/2. If y ∈ E is such that |y| < r/2, then we know that there exists a unique x ∈ B_r(0) such that f(x) = y. Writing x = x − f(x) + f(x), we see that

|x − x₁| ≤ |f(x) − f(x₁)| + |g(x) − g(x₁)| ≤ |f(x) − f(x₁)| + ½|x − x₁|.

Transposing the last term on the other side, we find that

(*)   |x − x₁| ≤ 2|f(x) − f(x₁)|.

This shows that if y is sufficiently close to y₁, then x is close to x₁, and in particular |x| < r since |x₁| < r. This proves that x ∈ U₁, and hence that y ∈ V₁, so that V₁ is open. The inequality (*) now shows that φ = f⁻¹ is continuous.
To prove differentiability, note that f′(x₁) is invertible, and write

f(x) − f(x₁) = f′(x₁)(x − x₁) + |x − x₁|ψ(x − x₁),
where ψ is a map such that lim ψ(x − x₁) = 0 as x → x₁. Substitute this in the expression

(**)   f⁻¹(y) − f⁻¹(y₁) − f′(x₁)⁻¹(y − y₁) = x − x₁ − f′(x₁)⁻¹(f(x) − f(x₁)).

Using the inequality (*), and a bound C for |f′(x₁)⁻¹|, we obtain

|(**)| ≤ |f′(x₁)⁻¹||x − x₁||ψ(x − x₁)| ≤ 2C|y − y₁||ψ(φ(y) − φ(y₁))|.

Since φ = f⁻¹ is continuous, it follows from the definition of the derivative that φ′(y₁) = f′(x₁)⁻¹. Thus φ′ is composed of the maps φ, f′, and "inverse", namely

φ′(y) = f′(φ(y))⁻¹,

and these maps are continuous. It follows that φ′ is continuous, whence φ is of class C¹. This proves the theorem.

Corollary 3.2. If f is of class Cᵖ, then its local inverse is of class Cᵖ.

Proof. By induction, assume the statement proved for p − 1. Then f′ is of class C^{p−1}, the local inverse φ is of class C^{p−1}, and we know that the map u ↦ u⁻¹ is C^∞. Hence φ′ is of class C^{p−1}, being composed of C^{p−1} maps. This shows that φ is of class Cᵖ, as desired.
In some applications, one needs a refinement of the first part of the proof, given a lower bound for the size of the image of f when the derivative of f is close to the identity. We do this in a lemma, which will be used in the proof of the change of variable formula.
Lemma 3.3. Let U be open in E, and let f: U → E be of class C¹. Assume that f(0) = 0 and f′(0) = I. Let r > 0 and assume that B_r(0) ⊂ U. Let 0 < s < 1, and assume that

|f′(z) − f′(x)| ≤ s

for all x, z ∈ B_r(0). If y ∈ E and |y| ≤ (1 − s)r, then there exists a unique x ∈ B_r(0) such that f(x) = y.

Proof. The map g_y given by g_y(x) = x − f(x) + y is defined for |x| ≤ r and |y| ≤ (1 − s)r. Then g_y maps B_r(0) into itself, because from the estimate

|f(x) − x| = |f(x) − f(0) − f′(0)x| ≤ |x| sup |f′(z) − f′(0)| ≤ sr,

we obtain |g_y(x)| ≤ sr + (1 − s)r = r. Furthermore, g_y is a shrinking map, because from the mean value theorem we get

|g_y(x₁) − g_y(x₂)| = |x₁ − x₂ − (f(x₁) − f(x₂))|
                    = |x₁ − x₂ − f′(0)(x₁ − x₂) + δ(x₁, x₂)|
                    = |δ(x₁, x₂)|,

where

|δ(x₁, x₂)| ≤ |x₁ − x₂| sup |f′(z) − f′(0)| ≤ s|x₁ − x₂|.
Hence g_y has a unique fixed point x ∈ B_r(0), thus proving our lemma.
We shall now give a standard example with coordinates.

Example 1. Let E = R² and let U consist of all pairs (r, θ) with r > 0 and arbitrary θ. Let φ: U → R² = F be defined by

φ(r, θ) = (r cos θ, r sin θ).

Then

J_φ(r, θ) = ( cos θ   −r sin θ )
            ( sin θ    r cos θ )

and

Det J_φ(r, θ) = r cos²θ + r sin²θ = r.

Hence J_φ is invertible at every point, so that φ is locally invertible at every point. The local coordinates φ₁, φ₂ are usually denoted by x, y, so that one usually writes

x = r cos θ   and   y = r sin θ.

One can define the local inverse for certain regions of F. Indeed, let V be the set of all pairs (x, y) with x > 0 and y > 0. Then on V the inverse is given by

r = √(x² + y²)   and   θ = arcsin( y/√(x² + y²) ).
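Example 1 can be checked numerically at any point of the region x > 0, y > 0; the point below is an arbitrary choice.

```python
import numpy as np

phi = lambda r, theta: (r * np.cos(theta), r * np.sin(theta))

r, theta = 2.0, 0.7            # a point with 0 < theta < pi/2, so x > 0, y > 0
x, y = phi(r, theta)

# The local inverse on that region:
r_back = np.hypot(x, y)
theta_back = np.arcsin(y / np.hypot(x, y))

# The Jacobian determinant equals r:
J = np.array([[np.cos(theta), -r * np.sin(theta)],
              [np.sin(theta),  r * np.cos(theta)]])
print(r_back - r, theta_back - theta, np.linalg.det(J) - r)  # all ~ 0
```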
Example 2. Let E = R³ and let U be the open set of all elements (ρ, θ₁, θ₂) with ρ > 0 and θ₁, θ₂ arbitrary. We consider the mapping φ: U → F = R³ such that

φ(ρ, θ₁, θ₂) = (ρ cos θ₁ sin θ₂, ρ sin θ₁ sin θ₂, ρ cos θ₂).

The determinant of the Jacobian of φ is given by

Det J_φ(ρ, θ₁, θ₂) = −ρ² sin θ₂,

and is not equal to 0 whenever θ₂ is not an integral multiple of π. For such points, the map φ is locally invertible. For instance, we write

x = ρ cos θ₁ sin θ₂,   y = ρ sin θ₁ sin θ₂,   z = ρ cos θ₂.

Let V be the open set of all (x, y, z) such that x > 0, y > 0, z > 0. Then on V the inverse of φ is given by the map

ψ: V → U

such that

ψ(x, y, z) = ( √(x² + y² + z²), arcsin( y/√(x² + y²) ), arccos( z/√(x² + y² + z²) ) ).

The open subset U₁ of U corresponding to V (that is, ψ(V)) is the set of points (ρ, θ₁, θ₂) such that

ρ > 0,   0 < θ₁ < π/2,   0 < θ₂ < π/2.
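As with Example 1, the spherical chart and its inverse can be verified at a sample point of U₁; the point below is an arbitrary choice.

```python
import numpy as np

def phi(rho, t1, t2):
    return np.array([rho * np.cos(t1) * np.sin(t2),
                     rho * np.sin(t1) * np.sin(t2),
                     rho * np.cos(t2)])

def psi(x, y, z):
    return np.array([np.sqrt(x*x + y*y + z*z),
                     np.arcsin(y / np.sqrt(x*x + y*y)),
                     np.arccos(z / np.sqrt(x*x + y*y + z*z))])

p = np.array([1.5, 0.6, 0.8])   # rho > 0 and 0 < theta_1, theta_2 < pi/2
print(psi(*phi(*p)) - p)        # ~ (0, 0, 0) on this region
```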
Example 3. Let φ: R² → R² be given by

φ(x, y) = (x + x²f(x, y), y + y²g(x, y)),

where f, g are C¹ functions. Then the Jacobian matrix of φ at (0, 0) is simply the identity matrix:

J_φ(0, 0) = ( 1  0 )
            ( 0  1 ).

Hence φ is locally C¹-invertible at (0, 0). One views a map φ as in this example as a perturbation of the identity map by means of the extra terms x²f(x, y) and y²g(x, y), which are very small when x, y are near 0.
Example 4. The continuity of the derivative is needed in the inverse mapping theorem. For example, let

f(x) = x + 2x² sin(1/x)   if x ≠ 0,
f(0) = 0.

Then f is differentiable, but not even injective in any open interval containing 0. Work it out as an exercise.
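A quick computation shows what goes wrong in Example 4: although f′(0) = 1, the derivative f′(x) = 1 + 4x sin(1/x) − 2 cos(1/x) takes negative values arbitrarily close to 0, so f is not monotone on any interval around 0. The sample points below are chosen for the illustration.

```python
import numpy as np

fprime = lambda x: 1 + 4 * x * np.sin(1 / x) - 2 * np.cos(1 / x)

# At x_k = 1/(2*pi*k) we have cos(1/x) = 1 and sin(1/x) = 0, so f'(x_k) = -1.
xs = 1 / (2 * np.pi * np.arange(1, 6))
print(fprime(xs))  # negative at points arbitrarily close to 0
```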
XVIII, §3. EXERCISES

1. Let f: U → F be of class C¹ on an open set U of E. Suppose that the derivative of f at every point of U is invertible. Show that f(U) is open.
2. Let f(x, y) = (e^x + e^y, e^x − e^y). By computing Jacobians, show that f is locally invertible around every point of R². Does f have a global inverse on R² itself?
3. Let f: R² → R² be given by f(x, y) = (e^x cos y, e^x sin y). Show that Df(x, y) is invertible for all (x, y) ∈ R², that f is locally invertible at every point, but does not have an inverse defined on all of R².
4. Let f: R² → R² be given by f(x, y) = (x² − y², 2xy). Determine the points of R² at which f is locally invertible, and determine whether f has an inverse defined on all of R².
The results of the next section will be covered in a more general situation in §5. However, the case of functions on n-space is sufficiently important to warrant the repetition. Logically, however, the reader can omit the next section.
XVIII, §4. IMPLICIT FUNCTIONS AND CHARTS

Throughout this section, we deal with maps which are assumed to be of class Cᵖ, and thus we shall say invertible instead of saying Cᵖ-invertible, and similarly locally invertible instead of saying locally Cᵖ-invertible. We always take p ≥ 1.
We start with the most classical form of the implicit function theorem.
Theorem 4.1. Let f: J₁ × J₂ → R be a function of two real variables, defined on a product of open intervals J₁, J₂. Assume that f is of class Cᵖ. Let (a, b) ∈ J₁ × J₂ and assume that f(a, b) = 0 but D₂f(a, b) ≠ 0. Then the map

ψ: J₁ × J₂ → R²

given by (x, y) ↦ (x, f(x, y)) is locally Cᵖ-invertible at (a, b).

Proof. All we need to do is to compute the derivative of ψ at (a, b). We write ψ in terms of its coordinates, ψ = (ψ₁, ψ₂). The Jacobian matrix of ψ is given by

J_ψ(x, y) = (   1        0    )
            ( ∂f/∂x   ∂f/∂y ),

and this matrix is invertible at (a, b) since its determinant is equal to ∂f/∂y ≠ 0 at (a, b). The inverse mapping theorem guarantees that ψ is locally invertible at (a, b).

Corollary 4.2. Let S be the set of pairs (x, y) such that f(x, y) = 0. Then there exists an open set U₁ in R² containing (a, b) such that ψ(S ∩ U₁) consists of all points (x, 0) for x in some open interval around a.

Proof. Since ψ(a, b) = (a, 0), there exist open intervals V₁, V₂ containing a and 0 respectively, and an open set U₁ in R² containing (a, b), such that the map

ψ: U₁ → V₁ × V₂

has an inverse

φ: V₁ × V₂ → U₁

(both of which are Cᵖ according to our convention). The set of points (x, y) ∈ U₁ such that f(x, y) = 0 then corresponds under ψ to the set of points (x, 0) with x ∈ V₁, as desired.
Theorem 4.1 gives us an example of a chart, given by the two coordinate functions x and f near (a, b), which reflects better the nature of the set S. In elementary courses, one calls S the curve determined by the equation f = 0. We now see that under a suitable choice of chart at (a, b), one can transform a small piece of this curve into a factor in a product space. As it were, the curve is straightened out into the straight line V₁.

Example 1. In the example following the inverse mapping theorem, we deal with the polar coordinates (r, θ) and the rectangular coordinates (x, y). In that case, the quarter circle in the first quadrant was straightened out into a straight line, as in the following picture:

[Figure: the quarter circle in the (x, y)-plane is carried to a vertical segment at r = 1 in the (r, θ)-plane.]

In this case U₁ is the open first quadrant and V₁ is the open interval 0 < θ < π/2. We have ψ(S ∩ U₁) = {1} × V₁. The function f is the function f(x, y) = x² + y² − 1.
The next theorem is known as the implicit function theorem.

Theorem 4.3. Let f: J₁ × J₂ → R be a function of two variables, defined on a product of open intervals. Assume that f is of class Cᵖ. Let

(a, b) ∈ J₁ × J₂

and assume that f(a, b) = 0 but D₂f(a, b) ≠ 0. Then there exists an open interval J in R containing a and a Cᵖ function

g: J → R

such that g(a) = b and f(x, g(x)) = 0 for all x ∈ J.
[XVIII, §4]
523
IMPLICIT FUNCTIONS AND CHARTS
Proof. By Theorem 4.1 we know that the map

ψ: J₁ × J₂ → R × R = R²

given by (x, y) ↦ (x, f(x, y)) is locally invertible at (a, b). We denote its local inverse by φ, and note that φ has two coordinates, φ = (φ₁, φ₂), such that

φ(x, z) = (x, φ₂(x, z))   for x ∈ R, z ∈ R.

We let g(x) = φ₂(x, 0). Since ψ(a, b) = (a, 0), it follows that φ₂(a, 0) = b, so that g(a) = b. Furthermore, since ψ, φ are inverse mappings, we obtain

(x, 0) = ψ(φ(x, 0)) = ψ(x, g(x)) = (x, f(x, g(x))).

This proves that f(x, g(x)) = 0, as was to be shown.
We see that Theorem 4.3 is essentially a corollary of Theorem 4.1. We have expressed y as a function of x explicitly by means of g, starting with what is regarded as an implicit relation f(x, y) = 0.

Example 2. Consider the function f(x, y) = x² + y² − 1. The equation f(x, y) = 0 is that of a circle, of course. If we take any point (a, b) on the circle such that b ≠ 0, then D₂f(a, b) ≠ 0, and the theorem states that we can solve for y in terms of x. The explicit function is given by

y = √(1 − x²)    if b > 0,
y = −√(1 − x²)   if b < 0.
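The implicit function of Example 2 can also be produced numerically, by solving f(x, y) = 0 for y near a point with b > 0; the starting value and grid below are illustration choices, and the result can be compared against the explicit solution √(1 − x²).

```python
import numpy as np

f = lambda x, y: x**2 + y**2 - 1
d2f = lambda x, y: 2 * y           # the nonvanishing partial D_2 f

def g(x, y0=0.8, steps=50):
    """Solve f(x, y) = 0 for y by Newton's method, starting near b > 0."""
    y = y0
    for _ in range(steps):
        y = y - f(x, y) / d2f(x, y)
    return y

xs = np.linspace(-0.5, 0.5, 11)
errors = [abs(g(x) - np.sqrt(1 - x**2)) for x in xs]
print(max(errors))  # the computed g agrees with y = sqrt(1 - x^2)
```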
If on the other hand b = 0, and then necessarily a ≠ 0, then D₁f(a, b) ≠ 0 and we can solve for x in terms of y by similar formulas.
We shall now generalize Theorem 4.3 to the case of functions of several variables.

Theorem 4.4. Let U be open in Rⁿ and let f: U → R be a Cᵖ function on U. Let (a, b) = (a₁, …, a_{n−1}, b) ∈ U and assume that f(a, b) = 0 but Dₙf(a, b) ≠ 0. Then the map

ψ: U → R^{n−1} × R = Rⁿ
given by (x, y) ↦ (x, f(x, y)) is locally Cᵖ-invertible at (a, b).

[Note. We write (a, y) as an abbreviation for (a₁, …, a_{n−1}, y).]

Proof. The proof is basically the same as the proof of Theorem 4.1. The map ψ has coordinate functions x₁, …, x_{n−1} and f. Its Jacobian matrix is therefore

          (   1       0     ⋯     0    )
          (   0       1     ⋯     0    )
J_ψ(x) =  (   ⋮             ⋱     ⋮    )
          ( ∂f/∂x₁  ∂f/∂x₂  ⋯  ∂f/∂xₙ ),

and it is invertible since its determinant is again Dₙf(a, b) ≠ 0. This proves the theorem.

Corollary 4.5. Let S be the set of points P ∈ U such that f(P) = 0. Then there exists an open set U₁ in U containing (a, b) such that ψ(S ∩ U₁) consists of all points (x, 0) with x in some open set V₁ of R^{n−1}.

Proof. Clear, and the same as the corollary of Theorem 4.1.
From Theorem 4.4 one can deduce the implicit function theorem for functions of several variables.

Theorem 4.6. Let U be open in R^n and let f: U → R be a C^p function on U. Let (a, b) = (a_1, …, a_{n−1}, b) ∈ U and assume that f(a, b) = 0 but D_n f(a, b) ≠ 0. Then there exists an open ball V in R^{n−1} centered at a and a C^p function

    g: V → R

such that g(a) = b and f(x, g(x)) = 0 for all x ∈ V.

Proof. The proof is exactly the same as that of Theorem 4.3, except that x = (x_1, …, x_{n−1}) lies in R^{n−1}. There is no need to repeat it.
In Theorem 4.6, we see that the map G given by

    x ↦ (x, g(x)) = G(x),

or, writing down the coordinates,

    (x_1, …, x_{n−1}) ↦ (x_1, …, x_{n−1}, g(x_1, …, x_{n−1})),

gives a parametrization of the hypersurface defined by the equation f(x_1, …, x_{n−1}, y) = 0 near the given point. We may visualize this map as follows:
(Figure: the map G carries the open set V of R^{n−1} onto the surface f(X) = 0.)

On the right we have the surface f(X) = 0, and we have also drawn the gradient at the point P = (a, b) as in the theorem.

We are now in a position to prove a result which had been mentioned previously (Chapter XV, §2 and §4), concerning the existence of differentiable curves passing through a point on a surface. To get such curves, we use our parametrization, and since we have straight lines in any given direction passing through the point a in the open set V of R^{n−1}, all we need to do is map these straight lines into the surface by means of our parametrization G. More precisely:

Corollary 4.7. Let U be open in R^n and let f: U → R be a C^p function. Let P ∈ U and assume that f(P) = 0 but grad f(P) ≠ 0. Let w be a vector of R^n which is perpendicular to grad f(P). Let S be the set of points X such that f(X) = 0. Then there exists a C^p curve

    α: J → S

defined on an open interval J containing the origin such that α(0) = P and α′(0) = w.

Proof. Some partial derivative of f at P is not 0. After renumbering the variables, we may assume that D_n f(P) ≠ 0. By the implicit function theorem, we obtain a parametrization G as described above. We write P
in terms of its coordinates, P = (a, b) = (a_1, …, a_{n−1}, b), so that G(a) = P. Then G′(a) is a linear map

    G′(a): R^{n−1} → R^n.

In fact, for any x = (x_1, …, x_{n−1}) the derivative G′(x) is represented by the matrix

    ( 1           0           …   0             )
    ( 0           1           …   0             )
    ( ⋮                            ⋮             )
    ( 0           0           …   1             )
    ( ∂g/∂x_1     ∂g/∂x_2     …   ∂g/∂x_{n−1}   )

which has rank n − 1. From linear algebra we conclude that the image of G′(a) in R^n has dimension n − 1. Given any vector v in R^{n−1} we can define a curve α in S by letting

    α(t) = G(a + tv).

Then α(0) = G(a) = P. Furthermore, α′(t) = G′(a + tv)v, so that

    α′(0) = G′(a)v.

Thus the velocity vector of α is the image of v under G′(a). The subspace of R^n consisting of all vectors perpendicular to grad f(P) has dimension n − 1. We have already seen (easily) in Chapter XV, §2, that α′(0) is perpendicular to grad f(P). Hence the image of G′(a) is contained in the orthogonal complement of grad f(P). Since these two spaces have the same dimension, they are equal. This proves our corollary.
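As an illustration (my own numerical sketch, not from the text), take f(X) = x^2 + y^2 + z^2 − 1 on R^3 with the parametrization G(x, y) = (x, y, g(x, y)), g(x, y) = √(1 − x^2 − y^2), near P = (0, 0, 1). The curve α(t) = G(a + tv) stays on the sphere, and its velocity at t = 0 is perpendicular to grad f(P):

```python
import math

def g(x, y):
    # upper-hemisphere solution of x^2 + y^2 + z^2 - 1 = 0
    return math.sqrt(1 - x*x - y*y)

def alpha(t, v):
    # the curve alpha(t) = G(a + t v) with a = (0, 0), so P = (0, 0, 1)
    x, y = t * v[0], t * v[1]
    return (x, y, g(x, y))

v = (1.0, 0.0)
h = 1e-6
p_plus, p_minus = alpha(h, v), alpha(-h, v)
vel = tuple((p - m) / (2 * h) for p, m in zip(p_plus, p_minus))
grad = (0.0, 0.0, 2.0)   # grad f at P = (0, 0, 1)
print(vel)                                   # close to (1, 0, 0)
print(sum(a * b for a, b in zip(vel, grad))) # close to 0
```

The velocity α′(0) is (the image under G′(a) of) v, and its dot product with grad f(P) vanishes, as the corollary asserts.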
XVIII, §5. PRODUCT DECOMPOSITIONS

We shall now generalize the results of the preceding section to the general case where dimension plays no role, only the product decompositions. The proofs are essentially the same, linear maps replacing the matrices of partial derivatives.

Theorem 5.1. Let U be open in a product E × F, and let f: U → G be a C^p map. Let (a, b) be a point of U with a ∈ E and b ∈ F. Assume that

    D_2 f(a, b): F → G

is invertible (as continuous linear map). Then the map

    ψ: U → E × G

given by

    (x, y) ↦ (x, f(x, y))

is locally C^p-invertible at (a, b).
Proof. We must compute the derivative ψ′(a, b). Since D_2 f(a, b) is invertible, let us call it A. If we consider the composite map

    A^{−1} ∘ f: U → G → F,

then its second partial derivative will actually be equal to the identity. If we can prove that the map

    (*)    (x, y) ↦ (x, A^{−1} ∘ f(x, y))

is locally invertible at (a, b), then it follows that ψ is locally invertible, because ψ can be obtained by composing the map from (*) with an invertible linear map, namely (v, w) ↦ (v, Aw). This reduces our problem to the case when G = F and D_2 f(a, b) is equal to the identity, which we assume from now on. In that case, the derivative ψ′(a, b) has a matrix representation in terms of partial derivatives, namely

    Dψ(a, b) = ( I_1            0            )
               ( D_1 f(a, b)    D_2 f(a, b)  )

which, since D_2 f(a, b) = I_2, is

    ( I_1            0   )
    ( D_1 f(a, b)    I_2 )

Let μ = D_1 f(a, b). Then the preceding matrix is easily seen to have as inverse the matrix

    (  I_1    0   )
    ( −μ      I_2 )

representing a continuous linear map of E × F → E × F. Thus Dψ(a, b) is invertible and we can apply the inverse mapping theorem to get what we want. Note the exact same pattern of proof as that of the simplest case of Theorem 4.1.

The values of f are now vectors of course. Let c = f(a, b). Then c is an element of G. Let S be the set of all (x, y) ∈ U such that f(x, y) = c. We
view S as a level set of f, with level c. The map ψ is a chart at (a, b), and we see that under this chart we obtain the same kind of straightening out of S locally near (a, b) that we obtained in §4. We formulate it as a corollary.

Corollary 5.2. Let the notation be as in Theorem 5.1. Let f(a, b) = c and let S be the subset of U consisting of all (x, y) such that f(x, y) = c. There exists an open set U_1 of U containing (a, b), and a C^p-isomorphism ψ: U_1 → V_1 × V_2 with V_1 open in E, V_2 open in G, such that

    ψ(S ∩ U_1) = V_1 × {c}.
In the chapter on partial derivatives, we saw that the partial D_2 f(a, b) could be represented by a matrix when we deal with euclidean spaces. Thus in Theorem 5.1, suppose E × F = R^n and write

    R^n = R^{n−m} × R^m.

We have the map

    f: U → R^m

and the isomorphism

    D_2 f(a, b): R^m → R^m.

This isomorphism is represented by the matrix

    ( ∂f_1/∂x_{n−m+1}    …    ∂f_1/∂x_n )
    ( ⋮                         ⋮        )
    ( ∂f_m/∂x_{n−m+1}    …    ∂f_m/∂x_n )

evaluated at (a_1, …, a_n). The last set of coordinates (x_{n−m+1}, …, x_n) plays the role of the (y) in Theorem 5.1. The creepy nature of the coordinates arises first from an undue insistence on the particular ordering of the coordinates (x_1, …, x_n), so that one has to keep track of symbols like n − m + 1;
second, from the non-geometric nature of the symbols, which hide the linear map and identify R^m occurring as a factor of R^n with R^m occurring as the space containing the image of f; third, from the fact that one has to evaluate this matrix at (a_1, …, a_n), and that the notation

    ( ∂f_1/∂a_{n−m+1}    …    ∂f_1/∂a_n )
    ( ⋮                         ⋮        )
    ( ∂f_m/∂a_{n−m+1}    …    ∂f_m/∂a_n )

to denote

    ( D_{n−m+1} f_1(a)    …    D_n f_1(a) )
    ( ⋮                          ⋮         )
    ( D_{n−m+1} f_m(a)    …    D_n f_m(a) )
is genuinely confusing. We were nevertheless duty bound to exhibit these matrices because that's the way they look in the literature. To be absolutely fair, we must confess to feeling at least a certain computational security when faced with matrices which is not entirely apparent in the abstract (geometric) formulation of Theorem 5.1.

Putting in coordinates for R^n and R^m, we can then formulate Theorem 5.1 as follows.

Corollary 5.3. Let a = (a_1, …, a_n) be a point of R^n. Let f_1, …, f_m be C^p functions defined on an open set of R^n containing a. Assume that the Jacobian matrix

    (D_j f_i(a))    (i = 1, …, m and j = n − m + 1, …, n)

is invertible. Then the functions

    x_1, …, x_{n−m}, f_1, …, f_m

form a C^p coordinate system at a.

Proof. This is just another terminology for the result of Theorem 5.1 in the case of R^n = R^{n−m} × R^m.
We obtain an implicit mapping theorem generalizing the implicit function theorem.

Theorem 5.4. Let U be open in a product E × F and let f: U → G be a C^p map. Let (a, b) be a point of U with a ∈ E and b ∈ F. Let f(a, b) = 0. Assume that

    D_2 f(a, b): F → G

is invertible (as continuous linear map). Then there exists an open ball V centered at a in E and a continuous map g: V → F such that g(a) = b and f(x, g(x)) = 0 for all x ∈ V. If V is a sufficiently small ball, then g is uniquely determined, and is of class C^p.

Proof. The existence of g is essentially given by Theorem 5.1. If we denote the inverse map of ψ locally by φ, and note that φ has two components, φ = (φ_1, φ_2), such that

    φ(x, z) = (x, φ_2(x, z)),
then we let g(x) = φ_2(x, 0), exactly as in the proof of Theorem 4.3.

There remains the uniqueness. Suppose g_1, g_2: V → F are such that g_1(a) = g_2(a) = b and

    f(x, g_1(x)) = f(x, g_2(x)) = 0

for all x ∈ V. We know that the map (x, y) ↦ (x, f(x, y)) is locally invertible at (a, b), and in particular is injective. By continuity and the assumption that g_1(a) = g_2(a) = b, we conclude that g_1(V_0) and g_2(V_0) are close to b if V_0 is selected sufficiently small. Hence if points (x, g_1(x)) and (x, g_2(x)) map on the same point (x, 0), we must have g_1(x) = g_2(x). Now let x be any point in V and let w = x − a. Consider the set of those numbers t with 0 ≤ t ≤ 1 such that g_1(a + tw) = g_2(a + tw). This set is not empty. Let s be its least upper bound. By continuity, we have g_1(a + sw) = g_2(a + sw). If s < 1, we can apply the existence and that part of the uniqueness just proved to show that g_1 and g_2 are in fact equal in a neighborhood of a + sw. Hence s = 1, and our uniqueness is proved, as well as the theorem.

Remark. The shrinking lemma gives an explicit converging procedure for finding the implicit mapping g of Theorem 5.4. Indeed, suppose first that D_2 f(a, b) = I. (One can reduce the situation to this case by letting A = D_2 f(a, b) and considering A^{−1} ∘ f instead of f itself.) Let r, s be positive numbers < 1, and let B̄_r(a) be the closed ball of radius r in E centered at a. Similarly for B̄_s(b). Let M be the set of all continuous maps
    α: B̄_r(a) → B̄_s(b)

such that α(a) = b. For each α ∈ M define Tα by

    Tα(x) = α(x) − f(x, α(x)).

It is an exercise to show that for suitable choice of r < s < 1 the map T maps M into itself, and is a shrinking map, whose fixed point is precisely g. Thus starting with any map α, the sequence

    α, Tα, T^2α, …

converges to g uniformly. If D_2 f(a, b) = A is not assumed to be I, then we let f_1 = A^{−1} ∘ f, and T is replaced by the map T_1 such that

    T_1α(x) = α(x) − f_1(x, α(x)) = α(x) − A^{−1} f(x, α(x)).
If the map f is given in terms of coordinates, then D_2 f(a, b) is represented by a partial Jacobian matrix, and its inverse can be computed explicitly in terms of the coordinates.
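This iteration is easy to carry out numerically. The sketch below (my own illustration, not from the text) applies T_1 to f(x, y) = x^2 + y^2 − 1 at (a, b) = (0, 1), where A = D_2 f(0, 1) = 2; for each fixed x near 0 the iterates converge to g(x) = √(1 − x^2):

```python
import math

def implicit_by_contraction(f, A, x, y0=1.0, iters=60):
    # T1(alpha)(x) = alpha(x) - A^{-1} f(x, alpha(x)), applied pointwise in x
    y = y0
    for _ in range(iters):
        y = y - f(x, y) / A
    return y

f = lambda x, y: x*x + y*y - 1.0   # the circle of Example 2
A = 2.0                            # D2 f(0, 1) = 2b at b = 1
for x in [0.0, 0.1, 0.2, 0.3]:
    print(implicit_by_contraction(f, A, x), math.sqrt(1 - x*x))
```

Here the map y ↦ y − f(x, y)/2 has derivative 1 − y, which is small near y = 1, so it is indeed a shrinking map there.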
We now return to the aspect of the situation in Theorem 5.1 concerned with the straightening out of certain subsets. Such subsets have a special name, and we give the general definition concerning them.

Let S be a subset of E. We shall say that S is a submanifold of E if the following condition is satisfied. For each point x ∈ S there exists a C^p-isomorphism

    ψ: U → V_1 × V_2

mapping an open neighborhood U of x in E onto a product of open sets, V_1 in some space F_1 and V_2 in some space F_2, such that

    ψ(S ∩ U) = V_1 × {c}

for some point c in V_2. Thus the chart provides a C^p-change of coordinates so that in the new space, ψ(S ∩ U) appears as a factor in the product. The chart ψ at x gives rise to a map of S,

    ψ|S: S ∩ U → V_1,

simply by restriction; that is, we view ψ as defined only on S. The restriction of such a chart ψ to S ∩ U is usually called a chart for S at x. It gives us a representation of a small piece of S near x as an open subset in some space F_1. Of course, there exist many charts for S at x.

Theorem 5.1, and the theorems of the preceding section, give criteria for the level set of f to be a submanifold, namely that a certain derivative should be invertible. We shall now derive another criterion, starting from a parametrization of the set.

Let E_1 be a closed subspace of E, and let E_2 be another closed subspace. We shall say that E is the direct sum of E_1 and E_2, and we write

    E = E_1 ⊕ E_2,

if the map

    E_1 × E_2 → E    given by    (v_1, v_2) ↦ v_1 + v_2

is an invertible continuous linear map. If this is the case, then every element v of E admits a unique decomposition as a sum

    v = v_1 + v_2    with v_1 ∈ E_1 and v_2 ∈ E_2.
Example 1. We can write R^n as a direct sum of the subspaces R^q × {0} and {0} × R^s if q + s = n.

Example 2. Let F be any subspace of R^n. Let F^⊥ be the subspace of all vectors w ∈ R^n which are perpendicular to all elements of F. Then from linear algebra (using the orthogonalization process) one knows that

    R^n = F ⊕ F^⊥

is a direct sum of F and its orthogonal complement. This type of decomposition is the most useful one when dealing with R^n and a subspace.

Example 3. Let v_1, …, v_q be linearly independent elements of R^n. We can always find (in infinitely many ways if q ≠ n) elements v_{q+1}, …, v_n such that {v_1, …, v_n} is a basis of R^n. Let E_1 be the space generated by v_1, …, v_q and E_2 the space generated by v_{q+1}, …, v_n. Then

    R^n = E_1 ⊕ E_2.

As in Example 2, we may select v_{q+1}, …, v_n so that they are perpendicular to E_1. We can also select them so that they are perpendicular to each other.

When we have a direct sum decomposition E = E_1 ⊕ E_2, then we have projections

    π_1: E → E_1    and    π_2: E → E_2

on the first and second factor respectively, namely π_1(v_1 + v_2) = v_1 and π_2(v_1 + v_2) = v_2 if v_1 ∈ E_1 and v_2 ∈ E_2. When E_2 = E_1^⊥ is the orthogonal complement of E_1, then the projection is the orthogonal projection, as we visualize in the following picture:
(Figure: orthogonal projection of E onto E_1.)
We have drawn the case dim E_1 = 2 and dim E_2 = 1. Such decompositions are useful when considering tangent planes. For instance we may have a piece of a surface, which we may want to project on the first two coordinates, that is on R^2 × {0}; but usually we want to project it on the plane tangent to the surface at a point. We have drawn these projections side by side in the next picture:

(Figure: a piece of a surface, projected on R^2 × {0} and on its tangent plane F_P.)
The tangent plane is not a subspace but the translation of a subspace. We have drawn both the subspace F and its translation F_P consisting of all points w + P with w ∈ F. We have a direct sum decomposition

    R^3 = F ⊕ F^⊥.
Theorem 5.5. Let V be an open set in F and let

    g: V → E

be a C^p map. Let a ∈ V and assume that g′(a): F → E is an invertible continuous linear map between F and a closed subspace E_1 of E. Assume that E admits a direct sum decomposition E = E_1 ⊕ E_2. Then the map

    h: V × E_2 → E

given by

    (x, y) ↦ g(x) + y

is a local C^p-isomorphism at (a, 0).

Proof. We need but to consider the derivative of h, and obtain

    h′(a, 0)(v, w) = g′(a)v + w

for v ∈ F and w ∈ E_2. Let λ = g′(a). Then h′(a, 0) is invertible, because its inverse is given by

    v_1 + v_2 ↦ (λ^{−1} v_1, v_2)    if v_1 ∈ E_1, v_2 ∈ E_2.

We can now apply the inverse mapping theorem to conclude the proof.

From Theorem 5.5, we know that there exist open sets V_1 in F containing a, V_2 in E_2 containing 0, and U in E such that

    h: V_1 × V_2 → U

is a C^p-isomorphism, with inverse ψ: U → V_1 × V_2. Then

    g(x) = h(x, 0).

Let S = g(V_1). Then S is the image of V_1 under g, and is a subset of E parametrized by g in such a way that our chart ψ straightens S out back into V_1 × {0}, that is

    ψ(S) = V_1 × {0}.
We note that Theorems 5.1 and 5.5 describe in a sense complementary aspects of the product situation. In one case we get a product through a map f which essentially causes a projection, and in the other case we obtain the product through a map g which causes an injection. At all times, the analytic language is adjusted so as to make the geometry always visible, without local coordinates.

There is a Jacobian criterion for the fact that D_2 f(a, b) is invertible, as described in Chapter XVII, §7. We can also give a matrix criterion for the hypothesis of Theorem 5.5. Let us consider the case when E = R^n and F = R^m with m ≤ n. Then V is open in R^m and we have a map

    g: V → R^n.
The derivative g′(a) is represented by the actual Jacobian matrix

    J_g = ( ∂g_1/∂a_1    …    ∂g_1/∂a_m )
          ( ⋮                  ⋮         )
          ( ∂g_n/∂a_1    …    ∂g_n/∂a_m )

if a = (a_1, …, a_m) and g(x) = (g_1(x), …, g_n(x)). From linear algebra, we have:
Theorem 5.6. In order that g′(a) give an isomorphism between R^m and a subspace of R^n, it is necessary and sufficient that the Jacobian J_g(a) have rank m.

We won't prove this, which is a standard elementary result of linear algebra. It means that the kernel of the linear map represented by J_g(a) is 0 precisely when this matrix has rank m. Theorem 5.6 gives us computational means to test whether a specific mapping satisfies the condition of Theorem 5.5. Observe that the space R^m is different from its image in R^n under g′(a), and that is the reason why in Theorem 5.5 we took the spaces F and E_1 different. In the special case of R^n, as pointed out before, given the subspace E_1 we can always find some E_2 such that R^n = E_1 ⊕ E_2 is a direct sum decomposition.

Example. Let g: R^2 → R^3 be the map given by

    g(x, y) = (sin x, e^x cos y, sin y).

Then

    J_g(x, y) = ( cos x          0           )
                ( e^x cos y     −e^x sin y   )
                ( 0              cos y       )

and hence

    J_g(0, 0) = ( 1    0 )
                ( 1    0 )
                ( 0    1 )

has rank 2, so that in a neighborhood of (0, 0), the map g parametrizes a subset of R^3 as in the theorem.
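One can check this rank condition numerically (an illustrative sketch of mine, not from the text), approximating J_g(0, 0) by central differences and testing that some 2 × 2 minor is nonzero:

```python
import math

def g(x, y):
    # the map of the Example: g(x, y) = (sin x, e^x cos y, sin y)
    return (math.sin(x), math.exp(x) * math.cos(y), math.sin(y))

def jacobian(x, y, h=1e-6):
    # central differences for the 3 x 2 Jacobian J_g(x, y)
    col_x = [(p - m) / (2 * h) for p, m in zip(g(x + h, y), g(x - h, y))]
    col_y = [(p - m) / (2 * h) for p, m in zip(g(x, y + h), g(x, y - h))]
    return [[col_x[i], col_y[i]] for i in range(3)]

J = jacobian(0.0, 0.0)
# rank 2 means some 2 x 2 minor is nonzero
minors = [J[i][0] * J[j][1] - J[i][1] * J[j][0]
          for i in range(3) for j in range(i + 1, 3)]
print(J)                             # close to [[1, 0], [1, 0], [0, 1]]
print(max(abs(m) for m in minors))   # close to 1, so J_g(0, 0) has rank 2
```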
XVIII, §5. EXERCISES

1. Let f: R^2 → R be a function of class C^1. Show that f is not injective, that is, there must be points P, Q ∈ R^2, P ≠ Q, such that f(P) = f(Q).

2. Let f: R^n → R^m be a mapping of class C^1 with m < n. Assume f′(x_0) is surjective for some x_0. Show that f is not injective. (Actually much more is true, but it's harder to prove.)

3. Let f: R → R be a C^1 function such that f′(x) ≠ 0 for all x ∈ R. Show that f is a C^1-isomorphism of R with the image of f.

4. Let U be open in R^n and let f: U → R^m be C^∞ with f′(x): R^n → R^m surjective for all x in U. Prove that f(U) is open.

5. Let f: R^m → R^n be a C^1 map. Suppose that x ∈ R^m is a point at which Df(x) is injective. Show that there is an open set U containing x such that f(y) ≠ f(x) for all y ∈ U, y ≠ x.

6. Let J = [a, b] be a closed interval and let f: J → R^2 be a map of class C^1. Show that the image f(J) has measure 0 in R^2. By this we mean that given ε, there exists a sequence of squares {S_1, S_2, …} in R^2 such that the area of the square S_k is equal to some number K_k, and we have

    f(J) ⊂ ∪ S_k    and    ∑ K_k < ε.

Generalize this to a map f: J → R^3, in which case measure zero is defined by using cubes instead of squares.

7. Let U be open in R^2 and let f: U → R^3 be a map of class C^1. Let A be a compact subset of U. Show that f(A) has measure 0 in R^3. (Can you generalize this to maps of R^m into R^n when n > m?)

8. Let U be open in R^n and let f: U → R^m be a C^1 map. Assume that m ≤ n and let a ∈ U. Assume that f(a) = 0, and that the rank of the matrix (D_j f_i(a)) is m, if (f_1, …, f_m) are the coordinate functions of f. Show that there exists an open subset U_1 of U containing a and a C^1-isomorphism φ: V_1 → U_1 (where V_1 is open in R^n) such that

    f(φ(x_1, …, x_n)) = (x_{n−m+1}, …, x_n).

9. Let f: R × R → R be a C^1 function such that D_2 f(a, b) ≠ 0, and let g solve the implicit function theorem, so that f(x, g(x)) = 0 and g(a) = b. Show that

    g′(x) = − D_1 f(x, g(x)) / D_2 f(x, g(x)).
10. Generalize Exercise 9, and show that in Theorem 5.4, the derivative of g is given by

    g′(x) = −(D_2 f(x, g(x)))^{−1} ∘ D_1 f(x, g(x)).

11. Let f: R → R be of class C^1 and such that |f′(x)| ≤ c < 1 for all x. Define g: R^2 → R^2 by

    g(x, y) = (x + f(y), y + f(x)).

Show that the image of g is all of R^2.

12. Let f: R^n → R^n be a C^1 map, and assume that |f′(x)| ≤ c < 1 for all x ∈ R^n. Let g(x) = x + f(x). Show that g: R^n → R^n is surjective.

13. Let λ: E → R be a continuous linear map. Let F be its kernel, that is the set of all w ∈ E such that λ(w) = 0. Assume F ≠ E and let v_0 ∈ E, v_0 ∉ F. Let F_1 be the subspace of E generated by v_0. Show that E is a direct sum F ⊕ F_1 (in particular, prove that the map

    (w, r) ↦ w + r v_0

is an invertible linear map from F × R onto E).

14. Let f(x, y) = (x cos y, x sin y). Show that the determinant of the Jacobian of f in the rectangle 1 < x < 2 and 0 < y < 7 is positive. Describe the image of the rectangle under f.

15. Let S be a submanifold of E, and let P ∈ S. If

    ψ_1: U_1 ∩ S → V_1    and    ψ_2: U_2 ∩ S → V_2

are two charts for S at P (where V_1, V_2 are open in spaces F_1, F_2), show that there exists a local isomorphism between V_1 at ψ_1(P) and V_2 at ψ_2(P), mapping ψ_1(P) on ψ_2(P).

16. Let ψ_1: U_1 ∩ S → V_1 be a chart for S at P and let g_1: V_1 → U_1 ∩ S be its inverse mapping. Suppose V_1 is open in F_1, and let x_1 ∈ F_1 be the point such that g_1(x_1) = P. Show that the image of

    g_1′(x_1): F_1 → E

is independent of the chart for S at P. (It is called the subspace of E which is parallel to the tangent space of S at P.)
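As a numerical check of the formula in Exercise 9 (my own sketch, not part of the exercises): with f(x, y) = x^2 + y^2 − 1 and g(x) = √(1 − x^2), the formula predicts g′(x) = −2x/(2g(x)) = −x/g(x), which matches a difference quotient:

```python
import math

g = lambda x: math.sqrt(1 - x*x)   # solves f(x, g(x)) = 0 for f = x^2 + y^2 - 1

h = 1e-6
for x in [0.1, 0.3, 0.5]:
    numeric   = (g(x + h) - g(x - h)) / (2 * h)   # difference quotient for g'
    predicted = -x / g(x)                          # -D1 f / D2 f at (x, g(x))
    print(numeric, predicted)
```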
CHAPTER
XIX
Ordinary Differential Equations
XIX, §1. LOCAL EXISTENCE AND UNIQUENESS

We link here directly with the shrinking lemma, and this section may be read immediately after the first section of the preceding chapter.

We defined a vector field previously over an open set of R^n. We don't need coordinates here, so we repeat the definition. We continue to assume that E, F are euclidean spaces, and what we say holds more generally for complete normed vector spaces.

Let U be open in E. By a vector field on U we mean a map f: U → E. We view this as associating a vector f(x) ∈ E to each point x ∈ U. We say the vector field is of class C^p if f is of class C^p. We assume p ≥ 1 throughout, and the reader who does not like p ≥ 2 can assume p = 1.

Let x_0 ∈ U and let f: U → E be a vector field (assumed to be of class C^p throughout). By an integral curve for the vector field, with initial condition x_0, we mean a mapping

    α: J → U

defined on some open interval J containing 0, such that α is differentiable, α(0) = x_0, and

    α′(t) = f(α(t))

for all t ∈ J. We view α′(t) as an element of E (this is the case of maps from numbers to vectors). Thus an integral curve for f is a curve whose velocity vector at each point is the vector associated to the point by the vector field. If one thinks of a vector field as associating an arrow to each point, then an integral curve looks like this:
(Figure: an integral curve threading through the arrows of a vector field.)
Remark. Let α: J → U be a continuous map satisfying the condition

    α(t) = x_0 + ∫_0^t f(α(u)) du.

Then α is differentiable, and its derivative is α′(t) = f(α(t)). Hence α is of class C^1 and is an integral curve for f. Conversely, if α is an integral curve for f with initial condition x_0, then α obviously satisfies our integral equation, since indefinite integrals of a continuous map differ by a constant, and the initial condition determines this constant uniquely.

Thus to find an integral curve, we shall have to solve the preceding integral equation. This will be a direct consequence of the shrinking lemma.

Theorem 1.1. Let U be open in E and let f: U → E be a C^1 vector field. Let x_0 ∈ U. Then there exists an integral curve α: J → U with initial condition x_0. If J is sufficiently small, this curve is uniquely determined.

Proof. Let a be a number > 0 and let B_a be the open ball of radius a centered at x_0. We select a sufficiently small so that f is bounded by a number C on B_a. We can do this because f is continuous. Furthermore, we select a so small that f′ is bounded by a constant K > 1 on the closed ball B̄_a. Again we use the continuity of f′. Now select b > 0 such that bC < a and also bK < 1. Let I_b be the closed interval [−b, b]. Let M be the set of all continuous maps

    α: I_b → B̄_a

such that α(0) = x_0. Then M is closed in the space of all bounded maps with the sup norm. For each α ∈ M define a map Sα by

    (Sα)(t) = x_0 + ∫_0^t f(α(u)) du.
We contend that Sα lies in M. First, it is clear that (Sα)(0) = x_0 and that Sα is continuous. Next, for all t ∈ I_b,

    |Sα(t) − x_0| ≤ bC < a,

so Sα ∈ M. Finally, for α, β ∈ M we have

    Sα(t) − Sβ(t) = ∫_0^t (f(α(u)) − f(β(u))) du,

whence by the mean value theorem,

    |Sα(t) − Sβ(t)| ≤ bK sup_{u ∈ I_b} |α(u) − β(u)| ≤ bK ‖α − β‖.
This proves that S is a shrinking map, and by the shrinking lemma, S has a unique fixed point α, that is Sα = α. This means that α satisfies the integral equation which makes it an integral curve of f, as was to be shown.

We shall be interested in a slightly more general situation, and for future reference, we state explicitly the relationship between the constants which appeared in the proof of Theorem 1.1. These are designed to yield uniformity results later. Let V be an open set in some space, and let

    f: V × U → E

be a map defined on the product of V with some set U. We say that f satisfies a Lipschitz condition on U uniformly with respect to V if there exists a number K > 0 such that

    |f(v, x) − f(v, y)| ≤ K |x − y|
for all v ∈ V and x, y ∈ U. We call K a Lipschitz constant. If f is of class C^1, then the mean value theorem shows that f is Lipschitz on some open neighborhood of a given point (v_0, x_0) in V × U, and continuity shows that f itself is bounded on such a neighborhood. It is clear that in the proof of Theorem 1.1, only a Lipschitz condition intervened. The mean value theorem was used only to deduce such a condition. Thus a Lipschitz condition is the natural one to take in the present situation. Furthermore, suppose that we find integral curves through each point x of U. Then these curves depend on two variables, namely the variable t
(interpreted as a time variable), and the variable x itself, the initial condition. Thus we should really write our integral curves as depending on these two variables. We define a local flow for f at x_0 to be a mapping

    α: J × U_0 → U

where J is some open interval containing 0, and U_0 is an open subset of U containing x_0, such that for each x in U_0 the map

    t ↦ α_x(t) = α(t, x)

is an integral curve for f with initial condition x, i.e. such that α(0, x) = x.

As a matter of notation, we have written α_x to indicate that we view x as a certain parameter. In general, when dealing with maps with two arguments, say φ(t, x), we denote the separate mappings in each argument, when the other is kept fixed, by φ_x(t) or φ_t(x). The choice of letters and the context will always be carefully specified to prevent ambiguity. The derivative of the integral curve will always be viewed as vector valued, since the curve maps numbers into vectors. Furthermore, when dealing with flows, we sometimes use the notation

    α′(t, x)

to mean D_1 α(t, x), and do not use the symbol ′ for any other derivative except the partial derivative with respect to t, leaving other variables fixed. Thus

    α′(t, x) = α_x′(t) = D_1 α(t, x)

by definition. All other partials (if they exist) will be written in their correct notation, that is D_2, …, and total derivatives will be denoted by D as usual.

Example. Let U = E be the whole space, and let g be a constant vector field, say g(x) = v ≠ 0 for all x ∈ U. Then the flow α is given by

    α(t, x) = x + tv.

Indeed, D_1 α(t, x) = v, and since an integral curve with initial condition α(0, x) = x is uniquely determined, it follows that the flow is precisely the one we have written down. The integral curves look like straight lines. In Exercise 4, we shall indicate how to prove that this is essentially the most general situation locally, up to a change of charts.
We shall raise the question later whether the second partial D_2 α(t, x) exists. It will be proved as the major result of the rest of this chapter that whenever f is C^p, then α itself, as a flow depending on both t and x, is also of class C^p.

Finally, to fix one more notation, we let I_b be the closed interval [−b, b] and we let J_b be the open interval −b < t < b. If a > 0 we let B_a(x) be the open ball of radius a centered at x, and we let B̄_a(x) be the closed ball of radius a centered at x.

The next theorem is practically the same as Theorem 1.1, but we have carefully stated the hypotheses in terms of a Lipschitz condition, and of the related constants. We also consider a time-dependent vector field. By this we mean a map

    f: J × U → E

where J is some open interval containing 0. We think of f(t, x) as a vector associated with x, also depending on time t. An integral curve for such a time-dependent vector field is a differentiable map

    α: J_0 → U

defined on an open interval J_0 containing 0 and contained in J, such that

    α′(t) = f(t, α(t)).

As before, α(0) is called the initial condition of the curve. We shall need time-dependent vector fields for applications in §4. We also observe that if f is continuous then α is of class C^1, since α′ is the composite of continuous maps. By induction, one concludes that if f is of class C^p then α is of class C^{p+1}.

We shall consider a flow for this time-dependent case also, so that we view a flow as a map

    α: J_0 × U_0 → U

where U_0 is an open subset of U containing x_0 and J_0 is as above, so that for each x the curve

    t ↦ α(t, x)

is an integral curve with initial condition x (i.e. α(0, x) = x).

Theorem 1.2. Let J be an open interval containing 0. Let U be open in E. Let x_0 ∈ U. Let 0 < a < 1 be such that the closed ball B̄_{2a}(x_0) is contained in U. Let
    f: J × U → E

be a continuous map, bounded by a constant C > 0 and satisfying a Lipschitz condition on U with Lipschitz constant K > 0, uniformly with respect to J. If b < a/C and b < 1/K, then there exists a unique flow

    α: J_b × B_a(x_0) → U.

If f is of class C^p, then so is each integral curve α_x.
such that ex(O) = x. Then M is closed in the space of bounded maps under the sup norm. For each ex E M we define Sex by
Sex(t) = x
+ Lf(U, ex(u»
duo
Then Sex is certainly continuous and we have Scx(O) = x. Furthermore,
ISex(t) - xl ;;;; bC < a so that Sex(t) E
B2 .(xo) and Sex lies in M.
Finally for ex, P E M we have
IISex - SPII ;;;; b sup If(u, ex(u» - f(u, P(u»1 ue1b
;;;; bKllex - PII. This proves that S is a shrinking map, and hence S has a unique fixed point which is the desired integral curve. This integral curve satisfies ex(O) = x, and so depends on X. We denote it by exx ' and we can define ex(t, x) = exit), as a function of the two variables t and x. Then ex is a flow. This proves our theorem. Remark 1. There is no particular reason why we should require the integral curve to be defined on an interval containing 0 such that ex(O) = Xo. One could define integral curves over an arbitrary interval (open) and prescribe ex(t o) = Xo for some point to in such an interval. The existence and uniqueness of such curves locally follows either directly by the same method, writing
ex(t) = ex(t o) +
r' flu, ex(u» J,.
du,
or as a corollary of the other theorem, noting that an interval containing to can always be translated from an interval containing O.
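The proof is constructive: the fixed point of S can be approximated by iterating S from any starting map. A small sketch (my own, with the integral discretized by the trapezoid rule) for the field f(t, x) = x, whose integral curve through x_0 = 1 is e^t:

```python
import math

def picard(f, x0, t_grid, iters=25):
    # iterate (S alpha)(t) = x0 + integral_0^t f(u, alpha(u)) du,
    # with the integral approximated by the trapezoid rule on t_grid
    alpha = [x0] * len(t_grid)
    for _ in range(iters):
        new, acc = [x0], 0.0
        for k in range(1, len(t_grid)):
            dt = t_grid[k] - t_grid[k - 1]
            acc += 0.5 * dt * (f(t_grid[k - 1], alpha[k - 1]) + f(t_grid[k], alpha[k]))
            new.append(x0 + acc)
        alpha = new
    return alpha

ts = [i / 100.0 for i in range(51)]   # the interval [0, 1/2], so b = 1/2
sol = picard(lambda t, x: x, 1.0, ts)
print(sol[-1], math.exp(0.5))         # both close to 1.6487
```

Here K = 1 and b = 1/2, so each application of S cuts the sup-norm error roughly in half, exactly the shrinking estimate of the proof.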
Combining Ihe local uniqueness wilh a simple leasl upper bound argumenl, we shall obtain the global uniqueness of integral curves. Theorem 1.3. Let f: J x U the open set U of E. Let
-+
E be a time-dependent vector field over
and be IWO integral curves wilh Ihe same initial condition are equal on J, n J 2 .
Xo'
Then a, and a2
= ait) for < b. Then T contains some b > 0 by the local uniqueness theorem. If T is not bounded from above, the equality of a,(1) and a2(1) for all t > 0 follows at once. If T is bounded from above, let b be ils least upper bound. We must show that b is the right end point of J, n J 2' Suppose this is nOI the case. Define curves P" P2 near 0 by Proof Let T be the set of numbers b such Ihat a, (t)
o~ t
and Then PI' P2 are integral curves of f with the initial conditions a,(b) and aib) respectively. The values P,(I) and P2(1) are equal for small negative t because b is a least upper bound of T. By continuity it follows Ihat a,(b) = a2(b), and finally we see from the local uniqueness theorem that P,(t) = P2(1) for all t in some neighborhood of 0, whence a, and a2 are equal in a neighborhood of b, contradicting the fact that b is a least upper bound of T. We can argue in the same way toward the left end points, and thus prove the theorem. It follows from Theorem 1.3 that the union of the domains of all integral curves of f with a given initial condilion Xo is an open interval which we denote by J(xo). Its end points are denoted by 1+(Xo) and C(xo) respectively. We allow by convention + 00 and - 00 as end points. Let !iP(f) be the subset of R x U consisting of all points (t, x) such that C(x) < t < t+(x).
A global flow for f is a mapping

α: 𝔇(f) → U

such that for each x ∈ U the partial map α_x: J(x) → U, given by α_x(t) = α(t, x),
[XIX, §1]   LOCAL EXISTENCE AND UNIQUENESS   545
defined on the open interval J(x), is an integral curve for f with initial condition x. We define 𝔇(f) to be the domain of the flow. We shall see in §4 that 𝔇(f) is open, and that if f is Cᵖ then the flow α is also Cᵖ on its domain.

Remark 2. A time-dependent vector field may be viewed as a time-independent vector field on some other space. Indeed, let f be as in Theorem 1.2. Define

f̄: J × U → R × E

by

f̄(t, x) = (1, f(t, x)),

and view f̄ as a time-independent vector field on J × U. Let ᾱ be its flow, so that

D₁ᾱ(t, s, x) = f̄(ᾱ(t, s, x)),   ᾱ(0, s, x) = (s, x).

We note that ᾱ has its values in J × U, and thus can be expressed in terms of two components. In fact, it follows at once that we can write ᾱ in the form

ᾱ(t, s, x) = (t + s, ᾱ₂(t, s, x)).

Then ᾱ₂ satisfies the differential equation

D₁ᾱ₂(t, s, x) = f(t + s, ᾱ₂(t, s, x)),

as we see from the definition of f̄. Let

β(t, x) = ᾱ₂(t, 0, x).

Then β is a flow for f, i.e. satisfies the differential equation

D₁β(t, x) = f(t, β(t, x)),   β(0, x) = x.

Given x ∈ U, any value of t such that α is defined at (t, x) is also such that ᾱ is defined at (t, 0, x), because α_x and β_x are integral curves of the same vector field, with the same initial condition, hence are equal. Thus the study of time-dependent vector fields is reduced to the study of time-independent ones.
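The reduction in Remark 2 can be checked numerically. This is an illustrative sketch, not part of the text: the sample field f(t, x) = tx and the Euler discretization are assumptions, chosen only to show that integrating the time-independent field f̄(t, x) = (1, f(t, x)) and projecting onto the second component gives the same curve as integrating f directly.

```python
# Remark 2 in code: a time-dependent field f(t, x) becomes the
# time-independent field fbar(t, x) = (1, f(t, x)) on J x U.  Euler
# steps for fbar, projected onto the second component, reproduce the
# Euler steps for f itself.

def euler(field, y0, h, steps):
    y = list(y0)
    for _ in range(steps):
        v = field(y)
        y = [yi + h * vi for yi, vi in zip(y, v)]
    return y

f = lambda t, x: t * x                 # sample time-dependent field on R

# time-independent version on R x R, with state (t, x)
fbar = lambda y: [1.0, f(y[0], y[1])]

h, steps = 0.001, 1000                 # integrate from t = 0 to t = 1

# direct Euler for x' = f(t, x), tracking t by hand
x, t = 1.0, 0.0
for _ in range(steps):
    x += h * f(t, x)
    t += h

# Euler for the autonomous field, then project
t1, x1 = euler(fbar, [0.0, 1.0], h, steps)

assert abs(t1 - 1.0) < 1e-9            # first component just tracks time
assert abs(x1 - x) < 1e-9              # second component is the curve
```

The two computations perform the same arithmetic in the same order, which is exactly the content of the remark: nothing is gained or lost by making the field autonomous.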
Remark 3. One also encounters vector fields depending on parameters, as follows. Let V be open in some space F and let

g: J × V × U → E

be a map which we view as a time-dependent vector field on U, also depending on parameters in V. We define

G: J × V × U → F × E

by

G(t, z, y) = (0, g(t, z, y))

for t ∈ J, z ∈ V, and y ∈ U. This is now a time-dependent vector field on V × U. A local flow for G depends on three variables, say β(t, z, y), with initial condition β(0, z, y) = (z, y). The map β has two components, and it is immediately clear that we can write

β(t, z, y) = (z, α(t, z, y))

for some map α depending on three variables. Consequently α satisfies the differential equation

D₁α(t, z, y) = g(t, z, α(t, z, y)),   α(0, z, y) = y,

which gives the flow of our original vector field g depending on the parameters z ∈ V. This procedure reduces the study of differential equations depending on parameters to those which are independent of parameters.
XIX, §1. EXERCISES

1. Let f be a C¹ vector field on an open set U in E. If f(x₀) = 0 for some x₀ ∈ U, if α: J → U is an integral curve for f, and there exists some t₀ ∈ J such that α(t₀) = x₀, show that α(t) = x₀ for all t ∈ J. (A point x₀ such that f(x₀) = 0 is called a critical point of the vector field.)

2. Let f be a C¹ vector field on an open set U of E. Let α: J → U be an integral curve for f. Assume that all numbers t > 0 are contained in J, and that there is a point P in U such that lim_{t→∞} α(t) = P. Prove that f(P) = 0. (Exercises 1 and 2 have many applications, notably when f = grad g for some function g. In this case we see that P is a critical point of the vector field.)

3. Let U be open in Rⁿ and let g: U → R be a function of class C². Let x₀ ∈ U and assume that x₀ is a critical point of g (that is, g′(x₀) = 0). Assume also that D²g(x₀)
is negative definite. By definition, take this to mean that there exists a number c > 0 such that for all vectors v we have

D²g(x₀)(v, v) ≦ −c|v|².

Prove that if x₁ is a point in the ball B_r(x₀) of radius r, centered at x₀, and if r is sufficiently small, then the integral curve α of grad g having x₁ as initial condition is defined for all t ≧ 0, and

lim_{t→∞} α(t) = x₀.

[Hint: Let ψ(t) = (α(t) − x₀)·(α(t) − x₀) be the square of the distance from α(t) to x₀. Show that ψ is strictly decreasing, and in fact satisfies

ψ′(t) ≦ −2c₁ψ(t),

where c₁ > 0 is near c, and is chosen so that

D²g(x)(v, v) ≦ −c₁|v|²

for all x in a sufficiently small neighborhood of x₀. Divide by ψ(t) and integrate to see that

log ψ(t) − log ψ(0) ≦ −2c₁t.

Alternatively, use the mean value theorem on ψ(t₂) − ψ(t₁) to show that this difference has to approach 0 when t₁ < t₂ and t₁, t₂ are large.]

4. Let U be open in E and let f: U → E be a C¹ vector field on U. Let x₀ ∈ U and assume that f(x₀) = v ≠ 0. Let α be a local flow for f at x₀. Let F be a subspace of E which is complementary to the one-dimensional space generated by v, that is the map

R × F → E

given by (t, y) ↦ tv + y is an invertible continuous linear map.

(a) If E = Rⁿ, show that such a subspace exists.
(b) Show that the map p: (t, y) ↦ α(t, x₀ + y) is a local C¹ isomorphism at (0, 0). You may assume that D₂α exists and is continuous, and that D₂α(0, x) = id. This will be proved in §4. Compute Dp in terms of D₁α and D₂α.
(c) The map q: (t, y) ↦ x₀ + y + tv is obviously a C¹ isomorphism, because it is composed of a translation and an invertible linear map. Define locally at x₀ the map φ by φ = p ∘ q⁻¹, so that by definition,

φ(x₀ + y + tv) = α(t, x₀ + y).

Using the chain rule, show that for all x near x₀ we have

Dφ(x)v = f(φ(x)).
If we view φ as a change of chart near x₀, then this result shows that the vector field f, when transported by this change of chart, becomes a constant vector field with value v. Thus near a point where a vector field does not vanish, we can always change the chart so that the vector field is straightened out. This is illustrated in the following picture:

[Figure: the flow, drawn horizontal on the left where the vector field is constant, and its curved image under the change of chart on the right.]

In general, suppose φ: U₀ → V₀ is a C¹ isomorphism. We say that a vector field g on U₀ and a vector field f on V₀ correspond to each other under φ, or that f is transported to V₀ by φ, if we have the relation

f(φ(x)) = Dφ(x)g(x),

which can be regarded as coming from the following diagram:

[Diagram: g on U₀ and f on V₀, related through φ: U₀ → V₀ and its derivative Dφ.]

In the special case of our exercise, g is the constant map such that g(x) = v for all x ∈ U₀.
XIX, §2. APPROXIMATE SOLUTIONS

As before, we let f: J × U → E be a time-dependent vector field on U. We now investigate the behavior of the flow with respect to its second argument, i.e. with respect to the points of U. Let J₀ be an open subinterval of J containing 0, and let

φ: J₀ → U

be of class C¹. We shall say that φ is an ε-approximate integral curve of f on J₀ if

|φ′(t) − f(t, φ(t))| ≦ ε

for all t in J₀.
Theorem 2.1. Let φ₁, φ₂ be ε₁- and ε₂-approximate integral curves of f on J₀ respectively, and let ε = ε₁ + ε₂. Assume that f is Lipschitz with constant K on U uniformly in J₀, or that D₂f exists and is bounded by K on J × U. Let t₀ be a point of J₀. Then for any t in J₀ we have

|φ₁(t) − φ₂(t)| ≦ |φ₁(t₀) − φ₂(t₀)| e^{K|t−t₀|} + (ε/K)(e^{K|t−t₀|} − 1).

Proof. By assumption we have

|φ₁′(t) − f(t, φ₁(t))| ≦ ε₁,
|φ₂′(t) − f(t, φ₂(t))| ≦ ε₂.

From this we get

|φ₁′(t) − φ₂′(t) − [f(t, φ₁(t)) − f(t, φ₂(t))]| ≦ ε.

Say t ≧ t₀, so that we don't have to put absolute value signs around t − t₀. Let

ψ(t) = |φ₁(t) − φ₂(t)|,   ω(t) = |f(t, φ₁(t)) − f(t, φ₂(t))|.

We have

φ₁(t) − φ₂(t) = φ₁(t₀) − φ₂(t₀) + ∫_{t₀}^{t} [(φ₁′ − φ₂′)(u) − (f(u, φ₁(u)) − f(u, φ₂(u)))] du + ∫_{t₀}^{t} [f(u, φ₁(u)) − f(u, φ₂(u))] du,

whence

|ψ(t) − ψ(t₀)| ≦ ε(t − t₀) + ∫_{t₀}^{t} ω(u) du ≦ ε(t − t₀) + K ∫_{t₀}^{t} ψ(u) du,

and finally the relation

ψ(t) ≦ ψ(t₀) + K ∫_{t₀}^{t} [ψ(u) + ε/K] du.
On any closed subinterval of J₀, our map ψ is bounded. If we add ε/K to both sides of the last relation, then we see that our theorem follows from the next lemma.
Lemma 2.2. Let g be a positive real valued function on an interval, bounded by a number B. Let t₀ be in the interval, say t₀ ≦ t, and assume that there are numbers C, K ≧ 0 such that

g(t) ≦ C + K ∫_{t₀}^{t} g(u) du.

Then for all integers n ≧ 1 we have

g(t) ≦ C[1 + K(t − t₀)/1! + ⋯ + K^{n−1}(t − t₀)^{n−1}/(n − 1)!] + BKⁿ(t − t₀)ⁿ/n!.
Proof The statement is an assumption for n = I. We proceed by induction. We integrate from to to t, multiply by K and use the recurrence relation. The statement with n + 1 then drops out of the statement with n. Theorem 2.1 will be applied immediately to obtain a continuity result for a flow depending on its second variable. If x is close to xo, then the integral curve with initial condition x may be seen as an approximate integral curve with respect to Xo and the estimates of Theorem 2.1 will yield: Corollary 2.3. Let f: J x V -+ E be concinuous, and satisfy a Lipschitz condition on U uniformly with respect co J. Let X o be a poinc of U. Then there exists an open subincerval J 0 of J containing 0, and an open subset U 0 of U cOlllaining X o such that f has a unique flow
a: J o x V o -+ V. We can select J 0 and U 0 such that a is continuous, and satisfies a Lipschitz condition on J 0 x U o. Proof Given x, yE V, we let <'P,(t) = a(t, x) and <'P2(t) = a(t,y) be defined on the Jo x Uo obtained in Theorem 1.2. Then we can apply Theorem 2. I with 0' = 02 = O. For s, I E Jo we obtain la(t, x) - a(s, Y)I ~ la(t, x) - a(t, Y)I
~
+ la(t, y)
- a(s, y)1
Ix - y1e K1/! + II - siB
if we take J 0 of small length and B is a bound for f Indeed, we estimate the first term using Theorem 2.1 with to = O. We estimate the second term using the integral expression for the integral curve and the bound B for f This proves the corollary.
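The estimate of Theorem 2.1 can be tested on a concrete pair of curves. The following is an illustrative sketch, not part of the text: the field f(t, x) = x (so K = 1), the exact solution e^t as one curve, and a hand-made ε-approximate curve as the other are all assumptions.

```python
# Checking the estimate of Theorem 2.1 for f(t, x) = x, which is
# Lipschitz with K = 1.  phi1(t) = e^t is an exact solution; the
# perturbed curve phi2(t) = e^t + (eps/2) t satisfies
# |phi2'(t) - phi2(t)| = (eps/2)|1 - t| <= eps on [0, 1],
# so it is an eps-approximate integral curve.  Since phi1(0) = phi2(0),
# the theorem gives |phi1(t) - phi2(t)| <= (eps/K)(e^{Kt} - 1).
import math

K, eps = 1.0, 0.01
phi1 = lambda t: math.exp(t)
phi2 = lambda t: math.exp(t) + 0.5 * eps * t

for i in range(101):
    t = i / 100.0
    diff = abs(phi1(t) - phi2(t))
    bound = (eps / K) * (math.exp(K * t) - 1.0)
    assert diff <= bound + 1e-12       # the theorem's inequality holds
```

Here the actual separation grows linearly while the bound grows exponentially, so the inequality holds with plenty of room; the bound is sharp only for worst-case fields.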
Next we consider the problem of determining the largest possible interval over which an integral curve can be defined. There are two possible reasons why an integral curve cannot be defined over all of R, or say for all t ≧ 0. The first one is that as the curve proceeds along, it is tending toward a point at the boundary of the open set U, but not in U. The curve is thus prevented from reaching this point a priori. One can create this situation artificially. For instance, suppose we have a vector field on E itself and a perfectly reasonable integral curve defined on all of R. Let P be a point of E, and suppose that the integral curve has initial condition x₀ and passes through P, so that α(t₁) = P for some t₁. Let U be the open set obtained from E by deleting P. If we view our vector field now on U, it is clear that the integral curve starting at x₀ cannot be extended beyond t₁ as an integral curve on U, and that as t → t₁, we have α(t) → P. A situation like the above may arise naturally. One can visualize it as in the following picture:

[Figure: an integral curve in U tending to the deleted point P.]

The second reason why an integral curve cannot be extended to all of R is that, as the curve proceeds along, the vector field becomes unbounded, and the curve speeds up so rapidly that it has no time to reach certain numbers of R. The next result states that these are the only possibilities which may prevent a curve from being extendable past a certain point.
Theorem 2.4. Let J be an open interval (a, b) and let U be open in E. Let f: J × U → E be a continuous map which is Lipschitz on U uniformly for every compact subinterval of J. Let α be an integral curve of f, defined on a maximal open subinterval (a₀, b₀) of J. Assume:

(i) There exists ε > 0 such that the closure of α((b₀ − ε, b₀)) is contained in U.
(ii) There exists a number C > 0 such that |f(t, α(t))| ≦ C for all t in (b₀ − ε, b₀).

Then b₀ = b.
Proof. Suppose b₀ < b. From the integral expression for α, namely

α(t) = α(t₀) + ∫_{t₀}^{t} f(u, α(u)) du,

we see that for t₁, t₂ in (b₀ − ε, b₀) we have

|α(t₁) − α(t₂)| ≦ C|t₁ − t₂|.

This is the Cauchy criterion, and hence the limit

lim_{t→b₀} α(t)

exists and is equal to an element x₀ of U by hypothesis (i). By the local existence theorem, there exists an integral curve β defined on an open interval containing b₀ such that β(b₀) = x₀ and β′(t) = f(t, β(t)). Then β′ = α′ on an open interval to the left of b₀, and hence α, β differ by a constant on this interval. Since their limits as t → b₀ are equal, this constant is 0. Thus we have extended the domain of definition of α to a larger interval, as was to be shown.
Remark. Theorem 2.4 has an analogue giving a criterion for the integral curve to be defined all the way to the left end point of J, and we shall use Theorem 2.4 in both contexts as a criterion for the integral curve to be defined on all of J.
XIX, §3. LINEAR DIFFERENTIAL EQUATIONS

We shall consider a special case of differential equations, both for its own sake and for applications to the general case afterwards. We let L be a vector space as usual, which in applications will be a space of continuous linear maps. We let E be some space, and assume given a product

L × E → E,

written (λ, w) ↦ λw, that is a bilinear map satisfying the condition |λw| ≦ |λ||w|. Let J be an open interval, and let

A: J → L

be a continuous map. We consider the differential equation

λ′(t) = A(t)λ(t)
corresponding to the time-dependent vector field on E given by (t, w) ↦ A(t)w. In the applications, we have two cases:

1. The product given by composition of mappings, namely

L(E, E) × L(E, E) → L(E, E)

for some space E, so that λw = λ ∘ w for λ, w ∈ L(E, E).

2. The product given by applying linear maps to vectors, namely

L(E, E) × E → E.

In the first case, suppose E = Rⁿ. Then we can think of A(t) as an n × n matrix, and of the solution as an n × n matrix also, say B(t), so that our differential equation can be written

B′(t) = A(t)B(t),

the product being multiplication of matrices. In the second case, we think of λ(t) as a curve in Rⁿ, which we write X(t), and the differential equation looks like

X′(t) = A(t)X(t),

or in terms of coordinates,

x′ᵢ(t) = aᵢ₁(t)x₁(t) + ⋯ + aᵢₙ(t)xₙ(t).

It is clear that the solutions of our differential equation λ′(t) = A(t)λ(t) form a vector space. One of the main facts which is always true in this linear case is that the integral curves are defined on the full interval J. This will be proved below, and we consider the slightly more general case when the equation depends on parameters.
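As a concrete instance of the second case, one can discretize X′(t) = A(t)X(t) directly. This sketch is not from the text: the constant matrix A = [[0, 1], [−1, 0]] and the Euler scheme are illustrative assumptions; for this A the solution through (1, 0) is (cos t, −sin t).

```python
# A concrete instance of X'(t) = A(t)X(t) in R^2, with the constant
# matrix A = [[0, 1], [-1, 0]]: the system x1' = x2, x2' = -x1.
# Its solution through X(0) = (1, 0) is (cos t, -sin t), and plain
# Euler steps approximate that integral curve.
import math

def step(x, h):
    # one Euler step of X' = A X with A = [[0, 1], [-1, 0]]
    return [x[0] + h * x[1], x[1] - h * x[0]]

x, h, n = [1.0, 0.0], 1e-4, 10_000     # integrate from t = 0 to t = 1
for _ in range(n):
    x = step(x, h)

assert abs(x[0] - math.cos(1.0)) < 1e-3
assert abs(x[1] + math.sin(1.0)) < 1e-3
```

That the discretized solutions of a fixed linear equation can be added and scaled coordinatewise mirrors the remark above that the exact solutions form a vector space.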
Theorem 3.1. Let J be an open interval of R containing 0, and let V be open in some space. Let

A: J × V → L

be a continuous map, and let L × E → E be a product. Let w₀ be a fixed element of E. Then there exists a unique map

λ: J × V → E
which, for each x ∈ V, is a solution of the differential equation

λ′(t, x) = A(t, x)λ(t, x),   λ(0, x) = w₀.
This map λ is continuous.

Proof. Let us first fix x ∈ V. Consider the differential equation

λ′(t, x) = A(t, x)λ(t, x)

with initial condition λ(0, x) = w₀. This is a differential equation on E, with time-dependent vector field f given by

f(t, v) = A(t, x)v

for v ∈ E. We want to prove that the integral curve is defined on all of J, and for this we shall use Theorem 2.4. Suppose that t ↦ λ(t, x) is not defined on all of J. We look to the right, and let b₀ be the right end point of a maximal subinterval of J on which it is defined. If J has a right end point b, then b₀ < b. (Of course, if J goes to infinity on the right, there is no b.) Now the map t ↦ A(t, x) is bounded on every compact subinterval of J. In particular, we see that our vector field satisfies the Lipschitz condition of Theorem 2.4. Condition (i) is also satisfied, trivially, because our vector field is defined on the entire space E. This leaves condition (ii) to verify. We omit the index x for simplicity of notation, and on the interval 0 ≦ t < b₀ we have

λ(t) = w₀ + ∫₀ᵗ A(u)λ(u) du,

so that

|λ(t)| ≦ |w₀| + K ∫₀ᵗ |λ(u)| du,

where K is a bound for the map t ↦ A(t) on the compact interval [0, b₀]. By Theorem 2.1, it follows that λ is bounded on the interval 0 ≦ t < b₀, whence

f(t, λ(t)) = A(t)λ(t)

is bounded on this interval. Thus condition (ii) is satisfied, and our assumption that b₀ is not the right end point of J is contradicted. This proves that λ is defined on all of J.
We now consider λ as a map with two variables, t and x, and shall prove its continuity, say at a point (t₀, x₀). Let c > 0 be so small that the interval I = [t₀ − c, t₀ + c] is contained in J. Let V₁ be an open ball centered at x₀ and contained in V such that A is uniformly continuous and bounded on I × V₁. (The existence of this ball is an immediate consequence of the compactness of I; cf. Lemma 8.1 of Chapter XVII, §8, where this is proved in detail.) For (t, x) ∈ I × V₁ we have

|λ(t, x) − λ(t₀, x₀)| ≦ |λ(t, x) − λ(t, x₀)| + |λ(t, x₀) − λ(t₀, x₀)|.
The second term on the right is small when t is close to t₀ because λ is continuous, being differentiable. We investigate the first term on the right, and shall estimate it by viewing λ(t, x) and λ(t, x₀) as approximate integral curves of the differential equation satisfied by λ(t, x). We find:

|λ′(t, x₀) − A(t, x)λ(t, x₀)| ≦ |λ′(t, x₀) − A(t, x₀)λ(t, x₀)| + |A(t, x₀)λ(t, x₀) − A(t, x)λ(t, x₀)|
               ≦ |A(t, x₀) − A(t, x)| |λ(t, x₀)|.

By the uniform continuity of A and the fact that λ(t, x₀) is bounded for t in the compact interval I, we conclude: given ε, there exists δ such that if |x − x₀| < δ, then

|λ′(t, x₀) − A(t, x)λ(t, x₀)| ≦ ε

for all t ∈ I. Therefore λ(t, x₀) is an ε-approximate integral curve of the differential equation satisfied by λ(t, x). We apply Theorem 2.1 to the two curves

φ₀(t) = λ(t, x₀)   and   φ_x(t) = λ(t, x)

for each x with |x − x₀| < δ. We use the fact that

λ(0, x) = λ(0, x₀) = w₀.

We then find

|λ(t, x) − λ(t, x₀)| ≦ εK₁

for some constant K₁ > 0, thereby proving the continuity of λ at (t₀, x₀). This concludes the proof of Theorem 3.1.
Remark. Suppose given the linear differential equation on L(E, E), that is, consider case 1,

D₁λ(t, x) = A(t, x)λ(t, x)

with λ(t, x) ∈ L(E, E). Let v ∈ E. Then we obtain a differential equation on E, namely

D₁λ(t, x)v = A(t, x)λ(t, x)v,

whose integral curve is t ↦ λ(t, x)v. This is obvious, and we shall deal with such an equation in the proof of Theorem 4.1 below.
XIX, §3. EXERCISES

1. Let A: J → Matₙₓₙ be a continuous map from an open interval J containing 0 into the space of n × n matrices. Let S be the vector space of solutions of the differential equation

X′(t) = A(t)X(t).

Show that the map X ↦ X(0) is a linear map from S into Rⁿ, whose kernel is {0}. Show that given any n-tuple C = (c₁, ... ,cₙ) there exists a solution of the differential equation such that X(0) = C. Conclude that the map X ↦ X(0) gives an isomorphism between the space of solutions and Rⁿ.

2. (a) Let g₀, ... ,gₙ₋₁ be continuous functions from an open interval J containing 0 into R. Show that the study of the differential equation

Dⁿy + gₙ₋₁Dⁿ⁻¹y + ⋯ + g₀y = 0

can be reduced to the study of a linear differential equation in n-space. [Hint: Let x₁ = y, x₂ = y′, ... ,xₙ = y⁽ⁿ⁻¹⁾.]
(b) Show that the space of solutions of the equation in part (a) has dimension n.

3. Give an explicit power series solution for the differential equation

dU/dt = AU(t),

where A is a constant n × n matrix, and the solution U(t) is in the space of n × n matrices.

4. Let A: J → L(E, E) and ψ: J → E be continuous. Show that the integral curves of the differential equation

β′(t) = A(t)β(t) + ψ(t)

are defined on all of J.
5. For each point (t₀, x₀) ∈ J × E let v(t, t₀, x₀) be the integral curve of the differential equation

α′(t) = A(t)α(t)

satisfying the condition α(t₀) = x₀. Prove the following statements:
(a) For each s, t ∈ J, the map x ↦ v(t, s, x) is an invertible continuous linear map of E onto itself, denoted by C(t, s).
(b) For fixed s, the map t ↦ C(t, s) is an integral curve of the differential equation

ω′(t) = A(t) ∘ ω(t)

on L(E, E), with initial condition ω(s) = id.
(c) For s, t, u ∈ J we have

C(s, u) = C(s, t)C(t, u)   and   C(s, t) = C(t, s)⁻¹.

(d) The map (s, t) ↦ C(s, t) is continuous.

6. Show that the integral curve of the non-homogeneous differential equation

β′(t) = A(t)β(t) + ψ(t)

such that β(t₀) = x₀ is given by

β(t) = C(t, t₀)x₀ + ∫_{t₀}^{t} C(t, s)ψ(s) ds.
XIX, §4. DEPENDENCE ON INITIAL CONDITIONS

Given a C¹ vector field f: U → E, we consider its flow α: J × U₀ → U at a point x₀ ∈ U₀. We are now asking whether α is also of class C¹, and this will be the content of the next theorem. Suppose that α is C¹. By definition of an integral curve, we have

D₁α(t, x) = f(α(t, x)).

We want to differentiate with respect to x. Suppose we can do this and interchange D₁, D₂. We obtain

D₁D₂α(t, x) = Df(α(t, x))D₂α(t, x).

Both Df(α(t, x)) and D₂α(t, x) are elements of L(E, E) (that is, linear maps of E into itself), and the product here is composition of mappings. Thus we see that D₂α(t, x) satisfies a linear differential equation on L(E, E). The
preceding argument was purely formal, but is a convenient way to remember the intended differential equation satisfied by D₂α. Of course, so far, we don't know anything about the flow α with respect to x except what was proved in the corollary of Theorem 2.1, namely that α is locally Lipschitz at every point. We shall prove that α is of class Cᵖ by showing directly that D₂α exists and satisfies the linear differential equation described above. As before, we consider a time-dependent vector field, so that instead of taking Df we have to take D₂f. Concerning the dependence on t, the differential equation of the flow D₁α(t, x) = f(α(t, x)) shows that D₁α is continuous, since it is composed of continuous maps.

Theorem 4.1. Let J be an open interval in R containing 0 and let U be open in E. Let

f: J × U → E

be a Cᵖ map with p ≧ 1 (possibly p = ∞), and let x₀ ∈ U. There exists a unique local flow for f at x₀. We can select an open subinterval J₀ of J containing 0 and an open subset U₀ of U containing x₀ such that the unique local flow

α: J₀ × U₀ → U

is of class Cᵖ, and such that D₂α satisfies the differential equation

D₁D₂α(t, x) = D₂f(t, α(t, x))D₂α(t, x)

on J₀ × U₀ with initial condition D₂α(0, x) = id.

Proof. Let

A: J × U → L(E, E)

be given by

A(t, x) = D₂f(t, α(t, x)).

Select J₁ and U₀ such that α is bounded and Lipschitz on J₁ × U₀ (using Corollary 2.3), and such that A is continuous and bounded on J₁ × U₀. Let J₀ be an open subinterval of J₁ containing 0 such that its closure J̄₀ is contained in J₁. Let λ(t, x) be the integral curve of the differential equation on L(E, E) given by

λ′(t, x) = A(t, x)λ(t, x),   λ(0, x) = id,
as in Theorem 3.1. We contend that D₂α exists and is equal to λ on J₀ × U₀. This will prove that D₂α is continuous on J₀ × U₀. Using Theorem 7.1 of Chapter XVII, this will imply that α is of class C¹. We now prove the contention. Fix x ∈ U₀. Let

θ(t, h) = α(t, x + h) − α(t, x).

Then

D₁θ(t, h) = D₁α(t, x + h) − D₁α(t, x) = f(t, α(t, x + h)) − f(t, α(t, x)).

By the mean value theorem, Corollary 4.4 of Chapter XVII, we obtain

|D₁θ(t, h) − A(t, x)θ(t, h)| = |f(t, α(t, x + h)) − f(t, α(t, x)) − D₂f(t, α(t, x))θ(t, h)|
              ≦ |h| sup |D₂f(t, y) − D₂f(t, α(t, x))|,

where the sup is taken for y in the segment between α(t, x) and α(t, x + h). By the compactness of J̄₀ it follows from Lemma 8.1 of Chapter XVII that our last expression is of the type |h|ψ(h), where ψ(h) tends to 0 with h, uniformly for t in J₀. Thus we can write

|D₁θ(t, h) − A(t, x)θ(t, h)| ≦ |h|ψ(h)

for all t ∈ J₀. This shows that θ(t, h) is an |h|ψ(h)-approximate integral curve for the differential equation satisfied by λ(t, x)h, namely

D₁λ(t, x)h − A(t, x)λ(t, x)h = 0,

with the initial condition λ(0, x)h = h. We note that θ(t, h) has the same initial condition, θ(0, h) = h. Taking t₀ = 0 in Theorem 2.1, we obtain the estimate

|θ(t, h) − λ(t, x)h| ≦ C₁|h|ψ(h)

for some constant C₁, and all t in J₀. This proves the contention that D₂α is equal to λ on J₀ × U₀, and is therefore continuous. As we said previously, it also proves that α is of class C¹ on J₀ × U₀. Furthermore, D₂α satisfies the linear differential equation given in the statement of the theorem, on J₀ × U₀. Thus our theorem is proved when p = 1.
The next step is to proceed by induction. Observe that even if we start with a vector field f which does not depend on time t, the differential equation satisfied by D₂α is time-dependent, and depends on parameters x just as in Theorem 3.1. We know, however, that such vector fields are equivalent to vector fields which do not depend on parameters. In the present case, for instance, we can let

A(t, x) = D₂f(t, α(t, x)),

and let

G: J × U × L(E, E) → E × L(E, E)

be the map such that

G(t, x, w) = (0, A(t, x)w)

for w ∈ L(E, E). The flow for this vector field is then given by the map Λ such that

Λ(t, x, w) = (x, λ(t, x)w).

Suppose that p is an integer ≧ 2, and assume the local Theorem 4.1 proved up to p − 1, so that we can assume α locally of class Cᵖ⁻¹ (that is, we can select J₀ and U₀ such that α is of class Cᵖ⁻¹ on J₀ × U₀). Then A is locally of class Cᵖ⁻¹, whence D₂α is locally of class Cᵖ⁻¹ by induction hypothesis. From the expression

D₁α(t, x) = f(t, α(t, x))

we conclude that D₁α is locally of class Cᵖ⁻¹, whence our theorem follows from Theorem 7.1 of Chapter XVII, for an arbitrary integer p.

If f is C^∞ and if we knew that the flow α is of class Cᵖ for every integer p on its domain of definition, then we could conclude that α is C^∞ on its domain of definition. (The problem at this point is that in going from p to p + 1 in the preceding induction, the open sets J₀ and U₀ may be shrinking, and nothing may be left by the time we reach ∞.) The next theorem proves this global statement.

Theorem 4.2. If f is a vector field of class Cᵖ on U (with p possibly ∞), then its flow is of class Cᵖ on its domain of definition, which is open in R × U.
Proof. By Remark 2 of §1 we can assume f is time independent. It will suffice to prove the theorem for each integer p, because to be of class C^∞ means to be of class Cᵖ for every p. Therefore let p be an integer ≧ 1. Let x₀ ∈ U and let J(x₀) be the maximal interval of definition of an integral curve having x₀ as initial condition. Let 𝔇(f) be the domain of definition of the flow for the vector field f, and let α be the flow. Let T be the set of
numbers b > 0 such that for each t with 0 ≦ t < b there exists an open interval J₁ containing t and an open set U₁ containing x₀ such that J₁ × U₁ is contained in 𝔇(f) and such that α is of class Cᵖ on J₁ × U₁. Then T is not empty by Theorem 4.1. If T is not bounded from above, then we are done, looking toward the right end point of J(x₀). If T is bounded from above, we let b be its least upper bound. We shall show that b = t⁺(x₀); cf. Theorem 1.3. Suppose b < t⁺(x₀). Then α(b, x₀) is defined. Let x₁ = α(b, x₀). By the local Theorem 4.1, we have a unique local flow at x₁, which we denote by β, with

β(0, x) = x,

defined for some open interval J_a = (−a, a) and open ball B_a(x₁) of radius a centered at x₁. Let δ be so small that whenever b − δ < t < b we have

α(t, x₀) ∈ B_{a/2}(x₁).

We can find such δ because

lim_{t→b} α(t, x₀) = x₁

by continuity. Select a point t₁ such that b − δ < t₁ < b. By the hypothesis on b, we can select J₁ and U₁ so that α maps J₁ × U₁ into B_{a/2}(x₁). We can do this because α is continuous at (t₁, x₀), being in fact Cᵖ at this point. If |t − t₁| < a and x ∈ U₁, we define

φ(t, x) = β(t − t₁, α(t₁, x)).
Then

φ(t₁, x) = β(0, α(t₁, x)) = α(t₁, x),

and

D₁φ(t, x) = D₁β(t − t₁, α(t₁, x)) = f(β(t − t₁, α(t₁, x))) = f(φ(t, x)).

Hence both φ(·, x) and α(·, x) are integral curves for f with the same value at t₁.
PART FIVE
Multiple Integration
The extension of the theory of the integral to higher dimensional domains gives rise to two problems, due to the more complicated nature of the domain and the more complicated nature of the functions. When dealing with functions of one variable, we work over intervals, which are easily handled. Furthermore, the assumption of piecewise continuity (or regularity, i.e. uniform limit of step functions) is very easy to handle and quite sufficient to treat important applications. The end points of an interval, which form its boundary, present no problem; but in dealing with higher dimensional domains, we require a minimum of theory to obtain a satisfactory description of the boundary which allows us to generalize the fundamental theorem of calculus relating integration and differentiation. In Chapter XX we give the basic tool in n-space, and in Chapter XXI we describe the formalism of differential forms, which allows us to define the integral over a parametrized set.
CHAPTER
XX
Multiple Integrals
XX, §1. ELEMENTARY MULTIPLE INTEGRATION

Let [a, b] be a closed interval. We recall that a partition P on [a, b] is a finite sequence of numbers

a = c₀ ≦ c₁ ≦ ⋯ ≦ c_r = b

between a and b, giving rise to closed subintervals [cᵢ, cᵢ₊₁]. This notion generalizes immediately to higher dimensional space. By a closed n-rectangle (or simply a rectangle) in Rⁿ we shall mean a product

J₁ × ⋯ × Jₙ

of closed intervals J₁, ... ,Jₙ. An open rectangle is a product as above, where the intervals Jᵢ are open. We shall usually deal with closed rectangles in what follows, and so do not use the adjective "closed" unless we start dealing explicitly with other types of rectangles. If Pᵢ is a partition of the closed interval Jᵢ, then we call (P₁, ... ,Pₙ) = P a partition of the rectangle. In 2-space, a rectangle together with a partition looks like this:

[Figure: a rectangle [a, b] × [c, d], subdivided into subrectangles by partition lines.]
We view P as dividing the rectangle into subrectangles. Namely, if Sᵢ is a subinterval of the partition Pᵢ, for each i = 1, ... ,n, then we call

S₁ × ⋯ × Sₙ

a subrectangle of the partition P. Let R = J₁ × ⋯ × Jₙ be a rectangle, expressed as a product of intervals Jᵢ. We define the volume of R to be

v(R) = ℓ(J₁)⋯ℓ(Jₙ),

where ℓ(Jᵢ) is the length of Jᵢ. If Jᵢ = [aᵢ, bᵢ], then ℓ(Jᵢ) = bᵢ − aᵢ, so that

v(R) = (b₁ − a₁)⋯(bₙ − aₙ).

We define the volume of an open rectangle similarly. The volume is equal to 0 if some interval Jᵢ consists of only one point.

Let f be a bounded real valued function on a rectangle R. Let P be a partition of R. We can define the lower and upper Riemann sums by

L_R(P, f) = Σ_S inf_S(f) v(S),
U_R(P, f) = Σ_S sup_S(f) v(S),

where inf_S(f) is the greatest lower bound of all values f(x) for x ∈ S, sup_S(f) is the least upper bound of all values f(x) for x ∈ S, and the sum is taken over all subrectangles S of the partition P. If R is fixed throughout a discussion, we omit the subscript R and write simply L(P, f) and U(P, f).

Let P′ = (P′₁, ... ,P′ₙ) be another partition of R. We shall say that P′ is a refinement of P if each P′ᵢ is a refinement of Pᵢ (i = 1, ... ,n). We recall that P′ᵢ being a refinement of Pᵢ means that every number occurring in the sequence Pᵢ also occurs in the sequence P′ᵢ. If P, P′ are two partitions of R, then it is clear that there exists a partition P″ which is a refinement of both P and P′. This is achieved for intervals simply by inserting all points of one partition into the other, and then doing it for each interval occurring as a factor of the rectangle, in n-space. We have the usual lemma.

Lemma 1.1. If P′ is a refinement of P then

L(P, f) ≦ L(P′, f) ≦ U(P′, f) ≦ U(P, f).
Proof. The middle inequality is obvious. Consider the inequality relating L(P′, f) and L(P, f). We can obtain P′ from P by inserting a finite number of points in the partitions of the intervals occurring in P. By induction, we are thus reduced to the case when P′ is obtained from P by inserting one point in some partition Pᵢ, for some i = 1, ... ,n. For simplicity of notation, assume i = 1. The subrectangles of P are of type

S₁ × S₂ × ⋯ × Sₙ,

where Sᵢ is a subinterval of Pᵢ. One of the intervals of P₁, say T, is then split into two intervals T′, T″ by the insertion of a point in P₁. All the subrectangles of P′ are the same as those of P, except when T occurs as a first factor. Then the rectangle

S = T × S₂ × ⋯ × Sₙ

is replaced by two rectangles, namely

S′ = T′ × S₂ × ⋯ × Sₙ   and   S″ = T″ × S₂ × ⋯ × Sₙ.

The term

inf_S(f) v(S)

in the lower sum L(P, f) is then replaced by the two terms

inf_{S′}(f) v(S′) + inf_{S″}(f) v(S″).

We have ℓ(T) = ℓ(T′) + ℓ(T″), and hence

inf_S(f) v(S) = inf_S(f) ℓ(T′)ℓ(S₂)⋯ℓ(Sₙ) + inf_S(f) ℓ(T″)ℓ(S₂)⋯ℓ(Sₙ)
       ≦ inf_{S′}(f) v(S′) + inf_{S″}(f) v(S″).

This proves that L(P, f) ≦ L(P′, f). The inequality concerning the upper sum is proved the same way.

We define the lower integral L_R(f) to be the least upper bound of all numbers L_R(P, f), and the upper integral U_R(f) to be the greatest lower bound of all numbers U_R(P, f). We say that f is Riemann integrable (or simply integrable) if

L_R(f) = U_R(f),
in which case we define its integral I_R(f) to be equal to the lower or upper integral; it does not matter which.

Example. Let f be the constant function 1. Let

R = [a_1, b_1] × ... × [a_n, b_n].

Let P = (P_1, ..., P_n) be a partition of R. Each P_i can be written in the form

a_i = c_{i,0} ≤ c_{i,1} ≤ ... ≤ c_{i,k_i} = b_i,

where c_{i,0}, ..., c_{i,k_i} are the points of the partition P_i. The subrectangles of the partition are of the type

[c_{1,j_1}, c_{1,j_1+1}] × ... × [c_{n,j_n}, c_{n,j_n+1}].

The lower sum is equal to the upper sum, and is equal to the repeated sum

Σ_{j_1=0}^{k_1−1} ··· Σ_{j_n=0}^{k_n−1} (c_{1,j_1+1} − c_{1,j_1}) ··· (c_{n,j_n+1} − c_{n,j_n}).

We evaluate the last sum first, and note that

Σ_{j_n=0}^{k_n−1} (c_{n,j_n+1} − c_{n,j_n}) = b_n − a_n.

By induction, we find that

I_R(1) = (b_1 − a_1) ··· (b_n − a_n) = v(R).
From the definitions of the least upper bound and greatest lower bound, we obtain at once an (ε, P)-characterization of the integrability of f, namely: f is integrable on R if and only if, given ε, there exists a partition P of R such that

|U(P, f) − L(P, f)| < ε.

Furthermore, we also note that if the preceding inequality holds for P, then it holds for every partition P' which is a refinement of P.
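The (ε, P)-criterion can be watched numerically. The following sketch is our own illustration, not part of the text: it computes L(P, f) and U(P, f) for f(x, y) = xy on the unit square with uniform n × n partitions (an arbitrary choice of f), and checks that the refinement from n = 8 to n = 64 shrinks U(P, f) − L(P, f).

```python
# Illustrative sketch: lower and upper Riemann sums on R = [0,1] x [0,1].
# The function f(x,y) = x*y and the uniform partitions are our own choices.

def riemann_sums(f, a, b, c, d, n):
    """Lower and upper sums of f for the uniform n x n partition of [a,b] x [c,d].

    inf_S(f) and sup_S(f) are estimated from the four corners of each
    subrectangle S; for a function monotone in each variable, such as x*y,
    the corner values are the exact extrema.
    """
    hx, hy = (b - a) / n, (d - c) / n
    lower = upper = 0.0
    for i in range(n):
        for j in range(n):
            corners = [f(a + (i + di) * hx, c + (j + dj) * hy)
                       for di in (0, 1) for dj in (0, 1)]
            v = hx * hy                   # v(S), the volume of the subrectangle
            lower += min(corners) * v     # inf_S(f) * v(S)
            upper += max(corners) * v     # sup_S(f) * v(S)
    return lower, upper

f = lambda x, y: x * y                    # its integral over [0,1]^2 is 1/4
L8, U8 = riemann_sums(f, 0, 1, 0, 1, 8)
L64, U64 = riemann_sums(f, 0, 1, 0, 1, 64)  # the 64 x 64 grid refines the 8 x 8 grid
```

Consistently with Lemma 1.1, the refinement raises the lower sum and lowers the upper sum, and both enclose the integral 1/4.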
Theorem 1.2. The integrable functions on R form a vector space. The integral satisfies the following properties:

INT 1. The map f ↦ I_R f is linear.

INT 2. If f ≥ 0, then I_R f ≥ 0.

Proof. The first assertion follows from the fact that for each subrectangle S of a partition of R we have

inf_S(f) + inf_S(g) ≤ inf_S(f + g) ≤ sup_S(f + g) ≤ sup_S(f) + sup_S(g),

and hence for the partition P,

L(P, f) + L(P, g) ≤ L(P, f + g) ≤ U(P, f + g) ≤ U(P, f) + U(P, g).

Also, for any number c ≥ 0, inf_S(cf) = c inf_S(f). The linearity follows at once. As for INT 2, if f ≥ 0 then inf_S(f) ≥ 0, so that L(P, f) ≥ 0 for all partitions P. Property INT 2 follows at once.

From INT 1 and INT 2 we have a strengthening of INT 2, namely:

If f, g are integrable and f ≤ g, then I_R(f) ≤ I_R(g).

Indeed, we have g − f ≥ 0, so I_R(g − f) ≥ 0, and by linearity,

I_R(g) − I_R(f) = I_R(g − f) ≥ 0,

whence our assertion.

We now want to integrate over more general sets than rectangles. A subset K of R^n will be said to be negligible if given ε, there exists a finite number of rectangles R_1, ..., R_m which cover K (that is, whose union contains K) and such that

v(R_1) + ... + v(R_m) < ε.
It is clear that in this definition, we may take the rectangles to be either open or closed. Furthermore, a negligible subset is clearly bounded. Its closure is also negligible, and is compact. A function f on a rectangle R will be said to be admissible if it is bounded and continuous except possibly on a negligible subset of R. It is trivial that a finite union of negligible sets is negligible. Hence a finite sum of admissible functions on R is admissible, and in fact the set of admissible functions forms a vector space. It is also clear that the product of two admissible functions is admissible, and if f, g are admissible, then so are max(f, g), min(f, g), and |f|. We define the size of a partition P of R to be < δ if the sides of all subrectangles of P have length < δ.
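To illustrate the definition of a negligible set with an example of our own (not from the text): the unit circle in R^2 is negligible. Cover it with n squares centered at n equally spaced points of the circle; every point of the circle lies within arc length π/n, hence within sup-norm distance π/n, of some center, so squares of side 2π/n suffice, and their total volume tends to 0.

```python
# Our illustration: finitely many squares of small total volume cover the circle.
import math

def covering_area(n):
    """Total volume of n covering squares, each of side 2*pi/n."""
    side = 2 * math.pi / n        # half-side pi/n on each side of a center
    return n * side * side

def is_covered(theta, n):
    """Check that the circle point at angle theta lies in the square
    centered at the nearest of the n sample points."""
    x, y = math.cos(theta), math.sin(theta)
    k = round(theta / (2 * math.pi / n)) % n
    cx, cy = math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n)
    half = math.pi / n
    return abs(x - cx) <= half and abs(y - cy) <= half

areas = [covering_area(n) for n in (10, 100, 1000)]
```

The total covering volume is 4π²/n, which is arbitrarily small for large n.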
Theorem 1.3. Every admissible function on R is integrable. Given an admissible function f and given ε, there exists δ such that if P is a partition of R of size < δ, then

U(P, f) − L(P, f) < ε.

If f, g are admissible and if f(x) = g(x) except for the points x in some negligible set, then I_R f = I_R g.
We shall need a lemma.

Lemma 1.4. Let S be a rectangle contained in a rectangle R. Given ε, there exists δ such that if P is a partition of R with size(P) < δ, and S_1, ..., S_m are the subrectangles of P which intersect S, then

v(S_1) + ... + v(S_m) ≤ v(S) + ε.

Proof. Let S be the rectangle

[c_1, d_1] × ... × [c_n, d_n].

Let P be a partition of size < δ, and let S_1, ..., S_m be the subrectangles of P which intersect S. Then each S_j (j = 1, ..., m) is contained in the rectangle

[c_1 − δ, d_1 + δ] × ... × [c_n − δ, d_n + δ],

and the sum of the volumes v(S_j) therefore satisfies the inequality

v(S_1) + ... + v(S_m) ≤ (d_1 − c_1 + 2δ) ··· (d_n − c_n + 2δ).

If δ is small enough, the expression on the right is < v(S) + ε, as was to be shown.

To prove Theorem 1.3, let f be an admissible function on some rectangle R, and let D be a negligible set of points containing the set where f is not continuous. Let R°_1, ..., R°_k be open rectangles which cover D, and such that if R_1, ..., R_k are the corresponding closed rectangles, then

v(R_1) + ... + v(R_k) < ε.

Let U be the union R°_1 ∪ ... ∪ R°_k, so that U is open. Let Z be the complement of U. Then Z ∩ R is closed and bounded, so compact, and f is uniformly continuous on Z ∩ R. Let δ_1 be such that whenever x, y ∈ Z ∩ R and |x − y| < δ_1, then |f(x) − f(y)| < ε. (We use the sup norm on R^n.) By the lemma, there exists δ_2 such that if P is a partition of size < δ_2, and S_1, ..., S_m are the subrectangles of P which intersect R_1, ..., R_k, then

v(S_1) + ... + v(S_m) < 2ε.

Let δ < min(δ_1, δ_2). To compare the upper and lower sums of f with respect to a partition P of size < δ, we distinguish the subrectangles S according as S is one of S_1, ..., S_m or is not. We obtain:
U(P, f) − L(P, f) = Σ_S [sup_S(f) − inf_S(f)] v(S)
 = Σ_{j=1}^m [sup_{S_j}(f) − inf_{S_j}(f)] v(S_j) + Σ_{S ≠ S_j} [sup_S(f) − inf_S(f)] v(S)
 ≤ 2‖f‖·2ε + ε Σ_{S ≠ S_j} v(S)
 ≤ 2‖f‖·2ε + ε v(R).
This proves that f is integrable.

Furthermore, suppose we change the values of f on D to those of another function g. The lower sums L(P, f) and L(P, g) then differ only in the terms

Σ_{j=1}^m inf_{S_j}(f) v(S_j)   and   Σ_{j=1}^m inf_{S_j}(g) v(S_j),
which are estimated by ‖f‖·2ε and ‖g‖·2ε respectively. Thus for ε small, the lower sums are close together. Since these lower sums are also close to the respective integrals, it follows that I_R(f) = I_R(g). This proves the theorem.

A subset A of R^n will be said to be admissible if it is bounded, and if its boundary is a negligible set. We denote the boundary of a set A by ∂A. The verification of the following properties is left to the reader as an exercise:

∂(A ∪ B) ⊂ ∂A ∪ ∂B,   ∂(A ∩ B) ⊂ ∂A ∪ ∂B,   ∂(A − B) ⊂ ∂A ∪ ∂B,

where we denote by A − B the set of all x ∈ A such that x ∉ B. Hence:
Lemma 1.5. A finite union of admissible sets is admissible, a finite intersection of admissible sets is admissible, and if A, B are admissible, so is A − B.
Let A be a subset of R^p and B a subset of R^q. Then A × B is a subset of R^{p+q}, and

∂(A × B) = (∂A × B̄) ∪ (Ā × ∂B),

where the bar denotes the closure. This is immediately verified. By induction, we find that

∂(A_1 × ... × A_n) = union of the sets Ā_1 × ... × ∂A_i × ... × Ā_n,

the union taken for all i = 1, ..., n, if A_1, ..., A_n are subsets of euclidean spaces. We can apply this to the case of a rectangle

R = [a_1, b_1] × ... × [a_n, b_n]

and find that its boundary is the union of the sets

[a_1, b_1] × ... × {a_i} × ... × [a_n, b_n]   and   [a_1, b_1] × ... × {b_i} × ... × [a_n, b_n].

The boundary of a rectangle obviously is negligible. For instance, we can cover a set

[a_1, b_1] × ... × {c} × ... × [a_n, b_n]
by one rectangle

[a_1, b_1] × ... × J × ... × [a_n, b_n],

where J is an interval of length ε containing c, so that the volume of this rectangle is arbitrarily small. It is also nothing but an exercise to show that if A, B are admissible, then A × B is admissible.

A function f on R^n is said to be admissible if it is admissible on every rectangle. Let f be admissible, and equal to 0 outside the rectangle S. Let R be a rectangle containing S. We contend that I_R(f) = I_S(f). To prove this, write

R = [a_1, b_1] × ... × [a_n, b_n],   S = [c_1, d_1] × ... × [c_n, d_n].

We view (a_i, c_i, d_i, b_i) as forming a partition P_i, and let P = (P_1, ..., P_n) be the corresponding partition of R. Then S appears as one of the subrectangles of the partition P of R. Let g be equal to f except on the boundary of S, where we define g to be equal to 0. If P' is any partition of R which is a refinement of P, and S' is a subrectangle of P', then either S' is a subrectangle of S, or S' does not intersect S, or S' has only boundary points in common with S. Hence for each P' we find that

L(P', g) = L(P'_S, g),

where P'_S is the partition of S induced by P' in the natural way. From this it follows at once that I_R(g) = I_S(g), and hence I_R(f) = I_S(f), since f and g differ only on a negligible set. If A is an admissible set contained in a rectangle R, and f is admissible and equal to 0 outside A, we may thus define I_A(f) = I_R(f).
Our preceding remark shows that I_A(f) is independent of the choice of rectangle R selected containing A. We call I_A f the integral of f over A. Conversely, given an admissible set A and a function f on A, we say that f is admissible on A if the function extended to R^n by letting f(x) = 0 if x ∉ A is an admissible function. We have now associated with each pair (A, f) consisting of an admissible set A and an admissible function f a real number I_A f satisfying the following properties:

INT 1. For each A, the map f ↦ I_A f is linear.

INT 2. If f ≥ 0, then I_A f ≥ 0.

INT 3. We have I_A f = I_A f_A, where f_A is the function equal to f on A and equal to 0 outside A.

INT 4. For every rectangle S we have I_S(1) = v(S).

Proposition 1.7. Let A, B be disjoint admissible sets, and let f be admissible on A ∪ B. Then

I_{A∪B} f = I_A f + I_B f.

Proof. We can assume f = f_{A∪B}, and then write

f = f_A + f_B.

It follows that

I_{A∪B} f = I_A f + I_B f.

Actually, there is a more general formula, because for any two admissible sets, we can write

A ∪ B = (A − B) ∪ (A ∩ B) ∪ (B − A),

and the three sets appearing on the right are disjoint. Furthermore,

(A − B) ∪ (A ∩ B) = A,

and similarly, (B − A) ∪ (A ∩ B) = B. Hence:

Proposition 1.8. For any two admissible sets A, B we have

I_{A∪B} f + I_{A∩B} f = I_A f + I_B f.
Let X be any set. We define its characteristic function 1_X to be the function such that 1_X(x) = 1 if x ∈ X and 1_X(y) = 0 if y ∉ X. Then 1_X is continuous at every point which is not a boundary point of X, and is definitely not continuous on the boundary of X. It follows at once that X is admissible if and only if 1_X is an admissible function.
For any admissible set A we define its volume to be

Vol(A) = v(A) = I_A(1).

This is simply the integral of the characteristic function of A.

Proposition 1.9. We have |I_A f| ≤ ‖f‖ v(A) (where ‖f‖ is the sup norm, as usual).

Proof. Since ±f ≤ ‖f‖, we can use linearity and the inequality-preserving property of the integral to conclude that

±I_A f ≤ ‖f‖ I_A(1) = ‖f‖ v(A),

which yields our assertion. In particular:

Proposition 1.10. If A is negligible, then I_A f = 0.

Theorem 1.11. There is one and only one way of associating with each admissible set A and admissible function f a (real) number I_A f satisfying the four properties INT 1 through INT 4.

Proof. Existence has been shown. We prove uniqueness. We denote by I*_A f any other integral satisfying the four properties. Suppose A is contained in some rectangle R, and let P be a partition of R. If S, S' are subrectangles of the partition, then they are disjoint, or have only boundary points in common, so that the set of common points is negligible. We may assume that f(x) = 0 if x ∉ A. We then have

I*_A f = I*_R f = Σ_S I*_S f.

For each S, by the inequality property of the integral, and linearity, we find

inf_S(f) v(S) ≤ I*_S(f) ≤ sup_S(f) v(S).

Hence

L(P, f) ≤ I*_R f ≤ U(P, f).

Since f is integrable, it follows that I*_R f = I_R f, as was to be shown.
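The formula Vol(A) = I_A(1) can be made concrete with a sketch of our own (the region and the grid size are arbitrary choices): we bound the area of the unit disc A between the lower and upper sums of its characteristic function over an n × n partition of R = [−1, 1]^2. Convexity of the disc lets us decide containment from the corners, and a nearest-point computation decides intersection exactly.

```python
# Our illustration: lower/upper sums of the characteristic function of the disc.

def disc_volume_bounds(n):
    """Lower and upper sums of 1_A, A the unit disc, on an n x n partition of [-1,1]^2."""
    h = 2.0 / n
    lower = upper = 0.0
    for i in range(n):
        for j in range(n):
            x0, y0 = -1 + i * h, -1 + j * h
            # the farthest point of the square from the origin is a corner
            far = max((x0 + di * h) ** 2 + (y0 + dj * h) ** 2
                      for di in (0, 1) for dj in (0, 1))
            # the nearest point of the square to the origin (clamp the origin)
            nx = min(max(x0, 0.0), x0 + h)
            ny = min(max(y0, 0.0), y0 + h)
            near = nx * nx + ny * ny
            if far <= 1.0:      # square inside A, so inf of 1_A on it is 1
                lower += h * h
            if near <= 1.0:     # square meets A, so sup of 1_A on it is 1
                upper += h * h
    return lower, upper

lo, hi = disc_volume_bounds(200)   # the true volume is pi
```

The gap hi − lo is the total volume of the subrectangles meeting the boundary circle, which is small because the circle is negligible.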
Let A, f be admissible. Let w ∈ R^n. We define A_w to be the set of all elements x + w with x ∈ A. Similarly, we define f_w to be the function such that f_w(x) = f(x − w) (the minus sign is not a misprint). We call
A_w and f_w the translations by w of A and f respectively. It is clear that the map f ↦ f_w is linear (in other words, (f + g)_w = f_w + g_w and (cf)_w = c f_w). As for sets, we have (A ∪ B)_w = A_w ∪ B_w and (A ∩ B)_w = A_w ∩ B_w. If R is a rectangle, then R_w is a rectangle, having the same volume (obvious). The translation of a negligible set is thus obviously negligible. Hence one verifies at once that both A_w and f_w are admissible.
Theorem 1.12. The integral is invariant under translations. In other words, for admissible A and f, and w ∈ R^n, we have:

I_A f = I_{A_w} f_w.

Proof. We define (for fixed w):

I*_A f = I_{A_w} f_w.

The four properties INT 1 through INT 4 are then immediately verified. Note that in INT 3, we use the fact that f_w(x + w) = f(x), so that if f is 0 outside A, then f_w is 0 outside A_w, and if A ⊂ B, then A_w ⊂ B_w. We can then apply Theorem 1.11 to see that I* = I. As for INT 4, if S is the rectangle

[c_1, d_1] × ... × [c_n, d_n]

and w = (w_1, ..., w_n), then S_w is the rectangle

[c_1 + w_1, d_1 + w_1] × ... × [c_n + w_n, d_n + w_n],

whose volume is obviously equal to v(S). The first two properties are even more obvious, and the theorem is proved.

In light of the uniqueness, we shall use standard notation, and write

I_A f = ∫_A f = ∫_A f(x) dx.
XX, §1. EXERCISES

The first set of exercises shows how to generalize the class of integrable functions.

1. Let A be a subset of R^n and let a ∈ A. Let f be a bounded function defined on A. For each r > 0 define the oscillation of f on the ball of radius r centered at a to be

o(f, a, r) = sup |f(x) − f(y)|,

the sup being taken for all x, y ∈ B_r(a). Define the oscillation at a to be

o(f, a) = lim_{r→0} o(f, a, r).

Show that this limit exists. Show that f is continuous at a if and only if o(f, a) = 0.

2. Let A be a closed set, and f a bounded function on A. Given ε, show that the subset of elements x ∈ A such that o(f, x) ≥ ε is closed.
3. A set A is said to have measure 0 if given ε, there exists a sequence of rectangles {R_1, R_2, ...} covering A such that

Σ_{j=1}^m v(R_j) < ε   for all m.

Show that a denumerable union of sets of measure 0 has measure 0. Show that a compact set of measure 0 is negligible.

4. Let f be a bounded function on a rectangle R. Let D be the subset of R consisting of points where f is not continuous. If D has measure 0, show that f is integrable on R. [Hint: Given ε, consider the set A of points x such that o(f, x) ≥ ε. Then A has measure 0 and is compact.]

5. Prove the converse of Exercise 4, namely: If f is integrable on R, then its set of discontinuities has measure 0. [Hint: Let A_{1/n} be the subset of R consisting of all x such that o(f, x) ≥ 1/n. Then the set of discontinuities of f is the union of all A_{1/n} for n = 1, 2, ..., so it suffices to prove that each A_{1/n} has measure 0, or equivalently that A_{1/n} is negligible.]

Exercises 4 and 5 above give the necessary and sufficient condition for a function to be Riemann integrable. We now go on to something else.

6. Let A be a subset of R^n. Let t be a real number. Show that ∂(tA) = t ∂(A) (where tA is the set of all points tx with x ∈ A).

7. Let R be a rectangle, and x, y two points of R. Show that the line segment joining x and y is contained in R.

8. Let A be a subset of R^n and let A° be the interior of A. Let x ∈ A° and let y be in the complement of A. Show that the line segment joining x and y intersects the boundary of A. [Hint: The line segment is given by x + t(y − x) with 0 ≤ t ≤ 1. Consider those values of t such that the segment from x to x + t(y − x) is contained in A°, and let s be the least upper bound of such values.]
Consider those values of I such that [0, I] is contained in AO, and let s be the least upper bound of such values.] 9. Let A be an admissible set and let S be a rectangle. Prove that precisely one of the following possibilities holds: S is contained in the interior of A, S intersects the boundary of A, S is contained in the complement of the closure of A.
10. Let A be an admissible set in R^n, contained in some rectangle R. Show that

Vol(A) = lub_P Σ_{S ⊂ A} v(S),

the least upper bound being taken over all partitions P of R, and the sum taken over all subrectangles S of P such that S ⊂ A. Also prove: Given ε, there exists δ such that if size P < δ, then

|Vol(A) − Σ_{S ⊂ A} v(S)| < ε,

the sum being taken over all subrectangles S of P contained in A. Finally, prove that

Vol(A) = glb_P Σ_{S ∩ A nonempty} v(S),

the sum now being taken over all subrectangles S of the partition P having a nonempty intersection with A.
11. Let R be a rectangle and f an integrable function on R. Suppose that for each rectangle S contained in R we are given a number I*_S f satisfying the following conditions:

(i) If P is a partition of R, then

I*_R f = Σ_S I*_S f.

(ii) If there are numbers m and M such that on a rectangle S we have

m ≤ f(x) ≤ M   for all x ∈ S,

then

m v(S) ≤ I*_S f ≤ M v(S).

Show that I*_R f = I_R f.
12. Let U be an open set in R^n and let P ∈ U. Let g be a continuous function on U. Let V_r be the volume of the ball of radius r. Let B(P, r) be the ball of radius r centered at P. Prove that

g(P) = lim_{r→0} (1/V_r) ∫_{B(P,r)} g.
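Exercise 12 can be checked numerically; the following sketch is our own, with arbitrary choices of g and P. It approximates the average of g over B(P, r) by sampling a fine grid inside the disc; the average approaches g(P) as r shrinks. (We use g(x, y) = x² + y, whose average over a disc of radius r centered at P differs from g(P) by exactly r²/4.)

```python
# Our illustration of the averaging property of Exercise 12.

def ball_average(g, px, py, r, n=200):
    """Approximate the mean of g over the disc B((px, py), r) with an n x n grid."""
    total, count = 0.0, 0
    for i in range(n):
        for j in range(n):
            x = px - r + (2 * r) * (i + 0.5) / n
            y = py - r + (2 * r) * (j + 0.5) / n
            if (x - px) ** 2 + (y - py) ** 2 <= r * r:
                total += g(x, y)
                count += 1
    return total / count     # grid mean over the disc, i.e. (1/V_r) * integral

g = lambda x, y: x * x + y   # g(0.5, 0.5) = 0.75
avg_big = ball_average(g, 0.5, 0.5, 0.5)
avg_small = ball_average(g, 0.5, 0.5, 0.01)
```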
XX, §2. CRITERIA FOR ADMISSIBILITY

In this section we give a few simple criteria for sets and functions to be admissible. We recall that a map f satisfies a Lipschitz condition on a set A if there exists a number C such that

|f(x) − f(y)| ≤ C|x − y|

for all x, y ∈ A. Any C^1 map f satisfies locally at each point a Lipschitz condition, because its derivative is bounded in a neighborhood of each point, and we can then use the mean value estimate

|f(x) − f(y)| ≤ |x − y| sup |f'(z)|,

the sup being taken for z on the segment between x and y. We can take the neighborhood of the point to be a ball, say, so that the segment between any two points is contained in the neighborhood.
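The mean value estimate yields a concrete Lipschitz constant: on a convex set, bound the partial derivatives and take the maximum absolute row sum of the Jacobian (this is the operator norm for the sup norm). A hedged sketch, with a map of our own choosing:

```python
# Our example: f(x, y) = (sin x + y/2, x*y) on the convex square [0,1]^2.
# Jacobian: [[cos x, 1/2], [y, x]]; on [0,1]^2 the absolute row sums are
# at most 1 + 1/2 and 1 + 1, so C = 2 is a Lipschitz constant in the sup norm.
import math
import random

def f(x, y):
    return (math.sin(x) + y / 2, x * y)

random.seed(0)
C = 2.0
violations = 0
for _ in range(1000):
    p = (random.random(), random.random())
    q = (random.random(), random.random())
    fp, fq = f(*p), f(*q)
    lhs = max(abs(fp[0] - fq[0]), abs(fp[1] - fq[1]))   # |f(p) - f(q)| in sup norm
    rhs = C * max(abs(p[0] - q[0]), abs(p[1] - q[1]))   # C |p - q| in sup norm
    if lhs > rhs + 1e-12:
        violations += 1
```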
Proposition 2.1. Let A be a negligible set in R^n and let f: A → R^n satisfy a Lipschitz condition. Then f(A) is negligible.

Proof. Let C be a Lipschitz constant for f. A rectangle is called a cube if all its sides have the same length. By Lemma 1.4 we can cover A by a finite number of cubes S_1, ..., S_m such that

v(S_1) + ... + v(S_m) < ε.

Let r_j be the length of each side of S_j. Then for each j = 1, ..., m we see that f(A ∩ S_j) is contained in a cube S'_j whose sides have length ≤ 2Cr_j. Hence

v(S'_j) ≤ 2^n C^n r_j^n = 2^n C^n v(S_j).

Hence f(A) is covered by a finite number of cubes S'_1, ..., S'_m such that

v(S'_1) + ... + v(S'_m) ≤ 2^n C^n ε.

This proves that f(A) is negligible, as desired.
Proposition 2.2. Let A be a bounded subset of R^m. Assume that m < n. Let f: A → R^n satisfy a Lipschitz condition. Then f(A) is negligible.

Proof. View R^m as contained in R^n (first m coordinates). Then A is negligible in R^n. Indeed, if A is contained in an m-cube R, we take the remaining n − m sides equal to a small number δ, and then R × [0, δ] × ... × [0, δ] has small n-dimensional volume. Thus we can apply Proposition 2.1 to conclude the proof.
Remark. In Propositions 2.1 and 2.2 we can replace the Lipschitz condition by the condition that the map f is C^1 on an open set U containing the closure Ā of A.
Proof. Since Ā is compact, there exists a finite covering of Ā by open balls V_i (i = 1, ..., r) contained in U such that f' is bounded on each V_i. Then f is Lipschitz on each V_i, and hence in Proposition 2.1, each set f(A ∩ V_i) is negligible, so that f(A) itself is negligible, being a finite union of negligible sets. In Proposition 2.2, the same applies to each f(A ∩ V_i).
Proposition 2.2 is used in practice to show that the boundary of a certain subset of R^n is negligible. Indeed, such a boundary is usually contained in a finite number of pieces, each of which can be parametrized by a C^1 map f defined on a lower dimensional set.

Proposition 2.3. Let A be an admissible set in R^n and assume that its closure Ā is contained in an open set U. Let f: U → R^n be a C^1 map, which is C^1-invertible on the interior of A. Then f(A) is admissible and

∂f(A) ⊂ f(∂A).

Proof. Let A° be the interior of A, that is, the set of points of A which are not boundary points of A. Then A° is open, so is f(A°), and f yields a C^1-invertible map between A° and f(A°). We have

Ā = A° ∪ ∂A

and ∂A = ∂Ā, whence

f(A°) ⊂ f(A) ⊂ f(Ā) = f(A°) ∪ f(∂A).

This shows that ∂f(A) ⊂ f(∂A), and that ∂f(A) is negligible by Proposition 2.1, thus proving Proposition 2.3.

Proposition 2.4. Let U be open in R^n and A admissible such that the closure Ā is contained in U. Let f: U → R^n be a map of class C^1, and C^1-invertible. Let g be admissible on f(A). Then g ∘ f is admissible on A.

Proof. Using Proposition 2.3, we know that f(A) is admissible, and so is f(Ā). We can extend g arbitrarily to f(Ā), say by letting g(y) = 0 at those points y where g is not originally defined. Then this extension of g is still admissible. If D is a closed negligible set contained in f(Ā) and containing the boundary of f(A) as well as all points where g is not continuous, then D is compact, contained in the image f(U), and f⁻¹(D) is therefore negligible by Proposition 2.1, applied to the C^1 inverse of f. Since g ∘ f is continuous outside f⁻¹(D), our proposition is proved.
XX, §2. EXERCISES

1. Let g be a continuous function defined on an interval [a, b]. Show that the graph of g is negligible.

2. Let g_1, g_2 be continuous functions on [a, b] and assume g_1 ≤ g_2. Let A be the set of points (x, y) such that a ≤ x ≤ b and g_1(x) ≤ y ≤ g_2(x). Show that A is admissible.

3. Let U be open in R^n and let f: U → R^n be a map of class C^1. Let R be a closed cube contained in U, and let A be the subset of U consisting of all x such that

Det f'(x) = 0.

Show that f(A ∩ R) is negligible. [Hint: Partition the cube into N^n subcubes, each of side s/N where s is the side of R, and estimate the diameter of f(A ∩ S) for each subcube S of the partition.]
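Exercise 1 can be illustrated with a computation of our own: the graph of g(x) = sin x on [0, π] is covered by n rectangles whose total volume tends to 0. Since |g'| ≤ 1, on a subinterval of width w the graph stays within w of its endpoint values.

```python
# Our illustration: covering the graph of sin on [0, pi] by n thin rectangles.
import math

def graph_cover_volume(n):
    w = math.pi / n
    total = 0.0
    for i in range(n):
        lo = min(math.sin(i * w), math.sin((i + 1) * w))
        hi = max(math.sin(i * w), math.sin((i + 1) * w))
        # |sin'| <= 1, so over the subinterval the graph lies in
        # [lo - w, hi + w]; cover it by one rectangle of width w
        total += w * ((hi - lo) + 2 * w)
    return total

vols = [graph_cover_volume(n) for n in (10, 100, 1000)]
```

The total volume is bounded by (2 + 2π)·(π/n), so it can be made arbitrarily small.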
XX, §3. REPEATED INTEGRALS

We shall prove that the multiple integral of §1 can be evaluated by repeated integration. This gives an effective way of computing integrals. Let A, B be (closed) rectangles in R^p and R^q respectively. Let f be an integrable function on A × B. We denote by f_x: B → R the function such that f_x(y) = f(x, y). We may then want to integrate f_x over B. It may happen that for some x_0 ∈ A the set {x_0} × B is a set of discontinuities for f, because such a vertical set is negligible in R^p × R^q. However, if f_x is integrable, we define

∫_B f_x = ∫_B f(x, y) dy.

The map x ↦ ∫_B f_x then defines a function on A, or rather on the subset of A consisting of those x such that ∫_B f_x exists. We shall assume that ∫_B f_x exists for all x except in some negligible set in A. We define ∫_B f_x in any way (bounded) for x in this negligible set. For the purposes of the next theorem, we shall see that it does not matter how we define ∫_B f_x for such exceptional x. We shall denote the function x ↦ ∫_B f_x by ∫_B f.

Theorem 3.1. Let A, B be (closed) rectangles in R^p and R^q respectively. Let f be an integrable function on A × B. Assume that for all x except in a negligible subset of A the function f_x is integrable. Then the function ∫_B f is integrable, and we have

∫_A ( ∫_B f ) = ∫_{A×B} f,
or in another notation,

∫_A [ ∫_B f(x, y) dy ] dx = ∫_{A×B} f.
Proof. Let P_A be a partition of A and P_B a partition of B. Then we obtain a partition P of A × B by taking all products S = S_A × S_B of subrectangles S_A of P_A and subrectangles S_B of P_B. We have:

L(P, f) = Σ_S inf_S(f) v(S)
        = Σ_{S_A} Σ_{S_B} inf_{S_A × S_B}(f) v(S_A × S_B)
        = Σ_{S_A} ( Σ_{S_B} inf_{x ∈ S_A} inf_{S_B}(f_x) v(S_B) ) v(S_A)
        ≤ Σ_{S_A} inf_{S_A}( ∫_B f ) v(S_A)
        = L(P_A, ∫_B f).

Similarly, we obtain

U(P_A, ∫_B f) ≤ U(P, f).

Since we can choose P = P_A × P_B such that U(P, f) and L(P, f) are arbitrarily close together, we conclude that ∫_B f is integrable, and that its integral over A is given by ∫_A ∫_B f = ∫_{A×B} f, as was to be shown.

Example. We recover an elementary theorem concerning multiple integration as a consequence of Theorem 3.1. Let g_1, g_2 be continuous functions on [a, b] such that g_1 ≤ g_2. As we saw in Exercise 2 of the preceding section, the set of all points (x, y) such that a ≤ x ≤ b and g_1(x) ≤ y ≤ g_2(x)
is admissible. We denote this set by A. Let f be a continuous function on A. Let R be a rectangle containing A and extend f to all of R by defining f(x, y) = 0 if (x, y) ∈ R but (x, y) ∉ A. Then f is admissible, since its set of discontinuities is the boundary of A and is negligible. We may take

R = [a, b] × [m, M],

where m, M are numbers such that m ≤ g_1(x) ≤ g_2(x) ≤ M for all x in [a, b]. Then

∫_A f = ∫_R f.

By Theorem 3.1, we also have

∫_R f = ∫_a^b [ ∫_m^M f(x, y) dy ] dx,

because for each x, the function f_x is continuous on the interval [g_1(x), g_2(x)] and is equal to 0 outside this interval. We then obtain

∫_A f = ∫_a^b [ ∫_{g_1(x)}^{g_2(x)} f(x, y) dy ] dx.
The picture for the preceding example is as follows:

[Figure: the region A between the graphs of g_1 and g_2 over the interval [a, b].]
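The final formula can be tried on a concrete region of our own choosing: between g_1(x) = x² and g_2(x) = x on [0, 1], the repeated integral of f = 1 gives the area ∫_0^1 (x − x²) dx = 1/6.

```python
# Our sketch: midpoint sums for the repeated integral over the region
# between two curves.

def repeated_integral(f, g1, g2, a, b, n=400):
    """Approximate int_a^b [ int_{g1(x)}^{g2(x)} f(x, y) dy ] dx."""
    hx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx               # midpoint in x
        lo, hi = g1(x), g2(x)
        hy = (hi - lo) / n
        inner = sum(f(x, lo + (j + 0.5) * hy) for j in range(n)) * hy
        total += inner * hx
    return total

area = repeated_integral(lambda x, y: 1.0, lambda x: x * x, lambda x: x, 0.0, 1.0)
```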
XX, §3. EXERCISE

1. Let f be defined on the square S consisting of all points (x, y) such that 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1. Let f be the function on S such that

f(x, y) = 1 if x is irrational,   f(x, y) = y³ if x is rational.

(a) Show that

∫_0^1 [ ∫_0^1 f(x, y) dy ] dx

does not exist. (b) Show that the integral I_S(f) does not exist.
XX, §4. CHANGE OF VARIABLES

We first deal with the simplest of cases. We consider vectors v_1, ..., v_n in R^n and we define the block B spanned by these vectors to be the set of points

t_1 v_1 + ... + t_n v_n

with 0 ≤ t_i ≤ 1. We say that the block is degenerate (in R^n) if the vectors v_1, ..., v_n are linearly dependent. Otherwise, we say that the block is nondegenerate, or is a proper block in R^n.

[Figure: a block in 2-space; a block in 3-space.]

We see that a block in R² is nothing but a parallelogram, and a block in R³ is nothing but a parallelepiped (when not degenerate). We denote by Vol(v_1, ..., v_n) the volume of the block B spanned by v_1, ..., v_n. We define the oriented volume

Vol°(v_1, ..., v_n) = ±Vol(v_1, ..., v_n),

taking the + sign if Det(v_1, ..., v_n) > 0 and the − sign if Det(v_1, ..., v_n) < 0. The determinant is viewed as the determinant of the matrix whose column vectors are v_1, ..., v_n, in that order.

We recall the following characterization of determinants: Suppose that we have a product

(v_1, ..., v_n) ↦ v_1 ∧ ... ∧ v_n

which to each n-tuple of vectors associates a number, such that the product is multilinear, alternating, and such that

e_1 ∧ ... ∧ e_n = 1
if e_1, ..., e_n are the unit vectors. Then this product is necessarily the determinant, i.e. it is uniquely determined. "Alternating" means that if v_i = v_j for some i ≠ j, then v_1 ∧ ... ∧ v_n = 0. The uniqueness is easily proved, and we recall this short proof. We can write

v_i = a_{i1} e_1 + ... + a_{in} e_n

for suitable numbers a_{ij}, and then

v_1 ∧ ... ∧ v_n = (a_{11} e_1 + ... + a_{1n} e_n) ∧ ... ∧ (a_{n1} e_1 + ... + a_{nn} e_n)
               = Σ_σ a_{1,σ(1)} e_{σ(1)} ∧ ... ∧ a_{n,σ(n)} e_{σ(n)}.

The sum is taken over all maps σ: {1, ..., n} → {1, ..., n}, but because of the alternating property, whenever σ is not a permutation the term corresponding to σ is equal to 0. Hence the sum may be taken only over all permutations. Since

e_{σ(1)} ∧ ... ∧ e_{σ(n)} = ε(σ) e_1 ∧ ... ∧ e_n,

where ε(σ) = 1 or −1 is a sign depending only on σ, it follows that the alternating product is completely determined by its value e_1 ∧ ... ∧ e_n, and in particular is the determinant if this value is equal to 1.

Theorem 4.1. We have

Vol°(v_1, ..., v_n) = Det(v_1, ..., v_n)

and

Vol(v_1, ..., v_n) = |Det(v_1, ..., v_n)|.
Proof. If v_1, ..., v_n are linearly dependent, then the determinant is equal to 0, and the volume is also equal to 0, for instance by Proposition 2.2. So our formula holds in this case. It is clear that

Vol°(e_1, ..., e_n) = 1.

To show that Vol° satisfies the characteristic properties of the determinant, all we have to do now is to show that it is linear in each variable, say the first. In other words, we must prove:

(*)  Vol°(cv, v_2, ..., v_n) = c Vol°(v, v_2, ..., v_n)   for c ∈ R,

(**)  Vol°(v + w, v_2, ..., v_n) = Vol°(v, v_2, ..., v_n) + Vol°(w, v_2, ..., v_n).
As to the first assertion, suppose first that c is some positive integer k. Let B be the block spanned by v, v_2, ..., v_n. We may assume without loss of generality that v, v_2, ..., v_n are linearly independent (otherwise, the relation is obviously true, both sides being equal to 0). We verify at once from the definition that if B(v, v_2, ..., v_n) denotes the block spanned by v, v_2, ..., v_n, then B(kv, v_2, ..., v_n) is the union of the two sets

B((k − 1)v, v_2, ..., v_n)   and   B(v, v_2, ..., v_n) + (k − 1)v,

which have a negligible set in common. We actually carry out the details, proving this. By definition, B(kv, v_2, ..., v_n) is the set of elements x which can be written in the form

x = t_1 kv + t_2 v_2 + ... + t_n v_n   with 0 ≤ t_i ≤ 1.

Consider the subsets A, A' defined as follows. A consists of all the elements of B(kv, v_2, ..., v_n) such that

0 ≤ t_1 ≤ (k − 1)/k,

and A' consists of those elements such that (k − 1)/k ≤ t_1 ≤ 1. Then A = B((k − 1)v, v_2, ..., v_n). As for A', let

s_1 = t_1 k − (k − 1),   so that   t_1 k = s_1 + (k − 1).

Then 0 ≤ s_1 ≤ 1 and elements of A' can be written in the form

s_1 v + t_2 v_2 + ... + t_n v_n + (k − 1)v.

Thus A' = B(v, v_2, ..., v_n) + (k − 1)v is the translation of B(v, v_2, ..., v_n) by (k − 1)v, as was to be shown.

The points in common between the above two sets A and A' are those for which t_1 = (k − 1)/k, and thus these points can be parametrized by a lower dimensional set, under a map which is a composite of a linear map and a translation. Hence A ∩ A' is negligible. Therefore, we find that

Vol(kv, v_2, ..., v_n) = Vol((k − 1)v, v_2, ..., v_n) + Vol(v, v_2, ..., v_n)
                      = (k − 1) Vol(v, v_2, ..., v_n) + Vol(v, v_2, ..., v_n)
                      = k Vol(v, v_2, ..., v_n),

as was to be shown. Now let

v = (1/k)w
for a positive integer k. Then applying what we have just proved shows that

Vol((1/k)w, v_2, ..., v_n) = (1/k) Vol(w, v_2, ..., v_n).

Writing a positive rational number in the form m/k = m · (1/k), we conclude that the first relation holds when c is a positive rational number. If r is a positive real number, we find positive rational numbers c, c' such that c ≤ r ≤ c'. Since

B(cv, v_2, ..., v_n) ⊂ B(rv, v_2, ..., v_n) ⊂ B(c'v, v_2, ..., v_n),

we conclude that

c Vol(v, v_2, ..., v_n) ≤ Vol(rv, v_2, ..., v_n) ≤ c' Vol(v, v_2, ..., v_n).

Letting c, c' approach r as a limit, we conclude that for any real number r ≥ 0 we have

Vol(rv, v_2, ..., v_n) = r Vol(v, v_2, ..., v_n).

Finally, we note that B(−v, v_2, ..., v_n) is the translation of B(v, v_2, ..., v_n) by −v, so that these two blocks have the same volume. This proves the first assertion. For the second assertion, we shall first prove a special case.
Lemma 4.2. If v_1, ..., v_n are linearly independent, then

Vol(v_1 + v_2, v_2, ..., v_n) = Vol(v_1, v_2, ..., v_n).

Proof. We look at the geometry of the situation, which is made clear by the following picture:

[Figure: the blocks spanned by v_1, v_2 and by v_1 + v_2, v_2, with the two triangular pieces shaded.]
The proof amounts to observing that the two shaded triangles have the same volume because one is the translation of the other. We give the details. Let B be the block spanned by v_1, v_2, ..., v_n and B' the block spanned by v_1 + v_2, v_2, ..., v_n. Then B' consists of all

x = t_1(v_1 + v_2) + t_2 v_2 + ... + t_n v_n   with 0 ≤ t_i ≤ 1,

which we can also write as

x = t_1 v_1 + (t_1 + t_2) v_2 + t_3 v_3 + ... + t_n v_n.

B consists of all elements y = s_1 v_1 + s_2 v_2 + ... + s_n v_n with 0 ≤ s_i ≤ 1. Let B − B' be the set of all y ∈ B with y ∉ B'. An element y lies in B − B' if and only if 0 ≤ s_i ≤ 1 and s_2 < s_1. Indeed, let t_1 = s_1. If s_2 < s_1, then there is no t_2 ≥ 0 such that s_2 = t_1 + t_2. Conversely, if s_2 ≥ s_1, then we let t_2 = s_2 − s_1, and we see that y lies in B ∩ B'.

Finally, consider the set B' − B consisting of all x ∈ B' such that x ∉ B. It is the set of all x written as above, with t_1 + t_2 > 1. Let s_2 = t_1 + t_2 − 1. An element x ∈ B' − B can then be written

x = t_1 v_1 + s_2 v_2 + t_3 v_3 + ... + t_n v_n + v_2,

with 0 ≤ t_1 ≤ 1 and 0 < s_2 ≤ t_1. (The condition s_2 ≤ t_1 comes from the fact that s_2 + (1 − t_2) = t_1.) Conversely, any element x so written with t_1 and s_2 satisfying 0 < s_2 ≤ t_1 lies in B' − B, as one sees immediately. Hence, except for boundary points, we conclude that

B' − B = (B − B') + v_2.

Consequently, B' − B and B − B' have the same volume. Then

Vol B = Vol(B − B') + Vol(B ∩ B') = Vol(B' − B) + Vol(B ∩ B') = Vol B'.

This proves the lemma.
This proves the lemma. From the lemma, we conclude that for any number c, VoIO(v,
+ cVz, Vz, ... ,Vn) =
Volo(v" Vz, .. · ,Vn)'
Indeed, if c = 0 this is obvious, and if c "# 0 then c VolO(v,
+ CVz, VZ, ... ,Vn) = =
VOIO(VI
+ CV z , CVz,'"
Vo lOt VI ~
CV 1 , ... 'Vn) -
,Vn) C
V0 IO( VI'
\ Vl, ... ,Vn)'
We can then cancel c to get our conclusion. To prove the linearity of VOiD with respect to its first variable, we may assume that vz,'" ,Vn are linearly independent, otherwise both sides of
(**) are equal to 0. Let v_1 be so chosen that {v_1, ..., v_n} is a basis of R^n. Then by induction, and by what has been proved above,

Vol°(c_1 v_1 + ... + c_n v_n, v_2, ..., v_n) = Vol°(c_1 v_1 + ... + c_{n−1} v_{n−1}, v_2, ..., v_n)
                                           = ··· = Vol°(c_1 v_1, v_2, ..., v_n)
                                           = c_1 Vol°(v_1, ..., v_n).

From this the linearity follows at once, and the theorem is proved.

Corollary 4.3. Let S be the unit cube spanned by the unit vectors in R^n. Let λ: R^n → R^n be a linear map. Then

Vol λ(S) = |Det(λ)|.
Proof. If v_1, ..., v_n are the images of e_1, ..., e_n under λ, then λ(S) is the block spanned by v_1, ..., v_n. If we represent λ by the matrix A = (a_{ij}), then

v_i = a_{1i} e_1 + ... + a_{ni} e_n,

and hence

Det(v_1, ..., v_n) = Det(A) = Det(λ).

This proves the corollary.
Corollary 4.4. If R is any rectangle in R^n and A: R^n → R^n is a linear map, then

Vol A(R) = |Det(A)| Vol(R).
Proof. After a translation, we can assume that the rectangle is a block. If R = A_1(S) where S is the unit cube, then A(R) = (A ∘ A_1)(S), whence by Corollary 4.3,

Vol A(R) = |Det(A ∘ A_1)| = |Det(A) Det(A_1)| = |Det(A)| Vol(R).
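As a quick numerical illustration of Corollary 4.3 (a sketch with a hypothetical 2 × 2 matrix, not taken from the text), one can estimate the area of the image of the unit square S under a linear map A by Monte Carlo sampling and compare it with |Det(A)|:

```python
import random

# Hypothetical example matrix A = [[a, b], [c, d]] with Det(A) = 2.5.
a, b, c, d = 2.0, 1.0, 0.5, 1.5
det_A = a * d - b * c

def apply_inverse(x, y):
    # A^{-1} applied to (x, y), using the 2x2 inverse formula.
    return ((d * x - b * y) / det_A, (-c * x + a * y) / det_A)

# A(S) lies inside the box [0, a+b] x [0, c+d]; sample the box uniformly
# and count points whose preimage under A lies in the unit square S.
random.seed(0)
N = 200_000
box_w, box_h = a + b, c + d
hits = 0
for _ in range(N):
    x = random.uniform(0, box_w)
    y = random.uniform(0, box_h)
    u, v = apply_inverse(x, y)
    if 0.0 <= u <= 1.0 and 0.0 <= v <= 1.0:
        hits += 1
vol_estimate = box_w * box_h * hits / N   # should be close to |Det(A)|
```

The estimate converges to |Det(A)| = 2.5 as the sample count grows, in agreement with the corollary.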
The next theorem extends Corollary 4.4 to the more general case where the linear map A is replaced by an arbitrary C¹-invertible map. The proof then consists of replacing the linear map by its derivative and estimating the error thus introduced. For this purpose, we define the Jacobian determinant

Δ_f(x) = Det J_f(x) = Det f′(x),

where J_f(x) is the Jacobian matrix and f′(x) is the derivative of the map f: U → R^n.
Theorem 4.5. Let R be a rectangle in R^n, contained in some open set U. Let f: U → R^n be a C¹ map, which is C¹-invertible on U. Then

Vol f(R) = ∫_R |Δ_f|.
Proof. When f is linear, this is nothing but Corollary 4.4 of the preceding theorem. We shall prove the general case by approximating f by its derivative. Let us first assume that R is a cube for simplicity. Let P be a partition of R, obtained by dividing each side of R into N equal segments for large N. Then R is partitioned into N^n subcubes which we denote by S_j (j = 1, ..., N^n). We let a_j be the center of S_j. We have

Vol f(R) = Σ_j Vol f(S_j)

because the images f(S_j) have only negligible sets in common. We investigate f(S_j) for each j. Let C be a bound for |f′(x)⁻¹|, x ∈ R. Such a bound exists because x ↦ |f′(x)⁻¹| is continuous on R, which is compact. Given ε, we take N so large that for x, z ∈ S_j we have

|f′(z) − f′(x)| < ε/C.

Let A_j = f′(a_j), where a_j is the center of the cube S_j. Then

|A_j⁻¹ ∘ f′(z) − A_j⁻¹ ∘ f′(x)| < ε

for all x, z ∈ S_j. By Lemma 3.3 of Chapter XVIII, §3 applied to the sup norm, we conclude that A_j⁻¹ ∘ f(S_j) contains a cube of radius

(1 − ε)(radius of S_j),

and trivial estimates show that A_j⁻¹ ∘ f(S_j) is contained in a cube of radius

(1 + ε)(radius of S_j),

these cubes being centered at a_j. We apply A_j to each one of these cubes and thus squeeze f(S_j) between the images of these cubes under A_j. We can determine the volumes of these cubes using Corollary 4.4. For some
constant C_1, we then obtain a lower and an upper estimate for Vol f(S_j), namely

|Det f′(a_j)| Vol(S_j) − εC_1 Vol(S_j) ≤ Vol f(S_j) ≤ |Det f′(a_j)| Vol(S_j) + εC_1 Vol(S_j).

Summing over j, and estimating |Δ_f| by a lower and an upper bound, we get finally

L(P, |Δ_f|) − εC_2 ≤ Vol f(R) ≤ U(P, |Δ_f|) + εC_2

for some constant C_2 (actually equal to C_1 Vol R). Our theorem now follows at once.

Remark. We assumed for simplicity that R was a cube. Actually, by changing the norm on each side, multiplying by a suitable constant, and taking the sup of these adjusted norms, we see that this involves no loss of generality. Alternatively, we can find a finite number of cubes B_1, ..., B_m in some partition of the rectangle such that

|v(B_1) + ... + v(B_m) − v(R)| < ε,

and apply the result to each cube.

The next result is an immediate consequence of Theorem 4.5, and is intermediate to the most general form of the change of variable formula to be proved in this book. It amounts to replacing the integral of the constant function 1 in Theorem 4.5 by the integral of a more general function g. It actually contains both the preceding theorems as special cases, and may be called the local change of variable formula for integration.

Corollary 4.6. Let R be a rectangle in R^n, contained in some open set U. Let f: U → R^n be a C¹ map, which is C¹-invertible on U. Let g be an admissible function on f(R). Then g ∘ f is admissible on R, and

∫_{f(R)} g = ∫_R (g ∘ f)|Δ_f|.
Proof. Observe that the function g ∘ f is admissible by Proposition 2.4, and so is the function (g ∘ f)|Δ_f|.
Let P be a partition of R and let {S} be the collection of subrectangles of P. Then

(inf_{f(S)} g) Vol f(S) ≤ ∫_{f(S)} g ≤ (sup_{f(S)} g) Vol f(S),

whence by Theorem 4.5, applied to constant functions, we get

(inf_S (g ∘ f)) ∫_S |Δ_f| ≤ ∫_{f(S)} g ≤ (sup_S (g ∘ f)) ∫_S |Δ_f|.

Let C be a bound for |Δ_f| on R. Subtracting the expression on the left from that on the right, we find

0 ≤ {sup_S (g ∘ f) − inf_S (g ∘ f)} ∫_S |Δ_f| ≤ C {sup_S (g ∘ f) − inf_S (g ∘ f)} Vol(S).

Taking the sum over all S, we obtain

Σ_S [ (sup_S (g ∘ f)) ∫_S |Δ_f| − (inf_S (g ∘ f)) ∫_S |Δ_f| ] ≤ C [U(P, g ∘ f) − L(P, g ∘ f)],

and this is < ε for suitable P. On the other hand, we also have the inequality

Σ_S (inf_S (g ∘ f)) ∫_S |Δ_f| ≤ ∫_R (g ∘ f)|Δ_f| ≤ Σ_S (sup_S (g ∘ f)) ∫_S |Δ_f|,

which we combine with the preceding inequality to conclude the proof of the corollary.

We finally come to the most general formulation of the change of variable theorem to be proved in this book. It involves passing from rectangles to more general admissible sets under suitable hypotheses, which must be sufficiently weak to cover all desired applications. The proof is then based on the idea of approximating an admissible set by rectangles contained in it, and observing that this approximation can be achieved in such a way that the part not covered by these rectangles can be covered by a finite number of other rectangles such that the sum of their volumes is arbitrarily small. We then apply the corollary of Theorem 4.5.
Theorem 4.7. Let U be open in R^n and let f: U → R^n be a C¹ map. Let A be admissible, such that its closure Ā is contained in U. Assume that f is C¹-invertible on the interior of A. Let g be admissible on f(A). Then g ∘ f is admissible on A and

∫_{f(A)} g = ∫_A (g ∘ f)|Δ_f|.
Proof. In view of Proposition 2.3, it suffices to prove the theorem under the additional hypothesis that A is equal to its closure Ā, which we make from now on. Let R be a cube containing A. Given ε, there exist δ and a partition P of R such that all the subrectangles of P are cubes, the sides of these subcubes are of length δ, and if S_1, ..., S_m are those subcubes intersecting the boundary ∂A, then

v(S_1) + ... + v(S_m) < ε.

Let K = (S_1 ∪ ... ∪ S_m) ∩ A. Then K is compact, v(K) < ε, and we view K as a small set around the boundary of A (shaded in the figure).
[Figure: the set A with the boundary strip K shaded.]
If T is a subcube of P and T ≠ S_j for j = 1, ..., m, then either T is contained in the complement of A or T is contained in the interior of A (Exercise 9 of §1). Let T_1, ..., T_q be the subcubes contained in the interior of A, and let B = T_1 ∪ ... ∪ T_q.
Then B is an approximation of A by a union of cubes, and A − B is contained in K. We have A = B ∪ K, and B ∩ K is negligible. Both f(K) and f(B) are admissible by Proposition 2.3. We have:

∫_{f(A)} g = ∫_{f(B)} g + ∫_{f(K)} g = ∫_{f(K)} g + Σ_{k=1}^q ∫_{f(T_k)} g,
and by the corollary of Theorem 4.5,

∫_{f(K)} g + Σ_{k=1}^q ∫_{f(T_k)} g = ∫_{f(K)} g + Σ_{k=1}^q ∫_{T_k} (g ∘ f)|Δ_f| = ∫_{f(K)} g + ∫_B (g ∘ f)|Δ_f|.
All that remains to be done is to show that ∫_{f(K)} g is small, that g ∘ f is admissible, and that the integral of (g ∘ f)|Δ_f| over B is close to the integral of this function over A. We do this in three steps.

(1) By the mean value estimate, there exists a number C (the sup of |f′(z)| on A) such that f(S_j ∩ A) is contained in a cube S_j′ of radius ≤ Cδ for each j = 1, ..., m. Hence f(K) can be covered by S_1′, ..., S_m′, and

v(S_j′) ≤ C^n δ^n ≤ C^n v(S_j).

Consequently v(f(K)) ≤ C^n ε, and

∫_{f(K)} g ≤ C^n ε ‖g‖,
which is the estimate we wanted.

(2) Under slightly weaker hypotheses, the admissibility of g ∘ f would follow from Proposition 2.4. In the present case, we must provide an argument. Let f_K: K → f(A) and f_B: B → f(A) be the restrictions of f to K and B respectively. Let D be a closed negligible subset of f(A) where g is not continuous. Note that D ∩ f(B) is negligible, and hence

f_B⁻¹(D) = f⁻¹(D) ∩ B

is negligible, say by Proposition 2.1 of §2 applied to f⁻¹. On the other hand,

f_K⁻¹(D) = f⁻¹(D) ∩ K

is covered by the rectangles S_1, ..., S_m. Hence f⁻¹(D) ∩ A can be covered by a finite number of rectangles whose total volume is < 2ε. This is true for every ε, and therefore f⁻¹(D) ∩ A is negligible, whence g ∘ f is admissible, being continuous on the complement of f⁻¹(D) ∩ A.
(3) Finally, if C_1 is a bound for |g ∘ f||Δ_f| on A, we get

| ∫_A (g ∘ f)|Δ_f| − ∫_B (g ∘ f)|Δ_f| | ≤ ∫_{A−B} |g ∘ f||Δ_f| ≤ C_1 v(A − B) < C_1 ε,

which is the desired estimate. This concludes the proof of Theorem 4.7.
Example 1 (Polar coordinates). Let f: R² → R² be the map given by

f(r, θ) = (r cos θ, r sin θ),

so that we put

x = r cos θ  and  y = r sin θ.

Then

J_f(r, θ) = ( cos θ   −r sin θ )
            ( sin θ    r cos θ )

and Δ_f(r, θ) = r. Thus Δ_f(r, θ) > 0 if r > 0. There are many open subsets U of R² on which f is C¹-invertible. For instance, we can take U to be the set of all (r, θ) such that 0 < r and 0 < θ < 2π. The image of f is then the open set V obtained by deleting the positive x-axis from R². Furthermore, the closure Ū of U is the set of all (r, θ) such that r ≥ 0 and 0 ≤ θ ≤ 2π, and f(Ū) is all of R². If A is any admissible set contained in Ū and g is an admissible function on f(A), then

∫_{f(A)} g(x, y) dx dy = ∫_A g(r cos θ, r sin θ) r dr dθ.
[Figure: f maps the (r, θ)-rectangle onto a disc in the (x, y)-plane.]
The rectangle R defined by

0 ≤ r ≤ r_1  and  0 ≤ θ ≤ 2π

maps under f onto the disc centered at the origin, of radius r_1. Thus if we denote the disc by D, then

∫_D g(x, y) dx dy = ∫_0^{2π} ∫_0^{r_1} g(r cos θ, r sin θ) r dr dθ.

The standard example is g(x, y) = e^{−x²−y²} = e^{−r²}, and we then find

∫_D e^{−x²−y²} dx dy = π(1 − e^{−r_1²}),

performing the integration by evaluating the repeated integral. Taking the limit as r_1 → ∞, we find symbolically

∫_{R²} e^{−x²−y²} dx dy = π.

On the other hand, if S is a square centered at the origin in R², it is easy to see that given ε, the integral over the square S differs only by ε from the integral over a disc of radius r_1, provided r_1 > a and a is taken sufficiently large, where 2a is the side of the square. Consequently, taking the limit as a → ∞, we now have evaluated

( ∫_{−∞}^{∞} e^{−x²} dx )² = π,  that is,  ∫_{−∞}^{∞} e^{−x²} dx = √π.

(As an exercise, put in all the details of the preceding argument.)
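The polar-coordinate evaluation can also be checked numerically. The following sketch (illustrative only; the grid size is arbitrary) approximates the integral of e^{−x²−y²} over the square [−5, 5]², which captures the whole integral up to a negligible tail, and compares it with π:

```python
import math

# Midpoint rule on an n x n grid over [-5, 5]^2 for e^{-x^2 - y^2}.
# The tail outside this square is smaller than 1e-10, so the sum
# should agree with pi to high accuracy.
n = 400
h = 10.0 / n
total = 0.0
for i in range(n):
    x = -5.0 + (i + 0.5) * h
    for j in range(n):
        y = -5.0 + (j + 0.5) * h
        total += math.exp(-x * x - y * y)
total *= h * h   # should be close to pi
```

Taking square roots of the result recovers the one-dimensional value √π, as in the text.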
Example 2. Let A be an admissible set, and let r be a number ≥ 0. Denote by rA the set of all points rx with x ∈ A. Then

Vol(rA) = r^n Vol(A).
Indeed, the determinant of the linear map x ↦ rx is r^n. The ellipse

x²/a² + y²/b² ≤ 1

is a dilation of the unit disc by the linear map represented by the matrix

( a  0 )
( 0  b )

if a, b are two positive numbers. Hence the area of the ellipse is πab. As an exercise, assuming that the volume of the n-ball of radius 1 in n-space is V_n, what is the volume of the ellipsoid

x_1²/a_1² + ... + x_n²/a_n² ≤ 1 ?
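Here is a small numeric sketch of the ellipse computation (the values of a and b are hypothetical): counting midpoint grid cells inside the ellipse approximates its area, which should agree with πab.

```python
import math

# Grid-count approximation of the area of x^2/a^2 + y^2/b^2 <= 1.
a, b = 2.0, 0.5
n = 1000
hx, hy = 2 * a / n, 2 * b / n
count = 0
for i in range(n):
    x = -a + (i + 0.5) * hx
    for j in range(n):
        y = -b + (j + 0.5) * hy
        if (x / a) ** 2 + (y / b) ** 2 <= 1.0:
            count += 1
area = count * hx * hy   # should be close to pi * a * b
```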
XX, §4. EXERCISES

1. Let A be an admissible set symmetric about the origin (that means: if x ∈ A then −x ∈ A). Let f be an admissible function on A such that

f(−x) = −f(x).

Show that

∫_A f = 0.
2. Let T: R^n → R^n be an invertible linear map, and let B be a ball centered at the origin in R^n. Show that

∫_B e^{−⟨Ty, Ty⟩} dy = |det T⁻¹| ∫_{T(B)} e^{−⟨x, x⟩} dx.

(The symbol ⟨ , ⟩ denotes the ordinary dot product in R^n.) Taking the limit as the ball's radius goes to infinity, one gets

∫_{R^n} e^{−⟨Ty, Ty⟩} dy = |det T⁻¹| π^{n/2}.
3. Let B_n(r) be the closed ball of radius r in R^n, centered at the origin, with respect to the euclidean norm. Find its volume V_n(r). [Hint: First note that

V_n(r) = r^n V_n(1).

We may assume n ≥ 2. The ball B_n(1) consists of all (x_1, ..., x_n) such that

x_1² + ... + x_n² ≤ 1.

Put (x_1, x_2) = (x, y) and let g be the characteristic function of B_n(1). Then

V_n(1) = ∫∫ [ ∫_{R_{n−2}} g(x, y, x_3, ..., x_n) dx_3 ··· dx_n ] dx dy,

where R_{n−2} is a rectangle of radius 1 centered at the origin in (n − 2)-space. If x² + y² > 1, then g(x, y, x_3, ..., x_n) = 0. Let D be the disc of radius 1 in R². If x² + y² ≤ 1, then g(x, y, x_3, ..., x_n), viewed as a function of (x_3, ..., x_n), is the characteristic function of the ball

x_3² + ... + x_n² ≤ 1 − x² − y².

Hence the inner integral is equal to

∫_{R_{n−2}} g(x, y, x_3, ..., x_n) dx_3 ··· dx_n = (1 − x² − y²)^{(n−2)/2} V_{n−2}(1),

so that

V_n(1) = V_{n−2}(1) ∫_D (1 − x² − y²)^{(n−2)/2} dx dy.

Using polar coordinates, the last integral is easily evaluated, and we find:

V_{2n}(1) = π^n/n!  and  V_{2n−1}(1) = 2^n π^{n−1}/(1·3·5···(2n − 1)).

Suppose that Γ is a function such that Γ(x + 1) = xΓ(x), Γ(1) = 1, and Γ(1/2) = √π. Show that

V_n(1) = π^{n/2}/Γ(1 + n/2).]
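The closed form in the hint is easy to sanity-check numerically (a sketch using the standard library's gamma function; the spot-check values are the familiar low-dimensional ones):

```python
import math

# V_n(1) = pi^(n/2) / Gamma(1 + n/2), as in the hint above.
def unit_ball_volume(n):
    return math.pi ** (n / 2) / math.gamma(1 + n / 2)

v1 = unit_ball_volume(1)   # length of [-1, 1], i.e. 2
v2 = unit_ball_volume(2)   # area of the unit disc, i.e. pi
v3 = unit_ball_volume(3)   # volume of the unit ball, i.e. 4*pi/3
```

The two-step recursion V_n = V_{n−2} · 2π/n implied by the polar-coordinate computation also follows from the gamma-function recurrence.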
4. Determine the volume of the region in R^n defined by the inequality

|x_1| + ... + |x_n| ≤ r.

5. Determine the volume of the region in R^{2n} = R² × ··· × R² defined by

|z_1| + ... + |z_n| ≤ r,

where z_i = (x_i, y_i) and |z_i| = √(x_i² + y_i²) is the euclidean norm in R².

6. (Spherical coordinates) (a) Define f: R³ → R³ by
x_1 = r cos θ_1,
x_2 = r sin θ_1 cos θ_2,
x_3 = r sin θ_1 sin θ_2.

Show that

Δ_f(r, θ_1, θ_2) = r² sin θ_1.

Show that f is invertible on the open set

0 < r,  0 < θ_1 < π,  0 < θ_2 < 2π,

and that the image under f of this rectangle is the open set obtained from R³ by deleting the set of points (x, y, 0) with y ≥ 0 and x arbitrary.
[Figure: the spherical coordinates r, θ_1, θ_2 of a point in 3-space.]

Let S(r_1) be the closed rectangle of points (r, θ_1, θ_2) satisfying

0 ≤ r ≤ r_1,  0 ≤ θ_1 ≤ π,  0 ≤ θ_2 ≤ 2π.
Show that the image of S(r_1) is the closed ball of radius r_1 centered at the origin in R³.

(b) Let g be a continuous function of one variable, defined for r ≥ 0. Let

G(x_1, x_2, x_3) = g(√(x_1² + x_2² + x_3²)).

Let B(r_1) denote the closed ball of radius r_1. Show that

∫_{B(r_1)} G = W_3 ∫_0^{r_1} g(r) r² dr,

where W_3 = 3V_3, and V_3 is the volume of the three-dimensional ball of radius 1 in R³.

(c) The n-dimensional generalization of the spherical coordinates is given by the following formulas:
x_1 = r cos θ_1,
x_2 = r sin θ_1 cos θ_2,
...
x_{n−1} = r sin θ_1 sin θ_2 ··· sin θ_{n−2} cos θ_{n−1},
x_n = r sin θ_1 sin θ_2 ··· sin θ_{n−2} sin θ_{n−1}.

We take 0 < r, 0 < θ_i < π for i = 1, ..., n − 2, and 0 < θ_{n−1} < 2π. The Jacobian determinant is then given by

Δ_f(r, θ_1, ..., θ_{n−1}) = r^{n−1} sin^{n−2} θ_1 sin^{n−3} θ_2 ··· sin θ_{n−2} = r^{n−1} J(θ).

Then one has the n-dimensional analogue of dx dy = r dr dθ, namely

dx_1 ··· dx_n = r^{n−1} J(θ) dr dθ_1 ··· dθ_{n−1},

abbreviated r^{n−1} dr dμ(θ). Assuming this formula, define the (n − 1)-dimensional area of the sphere to be

W_n = ∫ J(θ) dθ_1 ··· dθ_{n−1},

where the multiple integral on the right is over the intervals prescribed above for θ = (θ_1, ..., θ_{n−1}). Prove that

W_n = n V_n,

where V_n is the n-dimensional volume of the n-ball of radius 1. This generalizes the formula W_3 = 3V_3 carried out in 3-space.
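The 3-dimensional Jacobian r² sin θ_1 claimed in part (a) can be spot-checked numerically (a sketch with one arbitrary sample point): approximate the Jacobian matrix by central finite differences and take its determinant.

```python
import math

# Spherical coordinates in 3-space, as in part (a) of the exercise.
def f(r, t1, t2):
    return (r * math.cos(t1),
            r * math.sin(t1) * math.cos(t2),
            r * math.sin(t1) * math.sin(t2))

def jac_det(r, t1, t2, h=1e-6):
    # Columns of the Jacobian: d(x1, x2, x3)/d(parameter k), by central differences.
    cols = []
    for k in range(3):
        p = [r, t1, t2]
        m = [r, t1, t2]
        p[k] += h
        m[k] -= h
        fp, fm = f(*p), f(*m)
        cols.append([(u - v) / (2 * h) for u, v in zip(fp, fm)])
    a, b, c = cols
    # Determinant of the 3x3 matrix with columns a, b, c.
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))

r0, t1, t2 = 1.3, 0.7, 2.1          # hypothetical sample point
approx = jac_det(r0, t1, t2)
exact = r0 * r0 * math.sin(t1)      # r^2 sin(theta_1)
```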
7. Let T: R^n → R^n be a linear map whose determinant is equal to 1 or −1. Let A be an admissible set. Show that

Vol(TA) = Vol(A).

(Examples of such maps are the so-called unitary maps, i.e. those T for which ⟨Tx, Tx⟩ = ⟨x, x⟩ for all x ∈ R^n.)

8. (a) Let A be the subset of R² consisting of all points

t_1 e_1 + t_2 e_2

with 0 ≤ t_i and t_1 + t_2 ≤ 1, where e_1, e_2 are the standard unit vectors. (This is just a triangle.) Find the area of A by integration.
(b) Let v_1, v_2 be linearly independent vectors in R². Find the area of the set of points t_1 v_1 + t_2 v_2 with 0 ≤ t_i and t_1 + t_2 ≤ 1, in terms of Det(v_1, v_2).

9. Let v_1, ..., v_n be linearly independent vectors in R^n. Find the volume of the solid consisting of all points

t_1 v_1 + ... + t_n v_n

with 0 ≤ t_i and t_1 + ... + t_n ≤ 1.
10. Let B_a be the closed ball of radius a > 0, centered at the origin. In n-space, let X = (x_1, ..., x_n) and let r = |X|, where | | is the euclidean norm. Take 0 < a < 1, and let A_a be the annulus consisting of all points X with a ≤ |X| ≤ 1. Both in the case n = 2 and n = 3 (i.e. in the plane and in 3-space), compute the integral

I_a = ∫_{A_a} (1/|X|) dx_1 ··· dx_n.
Show that this integral has a limit as a → 0. Thus, contrary to what happens in 1-space, the function f(X) = 1/|X| can be integrated in a neighborhood of 0. [Hint: Use polar or spherical coordinates. Actually, using n-dimensional spherical coordinates, the result also holds in n-space.] Show further that in 3-space, the function 1/|X|² can be similarly integrated near 0.

11. Let B be the region in the first quadrant of R² bounded by the curves xy = 1, xy = 3, x² − y² = 1, and x² − y² = 4. Find the value of the integral

∫∫_B (x² + y²) dx dy

by making the substitution u = x² − y² and v = xy. Explain how you are applying the change of variables formula.
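A numerical sketch (not a substitute for the requested derivation) suggests what to expect: since ∂(u, v)/∂(x, y) = 2(x² + y²), the integrand times dx dy becomes du dv/2, so the integral should equal half the area of the (u, v)-rectangle [1, 4] × [1, 3]. A direct grid sum over B agrees; the bounding box below is chosen by hand to contain B.

```python
# Direct midpoint-grid evaluation of the integral of x^2 + y^2 over
# B = {1 <= x^2 - y^2 <= 4, 1 <= xy <= 3, x > 0, y > 0}, compared
# with the change-of-variables prediction (4 - 1) * (3 - 1) / 2.
nx, ny = 1200, 1200
x_lo, x_hi = 1.0, 2.5          # bounding box of B in the first quadrant
y_lo, y_hi = 0.0, 1.7
hx = (x_hi - x_lo) / nx
hy = (y_hi - y_lo) / ny
total = 0.0
for i in range(nx):
    x = x_lo + (i + 0.5) * hx
    for j in range(ny):
        y = y_lo + (j + 0.5) * hy
        if 1.0 <= x * x - y * y <= 4.0 and 1.0 <= x * y <= 3.0:
            total += x * x + y * y
total *= hx * hy
predicted = (4 - 1) * (3 - 1) / 2   # half the (u, v)-rectangle's area
```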
12. Prove that

where A denotes the half plane x ≥ a > 0. [Hint: Use the transformation and y = vx.]
13. Find the integral

∫∫∫ xyz dx dy dz

taken over the ellipsoid

x²/a² + y²/b² + z²/c² ≤ 1.
14. Let f be in the Schwartz space on R^n. Define a normalization of the Fourier transform by

f^(y) = ∫_{R^n} f(x) e^{−2πi x·y} dx.

Prove that the function h(x) = e^{−π x·x} is self-dual, that is, h^ = h.

15. Let B be an n × n non-singular real matrix. Define (f ∘ B)(x) = f(Bx). Prove that the dual of f ∘ B is given by

(f ∘ B)^(y) = (1/‖B‖) f^(ᵗB⁻¹ y),

where ‖B‖ is the absolute value of the determinant of B.

16. For b ∈ R^n define f_b(x) = f(x − b). Prove that

(f_b)^(y) = e^{−2πi b·y} f^(y).
XX, §5. VECTOR FIELDS ON SPHERES

Let S be the ordinary sphere of radius 1, centered at the origin in R³. By a tangent vector field on the sphere, we mean an association

F: S → R³

which to each point X of the sphere associates a vector F(X) which is tangent to the sphere (and hence perpendicular to the position vector OX). The picture may be drawn as follows:
[Figure: a vector F(X) tangent to the sphere at the point X, drawn from X to X + F(X).]
F(X) E(X) = IF(X) I' that is let E(X) be F(X) divided by its (euclidean) norm. Then E(X) is a unit vector for each X. Thus from the vector field F we have obtained a vector field E such that all the vectors have norm 1. Such a vector field is called a unit vector field. Hence to prove Theorem 5.1, it suffices to prove:
Theorem 5.2. There is no unit vector field on the sphere. The proof which follows is due to Milnor (Math. Monthly, October,
1978).
Suppose there exists a vector field E on the sphere such that

|E(X)| = 1

for all X. For each small real number t, define

G_t(X) = X + tE(X).

Geometrically, this means that G_t(X) is the point obtained by starting at X and going in the direction of E(X) with magnitude t. The distance of X + tE(X) from the origin O is then obviously √(1 + t²). Indeed, E(X) is parallel (tangent) to the sphere, and so perpendicular to X itself. Thus

|X + tE(X)|² = X·X + 2t X·E(X) + t² E(X)·E(X) = 1 + t²,

since both X and E(X) are unit vectors and X·E(X) = 0.

Lemma 5.3. For all t sufficiently small, the image G_t(S) of the sphere under G_t is equal to the whole sphere of radius √(1 + t²).

Proof. This amounts to proving a variation of the inverse mapping theorem, and the proof will be left as an exercise to the reader.
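The computation |X + tE(X)|² = 1 + t² can be spot-checked numerically (a sketch with one hypothetical point X on the sphere and one unit vector E perpendicular to it, not a full vector field):

```python
import math

# X on the unit sphere, E a unit vector with E . X = 0 (hypothetical values).
X = (1 / math.sqrt(3), 1 / math.sqrt(3), 1 / math.sqrt(3))
E = (1 / math.sqrt(2), -1 / math.sqrt(2), 0.0)
t = 0.3

dot_XE = sum(x * e for x, e in zip(X, E))       # tangency: should be 0
G = tuple(x + t * e for x, e in zip(X, E))       # G_t(X) = X + t E(X)
norm_G = math.sqrt(sum(g * g for g in G))        # should equal sqrt(1 + t^2)
```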
We now extend the vector field E to a bigger region of 3-space, namely the region A between two concentric spheres, defined by the inequalities

a ≤ |X| ≤ b.

This extended vector field is defined by the formula

E(rU) = rE(U)

for any unit vector U and any number r such that a ≤ r ≤ b.
It follows that the formula

G_t(X) = X + tE(X),

also given in terms of unit vectors U by

G_t(rU) = rU + tE(rU) = rG_t(U),

defines a mapping which sends the sphere of radius r onto the sphere of radius r√(1 + t²) by the lemma, provided that t is sufficiently small. Hence it maps A onto the region between the spheres of radius

a√(1 + t²)  and  b√(1 + t²).

By the change of volumes under dilations, it is then clear that

Volume G_t(A) = (√(1 + t²))³ Volume(A).

Observe that taking the cube of √(1 + t²) still involves a square root, and is not a polynomial in t. On the other hand, the Jacobian matrix of G_t is

J_{G_t}(X) = I + t J_E(X),

as you can verify easily by writing down the coordinates of E(X), say

E(X) = (g_1(x, y, z), g_2(x, y, z), g_3(x, y, z)).

Hence the Jacobian determinant has the form

Δ_{G_t}(x, y, z) = f_0(x, y, z) + f_1(x, y, z) t + f_2(x, y, z) t² + f_3(x, y, z) t³,

where f_0, ..., f_3 are functions. Given the region A, this determinant is positive for all sufficiently small values of t, by continuity and the fact that the determinant is 1 when t = 0. For the region A in 3-space, the change of variables formula then shows that the volume of G_t(A) is given by the integral

Vol G_t(A) = ∫∫∫_A Δ_{G_t}(x, y, z) dx dy dz.

If we perform the integration, we see that

Vol G_t(A) = c_0 + c_1 t + c_2 t² + c_3 t³,

where

c_i = ∫∫∫_A f_i(x, y, z) dx dy dz.

Hence Vol G_t(A) is a polynomial in t of degree 3. Taking for A the region between the spheres yields a contradiction, which concludes the proof.
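The heart of the contradiction is that (1 + t²)^{3/2} is not a polynomial in t. A small numeric sketch makes this visible (the step size and the comparison cubic are arbitrary choices): the fourth finite difference of any cubic polynomial vanishes identically, but for (1 + t²)^{3/2} it does not.

```python
# Fourth forward difference with step h: it is 0 for every polynomial of
# degree <= 3, and approximately h^4 * f''''(0) for smooth non-cubic f.
def fourth_difference(f, h=0.1):
    return f(0) - 4 * f(h) + 6 * f(2 * h) - 4 * f(3 * h) + f(4 * h)

vol_factor = lambda t: (1 + t * t) ** 1.5   # (sqrt(1 + t^2))^3 from the proof
cubic = lambda t: 2 + t - 3 * t ** 3        # any cubic, for comparison

d_vol = fourth_difference(vol_factor)       # visibly nonzero
d_cubic = fourth_difference(cubic)          # zero up to rounding
```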
XX, §5. EXERCISE

Prove the statements depending on inverse mapping theorems which have been left as exercises in the above proof.
CHAPTER XXI

Differential Forms
XXI, §1. DEFINITIONS

We recall first two simple results from linear (or rather multilinear) algebra. We use the notation

E^{(r)} = E × E × ··· × E,  r times.

Theorem A. Let E be a finite dimensional vector space over the reals. For each positive integer r there exists a vector space ⋀^r E and a multilinear alternating map

E^{(r)} → ⋀^r E,

denoted by (u_1, ..., u_r) ↦ u_1 ∧ ··· ∧ u_r, having the following property: If {v_1, ..., v_n} is a basis of E, then the elements

v_{i_1} ∧ ··· ∧ v_{i_r},  i_1 < i_2 < ··· < i_r,

form a basis of ⋀^r E.

We recall that alternating means that

u_1 ∧ ··· ∧ u_r = 0  if  u_i = u_j  for some i ≠ j.

Theorem B. For each pair of positive integers (r, s), there exists a unique product (bilinear map)

⋀^r E × ⋀^s E → ⋀^{r+s} E
such that if u_1, ..., u_r, w_1, ..., w_s ∈ E, then

(u_1 ∧ ··· ∧ u_r, w_1 ∧ ··· ∧ w_s) ↦ u_1 ∧ ··· ∧ u_r ∧ w_1 ∧ ··· ∧ w_s.

This product is associative.
The proofs for these two statements will be briefly summarized in an appendix. We call ⋀^r E the r-th alternating product (or exterior product) of E. If r = 0, we define ⋀^0 E = R. Elements of ⋀^r E which can be written in the form u_1 ∧ ··· ∧ u_r are called decomposable. Such elements generate ⋀^r E.

Now let E* denote the space of linear maps from R^n into R. We call E* the dual space of R^n. It is the space which we denoted by L(R^n, R). If λ_1, ..., λ_n are the coordinate functions on R^n, that is

λ_i(x_1, ..., x_n) = x_i,

then each λ_i is an element of the dual space, and in fact {λ_1, ..., λ_n} is a basis of this dual space.

Let U be an open set in R^n. By a differential form of degree r on U (or an r-form) we mean a map

ω: U → ⋀^r E*

from U into the r-th alternating product of E*. We say that the form is of class C^p if the map is of class C^p. For convenience in the rest of the book, we assume that we deal only with forms of class C^∞, although we shall sometimes make comments about the possibility of generalizing to forms of lower order of differentiability.

Since {λ_1, ..., λ_n} is a basis of E*, we can express each differential form in terms of its coordinate functions with respect to the basis

{λ_{i_1} ∧ ··· ∧ λ_{i_r}}  (i_1 < ··· < i_r),

namely for each x ∈ U we have

ω(x) = Σ_{(i)} f_{(i)}(x) λ_{i_1} ∧ ··· ∧ λ_{i_r},

where f_{(i)} = f_{i_1···i_r} is a function on U. Each such function has the same order of differentiability as ω. We call the preceding expression the standard form of ω. We say that a form is decomposable if it can be written as just one term

f(x) λ_{i_1} ∧ ··· ∧ λ_{i_r}.
Every differential form is a sum of decomposable ones. We agree to the convention that functions are differential forms of degree 0. It is clear that the differential forms of given degree form a vector space, denoted by Ω^r(U). As for the forms of degree ≥ 1, we assume from now on that all maps and all functions mentioned in the rest of this book are C^∞, unless otherwise specified.

Let f be a function on U. For each x ∈ U the derivative

f′(x): R^n → R

is a linear map, and thus an element of the dual space. Thus

f′: U → E*

is a differential form of degree 1, which is usually denoted by df. [Note. If f were of class C^p, then df would be only of class C^{p−1}. Having assumed f to be C^∞, we see that df is also of class C^∞.]

Let λ_i be the i-th coordinate function. Then we know that

dλ_i(x) = λ_i′(x) = λ_i

for each x ∈ U, because λ′(x) = λ for any continuous linear map λ. Whenever {x_1, ..., x_n} are used systematically for the coordinates of a point in R^n, it is customary in the literature to use the notation

dλ_i(x) = dx_i.

This is slightly incorrect, but is useful in formal computations. We shall also use it in this book on occasion. Similarly, we also write (incorrectly)

ω = Σ_{(i)} f_{(i)} dx_{i_1} ∧ ··· ∧ dx_{i_r}

instead of the correct

ω(x) = Σ_{(i)} f_{(i)}(x) λ_{i_1} ∧ ··· ∧ λ_{i_r}.

In terms of coordinates, the map df (or f′) is given by

df(x) = f′(x) = D_1 f(x) λ_1 + ··· + D_n f(x) λ_n,

where D_i f(x) = ∂f/∂x_i is the i-th partial derivative. This is simply a restatement of the fact that if H = (h_1, ..., h_n) is a vector, then

f′(x)H = (∂f/∂x_1) h_1 + ··· + (∂f/∂x_n) h_n,
which was discussed long ago. Thus in older notation, we have

df(x) = (∂f/∂x_1) dx_1 + ··· + (∂f/∂x_n) dx_n.
Let ω and ψ be forms of degrees r and s respectively, on the open set U. For each x ∈ U we can then take the alternating product ω(x) ∧ ψ(x), and we define the alternating product ω ∧ ψ by

(ω ∧ ψ)(x) = ω(x) ∧ ψ(x).

If f is a differential form of degree 0, that is a function, then we define

f ∧ ω = fω,

where (fω)(x) = f(x)ω(x). By definition, we then have

ω ∧ fψ = fω ∧ ψ.

We shall now define the exterior derivative dω for any differential form ω. We have already done it for functions. We shall do it in general, first in terms of coordinates, and then show that there is a characterization independent of these coordinates. If

ω = Σ_{(i)} f_{(i)} dλ_{i_1} ∧ ··· ∧ dλ_{i_r},

we define

dω = Σ_{(i)} df_{(i)} ∧ dλ_{i_1} ∧ ··· ∧ dλ_{i_r}.
Example. Suppose n = 2 and ω is a 1-form, given in terms of the two coordinates (x, y) by

ω(x, y) = f(x, y) dx + g(x, y) dy.

Then

dω(x, y) = df(x, y) ∧ dx + dg(x, y) ∧ dy

= ((∂f/∂x) dx + (∂f/∂y) dy) ∧ dx + ((∂g/∂x) dx + (∂g/∂y) dy) ∧ dy

= (∂f/∂y) dy ∧ dx + (∂g/∂x) dx ∧ dy

= (∂f/∂y − ∂g/∂x) dy ∧ dx,
because the terms involving dx ∧ dx and dy ∧ dy are equal to 0. As a numerical example, take

ω(x, y) = y dx + (x²y) dy.

Then

dω(x, y) = dy ∧ dx + (2xy) dx ∧ dy = (1 − 2xy) dy ∧ dx.
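The numerical example can itself be checked numerically (a sketch using central finite differences): for ω = f dx + g dy, the coefficient of dx ∧ dy in dω is ∂g/∂x − ∂f/∂y, which for this ω should equal 2xy − 1.

```python
# The example w = y dx + (x^2 y) dy from the text.
def f(x, y): return y            # coefficient of dx
def g(x, y): return x * x * y    # coefficient of dy

def dw_coefficient(x, y, h=1e-5):
    # Coefficient of dx ^ dy in dw, i.e. dg/dx - df/dy, by central differences.
    dg_dx = (g(x + h, y) - g(x - h, y)) / (2 * h)
    df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dg_dx - df_dy

x0, y0 = 1.5, 2.0                # arbitrary sample point
approx = dw_coefficient(x0, y0)
exact = 2 * x0 * y0 - 1          # (1 - 2xy) dy ^ dx = (2xy - 1) dx ^ dy
```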
Theorem 1.1. The map d is linear, and satisfies

d(ω ∧ ψ) = dω ∧ ψ + (−1)^r ω ∧ dψ
if r = deg ω. The map d is uniquely determined by these properties, and by the fact that for a function f we have df = f′.

Proof. The linearity of d is obvious. Hence it suffices to prove the formula for decomposable forms. We note that for any function f we have

d(fω) = df ∧ ω + f dω.

Indeed, if ω is a function g, then from the derivative of a product we get d(fg) = f dg + g df. If

ω = g dλ_{i_1} ∧ ··· ∧ dλ_{i_r},

where g is a function, then

d(fω) = d(fg dλ_{i_1} ∧ ··· ∧ dλ_{i_r})
= d(fg) ∧ dλ_{i_1} ∧ ··· ∧ dλ_{i_r}
= (f dg + g df) ∧ dλ_{i_1} ∧ ··· ∧ dλ_{i_r}
= f dω + df ∧ ω,

as desired. Now suppose that

ω = f dλ_{i_1} ∧ ··· ∧ dλ_{i_r} = fμ  and  ψ = g dλ_{j_1} ∧ ··· ∧ dλ_{j_s} = gφ,

with i_1 < ··· < i_r and j_1 < ··· < j_s as usual, where μ and φ denote the indicated products of the dλ's. If some i_ν equals some j_κ, then from the definitions we see that the expressions on both sides of the equality in the theorem are equal to 0. Hence we may assume that the sets of indices i_1, ..., i_r and j_1, ..., j_s have no element in common. Then

d(μ ∧ φ) = 0

by definition, and

d(ω ∧ ψ) = d(fg μ ∧ φ) = d(fg) ∧ μ ∧ φ
= (g df + f dg) ∧ μ ∧ φ
= dω ∧ ψ + f dg ∧ μ ∧ φ
= dω ∧ ψ + (−1)^r fμ ∧ dg ∧ φ
= dω ∧ ψ + (−1)^r ω ∧ dψ,

thus proving the desired formula in the present case. (We used the fact that dg ∧ μ = (−1)^r μ ∧ dg, whose proof is left to the reader.) The formula in the general case follows because any differential form can be expressed as a sum of forms of the type just considered, and one can then use the bilinearity of the product. Finally, d is uniquely determined by the formula and its effect on functions, because any differential form is a sum of forms of type f dλ_{i_1} ∧ ··· ∧ dλ_{i_r}, and the formula gives an expression of d in terms of its effect on forms of lower degree. By induction, if the value of d on functions is known, its value can then be determined on forms of degree ≥ 1. This proves the theorem.
XXI, §1. EXERCISES

1. Show that ddf = 0 for any function f, and also for a 1-form.

2. Show that ddω = 0 for any differential form ω.

3. In 3-space, express dω in standard form for each one of the following ω:
(a) ω = x dx + y dz
(b) ω = xy dy + x dz
(c) ω = (sin x) dy + dz
(d) ω = e^x dx + y dy + e^{xy} dz

4. Find the standard expression for dω in the following cases:
(a) ω = x²y dy − xy² dx
(b) ω = e^{xy} dx ∧ dz
(c) ω = f(x, y) dx, where f is a function.

5. (a) Express dω in standard form if

ω = x dy ∧ dz + y dz ∧ dx + z dx ∧ dy.

(b) Let f, g, h be functions, and let

ω = f dy ∧ dz + g dz ∧ dx + h dx ∧ dy.

Find the standard form for dω.

6. In n-space, find an (n − 1)-form ω such that dω = dx_1 ∧ ··· ∧ dx_n.

7. Let ω be a form of odd degree on U, and let f be a function such that f(x) ≠ 0 for all x ∈ U, and such that d(fω) = 0. Show that ω ∧ dω = 0.

8. A form ω on U is said to be exact if there exists a form ψ such that ω = dψ. If ω_1, ω_2 are exact, show that ω_1 ∧ ω_2 is exact.

9. Show that the form

ω(x, y, z) = (1/r³)(x dy ∧ dz + y dz ∧ dx + z dx ∧ dy)

is closed but not exact. As usual, r² = x² + y² + z², and the form is defined on the complement of the origin in R³.
XXI, §2. STOKES' THEOREM FOR A RECTANGLE

Let ω be an n-form on an open set U in n-space. Let R be a rectangle in U. We can write ω in the form

ω(x) = f(x) dx_1 ∧ ··· ∧ dx_n,

where f is a function on U. We then define

∫_R ω = ∫_R f,

where the integral of the function is the ordinary integral, defined in the previous chapter. Stokes' theorem will relate the integral of an (n − 1)-form ψ over the boundary of a rectangle with the integral of dψ over the rectangle itself. However, we need the notion of oriented boundary, as the following example will make clear.

Example. We consider the case n = 2. Then R is a genuine rectangle,

R = [a, b] × [c, d].
[Figure: the rectangle R = [a, b] × [c, d] with arrows indicating the orientation of its four sides.]
The boundary consists of the four line segments

[a, b] × {c},  {b} × [c, d],  {a} × [c, d],  [a, b] × {d},

which we denote respectively by the bottom, right, left, and top sides of R. These line segments are to be viewed as having the beginning point and end point determined by the position of the arrows in the preceding diagram. Let

ω = f dx + g dy

be a 1-form on an open set containing the above rectangle. Then

dω = df ∧ dx + dg ∧ dy,

and by definition,

dω = (∂f/∂y) dy ∧ dx + (∂g/∂x) dx ∧ dy = (∂g/∂x − ∂f/∂y) dx ∧ dy.

Then by definition and repeated integration,

∫_R dω = ∫_c^d ∫_a^b (∂g/∂x − ∂f/∂y) dx dy
= ∫_c^d [g(b, y) − g(a, y)] dy − ∫_a^b [f(x, d) − f(x, c)] dx.
The right-hand side has four terms, which can be interpreted as the integral of ω over the "oriented boundary" of R. If we agree to denote this oriented boundary symbolically by ∂°R, then we have Stokes' formula

∫_R dω = ∫_{∂°R} ω.
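The n = 2 computation just carried out can be verified numerically. The following sketch uses hypothetical choices f = xy, g = x² + y on R = [0, 1] × [0, 2] and compares both sides by midpoint quadrature:

```python
# Hypothetical 1-form w = f dx + g dy on R = [a, b] x [c, d].
def f(x, y): return x * y
def g(x, y): return x * x + y

a, b, c, d = 0.0, 1.0, 0.0, 2.0

def midpoint(h, lo, hi, n=2000):
    # Composite midpoint rule for a function of one variable.
    step = (hi - lo) / n
    return step * sum(h(lo + (k + 0.5) * step) for k in range(n))

# Left side: integral over R of (dg/dx - df/dy) dx dy.
# Here dg/dx - df/dy = 2x - x = x; integrating over y first gives (d - c) * x.
lhs = midpoint(lambda x: (d - c) * x, a, b)

# Right side: integral of w over the oriented boundary of R.
rhs = (midpoint(lambda y: g(b, y) - g(a, y), c, d)
       - midpoint(lambda x: f(x, d) - f(x, c), a, b))
```

Both sides come out equal, in agreement with Stokes' formula for the rectangle.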
We shall now generalize this to n-space. There is no additional difficulty. All we have to do is keep the notation and indices straight. Let

R = [a_1, b_1] × ··· × [a_n, b_n].

Then the boundary of R consists of the union, over all i, of the pieces

R_i^0 = [a_1, b_1] × ··· × {a_i} × ··· × [a_n, b_n],
R_i^1 = [a_1, b_1] × ··· × {b_i} × ··· × [a_n, b_n].

If

ω(x_1, ..., x_n) = f(x_1, ..., x_n) dx_1 ∧ ··· ∧ d̂x_j ∧ ··· ∧ dx_n

is an (n − 1)-form, and the roof over anything means that this thing is to be omitted, then we define

∫_{R_j^0} ω = ∫_{a_1}^{b_1} ··· ∫_{a_n}^{b_n} f(x_1, ..., a_j, ..., x_n) dx_1 ··· d̂x_j ··· dx_n,

the integration being over the remaining n − 1 variables, and similarly for the integral over R_j^1. We define the integral over the oriented boundary to be

∫_{∂°R} ω = Σ_{i=1}^n (−1)^i [ ∫_{R_i^0} ω − ∫_{R_i^1} ω ].
Stokes' theorem for rectangles. Let R be a rectangle in an open set U in n-space. Let ω be an (n − 1)-form on U. Then

∫_R dω = ∫_{∂°R} ω.

Proof. It suffices to prove the assertion in case ω is a decomposable form, say

ω(x) = f(x_1, ..., x_n) dx_1 ∧ ··· ∧ d̂x_j ∧ ··· ∧ dx_n.
We then evaluate the integral over the boundary of R. If i ≠ j, then it is clear that

∫_{R_i^0} ω = 0 = ∫_{R_i^1} ω,

so that

∫_{∂°R} ω = (−1)^j ∫_{a_1}^{b_1} ··· ∫_{a_n}^{b_n} [f(x_1, ..., a_j, ..., x_n) − f(x_1, ..., b_j, ..., x_n)] dx_1 ··· d̂x_j ··· dx_n.
On the other hand, from the definitions we find that
\[
d\omega(x) = \Bigl(\frac{\partial f}{\partial x_1}\, dx_1 + \cdots + \frac{\partial f}{\partial x_n}\, dx_n\Bigr) \wedge dx_1 \wedge \cdots \wedge \widehat{dx_j} \wedge \cdots \wedge dx_n
= (-1)^{j-1}\, \frac{\partial f}{\partial x_j}\, dx_1 \wedge \cdots \wedge dx_n.
\]
(The (−1)^{j−1} comes from interchanging dx_j with dx_1, ..., dx_{j−1}. All other terms disappear by the alternation rule.) Integrating dω over R, we may use repeated integration and integrate ∂f/∂x_j with respect to x_j first. Then the fundamental theorem of calculus for one variable yields
\[
\int_{a_j}^{b_j} \frac{\partial f}{\partial x_j}\, dx_j = f(x_1, \ldots, b_j, \ldots, x_n) - f(x_1, \ldots, a_j, \ldots, x_n).
\]
We then integrate with respect to the other variables, and multiply by (−1)^{j−1}. This yields precisely the value found for the integral of ω over the oriented boundary ∂⁰R, and proves the theorem.
In the next two sections, we establish the formalism necessary to extend our results to parametrized sets. These two sections consist mostly of definitions, and trivial statements concerning the formal operations of the objects introduced.
XXI, §3. INVERSE IMAGE OF A FORM

We start with some algebra once more. Let E, F be finite dimensional vector spaces over R and let λ: E → F be a linear map. If μ: F → R is an element of F*, then we may form the composite linear map
\[
\mu \circ \lambda\colon E \to R,
\]
which we visualize as
\[
E \xrightarrow{\ \lambda\ } F \xrightarrow{\ \mu\ } R.
\]
We denote this composite μ ∘ λ by λ*(μ). It is an element of E*. We have a similar definition on the higher alternating products, and in the appendix we shall prove:

Theorem C. Let λ: E → F be a linear map. For each r there exists a unique linear map
\[
\lambda^*\colon \textstyle\bigwedge^r F^* \to \bigwedge^r E^*
\]
having the following properties:
(i) λ*(ω ∧ ψ) = λ*(ω) ∧ λ*(ψ) for ω ∈ ⋀^r F*, ψ ∈ ⋀^s F*.
(ii) If μ ∈ F* then λ*(μ) = μ ∘ λ, and λ* is the identity on ⋀⁰ F* = R.
Remark. If μ_{j₁}, ..., μ_{j_r} are in F*, then from the two properties of Theorem C, we conclude that
\[
\lambda^*(\mu_{j_1} \wedge \cdots \wedge \mu_{j_r}) = (\mu_{j_1} \circ \lambda) \wedge \cdots \wedge (\mu_{j_r} \circ \lambda).
\]
Now we can apply this to differential forms. Let U be open in E = Rⁿ and let V be open in F = R^m. Let f: U → V be a map (C^∞ according to conventions in force). For each x ∈ U we obtain the linear map
\[
f'(x)\colon E \to F,
\]
to which we can apply the preceding discussion. Consequently, we can reformulate Theorem C for differential forms as follows:
Theorem 3.1. Let f: U → V be a map. Then for each r there exists a unique linear map
\[
f^*\colon \Omega^r(V) \to \Omega^r(U)
\]
having the following properties:
(i) For any differential forms ω, ψ on V we have
\[
f^*(\omega \wedge \psi) = f^*(\omega) \wedge f^*(\psi).
\]
(ii) If g is a function on V then f*(g) = g ∘ f, and if ω is a 1-form then
\[
(f^*\omega)(x) = \omega(f(x)) \circ df(x).
\]
We apply Theorem C to get Theorem 3.1 simply by letting λ = f′(x) at a given point x, and we define
\[
(f^*\omega)(x) = f'(x)^*\,\omega(f(x)).
\]
Then Theorem 3.1 is nothing but Theorem C applied at each point x.
Example 1. Let y₁, ..., y_m be the coordinates on V, and let μ_j be the j-th coordinate function, j = 1, ..., m, so that y_j = μ_j(y₁, ..., y_m). Let f: U → V
be the map, with coordinate functions
\[
y_j = f_j(x) = \mu_j \circ f(x).
\]
If
\[
\omega(y) = g(y)\, dy_{j_1} \wedge \cdots \wedge dy_{j_r}
\]
is a differential form on V, then
\[
f^*\omega = (g \circ f)\, df_{j_1} \wedge \cdots \wedge df_{j_r}.
\]
Indeed, we have for x ∈ U:
\[
(f^*\omega)(x) = g(f(x))\,\bigl(\mu_{j_1} \circ f'(x)\bigr) \wedge \cdots \wedge \bigl(\mu_{j_r} \circ f'(x)\bigr),
\]
and
\[
\mu_{j_k} \circ f'(x) = df_{j_k}(x).
\]
Example 2. Let f: [a, b] → R² be a map from an interval into the plane, and let x, y be the coordinates of the plane. Let t be the coordinate in [a, b]. A differential form in the plane can be written in the form
\[
\omega(x, y) = g(x, y)\, dx + h(x, y)\, dy,
\]
where g, h are functions. Then by definition,
\[
f^*\omega(t) = g(x(t), y(t))\,\frac{dx}{dt}\, dt + h(x(t), y(t))\,\frac{dy}{dt}\, dt
\]
if we write f(t) = (x(t), y(t)). Let G = (g, h) be the vector field whose components are g and h. Then we can write
\[
f^*\omega(t) = G(f(t)) \cdot f'(t)\, dt,
\]
which is essentially the expression which we integrated when defining the integral of a vector field along a curve.

Example 3. Let U, V be two open sets in n-space, and let f: U → V be a map. If
\[
\omega(y) = g(y)\, dy_1 \wedge \cdots \wedge dy_n,
\]
where y_j = f_j(x) is the j-th coordinate of y, then
\[
dy_j = D_1 f_j(x)\, dx_1 + \cdots + D_n f_j(x)\, dx_n
= \frac{\partial y_j}{\partial x_1}\, dx_1 + \cdots + \frac{\partial y_j}{\partial x_n}\, dx_n,
\]
and consequently, expanding out the alternating product according to the usual multilinear and alternating rules, we find that
\[
f^*\omega(x) = g(f(x))\,\Delta_f(x)\, dx_1 \wedge \cdots \wedge dx_n.
\]
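The Jacobian determinant appearing in this pullback formula can be checked numerically for a concrete map. In this sketch (my own; the map and the finite-difference helper are illustrative, not from the text) f is the polar coordinate map, whose Jacobian determinant is r, so f*(dx ∧ dy) = r dr ∧ dθ.

```python
import math

def jacobian_det(f, u, v, h=1e-6):
    # Central-difference approximation to the Jacobian determinant
    # of a map f: R^2 -> R^2 at the point (u, v).
    xu1, yu1 = f(u + h, v)
    xu0, yu0 = f(u - h, v)
    xv1, yv1 = f(u, v + h)
    xv0, yv0 = f(u, v - h)
    dxu, dyu = (xu1 - xu0) / (2 * h), (yu1 - yu0) / (2 * h)
    dxv, dyv = (xv1 - xv0) / (2 * h), (yv1 - yv0) / (2 * h)
    return dxu * dyv - dxv * dyu

polar = lambda r, t: (r * math.cos(t), r * math.sin(t))

# The determinant should equal r, independently of theta.
det = jacobian_det(polar, 2.0, 0.7)
print(det)  # approximately 2.0
```

Varying the second argument leaves the result unchanged, reflecting that the determinant depends on r alone for this map.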
As in the preceding chapter, Δ_f is the determinant of the Jacobian matrix of f.

Theorem 3.2. Let f: U → V and g: V → W be maps of open sets. If ω is a differential form on W, then
\[
(g \circ f)^*(\omega) = f^*(g^*(\omega)).
\]

Proof. This is an immediate consequence of the definitions.

Theorem 3.3. Let f: U → V be a map, and ω a differential form on V. Then
\[
f^*(d\omega) = d\, f^*\omega.
\]
In particular, if g is a function on V, then
\[
f^*(dg) = d(g \circ f).
\]
Proof. We first prove this last relation. From the definitions, we have dg(y) = g′(y), whence by the chain rule,
\[
(f^*(dg))(x) = g'(f(x)) \circ f'(x) = (g \circ f)'(x),
\]
and this last term is nothing else but d(g ∘ f)(x), whence the last relation follows. The verification for a 1-form is equally easy, and we leave it as an exercise. [Hint: It suffices to do it for forms of type g(y) dy₁, with y₁ = f₁(x). Use Theorem 1.1 and the fact that dd f₁ = 0.] The general formula can now be proved by induction. Using the linearity of f*, we may assume that ω is expressed as ω = ψ ∧ η, where ψ, η have lower degree. We apply Theorem 1.1 and (i) of Theorem 3.1 to
\[
f^* d\omega = f^*(d\psi \wedge \eta) + (-1)^r f^*(\psi \wedge d\eta)
\]
(where r is the degree of ψ), and we see at once that this is equal to d f*ω, because by induction, f* dψ = d f*ψ and f* dη = d f*η. This proves the theorem.
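The relation f*(dg) = d(g ∘ f) is just the chain rule, and can be spot-checked numerically. In this sketch (my own; the map, the function, and the finite-difference helper are illustrative) f is the polar coordinate map and g(x, y) = x² + y², so both sides should have dr-coefficient 2r and dθ-coefficient 0 at any point.

```python
import math

def partial(func, p, i, eps=1e-6):
    # Central-difference partial derivative of func at the point p.
    hi = list(p); hi[i] += eps
    lo = list(p); lo[i] -= eps
    return (func(*hi) - func(*lo)) / (2 * eps)

g = lambda x, y: x * x + y * y
f = lambda r, t: (r * math.cos(t), r * math.sin(t))
r0, t0 = 1.5, 0.3
x0, y0 = f(r0, t0)

# d(g o f): coefficients of dr and dtheta.
gof = lambda r, t: g(*f(r, t))
lhs = [partial(gof, (r0, t0), 0), partial(gof, (r0, t0), 1)]

# f*(dg): grad g at f(p), composed with the columns of f'(p).
grad_g = [partial(g, (x0, y0), 0), partial(g, (x0, y0), 1)]
col_r = [partial(lambda r, t: f(r, t)[0], (r0, t0), 0),
         partial(lambda r, t: f(r, t)[1], (r0, t0), 0)]
col_t = [partial(lambda r, t: f(r, t)[0], (r0, t0), 1),
         partial(lambda r, t: f(r, t)[1], (r0, t0), 1)]
rhs = [grad_g[0] * col_r[0] + grad_g[1] * col_r[1],
       grad_g[0] * col_t[0] + grad_g[1] * col_t[1]]

print(lhs, rhs)  # both approximately [3.0, 0.0], i.e. 2*r0 dr
```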
XXI, §3. EXERCISES

1. Let the polar coordinate map be given by
\[
(x, y) = f(r, \theta) = (r\cos\theta,\ r\sin\theta).
\]
Give the standard form for f*(dx), f*(dy), and f*(dx ∧ dy).

2. Let the spherical coordinate map be given by
\[
(x_1, x_2, x_3) = f(r, \theta_1, \theta_2) = (r\cos\theta_1,\ r\sin\theta_1\cos\theta_2,\ r\sin\theta_1\sin\theta_2).
\]
Give the standard form for f*(dx₁), f*(dx₂), f*(dx₃), f*(dx₁ ∧ dx₂), f*(dx₁ ∧ dx₃), f*(dx₂ ∧ dx₃), and f*(dx₁ ∧ dx₂ ∧ dx₃).
XXI, §4. STOKES' FORMULA FOR SIMPLICES

In practice, we integrate over parametrized sets. The whole idea of the preceding section and the present one is to reduce such integration to integrals over domains in euclidean space. The definitions we shall make will generalize the notion of integral along a curve discussed previously, and mentioned again in Example 2 of §3.

Let R be a rectangle contained in an open set U in Rⁿ, and let σ: U → V be a map (C^∞ according to conventions in force) of U into an open set V in R^m. For simplicity of notation, we agree to write this map simply as
\[
\sigma\colon R \to V.
\]
In other words, when we speak from now on of a map σ: R → V, it is understood that this map is the restriction of a map defined on an open set U containing R, and that it is C^∞ on U. A map σ as above will then be called a simplex. Let ω be a differential form on V, of dimension n (the same as the dimension of R). We define
\[
\int_\sigma \omega = \int_R \sigma^*\omega.
\]
Let σ₁, ..., σ_s be distinct simplices, and c₁, ..., c_s be real numbers. A formal linear combination
\[
\gamma = c_1\sigma_1 + \cdots + c_s\sigma_s
\]
will be called a chain. (For the precise definition of a formal linear combination, see the appendix.) We then define
\[
\int_\gamma \omega = \sum_{i=1}^{s} c_i \int_{\sigma_i} \omega.
\]
This will be useful when we want to integrate over several pieces, with certain coefficients, as in the oriented boundary of a rectangle. Let σ: R → V be a simplex, and let
\[
R = [a_1, b_1] \times \cdots \times [a_n, b_n].
\]
Let
\[
R_i = [a_1, b_1] \times \cdots \times \widehat{[a_i, b_i]} \times \cdots \times [a_n, b_n].
\]
We parametrize the i-th pieces of the boundary of R by the maps
\[
\sigma_i^0\colon R_i \to V \quad\text{and}\quad \sigma_i^1\colon R_i \to V
\]
defined by
\[
\sigma_i^0(x_1, \ldots, \widehat{x_i}, \ldots, x_n) = \sigma(x_1, \ldots, a_i, \ldots, x_n),
\qquad
\sigma_i^1(x_1, \ldots, \widehat{x_i}, \ldots, x_n) = \sigma(x_1, \ldots, b_i, \ldots, x_n).
\]
Observe that omitting the variable x_i on the left leaves n − 1 variables, but that we number them in a way designed to preserve the relationship with the original n variables. We define the boundary of σ to be the chain
\[
\partial\sigma = \sum_{i=1}^{n} (-1)^i\,(\sigma_i^0 - \sigma_i^1).
\]
Example. We consider the case n = 2. Then R is a genuine rectangle,
\[
R = [a, b] \times [c, d].
\]
[Figure: the rectangle R with its four sides labeled σ₁⁰, σ₁¹, σ₂⁰, σ₂¹.]
We then find:
\[
\sigma_1^0(y) = (a, y), \quad \sigma_1^1(y) = (b, y), \quad \sigma_2^0(x) = (x, c), \quad \sigma_2^1(x) = (x, d).
\]
Then
\[
\partial\sigma = -\sigma_1^0 + \sigma_1^1 + \sigma_2^0 - \sigma_2^1
\]
is the oriented boundary, and corresponds to going around the square counterclockwise.

In general, consider the identity mapping I: R → R on the rectangle. Let ψ be an (n − 1)-form. We may view ∂⁰R as ∂I, so that
\[
\int_{\partial^0 R} \psi = \int_{\partial I} \psi = \sum_{i=1}^{n} (-1)^i \Bigl[\int_{I_i^0} \psi - \int_{I_i^1} \psi\Bigr].
\]
If σ: R → V is a simplex, and ω is an (n − 1)-form on V, then
\[
\int_{\partial\sigma} \omega = \int_{\partial^0 R} \sigma^*(\omega),
\]
as one sees at once by considering the composite maps σ_i^ν = σ ∘ I_i^ν.
Stokes' theorem for simplices. Let V be open in R^m and let ω be an (n − 1)-form on V. Let σ: R → V be an n-simplex in V. Then
\[
\int_\sigma d\omega = \int_{\partial\sigma} \omega.
\]

Proof. Since dσ*ω = σ* dω, it will suffice to prove that for any (n − 1)-form ψ on R we have
\[
\int_R d\psi = \int_{\partial^0 R} \psi.
\]
This is nothing else but Stokes' theorem for rectangles, so Stokes' theorem for simplices is simply a combination of Stokes' theorem for rectangles together with the formalism of inverse images of forms. In practice one parametrizes certain subsets of euclidean space by simplices, and one can then integrate differential forms over such subsets. This leads into the study of manifolds, which is treated in my Real and Functional Analysis. In the exercises, we indicate some simple situations where a much more elementary approach can be taken.
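As a concrete instance of the definition ∫_σ ω = ∫_R σ*ω, the sketch below (my own example, not from the text) takes the 1-simplex σ: [0, 2π] → R², σ(t) = (cos t, sin t), and ω = x dy. Then σ*ω = cos²t dt, whose integral over [0, 2π] is π, the area enclosed by the circle.

```python
import math

# Integral of omega = x dy over the circle sigma(t) = (cos t, sin t),
# computed as the integral of the pullback cos(t)^2 dt over [0, 2*pi].
n = 100000
dt = 2 * math.pi / n
integral = sum(math.cos(k * dt) ** 2 for k in range(n)) * dt
print(integral)  # approximately pi
```

Uniform sampling of a smooth periodic integrand over a full period makes this Riemann sum essentially exact.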
XXI, §4. EXERCISES

1. Instead of using rectangles, one can use triangles in Stokes' theorem. Develop this parallel theory as follows. Let v₀, ..., v_k be elements of Rⁿ such that vᵢ − v₀ (i = 1, ..., k) are linearly independent. We define the triangle spanned by v₀, ..., v_k to consist of all points
\[
t_0 v_0 + \cdots + t_k v_k
\]
with real tᵢ such that 0 ≤ tᵢ and t₀ + ··· + t_k = 1. We denote this triangle by T, or T(v₀, ..., v_k).
(a) Let wᵢ = vᵢ − v₀ for i = 1, ..., k. Let S be the set of points
\[
s_1 w_1 + \cdots + s_k w_k
\]
with sᵢ ≥ 0 and s₁ + ··· + s_k ≤ 1. Show that T(v₀, ..., v_k) is the translation of S by v₀.
Define the oriented boundary of the triangle T to be the chain
\[
\partial^0 T = \sum_{j=0}^{k} (-1)^j\, T(v_0, \ldots, \widehat{v_j}, \ldots, v_k).
\]
(b) Assume that k = n, and that T is contained in an open set U of Rⁿ. Let ω be an (n − 1)-form on U. In analogy to Stokes' theorem for rectangles, show that
\[
\int_T d\omega = \int_{\partial^0 T} \omega.
\]

The subsequent exercises do not depend on anything fancy, and occur in R². Essentially you don't need to know anything from this chapter.

2. Let A be the region of R² bounded by the inequalities
\[
a \le x \le b \quad\text{and}\quad g_1(x) \le y \le g_2(x),
\]
where g₁, g₂ are continuous functions on [a, b]. Let C be the path consisting of the boundary of this region, oriented counterclockwise, as on the following picture:

[Figure: the region A between the graphs of g₁ and g₂ over [a, b], with its boundary C oriented counterclockwise.]

Show that if P is a continuous function of two variables on A (with the partial derivative ∂P/∂y existing and continuous), then
\[
\int_C P\, dx = -\iint_A \frac{\partial P}{\partial y}\, dy\, dx.
\]
Prove a similar statement for regions defined by similar inequalities but with respect to y. This yields Green's theorem in special cases. The general case of Green's theorem is that if A is the interior of a closed piecewise C¹ path C oriented counterclockwise and ω is a 1-form, then
\[
\int_C \omega = \iint_A d\omega.
\]
In the subsequent exercises, you may assume Green's theorem.

3. Assume that the function f satisfies Laplace's equation
\[
\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0
\]
on a region A which is the interior of a curve C, oriented counterclockwise. Show that
\[
\int_C \frac{\partial f}{\partial y}\, dx - \frac{\partial f}{\partial x}\, dy = 0.
\]
4. If F = (Q, P) is a vector field, we recall that its divergence is defined to be
\[
\operatorname{div} F = \frac{\partial Q}{\partial x} + \frac{\partial P}{\partial y}.
\]
If C is a curve, we say that C is parametrized by arc length if ‖C′(s)‖ = 1 (we then use s as the parameter). Let
\[
C(s) = (g_1(s), g_2(s))
\]
be parametrized by arc length. Define the unit normal vector at s to be the vector
\[
N(s) = (g_2'(s), -g_1'(s)).
\]
Verify that this is a unit vector. Show that if F is a vector field on a region A, which is the interior of the closed curve C, oriented counterclockwise and parametrized by arc length, then
\[
\iint_A (\operatorname{div} F)\, dy\, dx = \int_C F \cdot N\, ds.
\]
If C is not parametrized by arc length, we define the unit normal vector by
\[
n(t) = \frac{N(t)}{|N(t)|},
\]
where |N(t)| is the euclidean norm. For any function f we define the normal derivative (the directional derivative in the normal direction) to be
\[
D_n f = (\operatorname{grad} f) \cdot n,
\]
so for any value of the parameter t, we have
\[
(D_n f)(t) = \operatorname{grad} f(C(t)) \cdot n(t).
\]

5. Prove Green's formulas for a region A bounded by a simple closed curve C, always assuming Green's theorem (here Δf denotes the Laplacian ∂²f/∂x² + ∂²f/∂y²):
(a) \(\displaystyle \iint_A \bigl[(\operatorname{grad} f)\cdot(\operatorname{grad} g) + g\,\Delta f\bigr]\, dx\, dy = \int_C g\, D_n f\, ds.\)
(b) \(\displaystyle \iint_A (g\,\Delta f - f\,\Delta g)\, dx\, dy = \int_C (g\, D_n f - f\, D_n g)\, ds.\)
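The divergence identity of Exercise 4 can be checked numerically for a concrete field. In this sketch (my own; the field F = (x + 2y, y) is an arbitrary illustrative choice) div F = 2, so the integral of div F over the unit disc is 2π, and the flux ∫_C F · N ds through the arc-length-parametrized unit circle should agree.

```python
import math

# F = (Q, P) with Q = x + 2*y, P = y, so div F = dQ/dx + dP/dy = 2,
# and the integral of div F over the unit disc is 2*pi.
F = lambda x, y: (x + 2 * y, y)

n = 100000
ds = 2 * math.pi / n
flux = 0.0
for k in range(n):
    s = k * ds
    x, y = math.cos(s), math.sin(s)        # C(s), arc length parameter
    Nx, Ny = math.cos(s), math.sin(s)      # N(s) = (g2'(s), -g1'(s))
    Q, P = F(x, y)
    flux += (Q * Nx + P * Ny) * ds

print(flux, 2 * math.pi)  # flux approximately 2*pi
```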
6. Let C: [a, b] → U be a C¹-curve in an open set U of the plane. If f is a function on U (assumed to be differentiable as needed), we define
\[
\int_C f = \int_a^b f(C(t))\,\|C'(t)\|\, dt = \int_a^b f(C(t))\, \sqrt{\Bigl(\frac{dx}{dt}\Bigr)^2 + \Bigl(\frac{dy}{dt}\Bigr)^2}\, dt.
\]
For r > 0, let x = r cos θ and y = r sin θ. Let φ be the function of r defined by
\[
\varphi(r) = \frac{1}{2\pi r} \int_{C_r} f = \frac{1}{2\pi r} \int_0^{2\pi} f(r\cos\theta, r\sin\theta)\, r\, d\theta,
\]
where C_r is the circle of radius r, parametrized as above. Assume that f satisfies Laplace's equation
\[
\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0.
\]
Show that φ(r) does not depend on r, and in fact
\[
f(0, 0) = \frac{1}{2\pi r} \int_{C_r} f.
\]
[Hint: First take φ′(r) and differentiate under the integral, with respect to r. Let D_r be the disc of radius r which is the interior of C_r. Using Exercise 4, you will find that
\[
\varphi'(r) = \frac{1}{2\pi r} \iint_{D_r} \operatorname{div} \operatorname{grad} f(x, y)\, dy\, dx
= \frac{1}{2\pi r} \iint_{D_r} \Bigl(\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}\Bigr)\, dy\, dx = 0.
\]
Taking the limit as r → 0, prove the desired assertion.]
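The conclusion of Exercise 6 (the mean value property of harmonic functions) is easy to test numerically. In this sketch (my own; the harmonic function and the sampling grid are illustrative) f(x, y) = x² − y² satisfies Laplace's equation, so its average over any circle centered at the origin should equal f(0, 0) = 0.

```python
import math

f = lambda x, y: x * x - y * y   # harmonic: f_xx + f_yy = 2 - 2 = 0

def circle_average(f, r, n=10000):
    # (1 / (2*pi*r)) * integral of f over C_r, with ds = r d(theta).
    total = sum(f(r * math.cos(2 * math.pi * k / n),
                  r * math.sin(2 * math.pi * k / n))
                for k in range(n))
    return total / n

avg = circle_average(f, 3.0)
print(avg)  # approximately f(0, 0) = 0
```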
Appendix
We shall give brief reviews of the proofs of the algebraic theorems which have been quoted in this chapter.

We first discuss "formal linear combinations." Let S be a set. We wish to define what we mean by expressions
\[
c_1 s_1 + \cdots + c_n s_n,
\]
where {cᵢ} are numbers, and {sᵢ} are distinct elements of S. What do we wish such a "sum" to be like? Well, we wish it to be entirely determined by the "coefficients" cᵢ, and each "coefficient" cᵢ should be associated with the element sᵢ of the set S. But an association is nothing but a function. This suggests to us how to define "sums" as above. For each s ∈ S and each number c we define the symbol
\[
cs
\]
to be the function which associates c to s and 0 to z for any element z ∈ S, z ≠ s. If b, c are numbers, then clearly
\[
b(cs) = (bc)s \quad\text{and}\quad (b + c)s = bs + cs.
\]
We let T be the set of all functions defined on S which can be written in the form
\[
c_1 s_1 + \cdots + c_n s_n,
\]
where cᵢ are numbers, and sᵢ are distinct elements of S. Note that we have no problem now about addition, since we know how to add functions.
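The construction just described, sums as finitely supported functions, can be modeled directly. This sketch is my own (the class name and the dict representation are illustrative, not from the text); it implements cs as a function on S, and addition and scaling come for free from addition and scaling of functions.

```python
class FormalSum:
    """A function on S with finite support: s -> coefficient of s."""
    def __init__(self, coeffs=None):
        # Drop zero coefficients so distinct sums have distinct supports.
        self.coeffs = {s: c for s, c in (coeffs or {}).items() if c != 0}

    def __add__(self, other):
        keys = set(self.coeffs) | set(other.coeffs)
        return FormalSum({s: self.coeffs.get(s, 0) + other.coeffs.get(s, 0)
                          for s in keys})

    def __rmul__(self, c):
        # Scalar multiple c * (formal sum): scale every coefficient.
        return FormalSum({s: c * v for s, v in self.coeffs.items()})

    def __call__(self, s):
        # Value of the function at s: the coefficient of s (0 off support).
        return self.coeffs.get(s, 0)

s1, s2 = FormalSum({'a': 1}), FormalSum({'b': 1})   # the symbols 1a, 1b
v = 3 * s1 + (-2) * s2                               # the formal sum 3a - 2b
print(v('a'), v('b'), v('z'))  # 3 -2 0
```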
We contend that if s₁, ..., sₙ are distinct elements of S, then 1s₁, ..., 1sₙ are linearly independent. To prove this, suppose c₁, ..., cₙ are numbers such that
\[
c_1 s_1 + \cdots + c_n s_n = 0
\]
(the zero function). Then by definition, the left-hand side takes on the value cᵢ at sᵢ, and hence cᵢ = 0. This proves the desired linear independence. In practice, it is convenient to abbreviate the notation, and to write simply sᵢ instead of 1sᵢ. The elements of T, which are called formal linear combinations of elements of S, can be expressed in the form
\[
c_1 s_1 + \cdots + c_n s_n,
\]
and any given element has a unique such expression, because of the linear independence of s₁, ..., sₙ. This justifies our terminology.

We now come to the statements concerning multilinear alternating products. Let E, F be vector spaces over R. As before, let
\[
E^{(r)} = E \times \cdots \times E,
\]
taken r times. Let
\[
f\colon E^{(r)} \to F
\]
be an r-multilinear alternating map. Let v₁, ..., vₙ be linearly independent elements of E. Let A = (a_{ij}) be an r × n matrix, and let
\[
u_1 = a_{11} v_1 + \cdots + a_{1n} v_n, \quad \ldots, \quad u_r = a_{r1} v_1 + \cdots + a_{rn} v_n.
\]
Then
\[
f(u_1, \ldots, u_r) = f\bigl(a_{11} v_1 + \cdots + a_{1n} v_n,\ \ldots,\ a_{r1} v_1 + \cdots + a_{rn} v_n\bigr)
= \sum_\sigma f\bigl(a_{1,\sigma(1)} v_{\sigma(1)}, \ldots, a_{r,\sigma(r)} v_{\sigma(r)}\bigr)
= \sum_\sigma a_{1,\sigma(1)} \cdots a_{r,\sigma(r)}\, f\bigl(v_{\sigma(1)}, \ldots, v_{\sigma(r)}\bigr),
\]
where the sum is taken over all maps σ: {1, ..., r} → {1, ..., n}. In this sum, all the terms will be 0 whenever σ is not an injective mapping, that is whenever there is some pair i, j with i ≠ j such that σ(i) = σ(j), because of the alternating property of f. From now on, we consider only injective maps σ. Then (σ(1), ..., σ(r)) is simply a permutation of some r-tuple (i₁, ..., i_r) with i₁ < ··· < i_r.

We wish to rewrite this sum in terms of determinants. For each subset S of {1, ..., n} consisting of precisely r elements, we can take the r × r submatrix of A consisting of those elements a_{ij} such that j ∈ S. We denote by
\[
\mathrm{Det}_S(A)
\]
the determinant of this submatrix. We also call it the subdeterminant of A corresponding to the set S. We denote by P(S) the set of maps
\[
\sigma\colon \{1, \ldots, r\} \to \{1, \ldots, n\}
\]
whose image is precisely the set S. Then
\[
\mathrm{Det}_S(A) = \sum_{\sigma \in P(S)} \varepsilon(\sigma)\, a_{1,\sigma(1)} \cdots a_{r,\sigma(r)},
\]
and in terms of this notation, we can write our expression for f(u₁, ..., u_r) in the form
\[
(1)\qquad f(u_1, \ldots, u_r) = \sum_S \mathrm{Det}_S(A)\, f(v_S),
\]
where v_S denotes (v_{i₁}, ..., v_{i_r}) if i₁ < ··· < i_r are the elements of the set S. The sum over S is taken over all subsets of {1, ..., n} having precisely r elements.

Theorem A. Let E be a vector space over R, of dimension n. Let r be an integer, 1 ≤ r ≤ n. There exists a finite dimensional space ⋀^r E and an r-multilinear alternating map E^(r) → ⋀^r E, denoted by
\[
(u_1, \ldots, u_r) \mapsto u_1 \wedge \cdots \wedge u_r,
\]
satisfying the following properties:

AP 1. If F is a vector space over R and g: E^(r) → F is an r-multilinear alternating map, then there exists a unique linear map
\[
g_*\colon \textstyle\bigwedge^r E \to F
\]
such that for all u₁, ..., u_r ∈ E we have
\[
g(u_1, \ldots, u_r) = g_*(u_1 \wedge \cdots \wedge u_r).
\]
AP 2. If {v₁, ..., vₙ} is a basis of E, then the set of elements
\[
v_{i_1} \wedge \cdots \wedge v_{i_r}, \qquad 1 \le i_1 < \cdots < i_r \le n,
\]
is a basis of ⋀^r E.
Proof. For each subset S of {1, ..., n} consisting of precisely r elements, we select a letter t_S. As explained at the beginning of the section, these letters t_S form a basis of a vector space whose dimension is equal to the binomial coefficient \(\binom{n}{r}\). It is the space of formal linear combinations of these letters. Instead of t_S, we could also write t_{(i)} = t_{i_1 \cdots i_r} with i₁ < ··· < i_r. Let {v₁, ..., vₙ} be a basis of E, and let u₁, ..., u_r be elements of E. Let A = (a_{ij}) be the matrix of numbers such that
\[
u_i = a_{i1} v_1 + \cdots + a_{in} v_n.
\]
Define
\[
u_1 \wedge \cdots \wedge u_r = \sum_S \mathrm{Det}_S(A)\, t_S.
\]
We contend that this product has the required properties. The fact that it is multilinear and alternating simply follows from the corresponding property of the determinant. We note that if S = {i₁, ..., i_r} with i₁ < ··· < i_r, then
\[
v_{i_1} \wedge \cdots \wedge v_{i_r} = t_S.
\]
A standard theorem on linear maps asserts that there always exists a unique linear map having prescribed values on basis elements. In particular, if g: E^(r) → F is a multilinear alternating map, then there exists a unique linear map
\[
g_*\colon \textstyle\bigwedge^r E \to F
\]
such that for each set S we have
\[
g_*(t_S) = g(v_S) = g(v_{i_1}, \ldots, v_{i_r})
\]
if i₁, ..., i_r are as above. By formula (1), it follows that
\[
g(u_1, \ldots, u_r) = g_*(u_1 \wedge \cdots \wedge u_r)
\]
for all elements u₁, ..., u_r of E. This proves AP 1.
As for AP 2, let {w₁, ..., wₙ} be a basis of E. From the expansion of (1), it follows that the elements {w_S}, i.e. the elements {w_{i₁} ∧ ··· ∧ w_{i_r}} with all possible choices of r-tuples (i₁, ..., i_r) satisfying i₁ < ··· < i_r, are generators of ⋀^r E. The number of such elements is precisely \(\binom{n}{r}\). Hence they must be linearly independent, and form a basis of ⋀^r E, as was to be shown.
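The proof's construction of u₁ ∧ ··· ∧ u_r through the subdeterminants Det_S(A) is short to implement. In this sketch (my own; the function names are illustrative) the rows of A are the coordinate vectors of the u_i in the basis {v₁, ..., vₙ}, and the result maps each r-element subset S to the coefficient of t_S.

```python
from itertools import combinations

def det(M):
    # Determinant by Laplace expansion along the first row;
    # fine for the small matrices that occur here.
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M[0])))

def wedge(*rows):
    # Coordinates of u_1 ^ ... ^ u_r in the basis {t_S}: one
    # subdeterminant Det_S(A) per r-element subset S of the columns.
    r, n = len(rows), len(rows[0])
    return {S: det([[rows[i][j] for j in S] for i in range(r)])
            for S in combinations(range(n), r)}

u1, u2 = (1, 0, 2), (0, 1, 3)
print(wedge(u1, u2))   # {(0, 1): 1, (0, 2): 3, (1, 2): -2}
```

Repeating a vector gives the zero element, reflecting the alternating property inherited from the determinant.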
Theorem B. For each pair of positive integers (r, s) there exists a unique bilinear map
\[
\textstyle\bigwedge^r E \times \bigwedge^s E \to \bigwedge^{r+s} E
\]
such that if u₁, ..., u_r, w₁, ..., w_s ∈ E, then
\[
(u_1 \wedge \cdots \wedge u_r) \times (w_1 \wedge \cdots \wedge w_s) \mapsto u_1 \wedge \cdots \wedge u_r \wedge w_1 \wedge \cdots \wedge w_s.
\]
This product is associative.

Proof. For each r-tuple (u₁, ..., u_r), consider the map of E^(s) into ⋀^{r+s} E given by
\[
(w_1, \ldots, w_s) \mapsto u_1 \wedge \cdots \wedge u_r \wedge w_1 \wedge \cdots \wedge w_s.
\]
This map is obviously s-multilinear and alternating. Consequently, by AP 1 of Theorem A, there exists a unique linear map
\[
g_{u_1, \ldots, u_r}\colon \textstyle\bigwedge^s E \to \bigwedge^{r+s} E
\]
such that for any elements w₁, ..., w_s ∈ E we have
\[
g_{u_1, \ldots, u_r}(w_1 \wedge \cdots \wedge w_s) = u_1 \wedge \cdots \wedge u_r \wedge w_1 \wedge \cdots \wedge w_s.
\]
Now the association (u₁, ..., u_r) ↦ g_{u₁, ..., u_r} is clearly an r-multilinear alternating map of E^(r) into L(⋀^s E, ⋀^{r+s} E), and again by AP 1 of Theorem A, there exists a unique linear map
\[
g_*\colon \textstyle\bigwedge^r E \to L\bigl(\bigwedge^s E,\ \bigwedge^{r+s} E\bigr)
\]
such that for all elements u₁, ..., u_r ∈ E we have
\[
g_{u_1, \ldots, u_r} = g_*(u_1 \wedge \cdots \wedge u_r).
\]
To obtain the desired product
\[
\textstyle\bigwedge^r E \times \bigwedge^s E \to \bigwedge^{r+s} E,
\]
we simply take the association
\[
(\omega, \psi) \mapsto g_*(\omega)(\psi).
\]
It is bilinear, and is uniquely determined since elements of the form u₁ ∧ ··· ∧ u_r generate ⋀^r E, and elements of the form w₁ ∧ ··· ∧ w_s generate ⋀^s E. This product is associative, as one sees at once on decomposable elements, and then on all elements by linearity. This proves Theorem B.

Let E, F be vector spaces, finite dimensional over R, and let λ: E → F be a linear map. If μ: F → R is an element of the dual space F*, i.e. a linear map of F into R, then we may form the composite linear map
\[
\mu \circ \lambda\colon E \to R,
\]
which we visualize as
\[
E \xrightarrow{\ \lambda\ } F \xrightarrow{\ \mu\ } R.
\]
We denote this composite μ ∘ λ by λ*(μ). It is an element of E*.
Theorem C. Let λ: E → F be a linear map. For each r there exists a unique linear map
\[
\lambda^*\colon \textstyle\bigwedge^r F^* \to \bigwedge^r E^*
\]
having the following properties:
(i) λ*(ω ∧ ψ) = λ*(ω) ∧ λ*(ψ) for ω ∈ ⋀^r F*, ψ ∈ ⋀^s F*.
(ii) If μ ∈ F* then λ*(μ) = μ ∘ λ, and λ* is the identity on ⋀⁰ F* = R.

Proof. The composition of mappings given by
\[
(\mu_1, \ldots, \mu_r) \mapsto (\mu_1 \circ \lambda) \wedge \cdots \wedge (\mu_r \circ \lambda)
\]
is obviously multilinear and alternating. Hence there exists a unique linear map ⋀^r F* → ⋀^r E* such that
\[
\mu_1 \wedge \cdots \wedge \mu_r \mapsto (\mu_1 \circ \lambda) \wedge \cdots \wedge (\mu_r \circ \lambda).
\]
Property (i) now follows by linearity and the fact that decomposable elements μ₁ ∧ ··· ∧ μ_r generate ⋀^r F*. Property (ii) comes from the definition. This proves Theorem C.
UNDERGRADUATE TEXTS IN MATHEMATICS
This is a logically self-contained introduction to analysis, suitable for students who have had two years of calculus. The book centers around those properties that have to do with uniform convergence and uniform limits in the context of differentiation and integration. Topics discussed include the classical test for convergence of series, Fourier series, polynomial approximation, the Poisson kernel, the construction of harmonic functions on the disk, ordinary differential equations, curve integrals, derivatives in vector spaces, multiple integrals, and others. One of the author's main concerns is to achieve a balance between concrete examples and general theorems, augmented by a variety of interesting exercises. Some new material has been added in this second edition, for example: a new chapter on the global version of integration of locally integrable vector fields; a brief discussion of L1-Cauchy sequences, introducing students to the Lebesgue integral; more material on Dirac sequences and families, including a section on the heat kernel; a more systematic discussion of orders of magnitude; and a number of new exercises.
springeronline.com