132 A Taste of Topology Chapter 2

0, prove that there exists a path A in the $\epsilon$-neighborhood of [p, q] that joins p to q and is disjoint from M. [Hint: Think of A as a bisector of M. From this bisection fact a dyadic disc partition of M can be constructed, which leads to the proof that M is tame.]

Consider the Hilbert cube H. Prove that H is compact with respect to the metric

$$d(x, y) = \sup_n |x_n - y_n|$$

where $x = (x_n)$, $y = (y_n)$. [Hint: sequences of sequences.]

Remark Although compact, H is infinite-dimensional and is homeomorphic to no subset of $\mathbb{R}^m$.

113. Prove that the Hilbert cube is perfect and homeomorphic to its Cartesian square, $H \cong H \times H$.

***114. Assume that M is compact, non-empty, perfect, and homeomorphic to its Cartesian square, $M \cong M \times M$. Must M be homeomorphic to the Cantor set, the Hilbert cube, or some combination of them?

115. A Peano space is a metric space M that is the continuous image of the unit interval: there is a continuous surjection $\tau : [0, 1] \to M$. Theorem 68 states the amazing fact that the 2-disc is a Peano space. Prove that every Peano space is
(a) compact,
(b) non-empty,
(c) path-connected,
*(d) and locally path-connected, in the sense that for each $p \in M$ and each neighborhood U of p there is a smaller neighborhood V of p such that any two points of V can be joined by a path in U.

*116. The converse to Exercise 115 is the Hahn-Mazurkiewicz Theorem. Assume that a metric space M is compact, non-empty, path-connected, and locally path-connected. Use the Cantor Surjection Theorem 65 to show that M is a Peano space. [The key is to make uniformly short paths to fill in the gaps of $[0, 1] \setminus C$.]

117. One of the famous theorems in plane topology is the Jordan Curve Theorem. It states that if $f : [a, b] \to \mathbb{R}^2$ is continuous, $f(a) = f(b)$, and for no other pair of distinct $s, t \in [a, b]$ does $f(s)$ equal $f(t)$, then the complement of the path f in $\mathbb{R}^2$ consists of two disjoint, connected open sets, its inside and its outside. Prove the Jordan Curve Theorem for the circle, the square, the triangle, and, if you have courage, any simple closed polygon.

118. Prove that there is a continuous surjection $\mathbb{R} \to \mathbb{R}^2$. What about $\mathbb{R}^n$?

119. The utility problem gives three houses 1, 2, 3 in the plane and the three utilities, Gas, Water, and Electricity. You are supposed to connect each house to the three utilities without crossing utility lines. (The houses and utilities are disjoint.)
Exercises
133
(a) Use the Jordan curve theorem to show that there is no solution to the utility problem in the plane.
*(b) Show also that the utility problem cannot be solved on the 2-sphere $S^2$.
*(c) Show that the utility problem can be solved on the surface of the torus.
*(d) What about the surface of the Klein bottle?
***(e) Given utilities $U_1, \ldots, U_m$ and houses $H_1, \ldots, H_n$ located on a surface with g handles, find necessary and sufficient conditions on m, n, g so that the utility problem can be solved.

120. The open cylinder is $(0, 1) \times S^1$. The punctured plane is $\mathbb{R}^2 \setminus \{0\}$.
(a) Prove that the open cylinder is homeomorphic to the punctured plane.
(b) Prove that the open cylinder, the double cone, and the plane are not homeomorphic.

121. Is the closed strip $\{(x, y) \in \mathbb{R}^2 : 0 \le x \le 1\}$ homeomorphic to the closed half-plane $\{(x, y) \in \mathbb{R}^2 : x \ge 0\}$? Prove or disprove.

122. Is the plane minus four points on the x-axis homeomorphic to the plane minus four points in an arbitrary configuration?

123. Suppose that $A, B \subset \mathbb{R}^2$.
(a) If A and B are homeomorphic, are their complements homeomorphic?
*(b) What if A and B are compact?
***(c) What if A and B are compact and connected?

**124. Let M be a metric space and let $\mathcal{K}$ denote the class of non-empty compact subsets of M. The r-neighborhood of $A \in \mathcal{K}$ is

$$M_r A = \{x \in M : \exists\, a \in A \text{ and } d(x, a) < r\} = \bigcup_{a \in A} M_r a.$$

For $A, B \in \mathcal{K}$ define

$$D(A, B) = \inf\{r > 0 : A \subset M_r B \text{ and } B \subset M_r A\}.$$

(a) Show that D is a metric on $\mathcal{K}$. (It is called the Hausdorff metric.)
(b) Denote by $\mathcal{F}$ the collection of finite non-empty subsets of M and prove that $\mathcal{F}$ is dense in $\mathcal{K}$. That is, given $A \in \mathcal{K}$ and given $\epsilon > 0$ show that there exists $F \in \mathcal{F}$ such that $D(A, F) < \epsilon$.
(c) If M is compact, prove that $\mathcal{K}$ is compact.
(d) If M is connected, prove that $\mathcal{K}$ is connected.
***(e) If M is path-connected is $\mathcal{K}$ path-connected?
(f) If M and M' are homeomorphic does it follow that $\mathcal{K}$ and $\mathcal{K}'$ are homeomorphic?
***(g) What about the converse?

125. As on page 105, consider the subsets of $\mathbb{R}$,

$$A = \{0\} \cup [1, 2] \cup \{3\} \quad \text{and} \quad B = \{0\} \cup \{1\} \cup [2, 3].$$

(a) Why is there no ambient homeomorphism of $\mathbb{R}$ to itself that carries A onto B?
(b) Thinking of $\mathbb{R}$ as the x-axis, is there an ambient homeomorphism of $\mathbb{R}^2$ to itself that carries A onto B?

**126. Consider an overhand (trefoil) knot K in $\mathbb{R}^3$. It can be shown that there is no homeomorphism of $\mathbb{R}^3$ to itself that sends K to the standard unit circle $S^1 \subset \mathbb{R}^2$. (See Rolfsen's book, Knots and Links.) Thinking of $\mathbb{R}^3$ as the plane $x_4 = 0$ in $(x_1, x_2, x_3, x_4)$-space $\mathbb{R}^4$, show that there is a homeomorphism of $\mathbb{R}^4$ to itself that carries K onto $S^1$.

**127. Start with a set $S \subset \mathbb{R}$ and successively take its closure, the complement of its closure, the closure of that, and so on: $S, \operatorname{cl}(S), (\operatorname{cl}(S))^c, \ldots$. Do the same to $S^c$. In total, how many distinct subsets of $\mathbb{R}$ can be produced this way? In particular decide whether each chain $S, \operatorname{cl}(S), \ldots$ consists of only finitely many sets. For example, if $S = \mathbb{Q}$ then we get $\mathbb{Q}, \mathbb{R}, \emptyset, \mathbb{R}, \emptyset, \mathbb{R}, \emptyset, \ldots$, and $\mathbb{Q}^c, \mathbb{R}, \emptyset, \mathbb{R}, \ldots$ for a total of four sets.

**128. Consider the letter T.
(a) Prove that there is no way to place uncountably many copies of the letter T disjointly in the plane. [Hint: First prove this when the unit square replaces the plane.]
(b) Prove that there is no way to place uncountably many homeomorphic copies of the letter T disjointly in the plane.
(c) For which other letters of the alphabet is this true?
(d) Let U be a set in $\mathbb{R}^3$ formed like an umbrella: it is a disc with a perpendicular segment attached to its center. Prove that uncountably many copies of U cannot be placed disjointly in $\mathbb{R}^3$.
(e) What if the perpendicular segment is attached to the boundary of the disc?

**129. Let M be a complete, separable metric space such as $\mathbb{R}^m$. Prove the Cupcake Theorem: each closed set $K \subset M$ can be expressed uniquely as the disjoint union of a countable set and a perfect closed set, $C \sqcup P = K$.
*130. Write jingles at least as good as the following.

When a set in the plane is closed and bounded,
you can always draw a curve around it.
    Peter Pfibik

'Tis a most indisputable fact
If you want to make something compact
Make it bounded and closed
For you're totally hosed
If either condition you lack.

Lest the reader infer an untruth
(Which I think would be highly uncouth)
I must hasten to add
There are sets to be had
Where the converse is false, fo'sooth.
    Karla Westfahl

A coffee cup feeling quite dazed,
said to a donut, amazed,
an open surjective continuous injection,
You'd be plastic and I'd be glazed.
    Norah Esty

If a clopen set can be detected,
Your metric space is disconnected.
    David Owens

Pre-lim problems†

1. Suppose that $f : \mathbb{R}^m \to \mathbb{R}$ satisfies two conditions:
(i) For each compact set K, $f(K)$ is compact.
(ii) For any nested decreasing sequence of compacts $(K_n)$, $f(\bigcap K_n) = \bigcap f(K_n)$.
Prove that f is continuous.

† These are questions taken from the exam given to first year math graduate students at U.C. Berkeley.
2. Let $X \subset \mathbb{R}^m$ be compact and $f : X \to \mathbb{R}$ be continuous. Given $\epsilon > 0$, show that there is a constant M such that for all $x, y \in X$, $|f(x) - f(y)| \le M|x - y| + \epsilon$.

3. Consider $f : \mathbb{R}^2 \to \mathbb{R}$. Assume that for each fixed $x_0$, $y \mapsto f(x_0, y)$ is continuous and for each fixed $y_0$, $x \mapsto f(x, y_0)$ is continuous. Find such an f that is not continuous.

4. Let $f : \mathbb{R}^2 \to \mathbb{R}$ satisfy the following properties. For each fixed $x_0 \in \mathbb{R}$ the function $y \mapsto f(x_0, y)$ is continuous and for each fixed $y_0 \in \mathbb{R}$ the function $x \mapsto f(x, y_0)$ is continuous. Also assume that if K is any compact subset of $\mathbb{R}^2$ then $f(K)$ is compact. Prove that f is continuous.

5. Let $f(x, y)$ be a continuous real valued function defined on the unit square $[0, 1] \times [0, 1]$. Prove that $g(x) = \max\{f(x, y) : y \in [0, 1]\}$ is continuous.

6. Let $\{U_k\}$ be a cover of $\mathbb{R}^m$ by open sets. Prove that there is a cover $\{V_k\}$ of $\mathbb{R}^m$ by open sets $V_k$ such that $V_k \subset U_k$ and each compact subset of $\mathbb{R}^m$ is disjoint from all but finitely many of the $V_k$.

7. A function $f : [0, 1] \to \mathbb{R}$ is said to be upper semi-continuous if given $x \in [0, 1]$ and $\epsilon > 0$ there exists a $\delta > 0$ such that $|y - x| < \delta$ implies that $f(y) < f(x) + \epsilon$. Prove that an upper semi-continuous function on $[0, 1]$ is bounded above and attains its maximum value at some point $p \in [0, 1]$.

8. Prove that a continuous function $f : \mathbb{R} \to \mathbb{R}$ which sends open sets to open sets must be monotonic.

9. Show that $[0, 1]$ cannot be written as a countably infinite union of disjoint closed subintervals.

10. A connected component of a metric space M is a maximal connected subset of M. Give an example of $M \subset \mathbb{R}$ having uncountably many connected components. Can such a subset be open? Closed? Does your answer change if $\mathbb{R}^2$ replaces $\mathbb{R}$?

11. Let $U \subset \mathbb{R}^m$ be an open set. Suppose that the map $h : U \to \mathbb{R}^m$ is a homeomorphism from U onto $\mathbb{R}^m$ which is uniformly continuous. Prove that $U = \mathbb{R}^m$.

12. Let X be a non-empty connected set of real numbers. If every element of X is rational prove that X has only one element.

13. Let $A \subset \mathbb{R}^m$ be compact, $x \in A$. Let $(x_n)$ be a sequence in A such that every convergent subsequence of $(x_n)$ converges to x.
(a) Prove that the sequence $(x_n)$ converges.
(b) Give an example to show if A is not compact, the result in (a) is not necessarily true.

14. Assume that $f : \mathbb{R} \to \mathbb{R}$ is uniformly continuous. Prove that there are constants A, B such that $|f(x)| \le A + B|x|$ for all $x \in \mathbb{R}$.

15. Let $h : [0, 1) \to \mathbb{R}$ be a uniformly continuous function where $[0, 1)$ is the half open interval. Prove that there is a unique continuous map $g : [0, 1] \to \mathbb{R}$ such that $g(x) = h(x)$ for all $x \in [0, 1)$.
3 Functions of a Real Variable

1 Differentiation
The function $f : (a, b) \to \mathbb{R}$ is differentiable at x if

$$(1) \qquad \lim_{t \to x} \frac{f(t) - f(x)}{t - x} = L$$

exists. By definition† this means L is a real number and for each $\epsilon > 0$ there exists a $\delta > 0$ such that if $0 < |t - x| < \delta$ then the differential quotient above differs from L by $< \epsilon$. The limit L is the derivative of f at x, $f'(x)$. In calculus language, $\Delta x = t - x$ is the change in the independent variable x, while $\Delta f = f(t) - f(x)$ is the resulting change in the dependent variable $y = f(x)$. Differentiability at x means that

$$f'(x) = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x}.$$

We begin by reviewing the proofs of some standard calculus facts.

† This concept of limit is slightly different from the limit of a sequence. Here t is a continuous parameter that tends to x, whereas for the sequence $(a_n)$, the parameter n is an integer that grows without bound. A limit definition general enough to include both concepts is discussed in Exercise 26.
Functions of a Real Variable
1 40
Chapter 3
1 The Rules of Differentiation
(a) Differentiability implies continuity.
(b) If f and g are differentiable at x then so is $f + g$, the derivative being $(f + g)'(x) = f'(x) + g'(x)$.
(c) If f and g are differentiable at x then so is $f \cdot g$, the derivative being
$$(f \cdot g)'(x) = f'(x) \cdot g(x) + f(x) \cdot g'(x).$$
(d) The derivative of a constant is zero, $c' = 0$.
(e) If f and g are differentiable at x and $g(x) \ne 0$ then $f/g$ is differentiable at x, the derivative being
$$(f/g)'(x) = \frac{f'(x) \cdot g(x) - f(x) \cdot g'(x)}{g(x)^2}.$$
(f) If f is differentiable at x and g is differentiable at $y = f(x)$ then $g \circ f$ is differentiable at x, the derivative being
$$(g \circ f)'(x) = g'(y) \cdot f'(x).$$

Proof (a) Continuity in the calculus notation amounts to the assertion that $\Delta f \to 0$ as $\Delta x \to 0$. This is obvious. If the fraction $\Delta f / \Delta x$ tends to a finite limit while its denominator tends to zero, then its numerator must also tend to zero.

(b) Since $\Delta(f + g) = \Delta f + \Delta g$,
$$\frac{\Delta(f + g)}{\Delta x} = \frac{\Delta f}{\Delta x} + \frac{\Delta g}{\Delta x} \to f'(x) + g'(x)$$
as $\Delta x \to 0$.

(c) Since $\Delta(f \cdot g) = \Delta f \cdot g(x + \Delta x) + f(x) \cdot \Delta g$, continuity of g at x implies that
$$\frac{\Delta(f \cdot g)}{\Delta x} = \frac{\Delta f}{\Delta x}\, g(x + \Delta x) + f(x)\, \frac{\Delta g}{\Delta x} \to f'(x) g(x) + f(x) g'(x)$$
as $\Delta x \to 0$.

(d) If c is a constant then $\Delta c = 0$ and $c' = 0$.

(e) Since
$$\Delta(f/g) = \frac{g(x)\,\Delta f - f(x)\,\Delta g}{g(x + \Delta x)\, g(x)},$$
the formula follows when we divide by $\Delta x$ and take the limit.
Differentiation Section 1 141

(f) The shortest proof of the chain rule for $y = f(x)$ is by cancellation:
$$\frac{\Delta g}{\Delta x} = \frac{\Delta g}{\Delta y} \cdot \frac{\Delta y}{\Delta x} \to g'(y) f'(x).$$
A slight flaw is present: $\Delta y$ may be zero when $\Delta x$ is not. This is not a big problem. Differentiability of g at y implies that
$$\frac{\Delta g}{\Delta y} = g'(y) + \sigma$$
where $\sigma = \sigma(\Delta y) \to 0$ as $\Delta y \to 0$. Define $\sigma(0) = 0$. The formula
$$\Delta g = (g'(y) + \sigma)\,\Delta y$$
holds for all small $\Delta y$, including $\Delta y = 0$. Continuity of f at x (which is true by (a)) implies that $\Delta y \to 0$ as $\Delta x \to 0$. Thus
$$\frac{\Delta g}{\Delta x} = (g'(y) + \sigma)\,\frac{\Delta y}{\Delta x} \to g'(y) f'(x)$$
as $\Delta x \to 0$. □
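As a numerical illustration of rule (f), and not as part of the proof, the difference quotient $\Delta g / \Delta x$ can be watched as $\Delta x$ shrinks. The sketch below is in Python; the functions $f(x) = x^2$ and $g(y) = \sin y$, the point $x = 0.7$, and the helper name `difference_quotient` are all choices made here for the example.

```python
import math

def difference_quotient(func, x, dx):
    """Return (func(x + dx) - func(x)) / dx, the quotient whose limit is func'(x)."""
    return (func(x + dx) - func(x)) / dx

# Illustrative choices (assumptions, not from the text): f(x) = x^2, g(y) = sin y.
def f(x):
    return x * x

def g(y):
    return math.sin(y)

def composite(x):
    return g(f(x))

x = 0.7
# Chain rule prediction: g'(y) * f'(x) with y = f(x), g' = cos, f'(x) = 2x.
exact = math.cos(f(x)) * 2 * x

for dx in (1e-2, 1e-4, 1e-6):
    approx = difference_quotient(composite, x, dx)
    print(f"dx = {dx:.0e}  quotient = {approx:.8f}  error = {abs(approx - exact):.2e}")
```

The printed error should shrink roughly in proportion to `dx`, which is what the limit statement predicts for a function with a bounded second derivative.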
2 Corollary The derivative of a polynomial $a_0 + a_1 x + \cdots + a_n x^n$ exists everywhere and equals $a_1 + 2a_2 x + \cdots + n a_n x^{n-1}$.

Proof Immediate from the differentiation rules. □

A function $f : (a, b) \to \mathbb{R}$ that is differentiable at each $x \in (a, b)$ is differentiable.

3 Mean Value Theorem A continuous function $f : [a, b] \to \mathbb{R}$ that is differentiable on the interval $(a, b)$ has the mean value property: there exists a point $\theta \in (a, b)$ such that
$$f(b) - f(a) = f'(\theta)(b - a).$$

Proof Let
$$S = \frac{f(b) - f(a)}{b - a}$$
be the slope of the secant of the graph of f. See Figure 56. The function $\phi(x) = f(x) - Sx$ is differentiable on $(a, b)$ and has the same value
$$v = \frac{b f(a) - a f(b)}{b - a}$$
at a and b. Since f is continuous on $[a, b]$, so is $\phi$, and therefore $\phi$ takes on maximum and minimum values. Since it has the same value at both endpoints, $\phi$ has a maximum or a minimum that occurs at a point $\theta \in (a, b)$. See Figure 57. Then $\phi'(\theta) = 0$ (see Exercise 6), that is, $f'(\theta) = S$, and $f(b) - f(a) = f'(\theta)(b - a)$. □

Figure 56 The secant line for the graph of f.

Figure 57 $\phi'(\theta) = 0$.
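To make the mean value property concrete, here is a small Python sketch that locates a point $\theta$ with $f'(\theta) = S$ for one example. The example $f(x) = x^3$ on $[0, 2]$ and the helper names `secant_slope` and `bisect_root` are assumptions of this illustration. The search uses bisection, which relies on $f'$ being continuous here; the theorem itself needs no such hypothesis.

```python
import math

def secant_slope(f, a, b):
    """Slope S of the secant through (a, f(a)) and (b, f(b))."""
    return (f(b) - f(a)) / (b - a)

def bisect_root(h, lo, hi, tol=1e-12):
    """Locate a zero of h in [lo, hi], assuming h is continuous and
    h(lo), h(hi) have opposite signs."""
    h_lo = h(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        h_mid = h(mid)
        if (h_mid > 0) == (h_lo > 0):
            lo, h_lo = mid, h_mid   # zero lies in the right half
        else:
            hi = mid                # zero lies in the left half
    return 0.5 * (lo + hi)

# Illustrative example (an assumption of this sketch): f(x) = x^3 on [0, 2].
def f(x):
    return x ** 3

def f_prime(x):
    return 3 * x ** 2

a, b = 0.0, 2.0
S = secant_slope(f, a, b)                       # (8 - 0) / 2 = 4
theta = bisect_root(lambda x: f_prime(x) - S, a, b)

print("secant slope S =", S)
print("theta =", theta, " f'(theta) =", f_prime(theta))
print("closed form 2/sqrt(3) =", 2 / math.sqrt(3))
```

For this example $3\theta^2 = 4$ can be solved by hand, giving $\theta = 2/\sqrt{3}$, so the numerical answer can be compared against it.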
4 Corollary If f is differentiable and $|f'(x)| \le M$ for all $x \in (a, b)$ then f satisfies the global Lipschitz condition: for all $t, x \in (a, b)$, $|f(t) - f(x)| \le M|t - x|$. In particular if $f'(x) = 0$ for all $x \in (a, b)$ then $f(x)$ is constant.

Proof $|f(t) - f(x)| = |f'(\theta)(t - x)|$ for some $\theta$ between x and t. □

Remark The Mean Value Theorem is the most important theorem in calculus for making estimates. It is often convenient to deal with two functions simultaneously, and for that we have the following result.

5 Ratio Mean Value Theorem Suppose that the functions f and g are continuous on an interval $[a, b]$ and differentiable on the interval $(a, b)$. Then there is a $\theta \in (a, b)$ such that $\Delta f\, g'(\theta) = \Delta g\, f'(\theta)$ where $\Delta f = f(b) - f(a)$ and $\Delta g = g(b) - g(a)$. (If $g(x) = x$, the Ratio Mean Value Theorem becomes the ordinary Mean Value Theorem.)

Proof If $\Delta g \ne 0$, then the theorem states that for some $\theta$,
$$\frac{\Delta f}{\Delta g} = \frac{f'(\theta)}{g'(\theta)}.$$
This ratio expression is how to remember the theorem. The whole point here is that $f'$ and $g'$ are evaluated at the same $\theta$. The function
$$\Phi(x) = \Delta f\,(g(x) - g(a)) - \Delta g\,(f(x) - f(a))$$
is differentiable and its value at both endpoints a, b is 0. Since $\Phi$ is continuous it takes on a maximum and a minimum somewhere in the interval $[a, b]$. Since $\Phi$ has equal values at the endpoints of the interval, it must take on a maximum or minimum at some point $\theta \in (a, b)$; i.e., $\theta \ne a, b$. Then $\Phi'(\theta) = 0$ and $\Delta f\, g'(\theta) = \Delta g\, f'(\theta)$ as claimed. □
6 L'Hospital's Rule If f and g are differentiable functions defined on an interval $(a, b)$, both of which tend to 0 at b, and if the ratio of their derivatives $f'(x)/g'(x)$ tends to a finite limit L at b then $f(x)/g(x)$ also tends to L at b. (We assume that $g(x), g'(x) \ne 0$.)

Rough proof Let $x \in (a, b)$ tend to b. Imagine a point $t \in (a, b)$ tending to b much faster than x does. It is a kind of "advance guard" for x. Then $f(t)/f(x)$ and $g(t)/g(x)$ are as small as we wish, and by the Ratio Mean Value Theorem, there is a $\theta \in (x, t)$ such that
$$\frac{f(x)}{g(x)} = \frac{f(x) - 0}{g(x) - 0} \doteq \frac{f(x) - f(t)}{g(x) - g(t)} = \frac{f'(\theta)}{g'(\theta)}.$$
The latter tends to L because $\theta$ is sandwiched between x and t as they tend to b. The symbol "$\doteq$" means approximately equal. See Figure 58. □
Complete proof Given $\epsilon > 0$ we must find $\delta > 0$ such that if $|x - b| < \delta$ then $|f(x)/g(x) - L| < \epsilon$. Since $f'(x)/g'(x)$ tends to L at b there does exist $\delta > 0$ such that if $x \in (b - \delta, b)$ then
$$\left| \frac{f'(x)}{g'(x)} - L \right| < \frac{\epsilon}{2}.$$

Figure 58 x and t escort $\theta$ toward b.

For each $x \in (b - \delta, b)$ determine a point $t \in (b - \delta, b)$ which is so near to b that
$$|f(t)| + |g(t)| < \frac{g(x)^2\, \epsilon}{4(|f(x)| + |g(x)|)}, \qquad |g(t)| < \frac{|g(x)|}{2}.$$
Since $f(t)$ and $g(t)$ tend to 0 as t tends to b, and since $g(x) \ne 0$, such a t exists. It depends on x, of course. By this choice of t and the Ratio Mean Value Theorem
$$\left| \frac{f(x)}{g(x)} - L \right| = \left| \frac{f(x)}{g(x)} - \frac{f(x) - f(t)}{g(x) - g(t)} + \frac{f(x) - f(t)}{g(x) - g(t)} - L \right| \le \left| \frac{g(x) f(t) - f(x) g(t)}{g(x)(g(x) - g(t))} \right| + \left| \frac{f'(\theta)}{g'(\theta)} - L \right| < \epsilon,$$
which completes the proof that $f(x)/g(x) \to L$ as $x \to b$. □
It is clear that L'Hospital's Rule holds equally well as x tends to b or to a. It is also true that it holds when x tends to $\pm\infty$ or when f and g tend to $\pm\infty$. See Exercises 7, 8.

From now on feel free to use L'Hospital's Rule!

7 Theorem If f is differentiable on $(a, b)$ then its derivative function $f'(x)$ has the intermediate value property.

Differentiability of f implies continuity of f, and so the Intermediate Value Theorem applies to f and states that f takes on all intermediate values, but this is not what Theorem 7 is about. Not at all. Theorem 7 concerns $f'$, not f. The function $f'$ can well be discontinuous, but nevertheless it too takes on all intermediate values. In a clear abuse of language, functions like $f'$ possessing the intermediate value property are called Darboux continuous, even when they are discontinuous! Darboux was the first to realize how badly discontinuous a derivative function can be. Despite the fact that $f'$ has the intermediate value property, it can be discontinuous at almost every point of $[a, b]$. Strangely enough, however, $f'$ cannot be discontinuous at every point. If f is differentiable, $f'$ must be continuous at a dense, thick set of points. See Exercise 24 and the next section for the relevant definitions.
Proof Suppose that $a < x_1 < x_2 < b$ and $\alpha = f'(x_1) < \gamma < f'(x_2) = \beta$. We must find $\theta \in (x_1, x_2)$ such that $f'(\theta) = \gamma$. Choose a small h, $0 < h < x_2 - x_1$, and draw the secant segment $\sigma(x)$ between the points $(x, f(x))$ and $(x + h, f(x + h))$ on the graph of f. Slide x from $x_1$ to $x_2 - h$ continuously. This is the sliding secant method. See Figure 59.

Figure 59 The sliding secant.

When h is small enough, slope $\sigma(x_1) \approx f'(x_1)$ and slope $\sigma(x_2 - h) \approx f'(x_2)$. Thus
$$\text{slope } \sigma(x_1) < \gamma < \text{slope } \sigma(x_2 - h).$$
Continuity of f implies that for some $x \in (x_1, x_2 - h)$, slope $\sigma(x) = \gamma$. The Mean Value Theorem then gives a $\theta \in (x, x + h)$ such that $f'(\theta) = \gamma$. □

8 Corollary The derivative of a differentiable function never has a jump discontinuity.

Proof Near a jump, a function omits intermediate values. □

Pathological examples

Non-jump discontinuities of $f'$ may very well occur. The function
$$f(x) = \begin{cases} x^2 \sin \dfrac{1}{x} & \text{if } x > 0 \\ 0 & \text{if } x \le 0 \end{cases}$$

is differentiable everywhere, even at $x = 0$, where $f'(0) = 0$. Its derivative function for $x > 0$ is
$$f'(x) = 2x \sin \frac{1}{x} - \cos \frac{1}{x},$$
which oscillates more and more rapidly with amplitude approximately 1 as $x \to 0$. Since $f'(x) \not\to 0$ as $x \to 0$, $f'$ is discontinuous at $x = 0$. Figure 60 shows why f is differentiable at $x = 0$ and has $f'(0) = 0$. Although the graph oscillates wildly at 0, it does so between the envelopes $y = \pm x^2$, and any curve between these envelopes is tangent to the x-axis at the origin. Study this example, Figure 60.

Figure 60 The graphs of the function $y = x^2 \sin(1/x)$ and its envelopes $y = \pm x^2$; and the graph of its derivative.

A similar but worse example is
$$g(x) = \begin{cases} x^{3/2} \sin \dfrac{1}{x} & \text{if } x > 0 \\ 0 & \text{if } x \le 0 \end{cases}$$

Its derivative at $x = 0$ is $g'(0) = 0$, while at $x > 0$ its derivative is
$$g'(x) = \frac{3}{2} \sqrt{x}\, \sin \frac{1}{x} - \frac{1}{\sqrt{x}} \cos \frac{1}{x},$$
which oscillates with increasing frequency and unbounded amplitude as $x \to 0$ because $1/\sqrt{x}$ blows up at $x = 0$. See Figure 61.
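The behaviour of the first example can be checked numerically. The Python sketch below, whose sample points and variable names are choices made here, evaluates the difference quotient of $f(x) = x^2 \sin(1/x)$ at 0, which is squeezed by the envelopes, and then samples $f'$ at the points $x_k = 1/(2\pi k)$, where $\cos(1/x_k) = 1$.

```python
import math

def f(x):
    """f(x) = x^2 sin(1/x) for x > 0 and 0 for x <= 0."""
    return x * x * math.sin(1.0 / x) if x > 0 else 0.0

def f_prime(x):
    """Derivative of f: the formula for x > 0, and 0 for x <= 0."""
    return 2 * x * math.sin(1.0 / x) - math.cos(1.0 / x) if x > 0 else 0.0

# Difference quotients (f(h) - f(0)) / h at 0: bounded by |h| via the envelopes.
steps = (1e-2, 1e-4, 1e-6)
quotients = [(f(h) - f(0.0)) / h for h in steps]
for h, q in zip(steps, quotients):
    print(f"h = {h:.0e}  quotient = {q:+.3e}")

# Samples of f' along x_k = 1/(2 pi k) -> 0: the values stay near -1, not near f'(0) = 0.
points = [1.0 / (2 * math.pi * k) for k in (10, 100, 1000)]
samples = [f_prime(x) for x in points]
for x, s in zip(points, samples):
    print(f"x = {x:.3e}  f'(x) = {s:+.6f}")
```

The quotients tend to 0 while the sampled derivative values stay close to $-1$, which is one way to see that $f'$ has no limit at 0 along this sequence.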
Figure 61 The function $y = x^{3/2} \sin(1/x)$, its envelopes $y = \pm x^{3/2}$, and its derivative.

Higher Derivatives

The derivative of $f'$, if it exists, is the second derivative of f,
$$(f')'(x) = f''(x) = \lim_{t \to x} \frac{f'(t) - f'(x)}{t - x}.$$
Higher derivatives are defined inductively and written $f^{(r)} = (f^{(r-1)})'$. If $f^{(r)}(x)$ exists then f is $r^{\text{th}}$ order differentiable at x. If $f^{(r)}(x)$ exists for each $x \in (a, b)$ then f is $r^{\text{th}}$ order differentiable. If $f^{(r)}(x)$ exists for all r and all x then f is infinitely differentiable, or smooth. The zero-th derivative of f is f itself, $f^{(0)}(x) = f(x)$.

9 Theorem If f is $r^{\text{th}}$ order differentiable and $r \ge 1$ then $f^{(r-1)}(x)$ is a continuous function of $x \in (a, b)$.

Proof Differentiability implies continuity and $f^{(r-1)}(x)$ is differentiable. □

10 Corollary A smooth function is continuous. Each derivative of a smooth function is smooth and hence continuous.

Proof Obvious from the definition of smoothness and Theorem 9. □
Smoothness Classes
If f is differentiable and its derivative function $f'(x)$ is a continuous function of x, then f is continuously differentiable, and we say that f is of class $C^1$. If f is $r^{\text{th}}$ order differentiable and $f^{(r)}(x)$ is a continuous function of x, then f is continuously $r^{\text{th}}$ order differentiable, and we say that f is of class $C^r$. If f is smooth, then by the preceding corollary, it is of class $C^r$ for all finite r and we say that f is of class $C^\infty$. To round out the notation, we say that a continuous function is of class $C^0$. Thinking of $C^r$ as the set of functions of class $C^r$, we have the regularity hierarchy,
$$C^0 \supset C^1 \supset \cdots \supset \bigcap_{r \in \mathbb{N}} C^r = C^\infty.$$
Each inclusion $C^r \supset C^{r+1}$ is proper; there exist continuous functions that are not of class $C^1$, $C^1$ functions that are not of class $C^2$, and so on. For example,

$f(x) = |x|$ is of class $C^0$ but not of class $C^1$.
$f(x) = x|x|$ is of class $C^1$ but not of class $C^2$.
$f(x) = |x|^3$ is of class $C^2$ but not of class $C^3$.

Analytic Functions
A function that can be expressed locally as a convergent power series is analytic. More precisely, the function $f : (a, b) \to \mathbb{R}$ is analytic if for each $x \in (a, b)$, there exists a power series $\sum a_r h^r$ and a $\delta > 0$ such that if $|h| < \delta$ then the series converges and
$$f(x + h) = \sum_{r=0}^{\infty} a_r h^r.$$
The concept of series convergence will be discussed further in Section 3 and Chapter 4. Among other things we show in Section 2 of Chapter 4 that analytic functions are smooth; and if $f(x + h) = \sum a_r h^r$ then
$$f^{(r)}(x) = r!\, a_r.$$
This gives uniqueness of the power series expression of a function: if two power series express the same function f at x then they have identical coefficients, namely $f^{(r)}(x)/r!$. See Exercise 4.36 for a stronger type of uniqueness, namely the identity theorem for analytic functions. We write $C^\omega$ for the class of analytic functions.
A Non-analytic Smooth Function

The fact that smooth functions need not be analytic is somewhat surprising; i.e., $C^\omega$ is a proper subset of $C^\infty$. A standard example is
$$e(x) = \begin{cases} e^{-1/x} & \text{if } x > 0 \\ 0 & \text{if } x \le 0 \end{cases}$$

Its smoothness is left as an exercise in the use of L'Hospital's rule and induction, Exercise 14. At $x = 0$ the graph of $e(x)$ is infinitely tangent to the x-axis. Every derivative $e^{(r)}(0) = 0$. See Figure 62.

Figure 62 The graph of $e(x) = e^{-1/x}$.

It follows that $e(x)$ is not analytic. For if it were then it could be expressed near $x = 0$ as a convergent series $e(h) = \sum a_r h^r$, and $a_r = e^{(r)}(0)/r!$. Thus $a_r = 0$ for each r, and the series converges to zero, whereas $e(h)$ is different from zero when $h > 0$. Although not analytic at $x = 0$, $e(x)$ is analytic elsewhere. See also Exercise 4.35.

Taylor Approximation
The $r^{\text{th}}$ order Taylor polynomial of an $r^{\text{th}}$ order differentiable function f at x is
$$P(h) = f(x) + f'(x) h + \frac{f''(x)}{2!} h^2 + \cdots + \frac{f^{(r)}(x)}{r!} h^r = \sum_{k=0}^{r} \frac{f^{(k)}(x)}{k!} h^k.$$
The coefficients $f^{(k)}(x)/k!$ are constants, the variable is h. Differentiation of P with respect to h at $h = 0$ gives
$$P(0) = f(x), \quad P'(0) = f'(x), \quad \ldots, \quad P^{(r)}(0) = f^{(r)}(x).$$

11 Taylor Approximation Theorem Assume that $f : (a, b) \to \mathbb{R}$ is $r^{\text{th}}$ order differentiable at x. Then
(a) P approximates f to order r at x in the sense that the Taylor remainder $R(h) = f(x + h) - P(h)$ is $r^{\text{th}}$ order flat at $h = 0$; i.e., $R(h)/h^r \to 0$ as $h \to 0$.
(b) The Taylor polynomial is the only polynomial of degree $\le r$ with this approximation property.
(c) If, in addition, f is $(r + 1)^{\text{st}}$ order differentiable on $(a, b)$ then for some $\theta$ between x and $x + h$,
$$R(h) = \frac{f^{(r+1)}(\theta)}{(r + 1)!}\, h^{r+1}.$$
Proof (a) The first r derivatives of $R(h)$ exist and equal 0 at $h = 0$. If $h > 0$ then repeated applications of the Mean Value Theorem give
$$R(h) = R(h) - 0 = R'(\theta_1) h = (R'(\theta_1) - 0) h = R''(\theta_2)\theta_1 h = \cdots = R^{(r-1)}(\theta_{r-1})\,\theta_{r-2} \cdots \theta_1 h$$
where $0 < \theta_{r-1} < \cdots < \theta_1 < h$. Thus
$$\left| \frac{R(h)}{h^r} \right| = \left| \frac{R^{(r-1)}(\theta_{r-1})\,\theta_{r-2} \cdots \theta_1 h}{h^r} \right| \le \left| \frac{R^{(r-1)}(\theta_{r-1}) - 0}{\theta_{r-1}} \right| \to 0$$
as $h \to 0$. If $h < 0$ the same is true with $h < \theta_1 < \cdots < \theta_{r-1} < 0$.

(b) If $Q(h)$ is a polynomial of degree $\le r$, $Q \ne P$, then $Q - P$ is not $r^{\text{th}}$ order flat at $h = 0$, so $f(x + h) - Q(h)$ cannot be $r^{\text{th}}$ order flat either.

(c) Fix $h > 0$ and define
$$g(t) = f(x + t) - P(t) - \frac{R(h)}{h^{r+1}}\, t^{r+1} = R(t) - R(h)\, \frac{t^{r+1}}{h^{r+1}}$$
for $0 \le t \le h$. Note that since $P(t)$ is a polynomial of degree r, $P^{(r+1)}(t) = 0$ for all t, and
$$g^{(r+1)}(t) = f^{(r+1)}(x + t) - (r + 1)!\, \frac{R(h)}{h^{r+1}}.$$
Also, $g(0) = g'(0) = \cdots = g^{(r)}(0) = 0$, and $g(h) = R(h) - R(h) = 0$. Since $g = 0$ at 0 and h, the Mean Value Theorem gives a $t_1 \in (0, h)$ such that $g'(t_1) = 0$. Since $g'(0) = 0$ and $g'(t_1) = 0$, the Mean Value Theorem gives a $t_2 \in (0, t_1)$ such that $g''(t_2) = 0$. Continuing, we get a sequence $t_1 > t_2 > \cdots > t_{r+1} > 0$ such that $g^{(k)}(t_k) = 0$. The $(r + 1)^{\text{st}}$ equation, $g^{(r+1)}(t_{r+1}) = 0$, implies that
$$0 = f^{(r+1)}(x + t_{r+1}) - (r + 1)!\, \frac{R(h)}{h^{r+1}}.$$
Thus, $\theta = x + t_{r+1}$ makes the equation in (c) true. If $h < 0$ the argument is symmetric. □

12 Corollary For each $r \in \mathbb{N}$, the smooth non-analytic function $e(x)$ satisfies $\lim_{h \to 0} e(h)/h^r = 0$.

Proof Obvious from the theorem and the fact that $e^{(r)}(0) = 0$ for all r. □

The Taylor series at x of a smooth function f is the infinite Taylor polynomial
$$T(h) = \sum_{r=0}^{\infty} \frac{f^{(r)}(x)}{r!}\, h^r.$$
In calculus, you compute the Taylor series of functions such as $\sin x$, $\arctan x$, $e^x$, etc. These functions are analytic: their Taylor series converge and express them as power series. In general, however, the Taylor series of a smooth function need not converge to the function, and in fact it may fail to converge at all. The function $e(x)$ is an example of the first phenomenon. Its Taylor series at $x = 0$ converges, but gives the wrong answer. Examples of divergent and totally divergent Taylor series are indicated in Exercise 4.35. The convergence of a Taylor series is related to how quickly the $r^{\text{th}}$ derivative grows (in magnitude) as $r \to \infty$. In Section 6 of Chapter 4 we give necessary and sufficient conditions on the growth rate that determine whether a smooth function is analytic.
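The contrast between an analytic function and $e(x)$ can be seen numerically. In the Python sketch below, the sample point $h = 0.5$, the orders tested, and the helper name `taylor_exp` are assumptions of the illustration. Every Taylor polynomial of $e(x)$ at 0 is identically zero because $e^{(r)}(0) = 0$, so its remainder is $e(h)$ itself, while the remainder for $e^x$ shrinks rapidly with the order.

```python
import math

def e(x):
    """The smooth non-analytic function: exp(-1/x) for x > 0, else 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def taylor_exp(h, r):
    """r-th order Taylor polynomial of exp at x = 0, evaluated at h."""
    return sum(h ** k / math.factorial(k) for k in range(r + 1))

h = 0.5
for r in (2, 5, 10):
    exp_remainder = math.exp(h) - taylor_exp(h, r)
    # Every Taylor polynomial of e at 0 is the zero polynomial,
    # so the remainder of e at h is e(h) no matter how large r is.
    e_remainder = e(h) - 0.0
    print(f"r = {r:2d}  exp remainder = {exp_remainder:.3e}  e remainder = {e_remainder:.3e}")

# Flatness from Corollary 12: e(h) / h^r is tiny for small h, here with r = 5.
flatness = e(0.01) / 0.01 ** 5
print("e(0.01) / 0.01^5 =", flatness)
```

The remainder column for $e(x)$ does not move with r, which is the numerical face of "converges, but gives the wrong answer."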
Inverse Functions
A strictly monotone continuous function $f : (a, b) \to \mathbb{R}$ bijects $(a, b)$ onto some interval $(c, d)$ where $c = f(a)$, $d = f(b)$ in the increasing case. It is a homeomorphism $(a, b) \to (c, d)$ and its inverse function $f^{-1} : (c, d) \to (a, b)$ is also a homeomorphism. These facts were proved in Chapter 2. Does differentiability of f imply differentiability of $f^{-1}$? If $f' \ne 0$ the answer is "yes." Keep in mind, however, the function $f : x \mapsto x^3$. It shows that differentiability of $f^{-1}$ fails when $f'(x) = 0$. For the inverse function is $y \mapsto y^{1/3}$, which is non-differentiable at $y = 0$.

13 Inverse Function Theorem in dimension one If $f : (a, b) \to (c, d)$ is a differentiable surjection and $f'(x)$ is never zero then f is a homeomorphism, and its inverse is differentiable with derivative
$$(f^{-1})'(y) = \frac{1}{f'(x)}$$
where $y = f(x)$.

Proof If $f'$ is never zero then by the intermediate value property of derivatives, it is either always positive or always negative. We assume for all x that $f'(x) > 0$. If $a < s < t < b$ then by the Mean Value Theorem there exists $\theta \in (s, t)$ such that $f(t) - f(s) = f'(\theta)(t - s) > 0$. Thus f is strictly monotone. Differentiability implies continuity, so f is a homeomorphism $(a, b) \to (c, d)$. To check differentiability of $f^{-1}$ at $y \in (c, d)$, define
$$x = f^{-1}(y) \quad \text{and} \quad \Delta x = f^{-1}(y + \Delta y) - x.$$
Then $y = f(x)$ and $\Delta y = f(x + \Delta x) - y = \Delta f$. Thus
$$\frac{\Delta f^{-1}}{\Delta y} = \frac{\Delta x}{\Delta y} = \frac{1}{\Delta y / \Delta x} = \frac{1}{\Delta f / \Delta x}.$$
Since f is a homeomorphism, $\Delta x \to 0$ if and only if $\Delta y \to 0$, so the limit of $\Delta f^{-1} / \Delta y$ exists and equals $1/f'(x)$. □

If a homeomorphism f and its inverse are both of class $C^r$, $r \ge 1$, then f is a $C^r$ diffeomorphism.

14 Corollary If $f : (a, b) \to (c, d)$ is a homeomorphism of class $C^r$, $1 \le r \le \infty$, and $f' \ne 0$ then f is a $C^r$ diffeomorphism.
Proof If $r = 1$, the formula $(f^{-1})'(y) = 1/f'(x) = 1/f'(f^{-1}(y))$ implies that $(f^{-1})'(y)$ is continuous, so f is a $C^1$ diffeomorphism. Induction on $r \ge 2$ completes the proof. □

The corollary remains true for analytic functions: the inverse of an analytic function with non-vanishing derivative is analytic. The generalization of the inverse function theorem to higher dimensions is a principal goal of Chapter 5.

A longer but more geometric proof of the one dimensional inverse function theorem can be done in two steps.
(i) A function is differentiable if and only if its graph is differentiable.
(ii) The graph of $f^{-1}$ is the reflection of the graph of f across the diagonal, and is thus differentiable. See Figure 63.

Figure 63 A picture proof of the inverse function theorem in $\mathbb{R}$.
2 Riemann Integration

Let $f : [a, b] \to \mathbb{R}$ be given. Intuitively, the integral of f is the area under its graph; i.e., for $f \ge 0$,
$$\int_a^b f(x)\, dx = \text{area } \mathcal{U}$$
where $\mathcal{U}$ is the undergraph of f,
$$\mathcal{U} = \{(x, y) : a \le x \le b \text{ and } 0 \le y \le f(x)\}.$$
The precise definition involves approximation. A partition pair consists of two finite sets of points P , T C [a, bl; P = {xo , Xn } and T {tt , tn }, interlaced as •
•
.
.
.
.
.
.
We assume the points x0 , sponding to f, P. T is
R (f, P, T) =
.
•
.
, Xn are distinct. The Riemann sum corre
n
L f ( ti ) !lxi = /Ctt ) Llxt + f (tz)!l xz + · · · + f (tn) !lxn
i =l where Llx i = xi - X i - I · The Riemann sum R is the area of rectangles which approximate the area under the graph of f. See Figure 64. Think of the points ti as "sample points." We sample the value of the function f at lj .
Figure 64 The area of the strip is $f(t_i)\Delta x_i$.
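The definition of $R(f, P, T)$ translates directly into code. The sketch below is only an illustration: the helper names `riemann_sum` and `uniform_pair`, the uniform partition, and the midpoint sample points are assumptions made for the example.

```python
def riemann_sum(f, xs, ts):
    """R(f, P, T) for partition points xs = [x0, ..., xn] and sample points ts = [t1, ..., tn]."""
    assert len(xs) == len(ts) + 1
    total = 0.0
    for i, t in enumerate(ts, start=1):
        # The pair must be interlaced: x_{i-1} <= t_i <= x_i.
        assert xs[i - 1] <= t <= xs[i], "sample point must lie in its subinterval"
        total += f(t) * (xs[i] - xs[i - 1])
    return total

def uniform_pair(a, b, n):
    """A hypothetical partition pair: n equal subintervals sampled at their midpoints."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    ts = [(xs[i - 1] + xs[i]) / 2 for i in range(1, n + 1)]
    return xs, ts

xs, ts = uniform_pair(0.0, 1.0, 1000)
approx = riemann_sum(lambda x: x * x, xs, ts)
print(approx)  # close to 1/3, the area under y = x^2 on [0, 1]
```

Any other interlaced choice of sample points gives a sum that approaches the same value as the mesh shrinks, which is the content of the definition of the integral that follows.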
The mesh of the partition $P$ is the length of the largest subinterval $[x_{i-1}, x_i]$. A partition with large mesh is coarse; one with small mesh is fine. In general, the finer the better. A real number $I$ is the Riemann integral of $f$ over $[a, b]$ if it satisfies the following approximation condition:

$\forall \epsilon > 0\ \exists \delta > 0$ such that if $P, T$ is any partition pair then

$$\text{mesh}\,P < \delta \implies |R - I| < \epsilon$$

where $R = R(f, P, T)$. If such an $I$ exists it is unique, we denote it as

$$\int_a^b f(x)\,dx = I = \lim_{\text{mesh}\,P \to 0} R(f, P, T),$$

and we say that $f$ is Riemann integrable with Riemann integral $I$. See Exercise 26 for a formalization of this limit definition.

15 Theorem If $f$ is Riemann integrable then it is bounded.
Proof Suppose not. Let $I = \int_a^b f(x)\,dx$. There is some $\delta > 0$ such that $|R - I| < 1$ for all partition pairs $P, T$ with mesh $P < \delta$. Fix such a partition pair $P = \{x_0, \dots, x_n\}$, $T = \{t_1, \dots, t_n\}$. If $f$ is unbounded on $[a, b]$ then there is also a subinterval $[x_{i_0-1}, x_{i_0}]$ on which it is unbounded. Choose a new set $T' = \{t_1', \dots, t_n'\}$ with $t_i' = t_i$ for all $i \ne i_0$ and choose $t_{i_0}'$ such that

$$|f(t_{i_0}') - f(t_{i_0})|\,\Delta x_{i_0} > 2.$$

This is possible since the supremum of $\{|f(t)| : x_{i_0-1} \le t \le x_{i_0}\}$ is $\infty$. Let $R' = R(f, P, T')$. Then $|R - R'| > 2$, contrary to the fact that both $R$ and $R'$ differ from $I$ by $< 1$. □
Let $\mathcal{R}$ denote the set of all functions that are Riemann integrable over $[a, b]$.

16 Theorem (Linearity of the Integral)
(a) $\mathcal{R}$ is a vector space and $f \mapsto \int_a^b f(x)\,dx$ is a linear map $\mathcal{R} \to \mathbb{R}$.
(b) The constant function $h(x) = k$ is integrable and its integral is $k(b - a)$.

Proof (a) Riemann sums behave naturally under linear combination:

$$R(f + cg, P, T) = R(f, P, T) + cR(g, P, T),$$

and it follows that their limits, as mesh $P \to 0$, give the expected formula

$$\int_a^b f(x) + cg(x)\,dx = \int_a^b f(x)\,dx + c\int_a^b g(x)\,dx.$$

(b) Every Riemann sum for the constant function $h(x) = k$ is $k(b - a)$, so its integral equals this number too. □

17 Theorem (Monotonicity of the Integral) If $f, g \in \mathcal{R}$ and $f \le g$ then

$$\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx.$$

Proof For each partition pair $P, T$, we have $R(f, P, T) \le R(g, P, T)$. □

18 Corollary If $f \in \mathcal{R}$ and $|f| \le M$ then $\left|\int_a^b f(x)\,dx\right| \le M(b - a)$.

Proof By Theorem 16, the constant functions $\pm M$ are integrable. By Theorem 17, $-M \le f(x) \le M$ implies that

$$-M(b - a) \le \int_a^b f(x)\,dx \le M(b - a).$$ □
Darboux integrability

The lower sum and upper sum of a function $f : [a, b] \to [-M, M]$ with respect to a partition $P$ of $[a, b]$ are

$$L(f, P) = \sum_{i=1}^{n} m_i \Delta x_i \quad\text{and}\quad U(f, P) = \sum_{i=1}^{n} M_i \Delta x_i$$

where

$$m_i = \inf\{f(t) : x_{i-1} \le t \le x_i\}, \qquad M_i = \sup\{f(t) : x_{i-1} \le t \le x_i\}.$$

We assume $f$ is bounded in order to be sure that $m_i$ and $M_i$ are real numbers. Clearly

$$L(f, P) \le R(f, P, T) \le U(f, P)$$

for all partition pairs $P, T$. See Figure 65. The lower integral and upper integral of $f$ over $[a, b]$ are

$$\underline{I} = \sup_P L(f, P) \quad\text{and}\quad \overline{I} = \inf_P U(f, P).$$

$P$ ranges over all partitions of $[a, b]$ when we take the supremum and infimum. If the lower and upper integrals of $f$ are equal, $\underline{I} = \overline{I}$, then $f$ is Darboux integrable and their common value is its Darboux integral.

19 Theorem Riemann integrability is equivalent to Darboux integrability, and when a function is integrable, its three integrals - lower, upper, and Riemann - are equal.
Figure 65 The lower sum, the Riemann sum, and the upper sum.
To prove Theorem 19 it is convenient to refine a partition $P$ by adding more partition points; the partition $P'$ refines $P$ if $P' \supset P$. Suppose first that $P' = P \cup \{w\}$ where $w \in (x_{i_0-1}, x_{i_0})$. The lower sums for $P$ and $P'$ are the same except that the term $m_{i_0}\Delta x_{i_0}$ in $L(f, P)$ splits into two terms in $L(f, P')$. The sum of the two terms is at least as large as $m_{i_0}\Delta x_{i_0}$. For the infimum of $f$ over the intervals $[x_{i_0-1}, w]$ and $[w, x_{i_0}]$ is at least as large as $m_{i_0}$. Similarly, $U(f, P') \le U(f, P)$. See Figure 66. Repetition continues the pattern and we formalize it as the

Refinement Principle Refining a partition causes the lower sum to increase and the upper sum to decrease.

Figure 66 Refinement increases $L$ and decreases $U$.

The common refinement $P^*$ of two partitions $P, P'$ of $[a, b]$ is $P^* = P \cup P'$. According to the refinement principle

$$L(f, P) \le L(f, P^*) \le U(f, P^*) \le U(f, P').$$

We conclude: each lower sum is less than or equal to each upper sum, the lower integral is less than or equal to the upper; and thus

(2) A bounded function $f : [a, b] \to \mathbb{R}$ is Darboux integrable if and only if $\forall \epsilon > 0\ \exists P$ such that $U(f, P) - L(f, P) < \epsilon$.
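The refinement principle is easy to watch numerically. The sketch below makes a simplifying assumption that is not in the text: $f$ is increasing, so that $m_i = f(x_{i-1})$ and $M_i = f(x_i)$ and the infimum and supremum need no search. The function $x^2$ and the two partitions are illustrative choices.

```python
def lower_upper(f, xs):
    """L(f, P) and U(f, P) for an INCREASING f, where inf and sup sit at the endpoints."""
    lower = sum(f(xs[i - 1]) * (xs[i] - xs[i - 1]) for i in range(1, len(xs)))
    upper = sum(f(xs[i]) * (xs[i] - xs[i - 1]) for i in range(1, len(xs)))
    return lower, upper

f = lambda x: x * x                          # increasing on [0, 1]
P = [0.0, 0.5, 1.0]
P_refined = sorted(set(P) | {0.25, 0.75})    # P' contains P, so P' refines P

L0, U0 = lower_upper(f, P)
L1, U1 = lower_upper(f, P_refined)
print(L0, L1, U1, U0)  # 0.125 0.21875 0.46875 0.625
```

The lower sum rises and the upper sum falls under refinement, and both bracket the integral $1/3$, in line with criterion (2).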
Proof of Theorem 19 Assume that $f$ is Darboux integrable: the lower and upper integrals are equal, say their common value is $I$. Given $\epsilon > 0$ we must find $\delta > 0$ such that $|R - I| < \epsilon$ whenever $R = R(f, P, T)$ is a Riemann sum with mesh $P < \delta$. By Darboux integrability and (2) there is a partition $P_1$ of $[a, b]$ such that

$$U_1 - L_1 < \frac{\epsilon}{2}$$

where $U_1 = U(f, P_1)$, $L_1 = L(f, P_1)$. Fix $\delta = \epsilon / 8Mn_1$ where $n_1$ is the number of partition points in $P_1$. Let $P$ be any partition with mesh $P < \delta$. (Since $\delta \ll \epsilon$, think of $P_1$ as coarse and $P$ as much finer.) Let $P^*$ be the common refinement $P \cup P_1$. By the refinement principle,

$$L_1 \le L^* \le U^* \le U_1$$

where $L^* = L(f, P^*)$ and $U^* = U(f, P^*)$. Thus,

$$U^* - L^* < \frac{\epsilon}{2}.$$

Write $P = \{x_i\}$ and $P^* = \{x_j^*\}$ for $0 \le i \le n$ and $0 \le j \le n^*$. The sums $U = \sum M_i \Delta x_i$ and $U^* = \sum M_j^* \Delta x_j^*$ are identical except for terms with

$$x_{i-1} < x_j^* < x_i$$

for some $i, j$. There are at most $n_1 - 2$ such terms and each is of magnitude at most $M\delta$. Thus,

$$U - U^* < (n_1 - 2)\,2M\delta < \frac{\epsilon}{4}.$$

Similarly, $L^* - L < \epsilon/4$, and so

$$U - L = (U - U^*) + (U^* - L^*) + (L^* - L) < \epsilon.$$

Since $I$ and $R$ both belong to the interval $[L, U]$, we see that $|R - I| < \epsilon$. Therefore $f$ is Riemann integrable.

Conversely, assume that $f$ is Riemann integrable with Riemann integral $I$. By Theorem 15, $f$ is bounded. Let $\epsilon > 0$ be given. There exists a $\delta > 0$ such that for all partition pairs $P, T$ with mesh $P < \delta$, $|R - I| < \epsilon/4$ where $R = R(f, P, T)$. Fix any such $P$ and consider $L = L(f, P)$, $U = U(f, P)$. There are choices of intermediate sets $T = \{t_i\}$, $T' = \{t_i'\}$ such that each $f(t_i)$ is so close to $m_i$ and each $f(t_i')$ is so close to $M_i$ that $R - L < \epsilon/4$ and $U - R' < \epsilon/4$, where $R = R(f, P, T)$ and $R' = R(f, P, T')$. Since $|R - I| < \epsilon/4$ and $|R' - I| < \epsilon/4$,

$$U - L = (U - R') + (R' - I) + (I - R) + (R - L) < \epsilon.$$

Since $\underline{I}, I, \overline{I}$ are fixed numbers that belong to the interval $[L, U]$ of length less than $\epsilon$, and $\epsilon$ is arbitrary, the $\epsilon$-principle implies that

$$\underline{I} = I = \overline{I},$$

which proves that $f$ is Darboux integrable and that its lower, upper, and Riemann integrals are equal. □
According to Theorem 19 and (2) we get

20 Riemann's Integrability Criterion A bounded function is Riemann integrable if and only if $\forall \epsilon > 0\ \exists P$ such that $U(f, P) - L(f, P) < \epsilon$.

Example Every continuous function $f : [a, b] \to \mathbb{R}$ is Riemann integrable. (See also Corollary 22 to the Riemann-Lebesgue Theorem, below.) Since $[a, b]$ is compact and $f$ is continuous, $f$ is uniformly continuous. See Theorem 44 in Chapter 2. Let $\epsilon > 0$ be given. Uniform continuity provides a $\delta > 0$ such that if $|t - s| < \delta$ then $|f(t) - f(s)| < \epsilon / 2(b - a)$. Choose any partition $P$ with mesh $P < \delta$. On each partition interval $[x_{i-1}, x_i]$, we have $M_i - m_i < \epsilon/(b - a)$. Thus

$$U(f, P) - L(f, P) = \sum (M_i - m_i)\,\Delta x_i < \frac{\epsilon}{b - a} \sum \Delta x_i = \epsilon.$$

By Riemann's integrability criterion $f$ is Riemann integrable.
Example A piecewise continuous function is continuous except at a finite number of points. A step function is constant except at a finite number of points where it is discontinuous. Clearly, a step function is a special type of piecewise continuous function. See Figure 67. The characteristic function (or indicator function) of a set $E \subset \mathbb{R}$, $\chi_E$, is 1 at points of $E$ and 0 at points of $E^c$. See Figure 68. A step function is a finite sum of constants times characteristic functions of intervals. Bounded piecewise continuous functions are Riemann integrable. See Corollary 23 below. Some characteristic functions are Riemann integrable, others aren't.

Example The characteristic function of $\mathbb{Q}$ is not integrable on $[a, b]$. It is defined as $\chi_{\mathbb{Q}}(x) = 1$ when $x \in \mathbb{Q}$ and $\chi_{\mathbb{Q}}(x) = 0$ when $x \notin \mathbb{Q}$. See Figure 69. Every lower sum $L(\chi_{\mathbb{Q}}, P)$ is 0 and every upper sum is $b - a$. By Riemann's criterion, $\chi_{\mathbb{Q}}$ is not integrable. Note that $\chi_{\mathbb{Q}}$ is discontinuous at every point, not merely at rational points.

Figure 67 The graphs of a piecewise continuous function and a step function.

Figure 68 The graph of a characteristic function and the region below the graph.

The fact that $\chi_{\mathbb{Q}}$ fails to be Riemann integrable is actually a failing of Riemann integration theory, for the function $\chi_{\mathbb{Q}}$ is fairly tame. Its integral ought to exist and it ought to be 0, because the undergraph is just countably many line segments of height 1, and their area ought to be 0.
Example The rational ruler function is Riemann integrable. At each rational number $x = p/q$, written in lowest terms, we set $f(x) = 1/q$, while $f(x) = 0$ when $x$ is irrational. See Figure 70. The integral of $f$ is 0. Note that $f$ is discontinuous at every $x \in \mathbb{Q}$ and is continuous at every $x \in \mathbb{Q}^c$.

Example Zeno's staircase function is $Z(x) = 1/2$ on the first half of $[0, 1]$, $Z(x) = 3/4$ on the next quarter of $[0, 1]$, and so on. See Figure 71. It is Riemann integrable and its integral is $2/3$. The function has infinitely many discontinuity points, one at each point $(2^k - 1)/2^k$. In fact, every monotone function is Riemann integrable.† See Corollary 24 below.

† To prove this directly is not hard. See also Corollary 24 below. The key observation to make is that a monotone function is not much different than a continuous function. It has only jump discontinuities, and only countably many of them; given any $\epsilon > 0$, there are only finitely many at which the jump is $\ge \epsilon$. See Exercise 2.30.

Figure 69 The graph of $\chi_{\mathbb{Q}}$ and the region below its graph.

Figure 70 The graph of the rational ruler function and the region below its graph.

Figure 71 Zeno's staircase.

These examples raise a natural question:

Exactly which functions are Riemann integrable?

To give an answer to the question, and for many other applications, the following concept is very useful. A set $Z \subset \mathbb{R}$ is a zero set if for each $\epsilon > 0$ there is a countable covering of $Z$ by open intervals $(a_i, b_i)$ such that

$$\sum_{i=1}^{\infty} b_i - a_i \le \epsilon.$$

The sum of the series is the total length of the covering. Think of zero sets as negligible; if a property holds for all points except those in a zero set then one says that the property holds almost everywhere, abbreviated "a.e."

21 Riemann-Lebesgue Theorem A function $f : [a, b] \to \mathbb{R}$ is Riemann integrable if and only if it is bounded and its set of discontinuity points is a zero set.

The set $D$ of discontinuity points is exactly what its name implies,

$$D = \{x \in [a, b] : f \text{ is discontinuous at the point } x\}.$$

A function whose set of discontinuity points is a zero set is continuous almost everywhere. The Riemann-Lebesgue theorem states that a function is Riemann integrable if and only if it is bounded and continuous almost everywhere. Examples of zero sets are
(a) Any subset of a zero set.
(b) Any finite set.
(c) Any countable union of zero sets.
(d) Any countable set.
(e) The middle-thirds Cantor set.

(a) is clear. For if $Z_0 \subset Z$ where $Z$ is a zero set, and if $\epsilon > 0$ is given, then there is some open covering of $Z$ by intervals whose total length is $\le \epsilon$; but the same collection of intervals covers $Z_0$, which shows that $Z_0$ is also a zero set.

(b) Let $Z = \{z_1, \dots, z_n\}$ be a finite set and let $\epsilon > 0$ be given. The intervals $(z_i - \epsilon/2n,\ z_i + \epsilon/2n)$, for $i = 1, \dots, n$, cover $Z$ and have total length $\epsilon$. Therefore $Z$ is a zero set. In particular, any single point is a zero set.

(c) This is a typical "$\epsilon/2^n$-argument." Let $Z_1, Z_2, \dots$ be a sequence of zero sets and $Z = \bigcup Z_j$. We claim that $Z$ is a zero set. Let $\epsilon > 0$ be given. The set $Z_1$ can be covered by countably many intervals $(a_{i1}, b_{i1})$ with total length $\sum (b_{i1} - a_{i1}) \le \epsilon/2$. The set $Z_2$ can be covered by countably many intervals $(a_{i2}, b_{i2})$ with total length $\sum (b_{i2} - a_{i2}) \le \epsilon/4$. In general, the set $Z_j$ can be covered by countably many intervals $(a_{ij}, b_{ij})$ with total length

$$\sum_{i=1}^{\infty} (b_{ij} - a_{ij}) \le \frac{\epsilon}{2^j}.$$

Since the countable union of countable sets is countable, the collection of all the intervals $(a_{ij}, b_{ij})$ is a countable covering of $Z$ by open intervals, and the total length of all these intervals is

$$\sum_{j=1}^{\infty} \sum_{i=1}^{\infty} (b_{ij} - a_{ij}) \le \sum_{j=1}^{\infty} \frac{\epsilon}{2^j} = \frac{\epsilon}{2} + \frac{\epsilon}{4} + \frac{\epsilon}{8} + \dots = \epsilon.$$

Thus $Z$ is a zero set and (c) is proved.

(d) This is implied by (b) and (c).

(e) Let $\epsilon > 0$ be given and choose $n \in \mathbb{N}$ such that $2^n/3^n < \epsilon$. The middle-thirds Cantor set $C$ is contained inside $2^n$ closed intervals of length $1/3^n$, say $I_1, \dots, I_{2^n}$. Enlarge each closed interval $I_i$ to an open interval $(a_i, b_i) \supset I_i$ such that $b_i - a_i = \epsilon/2^n$. (Since $1/3^n < \epsilon/2^n$, and $I_i$ has length $1/3^n$, this is possible.) The total length of these $2^n$ intervals $(a_i, b_i)$ is $\epsilon$. Thus $C$ is a zero set.

In the proof of the Riemann-Lebesgue Theorem, it is useful to focus on the "size" of a discontinuity. A simple expression for this size is the oscillation of $f$ at $x$,

$$\text{osc}_x(f) = \limsup_{t \to x} f(t) - \liminf_{t \to x} f(t).$$

Equivalently,

$$\text{osc}_x(f) = \lim_{r \to 0} \text{diam}\, f([x - r, x + r]).$$

(Of course, $r > 0$.) It is clear that $f$ is continuous at $x$ if and only if $\text{osc}_x(f) = 0$. It is also clear that if $I$ is any interval containing $x$ in its interior then

$$M_I - m_I \ge \text{osc}_x(f)$$

where $M_I$ and $m_I$ are the supremum and infimum of $f(t)$ as $t$ varies in $I$. See Figure 72.
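Returning to the zero-set examples, the coverings used for countable sets and for the Cantor set can be built explicitly. This is a sketch under stated assumptions: the helper names, the particular finite list of rationals, and the value of $\epsilon$ are hypothetical choices for illustration. Exact rational arithmetic is used so that the lengths can be compared without rounding.

```python
from fractions import Fraction

def cover_countable(points, eps):
    """Cover the j-th point (j = 1, 2, ...) by an open interval of length eps / 2**j."""
    intervals = []
    for j, z in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (j + 1)
        intervals.append((z - half, z + half))
    return intervals

def total_length(intervals):
    return sum(b - a for a, b in intervals)

def cantor_stage(n):
    """The 2**n closed intervals of length 3**-n left after n middle-third removals."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3
            nxt.append((a, a + third))
            nxt.append((b - third, b))
        intervals = nxt
    return intervals

eps = Fraction(1, 100)
# A finite initial segment of an enumeration of the rationals in [0, 1] (repeats are harmless).
rationals = [Fraction(p, q) for q in range(1, 8) for p in range(q + 1)]
cover = cover_countable(rationals, eps)
stage = cantor_stage(12)
print(float(total_length(cover)), float(total_length(stage)))
```

However many points are listed, the first covering has total length below $\epsilon$, and the stage-$n$ Cantor intervals have total length $(2/3)^n$, which drops below any given $\epsilon$ once $n$ is large. The code measures the closed intervals only; the enlargement to open intervals in (e) is not carried out.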
Figure 72 The oscillation of $f$ at $x$ is $\limsup_{t \to x} f(t) - \liminf_{t \to x} f(t)$.

Proof of the Riemann-Lebesgue Theorem The set $D$ of discontinuity points of $f : [a, b] \to [-M, M]$ naturally filters itself as the countable union

$$D = \bigcup_{k \in \mathbb{N}} D_k$$

where

$$D_k = \{x \in [a, b] : \text{osc}_x(f) \ge 1/k\}.$$

According to (a), (c) above, $D$ is a zero set if and only if each $D_k$ is a zero set.

Assume that $f$ is Riemann integrable and let $\epsilon > 0$, $k \in \mathbb{N}$ be given. By Theorem 19 there is a partition $P$ such that

$$U - L = \sum (M_i - m_i)\,\Delta x_i < \frac{\epsilon}{k}.$$
Any partition interval $I_i = [x_{i-1}, x_i]$ that contains a point of $D_k$ in its interior has $M_i - m_i \ge 1/k$. Since $\sum (M_i - m_i)\Delta x_i < \epsilon/k$, there cannot be too many such intervals. (This is the key step in the estimates.) More precisely, if we sum over the $i$'s such that $I_i$ contains a point of $D_k$ in its interior then

$$\frac{1}{k} \sum \Delta x_i \le U - L < \frac{\epsilon}{k}.$$

Except for the zero set of points which lie at partition points, $D_k$ is contained in finitely many open intervals whose total length is $< \epsilon$. Since $\epsilon$ is arbitrary, $D_k$ is a zero set. By (c), $D$ is a zero set.

Conversely, assume that the discontinuity set $D$ of $f : [a, b] \to [-M, M]$ is a zero set. Let $\epsilon > 0$ be given. By Riemann's integrability criterion, to prove that $f$ is Riemann integrable it suffices to find $L = L(f, P)$ and $U = U(f, P)$ such that $U - L < \epsilon$. Choose $k \in \mathbb{N}$ so that

$$\frac{1}{k} < \frac{\epsilon}{2(b - a)}.$$

By (a), $D_k$ is a zero set, so there is a countable covering of $D_k$ by open intervals $J_j = (a_j, b_j)$ with total length $\le \epsilon/4M$. Also, for each $x \in [a, b] \setminus D_k$ there is an open interval $I_x$ containing $x$ such that

$$\sup\{f(t) : t \in I_x\} - \inf\{f(t) : t \in I_x\} < \frac{1}{k}.$$

Consider the collection $\mathcal{V}$ of open intervals $J_j$ and $I_x$ such that $j \in \mathbb{N}$ and $x \in [a, b] \setminus D_k$. It is an open covering of $[a, b]$. Compactness of $[a, b]$ implies that $\mathcal{V}$ has a Lebesgue number $\lambda > 0$. Let $P$ be any partition of $[a, b]$ having mesh $P < \lambda$. We claim that $U(f, P) - L(f, P) < \epsilon$. Each partition interval $I_i$ is contained wholly in some $I_x$ or wholly in some $J_j$. (This is what Lebesgue numbers are good for.) Set $\mathcal{J} = \{i : I_i \text{ is contained in some } J_j\}$. See Figure 73. For some finite $m$, $J_1 \cup \dots \cup J_m$ contains those partition intervals $I_i$ with $i \in \mathcal{J}$. Also, $\{1, \dots, n\} = \mathcal{I} \cup \mathcal{J}$, where $\mathcal{I}$ denotes the complementary set of indices. Then

$$U - L = \sum_{i=1}^{n} (M_i - m_i)\,\Delta x_i \le \sum_{i \in \mathcal{J}} (M_i - m_i)\,\Delta x_i + \sum_{i \notin \mathcal{J}} (M_i - m_i)\,\Delta x_i$$
$$\le \sum_{i \in \mathcal{J}} 2M\,\Delta x_i + \sum_{i \notin \mathcal{J}} \frac{1}{k}\,\Delta x_i \le 2M \sum_{j=1}^{m} (b_j - a_j) + \frac{b - a}{k} < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$

For the total length of the intervals $I_i$ contained in the intervals $J_j$ is no greater than $\sum b_j - a_j$. As remarked at the outset, Riemann's integrability criterion then implies that $f$ is integrable. □

Figure 73 The partition intervals $I_i$ with large oscillation have $i \in \mathcal{J}$.

The Riemann-Lebesgue Theorem has many consequences, ten of which we list as corollaries.
22 Corollary Every continuous function is Riemann integrable.

Proof The discontinuity set of a continuous function is empty, and therefore is a zero set. A continuous function defined on a compact interval $[a, b]$ is bounded. By the Riemann-Lebesgue Theorem, any such function is Riemann integrable. □

23 Corollary Every piecewise continuous bounded function is Riemann integrable.

Proof The discontinuity set of a piecewise continuous function is finite and therefore a zero set. By the Riemann-Lebesgue Theorem, such a function is Riemann integrable. □

24 Corollary Every monotone function is Riemann integrable.

Proof The set of discontinuities of a monotone function $f : [a, b] \to \mathbb{R}$ is countable and therefore is a zero set. (See Exercise 30 in Chapter 2.) Since $f$ is monotone, its values lie in the interval between $f(a)$ and $f(b)$, so $f$ is bounded. By the Riemann-Lebesgue Theorem, $f$ is Riemann integrable. □

25 Corollary The product of Riemann integrable functions is Riemann integrable.

Proof Let $f, g \in \mathcal{R}$ be given. They are bounded and their product is bounded. By the Riemann-Lebesgue Theorem their discontinuity sets, $D(f)$ and $D(g)$, are zero sets, and $D(f) \cup D(g)$ contains the discontinuity set of the product $f \cdot g$. Since the union of two zero sets is a zero set, the Riemann-Lebesgue Theorem implies that $f \cdot g$ is Riemann integrable. □
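26 Corollary If $f : [a, b] \to [c, d]$ is Riemann integrable and $\phi : [c, d] \to \mathbb{R}$ is continuous then the composite $\phi \circ f$ is Riemann integrable.

Proof The discontinuity set of $\phi \circ f$ is contained in the discontinuity set of $f$, and therefore is a zero set. Since $\phi$ is continuous and $[c, d]$ is compact, $\phi \circ f$ is bounded. By the Riemann-Lebesgue Theorem, $\phi \circ f$ is Riemann integrable. □

27 Corollary If $f \in \mathcal{R}$ then $|f| \in \mathcal{R}$.

Proof The function $\phi : y \mapsto |y|$ is continuous, so $x \mapsto |f(x)|$ is Riemann integrable according to Corollary 26. □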
28 Corollary If $a < c < b$ and $f : [a, b] \to \mathbb{R}$ is Riemann integrable then its restrictions to $[a, c]$, $[c, b]$ are Riemann integrable, and

$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx.$$

Conversely, Riemann integrability on $[a, c]$ and $[c, b]$ implies Riemann integrability on $[a, b]$.

Proof See Figure 74. The union of the discontinuity sets for the restrictions of $f$ to the subintervals $[a, c]$, $[c, b]$ is the discontinuity set of $f$. The latter is a zero set if and only if the former two are, and so by the Riemann-Lebesgue Theorem, $f$ is Riemann integrable if and only if its restrictions to $[a, c]$, $[c, b]$ are. Let $\chi_{[a,c]}$, $\chi_{(c,b]}$ be the characteristic functions of $[a, c]$, $(c, b]$. By Corollary 23 they are integrable, and by Corollary 25, so are the products $\chi_{[a,c]} \cdot f$ and $\chi_{(c,b]} \cdot f$. Since

$$f = \chi_{[a,c]} \cdot f + \chi_{(c,b]} \cdot f,$$

the addition formula follows from linearity of the integral, Theorem 16. □

Figure 74 Additivity of the integral is equivalent to additivity of area.
29 Corollary If $f : [a, b] \to [0, M]$ is Riemann integrable and has integral zero then $f(x) = 0$ at every continuity point $x$ of $f$. That is, $f(x) = 0$ almost everywhere.

Proof Suppose not: let $x_0$ be a continuity point of $f$ and assume that $f(x_0) > 0$. Then for some $\delta > 0$ and each $x \in (x_0 - \delta, x_0 + \delta)$, $f(x) \ge f(x_0)/2$. The function

$$g(x) = \begin{cases} \dfrac{f(x_0)}{2} & \text{if } x \in (x_0 - \delta, x_0 + \delta) \\ 0 & \text{otherwise} \end{cases}$$

satisfies $0 \le g(x) \le f(x)$ everywhere. See Figure 75. By monotonicity of the integral, Theorem 17,

$$f(x_0)\,\delta = \int_a^b g(x)\,dx \le \int_a^b f(x)\,dx = 0,$$

a contradiction. Hence $f(x) = 0$ at every continuity point. □
Figure 75 The shaded rectangle prevents the integral of $f$ being zero.

Corollary 26 and Exercises 34, 36, 46, 48 deal with the way that Riemann integrability behaves under composition. If $f \in \mathcal{R}$ and $\phi$ is continuous then $\phi \circ f \in \mathcal{R}$, although the composition in the other order, $f \circ \phi$, may fail to be integrable. Continuity is too weak a hypothesis for such a "change of variable." See Exercise 36. However, we have the following result.

30 Corollary If $f$ is Riemann integrable and $\psi$ is a bijection whose inverse satisfies a Lipschitz condition then $f \circ \psi$ is Riemann integrable.

Proof More precisely, we assume that $f : [a, b] \to \mathbb{R}$ is Riemann integrable, $\psi$ bijects $[c, d]$ onto $[a, b]$, $\psi(c) = a$, $\psi(d) = b$, and for some constant $K$ and all $s, t \in [a, b]$,

$$|\psi^{-1}(s) - \psi^{-1}(t)| \le K|s - t|.$$

We then assert that $f \circ \psi$ is a Riemann integrable function $[c, d] \to \mathbb{R}$. Note that $\psi^{-1}$ is a homeomorphism. For it is a continuous bijection whose domain of definition is compact. Let $D$ be the set of discontinuity points of $f$. Then $D' = \psi^{-1}(D)$ is the set of discontinuity points for $f \circ \psi$. Let $\epsilon > 0$ be given. There is an open covering of $D$ by intervals $(a_i, b_i)$ whose total length is $\le \epsilon/K$. The homeomorphic intervals $(a_i', b_i') = \psi^{-1}(a_i, b_i)$ cover $D'$ and have total length

$$\sum b_i' - a_i' \le \sum K(b_i - a_i) \le \epsilon.$$

Therefore $D'$ is a zero set and by the Riemann-Lebesgue Theorem, $f \circ \psi$ is integrable. □
31 Corollary If $f \in \mathcal{R}$ and $\psi : [c, d] \to [a, b]$ is a $C^1$ diffeomorphism then $f \circ \psi$ is Riemann integrable.

Proof The hypothesis that $\psi$ is a $C^1$ diffeomorphism means that it is a continuously differentiable bijection, whose derivative is nowhere zero. We assume that $\psi'(t) > 0$. Since $\psi'$ is continuous and positive on $[c, d]$ there is a constant $\kappa > 0$ such that for all $\theta \in [c, d]$, $\psi'(\theta) \ge \kappa$. The Mean Value Theorem implies that for all $u, v \in [c, d]$, there exists a $\theta$ between $u$ and $v$ such that $\psi(u) - \psi(v) = \psi'(\theta)(u - v)$. Thus,

$$|\psi(u) - \psi(v)| \ge \kappa|u - v|.$$

For any $s, t \in [a, b]$, set $u = \psi^{-1}(s)$, $v = \psi^{-1}(t)$. Then

$$|s - t| \ge \kappa|\psi^{-1}(s) - \psi^{-1}(t)|,$$

which is a Lipschitz condition on $\psi^{-1}$ with Lipschitz constant $K = \kappa^{-1}$. By Corollary 30, $f \circ \psi$ is Riemann integrable. □

Versions of the preceding theorem and corollary remain true without the hypotheses that $\psi$ bijects. The proofs are harder because $\psi$ can fold infinitely often. See Exercise 39.

In calculus you learn that the derivative of the integral is the integrand. This we now prove.
32 Fundamental Theorem of Calculus If $f : [a, b] \to \mathbb{R}$ is Riemann integrable then its indefinite integral

$$F(x) = \int_a^x f(t)\,dt$$

is a continuous function of $x$. The derivative of $F(x)$ exists and equals $f(x)$ at all points $x$ at which $f$ is continuous.

Proof #1 Obvious from Figure 76.

Figure 76 Why does this picture give a proof of the Fundamental Theorem of Calculus?

Proof #2 Since $f$ is Riemann integrable, it is bounded; say $|f(x)| \le M$ for all $x$. By Corollary 28,

$$|F(y) - F(x)| = \left|\int_x^y f(t)\,dt\right| \le M|y - x|.$$

Therefore, $F$ is continuous: given $\epsilon > 0$, choose $\delta < \epsilon/M$, and observe that $|y - x| < \delta$ implies that $|F(y) - F(x)| < M\delta < \epsilon$. In exactly the same way, if $f$ is continuous at $x$ then

$$\frac{F(x + h) - F(x)}{h} = \frac{1}{h}\int_x^{x+h} f(t)\,dt \to f(x)$$

as $h \to 0$. For if

$$m(x, h) = \inf\{f(s) : |s - x| \le |h|\}, \qquad M(x, h) = \sup\{f(s) : |s - x| \le |h|\}$$

then

$$m(x, h) = \frac{1}{h}\int_x^{x+h} m(x, h)\,dt \le \frac{1}{h}\int_x^{x+h} f(t)\,dt \le \frac{1}{h}\int_x^{x+h} M(x, h)\,dt = M(x, h).$$

When $f$ is continuous at $x$, $m(x, h)$ and $M(x, h)$ converge to $f(x)$ as $h \to 0$, and so must the integral sandwiched between them,

$$\frac{1}{h}\int_x^{x+h} f(t)\,dt.$$

(If $h < 0$ then $\frac{1}{h}\int_x^{x+h} f(t)\,dt$ is interpreted as $-\frac{1}{h}\int_{x+h}^{x} f(t)\,dt$.) □
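The averaging step in Proof #2 can be observed numerically. In the sketch below the helper `average`, the midpoint approximation of the integral, the test function $\cos$, and the jump function are all illustrative assumptions rather than anything taken from the text.

```python
import math

def average(f, x, h, n=1000):
    """(1/h) * integral of f over [x, x + h], via a midpoint Riemann sum; h may be negative."""
    step = h / n
    return sum(f(x + (i + 0.5) * step) for i in range(n)) * step / h

# At a continuity point the averages approach f(x) from either side.
x = 1.0
averages = [average(math.cos, x, h) for h in (0.1, 0.01, 0.001, -0.001)]
print(averages, math.cos(x))

# At a jump the one-sided averages disagree, so the difference quotient of F has no limit.
jump = lambda t: 0.0 if t <= 0 else 1.0
right = average(jump, 0.0, 1e-3)
left = average(jump, 0.0, -1e-3)
print(left, right)
```

For $\cos$ the averages close in on $\cos 1$ as $h$ shrinks. For the jump function the right-hand averages stay near 1 and the left-hand averages stay at 0, which illustrates why the theorem asserts $F'(x) = f(x)$ only at continuity points.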
33 Corollary The derivative of an indefinite Riemann integral exists almost everywhere and equals the integrand almost everywhere.

Proof Assume that $f : [a, b] \to \mathbb{R}$ is Riemann integrable and $F(x)$ is its indefinite integral. By the Riemann-Lebesgue Theorem, $f$ is continuous almost everywhere, and by the Fundamental Theorem of Calculus, $F'(x)$ exists and equals $f(x)$ wherever $f$ is continuous. □

A second version of the Fundamental Theorem of Calculus concerns antiderivatives. If one function is the derivative of another, the second function is an antiderivative of the first.

Note When $G$ is an antiderivative of $g : [a, b] \to \mathbb{R}$, we have $G'(x) = g(x)$ for all $x \in [a, b]$, not merely for almost all $x \in [a, b]$.

34 Corollary Every continuous function has an antiderivative.
Proof Assume that $f : [a, b] \to \mathbb{R}$ is continuous. By the Fundamental Theorem of Calculus, the indefinite integral $F(x)$ has a derivative everywhere, and $F'(x) = f(x)$ everywhere. □

Some discontinuous functions have an antiderivative and others don't. Surprisingly, the wildly oscillating function

$$f(x) = \begin{cases} 0 & \text{if } x \le 0 \\ \sin\dfrac{\pi}{x} & \text{if } x > 0 \end{cases}$$

has an antiderivative, but the jump function

$$g(x) = \begin{cases} 0 & \text{if } x \le 0 \\ 1 & \text{if } x > 0 \end{cases}$$

does not. See Exercise 42.
35 Antiderivative Theorem An antiderivative of a Riemann integrable function, if it exists, differs from the indefinite integral by a constant.

Proof We assume that $f : [a, b] \to \mathbb{R}$ is Riemann integrable, that $G$ is an antiderivative of $f$, and we assert that for all $x \in [a, b]$,

$$G(x) = \int_a^x f(t)\,dt + C,$$

where $C$ is a constant. (In fact, $C = G(a)$.) Partition $[a, x]$ as

$$a = x_0 < x_1 < \dots < x_n = x,$$

and choose $t_k \in [x_{k-1}, x_k]$ such that

$$G(x_k) - G(x_{k-1}) = G'(t_k)\,\Delta x_k = f(t_k)\,\Delta x_k.$$

Such a $t_k$ exists by the Mean Value Theorem applied to the differentiable function $G$. Telescoping gives

$$G(x) - G(a) = \sum_{k=1}^{n} G(x_k) - G(x_{k-1}) = \sum_{k=1}^{n} f(t_k)\,\Delta x_k,$$

which is a Riemann sum for $f$ on the interval $[a, x]$. Since $f$ is Riemann integrable, the Riemann sum converges to $F(x)$ as the mesh of the partition tends to zero. This gives $G(x) - G(a) = F(x)$ as claimed. □
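The telescoping argument can be made concrete when the Mean Value Theorem point is available in closed form. The sketch below assumes the illustrative pair $G(x) = x^3/3$, $f(x) = x^2$ on a nonnegative interval, where $t_k = \sqrt{(x_k^2 + x_k x_{k-1} + x_{k-1}^2)/3}$; the partition is an arbitrary choice for the example.

```python
import math

def G(x):
    return x**3 / 3          # an antiderivative of f

def f(x):
    return x * x

def mvt_point(x_prev, x_next):
    """The t in [x_prev, x_next] with G(x_next) - G(x_prev) = f(t) * (x_next - x_prev).
    Closed form valid for this particular G and f on a nonnegative interval."""
    return math.sqrt((x_next**2 + x_next * x_prev + x_prev**2) / 3)

a, b = 1.0, 2.0
xs = [a, 1.1, 1.35, 1.5, 1.9, b]   # an uneven partition of [a, b]
ts = [mvt_point(xs[k - 1], xs[k]) for k in range(1, len(xs))]
riemann = sum(f(ts[k - 1]) * (xs[k] - xs[k - 1]) for k in range(1, len(xs)))
print(riemann, G(b) - G(a))
```

With these sample points the Riemann sum equals $G(b) - G(a)$ up to rounding for every partition, coarse or fine, which is exactly the telescoping identity used in the proof.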
36 Corollary Standard integral formulas, such as

$$\int_a^b x^2\,dx = \frac{b^3 - a^3}{3},$$

are valid.

Proof Every integral formula is actually a derivative formula, and the Antiderivative Theorem converts derivative formulas to integral formulas. □

In particular, the logarithm function is defined as the integral,

$$\log x = \int_1^x \frac{1}{t}\,dt.$$

Since the integrand $1/t$ is well defined and continuous when $t > 0$, $\log x$ is well defined and differentiable for $x > 0$. Its derivative is $1/x$. By the way, as is standard in post calculus vocabulary, $\log x$ refers to the natural logarithm, not to the base 10 logarithm. See also Exercise 16.

An antiderivative of $f$ has $G'(x) = f(x)$ everywhere, and differs from the indefinite integral $F(x)$ by a constant. But what if we assume instead that $H'(x) = f(x)$ almost everywhere? Should this not also imply $H(x)$ differs from $F(x)$ by a constant? Surprisingly, the answer is "no."
37 Theorem There exists a continuous function $H : [0, 1] \to \mathbb{R}$ whose derivative exists and equals zero almost everywhere, but which is not constant.
Proof The counter-example is the Devil's staircase function, also called the Cantor function. It is defined as

$$H(x) = \begin{cases} 1/2 & \text{if } x \in [1/3, 2/3] \\ 1/4 & \text{if } x \in [1/9, 2/9] \\ 3/4 & \text{if } x \in [7/9, 8/9] \\ \ \vdots \end{cases}$$

See Figures 77, 78.

Figure 77 The devil's staircase.

Figure 78 The devil's undergraph.

On each discarded interval in the Cantor set construction, $H$ is constant. Thus $H$ is differentiable with derivative zero at all points of $[0, 1] \setminus C$, and since $C$ is a zero set, this implies $H'(x) = 0$ for almost every $x$. To show that $H$ is continuous we use base 2 and base 3 arithmetic. If $x \in [0, 1]$ has base 3 expansion $x = (.x_1 x_2 \dots)_3$ then the base 2 expansion of $y = H(x)$ is $(.y_1 y_2 \dots)_2$ where

$$y_i = \begin{cases} 0 & \text{if } \exists k < i \text{ such that } x_k = 1 \\ 1 & \text{if } x_i = 1 \text{ and } \nexists k < i \text{ such that } x_k = 1 \\ x_i/2 & \text{if } x_i = 0 \text{ or } x_i = 2 \text{ and } \nexists k < i \text{ such that } x_k = 1. \end{cases}$$

You should ask: is this a valid definition of $H(x)$? If two different base 3 expansions, $(.x_1 x_2 \dots)_3$ and $(.x_1' x_2' \dots)_3$, represent the same number $x$, do the two base 2 expressions for $H(x)$ represent the same number $y$? Two base 3 expansions represent the same number $x$ if and only if $x$ is an endpoint of $C$; one of its base 3 expansions ends in 0's, the other in 2's. For example,

$$(.x_1 x_2 \dots x_\ell 0\overline{2})_3 = x = (.x_1 x_2 \dots x_\ell 1\overline{0})_3.$$

If for some (smallest) $k \le \ell$, $x_k = 1$ then

$$(.y_1 y_2 \dots)_2 = \left(.\tfrac{x_1}{2} \tfrac{x_2}{2} \dots \tfrac{x_{k-1}}{2} 1\overline{0}\right)_2$$

unambiguously. If none of $x_k$ with $k \le \ell$ equals 1 then the two base 2 expansions corresponding to $H(x)$ are

$$\left(.\tfrac{x_1}{2} \tfrac{x_2}{2} \dots \tfrac{x_\ell}{2} 0\overline{1}\right)_2 \quad\text{and}\quad \left(.\tfrac{x_1}{2} \tfrac{x_2}{2} \dots \tfrac{x_\ell}{2} 1\overline{0}\right)_2.$$

These two base 2 expansions represent the same number $y$. The same reasoning applies to the only other ambiguous case,

$$(.x_1 x_2 \dots x_\ell 1\overline{2})_3 = x = (.x_1 x_2 \dots x_\ell 2\overline{0})_3.$$

The point $y = H(x)$ is well defined. Continuity is now easy to check. Let $\epsilon > 0$ be given. Choose $k$ such that $1/2^k < \epsilon$. If $|x - x'| < 1/3^k$ then there are base 3 expansions of $x$, $x'$ whose first $k$ symbols agree. Therefore the first $k$ symbols of $H(x)$ and $H(x')$ agree, which implies that

$$|H(x) - H(x')| \le \frac{1}{2^k} < \epsilon.$$ □
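The digit rule for $H$ can be turned into a short program. This is a sketch, not a definitive implementation: the function name `cantor_function`, the digit cutoff, and the use of exact rational arithmetic are assumptions made for the example. It reads base 3 digits of $x$ one at a time, stops at the first digit 1, and otherwise halves each digit.

```python
from fractions import Fraction

def cantor_function(x, digits=40):
    """Approximate H(x) for x in [0, 1] from the first `digits` base 3 digits of x."""
    x = Fraction(x)
    if x == 1:
        return Fraction(1)
    y = Fraction(0)
    weight = Fraction(1, 2)          # value of the current base 2 digit
    for _ in range(digits):
        x *= 3
        d = int(x)                   # next base 3 digit: 0, 1 or 2
        x -= d
        if d == 1:
            return y + weight        # first digit 1: base 2 digit 1, then all zeros
        y += weight * (d // 2)       # digit 0 gives 0, digit 2 gives 1
        weight /= 2
    return y

for x in (Fraction(1, 3), Fraction(2, 3), Fraction(1, 9), Fraction(7, 9), Fraction(1, 4)):
    print(x, cantor_function(x))
```

The values at $1/3$ and $2/3$ agree, as do those on the other discarded intervals, matching the definition of $H$ by cases. The point $1/4 = (.\overline{02})_3$ lies in the Cantor set and is sent close to $(.\overline{01})_2 = 1/3$, up to the digit cutoff.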
A yet more pathological example is a strictly monotone, continuous function $J$ whose derivative is almost everywhere zero. Its graph is a sort of Devil's ski slope, almost everywhere level but also everywhere downhill. To construct $J$, start with $H$ and extend it to a function $\widehat{H} : \mathbb{R} \to \mathbb{R}$ by setting $\widehat{H}(x + n) = H(x) + n$ for all $n \in \mathbb{Z}$ and all $x \in [0, 1]$. Then set

$$J(x) = \sum_{k=0}^{\infty} \frac{\widehat{H}(3^k x)}{4^k}.$$

The values of $\widehat{H}(3^k x)$ for $x \in [0, 1]$ are $\le 3^k$, which is much smaller than the denominator $4^k$. Thus the series converges and $J(x)$ is well defined. According to the Weierstrass M-test, proved in the next chapter, $J$ is continuous. Since $\widehat{H}(3^k x)$ strictly increases for any pair of points at distance $> 1/3^k$ apart, and this fact is preserved when we take sums, $J$ strictly increases. The proof that $J'(x) = 0$ almost everywhere requires more and deeper theory.

Next, we justify two common methods of integration.
If f E R and g : [ c, d ] --+ [a , b] is a continuously differentiable bijection with g ' > 0 ( g is a C 1 diffeomorphism) then f (y) dy = f (g(x))g ' (x) dx.
38 Integration by substitution
lb
[d
Proof  The first integral exists by assumption. By Corollary 31, the composite $f \circ g \in \mathcal{R}$, and since $g'$ is continuous, the second integral exists by Corollary 25. To show that the two integrals are equal we resort again to Riemann sums. Let $P$ partition the interval $[c, d]$ as
$$c = x_0 < x_1 < \cdots < x_n = d$$
and choose $t_k \in [x_{k-1}, x_k]$ such that
$$g(x_k) - g(x_{k-1}) = g'(t_k)(x_k - x_{k-1}) .$$
The Mean Value Theorem ensures that such a $t_k$ exists. Since $g$ is a diffeomorphism we have a partition $Q$ of the interval $[a, b]$
$$a = y_0 < y_1 < \cdots < y_n = b$$
where $y_k = g(x_k)$, and $\|P\| \to 0$ implies that $\|Q\| \to 0$. Set $s_k = g(t_k)$. This gives two equal Riemann sums
$$\sum_{k=1}^{n} f(s_k)(y_k - y_{k-1}) = \sum_{k=1}^{n} f(g(t_k))\,g'(t_k)(x_k - x_{k-1})$$
which converge to the integrals $\int_a^b f(y)\,dy$ and $\int_c^d f(g(t))\,g'(t)\,dt$ as $\|P\| \to 0$. Since the limits of equals are equal, the integrals are equal. □

Actually, it is sufficient to assume that $g' \in \mathcal{R}$.

39 Integration by Parts  If $f, g : [a, b] \to \mathbb{R}$ are differentiable and $f', g' \in \mathcal{R}$ then
$$\int_a^b f(x)\,g'(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b f'(x)\,g(x)\,dx .$$
Proof  Differentiability implies continuity implies integrability, so $f, g \in \mathcal{R}$. Since the product of Riemann integrable functions is Riemann integrable, $f'g, fg' \in \mathcal{R}$, and both integrals exist. By the Leibniz Rule, $(fg)'(x) = f(x)g'(x) + f'(x)g(x)$ everywhere. That is, $fg$ is an antiderivative of $f'g + fg'$. The Antiderivative Theorem states that $fg$ differs from the indefinite integral of $f'g + fg'$ by a constant. That is, for all $t \in [a, b]$,
$$f(t)g(t) - f(a)g(a) = \int_a^t f'(x)g(x) + f(x)g'(x)\,dx = \int_a^t f'(x)g(x)\,dx + \int_a^t f(x)g'(x)\,dx .$$
Setting $t = b$ gives the result. □
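Both integration rules can be sanity-checked with Riemann sums. The following is a minimal sketch, not part of the proofs: the helper `riemann` (a midpoint Riemann sum) and the particular choices $f(y) = \cos y$, $g(x) = x^2$ on $[1, 2]$ for substitution, and $f(x) = x$, $g(x) = \sin x$ on $[0, 1]$ for integration by parts, are assumptions made only for illustration.

```python
import math

def riemann(f, a, b, n=20_000):
    """Midpoint Riemann sum of f over [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (k + 0.5) * dx) for k in range(n)) * dx

# Substitution: f(y) = cos(y), g(x) = x^2 maps [c, d] = [1, 2] onto [a, b] = [1, 4],
# and g'(x) = 2x > 0 there.
lhs_sub = riemann(math.cos, 1.0, 4.0)
rhs_sub = riemann(lambda x: math.cos(x * x) * 2.0 * x, 1.0, 2.0)

# Integration by parts: f(x) = x, g(x) = sin(x) on [0, 1].
lhs_parts = riemann(lambda x: x * math.cos(x), 0.0, 1.0)      # integral of f g'
boundary = 1.0 * math.sin(1.0) - 0.0 * math.sin(0.0)          # f(b)g(b) - f(a)g(a)
rhs_parts = boundary - riemann(math.sin, 0.0, 1.0)            # minus integral of f' g

print(lhs_sub, rhs_sub)
print(lhs_parts, rhs_parts)
```

The two numbers on each printed line agree up to the discretization error of the sums, as the theorems predict.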
Improper Integrals
Assume that $f : [a, b) \to \mathbb{R}$ is Riemann integrable when restricted to any closed subinterval $[a, c] \subset [a, b)$. You may imagine that $f(x)$ has some unpleasant behavior as $x \to b$, such as $\limsup_{x \to b} |f(x)| = \infty$ and/or $b = \infty$. See Figure 79. If the limit of $\int_a^c f(x)\,dx$ exists (and is a real number) as $c \to b$ then it is natural to define it as the improper Riemann integral
$$\int_a^b f(x)\,dx = \lim_{c \to b} \int_a^c f(x)\,dx .$$
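As a concrete instance of this definition, consider $f(x) = 1/\sqrt{1 - x}$ on $[0, 1)$, which is unbounded as $x \to 1$. The sketch below is an illustration only: the example function and the helper name `integral_to` are assumptions, not taken from the text. It compares midpoint Riemann sums over $[0, c]$ with the antiderivative value $2 - 2\sqrt{1 - c}$, which tends to 2 as $c \to 1$.

```python
import math

def integral_to(c, n=100_000):
    """Midpoint Riemann sum of f(x) = 1/sqrt(1 - x) over [0, c], where c < 1."""
    dx = c / n
    return sum(1.0 / math.sqrt(1.0 - (k + 0.5) * dx) for k in range(n)) * dx

# As c approaches 1 the proper integrals over [0, c] approach the improper integral.
for c in (0.9, 0.99, 0.999):
    print(c, integral_to(c), 2.0 - 2.0 * math.sqrt(1.0 - c))
```

The integrand blows up at the right endpoint, yet the integrals over $[0, c]$ stay bounded and converge, so the improper integral exists.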
Figure 79 The improper integral converges if and only if the total undergraph area is finite.

The same idea works of course on an interval to the left of $a$. In order that the two sided improper integral exists for a function $f : (a, b) \to \mathbb{R}$ it is natural to fix some point $m \in (a, b)$ and require that both improper integrals $\int_a^m f(x)\,dx$ and $\int_m^b f(x)\,dx$ exist. Their sum is the improper integral $\int_a^b f(x)\,dx$. With some ingenuity you can devise a function $f : \mathbb{R} \to \mathbb{R}$ whose improper integral $\int_{-\infty}^{\infty} f(x)\,dx$ exists despite the fact that it is unbounded at both $\pm\infty$. See Exercise 71.
3 Series
A series is a formal sum $\sum a_k$ where the terms $a_k$ are real numbers. The $n$th partial sum of the series is
$$A_n = a_0 + a_1 + a_2 + \cdots + a_n .$$
The series converges to $A$ if $A_n \to A$ as $n \to \infty$, and we write
$$A = \sum_{k=0}^{\infty} a_k .$$
A series that does not converge diverges. The basic question to ask about a series is: does it converge or diverge?
For example, if $\lambda$ is a constant and $|\lambda| < 1$ then the geometric series
$$\sum_{k=0}^{\infty} \lambda^k = 1 + \lambda + \cdots + \lambda^n + \cdots$$
converges to $1/(1 - \lambda)$. For its partial sums are
$$\Lambda_n = 1 + \lambda + \lambda^2 + \cdots + \lambda^n = \frac{1 - \lambda^{n+1}}{1 - \lambda}$$
and $\lambda^{n+1} \to 0$ as $n \to \infty$. On the other hand, if $|\lambda| \ge 1$, the series $\sum \lambda^k$ diverges.

Let $\sum a_n$ be a series. The Cauchy Convergence Criterion from Chapter 1 applied to its sequence of partial sums yields the CCC for series:

$\sum a_k$ converges if and only if
$$\forall \epsilon > 0 \ \exists N \text{ such that } m, n \ge N \implies \Big| \sum_{k=m}^{n} a_k \Big| < \epsilon .$$
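The partial-sum formula for the geometric series and the CCC can both be checked numerically. This is a small sketch under the assumption that floating point arithmetic is accurate enough for the chosen values; the function names `partial_sum`, `closed_form`, and `block` are hypothetical.

```python
def partial_sum(lam, n):
    """Lambda_n = 1 + lam + ... + lam**n, added term by term."""
    total, term = 0.0, 1.0
    for _ in range(n + 1):
        total += term
        term *= lam
    return total

def closed_form(lam, n):
    """(1 - lam**(n + 1)) / (1 - lam), valid for lam != 1."""
    return (1.0 - lam ** (n + 1)) / (1.0 - lam)

def block(lam, m, n):
    """|lam**m + ... + lam**n| for 1 <= m <= n, the quantity controlled by the CCC."""
    return abs(partial_sum(lam, n) - partial_sum(lam, m - 1))

print(partial_sum(0.5, 10), closed_form(0.5, 10))   # the two values agree
print(block(0.5, 20, 40))                           # a block far out in the tail is tiny
```

For $|\lambda| < 1$ the blocks $|\lambda^m + \cdots + \lambda^n|$ shrink as $m$ grows, which is exactly what the CCC requires.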
One immediate consequence of the CCC is that no finite number of terms affects convergence of a series. Rather, it is the tail of the series, the terms $a_k$ with $k$ large, that determines convergence or divergence. Likewise, whether the series leads off with a term of index $k = 0$ or $k = 1$, etc. is irrelevant.

A second consequence of the CCC is that if $a_k$ does not converge to zero as $k \to \infty$ then $\sum a_k$ does not converge. For Cauchyness of the partial sum sequence $(A_n)$ implies that $a_n = A_n - A_{n-1}$ becomes small when $n \to \infty$. If $|\lambda| \ge 1$ the geometric series $\sum \lambda^k$ diverges since its terms do not converge to zero. The harmonic series
$$\sum_{k=1}^{\infty} \frac{1}{k} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots$$
gives an example that a series can diverge even though its terms do tend to zero. See below.

Series theory has a large number of convergence tests. All boil down to the following result.

40 Comparison Test  If a series $\sum b_k$ dominates a series $\sum a_k$ in the sense that for all sufficiently large $k$, $|a_k| \le b_k$, then convergence of $\sum b_k$ implies convergence of $\sum a_k$.
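Proof  Given $\epsilon > 0$, convergence of $\sum b_k$ implies there is a large $N$ such that for all $m, n \ge N$, $\sum_{k=m}^{n} b_k < \epsilon$. Thus
$$\Big| \sum_{k=m}^{n} a_k \Big| \le \sum_{k=m}^{n} |a_k| \le \sum_{k=m}^{n} b_k < \epsilon$$
and convergence of $\sum a_k$ follows from the CCC. □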
Example  The series $\sum \sin(k)/2^k$ converges since it is dominated by the geometric series $\sum 1/2^k$.

A series $\sum a_k$ converges absolutely if $\sum |a_k|$ converges. The comparison test shows that absolute convergence implies convergence. A series that converges but not absolutely converges conditionally: $\sum a_k$ converges and $\sum |a_k|$ diverges. See below.

Series and integrals are both infinite sums. You can imagine a series as an improper integral in which the integration variable is an integer.
More precisely, given a series $\sum a_k$, define $f : [0, \infty) \to \mathbb{R}$ by setting
$$f(x) = a_k \quad \text{if } k - 1 < x \le k .$$
See Figure 80. Then
$$\sum_{k=0}^{\infty} a_k = \int_0^{\infty} f(x)\,dx .$$
The series converges if and only if the improper integral does. The natural extension of this picture is the
41 Integral Test  Suppose that $\int_0^{\infty} f(x)\,dx$ is a given improper integral and $\sum a_k$ is a given series.
(a) If $|a_k| \le f(x)$ for all sufficiently large $k$ and all $x \in (k - 1, k]$ then convergence of the improper integral implies convergence of the series.
(b) If $|f(x)| \le a_k$ for all sufficiently large $k$ and all $x \in [k, k + 1)$ then divergence of the improper integral implies divergence of the series.
Figure 80 The pictorial proof of the integral test.
Proof  See Figure 80. (a) For some large $N_0$ and all $N \ge N_0$ we have
$$\sum_{k=N_0+1}^{N} |a_k| \le \int_{N_0}^{N} f(x)\,dx \le \int_{N_0}^{\infty} f(x)\,dx ,$$
which is a finite real number. An increasing, bounded sequence converges to a limit, so the tail of the series $\sum |a_k|$ converges, and the whole series $\sum |a_k|$ converges. Absolute convergence implies convergence. □

The proof of (b) is left as Exercise 73.
Example  The $p$-series, $\sum 1/k^p$, converges when $p > 1$ and diverges when $p \le 1$.

Case 1. $p > 1$. By the fundamental theorem of calculus and differentiation rules
$$\int_1^b \frac{dx}{x^p} = \frac{b^{1-p} - 1}{1 - p} \to \frac{1}{p - 1}$$
as $b \to \infty$. The improper integral converges and dominates the $p$-series, which implies convergence of the series by the integral test.

Case 2. $p \le 1$. The $p$-series dominates the improper integral
$$\int_1^b \frac{dx}{x^p} = \begin{cases} \log b & \text{if } p = 1 \\[4pt] \dfrac{b^{1-p} - 1}{1 - p} & \text{if } p < 1 . \end{cases}$$
As $b \to \infty$, these quantities blow up, and the integral test implies divergence of the series. When $p = 1$ we have the harmonic series, which we have just shown to diverge.
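The comparison between the $p$-series and the integral of $x^{-p}$ can be seen in numbers. The sketch below assumes the standard bounds for a positive decreasing function, $\int_1^{n+1} x^{-p}\,dx \le \sum_{k=1}^{n} k^{-p} \le 1 + \int_1^{n} x^{-p}\,dx$, which express the picture behind the integral test; the function names are hypothetical.

```python
import math

def p_series(p, n):
    """Partial sum 1/1**p + 1/2**p + ... + 1/n**p."""
    return sum(1.0 / k ** p for k in range(1, n + 1))

def integral_1_to_b(p, b):
    """Closed form of the integral of x**(-p) over [1, b], as in the example."""
    if p == 1:
        return math.log(b)
    return (b ** (1 - p) - 1) / (1 - p)

n = 1000
for p in (2, 1, 0.5):
    lower = integral_1_to_b(p, n + 1)      # integral underestimates the sum
    upper = 1.0 + integral_1_to_b(p, n)    # first term plus integral overestimates it
    print(p, lower, p_series(p, n), upper)
```

For $p = 2$ the bounds stay below 2 no matter how large $n$ is, while for $p = 1$ and $p = 1/2$ the lower bound grows without limit as $n$ increases.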
The exponential growth rate of the series $\sum a_k$ is
$$\alpha = \limsup_{k \to \infty} \sqrt[k]{|a_k|} .$$

42 Root Test  Let $\alpha$ be the exponential growth rate of a series $\sum a_k$. If $\alpha < 1$ the series converges, if $\alpha > 1$ the series diverges, and if $\alpha = 1$ the root test is inconclusive.

Proof  If $\alpha < 1$, fix a constant $\beta$, $\alpha < \beta < 1$. Then for all large $k$, $|a_k|^{1/k} \le \beta$; i.e.,
$$|a_k| \le \beta^k ,$$
which gives convergence of $\sum a_k$ by comparison to the geometric series $\sum \beta^k$.

If $\alpha > 1$, choose $\beta$, $1 < \beta < \alpha$. Then $|a_k| \ge \beta^k$ for infinitely many $k$. Since the terms $a_k$ do not converge to 0, the series diverges.

To show that the root test is inconclusive when $\alpha = 1$, we must find two series, one convergent and the other divergent, both having exponential growth rate $\alpha = 1$. The examples are $p$-series. We have
$$\log \Big( \frac{1}{k^p} \Big)^{1/k} = \frac{-p \log(k)}{k} \sim \frac{-p \log(x)}{x} \sim \frac{-p/x}{1} \sim 0$$
by L'Hospital's rule as $k = x \to \infty$. Therefore $\alpha = \lim_{k \to \infty} (1/k^p)^{1/k} = 1$. Since the square series $\sum 1/k^2$ converges and the harmonic series $\sum 1/k$ diverges the root test is inconclusive when $\alpha = 1$. □
43 Ratio Test  Let the ratio between successive terms of the series $\sum a_k$ be $r_k = |a_{k+1}/a_k|$, and set
$$\liminf_{k \to \infty} r_k = \lambda \qquad \limsup_{k \to \infty} r_k = \rho .$$
If $\rho < 1$ the series converges, if $\lambda > 1$ the series diverges, and otherwise the ratio test is inconclusive.

Proof  If $\rho < 1$, choose $\beta$, $\rho < \beta < 1$. For all $k \ge$ some $K$, $|a_{k+1}/a_k| < \beta$; i.e.,
$$|a_k| \le \beta^{k-K} |a_K| = C\beta^k$$
where $C = \beta^{-K} |a_K|$ is a constant. Convergence of $\sum a_k$ follows from comparison with the geometric series $\sum C\beta^k$. If $\lambda > 1$, choose $\beta$, $1 < \beta < \lambda$. Then $|a_k| \ge \beta^k / C$ for all large $k$, and $\sum a_k$ diverges because its terms do not converge to 0. Again the $p$-series all have ratio limit $\rho = \lambda = 1$ and demonstrate the inconclusiveness of the ratio test when $\rho = 1$ or $\lambda = 1$. □
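The inconclusive case of both tests can be observed directly on the $p$-series. In this small sketch (function names are hypothetical) the $k$th roots and the successive ratios of $a_k = 1/k^p$ are computed for $p = 1$ and $p = 2$; both quantities creep up to 1 whether the series diverges or converges.

```python
import math

def root_k(p, k):
    """(1/k**p)**(1/k), computed as exp(-p*log(k)/k) to avoid underflow."""
    return math.exp(-p * math.log(k) / k)

def ratio_k(p, k):
    """r_k = |a_(k+1) / a_k| for a_k = 1/k**p."""
    return (k / (k + 1.0)) ** p

for p in (1, 2):
    for k in (10, 1000, 10 ** 6):
        print(p, k, root_k(p, k), ratio_k(p, k))
```

Since the harmonic series ($p = 1$) diverges and the square series ($p = 2$) converges, neither the value $\alpha = 1$ nor $\rho = \lambda = 1$ can decide convergence.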
Although it is usually easier to apply the ratio test than the root test, the latter has a strictly wider scope. See Exercises 56, 60.

Conditional Convergence
If $(a_k)$ is a decreasing sequence in $\mathbb{R}$ that converges to 0 then its alternating series
$$\sum_{k=1}^{\infty} (-1)^{k+1} a_k = a_1 - a_2 + a_3 - \cdots$$
converges. For,
$$A_{2n} = (a_1 - a_2) + (a_3 - a_4) + \cdots + (a_{2n-1} - a_{2n})$$
and $a_{k-1} - a_k$ is the length of the interval $I_k = (a_k, a_{k-1})$. The intervals $I_k$ are disjoint, so the sum of their lengths is at most the length of $(0, a_1)$, namely $a_1$. See Figure 81.

Figure 81 The pictorial proof of alternating convergence.

The sequence $(A_{2n})$ is increasing and bounded, so $\lim_{n \to \infty} A_{2n}$ exists. The partial sum $A_{2n+1}$ differs from $A_{2n}$ by $a_{2n+1}$, a quantity that converges to 0 as $n \to \infty$, so
$$\lim_{n \to \infty} A_{2n} = \lim_{n \to \infty} A_{2n+1} ,$$
and the alternating series converges. When $a_k = 1/k$ we have the alternating harmonic series,
$$\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots ,$$
which we have just shown is convergent.
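The squeeze in this argument is easy to watch numerically for the alternating harmonic series. The sketch below uses a hypothetical helper `alt_partial_sums`; the value $\log 2$ for the sum is a standard fact that is not derived in this passage and is used here only as a reference point.

```python
import math

def alt_partial_sums(n):
    """Partial sums A_1, ..., A_n of 1 - 1/2 + 1/3 - 1/4 + ... ."""
    sums, total = [], 0.0
    for k in range(1, n + 1):
        total += (-1) ** (k + 1) / k
        sums.append(total)
    return sums

A = alt_partial_sums(2001)
evens = A[1::2]    # A_2, A_4, ...: increasing
odds = A[0::2]     # A_1, A_3, ...: decreasing
print(evens[-1], math.log(2.0), odds[-1])
```

The even partial sums rise, the odd ones fall, and the gap between neighbours $A_{2n}$ and $A_{2n+1}$ is $a_{2n+1} = 1/(2n+1)$, so the two subsequences trap a common limit.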
Series of Functions
A series of functions is of the form
$$\sum_{k=0}^{\infty} f_k(x)$$
where each term $f_k : (a, b) \to \mathbb{R}$ is a function. For example, in a power series
$$\sum c_k x^k$$
the functions are monomials $c_k x^k$. (The coefficients $c_k$ are constants and $x$ is a real variable.) If you think of $\lambda = x$ as a variable, then the geometric series is a power series whose coefficients are 1, $\sum x^k$. Another example of a series of functions is a Fourier series
$$\sum a_k \sin(kx) + b_k \cos(kx) .$$

44 Radius of Convergence Theorem  If $\sum c_k x^k$ is a power series then there is a unique $R$, $0 \le R \le \infty$, its radius of convergence, such that the series converges whenever $|x| < R$, and diverges whenever $|x| > R$. Moreover $R$ is given by the formula
$$R = \frac{1}{\limsup_{k \to \infty} \sqrt[k]{|c_k|}} .$$

Proof  Apply the root test to the series $\sum c_k x^k$. Then
$$\limsup_{k \to \infty} \sqrt[k]{|c_k x^k|} = |x| \limsup_{k \to \infty} \sqrt[k]{|c_k|} = \frac{|x|}{R} .$$
If $|x| < R$ the root test gives convergence. If $|x| > R$ it gives divergence. □
There are power series with any given radius of convergence, $0 \le R \le \infty$. The series $\sum k^k x^k$ has $R = 0$. The series $\sum x^k / \sigma^k$ has $R = \sigma$ for $0 < \sigma < \infty$. The series $\sum x^k / k!$ has $R = \infty$. Eventually, we show that a function defined by a power series is analytic: it has all derivatives at all points and it can be expanded as a Taylor series at each point inside its radius of convergence, not merely at $x = 0$. See Section 6 in Chapter 4.
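The three examples of radii can be probed with the root formula. The sketch below is heuristic: it evaluates $\sqrt[k]{|c_k|}$ at a few large $k$ rather than computing a true lim sup, and it works with $\log |c_k|$ to avoid overflow. The helper names and the choice $\sigma = 3$ are assumptions for illustration.

```python
import math

def kth_root(log_abs_c, k):
    """|c_k|**(1/k), given a function that returns log|c_k|."""
    return math.exp(log_abs_c(k) / k)

sigma = 3.0

def log_c_geometric(k):
    return -k * math.log(sigma)     # c_k = 1/sigma**k, expected R = sigma

def log_c_factorial(k):
    return -math.lgamma(k + 1)      # c_k = 1/k!, expected R = infinity

def log_c_power(k):
    return k * math.log(k)          # c_k = k**k, expected R = 0

for k in (10, 100, 1000):
    print(k, kth_root(log_c_geometric, k),
          kth_root(log_c_factorial, k), kth_root(log_c_power, k))
```

The first column of roots stays at $1/\sigma$, the second drifts toward 0, and the third grows without bound, matching $R = \sigma$, $R = \infty$, and $R = 0$ respectively.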
Exercises
1. Assume that $f : \mathbb{R} \to \mathbb{R}$ satisfies $|f(t) - f(x)| \le |t - x|^2$ for all $t, x$. Prove that $f$ is constant.

2. A function $f : (a, b) \to \mathbb{R}$ satisfies a Hölder condition of order $\alpha$ if $\alpha > 0$, and for some constant $H$ and all $u, x \in (a, b)$,
$$|f(u) - f(x)| \le H |u - x|^{\alpha} .$$
The function is said to be $\alpha$-Hölder, with $\alpha$-Hölder constant $H$. (The terms "Lipschitz function of order $\alpha$" and "$\alpha$-Lipschitz function" are sometimes used with the same meaning.)
  (a) Prove that an $\alpha$-Hölder function defined on $(a, b)$ is uniformly continuous and infer that it extends uniquely to a continuous function defined on $[a, b]$. Is the extended function $\alpha$-Hölder?
  (b) What does $\alpha$-Hölder continuity mean when $\alpha = 1$?
  (c) Prove that $\alpha$-Hölder continuity when $\alpha > 1$ implies that $f$ is constant.

3. Assume that $f : (a, b) \to \mathbb{R}$ is differentiable.
  (a) If $f'(x) > 0$ for all $x$, prove that $f$ is strictly monotone increasing.
  (b) If $f'(x) \ge 0$ for all $x$, what can you prove?

4. Prove that $\sqrt{n+1} - \sqrt{n} \to 0$ as $n \to \infty$.

5. Assume that $f : \mathbb{R} \to \mathbb{R}$ is continuous, and for all $x \ne 0$, $f'(x)$ exists. If $\lim_{x \to 0} f'(x) = L$ exists, does it follow that $f'(0)$ exists? Prove or disprove.

6. If a differentiable function $f : (a, b) \to \mathbb{R}$ assumes a maximum or a minimum at some $\theta \in (a, b)$, prove that $f'(\theta) = 0$. Why is the assertion false when $[a, b]$ replaces $(a, b)$?

7. In L'Hospital's Rule, replace the interval $(a, b)$ with the half-line $[a, \infty)$ and interpret "$x$ tends to $b$" as "$x \to \infty$." Show that if $f/g$ tends to $0/0$ and $f'/g'$ tends to $L$ then $f/g$ also tends to $L$. Prove that this continues to hold when $L = \infty$ in the sense that if $f'/g' \to \infty$ then $f/g \to \infty$.

8. In L'Hospital's Rule, replace the assumption that $f/g$ tends to $0/0$ with the assumption that it tends to $\infty/\infty$. If $f'/g'$ tends to $L$, prove that $f/g$ tends to $L$ also. [Hint: Think of a rear guard instead of an advance guard.] [Query: Is there a way to deduce the $\infty/\infty$ case from the $0/0$ case? Naively taking reciprocals does not work.]

9. (a) Draw the graph of a continuous function defined on $[0, 1]$ that is differentiable on the interval $(0, 1)$ but not at the endpoints.
  (b) Can you find a formula for such a function?
  (c) Does the Mean Value Theorem apply to such a function?

10. Let $f : (a, b) \to \mathbb{R}$ be given.
  (a) If $f''(x)$ exists, prove that
$$\lim_{h \to 0} \frac{f(x - h) - 2f(x) + f(x + h)}{h^2} = f''(x) .$$
  (b) Find an example that this limit can exist even when $f''(x)$ fails to exist.

11. Assume that $f : (-1, 1) \to \mathbb{R}$ and $f'(0)$ exists. If $\alpha_n, \beta_n \to 0$ as $n \to \infty$, define the difference quotient
$$D_n = \frac{f(\beta_n) - f(\alpha_n)}{\beta_n - \alpha_n} .$$
  (a) Prove that $\lim_{n \to \infty} D_n = f'(0)$ under each of the following conditions
    (i) $\alpha_n < 0 < \beta_n$.
    (ii) $0 < \alpha_n < \beta_n$ and $\dfrac{\beta_n}{\beta_n - \alpha_n} \le M$.
    (iii) $f'(x)$ exists and is continuous for all $x \in (-1, 1)$.
  (b) Set $f(x) = x^2 \sin(1/x)$ for $x \ne 0$ and $f(0) = 0$. Observe that $f$ is differentiable everywhere in $(-1, 1)$ and $f'(0) = 0$. Find $\alpha_n, \beta_n$ that tend to 0 in such a way that $D_n$ converges to a limit unequal to $f'(0)$.

12. Assume that $f$ and $g$ are $r$th order differentiable functions $(a, b) \to \mathbb{R}$, $r \ge 1$. Prove the $r$th order Leibniz product rule for the function $f \cdot g$,
$$(f \cdot g)^{(r)}(x) = \sum_{k=0}^{r} \binom{r}{k} f^{(k)}(x) \cdot g^{(r-k)}(x) ,$$
where $\binom{r}{k} = r!/(k!\,(r-k)!)$ is the binomial coefficient, $r$ choose $k$. [Hint: Induction.]

13. Assume that $f : \mathbb{R} \to \mathbb{R}$ is differentiable.
  (a) If there is an $L < 1$ such that for each $x \in \mathbb{R}$, $f'(x) < L$, prove that there exists a unique point $x$ such that $f(x) = x$. [$x$ is a fixed point for $f$.]
  (b) Show by example that (a) fails if $L = 1$.

14. Define $e : \mathbb{R} \to \mathbb{R}$ by
$$e(x) = \begin{cases} e^{-1/x} & \text{if } x > 0 \\ 0 & \text{if } x \le 0 . \end{cases}$$
  (a) Prove that $e$ is smooth: that is, $e$ has derivatives of all orders at all points $x$. [Hint: L'Hospital and induction. Feel free to use the standard differentiation formulas about $e^x$ from calculus.]
  (b) Is $e$ analytic?
  (c) Show that the bump function
$$\beta(x) = e^2\, e(1 - x) \cdot e(x + 1)$$
  is smooth, identically zero outside the interval $(-1, 1)$, positive inside the interval $(-1, 1)$, and takes value 1 at $x = 0$. ($e^2$ is the square of the base of the natural logarithms, while $e(x)$ is the function just defined. Apologies to the abused notation.)
  (d) For $|x| < 1$ show that
$$\beta(x) = e^{-2x^2/(1 - x^2)} .$$
  Bump functions have wide use in smooth function theory and differential topology. The graph of $\beta$ looks like a bump. See Figure 82.
Figure 82 The graph of the bump function $\beta$.

**15. Let $L$ be any closed set in $\mathbb{R}$. Prove that there is a smooth function $f : \mathbb{R} \to [0, 1]$ such that $f(x) = 0$ if and only if $x \in L$. To put it another way, every closed set in $\mathbb{R}$ is the zero locus of some smooth function. [Hint: Use Exercise 14(c).]

16. $\log x$ is defined to be $\int_1^x 1/t\,dt$ for $x > 0$. Using only the mathematics explained in this chapter,
  (a) Prove that log is a smooth function.
  (b) Prove that $\log(xy) = \log x + \log y$ for all $x, y > 0$. [Hint: Fix $y$ and define $f(x) = \log(xy) - \log x - \log y$. Show that $f(x) \equiv 0$.]
  (c) Prove that log is strictly monotone increasing and its range is all of $\mathbb{R}$.

17. Define $f(x) = x^2$ if $x < 0$ and $f(x) = x + x^2$ if $x \ge 0$. Differentiation gives $f''(x) \equiv 2$. This is bogus. Why?

18. Recall that the $\kappa$-oscillation set of an arbitrary function $f : [a, b] \to \mathbb{R}$ is
$$D_\kappa = \{ x \in [a, b] : \operatorname{osc}_x(f) \ge \kappa \} .$$
  (a) Prove that $D_\kappa$ is closed.
  (b) Infer that the discontinuity set of $f$ is a countable union of closed sets. (This is called an $F_\sigma$-set.)
  (c) Infer from (b) that the set of continuity points is a countable intersection of open sets. (This is called a $G_\delta$-set.)

*19. Baire's Theorem (page 243) asserts that if a complete metric space is the countable union of closed subsets then at least one of them has non-empty interior. Use Baire's Theorem to show that the set of irrational numbers is not the countable union of closed subsets of $\mathbb{R}$.

20. Use Exercises 18 and 19 to show that there is no function $f : \mathbb{R} \to \mathbb{R}$ which is discontinuous at every irrational number and continuous at every rational number.

**21. Find a subset $S$ of the middle-thirds Cantor set which is never the discontinuity set of a function $f : \mathbb{R} \to \mathbb{R}$. Infer that some zero sets are never discontinuity sets of Riemann integrable functions. [Hint: How many subsets of $C$ are there? How many can be countable unions of closed sets?]

**22. Suppose that $f_n : [a, b] \to \mathbb{R}$ is a sequence of continuous functions that converges pointwise to a limit function $f : [a, b] \to \mathbb{R}$. Such an $f$ is said to be of Baire class 1. (Pointwise convergence is discussed in the next chapter. It means what it says: for each $x$, $f_n(x)$ converges to $f(x)$ as $n \to \infty$. Continuous functions are considered to be of Baire class 0, and in general a Baire class $k$ function is the pointwise limit of a sequence of Baire class $k - 1$ functions. Strictly speaking, it should not be of Baire class $k - 1$ itself, but for simplicity I include continuous functions among Baire class 1 functions. It is an interesting fact that for every $k$ there are Baire class $k$ functions not of Baire class $k - 1$. You might consult A Primer of Real Functions by Ralph Boas.)
Prove that the $\kappa$-oscillation set of $f$ is nowhere dense, as follows. To arrive at a contradiction, suppose that $D_\kappa$ is dense in some interval $(\alpha, \beta) \subset [a, b]$. By Exercise 18, $D_\kappa$ is closed, so it contains $(\alpha, \beta)$. Cover $\mathbb{R}$ by countably many intervals $(a_\ell, b_\ell)$ of length $< \kappa$ and set
$$H_\ell = f^{\mathrm{pre}}(a_\ell, b_\ell) .$$
  (a) Why does $\bigcup_\ell H_\ell = [a, b]$?
  (b) Show that no $H_\ell$ contains a subinterval of $(\alpha, \beta)$.
  (c) Why are
$$F_{\ell m n} = \Big\{ x \in [a, b] : a_\ell + \frac{1}{m} \le f_n(x) \le b_\ell - \frac{1}{m} \Big\} \qquad E_{\ell m N} = \bigcap_{n \ge N} F_{\ell m n}$$
  closed?
  (d) Show that
$$H_\ell = \bigcup_{m, N \in \mathbb{N}} E_{\ell m N} .$$
  (e) Use (a) and Baire's Theorem (page 243) to deduce that some $E_{\ell m N}$ contains a subinterval of $(\alpha, \beta)$.
  (f) Why does (e) contradict (b) and complete the proof that $D_\kappa$ is nowhere dense?

23. Combine Exercises 18, 22, and Baire's Theorem to show that a Baire class 1 function has a dense set of continuity points.

24. Suppose that $g : [a, b] \to \mathbb{R}$ is differentiable.
  (a) Prove that $g'$ is of Baire class 1. [Hint: Extend $g$ to a differentiable function defined on a larger interval and consider
$$f_n(x) = \frac{g(x + 1/n) - g(x)}{1/n}$$
  for $x \in [a, b]$. Is $f_n(x)$ continuous? Does $f_n(x)$ converge pointwise to $g'(x)$ as $n \to \infty$?]
  (b) Infer from Exercise 23 that a derivative can not be everywhere discontinuous. It must be continuous on a dense subset of its domain of definition.

25. Consider the characteristic functions $f(x)$ and $g(x)$ of the intervals $[1, 4]$ and $[2, 5]$. The derivatives $f'$, $g'$ exist almost everywhere. The integration by parts formula says that
$$\int_0^3 f(x)\,g'(x)\,dx = f(3)g(3) - f(0)g(0) - \int_0^3 f'(x)\,g(x)\,dx .$$
But both integrals are zero, while $f(3)g(3) - f(0)g(0) = 1$. Where is the error?

26. Let $\Omega$ be a set with a transitive relation $\preceq$. It satisfies the conditions that for all $\omega_1, \omega_2, \omega_3 \in \Omega$, $\omega_1 \preceq \omega_1$ and $\omega_1 \preceq \omega_2 \preceq \omega_3$ implies that $\omega_1 \preceq \omega_3$. A function $f : \Omega \to \mathbb{R}$ converges to a limit $L$ with respect to $\Omega$ if, given any $\epsilon > 0$ there is an $\omega_0 \in \Omega$ such that $\omega_0 \preceq \omega$ implies $|f(\omega) - L| < \epsilon$. We write $\lim_\Omega f(\omega) = L$ to indicate this convergence. Observe that
  • When $f(n) = a_n$ and $\mathbb{N}$ is given its standard order relation $\le$, $\lim_{n \to \infty} a_n$ means the same thing as $\lim_{\mathbb{N}} f(n)$.
  • When $\mathbb{R}$ is given its standard order relation $\le$, $\lim_{t \to \infty} f(t)$ means the same thing as $\lim_{\mathbb{R}} f(t)$.
  • Fix an $x \in \mathbb{R}$ and give $\mathbb{R}$ the new relation $t_1 \preceq t_2$ when $|t_2 - x| \le |t_1 - x|$. Then $\lim_{t \to x} f(t)$ means the same thing as $\lim_{(\mathbb{R}, \preceq)} f(t)$.
  (a) Prove that limits are unique: if $\lim_\Omega f = L_1$ and $\lim_\Omega f = L_2$ then $L_1 = L_2$.
  (b) Prove that existence of $\lim_\Omega f$ and $\lim_\Omega g$ imply that
$$\lim_\Omega (f + cg) = \lim_\Omega f + c \lim_\Omega g$$
$$\lim_\Omega (f \cdot g) = \lim_\Omega f \cdot \lim_\Omega g$$
$$\lim_\Omega (f/g) = \lim_\Omega f \big/ \lim_\Omega g$$
  where $c$ is a constant and, in the quotient rule, $\lim_\Omega g \ne 0$, $g \ne 0$.
  (c) Let $\Omega$ consist of all partition pairs $(P, T)$; define $(P, T) \preceq (P', T')$ when $P'$ is finer than $P$, mesh $P' \le$ mesh $P$. Observe that $\preceq$ is transitive and that $\lim_\Omega R(f, P, T) = I$ means the same as $\lim_{\text{mesh } P \to 0} R(f, P, T) = I$ in the definition of the Riemann integral.
  (d) Review the proof of Theorem 16 and use (b) to justify the fact that linearity of Riemann sums with respect to the integrand,
$$R(f + cg, P, T) = R(f, P, T) + cR(g, P, T),$$
  actually does imply linearity of the integral with respect to the integrand.
  (e) Formulate this limit definition for functions from $\Omega$ to a general metric space in place of $\mathbb{R}$.
27. Redefine Riemann and Darboux integrability using only dyadic partitions.
  (a) Prove that the integrals are unaffected.
  (b) Infer that Riemann's integrability criterion can be restated in terms of dyadic partitions.

28. In many calculus books, the definition of the integral is given as
$$\lim_{n \to \infty} \sum_{k=1}^{n} f(x_k^*)\, \frac{b - a}{n}$$
where $x_k^*$ is the midpoint of the interval
$$\Big[ a + \frac{(k - 1)(b - a)}{n},\ a + \frac{k(b - a)}{n} \Big] .$$
See Stewart's Calculus with Early Transcendentals, for example.
  (a) If $f$ is continuous, show that the calculus book limit exists and equals the Riemann integral of $f$. [Hint: This is a one-liner.]
  (b) Show by example that the calculus style limit can exist for functions which are not Riemann integrable.
  (c) Infer that the calculus style definition of the integral is inadequate for real analysis.

29. Suppose that $Z \subset \mathbb{R}$. Prove that the following are equivalent.
  (i) $Z$ is a zero set.
  (ii) For each $\epsilon > 0$ there is a countable covering of $Z$ by closed intervals $[a_i, b_i]$ with total length $\sum b_i - a_i < \epsilon$.
  (iii) For each $\epsilon > 0$ there is a countable covering of $Z$ by sets $S_i$ with total diameter $\sum \operatorname{diam} S_i < \epsilon$.

30. Prove that the interval $[0, 1]$ is not a zero set. [Hint: Be careful; this is not entirely trivial.]

31. The standard middle-quarters Cantor set is formed by removing the middle quarter from $[0, 1]$, then removing the middle quarter from each of the remaining two intervals, then removing the middle quarter from each of the remaining four intervals, and so on.
  (a) Prove that it is a zero set.
  (b) Formulate the natural definition of the middle $\beta$-Cantor set.
  (c) Is it also a zero set? Prove or disprove.

*32. Define a Cantor set by removing from $[0, 1]$ the middle interval of length 1/4. From the remaining two intervals $F^1$ remove the middle intervals of length 1/16. From the remaining four intervals $F^2$ remove the middle intervals of length 1/64, and so on. At the $n$th step in the construction $F^n$ consists of $2^n$ subintervals of $F^{n-1}$.
  (a) Prove that $F = \bigcap F^n$ is a Cantor set but not a zero set. It is often referred to as a fat Cantor set.
  (b) Infer that being a zero set is not a topological property: if two sets are homeomorphic and one is a zero set, then the other need not be a zero set.
  [Hint: To get a sense of this fat Cantor set, calculate the total length of the intervals which comprise its complement. See Figure 49 and Exercise 36.]

33. Consider the characteristic function of the dyadic rational numbers, $f(x) = 1$ if $x = k/2^n$ for some $k \in \mathbb{Z}$ and $n \in \mathbb{N}$, and $f(x) = 0$ otherwise.
  (a) What is its set of discontinuities?
  (b) At which points is its oscillation $\ge \kappa$?
  (c) Is it integrable? Explain, both by the Riemann-Lebesgue Theorem and directly from the definition.
  (d) Consider the dyadic ruler function $g(x) = 1/2^n$ if $x = k/2^n$ and $g(x) = 0$ otherwise. Graph it and answer the questions posed in (a), (b), (c).

34. (a) Prove that the characteristic function $f$ of the middle-thirds Cantor set $C$ is Riemann integrable but the characteristic function $g$ of the fat Cantor set $F$ (Exercise 32) is not.
  (b) Why is there a homeomorphism $h : [0, 1] \to [0, 1]$ sending $C$ onto $F$?
  (c) Infer that the composite of Riemann integrable functions need not be Riemann integrable. How is this example related to Corollaries 26, 30 of the Riemann-Lebesgue Theorem? See also Exercise 36.

*35. Assume that $\psi : [a, b] \to \mathbb{R}$ is continuously differentiable. A critical point of $\psi$ is an $x$ such that $\psi'(x) = 0$. A critical value is a number $y$ such that for at least one critical point $x$, $y = \psi(x)$.
  (a) Prove that the set of critical values is a zero set. (This is the Morse-Sard Theorem in dimension one.)
  (b) Generalize this to continuously differentiable functions $\mathbb{R} \to \mathbb{R}$.

*36. Let $F \subset [0, 1]$ be the fat Cantor set from Exercise 32, and define
$$\psi(x) = \int_0^x \operatorname{dist}(t, F)\,dt$$
where $\operatorname{dist}(t, F)$ refers to the minimum distance from $t$ to $F$.
  (a) Why is $\psi$ a continuously differentiable homeomorphism from $[0, 1]$ onto $[0, L]$ where $L = \psi(1)$?
  (b) What is the set of critical points of $\psi$? (See Exercise 35.)
  (c) Why is $\psi(F)$ a Cantor set of zero measure?
  (d) Let $f$ be the characteristic function of $\psi(F)$. Why is $f$ Riemann integrable but $f \circ \psi$ not?
  (e) What is the relation of (d) to Exercise 34?

37. Generalizing Exercise 30 in Chapter 1, we say that $f : (a, b) \to \mathbb{R}$ has a jump discontinuity (or a discontinuity of the first kind) at $c \in (a, b)$ if
$$f(c-) = \lim_{x \to c-} f(x) \quad \text{and} \quad f(c+) = \lim_{x \to c+} f(x)$$
exist, but are either unequal or are unequal to $f(c)$. (The three quantities exist and are equal if and only if $f$ is continuous at $c$.) An oscillating discontinuity (or a discontinuity of the second kind) is any non-jump discontinuity.
  (a) Show that $f : \mathbb{R} \to \mathbb{R}$ has at most countably many jump discontinuities.
  (b) Show that
$$f(x) = \begin{cases} \sin \dfrac{1}{x} & \text{if } x > 0 \\[4pt] 0 & \text{if } x \le 0 \end{cases}$$
  has an oscillating discontinuity at $x = 0$.
  (c) Show that the characteristic function of the rationals, $\chi_{\mathbb{Q}}$, has an oscillating discontinuity at every point.

*38. Recall that $P(S) = 2^S$ is the power set of $S$, the collection of all subsets of $S$, and $\mathcal{R}$ is the set of Riemann integrable functions $f : [a, b] \to \mathbb{R}$.
  (a) Prove that the cardinality of $\mathcal{R}$ is the same as the cardinality of $P(\mathbb{R})$, which is greater than the cardinality of $\mathbb{R}$.
  (b) Call two functions in $\mathcal{R}$ integrally equivalent if they differ only on a zero set. Prove that the collection of integral equivalence classes of $\mathcal{R}$ has the same cardinality as $\mathbb{R}$, namely $2^{\mathbb{N}}$.
  (c) Is it better to count Riemann integrable functions or integral equivalence classes of Riemann integrable functions?
  (d) Show that $f, g \in \mathcal{R}$ are integrally equivalent if and only if the integral of $|f - g|$ is zero.

39. Suppose that $\psi : [c, d] \to [a, b]$ is continuous and for every zero set $Z \subset [a, b]$, $\psi^{\mathrm{pre}}(Z)$ is a zero set in $[c, d]$.
  (a) If $f$ is Riemann integrable, prove that $f \circ \psi$ is Riemann integrable.
  (b) Derive Corollary 30 from (a).

40. Let $\psi(x) = x \sin(1/x)$ for $0 < x \le 1$ and $\psi(0) = 0$.
  (a) If $f : [-1, 1] \to \mathbb{R}$ is Riemann integrable, prove that $f \circ \psi$ is Riemann integrable.
  (b) What happens for $\psi(x) = \sqrt{x} \sin(1/x)$?

*41. Assume that $\psi : [c, d] \to [a, b]$ is continuously differentiable.
  (a) If the critical points of $\psi$ form a zero set in $[c, d]$ and $f$ is Riemann integrable on $[a, b]$ prove that $f \circ \psi$ is Riemann integrable on $[c, d]$.
  (b) Conversely, prove that if $f \circ \psi$ is Riemann integrable for each Riemann integrable $f$ on $[a, b]$, then the critical points of $\psi$ form a zero set. [Hint: Think in terms of Exercise 35.]
  (c) Prove (a) and (b) under the weaker assumption that $\psi$ is continuously differentiable except at finitely many points of $[c, d]$.
  (d) Derive part (a) of Exercise 36 from (c).
  (e) Weaken the assumption further to $\psi$ being continuously differentiable on an open subset of $[c, d]$ whose complement is a zero set.

The following assertion, to be proved in Chapter 6, is related to the preceding exercises. If $f : [a, b] \to \mathbb{R}$ satisfies a Lipschitz condition or is monotone then the set of points at which $f'(x)$ fails to exist is a zero set. Thus: "a Lipschitz function is differentiable almost everywhere," which is Rademacher's Theorem in dimension one, and a "monotone function is almost everywhere differentiable," which is the last theorem in Lebesgue's book, Leçons sur l'intégration et la recherche des fonctions primitives. See Theorem 39 and Corollary 41 in Chapter 6.

42. Set
$$f(x) = \begin{cases} 0 & \text{if } x \le 0 \\[2pt] \sin \dfrac{\pi}{x} & \text{if } x > 0 \end{cases} \quad \text{and} \quad g(x) = \begin{cases} 0 & \text{if } x \le 0 \\ 1 & \text{if } x > 0 . \end{cases}$$
Prove that $f$ has an antiderivative but $g$ does not.

43. Show that any two antiderivatives of a function differ by a constant. [Hint: This is a one-liner.]

44. (a) Define the oscillation for a function from one metric space to another, $f : M \to N$.
  (b) Is it true that $f$ is continuous at a point if and only if its oscillation is zero there? Prove or disprove.
  (c) Fix a number $\kappa > 0$. Is the set of points at which the oscillation of $f$ is $\ge \kappa$ closed in $M$? Prove or disprove.

45. (a) Prove that the integral of the Zeno's staircase function described on page 161 is 2/3.
  (b) What about the Devil's staircase?

46. In the proof of Corollary 26 of the Riemann-Lebesgue Theorem, it is asserted that when $\phi$ is continuous the discontinuity set of $\phi \circ f$ is contained in the discontinuity set of $f$.
  (a) Prove this.
  (b) Give an example where the inclusion is not an equality.
  (c) Find a sufficient condition on $\phi$ so that $\phi \circ f$ and $f$ have equal discontinuity sets for all $f \in \mathcal{R}$.
  (d) Is your condition necessary too?

47. Assume that $f \in \mathcal{R}$ and for some $m > 0$, $|f(x)| \ge m$ for all $x \in [a, b]$. Prove that the reciprocal of $f$, $1/f(x)$, also belongs to $\mathcal{R}$. If $f \in \mathcal{R}$, $|f(x)| > 0$, but no $m > 0$ is an underbound for $|f|$, prove that the reciprocal of $f$ is not Riemann integrable.

48. Corollary 26 to the Riemann-Lebesgue Theorem asserts that if $f \in \mathcal{R}$ and $\phi$ is continuous, then $\phi \circ f \in \mathcal{R}$. Show that piecewise continuity can not replace continuity. [Hint: Take $f$ to be a ruler function and $\phi$ to be a characteristic function.]

**49. Assume that $f : [a, b] \to [c, d]$ is a Riemann integrable bijection. Is the inverse bijection also Riemann integrable? Prove or disprove. [Hint: The graph of $f^{-1}$ is the same as the graph of $f$, viewed with the axes reversed. A function is Riemann integrable if and only if its graph is squeezed tightly between graphs of step functions. Draw a picture of this and rotate your paper by 90 degrees.]

50. If $f, g$ are Riemann integrable on $[a, b]$, and $f(x) < g(x)$ for all $x \in [a, b]$, prove that $\int_a^b f(x)\,dx < \int_a^b g(x)\,dx$. (Note the strict inequality.)

51. Let $f : [a, b] \to \mathbb{R}$ be given. Prove or give counter-examples to the following assertions.
  (a) $f \in \mathcal{R} \implies |f| \in \mathcal{R}$.
  (b) $|f| \in \mathcal{R} \implies f \in \mathcal{R}$.
  (c) $f \in \mathcal{R}$ and $|f(x)| \ge c > 0$ for all $x \implies 1/f \in \mathcal{R}$.
  (d) $f \in \mathcal{R} \implies f^2 \in \mathcal{R}$.
  (e) $f^2 \in \mathcal{R} \implies f \in \mathcal{R}$.
f 3 E n ==> J E n. f (x ) ;:: 0 for all X ==} f E R. f2 E R and [Here f 2 and f 3 refer to the functions f (x ) · f (x) and f (x ) f (x ) f(x ) , not the iterates.] 52. Given f, g E R, prove that max (/, g) , min(/, g) E R, where max(/, g) (x ) = max ( j (x ) , g (x)) and min(/, g) (x) = min( f (x) . (f) (g)
·
g (x) ).
53. Assume that $f, g : [0, 1] \to \mathbb{R}$ are Riemann integrable and $f(x) = g(x)$ except on the middle-thirds Cantor set $C$. (a) Prove that $f$ and $g$ have the same integral. (b) Is the same true if $f(x) = g(x)$ except for $x \in \mathbb{Q}$? (c) How is this related to the fact that the characteristic function of $\mathbb{Q}$ is not Riemann integrable?

54. Prove that if $a_n \ge 0$ and $\sum a_n$ converges, then $\sum$ ...

... converges. (I call this the block test because it groups the terms of the series in blocks of length $2^{k-1}$.)

58. Prove that $\sum 1/(k (\log k)^p)$ converges when $p > 1$ and diverges when $p \le 1$. Here $k = 2, 3, \ldots$. [Hint: Integral test or block test.]

59. Concoct a series $\sum a_k$ such that $(-1)^k a_k > 0$, $a_k \to 0$, but the series diverges.

60. (a) Show that if a series has ratio lim sup $\rho$ then it has exponential growth rate $\rho$. Infer that the ratio test is subordinate to the root test. (b) Concoct a series such that the root test is conclusive but the ratio test is not. Infer that the root test has strictly wider scope than the ratio test.
61. Show that there is no simple comparison test for conditionally convergent series: (a) Find two series $\sum a_k$ and $\sum b_k$ such that $\sum b_k$ converges conditionally, $a_k/b_k \to 1$ as $k \to \infty$, and $\sum a_k$ diverges. (b) Why is this impossible if the series $\sum b_k$ is absolutely convergent?

62. An infinite product is an expression $\prod c_k$ where $c_k > 0$. The $n$th partial product is $C_n = c_1 \cdots c_n$. If $C_n$ converges to a limit $C \ne 0$ the product converges to $C$. Write $c_k = 1 + a_k$. If each $a_k \ge 0$ or each $a_k \le 0$ prove that $\sum a_k$ converges if and only if $\prod c_k$ converges. [Hint: Take logs.]

63. Show that conditional convergence of the series $\sum a_k$ and the product $\prod (1 + a_k)$ can be unrelated to each other: (a) Set $a_k = (-1)^k/\sqrt{k}$. The series $\sum a_k$ converges but the corresponding product $\prod (1 + a_k)$ diverges. (b) Let $e_k = 0$ when $k$ is odd and $e_k = 1$ when $k$ is even. Set $b_k = e_k/k + (-1)^k/\sqrt{k}$. The series $\sum b_k$ diverges while the corresponding product $\prod (1 + b_k)$ converges.

64. Consider a series $\sum a_k$ and rearrange its terms by some bijection $\beta : \mathbb{N} \to \mathbb{N}$, forming a new series $\sum a_{\beta(k)}$. The rearranged series converges if and only if the partial sums $a_{\beta(0)} + \cdots$ ...
(a) Prove that $Y$ is closed and connected. (b) If $Y$ is compact and non-empty, prove that $\sum b_k$ converges to $Y$ in the sense that $d_H(Y_n, Y) \to 0$ as $n \to \infty$, where $d_H$ is the Hausdorff metric on the space of compact subsets of $\mathbb{R}$ and $Y_n$ is the closure of $\{B_m : m \ge n\}$. See Exercise 2.124. (c) Prove that each closed and connected subset of $\mathbb{R}$ is the set of subsequential limits of some rearrangement of $\sum a_k$. The article, "The Remarkable Theorem of Levy and Steinitz" by Peter Rosenthal in the American Math Monthly of April 1987 deals with some of these issues.

***67. Let $V$ be a Banach space - a vector space with a norm such that $V$ is complete with respect to the metric induced by the norm. (For example, $\mathbb{R}^m$ is a Banach space.) If $\sum v_k$ is a convergent series of vectors in $V$ such that $\sum \|v_k\|$ diverges, what is the generalization of Exercise 66? In particular, is $Y$ convex?

*68. Absolutely convergent series can be multiplied in a natural way, the result being their Cauchy product,

$$\Bigl(\sum_{i=0}^{\infty} a_i\Bigr)\Bigl(\sum_{j=0}^{\infty} b_j\Bigr) = \sum_{k=0}^{\infty} c_k$$

where $c_k = a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0$. (a) Prove that $\sum c_k$ converges absolutely. (b) Formulate some algebraic laws for such products (commutativity, distributivity, and so on). Prove two of them. [Hint for (a): Write the products $a_i b_j$ in an $\infty \times \infty$ matrix array $M$, and let $A_n, B_n, C_n$ be the $n$th partial sums of $\sum a_i$, $\sum b_j$, $\sum c_k$. You are asked to prove that $(\lim A_n)(\lim B_n) = \lim C_n$. The product of the limits is the limit of the products. The product $A_n B_n$ is the sum of all the $a_i b_j$ in the $n \times n$ corner submatrix of $M$ and $c_n$ is the sum of its anti-diagonal. Now estimate $A_n B_n - C_n$. Alternately, assume that $a_n, b_n \ge 0$ and draw a rectangle $R$ with edges $A$, $B$. Observe that $R$ is the union of rectangles $R_{ij}$ with edges $a_i$, $b_j$.]

**69. With reference to Exercise 68, (a) Reduce the hypothesis that both series $\sum a_i$ and $\sum b_j$ are absolutely convergent to merely one being absolutely convergent and the other convergent. (Exercises 68 and 69(a) are known as Mertens' Theorem.) (b) Find an example to show that the Cauchy product of two conditionally convergent series may diverge.
**70. The Riemann $\zeta$-function is defined to be $\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$ where $s > 1$. It is the sum of the $p$-series when $p = s$. Establish Euler's product formula,

$$\zeta(s) = \prod_{k=1}^{\infty} \frac{1}{1 - p_k^{-s}}$$

where $p_k$ is the $k$th prime number. Thus, $p_1 = 2$, $p_2 = 3$, and so on. Prove that the infinite product converges. [Hint: Each factor in the infinite product is the sum of a geometric series $1 + p_k^{-s} + (p_k^{-s})^2 + \cdots$. Replace each factor by its geometric series and write out the $n$th partial product. Apply Mertens' Theorem, collect terms, and recall that any integer has a unique prime factorization.]

71. Invent a continuous function $f : \mathbb{R} \to \mathbb{R}$ whose improper integral is zero, but which is unbounded as $x \to -\infty$ and $x \to \infty$. [Hint: $f$ is far from monotone.]

72. Assume that $f : \mathbb{R} \to \mathbb{R}$ and that restricted to any closed interval, $f$ is bounded. (a) Formulate the concepts of conditional and absolute convergence of the improper Riemann integral of $f$. (b) Find an example that distinguishes them.

73. Let $f : [0, \infty) \to [0, \infty)$ and a sequence $(a_k)$ be given. Assume that for all sufficiently large $k$ and all $x \in [k, k+1)$, $f(x) \le a_k$. Prove that divergence of the improper integral $\int_0^{\infty} f(x)\,dx$ implies divergence of $\sum a_k$.

***74. Let $f : \mathbb{R} \to \mathbb{R}$ be given. Assume that the square and cube of $f$ are smooth. Is $f$ smooth? That is, if $f \cdot f \in C^{\infty}$ and $f \cdot f \cdot f \in C^{\infty}$, does it follow that $f \in C^{\infty}$?
4
Function Spaces

1 Uniform Convergence and $C^0[a, b]$
Points converge to a limit if they get physically closer and closer to it. What about a sequence of functions? When do functions converge to a limit function? What should it mean that they get closer and closer to a limit function? The simplest idea is that a sequence of functions $f_n$ converges to a limit function $f$ if for each $x$, the values $f_n(x)$ converge to $f(x)$ as $n \to \infty$. This is called pointwise convergence: a sequence of functions $f_n : [a, b] \to \mathbb{R}$ converges pointwise to a limit function $f : [a, b] \to \mathbb{R}$ if for each $x \in [a, b]$,

$$\lim_{n \to \infty} f_n(x) = f(x).$$

The function $f$ is the pointwise limit of the sequence $(f_n)$ and we write

$$f_n \to f \quad \text{or} \quad \lim_{n \to \infty} f_n = f.$$

Note that the limit refers to $n \to \infty$, not to $x \to \infty$. The same definition applies to functions from one metric space to another. The requirement of uniform convergence is stronger: the sequence of functions $f_n : [a, b] \to \mathbb{R}$ converges uniformly to the limit function $f : [a, b] \to \mathbb{R}$ if for each $\epsilon > 0$ there is an $N$ such that for all $n \ge N$ and all $x \in [a, b]$,

$$|f_n(x) - f(x)| < \epsilon. \tag{1}$$

The function $f$ is the uniform limit of the sequence $(f_n)$ and we write

$$f_n \rightrightarrows f \quad \text{or} \quad \operatorname*{unif\,lim}_{n \to \infty} f_n = f.$$
Your intuition about uniform convergence is crucial. Draw a tube $V$ of vertical radius $\epsilon$ around the graph of $f$. For $n$ large, the graph of $f_n$ lies wholly in $V$. See Figure 83. Absorb this picture!

Figure 83 The graph of $f_n$ is contained in the $\epsilon$-tube around the graph of $f$.
It is clear that uniform convergence implies pointwise convergence. The difference between the two definitions is apparent in the following standard example.

Example Define $f_n : (0, 1) \to \mathbb{R}$ by $f_n(x) = x^n$. For each $x \in (0, 1)$ it is clear that $f_n(x) \to 0$. The functions converge pointwise to the zero function as $n \to \infty$. They do not converge uniformly: for take $\epsilon = 1/10$. The point $x_n = \sqrt[n]{1/2}$ is sent by $f_n$ to $1/2$ and thus not all points $x$ satisfy (1) when $n$ is large. The graph of $f_n$ fails to lie in the $\epsilon$-tube $V$. See Figure 84.
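The example can be checked numerically. The following is only a sketch: the function names and the grid size are illustrative choices, not part of the text. It evaluates $f_n(x) = x^n$ at a fixed point, where the values die out, and at the moving point $x_n = \sqrt[n]{1/2}$, where the value stays at $1/2$ no matter how large $n$ is.

```python
def f(n, x):
    """f_n(x) = x**n, the sequence from the example."""
    return x ** n


def witness(n):
    """The point x_n = (1/2)**(1/n), which f_n sends to 1/2."""
    return 0.5 ** (1.0 / n)


def grid_sup(n, samples=10_000):
    """Largest value of |f_n(x) - 0| over an evenly spaced grid inside (0, 1).

    A grid maximum only bounds the true supremum from below.
    """
    return max(f(n, i / samples) for i in range(1, samples))


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        x_n = witness(n)
        # Columns: n, f_n(1/2), x_n, f_n(x_n), grid estimate of sup |f_n|
        print(n, f(n, 0.5), x_n, f(n, x_n), grid_sup(n))
```

The second column shrinks to zero (pointwise convergence at $x = 1/2$), while the fourth column never drops below $\epsilon = 1/10$, which is exactly the failure of (1).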
The lesson to draw is that pointwise convergence of a sequence of functions is frequently too weak a concept. Gravitating toward uniform convergence we ask the natural question: Which properties of functions are preserved under uniform convergence?

The answers are found in Theorem 1, Exercise 4, Theorem 6, and Theorem 9. Uniform limits preserve continuity, uniform continuity, integrability, and - with an additional hypothesis - differentiability.
Figure 84 Non-uniform, pointwise convergence.

1 Theorem If $f_n \rightrightarrows f$ and each $f_n$ is continuous at $x_0$, then $f$ is continuous at $x_0$. In other words, the uniform limit of continuous functions is continuous.

Proof For simplicity, assume that the functions have domain $[a, b]$ and target $\mathbb{R}$. (See also Section 8 and Exercise 2.) Let $\epsilon > 0$ and $x_0 \in [a, b]$ be given. There is an $N$ such that for all $n \ge N$ and all $x \in [a, b]$,

$$|f_n(x) - f(x)| < \frac{\epsilon}{3}.$$

The function $f_N$ is continuous at $x_0$ and so there is a $\delta > 0$ such that $|x - x_0| < \delta$ implies

$$|f_N(x) - f_N(x_0)| < \frac{\epsilon}{3}.$$

If $|x - x_0| < \delta$ then

$$|f(x) - f(x_0)| \le |f(x) - f_N(x)| + |f_N(x) - f_N(x_0)| + |f_N(x_0) - f(x_0)| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon.$$

Thus $f$ is continuous at $x_0 \in [a, b]$. □
Without uniform convergence, the theorem fails. For example, define $f_n : [0, 1] \to \mathbb{R}$ as before, $f_n(x) = x^n$. Then $f_n(x)$ converges pointwise to the function

$$f(x) = \begin{cases} 0 & \text{if } 0 \le x < 1 \\ 1 & \text{if } x = 1. \end{cases}$$
The function $f$ is not continuous and the convergence is not uniform. What about the converse? If the limit and the functions are continuous, does pointwise convergence imply uniform convergence? The answer is "no," as is shown by $x^n$ on $(0, 1)$. But what if the functions have a compact domain of definition, $[a, b]$? The answer is still "no."

Example John Kelley refers to this as the growing steeple,

$$f_n(x) = \begin{cases} n^2 x & \text{if } 0 \le x \le \frac{1}{n} \\ 2n - n^2 x & \text{if } \frac{1}{n} \le x \le \frac{2}{n} \\ 0 & \text{if } \frac{2}{n} \le x \le 1. \end{cases}$$

See Figure 85.
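A small numerical sketch of the growing steeple follows. The extracted text does not preserve the formula itself, so the code assumes the usual tent shape: $f_n$ rises linearly from $0$ to height $n$ on $[0, 1/n]$, falls back to $0$ on $[1/n, 2/n]$, and vanishes on the rest of $[0, 1]$. The function name is hypothetical.

```python
def steeple(n, x):
    """Assumed growing steeple: a tent of height n supported on [0, 2/n]."""
    if 0.0 <= x <= 1.0 / n:
        return n * n * x          # rising edge, reaches n at x = 1/n
    if 1.0 / n < x <= 2.0 / n:
        return 2.0 * n - n * n * x  # falling edge, back to 0 at x = 2/n
    return 0.0                    # flat part


if __name__ == "__main__":
    x = 0.3
    for n in (1, 2, 5, 10, 100):
        # Columns: n, value at the fixed point x, value at the peak 1/n
        print(n, steeple(n, x), steeple(n, 1.0 / n))
```

At any fixed $x > 0$ the values are eventually $0$ (as soon as $2/n < x$), yet the peak value grows like $n$, so the sup-distance to the zero function does not tend to $0$.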
Figure 85 The sequence of functions converges pointwise to the zero function, but not uniformly.

Then $\lim_{n \to \infty} f_n(x) = 0$ for each $x$, and $f_n$ converges pointwise to the function $f = 0$. Even if the functions have compact domain of definition, and are uniformly bounded and uniformly continuous, pointwise convergence does not imply uniform convergence. For an example, just multiply the growing steeple functions by $1/n$.

The natural way to view uniform convergence is in a function space. Let $C_b = C_b([a, b], \mathbb{R})$ denote the set of all bounded functions $[a, b] \to \mathbb{R}$.
The elements of $C_b$ are functions $f$, $g$, etc. Each is bounded. Define the sup norm on $C_b$ as

$$\|f\| = \sup\{|f(x)| : x \in [a, b]\}.$$

The sup norm satisfies the norm axioms discussed in Chapter 1, page 27.

$$\|f\| \ge 0 \text{ and } \|f\| = 0 \text{ if and only if } f = 0$$
$$\|cf\| = |c|\,\|f\|$$
$$\|f + g\| \le \|f\| + \|g\|.$$

As we observed in Chapter 2, any norm defines a metric. In the case at hand,

$$d(f, g) = \sup\{|f(x) - g(x)| : x \in [a, b]\}$$

is the corresponding metric on $C_b$. See Figure 86. To distinguish the norm $\|f\| = \sup |f(x)|$ from other norms on $C_b$ we sometimes write $\|f\|_{\sup}$ for the sup norm.

Figure 86 The sup-norm of $f$ and the sup-distance between the functions $f$ and $g$.
The thing to remember is that $C_b$ is a metric space whose elements are functions. Ponder this.

2 Theorem Convergence with respect to the sup-metric $d$ is equivalent to uniform convergence.

Proof If $d(f_n, f) \to 0$ then $\sup\{|f_n(x) - f(x)| : x \in [a, b]\} \to 0$, so $f_n \rightrightarrows f$, and conversely. □

3 Theorem $C_b$ is a complete metric space.
Proof Let $(f_n)$ be a Cauchy sequence in $C_b$. For each individual $x_0 \in [a, b]$ the values $f_n(x_0)$ form a Cauchy sequence in $\mathbb{R}$ since

$$|f_n(x_0) - f_m(x_0)| \le \sup\{|f_n(x) - f_m(x)| : x \in [a, b]\} = d(f_n, f_m).$$

Thus, for each $x \in [a, b]$,

$$\lim_{n \to \infty} f_n(x)$$

exists. Define this limit to be $f(x)$. It is clear that $f_n$ converges pointwise to $f$. In fact, the convergence is uniform. For let $\epsilon > 0$ be given. There exists $N$ such that $m, n \ge N$ imply

$$d(f_n, f_m) < \frac{\epsilon}{2}.$$

Also, for each $x \in [a, b]$ there exists an $m = m(x) \ge N$ such that

$$|f_m(x) - f(x)| < \frac{\epsilon}{2}.$$

If $n \ge N$ and $x \in [a, b]$ then

$$|f_n(x) - f(x)| \le |f_n(x) - f_{m(x)}(x)| + |f_{m(x)}(x) - f(x)| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$
Hence $f_n \rightrightarrows f$. The function $f$ is bounded. For $f_N$ is bounded and for all $x$, $|f_N(x) - f(x)| < \epsilon$. Thus $f \in C_b$. By Theorem 2, uniform convergence implies $d$-convergence, $d(f_n, f) \to 0$, and the Cauchy sequence $(f_n)$ converges to a limit in the metric space $C_b$. □

The preceding proof is subtle. The uniform inequality $d(f_n, f) < \epsilon$ is derived by non-uniform means: for each $x$ we make a separate estimate using an $m(x)$ depending non-uniformly on $x$. It is a case of the ends justifying the means.

Let $C^0 = C^0([a, b], \mathbb{R})$ denote the set of continuous functions $[a, b] \to \mathbb{R}$. Each $f \in C^0$ belongs to $C_b$ since a continuous function defined on a compact domain is bounded. That is, $C^0 \subset C_b$.

4 Corollary $C^0$ is a closed subset of $C_b$. It is a complete metric space.

Proof Theorem 1 implies that a limit in $C_b$ of a sequence of functions in $C^0$ lies in $C^0$. That is, $C^0$ is closed in $C_b$. A closed subset of a complete space is complete. □
Just as it is reasonable to discuss the convergence of a sequence of functions we can also discuss the convergence of a series of functions, $\sum f_k$. Merely consider the $n$th partial sum

$$F_n(x) = \sum_{k=0}^{n} f_k(x).$$

It is a function. If the sequence of functions $(F_n)$ converges to a limit function $F$ then the series converges, and we write

$$F(x) = \sum_{k=0}^{\infty} f_k(x).$$
If the sequence of partial sums converges uniformly, then so does the series. If the series of absolute values $\sum |f_k(x)|$ converges, then the series $\sum f_k$ converges absolutely.

5 Weierstrass M-test If $\sum M_k$ is a convergent series of constants and if $f_k \in C_b$ satisfies $\|f_k\| \le M_k$ for all $k$, then $\sum f_k$ converges uniformly and absolutely.

Proof If $n > m$ then the partial sums of the series of absolute values telescope as

$$|F_n(x) - F_m(x)| \le \sum_{k=m+1}^{n} |f_k(x)| \le \sum_{k=m+1}^{n} M_k.$$

Since $\sum M_k$ converges, the last sum is $< \epsilon$ when $m, n$ are large. Thus $(F_n)$ is Cauchy in $C_b$, and by Theorem 3 it converges uniformly. □
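To see the M-test estimate in action, here is a hedged numerical sketch. The series $\sum_{k \ge 1} \sin(kx)/k^2$ with $M_k = 1/k^2$ is an illustrative choice of ours, not an example from the text, and the helper names are hypothetical. The code compares the gap between two partial sums, sampled on a grid, with the tail $\sum_{k=m+1}^{n} M_k$ that the proof uses as a bound.

```python
import math


def partial_sum(n, x):
    """F_n(x) for the illustrative series sum_{k=1}^{n} sin(k x) / k**2."""
    return sum(math.sin(k * x) / k ** 2 for k in range(1, n + 1))


def tail_bound(m, n):
    """The M-test bound sum_{k=m+1}^{n} M_k with M_k = 1 / k**2."""
    return sum(1.0 / k ** 2 for k in range(m + 1, n + 1))


def grid_gap(m, n, samples=400):
    """Largest |F_n(x) - F_m(x)| over a grid in [0, 2*pi] (a lower estimate of the sup)."""
    xs = [2.0 * math.pi * i / samples for i in range(samples + 1)]
    return max(abs(partial_sum(n, x) - partial_sum(m, x)) for x in xs)


if __name__ == "__main__":
    for m, n in ((5, 10), (10, 40), (50, 100)):
        print(m, n, grid_gap(m, n), tail_bound(m, n))
```

In every row the sampled gap stays below the tail of $\sum M_k$, and both shrink as $m$ grows, which is the Cauchy property in the sup metric.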
Next we ask how integrals and derivatives behave with respect to uniform convergence. Integrals behave better than derivatives.

6 Theorem The uniform limit of Riemann integrable functions is Riemann integrable, and the limit of the integrals is the integral of the limit,

$$\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b \operatorname*{unif\,lim}_{n \to \infty} f_n(x)\,dx.$$

In other words, $\mathcal{R}$ is a closed subset of $C_b$ and the integral functional $f \mapsto \int_a^b f(x)\,dx$ is a continuous map from $\mathcal{R}$ to $\mathbb{R}$. This extends the regularity hierarchy to

$$C_b \supset \mathcal{R} \supset C^0 \supset C^1 \supset \cdots \supset C^{\infty} \supset C^{\omega}.$$

Theorem 6 gives the simplest condition under which the operations of taking limits and integrals commute.

Proof Let $f_n \in \mathcal{R}$ be given and assume that $f_n \rightrightarrows f$ as $n \to \infty$. By the Riemann-Lebesgue Theorem, $f_n$ is bounded and there is a zero set $Z_n$ such that $f_n$ is continuous at each $x \in [a, b] \setminus Z_n$. Theorem 1 implies that $f$ is continuous at each $x \in [a, b] \setminus \bigcup Z_n$, while Theorem 3 implies that $f$ is bounded. Since $\bigcup Z_n$ is a zero set, the Riemann-Lebesgue Theorem implies that $f \in \mathcal{R}$. Finally

$$\left| \int_a^b f(x)\,dx - \int_a^b f_n(x)\,dx \right| = \left| \int_a^b f(x) - f_n(x)\,dx \right| \le \int_a^b |f(x) - f_n(x)|\,dx \le d(f, f_n)(b - a) \to 0$$
as $n \to \infty$. Hence the integral of the limit is the limit of the integrals. □

7 Corollary If $f_n \in \mathcal{R}$ and $f_n \rightrightarrows f$ then the indefinite integrals converge uniformly,

$$\int_a^x f_n(t)\,dt \rightrightarrows \int_a^x f(t)\,dt.$$

Proof As above,

$$\left| \int_a^x f(t)\,dt - \int_a^x f_n(t)\,dt \right| \le d(f_n, f)(x - a) \le d(f_n, f)(b - a) \to 0$$

when $n \to \infty$. □
8 Term by Term Integration Theorem A uniformly convergent series of integrable functions $\sum f_k$ can be integrated term by term in the sense that

$$\int_a^b \sum_{k=0}^{\infty} f_k(x)\,dx = \sum_{k=0}^{\infty} \int_a^b f_k(x)\,dx.$$

Proof The sequence of partial sums $F_n$ converges uniformly to $\sum f_k$. Each $F_n$ belongs to $\mathcal{R}$ since it is the finite sum of members of $\mathcal{R}$. According to Theorem 6,

$$\sum_{k=0}^{n} \int_a^b f_k(x)\,dx = \int_a^b F_n(x)\,dx \to \int_a^b \sum_{k=0}^{\infty} f_k(x)\,dx.$$

This shows that the series $\sum \int_a^b f_k(x)\,dx$ converges to $\int_a^b \sum f_k(x)\,dx$. □
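9 Theorem The uniform limit of a sequence of differentiable functions is differentiable provided that the sequence of derivatives also converges uniformly.

Proof We suppose that $f_n : [a, b] \to \mathbb{R}$ is differentiable for each $n$ and that $f_n \rightrightarrows f$ as $n \to \infty$. Also we assume that $f_n' \rightrightarrows g$ for some function $g$. Then we show that $f$ is differentiable and in fact $f' = g$. We first prove the theorem with a major loss of generality: we assume that each $f_n'$ is continuous. Then $f_n', g \in \mathcal{R}$ and we can apply the fundamental theorem of calculus and Corollary 7 to write

$$f_n(x) = f_n(a) + \int_a^x f_n'(t)\,dt \rightrightarrows f(a) + \int_a^x g(t)\,dt.$$

Since $f_n \rightrightarrows f$ we see that $f(x) = f(a) + \int_a^x g(t)\,dt$ and, again by the fundamental theorem of calculus, $f' = g$.

In the general case the proof is harder. Fix some $x \in [a, b]$ and define

$$\phi_n(t) = \begin{cases} \dfrac{f_n(t) - f_n(x)}{t - x} & \text{if } t \ne x \\ f_n'(x) & \text{if } t = x \end{cases}
\qquad
\phi(t) = \begin{cases} \dfrac{f(t) - f(x)}{t - x} & \text{if } t \ne x \\ g(x) & \text{if } t = x. \end{cases}$$

Each function $\phi_n$ is continuous since $\phi_n(t)$ converges to $f_n'(x)$ as $t \to x$. Also it is clear that $\phi_n$ converges pointwise to $\phi$, since $f_n(t) \to f(t)$ and $f_n'(x) \to g(x)$ as $n \to \infty$. We claim the convergence is uniform. For any $m, n$, the Mean Value Theorem applied to the function $f_m - f_n$ gives

$$\phi_m(t) - \phi_n(t) = \frac{(f_m(t) - f_n(t)) - (f_m(x) - f_n(x))}{t - x} = f_m'(\theta) - f_n'(\theta)$$

for some $\theta$ between $t$ and $x$. Since $f_n' \rightrightarrows g$ the difference $f_m' - f_n'$ tends uniformly to 0 as $m, n \to \infty$. Thus $(\phi_n)$ is Cauchy in $C^0$. Since $C^0$ is complete, $\phi_n$ converges uniformly to a limit function $\psi$, and $\psi$ is continuous. As already remarked, the pointwise limit of $\phi_n$ is $\phi$, and so $\psi = \phi$. Continuity of $\psi = \phi$ implies that $g(x) = f'(x)$. □

10 Term by Term Differentiation Theorem A uniformly convergent series of differentiable functions can be differentiated term by term, provided that the derivative series converges uniformly,

$$\Bigl(\sum_{k=0}^{\infty} f_k(x)\Bigr)' = \sum_{k=0}^{\infty} f_k'(x).$$

Proof Apply Theorem 9 to the sequence of partial sums. □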
Note that Theorem 9 fails if we forget to assume the derivatives converge. For example, consider the sequence of functions $f_n : [-1, 1] \to \mathbb{R}$ defined by

$$f_n(x) = \sqrt{x^2 + \frac{1}{n}}.$$

Figure 87 The uniform limit of differentiable functions need not be differentiable.

See Figure 87. The functions converge uniformly to $f(x) = |x|$, a non-differentiable function. The derivatives converge pointwise but not uniformly. Worse examples are easy to imagine. In fact, a sequence of everywhere differentiable functions can converge uniformly to a nowhere differentiable function. See Sections 4 and 7. It is one of the miracles of the complex numbers that a uniform limit of complex differentiable functions is complex differentiable, and automatically the sequence of derivatives converges uniformly to a limit. Real and complex analysis diverge radically on this point.
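A numerical sketch of this phenomenon follows. The extracted text does not preserve the formula for $f_n$, so the code assumes the standard choice $f_n(x) = \sqrt{x^2 + 1/n}$ on $[-1, 1]$; the helper names are hypothetical. For this choice the sup-distance to $|x|$ is exactly $1/\sqrt{n}$ (attained at $x = 0$), while the derivative $f_n'(x) = x/\sqrt{x^2 + 1/n}$ equals $1/\sqrt{2}$ at $x = 1/\sqrt{n}$, far from the value $1$ that the slope of $|x|$ has there.

```python
import math


def f(n, x):
    """Assumed smooth approximation of |x|: sqrt(x**2 + 1/n)."""
    return math.sqrt(x * x + 1.0 / n)


def df(n, x):
    """Derivative of f(n, .) computed by hand: x / sqrt(x**2 + 1/n)."""
    return x / math.sqrt(x * x + 1.0 / n)


def sup_error(n, samples=2000):
    """Largest |f_n(x) - |x|| over an even grid on [-1, 1] that contains 0."""
    xs = [-1.0 + 2.0 * i / samples for i in range(samples + 1)]
    return max(abs(f(n, x) - abs(x)) for x in xs)


if __name__ == "__main__":
    for n in (1, 100, 10_000, 1_000_000):
        x = 1.0 / math.sqrt(n)
        # Columns: n, sup |f_n - |x||, derivative of f_n at x = 1/sqrt(n)
        print(n, sup_error(n), df(n, x))
```

The middle column tends to $0$ (uniform convergence of the functions), but the last column is stuck at about $0.707$, so the derivatives cannot converge uniformly to the slope of $|x|$ near $0$.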
2 Power Series
As another application of the Weierstrass M-test we say a little more about the power series $\sum c_k x^k$. A power series is a special type of series of functions, the functions being constant multiples of powers of $x$. As explained in Section 3 of Chapter 3, its radius of convergence is

$$R = \frac{1}{\limsup_{k \to \infty} \sqrt[k]{|c_k|}}.$$

Its interval of convergence is $(-R, R)$. If $x \in (-R, R)$, the series converges and defines a function $f(x) = \sum c_k x^k$, while if $x \notin [-R, R]$, the series diverges. More is true on compact subintervals of $(-R, R)$.

11 Theorem If $r < R$, then the power series converges uniformly and absolutely on the interval $[-r, r]$.

Proof Choose $\beta$, $r < \beta < R$. For all large $k$, $\sqrt[k]{|c_k|} < 1/\beta$ since $\beta < R$. Thus, if $|x| \le r$ then

$$|c_k x^k| \le \left(\frac{r}{\beta}\right)^k.$$

These are terms in a convergent geometric series and according to the M-test $\sum c_k x^k$ converges uniformly when $x \in [-r, r]$. □
12 Theorem A power series can be integrated and differentiated term by term on its interval of convergence. For $f(x) = \sum c_k x^k$ and $|x| < R$ this means

$$\int_0^x f(t)\,dt = \sum_{k=0}^{\infty} \frac{c_k}{k+1} x^{k+1} \quad \text{and} \quad f'(x) = \sum_{k=1}^{\infty} k c_k x^{k-1}.$$

Proof The radius of convergence of the integral series is determined by the exponential growth rate of its coefficients,

$$\limsup_{k \to \infty} \sqrt[k]{\left|\frac{c_{k-1}}{k}\right|} = \limsup_{k \to \infty} \left(|c_{k-1}|^{1/(k-1)}\right)^{(k-1)/k} \left(\frac{1}{k}\right)^{1/k}.$$

Since $(k-1)/k \to 1$ and $k^{-1/k} \to 1$ as $k \to \infty$, we see that the integral series has the same radius of convergence $R$ as the original series. According to Theorem 8, term by term integration is valid when the series converges uniformly, and by Theorem 11, the integral series does converge uniformly on any interval $[-r, r] \subset (-R, R)$. A similar calculation for the derivative series shows that its radius of convergence too is $R$. Term by term differentiation is valid provided the series and the derivative series converge uniformly. Since the radius of convergence of the derivative series is $R$, the derivative series does converge uniformly on any $[-r, r] \subset (-R, R)$. □

13 Theorem Analytic functions are smooth, $C^{\omega} \subset C^{\infty}$.

Proof An analytic function $f$ is defined by a convergent power series. According to Theorem 12, the derivative of $f$ is given by a convergent power series with the same radius of convergence, so repeated differentiation is valid, and we see that $f$ is indeed smooth. □
The general smooth function is not analytic, as is shown by the example

$$e(x) = \begin{cases} e^{-1/x} & \text{if } x > 0 \\ 0 & \text{if } x \le 0 \end{cases}$$

on page 149. Near $x = 0$, $e(x)$ cannot be expressed as a convergent power series.

Power series provide the clean and unambiguous way to define functions, especially trigonometric functions. The usual definitions of sine, cosine, etc. involve angles and circular arc length, and these concepts seem less fundamental than the functions being defined. To avoid circular reasoning, as it were, we declare that by definition

$$\exp x = \sum_{k=0}^{\infty} \frac{x^k}{k!} \qquad \sin x = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k+1}}{(2k+1)!} \qquad \cos x = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k}}{(2k)!}.$$

We then must prove that these functions have the properties we know and love from calculus. All three series are easily seen to have radius of convergence $R = \infty$. Theorem 12 justifies term by term differentiation, yielding the usual formulas,

$$\exp'(x) = \exp x \qquad \sin'(x) = \cos x \qquad \cos'(x) = -\sin x.$$
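As a sanity check on these definitions, the partial sums of the three series can be compared with a library implementation. This is a sketch only: the function names and the numbers of terms are our own illustrative choices, and Python's `math` module serves merely as an independent reference.

```python
import math


def exp_series(x, terms=40):
    """Partial sum of sum_{k>=0} x**k / k!, built term by term."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / (k + 1)  # x**(k+1)/(k+1)! from x**k/k!
    return total


def sin_series(x, terms=20):
    """Partial sum of sum_{k>=0} (-1)**k x**(2k+1) / (2k+1)!."""
    total, term = 0.0, x
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total


def cos_series(x, terms=20):
    """Partial sum of sum_{k>=0} (-1)**k x**(2k) / (2k)!."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 1) * (2 * k + 2))
    return total


if __name__ == "__main__":
    for x in (-2.0, 0.5, 1.0, 3.0):
        print(x,
              exp_series(x) - math.exp(x),
              sin_series(x) - math.sin(x),
              cos_series(x) - math.cos(x))
```

A centered difference quotient of `sin_series` also lands close to `cos_series`, in line with the formula $\sin' = \cos$ obtained from term by term differentiation.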
The logarithm has already been defined as the indefinite integral $\int_1^x \frac{1}{t}\,dt$. We claim that if $|x| < 1$ then $\log(1 + x)$ is given as the power series

$$\log(1 + x) = \sum_{k=1}^{\infty} \frac{(-1)^{k+1} x^k}{k}.$$

To check this, we merely note that its derivative is the sum of a geometric series,

$$(\log(1 + x))' = \frac{1}{1 + x} = \frac{1}{1 - (-x)} = \sum_{k=0}^{\infty} (-x)^k = \sum_{k=0}^{\infty} (-1)^k x^k.$$

The last is a power series with radius of convergence 1. Since term by term integration of a power series inside its radius of convergence is legal, we integrate both sides of the equation and get the series expression for $\log(1 + x)$ as claimed.
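The series for $\log(1 + x)$ can be spot-checked numerically inside the radius of convergence. The sketch below uses a hypothetical helper name and compares partial sums with `math.log`; it says nothing about what happens for $|x| \ge 1$.

```python
import math


def log1p_series(x, terms):
    """Partial sum of sum_{k=1}^{terms} (-1)**(k+1) * x**k / k."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, terms + 1))


if __name__ == "__main__":
    for x in (-0.5, 0.1, 0.5, 0.9):
        for terms in (5, 20, 80):
            # Columns: x, number of terms, error against math.log(1 + x)
            print(x, terms, log1p_series(x, terms) - math.log(1.0 + x))
```

The error shrinks quickly for small $|x|$ and slowly near $|x| = 1$, consistent with uniform convergence on $[-r, r]$ for each $r < 1$ but not on all of $(-1, 1)$.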
3 Compactness and Equicontinuity in $C^0$
The Heine-Borel theorem states that a closed and bounded set in $\mathbb{R}^m$ is compact. On the other hand, closed and bounded sets in $C^0$ are rarely compact. Consider for example the closed unit ball

$$B = \{f \in C^0([0, 1], \mathbb{R}) : \|f\| \le 1\}.$$

To see that $B$ is not compact we look again at the sequence $f_n(x) = x^n$. It lies in $B$. Does it have a subsequence that converges (with respect to the metric $d$ of $C^0$) to a limit in $C^0$? No. For if $f_{n_k}$ converges to $f$ in $C^0$ then $f(x) = \lim_{k \to \infty} f_{n_k}(x)$. Thus $f(x) = 0$ if $x < 1$ and $f(1) = 1$, but this function $f$ does not belong to $C^0$. The cause of the problem is the fact that $C^0$ is infinite-dimensional. In fact it can be shown that if $V$ is a vector space with a norm then its closed unit ball is compact if and only if the space is finite-dimensional. The proof is not especially hard.

Nevertheless, we want to have theorems that guarantee certain closed and bounded subsets of $C^0$ are compact. For we want to extract a convergent subsequence of functions from a given sequence of functions. The simple condition that lets us go ahead is equicontinuity. A sequence of functions $(f_n)$ in $C^0$ is equicontinuous if

$$\forall \epsilon > 0 \ \exists \delta > 0 \text{ such that } |s - t| < \delta \text{ and } n \in \mathbb{N} \ \Rightarrow \ |f_n(s) - f_n(t)| < \epsilon.$$

The functions $f_n$ are equally continuous. The $\delta$ depends on $\epsilon$ but it does not depend on $n$. Roughly speaking, the graphs of all the $f_n$ are similar. For total clarity, the concept might better be labeled uniform equicontinuity, in contrast to pointwise equicontinuity, which requires

$$\forall \epsilon > 0 \text{ and } \forall x \in [a, b] \ \exists \delta > 0 \text{ such that } |x - t| < \delta \text{ and } n \in \mathbb{N} \ \Rightarrow \ |f_n(x) - f_n(t)| < \epsilon.$$
The definitions work equally well for sets of functions, not only sequences of functions. The set $\mathcal{E} \subset C^0$ is equicontinuous if

$$\forall \epsilon > 0 \ \exists \delta > 0 \text{ such that } |s - t| < \delta \text{ and } f \in \mathcal{E} \ \Rightarrow \ |f(s) - f(t)| < \epsilon.$$

The crucial point is that $\delta$ does not depend on the particular $f \in \mathcal{E}$. It is valid for all $f \in \mathcal{E}$ simultaneously. To picture equicontinuity of a family $\mathcal{E}$, imagine the graphs. Their shapes are uniformly controlled. Note that any finite number of continuous functions $[a, b] \to \mathbb{R}$ forms an equicontinuous family so Figures 88 and 89 are only suggestive.

Figure 88 Equicontinuity.

Figure 89 Non-equicontinuity.
The basic theorem about equicontinuity is the

14 Arzela-Ascoli Theorem Any bounded equicontinuous sequence of functions in $C^0([a, b], \mathbb{R})$ has a uniformly convergent subsequence.

Think of this as a compactness result. If $(f_n)$ is the sequence of equicontinuous functions, the theorem amounts to asserting that the closure of the set $\{f_n : n \in \mathbb{N}\}$ is compact. Any compact metric space serves just as well as $[a, b]$, and the target space $\mathbb{R}$ can also be more general. See Section 8.

Proof $[a, b]$ has a countable dense subset $D = \{d_1, d_2, \ldots\}$. For instance we could take $D = \mathbb{Q} \cap [a, b]$. Boundedness of $(f_n)$ means that for some constant $M$, all $x \in [a, b]$, and all $n \in \mathbb{N}$, $|f_n(x)| \le M$. Thus $(f_n(d_1))$ is a bounded sequence of real numbers. Bolzano-Weierstrass implies that some subsequence of it converges to a limit in $\mathbb{R}$, say

$$f_{1,k}(d_1) \to y_1 \text{ as } k \to \infty.$$

The subsequence $f_{1,k}$ evaluated at the point $d_2$ is also a bounded sequence in $\mathbb{R}$, and there exists a sub-subsequence $f_{2,k}$ such that $f_{2,k}(d_2)$ converges to a limit in $\mathbb{R}$, say $f_{2,k}(d_2) \to y_2$ as $k \to \infty$. The sub-subsequence evaluated at $d_1$ still converges to $y_1$. Continuing in this way gives a nested family of subsequences $f_{m,k}$ such that

$(f_{m,k})$ is a subsequence of $(f_{m-1,k})$
$j \le m \ \Rightarrow \ f_{m,k}(d_j) \to y_j$ as $k \to \infty$.

Choose $k(m) \ge m$ large enough that if $j \le m$ and $k(m) \le k$ then

$$|f_{m,k}(d_j) - y_j| < \frac{1}{m}.$$

The superdiagonal subsequence $g_m(x) = f_{m,k(m)}(x)$ converges to a limit at each point $x \in D$. We claim that $g_m(x)$ also converges at the other points $x \in [a, b]$ and that the convergence is uniform. It suffices to show that $(g_m)$ is a Cauchy sequence in $C^0$. Let $\epsilon > 0$ be given. Equicontinuity gives a $\delta > 0$ such that for all $s, t \in [a, b]$,

$$|s - t| < \delta \ \Rightarrow \ |g_m(s) - g_m(t)| < \frac{\epsilon}{3}.$$

Choose $J$ large enough that every $x \in [a, b]$ lies in the $\delta$-neighborhood of some $d_j$ with $j \le J$. Since $D$ is dense and $[a, b]$ is compact, this is possible. See Exercise 19. Since $\{d_1, \ldots, d_J\}$ is a finite set and $g_m(d_j)$ converges for each $d_j$, there is an $N$ such that for all $\ell, m \ge N$ and all $j \le J$,

$$|g_m(d_j) - g_\ell(d_j)| < \frac{\epsilon}{3}.$$

If $\ell, m \ge N$ and $x \in [a, b]$, choose $d_j$ with $|d_j - x| < \delta$ and $j \le J$. Then

$$|g_m(x) - g_\ell(x)| \le |g_m(x) - g_m(d_j)| + |g_m(d_j) - g_\ell(d_j)| + |g_\ell(d_j) - g_\ell(x)| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon.$$

Hence $(g_m)$ is Cauchy in $C^0$, it converges in $C^0$, and the proof is complete. □
Part of the preceding development can be isolated as the

15 Arzela-Ascoli Propagation Theorem Pointwise convergence of an equicontinuous sequence of functions on a dense subset of the domain propagates to uniform convergence on the whole domain.

Proof This is the $\epsilon/3$ part of the proof. □
The example cited over and over again in the equicontinuity world is the following:

16 Corollary Assume that $f_n : [a, b] \to \mathbb{R}$ is a sequence of differentiable functions whose derivatives are uniformly bounded. If for some $x_0$, $f_n(x_0)$ is bounded as $n \to \infty$, then the sequence $(f_n)$ has a subsequence that converges uniformly on $[a, b]$.

Proof Let $M$ be a bound for the derivatives $|f_n'(x)|$, valid for all $n \in \mathbb{N}$ and $x \in [a, b]$. Equicontinuity of $(f_n)$ follows from the Mean Value Theorem:

$$|s - t| < \delta \ \Rightarrow \ |f_n(s) - f_n(t)| = |f_n'(\theta)|\,|s - t| \le M\delta$$

for some $\theta$ between $s$ and $t$. Thus, given $\epsilon > 0$, the choice $\delta = \epsilon/(M + 1)$ shows that $(f_n)$ is equicontinuous. Let $C$ be a bound for $|f_n(x_0)|$, valid for all $n \in \mathbb{N}$. Then

$$|f_n(x)| \le |f_n(x) - f_n(x_0)| + |f_n(x_0)| \le M|x - x_0| + C \le M|b - a| + C$$

shows that the sequence $(f_n)$ is bounded in $C^0$. The Arzela-Ascoli theorem then supplies the uniformly convergent subsequence. □

Two other consequences of the same type are fundamental theorems in the fields of ordinary differential equations and complex variables.
(a) A sequence of solutions to a continuous ordinary differential equation in $\mathbb{R}^m$ has a subsequence that converges to a limit, and that limit is also a solution of the ODE.
(b) A sequence of complex analytic functions that converges pointwise, converges uniformly (on compact subsets of the domain of definition) and the limit is complex analytic.

Finally we give a topological interpretation of the Arzela-Ascoli theorem.

17 Heine-Borel Theorem in a function space A subset $\mathcal{E} \subset C^0$ is compact if and only if it is closed, bounded, and equicontinuous.

Proof Assume that $\mathcal{E}$ is compact. By Theorem 2.56, it is closed and totally bounded: given $\epsilon > 0$ there is a finite covering of $\mathcal{E}$ by neighborhoods in $C^0$ that have radius $\epsilon/3$, say the $\epsilon/3$-neighborhoods of $f_k$, with $k = 1, \ldots, n$. Each $f_k$ is uniformly continuous so there is a $\delta > 0$ such that

$$|s - t| < \delta \ \Rightarrow \ |f_k(s) - f_k(t)| < \frac{\epsilon}{3}.$$

If $f \in \mathcal{E}$ then for some $k$, $f$ lies in the $\epsilon/3$-neighborhood of $f_k$, and $|s - t| < \delta$ implies

$$|f(s) - f(t)| \le |f(s) - f_k(s)| + |f_k(s) - f_k(t)| + |f_k(t) - f(t)| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon.$$

Thus $\mathcal{E}$ is equicontinuous. Conversely, assume that $\mathcal{E}$ is closed, bounded, and equicontinuous. If $(f_n)$ is a sequence in $\mathcal{E}$ then by the Arzela-Ascoli theorem, some subsequence $(f_{n_k})$ converges uniformly to a limit. The limit lies in $\mathcal{E}$ since $\mathcal{E}$ is closed. Thus $\mathcal{E}$ is compact. □
4 Uniform Approximation in $C^0$

Given a continuous but nondifferentiable function $f$, we often want to make it smoother by a small perturbation. We want to approximate $f$ in $C^0$ by a smooth function $g$. The ultimately smooth function is a polynomial, and the first thing we prove is a polynomial approximation result.

18 Weierstrass Approximation Theorem The set of polynomials is dense in $C^0([a, b], \mathbb{R})$.

Density means that for each $f \in C^0$ and each $\epsilon > 0$ there is a polynomial function $p(x)$ such that for all $x \in [a, b]$,

$$|f(x) - p(x)| < \epsilon.$$

There are several proofs of this theorem, and although they appear quite different from each other, they share a common thread: the approximating function is built from $f$ by sampling the values of $f$ and recombining them in some nice way. It is no loss of generality to assume that the interval $[a, b]$ is $[0, 1]$. We do so.
Proof #1 For each n E
N, consider the sum
Pn (x) =
t (:)ckx k (l - xt-k , k=O
- ).
where ck = f ( k j n) and G) is the binomial coefficient n ! / k ! (n k ! Clearly Pn is a polynomial. It is called a Bernstein polynomial. We claim that the nth Bernstein polynomial converges uniformly to f as n ._ oo. The proof relies on two formulas about how the functions
rk (x ) = (: ) x k (l - xt -k
shown in Figure 90 behave. They are n
(2)
n
(3)
:�:)k - nx ) 2 rk (x) = nx ( I
k=O
In terms of the functions
(6)k xk
- x).
rk we write
.....
Figure 90 The seven basic Bernstein polynomials of degree six.
(1
- x)
6 -k
,k=0
6.
Uniform Approximation in C 0
Section 4
n Pn (X) = .L::Ck rk (X ) k=O
219
n f (x ) = L f(x)rk (x) . k=O
Then we divide the sum $p_n - f = \sum (c_k - f) r_k$ into the terms where $k/n$ is near $x$, and other terms where $k/n$ is far from $x$. More precisely, given $\epsilon > 0$ we use uniform continuity of $f$ on $[0,1]$ to find $\delta > 0$ such that $|t - s| < \delta$ implies $|f(t) - f(s)| < \epsilon/2$. Then we set
$$K_1 = \Big\{ k \in \{0, \ldots, n\} : \Big|\frac{k}{n} - x\Big| < \delta \Big\} \quad\text{and}\quad K_2 = \{0, \ldots, n\} \setminus K_1.$$
This gives
$$|p_n(x) - f(x)| \le \sum_{k=0}^{n} |c_k - f(x)|\, r_k(x) = \sum_{k \in K_1} |c_k - f(x)|\, r_k(x) + \sum_{k \in K_2} |c_k - f(x)|\, r_k(x).$$
The factors $|c_k - f(x)|$ in the first sum are $< \epsilon/2$ since $c_k = f(k/n)$ and $k/n$ differs from $x$ by $< \delta$. Since the sum of all the terms $r_k$ is 1 and the terms are non-negative, the first sum is $< \epsilon/2$. To estimate the second sum, use (3) to write
$$nx(1-x) = \sum_{k=0}^{n} (k - nx)^2 r_k(x) \ge \sum_{k \in K_2} (k - nx)^2 r_k(x) \ge \sum_{k \in K_2} (n\delta)^2 r_k(x),$$
since $k \in K_2$ implies that $(k - nx)^2 \ge (n\delta)^2$. This implies that
$$\sum_{k \in K_2} r_k(x) \le \frac{nx(1-x)}{(n\delta)^2} \le \frac{1}{4n\delta^2}$$
since $\max x(1-x) = 1/4$ as $x$ varies in $[0,1]$. The factors $|c_k - f(x)|$ in the second sum are at most $2M$ where $M = \|f\|$. Thus the second sum is
$$\sum_{k \in K_2} |c_k - f(x)|\, r_k(x) \le \frac{M}{2n\delta^2} \le \frac{\epsilon}{2}$$
when $n$ is large, completing the proof that $|p_n(x) - f(x)| < \epsilon$ when $n$ is large.
It remains to check the identities (2) and (3). The binomial coefficients satisfy
$$(4)\qquad (x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k},$$
which becomes (2) if we set $y = 1 - x$. On the other hand, if we fix $y$ and differentiate (4) with respect to $x$ once, and then again, we get
$$(5)\qquad n(x + y)^{n-1} = \sum_{k=0}^{n} \binom{n}{k} k\, x^{k-1} y^{n-k},$$
$$(6)\qquad n(n-1)(x + y)^{n-2} = \sum_{k=0}^{n} \binom{n}{k} k(k-1)\, x^{k-2} y^{n-k}.$$
Note that the bottom term in (5) and the bottom two terms in (6) are 0. Multiplying (5) by $x$ and (6) by $x^2$ and then setting $y = 1 - x$ in both equations gives
$$(7)\qquad nx = \sum_{k=0}^{n} \binom{n}{k} k\, x^k (1-x)^{n-k} = \sum_{k=0}^{n} k\, r_k(x),$$
$$(8)\qquad n(n-1)x^2 = \sum_{k=0}^{n} \binom{n}{k} k(k-1)\, x^k (1-x)^{n-k} = \sum_{k=0}^{n} k(k-1)\, r_k(x).$$
The last sum is $\sum k^2 r_k(x) - \sum k\, r_k(x)$. Hence (7), (8) become
$$(9)\qquad \sum_{k=0}^{n} k^2 r_k(x) = n(n-1)x^2 + \sum_{k=0}^{n} k\, r_k(x) = n(n-1)x^2 + nx.$$
Using (2), (7), (9), we get
$$\sum_{k=0}^{n} (k - nx)^2 r_k(x) = \sum_{k=0}^{n} k^2 r_k(x) - 2nx \sum_{k=0}^{n} k\, r_k(x) + (nx)^2 \sum_{k=0}^{n} r_k(x)$$
$$= n(n-1)x^2 + nx - 2(nx)^2 + (nx)^2 = -nx^2 + nx = nx(1-x),$$
as claimed in (3). □
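The Bernstein construction of Proof #1 can be tried numerically. The following is a minimal sketch, not part of the text: the test function $f(x) = |x - 1/2|$, the grid size, and the helper names `bernstein_basis`, `bernstein`, and `sup_error` are illustrative assumptions. It evaluates $p_n$, checks identities (2) and (3) at one point, and estimates the sup error on a grid.

```python
from math import comb

def bernstein_basis(n, k, x):
    """r_k(x) = C(n, k) x^k (1 - x)^(n - k)."""
    return comb(n, k) * x**k * (1 - x) ** (n - k)

def bernstein(f, n, x):
    """n-th Bernstein polynomial of f, evaluated at x in [0, 1]."""
    return sum(f(k / n) * bernstein_basis(n, k, x) for k in range(n + 1))

def sup_error(f, n, samples=201):
    """Largest |p_n(x) - f(x)| over an evenly spaced grid in [0, 1]."""
    grid = [i / (samples - 1) for i in range(samples)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in grid)

def f(x):
    # Continuous but not differentiable at x = 1/2 (illustrative choice).
    return abs(x - 0.5)

n, x = 20, 0.3
# Identity (2): the basis functions sum to 1.
total = sum(bernstein_basis(n, k, x) for k in range(n + 1))
# Identity (3): the weighted spread equals n x (1 - x).
spread = sum((k - n * x) ** 2 * bernstein_basis(n, k, x) for k in range(n + 1))
print(total, spread, n * x * (1 - x))

# The grid error shrinks as n grows, although slowly for this f.
for m in (10, 40, 160):
    print(m, sup_error(f, m))
```

The slow decay of the error for this $f$ is consistent with the estimate in the proof, where the bound on the second sum is of order $1/(n\delta^2)$.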
Proof #2  Let $f \in C^0([0,1],\mathbb{R})$ be given and let
$$g(x) = f(x) - (mx + b)$$
where $m = f(1) - f(0)$ and $b = f(0)$. Then $g \in C^0$ and $g(0) = 0 = g(1)$. If we can approximate $g$ arbitrarily well by polynomials, then the same is true of $f$ since $mx + b$ is a polynomial. In other words it is no loss of generality to assume that $f(0) = f(1) = 0$ in the first place. Also, we extend $f$ to all of $\mathbb{R}$ by defining $f(x) = 0$ for all $x \in \mathbb{R} \setminus [0,1]$. Then we consider a function
$$\beta_n(t) = b_n (1 - t^2)^n, \qquad -1 \le t \le 1,$$
where the constant $b_n$ is chosen so that $\int_{-1}^{1} \beta_n(t)\, dt = 1$. As shown in Figure 91, $\beta_n$ is a kind of polynomial bump function. Set
$$P_n(x) = \int_{-1}^{1} f(x + t)\, \beta_n(t)\, dt.$$
This is a weighted average of the values of $f$ using the weight function $\beta_n$. We claim that $P_n$ is a polynomial and $P_n(x) \rightrightarrows f(x)$ as $n \to \infty$.

Figure 91  The graph of the function $\beta_6(t) = 1.467\,(1 - t^2)^6$.
To check that $P_n$ is a polynomial we use a change of variables, $u = x + t$. Then, for $x \in [0,1]$,
$$P_n(x) = \int_{x-1}^{x+1} f(u)\, \beta_n(u - x)\, du = \int_{0}^{1} f(u)\, \beta_n(x - u)\, du$$
since $f = 0$ outside of $[0,1]$ and $\beta_n$ is an even function. The function $\beta_n(x - u)$ is a polynomial in $x$ whose coefficients are polynomials in $u$. The powers of $x$ pull out past the integral and we are left with these powers of $x$ multiplied by numbers, the integrals of the polynomials in $u$ times $f(u)$. In other words, by merely inspecting the last formula, it becomes clear that $P_n(x)$ is a polynomial in $x$. To check that $P_n \rightrightarrows f$ as $n \to \infty$, we need to estimate $\beta_n(t)$. We claim that if $\delta > 0$ then
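$$(10)\qquad \beta_n(t) \rightrightarrows 0 \quad\text{as } n \to \infty \text{ and } \delta \le |t| \le 1.$$
This is clear from Figure 91. Proceeding more rigorously, we have
$$1 = \int_{-1}^{1} b_n (1 - t^2)^n\, dt \ge b_n \int_{-1/\sqrt{n}}^{1/\sqrt{n}} (1 - t^2)^n\, dt \ge b_n \frac{2}{\sqrt{n}} \Big(1 - \frac{1}{n}\Big)^n.$$
Since $1/e = \lim_{n \to \infty} (1 - 1/n)^n$ we see that for some constant $c$ and all $n$,
$$b_n \le c\sqrt{n}.$$
See also Exercise 29. Hence if $\delta \le |t| \le 1$ then
$$\beta_n(t) = b_n (1 - t^2)^n \le c\sqrt{n}\,(1 - \delta^2)^n \to 0 \quad\text{as } n \to \infty,$$
due to the fact that $\sqrt{n}$ tends to $\infty$ more slowly than $(1 - \delta^2)^{-n}$ as $n \to \infty$. This proves (10).

From (10) we deduce that $P_n \rightrightarrows f$, as follows: Let $\epsilon > 0$ be given. Uniform continuity of $f$ gives $\delta > 0$ such that $|t| < \delta$ implies $|f(x + t) - f(x)| < \epsilon/2$. Since $\beta_n$ has integral 1 on $[-1,1]$ we have
$$|P_n(x) - f(x)| = \Big| \int_{-1}^{1} \big(f(x + t) - f(x)\big)\, \beta_n(t)\, dt \Big| \le \int_{-1}^{1} |f(x + t) - f(x)|\, \beta_n(t)\, dt$$
$$= \int_{|t| < \delta} |f(x + t) - f(x)|\, \beta_n(t)\, dt + \int_{|t| \ge \delta} |f(x + t) - f(x)|\, \beta_n(t)\, dt.$$
The first integral is $< \epsilon/2$, while the second is at most $2M \int_{|t| \ge \delta} \beta_n(t)\, dt$, where $M = \|f\|$. By (10), the second integral is $< \epsilon/2$ when $n$ is large. Thus $P_n \rightrightarrows f$ as claimed. □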
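The two facts about the bump functions used in Proof #2, the growth bound $b_n \le c\sqrt{n}$ and the decay (10), can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the standard closed form $\int_{-1}^{1} (1 - t^2)^n\, dt = 2^{2n+1} (n!)^2 / (2n+1)!$, which is not derived in the text, and the names `bump_constant` and `beta` are hypothetical.

```python
from fractions import Fraction
from math import factorial, sqrt

def bump_constant(n):
    """b_n such that b_n * integral_{-1}^{1} (1 - t^2)^n dt = 1.

    Uses the closed form 2^(2n+1) (n!)^2 / (2n+1)! for the integral
    (an assumption of this sketch, not taken from the text).
    """
    integral = Fraction(2 ** (2 * n + 1) * factorial(n) ** 2, factorial(2 * n + 1))
    return float(1 / integral)

def beta(n, t):
    """The polynomial bump beta_n(t) = b_n (1 - t^2)^n on [-1, 1]."""
    return bump_constant(n) * (1 - t * t) ** n

# b_6 is close to the constant quoted in the caption of Figure 91.
print(bump_constant(6))

# b_n / sqrt(n) stays bounded, while beta_n(delta) tends to 0 for fixed delta > 0.
delta = 0.25
for n in (10, 100, 1000):
    print(n, bump_constant(n) / sqrt(n), beta(n, delta))
```

Exact rational arithmetic is used for the integral so that the large factorials do not overflow before the final conversion to a float.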
Next we see how to extend this result to functions defined on a compact metric space $M$ instead of merely on an interval. A subset $A$ of $C^0 M = C^0(M,\mathbb{R})$ is a function algebra if it is closed under addition, scalar multiplication, and function multiplication. That is, if $f, g \in A$ and $c$ is a constant then $f + g$, $cf$, and $f \cdot g$ belong to $A$. For example, the set of polynomials is a function algebra. The function algebra vanishes at a point $p$ if $f(p) = 0$ for all $f \in A$. For example, the function algebra of all polynomials with zero constant term vanishes at $x = 0$. The function algebra separates points if for each pair of distinct points $p_1, p_2 \in M$ there is a function $f \in A$ such that $f(p_1) \ne f(p_2)$. For example, the function algebra of all trigonometric polynomials separates points of $[0, 2\pi)$ and vanishes nowhere.

19 Stone-Weierstrass Theorem  If $M$ is a compact metric space and $A$ is a function algebra in $C^0 M$ that vanishes nowhere and separates points then $A$ is dense in $C^0 M$.

Although the Weierstrass approximation theorem is a special case of the Stone-Weierstrass theorem, the proof of the latter does not stand on its own; it depends crucially on the former. We also need two lemmas.

20 Lemma  If $A$ vanishes nowhere and separates points then, given distinct points $p_1, p_2 \in M$, and given constants $c_1, c_2$, there exists a function $f \in A$ such that $f(p_1) = c_1$ and $f(p_2) = c_2$.
Proof  Choose $g_1, g_2 \in A$ that satisfy $g_1(p_1) \ne 0 \ne g_2(p_2)$. Then $g = g_1^2 + g_2^2$ belongs to $A$ and $g(p_1) \ne 0 \ne g(p_2)$. Let $h \in A$ separate $p_1, p_2$, and consider the matrix
$$H = \begin{bmatrix} a & ab \\ c & cd \end{bmatrix} = \begin{bmatrix} g(p_1) & g(p_1)h(p_1) \\ g(p_2) & g(p_2)h(p_2) \end{bmatrix}.$$
By construction $a, c \ne 0$ and $b \ne d$. Hence $\det H = acd - abc = ac(d - b) \ne 0$, $H$ has rank 2, and the linear equations
$$a\xi + ab\eta = c_1$$
$$c\xi + cd\eta = c_2$$
have a solution $(\xi, \eta)$. Then $f = \xi g + \eta g h$ belongs to $A$ and $f(p_1) = c_1$, $f(p_2) = c_2$. □

21 Lemma  The closure of a function algebra in $C^0 M$ is a function algebra.

Proof  Clear enough. □
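The proof of Lemma 20 is constructive, and the construction is short enough to run. The sketch below is illustrative only: it takes $M = [1, 2]$ and the algebra of polynomials with zero constant term (which vanishes nowhere on $[1,2]$ and separates points), with the hypothetical choices $g_1 = g_2 = x$, so $g = 2x^2$, and $h = x$. The function name `interpolate_in_algebra` is an assumption of this example.

```python
def interpolate_in_algebra(g, h, p1, p2, c1, c2):
    """Return f = xi*g + eta*g*h with f(p1) = c1 and f(p2) = c2.

    Solves  a*xi + a*b*eta = c1,  c*xi + c*d*eta = c2  by Cramer's rule,
    where a = g(p1), b = h(p1), c = g(p2), d = h(p2).
    """
    a, b = g(p1), h(p1)
    c, d = g(p2), h(p2)
    det = a * c * (d - b)          # det H, nonzero under the lemma's hypotheses
    if det == 0:
        raise ValueError("need g(p1), g(p2) nonzero and h(p1) != h(p2)")
    xi = (c1 * c * d - c2 * a * b) / det
    eta = (a * c2 - c * c1) / det
    return lambda x: xi * g(x) + eta * g(x) * h(x)

def g(x):
    return 2 * x * x               # g = g1^2 + g2^2 with g1 = g2 = x

def h(x):
    return x                       # separates any two distinct points

f = interpolate_in_algebra(g, h, 1.0, 2.0, 5.0, -3.0)
print(f(1.0), f(2.0))
```

Note that the resulting $f$ is again a polynomial with zero constant term, so it stays inside the algebra, as the lemma requires.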
Proof of the Stone-Weierstrass Theorem  Let $A$ be a function algebra in $C^0 M$ that vanishes nowhere and separates points. We must show that $A$ is dense in $C^0 M$: given $F \in C^0 M$ and $\epsilon > 0$ we must find $G \in A$ such that for all $x \in M$,
$$(11)\qquad F(x) - \epsilon < G(x) < F(x) + \epsilon.$$
First we observe that
$$(12)\qquad f \in \overline{A} \implies |f| \in \overline{A}$$
where $\overline{A}$ denotes the closure of $A$ in $C^0 M$. Let $\epsilon > 0$ be given. According to the Weierstrass approximation theorem, there exists a polynomial $p(y)$ such that
$$(13)\qquad \sup\{\, |p(y) - |y|| : |y| \le \|f\| \,\} < \frac{\epsilon}{2}.$$
After all, $|y|$ is a continuous function defined on the interval $[-\|f\|, \|f\|]$. The constant term of $p(y)$ is at most $\epsilon/2$ in absolute value since $|p(0) - |0|| < \epsilon/2$. Let $q(y) = p(y) - p(0)$. Then $q(y)$ is a polynomial with zero constant term and (13) becomes
$$(14)\qquad |q(y) - |y|| < \epsilon.$$
Write $q(y) = a_1 y + a_2 y^2 + \cdots + a_n y^n$ and $g = a_1 f + a_2 f^2 + \cdots + a_n f^n$. Lemma 21 states that $\overline{A}$ is an algebra, so $g \in \overline{A}$.† Besides, if $x \in M$ and $y = f(x)$ then
$$|g(x) - |f(x)|| = |q(y) - |y|| < \epsilon.$$
Hence $|f|$ lies in the closure of $\overline{A}$, which is $\overline{A}$ itself, as claimed in (12). Next we observe that if $f, g$ belong to $\overline{A}$, then $\max(f, g)$ and $\min(f, g)$ also belong to $\overline{A}$. For
$$\max(f, g) = \frac{f + g}{2} + \frac{|f - g|}{2}, \qquad \min(f, g) = \frac{f + g}{2} - \frac{|f - g|}{2}.$$

† Since a function algebra need not contain constant functions, it was important that $q$ has no constant term. One should not expect that $g = a_0 + a_1 f + \cdots + a_n f^n$ belongs to $A$.
Repetition shows that the maximum and minimum of any finite number of functions in $\overline{A}$ also belongs to $\overline{A}$. Now we return to (11). Let $F \in C^0 M$ and $\epsilon > 0$ be given. We are trying to find $G \in \overline{A}$ whose graph lies in the $\epsilon$-tube around the graph of $F$; since every function in $\overline{A}$ is a uniform limit of functions in $A$, this suffices. Fix any distinct points $p, q \in M$. According to Lemma 20 we can find a function in $A$ with given values at $p, q$, say $H_{pq} \in A$ satisfies
$$H_{pq}(p) = F(p) \quad\text{and}\quad H_{pq}(q) = F(q).$$
Fix $p$ and let $q$ vary. Each $q \in M$ has a neighborhood $U_q$ such that
$$(15)\qquad x \in U_q \implies F(x) - \epsilon < H_{pq}(x).$$
For $H_{pq}(x) - F(x) + \epsilon$ is a continuous function of $x$ which is positive at $x = q$. The function $H_{pq}$ locally supersolves (11). See Figure 92.

Figure 92  In a neighborhood of $q$, $H_{pq}$ supersolves (11) in the sense of (15).

Compactness of $M$ implies that finitely many of these neighborhoods $U_q$ cover $M$, say $U_{q_1}, \ldots, U_{q_n}$. Define
$$G_p(x) = \max\big(H_{pq_1}(x), \ldots, H_{pq_n}(x)\big).$$
Then $G_p \in \overline{A}$ and
$$(16)\qquad G_p(p) = F(p) \quad\text{and}\quad F(x) - \epsilon < G_p(x)$$
for all $x \in M$. See Figure 93.

Figure 93  $G_p$ is the maximum of $H_{pq_i}$, $i = 1, \ldots, n$.

Continuity implies that each $p$ has a neighborhood $V_p$ such that
$$(17)\qquad x \in V_p \implies G_p(x) < F(x) + \epsilon.$$
See Figure 94.

Figure 94  $G_p(p) = F(p)$ and $G_p$ supersolves (11) everywhere.

By compactness, finitely many of these neighborhoods cover $M$, say $V_{p_1}, \ldots, V_{p_m}$. Set
$$G(x) = \min\big(G_{p_1}(x), \ldots, G_{p_m}(x)\big).$$
We know that $G \in \overline{A}$ and (16), (17) imply (11). See Figure 95. □

Figure 95  The graph of $G$ lies in the $\epsilon$-tube around the graph of $F$.
22 Corollary  Any $2\pi$-periodic continuous function of $x \in \mathbb{R}$ can be uniformly approximated by a trigonometric polynomial
$$T(x) = \sum_{k=0}^{n} a_k \cos kx + \sum_{k=0}^{n} b_k \sin kx.$$

Proof  Think of $[0, 2\pi)$ parameterizing the circle $S^1$ by $x \mapsto (\cos x, \sin x)$. The circle is compact, and $2\pi$-periodic continuous functions on $\mathbb{R}$ become continuous functions on $S^1$. The trigonometric polynomials on $S^1$ form an algebra $\mathcal{T} \subset C^0 S^1$ that vanishes nowhere and separates points. The Stone-Weierstrass theorem implies that $\mathcal{T}$ is dense in $C^0 S^1$. □

Here is a typical application of the Stone-Weierstrass Theorem: Consider a continuous vector field $F : \Delta \to \mathbb{R}^2$ where $\Delta$ is the closed unit disc in the plane, and suppose that we want to approximate $F$ by a vector field that vanishes (equals zero) at most finitely often. A simple way to do so is to approximate $F$ by a polynomial vector field $G$. Real polynomials in two variables are finite sums
$$P(x, y) = \sum_{i,j=0}^{n} c_{ij}\, x^i y^j$$
where the $c_{ij}$ are constants. They form a function algebra $A$ in $C^0(\Delta,\mathbb{R})$ that separates points and vanishes nowhere. By the Stone-Weierstrass Theorem, $A$ is dense in $C^0$, so we can approximate the components of $F = (F_1, F_2)$ by polynomials
$$F_1 \approx P, \qquad F_2 \approx Q.$$
The vector field $(P, Q)$ then approximates $F$. Changing the coefficients of $P$ by a small amount ensures that $P$ and $Q$ have no common polynomial factor, so that the approximating field $(P, Q)$ vanishes at most finitely often.
5  Contractions and ODE's

Fixed point theorems are of great use in the applications of analysis, including the basic theory of vector calculus such as the general implicit function theorem. If $f : M \to M$ and for some $p \in M$, $f(p) = p$, then $p$ is a fixed point of $f$. When must $f$ have a fixed point? This question has many answers, and the two most famous are given in the next two theorems. Let $M$ be a metric space. A contraction of $M$ is a mapping $f : M \to M$ such that for some constant $k < 1$ and all $x, y \in M$,
$$d(f(x), f(y)) \le k\, d(x, y).$$

23 Banach Contraction Principle  Suppose that $f : M \to M$ is a contraction and the metric space $M$ is complete. Then $f$ has a unique fixed point $p$ and for any $x \in M$, $f^n(x) = f \circ f \circ \cdots \circ f(x) \to p$ as $n \to \infty$.

Brouwer Fixed Point Theorem  Suppose that $f : B^m \to B^m$ is continuous where $B^m$ is the closed unit ball in $\mathbb{R}^m$. Then $f$ has a fixed point $p \in B^m$.

The proof of the first result is easy, the second not. See Figure 96 to picture a contraction.

Proof #1 of the Banach Contraction Principle  Beautiful, simple, and dynamical! See Figure 96. Choose any $x_0 \in M$ and define $x_n = f^n(x_0)$. We claim that for all $n \in \mathbb{N}$
$$(18)\qquad d(x_n, x_{n+1}) \le k^n d(x_0, x_1).$$
This is easy:
$$d(x_n, x_{n+1}) = d(f(x_{n-1}), f(x_n)) \le k\, d(x_{n-1}, x_n) \le k^2 d(x_{n-2}, x_{n-1}) \le \cdots \le k^n d(x_0, x_1).$$
From this and a geometric series type of estimate, it follows that the sequence $(x_n)$ is Cauchy. For let $\epsilon > 0$ be given. Choose $N$ large enough
that
$$(19)\qquad \frac{k^N}{1 - k}\, d(x_0, x_1) < \epsilon.$$
Note that (19) needs the hypothesis $k < 1$. If $N \le m \le n$ then
$$d(x_m, x_n) \le d(x_m, x_{m+1}) + d(x_{m+1}, x_{m+2}) + \cdots + d(x_{n-1}, x_n)$$
$$\le k^m d(x_0, x_1) + k^{m+1} d(x_0, x_1) + \cdots + k^{n-1} d(x_0, x_1)$$
$$= k^m \big(1 + k + \cdots + k^{n-m-1}\big)\, d(x_0, x_1)$$
$$\le k^N \Big( \sum_{\ell=0}^{\infty} k^\ell \Big)\, d(x_0, x_1) = \frac{k^N}{1 - k}\, d(x_0, x_1) < \epsilon.$$
Thus $(x_n)$ is Cauchy. Since $M$ is complete, $x_n$ converges to some $p \in M$ as $n \to \infty$. Then
$$d(p, f(p)) \le d(p, x_n) + d(x_n, f(x_n)) + d(f(x_n), f(p)) \le d(p, x_n) + k^n d(x_0, x_1) + k\, d(x_n, p) \to 0$$
as $n \to \infty$. Since $d(p, f(p))$ is independent of $n$, $d(p, f(p)) = 0$ and $p = f(p)$. This proves the existence of the fixed point. Uniqueness is immediate. After all, how can two points simultaneously stay fixed and move closer together? □

Figure 96  $f$ contracts $M$ toward the fixed point $p$.
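The dynamical proof is also an algorithm: iterate $f$ and watch the orbit converge. The sketch below is an illustration under stated assumptions, not part of the text. It takes $M = [0, 1]$ and $f = \cos$, which maps $[0,1]$ into itself and, by the mean value theorem, contracts with constant $k = \sin 1 < 1$. It compares the actual distance to the fixed point with the bound $k^n d(x_0, x_1)/(1 - k)$ obtained by letting $n \to \infty$ in the Cauchy estimate.

```python
import math

def iterate(f, x0, n):
    """Return f^n(x0), the n-fold composite of f applied to x0."""
    x = x0
    for _ in range(n):
        x = f(x)
    return x

# Illustrative contraction: cos maps [0, 1] into itself and
# |cos'(x)| = |sin x| <= sin(1) < 1 there.
k = math.sin(1.0)
x0 = 1.0
d01 = abs(math.cos(x0) - x0)           # d(x0, x1)

# A long orbit serves as a stand-in for the fixed point p.
p = iterate(math.cos, x0, 200)
print("approximate fixed point:", p)

for n in (5, 10, 20, 40):
    xn = iterate(math.cos, x0, n)
    bound = k**n / (1 - k) * d01       # a priori bound from the proof
    print(n, abs(xn - p), bound)
```

The observed error is much smaller than the bound, since near $p$ the map contracts at the rate $|\sin p|$ rather than at the worst-case rate $k$.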
Proof #2 of the Banach Contraction Principle (sketch)  Choose any point $x_0 \in M$ and choose $r_0$ so large that $f(M_{r_0}(x_0)) \subset M_{r_0}(x_0)$. Let $B_0 = \overline{M_{r_0}(x_0)}$ and $B_n = \overline{f(B_{n-1})}$. The diameter of $B_n$ is at most $k^n \operatorname{diam}(B_0)$, and this tends to 0 as $n \to \infty$. The sets $B_n$ nest downward as $n \to \infty$ and $f$ sends $B_n$ inside $B_{n+1}$. Since $M$ is complete, this implies that $\bigcap B_n$ is a single point, say $p$, and $f(p) = p$. □

Proof of Brouwer's Theorem in dimension one  The closed unit 1-ball is the interval $[-1, 1]$ in $\mathbb{R}$. If $f : [-1, 1] \to [-1, 1]$ is continuous then so is $g(x) = x - f(x)$. At the endpoints $\pm 1$, we have $g(-1) \le 0 \le g(1)$. By the intermediate value theorem, there is a point $p \in [-1, 1]$ such that $g(p) = 0$. That is, $f(p) = p$. □
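The one-dimensional argument can be turned into a search: bisect on the sign of $g(x) = x - f(x)$. The following is a minimal sketch, with the map $f(x) = \cos 3x$ chosen purely for illustration; it sends $[-1, 1]$ into itself but is not a contraction, so the Banach principle does not apply while the intermediate value argument still does. The name `fixed_point_1d` is hypothetical.

```python
import math

def fixed_point_1d(f, tol=1e-12):
    """Locate a fixed point of a continuous f: [-1, 1] -> [-1, 1].

    Bisects g(x) = x - f(x), keeping g(lo) <= 0 <= g(hi) throughout,
    which holds initially because f takes values in [-1, 1].
    """
    lo, hi = -1.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid - f(mid) <= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

p = fixed_point_1d(lambda x: math.cos(3 * x))
print(p, math.cos(3 * p))
```

Bisection finds one fixed point; a map of this kind may have several, which is consistent with Brouwer's theorem asserting existence but not uniqueness.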
The proof in higher dimensions is harder. One proof is a consequence of the general Stokes' Theorem, and is given in Chapter 5. Another depends on algebraic topology, a third on differential topology.

Ordinary Differential Equations

The qualitative theory of ordinary differential equations (ODE's) begins with the basic existence/uniqueness theorem in ODE's, Picard's Theorem. Throughout, $U$ is an open subset of $m$-dimensional Euclidean space $\mathbb{R}^m$. In geometric terms, an ODE is a vector field $F$ defined on $U$. We seek a trajectory $\gamma$ of $F$ through a given $p \in U$, i.e., $\gamma : (a, b) \to U$ is differentiable and solves the ODE with initial condition $p$,
$$(20)\qquad \gamma'(t) = F(\gamma(t)) \quad\text{and}\quad \gamma(0) = p.$$
See Figure 97. In this notation we think of the vector field $F$ defining at each $x \in U$ a vector $F(x)$ whose foot lies at $x$ and to which $\gamma$ must be tangent. The vector $\gamma'(t)$ is $(\gamma_1'(t), \ldots, \gamma_m'(t))$ where $\gamma_1, \ldots, \gamma_m$ are the components of $\gamma$. The trajectory $\gamma(t)$ describes how a particle travels with prescribed velocity $F$. At each time $t$, $\gamma(t)$ is the position of the particle; its velocity there is exactly the vector $F$ at that point. Intuitively, trajectories should exist because particles do move.

Figure 97  $\gamma$ is always tangent to the vector field $F$.

The contraction principle gives a way to find trajectories of vector fields, or, what is the same thing, to solve ODE's. We will assume that $F$ satisfies a Lipschitz condition: there is a constant $L$ such that for all points $x, y \in U$,
$$|F(x) - F(y)| \le L\,|x - y|.$$
Here, $|\ |$ refers to the Euclidean length of a vector. $F$, $x$, $y$ are all vectors in $\mathbb{R}^m$. It follows that $F$ is continuous. The Lipschitz condition is stronger than continuity, but still fairly mild. Any differentiable vector field with a bounded derivative is Lipschitz.

24 Picard's Theorem  Given $p \in U$ there exists an $F$-trajectory $\gamma(t)$ in $U$ through $p$. This means that $\gamma : (a, b) \to U$ solves (20). Locally, $\gamma$ is unique.
To prove Picard's Theorem it is convenient to re-express (20) as an integral equation and to do this we make a brief digression about vector-valued integrals. Let's recall four key facts about integrals of real valued functions of a real variable, $y = f(x)$, $a \le x \le b$.
(a) $\int_a^b f(x)\, dx$ is approximated by Riemann sums $R = \sum f(t_k)\, \Delta x_k$.
(b) Continuous functions are integrable.
(c) If $f'(x)$ exists and is continuous then $\int_a^b f'(x)\, dx = f(b) - f(a)$.
(d) $\big| \int_a^b f(x)\, dx \big| \le M(b - a)$ where $M = \sup |f(x)|$.
The Riemann sum $R$ in (a) has $a = x_0 \le \cdots \le x_{k-1} \le t_k \le x_k \le \cdots \le x_n = b$ and all the $\Delta x_k = x_k - x_{k-1}$ are small. Given a vector-valued function of a real variable
$$f(x) = (f_1(x), \ldots, f_m(x)), \qquad a \le x \le b,$$
we define its integral componentwise as the vector of integrals
$$\int_a^b f(x)\, dx = \Big( \int_a^b f_1(x)\, dx, \ldots, \int_a^b f_m(x)\, dx \Big).$$
232
Chapter 4
J:
Corresponding to (a) - (d) we have (a') f(x) dx is approximated by R = (R 1 , . . . , Rm ), with Rj a Riemann sum for [j . (b') Continuous vector-valued functions are integrable. (c') If f'tx) exists and is continuous, then J: f'(x) dx = f(b) - f(a).
l
I J:
(d' ) f(x) dx ::=: M (b - a) where M = sup l f(x) l . (a'), (b' ), (c') are clear enough. To check (d') we write
R = =
L Rjej = L L [j (tk) D.. xkej j k L L [j (tk )ej D.. Xk = L f(tk) D.. xk k k j
where e1 , . . . , em is the standard vector basis for :!Rm . Thus,
IRI
:S
L l f(tk ) l D.. xk k
::=:
L M D.. xk = M (b - a ) . k
By (a' ), R approximates the integral, which implies (d'). (Note that a weaker inequality with M replaced by ,JfiiM follows immediately from (d). This we aker inequality would suffice for most of what we do - but it is inele gant.) Now consider the following integral version of (20),
Y ( t) = p +
(2 1 )
1 ' F(Y (s) ) ds.
A solution of (2 1 ) is by definition any continuous curve Y : (a. b) � U for which (2 1 ) holds identically in t E (a , b). By (b') any solution of (2 1 ) i s automatically differentiable and its derivative i s F ( Y (t)) . That is. any solution of (2 1 ) solves (20) . The converse is also clear, so solving (20) is equivalent to solving (2 1 ). Proof of Picard's Theorem Since
$F$ is continuous, there exists a compact neighborhood $N = \overline{N_r}(p)$, the closed ball of radius $r$ about $p$, and a constant $M$ such that $|F(x)| \le M$ for all $x \in N$. Choose $\tau > 0$ such that
$$(22)\qquad \tau M \le r \quad\text{and}\quad \tau L < 1.$$
Consider the set $\mathcal{C}$ of all continuous functions $\gamma : [-\tau, \tau] \to N$. With respect to the metric
$$d(\gamma, \sigma) = \sup\{\, |\gamma(t) - \sigma(t)| : t \in [-\tau, \tau] \,\}$$
the set $\mathcal{C}$ is a complete metric space. Given $\gamma \in \mathcal{C}$, define
$$\Phi(\gamma)(t) = p + \int_0^t F(\gamma(s))\, ds.$$
Solving (21) is the same as finding $\gamma$ such that $\Phi(\gamma) = \gamma$. That is, we seek a fixed point of $\Phi$. We just need to show that $\Phi$ is a contraction of $\mathcal{C}$. Does $\Phi$ send $\mathcal{C}$ into itself? Given $\gamma \in \mathcal{C}$ we see that $\Phi(\gamma)(t)$ is a continuous (in fact differentiable) vector-valued function of $t$ and that by (22),
$$|\Phi(\gamma)(t) - p| = \Big| \int_0^t F(\gamma(s))\, ds \Big| \le \tau M \le r.$$
Therefore, $\Phi$ does send $\mathcal{C}$ into itself. $\Phi$ contracts $\mathcal{C}$ because
$$d(\Phi(\gamma), \Phi(\sigma)) = \sup_t \Big| \int_0^t F(\gamma(s)) - F(\sigma(s))\, ds \Big| \le \tau \sup_s |F(\gamma(s)) - F(\sigma(s))| \le \tau \sup_s L\,|\gamma(s) - \sigma(s)| \le \tau L\, d(\gamma, \sigma)$$
and $\tau L < 1$ by (22). Therefore $\Phi$ has a fixed point $\gamma$, and $\Phi(\gamma) = \gamma$ implies that $\gamma(t)$ solves (21), which implies that $\gamma$ is differentiable and solves (20). Any other solution $\sigma(t)$ of (20) defined on the interval $[-\tau, \tau]$ also solves (21) and is a fixed point of $\Phi$, $\Phi(\sigma) = \sigma$. Since a contraction mapping has a unique fixed point, $\gamma = \sigma$, which is what local uniqueness means. □

The $F$-trajectories define a flow in the following way: To avoid the possibility that trajectories cross the boundary of $U$ (they "escape from $U$") or become unbounded in finite time (they "escape to infinity") we assume that $U$ is all of $\mathbb{R}^m$. Then trajectories can be defined for all time $t \in \mathbb{R}$. Let $\gamma(t, p)$ denote the trajectory through $p$. Imagine all points $p \in \mathbb{R}^m$ moving in unison along their trajectories as $t$ increases. They are leaves on a river, motes in a breeze. The point $p_1 = \gamma(t_1, p)$ at which $p$ arrives after time $t_1$ moves according to $\gamma(t, p_1)$. Before $p$ arrives at $p_1$, however, $p_1$ has already gone elsewhere. This is expressed by the flow equation
$$\gamma(t, p_1) = \gamma(t + t_1, p).$$
See Figure 98. The flow equation is true because as functions of $t$ both sides of the equation are $F$-trajectories through $p_1$, and the $F$-trajectory through a point is locally unique.

Figure 98  The time needed to flow from $p$ to $p_2$ is the sum of the times needed to flow from $p$ to $p_1$ and from $p_1$ to $p_2$.

It is revealing to rewrite the flow equation with different notation. Setting $\varphi_t(p) = \gamma(t, p)$ gives
$$\varphi_{t+s}(p) = \varphi_t(\varphi_s(p)) \quad\text{for all } t, s \in \mathbb{R}.$$
$\varphi_t$ is called the $t$-advance map. It specifies where each point moves after time $t$. See Figure 99. The flow equation states that $t \mapsto \varphi_t$ is a group homomorphism from $\mathbb{R}$ into the group of motions of $\mathbb{R}^m$. In fact each $\varphi_t$ is a homeomorphism of $\mathbb{R}^m$ onto itself and its inverse is $\varphi_{-t}$. For $\varphi_{-t} \circ \varphi_t = \varphi_0$ and $\varphi_0$ is the time-zero map where nothing moves at all, $\varphi_0 = $ identity map.

Figure 99  The $t$-advance map shows how a set $A$ flows to a set $\varphi_t(A)$.
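The integral operator $\gamma \mapsto p + \int_0^t F(\gamma(s))\, ds$ from the proof of Picard's Theorem can be iterated by hand in simple cases. The sketch below is an illustration under stated assumptions: it takes the scalar linear field $F(x) = x$ (so $L = 1$), represents curves as polynomials in $t$ with exact rational coefficients, and uses hypothetical helper names `picard_step`, `picard_iterate`, `evaluate`, and `advance`. Starting from the constant curve, the iterates are the Taylor polynomials of $p e^t$. The last lines check the flow equation $\varphi_{t+s} = \varphi_t \circ \varphi_s$ for the exact flow $\varphi_t(p) = p e^t$ of this field.

```python
from fractions import Fraction
from math import exp, factorial

def picard_step(coeffs, p):
    """One application of the integral operator for F(x) = x.

    coeffs[k] is the coefficient of t^k in the current curve; the result
    is p + integral_0^t (current curve)(s) ds, again as a coefficient list.
    """
    integrated = [Fraction(c, k + 1) for k, c in enumerate(coeffs)]
    return [Fraction(p)] + integrated

def picard_iterate(p, n):
    """Apply the operator n times, starting from the constant curve p."""
    curve = [Fraction(p)]
    for _ in range(n):
        curve = picard_step(curve, p)
    return curve

def evaluate(coeffs, t):
    """Evaluate the polynomial curve at time t (floating point)."""
    return sum(float(c) * t**k for k, c in enumerate(coeffs))

gamma5 = picard_iterate(1, 5)
print(gamma5)                                    # coefficients 1/k!, k = 0..5
print(evaluate(picard_iterate(1, 15), 0.5), exp(0.5))

def advance(t, p):
    """t-advance map of x' = x, namely phi_t(p) = p * e^t."""
    return p * exp(t)

t, s, p = 0.3, 1.1, 2.0
print(advance(t + s, p), advance(t, advance(s, p)))
```

The proof only guarantees contraction on $[-\tau, \tau]$ with $\tau L < 1$, which is why the comparison is made at $t = 0.5$; for this particular field the iterates happen to converge for every $t$.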
6*  Analytic Functions

Recall from Chapter 3 that a function $f : (a, b) \to \mathbb{R}$ is analytic if it can be expressed locally as a power series. For each $x \in (a, b)$ there exists a convergent power series $\sum c_k h^k$ such that for all $x + h$ near $x$,
$$f(x + h) = \sum_{k=0}^{\infty} c_k h^k.$$
As we have shown previously, every analytic function is smooth but not every smooth function is analytic. In this section we give a necessary and sufficient condition that a smooth function be analytic. It involves the speed with which the $r$th derivative grows as $r \to \infty$. Let $f : (a, b) \to \mathbb{R}$ be smooth. The Taylor series for $f$ at $x \in (a, b)$ is
$$\sum_{r=0}^{\infty} \frac{f^{(r)}(x)}{r!}\, h^r.$$
Let $I = [x - \sigma, x + \sigma]$ be a subinterval of $(a, b)$, $\sigma > 0$, and denote by $M_r$ the maximum of $|f^{(r)}(t)|$ for $t \in I$. The derivative growth rate of $f$ on $I$ is
$$\alpha = \limsup_{r \to \infty} \sqrt[r]{\frac{M_r}{r!}}.$$
Clearly, $\sqrt[r]{|f^{(r)}(x)|/r!} \le \sqrt[r]{M_r/r!}$, so the radius of convergence
$$R = \frac{1}{\limsup_{r \to \infty} \sqrt[r]{\dfrac{|f^{(r)}(x)|}{r!}}}$$
of the Taylor series at $x$ satisfies
$$\frac{1}{\alpha} \le R.$$
In particular, if $\alpha$ is finite the radius of convergence of the Taylor series is positive.

25 Theorem  If $\alpha\sigma < 1$, then the Taylor series converges uniformly to $f$ on the interval $I$.
Proof  Choose $\delta > 0$ such that $(\alpha + \delta)\sigma < 1$. The Taylor remainder formula from Chapter 3, applied to the $(r-1)$st order remainder, gives
$$f(x + h) - \sum_{k=0}^{r-1} \frac{f^{(k)}(x)}{k!}\, h^k = \frac{f^{(r)}(\theta)}{r!}\, h^r$$
for some $\theta$ between $x$ and $x + h$. Thus, for $r$ large
$$\Big| f(x + h) - \sum_{k=0}^{r-1} \frac{f^{(k)}(x)}{k!}\, h^k \Big| \le \frac{M_r}{r!}\, \sigma^r = \Big( \Big(\frac{M_r}{r!}\Big)^{1/r} \sigma \Big)^r \le \big((\alpha + \delta)\sigma\big)^r.$$
Since $(\alpha + \delta)\sigma < 1$, the Taylor series converges uniformly to $f(x + h)$ on $I$. □

26 Theorem  If $f$ is expressed as a convergent power series $f(x + h) = \sum c_k h^k$ with radius of convergence $R > \sigma$, then $f$ has bounded derivative growth rate on $I$.

The proof of Theorem 26 uses two estimates about the growth rate of factorials. If you know Stirling's formula they are easy, but we prove them directly.
$$(23)\qquad \lim_{r \to \infty} \frac{r}{\sqrt[r]{r!}} = e,$$
$$(24)\qquad \limsup_{r \to \infty} \sqrt[r]{\sum_{k=r}^{\infty} \binom{k}{r} \lambda^k} < \infty \quad\text{if } 0 < \lambda < 1.$$
Taking logarithms, applying the integral test, and ignoring terms that tend to zero as $r \to \infty$ gives
$$\frac{1}{r}\big(\log r^r - \log r!\big) = \log r - \frac{1}{r}\big(\log r + \log(r-1) + \cdots + \log 1\big)$$
$$\sim \log r - \frac{1}{r} \int_1^r \log x\, dx = \log r - \frac{1}{r}\big(x \log x - x\big)\Big|_1^r = 1 - \frac{1}{r},$$
which tends to 1 as $r \to \infty$. This proves (23).

To prove (24) we write $\lambda = e^{-\mu}$ for $\mu > 0$, and reason similarly:
$$\sum_{k=r}^{\infty} \binom{k}{r} \lambda^k = \sum_{k=r}^{\infty} \frac{k(k-1)\cdots(k-r+1)}{r!}\, e^{-\mu k} < \frac{1}{r!} \sum_{k=r}^{\infty} k^r e^{-\mu k} \sim \frac{1}{r!} \int_r^{\infty} x^r e^{-\mu x}\, dx$$
$$= -\frac{1}{r!} \Big[ e^{-\mu x} \Big( \frac{x^r}{\mu} + \frac{r x^{r-1}}{\mu^2} + \frac{r(r-1) x^{r-2}}{\mu^3} + \cdots + \frac{r!}{\mu^{r+1}} \Big) \Big]_r^{\infty} < \frac{1}{r!}\, e^{-\mu r} (r + 1)\, \frac{r^r}{\min(1, \mu)^{r+1}}.$$
According to (23) the $r$th root of this quantity tends to $e^{1-\mu}/\min(1, \mu)$ as $r \to \infty$, completing the proof of (24).
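Both factorial estimates are easy to probe numerically. The sketch below is an illustration, not part of the text. For (23) it computes $r/\sqrt[r]{r!}$ through `math.lgamma`, using $\log r! = \texttt{lgamma}(r+1)$. For (24) it sums the series directly and compares with the closed form $\sum_{k \ge r} \binom{k}{r} \lambda^k = \lambda^r/(1-\lambda)^{r+1}$, a standard negative binomial identity that is an assumption of this example rather than something used in the proof above. The names `ratio` and `tail_sum` are hypothetical.

```python
from math import comb, e, exp, lgamma

def ratio(r):
    """r / (r!)^(1/r), computed via log r! = lgamma(r + 1) to avoid overflow."""
    return r / exp(lgamma(r + 1) / r)

# Estimate (23): the ratio increases toward e.
for r in (10, 100, 1000, 10**6):
    print(r, ratio(r), e - ratio(r))

def tail_sum(r, lam, terms=2000):
    """Partial sum of sum_{k >= r} C(k, r) lam^k (truncated after `terms` terms)."""
    return sum(comb(k, r) * lam**k for k in range(r, r + terms))

# Estimate (24): compare with lam^r / (1 - lam)^(r + 1); here both are close to 2.
r, lam = 5, 0.5
print(tail_sum(r, lam), lam**r / (1 - lam) ** (r + 1))
```

With the closed form, the $r$th root of the sum tends to $\lambda/(1-\lambda)$, which is finite for $0 < \lambda < 1$, in agreement with (24).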
Proof of Theorem 26  By assumption the power series $\sum c_k h^k$ has radius of convergence $R$ and $\sigma < R$. Since $1/R$ is the lim sup of $\sqrt[k]{|c_k|}$ as $k \to \infty$, there is a number $\lambda < 1$ such that for all large $k$, $|c_k \sigma^k| \le \lambda^k$. Differentiating the series term by term with $|h| \le \sigma$ gives
$$|f^{(r)}(x + h)| \le \sum_{k=r}^{\infty} k(k-1)(k-2)\cdots(k-r+1)\, |c_k h^{k-r}| \le \frac{r!}{\sigma^r} \sum_{k=r}^{\infty} \binom{k}{r} \lambda^k$$
for $r$ large. Thus,
$$\alpha = \limsup_{r \to \infty} \sqrt[r]{\frac{M_r}{r!}} \le \frac{1}{\sigma} \limsup_{r \to \infty} \sqrt[r]{\sum_{k=r}^{\infty} \binom{k}{r} \lambda^k}.$$
According to (24), $\alpha < \infty$ and $f$ has bounded derivative growth rate on $I$. □

From Theorems 25 and 26 we deduce the main result of this section.

27 Analyticity Theorem  A smooth function is analytic if and only if it has locally bounded derivative growth rate.

Proof  Assume that $f : (a, b) \to \mathbb{R}$ is smooth and has locally bounded derivative growth rate. Then $x \in (a, b)$ has a neighborhood $N$ on which the derivative growth rate $\alpha$ is finite. Choose $\sigma > 0$ such that $I = [x - \sigma, x + \sigma] \subset N$ and $\alpha\sigma < 1$. We infer from Theorem 25 that the Taylor series for $f$ at $x$ converges uniformly to $f$ on $I$. Hence $f$ is analytic. Conversely, assume that $f$ is analytic and let $x \in (a, b)$ be given. There is a power series $\sum c_k h^k$ that converges to $f(x + h)$ for all $h$ in some interval $(-R, R)$, $R > 0$. Choose $\sigma$, $0 < \sigma < R$. We infer from Theorem 26 that $f$ has bounded derivative growth rate on $I$. □
28 Corollary  A smooth function is analytic if its derivatives are uniformly bounded.

An example of such a function is $f(x) = \sin x$.

Proof  If $|f^{(r)}(\theta)| \le M$ for all $r$ and $\theta$ then the derivative growth rate of $f$ is bounded, in fact $\alpha = 0$ and $R = \infty$. □

29 Taylor's Theorem  If $f(x) = \sum c_k x^k$ and the power series has radius of convergence $R$, then $f$ is analytic on $(-R, R)$.

Proof  The function $f$ is smooth, and by Theorem 26 it has bounded derivative growth rate on $I \subset (-R, R)$. Hence it is analytic. □
Taylor's Theorem states that not only can $f$ be expanded as a convergent power series at $x = 0$, but also at any other point $x_0 \in (-R, R)$. Other proofs of Taylor's theorem rely more heavily on series manipulations and Mertens' theorem.

The concept of analyticity extends immediately to complex functions. A function $f : D \to \mathbb{C}$ is complex analytic if $D$ is an open subset of $\mathbb{C}$ and for each $z \in D$ there is a power series
$$\sum c_k \zeta^k$$
such that for all $z + \zeta$ near $z$,
$$f(z + \zeta) = \sum_{k=0}^{\infty} c_k \zeta^k.$$
The coefficients $c_k$ are complex and so is the variable $\zeta$. Convergence occurs on a disc of radius $R$. This lets us define $e^z$, $\log z$, $\sin z$, $\cos z$ for the complex number $z$ by setting
$$e^z = \sum_{k=0}^{\infty} \frac{z^k}{k!}$$
$$\log(1 + z) = \sum_{k=1}^{\infty} \frac{(-1)^{k+1} z^k}{k} \quad\text{when } |z| < 1$$
$$\sin z = \sum_{k=0}^{\infty} \frac{(-1)^k z^{2k+1}}{(2k+1)!}$$
$$\cos z = \sum_{k=0}^{\infty} \frac{(-1)^k z^{2k}}{(2k)!}.$$
It is enlightening and reassuring to derive formulas such as
$$e^{i\theta} = \cos\theta + i \sin\theta$$
directly from these definitions. (Just plug in $z = i\theta$ and use the equations $i^2 = -1$, $i^3 = -i$, $i^4 = 1$, etc.) A key formula to check is $e^{z+w} = e^z e^w$. One proof involves a manipulation of product series, a second merely uses analyticity. Another formula is $\log(e^z) = z$. There are many natural results about real analytic functions that can be proved by direct power series means; e.g., the sum, product, reciprocal, composite, and inverse function of analytic functions are analytic. Direct proofs, like those for the Analyticity Theorem above, involve major series manipulations. The use of complex variables leads to greatly simplified proofs of these real variable theorems, thanks to the following fact.
Real analyticity propagates to complex analyticity and complex analyticity is equivalent to complex differentiability.†

† A function $f : D \to \mathbb{C}$ is complex differentiable or holomorphic if $D$ is an open subset of $\mathbb{C}$ and for each $z \in D$, the limit of
$$\frac{\Delta f}{\Delta z} = \frac{f(z + \Delta z) - f(z)}{\Delta z}$$
exists as $\Delta z \to 0$ in $\mathbb{C}$. The limit, if it exists, is a complex number.

For it is relatively easy to check that the composition, etc., of complex differentiable functions is complex differentiable. The analyticity concept extends even beyond ... $x' = Ax$ in calculus. $A$ is a given $m \times m$ matrix and the unknown solution $x = x(t)$ is a vector function of $t$, on which an initial condition $x(0) = x_0$ is usually imposed. A vector ODE is equivalent to $m$ coupled, scalar, linear ODE's. The solution $x(t)$ can be expressed as
$$x(t) = e^{tA} x_0$$
where
$$e^{tA} = \lim_{n \to \infty} \Big( I + tA + \frac{1}{2!}(tA)^2 + \cdots + \frac{1}{n!}(tA)^n \Big) = \sum_{k=0}^{\infty} \frac{t^k}{k!} A^k.$$
$I$ is the $m \times m$ identity matrix. View this series as a power series with $k$th coefficient $t^k/k!$ and variable $A$. ($A$ is a matrix variable!) The limit exists in the space of all $m \times m$ matrices, and its product with the constant vector $x_0$ does indeed give a vector function of $t$ that solves the original linear ODE.

The previous series defines the exponential of a matrix as $e^A = \sum A^k/k!$. You might ask yourself: is there such a thing as the logarithm of a matrix? A function that assigns to a matrix its matrix logarithm? A power series that expresses the matrix logarithm? What about other analytic functions? Is there such a thing as the sine of a matrix? What about inverting a matrix? Is there a power series that expresses matrix inversion? Are formulas such as $\log A^2 = 2 \log A$ true? These questions are explored in nonlinear functional analysis.

A terminological point on which to insist is that the word "analytic" be defined as "locally power series expressible." In the complex case, some mathematicians define complex analyticity as complex differentiability, and although complex differentiability turns out to be equivalent to local expressibility as a complex power series, this is a very special feature of $\mathbb{C}$.
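The matrix exponential series can be summed directly for small matrices. The sketch below is an illustration under stated assumptions: it uses plain Python lists, a fixed number of terms, and the rotation generator $A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$, for which $e^{tA}$ is known to be the rotation through angle $t$. The helper names `mat_mul` and `mat_exp` are hypothetical.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, t, terms=30):
    """Partial sum of the series sum_k (t^k / k!) A^k."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity I
    term = [row[:] for row in result]        # holds (tA)^k / k!, starting at k = 0
    for k in range(1, terms):
        term = mat_mul(term, A)
        term = [[t * entry / k for entry in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

A = [[0.0, -1.0], [1.0, 0.0]]    # generator of plane rotations (illustrative)
t = 0.7
E = mat_exp(A, t)
print(E)
print([[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]])
```

For this $A$ the solution $x(t) = e^{tA} x_0$ of $x' = Ax$ simply rotates $x_0$ about the origin, which matches the two printed matrices agreeing to rounding error.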
7*  Nowhere Differentiable Continuous Functions

Although many continuous functions, such as $|x|$, $\sqrt[3]{x}$, and $x \sin(1/x)$ fail to be differentiable at a few points, it is quite surprising that there can exist a function which is everywhere continuous but nowhere differentiable.

30 Theorem  There exists a continuous function $f : \mathbb{R} \to \mathbb{R}$ that has a derivative at no point whatsoever.

Proof  The construction is due to Weierstrass. The letters $k, m, n$ denote integers. Start with a sawtooth function $\sigma_0 : \mathbb{R} \to \mathbb{R}$ defined as
$$\sigma_0(x) = \begin{cases} x - 2n & \text{if } 2n \le x \le 2n + 1 \\ 2n + 2 - x & \text{if } 2n + 1 \le x \le 2n + 2. \end{cases}$$
$\sigma_0$ is periodic with period 2; if $t = x + 2m$ then $\sigma_0(t) = \sigma_0(x)$. The compressed sawtooth function
$$\sigma_k(x) = \Big(\frac{3}{4}\Big)^k \sigma_0(4^k x)$$
has period $\pi_k = 2/4^k$. If $t = x + m\pi_k$ then $\sigma_k(t) = \sigma_k(x)$. See Figure 100.

Figure 100  The graphs of the sawtooth function and two compressed sawtooth functions.

According to the $M$-test, the series $\sum \sigma_k(x)$ converges uniformly to a limit $f$, and
$$f(x) = \sum_{k=0}^{\infty} \sigma_k(x)$$
is continuous. We claim that $f$ is nowhere differentiable. Fix an arbitrary point $x$, and set $\delta_n = 1/(2 \cdot 4^n)$. We will show that
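$$\frac{\Delta f}{\Delta x} = \frac{f(x \pm \delta_n) - f(x)}{\delta_n}$$
does not converge to a limit as $\delta_n \to 0$, and thus that $f'(x)$ does not exist. The quotient is
$$\frac{\Delta f}{\Delta x} = \sum_{k=0}^{\infty} \frac{\sigma_k(x \pm \delta_n) - \sigma_k(x)}{\delta_n}.$$
There are three types of term in the series, $k > n$, $k = n$, and $k < n$. If $k > n$ then $\sigma_k(x \pm \delta_n) - \sigma_k(x) = 0$, for $\delta_n$ is an integer multiple of the period of $\sigma_k$,
$$\delta_n = \frac{1}{2 \cdot 4^n} = 4^{k-(n+1)} \cdot \frac{2}{4^k} = 4^{k-(n+1)} \cdot \pi_k.$$
Thus the infinite series expression for $\Delta f/\Delta x$ reduces to a sum of $n + 1$ terms
$$\frac{\Delta f}{\Delta x} = \frac{\sigma_n(x \pm \delta_n) - \sigma_n(x)}{\delta_n} + \sum_{k=0}^{n-1} \frac{\sigma_k(x \pm \delta_n) - \sigma_k(x)}{\delta_n}.$$
The function $\sigma_n$ is monotone on either $[x - \delta_n, x]$ or $[x, x + \delta_n]$, since it is monotone on intervals of length $4^{-n}$ and the contiguous interval $[x - \delta_n, x + \delta_n]$ at $x$ is of length $4^{-n}$. The slope of $\sigma_n$ is $\pm 3^n$. Thus, either
$$\Big| \frac{\sigma_n(x + \delta_n) - \sigma_n(x)}{\delta_n} \Big| = 3^n \quad\text{or}\quad \Big| \frac{\sigma_n(x - \delta_n) - \sigma_n(x)}{\delta_n} \Big| = 3^n.$$
The terms with $k < n$ are crudely estimated from the slope of $\sigma_k$ being $\pm 3^k$:
$$\Big| \frac{\sigma_k(x \pm \delta_n) - \sigma_k(x)}{\delta_n} \Big| \le 3^k.$$
Thus, for the appropriate choice of sign,
$$\Big| \frac{\Delta f}{\Delta x} \Big| \ge 3^n - (3^{n-1} + \cdots + 1) = 3^n - \frac{3^n - 1}{3 - 1} = \frac{1}{2}(3^n + 1),$$
which tends to $\infty$ as $\delta_n \to 0$, so $f'(x)$ does not exist. □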
Weierstrass showed that a nowhere differentiable continuous function exists by simply writing a formula for it. Yet more amazing is the fact thar most continuous functions (in a reasonable sense defined below) are
Section 7*
Nowhere Differentiable Continuous Functions
243
nowhere differentiable. If you could pick a continuous function at random, it would be nowhere differentiable. Recall that the set D c M is dense in M if D meets every non-empty open subset W of M, D n W f= 0. The intersection of two dense sets need not be dense; it can be empty, as is the case with Q and Qc in JR. On the other hand if U, V are open dense sets in M then U n V is open dense in M. For if W is any non-empty open subset of M then U n W is a non-empty open subset of M, and by denseness of V, V meets U n W ; i.e., U n V n W is non-empty and U n V meets W . Moral Open dense sets do a good job o f being dense.
The countable intersection $G = \bigcap G_n$ of open dense sets is called a thick (or residual†) subset of $M$, due to the following result, which we will apply in the complete metric space $C^0([a,b], \mathbb{R})$. Extending our vocabulary in a natural way we say that the complement of a thick set is thin (or meager). A subset $H$ of $M$ is thin if and only if it is a countable union of nowhere dense closed sets, $H = \bigcup H_n$. Clearly, thickness and thinness are topological properties. A thin set is the topological analog of a zero set (a set whose outer measure is zero).
31 Baire's Theorem Every thick subset of a complete metric space $M$ is dense in $M$. A non-empty, complete metric space is not thin: if $M$ is the union of countably many closed sets, at least one has non-empty interior.

If all points in a thick subset of $M$ satisfy some condition then the condition is said to be generic; we also say that most points of $M$ obey the condition. As a consequence of Baire's theorem and a modification of Weierstrass' construction we will prove

32 Theorem The generic $f \in C^0 = C^0([a,b], \mathbb{R})$ is differentiable at no point of $[a,b]$, nor is it monotone on any subinterval of $[a,b]$.
Using Lebesgue's monotone differentiation theorem (monotonicity implies differentiability almost everywhere), the second assertion follows from the first, but below we give a direct proof. Before getting into the proofs of Baire's theorem and Theorem 32, we further discuss thickness, thinness, and genericity.

The empty set is always thin and the full space $M$ is always thick in itself. A single open dense subset is thick and a single closed nowhere dense subset is thin. $\mathbb{R} \setminus \mathbb{Z}$ is a thick subset of $\mathbb{R}$ and the Cantor set is a thin subset of $\mathbb{R}$. Likewise $\mathbb{R}$ is a thin subset of $\mathbb{R}^2$. The generic point of $\mathbb{R}$ does not lie in the Cantor set. The generic point of $\mathbb{R}^2$ does not lie on the $x$-axis. Although $\mathbb{R} \setminus \mathbb{Z}$ is a thick subset of $\mathbb{R}$ it is not a thick subset of $\mathbb{R}^2$. The set $\mathbb{Q}$ is a thin subset of $\mathbb{R}$. It is the countable union of its points, each of which is a closed nowhere dense set. $\mathbb{Q}^c$ is a thick subset of $\mathbb{R}$. The generic real number is irrational. In the same vein:
(a) The generic square matrix has determinant $\neq 0$.
(b) The generic linear transformation $\mathbb{R}^m \to \mathbb{R}^m$ is an isomorphism.
(c) The generic linear transformation $\mathbb{R}^m \to \mathbb{R}^{m-k}$ is onto.
(d) The generic linear transformation $\mathbb{R}^m \to \mathbb{R}^{m+k}$ is one-to-one.
(e) The generic pair of lines in $\mathbb{R}^3$ are skew (nonparallel and disjoint).
(f) The generic plane in $\mathbb{R}^3$ meets the three coordinate axes in three distinct points.
(g) The generic $n$th degree polynomial has $n$ distinct roots.
In an incomplete metric space such as $\mathbb{Q}$, thickness and thinness have no bite: every subset of $\mathbb{Q}$, even the empty set, is thick in $\mathbb{Q}$.

† "Residual" is an unfortunate choice of words. It connotes smallness, when it should connote just the opposite.
Proof of Baire's Theorem If $M = \emptyset$, the proof is trivial, so we assume $M \neq \emptyset$. Let $G = \bigcap G_n$ be a thick subset of $M$, each $G_n$ being open dense in $M$. Let $p_0 \in M$ and $\epsilon > 0$ be given. Choose a sequence of points $p_n \in M$ and radii $r_n > 0$ such that $r_n < 1/n$ and

$$\begin{aligned}
M_{2r_1}(p_1) &\subset M_\epsilon(p_0) \\
M_{2r_2}(p_2) &\subset M_{r_1}(p_1) \cap G_1 \\
&\;\;\vdots \\
M_{2r_{n+1}}(p_{n+1}) &\subset M_{r_n}(p_n) \cap G_1 \cap \cdots \cap G_n.
\end{aligned}$$

See Figure 101. Then

$$M_\epsilon(p_0) \supset \overline{M}_{r_1}(p_1) \supset \overline{M}_{r_2}(p_2) \supset \cdots.$$

The diameters of these sets tend to $0$ as $n \to \infty$. Thus $(p_n)$ is a Cauchy sequence and it converges to some $p \in M$, by completeness. The point $p$ belongs to each set $\overline{M}_{r_n}(p_n)$ and therefore it belongs to each $G_n$. Thus $p \in G \cap M_\epsilon(p_0)$ and $G$ is dense in $M$.

To check that $M$ is not thin, we take complements. Suppose that $M = \bigcup K_n$ and $K_n$ is closed. If each $K_n$ has empty interior, then each $G_n = K_n^c$ is open-dense, and

$$G = \bigcap G_n = \left( \bigcup K_n \right)^c = \emptyset,$$

a contradiction to density of $G$. □
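The nested-ball construction can be traced by hand in a small case. The following sketch is an illustration under stated assumptions rather than part of the proof: it takes $M = \mathbb{R}$ and $G_n = \mathbb{R} \setminus \{q_n\}$ for a finite list of rationals $q_1, \dots, q_N$ (a hypothetical choice), and picks centers and radii satisfying the displayed inclusions. The name `nested_balls` and the specific shrinking factors are illustrative.

```python
from fractions import Fraction


def nested_balls(p0, eps, excluded):
    """Centers p_1, p_2, ... and radii r_1, r_2, ... as in the proof, for M = R.

    G_n is taken to be R minus the single point excluded[n - 1].
    The ball of radius 2 * r_{n+1} about p_{n+1} lies inside the ball of radius
    r_n about p_n and misses excluded[0], ..., excluded[n - 1].
    """
    # First ball: 2 * r_1 <= eps / 2 < eps, and r_1 <= 1/2 < 1.
    p, r = p0, min(eps / 4, Fraction(1, 2))
    centers, radii = [p], [r]
    for q in excluded:
        # The two candidates are r apart, so the one farther from q
        # is at distance at least r / 2 from q.
        right, left = p + r / 2, p - r / 2
        p = right if abs(right - q) >= abs(left - q) else left
        # New radius r / 8: then 2 * (r / 8) = r / 4 < r / 2 keeps q outside,
        # and r / 2 + r / 4 < r keeps the new ball inside the old one.
        r = r / 8
        centers.append(p)
        radii.append(r)
    return centers, radii


if __name__ == "__main__":
    # Rationals in [0, 1] with denominator at most 6 (repetitions are harmless).
    excluded = [Fraction(a, b) for b in range(1, 7) for a in range(b + 1)]
    centers, radii = nested_balls(Fraction(1, 2), Fraction(1, 10), excluded)
    print(centers[-1], radii[-1])
```

With an infinite enumeration of $\mathbb{Q}$ the centers would converge to a point of $\bigcap G_n = \mathbb{Q}^c$ inside $M_\epsilon(p_0)$; the code can only carry out finitely many steps.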
Figure 101 The closed neighborhoods $\overline{M}_{r_n}(p_n)$ nest down to a point.

33 Corollary No subset of a complete non-empty metric space is both thick and thin.
Proof If $S$ is both a thick and thin subset of $M$ then $M \setminus S$ is also both thick and thin. The intersection of two thick subsets of $M$ is thick, so $\emptyset = S \cap (M \setminus S)$ is a thick subset of $M$. By Baire's theorem, this empty set is dense in $M$, so $M$ is empty. □
To prove Theorem 32 we use two lemmas.

34 Lemma The set $PL$ of piecewise linear functions is dense in $C^0$.

Proof If $\phi : [a,b] \to \mathbb{R}$ is continuous and its graph consists of finitely many line segments in $\mathbb{R}^2$ then $\phi$ is piecewise linear. Let $f \in C^0$ and $\epsilon > 0$ be given. Since $[a,b]$ is compact, $f$ is uniformly continuous, and there is $\delta > 0$ such that $|t - s| < \delta$ implies $|f(t) - f(s)| < \epsilon$. Choose $n > (b - a)/\delta$ and partition $[a,b]$ into $n$ equal subintervals $I_i = [x_{i-1}, x_i]$, each of length $< \delta$. Let $\phi : [a,b] \to \mathbb{R}$ be the piecewise linear function whose graph consists of the segments joining the points $(x_{i-1}, f(x_{i-1}))$ and $(x_i, f(x_i))$ on the graph of $f$. See Figure 102. The value of $\phi(t)$ for $t \in I_i$ lies between $f(x_{i-1})$ and $f(x_i)$. Both these numbers differ from $f(t)$ by less than $\epsilon$. Hence, for all $t \in [a,b]$,

$$|f(t) - \phi(t)| < \epsilon.$$

In other words, $d(f, \phi) < \epsilon$ and $PL$ is dense in $C^0$. □

Figure 102 The piecewise linear function $\phi$ approximates the continuous function $f$.
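The construction in this proof is easy to carry out numerically. The sketch below is a minimal illustration with hypothetical helper names: it builds the interpolant $\phi$ on $n > (b - a)/\delta$ equal subintervals and estimates $d(f, \phi)$ on a finite sample grid (so the computed value is only a lower estimate of the true sup distance). The example uses $f = \sin$, which is 1-Lipschitz, so $\delta = \epsilon$ is an admissible choice.

```python
import math


def piecewise_linear(f, a, b, n):
    """Interpolant of f at the n + 1 equally spaced nodes of [a, b]."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    ys = [f(x) for x in xs]

    def phi(t):
        # Index of a subinterval [x_i, x_{i+1}] containing t, clamped to range.
        i = min(max(int((t - a) / (b - a) * n), 0), n - 1)
        w = (t - xs[i]) / (xs[i + 1] - xs[i])
        return (1 - w) * ys[i] + w * ys[i + 1]

    return phi


def sampled_sup_distance(f, g, a, b, samples=5000):
    """Maximum of |f - g| over a uniform sample of [a, b]."""
    points = (a + (b - a) * j / samples for j in range(samples + 1))
    return max(abs(f(t) - g(t)) for t in points)


def approximate(f, a, b, delta):
    """Pick the smallest integer n > (b - a) / delta and return (phi, n)."""
    n = math.floor((b - a) / delta) + 1
    return piecewise_linear(f, a, b, n), n


if __name__ == "__main__":
    eps = 0.01
    phi, n = approximate(math.sin, 0.0, math.pi, delta=eps)
    print(n, sampled_sup_distance(math.sin, phi, 0.0, math.pi))
```

For a function whose modulus of uniform continuity is not known in closed form, `delta` would have to be supplied from a separate estimate; the code does not compute it.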
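35 Lemma If $\phi \in PL$ and $\epsilon > 0$ are given then there exists a sawtooth function $\sigma$ such that $\|\sigma\| \le \epsilon$, $\sigma$ has period $\le \epsilon$, and

$$\min\{|\text{slope } \sigma|\} > \max\{|\text{slope } \phi|\} + \frac{1}{\epsilon}.$$

Proof Let $\theta = \max\{|\text{slope } \phi|\}$ and choose $c$ large. The compressed sawtooth $\sigma(x) = \epsilon \sigma_0(cx)$ has $\|\sigma\| = \epsilon$, period $\pi = 2/c$ (since $\sigma_0$ has period 2), and slope $s = \pm \epsilon c$. When $c$ is large, $\pi \le \epsilon$ and $|s| > \theta + 1/\epsilon$. □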
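How large $c$ must be can be made explicit. The sketch below is illustrative only and assumes, as in the Weierstrass construction, that $\sigma_0$ is the sawtooth with period 2, slope $\pm 1$, and sup norm 1; under that assumption $\sigma(x) = \epsilon \sigma_0(cx)$ has period $2/c$ and slope $\pm \epsilon c$. The function names are hypothetical.

```python
from fractions import Fraction


def sigma0(x):
    """Assumed base sawtooth: period 2, slope +-1, values in [0, 1]."""
    y = x % 2
    return y if y <= 1 else 2 - y


def choose_compression(theta, eps):
    """One admissible c: period 2/c <= eps and slope eps * c > theta + 1/eps."""
    return max(2 / eps, (theta + 1 / eps) / eps) + 1


def sawtooth(eps, c):
    """The compressed sawtooth x -> eps * sigma0(c * x)."""
    return lambda x: eps * sigma0(c * x)


if __name__ == "__main__":
    theta, eps = Fraction(5), Fraction(1, 10)   # max |slope phi| and tolerance
    c = choose_compression(theta, eps)
    print("c =", c, " period =", 2 / c, " slope =", eps * c)
```

Any larger value of $c$ works equally well; the formula in `choose_compression` is just one convenient choice.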
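Proof of Theorem 32 For $n \in \mathbb{N}$ define

$$\begin{aligned}
R_n &= \left\{ f \in C^0 : \forall x \in \left[a, b - \tfrac{1}{n}\right] \ \exists h > 0 \text{ such that } \left| \frac{\Delta f}{h} \right| > n \right\} \\
L_n &= \left\{ f \in C^0 : \forall x \in \left[a + \tfrac{1}{n}, b\right] \ \exists h < 0 \text{ such that } \left| \frac{\Delta f}{h} \right| > n \right\} \\
G_n &= \left\{ f \in C^0 : f \text{ restricted to any interval of length } \tfrac{1}{n} \text{ is non-monotone} \right\},
\end{aligned}$$

where $\Delta f = f(x + h) - f(x)$. We claim that each of these sets is open dense in $C^0$.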
For denseness it is enough to prove that the closure of each contains $PL$, since by Lemma 34 the closure of $PL$ is $C^0$. Let $\phi \in PL$ and $\epsilon > 0$ be given. According to Lemma 35, there is a sawtooth function $\sigma$ such that $\|\sigma\| \le \epsilon$, $\sigma$ has period $< 1/n$, and

$$\min\{|\text{slope } \sigma|\} > \max\{|\text{slope } \phi|\} + n.$$

Consider the piecewise linear function $f = \phi + \sigma$. Its slopes are dominated by those of $\sigma$, and so they alternate in sign with period $< 1/(2n)$. At any $x \in [a, b - 1/n]$ there is a rightward slope either $> n$ or $< -n$. Thus $f \in R_n$. Similarly $f \in L_n$. Any interval $I$ of length $1/n$ contains in its interior a maximum or minimum of $\sigma$, and so it contains a subinterval on which $f$ strictly increases, and another on which $f$ strictly decreases. Thus, $f \in G_n$. Since $d(f, \phi) \le \epsilon$ is arbitrarily small this shows that $R_n$, $L_n$, and $G_n$ are dense in $C^0$.

Next suppose that $f \in R_n$ is given. For each $x \in [a, b - 1/n]$ there is an $h = h(x) > 0$ such that

$$\left| \frac{f(x + h) - f(x)}{h} \right| > n.$$
Since $f$ is continuous, there is a neighborhood $T_x$ of $x$ in $[a,b]$ and a constant $\nu = \nu(x) > 0$ such that this same $h$ yields

$$\left| \frac{f(t + h) - f(t)}{h} \right| > n + \nu$$

for all $t \in T_x$. Since $[a, b - 1/n]$ is compact, finitely many of these neighborhoods $T_x$ cover it, say $T_{x_1}, \dots, T_{x_m}$. Continuity of $f$ implies that for all $t \in \overline{T}_{x_i}$,

$$\left| \frac{f(t + h_i) - f(t)}{h_i} \right| \ge n + \nu_i, \tag{25}$$

where $h_i = h(x_i)$ and $\nu_i = \nu(x_i)$. These $m$ inequalities for points $t$ in the $m$ sets $\overline{T}_{x_i}$ remain nearly valid if $f$ is replaced by a function $g$ with $d(f, g)$ small enough; (25) becomes

$$\left| \frac{g(t + h_i) - g(t)}{h_i} \right| > n, \tag{26}$$

which means that $g \in R_n$ and $R_n$ is open in $C^0$. Similarly $L_n$ is open in $C^0$.
Checking that $G_n$ is open is easier. If $(f_k)$ is a sequence of functions in $G_n^c$ and $f_k \rightrightarrows f$ then we must show that $f \in G_n^c$. Each $f_k$ is monotone on some interval $I_k$ of length $1/n$. There is a subsequence of these intervals that converges to a limit interval $I$. Its length is $1/n$ and by uniform convergence, $f$ is monotone on $I$. Hence $G_n^c$ is closed and $G_n$ is open. Each set $R_n$, $L_n$, $G_n$ is open dense in $C^0$.

Finally, if $f$ belongs to the thick set

$$\bigcap_{n=1}^{\infty} (R_n \cap L_n \cap G_n)$$

then for each $x \in [a,b]$ there is a sequence $h_n \neq 0$ such that

$$\left| \frac{f(x + h_n) - f(x)}{h_n} \right| > n.$$

The numerator of this fraction is at most $2\|f\|$, so $h_n \to 0$ as $n \to \infty$. Thus $f$ is not differentiable at $x$. Also, $f$ is non-monotone on every interval of length $1/n$. Since each interval $J$ contains an interval of length $1/n$ when $n$ is large enough, $f$ is non-monotone on $J$. □

Further generic properties of continuous functions have been studied, and you might read about them in the books A Primer of Real Functions by Ralph Boas, Differentiation of Real Functions by Andrew Bruckner, or A Second Course in Real Functions by van Rooij and Schikhof.
8* Spaces of Unbounded Functions
How important is it that the functions we deal with are bounded, or have domain $[a,b]$ and target $\mathbb{R}$? To some extent we can replace $[a,b]$ with a metric space $X$ and $\mathbb{R}$ with a complete metric space $Y$. Let $\mathcal{F}$ denote the set of all functions $f : X \to Y$. Recall from Exercise 2.84 that the metric $d_Y$ on $Y$ gives rise to a bounded metric

$$\rho(y, y') = \frac{d_Y(y, y')}{1 + d_Y(y, y')},$$

where $y, y' \in Y$. Note that $\rho < 1$. Convergence and Cauchyness with respect to $\rho$ and $d_Y$ are equivalent. Thus completeness of $Y$ with respect to $d_Y$ implies completeness with respect to $\rho$. In the same way we give $\mathcal{F}$ the metric

$$d(f, g) = \sup_{x \in X} \frac{d_Y(f(x), g(x))}{1 + d_Y(f(x), g(x))}.$$
A function $f \in \mathcal{F}$ is bounded with respect to $d_Y$ if and only if for any constant function $c$, $\sup_x d_Y(f(x), c) < \infty$; i.e., $d(f, c) < 1$. Unbounded functions have $d(f, c) = 1$.
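The dichotomy between $d(f, c) < 1$ and $d(f, c) = 1$ can be seen in a small numerical experiment. The sketch below is an illustration only: it takes $X = Y = \mathbb{R}$ with $d_Y(y, y') = |y - y'|$ (an assumed special case), and it evaluates the supremum over a finite sample, so the computed number is a lower estimate of $d(f, g)$. The helper names are hypothetical.

```python
import math


def rho(dist):
    """Bounded metric value dist / (1 + dist) for a distance dist >= 0."""
    return dist / (1.0 + dist)


def sampled_d(f, g, points):
    """Lower estimate of d(f, g) = sup_x rho(|f(x) - g(x)|) over sample points."""
    return max(rho(abs(f(x) - g(x))) for x in points)


def zero(x):
    """The constant function c = 0."""
    return 0.0


if __name__ == "__main__":
    # Bounded case: sup |sin| = 1, so d(sin, 0) = 1 / (1 + 1) = 1/2.
    grid = [k * math.pi / 2 for k in range(-20, 21)]
    print(sampled_d(math.sin, zero, grid))

    # Unbounded case: the estimates creep up toward 1 but never reach it.
    for top in (1, 3, 6):
        reach = [10.0 ** k for k in range(top + 1)]
        print(top, sampled_d(lambda x: x, zero, reach))
```

For the unbounded function $f(x) = x$ no single sample attains the value 1; only the supremum over all of $\mathbb{R}$ equals 1, which is why $d(f, c) = 1$ characterizes unboundedness.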
36 Theorem In the space $\mathcal{F}$ equipped with the metric $d$,
(a) Uniform convergence of $(f_n)$ is equivalent to $d$-convergence.
(b) Completeness of $Y$ implies completeness of $\mathcal{F}$.
(c) The set $\mathcal{F}_b$ of bounded functions is closed in $\mathcal{F}$.
(d) The set $C^0(X, Y)$ of continuous functions is closed in $\mathcal{F}$.

Proof (a) $f = \operatorname{unif\,lim}_{n \to \infty} f_n$ means that $d_Y(f_n(x), f(x)) \rightrightarrows 0$, which means that $d(f_n, f) \to 0$.
(b) If $(f_n)$ is Cauchy in $\mathcal{F}$ and $Y$ is complete then, just as in Section 1, $f(x) = \lim_{n \to \infty} f_n(x)$ exists for each $x \in X$. Cauchyness with respect to the metric $d$ implies uniform convergence and thus $d(f_n, f) \to 0$.
(c) If $f_n \in \mathcal{F}_b$ and $d(f_n, f) \to 0$ then $\sup_x d_Y(f_n(x), f(x)) \to 0$. Since $f_n$ is bounded, so is $f$.
(d) The proof that $C^0$ is closed in $\mathcal{F}$ is the same as in Section 1. □
The Arzelà-Ascoli theorem is trickier. A family $\mathcal{E} \subset \mathcal{F}$ is uniformly equicontinuous if for each $\epsilon > 0$ there is a $\delta > 0$ such that $f \in \mathcal{E}$ and $d_X(x, t) < \delta$ imply $d_Y(f(x), f(t)) < \epsilon$. If the $\delta$ depends on $x$ but not on $f \in \mathcal{E}$ then $\mathcal{E}$ is pointwise equicontinuous.

37 Theorem Pointwise equicontinuity implies uniform equicontinuity if $X$ is compact.

Proof Suppose not. Then there exists $\epsilon > 0$ such that for each $\delta = 1/n$ we have points $x_n, t_n \in X$ and functions $f_n \in \mathcal{E}$ with $d_X(x_n, t_n) < 1/n$ and $d_Y(f_n(x_n), f_n(t_n)) \ge \epsilon$. By compactness of $X$ we may assume that $x_n \to x_0$. Then $t_n \to x_0$, which leads to a contradiction of pointwise equicontinuity at $x_0$. □
38 Theorem If the sequence of functions $f_n : X \to Y$ is uniformly equicontinuous, $X$ is compact, and for each $x \in X$, $(f_n(x))$ lies in a compact subset of $Y$, then $(f_n)$ has a uniformly convergent subsequence.

Proof Being compact, $X$ has a countable dense subset $D$. Then the proof of the Arzelà-Ascoli Theorem in Section 3 becomes a proof of Theorem 38. □
The space $X$ is $\sigma$-compact if it is a countable union of compact sets, $X = \bigcup X_i$. For example $\mathbb{Z}$,

39 Theorem If $X$ is $\sigma$-compact and if

Proof Express $X$ as $\bigcup X_i$ with $X_i$ compact. By Theorem 37 $(f_n|_{X_i})$ is uniformly equicontinuous and by Theorem 38 there is a subsequence $f_{1,n}$ that converges uniformly on $X_1$, and it has a sub-subsequence $f_{2,n}$ that converges uniformly on $X_2$, and so on. A diagonal subsequence $(g_m)$ converges uniformly on each $X_i$. Thus $(g_m)$ converges pointwise. If $A \subset X$ is compact, then $(g_m|_A)$ is uniformly equicontinuous and pointwise convergent. By the proof of the Arzelà-Ascoli propagation theorem, $(g_m|_A)$ converges uniformly. □

is a sequence of pointwise equicontinuous functions $\mathbb{R} \to \mathbb{R}$, and for some $x_0 \in \mathbb{R}$,

Proof Let $[a,b]$ be any interval containing $x_0$. By Theorem 37, the restrictions of $f_n$ to $[a,b]$ are uniformly equicontinuous, and there is a $\delta > 0$ such that if $t, s \in [a,b]$ then $|t - s| < \delta$ implies that $|f_n(t) - f_n(s)| < 1$. Each point $x \in [a,b]$ can be reached in $\le N$ steps of length $< \delta$, starting at $x_0$, if $N > (b - a)/\delta$. Thus $|f_n(x)| \le |f_n(x_0)| + N$, and $(f_n(x))$ is bounded for each $x \in \mathbb{R}$. A bounded subset of $\mathbb{R}$ has compact closure and Theorem 39 gives the corollary. □
Exercises

In these exercises, $C^0 = C^0([a,b], \mathbb{R})$ is the space of continuous real valued functions defined on the closed interval $[a,b]$. It is equipped with the sup norm, $\|f\| = \sup\{|f(x)| : x \in [a,b]\}$.

1. Let $M$, $N$ be metric spaces.
   (a) Formulate the concepts of pointwise convergence and uniform convergence for sequences of functions $f_n : M \to N$.
   (b) For which metric spaces are the concepts equivalent?
2. Suppose that $f_n \rightrightarrows f$ where $f$ and $f_n$ are functions from the metric space $M$ to the metric space $N$. (Assume nothing about the metric spaces such as compactness, completeness, etc.) If each $f_n$ is continuous prove that $f$ is continuous. [Hint: Review the proof of Theorem 1.]
3. Let $f_n : [a,b] \to \mathbb{R}$ be a sequence of piecewise continuous functions, each of which is continuous at the point $x_0 \in [a,b]$. Assume that $f_n \rightrightarrows f$ as $n \to \infty$.
   (a) Prove that $f$ is continuous at $x_0$. [Hint: Review the proof of Theorem 1.]
   (b) Prove or disprove that $f$ is piecewise continuous.
4. (a) If $f_n : \mathbb{R} \to \mathbb{R}$ is uniformly continuous for each $n \in \mathbb{N}$ and if $f_n \rightrightarrows f$ as $n \to \infty$, prove or disprove that $f$ is uniformly continuous.
   (b) What happens for functions from one metric space to another instead of $\mathbb{R}$ to $\mathbb{R}$?
5. Suppose that $f_n : [a,b] \to \mathbb{R}$ and $f_n \rightrightarrows f$ as $n \to \infty$. Which of the following discontinuity properties (see Exercise 37 in Chapter 3) of the functions $f_n$ carries over to the limit function? (Prove or give a counter-example.)
   (a) No discontinuities.
   (b) At most ten discontinuities.
   (c) At least ten discontinuities.
   (d) Finitely many discontinuities.
   (e) Countably many discontinuities, all of jump type.
   (f) No jump discontinuities.
   (g) No oscillating discontinuities.
**6. (a) Prove that $C^0$ and $\mathbb{R}$ have equal cardinality. [Clearly there are at least as many functions as there are real numbers, for $C^0$ includes the constant functions. The issue is to show that there are no more continuous functions than there are real numbers.]
   (b) Is the same true if we replace $[a,b]$ with $\mathbb{R}$ or a separable metric space?
   (c) In the same vein, prove that the collection $\mathcal{T}$ of open subsets of $\mathbb{R}$ and $\mathbb{R}$ itself have equal cardinality.
7. Consider a sequence of functions $f_n$ in $C^0$. The graph $G_n$ of $f_n$ is a compact subset of $\mathbb{R}^2$.
   (a) Prove that $(f_n)$ converges uniformly as $n \to \infty$ if and only if the sequence $(G_n)$ in $\mathcal{K}(\mathbb{R}^2)$ converges to the graph of a function $f \in C^0$. (The space $\mathcal{K}$ was discussed in Exercise 2.124.)
   (b) Formulate equicontinuity in terms of graphs.
8. Is the sequence of functions $f_n : \mathbb{R} \to \mathbb{R}$ defined by

$$f_n(x) = \cos(n + x) + \log\left(1 + \frac{1}{\sqrt{n+2}} \sin^2(n^n x)\right)$$

   equicontinuous? Prove or disprove.
9. If $f : \mathbb{R} \to \mathbb{R}$ is continuous and the sequence $f_n(x) = f(nx)$ is equicontinuous, what can be said about $f$?
10. Give an example to show that a sequence of functions may be uniformly continuous, pointwise equicontinuous, but not uniformly equicontinuous, when their domain $M$ is noncompact.
11. If every sequence of pointwise equicontinuous functions $M \to \mathbb{R}$ is uniformly equicontinuous, does this imply that $M$ is compact?
12. Prove that if $\mathcal{E} \subset C^0(M, N)$ is equicontinuous then so is its closure.
13. Suppose that $(f_n)$ is a sequence of functions $\mathbb{R} \to \mathbb{R}$ and for each compact subset $K \subset \mathbb{R}$, the restricted sequence $(f_n|_K)$ is pointwise bounded and pointwise equicontinuous.
   (a) Does it follow that there is a subsequence of $(f_n)$ that converges pointwise to a continuous limit function $\mathbb{R} \to \mathbb{R}$?
   (b) What about uniform convergence?
14. Recall from Exercise 78 in Chapter 2 that a metric space $M$ is chain connected if for each $\epsilon > 0$ and each $p, q \in M$ there is a chain $p = p_0, \dots, p_n = q$ in $M$ such that

$$d(p_{k-1}, p_k) < \epsilon \quad \text{for } 1 \le k \le n.$$

   A family $\mathcal{F}$ of functions $f : M \to \mathbb{R}$ is bounded at $p \in M$ if the set $\{f(p) : f \in \mathcal{F}\}$ is bounded in $\mathbb{R}$. Show that $M$ is chain connected if and only if pointwise boundedness of an equicontinuous family at one point of $M$ implies pointwise boundedness at every point of $M$.
15. A continuous, strictly increasing function $\mu : (0, \infty) \to (0, \infty)$ is a modulus of continuity if $\mu(s) \to 0$ as $s \to 0$. A function $f : [a,b] \to \mathbb{R}$ has modulus of continuity $\mu$ if $|f(s) - f(t)| \le \mu(|s - t|)$.
   (a) Prove that a function is uniformly continuous if and only if it has a modulus of continuity.
   (b) Prove that a family of functions is equicontinuous if and only if its members have a common modulus of continuity.
16. Consider the modulus of continuity $\mu(s) = Ls$ where $L$ is a positive constant.
   (a) What is the relation between $C^\mu$ and the set of Lipschitz functions with Lipschitz constant $\le L$?
   (b) Replace $[a,b]$ with $\mathbb{R}$ and answer the same question.
   (c) Replace $[a,b]$ with $\mathbb{N}$ and answer the same question.
   (d) Formulate and prove a generalization of (a).
17. Consider a modulus of continuity $\mu(s) = Hs^\alpha$ where $0 < \alpha \le 1$ and $0 < H < \infty$. A function with this modulus of continuity is said to be $\alpha$-Hölder, with $\alpha$-Hölder constant $H$. See also Exercise 2 in Chapter 3.
   (a) Prove that the set $C^\alpha(H)$ of all continuous functions defined on $[a,b]$ which are $\alpha$-Hölder and have $\alpha$-Hölder constant $\le H$ is equicontinuous.
   (b) Replace $[a,b]$ with $(a,b)$. Is the same thing true?
   (c) Replace $[a,b]$ with $\mathbb{R}$. Is it true?
   (d) What about $\mathbb{Q}$?
   (e) What about $\mathbb{N}$?
18. Suppose that $(f_n)$ is an equicontinuous sequence in $C^0$ and $p \in [a,b]$ is given.
   (a) If $(f_n(p))$ is a bounded sequence of real numbers, prove that $(f_n)$ is uniformly bounded.
   (b) Reformulate the Arzelà-Ascoli Theorem with a weaker boundedness hypothesis.
   (c) Can $[a,b]$ be replaced with $(a,b)$? $\mathbb{Q}$? $\mathbb{R}$? $\mathbb{N}$?
   (d) What is the correct generalization?
19. If $M$ is compact and $A$ is dense in $M$, prove that for any $\delta > 0$ there is a finite subset $\{a_1, \dots, a_k\} \subset A$ which is $\delta$-dense in $M$ in the sense that each $x \in M$ lies within distance $\delta$ of at least one of the points $a_j$, $j = 1, \dots, k$.
20. Give an example of a sequence of smooth equicontinuous functions $f_n : [a,b] \to \mathbb{R}$ whose derivatives are unbounded.
21. Suppose that $(f_n)$ is an equicontinuous sequence of functions $f_n : \mathbb{R} \to \mathbb{R}$, such that $(f_n(0))$ is a bounded sequence in $\mathbb{R}$.
   (a) Does there exist a subsequence that converges uniformly? Prove or give a counter-example.
   (b) What if $\mathbb{R}^m$ replaces $\mathbb{R}$?
   (c) What about $\mathbb{N}$?
22. Suppose that $\mathcal{E} \subset C^0$ is equicontinuous and bounded.
   (a) Prove that $\sup\{f(x) : f \in \mathcal{E}\}$ is a continuous function of $x$.
   (b) Show that (a) fails without equicontinuity.
   (c) Show that this continuous sup-property does not imply equicontinuity.
   (d) Assume that the continuous sup-property is true for each subset $\mathcal{F} \subset \mathcal{E}$. Is $\mathcal{E}$ equicontinuous? Give a proof or counter-example.
23. Let $M$ be a compact metric space, and let $(i_n)$ be a sequence of isometries $i_n : M \to M$.
   (a) Prove that there exists a subsequence $i_{n_k}$ that converges to an isometry $i$ as $k \to \infty$.
   (b) Does the inverse isometry $i_{n_k}^{-1}$ converge to $i^{-1}$? (Proof or counter-example.)
   (c) Infer that the group of orthogonal $3 \times 3$ matrices is compact. [Hint: Each orthogonal $3 \times 3$ matrix defines an isometry of the unit 2-sphere to itself.]
   (d) How about the group of $m \times m$ orthogonal matrices?
24. Suppose that $f : M \to M$ is a contraction, but $M$ is not necessarily complete.
   (a) Prove that $f$ is uniformly continuous.
   (b) Why does (a) imply that $f$ extends uniquely to a continuous map $\hat{f} : \hat{M} \to \hat{M}$, where $\hat{M}$ is the completion of $M$?
   (c) Is $\hat{f}$ a contraction?
25. Give an example of a contraction of an incomplete metric space that has no fixed point.
26. Suppose that $f : M \to M$ and for all $x, y \in M$, if $x \neq y$ then $d(f(x), f(y)) < d(x, y)$. Such an $f$ is a weak contraction.
   (a) Is a weak contraction a contraction? (Proof or counter-example.)
   (b) If $M$ is compact is a weak contraction a contraction? (Proof or counter-example.)
   (c) If $M$ is compact, prove that a weak contraction has a unique fixed point.
27. Suppose that $f : \mathbb{R} \to \mathbb{R}$ is differentiable and its derivative satisfies $|f'(x)| < 1$ for all $x \in \mathbb{R}$.
   (a) Is $f$ a contraction?
   (b) A weak one?
   (c) Does it have a fixed point?
28. Give an example to show that the fixed point in Brouwer's Theorem need not be unique.
29. On page 222 it is shown that if $b_n \int_{-1}^{1} (1 - t^2)^n \, dt = 1$ then for some constant $c$, and for all $n \in \mathbb{N}$, $b_n \le c\sqrt{n}$. What is the best (i.e., smallest) value of $c$ that you can prove works? (A calculator might be useful here.)
30. Let $M$ be a compact metric space, and let $C^{\mathrm{Lip}}$ be the set of continuous functions $f : M \to \mathbb{R}$ that obey a Lipschitz condition: for some $L$ and all $p, q \in M$,

$$|f(p) - f(q)| \le L\, d(p, q).$$

   *(a) Prove that $C^{\mathrm{Lip}}$ is dense in $C^0(M, \mathbb{R})$. [Hint: Stone-Weierstrass.]
   ***(b) If $M = [a,b]$ and $\mathbb{R}$ is replaced by some other complete metric space, is the result true or false?
   ***(c) If $M$ is a general compact metric space and $Y$ is a complete metric space, is $C^{\mathrm{Lip}}(M, Y)$ dense in $C^0(M, Y)$? (Would $M$ equal to the Cantor set make a good test case?)
31. Consider the ODE $x' = x$ on $\mathbb{R}$. Show that its solution with initial condition $x_0$ is $t \mapsto e^t x_0$. Interpret $e^{t+s} = e^t e^s$ in terms of the flow property.
32. Consider the ODE $y' = 2\sqrt{|y|}$ where $y \in \mathbb{R}$.
   (a) Show that there are many solutions to this ODE, all with the same initial condition $y(0) = 0$. Not only does $y(t) = 0$ solve the ODE, but also $y(t) = t^2$ does for $t \ge 0$.
   (b) Find and graph other solutions such as $y(t) = 0$ for $t \le c$ and $y(t) = (t - c)^2$ for $t \ge c > 0$.
   (c) Does the existence of these non-unique solutions to the ODE contradict Picard's Theorem? Explain.
   *(d) Find all solutions with initial condition $y(0) = 0$.
33. Consider the ODE $x' = x^2$ on $\mathbb{R}$. Find the solution of the ODE with initial condition $x_0$. Are the solutions to this ODE defined for all time or do they escape to infinity in finite time?
34. Suppose that the ODE $x' = f(x)$ on $\mathbb{R}$ is bounded, $|f(x)| \le M$ for all $x$.
   (a) Prove that no solution of the ODE escapes to infinity in finite time.
   (b) Prove the same thing if $f$ satisfies a Lipschitz condition, or, more generally, if there are constants $C$, $K$ such that $|f(x)| \le C|x| + K$ for all $x$.
   (c) Repeat (a) and (b) with $\mathbb{R}^m$ in place of $\mathbb{R}$.
   (d) Prove that if $f : \mathbb{R}^m \to \mathbb{R}^m$ is uniformly continuous then the condition stated in (b) is true. Deduce that solutions of uniformly continuous ODE's defined on $\mathbb{R}^m$ do not escape to infinity in finite time.
**35. (a) Prove Borel's Lemma: given any sequence whatsoever of real numbers $(a_r)$ there is a smooth function $f : \mathbb{R} \to \mathbb{R}$ such that $f^{(r)}(0) = a_r$. [Hint: Try $f = \sum \beta_k(x) a_k x^k / k!$ where $\beta_k$ is a well chosen bump function.]
   (b) Infer that there are many Taylor series with radius of convergence $R = 0$.
   (c) Construct a smooth function whose Taylor series at every $x$ has radius of convergence $R = 0$. [Hint: Try $\sum \beta_k(x) e(x + q_k)$ where $\{q_1, q_2, \dots\} = \mathbb{Q}$.]
*36. Suppose that $T \subset (a,b)$ clusters at some point of $(a,b)$ and that $f, g : (a,b) \to \mathbb{R}$ are analytic. Assume that for all $t \in T$, $f(t) = g(t)$.
   (a) Prove that $f = g$ everywhere in $(a,b)$.
   (b) What if $f$ and $g$ are only $C^\infty$?
   (c) What if $T$ is an infinite set but its only cluster points are $a$ and $b$?
   **(d) Find a necessary and sufficient condition for a subset $Z \subset (a,b)$ to be the zero locus of an analytic function $f$ defined on $(a,b)$, $Z = \{x \in (a,b) : f(x) = 0\}$. [Hint: Think Taylor. The result in (a) is known as the Identity Theorem. It states that if an equality between analytic functions is known to hold for points of $T$ then it is an identity, an equality that holds everywhere.]
37. Let $M$ be any metric space with metric $d$. Fix a point $p \in M$ and for each $q \in M$ define the function $f_q(x) = d(q, x) - d(p, x)$.
   (a) Prove that $f_q$ is a bounded, continuous function of $x \in M$, and that the map $q \mapsto f_q$ sends $M$ isometrically onto a subset $M_0$ of $C^0(M, \mathbb{R})$.
   (b) Since $C^0(M, \mathbb{R})$ is complete, infer that an isometric copy of $M$ is dense in a complete metric space, the closure of $M_0$, and hence that we have a second proof of the Completion Theorem 2.76.
38. As explained in Section 8, a metric space $M$ is $\sigma$-compact if it is the countable union of compact subsets, $M = \bigcup M_i$.
   (a) Why is it equivalent to require that $M$ is the monotone union of compact subsets, $M = \bigcup M_i$, i.e., $M_1 \subset M_2 \subset \cdots$?
   (b) Prove that a $\sigma$-compact metric space is separable.
   (c) Prove that $\mathbb{Z}$,
   (f) Assume that $M$ is $\sigma^*$-compact, $M = \bigcup \operatorname{int}(M_i)$ with each $M_i$ compact. Prove that this monotone union engulfs all compacts in $M$, in the sense that if $A \subset M$ is compact, then for some $i$, $A \subset M_i$.
   (g) If $M = \bigcup M_i$ and each $M_i$ is compact show by example that this engulfing property may fail, even when $M$ itself is compact.
   **(h) Prove or disprove that a complete $\sigma$-compact metric space is $\sigma^*$-compact.
39. (a) Give an example of a function $f : [0,1] \times [0,1] \to \mathbb{R}$ such that for each fixed $x$, $y \mapsto f(x, y)$ is a continuous function of $y$, and for each fixed $y$, $x \mapsto f(x, y)$ is a continuous function of $x$, but $f$ is not continuous.
   (b) Suppose in addition that the set of functions

$$\mathcal{E} = \{x \mapsto f(x, y) : y \in [0,1]\}$$

   is equicontinuous. Prove that $f$ is continuous.
40. Prove that $\mathbb{R}$ cannot be expressed as the countable union of Cantor sets.
41. What is the joke in the following picture?
[Picture; surviving labels: e, y' = ll(y)]
More Pre-lim Problems
1. Let $f$ and $f_n$, $n \in \mathbb{N}$, be functions from $\mathbb{R}$ to $\mathbb{R}$. Assume that $f_n(x_n) \to f(x)$ as $n \to \infty$ and $x_n \to x$. Show that $f$ is continuous. [Hint: Equicontinuity is irrelevant since the functions $f_n$ are not assumed to be continuous.]
2. Suppose that $f_n \in C^0$ and for each $x \in [a,b]$,

$$f_1(x) \ge f_2(x) \ge \cdots,$$

   and $\lim_{n \to \infty} f_n(x) = 0$. Is the sequence equicontinuous? Give a proof or counter-example. [Hint: Does $f_n(x)$ converge uniformly to $0$, or does it not?]
3. Let $E$ be the set of all functions $u : [0,1] \to \mathbb{R}$ such that $u(0) = 0$ and $u$ satisfies a Lipschitz condition with Lipschitz constant 1. Define

$$\phi(u) = \int_0^1 \left( u(x)^2 - u(x) \right) dx.$$

   Prove that there exists a function $u \in E$ at which
5. Let $(P_n)$ be a sequence of real polynomials of degree $\le 10$. Suppose that $P_n(x)$ converges pointwise to $0$ as $n \to \infty$ and $x \in [0,1]$. Prove that $P_n(x)$ converges uniformly to $0$.
6. Let $(a_n)$ be a sequence of nonzero real numbers. Prove that the sequence of functions
   has a subsequence converging to a continuous function.
7. Suppose that $f : \mathbb{R} \to \mathbb{R}$ is differentiable, $f(0) = 0$, and $f'(x) > f(x)$ for all $x \in \mathbb{R}$. Prove that $f(x) > 0$ for all $x > 0$.
8. Suppose that $f : [a,b] \to \mathbb{R}$ and the limits of $f(x)$ from the left and the right exist at all points of $[a,b]$. Prove that $f$ is Riemann integrable.
9. Let $h : [0,1) \to \mathbb{R}$ be a uniformly continuous function where $[0,1)$ is the half open interval. Prove that there is a unique continuous map $g : [0,1] \to \mathbb{R}$ such that $g(x) = h(x)$ for all $x \in [0,1)$.
10. Assume that $f : \mathbb{R} \to \mathbb{R}$ is uniformly continuous. Prove that there are constants $A$, $B$ such that $|f(x)| \le A + B|x|$ for all $x \in \mathbb{R}$.
11. Suppose that $f(x)$ is defined on $[-1, 1]$ and that its third derivative exists and is continuous. (That is, $f$ is of class $C^3$.) Prove that the series

$$\sum_{n=1}^{\infty} \left( n \left( f(1/n) - f(-1/n) \right) - 2f'(0) \right)$$

   converges.
12. Let $A \subset \mathbb{R}^m$ be compact, $x \in A$. Let $(x_n)$ be a sequence in $A$ such that every convergent subsequence of $(x_n)$ converges to $x$.
   (a) Prove that the sequence $(x_n)$ converges.
   (b) Give an example to show if $A$ is not compact, the result in (a) is not necessarily true.
13. Let $f : [0,1] \to \mathbb{R}$ be continuously differentiable, with $f(0) = 0$. Prove that

$$\|f\|^2 \le \int_0^1 (f'(x))^2 \, dx$$

   where $\|f\| = \sup\{|f(t)| : 0 \le t \le 1\}$.
14. Let $f_n : \mathbb{R} \to \mathbb{R}$ be differentiable functions, $n = 1, 2, \dots$ with $f_n(0) = 0$ and $|f_n'(x)| \le 2$ for all $n$, $x$. Suppose that

$$\lim_{n \to \infty} f_n(x) = g(x)$$

   for all $x$. Prove that $g$ is continuous.
15. Let $X$ be a non-empty connected set of real numbers. If every element of $X$ is rational, prove that $X$ has only one element.
16. Let $k \ge 0$ be an integer and define a sequence of maps $f_n : \mathbb{R} \to \mathbb{R}$ as

$$f_n(x) = \frac{x^k}{x^2 + n}, \quad n = 1, 2, \dots.$$

   For which values of $k$ does the sequence converge uniformly on $\mathbb{R}$? On every bounded subset of $\mathbb{R}$?
17. Let $f : [0,1] \to \mathbb{R}$ be Riemann integrable over $[b, 1]$ for every $b$ such that $0 < b \le 1$.
   (a) If $f$ is bounded, prove that $f$ is Riemann integrable over $[0,1]$.
   (b) What if $f$ is not bounded?
18. (a) Let $S$ and $T$ be connected subsets of the plane $\mathbb{R}^2$ having a point in common. Prove that $S \cup T$ is connected.
   (b) Let $\{S_\alpha\}$ be a family of connected subsets of $\mathbb{R}^2$ all containing the origin. Prove that $\bigcup S_\alpha$ is connected.
19. Let $f : \mathbb{R} \to \mathbb{R}$ be continuous. Suppose that $\mathbb{R}$ contains a countably infinite set $S$ such that

$$\int_p^q f(x) \, dx = 0$$

   if $p$ and $q$ are not in $S$. Prove that $f$ is identically zero.
20. Let $f : \mathbb{R} \to \mathbb{R}$ satisfy $f(x) \le f(y)$ for $x \le y$. Prove that the set where $f$ is not continuous is finite or countably infinite.
21. Let $(g_n)$ be a sequence of Riemann integrable functions from $[0,1]$ into $\mathbb{R}$ such that $|g_n(x)| \le 1$ for all $n$, $x$. Define

$$G_n(x) = \int_0^x g_n(t) \, dt.$$

   Prove that a subsequence of $(G_n)$ converges uniformly.
22. Prove that every compact metric space has a countable dense subset.
23. Show that for any continuous function $f : [0,1] \to \mathbb{R}$ and for any $\epsilon > 0$ there is a function of the form

$$g(x) = \sum_{k=0}^{n} C_k x^k$$

   for some $n \in \mathbb{N}$, and $|g(x) - f(x)| < \epsilon$ for all $x$ in $[0,1]$.
24. Give an example of a function $f : \mathbb{R} \to \mathbb{R}$ having all three of the following properties:
   (a) $f(x) = 0$ for all $x < 0$ and $x > 2$.
   (b) $f'(1) = 1$,
   (c) $f$ has derivatives of all orders.
25. (a) Give an example of a differentiable function $f : \mathbb{R} \to \mathbb{R}$ whose derivative is not continuous.
   (b) Let $f$ be as in (a). If $f'(0) < 2 < f'(1)$ prove that $f'(x) = 2$ for some $x \in [0,1]$.
26. Let $U \subset \mathbb{R}^m$ be an open set. Suppose that the map $h : U \to \mathbb{R}^m$ is a homeomorphism from $U$ onto $\mathbb{R}^m$ which is uniformly continuous. Prove that $U = \mathbb{R}^m$.
27. Let $(f_n)$ be a sequence of continuous maps $[0,1] \to \mathbb{R}$ such that

$$\int_0^1 (f_n(y))^2 \, dy \le 5$$

   for all $n$. Define $g_n : [0,1] \to \mathbb{R}$ by

$$g_n(x) = \int_0^1 \sqrt{x + y} \, f_n(y) \, dy.$$

   (a) Find a constant $K \ge 0$ such that $|g_n(x)| \le K$ for all $n$.
   (b) Prove that a subsequence of the sequence $(g_n)$ converges uniformly.
28. Consider the following properties of a map $f : \mathbb{R}^m \to \mathbb{R}$.
   (a) $f$ is continuous.
   (b) The graph of $f$ is connected in $\mathbb{R}^m \times \mathbb{R}$.
   Prove or disprove the implications (a) $\Rightarrow$ (b), (b) $\Rightarrow$ (a).
29. Let $(P_n)$ be a sequence of real polynomials of degree $\le 10$. Suppose that

$$\lim_{n \to \infty} P_n(x) = 0$$

   for all $x \in [0,1]$. Prove that $P_n(x) \rightrightarrows 0$, $0 \le x \le 1$. What can you say about $P_n(x)$ for $4 \le x \le 5$?
30. Give an example of a subset of $\mathbb{R}$ having uncountably many connected components. Can such a subset be open? Closed? Does your answer change if $\mathbb{R}^2$ replaces $\mathbb{R}$?
31. For each $(a, b, c) \in \mathbb{R}^3$ consider the series
   Determine the values of $a$, $b$, and $c$ for which the series converges absolutely, converges conditionally, diverges.
262
Function Spaces
Chapter 4
32. Let $X$ be a compact metric space and $f : X \to X$ an isometry. (That is, $d(f(x), f(y)) = d(x, y)$ for all $x, y \in X$.) Prove that $f(X) = X$.
33. Prove or disprove:
$$\int_{-\infty}^{\infty} |f(x)|\,dx < \infty.$$
Show that there is a sequence $(x_n)$ in $\mathbb{R}$ such that $x_n \to \infty$, $x_n f(x_n) \to 0$, and $x_n f(-x_n) \to 0$ as $n \to \infty$.
35. Let $f : [0,1] \to \mathbb{R}$ be a continuous function. Evaluate the following limits (with proof)
(a) $\displaystyle \lim_{n \to \infty} \int_0^1 x^n f(x)\,dx$
(b) $\displaystyle \lim_{n \to \infty} n \int_0^1 x^n f(x)\,dx$.
36. Let $K$ be an uncountable subset of $\mathbb{R}^m$. Prove that there is a sequence of distinct points in $K$ which converges to some point of $K$.
37. Let $(g_n)$ be a sequence of twice differentiable functions on $[0,1]$ such that for all $n$, $g_n(0) = 0$ and $g_n'(0) = 0$. Suppose that $|g_n''(x)| \le 1$ for all $n, x$. Prove that there is a subsequence of $(g_n)$ which converges uniformly on $[0,1]$.
38. Prove or give a counter-example: Every connected locally pathwise connected set in $\mathbb{R}^m$ is pathwise connected.
39. Let $(f_n)$ be a sequence of continuous functions $[0,1] \to \mathbb{R}$ such that $f_n(x) \to 0$ for each $x \in [0,1]$. Suppose that
$$\left| \int_0^1 f_n(x)\,dx \right| \le K$$
for all $n$ where $K$ is a constant. Does $\int_0^1 f_n(x)\,dx$ converge to 0 as $n \to \infty$? Prove or give a counter-example.
40. Let $E$ be a closed, bounded, and non-empty subset of $\mathbb{R}^m$ and let $f : E \to E$ be a function satisfying $|f(x) - f(y)| < |x - y|$ for all $x, y \in E$, $x \ne y$. Prove that there is one and only one point $x_0 \in E$ such that $f(x_0) = x_0$.
41. Let $f : [0, 2\pi] \to \mathbb{R}$ be a continuous function such that
$$\int_0^{2\pi} f(x) \sin(nx)\,dx = 0$$
for all integers $n \ge 1$. Prove that $f$ is identically zero.
42. Let $E$ be the set of all real valued functions $u : [0,1] \to \mathbb{R}$ satisfying $u(0) = 0$ and $|u(x) - u(y)| \le |x - y|$ for all $x, y \in [0,1]$. Prove that the function
$$u \mapsto \int_0^1 \bigl((u(x))^2 - u(x)\bigr)\,dx$$
achieves its maximum value at some element of $E$.
43. Let $f_1, f_2, \dots$ be continuous real valued functions on $[0,1]$ such that for each $x \in [0,1]$, $f_1(x) \ge f_2(x) \ge \dots$. Assume that for each $x$, $f_n(x)$ converges to 0 as $n \to \infty$. Does $f_n$ converge uniformly to 0? Give a proof or counter-example.
44. Let $f : [0, \infty) \to [0, \infty)$ be a monotonically decreasing function with
$$\int_0^{\infty} f(x)\,dx < \infty.$$
Prove that $\lim_{x \to \infty} x f(x) = 0$.
45. Suppose that $F : \mathbb{R}^m \to \mathbb{R}^m$ is continuous and satisfies
$$|F(x) - F(y)| \ge \lambda |x - y|$$
for all $x, y \in \mathbb{R}^m$ and some constant $\lambda > 0$. Prove that $F$ is one-to-one, onto, and it has a continuous inverse.
46. Show that $[0,1]$ cannot be written as a countably infinite union of disjoint closed sub-intervals.
47. Prove that a continuous function $f : \mathbb{R} \to \mathbb{R}$ which sends open sets to open sets must be monotonic.
48. Let $f : [0, \infty) \to \mathbb{R}$ be uniformly continuous and assume that
$$\lim_{b \to \infty} \int_0^b f(x)\,dx$$
exists (as a finite limit). Prove that $\lim_{x \to \infty} f(x) = 0$.
49. Prove or supply a counter-example: If $f$ and $g$ are continuously differentiable functions defined on the interval $0 < x < 1$ which satisfy the conditions
$$\lim_{x \to 0} f(x) = 0 = \lim_{x \to 0} g(x) \quad \text{and} \quad \lim_{x \to 0} \frac{f(x)}{g(x)} = c,$$
and if $g$ and $g'$ never vanish, then $\lim_{x \to 0} \frac{f'(x)}{g'(x)} = c$. (This is a converse of L'Hospital's rule.)
50. Prove or provide a counter-example: If the function $f$ from $\mathbb{R}$ to $\mathbb{R}$ has both a left and a right limit at each point of $\mathbb{R}$, then the set of discontinuities is at most countable.
51. Prove or supply a counter-example: If $f$ is a non-decreasing real valued function on $[0,1]$ then there is a sequence $f_n$, $n = 1, 2, \dots$, of continuous functions on $[0,1]$ such that for each $x$ in $[0,1]$, $\lim_{n \to \infty} f_n(x) = f(x)$.
52. Show that if $f$ is a homeomorphism of $[0,1]$ onto itself then there is a sequence of polynomials $P_n(x)$, $n = 1, 2, \dots$, such that $P_n \to f$ uniformly on $[0,1]$ and each $P_n$ is a homeomorphism of $[0,1]$ onto itself. [Hint: First assume that $f$ is $C^1$.]
53. Let $f$ be a $C^2$ function on the real line. Assume that $f$ is bounded with bounded second derivative. Let $A = \sup_x |f(x)|$ and $B = \sup_x |f''(x)|$. Prove that
$$\sup_x |f'(x)| \le 2\sqrt{AB}.$$
54. Let $f$ be continuous on $\mathbb{R}$ and let
$$f_n(x) = \frac{1}{n} \sum_{k=0}^{n-1} f\!\left(x + \frac{k}{n}\right).$$
Prove that $f_n(x)$ converges uniformly to a limit on every finite interval $[a, b]$.
55. Let $f$ be a real valued continuous function on the compact interval $[a, b]$. Given $\epsilon > 0$, show that there is a polynomial $p$ such that
$$p(a) = f(a), \quad p'(a) = 0, \quad \text{and} \quad |p(x) - f(x)| < \epsilon$$
for all $x \in [a, b]$.
56. A function $f : [0,1] \to \mathbb{R}$ is said to be upper semicontinuous if, given $x \in [0,1]$ and $\epsilon > 0$, there exists a $\delta > 0$ such that $|y - x| < \delta$ implies that $f(y) < f(x) + \epsilon$. Prove that an upper semicontinuous function $f$ on $[0,1]$ is bounded above and attains its maximum value at some point $p \in [0,1]$.
57. Let $f$ and $f_n$ be functions from $\mathbb{R}$ to $\mathbb{R}$. Assume that $f_n(x_n) \to f(x)$ as $n \to \infty$ whenever $x_n \to x$. Prove that $f$ is continuous. (Note: the functions $f_n$ are not assumed to be continuous.)
58. Let $f(x)$, $0 \le x \le 1$, be a continuous real function with continuous derivative $f'(x)$. Let $M$ be the supremum of $|f'(x)|$, $0 \le x < 1$. Prove: for $n = 1, 2, \dots$
$$\left| \frac{1}{n} \sum_{k=0}^{n-1} f\!\left(\frac{k}{n}\right) - \int_0^1 f(x)\,dx \right| \le \frac{M}{2n}.$$
59. Let $K$ be a compact subset of $\mathbb{R}^m$ and let $(B_i)$ be a sequence of open balls which cover $K$. Prove that there is an $\epsilon > 0$ such that each $\epsilon$-ball centered at a point of $K$ is contained in at least one of the balls $B_i$.
60. Let $f$ be a continuous real-valued function on $[0, \infty)$ such that
$$\lim_{x \to \infty} \left( f(x) + \int_0^x f(t)\,dt \right)$$
exists (and is finite). Prove that $\lim_{x \to \infty} f(x) = 0$.
61. A standard theorem asserts that a continuous real-valued function on a compact set is bounded. Prove the converse: if $K$ is a subset of $\mathbb{R}^m$ and if every continuous real-valued function defined on $K$ is bounded, then $K$ is compact.
62. Let $\mathcal{F}$ be a uniformly bounded equicontinuous family of real valued functions defined on the metric space $X$. Prove that the function
$$g(x) = \sup\{ f(x) : f \in \mathcal{F} \}$$
is continuous.
63. Suppose that $(f_n)$ is a sequence of nondecreasing functions which map the unit interval into itself. Suppose that $\lim_{n \to \infty} f_n(x) = f(x)$ pointwise and that $f$ is a continuous function. Prove that $f_n(x) \to f(x)$ uniformly as $n \to \infty$. Note that the functions $f_n$ are not necessarily continuous.
64. Does there exist a continuous real-valued function $f(x)$, $0 \le x \le 1$, such that
$$\int_0^1 x f(x)\,dx = 1 \quad \text{and} \quad \int_0^1 x^n f(x)\,dx = 0$$
for all $n = 0, 2, 3, 4, 5, \dots$? Give a proof or counter-example.
65. Let $f$ be a continuous, strictly increasing function from $[0, \infty)$ onto $[0, \infty)$ and let $g = f^{-1}$ (the inverse, not the reciprocal). Prove that
$$\int_0^a f(x)\,dx + \int_0^b g(y)\,dy \ge ab$$
for all positive numbers $a, b$, and determine the condition for equality.
66. Let $f$ be a function $[0,1] \to \mathbb{R}$ whose graph $\{(x, f(x)) : x \in [0,1]\}$ is a closed subset of the unit square. Prove that $f$ is continuous.
67. Let $(a_n)$ be a sequence of positive numbers such that $\sum a_n$ converges. Prove that there exists a sequence of numbers $c_n \to \infty$ as $n \to \infty$ such that $\sum c_n a_n$ converges.
68. Let $f(x, y)$ be a continuous real valued function defined on the unit square $[0,1] \times [0,1]$. Prove that $g(x) = \max\{ f(x, y) : y \in [0,1] \}$ is continuous.
69. Let the function $f$ from $[0,1]$ to $[0,1]$ have the following properties. It is of class $C^1$, $f(0) = 0 = f(1)$, and $f'$ is nonincreasing (i.e., $f$ is concave). Prove that the arc-length of the graph of $f$ does not exceed 3.
70. Let $A$ be the set of all positive integers that do not contain the digit 9 in their decimal expansions. Prove that
$$\sum_{a \in A} \frac{1}{a} < \infty.$$
That is, $A$ defines a convergent sub-series of the harmonic series.
5
Multivariable Calculus
This chapter presents the natural geometric theory of calculus in n dimensions.
1
Linear Algebra
It will be taken for granted that you are familiar with the basic concepts of linear algebra - vector spaces, linear transformations, matrices, determinants, and dimension. In particular, you should be aware of the fact that an $m \times n$ matrix $A$ with entries $a_{ij}$ is more than just a static array of $mn$ numbers. It is dynamic. It can act. It defines a linear transformation $T_A : \mathbb{R}^n \to \mathbb{R}^m$ that sends $n$-space to $m$-space according to the formula
$$T_A(v) = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} v_j e_i$$
where $v = \sum v_j e_j \in \mathbb{R}^n$ and $e_1, \dots, e_n$ is the standard basis of $\mathbb{R}^n$. (Equally, $e_1, \dots, e_m$ is the standard basis for $\mathbb{R}^m$.)
The set $\mathcal{M} = \mathcal{M}(m, n)$ of all $m \times n$ matrices with real entries $a_{ij}$ is a vector space. Its vectors are matrices. You add two matrices by adding the corresponding entries, $A + B = C$ where $a_{ij} + b_{ij} = c_{ij}$. Similarly, if $\lambda \in \mathbb{R}$ is a scalar, $\lambda A$ is the matrix with entries $\lambda a_{ij}$. The dimension of the vector space $\mathcal{M}$ is $mn$, as can be seen by expressing each $A$ as $\sum a_{ij} E_{ij}$ where $E_{ij}$ is the matrix whose entries are 0, except for the $(ij)$th entry which is 1. Thus, as vector spaces, $\mathcal{M} = \mathbb{R}^{mn}$. This gives a natural topology to $\mathcal{M}$.
The set $\mathcal{L} = \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$ of linear transformations $T : \mathbb{R}^n \to \mathbb{R}^m$ is also a vector space. You combine linear transformations as functions, $U = T + S$ being defined by $U(v) = T(v) + S(v)$, and $\lambda T$ being defined by $(\lambda T)(v) = \lambda T(v)$. The vectors in $\mathcal{L}$ are linear transformations. The mapping $A \mapsto T_A$ is an isomorphism $T : \mathcal{M} \to \mathcal{L}$. As a rule of thumb, think with linear transformations and compute with matrices.
As explained in Chapter 1, a norm on a vector space $V$ is a function $|\ | : V \to \mathbb{R}$ that satisfies three properties:
(a) For all $v \in V$, $|v| \ge 0$; and $|v| = 0$ if and only if $v = 0$.
(b) $|\lambda v| = |\lambda|\,|v|$.
(c) $|v + w| \le |v| + |w|$.
(Note the abuse of notation in (b); $|\lambda|$ is the magnitude of the scalar $\lambda$ and $|v|$ is the norm of the vector $v$.) Norms are used to make vector estimates, and vector estimates underlie multivariable calculus. A vector space with a norm is a normed space. Its norm gives rise to a metric as
$$d(v, v') = |v - v'|.$$
Thus a normed space is a special kind of metric space. If $V$, $W$ are normed spaces, then the operator norm of $T : V \to W$ is
$$\|T\| = \sup\left\{ \frac{|Tv|_W}{|v|_V} : v \ne 0 \right\}.$$
The operator norm of $T$ is the maximum stretch that $T$ imparts to vectors in $V$. The subscript on the norm indicates the space in question, which for simplicity is often suppressed.†
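To make the "maximum stretch" picture concrete, here is a minimal numerical sketch. It assumes the Euclidean norm on both sides and estimates the supremum by sampling unit vectors in the plane; the helper names `apply`, `norm`, and `operator_norm_2d` and the two sample matrices are illustrative choices, not taken from the text.

```python
import math

def apply(A, v):
    """Return T_A(v) for a matrix A stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))

def operator_norm_2d(A, samples=20000):
    """Estimate sup |Av| / |v| by sampling unit vectors (cos t, sin t).

    Sampling only gives a lower estimate of the supremum; it is a sketch,
    not an exact computation.
    """
    best = 0.0
    for k in range(samples):
        t = 2 * math.pi * k / samples
        best = max(best, norm(apply(A, [math.cos(t), math.sin(t)])))
    return best

stretch = [[3.0, 0.0], [0.0, 1.0]]   # stretches e_1 by 3 and fixes e_2
shear = [[1.0, 1.0], [0.0, 1.0]]     # a shear of the plane

print(operator_norm_2d(stretch))
print(operator_norm_2d(shear))
```

For the diagonal matrix the estimate is the largest diagonal entry in absolute value. For the shear it comes out strictly larger than 1 even though both diagonal entries are 1, so the entries alone do not reveal the stretch.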
1 Theorem Let $T : V \to W$ be a linear transformation from one normed space to another. The following are equivalent:
(a) $\|T\| < \infty$.
(b) $T$ is uniformly continuous.
(c) $T$ is continuous.
(d) $T$ is continuous at the origin.

Proof Assume (a), $\|T\| < \infty$. For all $v, v' \in V$, linearity of $T$ implies that
$$|Tv - Tv'| \le \|T\|\,|v - v'|,$$
† If $\|T\|$ is finite then $T$ is said to be a bounded linear transformation. Unfortunately, this terminology conflicts with $T$ being bounded as a mapping from the metric space $V$ to the metric space $W$. The only linear transformation that is bounded in the latter sense is the zero transformation.
which gives (b), uniform continuity. Clearly, (b) implies (c) implies (d). Assume (d) and take $\epsilon = 1$. There is a $\delta > 0$ such that if $u \in V$ and $|u| < \delta$ then
$$|Tu| < 1.$$
For any nonzero $v \in V$, set $u = \lambda v$ where $\lambda = \delta / (2|v|)$. Then $|u| = \delta/2 < \delta$ and
$$\frac{|Tv|}{|v|} = \frac{|Tu|}{|u|} < \frac{1}{|u|} = \frac{2}{\delta},$$
which verifies (a). □
2 Theorem Any linear transformation $T : \mathbb{R}^n \to W$ is continuous, and if it is an isomorphism then it is a homeomorphism.

Proof The norm on $\mathbb{R}^n$ is the Euclidean norm
$$|v| = \sqrt{v_1^2 + \dots + v_n^2}$$
and the norm on $W$ is $|\ |_W$. Let $M = \max\{ |T(e_1)|_W, \dots, |T(e_n)|_W \}$. Express $v \in \mathbb{R}^n$ as $v = \sum v_j e_j$. Then $|v_j| \le |v|$ and
$$|Tv|_W \le \sum_{j=1}^{n} |T(v_j e_j)|_W = \sum_{j=1}^{n} |v_j|\,|T(e_j)|_W \le n\,|v|\,M,$$
which implies that $\|T\| \le nM < \infty$. By Theorem 1, $T$ is continuous.
Assume that $T$ is an isomorphism. Continuity of $T$ implies that the image of the unit sphere, $T(S^{n-1})$, is a compact subset of $W$. Injectivity of $T$ implies that the origin of $W$ does not belong to $T(S^{n-1})$. Thus, there is a constant $c > 0$ such that
$$|w|_W < c \ \Rightarrow\ w \notin T(S^{n-1}).$$
We observe that
$$|w|_W < c \ \Rightarrow\ r = |T^{-1}(w)| < 1.$$
For, if not, then $t = 1/r \le 1$, and we have $|T^{-1}(tw)| = tr = 1$, contrary to the fact that $|tw| < c$. Thus $\|T^{-1}\| \le 1/c$ and by Theorem 1, $T^{-1}$ is continuous. A bicontinuous bijection is a homeomorphism. □
3 Corollary In the world of finite dimensional normed spaces, all linear transformations are continuous and all isomorphisms are homeomorphisms. In particular, $T : \mathcal{M} \to \mathcal{L}$ is a homeomorphism.
Proof Let $V$ be an $n$-dimensional normed space. As you know from linear algebra, there is an isomorphism $H : \mathbb{R}^n \to V$. Any linear transformation $T : V \to W$ factors as
$$T = (T \circ H) \circ H^{-1}.$$
Theorem 2 implies that since $T \circ H$ is a linear transformation defined on $\mathbb{R}^n$, it is continuous, and $H$ is a homeomorphism. Thus $T$ is continuous. If $T$ is an isomorphism, then continuity of $T$ and $T^{-1}$ imply that $T$ is a homeomorphism. □
A fourth norm property involves composites. It states that
(d) $\|T \circ S\| \le \|T\|\,\|S\|$
for all linear transformations $S : U \to V$ and $T : V \to W$. Thinking in terms of stretch, (d) is clear: $S$ stretches a vector $u \in U$ by at most $\|S\|$, and $T$ stretches $S(u)$ by at most $\|T\|$. The net effect on $u$ is a stretch of at most $\|T\|\,\|S\|$.
Corresponding to composition of linear transformations is the product of matrices. If $A$ is an $m \times k$ matrix and $B$ is a $k \times n$ matrix then the product matrix $P = AB$ is the $m \times n$ matrix whose $(ij)$th entry is
$$p_{ij} = a_{i1} b_{1j} + \dots + a_{ik} b_{kj} = \sum_{r=1}^{k} a_{ir} b_{rj}.$$

4 Theorem $T_A \circ T_B = T_{AB}$.

Proof For each $e_r \in \mathbb{R}^k$ and $e_j \in \mathbb{R}^n$ we have
$$T_A(e_r) = \sum_{i=1}^{m} a_{ir} e_i \qquad T_B(e_j) = \sum_{r=1}^{k} b_{rj} e_r.$$
Thus,
$$T_A(T_B(e_j)) = T_A\Bigl( \sum_{r=1}^{k} b_{rj} e_r \Bigr) = \sum_{r=1}^{k} b_{rj} \sum_{i=1}^{m} a_{ir} e_i = \sum_{i=1}^{m} \sum_{r=1}^{k} a_{ir} b_{rj} e_i = T_{AB}(e_j).$$
Two linear transformations that are equal on a basis are equal. □
Theorem 4 expresses the pleasing fact that matrix multiplication corre sponds naturally to composition of linear transformations. See also Exer cise 6.
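The correspondence between matrix product and composition can be checked on a small example. The following sketch assumes matrices stored as lists of rows; `apply` and `matmul` are hypothetical helper names and the two matrices are arbitrary samples. It compares $T_A(T_B(e_j))$ with $T_{AB}(e_j)$ on the standard basis, which is all that is needed since linear maps that agree on a basis are equal.

```python
def apply(A, v):
    """Return T_A(v) for a matrix A stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def matmul(A, B):
    """Matrix product with entries p_ij = sum over r of a_ir * b_rj."""
    k = len(B)
    return [[sum(A[i][r] * B[r][j] for r in range(k))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2, 0], [0, 1, -1]]      # 2 x 3, so T_A maps R^3 to R^2
B = [[2, 1], [0, 3], [1, 1]]     # 3 x 2, so T_B maps R^2 to R^3
AB = matmul(A, B)                # 2 x 2, so T_AB maps R^2 to R^2

# Compare the composite with the product on the standard basis of R^2.
for e in ([1, 0], [0, 1]):
    assert apply(A, apply(B, e)) == apply(AB, e)

print(AB)
```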
2
Derivatives
A function of a real variable $y = f(x)$ has a derivative $f'(x)$ at $x$ when
$$\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = f'(x). \tag{1}$$
If, however, $x$ is a vector variable, (1) makes no sense. For what does it mean to divide by the vector increment $h$? Equivalent to (1) is the condition
$$f(x + h) = f(x) + f'(x)h + R(h) \ \Rightarrow\ \lim_{h \to 0} \frac{R(h)}{|h|} = 0,$$
which is easy to recast in vector terms.
Definition Let $f : U \to \mathbb{R}^m$ be given, where $U$ is an open subset of $\mathbb{R}^n$. The function $f$ is differentiable at $p \in U$ with derivative $(Df)_p = T$ if $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation and
$$f(p + v) = f(p) + T(v) + R(v) \ \Rightarrow\ \lim_{|v| \to 0} \frac{R(v)}{|v|} = 0. \tag{2}$$
We say that the Taylor remainder $R$ is sublinear because it tends to 0 faster than $|v|$.
When $n = m = 1$, the multidimensional definition reduces to the standard one. That is because a linear transformation $\mathbb{R} \to \mathbb{R}$ is just multiplication by some real number, in this case multiplication by $f'(x)$.
Here is how to visualize $Df$. Take $m = n = 2$. The mapping $f : U \to \mathbb{R}^2$ distorts shapes nonlinearly; its derivative describes the linear part of the distortion. Circles are sent by $f$ to wobbly ovals, but they become ellipses under $(Df)_p$. Lines are sent by $f$ to curves, but they become straight lines under $(Df)_p$. See Figure 103 and also Appendix A.

Figure 103 $(Df)_p$ is the linear part of $f$ at $p$.
This way of looking at differentiability is conceptually simple. Near $p$, $f$ is the sum of three terms: a constant term $f(p)$, a linear term $(Df)_p v$, and a sublinear remainder term $R(v)$. Keep in mind what kind of an object the derivative is. It is not a number. It is not a vector. No, if it exists, then $(Df)_p$ is a linear transformation from the domain space to the target space.
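The three-term decomposition can be watched numerically. The sketch below assumes the sample map $f(x, y) = (x^2 - y^2, 2xy)$, chosen only for illustration, with its matrix of partials written out by hand; `remainder_ratio` is a hypothetical helper that returns $|R(v)| / |v|$, which should shrink with $|v|$ if $R$ is sublinear.

```python
import math

def f(x, y):
    """Sample map R^2 -> R^2 (an assumption made for this illustration)."""
    return (x * x - y * y, 2 * x * y)

def Df(x, y):
    """Matrix of first partials of the sample map at (x, y)."""
    return [[2 * x, -2 * y], [2 * y, 2 * x]]

def remainder_ratio(p, v):
    """Return |R(v)| / |v| where R(v) = f(p + v) - f(p) - (Df)_p(v)."""
    J = Df(*p)
    fp = f(*p)
    fpv = f(p[0] + v[0], p[1] + v[1])
    R = [fpv[i] - fp[i] - (J[i][0] * v[0] + J[i][1] * v[1]) for i in range(2)]
    return math.hypot(*R) / math.hypot(*v)

p = (1.0, 2.0)
for k in range(1, 6):
    t = 10.0 ** (-k)
    # v has length t and a fixed direction (0.6, 0.8)
    print(t, remainder_ratio(p, (0.6 * t, 0.8 * t)))
```

For this particular map the remainder is $R(v) = (v_1^2 - v_2^2, 2 v_1 v_2)$, so the printed ratio tracks $|v|$ itself.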
5 Theorem If $f$ is differentiable at $p$, then it unambiguously determines $(Df)_p$ according to the limit formula, valid for all $u \in \mathbb{R}^n$,
$$(Df)_p(u) = \lim_{t \to 0} \frac{f(p + tu) - f(p)}{t}. \tag{3}$$

Proof Let $T$ be a linear transformation that satisfies (2). Fix any $u \in \mathbb{R}^n$ and take $v = tu$. Then
$$\frac{f(p + tu) - f(p)}{t} = \frac{T(tu) + R(tu)}{t} = T(u) + \frac{R(tu)}{t|u|}\,|u|.$$
The last term converges to zero as $t \to 0$, which verifies (3). Limits, when they exist, are unambiguous, and therefore if $T'$ is a second linear transformation that satisfies (2) then $T(u) = T'(u)$, so $T = T'$. □
6 Theorem Differentiability implies continuity.

Proof Differentiability at $p$ implies that
$$|f(p + v) - f(p)| = |(Df)_p v + R(v)| \le \|(Df)_p\|\,|v| + |R(v)| \to 0$$
as $p + v \to p$. □
$Df$ is the total derivative or Fréchet derivative. In contrast, the $(ij)$th partial derivative of $f$ at $p$ is the limit, if it exists,
$$\frac{\partial f_i(p)}{\partial x_j} = \lim_{t \to 0} \frac{f_i(p + t e_j) - f_i(p)}{t}.$$

7 Corollary If the total derivative exists, then the partial derivatives exist, and they are the entries of the matrix that represents the total derivative.

Proof Substitute in (3) the vector $u = e_j$ and take the $i$th component of both sides of the resulting equation. □
As is shown in Exercise 15, the mere existence of partial derivatives does not imply differentiability. The simplest sufficient condition beyond the existence of the partials - and the simplest way to recognize differentiability - is given in the next theorem.
8 Theorem If the partial derivatives of $f : U \to \mathbb{R}^m$ exist and are continuous, then $f$ is differentiable.
Proof Let $A$ be the matrix of partials at $p$, $A = [\partial f_i(p) / \partial x_j]$, and let $T : \mathbb{R}^n \to \mathbb{R}^m$ be the linear transformation that $A$ represents. We claim that $(Df)_p = T$. We must show that the Taylor remainder
$$R(v) = f(p + v) - f(p) - Av$$
is sublinear. Draw a path $\sigma = [\sigma_1, \dots, \sigma_n]$ from $p$ to $q = p + v$ that consists of $n$ segments parallel to the components of $v$. Thus $v = \sum v_j e_j$ and
$$\sigma_j(t) = p_{j-1} + t v_j e_j, \qquad 0 \le t \le 1,$$
is a segment from $p_{j-1} = p + \sum_{k < j} v_k e_k$ to $p_j = p_{j-1} + v_j e_j$. See Figure 104.

Figure 104 The segmented path $\sigma$ from $p$ to $q$.

By the one-dimensional chain rule and mean value theorem applied to the differentiable real-valued function $g(t) = f_i \circ \sigma_j(t)$ of one variable, there exists $t_{ij} \in (0, 1)$ such that
$$f_i(p_j) - f_i(p_{j-1}) = g(1) - g(0) = g'(t_{ij}) = \frac{\partial f_i(p_{ij})}{\partial x_j}\, v_j,$$
where $p_{ij} = \sigma_j(t_{ij})$. Telescoping $f_i(p + v) - f_i(p)$ along $\sigma$ gives
$$R_i(v) = f_i(p + v) - f_i(p) - (Av)_i = \sum_{j=1}^{n} \left( f_i(p_j) - f_i(p_{j-1}) - \frac{\partial f_i(p)}{\partial x_j}\, v_j \right) = \sum_{j=1}^{n} \left\{ \frac{\partial f_i(p_{ij})}{\partial x_j} - \frac{\partial f_i(p)}{\partial x_j} \right\} v_j.$$
Continuity of the partials implies that the terms inside curly brackets tend to 0 as $|v| \to 0$. Thus $R$ is sublinear and $f$ is differentiable at $p$. □
Next we state and prove the basic rules of multivariable differentiation.

9 Theorem Let $f$ and $g$ be differentiable. Then
(a) $D(f + cg) = Df + cDg$.
(b) $D(\text{constant}) = 0$ and $D(T(x)) = T$.
(c) $D(g \circ f) = Dg \circ Df$. (chain rule)
(d) $D(f \bullet g) = Df \bullet g + f \bullet Dg$. (Leibniz rule)

There is a fifth rule that concerns the derivative of the nonlinear inversion operator $\operatorname{Inv} : T \mapsto T^{-1}$. It is a glorified version of the formula
$$\frac{d\,x^{-1}}{dx} = -x^{-2},$$
and is discussed in Exercises 32 - 36.
Proof (a) Write the Taylor estimates for $f$ and $g$ and combine them to get the Taylor estimate for $f + cg$.
$$\begin{aligned} f(p + v) &= f(p) + (Df)_p(v) + R_f \\ g(p + v) &= g(p) + (Dg)_p(v) + R_g \\ (f + cg)(p + v) &= (f + cg)(p) + ((Df)_p + c(Dg)_p)(v) + R_f + cR_g. \end{aligned}$$
Since $R_f + cR_g$ is sublinear, $(Df)_p + c(Dg)_p$ is the derivative of $f + cg$ at $p$.
(b) If $f : \mathbb{R}^n \to \mathbb{R}^m$ is constant, $f(x) = c$ for all $x \in \mathbb{R}^n$, and if $0 : \mathbb{R}^n \to \mathbb{R}^m$ denotes the zero transformation then the Taylor remainder $R(v) = f(p + v) - f(p) - 0(v)$ is identically zero. Hence $D(\text{constant})_p = 0$.
If $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation and $f(x) = T(x)$, then substituting $T$ itself in the Taylor expression gives the Taylor remainder $R(v) = f(p + v) - f(p) - T(v)$, which is identically zero. Hence $(Df)_p = T$.
Note that when $n = m = 1$, a linear function is of the form $f(x) = ax$, and the previous formula just states that $(ax)' = a$.
(c) Tacitly, we assume that the composite $g \circ f(x) = g(f(x))$ makes sense as $x$ varies in a neighborhood of $p \in U$. The notation $Dg \circ Df$ refers to the composite of linear transformations and is written out as
$$D(g \circ f)_p = (Dg)_q \circ (Df)_p$$
where $q = f(p)$. This chain rule states that the derivative of a composite is the composite of the derivatives. Such a beautiful and natural formula must be true. See also Appendix A. Here is a proof:
It is convenient to write the remainder $R(v) = f(p + v) - f(p) - T(v)$ in a different form, defining the scalar function $e(v)$ by
$$e(v) = \begin{cases} \dfrac{|R(v)|}{|v|} & \text{if } v \ne 0 \\ 0 & \text{if } v = 0. \end{cases}$$
Sublinearity is equivalent to $\lim_{v \to 0} e(v) = 0$. Think of $e$ as an error factor.
The Taylor expressions for $f$ at $p$ and $g$ at $q = f(p)$ are
$$\begin{aligned} f(p + v) &= f(p) + Av + R_f \\ g(q + w) &= g(q) + Bw + R_g \end{aligned}$$
where $A = (Df)_p$ and $B = (Dg)_q$ as matrices. The composite is expressed as
$$g \circ f(p + v) = g(q + Av + R_f(v)) = g(q) + BAv + BR_f(v) + R_g(w)$$
where $w = Av + R_f(v)$. It remains to show that the remainder terms are sublinear with respect to $v$. First
$$|BR_f(v)| \le \|B\|\,|R_f(v)|$$
is sublinear. Second,
$$|w| = |Av + R_f(v)| \le \|A\|\,|v| + e_f(v)\,|v|.$$
Therefore,
$$|R_g(w)| \le e_g(w)\,|w| \le e_g(w)\,(\|A\| + e_f(v))\,|v|.$$
Since $e_g(w) \to 0$ as $w \to 0$ and since $v \to 0$ implies that $w$ does tend to 0, we see that $R_g(w)$ is sublinear with respect to $v$. It follows that $(D(g \circ f))_p = BA$ as claimed.
(d) To prove the Leibniz product rule, we must explain the notation $v \bullet w$. In $\mathbb{R}$ there is only one product, the usual multiplication of real numbers. In higher dimensional vector spaces, however, there are many products, and the general way to discuss products is in terms of bilinear maps. A map $\beta : V \times W \to Z$ is bilinear if $V$, $W$, $Z$ are vector spaces and for each fixed $v \in V$ the map $\beta(v, \cdot) : W \to Z$ is linear, while for each fixed $w \in W$ the map $\beta(\cdot, w) : V \to Z$ is linear. Examples are
(i) Ordinary real multiplication $(x, y) \mapsto xy$ is a bilinear map $\mathbb{R} \times \mathbb{R} \to \mathbb{R}$.
(ii) The dot product is a bilinear map $\mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$.
(iii) The matrix product is a bilinear map $\mathcal{M}(m \times k) \times \mathcal{M}(k \times n) \to \mathcal{M}(m \times n)$.
The precise statement of (d) is that if $\beta : \mathbb{R}^k \times \mathbb{R}^\ell \to \mathbb{R}^m$ is bilinear while $f : U \to \mathbb{R}^k$ and $g : U \to \mathbb{R}^\ell$ are differentiable at $p$, then the map $x \mapsto \beta(f(x), g(x))$ is differentiable at $p$ and
$$(D\beta(f, g))_p(v) = \beta((Df)_p(v), g(p)) + \beta(f(p), (Dg)_p(v)).$$
Just as a linear transformation between finite-dimensional vector spaces has a finite operator norm, the same is true for bilinear maps:
$$\|\beta\| = \sup\left\{ \frac{|\beta(v, w)|}{|v|\,|w|} : v, w \ne 0 \right\} < \infty.$$
To check this, we view $\beta$ as a linear map $T_\beta : \mathbb{R}^k \to \mathcal{L}(\mathbb{R}^\ell, \mathbb{R}^m)$. According to Theorems 1, 2, a linear transformation from one finite dimensional normed space to another is continuous and has finite operator norm. Thus the operator norm of $T_\beta$ is finite. That is,
$$\|T_\beta\| = \max\left\{ \frac{\|T_\beta(v)\|}{|v|} : v \ne 0 \right\} < \infty.$$
But $\|T_\beta(v)\| = \max\{ |\beta(v, w)| / |w| : w \ne 0 \}$, which implies that $\|\beta\| < \infty$.
Returning to the proof of the Leibniz rule, we write out the Taylor estimates for $f$ and $g$ and plug them into $\beta$. Using the notation $A = (Df)_p$, $B = (Dg)_p$, bilinearity implies
$$\begin{aligned} \beta(f(p + v), g(p + v)) &= \beta(f(p) + Av + R_f,\ g(p) + Bv + R_g) \\ &= \beta(f(p), g(p)) + \beta(Av, g(p)) + \beta(f(p), Bv) \\ &\quad + \beta(f(p), R_g) + \beta(Av, Bv + R_g) + \beta(R_f, g(p) + Bv + R_g). \end{aligned}$$
The last three terms are sublinear. For
$$\begin{aligned} |\beta(f(p), R_g)| &\le \|\beta\|\,|f(p)|\,|R_g| \\ |\beta(Av, Bv + R_g)| &\le \|\beta\|\,\|A\|\,|v|\,|Bv + R_g| \\ |\beta(R_f, g(p) + Bv + R_g)| &\le \|\beta\|\,|R_f|\,|g(p) + Bv + R_g|. \end{aligned}$$
Therefore $\beta(f, g)$ is differentiable and $D\beta(f, g) = \beta(Df, g) + \beta(f, Dg)$, as claimed. □
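The chain rule can be sanity-checked numerically. The following sketch assumes two sample maps, $f(x, y) = (xy, x + y)$ and $g(u, w) = u^2 + \sin w$, chosen only for illustration. It compares the matrix product $(Dg)_q (Df)_p$, with $q = f(p)$, against central-difference partials of the composite; the function names are hypothetical.

```python
import math

def f(x, y):
    """Sample inner map R^2 -> R^2 (an assumption for this illustration)."""
    return (x * y, x + y)

def g(u, w):
    """Sample outer map R^2 -> R (an assumption for this illustration)."""
    return u * u + math.sin(w)

def Df(x, y):
    """Matrix of partials of f at (x, y)."""
    return [[y, x], [1.0, 1.0]]

def Dg(u, w):
    """Matrix of partials of g at (u, w), a single row."""
    return [2 * u, math.cos(w)]

def chain_rule_gradient(x, y):
    """(Dg)_q composed with (Df)_p as a 1 x 2 matrix, where q = f(p)."""
    B, A = Dg(*f(x, y)), Df(x, y)
    return [B[0] * A[0][j] + B[1] * A[1][j] for j in range(2)]

def numeric_gradient(x, y, h=1e-6):
    """Central-difference partials of the composite g o f."""
    c = lambda s, t: g(*f(s, t))
    return [(c(x + h, y) - c(x - h, y)) / (2 * h),
            (c(x, y + h) - c(x, y - h)) / (2 * h)]

print(chain_rule_gradient(0.7, -1.3))
print(numeric_gradient(0.7, -1.3))
```

The two printed rows should agree to several decimal places; the small discrepancy comes from the finite-difference step, not from the chain rule.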
Here are some applications of these differentiation rules:
10 Theorem A function $f : U \to \mathbb{R}^m$ is differentiable at $p \in U$ if and only if each of its components $f_i$ is differentiable at $p$. Furthermore, the derivative of its $i$th component is the $i$th component of the derivative.

Proof Assume that $f$ is differentiable at $p$ and express the $i$th component of $f$ as $f_i = \pi_i \circ f$ where $\pi_i : \mathbb{R}^m \to \mathbb{R}$ is the projection that sends a vector $w = (w_1, \dots, w_m)$ to $w_i$. Since $\pi_i$ is linear it is differentiable. By the chain rule, $f_i$ is differentiable at $p$ and
$$(Df_i)_p = (D\pi_i) \circ (Df)_p = \pi_i \circ (Df)_p.$$
The proof of the converse is equally natural. □
Theorem 10 implies that there is little loss of generality assuming $m = 1$, i.e., that our functions are real-valued. Multidimensionality of the domain, not the target, is what distinguishes multivariable calculus from one-variable calculus.
11 Mean Value Theorem If $f : U \to \mathbb{R}^m$ is differentiable on $U$ and the segment $[p, q]$ is contained in $U$, then
$$|f(q) - f(p)| \le M\,|q - p|,$$
where $M = \sup\{ \|(Df)_x\| : x \in U \}$.

Proof Fix any unit vector $u \in \mathbb{R}^m$. The function
$$g(t) = \langle u, f(p + t(q - p)) \rangle$$
is differentiable, and we can calculate its derivative. By the one-dimensional Mean Value Theorem, this gives some $\theta \in (0, 1)$ such that $g(1) - g(0) = g'(\theta)$. That is,
$$\langle u, f(q) - f(p) \rangle = g'(\theta) = \langle u, (Df)_{p + \theta(q - p)}(q - p) \rangle \le M\,|q - p|.$$
A vector whose dot product with every unit vector is no larger than $M|q - p|$ has norm $\le M|q - p|$. □
Remark The one-dimensional Mean Value Theorem is an equality,
$$f(q) - f(p) = f'(\theta)(q - p),$$
and you might expect the same to be true for a vector-valued function if we replace $f'(\theta)$ by $(Df)_\theta$. Not so. See Exercise 17. The closest we can come to an equality form of the multidimensional Mean Value Theorem is the following:
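A standard illustration of why the equality form fails is a path that closes up on itself. The sketch below assumes the unit-circle path $f(t) = (\cos t, \sin t)$ on $[0, 2\pi]$; this example is chosen here and is not claimed to be the one in the cited exercise. The increment $f(q) - f(p)$ vanishes, while every candidate $(Df)_\theta (q - p)$ has norm $2\pi$, so no $\theta$ gives equality; the inequality of the Mean Value Theorem still holds.

```python
import math

def f(t):
    """A path in R^2: the unit circle (sample choice for this illustration)."""
    return (math.cos(t), math.sin(t))

def df(t):
    """Derivative of f at t, viewed as the vector that (Df)_t sends 1 to."""
    return (-math.sin(t), math.cos(t))

def increment_norm(p, q):
    """|f(q) - f(p)|."""
    return math.hypot(f(q)[0] - f(p)[0], f(q)[1] - f(p)[1])

def candidate_norms(p, q, samples=1000):
    """|(Df)_theta (q - p)| at evenly spaced sample points theta in [p, q]."""
    return [math.hypot(*df(p + k * (q - p) / samples)) * abs(q - p)
            for k in range(samples + 1)]

p, q = 0.0, 2 * math.pi
norms = candidate_norms(p, q)
print(increment_norm(p, q))      # essentially zero: the path closes up
print(min(norms), max(norms))    # every candidate has norm about 2*pi
```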
12 $C^1$ Mean Value Theorem If $f : U \to \mathbb{R}^m$ is of class $C^1$ (its derivative exists and is continuous) and if the segment $[p, q]$ is contained in $U$, then
$$f(q) - f(p) = T(q - p) \tag{4}$$
where $T$ is the average derivative of $f$ on the segment
$$T = \int_0^1 (Df)_{p + t(q - p)}\,dt.$$
Conversely, if there is a continuous family of linear maps $T_{pq} \in \mathcal{L}$ for which (4) holds, then $f$ is of class $C^1$ and $(Df)_p = T_{pp}$.

Proof The integrand takes values in the normed space $\mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$ and is a continuous function of $t$. The integral is the limit of Riemann sums
$$\sum_k (Df)_{p + t_k(q - p)}\,\Delta t_k,$$
which lie in $\mathcal{L}$. Since the integral is an element of $\mathcal{L}$, it has a right to act on the vector $q - p$. Alternately, if you integrate each entry of the matrix that represents $Df$ along the segment, the resulting matrix represents $T$.
Fix an index $i$, and apply the Fundamental Theorem of Calculus to the $C^1$ real-valued function of one variable
$$g(t) = f_i \circ \sigma(t),$$
where $\sigma(t) = p + t(q - p)$ parameterizes $[p, q]$. This gives
$$\begin{aligned} f_i(q) - f_i(p) &= g(1) - g(0) = \int_0^1 g'(t)\,dt \\ &= \int_0^1 \sum_{j=1}^{n} \frac{\partial f_i(\sigma(t))}{\partial x_j}\,(q_j - p_j)\,dt \\ &= \sum_{j=1}^{n} \int_0^1 \frac{\partial f_i(\sigma(t))}{\partial x_j}\,dt\ (q_j - p_j), \end{aligned}$$
which is the $i$th component of $T(q - p)$.
To check the converse, we assume that (4) holds for a continuous family of linear maps $T_{pq}$. Take $q = p + v$. The first-order Taylor remainder at $p$ is
$$R(v) = f(p + v) - f(p) - T_{pp}(v) = (T_{pq} - T_{pp})(v),$$
which is sublinear with respect to $v$. Therefore $(Df)_p = T_{pp}$. □
13 Corollary Assume that $U$ is connected. If $f : U \to \mathbb{R}^m$ is differentiable and for each point $x \in U$, $(Df)_x = 0$, then $f$ is constant.

Proof The enjoyable open and closed argument is left to you as Exercise 20. □

We conclude this section with another useful rule - differentiation past the integral. See also Exercise 23.
14 Theorem Assume that $f : [a, b] \times (c, d) \to \mathbb{R}$ is continuous and that $\partial f(x, y) / \partial y$ exists and is continuous. Then
$$F(y) = \int_a^b f(x, y)\,dx$$
is of class $C^1$ and
$$\frac{dF}{dy} = \int_a^b \frac{\partial f(x, y)}{\partial y}\,dx. \tag{5}$$

Proof By the $C^1$ Mean Value Theorem, if $h$ is small, then
$$\frac{F(y + h) - F(y)}{h} = \frac{1}{h} \int_a^b \left( \int_0^1 \frac{\partial f(x, y + th)}{\partial y}\,dt \right) h\,dx.$$
The inner integral is the average partial derivative of $f$ with respect to $y$ along the segment from $y$ to $y + h$. Continuity implies that this average converges to $\partial f(x, y) / \partial y$ as $h \to 0$, which verifies (5). Continuity of $dF/dy$ follows from continuity of $\partial f / \partial y$. See Exercise 22. □
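Formula (5) lends itself to a quick numerical check. The sketch below assumes the sample integrand $f(x, y) = \sin(xy)$ on $[0, 1]$ and a midpoint-rule integrator; both are illustrative choices, and `integrate`, `dF_by_difference`, and `dF_by_formula` are hypothetical names. It compares a difference quotient of $F$ with the integral of $\partial f / \partial y$.

```python
import math

def integrate(h, a, b, n=2000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    dx = (b - a) / n
    return sum(h(a + (i + 0.5) * dx) for i in range(n)) * dx

def f(x, y):
    """Sample integrand (an assumption made for this illustration)."""
    return math.sin(x * y)

def df_dy(x, y):
    """Partial derivative of the sample integrand with respect to y."""
    return x * math.cos(x * y)

def F(y):
    """F(y) = integral of f(x, y) dx over [0, 1]."""
    return integrate(lambda x: f(x, y), 0.0, 1.0)

def dF_by_difference(y, h=1e-5):
    """Central difference quotient of F at y."""
    return (F(y + h) - F(y - h)) / (2 * h)

def dF_by_formula(y):
    """Right-hand side of (5): integrate the partial derivative."""
    return integrate(lambda x: df_dy(x, y), 0.0, 1.0)

y0 = 1.5
print(dF_by_difference(y0), dF_by_formula(y0))
```

For this integrand $F(y) = (1 - \cos y)/y$ in closed form, so both printed values can also be compared with $(y \sin y - 1 + \cos y)/y^2$.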
3
Higher derivatives
In this section we define higher-order multivariable derivatives. We do so in the same spirit as in the previous section: the second derivative will be the derivative of the first derivative, viewed naturally. Assume that $f : U \to \mathbb{R}^m$ is differentiable on $U$. The derivative $(Df)_x$ exists at each $x \in U$ and the map $x \mapsto (Df)_x$ defines a function
$$Df : U \to \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m).$$
The derivative $Df$ is the same sort of thing that $f$ is, namely a function from an open subset of a vector space into another vector space. In the case
of $Df$, the target vector space is not $\mathbb{R}^m$, but rather the $mn$-dimensional space $\mathcal{L}$. If $Df$ is differentiable at $p \in U$ then by definition
$$(D(Df))_p = (D^2 f)_p = \text{the second derivative of } f \text{ at } p,$$
and $f$ is second-differentiable at $p$. The second derivative is a linear map from $\mathbb{R}^n$ into $\mathcal{L}$. For each $v \in \mathbb{R}^n$, $(D^2 f)_p(v)$ belongs to $\mathcal{L}$ and therefore is a linear transformation $\mathbb{R}^n \to \mathbb{R}^m$, so $(D^2 f)_p(v)(w)$ is bilinear, and we write it as $(D^2 f)_p(v, w)$. (Recall that bilinearity is linearity in each variable separately.)
Third- and higher derivatives are defined in the same way. If $f$ is second differentiable on $U$, then $x \mapsto (D^2 f)_x$ defines a map
$$D^2 f : U \to \mathcal{L}^2$$
where $\mathcal{L}^2$ is the vector space of bilinear maps $\mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^m$. If $D^2 f$ is differentiable at $p$, then $f$ is third differentiable there and its third derivative is the trilinear map $(D^3 f)_p = (D(D^2 f))_p$.
Just as for first derivatives, the relation between the second derivative and the second partial derivatives calls for thought. Express $f : U \to \mathbb{R}^m$ in component form as $f(x) = (f_1(x), \dots, f_m(x))$ where $x$ varies in $U$.

15 Theorem If $(D^2 f)_p$ exists then $(D^2 f_k)_p$ exists, the second-partials at $p$ exist, and
$$(D^2 f_k)_p(e_i, e_j) = \frac{\partial^2 f_k(p)}{\partial x_i\,\partial x_j}.$$
Conversely, existence of the second-partials implies existence of $(D^2 f)_p$, provided that the second-partials exist at all points $x \in U$ near $p$ and are continuous at $p$.

Proof Assume that $(D^2 f)_p$ exists. Then $x \mapsto (Df)_x$ is differentiable at $x = p$ and the same is true of the matrix
$$M_x = \left[ \frac{\partial f_i(x)}{\partial x_j} \right]$$
that represents it; $x \mapsto M_x$ is differentiable at $x = p$. For, according to Theorem 10, a vector function is differentiable if and only if its components
are differentiable; and then, the derivative of the $k$th component is the $k$th component of the derivative. A matrix is a special type of vector, its components are its entries. Thus the entries of $M_x$ are differentiable at $x = p$, and the second-partials exist. Furthermore, the $k$th row of $M_x$ is a differentiable vector function of $x$ at $x = p$ and
$$(D(Df_k))_p(e_i)(e_j) = (D^2 f_k)_p(e_i, e_j) = \lim_{t \to 0} \frac{(Df_k)_{p + t e_i}(e_j) - (Df_k)_p(e_j)}{t}.$$
The first derivatives appearing in this fraction are the $j$th partials of $f_k$ at $p + t e_i$ and at $p$. Thus, $\partial^2 f_k(p) / \partial x_i\,\partial x_j = (D^2 f_k)_p(e_i, e_j)$ as claimed.
Conversely, assume that the second-partials exist at all $x$ near $p$ and are continuous at $p$. Then the entries of $M_x$ have partials that exist at all points $q$ near $p$, and are continuous at $p$. Theorem 8 implies that $x \mapsto M_x$ is differentiable at $x = p$; i.e., $f$ is second-differentiable at $p$. □
The most important and surprising property of second derivatives is symmetry.
16 Theorem If $(D^2 f)_p$ exists then it is symmetric: for all $v, w \in \mathbb{R}^n$,
$$(D^2 f)_p(v, w) = (D^2 f)_p(w, v).$$

Proof We will assume that $f$ is real-valued (i.e., $m = 1$) because the symmetry assertion concerns the arguments of $f$, not its values. For a variable $t \in [0, 1]$, draw the parallelogram $P$ determined by the vectors $tv$, $tw$, and label the vertices with $\pm 1$'s as in Figure 105.

Figure 105 The parallelogram $P$, with vertices $p$, $p + tv$, $p + tw$, $p + tv + tw$, has signed vertices.

The quantity
$$\Delta = \Delta(t, v, w) = f(p + tv + tw) - f(p + tv) - f(p + tw) + f(p)$$
Chapter 5
Multivariable Calculus
282
is the signed sum of f at the vertices of P . Clearly, respect to v, w ,
Ll (t,
!::!..
is symmetric with
V, W ) = Ll ( t , W , V).
We claim that
. b. (t, v , w ) , (D 2 f) p (v , w ) = hm t ---+ 0 t2 from which symmetry of D 2 f follows. Fix t, v, w and write .6. = g ( l ) - g (O) where (6)
g(s ) = f(p + t v + st w) - f(p
+ st w) .
Since $f$ is differentiable, so is $g$. By the one-dimensional Mean Value Theorem there exists $\theta \in (0, 1)$ with $\Delta = g'(\theta)$. By the Chain Rule, $g'(\theta)$ can be written in terms of $Df$ and we get
$$\Delta = g'(\theta) = (Df)_{p + tv + \theta tw}(tw) - (Df)_{p + \theta tw}(tw).$$
Taylor's estimate applied to the differentiable function $u \mapsto (Df)_u$ at $u = p$ gives
$$(Df)_{p+x} = (Df)_p + (D^2 f)_p(x, \cdot) + R(x, \cdot)$$
where $R(x, \cdot) \in \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$ is sublinear with respect to $x$. Writing out this estimate for $(Df)_{p+x}$ first with $x = tv + \theta tw$ and then with $x = \theta tw$ gives
$$\frac{\Delta}{t^2} = \frac{1}{t}\Big\{\big[(Df)_p(w) + (D^2 f)_p(tv + \theta tw, w) + R(tv + \theta tw, w)\big] - \big[(Df)_p(w) + (D^2 f)_p(\theta tw, w) + R(\theta tw, w)\big]\Big\}$$
$$= (D^2 f)_p(v, w) + \frac{R(tv + \theta tw, w)}{t} - \frac{R(\theta tw, w)}{t}.$$
Bilinearity was used to combine the two second derivative terms. Sublinearity of $R(x, w)$ with respect to $x$ implies that the last two terms tend to 0 as $t \to 0$, which completes the proof of (6). Since $(D^2 f)_p$ is the limit of a symmetric (although nonlinear) function of $v, w$ it too is symmetric. □
Remark. The fact that $D^2 f$ can be expressed directly as a limit of values of $f$ is itself interesting. It should remind you of its one-dimensional counterpart,
$$f''(x) = \lim_{h \to 0} \frac{f(x + h) + f(x - h) - 2f(x)}{h^2}.$$
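The limit formula (6) can be tried out numerically. The following is only a sketch: the function `f`, the base point, and the step `t` are illustrative assumptions, not taken from the text. It forms the signed vertex sum over the parallelogram and compares the quotient by $t^2$ with the mixed partial computed by hand.

```python
import math

def f(x, y):
    # Illustrative smooth function (an assumption, not from the text).
    return math.sin(x) * math.exp(2.0 * y) + x * x * y

def delta(func, p, t, v, w):
    """Signed sum of func over the vertices p, p+tv, p+tw, p+tv+tw."""
    px, py = p
    return (func(px + t * v[0] + t * w[0], py + t * v[1] + t * w[1])
            - func(px + t * v[0], py + t * v[1])
            - func(px + t * w[0], py + t * w[1])
            + func(px, py))

def second_derivative_estimate(func, p, v, w, t=1e-4):
    """Approximate (D^2 f)_p(v, w) by the quotient delta / t^2 for one small t."""
    return delta(func, p, t, v, w) / (t * t)

p = (0.3, -0.2)
e1, e2 = (1.0, 0.0), (0.0, 1.0)
mixed_12 = second_derivative_estimate(f, p, e1, e2)
mixed_21 = second_derivative_estimate(f, p, e2, e1)
# Mixed partial computed by hand: d/dy d/dx f = 2 cos(x) e^{2y} + 2x.
exact = 2.0 * math.cos(p[0]) * math.exp(2.0 * p[1]) + 2.0 * p[0]
print(mixed_12, mixed_21, exact)
```

The two estimates agree with each other up to rounding because the vertex sum is symmetric in $v$ and $w$, and both are close to the hand-computed mixed partial. A single small `t` only suggests the limit; it does not prove it.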
17 Corollary Corresponding mixed second-partials of a second-differentiable function are equal,
$$\frac{\partial^2 f_k(p)}{\partial x_i\,\partial x_j} = \frac{\partial^2 f_k(p)}{\partial x_j\,\partial x_i}.$$
Proof The equalities follow from Theorem 15 and the symmetry of $D^2 f$. □
The mere existence of the second-partials does not imply second-differentiability, nor does it imply equality of corresponding mixed second-partials. See Exercise 24.
18 Corollary The $r$th derivative, if it exists, is symmetric: permutation of the vectors $v_1, \ldots, v_r$ does not affect the value of $(D^r f)_p(v_1, \ldots, v_r)$. Corresponding mixed higher-order partials are equal.
Proof The induction argument is left to you as Exercise 29. □
In my opinion Theorem 16 is quite natural, even though its proof is tricky. It proceeds from a pointwise hypothesis to a pointwise conclusion: whenever the second derivative exists it is symmetric. No assumption is made about continuity of partials. It is possible that $f$ is second-differentiable at $p$ and nowhere else. See Exercise 25. All the same, it remains standard to prove equality of mixed partials under stronger hypotheses, namely that $D^2 f$ is continuous. See Exercise 27.
We conclude this section with a brief discussion of the rules of higher order differentiation. It is simple to check that the $r$th derivative of $f + cg$ is $D^r f + cD^r g$. Also, if $\beta$ is $k$-linear and $k < r$ then $f(x) = \beta(x, \ldots, x)$ has $D^r f = 0$. On the other hand, if $k = r$ then $(D^r f)_p = r!\,\mathrm{Symm}(\beta)$, where $\mathrm{Symm}(\beta)$ is the symmetrization of $\beta$. See Exercise 28.
The chain rule for $r$th derivatives is a bit complicated. The difficulties arise from the fact that $x$ appears in two places in the expression for the first order chain rule, $(D g \circ f)_x = (Dg)_{f(x)} \circ (Df)_x$, and so differentiating this product produces
$$(D^2 g \circ f)_x = (D^2 g)_{f(x)} \circ (Df)_x^2 + (Dg)_{f(x)} \circ (D^2 f)_x.$$
(The meaning of $(Df)_x^2$ needs clarification.) Differentiating again produces four terms, two of which combine. The general formula is
$$(D^r g \circ f)_x = \sum_{k=1}^{r} \sum_{\mu} (D^k g)_{f(x)} \circ (D^{\mu} f)_x$$
where the sum on $\mu$ is taken as $\mu$ runs through all partitions of $\{1, \ldots, r\}$ into $k$ disjoint subsets. See Exercise 41. The higher-order Leibniz rule is left for you as Exercise 42.
4 Smoothness Classes
A map $f : U \to \mathbb{R}^m$ is of class $C^r$ if it is $r$th order differentiable at each $p \in U$ and its derivatives depend continuously on $p$. (Since differentiability implies continuity, all the derivatives of order $< r$ are automatically continuous; only the $r$th derivative is in question.) If $f$ is of class $C^r$ for all $r$, it is smooth or of class $C^\infty$. According to the differentiation rules, these smoothness classes are closed under the operations of linear combination, product, and composition. We discuss next how they are closed under limits.
Let $(f_k)$ be a sequence of functions $f_k : U \to \mathbb{R}^m$. The sequence is
(a) Uniformly $C^r$ convergent if for some $C^r$ function $f : U \to \mathbb{R}^m$,
$$f_k \rightrightarrows f, \quad Df_k \rightrightarrows Df, \quad \ldots, \quad D^r f_k \rightrightarrows D^r f$$
as $k \to \infty$.
(b) Uniformly $C^r$ Cauchy if for each $\epsilon > 0$ there is an $N$ such that for all $k, \ell \ge N$ and all $x \in U$,
$$|f_k(x) - f_\ell(x)| < \epsilon, \quad \|(Df_k)_x - (Df_\ell)_x\| < \epsilon, \quad \ldots, \quad \|(D^r f_k)_x - (D^r f_\ell)_x\| < \epsilon.$$
19 Theorem Uniform $C^r$ convergence and Cauchyness are equivalent.
Proof Convergence always implies the Cauchy condition. As for the converse, first assume that $r = 1$. We know that $f_k$ converges uniformly to a continuous function $f$, and the derivative sequence converges uniformly to a continuous limit
$$Df_k \rightrightarrows G.$$
We claim that $Df = G$. Fix $p \in U$ and consider points $q$ in a small convex neighborhood of $p$. The $C^1$ Mean Value Theorem and uniform convergence imply that as $k \to \infty$, both sides of
$$f_k(q) - f_k(p) = \int_0^1 (Df_k)_{p + t(q - p)}\,dt\,(q - p)$$
converge, and the limit is
$$f(q) - f(p) = \int_0^1 G(p + t(q - p))\,dt\,(q - p).$$
This integral of $G$ is a continuous function of $q$ that reduces to $G(p)$ when $p = q$. By the converse part of the $C^1$ Mean Value Theorem, $f$ is differentiable and $Df = G$. Therefore $f$ is $C^1$ and $f_k$ converges $C^1$ uniformly to $f$ as $k \to \infty$, completing the proof when $r = 1$.
Now suppose that $r \ge 2$. The maps $Df_k : U \to \mathcal{L}$ form a uniformly $C^{r-1}$ Cauchy sequence. The limit, by induction, is $C^{r-1}$ uniform; i.e., as $k \to \infty$,
$$D^s(Df_k) \rightrightarrows D^s G$$
for all $s \le r - 1$. Hence $f_k$ converges $C^r$ uniformly to $f$ as $k \to \infty$, completing the induction. □
The $C^r$ norm of a $C^r$ function $f : U \to \mathbb{R}^m$ is
$$\|f\|_r = \max\Big\{\sup_{x \in U} |f(x)|, \ldots, \sup_{x \in U} \|(D^r f)_x\|\Big\}.$$
The set of functions with $\|f\|_r < \infty$ is denoted $C^r(U, \mathbb{R}^m)$.
20 Corollary $\|\ \|_r$ makes $C^r(U, \mathbb{R}^m)$ a Banach space, i.e., a complete normed vector space.
Proof The norm properties are easy to check; completeness follows from Theorem 19. □
21 $C^r$ M-test If $\sum M_k$ is a convergent series of constants and if $\|f_k\|_r \le M_k$ for all $k$, then the series of functions $\sum f_k$ converges in $C^r(U, \mathbb{R}^m)$ to a function $f$. Term by term differentiation of order $\le r$ is valid, $D^r f = \sum_k D^r f_k$.
Proof Obvious from the preceding corollary. □
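As a rough numerical illustration of the $C^1$ case of the M-test, consider the series with terms $f_k(x) = \sin(kx)/k^3$, an example chosen here and not taken from the text. Then $\|f_k\|_1 \le 1/k^2 = M_k$ and $\sum M_k$ converges. The sketch below compares a difference quotient of a large partial sum with the sum of the term-by-term derivatives; the cutoff `N` and step `h` are arbitrary assumptions.

```python
import math

def term(k, x):
    # f_k(x) = sin(kx) / k^3, so |f_k| <= 1/k^3 and |f_k'| <= 1/k^2 = M_k.
    return math.sin(k * x) / k**3

def term_derivative(k, x):
    # f_k'(x) = cos(kx) / k^2.
    return math.cos(k * x) / k**2

def partial_sum(x, n, g):
    """Sum of g(k, x) for k = 1, ..., n."""
    return sum(g(k, x) for k in range(1, n + 1))

N = 2000
x0, h = 0.7, 1e-5
# Derivative of the partial sum via a central difference quotient...
numeric = (partial_sum(x0 + h, N, term) - partial_sum(x0 - h, N, term)) / (2 * h)
# ...versus the sum of the term-by-term derivatives.
termwise = partial_sum(x0, N, term_derivative)
# Bound on the neglected derivative tail: sum over k > N of 1/k^2 is below 1/N.
tail_bound = 1.0 / N
print(numeric, termwise, tail_bound)
```

A finite computation cannot establish convergence, of course; it only shows the two quantities agreeing to within discretization error, with the neglected tail controlled by the $M_k$.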
5 Implicit and Inverse Functions
Let $f : U \to \mathbb{R}^m$ be given, where $U$ is an open subset of $\mathbb{R}^n \times \mathbb{R}^m$. Fix attention on a point $(x_0, y_0) \in U$ and write $f(x_0, y_0) = z_0$. Our goal is to solve the equation
$$f(x, y) = z_0 \tag{7}$$
near $(x_0, y_0)$. More precisely, we hope to show that the set of points $(x, y)$ nearby $(x_0, y_0)$ at which $f(x, y) = z_0$, the so-called $z_0$-locus of $f$, is the graph of a function $y = g(x)$. If so, $g$ is the implicit function defined by (7). See Figure 106.
Figure 106 Near $(x_0, y_0)$ the $z_0$-locus of $f$ is the graph of a function $y = g(x)$.
Under various hypotheses we will show that $g$ exists, is unique, and is differentiable. The main assumption, which we make throughout this section, is that the $m \times m$ matrix
$$B = \left[\frac{\partial f_i(x_0, y_0)}{\partial y_j}\right]$$
is invertible. Equivalently the linear transformation that $B$ represents is an isomorphism $\mathbb{R}^m \to \mathbb{R}^m$.
22 Implicit Function Theorem If the function $f$ above is $C^r$, $1 \le r \le \infty$, then near $(x_0, y_0)$, the $z_0$-locus of $f$ is the graph of a unique function $y = g(x)$. Besides, $g$ is $C^r$.
Proof Without loss of generality, we suppose that $(x_0, y_0)$ is the origin $(0, 0)$ in $\mathbb{R}^n \times \mathbb{R}^m$, and $z_0 = 0$ in $\mathbb{R}^m$. The Taylor expression for $f$ is
$$f(x, y) = Ax + By + R$$
where $A$ is the $m \times n$ matrix
$$A = \left[\frac{\partial f_i(x_0, y_0)}{\partial x_j}\right]$$
and $R$ is sublinear. Then, solving $f(x, y) = 0$ for $y = gx$ is equivalent to solving
$$y = -B^{-1}(Ax + R(x, y)). \tag{8}$$
In the unlikely event that $R$ does not depend on $y$, (8) is an explicit formula for $gx$ and the implicit function is an explicit function. In general, the idea is that the remainder $R$ depends so weakly on $y$ that we can switch it to the left hand side of (8), absorbing it in the $y$ term.
Solving (8) for $y$ as a function of $x$ is the same as finding a fixed point of
$$K_x : y \mapsto -B^{-1}(Ax + R(x, y)),$$
so we hope to show that $K_x$ contracts. The remainder $R$ is a $C^1$ function, and $(DR)_{(0,0)} = 0$. Therefore if $r$ is small and $|x|, |y| \le r$ then
$$\|B^{-1}\| \left\|\frac{\partial R(x, y)}{\partial y}\right\| \le \frac{1}{2}.$$
By the Mean Value Theorem this implies that
$$|K_x(y_1) - K_x(y_2)| \le \|B^{-1}\|\,|R(x, y_1) - R(x, y_2)| \le \|B^{-1}\| \left\|\frac{\partial R}{\partial y}\right\| |y_1 - y_2| \le \frac{1}{2}|y_1 - y_2|$$
for $|x|, |y_1|, |y_2| \le r$. Due to continuity at the origin, if $|x| \le \tau \ll r$ then
$$|K_x(0)| \le \frac{r}{2}.$$
Thus, for all $x \in X$, $K_x$ contracts $Y$ into itself where $X$ is the $\tau$-neighborhood of 0 in $\mathbb{R}^n$ and $Y$ is the closure of the $r$-neighborhood of 0 in $\mathbb{R}^m$. See Figure 107.
Figure 107 $K_x$ contracts $Y$ into itself.
By the Contraction Mapping Principle, $K_x$ has a unique fixed point $g(x)$ in $Y$. This implies that near the origin, the zero locus of $f$ is the graph of a function $y = g(x)$.
It remains to check that $g$ is $C^r$. First we show that $g$ obeys a Lipschitz condition at 0. We have
$$|gx| = |K_x(gx) - K_x(0) + K_x(0)| \le \mathrm{Lip}(K_x)\,|gx - 0| + |K_x(0)| \le \frac{1}{2}|gx| + |B^{-1}(Ax + R(x, 0))| \le \frac{1}{2}|gx| + 2L|x|$$
where $L = \|B^{-1}\|\,\|A\|$ and $|x|$ is small. Thus $g$ satisfies the Lipschitz condition
$$|gx| \le 4L|x|.$$
In particular, $g$ is continuous at $x = 0$. Note the trick here. The term $|gx|$ appears on both sides of the inequality but since its coefficient on the r.h.s. is smaller than that on the l.h.s., they combine to give a nontrivial inequality.
By the chain rule, the derivative of $g$ at the origin, if it does exist, must satisfy $A + B(Dg)_0 = 0$, so we aim to show that $(Dg)_0 = -B^{-1}A$. Since $gx$ is a fixed point of $K_x$, we have $gx = -B^{-1}(Ax + R)$ and the Taylor estimate for $g$ at the origin is
$$|g(x) - g(0) - (-B^{-1}Ax)| = |B^{-1}R(x, gx)| \le \|B^{-1}\|\,|R(x, gx)| \le \|B^{-1}\|\,e(x, gx)(|x| + |gx|) \le \|B^{-1}\|\,e(x, gx)(1 + 4L)|x|$$
where $e(x, y) \to 0$ as $(x, y) \to (0, 0)$. Since $gx \to 0$ as $x \to 0$, the error factor $e(x, gx)$ does tend to 0 as $x \to 0$, the remainder is sublinear with respect to $x$, and $g$ is differentiable at 0 with $(Dg)_0 = -B^{-1}A$.
All facts proved at the origin hold equally at points $(x, y)$ on the zero locus near the origin. For the origin is nothing special. Thus, $g$ is differentiable at $x$ and $(Dg)_x = -B_x^{-1} \circ A_x$ where
$$A_x = \frac{\partial f(x, gx)}{\partial x}, \qquad B_x = \frac{\partial f(x, gx)}{\partial y}.$$
Since $gx$ is continuous (being differentiable) and $f$ is $C^1$, $A_x$ and $B_x$ are continuous functions of $x$. According to Cramer's Rule for finding the inverse of a matrix, the entries of $B_x^{-1}$ are explicit, algebraic functions of the entries of $B_x$, and therefore they depend continuously on $x$. Therefore $g$ is $C^1$.
To complete the proof that $g$ is $C^r$, we apply induction. For $2 \le r < \infty$, assume the theorem is true for $r - 1$. When $f$ is $C^r$ this implies that $g$ is $C^{r-1}$. Because they are composites of $C^{r-1}$ functions, $A_x$ and $B_x$ are $C^{r-1}$. Because the entries of $B_x^{-1}$ depend algebraically on the entries of $B_x$, $B_x^{-1}$ is also $C^{r-1}$. Therefore $(Dg)_x$ is $C^{r-1}$ and $g$ is $C^r$. If $f$ is $C^\infty$, we have just shown that $g$ is $C^r$ for all finite $r$, and thus $g$ is $C^\infty$. □
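The contraction $K_x$ in the proof can be iterated directly. The sketch below does this for one hand-picked scalar example, $f(x, y) = x + 2y + y^2$, which is an assumption made for illustration and not an example from the text. Here $A = 1$, $B = 2$, $R(x, y) = y^2$, and the zero locus near the origin is known in closed form, so the iterate can be checked.

```python
import math

def f(x, y):
    # Illustrative function with f(0, 0) = 0 and B = df/dy(0, 0) = 2 invertible.
    return x + 2.0 * y + y * y

def K(x, y):
    # K_x(y) = -B^{-1}(A x + R(x, y)) with A = 1, B = 2, R(x, y) = y^2.
    return -0.5 * (x + y * y)

def implicit_g(x, tol=1e-14, max_iter=200):
    """Fixed point of K_x found by iteration, starting from y = 0."""
    y = 0.0
    for _ in range(max_iter):
        y_next = K(x, y)
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    raise RuntimeError("iteration did not converge; x may be too far from 0")

x = 0.1
y = implicit_g(x)
closed_form = -1.0 + math.sqrt(1.0 - x)   # root of y^2 + 2y + x = 0 near 0
# Central difference for (Dg)_0, to compare with -B^{-1}A = -1/2.
slope = (implicit_g(1e-6) - implicit_g(-1e-6)) / 2e-6
print(y, closed_form, slope)
```

The iteration is practical here only because $x$ is small enough for $K_x$ to contract on a neighborhood of $0$; the theorem guarantees such a neighborhood exists but does not say how large it is.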
Exercises 35 and 36 discuss the properties of matrix inversion avoiding Cramer's Rule and finite dimensionality.
Next we are going to deduce the Inverse Function Theorem from the Implicit Function Theorem. A fair question is: since they turn out to be equivalent theorems, why not do it the other way around? Well, in my own experience, the Implicit Function Theorem is more basic and flexible. I have at times needed forms of the Implicit Function Theorem with weaker differentiability hypotheses respecting $x$ than $y$ and they do not follow from the Inverse Function Theorem. For example, if we merely assume that $B = \partial f(x_0, y_0)/\partial y$ is invertible, that $\partial f(x, y)/\partial y$ is a continuous function of $(x, y)$, and that $f$ is continuous (or Lipschitz) then the local implicit function of $f$ is continuous (or Lipschitz). It is not necessary to assume that $f$ is of class $C^1$.
Just as a homeomorphism is a continuous bijection whose inverse is continuous, so a $C^r$ diffeomorphism is a $C^r$ bijection whose inverse is $C^r$. (We assume $1 \le r \le \infty$.) The inverse being $C^r$ is not automatic. The example to remember is $f(x) = x^3$. It is a $C^\infty$ bijection $\mathbb{R} \to \mathbb{R}$ and is a homeomorphism but not a diffeomorphism because its inverse fails to be differentiable at the origin. Since differentiability implies continuity, every diffeomorphism is a homeomorphism.
Diffeomorphisms are to $C^r$ things as isomorphisms are to algebraic things. The sphere and ellipsoid are diffeomorphic under a diffeomorphism $\mathbb{R}^3 \to \mathbb{R}^3$, but the sphere and the surface of the cube are only homeomorphic, not diffeomorphic.
23 Inverse Function Theorem If $m = n$ and $f : U \to \mathbb{R}^m$ is $C^r$, $1 \le r \le \infty$, and if at some $p \in U$, $(Df)_p$ is an isomorphism, then $f$ is a $C^r$ diffeomorphism from a neighborhood of $p$ to a neighborhood of $f(p)$.
Proof Set $F(x, y) = f(x) - y$. Clearly $F$ is $C^r$, $F(p, fp) = 0$, and the derivative of $F$ with respect to $x$ at $(p, fp)$ is $(Df)_p$. Since $(Df)_p$ is an isomorphism, we can apply the implicit function theorem (with $x$ and $y$ interchanged!) to find neighborhoods $U_0$ of $p$ and $V$ of $fp$ and a $C^r$ implicit function $h : V \to U_0$ uniquely defined by the equation
$$F(hy, y) = 0.$$
Then $f(hy) = y$, so $f \circ h = \mathrm{id}_V$ and $h$ is a right inverse of $f$. Except for a little fussy set theory, this completes the proof: $f$ bijects $U_1 = \{x \in U_0 : fx \in V\}$ onto $V$ and its inverse is $h$, which we know to be a $C^r$ map. To be precise, we must check three things,
(a) $U_1$ is a neighborhood of $p$.
(b) $h$ is a right inverse of $f|_{U_1}$. That is, $f|_{U_1} \circ h = \mathrm{id}_V$.
(c) $h$ is a left inverse of $f|_{U_1}$. That is, $h \circ f|_{U_1} = \mathrm{id}_{U_1}$.
See Figure 108.
Figure 108 $f$ is a local diffeomorphism.
(a) Since $f$ is continuous, $U_1$ is open. Since $p \in U_0$ and $fp \in V$, $p$ belongs to $U_1$.
(b) Take any $y \in V$. Since $hy \in U_0$ and $f(hy) = y$, we see that $hy \in U_1$. Thus, $f|_{U_1} \circ h$ is well defined and $f|_{U_1} \circ h(y) = f \circ h(y) = y$.
(c) Take any $x \in U_1$. By definition of $U_1$, $fx \in V$ and there is a unique point $h(fx)$ in $U_0$ such that $F(h(fx), fx) = 0$. Observe that $x$ itself is just such a point. It lies in $U_0$ because it lies in $U_1$, and it satisfies $F(x, fx) = 0$ since $F(x, fx) = fx - fx$. By uniqueness of $h$, $h(f(x)) = x$. □
Upshot If $(Df)_p$ is an isomorphism, then $f$ is a local diffeomorphism at $p$.
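The theorem asserts that a local inverse $h$ exists; it does not say how to compute it. One common numerical approach, shown here as a sketch and not as the method of the proof, is Newton's method started at $p$. The map `f` below is a hypothetical example with $(Df)_0$ equal to the identity, and all names are assumptions for illustration.

```python
def f(x, y):
    # Illustrative smooth map R^2 -> R^2 with (Df)_0 = identity (an assumption).
    return (x + y * y, y + x * x)

def jacobian(x, y):
    # Matrix of partials of f at (x, y), stored as ((a, b), (c, d)).
    return ((1.0, 2.0 * y), (2.0 * x, 1.0))

def local_inverse(u, v, steps=50, tol=1e-14):
    """Approximate h(u, v) with f(h(u, v)) = (u, v) by Newton's method from p = (0, 0)."""
    x, y = 0.0, 0.0
    for _ in range(steps):
        fx, fy = f(x, y)
        ru, rv = fx - u, fy - v          # residual f(x, y) - (u, v)
        if abs(ru) + abs(rv) < tol:
            break
        (a, b), (c, d) = jacobian(x, y)
        det = a * d - b * c              # nonzero near p since (Df)_p is invertible
        # Solve J (dx, dy) = (ru, rv) by Cramer's Rule.
        dx = (d * ru - b * rv) / det
        dy = (a * rv - c * ru) / det
        x, y = x - dx, y - dy
    return x, y

hx, hy = local_inverse(0.05, -0.03)      # right inverse check: f(h(y)) = y
back = local_inverse(*f(0.02, 0.01))     # left inverse check: h(f(x)) = x
print(f(hx, hy), back)
```

Both checks only make sense for targets close to $f(p)$; far from $p$ the iteration may fail to converge or may land on a different preimage.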
6* The Rank Theorem
The rank of a linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ is the dimension of its range. In terms of matrices, the rank is the size of the largest minor with nonzero determinant. If $T$ is onto then its rank is $m$. If it is one-to-one, its rank is $n$. A standard formula in linear algebra states that
$$\operatorname{rank} T + \operatorname{nullity} T = n$$
where nullity is the dimension of the kernel of $T$. A differentiable function $f : U \to \mathbb{R}^m$ has constant rank $k$ if for all $p \in U$ the rank of $(Df)_p$ is $k$.
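For concreteness, rank can be computed by Gaussian elimination. The sketch below uses exact rational arithmetic so that no rounding tolerance is needed; the matrices `T` and `S` are made-up examples, with `S` a small perturbation of `T`.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix (list of rows) by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(entry) for entry in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r

T = [[1, 2, 3],
     [2, 4, 6]]                                   # T : R^3 -> R^2, row 2 = 2 * row 1
S = [[1, 2, 3],
     [2, 4, Fraction(6) + Fraction(1, 1000)]]     # a small perturbation of T
n = 3
nullity_T = n - rank(T)                           # rank T + nullity T = n
print(rank(T), nullity_T, rank(S))
```

In this example the perturbation raises the rank from 1 to 2; a sufficiently small perturbation can raise rank in this way, but it cannot lower it.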
An important property of rank is that if $T$ has rank $k$ and $\|S - T\|$ is small, then $S$ has rank $\ge k$. The rank of $T$ can increase under a small perturbation of $T$ but it cannot decrease. Thus, if $f$ is $C^1$ and $(Df)_p$ has rank $k$ then automatically $(Df)_x$ has rank $\ge k$ for all $x$ near $p$. See Exercise 43.
The Rank Theorem describes maps of constant rank. It says that locally they are just like linear projections. To formalize this we say that maps $f : A \to B$ and $g : C \to D$ are equivalent (for want of a better word) if there are bijections $\alpha : A \to C$ and $\beta : B \to D$ such that $g = \beta \circ f \circ \alpha^{-1}$. An elegant way to express this equation is as commutativity of the diagram
$$\begin{array}{ccc} A & \xrightarrow{\ f\ } & B \\ {\scriptstyle\alpha}\downarrow & & \downarrow{\scriptstyle\beta} \\ C & \xrightarrow{\ g\ } & D. \end{array}$$
Commutativity means that for each $a \in A$, $\beta(f(a)) = g(\alpha(a))$. Following the maps around the rectangle clockwise from $A$ to $D$ gives the same result as following them around it counterclockwise. The $\alpha, \beta$ are "changes of variable." If $f, g$ are $C^r$ and $\alpha, \beta$ are $C^r$ diffeomorphisms, $1 \le r \le \infty$, then $f$ and $g$ are said to be $C^r$ equivalent, and we write $f \approx_r g$. As $C^r$ maps, $f$ and $g$ are indistinguishable.
24 Lemma $C^r$ equivalence is an equivalence relation and it has no effect on rank.
Proof Since diffeomorphisms form a group, $\approx_r$ is an equivalence relation. Also, if $g = \beta \circ f \circ \alpha^{-1}$, then the chain rule implies that
$$Dg = D\beta \circ Df \circ D\alpha^{-1}.$$
Since $D\beta$ and $D\alpha^{-1}$ are isomorphisms, $Df$ and $Dg$ have equal rank. □
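The linear projection $P : \mathbb{R}^n \to \mathbb{R}^m$,
$$P(x_1, \ldots, x_n) = (x_1, \ldots, x_k, 0, \ldots, 0),$$
has rank $k$. It projects $\mathbb{R}^n$ onto the $k$-dimensional subspace $\mathbb{R}^k \times 0$. The matrix of $P$ is
$$\begin{bmatrix} I_{k \times k} & 0 \\ 0 & 0 \end{bmatrix}.$$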
25 Rank Theorem Locally, a $C^r$ constant rank $k$ map is $C^r$ equivalent to a linear projection onto a $k$-dimensional subspace.
As an example, think of the radial projection $\pi : \mathbb{R}^3 \setminus \{0\} \to S^2$, where $\pi(v) = v/|v|$. It has constant rank 2, and is locally indistinguishable from linear projection of $\mathbb{R}^3$ to the $(x, y)$-plane.
Proof Let $f : U \to \mathbb{R}^m$ have constant rank $k$ and let $p \in U$ be given. We will show that on a neighborhood of $p$, $f \approx_r P$.
Step 1. Define translations of $\mathbb{R}^n$ and $\mathbb{R}^m$ by
$$\tau : \mathbb{R}^n \to \mathbb{R}^n,\ z \mapsto z + p, \qquad \tau' : \mathbb{R}^m \to \mathbb{R}^m,\ z' \mapsto z' - fp.$$
The translations are diffeomorphisms of $\mathbb{R}^n$ and $\mathbb{R}^m$ and they show that $f$ is $C^r$ equivalent to $\tau' \circ f \circ \tau$, a $C^r$ map that sends 0 to 0 and has constant rank $k$. Thus, it is no loss of generality to assume in the first place that $p$ is the origin in $\mathbb{R}^n$ and $fp$ is the origin in $\mathbb{R}^m$. We do so.
Step 2. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be an isomorphism that sends $0 \times \mathbb{R}^{n-k}$ onto the kernel of $(Df)_0$. Since the kernel has dimension $n - k$, there is such a $T$. Let $T' : \mathbb{R}^m \to \mathbb{R}^m$ be an isomorphism that sends the image of $(Df)_0$ onto $\mathbb{R}^k \times 0$. Since $(Df)_0$ has rank $k$ there is such a $T'$. Then $f \approx_r T' \circ f \circ T$. This map sends the origin in $\mathbb{R}^n$ to the origin in $\mathbb{R}^m$, its derivative at the origin has kernel $0 \times \mathbb{R}^{n-k}$, and its image is $\mathbb{R}^k \times 0$. Thus, it is no loss of generality to assume in the first place that $f$ has these properties. We do so.
Step 3. Write
$$f(x, y) = (f_X(x, y), f_Y(x, y)) \in \mathbb{R}^k \times \mathbb{R}^{m-k}.$$
We are going to find a $g \approx_r f$ such that
$$g(x, 0) = (x, 0).$$
The matrix of $(Df)_0$ is
$$\begin{bmatrix} A & 0 \\ 0 & 0 \end{bmatrix}$$
where $A$ is $k \times k$ and invertible. Thus, by the Inverse Function Theorem, the map
$$\sigma : x \mapsto f_X(x, 0)$$
is a diffeomorphism $\sigma : X \to X'$ where $X, X'$ are small neighborhoods of the origin in $\mathbb{R}^k$. For $x' \in X'$, set
$$h(x') = f_Y(\sigma^{-1}(x'), 0).$$
This makes $h$ a $C^r$ map $X' \to \mathbb{R}^{m-k}$, and
$$h(\sigma(x)) = f_Y(x, 0).$$
The image of $X \times 0$ under $f$ is the graph of $h$. For
$$f(X \times 0) = \{f(x, 0) : x \in X\} = \{(f_X(x, 0), f_Y(x, 0)) : x \in X\}$$
$$= \{(f_X(\sigma^{-1}(x'), 0), f_Y(\sigma^{-1}(x'), 0)) : x' \in X'\} = \{(x', h(x')) : x' \in X'\}.$$
See Figure 109.
Figure 109 The image of $X \times 0$ is the graph of $h$.
For $(x', y') \in X' \times \mathbb{R}^{m-k}$, define
$$\psi(x', y') = (\sigma^{-1}(x'), y' - h(x')).$$
Since $\psi$ is the composite of $C^r$ diffeomorphisms,
$$(x', y') \mapsto (x', y' - h(x')) \mapsto (\sigma^{-1}(x'), y' - h(x')),$$
it too is a $C^r$ diffeomorphism. (Alternately, you could compute the derivative of $\psi$ at the origin and apply the Inverse Function Theorem.) We observe that $g = \psi \circ f \approx_r f$ satisfies
$$g(x, 0) = \psi(f_X(x, 0), f_Y(x, 0)) = (\sigma^{-1} \circ f_X(x, 0),\ f_Y(x, 0) - h(f_X(x, 0))) = (x, 0).$$
Thus, it is no loss of generality to assume in the first place that $f(x, 0) = (x, 0)$. We do so.
Step 4. Finally, we find a local diffeomorphism $\varphi$ in the neighborhood of 0 in $\mathbb{R}^n$ so that $f \circ \varphi$ is the projection map $P(x, y) = (x, 0)$. The equation
$$f_X(\xi, y) - x = 0$$
defines $\xi = \xi(x, y)$ implicitly in a neighborhood of the origin; it is a $C^r$ map from $\mathbb{R}^n$ into $\mathbb{R}^k$ and has $\xi(0, 0) = 0$. For, at the origin, the derivative of $f_X(\xi, y) - x$ with respect to $\xi$ is the invertible matrix $I_{k \times k}$. We claim that
$$\varphi(x, y) = (\xi(x, y), y)$$
is a local diffeomorphism of $\mathbb{R}^n$ and $G = f \circ \varphi$ is $P$. Write $F(\xi, x, y) = f_X(\xi, y) - x$. The derivative of $\xi(x, y)$ with respect to $x$ at the origin can be calculated from the chain rule (this was done in general for implicit functions) and it satisfies
$$0 = \frac{dF(\xi(x, y), x, y)}{dx} = \frac{\partial F}{\partial \xi}\frac{\partial \xi}{\partial x} + \frac{\partial F}{\partial x} = I_{k \times k}\frac{\partial \xi}{\partial x} - I_{k \times k}.$$
That is, at the origin $\partial \xi/\partial x$ is the identity matrix. Thus,
$$(D\varphi)_0 = \begin{bmatrix} I_{k \times k} & * \\ 0 & I_{(n-k) \times (n-k)} \end{bmatrix}$$
which is invertible no matter what $*$ is. Clearly
$$G(x, y) = f \circ \varphi(x, y) = f(\xi(x, y), y) = (f_X(\xi, y), f_Y(\xi, y)) = (x, G_Y(x, y)).$$
Therefore $G_X(x, y) = x$ and
$$DG = \begin{bmatrix} I_{k \times k} & 0 \\ * & \dfrac{\partial G_Y}{\partial y} \end{bmatrix}.$$
At last we use the constant rank hypothesis. (Until now, it has been enough that $Df$ has rank $\ge k$.) The only way that a matrix of this form can have rank $k$ is that
$$\frac{\partial G_Y}{\partial y} = 0.$$
See Exercise 43. By Corollary 13 to the Mean Value Theorem, this implies that in a neighborhood of the origin, $G_Y$ is independent of $y$. Thus
$$G_Y(x, y) = G_Y(x, 0) = f_Y(\xi(x, 0), 0),$$
which is 0 because $f_Y = 0$ on $\mathbb{R}^k \times 0$. The upshot is that $G \approx_r f$ and $G(x, y) = (x, 0)$; i.e., $G = P$. See also Exercise 31.
By Lemma 24, steps 1-4 concatenate to give a $C^r$ equivalence between the original constant rank map $f$ and the linear projection $P$. □
26 Corollary If $f : U \to \mathbb{R}^m$ has rank $k$ at $p$, then it is locally $C^r$ equivalent to a map of the form $G(x, y) = (x, g(x, y))$ where $g : \mathbb{R}^n \to \mathbb{R}^{m-k}$ is $C^r$ and $x \in \mathbb{R}^k$.
Proof This was shown in the proof of the Rank Theorem before we used the assumption that $f$ has constant rank $k$. □
27 Corollary If $f : U \to \mathbb{R}$ is $C^r$ and $(Df)_p$ has rank 1, then in a neighborhood of $p$ the level sets $\{x \in U : f(x) = c\}$ form a stack of $C^r$ nonlinear discs of dimension $n - 1$.
Proof Near $p$ the rank cannot decrease, so $f$ has constant rank 1 near $p$. The level sets of a projection $\mathbb{R}^n \to \mathbb{R}$ form a stack of planes and the level sets of $f$ are the images of these planes under the equivalence diffeomorphism in the Rank Theorem. See Figure 110. □
Figure 110 Near a rank-one point, the level sets of $f : U \to \mathbb{R}$ are diffeomorphic to a stack of planes.
28 Corollary If $f : U \to \mathbb{R}^m$ has rank $n$ at $p$ then locally the image of $U$ under $f$ is a diffeomorphic copy of the $n$-dimensional disc.
Proof Near $p$ the rank cannot decrease, so $f$ has constant rank $n$ near $p$. The Rank Theorem says that $f$ is locally $C^r$ equivalent to $x \mapsto (x, 0)$. (Since $k = n$, the $y$-coordinates are absent.) Thus, the local image of $U$ is diffeomorphic to a neighborhood of 0 in $\mathbb{R}^n \times 0$, which is an $n$-dimensional disc. □
The geometric meaning of the diffeomorphisms $\psi$ and $\varphi$ is illustrated in Figures 111 and 112.
Figure 111 $f$ has constant rank 1.
7* Lagrange Multipliers
In sophomore calculus you learn how to maximize a function $f(x, y, z)$ subject to a "constraint" or "side condition" $g(x, y, z) = \text{const.}$ by the Lagrange multiplier method. Namely, the maximum can occur only at a point $p$ where the gradient of $f$ is a scalar multiple of the gradient of $g$,
$$\operatorname{grad}_p f = \lambda \operatorname{grad}_p g.$$
The factor $\lambda$ is the Lagrange multiplier. The goal of this section is a natural, mathematically complete explanation of the Lagrange multiplier method, which amounts to gazing at the right picture. First, the natural hypotheses are:
Figure 112 $f$ has constant rank 2.
(a) $f$ and $g$ are $C^1$ real-valued functions defined on some region $U \subset \mathbb{R}^3$.
(b) For some constant $c$, the set $S = g^{\mathrm{pre}}(c)$ is compact, nonempty, and $\operatorname{grad}_q g \ne 0$ for all $q \in S$.
The conclusion is
(c) The restriction of $f$ to the set $S$, $f|_S$, has a maximum, say $M$, and if $p \in S$ has $f(p) = M$ then there is a $\lambda$ such that $\operatorname{grad}_p f = \lambda \operatorname{grad}_p g$.
The method is utilized as follows. You are given† $f$ and $g$, and you are asked to find a point $p \in S$ at which $f|_S$ is maximum. Compactness implies that a maximum point exists; your job is to find it. You first locate all points $q \in S$ at which the gradients of $f$ and $g$ are linearly dependent; i.e., one gradient is a scalar multiple of the other. They are "candidates" for the maximum point. You then evaluate $f$ at each candidate and the one with the largest $f$-value is the maximum. Done. Of course you can find the minimum the same way. It too will be among the candidates, and it will have the smallest $f$-value. In fact the candidates are exactly the critical points of $f|_S$, the points $x \in S$ such that
$$\frac{fy - fx}{|y - x|} \to 0$$
as $y \in S$ tends to $x$.
† Sometimes you are merely given $f$ and $S$. Then you must think up an appropriate $g$ such that (b) is true.
Now we explain why the Lagrange multiplier method works. Recall that the gradient of a function $h(x, y, z)$ at $p \in U$ is the vector
$$\operatorname{grad}_p h = \left(\frac{\partial h(p)}{\partial x}, \frac{\partial h(p)}{\partial y}, \frac{\partial h(p)}{\partial z}\right) \in \mathbb{R}^3.$$
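The candidate-hunting recipe described above can be sketched on a small example chosen for illustration (it is not from the text): maximize $f(x, y, z) = x + y + z$ on the sphere $S = \{x^2 + y^2 + z^2 = 3\}$. Solving $\operatorname{grad} f = \lambda \operatorname{grad} g$ by hand gives the two candidates $\pm(1, 1, 1)$; the code checks them and compares with a coarse sampling of $S$.

```python
import math

def f(p):
    x, y, z = p
    return x + y + z

def grad_f(p):
    return (1.0, 1.0, 1.0)

def g(p):
    x, y, z = p
    return x * x + y * y + z * z

def grad_g(p):
    x, y, z = p
    return (2.0 * x, 2.0 * y, 2.0 * z)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def is_candidate(p, c=3.0, tol=1e-12):
    """True if p lies on S = {g = c} and grad f, grad g are linearly dependent at p."""
    on_surface = abs(g(p) - c) < tol
    dependent = all(abs(t) < tol for t in cross(grad_f(p), grad_g(p)))
    return on_surface and dependent

# Candidates found by hand from grad f = lambda * grad g together with g = 3.
candidates = [(1.0, 1.0, 1.0), (-1.0, -1.0, -1.0)]
best = max(candidates, key=f)     # candidate with the largest f-value
worst = min(candidates, key=f)    # candidate with the smallest f-value

# Coarse sanity check: sample S in spherical coordinates and compare f-values.
radius = math.sqrt(3.0)
samples = [(radius * math.sin(a) * math.cos(b),
            radius * math.sin(a) * math.sin(b),
            radius * math.cos(a))
           for a in [i * math.pi / 40 for i in range(41)]
           for b in [j * math.pi / 40 for j in range(80)]]
sample_max = max(f(q) for q in samples)
print(best, f(best), sample_max)
```

The cross product test for linear dependence is specific to $\mathbb{R}^3$; in higher dimensions one would test the rank of the matrix of gradients instead.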
Assume the hypotheses (a), (b), and that $f|_S$ attains its maximum value $M$ at $p \in S$. We must prove (c): the gradient of $f$ at $p$ is a scalar multiple of the gradient of $g$ at $p$. If $\operatorname{grad}_p f = 0$ then $\operatorname{grad}_p f = 0 \cdot \operatorname{grad}_p g$, which verifies (c) degenerately. Thus, it is fair to assume that $\operatorname{grad}_p f \ne 0$.
By the Rank Theorem, in the neighborhood of a point at which the gradient of $f$ is nonzero, the $f$-level surfaces are like a stack of pancakes. (The pancakes are infinitely thin and may be somewhat curved. Alternately, you can picture the level surfaces as layers of an onion skin, or as a pile of transparency foils.)
To arrive at a contradiction, assume that $\operatorname{grad}_p f$ is not a scalar multiple of $\operatorname{grad}_p g$. The angle between the gradients is nonzero. Gaze at the $f$-level surfaces $f = M \pm \epsilon$ for $\epsilon$ small. The way these $f$-level surfaces meet the $g$-level surface $S$ is shown in Figure 113. The surface $S$ is a knife blade that slices through the $f$-pancakes. The knife blade is perpendicular to $\operatorname{grad} g$, while the pancakes are perpendicular to $\operatorname{grad} f$. There is a positive angle between these gradient vectors, so the knife is not tangent to the pancakes. Rather, $S$ slices transversely through each $f$-level surface near $p$, and $S \cap \{f = M + \epsilon\}$ is a curve that passes near $p$. The value of $f$ on this curve is $M + \epsilon$, which contradicts the assumption that $f|_S$ attains a maximum at $p$. Therefore $\operatorname{grad}_p f$ is, after all, a scalar multiple of $\operatorname{grad}_p g$ and the proof of (c) is complete.
Figure 113 $S$ cuts through all the $f$-level surfaces near $p$.
There is a higher-dimensional version of the Lagrange multiplier method. A $C^1$ function $f : U \to \mathbb{R}$ is defined on an open set $U \subset \mathbb{R}^n$, and it is constrained to a compact "surface" $S \subset U$ defined by $k$ simultaneous equations
$$g_1(x) = c_1, \quad \ldots, \quad g_k(x) = c_k.$$
We assume the functions $g_i$ are $C^1$ and their gradients are linearly independent. The higher-dimensional Lagrange multiplier method asserts that if $f|_S$ achieves a maximum at $p$, then $\operatorname{grad}_p f$ is a linear combination of $\operatorname{grad}_p g_1, \ldots, \operatorname{grad}_p g_k$. In contrast to Protter and Morrey's presentation on pages 369-372 of their book, A First Course in Real Analysis, the proof is utterly simple: it amounts to examining the situation in the right coordinate system at $p$.
It is no loss of generality to assume that $p$ is the origin in $\mathbb{R}^n$ and that $c_1, \ldots, c_k, f(p)$ are zero. Also, we can assume that $\operatorname{grad}_p f \ne 0$, since otherwise it is already a trivial linear combination of the gradients of the $g_i$. Suppose, for the sake of contradiction, that $\operatorname{grad}_0 f$ is not a linear combination of $\operatorname{grad}_0 g_1, \ldots, \operatorname{grad}_0 g_k$. Then choose vectors $w_{k+2}, \ldots, w_n$ so that
$$\operatorname{grad}_0 g_1, \ldots, \operatorname{grad}_0 g_k, \operatorname{grad}_0 f, w_{k+2}, \ldots, w_n$$
is a basis of $\mathbb{R}^n$. For $k + 2 \le i \le n$ define
$$h_i(x) = \langle w_i, x \rangle.$$
The map $x \mapsto F(x) = (g_1(x), \ldots, g_k(x), f(x), h_{k+2}(x), \ldots, h_n(x))$ is a local diffeomorphism of $\mathbb{R}^n$ to itself, since the derivative of $F$ at the origin is the $n \times n$ matrix of linearly independent column vectors,
$$(DF)_0 = \big[\operatorname{grad}_0 g_1 \ \ldots\ \operatorname{grad}_0 g_k \ \ \operatorname{grad}_0 f \ \ w_{k+2} \ \ldots\ w_n\big].$$
Think of the functions $y_i = F_i(x)$ as new coordinates on a neighborhood of the origin in $\mathbb{R}^n$. With respect to these coordinates, the surface $S$ is the coordinate plane $0 \times \mathbb{R}^{n-k}$ on which the coordinates $y_1, \ldots, y_k$ are zero, and $f$ is the $(k+1)$st coordinate function $y_{k+1}$. This coordinate function obviously does not attain a maximum on the coordinate plane $0 \times \mathbb{R}^{n-k}$, so $f|_S$ attains no maximum at $p$.
8 Multiple Integrals
In this section we generalize to $n$ variables the one-variable Riemann integration theory appearing in Chapter 3. For simplicity, we assume throughout that the function $f$ we integrate is real-valued, as contrasted to vector-valued, and at first we assume that $f$ is a function of only two variables.
Consider a rectangle $R = [a, b] \times [c, d]$ in $\mathbb{R}^2$. Partitions $P$ and $Q$ of $[a, b]$ and $[c, d]$,
$$P : a = x_0 < x_1 < \cdots < x_m = b \qquad Q : c = y_0 < y_1 < \cdots < y_n = d$$
give rise to a grid $G = P \times Q$ of rectangles
$$R_{ij} = I_i \times J_j$$
where $I_i = [x_{i-1}, x_i]$ and $J_j = [y_{j-1}, y_j]$. Let $\Delta x_i = x_i - x_{i-1}$, $\Delta y_j = y_j - y_{j-1}$, and denote the area of $R_{ij}$ as
$$|R_{ij}| = \Delta x_i\,\Delta y_j.$$
Let $S$ be a choice of sample points $(s_{ij}, t_{ij}) \in R_{ij}$. See Figure 114. Given $f : R \to \mathbb{R}$, the corresponding Riemann sum is
$$R(f, G, S) = \sum_{i=1}^{m} \sum_{j=1}^{n} f(s_{ij}, t_{ij})\,|R_{ij}|.$$
If there is a number to which the Riemann sums converge as the mesh of the grid (the diameter of the largest rectangle) tends to zero, then $f$ is Riemann integrable and that number is the Riemann integral
$$\int_R f = \lim_{\text{mesh}\,G \to 0} R(f, G, S).$$
Section 8
Yj }j - 1
301
R,J •
(sij· ti)
Figure 1 14 A grid and
a
sample point.
The lower and upper sums of a bounded function grid G are
f with respect to the
U (f, G) = L Mij I Rij i where mij and Mi1 are the infimum and supremum of f (s, t) as (s , t) varies over R1 . The lower integral is the supremum of the lower sums and the upper integral is the infimum of the upper sums. The proofs of the following facts are conceptually identical to the one dimensional versions explained in Chapter 3 : (a) If f i s Riemann integrable then it i s bounded. (b) The set of Riemann integrable functions R ----+ lR is a vector space R = R( R ) and integration is a linear map R � JR. (c) The constant function f = k i s integrable and its integral is k I R I . (d) If f g E R and f ::=:: g then ,
(e) Every lower sum is less than or equal to every upper sum, and con sequently the lower integral is no greater than the upper integral,
Chapter 5
Multivariable Calculus
302
(f) For a bounded function, Riemann integrability is equivalent to the equality of the lower and upper integrals, and integrability implies equality of the lower, upper, and Riemann integrals. The Riemann-Lebesgue Theorem is another result that generalizes nat urally to multiple integrals. It states that a bounded function is Riemann integrable if and only if its discontinuities form a zero set. First of all , Z c IR2 is a zero set if for each E > 0 there is a countable covering of Z by open rectangles Sk whose total area is < E ,
L I Sk l k
< E.
By the $\epsilon/2^k$ construction, a countable union of zero sets is a zero set. As in dimension one, we express the discontinuity set of our function $f : R \to \mathbb{R}$ as the union
$$D = \bigcup_{\kappa > 0} D_\kappa$$
where $D_\kappa$ is the set of points $z \in R$ at which the oscillation is $\ge \kappa$. That is,
$$\operatorname{osc}_z f = \lim_{r \to 0} \operatorname{diam}\big(f(R_r(z))\big) \ge \kappa$$
where $R_r(z)$ is the $r$-neighborhood of $z$ in $R$. The set $D_\kappa$ is compact.
Assume that $f : R \to \mathbb{R}$ is Riemann integrable. It is bounded and its upper and lower integrals are equal. Fix $\kappa > 0$. Given $\epsilon > 0$, there exists $\delta > 0$ such that if $G$ is a grid with mesh $< \delta$ then
$$U(f, G) - L(f, G) < \epsilon.$$
Fix such a grid G. Each R;j i n the grid that contains i n its interior a point of DK has M;j - m ij � K, where m;j and M;j are the infimum and supremum of f on Rij . The other points of DK lie in the zero set of gridlines X; x [c, d] and [a , b] x yj. Since U L < E , the total area of these rectangles with oscillation � K does not exceed E f K . Since K is fixed and E is arbitrary, D�< is a zero set. Taking K = 1 /2, 1 / 3 , . . shows that the discontinuity set D = U DK is a zero set. Conversely, assume that f is bounded and D is a zero set. Fix any K > 0. Each z E R \ DK has a neighborhood W = Wz such that -
.
sup{ f ( w ) : w E W }
-
inf{ f ( w) : w E W } <
K.
Since DK is a zero set, it can be covered by countably many open rectangles Sk of small total area, say
303
Multiple Integrals
Section 8
Let V be the covering of R by the neighborhoods W with small oscillation, and the rectangles Sk . Since R is compact, V has a positive Lebesgue number A. Take a grid with mesh < A. This breaks the sum U-L =
L(Mij - mij ) I Rij l
into two parts : the sum of those terms for which Rij is contained in a neighborhood W with small oscillation, plus a sum of terms for which Rij is contained in one of the rectangles Sk . The latter sum is < 2M a , while the former is < K I R 1 . Thus, when K and a are small. U - L is small, which implies Riemann integrability. To summarize,
The Riemann-Lebesgue Theorem remains valid for functions of several variables.
Now we come to the first place that multiple integration has something new to say. Suppose that $f : R \to \mathbb{R}$ is bounded and define
$$\underline{F}(y) = \underline{\int_a^b} f(x, y)\,dx \qquad \overline{F}(y) = \overline{\int_a^b} f(x, y)\,dx.$$
For each fixed $y \in [c, d]$, these are the lower and upper integrals of the single variable function $f_y : [a, b] \to \mathbb{R}$ defined by $f_y(x) = f(x, y)$. They are the integrals of $f(x, y)$ on the slice $y = \text{const}$. See Figure 115.
Figure 115 Fubini's Theorem is like sliced bread
29 Fubini's Theorem If $f$ is Riemann integrable then so are $\underline{F}$ and $\overline{F}$. Moreover,
$$\int_R f = \int_c^d \underline{F}\,dy = \int_c^d \overline{F}\,dy.$$
Since $\underline{F} \le \overline{F}$ and the integral of their difference is zero, it follows from the one-dimensional Riemann-Lebesgue Theorem that there exists a linear zero set $Y \subset [c, d]$ such that if $y \notin Y$ then $\underline{F}(y) = \overline{F}(y)$. That is, the integral of $f(x, y)$ with respect to $x$ exists for almost all $y$, and we get the more common way to write the Fubini formula
$$\iint_R f\,dx\,dy = \int_c^d \left[ \int_a^b f(x, y)\,dx \right] dy.$$
There is, however, an ambiguity in this formula. What is the value of the integrand $\int_a^b f(x, y)\,dx$ when $y \in Y$? For such a $y$, $\underline{F}(y) < \overline{F}(y)$ and the integral of $f(x, y)$ with respect to $x$ does not exist. The answer is that we can choose any value between $\underline{F}(y)$ and $\overline{F}(y)$. The integral with respect to $y$ will be unaffected. See also Exercise 47.
Proof We claim that if $P$ and $Q$ are partitions of $[a, b]$ and $[c, d]$ then
$$L(f, G) \le L(\underline{F}, Q) \tag{9}$$
where $G$ is the grid $P \times Q$. Fix any partition interval $J_j \subset [c, d]$. If $y \in J_j$ then
$$m_{ij} = \inf\{f(s, t) : (s, t) \in R_{ij}\} \le \inf\{f(s, y) : s \in I_i\} = m_i(f_y).$$
Thus
$$\sum_{i=1}^m m_{ij}\,\Delta x_i \le \sum_{i=1}^m m_i(f_y)\,\Delta x_i = L(f_y, P) \le \underline{F}(y),$$
and it follows that
$$\sum_{i=1}^m m_{ij}\,\Delta x_i \le m_j(\underline{F}).$$
Therefore
$$L(f, G) = \sum_{j=1}^n \sum_{i=1}^m m_{ij}\,\Delta x_i\,\Delta y_j \le \sum_{j=1}^n m_j(\underline{F})\,\Delta y_j = L(\underline{F}, Q),$$
which gives (9). Analogously, $U(\overline{F}, Q) \le U(f, G)$. Thus
$$L(f, G) \le L(\underline{F}, Q) \le U(\underline{F}, Q) \le U(\overline{F}, Q) \le U(f, G).$$
Since $f$ is integrable, the outer terms of this inequality differ by arbitrarily little when the mesh of $G$ is small. Taking infima and suprema over all grids $G = P \times Q$ gives
$$\int_R f = \sup L(f, G) \le \sup L(\underline{F}, Q) \le \inf U(\underline{F}, Q) \le \inf U(f, G) = \int_R f.$$
The resulting equality of these five quantities implies that $\underline{F}$ is integrable and its integral on $[c, d]$ equals that of $f$ on $R$. The case of the upper integral is handled in the same way. □
30 Corollary If $f$ is Riemann integrable, then the order of integration - first $x$ then $y$, or vice versa - is irrelevant to the value of the iterated integral.
Proof Both iterated integrals equal the integral of $f$ over $R$. □
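As a rough numerical illustration of the corollary (a sketch only, not part of the proof), the following Python snippet approximates both iterated integrals of a sample function by midpoint sums. The function name `iterated_midpoint`, the sample integrand $f(x, y) = x y^2$, and the rectangle $[0, 1] \times [0, 2]$ are choices made here for illustration.

```python
def iterated_midpoint(f, a, b, c, d, n=200, x_first=True):
    """Approximate an iterated integral of f over [a, b] x [c, d] by midpoint sums."""
    hx = (b - a) / n
    hy = (d - c) / n
    xs = [a + (i + 0.5) * hx for i in range(n)]
    ys = [c + (j + 0.5) * hy for j in range(n)]
    if x_first:
        # Inner integral in x for each fixed y (a slice), then integrate in y.
        return sum(sum(f(x, y) for x in xs) * hx for y in ys) * hy
    # Inner integral in y for each fixed x, then integrate in x.
    return sum(sum(f(x, y) for y in ys) * hy for x in xs) * hx


def f(x, y):
    # Sample integrand (an assumption for this sketch); its exact integral
    # over [0, 1] x [0, 2] is (1/2) * (8/3) = 4/3.
    return x * y * y


dx_then_dy = iterated_midpoint(f, 0.0, 1.0, 0.0, 2.0, x_first=True)
dy_then_dx = iterated_midpoint(f, 0.0, 1.0, 0.0, 2.0, x_first=False)
```

Both orders agree up to rounding, and both approach $4/3$ as the mesh shrinks.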
A geometric consequence of Fubini's Theorem concerns the calculation of the area of plane regions by a slice method. Corresponding slice methods are valid in 3-space and in higher dimensions.
31 Cavalieri's Principle The area of a region $S \subset R$ is the integral with respect to $x$ of the length of its vertical slices,
$$\operatorname{area}(S) = \int_a^b \operatorname{length}(S_x)\,dx,$$
provided that the boundary of $S$ is a zero set.
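The slice formula can be tried out numerically. The sketch below takes $S$ to be the unit disc inside $R = [-1, 1] \times [-1, 1]$, an example chosen here and not taken from the text, and integrates the slice lengths $\operatorname{length}(S_x) = 2\sqrt{1 - x^2}$ by a midpoint sum.

```python
import math


def slice_length(x):
    """Length of the vertical slice S_x of the unit disc S = {x^2 + y^2 <= 1}."""
    return 2.0 * math.sqrt(max(0.0, 1.0 - x * x))


def area_by_slices(length, a, b, n=20000):
    """Midpoint-sum approximation of the integral of length(S_x) over [a, b]."""
    h = (b - a) / n
    return sum(length(a + (i + 0.5) * h) for i in range(n)) * h


disc_area = area_by_slices(slice_length, -1.0, 1.0)
```

The computed value is close to $\pi$, the area of the disc, whose boundary circle is a zero set as the principle requires.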
See Appendix B for a delightful discussion of the historical origin of Cavalieri's Principle. Deriving Cavalieri's Principle is mainly a matter of definition. For we define the length of a subset of $\mathbb{R}$ and the area of a subset of $\mathbb{R}^2$ to be the integrals of their characteristic functions. The requirement that $\partial S$ is a zero set is made so that $\chi_S$ is Riemann integrable. It is met if $S$ has a smooth, or piecewise smooth, boundary. See Chapter 6 for a more geometric definition of length and area in terms of outer measure.
The second new aspect of multiple integration concerns the change of variables formula. We will suppose that $\varphi : U \to W$ is a $C^1$ diffeomorphism between open subsets of $\mathbb{R}^2$, that $R \subset U$, and that a Riemann integrable function $f : W \to \mathbb{R}$ is given. The Jacobian of $\varphi$ at $z \in U$ is the determinant of the derivative,
$$\operatorname{Jac}_z \varphi = \det (D\varphi)_z.$$
See Figure 116.
32 Change of Variables Formula Under the preceding assumptions
$$\int_R f \circ \varphi \cdot |\operatorname{Jac} \varphi| = \int_{\varphi(R)} f.$$
Figure 116 $\varphi$ is a change of variables.
If $S$ is a bounded subset of $\mathbb{R}^2$, its area (or Jordan content) is by definition the integral of its characteristic function $\chi_S$, if the integral exists; when the integral does exist we say that $S$ is Riemann measurable. See also Appendix B of Chapter 6. According to the Riemann-Lebesgue Theorem, $S$ is Riemann measurable if and only if its boundary is a zero set. For $\chi_S$ is discontinuous at $z$ if and only if $z$ is a boundary point of $S$. See Exercise 44. The characteristic function of a rectangle $R$ is Riemann integrable, its integral is $|R|$, so we are justified in using the same notation for area of a general set $S$, namely
$$|S| = \operatorname{area}(S) = \int \chi_S.$$
33 Proposition If $T : \mathbb{R}^2 \to \mathbb{R}^2$ is an isomorphism then for every Riemann measurable set $S \subset \mathbb{R}^2$, $T(S)$ is Riemann measurable and
$$|T(S)| = |\det T|\,|S|.$$
Proposition 33 is a version of the Change of Variables Formula in which $\varphi = T$, $R = S$, and $f = 1$. It remains true for $n$-dimensional volume and leads to a definition of the determinant of a linear transformation as a "volume multiplier."
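The volume-multiplier statement is easy to test on polygons, whose areas can be computed exactly by the shoelace formula. The following sketch is an illustration under assumptions made here: the matrix `T`, the polygon `S`, and the helper names are hypothetical and not from the text.

```python
def shoelace_area(vertices):
    """Area of a simple polygon from its vertices listed in order (shoelace formula)."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def apply_linear(T, p):
    """Apply the 2 x 2 matrix T = ((a, b), (c, d)) to the point p = (x, y)."""
    (a, b), (c, d) = T
    x, y = p
    return (a * x + b * y, c * x + d * y)


# Hypothetical sample data: an invertible matrix and a convex quadrilateral.
T = ((2.0, 1.0), (-1.0, 3.0))
S = [(0.0, 0.0), (4.0, 0.0), (4.0, 1.0), (1.0, 3.0)]

det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]
area_S = shoelace_area(S)
area_TS = shoelace_area([apply_linear(T, p) for p in S])
```

Since a linear isomorphism carries a polygon to the polygon spanned by the image vertices, `area_TS` should equal `abs(det_T) * area_S` up to rounding.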
Proof As is shown in linear algebra, the matrix $A$ that represents $T$ is a product of elementary matrices. Each elementary $2 \times 2$ matrix is one of the following types:
$$\begin{pmatrix} \lambda & 0 \\ 0 & 1 \end{pmatrix} \qquad \begin{pmatrix} 1 & 0 \\ 0 & \lambda \end{pmatrix} \qquad \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad \begin{pmatrix} 1 & \sigma \\ 0 & 1 \end{pmatrix}$$
where $\lambda > 0$. The first three matrices represent isomorphisms whose effect on $I^2$ is obvious: $I^2$ is converted to the rectangles $\lambda I \times I$, $I \times \lambda I$, $I^2$. In each case, the area agrees with the magnitude of the determinant. The fourth isomorphism converts $I^2$ to the parallelogram
$$\Pi = \{(x, y) \in \mathbb{R}^2 : \sigma y \le x \le 1 + \sigma y \text{ and } 0 \le y \le 1\}.$$
$\Pi$ is Riemann measurable since its boundary is a zero set. By Fubini's Theorem, we get
$$|\Pi| = \int \chi_\Pi = \int_0^1 \left[ \int_{x = \sigma y}^{x = 1 + \sigma y} 1\,dx \right] dy = 1 = \det E.$$
Exactly the same thinking shows that for any rectangle $R$, not merely the unit square,
$$|E(R)| = |\det E|\,|R|. \tag{10}$$
We claim that (10) implies that for any Riemann measurable set $S$, $E(S)$ is Riemann measurable and
$$|E(S)| = |\det E|\,|S|. \tag{11}$$
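Let $\epsilon > 0$ be given. Choose a grid $G$ on $R \supset S$ with mesh so small that the rectangles $R$ of $G$ satisfy
$$|S| - \epsilon \le \sum_{R \subset S} |R| \le \sum_{R \cap S \ne \emptyset} |R| \le |S| + \epsilon. \tag{12}$$
The interiors of the inner rectangles - those with $R \subset S$ - are disjoint, and therefore for each $z \in \mathbb{R}^2$,
$$\sum_{R \subset S} \chi_{\operatorname{int} R}(z) \le \chi_S(z).$$
The same is true after we apply $E$, namely
$$\sum_{R \subset S} \chi_{\operatorname{int}(E(R))}(z) \le \chi_{E(S)}(z).$$
Linearity and monotonicity of the integral, and Riemann measurability of the sets $E(R)$ imply that
$$\sum_{R \subset S} |E(R)| = \sum_{R \subset S} \int \chi_{\operatorname{int}(E(R))} = \int \sum_{R \subset S} \chi_{\operatorname{int}(E(R))} \le \underline{\int} \chi_{E(S)}. \tag{13}$$
Similarly,
$$\chi_{E(S)}(z) \le \sum_{R \cap S \ne \emptyset} \chi_{E(R)}(z),$$
which implies that
$$\overline{\int} \chi_{E(S)} \le \sum_{R \cap S \ne \emptyset} |E(R)|. \tag{14}$$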
By (10) and (12), (13) and (14) become
$$|\det E|\,(|S| - \epsilon) \le |\det E| \sum_{R \subset S} |R| \le \underline{\int} \chi_{E(S)} \le \overline{\int} \chi_{E(S)} \le |\det E| \sum_{R \cap S \ne \emptyset} |R| \le |\det E|\,(|S| + \epsilon).$$
Since these upper and lower integrals do not depend on $\epsilon$, and $\epsilon$ is arbitrarily small, they equal the common value $|\det E|\,|S|$, which completes the proof of (11).
The determinant of a matrix product is the product of the determinants. Since the matrix of $T$ is the product of elementary matrices, $E_1 \cdots E_k$, (11) implies that if $S$ is Riemann measurable then so is $T(S)$ and
$$|T(S)| = |E_1 \cdots E_k(S)| = |\det E_1| \cdots |\det E_k|\,|S| = |\det T|\,|S|.$$
□
We isolate two more facts in preparation for the proof of the Change of Variables Formula.
34 Lemma Suppose that $\psi : U \to \mathbb{R}^2$ is $C^1$, $0 \in U$, $\psi(0) = 0$, and for all $u \in U$,
$$\|(D\psi)_u - \mathrm{Id}\| \le \epsilon.$$
If $U_r(0) \subset U$ then
$$\psi(U_r(0)) \subset U_{(1+\epsilon)r}(0).$$
Proof By $U_r(p)$ we denote the $r$-neighborhood of $p$ in $U$. The $C^1$ Mean Value Theorem gives
$$\psi(u) = \psi(u) - \psi(0) = \int_0^1 (D\psi)_{tu}\,dt\,(u) = \int_0^1 \big((D\psi)_{tu} - \mathrm{Id}\big)\,dt\,(u) + u.$$
If $|u| \le r$ this implies that $|\psi(u)| \le (1 + \epsilon)r$; i.e., $\psi(U_r(0)) \subset U_{(1+\epsilon)r}(0)$. □
Lemma 34 is valid for any choice of norm on $\mathbb{R}^2$, in particular for the maximum coordinate norm. In that case, the inclusion refers to squares: the square of radius $r$ is carried by $\psi$ inside the square of radius $(1 + \epsilon)r$.
35 Lemma The Lipschitz image of a zero set is a zero set.
Proof Suppose that $Z$ is a zero set and $h : Z \to \mathbb{R}^2$ satisfies a Lipschitz condition
$$|h(z) - h(z')| \le L\,|z - z'|.$$
Given $\epsilon > 0$, there is a countable covering of $Z$ by squares $S_k$ such that
$$\sum_k |S_k| < \epsilon.$$
See Exercise 45. Each set $S_k \cap Z$ has diameter $\le \operatorname{diam} S_k$, and therefore $h(Z \cap S_k)$ has diameter $\le L \operatorname{diam} S_k$. As such, it is contained in a square $S_k'$ of edge length $L \operatorname{diam} S_k$. The squares $S_k'$ cover $h(Z)$ and
$$\sum_k |S_k'| \le L^2 \sum_k (\operatorname{diam} S_k)^2 = 2L^2 \sum_k |S_k| \le 2L^2 \epsilon.$$
Therefore $h(Z)$ is a zero set. □
Proof of the Change of Variables Formula Recall that $\varphi : U \to W$ is a $C^1$ diffeomorphism, $f : W \to \mathbb{R}$ is Riemann integrable, $R$ is a rectangle in $U$, and it is asserted that
$$\int_R f \circ \varphi \cdot |\operatorname{Jac} \varphi| = \int_{\varphi(R)} f. \tag{15}$$
Let $D'$ be the set of discontinuity points of $f$. It is a zero set. Then
$$D = \varphi^{-1}(D')$$
is the set of discontinuity points of $f \circ \varphi$. The $C^1$ Mean Value Theorem implies that $\varphi^{-1}$ is Lipschitz, Lemma 35 implies that $D$ is a zero set, and the Riemann-Lebesgue Theorem implies that $f \circ \varphi$ is Riemann integrable. Since $|\operatorname{Jac} \varphi|$ is continuous, it is Riemann integrable, and so is the product $f \circ \varphi \cdot |\operatorname{Jac} \varphi|$. In short, the l.h.s. of (15) makes sense.
Since $\varphi$ is a diffeomorphism, it is a homeomorphism and it carries the boundary of $R$ to the boundary of $\varphi(R)$. The former boundary is a zero set and, by Lemma 35, so is the latter. Thus $\chi_{\varphi(R)}$ is Riemann integrable. Choose a rectangle $R'$ that contains $\varphi(R)$. Then the r.h.s. of (15) becomes
$$\int_{\varphi(R)} f = \int_{R'} f \cdot \chi_{\varphi(R)},$$
which also makes sense. It remains to show that the two sides of (15) not only make sense, but are equal.
Equip $\mathbb{R}^2$ with the maximum coordinate norm and equip $\mathcal{L}(\mathbb{R}^2, \mathbb{R}^2)$ with the associated operator norm
$$\|T\| = \max\{|T(v)|_{\max} : |v|_{\max} \le 1\}.$$
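Let $\epsilon > 0$ be given. Take any grid $G$ that partitions $R$ into squares $R_{ij}$ of radius $r$. (The smallness of $r$ will be specified below.) Let $z_{ij}$ be the center point of $R_{ij}$ and call
$$A_{ij} = (D\varphi)_{z_{ij}} \qquad \varphi(z_{ij}) = w_{ij} \qquad \varphi(R_{ij}) = W_{ij}.$$
The Taylor approximation to $\varphi$ on $R_{ij}$ is
$$\phi_{ij}(z) = w_{ij} + A_{ij}(z - z_{ij}).$$
The composite $\psi = \phi_{ij}^{-1} \circ \varphi$ sends $z_{ij}$ to itself and its derivative at $z_{ij}$ is the identity transformation. Uniform continuity of $(D\varphi)_z$ on $R$ implies that if $r$ is small enough then for all $z \in R_{ij}$ and for all $ij$, $\|(D\psi)_z - \mathrm{Id}\| < \epsilon$. By Lemma 34,
$$\phi_{ij}^{-1} \circ \varphi(R_{ij}) \subset (1 + \epsilon) R_{ij} \tag{16}$$
where $(1 + \epsilon) R_{ij}$ refers to the $(1 + \epsilon)$-dilation of $R_{ij}$ centered at $z_{ij}$. Similarly, Lemma 34 applies to the composite $\varphi^{-1} \circ \phi_{ij}$ and, taking the radius $r/(1 + \epsilon)$ instead of $r$, we get
$$\varphi^{-1} \circ \phi_{ij}\big((1 + \epsilon)^{-1} R_{ij}\big) \subset R_{ij}. \tag{17}$$
See Figure 117. Then (16) and (17) imply
$$\phi_{ij}\big((1 + \epsilon)^{-1} R_{ij}\big) \subset \varphi(R_{ij}) = W_{ij} \subset \phi_{ij}\big((1 + \epsilon) R_{ij}\big).$$
Figure 117 How we magnify the picture and sandwich a nonlinear parallelogram between two linear ones.
By Proposition 33, this gives the area estimate
$$\frac{J_{ij} |R_{ij}|}{(1 + \epsilon)^2} \le |W_{ij}| \le (1 + \epsilon)^2 J_{ij} |R_{ij}|,$$
where $J_{ij} = |\operatorname{Jac}_{z_{ij}} \varphi|$. Equivalently,
$$\frac{1}{(1 + \epsilon)^2} \le \frac{|W_{ij}|}{J_{ij} |R_{ij}|} \le (1 + \epsilon)^2. \tag{18}$$
An estimate of the form
$$\frac{1}{(1 + \epsilon)^2} \le \frac{a}{b} \le (1 + \epsilon)^2,$$
with $0 \le \epsilon \le 1$ and $a, b > 0$, implies that
$$|a - b| \le 16 \epsilon b,$$
as you are left to check in Exercise 40. Thus (18) implies
$$\big|\,|W_{ij}| - J_{ij} |R_{ij}|\,\big| \le 16 \epsilon J |R_{ij}|, \tag{19}$$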
where $J = \sup\{|\operatorname{Jac}_z \varphi| : z \in R\}$. Let $m_{ij}$ and $M_{ij}$ be the infimum and supremum of $f \circ \varphi$ on $R_{ij}$. Then, for all $w \in \varphi(R)$,
$$\sum m_{ij}\,\chi_{\operatorname{int} W_{ij}}(w) \le f(w) \le \sum M_{ij}\,\chi_{W_{ij}}(w),$$
which integrates to
$$\sum m_{ij} |W_{ij}| \le \int_{\varphi(R)} f \le \sum M_{ij} |W_{ij}|.$$
According to (19), replacing $|W_{ij}|$ by $J_{ij} |R_{ij}|$ causes an error of no more than $16 \epsilon J |R_{ij}|$. Thus
$$\sum m_{ij} J_{ij} |R_{ij}| - 16 \epsilon M J |R| \le \int_{\varphi(R)} f \le \sum M_{ij} J_{ij} |R_{ij}| + 16 \epsilon M J |R|,$$
where $M = \sup |f|$. These are lower and upper sums for the integrable function $f \circ \varphi \cdot |\operatorname{Jac} \varphi|$. Thus
$$\int_R f \circ \varphi \cdot |\operatorname{Jac} \varphi| - 16 \epsilon M J |R| \le \int_{\varphi(R)} f \le \int_R f \circ \varphi \cdot |\operatorname{Jac} \varphi| + 16 \epsilon M J |R|.$$
Since $\epsilon$ is arbitrarily small, the proof is complete. □
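A numerical sanity check of formula (15) can be made with polar coordinates. The sketch below is an illustration under assumptions chosen here: $\varphi(r, \theta) = (r \cos\theta, r \sin\theta)$ on the rectangle $R = [1, 2] \times [0, \pi/2]$, where $|\operatorname{Jac} \varphi| = r$, and $f(x, y) = x^2 + y^2$. The left side is a midpoint sum over $R$; the right side is a midpoint sum of $f \cdot \chi_{\varphi(R)}$ over the enclosing square $R' = [0, 2] \times [0, 2]$.

```python
import math


def f(x, y):
    # Sample integrand, an assumption made for this sketch.
    return x * x + y * y


def lhs_polar(n=200):
    """Midpoint sum of (f o phi) * |Jac phi| over R = [1, 2] x [0, pi/2]."""
    hr = (2.0 - 1.0) / n
    ht = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        r = 1.0 + (i + 0.5) * hr
        for j in range(n):
            t = (j + 0.5) * ht
            x, y = r * math.cos(t), r * math.sin(t)
            total += f(x, y) * r  # |Jac phi| = r for polar coordinates
    return total * hr * ht


def rhs_region(n=400):
    """Midpoint sum of f * chi_{phi(R)} over R' = [0, 2] x [0, 2]."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            # phi(R) is the quarter annulus 1 <= x^2 + y^2 <= 4 in the first quadrant.
            if 1.0 <= x * x + y * y <= 4.0:
                total += f(x, y)
    return total * h * h


lhs = lhs_polar()
rhs = rhs_region()
exact = 15.0 * math.pi / 8.0  # integral of r^3 over [1, 2] times pi/2
```

The left side converges quickly because its integrand is smooth on $R$; the right side converges more slowly because $\chi_{\varphi(R)}$ is discontinuous along the boundary of $\varphi(R)$, which is exactly the zero set that the proof has to control.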
Finally, here is a sketch of the $n$-dimensional theory. Instead of a two-dimensional rectangle we have a box
$$R = [a_1, b_1] \times \cdots \times [a_n, b_n].$$
Riemann sums of a function $f : R \to \mathbb{R}$ are defined as before: take a grid $G$ of small boxes $R_\ell$ in $R$, take a sample point $s_\ell$ in each, and set
$$R(f, G, S) = \sum f(s_\ell)\,|R_\ell|$$
where $|R_\ell|$ is the product of the edge lengths of the small box $R_\ell$ and $S$ is the set of sample points. If the Riemann sums converge to a limit it is the integral. The general theory, including the Riemann-Lebesgue Theorem, is the same as in dimension two. Fubini's Theorem is proved by induction on $n$, and has the same meaning: integration on a box can be done slice by slice, and the order in which the iterated integration is performed has no effect on the answer. The Change of Variables Formula has the same statement, only now the Jacobian is the determinant of an $n \times n$ matrix. In place of area we have volume, the $n$-dimensional volume of a set $S \subset \mathbb{R}^n$ being the integral of its characteristic function. The volume-multiplier formula, Proposition 33, has essentially the same proof, but the elementary matrix notation is messier. (It helps to realize that the following types of elementary row operations suffice for row reduction: transposition of two adjacent rows, multiplication of the first row by $\lambda$, and addition of the second row to the first.) The proof of the Change of Variables Formula itself differs only in that 16 becomes $4^n$.
9 Differential Forms
The Riemann integral notation
$$\int_a^b f(x)\,dx \approx \sum_{i=1}^n f(t_i)\,\Delta x_i$$
may lead one to imagine the integral as an "infinite sum of infinitely small quantities $f(x)\,dx$." Although this idea itself seems to lead nowhere, it points to a good question - how do you give an independent meaning to the symbol $f\,dx$? The answer: differential forms. Not only does the theory of differential forms supply coherent, independent meanings for $dx$, $f\,dx$, $dy$, $f\,dy$, $dx\,dy$, and even for $d$ and $x$ separately, it also unifies vector calculus results. A single result, the General Stokes Formula for differential forms
$$\int_M d\omega = \int_{\partial M} \omega,$$
encapsulates all integral theorems about divergence, gradient, and curl.
The presentation of differential forms in this section appears in the natural generality of $n$ dimensions, and as a consequence it is unavoidably fraught with complicated index notation - armies of $i$'s, $j$'s, double subscripts, multi-indices, and so on. Your endurance may be tried.
First, consider a function $y = F(x)$. Normally, you think of $F$ as the function, $x$ as the input variable, and $y$ as the output variable. But you can also take a dual approach and think of $x$ as the function, $F$ as the input variable, and $y$ as the output variable. After all, why not? It's a kind of mathematical yin/yang.
Now consider a path integral the way it is defined in calculus,
$$\int_C f\,dx + g\,dy = \int_0^1 f(x(t), y(t)) \frac{dx(t)}{dt}\,dt + \int_0^1 g(x(t), y(t)) \frac{dy(t)}{dt}\,dt.$$
Here $f$ and $g$ are smooth real-valued functions of $(x, y)$ and $C$ is a smooth path parameterized by $(x(t), y(t))$ as $t$ varies on $[0, 1]$. Normally, you think of the integral as a number that depends on the functions $f$ and $g$. Taking the dual approach, you can think of it as a number that depends on the path $C$. This will be our point of view. It parallels that found in Rudin's Principles of Mathematical Analysis.
Definition A differential 1-form is a function that sends paths to real numbers and which can be expressed as a path integral in the previous notation. The name of this particular differential 1-form is $f\,dx + g\,dy$.
In a way, this definition begs the question. For it simply says that the standard calculus formula for path integrals should be read in a new way - as a function of the integration domain. Doing so, however, is illuminating, for it leads you to ask: just what property of $C$ does the differential 1-form $f\,dx + g\,dy$ measure?
First take the case that $f(x, y) = 1$ and $g(x, y) = 0$. Then the path integral is
$$\int_C dx = \int_a^b \frac{dx(t)}{dt}\,dt = x(b) - x(a)$$
which is the net $x$-variation of the path $C$. This can be written in functional notation as
$$dx : C \mapsto x(b) - x(a).$$
It means that $dx$ assigns to each path $C$ its net $x$-variation. Similarly $dy$ assigns to each path its net $y$-variation. The word "net" is important. Negative $x$-variation cancels positive $x$-variation, and negative $y$-variation cancels positive $y$-variation. In the world of forms, orientation matters.
What about $f\,dx$? The function $f$ "weights" $x$-variation. If the path $C$ passes through a region in which $f$ is large, its $x$-variation is magnified accordingly, and the integral $\int_C f\,dx$ reflects the net $f$-weighted $x$-variation of $C$. In functional notation
$$f\,dx : C \mapsto \text{net } f\text{-weighted } x\text{-variation of } C.$$
Similarly, $g\,dy$ assigns to a path its net $g$-weighted $y$-variation, and the 1-form $f\,dx + g\,dy$ assigns to $C$ the sum of the two variations.
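To make the "form as a functional on paths" viewpoint concrete, here is a small Python sketch that evaluates a 1-form $f\,dx + g\,dy$ on a path by summing over short chords. The function name `integrate_one_form`, the half-circle path `C`, and the sample form $-y\,dx + x\,dy$ are illustrative assumptions, not notation from the text.

```python
import math


def integrate_one_form(f, g, path, a=0.0, b=1.0, n=20000):
    """Approximate the value of the 1-form f dx + g dy on a path.

    The path is a function t -> (x(t), y(t)) on [a, b]; each short chord
    contributes f * (change in x) + g * (change in y), with f and g
    sampled at the chord midpoint.
    """
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x0, y0 = path(a + i * h)
        x1, y1 = path(a + (i + 1) * h)
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
        total += f(xm, ym) * (x1 - x0) + g(xm, ym) * (y1 - y0)
    return total


def C(t):
    """Upper unit half-circle traversed from (1, 0) to (-1, 0)."""
    return (math.cos(math.pi * t), math.sin(math.pi * t))


def C_reversed(t):
    """The same point set traversed in the opposite direction."""
    return C(1.0 - t)


# dx measures net x-variation: x(1) - x(0) = -1 - 1.
dx_on_C = integrate_one_form(lambda x, y: 1.0, lambda x, y: 0.0, C)

# A sample 1-form, -y dx + x dy, evaluated on C and on its reversal.
w_on_C = integrate_one_form(lambda x, y: -y, lambda x, y: x, C)
w_on_C_reversed = integrate_one_form(lambda x, y: -y, lambda x, y: x, C_reversed)
```

The value of $dx$ on `C` telescopes to the net $x$-variation, and reversing the direction of travel flips the sign of the sample form's value.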
Terminology A functional on a space $X$ is a function from $X$ to $\mathbb{R}$.
Differential 1-forms are functionals on the space of paths. Some functionals on the space of paths are differential forms but others are not. For instance, assigning to each path its arc-length is a functional that is not a form. For if $C$ is a path parameterized by $(x(t), y(t))$, then $(x^*(t), y^*(t)) = (x(a + b - t), y(a + b - t))$ parameterizes $C$ in the reverse direction. Arc-length is unaffected, but the value of any 1-form on the path changes sign. Hence, arc-length is not a 1-form. A more trivial example is the functional that assigns to each path the number 1. It too fails to have the right symmetry property under parameter reversal, and is not a form.
The definition of $k$-forms for $k \ge 2$ requires Jacobian determinants. To simplify notation we write $\partial F_I / \partial x_J = \partial(F_{i_1}, \ldots, F_{i_k}) / \partial(x_{j_1}, \ldots, x_{j_k})$ where $I = (i_1, \ldots, i_k)$, $J = (j_1, \ldots, j_k)$ are $k$-tuples of integers, and $F : \mathbb{R}^n \to \mathbb{R}^m$ is smooth. Thus,
$$\frac{\partial F_I}{\partial x_J} = \det \begin{pmatrix} \dfrac{\partial F_{i_1}}{\partial x_{j_1}} & \cdots & \dfrac{\partial F_{i_1}}{\partial x_{j_k}} \\ \vdots & & \vdots \\ \dfrac{\partial F_{i_k}}{\partial x_{j_1}} & \cdots & \dfrac{\partial F_{i_k}}{\partial x_{j_k}} \end{pmatrix}.$$
If $k = 1$, $I = (i)$, and $J = (j)$ then $\partial F_I / \partial x_J$ is just $\partial F_i / \partial x_j$.
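Definition A $k$-cell in $\mathbb{R}^n$ is a smooth map $\varphi : I^k \to \mathbb{R}^n$ where $I^k$ is the $k$-cube. If $I = (i_1, \ldots, i_k)$ is a $k$-tuple in $\{1, \ldots, n\}$ then $dx_I$ is the functional that assigns to each $k$-cell $\varphi$ its $x_I$-area
$$dx_I : \varphi \mapsto \int_{I^k} \frac{\partial \varphi_I}{\partial u}\,du$$
where this integral notation is shorthand for
$$\int_0^1 \cdots \int_0^1 \frac{\partial(\varphi_{i_1}, \ldots, \varphi_{i_k})}{\partial(u_1, \ldots, u_k)}\,du_1 \cdots du_k.$$
If $f$ is a smooth function on $\mathbb{R}^n$ then $f\,dx_I$ is the functional
$$f\,dx_I : \varphi \mapsto \int_{I^k} f(\varphi(u)) \frac{\partial \varphi_I}{\partial u}\,du.$$
The function $f$ weights $x_I$-area. The functional $dx_I$ is a basic $k$-form and $f\,dx_I$ is a simple $k$-form, while a sum of simple $k$-forms is a (general) $k$-form
$$\omega = \sum_I f_I\,dx_I : \varphi \mapsto \sum_I (f_I\,dx_I)(\varphi).$$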
The careful reader will detect some abuse of notation. Here $I$ is used to index a collection of scalar coefficient functions $\{f_I\}$, whereas $I$ is also used to reduce an $m$-vector $(F_1, \ldots, F_m)$ to a $k$-vector $F_I = (F_{i_1}, \ldots, F_{i_k})$. Besides this, $I$ is the unit interval. Please persevere.
To underline the fact that a form is an integral, we write
$$\omega(\varphi) = \int_\varphi \omega.$$
Notation $C_k(\mathbb{R}^n)$ is the set of all $k$-cells in $\mathbb{R}^n$, $C^k(\mathbb{R}^n)$ is the set of all functionals on $C_k(\mathbb{R}^n)$, and $\Omega^k(\mathbb{R}^n)$ is the set of $k$-forms on $\mathbb{R}^n$.
Because a determinant changes sign under a row transposition, $k$-forms satisfy the signed commutativity property: if $\pi$ permutes $I$ to $\pi I$ then
$$dx_{\pi I} = \operatorname{sgn}(\pi)\,dx_I.$$
In particular, $dx_{(1,2)} = -dx_{(2,1)}$. Because a determinant is zero if it has a repeated row, $dx_I = 0$ if $I$ has a repeated entry. In particular $dx_{(1,1)} = 0$.
In terms of line integrals in the plane, paths are 1-cells, $x_{(1)} = x$, $x_{(2)} = y$, $x_{(1)}$-area is net $x$-variation, and $x_{(2)}$-area is net $y$-variation. Similarly, parameterized surface integrals (as discussed in sophomore calculus) correspond to integrals of 2-forms over 2-cells. The $x_{(1,2)}$-area of a surface is the net area of its projection to the $xy$-plane. The equation $dx_{(1,2)} = -dx_{(2,1)}$ signifies that $xy$-area is the negative of $yx$-area.
Form Naturality
It is a common error to confuse a cell, which is a smooth mapping, with its image, which is a point set - but the error is fairly harmless.
36 Theorem Integrating a $k$-form over $k$-cells that differ by a reparameterization produces the same answer up to a factor of $\pm 1$, and this factor of $\pm 1$ is determined by whether the reparameterization preserves or reverses orientation.
Proof If $T$ is an orientation preserving diffeomorphism of $I^k$ to itself and $\omega = f\,dx_I$ then the Jacobian $\partial T / \partial u$ is positive. The product determinant formula and the change of variables formula for multiple integrals give
$$\int_{\varphi \circ T} \omega = \int_{I^k} f(\varphi \circ T(u)) \frac{\partial(\varphi \circ T)_I}{\partial u}\,du = \int_{I^k} f(\varphi \circ T(u)) \left( \frac{\partial \varphi_I}{\partial v} \right)_{v = T(u)} \frac{\partial T}{\partial u}\,du = \int_{I^k} f(\varphi(v)) \frac{\partial \varphi_I}{\partial v}\,dv = \int_\varphi \omega.$$
Taking sums shows that the equation $\int_{\varphi \circ T} \omega = \int_\varphi \omega$ continues to hold for all $\omega \in \Omega^k$. If $T$ reverses orientation its Jacobian is negative. In the change of variables formula appears the absolute value of the Jacobian, which causes $\int_{\varphi \circ T} \omega$ to change sign. □
A particular case of the previous theorem concerns line integrals in the plane. The integral of a 1-form over a curve $C$ does not depend on how $C$ is parameterized. If we first parameterize $C$ using a parameter $t \in [0, 1]$ and then reparameterize it by arc-length $s \in [0, L]$ where $L$ is the length of $C$ and the orientation of $C$ remains the same, then integrals of 1-forms are unaffected,
$$\int_0^1 f(x(t), y(t)) \frac{dx(t)}{dt}\,dt = \int_0^L f(x(s), y(s)) \frac{dx(s)}{ds}\,ds$$
$$\int_0^1 g(x(t), y(t)) \frac{dy(t)}{dt}\,dt = \int_0^L g(x(s), y(s)) \frac{dy(s)}{ds}\,ds.$$
Form Names
A $k$-tuple $I = (i_1, \ldots, i_k)$ ascends if $i_1 < \cdots < i_k$.
37 Proposition Each $k$-form $\omega$ has a unique expression as a sum of simple $k$-forms with ascending $k$-tuple indices,
$$\omega = \sum f_A\,dx_A.$$
Moreover, the coefficient $f_A(x)$ in this ascending presentation (or "name") of $\omega$ is determined by the value of $\omega$ on small $k$-cells at $x$.
Proof Using the signed commutativity property of forms, we regroup and combine a sum of simple forms into terms in which the indices ascend. This gives the existence of an ascending presentation $\omega = \sum f_A\,dx_A$.
Fix an ascending $k$-tuple $A$ and fix a point $p \in \mathbb{R}^n$. For $r > 0$ consider the inclusion cell,
$$\iota = \iota_{r,p} : u \mapsto p + rL(u)$$
where $L$ is the linear inclusion map that sends $\mathbb{R}^k$ to the $x_A$-plane. $\iota$ sends $I^k$ to a cube in the $x_A$-plane at $p$. As $r \to 0$, the cube shrinks to $p$. If $I$ ascends, the Jacobians of $\iota$ are
$$\frac{\partial \iota_I}{\partial u} = \begin{cases} r^k & \text{if } I = A \\ 0 & \text{if } I \ne A. \end{cases}$$
Thus, if $I \ne A$ then $f_I\,dx_I(\iota) = 0$ and
$$\omega(\iota) = f_A\,dx_A(\iota) = r^k \int_{I^k} f_A(\iota(u))\,du.$$
Continuity of $f_A$ implies that
$$f_A(p) = \lim_{r \to 0} \frac{1}{r^k}\,\omega(\iota_{r,p}), \tag{20}$$
which is how the value of $\omega$ on small $k$-cells at $p$ determines the coefficient $f_A(p)$. □
38 Corollary If $k > n$ then $\Omega^k(\mathbb{R}^n) = 0$.
Proof There are no ascending $k$-tuples of integers in $\{1, \ldots, n\}$. □
Moral A form may have many names, but it has a unique ascending name. Therefore if definitions or properties of a form are to be discussed in terms of a form's name, the use of ascending names avoids ambiguity.
Wedge Products
Let $\alpha$ be a $k$-form and $\beta$ be an $\ell$-form. Write them in their ascending presentations, $\alpha = \sum_I a_I\,dx_I$ and $\beta = \sum_J b_J\,dx_J$. Their wedge product is the $(k + \ell)$-form
$$\alpha \wedge \beta = \sum_{I,J} a_I b_J\,dx_{IJ}$$
where $I = (i_1, \ldots, i_k)$, $J = (j_1, \ldots, j_\ell)$, $IJ = (i_1, \ldots, i_k, j_1, \ldots, j_\ell)$, and the sum is taken over all ascending $I$, $J$. The use of ascending presentations avoids name ambiguity, although Theorem 39 makes the ambiguity moot. A particular case of the definition is
$$dx_1 \wedge dx_2 = dx_{(1,2)}.$$
Section 9
319
The wedge product 1\ Qk x Qf ---+ Qk +f satisfies four natural conditions: (a) distributivity: (a + fJ) 1\ Y = a 1\ Y + fJ 1\ Y and Y 1\ (a + {J) = y 1\ a + y 1\ fJ. (b) insensitivity to presentations: a 1\ fJ = Lu aib J dX IJ for general presentations a = L a1dx1 and fJ = L b1dx 1. (c) associativity: a 1\ (fJ 1\ Y) = (a 1\ fJ) 1\ Y. (d) signed commutativity: fJ 1\ a = ( - l ) k£ a 1\ fJ, when a is a k-form and fJ is an l-form. In particular, dx 1\ dy = -dy 1\ dx.
39 Theorem
40 Lemma
:
The wedge product of basic forms satisfies
Proof #1 See Exercise 54. Proof #2 If I and J ascend then the lemma merely repeats the definition of the wedge product. Otherwise, let JT and p be permutations that make JT I and p l non-descending. Call the permutation of I J that is n on the first k terms and p on the last l. The sign of is sgn (n ) sgn(p), and
a
a
dxi i\dXJ = sgn (n ) sgn (p) dx I 1\dXpJ :n:
= sgn ( ) dxa ( IJ )
a
= dxiJ . D
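Proof of Theorem 39 (a) To check distributivity, suppose that $\alpha = \sum a_I\,dx_I$ and $\beta = \sum b_I\,dx_I$ are $k$-forms, while $\gamma = \sum c_J\,dx_J$ is an $\ell$-form, and all sums are ascending presentations. Then
$$\sum_I (a_I + b_I)\,dx_I$$
is the ascending presentation of $\alpha + \beta$ (this is the only trick in the proof), and
$$(\alpha + \beta) \wedge \gamma = \sum_{I,J} (a_I + b_I) c_J\,dx_{IJ} = \sum_{I,J} a_I c_J\,dx_{IJ} + \sum_{I,J} b_I c_J\,dx_{IJ},$$
which is $\alpha \wedge \gamma + \beta \wedge \gamma$, and verifies distributivity on the left. Distributivity on the right is checked in a similar way.
(b) Let $\sum a_I\,dx_I$ and $\sum b_J\,dx_J$ be general, non-ascending presentations of $\alpha$ and $\beta$. By distributivity and Lemma 40,
$$\Big( \sum_I a_I\,dx_I \Big) \wedge \Big( \sum_J b_J\,dx_J \Big) = \sum_{I,J} a_I b_J\,dx_I \wedge dx_J = \sum_{I,J} a_I b_J\,dx_{IJ}.$$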
(c) By (b), to check associativity we need not use ascending presentations. Thus, if $\alpha = \sum a_I\,dx_I$, $\beta = \sum b_J\,dx_J$, and $\gamma = \sum c_K\,dx_K$, then
$$\alpha \wedge (\beta \wedge \gamma) = \Big( \sum_I a_I\,dx_I \Big) \wedge \Big( \sum_{J,K} b_J c_K\,dx_{JK} \Big) = \sum_{I,J,K} a_I b_J c_K\,dx_{IJK},$$
which equals $(\alpha \wedge \beta) \wedge \gamma$.
(d) Associativity implies that it makes sense to write $dx_I$ and $dx_J$ as products $dx_{i_1} \wedge \cdots \wedge dx_{i_k}$ and $dx_{j_1} \wedge \cdots \wedge dx_{j_\ell}$. Thus,
$$dx_I \wedge dx_J = dx_{i_1} \wedge \cdots \wedge dx_{i_k} \wedge dx_{j_1} \wedge \cdots \wedge dx_{j_\ell}.$$
It takes $k\ell$ pair-transpositions to push each $dx_i$ past each $dx_j$, which implies
$$dx_J \wedge dx_I = (-1)^{k\ell}\,dx_I \wedge dx_J.$$
Distributivity completes the proof of signed commutativity for the general $\alpha$ and $\beta$. □
The Exterior Derivative
Differentiating a form is subtle. The idea, as with all derivatives, is to imagine how the form changes under small variations of the point at which it is evaluated. A 0-form is a smooth function $f(x)$. Its exterior derivative is by definition the functional on paths $\varphi : [0, 1] \to \mathbb{R}^n$,
$$df : \varphi \mapsto f(\varphi(1)) - f(\varphi(0)).$$
41 Proposition $df$ is a 1-form and when $n = 2$, it is expressed as
$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy.$$
In particular, $d(x) = dx$.
Proof When no abuse of notation occurs we use calculus shorthand and write $f_x = \partial f / \partial x$, $f_y = \partial f / \partial y$. Applied to $\varphi$, the form $\omega = f_x\,dx + f_y\,dy$ produces the number
$$\omega(\varphi) = \int_0^1 \left( f_x(\varphi(t)) \frac{dx(t)}{dt} + f_y(\varphi(t)) \frac{dy(t)}{dt} \right) dt.$$
By the Chain Rule, the integrand is the derivative of $f \circ \varphi(t)$, so the Fundamental Theorem of Calculus implies that $\omega(\varphi) = f(\varphi(1)) - f(\varphi(0))$. Therefore, $df = \omega$ as claimed. □
Definition Fix $k \ge 1$. Let $\sum f_I\,dx_I$ be the ascending presentation of a $k$-form $\omega$. The exterior derivative of $\omega$ is the $(k + 1)$-form
$$d\omega = \sum_I df_I \wedge dx_I.$$
The sum is taken over all ascending $k$-tuples $I$. Use of the ascending presentation makes the definition unambiguous, although Theorem 42 makes this moot. Since $df_I$ is a 1-form and $dx_I$ is a $k$-form, $d\omega$ is indeed a $(k + 1)$-form. For example we get
$$d(f\,dx + g\,dy) = (g_x - f_y)\,dx \wedge dy.$$
42 Theorem Exterior differentiation $d : \Omega^k \to \Omega^{k+1}$ satisfies four natural conditions.
(a) It is linear: $d(\alpha + c\beta) = d\alpha + c\,d\beta$.
(b) It is insensitive to presentation: if $\sum f_I\,dx_I$ is a general presentation of $\omega$ then $d\omega = \sum df_I \wedge dx_I$.
(c) It obeys a product rule: if $\alpha$ is a $k$-form and $\beta$ is an $\ell$-form then
$$d(\alpha \wedge \beta) = d\alpha \wedge \beta + (-1)^k\,\alpha \wedge d\beta.$$
(d) $d^2 = 0$. That is, $d(d\omega) = 0$ for all $\omega \in \Omega^k$.
Proof (a) Linearity is easy and is left for the reader as Exercise 55.
(b) Let $\pi$ make $\pi I$ ascending. Linearity of $d$ and associativity of $\wedge$ give
$$d(f_I\,dx_I) = \operatorname{sgn}(\pi)\,d(f_I\,dx_{\pi I}) = \operatorname{sgn}(\pi)\,d(f_I) \wedge dx_{\pi I} = d(f_I) \wedge dx_I.$$
Linearity of $d$ promotes the result from simple forms to general ones.
(c) The ordinary Leibniz product rule for differentiating functions of two variables gives
$$d(fg) = \frac{\partial (fg)}{\partial x}\,dx + \frac{\partial (fg)}{\partial y}\,dy = f_x g\,dx + f_y g\,dy + f g_x\,dx + f g_y\,dy$$
which is $g\,df + f\,dg$, and verifies (c) for 0-forms in $\mathbb{R}^2$. The higher-dimensional case is similar. Next we consider simple forms $\alpha = f\,dx_I$ and $\beta = g\,dx_J$. Then
$$d(\alpha \wedge \beta) = d(fg\,dx_{IJ}) = (g\,df + f\,dg) \wedge dx_{IJ} = (df \wedge dx_I) \wedge (g\,dx_J) + (-1)^k (f\,dx_I) \wedge (dg \wedge dx_J) = d\alpha \wedge \beta + (-1)^k\,\alpha \wedge d\beta.$$
Distributivity completes the proof for general $\alpha$ and $\beta$.
The proof of (d) is fun. We check it first for the special 0-form $x$. By Proposition 41, the exterior derivative of $x$ is $dx$ and in turn the exterior derivative of $dx$ is zero. For $dx = 1\,dx$, $d1 = 0$, and by definition, $d(1\,dx) = d(1) \wedge dx = 0$. For the same reason, $d(dx_I) = 0$. Next we consider a smooth function $f : \mathbb{R}^2 \to \mathbb{R}$ and prove that $d^2 f = 0$. Since $d^2 x = d^2 y = 0$ we have
$$d^2 f = d(f_x\,dx + f_y\,dy) = d(f_x) \wedge dx + d(f_y) \wedge dy = (f_{xx}\,dx + f_{xy}\,dy) \wedge dx + (f_{yx}\,dx + f_{yy}\,dy) \wedge dy = 0.$$
Here $dx \wedge dx = dy \wedge dy = 0$, and the two remaining terms cancel because $f_{xy} = f_{yx}$ and $dy \wedge dx = -dx \wedge dy$. The fact that $d^2 = 0$ for functions easily gives the same result for forms. The higher-dimensional case is similar. □
Pushforward and Pullback
According to Theorem 36, forms behave naturally under composition on the right. What about composition on the left? Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a smooth transformation. It induces a natural transformation $T_* : C_k(\mathbb{R}^n) \to C_k(\mathbb{R}^m)$, the pushforward of $T$, defined by
$$T_* : \varphi \mapsto T \circ \varphi.$$
Dual to the pushforward is the pullback $T^* : C^k(\mathbb{R}^m) \to C^k(\mathbb{R}^n)$ defined by
$$T^* : Y \mapsto Y \circ T_*.$$
Thus, the pullback of $Y \in C^k(\mathbb{R}^m)$ is the functional on $C_k(\mathbb{R}^n)$
$$T^* Y : \varphi \mapsto Y(T \circ \varphi).$$
The pushforward $T_*$ goes the same direction as $T$, from $\mathbb{R}^n$ to $\mathbb{R}^m$, while the pullback $T^*$ goes the opposite way. The pushforward/pullback duality is summarized by the formula
$$(T^* Y)(\varphi) = Y(T_* \varphi).$$
43 Theorem Pullbacks obey the following four natural conditions.
(a) The pullback transformation is linear and $(S \circ T)^* = T^* \circ S^*$.
(b) The pullback of a form is a form; in particular, $T^*(dy_I) = dT_I$ and $T^*(f\,dy_I) = T^*f\,dT_I$, where $dT_I = dT_{i_1} \wedge \cdots \wedge dT_{i_k}$.
(c) The pullback preserves wedge products, $T^*(\alpha \wedge \beta) = T^*\alpha \wedge T^*\beta$.
(d) The pullback commutes with the exterior derivative, $dT^* = T^* d$.
Proof (a) This is left as Exercise 56.
(b) We rely on a nontrivial result in linear algebra, the Cauchy-Binet Formula, which concerns the determinant of a product matrix $AB = C$, where $A$ is $k \times n$ and $B$ is $n \times k$. See Appendix E. In terms of Jacobians, the Cauchy-Binet Formula asserts that if the maps $\varphi : \mathbb{R}^k \to \mathbb{R}^n$ and $\psi : \mathbb{R}^n \to \mathbb{R}^k$ are smooth, then the composite $\phi = \psi \circ \varphi : \mathbb{R}^k \to \mathbb{R}^k$ satisfies
$$\frac{\partial \phi}{\partial u} = \sum_J \frac{\partial \psi}{\partial x_J} \frac{\partial \varphi_J}{\partial u}$$
where the Jacobian o 1/ff o x 1 i s evaluated at x = cp ( u ) , and J ranges through all ascending k-tuples in { 1 , . . . , n } . Then the pullback of a simple k-form o n :!Rm i s the functional on Ck (JR"),
T * (fdy1) : cp
�---+
=
= which implies that
f f (T u u 1 du 0 T1 L f f(T cp (u)) ( OX] ) jdy1 ( T o cp) o cp (
Jk
J
Jk
))
a (T o cp ) a
o cp J du , x=q;(u) a u
o
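is a $k$-form. Linearity of the pullback promotes this to general forms: the pullback of a form is a form. It remains to check that $T^*(dy_I) = dT_I$. For $I = (i_1, \dots, i_k)$, distributivity of the wedge product and the definition of the exterior derivative of a function imply that

$$dT_I = dT_{i_1} \wedge \cdots \wedge dT_{i_k} = \Bigl(\sum_{s_1=1}^n \frac{\partial T_{i_1}}{\partial x_{s_1}}\,dx_{s_1}\Bigr) \wedge \cdots \wedge \Bigl(\sum_{s_k=1}^n \frac{\partial T_{i_k}}{\partial x_{s_k}}\,dx_{s_k}\Bigr) = \sum_{s_1,\dots,s_k=1}^n \frac{\partial T_{i_1}}{\partial x_{s_1}} \cdots \frac{\partial T_{i_k}}{\partial x_{s_k}}\,dx_{s_1} \wedge \cdots \wedge dx_{s_k}.$$

The indices $i_1, \dots, i_k$ are fixed. All terms with repeated dummy indices $s_1, \dots, s_k$ are zero, so the sum is really taken as $(s_1, \dots, s_k)$ varies in the set of $k$-tuples with no repeated entry, and then we know that $(s_1, \dots, s_k)$ can be expressed uniquely as $(s_1, \dots, s_k) = \pi J$ for an ascending $J = (j_1, \dots, j_k)$ and a permutation $\pi$. Also, $dx_{s_1} \wedge \cdots \wedge dx_{s_k} = \operatorname{sgn}(\pi)\,dx_J$. This gives

$$dT_I = \sum_J \Bigl(\sum_\pi \operatorname{sgn}(\pi)\,\frac{\partial T_{i_1}}{\partial x_{\pi(j_1)}} \cdots \frac{\partial T_{i_k}}{\partial x_{\pi(j_k)}}\Bigr)\,dx_J = \sum_J \frac{\partial T_I}{\partial x_J}\,dx_J$$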
and hence $T^*(dy_I) = dT_I$. Here we used the description of the determinant from Appendix E.
(c) For 0-forms it is clear that the pullback of a product is the product of the pullbacks, $T^*(fg) = T^*f\,T^*g$. If the forms $\alpha = f\,dy_I$ and $\beta = g\,dy_J$ are simple then $\alpha \wedge \beta = fg\,dy_{IJ}$ and by (b),

$$T^*(\alpha \wedge \beta) = T^*(fg)\,dT_{IJ} = T^*f\,T^*g\,dT_I \wedge dT_J = T^*\alpha \wedge T^*\beta.$$
Wedge distributivity and pullback linearity complete the proof of (c).
(d) If $\omega$ is a form of degree 0, $\omega = f \in \Omega^0(\mathbb{R}^m)$, then

$$T^*(df) = T^*\Bigl(\sum_{i=1}^m \frac{\partial f}{\partial y_i}\,dy_i\Bigr) = \sum_{i=1}^m T^*\Bigl(\frac{\partial f}{\partial y_i}\Bigr)\,T^*(dy_i),$$

which is merely the chain rule expression for $d(f \circ T) = d(T^*f)$. Thus, $T^*d\omega = dT^*\omega$ for 0-forms.

Next consider a simple $k$-form $\omega = f\,dy_I$ with $k \ge 1$. Using (b), the degree-zero case, and the wedge differentiation formula, we get

$$d(T^*\omega) = d(T^*f\,dT_I) = d(T^*f) \wedge dT_I + (-1)^0\,T^*f \wedge d(dT_I) = T^*(df) \wedge dT_I = T^*(df \wedge dy_I) = T^*(d\omega).$$

Linearity promotes this to general $k$-forms and completes the proof of (d).
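As a concrete check of part (b), one can take $n = m = k = 2$ and the polar-coordinate map $T(r, \theta) = (r\cos\theta, r\sin\theta)$. This map is an illustrative choice, not an example from the text. Part (b) then predicts $T^*(dy_1 \wedge dy_2) = dT_1 \wedge dT_2 = r\,dr \wedge d\theta$. The following minimal sketch estimates the Jacobian determinant by central differences; the function names and the step size are assumptions made for the illustration.

```python
import math

def T(r, theta):
    """Polar-coordinate map (an illustrative choice, not from the text)."""
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian_det(F, x1, x2, h=1e-6):
    """Central-difference estimate of det DF at (x1, x2) for F: R^2 -> R^2."""
    f_plus, f_minus = F(x1 + h, x2), F(x1 - h, x2)
    g_plus, g_minus = F(x1, x2 + h), F(x1, x2 - h)
    d11 = (f_plus[0] - f_minus[0]) / (2 * h)   # dF1/dx1
    d21 = (f_plus[1] - f_minus[1]) / (2 * h)   # dF2/dx1
    d12 = (g_plus[0] - g_minus[0]) / (2 * h)   # dF1/dx2
    d22 = (g_plus[1] - g_minus[1]) / (2 * h)   # dF2/dx2
    return d11 * d22 - d12 * d21

def pullback_coefficient(r, theta):
    """Coefficient c in T^*(dy1 ^ dy2) = c dr ^ dtheta."""
    return jacobian_det(T, r, theta)
```

At any point the computed coefficient should agree with $r$ up to discretization error, which is the familiar area element of polar coordinates.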
□

10 The General Stokes' Formula

In this section we establish the general Stokes' formula as

$$\int_\varphi d\omega = \int_{\partial\varphi} \omega,$$

where $\omega \in \Omega^k(\mathbb{R}^n)$ and $\varphi \in C_{k+1}(\mathbb{R}^n)$. Then, as special cases, we reel off the standard formulas of vector calculus. Finally, we discuss antidifferentiation of forms and briefly introduce de Rham cohomology. First we verify Stokes' formula on a cube, and then get the general case by means of the pullback.

Definition A $k$-chain is a formal linear combination of $k$-cells,

$$\Phi = \sum_{j=1}^N a_j \varphi_j,$$

where $a_1, \dots, a_N$ are real constants. The integral of a $k$-form $\omega$ over $\Phi$ is

$$\int_\Phi \omega = \sum_{j=1}^N a_j \int_{\varphi_j} \omega.$$

Definition The boundary of a $(k+1)$-cell $\varphi$ is the $k$-chain

$$\partial\varphi = \sum_{j=1}^{k+1} (-1)^{j+1} \bigl(\varphi \circ \iota^{j,1} - \varphi \circ \iota^{j,0}\bigr)$$

where

$$\iota^{j,0} : (u_1, \dots, u_k) \mapsto (u_1, \dots, u_{j-1}, 0, u_j, \dots, u_k)$$
$$\iota^{j,1} : (u_1, \dots, u_k) \mapsto (u_1, \dots, u_{j-1}, 1, u_j, \dots, u_k)$$

are the $j$th rear inclusion $k$-cell and $j$th front inclusion $k$-cell of $I^{k+1}$. As shorthand, one can write $\partial\varphi$ as

$$\partial\varphi = \sum_{j=1}^{k+1} (-1)^{j+1} \delta^j$$

where $\delta^j = \varphi \circ \iota^{j,1} - \varphi \circ \iota^{j,0}$ is the $j$th dipole of $\varphi$.
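44 Stokes' Theorem for a Cube Assume that $k + 1 = n$. If $\omega \in \Omega^k(\mathbb{R}^n)$ and $\iota : I^n \to \mathbb{R}^n$ is the identity-inclusion $n$-cell in $\mathbb{R}^n$ then

$$\int_\iota d\omega = \int_{\partial\iota} \omega.$$

Proof Write $\omega$ as

$$\omega = \sum_{i=1}^n f_i(x)\,dx_1 \wedge \cdots \wedge \widehat{dx_i} \wedge \cdots \wedge dx_n,$$

where the hat above the term is standard notation to indicate that it is deleted. The exterior derivative of $\omega$ is

$$d\omega = \sum_{i=1}^n df_i \wedge dx_1 \wedge \cdots \wedge \widehat{dx_i} \wedge \cdots \wedge dx_n = \sum_{i=1}^n (-1)^{i-1}\,\frac{\partial f_i}{\partial x_i}\,dx_1 \wedge \cdots \wedge dx_n,$$

which implies that

$$\int_\iota d\omega = \sum_{i=1}^n (-1)^{i+1} \int_{I^n} \frac{\partial f_i}{\partial x_i}\,du.$$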
Deleting the $j$th component of the rear $j$th face $\iota^{j,0}(u)$ gives the $k$-tuple $(u_1, \dots, u_k)$, while deleting any other component gives a $k$-tuple with a component that remains constant as $u$ varies. The same is true of the $j$th front face. Thus the Jacobians are

$$\frac{\partial (\iota^{j,0 \text{ or } 1})_I}{\partial u} = \begin{cases} 1 & \text{if } I = (1, \dots, \hat{j}, \dots, n), \\ 0 & \text{otherwise,} \end{cases}$$

and so the $j$th dipole integral of the $i$th term of $\omega$ is zero except when $i = j$, and in that case

$$\int_{\delta^j} \omega = \int_0^1 \cdots \int_0^1 \bigl(f_j(u_1, \dots, u_{j-1}, 1, u_j, \dots, u_k) - f_j(u_1, \dots, u_{j-1}, 0, u_j, \dots, u_k)\bigr)\,du_1 \cdots du_k.$$

By the Fundamental Theorem of Calculus we can substitute the integral of a derivative for the difference; and, by Fubini's Theorem, the order of integration in ordinary multiple integration is irrelevant. This gives

$$\int_{\delta^j} \omega = \int_0^1 \cdots \int_0^1 \frac{\partial f_j(x)}{\partial x_j}\,dx_1 \cdots dx_n,$$

so the alternating dipole sum $\sum (-1)^{j+1} \int_{\delta^j} \omega$ equals $\int_\iota d\omega$. □

45 Corollary (Stokes' Formula for a general k-cell) If $\omega \in \Omega^k(\mathbb{R}^n)$ and if $\varphi \in C_{k+1}(\mathbb{R}^n)$ then

$$\int_\varphi d\omega = \int_{\partial\varphi} \omega.$$

Proof Using the pullback definition and applying (d) of Theorem 43 when $T = \varphi : I^{k+1} \to \mathbb{R}^n$ and $\iota : I^{k+1} \to \mathbb{R}^{k+1}$ is the identity-inclusion gives

$$\int_\varphi d\omega = \int_{\varphi\circ\iota} d\omega = \int_\iota \varphi^* d\omega = \int_\iota d\varphi^*\omega = \int_{\partial\iota} \varphi^*\omega = \int_{\partial\varphi} \omega.$$

□
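Stokes' formula on the identity 2-cell of the plane can be checked numerically. The sketch below uses the 1-form $\omega = f\,dx + g\,dy$ with $f = xy$ and $g = x^2$, an illustrative choice not taken from the text, so that $d\omega = (g_x - f_y)\,dx \wedge dy = x\,dx \wedge dy$. It compares the integral of $d\omega$ over the unit square with the alternating dipole sum $\delta^1 - \delta^2$ over the four faces. The midpoint rule and the finite-difference step are assumptions of the sketch.

```python
def f(x, y):
    return x * y          # coefficient of dx (illustrative choice)

def g(x, y):
    return x * x          # coefficient of dy (illustrative choice)

def midpoint_nodes(n):
    """Midpoints of n equal subintervals of [0, 1]."""
    return [(i + 0.5) / n for i in range(n)]

def integral_d_omega(n=100, h=1e-5):
    """Midpoint-rule estimate of the integral of (g_x - f_y) over the unit square."""
    total = 0.0
    for x in midpoint_nodes(n):
        for y in midpoint_nodes(n):
            g_x = (g(x + h, y) - g(x - h, y)) / (2 * h)
            f_y = (f(x, y + h) - f(x, y - h)) / (2 * h)
            total += g_x - f_y
    return total / (n * n)

def integral_boundary(n=100):
    """Alternating dipole sum: first dipole minus second dipole."""
    nodes = midpoint_nodes(n)
    # First dipole: faces u1 = 1 and u1 = 0, where only g dy contributes.
    dipole1 = sum(g(1.0, u) - g(0.0, u) for u in nodes) / n
    # Second dipole: faces u2 = 1 and u2 = 0, where only f dx contributes.
    dipole2 = sum(f(u, 1.0) - f(u, 0.0) for u in nodes) / n
    return dipole1 - dipole2
```

Both quantities come out equal to $1/2$ up to rounding, as the formula requires.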
Stokes' Formula on manifolds

If $M \subset \mathbb{R}^n$ divides into $(k+1)$-cells and its boundary divides into $k$-cells, as shown in Figure 118, there is a version of Stokes' Formula for $M$. Namely, if $\omega$ is a $k$-form, then

$$\int_M d\omega = \int_{\partial M} \omega.$$

It is required that the boundary $k$-cells which are interior to $M$ cancel each other out. This prohibits the Möbius band and other nonorientable sets. The $(k+1)$-cells tile $M$. Since smooth cells can be singular (not one-to-one) their images can be simplices (triangles, etc.), a helpful fact when tiling $M$.

Figure 118 Manifolds of $(k+1)$-cells. The boundaries may have more than one connected component.

Vector Calculus

The Fundamental Theorem of Calculus can be viewed as a special case of Stokes' Formula

$$\int_M d\omega = \int_{\partial M} \omega$$

by taking $M = [a, b] \subset \mathbb{R}$ and $\omega = f$. The integral of $\omega$ over the 0-chain $\partial M = b - a$ is $f(b) - f(a)$, while the integral of $d\omega$ over $M$ is $\int_a^b f'(x)\,dx$.

Second, Green's Formula in the plane,

$$\iint_D (g_x - f_y)\,dx\,dy = \int_C f\,dx + g\,dy,$$

is also a special case when we take $\omega = f\,dx + g\,dy$. Here, the region $D$ is bounded by the curve $C$. It is a manifold of 2-cells in the plane.

Third, the Gauss Divergence Theorem,

$$\iiint_D \operatorname{div} F = \iint_S \operatorname{flux} F,$$

is a consequence of Stokes' Formula. Here, $F = (f, g, h)$ is a smooth vector field defined on $U \subset \mathbb{R}^3$. (The notation indicates that $f$ is the $x$-component of $F$, $g$ is its $y$-component, and $h$ is its $z$-component.) The divergence of $F$ is the scalar function

$$\operatorname{div} F = f_x + g_y + h_z.$$

If $\varphi$ is a 2-cell in $U$, the integral

$$\int_\varphi f\,dy \wedge dz + g\,dz \wedge dx + h\,dx \wedge dy$$

is the flux of $F$ across $\varphi$. Let $S$ be a compact manifold of 2-cells. The total flux across $S$ is the sum of the flux across its 2-cells. If $S$ bounds a region $D \subset U$, then the Gauss Divergence Theorem is just Stokes' Formula with

$$\omega = f\,dy \wedge dz + g\,dz \wedge dx + h\,dx \wedge dy.$$

For $d\omega = \operatorname{div} F\,dx \wedge dy \wedge dz$.

Finally, the curl of a vector field $F = (f, g, h)$ is the vector field

$$(h_y - g_z,\; f_z - h_x,\; g_x - f_y).$$

Applying Stokes' Formula to the form $\omega = f\,dx + g\,dy + h\,dz$ gives

$$\iint_S (h_y - g_z)\,dy \wedge dz + (f_z - h_x)\,dz \wedge dx + (g_x - f_y)\,dx \wedge dy = \int_C f\,dx + g\,dy + h\,dz$$
where $S$ is a surface bounded by the closed curve $C$. The first integral is the total curl across $S$, while the second is the circulation of $F$ at the boundary. Their equality is Stokes' Curl Theorem.

Closed Forms and Exact Forms

A form is closed if its exterior derivative is zero. It is exact if it is the exterior derivative of some other form. Since $d^2 = 0$, every exact form is closed:

$$\omega = d\alpha \;\Rightarrow\; d\omega = d(d\alpha) = 0.$$

When is the converse true? That is, when can we antidifferentiate a closed form $\omega$? If the forms are defined on $\mathbb{R}^n$, the answer "always" is Poincaré's Lemma. See below. But if the forms are defined on some subset $U$ of $\mathbb{R}^n$, and if they do not extend to smooth forms defined on all of $\mathbb{R}^n$, then the answer depends on the topology of $U$.

There is one case that should be familiar from calculus. Let $U$ be a planar region that is bounded by a simple closed curve, and let $\omega = f\,dx + g\,dy$ be a closed 1-form on $U$. Then it is exact. For one can show that the integral of $\omega$ along a path $C \subset U$ depends only on the endpoints of $C$. The integral is "path independent." Fix a point $p \in U$ and set

$$h(q) = \int_C \omega$$

where $q \in U$ and $C$ is any path in $U$ from $p$ to $q$. The function $h$ is well defined because the integral depends only on the endpoints. One checks that $\partial h/\partial x = f$, $\partial h/\partial y = g$, so $dh = \omega$ and $\omega$ is exact.

An open set $U \subset \mathbb{R}^n$ is simply connected if every closed curve in $U$ can be shrunk to a point in $U$ without leaving $U$. A region in the plane that is bounded by a simple closed curve is simply connected. (In fact it is homeomorphic to the open disc.) Also, the $n$-dimensional ball is simply connected, and so is the spherical shell in $\mathbb{R}^3$ which consists of all points whose distance to the origin is between $a$ and $b$ with $0 < a < b$.

The preceding construction of $h$ works equally well in dimension $n$ and implies that a 1-form defined on a simply connected region in $\mathbb{R}^n$ is closed if and only if it is exact. If $U \subset \mathbb{R}^2$ is not simply connected, there are 1-forms on it that are closed but not exact. The standard example is
$$\omega = \frac{-y}{r^2}\,dx + \frac{x}{r^2}\,dy$$

where $r = \sqrt{x^2 + y^2}$. See Exercise 61. In $\mathbb{R}^3$ it is instructive to consider the 2-form

$$\omega = \frac{x}{r^3}\,dy \wedge dz + \frac{y}{r^3}\,dz \wedge dx + \frac{z}{r^3}\,dx \wedge dy.$$

$\omega$ is defined on $U$, which is $\mathbb{R}^3$ minus the origin. $U$ is a spherical shell with inner radius $0$ and outer radius $\infty$. The form $\omega$ is closed but not exact despite the fact that $U$ is simply connected.
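One way to see that the planar 1-form $\omega = (-y\,dx + x\,dy)/r^2$ cannot be exact on the punctured plane is that its integral around a circle centered at the origin is not zero, whereas the integral of an exact form around any closed curve vanishes. The following sketch estimates that integral with the midpoint rule; the parametrization and the function names are choices made here for illustration.

```python
import math

def omega(x, y):
    """Coefficients (P, Q) of omega = P dx + Q dy = (-y dx + x dy) / r^2."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def circle_integral(radius=1.0, n=1000):
    """Midpoint-rule estimate of the integral of omega over a counterclockwise circle."""
    total = 0.0
    dt = 2 * math.pi / n
    for i in range(n):
        t = (i + 0.5) * dt
        x, y = radius * math.cos(t), radius * math.sin(t)
        dx, dy = -radius * math.sin(t), radius * math.cos(t)   # velocity of the path
        P, Q = omega(x, y)
        total += (P * dx + Q * dy) * dt
    return total
```

The integrand is identically 1 along the circle, so the result is $2\pi$ for every radius, independent of the discretization.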
46 Poincaré's Lemma If $\omega$ is a closed $k$-form on $\mathbb{R}^n$ then it is exact.

Proof In fact, a better result is true. There are integration operators

$$L_k : \Omega^k(\mathbb{R}^n) \to \Omega^{k-1}(\mathbb{R}^n)$$

with the property that $Ld + dL = \text{identity}$. That is, for all $\omega \in \Omega^k(\mathbb{R}^n)$,

$$(L_{k+1}\,d + d\,L_k)(\omega) = \omega.$$

From the existence of these integration operators, the Poincaré Lemma is immediate. For, if $d\omega = 0$ then we have

$$\omega = L(d\omega) + dL(\omega) = dL(\omega),$$

which shows that $\omega$ is exact.
The construction of $L$ is tricky. First we consider a $k$-form $\beta$, not on $\mathbb{R}^n$, but on $\mathbb{R}^{n+1}$. It can be expressed uniquely as

$$(21)\qquad \beta = \sum_I f_I\,dx_I + \sum_J g_J\,dt \wedge dx_J$$

where $f_I = f_I(x, t)$, $g_J = g_J(x, t)$, and $(x, t) \in \mathbb{R}^{n+1} = \mathbb{R}^n \times \mathbb{R}$. The first sum is taken over all ascending $k$-tuples $I$ in $\{1, \dots, n\}$, and the second over all ascending $(k-1)$-tuples $J$ in $\{1, \dots, n\}$. Then we define operators

$$N : \Omega^k(\mathbb{R}^{n+1}) \to \Omega^{k-1}(\mathbb{R}^n)$$

by setting

$$N(\beta) = \sum_J \Bigl(\int_0^1 g_J(x, t)\,dt\Bigr)\,dx_J.$$

We claim that for all $\beta \in \Omega^k(\mathbb{R}^{n+1})$,

$$(22)\qquad (dN + Nd)(\beta) = \sum_I \bigl(f_I(x, 1) - f_I(x, 0)\bigr)\,dx_I$$

where the coefficients $f_I$ take their meaning from (21). By Theorem 14 it is legal to differentiate past the integral sign. Thus

$$d\beta = \sum_{I,\ell} \frac{\partial f_I}{\partial x_\ell}\,dx_\ell \wedge dx_I + \sum_I \frac{\partial f_I}{\partial t}\,dt \wedge dx_I + \sum_{J,\ell} \frac{\partial g_J}{\partial x_\ell}\,dx_\ell \wedge dt \wedge dx_J,$$

$$N(d\beta) = \sum_I \Bigl(\int_0^1 \frac{\partial f_I}{\partial t}\,dt\Bigr)\,dx_I - \sum_{J,\ell} \Bigl(\int_0^1 \frac{\partial g_J}{\partial x_\ell}\,dt\Bigr)\,dx_\ell \wedge dx_J,$$

$$dN(\beta) = \sum_{J,\ell} \Bigl(\int_0^1 \frac{\partial g_J}{\partial x_\ell}\,dt\Bigr)\,dx_\ell \wedge dx_J,$$

and therefore

$$(dN + Nd)(\beta) = \sum_I \Bigl(\int_0^1 \frac{\partial f_I}{\partial t}\,dt\Bigr)\,dx_I = \sum_I \bigl(f_I(x, 1) - f_I(x, 0)\bigr)\,dx_I,$$

as is claimed in (22). Then we define a "cone map" $\rho : \mathbb{R}^{n+1} \to \mathbb{R}^n$ by

$$\rho(x, t) = tx,$$
and set $L = N \circ \rho^*$. Commutativity of pullback and $d$ gives

$$Ld + dL = N\rho^* d + dN\rho^* = (Nd + dN)\rho^*,$$

so it behooves us to work out $\rho^*(\omega)$. First suppose that $\omega$ is simple, say $\omega = h\,dx_I \in \Omega^k(\mathbb{R}^n)$. Since $\rho(x, t) = (tx_1, \dots, tx_n)$, we have

$$\rho^*(h\,dx_I) = (\rho^* h)(\rho^*(dx_I)) = h(tx)\,d\rho_I = h(tx)\bigl(d(tx_{i_1}) \wedge \cdots \wedge d(tx_{i_k})\bigr) = h(tx)\bigl((t\,dx_{i_1} + x_{i_1}\,dt) \wedge \cdots \wedge (t\,dx_{i_k} + x_{i_k}\,dt)\bigr) = h(tx)\,(t^k\,dx_I) + \text{terms that include } dt.$$

From (22) we conclude that

$$(Nd + dN)\rho^*(h\,dx_I) = \bigl(h(1 \cdot x)\,1^k - h(0 \cdot x)\,0^k\bigr)\,dx_I = h\,dx_I.$$

Linearity of $L$ and $d$ promote this equation to general $k$-forms,

$$(Ld + dL)\omega = \omega,$$

and as remarked at the outset, existence of such an $L$ implies that closed forms on $\mathbb{R}^n$ are exact. □

47 Corollary If $U$ is diffeomorphic to $\mathbb{R}^n$ then closed forms on $U$ are exact.
Proof Let $T : U \to \mathbb{R}^n$ be a diffeomorphism and assume that $\omega$ is a closed $k$-form on $U$. Set $\alpha = (T^{-1})^*\omega$. Since pullback commutes with $d$, $\alpha$ is closed on $\mathbb{R}^n$, and for some $(k-1)$-form $\mu$ on $\mathbb{R}^n$, $d\mu = \alpha$. But then

$$dT^*\mu = T^* d\mu = T^*\alpha = T^* \circ (T^{-1})^*\omega = (T^{-1} \circ T)^*\omega = \omega,$$

which shows that $\omega$ is exact. □

48 Corollary Locally, closed forms defined on an open subset of $\mathbb{R}^n$ are exact.

Proof Locally an open subset of $\mathbb{R}^n$ is diffeomorphic to $\mathbb{R}^n$. □

49 Corollary If $U \subset \mathbb{R}^n$ is open and starlike (in particular, if $U$ is convex) then closed forms on $U$ are exact.

Proof A starlike set $U \subset \mathbb{R}^n$ contains a point $p$ such that the line segment from each $q \in U$ to $p$ lies in $U$. It is not hard to construct a diffeomorphism from $U$ to $\mathbb{R}^n$. □
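For a closed 1-form $\omega = f\,dx + g\,dy$ on $\mathbb{R}^2$, the operator $L = N \circ \rho^*$ from the proof of Poincaré's Lemma reduces to a single integral: the $dt$-part of $\rho^*\omega$ is $\bigl(x f(tx, ty) + y g(tx, ty)\bigr)\,dt$, so $(L\omega)(x, y) = \int_0^1 \bigl(x f(tx, ty) + y g(tx, ty)\bigr)\,dt$. The sketch below evaluates this for the illustrative choice $f = 2xy$, $g = x^2$ (not from the text), where the antiderivative should be $x^2 y$. Simpson's rule and the central-difference gradient are assumptions of the sketch.

```python
def f(x, y):
    return 2 * x * y      # coefficient of dx; omega is closed because f_y = g_x = 2x

def g(x, y):
    return x * x          # coefficient of dy

def L(x, y, n=100):
    """Simpson-rule estimate of (L omega)(x, y); n must be even."""
    def integrand(t):
        return x * f(t * x, t * y) + y * g(t * x, t * y)
    step = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * step)
    return total * step / 3

def gradient(h, x, y, eps=1e-5):
    """Central-difference gradient of a function h of two variables."""
    return ((h(x + eps, y) - h(x - eps, y)) / (2 * eps),
            (h(x, y + eps) - h(x, y - eps)) / (2 * eps))
```

Evaluating `gradient(L, x, y)` at a sample point returns values close to `(f(x, y), g(x, y))`, which is the statement $d(L\omega) = \omega$ for this closed form.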
50 Corollary A smooth vector field $F$ on $\mathbb{R}^3$ (or on an open set diffeomorphic to $\mathbb{R}^3$) is the gradient of a scalar function if and only if its curl is everywhere zero.

Proof If $F = \operatorname{grad} f$ then

$$F = (f_x, f_y, f_z) \;\Rightarrow\; \operatorname{curl} F = (f_{zy} - f_{yz},\; f_{xz} - f_{zx},\; f_{yx} - f_{xy}) = 0.$$

On the other hand, if $F = (f, g, h)$ then

$$\operatorname{curl} F = 0 \;\Rightarrow\; \omega = f\,dx + g\,dy + h\,dz$$

is closed, and therefore exact. A function $\phi$ with $d\phi = \omega$ has gradient $F$. □

51 Corollary A smooth vector field on $\mathbb{R}^3$ (or on an open set diffeomorphic to $\mathbb{R}^3$) has everywhere zero divergence if and only if it is the curl of some other vector field.

Proof If $F = (f, g, h)$ and $G = \operatorname{curl} F$ then

$$G = (h_y - g_z,\; f_z - h_x,\; g_x - f_y)$$

so the divergence of $G$ is zero. On the other hand, if the divergence of $G = (A, B, C)$ is zero then the form

$$\omega = A\,dy \wedge dz + B\,dz \wedge dx + C\,dx \wedge dy$$

is closed, and therefore exact. If the form $\alpha = f\,dx + g\,dy + h\,dz$ has $d\alpha = \omega$ then $F = (f, g, h)$ has $\operatorname{curl} F = G$. □
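The easy directions of Corollaries 50 and 51, that $\operatorname{curl}(\operatorname{grad} \phi) = 0$ and $\operatorname{div}(\operatorname{curl} F) = 0$, are the vector-calculus form of $d^2 = 0$ and can be observed numerically. The sketch below uses central differences; the scalar function `phi`, the vector field `F`, and the step `H` are illustrative assumptions, not taken from the text.

```python
import math

H = 1e-3  # finite-difference step (an arbitrary illustrative choice)

def partial(func, i, p):
    """Central-difference partial derivative of func in coordinate i at the point p."""
    hi, lo = list(p), list(p)
    hi[i] += H
    lo[i] -= H
    return (func(hi) - func(lo)) / (2 * H)

def grad(phi):
    """Gradient of a scalar function, returned as three component functions."""
    return [lambda p, i=i: partial(phi, i, p) for i in range(3)]

def curl(F):
    """curl (f, g, h) = (h_y - g_z, f_z - h_x, g_x - f_y)."""
    f, g, h = F
    return [lambda p: partial(h, 1, p) - partial(g, 2, p),
            lambda p: partial(f, 2, p) - partial(h, 0, p),
            lambda p: partial(g, 0, p) - partial(f, 1, p)]

def div(F):
    """div (f, g, h) = f_x + g_y + h_z."""
    return lambda p: sum(partial(F[i], i, p) for i in range(3))

def phi(p):
    x, y, z = p
    return x * x * y + x * math.sin(z)

F = [lambda p: p[1] * p[2],
     lambda p: p[0] ** 2,
     lambda p: math.cos(p[0]) + p[1]]
```

At a sample point such as `[0.4, -1.1, 0.8]`, each component of `curl(grad(phi))` and the value of `div(curl(F))` vanish up to rounding error.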
Cohomology
The set of exact $k$-forms on $U$ is usually denoted $B^k(U)$, while the set of closed $k$-forms is denoted $Z^k(U)$. ("B" is for boundary and "Z" is for cycle.) Both are vector subspaces of $\Omega^k(U)$ and

$$B^k(U) \subset Z^k(U).$$

The quotient vector space

$$H^k(U) = Z^k(U)/B^k(U)$$

is the $k$th de Rham cohomology group of $U$. Its members are the cohomology classes of $U$. As was shown above, if $U$ is simply connected then $H^1(U) = 0$. Also, $H^2(U) \neq 0$ when $U$ is the three-dimensional spherical shell. If $U$ is starlike then $H^k(U) = 0$ for all $k > 0$, and $H^0(U) = \mathbb{R}$. Cohomology necessarily reflects the global topology of $U$. For locally, closed forms are exact. The relation between the cohomology of $U$ and its topology is the subject of algebraic topology, the basic idea being that the more complicated the set $U$ (think of Swiss cheese), the more complicated is its cohomology and vice versa. The book From Calculus to Cohomology by Madsen and Tornehave provides a beautiful exposition of the subject.
11* The Brouwer Fixed Point Theorem

Let $B = B^n$ be the closed unit $n$-ball,

$$B = \{x \in \mathbb{R}^n : |x| \le 1\}.$$

The following is one of the deep results in topology and analysis:

52 Brouwer's Fixed Point Theorem If $F : B \to B$ is continuous then it has a fixed point, a point $p \in B$ such that $F(p) = p$.
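In dimension one the theorem can be made computational. For a continuous $F : [-1, 1] \to [-1, 1]$ the function $F(x) - x$ is non-negative at $-1$ and non-positive at $+1$, so bisection locates a zero of it, which is a fixed point of $F$. The sketch below is only an illustration of this one-dimensional case; the function name, tolerance, and the test map `math.cos` are choices made here.

```python
import math

def fixed_point_1d(F, tol=1e-12):
    """Bisection on g(x) = F(x) - x for a continuous F: [-1, 1] -> [-1, 1]."""
    lo, hi = -1.0, 1.0          # g(lo) >= 0 and g(hi) <= 0 because F maps into [-1, 1]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid) - mid >= 0:
            lo = mid            # keep the invariant g(lo) >= 0
        else:
            hi = mid            # keep the invariant g(hi) <= 0
    return 0.5 * (lo + hi)
```

For example, `fixed_point_1d(math.cos)` returns a point $p$ with $\cos p \approx p$. No comparably simple procedure is available in higher dimensions, which is one reason the general theorem is deep.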
A relatively short proof of Brouwer's Theorem can be given using Stokes' Theorem. Note that Brouwer's Theorem is trivial when $n = 0$, for $B^0$ is a point and is the fixed point of $F$. Also, if $n = 1$, the result is a consequence of the Intermediate Value Theorem on $B^1 = [-1, 1]$. For the continuous function $F(x) - x$ is non-negative at $x = -1$ and non-positive at $x = +1$, so at some $p \in [-1, 1]$, $F(p) - p = 0$; i.e., $F(p) = p$.

The strategy of the proof in higher dimensions is to suppose that there does exist a continuous $F : B \to B$ that fails to have a fixed point, and from this supposition, to derive a contradiction, namely that the volume of $B$ is zero. The first step in the proof is standard.

Step 1. Existence of a continuous $F : B \to B$ without a fixed point implies the existence of a smooth retraction $T$ of a neighborhood $U$ of $B$ to $\partial B$. The map $T$ sends $U$ to $\partial B$ and fixes every point of $\partial B$.

If $F$ has no fixed point as $x$ varies in $B$, then compactness of $B$ implies that there is some $\mu > 0$ such that for all $x \in B$,

$$|F(x) - x| > \mu.$$

The Stone-Weierstrass Theorem then produces a multivariable polynomial $\tilde{F} : \mathbb{R}^n \to \mathbb{R}^n$ that $\mu/2$-approximates $F$ on $B$. The map

$$G(x) = \frac{1}{1 + \mu/2}\,\tilde{F}(x)$$

is smooth and sends $B$ into the interior of $B$. It $\mu$-approximates $F$ on $B$, so it too has no fixed point. The restriction of $G$ to a small neighborhood $U$ of $B$ also sends $U$ into $B$ and has no fixed point.
Figure 119 $T$ retracts $U$ onto $\partial B$. The point $u \in U$ is sent by $T$ to the unique point $u' = T(u)$ at which the segment $[u, G(u)]$, extended through $u$, crosses the sphere $\partial B$.

Figure 119 shows how to construct the retraction $T$ from the map $G$. Since $G$ is smooth, so is $T$.
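The geometric construction of $T$ amounts to solving a quadratic equation: starting at $u$ and moving in the direction $u - G(u)$, find the parameter $s \ge 0$ with $|u + s(u - G(u))| = 1$. The sketch below carries this out for a point $u$ of the closed ball and a given value $g = G(u)$ in the open ball with $g \ne u$. The restriction to $|u| \le 1$ (rather than a neighborhood $U$ of $B$) and the function name are simplifying assumptions of the sketch.

```python
import math

def retract(u, g):
    """Point where the ray from g through u, continued past u, meets the unit sphere.

    Assumes |u| <= 1, |g| < 1 and u != g.
    """
    d = [ui - gi for ui, gi in zip(u, g)]              # direction from g toward u
    a = sum(di * di for di in d)
    b = 2 * sum(ui * di for ui, di in zip(u, d))
    c = sum(ui * ui for ui in u) - 1.0                 # <= 0 because u lies in B
    s = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)  # the root with s >= 0
    return [ui + s * di for ui, di in zip(u, d)]
```

When $u$ already lies on the sphere, $c = 0$ and $b > 0$, so $s = 0$ and the point is fixed, which matches the requirement that $T$ fix every point of $\partial B$.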
Step 2. $T^*$ kills all $n$-forms. Since the range of $T$ is $\partial B$, it contains no $n$-dimensional open set. Then the Inverse Function Theorem implies that the derivative matrix $(DT)_u$ is nowhere invertible, so its Jacobian determinant $\partial T/\partial u$ is everywhere zero, and $T^* : \Omega^n(\mathbb{R}^n) \to \Omega^n(U)$ is the zero map.

Step 3. There is a map $\varphi : I^n \to B$ that exhibits $B$ as an $n$-cell such that
(a) $\varphi$ is smooth.
(b) $\varphi(I^n) = B$ and $\varphi(\partial I^n) = \partial B$.
(c) $\displaystyle\int_{I^n} \frac{\partial\varphi}{\partial u}\,du > 0$.

To construct $\varphi$, start with a smooth function $\sigma : \mathbb{R} \to \mathbb{R}$ such that $\sigma(r) = 0$ for $r \le 1/2$, $\sigma'(r) > 0$ for $1/2 < r < 1$, and $\sigma(r) = 1$ for $r \ge 1$. Then define $\psi : [-1, 1]^n \to \mathbb{R}^n$ by

$$\psi(v) = \begin{cases} v + \sigma(|v|)\Bigl(\dfrac{v}{|v|} - v\Bigr) & \text{if } v \ne 0, \\ 0 & \text{if } v = 0. \end{cases}$$

In fact, this formula defines $\psi$ on all of $\mathbb{R}^n$ and $\psi$ carries the sphere $S_r$ of radius $r$ to the sphere of radius

$$\rho(r) = r + \sigma(r)(1 - r),$$

sending each radial line to itself. Set $\varphi = \psi \circ \kappa$ where $\kappa$ scales $I^n$ to $[-1, 1]^n$ by the affine map $u \mapsto v = (2u_1 - 1, \dots, 2u_n - 1)$. Then
(a) $\varphi$ is smooth because, away from $u_0 = (1/2, \dots, 1/2)$, it is the composite of smooth functions, and in a neighborhood of $u_0$ it is identically equal to $\kappa(u)$.
(b) Since $0 \le \rho(r) \le 1$ for all $r$, $\psi$ sends $\mathbb{R}^n$ to $B$. Since $\rho(r) = 1$ for all $r \ge 1$, $\varphi$ sends $\partial I^n$ to $\partial B$.
(c) It is left as Exercise 65 to show that the Jacobian of $\psi$ for $r = |v|$ is $\rho'(r)\rho(r)^{n-1}/r^{n-1}$. Thus, the Jacobian $\partial\varphi/\partial u$ is always non-negative, and is identically equal to $2^n$ on the ball of radius $1/4$ at $u_0$, so its integral on $I^n$ is positive.
Step 4. Consider an $(n-1)$-form $\alpha$. If $\beta : I^{n-1} \to \mathbb{R}^n$ is an $(n-1)$-cell whose image lies in $\partial B$ then

$$\int_\beta \alpha = \int_{T\circ\beta} \alpha = \int_\beta T^*\alpha.$$

The $(n-1)$-dimensional faces of $\varphi : I^n \to B$ lie in $\partial B$. Thus

$$(23)\qquad \int_{\partial\varphi} \alpha = \int_{\partial\varphi} T^*\alpha.$$

Step 5. Now we get the contradiction. Consider the specific $(n-1)$-form

$$\alpha = x_1\,dx_2 \wedge \cdots \wedge dx_n.$$

Note that $d\alpha = dx_1 \wedge \cdots \wedge dx_n$ is $n$-dimensional volume and

$$\int_\varphi d\alpha = \int_{I^n} \frac{\partial\varphi}{\partial u}\,du > 0.$$

In fact the integral is the volume of $B$. However, we also have

$$\begin{aligned}
\int_\varphi d\alpha &= \int_{\partial\varphi} \alpha && \text{by Stokes' Theorem on a cell} \\
&= \int_{\partial\varphi} T^*\alpha && \text{by Equation (23)} \\
&= \int_\varphi dT^*\alpha && \text{by Stokes' Theorem on a cell} \\
&= \int_\varphi T^* d\alpha && \text{by (d) in Theorem 43} \\
&= 0 && \text{by Step 2.}
\end{aligned}$$

This is a contradiction: an integral cannot simultaneously be zero and positive. The assumption that there exists a continuous $F : B \to B$ with no fixed point has led to a contradiction. Therefore it is untenable, and every $F$ does have a fixed point.

Appendix A: Perorations of Dieudonné
In his classic book, Foundations of Analysis, Jean Dieudonné of the French Bourbaki school writes

"The subject matter of this Chapter [Chapter VIII on differential calculus] is nothing else but the elementary theorems of Calculus, which however are presented in a way which will probably be new to most students. That presentation which throughout adheres strictly to our general 'geometric' outlook on Analysis, aims at keeping as close as possible to the fundamental idea of Calculus, namely the local approximation of functions by linear functions. In the classical teaching of Calculus, this idea is immediately obscured by the accidental fact that, on a one-dimensional vector space, there is a one-to-one correspondence between linear forms and numbers, and therefore the derivative at a point is defined as a number instead of a linear form. This slavish subservience to the shibboleth of numerical interpretation at any cost becomes much worse when dealing with functions of several variables: one thus arrives, for instance, at the classical formula" . . . "giving the partial derivatives of a composite function, which has lost any trace of intuitive meaning, whereas the natural statement of the theorem is of course that the (total) derivative of a composite function is the composite of their derivatives" . . . , "a very sensible formulation when one thinks in terms of linear approximation."

"This 'intrinsic' formulation of Calculus, due to its greater 'abstraction', and in particular to the fact that again and again, one has to leave the initial spaces and climb higher and higher to new 'function spaces' (especially when dealing with the theory of higher derivatives), certainly requires some mental effort, contrasting with the comfortable routine of the classical formulas. But we believe the result is well worth the labor, as it will prepare the student to the still more general idea of Calculus on a differentiable manifold; the reader who wants to have a glimpse of that theory and of the questions to which it leads can look into the books of Chevalley and de Rham. Of course, he will observe in these applications, all the vector spaces which intervene have finite dimension; if that gives him an additional feeling of security, he may of course add that assumption to all the theorems of this chapter. But he will inevitably realize that this does not make the proofs shorter or simpler by a single line; in other words the hypothesis of finite dimension is entirely irrelevant to the material developed below; we have therefore thought it best to dispense with it altogether, although the applications of Calculus which deal with the finite dimensional case still by far exceed the others in number and importance."

I share most of Dieudonné's opinions expressed here. And where else will you read the phrase "slavish subservience to the shibboleth of numerical interpretation at any cost"?

Appendix B: The History of Cavalieri's Principle
The following is from Marsden and Weinstein's Calculus.

The idea behind the slice method goes back, beyond the invention of calculus, to Francesco Bonaventura Cavalieri (1598-1647), a student of Galileo and then professor at the University of Bologna. An accurate report of the events leading to Cavalieri's discovery is not available, so we have taken the liberty of inventing one.

Cavalieri's delicatessen usually produced bologna in cylindrical form, so that the volume would be computed as $\pi \cdot \text{radius}^2 \cdot \text{length}$. One day the casings were a bit weak, and the bologna came out with odd bulges. The scale was not working that day, either, so the only way to compute the price of the bologna was in terms of its volume. Cavalieri took his best knife and sliced the bologna into $n$ very thin slices, each of thickness $x$, and measured the radii, $r_1, r_2, \dots, r_n$ of the slices (fortunately they were all round). He then estimated the volume to be $\sum_{i=1}^n \pi r_i^2 x$, the sum of the volumes of the slices.

Cavalieri was moonlighting from his regular job as a professor at the University of Bologna. That afternoon he went back to his desk and began the book Geometria indivisibilium continuorum nova quandum ratione promota (Geometry shows the continuous indivisibility between new rations and getting promoted), in which he stated what is now known as Cavalieri's principle: If two solids are sliced by a family of parallel planes in such a way that corresponding sections have equal areas, then the two solids have the same volume. The book was such a success that Cavalieri sold his delicatessen and retired to a life of occasional teaching and eternal glory.

Appendix C: A Short Excursion into the Complex Field
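The field $\mathbb{C}$ of complex numbers corresponds bijectively with $\mathbb{R}^2$. The complex number $z = x + iy \in \mathbb{C}$ corresponds to $(x, y) \in \mathbb{R}^2$. A function $T : \mathbb{C} \to \mathbb{C}$ is complex linear if for all $\lambda, z, w \in \mathbb{C}$,

$$T(z + w) = T(z) + T(w) \quad\text{and}\quad T(\lambda z) = \lambda T(z).$$

Since $T(z) = T(z \cdot 1) = zT(1)$, a complex linear $T$ is multiplication by the complex number $\mu = T(1)$. Writing $\mu = \alpha + i\beta$ and $z = x + iy$ gives $\mu z = (\alpha x - \beta y) + i(\beta x + \alpha y)$, so as a linear transformation of $\mathbb{R}^2$ its matrix is

$$\begin{bmatrix} \alpha & -\beta \\ \beta & \alpha \end{bmatrix}.$$

The form of this matrix is special: the diagonal entries are equal and the off-diagonal entries are negatives of each other. For instance, it could never be a matrix with unequal diagonal entries.

A complex function of a complex variable $f(z)$ has a complex derivative $f'(z)$ if the complex ratio $(f(z + h) - f(z))/h$ tends to $f'(z)$ as the complex number $h$ tends to zero. Equivalently,

$$\frac{f(z + h) - f(z) - f'(z)h}{h} \to 0$$

as $h \to 0$. Write $f(z) = u(x, y) + iv(x, y)$ where $z = x + iy$, and $u$, $v$ are real valued functions. Define $F : \mathbb{R}^2 \to \mathbb{R}^2$ by $F(x, y) = (u(x, y), v(x, y))$. Then $F$ is $\mathbb{R}$-differentiable with derivative matrix

$$DF = \begin{bmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{bmatrix}.$$

Since this derivative matrix is the $\mathbb{R}^2$ expression for multiplication by the complex number $f'(z)$, it must have the $\begin{bmatrix} \alpha & -\beta \\ \beta & \alpha \end{bmatrix}$ form. This demonstrates a basic fact about complex differentiable functions: their real and imaginary parts, $u$ and $v$, satisfy the

53 Cauchy-Riemann Equations

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad\text{and}\quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}.$$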
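The Cauchy-Riemann equations can be observed numerically by estimating the real derivative matrix $DF$ of a complex differentiable function with central differences. The function $f(z) = e^z + z^3$ below is an illustrative choice, not an example from the text, and the step size is an assumption of the sketch.

```python
import cmath

def f(z):
    return cmath.exp(z) + z ** 3     # an illustrative complex differentiable function

def real_jacobian(func, x, y, h=1e-6):
    """Central-difference estimate of DF for F(x, y) = (Re func, Im func)."""
    fx = (func(complex(x + h, y)) - func(complex(x - h, y))) / (2 * h)
    fy = (func(complex(x, y + h)) - func(complex(x, y - h))) / (2 * h)
    return ((fx.real, fy.real),      # (u_x, u_y)
            (fx.imag, fy.imag))      # (v_x, v_y)
```

At any point the returned matrix has (approximately) equal diagonal entries and opposite off-diagonal entries, with first column $(\operatorname{Re} f'(z), \operatorname{Im} f'(z))$, as the derivation above predicts.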
Appendix D : Polar Form
The shape of the image of a unit ball under a linear transformation $T$ is not an issue that is used directly in anything we do in Chapter 5, but it certainly underlies the geometric outlook on linear algebra.

Question. What shape is the $(n-1)$-sphere $S^{n-1}$?
Answer. Round.
Question. What shape is $T(S^{n-1})$?
Answer. Ellipsoidal.

Let $z = x + iy$ be a nonzero complex number. Its polar form is $z = re^{i\theta}$ where $r > 0$ and $0 \le \theta < 2\pi$, and $x = r\cos\theta$, $y = r\sin\theta$. Multiplication by $z$ breaks up into multiplication by $r$, which is just dilation, and multiplication by $e^{i\theta}$, which is rotation of the plane by angle $\theta$. As a matrix the rotation is

$$\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.$$

The polar coordinates of $(x, y)$ are $(r, \theta)$. Analogously, consider an isomorphism $T : \mathbb{R}^n \to \mathbb{R}^n$. Its polar form is

$$T = OP$$

where $O$ and $P$ are isomorphisms $\mathbb{R}^n \to \mathbb{R}^n$ such that
(a) $O$ is like $e^{i\theta}$; it is an orthogonal isomorphism.
(b) $P$ is like $r$; it is a positive definite symmetric (PDS) isomorphism.

Orthogonality of $O$ means that for all $v, w \in \mathbb{R}^n$,

$$\langle Ov, Ow \rangle = \langle v, w \rangle,$$

while $P$ being positive definite symmetric means that for all nonzero vectors $v, w \in \mathbb{R}^n$,

$$\langle Pv, v \rangle > 0 \quad\text{and}\quad \langle Pv, w \rangle = \langle v, Pw \rangle.$$

The notation $\langle v, w \rangle$ indicates the usual dot product.

The polar form $T = OP$ reveals everything geometric about $T$. The geometric effect of $O$ is nothing. It is an isometry and changes no distances or shapes. It is rigid. The effect of a PDS operator $P$ is easy to describe. In linear algebra it is shown that there exists a basis $\mathcal{B} = \{u_1, \dots, u_n\}$ of orthonormal vectors (the vectors are of unit length and are mutually perpendicular) and with respect to this basis

$$P = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}.$$

The diagonal entries $\lambda_i$ are positive. $P$ stretches each $u_i$ by the factor $\lambda_i$. Thus $P$ stretches the unit sphere to an $n$-dimensional ellipsoid. The $u_i$ are its axes. The norm of $P$ and hence of $T$ is the largest $\lambda_i$, while the conorm is the smallest $\lambda_i$. The ratio of the largest to the smallest, the condition number, is the eccentricity of the ellipsoid.

Upshot Except for the harmless orthogonal factor $O$, an isomorphism is no more geometrically complicated than a diagonal matrix with positive entries.

54 Polar Form Theorem Any isomorphism $T : \mathbb{R}^n \to \mathbb{R}^n$ factors as $T = OP$ where $O$ is orthogonal and $P$ is PDS.
Proof Recall that the transpose of $T : \mathbb{R}^n \to \mathbb{R}^n$ is the unique isomorphism $T^t$ satisfying the equation

$$\langle Tv, w \rangle = \langle v, T^t w \rangle$$

for all $v, w \in \mathbb{R}^n$. Thus, the condition $\langle Pv, w \rangle = \langle v, Pw \rangle$ in the definition of PDS means exactly that $P^t = P$.

Let $T$ be a given isomorphism $T : \mathbb{R}^n \to \mathbb{R}^n$. We must find its factors $O$ and $P$. We just write them down as follows. Consider the composite $T^t \circ T$. It is PDS because

$$(T^t T)^t = T^t (T^t)^t = T^t T \quad\text{and}\quad \langle T^t T v, v \rangle = \langle Tv, Tv \rangle > 0.$$

Every PDS transformation has a unique PDS square root, just as does every positive real number $r$. (To see this, take the diagonal matrix with entries $\sqrt{\lambda_i}$ in place of $\lambda_i$.) Thus, $T^t T$ has a PDS square root and this is the factor $P$ that we seek,

$$P^2 = T^t T.$$

By $P^2$ we mean the composite $P \circ P$. In order for the formula $T = OP$ to hold with this choice of $P$, we must have $O = TP^{-1}$. To finish the proof we merely must check that $TP^{-1}$ actually is orthogonal. Magically,

$$\langle Ov, Ow \rangle = \langle TP^{-1}v, TP^{-1}w \rangle = \langle P^{-1}v, T^t T P^{-1} w \rangle = \langle P^{-1}v, Pw \rangle = \langle P^t P^{-1} v, w \rangle = \langle P P^{-1} v, w \rangle = \langle v, w \rangle.$$

□

55 Corollary Under any invertible $T : \mathbb{R}^n \to \mathbb{R}^n$, the unit ball is sent to an ellipsoid.

Proof Write $T$ in polar form $T = OP$. The image of the unit ball under $P$ is an ellipsoid. The orthogonal factor $O$ merely rotates the ellipsoid. □
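The recipe in the proof, $P = \sqrt{T^t T}$ and $O = TP^{-1}$, can be carried out by hand for $2 \times 2$ matrices. The sketch below is restricted to that case and uses a closed form for the square root of a $2 \times 2$ PDS matrix $M$, namely $\sqrt{M} = (M + sI)/t$ with $s = \sqrt{\det M}$ and $t = \sqrt{\operatorname{tr} M + 2s}$, which follows from the Cayley-Hamilton identity. This shortcut is an assumption of the sketch and is not the diagonalization argument used in the text.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def inverse(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def pds_sqrt(M):
    """Square root of a 2x2 PDS matrix via the Cayley-Hamilton identity."""
    s = math.sqrt(M[0][0] * M[1][1] - M[0][1] * M[1][0])   # determinant of the root
    t = math.sqrt(M[0][0] + M[1][1] + 2 * s)               # trace of the root
    return [[(M[0][0] + s) / t, M[0][1] / t],
            [M[1][0] / t, (M[1][1] + s) / t]]

def polar(T):
    """Return (O, P) with T = O P, where P = sqrt(T^t T) and O = T P^{-1}."""
    P = pds_sqrt(matmul(transpose(T), T))
    O = matmul(T, inverse(P))
    return O, P
```

For an invertible `T` such as `[[2.0, 1.0], [0.0, 1.0]]`, the returned `O` satisfies $O^t O = I$ and `P` is symmetric with positive trace and determinant, up to rounding.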
Appendix E : Determinants
A permutation of a set $S$ is a bijection $\pi : S \to S$. That is, $\pi$ is one-to-one and onto. We assume the set $S$ is finite, $S = \{1, \dots, k\}$. The sign of $\pi$ is

$$\operatorname{sgn}(\pi) = (-1)^r$$

where $r$ is the number of reversals, i.e., the number of pairs $i, j$ such that

$$i < j \quad\text{and}\quad \pi(i) > \pi(j).$$

Proposition Any permutation is the composite of pair transpositions; the sign of a composite permutation is the product of the signs of its factors; and the sign of a pair transposition is $-1$.

The proof of this combinatorial proposition is left to the reader. Although the factorization of a permutation $\pi$ into pair transpositions is not unique, the number of factors, say $t$, satisfies $(-1)^t = \operatorname{sgn}(\pi)$.

Definition The determinant of a $k \times k$ matrix $A$ is the sum

$$\det A = \sum_\pi \operatorname{sgn}(\pi)\,a_{1\pi(1)}\,a_{2\pi(2)} \cdots a_{k\pi(k)}$$

where $\pi$ ranges through all permutations of $\{1, \dots, k\}$.
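The definition translates directly into code. The sketch below computes the sign by counting reversals and the determinant by summing over all permutations; indices run from 0 rather than 1, and the function names are choices made here. The cost grows like $k!$, so this is an illustration of the definition rather than a practical algorithm.

```python
from itertools import permutations

def sign(perm):
    """(-1)^r, where r counts reversals: pairs i < j with perm[i] > perm[j]."""
    k = len(perm)
    r = sum(1 for i in range(k) for j in range(i + 1, k) if perm[i] > perm[j])
    return -1 if r % 2 else 1

def det(A):
    """Sum over permutations pi of sgn(pi) * a[0][pi(0)] * ... * a[k-1][pi(k-1)]."""
    k = len(A)
    total = 0
    for perm in permutations(range(k)):
        term = sign(perm)
        for row in range(k):
            term *= A[row][perm[row]]
        total += term
    return total
```

For a $2 \times 2$ matrix the sum has two terms and reduces to the familiar $a_{11}a_{22} - a_{12}a_{21}$.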
Equivalent definitions appear in standard linear algebra courses. One of the key facts about determinants is the product rule: for $k \times k$ matrices, $\det AB = \det A \det B$. It extends to non-square matrices as follows.

56 Cauchy-Binet Formula Assume that $k \le n$. If $A$ is a $k \times n$ matrix and $B$ is an $n \times k$ matrix, then the determinant of the product $k \times k$ matrix $AB = C$ is given by the formula

$$\det C = \sum_J \det A^J \det B_J$$

where $J$ ranges through the set of ascending $k$-tuples in $\{1, \dots, n\}$, $A^J$ is the $k \times k$ minor of $A$ whose column indices $j$ belong to $J$, while $B_J$ is the $k \times k$ minor of $B$ whose row indices $i$ belong to $J$. See Figure 120.

Figure 120 The paired $4 \times 4$ minors of $A$ and $B$ are determined by the 4-tuple $J = (j_1, j_2, j_3, j_4)$.
Proof Note that special cases of the Cauchy-Binet formula occur when $k = 1$ or $k = n$. When $k = 1$, $C$ is the $1 \times 1$ matrix that is the dot product of an $A$ row vector of length $n$ times a $B$ column vector of height $n$. The 1-tuples $J$ in $\{1, \dots, n\}$ are just single integers, $J = (1), \dots, J = (n)$, and the product formula is immediate. In the second case, $k = n$, we have the usual product determinant formula because there is only one ascending $k$-tuple in $\{1, \dots, k\}$, namely $J = (1, \dots, k)$.

To handle the general case, define the sum

$$S(A, B) = \sum_J \det A^J \det B_J$$

as above. Consider an elementary $n \times n$ matrix $E$. We claim that

$$S(A, B) = S(AE, E^{-1}B).$$

Since there are only two types of elementary matrices, this is not too hard a calculation, and is left to the reader. Then we perform a sequence of elementary column operations on $A$ to put it in lower triangular form

$$A' = AE_1 \cdots E_r = \begin{bmatrix} a'_{11} & 0 & \cdots & 0 & 0 & \cdots & 0 \\ a'_{21} & a'_{22} & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\ a'_{k1} & a'_{k2} & \cdots & a'_{kk} & 0 & \cdots & 0 \end{bmatrix}.$$

About $B' = E_r^{-1} \cdots E_1^{-1} B$ we observe only that

$$AB = A'B' = A'^{J_0} B'_{J_0}$$

where $J_0 = (1, \dots, k)$. Since elementary column operations do not affect $S$,

$$S(A, B) = S(AE_1, E_1^{-1}B) = S(AE_1E_2, E_2^{-1}E_1^{-1}B) = \cdots = S(A', B').$$

All terms in the sum that defines $S(A', B')$ are zero except the $J_0$th, and thus

$$\det(AB) = \det A'^{J_0} \det B'_{J_0} = S(A', B') = S(A, B)$$

as claimed. □
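The Cauchy-Binet formula is easy to test on small integer matrices, where both sides can be computed exactly. The sketch below forms the sum $\sum_J \det A^J \det B_J$ over all ascending $k$-tuples and compares it with $\det(AB)$. The permutation-sum determinant is repeated so that the block is self-contained; the function names and zero-based indexing are choices made here.

```python
from itertools import combinations, permutations

def det(M):
    """Determinant as a signed sum over permutations (fine for small matrices)."""
    k = len(M)
    total = 0
    for perm in permutations(range(k)):
        reversals = sum(1 for i in range(k) for j in range(i + 1, k)
                        if perm[i] > perm[j])
        term = -1 if reversals % 2 else 1
        for row in range(k):
            term *= M[row][perm[row]]
        total += term
    return total

def matmul(A, B):
    """Product of a k x n matrix A and an n x k matrix B (nested lists)."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def cauchy_binet_sum(A, B):
    """Sum over ascending k-tuples J of det(A^J) * det(B_J)."""
    k, n = len(A), len(B)
    total = 0
    for J in combinations(range(n), k):                 # ascending k-tuples
        A_J = [[A[i][j] for j in J] for i in range(k)]  # columns of A indexed by J
        B_J = [B[j] for j in J]                         # rows of B indexed by J
        total += det(A_J) * det(B_J)
    return total
```

With `A = [[1, 2, 3], [4, 5, 6]]` and `B = [[1, 0], [2, 1], [0, 3]]`, the three minors contribute $-3$, $-18$ and $-18$, and both `cauchy_binet_sum(A, B)` and `det(matmul(A, B))` equal $-39$.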
345
Exercises
Exercises

1. Let $T : V \to W$ be a linear transformation, and let $p \in V$ be given. Prove that the following are equivalent.
(a) T is continuous at the origin.
(b) T is continuous at p.
(c) T is continuous at at least one point of V.

2. Let $\mathcal{L}$ be the vector space of continuous linear transformations from a normed space V to a normed space W. Show that the operator norm makes $\mathcal{L}$ a normed space.

3. Let $T : V \to W$ be a linear transformation between normed spaces. Show that

$$\|T\| = \sup\{|Tv| : |v| < 1\} = \sup\{|Tv| : |v| \le 1\} = \sup\{|Tv| : |v| = 1\} = \inf\{M : v \in V \Rightarrow |Tv| \le M|v|\}.$$

4. The conorm of a linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ is

$$\mathfrak{m}(T) = \inf\left\{\frac{|Tv|}{|v|} : v \ne 0\right\}.$$

It is the minimum stretch that T imparts to vectors in $\mathbb{R}^n$. Let U be the unit ball in $\mathbb{R}^n$.
(a) Show that the norm and conorm of T are the radii of the smallest ball that contains TU and the largest ball contained in TU.
(b) Is the same true in normed spaces?
(c) If T is an isomorphism, prove that its conorm is positive.
(d) Is the converse to (c) true?
(e) If $T : \mathbb{R}^n \to \mathbb{R}^n$ has positive conorm, why is T an isomorphism?
(f) If the norm and conorm of T are equal, what can you say about T?

5. Formulate and prove the fact that function composition is associative. Why can you infer that matrix multiplication is associative?

6. Let $\mathcal{M}_n$ and $\mathcal{L}_n$ be the vector spaces of n × n matrices and linear transformations $\mathbb{R}^n \to \mathbb{R}^n$.
(a) Look up the definition of "ring" in your algebra book.
(b) Show that $\mathcal{M}_n$ and $\mathcal{L}_n$ are rings with respect to matrix multiplication and composition.
(c) Show that $T : \mathcal{M}_n \to \mathcal{L}_n$ is a ring isomorphism.

7. Two norms, $|\ |_1$ and $|\ |_2$, on a vector space are comparable† if there are positive constants c, C such that for all nonzero vectors v in V,

$$c \le \frac{|v|_1}{|v|_2} \le C.$$

(a) Prove that comparability is an equivalence relation on norms.
(b) Prove that any two norms on a finite-dimensional vector space are comparable. [Hint: Use Theorem 2.]
(c) Consider the norms

$$|f|_{L^1} = \int_0^1 |f(t)|\,dt \quad\text{and}\quad |f|_{C^0} = \max\{|f(t)| : t \in [0, 1]\},$$

defined on the infinite-dimensional vector space $C^0$ of continuous functions $f : [0, 1] \to \mathbb{R}$. Show that the norms are not comparable by finding functions $f \in C^0$ whose integral norm is small but whose $C^0$ norm is 1.

† From an analyst's point of view, the choice between comparable norms has little importance. At worst it affects a few constants that turn up in estimates.

*8. Let $|\ | = |\ |_{C^0}$ be the supremum norm on $C^0$ as in the previous exercise. Define an integral transformation $T : C^0 \to C^0$ by

$$T(f)(x) = \int_0^x f(t)\,dt.$$

(a) Show that T is continuous and find its norm.
(b) Let $f_n(t) = \cos(nt)$, n = 1, 2, .... What is $T(f_n)$?
(c) Is the set of functions $K = \{T(f_n) : n \in \mathbb{N}\}$ closed? bounded? compact?
(d) Is T(K) compact? How about its closure?

9. Give an example of two 2 × 2 matrices such that the norm of the product is less than the product of the norms.

10. In the proof of Theorem 2 we used the fact that with respect to the Euclidean norm, the length of a vector is at least as large as the length of any of its components. Show by example that this is false for some norms in $\mathbb{R}^2$. [Hint: Consider the matrix

$$A = \begin{bmatrix} 3 & -2 \\ -2 & 2 \end{bmatrix}.$$

Use A to define an inner product $\langle v, w \rangle_A = \sum v_i a_{ij} w_j$ on $\mathbb{R}^2$, and use the inner product to define a norm

$$|v|_A = \sqrt{\langle v, v \rangle_A}.$$

(What properties must A have for the sum to define an inner product? Does A have these properties?) With respect to this norm, what are the lengths of $e_1$, $e_2$, and $v = e_1 + e_2$?]

11. Consider the shear matrix $S = \ldots$ and the linear transformation $S : \mathbb{R}^2 \to \mathbb{R}^2$ it represents. Calculate the norm and conorm of S. [Hint: Using polar form, it suffices to calculate the norm and conorm of the positive definite symmetric part of S. Recall from linear algebra that the eigenvalues of the square of a matrix A are the squares of the eigenvalues of A.]

12. What is the one line proof that if V is a finite-dimensional normed space then its unit sphere $\{v : |v| = 1\}$ is compact?

13. The set of invertible n × n matrices is open in $\mathcal{M}$. Is it dense?

14. An n × n matrix is diagonalizable if there is a change of basis in which it becomes diagonal.
(a) Is the set of diagonalizable matrices open in $\mathcal{M}(n \times n)$?
(b) closed?
(c) dense?

15. Show that both partial derivatives of the function

$$f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0) \\ 0 & \text{if } (x, y) = (0, 0) \end{cases}$$

exist at the origin, but the function is not differentiable there.

16. Let $f : \mathbb{R}^2 \to \mathbb{R}^3$ and $g : \mathbb{R}^3 \to \mathbb{R}$ be defined by f = (x, y, z) and g = w where

w = w(x, y, z) = xy + yz + zx
x = x(s, t) = st
y = y(s, t) = s cos t
z = z(s, t) = s sin t.

(a) Find the matrices that represent the linear transformations $(Df)_p$ and $(Dg)_q$ where $p = (s_0, t_0) = (0, 1)$ and q = f(p).
(b) Use the Chain Rule to calculate the 1 × 2 matrix $[\partial w/\partial s, \partial w/\partial t]$ that represents $(D(g \circ f))_p$.
(c) Plug the functions x = x(s, t), y = y(s, t), and z = z(s, t) directly into w = w(x, y, z), and recalculate $[\partial w/\partial s, \partial w/\partial t]$, verifying the answer given in (b).
(d) Examine the statements of the multivariable Chain Rules that appear in your old calculus book and observe that they are nothing more than the components of various product matrices.

17. Let $f : U \to \mathbb{R}^m$ be differentiable, $[p, q] \subset U \subset \mathbb{R}^n$, and ask whether the direct generalization of the one-dimensional Mean Value Theorem is true: does there exist a point $\theta \in [p, q]$ such that

(24) $f(q) - f(p) = (Df)_\theta (q - p)$?

(a) Take n = 1, m = 2, and examine the function f(t) = (cos t, sin t) for $\pi \le t \le 2\pi$. Take $p = \pi$ and $q = 2\pi$. Show that there is no $\theta \in [p, q]$ which satisfies (24).
(b) Assume that the set of derivatives

$$\{(Df)_x : x \in [p, q]\}$$

is convex. Prove that there exists $\theta \in [p, q]$ which satisfies (24).
(c) How does (b) imply the one-dimensional Mean Value Theorem?

18. The directional derivative of $f : U \to \mathbb{R}^m$ at $p \in U$ in the direction u is the limit, if it exists,

$$\nabla_p f(u) = \lim_{t \to 0} \frac{f(p + tu) - f(p)}{t}.$$

(Usually, one requires that |u| = 1.)
(a) If f is differentiable at p, why is it obvious that the directional derivative exists in each direction u?
(b) Show that the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by $f(x, y) = \ldots$ if $(x, y) \ne (0, 0)$ and $f(x, y) = \ldots$ if $(x, y) = (0, 0)$ has $\nabla_{(0,0)} f(u) = 0$ for all u but is not differentiable at (0, 0).
*19. Using the functions in Exercises 15 and 18, show that the composite of functions whose partial derivatives exist may fail to have partial derivatives, and the composite of functions whose directional derivatives exist may fail to have directional derivatives. (That is, the classes of these functions are not closed under composition, which is further reason to define multidimensional differentiability in terms of Taylor approximation, and not in terms of partial or directional derivatives.)

20. Assume that U is a connected open subset of $\mathbb{R}^n$ and $f : U \to \mathbb{R}^m$ is differentiable everywhere on U. If $(Df)_p = 0$ for all $p \in U$, show that f is constant.

21. For U as above, assume that f is second-differentiable everywhere and $(D^2 f)_p = 0$ for all p. What can you say about f? Generalize to higher-order differentiability.

22. If Y is a metric space and $f : [a, b] \times Y \to \mathbb{R}$ is continuous, show that

$$F(y) = \int_a^b f(x, y)\,dx$$

is continuous.

23. Assume that $f : [a, b] \times Y \to \mathbb{R}^m$ is continuous, Y is an open subset of $\mathbb{R}^n$, the partial derivatives $\partial f_i(x, y)/\partial y_j$ exist, and they are continuous. Let $D_y f$ be the linear transformation $\mathbb{R}^n \to \mathbb{R}^m$ which is represented by the matrix of partials.
(a) Show that

$$F(y) = \int_a^b f(x, y)\,dx$$

is of class $C^1$ and

$$(DF)_y = \int_a^b (D_y f)\,dx.$$

This generalizes Theorem 14 to higher dimensions.
(b) Generalize (a) to higher order differentiability.

24. Show that all second partial derivatives of the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by $f(x, y) = \ldots$ if $(x, y) \ne (0, 0)$ and $f(x, y) = \ldots$ if $(x, y) = (0, 0)$ exist everywhere, but the mixed second partials are unequal at the origin, $\partial^2 f(0, 0)/\partial x \partial y \ne \partial^2 f(0, 0)/\partial y \partial x$.
*25. Construct an example of a $C^1$ function $f : \mathbb{R} \to \mathbb{R}$ that is second differentiable only at the origin. (Infer that this phenomenon occurs also in higher dimensions.)

26. Suppose that $u \mapsto \beta_u$ is a continuous function from $U \subset \mathbb{R}^n$ into $\mathcal{L}^2(\mathbb{R}^n, \mathbb{R}^m)$.
(a) If for some $p \in U$, $\beta_p$ is not symmetric, prove that its average over some small 2-dimensional parallelogram at p is also not symmetric.
(b) Generalize (a) by replacing $\mathcal{L}^2$ with a finite dimensional space E, and the subset of symmetric bilinear maps with a linear subspace of E. If the average values of a continuous function always lie in the subspace then the values do too.

*27. Assume that $f : U \to \mathbb{R}^m$ is of class $C^2$ and show that $D^2 f$ is symmetric by the following integral method. With reference to the signed sum $\Delta$ of f at the vertices of the parallelogram P in Figure 105, use the $C^1$ Mean Value Theorem to show that

$$\Delta = \int_0^1 \int_0^1 (D^2 f)_{p + sv + tw}\,ds\,dt\,(v, w).$$

Infer symmetry of $(D^2 f)_p$ from symmetry of $\Delta$ and Exercise 26.

28. Let $\beta : \mathbb{R}^n \times \cdots \times \mathbb{R}^n \to \mathbb{R}^m$ be r-linear. Define its symmetrization as

$$\operatorname{symm}(\beta)(v_1, \ldots, v_r) = \frac{1}{r!} \sum_{\sigma \in P(r)} \beta(v_{\sigma(1)}, \ldots, v_{\sigma(r)}),$$

where P(r) is the set of permutations of {1, ..., r}.
(a) Prove that $\operatorname{symm}(\beta)$ is symmetric.
(b) If $\beta$ is symmetric prove that $\operatorname{symm}(\beta) = \beta$.
(c) Is the converse to (b) true?
(d) Prove that $\alpha = \beta - \operatorname{symm}(\beta)$ is antisymmetric in the sense that if $\sigma$ is any permutation of {1, ..., r} then

$$\alpha(v_{\sigma(1)}, \ldots, v_{\sigma(r)}) = \operatorname{sign}(\sigma)\,\alpha(v_1, \ldots, v_r).$$

Infer that $\mathcal{L}^r = \mathcal{L}^r_s \oplus \mathcal{L}^r_a$ where $\mathcal{L}^r_s$ and $\mathcal{L}^r_a$ are the subspaces of symmetric and antisymmetric r-linear transformations.
(e) Let $\beta \in \mathcal{L}^2(\mathbb{R}^2, \mathbb{R})$ be defined by

$$\beta((x, y), (x', y')) = xy'.$$

Express $\beta$ as the sum of a symmetric and an antisymmetric bilinear transformation.
*29. Prove Corollary 18 that rth order differentiability implies symmetry of $D^r f$, $r \ge 3$, in one of two ways.
(a) Use induction to show that $(D^r f)_p(v_1, \ldots, v_r)$ is symmetric with respect to permutations of $v_1, \ldots, v_{r-1}$ and of $v_2, \ldots, v_r$. Then take advantage of the fact that r is strictly greater than 2.
(b) Define the signed sum $\Delta$ of f at the vertices of the parallelotope P spanned by $v_1, \ldots, v_r$ and show that it is the average of $D^r f$. Then proceed as in Exercise 27.

30. Consider the equation

(25) $x e^y + y e^x = 0$.

(a) Observe that there is no way to write down an explicit solution y = y(x) of (25) in a neighborhood of the point $(x_0, y_0) = (0, 0)$.
(b) Why, nevertheless, does there exist a $C^\infty$ solution y = y(x) of (25) near (0, 0)?
(c) What is its derivative at x = 0?
(d) What is its second derivative at x = 0?
(e) What does this tell you about the graph of the solution?
(f) Do you see the point of the Implicit Function Theorem better?

**31. Consider a function $f : U \to \mathbb{R}$ such that
(i) U is a connected open subset of $\mathbb{R}^2$.
(ii) f is $C^1$.
(iii) For all $(x, y) \in U$,

$$\frac{\partial f(x, y)}{\partial y} = 0.$$

(a) If U is a disc, show that f is independent of y.
(b) Construct such an f of class $C^\infty$ which does depend on y.
(c) Show that the f in (b) cannot be analytic.
(d) Why does your example in (b) not invalidate the proof of the Rank Theorem on page 294?

32. Let G denote the set of invertible n × n matrices.
(a) Prove that G is an open subset of $\mathcal{M}(n \times n)$.
(b) Prove that G is a group. (It is called the general linear group.)
(c) Prove that the inversion operator $\operatorname{Inv} : A \mapsto A^{-1}$ is a homeomorphism of G onto G.
(d) Prove that Inv is a diffeomorphism and show that its derivative at A is the linear transformation $\mathcal{M} \to \mathcal{M}$,

$$X \mapsto -A^{-1} \circ X \circ A^{-1}.$$
Relate this formula to the ordinary derivative of 1/x at x = a.

33. Observe that Y = Inv(X) solves the implicit function problem

$$F(X, Y) - I = 0,$$

where $F(X, Y) = X \circ Y$. Assume it is known that Inv is smooth and use the chain rule to derive from this equation the formula for the derivative of Inv.

34. Use Gaussian elimination to prove that the entries of the matrix $A^{-1}$ depend smoothly (in fact analytically) on the entries of A.

*35. Give a proof that the inversion operator Inv is analytic (i.e., is defined locally by a convergent power series) as follows:
(a) If $T \in \mathcal{L}(\mathbb{R}^n, \mathbb{R}^n)$ and $\|T\| < 1$ show that the series of linear transformations

$$I + T + T^2 + \cdots + T^k + \cdots$$

converges to a linear transformation S, and

$$S \circ (I - T) = I = (I - T) \circ S,$$

where I is the identity transformation.
(b) Infer from (a) that inversion is analytic at I.
(c) In general, if $T_0 \in G$ and $\|T\| < 1/\|T_0^{-1}\|$, show that

$$\operatorname{Inv}(T_0 - T) = \operatorname{Inv}(I - T_0^{-1} \circ T) \circ T_0^{-1},$$

and infer that Inv is analytic at $T_0$.
(d) Infer from the general fact that analyticity implies smoothness that inversion is smooth.
(Note that this proof avoids Cramer's Rule and makes no use of finite dimensionality.)

*36. Give a proof of smoothness of Inv by the following bootstrap method.
(a) Using the identity

$$X^{-1} - Y^{-1} = X^{-1} \circ (Y - X) \circ Y^{-1}$$

give a simple proof that Inv is continuous.
(b) Infer that Y = Inv(X) is a continuous solution of the $C^\infty$ implicit function problem

$$F(X, Y) - I = 0,$$

where $F(X, Y) = X \circ Y$ as in Exercise 33. Since the proof of the $C^1$ Implicit Function Theorem relies only on continuity of Inv, it is not circular reasoning to conclude that Inv is $C^1$.
(c) Assume simultaneously that the $C^r$ Implicit Function Theorem has been proved, and that Inv is known to be $C^{r-1}$. Prove that Inv is $C^r$ and that the $C^{r+1}$ Implicit Function Theorem is true.
(d) Conclude logically that Inv is smooth and the $C^\infty$ Implicit Function Theorem is true.
Note that this proof avoids Cramer's Rule and makes no use of finite dimensionality.

37. Draw pictures of all the possible shapes of $T(S^2)$ where $T : \mathbb{R}^3 \to \mathbb{R}^3$ is a linear transformation and $S^2$ is the 2-sphere. (Don't forget the cases in which T has rank < 3.)

*38. Use polar decomposition to give an alternate proof of the volume multiplier formula.

**39. Consider the set S of all 2 × 2 matrices $X \in \mathcal{M}$ that have rank 1.
(a) Show that in a neighborhood of the matrix
Xo
=
[� �]
S is diffeomorphic to a two-dimensional disc.
(b) Is this true (locally) for all matrices $X \in S$?
(c) Describe S globally. (How many connected components does it have? Is it closed? If not, what are its limit points and how does S approach them? What is the intersection of S with the unit sphere in $\mathcal{M}$?, etc.)

40. Let $0 \le \epsilon \le 1$ and a, b > 0 be given.
(a) Prove that … $|a - b| \le 16\epsilon b$.
(b) Is the estimate in (a) sharp? (That is, can 16 be replaced by a smaller constant?)

**41. Suppose that f and g are rth-order differentiable and that the composite $h = g \circ f$ makes sense. A partition divides a set into nonempty disjoint subsets. Prove the Higher-Order Chain Rule,

$$(D^r h)_p = \sum_{k=1}^{r} \sum_{\mu \in P(k, r)} (D^k g)_q \circ (D^\mu f)_p$$

where $\mu$ partitions {1, ..., r} into k subsets, and q = f(p). In terms of r-linear transformations, this notation means

$$(D^r h)_p(v_1, \ldots, v_r) = \sum_{k=1}^{r} \sum_{\mu} (D^k g)_q\big((D^{|\mu_1|} f)_p(v_{\mu_1}), \ldots, (D^{|\mu_k|} f)_p(v_{\mu_k})\big)$$

where $|\mu_i| = \#\mu_i$ and $v_{\mu_i}$ is the $|\mu_i|$-tuple of vectors $v_j$ with $j \in \mu_i$. (Symmetry implies that the order of the vectors $v_j$ in the $|\mu_i|$-tuple $v_{\mu_i}$ and the order in which the partition blocks $\mu_1, \ldots, \mu_k$ occur are irrelevant.)

**42. Suppose that $\beta$ is bilinear and $\beta(f, g)$ makes sense. If f and g are rth order differentiable at p, find the Higher-Order Leibniz Formula for $D^r(\beta(f, g))_p$. [Hint: First derive the formula in dimension 1.]

43. Suppose that $T : \mathbb{R}^n \to \mathbb{R}^m$ has rank k.
(a) Show that there exists a $\delta > 0$ such that if $S : \mathbb{R}^n \to \mathbb{R}^m$ and $\|S - T\| < \delta$, then S has rank $\ge k$.
(b) Give a specific example in which the rank of S can be greater than the rank of T, no matter how small $\delta$ is.
(c) Give examples of linear transformations of rank k for each k where $0 \le k \le \min\{n, m\}$.

44. Let $S \subset M$.
(a) Define the characteristic function $\chi_S : M \to \mathbb{R}$.
(b) If M is a metric space, show that $\chi_S(x)$ is discontinuous at x if and only if x is a boundary point of S.

45. On page 302 there is a definition of $Z \subset \mathbb{R}^2$ being a zero set that involves open rectangles.
(a) Show that the definition is unaffected if we require that the rectangles covering Z are squares.
(b) What if we permit the squares or rectangles to be non-open?
(c) What if we use discs or other shapes instead of squares and rectangles?

*46. Assume that $S \subset \mathbb{R}^2$ is bounded.
(a) Prove that if S is Riemann measurable then so are its interior and closure.
(b) Suppose that the interior and closure of S are Riemann measurable and $|\operatorname{int}(S)| = |\overline{S}|$. Prove that S is Riemann measurable.
(c) Show that some open bounded subsets of $\mathbb{R}^2$ are not Riemann measurable.
*47. In the derivation of Fubini's Theorem on page 304, it is observed that for all $y \in [c, d] \setminus Y$, where Y is a zero set, the lower and upper integrals with respect to x agree, $\underline{F}(y) = \overline{F}(y)$. One might think that the values of $\underline{F}$ and $\overline{F}$ on Y have no effect on their integrals. Not so. Consider the function defined on the unit square [0, 1] × [0, 1],

$$f(x, y) = \begin{cases} 1 & \text{if } y \text{ is irrational} \\ 1 & \text{if } y \text{ is rational and } x \text{ is irrational} \\ 1 - \dfrac{1}{q} & \text{if } y \text{ is rational and } x = p/q \text{ is rational and written in lowest terms.} \end{cases}$$

(a) Show that f is Riemann integrable, and its integral is 1.
(b) Observe that if Y is the zero set $\mathbb{Q} \cap [0, 1]$, then for each $y \notin Y$,

$$\int_0^1 f(x, y)\,dx$$

exists and equals 1.
(c) Observe that if for each $y \in Y$ we choose in a completely arbitrary manner some

$$h(y) \in [\underline{F}(y), \overline{F}(y)]$$

and set

$$H(y) = \begin{cases} \underline{F}(y) = \overline{F}(y) & \text{if } y \notin Y \\ h(y) & \text{if } y \in Y \end{cases}$$

then the integral of H exists and equals 1, but if we take g(y) = 0 for all $y \in Y$ then the integral of

$$G(y) = \begin{cases} \underline{F}(y) = \overline{F}(y) & \text{if } y \notin Y \\ g(y) = 0 & \text{if } y \in Y \end{cases}$$

does not exist.

***48. Is there a criterion to decide which redefinitions of the Riemann integral on the zero set Y of Exercise 47 are harmless and which are not?

49. Using the Fundamental Theorem of Calculus, give a direct proof of Green's Formulas

$$-\iint_R f_y\,dx\,dy = \int_{\partial R} f\,dx \quad\text{and}\quad \iint_R g_x\,dx\,dy = \int_{\partial R} g\,dy,$$
where R is a square in the plane and $f, g : \mathbb{R}^2 \to \mathbb{R}$ are smooth. (Assume that the edges of the square are parallel to the coordinate axes.)

50. Draw a staircase curve $S_n$ that approximates the diagonal $\Delta = \{(x, y) \in \mathbb{R}^2 : 0 \le x = y \le 1\}$ to within a tolerance 1/n. ($S_n$ consists of both treads and risers.) Suppose that $f, g : \mathbb{R}^2 \to \mathbb{R}$ are smooth.
(a) Why does length $S_n \not\to$ length $\Delta$?
(b) Despite (a), prove $\int_{S_n} f\,dx \to \int_\Delta f\,dx$ and $\int_{S_n} g\,dy \to \int_\Delta g\,dy$.
(c) Repeat (b) with $\Delta$ replaced by the graph of a smooth function $g : [a, b] \to \mathbb{R}$.
(d) If C is a smooth simple closed curve in the plane, show that it is the union of finitely many arcs $C_\ell$, each of which is the graph of a smooth function y = g(x) or x = g(y), and the arcs $C_\ell$ meet only at common endpoints.
(e) Infer that if $(S_n)$ is a sequence of staircase curves that converges to C then $\int_{S_n} f\,dx + g\,dy \to \int_C f\,dx + g\,dy$.
(f) Use (e) and Exercise 49 to give a proof of Green's Formulas on a general region $D \subset \mathbb{R}^2$ bounded by a smooth simple closed curve C, that relies on approximating† C, say from the inside, by staircase curves $S_n$ which bound regions $R_n$ composed of many small squares. (You may imagine that $R_1 \subset R_2 \subset \cdots$ and that $R_n \to D$.)

† This staircase approximation proof generalizes to regions that are bounded by fractal, nondifferentiable curves such as the von Koch snowflake. As Jenny Harrison has shown, it also generalizes to higher dimensions.

51. A region R in the plane is of type 1 if there are smooth functions $g_1 : [a, b] \to \mathbb{R}$, $g_2 : [a, b] \to \mathbb{R}$ such that $g_1(x) \le g_2(x)$ and

$$R = \{(x, y) : a \le x \le b \text{ and } g_1(x) \le y \le g_2(x)\}.$$

R is of type 2 if the roles of x and y can be reversed, and it is simple if it is of both type 1 and type 2.
(a) Give an example of a region that is type 1 but not type 2.
(b) Give an example of a region that is neither type 1 nor type 2.
(c) Is every simple region starlike? convex?
(d) If a convex region is bounded by a smooth simple closed curve, is it simple?
(e) Give an example of a region that divides into three simple subregions but not into two.
*(f) If a region is bounded by a smooth simple closed curve C, it need not divide into a finite number of simple subregions. Find an example.
(g) Infer that the standard proof of Green's Formulas for simple regions (as, for example, in J. Stewart's Calculus) does not immediately carry over to the general planar region R with smooth boundary, i.e., cutting R into simple regions can fail.
*(h) Show that if the curve C in (f) is analytic, then no such example exists. [Hint: C is analytic if it is locally the graph of a function defined by a convergent power series. A nonconstant analytic function has the property that for each x, there is some derivative of f which is nonzero, $f^{(r)}(x) \ne 0$.]

**52. The 2-cell $\varphi : I^n \to B^n$ constructed in Step 3 of the proof of Brouwer's Theorem is smooth but not one-to-one.
(a) Construct a homeomorphism $h : I^2 \to B^2$ where $I^2$ is the closed unit square and $B^2$ is the closed unit disc.
(b) In addition make h in (a) be of class $C^1$ (on the closed square) and be a diffeomorphism from the interior of $I^2$ onto the interior of $B^2$. (The derivative of a diffeomorphism is everywhere nonsingular.)
(c) Why can h not be a diffeomorphism from $I^2$ onto $B^2$?
(d) Improve class $C^1$ in (b) to class $C^\infty$.

**53. If $K, L \subset \mathbb{R}^n$ and if there is a homeomorphism $h : K \to L$ that extends to $H : U \to V$ such that $U, V \subset \mathbb{R}^n$ are open, H is a homeomorphism, and $H$, $H^{-1}$ are of class $C^r$, $1 \le r \le \infty$, then we say that K and L are ambiently $C^r$-diffeomorphic.
(a) In the plane, prove that the closed unit square is ambiently diffeomorphic to a general rectangle; to a general parallelogram.
(b) If K, L are ambiently diffeomorphic polygons in the plane, prove that K and L have the same number of vertices. (Do not count vertices at which the interior angle is 180 degrees.)
(c) Prove that the closed square and closed disc are not ambiently diffeomorphic.
(d) If K is a convex polygon that is ambiently diffeomorphic to a polygon L, prove that L is convex.
(e) Is the converse to (b) true or false? What about in the convex case?
(f) The closed disc is tiled by five ambiently diffeomorphic copies of the unit square as shown in Figure 121. Prove that it cannot be tiled by fewer.
Figure 121 Five diffeomorphs of the square tile the disc.
(g) Generalize to dimension $n \ge 3$ and show that the n-ball can be tiled by 2n + 1 diffeomorphs of the n-cube. Can it be done with fewer?
(h) Show that a triangle can be tiled by three diffeomorphs of the square. Infer that any surface that can be tiled by diffeomorphs of the triangle can also be tiled by diffeomorphs of the square. What happens in higher dimensions?

54. Choose at random I, J, two triples of integers between 1 and 9. Check that $dx_I \wedge dx_J = dx_{IJ}$.

55. Show that $d : \Omega^k \to \Omega^{k+1}$ is a linear vector space homomorphism.

56. Prove that the pullback acts linearly on forms, and that it is natural with respect to composition in the sense that $(T \circ S)^* = S^* \circ T^*$. See how succinct a proof you can give.

57. True or false: if $\omega$ is a k-form and k is odd, then $\omega \wedge \omega = 0$. What if k is even and $\ge 2$?

58. Does there exist a continuous mapping from the circle to itself that has no fixed point? What about the 2-torus? The 2-sphere?

59. Show that a smooth map $T : U \to V$ induces a linear map of cohomology groups $H^k(V) \to H^k(U)$ defined by

$$T^* : [\omega] \mapsto [T^*\omega].$$

Here, $[\omega]$ denotes the equivalence class of $\omega \in Z^k(V)$ in $H^k(V)$. The question amounts to showing that the pullback of a closed form $\omega$ is closed and that its cohomology class depends only on the cohomology class of $\omega$.†

† A fancier way to present the proof of the Brouwer Fixed Point Theorem goes like this: As always, the question reduces to showing that there is no smooth retraction T of the n-ball to its boundary. Such a T would give a cohomology map $T^* : H^k(\partial B) \to H^k(B)$ where the cohomology groups of $\partial B$ are those of its spherical shell neighborhood. The map $T^*$ is seen to be a cohomology group isomorphism because $T \circ \text{inclusion}_{\partial B} = \text{inclusion}_{\partial B}$ and $\text{inclusion}_{\partial B}^{\,*} = \text{identity}$. But when $k = n - 1 \ge 1$ the cohomology groups are non-isomorphic: they are computed to be $H^{n-1}(\partial B) = \mathbb{R}$ and $H^{n-1}(B) = 0$.

60. Prove that diffeomorphic open sets have isomorphic cohomology groups.

61. Show that the 1-form defined on $\mathbb{R}^2 \setminus \{(0, 0)\}$ by

$$\omega = \frac{-y}{r^2}\,dx + \frac{x}{r^2}\,dy$$

is closed but not exact. Why do you think that this 1-form is often referred to as $d\theta$ and why is the name problematic?

62. Show that the 2-form defined on the spherical shell by $\omega = \ldots$ is closed but not exact.

63. If $\omega$ is closed, is the same true of $f\omega$? What if $\omega$ is exact?

64. Is the wedge product of closed forms closed? Of exact forms exact? What about the product of a closed form and an exact form? Does this give a ring structure to the cohomology classes?

65. Prove that the n-cell $\psi : [-1, 1]^n \to B^n$ in the proof of the Brouwer Fixed Point Theorem has Jacobian $\rho'(r)\rho(r)^{n-1}/r^{n-1}$ for $r = |v|$ as claimed on page 336.

**66. The Hairy Ball Theorem states that any continuous vector field X in $\mathbb{R}^3$ that is everywhere tangent to the 2-sphere S is zero at some point of S. Here is an outline of a proof for you to fill in. (If you imagine the vector field as hair combed on a sphere, there must be a cowlick somewhere.)
(a) Show that the Hairy Ball Theorem is equivalent to a fixed point assertion: every continuous map of S to itself that is sufficiently close to the identity map $S \to S$ has a fixed point. (This is not needed below, but it is interesting.)
(b) If a continuous vector field on S has no zero on or inside a small simple closed curve $C \subset S$, show that the net angular turning of X along C as judged by an observer who takes a tour of C in the counterclockwise direction is $-2\pi$. (The observer walks along C in the counterclockwise direction when S is viewed from the outside, and he measures the angle that X makes with respect to his own tangent vector as he walks along C. By convention, clockwise angular variation is negative.) Show also that the net turning is $+2\pi$ if the observer walks along C in the clockwise direction.
(c) If $C_t$ is a continuous family of simple closed curves on S, $a \le t \le b$, and if X never equals zero at points of $C_t$, show that the net angular turning of X along $C_t$ is independent of t. (This is a case of a previous exercise stating that a continuous integer valued function of t is constant.)
(d) Imagine the following continuous family of simple closed curves $C_t$. For t = 0, $C_0$ is the arctic circle. For $0 \le t \le 1/2$, the latitude of $C_t$ decreases while its circumference increases as it oozes downward, becomes the equator, and then grows smaller until it becomes the antarctic circle when t = 1/2. For $1/2 \le t \le 1$, $C_t$ maintains its size and shape, but its new center, the South Pole, slides up the Greenwich Meridian until at t = 1, $C_t$ regains its original arctic position. See Figure 122. Its orientation has reversed. Orient the arctic circle $C_0$ positively and choose an orientation on each $C_t$ that depends continuously on t. To reach a contradiction, suppose that X has no zero on S.
(i) Why is the total angular turning of X along $C_0$ equal to $-2\pi$?
(ii) Why is it $+2\pi$ on $C_1$?
(iii) Why is this a contradiction to (c) unless X has a zero somewhere?
(iv) Conclude that you have proved the Hairy Ball Theorem.

Figure 122 A deformation of the arctic circle that reverses its orientation
6
Lebesgue Theory
This chapter presents a geometric theory of Lebesgue measure and integration. In calculus you certainly learned that the integral is the area under the curve. With a good definition of area, that is the point of view we advance here. Deriving the basic theory of Lebesgue integration then becomes a matter of inspecting the right picture.
1
Outer measure
How should you measure the length of a subset of the line? If the set to be measured is simple, so is the answer: the length of the interval (a, b) is b - a. But what is the length of the set of rational numbers? of the Cantor set? As is often the case in analysis we proceed by inequalities and limits. In fact one might distinguish the fields of algebra and analysis solely according to their use of equalities versus inequalities.

Definition The length of an interval I = (a, b) is b - a. It is denoted |I|. The Lebesgue outer measure of a set $A \subset \mathbb{R}$ is

$$m^*A = \inf\Big\{\sum_k |I_k| : \{I_k\} \text{ is a covering of } A \text{ by open intervals}\Big\}.$$

Tacitly we assume that the covering is countable; the series $\sum_k |I_k|$ is its total length. (Recall that "countable" means either finite or denumerable.)
The outer measure of A is the infimum of the total lengths of all possible coverings $\{I_k\}$ of A by open intervals. If every series $\sum_k |I_k|$ diverges, then by definition, $m^*A = \infty$. Outer measure is defined for every $A \subset \mathbb{R}$. It measures A from the outside as do calipers. A dual approach measures A from the inside. It is called inner measure, is denoted $m_*A$, and is discussed in Section 3.

Three properties of outer measure (the axioms of outer measure) are easy to check.
(a) The outer measure of the empty set is 0, $m^*\emptyset = 0$.
(b) If $A \subset B$ then $m^*A \le m^*B$.
(c) If $A = \bigcup_{n=1}^{\infty} A_n$ then $m^*A \le \sum_{n=1}^{\infty} m^*A_n$.

(b) and (c) are called monotonicity and countable subadditivity of outer measure.
(a) is obvious. Every interval covers the empty set.
(b) is obvious. Every covering of B is also a covering of A.
(c) uses the $\epsilon/2^n$ trick. Given $\epsilon > 0$ there exists for each n a covering $\{I_{k,n} : k \in \mathbb{N}\}$ of $A_n$ such that

$$\sum_{k=1}^{\infty} |I_{k,n}| < m^*A_n + \frac{\epsilon}{2^n}.$$

The collection $\{I_{k,n} : k, n \in \mathbb{N}\}$ covers A and

$$\sum_{k,n} |I_{k,n}| = \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} |I_{k,n}| \le \sum_{n=1}^{\infty} \Big(m^*A_n + \frac{\epsilon}{2^n}\Big) = \sum_{n=1}^{\infty} m^*A_n + \epsilon.$$

Thus the infimum of the total lengths of coverings of A by open intervals is $\le \sum_n m^*A_n + \epsilon$, and since $\epsilon > 0$ is arbitrary, the infimum is $\le \sum_n m^*A_n$, which is what (c) asserts.

Next, suppose you have a set A in the plane and you want to measure its area. Here is the natural way to do it.
Definition The area of a rectangle R = (a, b) × (c, d) is |R| = (b - a) · (d - c), and the (planar) outer measure of $A \subset \mathbb{R}^2$ is the infimum of the total area of countable coverings of A by open rectangles $R_k$,

$$m^*A = \inf\Big\{\sum_k |R_k| : \{R_k\} \text{ covers } A\Big\}.$$

If need be, we decorate $|\ |$ and $m^*$ with subscripts "1" and "2" to distinguish the linear and planar quantities.
The outer measure axioms - monotonicity, countable subadditivity, and the outer measure of the empty set being zero - are true for planar outer measure too. See also Exercise 5. A consequence of subadditivity concerns zero sets, sets that have zero outer measure.

1 Proposition The countable union of zero sets is a zero set.

Proof If $m^*Z_n = 0$ for all $n \in \mathbb{N}$ and $Z = \bigcup_n Z_n$ then by (c),

$$m^*Z \le \sum_n m^*Z_n = 0.$$ □
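The $\epsilon/2^n$ trick behind countable subadditivity can be made concrete for a countable set, each of whose points is a zero set. The sketch below is only an illustration under stated assumptions: the function names `rationals_in_unit_interval` and `cover`, the particular enumeration of the rationals, and the choices of `eps` and of 50 points are all hypothetical. The n-th point receives an open interval of length $\epsilon/2^n$, so the total length of any finite stage of the covering stays below $\epsilon$.

```python
from fractions import Fraction


def rationals_in_unit_interval(count):
    """First `count` distinct rationals p/q in [0, 1], listed by increasing denominator."""
    seen, out = set(), []
    q = 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out


def cover(points, eps):
    """Give the n-th point (n = 1, 2, ...) an open interval of length eps / 2**n around it."""
    intervals = []
    for n, x in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (n + 1)
        intervals.append((x - half, x + half))
    return intervals


eps = Fraction(1, 100)
pts = rationals_in_unit_interval(50)
ivs = cover(pts, eps)
total = sum(b - a for a, b in ivs)

# The total length is eps * (1 - 2**-50), which is strictly less than eps.
print(total < eps)
```

Exact rational arithmetic is used so the inequality is checked without rounding. Letting the number of points grow, the total length increases toward $\epsilon$ but never reaches it, which is the estimate used in the proof of (c).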
Two other properties enjoyed by outer measure are left for you as Exercises 1 and 2: (d) Rigid translation of a set has no effect on its outer measure. (e) Dilation of a set dilates its outer measure correspondingly. The next theorem states a property of outer measure that seems obvious. See also Exercises 3, 4.
2 Theorem The linear outer measure of a closed interval is its length; the planar outer measure of a closed rectangle is its area.

Proof Let $I = [a, b]$ be the interval. For any $\epsilon > 0$, the single open interval $(a - \epsilon, b + \epsilon)$ covers $I$, so $m^*I \le |I| + 2\epsilon$. Since $\epsilon$ is arbitrary, $m^*I \le |I|$.

To check the reverse inequality, let $\{I_k\}$ be a countable covering of $I$ by open intervals. Compactness implies that for some finite $N$, $I \subset \bigcup_{k=1}^N I_k$. Form a partition
\[ a = x_0 < \cdots < x_n = b \]
such that all the endpoints $a_k, b_k$ of the intervals $I_k = (a_k, b_k)$ that appear in $I$ occur as partition points. Each partition interval $X_i = (x_{i-1}, x_i)$ is contained wholly inside some covering interval $I_k$, which implies
\[ |X_i| \le \sum_{k=1}^N |X_i \cap I_k|. \]
For one of the terms in the sum is $|X_i|$ itself. Also, for each $k$, the partition of $I$ restricts to a partition of the interval $I \cap I_k$, and thus
\[ |I \cap I_k| = \sum_{i=1}^n |X_i \cap I_k|. \]
This gives
\[ |I| = \sum_{i=1}^n |X_i| \le \sum_{i=1}^n \Bigl( \sum_{k=1}^N |X_i \cap I_k| \Bigr) = \sum_{k=1}^N \Bigl( \sum_{i=1}^n |X_i \cap I_k| \Bigr) = \sum_{k=1}^N |I \cap I_k| \le \sum_{k=1}^N |I_k| \le \sum_{k=1}^\infty |I_k|, \]
which shows that $|I| \le m^*I$ and completes the proof in dimension one.

The case of a rectangle $R = [a, b] \times [c, d]$ is similar. As with intervals, it is enough to show that any countable covering of $R$ by open rectangles $R_k = (a_k, b_k) \times (c_k, d_k)$ has $|R| \le \sum |R_k|$. There is an $N$ such that $R_1, \ldots, R_N$ cover $R$. Partition the sides of $R$ as
\[ a = x_0 < \cdots < x_n = b, \qquad c = y_0 < \cdots < y_m = d \]
so that all the endpoints $a_k, b_k, c_k, d_k$ of the covering intervals that appear in $[a, b]$ and $[c, d]$ occur among the partition points. Call $X_i = (x_{i-1}, x_i)$, $Y_j = (y_{j-1}, y_j)$, and $G_{ij} = X_i \times Y_j$. Each of these grid rectangles $G_{ij}$ is contained wholly inside at least one covering rectangle $R_k$, and so $|G_{ij}| \le \sum_{k=1}^N |G_{ij} \cap R_k|$. Also, for each $k$ the rectangle $R \cap R_k$ is partitioned by the grid rectangles that lie in it, so that $|R \cap R_k| = \sum_{ij} |G_{ij} \cap (R \cap R_k)|$. This gives
\[ |R| = \Bigl( \sum_{i=1}^n |X_i| \Bigr) \Bigl( \sum_{j=1}^m |Y_j| \Bigr) = \sum_{ij} |G_{ij}| \le \sum_{ij} \Bigl( \sum_{k=1}^N |G_{ij} \cap R_k| \Bigr) = \sum_{k=1}^N \Bigl( \sum_{ij} |G_{ij} \cap (R \cap R_k)| \Bigr) = \sum_{k=1}^N |R \cap R_k| \le \sum_{k=1}^N |R_k| \le \sum_{k=1}^\infty |R_k|, \]
which shows that $|R| \le m^*R$ and completes the proof. □
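The chain of inequalities in the one-dimensional proof can be checked on a concrete finite covering. The sketch below is only an illustration and plays no role in the proof: the names `overlap` and `partition_chain` and the sample covering of $[0, 1]$ are assumptions made for the example, and exact rational arithmetic is used so that the comparisons are not blurred by rounding.

```python
from fractions import Fraction as F

def overlap(lo, hi, ak, bk):
    """Length of the intersection of the open intervals (lo, hi) and (ak, bk)."""
    return max(F(0), min(hi, bk) - max(lo, ak))

def partition_chain(a, b, cover):
    """Return (|I|, sum_i sum_k |X_i n I_k|, sum_k |I n I_k|, sum_k |I_k|) for I = [a, b]."""
    # Partition points: a, b, and every covering endpoint that falls inside (a, b).
    pts = sorted({a, b} | {p for iv in cover for p in iv if a < p < b})
    pieces = list(zip(pts, pts[1:]))  # the partition intervals X_i
    length_I = sum(hi - lo for lo, hi in pieces)
    double_sum = sum(overlap(lo, hi, ak, bk)
                     for lo, hi in pieces for ak, bk in cover)
    clipped = sum(overlap(a, b, ak, bk) for ak, bk in cover)
    total = sum(bk - ak for ak, bk in cover)
    return length_I, double_sum, clipped, total

# A sample finite covering of [0, 1] by open intervals (chosen for illustration).
cover = [(F(-1, 4), F(1, 2)), (F(1, 3), F(3, 4)), (F(2, 3), F(5, 4))]
length_I, double_sum, clipped, total = partition_chain(F(0), F(1), cover)
print(length_I, double_sum, clipped, total)
```

For this covering the four quantities are 1, 5/4, 5/4, and 7/4, matching the pattern $|I| \le \sum_i \sum_k |X_i \cap I_k| = \sum_k |I \cap I_k| \le \sum_k |I_k|$.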
3 Corollary The formulas $m^*I = |I|$ and $m^*R = |R|$ hold also for intervals and rectangles that are open or partly open.

Proof If $I$ is any interval and $\epsilon > 0$ is given then there are closed intervals $J$ and $J'$ that sandwich $I$ as $J \subset I \subset J'$ and $|J'| - |J| < \epsilon$. Then
\[ |J| = m^*J \le m^*I \le m^*J' = |J'|, \qquad |J| \le |I| \le |J'|. \]
Since $\bigl| |I| - m^*I \bigr| < \epsilon$ for all $\epsilon > 0$, $m^*I = |I|$. The sandwich method works equally well for rectangles. □
2
Measurability
If $A$ and $B$ are subsets of disjoint intervals in $\mathbb{R}$ it is easy to show that
\[ m^*(A \cup B) = m^*A + m^*B. \]
But what if $A$ and $B$ are merely disjoint? Is the formula still true? The answer is "yes," if the sets have an additional property called measurability, and "no" in general as is shown in Appendix B. Measurability is the rule and nonmeasurability the exception: the sets you meet in analysis (open sets, closed sets, their unions, differences, etc.) all are measurable. See Section 3.

Definition A set $E \subset \mathbb{R}$ is (Lebesgue) measurable if the division $E \mid E^c$ of $\mathbb{R}$ is so "clean" that for each "test set" $X \subset \mathbb{R}$,
\[ (1) \qquad m^*X = m^*(X \cap E) + m^*(X \cap E^c). \]
We denote by $\mathcal{M} = \mathcal{M}(\mathbb{R})$ the collection of all Lebesgue measurable subsets of $\mathbb{R}$. If $E$ is measurable its Lebesgue measure is $m^*E$, which we write as $mE$, dropping the asterisk to emphasize measurability of $E$. The definition of measurability in the plane is analogous: if $E \subset \mathbb{R}^2$ is measurable, then $E \mid E^c$ divides each $X \subset \mathbb{R}^2$ so cleanly that (1) is true for the planar outer measures.

Which sets are measurable? It is obvious that the empty set is measurable. It is also obvious that if a set is measurable, then so is its complement, since $E \mid E^c$ and $E^c \mid E$ divide a test set $X$ in the same way.

In this section we analyze measurability in the abstract. For the basic facts about measurability have nothing to do with $\mathbb{R}$ or $\mathbb{R}^2$. They hold for any "abstract outer measure."

Definition Let $M$ be any set. The collection of all subsets of $M$ is denoted as $2^M$. An abstract outer measure on $M$ is a function $\omega : 2^M \to [0, \infty]$ that satisfies the three axioms of outer measure: $\omega(\emptyset) = 0$, $\omega$ is monotone, and $\omega$ is countably subadditive. A set $E \subset M$ is measurable with respect to $\omega$ if $M = E \mid E^c$ is so clean that for each test set $X \subset M$,
\[ \omega X = \omega(X \cap E) + \omega(X \cap E^c). \]
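On a finite set the clean-division condition can be tested by brute force, which gives a feel for how restrictive it is. The sketch below is an illustration under stated assumptions: the four-point set `M`, the two `BLOCKS`, and the toy outer measure `omega` (which counts how many blocks a set meets) are invented for the example and do not come from the text. One checks directly that `omega` vanishes on the empty set and is monotone and subadditive.

```python
from itertools import combinations

M = frozenset({0, 1, 2, 3})
BLOCKS = [frozenset({0, 1}), frozenset({2, 3})]

def omega(A):
    """Toy outer measure: the number of blocks that the set A meets."""
    return sum(1 for B in BLOCKS if A & B)

def subsets(S):
    """All subsets of the finite set S, as frozensets."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_measurable(E):
    """Caratheodory test: E | E^c must split every test set X cleanly."""
    return all(omega(X) == omega(X & E) + omega(X - E) for X in subsets(M))

measurable = [E for E in subsets(M) if is_measurable(E)]
print(sorted(sorted(E) for E in measurable))
```

Only the empty set, the two blocks, and `M` itself pass the test. A singleton such as `{0}` fails because the test set `{0, 1}` has outer measure 1 while its two pieces have outer measure 1 each.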
4 Theorem The collection $\mathcal{M}$ of measurable sets with respect to any outer measure on any set $M$ is a $\sigma$-algebra and the outer measure restricted to this $\sigma$-algebra is countably additive. In particular, Lebesgue measure is countably additive.

A $\sigma$-algebra is a collection of sets that includes the empty set, is closed under complement, and is closed under countable union. Countable additivity of $\omega$ means that if $E_1, E_2, \ldots$ are disjoint sets that are measurable with respect to $\omega$, then
\[ E = \bigsqcup_i E_i \;\Rightarrow\; \omega E = \sum_i \omega(E_i). \]

Proof Let $\mathcal{M}$ denote the collection of measurable sets with respect to the outer measure $\omega$ on $M$. To check that $\mathcal{M}$ is a $\sigma$-algebra we must show that it contains the empty set, is closed under complements, and is closed under countable union.

It is clear that $\emptyset$ divides any $X$ cleanly, so $\emptyset \in \mathcal{M}$. Also, since $E \mid E^c$ divides a test set $X$ in the same way that $E^c \mid E$ does, $\mathcal{M}$ is closed under complements. To check that $\mathcal{M}$ is closed under countable union takes four preliminary steps:
(a) $\mathcal{M}$ is closed under differences.
(b) $\mathcal{M}$ is closed under finite union.
(c) $\omega$ is finitely additive on $\mathcal{M}$.
(d) $\omega$ satisfies a special countable addition formula.

(a) For measurable sets $E_1$, $E_2$, and a test set $X$, draw the Venn diagram in Figure 123 where $X$ is represented as a disc. To check measurability of $E_1 \setminus E_2$ we must verify the equation
\[ 2 + 134 = 1234 \]
where $2 = \omega(X \cap (E_1 \setminus E_2))$, $134 = \omega(X \cap (E_1 \setminus E_2)^c)$, $1234 = \omega X$, etc. Since $E_1$ divides any set cleanly, $134 = 1 + 34$, and since $E_2$ divides any set cleanly, $34 = 3 + 4$. Thus,
\[ 2 + 134 = 2 + 1 + 3 + 4 = 1 + 2 + 3 + 4. \]
For the same reason, $1234 = 12 + 34 = 1 + 2 + 3 + 4$, which completes the proof of (a).

Figure 123 The picture that proves $\mathcal{M}$ is closed under differences.

(b) Suppose that $E_1, E_2$ are measurable and $E = E_1 \cup E_2$. Since $E^c = E_1^c \setminus E_2$, (a) implies that $E^c \in \mathcal{M}$, and thus $E \in \mathcal{M}$. For more than two sets, induction shows that if $E_1, \ldots, E_n \in \mathcal{M}$ then $E_1 \cup \cdots \cup E_n \in \mathcal{M}$.

(c) If $E_1, E_2 \in \mathcal{M}$ are disjoint, then $E_1$ divides $E = E_1 \sqcup E_2$ cleanly, so
\[ \omega E = \omega(E \cap E_1) + \omega(E \cap E_1^c) = \omega(E_1) + \omega(E_2), \]
which is additivity for pairs of measurable sets. For more than two measurable sets, induction implies that $\omega$ is finitely additive on $\mathcal{M}$; i.e., if $E_1, \ldots, E_n \in \mathcal{M}$ then
\[ E = \bigsqcup_{i=1}^n E_i \;\Rightarrow\; \omega E = \sum_{i=1}^n \omega(E_i). \]
(d) Given a test set $X \subset M$, and a countable disjoint union $E = \bigsqcup E_i$ of measurable sets, we claim that
\[ (2) \qquad \omega(X \cap E) = \sum_i \omega(X \cap E_i). \]
(When $X = M$ this is countable additivity, but in general $X$ need not be measurable.) Consider the division
\[ X \cap (E_1 \sqcup E_2) = (X \cap E_1) \sqcup (X \cap E_2). \]
Measurability of $E_1$ implies that the two outer measures add. By induction the same is true for any finite sum,
\[ \omega(X \cap (E_1 \sqcup \cdots \sqcup E_k)) = \omega(X \cap E_1) + \cdots + \omega(X \cap E_k). \]
Monotonicity of $\omega$ implies that
\[ \omega(X \cap E) \ge \omega(X \cap (E_1 \sqcup \cdots \sqcup E_k)), \]
and so $\omega(X \cap E)$ dominates each partial sum of the series $\sum \omega(X \cap E_i)$. Hence it dominates the series too,
\[ \sum_{i=1}^\infty \omega(X \cap E_i) \le \omega(X \cap E). \]
The reverse inequality is always true by subadditivity and we get equality, verifying (2).

Finally, we prove that $E = \bigcup E_i$ is measurable when each $E_i$ is. Taking $E_i' = E_i \setminus (E_1 \cup \cdots \cup E_{i-1})$, (a) tells us that it is no loss of generality to assume the sets $E_i$ are disjoint, $E = \bigsqcup E_i$. Given a test set $X \subset M$ we know by finite additivity (the finite form of (2)), monotonicity of $\omega$, and measurability of the finite union $E_1 \sqcup \cdots \sqcup E_k$ that
\[ \omega(X \cap E_1) + \cdots + \omega(X \cap E_k) + \omega(X \cap E^c) = \omega(X \cap (E_1 \sqcup \cdots \sqcup E_k)) + \omega(X \cap E^c) \le \omega(X \cap (E_1 \sqcup \cdots \sqcup E_k)) + \omega(X \cap (E_1 \sqcup \cdots \sqcup E_k)^c) = \omega X. \]
Being true for all $k$, the inequality holds also for the full series,
\[ \sum_{i=1}^\infty \omega(X \cap E_i) + \omega(X \cap E^c) \le \omega X. \]
Using (2), we get
\[ \omega(X \cap E) + \omega(X \cap E^c) = \sum_{i=1}^\infty \omega(X \cap E_i) + \omega(X \cap E^c) \le \omega X. \]
The reverse inequality is true by subadditivity of $\omega$. This gives equality and shows that $E$ is measurable. Hence $\mathcal{M}$ is a $\sigma$-algebra and the restriction of $\omega$ to $\mathcal{M}$ is countably additive. □

From countable additivity we deduce a very useful fact about measures. It applies to any outer measure $\omega$, in particular to Lebesgue outer measure.

5 Measure Continuity Theorem If $\{E_k\}$ and $\{F_k\}$ are sequences of measurable sets then
upward measure continuity: $E_k \uparrow E \;\Rightarrow\; \omega E_k \uparrow \omega E$,
downward measure continuity: $F_k \downarrow F$ and $\omega F_1 < \infty \;\Rightarrow\; \omega F_k \downarrow \omega F$.
Proof The notation $E_k \uparrow E$ means that $E_1 \subset E_2 \subset \cdots$ and $E = \bigcup E_k$. Write $E$ disjointly as $E = \bigsqcup E_k'$ where $E_k' = E_k \setminus (E_1 \cup \cdots \cup E_{k-1})$. Countable additivity gives
\[ \omega E = \sum_{n=1}^\infty \omega E_n'. \]
Also, the $k$th partial sum of the series equals $\omega E_k$, so $\omega E_k$ converges upward to $\omega E$.

The notation $F_k \downarrow F$ means that $F_1 \supset F_2 \supset \cdots$ and $F = \bigcap F_k$. Write $F_1$ disjointly as
\[ F_1 = \Bigl( \bigsqcup_{k=1}^\infty F_k' \Bigr) \sqcup F \]
where $F_k' = F_k \setminus F_{k+1}$. Then $F_k = \bigl( \bigsqcup_{n \ge k} F_n' \bigr) \sqcup F$. The countable additivity formula
\[ \omega F_1 = \omega F + \sum_{n=1}^\infty \omega F_n' \]
plus finiteness of $\omega F_1$ implies that the series converges to a finite limit, so its tails converge to zero. That is,
\[ \omega F_k = \sum_{n=k}^\infty \omega F_n' + \omega F \]
converges downward to $\omega F$ as $k \to \infty$. □

3
Regularity
In this section we discuss properties of Lebesgue measure related to the topology of $\mathbb{R}$ and $\mathbb{R}^2$.

6 Theorem Open sets and closed sets are measurable.

7 Proposition The inclusion or exclusion of zero sets has no effect on outer measure or measurability.

Proof If $Z$ is a zero set then we assert that $m^*(E \cup Z) = m^*E = m^*(E \setminus Z)$ and further, that if $E$ is measurable then so are $E \cup Z$ and $E \setminus Z$. We have
\[ m^*E \le m^*(E \cup Z) \le m^*E + m^*Z = m^*E, \]
so $m^*(E \cup Z) = m^*E$. Applying this to the set $E \setminus Z$ gives $m^*(E \setminus Z) = m^*((E \setminus Z) \cup (E \cap Z))$; i.e., $m^*E = m^*(E \setminus Z)$.

Assume that $E$ is measurable. Given a test set $X$ we use subadditivity, monotonicity, and the set theory formula $(E \cup Z)^c = E^c \setminus Z$ to get
\[ m^*X \le m^*(X \cap (E \cup Z)) + m^*(X \cap (E^c \setminus Z)) \le m^*(X \cap E) + m^*(X \cap Z) + m^*(X \cap E^c) = m^*X. \]
Thus the inequalities are equalities and $E \cup Z$ is measurable. Applying this to $E^c$ shows that $E^c \cup Z = (E \setminus Z)^c$ is measurable, so $E \setminus Z$ is measurable. □

8 Corollary Zero sets are measurable.

Proof Take $Z = \emptyset \cup Z$ in Proposition 7. □
9 Proposition The half line $H = [0, \infty) \subset \mathbb{R}$ and the half plane $H \times \mathbb{R} \subset \mathbb{R}^2$ are measurable.

Proof Given $X \subset \mathbb{R}$, we claim that $m^*X = m^*(X \cap H) + m^*(X \cap H^c)$. By Proposition 7 we may assume that $0 \notin X$, since $X$ and $X \setminus \{0\}$ differ by the zero set $\{0\}$. Then
\[ X = X^+ \sqcup X^- = (X \cap H) \sqcup (X \cap H^c). \]
Given $\epsilon > 0$ there is a covering $\mathcal{I}$ of $X$ by open intervals $I_k$ whose total length is
\[ \sum_k |I_k| \le m^*X + \epsilon. \]
Split any interval $I_k$ that contains 0 into its positive and negative open subintervals $I_k^\pm$. This gives a new covering $\mathcal{I}'$ of $X$ with the same total length as $\mathcal{I}$. Each interval in $\mathcal{I}'$ lies wholly in the positive or negative half axis, so $\mathcal{I}'$ splits into disjoint coverings $\mathcal{I}^\pm$ of $X^\pm$. Thus
\[ m^*X \le m^*(X \cap H) + m^*(X \cap H^c) \le \sum_{I \in \mathcal{I}^+} |I| + \sum_{I \in \mathcal{I}^-} |I| = \sum_k |I_k| \le m^*X + \epsilon. \]
Since $\epsilon > 0$ is arbitrary this gives measurability of $H$.

The planar case is similar. The $y$-axis has planar outer measure zero since for any $\epsilon > 0$ it is covered by the countable set of open rectangles $(-\epsilon/4^n, \epsilon/4^n) \times (-2^n, 2^n)$ whose total area is $4\epsilon$. By Proposition 7 it is then fair to exclude the $y$-axis from a test set $X \subset \mathbb{R}^2$, and the rest of the reasoning is the same as in the linear case. □
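The total area $4\epsilon$ of the rectangles covering the $y$-axis is a geometric series, and a short computation confirms the arithmetic. This is a sketch under the assumption that the rectangles are indexed by $n = 1, 2, 3, \ldots$; the function names are invented for the example. The $n$th rectangle has width $2\epsilon/4^n$ and height $2 \cdot 2^n$, so its area is $4\epsilon/2^n$.

```python
from fractions import Fraction

def rectangle_area(eps, n):
    """Area of the n-th rectangle (-eps/4**n, eps/4**n) x (-2**n, 2**n)."""
    width = 2 * eps / 4**n
    height = 2 * 2**n
    return width * height

def total_area(eps, terms):
    """Sum of the first `terms` areas, for n = 1, ..., terms."""
    return sum(rectangle_area(eps, n) for n in range(1, terms + 1))

eps = Fraction(1, 10)
for terms in (1, 5, 20):
    # Partial sums increase toward the bound 4 * eps without reaching it.
    print(terms, total_area(eps, terms), 4 * eps)
```

The partial sums equal $4\epsilon(1 - 2^{-N})$, so the full series sums to $4\epsilon$, which can be made as small as desired.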
Proof of Theorem 6 Consider $\mathbb{R}$. By Proposition 9 the half lines $[0, \infty)$ and $(-\infty, 0)$ are measurable. Measurability is unaffected by translation, so $[a, \infty)$ and $(-\infty, b)$ are measurable. Measurability is also unaffected by the exclusion of zero sets. Thus $(a, \infty)$ is measurable and so is
\[ (a, b) = (a, \infty) \cap (-\infty, b). \]
Consider $\mathbb{R}^2$. The translates of half planes are measurable and it is fair to exclude zero sets. Thus the vertical strip $(a, b) \times \mathbb{R}$ is measurable. Exchanging the roles of the $x$-axis and $y$-axis shows that the horizontal strip $\mathbb{R} \times (c, d)$ is also measurable, so their intersection $(a, b) \times (c, d)$ is measurable.

Every open set in $\mathbb{R}$ is the countable union of open intervals, and every open set in the plane is the countable union of open rectangles. Since $\mathcal{M}$ is a $\sigma$-algebra, every open set is measurable, and since the complement of a measurable set is measurable, every closed set is also measurable. □

10 Corollary The Lebesgue measure of an interval is its length and the Lebesgue measure of a rectangle is its area.

Proof By Theorem 6, the interval and rectangle are measurable, so their measures equal their outer measures, which we know to be their length and area. □

Sets that are slightly more general than open sets and closed sets arise naturally. A countable intersection of open sets is called a $G_\delta$-set and a countable union of closed sets is an $F_\sigma$-set. ("$\delta$" stands for the German word Durchschnitt and "$\sigma$" stands for "sum.") By De Morgan's Laws, the complement of a $G_\delta$-set is an $F_\sigma$-set and conversely. Since the $\sigma$-algebra of measurable sets contains the open sets and the closed sets, it also contains the $G_\delta$-sets and the $F_\sigma$-sets.

11 Theorem Lebesgue measure is regular in the sense that each measurable set $E$ can be sandwiched between an $F_\sigma$-set and a $G_\delta$-set, $F \subset E \subset G$, such that $m(G \setminus F) = 0$.

Proof Assume first that $E \subset \mathbb{R}$ is bounded. There is a large closed interval $I$ that contains $E$. Measurability implies that
\[ mI = mE + m(I \setminus E). \]
There are decreasing sequences of open sets $U_n$ and $V_n$ such that $U_n \supset E$, $V_n \supset (I \setminus E)$, $mU_n \to mE$, and $mV_n \to m(I \setminus E)$ as $n \to \infty$. The complements $K_n = I \setminus V_n$ form an increasing sequence of closed subsets of $E$ and
\[ mE \ge mK_n \ge mI - mV_n \to mI - m(I \setminus E) = mE. \]
Thus $F = \bigcup K_n$ is an $F_\sigma$-set in $E$ with $mF = mE$. Similarly there is a $G_\delta$-set $G$ that contains $E$ and has $mG = mE$. Because all the measures are finite, the equality $mF = mE = mG$ implies that $m(G \setminus F) = 0$. The same reasoning applies to bounded subsets of the plane or $\mathbb{R}^n$. The unbounded case is left as Exercise 9. □

12 Corollary Modulo zero sets, Lebesgue measurable sets are $F_\sigma$-sets and/or $G_\delta$-sets.

Proof $E = F \cup Z = G \setminus Z'$ for the zero sets $Z = E \setminus F$ and $Z' = G \setminus E$. □

The outer measure of a set $A \subset \mathbb{R}$ or $A \subset \mathbb{R}^n$ is the infimum of the measure of open sets that contain it. Dually, we define the inner measure of $A$ to be the supremum of the measure of closed sets contained in it. We denote the inner measure of $A$ as $m_*A$. Clearly $m_*A \le m^*A$ and $m_*$ measures $A$ from the inside. Also, $m_*$ is monotone: $A \subset B$ implies
\[ m_*A \le m_*B. \]
Equivalently, $m^*A$ is the measure of the smallest measurable set $H \supset A$ while $m_*A$ is the measure of the largest measurable set $N \subset A$. ($H$ being "smallest" means that for any other measurable $H' \supset A$, $H \setminus H'$ is a zero set. Similarly, if $N'$ is any measurable subset of $A$ then $N' \setminus N$ is a zero set. It is easy to see that $H$ and $N$ exist and are unique up to zero sets. They are called the hull and kernel of $A$. See Exercise 8.)
13 Theorem A bounded set $A \subset \mathbb{R}$ (or $A \subset \mathbb{R}^n$) is Lebesgue measurable if and only if its inner and outer measures are equal, $m_*A = m^*A$. Further, if $B$ is a bounded measurable set that contains $A$, then $A$ is measurable if and only if it divides $B$ cleanly.

Proof If $A$ is measurable then it is both its own largest measurable subset and the smallest measurable subset that contains it, so $m_*A = m^*A$. On the other hand, if $m_*A = m^*A$, then $A$ is sandwiched between measurable sets with equal finite measure, so it differs from each by a zero set and is therefore measurable.

Finally suppose that $B \supset A$ with $B$ bounded and measurable. Then
\[ m_*A + m^*(B \setminus A) = mB. \]
See Exercise 8. If $A$ divides $B$ cleanly, then
\[ m^*A + m^*(B \setminus A) = mB. \]
Finiteness of all the measures implies that $m_*A = m^*A$ and $A$ is measurable. The converse is obvious because a measurable set divides every test set cleanly. □

Remark Lebesgue took Theorem 13 as his definition of measurability. He said that a bounded set is measurable if its inner and outer measures are equal, and an unbounded set is measurable if it is a countable union of bounded measurable sets. In contrast, the current definition which uses cleanness and test sets is due to Carathéodory. It is easier to use (there are fewer complements to consider), unboundedness has no effect on it, and it generalizes more easily to "abstract measure spaces."

Regularity of Lebesgue measure has a number of uses. Here is one. See also Exercises 32 and 33.
14 Theorem The Cartesian product of measurable sets is measurable, and the measure of the product is the product of the measures.

Proof By convention, $0 \cdot \infty = 0 = \infty \cdot 0$. Given measurable $A, B \subset \mathbb{R}$, we claim that $A \times B$ is measurable with respect to planar measure and
\[ m_2(A \times B) = m_1A \cdot m_1B. \]

Case 1. $A$ or $B$ is a zero set; say it is $A$. Given $\epsilon > 0$ and an interval $(c, d)$ there is a covering of $A$ by open intervals $\{I_k\}$ with $\sum |I_k| < \epsilon/(d - c)$. Thus $A \times (c, d)$ is covered by rectangles $I_k \times (c, d)$ whose total area is $< \epsilon$, so $A \times (c, d)$ is a zero set. Since $A \times \mathbb{R} = \bigcup_{n=1}^\infty A \times (-n, n)$, it is a zero set, and since $A \times B \subset A \times \mathbb{R}$, $A \times B$ is also a zero set. Zero sets are measurable, which completes the proof of the theorem in this case.

Case 2. $A$ and $B$ are open sets. Then $A = \bigsqcup A_i$ and $B = \bigsqcup B_j$ where $A_i$ and $B_j$ are open intervals, and $A \times B$ is the disjoint countable union of open rectangles. It is therefore measurable and by countable additivity,
\[ m_2(A \times B) = \sum_{ij} |A_i \times B_j| = \sum_{ij} |A_i| \, |B_j| = \Bigl( \sum_i |A_i| \Bigr) \Bigl( \sum_j |B_j| \Bigr) = m_1A \cdot m_1B. \]

Case 3. $A$ and $B$ are bounded $G_\delta$-sets. Then there are nested decreasing sequences of open sets, $U_n \downarrow A$ and $V_n \downarrow B$ with $m_1U_1 < \infty$ and $m_1V_1 < \infty$. Thus $U_n \times V_n \downarrow A \times B$ as $n \to \infty$. Being a $G_\delta$-set, $A \times B$ is measurable. Downward measure continuity gives $m_1U_n \to m_1A$, $m_1V_n \to m_1B$, and by Case 2
\[ m_2(U_n \times V_n) = m_1U_n \cdot m_1V_n \to m_1A \cdot m_1B \]
as $n \to \infty$. Downward measure continuity in the plane also gives $m_2(U_n \times V_n) \to m_2(A \times B)$, so the two limits agree.

Case 4. $A$ and $B$ are bounded measurable sets. By regularity, $A$ and $B$ differ from $G_\delta$-sets $G$ and $H$ by zero sets $X$ and $Y$. Thus $A \times B$ differs from $G \times H$ by a subset of the zero set
\[ X \times H \cup G \times Y, \]
which implies that $A \times B$ is measurable and
\[ m_2(A \times B) = m_2(G \times H) = m_1G \cdot m_1H = m_1A \cdot m_1B. \]

Case 5. $A$ and $B$ are measurable but not necessarily bounded. The proof of the theorem in this case is left to the reader. See Exercise 9. □

See Exercise 13 for the $n$-dimensional version of Theorem 14.
4
Lebesgue integrals
Following J.C. Burkill, we justify the maxim that the integral of a function is the area under its graph. Let $f : \mathbb{R} \to [0, \infty)$ be given.

Definition The undergraph of $f$ is
\[ \mathcal{U}f = \{ (x, y) \in \mathbb{R} \times [0, \infty) : 0 \le y < f(x) \}. \]
The function $f$ is Lebesgue measurable if $\mathcal{U}f$ is measurable with respect to planar Lebesgue measure $m_2$, and if it is, then the Lebesgue integral of $f$ is the measure of the undergraph
\[ \int f = m_2(\mathcal{U}f). \]
When $X \subset \mathbb{R}$ and $f : X \to [0, \infty)$, the same definition applies: if the undergraph $\{ (x, y) \in X \times [0, \infty) : 0 \le y < f(x) \}$ is measurable, then its measure is the Lebesgue integral of $f$ over $X$,
\[ \int_X f = m_2(\mathcal{U}f). \]

Figure 124 The geometric definition of the integral as the measure of the undergraph.

See Figure 124. Burkill refers to the undergraph as the ordinate set of $f$. The notation for the Lebesgue integral intentionally omits the usual "$dx$" and the limits of integration to remind you that it is not merely the ordinary Riemann integral $\int_a^b f(x)\,dx$ or the improper Riemann integral $\int_{-\infty}^\infty f(x)\,dx$. The subscript "2" on the measure indicates planar measure and will usually be dropped. Since a measurable set can have infinite measure, $\int f = \infty$ is permitted.

Definition The function $f : \mathbb{R} \to [0, \infty)$ is Lebesgue integrable if (it is measurable and) its integral is finite. The set of integrable functions is denoted by $L^1$, $\mathcal{L}^1$, or $\mathcal{L}$.

15 Lebesgue Monotone Convergence Theorem Assume that $f_n : \mathbb{R} \to [0, \infty)$ is a sequence of measurable functions and $f_n \uparrow f$ as $n \to \infty$. Then
\[ \int f_n \uparrow \int f. \]

Proof Obvious. $\mathcal{U}f_n \uparrow \mathcal{U}f$ and measure continuity (Theorem 5) implies $m(\mathcal{U}f_n) \uparrow m(\mathcal{U}f)$. □
16 Theorem Let $f, g : \mathbb{R} \to [0, \infty)$ be measurable functions.
(a) If $f \le g$ then $\int f \le \int g$.
(b) If $\mathbb{R} = \bigsqcup_{k=1}^\infty X_k$ and each $X_k$ is measurable, then $\int f = \sum_{k=1}^\infty \int_{X_k} f$.
(c) If $X \subset \mathbb{R}$ is measurable, then $mX = \int \chi_X$.
(d) If $mX = 0$, then $\int_X f = 0$.
(e) If $f(x) = g(x)$ almost everywhere, then $\int f = \int g$.
(f) If $a \ge 0$, then $\int af = a \int f$.
(g) $\int (f + g) = \int f + \int g$.

Proof Assertions (a) - (f) are obvious from what we know about measure.
(a) $f \le g$ implies $\mathcal{U}f \subset \mathcal{U}g$ and thus $m(\mathcal{U}f) \le m(\mathcal{U}g)$.
(b) The product $X_k \times \mathbb{R}$ is measurable and its intersection with $\mathcal{U}f$ is $\mathcal{U}(f|_{X_k})$. Thus, $\mathcal{U}f = \bigsqcup_{k=1}^\infty \mathcal{U}(f|_{X_k})$ and countable additivity of planar measure give the result.
(c) The planar measure of the product $\mathcal{U}(\chi_X) = X \times [0, 1)$ is $mX$.
(d) $\mathcal{U}f$ is contained in the product $X \times \mathbb{R}$ of zero planar measure.
(e) Almost everywhere equality of $f$ and $g$ means that there is a zero set $Z \subset \mathbb{R}$ such that if $x \notin Z$ then $f(x) = g(x)$. Apply (b), (d) to $\mathbb{R} = Z \sqcup (\mathbb{R} \setminus Z)$.
(f) Scaling the $y$-axis by the factor $a$ scales planar measure correspondingly.
(g) is a matter of looking carefully at the right picture, namely Figure 125. We assume that $f$ is integrable (otherwise (g) merely asserts that $\infty = \infty$) and define the $f$-translation $T_f : \mathbb{R}^2 \to \mathbb{R}^2$ as
\[ T_f : (x, y) \mapsto (x, f(x) + y). \]
$T_f$ bijects the plane to itself, and as is shown in Figure 125,
\[ (3) \qquad \mathcal{U}(f + g) = \mathcal{U}f \sqcup T_f(\mathcal{U}g). \]
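The set identity (3) can be spot-checked pointwise: a point $(x, y)$ lies in $T_f(\mathcal{U}g)$ exactly when $(x, y - f(x))$ lies in $\mathcal{U}g$. The sketch below tests this on a grid of rational points. The sample functions `f` and `g`, the helper names, and the grid are assumptions chosen for illustration, and a finite check of this kind is of course no substitute for the argument in the text.

```python
from fractions import Fraction as F

def f(x):
    return x * x          # sample nonnegative function (an assumption)

def g(x):
    return abs(x) + 1     # another sample nonnegative function

def in_undergraph(h, x, y):
    """(x, y) lies in the undergraph of h when 0 <= y < h(x)."""
    return 0 <= y < h(x)

def in_translated_undergraph(x, y):
    """(x, y) lies in T_f(Ug) when its preimage (x, y - f(x)) lies in Ug."""
    return in_undergraph(g, x, y - f(x))

def check_point(x, y):
    """Membership in U(f+g) must match membership in the disjoint union Uf, T_f(Ug)."""
    in_sum = in_undergraph(lambda t: f(t) + g(t), x, y)
    in_f = in_undergraph(f, x, y)
    in_tg = in_translated_undergraph(x, y)
    return in_sum == (in_f or in_tg) and not (in_f and in_tg)

grid = [(F(i, 4), F(j, 4)) for i in range(-8, 9) for j in range(-4, 30)]
print(all(check_point(x, y) for x, y in grid))
```

The check succeeds because $0 \le y < f(x)$ and $f(x) \le y < f(x) + g(x)$ are mutually exclusive conditions whose union is $0 \le y < f(x) + g(x)$.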
This makes (g) an immediate consequence of the following assertion.

(4) For each measurable $E \subset \mathbb{R}^2$, $T_f E$ is measurable and $m(T_f E) = mE$.

Figure 125 The undergraph of a sum.

Case 1. $E$ is a rectangle $R = (a, b) \times [0, h)$. Then $R = \mathcal{U}g$ where $g = h \cdot \chi_{(a,b)}$, and (3) gives
\[ (5) \qquad \mathcal{U}f \sqcup T_f R = \mathcal{U}(f + g) = \mathcal{U}(g + f) = R \sqcup T_g(\mathcal{U}f). \]
Since $(a, b) \times \mathbb{R}$ is measurable, $\mathcal{U}f$ splits cleanly as $\mathcal{U}f = U_1 \sqcup U_2$ where
\[ U_1 = \{ (x, y) \in \mathcal{U}f : x \in (a, b) \}, \qquad U_2 = \{ (x, y) \in \mathcal{U}f : x \notin (a, b) \}. \]
Under $T_g$, $U_1$ is translated vertically by the amount $h$ while $U_2$ stays fixed. Neither measure changes and the sets stay disjoint. Thus $m(T_g(\mathcal{U}f)) = m(\mathcal{U}f)$, and (5) becomes
\[ (6) \qquad m(\mathcal{U}f) + m(T_f R) = mR + m(\mathcal{U}f). \]
Since $\int f < \infty$, it is legal to subtract $m(\mathcal{U}f)$ from (6), which gives (4) when $E = R$.

Case 2. $E$ is the rectangle $R = (a, b) \times [c, d)$. Then $T_f R = T_g R'$ where $g = f + c \cdot \chi_{(a,b)}$ and $R' = (a, b) \times [0, d - c)$. Case 1 applied to $g$ gives (4) when $E = R$.

Case 3. $E$ is bounded. Choose a large rectangle $R$ that contains $E$. Given $\epsilon > 0$ there is a covering $\{R_k\}$ of $E$ by rectangles such that
\[ \sum_k |R_k| \le mE + \epsilon. \]
Then $T_f E$ is covered by $f$-translated rectangles $T_f R_k$ and
\[ m^*(T_f E) \le \sum_k m(T_f R_k) = \sum_k |R_k| \le mE + \epsilon. \]
Since $\epsilon > 0$ is arbitrary, $m^*(T_f E) \le mE$. The same applies to $R \setminus E$, and so
\[ m^*(T_f E) + m^*(T_f(R \setminus E)) \le mE + m(R \setminus E) = |R| = m(T_f R) \le m^*(T_f E) + m^*(T_f R \setminus T_f E). \]
This gives equality $m(T_f R) = m^*(T_f E) + m^*(T_f R \setminus T_f E)$, which means that $T_f E$ divides the bounded, measurable set $T_f R$ cleanly. Theorem 13 implies that $T_f E$ is measurable. Subtraction of
\[ |R| = mE + m(R \setminus E), \qquad m(T_f R) = m(T_f E) + m(T_f(R \setminus E)) \]
gives
\[ 0 = (mE - m(T_f E)) + (m(R \setminus E) - m(T_f(R \setminus E))). \]
Both terms are $\ge 0$, so they are zero, which completes the proof of (4) when $E$ is bounded.

Case 4. $E$ is unbounded. Break $E$ into countably many bounded, measurable, disjoint pieces and apply Case 3 piece by piece. This completes the proof of (4), of (g), and of the theorem. □

Remark The standard proof of linearity of the Lebesgue integral is outlined in Exercise 28. It is no easier than this undergraph proof, and undergraphs at least give you a picture as guidance.

Definition The completed undergraph is the undergraph together with the graph itself,
\[ \widehat{\mathcal{U}}f = \mathcal{U}f \cup \{ (x, f(x)) : x \in \mathbb{R} \}. \]

17 Theorem The definitions of measurability and integral are unaffected if we replace the undergraph with the completed undergraph.

Proof Let $f : \mathbb{R} \to [0, \infty)$ be given, and set $f_n(x) = (1 + 1/n) f(x)$ for $n \in \mathbb{N}$. Then $\widehat{\mathcal{U}}f$ is the decreasing intersection $\bigcap \mathcal{U}f_n$ together with the $x$-axis. Assume that $\mathcal{U}f$ is measurable. Then $\mathcal{U}f_n$ is measurable, the intersection $\bigcap \mathcal{U}f_n$ is measurable, and the $x$-axis is measurable (it is a zero set) so $\widehat{\mathcal{U}}f$ is measurable. Also $m(\widehat{\mathcal{U}}f) = m(\mathcal{U}f)$. For if $m(\mathcal{U}f) = \infty$, then $\widehat{\mathcal{U}}f \supset \mathcal{U}f$ implies that $m(\widehat{\mathcal{U}}f) = \infty$, while if $m(\mathcal{U}f) < \infty$, then the result follows from downward measure continuity (Theorem 5) and the fact that the $x$-axis is a zero set.

The converse is checked similarly. Set $g_n(x) = (1 - 1/n) f(x)$ and express $\mathcal{U}f$ as the increasing union of $\widehat{\mathcal{U}}g_n$, modulo a zero set on the $x$-axis. Measurability of $\widehat{\mathcal{U}}f$ implies measurability of $\widehat{\mathcal{U}}g_n$, which implies measurability of $\mathcal{U}f$, and by upward measure continuity $m(\mathcal{U}f) = m(\widehat{\mathcal{U}}f)$. □

Definition Let $f_n : \mathbb{R} \to [0, \infty)$ be a sequence of functions. Its lower envelope and upper envelope sequences are
\[ \underline{f}_n(x) = \inf\{ f_k(x) : k \ge n \} \quad \text{and} \quad \overline{f}_n(x) = \sup\{ f_k(x) : k \ge n \}. \]
Clearly, $\underline{f}_n$ increases and $\overline{f}_n$ decreases as $n \to \infty$, and the envelopes sandwich the original sequence.

18 Theorem If the functions $f_n$ are measurable, then so are its envelopes.

Proof The undergraph of the maximum of two functions is the union of their undergraphs, while the minimum is the intersection. Keeping track of strict inequality versus nonstrict inequality, we get the formulas
\[ \mathcal{U}\overline{f}_n = \bigcup_{k \ge n} \mathcal{U}f_k \quad \text{and} \quad \widehat{\mathcal{U}}\underline{f}_n = \bigcap_{k \ge n} \widehat{\mathcal{U}}f_k, \]
and measurability follows from Theorem 17. □

19 Corollary The pointwise limit of measurable functions is measurable.

Proof If $f_n(x) \to f(x)$ for every $x$ (or almost every $x$) as $n \to \infty$ then the same is true of the envelope functions. Then $\underline{f}_n \uparrow f$ implies that $\mathcal{U}\underline{f}_n \uparrow \mathcal{U}f$, so $f$ is measurable. □
20 Lebesgue Dominated Convergence Theorem Suppose that $f_n : \mathbb{R} \to [0, \infty)$ is a sequence of measurable functions converging pointwise to the limit function $f$. If there exists a function $g : \mathbb{R} \to [0, \infty)$ whose integral is finite and which is an upper bound for all the functions $f_n$, $0 \le f_n(x) \le g(x)$, then $\int f_n \to \int f$.

Proof Easy. See Figure 126. The upper and lower envelopes converge monotonically to $f$, and since $\mathcal{U}\overline{f}_n \subset \mathcal{U}g$, all have finite measure. This makes both downward and upward measure continuity valid, so we see that the measures of $\mathcal{U}\underline{f}_n$ and $\mathcal{U}\overline{f}_n$ both converge to $m(\mathcal{U}f)$. Between these measures is sandwiched $m(\mathcal{U}f_n)$, and so it also converges to $m(\mathcal{U}f)$ as $n \to \infty$. □

Figure 126 Lebesgue Dominated Convergence.

Remark If a dominator $g$ with finite integral fails to exist then the assertion fails. For example the sequence of "steeple functions," shown in Figure 85 on page 204, have integral $n$ and converge at all $x$ to the zero function as $n \to \infty$.

21 Fatou's Lemma Suppose that $f_n : \mathbb{R} \to [0, \infty)$ is a sequence of measurable functions and $f : \mathbb{R} \to [0, \infty)$ is their lim inf, $f(x) = \lim_{n \to \infty} \inf\{ f_k(x) : k \ge n \}$. Then
\[ \int f \le \liminf_{n \to \infty} \int f_n. \]

Proof The assertion is really more about lim inf's than integrals. The assumption is that $f$ is the limit of the lower envelope functions $\underline{f}_n$. Since $\underline{f}_n(x) \le f_n(x)$, we have $\int \underline{f}_n \le \int f_n$. The Lebesgue Monotone Convergence Theorem implies that $\int \underline{f}_n$ converges upward to $\int f$, so the integrals $\int f_n$ have a lim inf that is no smaller than $\int f$. □

Remark The inequality can be strict, as is shown by the steeple functions.
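A strict inequality in Fatou's Lemma is easy to see on a family whose undergraphs are single rectangles. The sketch below uses the spikes $f_n = n \cdot \chi_{(0, 1/n)}$. This family is an assumption chosen for simplicity and is not claimed to be the steeple functions of Figure 85; the finite `horizon` in `lower_envelope` is likewise only a computational stand-in for the infimum over all $k \ge n$.

```python
from fractions import Fraction

def f(n, x):
    """Spike f_n: equal to n on the open interval (0, 1/n), zero elsewhere."""
    return n if 0 < x < Fraction(1, n) else 0

def integral(n):
    """Area of the undergraph (0, 1/n) x [0, n), which is a single rectangle."""
    return Fraction(1, n) * n

def lower_envelope(n, x, horizon):
    """min of f_k(x) over n <= k <= horizon, a finite stand-in for the infimum."""
    return min(f(k, x) for k in range(n, horizon + 1))

x = Fraction(1, 7)
print([f(k, x) for k in range(1, 10)])   # the spike leaves x once k >= 7
print(lower_envelope(1, x, 100))         # the lower envelope vanishes at x
print([integral(n) for n in range(1, 6)])
```

Every $f_n$ has integral 1, yet for each fixed $x$ the values $f_k(x)$ are zero once $k \ge 1/x$, so the lim inf function is identically zero and $\int f = 0 < 1 = \liminf \int f_n$. No integrable dominator exists for this family, which is consistent with the Dominated Convergence Theorem.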
Up to now we have assumed the integrand $f$ is nonnegative. If $f$ takes both positive and negative values we define
\[ f_+(x) = \begin{cases} f(x) & \text{if } f(x) \ge 0 \\ 0 & \text{if } f(x) < 0 \end{cases} \qquad f_-(x) = \begin{cases} -f(x) & \text{if } f(x) < 0 \\ 0 & \text{if } f(x) \ge 0. \end{cases} \]
Then $f_\pm \ge 0$ and $f = f_+ - f_-$. It is easy to see that the total undergraph of $f$,
\[ \{ (x, y) \in \mathbb{R}^2 : y < f(x) \}, \]
is measurable if and only if $f_\pm$ are measurable. See Exercise 31. If $f_\pm$ are integrable we say that $f$ is integrable and define its integral as
\[ \int f = \int f_+ - \int f_-. \]

22 Proposition The set of measurable functions $f : \mathbb{R} \to \mathbb{R}$ is a vector space, the set of integrable functions is a subspace, and the integral is a linear map from the latter into $\mathbb{R}$.

The proof is left to the reader as Exercise 21.
5
Lebesgue integrals as limits
The Riemann integral is the limit of Riemann sums. There are analogous "Lebesgue sums" of which the Lebesgue integral is the limit. Let $f : \mathbb{R} \to [0, \infty)$ be given, take a partition $Y : 0 = y_0 < y_1 < y_2 < \cdots$ on the $y$-axis, and set
\[ X_i = \{ x \in \mathbb{R} : y_{i-1} \le f(x) < y_i \}. \]
(We require that $y_i \to \infty$ as $i \to \infty$.) Assume that $X_i$ is measurable and define the lower and upper Lebesgue sums as
\[ \underline{L}(f, Y) = \sum_{i=1}^\infty y_{i-1} \cdot mX_i, \qquad \overline{L}(f, Y) = \sum_{i=1}^\infty y_i \cdot mX_i. \]
These sums represent the measure of "Lebesgue rectangles" $X_i \times [0, y_{i-1})$ and $X_i \times [0, y_i)$ that sandwich the undergraph. Under the measurability conditions explained below, they converge to the Lebesgue integral as the mesh of $Y$ tends to zero,
\[ \underline{L}(f, Y) \uparrow \int f \quad \text{and} \quad \overline{L}(f, Y) \downarrow \int f. \]

Upshot Lebesgue sums are like Riemann sums, and Lebesgue integration is like Riemann integration, except that Lebesgue partitions the value axis and takes limits, while Riemann does the same on the domain axis.
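To see Lebesgue sums at work, it helps to pick a function whose level sets are known exactly. The sketch below assumes $f(x) = x^2$ restricted to $X = [0, 1]$ and the uniform partition $y_i = i/N$ of the value axis; both choices, and the function name, are illustrative assumptions. For this $f$ the set $X_i$ is the interval $[\sqrt{y_{i-1}}, \sqrt{y_i})$ for $i \le N$, and the sets with $i > N$ have measure zero, so they are omitted.

```python
import math

def lebesgue_sums(n_levels):
    """Lower and upper Lebesgue sums of f(x) = x**2 on [0, 1].

    The value axis is partitioned at y_i = i / n_levels.  The level set
    X_i = {x in [0, 1] : y_{i-1} <= f(x) < y_i} is [sqrt(y_{i-1}), sqrt(y_i)),
    so its measure is sqrt(y_i) - sqrt(y_{i-1}).
    """
    lower = 0.0
    upper = 0.0
    for i in range(1, n_levels + 1):
        y_prev = (i - 1) / n_levels
        y_curr = i / n_levels
        measure = math.sqrt(y_curr) - math.sqrt(y_prev)  # m(X_i)
        lower += y_prev * measure
        upper += y_curr * measure
    return lower, upper

for n in (10, 100, 1000):
    lo, hi = lebesgue_sums(n)
    print(n, lo, hi)
```

The two sums bracket the integral $1/3$, and their difference is the mesh $1/N$ times the total measure of the domain. On a domain of infinite measure the upper sum can be infinite, which is one reason the convergence statement needs the conditions discussed below.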
The assumption that the sets $X_i$ are measurable needs to be put in context.

Definition The function $f : \mathbb{R} \to \mathbb{R}$ is pre-image measurable if for each $a \in \mathbb{R}$, $f^{\text{pre}}[a, \infty) = \{ x \in \mathbb{R} : a \le f(x) \}$ is Lebesgue measurable. (It follows that the sets $X_i = f^{\text{pre}}[y_{i-1}, y_i)$ are measurable.)

This is the standard definition for measurability of a function. We will show that it is equivalent to the geometric definition that the undergraph is measurable, and then discuss when the Lebesgue sums converge to the Lebesgue integral. First we show that there is nothing special about using closed rays $[a, \infty)$.

23 Proposition The following are equivalent conditions for pre-image measurability of $f : \mathbb{R} \to \mathbb{R}$.
(a) The pre-image of every closed ray $[a, \infty)$ is measurable.
(b) The pre-image of every open ray $(a, \infty)$ is measurable.
(c) The pre-image of every closed ray $(-\infty, a]$ is measurable.
(d) The pre-image of every open ray $(-\infty, a)$ is measurable.
(e) The pre-image of every half-open interval $[a, b)$ is measurable.
(f) The pre-image of every open interval $(a, b)$ is measurable.
(g) The pre-image of every half-open interval $(a, b]$ is measurable.
(h) The pre-image of every closed interval $[a, b]$ is measurable.

Proof We show that (a) $\Rightarrow$ (b) $\Rightarrow \cdots \Rightarrow$ (h) $\Rightarrow$ (a). Since the pre-image of a union, an intersection, or a complement is the union, intersection, or complement of the pre-image, these implications are checked by taking pre-images of the following.
(a) $\Rightarrow$ (b): $(a, \infty) = \bigcup [a + 1/n, \infty)$.
(b) $\Rightarrow$ (c): $(-\infty, a] = (a, \infty)^c$.
(c) $\Rightarrow$ (d): $(-\infty, a) = \bigcup (-\infty, a - 1/n]$.
(d) $\Rightarrow$ (e): $[a, b) = (-\infty, a)^c \cap (-\infty, b)$.
(e) $\Rightarrow$ (f): $(a, b) = \bigcup [a + 1/n, b)$.
(f) $\Rightarrow$ (g): $(a, b] = \bigcap (a, b + 1/n)$.
(g) $\Rightarrow$ (h): $[a, b] = \bigcap (a - 1/n, b]$.
(h) $\Rightarrow$ (a): $[a, \infty) = \bigcup [a, a + n]$.
□
Remark You might expect that a measurable function is defined by the property that the pre-image of a measurable set is measurable. As is shown in Exercise 24, this is not true. The pre-image of a measurable set under a measurable function can be nonmeasurable. 24 Theorem Pre-image measurability of f : JR. � [0, oo) is equivalent to measurability ofU f. Proof that pre-image measurability implies undergraph measurability
The undergraph is
Uf =
U fpre[r, oo) x [0, r) ,
rEIQ
and Theorem 14 asserts that the product of measurable sets is measurable.
D
The converse requires a lemma.
Iff : JR. � [0, oo) is pre-image measurable and J f = 0 then f (x) = 0 almost everywhere.
25 Lemma
Section
5
385
Lebesgue integrals as limits
Proof We have shown that pre-image measurability implies undergraph
measurability, so J f is well defined. The product set pre [r, oo) x [0, r) IS measurable and its measure is m (fPre [r, oo)) - r = 0. Thus m (fPre [r, oo)) = 0 and f (x) = 0 almost everywhere. 0
Proof that undergraph measurability implies pre-image measurability We first assume that $|f(x)| \le M$ for all $x$, and $f(x) = 0$ for all $x \notin [a, b]$. Regularity implies that there is a sequence of compact sets $K_n \uparrow F \subset \mathcal{U}f$ as $n \to \infty$ such that $mF = m(\mathcal{U}f)$. Define
\[ g_n(x) = \begin{cases} \max\{y : (x, y) \in K_n\} & \text{if } K_n \cap (x \times \mathbb{R}) \ne \emptyset \\ 0 & \text{if } K_n \cap (x \times \mathbb{R}) = \emptyset. \end{cases} \]
Clearly $g_n \uparrow g$ where $g \le f$ and $m(\mathcal{U}g) = m(\mathcal{U}f)$. The function $g_n$ is upper semicontinuous. (See Exercise 26.) That is, if $x_k \to x$ as $k \to \infty$ then
\[ \limsup_{k \to \infty} g_n(x_k) \le g_n(x). \]
Upper semicontinuity is equivalent to the condition that the pre-image of each open ray $(-\infty, a)$ is an open set. (See Exercise 25.) Thus $g_n$ is pre-image measurable. The upward limit of $g_n$ is also pre-image measurable since
\[ g^{\mathrm{pre}}(a, \infty) = \bigcup_n g_n^{\mathrm{pre}}(a, \infty). \]
Because $f$ is bounded and its support is contained in $[a, b]$, we can make the same construction from above and find a pre-image measurable function $h$ with $f \le h$ and $m(\mathcal{U}h) = m(\mathcal{U}f)$. Thus $g \le f \le h$ and $\int g = \int f = \int h$. Linearity of the integral applies to $h = g + (h - g)$, so $\int (h - g) = 0$. Lemma 25 implies that $g(x) = h(x)$ almost everywhere, so $f$, which is sandwiched between $g$ and $h$, equals a pre-image measurable function almost everywhere and hence is pre-image measurable itself. Removing the extra boundedness hypotheses is left to the reader as Exercise 27. □
26 Corollary If $f : \mathbb{R} \to [0, \infty)$ is measurable then $\int f = 0$ if and only if $f = 0$ almost everywhere.

Proof Theorem 24 states that measurability implies pre-image measurability, so the assertion follows from Lemma 25. □
Lebesgue Theory
386
Chapter 6
Now we return to Lebesgue sums. Assume that $f : \mathbb{R} \to [0, \infty]$ is measurable and define
\[ f_n(x) = \begin{cases} f(x) - 1/n & \text{if } f(x) \ge 1/n \\ 0 & \text{if } f(x) < 1/n. \end{cases} \]
Clearly $f_n \uparrow f$ as $n \to \infty$, and by the Monotone Convergence Theorem $\int f_n \to \int f$. If $Y : 0 = y_0 < y_1 < \cdots$ is a partition of the $y$-axis then the Lebesgue lower sum $\underline{L}(f, Y)$ is the integral of the function
\[ f_Y(x) = \sum_{i=1}^{\infty} y_{i-1} \chi_i(x) \]
where $\chi_i$ is the characteristic function of $f^{\mathrm{pre}}[y_{i-1}, y_i)$. If mesh $Y < 1/n$ then $f_n \le f_Y \le f$, and therefore $\underline{L}(f, Y)$ tends to $\int f$ as mesh $Y \to 0$.
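The convergence of lower sums can be watched numerically. The following is only a sketch under stated assumptions: the test function $f(x) = x^2$ on $[0, 1]$ (zero elsewhere), the uniform partition $y_i = i \cdot \text{mesh}$, and the function name `lower_sum_x_squared` are all choices made for this illustration, not part of the text. Because $f$ is increasing on $[0, 1]$, each pre-image $f^{\mathrm{pre}}[y_{i-1}, y_i)$ is an interval whose length can be written down exactly.

```python
import math

def lower_sum_x_squared(mesh):
    """Lebesgue lower sum of f(x) = x^2 on [0, 1] (f = 0 elsewhere)
    for the partition y_i = i * mesh of the y-axis."""
    total = 0.0
    i = 1
    while (i - 1) * mesh < 1.0:
        y_lo, y_hi = (i - 1) * mesh, i * mesh
        # Up to a zero set, f^pre[y_lo, y_hi) meets [0, 1] in the interval
        # [sqrt(y_lo), min(sqrt(y_hi), 1)).  The i = 1 term has weight y_0 = 0.
        length = min(math.sqrt(y_hi), 1.0) - math.sqrt(y_lo)
        total += y_lo * length
        i += 1
    return total

for mesh in (0.1, 0.01, 0.001):
    print(mesh, lower_sum_x_squared(mesh))
```

Each printed value lies between $1/3 - \text{mesh}$ and $1/3 = \int_0^1 x^2\,dx$, in line with the sandwich $f_n \le f_Y \le f$.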
Remark A measurable function is simple if it takes on only finitely many values. See Exercise 28. The lower Lebesgue sum function $\phi_Y = \sum_{i=1}^{n} y_{i-1} \chi_i$ is a simple function, and what we have shown implies that for a nonnegative measurable function $f$,
\[ \int f = \sup \Big\{ \int \phi : \phi \text{ is a simple function and } 0 \le \phi \le f \Big\}. \]
In fact this is often how the Lebesgue integral is developed. A "pre-integral" is constructed for simple functions, and the integral of a general nonnegative measurable function is defined to be the supremum of the pre-integrals of lesser simple functions. Lebesgue upper sums behave equally well when $f$ is identically zero outside some set $X \subset \mathbb{R}$ with $mX < \infty$. For
\[ \overline{L} - \underline{L} \le \operatorname{mesh} Y \cdot mX. \]
On the other hand, functions such as $f(x) = e^{-x^2}$ have finite integral but have every Lebesgue upper sum equal to $\infty$. One way to get a satisfactory, general upper-sum/lower-sum sandwiching is to extend the concept of a partition, permitting a bi-infinite set of points $Y = \{y_i : 0 < \cdots < y_{i-1} < y_i < \cdots,\ i \in \mathbb{Z}\}$ with $y_i \to 0$ as $i \to -\infty$ and $y_i \to \infty$ as $i \to \infty$. Then
\[ \underline{L}(f, Y) \to \int f \quad \text{and} \quad \liminf_{\operatorname{mesh} Y \to 0} \overline{L}(f, Y) = \int f. \]
See Exercise 29.
6 Italian Measure Theory
In Chapter 5 the slice method is developed in terms of Riemann integrals. Here we generalize to the Lebesgue case. The slice of a set $E \subset \mathbb{R}^2$ through a point $x$ in the $x$-axis is
\[ E_x = \{y \in \mathbb{R} : (x, y) \in E\}. \]
Similarly, the slice of $f : E \to \mathbb{R}$ through $x$ is the function $f_x : E_x \to \mathbb{R}$ defined by $f_x(y) = f(x, y)$.

Remark In this section we frequently write $dx$ and $dy$ to indicate which variable is the integration variable.
27 Cavalieri's Principle If $E \subset \mathbb{R}^2$ is measurable, then almost every slice $E_x$ of $E$ is measurable, the function $x \mapsto m_1 E_x$ is measurable, and its integral is
\[ m_2 E = \int m_1 E_x \, dx. \]
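A quick numerical sanity check of the principle, offered as a sketch rather than as part of the proof: take $E$ to be the open unit disc, whose slice $E_x$ is an interval of length $2\sqrt{1 - x^2}$ for $|x| < 1$ and is empty otherwise. The helper names `slice_length` and `integrate_midpoint`, and the use of the midpoint rule in place of a Lebesgue integral, are assumptions of this illustration.

```python
import math

def slice_length(x):
    """m_1(E_x) for the open unit disc E: an interval or the empty set."""
    return 2.0 * math.sqrt(1.0 - x * x) if abs(x) < 1.0 else 0.0

def integrate_midpoint(g, a, b, n):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

area = integrate_midpoint(slice_length, -1.0, 1.0, 20000)
print(area, math.pi)
```

The integral of the slice lengths agrees with the planar measure $m_2 E = \pi$ up to the quadrature error.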
Our proof of Cavalieri's Principle involves the bisection picture used in the Bolzano-Weierstrass Theorem. A number of the form $k/2^n$ with $k, n \in \mathbb{Z}$ is dyadic. The vertical line $x = k/2^n$ and the horizontal line $y = \ell/2^n$ are dyadic, and so is the square with opposite vertices $(k/2^n, \ell/2^n)$ and $((k+1)/2^n, (\ell+1)/2^n)$.
28 Lemma An open set in the plane is the union of countably many disjoint open dyadic squares together with a zero set.

Proof Let $U \subset \mathbb{R}^2$ be open. Accept all the open unit dyadic squares that lie in $U$, and reject the rest. Bisect every rejected square into four equal subsquares. Accept the interiors of all these subsquares that lie in $U$, and reject the rest. Proceed inductively, bisecting the rejected squares, accepting the interiors of the resulting subsquares that lie in $U$, and rejecting the rest. In this way $U$ is shown to be the countable union of disjoint, accepted, open dyadic squares, together with the points rejected at every step in the construction. See Figure 127. Rejected points of $U$ lie on horizontal or vertical dyadic lines. There are countably many such lines, each is a zero set, and so the rejected points form a zero set. □
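The accept/reject/bisect procedure in this proof can be run for a concrete open set. The sketch below assumes $U$ is the open unit disc and stops after finitely many bisections; the function name `accepted_area` and the shortcut of discarding squares that miss $U$ altogether (they would never be accepted anyway) are choices made for this illustration.

```python
import math

def accepted_area(depth):
    """Total area of the open dyadic squares of side >= 2**-depth that the
    bisection procedure accepts inside the open unit disc U."""
    # Lower-left corners of the four unit dyadic squares that meet U.
    squares = [(-1.0, -1.0), (0.0, -1.0), (-1.0, 0.0), (0.0, 0.0)]
    side, area = 1.0, 0.0
    for _ in range(depth + 1):
        rejected = []
        for x, y in squares:
            # Squared distance from the origin to the farthest / nearest
            # point of the closed square [x, x+side] x [y, y+side].
            far = max(x * x, (x + side) ** 2) + max(y * y, (y + side) ** 2)
            near_x = 0.0 if x <= 0.0 <= x + side else min(abs(x), abs(x + side))
            near_y = 0.0 if y <= 0.0 <= y + side else min(abs(y), abs(y + side))
            if far <= 1.0:
                area += side * side      # the open square lies in U: accept
            elif near_x ** 2 + near_y ** 2 < 1.0:
                half = side / 2          # meets U but is not inside: bisect
                rejected += [(x, y), (x + half, y),
                             (x, y + half), (x + half, y + half)]
        squares, side = rejected, side / 2
    return area

for depth in (2, 5, 8):
    print(depth, accepted_area(depth), math.pi)
```

The accepted area increases with the depth and approaches $m_2 U = \pi$ from below, as the lemma predicts: what is left over shrinks to the zero set of rejected points.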
Proof of Cavalieri's Principle Case 1. $E$ is a rectangle $R = (a, b) \times (c, d)$. Then $m_1 R_x = (d - c) \cdot \chi_{(a,b)}(x)$ and the equation
\[ m_2 E = (b - a) \cdot (d - c) = \int m_1 E_x \, dx \tag{7} \]
is obvious.

Figure 127 An open set is a countable union of dyadic squares.

Case 2. $E$ is a zero set $Z$. There is a sequence of coverings $\{R_{in}\}$ of $Z$ by open rectangles, with $U_n = \bigcup_{i=1}^{\infty} R_{in}$, whose total area $\sum_{i=1}^{\infty} m_2 R_{in}$ tends to $0$ as $n \to \infty$. Define $f(x) = m_1^*(Z_x)$ and $f_n(x) = \sum_{i=1}^{\infty} m_1((R_{in})_x)$. Since $Z_x \subset \bigcup_i (R_{in})_x$, we have $0 \le f(x) \le f_n(x)$, and hence $\mathcal{U}f \subset \mathcal{U}f_n$.
We now show that $f$ is measurable. Since $x \mapsto m_1((R_{in})_x)$ is a well defined measurable function of $x$, so is $f_n$, and the undergraph of $f_n$ has measure
\[ m_2(\mathcal{U}f_n) = \int f_n(x) \, dx = \int \sum_{i=1}^{\infty} m_1((R_{in})_x) \, dx = \sum_{i=1}^{\infty} \int m_1((R_{in})_x) \, dx = \sum_{i=1}^{\infty} m_2 R_{in}, \]
which tends to $0$ as $n \to \infty$. (We used Case 1 and the Lebesgue Monotone Convergence Theorem.) Since $\mathcal{U}f \subset \mathcal{U}f_n$ for every $n$, $\mathcal{U}f$ is a zero set. A zero set is measurable. Therefore the function $f$ is measurable and its integral is zero. By Corollary 26, $f(x) = 0$ almost everywhere, which implies that almost every slice of $Z$ is a zero set and verifies Cavalieri's Principle for $Z$.

Case 3. $E$ is an open set $U$. By Lemma 28, $U = Z \cup \bigcup S_i$ where $Z$ is a zero set and the $S_i$'s are disjoint open rectangles. Set $S = \bigcup S_i$. Cases 1, 2 and countable additivity give
\[ m_2 U = \sum_{i=1}^{\infty} m_2 S_i = \sum_{i=1}^{\infty} \int m_1((S_i)_x) \, dx = \int \sum_{i=1}^{\infty} m_1((S_i)_x) \, dx = \int m_1 S_x \, dx = \int (m_1 S_x + m_1 Z_x) \, dx = \int m_1 U_x \, dx, \]
which verifies Cavalieri's Principle for an open set.

Case 4. $E$ is a bounded measurable set. By regularity there is a sequence of bounded open sets $U_n$ that nest down to a $G_\delta$-set $G$ with $G = E \cup Z$ and $Z$ a zero set. Then $E_x \cup Z_x = G_x$, so $m_1 E_x = m_1 G_x$ almost everywhere. Also $G_x = \bigcap_{n=1}^{\infty} (U_n)_x$ and by Case 3
\[ m_2 U_n = \int m_1((U_n)_x) \, dx. \]
Downward measure continuity and the Lebesgue Dominated Convergence Theorem give
\[ m_2 E = m_2 G = \lim_{n \to \infty} m_2 U_n = \lim_{n \to \infty} \int m_1((U_n)_x) \, dx = \int \lim_{n \to \infty} m_1((U_n)_x) \, dx = \int m_1 G_x \, dx = \int (m_1 E_x + m_1 Z_x) \, dx = \int m_1 E_x \, dx, \]
which verifies Cavalieri's Principle for bounded measurable sets.

Case 5. $E$ is an unbounded measurable set. Split $E$ into bounded measurable pieces (say by intersecting it with unit dyadic squares), apply Case 4 to each piece, and add the answers. □

Cavalieri's Principle holds also in dimension 3 and higher. If $E \subset \mathbb{R}^3$, its slice through a point $x$ on the $x$-axis is
\[ E_x = \{(y, z) \in \mathbb{R}^2 : (x, y, z) \in E\}. \]
29 Cavalieri's Principle in 3-space If $E \subset \mathbb{R}^3$ is measurable then almost every slice $E_x$ of $E$ is measurable, the function $x \mapsto m_2 E_x$ is measurable, and its integral is
\[ m_3 E = \int m_2 E_x \, dx. \]
The proof of the three-dimensional (and higher) cases is identical to the two-dimensional case. See also Appendix B of Chapter 5.

As a consequence of Cavalieri's Principle in 3-space we get the integral theorems of Fubini and Tonelli. It is standard practice to refer to the integral of a function $f$ on $\mathbb{R}^2$ as a double integral and to write it as
\[ \int f = \iint f(x, y) \, dx\,dy. \]
It is also standard to write the iterated integral as
\[ \int \Big[ \int f_x(y) \, dy \Big] dx = \int \Big[ \int f(x, y) \, dy \Big] dx. \]

30 Fubini-Tonelli Theorem If $f : \mathbb{R}^2 \to [0, \infty)$ is measurable, then almost every slice $f_x(y)$ is a measurable function of $y$, the function $x \mapsto \int f_x(y) \, dy$ is measurable, and the double integral equals the iterated integral,
\[ \iint f(x, y) \, dx\,dy = \int \Big[ \int f(x, y) \, dy \Big] dx. \]
Proof The result follows from the simple observation that the slice of the undergraph is the undergraph of the slice,
\[ (\mathcal{U}f)_x = \mathcal{U}f_x. \tag{8} \]
See Figure 128. For (8) implies that
\[ m_2((\mathcal{U}f)_x) = m_2(\mathcal{U}f_x) = \int f(x, y) \, dy, \]
and then Cavalieri gives
\[ \iint f(x, y) \, dx\,dy = m_3(\mathcal{U}f) = \int m_2((\mathcal{U}f)_x) \, dx = \int \Big[ \int f(x, y) \, dy \Big] dx. \]
□
31 Corollary When $f : \mathbb{R}^2 \to [0, \infty)$ is measurable the order of integration in the iterated integrals is irrelevant,
\[ \int \Big[ \int f(x, y) \, dy \Big] dx = \iint f(x, y) \, dx\,dy = \int \Big[ \int f(x, y) \, dx \Big] dy. \]
Figure 128 Slicing the undergraph.
(In particular, if one of the three integrals is finite so are the other two and all three are equal.)

Proof The difference between "x" and "y" is only notational. In contrast to the integration of differential forms, the orientation of the plane or 3-space plays no role in Lebesgue integration, so the Fubini-Tonelli Theorem applies equally to $x$-slicing and $y$-slicing, which implies that both iterated integrals equal the double integral. □

When $f$ takes on both signs a little care must be taken to avoid subtracting $\infty$ from $\infty$.
32 Theorem If $f : \mathbb{R}^2 \to \mathbb{R}$ is integrable (the double integral of $f$ exists and is finite) then the iterated integrals exist and equal the double integral.

Proof Split $f$ into its positive and negative parts, $f = f_+ - f_-$, and apply the Fubini-Tonelli Theorem to each separately. Since the integrals are finite, subtraction is legal and the theorem follows for $f$. □
See Exercise 35 for an example in which trouble arises if you forget to assume that the double integral is finite.
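To see the kind of trouble meant here, one can look at a standard example (not necessarily the one in Exercise 35): $f(x, y) = (x^2 - y^2)/(x^2 + y^2)^2$ on the open unit square, for which $|f|$ is not integrable. The sketch below assumes this function and uses the antiderivative $y/(x^2 + y^2)$ of $f$ in $y$, which gives the inner integral $\int_0^1 f(x, y)\,dy = 1/(1 + x^2)$ in closed form. The names `antiderivative_in_y`, `inner_dy`, `inner_dx` and `midpoint` are introduced only for this illustration.

```python
import math

def f(x, y):
    return (x * x - y * y) / (x * x + y * y) ** 2

def antiderivative_in_y(x, y):
    """d/dy of this expression equals f(x, y) (checked numerically below)."""
    return y / (x * x + y * y)

def inner_dy(x):
    """Integral of f(x, y) dy over 0 < y < 1, for x > 0."""
    return antiderivative_in_y(x, 1.0) - antiderivative_in_y(x, 0.0)

def inner_dx(y):
    """Integral of f(x, y) dx over 0 < x < 1; uses f(y, x) = -f(x, y)."""
    return -inner_dy(y)

def midpoint(g, a, b, n=20000):
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

# Finite-difference check of the antiderivative at one sample point.
step = 1e-6
slope = (antiderivative_in_y(0.3, 0.7 + step)
         - antiderivative_in_y(0.3, 0.7 - step)) / (2 * step)

dy_then_dx = midpoint(inner_dy, 0.0, 1.0)
dx_then_dy = midpoint(inner_dx, 0.0, 1.0)
print(dy_then_dx, dx_then_dy, math.pi / 4)
```

The two iterated integrals come out as approximately $\pi/4$ and $-\pi/4$. This does not contradict Theorem 32, because the double integral of this $f$ is not finite.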
7 Vitali coverings and density points
The fact that any open covering of a closed and bounded subset of Euclidean space reduces to a finite subcovering is certainly an important component of basic analysis. In this section, we present another covering theorem, this
time the accent being on disjointness of the sets in the subcovering rather than on finiteness. The result is used to differentiate Lebesgue integrals.
Definition A covering $\mathcal{V}$ of a set $A$ in a metric space $M$ is a Vitali covering if for each point $p \in A$ and each $r > 0$ there is $V \in \mathcal{V}$ such that $p \in V \subset M_r p$ and $V$ is not merely the singleton set $\{p\}$. We also say that $\mathcal{V}$ is fine because it consists of sets that have arbitrarily small diameter. For example if $A = [a, b]$, $M = \mathbb{R}$, and $\mathcal{V}$ consists of all intervals $[\alpha, \beta]$ with $\alpha \le \beta$ and $\alpha, \beta \in \mathbb{Q}$ then $\mathcal{V}$ is a Vitali covering of $A$.

33 Vitali Covering Lemma A Vitali covering of $A \subset \mathbb{R}^n$ by closed balls reduces to an efficient disjoint subcovering of almost all of $A$.

More precisely, assume that $A \subset \mathbb{R}^n$ is bounded, $\mathcal{V}$ is a Vitali covering of $A$, each $V \in \mathcal{V}$ is a closed ball, and $\epsilon > 0$ is given. Then there exists a countable subcovering $\{V_k\} \subset \mathcal{V}$ of $A$ such that
(a) The $V_k$ are disjoint.
(b) $mU \le m^*A + \epsilon$ where $U = \bigcup_{k=1}^{\infty} V_k$.
(c) $A \setminus U$ is a zero set.
Condition (b) is what we mean by $\{V_k\}$ being an efficient covering: the extra points covered form an $\epsilon$-set. The sets $U_N = V_1 \cup \cdots \cup V_N$ nearly cover $A$ in the sense that given $\epsilon > 0$, if $N$ is large then $U_N$ contains $A$ except for an $\epsilon$-set. After all, $U = \bigcup U_N$ contains $A$ except for a zero set. See also Appendix C.

Boundedness of $A$ is an unnecessary hypothesis. Also, the assumption that the sets $V \in \mathcal{V}$ are closed balls can be weakened somewhat. We discuss these improvements after the proof of the result as stated.
Proof of the Vitali Covering Lemma Given $\epsilon > 0$, there is a bounded open set $W \supset A$ such that $mW \le m^*A + \epsilon$. Define
\[ \mathcal{V}_1 = \{V \in \mathcal{V} : V \subset W\} \quad \text{and} \quad d_1 = \sup\{\operatorname{diam} V : V \in \mathcal{V}_1\}. \]
$\mathcal{V}_1$ is still a Vitali covering of $A$. Since $W$ is bounded, $d_1$ is finite. Choose $V_1 \in \mathcal{V}_1$ with $\operatorname{diam} V_1 \ge d_1/2$ and define
\[ \mathcal{V}_2 = \{V \in \mathcal{V}_1 : V \cap V_1 = \emptyset\} \quad \text{and} \quad d_2 = \sup\{\operatorname{diam} V : V \in \mathcal{V}_2\}. \]
Choose $V_2 \in \mathcal{V}_2$ with $\operatorname{diam} V_2 \ge d_2/2$. In general,
\[ \mathcal{V}_k = \{V \in \mathcal{V}_{k-1} : V \cap U_{k-1} = \emptyset\}, \qquad d_k = \sup\{\operatorname{diam} V : V \in \mathcal{V}_k\}, \]
and $V_k \in \mathcal{V}_k$ has $\operatorname{diam} V_k \ge d_k/2$, where $U_{k-1} = V_1 \cup \cdots \cup V_{k-1}$. This means that $V_k$ has roughly maximal diameter among the $V \in \mathcal{V}_1$ that do not meet $U_{k-1}$. By construction, the balls $V_k$ are disjoint and since they lie in $W$, $m(\bigcup V_k) \le mW \le m^*A + \epsilon$, verifying (a), (b). It remains to check (c).

If at any stage in the construction, $\mathcal{V}_k = \emptyset$, then we have covered $A$ with finitely many $V_k$'s, so (c) becomes trivial. We therefore assume that $V_1, V_2, \ldots$ form an infinite sequence. Additivity implies that $m(\bigcup V_k) = \sum mV_k$. Since all the $V_k$'s are contained in $W$, the series converges. This implies that $\operatorname{diam} V_k \to 0$ as $k \to \infty$; i.e.,
\[ d_k \to 0 \text{ as } k \to \infty. \tag{9} \]
For any $N \in \mathbb{N}$ we claim that
\[ \bigcup_{k=N}^{\infty} 5V_k \supset A \setminus U_{N-1}, \tag{10} \]
where $5V_k$ denotes the ball $V_k$ dilated from its center by the factor 5. (These dilated balls need not belong to $\mathcal{V}$.) Take any $a \in A \setminus U_{N-1}$. Since $U_{N-1}$ is compact and $\mathcal{V}_1$ is Vitali, there is a ball $B \in \mathcal{V}_1$ such that $a \in B$ and $B \cap U_{N-1} = \emptyset$. That is, $B \in \mathcal{V}_N$. Assume that (10) fails. Then, for all $k \ge N$, $a \notin 5V_k$. Therefore $B \not\subset 5V_N$. Figure 129 shows that due to the choice of $V_N$ with roughly maximal diameter, the fact that $5V_N$ fails to contain $B$ implies that $V_N$ is disjoint from $B$, so $B \in \mathcal{V}_{N+1}$. This continues for all $k > N$: namely $B \in \mathcal{V}_k$. Aha! $B$ was available for choice as the next $V_k$, $k > N$, but it was never chosen. Therefore the chosen $V_k$ has a diameter at least half as large as that of $B$. The latter diameter is fixed, but (9) states that the former diameter tends to 0 as $k \to \infty$, a contradiction. Thus (10) is true.

Figure 129 The unchosen ball $B$.

It is easy to see that (10) implies (c). For let $\delta > 0$ be given. Choose $N$ so large that
\[ \sum_{k=N}^{\infty} mV_k < \frac{\delta}{5^n} \]
where $n = \dim \mathbb{R}^n$. Since the series $\sum mV_k$ converges this is possible. By (10) and the scaling law $m(tE) = t^n mE$ for $n$-dimensional measure,
\[ m^*(A \setminus U_{N-1}) \le \sum_{k=N}^{\infty} m(5V_k) = 5^n \sum_{k=N}^{\infty} mV_k < \delta. \]
Since $\delta$ is arbitrary, $A \setminus U = \bigcap (A \setminus U_k)$ is a zero set. □
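The selection rule in this proof (always take a set of roughly maximal diameter among those still disjoint from the chosen ones) can be tried on a finite family, where it reduces to a greedy scan. The sketch below is a simplification under stated assumptions: the family is a finite random collection of closed intervals in $\mathbb{R}$ rather than a Vitali covering, the longest available interval is always taken (which certainly has diameter at least $d_k/2$), and the names `greedy_disjoint` and `dilate` are invented for the illustration.

```python
import random

def greedy_disjoint(intervals):
    """Scan closed intervals from longest to shortest, keeping each one
    that is disjoint from everything kept so far."""
    chosen = []
    for a, b in sorted(intervals, key=lambda iv: iv[1] - iv[0], reverse=True):
        if all(b < c or d < a for c, d in chosen):
            chosen.append((a, b))
    return chosen

def dilate(interval, factor):
    """The interval dilated from its center by the given factor."""
    a, b = interval
    center, radius = (a + b) / 2, (b - a) / 2
    return (center - factor * radius, center + factor * radius)

random.seed(0)
family = []
for _ in range(200):
    left = random.uniform(0.0, 1.0)
    family.append((left, left + random.uniform(0.001, 0.2)))

chosen = greedy_disjoint(family)
dilated = [dilate(iv, 5) for iv in chosen]
# Every interval of the family should sit inside some dilated chosen interval.
covered = all(any(c <= a and b <= d for c, d in dilated) for a, b in family)
print(len(chosen), covered)
```

Each interval that was not chosen meets a chosen interval at least as long as itself, so it lies in the 5-fold dilation of that interval. This is the finite analogue of claim (10).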
Remark A similar strategy of covering reduction appears in the proof in Chapter 2 that sequential compactness implies covering compactness. Formally, the proof is expressed in terms of the Lebesgue number of the covering, but the intuition is this: Given an open covering $\mathcal{U}$ of a sequentially compact set $K$, you choose a subcovering by first taking a $U_1 \in \mathcal{U}$ that covers as much of $K$ as possible, then taking $U_2 \in \mathcal{U}$ that covers as much of the remainder of $K$ as possible, and so on. If finitely many of these sets $U_n$ fail to cover $K$ then you take a sequence $x_n \in K \setminus (U_1 \cup \cdots \cup U_{n-1})$ and prove that it has no subsequence which converges in $K$. (The contradiction shows that in fact finitely many of the $U_n$ you chose actually did cover $K$.) In short, when given a covering it is a good idea to choose the biggest sets first.

Removing the assumption that $A$ is bounded presents no problem. Express an unbounded set $A$ as $A = \bigcup A_i$ where each $A_i$ is bounded. Given a Vitali covering $\mathcal{V}$ of $A$ and given $\epsilon > 0$, we form an $(\epsilon/2^i)$-efficient subcovering of $A_i$, say $\{V_{ik} : k \in \mathbb{N}\}$. Then $\{V_{ik} : i, k \in \mathbb{N}\}$ is an $\epsilon$-efficient subcovering of $A$. A further generalization involves the shapes of the sets $V \in \mathcal{V}$. See Exercise 45.

As a consequence of the Vitali Covering Lemma, we verify one of Littlewood's principles (see Appendix C):
Most points of a measurable set are like interior points.

Let $E \subset \mathbb{R}^n$ be measurable. For $x \in \mathbb{R}^n$, define the density of $E$ at $x$ as
\[ \delta(x, E) = \lim_{B \downarrow x} \frac{m(E \cap B)}{mB}, \]
if the limit exists, $m$ being Lebesgue measure on $\mathbb{R}^n$ and $B$ being a ball that shrinks down to $x$. Clearly, $0 \le \delta \le 1$. Points with $\delta = 1$ are called density points of $E$. The fraction that we're taking the limit of is the relative measure or concentration of $E$ in $B$. Existence of $\delta(x, E)$ means that for each $\epsilon > 0$ there exists an $r > 0$ such that if $B$ is any ball of radius $< r$ that contains $x$ then the concentration of $E$ in $B$ differs from $\delta$ by $< \epsilon$. If we specify that the balls are centered at $x$, we refer to $\delta$ as the balanced density of $E$ at $x$.
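The difference between density and balanced density shows up already for $E = [0, 1] \subset \mathbb{R}$. The sketch below assumes $E$ is given as a finite disjoint list of intervals and that balls in $\mathbb{R}$ are intervals $(lo, hi)$; the function name `concentration` is a choice made for this illustration.

```python
def concentration(E, lo, hi):
    """m(E ∩ B) / m(B) for B = (lo, hi), with E a finite disjoint
    union of intervals given as (left, right) pairs."""
    overlap = sum(max(0.0, min(b, hi) - max(a, lo)) for a, b in E)
    return overlap / (hi - lo)

E = [(0.0, 1.0)]
for r in (0.1, 0.01, 0.001):
    print(r,
          concentration(E, 0.5 - r, 0.5 + r),  # balls around the interior point 0.5
          concentration(E, -r, r),             # balls centered at the endpoint 0
          concentration(E, -r / 10, r),        # lopsided balls containing 0
          concentration(E, -r, r / 10))        # lopsided the other way
```

At the interior point the concentration is 1. At the endpoint 0 the centered balls always give concentration 1/2, so the balanced density there is 1/2, while the lopsided balls give roughly 0.91 and 0.09 for every $r$, so the density $\delta(0, E)$ itself does not exist.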
34 Lebesgue Density Theorem Almost every $x \in E$ is a density point of $E$.

Interior points of $E$ are obviously density points of $E$, although sets like the irrationals or a Cantor set of positive measure have empty interior, while still having infinitely many density points. Let $E^1$ denote the set of density points of $E$. In analogy with topological interior, it is easy to see that $(E^1)^1 = E^1$. Thus, $E^1$ is a kind of "measure theoretic interior" of $E$. Keep in mind the pathology that the boundary of an open set can have positive measure; for instance a Cantor set is the frontier of its complement and both can have positive measure. Even a Jordan curve in $\mathbb{R}^2$ can have positive Lebesgue planar measure (see Exercise 46), so this measure theoretic interior is more an analogy than a generalization.

A consequence of the Lebesgue Density Theorem is that measurable sets are not "diffuse": a measurable subset of $\mathbb{R}$ cannot meet every interval $(a, b)$ in a set of measure $c \cdot (b - a)$ where $c$ is a constant, $0 < c < 1$. Instead, a measurable set must be "concentrated" or "clumpy." See Exercise 42. Also, looking at the complement $E^c$ of $E$, we see that almost every point $x \in E^c$ has $\delta(x, E) = 0$. Thus, almost every point of $E$ is a density point of $E$ and almost every point of $E^c$ is not.

Proof of the Lebesgue Density Theorem Without loss of generality, we assume $E$ is a bounded subset of $\mathbb{R}^n$. Take any $a$, $0 \le a < 1$, and consider
\[ E_a = \{x \in E : \underline{\delta}(x, E) < a\} \]
where $\underline{\delta}$ denotes the lim inf of the concentration of $E$ in balls $B$ as $B \downarrow x$. We will show that $E_a$ has outer measure zero.
By assumption, at every $x \in E_a$ there are arbitrarily small balls in which the concentration of $E$ is $< a$. These balls form a Vitali covering of $E_a$ and by the Vitali Covering Lemma we can select a subcovering $V_1, V_2, \ldots$ such that the $V_k$ are disjoint, cover almost all of $E_a$, and nearly give the outer measure of $E_a$ in the sense that
\[ \sum_k mV_k \le m^*(E_a) + \epsilon. \]
($E_a$ turns out to be measurable, but the Vitali Covering Lemma does not require us to know this in advance.) We get
\[ m^*(E_a) = \sum_k m^*(E_a \cap V_k) < a \sum_k mV_k \le a(m^*(E_a) + \epsilon), \]
which implies that $m^*(E_a) \le a\epsilon/(1 - a)$. Since $\epsilon > 0$ is arbitrary, $m^*(E_a) = 0$. The $E_a$ are monotone increasing zero sets as $a \uparrow 1$. Letting $a = 1 - 1/\ell$, $\ell = 1, 2, \ldots$, we see that the union of all the $E_a$ with $a < 1$ is also a zero set, $Z$. Points $x \in E \setminus Z$ have the property that as $B \downarrow x$, the lim inf of the concentration of $E$ in $B$ is $\ge a$ for all $a < 1$; since the concentration is always $\le 1$, this means that the limit of the concentration exists and equals 1. Hence $E \setminus Z$ is contained in $E^1$ and almost every point of $E$ is a density point of $E$. □
8 Lebesgue's Fundamental Theorem of Calculus
In this section we write the integral of $f$ over a set $E$ as $\int_E f(x) \, dm$. In dimension one we write it as $\int_E f(t) \, dt$, or as $\int_\alpha^\beta f(t) \, dt$ when $E = (\alpha, \beta)$.

Definition The average of an integrable function $f : \mathbb{R}^n \to \mathbb{R}$ over a measurable set $E \subset \mathbb{R}^n$ with positive measure is
\[ \fint_E f(x) \, dm = \frac{1}{mE} \int_E f(x) \, dm. \]

35 Theorem If $f : \mathbb{R}^n \to \mathbb{R}$ is Lebesgue integrable, then for almost every $p \in \mathbb{R}^n$
\[ \lim_{B \downarrow p} \fint_B f(x) \, dm = f(p), \]
where $B \downarrow p$ means that $B$ is a ball that contains $p$ and shrinks down to $p$.
Proof Case 1. $f = \chi_E$ where $E$ is a measurable set. The average of $f$ on $B$ is the concentration of $E$ in $B$. By the Lebesgue Density Theorem, for almost every $p \in \mathbb{R}^n$ this concentration converges to $\chi_E(p)$.

Case 2. $f$ is integrable and nonnegative. Fix any $a > 0$ and consider the set
\[ A = \Big\{ p \in \mathbb{R}^n : \limsup_{B \downarrow p} \Big| \fint_B f(x) \, dm - f(p) \Big| > a \Big\}. \]
It suffices to show that for each $a > 0$, $A = A(a)$ is a zero set. Let $\epsilon > 0$ be given. We claim that $m^*A < \epsilon$. In Section 5 we showed that for any $\epsilon > 0$ there is a Lebesgue lower sum function $\phi$ such that
\[ \int_{\mathbb{R}^n} g(x) \, dm < \frac{a\epsilon}{4} \tag{11} \]
where $g = f - \phi$. Linearity of the integral and Case 1 imply that for almost every $p$, the average of $\phi$ on $B$ converges to $\phi(p)$ as $B$ shrinks to $p$. Thus, $A$ differs from $A'$ by a zero set where
\[ A' = \Big\{ p \in \mathbb{R}^n : \limsup_{B \downarrow p} \Big| \fint_B g(x) \, dm - g(p) \Big| > a \Big\}. \]
To get rid of the absolute values we write
\[ A_1' = \{p \in A' : g(p) > a/2\}, \qquad A_2' = \Big\{ p \in A' : \limsup_{B \downarrow p} \fint_B g(x) \, dm > a/2 \Big\}, \]
and observe that $A' \subset A_1' \cup A_2'$. Then
\[ \frac{a}{2} \cdot mA_1' \le \int_{A_1'} g(x) \, dm \le \int_{\mathbb{R}^n} g(x) \, dm < \frac{a\epsilon}{4} \]
implies that $mA_1' < \epsilon/2$. The balls $B$ on which $\fint_B g(x) \, dm > a/2$ form a Vitali covering of $A_2'$. The Vitali Covering Lemma gives a countable disjoint collection of these balls $\{B_i\}$ that covers almost all of $A_2'$. Then
\[ \frac{a}{2} \cdot m^*A_2' \le \sum_i \frac{a}{2} \cdot mB_i \le \sum_i \fint_{B_i} g(x) \, dm \cdot mB_i \le \int_{\mathbb{R}^n} g(x) \, dm < \frac{a\epsilon}{4}. \]
Dividing through by $a/2$ gives $m^*A_2' < \epsilon/2$, which completes the proof that $A$ is a zero set, and hence completes the proof of Theorem 35 in Case 2.

Case 3. $f$ is a general integrable function on $\mathbb{R}^n$. Write $f = f_+ - f_-$ with $f_\pm \ge 0$ and apply Case 2 to $f_+$ and $f_-$. □
36 Corollary If $f : [a, b] \to \mathbb{R}$ is Lebesgue integrable and
\[ F(x) = \int_a^x f(t) \, dt \]
is its indefinite integral then for almost every $x \in [a, b]$, $F'(x)$ exists and equals $f(x)$.

Proof In dimension one a ball is a segment, so Theorem 35 gives
\[ \frac{F(x + h) - F(x)}{h} = \fint_{[x, x+h]} f(t) \, dt \to f(x) \]
almost everywhere as $h \downarrow 0$. The same holds for $[x - h, x]$. □
Corollary 36 does not characterize indefinite integrals. Mere knowledge that a function $G$ has a derivative almost everywhere and that its derivative is an integrable function $f$ does not imply that $G$ differs from the indefinite integral of $f$ by a constant. The Devil's staircase function $H$ is a counter-example. Its derivative exists almost everywhere, $H'(x)$ is almost everywhere equal to the integrable function $f(x) = 0$, and yet $H$ does not differ from the indefinite integral of 0 by a constant. The missing ingredient is a subtler form of continuity.

Definition A function $G : [a, b] \to \mathbb{R}$ is absolutely continuous if for each $\epsilon > 0$ there is a $\delta > 0$ such that whenever $(\alpha_1, \beta_1), \ldots, (\alpha_n, \beta_n)$ are disjoint intervals in $[a, b]$ we have
\[ \sum_{k=1}^{n} (\beta_k - \alpha_k) < \delta \quad \Rightarrow \quad \sum_{k=1}^{n} |G(\beta_k) - G(\alpha_k)| < \epsilon. \]
Since $\delta$ does not depend on $n$, it follows that if $G$ is absolutely continuous and $(\alpha_k, \beta_k)$ is an infinite sequence of disjoint intervals in $[a, b]$ then
\[ \sum_{k=1}^{\infty} (\beta_k - \alpha_k) < \delta \quad \Rightarrow \quad \sum_{k=1}^{\infty} |G(\beta_k) - G(\alpha_k)| \le \epsilon. \]
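The failure of absolute continuity for the Devil's staircase can be checked with exact arithmetic. The sketch below assumes the standard middle-thirds construction: the $2^n$ closed intervals of the $n$-th stage have total length $(2/3)^n$, yet $H$ rises by $2^{-n}$ across each of them. The function names `cantor` and `stage` are invented for this illustration, and `cantor` is only meant for rationals with a finite ternary expansion, such as the stage endpoints.

```python
from fractions import Fraction

def cantor(x):
    """Devil's staircase H at a rational x in [0, 1] whose ternary
    expansion is finite (for example an endpoint k / 3**n)."""
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(200):            # cap on the number of ternary digits read
        if x == 0:
            break
        x *= 3
        digit = int(x)              # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:              # x lies in a closed middle third: H is constant there
            return value + scale
        if digit == 2:
            value += scale
        scale /= 2
    return value

def stage(n):
    """The 2**n closed intervals of the n-th stage of the Cantor construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = [piece for a, b in intervals
                     for piece in ((a, a + (b - a) / 3), (b - (b - a) / 3, b))]
    return intervals

for n in (1, 5, 10):
    pieces = stage(n)
    total_length = sum(b - a for a, b in pieces)
    total_rise = sum(cantor(b) - cantor(a) for a, b in pieces)
    print(n, float(total_length), total_rise)
```

However small $\delta$ is, some stage has total length below $\delta$ while the sum of the increments of $H$ stays equal to 1, so no $\delta$ works for $\epsilon = 1$ in the definition.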
37 Theorem Let $f : [a, b] \to \mathbb{R}$ be integrable.
(a) The indefinite integral $F$ of $f$ is absolutely continuous.
(b) For almost every $x$, $F'(x)$ exists and equals $f(x)$.
(c) If $G$ is an absolutely continuous function and $G'(x) = f(x)$ for almost every $x$, then $G$ differs from $F$ by a constant.

As we show in the next section, the tacit assumption in (c) that $G'(x)$ exists is redundant. Theorem 37 then gives a characterization of indefinite integrals as follows.

38 Lebesgue's Main Theorem Every indefinite integral is absolutely continuous and, conversely, every absolutely continuous function has a derivative almost everywhere and, up to a constant, is the indefinite integral of its derivative.

Proof of Theorem 37
(a) Case 1. $f$ is the characteristic function of a measurable set $E$, $f = \chi_E$. For any interval $(\alpha, \beta) \subset [a, b]$ we have
\[ F(\beta) - F(\alpha) = \int_\alpha^\beta \chi_E(t) \, dt = m(E \cap (\alpha, \beta)). \]
Thus, given $\epsilon > 0$, we take $\delta = \epsilon$ and check that if $\{(\alpha_k, \beta_k)\}$ are disjoint intervals in $[a, b]$ with $\sum_k (\beta_k - \alpha_k) < \delta$ then
\[ \sum_k F(\beta_k) - F(\alpha_k) = \sum_k m(E \cap (\alpha_k, \beta_k)) \le \sum_k (\beta_k - \alpha_k) < \epsilon. \]

Case 2. $f \ge 0$. Given $\epsilon > 0$, choose a simple function $0 \le \phi \le f$ as in the proof of Lebesgue's Fundamental Theorem of Calculus (Theorem 35) such that $\int_a^b g(t) \, dt < \epsilon/2$ where $g = f - \phi$. Linearity of the integral and Case 1 applied to the indefinite integral $\Phi$ of $\phi$ give a $\delta$ such that if $\sum_k (\beta_k - \alpha_k) < \delta$ then $\sum_k \Phi(\beta_k) - \Phi(\alpha_k) < \epsilon/2$. Thus
\[ \sum_k F(\beta_k) - F(\alpha_k) \le \int_a^b g(t) \, dt + \frac{\epsilon}{2} < \epsilon. \]

Case 3. The general integrable $f$. Express $f$ as $f = f_+ - f_-$ with $f_\pm \ge 0$. If $(\alpha, \beta) \subset [a, b]$ then
\[ |F(\beta) - F(\alpha)| = \Big| \int_\alpha^\beta f(t) \, dt \Big| \le \int_\alpha^\beta |f(t)| \, dt = \int_\alpha^\beta f_+(t) + f_-(t) \, dt, \]
which implies that Case 3 follows from Case 2 and completes the proof of (a).

(b) This is Corollary 36: the derivative of the indefinite integral is the integrand.

(c) $G$ is absolutely continuous and $G' = f$ almost everywhere. It is easy to see that the sum and difference of absolutely continuous functions are absolutely continuous. Thus $H = F - G$ is an absolutely continuous function such that $H'(x) = 0$ almost everywhere, and our goal is to show that $H$ is constant. Fix any $x^* \in [a, b]$. Given $\epsilon > 0$ it is enough to show that
\[ |H(x^*) - H(a)| < \epsilon. \]
Absolute continuity implies that there is a $\delta > 0$ such that if $(\alpha_k, \beta_k) \subset [a, b]$ are disjoint, $k = 1, 2, \ldots$, then
\[ \sum_k (\beta_k - \alpha_k) < \delta \quad \Rightarrow \quad \sum_k |H(\beta_k) - H(\alpha_k)| < \frac{\epsilon}{2}. \]
Fix such a $\delta$ and define
\[ X = \{x \in [a, x^*] : H'(x) \text{ exists and } H'(x) = 0\}. \]
Each $x \in X$ is contained in arbitrarily small intervals $[x, x + h] \subset [a, x^*]$ such that
\[ \Big| \frac{H(x + h) - H(x)}{h} \Big| < \frac{\epsilon}{2(b - a)}. \]
These intervals form a Vitali covering of $X$ and the Vitali Covering Lemma gives a countable disjoint subcovering $\{[x_k, x_k + h_k]\}$ of almost all of $X$. Since $[a, x^*] \setminus X$ is a zero set, $\sum_{k=1}^{\infty} h_k = x^* - a$. Therefore there is an $N$ such that
\[ \sum_{k=1}^{N} h_k > x^* - a - \delta. \]
Since $N$ is fixed we can re-label the indices so that $x_1 < x_2 < \cdots < x_N$. This gives a partition of $[a, x^*]$ as
\[ a \le x_1 < x_1 + h_1 \le x_2 < x_2 + h_2 \le \cdots \le x_N < x_N + h_N \le x^*, \]
and we telescope $H(x^*) - H(a)$ accordingly,
\[ \begin{aligned} H(x^*) - H(a) = {} & H(x^*) - H(x_N + h_N) & {} + {} & H(x_N + h_N) - H(x_N) \\ {} + {} & H(x_N) - H(x_{N-1} + h_{N-1}) & {} + {} & H(x_{N-1} + h_{N-1}) - H(x_{N-1}) \\ {} + {} & \cdots & {} + {} & \cdots \\ {} + {} & H(x_2) - H(x_1 + h_1) & {} + {} & H(x_1 + h_1) - H(x_1) \\ {} + {} & H(x_1) - H(a). \end{aligned} \]
The absolute values of the terms in the first column add up to at most $\epsilon/2$ since the total length of the intervals $(a, x_1), (x_1 + h_1, x_2), \ldots, (x_N + h_N, x^*)$ is $< \delta$. The absolute value of the $k$th term in the second column is
\[ |H(x_k + h_k) - H(x_k)| < \frac{\epsilon h_k}{2(b - a)}, \]
the sum of which is $< \epsilon/2$. Therefore $|H(x^*) - H(a)| < \epsilon/2 + \epsilon/2 = \epsilon$, which completes the proof that $H$ is constant. □

9 Lebesgue's Last Theorem
The final theorem in Lebesgue's groundbreaking book, Leçons sur l'intégration, is extremely concise and quite surprising.

39 Theorem A monotone function has a derivative almost everywhere.

Note that no hypothesis is made about continuity of the monotone function. Considering the fact that a monotone function $f : [a, b] \to \mathbb{R}$ has only a countable number of discontinuities, all of jump type, this may seem reasonable, but remember: the discontinuity set may be dense in $[a, b]$. We assume henceforth that $f$ is nondecreasing, since the non-increasing case can be handled by looking at $-f$.

Lebesgue's proof of Theorem 39 used the full power of the machinery he had developed for his new integration theory. In contrast, the proof given below is more direct and geometric. It relies on the Vitali Covering Lemma and the following form of Chebyshev's inequality from probability theory. The slope of $f$ over $[a, b]$ is
\[ s = \frac{f(b) - f(a)}{b - a}. \]
40 Chebyshev Lemma Assume that $f : [a, b] \to \mathbb{R}$ is nondecreasing and has slope $s$ over $I = [a, b]$. If $I$ contains countably many disjoint subintervals $I_k$ and the slope of $f$ over $I_k$ is $\ge S > s$ then
\[ \sum_k |I_k| \le \frac{s}{S} |I|. \]

Proof Write $I_k = [a_k, b_k]$. Since $f$ is nondecreasing
\[ f(b) - f(a) \ge \sum_k f(b_k) - f(a_k) \ge \sum_k S(b_k - a_k). \]
Thus, $s|I| \ge S \sum_k |I_k|$, and the lemma follows. □
Remark An extreme case of this situation occurs when the slope is concentrated in the three subintervals drawn in Figure 130.

Figure 130 Chebyshev's Inequality for slopes.

Proof of Lebesgue's Last Theorem Not only will we show that $f'(x)$ exists almost everywhere, we will also show that $f'(x)$ is a measurable function of $x$, and
\[ \int_a^b f'(x) \, dx \le f(b) - f(a). \tag{12} \]
To estimate differentiability, one introduces upper and lower limits of slopes called derivates. If $h > 0$, $[x, x + h]$ is a right interval at $x$ and $(f(x + h) - f(x))/h$ is a right slope at $x$. The lim sup of the right slopes as $h \to 0$ is called the right maximum derivate of $f$ at $x$, $D_{\text{right max}} f(x)$, and the lim inf is the right minimum derivate of $f$ at $x$, $D_{\text{right min}} f(x)$. Similar definitions apply to the left of $x$. Think of $D_{\text{right max}} f(x)$ as the steepest slope at the right of $x$ and $D_{\text{right min}} f(x)$ as the gentlest. See Figure 131.

Figure 131 Left and right slopes.
The derivates exist at all points of $[a, b]$, but they can take the value $\infty$. There are four derivates. We first show that two are equal almost everywhere, say the left min and the right max. Fix any $s < S$ and consider the set
\[ E = E_{sS} = \{x \in [a, b] : D_{\text{left min}} f(x) < s < S < D_{\text{right max}} f(x)\}. \]
We claim that
\[ m^*E = 0. \tag{13} \]
At each $x \in E$ there are arbitrarily small left intervals $[x - h, x]$ over which the slope is $< s$. These left intervals form a Vitali covering $\mathcal{L}$ of $E$.
(Note that the point $x$ is not the center of its $\mathcal{L}$-interval, but rather it is an endpoint. Also, we do not know a priori that $E$ is measurable. Luckily, Vitali permits this.) Let $\epsilon > 0$ be given. By the Vitali Covering Lemma, there are countably many disjoint left intervals $L_i \in \mathcal{L}$ that cover $E$ modulo a zero set, and they do so $\epsilon$-efficiently. That is, if we write
\[ L = \bigcup \operatorname{int} L_i \]
then $E \setminus L$ is a zero set and $mL \le m^*E + \epsilon$.

Every $y \in L \cap E$ has arbitrarily small right intervals $[y, y + t] \subset L$ over which the slope is $> S$. (Here it is useful that $L$ is open.) These right intervals form a Vitali covering $\mathcal{R}$ of $L \cap E$, and by the Vitali Covering Lemma, we can find a countable number of disjoint intervals $R_j \in \mathcal{R}$ that cover $L \cap E$ modulo a zero set. Since $L \cap E = E$ modulo a zero set, $R = \bigcup R_j$ also covers $E$ modulo a zero set. By the Chebyshev Lemma,
\[ m^*E \le mR = \sum_j |R_j| = \sum_i \sum_{R_j \subset L_i} |R_j| \le \sum_i \frac{s}{S} |L_i| \le \frac{s}{S} (m^*E + \epsilon). \]
Since the inequality holds for all $\epsilon > 0$, it holds also with $\epsilon = 0$, which implies that $m^*E = 0$, and completes the proof of (13). Then
\[ \{x : D_{\text{left min}} f(x) < D_{\text{right max}} f(x)\} = \bigcup_{\{(s, S) \in \mathbb{Q} \times \mathbb{Q} \,:\, s < S\}} E_{sS} \]
is a zero set. Symmetrically, $\{x : D_{\text{left min}} f(x) > D_{\text{right max}} f(x)\}$ is a zero set, and $D_{\text{left min}} f(x) = D_{\text{right max}} f(x)$ almost everywhere. Mutual equality of the other derivates, almost everywhere, is checked in the same way. See Exercise 49.

So far we have shown that for almost every $x \in [a, b]$, the derivative of $f$ at $x$ exists, although it may equal $\infty$. Infinite slope is not really acceptable, and that is the purpose of (12): an integrable function takes on a finite value at almost every point.

The proof of (12) uses a cute trick reminiscent of the traveling secant method. First extend $f$ from $[a, b]$ to $\mathbb{R}$ by setting $f(x) = f(a)$ for $x < a$ and $f(x) = f(b)$ for $x > b$. Then define $g_n(x)$ to be the slope of the secant from $(x, f(x))$ to $(x + 1/n, f(x + 1/n))$. That is,
\[ g_n(x) = \frac{f(x + 1/n) - f(x)}{1/n} = n(f(x + 1/n) - f(x)). \]
See Figure 132. Since $f$ is almost everywhere continuous, it is measurable, and so is $g_n$. For almost every $x$, $g_n(x)$ converges to $f'(x)$ as $n \to \infty$. Hence $f'$ is measurable, and clearly $f' \ge 0$. Fatou's Lemma gives
\[ \int_a^b f'(x) \, dx = \int_a^b \liminf_{n \to \infty} g_n(x) \, dx \le \liminf_{n \to \infty} \int_a^b g_n(x) \, dx. \]
The integral of $g_n$ is
\[ \int_a^b g_n(x) \, dx = n \int_b^{b + 1/n} f(x) \, dx - n \int_a^{a + 1/n} f(x) \, dx. \]
The first term equals $f(b)$ since we set $f(x) = f(b)$ for $x > b$. The second term is at least $f(a)$ since $f$ is nondecreasing. Thus
\[ \int_a^b g_n(x) \, dx \le f(b) - f(a), \]
which completes the proof of (12). As remarked before, since the integral of $f'$ is finite, $f'(x) < \infty$ for almost all $x$, and hence $f$ is differentiable (with finite derivative) almost everywhere. □

Figure 132 $g_n(x)$ is the slope of the right secant at $x$.
41 Corollary  A Lipschitz function is almost everywhere differentiable.

Proof  Suppose that $f : [a, b] \to \mathbb{R}$ is Lipschitz with Lipschitz constant $L$. Then, for all $x, y \in [a, b]$,
$$|f(y) - f(x)| \le L\,|y - x|.$$
The function $g(x) = f(x) + Lx$ is nondecreasing. Thus, $g'$ exists almost everywhere and so does $f' = g' - L$. □

Remark  Corollary 41 remains true for a Lipschitz function $f : \mathbb{R}^n \to \mathbb{R}$; it is Rademacher's Theorem, and the proof is much harder.
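As a quick sanity check on the proof of Corollary 41, the following sketch samples a Lipschitz function and the shifted function $g(x) = f(x) + Lx$ on a grid. The example $f(x) = |\sin 3x|$ with $L = 3$ and the helper name `is_nondecreasing` are assumptions made for illustration; a grid test is evidence, not a proof.

```python
import math


def is_nondecreasing(g, a, b, samples=10001, tol=1e-12):
    """Check g(x_i) <= g(x_{i+1}) on a uniform grid of [a, b] (up to tol)."""
    xs = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    values = [g(x) for x in xs]
    return all(v2 >= v1 - tol for v1, v2 in zip(values, values[1:]))


L = 3.0  # Lipschitz constant of the hypothetical example below


def f(x):
    return abs(math.sin(3.0 * x))  # Lipschitz with constant 3, not monotone


def g(x):
    return f(x) + L * x  # the shifted function from the proof


if __name__ == "__main__":
    print("f nondecreasing on grid:", is_nondecreasing(f, 0.0, 2.0))
    print("g nondecreasing on grid:", is_nondecreasing(g, 0.0, 2.0))
```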
Definition  The variation of a function $f : [a, b] \to \mathbb{R}$ over a partition $X : a = x_0 < \cdots < x_n = b$ is the sum $\sum_{k=1}^n |\Delta_k f|$ where $\Delta_k f = f(x_k) - f(x_{k-1})$. The supremum of the variations over all partitions $X$ is the total variation of $f$. If the total variation of $f$ is finite, $f$ is said to be a function of bounded variation.

42 Theorem  A function of bounded variation is almost everywhere differentiable.
Proof  Up to an additive constant, a function of bounded variation can be written as the difference $f(x) = P(x) - N(x)$ where
$$P(x) = \sup\Big\{\sum_k \Delta_k f : a = x_0 < \cdots < x_n = x \text{ and } \Delta_k f \ge 0\Big\}$$
$$N(x) = -\inf\Big\{\sum_k \Delta_k f : a = x_0 < \cdots < x_n = x \text{ and } \Delta_k f < 0\Big\}.$$
See Exercise 52. The functions $P$ and $N$ are monotone nondecreasing, so for almost every $x$, $f'(x) = P'(x) - N'(x)$ exists and is finite. □

43 Theorem  An absolutely continuous function is of bounded variation.
Proof  Assume that $F : [a, b] \to \mathbb{R}$ is absolutely continuous and take $\epsilon = 1$. There is a $\delta > 0$ such that if $(\alpha_k, \beta_k)$ are disjoint intervals in $[a, b]$ then
$$\sum_k \beta_k - \alpha_k < \delta \quad \Rightarrow \quad \sum_k |F(\beta_k) - F(\alpha_k)| < 1.$$
Fix a partition $D$ of $[a, b]$ with $N$ subintervals of length $< \delta$. For any partition $X$ of $[a, b]$ we claim that $\sum_k |\Delta_k F| < N$. We may assume that $X$ contains $D$, since adding points to a partition can only increase the sum $\sum |\Delta_k F|$. Then
$$\sum_X |\Delta_k F| = \sum_{i=1}^N \sum_{X_i} |\Delta_k F| < N,$$
where $X_i$ refers to the subintervals of $X$ that lie in the $i$th subinterval of $D$. The subintervals in $X_i$ have total length $< \delta$, so the variation of $F$ over them is $< 1$, and the total variation of $F$ is $< N$. □

44 Corollary
An absolutely continuous function is almost everywhere differentiable.

Proof  Absolute continuity implies bounded variation, which implies almost everywhere differentiability. □
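The variation over a partition, and the positive and negative sums that underlie the decomposition $f = P - N$ in the proof of Theorem 42, are easy to compute for a fixed partition. The sketch below is a minimal illustration; the function names and the test function $\sin 2\pi x$ are assumptions of mine. It evaluates the sums over one partition only, whereas the total variation, $P$ and $N$ are suprema or infima over all partitions; for a piecewise monotone function a partition containing the turning points already attains them.

```python
import math


def uniform_partition(a, b, n):
    """Partition a = x_0 < ... < x_n = b into n equal subintervals."""
    return [a + (b - a) * k / n for k in range(n + 1)]


def variation(f, partition):
    """Sum of |f(x_k) - f(x_{k-1})| over the given partition."""
    return sum(abs(f(x1) - f(x0)) for x0, x1 in zip(partition, partition[1:]))


def positive_negative_sums(f, partition):
    """Return (sum of the increments >= 0, minus the sum of the increments < 0)."""
    increments = [f(x1) - f(x0) for x0, x1 in zip(partition, partition[1:])]
    positive = sum(d for d in increments if d >= 0)
    negative = -sum(d for d in increments if d < 0)
    return positive, negative


def wave(x):
    return math.sin(2.0 * math.pi * x)  # hypothetical test function on [0, 1]


if __name__ == "__main__":
    X = uniform_partition(0.0, 1.0, 1000)
    pos, neg = positive_negative_sums(wave, X)
    print("variation over X:", variation(wave, X))
    print("positive sum:", pos, "negative sum:", neg)
    print("pos - neg:", pos - neg, " f(b) - f(a):", wave(1.0) - wave(0.0))
```

On any partition the positive sum minus the negative sum telescopes to $f(b) - f(a)$, and their total is the variation over that partition.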
As mentioned in Section 8, Theorem 37 plus Corollary 44 express Lebesgue's Main Theorem,

Indefinite integrals are absolutely continuous and every absolutely continuous function has a derivative almost everywhere of which it is the indefinite integral.

Appendix A: Translations and Nonmeasurable sets
If $t \in \mathbb{R}$ is fixed, $t$-translation is the mapping $x \mapsto x + t$. It is a homeomorphism $\mathbb{R} \to \mathbb{R}$. Think of the circle $C$ as $\mathbb{R}$ modulo $\mathbb{Z}$. That is, you identify any $x$ with $x + n$ for $n \in \mathbb{Z}$. Equivalently, you take the unit interval $[0, 1]$ and you identify 1 with 0. Then $t$-translation becomes rotation by the angle $2\pi t$, and is denoted as $R_t : C \to C$. If $t$ is rational this rotation is periodic, i.e., for some $n \ge 1$, the $n$th iterate of $R$, $R^n = R \circ \cdots \circ R$, is the identity map $C \to C$. In fact the smallest such $n$ is the denominator when $t = m/n$ is expressed in lowest terms. On the other hand, if $t$ is irrational then $R = R_t$ is nonperiodic; every orbit $O(x) = \{R^k(x) : k \in \mathbb{Z}\}$ is denumerable and dense in $C$.
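The contrast between rational and irrational rotations can be seen in a small experiment. The sketch below models the circle as $[0, 1)$ with arithmetic modulo 1; the helper names are hypothetical, and the floating-point value `math.sqrt(2)` only stands in for an irrational number, so the density check is suggestive rather than rigorous.

```python
from fractions import Fraction
import math


def rotation_period(t):
    """Period of rotation by a rational t, given as a Fraction.
    A Fraction is stored in lowest terms, so this is its denominator."""
    return t.denominator


def orbit(t, x0, steps):
    """The points x0, x0 + t, ..., x0 + (steps - 1) t, reduced modulo 1."""
    return [(x0 + k * t) % 1 for k in range(steps)]


def largest_gap(points):
    """Largest arc of the circle [0, 1) left uncovered by the given points."""
    pts = sorted(points)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(1 - pts[-1] + pts[0])  # wrap-around arc
    return max(gaps)


if __name__ == "__main__":
    t = Fraction(6, 8)  # equals 3/4 in lowest terms
    print("period:", rotation_period(t))
    print("rational orbit:", orbit(t, Fraction(0), 8))
    for steps in (10, 100, 1000):
        print(steps, largest_gap(orbit(math.sqrt(2), 0.0, steps)))
```

For the rational rotation the orbit repeats after `rotation_period(t)` steps, while for the stand-in irrational the largest uncovered arc keeps shrinking as more of the orbit is plotted.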
45 Theorem  Let $R = R_t$ with $t$ irrational. If $P \subset C$ contains exactly one point of each orbit of $R$, then $P$ is nonmeasurable.

Proof  The orbits of $R$ are disjoint sets, there are uncountably many of them, and the rotated copies of $P$ divide the circle as the disjoint union $C = \bigsqcup_{n \in \mathbb{Z}} R^n(P)$. Translation preserves outer measure, measurability, and measure. So does rotation. Can $P$ be measurable? No, because if it is measurable with positive measure we would get
$$m(C) = \sum_{n=-\infty}^{\infty} m(R^n P) = \infty,$$
a contradiction, while if $mP = 0$ then $mC = \sum_{n=-\infty}^{\infty} m(R^n P) = 0$, which contradicts the fact that $m[0, 1) = 1$. □
But does $P$ exist? The Axiom of Choice states that given any family of non-empty disjoint sets there exists a set that contains exactly one element from each set. So, if you accept the Axiom of Choice, you apply it to the family of $R$-orbits, and you get an example of a nonmeasurable set; while if you don't accept the Axiom of Choice you're out of luck. To increase the pathology of $P$, we discuss translations in more depth below.

46 Theorem  If $E \subset \mathbb{R}$ is measurable and has positive measure, then there exists a $\delta > 0$ such that for all $t \in (-\delta, \delta)$, the $t$-translate of $E$ meets $E$.

47 Lemma  If $F \subset (a, b)$ is measurable and disjoint from its $t$-translate, then
$$2mF \le (b - a) + |t|.$$
Proof  $F$ and its $t$-translate have equal measure, so if they do not intersect, then their total measure is $2mF$, and any interval that contains them must have length $\ge 2mF$. If $t > 0$ then $(a, b + t)$ contains $F$ and its $t$-translate, while if $t < 0$ then $(a + t, b)$ contains them. The length of the interval in either case is $(b - a) + |t|$. □

Proof of Theorem 46  By the Lebesgue Density Theorem (Theorem 34) $E$ has lots of density points, so we can find an interval $(a, b)$ in which $E$ has concentration $> 1/2$. Call $F = E \cap (a, b)$. Then $mF > (b - a)/2$. By Lemma 47 if $|t| < 2mF - (b - a)$ then the $t$-translate of $F$ meets $F$, so the $t$-translate of $E$ meets $E$, which is what the theorem asserts. □

Now we return to the nonmeasurable set $P$ discussed in Theorem 45. It contains exactly one point from each orbit of $R$, $R$ being rotation by an irrational $t$. Set
$$A = \bigcup_{k \in \mathbb{Z}} R^{2k} P, \qquad B = \bigcup_{k \in \mathbb{Z}} R^{2k+1} P.$$
The sets $A$, $B$ are disjoint, their union is the circle, and $R$ interchanges them. Since $R$ preserves outer measure, $m^*A = m^*B$.
The composite $R^2 = R \circ R$ is rotation by $2t$, also an irrational number. Let $\epsilon > 0$ be given. Since the orbit of 0 under $R^2$ is dense, there is a large integer $k$ with
$$|R^{2k}(0) - (-t)| < \epsilon.$$
For $R^{2k}$ is the $k$th iterate of $R^2$. Thus $|R^{2k+1}(0)| < \epsilon$, so $R^{2k+1}$ is a rotation by $< \epsilon$. Odd powers of $R$ interchange $A$ and $B$, so odd powers of $R$ translate $A$ and $B$ off themselves. It follows from Theorem 46 that $A$ and $B$ contain no measurable subsets of positive measure. Their inner measures are zero. The general formula $mC = m_*A + m^*B$ implies that $m^*B = 1$. Thus we get an extreme type of nonmeasurability, expressed in Theorem 48.
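The existence of a small odd power of $R$ can be explored by brute force. The sketch below searches for the first $k \ge 0$ such that $R^{2k+1}(0)$ lies within $\epsilon$ of 0 on the circle; the function names are hypothetical, and a floating-point `math.sqrt(2)` is used as a stand-in for the irrational $t$, so this illustrates the density argument without replacing it.

```python
import math


def circle_distance_to_zero(x):
    """Distance from x (taken modulo 1) to 0 on the circle R/Z."""
    frac = x % 1.0
    return min(frac, 1.0 - frac)


def small_odd_power(t, eps, max_k=10**6):
    """First k >= 0 with |R^(2k+1)(0)| < eps, where R is rotation by t.
    Returns None if no such k is found below max_k."""
    for k in range(max_k):
        if circle_distance_to_zero((2 * k + 1) * t) < eps:
            return k
    return None


if __name__ == "__main__":
    t = math.sqrt(2)
    for eps in (0.1, 0.01, 0.001):
        k = small_odd_power(t, eps)
        print("eps =", eps, " k =", k, " odd power =", 2 * k + 1)
```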
48 Theorem  The circle, or equivalently $[0, 1)$, splits into two nonmeasurable, disjoint subsets that both have inner measure zero and outer measure one.

Appendix B: The Banach-Tarski Paradox
If the example in the preceding appendix does not disturb you enough, here is a much worse one. You can read about it in Stan Wagon's book, The Banach-Tarski Paradox. Many other paradoxes are discussed there too. The solid unit ball in three-space can be divided into five disjoint sets, $A_1, \ldots, A_5$, and the $A_i$ can be moved by rigid motions to new disjoint sets $A_i'$ whose union is two disjoint unit balls. The Axiom of Choice is fundamental in the construction, as is dimensionality greater than two. The sets $A_i$ are nonmeasurable. Think of this from an alchemist's point of view. A one inch gold ball can be cut into five disjoint pieces and the pieces rigidly re-assembled to make two one inch gold balls. Repeating the process would make you very rich.
Appendix C: Riemann integrals as undergraphs
The geometric description of the Lebesgue integral as the measure of the undergraph has a counterpart for Riemann integrals.
49 Theorem  A function $f : [a, b] \to [0, M]$ is Riemann integrable if and only if the Lebesgue measures of the closure and interior of its undergraph are equal. Equivalently the boundary of its undergraph is a Lebesgue zero set.
Proof  Assume that $f$ is Riemann integrable. Given $\epsilon > 0$, a fine partition of $[a, b]$ gives upper and lower sums, $U = \sum M_i \Delta x_i$ and $L = \sum m_i \Delta x_i$, which differ by $< \epsilon$. As usual, $\Delta x_i = x_i - x_{i-1}$, while $m_i$ and $M_i$ are the infimum and supremum of $f(x)$ as $x$ varies in $[x_{i-1}, x_i]$. Geometrically this means that the graph of $f$ is covered by rectangles $R_i = X_i \times Y_i$ where $X_i = [x_{i-1}, x_i]$ and $Y_i = [m_i, M_i]$. The open rectangles $(x_{i-1}, x_i) \times (0, m_i)$ are disjoint, lie in the interior of the undergraph, and have total area $L$, while the closed rectangles $[x_{i-1}, x_i] \times [0, M_i]$ cover the closure of the undergraph and have total area $U$. Since $U - L < \epsilon$ and $\epsilon > 0$ is arbitrary, the interior and closure of the undergraph have equal measure.

Conversely, assume that $m(\text{interior } \mathcal{U}f) = m(\text{closure } \mathcal{U}f)$ and let $\epsilon > 0$ be given. By regularity of planar Lebesgue measure, there are a compact subset $K$ of the interior of the undergraph and an open set $V$ that contains the closure of the undergraph such that
$$m(V \setminus K) < \epsilon.$$
For $x \in [a, b]$ set $k(x) = \max\{y : (x, y) \in K\}$, or if no such $y$ exists set $k(x) = 0$. The segment
$$K(x) = \{(x, t) : 0 < t \le k(x)\}$$
lies in the interior of the undergraph. Otherwise, there would exist a sequence $(x_n, y_n)$ that converges to some $(x, t) \in K(x)$ and $(x_n, y_n)$ does not lie in the undergraph. Neither does the higher point $(x_n, y_n + (k(x) - t))$. But the latter tends to $(x, k(x))$, which contradicts the fact that $(x, k(x)) \in K \subset \text{interior } \mathcal{U}f$. Since $K(x)$ is contained in the interior of the undergraph, there is a $\delta = \delta(x) > 0$ such that the strip $(x - \delta, x + \delta) \times (0, k(x) + \delta)$ is contained in it too. These strips cover $K$, $K$ is compact, and so finitely many of them cover $K$. The strips then refine to rectangles whose total area is less than or equal to a lower sum $L$ with $mK \le L$. See Figure 133. Similarly, the open set $V$ contains strips that cover the closure of the undergraph and they refine to rectangles whose total area is greater than or equal to an upper sum $U$ with $U \le mV$. Hence $U - L < \epsilon$, which implies Riemann integrability. □
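The first half of the proof reads the lower and upper sums as total areas of rectangles inside and around the undergraph. The sketch below computes both sums under a simplifying assumption that is mine, not the text's: $f$ is nondecreasing, so on each subinterval the infimum and supremum are the values at the left and right endpoints. The function name and the example $f(x) = x^2$ are hypothetical.

```python
def upper_lower_sums_monotone(f, a, b, n):
    """Lower and upper Riemann sums of a NONDECREASING f over a uniform
    partition of [a, b] into n pieces. Monotonicity is assumed so that
    m_i = f(x_{i-1}) and M_i = f(x_i) exactly."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    lower = sum(f(x0) * (x1 - x0) for x0, x1 in zip(xs, xs[1:]))
    upper = sum(f(x1) * (x1 - x0) for x0, x1 in zip(xs, xs[1:]))
    return lower, upper


def square(x):
    return x * x  # hypothetical example, nondecreasing on [0, 1]


if __name__ == "__main__":
    for n in (10, 100, 1000):
        L, U = upper_lower_sums_monotone(square, 0.0, 1.0, n)
        print(n, "L =", L, "U =", U, "U - L =", U - L)
```

For a nondecreasing $f$ on a uniform partition the gap telescopes to $U - L = (f(b) - f(a))(b - a)/n$, so the rectangles squeeze the undergraph as the partition is refined.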
50 Corollary  If $f$ is Riemann integrable, then it is Lebesgue integrable and the two integrals are equal.

Proof  Since
$$\text{interior } \mathcal{U}f \subset \mathcal{U}f \subset \text{closure } \mathcal{U}f,$$
equality of the measures of its interior and closure implies that $\mathcal{U}f$ is measurable, and it shares their common measure. Since the Lebesgue integral of $f$ equals $m(\mathcal{U}f)$ the proof is complete. □

Figure 133  The strips that cover $K$.
Appendix D: Littlewood's Three Principles

In the following excerpt from his book on complex analysis, Lectures on the Theory of Functions, J.E. Littlewood seeks to demystify Lebesgue theory. It owes some of its popularity to its prominence in Royden's classic text, Real Analysis.

The extent of knowledge [of real analysis] required is nothing like as great as is sometimes supposed. There are three principles, roughly expressible in the following terms: Every (measurable) set is nearly a finite sum of intervals; every function (of class $L^\lambda$) is nearly continuous; and every convergent sequence of functions is nearly uniformly convergent. Most of the results of the present section are fairly intuitive applications of these ideas, and the student armed with them should be equal to most occasions when real variable theory is called for. If one of the principles would be the obvious means to settle a problem if it were "quite" true, it is natural to ask if the "nearly" is near enough, and for a problem that is actually soluble it generally is.†

Littlewood's first principle expresses the regularity of Lebesgue measure. Given $\epsilon > 0$, a measurable $S \subset [a, b]$ contains a compact subset covered by finitely many intervals, whose union differs from $S$ by a set of measure $< \epsilon$. In that sense, $S$ is nearly a finite union of intervals. I like very much Littlewood's choice of the term "nearly," meaning "except for an $\epsilon$-set," to contrast with "almost," meaning "except for a zero set."

Littlewood's second principle refers to "functions of class $L^\lambda$," although he might better have said "measurable functions." He means that if you have a (measurable) function and you are given $\epsilon > 0$, you can discard an $\epsilon$-set from its domain of definition and the result is a continuous function. This is Lusin's Theorem: a measurable function is nearly continuous.

Littlewood's third principle concerns a sequence of (measurable) functions that converges almost everywhere to a limit. Except for an $\epsilon$-set the convergence is uniform, which is Egoroff's Theorem: almost everywhere convergence implies nearly uniform convergence. Proofs of Egoroff's and Lusin's theorems are outlined in Exercises 54 and 57.
Appendix E: Roundness
The density of a set $E$ at $x$ is the limit, if it exists, of the concentration of $E$ in a ball $B$ as $B \downarrow x$. What if you used a cube instead of a ball? Or an ellipsoid? Would it matter? The answer is "somewhat." Let us say that a neighborhood $U$ of $x$ is $K$-quasi-round if it can be sandwiched between balls $B \subset U \subset B'$ with $\operatorname{diam} B' \le K \operatorname{diam} B$. A ball is 1-quasi-round, while a square is $\sqrt{2}$-quasi-round. It is not hard to check that if $x$ is a density point with respect to balls then it is also a density point with respect to $K$-quasi-round neighborhoods of $x$, provided that $K$ is fixed as the neighborhoods shrink to $x$. See Exercise 45. When the neighborhoods are not quasi-round, the density point analysis becomes marvelously complicated. See Falconer's book, The Geometry of Fractal Sets.

† Reprinted from Lectures on the Theory of Functions by J.E. Littlewood (1944) by permission of Oxford University Press.
Appendix F : Money
Riemann and Lebesgue walk into a room and find a table covered with hundreds of U.S. coins. (Well . . . ) How much money is there? Riemann solves the problem by taking the coins one at a time and adding their values as he goes. As he picks up a penny, a nickel, a quarter, a dime, a penny, etc., he counts: "one cent, 6 cents, 31 cents, 41 cents, 42 cents, etc." The final number is Riemann's answer. In contrast, Lebesgue first sorts the coins into piles of the same value (partitioning the value axis and taking pre-images); he then counts each pile (applying counting measure); and he sums the six terms, "value $v$ times number of coins with value $v$," and that is his answer. Lebesgue's answer and Riemann's answer are of course the same number. It is their methods of calculating that number which are different.

Now imagine that you walk into the room and behold this coin-laden table. Which method would you actually use to find out how much money there is - Riemann's or Lebesgue's? This amounts to the question: Which is the "better" integration theory? As an added twist, suppose you have only sixty seconds to make a good guess. What would you do then?
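The two counting methods translate directly into code. The sketch below uses the five coins named in the story (values in cents); the function names are my own labels for the two strategies, not terminology from the text.

```python
from collections import Counter

# penny, nickel, quarter, dime, penny, in the order Riemann picks them up
coins = [1, 5, 25, 10, 1]


def riemann_total(coins):
    """Walk through the coins in order, keeping a running total."""
    total = 0
    for value in coins:
        total += value
    return total


def lebesgue_total(coins):
    """Sort the coins into piles by value, count each pile, then sum
    value * (number of coins with that value)."""
    piles = Counter(coins)
    return sum(value * count for value, count in piles.items())


if __name__ == "__main__":
    print(riemann_total(coins))   # 42
    print(lebesgue_total(coins))  # 42
```

Riemann's loop touches every coin once, in the order given; Lebesgue's version groups by value first, so its final sum has only as many terms as there are distinct coin values.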
Suggested Reading
There are many books on more advanced analysis and topology. Among my favorites in the "not too advanced" category are these.

1. Kenneth Falconer, The Geometry of Fractal Sets. Here you should read about the Kakeya problem: how much area is needed to reverse the position of a unit needle in the plane by a continuous motion? Falconer also has a couple of later books on fractals that are good.
2. Thomas Hawkins, Lebesgue's Theory of Integration. You will learn a great deal about the history of Lebesgue integration and analysis around the turn of the last century from this book, including the fact that many standard attributions are incorrect. For instance, the Cantor set should be called the Smith set; Vitali had many of the ideas credited solely to Lebesgue, etc. Hawkins' book is a real gem.
3. John Milnor, Topology from the Differentiable Viewpoint. Milnor is one of the clearest mathematics writers and thinkers of the twentieth century. This is his most elementary book, and it is only seventy six pages long.
4. James Munkres, Topology, a First Course. This is a first year graduate text that deals with some of the same material you have been studying.
5. Robert Devaney, An Introduction to Chaotic Dynamical Systems. This is the book you should read to begin studying mathematical dynamics. It is first rate.

One thing you will observe about all these books - they use pictures to convey the mathematical ideas. Beware of books that don't.
Bibliography

1. Ralph Boas, A Primer of Real Functions, The Mathematical Association of America, Washington DC, 1981.
2. Andrew Bruckner, Differentiation of Real Functions, Lecture Notes in Mathematics, Springer-Verlag, New York, 1978.
3. John Burkill, The Lebesgue Integral, Cambridge University Press, London, 1958.
4. Paul Cohen, Set Theory and the Continuum Hypothesis, Benjamin, New York, 1966.
5. Robert Devaney, An Introduction to Chaotic Dynamical Systems, Benjamin Cummings, Menlo Park, CA, 1986.
6. Jean Dieudonné, Foundations of Analysis, Academic Press, New York, 1960.
7. Kenneth Falconer, The Geometry of Fractal Sets, Cambridge University Press, London, 1985.
8. Russell Gordon, The Integrals of Lebesgue, Denjoy, Perron, and Henstock, The American Mathematical Society, Providence, RI, 1994.
9. Fernando Gouvêa, p-adic Numbers, Springer-Verlag, Berlin, 1997.
10. Thomas Hawkins, Lebesgue's Theory of Integration, Chelsea, New York, 1975.
11. George Lakoff, Where Mathematics Comes From, Basic Books, New York, 2000.
12. Edmund Landau, Foundations of Analysis, Chelsea, New York, 1951.
13. Henri Lebesgue, Leçons sur l'intégration et la recherche des fonctions primitives, Gauthier-Villars, Paris, 1904.
14. John Littlewood, Lectures on the Theory of Functions, Oxford University Press, Oxford, 1944.
15. Ib Madsen and Jørgen Tornehave, From Calculus to Cohomology, Cambridge University Press, Cambridge, 1997.
16. Jerrold Marsden and Alan Weinstein, Calculus III, Springer-Verlag, New York, 1998.
17. Robert McLeod, The Generalized Riemann Integral, The Mathematical Association of America, Washington DC, 1980.
18. John Milnor, Topology from the Differentiable Viewpoint, Princeton University Press, Princeton, 1997.
19. Edwin Moise, Geometric Topology in Dimensions 2 and 3, Springer-Verlag, New York, 1977.
20. James Munkres, Topology, a First Course, Prentice Hall, Englewood Cliffs, NJ, 1975.
21. Murray Protter and Charles Morrey, A First Course in Real Analysis, Springer-Verlag, New York, 1991.
22. Dale Rolfsen, Knots and Links, Publish or Perish, Berkeley, 1976.
23. Halsey Royden, Real Analysis, Prentice-Hall, Englewood Cliffs, NJ, 1988.
24. Walter Rudin, Principles of Mathematical Analysis, McGraw-Hill, New York, 1976.
25. James Stewart, Calculus with Early Transcendentals, Brooks Cole, New York, 1999.
26. Arnoud van Rooij and Wilhelmus Schikhof, A Second Course on Real Functions, Cambridge University Press, London, 1982.
Exercises
1. If $t \in \mathbb{R}$ is fixed show that the linear outer measure of $A$ and its $t$-translate $t + A = \{t + a : a \in A\}$ are equal. What is the corresponding assertion in the plane?
2. If $t \ge 0$ is fixed and $A \subset \mathbb{R}$ is given, show that $m^*(tA) = t \cdot m^*A$ where $tA = \{ta : a \in A\}$ is the $t$-dilation of $A$. State the corresponding assertion in the plane.
3. Give a shorter proof of Theorem 2 in dimension one as follows:
(a) If $I = [a, b]$ is covered by finitely many open intervals $I_k = (a_k, b_k)$, none of which contains $I$, show that there exists $c \in I$ such that $[a, c]$ and $[c, b]$ are covered by fewer than $k$ of the intervals $I_k$.
(b) Use (a) and induction on $k$ to show that $|I| \le m^*I$.
4. (a) Show that the definition of linear outer measure is unaffected if we demand that the intervals $I_k$ in the coverings be closed instead of open.
(b) Why does this imply that the middle-thirds Cantor set has linear outer measure zero?
(c) Show that the definition of linear outer measure is unaffected if we drop all openness/closedness requirements on the intervals $I_k$ in the coverings.
(d) What about planar outer measure? Specifically, what if we demand that the rectangles be squares?
5. In analogy with intervals and rectangles, formulate a definition of an $n$-dimensional box $R$ in $n$-space.
(a) What is the natural definition of the volume of $R$?
(b) What should be the outer measure of $A \subset \mathbb{R}^n$?
(c) Check the outer measure axioms for your definition.
(d) Prove that the outer measure of a box equals its volume.
6. A line in the plane that is parallel to one of the coordinate axes is a planar zero set because it is the Cartesian product of a point (which is a linear zero set) and $\mathbb{R}$.
(a) What about a line that is not parallel to a coordinate axis?
(b) What is the situation in higher dimensions?
7. Prove that every closed set in $\mathbb{R}$ or $\mathbb{R}^n$ is a $G_\delta$-set. Does it follow at once that every open set is an $F_\sigma$-set? Why?
8. The hull of $A \subset \mathbb{R}$ or $A \subset \mathbb{R}^n$ is a smallest measurable set $H$ that contains $A$, where "smallest" means that for any measurable $E \supset A$, $H \setminus E$ is a zero set. The kernel of $A$ is a largest (in the corresponding sense) measurable set $K$ contained in $A$.
(a) Show that a hull of $A$ exists, that it can be taken to be a $G_\delta$-set, and that it is unique up to a zero set.
(b) Show that a kernel of $A$ exists, that it can be taken to be an $F_\sigma$-set, and that it is unique up to a zero set.
(c) Show that $A$ is measurable if and only if its hull and kernel differ by a zero set.
(d) If $H$ is the hull of $A$, show that $H^c$ is the kernel of $A^c$.
9. Complete the proofs of Theorems 11 and 14 in the unbounded case. [Hint: How can you break an unbounded set into countably many disjoint bounded pieces?]
10. Show that inner measure is translation invariant. How does it behave under dilation?
11. Countable intersections of $F_\sigma$-sets are called $F_{\sigma\delta}$-sets. Find an $F_{\sigma\delta}$-set that is neither a $G_\delta$-set nor an $F_\sigma$-set.
12. How does Theorem 13 fail for unbounded sets?
13. Generalize Theorem 14 to $\mathbb{R}^n$ as follows. If $A \subset \mathbb{R}^k$ and $B \subset \mathbb{R}^\ell$ are measurable show that $A \times B$ is a measurable subset of $\mathbb{R}^{k+\ell}$ and
$$m(A \times B) = mA \cdot mB.$$
[Hint: Generalize Lemma 28 to $n$ dimensions and imitate the proof of Theorem 14.]
*14. Generalize the geometric undergraph characterization of Riemann integrability that appears in Appendix C to the case in which the integrand is a function of several variables.
15. Observe that under Cartesian products, measurable and nonmeasurable sets act like odd and even integers respectively.
(a) Which theorem asserts that the product of measurable sets is measurable? (Odd times odd is odd.)
(b) Is the product of nonmeasurable sets nonmeasurable? (Even times even is even.)
(c) Is the product of a nonmeasurable set and a measurable set having non-zero measure always nonmeasurable? (Even times odd is even.)
(d) Zero sets are special. They correspond to the number zero, an odd number in this imperfect analogy. (Zero times anything is zero.)
16. The outer (Jordan) content of a set $A \subset \mathbb{R}$ is the infimum of the total lengths of finite coverings of $A$ by open intervals,
$$J^*A = \inf\Big\{\sum_{k=1}^n |I_k| : \text{each } I_k \text{ is an open interval and } A \subset \bigcup_{k=1}^n I_k\Big\}.$$
The corresponding definition of outer content in the plane or $n$-space substitutes boxes for intervals.
(a) Show that outer content satisfies
(i) $J^*(\emptyset) = 0$.
(ii) If $A \subset B$ then $J^*A \le J^*B$.
(iii) If $A = \bigcup_{k=1}^n A_k$ then $J^*A \le \sum_{k=1}^n J^*A_k$.
(b) (iii) is called finite subadditivity. Find an example of a set $A \subset [0, 1]$ such that $A = \bigcup_{k=1}^\infty A_k$, $J^*A_k = 0$ for all $k$, and $J^*A = 1$, which shows that finite subadditivity does not imply countable subadditivity, and that $J^*$ is not an outer measure.
(c) Why is it clear that $m^*A \le J^*A$, and that if $A$ is compact then $mA = J^*A$? What about the converse?
(d) Show that the requirement that the intervals in the covering of $A$ be open is irrelevant.
17. Prove that $J^*A = J^*\overline{A} = m\overline{A}$ where $\overline{A}$ is the closure of $A$.
18. If $A$, $B$ are compact show that
$$J^*(A \cup B) + J^*(A \cap B) = J^*A + J^*B.$$
[Hint: Is the formula true for Lebesgue measure? Use Exercise 17.]
19. The inner (Jordan) content of a subset $A$ of an interval $I$ is
$$J_*A = |I| - J^*(I \setminus A).$$
(a) Show that $J_*A = m(\text{interior } A)$.
(b) A set with equal inner and outer content is said to have content. Infer from Theorem 49 that a bounded function is Riemann integrable if and only if its undergraph has content.
20. Prove that measurability of $E \subset \mathbb{R}$ is equivalent to (a), to (b), and to (c).
(a) $E$ cleanly divides every measurable subset of $\mathbb{R}$.
(b) $E$ cleanly divides every open subset of $\mathbb{R}$.
(c) $E$ cleanly divides every interval $[a, b] \subset \mathbb{R}$.
(d) Generalize (a), (b), (c) to the plane and $\mathbb{R}^n$.
21. Prove Proposition 22.
22. Extend Proposition 23 by showing that pre-image measurability is equivalent to
(a) The pre-image of every $G_\delta$-set is measurable.
(b) The pre-image of every $F_\sigma$-set is measurable.
23. Show that every $S \subset \mathbb{R}$ with $m^*S > 0$ contains a nonmeasurable subset. [Hint: Theorem 48.]
24. Consider the Devil's ski slope homeomorphism $h : [0, 1] \to [0, 2]$ that sends the standard Cantor set $C$ to a fat Cantor set $F$ of measure 1.
(a) Does $C$ contain a nonmeasurable set?
(b) Does $F$ contain a nonmeasurable set?
(c) Is $h^{-1}$ measurable?
(d) If $E \subset [0, 1]$ is measurable, is its pre-image under $h^{-1}$ always measurable?
25. A function $f : M \to \mathbb{R}$ is upper semicontinuous if
$$\lim_{k \to \infty} x_k = x \quad \Rightarrow \quad \limsup_{k \to \infty} f(x_k) \le f(x).$$
($M$ can be any metric space.)
(a) Draw a graph of an upper semicontinuous function that is not continuous.
(b) Show that upper semicontinuity is equivalent to the requirement that for every open ray $(-\infty, a)$, $f^{\text{pre}}(-\infty, a)$ is an open set.
(c) Lower semicontinuity is defined similarly. Work backward from the fact that the negative of a lower semicontinuous function is upper semicontinuous to give the definition in terms of lim inf's.
26. Given a compact set $K \subset \mathbb{R} \times [0, \infty)$ define
$$g(x) = \begin{cases} \max\{y : (x, y) \in K\} & \text{if } K \cap (x \times \mathbb{R}) \ne \emptyset \\ 0 & \text{if } K \cap (x \times \mathbb{R}) = \emptyset. \end{cases}$$
Prove that $g$ is upper semicontinuous.
27. Complete the proof of Theorem 24 by removing the extra hypotheses that $f$ is bounded and defined on a compact interval.
Exercises *28. A
nonnegative linear combination of measurable characteristic func tions is a simple function. That is,
n
Q> (x) = I::C; x E; (x) i=l
where £1 , . . . , En are measurable sets and c 1 , . . . , Cn are nonnegative constants. We say that L c; X E; expresses ¢ . If the sets E; are disjoint and the coefficients c; are distinct and positive then the expression for ¢ is called canonical (a) Show that a canonical expression for a simple function exists and is unique. (b) It is obvious that the integral of ¢ = L c; X E; (the measure of its undergraph) equals L c; m E; if the expression is the canonical one. Prove carefully that this remains true for every expression of a simple function. (c) Infer from (b) that J ¢ + 1/J = J ¢ + J 1./J for simple functions. (d) Given measurable J, g : IR --+ [0. oo). show that there exist sequences of simple functions if>n t f and 1/ln t g as n --+ oo. (e) Combine (c) and (d) to revalidate linearity of the integral. 29. Assume that f : IR --+ [0, oo) is integrable. (a) Show that there exists a sequence of hi-infinite partitions Yn of the y-axis as described in Section 5 for which the Lebesgue upper sums are finite and converge to J f as n --+ oo . (b) Find an example o f an integrable function f : IR --+ [0, oo ) for which the upper Lebesgue sums do not converge to the integral as the mesh of the hi-infinite partition tends to zero. (The mesh is the supremum of the interval lengths I y; y; - I 1 . The difference between (a) and (b) is lim inf versus lim.) * * * (c) Is there a definition of the mesh of a hi-infinite partition of the positive y-axis such that the Lebesgue upper sums do converge to the integral as the mesh tends to 0? 30. Find a sequence of measurable functions fn : [0 , 1 ] --+ [0, 1 ] such that J fn --+ 0 as n --+ oo. but for no x E [0. 1 ] does fn (x) converge to a limit as n --+ oo. 3 1 . The total undergraph of f : IR --+ IR is the set [ (x . y) : y < f (x) }. Using undergraph pictures, show that the total undergraph is measurable if and only if the positive and negative parts of f are measurable. -
*32. (a) Assume that $A_n \uparrow A$ as $n \to \infty$ but do not assume that $A_n$ is measurable. Prove that $m^*A_n \to m^*A$ as $n \to \infty$. (This is upward measure continuity for outer measure. [Hint: Regularity gives $G_\delta$-sets $G_n \supset A_n$ with $mG_n = m^*A_n$. Can you make sure that $G_n$ increases as $n \to \infty$? If so, what can you say about $G = \bigcup G_n$?])
(b) Is upward measure continuity true for inner measure? (Proof or counter-example.)
(c) What about downward measure continuity of inner measure? Of outer measure?
**33. Prove that the outer measure of the Cartesian product of sets which are not necessarily measurable is the product of their outer measures. [Hint: Begin by assuming that $A$ is not necessarily measurable and $B$ is compact. Show that $A \times B$ has outer measure $m^*A \cdot mB$. It helps to note that for each $x \in A$, finitely many open rectangles in a covering of $A \times B$ suffice to cover $x \times B$.]
34. Check linearity of the integral directly for the two measurable characteristic functions, $f = \chi_F$ and $g = \chi_G$.
35. Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by
$$f(x, y) = \begin{cases} \dfrac{1}{y^2} & \text{if } 0 < x < y < 1 \\[1ex] \dfrac{-1}{x^2} & \text{if } 0 < y < x < 1 \\[1ex] 0 & \text{otherwise.} \end{cases}$$
(a) Show that the iterated integrals exist and are finite (calculate them), but the double integral does not exist.
(b) Explain why (a) does not contradict Corollary 31.
36. Do (A) or (B), not both.
(A) (a) State and prove Cavalieri's Principle in dimension 4.
(b) Formulate the Fubini-Tonelli theorem for triple integrals and use (a) to prove it.
(B) (a) State Cavalieri's Principle in dimension $n + 1$.
(b) State the Fubini-Tonelli Theorem for multiple integrals and use (a) to prove it.
How short can you make your answers?
*37. Here is a trick question: "Are there any functions for which the Riemann integral converges but the Lebesgue integral diverges?" Corollary 50 would suggest the answer is "no." Show, however, that the
improper Riemann integral $\int_0^1 f(x)\,dx$ of
$$f(x) = \begin{cases} \dfrac{\pi}{x} \sin \dfrac{\pi}{x} & \text{if } x \ne 0 \\[1ex] 0 & \text{if } x = 0 \end{cases}$$
exists (and is finite) while the Lebesgue integral is infinite. [Hint: Integration by parts gives
$$\int_a^1 \frac{\pi}{x} \sin \frac{\pi}{x}\,dx = x \cos \frac{\pi}{x}\,\Big|_a^1 - \int_a^1 \cos \frac{\pi}{x}\,dx.$$
Why does this converge to a limit as $a \to 0^+$? To check divergence of the Lebesgue integral, consider intervals $[1/(k+1), 1/k]$. On such an interval the sine of $\pi/x$ is everywhere positive or everywhere negative. The cosine is $+1$ at one endpoint and $-1$ at the other. Now use the integration by parts formula again and the fact that the harmonic series diverges.]
**38. A theory of integration more general than Lebesgue's is due to A. Denjoy. Rediscovered by Henstock and Kurzweil, it is described in R. McLeod's book, The Generalized Riemann Integral. The definition is deceptively simple. Let $f : [a, b] \to \mathbb{R}$ be given. The Denjoy integral of $f$, if it exists, is a real number $I$ such that for each $\epsilon > 0$ there is a function $\delta : [a, b] \to (0, \infty)$ and
$$\Big|\sum_{k=1}^n f(t_k)\,\Delta x_k - I\Big| < \epsilon$$
for all Riemann sums with $\Delta x_k < \delta(t_k)$, $k = 1, \ldots, n$. (McLeod refers to the function $\delta$ as a gauge and to the intermediate points $t_k$ as tags.)
(a) Verify that if we require the gauge $\delta(t)$ to be continuous then the Denjoy integral reduces to the Riemann integral.
(b) Verify that the function
$$f(x) = \begin{cases} \dfrac{1}{\sqrt{x}} & \text{if } 0 < x \le 1 \\[1ex] 100 & \text{if } x = 0 \end{cases}$$
has Denjoy integral 2. [Hint: Construct gauges $\delta(t)$ such that $\delta(0) > 0$ but $\lim_{t \to 0^+} \delta(t) = 0$.]
(c) Generalize (b) to include all functions defined on $[a, b]$ for which the improper Riemann integral is finite.
(d) Infer from (c) and Exercise 37 that some functions are Denjoy integrable but not Lebesgue integrable.
(e) Read McLeod's book to verify that
(i) Every nonnegative Denjoy integrable function is Lebesgue integrable, and the integrals are equal.
(ii) Every Lebesgue integrable function is Denjoy integrable, and the integrals are equal.
Infer that the difference between Lebesgue and Denjoy corresponds to the difference between absolutely and conditionally convergent series: if $f$ is Lebesgue integrable, so is $|f|$, but this is not true for Denjoy integrals.
39. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be a rotation and let $E \subset \mathbb{R}^2$ be measurable. Show that $m(TE) = mE$ by completing the following outline.
(a) The planar measure of a disc is determined solely by its radius.
(b) A rectangle $R$ is the union of a zero set and countably many disjoint discs.
(c) Under $T$, discs are sent to discs of equal radius, and a zero set is sent to a zero set.
(d) Infer from (a)-(c) that $TR$ is measurable and $m(TR) = |R|$.
(e) Infer from (d) that $TE$ is measurable and $m(TE) = mE$. [Hint: Regularity.]
(f) Generalize to $\mathbb{R}^n$.
Combined with translation invariance, this exercise shows that Lebesgue measure is invariant under all rigid motions of $\mathbb{R}^n$.
40. If $T : \mathbb{R}^n \to \mathbb{R}^n$ is a general linear transformation, use Exercise 39 and the polar form of $T$ explained in Appendix D of Chapter 5 to show that if $E \subset \mathbb{R}^n$ is measurable then $m(TE) = |\det T|\, mE$.
41. The balanced density of a measurable set $E$ at $x$ is the limit, if it exists, of the concentration of $E$ in $B$ where $B$ is a ball centered at $x$ that shrinks down to $x$. Write $\delta_{\text{balanced}}(x, E)$ to indicate the balanced density, and if it is 1, refer to $x$ as a balanced density point.
(a) Why is it immediate from the Lebesgue Density Theorem that almost every point of $E$ is a balanced density point?
(b) Given $\alpha \in [0, 1]$, construct an example of a measurable set $E \subset \mathbb{R}$ that contains a point $x$ with $\delta_{\text{balanced}}(x, E) = \alpha$.
(c) Given $\alpha \in [0, 1]$, construct an example of a measurable set $E \subset \mathbb{R}$ that contains a point $x$ with $\delta(x, E) = \alpha$.
**(d) Is there a single set that contains points of both types of density for all $a \in [0, 1]$?

42. Suppose that $P \subset \mathbb{R}$ has the property that for every interval $(a, b) \subset \mathbb{R}$,
$$\frac{m^*(P \cap (a, b))}{b - a} = \frac{1}{2}.$$
(a) Prove that $P$ is nonmeasurable. [Hint: This is a one-liner.]
(b) Is there anything special about $1/2$?

**43. Assume that the (unbalanced) density of $E$ exists at every point of $\mathbb{R}$, not merely at almost all of them. Prove that up to a zero set, $E = \mathbb{R}$ or $E = \emptyset$. (This is a kind of measure theoretic connectedness. Topological connectedness of $\mathbb{R}$ is useful in the proof.) Is this also true in $\mathbb{R}^n$?

44. Prove that any positive measure subset of $\mathbb{R}$ contains a positive measure Cantor set. Is the same true in $\mathbb{R}^n$?

*45. As indicated in Appendix D, $U \subset \mathbb{R}^n$ is $K$-quasi-round if it can be sandwiched between balls $B \subset U \subset B'$ such that $\mathrm{diam}\, B' \le K \, \mathrm{diam}\, B$.
(a) Prove that in the plane, squares and equilateral triangles are (uniformly) quasi-round. (The same $K$ works for all of them.)
(b) What about isosceles triangles?
(c) Formulate a Vitali Covering Lemma for a Vitali covering $\mathcal{V}$ of $A \subset \mathbb{R}^2$ by uniformly quasi-round sets instead of discs.
(d) Prove it.
(e) Generalize to $\mathbb{R}^n$.
(f) Consider the alternate definition of $K$-quasi-roundness of a measurable $V \subset \mathbb{R}^n$ as
$$\frac{(\mathrm{diam}\, V)^n}{mV} \le K.$$
What is the relation between the two definitions, and is the Vitali Covering Lemma true for coverings by uniformly quasi-round sets under the second definition? [Hint: Review the proof of the Vitali Covering Lemma.]

*46. Construct a Jordan curve (homeomorphic copy of the circle $\{(x, y) : x^2 + y^2 = 1\}$) in $\mathbb{R}^2$ that has positive planar measure. [Hint: Given a Cantor set in the plane, is there a Jordan curve that contains it? Is there a Cantor set in the plane with positive planar measure?]

47. [Speculative] Density seems to be a first order concept. To say that the density of $E$ at $x$ is 1 means that the concentration of $E$ in a ball
$B$ containing $x$ tends to 1 as $B \downarrow x$. That is,
$$\frac{m(B) - m(E \cap B)}{mB} \to 0.$$
But how fast can we hope it tends to 0? We could call $x$ a double density point if the ratio still tends to 0 when we square the denominator. Interior points of $E$ are double density points. Are such points common or scarce in a measurable set? What about balanced density points?

48. Let $E \subset \mathbb{R}^n$ be measurable, and let $x$ be a point of $\partial E$, the boundary of $E$. (That is, $x$ lies in both the closure of $E$ and the closure of $E^c$.)
(a) Is it true that if the density $\delta = \delta(x, E)$ exists then $0 < \delta < 1$? Proof or counter-example.
(b) Is it true that if $\delta = \delta(x, E)$ exists and $0 < \delta < 1$ then $x$ lies in $\partial E$? Proof or counter-example.
(c) What about balanced density?

49. Choose a pair of derivates other than the right max and left min. If $f$ is monotone write out a proof that these derivates are equal almost everywhere.

50. Construct a monotone function $f : [0, 1] \to \mathbb{R}$ whose discontinuity set is exactly the set $\mathbb{Q} \cap [0, 1]$, or prove that such a function does not exist.

*51. Construct a strictly monotone function whose derivative is equal to zero almost everywhere.

*52. In Section 9 the total variation of a function $f : [a, b] \to \mathbb{R}$ is defined as the supremum of all sums $\sum_{i=1}^{n} |\Delta_i f|$ where $P$ partitions $[a, b]$ into subintervals $[x_{i-1}, x_i]$ and $\Delta_i f = f(x_i) - f(x_{i-1})$. Assume that the total variation of $f$ is finite (i.e., $f$ is of bounded variation) and define
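$$T_a^x = \sup_P \Big\{ \sum_k |\Delta_k f| \Big\}$$
$$P_a^x = \sup_P \Big\{ \sum_k \Delta_k f : \Delta_k f \ge 0 \Big\}$$
$$N_a^x = -\inf_P \Big\{ \sum_k \Delta_k f : \Delta_k f \le 0 \Big\}$$
where $P$ ranges through all partitions of $[a, x]$. Prove that
(a) $f$ is bounded.
(b) $T_a^x$, $P_a^x$, $N_a^x$ are monotone nondecreasing functions of $x$.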
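(c) $T_a^x = P_a^x + N_a^x$.
(d) $f(x) = f(a) + P_a^x - N_a^x$.

**53. Assume that $f : [a, b] \to \mathbb{R}$ has bounded variation. The Banach indicatrix is the function
$$N_y = \# f^{-1}(y),$$
the number of points $x \in [a, b]$ with $f(x) = y$, which may be infinite.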
(a) Prove that $N_y < \infty$ for almost every $y$.
(b) Prove that $y \mapsto N_y$ is measurable.
(c) Prove that
$$T_a^b = \int_c^d N_y \, dy$$
where $c \le \min f$ and $\max f \le d$.

*54. A sequence of measurable functions $f_n : [a, b] \to \mathbb{R}$ converges nearly uniformly to $f$ as $n \to \infty$ if for each $\epsilon > 0$ there is an $\epsilon$-set $S \subset [a, b]$ (that is, $mS < \epsilon$) such that $f_n(x) \to f(x)$ uniformly as $x$ varies in $S^c$ and $n \to \infty$.
(a) Contrast nearly uniform convergence with almost uniform convergence, which means uniform convergence on the complement of a zero set.
(b) Egoroff's Theorem states that almost everywhere convergence on $[a, b]$ implies nearly uniform convergence. Prove it as follows.
(i) Let $f_n : [a, b] \to \mathbb{R}$ be a sequence of measurable functions that converges almost everywhere to a function $f$, and set
$$X(k, \ell) = \{x \in [a, b] : \forall n \ge k, \ |f_n(x) - f(x)| < 1/\ell\}.$$
Show that for each fixed $\ell$, $\bigcup_k X(k, \ell)$ equals $[a, b]$ modulo a zero set.
(ii) Given $\epsilon > 0$ show that there exists a sequence $k_1 < k_2 < \cdots$ such that for $X_\ell = X(k_\ell, \ell)$ we have $m(X_\ell^c) < \epsilon / 2^\ell$.
(iii) Infer that $X = \bigcap_\ell X_\ell$ has $m(X^c) < \epsilon$ and that $f_n$ converges uniformly to $f$ on $X$. (Avoid re-using the letter $\epsilon$ in your proof.)
(c) Your proof in (b) is also valid for functions defined on any bounded subset of Euclidean space, is it not?
(d) Why is Egoroff's Theorem false for functions defined on unbounded domains such as $\mathbb{R}$?

*55. Show that nearly uniform convergence is transitive in the following sense. Assume that $f_n$ converges nearly uniformly to $f$ as $n \to \infty$,
and that for each fixed $n$ there is a sequence $f_{n,k}$ which converges nearly uniformly to $f_n$ as $k \to \infty$. (All the functions are measurable and defined on $[a, b]$.)
(a) Show that there is a sequence $k(n) \to \infty$ as $n \to \infty$ such that $f_{n,k(n)}$ converges nearly uniformly to $f$ as $n \to \infty$. In symbols,
$$\mathop{\mathrm{nulim}}_{n \to \infty} \ \mathop{\mathrm{nulim}}_{k \to \infty} f_{n,k} = f \quad \Longrightarrow \quad \mathop{\mathrm{nulim}}_{n \to \infty} f_{n,k(n)} = f.$$
(b) Why does (a) remain true when almost everywhere convergence replaces nearly uniform convergence? [Hint: The answer is one word.]
(c) Is (a) true when $\mathbb{R}$ replaces $[a, b]$?
(d) Is (b) true when $\mathbb{R}$ replaces $[a, b]$?

56. Consider the continuous functions
$$f_{n,k}(x) = (\cos(\pi n! x))^k$$
for $k, n \in \mathbb{N}$ and $x \in \mathbb{R}$.
(a) Show that for each $x \in \mathbb{R}$,
$$\lim_{n \to \infty} \lim_{k \to \infty} f_{n,k}(x) = \chi_{\mathbb{Q}}(x),$$
the characteristic function of the rationals.
(b) Infer from Exercise 23 in Chapter 3 that there cannot exist a sequence $f_{n,k(n)}$ converging everywhere as $n \to \infty$.
(c) Interpret (b) to say that everywhere convergence cannot replace almost everywhere convergence or nearly uniform convergence in Exercise 55.

*57. Lusin's Theorem states that a measurable function $f : [a, b] \to \mathbb{R}$ is nearly uniformly continuous in the sense that for any $\epsilon > 0$ there is an $\epsilon$-set $S \subset [a, b]$ such that the restriction of $f$ to $S^c$ is uniformly continuous. Prove Lusin's Theorem as follows.
(a) Show that the characteristic function of an open interval is the nearly uniform limit of continuous functions.
(b) Infer from (a) that the same is true of the characteristic function of a measurable set. [Hint: Regularity and Exercise 55.]
(c) Infer from (b) that the same is true for a simple function.
(d) Use Egoroff's Theorem and Exercises 28, 55 to infer from (c) that the same is true for a nonnegative measurable function: it is the nearly uniform limit of a sequence of continuous functions.
(e) Given a measurable $f : [a, b] \to \mathbb{R}$ and given $\epsilon > 0$ infer from (d) that there exists a sequence of continuous functions $f_n : [a, b] \to \mathbb{R}$ and an open $\epsilon$-set $U \subset [a, b]$ such that $f_n \to f$ uniformly on $U^c$ as $n \to \infty$.
(f) Why does (e) imply that $f$ is nearly uniformly continuous?

58. At what stage, if any, does your reasoning in Exercise 57 make essential use of one-dimensionality? Explain.

59. Let $f : \mathbb{R} \to \mathbb{R}$ be measurable.
(a) Give an example of an $f$ for which the conclusion of Lusin's Theorem is false.
(b) Formulate the definition of nearly continuous and prove that $f$ is nearly continuous.
(c) Generalize to $\mathbb{R}^n$.

*60. Let $E$ be a measurable subset of the line having positive Lebesgue measure.
(a) Prove Steinhaus' Theorem: $E$ meets its $t$-translate for all sufficiently small $t$. [Hint: density points.]
(b) Formulate and prove the corresponding result in higher dimensions.
(c) Prove that despite the fact that the standard Cantor set has measure zero, it meets each of its $t$-translates for $|t| \le 1$.
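The claim in Exercise 60(c) can be explored numerically before attempting a proof. The following is a minimal sketch, not part of the text: it builds stage $n$ of the middle-thirds construction (the union $C_n$ of $2^n$ closed intervals of length $3^{-n}$) and checks whether $C_n$ meets its translate $C_n + t$. The function names `cantor_left_endpoints` and `stage_meets_translate` are hypothetical, chosen for this illustration. A positive answer at every finite stage is only evidence; passing to the Cantor set itself still requires an argument, for instance one based on nested compact sets.

```python
from fractions import Fraction
from itertools import product


def cantor_left_endpoints(n):
    """Left endpoints of the 2**n closed intervals of length 3**-n
    that make up stage n of the middle-thirds construction."""
    endpoints = [Fraction(0)]
    for level in range(1, n + 1):
        step = Fraction(2, 3**level)
        # Each interval splits into a left third (offset 0)
        # and a right third (offset 2 / 3**level).
        endpoints = [a + d for a in endpoints for d in (Fraction(0), step)]
    return endpoints


def stage_meets_translate(n, t):
    """Return True if stage n, C_n, intersects its translate C_n + t."""
    width = Fraction(1, 3**n)
    ends = cantor_left_endpoints(n)
    # [a, a + width] meets [b + t, b + t + width]
    # exactly when |a - (b + t)| <= width.
    return any(abs(a - b - t) <= width for a, b in product(ends, ends))


if __name__ == "__main__":
    # Sample translates t = j/50 in [-1, 1], plus one translate outside it.
    samples = [Fraction(j, 50) for j in range(-50, 51)]
    print(all(stage_meets_translate(4, t) for t in samples))
    print(stage_meets_translate(4, Fraction(3, 2)))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the boundary case $|t| = 1$ is not lost to floating-point rounding.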