AN INTRODUCTION TO
ANALYSIS
THIRD EDITION
WILLIAM R. WADE UNIVERSITY OF TENNESSEE
PEARSON EDUCATION INTERNATIONAL
If you purchased this book within the United States or Canada you should be aware that it has been wrongfully imported without the approval of the Publisher or the Author.
Executive Acquisition Editor: George Lobell
Executive Editor-in-Chief: Sally Yagan
Production Editor: Bayani Mendoza de Leon
Vice-President/Director of Production and Manufacturing: David W. Riccardi
Senior Managing Editor: Linda Mihatov Behrens
Executive Managing Editor: Kathleen Schiaparelli
Assistant Manufacturing Manager/Buyer: Michael Bell
Manufacturing Manager: Trudy Pisciotti
Marketing Manager: Halee Dinsey
Marketing Assistant: Rachel Beckman
Director of Creative Services: Paul Belfanti
Art Director: Jayne Conte
Cover Designer: Bruce Kenselaar
Editorial Assistant: Jennifer Brady
Cover Image: Rudolf Bauer, "Purple Center" (40.215), 1939, oil on canvas, 43 3/8 x 43 1/2 in. Courtesy of Gary Snyder Fine Art, NY
© 2004, 2000, 1995 Pearson Education, Inc. Pearson Prentice Hall Pearson Education, Inc. Upper Saddle River, New Jersey 07458 All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher. Pearson Prentice Hall® is a registered trademark of Pearson Education, Inc. Printed in the United States of America
10987654321
ISBN 0-13-124683-6 Pearson Education LTD., London Pearson Education Australia PTY, Limited, Sydney Pearson Education Singapore, Pte. Ltd Pearson Education North Asia Ltd, Hong Kong Pearson Education Canada, Ltd., Toronto Pearson Educacion de Mexico, S.A. de C.V. Pearson Education - Japan, Tokyo Pearson Education Malaysia, Pte. Ltd Pearson Education, Upper Saddle River, New Jersey
To Cherri, Peter, and Benjamin
Contents

Preface

Part I. ONE-DIMENSIONAL THEORY

1 The Real Number System
1.1 Ordered field axioms
1.2 Well-Ordering Principle
1.3 Completeness Axiom
1.4 Functions, countability, and the algebra of sets

2 Sequences in R
2.1 Limits of sequences
2.2 Limit theorems
2.3 Bolzano-Weierstrass Theorem
2.4 Cauchy sequences
e2.5 Limits supremum and infimum

3 Continuity on R
3.1 Two-sided limits
3.2 One-sided limits and limits at infinity
3.3 Continuity
3.4 Uniform continuity

4 Differentiability on R
4.1 The derivative
4.2 Differentiability theorems
4.3 Mean Value Theorem
4.4 Monotone functions and Inverse Function Theorem

5 Integrability on R
5.1 Riemann integral
5.2 Riemann sums
5.3 Fundamental Theorem of Calculus
5.4 Improper Riemann integration
e5.5 Functions of bounded variation
e5.6 Convex functions

6 Infinite Series of Real Numbers
6.1 Introduction
6.2 Series with nonnegative terms
6.3 Absolute convergence
6.4 Alternating series
e6.5 Estimation of series
e6.6 Additional tests

7 Infinite Series of Functions
7.1 Uniform convergence of sequences
7.2 Uniform convergence of series
7.3 Power series
7.4 Analytic functions
e7.5 Applications

Part II. MULTIDIMENSIONAL THEORY

8 Euclidean Spaces
8.1 Algebraic structure
8.2 Planes and linear transformations
8.3 Topology of R^n
8.4 Interior, closure and boundary

9 Convergence in R^n
9.1 Limits of sequences
9.2 Limits of functions
9.3 Continuous functions
e9.4 Compact sets
e9.5 Applications

10 Metric Spaces
10.1 Introduction
10.2 Limits of functions
10.3 Interior, closure and boundary
10.4 Compact sets
10.5 Connected sets
10.6 Continuous functions

11 Differentiability on R^n
11.1 Partial derivatives and partial integrals
11.2 Definition of differentiability
11.3 Derivatives, differentials, and tangent planes
11.4 Chain Rule
11.5 Mean Value Theorem and Taylor's Formula
11.6 Inverse Function Theorem
e11.7 Optimization

12 Integration on R^n
12.1 Jordan regions
12.2 Riemann integration on Jordan regions
12.3 Iterated integrals
12.4 Change of variables
e12.5 Partitions of unity
e12.6 Gamma function and volume

13 Fundamental Theorems of Vector Calculus
13.1 Curves
13.2 Oriented curves
13.3 Surfaces
13.4 Oriented surfaces
13.5 Theorems of Green and Gauss
13.6 Stokes's Theorem

e14 Fourier Series
e14.1 Introduction
e14.2 Summability of Fourier series
e14.3 Growth of Fourier coefficients
e14.4 Convergence of Fourier series
e14.5 Uniqueness

e15 Differentiable Manifolds
e15.1 Differential forms on R^n
e15.2 Differentiable manifolds
e15.3 Stokes's Theorem on manifolds

Appendices
A. Algebraic laws
B. Trigonometry
C. Matrices and determinants
D. Quadric surfaces
E. Vector calculus and physics
F. Equivalence relations

References
Answers and Hints to Exercises
Subject Index
Notation Index

Preface
This text provides a bridge from "sophomore" calculus to graduate courses that use analytic ideas, e.g., real and complex analysis, partial and ordinary differential equations, numerical analysis, fluid mechanics, and differential geometry. For a two-semester course, the first semester should end with Chapter 8. For a three-quarter course, the second quarter should begin in Chapter 6 and end somewhere in the middle of Chapter 11.

Our presentation is divided into two parts. The first half, Chapters 1 through 7 together with Appendices A and B, gradually introduces the central ideas of analysis in a one-dimensional setting. The second half, Chapters 8 through 15 together with Appendices C through F, covers multidimensional theory. More specifically, Chapter 1 introduces the real number system as a complete, ordered field; Chapters 2 through 5 cover calculus on the real line; and Chapters 6 and 7 discuss infinite series, including uniform and absolute convergence. The first two sections of Chapter 8 give a short introduction to the algebraic structure of $\mathbf{R}^n$, including the connection between linear functions and matrices. At that point instructors have two options. They can continue covering Chapters 8 and 9 to explore topology and convergence in the concrete Euclidean space setting, or they can cover these same concepts in the abstract metric space setting (Chapter 10). Since either of these options provides the necessary foundation for the rest of the book, instructors are free to choose the approach that they feel best suits their aims. With this background material out of the way, Chapters 11 through 13 develop the machinery and theory of vector calculus. Chapter 14 gives a short introduction to Fourier series, including summability and convergence of Fourier series, growth of Fourier coefficients, and uniqueness of trigonometric series. Chapter 15 gives a short introduction to differentiable manifolds which culminates in a proof of Stokes's Theorem on differentiable manifolds.
Separating the one-dimensional from the n-dimensional material is not the most efficient way to present the material, but it does have two advantages. The more abstract, geometric concepts can be postponed until students have been given a thorough introduction to analysis on the real line. Students have two chances to master some of the deeper ideas of analysis (e.g., convergence of sequences, limits of functions, and uniform continuity). We have made this text flexible in another way by including core material and enrichment material. The core material, occupying fewer than 384 pages, can be covered easily in a one-year course. The enrichment material is included for two reasons: Curious students can use it to delve deeper into the core material or as a jumping off place to pursue more general topics, and instructors can use it to supplement their course or to add variety from year to year. Enrichment material appears in enrichment sections, marked with a superscript e, or in core sections, where it is marked with an asterisk. Exercises that use enrichment material are also marked with an asterisk, and the material needed to solve them is cited in the Answers and Hints section. To make course planning easier, each enrichment section begins with a statement which indicates whether that section uses material from any other enrichment section. Since no core material depends on enrichment material, any of the latter can be skipped without loss in the integrity of the course. Most enrichment sections (5.5, 5.6, 6.5, 6.6, 7.5, 9.4, 11.6, 12.6, 14.1, 15.1) are independent and can be covered in any order after the core material that precedes them has been dealt with. Sections 9.5, 12.5, and 15.2 require 9.4, Section 15.3 requires 12.5, and Section 14.3 requires 5.5 only to establish Lemma 14.25. This result can easily be proved for continuously differentiable functions, thereby avoiding mention of functions of bounded variation. 
In particular, the key ideas in Section 14.3 can be covered without the background material from Section 5.5 anytime after finishing Chapter 7.

Since for many students this is the last (for some the only) place to see a rigorous development of vector calculus, we focus our attention on classical, nitty-gritty analysis. By avoiding abstract concepts such as vector spaces and the Lebesgue integral, we have room for a thorough, comprehensive introduction. We include sections on improper integration, the gamma function, Lagrange multipliers, the Inverse and Implicit Function Theorems, Green's Theorem, Gauss's Theorem, Stokes's Theorem, and a full account of the change-of-variables formula for multiple Riemann integrals.

We assume that the reader has completed a three-semester or four-quarter sequence in elementary calculus. Because many of our students now take their elementary calculus in high school (where theory may be almost nonexistent), we assume that the reader is familiar only with the mechanics of calculus, i.e., can differentiate, integrate, and graph simple functions of the form $y = f(x)$ or $z = f(x, y)$. We also assume that the reader has had an introductory course in linear algebra, i.e., can add, multiply, and take determinants of matrices with real entries, and is familiar with Cramer's Rule. (Appendix C, which contains an exposition of all definitions and theorems from linear algebra used in the text, can be used as review if the instructor deems it necessary.)
Always we emphasize the fact that the concepts and results of analysis are based on simple geometric considerations and on analogies with material already known to the student. The aim is to keep the course from looking like a collection of tricks and to share enough of the motivation behind the mathematics so that students are prepared to construct their own proofs when asked. We begin complicated proofs with a short paragraph (marked STRATEGY:) which shows why the proof works, e.g., the Archimedean Principle (Theorem 1.22), Density of Rationals (Theorem 1.24), Cauchy's Theorem (Theorem 2.29), Change of Variables in $\mathbf{R}$ (Theorem 5.34), Riemann's Theorem about rearrangements (Theorem 6.29), the Implicit Function Theorem (Theorem 11.47), the Borel Covering Lemma (Theorem 9.10), the fact that a curve is smooth when $\phi' \neq 0$ (Remark 13.10), and Stokes's Theorem on Manifolds (see page 563). We precede abstruse definitions or theorems with a short paragraph that describes, in simple terms, what behavior we are examining, and why, e.g., Cauchy sequences, one-sided limits, upper and lower Riemann sums, the Integral Test, Abel's Formula, uniform convergence, the total derivative, compact sets, differentiable curves and surfaces, smooth curves, and orientation equivalence. We include examples to show why each hypothesis of a major theorem is necessary, e.g., the Nested Interval Property, the Bolzano-Weierstrass Theorem, the Mean Value Theorem, the Heine-Borel Theorem, the Inverse Function Theorem, the existence of exact differentials, and Fubini's Theorem.

Each section contains a collection of exercises that range from very elementary (to be sure the student understands the concepts introduced in that section) to more challenging (to give the student practice in using these concepts to expand the theory). To minimize frustration, some of the more difficult exercises have several parts that serve as an outline to a solution of the problem.
To keep from producing students who know theory but cannot apply it, each set of exercises contains a mix of computational and theoretical assignments. (Exercises that play a prominent role later in the text are marked with a box. These exercises are an integral part of the course, and all of them should be assigned.) Since many students have difficulty reading and understanding mathematics, we have paid close attention to style and organization. We have consciously limited the vocabulary, kept notation consistent from chapter to chapter, and presented the proofs in a unified style. Individual sections are determined by subject matter, not by length of lecture, so that students can comprehend related results in a larger context. Examples and important remarks are numbered and labeled so that students can read the text in small chunks. (Many of these, included for the student's benefit, need not be covered in class.) Paragraphs are short and focused so that students are not overwhelmed by long-winded explanations. To help students discern between central and peripheral results, the word "Theorem" is used relatively sparingly; preliminary results and results that are used in only one section are called Remarks, Lemmas, and Examples. How does the third edition differ from the second? We have corrected a number of misprints. We have broken with tradition by stating definitions explicitly with an "if and only if." (How can we chide our students for using the converse of a result when it appears that we do so about half the time we apply a definition?) We have
continued to simplify and reorganize the presentation. Chapters 8 and 9 have been reworked completely. Chapter 8 now includes an introduction to both the algebraic and topological structure of Euclidean spaces but makes no premature mention of the total derivative. All material about limits (of sequences and functions) in a Euclidean space has been gathered together in Chapter 9. Chapter 11 now contains a complete exposition of the total derivative, gathering together material that had before been scattered in two different chapters. Chapter 12 has been further simplified by altering the definition of a Jordan region and by making the observation that the behavior of a function on the interior of a Jordan region $E$ is what really determines the value of its Riemann integral on $E$. This allows us to avoid beginning every third proof by choosing a grid so that the outer sum on the boundary of $E$ is small. Finally, the presentation of curves and surfaces in Chapter 13 has been further simplified as we continue to search for a minimal path to the theorems of Gauss, Green, and Stokes.

I wish to thank Mr. P.W. Wade and Professors S. Fridli, G.S. Jordan, Mefharet Kocatepe, J. Long, M.E. Mays, M.S. Osborne, P.W. Schaefer, F.E. Schroeck, and Ali Sinan Sertoz, who carefully read parts of the first edition and made many valuable suggestions and corrections. Also, I wish to express my gratitude to Ms. C.K. Wade for several lively discussions of a pedagogical nature, which helped shape the organization and presentation of this material, and to F. David Lesley of San Diego State University and William Yslas Velez of the University of Arizona, for their insightful review of the third edition text. Finally, I wish to make special mention of Professor Lewis Lum, who made many careful and perspicuous comments about style, elegance of presentation, and level of rigor which have found their way into this third edition.
If there remain any typographical errors, I plan to keep an up-to-date list at my Web site (http://www.math.utk.edu/~wade). If you find errors which are not listed at that site, I would appreciate your contacting me at the e-mail address below.

William R. Wade
[email protected]
Chapter 1
THE REAL NUMBER SYSTEM
You have already had several calculus courses in which you evaluated limits, differentiated functions, and computed integrals. You may even remember some of the major results of calculus, such as the Chain Rule, the Mean Value Theorem, and the Fundamental Theorem of Calculus. Although you are probably less familiar with multivariable calculus, you have taken partial derivatives, computed gradients, and evaluated certain line and surface integrals. In view of all this, you must be asking: Why another course in calculus? The answer to this question is twofold. Although some proofs may have been presented in earlier courses, it is unlikely that the subtler points (e.g., completeness of the real numbers, uniform continuity, and uniform convergence) were covered. Moreover, the skills you acquired were mostly computational; you were rarely asked to prove anything yourself. This course develops the theory of calculus carefully and rigorously from basic principles and gives you a chance to learn how to construct your own proofs. It also serves as an introduction to analysis, an important branch of mathematics that provides a foundation for numerical analysis, functional analysis, harmonic analysis, differential equations, differential geometry, real analysis, complex analysis, and many other areas of specialization within mathematics. Every rigorous study of mathematics begins with certain undefined concepts, primitive notions on which the theory is based, and certain postulates, properties that are assumed to be true and need no proof. Our study will be based on the primitive notions of set and real numbers and on four postulates (containing a total of 18 different properties), that will be introduced in the first three sections of this chapter.
1.1 ORDERED FIELD AXIOMS

In this section we explore the algebraic structure of the real number system. We shall use standard set theoretic notation. For example, $\emptyset$ represents the empty set (the set with no elements), $a \in A$ means that $a$ is an element of $A$, and $a \notin A$ means that $a$ is not an element of $A$. We can represent a given finite set in two ways. We can list its elements directly, or we can describe it using sentences or equations. For example, the set of solutions to the equation $x^2 = 1$ can be written as

\[ \{1, -1\} \quad\text{or}\quad \{x : x^2 = 1\}. \]

A set $A$ is said to be a subset of a set $B$ (notation: $A \subseteq B$) if and only if every element of $A$ is also an element of $B$. If $A$ is a subset of $B$ but there is at least one element $b \in B$ that does not belong to $A$, we shall call $A$ a proper subset of $B$ (notation: $A \subset B$). Two sets $A$ and $B$ are said to be equal (notation: $A = B$) if and only if $A \subseteq B$ and $B \subseteq A$. If $A$ and $B$ are not equal, we write $A \neq B$. A set $A$ is said to be nonempty if and only if $A \neq \emptyset$.

The union of two sets $A$ and $B$ (notation: $A \cup B$) is the set of elements $x$ such that $x$ belongs to $A$ or $B$ or both. The intersection of two sets $A$ and $B$ (notation: $A \cap B$) is the set of elements $x$ such that $x$ belongs to both $A$ and $B$. The complement of $B$ relative to $A$ (notation: $A \setminus B$, sometimes $B^c$ if $A$ is understood) is the set of elements $x$ such that $x$ belongs to $A$ but does not belong to $B$. For example,
\[ \{-1, 0, 1\} \cup \{1, 2\} = \{-1, 0, 1, 2\}, \qquad \{-1, 0, 1\} \cap \{1, 2\} = \{1\}, \]
\[ \{1, 2\} \setminus \{-1, 0, 1\} = \{2\} \quad\text{and}\quad \{-1, 0, 1\} \setminus \{1, 2\} = \{-1, 0\}. \]

Let $X$ and $Y$ be sets. The Cartesian product of $X$ and $Y$ is the set of ordered pairs defined by

\[ X \times Y := \{(x, y) : x \in X,\ y \in Y\}. \]

(The symbol $:=$ means "equal by definition" or "is defined to be.") Two points $(x, y), (z, w) \in X \times Y$ are said to be equal if and only if $x = z$ and $y = w$.

Let $X$ and $Y$ be sets. A relation on $X \times Y$ is any subset of $X \times Y$. Let $\mathcal{R}$ be a relation on $X \times Y$. The domain of $\mathcal{R}$ is the collection of $x \in X$ such that $(x, y)$ belongs to $\mathcal{R}$ for some $y \in Y$. When $(x, y) \in \mathcal{R}$, we shall frequently write $x \mathcal{R} y$. A function $f$ from $X$ into $Y$ (notation: $f : X \to Y$) is a relation on $X \times Y$ whose domain is $X$ (notation: $\mathrm{Dom}(f) := X$) such that for each $x \in X$ there is one and only one $y \in Y$ that satisfies $(x, y) \in f$. In this case we say that $f$ is defined on $X$, and call $y$ the value of $f$ at $x$ (notation: $y = f(x)$ or $f : x \mapsto y$). Notice that by the definition of equality of ordered pairs, two functions $f$ and $g$ are equal if and only if they have the same domain and values; i.e., $\mathrm{Dom}(f) = \mathrm{Dom}(g)$ and $f(x) = g(x)$ for all $x \in \mathrm{Dom}(f)$.

We shall denote the set of real numbers by $\mathbf{R}$. We shall assume that $\mathbf{R}$ is a field, i.e., that $\mathbf{R}$ satisfies the following postulate.

POSTULATE 1. [FIELD AXIOMS]. There are functions $+$ and $\cdot$, defined on $\mathbf{R}^2 := \mathbf{R} \times \mathbf{R}$, that satisfy the following properties for every $a, b, c \in \mathbf{R}$:

Closure Properties. $a + b$ and $a \cdot b$ belong to $\mathbf{R}$.

Associative Properties. $a + (b + c) = (a + b) + c$ and $a \cdot (b \cdot c) = (a \cdot b) \cdot c$.

Commutative Properties. $a + b = b + a$ and $a \cdot b = b \cdot a$.
Distributive Law. $a \cdot (b + c) = a \cdot b + a \cdot c$.

Existence of the Additive Identity. There is a unique element $0 \in \mathbf{R}$ such that $0 + a = a$ for all $a \in \mathbf{R}$.

Existence of the Multiplicative Identity. There is a unique element $1 \in \mathbf{R}$ such that $1 \neq 0$ and $1 \cdot a = a$ for all $a \in \mathbf{R}$.

Existence of Additive Inverses. For every $x \in \mathbf{R}$ there is a unique element $-x \in \mathbf{R}$ such that $x + (-x) = 0$.

Existence of Multiplicative Inverses. For every $x \in \mathbf{R} \setminus \{0\}$ there is a unique element $x^{-1} \in \mathbf{R}$ such that $x \cdot x^{-1} = 1$.

We note in passing that the word "unique" can be dropped from the statement of Postulate 1 (see Appendix A).

We shall frequently denote $a + (-b)$ by $a - b$, $a \cdot b$ by $ab$, $a^{-1}$ by $\frac{1}{a}$ or $1/a$, and $a \cdot b^{-1}$ by $\frac{a}{b}$ or $a/b$. Notice that by the existence of additive and multiplicative inverses, the equation $x + a = 0$ can be solved for each $a \in \mathbf{R}$, and the equation $ax = 1$ can be solved for each $a \in \mathbf{R}$ provided that $a \neq 0$.

From these few properties (i.e., from Postulate 1), one can derive all the usual algebraic laws of real numbers, including the following:
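\[ (-1)^2 = 1, \tag{1} \]
\[ 0 \cdot a = 0, \qquad -a = (-1) \cdot a, \qquad -(-a) = a, \qquad a \in \mathbf{R}, \tag{2} \]
\[ -(a - b) = b - a, \qquad a, b \in \mathbf{R}, \tag{3} \]
and
\[ a, b \in \mathbf{R} \ \text{ and } \ ab = 0 \quad\text{imply}\quad a = 0 \ \text{ or } \ b = 0. \tag{4} \]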
We want to keep our attention sharply focused on analysis. Since the proofs of algebraic laws like these lie more in algebra than analysis (see Appendix A), we will not present them here. In fact, with the exception of the absolute value and the Binomial Formula, we will use all material usually presented in a high school algebra course (including the quadratic formula and graphs of the conic sections) without further explanation as the need arises.

Postulate 1 is sufficient to derive all algebraic laws of $\mathbf{R}$, but it does not completely describe the real number system. The set of real numbers also has an order relation, i.e., a concept of "less than."
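Before turning to the order relation, it can be reassuring to spot-check laws (1) through (4) numerically. The following is a minimal sketch, not a proof: it assumes Python's `fractions.Fraction` as an exact model of a few rational numbers, and the names `samples` and `check_laws` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

# A small grid of rationals; Fraction arithmetic is exact, so == is reliable.
samples = [Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)]

def check_laws(values):
    """Return True if laws (1)-(4) hold for every a, b drawn from values."""
    one = Fraction(1)
    if (-one) ** 2 != 1:                                   # law (1)
        return False
    for a in values:
        if 0 * a != 0 or -a != (-one) * a or -(-a) != a:   # law (2)
            return False
    for a, b in product(values, repeat=2):
        if -(a - b) != b - a:                              # law (3)
            return False
        if a * b == 0 and a != 0 and b != 0:               # law (4)
            return False
    return True

print(check_laws(samples))
```

A finite check of this kind can only expose a counterexample; the laws themselves follow from Postulate 1 as described in Appendix A.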
POSTULATE 2. [ORDER AXIOMS]. There is a relation $<$ on $\mathbf{R} \times \mathbf{R}$ that has the following properties:

Trichotomy Property. Given $a, b \in \mathbf{R}$, one and only one of the following statements holds: $a < b$, $b < a$, or $a = b$.

Transitive Property. $a < b$ and $b < c$ imply $a < c$.

Additive Property. $a < b$ and $c \in \mathbf{R}$ imply $a + c < b + c$.

Multiplicative Properties. $a < b$ and $c > 0$ imply $ac < bc$, and $a < b$ and $c < 0$ imply $bc < ac$.
By $b > a$ we shall mean $a < b$. By $a \le b$ and $b \ge a$ we shall mean $a < b$ or $a = b$. If $a < b$ and $b < c$, we shall write $a < b < c$. We shall call a number $a \in \mathbf{R}$ nonnegative if $a \ge 0$ and positive if $a > 0$.

Postulate 2 has a slightly simpler formulation using the set of positive elements as a primitive concept (see Exercise 11). We have introduced Postulate 2 as above because these are the properties we use most often.

The real number system $\mathbf{R}$ contains certain special subsets: the set of natural numbers

\[ \mathbf{N} := \{1, 2, \ldots\}, \]

obtained by beginning with 1 and successively adding 1's to form $2 := 1 + 1$, $3 := 2 + 1$, etc.; the set of integers

\[ \mathbf{Z} := \{\ldots, -2, -1, 0, 1, 2, \ldots\} \]

(Zahlen is German for number); the set of rationals (or fractions or quotients)

\[ \mathbf{Q} := \left\{ \frac{m}{n} : m, n \in \mathbf{Z} \text{ and } n \neq 0 \right\}; \]

and the set of irrationals $\mathbf{Q}^c = \mathbf{R} \setminus \mathbf{Q}$. Equality in $\mathbf{Q}$ is defined by

\[ \frac{m}{n} = \frac{p}{q} \quad\text{if and only if}\quad mq = np. \]

Recall that each of the sets $\mathbf{N}$, $\mathbf{Z}$, $\mathbf{Q}$, and $\mathbf{R}$ is a proper subset of the next; i.e.,

\[ \mathbf{N} \subset \mathbf{Z} \subset \mathbf{Q} \subset \mathbf{R}. \]

For example, every rational is a real number (because $m/n := mn^{-1}$ is a real number by Postulate 1), but $\sqrt{2}$ is an irrational.

Since we did not really define $\mathbf{N}$ and $\mathbf{Z}$, we must make certain assumptions about them. If you are interested in the definitions and proofs, see Appendix A.
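The definition of equality in the rationals, that two quotients agree exactly when their cross products agree, uses only integer arithmetic. The following sketch illustrates this under the assumption that integers are modeled by Python's `int`; the function name `equal_in_Q` is hypothetical, and the last line merely cross-checks the test against Python's built-in exact rational type.

```python
from fractions import Fraction

def equal_in_Q(m, n, p, q):
    """Decide whether m/n = p/q by the cross-multiplication test mq = np."""
    if n == 0 or q == 0:
        raise ValueError("denominators must be nonzero")
    return m * q == n * p

print(equal_in_Q(1, 2, 3, 6))     # 1*6 equals 2*3
print(equal_in_Q(1, 2, 2, 3))     # 1*3 differs from 2*2
print(equal_in_Q(-2, 4, 1, -2))   # (-2)*(-2) equals 4*1

# Cross-check one case against Python's exact rational type.
assert equal_in_Q(3, -9, -1, 3) == (Fraction(3, -9) == Fraction(-1, 3))
```

Notice that the test never divides, so it makes sense before multiplicative inverses of integers are available.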
1.1 Remark. We assume that $\mathbf{N}$ and $\mathbf{Z}$ satisfy the following properties.

(i) Given $n \in \mathbf{Z}$, one and only one of the following statements holds: $n \in \mathbf{N}$, $-n \in \mathbf{N}$, or $n = 0$.
(ii) If $n \in \mathbf{N}$, then $n + 1 \in \mathbf{N}$ and $n \ge 1$.
(iii) If $n \in \mathbf{N}$ and $n \neq 1$, then $n - 1 \in \mathbf{N}$.
(iv) If $n \in \mathbf{Z}$ and $n > 0$, then $n \in \mathbf{N}$.

Using these properties and induction, we can prove that $\mathbf{N}$ and $\mathbf{Z}$ are closed under addition and multiplication (see Remark 1.13). We can also prove that $\mathbf{Q}$ satisfies Postulate 1 (see Exercise 6).

We notice in passing that none of the other special subsets of $\mathbf{R}$ satisfies Postulate 1. $\mathbf{N}$ satisfies all but three of the properties in Postulate 1: $\mathbf{N}$ has no additive identity (since $0 \notin \mathbf{N}$), $\mathbf{N}$ has no additive inverses (e.g., $-1 \notin \mathbf{N}$), and only one of the nonzero elements of $\mathbf{N}$ (namely, 1) has a multiplicative inverse. $\mathbf{Z}$ satisfies all but one of the properties in Postulate 1: Only two nonzero elements of $\mathbf{Z}$ have multiplicative inverses (namely, 1 and $-1$). $\mathbf{Q}^c$ satisfies all but four of the properties in Postulate 1: $\mathbf{Q}^c$ does not have an additive identity (since $0 \notin \mathbf{R} \setminus \mathbf{Q}$), does not have a multiplicative identity (since $1 \notin \mathbf{R} \setminus \mathbf{Q}$), and does not satisfy either closure property. Indeed, since $\sqrt{2}$ is irrational, the sum of irrationals may be rational ($\sqrt{2} + (-\sqrt{2}) = 0$) and the product of irrationals may be rational ($\sqrt{2} \cdot \sqrt{2} = 2$).

Notice that any subset of $\mathbf{R}$ satisfies Postulate 2. Thus $\mathbf{Q}$ satisfies both Postulates 1 and 2. The remaining two postulates, introduced in Sections 1.2 and 1.3, identify properties that $\mathbf{Q}$ does not possess. In particular, these four postulates distinguish $\mathbf{R}$ from each of its special subsets $\mathbf{N}$, $\mathbf{Z}$, $\mathbf{Q}$, and $\mathbf{Q}^c$. These postulates actually characterize $\mathbf{R}$; i.e., $\mathbf{R}$ is the only set that satisfies Postulates 1 through 4. [Such a set is called a complete Archimedean ordered field. We may as well admit a certain arbitrariness in choosing this approach. $\mathbf{R}$ has been axiomized in at least five other ways (e.g., as a one-dimensional continuum or as a set of binary decimals with certain arithmetic operations).
The decision to present $\mathbf{R}$ using Postulates 1 through 4 is based partly on economy and partly on personal taste.]

Using all four postulates, one can define a function $f : x \mapsto x^\alpha$ for any $x > 0$ and $\alpha \in \mathbf{R}$ (see Exercise 5, p. 134) so that the following properties hold: $x^0 = 1$, $x^\alpha > 0$, $x^\alpha \cdot x^\beta = x^{\alpha + \beta}$, and $(x^\alpha)^\beta = x^{\alpha \cdot \beta}$ for all $\alpha, \beta \in \mathbf{R}$ and all $x > 0$, and if $\alpha = n \in \mathbf{N}$, then $x^n = x \cdot \ldots \cdot x$ (there are $n$ factors here). We also define $0^\alpha := 0$ for $\alpha > 0$. [The symbol $0^0$ is left undefined because it is indeterminate (see Example 4.22).] Because it would be impractical to wait until Chapter 5 to use $x^\alpha$ for examples, we shall accept these properties as given and use them as the need arises.

We shall also accept, as given, the trigonometric functions (whose formulas are) represented by $\sin x$, $\cos x$, $\tan x$, $\cot x$, $\sec x$, $\csc x$, the exponential function $e^x$ and its inverse, the natural logarithm

\[ \log x := \int_1^x \frac{dt}{t}, \]

defined and real-valued for each $x \in (0, \infty)$. [Although this last function is denoted by $\ln x$ in elementary calculus texts, most analysts denote it, as we did just now, by $\log x$. We will follow this practice throughout this text.]

Notice that $x^\alpha \cdot x^{-\alpha} = x^{\alpha - \alpha} = x^0 = 1$. By the uniqueness of multiplicative inverses, it follows that $x^{-\alpha} = (x^\alpha)^{-1} = 1/x^\alpha$ for all $\alpha \in \mathbf{R}$ and $x > 0$. If $\alpha = 1/m$ for some $m \in \mathbf{N}$, we shall denote $x^\alpha$ by $\sqrt[m]{x}$. (We shall also write $\sqrt{x}$ for $\sqrt[2]{x}$.) Hence

\[ x^{n/m} = \left(\sqrt[m]{x}\right)^n = \sqrt[m]{x^n} \]
for all $m \in \mathbf{N}$, $n \in \mathbf{Z}$, and $x > 0$. In particular, $(\sqrt[m]{x})^m = \sqrt[m]{x^m} = x$ for all $x > 0$ and $m \in \mathbf{N}$. Notice that since $x^\alpha$ is positive when $x > 0$, $\sqrt[m]{x} \ge 0$ for all $x \ge 0$ and $m \in \mathbf{N}$.

Postulates 1 and 2 can be used to derive all the usual algebraic laws regarding real numbers and inequalities (e.g., see implications (5) through (9)). Since arguments based on inequalities are of fundamental importance to analysis, we begin to supply details of proofs at this stage.

What is a proof? Every mathematical result (for us this includes examples, remarks, lemmas, and theorems) has hypotheses and a conclusion. There are three main methods of proof: mathematical induction, direct deduction, and contradiction.

Mathematical induction, a special method for proving statements that depend on positive integers, will be covered in Section 1.2.

To construct a deductive proof we assume the hypotheses to be true and proceed step by step to the conclusion. Each step is justified by a hypothesis, a definition, a postulate, or a mathematical result that has already been proved. (Actually, this is usually the way we write a proof. When constructing your own proofs, you may find it helpful to work forward from the hypotheses as far as you can and then work backward from the conclusion, trying to meet in the middle.)

To construct a proof by contradiction, we assume the hypotheses to be true, the conclusion to be false, and work step by step deductively until a contradiction occurs; i.e., a statement that is obviously false or that is contrary to the assumptions made. At this point the proof by contradiction is complete. The phrase "suppose to the contrary" always indicates a proof by contradiction (e.g., see the proof of Theorem 1.9).

Here are some examples of deductive proofs. (Note: The symbol $\blacksquare$ indicates that the proof or solution is complete.)

1.2 Example. If $a \in \mathbf{R}$, prove that

\[ a \neq 0 \quad\text{implies}\quad a^2 > 0. \tag{5} \]

In particular, $-1 < 0 < 1$.

PROOF. Suppose that $a \neq 0$. By the Trichotomy Property, either $a > 0$ or $a < 0$.

Case 1. $a > 0$. Multiply both sides of this inequality by $a$. By the first Multiplicative Property, we obtain $a^2 = a \cdot a > 0 \cdot a = 0$.
Case 2. $a < 0$. Multiply both sides of this inequality by $a$. Since $a < 0$, it follows from the second Multiplicative Property that $a^2 = a \cdot a > 0 \cdot a = 0$. This proves $a^2 > 0$ when $a \neq 0$.

Since $1 \neq 0$, it follows that $1 = 1^2 > 0$. Adding $-1$ to both sides of this inequality, we conclude that $0 = 1 - 1 > 0 - 1 = -1$. $\blacksquare$

1.3 Example. If $a \in \mathbf{R}$, prove that

\[ 0 < a < 1 \quad\text{implies}\quad 0 < a^2 < a \qquad\text{and}\qquad a > 1 \quad\text{implies}\quad a^2 > a. \tag{6} \]

PROOF. Suppose that $0 < a < 1$. Multiply both sides of this inequality by $a$. By the first Multiplicative Property,

\[ 0 = 0 \cdot a < a \cdot a = a^2 < 1 \cdot a = a. \]

On the other hand, if $a > 1$, then $a > 0$ by Example 1.2 and the Transitive Property. Multiplying $a > 1$ by $a$, we conclude that $a^2 = a \cdot a > 1 \cdot a = a$. $\blacksquare$

Similarly (see Exercise 4), we can prove that
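\[ 0 \le a < b \ \text{ and } \ 0 \le c < d \quad\text{imply}\quad ac < bd, \tag{7} \]
\[ 0 \le a < b \quad\text{implies}\quad 0 \le a^2 < b^2 \ \text{ and } \ 0 \le \sqrt{a} < \sqrt{b}, \tag{8} \]
and
\[ 0 < a < b \quad\text{implies}\quad \frac{1}{a} > \frac{1}{b} > 0. \tag{9} \]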
Although it may seem both pedantic and unnecessary to include proofs of such well-known (yes, perhaps even obvious) laws, we include them here for several reasons. We want this book to be reasonably self-contained, because this will make it easier for you to begin to construct your own proofs. We want the first proofs you see in this book to be easily understood, because they deal with familiar properties that are unobscured by new concepts. And we want to form a habit of proving all statements, even "obvious" statements like these. The reason for this hard headed approach is that some "obvious" statements are false. For example, some students think it obvious that any continuous function must be differentiable at some point. Others think it obvious that if every rational in [0, 1] is covered by a small interval, then the sum of the lengths of those intervals must exceed 1. We shall see that both these statements, and many others equally "obvious," are false (e.g., see Theorem 7.62 and Remark 9.42). In particular, we harbor a skepticism that demands proofs of all statements, even the "obvious" ones. What, then, are you allowed to use when solving the exercises? You may use any property of real numbers (e.g., 2 + 3 = 5, 2 < 7, or v'2 is irrational) without reference or proof. You may use any algebraic property of real numbers involving equal signs (e.g., (x + y)2 = x 2 + 2xy + y2 or (x + y)(x - y) = x 2 - y2) and the
techniques of calculus to find local maxima or minima of a given function without reference or proof. After completing the exercises in Section 1.2, you may also use any algebraic property of real numbers involving inequalities (e.g., $0 < a < b$ implies $0 < a^x < b^x$ for all $x > 0$) without reference or proof. (To illustrate how to use the Well-Ordering Principle and the Completeness Axiom, we have included some proofs of properties like these; e.g., see Remarks 1.13, 1.25 and 1.26.)

Much of analysis deals with estimation (of error, of growth, of volume, etc.) in which inequalities and the following concept play a central role.

1.4 DEFINITION. The absolute value of a number $a \in \mathbf{R}$ is the number

$$|a| := \begin{cases} a & a \ge 0 \\ -a & a < 0. \end{cases}$$
When proving results about the absolute value, we can always break the proof up into several cases, depending on when the parameters are positive, negative, or zero. Here is a typical example.

1.5 Remark. The absolute value is multiplicative; i.e., $|ab| = |a||b|$ for all $a, b \in \mathbf{R}$.

PROOF. We consider four cases.

Case 1. $a = 0$ or $b = 0$. Then $ab = 0$, so by definition, $|ab| = 0 = |a||b|$.

Case 2. $a > 0$ and $b > 0$. By the first Multiplicative Property, $ab > 0 \cdot b = 0$. Hence by definition, $|ab| = ab = |a||b|$.

Case 3. $a > 0$ and $b < 0$, or, $b > 0$ and $a < 0$. By symmetry, we may suppose that $a > 0$ and $b < 0$. (That is, if we can prove it for $a > 0$ and $b < 0$, then by reversing the roles of $a$ and $b$, we can prove it for $a < 0$ and $b > 0$.) By the second Multiplicative Property, $ab < 0$. Hence by Definition 1.4, (2), and commutativity,

$$|ab| = -(ab) = (-1)(ab) = a((-1)b) = a(-b) = |a||b|.$$

Case 4. $a < 0$ and $b < 0$. By the second Multiplicative Property, $ab > 0$. Hence by Definition 1.4,

$$|ab| = ab = (-1)^2(ab) = (-a)(-b) = |a||b|.$$ ∎
We shall soon see that there are more efficient ways to prove results about absolute values than breaking the argument into cases.

The following result is useful when solving inequalities involving absolute value signs.

1.6 THEOREM. Let $a \in \mathbf{R}$ and $M \ge 0$. Then $|a| \le M$ if and only if $-M \le a \le M$.

PROOF. Suppose first that $|a| \le M$. Multiplying by $-1$, we also have $-|a| \ge -M$.

Case 1. $a \ge 0$. Then by Definition 1.4 and hypothesis,

$$-M \le 0 \le a = |a| \le M.$$
Case 2. $a < 0$. Then

$$-M \le -|a| = a < 0 \le M.$$

This proves that $-M \le a \le M$ in either case.

Conversely, if $-M \le a \le M$, then $a \le M$ and $-M \le a$. Multiplying the second inequality by $-1$, we have $-a \le M$. Consequently, $|a| = a \le M$ if $a \ge 0$, and $|a| = -a \le M$ if $a < 0$. ∎

Note: In a similar way we can prove that $|a| < M$ if and only if $-M < a < M$.

Here is another useful result about absolute values.

1.7 THEOREM. The absolute value satisfies the following three properties.
(i) [POSITIVE DEFINITE] For all $a \in \mathbf{R}$, $|a| \ge 0$ with $|a| = 0$ if and only if $a = 0$.
(ii) [SYMMETRIC] For all $a, b \in \mathbf{R}$, $|a - b| = |b - a|$.
(iii) [TRIANGLE INEQUALITIES] For all $a, b \in \mathbf{R}$,

$$|a + b| \le |a| + |b|, \qquad |a - b| \ge |a| - |b|, \qquad \text{and} \qquad \bigl|\,|a| - |b|\,\bigr| \le |a - b|.$$
PROOF. (i) If $a \ge 0$, then $|a| = a \ge 0$. If $a < 0$, then by Definition 1.4 and the second Multiplicative Property, $|a| = -a = (-1)a > 0$. Thus $|a| \ge 0$ for all $a \in \mathbf{R}$.

If $|a| = 0$, then by definition $\pm a = 0$. Hence $a = 1 \cdot a = (\pm 1)^2 a = (\pm 1)(\pm a) = (\pm 1) \cdot 0 = 0$. Thus $|a| = 0$ implies that $a = 0$. Conversely, $|0| = 0$ by definition.

(ii) By Remark 1.5, $|a - b| = |-1||b - a| = |b - a|$.

(iii) To prove the first inequality, notice that $|x| \le |x|$ holds for any $x \in \mathbf{R}$. Thus Theorem 1.6 implies that $-|a| \le a \le |a|$ and $-|b| \le b \le |b|$. Adding these inequalities, we obtain

$$-(|a| + |b|) \le a + b \le |a| + |b|.$$

Hence by Theorem 1.6 again, $|a + b| \le |a| + |b|$.

The second inequality follows immediately from the first, since

$$|a| - |b| = |a - b + b| - |b| \le |a - b| + |b| - |b| = |a - b|.$$

To prove the third inequality, notice that by Theorem 1.6 we need to verify

$$-|a - b| \le |a| - |b| \le |a - b|.$$

The right-hand inequality has already been proved. Applying it with the roles of $a$ and $b$ interchanged, we have by Remark 1.5 that

$$|b| - |a| \le |b - a| = |-1||a - b| = |a - b|.$$

Multiplying this last inequality by $-1$, we conclude that

$$-|a - b| \le |a| - |b|.$$ ∎
Some students mistakenly mix absolute values and the Additive Property to conclude that $b < c$ implies that $|a + b| < |a + c|$. It is important from the beginning to recognize that this implication is false unless both $a + b$ and $a + c$ are nonnegative. For example, if $a = 1$, $b = -5$, and $c = -1$, then $b < c$ but $|a + b| = 4$ is not less than $|a + c| = 0$. A correct way to estimate using absolute value signs usually involves one of the triangle inequalities.
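As a quick numerical companion to this warning, the following Python sketch first reproduces the counterexample above and then spot-checks the three triangle inequalities of Theorem 1.7(iii) on randomly chosen rationals. The helper name `triangle_inequalities_hold`, the seed, and the sampling range are illustrative choices, not part of the text; a spot-check of this kind is evidence, not a proof.

```python
from fractions import Fraction
import random

# The counterexample from the text: b < c does not give |a + b| < |a + c|.
a, b, c = 1, -5, -1
assert b < c
assert not (abs(a + b) < abs(a + c))  # here |a + b| = 4 and |a + c| = 0


def triangle_inequalities_hold(a, b):
    """Check the three inequalities of Theorem 1.7(iii) for one pair (a, b)."""
    return (abs(a + b) <= abs(a) + abs(b)
            and abs(a - b) >= abs(a) - abs(b)
            and abs(abs(a) - abs(b)) <= abs(a - b))


# Spot-check on random rationals (exact arithmetic, so no rounding issues).
rng = random.Random(0)  # fixed seed: an arbitrary choice for reproducibility
samples = [(Fraction(rng.randint(-50, 50), rng.randint(1, 9)),
            Fraction(rng.randint(-50, 50), rng.randint(1, 9)))
           for _ in range(1000)]
all_hold = all(triangle_inequalities_hold(x, y) for x, y in samples)
```

Using `Fraction` rather than floating-point numbers keeps every comparison exact, so a failure would indicate a genuine counterexample rather than a rounding artifact.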
1.8 Example. Prove that if $-2 < x < 1$, then $|x^2 + x| < 6$.

PROOF. By hypothesis, $|x| < 2$. Hence by the triangle inequality and Remark 1.5,

$$|x^2 + x| \le |x|^2 + |x| < 4 + 2 = 6.$$ ∎
The following result (which is equivalent to the Trichotomy Property) will be used many times in this and subsequent chapters.
1.9 THEOREM. Let $x, y, a \in \mathbf{R}$.
(i) $x < y + \varepsilon$ for all $\varepsilon > 0$ if and only if $x \le y$.
(ii) $x > y - \varepsilon$ for all $\varepsilon > 0$ if and only if $x \ge y$.
(iii) $|a| < \varepsilon$ for all $\varepsilon > 0$ if and only if $a = 0$.

PROOF. (i) Suppose to the contrary that $x < y + \varepsilon$ for all $\varepsilon > 0$ but $x > y$. Set $\varepsilon_0 = x - y > 0$ and observe that $y + \varepsilon_0 = x$. Hence by the Trichotomy Property, $y + \varepsilon_0$ cannot be greater than $x$. This contradicts the hypothesis for $\varepsilon = \varepsilon_0$. Thus $x \le y$.

Conversely, suppose that $x \le y$ and $\varepsilon > 0$ is given. Either $x < y$ or $x = y$. If $x < y$, then $x + 0 < y + 0 < y + \varepsilon$ by the Additive and Transitive Properties. If $x = y$, then $x < y + \varepsilon$ by the Additive Property. Thus $x < y + \varepsilon$ for all $\varepsilon > 0$ in either case. This completes the proof of part (i).

(ii) Suppose that $x > y - \varepsilon$ for all $\varepsilon > 0$. By the second Multiplicative Property, this is equivalent to $-x < -y + \varepsilon$, hence by part (i), equivalent to $-x \le -y$. Multiplying this inequality by $-1$, we conclude that $x \ge y$.

(iii) Suppose that $|a| < \varepsilon$ for all $\varepsilon > 0$. By Theorem 1.6, this is equivalent to $-\varepsilon < a < \varepsilon$. It follows from parts (i) and (ii) that $0 \le a \le 0$. We conclude by the Trichotomy Property that $a = 0$. ∎
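Let $a$ and $b$ be real numbers. A closed interval is a set of the form

$$[a, b] := \{x \in \mathbf{R} : a \le x \le b\}, \qquad [a, \infty) := \{x \in \mathbf{R} : a \le x\},$$

$$(-\infty, b] := \{x \in \mathbf{R} : x \le b\}, \qquad \text{or} \qquad (-\infty, \infty) := \mathbf{R},$$

and an open interval is a set of the form

$$(a, b) := \{x \in \mathbf{R} : a < x < b\}, \qquad (a, \infty) := \{x \in \mathbf{R} : a < x\},$$

$$(-\infty, b) := \{x \in \mathbf{R} : x < b\}, \qquad \text{or} \qquad (-\infty, \infty) := \mathbf{R}.$$

By an interval we mean a closed interval, an open interval, or a set of the form

$$[a, b) := \{x \in \mathbf{R} : a \le x < b\} \qquad \text{or} \qquad (a, b] := \{x \in \mathbf{R} : a < x \le b\}.$$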
Notice, then, that when a < b, the intervals [a, b], [a, b), (a, b], and (a, b) correspond to line segments on the real line, but when b < a, these "intervals" are all the empty set.
An interval $I$ is said to be bounded if and only if it has the form $[a, b]$, $(a, b)$, $[a, b)$, or $(a, b]$ for some $-\infty < a \le b < \infty$, in which case the numbers $a, b$ will be called the endpoints of $I$. All other intervals will be called unbounded. An interval with endpoints $a, b$ is called degenerate if $a = b$ and nondegenerate if $a < b$. Thus a degenerate open interval is the empty set, and a degenerate closed interval is a point.

Analysis has a strong geometric flavor. Geometry enters the picture because the real number system can be identified with the real line in such a way that $a < b$ if and only if $a$ lies to the left of $b$ (see Figures 1.1, 2.1, and 2.2). This gives us a way of translating analytic results on $\mathbf{R}$ into geometric results on the number line, and vice versa. We close with several examples.

The absolute value is closely linked to the idea of length. The length of a bounded interval $I$ with endpoints $a, b$ is defined to be $|I| := |b - a|$. And the distance between any two points $a, b \in \mathbf{R}$ is defined by $|a - b|$.

Inequalities can be interpreted as statements about intervals. By Theorem 1.6, $|a| \le M$ if and only if $a$ belongs to the closed interval $[-M, M]$. And, by Theorem 1.9, $a$ belongs to the open intervals $(-\varepsilon, \varepsilon)$ for all $\varepsilon > 0$ if and only if $a = 0$.

We will use this point of view in Chapters 2 through 5 to give geometric interpretations to the calculus of functions defined on $\mathbf{R}$, and in Chapters 11 through 13 to extend this calculus to functions defined on the Euclidean spaces $\mathbf{R}^n$.

EXERCISES

In each of the following exercises, verify the given statement carefully, proceeding step by step. Validate each step that involves an inequality by using some statement found in this section.
1. This exercise is used in Section 6.3. The positive part of an $a \in \mathbf{R}$ is defined by

$$a^+ := \frac{|a| + a}{2}$$

and the negative part by

$$a^- := \frac{|a| - a}{2}.$$

(a) Prove that $a = a^+ - a^-$ and $|a| = a^+ + a^-$.
(b) Prove that

$$a^+ = \begin{cases} a & a \ge 0 \\ 0 & a \le 0 \end{cases} \qquad \text{and} \qquad a^- = \begin{cases} 0 & a \ge 0 \\ -a & a \le 0. \end{cases}$$

2. Solve each of the following inequalities for $x \in \mathbf{R}$.
(a) $|x - 2| < 5$.
(b) $|1 - x| < 4$.
(c) $|x^2 - x - 1| < x^2$.
(d) $|x^2 + x| < 2$.

3. Suppose that $a, b, c \in \mathbf{R}$ and $a \le b$.
(a) Prove that $a + c \le b + c$.
(b) If $c \ge 0$, prove that $a \cdot c \le b \cdot c$.
4. Prove (7), (8), and (9). Show that each of these statements is false if the hypothesis $a \ge 0$ or $a > 0$ is removed.

5. (a) Prove that if $0 < a < 1$ and $b = 1 - \sqrt{1 - a}$, then $0 < b < a$.
(b) Prove that if $a > 2$ and $b = 1 + \sqrt{a - 1}$, then $2 < b < a$.
(c) The arithmetic mean of $a, b \in \mathbf{R}$ is $A(a, b) = (a + b)/2$, and the geometric mean of $a, b \in [0, \infty)$ is $G(a, b) = \sqrt{ab}$. If $0 \le a \le b$, prove that $a \le G(a, b) \le A(a, b) \le b$. Prove that $G(a, b) = A(a, b)$ if and only if $a = b$.

6. (a) Interpreting a rational $m/n$ as $m \cdot n^{-1} \in \mathbf{R}$ and assuming that $\mathbf{Z}$ is closed under addition and multiplication (see Remark 1.13), use Postulate 1 to prove that

$$\frac{m}{n} + \frac{p}{q} = \frac{mq + np}{nq}, \qquad \frac{m}{n} \cdot \frac{p}{q} = \frac{mp}{nq}, \qquad -\frac{m}{n} = \frac{-m}{n}, \qquad \text{and} \qquad \left(\frac{\ell}{n}\right)^{-1} = \frac{n}{\ell}$$

for $m, n, p, q, \ell \in \mathbf{Z}$ and $n, q, \ell \ne 0$.
(b) Prove that Postulate 1 holds with $\mathbf{Q}$ in place of $\mathbf{R}$.
(c) Prove that the sum of a rational and an irrational is always irrational. What can you say about the product of a rational and an irrational?
(d) Let $m/n, p/q \in \mathbf{R}$ with $n, q > 0$. Prove that

$$\frac{m}{n} < \frac{p}{q} \qquad \text{if and only if} \qquad mq < np.$$

(Restricting this observation to $\mathbf{Q}$ gives a definition of "$<$" on $\mathbf{Q}$.)
7. (a) Prove that $|x| \le 1$ implies $|x^2 - 1| \le 2|x - 1|$.
(b) Prove that $-1 \le x \le 2$ implies $|x^2 + x - 2| \le 4|x - 1|$.
(c) Prove that $|x| \le 1$ implies $|x^2 - x - 2| \le 3|x + 1|$.
(d) Prove that $0 < |x - 1| < 1$ implies $|x^3 + x - 2| < 8|x - 1|$. Is this true if $0 \le |x - 1| < 1$?

8. For each of the following, find all values of $n \in \mathbf{N}$ that satisfy the given inequality.

(a) $\dfrac{1 - n}{1 - n^2} < 0.01$.

(b) $\dfrac{n^2 + 2n + 3}{2n^3 + 5n^2 + 8n + 3} < 0.025$.

(c) $\dfrac{n - 1}{n^3 - n^2 + n - 1} < 0.002$.

9. Prove that

$$(a_1 b_1 + a_2 b_2)^2 \le (a_1^2 + a_2^2)(b_1^2 + b_2^2)$$

for all $a_1, a_2, b_1, b_2 \in \mathbf{R}$.

10. Suppose that $x, a, y, b \in \mathbf{R}$, $|x - a| < \varepsilon$, and $|y - b| < \varepsilon$ for some $\varepsilon > 0$.
(a) Prove that $|xy - ab| < (|a| + |b|)\varepsilon + \varepsilon^2$.
(b) Prove that $|x^2 y - a^2 b| < \varepsilon(|a|^2 + 2|ab|) + \varepsilon^2(|b| + 2|a|) + \varepsilon^3$.
11. (a) Let $\mathbf{R}^+$ represent the collection of positive real numbers. Prove that $\mathbf{R}^+$ satisfies the following two properties.
(i) For each $x \in \mathbf{R}$, one and only one of the following holds: $x \in \mathbf{R}^+$, $-x \in \mathbf{R}^+$, or $x = 0$.
(ii) Given $x, y \in \mathbf{R}^+$, both $x + y$ and $x \cdot y$ belong to $\mathbf{R}^+$.
(b) Suppose that $\mathbf{R}$ contains a subset $\mathbf{R}^+$ (not necessarily the set of positive numbers) that satisfies properties (i) and (ii). Define $x \prec y$ by $y - x \in \mathbf{R}^+$. Prove that Postulate 2 holds with $\prec$ in place of $<$.
1.2 WELL-ORDERING PRINCIPLE

In this section we introduce the Well-Ordering Principle, a postulate that distinguishes the set $\mathbf{N}$ from the sets $\mathbf{Z}$, $\mathbf{Q}$, and $\mathbf{R}$. We use it to establish the Principle of Induction and prove the Binomial Formula, a result that shows how to expand powers of a binomial expression, i.e., an expression of the form $a + b$.

The Well-Ordering Principle is different from the preceding postulates in a fundamental way. Postulates 1 and 2 were statements about the algebraic structure of $\mathbf{R}$, namely, about finite sums and products of elements of $\mathbf{R}$. Postulate 3 is a statement about the "direction" of $\mathbf{N}$ under the order relation $<$, namely, about the existence of least elements of subsets of $\mathbf{N}$. Before we state the Well-Ordering Principle, we make precise what we mean by a least element.
1.10 DEFINITION. A number $x$ is a least element of a set $E \subset \mathbf{R}$ if and only if $x \in E$ and $x \le a$ for all $a \in E$.

Note: Because French mathematicians (e.g., Borel, Jordan, and Lebesgue) did fundamental work on the connection between analysis and set theory, and ensemble is French for set, analysts frequently use $E$ to represent a general set.

POSTULATE 3. [WELL-ORDERING PRINCIPLE]. Every nonempty subset of $\mathbf{N}$ has a least element.

Notice that the Well-Ordering Principle is not satisfied by the number systems $\mathbf{Z}$, $\mathbf{Q}$, and $\mathbf{R}$ since none of these systems contains a least element.

Our first application of the Well-Ordering Principle is the Principle of Mathematical Induction (which, under mild assumptions, is equivalent to the Well-Ordering Principle; see Appendix A).

1.11 THEOREM. Suppose for each $n \in \mathbf{N}$ that $A(n)$ is a proposition (i.e., a verbal statement or formula) that satisfies the following two properties:
(i) $A(1)$ is true.
(ii) For every $k \in \mathbf{N}$ for which $A(k)$ is true, $A(k + 1)$ is also true.
Then $A(n)$ is true for all $n \in \mathbf{N}$.

PROOF. Suppose that the theorem is false. Then the set $E = \{n \in \mathbf{N} : A(n) \text{ is false}\}$ is nonempty. Hence by Postulate 3, $E$ has a least element, say $x$.
By hypothesis (i), $x \ne 1$. Since $x \in E \subseteq \mathbf{N}$, it follows from Remark 1.1iii that $x - 1 \in \mathbf{N}$. But $x - 1 < x$ and $x$ is a least element of $E$. Consequently, $A(x - 1)$ is true. Applying hypothesis (ii) to $k = x - 1$, we see that $A(x) = A(k + 1)$ must also be true; i.e., $x \notin E$, a contradiction. ∎

Recall that if $x_0, x_1, \ldots, x_n$ are real numbers and $0 \le j \le n$, then

$$\sum_{k=j}^{n} x_k := x_j + x_{j+1} + \cdots + x_n$$
denotes the sum of the $x_k$'s as $k$ ranges from $j$ to $n$. The following examples illustrate the fact that the Principle of Mathematical Induction can be used to prove a variety of statements involving integers.

1.12 Example. Prove that

$$\sum_{k=1}^{n} (3k - 1)(3k + 2) = 3n^3 + 6n^2 + n$$

for $n \in \mathbf{N}$.

PROOF. Let $A(n)$ represent the statement

$$\sum_{k=1}^{n} (3k - 1)(3k + 2) = 3n^3 + 6n^2 + n.$$

For $n = 1$ the left side of this equation is $2 \cdot 5$ and the right side is $3 + 6 + 1$. Therefore, $A(1)$ is true. Suppose $A(n)$ is true for some $n \ge 1$. Then

$$\sum_{k=1}^{n+1} (3k - 1)(3k + 2) = (3n + 2)(3n + 5) + \sum_{k=1}^{n} (3k - 1)(3k + 2) = (3n + 2)(3n + 5) + 3n^3 + 6n^2 + n = 3n^3 + 15n^2 + 22n + 10.$$

On the other hand, a direct calculation reveals that

$$3(n + 1)^3 + 6(n + 1)^2 + (n + 1) = 3n^3 + 15n^2 + 22n + 10.$$

Therefore, $A(n + 1)$ is true when $A(n)$ is. We conclude by induction that $A(n)$ holds for all $n \in \mathbf{N}$. ∎

Next, we show that $\mathbf{N}$ and $\mathbf{Z}$ satisfy the Closure Properties.

*1.13 Remark. Prove that if $n, m \in \mathbf{N}$ (respectively, $\in \mathbf{Z}$), then $n + m$ and $nm$ belong to $\mathbf{N}$ (respectively, to $\mathbf{Z}$).

PROOF. Since $n \in \mathbf{Z}$ if and only if $n = 0$, $n \in \mathbf{N}$, or $-n \in \mathbf{N}$, it suffices to prove the closure properties for $\mathbf{N}$, namely, to show that given $n \in \mathbf{N}$, both $n + m$ and $nm$ belong to $\mathbf{N}$ for all $m \in \mathbf{N}$. We shall prove these by induction on $m$.
Fix $n \in \mathbf{N}$ and consider the set $A := \{m \in \mathbf{N} : n + m \in \mathbf{N}\}$. Recall from Remark 1.1ii that $n \in \mathbf{N}$ implies $n + 1 \in \mathbf{N}$; i.e., $1 \in A$. Suppose that $m \in A$ for some $m \ge 1$; i.e., $n + m \in \mathbf{N}$. Then by Remark 1.1ii and associativity, $n + (m + 1) = (n + m) + 1 \in \mathbf{N}$; i.e., $m + 1 \in A$. Thus, by induction, $A = \mathbf{N}$ and closure holds for addition.

Now consider the set $B := \{m \in \mathbf{N} : n \cdot m \in \mathbf{N}\}$. Clearly, $n \cdot 1 = n \in \mathbf{N}$; i.e., $1 \in B$. Next, if some $m \ge 1$ belongs to $B$, then $n(m + 1) = nm + n \in \mathbf{N}$ by the closure of addition. Thus $m + 1 \in B$ and the proof is complete by induction. ∎

Two formulas encountered early in an algebra course are the perfect square and cube formulas:

$$(a + b)^2 = a^2 + 2ab + b^2 \qquad \text{and} \qquad (a + b)^3 = a^3 + 3a^2 b + 3ab^2 + b^3.$$

Our next application of the Principle of Mathematical Induction is a generalization of these formulas from $n = 2$ or $3$ to arbitrary $n \in \mathbf{N}$.

Recall that Pascal's Triangle is the triangular array of integers whose rows begin and end with 1's with the property that an interior entry on any row is obtained by adding the two numbers in the preceding row immediately above that entry. Thus the first few rows of Pascal's Triangle are as below.

                  1
                1   1
              1   2   1
            1   3   3   1
          1   4   6   4   1
        1   5  10  10   5   1
      1   6  15  20  15   6   1

Notice that the third and fourth rows are precisely the coefficients that appeared in the perfect square and cube formulas above.

One can write down a formula for each entry in each row of the Pascal Triangle. The first (and only) entry in the first row is

$$\binom{0}{0} := 1.$$

Using the notation $0! := 1$ and $n! := 1 \cdot 2 \cdots (n - 1) \cdot n$ for $n \in \mathbf{N}$, define the binomial coefficient $n$ over $k$ by

$$\binom{n}{k} := \frac{n!}{(n - k)!\,k!}$$

for $0 \le k \le n$ and $n = 0, 1, \ldots$.

The following result shows that the binomial coefficient $n$ over $k$ does produce the $(k + 1)$st entry in the $(n + 1)$st row of Pascal's Triangle.
1.14 Lemma. If $n, k \in \mathbf{N}$ and $1 \le k \le n$, then

$$\binom{n + 1}{k} = \binom{n}{k - 1} + \binom{n}{k}.$$

PROOF. By definition,

$$\binom{n}{k - 1} + \binom{n}{k} = \frac{n!\,k}{(n - k + 1)!\,k!} + \frac{n!\,(n - k + 1)}{(n - k + 1)!\,k!} = \frac{n!\,(n + 1)}{(n - k + 1)!\,k!} = \binom{n + 1}{k}.$$ ∎
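The relationship between Pascal's Triangle and the factorial formula can also be checked by machine. The following Python sketch builds rows using only the "add the two entries above" rule and compares each entry with the binomial coefficient as computed by the standard library function `math.comb`. The function name `pascal_rows` is an illustrative choice, and a finite comparison of this kind supplements Lemma 1.14 rather than replacing its proof.

```python
from math import comb


def pascal_rows(count):
    """Return the first `count` rows of Pascal's Triangle, built by the rule
    'each interior entry is the sum of the two entries above it'."""
    rows = [[1]]
    while len(rows) < count:
        prev = rows[-1]
        interior = [prev[k - 1] + prev[k] for k in range(1, len(prev))]
        rows.append([1] + interior + [1])
    return rows[:count]


rows = pascal_rows(7)
# Row n + 1 (list index n) should consist of the coefficients n over k.
matches = all(rows[n][k] == comb(n, k)
              for n in range(len(rows)) for k in range(n + 1))
```

Here `rows[n]` plays the role of the $(n + 1)$st row, so the list comprehension for `interior` is exactly the identity of Lemma 1.14 read from right to left.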
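Binomial coefficients can be used to expand the $n$th power of a sum of two terms.

1.15 THEOREM [BINOMIAL FORMULA]. If $a, b \in \mathbf{R}$ and $n \in \mathbf{N}$, then

$$(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k.$$

PROOF. The proof is by induction on $n$. The formula is obvious for $n = 1$. Suppose that the formula is true for some $n \in \mathbf{N}$. Then by the inductive hypothesis and Postulate 1,

$$\begin{aligned}
(a + b)^{n+1} &= (a + b)(a + b)^n \\
&= (a + b)\left(\sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k\right) \\
&= \left(\sum_{k=0}^{n} \binom{n}{k} a^{n-k+1} b^k\right) + \left(\sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^{k+1}\right) \\
&= \left(a^{n+1} + \sum_{k=1}^{n} \binom{n}{k} a^{n-k+1} b^k\right) + \left(b^{n+1} + \sum_{k=0}^{n-1} \binom{n}{k} a^{n-k} b^{k+1}\right) \\
&= a^{n+1} + \sum_{k=1}^{n} \left(\binom{n}{k} + \binom{n}{k-1}\right) a^{n-k+1} b^k + b^{n+1}.
\end{aligned}$$

Hence it follows from Lemma 1.14 that

$$(a + b)^{n+1} = a^{n+1} + \sum_{k=1}^{n} \binom{n+1}{k} a^{n+1-k} b^k + b^{n+1} = \sum_{k=0}^{n+1} \binom{n+1}{k} a^{n+1-k} b^k;$$

i.e., the formula is true for $n + 1$. We conclude by induction that the formula holds for all $n \in \mathbf{N}$. ∎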
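For readers who like to test a formula before relying on it, here is a small Python sketch that evaluates the right-hand side of the Binomial Formula exactly and compares it with $(a + b)^n$ for a few sample values. The function name `binomial_expansion` and the particular sample pairs are illustrative assumptions; the check covers only the listed cases and is no substitute for the induction argument above.

```python
from fractions import Fraction
from math import comb


def binomial_expansion(a, b, n):
    """Right-hand side of the Binomial Formula: sum of C(n, k) a^(n-k) b^k."""
    return sum(comb(n, k) * a ** (n - k) * b ** k for k in range(n + 1))


# Nonzero integer and rational samples, compared exactly for n = 1, ..., 10.
pairs = [(2, 3), (-1, 4), (Fraction(1, 2), Fraction(-2, 3))]
formula_holds = all(binomial_expansion(a, b, n) == (a + b) ** n
                    for a, b in pairs for n in range(1, 11))
```

Taking $a = b = 1$ gives the sum of the $(n + 1)$st row of Pascal's Triangle, which is the content of Exercise 2(a) below.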
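EXERCISES

1. This exercise is used in Sections 1.4, 2.4, and 5.1. Prove that the following formulas hold for all $n \in \mathbf{N}$.

(a) $\displaystyle\sum_{k=1}^{n} k = \frac{n(n + 1)}{2}$.

(b) $\displaystyle\sum_{k=1}^{n} k^2 = \frac{n(n + 1)(2n + 1)}{6}$.

(c) $\displaystyle\sum_{k=1}^{n} \frac{a - 1}{a^k} = 1 - \frac{1}{a^n}$, $a \ne 0$.

(d) $\displaystyle\sum_{k=1}^{n} (2k - 1)^2 = \frac{n(4n^2 - 1)}{3}$.

2. Use the Binomial Formula to prove each of the following.
(a) $2^n = \sum_{k=0}^{n} \binom{n}{k}$ for all $n \in \mathbf{N}$.
(b) $(a + b)^n \ge a^n + n a^{n-1} b$ for all $n \in \mathbf{N}$ and $a, b \ge 0$.
(c) $(1 + 1/n)^n \ge 2$ for all $n \in \mathbf{N}$.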
3. Let $n \in \mathbf{N}$. Write

$$\frac{(x + h)^n - x^n}{h}$$

as a sum none of whose terms has an $h$ in the denominator.

4. (a) Suppose that $0 < x_1 < 1$ and $x_{n+1} = 1 - \sqrt{1 - x_n}$ for $n \in \mathbf{N}$. Prove that $0 < x_{n+1} < x_n < 1$ holds for all $n \in \mathbf{N}$.
(b) Suppose that $x_1 \ge 2$ and $x_{n+1} = 1 + \sqrt{x_n - 1}$ for $n \in \mathbf{N}$. Prove that $2 \le x_{n+1} \le x_n \le x_1$ holds for all $n \in \mathbf{N}$.

5. Suppose that $0 < x_1 < 2$ and $x_{n+1} = \sqrt{2 + x_n}$ for $n \in \mathbf{N}$. Prove that $0 < x_n < x_{n+1} < 2$ holds for all $n \in \mathbf{N}$.

6. Prove that each of the following inequalities holds for all $n \in \mathbf{N}$.
(a) $n < 2^n$.
(b) $n^2 \le 2^n + 1$.
(c) $n^3 \le 3^n$.

7. This exercise is used in Section 2.3. Prove that $0 \le a < b$ implies $0 \le a^n < b^n$ and $0 \le \sqrt[n]{a} < \sqrt[n]{b}$ for all $n \in \mathbf{N}$.

8. In the next section we prove that the square root of an integer $m$ is rational if and only if $m = k^2$ for some $k \in \mathbf{N}$. Assume that this result is true.
(a) Prove that $\sqrt{n + 3} + \sqrt{n}$ is rational for some $n \in \mathbf{N}$ if and only if $n = 1$.
(b) Find all $n \in \mathbf{N}$ such that $\sqrt{n + 7} + \sqrt{n}$ is rational.

9. Prove that $2^n + 3^n$ is a multiple of 5 for all odd $n \in \mathbf{N}$.

10. Let $a_0 = 3$, $b_0 = 4$, and $c_0 = 5$.
(a) Let $a_k = a_{k-1} + 2$, $b_k = 2a_{k-1} + b_{k-1} + 2$, and $c_k = 2a_{k-1} + c_{k-1} + 2$ for $k \in \mathbf{N}$. Prove that $c_k - b_k$ is constant for all $k \in \mathbf{N}$.
(b) Prove that the numbers defined in part (a) satisfy $a_k^2 + b_k^2 = c_k^2$ for all $k \in \mathbf{N}$.
1.3 COMPLETENESS AXIOM
In this section we introduce the last of four postulates that describe $\mathbf{R}$. To formulate this postulate, which distinguishes $\mathbf{Q}$ from $\mathbf{R}$, we need the following concepts.

1.16 DEFINITION. Let $E \subset \mathbf{R}$ be nonempty.
(i) The set $E$ is said to be bounded above if and only if there is an $M \in \mathbf{R}$ such that $a \le M$ for all $a \in E$.
(ii) A number $M$ is called an upper bound of the set $E$ if and only if $a \le M$ for all $a \in E$.
(iii) A number $s$ is called a supremum of the set $E$ if and only if $s$ is an upper bound of $E$ and $s \le M$ for all upper bounds $M$ of $E$. (In this case we shall say that $E$ has a supremum $s$ and shall write $s = \sup E$.)

Notice by (iii) that a supremum of a set $E$ (when it exists) is the smallest (or least) upper bound of $E$. By definition, then, in order to prove that $s = \sup E$ for some set $E \subset \mathbf{R}$, we must show two things: $s$ is an upper bound, AND $s$ is the smallest upper bound.

1.17 Example. If $E = [0, 1]$, prove that $\sup E = 1$.
PROOF. By the definition of interval, 1 is an upper bound of $E$. Let $M$ be any upper bound of $E$; i.e., $M \ge x$ for all $x \in E$. Since $1 \in E$, it follows that $M \ge 1$. Thus 1 is the smallest upper bound of $E$. ∎

The following two remarks answer the question: How many upper bounds and suprema can a given set have?

1.18 Remark. If a set has one upper bound, it has infinitely many upper bounds.

PROOF. If $M_0$ is an upper bound for a set $E$, then so is $M$ for any $M > M_0$. ∎

1.19 Remark. If a set has a supremum, then it has only one supremum.

PROOF. Let $s_1$ and $s_2$ be suprema of the same set $E$. Then both $s_1$ and $s_2$ are upper bounds of $E$, whence by Definition 1.16iii, $s_1 \le s_2$ and $s_2 \le s_1$. We conclude by the Trichotomy Property that $s_1 = s_2$. ∎

(Note: This proof illustrates a general principle. When asked to prove that $a = b$, it is often easier to verify that $a \le b$ and $b \le a$.)

The next result, a fundamental property of suprema, shows that the supremum of a set $E$ can be approximated by a point in $E$ (see Figure 1.1 for an illustration).

1.20 THEOREM [APPROXIMATION PROPERTY FOR SUPREMA]. If $E$ has a supremum and $\varepsilon > 0$ is any positive number, then there is a point $a \in E$ such that

$$\sup E - \varepsilon < a \le \sup E.$$

PROOF. Suppose that the theorem is false. Then there is an $\varepsilon_0 > 0$ such that no element of $E$ lies between $s_0 := \sup E - \varepsilon_0$ and $\sup E$. It follows that $a \le s_0$ for all $a \in E$; i.e., $s_0$ is an upper bound of $E$. Thus, by Definition 1.16iii, $\sup E \le s_0 = \sup E - \varepsilon_0$. Adding $\varepsilon_0 - \sup E$ to both sides of this inequality, we conclude that $\varepsilon_0 \le 0$, a contradiction. ∎

The Approximation Property can be used to show that the supremum of any subset of integers is itself an integer.

1.21 Remark. If $E \subset \mathbf{N}$ has a supremum, then $\sup E \in E$.

PROOF. Suppose that $s := \sup E$ and apply the Approximation Property to choose an $x_0 \in E$ such that $s - 1 < x_0 \le s$. If $s = x_0$, then $s \in E$ as promised. Otherwise, $s - 1 < x_0 < s$ and we can apply the Approximation Property again to choose $x_1 \in E$ such that $x_0 < x_1 < s$.

Subtract $x_0$ from this last inequality to obtain $0 < x_1 - x_0 < s - x_0$. Using the leftmost inequality, we have by Remarks 1.1iv and 1.1ii that $x_1 - x_0 \ge 1$. On the other hand, since $x_0 > s - 1$, the rightmost inequality implies that $x_1 - x_0 < s - (s - 1) = 1$, a contradiction. ∎
The existence of suprema is the last major assumption about $\mathbf{R}$ we make.

POSTULATE 4. [COMPLETENESS AXIOM]. If $E$ is a nonempty subset of $\mathbf{R}$ that is bounded above, then $E$ has a (finite) supremum.

We shall use this property many times. Our first four applications deal with the distribution of integers and rationals among real numbers.

1.22 THEOREM [ARCHIMEDEAN PRINCIPLE]. Given positive real numbers $a$ and $b$, there is an integer $n \in \mathbf{N}$ such that $b < na$.

STRATEGY: The idea behind the proof is simple. By the Completeness Axiom and Remark 1.21, any nonempty subset of integers that is bounded above has a "largest" integer. If $k_0$ is the largest integer that satisfies $k_0 a \le b$, then $n = (k_0 + 1)$ (that is larger than $k_0$) must satisfy $na > b$. In order to justify this application of the Completeness Axiom, we have two details to attend to: (1) Is the set $E := \{k \in \mathbf{N} : ka \le b\}$ bounded above? (2) Is $E$ nonempty? The answer to the second question depends on whether or not $b < a$. Here are the details.

PROOF. If $b < a$, set $n = 1$. If $a \le b$, consider the set $E = \{k \in \mathbf{N} : ka \le b\}$. $E$ is nonempty since $1 \in E$. Since $ka \le b$ for all $k \in E$ and $a > 0$, it follows from the Multiplicative Property that $k \le b/a$ for all $k \in E$; i.e., $E$ is bounded above by $b/a$. Thus, by the Completeness Axiom and Remark 1.21, $E$ has a supremum $s$ that belongs to $E$, in particular, $s \in \mathbf{N}$. Set $n = s + 1$. Then $n \in \mathbf{N}$ and (since $n$ is larger than $s$), $n$ cannot belong to $E$. Thus $na > b$. ∎

Notice in Example 1.17 and Remark 1.21 that the supremum of $E$ belonged to $E$. The following result shows that this is not always the case.

1.23 Example. Let $A = \{1, \tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{8}, \ldots\}$ and $B = \{\tfrac{1}{2}, \tfrac{3}{4}, \tfrac{7}{8}, \ldots\}$. Prove that $\sup A = \sup B = 1$.
PROOF. It is clear that 1 is an upper bound of both sets. It remains to see that 1 is the smallest upper bound of both sets. For $A$, this is trivial. Indeed, if $M$ is any upper bound of $A$ then $M \ge 1$ (since $1 \in A$). On the other hand, if $M$ is an upper bound for $B$, but $M < 1$, then $1 - M > 0$. Since (9) implies that $1/(1 - M) > 0$, we can choose (by the Archimedean Principle) an $n \in \mathbf{N}$ such that $n > 1/(1 - M)$. Since $n < 2^n$ (see Exercise 6, p. 17), it follows (do the algebra) that $x_0 := 1 - 1/2^n > M$. Since $x_0 \in B$, this contradicts the assumption that $M$ is an upper bound of $B$ (see Figure 1.1). ∎

[Figure 1.1: two number lines from 0 to 1, one marking the points in $A$ and one marking the points in $B$.]
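The argument for $B$ is constructive, and it can be traced step by step. The Python sketch below takes a rational $M < 1$, picks $n \in \mathbf{N}$ with $n > 1/(1 - M)$ as in the proof, and returns the element $x_0 = 1 - 1/2^n$ of $B$ that exceeds $M$. The function name `element_of_B_above` is hypothetical, and restricting $M$ to rationals (so that exact arithmetic is available) is an assumption of the sketch, not of the example.

```python
from fractions import Fraction
import math


def element_of_B_above(M):
    """Given a rational M < 1, return (n, x0) with x0 = 1 - 1/2**n > M.

    Follows the proof of Example 1.23: choose n in N with n > 1/(1 - M);
    then 2**n > n > 1/(1 - M), so 1/2**n < 1 - M.
    """
    M = Fraction(M)
    if M >= 1:
        raise ValueError("need M < 1")
    bound = 1 / (1 - M)            # positive because 1 - M > 0
    n = math.floor(bound) + 1      # a natural number strictly above bound
    x0 = 1 - Fraction(1, 2 ** n)   # an element of B
    return n, x0


n, x0 = element_of_B_above(Fraction(99, 100))
```

For $M = 99/100$ the bound $1/(1 - M)$ equals 100, so the sketch selects $n = 101$. This is far from the smallest exponent that works, which mirrors the proof: the Archimedean Principle only promises that some $n$ exists.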
The next proof shows how the Archimedean Principle is used to establish scale.

1.24 THEOREM [DENSITY OF RATIONALS]. If $a, b \in \mathbf{R}$ satisfy $a < b$, then there is a $q \in \mathbf{Q}$ such that $a < q < b$.

STRATEGY: To motivate the proof, consider the special case $a = 1/4$ and $b = 1/3$. We want to find a fraction $q = m/n$ such that $1/4 < m/n < 1/3$. No such $m$ exists if $1 \le n \le 6$ because the fractions $p/n$ are spaced too far apart; $1/6$ is too small and $2/6$ is too large. If $n$ is large enough, however, so that some of the fractions $p/n$ belong to the interval $(1/4, 1/3)$ (e.g., $n = 7$), then an acceptable value for $m$ is $m = k_0 - 1$, where $k_0$ is the smallest integer satisfying $b \le k_0/n$. How large should $n$ be? In order for $p/n$ to belong to $(a, b)$, we need an $n$ that satisfies

$$\frac{1}{n} < b - a.$$

Such an $n$ can be chosen by the Archimedean Principle. We begin our formal proof at this point.

PROOF. Since $b - a > 0$, use the Archimedean Principle to choose an $n \in \mathbf{N}$ that satisfies $n(b - a) > 1$.

Case 1. $b > 0$. Consider the set $E = \{k \in \mathbf{N} : b \le k/n\}$. By the Archimedean Principle, $E$ is nonempty. Hence, by the Well-Ordering Principle, $E$ has a least element, say $k_0$. Set $m = k_0 - 1$ and $q = m/n$. Since $m < k_0$ and $k_0$ is a least element of $E$, $m \notin E$. This can happen two ways. Either $m \le 0$ or $b > m/n = q$. In either case we obtain $q < b$. On the other hand, since $k_0 \in E$ implies that $b \le k_0/n$, it follows from the choice of $n$ that

$$a = b - (b - a) < \frac{k_0}{n} - \frac{1}{n} = \frac{k_0 - 1}{n} = q.$$
Case 2. $b \le 0$. Choose (by the Archimedean Principle) a $k \in \mathbf{N}$ such that $k + b > 0$. By Case 1, there is an $r \in \mathbf{Q}$ such that $k + a < r < k + b$. Therefore, $q := r - k$ belongs to $\mathbf{Q}$ and satisfies the inequality $a < q < b$. ∎
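The construction in this proof can be followed mechanically. The Python sketch below is one possible transcription: it chooses $n$ with $n(b - a) > 1$, shifts the interval when $b \le 0$ (Case 2), finds the least $k_0 \in \mathbf{N}$ with $b \le k_0/n$ (Case 1), and returns $q = (k_0 - 1)/n$ shifted back. The function name `rational_between` is hypothetical, and the sketch assumes its inputs are themselves rational (as `Fraction` values or integers) so that the arithmetic is exact; its purpose is to trace the steps of the proof, not to handle arbitrary real numbers.

```python
from fractions import Fraction
import math


def rational_between(a, b):
    """Return a Fraction q with a < q < b, following the proof of Theorem 1.24."""
    a, b = Fraction(a), Fraction(b)
    if not a < b:
        raise ValueError("need a < b")
    # Archimedean Principle: choose n in N with n(b - a) > 1.
    n = math.floor(1 / (b - a)) + 1
    # Case 2 (b <= 0): shift by a k in N with k + b > 0, and undo it at the end.
    shift = 0 if b > 0 else math.floor(-b) + 1
    a, b = a + shift, b + shift
    # Case 1 (b > 0): k0 is the least k in N with b <= k/n, i.e. with k >= n*b.
    k0 = math.ceil(n * b)
    q = Fraction(k0 - 1, n)
    return q - shift


q = rational_between(Fraction(1, 4), Fraction(1, 3))
```

With $a = 1/4$ and $b = 1/3$ this particular rule for choosing $n$ gives $n = 13$ and $k_0 = 5$, hence $q = 4/13$. The strategy paragraph observes that $n = 7$ already works; the proof, like the sketch, makes no attempt to find the smallest denominator.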
Here is another application of the Archimedean Principle to the distribution of numbers.

1.25 Remark. If $x > 1$ and $x \notin \mathbf{N}$, then there is an $n \in \mathbf{N}$ such that $n < x < n + 1$.

PROOF. By the Archimedean Principle, the set $E = \{m \in \mathbf{N} : x < m\}$ is nonempty. Hence by the Well-Ordering Principle, $E$ has a least element, say $m_0$. Set $n = m_0 - 1$. Since $m_0 \in E$, $n + 1 = m_0 > x$. Since $m_0$ is least, $n = m_0 - 1 \le x$. Since $x \notin \mathbf{N}$, we also have $n \ne x$. Therefore, $n < x < n + 1$. ∎

Using this last result, we can prove that the set of irrationals is nonempty.

*1.26 Remark. If $n \in \mathbf{N}$ is not a perfect square (i.e., if there is no $m \in \mathbf{N}$ such that $n = m^2$), then $\sqrt{n}$ is irrational.

PROOF. Suppose to the contrary that $n \in \mathbf{N}$ is not a perfect square but $\sqrt{n} \in \mathbf{Q}$; i.e., $\sqrt{n} = p/q$ for some $p, q \in \mathbf{N}$. Choose by Remark 1.25 an integer $m_0 \in \mathbf{N}$ such that

(10) $m_0 < \sqrt{n} < m_0 + 1$.

Consider the set $E := \{k \in \mathbf{N} : k\sqrt{n} \in \mathbf{Z}\}$. Since $q\sqrt{n} = p$, we know that $E$ is nonempty. Thus by the Well-Ordering Principle, $E$ has a least element, say $n_0$.

Set $x = n_0(\sqrt{n} - m_0)$. By (10), $0 < \sqrt{n} - m_0 < 1$. Multiplying this inequality by $n_0$, we find that

(11) $0 < x < n_0$.

Since $n_0$ is a least element of $E$, it follows from (11) that $x \notin E$. On the other hand,

$$x\sqrt{n} = n_0(\sqrt{n} - m_0)\sqrt{n} = n_0 n - m_0 n_0 \sqrt{n} \in \mathbf{Z}$$

since $n_0 \in E$. Moreover, since $x > 0$ and $x = n_0\sqrt{n} - n_0 m_0$ is the difference of two integers, $x \in \mathbf{N}$. Thus $x \in E$, a contradiction. ∎

For some applications, we also need the following concepts.

1.27 DEFINITION. Let $E \subset \mathbf{R}$ be nonempty.
(i) The set $E$ is said to be bounded below if and only if there is an $m \in \mathbf{R}$ such that $a \ge m$ for all $a \in E$.
(ii) A number $m$ is called a lower bound of the set $E$ if and only if $a \ge m$ for all $a \in E$.
(iii) A number $t$ is called an infimum of the set $E$ if and only if $t$ is a lower bound of $E$ and $t \ge m$ for all lower bounds $m$ of $E$. In this case we shall say that $E$ has an infimum $t$ and write $t = \inf E$.
(iv) $E$ is said to be bounded if and only if it is bounded above and below.
When a set $E$ contains its supremum (respectively, its infimum) we shall frequently write $\max E$ for $\sup E$ (respectively, $\min E$ for $\inf E$).

[Some authors call the supremum the least upper bound and the infimum the greatest lower bound. We will not use this terminology because it is somewhat old-fashioned and because it confuses some students, since the least upper bound of a given set is always greater than or equal to the greatest lower bound.]

To relate suprema to infima, we define the reflection of a set $E \subseteq \mathbf{R}$ by

$$-E := \{x : x = -a \text{ for some } a \in E\}.$$

For example, $-(1, 2] = [-2, -1)$.

The following result shows that the supremum of a set is the same as the negative of its reflection's infimum. This can be used to prove a completeness axiom for infima (see Exercise 6b).

1.28 THEOREM. Let $E \subseteq \mathbf{R}$ be nonempty.
(i) $E$ has a supremum if and only if $-E$ has an infimum, in which case

$$\inf(-E) = -\sup E.$$

(ii) $E$ has an infimum if and only if $-E$ has a supremum, in which case

$$\sup(-E) = -\inf E.$$
PROOF. The proofs of these statements are similar. We prove only the first statement.

Suppose that $E$ has a supremum $s$ and set $t = -s$. Since $s$ is an upper bound for $E$, $s \ge a$ for all $a \in E$, so $-s \le -a$ for all $a \in E$. Therefore, $t$ is a lower bound of $-E$. Suppose that $m$ is any lower bound of $-E$. Then $m \le -a$ for all $a \in E$, so $-m$ is an upper bound of $E$. Since $s$ is the supremum of $E$, it follows that $s \le -m$; i.e., $t = -s \ge m$. Thus $t$ is the infimum of $-E$ and $\sup E = s = -t = -\inf(-E)$.

Conversely, suppose that $-E$ has an infimum $t$. By definition, $t \le -a$ for all $a \in E$. Thus $-t$ is an upper bound for $E$. Since $E$ is nonempty, $E$ has a supremum by the Completeness Axiom. ∎

Theorem 1.28 allows us to obtain information about infima from results about suprema, and vice versa (see the proof of the next theorem and Exercises 5 and 6).

We shall use the following result many times.

1.29 THEOREM [MONOTONE PROPERTY]. Suppose that $A \subseteq B$ are nonempty subsets of $\mathbf{R}$.
(i) If $B$ has a supremum, then $\sup A \le \sup B$.
(ii) If $B$ has an infimum, then $\inf A \ge \inf B$.

PROOF. (i) Since $A \subseteq B$, any upper bound of $B$ is an upper bound of $A$. Therefore, $\sup B$ is an upper bound of $A$. It follows from the Completeness Axiom that $\sup A$ exists, and from Definition 1.16iii that $\sup A \le \sup B$.

(ii) Clearly, $-A \subseteq -B$. Thus by part (i), Theorem 1.28, and the second Multiplicative Property,

$$\inf A = -\sup(-A) \ge -\sup(-B) = \inf B.$$ ∎
It is convenient to extend the definition of suprema and infima to all subsets of $\mathbf{R}$. To do this we expand the definition of $\mathbf{R}$ as follows. By an extended real number $x$ we mean either $x \in \mathbf{R}$, $x = \infty$, or $x = -\infty$.

Let $E \subseteq \mathbf{R}$ be nonempty. We shall define $\sup E = +\infty$ if $E$ is unbounded above and $\inf E = -\infty$ if $E$ is unbounded below. Finally, we define $\sup \emptyset = -\infty$ and $\inf \emptyset = +\infty$. Notice, then, that the supremum of a subset $E$ of $\mathbf{R}$ (respectively, the infimum of $E$) is finite if and only if $E$ is nonempty and bounded above (respectively, nonempty and bounded below). Moreover, under the convention $-\infty \le a$ and $a \le \infty$ for all $a \in \mathbf{R}$, the Monotone Property still holds for this extended definition; i.e., if $A$ and $B$ are subsets of $\mathbf{R}$ and $A \subseteq B$, then $\sup A \le \sup B$ and $\inf A \ge \inf B$.
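For finite sets, where the supremum is simply the largest element, these conventions are easy to experiment with. The Python sketch below is restricted to finite sets of numbers (an assumption of the sketch, since a program cannot enumerate a general subset of $\mathbf{R}$) and uses the floating-point values `math.inf` and `-math.inf` to stand in for $\infty$ and $-\infty$. The function names `sup` and `inf` are illustrative.

```python
import math


def sup(E):
    """Supremum of a finite set of reals, with sup of the empty set = -infinity."""
    return max(E, default=-math.inf)


def inf(E):
    """Infimum of a finite set of reals, with inf of the empty set = +infinity."""
    return min(E, default=math.inf)


empty = set()
A = {2, 3}
B = {1, 2, 3, 5}

# Monotone Property for the chain  empty <= A <= B  of subsets.
monotone = (sup(empty) <= sup(A) <= sup(B)
            and inf(empty) >= inf(A) >= inf(B))
```

The choice $\sup \emptyset = -\infty$ and $\inf \emptyset = +\infty$ is exactly what makes the comparison involving `empty` come out true, which is one way to remember which sign goes with which convention.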
EXERCISES

1. Find the infimum and supremum of each of the following sets.
(a) $E = \{4, 3, 2, 1, 8, 7, 6, 5\}$.
(b) $E = \{x \in \mathbf{R} : x^2 - 3x - 5 = 0\}$.
(c) $E = [a, b)$, where $a < b$ are real numbers.
(d) $E = \{p/q \in \mathbf{Q} : p^2 < 2q^2 \text{ and } p, q > 0\}$.
(e) $E = \{x \in \mathbf{R} : x = 1 + (-1)^n \text{ for } n \in \mathbf{N}\}$.
(f) $E = \{x \in \mathbf{R} : x = 1/n - (-1)^n \text{ for } n \in \mathbf{N}\}$.
(g) $E = \{1 + (-1)^n/n : n \in \mathbf{N}\}$.

2. Show that if $E$ is a nonempty bounded subset of $\mathbf{Z}$, then both $\sup E$ and $\inf E$ exist and belong to $E$.

3. [DENSITY OF IRRATIONALS] This exercise is used in Section 3.3. Prove that if $a < b$ are real numbers, then there is an irrational $\xi \in \mathbf{R}$ such that $a < \xi < b$.

4. Prove that for each $a \in \mathbf{R}$ and each $n \in \mathbf{N}$ there exists a rational $r_n$ such that $|a - r_n| < 1/n$.

5. [APPROXIMATION PROPERTY FOR INFIMA] This exercise is used in many sections, including 2.2 and 5.1.
(a) By modifying the proof of Theorem 1.20, prove that if a set $E \subset \mathbf{R}$ has a finite infimum and $\varepsilon > 0$ is any positive number, then there is a point $a \in E$ such that $\inf E + \varepsilon > a \ge \inf E$.
(b) Give a second proof of the Approximation Property for Infima by using Theorem 1.28.

6. (a) Prove that a lower bound of a set need not be unique but the infimum of a given set $E$ is unique.
(b) Prove that if $E$ is a nonempty subset of $\mathbf{R}$ that is bounded below, then $E$ has a finite infimum.
24
Chapter 1
THE REAL NUMBER SYSTEM
7. (a) Prove that if x is an upper bound of a set E ~ R and x E E, then x is the supremum of E. (b) Make and prove an analogous statement for the infimum of E. (c) Show by example that the converse of each of these statements is false. S. Let Xn E R and suppose that there is an MER such that Ixnl ~ M for n E N. Prove that Sn = sup{ x n , x n+1, ... } defines a real number for each n E N and Sl ~ S2 ~ •.. Prove an analogous result about tn = inf{xn,xn+1""}' 9. Prove that if a and b are real numbers and ~ a < b, then there exist n, mEN such that a < m/lOn < b. 10. Suppose that E, A, Be R and E = Au B. Prove that if E has a supremum and both A and B are nonempty, then sup A and sup B both exist, and sup E is one of the numbers sup A or sup B.
°
1.4 FUNCTIONS, COUNTABILITY, AND THE ALGEBRA OF SETS
In this section we examine the role that functions play in distinguishing one kind of infinite set from another and use this point of view to obtain more information about the special subsets of $\mathbf{R}$ introduced in Section 1.1. We also introduce "transfinite" unions and intersections of sets and examine what happens to them under images and inverse images by functions. We begin with some preliminary remarks. For the first half of this course, most of the concrete functions we consider will be real-valued functions of a real variable, i.e., functions $f : E \to \mathbf{R}$ where $E \subseteq \mathbf{R}$. We shall often call such functions simply real functions.
We assume that you are familiar with the following real functions: the trigonometric functions $\sin x$, $\cos x$, $\tan x$, $\cot x$, $\sec x$, $\csc x$; the natural logarithm $\log x$ and its inverse $e^x$; and the power functions $x^\alpha$, which are defined using the exponential function by
$$x^\alpha := e^{\alpha \log x}, \qquad x > 0,\ \alpha \in \mathbf{R}.$$
We also assume that you can differentiate algebraic combinations of these functions using the basic formulas $(\sin x)' = \cos x$, $(\cos x)' = -\sin x$, and $(e^x)' = e^x$, for $x \in \mathbf{R}$; $(\log x)' = 1/x$ and $(x^\alpha)' = \alpha x^{\alpha - 1}$, for $x > 0$ and $\alpha \in \mathbf{R}$; and
$$(\tan x)' = \sec^2 x \qquad \text{for } x \ne \frac{(2n+1)\pi}{2},\ n \in \mathbf{Z}.$$
(For a derivation of these identities based on fundamental properties, see Exercise 4, p. 101, and Exercises 4 and 5, p. 134.) Even with these assumptions, we shall repeat some material from elementary calculus.
At this point it is important to notice a consequence of defining a function to be a set of ordered pairs (see p. 2), the domain cross the range. The notation $f : X \to Y$ means that the domain of $f$ is $X$ and all images of points in $X$ under $f$ belong to $Y$. Thus if $f(x) = x^2$, then $f : [0,1) \to [0,1)$ and $f : (-1,1) \to [0,1)$ are two different functions. They both have the same range, but the first one has domain $[0,1)$ and the second one has domain $(-1,1)$. Making distinctions like this will actually make our life easier later on in the course.
Let $f : X \to Y$. Although, by the definition of a function, each $x \in X$ is assigned a unique (meaning one and only one) $y = f(x) \in Y$, there is nothing that keeps two $x$'s from being assigned to the same $y$, and nothing that says every $y \in Y$ corresponds to some $x \in X$. Functions that satisfy these additional properties are important enough to warrant separate terminology.
1.30 DEFINITION. Let $f$ be a function from a set $X$ into a set $Y$.
(i) $f$ is said to be one-to-one (1-1) on $X$ if and only if $x_1, x_2 \in X$ and $f(x_1) = f(x_2)$ imply $x_1 = x_2$.
(ii) $f$ is said to take $X$ onto $Y$ if and only if for each $y \in Y$ there is an $x \in X$ such that $y = f(x)$.
For example, the function $f(x) = x^2$ is 1-1 from $[0, \infty)$ onto $[0, \infty)$ but not 1-1 on any open interval containing $0$. Some authors call 1-1 functions injections, onto functions surjections, and 1-1 onto functions bijections. Here is a simple, useful characterization of bijections from one set $X$ to another $Y$.
1.31 THEOREM. Let $X$ and $Y$ be sets and $f : X \to Y$. Then $f$ is 1-1 from $X$ onto $Y$ if and only if there is a unique function $g$ from $Y$ onto $X$ that satisfies
$$f(g(y)) = y, \qquad y \in Y \tag{12}$$
and
$$g(f(x)) = x, \qquad x \in X. \tag{13}$$
PROOF. Suppose that $f$ is 1-1 and onto. For each $y \in Y$ choose the unique $x \in X$ such that $f(x) = y$, and define $g(y) := x$. It is clear that $g$ takes $Y$ onto $X$.
Moreover, by construction, (12) and (13) are satisfied. Conversely, suppose that there is a function $g$ from $Y$ onto $X$ that satisfies (12) and (13). If $x_1, x_2 \in X$ and $f(x_1) = f(x_2)$, then it follows from (13) that $x_1 = g(f(x_1)) = g(f(x_2)) = x_2$. Thus $f$ is 1-1 on $X$. If $y \in Y$ and $x = g(y)$, then (12) implies that $f(x) = f(g(y)) = y$. Thus $f$ takes $X$ onto $Y$. Finally, suppose that $h$ is another function that satisfies (12) and (13), and $y \in Y$. Choose $x \in X$ such that $f(x) = y$. Then, by (13),
$$h(y) = h(f(x)) = x = g(f(x)) = g(y);$$
i.e., $h = g$ on $Y$. It follows that the function $g$ is unique. ∎
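For finite sets, the construction in this proof can be carried out mechanically. The sketch below is illustrative only: the helper name `inverse_of` is hypothetical, a Python `dict` stands in for the function $g$, and the sets are small finite lists rather than the general sets of the theorem.

```python
def inverse_of(f, X, Y):
    """Return g (as a dict from Y to X) with f(g(y)) = y and g(f(x)) = x,
    or None when f is not 1-1 from X onto Y."""
    g = {}
    for x in X:
        y = f(x)
        if y not in Y or y in g:   # f leaves Y, or two x's share one y
            return None
        g[y] = x                   # g(y) := the unique x with f(x) = y
    if set(g) != set(Y):           # some y in Y is never hit: f is not onto
        return None
    return g

def square(x):
    return x * x

g = inverse_of(square, [0, 1, 2, 3], [0, 1, 4, 9])
print(g)                                         # a dict representing the inverse
print(inverse_of(square, [-1, 0, 1], [0, 1]))    # None: square is not 1-1 here
```

On the first pair of sets the squaring function is a bijection, so the dictionary satisfies both (12) and (13); on the second pair it fails to be 1-1, and no such $g$ exists.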
If f is 1-1 from a set X onto a set Y, we shall say that f has an inverse function. We shall call the function 9 given in Theorem 1.31 the inverse of f, and denote it
by $f^{-1}$. [Note: This is different from the function $(f(x))^{-1} := 1/f(x)$.] Notice by (12) and (13) that
$$f(f^{-1}(y)) = y \qquad \text{and} \qquad f^{-1}(f(x)) = x$$
for all $y \in Y$ and $x \in X$.
Let $f$ be a real function. If $f$ has an inverse function $f^{-1}$ and $y = f(x)$, we have by definition that $(x, f(x)) = (f^{-1}(y), y)$. Hence, the graph of $y = f^{-1}(x)$ is a reflection of the graph of $y = f(x)$ about the line $y = x$ (see Figure 1.2).
How can we prove that a given function $f$ is 1-1 on a set $E$? By definition, we must show that if $a, b \in E$ and $f(a) = f(b)$, then $a = b$. One way to accomplish this is to solve $f(a) = f(b)$ for $a$, hoping to get $b$ as an answer. For example, to show $f(x) = e^x + 1$ is 1-1 on $\mathbf{R}$, suppose that $e^a + 1 = e^b + 1$, subtract 1 from both sides, and take the logarithm of the resulting expression. We obtain $a = b$, so $f$ is 1-1 on $\mathbf{R}$. This simplistic approach will not work if $f$ is suitably complicated, because it is not possible to solve all algebraic expressions, e.g., general polynomials of degree $\ge 5$. Fortunately, if $f$ is differentiable, there is a simple sufficient condition to prove that $f$ is 1-1 on a given interval.
1.32 Remark. If $f$ is differentiable on an open interval $I$ and $f'(x) \ne 0$ for all $x \in I$, then $f$ is 1-1 on $I$.
PROOF. You may remember the Mean Value Theorem (see Theorem 4.15) from elementary calculus: If $a < b$ and the derivative $f'$ of a function $f$ exists at every point in an interval $[a, b]$, then there is a $c \in (a, b)$ such that $f(b) - f(a) = (b - a) f'(c)$. Suppose that $f$ is not 1-1 on $I$. Then there exist points $a \ne b$ in $I$ such that $f(a) = f(b)$. Hence by the Mean Value Theorem,
$$0 = \frac{f(b) - f(a)}{b - a} = f'(c)$$
for some $c$ between $a$ and $b$, hence some $c \in I$. This contradicts the fact that $f'$ is never zero on $I$. ∎
The following example shows that the inverse function $y = f^{-1}(x)$ can sometimes be found by treating $y = f(x)$ as a relation that implicitly defines $x = f^{-1}(y)$ and solving for $x$.
1.33 Example. Prove that $f(x) = e^x - e^{-x}$ is 1-1 on $\mathbf{R}$. Find a formula for $f^{-1}$.
SOLUTION. Since $f'(x) = e^x + e^{-x} > 0$ is never zero, Remark 1.32 implies that $f$ is 1-1 on $\mathbf{R}$. To find $f^{-1}$, let $y = e^x - e^{-x}$. Multiplying this equation by $e^x$ and collecting all nonzero terms on one side of the equation, we have
$$e^{2x} - y e^x - 1 = 0,$$
a quadratic in $e^x$. By the quadratic formula,
$$e^x = \frac{y \pm \sqrt{y^2 + 4}}{2}.$$
Figure 1.2 (the graph of $y = f^{-1}(x)$ as the reflection of the graph of $y = f(x)$ about the line $y = x$)
Since $e^x$ is always positive, the minus sign must be discarded. Taking the logarithm of this last identity, we obtain $x = \log(y + \sqrt{y^2 + 4}) - \log 2$. Therefore,
$$f^{-1}(x) = \log\left(x + \sqrt{x^2 + 4}\right) - \log 2.$$ ∎
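The formula obtained in Example 1.33 can be checked numerically. This is a floating-point spot check at a few sample points, not a proof; the names `f` and `f_inv` are chosen here for illustration.

```python
import math

def f(x):
    """f(x) = e^x - e^(-x), as in Example 1.33."""
    return math.exp(x) - math.exp(-x)

def f_inv(y):
    """Candidate inverse: log(y + sqrt(y^2 + 4)) - log 2."""
    return math.log(y + math.sqrt(y * y + 4.0)) - math.log(2.0)

# Both compositions should return the starting point, up to rounding error.
for x in (-2.0, -0.5, 0.0, 1.0, 3.0):
    print(x, f_inv(f(x)), f(f_inv(x)))
```

Taking the root with the plus sign keeps the argument of the logarithm positive for every real input, which is why the minus sign is discarded.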
The following will not be used for core material until Chapters 8, 9, and 10. Functions that have inverses can be used to "count" infinite sets. Before we make a formal definition, let us examine what it means to count a finite set of objects $E$. When we count $E$, we assign a number $n \in \mathbf{N}$ to each object in $E$; i.e., we construct a function $f$ from a subset of $\mathbf{N}$ to $E$. For example, if $E$ has three objects, then the "counting" function takes $\{1, 2, 3\}$ to $E$. Now in order to count $E$ properly, we must be careful to avoid two pitfalls. We must not count any element of $E$ more than once (i.e., $f$ must be 1-1), and we cannot miss any element of $E$ (i.e., $f$ must take $\{1, 2, 3\}$ onto $E$). Accordingly, we make the following definition.
1.34 DEFINITION. Let $E$ be a set.
(i) $E$ is said to be finite if and only if either $E = \emptyset$ or there is an $n \in \mathbf{N}$ and a 1-1 function from $\{1, 2, \ldots, n\}$ onto $E$.
(ii) $E$ is said to be countable if and only if there is a 1-1 function from $\mathbf{N}$ onto $E$.
(iii) $E$ is said to be at most countable if and only if $E$ is either finite or countable.
(iv) $E$ is said to be uncountable if and only if $E$ is neither finite nor countable.
Loosely speaking, a set is countable if it has the same number of elements as $\mathbf{N}$, finite if it has fewer, uncountable if it has more. To show that a set $E$ is countable, it suffices to exhibit a 1-1 function from $\mathbf{N}$ onto $E$. For example, the set of positive even integers $E = \{2, 4, \ldots\}$ is countable because $f(k) := 2k$ is 1-1 from $\mathbf{N}$ onto $E$. Thus, two infinite sets can have the same number of elements even though one is a proper subset of the other. (In fact, this property can be used as a definition of "infinite set.") The following result shows that not every infinite set is countable.
1.35 Remark [CANTOR'S DIAGONALIZATION ARGUMENT]. The open interval $(0, 1)$ is uncountable.
STRATEGY. Suppose to the contrary that $(0, 1)$ is countable. Then by definition, there is a function $f$ on $\mathbf{N}$ such that $f(1), f(2), \ldots$ exhausts the elements of $(0, 1)$. We could reach a contradiction if we could find a new number $x \in (0, 1)$ that is different from all the $f(k)$'s. How can we determine whether two numbers are different? One easy way is to look at their decimal expansions. For example, $0.1234 \ne 0.1254$ because they have different decimal expansions. Thus, we could find an $x$ that has no preimage under $f$ by making the decimal expansion of $x$ different by at least one digit from the decimal expansion of EVERY $f(k)$.
There is a flaw in this approach that we must fix. Decimal expansions are unique except for finite decimals, which always have an alternative expansion that terminates in 9's, e.g., $0.5 = 0.4999\ldots$ and $0.24 = 0.23999\ldots$ (see Exercise 10, p. 44). Hence, when specifying the decimal expansion of $x$ we must avoid decimals that terminate in 9's.
PROOF. Suppose that there is a 1-1 function $f$ that takes $\mathbf{N}$ onto the interval $(0, 1)$. Write the numbers $f(j)$, $j \in \mathbf{N}$, in decimal notation, using the finite expansion when possible; i.e.,
$$f(1) = 0.a_{11}a_{12}\ldots,$$
$$f(2) = 0.a_{21}a_{22}\ldots,$$
$$f(3) = 0.a_{31}a_{32}\ldots,$$
$$\ldots,$$
where $a_{ij}$ represents the $j$th digit in the decimal expansion of $f(i)$ and none of these expansions terminates in 9's. Let $x$ be the number whose decimal expansion is given by $0.\beta_1\beta_2\ldots$, where
$$\beta_k := \begin{cases} a_{kk} + 1 & \text{if } a_{kk} \le 5, \\ a_{kk} - 1 & \text{if } a_{kk} > 5. \end{cases}$$
Clearly, $x$ is a number in $(0, 1)$ whose decimal expansion does not contain one 9, much less terminate in 9's. Since $f$ is onto, there is a $j \in \mathbf{N}$ such that $f(j) = x$. Since we have avoided 9's, the decimal expansions of $f(j)$ and $x$ must be identical, e.g., $a_{jj} = \beta_j := a_{jj} \pm 1$. It follows that $0 = \pm 1$, a contradiction. ∎
It is natural to ask about the countability of the sets $\mathbf{Z}$, $\mathbf{Q}$, and $\mathbf{R}$. To answer these questions, we prove several preliminary results. First, to show that a set $E$ is countable, we do not need to construct a ONE-TO-ONE function from $\mathbf{N}$ onto $E$.
1.36 Lemma. A nonempty set $E$ is at most countable if and only if there is a function $g$ that takes $\mathbf{N}$ onto $E$.
PROOF. If $E$ is countable, then by Definition 1.34ii there is nothing to prove. If $E$ is finite, then there is an $n \in \mathbf{N}$ and a 1-1 function $f$ that takes $\{1, 2, \ldots, n\}$ onto
$E$. Hence
$$g(j) := \begin{cases} f(j) & j \le n \\ f(1) & j > n \end{cases}$$
takes $\mathbf{N}$ onto $E$.
Conversely, suppose that $g$ takes $\mathbf{N}$ onto $E$. We need to construct a function $f$ that is 1-1 from some subset of $\mathbf{N}$ onto $E$. We will do this by eliminating the duplication in $g$. To this end, let $k_1 = 1$. If the set $E_1 := \{k \in \mathbf{N} : g(k) \ne g(k_1)\}$ is empty, then $E = \{g(k_1)\}$, thus evidently at most countable. Otherwise, let $k_2$ be the least element in $E_1$ and notice that $k_2 > k_1$. Set $E_2 := \{k \in \mathbf{N} : g(k) \in E \setminus \{g(k_1), g(k_2)\}\}$. If $E_2$ is empty, then $E = \{g(k_1), g(k_2)\}$ is finite, hence at most countable. Otherwise, let $k_3$ be the least element in $E_2$. Since $g(k_3) \in E \setminus \{g(k_1), g(k_2)\}$, we have $g(k_3) \ne g(k_1)$ and $g(k_3) \ne g(k_2)$. Since $g$ is a function, the second condition implies $k_3 \ne k_2$. Since $k_2$ is least in $E_1$, the first condition then implies $k_2 < k_3$. Hence, $k_1 < k_2 < k_3$.
Continue this process. If it ever terminates, then some
$$E_j := \{k \in \mathbf{N} : g(k) \in E \setminus \{g(k_1), \ldots, g(k_j)\}\}$$
is empty, so $E$ is finite, hence at most countable. If this process never terminates, then we generate integers $k_1 < k_2 < \cdots$ such that $k_{j+1}$ is the least element of $E_j$ for $j = 1, 2, \ldots$.
Set $f(j) = g(k_j)$, $j \in \mathbf{N}$. To show that $f$ is 1-1, notice that $j \ne \ell$ implies that $k_j \ne k_\ell$, say $k_j < k_\ell$. Then $k_j \le k_{\ell - 1}$, so by construction
$$g(k_\ell) \in E \setminus \{g(k_1), \ldots, g(k_j), \ldots, g(k_{\ell - 1})\} \subseteq E \setminus \{g(k_1), \ldots, g(k_j)\}.$$
In particular, $g(k_\ell) \ne g(k_j)$; i.e., $f(\ell) \ne f(j)$.
To show that $f$ is onto, let $x \in E$. Since $g$ is onto, choose $\ell \in \mathbf{N}$ such that $g(\ell) = x$. Since by construction $j \le k_j$, we can choose (by the Archimedean Principle) a $j \in \mathbf{N}$ such that $k_j > \ell$. Since $k_j$ is the least element in $E_{j-1}$, it follows that $g(\ell)$ cannot belong to $E \setminus \{g(k_1), \ldots, g(k_{j-1})\}$; i.e., $g(\ell) = g(k_n)$ for some $n \in [1, j - 1]$. In particular, $f(n) = g(k_n) = x$. ∎
Next, we show how set containment affects countability, and use it to answer the question about countability of $\mathbf{R}$.
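The duplicate-elimination step in the proof of Lemma 1.36 amounts to keeping the first index at which each value of $g$ appears. The sketch below illustrates this on a finite initial segment of $\mathbf{N}$; the function name `first_occurrence_indices`, the cutoff `limit`, and the sample function `g` are all assumptions made for the illustration.

```python
def first_occurrence_indices(g, limit):
    """Scan k = 1, 2, ..., limit and keep each k at which g takes a value
    not seen at any earlier index; these are k_1 < k_2 < ... from the proof."""
    seen = set()
    ks = []
    for k in range(1, limit + 1):
        value = g(k)
        if value not in seen:   # k is the least index producing a new value
            seen.add(value)
            ks.append(k)
    return ks

def g(k):
    """A sample onto map with many repeated values: k^2 modulo 5."""
    return (k * k) % 5

ks = first_occurrence_indices(g, 10)
f_values = [g(k) for k in ks]   # f(j) := g(k_j) hits each value exactly once
print(ks, f_values)
```

Because every index below $k_{j+1}$ yields a value already seen, the index recorded next is exactly the least element of the set $E_j$ used in the proof.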
1.37 THEOREM. Suppose that $A$ and $B$ are sets.
(i) If $A \subseteq B$ and $B$ is at most countable, then $A$ is at most countable.
(ii) If $A \subseteq B$ and $A$ is uncountable, then $B$ is uncountable.
(iii) $\mathbf{R}$ is uncountable.
PROOF. (i) Since $B$ is at most countable, choose by Lemma 1.36 a function $g$ that takes $\mathbf{N}$ onto $B$. We may suppose that $A$ is nonempty, hence fix an $a_0 \in A$. Then
$$f(n) := \begin{cases} g(n) & g(n) \in A \\ a_0 & g(n) \notin A \end{cases}$$
takes $\mathbf{N}$ onto $A$. Hence by Lemma 1.36, $A$ is at most countable.
(ii) If $B$ were at most countable, then by part (i), $A$ would also be at most countable, a contradiction.
(iii) Since the interval $(0, 1)$ is uncountable (by Remark 1.35) and a subset of $\mathbf{R}$, it follows from part (ii) that $\mathbf{R}$ is uncountable. ∎
The following result shows that the Cartesian product of two countable sets is countable, and that a countable union of countable sets is countable.
1.38 THEOREM. Let $A_1, A_2, \ldots$ be at most countable sets.
(i) Then $A_1 \times A_2$ is at most countable.
(ii) If
$$E = \bigcup_{j \in \mathbf{N}} A_j := \bigcup_{j=1}^{\infty} A_j := \{x : x \in A_j \text{ for some } j \in \mathbf{N}\},$$
then $E$ is at most countable.
PROOF. (i) By Lemma 1.36, there exist functions $\phi$ (respectively, $\psi$) that take $\mathbf{N}$ onto $A_1$ (respectively, onto $A_2$). Hence $f(n, m) := (\phi(n), \psi(m))$ takes $\mathbf{N} \times \mathbf{N}$ onto $A_1 \times A_2$. If we can construct a function $g$ that takes $\mathbf{N}$ onto $\mathbf{N} \times \mathbf{N}$, then by Exercise 9a, $f \circ g$ takes $\mathbf{N}$ onto $A_1 \times A_2$. Hence by Lemma 1.36, $A_1 \times A_2$ is at most countable.
To construct the function $g$, plot the points of $\mathbf{N} \times \mathbf{N}$ in the plane. Notice that we can connect these lattice points with a series of parallel backward-slanted lines, e.g., the first line passes through $(1,1)$, the second line passes through $(1,2)$ and $(2,1)$, and the third line passes through $(1,3)$, $(2,2)$, and $(3,1)$. This suggests a method for constructing $g$. Set $g(1) = (1,1)$, $g(2) = (1,2)$, $g(3) = (2,1)$, $g(4) = (1,3)$, .... If you wish to see an explicit formula for $g$, observe that the $n$th line passes through the set of lattice points
$$(1, n),\ (2, n-1),\ (3, n-2),\ \ldots,\ (n-1, 2),\ (n, 1),$$
i.e., through the set of lattice points $(k, j)$ that satisfy $k + j = n + 1$. Since the sum of integers $1 + 2 + \cdots + (n-1)$ is given by $(n-1)n/2$ (see Exercise 1a, p. 17), there are $(n-1)n/2$ elements in the first $n - 1$ slanted lines. Hence a function that takes $\mathbf{N}$ onto the $n$th slanted line is given by
$$g(j) = (\ell, n + 1 - \ell), \tag{14}$$
where $j = \ell + (n-1)n/2$. This function is defined on all of $\mathbf{N}$ because given $j \in \mathbf{N}$, we can use the Archimedean Principle and the Well-Ordering Principle to choose $n$ least such that $j \le n(n+1)/2$, i.e., such that $j = \ell + (n-1)n/2$ for some $\ell \in [1, n]$. Thus $g$ takes $\mathbf{N}$ onto $\mathbf{N} \times \mathbf{N}$.
(ii) By Lemma 1.36, choose functions $f_j$ that take $\mathbf{N}$ onto $A_j$, $j \in \mathbf{N}$. Clearly, the function $h(k, j) := f_k(j)$ takes $\mathbf{N} \times \mathbf{N}$ onto $E$. Hence the function $h \circ g$, where $g$ is defined by (14), takes $\mathbf{N}$ onto $E$. We conclude by Lemma 1.36 that $E$ is at most countable. ∎
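Formula (14) can be turned into a short program. The sketch below follows the recipe in the proof: find the least $n$ with $j \le n(n+1)/2$, then recover $\ell$ from $j = \ell + (n-1)n/2$. The linear search for $n$ is a simplification chosen for readability, not part of the text.

```python
def g(j):
    """Return the j-th lattice point (l, n + 1 - l) of formula (14),
    where j = l + (n - 1) n / 2 and 1 <= l <= n."""
    n = 1
    while j > n * (n + 1) // 2:   # least n with j <= n(n+1)/2
        n += 1
    l = j - (n - 1) * n // 2      # position of j along the n-th slanted line
    return (l, n + 1 - l)

# The first ten slanted lines contain 1 + 2 + ... + 10 = 55 lattice points.
points = [g(j) for j in range(1, 56)]
print(points[:6])
print(len(set(points)) == 55)     # no lattice point is listed twice
```

On this initial segment the enumeration lists every pair $(k, j)$ with $k + j \le 11$ exactly once, which is the finite shadow of the statement that $g$ takes $\mathbf{N}$ onto $\mathbf{N} \times \mathbf{N}$.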
1.39 Remark. The sets Z and Q are countable, but the set of irrationals is uncountable.
PROOF. $\mathbf{Z} = \mathbf{N} \cup (-\mathbf{N}) \cup \{0\}$ and $\mathbf{Q} = \bigcup_{n=1}^{\infty} \{p/n : p \in \mathbf{Z}\}$ are both countable by Theorem 1.38ii. If $\mathbf{R} \setminus \mathbf{Q}$ were countable, then $\mathbf{R} = (\mathbf{R} \setminus \mathbf{Q}) \cup \mathbf{Q}$ would also be countable, a contradiction of Theorem 1.37iii. ∎
Theorem 1.38 says something about a countable union of sets. In Chapters 9 and 10, we need to consider uncountable unions and intersections. Here is some notation that will prove useful in that regard.
A collection of sets $\mathcal{E}$ is said to be indexed by a set $A$ if and only if there is a function $F$ from $A$ onto $\mathcal{E}$. In this case $A$ is called the index set of $\mathcal{E}$, and we shall represent $F(\alpha)$ by $E_\alpha$. In particular, we shall represent a collection of sets indexed by $A$ by $\{E_\alpha\}_{\alpha \in A}$.
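1.40 DEFINITION. Let $\mathcal{E} = \{E_\alpha\}_{\alpha \in A}$ be a collection of sets.
(i) The union of the collection $\mathcal{E}$ is the set
$$\bigcup_{\alpha \in A} E_\alpha := \{x : x \in E_\alpha \text{ for some } \alpha \in A\}.$$
(ii) The intersection of the collection $\mathcal{E}$ is the set
$$\bigcap_{\alpha \in A} E_\alpha := \{x : x \in E_\alpha \text{ for all } \alpha \in A\}.$$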
There is an easy way to get from unions to intersections, and vice versa.
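1.41 THEOREM [DEMORGAN'S LAWS]. Let $X$ be a set and $\{E_\alpha\}_{\alpha \in A}$ be a collection of subsets of $X$. If for each $E \subseteq X$ the symbol $E^c$ represents the set $X \setminus E$, then
$$\Bigl(\bigcup_{\alpha \in A} E_\alpha\Bigr)^c = \bigcap_{\alpha \in A} E_\alpha^c \tag{15}$$
and
$$\Bigl(\bigcap_{\alpha \in A} E_\alpha\Bigr)^c = \bigcup_{\alpha \in A} E_\alpha^c. \tag{16}$$
PROOF. Suppose that $x$ belongs to the left side of (15); i.e., $x \in X$ and $x \notin \bigcup_{\alpha \in A} E_\alpha$. By definition, $x \in X$ and $x \notin E_\alpha$ for all $\alpha \in A$. Hence, $x \in E_\alpha^c$ for all $\alpha \in A$; i.e., $x$ belongs to the right side of (15). These steps are reversible. This verifies (15). A similar argument verifies (16). ∎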
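DeMorgan's Laws can be checked directly on a small indexed family using Python's built-in `set` type. The universe `X`, the index set (the keys of `family`), and the member sets below are arbitrary choices made for this illustration.

```python
X = set(range(10))                       # the ambient set X
family = {                               # a collection {E_alpha} indexed by "a", "b", "c"
    "a": {0, 1, 2},
    "b": {2, 3, 4},
    "c": {2, 4, 6, 8},
}

def complement(E):
    """E^c := X \\ E."""
    return X - E

union = set().union(*family.values())
intersection = set(X).intersection(*family.values())

complements = [complement(E) for E in family.values()]
lhs_15 = complement(union)
rhs_15 = set(X).intersection(*complements)
lhs_16 = complement(intersection)
rhs_16 = set().union(*complements)

print(lhs_15 == rhs_15, lhs_16 == rhs_16)
```

Starting the intersection from `set(X)` is harmless here because every member of the family is a subset of `X`.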
The following concepts will be used frequently in subsequent chapters.
1.42 DEFINITION. Let $X$ and $Y$ be sets and $f : X \to Y$. The image of a set $E \subseteq X$ under $f$ is the set
$$f(E) := \{y \in Y : y = f(x) \text{ for some } x \in E\}.$$
The inverse image of a set $E \subseteq Y$ under $f$ is the set
$$f^{-1}(E) := \{x \in X : f(x) = y \text{ for some } y \in E\}. \tag{17}$$
Notice that equation (17) makes sense whether or not $f$ is 1-1; i.e., $f$ need not be 1-1 for $f^{-1}(E)$ to be defined. In particular, $f^{-1}(E)$ is the inverse image of $E$ under $f$, not the image of $E$ under the inverse function $f^{-1}$ (unless $f$ is 1-1). In fact, the inverse function $f^{-1}$ exists on $f(X)$ if and only if the inverse image $f^{-1}(\{y\})$ contains at most one point for all $y \in Y$.
Another indication that the inverse image of a set is different from its image under the inverse function can be seen by examining $f^{-1}(f(E))$. If $f$ were 1-1, this set would be $E$ (see Exercise 6). In general, however, the best one can say is $f^{-1}(f(E)) \supseteq E$ for any $E \subseteq \operatorname{Dom} f$ (see Theorem 1.43v). If $f$ is not 1-1, $E$ can be a PROPER subset of $f^{-1}(f(E))$. For example, let $f : \mathbf{R} \to \mathbf{R}$ be defined by $f(x) = x^2$ and set $E = [0, 1)$. Then $f(E) = [0, 1)$, so $f^{-1}(f(E)) = (-1, 1) \supset E$.
The following result, which plays a prominent role in Chapters 9 and 12, describes images and inverse images of unions and intersections of sets.
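1.43 THEOREM. Let $X$ and $Y$ be sets and $f : X \to Y$.
(i) If $\{E_\alpha\}_{\alpha \in A}$ is a collection of subsets of $X$, then
$$f\Bigl(\bigcup_{\alpha \in A} E_\alpha\Bigr) = \bigcup_{\alpha \in A} f(E_\alpha) \qquad \text{and} \qquad f\Bigl(\bigcap_{\alpha \in A} E_\alpha\Bigr) \subseteq \bigcap_{\alpha \in A} f(E_\alpha).$$
(ii) If $B$ and $C$ are subsets of $X$, then
$$f(C \setminus B) \supseteq f(C) \setminus f(B).$$
(iii) If $\{E_\alpha\}_{\alpha \in A}$ is a collection of subsets of $Y$, then
$$f^{-1}\Bigl(\bigcup_{\alpha \in A} E_\alpha\Bigr) = \bigcup_{\alpha \in A} f^{-1}(E_\alpha) \qquad \text{and} \qquad f^{-1}\Bigl(\bigcap_{\alpha \in A} E_\alpha\Bigr) = \bigcap_{\alpha \in A} f^{-1}(E_\alpha).$$
(iv) If $B$ and $C$ are subsets of $Y$, then
$$f^{-1}(C \setminus B) = f^{-1}(C) \setminus f^{-1}(B).$$
(v) If $E \subseteq f(X)$, then $f(f^{-1}(E)) = E$, but if $E \subseteq X$, then $f^{-1}(f(E)) \supseteq E$.
PROOF. (i) By definition, $y \in f(\bigcup_{\alpha \in A} E_\alpha)$ if and only if $y = f(x)$ for some $x \in E_\alpha$ and $\alpha \in A$. This is equivalent to $y \in \bigcup_{\alpha \in A} f(E_\alpha)$. Similarly, $y \in f(\bigcap_{\alpha \in A} E_\alpha)$ if and only if $y = f(x)$ for some $x \in \bigcap_{\alpha \in A} E_\alpha$. This implies that for all $\alpha \in A$ there is an $x_\alpha \in E_\alpha$ such that $y = f(x_\alpha)$. Therefore, $y \in \bigcap_{\alpha \in A} f(E_\alpha)$.
(ii) If $y \in f(C) \setminus f(B)$, then $y = f(c)$ for some $c \in C$ but $y \ne f(b)$ for any $b \in B$. It follows that $y \in f(C \setminus B)$. Similar arguments prove parts (iii), (iv), and (v). ∎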
The set inequalities in parts (i), (ii), and (v) are equalities when $f$ is 1-1 (see Exercise 6). When $f$ is NOT 1-1, they can be strict. For example, if $f : \mathbf{R} \to \mathbf{R}$ is defined by $f(x) = x^2$, $E_1 = \{1\}$, and $E_2 = \{-1\}$, then $f(E_1 \cap E_2) = \emptyset$ is a proper subset of $f(E_1) \cap f(E_2) = \{1\}$.
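The strict inclusions above are easy to reproduce on a finite domain. In this sketch the helper names `image` and `inverse_image` are hypothetical, and the domain `X` is a small set of integers standing in for $\mathbf{R}$.

```python
def image(f, E):
    """f(E) := {f(x) : x in E}."""
    return {f(x) for x in E}

def inverse_image(f, X, E):
    """f^{-1}(E) := {x in X : f(x) in E}; no inverse function is needed."""
    return {x for x in X if f(x) in E}

def square(x):
    return x * x

X = set(range(-3, 4))        # finite stand-in for the domain
E1, E2 = {1}, {-1}

print(image(square, E1 & E2))                    # image of the (empty) intersection
print(image(square, E1) & image(square, E2))     # intersection of the images

E = {0, 1, 2}
print(inverse_image(square, X, image(square, E)))   # contains E, and more
```

The inverse image is computed by filtering the domain, which reflects the remark that $f^{-1}(E)$ is defined whether or not $f$ is 1-1.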
EXERCISES
1. For each of the following, prove $f$ is 1-1 on $E$. Find a formula for $f^{-1}$.
(a) $f(x) = 3x - 7$, $E = \mathbf{R}$.
(b) $f(x) = e^{1/x}$, $E = (0, \infty)$.
(c) $f(x) = \tan x$, $E = (-\pi/2, \pi/2)$.
(d) $f(x) = x^2 + 3x - 6$, $E = [-3/2, \infty)$.
(e) $f(x) = 3x - |x| + |x - 2|$, $E = \mathbf{R}$.
(f) $f(x) = x/(x^2 + 1)$, $E = [-1, 1]$.
2. Suppose that $A$ is finite and $f$ is 1-1 from $A$ onto $B$. Prove that $B$ is finite.
3. Prove that the set of odd integers $\{1, 3, \ldots\}$ is countable.
4. Find $f(E)$ and $f^{-1}(E)$ for each of the following.
(a) $f(x) = 1 - 5x$, $E = (-3, 1)$.
(b) $f(x) = x^2$, $E = [-1, 4]$.
(c) $f(x) = x^2 + x$, $E = [-2, 1)$.
(d) $f(x) = \log(x^2 + x + 1)$, $E = (1/2, 5]$.
(e) $f(x) = \sin x$, $E = [0, \infty)$.
5. Give a simple description of each of the following sets.
(a) $\bigcup_{x \in [0,1]} [x - 1, x + 1]$.
(b) $\bigcup_{k \in \mathbf{N}} [0, 1/k]$.
(c) $\bigcap_{x \in [0,1]} [x - 1, x + 1]$.
(d) $\bigcap_{k \in \mathbf{N}} [0, 1/k]$.
6. Let $X$, $Y$ be sets and $f : X \to Y$. Prove that the following are equivalent.
(a) $f$ is 1-1 on $X$.
(b) $f(A \setminus B) = f(A) \setminus f(B)$ for all subsets $A$ and $B$ of $X$.
(c) $f^{-1}(f(E)) = E$ for all subsets $E$ of $X$.
(d) $f(A \cap B) = f(A) \cap f(B)$ for all subsets $A$ and $B$ of $X$.
7. Prove (16).
8. Prove Theorem 1.43iii, iv, and v.
9. Let $f : A \to B$ and $g : B \to C$ and define $g \circ f : A \to C$ by $(g \circ f)(x) := g(f(x))$.
(a) Show that if $f$, $g$ are 1-1 (respectively, onto), then $g \circ f$ is 1-1 (respectively, onto).
(b) [PIGEONHOLE PRINCIPLE] Prove that if $f$ is 1-1 from $A$ into $B$, then $f^{-1}$ is 1-1 from $f(A)$ onto $A$.
(c) Suppose that $g$ is 1-1 from $B$ onto $C$. Prove that $f$ is 1-1 on $A$ (respectively, onto $B$) if and only if $g \circ f$ is 1-1 on $A$ (respectively, onto $C$).
10. Suppose that $n \in \mathbf{N}$ and
(a) Prove that if $n \in \mathbf{N}$ and $q \in \mathbf{Q}$, then $n^q$ is algebraic.
(b) Prove that for each $n \in \mathbf{N}$ the collection of algebraic numbers of degree $n$ is countable.
(c) Prove that the collection of transcendental numbers is uncountable. (Two famous transcendental numbers are $\pi$ and $e$. For more information on transcendental numbers and their history, see Kline [5].)
Chapter 2
Sequences in R
2.1 LIMITS OF SEQUENCES
An infinite sequence (more briefly, a sequence) is a function whose domain is $\mathbf{N}$. A sequence $f$ whose terms are $x_n := f(n)$ will be denoted by $x_1, x_2, \ldots$ or $\{x_n\}_{n \in \mathbf{N}}$ or $\{x_n\}_{n=1}^{\infty}$ or $\{x_n\}$. Thus $1, 1/2, 1/4, 1/8, \ldots$ represents the sequence $\{1/2^{n-1}\}_{n \in \mathbf{N}}$, $-1, 1, -1, 1, \ldots$ represents the sequence $\{(-1)^n\}_{n \in \mathbf{N}}$, and $1, 2, 3, 4, \ldots$ represents the sequence $\{n\}_{n \in \mathbf{N}}$.
It is important not to confuse a sequence $\{x_n\}_{n \in \mathbf{N}}$ with the set $\{x_n : n \in \mathbf{N}\}$; these are two entirely different concepts. For example, as sequences, $1, 2, 3, 4, \ldots$ is different from $2, 1, 3, 4, \ldots$, but as sets, $\{1, 2, 3, 4, \ldots\}$ is identical with $\{2, 1, 3, 4, \ldots\}$. Again, the sequence $1, -1, 1, -1, \ldots$ is infinite, but the set $\{(-1)^n : n \in \mathbf{N}\}$ has only two points.
The limit concept is one of the fundamental building blocks of analysis. Recall from elementary calculus that a sequence of real numbers $\{x_n\}$ converges to a number $a$ if $x_n$ gets near $a$ (i.e., the distance between $a$ and $x_n$ gets small) as $n$ gets large. Thus, given $\epsilon > 0$ (no matter how small), $|x_n - a|$ gets smaller than $\epsilon$ as $n$ gets large. This leads us to a formal definition of the limit of a sequence.
2.1 DEFINITION. A sequence of real numbers $\{x_n\}$ is said to converge to a real number $a \in \mathbf{R}$ if and only if for every $\epsilon > 0$ there is an $N \in \mathbf{N}$ (which in general depends on $\epsilon$) such that
$$n \ge N \quad \text{implies} \quad |x_n - a| < \epsilon.$$
We shall use the following phrases and notations interchangeably: (a) $\{x_n\}$ converges to $a$; (b) $x_n$ converges to $a$; (c) $a = \lim_{n \to \infty} x_n$; (d) $x_n \to a$ as $n \to \infty$; (e) the limit of $\{x_n\}$ exists and equals $a$.
When $x_n \to a$ as $n \to \infty$, you can think of $x_n$ as a sequence of approximations to $a$, and $\epsilon$ as an upper bound for the error of these approximations. The number $N$ in Definition 2.1 is chosen so that the error is less than $\epsilon$ when $n \ge N$. In general, the smaller $\epsilon$ gets, the larger $N$ must be. (See, for example, Figure 2.1.)
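The dependence of $N$ on $\epsilon$ can be made concrete for the sequence $x_n = 1/n$ with limit $a = 0$, which the text treats formally a little later. The sketch below uses exact rational arithmetic via `fractions.Fraction`; the function name `choose_N` and the particular rule $N := \lfloor 1/\epsilon \rfloor + 1$ are assumptions made here, one valid choice among many.

```python
import math
from fractions import Fraction

def choose_N(eps):
    """Return a natural number N with N > 1/eps."""
    return math.floor(1 / eps) + 1

for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(3, 1000)):
    N = choose_N(eps)
    # Spot-check the defining implication on a finite range of n >= N.
    ok = all(abs(Fraction(1, n) - 0) < eps for n in range(N, N + 500))
    print(eps, N, ok)
```

A finite check like this cannot replace the proof, since Definition 2.1 quantifies over all $n \ge N$, but it shows how a smaller $\epsilon$ forces a larger $N$.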
Notice by definition that $x_n$ converges to $a$ if and only if $|x_n - a| \to 0$ as $n \to \infty$. In particular, $x_n \to 0$ if and only if $|x_n| \to 0$ as $n \to \infty$.
Figure 2.1 (a number line showing the points $a - \epsilon$, $a$, and $a + \epsilon$, with terms such as $x_5$ and $x_6$ lying between $a - \epsilon$ and $a + \epsilon$)
According to Definition 2.1, to prove a particular limit exists, given an arbitrary $\epsilon > 0$, no matter how small, we must describe how to choose an $N$ such that $n \ge N$ implies $|x_n - a| < \epsilon$. In particular, $\epsilon$ is usually introduced before $N$ is specified, and $N$ often is defined to depend on $\epsilon$.
2.2 Example. Prove that $1/n \to 0$ as $n \to \infty$.
PROOF. Let $\epsilon > 0$. Use the Archimedean Principle to choose $N \in \mathbf{N}$ such that $N > 1/\epsilon$. By taking the reciprocal of this inequality, we see that $n \ge N$ implies $1/n \le 1/N < \epsilon$. Since $1/n$ are all positive, it follows that $|1/n| < \epsilon$ for all $n \ge N$. ∎
Let $P(n)$ be a property indexed by $\mathbf{N}$. We shall say that $P(n)$ holds for large $n$ if there is an $N_0 \in \mathbf{N}$ such that $P(n)$ is true for all $n \ge N_0$. Hence by definition, $x_n$ converges to $a$ if and only if $|x_n - a|$ is small for large $n$. What we mean by this is that given any prescribed positive quantity $\epsilon$ (no matter how small), we can choose $N_0$ large enough so that $|x_n - a|$ is less than $\epsilon$ for all $n \ge N_0$.
The following two results show that a given sequence can have no limits or one limit, but no more.
2.3 Example. The sequence $\{(-1)^n\}_{n \in \mathbf{N}}$ has no limit.
PROOF. Suppose that $(-1)^n \to a$ as $n \to \infty$ for some $a \in \mathbf{R}$. Given $\epsilon = 1$, there is an $N \in \mathbf{N}$ such that $n \ge N$ implies $|(-1)^n - a| < \epsilon$. For $n$ odd this implies $|1 + a| = |-1 - a| < 1$, and for $n$ even this implies $|1 - a| < 1$. Hence,
$$2 = |1 + 1| \le |1 - a| + |1 + a| < 1 + 1 = 2;$$
i.e., $2 < 2$, a contradiction. ∎
2.4 Remark. A sequence can have at most one limit.
PROOF. Suppose that $x_n$ converges to both $a$ and $b$. By definition, given $\epsilon > 0$, there are integers $N_1$ and $N_2$ such that $n \ge N_1$ implies $|x_n - a| < \epsilon/2$, and $n \ge N_2$ implies $|x_n - b| < \epsilon/2$. Let $N = \max\{N_1, N_2\}$. By the choice of $N_1$ and $N_2$, $n \ge N$ implies both $|x_n - a| < \epsilon/2$ and $|x_n - b| < \epsilon/2$. Thus it follows from the triangle inequality that
$$|a - b| \le |a - x_n| + |x_n - b| < \epsilon;$$
i.e., $|a - b| < \epsilon$ for all $\epsilon > 0$. We conclude, by Theorem 1.9, that $a = b$. ∎
Notice that in the proof of Remark 2.4 we forced two properties that held for $n \ge N_j$, $j = 1, 2$, to hold for $n \ge N$ by setting $N$ equal to the maximum of $N_1$ and $N_2$. It is clear that by this same process, if $N_1, \ldots, N_q$ have been chosen so that for each $j$ a property $P_j$ holds when $n > N_j$ and if $N = \max\{N_1, \ldots, N_q\}$, then all $q$ properties $P_1, \ldots, P_q$ hold simultaneously when $n > N$. We shall use this device frequently, but rarely write $N$ explicitly as a maximum of integers $N_j$ again.
We shall use the following concept many times.
2.5 DEFINITION. By a subsequence of a sequence $\{x_n\}_{n \in \mathbf{N}}$, we shall mean a sequence of the form $\{x_{n_k}\}_{k \in \mathbf{N}}$, where each $n_k \in \mathbf{N}$ and $n_1 < n_2 < \cdots$.
Thus a subsequence $x_{n_1}, x_{n_2}, \ldots$ of $x_1, x_2, \ldots$ is obtained by "deleting" from $x_1, x_2, \ldots$ all $x_n$'s except those such that $n = n_k$ for some $k$. For example, $1, 1, \ldots$ is a subsequence of $(-1)^n$ obtained by deleting every other term (set $n_k = 2k$), and $1/2, 1/4, \ldots$ is a subsequence of $1/n$ obtained by deleting all nondyadic fractions, i.e., deleting $1/3, 1/5, 1/6, 1/7, \ldots$ (set $n_k = 2^k$).
Subsequences are sometimes used to correct a sequence that behaves badly or to speed up convergence of another, which converges slowly. For example, $\{1/n\}$ converges much more slowly to zero than its subsequence $\{1/2^n\}$, and $\{(-1)^n\}$ does not converge at all (see Example 2.3), but its subsequence $1, 1, \ldots$ converges to 1 immediately.
If $x_n \to a$ as $n \to \infty$, then the $x_n$'s get near $a$ as $n$ gets large. Since $n_k$ gets large as $k$ does, it comes as no surprise that any subsequence of a convergent sequence also converges.
2.6 Remark. If $\{x_n\}_{n \in \mathbf{N}}$ converges to $a$ and $\{x_{n_k}\}_{k \in \mathbf{N}}$ is any subsequence of $\{x_n\}_{n \in \mathbf{N}}$, then $x_{n_k}$ converges to $a$ as $k \to \infty$.
PROOF. Let $\epsilon > 0$ and choose $N \in \mathbf{N}$ such that $n \ge N$ implies $|x_n - a| < \epsilon$. Since $n_k \in \mathbf{N}$ and $n_1 < n_2 < \cdots$, it is clear that $n_k \ge k$ for all $k \in \mathbf{N}$. Hence, $k \ge N$ implies $|x_{n_k} - a| < \epsilon$; i.e., $x_{n_k} \to a$ as $k \to \infty$. ∎
The following concepts also play an important role for the theory of sequences.
2.7 DEFINITION. Let $\{x_n\}$ be a sequence of real numbers.
(i) $\{x_n\}$ is said to be bounded above if and only if there is an $M \in \mathbf{R}$ such that $x_n \le M$ for all $n \in \mathbf{N}$.
(ii) $\{x_n\}$ is said to be bounded below if and only if there is an $m \in \mathbf{R}$ such that $x_n \ge m$ for all $n \in \mathbf{N}$.
(iii) $\{x_n\}$ is said to be bounded if and only if it is bounded both above and below.
It is easy to check (see Exercise 5) that $\{x_n\}$ is bounded if and only if there is a $C > 0$ such that $|x_n| \le C$ for all $n \in \mathbf{N}$. In this case we shall say that $\{x_n\}$ is bounded, or dominated, by $C$.
Is there a relationship between convergent sequences and bounded sequences?
2.8 THEOREM. Every convergent sequence is bounded.
STRATEGY: The idea behind the proof is simple (see Figure 2.1). Suppose that $x_n \to a$ as $n \to \infty$. By definition, for large $N$ the sequence $x_N, x_{N+1}, \ldots$ must be close to $a$, hence bounded. Since the finite sequence $x_1, \ldots, x_{N-1}$ is also bounded, it should follow that the whole sequence is bounded. We now make this precise.
PROOF. Given $\epsilon = 1$ there is an $N \in \mathbf{N}$ such that $n \ge N$ implies $|x_n - a| \le 1$. Hence by the triangle inequality, $|x_n| \le 1 + |a|$ for all $n \ge N$. On the other hand, if $1 \le n \le N$, then $|x_n| \le M := \max\{|x_1|, |x_2|, \ldots, |x_N|\}$. Therefore, $\{x_n\}$ is dominated by $\max\{M, 1 + |a|\}$. ∎
Notice that by Example 2.3, the converse of Theorem 2.8 is false.
EXERCISES
1. Using the method of Example 2.2, prove that the following limits exist.
(a) $3 + 1/n \to 3$ as $n \to \infty$.
(b) $2(1 - 1/n) \to 2$ as $n \to \infty$.
(c) $(5 + n)/n^2 \to 0$ as $n \to \infty$.
(d) $\pi - 3/\sqrt{n} \to \pi$ as $n \to \infty$.
2. Suppose that $x_n$ is a sequence of real numbers that converges to 1 as $n \to \infty$. Using Definition 2.1, prove that each of the following limits exists.
(a) $1 - x_n \to 0$ as $n \to \infty$.
(b) $3x_n + 1 \to 4$ as $n \to \infty$.
(c) $(2 + x_n^2)/x_n \to 3$ as $n \to \infty$.
3. (a) Prove that $\{(-1)^n\}$ has some subsequences that converge and others that do not converge.
(b) Find a convergent subsequence of $n + (-1)^{3n} n$.
4. (a) Suppose that $\{b_n\}$ is a sequence of nonnegative numbers that converges to 0, and $\{x_n\}$ is a real sequence that satisfies $|x_n - a| \le b_n$ for large $n$. Prove that $x_n$ converges to $a$.
(b) What happens to part (a) if "$\le b_n$" is replaced by "$\le C b_n$" for some fixed positive constant $C$?
5. Suppose that $x_n \in \mathbf{R}$.
(a) Prove that $\{x_n\}$ is bounded if and only if there is a $C > 0$ such that $|x_n| \le C$ for all $n \in \mathbf{N}$.
(b) Suppose that $\{x_n\}$ is bounded. Prove that $x_n/n^k \to 0$, as $n \to \infty$, for all $k \in \mathbf{N}$.
6. (a) Suppose that $\{x_n\}$ and $\{y_n\}$ converge to the same point. Prove that $x_n - y_n \to 0$ as $n \to \infty$.
(b) Prove that the sequence $\{n\}$ does not converge.
(c) Show that the converse of part (a) is false.
7. (a) Let $a$ be a fixed real number and define $x_n := a$ for $n \in \mathbf{N}$. Prove that the "constant" sequence $x_n$ converges.
(b) What does $\{x_n\}$ converge to?
8. Suppose that $\{x_n\}$ is a sequence in $\mathbf{R}$. Prove that $x_n$ converges to $a$ if and only if EVERY subsequence of $x_n$ also converges to $a$.
2.2 LIMIT THEOREMS
One of the biggest challenges we face (both for theory and applications) is deciding whether or not a given sequence converges. Once we know that it converges, we can often use other techniques to approximate or evaluate its limit. One way to identify convergent sequences is by comparing a sequence whose convergence is in doubt with another whose convergence property is already known (see Example 2.10). The following result is the first of many theorems that address this issue.
2.9 THEOREM [SQUEEZE THEOREM]. Suppose that $\{x_n\}$, $\{y_n\}$, and $\{w_n\}$ are real sequences.
(i) If $x_n \to a$ and $y_n \to a$ (the SAME $a$) as $n \to \infty$, and if there is an $N_0 \in \mathbf{N}$ such that
$$x_n \le w_n \le y_n \quad \text{for } n \ge N_0,$$
then $w_n \to a$ as $n \to \infty$.
(ii) If $x_n \to 0$ as $n \to \infty$ and $\{y_n\}$ is bounded, then $x_n y_n \to 0$ as $n \to \infty$.
PROOF. (i) Let $\varepsilon > 0$. Since $x_n$ and $y_n$ converge to $a$, use Definition 2.1 and Theorem 1.6 to choose $N_1, N_2 \in \mathbf{N}$ such that $n \ge N_1$ implies $-\varepsilon \le x_n - a \le \varepsilon$ and $n \ge N_2$ implies $-\varepsilon \le y_n - a \le \varepsilon$. Set $N = \max\{N_0, N_1, N_2\}$. If $n \ge N$ we have by hypothesis and the choice of $N_1$ and $N_2$ that
$$a - \varepsilon \le x_n \le w_n \le y_n \le a + \varepsilon;$$
i.e., $|w_n - a| \le \varepsilon$ for $n \ge N$. We conclude that $w_n \to a$ as $n \to \infty$.
(ii) Suppose that $x_n \to 0$ and there is an $M > 0$ such that $|y_n| \le M$ for $n \in \mathbf{N}$. Let $\varepsilon > 0$ and choose an $N \in \mathbf{N}$ such that $n \ge N$ implies $|x_n| \le \varepsilon/M$. Then $n \ge N$ implies
$$|x_n y_n| \le \frac{\varepsilon}{M} \cdot M = \varepsilon.$$
We conclude that $x_n y_n \to 0$ as $n \to \infty$. ■
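As a numerical illustration of part (ii), the following Python sketch takes $x_n = 1/n$ and $y_n = \sin n$, a pair chosen here purely for illustration and not taken from the text. The bound $M = 1$ and the helper names `squeezed_terms` and `first_index_below` are likewise assumptions of the sketch. It checks that $|x_n y_n| \le M/n$ and computes an index $N$ past which $M/n < \varepsilon$, mirroring the choice of $N$ in the proof.

```python
import math

def squeezed_terms(n_max):
    """Return triples (n, x_n * y_n, bound) with x_n = 1/n and y_n = sin(n).

    Here |y_n| <= M = 1, so the product is dominated by bound = 1/n.
    """
    rows = []
    for n in range(1, n_max + 1):
        x_n = 1.0 / n
        y_n = math.sin(n)
        rows.append((n, x_n * y_n, x_n))
    return rows

def first_index_below(eps, M=1.0):
    """Smallest N with M/N < eps, so n >= N gives |x_n y_n| <= M/n < eps."""
    return math.floor(M / eps) + 1

rows = squeezed_terms(1000)
N = first_index_below(0.25)
print(N, rows[N - 1])
```

The computation proves nothing by itself; it only shows how the index $N$ depends on $\varepsilon$ and on the bound $M$.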
The following example shows how the Squeeze Theorem can be used to find the limit of a complicated sequence by ignoring its "less important" factors.
2.10 Example. Find $\lim_{n\to\infty} 2^{-n}\cos(n^3 - n^2 + n - 13)$.
SOLUTION. The factor $\cos(n^3 - n^2 + n - 13)$ looks intimidating, but it is superfluous for finding the limit of this sequence. Indeed, since $|\cos x| \le 1$ for all $x \in \mathbf{R}$, the sequence $\{2^{-n}\cos(n^3 - n^2 + n - 13)\}$ is dominated by $2^{-n}$. Since $2^n > n$, it is clear by Example 2.2 and the Squeeze Theorem that both $2^{-n} \to 0$ and $2^{-n}\cos(n^3 - n^2 + n - 13) \to 0$ as $n \to \infty$. ■
The Squeeze Theorem can also be used to construct convergent sequences with certain properties. To illustrate how this works, we now prove a result that connects suprema and infima with convergent sequences.
2.11 THEOREM. Let $E \subset \mathbf{R}$. If $E$ has a finite supremum (respectively, a finite infimum), then there is a sequence $x_n \in E$ such that $x_n \to \sup E$ (respectively, $x_n \to \inf E$) as $n \to \infty$.
PROOF. Suppose that $E$ has a finite supremum. For each $n \in \mathbf{N}$, choose (by the Approximation Property for Suprema) an $x_n \in E$ such that $\sup E - 1/n < x_n \le \sup E$. Then by the Squeeze Theorem and Example 2.2, $x_n \to \sup E$ as $n \to \infty$. Similarly, there is a sequence $y_n \in E$ such that $y_n \to \inf E$. ■
Here is another result that helps to evaluate limits of specific sequences. This one works by viewing complicated sequences in terms of simpler components.
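2.12 THEOREM. Suppose that $\{x_n\}$ and $\{y_n\}$ are real sequences and $\alpha \in \mathbf{R}$. If $\{x_n\}$ and $\{y_n\}$ are convergent, then
(i) $\lim_{n\to\infty} (x_n + y_n) = \lim_{n\to\infty} x_n + \lim_{n\to\infty} y_n$,
(ii) $\lim_{n\to\infty} (\alpha x_n) = \alpha \lim_{n\to\infty} x_n$,
and
(iii) $\lim_{n\to\infty} (x_n y_n) = \left(\lim_{n\to\infty} x_n\right)\left(\lim_{n\to\infty} y_n\right)$.
If, in addition, $y_n \ne 0$ and $\lim_{n\to\infty} y_n \ne 0$, then
(iv) $\lim_{n\to\infty} \dfrac{x_n}{y_n} = \dfrac{\lim_{n\to\infty} x_n}{\lim_{n\to\infty} y_n}$.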
(In particular, all these limits exist.)
PROOF. Suppose that $x_n \to x$ and $y_n \to y$ as $n \to \infty$.
(i) Let $\varepsilon > 0$ and choose $N \in \mathbf{N}$ such that $n \ge N$ implies $|x_n - x| < \varepsilon/2$ and $|y_n - y| < \varepsilon/2$. Thus $n \ge N$ implies
$$|(x_n + y_n) - (x + y)| \le |x_n - x| + |y_n - y| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
(ii) It suffices to show that $\alpha x_n - \alpha x \to 0$ as $n \to \infty$. But $x_n - x \to 0$ as $n \to \infty$, hence by the Squeeze Theorem, $\alpha(x_n - x) \to 0$ as $n \to \infty$.
(iii) By Theorem 2.8, the sequence $\{x_n\}$ is bounded. Hence by the Squeeze Theorem, the sequences $\{x_n(y_n - y)\}$ and $\{(x_n - x)y\}$ both converge to 0. Since
$$x_n y_n - xy = x_n(y_n - y) + (x_n - x)y,$$
it follows from part (i) that $x_n y_n \to xy$ as $n \to \infty$. A similar argument establishes part (iv) (see Exercise 3). ■
Theorem 2.12 can be used to evaluate limits of sums, products, and quotients. Here is a typical example.
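2.13 Example. Find $\lim_{n\to\infty} (n^3 + n^2 - 1)/(1 - 3n^3)$.
SOLUTION. Multiplying the numerator and denominator by $1/n^3$, we find that
$$\frac{n^3 + n^2 - 1}{1 - 3n^3} = \frac{1 + (1/n) - (1/n^3)}{(1/n^3) - 3}.$$
By Example 2.2 and Theorem 2.12iii, $1/n^k = (1/n)^k \to 0$, as $n \to \infty$, for any $k \in \mathbf{N}$. Thus by Theorem 2.12i, ii, and iv,
$$\lim_{n\to\infty} \frac{n^3 + n^2 - 1}{1 - 3n^3} = \frac{1 + 0 - 0}{0 - 3} = -\frac{1}{3}.$$ ■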
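The limit found in Example 2.13 can be sanity-checked with exact rational arithmetic. The sketch below is only an illustration; the function name `term` and the sample values of $n$ are choices made here, not part of the text. It evaluates the quotient exactly and reports its distance from $-1/3$.

```python
from fractions import Fraction

def term(n):
    """Exact value of (n^3 + n^2 - 1)/(1 - 3n^3) as a rational number."""
    return Fraction(n**3 + n**2 - 1, 1 - 3 * n**3)

limit = Fraction(-1, 3)
gaps = []
for n in (1, 10, 100, 1000):
    gap = abs(term(n) - limit)   # distance from the claimed limit
    gaps.append(gap)
    print(n, float(term(n)), float(gap))
```

Algebraically the gap equals $(3n^2 - 2)/(3(3n^3 - 1))$, which behaves like $1/(3n)$, so the printed distances should shrink roughly tenfold at each step.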
The sequence $\{\log n\}_{n\in\mathbf{N}}$ fails to converge in a different way than $\{n(-1)^n\}_{n\in\mathbf{N}}$ does. (Indeed, the terms $\log n$ get steadily larger as $n \to \infty$, but the terms $n(-1)^n$ bounce back and forth between large positive values and large negative values.) It is sometimes convenient to emphasize this difference by generalizing limits to include extended real numbers.
2.14 DEFINITION. Let $\{x_n\}$ be a sequence of real numbers.
(i) $\{x_n\}$ is said to diverge to $+\infty$ (notation: $x_n \to +\infty$ as $n \to \infty$ or $\lim_{n\to\infty} x_n = +\infty$) if and only if for each $M \in \mathbf{R}$ there is an $N \in \mathbf{N}$ such that
$$n \ge N \quad \text{implies} \quad x_n > M.$$
(ii) $\{x_n\}$ is said to diverge to $-\infty$ (notation: $x_n \to -\infty$ as $n \to \infty$ or $\lim_{n\to\infty} x_n = -\infty$) if and only if for each $M \in \mathbf{R}$ there is an $N \in \mathbf{N}$ such that
$$n \ge N \quad \text{implies} \quad x_n < M.$$
Notice by Definition 2.14i that $x_n \to +\infty$ if and only if given $M \in \mathbf{R}$, $x_n$ is greater than $M$ for sufficiently large $n$; i.e., eventually $x_n$ exceeds every number $M$ (no matter how large and positive $M$ is). Similarly, $x_n \to -\infty$ if and only if $x_n$ eventually is less than every number $M$ (no matter how large and negative $M$ is). It is easy to see that the Squeeze Theorem can be extended to infinite limits (see Exercise 6). The following is an extension of Theorem 2.12.
2.15 THEOREM. Suppose that $\{x_n\}$ and $\{y_n\}$ are real sequences such that $x_n \to +\infty$ (respectively, $x_n \to -\infty$) as $n \to \infty$.
(i) If $y_n$ is bounded below (respectively, $y_n$ is bounded above), then
$$\lim_{n\to\infty} (x_n + y_n) = +\infty \quad \text{(respectively, } \lim_{n\to\infty} (x_n + y_n) = -\infty).$$
(ii) If $\alpha > 0$, then
$$\lim_{n\to\infty} (\alpha x_n) = +\infty \quad \text{(respectively, } \lim_{n\to\infty} (\alpha x_n) = -\infty).$$
(iii) If $y_n > M_0$ for some $M_0 > 0$ and all $n \in \mathbf{N}$, then
$$\lim_{n\to\infty} (x_n y_n) = +\infty \quad \text{(respectively, } \lim_{n\to\infty} (x_n y_n) = -\infty).$$
(iv) If $\{y_n\}$ is bounded and $x_n \ne 0$, then
$$\lim_{n\to\infty} \frac{y_n}{x_n} = 0.$$
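PROOF. We suppose for simplicity that $x_n \to +\infty$ as $n \to \infty$.
(i) By hypothesis, $y_n \ge M_0$ for some $M_0 \in \mathbf{R}$. Let $M \in \mathbf{R}$ and set $M_1 = M - M_0$. Since $x_n \to +\infty$, choose $N \in \mathbf{N}$ such that $n \ge N$ implies $x_n > M_1$. Then $n \ge N$ implies $x_n + y_n > M_1 + M_0 = M$.
(ii) Let $M \in \mathbf{R}$ and set $M_1 = M/\alpha$. Choose $N \in \mathbf{N}$ such that $n \ge N$ implies $x_n > M_1$. Since $\alpha > 0$, we conclude that $\alpha x_n > \alpha M_1 = M$ for all $n \ge N$.
(iii) Let $M \in \mathbf{R}$ and set $M_1 = M/M_0$. Choose $N \in \mathbf{N}$ such that $n \ge N$ implies $x_n > M_1$. Then $n \ge N$ implies $x_n y_n > M_1 M_0 = M$.
(iv) Let $\varepsilon > 0$. Choose $M_0 > 0$ such that $|y_n| \le M_0$ and $M_1 > 0$ so large that $M_0/M_1 < \varepsilon$. Choose $N \in \mathbf{N}$ such that $n \ge N$ implies $x_n > M_1$. Then $n \ge N$ implies
$$\left|\frac{y_n}{x_n}\right| = \frac{|y_n|}{x_n} < \frac{M_0}{M_1} < \varepsilon.$$ ■
If we adopt the conventions
$$x + \infty = \infty, \quad x - \infty = -\infty, \quad x \in \mathbf{R},$$
$$x \cdot \infty = \infty, \quad x \cdot (-\infty) = -\infty, \quad x > 0,$$
$$x \cdot \infty = -\infty, \quad x \cdot (-\infty) = \infty, \quad x < 0,$$
$$\infty + \infty = \infty, \quad -\infty - \infty = -\infty,$$
$$\infty \cdot \infty = (-\infty)\cdot(-\infty) = \infty, \quad \text{and} \quad \infty \cdot (-\infty) = (-\infty)\cdot\infty = -\infty,$$
then Theorem 2.15 contains the following corollary.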
2.16 COROLLARY. Let $\{x_n\}$, $\{y_n\}$ be real sequences and $\alpha$, $x$, $y$ be extended real numbers. If $x_n \to x$ and $y_n \to y$, as $n \to \infty$, then
$$\lim_{n\to\infty} (x_n + y_n) = x + y$$
(provided that the right side is not of the form $\infty - \infty$), and
$$\lim_{n\to\infty} (\alpha x_n) = \alpha x, \qquad \lim_{n\to\infty} (x_n y_n) = xy$$
(provided that none of these products is of the form $0 \cdot \pm\infty$).
We have avoided the cases $\infty - \infty$ and $0 \cdot \pm\infty$. These and other "indeterminate forms" will be covered by l'Hôpital's Rule in Section 4.3. Theorems 2.12 and 2.15 show how the limit sign interacts with the algebraic structure of $\mathbf{R}$. (Namely, they say that the limit of a sum (product, quotient) is the sum (product, quotient) of the limits.) The following theorem shows how the limit sign interacts with the order structure of $\mathbf{R}$.
2.17 THEOREM [COMPARISON THEOREM]. Suppose that $\{x_n\}$ and $\{y_n\}$ are convergent sequences. If there is an $N_0 \in \mathbf{N}$ such that
(1) $$x_n \le y_n \quad \text{for } n \ge N_0,$$
then
$$\lim_{n\to\infty} x_n \le \lim_{n\to\infty} y_n.$$
In particular, if $x_n \in [a, b]$ converges to some point $c$, then $c$ must belong to $[a, b]$.
PROOF. Suppose that the first statement is false, i.e., that (1) holds but $x := \lim_{n\to\infty} x_n$ is greater than $y := \lim_{n\to\infty} y_n$. Set $\varepsilon = (x - y)/2$. Choose $N_1 > N_0$ such that $|x_n - x| < \varepsilon$ and $|y_n - y| < \varepsilon$ for $n \ge N_1$. Then for such an $n$,
$$x_n > x - \varepsilon = x - \frac{x - y}{2} = y + \frac{x - y}{2} = y + \varepsilon > y_n,$$
which contradicts (1). This proves the first statement. We conclude by noting that the second statement follows from the first, since $a \le x_n \le b$ implies $a \le c \le b$. ■
One way to remember this result is that it says the limit of an inequality is the inequality of the limits, provided that these limits exist. We shall call this process "taking the limit of an inequality." Since $x_n < y_n$ implies $x_n \le y_n$, the Comparison Theorem contains the following corollary: If $\{x_n\}$ and $\{y_n\}$ are convergent real sequences, then
$$x_n < y_n, \quad n \ge N_0, \quad \text{implies} \quad \lim_{n\to\infty} x_n \le \lim_{n\to\infty} y_n.$$
In particular, if $x_n < M$ for $n$ large and $\{x_n\}$ converges, then $\lim_{n\to\infty} x_n \le M$. It is important to notice that these results are false if in the conclusion, $\le$ is replaced by $<$. For example,
$$\frac{1}{n^2} < \frac{1}{n} \ \text{ for } n \ge 2 \quad \text{but} \quad \lim_{n\to\infty} \frac{1}{n^2} = \lim_{n\to\infty} \frac{1}{n} = 0.$$
EXERCISES
1. Prove that each of the following sequences converges to zero.
(a) $x_n = \sin((n^4 + n + 1)/(n^2 + 1))/n$.
(b) $x_n = n/(n^2 + 1)$.
(c) $x_n = (\sqrt{2n} + 1)/(n + 1)$.
(d) $x_n = n/2^n$.
2. Find the limit (if it exists) of each of the following sequences.
(a) $x_n = (1 + n - 3n^2)/(3 - 2n + n^2)$.
(b) $x_n = (n^3 + n - 5)/(5n^3 + n - 1)$.
(c) $x_n = \sqrt{2n^2 - 1}/(n + 1)$.
(d) $x_n = \sqrt{n + 1} - \sqrt{n}$.
3. Prove Theorem 2.12iv.
4. Suppose that $x \in \mathbf{R}$, $x_n \ge 0$, and $x_n \to x$ as $n \to \infty$. Prove that $\sqrt{x_n} \to \sqrt{x}$ as $n \to \infty$. (For the case $x = 0$ you may wish to use (8) on p. 7.)
5. Prove that given $x \in \mathbf{R}$ there is a sequence $r_n \in \mathbf{Q}$ such that $r_n \to x$ as $n \to \infty$.
6. Suppose that $x$ and $y$ are extended real numbers and $\{x_n\}$, $\{y_n\}$, and $\{w_n\}$ are real sequences.
(a) If $x_n \to x$ and $y_n \to x$, as $n \to \infty$, and $x_n \le w_n \le y_n$ for $n \in \mathbf{N}$, prove that $w_n \to x$ as $n \to \infty$.
(b) If $x_n \to x$ and $y_n \to y$, as $n \to \infty$, and $x_n \le y_n$ for $n \in \mathbf{N}$, prove that $x \le y$.
7. Using the result in Exercise 4, show the following.
(a) Suppose that $x_1 \ge 0$ and $x_{n+1} = \sqrt{2 + x_n}$ for $n \in \mathbf{N}$. If $x_n \to x$ as $n \to \infty$, prove that $x = 2$.
(b) Suppose that $0 \le x_1 \le 1$ and $x_{n+1} = 1 - \sqrt{1 - x_n}$ for $n \in \mathbf{N}$. If $x_n \to x$ as $n \to \infty$, prove that $x = 0$ or 1.
8. Prove Corollary 2.16.
9. Interpret a decimal expansion $0.a_1a_2\ldots$ as
$$\lim_{n\to\infty} \sum_{k=1}^{n} \frac{a_k}{10^k}.$$
Prove that (a) $0.5 = 0.4999\ldots$ and (b) $1 = 0.999\ldots$
10. This exercise was used in Section 1.4.
(a) Suppose that $0 \le y < 1/10^n$ for some integer $n \ge 0$. Prove that there is an integer $0 \le w \le 9$ such that
$$\frac{w}{10^{n+1}} \le y < \frac{w}{10^{n+1}} + \frac{1}{10^{n+1}}.$$
(b) Prove that given $x \in [0, 1)$ there exist integers $0 \le x_k \le 9$ such that for all $n \in \mathbf{N}$,
$$\sum_{k=1}^{n} \frac{x_k}{10^k} \le x < \sum_{k=1}^{n} \frac{x_k}{10^k} + \frac{1}{10^n}.$$
(c) Prove that given $x \in [0, 1)$ there exist integers $0 \le x_k \le 9$, $k \in \mathbf{N}$, such that
$$x = \lim_{n\to\infty} \sum_{k=1}^{n} \frac{x_k}{10^k}.$$
(Note: The numbers $x_k$ are called digits of $x$, and $0.x_1x_2\ldots$ is called a decimal expansion of $x$. Unless $x$ is a rational number whose denominator is of the form $2^i 5^j$ for some integers $i \ge 0$, $j \ge 0$, this expansion is unique; i.e., there is only one sequence of integers $\{x_k\}$ that satisfies part (c). On the other hand, if $x$ is a rational number whose denominator is of the form $2^i 5^j$, then there are two sequences $\{x_k\}$ that satisfy part (c), one that satisfies $x_k = 0$ for large $k$ and one that satisfies $x_k = 9$ for large $k$ (see Exercise 9). We shall identify the second sequence by saying that it terminates in 9's.)
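The digits $x_k$ described in the note can be produced greedily: at each stage, scale the remaining error by 10 and take its integer part. The following Python sketch illustrates that idea with exact rational arithmetic. The names `decimal_digits` and `partial_sum` are hypothetical helpers introduced here, and the greedy rule is one way (not necessarily the intended proof) to realize the inequality $\sum_{k=1}^{n} x_k/10^k \le x < \sum_{k=1}^{n} x_k/10^k + 1/10^n$.

```python
from fractions import Fraction

def decimal_digits(x, n):
    """Return the first n decimal digits x_1..x_n of x in [0, 1), chosen greedily."""
    if not (0 <= x < 1):
        raise ValueError("x must lie in [0, 1)")
    digits = []
    # After j passes, remainder = 10^j * (x - sum_{k<=j} x_k / 10^k), which lies in [0, 1).
    remainder = Fraction(x)
    for _ in range(n):
        remainder *= 10
        w = int(remainder)      # integer part, so 0 <= w <= 9
        digits.append(w)
        remainder -= w
    return digits

def partial_sum(digits):
    """Exact value of sum_{k=1}^{n} x_k / 10^k."""
    return sum(Fraction(d, 10**k) for k, d in enumerate(digits, start=1))

x = Fraction(1, 3)
d = decimal_digits(x, 6)
print(d, partial_sum(d) <= x < partial_sum(d) + Fraction(1, 10**6))
```

This greedy rule always yields the expansion that does not terminate in 9's; for example, it returns the digits 5, 0, 0, ... for $x = 1/2$ rather than 4, 9, 9, ....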
2.3 BOLZANO-WEIERSTRASS THEOREM
Notice that although the sequence $\{(-1)^n\}$ does not converge, it has convergent subsequences. In this section we shall prove that this is a general principle. Namely, we shall establish the Bolzano-Weierstrass Theorem, which states that every bounded sequence has a convergent subsequence. We begin with a special case (monotone sequences) for which the Bolzano-Weierstrass Theorem is especially transparent. Afterward, we shall use this special case to obtain the general result.
2.18 DEFINITION. Let $\{x_n\}_{n\in\mathbf{N}}$ be a sequence of real numbers.
(i) $\{x_n\}$ is said to be increasing (respectively, strictly increasing) if and only if $x_1 \le x_2 \le \cdots$ (respectively, $x_1 < x_2 < \cdots$).
(ii) $\{x_n\}$ is said to be decreasing (respectively, strictly decreasing) if and only if $x_1 \ge x_2 \ge \cdots$ (respectively, $x_1 > x_2 > \cdots$).
(iii) $\{x_n\}$ is said to be monotone if and only if it is either increasing or decreasing.
[Some authors call decreasing sequences nonincreasing and increasing sequences nondecreasing.]
If $\{x_n\}$ is increasing (respectively, decreasing) and converges to $a$, we shall write $x_n \uparrow a$ (respectively, $x_n \downarrow a$), as $n \to \infty$. Clearly, every strictly increasing sequence is increasing, and every strictly decreasing sequence is decreasing. Also, $\{x_n\}$ is increasing if and only if the sequence $\{-x_n\}$ is decreasing. By Theorem 2.8, any convergent sequence is bounded. We now establish the converse of this result for monotone sequences. (For an extension to extended real numbers, see Exercise 3.)
2.19 THEOREM [MONOTONE CONVERGENCE THEOREM]. If $\{x_n\}$ is increasing and bounded above, or if it is decreasing and bounded below, then $\{x_n\}$ has a finite limit.
PROOF. We shall actually prove that an increasing sequence converges to its supremum, and a decreasing sequence converges to its infimum.
(i) Suppose that $\{x_n\}$ is increasing and bounded above. By the Completeness Axiom, the supremum $a := \sup\{x_n : n \in \mathbf{N}\}$ exists and is finite. Let $\varepsilon > 0$. By the Approximation Property for Suprema, choose $N \in \mathbf{N}$ such that
$$a - \varepsilon < x_N \le a.$$
Since $x_N \le x_n$ for $n \ge N$ and $x_n \le a$ for all $n \in \mathbf{N}$, it follows that $a - \varepsilon < x_n \le a$ for all $n \ge N$. In particular, $x_n \uparrow a$ as $n \to \infty$.
(ii) If $\{x_n\}$ is decreasing with infimum $b := \inf\{x_n : n \in \mathbf{N}\}$, then $\{-x_n\}$ is increasing with supremum $-b$ (see Theorem 1.28). Hence, by part (i) and Theorem 2.12ii,
$$b = -(-b) = -\lim_{n\to\infty} (-x_n) = \lim_{n\to\infty} x_n.$$ ■
The Monotone Convergence Theorem is used most often to show that a limit exists. Once existence has been established, it is often easy to find the value of that limit by using Theorems 2.9 and 2.12. The following examples illustrate this fact.
2.20 Example. If $|a| < 1$, then $a^n \to 0$ as $n \to \infty$.
PROOF. It suffices to prove that $|a|^n \to 0$ as $n \to \infty$. First, we notice that $|a|^n$ is monotone decreasing since by the Multiplicative Property, $|a| < 1$ implies $|a|^{n+1} < |a|^n$ for all $n \in \mathbf{N}$. Next, we observe that $|a|^n$ is bounded below (by 0). Hence by the Monotone Convergence Theorem, $L := \lim_{n\to\infty} |a|^n$ exists. Suppose that $L \ne 0$. Taking the limit of the algebraic identity $|a|^{n+1} = |a| \cdot |a|^n$, as $n \to \infty$, we see by Theorem 2.12 that $L = |a| \cdot L$. Since $L$ is not zero, it follows that $|a| = 1$, a contradiction. ■
2.21 Example. If $a > 0$, then $a^{1/n} \to 1$ as $n \to \infty$.
PROOF. We consider three cases.
Case 1. $a = 1$. Then $a^{1/n} = 1$ for all $n \in \mathbf{N}$, and it follows that $a^{1/n} \to 1$ as $n \to \infty$.
Case 2. $a > 1$. We shall apply the Monotone Convergence Theorem. To show that $\{a^{1/n}\}$ is decreasing, fix $n \in \mathbf{N}$ and notice that $a > 1$ implies $a^{n+1} > a^n$. Taking the $n(n+1)$st root of this inequality, we obtain $a^{1/n} > a^{1/(n+1)}$; i.e., $a^{1/n}$ is decreasing. Since $a > 1$ implies $a^{1/n} > 1$, it follows that $a^{1/n}$ is decreasing and bounded below. Hence, by the Monotone Convergence Theorem (Theorem 2.19), $L := \lim_{n\to\infty} a^{1/n}$ exists. To find its value, take the limit of the identity $(a^{1/(2n)})^2 = a^{1/n}$ as $n \to \infty$. Since $\{a^{1/(2n)}\}$ is a subsequence of $\{a^{1/n}\}$, we obtain $L^2 = L$; i.e., $L = 0$ or 1. Since $a^{1/n} > 1$, the Comparison Theorem (Theorem 2.17) shows that $L \ge 1$. Hence $L = 1$.
Case 3. $0 < a < 1$. Then $1/a > 1$. It follows from Theorem 2.12 and Case 2 that
$$\lim_{n\to\infty} a^{1/n} = \lim_{n\to\infty} \frac{1}{(1/a)^{1/n}} = \frac{1}{\lim_{n\to\infty} (1/a)^{1/n}} = 1.$$ ■
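The behavior established in Example 2.21 is easy to observe numerically. The sketch below is only an illustration in floating-point arithmetic; the function name `nth_roots` and the sample bases 5 and 0.2 are choices made here. It tabulates $a^{1/n}$ and lets one check that the terms decrease toward 1 when $a > 1$ and increase toward 1 when $0 < a < 1$.

```python
def nth_roots(a, n_max):
    """Return the list [a**(1/n) for n = 1..n_max] for a fixed a > 0."""
    if a <= 0:
        raise ValueError("a must be positive")
    return [a ** (1.0 / n) for n in range(1, n_max + 1)]

big = nth_roots(5.0, 2000)     # Case 2: a > 1, terms decrease to 1
small = nth_roots(0.2, 2000)   # Case 3: 0 < a < 1, terms increase to 1
for n in (1, 10, 100, 2000):
    print(n, big[n - 1], small[n - 1])
```

Since $0.2 = 1/5$, each entry of `small` is the reciprocal of the matching entry of `big`, which is exactly the reduction used in Case 3.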
Next, we introduce a monotone property for sequences of sets.
2.22 DEFINITION. A sequence of sets $\{I_n\}_{n\in\mathbf{N}}$ is said to be nested if and only if
$$I_1 \supseteq I_2 \supseteq \cdots.$$
In Chapters 3, 8, and 9, we shall use this concept to study continuous functions. Here, we use it to prove the Bolzano-Weierstrass Theorem. All of these applications depend in a fundamental way on the following result.
2.23 THEOREM [NESTED INTERVAL PROPERTY]. If $\{I_n\}_{n\in\mathbf{N}}$ is a nested sequence of nonempty closed bounded intervals, then
$$E = \bigcap_{n\in\mathbf{N}} I_n := \{x : x \in I_n \text{ for all } n \in \mathbf{N}\}$$
contains at least one number. Moreover, if the lengths of these intervals satisfy $|I_n| \to 0$ as $n \to \infty$, then $E$ contains exactly one number.
[Figure 2.2]
PROOF. Let $I_n = [a_n, b_n]$. Since $\{I_n\}$ is nested, the real sequence $\{a_n\}$ is increasing and bounded above by $b_1$, and $\{b_n\}$ is decreasing and bounded below by $a_1$ (see Figure 2.2). Thus by Theorem 2.19, there exist $a, b \in \mathbf{R}$ such that $a_n \uparrow a$ and $b_n \downarrow b$ as $n \to \infty$. Since $a_n \le b_n$ for all $n \in \mathbf{N}$, it also follows from the Comparison Theorem that $a_n \le a \le b \le b_n$. Hence, a number $x$ belongs to $I_n$ for all $n \in \mathbf{N}$ if and only if $a \le x \le b$. This proves that $E = [a, b]$. Suppose now that $|I_n| \to 0$ as $n \to \infty$. Then $b_n - a_n \to 0$ as $n \to \infty$, and we have by Theorem 2.12 that $b - a = 0$. In particular, $E = [a, a] = \{a\}$ contains exactly one number. ■
The next two results show that neither of the hypotheses of Theorem 2.23 can be relaxed.
2.24 Remark. The Nested Interval Property might not hold if "closed" is omitted.
PROOF. The intervals $I_n = (0, 1/n)$, $n \in \mathbf{N}$, are bounded and nested but not closed. If there were an $x \in I_n$ for all $n \in \mathbf{N}$, then $0 < x < 1/n$; i.e., $n < 1/x$ for all $n \in \mathbf{N}$. Since this contradicts the Archimedean Principle, it follows that the intervals $I_n$ have no point in common. ■
2.25 Remark. The Nested Interval Property might not hold if "bounded" is omitted.
PROOF. The intervals $I_n = [n, \infty)$, $n \in \mathbf{N}$, are closed and nested but not bounded. Again, they have no point in common. ■
We are now prepared to prove the main result of this section.
2.26 THEOREM [BOLZANO-WEIERSTRASS THEOREM]. Every bounded sequence of real numbers has a convergent subsequence.
PROOF. We begin with a general observation. Let $\{x_n\}$ be any sequence. If $E = A \cup B$ are sets and $E$ contains $x_n$ for infinitely many values of $n$, then at least one of the sets $A$ or $B$ also contains $x_n$ for infinitely many values of $n$. (If not, then $E$ contains $x_n$ for only finitely many $n$, a contradiction.)
Let $\{x_n\}$ be a bounded sequence. Choose $a, b \in \mathbf{R}$ such that $x_n \in [a, b]$ for all $n \in \mathbf{N}$, and set $I_0 = [a, b]$. Divide $I_0$ into two halves, say, $I' = [a, (a + b)/2]$ and $I'' = [(a + b)/2, b]$. Since $I_0 = I' \cup I''$, at least one of these half intervals contains $x_n$ for infinitely many $n$. Call it $I_1$, and choose $n_1 > 1$ such that $x_{n_1} \in I_1$. Notice that $|I_1| = |I_0|/2 = (b - a)/2$.
Suppose that closed intervals $I_0 \supset I_1 \supset \cdots \supset I_m$ and natural numbers $n_1 < n_2 < \cdots < n_m$ have been chosen such that for each $0 \le k \le m$,
(2) $$|I_k| = \frac{b - a}{2^k}, \quad x_{n_k} \in I_k, \quad \text{and} \quad x_n \in I_k \text{ for infinitely many } n.$$
To choose $I_{m+1}$, divide $I_m = [a_m, b_m]$ into two halves, say $I' = [a_m, (a_m + b_m)/2]$ and $I'' = [(a_m + b_m)/2, b_m]$. Since $I_m = I' \cup I''$, at least one of these half intervals contains $x_n$ for infinitely many $n$. Call it $I_{m+1}$, and choose $n_{m+1} > n_m$ such that $x_{n_{m+1}} \in I_{m+1}$. Since
$$|I_{m+1}| = \frac{|I_m|}{2} = \frac{b - a}{2^{m+1}},$$
it follows by induction that there is a nested sequence $\{I_k\}_{k\in\mathbf{N}}$ of nonempty closed bounded intervals that satisfy (2) for all $k \in \mathbf{N}$.
By the Nested Interval Property, there is an $x \in \mathbf{R}$ that belongs to $I_k$ for all $k \in \mathbf{N}$. Since $x \in I_k$, we have by (2) that
$$0 \le |x_{n_k} - x| \le |I_k| \le \frac{b - a}{2^k}$$
for all $k \in \mathbf{N}$. Hence by the Squeeze Theorem, $x_{n_k} \to x$ as $k \to \infty$. ■
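The bisection in this proof can be imitated on a finite sample of a sequence. A program cannot test whether a half interval contains $x_n$ for infinitely many $n$, so the Python sketch below substitutes a heuristic: it keeps the half that holds more of the remaining sampled terms. That substitution, the function name `bisect_subsequence`, and the sample sequence $x_n = (-1)^n(1 + 1/n)$ are all assumptions made for illustration; the sketch is not a proof and can be misled by a short sample.

```python
def bisect_subsequence(x, a, b, steps):
    """Choose indices n_1 < n_2 < ... by repeated bisection of [a, b].

    x is a finite list standing in for the sequence (x[0] plays the role of x_1).
    At each step the half interval holding more of the not-yet-passed terms is
    kept, a finite stand-in for "contains x_n for infinitely many n".
    """
    lo, hi = a, b
    last = -1                 # list index of the most recently chosen term
    chosen = []
    for _ in range(steps):
        mid = (lo + hi) / 2
        later = range(last + 1, len(x))
        left = [i for i in later if lo <= x[i] <= mid]
        right = [i for i in later if mid <= x[i] <= hi]
        if len(left) >= len(right):
            hi, pool = mid, left
        else:
            lo, pool = mid, right
        if not pool:
            break             # the sample is exhausted; stop early
        last = pool[0]        # first later term lying in the kept half
        chosen.append(last)
    return chosen, (lo, hi)

x = [(-1) ** n * (1 + 1 / n) for n in range(1, 4001)]
chosen, (lo, hi) = bisect_subsequence(x, -2.0, 2.0, 8)
print(chosen, lo, hi, [x[i] for i in chosen])
```

After $k$ steps the kept interval has length $(b - a)/2^k$, so the chosen terms are trapped in shrinking intervals exactly as in inequality (2) of the proof.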
EXERCISES 1. Prove that
has a convergent subsequence.
2. Suppose that $E \subset \mathbf{R}$ is a nonempty bounded set and $\sup E \notin E$. Prove that there exists a strictly increasing sequence $\{x_n\}$ that converges to $\sup E$ such that $x_n \in E$ for all $n \in \mathbf{N}$.
3. (a) Suppose that $\{x_n\}$ is a monotone increasing sequence in $\mathbf{R}$ (not necessarily bounded above). Prove that there is an extended real number $x$ such that $x_n \to x$ as $n \to \infty$. (b) State and prove an analogous result for decreasing sequences.
4. Suppose that $0 < x_1 < 1$ and $x_{n+1} = 1 - \sqrt{1 - x_n}$ for $n \in \mathbf{N}$. Prove that $x_n \downarrow 0$ as $n \to \infty$ and $x_{n+1}/x_n \to 1/2$, as $n \to \infty$. (Exercise 4.3 in Apostol [1].)
5. Let $0 < x_1 \le 3$ and $x_{n+1} = \sqrt{2x_n + 3}$ for $n \in \mathbf{N}$. Prove that $x_n \uparrow 3$ as $n \to \infty$.
6. Suppose that $x_1 \ge 2$ and $x_{n+1} = 1 + \sqrt{x_n - 1}$ for $n \in \mathbf{N}$. Prove that $x_n \downarrow 2$ as $n \to \infty$. What happens when $1 \le x_1 < 2$?
7. Prove that
$$\lim_{n\to\infty} x^{1/(2n-1)} = \begin{cases} 1 & x > 0 \\ 0 & x = 0 \\ -1 & x < 0. \end{cases}$$
8. Suppose that $x_0 \in \mathbf{R}$ and $x_n = (1 + x_{n-1})/2$ for $n \in \mathbf{N}$. Prove that $x_n \to 1$ as $n \to \infty$.
9. Let $0 < y_1 < x_1$ and set
$$x_{n+1} = \frac{x_n + y_n}{2} \quad \text{and} \quad y_{n+1} = \sqrt{x_n y_n}, \quad n \in \mathbf{N}.$$
(a) Prove that $0 < y_n < x_n$ for all $n \in \mathbf{N}$.
(b) Prove that $y_n$ is increasing and bounded above, and $x_n$ is decreasing and bounded below.
(c) Prove that $0 < x_{n+1} - y_{n+1} < (x_1 - y_1)/2^n$ for $n \in \mathbf{N}$.
(d) Prove that $\lim_{n\to\infty} x_n = \lim_{n\to\infty} y_n$. (This common value is called the arithmetic-geometric mean of $x_1$ and $y_1$.)
10. Suppose that $x_0 = 1$, $y_0 = 0$,
$$x_n = x_{n-1} + 2y_{n-1}, \quad \text{and} \quad y_n = x_{n-1} + y_{n-1}$$
for $n \in \mathbf{N}$. Prove that $x_n^2 - 2y_n^2 = \pm 1$ for $n \in \mathbf{N}$ and
$$\frac{x_n}{y_n} \to \sqrt{2} \quad \text{as } n \to \infty.$$
11. [ARCHIMEDES] Suppose that $x_0 = 2\sqrt{3}$, $y_0 = 3$,
$$x_n = \frac{2x_{n-1}y_{n-1}}{x_{n-1} + y_{n-1}}, \quad \text{and} \quad y_n = \sqrt{x_n y_{n-1}}$$
for $n \in \mathbf{N}$.
(a) Prove that $x_n \downarrow x$ and $y_n \uparrow y$, as $n \to \infty$, for some $x, y \in \mathbf{R}$.
(b) Prove that $x = y$ and
$$3.14155 < x < 3.14161.$$
(The actual value of $x$ is $\pi$.)
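To see the recursion of Exercise 11 at work, one can iterate it in floating-point arithmetic, starting from $x_0 = 2\sqrt{3}$, $y_0 = 3$ and using $x_n = 2x_{n-1}y_{n-1}/(x_{n-1} + y_{n-1})$, $y_n = \sqrt{x_n y_{n-1}}$. The Python sketch below is a numerical illustration only, not a solution to the exercise; the function name `archimedes` and the number of steps are choices made here.

```python
import math

def archimedes(steps):
    """Iterate x_n = 2 x y / (x + y), y_n = sqrt(x_n * y_{n-1}) from (2*sqrt(3), 3)."""
    x, y = 2 * math.sqrt(3), 3.0
    history = [(x, y)]
    for _ in range(steps):
        x = 2 * x * y / (x + y)   # harmonic mean of the previous pair
        y = math.sqrt(x * y)      # uses the new x_n and the old y_{n-1}
        history.append((x, y))
    return history

history = archimedes(10)
for n, (x, y) in enumerate(history):
    print(n, y, x)
```

Each printed row brackets $\pi$ between $y_n$ and $x_n$, and the width of the bracket shrinks by roughly a factor of four per step.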
2.4 CAUCHY SEQUENCES
In this section we introduce an extremely powerful and widely used concept. By definition, if $\{x_n\}$ is a convergent sequence, then there is a point $a \in \mathbf{R}$ such that $x_n$ is near $a$ for large $n$. If the $x_n$'s are near $a$, they are certainly near each other. This leads us to the following concept.
2.27 DEFINITION. A sequence of points $x_n \in \mathbf{R}$ is said to be Cauchy if and only if for every $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that
(3) $$n, m \ge N \quad \text{imply} \quad |x_n - x_m| < \varepsilon.$$
The next two results show how this concept is related to convergence.
2.28 Remark. If $\{x_n\}$ is convergent, then $\{x_n\}$ is Cauchy.
PROOF. Suppose that $x_n \to a$ as $n \to \infty$. Then by definition, given $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that $|x_n - a| < \varepsilon/2$ for all $n \ge N$. Hence if $n, m \ge N$, it follows from the triangle inequality that
$$|x_n - x_m| \le |x_n - a| + |x_m - a| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$ ■
The following result shows that the converse of Remark 2.28 is also true (for real sequences).
2.29 THEOREM [CAUCHY]. Let $\{x_n\}$ be a sequence of real numbers. Then $\{x_n\}$ is Cauchy if and only if $\{x_n\}$ converges (to some point $a$ in $\mathbf{R}$).
STRATEGY. By Remark 2.28, we need only show that every Cauchy sequence converges. Suppose that $\{x_n\}$ is Cauchy. Since the $x_n$'s are near each other, the sequence $\{x_n\}$ should be bounded. Hence by the Bolzano-Weierstrass Theorem, $\{x_n\}$ has a convergent subsequence, say $x_{n_k}$. This means that for large $k$, the $x_{n_k}$'s are near some point $a \in \mathbf{R}$. But since $\{x_n\}$ is Cauchy, the $x_n$'s should be near the $x_{n_k}$'s for large $n$, hence also near $a$. Thus the full sequence must converge to that same point $a$. Here are the details.
PROOF. Suppose that $\{x_n\}$ is Cauchy. Given $\varepsilon = 1$, choose $N \in \mathbf{N}$ such that $|x_N - x_m| < 1$ for all $m \ge N$. By the triangle inequality,
$$|x_m| \le |x_N| + |x_m - x_N| < 1 + |x_N|$$
for $m \ge N$. Therefore, $\{x_n\}$ is bounded by $M = \max\{|x_1|, |x_2|, \ldots, |x_{N-1}|, 1 + |x_N|\}$.
By the Bolzano-Weierstrass Theorem, $\{x_n\}$ has a convergent subsequence, say $x_{n_k} \to a$ as $k \to \infty$. Let $\varepsilon > 0$. Since $x_n$ is Cauchy, choose $N_1 \in \mathbf{N}$ such that
$$n, m \ge N_1 \quad \text{implies} \quad |x_n - x_m| < \frac{\varepsilon}{2}.$$
Since $x_{n_k} \to a$ as $k \to \infty$, choose $N_2 \in \mathbf{N}$ such that
$$k \ge N_2 \quad \text{implies} \quad |x_{n_k} - a| < \frac{\varepsilon}{2}.$$
Fix $k \ge N_2$ such that $n_k \ge N_1$. Then
$$|x_n - a| \le |x_n - x_{n_k}| + |x_{n_k} - a| < \varepsilon$$
for all $n \ge N_1$. Thus $x_n \to a$ as $n \to \infty$. ■
This result is extremely useful because it is often easier to show that a sequence is Cauchy than to show that it converges. The reason for this, as the following example shows, is that we can prove that a sequence is Cauchy even when we have no idea what its limit is.
2.30 Example. Prove that any real sequence $\{x_n\}$ that satisfies
$$|x_n - x_{n+1}| \le \frac{1}{2^n}, \quad n \in \mathbf{N},$$
is convergent.
PROOF. If $m > n$, then
$$|x_n - x_m| = |x_n - x_{n+1} + x_{n+1} - x_{n+2} + \cdots + x_{m-1} - x_m|$$
$$\le |x_n - x_{n+1}| + |x_{n+1} - x_{n+2}| + \cdots + |x_{m-1} - x_m|$$
$$\le \frac{1}{2^n} + \cdots + \frac{1}{2^{m-1}} = \frac{1}{2^{n-1}} \sum_{k=1}^{m-n} \frac{1}{2^k} = \frac{1}{2^{n-1}} \left(1 - \frac{1}{2^{m-n}}\right).$$
(The last step uses Exercise 1c, p. 17, for $a = 2$.) It follows that $|x_n - x_m| < 1/2^{n-1}$ for all integers $m > n \ge 1$. But given $\varepsilon > 0$, we can choose $N \in \mathbf{N}$ so large that $n \ge N$ implies $1/2^{n-1} < \varepsilon$. We have proved that $\{x_n\}$ is Cauchy. By Theorem 2.29, therefore, it converges to some real number. ■
The following result shows that a sequence is not necessarily Cauchy just because $x_n$ is near $x_{n+1}$ for large $n$.
2.31 Remark. A sequence that satisfies $x_{n+1} - x_n \to 0$ is not necessarily Cauchy.
PROOF. Consider the sequence $x_n := \log n$. By basic properties of logarithms (see Exercise 4, p. 134),
$$x_{n+1} - x_n = \log(n + 1) - \log n = \log((n + 1)/n) \to \log 1 = 0$$
as $n \to \infty$. $\{x_n\}$ cannot be Cauchy, however, because it does not converge; in fact, it diverges to $+\infty$ as $n \to \infty$. ■
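The failure of the Cauchy condition for $x_n = \log n$ can be made concrete by comparing $x_n$ with $x_{2n}$: the difference $\log(2n) - \log n$ equals $\log 2$ for every $n$, so no $N$ works in Definition 2.27 once $\varepsilon \le \log 2$. The Python sketch below illustrates this numerically; the helper names `consecutive_gap` and `doubling_gap` are introduced here for illustration only.

```python
import math

def consecutive_gap(n):
    """x_{n+1} - x_n for x_n = log n; this tends to 0."""
    return math.log(n + 1) - math.log(n)

def doubling_gap(n):
    """x_{2n} - x_n for x_n = log n; this stays equal to log 2."""
    return math.log(2 * n) - math.log(n)

for n in (10, 1000, 10**6):
    print(n, consecutive_gap(n), doubling_gap(n))
```

The first column of gaps shrinks toward 0 while the second stays near 0.693, which is the distinction Remark 2.31 draws.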
EXERCISES
1. Prove (without using Theorem 2.29) that the sum of two Cauchy sequences is Cauchy.
2. Prove that if $\{x_n\}$ is a sequence that satisfies
$$|x_n| \le \frac{1 + n}{1 + n + 2n^2}$$
for all $n \in \mathbf{N}$, then $\{x_n\}$ is Cauchy.
3. Suppose that $x_n \in \mathbf{N}$ for $n \in \mathbf{N}$. If $\{x_n\}$ is Cauchy, prove that there are numbers $a$ and $N$ such that $x_n = a$ for all $n \ge N$.
4. Let $\{x_n\}$ be a sequence of real numbers. Suppose that for each $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that $m \ge n \ge N$ implies $\left|\sum_{k=n}^{m} x_k\right| < \varepsilon$. Prove that
$$\lim_{n\to\infty} \sum_{k=1}^{n} x_k$$
exists and is finite.
5. Let $\{x_n\}$ be Cauchy. Without using Theorem 2.29, prove that $\{x_n\}$ converges if and only if at least one of its subsequences converges (compare with Exercise 8, p. 38).
6. Prove that $\lim_{n\to\infty} \sum_{k=1}^{n} (-1)^k/k$ exists and is finite.
7. Let $\{x_n\}$ be a sequence. Suppose that there is an $a > 1$ such that

for all $k \in \mathbf{N}$. Prove that $x_n \to x$ for some $x \in \mathbf{R}$.
8. (a) A subset $E$ of $\mathbf{R}$ is said to be sequentially compact if and only if every sequence $x_n \in E$ has a convergent subsequence whose limit belongs to $E$. Prove that every closed bounded interval is sequentially compact.
(b) Prove that there exist bounded intervals in $\mathbf{R}$ which are not sequentially compact.
(c) Prove that there exist closed intervals in $\mathbf{R}$ which are not sequentially compact.
9. (a) Let $E$ be a subset of $\mathbf{R}$. A point $a \in \mathbf{R}$ is called a cluster point of $E$ if $E \cap (a - r, a + r)$ contains infinitely many points for every $r > 0$. Prove that $a$ is a cluster point of $E$ if and only if for each $r > 0$, $E \cap (a - r, a + r) \setminus \{a\}$ is nonempty.
(b) Prove that every bounded infinite subset of $\mathbf{R}$ has at least one cluster point.
e2.5 LIMITS SUPREMUM AND INFIMUM
This section uses no material from any other enrichment section.
In some situations (e.g., the Root Test in Section 6.3), we shall use the following generalization of limits.
2.32 DEFINITION. Let $\{x_n\}$ be a real sequence. Then the limit supremum of $\{x_n\}$ is the extended real number
(4) $$\limsup_{n\to\infty} x_n := \lim_{n\to\infty} \left(\sup_{k\ge n} x_k\right),$$
and the limit infimum of $\{x_n\}$ is the extended real number
$$\liminf_{n\to\infty} x_n := \lim_{n\to\infty} \left(\inf_{k\ge n} x_k\right).$$
Before we proceed, we must show that the limits in Definition 2.32 exist as extended real numbers. To this end, let $\{x_n\}$ be a sequence of real numbers and consider the sequences
$$s_n = \sup_{k\ge n} x_k := \sup\{x_k : k \ge n\} \quad \text{and} \quad t_n = \inf_{k\ge n} x_k := \inf\{x_k : k \ge n\}.$$
Each $s_n$ and $t_n$ is an extended real number, and by the Monotone Property, $s_n$ is a decreasing sequence and $t_n$ an increasing sequence of extended real numbers. In particular, there exist extended real numbers $s$ and $t$ such that $s_n \downarrow s$ and $t_n \uparrow t$ as $n \to \infty$ (see Exercise 3, p. 48). These extended real numbers are, by Definition 2.32, the limit supremum and the limit infimum, respectively, of the sequence $\{x_n\}$.
Here are two examples of how to compute limits supremum and limits infimum.
2.33 Example. Find $\limsup_{n\to\infty} x_n$ and $\liminf_{n\to\infty} x_n$ if $x_n = (-1)^n$.
SOLUTION. Since $\sup_{k\ge n} (-1)^k = 1$ for all $n \in \mathbf{N}$, it follows from Definition 2.32 that $\limsup_{n\to\infty} x_n = 1$. Similarly, $\liminf_{n\to\infty} x_n = -1$. ■
2.34 Example. Find $\limsup_{n\to\infty} x_n$ and $\liminf_{n\to\infty} x_n$ if $x_n = 1 + 1/n$.
SOLUTION. Since $\sup_{k\ge n} (1 + 1/k) = 1 + 1/n$ for all $n \in \mathbf{N}$, $\limsup_{n\to\infty} x_n = 1$. Since $\inf_{k\ge n} (1 + 1/k) = 1$ for all $n \in \mathbf{N}$, $\liminf_{n\to\infty} x_n = 1$. ■
These examples suggest that there is a connection between limits supremum, limits infimum, and convergent subsequences. The next several results make this connection clear.
2.35 THEOREM. Let $\{x_n\}$ be a sequence of real numbers, $s = \limsup_{n\to\infty} x_n$, and $t = \liminf_{n\to\infty} x_n$. Then there are subsequences $\{x_{n_k}\}_{k\in\mathbf{N}}$ and $\{x_{\ell_j}\}_{j\in\mathbf{N}}$ such that $x_{n_k} \to s$ as $k \to \infty$ and $x_{\ell_j} \to t$ as $j \to \infty$.
PROOF. We will prove the result for the limit supremum. A similar argument establishes the result for the limit infimum. Let $s_n = \sup_{k\ge n} x_k$ and observe that $s_n \downarrow s$ as $n \to \infty$.
Case 1. $s = \infty$. Then by definition $s_n = \infty$ for all $n \in \mathbf{N}$. Since $s_1 = \infty$, there is an $n_1 \in \mathbf{N}$ such that $x_{n_1} > 1$. Since $s_{n_1+1} = \infty$, there is an $n_2 \ge n_1 + 1 > n_1$ such that $x_{n_2} > 2$. Continuing in this manner, we can choose a subsequence $\{x_{n_k}\}$ such that $x_{n_k} > k$ for all $k \in \mathbf{N}$. Hence, it follows from the Squeeze Theorem (see Exercise 6, p. 44) that $x_{n_k} \to \infty = s$ as $k \to \infty$.
Case 2. $s = -\infty$. Since $s_n \ge x_n$ for all $n \in \mathbf{N}$, it follows from the Squeeze Theorem that $x_n \to -\infty = s$ as $n \to \infty$.
Case 3. $-\infty < s < \infty$. Set $n_0 = 0$. By Theorem 1.20 (the Approximation Property for Suprema), there is an integer $n_1 \in \mathbf{N}$ such that $s_{n_0+1} - 1 < x_{n_1} \le s_{n_0+1}$. Similarly, there is an integer $n_2 \ge n_1 + 1 > n_1$ such that $s_{n_1+1} - 1/2 < x_{n_2} \le s_{n_1+1}$. Continuing in this manner, we can choose integers $n_1 < n_2 < \cdots$ such that
(5) $$s_{n_{k-1}+1} - \frac{1}{k} < x_{n_k} \le s_{n_{k-1}+1}$$
for $k \in \mathbf{N}$. Since $s_{n_{k-1}+1} \to s$ as $k \to \infty$, we conclude by the Squeeze Theorem that $x_{n_k} \to s$ as $k \to \infty$. ■
This observation leads directly to a characterization of limits in terms of limits infimum and limits supremum.
2.36 THEOREM. Let $\{x_n\}$ be a real sequence and $x$ be an extended real number. Then $x_n \to x$ as $n \to \infty$ if and only if
(6) $$\limsup_{n\to\infty} x_n = \liminf_{n\to\infty} x_n = x.$$
PROOF. Suppose that $x_n \to x$ as $n \to \infty$. Then $x_{n_k} \to x$ as $k \to \infty$ for all subsequences $\{x_{n_k}\}$. Hence by Theorem 2.35, $\limsup_{n\to\infty} x_n = x$ and $\liminf_{n\to\infty} x_n = x$; i.e., (6) holds.
Conversely, suppose that (6) holds.
Case 1. $x = \pm\infty$. By considering $\pm x_n$ we may suppose that $x = \infty$. Thus given $M \in \mathbf{R}$ there is an $N \in \mathbf{N}$ such that $\inf_{k\ge N} x_k > M$. It follows that $x_n > M$ for all $n \ge N$; i.e., $x_n \to \infty$ as $n \to \infty$.
Case 2. $-\infty < x < \infty$. Let $\varepsilon > 0$. Choose $N \in \mathbf{N}$ such that
$$\sup_{k\ge N} x_k - x < \frac{\varepsilon}{2} \quad \text{and} \quad x - \inf_{k\ge N} x_k < \frac{\varepsilon}{2}.$$
Let $n, m \ge N$ and suppose for simplicity that $x_n > x_m$. Then
$$|x_n - x_m| = x_n - x_m \le \sup_{k\ge N} x_k - x + x - \inf_{k\ge N} x_k < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
Thus $\{x_n\}$ is Cauchy and converges to some finite real number. But by Theorem 2.35, some subsequence of $\{x_n\}$ converges to $x$. We conclude that $x_n \to x$ as $n \to \infty$. ■
Theorem 2.35 also leads to the following geometric interpretation of limits supremum and limits infimum.
2.37 THEOREM. Let $\{x_n\}$ be a sequence of real numbers. Then $\limsup_{n\to\infty} x_n$ (respectively, $\liminf_{n\to\infty} x_n$) is the largest value (respectively, the smallest value) to which some subsequence of $\{x_n\}$ converges. Namely, if $x_{n_k} \to x$ as $k \to \infty$, then
(7) $$\liminf_{n\to\infty} x_n \le x \le \limsup_{n\to\infty} x_n.$$
PROOF. Suppose that $x_{n_k} \to x$ as $k \to \infty$. Fix $N \in \mathbf{N}$ and choose $K$ so large that $k \ge K$ implies $n_k \ge N$. Clearly,
$$\inf_{j\ge N} x_j \le x_{n_k} \le \sup_{j\ge N} x_j$$
for all $k \ge K$. Taking the limit of this inequality as $k \to \infty$, we obtain
$$\inf_{j\ge N} x_j \le x \le \sup_{j\ge N} x_j.$$
Taking the limit of this last inequality as $N \to \infty$ and applying Definition 2.32, we obtain (7). ■
We close this section with several other properties of limits supremum and limits infimum.
2.38 Remark. If $\{x_n\}$ is any sequence of real numbers, then
$$\liminf_{n\to\infty} x_n \le \limsup_{n\to\infty} x_n.$$
PROOF. Since $\inf_{k\ge n} x_k \le \sup_{k\ge n} x_k$ for all $n \in \mathbf{N}$, this inequality follows from Theorem 2.17 (the Comparison Theorem). ■
The following result is an immediate consequence of Definition 2.32, the Comparison Theorem, and the Monotone Convergence Theorem.
2.39 Remark. A real sequence $\{x_n\}$ is bounded above if and only if $\limsup_{n\to\infty} x_n < \infty$, and is bounded below if and only if $\liminf_{n\to\infty} x_n > -\infty$.
The following result shows we can take limits supremum and limits infimum of inequalities.
2.40 THEOREM. If $x_n \le y_n$ for $n$ large, then
(8) $$\limsup_{n\to\infty} x_n \le \limsup_{n\to\infty} y_n \quad \text{and} \quad \liminf_{n\to\infty} x_n \le \liminf_{n\to\infty} y_n.$$
PROOF. If $x_k \le y_k$ for $k \ge N$, then $\sup_{k\ge n} x_k \le \sup_{k\ge n} y_k$ and $\inf_{k\ge n} x_k \le \inf_{k\ge n} y_k$ for any $n \ge N$. Taking the limit of these inequalities as $n \to \infty$, we obtain (8). ■
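The tail suprema and infima that define the limit supremum and limit infimum, $\sup_{k\ge n} x_k$ and $\inf_{k\ge n} x_k$, can be explored numerically by truncating each tail to a finite window. The Python sketch below does this for $x_n = (-1)^n(1 + 1/n)$, an example chosen here and not taken from the text. The function name `tail_sup_inf` and the window length are assumptions of the sketch. For this particular sequence the tail supremum and infimum are attained within the first two terms of each tail, so the truncation loses nothing; in general a finite window only gives a lower estimate of the supremum and an upper estimate of the infimum.

```python
def tail_sup_inf(x, n, window):
    """Max and min of x(k) over n <= k < n + window, a finite stand-in for k >= n."""
    values = [x(k) for k in range(n, n + window)]
    return max(values), min(values)

def x(n):
    """Sample sequence x_n = (-1)^n (1 + 1/n); even terms lie above 1, odd terms below -1."""
    return (-1) ** n * (1 + 1 / n)

for n in (1, 10, 100, 1000):
    s_n, t_n = tail_sup_inf(x, n, 50)
    print(n, s_n, t_n)
```

The printed values of $s_n$ decrease toward 1 and those of $t_n$ increase toward $-1$, consistent with $\limsup_{n\to\infty} x_n = 1$ and $\liminf_{n\to\infty} x_n = -1$ for this sequence.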
EXERCISES

1. Find the limit infimum and the limit supremum of each of the following sequences.
(a) $x_n = 3 - (-1)^n$.
(b) $x_n = \cos(n\pi/2)$.
(c) $x_n = (-1)^{n+1} + (-1)^n/n$.
(d) $x_n = \sqrt{1 + n^2}/(2n - 5)$.
(e) $x_n = y_n/n$, where $\{y_n\}$ is any bounded sequence.
(f) $x_n = n(1 + (-1)^n) + n^{-1}((-1)^n - 1)$.
(g) $x_n = (n^3 + n^2 - n + 1)/(n^2 + 2n + 5)$.

2. Suppose that $\{x_n\}$ is a real sequence. Prove that
$$-\limsup_{n\to\infty} x_n = \liminf_{n\to\infty} (-x_n) \quad\text{and}\quad -\liminf_{n\to\infty} x_n = \limsup_{n\to\infty} (-x_n).$$
3. Let $\{x_n\}$ be a real sequence and $r \in \mathbf{R}$.
(a) Prove that
$$\limsup_{n\to\infty} x_n < r \quad\text{implies}\quad x_n < r$$
for $n$ large.
(b) Prove that
$$\limsup_{n\to\infty} x_n > r \quad\text{implies}\quad x_n > r$$
for infinitely many $n \in \mathbf{N}$.
4. Suppose that $\{x_n\}$ and $\{y_n\}$ are real sequences.
(a) Prove that
$$\liminf_{n\to\infty} x_n + \liminf_{n\to\infty} y_n \le \liminf_{n\to\infty} (x_n + y_n) \le \limsup_{n\to\infty} x_n + \liminf_{n\to\infty} y_n$$
provided that none of these sums is of the form $\infty - \infty$.
(b) Show that if $\lim_{n\to\infty} x_n$ exists, then
$$\liminf_{n\to\infty} (x_n + y_n) = \lim_{n\to\infty} x_n + \liminf_{n\to\infty} y_n$$
and
$$\limsup_{n\to\infty} (x_n + y_n) = \lim_{n\to\infty} x_n + \limsup_{n\to\infty} y_n.$$
(c) Show by examples that each of the inequalities in part (a) can be strict.
5. Let $\{x_n\}$ and $\{y_n\}$ be real sequences.
(a) Suppose that $x_n \ge 0$ and $y_n \ge 0$ for each $n \in \mathbf{N}$. Prove that
$$\limsup_{n\to\infty} (x_n y_n) \le \Bigl(\limsup_{n\to\infty} x_n\Bigr)\Bigl(\limsup_{n\to\infty} y_n\Bigr),$$
provided that the product on the right is not of the form $0 \cdot \infty$. Show by example that this inequality can be strict.
(b) Suppose that $x_n \le 0 \le y_n$ for $n \in \mathbf{N}$. Prove that
provided that none of these products is of the form 0 . 00.
6. Suppose that $x_n \ge 0$ and $y_n \ge 0$ for all $n \in \mathbf{N}$. Prove that if $x_n \to x$ as $n \to \infty$ ($x$ may be an extended real number), then
$$\limsup_{n\to\infty} (x_n y_n) = x \limsup_{n\to\infty} y_n$$
provided that none of these products is of the form $0 \cdot \infty$.

7. Prove that
$$\limsup_{n\to\infty} x_n = \inf_{n \in \mathbf{N}} \Bigl(\sup_{k \ge n} x_k\Bigr) \quad\text{and}\quad \liminf_{n\to\infty} x_n = \sup_{n \in \mathbf{N}} \Bigl(\inf_{k \ge n} x_k\Bigr)$$
for any real sequence $\{x_n\}$.

8. Suppose that $x_n \ge 0$ for $n \in \mathbf{N}$. Under the interpretation $1/0 = \infty$ and $1/\infty = 0$, prove that
$$\limsup_{n\to\infty} \frac{1}{x_n} = \frac{1}{\liminf_{n\to\infty} x_n} \quad\text{and}\quad \liminf_{n\to\infty} \frac{1}{x_n} = \frac{1}{\limsup_{n\to\infty} x_n}.$$

9. Let $x_n \in \mathbf{R}$. Prove that $x_n \to 0$ as $n \to \infty$ if and only if
$$\limsup_{n\to\infty} |x_n| = 0.$$
Chapter 3
Continuity on R
3.1 TWO-SIDED LIMITS

In Chapter 2 we studied limits of real sequences. In this chapter we examine limits of real functions, i.e., functions whose domains and ranges are subsets of $\mathbf{R}$. Recall from elementary calculus that a function $f(x)$ converges to a limit $L$, as $x$ approaches $a$, if $f(x)$ is near $L$ when $x$ is near $a$. Here is a precise definition of this concept.

3.1 DEFINITION. Let $a \in \mathbf{R}$, let $I$ be an open interval that contains $a$, and let $f$ be a real function defined everywhere on $I$ except possibly at $a$. Then $f(x)$ is said to converge to $L$, as $x$ approaches $a$, if and only if for every $\varepsilon > 0$ there is a $\delta > 0$ (which in general depends on $\varepsilon$, $f$, $I$, and $a$) such that
$$0 < |x - a| < \delta \quad\text{implies}\quad |f(x) - L| < \varepsilon. \tag{1}$$
In this case we write
$$L = \lim_{x \to a} f(x)$$
and call $L$ the limit of $f(x)$ as $x$ approaches $a$.

As was the case for sequences, $\varepsilon$ represents the maximal error allowed in the approximation $f(x)$ to $L$. The number $\delta$ represents the tolerance allowed in the measurement $x$ of $a$ that will produce an approximation $f(x)$ which is acceptably close to the value $L$. According to Definition 3.1, to show that a function has a limit, we must begin with a general $\varepsilon > 0$ and describe how to choose a $\delta$ that satisfies (1).

3.2 Example. Suppose that $f(x) = mx + b$ where $m, b \in \mathbf{R}$. Prove that
$$f(a) = \lim_{x \to a} f(x)$$
for all $a \in \mathbf{R}$.
PROOF. If $m = 0$, there is nothing to prove. Otherwise, given $\varepsilon > 0$, set $\delta = \varepsilon/|m|$. If $|x - a| < \delta$, then
$$|f(x) - f(a)| = |mx + b - (ma + b)| = |m|\,|x - a| < |m|\delta = \varepsilon.$$
Thus by definition, $f(x) \to f(a)$ as $x \to a$. ■

Sometimes, in order to determine $\delta$, one must break $f(x) - L$ into two factors, replacing the less important factor by an upper bound.

3.3 Example. If $f(x) = x^2 + x - 3$, prove that $f(x) \to -1$ as $x \to 1$.

PROOF. Let $\varepsilon > 0$ and set $L = -1$. Notice that
$$f(x) - L = x^2 + x - 2 = (x - 1)(x + 2).$$
If $0 < \delta \le 1$, then $|x - 1| < \delta$ implies $0 < x < 2$, so by the triangle inequality, $|x + 2| \le |x| + 2 < 4$. Set $\delta = \min\{1, \varepsilon/4\}$. It follows that if $|x - 1| < \delta$, then
$$|f(x) - L| = |x - 1|\,|x + 2| < 4\delta \le \varepsilon.$$
Thus by definition, $f(x) \to L$ as $x \to 1$. ■
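The choice $\delta = \min\{1, \varepsilon/4\}$ made in Example 3.3 can be spot-checked numerically. The sketch below is not a proof; it only samples finitely many points, and the helper names `delta_for` and `check` and the sample count are assumptions introduced for illustration.

```python
def f(x):
    # The function of Example 3.3
    return x * x + x - 3


def delta_for(eps):
    # The delta chosen in Example 3.3: min{1, eps/4}
    return min(1.0, eps / 4.0)


def check(eps, samples=10_000):
    """Sample points with 0 < |x - 1| < delta and test |f(x) - (-1)| < eps."""
    d = delta_for(eps)
    for k in range(1, samples + 1):
        h = d * k / (samples + 1)  # 0 < h < d
        for x in (1 - h, 1 + h):
            if abs(f(x) + 1) >= eps:
                return False
    return True


for eps in (10.0, 1.0, 1e-3):
    print(eps, delta_for(eps), check(eps))
```

A failed check would expose a wrong choice of $\delta$, while a passed check merely fails to find a counterexample among the sampled points.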
Before continuing, we would like to draw your attention to two features of Definition 3.1: the assumption that $f$ be defined on an open interval $I$, and the assumption that $0 < |x - a|$.

First, notice that if $I = (c, d)$ contains $a$ and $\delta_0 := \min\{a - c, d - a\}$, then $|x - a| < \delta_0$ implies $x \in I$. Hence, the assumption that $f$ be defined on some open interval containing $a$ is made so that $f(x)$ is defined for all $x$ satisfying $0 < |x - a| < \delta$ when $\delta$ is sufficiently small.

Next, notice that the assumption $|x - a| > 0$ is equivalent to $x \ne a$. Thus the function $f$ need not be defined at $a$ in order for $f$ to have a limit at $a$. (This will be crucial for defining derivatives later.)

The next result shows that even when a function $f$ is defined at $a$, the value of the limit of $f$ at $a$ is, in general, independent of the value $f(a)$.

3.4 Remark. Let $a \in \mathbf{R}$, let $I$ be an open interval that contains $a$, and let $f, g$ be real functions defined everywhere on $I$ except possibly at $a$. If $f(x) = g(x)$ for all $x \in I \setminus \{a\}$ and $f(x) \to L$ as $x \to a$, then $g(x)$ also has a limit as $x \to a$, and
$$\lim_{x \to a} g(x) = \lim_{x \to a} f(x).$$
PROOF. Let $\varepsilon > 0$ and choose $\delta > 0$ small enough so that (1) holds and $|x - a| < \delta$ implies $x \in I$. Suppose that $0 < |x - a| < \delta$. We have $f(x) = g(x)$ by hypothesis and $|f(x) - L| < \varepsilon$ by (1). It follows that $|g(x) - L| < \varepsilon$. ■

Thus to prove that a function $f$ has a limit, we may begin by simplifying $f$ algebraically.

3.5 Example. Prove that
$$g(x) = \frac{x^3 + x^2 - x - 1}{x^2 - 1}$$
has a limit as $x \to 1$.

PROOF. Set $f(x) = x + 1$ and observe by Example 3.2 that $f(x) \to 2$ as $x \to 1$. Since
$$g(x) = \frac{x^3 + x^2 - x - 1}{x^2 - 1} = \frac{(x + 1)(x^2 - 1)}{x^2 - 1} = f(x)$$
for $x \ne \pm 1$, it follows from Remark 3.4 that $g(x)$ has a limit as $x \to 1$ (and that limit is 2). ■
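The algebraic simplification in Example 3.5 can be illustrated numerically. This is a minimal sketch, assuming the test points $1 \pm 1/n$ (an arbitrary choice); it compares the unsimplified quotient with $x + 1$ at points near, but not equal to, 1.

```python
def g(x):
    # The unsimplified quotient of Example 3.5 (undefined at x = 1 and x = -1)
    return (x**3 + x**2 - x - 1) / (x**2 - 1)


for n in (10, 100, 1000):
    for x in (1 + 1 / n, 1 - 1 / n):
        # g(x) and x + 1 agree up to rounding error, and both approach 2
        print(x, g(x), x + 1)
```

Evaluating `g(1.0)` raises `ZeroDivisionError`, which reflects the fact that the limit at 1 does not depend on any value of the function at 1.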
There is a close connection between limits of functions and limits of sequences.

3.6 THEOREM [SEQUENTIAL CHARACTERIZATION OF LIMITS]. Let $a \in \mathbf{R}$, let $I$ be an open interval that contains $a$, and let $f$ be a real function defined everywhere on $I$ except possibly at $a$. Then
$$L = \lim_{x \to a} f(x)$$
exists if and only if $f(x_n) \to L$ as $n \to \infty$ for every sequence $x_n \in I \setminus \{a\}$ that converges to $a$ as $n \to \infty$.
PROOF. Suppose that $f$ converges to $L$ as $x$ approaches $a$. Then given $\varepsilon > 0$ there is a $\delta > 0$ such that (1) holds. If $x_n \in I \setminus \{a\}$ converges to $a$ as $n \to \infty$, then choose an $N \in \mathbf{N}$ such that $n \ge N$ implies $|x_n - a| < \delta$. Since $x_n \ne a$, it follows from (1) that $|f(x_n) - L| < \varepsilon$ for all $n \ge N$. Therefore, $f(x_n) \to L$ as $n \to \infty$.

Conversely, suppose that $f(x_n) \to L$ as $n \to \infty$ for every sequence $x_n \in I \setminus \{a\}$ that converges to $a$. If $f$ does not converge to $L$ as $x$ approaches $a$, then there is an $\varepsilon > 0$ (call it $\varepsilon_0$) such that the implication "$0 < |x - a| < \delta$ implies $|f(x) - L| < \varepsilon_0$" does not hold for any $\delta > 0$. Thus, for each $\delta = 1/n$, $n \in \mathbf{N}$, there is a point $x_n \in I$ that satisfies two conditions: $0 < |x_n - a| < 1/n$ and $|f(x_n) - L| \ge \varepsilon_0$. Now the first condition and the Squeeze Theorem (Theorem 2.9) imply that $x_n \ne a$ and $x_n \to a$, so by hypothesis, $f(x_n) \to L$ as $n \to \infty$. In particular, $|f(x_n) - L| < \varepsilon_0$ for $n$ large, which contradicts the second condition. ■

Thus to show that the limit of a function $f$ does not exist as $x \to a$, we need only find two sequences converging to $a$ whose images under $f$ have different limits.

3.7 Example. Prove that
$$f(x) = \begin{cases} \sin\dfrac{1}{x} & x \ne 0 \\ 0 & x = 0 \end{cases}$$
has no limit as $x \to 0$.

PROOF. By examining the graph of $y = f(x)$ (see Figure 3.1), we are led to consider two extremes:
$$a_n := \frac{2}{(4n + 1)\pi} \quad\text{and}\quad b_n := \frac{2}{(4n + 3)\pi}, \qquad n \in \mathbf{N}.$$

[Figure 3.1: graph of $y = f(x)$]
Clearly, both $a_n$ and $b_n$ converge to 0 as $n \to \infty$. On the other hand, since $f(a_n) = 1$ and $f(b_n) = -1$ for all $n \in \mathbf{N}$, $f(a_n) \to 1$ and $f(b_n) \to -1$ as $n \to \infty$. Thus by Theorem 3.6, the limit of $f(x)$, as $x \to 0$, cannot exist. ■

Theorem 3.6 also allows us to translate results about limits of sequences to results about limits of functions. The next three theorems illustrate this principle.

Before stating these results, we need to introduce an algebra of functions. Suppose that $f, g : E \to \mathbf{R}$. For each $x \in E$, the pointwise sum, $f + g$, of $f$ and $g$ is defined by
$$(f + g)(x) := f(x) + g(x),$$
the scalar product, $\alpha f$, of a scalar $\alpha \in \mathbf{R}$ with $f$ by
$$(\alpha f)(x) := \alpha f(x),$$
the pointwise product, $fg$, of $f$ and $g$ by
$$(fg)(x) := f(x) g(x),$$
and (when $g(x) \ne 0$) the pointwise quotient, $f/g$, of $f$ and $g$ by
$$\Bigl(\frac{f}{g}\Bigr)(x) := \frac{f(x)}{g(x)}.$$
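The pointwise operations just defined translate directly into higher-order functions. The following is a minimal sketch; the names `add`, `scale`, `mul`, and `div` are hypothetical helpers chosen for this illustration, and real functions are modelled as Python callables on floats.

```python
import math


def add(f, g):
    """Pointwise sum: (f + g)(x) := f(x) + g(x)."""
    return lambda x: f(x) + g(x)


def scale(alpha, f):
    """Scalar product: (alpha f)(x) := alpha * f(x)."""
    return lambda x: alpha * f(x)


def mul(f, g):
    """Pointwise product: (fg)(x) := f(x) g(x)."""
    return lambda x: f(x) * g(x)


def div(f, g):
    """Pointwise quotient: (f/g)(x) := f(x)/g(x), defined only where g(x) != 0."""
    def quotient(x):
        if g(x) == 0:
            raise ZeroDivisionError("pointwise quotient undefined where g(x) = 0")
        return f(x) / g(x)
    return quotient


tangent = div(math.sin, math.cos)  # sin/cos, defined wherever cos(x) != 0
print(tangent(0.5), math.tan(0.5))
```

Each helper returns a new function on the same domain, which mirrors the way $f + g$, $\alpha f$, $fg$, and $f/g$ are themselves functions rather than numbers.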
The following result is a function analogue of Theorem 2.12.
3.8 THEOREM. Suppose that $a \in \mathbf{R}$, that $I$ is an open interval that contains $a$, and that $f, g$ are real functions defined everywhere on $I$ except possibly at $a$. If $f(x)$ and $g(x)$ converge as $x$ approaches $a$, then so do $(f + g)(x)$, $(fg)(x)$, $(\alpha f)(x)$, and $(f/g)(x)$ (when the limit of $g(x)$ is nonzero). In fact,
$$\lim_{x \to a} (f + g)(x) = \lim_{x \to a} f(x) + \lim_{x \to a} g(x),$$
$$\lim_{x \to a} (\alpha f)(x) = \alpha \lim_{x \to a} f(x),$$
$$\lim_{x \to a} (fg)(x) = \lim_{x \to a} f(x) \, \lim_{x \to a} g(x),$$
and (when the limit of $g(x)$ is nonzero)
$$\lim_{x \to a} \Bigl(\frac{f}{g}\Bigr)(x) = \frac{\lim_{x \to a} f(x)}{\lim_{x \to a} g(x)}.$$

PROOF. Let
$$L := \lim_{x \to a} f(x) \quad\text{and}\quad M := \lim_{x \to a} g(x).$$
If $x_n \in I \setminus \{a\}$ converges to $a$, then by Theorem 3.6, $f(x_n) \to L$ and $g(x_n) \to M$ as $n \to \infty$. By Theorem 2.12i, $f(x_n) + g(x_n) \to L + M$ as $n \to \infty$. Since this holds for any sequence $x_n \in I \setminus \{a\}$ that converges to $a$, we conclude by Theorem 3.6 that
$$\lim_{x \to a} (f + g)(x) = L + M = \lim_{x \to a} f(x) + \lim_{x \to a} g(x).$$
The other rules follow in an analogous way from Theorem 2.12ii through iv. ■

[Figure 3.2: graph of $y = h(x)$ near $x = a$]
Similarly, the Sequential Characterization of Limits can be combined with Theorems 2.9 and 2.17 to prove the following results.
3.9 THEOREM [SQUEEZE THEOREM FOR FUNCTIONS]. Suppose that $a \in \mathbf{R}$, that $I$ is an open interval that contains $a$, and that $f, g, h$ are real functions defined everywhere on $I$ except possibly at $a$.
(i) If $g(x) \le h(x) \le f(x)$ for all $x \in I \setminus \{a\}$, and
$$\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = L,$$
then the limit of $h(x)$ exists, as $x \to a$, and
$$\lim_{x \to a} h(x) = L.$$
(ii) If $|g(x)| \le M$ for all $x \in I \setminus \{a\}$ and $f(x) \to 0$ as $x \to a$, then
$$\lim_{x \to a} f(x) g(x) = 0.$$
The preceding result is illustrated in Figure 3.2.
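Part (ii) of the Squeeze Theorem can be illustrated with the product $x \sin(1/x)$, where $f(x) = x \to 0$ and $g(x) = \sin(1/x)$ is bounded by $M = 1$. This is a sketch under those assumptions only; the sample points $\pm 1/n$ are an arbitrary choice, and a finite sample does not establish the limit.

```python
import math


def h(x):
    # f(x) * g(x) with f(x) = x and g(x) = sin(1/x), defined for x != 0
    return x * math.sin(1.0 / x)


for n in (1, 10, 100, 1000, 10_000):
    x = 1.0 / n
    # |h(x)| is squeezed between 0 and |x|, which tends to 0
    print(x, h(x), abs(h(x)) <= abs(x))
```

The bound $|x \sin(1/x)| \le |x|$ holds at every sampled point, although $\sin(1/x)$ itself has no limit as $x \to 0$ (Example 3.7).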
3.10 THEOREM [COMPARISON THEOREM FOR FUNCTIONS]. Suppose that $a \in \mathbf{R}$, that $I$ is an open interval that contains $a$, and that $f, g$ are real functions defined everywhere on $I$ except possibly at $a$. If $f$ and $g$ have a limit as $x$ approaches $a$ and
$$f(x) \le g(x), \qquad x \in I \setminus \{a\},$$
then
$$\lim_{x \to a} f(x) \le \lim_{x \to a} g(x).$$

We shall refer to this last result as taking the limit of an inequality. The limit theorems (Theorems 3.8, 3.9, and 3.10) allow us to prove that limits exist without resorting to $\varepsilon$'s and $\delta$'s.
3.11 Example. Prove that
$$\lim_{x \to 1} \frac{x - 1}{3x + 1} = 0.$$

PROOF. By Example 3.2, $x - 1 \to 0$ and $3x + 1 \to 4$ as $x \to 1$. Hence, by Theorem 3.8,
$$\lim_{x \to 1} \frac{x - 1}{3x + 1} = \frac{0}{4} = 0.$$
■
EXERCISES

1. Using Definition 3.1, prove that each of the following limits exists.
(a) $\lim_{x \to 2} x^2 - x + 1 = 3$.
(b) $\lim_{x \to 1} \dfrac{x^2 - 1}{x - 1} = 2$.
(c) $\lim_{x \to 1} x^3 + x + 1 = 3$.

2. Decide which of the following limits exist and which do not. Prove that your answer is correct.
(a) $\lim_{x \to 0} \cos\dfrac{1}{x}$.
(b) $\lim_{x \to 0} x \sin\dfrac{1}{x}$.
(c) $\lim_{x \to 1} \dfrac{1}{\log x}$.
3. Evaluate the following limits using results from this section. You may assume that sinx, 1 - cos x, and ?Ix converge to 0 as x -+ O.J
(a)
(b)
. 11m
X2
. 11m
x2
+ cos x
x-o 2 - tan x
x-I
+X -
X
3
-
~1f
(d)
. xn - 1 11 m-x--->l x -1 '
. 11m
x-o
.
x2
-
lim x-.Jii x
(e)
2
X
(c)
+ 1f
X SIll
.
.
.
nEN.
1
2.
x
4. Using Definition 3.1, prove that
$$\lim_{x \to 0} x^n \sin\frac{1}{x}$$
exists for all $n \in \mathbf{N}$.

5. Prove Theorem 3.9.

6. Prove Theorem 3.10.

7. This exercise is used in Sections 3.2 and 5.2. For each real function $f$ define the positive part of $f$ by
$$f^+(x) = \frac{|f(x)| + f(x)}{2}, \qquad x \in \operatorname{Dom}(f),$$
and the negative part of $f$ by
$$f^-(x) = \frac{|f(x)| - f(x)}{2}, \qquad x \in \operatorname{Dom}(f).$$
(a) Prove that $f^+(x) \ge 0$, $f^-(x) \ge 0$, $f(x) = f^+(x) - f^-(x)$, and $|f(x)| = f^+(x) + f^-(x)$ hold for all $x \in \operatorname{Dom}(f)$. (Compare with Exercise 1, p. 11.)
(b) Prove that if
$$L = \lim_{x \to a} f(x)$$
exists, then $f^+(x) \to L^+$ and $f^-(x) \to L^-$ as $x \to a$.

8. Suppose that $f$ is a real function.
(a) Prove that if
$$L = \lim_{x \to a} f(x)$$
exists, then $|f(x)| \to |L|$ as $x \to a$.
(b) Show that there is a function such that as $x \to a$, $|f(x)| \to |L|$ but the limit of $f(x)$ does not exist.

9. This exercise is used in Sections 3.2 and 5.2. Let $f, g$ be real functions, and for each $x \in \operatorname{Dom}(f) \cap \operatorname{Dom}(g)$ define
$$(f \vee g)(x) := \max\{f(x), g(x)\} \quad\text{and}\quad (f \wedge g)(x) := \min\{f(x), g(x)\}.$$
(a) Prove that
$$(f \vee g)(x) = \frac{(f + g)(x) + |(f - g)(x)|}{2} \quad\text{and}\quad (f \wedge g)(x) = \frac{(f + g)(x) - |(f - g)(x)|}{2}$$
for all $x \in \operatorname{Dom}(f) \cap \operatorname{Dom}(g)$.
(b) Prove that if
$$L = \lim_{x \to a} f(x) \quad\text{and}\quad M = \lim_{x \to a} g(x)$$
exist, then $(f \vee g)(x) \to L \vee M$ and $(f \wedge g)(x) \to L \wedge M$ as $x \to a$.
3.2 ONE-SIDED LIMITS AND LIMITS AT INFINITY

In the preceding section we defined the limit of a real function. In this section we expand that definition to handle more general situations.

What is the limit of $f(x) := \sqrt{x - 1}$ as $x \to 1$? A reasonable answer is that the limit is zero. This function, however, does not satisfy Definition 3.1 because it is not defined on an OPEN interval containing $a = 1$. Indeed, $f$ is defined only for $x \ge 1$. To handle such situations, we introduce "one-sided" limits.

3.12 DEFINITION. Let $a \in \mathbf{R}$.
(i) A real function $f$ is said to converge to $L$ as $x$ approaches $a$ from the right if and only if $f$ is defined on some open interval $I$ with left endpoint $a$ and for every $\varepsilon > 0$ there is a $\delta > 0$ (which in general depends on $\varepsilon$, $f$, $I$, and $a$) such that $a + \delta \in I$ and
$$a < x < a + \delta \quad\text{implies}\quad |f(x) - L| < \varepsilon. \tag{2}$$
In this case we call $L$ the right-hand limit of $f$ at $a$, and denote it by
$$f(a+) := L =: \lim_{x \to a+} f(x).$$
(ii) A real function $f$ is said to converge to $L$ as $x$ approaches $a$ from the left if and only if $f$ is defined on some open interval $I$ with right endpoint $a$ and for every $\varepsilon > 0$ there is a $\delta > 0$ (which in general depends on $\varepsilon$, $f$, $I$, and $a$) such that $a - \delta \in I$ and
$$a - \delta < x < a \quad\text{implies}\quad |f(x) - L| < \varepsilon.$$
In this case we call $L$ the left-hand limit of $f$ at $a$ and denote it by
$$f(a-) := L =: \lim_{x \to a-} f(x).$$
It is easy to check that when two-sided limits are replaced with one-sided limits, all the limit theorems from the preceding section hold. We shall use them as the need arises without further comment. Existence of a one-sided limit can be established by these limit theorems or by appealing directly to the definition.
3.13 Examples. (i) Prove that
$$f(x) = \begin{cases} x + 1 & x \ge 0 \\ x - 1 & x < 0 \end{cases}$$
has one-sided limits at $a = 0$ but that $\lim_{x \to 0} f(x)$ does not exist.
(ii) Prove that
$$\lim_{x \to 0+} \sqrt{x} = 0.$$

PROOF. (i) Let $\varepsilon > 0$ and set $\delta = \varepsilon$. If $0 < x < \delta$, then $|f(x) - 1| = |x| < \delta = \varepsilon$. Hence $\lim_{x \to 0+} f(x)$ exists and equals 1. Similarly, $\lim_{x \to 0-} f(x)$ exists and equals $-1$. However, if $x_n = (-1)^n/n$, then $f(x_n) = (-1)^n(1 + 1/n)$ does not converge as $n \to \infty$. Hence by the Sequential Characterization of Limits, $\lim_{x \to 0} f(x)$ does not exist.
(ii) Let $\varepsilon > 0$ and set $\delta = \varepsilon^2$. If $0 < x < \delta$, then $|f(x)| = \sqrt{x} < \sqrt{\delta} = \varepsilon$. ■
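The behaviour described in Example 3.13(i) can be observed by evaluating $f$ along sequences approaching 0 from each side. The sketch below is only illustrative; the particular sample indices are assumptions, and finitely many values cannot establish a limit.

```python
def f(x):
    # The piecewise function of Example 3.13(i)
    return x + 1 if x >= 0 else x - 1


from_right = [f(1 / n) for n in (10, 100, 1000)]    # values approach 1
from_left = [f(-1 / n) for n in (10, 100, 1000)]    # values approach -1

# Along x_n = (-1)^n / n the values equal (-1)^n (1 + 1/n) and keep changing sign.
alternating = [f((-1) ** n / n) for n in range(1, 9)]

print(from_right)
print(from_left)
print(alternating)
```

The first two lists settle near 1 and $-1$ respectively, while the third oscillates between values near $-1$ and values near 1, matching the sequence used in the proof.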
Not every function has one-sided limits (see Example 3.7). Example 3.13 shows that even when a function has one-sided limits, it may not have a two-sided limit. The following result, however, shows that if both one-sided limits at a point $a$ exist and are EQUAL, then the two-sided limit at $a$ exists.

3.14 THEOREM. Let $f$ be a real function. Then the limit
$$\lim_{x \to a} f(x)$$
exists and equals $L$ if and only if
$$L = \lim_{x \to a+} f(x) = \lim_{x \to a-} f(x). \tag{3}$$

PROOF. If the limit $L$ of $f(x)$ exists as $x \to a$, then given $\varepsilon > 0$ choose $\delta > 0$ such that $0 < |x - a| < \delta$ implies $|f(x) - L| < \varepsilon$. Since any $x$ that satisfies $a < x < a + \delta$ or $a - \delta < x < a$ also satisfies $0 < |x - a| < \delta$, it is clear that both the left and right limits of $f(x)$ exist as $x \to a$ and satisfy (3).

Conversely, suppose that (3) holds. Then given $\varepsilon > 0$ there exists a $\delta_1 > 0$ (respectively, a $\delta_2 > 0$) such that $a < x < a + \delta_1$ (respectively, $a - \delta_2 < x < a$) implies $|f(x) - L| < \varepsilon$. Set $\delta = \min\{\delta_1, \delta_2\}$. Then $0 < |x - a| < \delta$ implies either $a < x < a + \delta_1$ or $a - \delta_2 < x < a$, depending on whether $x$ lies to the right or to the left of $a$. Hence (1) holds; i.e., $f(x) \to L$ as $x \to a$. ■

The definition of limits of real functions can be expanded to include extended real numbers. We say that $f(x) \to L$ as $x \to \infty$ (respectively, as $x \to -\infty$) if and only if there exists a $c > 0$ such that $(c, \infty) \subset \operatorname{Dom}(f)$ (respectively, $(-\infty, -c) \subset \operatorname{Dom}(f)$) and given $\varepsilon > 0$, there is an $M \in \mathbf{R}$ such that $x > M$ (respectively, $x < M$) implies $|f(x) - L| < \varepsilon$. In this case we shall write
$$\lim_{x \to \infty} f(x) = L \qquad \Bigl(\text{respectively, } \lim_{x \to -\infty} f(x) = L\Bigr).$$

We say that $f(x) \to +\infty$ (respectively, $f(x) \to -\infty$) as $x \to a$ if and only if there is an open interval $I$ containing $a$ such that $I \setminus \{a\} \subset \operatorname{Dom}(f)$ and given $M \in \mathbf{R}$ there is a $\delta > 0$ such that $0 < |x - a| < \delta$ implies $f(x) > M$ (respectively, $f(x) < M$). In this case we shall write
$$\lim_{x \to a} f(x) = +\infty \qquad \Bigl(\text{respectively, } \lim_{x \to a} f(x) = -\infty\Bigr).$$
Obvious modifications define $f(x) \to \pm\infty$ as $x \to a+$ and $x \to a-$, and $f(x) \to \pm\infty$ as $x \to \pm\infty$.

3.15 Example. Prove that $1/x \to 0$ as $x \to \infty$.

PROOF. Given $\varepsilon > 0$, set $M = 1/\varepsilon$. If $x > M$, then $|1/x| = 1/x < 1/M = \varepsilon$. Thus $1/x \to 0$ as $x \to \infty$. ■
3.16 Example. Prove that
$$\lim_{x \to 1-} f(x) := \lim_{x \to 1-} \frac{x + 2}{2x^2 - 3x + 1} = -\infty.$$

PROOF. Let $M \in \mathbf{R}$. We must show that $f(x) < M$ for $x$ near but to the left of 1 (no matter how large and negative $M$ is). Without loss of generality, assume that $M < 0$. As $x$ converges to 1 from the left, $2x^2 - 3x + 1$ is negative and converges to 0. (Observe that $2x^2 - 3x + 1$ is a parabola opening upward with roots $1/2$ and 1.) Therefore, choose $\delta \in (0, 1)$ such that $1 - \delta < x < 1$ implies $2/M < 2x^2 - 3x + 1 < 0$; i.e., $1/(2x^2 - 3x + 1) < M/2$. Notice that $0 < x < 1$ also implies $2 < x + 2 < 3$. Since $1/(2x^2 - 3x + 1)$ is negative and $x + 2 > 2$, it follows that
$$f(x) = \frac{x + 2}{2x^2 - 3x + 1} < \frac{2}{2x^2 - 3x + 1} < M$$
for all $1 - \delta < x < 1$. ■

In order to unify the presentation of one-sided, two-sided, and infinite limits, we introduce the following notation. Let $a$ be an extended real number, and $I$ be a nondegenerate open interval that either contains $a$ or has $a$ as one of its endpoints. Suppose further that $f$ is a real function defined on $I$ except possibly at $a$. If $a$ is finite and $I$ contains $a$, then
$$\lim_{\substack{x \to a \\ x \in I}} f(x) \tag{4}$$
will denote $\lim_{x \to a} f(x)$ (when it exists); if $a$ is a finite left endpoint of $I$, then (4) will denote $\lim_{x \to a+} f(x)$ (when it exists); if $a$ is a finite right endpoint of $I$, then (4) will denote $\lim_{x \to a-} f(x)$ (when it exists); if $a = \pm\infty$ is an endpoint of $I$, then (4) will denote $\lim_{x \to \pm\infty} f(x)$ (when it exists). Using this notation, we can state a Sequential Characterization of Limits valid for two-sided, one-sided, and infinite limits.
3.17 THEOREM. Let $a$ be an extended real number and $I$ be a nondegenerate open interval that either contains $a$ or has $a$ as one of its endpoints. Suppose further that $f$ is a real function defined on $I$ except possibly at $a$. Then
$$\lim_{\substack{x \to a \\ x \in I}} f(x)$$
exists and equals $L$ if and only if $f(x_n) \to L$ for all sequences $x_n \in I$ that satisfy $x_n \ne a$ and $x_n \to a$ as $n \to \infty$.
PROOF. Since we have already proved this for two-sided limits, we must show it for the remaining eight cases that notation (4) represents. Since the proofs are similar, we shall give the details for only one of these cases, namely the case when $a$ belongs to $I$ and $L = \infty$. Thus we must prove that $f(x) \to \infty$ as $x \to a$ if and only if $f(x_n) \to \infty$ for any sequence $x_n \in I$ that converges to $a$ and satisfies $x_n \ne a$ for $n \in \mathbf{N}$.

Suppose first that $f(x) \to \infty$ as $x \to a$. If $x_n \in I$, $x_n \to a$ as $n \to \infty$, and $x_n \ne a$, then given $M \in \mathbf{R}$ there is a $\delta > 0$ such that $0 < |x - a| < \delta$ implies $f(x) > M$, and there is an $N \in \mathbf{N}$ such that $n \ge N$ implies $|x_n - a| < \delta$. Consequently, $n \ge N$ implies $f(x_n) > M$; i.e., $f(x_n) \to \infty$ as $n \to \infty$, as required.

Conversely, suppose to the contrary that $f(x_n) \to \infty$ for any sequence $x_n \in I$ that converges to $a$ and satisfies $x_n \ne a$, but $f(x)$ does NOT converge to $\infty$ as $x \to a$. By the definition of "convergence" to $\infty$ there are numbers $M_0 \in \mathbf{R}$ and $x_n \in I$ such that $0 < |x_n - a| < 1/n$ and $f(x_n) \le M_0$ for all $n \in \mathbf{N}$. The first condition implies $x_n \ne a$ and $x_n \to a$, but the second condition implies that $f(x_n)$ does not converge to $\infty$ as $n \to \infty$. This contradiction proves 3.17 in the case $a \in I$ and $L = \infty$. ■

Using Theorem 3.17, we can prove limit theorems that are function analogues of Theorem 2.15 and Corollary 2.16. We leave this to the reader and will use these results as the need arises. These limit theorems can be used to evaluate infinite limits.

3.18 Example. Prove that
$$\lim_{x \to \infty} \frac{2x^2 - 1}{1 - x^2} = -2.$$

PROOF. Since the limit of a product is the product of the limits, we have by Example 3.15 that $1/x^m \to 0$ as $x \to \infty$ for any $m \in \mathbf{N}$. Multiplying numerator and denominator of the expression above by $1/x^2$ we have
$$\lim_{x \to \infty} \frac{2x^2 - 1}{1 - x^2} = \lim_{x \to \infty} \frac{2 - 1/x^2}{-1 + 1/x^2} = \frac{\lim_{x \to \infty} (2 - 1/x^2)}{\lim_{x \to \infty} (-1 + 1/x^2)} = \frac{2}{-1} = -2.$$
■

EXERCISES

1. Using definitions (rather than limit theorems), prove that
$$\lim_{x \to a+} f(x)$$
exists and equals $L$ in each of the following cases.
(a) $f(x) = |x|/x$, $a = 0$, and $L = 1$.
(b) $f(x) = -1/x$, $a = 0$, and $L = -\infty$.
(c) $f(x) = (x - 1)/(x^2 + x - 2)$, $a = -2$, and $L = \infty$.
(d) $f(x) = 1/(x^2 - 1)$, $a = 1$, and $L = \infty$.
2. Evaluate the following limits when they exist,
(a)
(b)
(c)
+1
x
I,
1m 2 x-+O+ X -
I,
X3 -
1m x-+l-
X
lim (x 2
X-+7f+
3
2x '
3x + 2 - 1
+ 1) sinx,
,
x
(d)
I1m - , x-+O+
(e)
lim
Ixl
tanx
x-+7f/2-
X
3. Evaluate the following limits when they exist,
(a)
(b)
(c)
(d)
, 3X2 - 13x + 4 I1m 2 '
x-+oo
1- x - x
, x2 I1m 3
+X +2
, I1m
x3 - 1 --,
x-+oo X
x-+-oo
-
X -
2'
x2 + 2
lim arctan x,
x-+oo
[You may assume that tan x ---. L as x ---. a, x E (-7f/2,7f/2), if and only if arctan x ---. a as x ---. L,]
(e)
lim
x-+oo
sinx -2-' X
(f) $\lim_{x \to -\infty} x^2 \sin x$.

4. This exercise is used in many places. Recall that a polynomial of degree $n$ is a function of the form
$$P(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$$
where $a_j \in \mathbf{R}$ for $j = 0, 1, \ldots, n$ and $a_n \ne 0$.
(a) Prove that $\lim_{x \to a} x^n = a^n$ for $n = 0, 1, \ldots$.
(b) Prove that if $P$ is a polynomial, then
$$\lim_{x \to a} P(x) = P(a)$$
for every $a \in \mathbf{R}$.
(c) Suppose that $P$ is a polynomial and $P(a) > 0$. Prove that $P(x)/(x - a) \to \infty$ as $x \to a+$, $P(x)/(x - a) \to -\infty$ as $x \to a-$, but
$$\lim_{x \to a} \frac{P(x)}{x - a}$$
does not exist.

5. Prove that $(\sin(x + 3) - \sin 3)/x$ converges to 0 as $x \to \infty$.

6. Prove that $\sqrt{1 - \cos x}/\sin x \to \sqrt{2}/2$ as $x \to 0+$.

7. Prove the following comparison theorems for real functions.
(a) If $f(x) \ge g(x)$ and $g(x) \to \infty$ as $x \to a$, then $f(x) \to \infty$ as $x \to a$.
(b) If $f(x) \le g(x) \le h(x)$ and
$$L := \lim_{x \to \infty} f(x) = \lim_{x \to \infty} h(x),$$
then $g(x) \to L$ as $x \to \infty$.

8. Suppose that $f : [a, \infty) \to \mathbf{R}$ for some $a \in \mathbf{R}$. Prove that $f(x) \to L$ as $x \to \infty$ if and only if $f(x_n) \to L$ for any sequence $x_n \in (a, \infty)$ that converges to $\infty$ as $n \to \infty$.

9. Suppose that $f : [0, 1] \to \mathbf{R}$ and $f(a) = \lim_{x \to a} f(x)$ for all $a \in [0, 1]$. Prove that $f(q) = 0$ for all $q \in \mathbf{Q} \cap [0, 1]$ if and only if $f(x) = 0$ for all $x \in [0, 1]$.

10. [CAUCHY] Suppose that $f : \mathbf{N} \to \mathbf{R}$. If
$$\lim_{n \to \infty} \bigl(f(n + 1) - f(n)\bigr) = L,$$
prove that $\lim_{n \to \infty} f(n)/n$ exists and equals $L$.
3.3 CONTINUITY

In elementary calculus, a function is called continuous at $a$ if $a \in \operatorname{Dom} f$ and $f(x) \to f(a)$ as $x \to a$. In particular, it is tacitly assumed that $f$ is defined on BOTH sides of $a$. Here, we introduce a more general concept of continuity that includes functions, such as $\sqrt{x}$ at $a = 0$, which are defined on only one side of some point in their domain.

3.19 DEFINITION. Let $E$ be a nonempty subset of $\mathbf{R}$ and $f : E \to \mathbf{R}$.
(i) $f$ is said to be continuous at a point $a \in E$ if and only if given $\varepsilon > 0$ there is a $\delta > 0$ (which in general depends on $\varepsilon$, $f$, and $a$) such that
$$|x - a| < \delta \text{ and } x \in E \quad\text{imply}\quad |f(x) - f(a)| < \varepsilon. \tag{5}$$
(ii) $f$ is said to be continuous on $E$ (notation: $f : E \to \mathbf{R}$ is continuous) if and only if $f$ is continuous at every $x \in E$.

The following result shows that if $E$ is an open interval that contains $a$, then "$f$ is continuous at $a \in E$" means "$f(x) \to f(a)$ as $x \to a$." (Therefore, we shall abbreviate "$f$ is continuous at $a \in E$" by "$f$ is continuous at $a$" when $E$ is an open interval.)

3.20 Remark. Let $I$ be an open interval that contains a point $a$ and $f : I \to \mathbf{R}$. Then $f$ is continuous at $a \in I$ if and only if
$$f(a) = \lim_{x \to a} f(x).$$

PROOF. Suppose that $I = (c, d)$ and set $\delta_0 := \min\{|c - a|, |d - a|\}$. If $\delta < \delta_0$, then $|x - a| < \delta$ implies $x \in I$. Therefore, condition (5) is identical to (1) when $f(a) = L$, $E = I$, and $\delta < \delta_0$. It follows that $f$ is continuous at $a \in I$ if and only if $f(x) \to f(a)$ as $x \to a$. ■
By repeating the proof of Theorem 3.6, we can establish a sequential characterization of continuity that is valid on any nonempty set.

3.21 THEOREM. Suppose that $E$ is a nonempty subset of $\mathbf{R}$, $a \in E$, and $f : E \to \mathbf{R}$. Then the following statements are equivalent:
(i) $f$ is continuous at $a \in E$.
(ii) If $x_n$ converges to $a$ and $x_n \in E$, then $f(x_n) \to f(a)$ as $n \to \infty$.

In particular, $\sqrt{x}$ is continuous on $I = [0, \infty)$ by Exercise 4, p. 44. By combining Theorem 3.21 with Theorem 2.12, we obtain the following result.

3.22 THEOREM. Let $E$ be a nonempty subset of $\mathbf{R}$ and $f, g : E \to \mathbf{R}$. If $f, g$ are continuous at a point $a \in E$ (respectively, continuous on the set $E$), then so are $f + g$, $fg$, and $\alpha f$ (for any $\alpha \in \mathbf{R}$). Moreover, $f/g$ is continuous at $a \in E$ when $g(a) \ne 0$ (respectively, on $E$ when $g(x) \ne 0$ for all $x \in E$).

It follows from Exercises 7, 8, and 9, p. 65, that if $f, g$ are continuous at a point $a \in E$ or on a set $E$, then so are $|f|$, $f^+$, $f^-$, $f \vee g$, and $f \wedge g$. We also notice by Exercise 4, p. 71, that every polynomial is continuous on $\mathbf{R}$. Many complicated functions can be broken into simpler pieces, using sums, products, quotients, and the following operation.
3.23 DEFINITION. Suppose that $A$ and $B$ are subsets of $\mathbf{R}$ and that $f : A \to \mathbf{R}$ and $g : B \to \mathbf{R}$. If $f(A) \subseteq B$, then the composition of $g$ with $f$ is the function $g \circ f : A \to \mathbf{R}$ defined by
$$(g \circ f)(x) := g(f(x)), \qquad x \in A.$$
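Composition has a direct counterpart in code. The sketch below is an illustration only: the helper name `compose` and the particular functions are assumptions, chosen so that the range of `f` lies inside the domain of `g`, as the condition $f(A) \subseteq B$ requires.

```python
import math


def compose(g, f):
    """Return the composition g o f, i.e. the function x -> g(f(x))."""
    return lambda x: g(f(x))


def f(x):
    # f maps A = R into B = [1, infinity)
    return x * x + 1


g = math.sqrt  # g is defined on B = [1, infinity), in fact on [0, infinity)

h = compose(g, f)  # h(x) = sqrt(x^2 + 1), defined for every real x
print(h(0.0), h(3.0))
```

Reversing the roles fails for some inputs: `compose(f, g)(-1.0)` raises `ValueError` because $-1$ is outside the domain of the square root, which is the situation the hypothesis $f(A) \subseteq B$ rules out.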
The following result contains information about when a limit sign and something else (in this case, the computation of a function) can be interchanged. We shall return to this theme many times, identifying conditions under which one can interchange any two of the following objects: limits, integrals, derivatives, infinite summations, and computation of a function (see especially Sections 7.1, 7.2, and 11.1, and the entry "interchange the order of" in the Index).
3.24 THEOREM. Suppose that $A$ and $B$ are subsets of $\mathbf{R}$ and that $f : A \to \mathbf{R}$ and $g : B \to \mathbf{R}$ with $f(A) \subseteq B$.
(i) If $A := I \setminus \{a\}$, where $I$ is a nondegenerate interval that either contains $a$ or has $a$ as one of its endpoints, if
$$L := \lim_{\substack{x \to a \\ x \in I}} f(x)$$
exists and belongs to $B$, and if $g$ is continuous at $L \in B$, then
$$\lim_{\substack{x \to a \\ x \in I}} (g \circ f)(x) = g\Bigl(\lim_{\substack{x \to a \\ x \in I}} f(x)\Bigr).$$
(ii) If $f$ is continuous at $a \in A$ and $g$ is continuous at $f(a) \in B$, then $g \circ f$ is continuous at $a \in A$.

PROOF. Suppose that $x_n \in I \setminus \{a\}$ and $x_n \to a$ as $n \to \infty$. Since $f(A) \subseteq B$, $f(x_n) \in B$. Also, by the Sequential Characterization of Limits (Theorem 3.17), $f(x_n) \to L$ as $n \to \infty$. Since $g$ is continuous at $L \in B$, it follows from Theorem 3.21 that $g \circ f(x_n) := g(f(x_n)) \to g(L)$ as $n \to \infty$. Hence by Theorem 3.17, $g \circ f(x) \to g(L)$ as $x \to a$ in $I$. This proves (i). A similar proof establishes part (ii). ■
For many applications, it is important to be able to find the maximum or minimum of a given function. As a first step in this direction, we introduce the following concept.
3.25 DEFINITION. Let $E$ be a nonempty subset of $\mathbf{R}$. A function $f : E \to \mathbf{R}$ is said to be bounded on $E$ if and only if there is an $M \in \mathbf{R}$ such that $|f(x)| \le M$ for all $x \in E$. (When $|f(x)| \le M$ for all $x \in E$, we shall say that $f$ is dominated by $M$ on $E$.)

Notice that whether a function $f$ is bounded or not on a set $E$ depends on $E$ as well as on $f$. For example, $f(x) = 1/x$ is bounded on $[1, \infty)$ (by 1) but not on $(0, 2)$. Again, the function $f(x) = x^2$ is bounded on $(-2, 2)$ (by 4) but not on $[0, \infty)$. The following result, which shall be used often, shows that a continuous function on an interval $[a, b]$ is always bounded.
3.26 THEOREM [EXTREME VALUE THEOREM]. If $I$ is a closed, bounded interval and $f : I \to \mathbf{R}$ is continuous on $I$, then $f$ is bounded on $I$. Moreover, if
$$M = \sup_{x \in I} f(x) \quad\text{and}\quad m = \inf_{x \in I} f(x),$$
then there exist points $x_m, x_M \in I$ such that
$$f(x_M) = M \quad\text{and}\quad f(x_m) = m. \tag{6}$$

PROOF. Suppose first that $f$ is not bounded on $I$. Then there exist $x_n \in I$ such that
$$|f(x_n)| > n, \qquad n \in \mathbf{N}. \tag{7}$$
Since $I$ is bounded, we know (by the Bolzano-Weierstrass Theorem) that $\{x_n\}$ has a convergent subsequence, say $x_{n_k} \to a$ as $k \to \infty$. Since $I$ is closed, we also know (by the Comparison Theorem) that $a \in I$. In particular, $f(a) \in \mathbf{R}$. On the other hand, substituting $n_k$ for $n$ in (7) and taking the limit of this inequality as $k \to \infty$, we have $|f(a)| = \infty$, a contradiction. Hence, the function $f$ is bounded on $I$.

We have proved that both $M$ and $m$ are finite real numbers. To show that there is an $x_M \in I$ such that $f(x_M) = M$, suppose to the contrary that $f(x) < M$ for all $x \in I$. Then the function
$$g(x) = \frac{1}{M - f(x)}$$
is continuous, hence, bounded on $I$. In particular, there is a $C > 0$ such that $|g(x)| = g(x) \le C$. It follows that
$$f(x) \le M - \frac{1}{C} \tag{8}$$
for all $x \in I$. Taking the supremum of (8) over all $x \in I$, we obtain $M \le M - 1/C < M$, a contradiction. Hence, there is an $x_M \in I$ such that $f(x_M) = M$. A similar argument proves that there is an $x_m \in I$ such that $f(x_m) = m$. ■

We shall sometimes refer to (6) by saying that the supremum and infimum of $f$ are attained on $I$. We shall also call the value $M$ (respectively, $m$) the maximum (respectively, the minimum) of $f$ on $I$. Neither of the hypotheses on the interval $I$ in Theorem 3.26 can be relaxed.
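The role of the hypotheses can be explored with a crude grid search. This is only a sketch under stated assumptions: the helper names `grid_max` and `open_grid_max`, the test functions, and the grid sizes are illustrative choices, and a grid search approximates a supremum without proving anything about it.

```python
def grid_max(f, a, b, n):
    """Largest value of f over n + 1 equally spaced points of [a, b]."""
    return max(f(a + (b - a) * k / n) for k in range(n + 1))


def open_grid_max(f, a, b, n):
    """Largest value of f over the interior grid points only (a and b excluded)."""
    return max(f(a + (b - a) * k / n) for k in range(1, n))


def parabola(x):
    # Continuous on the closed, bounded interval [0, 1]; its maximum 1/4 is attained at 1/2
    return x * (1 - x)


def reciprocal(x):
    # Continuous on the open interval (0, 1) but unbounded there
    return 1 / x


for n in (10, 100, 1000, 10_000):
    # The first column stabilises at 0.25, the second keeps growing with n.
    print(n, grid_max(parabola, 0.0, 1.0, n), open_grid_max(reciprocal, 0.0, 1.0, n))
```

On $[0, 1]$ the grid maxima of the first function settle at the attained maximum, while on $(0, 1)$ the grid maxima of $1/x$ grow without bound as the grid is refined.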
3.27 Remark. The Extreme Value Theorem is false if either "closed" or "bounded" is dropped from the hypotheses.

PROOF. The interval $(0, 1)$ is bounded but not closed, and the function $f(x) = 1/x$ is continuous and unbounded on $(0, 1)$. The interval $[0, \infty)$ is closed but not bounded, and the function $f(x) = x$ is continuous and unbounded on $[0, \infty)$. ■

What more can be said about continuous functions? One useful conceptualization of functions which are continuous on an interval is that their graphs have no holes or jumps (see Theorem 3.29 below). Our proof of this fact is based on the following elementary observation.

[Figure 3.3: graph of $y = f(x)$ near $x_0$, with the levels $f(x_0)$ and $f(x_0)/2$ marked]

3.28 Lemma [SIGN-PRESERVING PROPERTY]. Let $f : I \to \mathbf{R}$ where $I$ is an open, nondegenerate interval. If $f$ is continuous at a point $x_0 \in I$ and $f(x_0) > 0$, then there are positive numbers $\varepsilon$ and $\delta$ such that $|x - x_0| < \delta$ implies $f(x) > \varepsilon$.

STRATEGY: The idea behind this proof is simple. If $f(x_0) > 0$, then $f(x) > f(x_0)/2$ for $x$ near $x_0$ (see Figure 3.3). Here are the details.

PROOF. By (1), given $\varepsilon = f(x_0)/2$, choose $\delta > 0$ such that $|x - x_0| < \delta$ implies $|f(x) - f(x_0)| < \varepsilon$. It follows that
$$-\frac{f(x_0)}{2} < f(x) - f(x_0) < \frac{f(x_0)}{2}.$$
Solving the left-hand inequality, we see that $f(x) > f(x_0)/2 = \varepsilon$ holds for all $|x - x_0| < \delta$. ■
A real number $y_0$ is said to lie between two numbers $c$ and $d$ if and only if $c < y_0 < d$ or $d < y_0 < c$.

3.29 THEOREM [INTERMEDIATE VALUE THEOREM]. Let $I$ be a nondegenerate interval and $f : I \to \mathbf{R}$ be continuous. If $a, b \in I$ with $a < b$, and if $y_0$ lies between $f(a)$ and $f(b)$, then there is an $x_0 \in (a, b)$ such that $f(x_0) = y_0$.

PROOF. We may suppose that $f(a) < y_0 < f(b)$. Consider the set $E = \{x \in [a, b] : f(x) < y_0\}$ (see Figure 3.4). Since $a \in E$ and $E \subseteq [a, b]$, $E$ is a nonempty bounded subset of $\mathbf{R}$. Hence, by the Completeness Axiom, $x_0 := \sup E$ is a finite real number. Since $y_0$ equals neither $f(a)$ nor $f(b)$, $x_0$ cannot equal $a$ or $b$. Thus $x_0 \in (a, b)$. It remains to prove that $f(x_0) = y_0$.

To prove that this guess is correct, use Theorem 2.11 to choose $x_n \in E$ such that $x_n \to x_0$ as $n \to \infty$. Since $E \subseteq [a, b]$, $x_0 \in [a, b]$ (see Theorem 2.17). Hence, by continuity of $f$ and the definition of $E$, we have $f(x_0) = \lim_{n \to \infty} f(x_n) \le y_0$.
[Figure 3.4: graph of $y = f(x)$ on $[a, b]$, with the levels $f(a)$, $y_0$, $f(b)$ on the $y$-axis and the point $x_0$ between $a$ and $b$ on the $x$-axis.]

To show that $f(x_0) = y_0$, suppose to the contrary that $f(x_0) < y_0$. Then $y_0 - f(x)$ is a continuous function whose value at $x = x_0$ is positive. Hence, by Lemma 3.28, we can choose positive numbers $\varepsilon$ and $\delta$ such that $y_0 - f(x) > \varepsilon > 0$ for $|x - x_0| < \delta$. In particular, any $x$ that satisfies $x_0 < x < x_0 + \delta$ also satisfies $f(x) < y_0$, a contradiction of the fact that $x_0 = \sup E$. ∎

Thus, if $f$ is continuous on $[a, b]$ and $f(a) \le y_0 \le f(b)$, then there is an $x_0 \in [a, b]$ such that $f(x_0) = y_0$.

If $f$ fails to be continuous at a point $a$, we say that $f$ is discontinuous at $a$ and call $a$ a point of discontinuity of $f$. How badly can a function behave near a point of discontinuity? The following examples can be interpreted as answers to this question. (See also Exercise 9, p. 288.)

3.30 Example. Prove that the function
$$f(x) = \begin{cases} \dfrac{|x|}{x} & x \ne 0 \\ 1 & x = 0 \end{cases}$$
is continuous on $(-\infty, 0)$ and $[0, \infty)$, discontinuous at 0, and that both $f(0+)$ and $f(0-)$ exist.

PROOF. Since $f(x) = 1$ for $x \ge 0$, it is clear that $f(0+) = 1$ exists and $f(x) \to f(a)$ as $x \to a$ for any $a > 0$. In particular, $f$ is continuous on $[0, \infty)$. Similarly, $f(0-) = -1$ and $f$ is continuous on $(-\infty, 0)$. Finally, since $f(0+) \ne f(0-)$, the limit of $f(x)$ as $x \to 0$ does not exist by Theorem 3.14. Therefore, $f$ is not continuous at 0. ∎

3.31 Example. Assuming that $\sin x$ is continuous on $\mathbf{R}$, prove that
$$f(x) = \begin{cases} \sin\dfrac{1}{x} & x \ne 0 \\ 1 & x = 0 \end{cases}$$
is continuous on $(-\infty, 0)$ and $(0, \infty)$, discontinuous at 0, and neither $f(0+)$ nor $f(0-)$ exists. (See Figure 3.2 on p. 62.)

PROOF. The function $1/x$ is continuous for $x \ne 0$ by Theorem 3.8. Hence, by Theorem 3.24, $f(x) = \sin(1/x)$ is continuous on $(-\infty, 0)$ and $(0, \infty)$. To prove that $f(0+)$ does not exist, let $x_n = 2/((2n + 1)\pi)$, and observe (see Appendix B) that $\sin(1/x_n) = (-1)^n$, $n \in \mathbf{N}$. Since $x_n \downarrow 0$ but $(-1)^n$ does not converge, it follows from Theorem 3.21 (the Sequential Characterization of Continuity) that $f(0+)$ does not exist. A similar argument proves that $f(0-)$ does not exist. ∎

3.32 Example. The Dirichlet function is defined on $\mathbf{R}$ by
$$f(x) := \begin{cases} 1 & x \in \mathbf{Q} \\ 0 & x \notin \mathbf{Q}. \end{cases}$$
Prove that every point $x \in \mathbf{R}$ is a point of discontinuity of $f$. (Such functions are called nowhere continuous.)
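PROOF. By Theorem 1.24 and Exercise 3 on p. 23 (Density of Rationals and Irrationals), given any $a \in \mathbf{R}$ and $\delta > 0$ we can choose $x_1 \in \mathbf{Q}$ and $x_2 \in \mathbf{R} \setminus \mathbf{Q}$ such that $|x_i - a| < \delta$ for $i = 1, 2$. Since $f(x_1) = 1$ and $f(x_2) = 0$, $f$ cannot be continuous at $a$. ∎

3.33 Example. Prove that the function
$$f(x) = \begin{cases} \dfrac{1}{q} & x = \dfrac{p}{q} \in \mathbf{Q} \ \text{(in reduced form)} \\ 0 & x \notin \mathbf{Q} \end{cases}$$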
is continuous at every irrational in the interval $(0, 1)$ but discontinuous at every rational in $(0, 1)$.

PROOF. Let $a$ be a rational in $(0, 1)$ and suppose that $f$ is continuous at $a$. If $x_n$ is a sequence of irrationals that converges to $a$, then $f(x_n) \to f(a)$; i.e., $f(a) = 0$. But $f(a) \ne 0$ by definition. Hence, $f$ is discontinuous at every rational in $(0, 1)$.

Let $a$ be an irrational in $(0, 1)$. We must show that $f(x_n) \to f(a)$ for every sequence $x_n \in (0, 1)$ that satisfies $x_n \to a$ as $n \to \infty$. We may suppose that $x_n \in \mathbf{Q}$. For each $n \in \mathbf{N}$, write $x_n = p_n/q_n$ in reduced form. Since $f(a) = 0$, it suffices to show that $q_n \to \infty$ as $n \to \infty$. Suppose to the contrary that there exist integers $n_1 < n_2 < \cdots$ such that $|q_{n_k}| \le M < \infty$ for $k \in \mathbf{N}$. Since $x_{n_k} \in (0, 1)$, it follows that the set
$$E := \left\{ x_{n_k} = \frac{p_{n_k}}{q_{n_k}} : k \in \mathbf{N} \right\}$$
contains only a finite number of points. Hence, the limit of any sequence in $E$ must belong to $E$, a contradiction since $a$ is such a limit and is irrational. ∎

To see how counterintuitive Example 3.33 is, try to draw a graph of $y = f(x)$. Stranger things can happen.
3.34 Remark. The composition of two functions $g \circ f$ can be nowhere continuous, even though $f$ is discontinuous only on $\mathbf{Q}$ and $g$ is discontinuous at only one point.

PROOF. Let $f$ be the function given in Example 3.33 and set
$$g(x) = \begin{cases} 1 & x \ne 0 \\ 0 & x = 0. \end{cases}$$
Clearly,
$$(g \circ f)(x) = \begin{cases} 1 & x \in \mathbf{Q} \\ 0 & x \notin \mathbf{Q}. \end{cases}$$
Hence, $g \circ f$ is the Dirichlet function, nowhere continuous by Example 3.32. ∎
In view of Example 3.33 and Remark 3.34, we must be skeptical of proofs that rely exclusively on geometric intuition. Although we shall use geometric intuition to suggest methods of proof for many results in subsequent chapters, these suggestions will always be followed by a careful rigorous proof that contains no fuzzy reasoning based on pictures or sketches no matter how plausible they seem.
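The key step in the proof of Example 3.33 (only finitely many fractions in $(0, 1)$ have denominator at most $M$, so an irrational point keeps a positive distance from all of them) can be explored numerically. The following is a minimal sketch, not part of the text: the names `f_rational` and `nearest_gap` are illustrative, and the float `math.sqrt(2) - 1` is only a stand-in for an irrational number.

```python
from fractions import Fraction
import math


def f_rational(x: Fraction) -> Fraction:
    """Value of the function of Example 3.33 at a rational x = p/q in (0, 1).

    Fraction keeps p/q in lowest terms, so the value is 1/q.
    (At irrational points the function is 0; those are not representable here.)
    """
    return Fraction(1, x.denominator)


def nearest_gap(a: float, max_q: int) -> float:
    """Smallest distance from a to any fraction p/q in (0, 1) with q <= max_q."""
    return min(abs(a - p / q) for q in range(2, max_q + 1) for p in range(1, q))


if __name__ == "__main__":
    a = math.sqrt(2) - 1  # float stand-in for an irrational point of (0, 1)
    for max_q in (5, 10, 50):
        # Any rational closer to a than this gap has denominator > max_q,
        # so the function value there is below 1/max_q.
        print(max_q, nearest_gap(a, max_q))
```

The printed gaps shrink as `max_q` grows but stay positive, which is the finiteness argument of the proof in numerical form.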
EXERCISES

For these exercises, assume that $\sin x$, $\cos x$, and $e^x$ are continuous on $\mathbf{R}$.

1. For each of the following, prove that there is at least one $x \in \mathbf{R}$ that satisfies the given equation.
(a) $e^x = x^2$.
(b) $e^x = \cos x + 1$.
(c) $2^x = 2 - x$.
2. Use limit theorems to show that the following functions are continuous on $[0, 1]$.
(a) $f(x) = x e^{x^2} + 5$.
(b) $f(x) = \dfrac{1 - x}{1 + x}$.
(c) $f(x) = \begin{cases} x \sin\dfrac{1}{x} & x \ne 0 \\ 0 & x = 0. \end{cases}$
(d) $f(x) = \sqrt{1 - x}$.
(e) $f(x) = \dfrac{\sin(e^x)}{x^2 + x - 6}$.
3. If $f : [a, b] \to \mathbf{R}$ is continuous, prove that $\sup_{x \in [a, b]} |f(x)|$ is finite.
4. Suppose that $f$ is a real-valued function of a real variable. If $f$ is continuous at $a$ with $f(a) < M$ for some $M \in \mathbf{R}$, prove that there is an open interval $I$ containing $a$ such that $f(x) < M$ for all $x \in I$.
5. Show that there exist nowhere continuous functions $f$ and $g$ whose sum $f + g$ is continuous on $\mathbf{R}$. Show that the same is true for the product of functions.
6. Let
$$f(x) = \begin{cases} \cdots & x \ne 0 \\ \cdots & x = 0. \end{cases}$$
(a) Prove that $f$ is continuous on $(0, \infty)$ and $(-\infty, 0)$ but discontinuous at 0.
(b) Suppose that $g : [0, 2/\pi] \to \mathbf{R}$ is continuous on $(0, 2/\pi)$ and that there is a positive constant $C > 0$ such that $|g(x)| \le C\sqrt{x}$ for all $x \in (0, 2/\pi)$. Prove that $f(x)g(x)$ is continuous on $[0, 2/\pi]$.
7. Suppose that $a \in \mathbf{R}$, that $I$ is an open interval containing $a$, that $f, g : I \to \mathbf{R}$, and that $f$ is continuous at $a$.
(a) Prove that $g$ is continuous at $a$ if and only if $f + g$ is continuous at $a$.
(b) Make and prove an analogous statement for the product $fg$. Show by example that the hypothesis about $f$ you added cannot be dropped.
8. Suppose that $f : \mathbf{R} \to \mathbf{R}$ satisfies $f(x + y) = f(x) + f(y)$ for each $x, y \in \mathbf{R}$.
(a) Show that $f(nx) = nf(x)$ for all $x \in \mathbf{R}$ and $n \in \mathbf{Z}$.
(b) Prove that $f(qx) = qf(x)$ for all $x \in \mathbf{R}$ and $q \in \mathbf{Q}$.
(c) Prove that $f$ is continuous at 0 if and only if $f$ is continuous on $\mathbf{R}$.
(d) Prove that if $f$ is continuous at 0, then there is an $m \in \mathbf{R}$ such that $f(x) = mx$ for all $x \in \mathbf{R}$.
[9]. This exercise is used in Section 7.4. Suppose that $f : \mathbf{R} \to (0, \infty)$ satisfies $f(x + y) = f(x)f(y)$. Modifying the outline in Exercise 8, show that if $f$ is continuous at 0, then there is an $a \in (0, \infty)$ such that $f(x) = a^x$ for all $x \in \mathbf{R}$. [Note: You may assume that the function $a^x$ is continuous on $\mathbf{R}$.]
10. If $f : \mathbf{R} \to \mathbf{R}$ is continuous and
$$\lim_{x \to \infty} f(x) = \lim_{x \to -\infty} f(x) = \infty,$$
prove that $f$ has a minimum on $\mathbf{R}$; i.e., there is an $x_m \in \mathbf{R}$ such that
$$f(x_m) = \inf_{x \in \mathbf{R}} f(x) < \infty.$$
3.4 UNIFORM CONTINUITY The following concept is very important and will be used many times in the rest of the course.
3.35 DEFINITION. Let $E$ be a nonempty subset of $\mathbf{R}$ and $f : E \to \mathbf{R}$. Then $f$ is said to be uniformly continuous on $E$ (notation: $f : E \to \mathbf{R}$ is uniformly continuous) if and only if for every $\varepsilon > 0$ there is a $\delta > 0$ such that
$$|x - a| < \delta \ \text{ and } \ x, a \in E \quad \text{imply} \quad |f(x) - f(a)| < \varepsilon. \tag{9}$$
Notice that the $\delta$ in Definition 3.35 depends on $\varepsilon$ and $f$, but not on $a$ and $x$. This issue needs to be addressed when one proves that a given function is uniformly continuous on a specific set (e.g., by determining $\delta$ before $a$ is mentioned).
3.36 Example. Prove that $f(x) = x^2$ is uniformly continuous on the interval $(0, 1)$.

PROOF. Given $\varepsilon > 0$, set $\delta = \varepsilon/3$. If $x, a \in (0, 1)$, then
$$|x + a| \le |x| + |a| \le 2.$$
Therefore, if $x, a \in (0, 1)$ and $|x - a| < \delta$, then
$$|f(x) - f(a)| = |x^2 - a^2| = |x - a|\,|x + a| \le 2|x - a| \le \frac{2\varepsilon}{3} < \varepsilon. \ ∎$$
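The point of Example 3.36 is that one $\delta$, fixed before $a$ is chosen, works at every point of $(0, 1)$. As a sanity check only (random sampling proves nothing), the sketch below tests the choice $\delta = \varepsilon/3$ at many random pairs; the function names, the sample size, and the seed are illustrative assumptions.

```python
import random


def delta_for_square(eps: float) -> float:
    """The delta chosen in Example 3.36 for f(x) = x^2 on (0, 1)."""
    return eps / 3


def check_uniform(eps: float, trials: int = 10_000, seed: int = 0) -> bool:
    """Sample pairs x, a in (0, 1) with |x - a| < delta and test |x^2 - a^2| < eps."""
    rng = random.Random(seed)
    delta = delta_for_square(eps)
    for _ in range(trials):
        a = rng.uniform(0.0, 1.0)
        x = a + rng.uniform(-delta, delta)
        if not (0.0 < a < 1.0 and 0.0 < x < 1.0):
            continue  # pair falls outside the open interval; skip it
        if abs(x - a) < delta and abs(x * x - a * a) >= eps:
            return False  # would contradict the estimate in the proof
    return True


if __name__ == "__main__":
    for eps in (1.0, 0.1, 1e-3):
        print(eps, check_uniform(eps))
```

Note that `delta_for_square` never looks at `a`; that independence is exactly what Definition 3.35 demands.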
The definitions of continuity and uniform continuity are very similar. In fact, the only difference is that for a continuous function, the parameter $\delta$ may depend on both $\varepsilon$ and $a$, whereas for a uniformly continuous function, $\delta$ must be chosen independently of $a$. In particular, every function uniformly continuous on $E$ is also continuous on $E$. The following example shows that the converse of this statement is false unless some restriction is made on $E$.

3.37 Example. Show that $f(x) = x^2$ is not uniformly continuous on $\mathbf{R}$.
PROOF. Suppose to the contrary that $f$ is uniformly continuous on $\mathbf{R}$. Then there is a $\delta > 0$ such that $|x - a| < \delta$ implies $|f(x) - f(a)| < 1$ for all $x, a \in \mathbf{R}$. By the Archimedean Principle, choose $n \in \mathbf{N}$ so large that $n\delta > 1$. Set $a = n$ and $x = n + \delta/2$. Then $|x - a| < \delta$ and
$$1 > |f(x) - f(a)| = |x^2 - a^2| = n\delta + \frac{\delta^2}{4} > n\delta > 1.$$
This contradiction proves that $f$ is not uniformly continuous on $\mathbf{R}$. ∎
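The witness pair used in the proof of Example 3.37 is easy to compute. The sketch below, with illustrative function names, picks $n$ with $n\delta > 1$ and evaluates $|x^2 - a^2|$ at $a = n$, $x = n + \delta/2$; it is a numerical companion to the proof under the assumption that floating-point rounding is negligible at these sizes.

```python
import math


def witness_n(delta: float) -> int:
    """An integer n with n * delta > 1, as supplied by the Archimedean Principle."""
    return math.floor(1 / delta) + 1


def gap_at(n: int, delta: float) -> float:
    """|x^2 - a^2| for a = n and x = n + delta/2; equals n*delta + delta^2/4."""
    a = float(n)
    x = n + delta / 2
    return abs(x * x - a * a)


if __name__ == "__main__":
    for delta in (0.5, 0.01, 1e-4):
        n = witness_n(delta)
        # x and a are within delta of each other, yet f(x) and f(a) differ by more than 1
        print(delta, n, gap_at(n, delta))
```

However small `delta` is made, the gap stays above 1, so no single $\delta$ can serve $\varepsilon = 1$ on all of $\mathbf{R}$.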
Here is a key that unlocks the difference between continuity and uniform continuity.

3.38 Lemma. Suppose that $E \subseteq \mathbf{R}$ and $f : E \to \mathbf{R}$ is uniformly continuous. If $x_n \in E$ is Cauchy, then $f(x_n)$ is Cauchy.
PROOF. Let $\varepsilon > 0$ and choose $\delta > 0$ such that (9) holds. Since $\{x_n\}$ is Cauchy, choose $N \in \mathbf{N}$ such that $n, m \ge N$ implies $|x_n - x_m| < \delta$. Then $n, m \ge N$ implies $|f(x_n) - f(x_m)| < \varepsilon$. ∎

Notice that $f(x) = 1/x$ is continuous on $(0, 1)$ and $x_n = 1/n$ is Cauchy but $f(x_n)$ is not. In particular, $1/x$ is continuous but not uniformly continuous on the open interval $(0, 1)$. Notice how the graph of $y = 1/x$ corroborates this fact. Indeed, as $a$ gets closer to 0, the value of $\delta$ gets smaller (compare $\delta_1$ to $\delta_0$ in Figure 3.5), hence cannot be chosen independently of $a$.

[Figure 3.5: graph of $y = 1/x$ with the horizontal levels $f(x_0) + \varepsilon$ and $f(x_0) - \varepsilon$ and the corresponding widths $\delta_0$ and $\delta_1$ on the $x$-axis.]

Thus on an open interval, continuity and uniform continuity are different even if the interval is bounded. This is not the case for closed, bounded intervals (see Theorem 3.39). The following result is extremely important because uniform continuity is so strong. Indeed, we shall use it dozens of times before this course is finished.
3.39 THEOREM. Suppose that $I$ is a closed, bounded interval. If $f : I \to \mathbf{R}$ is continuous on $I$, then $f$ is uniformly continuous on $I$.
PROOF. Suppose to the contrary that $f$ is continuous but not uniformly continuous on $I$. Then there is an $\varepsilon_0 > 0$ and points $x_n, y_n \in I$ such that $|x_n - y_n| < 1/n$ and
$$|f(x_n) - f(y_n)| \ge \varepsilon_0, \qquad n \in \mathbf{N}. \tag{10}$$
By the Bolzano-Weierstrass Theorem and the Comparison Theorem, the sequence $\{x_n\}$ has a subsequence, say $x_{n_k}$, that converges, as $k \to \infty$, to some $x \in I$. Similarly, the sequence $\{y_{n_k}\}_{k \in \mathbf{N}}$ has a convergent subsequence, say $y_{n_{k_j}}$, that converges, as $j \to \infty$, to some $y \in I$. Since $x_{n_{k_j}} \to x$ as $j \to \infty$ and $f$ is continuous, it follows from (10) that $|f(x) - f(y)| \ge \varepsilon_0$; i.e., $f(x) \ne f(y)$. But $|x_n - y_n| < 1/n$ for all $n \in \mathbf{N}$, so Theorem 2.9 (the Squeeze Theorem) implies that $x = y$. Therefore, $f(x) = f(y)$, a contradiction. ∎
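To see Theorem 3.39 at work on the earlier example $f(x) = 1/x$, one can compute the largest admissible $\delta$ at a point $a > 0$. The formula below is a short side computation, not taken from the text: the worst case is $x = a - \delta$, where $1/x - 1/a = \delta/(a(a - \delta))$, and setting this equal to $\varepsilon$ gives $\delta = \varepsilon a^2/(1 + \varepsilon a)$. The function names are illustrative.

```python
def largest_delta(a: float, eps: float) -> float:
    """Largest delta with |1/x - 1/a| < eps whenever |x - a| < delta (a > 0).

    Derived by solving delta / (a * (a - delta)) = eps for delta.
    """
    return eps * a * a / (1 + eps * a)


def uniform_delta(c: float, eps: float) -> float:
    """A single delta that works for 1/x at every point of [c, 1], 0 < c < 1.

    largest_delta is increasing in a, so its value at the left endpoint c
    is the smallest one needed on the whole interval.
    """
    return largest_delta(c, eps)


if __name__ == "__main__":
    eps = 1.0
    for c in (0.5, 0.1, 0.01, 0.001):
        # positive for each fixed c, but tends to 0 as c -> 0+
        print(c, uniform_delta(c, eps))
```

On each closed interval $[c, 1]$ a positive uniform $\delta$ exists, as the theorem promises, but these values shrink to 0 as $c \to 0+$, which is why no single $\delta$ serves the open interval $(0, 1)$.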
Our first application of this result is a useful but simple characterization of uniform continuity on bounded open intervals. (This result does NOT work for unbounded intervals.)
3.40 THEOREM. Let $(a, b)$ be a bounded, open, nonempty interval and $f : (a, b) \to \mathbf{R}$. Then $f$ is uniformly continuous on $(a, b)$ if and only if $f$ can be extended continuously to $[a, b]$, i.e., if and only if there is a continuous function $g : [a, b] \to \mathbf{R}$ that satisfies
$$f(x) = g(x), \qquad x \in (a, b). \tag{11}$$
PROOF. Suppose that $f$ is uniformly continuous on $(a, b)$. Let $x_n \in (a, b)$ converge to $b$ as $n \to \infty$. Then $\{x_n\}$ is Cauchy; hence, by Lemma 3.38, so is $\{f(x_n)\}$. In particular,
$$g(b) := \lim_{n \to \infty} f(x_n)$$
exists. This value does not change if we use a different sequence to approximate $b$. Indeed, let $y_n \in (a, b)$ be another sequence that converges to $b$ as $n \to \infty$. Given $\varepsilon > 0$, choose $\delta > 0$ such that (9) holds for $E = (a, b)$. Since $x_n - y_n \to 0$, choose $N \in \mathbf{N}$ so that $n \ge N$ implies $|x_n - y_n| < \delta$. By (9), then, $|f(x_n) - f(y_n)| < \varepsilon$ for all $n \ge N$. Taking the limit of this inequality as $n \to \infty$, we obtain
$$\left| \lim_{n \to \infty} f(x_n) - \lim_{n \to \infty} f(y_n) \right| \le \varepsilon$$
for all $\varepsilon > 0$. It follows from Theorem 1.9 that
$$\lim_{n \to \infty} f(x_n) = \lim_{n \to \infty} f(y_n).$$
Thus, $g(b)$ is well defined. A similar argument defines $g(a)$. Set $g(x) = f(x)$ for $x \in (a, b)$. Then $g$ is defined on $[a, b]$, satisfies (11), and is continuous on $[a, b]$ by the Sequential Characterization of Limits. Thus, $f$ can be "continuously extended" to $g$ as required.

Conversely, suppose that there is a function $g$ continuous on $[a, b]$ that satisfies (11). By Theorem 3.39, $g$ is uniformly continuous on $[a, b]$; hence, $g$ is uniformly continuous on $(a, b)$. We conclude that $f$ is uniformly continuous on $(a, b)$. ∎

Let $f$ be continuous on a bounded nonempty interval $(a, b)$. Notice that $f$ is continuously extendable to $[a, b]$ if and only if the one-sided limits of $f$ exist at $a$ and $b$. Indeed, when they exist, we can always define $g$ at $a$ and $b$ to be the values of these limits. In particular, we can prove that $f$ is uniformly continuous without using $\varepsilon$'s and $\delta$'s.
3.41 Example. Prove that $f(x) = (x - 1)/\log x$ is uniformly continuous on $(0, 1)$.

SOLUTION. It is clear that $f(x) \to 0$ as $x \to 0+$. Moreover, by l'Hôpital's Rule,
$$\lim_{x \to 1-} f(x) = \lim_{x \to 1-} \frac{1}{1/x} = 1.$$
Hence $f$ is continuously extendable to $[0, 1]$, so by Theorem 3.40, $f$ is uniformly continuous on $(0, 1)$. ∎
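The two endpoint limits in Example 3.41 can be observed numerically, and the extension $g$ of Theorem 3.40 written out explicitly. This is only an illustrative sketch: the names `f` and `g` mirror the text, and the sample points are arbitrary choices.

```python
import math


def f(x: float) -> float:
    """f(x) = (x - 1) / log x on the open interval (0, 1)."""
    return (x - 1) / math.log(x)


def g(x: float) -> float:
    """Continuous extension of f to [0, 1], using the one-sided limits 0 and 1."""
    if x == 0.0:
        return 0.0
    if x == 1.0:
        return 1.0
    return f(x)


if __name__ == "__main__":
    for x in (1e-3, 1e-30, 1e-300):
        print("near 0:", x, f(x))  # tends to 0, but slowly (like 1 / |log x|)
    for x in (0.9, 0.999, 1 - 1e-6):
        print("near 1:", x, f(x))  # tends to 1
```

The slow decay near 0 is a reminder that the existence of a limit, not its rate, is all that Theorem 3.40 requires.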
EXERCISES

1. Using Definition 3.35, prove that each of the following functions is uniformly continuous on $(0, 1)$.
(a) $f(x) = x^3$.
(b) $f(x) = x^2 - x$.
(c) $f(x) = x \sin 2x$.
2. Prove that each of the following functions is uniformly continuous on $(0, 1)$.
(a) $f(x) = \dfrac{x^3 - 1}{x - 1}$.
(b) $f(x) = x \sin\dfrac{1}{x}$.
(c) $f(x)$ is any polynomial.
(d) $f(x) = \dfrac{\sin x}{x}$.
(e) $f(x) = \cdots$
[You may use l'Hôpital's Rule (see Theorem 4.18) on parts (d) and (e).]
3. Find all real $\alpha$ such that $x^\alpha \sin(1/x)$ is uniformly continuous on the open interval $(0, 1)$.
4. (a) Suppose that $f : [0, \infty) \to \mathbf{R}$ is continuous and there is an $L \in \mathbf{R}$ such that $f(x) \to L$ as $x \to \infty$. Prove that $f$ is uniformly continuous on $[0, \infty)$.
(b) Prove that $f(x) = 1/(x^2 + 1)$ is uniformly continuous on $\mathbf{R}$.
5. (a) Let $I$ be a bounded interval. Prove that if $f : I \to \mathbf{R}$ is uniformly continuous on $I$, then $f$ is bounded on $I$.
(b) Prove that (a) may be false if $I$ is unbounded or if $f$ is merely continuous.
6. Suppose that $\alpha \in \mathbf{R}$, $E$ is a nonempty subset of $\mathbf{R}$, and $f, g : E \to \mathbf{R}$ are uniformly continuous on $E$.
(a) Prove that $f + g$ and $\alpha f$ are uniformly continuous on $E$.
(b) Suppose that $f, g$ are bounded on $E$. Prove that $fg$ is uniformly continuous on $E$.
(c) Show that there exist functions $f, g$ uniformly continuous on $\mathbf{R}$ such that $fg$ is not uniformly continuous on $\mathbf{R}$.
(d) Suppose that $f$ is bounded on $E$ and that there is a positive constant $\varepsilon_0$ such that $g(x) \ge \varepsilon_0$ for all $x \in E$. Prove that $f/g$ is uniformly continuous on $E$.
(e) Show that there exist functions $f, g$, uniformly continuous on the interval $(0, 1)$, with $g(x) > 0$ for all $x \in (0, 1)$, such that $f/g$ is not uniformly continuous on $(0, 1)$.
(f) Prove that if $f, g$ are uniformly continuous on an interval $[a, b]$ and $g(x) \ne 0$ for $x \in [a, b]$, then $f/g$ is uniformly continuous on $[a, b]$.
7. Let $E \subseteq \mathbf{R}$. A function $f : E \to \mathbf{R}$ is said to be increasing on $E$ if and only if $x_1, x_2 \in E$ and $x_1 < x_2$ imply $f(x_1) \le f(x_2)$. Suppose that $f$ is increasing and bounded on an open, bounded, nonempty interval $(a, b)$.
(a) Prove that $f(a+)$ and $f(b-)$ both exist and are finite.
(b) Prove that $f$ is continuous on $(a, b)$ if and only if $f$ is uniformly continuous on $(a, b)$.
(c) Show that (b) is false if $f$ is unbounded. Indeed, find an increasing function $g : (0, 1) \to \mathbf{R}$ that is continuous on $(0, 1)$ but not uniformly continuous on $(0, 1)$.
8. Suppose that $f$ is continuous on $[0, 1]$ and set
$$I_k = \left[ \frac{k - 1}{2^n}, \frac{k}{2^n} \right]$$
for $k = 1, 2, \ldots, 2^n$. Prove that given $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that $n \ge N$ implies
$$\sup_{x \in I_k} f(x) - \inf_{x \in I_k} f(x) < \varepsilon, \qquad k = 1, 2, \ldots, 2^n.$$
9. Prove that a polynomial of degree $n$ is uniformly continuous on $\mathbf{R}$ if and only if $n = 0$ or $1$.
Chapter 4
Differentiability on R
4.1 THE DERIVATIVE For many applications, one needs to compute the slope of a tangent line of some function f. The following concept is useful in this regard.
4.1 DEFINITION. A real function $f$ is said to be differentiable at a point $a \in \mathbf{R}$ if and only if $f$ is defined on some open interval $I$ containing $a$ and
$$f'(a) := \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \tag{1}$$
exists. In this case $f'(a)$ is called the derivative of $f$ at $a$.
The assumption that $f$ be defined on an open interval containing $a$ is made so that the quotients in (1) are defined for all $h \ne 0$ sufficiently small.

You may recall that the graph of $y = f(x)$ has a tangent line at the point $(a, f(a))$ if and only if $f$ has a derivative at $a$, in which case the slope of that tangent line is $f'(a)$. To see why this connection makes sense, let us consider a geometric interpretation of (1). Suppose that $f$ is differentiable at $a$. A secant line of the graph $y = f(x)$ is a line passing through at least two points on the graph, and a chord is a line segment that runs from one point on the graph to another. Let $x = a + h$ and observe that the slope of the chord passing through the points $(x, f(x))$ and $(a, f(a))$ is given by $(f(x) - f(a))/(x - a)$. Now, since $x = a + h$, (1) becomes
$$f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}.$$
[Figure 4.1: graph of $y = f(x)$ with chords through $(a, f(a))$ approaching the tangent line at $x = a$.]

Hence, as $x \to a$ the slopes of the chords through $(x, f(x))$ and $(a, f(a))$ approximate the slope of the tangent line of $y = f(x)$ at $x = a$ (see Figure 4.1), and in the limit, the slope of the tangent line to $y = f(x)$ at $x = a$ is precisely $f'(a)$. Thus, we shall say that the graph of $y = f(x)$ has a unique tangent line at a point $(a, f(a))$ if and only if $f'(a)$ exists.

If $f$ is differentiable at each point in a set $E$, then $f'$ is a function on $E$. This function is denoted in several ways:
$$D_x f = \frac{df}{dx} = f^{(1)} = f'.$$
When $y = f(x)$ we shall also use the notation $dy/dx$ or $y'$ for $f'$. Higher-order derivatives are defined recursively; i.e., if $n \in \mathbf{N}$, then $f^{(n+1)}(a) := (f^{(n)})'(a)$, provided that these derivatives exist. Higher-order derivatives are also denoted in several ways, including $D_x^n f$, $d^n f/dx^n$, $f^{(n)}$, and $d^n y/dx^n$ and $y^{(n)}$ when $y = f(x)$. The second derivatives $f^{(2)}$ (respectively, $y^{(2)}$) are usually written as $f''$ (respectively, $y''$), and when they exist at some point $a$, we shall say that $f$ is twice differentiable at $a$.

Here are two characterizations of differentiability that we shall use to study derivatives. The first one, which characterizes the derivative in terms of the "chord function"
$$F(x) := \frac{f(x) - f(a)}{x - a}, \qquad x \ne a, \tag{2}$$
will be used below to prove the Chain Rule.
4.2 THEOREM. A real function $f$ is differentiable at some point $a \in \mathbf{R}$ if and only if there exist an open interval $I$ and a function $F : I \to \mathbf{R}$ such that $a \in I$, $f$ is defined on $I$, $F$ is continuous at $a$, and
$$f(x) = F(x)(x - a) + f(a) \tag{3}$$
holds for all $x \in I$, in which case $F(a) = f'(a)$.
PROOF. Notice once and for all that for $x \in I \setminus \{a\}$, (2) and (3) are equivalent.

Suppose that $f$ is differentiable at $a$. Then $f$ is defined on some open interval $I$ containing $a$, and the limit in (1) exists. Define $F$ on $I$ by (2) if $x \ne a$, and by $F(a) := f'(a)$. Then (3) holds for all $x \in I$, and $F$ is continuous at $a$ by (2) since $f'(a)$ exists.

Conversely, if (3) holds, then (2) holds for all $x \in I$, $x \ne a$. Taking the limit of (2) as $x \to a$, bearing in mind that $F$ is continuous at $a$, we conclude that $F(a) = f'(a)$. ∎
The second characterization of differentiability, in terms of linear approximations (i.e., how well f(a + h) - f(a) can be approximated by a straight line through the origin) will be used in Chapter 11 to define the derivative of a function of several variables.
4.3 THEOREM. Let $f : \mathbf{R} \to \mathbf{R}$. Then $f$ is differentiable at $a$ if and only if there is a function $T$ of the form $T(x) := mx$ such that
$$\lim_{h \to 0} \frac{|f(a + h) - f(a) - T(h)|}{|h|} = 0. \tag{4}$$

PROOF. Suppose that $f$ is differentiable, and set $m := f'(a)$. Then by (1),
$$\frac{f(a + h) - f(a) - T(h)}{h} = \frac{f(a + h) - f(a)}{h} - f'(a) \to 0$$
as $h \to 0$. Conversely, if (4) holds for $T(x) := mx$, then
$$\lim_{h \to 0} \left( \frac{f(a + h) - f(a)}{h} - m \right) = \lim_{h \to 0} \frac{f(a + h) - f(a) - mh}{h} = \lim_{h \to 0} \frac{f(a + h) - f(a) - T(h)}{h} = 0.$$
Since the limit of a difference is the difference of its limits, it follows that (1) holds with $m = f'(a)$. ∎
Our first application of Theorem 4.2 answers the question: Are differentiability and continuity related?
4.4 THEOREM. If $f$ is differentiable at $a$, then $f$ is continuous at $a$.

PROOF. Suppose that $f$ is differentiable at $a$. By Theorem 4.2, there is an open interval $I$ and a function $F$, continuous at $a$, such that $f(x) - f(a) = F(x)(x - a)$ for all $x \in I$. Taking the limit of this product as $x \to a$, we see that
$$\lim_{x \to a} \big( f(x) - f(a) \big) = F(a) \cdot 0 = 0.$$
In particular, $f(x) \to f(a)$ as $x \to a$; i.e., $f$ is continuous at $a$. ∎
Thus any function that fails to be continuous at a cannot be differentiable at a. The following example shows that the converse of Theorem 4.4 is false.
[Figure 4.2: graph of $y = |x|$.]

4.5 Example. Show that $f(x) = |x|$ is continuous at 0 but not differentiable there.

PROOF. Since $x \to 0$ implies that $|x| \to 0$, $f$ is continuous at 0. On the other hand, since $|h| = h$ when $h > 0$ and $|h| = -h$ when $h < 0$, we have
$$\lim_{h \to 0+} \frac{f(h) - f(0)}{h} = 1 \quad \text{and} \quad \lim_{h \to 0-} \frac{f(h) - f(0)}{h} = -1.$$
Since a limit exists if and only if its one-sided limits exist and are equal (Theorem 3.14), it follows that the limit in (1) does not exist when $a = 0$ and $f(x) = |x|$. Therefore, $f$ is not differentiable at 0. ∎

This example reflects the conventional wisdom about the difference between differentiable and continuous functions. Since a function differentiable at $a$ always has a unique tangent line at $(a, f(a))$, the graph of a differentiable function on an interval is "smooth" with no corners, cusps, or kinks. On the contrary, although the graph of a continuous function on an interval is unbroken (has no holes or jumps), it may well have corners, cusps, or kinks. In particular, $f(x) = |x|$ is continuous but not differentiable at $x = 0$ and the graph of $y = |x|$ is unbroken but has a corner at the point $(0, 0)$ (see Figure 4.2).

By Definition 4.1, if $f$ is differentiable at $a$, then $f$ must be defined on an open interval containing $a$, i.e., on both sides of $a$. As with the theory of limits, it is convenient to define "one-sided" derivatives to deal with functions whose domains are closed intervals (see Example 4.7). Here is a brief discussion of what it means for a real function to be differentiable on an interval (as opposed to being differentiable at every point in an interval). This concept will be used in Sections 5.3, 5.6, and 11.1.

4.6 DEFINITION. Let $I$ be a nondegenerate interval.
(i) A function $f : I \to \mathbf{R}$ is said to be differentiable on $I$ if and only if
$$f'_I(a) := \lim_{\substack{x \to a \\ x \in I}} \frac{f(x) - f(a)}{x - a}$$
exists and is finite for every $a \in I$.
(ii) $f$ is said to be continuously differentiable on $I$ if and only if $f'_I$ exists and is continuous on $I$.

Notice that when $a$ is not an endpoint of $I$, $f'_I(a)$ is the same as $f'(a)$. Because of this, we usually drop the subscript on $f'_I$. In particular, if $f$ is differentiable on $[a, b]$, then
$$f'(a) := \lim_{h \to 0+} \frac{f(a + h) - f(a)}{h} \quad \text{and} \quad f'(b) := \lim_{h \to 0-} \frac{f(b + h) - f(b)}{h}.$$
The following example shows that Definition 4.6 enlarges the collection of differentiable functions.

4.7 Example. The function $f(x) = x^{3/2}$ is differentiable on $[0, \infty)$ and $f'(x) = 3\sqrt{x}/2$ for all $x \in [0, \infty)$.

PROOF. By the Power Rule (see Exercise 8, p. 94), $f'(x) = 3\sqrt{x}/2$ for all $x \in (0, \infty)$. By definition,
$$f'(0) = \lim_{h \to 0+} \frac{h^{3/2} - 0}{h} = \lim_{h \to 0+} \sqrt{h} = 0. \ ∎$$
Here is notation widely used in conjunction with Definition 4.6. Let $I$ be a nondegenerate interval. For each $n \in \mathbf{N}$, $C^n(I)$ will denote the collection of real functions $f$ whose $n$th derivatives exist and are continuous on $I$. (Thus $C^1(I)$ is precisely the collection of real functions that are continuously differentiable on $I$.) We shall also denote the collection of $f$ that belong to $C^n(I)$ for all $n \in \mathbf{N}$ by $C^\infty(I)$. [When dealing with specific intervals, we shall drop the outer set of parentheses; e.g., we shall write $C^n[a, b]$ for $C^n([a, b])$.] By modifying the proof of Theorem 4.4, we can show that if $f$ is differentiable on $I$, then $f$ is continuous on $I$. Thus, $C^m(I) \subset C^n(I)$ when $m > n$. The following example shows that not every function that is differentiable on $\mathbf{R}$ belongs to $C^1(\mathbf{R})$.
4.8 Example. The function
$$f(x) = \begin{cases} x^2 \sin(1/x) & x \ne 0 \\ 0 & x = 0 \end{cases}$$
is differentiable on $\mathbf{R}$ but not continuously differentiable on any interval that contains the origin.

PROOF. By definition,
$$f'(0) = \lim_{h \to 0} h \sin\frac{1}{h} = 0 \quad \text{and} \quad f'(x) = 2x \sin\frac{1}{x} - \cos\frac{1}{x}$$
for $x \ne 0$. Thus $f$ is differentiable on $\mathbf{R}$ but $\lim_{x \to 0} f'(x)$ does not exist. In particular, $f'$ is not continuous on any interval that contains the origin. ∎
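The failure of $\lim_{x \to 0} f'(x)$ to exist in Example 4.8 can be seen by sampling $f'$ along $x_n = 1/(n\pi)$, where $\sin(1/x_n) = 0$ and $\cos(1/x_n) = (-1)^n$. The sketch below is illustrative only; the choice of sample points is an assumption made here, not part of the text's proof.

```python
import math


def f(x: float) -> float:
    """f(x) = x^2 sin(1/x) for x != 0, and f(0) = 0."""
    return x * x * math.sin(1 / x) if x != 0 else 0.0


def fprime(x: float) -> float:
    """f'(x) = 2x sin(1/x) - cos(1/x) for x != 0, and f'(0) = 0."""
    return 2 * x * math.sin(1 / x) - math.cos(1 / x) if x != 0 else 0.0


if __name__ == "__main__":
    # Along x_n = 1/(n*pi) the derivative alternates near +1 and -1 while x_n -> 0.
    for n in range(1, 7):
        x_n = 1 / (n * math.pi)
        print(n, x_n, fprime(x_n))
    # The difference quotient at 0 is h*sin(1/h), bounded by |h|, so it tends to 0.
    for h in (1e-2, 1e-4, 1e-6):
        print(h, f(h) / h)
```

So $f'(0) = 0$ exists even though $f'(x)$ keeps oscillating between values near $1$ and $-1$ arbitrarily close to the origin.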
It is important to notice that a function which is differentiable on two sets is not necessarily differentiable on their union.
4.9 Remark. $f(x) = |x|$ is differentiable on $[0, 1]$ and on $[-1, 0]$ but not on $[-1, 1]$.

PROOF. Since $f(x) = x$ when $x > 0$ and $= -x$ when $x < 0$, it is clear that $f$ is differentiable on $[-1, 0) \cup (0, 1]$ (with $f'(x) = 1$ for $x > 0$ and $f'(x) = -1$ for $x < 0$). By Example 4.5, $f$ is not differentiable at $x = 0$. However,
$$f'_{[0,1]}(0) = \lim_{h \to 0+} \frac{|h|}{h} = 1 \quad \text{and} \quad f'_{[-1,0]}(0) = \lim_{h \to 0-} \frac{|h|}{h} = -1.$$
Therefore, $f$ is differentiable on $[0, 1]$ and on $[-1, 0]$. ∎
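The one-sided difference quotients in Remark 4.9 are simple enough to evaluate directly. The helper `quotient` below is an illustrative name, and the step sizes are arbitrary; the sketch just confirms that the right-hand and left-hand quotients of $|x|$ at 0 sit at different constants.

```python
def quotient(f, a: float, h: float) -> float:
    """Difference quotient (f(a + h) - f(a)) / h from Definition 4.1."""
    return (f(a + h) - f(a)) / h


if __name__ == "__main__":
    for h in (1e-1, 1e-3, 1e-6):
        right = quotient(abs, 0.0, h)   # h -> 0+ : quotient is |h|/h with h > 0
        left = quotient(abs, 0.0, -h)   # h -> 0- : quotient is |h|/h with h < 0
        print(h, right, left)
```

Each one-sided limit exists, which gives differentiability on $[0, 1]$ and on $[-1, 0]$ separately, but the two values disagree, so no two-sided derivative exists at 0.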
EXERCISES

1. For each of the following real functions, use Definition 4.1 directly to prove that $f'(a)$ exists.
(a) $f(x) = x^2$, $a \in \mathbf{R}$.
(b) $f(x) = 1/x$, $a \ne 0$.
(c) $f(x) = \sqrt{x}$, $a > 0$.
2. Let $I$ be an open interval that contains 0 and $f : I \to \mathbf{R}$. If there exists an $\alpha > 1$ such that $|f(x)| \le |x|^\alpha$ for all $x \in I$, prove that $f$ is differentiable at 0. What happens when $\alpha = 1$?
3. Let $I$ be an open interval, $f : I \to \mathbf{R}$, and $c \in I$. The function $f$ is said to have a local maximum at $c$ if and only if there is a $\delta > 0$ such that $f(c) \ge f(x)$ holds for all $|x - c| < \delta$.
(a) If $f$ has a local maximum at $c$, prove that
$$\frac{f(c + h) - f(c)}{h} \le 0 \quad \text{and} \quad \frac{f(c + H) - f(c)}{H} \ge 0$$
for $h > 0$ and $H < 0$ sufficiently small.
(b) If $f$ is differentiable and has a local maximum at $c$, prove that $f'(c) = 0$.
(c) Make and prove analogous statements for local minima.
(d) Show by example that the converses of the statements in parts (b) and (c) are false. Namely, find an $f$ such that $f'(0) = 0$ but $f$ has neither a local maximum nor a local minimum at 0.
4. Using elementary geometry and the definition of $\sin x$, $\cos x$, one can show for every $x, y \in \mathbf{R}$ (see Appendix B) that
(i) $|\sin x| \le 1$, $|\cos x| \le 1$, $\sin(0) = 0$, $\cos(0) = 1$,
(ii) $\sin(-x) = -\sin x$, $\cos(-x) = \cos x$,
(iii) $\sin^2 x + \cos^2 x = 1$, $\cos x = 1 - 2\sin^2\dfrac{x}{2}$,
(iv) $\sin(x \pm y) = \sin x \cos y \pm \cos x \sin y$.
Moreover, if $x$ is measured in radians, then
(v) $\cos x = \sin\left(\dfrac{\pi}{2} - x\right)$, $\sin x = \cos\left(\dfrac{\pi}{2} - x\right)$,
and
(vi) $0 < x \cos x < \sin x < x$, $0 < x \le \dfrac{\pi}{2}$.
Using these properties and the Chain Rule (see p. 92), prove each of the following statements.
(a) The functions $\sin x$ and $\cos x$ are continuous at 0.
(b) The functions $\sin x$ and $\cos x$ are continuous on $\mathbf{R}$.
(c) The limits
$$\lim_{x \to 0} \frac{\sin x}{x} = 1 \quad \text{and} \quad \lim_{x \to 0} \frac{1 - \cos x}{x} = 0$$
exist.
(d) The function $\sin x$ is differentiable on $\mathbf{R}$ with $(\sin x)' = \cos x$.
(e) The functions $\cos x$ and $\tan x := \sin x/\cos x$ are differentiable on $\mathbf{R}$ with $(\cos x)' = -\sin x$ and $(\tan x)' = \sec^2 x$.
5. Suppose that $f : (0, \infty) \to \mathbf{R}$ satisfies $f(x) - f(y) = f(x/y)$ for all $x, y \in (0, \infty)$ and $f(1) = 0$.
(a) Prove that $f$ is continuous on $(0, \infty)$ if and only if $f$ is continuous at 1.
(b) Prove that $f$ is differentiable on $(0, \infty)$ if and only if $f$ is differentiable at 1.
(c) Prove that if $f$ is differentiable at 1, then $f'(x) = f'(1)/x$ for all $x \in (0, \infty)$. (Note: If $f'(1) = 1$, then $f(x) = \log x$.)
[6]. This exercise is used in Section 4.2.
(a) Prove that $(x^n)' = nx^{n-1}$ for $n \in \mathbf{N}$ and $x \in \mathbf{R}$.
(b) Prove that if $f(x) = x^\alpha$, where $\alpha = 1/n$ for some $n \in \mathbf{N}$, then $y = f(x)$ is differentiable and $f'(x) = \alpha x^{\alpha - 1}$ for every $x \in (0, \infty)$.
7. Suppose that
$$f_\alpha(x) = \begin{cases} \cdots & x \ne 0 \\ \cdots & x = 0. \end{cases}$$
Show that $f_\alpha(x)$ is continuous at $x = 0$ when $\alpha > 0$ and differentiable at $x = 0$ when $\alpha > 1$. Graph these functions for $\alpha = 1$ and $\alpha = 2$ and give a geometric interpretation of your results.
8. Let $I$ be an open interval, let $a \in I$, and let $f, g$ be functions from $I$ to $\mathbf{R}$.
(a) Prove that if $f$ and $g$ are differentiable at $a$, then $f + g$ is differentiable at $a$ with $(f + g)'(a) = f'(a) + g'(a)$.
(b) Prove that if $f$ is differentiable at $a$ and $\alpha \in \mathbf{R}$, then $\alpha f$ is differentiable at $a$ with $(\alpha f)'(a) = \alpha f'(a)$.
4.2 DIFFERENTIABILITY THEOREMS In this section we prove several familiar results about derivatives.
4.10 THEOREM. Let $f$ and $g$ be real functions and $\alpha \in \mathbf{R}$. If $f$ and $g$ are differentiable at $a$, then $f + g$, $\alpha f$, $f \cdot g$, and (when $g(a) \ne 0$) $f/g$ are all differentiable at $a$. In fact,
$$(f + g)'(a) = f'(a) + g'(a), \tag{5}$$
$$(\alpha f)'(a) = \alpha f'(a), \tag{6}$$
$$(f \cdot g)'(a) = g(a)f'(a) + f(a)g'(a), \tag{7}$$
$$\left( \frac{f}{g} \right)'(a) = \frac{g(a)f'(a) - f(a)g'(a)}{g^2(a)}. \tag{8}$$
PROOF. The proofs of these rules are similar. We provide the details only for (7). By adding and subtracting $f(a)g(x)$ in the numerator of the left side of the following expression, we can write
$$\frac{f(x)g(x) - f(a)g(a)}{x - a} = g(x)\,\frac{f(x) - f(a)}{x - a} + f(a)\,\frac{g(x) - g(a)}{x - a}.$$
This last expression is a sum of products of functions. Since $g$ is continuous (see Theorem 4.4), it follows from Definition 4.1 and Theorem 3.8 that
$$\lim_{x \to a} \frac{f(x)g(x) - f(a)g(a)}{x - a} = g(a)f'(a) + f(a)g'(a). \ ∎$$
Formula (5) is called the Sum Rule, (6) is sometimes called the Homogeneous Rule, (7) is called the Product Rule, and (8) is called the Quotient Rule. Next, we show what the derivative does to a composition of two functions.
4.11 THEOREM [CHAIN RULE]. Let $f$ and $g$ be real functions. If $f$ is differentiable at $a$ and $g$ is differentiable at $f(a)$, then $g \circ f$ is differentiable at $a$ with
$$(g \circ f)'(a) = g'(f(a))f'(a). \tag{9}$$
PROOF. By Theorem 4.2, there exist open intervals $I$ and $J$, and functions $F : I \to \mathbf{R}$, continuous at $a$, and $G : J \to \mathbf{R}$, continuous at $f(a)$, such that $F(a) = f'(a)$, $G(f(a)) = g'(f(a))$,
$$f(x) = F(x)(x - a) + f(a), \qquad x \in I, \tag{10}$$
and
$$g(y) = G(y)(y - f(a)) + g(f(a)), \qquad y \in J. \tag{11}$$
Since $f$ is continuous at $a$ we may assume (by making $I$ smaller if necessary) that $f(x) \in J$ for all $x \in I$. Fix $x \in I$. Apply (11) to $y = f(x)$ and (10) to $x$ to write
$$(g \circ f)(x) = g(f(x)) = G(f(x))\big(f(x) - f(a)\big) + g(f(a)) = G(f(x))F(x)(x - a) + (g \circ f)(a).$$
Set $H(x) = G(f(x))F(x)$ for $x \in I$. Since $F$ is continuous at $a$ and $G$ is continuous at $f(a)$, it is clear that $H$ is continuous at $a$. Moreover,
$$H(a) = G(f(a))F(a) = g'(f(a))f'(a).$$
It follows from Theorem 4.2, therefore, that $(g \circ f)'(a) = g'(f(a))f'(a)$. ∎
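The factorization used in this proof can be checked on a concrete pair of functions. The sketch below assumes $f(x) = x^2$, $g(y) = \sin y$ and the base point $a = 0.7$; these choices, and the names `F`, `G`, `H` mirroring the proof, are illustrative only. Here $F(x) = x + a$ is the chord function of $f$, and $G$ is the chord function of $g$ at $b = f(a)$, filled in with $g'(b) = \cos b$ at $y = b$.

```python
import math

a = 0.7  # illustrative base point


def f(x: float) -> float:
    return x * x


def g(y: float) -> float:
    return math.sin(y)


def F(x: float) -> float:
    """Chord function of f at a: (x^2 - a^2)/(x - a) = x + a, with F(a) = 2a = f'(a)."""
    return x + a


def G(y: float) -> float:
    """Chord function of g at b = f(a), with G(b) = cos(b) = g'(b)."""
    b = f(a)
    return (g(y) - g(b)) / (y - b) if y != b else math.cos(b)


def H(x: float) -> float:
    """H = (G o f) * F, the chord function of g o f at a built in the proof."""
    return G(f(x)) * F(x)


if __name__ == "__main__":
    for x in (0.2, 0.69, 0.7001, 1.3):
        lhs = g(f(x))
        rhs = H(x) * (x - a) + g(f(a))  # identity derived from (10) and (11)
        print(x, lhs, rhs)
    print("H(a) =", H(a), " g'(f(a)) f'(a) =", math.cos(a * a) * 2 * a)
```

The identity holds at every sampled $x$, and $H(a)$ agrees with $g'(f(a))f'(a)$, which is the content of (9) for this pair.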
EXERCISES

1. For each of the following functions, find all $x$ for which $f'(x)$ exists and find a formula for $f'$. (You may use Exercise 8.)
(a) $f(x) = (x^3 - 2x^2 + 3x)/\sqrt{x}$.
(b) $f(x) = 1/(x^2 + x - 1)$.
(c) $f(x) = x^x$.
(d) $f(x) = |x^3 + 2x^2 - x - 2|$.
2. Let $f$ and $g$ be differentiable at 2 and 3 with $f'(2) = a$, $f'(3) = b$, $g'(2) = c$, and $g'(3) = d$. If $f(2) = 1$, $f(3) = 2$, $g(2) = 3$, and $g(3) = 4$, evaluate each of the following derivatives.
(a) $(fg)'(2)$.
(b) $(f/g)'(3)$.
(c) $(g \circ f)'(3)$.
(d) $(f \circ g)'(2)$.
3. Assuming that $e^x$ is differentiable on $\mathbf{R}$, prove that
$$f(x) = \begin{cases} \dfrac{x}{1 + e^{1/x}} & x \ne 0 \\ 0 & x = 0 \end{cases}$$
is differentiable on $[0, \infty)$. Is $f$ differentiable at 0?
4. Using Exercise 6, p. 91, prove that every polynomial belongs to $C^\infty(\mathbf{R})$.
5. [RECIPROCAL RULE] Suppose that $f$ is differentiable at $a$ and $f(a) \ne 0$.
(a) Show that for $h$ sufficiently small, $f(a + h) \ne 0$.
(b) Using Definition 4.1 directly, prove that $1/f(x)$ is differentiable at $x = a$ and
$$\left( \frac{1}{f} \right)'(a) = -\frac{f'(a)}{f^2(a)}.$$
6. Use Exercise 5 and the Product Rule to prove the Quotient Rule.
7. Suppose that $n \in \mathbf{N}$ and $f, g$ are real functions of a real variable whose $n$th derivatives $f^{(n)}, g^{(n)}$ exist at a point $a$. Prove Leibniz's generalization of the Product Rule:
$$(fg)^{(n)}(a) = \sum_{k=0}^{n} \binom{n}{k} f^{(k)}(a)\, g^{(n-k)}(a).$$
[8]. This exercise is used in Section 5.3.
(a) Prove that if $f(x) = x^{m/n}$ for some $m, n \in \mathbf{N}$, then $y = f(x)$ is differentiable and satisfies $ny^{n-1}y' = mx^{m-1}$ for every $x \in (0, \infty)$.
(b) [POWER RULE] Prove that $x^q$ is differentiable on $(0, \infty)$ for every $q \in \mathbf{Q}$ and $(x^q)' = qx^{q-1}$.
9. Consider the following outline to a proof of the Chain Rule for real functions. Let $y = f(x)$, $y_0 = f(x_0)$, and observe that $y \to y_0$ as $x \to x_0$. Thus
$$\lim_{x \to x_0} \frac{g \circ f(x) - g \circ f(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{g(f(x)) - g(f(x_0))}{f(x) - f(x_0)} \cdot \frac{f(x) - f(x_0)}{x - x_0}$$
$$= \left( \lim_{y \to y_0} \frac{g(y) - g(y_0)}{y - y_0} \right) \left( \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \right) = g'(y_0)f'(x_0) = g'(f(x_0))f'(x_0).$$
(a) Find the flaw in this argument.
(b) Write down a statement that this argument does prove.
4.3 MEAN VALUE THEOREM

The Mean Value Theorem makes a precise statement about the relationship between the derivative of a function and the slope of one of its chords. It was discovered by the following geometric reasoning. Suppose that $f$ is differentiable on $(a, b)$. Since the graph of $f$ on $(a, b)$ has a tangent at each of its points, it seems likely that the slope of the chord through the points $(a, f(a))$ and $(b, f(b))$ equals the slope $f'(c)$ for some value of $c \in (a, b)$ (see Figure 4.3). We begin with a special case.

4.12 Lemma [ROLLE'S THEOREM]. Suppose that $a, b \in \mathbf{R}$ with $a \neq b$. If $f$ is continuous on $[a, b]$, differentiable on $(a, b)$, and if $f(a) = f(b)$, then $f'(c) = 0$ for some $c \in (a, b)$.

PROOF. By the Extreme Value Theorem, $f$ has a finite maximum $M$ and a finite minimum $m$ on $[a, b]$. If $M = m$, then $f$ is constant on $(a, b)$ and $f'(x) = 0$ for all $x \in (a, b)$.

Suppose that $M \neq m$. Since $f(a) = f(b)$, $f$ must assume one of the values $M$ or $m$ at some point $c \in (a, b)$. By symmetry, we may suppose that $f(c) = M$. (That is, if we can prove the theorem when $f(c) = M$, then a similar proof establishes the theorem when $f(c) = m$.) Since $M$ is the maximum of $f$ on $[a, b]$, we have

$$f(c + h) - f(c) \le 0$$

for all $h$ that satisfy $c + h \in (a, b)$. In the case $h > 0$ this implies that

$$f'(c) = \lim_{h \to 0+} \frac{f(c + h) - f(c)}{h} \le 0,$$

and in the case $h < 0$ this implies that

$$f'(c) = \lim_{h \to 0-} \frac{f(c + h) - f(c)}{h} \ge 0.$$

It follows that $f'(c) = 0$. ∎
[Figure 4.3]

4.13 Remark. The continuity hypothesis in Rolle's Theorem cannot be relaxed at even one point in $[a, b]$.

PROOF. The function

$$f(x) = \begin{cases} x & x \in [0, 1) \\ 0 & x = 1 \end{cases}$$

is continuous on $[0, 1)$ and differentiable on $(0, 1)$, $f(0) = f(1) = 0$, but $f'(x)$ is never zero. ∎
4.14 Remark. The differentiability hypothesis in Rolle's Theorem cannot be relaxed at even one point in $(a, b)$.

PROOF. The function $f(x) = |x|$ is continuous on $[-1, 1]$ and differentiable on $(-1, 1) \setminus \{0\}$ and $f(-1) = f(1)$, but $f'(x)$ is never zero. ∎

We shall use Rolle's Theorem to obtain several useful results. The first is a pair of "Mean Value Theorems."

4.15 THEOREM. Suppose that $a, b \in \mathbf{R}$ with $a \neq b$.
(i) [GENERALIZED MEAN VALUE THEOREM] If $f, g$ are continuous on $[a, b]$ and differentiable on $(a, b)$, then there is a $c \in (a, b)$ such that

$$f'(c)(g(b) - g(a)) = g'(c)(f(b) - f(a)).$$

(ii) [MEAN VALUE THEOREM] If $f$ is continuous on $[a, b]$ and differentiable on $(a, b)$, then there is a $c \in (a, b)$ such that

$$f(b) - f(a) = f'(c)(b - a).$$

PROOF. (i) Set $h(x) = f(x)(g(b) - g(a)) - g(x)(f(b) - f(a))$. Since $h'(x) = f'(x)(g(b) - g(a)) - g'(x)(f(b) - f(a))$, it is clear that $h$ is continuous on $[a, b]$, differentiable on $(a, b)$, and $h(a) = h(b)$. Thus, by Rolle's Theorem, $h'(c) = 0$ for some $c \in (a, b)$.
(ii) Set $g(x) = x$ and apply part (i). (For a geometric interpretation of this result, see the opening paragraph of this section and Figure 4.3.) ∎

The Generalized Mean Value Theorem is also called Cauchy's Mean Value Theorem. It is crucial when comparing derivatives of two functions simultaneously (e.g., see Theorem 4.18), for studying certain kinds of generalized derivatives (e.g., see Remark 14.33), and for using higher-order derivatives to approximate a given function (e.g., see Taylor's Formula, Theorem 7.44).

The Mean Value Theorem is most often used to extract information about $f$ from $f'$ (see, for example, Theorem 4.17 and Exercises 5 through 9). It is also sometimes useful (as the next example shows) for comparing one function with another.
4.16 Example. Prove that $1 + x < e^x$ for all $x > 0$.

PROOF. Let $f(x) = e^x - x - 1$ and fix $x > 0$. By the Mean Value Theorem,

$$e^x - x - 1 = f(x) - f(0) = x f'(c)$$

for some $c$ between 0 and $x$. But $c > 0$ implies that $f'(c) = e^c - 1 > 0$. Hence $e^x - x - 1 = x f'(c) > 0$; i.e., $e^x > x + 1$. ∎

Here is another application of the Mean Value Theorem.
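4.17 THEOREM [BERNOULLI'S INEQUALITY]. Let $\alpha$ be a positive real number and $\delta \ge -1$. If $0 < \alpha \le 1$, then

$$(1 + \delta)^\alpha \le 1 + \alpha\delta,$$

and if $\alpha \ge 1$, then

$$(1 + \delta)^\alpha \ge 1 + \alpha\delta.$$

PROOF. The proofs of these inequalities are similar. We present the details only for the case $0 < \alpha \le 1$. Let $f(x) = x^\alpha$. By the Mean Value Theorem,

$$f(1 + \delta) = f(1) + \alpha\delta c^{\alpha - 1}$$

for some $c$ between 1 and $1 + \delta$. If $\delta > 0$, then $c > 1$. Since $0 < \alpha \le 1$, it follows that $c^{\alpha - 1} \le 1$ (see Exercise 5, p. 134); hence, $\delta c^{\alpha - 1} \le \delta$. On the other hand, if $-1 \le \delta \le 0$, then $c^{\alpha - 1} \ge 1$ and again $\delta c^{\alpha - 1} \le \delta$. Therefore,

$$(1 + \delta)^\alpha = f(1 + \delta) = f(1) + \alpha\delta c^{\alpha - 1} \le 1 + \alpha\delta. \;∎$$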
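As a numerical sanity check of Bernoulli's Inequality, the following sketch evaluates the gap $(1 + \delta)^\alpha - (1 + \alpha\delta)$ at a few sample points. The helper name `bernoulli_gap`, the sample values, and the rounding tolerance are illustrative assumptions, not part of the text; a finite sample can only illustrate the theorem, not prove it.

```python
def bernoulli_gap(alpha, delta):
    """Return (1 + delta)**alpha - (1 + alpha*delta) for alpha > 0, delta >= -1."""
    if alpha <= 0 or delta < -1:
        raise ValueError("need alpha > 0 and delta >= -1")
    return (1.0 + delta) ** alpha - (1.0 + alpha * delta)


# Sample points chosen for illustration only.
sample_deltas = [-1.0, -0.5, 0.0, 0.5, 2.0, 10.0]
TOL = 1e-12  # allowance for floating-point rounding

# Case 0 < alpha <= 1: the gap should be <= 0.
for alpha in (0.25, 0.5, 1.0):
    assert all(bernoulli_gap(alpha, d) <= TOL for d in sample_deltas)

# Case alpha >= 1: the gap should be >= 0.
for alpha in (1.0, 2.0, 3.5):
    assert all(bernoulli_gap(alpha, d) >= -TOL for d in sample_deltas)

print("Bernoulli's Inequality holds at every sampled point.")
```

The value $\alpha = 1$ appears in both loops because the two inequalities coincide there and the gap is zero.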
Another application of the Mean Value Theorem is the following technique for evaluating limits of certain quotients. (Our statement is general enough to include one-sided limits and limits at infinity.)
4.18 THEOREM [L'HÔPITAL'S RULE]. Let $a$ be an extended real number and $I$ be an open interval that either contains $a$ or has $a$ as an endpoint. Suppose that $f$ and $g$ are differentiable on $I \setminus \{a\}$, and $g(x) \neq 0 \neq g'(x)$ for all $x \in I \setminus \{a\}$. Suppose further that

$$A := \lim_{\substack{x \to a \\ x \in I}} f(x) = \lim_{\substack{x \to a \\ x \in I}} g(x)$$

is either 0 or $\infty$. If

$$B := \lim_{\substack{x \to a \\ x \in I}} \frac{f'(x)}{g'(x)}$$

exists as an extended real number, then

$$\lim_{\substack{x \to a \\ x \in I}} \frac{f(x)}{g(x)} = \lim_{\substack{x \to a \\ x \in I}} \frac{f'(x)}{g'(x)}.$$

PROOF. Let $x_k \in I$ with $x_k \to a$ as $k \to \infty$ such that either $x_k < a$ or $x_k > a$ for all $k \in \mathbf{N}$. By the Sequential Characterization of Limits and by the characterization of two-sided limits in terms of one-sided limits, it suffices to show that $f(x_k)/g(x_k) \to B$ as $k \to \infty$. We suppose for simplicity that $B \in \mathbf{R}$. (For the cases $B = \pm\infty$, see Exercise 10.) Notice once and for all, since $g'$ is never zero on $I$, that by the Mean Value Theorem the differences $g(x) - g(y)$ are never zero for $x, y \in I$, $x \neq y$, provided that either $x, y \ge a$ or $x, y \le a$. Hence, we can divide by these differences at will.
Case 1. $A = 0$ and $a \in \mathbf{R}$. Extend $f$ and $g$ to $I \cup \{a\}$ by $f(a) := 0 =: g(a)$. By hypothesis, $f$ and $g$ are continuous on $I \cup \{a\}$ and differentiable on $I \setminus \{a\}$. Hence by the Generalized Mean Value Theorem, there is a $c := c_k$ between $x_k$ and $y := a$ such that

$$\frac{f(x_k) - f(y)}{g(x_k) - g(y)} = \frac{f'(c)}{g'(c)}. \tag{12}$$

Since $f(y) = g(y) = 0$, it follows that

$$\frac{f(x_k)}{g(x_k)} = \frac{f(x_k) - f(y)}{g(x_k) - g(y)} = \frac{f'(c)}{g'(c)}. \tag{13}$$

Let $k \to \infty$. Since $c$ lies between $x_k$ and $a$, $c$ also converges to $a$ as $k \to \infty$. Hence hypothesis and (13) imply $f(x_k)/g(x_k) \to B$ as $k \to \infty$.

Case 2. $A = \pm\infty$ and $a \in \mathbf{R}$. We suppose by symmetry that $A = +\infty$. For each $k, n \in \mathbf{N}$, apply the Generalized Mean Value Theorem to choose a $c := c_{k,n}$ between $x_k$ and $x_n$ such that (12) holds for $y := x_n$. Thus

$$\frac{f(x_n)}{g(x_n)} - \frac{f(x_k)}{g(x_n)} = \frac{f(x_n) - f(x_k)}{g(x_n)} = \left(1 - \frac{g(x_k)}{g(x_n)}\right) \frac{f'(c_{k,n})}{g'(c_{k,n})};$$

i.e.,

$$\frac{f(x_n)}{g(x_n)} = \frac{f(x_k)}{g(x_n)} - \frac{g(x_k)}{g(x_n)} \cdot \frac{f'(c_{k,n})}{g'(c_{k,n})} + \frac{f'(c_{k,n})}{g'(c_{k,n})}. \tag{14}$$

Since $A = \infty$, it is clear that $1/g(x_n) \to 0$, and since $c_{k,n}$ lies between $x_k$ and $x_n$, it is also clear that $c_{k,n} \to a$, as $k, n \to \infty$. Thus (14) and hypothesis should imply that $f(x_n)/g(x_n) \approx 0 - 0 + B = B$ for large $n$ and $k$.

Specifically, let $0 < \varepsilon < 1$. Since $c_{k,n} \to a$ as $k, n \to \infty$, choose an $N_0$ so large that $n \ge N_0$ implies that $|f'(c_{N_0,n})/g'(c_{N_0,n}) - B| < \varepsilon/3$. Since $g(x_n) \to \infty$, choose an $N > N_0$ such that $|f(x_{N_0})/g(x_n)|$ and the product $|g(x_{N_0})/g(x_n)| \cdot |f'(c_{N_0,n})/g'(c_{N_0,n})|$ are both less than $\varepsilon/3$ for all $n \ge N$. It follows from (14) that for any $n \ge N$,

$$\left|\frac{f(x_n)}{g(x_n)} - B\right| \le \left|\frac{f(x_{N_0})}{g(x_n)}\right| + \left|\frac{g(x_{N_0})}{g(x_n)} \cdot \frac{f'(c_{N_0,n})}{g'(c_{N_0,n})}\right| + \left|\frac{f'(c_{N_0,n})}{g'(c_{N_0,n})} - B\right| < \varepsilon.$$

Hence, $f(x_n)/g(x_n) \to B$ as $n \to \infty$.

Case 3. $a = \pm\infty$. We suppose by symmetry that $a = +\infty$. Choose $c > 0$ such that $I \supset (c, \infty)$. For each $y \in (0, 1/c)$, set $\phi(y) = f(1/y)$ and $\psi(y) = g(1/y)$. Notice that by the Chain Rule,

$$\frac{\phi'(y)}{\psi'(y)} = \frac{f'(1/y)(-1/y^2)}{g'(1/y)(-1/y^2)} = \frac{f'(1/y)}{g'(1/y)}.$$
Thus, for $x = 1/y \in (c, \infty)$, $f'(x)/g'(x) = \phi'(y)/\psi'(y)$. Since $x \to \infty$ if and only if $y = 1/x \to 0+$, it follows that $\phi$ and $\psi$ satisfy the hypotheses of Cases 1 or 2 for $a = 0$ and $I = (0, 1/c)$. In particular,

$$\lim_{x \to \infty} \frac{f'(x)}{g'(x)} = \lim_{y \to 0+} \frac{\phi'(y)}{\psi'(y)} = \lim_{y \to 0+} \frac{\phi(y)}{\psi(y)} = \lim_{x \to \infty} \frac{f(x)}{g(x)}. \;∎$$
l'Hôpital's Rule can be used to compare the relative rates of growth of two functions. For example, the next result shows that as $x \to \infty$, $e^x$ converges to $\infty$ much faster than $x^2$ does.

4.19 Example. Prove that $\lim_{x \to \infty} x^2/e^x = 0$.

PROOF. Since the limits of $x^2/e^x$ and $x/e^x$ are of the form $\infty/\infty$, we apply l'Hôpital's Rule twice to verify that

$$\lim_{x \to \infty} \frac{x^2}{e^x} = \lim_{x \to \infty} \frac{2x}{e^x} = \lim_{x \to \infty} \frac{2}{e^x} = 0. \;∎$$
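The three quotients in Example 4.19 can also be tabulated numerically. The sketch below is only an illustration (the function name `ratios` and the sample points are assumptions made here); it shows all three quotients shrinking rapidly as $x$ grows, which is consistent with, but of course does not prove, the limit computed above.

```python
import math


def ratios(x):
    """Return the quotients (x^2/e^x, 2x/e^x, 2/e^x) from Example 4.19."""
    ex = math.exp(x)
    return x * x / ex, 2.0 * x / ex, 2.0 / ex


print(f"{'x':>6} {'x^2/e^x':>12} {'2x/e^x':>12} {'2/e^x':>12}")
for x in (1.0, 5.0, 10.0, 20.0, 40.0):
    q0, q1, q2 = ratios(x)
    print(f"{x:6.1f} {q0:12.3e} {q1:12.3e} {q2:12.3e}")
```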
For each subsequent application of l'Hôpital's Rule, it is important to check that the hypotheses still hold. For example,

$$\lim_{x \to 0} \frac{x^2}{x^2 + \sin x} = \lim_{x \to 0} \frac{2x}{2x + \cos x} = 0 \neq 1 = \lim_{x \to 0} \frac{2}{2 - \sin x}.$$

Notice that the middle limit is not of the form $0/0$.

l'Hôpital's Rule can be used to evaluate limits of the form $0 \cdot \infty$.

4.20 Example. Find $\lim_{x \to 0+} x \log x$.

SOLUTION. By writing $x$ as $1/(1/x)$, we see that the limit in question is of the form $\infty/\infty$. Hence, by l'Hôpital's Rule,
$$\lim_{x \to 0+} x \log x = \lim_{x \to 0+} \frac{\log x}{1/x} = \lim_{x \to 0+} \frac{1/x}{-1/x^2} = 0. \;∎$$

The next two examples show that l'Hôpital's Rule can also be used to evaluate limits of the form $1^\infty$ and $0^0$.

4.21 Example. Prove that the sequence $(1 + 1/n)^n$ is increasing, as $n \to \infty$, and its limit $e$ satisfies $2 < e \le 3$ and $\log e = 1$.

PROOF. The sequence $(1 + 1/n)^n$ is increasing, since by Bernoulli's Inequality,

$$\left(1 + \frac{1}{n}\right)^{n/(n+1)} \le \left(1 + \frac{1}{n + 1}\right).$$

To prove that this sequence is bounded above, observe by the Binomial Formula that

$$\left(1 + \frac{1}{n}\right)^n = \sum_{k=0}^{n} \binom{n}{k} \left(\frac{1}{n}\right)^k.$$

Now,

$$\binom{n}{k} \left(\frac{1}{n}\right)^k = \frac{n(n - 1) \cdots (n - k + 1)}{n^k} \cdot \frac{1}{k!} \le \frac{1}{k!} \le \frac{1}{2^{k-1}}$$

for all $k \in \mathbf{N}$. It follows from Exercise 1c, p. 17, that

$$2 = \left(1 + \frac{1}{1}\right)^1 < \left(1 + \frac{1}{n}\right)^n < 1 + 1 + \sum_{k=1}^{n-1} \frac{1}{2^k} = 3 - \frac{1}{2^{n-1}} < 3$$

for $n > 1$. Hence, by the Monotone Convergence Theorem, the limit defining $e$ exists, and satisfies $2 < e \le 3$.

Finally, to verify $\log e = 1$, use l'Hôpital's Rule:

$$\log e = \lim_{n \to \infty} \frac{\log(1 + 1/n)}{1/n} = \lim_{n \to \infty} \frac{(n/(n + 1))(-1/n^2)}{-1/n^2} = 1. \;∎$$
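The monotonicity and the bounds in Example 4.21 can be checked exactly for small $n$ with rational arithmetic. This is a minimal sketch under stated assumptions: the helper name `a`, the cutoff $n \le 30$, and the choice $n = 10^6$ for the logarithm check are illustrative and not from the text.

```python
from fractions import Fraction
import math


def a(n):
    """Return (1 + 1/n)**n as an exact rational number."""
    return (1 + Fraction(1, n)) ** n


terms = [a(n) for n in range(1, 31)]

# The sequence is strictly increasing over the sampled range.
assert all(s < t for s, t in zip(terms, terms[1:]))

# For n > 1 the terms satisfy 2 < (1 + 1/n)^n < 3 - 1/2^(n-1).
for n in range(2, 31):
    assert 2 < a(n) < 3 - Fraction(1, 2 ** (n - 1))

# n*log(1 + 1/n) should be close to 1 for large n (floating point).
n = 10 ** 6
print(float(a(30)), n * math.log1p(1.0 / n))
```

Exact fractions avoid any doubt about rounding in the strict inequalities; only the final logarithm check uses floating point.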
4.22 Example. Find $L = \lim_{x \to 1} (\log x)^{1 - x}$.

SOLUTION. If the limit $L$ exists, then $\log L = \lim_{x \to 1} (1 - x) \log\log x$ is of the form $0 \cdot \infty$. Hence, by l'Hôpital's Rule

$$\log L = \lim_{x \to 1} \frac{\log\log x}{1/(1 - x)} = \lim_{x \to 1} \frac{1/(x \log x)}{1/(1 - x)^2} = \lim_{x \to 1} \frac{-2(1 - x)}{1 + \log x} = 0.$$
Therefore, $L = e^0 = 1$. ∎

EXERCISES

1. Evaluate the following limits.
(a) $\displaystyle\lim_{x \to 0} \frac{\sin(3x)}{x}$.
(b) $\displaystyle\lim_{x \to 0+} \frac{\cos x - e^x}{\log(1 + x^2)}$.
(c) $\displaystyle\lim_{x \to 0} \left(\frac{x}{\sin x}\right)^{1/x^2}$.
(d) $\displaystyle\lim_{x \to 0+} x^x$.
(e) $\displaystyle\lim_{x \to 1} \frac{\log x}{\sin(\pi x)}$.
(f) $\displaystyle\lim_{x \to \infty} x\left(\arctan x - \frac{\pi}{2}\right)$.
(For the derivative of $\arctan x$, see Exercise 4, p. 106.)

2. Prove each of the following inequalities.
(a) $\sqrt{1 + 2x} < 1 + x$ for all $x > 0$.
(b) $\log x \le x - 1$ for all $x \ge 1$.
(c) $7(x - 1) < e^x$ for all $x \ge 2$.
(d) $\sin^2 x \le 2|x|$ for all $x \in \mathbf{R}$.
3. This exercise is used in Sections 7.4 and 12.5. Assume that $e^x$ is differentiable on $\mathbf{R}$ with $(e^x)' = e^x$.
(a) Show that the derivative of

$$f(x) = \begin{cases} e^{-1/x^2} & x \neq 0 \\ 0 & x = 0 \end{cases}$$

exists and is continuous on $\mathbf{R}$ with $f'(0) = 0$.
(b) Do analogous statements hold for $f^{(n)}(x)$ when $n = 2, 3, \ldots$?

4. This exercise is used in Sections 5.4, 6.3, and elsewhere.
(a) Using $(e^x)' = e^x$, $(\log x)' = 1/x$, and $x^\alpha = e^{\alpha \log x}$, show that $(x^\alpha)' = \alpha x^{\alpha - 1}$ for all $x > 0$.
(b) Let $\alpha > 0$. Prove that $\log x \le x^\alpha$ for $x$ large. Prove that there exists a constant $C_\alpha$ such that $\log x \le C_\alpha x^\alpha$ for all $x \in [1, \infty)$, $C_\alpha \to \infty$ as $\alpha \to 0+$, and $C_\alpha \to 0$ as $\alpha \to \infty$.
(c) Obtain an analogue of (b) valid for $e^x$ and $x^\alpha$ in place of $\log x$ and $x^\alpha$.

5. Suppose that $f$ is differentiable on $\mathbf{R}$.
(a) If $f'(x) = 0$ for all $x \in \mathbf{R}$, prove that $f(x) = f(0)$ for all $x \in \mathbf{R}$.
(b) If $f(0) = 1$ and $|f'(x)| \le 1$ for all $x \in \mathbf{R}$, prove that $|f(x)| \le |x| + 1$ for all $x \in \mathbf{R}$.
(c) If $f'(x) \ge 0$ for all $x \in \mathbf{R}$, prove that $a < b$ implies that $f(a) \le f(b)$.

6. Let $f$ be differentiable on a nonempty, open interval $(a, b)$ with $f'$ bounded on $(a, b)$. Prove that $f$ is uniformly continuous on $(a, b)$.

7. Let $f$ be differentiable on $(a, b)$, continuous on $[a, b]$, with $f(a) = f(b) = 0$. Prove that if $f(c) > 0$ for some $c \in (a, b)$, then there exist $x_1, x_2 \in (a, b)$ such that $f'(x_1) > 0 > f'(x_2)$.

8. Let $f$ be twice differentiable on $(a, b)$ and let there be points $x_1 < x_2 < x_3$ in $(a, b)$ such that $f(x_1) > f(x_2)$ and $f(x_3) > f(x_2)$. Prove that there is a point $c \in (a, b)$ such that $f''(c) > 0$.

9. Let $f$ be differentiable on $(0, \infty)$. If $L = \lim_{x \to \infty} f'(x)$ and $\lim_{n \to \infty} f(n)$ both exist and are finite, prove that $L = 0$.

10. Prove l'Hôpital's Rule for the cases $|B| = \infty$ by first proving that $g(x)/f(x) \to 0$ when $f(x)/g(x) \to \pm\infty$, as $x \to a$.

11. Suppose that $f : [a, b] \to \mathbf{R}$ is continuous and increasing (see Definition 4.23). Prove that $\sup f(E) = f(\sup E)$ for every nonempty set $E \subseteq [a, b]$.

12. Let $(a, b)$ be an open interval, $f : (a, b) \to \mathbf{R}$, and $x_0 \in (a, b)$. The function $f$ is said to have a proper local maximum at $x_0$ if there is a $\delta > 0$ such that $f(x_0) > f(x)$ for all $0 < |x - x_0| < \delta$.
(a) If $f$ is differentiable on $(a, b)$ and has a proper local maximum at $x_0 \in (a, b)$, prove that $f'(x_0) = 0$ and that given $\delta > 0$, there exist $x_1 < x_0 < x_2$ such that $f'(x_1) > 0$, $f'(x_2) < 0$, and $|x_j - x_0| < \delta$ for $j = 1, 2$.
(b) Make and prove an analogous statement for a proper local minimum.
4.4 MONOTONE FUNCTIONS AND THE INVERSE FUNCTION THEOREM

Monotone functions (i.e., those that either increase or decrease on their domain) are important from both a theoretical and a practical point of view (e.g., see Theorem 5.34). In this section we study monotone functions and the role they play in the Inverse Function Theorem.

4.23 DEFINITION. Let $E$ be a nonempty subset of $\mathbf{R}$ and $f : E \to \mathbf{R}$.
(i) $f$ is said to be increasing (respectively, strictly increasing) on $E$ if and only if $x_1, x_2 \in E$ and $x_1 < x_2$ imply $f(x_1) \le f(x_2)$ (respectively, $f(x_1) < f(x_2)$).
(ii) $f$ is said to be decreasing (respectively, strictly decreasing) on $E$ if and only if $x_1, x_2 \in E$ and $x_1 < x_2$ imply $f(x_1) \ge f(x_2)$ (respectively, $f(x_1) > f(x_2)$).
(iii) $f$ is said to be monotone (respectively, strictly monotone) on $E$ if and only if $f$ is either decreasing or increasing (respectively, either strictly decreasing or strictly increasing) on $E$.

Thus, although $f(x) = x^2$ is strictly monotone on $[0, 1]$ and on $[-1, 0]$, it is not monotone on $[-1, 1]$.

The derivative gives a simple method for finding intervals on which a differentiable function is monotone.

4.24 THEOREM. Suppose that $a, b \in \mathbf{R}$, with $a \neq b$, that $f$ is continuous on $[a, b]$, and that $f$ is differentiable on $(a, b)$.
(i) If $f'(x) > 0$ (respectively, $f'(x) < 0$) for all $x \in (a, b)$, then $f$ is strictly increasing (respectively, strictly decreasing) on $[a, b]$.
(ii) If $f'(x) = 0$ for all $x \in (a, b)$, then $f$ is constant on $[a, b]$.

PROOF. Let $a \le x_1 < x_2 \le b$. By the Mean Value Theorem, there is a $c \in (a, b)$ such that $f(x_2) - f(x_1) = f'(c)(x_2 - x_1)$. Thus, $f(x_2) > f(x_1)$ when $f'(c) > 0$ and $f(x_2) < f(x_1)$ when $f'(c) < 0$. This proves part (i).

To prove part (ii), let $a \le x \le b$. By the Mean Value Theorem and hypothesis there is a $c \in (a, b)$ such that

$$f(x) - f(a) = f'(c)(x - a) = 0.$$

Thus, $f(x) = f(a)$ for all $x \in [a, b]$. ∎

4.25 Remark. If $f$ and $g$ are continuous on a nondegenerate interval $[a, b]$, are differentiable on $(a, b)$, and if $f'(x) = g'(x)$ for all $x \in (a, b)$, then $f - g$ is constant on $[a, b]$.

PROOF. Apply Theorem 4.24ii to the function $f - g$. ∎

Let $f$ be a real function. Recall (see Figure 1.2) that if $f$ has an inverse function $f^{-1}$, then the graph of $y = f^{-1}(x)$ is a reflection of the graph of $y = f(x)$ about the line $y = x$. Thus, it is not difficult to imagine that $f^{-1}$ is as smooth as $f$. This is the subject of the next two theorems.
4.26 THEOREM. If $f$ is 1-1 and continuous on an interval $I$, then $f$ is strictly monotone on $I$ and $f^{-1}$ is continuous and strictly monotone on $f(I)$.

PROOF. We may suppose that $I$ contains at least two points. Notice that since $f$ is 1-1, $a, b \in I$ and $a < b$ imply $f(a) < f(b)$ or $f(a) > f(b)$. Thus if $f$ is not strictly monotone on $I$, then there exist points $a, b, c \in I$ such that $a < c < b$ but $f(c)$ does not lie between $f(a)$ and $f(b)$. It follows that either $f(a)$ lies between $f(b)$ and $f(c)$ or $f(b)$ lies between $f(a)$ and $f(c)$. Hence by the Intermediate Value Theorem, there is an $x_1 \in (a, b)$ such that $f(x_1) = f(a)$ or $f(x_1) = f(b)$. Since $f$ is 1-1, we conclude that either $x_1 = a$ or $x_1 = b$, a contradiction. Therefore, $f$ is strictly monotone on $I$.

We may suppose that $f$ is strictly increasing on $I$. Since $f$ is 1-1 on $I$, apply Theorem 1.31 to verify that $f^{-1}$ takes $f(I)$ onto $I$. To show that $f^{-1}$ is strictly increasing on $f(I)$, suppose to the contrary that there exist $y_1, y_2 \in f(I)$ such that $y_1 < y_2$ but $f^{-1}(y_1) \ge f^{-1}(y_2)$. Then $x_1 := f^{-1}(y_1)$ and $x_2 := f^{-1}(y_2)$ satisfy $x_1 \ge x_2$ and $x_1, x_2 \in I$. Since $f$ is strictly increasing on $I$, it follows that $y_1 = f(x_1) \ge f(x_2) = y_2$, a contradiction. Thus, $f^{-1}$ is strictly increasing on $f(I)$.

Let $c, d \in I$. Since $f$ is strictly increasing on $I$, it is easy to see by the Intermediate Value Theorem that $f([c, d]) = [f(c), f(d)]$. It follows that $f(I)$ is an interval. In particular, it remains to prove that for each $y_0 \in f(I)$, $f^{-1}(y) \to f^{-1}(y_0)$ as $y \to y_0$ through $f(I)$.

Fix $y_0 \in f(I)$ and let $\varepsilon > 0$. Since $f^{-1}$ is strictly increasing on $f(I)$, if $y_0$ is not a right endpoint of $f(I)$, then $x_0 := f^{-1}(y_0)$ is not a right endpoint of $I$. Thus there is an $\varepsilon_0 > 0$ so small that $\varepsilon_0 < \varepsilon$ and $x_0 + \varepsilon_0 \in I$. Set $\delta = f(x_0 + \varepsilon_0) - f(x_0)$ and suppose that $0 < y - y_0 < \delta$. The choice of $\delta$ implies that $y_0 < y < y_0 + \delta = f(x_0 + \varepsilon_0)$. If $x := f^{-1}(y)$, then this last inequality can be written as $f(x_0) < f(x) < f(x_0 + \varepsilon_0)$. Since $f^{-1}$ is strictly increasing, it follows that $x_0 < x < x_0 + \varepsilon_0$; i.e., $0 < x - x_0 < \varepsilon_0 < \varepsilon$. Finally, since $x = f^{-1}(y)$ and $x_0 = f^{-1}(y_0)$, we conclude that $0 < f^{-1}(y) - f^{-1}(y_0) < \varepsilon$ for all $0 < y - y_0 < \delta$; i.e., $f^{-1}(y_0+)$ exists and equals $f^{-1}(y_0)$. A similar argument shows that if $y_0$ is not a left endpoint of $f(I)$, then $f^{-1}(y_0-) = f^{-1}(y_0)$. Thus $f^{-1}$ is continuous on $f(I)$. ∎

4.27 THEOREM [INVERSE FUNCTION THEOREM]. Let $f$ be 1-1 and continuous on an open interval $I$. If $a \in f(I)$ and if $f'(f^{-1}(a))$ exists and is nonzero, then $f^{-1}$ is differentiable at $a$ and

$$(f^{-1})'(a) = \frac{1}{f'(f^{-1}(a))}.$$

PROOF. By Theorem 4.26, $f$ is strictly monotone, say strictly increasing on $I$, and $f^{-1}$ exists, is continuous, and strictly increasing on $f(I)$. Moreover, since $x_0 := f^{-1}(a) \in I$ and $I$ is open, we can choose $c, d \in \mathbf{R}$ such that $x_0 \in (c, d) \subset I$. Since by the Intermediate Value Theorem $f((c, d)) = (f(c), f(d))$, we can choose $h \neq 0$ so small that $a + h \in f(I)$; i.e., $f^{-1}(a + h)$ is defined. Set $x = f^{-1}(a + h)$ and observe that $f(x) - f(x_0) = a + h - a = h$. Since $f^{-1}$ is continuous, $x \to x_0$ if and only if $h \to 0$. Therefore, by direct substitution, we
conclude that

$$\lim_{h \to 0} \frac{f^{-1}(a + h) - f^{-1}(a)}{h} = \lim_{x \to x_0} \frac{x - x_0}{f(x) - f(x_0)} = \frac{1}{f'(x_0)}. \;∎$$

This theorem is usually presented in elementary calculus texts in a form more easily remembered: If $y = f(x)$ and $x = f^{-1}(y)$, then

$$\frac{dx}{dy} = \frac{1}{dy/dx}.$$

Notice that by using this formula, we do not need to solve explicitly for $f^{-1}$ to be able to compute $(f^{-1})'$ (see Exercises 2, 3, and 7).
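To illustrate the last remark numerically, the sketch below uses the function $f(x) = x + e^x$, which is an example chosen here (it does not appear in the text) because it is strictly increasing on $\mathbf{R}$ and has no closed-form inverse in elementary functions. The helper names, the bisection bracket $[-50, 50]$, and the step $h$ are assumptions made for illustration. The code evaluates $1/f'(f^{-1}(a))$ and compares it with a difference quotient of a numerically computed $f^{-1}$.

```python
import math


def f(x):
    """Illustrative function: strictly increasing, hence 1-1, on R."""
    return x + math.exp(x)


def f_prime(x):
    return 1.0 + math.exp(x)


def f_inverse(y, lo=-50.0, hi=50.0, tol=1e-13):
    """Approximate f^{-1}(y) by bisection; assumes f(lo) < y < f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def inverse_derivative(a):
    """Value of (f^{-1})'(a) given by the Inverse Function Theorem."""
    return 1.0 / f_prime(f_inverse(a))


a = 1.0          # f(0) = 1, so f^{-1}(1) = 0 and (f^{-1})'(1) = 1/f'(0) = 1/2
h = 1e-6
quotient = (f_inverse(a + h) - f_inverse(a)) / h
print(inverse_derivative(a), quotient)
```

The formula needs only one evaluation of $f^{-1}$ at the point $a$, whereas the difference quotient needs $f^{-1}$ at nearby points as well and is limited by the accuracy of the bisection.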
We close this section with several optional results that delve a little deeper into differentiability of real functions. Recall (see Examples 3.31 and 3.32) that there exist functions that have neither right nor left limits at a given point. The following result shows that monotone functions never behave this badly.
*4.28 Lemma. Suppose that $f$ is increasing on $[a, b]$.
(i) If $x_0 \in [a, b)$, then $f(x_0+)$ exists and $f(x_0) \le f(x_0+)$.
(ii) If $x_0 \in (a, b]$, then $f(x_0-)$ exists and $f(x_0-) \le f(x_0)$.

PROOF. Fix $x_0 \in (a, b]$. By symmetry it suffices to show that $f(x_0-)$ exists and satisfies $f(x_0-) \le f(x_0)$. Set $E = \{f(x) : a < x < x_0\}$ and $s = \sup E$. Since $f$ is increasing, $f(x_0)$ is an upper bound of $E$. Hence, $s$ is a finite real number that satisfies $s \le f(x_0)$. Given $\varepsilon > 0$, choose by the Approximation Property an $x_1 \in (a, x_0)$ such that $s - \varepsilon < f(x_1) \le s$. Since $f$ is increasing,

$$s - \varepsilon < f(x_1) \le f(x) \le s$$

for all $x_1 < x < x_0$. Therefore, $f(x_0-)$ exists and satisfies $f(x_0-) = s \le f(x_0)$. ∎
We have seen (Example 3.32) that a function can be nowhere continuous, i.e., can have uncountably many points of discontinuity. How many points of discontinuity can a monotone function have?
*4.29 THEOREM. If $f$ is monotone on an interval $I$, then $f$ has at most countably many points of discontinuity on $I$.

PROOF. Without loss of generality, we may suppose that $f$ is increasing. Since the countable union of at most countable sets is countable (Theorem 1.38ii), it suffices to show that the set of points of discontinuity of $f$ can be written as a countable union of at most countable sets. Since $\mathbf{R}$ is the union of closed intervals $[-n, n]$, $n \in \mathbf{N}$, we may suppose that $I$ is a closed, bounded interval $[a, b]$.
Let $E$ represent the set of points of discontinuity of $f$ on $(a, b)$. By Lemma 4.28, $f(x-) \le f(x) \le f(x+)$ for all $x \in (a, b)$. Thus, $f$ is discontinuous at such an $x$ if and only if $f(x+) - f(x-) > 0$. It follows that

$$E = \bigcup_{j=1}^{\infty} A_j,$$

where for each $j \in \mathbf{N}$, $A_j := \{x \in (a, b) : f(x+) - f(x-) \ge 1/j\}$. We will complete the proof by showing that each $A_j$ is finite.

Suppose to the contrary that $A_{j_0}$ is infinite for some $j_0$. Set $y_0 := j_0(f(b) - f(a))$ and observe that since $f$ is finite-valued on $I$, $y_0$ is a finite real number. On the other hand, since $A_{j_0}$ is infinite, there exist $x_1 < x_2 < \cdots$ in $[a, b]$ such that $f(x_k+) - f(x_k-) \ge 1/j_0$ for $k \in \mathbf{N}$. Since $f$ is monotone, it follows that

$$f(b) - f(a) \ge \sum_{k=1}^{n} (f(x_k+) - f(x_k-)) \ge \frac{n}{j_0};$$

i.e., $y_0 = j_0(f(b) - f(a)) \ge n$ for all $n \in \mathbf{N}$. Taking the limit of this last inequality as $n \to \infty$, we see that $y_0 = +\infty$. With this contradiction, the proof of the theorem is complete. ∎

Although a differentiable function might not be continuously differentiable, the following result shows that its derivative does satisfy an intermediate value theorem. (This result is sometimes called Darboux's Theorem.)

*4.30 THEOREM [INTERMEDIATE VALUE THEOREM FOR DERIVATIVES]. Suppose that $f$ is differentiable on $[a, b]$ with $f'(a) \neq f'(b)$. If $y_0$ is a real number that lies between $f'(a)$ and $f'(b)$, then there is an $x_0 \in (a, b)$ such that $f'(x_0) = y_0$.

STRATEGY: Let $F(x) := f(x) - y_0 x$. We must find an $x_0 \in (a, b)$ such that $F'(x_0) := f'(x_0) - y_0 = 0$. Since local extrema of a differentiable function $F$ occur where the derivative of $F$ is zero (e.g., see the proof of Rolle's Theorem), it suffices to show that $F$ has a local extremum at some $x_0 \in (a, b)$.

PROOF. Suppose that $y_0$ lies between $f'(a)$ and $f'(b)$. By symmetry, we may suppose that $f'(a) < y_0 < f'(b)$. Set $F(x) = f(x) - y_0 x$ for $x \in [a, b]$, and observe that $F$ is differentiable on $[a, b]$. Hence, by the Extreme Value Theorem, $F$ has an absolute minimum, say $F(x_0)$, on $[a, b]$. Now $F'(a) = f'(a) - y_0 < 0$, so $F(a + h) - F(a) < 0$ for $h > 0$ sufficiently small. Hence $F(a)$ is NOT the absolute minimum of $F$ on $[a, b]$. Similarly, $F(b)$ is not the absolute minimum of $F$ on $[a, b]$. Hence, the absolute minimum $F(x_0)$ must occur on $(a, b)$; i.e., $x_0 \in (a, b)$ and $F'(x_0) = 0$. ∎
EXERCISES

1. (a) Find all $a \in \mathbf{R}$ such that $x^3 + ax^2 + 3x + 15$ is strictly increasing near $x = 1$.
(b) Find all $a \in \mathbf{R}$ such that $ax^2 + 3x + 5$ is strictly increasing on the interval $(1, 2)$.
(c) Find where $f(x) = 2|x - 1| + 5\sqrt{x^2 + 9}$ is strictly increasing and where $f(x)$ is strictly decreasing.

2. Let $f$ and $g$ be 1-1 and continuous on $\mathbf{R}$. If $f(0) = 2$, $g(1) = 2$, $f'(0) = \pi$, and $g'(1) = e$, compute the following derivatives.
(a) $(f^{-1})'(2)$.
(b) $(g^{-1})'(2)$.
(c) $(f^{-1} \cdot g^{-1})'(2)$.

3. Let $f(x) = x^2 e^{x^2}$, $x \in \mathbf{R}$.
(a) Show that $f^{-1}$ exists and is differentiable on $(0, \infty)$.
(b) Compute $(f^{-1})'(e)$.

4. Using the Inverse Function Theorem, prove that $(\arcsin x)' = 1/\sqrt{1 - x^2}$ for $x \in (-1, 1)$ and $(\arctan x)' = 1/(1 + x^2)$ for $x \in (-\infty, \infty)$.

5. Suppose that $f'$ exists and is continuous on a nonempty, open interval $(a, b)$ with $f'(x) \neq 0$ for all $x \in (a, b)$.
(a) Prove that $f$ is 1-1 on $(a, b)$ and takes $(a, b)$ onto some open interval $(c, d)$.
(b) Show that $(f^{-1})'$ exists and is continuous on $(c, d)$.
(c) Using the function $f(x) = x^3$, show that (b) is false if the assumption $f'(x) \neq 0$ fails to hold for some $x \in (a, b)$.
(d) Sketch the graphs of $y = \tan x$ and $y = \arctan x$ to see that $c$ and $d$ in part (b) might be infinite.

6. Let $[a, b]$ be a closed, bounded, nondegenerate interval. Find all functions $f$ that satisfy the following conditions for some fixed $\alpha > 0$: $f$ is continuous and 1-1 on $[a, b]$, $f'(x) \neq 0$ and $f'(x) = \alpha (f^{-1})'(f(x))$ for all $x \in (a, b)$.

7. Suppose that $f$ is continuous on a closed, bounded interval $[a, b]$.
(a) If $f$ is differentiable on $(a, b)$ and $f'(x) \ge \varepsilon_0 > 0$ for all $x \in (a, b)$, prove that $(f^{-1})'$ exists and is bounded on $(f(a), f(b))$.
(b) If $f$ is continuously differentiable on $(a, b)$ and $f'(x_0) \neq 0$ for some $x_0 \in (a, b)$, prove that there are intervals $I$ and $J$ such that $f$ is 1-1 from $I$ onto $J$ and $f^{-1}$ is continuously differentiable on $J$.

*8. Let $I$ be an interval and $n \in \mathbf{N}$. Show that if $f_j : I \to \mathbf{R}$ are monotone functions and $f = \sum_{j=1}^{n} \alpha_j f_j$ for some $\alpha_j \in \mathbf{R}$, then $f$ has at most countably many points of discontinuity on $I$.

*9. Let $f$ be differentiable at every point in a closed, bounded interval $[a, b]$. Prove that if $f'$ is 1-1 on $[a, b]$, then $f'$ is strictly monotone on $[a, b]$.

*10. Let $f$ be differentiable at every point in a closed, bounded interval $[a, b]$. Prove that if $f'$ is increasing on $(a, b)$, then $f'$ is continuous on $(a, b)$.
Chapter 5
Integrability on R 5.1 RIEMANN INTEGRAL
In this chapter we shall study integration of real functions. We begin our discussion by introducing the following terminology.
5.1 DEFINITION. Let $a, b \in \mathbf{R}$ with $a < b$.
(i) A partition of the interval $[a, b]$ is a set of points $P = \{x_0, x_1, \ldots, x_n\}$ such that $a = x_0 < x_1 < \cdots < x_n = b$.
(ii) The norm of a partition $P = \{x_0, x_1, \ldots, x_n\}$ is the number

$$\|P\| = \max_{1 \le j \le n} |x_j - x_{j-1}|.$$

(iii) A refinement of a partition $P = \{x_0, x_1, \ldots, x_n\}$ is a partition $Q$ of $[a, b]$ that satisfies $Q \supseteq P$. In this case we say that $Q$ is finer than $P$.

5.2 Example [DYADIC PARTITION]. Prove that for each $n \in \mathbf{N}$, $P_n = \{j/2^n : j = 0, 1, \ldots, 2^n\}$ is a partition of the interval $[0, 1]$, and $P_m$ is finer than $P_n$ when $m > n$.

PROOF. Fix $n \in \mathbf{N}$. If $x_j = j/2^n$, then $0 = x_0 < x_1 < \cdots < x_{2^n} = 1$. Thus $P_n$ is a partition of $[0, 1]$. Let $m > n$ and set $p = m - n$. If $0 \le j \le 2^n$, then $j/2^n = j2^p/2^m$ and $0 \le j2^p \le 2^m$. Thus $P_m$ is finer than $P_n$. ∎

It is clear that by definition, if $P$ and $Q$ are partitions of $[a, b]$, then $P \cup Q$ is finer than both $P$ and $Q$. (Note that "finer" does not rule out the possibility that $P \cup Q = Q$, which would be the case if $Q$ were a refinement of $P$.) And if $Q$ is a refinement of $P$, then $\|Q\| \le \|P\|$. We shall use these observations often.
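The notions of partition, norm, and refinement translate directly into a few lines of code. The following is a minimal sketch in which a partition is represented as a sorted list of exact fractions; the function names `dyadic_partition`, `norm`, `is_refinement`, and `common_refinement` are illustrative choices, not notation from the text.

```python
from fractions import Fraction


def dyadic_partition(n):
    """Return P_n = {j/2^n : j = 0, 1, ..., 2^n} as a sorted list."""
    return [Fraction(j, 2 ** n) for j in range(2 ** n + 1)]


def norm(P):
    """Return ||P||, the largest gap between consecutive points of P."""
    return max(b - a for a, b in zip(P, P[1:]))


def is_refinement(Q, P):
    """Return True when Q is finer than P, i.e. Q contains every point of P."""
    return set(P) <= set(Q)


def common_refinement(P, Q):
    """Return the partition P union Q as a sorted list."""
    return sorted(set(P) | set(Q))


P2, P4 = dyadic_partition(2), dyadic_partition(4)
print(norm(P2), norm(P4))          # norms 1/4 and 1/16
print(is_refinement(P4, P2))       # P_4 is finer than P_2
print(common_refinement(P2, P4) == P4)
```

The last line illustrates the parenthetical remark above: since $P_4$ refines $P_2$, the union $P_2 \cup P_4$ is just $P_4$.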
[Figure 5.1]

Let $f$ be nonnegative on an interval $[a, b]$. You may recall that the integral of $f$ over $[a, b]$ (when this integral exists) is the area of the region bounded by the curves $y = f(x)$, $y = 0$, $x = a$, and $x = b$. This area, $A$, can be approximated by rectangles whose bases lie in $[a, b]$ and whose heights approximate $f$ (see Figure 5.1). If the tops of these rectangles lie above the curve $y = f(x)$, the resulting approximation is larger than $A$. If the tops of these rectangles lie below the curve $y = f(x)$, the resulting approximation is smaller than $A$. Hence, we make the following definition.

5.3 DEFINITION. Let $a, b \in \mathbf{R}$ with $a < b$, let $P = \{x_0, x_1, \ldots, x_n\}$ be a partition of the interval $[a, b]$, and suppose that $f : [a, b] \to \mathbf{R}$ is bounded.
(i) The upper Riemann sum of $f$ over $P$ is the number

$$U(f, P) := \sum_{j=1}^{n} M_j(f)(x_j - x_{j-1}),$$

where

$$M_j(f) := \sup_{x \in [x_{j-1}, x_j]} f(x).$$

(ii) The lower Riemann sum of $f$ over $P$ is the number

$$L(f, P) := \sum_{j=1}^{n} m_j(f)(x_j - x_{j-1}),$$

where

$$m_j(f) := \inf_{x \in [x_{j-1}, x_j]} f(x).$$

[Note: We assumed that $f$ is bounded so that the numbers $M_j(f)$ and $m_j(f)$ would exist and be finite.]

Some specific upper and lower Riemann sums can be evaluated with the help of the following elementary observation.
5.4 Remark. If $g : \mathbf{N} \to \mathbf{R}$, then

$$\sum_{k=m}^{n} (g(k + 1) - g(k)) = g(n + 1) - g(m)$$

for all $n \ge m$ in $\mathbf{N}$.

PROOF. The proof is by induction on $n$. The formula holds for $n = m$. If it holds for some $n - 1 \ge m$, then

$$\sum_{k=m}^{n} (g(k + 1) - g(k)) = (g(n) - g(m)) + (g(n + 1) - g(n)) = g(n + 1) - g(m). \;∎$$

We shall refer to this algebraic identity by saying that the sum telescopes to $g(n + 1) - g(m)$.

Before we define what it means for a function to be integrable, we make several elementary observations concerning upper and lower sums.

5.5 Remark. If $f(x) = \alpha$ is constant on $[a, b]$, then

$$U(f, P) = L(f, P) = \alpha(b - a)$$

for all partitions $P$ of $[a, b]$.

PROOF. Since $M_j(f) = m_j(f) = \alpha$ for all $j$, the sums $U(f, P)$ and $L(f, P)$ telescope to $\alpha(b - a)$. ∎
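The telescoping identity, and the way it is used for a constant function, can be checked with exact arithmetic. The sketch below is an illustration only: the helper name `telescoped`, the sample function $g(k) = 1/k$, the sample partition of $[0, 2]$, and the constant value $5/3$ are all assumptions made here.

```python
from fractions import Fraction


def telescoped(g, m, n):
    """Return the sum of g(k + 1) - g(k) for k = m, ..., n."""
    return sum(g(k + 1) - g(k) for k in range(m, n + 1))


# The telescoping identity with the sample function g(k) = 1/k.
g = lambda k: Fraction(1, k)
assert telescoped(g, 3, 10) == g(11) - g(3)

# A constant function with value alpha on [a, b] = [0, 2]:
# each term of the Riemann sum is alpha * (x_j - x_{j-1}), so the sum
# telescopes with g(j) = alpha * x_j.
alpha = Fraction(5, 3)
P = [Fraction(0), Fraction(1, 3), Fraction(1), Fraction(3, 2), Fraction(2)]
riemann_sum = telescoped(lambda j: alpha * P[j], 0, len(P) - 2)
assert riemann_sum == alpha * (P[-1] - P[0])
print(riemann_sum)
```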
5.6 Remark. $L(f, P) \le U(f, P)$ for all partitions $P$ and all bounded functions $f$.

PROOF. By definition, $m_j(f) \le M_j(f)$ for all $j$. ∎

The next result shows that as the partitions get finer, the upper and lower Riemann sums get nearer each other.

5.7 Remark.
If $P$ is any partition of $[a, b]$ and $Q$ is a refinement of $P$, then

$$L(f, P) \le L(f, Q) \le U(f, Q) \le U(f, P).$$

PROOF. Let $P = \{x_0, x_1, \ldots, x_n\}$ be a partition of $[a, b]$. Since $Q$ is finer than $P$, $Q$ can be obtained from $P$ in a finite number of steps by adding one point at a time. Hence it suffices to prove the inequalities above for the special case $Q = \{c\} \cup P$ for some $c \in (a, b)$. Moreover, by symmetry and Remark 5.6, we need only show that $U(f, Q) \le U(f, P)$.

We may suppose that $c \notin P$. Hence, choose an index $j_0$ such that $x_{j_0 - 1} < c < x_{j_0}$. By definition, it is clear that

$$U(f, Q) - U(f, P) = M^{(\ell)}(c - x_{j_0 - 1}) + M^{(r)}(x_{j_0} - c) - M(x_{j_0} - x_{j_0 - 1}),$$

where

$$M^{(\ell)} = \sup_{x \in [x_{j_0 - 1}, c]} f(x), \qquad M^{(r)} = \sup_{x \in [c, x_{j_0}]} f(x), \qquad \text{and} \qquad M = \sup_{x \in [x_{j_0 - 1}, x_{j_0}]} f(x).$$

By the Monotone Property of Suprema, $M^{(\ell)}$ and $M^{(r)}$ are both less than or equal to $M$. Therefore,

$$U(f, Q) - U(f, P) \le M(c - x_{j_0 - 1}) + M(x_{j_0} - c) - M(x_{j_0} - x_{j_0 - 1}) = 0. \;∎$$

[Figure 5.2]
L(f, P)
~
U(f, Q).
PROOF. Since P U Q is a refinement of P and Q, it follows from Remark 5.7 that
L(f, P)
~
L(f, P U Q)
~
U(f, P U Q)
~
U(f, Q)
for any pair of partitions P, Q, whether Q is a refinement of P or not. I We now use the connection between area and integration to motivate the definition of "integrable." Suppose that f(x) is nonnegative on [a, b] and the region bounded by the curves y = f(x), y = 0, x = a, and x = b has a well-defined area A. By Definition 5.3, every upper Riemann sum is an overestimate of A, and every lower Riemann sum is an underestimate of A (see Figure 5.1). Since the estimates U(f, P) and L(f, P) should get nearer to A as P gets finer, the differences U(f, P) - L(f, P) should get smaller. (The shaded area in Figure 5.2 represents the difference U(f, P) - L(f, P) for a particular P.) This leads us to the following definition (see also Exercise 9).
5.9 DEFINITION. Let $a, b \in \mathbf{R}$ with $a < b$. A function $f : [a, b] \to \mathbf{R}$ is said to be (Riemann) integrable on $[a, b]$ if and only if $f$ is bounded on $[a, b]$, and for every $\varepsilon > 0$ there is a partition $P$ of $[a, b]$ such that $U(f, P) - L(f, P) < \varepsilon$.

Notice that this definition makes sense whether or not $f$ is nonnegative. The connection between nonnegative functions and area was only a convenient vehicle to motivate Definition 5.9. Also notice that by Remark 5.6, $U(f, P) - L(f, P) = |U(f, P) - L(f, P)|$ for all partitions $P$. Hence, $U(f, P) - L(f, P) < \varepsilon$ is equivalent to $|U(f, P) - L(f, P)| < \varepsilon$.

This section provides a good illustration of how mathematics works. The connection between area and integration leads directly to Definition 5.9. This definition, however, is not easy to apply in concrete situations. Thus, we search for conditions that imply integrability and are easy to apply. In view of Figure 5.2, it seems reasonable that a function is integrable if its graph does not jump around too much (so that it can be covered by thinner and thinner rectangles). Since the graph of a continuous function does not jump at all, we are led to the following simple criterion that is sufficient (but not necessary) for integrability.
5.10 THEOREM. Suppose that $a, b \in \mathbf{R}$ with $a < b$. If $f$ is continuous on the interval $[a, b]$, then $f$ is integrable on $[a, b]$.
PROOF. Let $\varepsilon > 0$. Since $f$ is uniformly continuous on $[a, b]$, choose $\delta > 0$ such that
\[ (1) \qquad |x - y| < \delta \quad \text{implies} \quad |f(x) - f(y)| < \frac{\varepsilon}{b - a}. \]
Let $P = \{x_0, x_1, \ldots, x_n\}$ be any partition of $[a, b]$ that satisfies $\|P\| < \delta$. Fix an index $j$ and notice, by the Extreme Value Theorem, that there are points $x_m$ and $x_M$ in $[x_{j-1}, x_j]$ such that
\[ f(x_m) = m_j(f) \quad \text{and} \quad f(x_M) = M_j(f). \]
Since $\|P\| < \delta$, we also have $|x_M - x_m| < \delta$. Hence by (1), $M_j(f) - m_j(f) < \varepsilon/(b - a)$. In particular,
\[ U(f, P) - L(f, P) = \sum_{j=1}^{n} (M_j(f) - m_j(f))(x_j - x_{j-1}) < \frac{\varepsilon}{b - a} \sum_{j=1}^{n} (x_j - x_{j-1}) = \varepsilon. \]
(The last step comes from telescoping.) • Although the converse of Theorem 5.10 is false (see Example 5.12, and Exercises 3 and 8), there is a close connection between integrability and continuity. Indeed, we shall see (Theorem 9.49) that a function is integrable if and only if it has relatively few discontinuities. This principle is illustrated by the following examples. The nonintegrable function in Example 5.11 is nowhere continuous (hence has many discontinuities), but the integrable function in Example 5.12 has only one discontinuity (hence has few discontinuities).
5.11 Example. The Dirichlet function
\[ f(x) = \begin{cases} 1 & x \in \mathbf{Q} \\ 0 & x \notin \mathbf{Q} \end{cases} \]
is not Riemann integrable on $[0, 1]$.
PROOF. Clearly, $f$ is bounded on $[0, 1]$. By Theorem 1.24 and Exercise 3, p. 23 (Density of Rationals and Irrationals), the supremum of $f$ over any nondegenerate interval is 1, and the infimum of $f$ over any nondegenerate interval is 0. Therefore, $U(f, P) - L(f, P) = 1 - 0 = 1$ for any partition $P$ of the interval $[0, 1]$; i.e., $f$ is not integrable on $[0, 1]$. •
5.12 Example. The function
\[ f(x) = \begin{cases} 0 & 0 \le x < 1/2 \\ 1 & 1/2 \le x \le 1 \end{cases} \]
is integrable on $[0, 1]$.
PROOF. Let $\varepsilon > 0$ and set
\[ P = \left\{ 0, \frac{1 - \varepsilon}{2}, \frac{1 + \varepsilon}{2}, 1 \right\}. \]
We may suppose that $\varepsilon < 1/2$, i.e., that $P$ is a partition of $[0, 1]$. Since $m_1(f) = 0 = M_1(f)$, $m_2(f) = 0 < 1 = M_2(f)$, and $m_3(f) = 1 = M_3(f)$, it is easy to see that $U(f, P) - L(f, P) = \varepsilon$. Therefore, $f$ is integrable on $[0, 1]$. •
We have defined integrability but not the value of the integral. We remedy this situation by using the Riemann sums $U(f, P)$ and $L(f, P)$ to define upper and lower integrals.
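The computation in Example 5.12 can be checked with exact rational arithmetic. The following is only a sketch: the names `step` and `upper_minus_lower` are introduced here for illustration, and the code relies on the assumption (true for this particular function) that a nondecreasing function attains its infimum and supremum on a subinterval at the left and right endpoints.

```python
from fractions import Fraction

def step(x):
    """The function of Example 5.12: 0 on [0, 1/2) and 1 on [1/2, 1]."""
    return 0 if x < Fraction(1, 2) else 1

def upper_minus_lower(eps):
    """Return U(f, P) - L(f, P) for P = {0, (1 - eps)/2, (1 + eps)/2, 1}.

    Since step is nondecreasing, m_j(f) is its value at the left endpoint
    of the j-th subinterval and M_j(f) is its value at the right endpoint.
    """
    P = [Fraction(0), (1 - eps) / 2, (1 + eps) / 2, Fraction(1)]
    return sum((step(right) - step(left)) * (right - left)
               for left, right in zip(P, P[1:]))

for eps in (Fraction(1, 4), Fraction(1, 10), Fraction(1, 1000)):
    # Each printed pair should show the same value twice.
    print(eps, upper_minus_lower(eps))
```

Only the middle subinterval, of length $\varepsilon$, contributes to the difference, which is exactly what the proof uses.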
5.13 DEFINITION. Let $a, b \in \mathbf{R}$ with $a < b$, and $f : [a, b] \to \mathbf{R}$ be bounded.
(i) The upper integral of $f$ on $[a, b]$ is the number
\[ (U)\int_a^b f(x)\,dx := \inf\{U(f, P) : P \text{ is a partition of } [a, b]\}. \]
(ii) The lower integral of $f$ on $[a, b]$ is the number
\[ (L)\int_a^b f(x)\,dx := \sup\{L(f, P) : P \text{ is a partition of } [a, b]\}. \]
(iii) If the upper and lower integrals of $f$ on $[a, b]$ are equal, we define the integral of $f$ on $[a, b]$ to be the common value
\[ \int_a^b f(x)\,dx := (U)\int_a^b f(x)\,dx = (L)\int_a^b f(x)\,dx. \]
This defines integration over nondegenerate intervals. If $a = b$, then motivated by the interpretation of integration as area, we define the integral of any bounded function $f$ to be zero; i.e.,
\[ \int_a^a f(x)\,dx := 0. \]
Although a bounded function might not be integrable (see Example 5.11), the following result shows that the upper and lower integrals of a bounded function always exist.
5.14 Remark. If $f : [a, b] \to \mathbf{R}$ is bounded, then its upper and lower integrals exist and are finite, and satisfy
\[ (L)\int_a^b f(x)\,dx \le (U)\int_a^b f(x)\,dx. \]
PROOF. By Remark 5.8, $L(f, P) \le U(f, Q)$ for all partitions $P$ and $Q$ of $[a, b]$. Taking the supremum of this inequality over all partitions $P$ of $[a, b]$, we have
\[ (L)\int_a^b f(x)\,dx \le U(f, Q); \]
i.e., the lower integral exists and is finite. Taking the infimum of this last inequality over all partitions $Q$ of $[a, b]$, we conclude that the upper integral is also finite and greater than or equal to the lower integral. •
Suppose that $f$ is bounded and nonnegative on $[a, b]$. Since the upper and lower sums of $f$ approximate the "area" of the region bounded by the curves $y = f(x)$, $y = 0$, $x = a$, and $x = b$, we guess that $f$ is integrable if and only if the upper and lower integrals of $f$ are equal. The following result shows that this guess is true whether or not $f$ is nonnegative.
5.15 THEOREM. Let $a, b \in \mathbf{R}$ with $a < b$, and $f : [a, b] \to \mathbf{R}$ be bounded. Then $f$ is integrable on $[a, b]$ if and only if
\[ (2) \qquad (L)\int_a^b f(x)\,dx = (U)\int_a^b f(x)\,dx. \]
PROOF. Suppose that $f$ is integrable. Let $\varepsilon > 0$ and choose a partition $P$ of $[a, b]$ such that
\[ (3) \qquad U(f, P) - L(f, P) < \varepsilon. \]
By definition, $(U)\int_a^b f(x)\,dx \le U(f, P)$ and the opposite inequality holds for the lower integral and the lower sum $L(f, P)$. Therefore, it follows from Remark 5.14 and (3) that
\[ \left| (U)\int_a^b f(x)\,dx - (L)\int_a^b f(x)\,dx \right| = (U)\int_a^b f(x)\,dx - (L)\int_a^b f(x)\,dx \le U(f, P) - L(f, P) < \varepsilon. \]
Since this is valid for all $\varepsilon > 0$, (2) holds as promised.
Conversely, suppose that (2) holds. Let $\varepsilon > 0$ and choose, by the Approximation Property, partitions $P_1$ and $P_2$ of $[a, b]$ such that
\[ U(f, P_1) < (U)\int_a^b f(x)\,dx + \frac{\varepsilon}{2} \quad \text{and} \quad L(f, P_2) > (L)\int_a^b f(x)\,dx - \frac{\varepsilon}{2}. \]
Set $P = P_1 \cup P_2$. Since $P$ is a refinement of both $P_1$ and $P_2$, it follows from Remark 5.7, the choices of $P_1$ and $P_2$, and (2) that
\[ U(f, P) - L(f, P) \le U(f, P_1) - L(f, P_2) \le (U)\int_a^b f(x)\,dx + \frac{\varepsilon}{2} - (L)\int_a^b f(x)\,dx + \frac{\varepsilon}{2} = \varepsilon. \]
•
Since the integral has been defined only on nonempty intervals $[a, b]$, we have tacitly assumed that $a \le b$. We shall use the convention
\[ \int_b^a f(x)\,dx = -\int_a^b f(x)\,dx \]
to extend the integral to the case $a > b$. In particular, if $f(x)$ is integrable and nonpositive on $[a, b]$, then the area of the region bounded by the curves $y = f(x)$, $y = 0$, $x = a$, and $x = b$ is given by $\int_b^a f(x)\,dx$.
In the next section we shall use the machinery of upper and lower sums to prove several familiar theorems about the Riemann integral. We close this section with one more result, which reinforces the connection between integration and area.
5.16 THEOREM. If $f(x) = \alpha$ is constant on $[a, b]$, then
\[ \int_a^b f(x)\,dx = \alpha(b - a). \]
PROOF. By Theorem 5.10, $f$ is integrable on $[a, b]$. Hence, it follows from Theorem 5.15 and Remark 5.5 that
\[ \int_a^b f(x)\,dx = (U)\int_a^b f(x)\,dx = \inf_P U(f, P) = \alpha(b - a). \]
•
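As a small illustration of Definition 5.13 and Theorem 5.16 (a sketch, not part of the text's development), upper and lower sums can be computed exactly on uniform partitions. The helper `upper_lower_sums` is a name introduced here, and it assumes that $f$ is nondecreasing, so that the infimum and supremum on each subinterval are the values at the left and right endpoints.

```python
from fractions import Fraction

def upper_lower_sums(f, a, b, n):
    """Return (U(f, P), L(f, P)) for the uniform partition P of [a, b]
    into n subintervals, assuming f is nondecreasing on [a, b]."""
    xs = [a + (b - a) * Fraction(j, n) for j in range(n + 1)]
    pieces = list(zip(xs, xs[1:]))
    upper = sum(f(right) * (right - left) for left, right in pieces)
    lower = sum(f(left) * (right - left) for left, right in pieces)
    return upper, lower

# f(x) = x on [0, 1]: the two sums close in on a common value as n grows.
for n in (4, 40, 400):
    print(n, upper_lower_sums(lambda x: x, 0, 1, n))

# A constant function (Theorem 5.16): both sums equal alpha * (b - a).
print(upper_lower_sums(lambda x: 3, 1, 4, 7))
```

For $f(x) = x$ on $[0, 1]$ the sums work out to $(n+1)/(2n)$ and $(n-1)/(2n)$, so the infimum of the upper sums and the supremum of the lower sums over these partitions both equal $1/2$.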
EXERCISES 1. For each of the following, compute U(f, P), L(f, P), and
P =
Iol f(x) dx, where
{o,~,~,~, I}.
Find out whether the lower sum or the upper sum is a better approximation to the integral. Graph f and explain why this is so. (a) f(x) = 1- x 2 • (b) f(x) = sin x. (c) f(x) = x 2 - X.
2. (a) Prove that for each n E N,
{~ : j
Pn :=
is a partition of [0,1]. (b) Prove that a bounded function
(*)
f
= 0,1, ...
is integrable on [0, 1] if
fo:= lim L(f, Pn ) n---+OCl
,n}
= n---+oo lim U(f, Pn ),
in which case J01f(x) dx equals f o. (c) For each ofthe following functions, use Exercise 1, p. 17, to find formulas for the upper and lower sums of f on Pn , and use them to compute the value of
J01 f(x) dx. f(x) = x.
(a)
((J)
f(x) = { 3. Let E
= {l/n : n
~
~ x < 1/2 1/2 ~ x ~ 1.
EN}. Prove that the function
f(x)
= {
~
x EE otherwise
is integrable on [0,1]. What is the value of
[!].
°
J01f(x) dx?
This exercise is used in Section e14.2. Suppose that [a, b] is a closed, nondegenerate interval and f : [a, b] ---7 R is bounded.
(a) Prove that if f is continuous at Xo E [a, b] and f(xo)
(L)
lb
=1=
0, then
If(x)1 dx > 0.
J:
°
(b) Show that if f is continuous on [a, b], then If(x)1 dx = if and only if f(x) = for all x E [a, b]. (c) Does part (b) hold if the absolute values are removed? If it does, prove it. If it does not, provide a counterexample.
°
f is continuous on a nondegenerate interval [a, bJ. Show that
5. Suppose that
l
C
f(x)dx = 0
for all c E [a, bJ if and only if f(x) = 0 for all x E [a, bJ. (Compare with Exercise 4, and notice that f need not be nonnegative here.) 6. Let f be integrable on [a, bJ and E be a finite subset of [a, bJ. Show that if 9 is a bounded function that satisfies g(x) = f(x) for all x E [a, bJ \ E, then 9 is integrable on [a, bJ and
lb
g(x) dx =
lb
f(x) dx.
[!].
This exercise is used in Section 12.3. Let f,g be bounded on [a,bJ. (a) Prove that
(U)
lb
(f(x)
+ g(x)) dx :::;
(U)
lb
f(x) dx + (U)
lb
(L)
lb
(f(x)
+ g(x)) dx
~ (L)
lb
f(x) dx + (L)
lb
g(x) dx
and
g(x) dx.
(b) Prove that
(U) and
(L)
lb lb
f(x) dx = (U)
f(x) dx = (L)
l l
c
f(x) dx + (U)
c
f(x) dx + (L)
lb lb
f(x) dx
f(x) dx
for a < c < b. ~. This exercise is used in Sections e5.5, 6.2, and e7.5. (a) If f is increasing on [a, bJ and P prove that
= {xo, ... , xn } is any partition of [a, bJ,
n
'L(Mj(f) - mj(f))(xj - Xj-1) :::; (f(b) - f(a))
IIPII·
j=l
(b) Prove that if f is monotone on [a,b], then f is integrable on [a,bJ. [Note: By Theorem 4.29, f has at most countably many (i.e., relatively few) discontinuities on [a,bJ. This has nothing to do with the proof of part (b), but points out a general principle that will be discussed in Section 9.6.J
9. Let $f$ be bounded on a nondegenerate interval $[a, b]$. Prove that $f$ is integrable on $[a, b]$ if and only if given $\varepsilon > 0$ there is a partition $P_\varepsilon$ of $[a, b]$ such that
\[ P \supseteq P_\varepsilon \quad \text{implies} \quad |U(f, P) - L(f, P)| < \varepsilon. \]
5.2 RIEMANN SUMS There is another definition of the Riemann integral frequently found in elementary calculus texts.
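5.17 DEFINITION. Let $f : [a, b] \to \mathbf{R}$.
(i) A Riemann sum of $f$ with respect to a partition $P = \{x_0, \ldots, x_n\}$ of $[a, b]$ is a sum of the form
\[ \sum_{j=1}^{n} f(t_j)(x_j - x_{j-1}), \]
where the choice of $t_j \in [x_{j-1}, x_j]$ is arbitrary.
(ii) The Riemann sums of $f$ are said to converge to $I(f)$ as $\|P\| \to 0$ if and only if given $\varepsilon > 0$ there is a partition $P_\varepsilon$ of $[a, b]$ such that
\[ P = \{x_0, \ldots, x_n\} \supseteq P_\varepsilon \quad \text{implies} \quad \left| \sum_{j=1}^{n} f(t_j)(x_j - x_{j-1}) - I(f) \right| < \varepsilon \]
for all choices of $t_j \in [x_{j-1}, x_j]$, $j = 1, 2, \ldots, n$. In this case we shall use the notation
\[ I(f) = \lim_{\|P\| \to 0} \sum_{j=1}^{n} f(t_j)(x_j - x_{j-1}). \]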
The following result shows that for bounded functions this definition of the Riemann integral is the same as the one using upper and lower integrals.
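5.18 THEOREM. Let $a, b \in \mathbf{R}$ with $a < b$, and suppose that $f : [a, b] \to \mathbf{R}$ is bounded. Then $f$ is Riemann integrable on $[a, b]$ if and only if
\[ I(f) = \lim_{\|P\| \to 0} \sum_{j=1}^{n} f(t_j)(x_j - x_{j-1}) \]
exists, in which case
\[ I(f) = \int_a^b f(x)\,dx. \]
PROOF. Suppose that $f$ is integrable on $[a, b]$ and $\varepsilon > 0$. By the Approximation Property, there is a partition $P_\varepsilon$ of $[a, b]$ such that
\[ (4) \qquad L(f, P_\varepsilon) > \int_a^b f(x)\,dx - \varepsilon \quad \text{and} \quad U(f, P_\varepsilon) < \int_a^b f(x)\,dx + \varepsilon. \]
Let $P = \{x_0, x_1, \ldots, x_n\} \supseteq P_\varepsilon$. Then (4) holds with $P$ in place of $P_\varepsilon$. But $m_j(f) \le f(t_j) \le M_j(f)$ for any choice of $t_j \in [x_{j-1}, x_j]$. Hence,
\[ \int_a^b f(x)\,dx - \varepsilon < L(f, P) \le \sum_{j=1}^{n} f(t_j)(x_j - x_{j-1}) \le U(f, P) < \int_a^b f(x)\,dx + \varepsilon. \]
In particular,
\[ \left| \sum_{j=1}^{n} f(t_j)(x_j - x_{j-1}) - \int_a^b f(x)\,dx \right| < \varepsilon \]
for all partitions $P \supseteq P_\varepsilon$ and all choices of $t_j \in [x_{j-1}, x_j]$, $j = 1, 2, \ldots, n$.
Conversely, suppose that the Riemann sums of $f$ converge to $I(f)$. Let $\varepsilon > 0$ and choose a partition $P = \{x_0, x_1, \ldots, x_n\}$ of $[a, b]$ such that
\[ (5) \qquad \left| \sum_{j=1}^{n} f(t_j)(x_j - x_{j-1}) - I(f) \right| < \frac{\varepsilon}{3} \]
for all choices of $t_j \in [x_{j-1}, x_j]$. By the Approximation Property, choose $t_j, u_j \in [x_{j-1}, x_j]$ such that
\[ f(t_j) - f(u_j) > M_j(f) - m_j(f) - \frac{\varepsilon}{3(b - a)}. \]
By (5) and telescoping, we have
\[ \begin{aligned} U(f, P) - L(f, P) &= \sum_{j=1}^{n} (M_j(f) - m_j(f))(x_j - x_{j-1}) \\ &< \sum_{j=1}^{n} (f(t_j) - f(u_j))(x_j - x_{j-1}) + \frac{\varepsilon}{3(b - a)} \sum_{j=1}^{n} (x_j - x_{j-1}) \\ &\le \left| \sum_{j=1}^{n} f(t_j)(x_j - x_{j-1}) - I(f) \right| + \left| I(f) - \sum_{j=1}^{n} f(u_j)(x_j - x_{j-1}) \right| + \frac{\varepsilon}{3(b - a)} \sum_{j=1}^{n} (x_j - x_{j-1}) \\ &< \frac{2\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon. \end{aligned} \]
Therefore, $f$ is integrable on $[a, b]$. •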
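The inequality $L(f, P) \le \sum f(t_j)(x_j - x_{j-1}) \le U(f, P)$, on which the first half of the proof rests, can be observed numerically. The sketch below is illustrative only: `riemann_sum` and `check_squeeze` are names introduced here, the tags are drawn at random, and $f$ is assumed to be increasing so that the lower and upper sums are the left-endpoint and right-endpoint sums. A small tolerance absorbs floating-point rounding.

```python
import random

def riemann_sum(f, xs, tags):
    """Sum of f(t_j) * (x_j - x_{j-1}) for a partition xs and tags t_j."""
    return sum(f(t) * (right - left)
               for t, left, right in zip(tags, xs, xs[1:]))

def check_squeeze(f, a, b, n, trials=100, seed=0):
    """Check L(f, P) <= Riemann sum <= U(f, P) for random tags,
    where P is the uniform partition of [a, b] and f is increasing."""
    rng = random.Random(seed)
    xs = [a + (b - a) * j / n for j in range(n + 1)]
    lower = riemann_sum(f, xs, xs[:-1])  # left endpoints give m_j(f)
    upper = riemann_sum(f, xs, xs[1:])   # right endpoints give M_j(f)
    for _ in range(trials):
        tags = [rng.uniform(left, right) for left, right in zip(xs, xs[1:])]
        s = riemann_sum(f, xs, tags)
        if not (lower - 1e-12 <= s <= upper + 1e-12):
            return False
    return True

print(check_squeeze(lambda x: x ** 3, 0.0, 1.0, 50))
```

Because $U(f, P) - L(f, P)$ shrinks as the partition is refined, every choice of tags is forced toward the same limit, which is the content of Theorem 5.18.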
The next two results show that Riemann integrals of complicated functions can be broken into simpler pieces.
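5.19 THEOREM [LINEAR PROPERTY]. If $f, g$ are integrable on $[a, b]$ and $\alpha \in \mathbf{R}$, then $f + g$ and $\alpha f$ are integrable on $[a, b]$. In fact,
\[ (6) \qquad \int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx \]
and
\[ (7) \qquad \int_a^b (\alpha f(x))\,dx = \alpha \int_a^b f(x)\,dx. \]
PROOF. Let $\varepsilon > 0$ and choose $P_\varepsilon$ such that for any partition $P = \{x_0, x_1, \ldots, x_n\} \supseteq P_\varepsilon$ of $[a, b]$ and any choice of $t_j \in [x_{j-1}, x_j]$, we have
\[ \left| \sum_{j=1}^{n} f(t_j)(x_j - x_{j-1}) - \int_a^b f(x)\,dx \right| < \frac{\varepsilon}{2} \]
and
\[ \left| \sum_{j=1}^{n} g(t_j)(x_j - x_{j-1}) - \int_a^b g(x)\,dx \right| < \frac{\varepsilon}{2}. \]
By the triangle inequality,
\[ \left| \sum_{j=1}^{n} (f(t_j) + g(t_j))(x_j - x_{j-1}) - \int_a^b f(x)\,dx - \int_a^b g(x)\,dx \right| < \varepsilon \]
for any choice of $t_j \in [x_{j-1}, x_j]$. Hence, (6) follows directly from Theorem 5.18. Similarly, if $P_\varepsilon$ is chosen so that if $P = \{x_0, \ldots, x_n\}$ is finer than $P_\varepsilon$, then
\[ \left| \sum_{j=1}^{n} f(t_j)(x_j - x_{j-1}) - \int_a^b f(x)\,dx \right| < \frac{\varepsilon}{|\alpha| + 1}, \]
it is easy to check that
\[ \left| \sum_{j=1}^{n} \alpha f(t_j)(x_j - x_{j-1}) - \alpha \int_a^b f(x)\,dx \right| < \varepsilon \]
for any choice of $t_j \in [x_{j-1}, x_j]$. We conclude by Theorem 5.18 that (7) holds. •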
5.20 THEOREM. If $f$ is integrable on $[a, b]$, then $f$ is integrable on each subinterval $[c, d]$ of $[a, b]$. Moreover,
\[ (8) \qquad \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx \]
for all $c \in (a, b)$.
PROOF. We may suppose that $a < b$. Let $\varepsilon > 0$ and choose a partition $P$ of $[a, b]$ such that
\[ (9) \qquad U(f, P) - L(f, P) < \varepsilon. \]
Let $P' = P \cup \{c\}$ and $P_1 = P' \cap [a, c]$. Since $P_1$ is a partition of $[a, c]$ and $P'$ is a refinement of $P$, we have by (9) that
\[ U(f, P_1) - L(f, P_1) \le U(f, P') - L(f, P') \le U(f, P) - L(f, P) < \varepsilon. \]
Therefore, $f$ is integrable on $[a, c]$. A similar argument proves that $f$ is integrable on any subinterval $[c, d]$ of $[a, b]$.
To verify (8), suppose that $P$ is any partition of $[a, b]$. Let $P_0 = P \cup \{c\}$, $P_1 = P_0 \cap [a, c]$, and $P_2 = P_0 \cap [c, b]$. Then $P_0 = P_1 \cup P_2$ and by definition
\[ U(f, P) \ge U(f, P_0) = U(f, P_1) + U(f, P_2) \ge (U)\int_a^c f(x)\,dx + (U)\int_c^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]
(This last equality follows from the fact that $f$ is integrable on both $[a, c]$ and $[c, b]$.) Taking the infimum of
\[ U(f, P) \ge \int_a^c f(x)\,dx + \int_c^b f(x)\,dx \]
over all partitions $P$ of $[a, b]$, we obtain
\[ \int_a^b f(x)\,dx = (U)\int_a^b f(x)\,dx \ge \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]
A similar argument using lower integrals shows that
\[ \int_a^b f(x)\,dx \le \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]
•
Using the conventions
\[ \int_a^b f(x)\,dx = -\int_b^a f(x)\,dx \quad \text{and} \quad \int_a^a f(x)\,dx = 0, \]
it is easy to see that (8) holds whether or not $c$ lies between $a$ and $b$, provided that $f$ is integrable on the union of these intervals (see Exercise 4).
5.21 THEOREM [COMPARISON THEOREM FOR INTEGRALS]. If $f, g$ are integrable on $[a, b]$ and $f(x) \le g(x)$ for all $x \in [a, b]$, then
\[ \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx. \]
In particular, if $m \le f(x) \le M$ for $x \in [a, b]$, then
\[ m(b - a) \le \int_a^b f(x)\,dx \le M(b - a). \]
PROOF. Let $P$ be a partition of $[a, b]$. By hypothesis, $M_j(f) \le M_j(g)$ whence $U(f, P) \le U(g, P)$. It follows that
\[ \int_a^b f(x)\,dx = (U)\int_a^b f(x)\,dx \le U(g, P) \]
for all partitions $P$ of $[a, b]$. Taking the infimum of this inequality over all partitions $P$ of $[a, b]$, we obtain
\[ \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx. \]
If $m \le f(x) \le M$, then (by what we just proved and by Theorem 5.16)
\[ m(b - a) = \int_a^b m\,dx \le \int_a^b f(x)\,dx \le \int_a^b M\,dx = M(b - a). \]
•
We shall use the following result nearly every time we need to estimate an integral.
5.22 THEOREM. If $f$ is (Riemann) integrable on $[a, b]$, then $|f|$ is integrable on $[a, b]$ and
\[ \left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx. \]
PROOF. Let $P = \{x_0, x_1, \ldots, x_n\}$ be a partition of $[a, b]$. We claim that
\[ (10) \qquad M_j(|f|) - m_j(|f|) \le M_j(f) - m_j(f) \]
holds for $j = 1, 2, \ldots, n$. Indeed, let $x, y \in [x_{j-1}, x_j]$. If $f(x), f(y)$ have the same sign, say both are nonnegative, then
\[ |f(x)| - |f(y)| = f(x) - f(y) \le M_j(f) - m_j(f). \]
If $f(x), f(y)$ have opposite signs, say $f(x) \ge 0 \ge f(y)$, then $m_j(f) \le 0$, hence
\[ |f(x)| - |f(y)| = f(x) + f(y) \le M_j(f) + 0 \le M_j(f) - m_j(f). \]
Taking the supremum over $x$ and the infimum over $y$ in $[x_{j-1}, x_j]$, we see that (10) holds in any event.
Let $\varepsilon > 0$ and choose a partition $P$ of $[a, b]$ such that $U(f, P) - L(f, P) < \varepsilon$. Since (10) implies that $U(|f|, P) - L(|f|, P) \le U(f, P) - L(f, P)$, it follows that
\[ U(|f|, P) - L(|f|, P) < \varepsilon. \]
Thus $|f|$ is integrable on $[a, b]$. Since $-|f(x)| \le f(x) \le |f(x)|$ holds for any $x \in [a, b]$, we conclude by Theorem 5.21 that
\[ -\int_a^b |f(x)|\,dx \le \int_a^b f(x)\,dx \le \int_a^b |f(x)|\,dx. \]
•
By Theorem 5.19, the sum of integrable functions is integrable. What about the product?
5.23 COROLLARY. If $f$ and $g$ are (Riemann) integrable on $[a, b]$, then so is $fg$.
PROOF. Suppose for a moment that the square of any integrable function is integrable. Then, by hypothesis, $f^2$, $g^2$, and $(f + g)^2$ are integrable on $[a, b]$. Since
\[ fg = \frac{(f + g)^2 - f^2 - g^2}{2}, \]
it follows from Theorem 5.19 that $fg$ is integrable on $[a, b]$.
It remains to prove that $f^2$ is integrable on $[a, b]$. Since $M_j(f^2) = (M_j(|f|))^2$ and $m_j(f^2) = (m_j(|f|))^2$, it is clear that
\[ \begin{aligned} M_j(f^2) - m_j(f^2) &= (M_j(|f|))^2 - (m_j(|f|))^2 \\ &= (M_j(|f|) + m_j(|f|))(M_j(|f|) - m_j(|f|)) \le 2M(M_j(|f|) - m_j(|f|)), \end{aligned} \]
where $M = \sup_{x \in [a, b]} |f(x)|$. Multiplying this inequality by $(x_j - x_{j-1})$ and summing over $j = 1, 2, \ldots, n$, we have
\[ U(f^2, P) - L(f^2, P) \le 2M(U(|f|, P) - L(|f|, P)). \]
Hence, it follows from Theorem 5.22 that $f^2$ is integrable on $[a, b]$. •
We close this section with two integral analogues of the Mean Value Theorem.
5.24 THEOREM [FIRST MEAN VALUE THEOREM FOR INTEGRALS]. Suppose that $f$ and $g$ are integrable on $[a, b]$ with $g(x) \ge 0$ for all $x \in [a, b]$. If
\[ m = \inf_{x \in [a, b]} f(x) \quad \text{and} \quad M = \sup_{x \in [a, b]} f(x), \]
then there is a number $c \in [m, M]$ such that
\[ \int_a^b f(x)g(x)\,dx = c \int_a^b g(x)\,dx. \]
In particular, if $f$ is continuous on $[a, b]$, then there is an $x_0 \in [a, b]$ that satisfies
\[ \int_a^b f(x)g(x)\,dx = f(x_0) \int_a^b g(x)\,dx. \]
PROOF. Since $g \ge 0$ on $[a, b]$, Theorem 5.21 implies that
\[ m \int_a^b g(x)\,dx \le \int_a^b f(x)g(x)\,dx \le M \int_a^b g(x)\,dx. \]
If $\int_a^b g(x)\,dx = 0$, then $\int_a^b f(x)g(x)\,dx = 0$ and there is nothing to prove. Otherwise, set
\[ c = \frac{\int_a^b f(x)g(x)\,dx}{\int_a^b g(x)\,dx} \]
and note that $c \in [m, M]$. If $f$ is continuous, then (by the Intermediate Value Theorem) we can choose $x_0 \in [a, b]$ such that $f(x_0) = c$. •
Before we state the Second Mean Value Theorem we introduce an idea that will be used in the next section to prove the Fundamental Theorem of Calculus. If $f$ is integrable on $[a, b]$, then $f$ can be used to define a new function
\[ F(x) := \int_a^x f(t)\,dt, \qquad x \in [a, b]. \]
5.25 Example. Find $F(x) = \int_0^x f(t)\,dt$ if
\[ f(x) = \begin{cases} 1 & x \ge 0 \\ -1 & x < 0. \end{cases} \]
SOLUTION. By Theorem 5.16,
\[ F(x) = \int_0^x f(t)\,dt = \begin{cases} x & x \ge 0 \\ -x & x < 0. \end{cases} \]
Hence, $F(x) = |x|$. •
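The formula $F(x) = |x|$ can be compared with a direct numerical approximation. The following sketch uses a midpoint Riemann sum; the names `f` and `F` mirror the example, while the parameter `n` and the use of a signed step (so that $\int_0^x = -\int_x^0$ when $x < 0$) are choices made here for illustration.

```python
def f(t):
    """The integrand of Example 5.25."""
    return 1.0 if t >= 0 else -1.0

def F(x, n=1000):
    """Midpoint Riemann sum approximating the integral of f from 0 to x.

    For x < 0 the step h is negative, which matches the convention that
    the integral from 0 to x is minus the integral from x to 0.
    """
    h = x / n
    return sum(f((j + 0.5) * h) for j in range(n)) * h

for x in (-3.0, -0.5, 0.0, 2.0):
    print(x, F(x), abs(x))
```

Since $f$ is constant on each side of 0, the midpoint sums reproduce $|x|$ up to rounding, and the resulting $F$ has a corner at $x = 0$ but no jump.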
Notice in Example 5.25 that the integral F of f is continuous even though f itself is not. The following result shows that this is a general principle.
5.26 THEOREM. If $f$ is (Riemann) integrable on $[a, b]$, then $F(x) = \int_a^x f(t)\,dt$ exists and is continuous on $[a, b]$.
PROOF. By Theorem 5.20, $F(x)$ exists for all $x \in [a, b]$. To prove that $F$ is continuous on $[a, b]$, it suffices to show that $F(x+) = F(x)$ for all $x \in [a, b)$ and $F(x-) = F(x)$ for all $x \in (a, b]$. Fix $x_0 \in [a, b)$. By definition, $f$ is bounded on $[a, b]$. Thus, choose $M \in \mathbf{R}$ such that $|f(t)| \le M$ for all $t \in [a, b]$. Let $\varepsilon > 0$ and set $\delta = \varepsilon/M$. If $0 \le x - x_0 < \delta$, then by Theorem 5.22,
\[ |F(x) - F(x_0)| = \left| \int_{x_0}^{x} f(t)\,dt \right| \le \int_{x_0}^{x} |f(t)|\,dt \le M|x - x_0| < \varepsilon. \]
Hence, $F(x_0+) = F(x_0)$. A similar argument shows that $F(x_0-) = F(x_0)$ for all $x_0 \in (a, b]$. •
[Figure 5.3]
[Figure 5.4]
5.27 THEOREM [SECOND MEAN VALUE THEOREM FOR INTEGRALS]. Suppose that $f, g$ are integrable on $[a, b]$, that $g$ is nonnegative on $[a, b]$, and that $m, M$ are real numbers that satisfy $m \le \inf f([a, b])$ and $M \ge \sup f([a, b])$. Then there is an $x_0 \in [a, b]$ such that
\[ \int_a^b f(x)g(x)\,dx = m \int_a^{x_0} g(x)\,dx + M \int_{x_0}^{b} g(x)\,dx. \]
In particular, if $f$ is also nonnegative on $[a, b]$, then there is an $x_0 \in [a, b]$ that satisfies
\[ \int_a^b f(x)g(x)\,dx = M \int_{x_0}^{b} g(x)\,dx. \]
PROOF. The second statement follows from the first since we may use $m = 0$ when $f \ge 0$. To prove the first statement, set
\[ F(x) = m \int_a^x g(t)\,dt + M \int_x^b g(t)\,dt \]
for $x \in [a, b]$, and observe by Theorem 5.26 that $F$ is continuous on $[a, b]$. Since $g$ is nonnegative, we also have $m g(t) \le f(t)g(t) \le M g(t)$ for all $t \in [a, b]$. Hence, it follows from the Comparison Theorem (Theorem 5.21) that
\[ F(b) = m \int_a^b g(t)\,dt \le \int_a^b f(t)g(t)\,dt \le M \int_a^b g(t)\,dt = F(a). \]
Since $F$ is continuous, we conclude by the Intermediate Value Theorem that there is an $x_0 \in [a, b]$ such that
\[ F(x_0) = \int_a^b f(t)g(t)\,dt. \]
•
When $g(x) = 1$ and $f(x) \ge 0$, these Mean Value Theorems have simple geometric interpretations. Indeed, let $A$ represent the area bounded by the curves $y = f(x)$, $y = 0$, $x = a$, and $x = b$. By the First Mean Value Theorem, there is a $c \in [m, M]$ such that the area of the rectangle of height $c$ and base $b - a$ equals $A$ (see Figure 5.3); and by the Second Mean Value Theorem, if $M$ is the maximum value of $f$ on $[a, b]$, then there is an $x_0 \in [a, b]$ such that the area of the rectangle of height $M$ and base $b - x_0$ equals $A$ (see Figure 5.4).
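For a concrete instance of these two rectangles, take $f(x) = x^2$ on $[0, 1]$ with $g(x) = 1$. The sketch below assumes the value $A = 1/3$ for the area (it follows from the Power Rule of Section 5.3 and is simply taken as given here), and the helper names are introduced only for illustration.

```python
from fractions import Fraction

def first_mvt_height(area, a, b):
    """Height c of the rectangle on base b - a whose area equals `area`."""
    return area / (b - a)

def second_mvt_base_point(area, M, b):
    """Point x0 for which the rectangle of height M on [x0, b] has area `area`."""
    return b - area / M

a, b = Fraction(0), Fraction(1)
area = Fraction(1, 3)  # assumed value of the integral of x**2 over [0, 1]
M = Fraction(1)        # maximum of x**2 on [0, 1]

c = first_mvt_height(area, a, b)
x0 = second_mvt_base_point(area, M, b)
print(c, x0)
```

Here $c = 1/3$ lies in $[m, M] = [0, 1]$ and $x_0 = 2/3$ lies in $[a, b]$, as the two theorems require.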
EXERCISES 1. Using the connection between integrals and area, evaluate each of the following
integrals.
10
(a)
1
Ix -
0.51 dx.
a> O.
(b)
(c)
(d)
lb
(3x
+ 1) dx,
a
< b.
2. Prove that if f and 9 are integrable on [a, b], then so are f V 9 and f 1\ 9 (see Exercise 9, p. 65). 3. Prove that if f is integrable on [0,1] and (3 > 0, then
1
1/ n (3
lim nCl. n-oo
for all a
f(x) dx
0
=
0
< (3.
4. Suppose that $a < b < c$ and $f$ is integrable on $[a, c]$. Prove that
\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]
5. (a) Suppose that gn 2: 0 is a sequence of integrable functions that satisfies
lim n----i>OO
Show that if f
: [a, b] ---; R lim n---+oo
(b) Prove that if
f
Ib
= O.
gn(x) dx
a
is integrable on [a, b], then
Ib
f(x)gn(x) dx
=
O.
a
is integrable on [0,1], then
Jor xnf(x) dx l
lim n---+oo
=
O.
6. (a) Prove that if f is integrable on [0,1], then
L1+ n
1
r f(x) dx = Jo
lim
1/2k
1/2 k
n-oo k=O
f(x) dx. 1
(b) Suppose that f is integrable on [a, b], Xo = a, and Xn is a sequence of numbers in [a, b] such that Xn l' b as n ---; 00. Prove that
I
a
b
f(x) dx
= nl~~
t; l n
xk
Xk
+1
f(x) dx.
7. Let f : [a, b] ---; R, a = Xo < Xl < ... < Xn = b, and suppose that f(Xk+) exists and is finite for k = 0,1, ... , n-1 and f(Xk-) exists and is finite for k = 1, ... , n. Show that if f is continuous on each subinterval (Xk-l, Xk), then f is integrable on [a, b] and
I
a
b
f(x) dx
n =~
l
xk
Xk-l
f(x) dx.
8. Let $f$ be continuous on a closed, nondegenerate interval $[a, b]$ and set
\[ M = \sup_{x \in [a, b]} |f(x)|. \]
(a) Prove that if M > 0 and p > 0, then for every € > 0 there is a nondegenerate interval I c [a, b] such that
(b) Prove that
\[ \lim_{p \to \infty} \left( \int_a^b |f(x)|^p\,dx \right)^{1/p} = M. \]
5.3 FUNDAMENTAL THEOREM OF CALCULUS
Let $f$ be integrable on $[a, b]$ and $F(x) = \int_a^x f(t)\,dt$. By Theorem 5.26, $F$ is continuous on $[a, b]$. The next result shows that if $f$ is continuous, then $F$ is continuously differentiable. Thus "indefinite integration" improves the behavior of the function.
5.28 THEOREM [FUNDAMENTAL THEOREM OF CALCULUS]. Let $[a, b]$ be nondegenerate and suppose that $f : [a, b] \to \mathbf{R}$.
(i) If $f$ is continuous on $[a, b]$ and $F(x) = \int_a^x f(t)\,dt$, then $F \in C^1[a, b]$ and
\[ \frac{d}{dx} \int_a^x f(t)\,dt := F'(x) = f(x) \]
for each $x \in [a, b]$.
(ii) If $f$ is differentiable on $[a, b]$ and $f'$ is integrable on $[a, b]$, then
\[ \int_a^x f'(t)\,dt = f(x) - f(a) \]
for each $x \in [a, b]$.
PROOF. (i) Let
\[ F(x) = \int_a^x f(t)\,dt, \qquad x \in [a, b]. \]
By symmetry, it suffices to show that if $f(x_0+) = f(x_0)$ for some $x_0 \in [a, b)$, then
\[ (11) \qquad \lim_{h \to 0+} \frac{F(x_0 + h) - F(x_0)}{h} = f(x_0) \]
(see Definition 4.6). Let $\varepsilon > 0$ and choose a $\delta > 0$ such that $x_0 \le t < x_0 + \delta$ implies that $|f(t) - f(x_0)| < \varepsilon$. Fix $0 < h < \delta$. Notice that by Theorem 5.20,
\[ F(x_0 + h) - F(x_0) = \int_{x_0}^{x_0 + h} f(t)\,dt \]
and that by Theorem 5.16,
\[ f(x_0) = \frac{1}{h} \int_{x_0}^{x_0 + h} f(x_0)\,dt. \]
Therefore,
\[ \frac{F(x_0 + h) - F(x_0)}{h} - f(x_0) = \frac{1}{h} \int_{x_0}^{x_0 + h} (f(t) - f(x_0))\,dt. \]
Since $0 < h < \delta$, it follows from Theorem 5.22 and the choice of $\delta$ that
\[ \left| \frac{F(x_0 + h) - F(x_0)}{h} - f(x_0) \right| \le \frac{1}{h} \int_{x_0}^{x_0 + h} |f(t) - f(x_0)|\,dt \le \varepsilon. \]
This verifies (11) and the proof of part (i) is complete.
(ii) We may suppose that $x = b$. Let $\varepsilon > 0$. Since $f'$ is integrable, choose a partition $P = \{x_0, x_1, \ldots, x_n\}$ of $[a, b]$ such that
\[ \left| \sum_{j=1}^{n} f'(t_j)(x_j - x_{j-1}) - \int_a^b f'(t)\,dt \right| < \varepsilon \]
for any choice of points $t_j \in [x_{j-1}, x_j]$. Use the Mean Value Theorem to choose points $t_j \in [x_{j-1}, x_j]$ such that $f(x_j) - f(x_{j-1}) = f'(t_j)(x_j - x_{j-1})$. It follows by telescoping that
\[ \left| f(b) - f(a) - \int_a^b f'(t)\,dt \right| = \left| \sum_{j=1}^{n} (f(x_j) - f(x_{j-1})) - \int_a^b f'(t)\,dt \right| < \varepsilon. \]
•
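Both parts of the Fundamental Theorem of Calculus can be observed numerically. The sketch below is only an illustration: it takes $f(t) = \cos t$ as a sample integrand, approximates integrals by midpoint Riemann sums (the helper `integral` and its parameter `n` are choices made here), and compares the difference quotient of $F$ with $f(x_0)$.

```python
import math

def integral(f, a, b, n=10_000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) for j in range(n)) * h

def difference_quotient(f, x0, h):
    """(F(x0 + h) - F(x0)) / h, written as (1/h) times the integral of f
    over [x0, x0 + h], as in the proof of part (i)."""
    return integral(f, x0, x0 + h) / h

# Part (i): the difference quotient of F approaches f(x0) as h -> 0+.
x0 = 1.0
for h in (1e-1, 1e-2, 1e-3):
    print(h, difference_quotient(math.cos, x0, h), math.cos(x0))

# Part (ii) with f = sin, f' = cos: the integral of f' over [0, 2]
# should be close to sin(2) - sin(0).
print(integral(math.cos, 0.0, 2.0), math.sin(2.0) - math.sin(0.0))
```

The gap between the difference quotient and $f(x_0)$ is bounded by the oscillation of $f$ on $[x_0, x_0 + h]$, exactly as in the estimate that verifies (11).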
5.29 Remark. The hypotheses of the Fundamental Theorem of Calculus cannot be relaxed.
PROOF. (i) Define $f$ on $[-1, 1]$ by
\[ f(x) = \begin{cases} -1 & x < 0 \\ 1 & x \ge 0; \end{cases} \]
then $f$ is integrable on $[-1, 1]$ but $F(x) := \int_{-1}^{x} f(t)\,dt = |x| - 1$ is not differentiable at $x = 0$.
(ii) Define $f$ on $[0, 1]$ by $f(x) := x^2 \sin(1/x^2)$ when $x \ne 0$ and $f(0) = 0$. Then $f$ is differentiable on $[0, 1]$, but
\[ f'(x) = 2x \sin\frac{1}{x^2} - \frac{2}{x} \cos\frac{1}{x^2}, \qquad x \ne 0, \]
is not even bounded on $(0, 1]$, much less integrable on $[0, 1]$. •
By the Fundamental Theorem of Calculus, integration is the inverse of differentiation in the following sense. If $f'$ is integrable, then
\[ \int_a^b f'(x)\,dx = f(x) \Big|_a^b := f(b) - f(a). \]
In particular,
\[ \int_a^b x^\alpha\,dx = \frac{x^{\alpha + 1}}{\alpha + 1} \Big|_a^b \]
for each $\alpha \ge 0$ and for each $\alpha < 0$, provided that $\alpha \ne -1$ and $[a, b]$ does not contain 0 (see Exercise 8, p. 94, and Exercise 5e). This result is sometimes called the Power Rule. These observations can be used to evaluate many integrals.
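5.30 Examples. (i) Find $\int_0^1 (3x - 2)^2\,dx$.
(ii) Find $\int_0^{\pi/2} (1 + \sin x)\,dx$.
SOLUTION. (i) Since $(3x - 2)^2 = 9x^2 - 12x + 4$, we have by the Power Rule that
\[ \int_0^1 (3x - 2)^2\,dx = \left( 3x^3 - 6x^2 + 4x \right) \Big|_0^1 = 1. \]
(ii) Since $(\cos x)' = -\sin x$, we have by the Fundamental Theorem of Calculus that
\[ \int_0^{\pi/2} (1 + \sin x)\,dx = (x - \cos x) \Big|_0^{\pi/2} = \frac{\pi}{2} + 1. \]
•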
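The two values in Examples 5.30 can be double-checked by machine. This is a sketch under stated assumptions: part (i) evaluates the antiderivative $3x^3 - 6x^2 + 4x$ in exact rational arithmetic, while part (ii) is only approximated by a midpoint Riemann sum; the helper names are introduced here for illustration.

```python
import math
from fractions import Fraction

def antiderivative(x):
    """3x^3 - 6x^2 + 4x, obtained termwise from the Power Rule."""
    return 3 * x ** 3 - 6 * x ** 2 + 4 * x

def midpoint_sum(f, a, b, n=10_000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) for j in range(n)) * h

# (i) exact evaluation between the limits 0 and 1
exact_i = antiderivative(Fraction(1)) - antiderivative(Fraction(0))

# (ii) numerical approximation, compared with pi/2 + 1
approx_ii = midpoint_sum(lambda x: 1 + math.sin(x), 0.0, math.pi / 2)

print(exact_i)
print(approx_ii, math.pi / 2 + 1)
```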
Combining the Product Rule and the Fundamental Theorem of Calculus, we have another tool for evaluating integrals.
5.31 THEOREM [INTEGRATION BY PARTS]. Suppose that $f, g$ are differentiable on $[a, b]$ with $f', g'$ integrable on $[a, b]$. Then
\[ \int_a^b f'(x)g(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b f(x)g'(x)\,dx. \]
PROOF. By the Product Rule, $(f(x)g(x))' = f'(x)g(x) + f(x)g'(x)$ for $x \in [a, b]$. Since $f, g$ are continuous on $[a, b]$ and $f', g'$ are integrable on $[a, b]$, it follows that $(fg)'$ is a sum of products of integrable functions, hence, integrable on $[a, b]$. Thus, by the Fundamental Theorem of Calculus,
\[ f(b)g(b) - f(a)g(a) = \int_a^b f'(x)g(x)\,dx + \int_a^b f(x)g'(x)\,dx. \]
•
This rule is sometimes abbreviated as
\[ \int u\,dv = uv - \int v\,du, \]
where it is understood that if $w = h(x)$ for some differentiable function $h$, then the Leibnizian differential $dw$ is defined by $dw = h'(x)\,dx$. Integration by parts can be used to reduce the exponent $n$ on an expression of the form $(ax + b)^n f(x)$ when $f$ is integrable.
5.32 Example. Find $\int_0^{\pi/2} x \sin x\,dx$.
SOLUTION. Let $u = x$ and $dv = \sin x\,dx$. Then $du = dx$ and $v = -\cos x$. Hence, by parts,
\[ \int_0^{\pi/2} x \sin x\,dx = -x \cos x \Big|_0^{\pi/2} - \int_0^{\pi/2} (-\cos x)\,dx = \sin x \Big|_0^{\pi/2} = 1. \]
•
Integration by parts is also very effective on integrals involving products of polynomials and logarithms.
5.33 Example. Find $\int_1^3 \log x\,dx$.
SOLUTION. Let $u = \log x$ and $dv = dx$. Then $du = dx/x$ and $v = x$. Hence, by parts,
\[ \int_1^3 \log x\,dx = x \log x \Big|_1^3 - \int_1^3 dx = 3 \log 3 - 2. \]
•
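The values obtained by parts in Examples 5.32 and 5.33 can be compared with direct numerical approximations. The sketch below uses a midpoint Riemann sum; `midpoint_sum` and its parameter `n` are illustrative choices, not part of the text, and `math.log` is the natural logarithm, matching the text's $\log$.

```python
import math

def midpoint_sum(f, a, b, n=10_000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) for j in range(n)) * h

# Example 5.32: integral of x sin x over [0, pi/2], expected to be near 1.
approx_532 = midpoint_sum(lambda x: x * math.sin(x), 0.0, math.pi / 2)

# Example 5.33: integral of log x over [1, 3], expected near 3 log 3 - 2.
approx_533 = midpoint_sum(math.log, 1.0, 3.0)

print(approx_532, 1.0)
print(approx_533, 3 * math.log(3) - 2)
```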
Complicated problems can frequently be reduced to simpler ones by changing variables. The following result shows how to change variables in a Riemann integral on $\mathbf{R}$.
5.34 THEOREM [CHANGE OF VARIABLES]. Let $\phi$ be continuously differentiable on a closed, nondegenerate interval $[a, b]$. If
(12) $f$ is continuous on $\phi([a, b])$,
or if
(13) $\phi$ is strictly increasing on $[a, b]$ and $f$ is integrable on $[\phi(a), \phi(b)]$,
then
\[ (14) \qquad \int_{\phi(a)}^{\phi(b)} f(t)\,dt = \int_a^b f(\phi(x))\phi'(x)\,dx. \]
PROOF. Suppose first that (12) holds. Set
\[ G(x) := \int_a^x f(\phi(t))\phi'(t)\,dt, \qquad x \in [a, b], \]
and
\[ F(u) := \int_{\phi(a)}^{u} f(t)\,dt, \qquad u \in \phi([a, b]), \]
and observe that if $m$ is the infimum of $\phi([a, b])$, then $F(u) = \int_m^u f(t)\,dt - \int_m^{\phi(a)} f(t)\,dt$. It follows from the Fundamental Theorem of Calculus that $G'(x) = f(\phi(x))\phi'(x)$ and $F'(u) = f(u)$. Hence, by the Chain Rule,
\[ \frac{d}{dx}(G(x) - F(\phi(x))) = 0 \]
for all $x \in [a, b]$. It follows from Theorem 4.24ii that $G(x) - F(\phi(x))$ is constant on $[a, b]$. Evaluation at $x = a$ shows that this constant is zero. Thus $G(x) = F(\phi(x))$ for all $x \in [a, b]$, in particular, when $x = b$. This proves (14) under hypothesis (12).
The theorem is more difficult to prove under hypotheses (13), but the idea behind the proof is simple.
STRATEGY: Suppose that $P = \{x_0, x_1, \ldots, x_n\}$ is a partition of $[a, b]$. Since $\phi$ is increasing, $\{\phi(x_0), \ldots, \phi(x_n)\}$ is a partition of $I := [\phi(a), \phi(b)]$. A typical Riemann sum of the left side of (14) with respect to this partition is $\sum_{j=1}^{n} f(u_j)(\phi(x_j) - \phi(x_{j-1}))$. But $\phi$ is continuously differentiable, so we can use the Intermediate Value Theorem to choose $s_j \in [x_{j-1}, x_j]$ such that $u_j = \phi(s_j)$, and the Mean Value Theorem to choose $c_j \in [x_{j-1}, x_j]$ such that $\phi(x_j) - \phi(x_{j-1}) = \phi'(c_j)(x_j - x_{j-1})$. Hence, the Riemann sum above can be written as
\[ (15) \qquad S := \sum_{j=1}^{n} f(u_j)(\phi(x_j) - \phi(x_{j-1})) = \sum_{j=1}^{n} f(\phi(s_j))\phi'(c_j)(x_j - x_{j-1}). \]
If we replaced $c_j$ in this last sum by $s_j$, the right side of (15) would be exactly a Riemann sum of the right side of (14). Since $c_j, s_j$ both belong to the interval $[x_{j-1}, x_j]$ and $\phi'$ is continuous, making this replacement should not change $S$ much if the norm of $P$ is small enough. Hence, a Riemann sum of the left side of (14) is approximately equal to a Riemann sum of the right side of (14). This means that the integrals in (14) should be equal. Here are the details.
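Let $\varepsilon > 0$. Since $f$ is bounded, there is an $M \in (0, \infty)$ such that $|f(x)| \le M$ for all $x \in I$. Since $\phi'$ is uniformly continuous on $[a, b]$, choose $\delta > 0$ such that
\[ |\phi'(s_j) - \phi'(c_j)| < \frac{\varepsilon}{2M(b - a)}, \]
i.e.,
\[ (16) \qquad M|\phi'(s_j) - \phi'(c_j)| < \frac{\varepsilon}{2(b - a)} \]
for all $s_j, c_j \in [a, b]$ with $|s_j - c_j| < \delta$. Next, notice by the Inverse Function Theorem (Theorem 4.26) that $\phi^{-1}$ is continuously differentiable and strictly increasing on $I$. This has two consequences. There is an $\eta > 0$ such that if $s, c \in I$ and $|s - c| < \eta$, then $|\phi^{-1}(s) - \phi^{-1}(c)| < \delta$. And, if $\{t_0, \ldots, t_n\}$ is a partition of $I$, then $\{\phi^{-1}(t_0), \ldots, \phi^{-1}(t_n)\}$ is a partition of $[a, b]$.
Since $f$ is integrable on $I = [\phi(a), \phi(b)]$, choose a partition $P = \{t_0, t_1, \ldots, t_n\}$ of $I$ such that $\|P\| < \eta$ and
\[ (17) \qquad \left| \sum_{j=1}^{n} f(u_j)(t_j - t_{j-1}) - \int_{\phi(a)}^{\phi(b)} f(t)\,dt \right| < \frac{\varepsilon}{2} \]
holds for any choice of $u_j \in [t_{j-1}, t_j]$. Set $x_j = \phi^{-1}(t_j)$ and observe (by the choice of $\eta$) that $\widetilde{P} = \{x_0, \ldots, x_n\}$ is a partition of $[a, b]$ that satisfies $\|\widetilde{P}\| < \delta$.
To compare a Riemann sum of the right side of (14) to the integral on the left side of (14), let $s_j \in [x_{j-1}, x_j]$, set $u_j = \phi(s_j)$, and apply the Mean Value Theorem to choose $c_j \in [x_{j-1}, x_j]$ such that $\phi(x_j) - \phi(x_{j-1}) = \phi'(c_j)(x_j - x_{j-1})$. Then, by the choices of $c_j$, $u_j$, and $t_j$, we have $u_j \in [t_{j-1}, t_j]$ and
\[ f(u_j)(t_j - t_{j-1}) = f(\phi(s_j))\phi'(c_j)(x_j - x_{j-1}). \]
Hence, it follows from (16) and (17) that
\[ \begin{aligned} \left| \sum_{j=1}^{n} f(\phi(s_j))\phi'(s_j)(x_j - x_{j-1}) - \int_{\phi(a)}^{\phi(b)} f(t)\,dt \right| &\le \left| \sum_{j=1}^{n} f(\phi(s_j))(\phi'(s_j) - \phi'(c_j))(x_j - x_{j-1}) \right| + \left| \sum_{j=1}^{n} f(u_j)(t_j - t_{j-1}) - \int_{\phi(a)}^{\phi(b)} f(t)\,dt \right| \\ &< \frac{\varepsilon}{2(b - a)} \sum_{j=1}^{n} (x_j - x_{j-1}) + \frac{\varepsilon}{2} = \varepsilon. \end{aligned} \]
We obtained this estimate for the fixed partition $\widetilde{P}$ of $[a, b]$, but the same steps also verify this estimate for any partition finer than $\widetilde{P}$. We conclude by Theorem 5.18 that $(f \circ \phi) \cdot \phi'$ is integrable on $[a, b]$ and (14) holds. •
The Change-of-Variables Formula can be remembered as a substitution if we use the Leibnizian differentials introduced above: $t = \phi(x)$ implies that $dt = \phi'(x)\,dx$. The following example illustrates a typical application of the Change-of-Variables Formula.
5.35 Example. Find
\[ \int_0^1 \frac{e^{\sqrt{x + 1}}}{\sqrt{x + 1}}\,dx. \]
SOLUTION. Let $t = \sqrt{x + 1}$ and observe that $dt = dx/(2\sqrt{x + 1})$. Therefore,
\[ \int_0^1 \frac{e^{\sqrt{x + 1}}}{\sqrt{x + 1}}\,dx = 2 \int_1^{\sqrt{2}} e^t\,dt = 2\left(e^{\sqrt{2}} - e\right). \]
•
Please notice that when changing variables, you must also change the limits of integration, e.g., from $x = 0$ to $t = \sqrt{0 + 1} = 1$. It is interesting to note that hypothesis (12) does not require that $\phi$ be 1-1. This observation is used in the following example.
5.36 Example. Evaluate
\[ \int_{-1}^{1} x f(x^2)\,dx \]
for any $f$ continuous on $[0, 1]$.
SOLUTION. Let $\phi(x) = x^2$ and observe that $f$ is continuous on $\phi([-1, 1]) = [0, 1]$. Hence, by Theorem 5.34,
\[ \int_{-1}^{1} x f(x^2)\,dx = \frac{1}{2} \int_{-1}^{1} f(\phi(x))\phi'(x)\,dx = \frac{1}{2} \int_{1}^{1} f(t)\,dt = 0. \]
•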
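The substitutions in Examples 5.35 and 5.36 can be sanity-checked numerically. The sketch below is illustrative only: it approximates the original integrals by midpoint Riemann sums (`midpoint_sum` and `n` are choices made here) and, for Example 5.36, picks $f = \cos$ as one sample continuous function on $[0, 1]$.

```python
import math

def midpoint_sum(f, a, b, n=10_000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) for j in range(n)) * h

# Example 5.35: the integral in x, compared with 2 * (e^sqrt(2) - e),
# the value obtained after substituting t = sqrt(x + 1).
lhs_535 = midpoint_sum(
    lambda x: math.exp(math.sqrt(x + 1)) / math.sqrt(x + 1), 0.0, 1.0)
rhs_535 = 2 * (math.exp(math.sqrt(2)) - math.e)

# Example 5.36 with the sample choice f = cos: the integrand x * f(x^2)
# is odd, and the change of variables collapses the limits to 1 and 1.
val_536 = midpoint_sum(lambda x: x * math.cos(x * x), -1.0, 1.0)

print(lhs_535, rhs_535)
print(val_536)
```

In the second check $\phi(x) = x^2$ is not 1-1 on $[-1, 1]$, which is allowed under hypothesis (12).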
EXERCISES 1. Compute each of the following integrals.
(a)
1 3
Ix2
+X
-
21 dx.
-3
(b)
/, 1
Vi -
4
r;;;
1d X.
yX
(c)
(d)
l
e
xlogxdx.
f(t) dt =
o. •
(e) 4x 2
f)
-
2
4x + 1
x -x+3
dx.
2. Use the First Mean Value Theorem for Integrals to prove the following version of the Mean Value Theorem for Derivatives. If fECI [a, b]' then there is an Xo E [a, b] such that
f(b) - f(a) 3. (a) If f
: [0,00)
--+
--+
1
x2
t
--+
f(t)dt.
R is continuous, find
: it (c) If 9 : R
(b - a)f'(xo).
R is continuous, find
d~ (b) If h : R
=
h(x) dx.
cost
R is continuous, find d
t
dt Jo g(x - t) dx. (d) If f(x)
x3
= fo
2
et dt, show that
61 [!].
1
x 2 f(x) dx -
211
ex2 dx = 1 - e.
This exercise is used in Sections 5.4 and 6.1. Define L : (0,00)
L(x) =
t
J1
--+
R by
dt. t
(a) Prove that L is differentiable and strictly increasing on (0,00), with L'(x) = l/x and L(l) = o. (b) Prove that L(x) --+ 00 as x --+ 00 and L(x) --+ -00 as x --+ 0+. (You may wish to prove that
for all n EN.) (c) Using the fact that (x q ), = qx q - 1 for x > 0 and q E Q (see Exercise 8, p. 94), prove that L(x q ) = qL(x) for all q E Q and x> o. (d) Prove that L(xy) = L(x) + L(y) for all x, y E (0,00). (e) Let e = limn -+ oo (l + l/n)n. Use I'Hopital's Rule to show that L(e) = l. (L(x) is the natural logarithm function log x.)
5. This exercise was used in Section 4.3. Let $E = L^{-1}$, where $L$ is defined
in Exercise 4. (a) Use the Inverse Function Theorem to show that E is differentiable and strictly increasing on R with E'(x) = E(x), E(O) = 1, and E(I) = e. (b) Prove that E(x) -7 00 as x -7 00 and E(x) -70 as x -7 -00. (c) Prove that E(xq) = (E(x))q and E(q) = eq for all q E Q and x E R. (d) Prove that E(x + y) = E(x)E(y) for all x, y E R. (e) For each a E R define eC> = E(a). Let x > 0 and define xC> = ec>!ogx := E(aL(x)). Prove that 0 < x < y implies xC> < yC> for a > 0 and xC> > yC> for a < O. Also prove that
for all $\alpha, \beta \in \mathbf{R}$ and $x > 0$.
6. (a) Suppose that $g$ is integrable and nonnegative on $[1,3]$ with $\int_1^3 g(x)\,dx = 1$. Prove that
$$\frac{1}{\pi}\int_1^9 g(\sqrt{x})\,dx < 2.$$
(b) Suppose that $h$ is integrable and nonnegative on $[1,11]$ with $\int_1^{11} h(x)\,dx = 3$. Prove that
7. Suppose that $f : [a,b] \to \mathbf{R}$ is continuously differentiable and 1-1 on $[a,b]$. Prove that
$$\int_a^b f(x)\,dx + \int_{f(a)}^{f(b)} f^{-1}(x)\,dx = bf(b) - af(a).$$
8. If $f$ is continuous on $[a,b]$ and there exist numbers $\alpha \ne \beta$ such that
$$\alpha\int_a^c f(x)\,dx + \beta\int_c^b f(x)\,dx = 0$$
holds for all $c \in (a,b)$, prove that $f(x) = 0$ for all $x \in [a,b]$.
9. Let $0 \le x \le \pi/2$.
(a) Use $0 \le \cos x \le 1$ and the Comparison Theorem for integrals to prove that $0 \le \sin x \le x$.
(b) For each nonnegative integer $m$, set
and
Prove$^1$ that
$$s_{2n+1}(x) \le \sin x \le s_{2n}(x), \qquad s_{2n+1}(x) \le \sin x \le s_{2n+2}(x),$$
$$c_{2n+1}(x) \le \cos x \le c_{2n}(x), \qquad\text{and}\qquad c_{2n+1}(x) \le \cos x \le c_{2n+2}(x)$$
hold for $n = 0, 1, 2, \ldots$.
10. Suppose that $g$ is differentiable on $[a,b]$ and $g'$ is integrable on $[a,b]$.
(a) Prove that if $f$ is continuously differentiable and increasing on $[a,b]$ and $g$ is positive on $(a,b)$ with $g(b) = g(a) = 0$, then
$$\int_a^b f(x)g'(x)\,dx = 0$$
if and only if $f$ is constant on $[a,b]$.
(b) Show that (a) is false if "$g$ is positive on $(a,b)$" is replaced by "$g$ is nonnegative on $[a,b]$ and positive on some subinterval of $(a,b)$."
11. This exercise is used in Section 12.4. Suppose that $\phi$ is continuously differentiable on an interval $[a,b]$ with $\phi'(x) \ne 0$ for all $x \in [a,b]$. Prove that if $[c,d] = \phi([a,b])$ and $f$ is integrable on $[c,d]$, then
$$\int_c^d f(t)\,dt = \int_a^b f(\phi(x))\,|\phi'(x)|\,dx.$$
5.4 IMPROPER RIEMANN INTEGRATION
To extend the Riemann integral to unbounded intervals or unbounded functions, we begin with an elementary observation.
5.37 Remark. If $f$ is integrable on $[a,b]$, then
$$\int_a^b f(x)\,dx = \lim_{c\to a+}\Bigl(\lim_{d\to b-}\int_c^d f(x)\,dx\Bigr).$$
PROOF. By Theorem 5.26,
$$F(x) = \int_a^x f(t)\,dt$$
is continuous on $[a,b]$. Thus
$$\int_a^b f(x)\,dx = F(b) - F(a) = \lim_{c\to a+}\Bigl(\lim_{d\to b-}\bigl(F(d) - F(c)\bigr)\Bigr) = \lim_{c\to a+}\Bigl(\lim_{d\to b-}\int_c^d f(x)\,dx\Bigr). \qquad ■$$
This leads to the following generalization of the Riemann integral.
$^1$This exercise is due to Deng Bo ("A Simple Derivation of the Maclaurin Series for Sine and Cosine," American Mathematical Monthly, vol. 97 (1990), 836). See also Exercise 3, p. 172.
5.38 DEFINITION. Let $(a,b)$ be a nonempty, open (possibly unbounded) interval and $f : (a,b) \to \mathbf{R}$.
(i) $f$ is said to be locally integrable on $(a,b)$ if and only if $f$ is integrable on each closed subinterval $[c,d]$ of $(a,b)$.
(ii) $f$ is said to be improperly integrable on $(a,b)$ if and only if $f$ is locally integrable on $(a,b)$ and
$$\int_a^b f(x)\,dx := \lim_{c\to a+}\Bigl(\lim_{d\to b-}\int_c^d f(x)\,dx\Bigr) \tag{18}$$
exists and is finite. This limit is called the improper (Riemann) integral of $f$ over $(a,b)$.
5.39 Remark. The order of the limits in (18) does not matter. In particular, if the limit in (18) exists, then
$$\int_a^b f(x)\,dx = \lim_{d\to b-}\Bigl(\lim_{c\to a+}\int_c^d f(x)\,dx\Bigr).$$
PROOF. Let $x_0 \in (a,b)$ be fixed. By Theorems 5.20 and 3.8,
$$\lim_{c\to a+}\Bigl(\lim_{d\to b-}\int_c^d f(x)\,dx\Bigr) = \lim_{c\to a+}\Bigl(\int_c^{x_0} f(x)\,dx + \lim_{d\to b-}\int_{x_0}^d f(x)\,dx\Bigr)$$
$$= \lim_{c\to a+}\int_c^{x_0} f(x)\,dx + \lim_{d\to b-}\int_{x_0}^d f(x)\,dx = \lim_{d\to b-}\Bigl(\lim_{c\to a+}\int_c^d f(x)\,dx\Bigr). \qquad ■$$
Thus we shall use the notation
$$\lim_{\substack{c\to a+\\ d\to b-}}\int_c^d f(x)\,dx$$
to represent the limit in (18). If the integral is not improper at one of the endpoints, e.g., if $f$ is Riemann integrable on closed subintervals of $(a,b]$, we shall say that $f$ is improperly integrable on $(a,b]$ and simplify the notation even further by writing
$$\int_a^b f(x)\,dx = \lim_{c\to a+}\int_c^b f(x)\,dx.$$
The following example shows that an improperly integrable function need not be bounded.
5.40 Example. Show that $f(x) = 1/\sqrt{x}$ is improperly integrable on $(0,1]$.
SOLUTION. By definition,
$$\int_0^1 \frac{1}{\sqrt{x}}\,dx = \lim_{a\to 0+}\int_a^1 \frac{1}{\sqrt{x}}\,dx = \lim_{a\to 0+}(2 - 2\sqrt{a}) = 2. \qquad ■$$
The following example shows that a function can be improperly integrable on an unbounded interval.
5.41 Example. Show that $f(x) = 1/x^2$ is improperly integrable on $[1,\infty)$.
SOLUTION. By definition,
$$\int_1^\infty \frac{1}{x^2}\,dx = \lim_{d\to\infty}\int_1^d \frac{1}{x^2}\,dx = \lim_{d\to\infty}\Bigl(1 - \frac{1}{d}\Bigr) = 1. \qquad ■$$
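The limits in Examples 5.40 and 5.41 can be watched numerically. The following is only a sketch: the helper `midpoint` (a composite midpoint rule) and the sample endpoints are illustrative choices, not part of the text, and the printed values are approximations to the truncated integrals, compared against the closed forms $2 - 2\sqrt{a}$ and $1 - 1/d$ from the two solutions.

```python
import math

def midpoint(f, a, b, n=200_000):
    """Composite midpoint rule for the Riemann integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Example 5.40: integrals of 1/sqrt(x) over [a, 1] as a -> 0+.
for a in (1e-2, 1e-3, 1e-4):
    approx = midpoint(lambda x: 1 / math.sqrt(x), a, 1.0)
    print(f"a = {a:g}: numeric {approx:.6f}, closed form {2 - 2 * math.sqrt(a):.6f}")

# Example 5.41: integrals of 1/x^2 over [1, d] as d -> infinity.
for d in (10.0, 100.0, 1000.0):
    approx = midpoint(lambda x: 1 / x**2, 1.0, d)
    print(f"d = {d:g}: numeric {approx:.6f}, closed form {1 - 1 / d:.6f}")
```

Each Riemann integral here is taken over a closed interval on which the integrand is bounded; the improper integral is the limit of these values, which is why the unbounded integrand in Example 5.40 causes no difficulty.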
Because an improper integral is a limit of Riemann integrals, many of the results we proved earlier in this chapter have analogues for the improper integral. The next two results illustrate this principle.
5.42 THEOREM. If $f, g$ are improperly integrable on $(a,b)$ and $\alpha, \beta \in \mathbf{R}$, then $\alpha f + \beta g$ is improperly integrable on $(a,b)$ and
$$\int_a^b (\alpha f(x) + \beta g(x))\,dx = \alpha\int_a^b f(x)\,dx + \beta\int_a^b g(x)\,dx.$$
PROOF. By Theorem 5.19 (the Linear Property for Riemann Integrals),
$$\int_c^d (\alpha f(x) + \beta g(x))\,dx = \alpha\int_c^d f(x)\,dx + \beta\int_c^d g(x)\,dx$$
for all $a < c < d < b$. Taking the limit as $c \to a+$ and $d \to b-$ finishes the proof. ■
5.43 THEOREM [COMPARISON THEOREM FOR IMPROPER INTEGRALS]. Suppose that $f, g$ are locally integrable on $(a,b)$. If $0 \le f(x) \le g(x)$ for $x \in (a,b)$, and $g$ is improperly integrable on $(a,b)$, then $f$ is improperly integrable on $(a,b)$ and
$$\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx.$$
PROOF. Fix $c \in (a,b)$. Let $F(d) = \int_c^d f(x)\,dx$ and $G(d) = \int_c^d g(x)\,dx$ for $d \in [c,b)$. By the Comparison Theorem for Integrals, $F(d) \le G(d)$. Since $f \ge 0$, the function $F$ is increasing on $[c,b)$ and bounded above by $G(b-)$, hence $F(b-)$ exists. Thus, by definition, $f$ is improperly integrable on $(c,b)$ and
$$\int_c^b f(x)\,dx = F(b-) \le G(b-) = \int_c^b g(x)\,dx.$$
A similar argument works for the case $c \to a+$. ■
This test is frequently used in conjunction with the following inequalities: $|\sin x| \le |x|$ for all $x \in \mathbf{R}$ (see Appendix B); for every $\alpha > 0$ there exists a constant $B_\alpha > 1$ such that $|\log x| \le x^\alpha$ for all $x \ge B_\alpha$ (see Exercise 4, p. 101). Here are two typical examples.
5.44 Example. Prove that $f(x) = |\sin x/\sqrt{x^3}|$ is improperly integrable on $(0,1]$.
PROOF. Since $0 \le f(x) = |\sin x/\sqrt{x^3}| \le |x|/x^{3/2} = 1/\sqrt{x}$ on $(0,1]$, and this last function is improperly integrable on $(0,1]$ by Example 5.40, it follows from the Comparison Test that $f(x)$ is improperly integrable on $(0,1]$. ■
5.45 Example. Prove that $f(x) = \log x/\sqrt{x^5}$ is improperly integrable on $[1,\infty)$.
PROOF. Since $f$ is continuous on $(0,\infty)$, $f$ is integrable on $[1,C]$ for any $C \in \mathbf{R}$. Since $0 \le f(x) = \log x/\sqrt{x^5} \le x^{1/2}/x^{5/2} = 1/x^2$ for $x \ge C := B_{1/2}$, and this last function is improperly integrable on $[1,\infty)$ by Example 5.41, it follows from the Comparison Test that $f(x)$ is improperly integrable on $[1,\infty)$. ■
Although improperly integrable functions are not closed under multiplication (see Exercise 5), the Comparison Theorem can be used to show that some kinds of products are improperly integrable.
5.46 Remark. If $f$ is bounded and locally integrable on $(a,b)$ and $|g|$ is improperly integrable on $(a,b)$, then $|fg|$ is improperly integrable on $(a,b)$.
PROOF. Let $M = \sup_{x\in(a,b)} |f(x)|$. Then $0 \le |f(x)g(x)| \le M|g(x)|$ for all $x \in (a,b)$. Hence, by Theorem 5.43, $|fg|$ is improperly integrable on $(a,b)$. ■
For the Riemann integral, we proved that $|f|$ is integrable when $f$ is (see Theorem 5.22). This is not the case for the improper integral (see Example 5.49). For this reason we introduce the following concepts.
5.47 DEFINITION. Let $(a,b)$ be a nonempty, open interval and $f : (a,b) \to \mathbf{R}$.
(i) $f$ is said to be absolutely integrable on $(a,b)$ if and only if $|f|$ is improperly integrable on $(a,b)$.
(ii) $f$ is said to be conditionally integrable on $(a,b)$ if and only if $f$ is improperly integrable but not absolutely integrable on $(a,b)$.
The following result, an analogue of Theorem 5.22 for absolutely integrable functions, shows that absolute integrability implies improper integrability.
5.48 THEOREM. If $f$ is absolutely integrable on $(a,b)$, then $f$ is improperly integrable on $(a,b)$ and
$$\Bigl|\int_a^b f(x)\,dx\Bigr| \le \int_a^b |f(x)|\,dx.$$
PROOF. Since $0 \le |f(x)| + f(x) \le 2|f(x)|$, we have by Theorem 5.43 that $|f| + f$ is improperly integrable on $(a,b)$. Hence, by Theorem 5.42, so is $f = (|f| + f) - |f|$.
Moreover,
$$\Bigl|\int_c^d f(x)\,dx\Bigr| \le \int_c^d |f(x)|\,dx$$
for every $a < c < d < b$. We finish the proof by taking the limit of this last inequality as $c \to a+$ and $d \to b-$. ■
The converse of Theorem 5.48, however, is false.
5.49 Example. Prove that the function $\sin x/x$ is conditionally integrable on $[1,\infty)$.
PROOF. Integrating by parts, we have
$$\int_1^d \frac{\sin x}{x}\,dx = -\frac{\cos x}{x}\Bigr|_1^d - \int_1^d \frac{\cos x}{x^2}\,dx = \cos(1) - \frac{\cos d}{d} - \int_1^d \frac{\cos x}{x^2}\,dx.$$
Since $1/x^2$ is absolutely integrable on $[1,\infty)$, it follows from Remark 5.46 that $\cos x/x^2$ is absolutely integrable on $[1,\infty)$. Therefore, $\sin x/x$ is improperly integrable on $[1,\infty)$ and
$$\int_1^\infty \frac{\sin x}{x}\,dx = \cos(1) - \int_1^\infty \frac{\cos x}{x^2}\,dx.$$
To show that $\sin x/x$ is not absolutely integrable on $[1,\infty)$, notice that
$$\int_1^{n\pi} \frac{|\sin x|}{x}\,dx \ge \sum_{k=2}^n \int_{(k-1)\pi}^{k\pi} \frac{|\sin x|}{x}\,dx \ge \sum_{k=2}^n \frac{1}{k\pi}\int_{(k-1)\pi}^{k\pi} |\sin x|\,dx = \sum_{k=2}^n \frac{2}{k\pi} = \frac{2}{\pi}\sum_{k=2}^n \frac{1}{k}$$
for each $n \in \mathbf{N}$. Since
$$\sum_{k=2}^n \frac{1}{k} \ge \sum_{k=2}^n \int_k^{k+1} \frac{1}{x}\,dx = \int_2^{n+1} \frac{1}{x}\,dx = \log(n+1) - \log 2 \to \infty$$
as $n \to \infty$, it follows from the Squeeze Theorem that
$$\lim_{n\to\infty} \int_1^{n\pi} \frac{|\sin x|}{x}\,dx = \infty.$$
Thus, $\sin x/x$ is not absolutely integrable on $[1,\infty)$. ■
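The contrast in Example 5.49 shows up clearly in a numerical experiment. The sketch below is an illustration under stated assumptions: `midpoint` and `integral_up_to` are hypothetical helper names, and the interval $[1, n\pi]$ is split at the multiples of $\pi$ so that $|\sin x|$ is smooth on each piece. It prints the signed integral, the integral of the absolute value, and the lower bound $(2/\pi)\sum_{k=2}^n 1/k$ from the proof.

```python
import math

def midpoint(f, a, b, n=400):
    """Composite midpoint rule for the Riemann integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def integral_up_to(f, n):
    """Approximate the integral of f over [1, n*pi], split at multiples of pi."""
    cuts = [1.0] + [k * math.pi for k in range(1, n + 1)]
    return sum(midpoint(f, lo, hi) for lo, hi in zip(cuts, cuts[1:]))

def signed(x):
    return math.sin(x) / x

def absolute(x):
    return abs(math.sin(x)) / x

for n in (10, 100, 1000):
    lower = (2 / math.pi) * sum(1 / k for k in range(2, n + 1))
    print(n, integral_up_to(signed, n), integral_up_to(absolute, n), lower)
```

The first column of integrals settles down as $n$ grows, while the second keeps growing at least as fast as the harmonic lower bound, in line with conditional but not absolute integrability.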
EXERCISES 1. Evaluate the following improper integrals.
j001+X d 3 x. 1 x
(a)
1 --2 + 00
(b)
-00
1
1
x
dx.
(c)
l1f/2
(d)
10 -IX e-y'x dx.
o
cosx
~
sinx
dx.
00
2. For each of the following, find all values of $p \in \mathbf{R}$ for which $f$ is improperly integrable on $I$.
(a) $f(x) = 1/x^p$, $I = (1,\infty)$.
(b) $f(x) = 1/x^p$, $I = (0,1)$.
(c) $f(x) = 1/(x\log^p x)$, $I = (e,\infty)$.
(d) $f(x) = 1/(1 + x^p)$, $I = (0,\infty)$.
(e) $f(x) = \log^\alpha x/x^p$, where $\alpha > 0$ is fixed, and $I = (1,\infty)$.
3. Show that for each $p > 0$, $\sin x/x^p$ is improperly integrable on $[1,\infty)$ and $\cos x/\log^p x$ is improperly integrable on $(e,\infty)$.
4. Decide which of the following functions are improperly integrable on $I$.
(a) $f(x) = \sin x$, $I = (0,\infty)$.
(b) $f(x) = 1/x^2$, $I = [-1,1]$.
(c) $f(x) = x^{-1}\sin(x^{-1})$, $I = (1,\infty)$.
(d) $f(x) = \log x$, $I = (0,1)$.
(e) $f(x) = (1 - \cos x)/x^2$, $I = (0,\infty)$.
5. Use the examples provided by Exercise 2b to show that the product of two improperly integrable functions might not be improperly integrable.
6. Suppose that $f, g$ are nonnegative and locally integrable on $[a,b)$ and
$$L := \lim_{x\to b-} \frac{f(x)}{g(x)}$$
exists as an extended real number.
(a) Show that if $0 \le L < \infty$ and $g$ is improperly integrable on $[a,b)$, then so is $f$.
(b) Show that if $0 < L \le \infty$ and $g$ is not improperly integrable on $[a,b)$, then neither is $f$.
7. (a) Suppose that f is improperly integrable on [0,00). Prove that if L limx->oo f(x) exists, then L = O. (b) Let n::; x < n + 2- n , n E N f(x) = { otherwise.
~
Prove that $f$ is improperly integrable on $[0,\infty)$ but $\lim_{x\to\infty} f(x)$ does not exist.
8. Prove that if $f$ is absolutely integrable on $[1,\infty)$, then
$$\lim_{n\to\infty} \int_1^\infty f(x^n)\,dx = 0.$$
9. Assuming that $e = \lim_{n\to\infty} \sum_{k=0}^n 1/k!$ (see Example 7.47), prove that
$$\lim_{n\to\infty} \Bigl(\frac{1}{n!}\int_1^\infty x^n e^{-x}\,dx\Bigr) = 1.$$
10. (a) Prove that
$$\int_0^{\pi/2} e^{-a\sin x}\,dx \le \frac{2}{a}$$
for all $a > 0$.
(b) What happens if $\cos x$ replaces $\sin x$?
e5.5 FUNCTIONS OF BOUNDED VARIATION
This section uses no material from any other enrichment section.
In this section we study functions that do not wiggle too much. These functions, which play a prominent role in the theory of Fourier series (see Sections e14.3 and e14.4) and probability theory, are important tools for theoretical as well as applied mathematics.
Let $\phi : [a,b] \to \mathbf{R}$. To measure how much $\phi$ wiggles on an interval $[a,b]$, set
$$V(\phi,P) = \sum_{j=1}^n |\phi(x_j) - \phi(x_{j-1})| \tag{19}$$
for each partition $P = \{x_0, x_1, \ldots, x_n\}$ of $[a,b]$. The variation of $\phi$ is defined by
$$\operatorname{Var}(\phi) := \sup\{V(\phi,P) : P \text{ is a partition of } [a,b]\}.$$
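The sum $V(\phi,P)$ is easy to compute for a concrete partition, and doing so shows why a supremum over all partitions is needed. The following is a small sketch: `variation_sum` and `uniform_partition` are illustrative names, not notation from the text, and $\phi(x) = x^2$ on $[-1,1]$ is chosen only as a test case.

```python
def variation_sum(phi, partition):
    """V(phi, P): sum of |phi(x_j) - phi(x_{j-1})| over consecutive points of P."""
    return sum(abs(phi(right) - phi(left))
               for left, right in zip(partition, partition[1:]))

def uniform_partition(a, b, n):
    """The n + 1 equally spaced points a = x_0 < x_1 < ... < x_n = b."""
    return [a + (b - a) * j / n for j in range(n + 1)]

def square(x):
    return x * x

# V depends on the partition: partitions that skip the turning point 0
# miss part of the wiggle, so Var takes the supremum over all partitions.
for n in (1, 3, 8, 64):
    print(n, variation_sum(square, uniform_partition(-1.0, 1.0, n)))
```

For this $\phi$, a partition that contains the point $0$ already attains the value $2$, while the two-point partition $\{-1, 1\}$ gives $0$.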
5.50 DEFINITION. Let $[a,b]$ be a closed, nondegenerate interval and $\phi : [a,b] \to \mathbf{R}$. Then $\phi$ is said to be of bounded variation on $[a,b]$ if and only if $\operatorname{Var}(\phi) < \infty$.
The following three remarks show how the collection of functions of bounded variation is related to other collections of functions we have studied.
5.51 Remark. If $\phi \in C^1[a,b]$, then $\phi$ is of bounded variation on $[a,b]$. However, there exist functions of bounded variation that are not continuously differentiable.
PROOF. Let $P = \{x_0, x_1, \ldots, x_n\}$ be a partition of $[a,b]$. By the Extreme Value Theorem, there is an $M > 0$ such that $|\phi'(x)| \le M$ for all $x \in [a,b]$. Therefore, it follows from the Mean Value Theorem that for each $k$ between 1 and $n$ there is a point $c_k$ between $x_{k-1}$ and $x_k$ such that
$$|\phi(x_k) - \phi(x_{k-1})| = |\phi'(c_k)|\,(x_k - x_{k-1}) \le M(x_k - x_{k-1}).$$
By telescoping, we obtain $V(\phi,P) \le M(b-a)$ for any partition $P$ of $[a,b]$. Therefore, $\operatorname{Var}(\phi) \le M(b-a)$. On the other hand, $x^2\sin(1/x)$ is of bounded variation on $[0,1]$ (see Exercise 2) but does not belong to $C^1[0,1]$ (see Example 4.8). ■
5.52 Remark. If $\phi$ is monotone on $[a,b]$, then $\phi$ is of bounded variation on $[a,b]$. However, there exist functions of bounded variation that are not monotone.
PROOF. Let $\phi$ be increasing on $[a,b]$ and $P = \{x_0, x_1, \ldots, x_n\}$ be a partition of $[a,b]$. Then by telescoping,
$$\sum_{j=1}^n |\phi(x_j) - \phi(x_{j-1})| = \sum_{j=1}^n (\phi(x_j) - \phi(x_{j-1})) = \phi(x_n) - \phi(x_0) = \phi(b) - \phi(a) =: M < \infty.$$
Thus, $\operatorname{Var}(\phi) = M$. On the other hand, by Remark 5.51, $\phi(x) = x^2$ is of bounded variation on $[-1,1]$. ■
5.53 Remark. If $\phi$ is of bounded variation on $[a,b]$, then $\phi$ is bounded on $[a,b]$. However, there exist bounded functions that are not of bounded variation.
PROOF. Let $x \in [a,b]$ and note by definition that
$$|\phi(x) - \phi(a)| \le |\phi(x) - \phi(a)| + |\phi(b) - \phi(x)| \le \operatorname{Var}(\phi).$$
Hence, by the triangle inequality,
$$|\phi(x)| \le |\phi(a)| + \operatorname{Var}(\phi).$$
To find a bounded function that is not of bounded variation, consider
$$\phi(x) := \begin{cases} \sin(1/x) & x \ne 0 \\ 0 & x = 0. \end{cases}$$
Clearly, ¢ is bounded by 1. On the other hand, if Xj = {
0
j=O
2 (n - j)7r
0< j < n,
then
$$\sum_{j=1}^n |\phi(x_j) - \phi(x_{j-1})| = 2n \to \infty$$
as $n \to \infty$. Thus $\phi$ is not of bounded variation on $[0, 2/\pi]$. ■
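The unbounded growth of the variation sums of $\sin(1/x)$ can be checked directly. The sketch below uses a partition of its own choosing, namely $0$ together with the points $2/((2k+1)\pi)$ at which $\sin(1/x) = \pm 1$; this is an assumption made for illustration and is not necessarily the partition used in the proof above. The names `variation_sum` and `spike_partition` are likewise hypothetical.

```python
import math

def variation_sum(phi, partition):
    """V(phi, P): sum of |phi(x_j) - phi(x_{j-1})| over consecutive points of P."""
    return sum(abs(phi(right) - phi(left))
               for left, right in zip(partition, partition[1:]))

def phi(x):
    """sin(1/x) for x != 0, extended by the value 0 at x = 0."""
    return math.sin(1.0 / x) if x != 0 else 0.0

def spike_partition(n):
    """0 followed by the points 2/((2k+1)pi), k = n-1, ..., 0, in increasing order."""
    return [0.0] + [2.0 / ((2 * k + 1) * math.pi) for k in range(n - 1, -1, -1)]

for n in (5, 50, 500):
    print(n, variation_sum(phi, spike_partition(n)))
```

Consecutive nonzero points carry the values $+1$ and $-1$, so each of those steps contributes $2$ and the first step contributes $1$; the sums therefore grow like $2n - 1$ even though $|\phi| \le 1$ everywhere.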
The following result and Exercise 3 are partial answers to the question: Is the class of functions of bounded variation preserved by algebraic operations?
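5.54 THEOREM. If $\phi$ and $\psi$ are of bounded variation on a closed interval $[a,b]$, then so are $\phi + \psi$ and $\phi - \psi$.
PROOF. Let $a = x_0 < x_1 < \cdots < x_n = b$. Then
$$\sum_{j=1}^n |\phi(x_j) \pm \psi(x_j) - (\phi(x_{j-1}) \pm \psi(x_{j-1}))| \le \sum_{j=1}^n |\phi(x_j) - \phi(x_{j-1})| + \sum_{j=1}^n |\psi(x_j) - \psi(x_{j-1})| \le \operatorname{Var}(\phi) + \operatorname{Var}(\psi).$$
Therefore, $\operatorname{Var}(\phi \pm \psi) \le \operatorname{Var}(\phi) + \operatorname{Var}(\psi)$. ■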
It turns out that there is a close connection between functions of bounded variation and monotone functions (see Corollary 5.57 below). To make this connection clear, we introduce the following concept.
5.55 DEFINITION. Let $\phi$ be of bounded variation on a closed interval $[a,b]$. The total variation of $\phi$ is the function $\Phi$ defined on $[a,b]$ by
$$\Phi(x) := \sup\Bigl\{\sum_{j=1}^k |\phi(x_j) - \phi(x_{j-1})| : \{x_0, x_1, \ldots, x_k\} \text{ is a partition of } [a,x]\Bigr\}.$$
5.56 THEOREM. Let $\phi$ be of bounded variation on $[a,b]$ and $\Phi$ be its total variation. Then
(i) $|\phi(y) - \phi(x)| \le \Phi(y) - \Phi(x)$ for all $a \le x < y \le b$,
(ii) $\Phi$ and $\Phi - \phi$ are increasing on $[a,b]$, and
(iii) $\operatorname{Var}(\phi) \le \operatorname{Var}(\Phi)$.
PROOF. (i) Let $x < y$ belong to $[a,b]$ and $\{x_0, x_1, \ldots, x_k\}$ be a partition of $[a,x]$. Then $\{x_0, x_1, \ldots, x_k, y\}$ is a partition of $[a,y]$, and we have by Definition 5.55 that
$$\sum_{j=1}^k |\phi(x_j) - \phi(x_{j-1})| \le \sum_{j=1}^k |\phi(x_j) - \phi(x_{j-1})| + |\phi(y) - \phi(x)| \le \Phi(y).$$
Taking the supremum of this inequality over all partitions $\{x_0, x_1, \ldots, x_k\}$ of $[a,x]$, we obtain $\Phi(x) \le \Phi(x) + |\phi(y) - \phi(x)| \le \Phi(y)$.
(ii) By the Monotone Property of Suprema, $\Phi$ is increasing on $[a,b]$. To show that $\Phi - \phi$ is increasing, suppose that $a \le x < y \le b$. By part (i),
$$\phi(y) - \phi(x) \le |\phi(y) - \phi(x)| \le \Phi(y) - \Phi(x).$$
Therefore, $\Phi(x) - \phi(x) \le \Phi(y) - \phi(y)$.
(iii) Let $P = \{x_0, x_1, \ldots, x_n\}$ be a partition of $[a,b]$. By part (i) and Definition 5.50,
$$\sum_{j=1}^n |\phi(x_j) - \phi(x_{j-1})| \le \sum_{j=1}^n |\Phi(x_j) - \Phi(x_{j-1})| \le \operatorname{Var}(\Phi).$$
Taking the supremum of this inequality over all partitions $P$ of $[a,b]$, we obtain $\operatorname{Var}(\phi) \le \operatorname{Var}(\Phi)$. ■
5.57 COROLLARY. Let $[a,b]$ be a closed interval. Then $\phi$ is of bounded variation on $[a,b]$ if and only if there exist increasing functions $f, g$ on $[a,b]$ such that
$$\phi(x) = f(x) - g(x), \qquad x \in [a,b].$$
PROOF. Suppose that $\phi$ is of bounded variation, let $\Phi$ represent the total variation of $\phi$, $f = \Phi$, and $g = \Phi - \phi$. By Theorem 5.56, $f$ and $g$ are increasing, and by construction, $\phi = f - g$. Conversely, suppose that $\phi = f - g$ for some increasing $f, g$ on $[a,b]$. Then by Remark 5.52 and Theorem 5.54, $\phi$ is of bounded variation on $[a,b]$. ■
In particular, if $f$ is of bounded variation on $[a,b]$ then (i) $f(x+)$ exists for each $x \in [a,b)$ and $f(x-)$ exists for each $x \in (a,b]$ (see Lemma 4.28), (ii) $f$ has no more than countably many points of discontinuity in $[a,b]$ (see Theorem 4.29), and (iii) $f$ is integrable on $[a,b]$ (see Exercise 8, p. 116).
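The decomposition $\phi = \Phi - (\Phi - \phi)$ used in the proof of Corollary 5.57 can be imitated on a grid. The sketch below is a discrete stand-in only: the running sums over one fixed grid give a lower bound for the total variation $\Phi$ at the grid points, not $\Phi$ itself, and the test function $\phi(x) = x^3 - x$ on $[-1,1]$ and all names are illustrative assumptions. Exact rational arithmetic is used so that the monotonicity checks are not blurred by rounding.

```python
from fractions import Fraction
from itertools import accumulate

def phi(x):
    """A non-monotone test function on [-1, 1]."""
    return x**3 - x

# Grid -1, -9/10, ..., 1 in exact rational arithmetic.
grid = [Fraction(j, 10) - 1 for j in range(21)]

# Running sums of |phi(x_j) - phi(x_{j-1})|: a grid version of Phi.
steps = [abs(phi(b) - phi(a)) for a, b in zip(grid, grid[1:])]
total = [Fraction(0)] + list(accumulate(steps))

f_vals = total                                        # plays the role of f = Phi
g_vals = [t - phi(x) for t, x in zip(total, grid)]    # plays the role of g = Phi - phi

def is_increasing(values):
    """True if the list is nondecreasing, i.e., increasing in the sense of the text."""
    return all(u <= v for u, v in zip(values, values[1:]))

print(is_increasing(f_vals), is_increasing(g_vals))
print(all(f - g == phi(x) for f, g, x in zip(f_vals, g_vals, grid)))
```

Both checks succeed for the same reason as in Theorem 5.56(ii): each increment of the running sum is $|\Delta\phi|$, which dominates $\Delta\phi$.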
EXERCISES
1. (a) Show that $4k/(4k^2 - 1) > 1/k$ for $k \in \mathbf{N}$.
(b) Prove that
for $n \in \mathbf{N}$.
(c) Prove that
$$\phi(x) = \begin{cases} x^2 \sin\dfrac{1}{x^2} & x \ne 0 \\ 0 & x = 0 \end{cases}$$
is not of bounded variation on $[0,1]$.
2. (a) Show that $(8k^2 + 2)/(4k^2 - 1)^2 < 1/k^2$ for $k = 2, 3, \ldots$.
(b) Prove that
$$\sum_{k=1}^n \frac{1}{k^2} \le 1 + \sum_{k=1}^{n-1} \Bigl(\frac{1}{k} - \frac{1}{k+1}\Bigr) = 2 - \frac{1}{n}$$
for $n \in \mathbf{N}$.
(c) Prove that
$$\phi(x) = \begin{cases} x^2 \sin\dfrac{1}{x} & x \ne 0 \\ 0 & x = 0 \end{cases}$$
is of bounded variation on any bounded interval $[a,b]$.
3. This exercise is used in Section e14.3. Suppose that $\phi$ and $\psi$ are of bounded variation on a closed interval $[a,b]$.
(a) Prove that $\alpha\phi$ is of bounded variation on $[a,b]$ for every $\alpha \in \mathbf{R}$.
(b) Prove that $\phi\psi$ is of bounded variation on $[a,b]$.
(c) If there is an $\varepsilon_0 > 0$ such that
$$\phi(x) \ge \varepsilon_0, \qquad x \in [a,b],$$
prove that $1/\phi$ is of bounded variation on $[a,b]$.
4. Suppose that $\phi$ is of bounded variation on an interval $[a,b]$. Prove that $\phi$ is continuous on $(a,b)$ if and only if $\phi$ is uniformly continuous on $(a,b)$.
5. (a) If $\phi$ is continuous on a closed nondegenerate interval $[a,b]$, differentiable on $(a,b)$, and $\phi'$ is bounded on $(a,b)$, prove that $\phi$ is of bounded variation on $[a,b]$.
(b) Show that $\phi(x) = \sqrt[3]{x}$ is of bounded variation on $[-1,1]$ but $\phi'$ is unbounded at some point in $(-1,1)$.
6. Let $P$ be a polynomial of degree $N$.
(a) Show that $P$ is of bounded variation on any closed interval $[a,b]$.
(b) Obtain an estimate for $\operatorname{Var}(P)$ on $[a,b]$, using values of the derivative $P'(x)$ at no more than $N$ points.
7. Let $\phi$ be a function of bounded variation on $[a,b]$ and $\Phi$ be its total variation function. Prove that if $\Phi$ is continuous at some point $x_0 \in (a,b)$, then $\phi$ is continuous at $x_0$.
8. This exercise is used in Section e14.4. If $f$ is integrable on $[a,b]$ and
$$F(x) = \int_a^x f(t)\,dt,$$
prove that $F$ is of bounded variation on $[a,b]$.
9. Suppose that $f'$ exists and is integrable on $[a,b]$. Prove that $f$ is of bounded variation and
$$\operatorname{Var}(f) = \int_a^b |f'(x)|\,dx.$$
What happens to this result if $f'$ is bounded but not necessarily integrable?
[Figure 5.5]
e5.6 CONVEX FUNCTIONS
The last half page of this section uses Theorems 4.29 and 4.30, optional results from Section 4.4.
In this section we examine another collection of functions that is important for certain applications, especially for Fourier analysis, functional analysis, numerical analysis, and probability theory.
5.58 DEFINITION. Let $I$ be an interval and $f : I \to \mathbf{R}$.
(i) $f$ is said to be convex on $I$ if and only if
$$f(\alpha x + (1 - \alpha)y) \le \alpha f(x) + (1 - \alpha)f(y)$$
for all $0 \le \alpha \le 1$ and all $x, y \in I$.
(ii) $f$ is said to be concave on $I$ if and only if $-f$ is convex on $I$.
Notice that by definition, a function $f$ is convex on an interval $I$ if and only if $f$ is convex on every closed subinterval of $I$. It is easy to check that $f(x) = mx + b$ is both convex and concave on any interval (see also Exercise 3) but in general it is difficult to apply Definition 5.58 directly. For this reason, we include the following simple geometric characterizations of convexity.
5.59 Remark. Let $I$ be an interval and $f : I \to \mathbf{R}$. Then $f$ is convex on $I$ if and only if given any $[c,d] \subseteq I$, the chord through the points $(c, f(c))$, $(d, f(d))$ lies on or above the graph of $y = f(x)$ for all $x \in [c,d]$. (See Figure 5.5.)
[Figure 5.6]
PROOF. Suppose that $f$ is convex on $I$ and $x_0 \in [c,d]$. Choose $0 \le \alpha \le 1$ such that $x_0 = \alpha c + (1 - \alpha)d$. The chord from $(c, f(c))$ to $(d, f(d))$ has slope $(f(d) - f(c))/(d - c)$. Hence, the point on this chord which has the form $(x_0, y_0)$ must satisfy $y_0 = \alpha f(c) + (1 - \alpha)f(d)$. Since $f$ is convex, it follows that $f(x_0) \le y_0$; i.e., the point $(x_0, y_0)$ lies on or above the point $(x_0, f(x_0))$. A similar argument establishes the reverse implication. ■
Thus both $f(x) = |x|$ and $f(x) = x^2$ are convex on any interval.
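The inequality in Definition 5.58 can be spot-checked on sample points. The sketch below is only a numerical sanity check under stated assumptions: `violates_convexity`, the sample grid on $[-2,2]$, and the tolerance `tol` (which absorbs floating-point rounding) are all illustrative choices. Passing the check on finitely many points does not prove convexity, but a reported violation does disprove it.

```python
def violates_convexity(f, xs, alphas, tol=1e-12):
    """Return a triple (x, y, alpha) where the inequality of Definition 5.58
    fails by more than tol, or None if no sampled triple fails."""
    for x in xs:
        for y in xs:
            for a in alphas:
                lhs = f(a * x + (1 - a) * y)
                rhs = a * f(x) + (1 - a) * f(y)
                if lhs > rhs + tol:
                    return (x, y, a)
    return None

xs = [-2 + 0.25 * j for j in range(17)]      # sample points in [-2, 2]
alphas = [j / 10 for j in range(11)]         # sample weights in [0, 1]

print(violates_convexity(abs, xs, alphas))                 # |x|
print(violates_convexity(lambda x: x * x, xs, alphas))     # x^2
print(violates_convexity(lambda x: -x * x, xs, alphas))    # -x^2 is concave
```

The first two calls find no violation, consistent with the convexity of $|x|$ and $x^2$; the third returns a witness triple because $-x^2$ lies above its chords.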
5.60 Remark. A function $f$ is convex on a nonempty, open interval $(a,b)$ if and only if the slope of the chord always increases on $(a,b)$; i.e., $a < c < x < d < b$ implies
$$\frac{f(x) - f(c)}{x - c} \le \frac{f(d) - f(x)}{d - x}.$$
PROOF. Fix $a < c < x < d < b$ and let $\lambda(x)$ be the equation of the chord to $f$ through the points $(c, f(c))$ and $(d, f(d))$. If $f$ is convex, then $f(x) \le \lambda(x)$ (see Figure 5.6). Therefore,
$$\frac{f(x) - f(c)}{x - c} \le \frac{\lambda(x) - \lambda(c)}{x - c} = \frac{\lambda(d) - \lambda(x)}{d - x} \le \frac{f(d) - f(x)}{d - x}.$$
Conversely, if $f$ is not convex, then $\lambda(x) < f(x)$ for some $x \in (c,d)$. It follows that
$$\frac{f(x) - f(c)}{x - c} > \frac{\lambda(x) - \lambda(c)}{x - c} = \frac{\lambda(d) - \lambda(x)}{d - x} > \frac{f(d) - f(x)}{d - x}.$$
Therefore, the slope of the chord does not increase on $(a,b)$. ■
This leads us to a characterization of differentiable convex functions.
5.61 THEOREM. Suppose that $f$ is differentiable on a nonempty, open interval $I$. Then $f$ is convex on $I$ if and only if $f'$ is increasing on $I$.
PROOF. Suppose that $f$ is convex on $I =: (a,b)$ and that $c, d \in (a,b)$ satisfy $c < d$. Choose $h > 0$ so small that $c + h < d$ and $d + h < b$. Then by Remark 5.60,
$$\frac{f(c + h) - f(c)}{h} \le \frac{f(d + h) - f(d)}{h}.$$
In particular, $f'(c) \le f'(d)$. Conversely, let $f'$ be increasing on $(a,b)$ and let $a < c < x < d < b$. Use the Mean Value Theorem to choose $x_0$ (between $c$ and $x$) and $x_1$ (between $x$ and $d$) such that
$$\frac{f(x) - f(c)}{x - c} = f'(x_0) \qquad\text{and}\qquad \frac{f(d) - f(x)}{d - x} = f'(x_1).$$
Since $x_0 < x_1$ it follows that $f'(x_0) \le f'(x_1)$. In particular, we conclude by Remark 5.60 that $f$ is convex on $(a,b)$. ■
Combining Theorems 4.24 and 5.61, we obtain the usual convexity criterion in terms of the second derivative: If $f$ is twice differentiable on $(a,b)$, then $f$ is convex on $(a,b)$ if and only if $f''(x) \ge 0$ for all $x \in (a,b)$. In particular, convexity is what elementary calculus texts call concave upward, and concavity is what elementary calculus texts call concave downward. On open intervals, convex functions are always continuous. (The statements and proofs of the next two results come from Zygmund [15].)
5.62 THEOREM. If $f$ is convex on some nonempty, open interval $I$, then $f$ is continuous on $I$.
PROOF. Let $x_0 \in I =: (a,b)$. By symmetry, it suffices to show that $f(x) \to f(x_0)$ as $x \to x_0+$. Let $a < c < x_0 < x < d < b$, $y = g(x)$ represent the equation of the chord through $(c, f(c))$, $(x_0, f(x_0))$, and $y = h(x)$ represent the equation of the chord through $(x_0, f(x_0))$, $(d, f(d))$. Since $f$ is convex, we have by Remark 5.59 that $f(x) \le h(x)$. Since $f(x_0)$ lies on or below the chord from $(c, f(c))$ to $(x, f(x))$, we also have that $g(x) \le f(x)$. Consequently,
$$g(x) \le f(x) \le h(x), \qquad x \in (x_0, b).$$
Both chords $y = g(x)$ and $y = h(x)$ pass through the point $(x_0, f(x_0))$, so $g(x) \to f(x_0)$ and $h(x) \to f(x_0)$ as $x \to x_0+$. Hence, it follows from the Squeeze Theorem that $f(x) \to f(x_0)$ as $x \to x_0+$. ■
Theorem 5.62 does not hold for closed intervals $[a,b]$. Indeed, the function
f(x) := {
~
O:::;x
is convex on $[0,1]$ but not continuous there.
A function $f$ is said to have a proper maximum (respectively, proper minimum) at $x_0$ if and only if there exists a $\delta > 0$ such that $f(x) < f(x_0)$ (respectively, $f(x) > f(x_0)$) for all $0 < |x - x_0| < \delta$. As far as proper extrema are concerned, convex functions behave like strictly increasing functions.
5.63 THEOREM. (i) If $f$ is convex on a nonempty, open interval $(a,b)$, then $f$ has no proper maximum on $(a,b)$.
(ii) If $f$ is convex on $[0,\infty)$ and has a proper minimum, then $f(x) \to \infty$ as $x \to \infty$.
PROOF. (i) Suppose that $x_0 \in (a,b)$ and $f(x_0)$ is a proper maximum of $f$. Then there exist $c < x_0 < d$ such that $f(x) < f(x_0)$ for $c < x < d$. Thus the chord through $(c, f(c))$, $(d, f(d))$ must lie below $f(x_0)$ for $c, d$ near $x_0$, a contradiction.
(ii) Suppose that $x_0 \in (a,b)$ and $f(x_0)$ is a proper minimum of $f$. Fix $x_1 > x_0$. Let $y = g(x)$ represent the equation of the chord through $(x_0, f(x_0))$ and $(x_1, f(x_1))$. Since $f(x_0)$ is a proper minimum, $f(x_1) > f(x_0)$, hence $g$ has positive slope. Moreover, by the proof of Theorem 5.62, $g(x) \le f(x)$ for all $x \in (x_1, \infty)$. Since $g(x) \to \infty$ as $x \to \infty$, we conclude that $f(x) \to \infty$ as $x \to \infty$. ■
Another important result about convex functions addresses the question: What happens when we interchange the order of a convex function and an integral sign?
5.64 THEOREM [JENSEN'S INEQUALITY]. Let $\phi$ be convex on a closed interval $[a,b]$ and $f : [0,1] \to [a,b]$. If $f$ and $\phi \circ f$ are integrable on $[0,1]$, then
$$\phi\Bigl(\int_0^1 f(x)\,dx\Bigr) \le \int_0^1 (\phi \circ f)(x)\,dx. \tag{20}$$
PROOF. Set
$$c = \int_0^1 f(x)\,dx$$
and observe that
$$\phi\Bigl(\int_0^1 f(x)\,dx\Bigr) = \phi(c) + s\Bigl(\int_0^1 f(x)\,dx - c\Bigr) \tag{21}$$
for all $s \in \mathbf{R}$. (Note: Since $a \le f(x) \le b$ for each $x \in [0,1]$, $c$ must belong to the interval $[a,b]$ by the Comparison Theorem for Integrals. Thus $\phi(c)$ is defined.) Let
$$s = \sup_{x \in [a,c)} \frac{\phi(c) - \phi(x)}{c - x}.$$
By Remark 5.60, $s \le (\phi(u) - \phi(c))/(u - c)$ for all $u \in (c,b]$; i.e.,
$$\phi(c) + s(u - c) \le \phi(u) \tag{22}$$
for all $u \in [c,b]$. On the other hand, if $u \in [a,c)$, we have by the definition of $s$ that
$$s \ge \frac{\phi(c) - \phi(u)}{c - u}.$$
Thus (22) holds for all $u \in [a,b]$. Applying (22) to $u = f(x)$, we obtain
$$\phi(c) + s(f(x) - c) \le (\phi \circ f)(x).$$
Integrating this inequality as $x$ runs from 0 to 1, we obtain
$$\phi(c) + s\Bigl(\int_0^1 f(x)\,dx - c\Bigr) \le \int_0^1 (\phi \circ f)(x)\,dx.$$
Combining this inequality with (21), we conclude that (20) holds. ■
What about differentiability of convex functions? To answer this question we introduce the following concepts (compare with Definition 4.6).
5.65 DEFINITION. Let $f : (a,b) \to \mathbf{R}$ and $x \in (a,b)$.
(i) $f$ is said to have a right-hand derivative at $x$ if and only if
$$D_R f(x) := \lim_{h\to 0+} \frac{f(x + h) - f(x)}{h}$$
exists as an extended real number.
(ii) $f$ is said to have a left-hand derivative at $x$ if and only if
$$D_L f(x) := \lim_{h\to 0-} \frac{f(x + h) - f(x)}{h}$$
exists as an extended real number.
The following result is a simple consequence of the definition of differentiability and the characterization of two-sided limits by one-sided limits (see Theorem 3.14).
5.66 Remark. A real function $f$ is differentiable at $x$ if and only if both $D_R f(x)$ and $D_L f(x)$ exist, are finite, and equal, in which case $f'(x) = D_R f(x) = D_L f(x)$.
The next result shows that the left-hand and right-hand derivatives of a convex function are remarkably well-behaved.
5.67 THEOREM. Let J be convex on an open interval (a, b). Then the left-hand and right-hand derivatives of J exist, are increasing on (a,b), and satisfy
for all x E (a, b). PROOF.
Let h < 0 and notice that the slope of the chord through the points
(x, J(x)) and (x + h, J(x + h)) is (f(x + h) - J(x) )/h. By Remark 5.60, these slopes increase as h --> 0-. Since increasing functions have a limit (which may be +00), it follows that DLJ(x) exists and satisfies -00 < DLJ(X) ~ 00. Similarly, DRJ(x) exists and satisfies -00 ~ DRJ(X) < 00. Remark 5.60 also implies that (23) Hence, both numbers are finite and by symmetry it remains to show that DRJ(x) is increasing on (a, b).
Let $x_1 < u < t < x_2$ be points that belong to $(a,b)$. Then
$$\frac{f(u) - f(x_1)}{u - x_1} \le \frac{f(x_2) - f(t)}{x_2 - t}.$$
Taking the limit of this inequality as $u \to x_1+$ and $t \to x_2-$, we conclude by (23) that
(24) The next proof uses Theorem 4.29, an optional result from Section 4.4. *5.68 COROLLARY. If J is convex on an open interval (a, b), then J is differentiable at all but countably many points of (a, b); i.e., there is an at most countable set E c (a, b) such that f'(x) exists for all X E (a, b) \ E. PROOF. Let E be the set where either DLJ(x) or DRJ(x) is discontinuous. By Theorems 5.67 and 4.29, the set E is at most countable. Suppose that Xo E (a, b) \E and X < Xo. By (24),
Let X --+ Xo. Since both DLJ(X) and DRJ(X) are continuous at Xo, we obtain DRJ(xo) :::; DLJ(XO) :::; DRJ(XO). In particular, f'(xo) exists for all Xo E (a, b)\E. I How useful is a statement about f'(x) that holds for all but count ably many points x? We address this question by proving a generalization of Theorem 4.24. (The proof here uses Theorem 4.30, an optional result from Section 4.4.)
*5.69 THEOREM. Suppose that $f$ is continuous on a closed interval $[a,b]$ and differentiable on $(a,b)$. If $f'(x) \ge 0$ for all but countably many $x \in (a,b)$, then $f$ is increasing on $[a,b]$.
PROOF. Suppose that $f'(x_1) < 0$ for some $x_1 \in (a,b)$ and let $y \in (f'(x_1), 0)$. By Theorem 4.30 (the Intermediate Value Theorem for derivatives), there is an $x = x(y) \in (a,b)$ such that $f'(x) = y < 0$. It follows that if $f'(x) < 0$ for one $x \in (a,b)$, then $f'(x) < 0$ for uncountably many $x \in (a,b)$, a contradiction. Therefore, $f'(x) \ge 0$ for all $x \in (a,b)$; hence, by Theorem 4.24, $f$ is increasing on $(a,b)$. ■
*5.70 COROLLARY. If $f$ is continuous on a closed interval $[a,b]$ and differentiable on $(a,b)$ with $f'(x) = 0$ for all but countably many $x \in (a,b)$, then $f$ is constant on $[a,b]$.
EXERCISES
1. Suppose that $f, g$ are convex on an interval $I$. Prove that $f + g$ and $cf$ are convex on $I$ for any $c \ge 0$.
2. Suppose that $f_n$ is a sequence of functions convex on an interval $I$ and
$$f(x) = \lim_{n\to\infty} f_n(x)$$
exists for each $x \in I$. Prove that $f$ is convex on $I$.
3. Prove that a function $f$ is both convex and concave on $I$ if and only if there exist $m, b \in \mathbf{R}$ such that $f(x) = mx + b$ for $x \in I$.
4. Prove that $f(x) = x^p$ is convex on $[0,\infty)$ for $p \ge 1$, and concave on $[0,\infty)$ for 0
5. Show that if $f$ is increasing on $[a,b]$, then
$$F(x) = \int_a^x f(t)\,dt$$
is convex on $[a,b]$. (Recall that by Exercise 8, p. 116, $f$ is integrable on $[a,b]$.)
6. If $f : [a,b] \to \mathbf{R}$ is integrable on $[a,b]$, prove that
$$\int_a^b |f(x)|\,dx \le (b - a)^{1/2}\Bigl(\int_a^b f^2(x)\,dx\Bigr)^{1/2}.$$
7. Suppose that $f : [0,1] \to [a,b]$ is integrable on $[0,1]$. Assume that $e^{f(x)}$ and $|f(x)|^p$ are integrable for all $0 < p < \infty$ (see Exercise 11, p. 406).
(a) Prove that
$$e^{\int_0^1 f(x)\,dx} \le \int_0^1 e^{f(x)}\,dx \qquad\text{and}\qquad \Bigl(\int_0^1 |f(x)|^r\,dx\Bigr)^{1/r} \le \int_0^1 |f(x)|\,dx$$
for all $0 < r \le 1$.
(b) If $0 < p < q$, prove that
$$\Bigl(\int_0^1 |f(x)|^p\,dx\Bigr)^{1/p} \le \Bigl(\int_0^1 |f(x)|^q\,dx\Bigr)^{1/q}.$$
(c) State and prove analogues of these results for improper integrals.
*8. Let $f$ be continuous on a closed, bounded interval $[a,b]$ and suppose that $D_R f(x)$ exists for all $x \in (a,b)$.
(a) Show that if $f(b) < y_0 < f(a)$, then
$$x_0 := \sup\{x \in [a,b] : f(x) > y_0\}$$
satisfies $f(x_0) = y_0$ and $D_R f(x_0) \le 0$.
(b) Prove that if $f(b) < f(a)$, then there are uncountably many points $x$ that satisfy $D_R f(x) \le 0$.
(c) Prove that if $D_R f(x) > 0$ for all but countably many points $x \in (a,b)$, then $f$ is increasing on $[a,b]$.
(d) Prove that if $D_R f(x) \ge 0$ and $g(x) = f(x) + x/n$ for some $n \in \mathbf{N}$, then $D_R g(x) > 0$.
(e) Prove that if $D_R f(x) \ge 0$ for all but countably many points $x \in (a,b)$, then $f$ is increasing on $[a,b]$.
Chapter 6
Infinite Series of Real Numbers
Infinite series are one of the most widely used tools of analysis. They are used to approximate numbers and functions. (Series of Ramanujan type have been used to compute billions of digits of the decimal expansion of $\pi$.) They are used to approximate solutions of differential equations. (You may have used power series to solve ordinary differential equations with nonconstant coefficients.) They even form the basis for some very practical applications including pattern recognition (e.g., reading zip codes), image enhancement (e.g., removing raindrop clutter from a radar scan), and data compression (e.g., transmission of hundreds of TV programs through a single, photonic, fiber optic cable). Other applications of infinite series can be found in Section 7.5. In view of the variety of these applications, it should come as no surprise that the subject matter of this chapter (and the next) is of fundamental importance.
6.1 INTRODUCTION
Let $\{a_k\}_{k \in \mathbf{N}}$ be a sequence of numbers. We shall call an expression of the form
\[ \sum_{k=1}^{\infty} a_k \tag{1} \]
an infinite series with terms $a_k$. (No convergence is assumed at this point. This is merely a formal expression.)
6.1 DEFINITION. Let $S = \sum_{k=1}^{\infty} a_k$ be an infinite series whose terms $a_k$ belong to $\mathbf{R}$.
(i) The partial sums of $S$ of order $n$ are the numbers defined, for each $n \in \mathbf{N}$, by
\[ s_n := \sum_{k=1}^{n} a_k. \]
(ii) $S$ is said to converge if and only if its sequence of partial sums $\{s_n\}$ converges to some $s \in \mathbf{R}$ as $n \to \infty$; i.e., for every $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that $n \ge N$ implies that $|s_n - s| < \varepsilon$. In this case we shall write
\[ \sum_{k=1}^{\infty} a_k = s \tag{2} \]
and call $s$ the sum, or value, of the series $\sum_{k=1}^{\infty} a_k$.
(iii) $S$ is said to diverge if and only if its sequence of partial sums $\{s_n\}$ does not converge as $n \to \infty$. When $s_n$ diverges to $+\infty$ as $n \to \infty$, we shall also write
\[ \sum_{k=1}^{\infty} a_k = \infty. \]
(We shall deal with series of functions in Chapter 7.)
You are already familiar with one type of infinite series, decimal expansions. Every decimal expansion of a number $x \in (0,1)$ is a series of the form $\sum_{k=1}^{\infty} x_k/10^k$, where the $x_k$'s are integers in $[0,9]$. For example, when we write $1/3 = 0.333\ldots$ we mean
\[ \frac{1}{3} = \sum_{k=1}^{\infty} \frac{3}{10^k}. \]
In particular, the partial sums 0.3, 0.33, 0.333, ... are approximations to 1/3 that get closer and closer to 1/3 as more terms of the decimal expansion are taken.
One way to determine if a given series converges is to find a formula for its partial sums simple enough so that we can decide whether or not they converge. Here are two examples.
6.2 Example. Prove that $\sum_{k=1}^{\infty} 2^{-k} = 1$.
PROOF. By induction, we can show that the partial sums $s_n = \sum_{k=1}^{n} 1/2^k$ satisfy $s_n = 1 - 2^{-n}$ for $n \in \mathbf{N}$. Thus $s_n \to 1$ as $n \to \infty$. ∎
6.3 Example. Prove that $\sum_{k=1}^{\infty} (-1)^k$ diverges.
PROOF. The partial sums $s_n = \sum_{k=1}^{n} (-1)^k$ satisfy
\[ s_n = \begin{cases} -1 & \text{if $n$ is odd} \\ 0 & \text{if $n$ is even.} \end{cases} \]
Thus $s_n$ does not converge as $n \to \infty$. ∎
Another way to show that a series diverges is to estimate its partial sums.
6.4 Example [HARMONIC SERIES]. Prove that the sequence $1/k$ converges but the series $\sum_{k=1}^{\infty} 1/k$ diverges to $+\infty$.
PROOF. The sequence $1/k$ converges to zero (by Example 2.2). On the other hand, by the Comparison Theorem for Integrals,
\[ s_n := \sum_{k=1}^{n} \frac{1}{k} \ge \sum_{k=1}^{n} \int_k^{k+1} \frac{1}{x}\,dx = \int_1^{n+1} \frac{1}{x}\,dx = \log(n+1). \]
We conclude that $s_n \to \infty$ as $n \to \infty$. ∎
This example shows that the terms of a divergent series may converge. In particular, a series does not converge just because its terms converge. On the other hand, the following result shows that a series cannot converge if its terms do not converge to zero.
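The estimate in Example 6.4 can be observed numerically. The following Python sketch is only an illustration; the function name `harmonic_partial_sum` and the sample values of $n$ are choices made here, not part of the text. It prints the term $1/n$, the partial sum $s_n$, and the lower bound $\log(n+1)$ side by side.

```python
import math


def harmonic_partial_sum(n: int) -> float:
    """Return s_n = 1 + 1/2 + ... + 1/n (hypothetical helper for this sketch)."""
    return sum(1.0 / k for k in range(1, n + 1))


if __name__ == "__main__":
    # The terms 1/n shrink to zero while the partial sums keep growing,
    # staying above log(n + 1) as in the proof of Example 6.4.
    for n in (10, 1_000, 100_000):
        print(n, 1.0 / n, harmonic_partial_sum(n), math.log(n + 1))
```

The growth is slow (logarithmic), which is why a table of the first few partial sums alone would not suggest divergence.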
6.5 THEOREM [DIVERGENCE TEST]. Let $\{a_k\}_{k \in \mathbf{N}}$ be a sequence of real numbers. If $a_k$ does not converge to zero, then the series $\sum_{k=1}^{\infty} a_k$ diverges.
PROOF. Suppose to the contrary that $\sum_{k=1}^{\infty} a_k$ converges to some $s \in \mathbf{R}$. By definition, the sequence of partial sums $s_n := \sum_{k=1}^{n} a_k$ converges to $s$ as $n \to \infty$. Therefore, $a_k = s_k - s_{k-1} \to s - s = 0$ as $k \to \infty$, a contradiction. ∎
The proof of this result establishes a property interesting in its own right: If $\sum_{k=1}^{\infty} a_k$ converges, then $a_k \to 0$ as $k \to \infty$.
It is important to realize from the beginning that the converse of this statement is false; i.e., Theorem 6.5 is a test for divergence, not a test for convergence. Indeed, the harmonic series is a divergent series whose terms converge to zero.
Finding the sum of a convergent series is usually difficult. The following two results show that this is not the case for two special kinds of series.
6.6 THEOREM [TELESCOPIC SERIES]. If $\{a_k\}$ is a convergent real sequence, then
\[ \sum_{k=1}^{\infty} (a_k - a_{k+1}) = a_1 - \lim_{k \to \infty} a_k. \]
PROOF. By telescoping, we have
\[ s_n := \sum_{k=1}^{n} (a_k - a_{k+1}) = a_1 - a_{n+1}. \]
Hence $s_n \to a_1 - \lim_{k \to \infty} a_k$ as $n \to \infty$. ∎
6.7 THEOREM [GEOMETRIC SERIES]. The series $\sum_{k=1}^{\infty} x^k$ converges if and only if $|x| < 1$, in which case
\[ \sum_{k=1}^{\infty} x^k = \frac{x}{1-x}. \]
(See also Exercise 1.)
PROOF. If $|x| \ge 1$, then $\sum_{k=1}^{\infty} x^k$ diverges by the Divergence Test. If $|x| < 1$, then set $s_n = \sum_{k=1}^{n} x^k$ and observe by telescoping that
\[ (1-x)s_n = (1-x)(x + x^2 + \cdots + x^n) = x + x^2 + \cdots + x^n - x^2 - x^3 - \cdots - x^{n+1} = x - x^{n+1}. \]
Hence
\[ s_n = \frac{x}{1-x} - \frac{x^{n+1}}{1-x} \]
for all $n \in \mathbf{N}$. Since $x^{n+1} \to 0$ as $n \to \infty$ for all $|x| < 1$ (see Example 2.20), we conclude that $s_n \to x/(1-x)$ as $n \to \infty$. ∎
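As a quick sanity check of the partial-sum formula in the proof of Theorem 6.7, the sketch below compares a direct summation with the closed form $x/(1-x) - x^{n+1}/(1-x)$. The function names and the sample values of $x$ and $n$ are assumptions made for illustration only.

```python
def geometric_partial_sum(x: float, n: int) -> float:
    """Return s_n = x + x^2 + ... + x^n by direct summation."""
    return sum(x ** k for k in range(1, n + 1))


def geometric_closed_form(x: float, n: int) -> float:
    """Return x/(1-x) - x^(n+1)/(1-x), the formula from the proof (x != 1)."""
    return x / (1 - x) - x ** (n + 1) / (1 - x)


if __name__ == "__main__":
    for x in (0.5, -0.9, 0.99):
        for n in (5, 50, 500):
            # The two columns should agree up to rounding; for |x| < 1 both
            # approach the limit x / (1 - x) printed in the last column.
            print(x, n, geometric_partial_sum(x, n),
                  geometric_closed_form(x, n), x / (1 - x))
```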
(Note: In everyday speech, the words sequence and series are considered synonyms. Example 6.4 shows that in mathematics, this is not the case. In particular, you must not apply a result valid for sequences to series, and vice versa. Nevertheless, because convergence of an infinite series is defined in terms of convergence of its sequence of partial sums, any result about sequences contains a result about infinite series. The following three theorems illustrate this principle.)
6.8 THEOREM [CAUCHY CRITERION]. Let $\{a_k\}$ be a real sequence. Then the infinite series $\sum_{k=1}^{\infty} a_k$ converges if and only if for every $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that
\[ m > n \ge N \quad\text{imply}\quad \left| \sum_{k=n}^{m} a_k \right| < \varepsilon. \]
PROOF. Let $s_n$ represent the sequence of partial sums of $\sum_{k=1}^{\infty} a_k$ and set $s_0 = 0$. By Cauchy's Theorem (Theorem 2.29), $s_n$ converges if and only if given $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that $m, n \ge N$ imply $|s_m - s_{n-1}| < \varepsilon$. Since
\[ s_m - s_{n-1} = \sum_{k=n}^{m} a_k \]
for all integers $m > n \ge 1$, the proof is complete. ∎
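6.9 COROLLARY. Let $\{a_k\}$ be a real sequence. Then the infinite series $\sum_{k=1}^{\infty} a_k$ converges if and only if given $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that
\[ n \ge N \quad\text{implies}\quad \left| \sum_{k=n}^{\infty} a_k \right| < \varepsilon. \]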
6.10 THEOREM. Let $\{a_k\}$ and $\{b_k\}$ be real sequences. If $\sum_{k=1}^{\infty} a_k$ and $\sum_{k=1}^{\infty} b_k$ are convergent series, then
\[ \sum_{k=1}^{\infty} (a_k + b_k) = \sum_{k=1}^{\infty} a_k + \sum_{k=1}^{\infty} b_k \]
and
\[ \sum_{k=1}^{\infty} (\alpha a_k) = \alpha \sum_{k=1}^{\infty} a_k \]
for any $\alpha \in \mathbf{R}$.
PROOF. Both identities are corollaries of Theorem 2.12; we provide the details only for the first identity.
Let $s_n$ represent the partial sums of $\sum_{k=1}^{\infty} a_k$ and $t_n$ represent the partial sums of $\sum_{k=1}^{\infty} b_k$. Since real addition is commutative, we have
\[ \sum_{k=1}^{n} (a_k + b_k) = s_n + t_n, \qquad n \in \mathbf{N}. \]
Taking the limit of this identity as $n \to \infty$, we conclude by Theorem 2.12 that
\[ \sum_{k=1}^{\infty} (a_k + b_k) = \lim_{n \to \infty} s_n + \lim_{n \to \infty} t_n = \sum_{k=1}^{\infty} a_k + \sum_{k=1}^{\infty} b_k. \ ∎ \]
EXERCISES
1. Show that
\[ \sum_{k=n}^{\infty} x^k = \frac{x^n}{1-x} \]
for $|x| < 1$ and $n = 0, 1, \ldots$.
2. Prove that each of the following series converges and find its value.
2:
(a)
(_I)k+l 'Irk
k=l 00
3k
00
(c ) "~7k-l.
k e- k • (d)"2 ~
k=O
k=l
3. Represent each of the following series as a telescopic series and find its value.
(a) $\displaystyle \sum_{k=1}^{\infty} \frac{1}{k(k+1)}$.
(b) $\displaystyle \sum_{k=1}^{\infty} \log\left( \frac{k(k+2)}{(k+1)^2} \right)$.
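(c) $\displaystyle \sum_{k=1}^{\infty} \sqrt[k]{\frac{\pi}{4}} \left( 1 - \left( \frac{\pi}{4} \right)^{j_k} \right)$, where $j_k = -1/(k(k+1))$ for $k \in \mathbf{N}$.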
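4. Find all $x \in \mathbf{R}$ for which
\[ \sum_{k=1}^{\infty} 3(x^k - x^{k-1})(x^k + x^{k-1}) \]
converges. For each such $x$, find the value of this series.
5. Prove that each of the following series diverges.
(a) $\displaystyle \sum_{k=1}^{\infty} \cos \frac{1}{k^2}$.
(b) $\displaystyle \sum_{k=1}^{\infty} \left( 1 - \frac{1}{k} \right)^k$.
(c) $\displaystyle \sum_{k=1}^{\infty} \frac{k+1}{k^2}$.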
6. (a) Prove that if $\sum_{k=1}^{\infty} a_k$ converges, then its partial sums $s_n$ are bounded.
(b) Show that the converse of part (a) is false. Namely, show that a series $\sum_{k=1}^{\infty} a_k$ may have bounded partial sums and still diverge.
7. Let $\{b_k\}$ be a real sequence and $b \in \mathbf{R}$.
(a) Suppose that there is an $N \in \mathbf{N}$ such that $|b - b_k| \le M$ for all $k \ge N$. Prove that
\[ \left| nb - \sum_{k=1}^{n} b_k \right| \le \sum_{k=1}^{N} |b_k - b| + M(n - N) \]
for all $n > N$.
(b) Prove that if $b_k \to b$ as $k \to \infty$, then
\[ \frac{b_1 + b_2 + \cdots + b_n}{n} \to b \]
as $n \to \infty$.
(c) Show that the converse of (b) is false.
8. A series $\sum_{k=0}^{\infty} a_k$ is said to be Cesàro summable to an $L \in \mathbf{R}$ if and only if
\[ \sigma_n := \sum_{k=0}^{n-1} \left( 1 - \frac{k}{n} \right) a_k \]
converges to $L$ as $n \to \infty$.
(a) Let $s_n = \sum_{k=0}^{n-1} a_k$. Prove that
\[ \sigma_n = \frac{s_1 + \cdots + s_n}{n} \]
for each $n \in \mathbf{N}$.
(b) Prove that if $a_k \in \mathbf{R}$ and $\sum_{k=0}^{\infty} a_k = L$ converges, then $\sum_{k=0}^{\infty} a_k$ is Cesàro summable to $L$.
(c) Prove that $\sum_{k=0}^{\infty} (-1)^k$ is Cesàro summable to $1/2$; hence the converse of (b) is false.
(d) [TAUBER]. Prove that if $a_k \ge 0$ for $k \in \mathbf{N}$ and $\sum_{k=0}^{\infty} a_k$ is Cesàro summable to $L$, then $\sum_{k=0}^{\infty} a_k = L$.
9. (a) Suppose that $\{a_k\}$ is a decreasing sequence of real numbers. Prove that if $\sum_{k=1}^{\infty} a_k$ converges, then $k a_k \to 0$ as $k \to \infty$.
(b) Let $s_n = \sum_{k=1}^{n} (-1)^{k+1}/k$ for $n \in \mathbf{N}$. Prove that $s_{2n}$ is strictly increasing, $s_{2n+1}$ is strictly decreasing, and $s_{2n+1} - s_{2n} \to 0$ as $n \to \infty$.
(c) Prove that part (a) is false if "decreasing" is removed.
10. Suppose that $a_k \ge 0$ for $k$ large and $\sum_{k=1}^{\infty} a_k/k$ converges. Prove that
\[ \lim_{j \to \infty} \sum_{k=1}^{\infty} \frac{a_k}{j+k} = 0. \]
6.2 SERIES WITH NONNEGATIVE TERMS
Although we obtained exact values in the preceding section for telescopic series and geometric series, finding exact values of a given series is frequently difficult, if not impossible. Fortunately, for many applications it is not as important to be able to find the value of a series as it is to know that the series converges. When it does converge, we can use its partial sums to approximate its value as accurately as we wish (up to the limitations of whatever computing device we are using). Therefore, much of this chapter is devoted to establishing tests that can be used to decide whether a given series converges or whether it diverges.
Let $P_k$ be a statement that depends on $k \in \mathbf{N}$. We shall say that $P_k$ holds for large $k$ if there is an $N \in \mathbf{N}$ such that $P_k$ is true for $k \ge N$.
The partial sums of a divergent series may be bounded (like $\sum_{k=1}^{\infty} (-1)^k$) or unbounded (like $\sum_{k=1}^{\infty} 1/k$). When the terms of a divergent series are nonnegative, the former cannot happen.
6.11 THEOREM. Suppose that $a_k \ge 0$ for $k \ge N$. Then $\sum_{k=1}^{\infty} a_k$ converges if and only if its sequence of partial sums $\{s_n\}$ is bounded, i.e., if and only if there exists a finite number $M > 0$ such that
\[ \left| \sum_{k=1}^{n} a_k \right| \le M \quad\text{for all } n \in \mathbf{N}. \]
PROOF. Set $s_n = \sum_{k=1}^{n} a_k$ for $n \in \mathbf{N}$. If $\sum_{k=1}^{\infty} a_k$ converges, then $s_n$ converges as $n \to \infty$. Since every convergent sequence is bounded (Theorem 2.8), $\sum_{k=1}^{\infty} a_k$ has bounded partial sums.
Conversely, suppose that $|s_n| \le M$ for $n \in \mathbf{N}$. Since $a_k \ge 0$ for $k \ge N$, $s_n$ is an increasing sequence when $n \ge N$. Hence by the Monotone Convergence Theorem (Theorem 2.19), $s_n$ converges. ∎
If $a_k \ge 0$ for large $k$, we shall write $\sum_{k=1}^{\infty} a_k < \infty$ when the series is convergent and $\sum_{k=1}^{\infty} a_k = \infty$ when the series is divergent.
In some cases, integration can be used to test convergence of a series. The idea behind this test is that
\[ \int_1^{\infty} f(x)\,dx = \sum_{k=1}^{\infty} \int_k^{k+1} f(x)\,dx \approx \sum_{k=1}^{\infty} f(k) \]
when $f$ is almost constant on each interval $[k, k+1]$. This will surely be the case for large $k$ if $f(k) \downarrow 0$ as $k \to \infty$ (see Figure 6.1). This observation leads us to the following result.
6.12 THEOREM [INTEGRAL TEST]. Suppose that $f : [1, \infty) \to \mathbf{R}$ is positive and decreasing on $[1, \infty)$. Then $\sum_{k=1}^{\infty} f(k)$ converges if and only if $f$ is improperly integrable on $[1, \infty)$, i.e., if and only if
\[ \int_1^{\infty} f(x)\,dx < \infty. \]
[Figure 6.1]
PROOF. Let $s_n = \sum_{k=1}^{n} f(k)$ and $t_n = \int_1^n f(x)\,dx$ for $n \in \mathbf{N}$. Since $f$ is decreasing, $f$ is locally integrable on $[1, \infty)$ (see Exercise 8, p. 116) and $f(k+1) \le f(x) \le f(k)$ for all $x \in [k, k+1]$. Hence, by the Comparison Theorem for Integrals,
\[ f(k+1) \le \int_k^{k+1} f(x)\,dx \le f(k) \]
for $k \in \mathbf{N}$. Summing over $k = 1, \ldots, n-1$, we obtain
\[ s_n - f(1) = \sum_{k=2}^{n} f(k) \le \int_1^n f(x)\,dx = t_n \le \sum_{k=1}^{n-1} f(k) = s_n - f(n) \]
for all $n \in \mathbf{N}$. In particular,
\[ f(n) \le \sum_{k=1}^{n} f(k) - \int_1^n f(x)\,dx \le f(1) \quad\text{for } n \in \mathbf{N}. \tag{3} \]
By (3) it is clear that $\{s_n\}$ is bounded if and only if $\{t_n\}$ is. Since $f(x) \ge 0$ implies that both $s_n$ and $t_n$ are increasing sequences, it follows from the Monotone Convergence Theorem that $s_n$ converges if and only if $t_n$ converges, as $n \to \infty$. ∎
This test works best on series for which the integral of $f$ can be easily computed or estimated. For example, to find out whether $\sum_{k=1}^{\infty} 1/(1+k^2)$ converges or diverges, let $f(x) = 1/(1+x^2)$ and observe that $f$ is positive on $[1, \infty)$. Since $f'(x) = -2x/(1+x^2)^2$ is negative on $[1, \infty)$, it is also clear that $f$ is decreasing. Since
\[ \int_1^{\infty} \frac{dx}{1+x^2} = \arctan x \,\Big|_1^{\infty} = \frac{\pi}{2} - \arctan(1) < \infty, \]
it follows from the Integral Test that $\sum_{k=1}^{\infty} 1/(1+k^2)$ converges.
The Integral Test is most widely used in the following special case.
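Inequality (3) from the proof of the Integral Test can be checked numerically for the example $f(x) = 1/(1+x^2)$, where the integral is available in closed form as $\arctan n - \arctan 1$. The following Python sketch is illustrative only; the helper names and the small rounding tolerance are assumptions of this example.

```python
import math


def f(x: float) -> float:
    """The positive, decreasing function from the example above."""
    return 1.0 / (1.0 + x * x)


def partial_sum(n: int) -> float:
    """s_n = f(1) + ... + f(n)."""
    return sum(f(k) for k in range(1, n + 1))


def integral(n: int) -> float:
    """t_n = integral of f over [1, n], using the antiderivative arctan."""
    return math.atan(n) - math.atan(1.0)


def gap(n: int) -> float:
    """s_n - t_n, which inequality (3) traps between f(n) and f(1)."""
    return partial_sum(n) - integral(n)


if __name__ == "__main__":
    for n in (1, 5, 100, 10_000):
        print(n, f(n), gap(n), f(1))
```

Because $t_n$ stays below $\pi/2 - \arctan(1)$, the bound $s_n \le t_n + f(1)$ also gives an explicit ceiling for every partial sum of the series.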
6.13 COROLLARY [p-SERIES TEST]. The series
\[ \sum_{k=1}^{\infty} \frac{1}{k^p} \tag{4} \]
converges if and only if $p > 1$.
PROOF. If $p = 1$ or $p \le 0$, the series diverges. If $p > 0$ and $p \ne 1$, set $f(x) = x^{-p}$ and observe that $f'(x) = -p x^{-p-1} < 0$ for all $x \in [1, \infty)$. Hence, $f$ is nonnegative and decreasing on $[1, \infty)$. Since
\[ \int_1^{\infty} \frac{dx}{x^p} = \lim_{n \to \infty} \frac{x^{1-p}}{1-p} \,\Big|_1^{n} = \lim_{n \to \infty} \frac{n^{1-p} - 1}{1-p} \]
has a finite limit if and only if $1 - p < 0$, it follows from the Integral Test that (4) converges if and only if $p > 1$. ∎
The Integral Test, which requires $f$ to satisfy some very restrictive hypotheses, has limited applications. The following test can be used in a much broader context.
6.14 THEOREM [COMPARISON TEST]. Suppose that $0 \le a_k \le b_k$ for large $k$.
(i) If $\sum_{k=1}^{\infty} b_k < \infty$, then $\sum_{k=1}^{\infty} a_k < \infty$.
(ii) If $\sum_{k=1}^{\infty} a_k = \infty$, then $\sum_{k=1}^{\infty} b_k = \infty$.
PROOF. By hypothesis, choose $N \in \mathbf{N}$ so large that $0 \le a_k \le b_k$ for $k > N$. Set $s_n = \sum_{k=1}^{n} a_k$ and $t_n = \sum_{k=1}^{n} b_k$, $n \in \mathbf{N}$. Then $0 \le s_n - s_N \le t_n - t_N$ for all $n \ge N$. Since $N$ is fixed, it follows that $s_n$ is bounded when $t_n$ is, and $t_n$ is unbounded when $s_n$ is. Apply Theorem 6.11 and the proof of the theorem is complete. ∎
The Comparison Test is used to compare one series with another whose convergence property is already known, e.g., a p-series or a geometric series. Frequently, the inequalities $|\sin x| \le |x|$ for all $x \in \mathbf{R}$ (see Appendix B) and $|\log x| \le x^{\alpha}$ for each $\alpha > 0$ provided that $x$ is sufficiently large (see Exercise 4, p. 101) are helpful in this regard. Although there is no simple algorithm for this process, the idea is to examine the terms of the given series, ignoring the superfluous factors, and to replace the more complicated factors by simpler ones. Here is a typical example.
6.15 Example. Determine whether the following series converges or diverges.
\[ \sum_{k=1}^{\infty} \frac{3k}{k^2+k} \sqrt{\frac{\log k}{k}} \tag{5} \]
SOLUTION. The $k$th term of this series can be written by using three factors:
\[ \frac{1}{k} \cdot \frac{3k}{k+1} \cdot \sqrt{\frac{\log k}{k}}. \]
The factor $3k/(k+1)$ is bounded by 3 for large $k$ and can be ignored. Since $\log k \le \sqrt{k}$ for large $k$, the factor $\sqrt{\log k / k}$ satisfies
\[ \sqrt{\frac{\log k}{k}} \le \sqrt{\frac{\sqrt{k}}{k}} = \frac{1}{\sqrt[4]{k}} \]
for large $k$. Therefore, the terms of (5) are dominated by $3/k^{5/4}$. Since $\sum_{k=1}^{\infty} 3/k^{5/4}$ converges by the p-Series Test, it follows from the Comparison Test that (5) converges. ∎
The Comparison Test may not be easy to apply to a given series, even when we know which series it should be compared with, because the process of comparison often involves use of delicate inequalities. For situations like this, the following test is usually more efficient.
6.16 THEOREM [LIMIT COMPARISON TEST]. Suppose that $a_k \ge 0$ and $b_k > 0$ for large $k$ and $L := \lim_{n \to \infty} a_n/b_n$ exists as an extended real number.
(i) If $0 < L < \infty$, then $\sum_{k=1}^{\infty} a_k$ converges if and only if $\sum_{k=1}^{\infty} b_k$ converges.
(ii) If $L = 0$ and $\sum_{k=1}^{\infty} b_k$ converges, then $\sum_{k=1}^{\infty} a_k$ converges.
(iii) If $L = \infty$ and $\sum_{k=1}^{\infty} b_k$ diverges, then $\sum_{k=1}^{\infty} a_k$ diverges.
PROOF. (i) If $L$ is finite and nonzero, then there is an $N \in \mathbf{N}$ such that
\[ \frac{L}{2} b_k < a_k < \frac{3L}{2} b_k \]
for $k \ge N$. Hence, part (i) follows immediately from the Comparison Test and Theorem 6.10. Similar arguments establish parts (ii) and (iii) (see Exercise 8). ∎
In general, the Limit Comparison Test is used to replace a series $\sum_{k=1}^{\infty} a_k$ by $\sum_{k=1}^{\infty} b_k$ when $a_k \approx C b_k$ for $k$ large and some absolute fixed constant $C$. For example, to determine whether or not the series
\[ S := \sum_{k=1}^{\infty} \frac{k}{\sqrt{4k^4 + k^2 + 5k}} \]
converges, notice that its terms are approximately $1/(2k)$ for $k$ large. This leads us to compare $S$ with the harmonic series $\sum_{k=1}^{\infty} 1/k$. Since the harmonic series diverges and
\[ \frac{k/\sqrt{4k^4 + k^2 + 5k}}{1/k} \to \frac{1}{2} \ne 0 \]
as $k \to \infty$, it follows from the Limit Comparison Test that $S$ diverges.
Here is another application of the Limit Comparison Test.
6.17 Example. Let $a_k \to 0$ as $k \to \infty$. Prove that $\sum_{k=1}^{\infty} \sin|a_k|$ converges if and only if $\sum_{k=1}^{\infty} |a_k|$ converges.
PROOF. By l'Hôpital's Rule,
\[ \lim_{k \to \infty} \frac{\sin|a_k|}{|a_k|} = \lim_{x \to 0+} \frac{\sin x}{x} = 1. \]
Hence, by the Limit Comparison Test, $\sum_{k=1}^{\infty} \sin|a_k|$ converges if and only if $\sum_{k=1}^{\infty} |a_k|$ converges. ∎
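The two limits used with the Limit Comparison Test above can be inspected numerically. The sketch below is an illustration under stated assumptions: the function names are invented here, and for Example 6.17 the particular sequence $a_k = 1/k$ is chosen only as a sample sequence tending to zero.

```python
import math


def ratio_S(k: int) -> float:
    """a_k / b_k for a_k = k / sqrt(4k^4 + k^2 + 5k) and b_k = 1/k."""
    a_k = k / math.sqrt(4 * k**4 + k**2 + 5 * k)
    b_k = 1 / k
    return a_k / b_k


def ratio_sine(k: int) -> float:
    """sin|a_k| / |a_k| for the sample sequence a_k = 1/k (an assumption)."""
    a_k = 1 / k
    return math.sin(abs(a_k)) / abs(a_k)


if __name__ == "__main__":
    # The first ratio should approach 1/2 and the second should approach 1.
    for k in (1, 10, 1_000, 1_000_000):
        print(k, ratio_S(k), ratio_sine(k))
```

A numerical table like this only suggests the value of $L$; the test itself still requires the limit to be established, as in the text.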
EXERCISES 1. Prove that each of the following series converges.
k- 3
00
(a) (d)
L k3 + k + 1· k=l
1k-l· L.jk k=l k 3 00
(e)
(c)
L k=l 00
(
1)
fk=l
lo;k,
p> l.
10 + k k- e .
2. Prove that each of the following series diverges.
00!fk
(a)
(b) { ; 10gP(k + 1)'
k=l 00
(c)
1
00
LT. k 2 + 2k + 3
00
L k3 _ 2k2 + J2. k=l
(d)
L
k=2
p>
1 k 10 P k ' g
o. P ~ l.
3. Find all $p \ge 0$ such that the following series converges.
\[ \sum_{k=1}^{\infty} \frac{1}{k \log^p(k+1)}. \]
4. If $a_k \ge 0$ is a bounded sequence, prove that
\[ \sum_{k=1}^{\infty} \frac{a_k}{(k+1)^p} \]
converges for all $p > 1$.
5. Suppose that $a_k \in [0, 1)$ and $a_k \to 0$ as $k \to \infty$. Prove that $\sum_{k=1}^{\infty} \arcsin a_k$ converges if and only if $\sum_{k=1}^{\infty} a_k$ converges.
6. If $\sum_{k=1}^{\infty} |a_k|$ converges, prove that
\[ \sum_{k=1}^{\infty} \frac{|a_k|}{k^p} \]
converges for all $p \ge 0$. What happens if $p < 0$?
7. Suppose that $a_k$ and $b_k$ are nonnegative for all $k \in \mathbf{N}$.
(a) Prove that if $\sum_{k=1}^{\infty} a_k$ and $\sum_{k=1}^{\infty} b_k$ converge, then $\sum_{k=1}^{\infty} a_k b_k$ also converges.
(b) Improve this result by replacing convergence of one of the series by something else.
8. Prove Theorem 6.16ii and iii.
9. Suppose that $a, b \in \mathbf{R}$ satisfy $b/a \in \mathbf{R} \setminus \mathbf{Z}$. Find all $q > 0$ such that
\[ \sum_{k=1}^{\infty} \frac{1}{(ak + b)\,q^k} \]
converges.
10. Suppose that $a_k \to 0$. Prove that $\sum_{k=1}^{\infty} a_k$ converges if and only if the series $\sum_{k=1}^{\infty} (a_{2k} + a_{2k+1})$ converges.
6.3 ABSOLUTE CONVERGENCE
In this section we investigate what happens to a convergent series when its terms are replaced by their absolute values. We begin with some terminology.
6.18 DEFINITION. Let $S = \sum_{k=1}^{\infty} a_k$ be an infinite series.
(i) $S$ is said to converge absolutely if and only if $\sum_{k=1}^{\infty} |a_k| < \infty$.
(ii) $S$ is said to converge conditionally if and only if $S$ converges but not absolutely.
The Cauchy Criterion gives us the following test for absolute convergence.
6.19 Remark. A series $\sum_{k=1}^{\infty} a_k$ converges absolutely if and only if for every $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that
\[ m > n \ge N \quad\text{implies}\quad \sum_{k=n}^{m} |a_k| < \varepsilon. \tag{6} \]
As was the case for improper integrals, absolute convergence is stronger than convergence.
6.20 Remark. If $\sum_{k=1}^{\infty} a_k$ converges absolutely, then $\sum_{k=1}^{\infty} a_k$ converges, but not conversely. In particular, there exist conditionally convergent series.
PROOF. Suppose that $\sum_{k=1}^{\infty} a_k$ converges absolutely. Given $\varepsilon > 0$, choose $N \in \mathbf{N}$ so that (6) holds. Then
\[ \left| \sum_{k=n}^{m} a_k \right| \le \sum_{k=n}^{m} |a_k| < \varepsilon \]
for $m > n \ge N$. Hence, by the Cauchy Criterion, $\sum_{k=1}^{\infty} a_k$ converges.
We shall finish the proof by showing that $S := \sum_{k=1}^{\infty} (-1)^k/k$ converges conditionally. Since the harmonic series diverges, $S$ does not converge absolutely. On the other hand, the tails of $S$ look like
\[ \sum_{j=k}^{\infty} \frac{(-1)^j}{j} = (-1)^k \left( \frac{1}{k} - \frac{1}{k+1} + \frac{1}{k+2} - \cdots \right). \]
By grouping pairs of terms together, it is easy to see that the sum inside the parentheses is greater than 0 but less than $1/k$, i.e.,
\[ \left| \sum_{j=k}^{\infty} \frac{(-1)^j}{j} \right| < \frac{1}{k}. \]
Hence $\sum_{k=1}^{\infty} (-1)^k/k$ converges by Corollary 6.9. ∎
We shall see below that it is important to be able to identify absolutely convergent series. Since every result about series with nonnegative terms can be applied to the series $\sum_{k=1}^{\infty} |a_k|$, we already have three tests for absolute convergence (the Integral Test, the Comparison Test, and the Limit Comparison Test). We now develop two additional tests for absolute convergence that are arguably the most practical tests presented in this chapter.
Before we state these tests, we need to introduce another concept. (If you covered Section 2.5, you may skip the next half page and proceed directly to Theorem 6.23.)
An extended real number $a$ is called an adherent point of a real sequence $\{x_k\}$ if and only if there is a subsequence of $\{x_k\}$ that satisfies $x_{k_j} \to a$ as $j \to \infty$. For example, 1 and $-1$ are adherent points of $\{(-1)^k\}$ and $\infty$ is an adherent point of $\{\log k\}$.
Notice once and for all that if $a$ is an adherent point of a subsequence of $\{x_k\}$, then it is an adherent point of $\{x_k\}$. Also notice that every real sequence has at least one adherent point. Indeed, if the sequence is unbounded, then by definition, either $\infty$ or $-\infty$ is an adherent point. On the other hand, if it is bounded, then by the Bolzano-Weierstrass Theorem, it has a finite adherent point. Hence, the following concept makes sense.
6.21 DEFINITION. The supremum $s$ of the set of adherent points of a sequence $\{x_k\}$ is called the limit supremum of $\{x_k\}$. (Notation: $s := \limsup_{k \to \infty} x_k$.)
Thus the limit supremum of $(-1)^k$ is 1, of $3 + (-1)^k$ is 4, and of $-2 - (-1)^k$ is $-1$. (Definition 2.32, a more sophisticated definition of this concept, explains the etymology of the term "limit supremum." It is equivalent to Definition 6.21 by Remark 2.37.)
The only thing we need to know about limits supremum (for now) is the following result.
6.22 Remark. Let $x \in \mathbf{R}$ and $\{x_k\}$ be a real sequence.
(i) If $\limsup_{k \to \infty} x_k < x$, then $x_k < x$ for large $k$.
(ii) If $\limsup_{k \to \infty} x_k > x$, then $x_k > x$ for infinitely many $k$.
(iii) If $x_k \to x$ as $k \to \infty$, then $\limsup_{k \to \infty} x_k = x$.
PROOF. (i) Let $s := \limsup_{k \to \infty} x_k < x$ but suppose to the contrary that there exist natural numbers $k_1 < k_2 < \cdots$ such that $x_{k_j} \ge x$ for $j \in \mathbf{N}$. If $\{x_{k_j}\}$ is unbounded above, then $\infty$ is an adherent point of $\{x_k\}$ so $s = \infty$, a contradiction. If $\{x_{k_j}\}$ is bounded above (by $C$), then it is bounded (since $x \le x_{k_j} \le C$ for all $j \in \mathbf{N}$). Hence, by the Bolzano-Weierstrass Theorem and the fact that $x_{k_j} \ge x$, $\{x_k\}$ has an adherent point $\ge x$, i.e., $s \ge x$, another contradiction.
(ii) If $s > x$, then choose $a$ with $s > a > x$. By the Approximation Property, $\{x_k\}$ has an adherent point greater than $a$, i.e., there is a subsequence $\{x_{k_j}\}$ whose limit is greater than $a$; in particular, $x_{k_j} > a > x$ for large $j$.
(iii) If $x_k$ converges to $x$, then any subsequence $x_{k_j}$ also converges to $x$ (see Theorem 2.6), so $x$ is the only adherent point of $\{x_k\}$. ∎
The limit supremum gives a very useful and efficient test for absolute convergence.
6.23 THEOREM [ROOT TEST]. Let $a_k \in \mathbf{R}$ and $r := \limsup_{k \to \infty} |a_k|^{1/k}$.
(i) If $r < 1$, then $\sum_{k=1}^{\infty} a_k$ converges absolutely.
(ii) If $r > 1$, then $\sum_{k=1}^{\infty} a_k$ diverges.
PROOF. (i) Suppose that $r < 1$. Let $r < x < 1$ and notice that the geometric series $\sum_{k=1}^{\infty} x^k$ converges. By Remark 6.22 or Exercise 3, p. 55,
\[ |a_k|^{1/k} < x \]
for large $k$. Hence, $|a_k| < x^k$ for large $k$ and it follows from the Comparison Test that $\sum_{k=1}^{\infty} |a_k|$ converges.
(ii) Suppose that $r > 1$. By Remark 6.22 or Exercise 3, p. 55,
\[ |a_k|^{1/k} > 1 \]
for infinitely many $k \in \mathbf{N}$. Hence, $|a_k| > 1$ for infinitely many $k$ and it follows from the Divergence Test that $\sum_{k=1}^{\infty} a_k$ diverges. ∎
By Remark 6.22iii (or Theorem 2.36), if $r := \lim_{k \to \infty} |a_k|^{1/k}$ exists, then (by the Root Test) $\sum_{k=1}^{\infty} a_k$ converges absolutely when $r < 1$ and diverges when $r > 1$.
The following test is weaker than the Root Test (see Exercise 9) but is easier to use when the terms of $\sum_{k=1}^{\infty} a_k$ are made up of products (e.g., factorials).
6.24 THEOREM [RATIO TEST]. Let $a_k \in \mathbf{R}$ with $a_k \ne 0$ for large $k$ and suppose that
\[ r := \lim_{k \to \infty} \frac{|a_{k+1}|}{|a_k|} \]
exists as an extended real number.
(i) If $r < 1$, then $\sum_{k=1}^{\infty} a_k$ converges absolutely.
(ii) If $r > 1$, then $\sum_{k=1}^{\infty} a_k$ diverges.
PROOF. If $r > 1$, then $|a_{k+1}| \ge |a_k|$ for $k$ large and thus $a_k$ cannot converge to zero. Hence, by the Divergence Test, $\sum_{k=1}^{\infty} a_k$ diverges.
If $r < 1$, then observe for any $x \in (r, 1)$ that
\[ \frac{|a_{k+1}|}{|a_k|} < x = \frac{x^{k+1}}{x^k} \]
for $k$ large. Hence, the sequence $|a_k|/x^k$ is decreasing for large $k$ and thus bounded. In particular, there is an $M > 0$ such that $|a_k| \le M x^k$ for all $k \in \mathbf{N}$. Since $x < 1$, it follows from the Comparison Test that $\sum_{k=1}^{\infty} |a_k|$ converges. ∎
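To see the Ratio Test at work on terms built from factorials, the sketch below estimates the ratios $a_{k+1}/a_k$ for the illustrative choice $a_k = k!/k^k$ (a series selected here as an example, not one analyzed at this point in the text). The ratios are computed through logarithms with `math.lgamma` to avoid overflow; they should approach $1/e < 1$, which is consistent with absolute convergence.

```python
import math


def log_term(k: int) -> float:
    """Return log(a_k) for the illustrative terms a_k = k!/k^k."""
    return math.lgamma(k + 1) - k * math.log(k)


def ratio(k: int) -> float:
    """Return a_{k+1}/a_k, computed via logarithms to avoid overflow."""
    return math.exp(log_term(k + 1) - log_term(k))


if __name__ == "__main__":
    for k in (1, 10, 1_000, 100_000):
        print(k, ratio(k))
    print("1/e =", math.exp(-1))
```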
6.25 Remark. The Root and Ratio Tests are inconclusive when $r = 1$.
For example, under the Ratio Test $\sum_{k=1}^{\infty} 1/k$ and $\sum_{k=1}^{\infty} 1/k^2$ both yield $r = 1$. Nevertheless, the first series diverges whereas the second converges absolutely.
There are two ways to proceed when $r = 1$. There are tests that conclude that a series converges provided that its ratios converge to 1 rapidly enough. (Three of these tests are covered in Section 6.6 and its exercises.) There is also a very useful asymptotic estimate of $k!$ (called Stirling's Formula; see Theorem 12.73) that you may find useful on series with factors of the form $k!/k^k$ (see Exercise 6e, p. 172, or Exercise 2c, p. 183).
It is natural to assume that the usual laws of algebra hold for infinite series, e.g., associativity and commutativity. Is this assumption warranted? We have "inserted parentheses" (i.e., grouped terms together) to aid evaluation of some series (e.g., to evaluate some telescopic series and to prove that $\sum_{k=1}^{\infty} (-1)^k/k$ converges conditionally). This is valid for convergent series (absolutely or conditionally) because if the sequence of partial sums $s_n$ converges to $s$, then any subsequence $s_{n_k}$ also converges to $s$. The situation is more complicated when we start changing the order of the terms (compare Theorem 6.27 with Theorem 6.29). To describe what happens, we introduce the following terminology.
6.26 DEFINITION. A series $\sum_{j=1}^{\infty} b_j$ is called a rearrangement of a series $\sum_{k=1}^{\infty} a_k$ if and only if there is a 1-1 function $f$ from $\mathbf{N}$ onto $\mathbf{N}$ such that
\[ b_{f(k)} = a_k, \qquad k \in \mathbf{N}. \]
The following result demonstrates why absolutely convergent series are so important.
6.27 THEOREM. If $\sum_{k=1}^{\infty} a_k$ converges absolutely and $\sum_{j=1}^{\infty} b_j$ is any rearrangement of $\sum_{k=1}^{\infty} a_k$, then $\sum_{j=1}^{\infty} b_j$ converges and
\[ \sum_{k=1}^{\infty} a_k = \sum_{j=1}^{\infty} b_j. \]
PROOF. Let $\varepsilon > 0$. Set $s_n = \sum_{k=1}^{n} a_k$, $s = \sum_{k=1}^{\infty} a_k$, and $t_m = \sum_{j=1}^{m} b_j$, $n, m \in \mathbf{N}$. Since $\sum_{k=1}^{\infty} a_k$ converges absolutely, we can choose $N \in \mathbf{N}$ (see Corollary 6.9) such that
\[ \sum_{k=N+1}^{\infty} |a_k| < \frac{\varepsilon}{2}. \tag{7} \]
Thus
\[ |s_N - s| = \left| \sum_{k=N+1}^{\infty} a_k \right| \le \sum_{k=N+1}^{\infty} |a_k| < \frac{\varepsilon}{2}. \tag{8} \]
Let $f$ be a 1-1 function from $\mathbf{N}$ onto $\mathbf{N}$ that satisfies
\[ b_{f(k)} = a_k, \qquad k \in \mathbf{N}, \]
and set $M = \max\{f(1), \ldots, f(N)\}$. Notice that
\[ \{a_1, \ldots, a_N\} \subseteq \{b_1, \ldots, b_M\}. \]
Let $m \ge M$. Then $t_m - s_N$ contains only $a_k$'s whose indices satisfy $k > N$. Thus, it follows from (7) that
\[ |t_m - s_N| \le \sum_{k=N+1}^{\infty} |a_k| < \frac{\varepsilon}{2}. \]
Hence by (8),
\[ |t_m - s| \le |t_m - s_N| + |s_N - s| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \]
for $m \ge M$. Therefore,
\[ s = \sum_{j=1}^{\infty} b_j. \ ∎ \]
The rest of this section, which is used nowhere else in this book, is optional.
We now show that Theorem 6.27 fails in a catastrophic way for conditionally convergent series (see Theorem 6.29). To facilitate our discussion, recall (see Exercise 1, p. 11) that the positive and negative parts of an $a \in \mathbf{R}$ are defined by
\[ a^+ := \frac{|a| + a}{2} = \begin{cases} a & a \ge 0 \\ 0 & a < 0 \end{cases} \qquad\text{and}\qquad a^- := \frac{|a| - a}{2} = \begin{cases} 0 & a \ge 0 \\ -a & a < 0. \end{cases} \]
Notice that
\[ |a| = a^+ + a^- \tag{9} \]
and
\[ a = a^+ - a^- \tag{10} \]
for all $a \in \mathbf{R}$.
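*6.28 Lemma. Suppose that $a_k \in \mathbf{R}$ for $k \in \mathbf{N}$.
(i) If $\sum_{k=1}^{\infty} a_k$ converges absolutely, then so do $\sum_{k=1}^{\infty} a_k^+$ and $\sum_{k=1}^{\infty} a_k^-$. In fact,
\[ \sum_{k=1}^{\infty} |a_k| = \sum_{k=1}^{\infty} a_k^+ + \sum_{k=1}^{\infty} a_k^- \qquad\text{and}\qquad \sum_{k=1}^{\infty} a_k = \sum_{k=1}^{\infty} a_k^+ - \sum_{k=1}^{\infty} a_k^-. \]
(ii) If $\sum_{k=1}^{\infty} a_k$ converges conditionally, then
\[ \sum_{k=1}^{\infty} a_k^+ = \sum_{k=1}^{\infty} a_k^- = \infty. \]
PROOF. By definition, $a_k^+ = (|a_k| + a_k)/2$. Since both $\sum_{k=1}^{\infty} |a_k|$ and $\sum_{k=1}^{\infty} a_k$ converge, it follows from Theorem 6.10 that
\[ \sum_{k=1}^{\infty} a_k^+ = \frac{1}{2} \sum_{k=1}^{\infty} |a_k| + \frac{1}{2} \sum_{k=1}^{\infty} a_k \]
converges. Similarly,
\[ \sum_{k=1}^{\infty} a_k^- = \frac{1}{2} \sum_{k=1}^{\infty} |a_k| - \frac{1}{2} \sum_{k=1}^{\infty} a_k \]
converges. This proves part (i).
Suppose that part (ii) is false. By symmetry we may suppose that $\sum_{k=1}^{\infty} a_k^+$ converges. Since $\sum_{k=1}^{\infty} a_k$ converges, it follows from (10) that
\[ \sum_{k=1}^{\infty} a_k^- = \sum_{k=1}^{\infty} a_k^+ - \sum_{k=1}^{\infty} a_k \]
converges. Thus,
\[ \sum_{k=1}^{\infty} |a_k| = \sum_{k=1}^{\infty} a_k^+ + \sum_{k=1}^{\infty} a_k^- \]
converges, a contradiction. ∎
We are prepared to show that Theorem 6.27 is false if the hypothesis "absolutely convergent" is dropped. In fact, as the following result shows, rearrangements of conditionally convergent series can converge to anything one wishes (see also Exercise 10).
*6.29 THEOREM [RIEMANN]. Let $x \in \mathbf{R}$. If $\sum_{k=1}^{\infty} a_k$ is conditionally convergent, then there is a rearrangement of $\sum_{k=1}^{\infty} a_k$ that converges to $x$.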
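Riemann's theorem can be illustrated on the conditionally convergent series $\sum_{k=1}^{\infty} (-1)^k/k$ from Remark 6.20. The Python sketch below uses a simplified greedy rule, which is an assumption of this illustration and not the exact bookkeeping of the proof: while the running total is at most the target, take the next unused positive term; otherwise take the next unused negative term. The function name and the sample targets are hypothetical.

```python
def rearranged_partial_sums(target: float, n_terms: int) -> list:
    """Partial sums of a greedy rearrangement of sum (-1)^k / k.

    Positive terms are +1/2, +1/4, ... (even k); negative terms are
    -1, -1/3, ... (odd k). Each original term is used at most once and
    in its original relative order within its sign class.
    """
    next_even = 2   # index of the next unused positive term
    next_odd = 1    # index of the next unused negative term
    total = 0.0
    sums = []
    for _ in range(n_terms):
        if total <= target:
            total += 1.0 / next_even
            next_even += 2
        else:
            total -= 1.0 / next_odd
            next_odd += 2
        sums.append(total)
    return sums


if __name__ == "__main__":
    for target in (1.0, -2.0, 0.0):
        sums = rearranged_partial_sums(target, 100_000)
        print(target, sums[-1])
```

Since both sign classes sum to infinity (Lemma 6.28), the rule never gets stuck on one class, and since the terms tend to zero, each crossing of the target overshoots by less and less.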
STRATEGY: The idea behind the proof is simple. Since $\sum_{k=1}^{\infty} a_k^+ = \sum_{k=1}^{\infty} a_k^- = \infty$ by Lemma 6.28, begin by adding enough $a_k^+$'s until the resulting partial sum is $> x$. Then subtract enough $a_k^-$'s until the resulting partial sum is $< x$, and continue adding and subtracting. Since $a_k \to 0$ as $k \to \infty$, the resulting partial sums should be getting closer to $x$. We now make this precise.
PROOF. Since $\sum_{k=1}^{\infty} a_k^+ = \infty$, let $k_1$ be the smallest integer that satisfies
\[ s_{k_1} := a_1^+ + a_2^+ + \cdots + a_{k_1}^+ > x. \]
Since $k_1$ is least, $s_{k_1 - 1} := a_1^+ + a_2^+ + \cdots + a_{k_1 - 1}^+ \le x$, so $s_{k_1} \le x + a_{k_1}^+$. Therefore,
\[ x < s_{k_j} \le x + a_{k_j}^+ \tag{11} \]
for $j = 1$. Similarly, since $\sum_{k=1}^{\infty} a_k^- = \infty$, let $r_1 > k_1$ be the smallest integer that satisfies
\[ s_{r_1} := s_{k_1} - a_1^- - \cdots - a_{r_1 - k_1}^- < x \]
and observe that
\[ x - a_{r_j - k_j}^- \le s_{r_j} < x \tag{12} \]
for $j = 1$. Continuing, we generate integers $k_1 < r_1 < k_2 < \cdots$ least, so that
and and so that (11) and (12) hold. Since each at and -a k is either ak or 0, it is clear (after deleting the zero terms) that the sn's are the partial sums of a rearrangement of 2:%:1 ak· Moreover, since ak --+ 0 as k --+ 00, (11) and (12) together with the Squeeze Theorem imply that both Sk J and srJ converge to x as j --+ 00. Suppose that n E N with n 2: k 1 . Then there is a j E N such that either k j ::; n < rj or rj ::; n < kj+l. In the former case, since Sn is formed from Sk J by adding negative terms, Similarly, in the latter case we have
We conclude by the Squeeze Theorem that $s_n \to x$ as $n \to \infty$. ∎
EXERCISES
1. Prove that each of the following series converges.
(a) $\displaystyle \sum_{k=1}^{\infty} \frac{1}{k!}$.
(b) $\displaystyle \sum_{k=1}^{\infty} \frac{1}{k^k}$.
2. Decide, using results covered so far in this chapter, which of the following series converge and which diverge. 00
(a)
k2
k'
L 2~· k=l 00
Lk· k=l IT
(b)
00
(e) {;
2
k'
(
00
(c) {;
)k
(k +·2)!
(f)
(k+1 2k + 3
r
(d)
~ (3 + ~-l)k) k
~ (IT -~) k- 1. (1 + (_l)k)k L ek . k=l 00
(g)
3. Using Exercise 9, p. 135, prove that •
smx
00
=
L
k=O
(_1)k x 2k+1 (2k I)!
+
for all x E [0, IT /2J. 4. Define ak recursively by a1
00
and
cos x
=L
(_1)kx2k (2k)!
k=O
= 1 and k>1.
Prove that 2:%"=1 ak converges absolutely. 5. Suppose that ak ~ 0 and a~/k ---7 a as k ---7 00. Prove that 2:~1 akxk converges absolutely for all Ixl < l/a if a =I- 0 and for all x E R if a = O. 6. For each of the following, find all values of pER for which the given series converges absolutely. 1
00
(a) ".,......,......;::-:P
~ klog k·
k=2
1 (d) { ; Vk(k P
00
(b)
1
L 10 k=2 g
P
k· 00
00
7. Suppose that akj
-
~
1). 0 for k,j EN. Set
(f)
L( Vk k=l
2p
+ 1- k P ).
6.4
Alternating series
173
for each kEN, and suppose that L%"=l Ak converges. (a) Prove that
(b) Show that
(c) Prove that (b) may not hold if akj has both positive and negative values. Hint: Consider
j=k j=k+1
otherwise. 8. (a) Suppose that L%:l ak converges absolutely. Prove that L%"=l lak IP converges for all p :::: l. (b) Suppose that L%:l ak converges conditionally. Prove that L%:l kPak diverges for all p > l. 9. (a) Let an > 0 for n EN. Set b1 = 0, b2 = log(a2/ar), and k = 3,4, ....
Prove that if
. an r= 11 m - -
exists and is positive, then
k00 lim log(a;(n) = lim Ln (1-1 -) bk = Lb k = logr. n_oo n-oo n k=l
k=l r as n -+
(b) Prove that if an E R \ {O} and lan+danl -+ 00, for some r > 0, then la n l1 / n -+ r as n -+ 00. *10. Let x ::; y be any pair of extended real numbers. Prove that if L%"=l ak is conditionally convergent, then there is a rearrangement L;:l bj of L%:l ak whose partial sums Sn satisfy liminf Sn = x n-oo
and
limsupsn = y.
6.4 ALTERNATING SERIES We have identified many tests for absolute convergence but have said little about conditionally convergent series. In this section we derive two tests to use on series whose terms are of mixed sign. Both tests rely on the following algebraic observation. (This result will also be used in Chapter 7 to prove that limits of power series are continuous.)
6.30 THEOREM [ABEL'S FORMULA]. Let $\{a_k\}_{k \in \mathbf{N}}$ and $\{b_k\}_{k \in \mathbf{N}}$ be real sequences, and for each pair of integers $n \ge m \ge 1$ set
\[ A_{n,m} := \sum_{k=m}^{n} a_k. \]
Then
\[ \sum_{k=m}^{n} a_k b_k = A_{n,m} b_n - \sum_{k=m}^{n-1} A_{k,m}(b_{k+1} - b_k) \]
for all integers $n > m \ge 1$.
PROOF. Since $A_{k,m} - A_{(k-1),m} = a_k$ for $k > m$ and $A_{m,m} = a_m$, we have
\begin{align*}
\sum_{k=m}^{n} a_k b_k &= a_m b_m + \sum_{k=m+1}^{n} (A_{k,m} - A_{(k-1),m}) b_k \\
&= a_m b_m + \sum_{k=m+1}^{n} A_{k,m} b_k - \sum_{k=m}^{n-1} A_{k,m} b_{k+1} \\
&= a_m b_m + \sum_{k=m+1}^{n-1} A_{k,m} b_k + A_{n,m} b_n - \sum_{k=m+1}^{n-1} A_{k,m} b_{k+1} - A_{m,m} b_{m+1} \\
&= A_{n,m} b_n - A_{m,m}(b_{m+1} - b_m) - \sum_{k=m+1}^{n-1} A_{k,m}(b_{k+1} - b_k) \\
&= A_{n,m} b_n - \sum_{k=m}^{n-1} A_{k,m}(b_{k+1} - b_k). \ ∎
\end{align*}
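Abel's Formula is a finite identity, so it can be checked directly on sample data. The sketch below is illustrative only: the helper names and the test sequences $a_k = (-1)^k$, $b_k = 1/k$ are choices made here. Python lists are 0-indexed, so `a[k - 1]` stands for $a_k$.

```python
def block_sum(a: list, m: int, k: int) -> float:
    """A_{k,m} = a_m + ... + a_k, where a[0] holds a_1."""
    return sum(a[j - 1] for j in range(m, k + 1))


def abel_left(a: list, b: list, m: int, n: int) -> float:
    """Left side of Abel's Formula: sum of a_k * b_k for k = m..n."""
    return sum(a[k - 1] * b[k - 1] for k in range(m, n + 1))


def abel_right(a: list, b: list, m: int, n: int) -> float:
    """Right side: A_{n,m} b_n - sum_{k=m}^{n-1} A_{k,m} (b_{k+1} - b_k)."""
    boundary = block_sum(a, m, n) * b[n - 1]
    # b[k] is b_{k+1} and b[k - 1] is b_k in 0-indexed storage.
    correction = sum(block_sum(a, m, k) * (b[k] - b[k - 1])
                     for k in range(m, n))
    return boundary - correction


if __name__ == "__main__":
    a = [(-1) ** k for k in range(1, 13)]
    b = [1.0 / k for k in range(1, 13)]
    for m, n in ((1, 12), (3, 10), (5, 6)):
        print(m, n, abel_left(a, b, m, n), abel_right(a, b, m, n))
```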
This result is somewhat easier to remember using the following analogy. If $f : [1, N] \to \mathbf{R}$ for some $N \in \mathbf{N}$, then the summation $\sum_{k=1}^{N-1} f(k)$ is an approximation to $\int_1^N f(x)\,dx$ and the finite difference $f(k+1) - f(k)$ is an approximation to $f'(k)$ for $k = 1, 2, \ldots, N-1$. In particular, summation is an analogue of integration and finite difference is an analogue of differentiation. In this context, Abel's Formula can be interpreted as a discrete analogue of integration by parts.
Our first application of Abel's Formula is the following test. (Notice that it does not require the $a_k$'s to be nonnegative.)
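6.31 THEOREM [DIRICHLET'S TEST]. Let $a_k, b_k \in \mathbf{R}$ for $k \in \mathbf{N}$. If the sequence of partial sums $s_n = \sum_{k=1}^{n} a_k$ is bounded and $b_k \downarrow 0$ as $k \to \infty$, then $\sum_{k=1}^{\infty} a_k b_k$ converges.

PROOF. Choose $M > 0$ such that

$$|s_n| = \left|\sum_{k=1}^{n} a_k\right| \le \frac{M}{2}, \qquad n \in \mathbf{N}.$$

By the triangle inequality,

$$|A_{n,m}| = |s_n - s_{m-1}| \le \frac{M}{2} + \frac{M}{2} = M \tag{13}$$

for $n > m > 1$. Let $\varepsilon > 0$ and choose $N \in \mathbf{N}$ so that $|b_k| < \varepsilon/M$ for $k \ge N$. Since $\{b_k\}$ is decreasing and nonnegative, we find by Abel's Formula, (13), and telescoping that

$$\left|\sum_{k=m}^{n} a_k b_k\right| \le |A_{n,m}|\,|b_n| + \sum_{k=m}^{n-1} |A_{k,m}|\,(b_k - b_{k+1}) \le M b_n + M(b_m - b_n) = M b_m < \varepsilon$$

for all $n > m \ge N$. ∎

The following special case of Dirichlet's Test is widely used.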
6.32 COROLLARY [ALTERNATING SERIES TEST]. If $a_k \downarrow 0$ as $k \to \infty$, then

$$\sum_{k=1}^{\infty} (-1)^k a_k$$

converges.

PROOF. Since the partial sums of $\sum_{k=1}^{\infty} (-1)^k$ are bounded, $\sum_{k=1}^{\infty} (-1)^k a_k$ converges by Dirichlet's Test. ∎

We note that the series $\sum_{k=1}^{\infty} (-1)^k/k$, used in Remark 6.20, is an alternating series. Here is another example.

6.33 Example. Prove that $\sum_{k=2}^{\infty} (-1)^k/\log k$ converges.

PROOF. Since $1/\log k \downarrow 0$ as $k \to \infty$, this follows immediately from the Alternating Series Test. ∎
The Dirichlet Test can be used for more than just alternating series.

*6.34 Example. Prove that $S(x) = \sum_{k=1}^{\infty} \sin(kx)/k$ converges for each $x \in \mathbf{R}$.

PROOF. Since $\phi(x) = \sin(kx)$ is periodic of period $2\pi$ (i.e., $\phi(x + 2\pi) = \phi(x)$ for all $x \in \mathbf{R}$) and has value identically zero when $x = 0$ or $2\pi$, we need only show that $S(x)$ converges for each $x \in (0, 2\pi)$. By Dirichlet's Test, it suffices to show that

$$D_n(x) := \sum_{k=1}^{n} \sin(kx), \qquad n \in \mathbf{N}, \tag{14}$$

is a bounded sequence for each fixed $x \in (0, 2\pi)$.
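This proof, originally discovered by Dirichlet, involves a clever trick that leads to a formula for $D_n$. Indeed, applying a sum angle formula (see Appendix B) and telescoping, we have

$$2 \sin\frac{x}{2}\, D_n(x) = \sum_{k=1}^{n} 2 \sin\frac{x}{2} \sin(kx) = \cos\frac{x}{2} - \cos\left(\left(n + \frac{1}{2}\right)x\right).$$

Therefore,

$$|D_n(x)| = \left|\frac{\cos\frac{x}{2} - \cos\left(\left(n + \frac{1}{2}\right)x\right)}{2 \sin\frac{x}{2}}\right| \le \frac{1}{\left|\sin\frac{x}{2}\right|}$$

for all $n \in \mathbf{N}$. ∎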
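The closed form for $D_n(x)$ and the resulting bound can be sanity-checked numerically. The sketch below is only an illustration in floating-point arithmetic; the sample point $x = 1$ and the helper names are assumptions made for the example.

```python
import math

def partial_sums(n_max, x):
    """Return [D_1(x), ..., D_{n_max}(x)] where D_n(x) = sin(x) + ... + sin(n x)."""
    total, sums = 0.0, []
    for k in range(1, n_max + 1):
        total += math.sin(k * x)
        sums.append(total)
    return sums

def closed_form(n, x):
    """(cos(x/2) - cos((n + 1/2) x)) / (2 sin(x/2)), the formula derived in the proof."""
    return (math.cos(x / 2) - math.cos((n + 0.5) * x)) / (2 * math.sin(x / 2))

x = 1.0                                 # any fixed point of (0, 2*pi) would do
sums = partial_sums(2000, x)
worst = max(abs(s) for s in sums)       # largest |D_n(x)| seen for n <= 2000
bound = 1 / abs(math.sin(x / 2))        # the bound 1 / |sin(x/2)| from the proof
```

Here `worst` stays below `bound` no matter how large `n_max` is taken, which is exactly the boundedness that Dirichlet's Test requires.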
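EXERCISES

1. Prove that each of the following series converges.

(a) $\displaystyle\sum_{k=1}^{\infty} (-1)^k \left(\frac{\pi}{2} - \arctan k\right)$.

(c) $\displaystyle\sum_{k=1}^{\infty} \frac{(-1)^k}{k^p}$, $p > 0$.

(d) $\displaystyle\sum_{k=1}^{\infty} \frac{\sin(kx)}{k^p}$, $x \in \mathbf{R}$, $p > 0$.

(e) $\displaystyle\sum_{k=1}^{\infty} \frac{(-1)^k}{k^2} \cdot \frac{2 \cdot 4 \cdots (2k)}{1 \cdot 3 \cdots (2k-1)}$.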
2. For each of the following, find all values x E R for which the given series converges. k
00
(a)
L xk .
00
(b)
k=l
L
3k
~k
.
k=l
3. Using any test covered in this chapter, find out which of the following series converge absolutely, which converge conditionally, and which diverge.
L oo
(b)
k=l
L k=l 00
(c)
(k
+ l)k
p
kk"
.
00
p> e.
(d)
L k=l
(-1)(-3) .. · (1 - 2k) . 1· 4 .. · ( 3k -) 2
(_l)k+ly'k k
+1
.
4. [ABEL'S TEST] Suppose that $\sum_{k=1}^{\infty} a_k$ converges and $b_k \downarrow b$ as $k \to \infty$. Prove that $\sum_{k=1}^{\infty} a_k b_k$ converges.

*5. Prove that

$$\sum_{k=1}^{\infty} a_k \cos(kx)$$

converges for every $x \in (0, 2\pi)$ and every $a_k \downarrow 0$. What happens when $x = 0$?

*6. Suppose that $a_k \downarrow 0$ as $k \to \infty$. Prove that

$$\sum_{k=1}^{\infty} a_k \sin((2k+1)x)$$

converges for all $x \in \mathbf{R}$.

7. Show that under the hypotheses of Dirichlet's Test,

$$\sum_{k=1}^{\infty} a_k b_k = \sum_{k=1}^{\infty} s_k (b_k - b_{k+1}).$$
8. Suppose that {ad and {bd are real sequences such that
nEN. Prove that 2:~=1 akbk converges. 9. Suppose that 2:~=1 ak converges. Prove that if converges, then
bk
i
00 and 2:~=1
akbk
as $m \to \infty$.

e6.5 ESTIMATION OF SERIES
In practice, one estimates a convergent series by truncation, i.e., by adding finitely many terms of the given series. In this section we show how to estimate the error associated with such a truncation. The proofs of several of our earlier tests actually contain estimates of the truncation error. Here is what we can get from the Integral Test.
6.35 THEOREM. Suppose that $f : [1, \infty) \to \mathbf{R}$ is positive and decreasing on $[1, \infty)$. Then

$$f(n) \le \sum_{k=1}^{n} f(k) - \int_1^n f(x)\,dx \le f(1)$$

for $n \in \mathbf{N}$. Moreover, if $\sum_{k=1}^{\infty} f(k)$ converges, then

$$0 \le \sum_{k=1}^{n} f(k) + \int_n^{\infty} f(x)\,dx - \sum_{k=1}^{\infty} f(k) \le f(n)$$

for all $n \in \mathbf{N}$.

PROOF. The first set of inequalities has already been verified (see (3)). To establish the second set, let $u_k = s_k - t_k$ for $k \in \mathbf{N}$, and observe, since $f$ is decreasing, that

$$0 \le u_k - u_{k+1} = \int_k^{k+1} f(x)\,dx - f(k+1) \le f(k) - f(k+1).$$

Summing these inequalities over $k \ge n$ and telescoping, we have

$$0 \le u_n - \lim_{j\to\infty} u_j = \sum_{k=n}^{\infty} (u_k - u_{k+1}) \le \sum_{k=n}^{\infty} (f(k) - f(k+1)) = f(n).$$

Since $u_j \to \sum_{k=1}^{\infty} f(k) - \int_1^{\infty} f(x)\,dx$ as $j \to \infty$, we conclude that

$$0 \le \sum_{k=1}^{n} f(k) + \int_n^{\infty} f(x)\,dx - \sum_{k=1}^{\infty} f(k) \le f(n). \qquad ∎$$
The following example shows how to use this result to estimate the accuracy of a truncation of a series to which the Integral Test applies.
6.36 Example. Prove that $\sum_{k=1}^{\infty} k e^{-k^2}$ converges and estimate its value to an accuracy of $10^{-3}$.

PROOF. Let $f(x) = x e^{-x^2}$. Since $f'(x) = e^{-x^2}(1 - 2x^2) \le 0$ for $x \ge 1$, $f$ is decreasing on $[1, \infty)$. Since

$$\int_1^{\infty} x e^{-x^2}\,dx = \frac{1}{2} \int_1^{\infty} e^{-u}\,du = \frac{1}{2e} < \infty,$$

it follows from the Integral Test that $\sum_{k=1}^{\infty} k e^{-k^2}$ converges. To estimate the value $s$ of this series, notice that $f(2) = 0.036631$ and $f(3) = 0.000370$. Therefore, by Theorem 6.35, $s$ is approximately equal to

$$\sum_{k=1}^{3} k e^{-k^2} + \int_3^{\infty} x e^{-x^2}\,dx = \frac{1}{e} + \frac{2}{e^4} + \frac{3}{e^9} + \frac{1}{2e^9} \approx 0.4049427$$

with an error no more than $0.000370$. ∎
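The estimate in Example 6.36 is easy to reproduce by machine. The following sketch assumes only the closed form $\int_n^{\infty} x e^{-x^2}\,dx = e^{-n^2}/2$ used above; the function names and the cutoff of 40 terms for the reference value are illustrative choices.

```python
import math

def f(x):
    """The summand f(x) = x * exp(-x^2)."""
    return x * math.exp(-x * x)

def tail_integral(n):
    """Integral of x * exp(-x^2) from n to infinity, which equals exp(-n^2) / 2."""
    return math.exp(-n * n) / 2

def estimate(n):
    """Partial sum through n plus the tail integral, as in Theorem 6.35."""
    return sum(f(k) for k in range(1, n + 1)) + tail_integral(n)

approx = estimate(3)
# Reference value: the terms beyond k = 39 are far below double precision.
reference = sum(f(k) for k in range(1, 40))
error = approx - reference
```

According to Theorem 6.35, `error` should lie between 0 and `f(3)`; in practice it is much smaller than the guaranteed bound, because the bound ignores the cancellation between the tail integral and the tail sum.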
The next example shows that Theorem 6.35 can be used to estimate divergent series as well.
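6.37 Example. Prove that there exist numbers $C_n \in (0, 1]$ such that

$$\sum_{k=1}^{n} \frac{1}{k} = \log n + C_n$$

for all $n \in \mathbf{N}$.

PROOF. Clearly, $f(x) = 1/x$ is positive, decreasing, and locally integrable on $[1, \infty)$. Hence, by Theorem 6.35,

$$\frac{1}{n} \le \sum_{k=1}^{n} \frac{1}{k} - \int_1^n \frac{dx}{x} = \sum_{k=1}^{n} \frac{1}{k} - \log n \le 1. \qquad ∎$$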
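The numbers $C_n$ of Example 6.37 can be tabulated directly. This is a small illustrative sketch (the helper name `C` and the sample values of $n$ are chosen for the example); it checks the two-sided bound $1/n \le C_n \le 1$ supplied by Theorem 6.35.

```python
import math

def C(n):
    """C_n = 1 + 1/2 + ... + 1/n - log(n)."""
    return sum(1 / k for k in range(1, n + 1)) - math.log(n)

sample_n = (1, 10, 100, 1000, 10000)
values = [C(n) for n in sample_n]
within_bounds = all(1 / n <= c <= 1 for n, c in zip(sample_n, values))
```

So although the harmonic series diverges, its partial sums differ from $\log n$ by a quantity that never leaves $(0, 1]$.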
Next, we see what the Alternating Series Test has to say about truncation error.
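6.38 THEOREM. Suppose that $a_k \downarrow 0$ as $k \to \infty$. If $s = \sum_{k=1}^{\infty} (-1)^k a_k$ and $s_n = \sum_{k=1}^{n} (-1)^k a_k$, then

$$0 \le |s - s_n| \le a_{n+1}$$

for all $n \in \mathbf{N}$.

PROOF. Suppose first that $n$ is even, say $n = 2m$. Then

$$0 \ge \sum_{k=2m+1}^{\infty} (-1)^k a_k = s - s_n = -a_{2m+1} + (a_{2m+2} - a_{2m+3}) + (a_{2m+4} - a_{2m+5}) + \cdots \ge -a_{2m+1};$$

i.e., $0 \ge s - s_n \ge -a_{n+1}$. (The first inequality follows by grouping the same terms as $-(a_{2m+1} - a_{2m+2}) - (a_{2m+3} - a_{2m+4}) - \cdots$.) A similar argument proves that $0 \le s - s_n \le a_{n+1}$ when $n$ is odd. ∎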
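The error bound of Theorem 6.38 can be observed on a series whose value is known in closed form. The sketch below uses $\sum_{k=1}^{\infty} (-1)^k/k = -\log 2$; the function name and the range of $n$ tested are illustrative assumptions.

```python
import math

def partial_sum(n):
    """s_n = sum of (-1)^k / k for k = 1, ..., n."""
    return sum((-1) ** k / k for k in range(1, n + 1))

s = -math.log(2)   # value of the full alternating series with a_k = 1/k

# Check |s - s_n| <= a_{n+1} = 1/(n+1) for a range of n.
errors_ok = all(abs(s - partial_sum(n)) <= 1 / (n + 1) for n in range(1, 200))

# For even n the proof gives the sign as well: -a_{n+1} <= s - s_n <= 0.
even_gap = s - partial_sum(2)
```

The bound is nearly sharp here: the actual error behaves like $1/(2n)$, which is about half of $a_{n+1}$.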
This result can be used to estimate the error of a truncation of any alternating series.
6.39 Example. For each $\alpha > 0$, prove that the series $\sum_{k=1}^{\infty} (-1)^k k/(k^2 + \alpha)$ converges. If $s_n$ represents its $n$th partial sum and $s$ its value, find an $n$ so large that $s_n$ approximates $s$ to an accuracy of $10^{-2}$.

PROOF. Let $f(x) = x/(x^2 + \alpha)$ and note that $f(x) \to 0$ as $x \to \infty$. Since $f'(x) = (\alpha - x^2)/(x^2 + \alpha)^2$ is negative for $x > \sqrt{\alpha}$, it follows that $k/(k^2 + \alpha) \downarrow 0$ as $k \to \infty$. Hence, the given series converges by the Alternating Series Test.

By Theorem 6.38, $s_n$ will estimate $s$ to an accuracy of $10^{-2}$ if $f(n) < 10^{-2}$, i.e., if $n^2 - 100n + \alpha > 0$. When $\alpha > 50^2$, this last quadratic has no real roots; hence, the inequality is always satisfied and we may choose $n = 1$. When $\alpha \le 50^2$, the quadratic has roots $50 \pm \sqrt{50^2 - \alpha}$. Hence, choose any $n$ that satisfies $n > 50 + \sqrt{50^2 - \alpha}$. ∎

Finally, we examine what information the proofs of the Root and Ratio Tests contain about accuracy of truncations.
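6.40 THEOREM. Suppose that $\sum_{k=1}^{\infty} a_k$ converges absolutely and $s$ is the value of $\sum_{k=1}^{\infty} |a_k|$.

(i) If there exist numbers $x \in (0, 1)$ and $N \in \mathbf{N}$ such that $|a_k|^{1/k} \le x$ for all $k > N$, then

$$0 \le s - \sum_{k=1}^{n} |a_k| \le \frac{x^{n+1}}{1 - x}$$

for all $n \ge N$.

(ii) If there exist numbers $x \in (0, 1)$ and $N \in \mathbf{N}$ such that $|a_{k+1}| \le x\,|a_k|$ for $k > N$, then

for all $n \ge N$.

PROOF. Let $n \ge N$. Since $|a_k| \le x^k$ for $k > N$, we have, by summing a geometric series, that

$$0 \le s - \sum_{k=1}^{n} |a_k| = \sum_{k=n+1}^{\infty} |a_k| \le \sum_{k=n+1}^{\infty} x^k = \frac{x^{n+1}}{1 - x}$$

for all $n \ge N$. This proves part (i). The proof of part (ii) is left as an exercise. ∎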
6.41 Example. Prove that $\sum_{k=1}^{\infty} k^{2k}/(3k^2 + k)^k$ converges absolutely. If $s_n$ represents its $n$th partial sum and $s$ its value, find an $n$ so large that $s_n$ approximates $s$ to an accuracy of $10^{-2}$.

SOLUTION. Since

$$\left(\frac{k^{2k}}{(3k^2 + k)^k}\right)^{1/k} = \frac{k^2}{3k^2 + k} \le \frac{1}{3}$$

for all $k \ge N := 1$, the series converges absolutely by the Root Test. Since

$$\frac{(1/3)^{n+1}}{1 - 1/3} \le 10^{-2}$$

for $n \ge 4$, we conclude by Theorem 6.40i that it takes at most four terms to approximate the value of this series to an accuracy of $10^{-2}$. ∎
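The count of four terms in Example 6.41 can be confirmed numerically. The sketch below is illustrative only; it rewrites the summand as $(k/(3k+1))^k$, and the names `term`, `bound`, and the 200-term reference sum are assumptions made for the example.

```python
def term(k):
    """k^(2k) / (3k^2 + k)^k, simplified to (k / (3k + 1))^k."""
    return (k / (3 * k + 1)) ** k

def bound(n, x=1 / 3):
    """Truncation bound x^(n+1) / (1 - x) from Theorem 6.40(i)."""
    return x ** (n + 1) / (1 - x)

# Smallest n for which the bound guarantees accuracy 10^-2.
n_needed = next(n for n in range(1, 50) if bound(n) <= 1e-2)

s4 = sum(term(k) for k in range(1, 5))
reference = sum(term(k) for k in range(1, 200))   # later terms are negligible
actual_error = reference - s4
```

The observed error after four terms is somewhat smaller than the guaranteed bound, as expected, since $k^2/(3k^2+k)$ is strictly less than $1/3$.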
EXERCISES

1. For each of the following series, let $s_n$ represent its partial sums and $s$ its value. Prove that $s$ is finite and find an $n$ so large that $s_n$ approximates $s$ to an accuracy of $10^{-2}$.

(a) $\displaystyle\sum_{k=1}^{\infty} (-1)^k \left(\frac{\pi}{2} - \arctan k\right)$.

(b) $\displaystyle\sum_{k=1}^{\infty} \frac{(-1)^k k^2}{2^k}$.

(c) $\displaystyle\sum_{k=1}^{\infty} \frac{(-1)^k}{k^2} \cdot \frac{2 \cdot 4 \cdots (2k)}{1 \cdot 3 \cdots (2k-1)}$.

2. (a) Find all $p \ge 0$ such that the following series converges.

$$\sum_{k=1}^{\infty} \frac{1}{k \log^p(k+1)}$$

(b) For each such $p$, prove that the partial sums $s_n$ of this series and its value $s$ satisfy

$$|s - s_n| \le \frac{n + p - 1}{n(p - 1)} \left(\frac{1}{\log^{p-1}(n)}\right)$$

for all $n \ge 2$.

3. For each of the following series, let $s_n$ represent its partial sums and $s$ represent its value. Prove that $s$ is finite and find an $n$ so large that $s_n$ approximates $s$ to an accuracy of $10^{-2}$.

(a) $\displaystyle\sum_{k=1}^{\infty} \frac{1}{k!}$.

(b) $\displaystyle\sum_{k=1}^{\infty} \frac{1}{k^k}$.

(c) $\displaystyle\sum_{k=1}^{\infty} \frac{2^k}{k!}$.

4. Prove Theorem 6.40ii.

e6.6 ADDITIONAL TESTS

If the Ratio or Root Test yields a value $r = 1$, then no conclusion can be made. There are some tests designed to handle just that situation (see Exercise 3). We cover two of them in this section (see also Exercises 4 and 5). The first test compares the growth of the terms of a series with the growth of the logarithm function.
6.42 THEOREM [LOGARITHMIC TEST]. Suppose that $a_k \ne 0$ for large $k$ and

$$p = \lim_{k\to\infty} \frac{\log(1/|a_k|)}{\log k}$$

exists as an extended real number. If $p > 1$, then $\sum_{k=1}^{\infty} a_k$ converges absolutely. If $p < 1$, then $\sum_{k=1}^{\infty} |a_k|$ diverges.

PROOF. Suppose that $p > 1$. Fix $q \in (1, p)$ and choose $N \in \mathbf{N}$ so that $k \ge N$ implies that $\log(1/|a_k|) > q \log k = \log(k^q)$. Since the logarithm function is monotone increasing, it follows that $1/|a_k| > k^q$, i.e., that $|a_k| < k^{-q}$ for $k \ge N$. Hence, by the Comparison Test, $\sum_{k=1}^{\infty} |a_k|$ converges.

Similarly, if $p < 1$, then $|a_k| > 1/k$ for large $k$. Hence, by the Comparison Test, $\sum_{k=1}^{\infty} |a_k|$ diverges. ∎
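The quotient in the Logarithmic Test can be evaluated at a single large index to suggest what the limit $p$ is. The sketch below does this for two sample sequences; it is a heuristic only (one index cannot establish a limit), and the helper name and the choice $k = 10^6$ are assumptions for illustration.

```python
import math

def log_test_quotient(a_k, k):
    """log(1/|a_k|) / log(k), the quantity whose limit is p."""
    return math.log(1 / abs(a_k)) / math.log(k)

k = 10 ** 6

# a_k = 1/k^2: the quotient equals 2 for every k, so p = 2 > 1.
q_inverse_square = log_test_quotient(1 / k ** 2, k)

# a_k = 1/(log k)^(log k): the quotient equals log(log k), which tends to infinity.
q_log_power = log_test_quotient(math.log(k) ** (-math.log(k)), k)
```

In the second case the Root Test is inconclusive, while the quotient $\log\log k$ grows without bound, so the Logarithmic Test gives absolute convergence.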
The final test works by examining how rapidly the ratios of $a_{k+1}/a_k$ converge to $r = 1$ (see also Exercise 5).
6.43 THEOREM [RAABE'S TEST]. Suppose that there is a constant $C$ and a parameter $p$ such that

$$\left|\frac{a_{k+1}}{a_k}\right| \le 1 - \frac{p}{k + C} \tag{15}$$

for large $k$. If $p > 1$, then $\sum_{k=1}^{\infty} a_k$ converges absolutely.

PROOF. Set $x_k = k + C - 1$ for $k \in \mathbf{N}$ and choose $N \in \mathbf{N}$ such that $x_k > 1$ and (15) hold for $k \ge N$. By the $p$-Series Test and the Limit Comparison Test,

$$\sum_{k=N}^{\infty} x_k^{-p} < \infty. \tag{16}$$

By (15) and Bernoulli's Inequality,

$$\left|\frac{a_{k+1}}{a_k}\right| \le 1 - \frac{p}{x_{k+1}} \le \left(1 - \frac{1}{x_{k+1}}\right)^p = \frac{x_k^p}{x_{k+1}^p}.$$

Hence, the sequence $\{|a_k|\, x_k^p\}_{k=N}^{\infty}$ is decreasing and bounded above. In particular, there is an $M > 0$ such that $|a_k| \le M x_k^{-p}$ for $k \ge N$. We conclude by (16) and the Comparison Test that $\sum_{k=1}^{\infty} a_k$ converges absolutely. ∎
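To see what hypothesis (15) looks like in a concrete case, consider the series with terms $a_k = \frac{1 \cdot 3 \cdots (2k-1)}{4 \cdot 6 \cdots (2k+2)}$ (it also appears in Exercise 3(b) of this section). The sketch below checks, in exact arithmetic, that its ratios satisfy (15) with equality for $p = 3/2$ and $C = 2$; these two constants are worked out for this illustration and are not taken from the text.

```python
from fractions import Fraction

def a(k):
    """a_k = (1 * 3 * ... * (2k - 1)) / (4 * 6 * ... * (2k + 2)) as an exact fraction."""
    num = den = 1
    for j in range(1, k + 1):
        num *= 2 * j - 1
        den *= 2 * j + 2
    return Fraction(num, den)

p, C = Fraction(3, 2), 2   # candidate constants for (15)

# a_{k+1} / a_k = (2k + 1) / (2k + 4), which equals 1 - p / (k + C).
ratios_match = all(a(k + 1) / a(k) == 1 - p / (k + C) for k in range(1, 60))
```

Since $p = 3/2 > 1$, Raabe's Test applies, even though the Ratio Test is inconclusive because the ratios tend to 1.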
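EXERCISES

1. Using any test covered in this chapter, find out which of the following series converge absolutely, which converge conditionally, and which diverge.

(a) $\displaystyle\sum_{k=1}^{\infty} \frac{3 \cdot 5 \cdots (2k+1)}{2 \cdot 4 \cdots 2k}$.

(b) $\displaystyle\sum_{k=1}^{\infty} \frac{1 \cdot 3 \cdots (2k-1)}{5 \cdot 7 \cdots (2k+3)}$.

(c) $\displaystyle\sum_{k=2}^{\infty} \frac{1}{(\log k)^{\log\log k}}$.

(d) $\displaystyle\sum_{k=1}^{\infty} \left(\frac{\sqrt{k} - 1}{\sqrt{k}}\right)^k$.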
2. For each of the following, find all values of pER for which the given series converges absolutely, for which it converges conditionally, and for which it
diverges. 1
00
(b)
L (log k)p
log k .
(c)
k=2
f (p:t· k=1
3. (a) Prove that the Root Test applied to the series

$$\sum_{k=2}^{\infty} \frac{1}{(\log k)^{\log k}}$$

yields $r = 1$. Use the Logarithmic Test to prove that this series converges.

(b) Prove that the Ratio Test applied to the series

$$\sum_{k=1}^{\infty} \frac{1 \cdot 3 \cdots (2k-1)}{4 \cdot 6 \cdots (2k+2)}$$

yields $r = 1$. Use Raabe's Test to prove that this series converges.

4. Suppose that $f : \mathbf{R} \to (0, \infty)$ is differentiable, $f(x) \to 0$ as $x \to \infty$, and

$$\alpha := \lim_{x\to\infty} \frac{x f'(x)}{f(x)}$$

exists. If $\alpha < -1$, prove that $\sum_{k=1}^{\infty} f(k)$ converges.

5. Suppose that $\{a_k\}$ is a sequence of nonzero real numbers and

$$p = \lim_{k\to\infty} k \left(1 - \left|\frac{a_{k+1}}{a_k}\right|\right)$$

exists as an extended real number. Prove that $\sum_{k=1}^{\infty} a_k$ converges absolutely when $p > 1$.
Chapter 7

Infinite Series of Functions

7.1 UNIFORM CONVERGENCE OF SEQUENCES

You are familiar with what it means for a sequence of numbers to converge. In this section we examine what it means for a sequence of functions to converge. It turns out that there are several different ways to define "convergence" of a sequence of functions. We begin with the simplest way.
7.1 DEFINITION. Let $E$ be a nonempty subset of $\mathbf{R}$. A sequence of functions $f_n : E \to \mathbf{R}$ is said to converge pointwise on $E$ (notation: $f_n \to f$ pointwise on $E$ as $n \to \infty$) if and only if $f(x) = \lim_{n\to\infty} f_n(x)$ exists for each $x \in E$.

Because $\{f_n\}$ converges pointwise on a set $E$ if and only if the sequence of real numbers $\{f_n(x)\}$ converges for each $x \in E$, every result about convergence of real numbers contains a result about pointwise convergence of functions. Here is a typical example.

7.2 Remark. Let $E$ be a nonempty subset of $\mathbf{R}$. A sequence of functions $f_n : E \to \mathbf{R}$ converges pointwise on $E$ to a function $f$ (notation: $f_n \to f$ pointwise on $E$ as $n \to \infty$) if and only if for every $\varepsilon > 0$ and $x \in E$ there is an $N \in \mathbf{N}$ (which may depend on $x$ as well as $\varepsilon$) such that

$$n \ge N \quad\text{implies}\quad |f_n(x) - f(x)| < \varepsilon.$$

PROOF. By Definition 7.1, $f_n \to f$ pointwise on $E$ if and only if $f_n(x) \to f(x)$ for all $x \in E$. This occurs, by Definition 2.1, if and only if for every $\varepsilon > 0$ and $x \in E$ there is an $N \in \mathbf{N}$ such that $n \ge N$ implies that $|f_n(x) - f(x)| < \varepsilon$. ∎
If $f_n \to f$ pointwise on $[a, b]$, it is natural to ask: What does $f$ inherit from $f_n$? The next four remarks show that, in general, the answer to this question is: Not much.
7.3 Remark. The pointwise limit of continuous (respectively, differentiable) functions is not necessarily continuous (respectively, differentiable).

PROOF. Let $f_n(x) = x^n$ and set

$$f(x) = \begin{cases} 0 & 0 \le x < 1 \\ 1 & x = 1. \end{cases}$$

Then $f_n \to f$ pointwise on $[0, 1]$ (see Example 2.20), each $f_n$ is continuous and differentiable on $[0, 1]$, but $f$ is neither differentiable nor continuous at $x = 1$. ∎

7.4 Remark. The pointwise limit of integrable functions is not necessarily integrable.

PROOF. Set

$$f_n(x) = \begin{cases} 1 & x = p/m \in \mathbf{Q}, \text{ written in reduced form, where } m \le n \\ 0 & \text{otherwise,} \end{cases}$$

for $n \in \mathbf{N}$ and

$$f(x) = \begin{cases} 1 & x \in \mathbf{Q} \\ 0 & \text{otherwise.} \end{cases}$$

Then $f_n \to f$ pointwise on $[0, 1]$, each $f_n$ is integrable on $[0, 1]$ (with integral zero), but $f$ is not integrable on $[0, 1]$ (see Example 5.11). ∎

7.5 Remark. There exist differentiable functions $f_n$ and $f$ such that $f_n \to f$ pointwise on $[0, 1]$ but

$$\lim_{n\to\infty} f_n'(x) \ne \left(\lim_{n\to\infty} f_n(x)\right)' \tag{1}$$

for $x = 1$.

PROOF. Let $f_n(x) = x^n/n$ and set $f(x) = 0$. Then $f_n \to f$ pointwise on $[0, 1]$, and each $f_n$ is differentiable with $f_n'(x) = x^{n-1}$. Thus the left side of (1) is 1 at $x = 1$ but the right side of (1) is zero. ∎

7.6 Remark. There exist continuous functions $f_n$ and $f$ such that $f_n \to f$ pointwise on $[0, 1]$ but

$$\lim_{n\to\infty} \int_0^1 f_n(x)\,dx \ne \int_0^1 \left(\lim_{n\to\infty} f_n(x)\right) dx. \tag{2}$$

PROOF. Let $f_1(x) = 1$, and for $n > 1$ let $f_n$ be a sequence of functions whose graphs are triangles with bases $2/n$ and altitudes $n$ (see Figure 7.1). By the point-slope form, formulas for these $f_n$'s can be given by

$$f_n(x) = \begin{cases} n^2 x & 0 \le x < 1/n \\ 2n - n^2 x & 1/n \le x < 2/n \\ 0 & 2/n \le x \le 1. \end{cases}$$
[Figure 7.1: graphs of $y = f_2(x)$ and $y = f_n(x)$]

Then $f_n \to 0$ pointwise on $[0, 1]$, and since the area of a triangle is one-half base times altitude, $\int_0^1 f_n(x)\,dx = 1$ for each $n \in \mathbf{N}$. Thus, the left side of (2) is 1 but the right side is zero. ∎

In view of the preceding examples, it is clear that pointwise convergence is of limited value for the calculus of limits of sequences. It turns out that the following concept, discovered independently by Stokes, Cauchy, and Weierstrass around 1850, is much more useful in this context.

7.7 DEFINITION. Let $E$ be a nonempty subset of $\mathbf{R}$. A sequence of functions $f_n : E \to \mathbf{R}$ is said to converge uniformly on $E$ to a function $f$ (notation: $f_n \to f$ uniformly on $E$ as $n \to \infty$) if and only if for every $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that

$$n \ge N \quad\text{implies}\quad |f_n(x) - f(x)| < \varepsilon$$

for all $x \in E$.

Comparing Definition 7.7 with Remark 7.2, we see that the only difference between uniform convergence and pointwise convergence is that for uniform convergence, the integer $N$ must be chosen independently of $x$ (see Figure 7.2). Notice that this is similar to the difference between uniform continuity and continuity (see the discussion following Example 3.37).

By definition, if $f_n$ converges uniformly on $E$, then $f_n$ converges pointwise on $E$. The following example shows that the converse of this statement is false. (This example also shows how to prove that $f_n \to f$ uniformly on a set $E$: dominate $|f_n(x) - f(x)|$ by constants $b_n$, independent of $x \in E$, that converge to zero as $n \to \infty$.)
[Figure 7.2: over the set $E$, the graph of $y = f_n(x)$ lies between the graphs of $y = f(x) - \varepsilon$ and $y = f(x) + \varepsilon$]

7.8 Example. Prove that $x^n \to 0$ uniformly on $[0, b]$ for any $b < 1$, and pointwise, but not uniformly, on $[0, 1)$.

PROOF. By Example 2.20, $x^n \to 0$ pointwise on $[0, 1)$. Let $b < 1$. Given $\varepsilon > 0$, choose $N \in \mathbf{N}$ such that $n \ge N$ implies that $b^n < \varepsilon$. Then $x \in [0, b]$ and $n \ge N$ imply $|x^n| \le b^n < \varepsilon$, i.e., $x^n \to 0$ uniformly for $x \in [0, b]$.

Finally, suppose that $x^n$ converges to 0 uniformly on $[0, 1)$. Then given $0 < \varepsilon < 1/2$, there is an $N \in \mathbf{N}$ such that $|x^N| < \varepsilon$ for all $x \in [0, 1)$. On the other hand, since $x^N \to 1$ as $x \to 1-$, we can choose an $x_0 \in (0, 1)$ such that $x_0^N > \varepsilon$ (see Figure 7.3). Thus $\varepsilon < x_0^N < \varepsilon$, a contradiction. ∎
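The two halves of Example 7.8 can be made concrete with a few numbers. The sketch below is illustrative: the choice $b = 0.9$, the level $1/2$, and the helper names are assumptions for the example. It uses the fact that $x^n$ is increasing on $[0, b]$, so its supremum there is $b^n$.

```python
def sup_on_closed(n, b):
    """Supremum of x^n over [0, b]; x^n is increasing, so this is b^n."""
    return b ** n

def witness(n, level=0.5):
    """A point x0 in (0, 1) with x0^n equal to the given level."""
    return level ** (1 / n)

b = 0.9
sups = [sup_on_closed(n, b) for n in (10, 100, 1000)]   # these tend to 0

# On [0, 1), however, every n admits a point where x^n is still 1/2.
x0 = witness(1000)
value_at_witness = x0 ** 1000
```

The constants $b^n$ play the role of the dominating sequence on $[0, b]$, while the witness points $x_0$ drift toward 1 as $n$ grows, which is why no single $N$ works on all of $[0, 1)$.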
The next several results show that if $f_n \to f$ or $f_n' \to f'$ uniformly, then $f$ inherits much from $f_n$.
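7.9 THEOREM. Let $E$ be a nonempty subset of $\mathbf{R}$ and suppose that $f_n \to f$ uniformly on $E$. If each $f_n$ is continuous at some $x_0 \in E$, then $f$ is continuous at $x_0 \in E$.

PROOF. Let $\varepsilon > 0$ and choose $N \in \mathbf{N}$ such that

$$n \ge N \text{ and } x \in E \quad\text{imply}\quad |f_n(x) - f(x)| < \frac{\varepsilon}{3}.$$

Since $f_N$ is continuous at $x_0 \in E$, choose $\delta > 0$ such that

$$|x - x_0| < \delta \text{ and } x \in E \quad\text{imply}\quad |f_N(x) - f_N(x_0)| < \frac{\varepsilon}{3}.$$

Suppose that $|x - x_0| < \delta$ and $x \in E$. Then

$$|f(x) - f(x_0)| \le |f(x) - f_N(x)| + |f_N(x) - f_N(x_0)| + |f_N(x_0) - f(x_0)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.$$

Thus $f$ is continuous at $x_0 \in E$. ∎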
[Figure 7.3: graph with $b$, $x_0$, and $1$ marked on the $x$-axis]

(For a generalization of this result, see Exercise 6. For a converse of this result when the sequence $f_n$ is pointwise monotone, see Theorem 9.40.)

Here is an important theorem about interchanging a limit sign and an integral sign (compare with Remark 7.6).

7.10 THEOREM. Suppose that $f_n \to f$ uniformly on a closed interval $[a, b]$. If each $f_n$ is integrable on $[a, b]$, then so is $f$ and

$$\lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b \left(\lim_{n\to\infty} f_n(x)\right) dx.$$

In fact, $\lim_{n\to\infty} \int_a^x f_n(t)\,dt = \int_a^x f(t)\,dt$ uniformly for $x \in [a, b]$.

PROOF. By Exercise 5, $f$ is bounded on $[a, b]$. To prove that $f$ is integrable, let $\varepsilon > 0$ and choose $N \in \mathbf{N}$ such that

$$n \ge N \quad\text{implies}\quad |f(x) - f_n(x)| < \frac{\varepsilon}{3(b - a)} \tag{3}$$

for all $x \in [a, b]$. Using this inequality for $n = N$, we see that by the definition of upper and lower sums,

$$U(f - f_N, P) \le \frac{\varepsilon}{3} \quad\text{and}\quad L(f - f_N, P) \ge -\frac{\varepsilon}{3}$$

for any partition $P$ of $[a, b]$. Since $f_N$ is integrable, choose a partition $P$ such that

$$U(f_N, P) - L(f_N, P) < \frac{\varepsilon}{3}.$$

It follows that

$$U(f, P) - L(f, P) \le U(f - f_N, P) + U(f_N, P) - L(f_N, P) - L(f - f_N, P) < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon;$$

i.e., $f$ is integrable on $[a, b]$. We conclude by Theorem 5.22 and (3) that

$$\left|\int_a^x f_n(t)\,dt - \int_a^x f(t)\,dt\right| \le \int_a^x |f_n(t) - f(t)|\,dt \le \frac{\varepsilon (x - a)}{3(b - a)} < \varepsilon$$

for all $x \in [a, b]$ and $n \ge N$. ∎

Here is a Cauchy Criterion for uniform convergence.

7.11 Lemma [UNIFORM CAUCHY CRITERION]. Let $E$ be a nonempty subset of $\mathbf{R}$ and let $f_n : E \to \mathbf{R}$ be a sequence of functions. Then $f_n$ converges uniformly on $E$ if and only if for every $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that

$$n, m \ge N \quad\text{imply}\quad |f_n(x) - f_m(x)| < \varepsilon \tag{4}$$

for all $x \in E$.

PROOF. Suppose first that $f_n \to f$ uniformly on $E$ as $n \to \infty$. Let $\varepsilon > 0$ and choose $N \in \mathbf{N}$ such that

$$n \ge N \quad\text{implies}\quad |f_n(x) - f(x)| < \frac{\varepsilon}{2}$$

for $x \in E$. Since $|f_n(x) - f_m(x)| \le |f_n(x) - f(x)| + |f(x) - f_m(x)|$, it is clear that (4) holds for all $x \in E$.

Conversely, if (4) holds for $x \in E$, then $\{f_n(x)\}_{n\in\mathbf{N}}$ is Cauchy for each $x \in E$. Hence, by Cauchy's Theorem for sequences (Theorem 2.29),

$$f(x) := \lim_{n\to\infty} f_n(x)$$

exists for each $x \in E$. Take the limit of the second inequality in (4) as $m \to \infty$. We obtain $|f_n(x) - f(x)| \le \varepsilon$ for all $n \ge N$ and $x \in E$. Hence, by definition, $f_n \to f$ uniformly on $E$. ∎

Here is a result about interchanging a limit sign and the derivative sign (compare with Remark 7.5). The proof presented here comes from Apostol [1].
7.12 THEOREM. Let $(a, b)$ be a bounded interval and suppose that $f_n$ is a sequence of functions that converges at some $x_0 \in (a, b)$. If each $f_n$ is differentiable on $(a, b)$, and $f_n'$ converges uniformly on $(a, b)$ as $n \to \infty$, then $f_n$ converges uniformly on $(a, b)$ and

$$\lim_{n\to\infty} f_n'(x) = \left(\lim_{n\to\infty} f_n(x)\right)'$$

for each $x \in (a, b)$.

PROOF. Fix $c \in (a, b)$ and define

$$g_n(x) = \begin{cases} \dfrac{f_n(x) - f_n(c)}{x - c} & x \ne c \\[2mm] f_n'(c) & x = c \end{cases}$$

for $n \in \mathbf{N}$. Clearly,

$$f_n(x) = f_n(c) + (x - c)\, g_n(x) \tag{5}$$

for $n \in \mathbf{N}$ and $x \in (a, b)$.

We claim that for any $c \in (a, b)$, the sequence $g_n$ converges uniformly on $(a, b)$. Let $\varepsilon > 0$, $n, m \in \mathbf{N}$, and $x \in (a, b)$ with $x \ne c$. By the Mean Value Theorem, there is a $\xi$ between $x$ and $c$ such that

$$g_n(x) - g_m(x) = \frac{f_n(x) - f_m(x) - (f_n(c) - f_m(c))}{x - c} = f_n'(\xi) - f_m'(\xi).$$

Since $f_n'$ converges uniformly on $(a, b)$, it follows that there is an $N \in \mathbf{N}$ such that

$$n, m \ge N \quad\text{implies}\quad |g_n(x) - g_m(x)| < \varepsilon$$

for $x \in (a, b)$ with $x \ne c$. This implication also holds for $x = c$ because $g_n(c) = f_n'(c)$ for all $n \in \mathbf{N}$. This proves the claim.

To show that $f_n$ converges uniformly on $(a, b)$, notice that by the claim, $g_n$ converges uniformly as $n \to \infty$ and (5) holds for $c = x_0$. Since $f_n(x_0)$ converges as $n \to \infty$ by hypothesis, it follows from (5) and $b - a < \infty$ that $f_n$ converges uniformly on $(a, b)$ as $n \to \infty$.

Fix $c \in (a, b)$. Define $f$, $g$ on $(a, b)$ by $f(x) := \lim_{n\to\infty} f_n(x)$ and $g(x) := \lim_{n\to\infty} g_n(x)$. We need to show that

$$f'(c) = \lim_{n\to\infty} f_n'(c). \tag{6}$$

Since each $g_n$ is continuous at $c$, the claim implies that $g$ is continuous at $c$. Since $g_n(c) = f_n'(c)$, it follows that the right side of (6) can be written as

$$\lim_{n\to\infty} f_n'(c) = \lim_{n\to\infty} g_n(c) = g(c) = \lim_{x\to c} g(x).$$

On the other hand, if $x \ne c$, we have by definition that

$$\frac{f(x) - f(c)}{x - c} = \lim_{n\to\infty} \frac{f_n(x) - f_n(c)}{x - c} = \lim_{n\to\infty} g_n(x) = g(x).$$

Therefore, the left side of (6) also reduces to

$$f'(c) = \lim_{x\to c} \frac{f(x) - f(c)}{x - c} = \lim_{x\to c} g(x).$$

This verifies (6), and the proof of the theorem is complete. ∎
EXERCISES

1. (a) Prove that $x/n \to 0$ uniformly, as $n \to \infty$, on any closed interval $[a, b]$.

(b) Prove that $1/(nx) \to 0$ pointwise but not uniformly on $(0, 1)$ as $n \to \infty$.

2. Prove that the following limits exist and evaluate them.
(a) lim n->oo
1 1
-1
eX 2 In dx.
(b) lim n-+oo
1 3
1
2+ 3 nx 3 dx. X + nx
(c) lim n-+oo
171:12 0
. x +cosx dx. n n
Slll-
3. Suppose that $f_n \to f$ and $g_n \to g$, as $n \to \infty$, uniformly on some set $E \subseteq \mathbf{R}$.

(a) Prove that $f_n + g_n \to f + g$ and $\alpha f_n \to \alpha f$, as $n \to \infty$, uniformly on $E$ for all $\alpha \in \mathbf{R}$.

(b) Prove that $f_n g_n \to fg$ pointwise on $E$.

(c) Prove that if $f$ and $g$ are bounded on $E$, then $f_n g_n \to fg$ uniformly on $E$.

(d) Show that (c) may be false when $g$ is unbounded.

4. Let $f$, $g$ be continuous on a closed bounded interval $[a, b]$ with $|g(x)| > 0$ for $x \in [a, b]$. Suppose that $f_n \to f$ and $g_n \to g$ as $n \to \infty$, uniformly on $[a, b]$.

(a) Prove that $1/g_n$ is defined for large $n$ and $f_n/g_n \to f/g$ uniformly on $[a, b]$ as $n \to \infty$.

(b) Show that (a) is false if $[a, b]$ is replaced by $(a, b)$.

5. A sequence of functions $f_n$ is said to be uniformly bounded on a set $E$ if and only if there is an $M > 0$ such that

$$|f_n(x)| \le M$$

for all $x \in E$ and all $n \in \mathbf{N}$. If each $f_n$ is a bounded function on a set $E$ and $f_n \to f$ uniformly on $E$, prove that $\{f_n\}$ is uniformly bounded on $E$ and $f$ is a bounded function on $E$.

6. Suppose that $E$ is a nonempty subset of $\mathbf{R}$ and that $f_n \to f$ uniformly on $E$.

(a) Prove that if each $f_n$ is continuous on $E$, then $f$ is continuous on $E$.

(b) Prove that if each $f_n$ is uniformly continuous on $E$, then $f$ is uniformly continuous on $E$.

7. Suppose that $b > a > 0$. Prove that

$$\lim_{n\to\infty} \int_a^b \left(1 + \frac{x}{n}\right)^n e^{-x}\,dx = b - a.$$

8. Let $[a, b]$ be a closed bounded interval, $f : [a, b] \to \mathbf{R}$ be bounded, and $g : [a, b] \to \mathbf{R}$ be continuous with $g(a) = g(b) = 0$. Let $f_n$ be a uniformly bounded sequence of functions on $[a, b]$ (see Exercise 5). Prove that if $f_n \to f$ uniformly on all closed intervals $[c, d] \subset (a, b)$, then $f_n g \to fg$ uniformly on $[a, b]$.

9. Let $f_n$ be integrable on $[0, 1]$ and $f_n \to f$ uniformly on $[0, 1]$. Show that if $b_n \uparrow 1$ as $n \to \infty$, then

$$\lim_{n\to\infty} \int_0^{b_n} f_n(x)\,dx = \int_0^1 f(x)\,dx.$$

10. Let $E$ be a nonempty subset of $\mathbf{R}$ and $f$ be a real-valued function defined on $E$. Suppose that $f_n$ is a sequence of bounded functions on $E$ that converges to $f$ uniformly on $E$. Prove that

$$\frac{f_1(x) + \cdots + f_n(x)}{n} \to f(x)$$

uniformly on $E$ as $n \to \infty$ (compare with Exercise 7, p. 159).
7.2 UNIFORM CONVERGENCE OF SERIES

In this section we extend the concepts introduced in Section 7.1 from sequences to series.

7.13 DEFINITION. Let $f_k$ be a sequence of real functions defined on some set $E$ and set

$$s_n(x) := \sum_{k=1}^{n} f_k(x), \qquad x \in E,\ n \in \mathbf{N}.$$

(i) The series $\sum_{k=1}^{\infty} f_k$ is said to converge pointwise on $E$ if and only if the sequence $s_n(x)$ converges pointwise on $E$ as $n \to \infty$.

(ii) The series $\sum_{k=1}^{\infty} f_k$ is said to converge uniformly on $E$ if and only if the sequence $s_n(x)$ converges uniformly on $E$ as $n \to \infty$.

(iii) The series $\sum_{k=1}^{\infty} f_k$ is said to converge absolutely (pointwise) on $E$ if and only if $\sum_{k=1}^{\infty} |f_k(x)|$ converges for each $x \in E$.

Since convergence of series is defined in terms of convergence of sequences of partial sums, every result about convergence of sequences of functions contains a result about convergence of series of functions. For example, the following result is an immediate consequence of Theorems 7.9, 7.10, and 7.12.

7.14 THEOREM. Let $E$ be a nonempty subset of $\mathbf{R}$, and let $\{f_k\}$ be a sequence of real functions defined on $E$.

(i) Suppose that $x_0 \in E$ and that each $f_k$ is continuous at $x_0 \in E$. If $f = \sum_{k=1}^{\infty} f_k$ converges uniformly on $E$, then $f$ is continuous at $x_0 \in E$.

(ii) [TERM-BY-TERM INTEGRATION]. Suppose that $E = [a, b]$ and that each $f_k$ is integrable on $[a, b]$. If $f = \sum_{k=1}^{\infty} f_k$ converges uniformly on $[a, b]$, then $f$ is integrable on $[a, b]$ and

$$\int_a^b \sum_{k=1}^{\infty} f_k(x)\,dx = \sum_{k=1}^{\infty} \int_a^b f_k(x)\,dx.$$

(iii) [TERM-BY-TERM DIFFERENTIATION]. Suppose that $E$ is a bounded, open interval and that each $f_k$ is differentiable on $E$. If $\sum_{k=1}^{\infty} f_k$ converges at some $x_0 \in E$, and $\sum_{k=1}^{\infty} f_k'$ converges uniformly on $E$, then $f := \sum_{k=1}^{\infty} f_k$ converges uniformly on $E$, $f$ is differentiable on $E$, and

$$\left(\sum_{k=1}^{\infty} f_k(x)\right)' = \sum_{k=1}^{\infty} f_k'(x)$$

for $x \in E$.

Here are two much-used tests for uniform convergence of series. (The second test, and its example, are optional because we do not use it elsewhere in this text.)

7.15 THEOREM [WEIERSTRASS M-TEST]. Let $E$ be a nonempty subset of $\mathbf{R}$, let $f_k : E \to \mathbf{R}$, $k \in \mathbf{N}$, and let $M_k \ge 0$ satisfy $\sum_{k=1}^{\infty} M_k < \infty$. If $|f_k(x)| \le M_k$ for $k \in \mathbf{N}$ and $x \in E$, then $\sum_{k=1}^{\infty} f_k$ converges absolutely and uniformly on $E$.

PROOF. Let $\varepsilon > 0$ and use the Cauchy Criterion to choose $N \in \mathbf{N}$ such that $m \ge n \ge N$ implies $\sum_{k=n}^{m} M_k < \varepsilon$. Thus, by hypothesis,

$$\left|\sum_{k=n}^{m} f_k(x)\right| \le \sum_{k=n}^{m} |f_k(x)| \le \sum_{k=n}^{m} M_k < \varepsilon$$

for $m \ge n \ge N$ and $x \in E$. Hence, the partial sums of $\sum_{k=1}^{\infty} f_k$ are uniformly Cauchy and the partial sums of $\sum_{k=1}^{\infty} |f_k(x)|$ are Cauchy for each $x \in E$. ∎
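The Weierstrass M-Test gives a tail bound that does not depend on $x$, and this can be observed numerically. The sketch below uses the series $\sum_{k=1}^{\infty} \cos(kx)/k^2$ with $M_k = 1/k^2$; the sample grid, the truncation levels, and the helper names are assumptions made for the illustration.

```python
import math

def partial(n, x):
    """n-th partial sum of cos(k x) / k^2."""
    return sum(math.cos(k * x) / k ** 2 for k in range(1, n + 1))

def tail_bound(n):
    """Sum of M_k = 1/k^2 over k > n, bounded by the integral of 1/x^2 from n, i.e. 1/n."""
    return 1 / n

xs = [i * 0.1 for i in range(-100, 101)]   # sample points in [-10, 10]
n, big = 50, 2000

# Largest gap between the 50-term and 2000-term partial sums over the grid.
worst_gap = max(abs(partial(big, x) - partial(n, x)) for x in xs)
```

The single number `tail_bound(n)` controls the gap at every sample point simultaneously, which is the practical meaning of uniform convergence.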
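*7.16 THEOREM [DIRICHLET'S TEST FOR UNIFORM CONVERGENCE]. Let $E$ be a nonempty subset of $\mathbf{R}$ and suppose that $f_k, g_k : E \to \mathbf{R}$, $k \in \mathbf{N}$. If

$$\left|\sum_{k=1}^{n} f_k(x)\right| \le M < \infty$$

for $n \in \mathbf{N}$ and $x \in E$, and if $g_k \downarrow 0$ uniformly on $E$ as $k \to \infty$, then $\sum_{k=1}^{\infty} f_k g_k$ converges uniformly on $E$.

PROOF. Let

$$F_{n,m}(x) = \sum_{k=m}^{n} f_k(x), \qquad m, n \in \mathbf{N},\ n \ge m,\ x \in E,$$

and fix integers $n > m > 0$. By Abel's Formula and hypothesis,

$$\left|\sum_{k=m}^{n} f_k(x) g_k(x)\right| = \left|F_{n,m}(x) g_n(x) + \sum_{k=m}^{n-1} F_{k,m}(x)\,(g_k(x) - g_{k+1}(x))\right| \le 2M g_n(x) + 2M \sum_{k=m}^{n-1} (g_k(x) - g_{k+1}(x)) = 2M g_m(x)$$

for all $x \in E$. Since $g_m(x) \to 0$ uniformly on $E$, as $m \to \infty$, it follows from the uniform Cauchy Criterion that $\sum_{k=1}^{\infty} f_k(x) g_k(x)$ converges uniformly on $E$. ∎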
Here is a typical application of Dirichlet's Test.

*7.17 Example. Prove that if $a_k \downarrow 0$ as $k \to \infty$, then $\sum_{k=0}^{\infty} a_k \cos(kx)$ converges uniformly on any closed subinterval $[a, b]$ of $(0, 2\pi)$.

PROOF. Let $f_k(x) = \cos(kx)$ and $g_k(x) = a_k$ for $k \in \mathbf{N}$. By the technique used in Example 6.34, we can show that

$$D_n(x) := \sum_{k=0}^{n} \cos kx = \frac{\sin\frac{x}{2} + \sin\left(\left(n + \frac{1}{2}\right)x\right)}{2 \sin\frac{x}{2}}$$

for $n \in \mathbf{N}$ and $x \in (0, 2\pi)$. Hence the partial sums of $\sum_{k=0}^{\infty} f_k(x)$ satisfy

$$\left|\sum_{k=0}^{n} f_k(x)\right| \le \frac{1}{\left|\sin\frac{x}{2}\right|}$$

for $x \in (0, 2\pi)$. If $\delta = \min\{2\pi - b, a\}$ and $x \in [a, b]$, then $\sin(x/2) \ge \sin(\delta/2)$ (see Figure 7.4). Therefore, $\sum_{k=0}^{\infty} a_k \cos(kx)$ converges uniformly on $[a, b]$ by Dirichlet's Test. ∎

[Figure 7.4]

This example can be used to show that uniform convergence of a series alone is not sufficient for term-by-term differentiation. Indeed, although $\sum_{k=1}^{\infty} \cos(kx)/k$ converges uniformly on $[\pi/2, 3\pi/2]$, its term-by-term derivative $\sum_{k=1}^{\infty} (-\sin(kx))$ diverges at every point of $[\pi/2, 3\pi/2]$ except $x = \pi$, where every term is zero.
A double series is a series of numbers or functions of the form

$$\sum_{k=1}^{\infty} \left(\sum_{j=1}^{\infty} a_{kj}\right).$$

Such a double series is said to converge if and only if $\sum_{j=1}^{\infty} a_{kj}$ converges for each $k \in \mathbf{N}$ and

$$\sum_{k=1}^{\infty} \sum_{j=1}^{\infty} a_{kj} := \lim_{N\to\infty} \sum_{k=1}^{N} \left(\sum_{j=1}^{\infty} a_{kj}\right)$$

exists and is finite. When working with double series, one frequently wants to be able to change the order of summation. We already know that the order of summation can be changed when $a_{kj} \ge 0$ (see Exercise 7, p. 172). We now prove a more general result. (The elegant proof given here, which comes from Rudin [11],¹ uses uniform convergence.)

¹ Walter Rudin, Principles of Mathematical Analysis, 3rd ed. (New York: McGraw-Hill Book Co., 1976). Reprinted with permission of McGraw-Hill Book Co.

7.18 THEOREM. Let $a_{kj} \in \mathbf{R}$ for $k, j \in \mathbf{N}$ and suppose that

$$A_j = \sum_{k=1}^{\infty} |a_{kj}| < \infty$$

for each $j \in \mathbf{N}$. If $\sum_{j=1}^{\infty} A_j$ converges (i.e., the double sum converges absolutely), then

$$\sum_{k=1}^{\infty} \sum_{j=1}^{\infty} a_{kj} = \sum_{j=1}^{\infty} \sum_{k=1}^{\infty} a_{kj}.$$

PROOF. Let $E = \{0, 1, \frac{1}{2}, \frac{1}{3}, \ldots\}$. For each $j \in \mathbf{N}$, define a function $f_j$ on $E$ by

$$f_j(0) = \sum_{k=1}^{\infty} a_{kj}, \qquad f_j\left(\frac{1}{n}\right) = \sum_{k=1}^{n} a_{kj}, \quad n \in \mathbf{N}.$$

By hypothesis, $f_j(0)$ exists and by the definition of series convergence,

$$\lim_{n\to\infty} f_j\left(\frac{1}{n}\right) = f_j(0);$$

i.e., $f_j$ is continuous at $0 \in E$ for each $j \in \mathbf{N}$. Moreover, since $|f_j(x)| \le A_j$ for all $x \in E$ and $j \in \mathbf{N}$, the Weierstrass M-Test implies that

$$f(x) := \sum_{j=1}^{\infty} f_j(x)$$

converges uniformly on $E$. Thus $f$ is continuous at $0 \in E$ by Theorem 7.9. It follows from the sequential characterization of continuity (Theorem 3.21) that $f(1/n) \to f(0)$ as $n \to \infty$. Therefore,

$$\sum_{k=1}^{\infty} \sum_{j=1}^{\infty} a_{kj} = \lim_{n\to\infty} \sum_{k=1}^{n} \sum_{j=1}^{\infty} a_{kj} = \lim_{n\to\infty} \sum_{j=1}^{\infty} f_j\left(\frac{1}{n}\right) = \lim_{n\to\infty} f\left(\frac{1}{n}\right) = f(0) = \sum_{j=1}^{\infty} \sum_{k=1}^{\infty} a_{kj}. \qquad ∎$$
EXERCISES

1. (a) Prove that $\sum_{k=1}^{\infty} \cos(kx)/k^2$ converges uniformly on $\mathbf{R}$.

(b) Prove that $\sum_{k=1}^{\infty} \sin(x/k^2)$ converges uniformly on any bounded interval.

2. Prove that the geometric series

$$\sum_{k=0}^{\infty} x^k = \frac{1}{1 - x}$$

converges uniformly on any closed interval $[a, b] \subset (-1, 1)$.

3. Let $E(x) = \sum_{k=0}^{\infty} x^k/k!$.

(a) Prove that the series defining $E(x)$ converges uniformly on any closed interval $[a, b]$.

(b) Prove that

$$\int_a^b E(x)\,dx = E(b) - E(a)$$

for all $a, b \in \mathbf{R}$.

(c) Prove that the function $y = E(x)$ satisfies the initial value problem

$$y' - y = 0, \qquad y(0) = 1.$$
(We shall see in Section 7.4 that $E(x) = e^x$.)

4. Suppose that

$$f(x) = \sum_{k=1}^{\infty} \frac{\cos(kx)}{k^2}.$$

Prove that

$$\int_0^{\pi/2} f(x)\,dx = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)^3}.$$

5. Show that

$$f(x) = \sum_{k=1}^{\infty} \frac{1}{k} \sin\left(\frac{x}{k+1}\right)$$

converges, pointwise on $\mathbf{R}$ and uniformly on each bounded interval in $\mathbf{R}$, to a differentiable function $f$ that satisfies

$$|f(x)| \le |x| \quad\text{and}\quad |f'(x)| \le 1$$

for all $x \in \mathbf{R}$.

6. Prove that

$$\left|\sum_{k=1}^{\infty} \left(1 - \cos(1/k)\right)\right| \le 2.$$

7. If $f = \sum_{k=1}^{\infty} f_k$ converges uniformly on a set $E \subseteq \mathbf{R}$ and if $g_1$ is bounded on $E$ with $g_k(x) \ge g_{k+1}(x) \ge 0$ for all $x \in E$ and $k \in \mathbf{N}$, prove that $\sum_{k=1}^{\infty} f_k g_k$ converges uniformly on $E$.

8. Let $n \ge 0$ be a fixed nonnegative integer and recall that $0! := 1$. The Bessel function of order $n$ is the function defined by

$$B_n(x) := \sum_{k=0}^{\infty} \frac{(-1)^k}{(k!)(n+k)!} \left(\frac{x}{2}\right)^{n+2k}.$$

(a) Show that $B_n(x)$ converges pointwise on $\mathbf{R}$ and uniformly on any closed interval $[a, b]$.

(b) Prove that $y = B_n(x)$ satisfies the differential equation

$$x^2 y'' + x y' + (x^2 - n^2) y = 0$$

for $x \in \mathbf{R}$.

(c) Prove that

$$(x^n B_n(x))' = x^n B_{n-1}(x)$$

for $n \in \mathbf{N}$ and $x \in \mathbf{R}$.

*9. Suppose that $a_k \downarrow 0$ as $k \to \infty$. Prove that $\sum_{k=1}^{\infty} a_k \sin kx$ converges uniformly on any closed interval $[a, b] \subset (0, 2\pi)$.
7.3 POWER SERIES Polynomials are functions of the form P(x) = L:~=o akxk, where ak E Rand n ~ O. In this section we investigate a natural generalization of polynomials, namely, series of the form L::;;O=o ak xk . Actually, we shall consider a slightly more general class of series. A power series (centered at xo) is a series of the form 00
S(x) :=
L ak(x -
k=O where we use the convention that (x - xo)O
xo)k,
= 1. Indeed, although 00 is in general indeterminant, when dealing with power series we always interpret 00 = 1. Since S(x) is identically ao when x = Xo, it is clear that every power series converges at at least one point. The following result shows that this may be the only point.
7.19 Remark. There exist power series that converge only at one point.
PROOF. For each $x \ne 0$, $(k^k |x|^k)^{1/k} = k|x| \to \infty$ as $k \to \infty$. Therefore, by the Root Test, the series $\sum_{k=1}^\infty k^k x^k$ diverges when $x \ne 0$. ∎
In general, a series of functions can converge at several isolated points. (For example, the series $\sum_{k=1}^\infty \sin(kx)$ converges only when $x = n\pi$ for some $n \in \mathbf{Z}$.) We shall see (Theorem 7.21) that this cannot happen for power series. Hence, we introduce the following concept.
7.20 DEFINITION. An extended real number $R$ is said to be the radius of convergence of a power series $S(x) := \sum_{k=0}^\infty a_k (x - x_0)^k$ if and only if $S(x)$ converges absolutely for $|x - x_0| < R$ and $S(x)$ diverges for $|x - x_0| > R$.
The extreme cases are $R = 0$ and $R = \infty$. When $R = 0$, the power series $S(x)$ converges only when $x = x_0$. When $R = \infty$, the power series $S(x)$ converges absolutely for all $x \in \mathbf{R}$.
The next result shows that every power series S has a radius of convergence that can be computed using roots of the coefficients of S.
7.21 THEOREM. Let $S(x) = \sum_{k=0}^\infty a_k (x - x_0)^k$ be a power series centered at $x_0$. If $R = 1/\limsup_{k\to\infty} |a_k|^{1/k}$, with the convention that $1/\infty = 0$ and $1/0 = \infty$, then $R$ is the radius of convergence of $S$. In fact,
(i) $S(x)$ converges absolutely for each $x \in (x_0 - R, x_0 + R)$,
(ii) $S(x)$ converges uniformly on any closed interval $[a, b] \subset (x_0 - R, x_0 + R)$,
(iii) and (when $R$ is finite), $S(x)$ diverges for each $x \notin [x_0 - R, x_0 + R]$.
PROOF. Fix $x \in \mathbf{R}$, $x \ne x_0$, and set $\rho := 1/\limsup_{k\to\infty} |a_k|^{1/k}$, with the convention that $1/\infty = 0$ and $1/0 = \infty$. To apply the Root Test to $S(x)$, consider
\[ r(x) := \limsup_{k\to\infty} |a_k (x - x_0)^k|^{1/k} = |x - x_0| \cdot \limsup_{k\to\infty} |a_k|^{1/k}. \]
Case 1. $\rho = 0$. By our convention, $\rho = 0$ implies that $r(x) = \infty > 1$, so by the Root Test, $S(x)$ does not converge for any $x \ne x_0$. Hence, the radius of convergence of $S$ is $R = 0 = \rho$.
Case 2. $\rho = \infty$. Then $r(x) = 0 < 1$, so by the Root Test, $S(x)$ converges absolutely for all $x \in \mathbf{R}$. Hence, the radius of convergence of $S$ is $R = \infty = \rho$.
Case 3. $\rho \in (0, \infty)$. Then $r(x) = |x - x_0|/\rho$. Since $r(x) < 1$ if and only if $|x - x_0| < \rho$, it follows from the Root Test that $S(x)$ converges absolutely when $x \in (x_0 - \rho, x_0 + \rho)$. Similarly, since $r(x) > 1$ if and only if $|x - x_0| > \rho$, we also have that $S(x)$ diverges when $x \notin [x_0 - \rho, x_0 + \rho]$. This proves that $\rho$ is the radius of convergence of $S$, and that parts (i) and (iii) hold.
To prove part (ii), let $[a, b] \subset (x_0 - R, x_0 + R)$. Choose an $x_1 \in (x_0 - R, x_0 + R)$ such that $x \in [a, b]$ implies $|x - x_0| \le |x_1 - x_0|$ (see Figure 7.5). Set $M_k = |a_k| |x_1 - x_0|^k$ and observe by part (i) that $\sum_{k=0}^\infty M_k$ converges. Since $|a_k (x - x_0)^k| \le M_k$ for $x \in [a, b]$ and $k \in \mathbf{N}$, it follows from the Weierstrass M-Test that $S(x)$ converges uniformly on $[a, b]$. ∎
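The root formula of Theorem 7.21 can be explored numerically. The sketch below is only a heuristic: it replaces the limit superior by the maximum of $|a_k|^{1/k}$ over a finite tail of indices, which is an assumption that works for well-behaved coefficients but proves nothing. The names `root_test_radius`, `coeff`, and `k_max` are illustrative and do not come from the text.

```python
import math

def root_test_radius(coeff, k_max=400):
    """Heuristic estimate of R = 1 / limsup |a_k|^(1/k).

    The limsup is approximated by the maximum of |a_k|^(1/k)
    over the finite tail k_max//2 <= k <= k_max (an assumption).
    """
    tail = []
    for k in range(k_max // 2, k_max + 1):
        a = abs(coeff(k))
        if a > 0:
            # exp(log(a)/k) avoids taking roots of very large numbers
            tail.append(math.exp(math.log(a) / k))
        else:
            tail.append(0.0)
    limsup_estimate = max(tail)
    if limsup_estimate == 0.0:
        return math.inf  # convention 1/0 = infinity
    return 1.0 / limsup_estimate

# a_k = 2^k has |a_k|^(1/k) = 2 for every k
r_geometric = root_test_radius(lambda k: 2.0 ** k)
# a_k = (3 + (-1)^k)^k oscillates between 2^k and 4^k; the limsup is 4
r_oscillating = root_test_radius(lambda k: (3 + (-1) ** k) ** k)
```

The second sequence shows why the limit superior, rather than a limit, appears in the theorem: $|a_k|^{1/k}$ alternates between 2 and 4, and only the larger value controls the radius.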
[Figure 7.5]

The following result, which is weaker than Theorem 7.21 (see Exercise 9, p. 173), provides another way to compute the radius of convergence of some power series (see also Exercise 4). This way is easier when $a_k$ contains products, e.g., factorials.
7.22 THEOREM. If the limit
\[ R = \lim_{k\to\infty} \left| \frac{a_k}{a_{k+1}} \right| \]
exists as an extended real number, then $R$ is the radius of convergence of the power series $S(x) = \sum_{k=0}^\infty a_k (x - x_0)^k$.
PROOF. Repeat the proof of Theorem 7.21, using the Ratio Test instead of the Root Test, to find that $S(x)$ converges absolutely on $(x_0 - R, x_0 + R)$ and diverges for each $x \notin [x_0 - R, x_0 + R]$. By Definition 7.20, $R$ must be the radius of convergence of $S(x)$. ∎
7.23 DEFINITION. The interval of convergence of a power series $S(x)$ is the largest interval on which $S(x)$ converges.
By Theorem 7.21, for a given power series $S = \sum_{k=0}^\infty a_k (x - x_0)^k$, there are only three possibilities:
(i) $R = \infty$, in which case the interval of convergence of $S$ is $(-\infty, \infty)$,
(ii) $R = 0$, in which case the interval of convergence of $S$ is $\{x_0\}$, and
(iii) $0 < R < \infty$, in which case the interval of convergence of $S$ is $(x_0 - R, x_0 + R)$, $[x_0 - R, x_0 + R)$, $(x_0 - R, x_0 + R]$, or $[x_0 - R, x_0 + R]$.
To find the interval of convergence of a power series, therefore, one needs to compute the radius of convergence $R$ first. If $0 < R < \infty$, one must also check both endpoints, $x_0 - R$ and $x_0 + R$, to see whether the interval of convergence is closed, open, or half open/closed. Notice once and for all that the Ratio and Root Tests cannot be used to test the endpoints, since it was the Ratio and Root Tests that gave us $R$ to begin with.
7.24 Example. Find the interval of convergence of $S(x) = \sum_{k=1}^\infty x^k/\sqrt{k}$.
SOLUTION. By Theorem 7.22,
\[ R = \lim_{k\to\infty} \frac{\sqrt{k+1}}{\sqrt{k}} = \sqrt{\lim_{k\to\infty} \frac{k+1}{k}} = 1. \]
Thus, the interval of convergence has endpoints $1$ and $-1$. $S(x)$ diverges at $x = 1$ by the p-Series Test and converges at $x = -1$ by the Alternating Series Test. Thus, the interval of convergence of $S(x)$ is $[-1,1)$. ∎
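A numerical look at Example 7.24 may help fix the ideas, with the caveat that finite computations only illustrate the endpoint behavior and do not prove it. In this sketch the helper names `a`, `ratio_estimate`, and `partial_sum` are illustrative assumptions.

```python
import math

def a(k):
    """Coefficient a_k = 1/sqrt(k) of the series in Example 7.24."""
    return 1.0 / math.sqrt(k)

def ratio_estimate(k):
    """|a_k / a_(k+1)|, which tends to the radius of convergence R."""
    return abs(a(k) / a(k + 1))

def partial_sum(x, n):
    """n-th partial sum of sum_{k>=1} x^k / sqrt(k)."""
    return sum(x ** k / math.sqrt(k) for k in range(1, n + 1))

ratio_far_out = ratio_estimate(10 ** 6)     # close to R = 1
at_plus_one = partial_sum(1.0, 10000)       # grows without bound as n grows
at_minus_one_a = partial_sum(-1.0, 10000)   # consecutive partial sums at x = -1
at_minus_one_b = partial_sum(-1.0, 10001)   # differ by only 1/sqrt(10001)
```

At $x = 1$ the partial sums exceed $2(\sqrt{n+1} - 1)$ by comparison with $\int_1^{n+1} t^{-1/2}\,dt$, so they cannot stay bounded. At $x = -1$ consecutive partial sums differ by the size of a single term, which tends to zero, consistent with the Alternating Series Test.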
7.25 Remark. The interval of convergence may contain none, one, or both its endpoints.
PROOF. By Theorem 7.22, the radius of convergence of each of the series
\[ \sum_{k=1}^\infty x^k, \qquad \sum_{k=1}^\infty \frac{x^k}{k}, \qquad \sum_{k=1}^\infty \frac{x^k}{k^2} \]
is 1, but by the Divergence Test, the Alternating Series Test, and the p-Series Test, the intervals of convergence of these series are $(-1,1)$, $[-1,1)$, and $[-1,1]$, respectively. ∎
We now pass from convergence properties of power series to the calculus of power series. The next several results answer the question: What properties (e.g., continuity, differentiability, integrability) does the limit of a power series satisfy?
7.26 THEOREM. If $f(x) = \sum_{k=0}^\infty a_k (x - x_0)^k$ is a power series with positive radius of convergence $R$, then $f$ is continuous on $(x_0 - R, x_0 + R)$.
PROOF. Let $x \in (x_0 - R, x_0 + R)$ and choose $a, b \in \mathbf{R}$ such that $x \in (a, b)$ and $[a, b] \subset (x_0 - R, x_0 + R)$. By Theorems 7.21ii and 7.14i, $f$ is continuous on $(a, b)$, hence at $x$. ∎
The following result shows that continuity of the limit extends to the endpoints when they belong to the interval of convergence.
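7.27 THEOREM [ABEL'S THEOREM]. Suppose that $[a, b]$ is nondegenerate. If $f(x) := \sum_{k=0}^\infty a_k (x - x_0)^k$ converges on $[a,b]$, then $f(x)$ is continuous and converges uniformly on $[a, b]$.
PROOF. By Theorems 7.21ii and 7.26, we may suppose that $f$ has a positive, finite radius of convergence $R$, and by symmetry, that $a = x_0$ and $b = x_0 + R$. Thus, suppose that $f(x)$ converges at $x = x_0 + R$ and fix $x_1 \in (x_0, x_0 + R]$. Set $b_k = a_k R^k$ and $c_k = (x_1 - x_0)^k / R^k$ for $k \in \mathbf{N}$. By hypothesis, $\sum_{k=1}^\infty b_k$ is convergent. Hence, given $\varepsilon > 0$, there is an integer $N > 1$ such that
\[ \left| \sum_{j=m}^{k} b_j \right| < \varepsilon \quad\text{for all } k \ge m \ge N. \]
Since $0 < x_1 - x_0 \le R$, the sequence $\{c_k\}$ is decreasing. Applying Abel's Formula and telescoping, we have for $n > m \ge N$ that
\[ \left| \sum_{k=m}^{n} a_k (x_1 - x_0)^k \right| = \left| \sum_{k=m}^{n} b_k c_k \right| = \left| c_n \sum_{k=m}^{n} b_k + \sum_{k=m}^{n-1} (c_k - c_{k+1}) \sum_{j=m}^{k} b_j \right| < c_n \varepsilon + (c_m - c_n)\varepsilon = c_m \varepsilon. \]
Since $c_m \le c_1 \le R/R = 1$ it follows that
\[ \left| \sum_{k=m}^{n} a_k (x_1 - x_0)^k \right| < \varepsilon \]
for all $x_1 \in (x_0, x_0 + R]$. Since this inequality also holds for $x_1 = x_0$, we conclude that $\sum_{k=0}^\infty a_k (x - x_0)^k$ converges uniformly on $[x_0, x_0 + R]$. ∎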
7.28 Remark. If a power series $S(x) = \sum_{k=0}^\infty a_k (x - x_0)^k$ converges at some $x_1 > x_0$, then $S(x)$ converges uniformly on $[x_0, x_1]$ and absolutely on $[x_0, x_1)$. It might not converge absolutely at $x = x_1$.
PROOF. By Theorems 7.21 and 7.27, $S(x)$ converges uniformly on $[x_0, x_1]$ and absolutely on $[x_0, x_1)$. The power series $\sum_{k=1}^\infty (-x)^k/k$ converges uniformly on $[0,1]$ but not absolutely at $x = 1$. ∎
To discuss differentiability of the limit of a power series, we first show that the radius of convergence of a power series is not changed by term-by-term differentiation (see also Exercise 6, p. 56).
7.29 Lemma. If $a_n \in \mathbf{R}$ for $n \in \mathbf{N}$, then
\[ x := \limsup_{n\to\infty} (n|a_n|)^{1/(n-1)} = \limsup_{n\to\infty} |a_n|^{1/(n-1)} =: y. \]
PROOF. Let $\varepsilon > 0$. Since $n^{1/(n-1)} \to 1$ as $n \to \infty$, choose $N \in \mathbf{N}$ so that $n \ge N$ implies $1 - \varepsilon < n^{1/(n-1)} < 1 + \varepsilon$; i.e.,
\[ (1 - \varepsilon)|a_n|^{1/(n-1)} < (n|a_n|)^{1/(n-1)} < (1 + \varepsilon)|a_n|^{1/(n-1)} \]
for large $n$. Since $y$ is the supremum of the set of adherent points of $|a_n|^{1/(n-1)}$, the right-most inequality above implies that $x \le (1 + \varepsilon)y$; i.e., $x \le y$. Similarly, the left-most inequality above implies that $y \le x$. ∎
We use this result to prove that each power series with a positive radius of convergence is term-by-term differentiable.
7.30 THEOREM. If $f(x) = \sum_{k=0}^\infty a_k (x - x_0)^k$ is a power series with positive radius of convergence $R$, then
\[ f'(x) = \sum_{k=1}^\infty k a_k (x - x_0)^{k-1} \]
for $x \in (x_0 - R, x_0 + R)$.
PROOF. Let $x \in (x_0 - R, x_0 + R)$. Choose $a, b \in \mathbf{R}$ such that $x \in (a, b)$ and $[a, b] \subset (x_0 - R, x_0 + R)$. By Lemma 7.29, the radius of convergence of the derived series $\sum_{k=1}^\infty k a_k (x - x_0)^{k-1}$ is also $R$. Hence by Theorem 7.21 the derived series converges uniformly on $[a, b]$. We conclude by Theorem 7.14iii that the series $f(x)$ is term-by-term differentiable on $(a, b)$, hence at $x$. ∎
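Term-by-term differentiation can be illustrated on truncated coefficient lists. The following is a minimal sketch under the assumption that a long finite truncation stands in for the infinite series; `derive` and `evaluate` are hypothetical helper names, and the geometric series is chosen only because its sum $1/(1-x)$ and derivative $1/(1-x)^2$ are known.

```python
def derive(coeffs):
    """Coefficients of the term-by-term derivative: [a1, 2*a2, 3*a3, ...]."""
    return [k * a for k, a in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Horner evaluation of sum(coeffs[k] * x**k)."""
    total = 0.0
    for a in reversed(coeffs):
        total = total * x + a
    return total

# Truncated geometric series 1 + x + x^2 + ... (sum 1/(1 - x) on (-1, 1))
geometric = [1.0] * 400
derived = derive(geometric)

value_at_half = evaluate(derived, 0.5)        # compare with 1/(1 - 0.5)^2
value_at_neg = evaluate(derived, -0.7)        # compare with 1/(1 + 0.7)^2
```

Inside the interval of convergence the derived series has the same radius, so the truncation error at a fixed interior point is negligible for a few hundred terms.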
Recall that for each nonempty, open interval $(a, b)$, $C^\infty(a, b)$ represents the set of functions $f$ such that $f^{(k)}$ exists and is continuous on $(a, b)$ for all $k \in \mathbf{N}$. The following result generalizes Theorem 7.30.
7.31 COROLLARY. If $f(x) = \sum_{k=0}^\infty a_k (x - x_0)^k$ has a positive radius of convergence $R$, then $f \in C^\infty(x_0 - R, x_0 + R)$ and
\[ f^{(k)}(x) = \sum_{n=k}^\infty \frac{n!}{(n-k)!} a_n (x - x_0)^{n-k} \tag{7} \]
for $x \in (x_0 - R, x_0 + R)$ and $k \in \mathbf{N}$.
PROOF. The proof is by induction on $k$. By Theorem 7.30 and the fact that $0! := 1$, (7) holds for $k = 1$ and $x \in (x_0 - R, x_0 + R)$. If (7) holds for some $k \in \mathbf{N}$ and all $x \in (x_0 - R, x_0 + R)$, then $f^{(k)}$ is a power series with radius of convergence $R$. It follows from Theorem 7.30 that
\[ f^{(k+1)}(x) = \sum_{n=k+1}^\infty \frac{n!}{(n-k)!} (n-k)\, a_n (x - x_0)^{n-k-1} = \sum_{n=k+1}^\infty \frac{n!}{(n-k-1)!} a_n (x - x_0)^{n-k-1} \]
for all $x \in (x_0 - R, x_0 + R)$. Hence, (7) holds for $k + 1$ in place of $k$. ∎
The following result shows that each power series with a positive radius of convergence can also be integrated term by term.
7.32 THEOREM. Let $f(x) = \sum_{k=0}^\infty a_k (x - x_0)^k$ be a power series and let $a, b \in \mathbf{R}$ with $a < b$.
(i) If $f(x)$ converges on $[a, b]$, then $f$ is integrable on $[a, b]$ and
\[ \int_a^b f(x)\,dx = \sum_{k=0}^\infty a_k \int_a^b (x - x_0)^k\,dx. \]
*(ii) If $f(x)$ converges on $[a, b)$ and if $\sum_{k=0}^\infty a_k (b - x_0)^{k+1}/(k+1)$ converges, then $f$ is improperly integrable on $[a, b)$ and
\[ \int_a^b f(x)\,dx = \sum_{k=0}^\infty a_k \int_a^b (x - x_0)^k\,dx. \]
PROOF. (i) By Abel's Theorem, $f(x)$ converges uniformly on $[a, b]$. Hence, by Theorem 7.14ii, $f(x)$ is term-by-term integrable on $[a, b]$.
(ii) Let $a \le t < b$ and set $A = \sum_{k=0}^\infty a_k (a - x_0)^{k+1}/(k+1)$. By part (i),
\[ \int_a^t f(x)\,dx = \sum_{k=0}^\infty \frac{a_k}{k+1} (t - x_0)^{k+1} - A. \]
The leftmost term of this last difference is a power series which by hypothesis converges at $t = b$. Thus, by the definition of improper integration and Abel's Theorem,
\[ \int_a^b f(x)\,dx = \lim_{t\to b-} \int_a^t f(x)\,dx = \lim_{t\to b-} \sum_{k=0}^\infty \frac{a_k}{k+1} (t - x_0)^{k+1} - A = \sum_{k=0}^\infty \frac{a_k}{k+1} (b - x_0)^{k+1} - A = \sum_{k=0}^\infty a_k \int_a^b (x - x_0)^k\,dx. \] ∎
The following result shows that the product of two power series is a power series. (For a result on the division of power series, see Taylor [13], p. 619.)
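7.33 THEOREM. If $f(x) = \sum_{k=0}^\infty a_k x^k$ and $g(x) = \sum_{k=0}^\infty b_k x^k$ converge on $(-r, r)$ and
\[ c_k = \sum_{j=0}^{k} a_j b_{k-j}, \qquad k = 0, 1, \ldots, \]
then $\sum_{k=0}^\infty c_k x^k$ converges on $(-r, r)$ and converges to $f(x)g(x)$.
PROOF. Fix $x \in (-r, r)$ and for each $n \in \mathbf{N}$, set
\[ f_n(x) = \sum_{k=0}^{n} a_k x^k, \qquad g_n(x) = \sum_{k=0}^{n} b_k x^k, \qquad\text{and}\qquad h_n(x) = \sum_{k=0}^{n} c_k x^k. \]
By changing the order of summation, we see that
\[ h_n(x) = \sum_{k=0}^{n} \sum_{j=0}^{k} a_j b_{k-j} x^j x^{k-j} = \sum_{j=0}^{n} a_j x^j \sum_{k=j}^{n} b_{k-j} x^{k-j} = \sum_{j=0}^{n} a_j x^j g_{n-j}(x) = g(x) f_n(x) + \sum_{j=0}^{n} a_j x^j \left(g_{n-j}(x) - g(x)\right). \]
Thus, it suffices to show that
\[ \lim_{n\to\infty} \sum_{j=0}^{n} a_j x^j \left(g_{n-j}(x) - g(x)\right) = 0. \]
Let $\varepsilon > 0$. Since $f(x)$ converges absolutely and $g_n(x)$ converges as $n \to \infty$, choose $M > 0$ such that $\sum_{k=0}^\infty |a_k x^k| < M$ and
\[ |g_{n-j}(x) - g(x)| \le M \]
for all integers $n \ge j \ge 0$. Similarly, choose $N \in \mathbf{N}$ such that $\ell \ge N$ implies
\[ |g_\ell(x) - g(x)| < \frac{\varepsilon}{2M} \qquad\text{and}\qquad \sum_{j=N+1}^{\infty} |a_j x^j| < \frac{\varepsilon}{2M}. \]
Let $n > 2N$. Then $n - j > N$ whenever $0 \le j \le N$, so
\[ \left| \sum_{j=0}^{n} a_j x^j \left(g_{n-j}(x) - g(x)\right) \right| \le \left| \sum_{j=0}^{N} a_j x^j \left(g_{n-j}(x) - g(x)\right) \right| + \left| \sum_{j=N+1}^{n} a_j x^j \left(g_{n-j}(x) - g(x)\right) \right| < \frac{\varepsilon}{2M} \sum_{j=0}^{N} |a_j x^j| + M \sum_{j=N+1}^{n} |a_j x^j| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \] ∎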
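7.34 COROLLARY. Suppose that $a_k, b_k \in \mathbf{R}$ and $c_k := \sum_{j=0}^{k} a_j b_{k-j}$ for $k = 0, 1, \ldots$. If either (i) $\sum_{k=0}^\infty a_k$ and $\sum_{k=0}^\infty b_k$ both converge, and at least one of them converges absolutely, or (ii) $\sum_{k=0}^\infty a_k$, $\sum_{k=0}^\infty b_k$, and $\sum_{k=0}^\infty c_k$ all converge, then
\[ \sum_{k=0}^\infty c_k = \left( \sum_{k=0}^\infty a_k \right) \left( \sum_{k=0}^\infty b_k \right). \tag{8} \]
PROOF. (i) Repeat the proof of Theorem 7.33 with $x = 1$.
(ii) By hypothesis, the radii of convergence of $\sum_{k=0}^\infty a_k x^k$, $\sum_{k=0}^\infty b_k x^k$, and $\sum_{k=0}^\infty c_k x^k$ are all at least 1; hence, by Theorem 7.33,
\[ \sum_{k=0}^\infty c_k x^k = \left( \sum_{k=0}^\infty a_k x^k \right) \left( \sum_{k=0}^\infty b_k x^k \right) \tag{9} \]
for $x \in (-1,1)$. But by Abel's Theorem (7.27), the limit of (9) as $x \uparrow 1$ is (8). ∎
The hypotheses of Corollary 7.34 cannot be relaxed.
*7.35 Example. If $a_k = b_k = (-1)^k/\sqrt{k}$ for $k \in \mathbf{N}$ and $a_0 = b_0 = 0$, then $\sum_{k=0}^\infty c_k$ diverges.
PROOF. If $\sum_{k=0}^\infty c_k$ converges, then $c_k \to 0$ as $k \to \infty$. But for $k > 1$ odd,
\[ |c_k| = \sum_{j=1}^{k-1} \frac{1}{\sqrt{j}\sqrt{k-j}} = 2 \sum_{j=1}^{(k-1)/2} \frac{1}{\sqrt{j}\sqrt{k-j}} \ge 2 \left( \frac{k-1}{2} \right) \left( \frac{1}{\sqrt{(k-1)/2}\,\sqrt{k-1}} \right) = \sqrt{2}. \]
Thus $c_k$ cannot converge to zero, a contradiction. ∎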
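The lower bound in Example 7.35 is easy to check by direct computation. The sketch below builds the Cauchy product coefficients from their definition; the function names `a` and `cauchy_coeff` are illustrative, and the range of $k$ tested is an arbitrary choice.

```python
import math

def a(k):
    """a_0 = 0 and a_k = (-1)^k / sqrt(k) for k >= 1, as in Example 7.35."""
    return 0.0 if k == 0 else (-1) ** k / math.sqrt(k)

def cauchy_coeff(k):
    """c_k = sum_{j=0}^{k} a_j * a_{k-j}, the Cauchy product of the series with itself."""
    return sum(a(j) * a(k - j) for j in range(k + 1))

# |c_k| for odd k = 3, 5, ..., 199
odd_magnitudes = [abs(cauchy_coeff(k)) for k in range(3, 200, 2)]
smallest_odd_magnitude = min(odd_magnitudes)
```

Every value in `odd_magnitudes` stays at or above $\sqrt{2}$, so the terms $c_k$ do not tend to zero, even though $\sum a_k$ itself converges (conditionally) by the Alternating Series Test.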
We close this section with some optional material on finding exact values of convergent power series. Namely, we show how term-by-term differentiation and integration can be used in conjunction with geometric series to obtain simple formulas for certain kinds of power series. Such formulas are called closed forms.
*7.36 Example. Find a closed form of the power series
\[ f(x) = \sum_{k=1}^\infty k x^k. \]
SOLUTION. Since the interval of convergence of this power series is $(-1,1)$, we have by Theorems 7.32 and 6.7 (the Geometric Series) that
\[ \int_0^x \frac{f(t)}{t}\,dt = \sum_{k=1}^\infty k \int_0^x t^{k-1}\,dt = \sum_{k=1}^\infty x^k = \frac{x}{1-x} \]
for each $x \in (-1,1)$. (Note that $f(x)/x$ is defined at $x = 0$ and has value 1.) Hence, by the Fundamental Theorem of Calculus,
\[ \frac{f(x)}{x} = \left( \frac{x}{1-x} \right)' = \frac{1}{(1-x)^2}, \]
and it follows that
\[ f(x) = \frac{x}{(1-x)^2}, \qquad x \in (-1,1). \] ∎
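As a sanity check on the closed form $x/(1-x)^2$ for $\sum_{k\ge 1} k x^k$, one can compare it with long partial sums at a few interior points. This is a minimal sketch; the sample points and the truncation length 2000 are arbitrary assumptions, and the helper names are illustrative.

```python
def partial_sum(x, n):
    """n-th partial sum of sum_{k>=1} k * x^k."""
    return sum(k * x ** k for k in range(1, n + 1))

def closed_form(x):
    """Closed form x / (1 - x)^2, valid for -1 < x < 1."""
    return x / (1.0 - x) ** 2

sample_points = (-0.9, -0.5, 0.0, 0.3, 0.9)
# Largest discrepancy between a 2000-term partial sum and the closed form
max_gap = max(abs(partial_sum(x, 2000) - closed_form(x)) for x in sample_points)
```

Agreement degrades as $|x|$ approaches 1, where more terms are needed, which matches the fact that the interval of convergence is the open interval $(-1,1)$.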
*7.37 Example. Find a closed form of the power series
\[ g(x) = \sum_{k=0}^\infty \frac{x^k}{k+1}. \]
SOLUTION. Since the interval of convergence of this power series is $[-1, 1)$, we have by Theorem 7.30 that
\[ (x g(x))' = \sum_{k=0}^\infty \left( \frac{x^{k+1}}{k+1} \right)' = \sum_{k=0}^\infty x^k = \frac{1}{1-x} \]
for $x \in (-1,1)$. Hence, by the Fundamental Theorem of Calculus,
\[ x g(x) = \int_0^x \frac{dt}{1-t} = -\log(1 - x) \]
for $x \in (-1,1)$. Since $g(-1)$ exists and $\log(1 - x)$ is continuous at $x = -1$, we conclude by Abel's Theorem that
\[ g(x) = -\frac{\log(1 - x)}{x}, \qquad x \in [-1,1) \setminus \{0\}, \qquad\text{and}\qquad g(0) = 1. \] ∎
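The closed form of Example 7.37, including the endpoint value supplied by Abel's Theorem, can be compared with partial sums. This sketch assumes nothing beyond the formulas in the example; the names `partial_sum` and `closed_form` and the truncation lengths are illustrative choices.

```python
import math

def partial_sum(x, n):
    """Sum of x^k / (k + 1) for k = 0, ..., n."""
    return sum(x ** k / (k + 1) for k in range(n + 1))

def closed_form(x):
    """g(x) = -log(1 - x)/x on [-1, 1) with g(0) = 1."""
    if x == 0:
        return 1.0
    return -math.log1p(-x) / x

interior_gap = max(abs(partial_sum(x, 5000) - closed_form(x))
                   for x in (-0.8, -0.3, 0.0, 0.4, 0.8))
# At x = -1 the series is alternating, so convergence is slow but certain
endpoint_gap = abs(partial_sum(-1.0, 100000) - closed_form(-1.0))
```

At $x = -1$ the closed form gives $\log 2$, and the alternating series error bound guarantees the partial sum with $n$ terms is within $1/(n+2)$ of it.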
EXERCISES 1. Find the interval of convergence of each of the following power series. k
00
(a)
00
L~k.
(b) L(( _l)k
k=O
+ 3)k(x -
l)k.
k=O
(k+1)
~ 1·3··· (2k - 1) 2k (d) ~ (k I)! x. k=l + *2. Find a closed form for each of the following series and the largest set on which this formula is valid. *
00 (c) Llog -k- xk. k=l
00
(a) L3x3k - 1 . k=l
00
(b) L
kxk-2.
L 00
(c)
2k k1(1 - x)k.
k=l
k=2
+
3. Use Theorems 7.30 and 7.33 to give two different proofs of the following identity:
\[ \frac{1}{(1-x)^2} = \sum_{k=0}^\infty (k+1) x^k, \qquad x \in (-1,1). \]
*4. If $\sum_{k=1}^\infty a_k x^k$ has radius of convergence $R$ and $a_k \ne 0$ for large $k$, prove that
\[ \liminf_{k\to\infty} \left| \frac{a_k}{a_{k+1}} \right| \le R \le \limsup_{k\to\infty} \left| \frac{a_k}{a_{k+1}} \right|. \]
5. Suppose that $|a_k| \le |b_k|$ for large $k$. Prove that if $\sum_{k=1}^\infty b_k x^k$ converges on an open interval $I$, then $\sum_{k=1}^\infty a_k x^k$ also converges on $I$. Is this result true if "open" is omitted?
6. Suppose that $\{a_k\}_{k=0}^\infty$ is a bounded sequence of real numbers. (a) Prove that
\[ \sum_{k=0}^\infty a_k x^k \]
has a positive radius of convergence. (b) If $[a, b] \subset (0,1)$ and $f(x)$
In(x):= I (x
:=
-~),
prove that In ~ I uniformly on [a, bJ.
x E [a,b],
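7. A series $\sum_{k=0}^\infty a_k$ is said to be Abel summable to $L$ if and only if
\[ \lim_{r \to 1-} \sum_{k=0}^\infty a_k r^k = L. \]
(a) Prove that if $\sum_{k=0}^\infty a_k$ converges to $L$, then $\sum_{k=0}^\infty a_k$ is Abel summable to $L$. (b) Find the Abel sum of $\sum_{k=0}^\infty (-1)^k$.
*8. Prove that
\[ f(x) = \sum_{k=0}^\infty \left( \frac{x}{(-1)^k + 4} \right)^k \]
is differentiable on $(-3,3)$ and
\[ |f'(x)| \le \frac{3}{(3 - x)^2} \]
for $0 \le x < 3$.
9. Let $a_k \downarrow 0$ as $k \to \infty$. Prove that given $\varepsilon > 0$ there is a $\delta > 0$ such that

for all $x, y \in [0,1]$ that satisfy $|x - y| < \delta$.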
*10. (a) Prove the following weak form of Stirling's Formula (compare with Theorem 12.73):
(b) Find all x E R for which the power series
converges absolutely.
7.4 ANALYTIC FUNCTIONS In this section we study functions that can be represented by power series. (For a discussion of how to represent functions by trigonometric series instead of power series, see Chapter 14.) We begin with the following definition.
7.38 DEFINITION. A real-valued function $f$ is said to be (real) analytic on a nonempty, open interval $(a, b)$ if and only if given $x_0 \in (a, b)$ there is a power series centered at $x_0$ that converges to $f$ near $x_0$; i.e., there exist coefficients $\{a_k\}_{k=0}^\infty$ and points $c, d \in (a, b)$ such that $c < x_0 < d$ and
\[ f(x) = \sum_{k=0}^\infty a_k (x - x_0)^k \]
for all $x \in (c, d)$.
We shall develop several techniques for showing that a given function is analytic. To simplify statements of results, we shall use the conventions $f^{(0)} := f$ and $0! := 1$. First, it is important to realize that if $f$ can be represented by a power series $S$, then $f$ is locally smooth and the coefficients of $S$ can be computed using derivatives of $f$.
7.39 THEOREM [UNIQUENESS]. Let $c, d$ be extended real numbers with $c < d$, let $x_0 \in (c, d)$, and suppose that $f : (c, d) \to \mathbf{R}$. If $f(x) = \sum_{k=0}^\infty a_k (x - x_0)^k$ for each $x \in (c, d)$, then $f \in C^\infty(c, d)$ and
\[ a_k = \frac{f^{(k)}(x_0)}{k!}, \qquad k = 0, 1, \ldots. \]
PROOF. Clearly, $f(x_0) = a_0$. Fix $k \in \mathbf{N}$. By hypothesis, the radius of convergence $R$ of the power series $\sum_{k=0}^\infty a_k (x - x_0)^k$ is positive and $(c, d) \subseteq (x_0 - R, x_0 + R)$. Hence, by Corollary 7.31, $f \in C^\infty(c, d)$ and
\[ f^{(k)}(x) = \sum_{n=k}^\infty \frac{n!}{(n-k)!} a_n (x - x_0)^{n-k} \tag{10} \]
for $x \in (c, d)$. Apply this to $x = x_0$. The terms on the right side of (10) are zero when $n > k$ and $k!\,a_k$ when $n = k$. Hence, $f^{(k)}(x_0) = k!\,a_k$ for each $k \in \mathbf{N}$. ∎
In particular, if $f$ is analytic on $(a, b)$, then for each $x_0 \in (a, b)$ there is only one power series centered at $x_0$ that represents $f$ near $x_0$. This power series has a special name.
7.40 DEFINITION. Let $f \in C^\infty(a, b)$ and let $x_0 \in (a, b)$. The Taylor expansion (or Taylor series) of $f$ centered at $x_0$ is the series
\[ \sum_{k=0}^\infty \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k. \]
(No convergence is implied or assumed.) The Taylor expansion of $f$ centered at $x_0 = 0$ is usually called the Maclaurin expansion (or Maclaurin series) of $f$.
By Theorem 7.39, every analytic function is a $C^\infty$ function. The next remark shows that the converse of this statement is false.
7.41 Remark [CAUCHY]. The function
\[ f(x) = \begin{cases} e^{-1/x^2} & x \ne 0 \\ 0 & x = 0 \end{cases} \]
belongs to $C^\infty(-\infty, \infty)$ but is not analytic on any interval that contains $x = 0$.
PROOF. It is easy to see (Exercise 3, p. 101) that $f \in C^\infty(-\infty, \infty)$ and $f^{(k)}(0) = 0$ for all $k \in \mathbf{N}$. Thus the Taylor expansion of $f$ about the point $x_0 = 0$ is identically zero, but $f(x) = 0$ only when $x = 0$. ∎
One of the aims of this section is to prove that many of the classical $C^\infty$ functions used in elementary calculus are analytic on their domain. Since, by Theorem 7.39, a $C^\infty$ function $f$ is analytic on an open interval $I$ if and only if its Taylor expansion at each $x_0 \in I$ converges to $f$ near $x_0$, the following concept is useful in this regard.
7.42 DEFINITION. Let $f \in C^\infty(a, b)$ and $x_0 \in (a, b)$. The remainder term of order $n$ of the Taylor expansion of $f$ centered at $x_0$ is the function
\[ R_n^{f,x_0}(x) := f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k. \]
In fact, the remainder term completely determines analyticity of a given function in the following way.
7.43 THEOREM. A function $f \in C^\infty(a, b)$ is analytic on $(a, b)$ if and only if given $x_0 \in (a, b)$ there is an interval $(c, d)$ containing $x_0$ such that the remainder term $R_n^{f,x_0}(x)$ converges to zero for all $x \in (c, d)$.
PROOF. By Theorem 7.39, $f$ is analytic on $(a, b)$ if and only if given $x_0 \in (a, b)$ there is an interval $(c, d)$ containing $x_0$ such that the Taylor expansion of $f$ centered at $x_0$ converges to $f$ pointwise on $(c, d)$. By Definition 7.42, this happens if and only if $R_n^{f,x_0}(x) \to 0$, as $n \to \infty$, for every $x \in (c, d)$. ∎
Thus, to decide whether a given $f \in C^\infty(a, b)$ is analytic on $(a, b)$ we need to estimate the corresponding remainder terms. We shall prove two results (see Theorems 7.44 and 7.52 below) that can be used to estimate remainder terms in concrete situations. To motivate the first result, notice that $R_1^{f,x_0}(x) = f(x) - f(x_0)$, the remainder term of order 1, can always be estimated using the Mean Value Theorem. The proof of the following result shows that the remainder term of order $n$ can be estimated by the Generalized Mean Value Theorem.
7.44 THEOREM [TAYLOR'S FORMULA]. Let $n \in \mathbf{N}$, let $a, b$ be distinct extended real numbers, let $f : (a, b) \to \mathbf{R}$, and suppose that $f^{(n)}$ exists on $(a, b)$. Then for each pair of points $x, x_0 \in (a, b)$ there is a number $c$ between $x$ and $x_0$ such that
\[ R_n^{f,x_0}(x) = \frac{f^{(n)}(c)}{n!} (x - x_0)^n. \]
In particular,
\[ f(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k + \frac{f^{(n)}(c)}{n!} (x - x_0)^n \]
for some number $c$ between $x$ and $x_0$.
PROOF. Without loss of generality, suppose that $x_0 < x$. Define
\[ F(t) := \frac{(x - t)^n}{n!} \qquad\text{and}\qquad G(t) := R_n^{f,t}(x) = f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(t)}{k!} (x - t)^k \]
for each $t \in (a, b)$. In order to apply the Generalized Mean Value Theorem to $F$ and $G$, we need to be sure the hypotheses of that result hold. Notice by the Chain Rule that
\[ F'(t) = -\frac{(x - t)^{n-1}}{(n-1)!} \tag{11} \]
for $t \in \mathbf{R}$. On the other hand, since
\[ \frac{d}{dt} \left( \frac{f^{(k)}(t)}{k!} (x - t)^k \right) = \frac{f^{(k+1)}(t)}{k!} (x - t)^k - \frac{f^{(k)}(t)}{(k-1)!} (x - t)^{k-1} \]
for $t \in (a, b)$ and $k \in \mathbf{N}$, we can telescope to obtain
\[ G'(t) = -\frac{f^{(n)}(t)}{(n-1)!} (x - t)^{n-1} \tag{12} \]
for $t \in (a, b)$. Thus, $F$ and $G$ are differentiable on $(x_0, x)$ and continuous on $[x_0, x]$. By the Generalized Mean Value Theorem and the fact that $F(x) = G(x) = 0$, there is a number $c \in (x_0, x)$ such that
\[ -F(x_0)G'(c) = (F(x) - F(x_0))G'(c) = (G(x) - G(x_0))F'(c) = -G(x_0)F'(c). \]
Hence, it follows from (11) and (12) that
\[ \frac{(x - x_0)^n}{n!} \left( \frac{f^{(n)}(c)(x - c)^{n-1}}{(n-1)!} \right) = R_n^{f,x_0}(x)\, \frac{(x - c)^{n-1}}{(n-1)!}. \]
Solving this equation for $R_n^{f,x_0}(x)$ completes the proof. ∎
The following theorem, a corollary of Taylor's Formula, is the first of several results that identify conditions on the derivatives $f^{(n)}$ of an $f \in C^\infty$ sufficient for $f$ to be analytic on an interval $(a, b)$.
7.45 THEOREM. Let $f \in C^\infty(a, b)$. If there is an $M > 0$ such that
\[ |f^{(n)}(x)| \le M^n \]
for all $x \in (a, b)$ and $n \in \mathbf{N}$, then $f$ is analytic on $(a, b)$. In fact, for each $x_0 \in (a, b)$,
\[ f(x) = \sum_{k=0}^\infty \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k \]
holds for all $x \in (a, b)$.
PROOF. Fix $x_0 \in (a, b)$ and set $C = \max\{M|a - x_0|, M|b - x_0|\}$. By Theorem 7.44,
\[ |R_n^{f,x_0}(x)| \le \frac{M^n}{n!} |x - x_0|^n \le \frac{C^n}{n!} \]
for all $n \in \mathbf{N}$. But $C^n/n! \to 0$ as $n \to \infty$ for any $C \in \mathbf{R}$ (being terms of a convergent series by the Ratio Test). Thus, by the Squeeze Theorem, the remainder term $R_n^{f,x_0}(x)$ converges to zero for every $x \in (a, b)$. ∎
7.46 Example. Prove that $\sin x$ and $\cos x$ are analytic on $\mathbf{R}$ and have Maclaurin expansions
\[ \sin x = \sum_{k=0}^\infty \frac{(-1)^k x^{2k+1}}{(2k+1)!}, \qquad \cos x = \sum_{k=0}^\infty \frac{(-1)^k x^{2k}}{(2k)!}. \]
PROOF. Set $f(x) = \sin x$. It is easy to see that
\[ f^{(n)}(x) = \begin{cases} \sin x & n = 4j, \\ \cos x & n = 4j + 1, \\ -\sin x & n = 4j + 2, \\ -\cos x & n = 4j + 3 \end{cases} \]
for $j = 0, 1, \ldots$. Hence $|f^{(n)}(x)| \le 1$ for all $x \in \mathbf{R}$, and
\[ f^{(n)}(0) = \begin{cases} (-1)^{k-1} & n = 2k - 1, \\ 0 & n = 2k \end{cases} \]
for $k \in \mathbf{N}$. It follows from Theorem 7.45 that $f(x) = \sin x$ is analytic on $\mathbf{R}$ and its Maclaurin expansion has the promised form. A similar argument verifies the result for $\cos x$. ∎
7.47 Example. Prove that $e^x$ is analytic on $\mathbf{R}$ and has Maclaurin expansion
\[ e^x = \sum_{k=0}^\infty \frac{x^k}{k!}. \]
PROOF. Fix $C > 0$ and set $M = e^C$. If $f(x) = e^x$, then $f^{(n)}(x) = e^x$ for all $x \in \mathbf{R}$. Hence, $|f^{(n)}(x)| \le M \le M^n$ for $n \in \mathbf{N}$ and $x \in [-C, C]$. It follows from Theorem 7.45 that the Maclaurin series of $f$ converges to $f$ on $[-C, C]$. Since $f^{(n)}(0) = 1$ for all $n \in \mathbf{N}$, and $C > 0$ was arbitrary, we conclude that $\sum_{k=0}^\infty x^k/k!$ converges to $e^x$ for all $x \in \mathbf{R}$. ∎
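The remainder estimate behind this example can be checked numerically. The sketch below assumes the bound $(Mc)^n/n!$ with $M = e^c$ on the interval $(-c, c)$ centered at $x_0 = 0$, which is the quantity $C^n/n!$ from the proof of Theorem 7.45 specialized to this setting; the function names are illustrative, and orders above 15 are avoided because floating-point rounding would then dominate the tiny remainders.

```python
import math

def taylor_remainder_exp(x, n):
    """R_n(x): e^x minus its Maclaurin polynomial of degree n - 1."""
    partial = sum(x ** k / math.factorial(k) for k in range(n))
    return math.exp(x) - partial

def theorem_bound(c, n):
    """(M*c)^n / n! with M = e^c, the bound C^n/n! on the interval (-c, c)."""
    return (math.exp(c) * c) ** n / math.factorial(n)

c = 1.0
# Check |R_n(x)| <= C^n / n! at several points of (-c, c) and orders n = 1..15
bound_holds = all(
    abs(taylor_remainder_exp(x, n)) <= theorem_bound(c, n) + 1e-15
    for n in range(1, 16)
    for x in (-0.9, -0.3, 0.5, 0.9)
)
```

The bound is far from sharp, but it tends to zero as $n$ grows, which is all the Squeeze Theorem argument requires.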
Sometimes, it is impractical to get the kind of global estimates on the derivatives of f necessary to apply Theorem 7.45. The following result, which shows that the center of a power series can be changed within its interval of convergence, is sometimes used to shortcut this process.
7.48 THEOREM. Suppose that $I$ is an open interval centered at $c$ and
\[ f(x) = \sum_{k=0}^\infty a_k (x - c)^k, \qquad x \in I. \]
If $x_0 \in I$ and $r > 0$ satisfy $(x_0 - r, x_0 + r) \subseteq I$, then
\[ f(x) = \sum_{k=0}^\infty \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k \]
for all $x \in (x_0 - r, x_0 + r)$. In particular, if $f$ is a $C^\infty$ function whose Taylor series expansion converges to $f$ on some open interval $J$, then $f$ is analytic on $J$.
PROOF. It suffices to prove the first statement. By making the change of variables $w = x - c$, we may suppose that $c = 0$ and $I = (-R, R)$, i.e., that $f(x) = \sum_{k=0}^\infty a_k x^k$, for all $x \in (-R, R)$. Suppose that $(x_0 - r, x_0 + r) \subseteq (-R, R)$ and fix $x \in (x_0 - r, x_0 + r)$. By hypothesis and the Binomial Formula,
\[ f(x) = \sum_{k=0}^\infty a_k x^k = \sum_{k=0}^\infty a_k ((x - x_0) + x_0)^k = \sum_{k=0}^\infty a_k \sum_{j=0}^{k} \binom{k}{j} x_0^{k-j} (x - x_0)^j. \tag{13} \]
Since $\sum_{k=0}^\infty a_k y^k$ converges absolutely at $y := |x - x_0| + |x_0| < R$, we have
\[ \sum_{k=0}^\infty \left| a_k \sum_{j=0}^{k} \binom{k}{j} x_0^{k-j} (x - x_0)^j \right| \le \sum_{k=0}^\infty |a_k| \sum_{j=0}^{k} \binom{k}{j} |x_0|^{k-j} |x - x_0|^j = \sum_{k=0}^\infty |a_k| (|x - x_0| + |x_0|)^k < \infty. \]
Hence, by (13), Theorem 7.18, and Corollary 7.31,
\[ f(x) = \sum_{k=0}^\infty a_k \sum_{j=0}^{k} \binom{k}{j} x_0^{k-j} (x - x_0)^j = \sum_{j=0}^\infty \left( \sum_{k=j}^\infty \binom{k}{j} a_k x_0^{k-j} \right) (x - x_0)^j = \sum_{j=0}^\infty \left( \sum_{k=j}^\infty \frac{k!}{(k-j)!} a_k (x_0 - 0)^{k-j} \right) \frac{(x - x_0)^j}{j!} = \sum_{j=0}^\infty \frac{f^{(j)}(x_0)}{j!} (x - x_0)^j. \] ∎
213
Analytic functions
7.49 Example. Prove that $\arctan x$ is analytic on $(-1, 1)$ and has Maclaurin expansion
\[ \arctan x = \sum_{k=0}^\infty \frac{(-1)^k x^{2k+1}}{2k+1}, \qquad x \in (-1,1). \]
PROOF. For each $0 < x < 1$, the geometric series $\sum_{k=0}^\infty (-1)^k t^{2k}$ converges uniformly on $[-x, x]$ to $1/(1 + t^2)$ (see Exercise 1, p. 158). Thus, by Theorem 7.32,
\[ \arctan x = \int_0^x \frac{dt}{1 + t^2} = \int_0^x \sum_{k=0}^\infty (-1)^k t^{2k}\,dt = \sum_{k=0}^\infty \frac{(-1)^k x^{2k+1}}{2k+1}. \]
By uniqueness, this is the Maclaurin expansion of $\arctan x$. Since this expansion converges on $(-1, 1)$, it follows from Theorem 7.48 that $\arctan x$ is analytic on $(-1,1)$. ∎
In Examples 7.46 and 7.47, we found the Taylor expansion of a given $f$ by computing the derivatives of $f$ and estimating the remainder term. In the preceding example, we found the Taylor expansion of $\arctan x$ without computing its derivatives. This can be done in general, using term-by-term differentiation or integration or products of power series, when the function in question can be written as an integral or derivative or product of functions whose Taylor series are known. Here are two more examples of this type.
7.50 Example. Find the Maclaurin expansion of $\arctan x/(1 - x)$.
PROOF. By Theorem 7.33 and Example 7.49, for each $|x| < 1$,
\[ \frac{\arctan x}{1 - x} = \left( \sum_{k=0}^\infty x^k \right) \left( \sum_{k=0}^\infty \frac{(-1)^k x^{2k+1}}{2k+1} \right) = \sum_{k=1}^\infty \left( \sum_{j \in A_k} \frac{(-1)^j}{2j+1} \right) x^k, \]
where $A_k := \{ j \in \mathbf{Z} : 0 \le j \le (k - 1)/2 \}$. ∎
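The product formula of Example 7.50 can be tested against the function it represents. In this sketch, `product_coeff` computes the inner sum over $0 \le j \le (k-1)/2$ and `series_value` sums a truncated series; both names and the truncation length of 200 terms are illustrative assumptions.

```python
import math

def product_coeff(k):
    """Coefficient of x^k in arctan(x)/(1 - x): sum of (-1)^j/(2j+1) over 0 <= j <= (k-1)/2."""
    return sum((-1) ** j / (2 * j + 1) for j in range((k - 1) // 2 + 1))

def series_value(x, n_terms=200):
    """Truncated Maclaurin series of arctan(x)/(1 - x)."""
    return sum(product_coeff(k) * x ** k for k in range(1, n_terms + 1))

gap_pos = abs(series_value(0.5) - math.atan(0.5) / (1 - 0.5))
gap_neg = abs(series_value(-0.5) - math.atan(-0.5) / (1 + 0.5))
```

The first few coefficients are $1, 1, 1 - \tfrac13, 1 - \tfrac13, \ldots$, that is, the partial sums of the Leibniz-type series, each repeated twice, which is what multiplying by the geometric series $\sum x^k$ produces.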
7.51 Example. Show that the Taylor expansion of $\log x$ centered at $x_0 = 1$ is
\[ \log x = \sum_{k=1}^\infty \frac{(-1)^{k+1}}{k} (x - 1)^k, \qquad x \in (0,2). \]
PROOF. By Theorem 7.32, for each $x \in (0,2)$,
\[ \log x = \int_1^x \frac{dt}{t} = \int_1^x \frac{dt}{1 - (1 - t)} = \int_1^x \sum_{k=0}^\infty (1 - t)^k\,dt = \sum_{k=1}^\infty \frac{(-1)^{k+1}}{k} (x - 1)^k. \] ∎
In some situations it is useful to have an integral form of the remainder term. This requires a slightly stronger hypothesis than Theorem 7.44 but can yield a sharper estimate.
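7.52 THEOREM [LAGRANGE]. Let $n \in \mathbf{N}$. If $f \in C^n(a,b)$, then
\[ R_n(x) := R_n^{f,x_0}(x) = \frac{1}{(n-1)!} \int_{x_0}^{x} (x - t)^{n-1} f^{(n)}(t)\,dt \]
for all $x, x_0 \in (a, b)$.
PROOF. The proof is by induction on $n$. If $n = 1$, the formula holds by the Fundamental Theorem of Calculus. Suppose that the formula holds for some $n \in \mathbf{N}$. Since
\[ R_{n+1}(x) = R_n(x) - \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n \qquad\text{and}\qquad \frac{(x - x_0)^n}{n!} = \frac{1}{(n-1)!} \int_{x_0}^{x} (x - t)^{n-1}\,dt, \]
it follows that
\[ R_{n+1}(x) = \frac{1}{(n-1)!} \int_{x_0}^{x} (x - t)^{n-1} \left( f^{(n)}(t) - f^{(n)}(x_0) \right) dt. \]
Let $u = f^{(n)}(t) - f^{(n)}(x_0)$, $dv = (x - t)^{n-1}\,dt$ and integrate the right side of the identity above by parts. Since $u(x_0) = 0$ and $v(x) = 0$, we have
\[ R_{n+1}(x) = -\frac{1}{(n-1)!} \int_{x_0}^{x} u'(t) v(t)\,dt = \frac{1}{n!} \int_{x_0}^{x} (x - t)^n f^{(n+1)}(t)\,dt. \]
Hence, the formula holds for $n + 1$. ∎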
The rest of this section contains some additional (but optional) material on analytic functions.

In order to generalize the Binomial Formula from integer exponents to real exponents (compare Theorem 1.15 with Theorem 7.54), we introduce the following notation. Let $\alpha \in \mathbf{R}$ and $k$ be a nonnegative integer. The generalized binomial coefficient $\alpha$ over $k$ is defined by
\[ \binom{\alpha}{k} := \begin{cases} \dfrac{\alpha(\alpha - 1) \cdots (\alpha - k + 1)}{k!} & k \ne 0 \\ 1 & k = 0. \end{cases} \]
Notice that when $\alpha \in \mathbf{N}$, these generalized binomial coefficients coincide with the usual binomial coefficients, because in this case $\binom{\alpha}{k} = 0$ for $k > \alpha$.
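The definition translates directly into a short function. This is a sketch only: `gen_binom` is a hypothetical name, and the product is accumulated factor by factor as $(\alpha - i)/(i + 1)$ to avoid forming large factorials.

```python
import math

def gen_binom(alpha, k):
    """Generalized binomial coefficient: alpha(alpha-1)...(alpha-k+1)/k!, and 1 when k = 0."""
    result = 1.0
    for i in range(k):
        result *= (alpha - i) / (i + 1)
    return result

# For natural alpha the values agree with the usual binomial coefficients,
# including the zeros that occur when k > alpha.
agrees_with_comb = all(
    abs(gen_binom(n, k) - math.comb(n, k)) < 1e-9
    for n in range(1, 8)
    for k in range(0, 12)
)

half_over_two = gen_binom(0.5, 2)   # (1/2)(-1/2)/2! = -1/8
```

For non-integer $\alpha$ no factor $(\alpha - i)$ ever vanishes, so the coefficients never become zero; this is the reason a real exponent leads to an infinite series rather than a finite sum.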
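*7.53 Lemma. Suppose that $\alpha, \beta \in \mathbf{R}$. Then
\[ \binom{\alpha + \beta}{k} = \sum_{j=0}^{k} \binom{\alpha}{k-j} \binom{\beta}{j}, \qquad k = 0, 1, \ldots. \]
PROOF. The formula holds for $k = 0$ and $k = 1$. If it holds for some $k \ge 1$, then by the inductive hypothesis and the definition of the generalized binomial coefficients,
\[ \binom{\alpha + \beta}{k+1} = \binom{\alpha + \beta}{k} \frac{\alpha + \beta - k}{k+1} = \sum_{j=0}^{k} \binom{\alpha}{k-j} \binom{\beta}{j} \frac{(\alpha - k + j) + (\beta - j)}{k+1} \]
\[ = \sum_{j=0}^{k} \left( \frac{k - j + 1}{k+1} \binom{\alpha}{k-j+1} \binom{\beta}{j} + \frac{j + 1}{k+1} \binom{\alpha}{k-j} \binom{\beta}{j+1} \right) \]
\[ = \binom{\alpha}{k+1} + \sum_{j=1}^{k} \left( \frac{k - j + 1}{k+1} + \frac{j}{k+1} \right) \binom{\alpha}{k-j+1} \binom{\beta}{j} + \binom{\beta}{k+1} = \sum_{j=0}^{k+1} \binom{\alpha}{k+1-j} \binom{\beta}{j}. \] ∎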
With this ugly calculation out of the way, we are prepared to generalize the Binomial Formula.
*7.54 THEOREM [BINOMIAL SERIES]. If $\alpha \in \mathbf{R}$ and $|x| < 1$, then
\[ (1 + x)^\alpha = \sum_{k=0}^\infty \binom{\alpha}{k} x^k. \]
In particular, $(1 + x)^\alpha$ is analytic on $(-1,1)$ for all $\alpha \in \mathbf{R}$.
PROOF. Fix $|x| < 1$ and consider the series $F(\alpha) := \sum_{k=0}^\infty \binom{\alpha}{k} x^k$. Since
\[ \lim_{k\to\infty} \left| \binom{\alpha}{k+1} x^{k+1} \Big/ \binom{\alpha}{k} x^k \right| = \lim_{k\to\infty} \frac{|\alpha - k|}{k+1} |x| = |x| < 1 \]
is independent of $\alpha$, it follows from the proof of the Ratio Test that $F$ converges absolutely and uniformly on $\mathbf{R}$. Hence, $F$ is continuous. Moreover, by Theorem 7.33 and Lemma 7.53,
\[ F(\alpha)F(\beta) = \sum_{k=0}^\infty \left( \sum_{j=0}^{k} \binom{\alpha}{k-j} \binom{\beta}{j} \right) x^k = \sum_{k=0}^\infty \binom{\alpha + \beta}{k} x^k = F(\alpha + \beta). \]
Hence, it follows from Exercise 9, p. 79, that $F(\alpha) = F(1)^\alpha$. Since
\[ F(1) = \sum_{k=0}^\infty \binom{1}{k} x^k = 1 + x, \]
we conclude that $F(\alpha) = (1 + x)^\alpha$ for all $|x| < 1$. ∎
Lagrange's Theorem gives us another condition on the derivatives of $f$ sufficient to conclude that $f$ is analytic.
*7.55 THEOREM [BERNSTEIN]. If $f \in C^\infty(a, b)$ and $f^{(n)}(x) \ge 0$ for all $x \in (a, b)$ and $n \in \mathbf{N}$, then $f$ is analytic on $(a, b)$. In fact, if $x_0 \in (a,b)$ and $f^{(n)}(x) \ge 0$ for $x \in [x_0, b)$ and $n \in \mathbf{N}$, then
\[ f(x) = \sum_{k=0}^\infty \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k \tag{14} \]
for all $x \in [x_0, b)$.
PROOF. Fix $x_0 < x < b$ and $n \in \mathbf{N}$. Use Lagrange's Theorem and a change of variables $t = (x - x_0)u + x_0$ to write
\[ R_n(x) = R_n^{f,x_0}(x) = \frac{(x - x_0)^n}{(n-1)!} \int_0^1 (1 - u)^{n-1} f^{(n)}((x - x_0)u + x_0)\,du. \tag{15} \]
Since $f^{(n)} \ge 0$, (15) implies that $R_n(x) \ge 0$. On the other hand, by definition and hypothesis,
\[ R_n(x) = f(x) - \sum_{k=0}^{n-1} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k \le f(x). \]
Therefore,
\[ 0 \le R_n(x) \le f(x) \tag{16} \]
for all $x \in (x_0, b)$.
Let $b_0 \in (x_0, b)$ and notice that it suffices to verify (14) for $x_0 \le x < b_0$. (We introduce the parameter $b_0$ in order to handle the cases $b \in \mathbf{R}$ and $b = \infty$ simultaneously.) Since $R_n(x_0) = 0$ for all $n \in \mathbf{N}$, we need only show that $R_n(x) \to 0$ as $n \to \infty$ for each $x \in (x_0, b_0)$.
By hypothesis, $f^{(n+1)}(t) \ge 0$ for $t \in [x_0, b)$, so $f^{(n)}$ is increasing on $[x_0, b)$. Since $x < b_0 < b$, we have by (15) and (16) that
\[ 0 \le R_n(x) = \frac{(x - x_0)^n}{(n-1)!} \int_0^1 (1 - u)^{n-1} f^{(n)}((x - x_0)u + x_0)\,du \le \frac{(x - x_0)^n}{(n-1)!} \int_0^1 (1 - u)^{n-1} f^{(n)}((b_0 - x_0)u + x_0)\,du = \left( \frac{x - x_0}{b_0 - x_0} \right)^n R_n(b_0). \]
Since $(x - x_0)/(b_0 - x_0) < 1$ and, by (16), $R_n(b_0) \le f(b_0)$, we conclude by the Squeeze Theorem that $R_n(x) \to 0$ as $n \to \infty$. ∎

[Figure 7.6]
*7.56 Example. Prove that $a^x$ is analytic on $\mathbf{R}$ for each $a > 0$.
PROOF. First suppose that $a \ge 1$. Since $f^{(n)}(x) = (\log a)^n \cdot a^x \ge 0$ for all $x \in \mathbf{R}$ and $n \in \mathbf{N}$, $a^x$ is analytic on $\mathbf{R}$ by Bernstein's Theorem. If $0 < a < 1$, then by what we just proved and a change of variables,
\[ a^x = (a^{-1})^{-x} = \sum_{k=0}^\infty \frac{\log^k(a^{-1}) (-x)^k}{k!} = \sum_{k=0}^\infty \frac{\log^k a \cdot x^k}{k!}. \]
Hence by Theorem 7.48, $a^x$ is analytic on $\mathbf{R}$. ∎
Our final theorem shows that an analytic function cannot be extended in an arbitrary way to produce another analytic function. We first prove the following special case.
*7.57 Lemma. Suppose that $f, g$ are analytic on an open interval $(c,d)$ and that $x_0 \in (c, d)$. If $f(x) = g(x)$ for $x \in (c, x_0)$, then there is a $\delta > 0$ such that $f(x) = g(x)$ for all $x \in (x_0 - \delta, x_0 + \delta)$.
PROOF. By Theorem 7.39 and Definition 7.38, there is a $\delta > 0$ such that
\[ f(x) = \sum_{k=0}^\infty \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k \qquad\text{and}\qquad g(x) = \sum_{k=0}^\infty \frac{g^{(k)}(x_0)}{k!} (x - x_0)^k \tag{17} \]
for all $x \in (x_0 - \delta, x_0 + \delta)$. By hypothesis, $f, g$ are continuous at $x_0$ and
\[ f(x_0) = \lim_{x \to x_0-} f(x) = \lim_{x \to x_0-} g(x) = g(x_0). \tag{18} \]
Similarly, $f^{(k)}(x_0) = g^{(k)}(x_0)$ for $k \in \mathbf{N}$. We conclude from (17) that $f(x) = g(x)$ for all $x \in (x_0 - \delta, x_0 + \delta)$. ∎
*7.58 THEOREM [ANALYTIC CONTINUATION]. Suppose that $I$ and $J$ are open intervals, that $f$ is analytic on $I$, that $g$ is analytic on $J$, and that $a < b$ are points in $I \cap J$. If $f(x) = g(x)$ for $x \in (a, b)$, then $f(x) = g(x)$ for all $x \in I \cap J$.
PROOF. We assume for simplicity that $I$ and $J$ are bounded intervals. Since $I \cap J \ne \emptyset$, choose $c, d \in \mathbf{R}$ such that $I \cap J = (c, d)$ (see Figure 7.6).
Consider the set $E = \{t \in (a,d) : f(x) = g(x) \text{ for all } x \in (a,t)\}$. By our assumption, $d < \infty$ and by hypothesis $b \in E$. Thus $E$ is bounded and nonempty. Let $x_0 = \sup E$. If $x_0 < d$, then by Lemma 7.57 there is a $\delta > 0$ such that $f(x) = g(x)$ for all $x \in (x_0 - \delta, x_0 + \delta)$. This contradicts the choice of $x_0$. Therefore, $x_0 = d$; i.e., $f(x) = g(x)$ for all $x \in (a, d)$. A similar argument proves that $f(x) = g(x)$ for all $x \in (c, b)$. ∎
EXERCISES 1. Prove that each of the following functions is analytic on R and find its Maclaurin expansion.
(a) cos(3x).
(c) cos 2 x.
2. Prove that each of the following functions is analytic on $(-1, 1)$ and find its Maclaurin expansion.
(a) $\log(1 - x)$.
(c) $\dfrac{e^x}{1 - x}$.
(d) $\dfrac{x^3}{(1 - x)^2}$.
*(d) $\arcsin x$.
3. Prove that each of the following functions is analytic on $\mathbf{R}$ and find its Maclaurin expansion.
(b) $e^x \cos x$.
(c) $\sin x$.
(d) $f(x) := \begin{cases} (a^x - 1)/x & x \ne 0 \\ \log a & x = 0 \end{cases}$ (where $a > 0$ fixed).
eX
4. For each of the following functions, find its Taylor expansion centered at $x_0$ and determine the largest interval on which it converges.
(a) $\log_{10} x$.
5. (a) Prove that for all $x \in [0, 1]$,

$$1 + x + \frac{x^2}{2} + \frac{x^3}{6} \le e^x \le \frac{9}{8} + x + \frac{x^2}{2} + \frac{x^3}{6}.$$

(b) Prove that for all $x \in [0, 1]$,

$$x - \frac{x^3}{3} \le \arctan x \le \frac{1}{4} + x - \frac{x^3}{3}.$$

(c) Prove that for all $x \in [1, 2]$ and $y = x - 1$,

$$y - \frac{y^2}{2} + \frac{y^3}{3} - \frac{y^4}{4} \le \log x \le y - \frac{y^2}{2} + \frac{y^3}{3} - \frac{y^4}{64}.$$
=1
6. (a) Prove that
for all $0 < \delta \le 1$.
(b) Prove that if $|x - \pi| \le \delta \le 1$, then $|x + \sin x - \pi| \le \delta^3/3!$.
7. Suppose that $f \in C^\infty(-\infty, \infty)$ and
for all $a \in \mathbf{R}$. Prove that $f$ is analytic on $(-\infty, \infty)$ and

$$f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k, \qquad x \in \mathbf{R}.$$
8. (a) Prove that
for $n \in \mathbf{N}$.
(b) Show that

$$2.9253 < \int_{-1}^{1} e^{x^2}\,dx < 2.9254.$$

9. Let $f \in C^\infty(a, b)$. Prove that $f$ is analytic on $(a, b)$ if and only if $f'$ is analytic on $(a, b)$.
*10. Suppose that $f$ is analytic on $(-\infty, \infty)$ and

$$\int_a^b |f(x)|\,dx = 0$$

for some $a \ne b$ in $\mathbf{R}$. Prove that $f(x) = 0$ for all $x \in \mathbf{R}$.
*11. Prove that

$$\left( \sum_{k=1}^{\infty} |a_k|^\beta \right)^{1/\beta} \le \sum_{k=1}^{\infty} |a_k|$$

for all $a_k \in \mathbf{R}$ and all $\beta > 1$.
$^e$7.5 APPLICATIONS This section uses no material from any other enrichment section.
The theory of infinite series is a potent tool for both pure and applied mathematics. In this section we give several examples to back up this claim.
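We begin with a nontrivial theorem from number theory. Recall that an integer $n \ge 2$ is called prime if the only factors of $n$ in $\mathbf{N}$ are 1 and $n$. Also recall that given $n \in \mathbf{N}$ there are primes $p_1, p_2, \dots, p_k$ and exponents $\alpha_1, \alpha_2, \dots, \alpha_k$ such that

$$n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}.$$

7.59 THEOREM [EUCLID'S THEOREM; EULER'S PROOF]. There are infinitely many primes in $\mathbf{N}$.

PROOF. Suppose to the contrary that $p_1, p_2, \dots, p_k$ represent all the primes in $\mathbf{N}$. Fix $N \in \mathbf{N}$ and set $\alpha = \sup\{\alpha_1, \dots, \alpha_k\}$, where this supremum is taken over all $\alpha_j$'s that satisfy $n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}$ for some $n \le N$. Since every integer $j \in [1, N]$ must have the form $j = p_1^{e_1} \cdots p_k^{e_k}$ for some choice of integers $0 \le e_i \le \alpha$, we have

$$\sum_{j=1}^{N} \frac{1}{j} \le \prod_{i=1}^{k} \left( 1 + \frac{1}{p_i} + \cdots + \frac{1}{p_i^{\alpha}} \right).$$

On the other hand, for each integer $i \in [1, k]$, we have by Theorem 6.7 that

$$1 + \frac{1}{p_i} + \cdots + \frac{1}{p_i^{\alpha}} \le \sum_{\ell=0}^{\infty} \left( \frac{1}{p_i} \right)^{\ell} = \frac{p_i}{p_i - 1}.$$

Consequently,

$$\sum_{j=1}^{N} \frac{1}{j} \le \left( \frac{p_1}{p_1 - 1} \right) \cdots \left( \frac{p_k}{p_k - 1} \right) =: M.$$

Taking the limit of this inequality as $N \to \infty$, we conclude that $\sum_{j=1}^{\infty} 1/j \le M < \infty$, a contradiction. ∎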
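The inequality that drives Euler's argument can be checked numerically. The following is a minimal sketch, not part of the proof: it uses the actual primes $p \le N$ in place of the hypothetical finite list, and the same reasoning gives $\sum_{j \le N} 1/j \le \prod_{p \le N} p/(p-1)$. The helper names `primes_up_to`, `harmonic`, and `euler_product` are illustrative assumptions.

```python
from fractions import Fraction


def primes_up_to(n):
    """Sieve of Eratosthenes: return all primes p <= n (assumes n >= 1)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for m in range(i * i, n + 1, i):
                sieve[m] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def harmonic(n):
    """Exact partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(Fraction(1, j) for j in range(1, n + 1))


def euler_product(primes):
    """Exact product of p / (p - 1) over the given primes."""
    prod = Fraction(1)
    for p in primes:
        prod *= Fraction(p, p - 1)
    return prod


for N in (10, 100, 1000):
    # The harmonic sum never exceeds the product over primes <= N.
    print(N, float(harmonic(N)), float(euler_product(primes_up_to(N))))
```

Since the harmonic sums grow without bound, the products must as well, which is impossible for a fixed finite list of primes.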
Our next application, a result used to approximate roots of twice-differentiable functions, shows that if an initial guess $x_0$ is close enough to a root of a suitably well-behaved function $f$, then the sequence $x_n$ generated by (19) converges to a root of $f$.

7.60 THEOREM [NEWTON]. Suppose that $f : [a, b] \to \mathbf{R}$ is continuous on $[a, b]$ and that $f(c) = 0$ for some $c \in (a, b)$. If $f''$ exists and is bounded on $(a, b)$ and there is an $\varepsilon_0 > 0$ such that $|f'(x)| \ge \varepsilon_0$ for all $x \in (a, b)$, then there is a closed interval $I \subseteq (a, b)$ containing $c$ such that given $x_0 \in I$, the sequence $\{x_n\}_{n \in \mathbf{N}}$ defined by

$$x_n = x_{n-1} - \frac{f(x_{n-1})}{f'(x_{n-1})}, \qquad n \in \mathbf{N}, \tag{19}$$

satisfies $x_n \in I$ and $x_n \to c$ as $n \to \infty$.

PROOF. Choose $M > 0$ such that $|f''(x)| \le M$ for $x \in (a, b)$. Choose $r_0 \in (0, 1)$ so small that $I = [c - r_0, c + r_0]$ is a subinterval of $(a, b)$ and $r_0 < \varepsilon_0 / M$. Suppose that $x_0 \in I$ and define the sequence $\{x_n\}$ by (19). Set $r := r_0 M / \varepsilon_0$ and observe by the choice of $r_0$ that $r < 1$. Thus it suffices to show that

$$|x_n - c| \le r^n |x_0 - c| \tag{20}$$

and

$$|x_n - c| \le r_0 \tag{21}$$

hold for all $n \in \mathbf{N}$. The proof is by induction on $n$. Clearly, (20) and (21) hold for $n = 0$. Fix $n \in \mathbf{N}$ and suppose that

$$|x_{n-1} - c| \le r^{n-1} |x_0 - c| \tag{22}$$

and

$$|x_{n-1} - c| \le r_0. \tag{23}$$

Use Taylor's Formula to choose a point $\xi$ between $c$ and $x_{n-1}$ such that

$$-f(x_{n-1}) = f(c) - f(x_{n-1}) = f'(x_{n-1})(c - x_{n-1}) + \frac{1}{2} f''(\xi)(c - x_{n-1})^2.$$

Since (19) implies that $-f(x_{n-1}) = f'(x_{n-1})(x_n - x_{n-1})$, it follows that

$$f'(x_{n-1})(x_n - c) = \frac{1}{2} f''(\xi)(c - x_{n-1})^2.$$

Solving this equation for $x_n - c$, we have by the choice of $M$ and $\varepsilon_0$ that

$$|x_n - c| = \left| \frac{f''(\xi)}{2 f'(x_{n-1})} \right| |x_{n-1} - c|^2 \le \frac{M}{2\varepsilon_0} |x_{n-1} - c|^2. \tag{24}$$

Since $M/\varepsilon_0 < 1/r_0$, it follows from (24) and (23) that

$$|x_n - c| \le \frac{M}{\varepsilon_0} |x_{n-1} - c|^2 \le \frac{1}{r_0} |r_0|^2 = r_0.$$

This proves (21). Again, by (24), (22), and the choice of $r$, we have

$$|x_n - c| \le \frac{M}{\varepsilon_0} \left( r^{n-1} |x_0 - c| \right)^2 \le \frac{M r_0}{\varepsilon_0}\, r^{2n-2} |x_0 - c| = r^{2n-1} |x_0 - c|.$$

Since $r < 1$ and $2n - 1 \ge n$ imply $r^{2n-1} \le r^n$, we conclude that $|x_n - c| \le r^{2n-1} |x_0 - c| \le r^n |x_0 - c|$. ∎
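The iteration (19) of Theorem 7.60 is short enough to sketch directly. The example below is only an illustration under stated assumptions: the function name `newton` and the test problem $f(x) = x^2 - 2$ on $[1, 2]$ are not from the text. On that interval $|f''| \le M = 2$ and $|f'| \ge \varepsilon_0 = 2$, so estimate (24) predicts $|x_n - c| \le \tfrac{1}{2} |x_{n-1} - c|^2$ with $c = \sqrt{2}$.

```python
import math


def newton(f, fprime, x0, steps):
    """Return [x_0, x_1, ..., x_steps] generated by iteration (19)."""
    xs = [x0]
    for _ in range(steps):
        x_prev = xs[-1]
        # x_n = x_{n-1} - f(x_{n-1}) / f'(x_{n-1})
        xs.append(x_prev - f(x_prev) / fprime(x_prev))
    return xs


# Hypothetical test problem (an assumption, not from the text):
# f(x) = x^2 - 2 with root c = sqrt(2) and initial guess x_0 = 1.5.
xs = newton(lambda x: x * x - 2, lambda x: 2 * x, 1.5, 4)
errors = [abs(x - math.sqrt(2)) for x in xs]
for n, err in enumerate(errors):
    print(n, err)
```

Printing the errors shows each one bounded by half the square of the previous one, until floating-point precision is reached.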
Notice that if $x_{n-1}$ and $x_n$ satisfy (19), then $x_n$ is the $x$-intercept of the tangent line to $y = f(x)$ at the point $(x_{n-1}, f(x_{n-1}))$ (see Exercise 4). Thus, Newton's method is based on a simple geometric principle (see Figure 7.7). Also notice that by (24), this method converges very rapidly. Indeed, the number of decimal places of accuracy nearly doubles with each successive approximation.

[Figure 7.7]

As a general rule, it is extremely difficult to show that a given nonalgebraic number is irrational. The next result shows how to use infinite series to give an easy proof that certain kinds of numbers are irrational.
7.61 THEOREM [EULER]. The number $e$ is irrational.

PROOF. Suppose to the contrary that $e = p/q$ for some $p, q \in \mathbf{N}$. By Example 7.47,

$$\frac{q}{p} = e^{-1} = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!}.$$

Breaking this sum into two pieces and multiplying by $(-1)^{p+1} p!$, we have

$$x := (-1)^{p+1} \left( q(p-1)! - \sum_{k=0}^{p} \frac{(-1)^k p!}{k!} \right) = y := \sum_{k=p+1}^{\infty} (-1)^{k+p+1} \frac{p!}{k!}.$$

Since $p!/k! \in \mathbf{N}$ for all integers $k \le p$, the number $x$ must be an integer. On the other hand,

$$y = \frac{1}{p+1} - \frac{1}{(p+1)(p+2)} + \frac{1}{(p+1)(p+2)(p+3)} - \cdots$$

lies between $1/(p+1)$ and $1/(p+1) - 1/((p+1)(p+2))$. Therefore, $y$ is a number that satisfies $0 < y < 1$. In particular, $x \ne y$, a contradiction. ∎
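The bounds on the alternating tail $y$ in the proof of Theorem 7.61 can be confirmed with exact rational arithmetic. This is a small sketch only; the function name `tail_partial` is an assumption, and it evaluates finite partial sums of $y$, not the infinite series itself.

```python
from fractions import Fraction
from math import factorial


def tail_partial(p, terms):
    """Partial sum of y = sum_{k > p} (-1)^(k+p+1) p!/k!, using `terms` terms."""
    return sum(
        Fraction((-1) ** (k + p + 1) * factorial(p), factorial(k))
        for k in range(p + 1, p + 1 + terms)
    )


for p in range(1, 6):
    upper = Fraction(1, p + 1)
    lower = upper - Fraction(1, (p + 1) * (p + 2))
    s = tail_partial(p, 8)
    # Every partial sum with at least two terms lies between the two bounds.
    print(p, lower <= s <= upper, float(s))
```

Because the terms alternate in sign and decrease in absolute value, every partial sum with at least two terms is trapped between the first two partial sums, which is why $0 < y < 1$.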
[Figure 7.8]

We know that a continuous function can fail to be differentiable at one point (e.g., $f(x) = |x|$). Hence, it is not difficult to see that given any finite set of points $E$, there is a continuous function that fails to be differentiable at every point in $E$. We shall now show that there is a continuous function that fails to be differentiable at any point in $\mathbf{R}$. Once again, here is a clear indication that although we use sketches to motivate proofs and to explain results, we cannot rely on sketches to give a complete picture of the general situation.
7.62 THEOREM [WEIERSTRASS]. There is a function $f$ continuous on $\mathbf{R}$ that is not differentiable at any point in $\mathbf{R}$. [Note: Such functions are called nowhere differentiable.]

PROOF. Let

$$f_0(x) = \begin{cases} x & 0 \le x < 1/2 \\ 1 - x & 1/2 \le x < 1 \end{cases}$$

and extend $f_0$ to $\mathbf{R}$ by periodicity of period 1, i.e., so that $f_0(x) = f_0(x + 1)$ for all $x \in \mathbf{R}$ (see Figure 7.8). Set $f_k(x) = f_0(2^k x)/2^k$ for $x \in \mathbf{R}$ and $k \in \mathbf{N}$ and consider the function

$$f(x) := \sum_{k=0}^{\infty} f_k(x), \qquad x \in \mathbf{R}.$$

Normalizing $f_k$ by $2^k$ has two consequences. First, since $f_0'(y) = \pm 1$ for each $y$ that satisfies $2y \notin \mathbf{Z}$, it is easy to see that

$$f_k'(y) = \pm 1 \quad \text{for each } y \text{ that satisfies } 2^{k+1} y \notin \mathbf{Z}. \tag{25}$$

Second, by the Weierstrass M-Test, $f$ converges uniformly, hence is continuous on $\mathbf{R}$.

Since $f$ is periodic of period 1, it suffices to show that $f$ is not differentiable at any $x \in [0, 1)$. Suppose to the contrary that $f$ is differentiable at some $x \in [0, 1)$. For each $n \in \mathbf{N}$, choose $p \in \mathbf{Z}$ such that $x \in [\alpha_n, \beta_n)$ for $\alpha_n = p/2^n$ and $\beta_n = (p + 1)/2^n$.

Since each $f_k$ is linear on $[\alpha_{k+1}, \beta_{k+1}]$ and $[\alpha_n, \beta_n] \subseteq [\alpha_{k+1}, \beta_{k+1}]$ for $n > k$, it is clear that

$$c_k := \frac{f_k(\beta_n) - f_k(\alpha_n)}{\beta_n - \alpha_n}$$

depends only on $k$ and not on $n$ when $n > k$. Moreover, by (25), it is also clear that each $c_k = \pm 1$. Therefore, $\sum_{k=0}^{\infty} c_k$ cannot be convergent. On the other hand, since $f$ is differentiable at $x$,

$$f'(x) = \lim_{n \to \infty} \frac{f(\beta_n) - f(\alpha_n)}{\beta_n - \alpha_n} \tag{26}$$

(see Exercise 7). However, since $f_0(y) = 0$ if and only if $y \in \mathbf{Z}$, we also have $f_k(\beta_n) = f_k(\alpha_n) = 0$ for $k \ge n$. It follows that $f(\beta_n) = \sum_{k=0}^{n-1} f_k(\beta_n)$ and $f(\alpha_n) = \sum_{k=0}^{n-1} f_k(\alpha_n)$. We conclude from (26) that

$$\sum_{k=0}^{\infty} c_k = \lim_{n \to \infty} \sum_{k=0}^{n-1} c_k = \lim_{n \to \infty} \frac{f(\beta_n) - f(\alpha_n)}{\beta_n - \alpha_n} = f'(x)$$

is convergent, a contradiction. ∎
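The mechanism in the proof of Theorem 7.62 can be observed with exact rational arithmetic. The sketch below is illustrative only; the names `f0`, `f_partial`, and `dyadic_quotient` are assumptions. It evaluates the difference quotient of $f$ over $[\alpha_n, \beta_n]$, which by the proof equals $c_0 + \cdots + c_{n-1}$, a sum of $n$ terms each equal to $\pm 1$.

```python
import math
from fractions import Fraction


def f0(x):
    """x on [0, 1/2), 1 - x on [1/2, 1), extended with period 1."""
    t = x - math.floor(x)
    return t if t < Fraction(1, 2) else 1 - t


def f_partial(x, terms):
    """Partial sum f_0(x) + ... + f_{terms-1}(x), where f_k(x) = f0(2^k x) / 2^k."""
    return sum(f0(2 ** k * x) / 2 ** k for k in range(terms))


def dyadic_quotient(x, n):
    """Difference quotient of f over [alpha_n, beta_n), the dyadic interval containing x."""
    p = math.floor(x * 2 ** n)
    alpha, beta = Fraction(p, 2 ** n), Fraction(p + 1, 2 ** n)
    # f_k vanishes at alpha_n and beta_n for k >= n, so n terms give f exactly there.
    return (f_partial(beta, n) - f_partial(alpha, n)) / (beta - alpha)


x = Fraction(3, 10)
for n in range(1, 11):
    print(n, dyadic_quotient(x, n))
```

Each quotient is an integer with the same parity as $n$, so consecutive quotients always differ by an odd integer and the sequence cannot converge.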
EXERCISES
1. Using a calculator and Theorem 7.60, approximate all real roots of $f(x) = x^3 + 3x^2 + 4x + 1$ to five decimal places.
2. (a) Using the proof of Theorem 7.60, prove that (20) holds if $r/2$ replaces $r$.
(b) Use part (a) to estimate the difference $|x_4 - \pi|$, where $x_0 = 3$, $f(x) = \sin x$, and $x_n$ is defined by (19). Evaluate $x_4$ directly, and verify that $x_4$ is actually closer than our theory predicts.
3. Prove that given any $n \in \mathbf{N}$, there is a function $f \in C^n(\mathbf{R})$ such that $f^{(n+1)}(x)$ does not exist for any $x \in \mathbf{R}$.
4. Prove that if $x_{n-1}, x_n$ satisfy (19), then $x_n$ is the $x$-intercept of the tangent line to $y = f(x)$ at the point $(x_{n-1}, f(x_{n-1}))$.
5. Prove that $\cos(1)$ is irrational.
6. Suppose that $f : \mathbf{R} \to \mathbf{R}$. If $f''$ exists and is bounded on $\mathbf{R}$, and there is an $\varepsilon_0 > 0$ such that $|f'(x)| \ge \varepsilon_0$ for all $x \in \mathbf{R}$, prove that there exists a $\delta > 0$ such that if $|f(x_0)| \le \delta$ for some $x_0 \in \mathbf{R}$, then $f$ has a root, i.e., that $f(c) = 0$ for some $c \in \mathbf{R}$.
7. Let $x \in [0, 1)$ and $\alpha_n, \beta_n$ be defined as in Theorem 7.62.
(a) If $f : [0, 1) \to \mathbf{R}$ and $\gamma \in \mathbf{R}$, prove that

$$\frac{f(\beta_n) - f(\alpha_n)}{\beta_n - \alpha_n} - \gamma = \left( \frac{f(\beta_n) - f(x)}{\beta_n - x} - \gamma \right) \left( \frac{\beta_n - x}{\beta_n - \alpha_n} \right) + \left( \frac{f(x) - f(\alpha_n)}{x - \alpha_n} - \gamma \right) \left( \frac{x - \alpha_n}{\beta_n - \alpha_n} \right).$$

(b) If $f$ is differentiable at $x$, prove that (26) holds.
Chapter 8
EUCLIDEAN SPACES
The world we live in is at least four-dimensional: three spatial dimensions together with the time dimension. Moreover, certain problems from engineering, physics, chemistry, and economics force us to consider even higher dimensions. For example, guidance systems for missiles frequently require as many as 100 variables (longitude, latitude, altitude, velocity, time after launch, pitch, yaw, fuel on board, etc.). Another example, the state of a gas in a closed container, can best be described by a function of 6m variables, where m is the number of molecules in the system. (Six enters the picture because each molecule of gas is described by three space variables and three momentum variables.) Thus, there are practical reasons for studying functions of more than one variable.
8.1 ALGEBRAIC STRUCTURE For each $n \in \mathbf{N}$, let $\mathbf{R}^n$ denote the $n$-fold Cartesian product of $\mathbf{R}$ with itself; i.e.,

$$\mathbf{R}^n := \{(x_1, x_2, \dots, x_n) : x_j \in \mathbf{R} \text{ for } j = 1, 2, \dots, n\}.$$

By a Euclidean space we shall mean $\mathbf{R}^n$ together with the "Euclidean inner product" defined in Definition 8.1 below. The integer $n$ is called the dimension of $\mathbf{R}^n$, elements $\mathbf{x} = (x_1, x_2, \dots, x_n)$ of $\mathbf{R}^n$ are called points or vectors or ordered $n$-tuples, and the numbers $x_j$ are called coordinates, or components, of $\mathbf{x}$. Two vectors $\mathbf{x}, \mathbf{y}$ are said to be equal if and only if their components are equal; i.e., $x_j = y_j$ for $j = 1, 2, \dots, n$. The zero vector is the vector whose components are all zero; i.e., $\mathbf{0} := (0, 0, \dots, 0)$. When $n = 2$ (respectively, $n = 3$), we usually denote the components of $\mathbf{x}$ by $x, y$ (respectively, by $x, y, z$).

We have already encountered the sets $\mathbf{R}^n$ for small $n$. $\mathbf{R}^1 = \mathbf{R}$ is the real line; we shall call its elements scalars. $\mathbf{R}^2$ is the $xy$ plane used to graph functions of the form $y = f(x)$. And $\mathbf{R}^3$ is the $xyz$ space used to graph functions of the form $z = f(x, y)$.
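We began our study of functions of one variable by examining the algebraic structure of $\mathbf{R}$. In this section we begin our study of functions of several variables by examining the algebraic structure of $\mathbf{R}^n$. That structure is described in the following definition.

8.1 DEFINITION. Let $\mathbf{x} = (x_1, \dots, x_n)$, $\mathbf{y} = (y_1, \dots, y_n) \in \mathbf{R}^n$ be vectors and $\alpha \in \mathbf{R}$ be a scalar.
(i) The sum of $\mathbf{x}$ and $\mathbf{y}$ is the vector
$$\mathbf{x} + \mathbf{y} := (x_1 + y_1, x_2 + y_2, \dots, x_n + y_n).$$
(ii) The difference of $\mathbf{x}$ and $\mathbf{y}$ is the vector
$$\mathbf{x} - \mathbf{y} := (x_1 - y_1, x_2 - y_2, \dots, x_n - y_n).$$
(iii) The product of a scalar $\alpha$ and a vector $\mathbf{x}$ is the vector
$$\alpha \mathbf{x} := (\alpha x_1, \alpha x_2, \dots, \alpha x_n).$$
(iv) The (Euclidean) dot product (or scalar product or inner product) of $\mathbf{x}$ and $\mathbf{y}$ is the scalar
$$\mathbf{x} \cdot \mathbf{y} := x_1 y_1 + x_2 y_2 + \cdots + x_n y_n.$$

These algebraic operations are analogues of addition, subtraction, and multiplication on $\mathbf{R}$. It is natural to ask: Do the usual laws of algebra hold in $\mathbf{R}^n$? An answer to this question is contained in the following result.

8.2 THEOREM. Let $\mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathbf{R}^n$ and $\alpha, \beta \in \mathbf{R}$. Then $\alpha \mathbf{0} = \mathbf{0}$, $0\mathbf{x} = \mathbf{0}$, $1\mathbf{x} = \mathbf{x}$, $\alpha(\beta \mathbf{x}) = \beta(\alpha \mathbf{x}) = (\alpha\beta)\mathbf{x}$, $\alpha(\mathbf{x} \cdot \mathbf{y}) = (\alpha \mathbf{x}) \cdot \mathbf{y} = \mathbf{x} \cdot (\alpha \mathbf{y})$, $\alpha(\mathbf{x} + \mathbf{y}) = \alpha \mathbf{x} + \alpha \mathbf{y}$, $\mathbf{0} + \mathbf{x} = \mathbf{x}$, $\mathbf{x} - \mathbf{x} = \mathbf{0}$, $\mathbf{0} \cdot \mathbf{x} = 0$, $\mathbf{x} + (\mathbf{y} + \mathbf{z}) = (\mathbf{x} + \mathbf{y}) + \mathbf{z}$, $\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}$, $\mathbf{x} \cdot \mathbf{y} = \mathbf{y} \cdot \mathbf{x}$, and $\mathbf{x} \cdot (\mathbf{y} + \mathbf{z}) = \mathbf{x} \cdot \mathbf{y} + \mathbf{x} \cdot \mathbf{z}$.

PROOF. These properties are direct consequences of Definition 8.1 and corresponding properties of real numbers. We will prove that vector addition is associative, and leave the proof of the rest of these properties as an exercise. By definition and associativity of addition on $\mathbf{R}$ (see Postulate 1 in Section 1.1),

$$\begin{aligned} \mathbf{x} + (\mathbf{y} + \mathbf{z}) &= (x_1, \dots, x_n) + (y_1 + z_1, \dots, y_n + z_n) \\ &= (x_1 + (y_1 + z_1), \dots, x_n + (y_n + z_n)) \\ &= ((x_1 + y_1) + z_1, \dots, (x_n + y_n) + z_n) = (\mathbf{x} + \mathbf{y}) + \mathbf{z}. \end{aligned}$$ ∎

Thus (with the exception of the closure of the dot product and the existence of the multiplicative identity and multiplicative inverses), $\mathbf{R}^n$ satisfies the same algebraic laws, listed in Postulate 1, that $\mathbf{R}$ does. This means that one can use instincts developed in high school algebra to compute with these vector operations. For example, just as $(x - y)^2 = x^2 - 2xy + y^2$ holds for real numbers $x$ and $y$,

$$(\mathbf{x} - \mathbf{y}) \cdot (\mathbf{x} - \mathbf{y}) = \mathbf{x} \cdot \mathbf{x} - 2\,\mathbf{x} \cdot \mathbf{y} + \mathbf{y} \cdot \mathbf{y} \tag{1}$$

holds for any vectors $\mathbf{x}, \mathbf{y} \in \mathbf{R}^n$. The algebra of $\mathbf{R}^n$ can be used to define what it means for two vectors to be parallel or orthogonal.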
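The operations of Definition 8.1 translate directly into code. The following is a minimal sketch, assuming vectors are represented as Python tuples of equal length; the function names `add`, `sub`, `scale`, and `dot` are illustrative choices, not notation from the text.

```python
def add(x, y):
    """Componentwise sum x + y, Definition 8.1(i)."""
    return tuple(a + b for a, b in zip(x, y))


def sub(x, y):
    """Componentwise difference x - y, Definition 8.1(ii)."""
    return tuple(a - b for a, b in zip(x, y))


def scale(alpha, x):
    """Scalar multiple alpha * x, Definition 8.1(iii)."""
    return tuple(alpha * a for a in x)


def dot(x, y):
    """Dot product x . y, Definition 8.1(iv)."""
    return sum(a * b for a, b in zip(x, y))


# Arbitrary sample vectors (assumptions chosen for illustration).
x, y, z = (1, 2, 3), (4, -1, 2), (0, 5, -2)

# Identity (1): (x - y) . (x - y) = x . x - 2 x . y + y . y
lhs = dot(sub(x, y), sub(x, y))
rhs = dot(x, x) - 2 * dot(x, y) + dot(y, y)
print(lhs, rhs)
```

With integer components the arithmetic is exact, so identity (1) and the laws in Theorem 8.2 can be compared with `==`.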
8.3 DEFINITION. Let $\mathbf{a}$ and $\mathbf{b}$ be nonzero vectors in $\mathbf{R}^n$.
(i) $\mathbf{a}$ and $\mathbf{b}$ are said to be parallel if and only if there is a scalar $t \in \mathbf{R}$ such that $\mathbf{a} = t\mathbf{b}$.
(ii) $\mathbf{a}$ and $\mathbf{b}$ are said to be orthogonal if and only if $\mathbf{a} \cdot \mathbf{b} = 0$.

In earlier courses, vectors were (most likely) directed line segments, but our vectors look like points in $\mathbf{R}^n$. What is going on? When we call an $\mathbf{a} \in \mathbf{R}^n$ a vector, we are thinking of the directed line segment that starts at the origin and ends at the point $\mathbf{a}$. (For example, verify that the vectors $\mathbf{a} = (3, 5)$, $\mathbf{b} = (-6, -10)$ are parallel and that the vectors $\mathbf{c} = (1, 1)$, $\mathbf{d} = (1, -1)$ are orthogonal, both in the sense of Definition 8.3 and in the usual sense, i.e., graph them as directed line segments emanating from the origin, and use your innate geometric reasoning.)

What about directed line segments that begin at arbitrary points? Two arbitrary directed line segments are said to be equivalent if and only if they have the same length and same direction. Thus every directed line segment $V$ is equivalent to a directed line segment in standard position, i.e., one that points in the same direction as $V$, has the same length as $V$, but whose "tail" sits at the origin and whose "head," $\mathbf{a}$, is a point in $\mathbf{R}^n$. If we identify $V$ with $\mathbf{a}$, then we can represent every arbitrary directed line segment in $\mathbf{R}^n$ by a point in $\mathbf{R}^n$.

In general, we make no distinction between points and vectors, but in each situation we adopt the interpretation that proves most useful. Identifying arbitrary vectors in $\mathbf{R}^n$ with vectors in standard position and, in turn, with points in $\mathbf{R}^n$ may sound confusing and sloppy, but it is no different from letting 1/2 represent 2/4, 3/6, 4/8, etc. (In both cases, there is an underlying equivalence relation, and we are using one member of an equivalence class to represent all of its members. For vectors, we are using the representative that lies in standard position; for rationals, we are using the representative that is in reduced form.)
In the first four chapters, we used algebra together with the absolute value to define convergence of sequences and functions in $\mathbf{R}$. Is there an analogue of the absolute value for $\mathbf{R}^n$? The following definition illustrates the fact that there are many such analogues.

8.4 DEFINITION. Let $\mathbf{x} \in \mathbf{R}^n$.
(i) The (Euclidean) norm (or magnitude) of $\mathbf{x}$ is the scalar
$$\|\mathbf{x}\| := \sqrt{\sum_{k=1}^{n} |x_k|^2}.$$
(ii) The $\ell^1$-norm (read L-one-norm) of $\mathbf{x}$ is the scalar
$$\|\mathbf{x}\|_1 := \sum_{k=1}^{n} |x_k|.$$
(iii) The sup-norm of $\mathbf{x}$ is the scalar
$$\|\mathbf{x}\|_\infty := \max\{|x_1|, \dots, |x_n|\}.$$

(Note: For relationships between these three norms, see Remark 8.7. The subscript $\infty$ is frequently used for supremum norms because the supremum of a continuous function on an interval $[a, b]$ can be computed by taking the limit of $\left( \int_a^b |f(x)|^p\,dx \right)^{1/p}$ as $p \to \infty$; see Exercise 8, p. 126.)

[Figure 8.1]
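The three norms of Definition 8.4 are easy to compute and compare. The sketch below is illustrative only; the function names `norm`, `norm1`, and `norm_sup` and the sample vector are assumptions. It also checks the relationships stated in Remark 8.7 on that one example.

```python
import math


def norm(x):
    """Euclidean norm: square root of the sum of squared components."""
    return math.sqrt(sum(c * c for c in x))


def norm1(x):
    """l^1-norm: sum of the absolute values of the components."""
    return sum(abs(c) for c in x)


def norm_sup(x):
    """Sup-norm: largest absolute value among the components."""
    return max(abs(c) for c in x)


x = (3, -4)  # arbitrary sample vector
n = len(x)
print(norm(x), norm1(x), norm_sup(x))
# Relationships from Remark 8.7, checked on this one vector:
print(norm_sup(x) <= norm(x) <= math.sqrt(n) * norm_sup(x), norm(x) <= norm1(x))
```

When $n = 1$ all three functions reduce to the absolute value, in line with the observation that each norm extends $|x|$ from $\mathbf{R}$ to $\mathbf{R}^n$.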
Since $\|x\| = \|x\|_1 = \|x\|_\infty = |x|$ when $n = 1$, each norm defined above is an extension of the absolute value from $\mathbf{R}$ to $\mathbf{R}^n$. The most important, and in some senses the most natural, of these norms is the Euclidean norm. This is true for at least two reasons. First, by definition,

$$\|\mathbf{x}\|^2 = \mathbf{x} \cdot \mathbf{x}.$$

(This aids in many calculations; see, for example, the proofs of Theorems 8.5 and 8.6.) Second, if $\Delta$ is the triangle in $\mathbf{R}^2$ with vertices $(0, 0)$, $\mathbf{x} := (a, b)$, and $(a, 0)$, then by the Pythagorean Theorem, the hypotenuse of $\Delta$, $\sqrt{a^2 + b^2}$, is exactly the norm of $\mathbf{x}$. Hence we define the (Euclidean) distance between two points $\mathbf{a}, \mathbf{b} \in \mathbf{R}^n$ by

$$\operatorname{dist}(\mathbf{a}, \mathbf{b}) := \|\mathbf{a} - \mathbf{b}\|.$$

Thus the Euclidean norm of a vector has a simple geometric interpretation.

The algebraic structure of $\mathbf{R}^n$ also has a simple geometric interpretation in $\mathbf{R}^2$ and $\mathbf{R}^3$ that gives us another very useful way to think about vectors. Scalar multiplication stretches or compresses a vector $\mathbf{a}$ but leaves it in the same straight line which passes through $\mathbf{0}$ and $\mathbf{a}$. Indeed, if $\mathbf{a} = (a_1, a_2)$ and $t > 0$, then $t\mathbf{a} = (ta_1, ta_2)$ has the same direction as $\mathbf{a}$, but its magnitude is $\ge$ or $<$ the magnitude of $\mathbf{a}$, depending on whether $t \ge 1$ or $t < 1$. When $t$ is negative, $t\mathbf{a}$ points in the opposite direction from $\mathbf{a}$ but is again stretched or compressed, depending on the size of $|t|$.

To interpret the sum of two vectors, fix a pair of nonparallel vectors $\mathbf{a}, \mathbf{b} \in \mathbf{R}^2$, and let $\mathcal{P}(\mathbf{a}, \mathbf{b})$ denote the parallelogram associated with $\mathbf{a}$ and $\mathbf{b}$; i.e., the parallelogram whose sides are given by $\mathbf{a}$ and $\mathbf{b}$ (see Figure 8.1). Notice that if $\mathbf{a} = (a_1, a_2)$ and $\mathbf{b} = (b_1, b_2)$, then by definition the vector sum $\mathbf{a} + \mathbf{b} = (a_1 + b_1, a_2 + b_2)$ is the diagonal of $\mathcal{P}(\mathbf{a}, \mathbf{b})$, i.e., $\mathbf{a} + \mathbf{b}$ is the vector that begins at the origin and ends at the opposite vertex of $\mathcal{P}(\mathbf{a}, \mathbf{b})$. Similarly, the difference $\mathbf{a} - \mathbf{b}$ can be identified with the other diagonal of $\mathcal{P}(\mathbf{a}, \mathbf{b})$ (see Figure 8.1).
The geometry of $\mathbf{R}^2$ can be used to extend concepts from $\mathbf{R}^2$ to $\mathbf{R}^n$. Here are several examples. Let $\mathbf{a}$ and $\mathbf{b}$ be nonzero vectors in $\mathbf{R}^2$ and plot some values of $\phi(t) := \mathbf{a} + t\mathbf{b}$. First, $\phi(0) = \mathbf{a}$ and $\phi(1) = \mathbf{a} + \mathbf{b}$. Next, if $0 < t < 1$, then $t\mathbf{b}$ is a vector that points in the same direction as $\mathbf{b}$ but has smaller magnitude. Hence, the resulting sum, $\phi(t)$, will be the vertex opposite the origin of the smaller parallelogram $\mathcal{P}(\mathbf{a}, t\mathbf{b})$. As $t$ ranges from 0 to 1, the vertices of these parallelograms $\mathcal{P}(\mathbf{a}, t\mathbf{b})$ will trace out the edge of $\mathcal{P}(\mathbf{a}, \mathbf{b})$ opposite and parallel to the vector $\mathbf{b}$ (see Figure 8.1). Hence $\phi(t)$ traces out the line segment from $\mathbf{a}$ to $\mathbf{a} + \mathbf{b}$ as $t$ ranges from 0 to 1. In fact, as $t$ ranges over all of $\mathbf{R}$, the image of $\phi(t)$ traces out the line in $\mathbf{R}^2$ parallel to $\mathbf{b}$ that passes through the point $\mathbf{a}$ (see also Exercise 3). For this reason, for $\mathbf{a}, \mathbf{b} \in \mathbf{R}^n$, $\mathbf{b} \ne \mathbf{0}$ and $n$ arbitrary this time, we define the straight line in $\mathbf{R}^n$ which passes through $\mathbf{a}$ in the direction $\mathbf{b}$ to be the set of points

$$\ell_{\mathbf{a}}(\mathbf{b}) := \{\mathbf{a} + t\mathbf{b} : t \in \mathbf{R}\}.$$

Similarly, we define the line segment from $\mathbf{a}$ to $\mathbf{b}$ to be the set of points

$$L(\mathbf{a}; \mathbf{b}) := \{(1 - t)\mathbf{a} + t\mathbf{b} : t \in [0, 1]\}.$$

In particular, it is easy to see that the entire parallelogram (perimeter and the region it surrounds) determined by $\mathbf{a}$ and $\mathbf{b}$ can be described by using the scalar product and vector sum as follows:

$$\mathcal{P}(\mathbf{a}, \mathbf{b}) := \{\psi(u, v) := u\mathbf{a} + v\mathbf{b} : u, v \in [0, 1]\}.$$
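We shall see below that in addition to suggesting definitions that work for $\mathbf{R}^n$, the geometry of $\mathbf{R}^2$ also can be used to help construct proofs in $\mathbf{R}^n$. The next two results answer the question: How many properties do the absolute value and the Euclidean norm share? Although the norm is not multiplicative, the following fundamental inequality can be used as a replacement for the multiplicative property in most proofs. (Some authors call this the Cauchy-Schwarz-Bunyakovsky Inequality.)

8.5 THEOREM [CAUCHY-SCHWARZ INEQUALITY]. If $\mathbf{x}, \mathbf{y} \in \mathbf{R}^n$, then

$$|\mathbf{x} \cdot \mathbf{y}| \le \|\mathbf{x}\|\,\|\mathbf{y}\|.$$

STRATEGY. Using the fact that the dot product of a vector with itself is the square of the norm of the vector and the square of any real number is nonnegative, identity (1) becomes $0 \le \|\mathbf{x} - \mathbf{y}\|^2 = \|\mathbf{x}\|^2 - 2\,\mathbf{x} \cdot \mathbf{y} + \|\mathbf{y}\|^2$. We could solve this inequality to get an estimate of the dot product $\mathbf{x} \cdot \mathbf{y}$. This estimate could be very crude if the magnitude of $\mathbf{x}$ were much larger than that of $\mathbf{y}$, for then the norm of $\mathbf{x} - \mathbf{y}$ would be much larger than zero. But $\mathbf{x} - \mathbf{y}$ is only one point on the line $\ell_{\mathbf{x}}(-\mathbf{y})$. We might get a better estimate of the dot product $\mathbf{x} \cdot \mathbf{y}$ by using the inequality

$$0 \le \|\mathbf{x} - t\mathbf{y}\|^2 = (\mathbf{x} - t\mathbf{y}) \cdot (\mathbf{x} - t\mathbf{y}) = \|\mathbf{x}\|^2 - 2t(\mathbf{x} \cdot \mathbf{y}) + t^2 \|\mathbf{y}\|^2 \tag{2}$$

for other values of $t$. In fact, if we draw a picture in $\mathbf{R}^2$ (see Figure 8.2), we see that the norm of $\mathbf{x} - t\mathbf{y}$ is smallest for the value of $t$ that makes $\mathbf{x} - t\mathbf{y}$ orthogonal to $\mathbf{y}$, i.e., when $0 = (\mathbf{x} - t\mathbf{y}) \cdot \mathbf{y} = \mathbf{x} \cdot \mathbf{y} - t\,\mathbf{y} \cdot \mathbf{y} = \mathbf{x} \cdot \mathbf{y} - t\|\mathbf{y}\|^2$. This suggests using $t = \mathbf{x} \cdot \mathbf{y} / \|\mathbf{y}\|^2$ when $\mathbf{y} \ne \mathbf{0}$. It turns out that this value of $t$ is exactly the one that reproduces the Cauchy-Schwarz Inequality. Here are the details.

[Figure 8.2]

PROOF. The Cauchy-Schwarz Inequality is trivial when $\mathbf{y} = \mathbf{0}$. If $\mathbf{y} \ne \mathbf{0}$, substitute $t = (\mathbf{x} \cdot \mathbf{y}) / \|\mathbf{y}\|^2$ into (2), to obtain

$$0 \le \|\mathbf{x}\|^2 - t(\mathbf{x} \cdot \mathbf{y}) = \|\mathbf{x}\|^2 - \frac{(\mathbf{x} \cdot \mathbf{y})^2}{\|\mathbf{y}\|^2}.$$

It follows that $0 \le \|\mathbf{x}\|^2 - (\mathbf{x} \cdot \mathbf{y})^2 / \|\mathbf{y}\|^2$. Solving this inequality for $(\mathbf{x} \cdot \mathbf{y})^2$, we conclude that

$$(\mathbf{x} \cdot \mathbf{y})^2 \le \|\mathbf{x}\|^2 \|\mathbf{y}\|^2.$$ ∎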
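The choice of $t$ in the proof of the Cauchy-Schwarz Inequality can be examined on a concrete pair of vectors. This is a sketch under stated assumptions: the vectors are arbitrary integer examples, and `dot`, `squared_distance`, and `best_t` are hypothetical helper names. Exact fractions are used so that the minimum value of (2) can be compared with `==`.

```python
from fractions import Fraction


def dot(x, y):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(x, y))


def squared_distance(x, y, t):
    """||x - t y||^2, the quantity estimated in inequality (2)."""
    d = [a - t * b for a, b in zip(x, y)]
    return dot(d, d)


def best_t(x, y):
    """t = (x . y) / ||y||^2; requires integer components and y != 0."""
    return Fraction(dot(x, y), dot(y, y))


x, y = (1, 2, 3), (4, -1, 2)  # arbitrary sample vectors
t = best_t(x, y)
minimum = squared_distance(x, y, t)
# At the optimal t the value of (2) is ||x||^2 - (x . y)^2 / ||y||^2 >= 0.
print(t, minimum, dot(x, x) - Fraction(dot(x, y) ** 2, dot(y, y)))
```

Nonnegativity of that minimum value is exactly the statement $(\mathbf{x} \cdot \mathbf{y})^2 \le \|\mathbf{x}\|^2 \|\mathbf{y}\|^2$ for this pair.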
The analogy between the absolute value and the Euclidean norm is further reinforced by the following result (compare with Theorem 1.7). (See also Exercise 9.)

8.6 THEOREM. Let $\mathbf{x}, \mathbf{y} \in \mathbf{R}^n$. Then
(i) $\|\mathbf{x}\| \ge 0$ with equality only when $\mathbf{x} = \mathbf{0}$.
(ii) $\|\alpha \mathbf{x}\| = |\alpha|\,\|\mathbf{x}\|$ for all scalars $\alpha$.
(iii) [TRIANGLE INEQUALITIES]. $\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|$ and $\|\mathbf{x} - \mathbf{y}\| \ge \|\mathbf{x}\| - \|\mathbf{y}\|$.

PROOF. Statements (i) and (ii) are obvious. To prove (iii), observe that by Definition 8.4, Theorem 8.2, and the Cauchy-Schwarz Inequality,

$$\begin{aligned} \|\mathbf{x} + \mathbf{y}\|^2 &= (\mathbf{x} + \mathbf{y}) \cdot (\mathbf{x} + \mathbf{y}) = \mathbf{x} \cdot \mathbf{x} + 2\,\mathbf{x} \cdot \mathbf{y} + \mathbf{y} \cdot \mathbf{y} \\ &= \|\mathbf{x}\|^2 + 2\,\mathbf{x} \cdot \mathbf{y} + \|\mathbf{y}\|^2 \le \|\mathbf{x}\|^2 + 2\|\mathbf{x}\|\,\|\mathbf{y}\| + \|\mathbf{y}\|^2 = (\|\mathbf{x}\| + \|\mathbf{y}\|)^2. \end{aligned}$$

This establishes the first inequality in (iii). By modifying the proof of Theorem 1.7, we can also establish the second inequality in (iii). ∎

Since $\|\mathbf{x}\|$ is the magnitude of the vector $\mathbf{x}$, the triangle inequality has a simple geometric interpretation: $\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|$ states that the length of one side of a triangle (namely, the triangle whose vertices are $\mathbf{0}$, $\mathbf{x}$, and $\mathbf{x} + \mathbf{y}$) is less than or equal to the sum of the lengths of its other two sides.

For some estimates, it is convenient to relate the Euclidean norm to the $\ell^1$-norm and the sup-norm.
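8.7 Remark. Let $\mathbf{x} \in \mathbf{R}^n$. Then
(i) $|x_j| \le \|\mathbf{x}\| \le \sqrt{n}\,\|\mathbf{x}\|_\infty$ for each $j = 1, 2, \dots, n$, and
(ii) $\|\mathbf{x}\| \le \|\mathbf{x}\|_1$.

PROOF. (i) Let $1 \le j \le n$. By definition,

$$|x_j|^2 \le |x_1|^2 + \cdots + |x_n|^2 = \|\mathbf{x}\|^2 \le n \max_{1 \le k \le n} |x_k|^2 = n \|\mathbf{x}\|_\infty^2.$$

(ii) Observe that

$$(|x_1| + \cdots + |x_n|)^2 = |x_1|^2 + \cdots + |x_n|^2 + 2 \sum_{(i,j) \in A} |x_i|\,|x_j|,$$

where $A = \{(i, j) : 1 \le i, j \le n \text{ and } i < j\}$. Since $\sum_{(i,j) \in A} |x_i|\,|x_j| \ge 0$, we conclude that

$$\|\mathbf{x}\|^2 = |x_1|^2 + \cdots + |x_n|^2 \le (|x_1| + \cdots + |x_n|)^2 = \|\mathbf{x}\|_1^2.$$ ∎

The geometry of $\mathbf{R}^2$ can be used to introduce a concept of "angle between" two vectors. Let $\mathbf{a}, \mathbf{b} \in \mathbf{R}^2 \setminus \{(0, 0)\}$ and suppose that $\Delta$ is the triangle determined by the points $\mathbf{0}$, $\mathbf{a}$, and $\mathbf{b}$. The sides of this triangle have length $\|\mathbf{a}\|$, $\|\mathbf{b}\|$, and $\|\mathbf{a} - \mathbf{b}\|$. If we let $\theta$ be the angle between $\mathbf{a}$ and $\mathbf{b}$, i.e., the angle in $\Delta$ at the vertex $(0, 0)$, then by the Law of Cosines (see Appendix B),

$$\|\mathbf{a} - \mathbf{b}\|^2 = \|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 - 2\|\mathbf{a}\|\,\|\mathbf{b}\| \cos\theta.$$

Since Theorem 8.2 implies $\|\mathbf{a} - \mathbf{b}\|^2 = (\mathbf{a} - \mathbf{b}) \cdot (\mathbf{a} - \mathbf{b}) = \|\mathbf{a}\|^2 - 2\,\mathbf{a} \cdot \mathbf{b} + \|\mathbf{b}\|^2$, it follows that $-2\,\mathbf{a} \cdot \mathbf{b} = -2\|\mathbf{a}\|\,\|\mathbf{b}\| \cos\theta$. Since neither $\mathbf{a}$ nor $\mathbf{b}$ is zero, we conclude that

$$\cos\theta = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|\,\|\mathbf{b}\|}. \tag{3}$$

Using this $\mathbf{R}^2$ result, we DEFINE the angle between two nonzero vectors $\mathbf{a}, \mathbf{b} \in \mathbf{R}^n$ (for any $n \in \mathbf{N}$) to be the number $\theta \in [0, \pi]$ determined by (3). Notice that by the Cauchy-Schwarz Inequality, the right side of (3) always belongs to the interval $[-1, 1]$. Hence, for each pair of nonzero vectors $\mathbf{a}, \mathbf{b} \in \mathbf{R}^n$, there is a unique angle $\theta \in [0, \pi]$ that satisfies (3).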
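Formula (3) gives a direct way to compute the angle between two nonzero vectors. The sketch below is illustrative; the names `dot`, `norm`, and `angle` are assumptions. In floating-point arithmetic the quotient in (3) can land slightly outside $[-1, 1]$ through rounding, so the sketch clamps it before calling `math.acos`.

```python
import math


def dot(a, b):
    """Dot product of two vectors given as tuples."""
    return sum(p * q for p, q in zip(a, b))


def norm(a):
    """Euclidean norm."""
    return math.sqrt(dot(a, a))


def angle(a, b):
    """Angle theta in [0, pi] determined by (3); a and b must be nonzero."""
    c = dot(a, b) / (norm(a) * norm(b))
    c = max(-1.0, min(1.0, c))  # guard against rounding just outside [-1, 1]
    return math.acos(c)


# The pairs suggested after Definition 8.3.
print(angle((3, 5), (-6, -10)))  # close to pi: parallel, opposite directions
print(angle((1, 1), (1, -1)))    # close to pi/2: orthogonal
```

The two printed values illustrate the geometric meaning of parallel and orthogonal for the sample pairs given after Definition 8.3.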
This definition is consistent with Definition 8.3 (see Exercise 7 below). Indeed, if $\theta$ is the angle between two nonzero vectors $\mathbf{a}$ and $\mathbf{b}$ in $\mathbf{R}^n$, then $\mathbf{a}$ and $\mathbf{b}$ are parallel if and only if $\theta = 0$ or $\theta = \pi$, and $\mathbf{a}$ and $\mathbf{b}$ are orthogonal if and only if $\theta = \pi/2$.

We define the usual basis of $\mathbf{R}^n$ to be the collection $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$, where $\mathbf{e}_j$ is the point in $\mathbf{R}^n$ whose $j$th coordinate is 1, and all other coordinates are 0. By definition, then, each $\mathbf{x} = (x_1, \dots, x_n) \in \mathbf{R}^n$ can be written as a linear combination of the $\mathbf{e}_j$'s:

$$\mathbf{x} = \sum_{j=1}^{n} x_j \mathbf{e}_j.$$

Notice that the usual basis $\{\mathbf{e}_j\}$ consists of pairwise orthogonal vectors; i.e., $\mathbf{e}_j \cdot \mathbf{e}_k = 0$ when $j \ne k$. In particular, the usual basis is an orthogonal basis. In $\mathbf{R}^2$ or $\mathbf{R}^3$, $\mathbf{e}_1$ is denoted by $\mathbf{i}$, $\mathbf{e}_2$ is denoted by $\mathbf{j}$, and, in $\mathbf{R}^3$, $\mathbf{e}_3$ is denoted by $\mathbf{k}$. Thus, in $\mathbf{R}^3$, $\mathbf{i} := (1, 0, 0)$, $\mathbf{j} := (0, 1, 0)$, and $\mathbf{k} := (0, 0, 1)$. In particular, vectors in $\mathbf{R}^2$ have the form $x\mathbf{i} + y\mathbf{j}$, and vectors in $\mathbf{R}^3$ have the form $x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$.

We shall not discuss other bases of $\mathbf{R}^n$ or the more general concept of "vector spaces," which can be introduced using postulates similar in spirit to Postulate 1 in Chapter 1. Instead, we have introduced just enough algebraic machinery in $\mathbf{R}^n$ to develop the calculus of multivariable functions. For more information about $\mathbf{R}^n$ and abstract vector spaces, see Noble and Daniel [9].

Since $\mathbf{x} \cdot \mathbf{y}$ is a scalar, the dot product in $\mathbf{R}^n$ does not satisfy the closure property for any $n > 1$. Here is another product, defined only on $\mathbf{R}^3$, that does satisfy the closure property. (As we shall see below, this product allows us to exploit the geometry of $\mathbf{R}^3$ in several unique ways.)

8.8 DEFINITION. The cross product of two vectors $\mathbf{x} = (x_1, x_2, x_3)$ and $\mathbf{y} = (y_1, y_2, y_3)$ in $\mathbf{R}^3$ is the vector defined by

$$\mathbf{x} \times \mathbf{y} := (x_2 y_3 - x_3 y_2,\; x_3 y_1 - x_1 y_3,\; x_1 y_2 - x_2 y_1).$$

Using the usual basis $\mathbf{i} = \mathbf{e}_1$, $\mathbf{j} = \mathbf{e}_2$, $\mathbf{k} = \mathbf{e}_3$, and the determinant operator (see Appendix C), we can give the cross product a more easily remembered form:

$$\mathbf{x} \times \mathbf{y} = \det \begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{bmatrix}.$$
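The following result shows that the cross product satisfies some, but not all, of the usual laws of algebra. (Specifically, notice that the cross product satisfies neither the commutative property nor the associative property.)

8.9 THEOREM. Let $\mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathbf{R}^3$ be vectors and $\alpha$ be a scalar. Then

(i) $\mathbf{x} \times \mathbf{x} = \mathbf{0}$, $\quad \mathbf{x} \times \mathbf{y} = -\mathbf{y} \times \mathbf{x}$,
(ii) $(\alpha \mathbf{x}) \times \mathbf{y} = \alpha(\mathbf{x} \times \mathbf{y}) = \mathbf{x} \times (\alpha \mathbf{y})$,
(iii) $\mathbf{x} \times (\mathbf{y} + \mathbf{z}) = (\mathbf{x} \times \mathbf{y}) + (\mathbf{x} \times \mathbf{z})$,
(iv) $(\mathbf{x} \times \mathbf{y}) \cdot \mathbf{z} = \mathbf{x} \cdot (\mathbf{y} \times \mathbf{z}) = \det \begin{bmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ z_1 & z_2 & z_3 \end{bmatrix}$,
(v) $\mathbf{x} \times (\mathbf{y} \times \mathbf{z}) = (\mathbf{x} \cdot \mathbf{z})\mathbf{y} - (\mathbf{x} \cdot \mathbf{y})\mathbf{z}$,
and
(vi) $\|\mathbf{x} \times \mathbf{y}\|^2 = (\mathbf{x} \cdot \mathbf{x})(\mathbf{y} \cdot \mathbf{y}) - (\mathbf{x} \cdot \mathbf{y})^2$.
Moreover, if $\mathbf{x} \times \mathbf{y} \ne \mathbf{0}$, then
(vii) the vector $\mathbf{x} \times \mathbf{y}$ is orthogonal to $\mathbf{x}$ and $\mathbf{y}$.

PROOF. These properties follow immediately from the definitions. We will prove properties (iv), (v), and (vii) and leave the rest as an exercise.

(iv) Notice that by definition,

$$\begin{aligned} (\mathbf{x} \times \mathbf{y}) \cdot \mathbf{z} &= (x_2 y_3 - x_3 y_2) z_1 + (x_3 y_1 - x_1 y_3) z_2 + (x_1 y_2 - x_2 y_1) z_3 \\ &= x_1 (y_2 z_3 - y_3 z_2) + x_2 (y_3 z_1 - y_1 z_3) + x_3 (y_1 z_2 - y_2 z_1). \end{aligned}$$

Since this last expression is both the scalar $\mathbf{x} \cdot (\mathbf{y} \times \mathbf{z})$ and the value of the determinant on the right side of (iv) (expanded along the first row), this verifies (iv).

(v) Since $\mathbf{x} \times (\mathbf{y} \times \mathbf{z}) = (x_1, x_2, x_3) \times (y_2 z_3 - y_3 z_2,\; y_3 z_1 - y_1 z_3,\; y_1 z_2 - y_2 z_1)$, the first component of $\mathbf{x} \times (\mathbf{y} \times \mathbf{z})$ is

$$x_2 y_1 z_2 - x_2 y_2 z_1 - x_3 y_3 z_1 + x_3 y_1 z_3 = (x_1 z_1 + x_2 z_2 + x_3 z_3) y_1 - (x_1 y_1 + x_2 y_2 + x_3 y_3) z_1.$$

This proves that the first components of $\mathbf{x} \times (\mathbf{y} \times \mathbf{z})$ and $(\mathbf{x} \cdot \mathbf{z})\mathbf{y} - (\mathbf{x} \cdot \mathbf{y})\mathbf{z}$ are equal. A similar argument shows that the second and third components are also equal.

(vii) By parts (i) and (iv), $(\mathbf{x} \times \mathbf{y}) \cdot \mathbf{x} = -(\mathbf{y} \times \mathbf{x}) \cdot \mathbf{x} = -\mathbf{y} \cdot (\mathbf{x} \times \mathbf{x}) = -\mathbf{y} \cdot \mathbf{0} = 0$. Thus $\mathbf{x} \times \mathbf{y}$ is orthogonal to $\mathbf{x}$. A similar calculation shows that $\mathbf{x} \times \mathbf{y}$ is orthogonal to $\mathbf{y}$. ∎

[Figure 8.3]

Part (vii) is illustrated in Figure 8.3. Notice that $\mathbf{x} \times \mathbf{y}$ satisfies the "right-hand rule." Indeed, if one puts the fingers of the right hand along $\mathbf{x}$ and the palm of the right hand along $\mathbf{y}$, then the thumb points in the direction of $\mathbf{x} \times \mathbf{y}$.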
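Several parts of Theorem 8.9 can be spot-checked on concrete vectors. The following is a minimal sketch, not a proof; the function names `cross`, `dot`, `sub`, and `scale` and the sample vectors are assumptions chosen for illustration.

```python
def dot(x, y):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(x, y))


def sub(x, y):
    """Componentwise difference x - y."""
    return tuple(a - b for a, b in zip(x, y))


def scale(alpha, x):
    """Scalar multiple alpha * x."""
    return tuple(alpha * a for a in x)


def cross(x, y):
    """Cross product in R^3, following the component formula of Definition 8.8."""
    return (
        x[1] * y[2] - x[2] * y[1],
        x[2] * y[0] - x[0] * y[2],
        x[0] * y[1] - x[1] * y[0],
    )


x, y, z = (1, 2, 3), (4, -1, 2), (0, 5, -2)  # arbitrary sample vectors
w = cross(x, y)
print(w, dot(w, x), dot(w, y))  # part (vii): both dot products vanish
# Part (v): x x (y x z) = (x . z) y - (x . y) z
print(cross(x, cross(y, z)), sub(scale(dot(x, z), y), scale(dot(x, y), z)))
```

With integer components every check is exact; for the usual basis the same function gives $\mathbf{i} \times \mathbf{j} = \mathbf{k}$.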
By (3), there is a close connection between dot products and cosines. The following result shows that there is a similar connection between cross products and sines.

8.10 Remark. Let $\mathbf{x}, \mathbf{y}$ be nonzero vectors in $\mathbf{R}^3$ and $\theta$ be the angle between $\mathbf{x}$ and $\mathbf{y}$. Then $\|\mathbf{x} \times \mathbf{y}\| = \|\mathbf{x}\|\,\|\mathbf{y}\| \sin\theta$.

PROOF. By Theorem 8.9vi and (3),

$$\|\mathbf{x} \times \mathbf{y}\|^2 = (\|\mathbf{x}\|\,\|\mathbf{y}\|)^2 - (\|\mathbf{x}\|\,\|\mathbf{y}\| \cos\theta)^2 = (\|\mathbf{x}\|\,\|\mathbf{y}\|)^2 (1 - \cos^2\theta) = (\|\mathbf{x}\|\,\|\mathbf{y}\|)^2 \sin^2\theta.$$ ∎

This observation can be used to establish a connection between cross products and area or volume (see Exercise 7, p. 241).

EXERCISES
1. Using Postulate 1 in Section 1.1 and Definition 8.1, prove Theorem 8.2.
2. (a) Find all nonzero vectors orthogonal to $(1, -1, 0)$ that lie in the plane $z = x$.
(b) Find all nonzero vectors orthogonal to the vector $(3, 2, -5)$ whose components sum to 4.
(c) Find an equation of the plane containing the point $(1, 2, 1)$ with normal $(-1, 2, 1)$.
3. Let $\mathbf{a}, \mathbf{b} \in \mathbf{R}^m$, $\mathbf{b} \ne \mathbf{0}$, and set $\phi(t) = \mathbf{a} + t\mathbf{b}$. Show that the angle between $\phi(t_1) - \phi(t_0)$ and $\phi(t_2) - \phi(t_0)$ is 0 or $\pi$ for any $t_0, t_1, t_2 \in \mathbf{R}$ with $t_1, t_2 \ne t_0$.
4. Use the proof of Theorem 8.5 to show that equality in the Cauchy-Schwarz Inequality holds if and only if $\mathbf{x} = \mathbf{0}$, $\mathbf{y} = \mathbf{0}$, or $\mathbf{x}$ is parallel to $\mathbf{y}$.
5. Prove Theorem 8.9, parts (i) through (iii) and (vi).
6. Suppose that $\{a_k\}$ and $\{b_k\}$ are sequences of real numbers that satisfy

$$\sum_{k=1}^{\infty} a_k^2 < \infty \quad\text{and}\quad \sum_{k=1}^{\infty} b_k^2 < \infty.$$

Prove that the infinite series $\sum_{k=1}^{\infty} a_k b_k$ converges absolutely.
7. Let $\mathbf{a}$ and $\mathbf{b}$ be nonzero vectors in $\mathbf{R}^n$, and $\theta$ be the angle between them.
(a) Use Exercise 4 to prove that $\mathbf{a}$ and $\mathbf{b}$ are parallel if and only if $\theta = 0$ or $\pi$.
(b) Prove that $\mathbf{a}$ and $\mathbf{b}$ are orthogonal if and only if $\theta = \pi/2$.
8. Find two lines in $\mathbf{R}^3$ that are not parallel but do not intersect.
9. Prove that the $\ell^1$-norm and the sup-norm also satisfy Theorem 8.6.
8.2 PLANES AND LINEAR TRANSFORMATIONS A plane $\Pi$ in $\mathbf{R}^3$ is a set of points that is "flat" in some sense. What do we mean by flat? If we look at any vector that lies in the plane, it is orthogonal to a common direction, called the normal (see Figure 8.4). Thus we define the hyperplane (a plane when $n = 3$) passing through a point $\mathbf{a} \in \mathbf{R}^n$ with normal $\mathbf{b} \ne \mathbf{0}$ to be the set

$$\Pi_{\mathbf{b}}(\mathbf{a}) := \{\mathbf{x} \in \mathbf{R}^n : (\mathbf{x} - \mathbf{a}) \cdot \mathbf{b} = 0\}.$$

[Figure 8.4]

Notice that by definition, $\Pi_{\mathbf{b}}(\mathbf{a})$ is the set of all points $\mathbf{x}$ such that $\mathbf{x} - \mathbf{a}$ and $\mathbf{b}$ are orthogonal. (Several such points $\mathbf{x}$ are shown in Figure 8.4.) Hence we have built "flatness" into the definition of hyperplanes.

There is nothing unique about "the normal" of a hyperplane; any vector parallel to $\mathbf{b}$ will work. Indeed, if $\mathbf{b}$ and $\mathbf{c}$ are parallel, then by definition, $\mathbf{b} = t\mathbf{c}$ for some nonzero $t \in \mathbf{R}$, hence $(\mathbf{x} - \mathbf{a}) \cdot \mathbf{b} = 0$ if and only if $(\mathbf{x} - \mathbf{a}) \cdot \mathbf{c} = 0$. However, a normal of a hyperplane can be used to determine many of its properties. For example, the angle between two hyperplanes with respective normals $\mathbf{b}$ and $\mathbf{c}$ is defined to be the angle between the normals $\mathbf{b}$ and $\mathbf{c}$.

By an equation of a hyperplane $\Pi$ we mean an expression of the form $F(\mathbf{x}) = 0$, where $F : \mathbf{R}^n \to \mathbf{R}$ is a function determined by the following property: a point $\mathbf{x}$ belongs to $\Pi$ if and only if $F(\mathbf{x}) = 0$. By definition, then, an equation of the hyperplane $\Pi_{\mathbf{b}}(\mathbf{a})$ is given by

$$b_1 x_1 + b_2 x_2 + \cdots + b_n x_n = d,$$

where $\mathbf{b} = (b_1, \dots, b_n)$ is a normal and $d = b_1 a_1 + b_2 a_2 + \cdots + b_n a_n$ is a constant determined by $\mathbf{a}$ and $\mathbf{b}$ (and related to the distance from $\Pi_{\mathbf{b}}(\mathbf{a})$ to the origin; see Exercise 8). In particular, planes in $\mathbf{R}^3$ have equations of the form $ax + by + cz = d$.
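The passage from the defining condition $(\mathbf{x} - \mathbf{a}) \cdot \mathbf{b} = 0$ to the equation $b_1 x_1 + \cdots + b_n x_n = d$ can be illustrated as follows. This is only a sketch; the function names `hyperplane_constant` and `contains` are hypothetical, and the point and normal are arbitrary examples, not taken from the text.

```python
def dot(x, y):
    """Dot product of two vectors given as tuples."""
    return sum(p * q for p, q in zip(x, y))


def hyperplane_constant(a, b):
    """The constant d = b . a in the equation b . x = d of the hyperplane."""
    return dot(b, a)


def contains(x, a, b):
    """True when x lies on the hyperplane through a with normal b."""
    return dot(tuple(xi - ai for xi, ai in zip(x, a)), b) == 0


a, b = (1, 0, 2), (2, -1, 3)  # arbitrary point and normal in R^3
d = hyperplane_constant(a, b)
print(d)                          # the plane is 2x - y + 3z = d
print(contains((4, 0, 0), a, b))  # this point satisfies the equation
print(contains((0, 0, 0), a, b))  # the origin does not
```

Replacing `b` by any nonzero multiple, such as `(-4, 2, -6)`, leaves the set of points unchanged, which reflects the remark that the normal is not unique.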
Notice that a "hyperplane" in $\mathbf{R}^2$ is by definition a straight line. Just as straight lines through the origin played a prominent role in characterizing differentiability of functions of one variable (see Theorem 4.3), hyperplanes through the origin will play a role in defining differentiability of functions of several variables. But the equation of a hyperplane is by definition real-valued. Since we do not want to restrict our analysis of differentiable functions to the real-valued case, we need to characterize equations of hyperplanes in an algebraic way so we can generalize them further to vector-valued functions, i.e., functions that take $\mathbf{R}^n$ into $\mathbf{R}^m$. Toward this end, we make the following observation about equations of straight lines through the origin. (Here we use $s$ for slope since $m$ will be used for the dimension of the range space $\mathbf{R}^m$.)

8.11 Remark. Let $T : \mathbf{R} \to \mathbf{R}$. Then $T(x) = sx$ for some $s \in \mathbf{R}$ if and only if $T$ satisfies
\[ T(x + y) = T(x) + T(y) \quad\text{and}\quad T(\alpha x) = \alpha T(x) \tag{4} \]
for all $x, y, \alpha \in \mathbf{R}$.

PROOF. If $T(x) = sx$, then $T$ satisfies (4) since the distributive and commutative laws hold on $\mathbf{R}$. Conversely, if $T$ satisfies (4), set $s := T(1)$. Then (let $\alpha = x$)
\[ T(x) = T(x \cdot 1) = xT(1) = sx \]
for all $x \in \mathbf{R}$. ∎

Accordingly, we make the following definition.

8.12 DEFINITION. A function $T : \mathbf{R}^n \to \mathbf{R}^m$ is said to be linear (notation: $T \in \mathcal{L}(\mathbf{R}^n; \mathbf{R}^m)$) if and only if it satisfies
\[ T(\mathbf{x} + \mathbf{y}) = T(\mathbf{x}) + T(\mathbf{y}) \quad\text{and}\quad T(\alpha\mathbf{x}) = \alpha T(\mathbf{x}) \]
for all $\mathbf{x}, \mathbf{y} \in \mathbf{R}^n$ and all scalars $\alpha$.

Notice once and for all that if $T$ is a linear function, then
\[ T(\mathbf{0}) = \mathbf{0}. \tag{5} \]
Indeed, by definition, $T(\mathbf{0}) = T(\mathbf{0} + \mathbf{0}) = T(\mathbf{0}) + T(\mathbf{0})$. Hence (5) can be obtained by subtracting $T(\mathbf{0})$ from both sides of this last equation.

Functions in $\mathcal{L}(\mathbf{R}^n; \mathbf{R}^m)$ are sometimes called linear transformations or linear operators because of the fundamental role they play in the theory of change of variables in $\mathbf{R}^n$. We shall take up this connection in Chapter 12.

According to Remark 8.11, linear transformations of one variable, i.e., objects $T \in \mathcal{L}(\mathbf{R}; \mathbf{R})$, can be identified with $\mathbf{R}$ by representing $T$ by its slope $s$. Is there an analogue of slope that can be used to represent linear transformations of several variables? To answer this question, we use the following half page to review some elementary linear algebra.
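Recall that an $m \times n$ matrix $B$ is a rectangular array that has $m$ rows and $n$ columns:
\[ B = [b_{ij}]_{m \times n} := \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ b_{m1} & b_{m2} & \cdots & b_{mn} \end{bmatrix}. \]
For us, the entries $b_{ij}$ of a matrix $B$ will usually be numbers or real-valued functions. Let $B = [b_{ij}]_{m \times n}$ and $C = [c_{\nu k}]_{p \times q}$ be such matrices. Recall that the product of $B$ and a scalar $\alpha$ is defined by
\[ \alpha B = [\alpha b_{ij}]_{m \times n}, \]
the sum of $B$ and $C$ is defined (when $m = p$ and $n = q$) by
\[ B + C = [b_{ij} + c_{ij}]_{m \times n}, \]
and the product of $B$ and $C$ is defined (when $n = p$) by
\[ BC = \left[ \sum_{\nu=1}^{n} b_{i\nu} c_{\nu j} \right]_{m \times q}. \]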
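The product formula can be written out directly in code. The sketch below is illustrative only: `mat_mul` is a name chosen here, matrices are represented as lists of rows, and the two sample matrices are arbitrary. It also shows that the two products $BC$ and $CB$ can differ.

```python
def mat_mul(B, C):
    """Product of an m x n matrix B and a p x q matrix C (requires n == p)."""
    m, n = len(B), len(B[0])
    p, q = len(C), len(C[0])
    if n != p:
        raise ValueError("inner dimensions must agree")
    # Entry (i, j) of BC is the sum over v of b_{iv} * c_{vj}.
    return [[sum(B[i][v] * C[v][j] for v in range(n)) for j in range(q)]
            for i in range(m)]


B = [[1, 2],
     [0, 1]]
C = [[0, 1],
     [1, 0]]

BC = mat_mul(B, C)
CB = mat_mul(C, B)
```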
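Also recall that most of the usual laws of algebra hold for addition and multiplication of matrices (see Theorem C.1 in Appendix C). One glaring exception is that matrix multiplication is not commutative.

We shall identify points $\mathbf{x} = (x_1, x_2, \dots, x_n) \in \mathbf{R}^n$ with $1 \times n$ row matrices or $n \times 1$ column matrices by setting
\[ [\mathbf{x}] = [\,x_1 \;\; x_2 \;\; \cdots \;\; x_n\,] \quad\text{or}\quad [\mathbf{x}] = [\,x_1 \;\; x_2 \;\; \cdots \;\; x_n\,]^T, \]
where $B^T$ represents the transpose of a matrix $B$ (see Appendix C). Abusing the notation slightly, we shall usually represent the product of an $m \times n$ matrix $B$ and an $n \times 1$ column matrix $[\mathbf{x}]$ by $B\mathbf{x}$. This notation is justified, as the following result shows, since the function $\mathbf{x} \mapsto [\mathbf{x}]$ takes vector addition to matrix addition, the dot product to matrix multiplication, and scalar multiplication to scalar multiplication.

8.13 Remark. If $\mathbf{x}, \mathbf{y} \in \mathbf{R}^n$ and $\alpha$ is a scalar, then
\[ [\mathbf{x} + \mathbf{y}] = [\mathbf{x}] + [\mathbf{y}], \qquad [\mathbf{x} \cdot \mathbf{y}] = [\mathbf{x}][\mathbf{y}]^T, \qquad\text{and}\qquad [\alpha\mathbf{x}] = \alpha[\mathbf{x}]. \]

PROOF. These laws follow immediately from the definitions of addition and multiplication of matrices and vectors. For example,
\[ [\mathbf{x} + \mathbf{y}] = [\,x_1 + y_1 \;\; x_2 + y_2 \;\; \cdots \;\; x_n + y_n\,] = [\,x_1 \;\; x_2 \;\; \cdots \;\; x_n\,] + [\,y_1 \;\; y_2 \;\; \cdots \;\; y_n\,] = [\mathbf{x}] + [\mathbf{y}]. \; ∎ \]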
The following result shows that each m x n matrix gives rise to a linear function from Rn to Rm.
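8.14 Remark. Let $B = [b_{ij}]$ be an $m \times n$ matrix whose entries are real numbers, and let $\mathbf{e}_1, \dots, \mathbf{e}_n$ represent the usual basis of $\mathbf{R}^n$. If
\[ T(\mathbf{x}) = B\mathbf{x}, \tag{6} \]
then $T$ is a linear function from $\mathbf{R}^n$ to $\mathbf{R}^m$ and
\[ T(\mathbf{e}_j) = (b_{1j}, b_{2j}, \dots, b_{mj}), \qquad j = 1, 2, \dots, n. \tag{7} \]

PROOF. Notice, first, that (7) holds by (6) and the definition of matrix multiplication. Next, observe by Remark 8.13 and the distributive law of matrix multiplication (see Theorem C.1) that
\[ T(\mathbf{x} + \mathbf{y}) = B[\mathbf{x} + \mathbf{y}] = B([\mathbf{x}] + [\mathbf{y}]) = B[\mathbf{x}] + B[\mathbf{y}] = T(\mathbf{x}) + T(\mathbf{y}) \]
for all $\mathbf{x}, \mathbf{y} \in \mathbf{R}^n$. Similarly, $T(\alpha\mathbf{x}) = B[\alpha\mathbf{x}] = B(\alpha[\mathbf{x}]) = \alpha B[\mathbf{x}] = \alpha T(\mathbf{x})$ for all $\mathbf{x} \in \mathbf{R}^n$ and $\alpha \in \mathbf{R}$. Thus $T \in \mathcal{L}(\mathbf{R}^n; \mathbf{R}^m)$. ∎

Remark 8.14 would barely be worth mentioning were it not the case that ALL linear functions from $\mathbf{R}^n$ to $\mathbf{R}^m$ have this form. Here, then, is the multidimensional analogue of Remark 8.11.

8.15 THEOREM. For each $T \in \mathcal{L}(\mathbf{R}^n; \mathbf{R}^m)$ there is a matrix $B = [b_{ij}]_{m \times n}$ such that (6) holds. Moreover, the matrix $B$ is unique. Specifically, for each fixed $T$ there is only one $B$ that satisfies (6), and the entries of that $B$ are defined by (7).

PROOF. Uniqueness has been established in Remark 8.14. To prove existence, suppose that $T \in \mathcal{L}(\mathbf{R}^n; \mathbf{R}^m)$. Define $B$ by (7). Then, by linearity,
\[ T(\mathbf{x}) = T\Big( \sum_{j=1}^{n} x_j \mathbf{e}_j \Big) = \sum_{j=1}^{n} x_j T(\mathbf{e}_j) = \sum_{j=1}^{n} x_j (b_{1j}, b_{2j}, \dots, b_{mj}) = B\mathbf{x}. \; ∎ \]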
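The construction in the proof of Theorem 8.15 can be sketched in code: evaluate $T$ on the usual basis and use the images as columns. Everything named below (`standard_basis`, `matrix_of`, `apply`, and the sample map `T`) is hypothetical and chosen only for illustration; the sketch assumes that the function passed in really is linear.

```python
def standard_basis(n):
    """The usual basis e_1, ..., e_n of R^n, as tuples."""
    return [tuple(1 if i == j else 0 for i in range(n)) for j in range(n)]


def matrix_of(T, n):
    """Matrix whose j-th column is T(e_j), as in (7). Assumes T is linear."""
    columns = [T(e) for e in standard_basis(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def apply(B, x):
    """Compute Bx for a matrix B (list of rows) and a vector x."""
    return tuple(sum(bij * xj for bij, xj in zip(row, x)) for row in B)


def T(x):
    """A sample linear map from R^2 to R^3 (illustrative choice)."""
    x1, x2 = x
    return (x1 + 2 * x2, 3 * x1, -x2)


B = matrix_of(T, 2)
x = (5, -4)
```

For this sample map, `apply(B, x)` and `T(x)` agree, which is what (6) asserts.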
The unique matrix B that satisfies (6) is called the matrix that represents T. Notice by (7) that the columns of B are the images of the usual basis elements under T. In Chapter 11 we shall use this point of view to define what it means for a function from Rn into Rm to be differentiable. At that point, we shall show that many of the one-dimensional results about differentiation go over to the multidimensional setting. Since the one-dimensional theory relied on estimates using the absolute value of various functions, we expect the theory in Rn to rely on estimates using the norms of various functions. Since some of those functions will be linear, the following concept will be useful in this regard.
8.16 DEFINITION. Let $T \in \mathcal{L}(\mathbf{R}^n; \mathbf{R}^m)$. The operator norm of $T$ is the extended real number
\[ \|T\| := \inf\{ C > 0 : \|T(\mathbf{x})\| \le C\|\mathbf{x}\| \ \text{for all } \mathbf{x} \in \mathbf{R}^n \}. \]
One interesting corollary of Theorem 8.15 is that the operator norm of a linear function is always finite.
8.17 THEOREM. Let $T \in \mathcal{L}(\mathbf{R}^n; \mathbf{R}^m)$. Then the operator norm of $T$ is finite, and satisfies
\[ \|T(\mathbf{x})\| \le \|T\|\,\|\mathbf{x}\| \tag{8} \]
for all $\mathbf{x} \in \mathbf{R}^n$.

PROOF. Let $B$ be the $m \times n$ matrix that represents $T$, and suppose that the rows of $B$ are given by $\mathbf{b}_1, \dots, \mathbf{b}_m$. By the definition of matrix multiplication and our identification of $\mathbf{R}^m$ with $m \times 1$ matrices,
\[ T(\mathbf{x}) = (\mathbf{b}_1 \cdot \mathbf{x}, \dots, \mathbf{b}_m \cdot \mathbf{x}). \]
If $B = O$, then $\|T\| = 0$ and (8) is an equality. If $B \ne O$, then by the Cauchy-Schwarz Inequality, the square of the Euclidean norm of $T(\mathbf{x})$ satisfies
\[ \|T(\mathbf{x})\|^2 = (\mathbf{b}_1 \cdot \mathbf{x})^2 + \cdots + (\mathbf{b}_m \cdot \mathbf{x})^2 \le (\|\mathbf{b}_1\|\,\|\mathbf{x}\|)^2 + \cdots + (\|\mathbf{b}_m\|\,\|\mathbf{x}\|)^2 \le m \cdot \max\{\|\mathbf{b}_j\|^2 : 1 \le j \le m\}\,\|\mathbf{x}\|^2 =: C\|\mathbf{x}\|^2, \]
and $C > 0$. Hence $\|T(\mathbf{x})\| \le \sqrt{C}\,\|\mathbf{x}\|$ for all $\mathbf{x} \in \mathbf{R}^n$; i.e., $\sqrt{C}$ belongs to the set defining $\|T\|$. Thus that set is nonempty. Since it is bounded below (by 0), it follows from the Completeness Axiom that $\|T\|$ exists and is finite. In particular, there are $C_k > 0$ such that $C_k \downarrow \|T\|$ and $\|T(\mathbf{x})\| \le C_k\|\mathbf{x}\|$ for all $\mathbf{x} \in \mathbf{R}^n$. Taking the limit of this last inequality as $k \to \infty$, we obtain (8). ∎

Theorem 8.17, an analogue of the Cauchy-Schwarz Inequality, will be used to estimate differentiable functions of several variables. If $B$ is the matrix that represents a linear transformation $T$, we will refer to the number $\|T\|$ as the operator norm of $B$, and denote it by $\|B\|$. (For two other ways to calculate this norm, see Exercise 11.)

We close this section with an optional result that shows that under the identification of linear functions with matrices, function composition is taken to matrix multiplication. This, in fact, is why matrix multiplication is defined the way it is.
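Returning to the estimate in the proof of Theorem 8.17: the quantity $\sqrt{m}\,\max_j \|\mathbf{b}_j\|$ is an upper bound for the operator norm, not the norm itself. The following sketch compares that bound with ratios $\|B\mathbf{x}\| / \|\mathbf{x}\|$ at randomly sampled points. The helper names, the sample matrix, and the sampling scheme are all illustrative assumptions; random sampling can only suggest a lower estimate of $\|B\|$ and proves nothing.

```python
import math
import random


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in v))


def apply(B, x):
    """Compute Bx for a matrix B (list of rows) and a vector x."""
    return [sum(b * xj for b, xj in zip(row, x)) for row in B]


def row_bound(B):
    """sqrt(m) * max_j ||b_j||: the constant sqrt(C) from the proof."""
    return math.sqrt(len(B)) * max(norm(row) for row in B)


random.seed(0)                       # fixed seed so the run is repeatable
B = [[3.0, 0.0],
     [0.0, 4.0]]                     # sample matrix (illustrative)
bound = row_bound(B)

largest_ratio = 0.0
for _ in range(1000):
    x = [random.uniform(-1.0, 1.0) for _ in range(2)]
    if norm(x) > 0.0:
        largest_ratio = max(largest_ratio, norm(apply(B, x)) / norm(x))
```

Every sampled ratio stays below `bound`, as inequality (8) together with $\|B\| \le \sqrt{C}$ requires. For this diagonal matrix the ratios never exceed 4, so the bound $4\sqrt{2}$ is not sharp.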
*8.18 Remark. If $T : \mathbf{R}^n \to \mathbf{R}^m$ and $U : \mathbf{R}^m \to \mathbf{R}^p$ are linear, then so is $U \circ T$. In fact, if $B$ is the $m \times n$ matrix that represents $T$, and $C$ is the $p \times m$ matrix that represents $U$, then $CB$ is the matrix that represents $U \circ T$.
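PROOF. Let $\mathbf{e}_1, \dots, \mathbf{e}_n$ be the usual basis of $\mathbf{R}^n$, $\mathbf{u}_1, \dots, \mathbf{u}_m$ be the usual basis of $\mathbf{R}^m$, and $\mathbf{w}_1, \dots, \mathbf{w}_p$ be the usual basis of $\mathbf{R}^p$. If $B = [b_{ij}]_{m \times n}$ represents $T$ and $C = [c_{\nu k}]_{p \times m}$ represents $U$, then by Theorem 8.15,
\[ \sum_{k=1}^{m} b_{kj}\mathbf{u}_k = (b_{1j}, \dots, b_{mj}) = T(\mathbf{e}_j), \qquad j = 1, 2, \dots, n, \]
and
\[ \sum_{\nu=1}^{p} c_{\nu k}\mathbf{w}_\nu = (c_{1k}, \dots, c_{pk}) = U(\mathbf{u}_k), \qquad k = 1, 2, \dots, m. \]
Hence
\[ (U \circ T)(\mathbf{e}_j) = U\Big( \sum_{k=1}^{m} b_{kj}\mathbf{u}_k \Big) = \sum_{k=1}^{m} b_{kj} U(\mathbf{u}_k) = \sum_{k=1}^{m} \sum_{\nu=1}^{p} b_{kj} c_{\nu k}\mathbf{w}_\nu = \Big( \sum_{k=1}^{m} b_{kj} c_{1k}, \dots, \sum_{k=1}^{m} b_{kj} c_{pk} \Big) \]
for each $1 \le j \le n$. Since this last vector is the $j$th column of the matrix $CB$, it follows that $CB$ is the matrix that represents $U \circ T$. ∎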
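Remark 8.18 can be checked on a small example. In the sketch below, `mat_mul` and `apply` are helper names chosen here, and the matrices `B` (3 x 2, representing a map $T$ from $\mathbf{R}^2$ to $\mathbf{R}^3$) and `C` (2 x 3, representing a map $U$ from $\mathbf{R}^3$ to $\mathbf{R}^2$) are arbitrary illustrative choices.

```python
def mat_mul(C, B):
    """Matrix product CB, for C of size p x m and B of size m x n."""
    m = len(B)
    if len(C[0]) != m:
        raise ValueError("inner dimensions must agree")
    return [[sum(C[i][k] * B[k][j] for k in range(m)) for j in range(len(B[0]))]
            for i in range(len(C))]


def apply(M, x):
    """Compute Mx for a matrix M (list of rows) and a vector x."""
    return tuple(sum(mij * xj for mij, xj in zip(row, x)) for row in M)


B = [[1, 2],
     [3, 0],
     [0, -1]]          # represents T : R^2 -> R^3
C = [[1, 0, 1],
     [2, 1, 0]]        # represents U : R^3 -> R^2

CB = mat_mul(C, B)     # should represent U o T
x = (5, -4)

composed = apply(C, apply(B, x))   # U(T(x))
direct = apply(CB, x)              # (CB)x
```

Applying `B` and then `C` gives the same vector as applying the single matrix `CB`.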
EXERCISES

1. (a) Find an equation of the plane containing the points $(1, 1, 0)$, $(1, 2, 3)$ and $(-1, 2, -3)$. (b) Find an equation of the plane that contains the line $t(1, 1, 1) + (1, 4, 1)$, $t \in \mathbf{R}$, and the point $(0, 3, -1)$.

2. (a) Find an equation of the plane orthogonal to $x + y + z = 5$ passing through the points $(1, 1, 0)$ and $(0, 1, 1)$. (b) Find an equation of the plane parallel to the hyperplane $x_1 + \cdots + x_n = \pi$ passing through the point $(1, 2, \dots, n)$.

3. Find an equation of the hyperplane through the points $(1, 0, 0, 0)$, $(2, 1, 0, 0)$, $(0, 1, 1, 0)$, and $(0, 4, 0, 1)$.

4. Suppose that $\mathbf{a}, \mathbf{b}, \mathbf{c} \in \mathbf{R}^3$ are three points that do not lie on the same straight line and $\Pi$ is the plane that contains the points $\mathbf{a}, \mathbf{b}, \mathbf{c}$. Prove that an equation of $\Pi$ is given by
\[ \det \begin{bmatrix} x - a_1 & y - a_2 & z - a_3 \\ b_1 - a_1 & b_2 - a_2 & b_3 - a_3 \\ c_1 - a_1 & c_2 - a_2 & c_3 - a_3 \end{bmatrix} = 0. \]

5. Suppose that $T \in \mathcal{L}(\mathbf{R}^n; \mathbf{R}^m)$ for some $n, m \in \mathbf{N}$. (a) If $T(1, 1) = (3, \pi, 0)$ and $T(0, 2) = (4, 0, 1)$, find the matrix representative of $T$. (b) If $T(1, 1, 0) = (e, \pi)$, $T(0, -1, 1) = (1, 0)$, and $T(1, 1, -1) = (4, 7)$, find the matrix representative of $T$.
6. Suppose that $T \in \mathcal{L}(\mathbf{R}^4; \mathbf{R}^2)$. (a) If $T(0, 1, 1, 0) = (3, 4)$, $T(0, 1, -1, 0) = (4, 3)$, and $T(0, 0, 0, -1) = (\pi, 3)$, find all possible matrix representatives of $T$. (b) If $T(1, 1, 0, 0) = (5, 4)$, $T(0, 0, 1, 0) = (1, 2)$, and $T(0, 0, 0, -1) = (\pi, 3)$, find all possible matrix representatives of $T$.

7. This exercise is used in Appendix E. Recall that the area of a parallelogram with base $b$ and altitude $h$ is given by $bh$, and the volume of a parallelepiped is given by the area of its base times its altitude. (a) Let $\mathbf{a}, \mathbf{b} \in \mathbf{R}^3$ be nonzero vectors and $P$ represent the parallelogram
\[ \{ (x, y, z) = u\mathbf{a} + v\mathbf{b} : u, v \in [0, 1] \}. \]
Prove that the area of $P$ is $\|\mathbf{a} \times \mathbf{b}\|$. (b) Let $\mathbf{a}, \mathbf{b}, \mathbf{c} \in \mathbf{R}^3$ be nonzero vectors and $P$ represent the parallelepiped
\[ \{ (x, y, z) = t\mathbf{a} + u\mathbf{b} + v\mathbf{c} : t, u, v \in [0, 1] \}. \]
Prove that the volume of $P$ is $|(\mathbf{a} \times \mathbf{b}) \cdot \mathbf{c}|$.

8. The distance from a point $\mathbf{x}_0 = (x_0, y_0, z_0)$ to a plane $\Pi$ in $\mathbf{R}^3$ is defined to be
\[ \operatorname{dist}(\mathbf{x}_0, \Pi) := \begin{cases} 0 & \mathbf{x}_0 \in \Pi \\ \|\mathbf{v}\| & \mathbf{x}_0 \notin \Pi, \end{cases} \]
where $\mathbf{v} := (x_0 - x_1, y_0 - y_1, z_0 - z_1)$ for some $(x_1, y_1, z_1) \in \Pi$, and $\mathbf{v}$ is orthogonal to $\Pi$, i.e., parallel to its normal. Sketch $\Pi$ and $\mathbf{x}_0$ for a typical plane $\Pi$, and convince yourself that this is the correct definition. Prove that this definition does not depend on the choice of $\mathbf{v}$, by showing that the distance from $\mathbf{x}_0 = (x_0, y_0, z_0)$ to the plane $\Pi$ described by $ax + by + cz = d$ is
\[ \operatorname{dist}(\mathbf{x}_0, \Pi) = \frac{|ax_0 + by_0 + cz_0 - d|}{\sqrt{a^2 + b^2 + c^2}}. \]

9. [ROTATIONS IN $\mathbf{R}^2$]. This exercise is used in Section e15.1. Let
\[ B = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \]
for some $\theta \in \mathbf{R}$. (a) Prove that $\|B(x, y)\| = \|(x, y)\|$ for all $(x, y) \in \mathbf{R}^2$. (b) Let $(x, y) \in \mathbf{R}^2$ be a nonzero vector and … 0, we shall call $B$ counterclockwise rotation about the origin through the angle $\theta$.)
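10. For each of the following functions $f$, find the matrix representative of a linear transformation $T \in \mathcal{L}(\mathbf{R}; \mathbf{R}^m)$ that satisfies
\[ \lim_{h \to 0} \frac{\|f(x + h) - f(x) - T(h)\|}{|h|} = 0. \]
(a) $f(x) = (x^2, \sin x)$. (b) $f(x) = (e^x, $ ~ $, 1 - x^2)$. (c) $f(x) = (1, 2, 3, x^2 + x, x^2 - x)$.

11. Let $T \in \mathcal{L}(\mathbf{R}^n; \mathbf{R}^m)$ and set
\[ M := \sup_{\|\mathbf{x}\| = 1} \|T(\mathbf{x})\|. \]
(a) Prove that $M \le \|T\|$. (b) Using the linear property of $T$, prove that if $\mathbf{x} \ne \mathbf{0}$, then
\[ M \ge \frac{\|T(\mathbf{x})\|}{\|\mathbf{x}\|}. \]
(c) Prove that $M = \|T\|$. (d) Prove that
\[ \|T\| = \sup_{\mathbf{x} \ne \mathbf{0}} \frac{\|T(\mathbf{x})\|}{\|\mathbf{x}\|}. \]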
8.3 TOPOLOGY OF $\mathbf{R}^n$

If you want a more abstract introduction to the topology of Euclidean spaces, skip the rest of this chapter and the next, and begin Chapter 10 now.

Topology, a study of geometric objects that emphasizes how they are put together over their exact shape and proportion, is based on the fundamental concepts of open and closed sets, a generalization of open and closed intervals. In this section we introduce these concepts in $\mathbf{R}^n$ and prove their most basic properties. In the next chapter, we shall explore how they can be used to characterize limits and continuity without using distance explicitly. This additional step in abstraction will yield powerful benefits, as we shall see in Section 9.3 and in Chapter 11 when we begin to study the calculus of functions of several variables.

We begin with a natural generalization of intervals to $\mathbf{R}^n$.

8.19 DEFINITION. Let $\mathbf{a} \in \mathbf{R}^n$.
(i) For each $r > 0$, the open ball centered at $\mathbf{a}$ of radius $r$ is the set of points
\[ B_r(\mathbf{a}) := \{ \mathbf{x} \in \mathbf{R}^n : \|\mathbf{x} - \mathbf{a}\| < r \}. \]
(ii) For each $r \ge 0$, the closed ball centered at $\mathbf{a}$ of radius $r$ is the set of points
\[ \{ \mathbf{x} \in \mathbf{R}^n : \|\mathbf{x} - \mathbf{a}\| \le r \}. \]
[Figure 8.5: a ball $B_r(\mathbf{a})$ with the distance $\|\mathbf{x} - \mathbf{a}\|$ marked.]
Notice that when $n = 1$, the open ball centered at $a$ of radius $r$ is the open interval $(a - r, a + r)$, and the corresponding closed ball is the closed interval $[a - r, a + r]$. Also notice that the open ball (respectively, the closed ball) centered at $\mathbf{a}$ of radius $r$ contains none of its (respectively, all of its) circumference $\{\mathbf{x} : \|\mathbf{x} - \mathbf{a}\| = r\}$. Accordingly, we will draw pictures of balls in $\mathbf{R}^2$ with the following conventions: Open balls will be drawn with dashed circumferences, and closed balls will be drawn with solid circumferences (see Figure 8.5).

To generalize the concept of open and closed intervals even further, observe that each element of an open interval $I$ lies "inside" $I$, i.e., is surrounded by other points in $I$. On the other hand, although closed intervals do NOT satisfy this property, their complements do. Accordingly, we make the following definition.

8.20 DEFINITION. Let $n \in \mathbf{N}$.
(i) A set $V$ in $\mathbf{R}^n$ is said to be open if and only if for every $\mathbf{a} \in V$ there is an $\varepsilon > 0$ such that $B_\varepsilon(\mathbf{a}) \subseteq V$.
(ii) A set $E$ in $\mathbf{R}^n$ is said to be closed if and only if $E^c := \mathbf{R}^n \setminus E$ is open.

The following result shows that every "open" ball is open. (Closed balls are also closed; see Exercise 3.)

8.21 Remark. For every $\mathbf{x} \in B_r(\mathbf{a})$ there is an $\varepsilon > 0$ such that $B_\varepsilon(\mathbf{x}) \subseteq B_r(\mathbf{a})$.

PROOF. Let $\mathbf{x} \in B_r(\mathbf{a})$. Using Figure 8.5 for guidance, we set $\varepsilon = r - \|\mathbf{x} - \mathbf{a}\|$. If $\mathbf{y} \in B_\varepsilon(\mathbf{x})$, then by the triangle inequality, assumption, and the choice of $\varepsilon$,
\[ \|\mathbf{y} - \mathbf{a}\| \le \|\mathbf{y} - \mathbf{x}\| + \|\mathbf{x} - \mathbf{a}\| < \varepsilon + \|\mathbf{x} - \mathbf{a}\| = r. \]
Thus by definition, $\mathbf{y} \in B_r(\mathbf{a})$. In particular, $B_\varepsilon(\mathbf{x}) \subseteq B_r(\mathbf{a})$. ∎
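The choice $\varepsilon = r - \|\mathbf{x} - \mathbf{a}\|$ in the proof of Remark 8.21 can be explored numerically. The sketch below samples points of $B_\varepsilon(\mathbf{x})$ in $\mathbf{R}^3$ and checks that each lies in $B_r(\mathbf{a})$. The helper names, the sample center and radius, and the sampling scheme (including the safety factor 0.999, which keeps samples strictly inside the small ball despite rounding) are illustrative assumptions. A finite sample is a sanity check, not a proof.

```python
import math
import random


def dist(x, y):
    """Euclidean distance between two points."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def inner_radius(x, a, r):
    """The radius eps = r - ||x - a|| used in the proof; requires x in B_r(a)."""
    eps = r - dist(x, a)
    if eps <= 0:
        raise ValueError("x must lie in the open ball B_r(a)")
    return eps


def sample_in_ball(center, radius, rng):
    """Return a point at distance less than radius from center."""
    while True:
        d = [rng.uniform(-1.0, 1.0) for _ in center]
        length = math.sqrt(sum(c * c for c in d))
        if length > 0.0:
            break
    scale = 0.999 * radius * rng.random() / length
    return tuple(c + scale * di for c, di in zip(center, d))


rng = random.Random(1)               # fixed seed so the run is repeatable
a, r = (0.0, 0.0, 0.0), 2.0
x = (1.0, 1.0, 0.5)                  # ||x - a|| = 1.5 < r
eps = inner_radius(x, a, r)

samples = [sample_in_ball(x, eps, rng) for _ in range(1000)]
all_inside = all(dist(y, a) < r for y in samples)
```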
(This proof illustrates a valuable technique. Drawing diagrams in R2 sometimes leads to a proof valid for all Euclidean spaces.) Here are more examples of open sets and closed sets.
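8.22 Remark. If $\mathbf{a} \in \mathbf{R}^n$, then $\mathbf{R}^n \setminus \{\mathbf{a}\}$ is open and $\{\mathbf{a}\}$ is closed.

PROOF. By Definition 8.20, it suffices to prove that the complement of every singleton $E := \{\mathbf{a}\}$ is open. Let $\mathbf{x} \in E^c$ and set $\varepsilon = \|\mathbf{x} - \mathbf{a}\|$. Then by definition, $\mathbf{a} \notin B_\varepsilon(\mathbf{x})$, so $B_\varepsilon(\mathbf{x}) \subseteq E^c$. Therefore, $E^c$ is open by Definition 8.20. ∎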
Students sometimes mistakenly believe that every set is either open or closed. Some sets are neither open nor closed (like the interval [0,1)), and as the following result shows, every Euclidean space contains two special sets that are both open and closed. (We shall see below that these are the only subsets of Rn that are simultaneously open and closed in Rn.)
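8.23 Remark. For each $n \in \mathbf{N}$, the empty set $\emptyset$ and the whole space $\mathbf{R}^n$ are both open and closed.

PROOF. Since $\mathbf{R}^n = \emptyset^c$ and $\emptyset = (\mathbf{R}^n)^c$, it suffices by Definition 8.20 to prove that $\emptyset$ and $\mathbf{R}^n$ are both open. Because the empty set contains no points, "every" point $\mathbf{x} \in \emptyset$ satisfies $B_\varepsilon(\mathbf{x}) \subseteq \emptyset$. (This is called the vacuous implication.) Therefore, $\emptyset$ is open. On the other hand, since $B_\varepsilon(\mathbf{x}) \subseteq \mathbf{R}^n$ for all $\mathbf{x} \in \mathbf{R}^n$ and all $\varepsilon > 0$, it is clear that $\mathbf{R}^n$ is open. ∎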
It is important to recognize that open sets and closed sets behave very differently with respect to unions and intersections. (In fact, these properties are so important that they form the basis of an axiomatic system that describes all topological spaces, even those for which measurement of distance is impossible.)

8.24 THEOREM. Let $n \in \mathbf{N}$.
(i) If $\{V_\alpha\}_{\alpha \in A}$ is any collection of open subsets of $\mathbf{R}^n$, then
\[ \bigcup_{\alpha \in A} V_\alpha \]
is open.
(ii) If $\{V_k : k = 1, 2, \dots, p\}$ is a finite collection of open subsets of $\mathbf{R}^n$, then
\[ \bigcap_{k=1}^{p} V_k := \bigcap_{k \in \{1, 2, \dots, p\}} V_k \]
is open.
(iii) If $\{E_\alpha\}_{\alpha \in A}$ is any collection of closed subsets of $\mathbf{R}^n$, then
\[ \bigcap_{\alpha \in A} E_\alpha \]
is closed.
(iv) If $\{E_k : k = 1, 2, \dots, p\}$ is a finite collection of closed subsets of $\mathbf{R}^n$, then
\[ \bigcup_{k=1}^{p} E_k := \bigcup_{k \in \{1, 2, \dots, p\}} E_k \]
is closed.
(v) If $V$ is open and $E$ is closed, then $V \setminus E$ is open and $E \setminus V$ is closed.
PROOF. (i) Let $\mathbf{x} \in \bigcup_{\alpha \in A} V_\alpha$. Then $\mathbf{x} \in V_\alpha$ for some $\alpha \in A$. Since $V_\alpha$ is open, it follows that there is an $r > 0$ such that $B_r(\mathbf{x}) \subseteq V_\alpha$. Thus $B_r(\mathbf{x}) \subseteq \bigcup_{\alpha \in A} V_\alpha$; i.e., this union is open.

(ii) Let $\mathbf{x} \in \bigcap_{k=1}^{p} V_k$. Then $\mathbf{x} \in V_k$ for $k = 1, 2, \dots, p$. Since each $V_k$ is open, it follows that there are numbers $r_k > 0$ such that $B_{r_k}(\mathbf{x}) \subseteq V_k$. Let $r = \min\{r_1, \dots, r_p\}$. Then $r > 0$ and $B_r(\mathbf{x}) \subseteq V_k$ for all $k = 1, 2, \dots, p$; i.e., $B_r(\mathbf{x}) \subseteq \bigcap_{k=1}^{p} V_k$. Hence this intersection is open.

(iii) By DeMorgan's Law (Theorem 1.41) and part (i),
\[ \Big( \bigcap_{\alpha \in A} E_\alpha \Big)^c = \bigcup_{\alpha \in A} E_\alpha^c \]
is open, so $\bigcap_{\alpha \in A} E_\alpha$ is closed.

(iv) By DeMorgan's Law and part (ii),
\[ \Big( \bigcup_{k=1}^{p} E_k \Big)^c = \bigcap_{k=1}^{p} E_k^c \]
is open, so $\bigcup_{k=1}^{p} E_k$ is closed.

(v) Since $V \setminus E = V \cap E^c$ and $E \setminus V = E \cap V^c$, the former is open by part (ii), and the latter is closed by part (iii). ∎

The finiteness hypothesis in Theorem 8.24 is crucial, even for the case $n = 1$.

8.25 Remark. Statements (ii) and (iv) of Theorem 8.24 are false if arbitrary collections are used in place of finite collections.

PROOF. In the Euclidean space $\mathbf{R}$,
\[ \bigcap_{k \in \mathbf{N}} \Big( -\frac{1}{k}, \frac{1}{k} \Big) = \{0\} \]
is closed and
\[ \bigcup_{k \in \mathbf{N}} \Big[ \frac{1}{k+1}, \frac{k}{k+1} \Big] = (0, 1) \]
is open. ∎

To see why open sets are so important to analysis, we reexamine the definition of continuity using open sets. A function $f : \mathbf{R} \to \mathbf{R}$ is continuous if and only if given $\varepsilon > 0$ and $a \in \mathbf{R}$ there is a $\delta > 0$ such that $|x - a| < \delta$ implies $|f(x) - f(a)| < \varepsilon$. Put in "ball language," this says that $f(B_\delta(a)) \subseteq B_\varepsilon(f(a))$; i.e., $B_\delta(a) \subseteq f^{-1}(B_\varepsilon(f(a)))$. Since these steps are reversible, we see that $f$ is continuous on $\mathbf{R}$ if and only if for all $a \in \mathbf{R}$, the inverse image under $f$ of every open ball centered at $f(a)$ contains an open ball centered at $a$.

What happens to this statement when the domain of $f$ is not all of $\mathbf{R}$? To answer this question, we consider two functions, $f(x) = 1/x$ and $g(x) = 1 + \sqrt{x - 1}$, and one open ball, $(-1, 3)$, centered at 1. Notice that $f^{-1}(-1, 3) = (-\infty, -1) \cup (1/3, \infty)$ contains an open ball centered at $a = 1$ but $g^{-1}(-1, 3) = [1, 5)$ does not. What caused the breakdown of our observation? The domain of $f$, $(-\infty, 0) \cup (0, \infty)$, is open, but the domain of $g$, $[1, \infty)$, is not.

Is there any way to fix the statement above to handle the case when the domain of $g$ is not open? Yes! You will prove (see Exercise 6, p. 276) that a function $g$ is continuous on a set $E \subseteq \mathbf{R}^n$ if and only if the inverse image under $g$ of an open set is the intersection of an open set with $E$. Notice that this IS the case for the example above. When $g(x) = 1 + \sqrt{x - 1}$, $g^{-1}(-1, 3) = [1, 5)$ is the intersection of $E = [1, \infty)$ with the open set $(-5, 5)$. Accordingly, we modify the definition of open and closed along the following lines.
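The example with $g(x) = 1 + \sqrt{x - 1}$ can be checked on a grid of sample points. In the sketch below, the function names and the grid (points $k/10$ in $[-10, 10]$) are illustrative choices; agreement on a finite grid is only a sanity check of the identity $g^{-1}(-1, 3) = E \cap (-5, 5)$.

```python
import math


def in_E(x):
    """Membership in the domain E = [1, infinity) of g."""
    return x >= 1.0


def g(x):
    """g(x) = 1 + sqrt(x - 1), defined only on E."""
    return 1.0 + math.sqrt(x - 1.0)


def in_preimage(x):
    """x lies in g^{-1}(-1, 3): x is in E and g(x) lies in (-1, 3)."""
    return in_E(x) and -1.0 < g(x) < 3.0


def in_E_cap_A(x):
    """x lies in E intersected with the open set A = (-5, 5)."""
    return in_E(x) and -5.0 < x < 5.0


grid = [k / 10 for k in range(-100, 101)]
agree = all(in_preimage(x) == in_E_cap_A(x) for x in grid)
```

The point 1 belongs to the preimage, while points just to the left of 1 do not, which is why no open ball centered at 1 fits inside $[1, 5)$.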
8.26 DEFINITION. Let $E \subseteq \mathbf{R}^n$.
(i) A set $U$ is said to be relatively open in $E$ if and only if there is an open set $A$ such that $U = E \cap A$.
(ii) A set $C$ is said to be relatively closed in $E$ if and only if there is a closed set $B$ such that $C = E \cap B$.

The paragraph that preceded this definition was a glimpse of Chapter 9. In this section, we shall use relatively open sets to introduce connectivity, a concept that generalizes to $\mathbf{R}^n$ an important property of intervals which played a role in the proof of the Intermediate Value Theorem, and which will be used several times in our development of the calculus of functions of several variables. First, we explore the analogy between relatively open sets and open sets.

8.27 Remark. Let $U \subseteq E \subseteq \mathbf{R}^n$.
(i) Then $U$ is relatively open in $E$ if and only if for each $\mathbf{a} \in U$ there is an $r > 0$ such that $B_r(\mathbf{a}) \cap E \subseteq U$.
(ii) If $E$ is open, then $U$ is relatively open in $E$ if and only if $U$ is (plain old vanilla) open (in the usual sense).

PROOF. (i) If $U$ is relatively open in $E$, then $U = E \cap A$ for some open set $A$. Let $\mathbf{a} \in U$. Since $A$ is open, there is an $r > 0$ such that $B_r(\mathbf{a}) \subseteq A$. Hence, $B_r(\mathbf{a}) \cap E \subseteq A \cap E = U$. Conversely, for each $\mathbf{a} \in U$ choose an $r(\mathbf{a}) > 0$ such that $B_{r(\mathbf{a})}(\mathbf{a}) \cap E \subseteq U$. Then $\bigcup_{\mathbf{a} \in U} B_{r(\mathbf{a})}(\mathbf{a}) \cap E \subseteq U$. Since the union is taken over all $\mathbf{a} \in U$, the reverse set inequality is also true. Thus $\bigcup_{\mathbf{a} \in U} B_{r(\mathbf{a})}(\mathbf{a}) \cap E = U$. Since the union of these open balls is open by Theorem 8.24, it follows that $U$ is relatively open in $E$.

(ii) Suppose that $U$ is relatively open in $E$. If $E$ and $A$ are open, then $U = E \cap A$ is open. Thus $U$ is open in the usual sense. Conversely, if $U$ is open, then $E \cap U = U$ is open. Thus every open subset of $E$ is relatively open in $E$. ∎
Next, we introduce connectivity.
8.28 DEFINITION. Let $E$ be a subset of $\mathbf{R}^n$.
(i) A pair of sets $U, V$ is said to separate $E$ if and only if $U$ and $V$ are nonempty, relatively open in $E$, $E = U \cup V$, and $U \cap V = \emptyset$.
(ii) $E$ is said to be connected if and only if $E$ cannot be separated by any pair of relatively open sets $U, V$.
Loosely speaking, a connected set is all in one piece, i.e., cannot be broken into smaller, nonempty, relatively open pieces which do not share any common points. The empty set is connected, since it can never be written as the union of nonempty sets. Every singleton $E = \{\mathbf{a}\}$ is also connected, since if $E = U \cup V$ where $U \cap V = \emptyset$ and both $U$ and $V$ are nonempty, then $E$ has at least two points. More complicated connected sets can be found in the exercises.

Notice that by Definitions 8.26 and 8.28, a set $E$ is not connected if there are open sets $A, B$ such that $E \cap A$, $E \cap B$ are nonempty, $E = (E \cap A) \cup (E \cap B)$, and $A \cap B = \emptyset$. Is this statement valid if we replace $E = (E \cap A) \cup (E \cap B)$ by $E \subseteq A \cup B$?

8.29 Remark. Let $E \subseteq \mathbf{R}^n$. If there exists a pair of open sets $A, B$ such that $E \cap A \ne \emptyset$, $E \cap B \ne \emptyset$, $E \subseteq A \cup B$, and $A \cap B = \emptyset$, then $E$ is not connected.

PROOF. Set $U = E \cap A$ and $V = E \cap B$. By hypothesis and Definition 8.26, $U$ and $V$ are relatively open in $E$ and nonempty. Since $U \cap V \subseteq A \cap B = \emptyset$, it suffices by Definition 8.28 to prove that $E = U \cup V$. But $E$ is a subset of $A \cup B$, so $E \subseteq U \cup V$. On the other hand, both $U$ and $V$ are subsets of $E$, so $E \supseteq U \cup V$. We conclude that $E = U \cup V$. ∎

(The converse of this result is also true, but harder to prove; see Theorem 8.38.) In practice, Remark 8.29 is often easier to apply than Definition 8.28. Here are several examples. The set $\mathbf{Q}$ is not connected: Set $A = (-\infty, \sqrt{2})$ and $B = (\sqrt{2}, \infty)$. The "bowtie set" $\{(x, y) : -1 \le x \le 1 \text{ and } -|x| < y < |x|\}$ is not connected (see Figure 8.6 on p. 251): Set $A = \{(x, y) : x < 0\}$ and $B = \{(x, y) : x > 0\}$.

Is there a simple description of all connected subsets of $\mathbf{R}$?

8.30 THEOREM. A subset $E$ of $\mathbf{R}$ is connected if and only if $E$ is an interval.

PROOF. Suppose that $E$ is a connected subset of $\mathbf{R}$. If $E$ is empty or if $E$ contains only one point $c$, then $E$ is one of the intervals $(c, c)$ or $[c, c]$. Suppose that $E$ contains at least two points. Set $a = \inf E$ and $b = \sup E$, and observe that $-\infty \le a < b \le \infty$. If $a \in E$ set $a_k = a$, and if $b \in E$ set $b_k = b$, $k \in \mathbf{N}$. Otherwise, use the Approximation Property to choose $a_k, b_k \in E$ such that $a_k \downarrow a$ and $b_k \uparrow b$ as $k \to \infty$. Notice that in all cases, $E$ contains each $[a_k, b_k]$. Indeed, if not, say that there is an $x \in [a_k, b_k] \setminus E$, then $a_k \in E \cap (-\infty, x)$, $b_k \in E \cap (x, \infty)$, and $E \subseteq (-\infty, x) \cup (x, \infty)$. Hence, by Remark 8.29, $E$ is not connected, a contradiction. Therefore, $E \supseteq [a_k, b_k]$ for all $k \in \mathbf{N}$. It follows from construction that
\[ E = \bigcup_{k=1}^{\infty} [a_k, b_k]. \]
Since this last union is either $(a, b)$, $[a, b)$, $(a, b]$, or $[a, b]$, we conclude that $E$ is an interval.

Conversely, suppose that $E$ is an interval which is not connected. Then there are sets $U, V$, relatively open in $E$, which separate $E$, i.e., $E = U \cup V$, $U \cap V = \emptyset$, and there exist points $x_1 \in E \cap U$ and $x_2 \in E \cap V$. We may suppose that $x_1 < x_2$. Since $x_1, x_2 \in E$ and $E$ is an interval, $I_0 := [x_1, x_2] \subseteq E$. Define $f$ on $I_0$ by
\[ f(x) = \begin{cases} 1 & x \in U \\ 0 & x \in V. \end{cases} \]
Since $U \cap V = \emptyset$, $f$ is well defined. We claim that $f$ is continuous on $I_0$. Indeed, fix $x_0 \in [x_1, x_2]$. Since $U \cup V = E \supseteq I_0$, it is evident that $x_0 \in U$ or $x_0 \in V$. We may suppose the former. Let $x_k \in I_0$ and suppose that $x_k \to x_0$ as $k \to \infty$. Since $U$ is relatively open, there is an $\varepsilon > 0$ such that $(x_0 - \varepsilon, x_0 + \varepsilon) \cap E \subseteq U$. Since $x_k \in E$ and $x_k \to x_0$, it follows that $x_k \in U$ for large $k$. Hence $f(x_k) = 1 = f(x_0)$ for large $k$. Therefore, $f$ is continuous at $x_0$ by the Sequential Characterization of Continuity.

We have proved that $f$ is continuous on $I_0$. Hence by the Intermediate Value Theorem (Theorem 3.29), $f$ must take on the value $1/2$ somewhere on $I_0$. This is a contradiction, since by construction, $f$ takes on only the values 0 or 1. ∎

We shall use this result later to prove that a real function is continuous on a closed, bounded interval if and only if its graph is closed and connected (see Theorem 9.51).
EXERCISES

1. Graph generic open balls in $\mathbf{R}^2$ with respect to each of the "non-Euclidean" norms $\|\cdot\|_1$ and $\|\cdot\|_\infty$. What shape are they?

2. Identify which of the following sets are open, which are closed, and which are neither. Sketch $E$ in each case. (a) $E = \{(x, y) : x^2 + 4y^2 \le 1\}$. (b) $E = \{(x, y) : x^2 - 2x + y^2 = 0\} \cup \{(x, 0) : x \in [2, 3]\}$. (c) $E = \{(x, y) : y \ge x^2, \ 0 \le y < 1\}$. (d) $E = \{(x, y) : x^2 - y^2 < 1, \ -1 < y < 1\}$.

3. Let $n \in \mathbf{N}$, let $\mathbf{a} \in \mathbf{R}^n$, let $s, r \in \mathbf{R}$ with $s < r$, and set
\[ V = \{ \mathbf{x} \in \mathbf{R}^n : s < \|\mathbf{x} - \mathbf{a}\| < r \} \quad\text{and}\quad E = \{ \mathbf{x} \in \mathbf{R}^n : s \le \|\mathbf{x} - \mathbf{a}\| \le r \}. \]
Prove that $V$ is open and $E$ is closed.

4. Let $V$ be a subset of $\mathbf{R}^n$. (a) Prove that $V$ is open if and only if there is a collection of open balls $\{B_\alpha : \alpha \in A\}$ such that
\[ V = \bigcup_{\alpha \in A} B_\alpha. \]
(b) What happens to this result when "open" is replaced by "closed"?

5. Show that if $E$ is closed in $\mathbf{R}^n$ and $\mathbf{a} \notin E$, then
\[ \inf_{\mathbf{x} \in E} \|\mathbf{x} - \mathbf{a}\| > 0. \]
6. (a) Sketch a graph of the set
\[ U := \{ (x, y) : x^2 + 2y^2 < 6, \ y \ge 0 \}, \]
and decide whether this set is relatively open or relatively closed in $E := \{(x, y) : y \ge 0\}$. Do the same for $E := \{(x, y) : x^2 + 2y^2 < 6\}$. Explain your answers. (b) Sketch a graph of the set
\[ U := \{ (x, y) : x^2 + y^2 \le 1, \ x^2 - 4x + y^2 + 2 < 0 \}, \]
and decide whether this set is relatively open or relatively closed in the closed ball $E$ centered at $(0, 0)$ of radius 1. Do the same for $E := B_{\sqrt{2}}(2, 0)$. Explain your answers.

7. (a) Let $a \le b$ and $c \le d$ be real numbers. Sketch a graph of the rectangle
\[ [a, b] \times [c, d] := \{ (x, y) : x \in [a, b], \ y \in [c, d] \}, \]
and decide whether this set is connected. Explain your answers. (b) Sketch a graph of the set
\[ B_1(-2, 0) \cup B_1(2, 0) \cup \{ (x, 0) : -1 < x < 1 \}, \]
and decide whether this set is connected. Explain your answers.

8. Suppose that $E \subseteq \mathbf{R}^n$ and $C$ is a subset of $E$. (a) Prove that if $E$ is closed, then $C$ is relatively closed in $E$ if and only if $C$ is (plain old vanilla) closed (in the usual sense). (b) Prove that $C$ is relatively closed in $E$ if and only if $E \setminus C$ is relatively open in $E$.

9. Suppose that $\{E_\alpha\}_{\alpha \in A}$ is a collection of connected sets in a Euclidean space $\mathbf{R}^n$ such that $\bigcap_{\alpha \in A} E_\alpha \ne \emptyset$. Prove that
\[ E = \bigcup_{\alpha \in A} E_\alpha \]
is connected.

10. Prove that the intersection of connected sets in $\mathbf{R}$ is connected. Show that this is false if "$\mathbf{R}$" is replaced by "$\mathbf{R}^2$."
8.4 INTERIOR, CLOSURE, AND BOUNDARY

To prove that every set contains a largest open set and is contained in a smallest closed set, we introduce the following topological operations.
8.31 DEFINITION. Let $E$ be a subset of a Euclidean space $\mathbf{R}^n$.
(i) The interior of $E$ is the set
\[ E^o := \bigcup \{ V : V \subseteq E \text{ and } V \text{ is open in } \mathbf{R}^n \}. \]
(ii) The closure of $E$ is the set
\[ \overline{E} := \bigcap \{ B : B \supseteq E \text{ and } B \text{ is closed in } \mathbf{R}^n \}. \]

Notice that every set $E$ contains the open set $\emptyset$ and is contained in the closed set $\mathbf{R}^n$. Hence, the sets $E^o$ and $\overline{E}$ are well-defined. Also notice that by Theorem 8.24, the interior of a set is always open and the closure of a set is always closed. The following result shows that $E^o$ is the largest open set contained in $E$, and $\overline{E}$ is the smallest closed set that contains $E$.

8.32 THEOREM. Let $E \subseteq \mathbf{R}^n$. Then
(i) $E^o \subseteq E \subseteq \overline{E}$,
(ii) if $V$ is open and $V \subseteq E$ then $V \subseteq E^o$, and
(iii) if $C$ is closed and $C \supseteq E$ then $C \supseteq \overline{E}$.

PROOF. Since every open set $V$ in the union defining $E^o$ is a subset of $E$, it is clear that the union of these $V$'s is a subset of $E$. Thus $E^o \subseteq E$. A similar argument establishes $E \subseteq \overline{E}$. This proves (i). By Definition 8.31, if $V$ is an open subset of $E$, then $V \subseteq E^o$ and if $C$ is a closed set containing $E$, then $\overline{E} \subseteq C$. This proves (ii) and (iii). ∎

In particular, the interior of a bounded interval with endpoints $a$ and $b$ is $(a, b)$, and its closure is $[a, b]$. In fact, it is evident by parts (ii) and (iii) that $E = E^o$ if and only if $E$ is open, and $E = \overline{E}$ if and only if $E$ is closed. We shall use this observation many times below. Let us examine these concepts in the concrete setting $\mathbf{R}^2$.
8.33 Examples. (i) Find the interior and closure of the set $E = \{(x, y) : -1 \le x \le 1 \text{ and } -|x| < y < |x|\}$. (ii) Find the interior and closure of the set $E = B_1(-2, 0) \cup B_1(2, 0) \cup \{(x, 0) : -1 \le x \le 1\}$.

SOLUTION. (i) Graph $y = |x|$ and $x = \pm 1$, and observe that $E$ is a bow-tie-shaped region with "solid" vertical edges (see Figure 8.6). Now, by Definition 8.20, any open set in $\mathbf{R}^2$ must contain a disk around each of its points. Since $E^o$ is the largest open set inside $E$, it is clear that
\[ E^o = \{ (x, y) : -1 < x < 1 \text{ and } -|x| < y < |x| \}. \]
Similarly,
\[ \overline{E} = \{ (x, y) : -1 \le x \le 1 \text{ and } -|x| \le y \le |x| \}. \]

[Figure 8.6: the bow-tie region $E$, bounded by the lines $y = x$ and $y = -x$.]

(ii) Draw a graph of this region. It turns out to be "dumbbell shaped": two open disks joined by a straight line. Thus $E^o = B_1(-2, 0) \cup B_1(2, 0)$, and
\[ \overline{E} = \overline{B_1(-2, 0)} \cup \overline{B_1(2, 0)} \cup \{ (x, 0) : -1 \le x \le 1 \}. \; ∎ \]
These examples illustrate the fact that the interior of a nice enough set $E$ in $\mathbf{R}^2$ can be obtained by removing all its "edges," and the closure of $E$ by adding all its "edges."

One of the most important results from one-dimensional calculus is the Fundamental Theorem of Calculus. It states that the behavior of a derivative $f'$ on an interval $[a, b]$, as measured by its integral, is determined by the values of $f$ at the endpoints of $[a, b]$. What shall we use for "endpoints" of an arbitrary set in $\mathbf{R}^n$? Notice that the endpoints $a, b$ are the only points that lie near both $[a, b]$ and the complement of $[a, b]$. Using this as a cue, we introduce the following concept.

8.34 DEFINITION. Let $E \subseteq \mathbf{R}^n$. The boundary of $E$ is the set
\[ \partial E := \{ \mathbf{x} \in \mathbf{R}^n : \text{for all } r > 0, \ B_r(\mathbf{x}) \cap E \ne \emptyset \text{ and } B_r(\mathbf{x}) \cap E^c \ne \emptyset \}. \]
[We will refer to the last two conditions in the definition of $\partial E$ by saying that $B_r(\mathbf{x})$ intersects $E$ and $E^c$.]

8.35 Example. Describe the boundary of the set
\[ E = \{ (x, y) : x^2 + y^2 \le 9 \text{ and } (x - 1)(y + 2) > 0 \}. \]

SOLUTION. Graph the relations $x^2 + y^2 = 9$ and $(x - 1)(y + 2) = 0$ to see that $E$ is a region with solid curved edges and dashed straight edges (see Figure 8.7). By definition, then, the boundary of $E$ is the union of these curved and straight edges (all made solid). Rather than describing $\partial E$ analytically (which would involve solving for the intersection points of the straight lines $x = 1$, $y = -2$, and the circle $x^2 + y^2 = 9$), it is easier to describe $\partial E$ by using set algebra:
\[ \partial E = \{ (x, y) : x^2 + y^2 \le 9 \text{ and } (x - 1)(y + 2) \ge 0 \} \setminus \{ (x, y) : x^2 + y^2 < 9 \text{ and } (x - 1)(y + 2) > 0 \}. \; ∎ \]

[Figure 8.7: the region $E$ of Example 8.35, with the values $3$ and $-3$ marked on the axes.]

It turns out that set algebra can be used to describe the boundary of any set.
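The set-algebra description in Example 8.35 translates directly into a membership test. The sketch below encodes the two sets appearing in that description and takes their difference; the function names are chosen here for illustration, and the sample points are arbitrary.

```python
def in_closed_version(x, y):
    """x^2 + y^2 <= 9 and (x - 1)(y + 2) >= 0 (all edges made solid)."""
    return x * x + y * y <= 9 and (x - 1) * (y + 2) >= 0


def in_open_version(x, y):
    """x^2 + y^2 < 9 and (x - 1)(y + 2) > 0 (all edges removed)."""
    return x * x + y * y < 9 and (x - 1) * (y + 2) > 0


def on_boundary(x, y):
    """Membership in the set difference describing the boundary of E."""
    return in_closed_version(x, y) and not in_open_version(x, y)


on_circle = on_boundary(3, 0)      # on the circle x^2 + y^2 = 9
on_line = on_boundary(1, 0)        # on the line x = 1
inside = on_boundary(2, 0)         # strictly inside E
outside = on_boundary(0, 0)        # (x - 1)(y + 2) < 0 here
```

Points on the circular arc or on the straight edges are reported as boundary points, while points strictly inside $E$, or in a region where $(x - 1)(y + 2) < 0$, are not.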
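8.36 THEOREM. Let $E \subseteq \mathbf{R}^n$. Then $\partial E = \overline{E} \setminus E^o$.

PROOF. By Definition 8.34, it suffices to show that
\[ \mathbf{x} \in \overline{E} \text{ if and only if } B_r(\mathbf{x}) \cap E \ne \emptyset \text{ for all } r > 0, \tag{10} \]
and
\[ \mathbf{x} \notin E^o \text{ if and only if } B_r(\mathbf{x}) \cap E^c \ne \emptyset \text{ for all } r > 0. \tag{11} \]
We will provide the details for (10) and leave the proof of (11) as an exercise.

Suppose that $\mathbf{x} \in \overline{E}$ but $B_{r_0}(\mathbf{x}) \cap E = \emptyset$ for some $r_0 > 0$. Then $(B_{r_0}(\mathbf{x}))^c$ is a closed set that contains $E$; hence, by Theorem 8.32iii, $\overline{E} \subseteq (B_{r_0}(\mathbf{x}))^c$. It follows that $\overline{E} \cap B_{r_0}(\mathbf{x}) = \emptyset$; in particular, $\mathbf{x} \notin \overline{E}$, a contradiction.

Conversely, suppose that $\mathbf{x} \notin \overline{E}$. Since $(\overline{E})^c$ is open, there is an $r_0 > 0$ such that $B_{r_0}(\mathbf{x}) \subseteq (\overline{E})^c$. In particular, $\emptyset = B_{r_0}(\mathbf{x}) \cap \overline{E} \supseteq B_{r_0}(\mathbf{x}) \cap E$ for some $r_0 > 0$. ∎

We have introduced topological operations (interior, closure, and boundary). The following result answers the question: How do these operations interact with the set operations (union and intersection)?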
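8.37 THEOREM. Let $A, B \subseteq \mathbf{R}^n$. Then
(i) $(A \cup B)^o \supseteq A^o \cup B^o$, $\quad (A \cap B)^o = A^o \cap B^o$,
(ii) $\overline{A \cup B} = \overline{A} \cup \overline{B}$, $\quad \overline{A \cap B} \subseteq \overline{A} \cap \overline{B}$,
(iii) $\partial(A \cup B) \subseteq \partial A \cup \partial B$, and $\partial(A \cap B) \subseteq \partial A \cup \partial B$.

PROOF. (i) Since the union of two open sets is open, $A^o \cup B^o$ is an open subset of $A \cup B$. Hence, by Theorem 8.32ii, $A^o \cup B^o \subseteq (A \cup B)^o$. Similarly, $(A \cap B)^o \supseteq A^o \cap B^o$. On the other hand, if $V \subseteq A \cap B$, then $V \subseteq A$ and $V \subseteq B$. Thus $(A \cap B)^o \subseteq A^o \cap B^o$.

(ii) Since $\overline{A} \cup \overline{B}$ is closed and contains $A \cup B$, it is clear that, by Theorem 8.32iii, $\overline{A \cup B} \subseteq \overline{A} \cup \overline{B}$. Similarly, $\overline{A \cap B} \subseteq \overline{A} \cap \overline{B}$. To prove the reverse inequality for union, suppose that $\mathbf{x} \notin \overline{A \cup B}$. Then, by Definition 8.31, there is a closed set $E$ that contains $A \cup B$ such that $\mathbf{x} \notin E$. Since $E$ contains both $A$ and $B$, it follows that $\mathbf{x} \notin \overline{A}$ and $\mathbf{x} \notin \overline{B}$. This proves part (ii).

(iii) Let $\mathbf{x} \in \partial(A \cup B)$; i.e., suppose that $B_r(\mathbf{x})$ intersects $A \cup B$ and $(A \cup B)^c$ for all $r > 0$. Since $(A \cup B)^c = A^c \cap B^c$, it follows that $B_r(\mathbf{x})$ intersects both $A^c$ and $B^c$ for all $r > 0$. Thus $B_r(\mathbf{x})$ intersects $A$ and $A^c$ for all $r > 0$, or $B_r(\mathbf{x})$ intersects $B$ and $B^c$ for all $r > 0$; i.e., $\mathbf{x} \in \partial A \cup \partial B$. This proves the first set inequality in part (iii). A similar argument establishes the second inequality in part (iii). ∎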
The second inequality in part (iii) can be improved (see Exercise 10). Finally, we note (Exercise 11) that relatively open sets in E can be divided into two kinds: those inside EO, that contain none of their boundary, and those which intersect 8E, which contain only that part of their boundary which intersects 8E. (See Figures 15.3 and 15.4, p. 566, for an illustration of both types.)
We close this section by showing that the converse of Remark 8.30 is also true. This result is optional because we do not use it anywhere else.

*8.38 THEOREM. Let $E \subseteq \mathbf{R}^n$. If there exist nonempty, relatively open sets $U, V$ which separate $E$, then there is a pair of open sets $A, B$ such that $A \cap E \neq \emptyset$, $B \cap E \neq \emptyset$, $A \cap B = \emptyset$, and $E \subseteq A \cup B$.

PROOF. We first show that
$$\overline{U} \cap V = \emptyset. \tag{9}$$
Indeed, since $V$ is relatively open in $E$, there is a set $\Omega$, open in $\mathbf{R}^n$, such that $V = E \cap \Omega$. Since $U \cap V = \emptyset$, it follows that $U \subset \Omega^c$. This last set is closed in $\mathbf{R}^n$. Therefore,
$$\overline{U} \subseteq \overline{\Omega^c} = \Omega^c;$$
i.e., (9) holds.
Next, we use (9) to construct the open set $B$. Set
$$\delta_x := \inf\{\|x - u\| : u \in U\}, \quad x \in V, \qquad\text{and}\qquad B = \bigcup_{x \in V} B_{\delta_x/2}(x).$$
Clearly, $B$ is open in $\mathbf{R}^n$. Since $\delta_x > 0$ for each $x \notin \overline{U}$ (see Exercise 5), $B$ contains $V$, hence $B \cap E \supseteq V$. The reverse inequality also holds, since by construction $B \cap U = \emptyset$ and by hypothesis $E = U \cup V$. Therefore, $B \cap E = V$. Similarly, we can construct an open set $A$ such that $A \cap E = U$ by setting
$$\varepsilon_y := \inf\{\|v - y\| : v \in V\}, \quad y \in U, \qquad\text{and}\qquad A = \bigcup_{y \in U} B_{\varepsilon_y/2}(y).$$
In particular, $A$ and $B$ are nonempty open sets that satisfy $E \subseteq A \cup B$. It remains to prove that $A \cap B = \emptyset$. Suppose, to the contrary, that there is a point $a \in A \cap B$. Then $a \in B_{\delta_x/2}(x)$ for some $x \in V$ and $a \in B_{\varepsilon_y/2}(y)$ for some $y \in U$. We may suppose that $\delta_x \le \varepsilon_y$. Then
$$\|x - y\| \le \|x - a\| + \|a - y\| < \frac{\delta_x}{2} + \frac{\varepsilon_y}{2} \le \varepsilon_y.$$
Therefore, $\|x - y\| < \inf\{\|v - y\| : v \in V\}$. Since $x \in V$, this is impossible. We conclude that $A \cap B = \emptyset$. ∎
EXERCISES
1. Find the interior, closure, and boundary of each of the following subsets of $\mathbf{R}$.
(a) $[a, b)$ where $a < b$.
(b) $E = \{1/n : n \in \mathbf{N}\}$.
(c) $E = \bigcup_{n=1}^{\infty} \left(\frac{1}{n+1}, \frac{1}{n}\right)$.
(d) $E = \bigcup_{n=1}^{\infty} (-n, n)$.
2. For each of the following sets, sketch $E^o$, $\overline{E}$, and $\partial E$.
(a) $E = \{(x, y) : x^2 + 4y^2 \le 1\}$.
(b) $E = \{(x, y) : x^2 - 2x + y^2 = 0\} \cup \{(x, 0) : x \in [2, 3]\}$.
(c) $E = \{(x, y) : y \ge x^2,\ 0 \le y < 1\}$.
(d) E = {(x, y) : x^2 - y^2 < 1, -1
3. This exercise is used in Section 12.1. If $A \subseteq B \subseteq \mathbf{R}^n$, prove that $\overline{A} \subseteq \overline{B}$ and $A^o \subseteq B^o$.
4. Let $E$ be a subset of $\mathbf{R}^n$.
(a) Prove that every subset $A \subseteq E$ contains a set $B$ that is the largest subset of $A$ which is relatively open in $E$.
(b) Prove that every subset $A \subseteq E$ is contained in a set $B$ that is the smallest closed set containing $A$ that is relatively closed in $E$.
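5. Complete the proof of Theorem 8.36 by verifying (11).
6. Prove that if $E \subseteq \mathbf{R}$ is connected, then $E^o$ is also connected. Show that this is false if "$\mathbf{R}$" is replaced by "$\mathbf{R}^2$."
7. Suppose that $E \subset \mathbf{R}^n$ is connected and $E \subseteq A \subseteq \overline{E}$. Prove that $A$ is connected.
8. A set $A$ is called clopen if and only if it is both open and closed.
(a) Prove that every Euclidean space has at least two clopen sets.
(b) Prove that a proper subset $E$ of $\mathbf{R}^n$ is connected if and only if it contains exactly two relatively clopen sets.
(c) Prove that every nonempty proper subset of $\mathbf{R}^n$ has a nonempty boundary.
9. Show that Theorem 8.37 is best possible in the following sense.
(a) There exist sets $A, B$ in $\mathbf{R}$ such that $(A \cup B)^o \neq A^o \cup B^o$.
(b) There exist sets $A, B$ in $\mathbf{R}$ such that $\overline{A \cap B} \neq \overline{A} \cap \overline{B}$.
(c) There exist sets $A, B$ in $\mathbf{R}$ such that $\partial(A \cup B) \neq \partial A \cup \partial B$ and $\partial(A \cap B) \neq \partial A \cup \partial B$.
10. Let $A$ and $B$ be subsets of $\mathbf{R}^n$.
(a) Show that $\partial(A \cap B) \cap (A^c \cup (\partial B)^c) \subseteq \partial A$.
(b) Show that if $x \in \partial(A \cap B)$ and $x \notin (A \cap \partial B) \cup (B \cap \partial A)$, then $x \in \partial A \cap \partial B$.
(c) Prove that $\partial(A \cap B) \subseteq (A \cap \partial B) \cup (B \cap \partial A) \cup (\partial A \cap \partial B)$.
(d) Show that even in $\mathbf{R}$, there exist sets $A$ and $B$ such that $\partial(A \cap B) \neq (A \cap \partial B) \cup (B \cap \partial A) \cup (\partial A \cap \partial B)$.
11. Let $E \subseteq \mathbf{R}^n$ and $U$ be relatively open in $E$.
(a) If $U \subseteq E^o$, then $U \cap \partial U = \emptyset$.
(b) If $U \cap \partial E \neq \emptyset$, then $U \cap \partial U = U \cap \partial E$.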
Chapter 9
Convergence
in Rn
In this chapter we generalize the concepts of limits and continuity from $\mathbf{R}$ to $\mathbf{R}^n$. We begin, as we did in Chapter 2, with sequences.

9.1 LIMITS OF SEQUENCES

Using the analogy between norms and the absolute value, we can define what it means for a sequence in $\mathbf{R}^n$ to be convergent, bounded, or Cauchy in the following way.

9.1 DEFINITION. Let $\{x_k\}$ be a sequence of points in $\mathbf{R}^n$.
(i) $\{x_k\}$ is said to converge to some point $a \in \mathbf{R}^n$ (called the limit of $x_k$) if and only if for every $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that
$$k \ge N \quad\text{implies}\quad \|x_k - a\| < \varepsilon.$$
(ii) $\{x_k\}$ is said to be bounded if and only if there is an $M > 0$ such that $\|x_k\| \le M$ for all $k \in \mathbf{N}$.
(iii) $\{x_k\}$ is said to be Cauchy if and only if for every $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that
$$k, m \ge N \quad\text{imply}\quad \|x_k - x_m\| < \varepsilon.$$
The following result shows that to evaluate the limit of a specific sequence in Rn we need only take the limits of the component sequences.
9.2 THEOREM. Let $a := (a(1), \ldots, a(n))$ and $x_k := (x_k(1), \ldots, x_k(n))$ belong to $\mathbf{R}^n$ for $k \in \mathbf{N}$. Then $x_k \to a$, as $k \to \infty$, if and only if the component sequences $x_k(j) \to a(j)$, as $k \to \infty$, for all $j = 1, 2, \ldots, n$.

PROOF. Fix $j \in \{1, \ldots, n\}$. By Remark 8.7,
$$|x_k(j) - a(j)| \le \|x_k - a\| \le \sqrt{n} \max_{1 \le \ell \le n} |x_k(\ell) - a(\ell)|.$$
Hence, by the Squeeze Theorem, $x_k(j) \to a(j)$ as $k \to \infty$ for all $1 \le j \le n$ if and only if the real sequence $\|x_k - a\| \to 0$ as $k \to \infty$. Since $\|x_k - a\| \to 0$ if and only if $x_k \to a$, as $k \to \infty$, the proof of the theorem is complete. ∎

This result can be used to obtain the following analogue of the Density of Rationals (Theorem 1.24). It uses the notation $\mathbf{Q}^n := \{x \in \mathbf{R}^n : x_j \in \mathbf{Q} \text{ for } j = 1, 2, \ldots, n\}$.
9.3 THEOREM. For each $a \in \mathbf{R}^n$ there is a sequence $x_k \in \mathbf{Q}^n$ such that $x_k \to a$ as $k \to \infty$.

PROOF. Let $a := (a_1, \ldots, a_n) \in \mathbf{R}^n$. For each $1 \le j \le n$, choose by Theorem 1.24 sequences $r_k^{(j)} \in \mathbf{Q}$ such that $r_k^{(j)} \to a_j$ (in $\mathbf{R}$) as $k \to \infty$. By Theorem 9.2, $x_k := (r_k^{(1)}, \ldots, r_k^{(n)})$ converges to $a$ (in $\mathbf{R}^n$) as $k \to \infty$. Moreover, $x_k \in \mathbf{Q}^n$ for each $k \in \mathbf{N}$. ∎
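The construction in this proof can be tried numerically. The sketch below is only an illustration, not part of the text's argument: the helper names `rational_approximation` and `euclidean_distance` are hypothetical, and the rational sequence $\lfloor k a_j \rfloor / k$ is one convenient choice among many. Each coordinate error is less than $1/k$, so the Euclidean distance is less than $\sqrt{n}/k$, in line with Theorem 9.2.

```python
from fractions import Fraction
import math

def rational_approximation(a, k):
    """Return a point of Q^n whose j-th coordinate is floor(k * a_j) / k.

    Each coordinate differs from a_j by less than 1/k (up to float rounding).
    """
    return tuple(Fraction(math.floor(k * aj), k) for aj in a)

def euclidean_distance(x, a):
    """Euclidean norm of x - a, computed in floating point."""
    return math.sqrt(sum((float(xj) - aj) ** 2 for xj, aj in zip(x, a)))

# A sample point of R^3 with irrational coordinates (an arbitrary choice).
a = (math.sqrt(2), math.pi, -math.e)
for k in (1, 10, 100, 1000, 10000):
    xk = rational_approximation(a, k)
    print(k, euclidean_distance(xk, a))
```

The printed distances shrink roughly like $1/k$, which is what componentwise convergence predicts.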
A set E is said to be separable if and only if there is an at most countable subset Z of E such that to each a E E there corresponds a sequence Xk E Z such that Xk ---; a as k ---; 00. Since Qn is countable (just iterate Theorem 1.38i), it follows from Theorem 9.3 that R n is separable. Theorem 9.3 illustrates a general principle. As long as we stay away from results about monotone sequences (which have no analogue in Rn when n > 1), we can extend most of the results found in Chapter 2 from R to R n. Since the proofs of these results require little more than replacing Ix - yl in the real case by Ilx - yll in the vector case, we will summarize what is true and leave most of the details to the reader.
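9.4 THEOREM. Let $n \in \mathbf{N}$.
(i) A sequence in $\mathbf{R}^n$ can have at most one limit.
(ii) If $\{x_k\}_{k \in \mathbf{N}}$ is a sequence in $\mathbf{R}^n$ that converges to $a$ and $\{x_{k_j}\}_{j \in \mathbf{N}}$ is any subsequence of $\{x_k\}_{k \in \mathbf{N}}$, then $x_{k_j}$ converges to $a$ as $j \to \infty$.
(iii) Every convergent sequence in $\mathbf{R}^n$ is bounded, but not conversely.
(iv) Every convergent sequence in $\mathbf{R}^n$ is Cauchy.
(v) If $\{x_k\}$ and $\{y_k\}$ are convergent sequences in $\mathbf{R}^n$ and $\alpha \in \mathbf{R}$, then
$$\lim_{k \to \infty} (x_k + y_k) = \lim_{k \to \infty} x_k + \lim_{k \to \infty} y_k, \qquad \lim_{k \to \infty} (\alpha x_k) = \alpha \lim_{k \to \infty} x_k,$$
and
$$\lim_{k \to \infty} (x_k \cdot y_k) = \left(\lim_{k \to \infty} x_k\right) \cdot \left(\lim_{k \to \infty} y_k\right).$$
Moreover, when $n = 3$,
$$\lim_{k \to \infty} (x_k \times y_k) = \left(\lim_{k \to \infty} x_k\right) \times \left(\lim_{k \to \infty} y_k\right).$$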
Notice once and for all that (since $\|x_k\|^2 = x_k \cdot x_k$) the penultimate equation above contains the following corollary. If $x_k$ converges, then
$$\lim_{k \to \infty} \|x_k\| = \left\| \lim_{k \to \infty} x_k \right\|.$$
As in the real case, the converse of part (iv) is also true. In order to prove that, we need an n-dimensional version of the Bolzano-Weierstrass Theorem.
9.5 THEOREM [BOLZANO-WEIERSTRASS THEOREM FOR $\mathbf{R}^n$]. Every bounded sequence in $\mathbf{R}^n$ has a convergent subsequence.

PROOF. Suppose that $\{x_k\}$ is bounded in $\mathbf{R}^n$. For each $j \in \{1, \ldots, n\}$, let $x_k(j)$ represent the $j$th component of the vector $x_k$. By hypothesis, the sequence $\{x_k(j)\}_{k \in \mathbf{N}}$ is bounded in $\mathbf{R}$ for each $j = 1, 2, \ldots, n$.

Let $j = 1$. By the one-dimensional Bolzano-Weierstrass Theorem, there is a sequence of integers $1 \le k(1,1) < k(1,2) < \cdots$ and a number $x(1)$ such that $x_{k(1,\nu)}(1) \to x(1)$ as $\nu \to \infty$.

Let $j = 2$. Again, since the sequence $\{x_{k(1,\nu)}(2)\}_{\nu \in \mathbf{N}}$ is bounded in $\mathbf{R}$, there is a subsequence $\{k(2,\nu)\}_{\nu \in \mathbf{N}}$ of $\{k(1,\nu)\}_{\nu \in \mathbf{N}}$ and a number $x(2)$ such that $x_{k(2,\nu)}(2) \to x(2)$ as $\nu \to \infty$. Since $\{k(2,\nu)\}_{\nu \in \mathbf{N}}$ is a subsequence of $\{k(1,\nu)\}_{\nu \in \mathbf{N}}$, we also have $x_{k(2,\nu)}(1) \to x(1)$ as $\nu \to \infty$. Thus, $x_{k(2,\nu)}(\ell) \to x(\ell)$ as $\nu \to \infty$ for all $1 \le \ell \le j = 2$.

Continuing this process until $j = n$, we choose a subsequence $k_\nu = k(n, \nu)$ and points $x(\ell)$ such that
$$\lim_{\nu \to \infty} x_{k_\nu}(\ell) = x(\ell)$$
for $1 \le \ell \le j = n$. Set $x = (x(1), x(2), \ldots, x(n))$. Then by Theorem 9.2, $x_{k_\nu}$ converges to $x$ as $\nu \to \infty$. ∎
Since the Bolzano-Weierstrass Theorem holds for $\mathbf{R}^n$, we can modify the proof of Theorem 2.29 to establish the following result.

9.6 THEOREM. A sequence $\{x_k\}$ in $\mathbf{R}^n$ is Cauchy if and only if it converges.

Thus sequences in $\mathbf{R}^n$ behave pretty much the same as sequences in $\mathbf{R}$. We now turn our attention to something new. How does the limit of sequences interact with the topological structure of $\mathbf{R}^n$? The answer to this question contains a surprising bonus. The $\varepsilon$'s begin to disappear from the theory.

9.7 THEOREM. Let $x_k \in \mathbf{R}^n$. Then $x_k \to a$ as $k \to \infty$ if and only if for every open set $V$ that contains $a$ there is an $N \in \mathbf{N}$ such that $k \ge N$ implies $x_k \in V$.

PROOF. Suppose that $x_k \to a$ and let $V$ be an open set that contains $a$. By Definition 8.20, there is an $\varepsilon > 0$ such that $B_\varepsilon(a) \subseteq V$. Given this $\varepsilon$, use Definition 9.1 to choose an $N \in \mathbf{N}$ such that $k \ge N$ implies $x_k \in B_\varepsilon(a)$. By the choice of $\varepsilon$, $x_k \in V$ for all $k \ge N$.
Conversely, let $\varepsilon > 0$ and set $V = B_\varepsilon(a)$. Then $V$ is an open set that contains $a$, hence by hypothesis, there is an $N \in \mathbf{N}$ such that $k \ge N$ implies $x_k \in V$. In particular, $\|x_k - a\| < \varepsilon$ for all $k \ge N$. ∎
This is a first step toward developing a "distance-less" theory of convergence. The next result, which we shall use many times, shows that convergent sequences characterize closed sets.
9.8 THEOREM. Let $E \subseteq \mathbf{R}^n$. Then $E$ is closed if and only if $E$ contains all its limit points; i.e., $x_k \in E$ and $x_k \to x$ imply $x \in E$.

PROOF. The theorem is vacuously satisfied if $E$ is the empty set.
Suppose that $E \neq \emptyset$ is closed but some sequence $x_k \in E$ converges to a point $x \in E^c$. Since $E$ is closed, $E^c$ is open. Thus, by Theorem 9.7, there is an $N \in \mathbf{N}$ such that $k \ge N$ implies $x_k \in E^c$, a contradiction.
Conversely, suppose that $E$ is a nonempty set that contains all its limit points. If $E$ is not closed, then by Remark 8.23, $E \neq \mathbf{R}^n$ and by definition, $E^c$ is nonempty and not open. Thus, there is at least one point $x \in E^c$ such that no ball $B_r(x)$ is contained in $E^c$. Let $x_k \in B_{1/k}(x) \cap E$ for $k = 1, 2, \ldots$. Then $x_k \in E$ and $\|x_k - x\| < 1/k$ for all $k \in \mathbf{N}$. Now by the Squeeze Theorem, $\|x_k - x\| \to 0$; i.e., $x_k \to x$ as $k \to \infty$. Thus, by hypothesis, $x \in E$, a contradiction. ∎

To set the stage for the next two results, we introduce the following concepts. (For a more complete treatment, see Section 9.4.)
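9.9 DEFINITION. Let $E$ be a subset of $\mathbf{R}^n$.
(i) An open covering of $E$ is a collection of sets $\{V_\alpha\}_{\alpha \in A}$ such that each $V_\alpha$ is open and
$$E \subseteq \bigcup_{\alpha \in A} V_\alpha.$$
(ii) The set $E$ is said to be compact if and only if every open covering of $E$ has a finite subcovering; i.e., if $\{V_\alpha\}_{\alpha \in A}$ is an open covering of $E$, then there is a finite subset $A_0$ of $A$ such that
$$E \subseteq \bigcup_{\alpha \in A_0} V_\alpha.$$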
Convergent sequences and topology form a potent mixture, as we shall now demonstrate by using the two previous results to prove the following "covering" lemma. It is difficult to overestimate the usefulness of this powerful result, which allows us to extend local results to global ones in an almost effortless manner (e.g., see Theorems 9.24 and 12.46 and Exercise 7).

9.10 Lemma [BOREL COVERING LEMMA]. Let $E$ be a closed, bounded subset of $\mathbf{R}^n$. If $r$ is any function from $E$ into $(0, \infty)$, then there exist finitely many points $y_1, \ldots, y_N \in E$ such that
$$E \subseteq \bigcup_{j=1}^{N} B_{r(y_j)}(y_j).$$
STRATEGY: Since $r(y) > 0$ and $y \in B_{r(y)}(y)$ for each $y \in E$, it is clear that $\{B_{r(y)}(y)\}_{y \in E}$ is an open covering of $E$. By moving the centers a little bit, we might be able to make the same statement with $y \in E \cap \mathbf{Q}^n$ in place of $y \in E$. Since $\mathbf{Q}^n$ is countable (see Theorem 1.38a and Remark 1.39), it would follow that there exist $y_j \in E \cap \mathbf{Q}^n$ such that
$$E \subseteq \bigcup_{j=1}^{\infty} B_{r(y_j)}(y_j).$$
Hence, if the covering lemma is false, then there exist $x_k \in E$ such that $x_k \notin \bigcup_{j=1}^{k} B_{r(y_j)}(y_j)$ for $k = 1, 2, \ldots$. Since $E$ is closed and bounded, it follows from the Bolzano-Weierstrass Theorem and Theorem 9.8 that some subsequence $x_{k_\nu}$ converges to a point $x \in E$ as $\nu \to \infty$. Since $E$ is a subset of the union of balls $B_{r(y_j)}(y_j)$, this $x$ must belong to some $B_{r(y_{j_0})}(y_{j_0})$. Hence by Theorem 9.7, $x_{k_\nu} \in B_{r(y_{j_0})}(y_{j_0})$ for large $\nu$. But this contradicts the fact that if $k \ge j$, then $x_k \notin B_{r(y_j)}(y_j)$. Here are the details.

PROOF. Step 1: Change the centers. Fix $y_0 \in E$. By Theorem 9.3, choose $a \in \mathbf{Q}^n$ and $\rho := \rho(y_0, a)$ such that $\|y_0 - a\| < r(y_0)/4$ and $r(y_0)/4 < \rho < r(y_0)/2$. Since $\|y_0 - a\| < r(y_0)/4 < \rho$, we have $y_0 \in B_\rho(a)$. On the other hand, $y \in B_\rho(a)$ implies $\|y_0 - y\| \le \|y_0 - a\| + \|a - y\| < \rho + \rho < r(y_0)$, i.e., $B_\rho(a) \subset B_{r(y_0)}(y_0)$.
Step 2: Construct the sequence. We just proved that to each $y_0 \in E$ there correspond $a \in \mathbf{Q}^n$ and $\rho(y_0, a) \in \mathbf{Q}$ such that $y_0 \in B_{\rho(y_0, a)}(a) \subset B_{r(y_0)}(y_0)$. Since $\mathbf{Q}$ and $\mathbf{Q}^n$ are countable, it follows that there exist $a_j \in \mathbf{Q}^n$ and $\rho_j \in \mathbf{Q}$ such that
$$E \subseteq \bigcup_{j=1}^{\infty} B_{\rho_j}(a_j).$$
Suppose for a moment that $E$ is not a subset of any of the finite unions $\bigcup_{j=1}^{k} B_{\rho_j}(a_j)$, $k \in \mathbf{N}$. For each $k$, choose $x_k \in E \setminus \bigcup_{j=1}^{k} B_{\rho_j}(a_j)$. By Theorems 9.5, 9.8, and 9.7 there is a subsequence $x_{k_\nu}$ and an index $j_0$ such that $x_{k_\nu} \in B_{\rho_{j_0}}(a_{j_0})$ for $\nu$ large.
But by construction, if $k_\nu > j_0$, then $x_{k_\nu} \notin \bigcup_{j=1}^{k_\nu} B_{\rho_j}(a_j)$; in particular, $x_{k_\nu}$ cannot belong to $B_{\rho_{j_0}}(a_{j_0})$ for large $\nu$. This contradiction proves that there is an $N \in \mathbf{N}$ such that
$$E \subseteq \bigcup_{j=1}^{N} B_{\rho_j}(a_j).$$
Step 3: Finish the proof. By Step 1, given $j \in \mathbf{N}$ there is a point in $E$, say $y_j$, such that $B_{\rho_j}(a_j) \subset B_{r(y_j)}(y_j)$. We conclude by Step 2 that
$$E \subseteq \bigcup_{j=1}^{N} B_{\rho_j}(a_j) \subset \bigcup_{j=1}^{N} B_{r(y_j)}(y_j).$$
∎
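To get a feel for the conclusion of the lemma, here is a minimal sketch in the simplest setting $E = [0, 1] \subset \mathbf{R}$. It is not the proof's construction: the function names and the sample radius function are hypothetical, and the greedy selection below only terminates because this particular $r$ is bounded below by a positive constant. The lemma itself needs no such assumption, which is exactly where closedness and boundedness do the work.

```python
def finite_subcover(r, left=0.0, right=1.0, max_steps=10_000):
    """Greedily pick centers y_1 < y_2 < ... in [left, right] so that the
    open intervals (y_j - r(y_j), y_j + r(y_j)) cover [left, right].

    Each new center is the right endpoint of the previous interval, so it
    is covered by its own interval and no gap is left behind.
    """
    centers = [left]
    while centers[-1] + r(centers[-1]) <= right:
        if len(centers) >= max_steps:
            raise RuntimeError("no finite cover found within max_steps")
        centers.append(centers[-1] + r(centers[-1]))
    return centers

def is_covered(point, centers, r):
    """True when the point lies in some open interval B_{r(y)}(y)."""
    return any(abs(point - y) < r(y) for y in centers)

def radius(y):
    """A sample positive radius function on [0, 1] (an arbitrary choice)."""
    return 0.05 + 0.1 * y

centers = finite_subcover(radius)
grid = [i / 1000 for i in range(1001)]
print(len(centers), all(is_covered(p, centers, radius) for p in grid))
```

The grid check is only a spot check of the covering, not a proof that every point of $[0, 1]$ is covered.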
The Borel Covering Lemma can be used to establish the following important characterization of compact sets.
9.11 THEOREM [HEINE-BOREL THEOREM]. Let $E$ be a subset of $\mathbf{R}^n$. Then $E$ is compact if and only if $E$ is closed and bounded.

PROOF. Suppose that $E$ is compact. Since $\{B_k(0)\}_{k \in \mathbf{N}}$ is an open covering of $\mathbf{R}^n$, hence of $E$, there is an $N \in \mathbf{N}$ such that
$$E \subseteq \bigcup_{k=1}^{N} B_k(0).$$
In particular, $E$ is bounded by $N$.
To verify that $E$ is closed, suppose not. Then $E$ is nonempty and (by Theorem 9.8) there is a convergent sequence $x_k \in E$ whose limit $x$ does not belong to $E$. For each $y \in E$, set $r(y) := \|x - y\|/2$. Since $x$ does not belong to $E$, $r(y) > 0$. Thus each $B_{r(y)}(y)$ is open and contains $y$; i.e., $\{B_{r(y)}(y) : y \in E\}$ is an open covering of $E$. Since $E$ is compact, we can choose points $y_j$ and radii $r_j := r(y_j)$ for $j = 1, 2, \ldots, M$ such that
$$E \subseteq \bigcup_{j=1}^{M} B_{r_j}(y_j).$$
Set $r := \min\{r_1, \ldots, r_M\}$. (This is a finite set of positive numbers, so $r$ is also positive.) Since $x_k \to x$ as $k \to \infty$, $x_k \in B_r(x)$ for large $k$. But $x_k \in B_r(x) \cap E$ implies $x_k \in B_{r_j}(y_j)$ for some $1 \le j \le M$. Therefore, it follows from the choices of $r_j$ and $r$, and from the triangle inequality, that
$$r_j \ge \|x_k - y_j\| \ge \|x - y_j\| - \|x_k - x\| = 2r_j - \|x_k - x\| > 2r_j - r \ge 2r_j - r_j = r_j,$$
a contradiction.
Conversely, suppose that $E$ is closed and bounded. Let $\{V_\alpha\}_{\alpha \in A}$ be an open covering of $E$. Let $x \in E$. Since $\{V_\alpha\}_{\alpha \in A}$ is an open covering of $E$, there exist an index $\alpha \in A$ and an $r(x) > 0$ such that $B_{r(x)}(x) \subset V_\alpha$. Thus by the Borel Covering Lemma, there exist finitely many points $x_1, \ldots, x_N$ such that
$$E \subseteq \bigcup_{j=1}^{N} B_{r(x_j)}(x_j).$$
But by construction, for each $r_j := r(x_j)$ there is an index $\alpha_j \in A$ such that $B_{r(x_j)}(x_j) \subset V_{\alpha_j}$. We conclude that $\{V_{\alpha_j}\}_{j=1}^{N}$ is an open covering of $E$. ∎

It is important to recognize that the Heine-Borel Theorem no longer holds if either closed or bounded is dropped from the hypothesis, even when $n = 1$ and $E$ is an interval. Indeed, neither of the open coverings
$$(0, 1) = \bigcup_{n \in \mathbf{N}} \left(\frac{1}{n},\ 1 - \frac{1}{n}\right)$$
has a finite subcovering of the intervals $(0, 1)$ and $[1, \infty)$.
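The failure for $(0, 1)$ can be checked concretely. The following sketch uses exact rational arithmetic and hypothetical helper names. It shows two things for the covering of $(0,1)$ by the intervals $(1/n, 1 - 1/n)$: every point of $(0, 1)$ lies in some interval of the family, yet any finite selection of indices misses a point such as $1/(2N)$, where $N$ is the largest index chosen.

```python
from fractions import Fraction
import math

def in_interval(p, n):
    """True when p lies in the open interval (1/n, 1 - 1/n)."""
    return Fraction(1, n) < p < 1 - Fraction(1, n)

def covering_index(p):
    """For p in (0, 1), return an index n with p in (1/n, 1 - 1/n)."""
    return math.floor(max(1 / p, 1 / (1 - p))) + 1

def uncovered_point(indices):
    """Given finitely many indices, return a point of (0, 1) that none of
    the chosen intervals contains: 1/(2N) with N the largest index."""
    return Fraction(1, 2 * max(indices))

p = Fraction(1, 1000)
print(covering_index(p), in_interval(p, covering_index(p)))

chosen = [3, 7, 50]
q = uncovered_point(chosen)
print(q, any(in_interval(q, n) for n in chosen))
```

Since $1/(2N) \le 1/n$ for every chosen $n$, the point is excluded by the left endpoint of each chosen interval.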
EXERCISES
1. Using Definition 9.1i, prove that the following limits exist.
(a) $x_k = \left(\dfrac{1}{k},\ 1 - \dfrac{1}{k^2}\right)$.
(b) $x_k = \left(\dfrac{k}{k+1},\ \dfrac{\sin k}{k}\right)$.
(c) $x_k = \left(\log(k+1) - \log k,\ 2^{-k}\right)$.
2. Using limit theorems, find the limit of each of the following vector sequences.
(a)
(b) $x_k = \left(1,\ \sin \pi k,\ \cos \dfrac{1}{k}\right)$.
(c) $x_k = \left(k - \sqrt{k^2 + k},\ k^{1/k},\ \dfrac{1}{k}\right)$.
3. If $x_k \to 0$ in $\mathbf{R}^n$ as $k \to \infty$ and $y_k$ is bounded in $\mathbf{R}^n$, prove that $x_k \cdot y_k \to 0$ as $k \to \infty$.
4. Find convergent subsequences of
$$x_k = \left((-1)^k,\ \frac{1}{k},\ (-1)^{3k}\right)$$
which converge to different limits. Prove your limits exist.
5. (a) Prove Theorem 9.4i and ii.
(b) Prove Theorem 9.4iii and iv.
(c) Prove Theorem 9.4v.
6. Prove Theorem 9.6.
7. Let $E$ be closed and bounded in $\mathbf{R}$, and suppose that for each $x \in E$ there is a nonnegative $C^\infty$ function $f_x$ such that $f_x(x) > 0$ and $f_x'(y) = 0$ for $y \notin E$. Prove that there is a nonnegative $C^\infty$ function $f$ such that $f(y) > 0$ for $y \in E$ and $f'(y) = 0$ for all $y \notin E$.
8. (a) A subset $E$ of $\mathbf{R}^n$ is said to be sequentially compact if and only if every sequence $x_k \in E$ has a convergent subsequence whose limit belongs to $E$. Prove that every closed ball in $\mathbf{R}^n$ is sequentially compact.
(b) Prove that $\mathbf{R}^n$ is not sequentially compact.
9. Let $E$ be a nonempty subset of $\mathbf{R}^n$.
(a) Show that a sequence $x_k \in E$ converges to some point $a \in E$ if and only if for every set $U$, which is relatively open in $E$ and contains $a$, there is an $N \in \mathbf{N}$ such that $x_k \in U$ for $k \ge N$.
(b) Prove that a set $C \subseteq E$ is relatively closed in $E$ if and only if the limit of every sequence $x_k \in C$ which converges to a point in $E$ satisfies $\lim_{k \to \infty} x_k \in C$.
10. (a) Let $E$ be a subset of $\mathbf{R}^n$. A point $a \in \mathbf{R}^n$ is called a cluster point of $E$ if $E \cap B_r(a)$ contains infinitely many points for every $r > 0$. Prove that $a$ is a cluster point of $E$ if and only if for each $r > 0$, $E \cap B_r(a) \setminus \{a\}$ is nonempty.
(b) Prove that every bounded infinite subset of $\mathbf{R}^n$ has at least one cluster point.
9.2 LIMITS OF FUNCTIONS

We now turn our attention to limits of functions. By a vector function (from $n$ variables to $m$ variables) we shall mean a function $f$ of the form $f : A \to \mathbf{R}^m$, where $A \subseteq \mathbf{R}^n$. Since $f(x) \in \mathbf{R}^m$ for each $x \in A$, there are functions $f_j : A \to \mathbf{R}$ (called the coordinate or component functions of $f$) such that $f(x) = (f_1(x), \ldots, f_m(x))$ for each $x \in A$. When $m = 1$, $f$ has only one component and we shall call $f$ real-valued.
If $f = (f_1, \ldots, f_m)$ is a vector function where the $f_j$'s have intrinsic domains (e.g., the $f_j$'s might be defined by formulas), then the maximal domain of $f$ is defined to be the intersection of the domains of the $f_j$'s. The following examples illustrate this idea.
9.12 Examples. (a) Find the maximal domain of
$$f(x, y) = \left(\log(xy - y + 2x - 2),\ \sqrt{9 - x^2 - y^2}\right).$$
(b) Find the maximal domain of
$$g(x, y) = \left(\sqrt{1 - x^2},\ \log(x^2 - y^2),\ \sin x \cos y\right).$$

SOLUTION. (a) This function has two components: $f_1(x, y) = \log(xy - y + 2x - 2)$ and $f_2(x, y) = \sqrt{9 - x^2 - y^2}$. Since the logarithm is real-valued only when its argument is positive, the domain of $f_1$ is the set of points $(x, y)$ which satisfy
$$0 < xy - y + 2x - 2 = (x - 1)(y + 2).$$
Since the square root function is real-valued if and only if its argument is nonnegative, the domain of $f_2$ is the set of points $(x, y)$ which satisfy $x^2 + y^2 \le 9$. Thus the maximal domain of $f$ is
$$\{(x, y) : x^2 + y^2 \le 9 \text{ and } (x - 1)(y + 2) > 0\}.$$
(This set is graphed in Figure 8.7, p. 252.)
(b) This function has three component functions: $g_1(x, y) = \sqrt{1 - x^2}$, $g_2(x, y) = \log(x^2 - y^2)$, and $g_3(x, y) = \sin x \cos y$. $g_1$ is real-valued when $1 - x^2 \ge 0$; i.e., $-1 \le x \le 1$. $g_2$ is real-valued when $x^2 - y^2 > 0$, i.e., when $-|x| < y < |x|$. The domain of $g_3$ is all of $\mathbf{R}^2$. Thus the maximal domain of $g$ is
$$\{(x, y) : -1 \le x \le 1 \text{ and } -|x| < y < |x|\}.$$
(This set is graphed in Figure 8.6, p. 251.) ∎

To set up notation for the algebra of vector functions, let $E \subseteq \mathbf{R}^n$ and suppose that $f, g : E \to \mathbf{R}^m$. For each $x \in E$, the scalar product of an $\alpha \in \mathbf{R}$ with $f$ is defined by
$$(\alpha f)(x) := \alpha f(x),$$
the sum of $f$ and $g$ is defined by
$$(f + g)(x) := f(x) + g(x),$$
the (Euclidean) dot product of $f$ and $g$ is defined by
$$(f \cdot g)(x) := f(x) \cdot g(x),$$
and (when $m = 3$) the cross product of $f$ and $g$ is defined by
$$(f \times g)(x) := f(x) \times g(x).$$
(Notice that when $m = 1$, the dot product of two functions is the pointwise product defined in Section 3.1.)
Here is the multivariable analogue of two-sided limits (compare with Definition 3.1).
9.13 DEFINITION. Let $n, m \in \mathbf{N}$ and $a \in \mathbf{R}^n$, let $V$ be an open set which contains $a$, and suppose that $f : V \setminus \{a\} \to \mathbf{R}^m$. Then $f(x)$ is said to converge to $L$, as $x$ approaches $a$, if and only if for every $\varepsilon > 0$ there is a $\delta > 0$ (which in general depends on $\varepsilon$, $f$, $V$, and $a$) such that
$$0 < \|x - a\| < \delta \quad\text{implies}\quad \|f(x) - L\| < \varepsilon.$$
In this case we write
$$L = \lim_{x \to a} f(x)$$
and call $L$ the limit of $f(x)$ as $x$ approaches $a$.

Using the analogy between the norm on $\mathbf{R}^n$ and the absolute value on $\mathbf{R}$, we can extend much of the theory of limits of functions developed in Chapter 3 to the Euclidean space setting. Here is a brief summary of what is true.
9.14 THEOREM. Let $n, m \in \mathbf{N}$, let $a \in \mathbf{R}^n$, let $V$ be an open ball which contains $a$, and let $f, g : V \setminus \{a\} \to \mathbf{R}^m$.
(i) If $f(x) = g(x)$ for all $x \in V \setminus \{a\}$ and $f(x)$ has a limit as $x \to a$, then $g(x)$ also has a limit as $x \to a$, and
$$\lim_{x \to a} g(x) = \lim_{x \to a} f(x).$$
(ii) [SEQUENTIAL CHARACTERIZATION OF LIMITS]. $L = \lim_{x \to a} f(x)$ exists if and only if $f(x_k) \to L$ as $k \to \infty$ for every sequence $x_k \in V \setminus \{a\}$ which converges to $a$ as $k \to \infty$.
(iii) Suppose that $\alpha \in \mathbf{R}$. If $f(x)$ and $g(x)$ have limits, as $x$ approaches $a$, then so do $(f + g)(x)$, $(\alpha f)(x)$, $(f \cdot g)(x)$, and $\|f(x)\|$. In fact,
$$\lim_{x \to a} (f + g)(x) = \lim_{x \to a} f(x) + \lim_{x \to a} g(x),$$
$$\lim_{x \to a} (\alpha f)(x) = \alpha \lim_{x \to a} f(x),$$
$$\lim_{x \to a} (f \cdot g)(x) = \left(\lim_{x \to a} f(x)\right) \cdot \left(\lim_{x \to a} g(x)\right),$$
and
$$\left\| \lim_{x \to a} f(x) \right\| = \lim_{x \to a} \|f(x)\|.$$
Moreover, when $m = 3$,
$$\lim_{x \to a} (f \times g)(x) = \left(\lim_{x \to a} f(x)\right) \times \left(\lim_{x \to a} g(x)\right),$$
and when $m = 1$ and the limit of $g$ is nonzero,
$$\lim_{x \to a} f(x)/g(x) = \left(\lim_{x \to a} f(x)\right) \Big/ \left(\lim_{x \to a} g(x)\right).$$
(iv) [SQUEEZE THEOREM FOR FUNCTIONS]. Suppose that $f, g, h : V \setminus \{a\} \to \mathbf{R}$ and $g(x) \le h(x) \le f(x)$ for all $x \in V \setminus \{a\}$. If
$$\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = L,$$
then the limit of $h$ also exists, as $x \to a$, and
$$\lim_{x \to a} h(x) = L.$$
(v) Suppose that $U$ is open in $\mathbf{R}^m$, that $L \in U$, and $h : U \setminus \{L\} \to \mathbf{R}^p$ for some $p \in \mathbf{N}$. If $L = \lim_{x \to a} g(x)$ and $M = \lim_{y \to L} h(y)$, then
$$\lim_{x \to a} h \circ g(x) = h(L).$$

How do we actually compute the limit of a given vector-valued function? The following result shows that evaluation of such limits reduces to the real-valued case, i.e., the case where the range is one-dimensional. Consequently, our examples will be almost exclusively real-valued.
9.15 THEOREM. Let $a \in \mathbf{R}^n$, let $V$ be an open ball that contains $a$, let $f = (f_1, \ldots, f_m) : V \setminus \{a\} \to \mathbf{R}^m$, and let $L = (L_1, L_2, \ldots, L_m) \in \mathbf{R}^m$. Then
$$L = \lim_{x \to a} f(x) \tag{1}$$
exists if and only if
$$L_j = \lim_{x \to a} f_j(x) \tag{2}$$
exists for each $j = 1, 2, \ldots, m$.

PROOF. By the Sequential Characterization of Limits, we must show that for all sequences $x_k \in V \setminus \{a\}$ which converge to $a$, $f(x_k) \to L$ as $k \to \infty$ if and only if $f_j(x_k) \to L_j$, as $k \to \infty$, for each $1 \le j \le m$. But this last statement is obviously true by Theorem 9.2. Therefore, (1) holds if and only if (2) holds. ∎

Using Theorem 9.14, it is easy to see that if $f_j$ are real functions continuous at a point $a_j$, for $j = 1, 2, \ldots, n$, then $F(x_1, x_2, \ldots, x_n) := f_1(x_1) + f_2(x_2) + \cdots + f_n(x_n)$ and $G(x_1, x_2, \ldots, x_n) := f_1(x_1) f_2(x_2) \cdots f_n(x_n)$ have a limit at the point $a := (a_1, a_2, \ldots, a_n)$. In fact,
$$\lim_{x \to a} F(x) = F(a) \qquad\text{and}\qquad \lim_{x \to a} G(x) = G(a).$$
This observation is often used in conjunction with Theorem 9.15 to evaluate simple limits like the following.
9.16 Examples. (i) Find
$$\lim_{(x,y) \to (0,0)} (3xy + 1,\ e^y + 2).$$
(ii) Prove that the function
$$f(x, y) = \frac{2 + x - y}{1 + 2x^2 + 3y^2}$$
has a limit as $(x, y) \to (0, 0)$.

SOLUTION. (i) By Theorem 9.15, this limit is $(0 + 1, e^0 + 2) = (1, 3)$.
(ii) The polynomial $2 + x - y$ (respectively, $1 + 2x^2 + 3y^2$) converges to 2 (respectively, to 1) as $(x, y) \to (0, 0)$. Hence, by Theorem 9.14,
$$\lim_{(x,y) \to (0,0)} \frac{2 + x - y}{1 + 2x^2 + 3y^2} = \frac{2}{1} = 2.$$
∎

It was legal to use Theorem 9.14 in this example because the limit quotient was not of the form $0/0$. Proving that a limit of the form $0/0$ exists in several variables often involves showing that $\|f(x) - L\|$ is dominated by (i.e., less than or equal to) some nonnegative function $g$ which satisfies $g(x) \to 0$ as $x \to a$. Here is a typical example.
9.17 Example. Prove that
$$f(x, y) = \frac{3x^2 y}{x^2 + y^2}$$
converges as $(x, y) \to (0, 0)$.

PROOF. Since the numerator is a polynomial of degree 3 and the denominator is a polynomial of degree 2, we expect the numerator to overpower the denominator, i.e., the limit to be 0 as $(x, y) \to (0, 0)$. To prove this, we must estimate $f(x, y)$ near $(0, 0)$. Since $2|xy| \le x^2 + y^2$ for all $(x, y) \in \mathbf{R}^2$, it is easy to check that
$$|f(x, y)| \le \frac{3}{2}|x| \le 2|x|$$
for all $(x, y) \neq (0, 0)$. Let $\varepsilon > 0$ and set $\delta = \varepsilon/2$. If $0 < \|(x, y)\| < \delta$, then $|f(x, y)| \le 2|x| \le 2\|(x, y)\| < 2\delta = \varepsilon$. Thus, by definition,
$$\lim_{(x,y) \to (0,0)} f(x, y) = 0.$$
∎
It is important to realize that by Definition 9.13, if $f$ converges to $L$ as $x \to a$, then $\|f(x) - L\|$ is small for all $x$ near $a$. In particular, $f(x) \to L$ as $x \to a$, no matter what path $x$ takes. The next two examples show how to use this observation to prove that a limit does not exist.

9.18 Example. Prove that the function
$$f(x, y) = \frac{2xy}{x^2 + y^2}$$
has no limit as $(x, y) \to (0, 0)$.

PROOF. Suppose that $f$ has a limit $L$, as $(x, y) \to (0, 0)$. If $(x, y)$ approaches $(0, 0)$ along a vertical path, e.g., if $x = 0$ and $y \to 0$, then $L = 0$ (because $f(0, y) = 0$ for all $y \neq 0$). If $(x, y)$ approaches $(0, 0)$ along a "diagonal" path, e.g., if $y = x$ and $x \to 0$, then $L = 1$ (because $f(x, x) = 1$ for all $x \neq 0$). Since $0 \neq 1$, $f$ has no limit at $(0, 0)$. ∎

In the solution to Example 9.18, the diagonal path was chosen so that the denominator of $f(x, y)$ would collapse to a single term. This same strategy is used in the next example.

9.19 Example. Determine whether
$$f(x, y) = \frac{xy^2}{x^2 + y^4}$$
has a limit as $(x, y) \to (0, 0)$.

SOLUTION. The vertical path $x = 0$ gives $f(0, y) = 0$ even before we take the limit as $y \to 0$. On the other hand, the parabolic path $x = y^2$ gives
$$f(y^2, y) = \frac{y^4}{2y^4} = \frac{1}{2} \neq 0.$$
Therefore, $f$ cannot have a limit as $(x, y) \to (0, 0)$. ∎
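The path dependence in Example 9.19 is easy to observe numerically. The sketch below (helper names are hypothetical) evaluates $f(x, y) = xy^2/(x^2 + y^4)$ along two straight lines $y = mx$ and along the parabola $x = y^2$. It is only a floating-point illustration and proves nothing by itself.

```python
def f(x, y):
    """f(x, y) = x*y^2 / (x^2 + y^4), defined away from the origin."""
    return x * y**2 / (x**2 + y**4)

def along_line(m, t):
    """Value of f at the point (t, m*t) on the line y = m*x."""
    return f(t, m * t)

def along_parabola(t):
    """Value of f at the point (t^2, t) on the parabola x = y^2."""
    return f(t**2, t)

for t in (1e-1, 1e-2, 1e-3, 1e-4):
    print(t, along_line(1.0, t), along_line(5.0, t), along_parabola(t))
```

Along each line the values shrink toward 0 as $t \to 0$, while along the parabola they stay at about $1/2$, matching the computation in the solution.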
(Notice that if $y = mx$, then
$$f(x, y) = \frac{m^2 x^3}{x^2 + m^4 x^4} \to 0$$
as $x \to 0$. Thus, Example 9.19 shows that the two-dimensional limit of a function might not exist even when its limit along every linear path exists and gives the same value.)

When asked whether the limit of a function $f(x)$ exists, it is natural to begin by taking the limit as each variable moves independently. Comparing Examples 9.16 and 9.18, we see that this strategy works for some functions but not all. To look at this problem more closely, we introduce the following terminology.
Let $V$ be an open ball in $\mathbf{R}^2$, let $(a, b) \in V$, and suppose that $f : V \setminus \{(a, b)\} \to \mathbf{R}^m$. The iterated limits of $f$ at $(a, b)$ are defined to be
$$\lim_{x \to a} \lim_{y \to b} f(x, y) := \lim_{x \to a} \left(\lim_{y \to b} f(x, y)\right)$$
and
$$\lim_{y \to b} \lim_{x \to a} f(x, y) := \lim_{y \to b} \left(\lim_{x \to a} f(x, y)\right),$$
when they exist.
The iterated limits of a given function might not exist. Even when they do, we cannot be sure that the corresponding two-dimensional limit exists. Indeed, although the iterated limits of the function $f$ in Example 9.18 exist and are both zero at $(0, 0)$, $f$ has no limit as $(x, y) \to (0, 0)$. It is even possible for both iterated limits to exist but give different values.
9.20 Example. Evaluate the iterated limits of
$$f(x, y) = \frac{x^2}{x^2 + y^2}$$
at $(0, 0)$.

SOLUTION. For each $x \neq 0$, $x^2/(x^2 + y^2) \to 1$ as $y \to 0$. Therefore,
$$\lim_{x \to 0} \lim_{y \to 0} \frac{x^2}{x^2 + y^2} = \lim_{x \to 0} \frac{x^2}{x^2} = 1.$$
On the other hand,
$$\lim_{y \to 0} \lim_{x \to 0} \frac{x^2}{x^2 + y^2} = \lim_{y \to 0} \frac{0}{y^2} = 0.$$
∎
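A rough numerical check of Example 9.20 can be made by freezing one variable and pushing the other very close to zero. The sketch below uses hypothetical helper names, and the `tiny` stand-in for the inner limit is an assumption made for illustration. It approximates each inner limit and also evaluates $f$ on the diagonal, where it is identically $1/2$.

```python
def f(x, y):
    """f(x, y) = x^2 / (x^2 + y^2), defined away from the origin."""
    return x**2 / (x**2 + y**2)

def inner_limit_in_y(x, tiny=1e-150):
    """Approximate lim_{y -> 0} f(x, y) for a fixed x != 0."""
    return f(x, tiny)

def inner_limit_in_x(y, tiny=1e-150):
    """Approximate lim_{x -> 0} f(x, y) for a fixed y != 0."""
    return f(tiny, y)

for t in (1e-1, 1e-2, 1e-3):
    print(t, inner_limit_in_y(t), inner_limit_in_x(t), f(t, t))
```

The first column of values stays near 1, the second near 0, and the diagonal values equal $1/2$, so the order of the limits matters and no two-dimensional limit can exist.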
This leads us to ask: When are the iterated limits equal? The following result shows that if $f$ has a limit as $(x, y) \to (a, b)$ and both iterated limits exist, then these limits must be equal.

9.21 Remark. Suppose that $I$ and $J$ are open intervals, that $a \in I$ and $b \in J$, and that $f : (I \times J) \setminus \{(a, b)\} \to \mathbf{R}$. If
$$g(x) := \lim_{y \to b} f(x, y)$$
exists for each $x \in I \setminus \{a\}$, if $\lim_{x \to a} f(x, y)$ exists for each $y \in J \setminus \{b\}$, and if $f(x, y) \to L$ as $(x, y) \to (a, b)$ (in $\mathbf{R}^2$), then
$$L = \lim_{x \to a} \lim_{y \to b} f(x, y) = \lim_{y \to b} \lim_{x \to a} f(x, y).$$

PROOF. Let $\varepsilon > 0$. By hypothesis, choose $\delta > 0$ such that
$$0 < \|(x, y) - (a, b)\| < \delta \quad\text{implies}\quad |f(x, y) - L| < \varepsilon.$$
Suppose that $x \in I$ and $0 < |x - a| < \delta/\sqrt{2}$. Then for any $y$ that satisfies $0 < |y - b| < \delta/\sqrt{2}$, we have $0 < \|(x, y) - (a, b)\| < \delta$, hence
$$|g(x) - L| \le |g(x) - f(x, y)| + |f(x, y) - L| < |g(x) - f(x, y)| + \varepsilon.$$
Taking the limit of this inequality as $y \to b$, we find that $|g(x) - L| \le \varepsilon$ for all $x \in I$ that satisfy $0 < |x - a| < \delta/\sqrt{2}$. It follows that $g(x) \to L$ as $x \to a$; i.e.,
$$L = \lim_{x \to a} \lim_{y \to b} f(x, y).$$
A similar argument proves that the other iterated limit also exists and equals $L$. ∎

Notice by Example 9.20 that the conclusion of Remark 9.21 might not hold if the hypothesis "$f(x, y) \to L$ as $(x, y) \to (a, b)$" is omitted. In particular, if the limit of a function does not exist, we must be careful about changing the order of an iterated limit.
EXERCISES
1. For each of the following functions, find the maximal domain of $f$, prove that the limit of $f$ exists as $(x, y) \to (a, b)$, and find the value of that limit. (Note: You can prove that the limit exists without using $\varepsilon$'s and $\delta$'s; see Example 9.16.)
(a) $f(x, y) = \left(\dfrac{x - 1}{y - 1},\ x + 2\right)$, $\quad (a, b) = (1, -1)$.
(b) $f(x, y) = \left(\dfrac{y \sin x}{x},\ \tan \dfrac{x}{y},\ x^2 + y^2 - xy\right)$, $\quad (a, b) = (0, 1)$.
(c) $f(x, y) = \left(\dfrac{x^4 + y^4}{x^2 + y^2},\ \dfrac{\sqrt{|xy|}}{\sqrt[3]{x^2 + y^2}}\right)$, $\quad (a, b) = (0, 0)$.
(d) $f(x, y) = \left(\dfrac{x^2 - 1}{y^2 + 1},\ \dfrac{x^2 y - 2xy + y - (x - 1)^2}{x^2 + y^2 - 2x - 2y + 2}\right)$, $\quad (a, b) = (1, 1)$.
2. Compute the iterated limits at $(0, 0)$ of each of the following functions. Determine which of these functions has a limit as $(x, y) \to (0, 0)$ in $\mathbf{R}^2$, and prove that the limit exists.
(a) $f(x, y) = \dfrac{\sin x \sin y}{x^2 + y^2}$.
(b)
(c) $f(x, y) = \dfrac{x - y}{(x^2 + y^2)^{\alpha}}$, $\quad \alpha < \dfrac{1}{2}$.
3. Prove that each of the following functions has a limit as $(x, y) \to (0, 0)$.
(a) $f(x, y) = \dfrac{x^3 - y^3}{x^2 + y^2}$, $\quad (x, y) \neq (0, 0)$.
(b) $f(x, y) = \dfrac{|x|^{\alpha} y^4}{x^2 + y^4}$, $\quad (x, y) \neq (0, 0)$,
where $\alpha$ is ANY positive number.
4. A polynomial on $\mathbf{R}^n$ is a function of the form
$$P(x_1, x_2, \ldots, x_n) = \sum_{j_1 = 0}^{N_1} \cdots \sum_{j_n = 0}^{N_n} a_{j_1, \ldots, j_n} x_1^{j_1} \cdots x_n^{j_n},$$
where $a_{j_1, \ldots, j_n}$ are scalars and $N_1, \ldots, N_n$ are nonnegative integers. Prove that if $P$ is a polynomial on $\mathbf{R}^n$ and $a \in \mathbf{R}^n$, then $\lim_{x \to a} P(x) = P(a)$.
5. Prove Theorem 9.14i.
6. Prove Theorem 9.14ii.
7. Prove Theorem 9.14iii.
8. Prove Theorem 9.14iv.
9.3 CONTINUOUS FUNCTIONS

In this section we define what it means for a vector function to be continuous, obtain analogues of many results in Sections 3.3 and 3.4, and examine how open sets, closed sets, and connected sets behave under images and inverse images by continuous functions. We shall use these results many times in the sequel.

9.22 DEFINITION. Let $E$ be a nonempty subset of $\mathbf{R}^n$ and let $f : E \to \mathbf{R}^m$.
(i) $f$ is said to be continuous at $a \in E$ if and only if for every $\varepsilon > 0$ there is a $\delta > 0$ (which in general depends on $\varepsilon$, $f$, and $a$) such that
$$\|x - a\| < \delta \text{ and } x \in E \quad\text{imply}\quad \|f(x) - f(a)\| < \varepsilon. \tag{3}$$
(ii) $f$ is said to be continuous on $E$ (notation: $f : E \to \mathbf{R}^m$ is continuous) if and only if $f$ is continuous at every $x \in E$.
Suppose that $E$ is a nonempty subset of $\mathbf{R}^n$. It is easy to verify that $f$ is continuous at $a \in E$ if and only if $f(x_k) \to f(a)$ for all $x_k \in E$ that converge to $a$. Hence, by Theorem 9.4, if $f$ and $g$ are continuous at a point $a \in E$ (respectively, continuous on $E$), then so are $f + g$, $\alpha f$ (for $\alpha \in \mathbf{R}$), $f \cdot g$, $\|f\|$, and (when $m = 3$) $f \times g$. Moreover, if $f : E \to \mathbf{R}^m$ is continuous at $a \in E$ and $g : f(E) \to \mathbf{R}^p$ is continuous at $f(a) \in f(E)$, then $g \circ f$ is continuous at $a \in E$ (see Exercise 3 below).
We shall frequently need a stronger version of continuity.

9.23 DEFINITION. Let $E$ be a nonempty subset of $\mathbf{R}^n$ and $f : E \to \mathbf{R}^m$. Then $f$ is said to be uniformly continuous on $E$ (notation: $f : E \to \mathbf{R}^m$ is uniformly continuous) if and only if for every $\varepsilon > 0$ there is a $\delta > 0$ such that
$$\|x - a\| < \delta \text{ and } x, a \in E \quad\text{imply}\quad \|f(x) - f(a)\| < \varepsilon.$$

As in the real case, continuity and uniform continuity of a vector function are equivalent on closed, bounded sets. We use the powerful Heine-Borel Theorem to construct a direct proof.
9.24 THEOREM. Let $E$ be a nonempty compact subset of $\mathbf{R}^n$. If $f$ is continuous on $E$, then $f$ is uniformly continuous on $E$.
PROOF. Suppose that $f$ is continuous on $E$. Given $\varepsilon > 0$ and $a \in E$, choose a $\delta(a) > 0$ such that
$$x \in B_{\delta(a)}(a) \text{ and } x \in E \quad \text{imply} \quad \|f(x) - f(a)\| < \frac{\varepsilon}{2}.$$
Since $\delta(a)/2$ is positive for all $a \in E$, we can choose, by the Heine-Borel Theorem, finitely many points $a_j \in E$ and numbers $\delta_j := \delta(a_j)/2$ such that
(4) $$E \subset \bigcup_{j=1}^{N} B_{\delta_j}(a_j).$$
Set $\delta := \min\{\delta_1, \dots, \delta_N\}$. Suppose that $x, a \in E$ and $\|x - a\| < \delta$. By (4), $x$ belongs to $B_{\delta_j}(a_j)$ for some $1 \le j \le N$. Hence,
$$\|a - a_j\| \le \|a - x\| + \|x - a_j\| < \delta_j + \delta_j = 2\delta_j = \delta(a_j),$$
i.e., $a$ also belongs to $B_{\delta(a_j)}(a_j)$. It follows, therefore, from the choice of $\delta(a_j)$ that
(5) $$\|f(x) - f(a)\| \le \|f(x) - f(a_j)\| + \|f(a_j) - f(a)\| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
This proves that $f$ is uniformly continuous on $E$. ∎
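The role of compactness in Theorem 9.24 can be seen numerically. The following is only a sketch, not part of the proof: the helper `worst_gap` and the test function `reciprocal` are illustrative names chosen here, and sampling a grid can only suggest, never establish, a supremum. It compares how far $f(x) = 1/x$ can move over a step of fixed size $\delta$ on the compact set $[0.1, 1]$ and on a sampled version of the non-compact set $(0, 1]$.

```python
def worst_gap(f, lo, hi, delta, samples=2001):
    """Largest sampled |f(x) - f(x + delta)| with x and x + delta in [lo, hi]."""
    step = (hi - delta - lo) / (samples - 1)
    return max(abs(f(lo + i * step) - f(lo + i * step + delta))
               for i in range(samples))

def reciprocal(x):
    # Continuous on (0, 1], but not uniformly continuous there.
    return 1.0 / x

delta = 1e-3
# Compact domain [0.1, 1]: one delta controls the gap everywhere.
gap_compact = worst_gap(reciprocal, 0.1, 1.0, delta)
# Domain (0, 1], sampled from 1e-6 upward: the same delta fails badly near 0.
gap_noncompact = worst_gap(reciprocal, 1e-6, 1.0, delta)

print(gap_compact, gap_noncompact)
```

On $[0.1, 1]$ the gap stays below $0.1$ for this single $\delta$, while near $0$ the same $\delta$ allows enormous changes in $f$, so no single $\delta$ can serve every point of $(0, 1]$.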
Thus continuity of vector functions behaves much the same as it did for real functions. When we turn our attention to how continuous functions interact with the topological structure of $\mathbf{R}^n$, we again find a surprising bonus. The $\varepsilon$'s and $\delta$'s disappear.
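9.25 THEOREM. Let $n, m \in \mathbf{N}$ and $f : \mathbf{R}^n \to \mathbf{R}^m$. Then the following three conditions are equivalent.
(i) $f$ is continuous on $\mathbf{R}^n$.
(ii) $f^{-1}(V)$ is open in $\mathbf{R}^n$ for every open subset $V$ of $\mathbf{R}^m$.
(iii) $f^{-1}(E)$ is closed in $\mathbf{R}^n$ for every closed subset $E$ of $\mathbf{R}^m$.
PROOF. (i) implies (ii). Suppose that $f$ is continuous on $\mathbf{R}^n$ and $V$ is open in $\mathbf{R}^m$. Since $\emptyset$ is open, we may suppose that there is some $a \in f^{-1}(V)$. To show that $f^{-1}(V)$ is open, we need to find a $\delta > 0$ such that $B_\delta(a) \subset f^{-1}(V)$. But $f(a) \in V$ and $V$ is open, so there is an $\varepsilon > 0$ such that $B_\varepsilon(f(a)) \subset V$. Since $f$ is continuous, choose $\delta > 0$ such that $\|x - a\| < \delta$ implies $\|f(x) - f(a)\| < \varepsilon$, i.e., $x \in B_\delta(a)$ implies $f(x) \in B_\varepsilon(f(a))$. It follows that $f(B_\delta(a)) \subseteq B_\varepsilon(f(a)) \subset V$; i.e., $B_\delta(a) \subset f^{-1}(V)$.
(ii) implies (iii). Let $E$ be closed in $\mathbf{R}^m$. Then $\mathbf{R}^m \setminus E$ is open in $\mathbf{R}^m$. Hence by hypothesis and Theorem 1.43iv, $\mathbf{R}^n \setminus f^{-1}(E) = f^{-1}(\mathbf{R}^m \setminus E)$ is open in $\mathbf{R}^n$. In particular, $f^{-1}(E)$ is closed in $\mathbf{R}^n$.
(iii) implies (i). Let $a \in \mathbf{R}^n$ and $\varepsilon > 0$. Since $\mathbf{R}^m \setminus B_\varepsilon(f(a))$ is closed in $\mathbf{R}^m$, we have by hypothesis and Theorem 1.43iv that $\mathbf{R}^n \setminus f^{-1}(B_\varepsilon(f(a)))$ is closed in $\mathbf{R}^n$, i.e., that $f^{-1}(B_\varepsilon(f(a)))$ is open in $\mathbf{R}^n$. Since $a \in f^{-1}(B_\varepsilon(f(a)))$, it follows from the definition of open sets that there is a $\delta > 0$ such that $B_\delta(a) \subset f^{-1}(B_\varepsilon(f(a)))$; i.e., $f(B_\delta(a)) \subset B_\varepsilon(f(a))$. By the definition of balls, we conclude that $\|x - a\| < \delta$ implies $\|f(x) - f(a)\| < \varepsilon$. ∎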
As we saw in the discussion below Remark 8.25, when "$f$ is continuous on $\mathbf{R}^n$" is replaced by "$f$ is continuous on some $E \subset \mathbf{R}^n$," Theorem 9.25 needs to be modified by replacing open by relatively open (and closed by relatively closed; see Exercise 6). The following result shows that this modification is unnecessary when $E$ is open (see also Exercise 5).
9.26 THEOREM. Let $n, m \in \mathbf{N}$, let $E$ be open in $\mathbf{R}^n$, and suppose that $f : E \to \mathbf{R}^m$. Then $f$ is continuous on $E$ if and only if $f^{-1}(V)$ is open in $\mathbf{R}^n$ for every open set $V$ in $\mathbf{R}^m$.
PROOF. Suppose that $f$ is continuous on $E$ and $V$ is open in $\mathbf{R}^m$. We may suppose that $f^{-1}(V) \ne \emptyset$. Let $a \in f^{-1}(V) := \{x \in E : f(x) \in V\}$. Then $f(a) \in V$ and $a \in E$. Since $V$ is open, choose $\varepsilon > 0$ such that $B_\varepsilon(f(a)) \subset V$. Since $f$ is continuous at $a \in E$ and $E$ is open, choose $\delta > 0$ such that $\|x - a\| < \delta$ implies $x \in E$ and $\|f(x) - f(a)\| < \varepsilon$. Since $B_\varepsilon(f(a)) \subset V$, it follows that $B_\delta(a) \subset f^{-1}(V)$. Thus $f^{-1}(V)$ is open by definition.
Conversely, if $f^{-1}(V)$ is open for all open sets $V$ in $\mathbf{R}^m$, let $a \in E$, $\varepsilon > 0$, and set $V = B_\varepsilon(f(a))$. Then there is a $\delta > 0$ such that $B_\delta(a) \subset f^{-1}(V)$. This means that if $\|x - a\| < \delta$, then $x \in E$ and $\|f(x) - f(a)\| < \varepsilon$. By definition, $f$ is continuous at $a \in E$. ∎
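The step "(i) implies (ii)" is constructive enough to trace in code. The sketch below is an illustration under assumptions chosen here, not taken from the text: the map $f(x,y) = x^2 + y^2$, the open set $V = (1,4)$, and the explicit radius in `ball_radius` (which uses the elementary bound $|f(x) - f(a)| \le \|x - a\|\,(2\|a\| + 1)$ when $\|x - a\| \le 1$). It samples points $a \in f^{-1}(V)$, builds a $\delta$ as in the proof, and checks that random points of the slightly shrunken ball of radius $0.99\,\delta$ stay in $f^{-1}(V)$.

```python
import math
import random

V_LO, V_HI = 1.0, 4.0  # the open set V = (1, 4) in R (an illustrative choice)

def f(p):
    """f(x, y) = x^2 + y^2, a continuous map from R^2 to R."""
    x, y = p
    return x * x + y * y

def in_preimage(p):
    return V_LO < f(p) < V_HI

def ball_radius(a):
    """A delta with B_delta(a) inside the preimage of V, as in (i) => (ii)."""
    eps = min(f(a) - V_LO, V_HI - f(a))      # B_eps(f(a)) is contained in V
    norm_a = math.hypot(a[0], a[1])
    # For ||x - a|| <= 1: |f(x) - f(a)| <= ||x - a|| * (2 ||a|| + 1).
    return min(1.0, eps / (2.0 * norm_a + 1.0))

rng = random.Random(0)                        # fixed seed for reproducibility
all_inside = True
checked = 0
while checked < 200:
    a = (rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0))
    if not in_preimage(a):
        continue                              # only centers in the preimage
    delta = ball_radius(a)
    for _ in range(50):
        radius = 0.99 * delta * rng.random()  # stay safely inside B_delta(a)
        angle = rng.uniform(0.0, 2.0 * math.pi)
        x = (a[0] + radius * math.cos(angle), a[1] + radius * math.sin(angle))
        all_inside = all_inside and in_preimage(x)
    checked += 1

print(all_inside)
```

The factor $0.99$ only guards against floating-point rounding at the boundary; the proof itself works with the full ball $B_\delta(a)$.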
We shall refer to Theorem 9.25 by saying that open sets and closed sets are invariant under inverse images by continuous functions. It is natural to ask whether bounded sets and connected sets are invariant under inverse images by continuous functions. The following examples show that the answers to these questions are no.
9.27 Examples. (i) If $f(x) = 1/(x^2 + 1)$ and $E = (0, 1]$, then $f$ is continuous on $\mathbf{R}$ and $E$ is bounded, but $f^{-1}(E) = (-\infty, \infty)$ is not bounded.
(ii) If $f(x) = x^2$ and $E = (1, 4)$, then $f$ is continuous on $\mathbf{R}$ and $E$ is connected, but $f^{-1}(E) = (-2, -1) \cup (1, 2)$ is not connected.
We now turn our attention from inverse images of sets to images of sets. Are open sets and closed sets invariant under images by continuous functions? The following examples show that the answers to these questions are also no.
9.28 Examples. (i) If $f(x) = x^2$ and $V = (-1, 1)$, then $f$ is continuous on $V$ and $V$ is open, but $f(V) = [0, 1)$ is neither open nor closed.
(ii) If $f(x) = 1/x$ and $E = [1, \infty)$, then $f$ is continuous on $E$ and $E$ is closed, but $f(E) = (0, 1]$ is neither open nor closed.
As the next result shows, however, if a set is both closed and bounded (i.e., compact), then so is its image under any continuous function. This innocent-looking result has far-reaching consequences which we shall exploit on many occasions.
9.29 THEOREM. Let $n, m \in \mathbf{N}$. If $H$ is compact in $\mathbf{R}^n$ and $f : H \to \mathbf{R}^m$ is continuous on $H$, then $f(H)$ is compact in $\mathbf{R}^m$.
PROOF. By the Heine-Borel Theorem, it suffices to show that $f(H)$ is closed and bounded.
To show that $f(H)$ is closed, let $y_k \in f(H)$ and suppose that $y_k \to y$ as $k \to \infty$. By definition, $y_k = f(x_k)$ for some $x_k \in H$. Since $H$ is closed and bounded, we can use the Bolzano-Weierstrass Theorem and Theorem 9.8 to choose a subsequence $x_{k_j}$ that converges to some $x \in H$. Since $f$ is continuous on $H$, it follows from construction that
(6) $$y = \lim_{j \to \infty} y_{k_j} = \lim_{j \to \infty} f(x_{k_j}) = f(x) \in f(H).$$
Thus $f(H)$ is closed by Theorem 9.8.
To show that $f(H)$ is bounded, suppose not. Thus choose $x_k \in H$ such that $\|f(x_k)\| \ge k$ for $k \in \mathbf{N}$. Again, use the Bolzano-Weierstrass Theorem and Theorem 9.8 to choose a subsequence $x_{k_j}$ that converges to some $x \in H$. Since $f$ is continuous on $H$, we conclude by construction that $\|f(x)\| = \lim_{j \to \infty} \|f(x_{k_j})\| = \infty$. Since $f(x) \in \mathbf{R}^m$, this is a contradiction. ∎
Connected sets are also invariant under images of continuous functions.
9.30 THEOREM. Let $n, m \in \mathbf{N}$. If $E$ is connected in $\mathbf{R}^n$ and $f : E \to \mathbf{R}^m$ is continuous on $E$, then $f(E)$ is connected in $\mathbf{R}^m$.
PROOF. Suppose that $f(E)$ is not connected. By Definition 8.28, there exists a pair of relatively open sets $U, V$ in $f(E)$ that separates $f(E)$; i.e., $U \cap f(E) \ne \emptyset$, $V \cap f(E) \ne \emptyset$, $f(E) = U \cup V$, and $U \cap V = \emptyset$. Set $A := f^{-1}(U)$ and $B := f^{-1}(V)$. By Exercise 6c, $A$ and $B$ are relatively open in $E$. Since $f(E) = U \cup V$ and both $f^{-1}(U)$ and $f^{-1}(V)$ are subsets of $E$, we also have (see Theorem 1.43)
(7) $$E = f^{-1}(U) \cup f^{-1}(V) = A \cup B.$$
Finally, $U \cap V = \emptyset$ implies $f^{-1}(U) \cap f^{-1}(V) = \emptyset$; i.e., $A \cap B = \emptyset$. Thus $A, B$ is a pair of relatively open sets that separates $E$; i.e., $E$ is not connected, a contradiction. ∎
Keeping track of which kinds of sets are invariant under images and inverse images by continuous functions is a powerful tool. To illustrate this fact, we offer the following four results.
9.31 Remark. The graph $y = f(x)$ of a continuous real function $f$ on an interval $[a, b]$ is compact and connected.
PROOF. The function $F(x) = (x, f(x))$ is continuous from $[a, b]$ into $\mathbf{R}^2$, and the graph of $y = f(x)$ for $x \in [a, b]$ is the image of $[a, b]$ under $F$. Hence the graph of $f$ is compact and connected by Theorems 9.29 and 9.30. ∎
It is interesting to note that this property actually characterizes continuity of real functions (see Theorem 9.51). To appreciate the perspective that the topological point of view gives, compare the following simple proof with that of its one-dimensional analogue (Theorem 3.26).
9.32 THEOREM [EXTREME VALUE THEOREM]. Suppose that $H$ is a nonempty subset of $\mathbf{R}^n$ and $f : H \to \mathbf{R}$. If $H$ is compact, and $f$ is continuous on $H$, then
$$M := \sup\{f(x) : x \in H\} \quad \text{and} \quad m := \inf\{f(x) : x \in H\}$$
are finite real numbers. Moreover, there exist points $x_M, x_m \in H$ such that $M = f(x_M)$ and $m = f(x_m)$.
PROOF. By symmetry, it suffices to prove the result for $M$. Since $H$ is compact, $f(H)$ is compact by Theorem 9.29. Thus $f(H)$ is bounded, so $M$ is finite. By the Approximation Property, choose $x_k \in H$ such that $f(x_k) \to M$ as $k \to \infty$. Since $f(H)$ is also closed, $M \in f(H)$. Therefore, there is an $x_M \in H$ such that $M = f(x_M)$. ∎
(For a multidimensional analogue of Theorem 3.29, see Exercise 8 below.) The following analogue of Theorem 4.26 will be used in Chapter 13 to examine change of parametrizations of curves and surfaces.
9.33 THEOREM. Let $n, m \in \mathbf{N}$. If $H$ is a compact subset of $\mathbf{R}^n$ and $f : H \to \mathbf{R}^m$ is 1-1 and continuous, then $f^{-1}$ is continuous on $f(H)$.
PROOF. By Theorem 9.29 and the Heine-Borel Theorem, $f(H)$ is closed. Thus by Exercise 5, it suffices to show that $(f^{-1})^{-1}$ takes closed sets to closed sets. To this end, let $E$ be closed in $\mathbf{R}^n$. Since the domain of $f^{-1}$ is $f(H)$, we have by definition that
$$(f^{-1})^{-1}(E) = \{x \in f(H) : f^{-1}(x) = y \text{ for some } y \in E\}.$$
Since $f$ is 1-1, $f^{-1}(x) = y$ implies $x = f(y)$ with $y \in E \cap H$. Thus $(f^{-1})^{-1}(E) = f(E \cap H)$. But $E \cap H$ is closed (see Theorem 8.24) and bounded (by "the bound" of $H$), so by Theorem 9.29 and the Heine-Borel Theorem, $f(E \cap H)$ is closed and bounded. In particular, $(f^{-1})^{-1}(E) = f(E \cap H)$ is closed. ∎
The final result of this section shows that "rectangles" are connected in $\mathbf{R}^n$.
9.34 Remark. If $a_j \le b_j$ for $j = 1, 2, \dots, n$, then
$$R := \{(x_1, \dots, x_n) : a_j \le x_j \le b_j \text{ for } j = 1, \dots, n\}$$
is connected.
PROOF. Suppose not. Choose nonempty sets $U$ and $V$, relatively open in $R$, such that $R = U \cup V$ and $U \cap V = \emptyset$. Let $a \in U$ and $b \in V$, and consider the line segment $E := \{ta + (1 - t)b : t \in [0, 1]\}$. Since $E$ is a continuous image of the interval $[0, 1]$, we have by Theorems 8.30 and 9.30 that $E$ is connected. On the other hand, since $E \subset R$ by the definition of $R$, it is easy to check that $U_0 := U \cap E$ and $V_0 := V \cap E$ are nonempty sets that are relatively open in $E$ and satisfy $E = U_0 \cup V_0$ and $U_0 \cap V_0 = \emptyset$. It follows that $E$ is not connected, a contradiction. ∎
EXERCISES
1. Define $f$ and $g$ on $\mathbf{R}$ by $f(x) = \sin x$ and $g(x) = x/|x|$ if $x \ne 0$ and $g(0) = 0$.
(a) Find $f(E)$ and $g(E)$ for $E = (0, \pi)$, $E = [0, \pi]$, $E = (-1, 1)$, and $E = [-1, 1]$, and explain some of your answers by appealing to results in this section.
(b) Find $f^{-1}(E)$ and $g^{-1}(E)$ for $E = (0, 1)$, $E = [0, 1]$, $E = (-1, 1)$, and $E = [-1, 1]$, and explain some of your answers by appealing to results in this section.
2. Define $f$ on $[0, \infty)$ and $g$ on $\mathbf{R}$ by $f(x) = \sqrt{x}$ and $g(x) = 1/x$ if $x \ne 0$ and $g(0) = 0$.
(a) Find $f(E)$ and $g(E)$ for $E = (0, 1)$, $E = [0, 1)$, and $E = [0, 1]$, and explain some of your answers by appealing to results in this section.
(b) Find $f^{-1}(E)$ and $g^{-1}(E)$ for $E = (-1, 1)$ and $E = [-1, 1]$, and explain some of your answers by appealing to results in this section.
3. Let $A \subset \mathbf{R}^n$, let $B \subset \mathbf{R}^m$, let $a \in A$, and let $f : A \setminus \{a\} \to B$.
(a) Suppose that $A$ is open and $b := \lim_{x \to a} f(x)$ exists. If $g$ is continuous at $b$, prove that
$$\lim_{x \to a} g \circ f(x) = g(b).$$
(b) If $f$ is continuous at $a \in A$ and $g$ is continuous at $f(a) \in B$, prove that $g \circ f$ is continuous at $a \in A$.
4. Prove that
$$f(x, y) = \begin{cases} e^{-1/|x - y|} & x \ne y \\ 0 & x = y \end{cases}$$
is continuous on $\mathbf{R}^2$.
5. Let $B$ be closed in $\mathbf{R}^n$ and $f : B \to \mathbf{R}^m$. Prove that the following are equivalent:
(a) $f$ is continuous on $B$.
(b) $f^{-1}(E)$ is closed in $\mathbf{R}^n$ for every closed subset $E$ of $\mathbf{R}^m$.
6. Suppose that $E \subseteq \mathbf{R}^n$ and $f : E \to \mathbf{R}^m$.
(a) Prove that $f$ is continuous on $E$ if and only if $f^{-1}(V)$ is relatively open in $E$ for every open set $V$ in $\mathbf{R}^m$.
(b) Prove that $f$ is continuous on $E$ if and only if $f^{-1}(B)$ is relatively closed in $E$ for every closed set $B$ in $\mathbf{R}^m$.
(c) Suppose that $f$ is continuous on $E$. Prove that if $V$ is relatively open in $f(E)$, then $f^{-1}(V)$ is relatively open in $E$, and if $B$ is relatively closed in $f(E)$, then $f^{-1}(B)$ is relatively closed in $E$.
7. This exercise is used in Section e9.5. Let $n, m \in \mathbf{N}$ and let $H$ be a nonempty, closed, bounded subset of $\mathbf{R}^n$.
(a) Suppose that $f : H \to \mathbf{R}^m$ is continuous. Prove that
$$\|f\|_H := \sup_{x \in H} \|f(x)\|$$
is finite and there exists an $x_0 \in H$ such that $\|f(x_0)\| = \|f\|_H$.
(b) A sequence of functions $f_k : H \to \mathbf{R}^m$ is said to converge uniformly on $H$ to a function $f : H \to \mathbf{R}^m$ if and only if for every $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that $k \ge N$ and $x \in H$ imply $\|f_k(x) - f(x)\| < \varepsilon$. Show that $\|f_k - f\|_H \to 0$ as $k \to \infty$ if and only if $f_k \to f$ uniformly on $H$ as $k \to \infty$.
(c) Prove that a sequence of functions $f_k$ converges uniformly on $H$ if and only if for every $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that
$$k, j \ge N \quad \text{implies} \quad \|f_k - f_j\|_H < \varepsilon.$$
8. Let $n, m \in \mathbf{N}$, $E \subset \mathbf{R}^n$, and suppose that $D$ is dense in $E$; i.e., suppose that $D \subset E$ and $\overline{D} = E$. If $f : D \to \mathbf{R}^m$ is uniformly continuous on $D$, prove that $f$ has a continuous extension to $E$; i.e., prove that there is a continuous function $g : E \to \mathbf{R}^m$ such that $g(x) = f(x)$ for all $x \in D$.
9. [INTERMEDIATE VALUE THEOREM]. Let $E$ be a connected subset of $\mathbf{R}^n$. If $f : E \to \mathbf{R}$ is continuous, $f(a) \ne f(b)$ for some $a, b \in E$, and $y$ is a number that lies between $f(a)$ and $f(b)$, then prove that there is an $x \in E$ such that $f(x) = y$. (You may use Theorem 8.30.)
*10. This exercise is used to prove *Corollary 11.35.
(a) A set $E \subseteq \mathbf{R}^n$ is said to be polygonally connected if and only if any two points $a, b \in E$ can be connected by a polygonal path in $E$; i.e., there exist points $x_k \in E$, $k = 1, \dots, N$, such that $x_0 = a$, $x_N = b$ and $L(x_{k-1}; x_k) \subseteq E$ for $k = 1, \dots, N$. Prove that every polygonally connected set in $\mathbf{R}^n$ is connected.
(b) Let $E \subseteq \mathbf{R}^n$ be open and $x_0 \in E$. Let $U$ be the set of points $x \in E$ that can be polygonally connected in $E$ to $x_0$. Prove that $U$ is open.
(c) Prove that every open connected set in $\mathbf{R}^n$ is polygonally connected.

e9.4 COMPACT SETS
This section requires no material from any other enrichment section.
In this section we give a more complete description of compact sets. Most of the results we state are trivial to prove by appealing to the hard part of the Heine-Borel Theorem, specifically, that closed and bounded subsets of a Euclidean space are compact. Since this powerful result does not hold in some non-Euclidean spaces, our proofs will appeal only to the basic definition of compact sets, hence avoid using the Heine-Borel Theorem.
We begin by expanding our terminology concerning what we mean by a "covering."
9.35 DEFINITION. Let $\mathcal{V} = \{V_\alpha\}_{\alpha \in A}$ be a collection of subsets of $\mathbf{R}^n$, and suppose that $E \subseteq \mathbf{R}^n$.
(i) $\mathcal{V}$ is said to cover $E$ (or be a covering of $E$) if and only if
$$E \subseteq \bigcup_{\alpha \in A} V_\alpha.$$
(ii) $\mathcal{V}$ is said to be an open covering of $E$ if and only if $\mathcal{V}$ covers $E$ and each $V_\alpha$ is open.
(iii) Let $\mathcal{V}$ be a covering of $E$. $\mathcal{V}$ is said to have a finite (respectively, countable) subcovering if and only if there is a finite (respectively, an at most countable) subset $A_0$ of $A$ such that $\{V_\alpha\}_{\alpha \in A_0}$ covers $E$.
Notice that the collections of open intervals
are open coverings of the interval $(0, 1)$. The first covering of $(0, 1)$ has no finite subcovering, but any member of the second covering covers $(0, 1)$. Thus, an open covering of an arbitrary set might not have a finite subcovering.
Our first general result about compact sets shows that every "space" contains compact sets.
9.36 Remark. Let $n \in \mathbf{N}$. The empty set and all finite subsets of $\mathbf{R}^n$ are compact.
PROOF. These statements follow immediately from Definition 9.9. The empty set needs no set to cover it, and any finite set $H$ can be covered by finitely many sets, one set for each element in $H$. ∎
Since the empty set and finite sets are also closed, it is natural to ask whether there is a relationship between compact sets and closed sets in general. The following three results address this question.
9.37 Remark. A compact set is always closed.
PROOF. This result follows easily from the sequential characterization of closed sets (see the second paragraph in the proof on p. 261). ∎
Since $\{(n - 1, n + 1) : n \in \mathbf{N}\}$ is an open covering of the closed set $E := [1, \infty)$ that has no finite subcovering, the converse of Remark 9.37 is false. The following result shows that this is not the case if $E$ is a subset of some compact set.
9.38 Remark. A closed subset of a compact set is compact.
PROOF. Let $E$ be a closed subset of $H$, where $H$ is compact, and suppose that $\mathcal{V} = \{V_\alpha\}_{\alpha \in A}$ is an open covering of $E$. Now $E^c = \mathbf{R}^n \setminus E$ is open. Thus $\mathcal{V} \cup \{E^c\}$ is an open covering of $H$. Since $H$ is compact, there is a finite set $A_0 \subseteq A$ such that
$$H \subseteq E^c \cup \Bigl(\bigcup_{\alpha \in A_0} V_\alpha\Bigr).$$
But $E \cap E^c = \emptyset$. Therefore, $E$ is covered by $\{V_\alpha\}_{\alpha \in A_0}$. ∎
Finally, we show that every open covering of a set in a Euclidean space has a countable subcovering.
9.39 THEOREM [LINDELÖF]. Let $n \in \mathbf{N}$ and let $E$ be a subset of $\mathbf{R}^n$. If $\{V_\alpha\}_{\alpha \in A}$ is a collection of open sets and $E \subseteq \bigcup_{\alpha \in A} V_\alpha$, then there is an at most countable subset $A_0$ of $A$ such that
$$E \subseteq \bigcup_{\alpha \in A_0} V_\alpha.$$
PROOF. Let $\mathcal{T}$ be the collection of open balls with rational radii and rational centers, i.e., centers that belong to $\mathbf{Q}^n$. This collection is countable. Moreover, by the proof of the Borel Covering Lemma, $\mathcal{T}$ "approximates" the collection of open balls in the following sense: Given any open ball $B_r(x) \subseteq \mathbf{R}^n$, there is a ball $B_q(a) \in \mathcal{T}$ such that $x \in B_q(a)$ and $B_q(a) \subseteq B_r(x)$.
To prove the theorem, let $x \in E$. By hypothesis, $x \in V_\alpha$ for some $\alpha \in A$. Since $V_\alpha$ is open, there is an $r > 0$ such that $B_r(x) \subset V_\alpha$. Since $\mathcal{T}$ approximates open balls, we can choose a ball $B_x \in \mathcal{T}$ such that $x \in B_x \subseteq V_\alpha$. The collection $\mathcal{T}$ is countable, hence so is the subcollection
$$\{U_1, U_2, \dots\} := \{B_x : x \in E\}.$$
By the choice of the balls $B_x$, for each $k \in \mathbf{N}$ there is at least one $\alpha_k \in A$ such that $U_k \subseteq V_{\alpha_k}$. Hence, by construction,
$$E \subseteq \bigcup_{x \in E} B_x = \bigcup_{k \in \mathbf{N}} U_k \subseteq \bigcup_{k \in \mathbf{N}} V_{\alpha_k}.$$
Thus, set $A_0 := \{\alpha_k : k \in \mathbf{N}\}$. ∎
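The "approximation" property of rational balls used in the proof of Lindelöf's Theorem can be made concrete. The sketch below shows one possible construction, not necessarily the one in the Borel Covering Lemma; the function name `rational_ball` and the choice of a common denominator `D` are assumptions made for illustration. Given $x \in \mathbf{R}^n$ and $r > 0$, it returns a center $a \in \mathbf{Q}^n$ with $\|x - a\| \le r/8$ and a rational radius $q$ with $r/4 < q \le r/2$, so that $x \in B_q(a)$ and $B_q(a) \subseteq B_r(x)$.

```python
import math
from fractions import Fraction

def rational_ball(x, r):
    """Return (center, radius) with rational data, x in the ball, ball in B_r(x)."""
    n = len(x)
    # Common denominator D >= 4*sqrt(n)/r, so rounding each coordinate to a
    # multiple of 1/D moves x by at most sqrt(n)/(2D) <= r/8 in norm.
    D = math.ceil(4.0 * math.sqrt(n) / r)
    center = tuple(Fraction(round(c * D), D) for c in x)
    # Since 1/D <= r/4, this radius satisfies r/4 < q <= r/2.
    radius = Fraction(math.floor(r * D / 2.0), D)
    return center, radius

x = (math.sqrt(2.0), math.pi)   # a point with irrational coordinates (as floats)
r = 0.01
center, q = rational_ball(x, r)
dist = math.dist(x, [float(c) for c in center])

print(center, q, dist)
```

Since $\|x - a\| < q$, the point $x$ lies in $B_q(a)$; since $q + \|x - a\| \le r$, the triangle inequality gives $B_q(a) \subseteq B_r(x)$.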
EXERCISES
1. Identify which of the following sets are compact and which are not. If $E$ is not compact, find the smallest compact set $H$ (if there is one) such that $E \subset H$.
(a) $\{1/k : k \in \mathbf{N}\} \cup \{0\}$.
(b) $\{(x, y) \in \mathbf{R}^2 : a \le x^2 + y^2 \le b\}$ for real numbers $0 < a < b$.
(c) $\{(x, y) \in \mathbf{R}^2 : y = \sin(1/x) \text{ for some } x \in (0, 1]\}$.
(d) $\{(x, y) \in \mathbf{R}^2 : |xy| \le 1\}$.
2. Let $A, B$ be compact subsets of $\mathbf{R}^n$. Prove that $A \cup B$ and $A \cap B$ are compact.
3. Suppose that $E \subseteq \mathbf{R}$ is compact and nonempty. Prove that $\sup E, \inf E \in E$.
4. Let $\{V_\alpha\}_{\alpha \in A}$ be a collection of nonempty open sets in $\mathbf{R}^n$ that satisfies $V_\alpha \cap V_\beta = \emptyset$ for all $\alpha \ne \beta$ in $A$. Prove that $A$ is countable. What happens to this result when "open" is omitted?
5. Prove that if $V$ is open in $\mathbf{R}^n$, then there are open balls $B_1, B_2, \dots$ such that
$$V = \bigcup_{j \in \mathbf{N}} B_j.$$
Prove that every open set in $\mathbf{R}$ is a countable union of open intervals.
6. Let $n \in \mathbf{N}$.
(a) A subset $E$ of $\mathbf{R}^n$ is said to be sequentially compact if and only if every sequence $x_k$ in $E$ has a convergent subsequence $x_{k_j}$ whose limit belongs to $E$. Prove that every compact set is sequentially compact.
(b) Prove that every sequentially compact set is closed and bounded.
(c) Prove that a set $E \subset \mathbf{R}^n$ is sequentially compact if and only if it is compact.
7. Let $H \subseteq \mathbf{R}^n$.
(a) Prove that $H$ is compact if and only if every cover $\{E_\alpha\}_{\alpha \in A}$ of $H$, where the $E_\alpha$'s are relatively open in $H$, has a finite subcovering.
(b) Use part (a), Exercise 6a of Section 9.3, and Definition 9.9 to show directly that if $f : H \to \mathbf{R}^m$ is continuous and $H$ is compact, then $f(H)$ is compact.

e9.5 APPLICATIONS
This section uses no material from a previous enrichment section.
We have seen that topological concepts (e.g., closed sets, open sets, and connected sets) are powerful theoretical tools. In this section we continue this theme by obtaining three independent theorems (i.e., you may cover them in any order) which further elucidate results we obtained in earlier chapters.
Our first application of topological ideas is a partial converse of Theorem 7.10. A sequence of real-valued functions $\{f_k\}$ is said to be pointwise increasing (respectively, pointwise decreasing) on a subset $E$ of $\mathbf{R}^n$ if and only if $f_k(x) \le f_{k+1}(x)$ (respectively, $f_k(x) \ge f_{k+1}(x)$) for all $x \in E$ and $k \in \mathbf{N}$. A sequence is said to be pointwise monotone on $E$ if and only if it is pointwise increasing on $E$ or pointwise decreasing on $E$.
9.40 THEOREM [DINI]. Suppose that $H$ is a compact subset of $\mathbf{R}^n$ and $f_k : H \to \mathbf{R}$ is a pointwise monotone sequence of continuous functions. If $f_k \to f$ pointwise on $H$ as $k \to \infty$ and $f$ is continuous on $H$, then $f_k \to f$ uniformly on $H$. In particular, if $\phi_k$ is a pointwise monotone sequence of functions continuous on an interval $[a, b]$ that converges pointwise to a continuous function, then
$$\lim_{k \to \infty} \int_a^b \phi_k(t)\,dt = \int_a^b \Bigl(\lim_{k \to \infty} \phi_k(t)\Bigr)\,dt.$$
PROOF. By Theorem 7.10, we need only show that $f_k \to f$ uniformly on $H$. We may suppose that $f_k$ is pointwise increasing and $H \ne \emptyset$.
Let $\varepsilon > 0$. For each $x \in H$, choose $N(x) \in \mathbf{N}$ such that
$$k \ge N(x) \quad \text{implies} \quad |f_k(x) - f(x)| < \frac{\varepsilon}{3}.$$
Since $f$ and $f_{N(x)}$ are continuous on $H$, choose an $r = r(x) > 0$ such that
$$y \in H \cap B_r(x) \quad \text{implies} \quad |f(x) - f(y)| < \frac{\varepsilon}{3} \quad \text{and} \quad |f_{N(x)}(x) - f_{N(x)}(y)| < \frac{\varepsilon}{3}.$$
By the Heine-Borel Theorem, choose $x_j \in H$ and $r_j = r(x_j)$ such that
$$H \subset \bigcup_{j=1}^{M} B_{r_j}(x_j).$$
Set $N = \max\{N(x_1), \dots, N(x_M)\}$, let $x \in H$, and suppose that $k \ge N$. Since $x \in B_{r_j}(x_j)$ for some $j \in \{1, \dots, M\}$ and $k \ge N(x_j)$, it follows that
$$|f(x) - f_k(x)| = f(x) - f_k(x) \le f(x) - f_{N(x_j)}(x)$$
$$\le |f(x) - f(x_j)| + |f(x_j) - f_{N(x_j)}(x_j)| + |f_{N(x_j)}(x_j) - f_{N(x_j)}(x)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.$$
Since this inequality holds for all $x \in H$, we conclude that $f_k \to f$ uniformly on $H$ as $k \to \infty$. ∎
Our next application of topological ideas is a characterization of Riemann integrability of a function $f$ by the size of the set of points of discontinuity of $f$. To measure the size of this set, we make the following definition. (Recall that $|I|$ denotes the length of an interval $I$.)
9.41 DEFINITION. (i) A set $E \subset \mathbf{R}$ is said to be of measure zero if and only if for every $\varepsilon > 0$ there is a countable collection of intervals $\{I_j\}_{j \in \mathbf{N}}$ that covers $E$ such that
$$\sum_{j=1}^{\infty} |I_j| \le \varepsilon.$$
(ii) A function $f : [a, b] \to \mathbf{R}$ is said to be almost everywhere continuous on $[a, b]$ if and only if the set of points $x \in [a, b]$ where $f$ is discontinuous is a set of measure zero.
Notice that by definition, if $E$ is of measure zero, then every subset of $E$ is also of measure zero.
Loosely speaking, a set is of measure zero if it is so sparse that it can be covered by a sequence of intervals whose total length is as small as we wish. It is easy to see that a single point $E = \{x\}$ is a set of measure zero. Indeed, $I_1 := (x - \varepsilon/2, x + \varepsilon/2)$, $I_k := \emptyset$ for $k \ge 2$, cover $E$, and have total length $\varepsilon$. Modifying this technique, we can show that any finite set is a set of measure zero (see also Remark 9.42 below). On the other hand, by the Heine-Borel Theorem, any open covering of $[a, b]$ has a finite subcovering; hence, any covering of $[a, b]$ by open intervals must have total length greater than or equal to $b - a$. In particular, a nondegenerate interval cannot be of measure zero.
The following result shows that if a set is small in the set-theoretical sense, then it is small in the measure-theoretical sense.
9.42 Remark. Every at most countable set of real numbers is a set of measure zero.
PROOF. We may suppose that $E$ is countable, say $E = \{x_1, x_2, \dots\}$. Given $\varepsilon > 0$ and $j \in \mathbf{N}$, set
$$I_j = (x_j - \varepsilon 2^{-j-1},\; x_j + \varepsilon 2^{-j-1}).$$
Then $x_j \in I_j$ and $|I_j| = \varepsilon 2^{-j}$ for $j \in \mathbf{N}$. Therefore, $E \subseteq \bigcup_{j=1}^{\infty} I_j$ and
$$\sum_{j=1}^{\infty} |I_j| = \varepsilon \sum_{j=1}^{\infty} 2^{-j} = \varepsilon. \; ∎$$
The converse of Remark 9.42 is false; i.e., there exist uncountable sets of measure zero (see Exercise 9). The following result shows that the countable union of sets of measure zero is a set of measure zero.
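The covering used in the proof of Remark 9.42 is easy to write down explicitly. The sketch below applies it to the first 40 rationals of $[0, 1]$; the enumeration order (by increasing denominator), the helper names, and the truncation to finitely many points are choices made here for illustration. Exact rational arithmetic is used so that the total length can be compared with $\varepsilon$ without rounding.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` distinct rationals p/q in [0, 1], by increasing denominator."""
    seen, out = set(), []
    q = 1
    while len(out) < count:
        for p in range(q + 1):
            x = Fraction(p, q)
            if x not in seen:
                seen.add(x)
                out.append(x)
                if len(out) == count:
                    break
        q += 1
    return out

def cover(points, eps):
    """Intervals I_j = (x_j - eps*2^(-j-1), x_j + eps*2^(-j-1)), j = 1, 2, ..."""
    return [(x - eps / 2 ** (j + 1), x + eps / 2 ** (j + 1))
            for j, x in enumerate(points, start=1)]

eps = Fraction(1, 100)
points = rationals_in_unit_interval(40)
intervals = cover(points, eps)
total_length = sum(b - a for a, b in intervals)

print(float(total_length), float(eps))
```

Each point lies in its own interval, and the total length of the first $N$ intervals is $\varepsilon(1 - 2^{-N})$, which stays below $\varepsilon$ no matter how many points are enumerated.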
9.43 Remark. If $E_1, E_2, \dots$ is a sequence of sets of measure zero, then
$$E := \bigcup_{k=1}^{\infty} E_k$$
is also a set of measure zero.
PROOF. Let $\varepsilon > 0$. By hypothesis, given $k \in \mathbf{N}$ we can choose a collection of intervals $\{I_j^{(k)}\}_{j \in \mathbf{N}}$ that covers $E_k$ such that
$$\sum_{j=1}^{\infty} |I_j^{(k)}| < \frac{\varepsilon}{2^k}.$$
Then the collection $\{I_j^{(k)}\}_{k, j \in \mathbf{N}}$ is countable, covers $E$, and
$$\sum_{k=1}^{\infty} \sum_{j=1}^{\infty} |I_j^{(k)}| \le \sum_{k=1}^{\infty} \frac{\varepsilon}{2^k} = \varepsilon.$$
Consequently, $E$ is of measure zero. ∎
To facilitate our discussion of points of discontinuity, we introduce the following concepts.
9.44 DEFINITION. Let $[a, b]$ be a closed interval and $f : [a, b] \to \mathbf{R}$ be bounded.
(i) The oscillation of $f$ on an interval $J$ that intersects $[a, b]$ is defined to be
$$\Omega_f(J) := \sup_{x, y \in J \cap [a, b]} (f(x) - f(y)).$$
(ii) The oscillation of $f$ at a point $t \in [a, b]$ is defined to be
$$\omega_f(t) := \lim_{h \to 0+} \Omega_f((t - h, t + h)),$$
when this limit exists.
9.45 Remark. If $f : [a, b] \to \mathbf{R}$ is bounded, then $\omega_f(t)$ exists for all $t \in [a, b]$ and satisfies $0 \le \omega_f(t) < \infty$.
PROOF. Fix $t \in [a, b]$ and for each interval $J$, set
$$M_J = \sup_{x \in J \cap [a, b]} f(x), \qquad m_J = \inf_{x \in J \cap [a, b]} f(x).$$
Since $\sup(-f(x)) = -\inf f(x)$, it is obvious that
(8) $$\Omega_f(J) = M_J - m_J \ge 0.$$
Suppose for simplicity that $t \in (a, b)$, and choose $h_0$ so small that $(t - h_0, t + h_0) \subset (a, b)$. For each $0 < h < h_0$, set $F(h) = \Omega_f((t - h, t + h))$. By the Monotone Property of Suprema, $F(h)$ is increasing on $(0, h_0)$, hence has a finite limit as $h \to 0+$. By (8), $F(h) \ge 0$. Therefore, $\omega_f(t)$ exists and is both finite and nonnegative. ∎
The next result shows that by using the oscillation function $\omega_f$, one can represent the set of points of discontinuity of any bounded $f$ as a countable union.
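Definition 9.44 can be explored numerically. The sketch below estimates $\Omega_f((t - h, t + h))$ by sampling; the function `oscillation_on` and the test function `step_plus_ramp` (equal to $x$ for $x < 1$ and to $x + 3$ for $x \ge 1$ on $[0, 2]$) are hypothetical examples chosen here, and sampling only approximates the supremum and infimum. At the jump $t = 1$ the estimates decrease toward the jump size $3$, while at $t = 0.5$, where the function is continuous, they decrease toward $0$.

```python
def oscillation_on(f, a, b, lo, hi, samples=4001):
    """Sampled estimate of the oscillation of f on (lo, hi) intersected with [a, b]."""
    left, right = max(a, lo), min(b, hi)
    values = [f(left + (right - left) * i / (samples - 1))
              for i in range(samples)]
    # By (8), the oscillation equals sup f - inf f over the interval.
    return max(values) - min(values)

def step_plus_ramp(x):
    # Continuous except for a jump of size 3 at x = 1.
    return x + (3.0 if x >= 1.0 else 0.0)

a, b = 0.0, 2.0
hs = [0.1, 0.01, 0.001]
at_jump = [oscillation_on(step_plus_ramp, a, b, 1.0 - h, 1.0 + h) for h in hs]
at_smooth = [oscillation_on(step_plus_ramp, a, b, 0.5 - h, 0.5 + h) for h in hs]

print(at_jump)
print(at_smooth)
```

The decreasing behaviour as $h$ shrinks reflects the Monotone Property of Suprema used in the proof of Remark 9.45.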
9.46 Remark. Let $f : [a, b] \to \mathbf{R}$ be bounded. If $E$ represents the set of points of discontinuity of $f$ in $[a, b]$, then
$$E = \bigcup_{j=1}^{\infty} \Bigl\{t \in [a, b] : \omega_f(t) \ge \frac{1}{j}\Bigr\}.$$
PROOF. By (8), $f$ is continuous at $t \in [a, b]$ if and only if $\omega_f(t) = 0$. Hence, $t$ belongs to $E$ if and only if $\omega_f(t) > 0$. Since, by the Archimedean Principle, $\omega_f(t) > 0$ if and only if $\omega_f(t) \ge 1/j$ for some $j \in \mathbf{N}$, the result follows at once. ∎
We need two technical results about the oscillation of $f$ at a point $t$.
9.47 Lemma. Let $f : [a, b] \to \mathbf{R}$ be bounded. For each $\varepsilon > 0$, the set
$$H = \{t \in [a, b] : \omega_f(t) \ge \varepsilon\}$$
is compact.
PROOF. By definition, $H$ is bounded (by $\max\{|a|, |b|\}$). Hence, if the lemma is false, then $H$ is not closed. Hence, there are points $t_k \in H$ such that $t_k \to t$ as $k \to \infty$ but $t \notin H$. Since $\omega_f(t) < \varepsilon$, it follows that there is an $h_0 > 0$ such that
(9) $$\Omega_f((t - h_0, t + h_0)) < \varepsilon.$$
Since $t_k \to t$, choose $N \in \mathbf{N}$ so that
$$(t_N - h_0/2,\; t_N + h_0/2) \subseteq (t - h_0,\; t + h_0).$$
Then, by (9), $\Omega_f((t_N - h_0/2, t_N + h_0/2)) < \varepsilon$. Therefore, $\omega_f(t_N) < \varepsilon$, which contradicts the fact that $t_N \in H$. ∎
9.48 Lemma. Let $I$ be a closed bounded interval and $f : I \to \mathbf{R}$ be bounded. If $\varepsilon > 0$ and $\omega_f(t) < \varepsilon$ for all $t \in I$, then there is a $\delta > 0$ such that $\Omega_f(J) < \varepsilon$ for all closed intervals $J \subseteq I$ that satisfy $|J| < \delta$.
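PROOF. For each $t \in I$, choose $\delta_t > 0$ such that
(10) $$\Omega_f((t - \delta_t, t + \delta_t)) < \varepsilon.$$
Since $\delta_t/2 > 0$, use the Heine-Borel Theorem to choose $t_1, \dots, t_N$ such that
$$I \subset \bigcup_{j=1}^{N} \Bigl(t_j - \frac{\delta_{t_j}}{2},\; t_j + \frac{\delta_{t_j}}{2}\Bigr)$$
and set
$$\delta = \min_{1 \le j \le N} \frac{\delta_{t_j}}{2}.$$
If $J \subseteq I$, then
$$J \cap \Bigl(t_j - \frac{\delta_{t_j}}{2},\; t_j + \frac{\delta_{t_j}}{2}\Bigr) \ne \emptyset$$
for some $j \in \{1, \dots, N\}$. If $J$ also satisfies $|J| < \delta$, then it follows from $2\delta \le \delta_{t_j}$ that $J \subseteq (t_j - \delta_{t_j}, t_j + \delta_{t_j})$. In particular, (10) implies
$$\Omega_f(J) \le \Omega_f((t_j - \delta_{t_j}, t_j + \delta_{t_j})) < \varepsilon. \; ∎$$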
9.49 THEOREM [LEBESGUE]. Let $f : [a, b] \to \mathbf{R}$ be bounded. Then $f$ is Riemann integrable on $[a, b]$ if and only if $f$ is almost everywhere continuous on $[a, b]$. In particular, if $f$ is bounded and has countably many points of discontinuity on $[a, b]$, then $f$ is integrable on $[a, b]$.
PROOF. Let $E$ be the set of points of discontinuity of $f$ in $[a, b]$. Suppose that $f$ is integrable but $E$ is not of measure zero. By Remarks 9.43 and 9.46, there is a $j_0 \in \mathbf{N}$ such that
$$H := \Bigl\{t \in [a, b] : \omega_f(t) \ge \frac{1}{j_0}\Bigr\}$$
is not of measure zero. In particular, there is an $\varepsilon_0 > 0$ such that if $\{I_k\}_{k \in \mathbf{N}}$ is any collection of intervals that covers $H$, then
(11) $$\sum_{k=1}^{\infty} |I_k| \ge \varepsilon_0.$$
Let $P = \{x_0, \dots, x_n\}$ be a partition of $[a, b]$. If $(x_{k-1}, x_k) \cap H \ne \emptyset$, then by definition, $M_k(f) - m_k(f) \ge 1/j_0$. Hence,
$$U(f, P) - L(f, P) = \sum_{k=1}^{n} (M_k(f) - m_k(f))(x_k - x_{k-1}) \ge \frac{1}{j_0} \sum_{(x_{k-1}, x_k) \cap H \ne \emptyset} (x_k - x_{k-1}).$$
But $\{[x_{k-1}, x_k] : (x_{k-1}, x_k) \cap H \ne \emptyset\}$ is a collection of intervals that covers $H$. Hence, it follows from (11) that
$$U(f, P) - L(f, P) \ge \frac{\varepsilon_0}{j_0} > 0.$$
Therefore, $f$ cannot be integrable on $[a, b]$.
Conversely, suppose that $E$ is of measure zero. Let $M = \sup_{x \in [a, b]} f(x)$ and $m = \inf_{x \in [a, b]} f(x)$. Given $\varepsilon > 0$, choose $j_0 \in \mathbf{N}$ such that
$$\frac{M - m + b - a}{j_0} < \varepsilon.$$
Since $E$ is of measure zero, so is
$$H = \Bigl\{t \in [a, b] : \omega_f(t) \ge \frac{1}{j_0}\Bigr\}.$$
Hence, by Definition 9.41, there exists a collection of intervals that covers $H$, whose lengths sum to a real number less than $1/(2j_0)$. By expanding these intervals slightly, we may suppose that there exist open intervals $I_1, I_2, \dots$ that cover $H$ such that
$$\sum_{\nu=1}^{\infty} |I_\nu| < \frac{1}{j_0}.$$
Hence, by Lemma 9.47, we can choose $N \in \mathbf{N}$ such that $\{I_1, I_2, \dots, I_N\}$ covers $H$ and
(12) $$\sum_{\nu=1}^{N} |I_\nu| < \frac{1}{j_0}.$$
We must find a partition $P$ such that $U(f, P) - L(f, P) < \varepsilon$. The endpoints of the $I_\nu$'s form part of this partition. Other points will come from further division of that part of $[a, b]$ not covered by the $I_\nu$'s. Indeed, let $I' \subseteq [a, b] \setminus (\bigcup_{\nu=1}^{N} I_\nu)$. Since the $I_\nu$'s cover $H$, $\omega_f(t) < 1/j_0$ for all $t \in I'$. Hence, by Lemma 9.48, there is a $\delta > 0$ such that if $J \subseteq I'$ satisfies $|J| < \delta$, then $\Omega_f(J) < 1/j_0$. Subdivide $[a, b] \setminus (\bigcup_{\nu=1}^{N} I_\nu)$ into intervals $J_\ell$, $\ell = 1, \dots, s$, such that $|J_\ell| < \delta$. Then
(13) $$\Omega_f(J_\ell) < \frac{1}{j_0}$$
for $\ell = 1, \dots, s$. Let $P = \{x_0, x_1, \dots, x_n\}$ represent the collection of points $x$ such that $x$ is an endpoint of some $I_\nu$ or of some $J_\ell$. Notice that if $(x_{k-1}, x_k) \cap H \ne \emptyset$, then $x_{k-1}$ and $x_k$ are endpoints of some $I_\nu$, whence by (12),
$$\sum_{(x_{k-1}, x_k) \cap H \ne \emptyset} (M_k(f) - m_k(f))(x_k - x_{k-1}) \le (M - m) \sum_{\nu=1}^{N} |I_\nu| < \frac{M - m}{j_0}.$$
On the other hand, if $(x_{k-1}, x_k) \cap H = \emptyset$, then $x_{k-1}$ and $x_k$ are endpoints of some $J_\ell$, whence by (13),
$$\sum_{(x_{k-1}, x_k) \cap H = \emptyset} (M_k(f) - m_k(f))(x_k - x_{k-1}) \le \frac{1}{j_0} \sum_{k=1}^{n} (x_k - x_{k-1}) = \frac{b - a}{j_0}.$$
Consequently,
$$U(f, P) - L(f, P) < \frac{M - m + b - a}{j_0} < \varepsilon.$$
We conclude that $f$ is integrable on $[a, b]$. ∎
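The "in particular" clause of Lebesgue's Theorem can be illustrated with a bounded function that has countably many discontinuities. The example below is an assumption chosen here, not taken from the text: $f(x) = \sum_{k \ge 1/x} 2^{-k}$ for $x \in (0, 1]$ and $f(0) = 0$, which is increasing, jumps at every point $1/k$, and has the closed form $f(x) = 2^{1 - \lceil 1/x \rceil}$. Because $f$ is increasing, its supremum and infimum on each subinterval are attained at the right and left endpoints, so $U(f, P) - L(f, P)$ can be computed exactly for uniform partitions.

```python
from fractions import Fraction

def f(x):
    """f(x) = sum of 2^(-k) over all k >= 1 with 1/k <= x; f(0) = 0."""
    if x <= 0:
        return Fraction(0)
    m = -(-x.denominator // x.numerator)   # smallest integer m with m >= 1/x
    return Fraction(1, 2 ** (m - 1))       # geometric tail: 2^(1 - m)

def upper_minus_lower(n):
    """U(f, P) - L(f, P) for the uniform partition of [0, 1] into n pieces."""
    xs = [Fraction(i, n) for i in range(n + 1)]
    # f is increasing, so M_k = f(x_k) and m_k = f(x_{k-1}).
    return sum((f(xs[i]) - f(xs[i - 1])) * (xs[i] - xs[i - 1])
               for i in range(1, n + 1))

gaps = {n: upper_minus_lower(n) for n in (10, 100, 1000)}
print(gaps)
```

For this monotone example the sum telescopes to $(f(1) - f(0))/n = 1/n$, so the gap between upper and lower sums can be made smaller than any $\varepsilon$, in agreement with the theorem.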
Recall that if $\alpha > 0$ and $f(x)$ is positive, then $f^\alpha(x) := e^{\alpha \log(f(x))}$. Suppose that $f$ is Riemann integrable. Although Corollary 5.23 implies that $f^n$ is integrable for each $n \in \mathbf{N}$, we have not yet investigated the integrability of noninteger powers of $f$, e.g., $\sqrt{f}$ and $\sqrt[3]{f}$. The following result shows that Lebesgue's Theorem answers the question of integrability for all positive powers of $f$, rational or irrational.
9.50 COROLLARY. If $f : [a, b] \to [0, \infty)$ is Riemann integrable, then so is $f^\alpha$ for every $\alpha > 0$.
In our final application, we use connectivity to characterize the graph of a continuous function.
9.51 THEOREM [CLOSED GRAPH THEOREM]. Let $I$ be a closed interval and $f : I \to \mathbf{R}$. Then $f$ is continuous on $I$ if and only if the graph of $f$ is closed and connected in $\mathbf{R}^2$.
PROOF. For any interval $J \subseteq I$, let $\mathcal{G}(J)$ represent the graph of $y = f(x)$ for $x \in J$. Suppose that $f$ is continuous on $I$. The function $x \mapsto (x, f(x))$ is continuous from $I$ into $\mathbf{R}^2$, and $I$ is connected in $\mathbf{R}$. Thus $\mathcal{G}(I)$ is connected in $\mathbf{R}^2$ by Theorem 9.30. To prove that $\mathcal{G}(I)$ is closed, we shall use Theorem 9.8. Let $x_k \in I$ and $(x_k, f(x_k)) \to (x, y)$ as $k \to \infty$. Then $x_k \to x$ and $f(x_k) \to y$, as $k \to \infty$. Hence, $x \in I$ and since $f$ is continuous, $f(x_k) \to f(x)$, so $y = f(x)$. In particular, the graph of $f$ is closed.
Conversely, suppose that the graph of $f$ is closed and connected in $\mathbf{R}^2$. We first show that $f$ satisfies the Intermediate Value Theorem on $I$. Indeed, suppose to the contrary that there exist $x_1 < x_2$ in $I$ with $f(x_1) \ne f(x_2)$ and a value $y_0$ between $f(x_1)$ and $f(x_2)$ such that $f(t) \ne y_0$ for all $t \in [x_1, x_2]$. Suppose for simplicity that $f(x_1) < f(x_2)$. Since $f(t) \ne y_0$ for any $t \in [x_1, x_2]$, the open sets
$$U = \{(x, y) : x < x_1\} \cup \{(x, y) : x < x_2,\; y < y_0\},$$
$$V = \{(x, y) : x > x_2\} \cup \{(x, y) : x > x_1,\; y > y_0\}$$
separate $\mathcal{G}(I)$, a contradiction. Therefore, $f$ satisfies the Intermediate Value Theorem on $I$.
If $f$ is not continuous on $I$, then there exist numbers $x_0 \in I$, $\varepsilon_0 > 0$, and $x_k \in I$ such that $x_k \to x_0$ and $|f(x_k) - f(x_0)| > \varepsilon_0$. By symmetry, we may suppose that $f(x_k) > f(x_0) + \varepsilon_0$ for infinitely many $k$'s, say
$$f(x_{k_j}) > f(x_0) + \varepsilon_0 > f(x_0), \qquad j \in \mathbf{N}.$$
By the Intermediate Value Theorem, choose $c_j$ between $x_{k_j}$ and $x_0$ such that $f(c_j) = f(x_0) + \varepsilon_0$. By construction, $(c_j, f(c_j)) \to (x_0, f(x_0) + \varepsilon_0)$ and $c_j \to x_0$ as $j \to \infty$. Hence, the graph of $f$ on $I$ is not closed. ∎
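The last step of this proof can be watched in a standard example, chosen here as an illustration and not taken from the text: $f(x) = \sin(1/x)$ for $x \ne 0$ and $f(0) = 0$ on $I = [0, 1]$. This $f$ is discontinuous at $0$, and the points $x_k = 1/(\pi/2 + 2\pi k)$ give graph points $(x_k, f(x_k))$ that approach $(0, 1)$, which is not on the graph. The sketch below checks this numerically up to floating-point error.

```python
import math

def f(x):
    # sin(1/x) away from 0, and 0 at x = 0 (discontinuous at 0).
    return math.sin(1.0 / x) if x != 0 else 0.0

# Points x_k -> 0 at which sin(1/x_k) is (numerically) equal to 1.
xs = [1.0 / (math.pi / 2 + 2 * math.pi * k) for k in range(1, 2001, 400)]
graph_points = [(x, f(x)) for x in xs]

limit_candidate = (0.0, 1.0)   # limit of the graph points; not on the graph
distances = [math.dist(p, limit_candidate) for p in graph_points]

print(distances)
print(f(0.0))
```

The distances shrink toward $0$ while $f(0) = 0 \ne 1$, so the graph contains a convergent sequence whose limit lies outside it; that is, the graph is not closed, as Theorem 9.51 predicts for a discontinuous function.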
EXERCISES
1. Suppose that $f_k : [a, b] \to [0, \infty)$ for $k \in \mathbf{N}$ and
$$f(x) := \sum_{k=1}^{\infty} f_k(x)$$
converges pointwise on $[a, b]$. If $f$ and $f_k$ are continuous on $[a, b]$ for each $k \in \mathbf{N}$, prove that
$$\int_a^b f(x)\,dx = \sum_{k=1}^{\infty} \int_a^b f_k(x)\,dx.$$
2. Let $E$ be closed and bounded in $\mathbf{R}^n$ and let $g, f_k, g_k : E \to \mathbf{R}$ be continuous on $E$ with $g_k \ge 0$ and $f_1 \ge f_2 \ge \cdots \ge f_k \ge 0$ for $k \in \mathbf{N}$. If $g = \sum_{k=1}^{\infty} g_k$ converges pointwise on $E$, prove that $\sum_{k=1}^{\infty} f_k g_k$ converges uniformly on $E$.
3. Suppose that $f, f_k : \mathbf{R} \to [0, \infty)$ are continuous. Prove that if $f(x) \to 0$ as $x \to \pm\infty$ and $f_k \uparrow f$ everywhere on $\mathbf{R}$, then $f_k \to f$ uniformly on $\mathbf{R}$.
4. For each of the following functions, find a formula for $\omega_f(t)$.
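(a) $f(x) = \begin{cases} 1 & x \in \mathbf{Q} \\ 0 & x \notin \mathbf{Q}. \end{cases}$
(b) $f(x) = \begin{cases} 1 & x \ge 0 \\ 0 & x < 0. \end{cases}$
(c) $f(x) = \begin{cases} \sin(1/x) & x \ne 0 \\ 0 & x = 0. \end{cases}$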
5. Prove that $(1 - x/k)^k \to e^{-x}$ uniformly on any closed, bounded subset of $\mathbf{R}$.
6. Show that if $f : [a,b] \to \mathbf{R}$ is integrable and $g : f([a,b]) \to \mathbf{R}$ is continuous, then $g \circ f$ is integrable on $[a,b]$. (Notice by Remark 3.34 that this result is false if $g$ is allowed even one point of discontinuity.)
7. Using Theorem 7.10 or Theorem 9.30, prove that each of the following limits exists. Find a value for the limit in each case.
(a)
(b)
lim
k--+oo
l
n/2
0
sin x
lim f1 x 2f k--+oo
10
~k k 3 dx. 4 - x
(-k/ ) dx, + x
where $f$ is continuously differentiable on $[0,1]$ and $f'(0) > 0$.
(c) $\displaystyle \lim_{k\to\infty} \int_0^1 x^3 \cos\left(\frac{\log k + x}{k + x}\right) dx.$
(d)
8. (a) Prove that for every $\varepsilon > 0$ there is a sequence of open intervals $\{I_k\}_{k\in\mathbf{N}}$ that covers $[0,1] \cap \mathbf{Q}$ such that
(b) Prove that if $\{I_k\}_{k\in\mathbf{N}}$ is a sequence of open intervals that covers $[0,1]$, then there is an $N \in \mathbf{N}$ such that
9. Let $E_1$ be the unit interval $[0,1]$ with its middle third $(1/3, 2/3)$ removed; i.e., $E_1 = [0,1/3] \cup [2/3,1]$. Let $E_2$ be $E_1$ with its middle thirds removed; i.e.,
\[ E_2 = [0,1/9] \cup [2/9,1/3] \cup [2/3,7/9] \cup [8/9,1]. \]
Continuing in this manner, generate nested sets $E_k$ such that each $E_k$ is the union of $2^k$ closed intervals of length $1/3^k$. The Cantor set is the set
\[ E := \bigcap_{k=1}^{\infty} E_k. \]
Assume that every point $x \in [0,1]$ has a binary expansion and a ternary expansion; i.e., there exist $a_k \in \{0,1\}$ and $b_k \in \{0,1,2\}$ such that
\[ x = \sum_{k=1}^{\infty} \frac{a_k}{2^k} = \sum_{k=1}^{\infty} \frac{b_k}{3^k}. \]
(For example, if $x = 1/3$, then $a_{2k-1} = 0$, $a_{2k} = 1$ for all $k$ and either $b_1 = 1$, $b_k = 0$ for $k > 1$ or $b_1 = 0$ and $b_k = 2$ for all $k > 1$.)
(a) Prove that $E$ is a nonempty compact set of measure zero.
(b) Show that a point $x \in [0,1]$ belongs to $E$ if and only if $x$ has a ternary expansion whose digits satisfy $b_k \ne 1$ for all $k \in \mathbf{N}$.
(c) Define $f : E \to [0,1]$ by
\[ f\left(\sum_{k=1}^{\infty} \frac{b_k}{3^k}\right) = \sum_{k=1}^{\infty} \frac{b_k/2}{2^k}. \]
Prove that there is a countable subset $E_0$ of $E$ such that $f$ is 1-1 from $E \setminus E_0$ onto $[0,1]$; i.e., prove that $E$ is uncountable.
(d) Extend $f$ from $E$ to $[0,1]$ by making $f$ constant on the middle thirds $E_{k-1} \setminus E_k$. Prove that $f : [0,1] \to [0,1]$ is continuous and increasing. (Note: The function $f$ is almost everywhere constant on $[0,1]$, i.e., constant off a set of measure zero. Yet, it begins at $f(0) = 0$ and ends at $f(1) = 1$.)
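The map in Exercise 9(c) can be explored with exact arithmetic. The sketch below assumes the usual reading of that exercise, namely that a point of the Cantor set with ternary digits $b_k \in \{0, 2\}$ is sent to the number whose binary digits are $b_k/2$. The name `cantor_f` and the use of a finite digit list (all later digits taken to be 0) are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction

def cantor_f(ternary_digits):
    """Send x = sum b_k / 3^k (digits b_k in {0, 2}) to sum (b_k / 2) / 2^k.

    Only finitely many digits are supplied; the remaining ones are treated as 0.
    """
    if any(b not in (0, 2) for b in ternary_digits):
        raise ValueError("points of the Cantor set use only the digits 0 and 2")
    return sum(Fraction(b // 2, 2 ** k)
               for k, b in enumerate(ternary_digits, start=1))

# x = 2/3 has ternary digits 2, 0, 0, ...
print(cantor_f([2]))
# x = 1/3 = 0.0222..._3, truncated after 30 digits
print(cantor_f([0] + [2] * 29))
```

Both inputs give values at or near 1/2, which hints at why a countable set of endpoints has to be removed before the map becomes 1-1.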
Chapter 10 METRIC SPACES
This chapter, an alternative to Chapter 9, covers topological ideas in a metric space setting. If you have already covered Chapter 9, skip this chapter and proceed directly to Chapter 11.

10.1 INTRODUCTION

The following concept shows up in many parts of analysis.
10.1 DEFINITION. A metric space is a set $X$ together with a function $\rho : X \times X \to \mathbf{R}$ (called the metric of $X$) that satisfies the following properties for all $x, y, z \in X$:

POSITIVE DEFINITE: $\rho(x,y) \ge 0$ with $\rho(x,y) = 0$ if and only if $x = y$,
SYMMETRIC: $\rho(x,y) = \rho(y,x)$,
TRIANGLE INEQUALITY: $\rho(x,y) \le \rho(x,z) + \rho(z,y)$.

(Notice that by definition, $\rho(x,y)$ is finite-valued for all $x, y \in X$.) We are already very familiar with a whole class of metric spaces.
10.2 Example. For each $n \in \mathbf{N}$, $\mathbf{R}^n$ is a metric space with metric $\rho(x,y) = \|x - y\|$. (We shall call this the usual metric on $\mathbf{R}^n$. Unless specified otherwise, we shall always use the usual metric on $\mathbf{R}^n$.)

PROOF. By Theorems 1.7 and 8.6, $\rho$ is a metric on $\mathbf{R}^n$. ∎
We shall develop a theory of convergence (for both sequences and functions) for arbitrary metric spaces. According to Example 10.2, this theory is valid (and will be used by us almost exclusively) on $\mathbf{R}^n$. Why, then, subject ourselves to such stark generality? Why not stick with the concrete Euclidean space case? There are at least three answers to these questions:

(1) Economy. You will soon discover that there are many other metric spaces that crop up in analysis, e.g., all Hilbert spaces, all normed linear spaces, and many function spaces, including the space of continuous functions on a closed bounded interval. Our general theory of convergence in metric spaces will be valid for each of these examples too.
(2) Visualization. As we mentioned in Section 1.1, analysis has a strong geometric flavor. Working in an abstract metric space only makes that aspect more apparent.
(3) Simplicity. Emphasizing the fact that $\mathbf{R}^n$ is a metric space strips $\mathbf{R}^n$ of all extraneous details (the field operations, the order relation, decimal expansions) so that we can focus our attention on the underlying concept (distance) that governs convergence. Mathematics frequently benefits from such abstraction. Instead of becoming more difficult, generality actually makes the proofs easier to construct.

On the other hand, $\mathbf{R}^2$ provides a good and sufficiently general model for most of the theory of abstract metric spaces (especially, convergence of sequences and continuity of functions). For this reason, we often draw two-dimensional pictures to illustrate ideas and motivate proofs in an arbitrary metric space. (For example, see the proof of Remark 10.9.) We must not, however, mislead ourselves by believing that $\mathbf{R}^2$ provides the complete picture. Metric spaces have such simple structure that they can take on many bizarre forms. With that in mind, we introduce several more examples.

10.3 Example. $\mathbf{R}$ is a metric space with metric
\[ \sigma(x,y) = \begin{cases} 0 & x = y \\ 1 & x \ne y. \end{cases} \]
(This metric is called the discrete metric.)

PROOF. The function $\sigma$ is obviously positive definite and symmetric. To prove that $\sigma$ satisfies the triangle inequality, we consider three cases. If $x = z$, then $\sigma(x,y) = 0 + \sigma(z,y) = \sigma(x,z) + \sigma(z,y)$. A similar equality holds if $y = z$. Finally, if $x \ne z$ and $y \ne z$, then $\sigma(x,y) \le 1 < 2 = \sigma(x,z) + \sigma(z,y)$. ∎
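The three axioms of Definition 10.1 can be spot-checked numerically on a finite sample of points. The sketch below does this for the usual metric on $\mathbf{R}^2$ and for the discrete metric. The function names and the sample are illustrative choices, and passing such a check on finitely many points is only a sanity test, not a proof.

```python
import itertools
import math

def usual_metric(x, y):
    """Usual metric on R^n: the Euclidean norm of x - y."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def discrete_metric(x, y):
    """Discrete metric: 0 if the points coincide, 1 otherwise."""
    return 0 if x == y else 1

def satisfies_axioms(metric, points, tol=1e-12):
    """Test positive definiteness, symmetry, and the triangle inequality
    for every triple drawn from a finite sample of points."""
    for x, y, z in itertools.product(points, repeat=3):
        d = metric(x, y)
        if d < 0 or (d == 0) != (x == y):
            return False          # not positive definite
        if abs(d - metric(y, x)) > tol:
            return False          # not symmetric
        if d > metric(x, z) + metric(z, y) + tol:
            return False          # triangle inequality fails
    return True

sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (-1.5, 0.5)]
print(satisfies_axioms(usual_metric, sample))
print(satisfies_axioms(discrete_metric, sample))
```

A function such as the squared distance $(x - y)^2$ on $\mathbf{R}$ fails the same check, because it violates the triangle inequality.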
Comparing Examples 10.2 and 10.3, we see that a given set can have more than one metric. Hence, to describe a particular metric space, we must specify both the set $X$ and the metric $\rho$. For the rest of this chapter (unless otherwise stated), $X$ and $Y$ will represent arbitrary metric spaces (with respective metrics $\rho$ and $\tau$).

10.4 Example. If $E \subseteq X$, then $E$ is a metric space with metric $\rho$. (We shall call such metric spaces $E$ subspaces of $X$.)

PROOF. If the positive definite property, the symmetric property, and the triangle inequality hold for all $x, y \in X$, then they hold for all $x, y \in E$. ∎
A particular example of a subspace is provided by the set of rationals in R.
10.5 Example. $\mathbf{Q}$ is a metric space with metric $\rho(x,y) = |x - y|$.

Metric spaces are by no means confined to numbers and vectors. Here is an important metric space whose "points" are functions.

10.6 Example. Let $C[a,b]$ represent the collection of continuous $f : [a,b] \to \mathbf{R}$ and
\[ \|f\| := \sup_{x \in [a,b]} |f(x)|. \]
Then $\rho(f,g) := \|f - g\|$ is a metric on $C[a,b]$.

PROOF. By the Extreme Value Theorem, $\|f\|$ is finite for each $f \in C[a,b]$. By definition, $\|f\| \ge 0$ for all $f$, and $\|f\| = 0$ if and only if $f(x) = 0$ for every $x \in [a,b]$. Thus $\rho$ is positive definite. Since $\rho$ is obviously symmetric, it remains to verify the triangle inequality. But
\[ \|f + g\| = \sup_{x\in[a,b]} |f(x) + g(x)| \le \sup_{x\in[a,b]} |f(x)| + \sup_{x\in[a,b]} |g(x)| = \|f\| + \|g\|. \]
Applying this inequality to $f - h$ and $h - g$ gives $\rho(f,g) \le \rho(f,h) + \rho(h,g)$ for all $f, g, h \in C[a,b]$. ∎
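The metric of Example 10.6 can be approximated by sampling $|f(x) - g(x)|$ on a fine grid. The sketch below is only an approximation under that assumption: a grid maximum is a lower bound for the true supremum, and the helper name `sup_distance` and the grid size are illustrative choices.

```python
import math

def sup_distance(f, g, a, b, samples=10001):
    """Approximate rho(f, g) = sup |f(x) - g(x)| over [a, b]
    by taking the maximum over a uniform grid of sample points."""
    step = (b - a) / (samples - 1)
    return max(abs(f(a + i * step) - g(a + i * step)) for i in range(samples))

# Distance between sin and cos in C[0, pi]; the exact value is sqrt(2),
# attained at x = 3*pi/4.
d_sin_cos = sup_distance(math.sin, math.cos, 0.0, math.pi)
print(d_sin_cos)
```

Because the maximum of a sum never exceeds the sum of the maxima, the grid version satisfies the triangle inequality for the same reason the supremum does.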
It is interesting to note that convergence in this metric space means uniform convergence (see Exercise 8, p. 300). There are two ways to generalize open and closed intervals to arbitrary metric spaces. One way is to use the metric directly as follows.
10.7 DEFINITION. Let $a \in X$ and $r > 0$. The open ball (in $X$) with center $a$ and radius $r$ is the set
\[ B_r(a) := \{x \in X : \rho(x,a) < r\}, \]
and the closed ball (in $X$) with center $a$ and radius $r$ is the set
\[ \{x \in X : \rho(x,a) \le r\}. \]

Notice by Theorem 1.6 that in $\mathbf{R}$ (with the usual metric), the open ball (respectively, the closed ball) centered at $a$ of radius $r$ is $(a - r, a + r)$ (respectively, $[a - r, a + r]$); i.e., open balls are open intervals and closed balls are closed intervals. With respect to the discrete metric, however, balls look quite different. For example, the closed ball and the open ball centered at a point $a$ are both equal to $\{a\}$ for all $0 < r < 1$.

The other way to generalize open and closed intervals to $X$ is to specify what "open" and "closed" mean. Notice that every point $x$ in an open interval $I$ is surrounded by points in $I$. The same property holds for complements of closed intervals. This leads us to the following definition.

10.8 DEFINITION. (i) A set $V \subseteq X$ is said to be open if and only if for every $x \in V$ there is an $\varepsilon > 0$ such that the open ball $B_\varepsilon(x)$ is contained in $V$.
(ii) A set $E \subseteq X$ is said to be closed if and only if $E^c := X \setminus E$ is open.
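To see how differently balls behave under the usual and the discrete metric, one can compute them inside a small finite sample of real numbers. The following sketch does so; the sample set and the helper names `open_ball` and `closed_ball` are illustrative, and the balls are taken relative to the sample rather than all of $\mathbf{R}$.

```python
def usual_metric(x, y):
    """Usual metric on R."""
    return abs(x - y)

def discrete_metric(x, y):
    """Discrete metric: 0 if x == y, 1 otherwise."""
    return 0 if x == y else 1

def open_ball(metric, points, a, r):
    """Points of the sample whose distance to a is strictly less than r."""
    return {x for x in points if metric(x, a) < r}

def closed_ball(metric, points, a, r):
    """Points of the sample whose distance to a is at most r."""
    return {x for x in points if metric(x, a) <= r}

sample = {-2, -1, 0, 1, 2}
print(sorted(open_ball(usual_metric, sample, 0, 1.5)))
print(sorted(open_ball(discrete_metric, sample, 0, 0.5)))
print(sorted(closed_ball(discrete_metric, sample, 0, 1)))
```

For radius 1 in the discrete metric the open ball is still the single point, while the closed ball is the whole sample.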
Our first result about these concepts shows that they are consistent as applied to balls.
10.9 Remark. Every open ball is open, and every closed ball is closed.

PROOF. Let $B_r(a)$ be an open ball. By definition, we must prove that given $x \in B_r(a)$ there is an $\varepsilon > 0$ such that $B_\varepsilon(x) \subseteq B_r(a)$. Let $x \in B_r(a)$ and set $\varepsilon = r - \rho(x,a)$. (Look at Figure 8.5 to see why this choice of $\varepsilon$ should work.) If $y \in B_\varepsilon(x)$, then by the triangle inequality, assumption, and the choice of $\varepsilon$,
\[ \rho(y,a) \le \rho(y,x) + \rho(x,a) < \varepsilon + \rho(x,a) = r. \]
Thus by Definition 10.7, $y \in B_r(a)$. In particular, $B_\varepsilon(x) \subseteq B_r(a)$. Similarly, we can show that $\{x \in X : \rho(x,a) > r\}$ is also open. Hence, every closed ball is closed. ∎
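The choice $\varepsilon = r - \rho(x,a)$ in the proof of Remark 10.9 can be tested empirically in $\mathbf{R}^2$ with the usual metric. The sketch below samples points of $B_\varepsilon(x)$ and confirms that each lies in $B_r(a)$. The function name, the sample size, and the 0.999 safety factor (which guards against floating-point rounding) are assumptions made for illustration.

```python
import math
import random

def check_ball_inclusion(a, r, x, trials=10000, seed=0):
    """Sample points y of B_eps(x), eps = r - rho(x, a), and test rho(y, a) < r."""
    eps = r - math.dist(x, a)      # the radius chosen in the proof
    if eps <= 0:
        raise ValueError("x must lie in the open ball B_r(a)")
    rng = random.Random(seed)
    for _ in range(trials):
        # Polar sampling; the factor 0.999 keeps every sample strictly
        # inside B_eps(x) despite floating-point rounding.
        radius = 0.999 * eps * rng.random()
        angle = 2 * math.pi * rng.random()
        y = (x[0] + radius * math.cos(angle), x[1] + radius * math.sin(angle))
        if math.dist(y, a) >= r:
            return False
    return True

print(check_ball_inclusion(a=(0.0, 0.0), r=1.0, x=(0.6, 0.3)))
```

The check mirrors the estimate $\rho(y,a) \le \rho(y,x) + \rho(x,a) < r$ but does not replace it, since only finitely many points are tested.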
Here are more examples of open sets and closed sets.
10.10 Remark. If $a \in X$, then $X \setminus \{a\}$ is open and $\{a\}$ is closed.

PROOF. By Definition 10.8, it suffices to prove that the complement of every singleton $E := \{a\}$ is open. Let $x \in E^c$ and set $\varepsilon = \rho(x,a)$, which is positive because $x \ne a$. Then by Definition 10.7, $a \notin B_\varepsilon(x)$, so $B_\varepsilon(x) \subseteq E^c$. Therefore, $E^c$ is open by Definition 10.8. ∎

Students sometimes mistakenly believe that every set is either open or closed. Some sets are neither open nor closed (like the interval $[0,1)$), and as the following result shows, every metric space contains two special sets that are both open and closed.

10.11 Remark. In an arbitrary metric space, the empty set $\emptyset$ and the whole space $X$ are both open and closed.

PROOF. Since $X = \emptyset^c$ and $\emptyset = X^c$, it suffices by Definition 10.8 to prove that $\emptyset$ and $X$ are both open. Because the empty set contains no points, "every" point $x \in \emptyset$ satisfies $B_\varepsilon(x) \subseteq \emptyset$. (This is called the vacuous implication.) Therefore, $\emptyset$ is open. On the other hand, since $B_\varepsilon(x) \subseteq X$ for all $x \in X$ and all $\varepsilon > 0$, it is clear that $X$ is open. ∎
For some metric spaces (like Rn), these are the only two sets that are simultaneously open and closed. For other metric spaces, there are many such sets.
10.12 Example. In the discrete space $\mathbf{R}$, every set is both open and closed.

PROOF. It suffices to prove that every subset of $\mathbf{R}$ is open (with respect to the discrete metric). Let $E \subseteq \mathbf{R}$. By Remark 10.11, we may assume that $E$ is nonempty. Let $a \in E$. Since $B_1(a) = \{a\}$, some open ball containing $a$ is a subset of $E$. By Definition 10.8, $E$ is open. ∎
To see how these concepts are connected with limits, we examine convergence of sequences in an arbitrary metric space. Using the analogy between the metric p and the absolute value, we can transfer much of the theory of limits of sequences from R to any metric space. Here are the basic definitions.
10.13 DEFINITION. Let $\{x_n\}$ be a sequence in a metric space $X$.
(i) $\{x_n\}$ converges (in $X$) if there is a point $a \in X$ (called the limit of $x_n$) such that for every $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that
\[ n \ge N \quad\text{implies}\quad \rho(x_n, a) < \varepsilon. \]
(ii) $\{x_n\}$ is Cauchy if for every $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that
\[ n, m \ge N \quad\text{implies}\quad \rho(x_n, x_m) < \varepsilon. \]
(iii) $\{x_n\}$ is bounded if there is an $M > 0$ and a $b \in X$ such that $\rho(x_n, b) \le M$ for all $n \in \mathbf{N}$.

Modifying the proofs in Chapter 2, by doing little more than replacing $|x - y|$ by $\rho(x,y)$, we can establish the following result.

10.14 THEOREM. Let $X$ be a metric space.
(i) A sequence in $X$ can have at most one limit.
(ii) If $x_n \in X$ converges to $a$ and $\{x_{n_k}\}$ is any subsequence of $\{x_n\}$, then $x_{n_k}$ converges to $a$ as $k \to \infty$.
(iii) Every convergent sequence in $X$ is bounded.
(iv) Every convergent sequence in $X$ is Cauchy.

The following result shows that by using open sets, we can describe convergence of sequences in an arbitrary metric space without reference to the distance function. Later in this chapter, we shall use this point of view to great advantage.
10.15 Remark. Let $x_n \in X$. Then $x_n \to a$ as $n \to \infty$ if and only if for every open set $V$ that contains $a$, there is an $N \in \mathbf{N}$ such that $n \ge N$ implies $x_n \in V$.

PROOF. Suppose that $x_n \to a$, and let $V$ be an open set that contains $a$. By Definition 10.8, there is an $\varepsilon > 0$ such that $B_\varepsilon(a) \subseteq V$. Given this $\varepsilon$, use Definition 10.13 to choose an $N \in \mathbf{N}$ such that $n \ge N$ implies $x_n \in B_\varepsilon(a)$. By the choice of $\varepsilon$, $x_n \in V$ for all $n \ge N$.
Conversely, let $\varepsilon > 0$ and set $V = B_\varepsilon(a)$. Then $V$ is an open set that contains $a$; hence, by hypothesis, there is an $N \in \mathbf{N}$ such that $n \ge N$ implies $x_n \in V$. In particular, $\rho(x_n, a) < \varepsilon$ for all $n \ge N$. ∎
The following result, which we shall use many times, shows that convergent sequences can also be used to characterize closed sets.
10.16 THEOREM. Let $E \subseteq X$. Then $E$ is closed if and only if the limit of every convergent sequence $x_k \in E$ satisfies
\[ \lim_{k\to\infty} x_k \in E. \]

PROOF. The theorem is vacuously satisfied if $E$ is the empty set. Suppose that $E \ne \emptyset$ is closed but some sequence $x_n \in E$ converges to a point $x \in E^c$. Since $E$ is closed, $E^c$ is open. Thus, by Remark 10.15, there is an $N \in \mathbf{N}$ such that $n \ge N$ implies $x_n \in E^c$, a contradiction.
Conversely, suppose that $E$ is a nonempty set such that every convergent sequence in $E$ has its limit in $E$. If $E$ is not closed, then by Remark 10.11, $E \ne X$, and by definition, $E^c$ is nonempty and not open. Thus, there is at least one point $x \in E^c$ such that no ball $B_r(x)$ is contained in $E^c$; i.e., every ball $B_r(x)$ contains a point of $E$. Let $x_k \in B_{1/k}(x) \cap E$ for $k = 1, 2, \ldots$. Then $x_k \in E$ and $\rho(x_k, x) < 1/k$ for all $k \in \mathbf{N}$. Now $1/k \to 0$ as $k \to \infty$, so it follows from the Squeeze Theorem (these are real sequences) that $\rho(x_k, x) \to 0$ as $k \to \infty$; i.e., $x_k \to x$ as $k \to \infty$. Thus, by hypothesis, $x \in E$, a contradiction. ∎

Notice that the Bolzano-Weierstrass Theorem and Cauchy's Theorem are missing from Theorem 10.14. There is a simple reason for this. As the next two remarks show, neither of these results holds in an arbitrary metric space.
10.17 Remark. The discrete space contains bounded sequences that have no convergent subsequences.

PROOF. Let $X = \mathbf{R}$ be the discrete metric space introduced in Example 10.3. Since $\sigma(0,k) = 1$ for all $k \in \mathbf{N}$, $\{k\}$ is a bounded sequence in $X$. Suppose that there exist integers $k_1 < k_2 < \cdots$ and an $x \in X$ such that $k_j \to x$ as $j \to \infty$. Then there is an $N \in \mathbf{N}$ such that $\sigma(k_j, x) < 1$ for $j \ge N$; i.e., $k_j = x$ for all $j \ge N$. Since the $k_j$ are strictly increasing, this is a contradiction, which proves that $\{k\}$ has no convergent subsequences. ∎

10.18 Remark. The metric space $X = \mathbf{Q}$, introduced in Example 10.5, contains Cauchy sequences that do not converge.

PROOF. Choose (by the Density of Rationals) points $q_k \in \mathbf{Q}$ such that $q_k \to \sqrt{2}$. Then $\{q_k\}$ is Cauchy (by Theorem 10.14iv) but does not converge in $X$ since $\sqrt{2} \notin X$. ∎

This leads us to the following concept.
10.19 DEFINITION. A metric space X is said to be complete if and only if every Cauchy sequence Xn E X converges to some point in X. At this point, you should read Section 9.1 to see how these concepts play out in the concrete Euclidean space setting. Notice by Theorem 9.6 that Rn is complete for all n E N. What can be said about complete metric spaces in general?
10.20 Remark. By Definition 10.19, a complete metric space X satisfies two properties: (1) every Cauchy sequence in X converges; (2) the limit of every Cauchy sequence in X stays in X. Property (2), by Theorem 10.16, means that X is closed. Hence, it is natural to ask: Is there a simple relationship between complete subspaces and closed subsets?
10.21 THEOREM. Let $X$ be a complete metric space and $E$ be a subset of $X$. Then $E$ (as a subspace) is complete if and only if $E$ (as a subset) is closed.

PROOF. Suppose that $E$ is complete and $x_n \in E$ converges. By Theorem 10.14iv, $\{x_n\}$ is Cauchy. Since $E$ is complete, it follows from Definition 10.19 that the limit of $\{x_n\}$ belongs to $E$. Thus, by Theorem 10.16, $E$ is closed.
Conversely, suppose that $E$ is closed and $x_n \in E$ is Cauchy in $E$. Since the metrics on $X$ and $E$ are identical, $\{x_n\}$ is Cauchy in $X$. Since $X$ is complete, it follows that $x_n \to x$, as $n \to \infty$, for some $x \in X$. But $E$ is closed, so $x$ must belong to $E$. Thus $E$ is complete by definition. ∎
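Remark 10.18 (and, through Theorem 10.21, the fact that $\mathbf{Q}$ is not closed in $\mathbf{R}$) can be made concrete with exact rational arithmetic. The sketch below generates rational approximations to $\sqrt{2}$ by Newton's iteration $q_{k+1} = (q_k + 2/q_k)/2$. This particular sequence is an illustrative choice; the proof above only asserts that some such sequence exists by the Density of Rationals.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """Rational Newton iterates for x^2 = 2, starting from q_0 = 1."""
    q = Fraction(1)
    iterates = [q]
    for _ in range(steps):
        q = (q + 2 / q) / 2
        iterates.append(q)
    return iterates

qs = newton_sqrt2(5)
for q in qs:
    # Every term is rational, and q*q - 2 shrinks rapidly but is never 0.
    print(q, float(q * q - 2))
```

The terms cluster together (the sequence is Cauchy in $\mathbf{Q}$), yet no rational number has square 2, so the sequence has no limit inside $\mathbf{Q}$.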
EXERCISES

1. If $a, b \in X$ and $\rho(a,b) < \varepsilon$ for all $\varepsilon > 0$, prove that $a = b$.
2. (a) Prove that $\{x_k\}$ is bounded in $X$ if and only if $\sup_{k\in\mathbf{N}} \rho(x_k, a) < \infty$ for all $a \in X$.
(b) Prove that $\{x_k\}$ is bounded in $\mathbf{R}^n$ if and only if there is a $C > 0$ such that $\|x_k\| \le C$ for all $k \in \mathbf{N}$.
3. Prove Theorem 10.14.
4. (a) Let $a \in X$. Prove that if $x_n = a$ for every $n \in \mathbf{N}$, then $x_n$ converges. What does it converge to?
(b) Let $X = \mathbf{R}$ with the discrete metric. Prove that $x_n \to a$ as $n \to \infty$ if and only if $x_n = a$ for large $n$.
5. (a) Let $\{x_n\}$ and $\{y_n\}$ be sequences in $X$ that converge to the same point. Prove that $\rho(x_n, y_n) \to 0$ as $n \to \infty$.
(b) Show that the converse of part (a) is false.
6. Let $\{x_n\}$ be Cauchy in $X$. Prove that $\{x_n\}$ converges if and only if at least one of its subsequences converges.
7. Prove that the discrete space $\mathbf{R}$ is complete.
8. (a) Prove that every finite subset of a metric space $X$ is closed.
(b) Prove that $\mathbf{Q}$ is not closed in $\mathbf{R}$.
9. (a) Show that if $x \in B_r(a)$, then there is an $\varepsilon > 0$ such that the closed ball centered at $x$ of radius $\varepsilon$ is a subset of $B_r(a)$.
(b) If $a \ne b$ are distinct points in $X$, prove that there is an $r > 0$ such that $B_r(a) \cap B_r(b) = \emptyset$.
(c) Show that given two balls $B_r(a)$ and $B_s(b)$, and a point $x \in B_r(a) \cap B_s(b)$, there are radii $c$ and $d$ such that
10. (a) A subset E of X is said to be sequentially compact if and only if every sequence Xn E E has a convergent subsequence whose limit belongs to E. Prove that every sequentially compact set is closed and bounded. (b) Prove that R is closed but not sequentially compact. (c) Prove that every closed bounded subset of R is sequentially compact.
10.2 LIMITS OF FUNCTIONS In the preceding section we used results in Chapter 2 as a model for the theory of limits of sequences in an arbitrary metric space X. In this section we use results in Chapter 3 as a model to develop a theory of limits of functions that take one metric space X to another Y.
A straightforward adaptation of Definition 3.1 leads us to guess that in an arbitrary metric space, $f(x) \to L$ as $x \to a$ if for every $\varepsilon > 0$ there is a $\delta > 0$ such that
\[ 0 < \rho(x,a) < \delta \quad\text{implies}\quad \tau(f(x), L) < \varepsilon. \]
The only problem with this definition is that there may be no $x$ that satisfies $0 < \rho(x,a) < \delta$; e.g., if $X$ is the set $\mathbf{N}$ together with the metric $\rho(x,y) = |x - y|$ and $\delta = 1$. To prevent our theory from collapsing into the vacuous case, we introduce the following idea.

10.22 DEFINITION. A point $a$ is said to be a cluster point (of $X$) if and only if $B_\delta(a)$ contains infinitely many points for each $\delta > 0$.
For example, every point in any Euclidean space $\mathbf{R}^n$ is a cluster point (of $\mathbf{R}^n$).

Notice that any concept defined on a metric space $X$ is defined automatically on all subsets of $X$. Indeed, since any subset $E$ of $X$ is itself a metric space (see Example 10.4 above), the definition can be applied to $E$ as well as to $X$. To be more specific, let $E$ be a subspace of $X$, i.e., a nonempty subset of $X$. By Definition 10.7 an open ball in $E$ has the form
\[ B_r^E(a) := \{x \in E : \rho(x,a) < r\}. \]
Since the metrics on $X$ and $E$ are the same, it follows that
\[ B_r^E(a) = B_r(a) \cap E, \]
where $B_r(a)$ is an open ball in $X$. A similar statement holds for closed balls. We shall call these balls relative balls (in $E$). In particular, in the subspace $\mathbf{Q}$ of Example 10.5 above, the relative open balls take on the form $B_r(a) = (a - r, a + r) \cap \mathbf{Q}$ and the relative closed balls the form $[a - r, a + r] \cap \mathbf{Q}$.

What, then, does it mean for a set $E$ to have a cluster point? By Definition 10.22, a point $a \in X$ is a cluster point of a nonempty set $E \subseteq X$ if and only if the relative ball $E \cap B_\delta(a)$ contains infinitely many points for each $\delta > 0$. The etymology of the term cluster point is obvious. A cluster point of $E$ is a point near which $E$ "clusters." Cluster points are also called points of accumulation.

Notice that by definition, no finite set has cluster points. On the other hand, a set may have infinitely many cluster points. Indeed, by the Density of Rationals (Theorem 1.24), every point of $\mathbf{R}$ is a cluster point of $\mathbf{Q}$. Here are two more examples of sets and their cluster points.

10.23 Example. Show that 0 is the only cluster point of the set
\[ E = \left\{ \frac{1}{n} : n \in \mathbf{N} \right\}. \]

SOLUTION. By Theorem 1.22 (the Archimedean Principle), given $\delta > 0$ there is an $N \in \mathbf{N}$ such that $1/N < \delta$. Since $n \ge N$ implies $1/n \le 1/N$, it follows that $(-\delta, \delta) \cap E$ contains infinitely many points. Thus 0 is a cluster point of $E$.
On the other hand, if $x_0 \ne 0$, then choose $\delta < |x_0|$, and notice that either $x_0 - \delta > 0$ or $x_0 + \delta < 0$. Thus $(x_0 - \delta, x_0 + \delta) \cap E$ contains at most finitely many points; i.e., $x_0$ is not a cluster point of $E$. ∎
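The contrast in Example 10.23 shows up in a direct count. The sketch below assumes that the set in question is $E = \{1/n : n \in \mathbf{N}\}$, which is what the estimates in the solution suggest, and counts how many of the first `n_max` points of $E$ fall in a given interval $(c - \delta, c + \delta)$. The helper name and the particular values of $c$ and $\delta$ are illustrative.

```python
from fractions import Fraction

def points_in_ball(center, delta, n_max):
    """Count n in {1, ..., n_max} with |1/n - center| < delta."""
    return sum(1 for n in range(1, n_max + 1)
               if abs(Fraction(1, n) - center) < delta)

# Around 0 the count keeps growing as more of E is inspected ...
print(points_in_ball(Fraction(0), Fraction(1, 100), 1000))
print(points_in_ball(Fraction(0), Fraction(1, 100), 10000))
# ... while around x0 = 1/2 with delta = 1/10 < |x0| it stays fixed.
print(points_in_ball(Fraction(1, 2), Fraction(1, 10), 1000))
print(points_in_ball(Fraction(1, 2), Fraction(1, 10), 10000))
```

An unbounded count near 0 and a bounded count near every other point is exactly the distinction between a cluster point and a non-cluster point.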
10.24 Example. Show that every point in the interval $[0,1]$ is a cluster point of the open interval $(0,1)$.

SOLUTION. Let $x_0 \in [0,1]$ and $\delta > 0$. Then $x_0 + \delta > 0$ and $x_0 - \delta < 1$. In particular, $(x_0 - \delta, x_0 + \delta) \cap (0,1)$ is itself a nondegenerate interval, say $(a,b)$. But $(a,b)$ contains infinitely many points, e.g., $(a + b)/2$, $(2a + b)/3$, $(3a + b)/4$, .... Therefore, $x_0$ is a cluster point of $(0,1)$. ∎

We are now prepared to define limits of functions on metric spaces.

10.25 DEFINITION. Let $a$ be a cluster point of $X$ and $f : X \setminus \{a\} \to Y$. Then $f(x)$ is said to converge to $L$, as $x$ approaches $a$, if and only if for every $\varepsilon > 0$ there is a $\delta > 0$ such that
\[ 0 < \rho(x,a) < \delta \quad\text{implies}\quad \tau(f(x), L) < \varepsilon. \tag{1} \]
In this case we write
\[ L = \lim_{x\to a} f(x) \]
and call $L$ the limit of $f(x)$ as $x$ approaches $a$.

By modifying the proofs presented in Chapter 3, we can prove the following results about limits in metric spaces.
10.26 THEOREM. Let $a$ be a cluster point of $X$ and $f, g : X \setminus \{a\} \to Y$.
(i) If $f(x) = g(x)$ for all $x \in X \setminus \{a\}$ and $f(x)$ has a limit as $x \to a$, then $g(x)$ also has a limit as $x \to a$, and
\[ \lim_{x\to a} g(x) = \lim_{x\to a} f(x). \]
(ii) [SEQUENTIAL CHARACTERIZATION OF LIMITS]. The limit
\[ L := \lim_{x\to a} f(x) \]
exists if and only if $f(x_n) \to L$ as $n \to \infty$ for every sequence $x_n \in X \setminus \{a\}$ that converges to $a$ as $n \to \infty$.
(iii) Suppose that $Y = \mathbf{R}^n$. If $f(x)$ and $g(x)$ have a limit as $x$ approaches $a$, then so do $(f + g)(x)$, $(f \cdot g)(x)$, $(\alpha f)(x)$, and $(f/g)(x)$ (when $Y = \mathbf{R}$ and the limit of $g(x)$ is nonzero). In fact,
\[ \lim_{x\to a} (f + g)(x) = \lim_{x\to a} f(x) + \lim_{x\to a} g(x), \qquad \lim_{x\to a} (\alpha f)(x) = \alpha \lim_{x\to a} f(x), \]
\[ \lim_{x\to a} (f \cdot g)(x) = \lim_{x\to a} f(x) \cdot \lim_{x\to a} g(x), \]
and (when $Y = \mathbf{R}$ and the limit of $g(x)$ is nonzero)
\[ \lim_{x\to a} \left(\frac{f}{g}\right)(x) = \frac{\lim_{x\to a} f(x)}{\lim_{x\to a} g(x)}. \]
(iv) [SQUEEZE THEOREM FOR FUNCTIONS]. Suppose that $Y = \mathbf{R}$. If $h : X \setminus \{a\} \to \mathbf{R}$ satisfies $g(x) \le h(x) \le f(x)$ for all $x \in X \setminus \{a\}$, and
\[ \lim_{x\to a} f(x) = \lim_{x\to a} g(x) = L, \]
then the limit of $h$ exists, as $x \to a$, and
\[ \lim_{x\to a} h(x) = L. \]
(v) [COMPARISON THEOREM FOR FUNCTIONS]. Suppose that $Y = \mathbf{R}$. If $f(x) \le g(x)$ for all $x \in X \setminus \{a\}$, and $f$ and $g$ have a limit as $x$ approaches $a$, then
\[ \lim_{x\to a} f(x) \le \lim_{x\to a} g(x). \]
At this point you should read Section 9.2 to see how these concepts play out in the concrete Euclidean space setting. Pay special attention to Theorem 9.15 and Example 9.17 (which show how to evaluate a limit in R n ), and to Example 9.19 (which shows how to prove a specific limit in Rn does not exist). Here is the metric space version of Definition 3.20.
10.27 DEFINITION. Let $E$ be a nonempty subset of $X$ and $f : E \to Y$.
(i) $f$ is said to be continuous at a point $a \in E$ if and only if given $\varepsilon > 0$ there is a $\delta > 0$ such that
\[ \rho(x,a) < \delta \ \text{ and } \ x \in E \quad\text{imply}\quad \tau(f(x), f(a)) < \varepsilon. \]
(ii) $f$ is said to be continuous on $E$ (notation: $f : E \to Y$ is continuous) if and only if $f$ is continuous at every $x \in E$.

Notice that this definition is valid whether or not $a$ is a cluster point. Modifying corresponding proofs in Chapter 3, we can prove the following results.

10.28 THEOREM. Let $E$ be a nonempty subset of $X$ and $f, g : E \to Y$.
(i) $f$ is continuous at $a \in E$ if and only if $f(x_n) \to f(a)$, as $n \to \infty$, for all sequences $x_n \in E$ that converge to $a$.
(ii) Suppose that $Y = \mathbf{R}^n$. If $f, g$ are continuous at a point $a \in E$ (respectively, continuous on a set $E$), then so are $f + g$, $f \cdot g$, and $\alpha f$ (for any $\alpha \in \mathbf{R}$). Moreover, in the case $Y = \mathbf{R}$, $f/g$ is continuous at $a \in E$ when $g(a) \ne 0$ (respectively, on $E$ when $g(x) \ne 0$ for all $x \in E$).
The following result shows that the composition of two continuous functions is continuous regardless of which metric spaces are involved.
10.29 THEOREM. Suppose that $X$, $Y$, and $Z$ are metric spaces, $a$ is a cluster point of $X$, $f : X \to Y$, and $g : f(X) \to Z$. If $f(x) \to L$ as $x \to a$ and $g$ is continuous at $L$, then
\[ \lim_{x\to a} (g \circ f)(x) = g\left(\lim_{x\to a} f(x)\right). \]
We shall examine the metric space analogues of the Extreme Value Theorem, the Intermediate Value Theorem, and uniform continuity in Section 10.4.

EXERCISES

1. Find all cluster points of each of the following sets.
(a) $E = \mathbf{R} \setminus \mathbf{Q}$.
(b) $E = [a,b)$, $a, b \in \mathbf{R}$, $a < b$.
(c) $E = \{(-1)^n n : n \in \mathbf{N}\}$.
(d) $E = \{x_n : n \in \mathbf{N}\}$, where $x_n \to x$ as $n \to \infty$.
(e) $E = \{1, 1, 2, 1, 2, 3, 1, 2, 3, 4, \ldots\}$.
2. (a) A point $a$ in a metric space $X$ is said to be isolated if and only if there is an $r > 0$ so small that $B_r(a) = \{a\}$. Show that a point $a \in X$ is not a cluster point of $X$ if and only if $a$ is isolated.
(b) Prove that the discrete space has no cluster points.
3. Prove that $a$ is a cluster point for some $E \subseteq X$ if and only if there is a sequence $x_n \in E \setminus \{a\}$ such that $x_n \to a$ as $n \to \infty$.
4. (a) Let $E$ be a nonempty subset of $X$. Prove that $a$ is a cluster point of $E$ if and only if for each $r > 0$, $E \cap B_r(a) \setminus \{a\}$ is nonempty.
(b) Prove that every bounded infinite subset of $\mathbf{R}$ has at least one cluster point.
5. Prove Theorem 10.26.
6. Prove Theorem 10.28.
7. Prove Theorem 10.29.
8. Prove that if $f_n \in C[a,b]$, then $f_n \to f$ uniformly on $[a,b]$ if and only if $f_n \to f$ in the metric of $C[a,b]$ (see Example 10.6).
9. Suppose that $X$ is a metric space that satisfies the following condition.
10.30 DEFINITION. $X$ is said to satisfy the Bolzano-Weierstrass Property if and only if every bounded sequence $x_n \in X$ has a convergent subsequence.
(a) Prove that if $E$ is a closed, bounded subset of $X$ and $x_n \in E$, then there is an $a \in E$ and a subsequence $x_{n_k}$ of $x_n$ such that $x_{n_k} \to a$ as $k \to \infty$.
(b) If $E$ is closed and bounded in $X$ and $f : E \to \mathbf{R}$ is continuous on $E$, prove that $f$ is bounded on $E$.
(c) Prove that under the hypotheses of part (b) there exist points $x_m, x_M \in E$ such that
\[ f(x_M) = \sup_{x\in E} f(x) \quad\text{and}\quad f(x_m) = \inf_{x\in E} f(x). \]
10.3 INTERIOR, CLOSURE, AND BOUNDARY

Thus far, we have used "open" and "closed" mostly for identification. At this point, we begin to examine these concepts in more depth. Our first result shows that open sets and closed sets behave very differently with respect to unions and intersections.

10.31 THEOREM. Let $X$ be a metric space.
(i) If $\{V_\alpha\}_{\alpha\in A}$ is any collection of open sets in $X$, then $\bigcup_{\alpha\in A} V_\alpha$ is open.
(ii) If $\{V_k : k = 1, 2, \ldots, n\}$ is a finite collection of open sets in $X$, then $\bigcap_{k=1}^{n} V_k$ is open.
(iii) If $\{E_\alpha\}_{\alpha\in A}$ is any collection of closed sets in $X$, then $\bigcap_{\alpha\in A} E_\alpha$ is closed.
(iv) If $\{E_k : k = 1, 2, \ldots, n\}$ is a finite collection of closed sets in $X$, then $\bigcup_{k=1}^{n} E_k$ is closed.
(v) If $V$ is open in $X$ and $E$ is closed in $X$, then $V \setminus E$ is open and $E \setminus V$ is closed.

PROOF. (i) Let $x \in \bigcup_{\alpha\in A} V_\alpha$. Then $x \in V_\alpha$ for some $\alpha \in A$. Since $V_\alpha$ is open, it follows that there is an $r > 0$ such that $B_r(x) \subseteq V_\alpha$. Thus $B_r(x) \subseteq \bigcup_{\alpha\in A} V_\alpha$; i.e., this union is open.
(ii) Let $x \in \bigcap_{k=1}^{n} V_k$. Then $x \in V_k$ for $k = 1, 2, \ldots, n$. Since each $V_k$ is open, it follows that there are numbers $r_k > 0$ such that $B_{r_k}(x) \subseteq V_k$. Let $r = \min\{r_1, \ldots, r_n\}$. Then $r > 0$ and $B_r(x) \subseteq V_k$ for all $k = 1, 2, \ldots, n$; i.e., $B_r(x) \subseteq \bigcap_{k=1}^{n} V_k$. Hence, this intersection is open.
(iii) By DeMorgan's Law (Theorem 1.41) and part (i),
\[ \Bigl(\bigcap_{\alpha\in A} E_\alpha\Bigr)^c = \bigcup_{\alpha\in A} E_\alpha^c \]
is open, so $\bigcap_{\alpha\in A} E_\alpha$ is closed.
(iv) By DeMorgan's Law and part (ii),
\[ \Bigl(\bigcup_{k=1}^{n} E_k\Bigr)^c = \bigcap_{k=1}^{n} E_k^c \]
is open, so $\bigcup_{k=1}^{n} E_k$ is closed.
(v) Since $V \setminus E = V \cap E^c$ and $E \setminus V = E \cap V^c$, the former is open by part (ii), and the latter is closed by part (iii). ∎

The finiteness hypothesis in Theorem 10.31 is critical, even for the case $X = \mathbf{R}$.

10.32 Remark. Statements (ii) and (iv) of Theorem 10.31 are false if arbitrary collections are used in place of finite collections.

PROOF. In the metric space $X = \mathbf{R}$,
\[ \bigcap_{k\in\mathbf{N}} \left(-\frac{1}{k}, \frac{1}{k}\right) = \{0\} \]
is closed (and not open), and
\[ \bigcup_{k\in\mathbf{N}} \left[\frac{1}{k+1}, \frac{k}{k+1}\right] = (0,1) \]
is open (and not closed). ∎

Theorem 10.31 has many applications. Our first application is that every set contains a largest open set and is contained in a smallest closed set. To facilitate our discussion, we introduce the following topological operations.

10.33 DEFINITION. Let $E$ be a subset of a metric space $X$.
(i) The interior of $E$ is the set
\[ E^o := \bigcup \{V : V \subseteq E \text{ and } V \text{ is open in } X\}. \]
(ii) The closure of $E$ is the set
\[ \overline{E} := \bigcap \{B : B \supseteq E \text{ and } B \text{ is closed in } X\}. \]

Notice that every set $E$ contains the open set $\emptyset$ and is contained in the closed set $X$. Hence, the sets $E^o$ and $\overline{E}$ are well-defined. Also notice that by Theorem 10.31, the interior of a set is always open and the closure of a set is always closed. The following result shows that $E^o$ is the largest open set contained in $E$, and $\overline{E}$ is the smallest closed set that contains $E$.
10.34 THEOREM. Let $E \subseteq X$. Then
(i) $E^o \subseteq E \subseteq \overline{E}$,
(ii) if $V$ is open and $V \subseteq E$, then $V \subseteq E^o$, and
(iii) if $C$ is closed and $C \supseteq E$, then $C \supseteq \overline{E}$.

PROOF. Since every open set $V$ in the union defining $E^o$ is a subset of $E$, it is clear that the union of these $V$'s is a subset of $E$. Thus $E^o \subseteq E$. A similar argument establishes $E \subseteq \overline{E}$. This proves (i).
By Definition 10.33, if $V$ is an open subset of $E$, then $V \subseteq E^o$ and if $C$ is a closed set containing $E$, then $\overline{E} \subseteq C$. This proves (ii) and (iii). ∎

In particular, the interior of a bounded interval with endpoints $a$ and $b$ is $(a,b)$, and its closure is $[a,b]$. In fact, it is evident by parts (ii) and (iii) that $E = E^o$ if and only if $E$ is open, and $E = \overline{E}$ if and only if $E$ is closed. We shall use this observation many times below.

The following examples illustrate the fact that the interior of a nice enough set $E$ in $\mathbf{R}^2$ can be obtained by removing all its "edges," and the closure of $E$ by adding all its "edges."

10.35 Example. Find the interior and closure of the set $E = \{(x,y) : -1 \le x \le 1 \text{ and } -|x| < y < |x|\}$.

SOLUTION. Graph $y = \pm|x|$ and $x = \pm 1$, and observe that $E$ is a bowtie-shaped region with "solid" vertical edges (see Figure 8.6). Now by Definition 10.8, any open set in $\mathbf{R}^2$ must contain a disk around each of its points. Since $E^o$ is the largest open set inside $E$, it is clear that
\[ E^o = \{(x,y) : -1 < x < 1 \text{ and } -|x| < y < |x|\}. \]
Similarly,
\[ \overline{E} = \{(x,y) : -1 \le x \le 1 \text{ and } -|x| \le y \le |x|\}. \]
∎
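The three sets in Example 10.35 differ only along the "edges" of the bowtie, which is easy to confirm by testing a few points. The sketch below encodes $E$, $E^o$, and $\overline{E}$ as membership predicates taken directly from the formulas of the example; the function names and the test points are illustrative.

```python
def in_E(x, y):
    """Membership in E = {-1 <= x <= 1 and -|x| < y < |x|}."""
    return -1 <= x <= 1 and -abs(x) < y < abs(x)

def in_interior(x, y):
    """Membership in the interior: the vertical edges are removed."""
    return -1 < x < 1 and -abs(x) < y < abs(x)

def in_closure(x, y):
    """Membership in the closure: the slanted edges are added."""
    return -1 <= x <= 1 and -abs(x) <= y <= abs(x)

edge = (1.0, 0.5)      # on the solid vertical edge x = 1
slanted = (0.5, 0.5)   # on the excluded line y = |x|
vertex = (0.0, 0.0)    # the pinch point of the bowtie

for name, p in [("edge", edge), ("slanted", slanted), ("vertex", vertex)]:
    print(name, in_interior(*p), in_E(*p), in_closure(*p))
```

Each test point lies in the closure, only the vertical-edge point lies in $E$, and none lies in the interior, in line with $E^o \subseteq E \subseteq \overline{E}$.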
10.36 Example. Find the interior and closure of the set $E = B_1(-2,0) \cup B_1(2,0) \cup \{(x,0) : -1 \le x \le 1\}$.

SOLUTION. Draw a graph of this region. It turns out to be "dumbbell shaped," two open disks joined by a straight line. Thus $E^o = B_1(-2,0) \cup B_1(2,0)$ and
\[ \overline{E} = \overline{B_1(-2,0)} \cup \overline{B_1(2,0)} \cup \{(x,0) : -1 \le x \le 1\}. \]
∎
One of the most important results from one-dimensional calculus is the Fundamental Theorem of Calculus. It states that the behavior of a derivative $f'$ on an interval $[a,b]$, as measured by the integral, is determined completely by the values of $f$ at the endpoints of $[a,b]$. What shall we use for "endpoints" of an arbitrary set in $X$? Notice that the endpoints $a, b$ are the only points that lie near both $[a,b]$ and the complement of $[a,b]$. Using this as a cue, we introduce the following concept.
10.37 DEFINITION. Let $E \subseteq X$. The boundary of $E$ is the set
\[ \partial E := \{x \in X : \text{for all } r > 0,\ B_r(x) \cap E \ne \emptyset \text{ and } B_r(x) \cap E^c \ne \emptyset\}. \]
(We will refer to the last two conditions in the definition of $\partial E$ by saying that $B_r(x)$ intersects $E$ and $E^c$.)

10.38 Example. Describe the boundary of the set
\[ E = \{(x,y) : x^2 + y^2 \le 9 \text{ and } (x - 1)(y + 2) > 0\}. \]

SOLUTION. Graph the relations $x^2 + y^2 = 9$ and $(x - 1)(y + 2) = 0$ to obtain a region with solid curved edges and dashed straight edges (see Figure 8.7). By definition, then, the boundary of $E$ is the union of these curved and straight edges (all made solid). Rather than describing $\partial E$ analytically (which would involve solving for the intersection points of the straight lines $x = 1$, $y = -2$, and the circle $x^2 + y^2 = 9$), it is easier to describe $\partial E$ by using set algebra:
\[ \partial E = \{(x,y) : x^2 + y^2 \le 9 \text{ and } (x - 1)(y + 2) \ge 0\} \setminus \{(x,y) : x^2 + y^2 < 9 \text{ and } (x - 1)(y + 2) > 0\}. \]
∎

It turns out that set algebra can be used to describe the boundary of any set.
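10.39 THEOREM. Let $E \subseteq X$. Then $\partial E = \overline{E} \setminus E^o$.
PROOF. By Definition 10.37, it suffices to show
(2) $x \in \overline{E}$ if and only if $B_r(x) \cap E \neq \emptyset$ for all $r > 0$,
and
(3) $x \notin E^o$ if and only if $B_r(x) \cap E^c \neq \emptyset$ for all $r > 0$.
We will provide the details for (2) and leave the proof of (3) as an exercise. Suppose that $x \in \overline{E}$ but $B_{r_0}(x) \cap E = \emptyset$ for some $r_0 > 0$. Then $(B_{r_0}(x))^c$ is a closed set that contains $E$; hence, by Theorem 10.34iii, $\overline{E} \subseteq (B_{r_0}(x))^c$. It follows that $\overline{E} \cap B_{r_0}(x) = \emptyset$; in particular, $x \notin \overline{E}$, a contradiction. Conversely, suppose that $x \notin \overline{E}$. Since $(\overline{E})^c$ is open, there is an $r_0 > 0$ such that $B_{r_0}(x) \subseteq (\overline{E})^c$. In particular, $\emptyset = B_{r_0}(x) \cap \overline{E} \supseteq B_{r_0}(x) \cap E$ for some $r_0 > 0$. ∎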
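A worked instance may help fix ideas. The following computation is only an illustrative sketch: the set $E = [0, 1)$ in $\mathbf{R}$ with the usual metric is our own choice, not an example from the text. It shows Theorem 10.39 and Definition 10.37 producing the same boundary.

```latex
% Assumes amsmath. E = [0,1) is a hypothetical example chosen for illustration.
\begin{align*}
E &= [0,1), & \overline{E} &= [0,1], & E^o &= (0,1), \\
\partial E &= \overline{E} \setminus E^o = \{0, 1\}.
\end{align*}
% Check against Definition 10.37, for every r > 0:
\begin{align*}
B_r(0) = (-r, r) &\ni 0 \in E, & B_r(0) &\ni -r/2 \in E^c, \\
B_r(1) = (1-r, 1+r) &\ni \max\{0,\ 1 - r/2\} \in E, & B_r(1) &\ni 1 \in E^c.
\end{align*}
% If 0 < x < 1 and r = \min\{x, 1-x\}, then B_r(x) \subseteq E, so x \notin \partial E.
% If x < 0 or x > 1, a small enough ball about x misses E entirely, so x \notin \partial E.
```

Both routes give $\partial E = \{0, 1\}$, which matches the informal description of a boundary as the set of "endpoints."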
We have introduced topological operations (interior, closure, and boundary). The following result answers the question: How do these operations interact with the set operations (union and intersection)?
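10.40 THEOREM. Let $A, B \subseteq X$. Then
(i) $(A \cup B)^o \supseteq A^o \cup B^o$ and $(A \cap B)^o = A^o \cap B^o$,
(ii) $\overline{A \cup B} = \overline{A} \cup \overline{B}$ and $\overline{A \cap B} \subseteq \overline{A} \cap \overline{B}$,
(iii) $\partial(A \cup B) \subseteq \partial A \cup \partial B$ and $\partial(A \cap B) \subseteq (A \cap \partial B) \cup (B \cap \partial A) \cup (\partial A \cap \partial B)$.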
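PROOF. (i) Since the union of two open sets is open, $A^o \cup B^o$ is an open subset of $A \cup B$. Hence, by Theorem 10.34ii, $A^o \cup B^o \subseteq (A \cup B)^o$. Similarly, $(A \cap B)^o \supseteq A^o \cap B^o$. On the other hand, if $V \subset A \cap B$, then $V \subset A$ and $V \subset B$. Thus $(A \cap B)^o \subseteq A^o \cap B^o$.
(ii) Since $\overline{A} \cup \overline{B}$ is closed and contains $A \cup B$, it is clear by Theorem 10.34iii that $\overline{A \cup B} \subseteq \overline{A} \cup \overline{B}$. Similarly, $\overline{A \cap B} \subseteq \overline{A} \cap \overline{B}$. To prove the reverse inclusion for union, suppose that $x \notin \overline{A \cup B}$. Then there is a closed set $E$ that contains $A \cup B$ such that $x \notin E$. Since $E$ contains both $A$ and $B$, it follows that $x \notin \overline{A}$ and $x \notin \overline{B}$. This proves part (ii).
(iii) Let $x \in \partial(A \cup B)$; i.e., suppose that $B_r(x)$ intersects $A \cup B$ and $(A \cup B)^c$ for all $r > 0$. Since $(A \cup B)^c = A^c \cap B^c$, it follows that $B_r(x)$ intersects both $A^c$ and $B^c$ for all $r > 0$. Moreover, if $B_{r_1}(x)$ misses $A$ for some $r_1 > 0$, then $B_r(x)$ must intersect $B$ for every $r \le r_1$, hence for every $r > 0$. Thus $B_r(x)$ intersects $A$ and $A^c$ for all $r > 0$, or $B_r(x)$ intersects $B$ and $B^c$ for all $r > 0$; i.e., $x \in \partial A \cup \partial B$. This proves the first inclusion in part (iii).
To prove the second inclusion, suppose that $x \in \partial(A \cap B)$; i.e., suppose that $B_r(x)$ intersects $A \cap B$ and $(A \cap B)^c$ for all $r > 0$. If $x \in (A \cap \partial B) \cup (B \cap \partial A)$, then there is nothing to prove. If $x \notin (A \cap \partial B) \cup (B \cap \partial A)$, then $x \in (A^c \cup (\partial B)^c) \cap (B^c \cup (\partial A)^c)$. Hence, it remains to prove that $A^c \cup (\partial B)^c \subseteq \partial A$ and $B^c \cup (\partial A)^c \subseteq \partial B$. By symmetry, we need only prove the first one. To this end, let $x \in A^c \cup (\partial B)^c$.
Case 1. $x \in A^c$. Since $B_r(x)$ intersects $A$, it follows that $x \in \partial A$.
Case 2. $x \in (\partial B)^c$. Since $B_r(x)$ intersects $B$, it follows that $B_r(x) \subseteq B$ for small $r > 0$. Since $B_r(x)$ also intersects $A^c \cup B^c$, it must be the case that $B_r(x)$ intersects $A^c$. In particular, $x \in \partial A$. ∎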
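EXERCISES
1. Find the interior, closure, and boundary of each of the following subsets of $\mathbf{R}$.
(a) $[a, b)$ where $a < b$.
(b) $E = \{1/n : n \in \mathbf{N}\}$.
(c) $E = \bigcup_{n=1}^{\infty} \left(\frac{1}{n+1}, \frac{1}{n}\right)$.
(d) $E = \bigcup_{n=1}^{\infty} (-n, n)$.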
2. Identify which of the following sets are open, which are closed, and which are neither. Find $E^o$, $\overline{E}$, and $\partial E$ and sketch $E$ in each case.
(a) $E = \{(x, y) : x^2 + 4y^2 \le 1\}$.
(b) $E = \{(x, y) : x^2 - 2x + y^2 = 0\} \cup \{(x, 0) : x \in [2, 3]\}$.
(c) E={(x,y):y~x2, O:S;y
$$V = \{x \in X : s < \rho(x, a) < r\} \quad \text{and} \quad E = \{x \in X : s \le \rho(x, a) \le r\}.$$
Prove that $V$ is open and $E$ is closed.
4. Suppose that $A \subseteq B \subseteq X$. Prove that $\overline{A} \subseteq \overline{B}$ and $A^o \subseteq B^o$.
5. This exercise is used in Section 10.5. Show that if $E$ is closed in $X$ and $a \notin E$, then
$$\inf_{x \in E} \rho(x, a) > 0.$$
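6. Prove (3).
7. Show that Theorem 10.40 is best possible in the following sense.
(a) There exist sets $A, B$ in $\mathbf{R}$ such that $(A \cup B)^o \neq A^o \cup B^o$.
(b) There exist sets $A, B$ in $\mathbf{R}$ such that $\overline{A \cap B} \neq \overline{A} \cap \overline{B}$.
(c) There exist sets $A, B$ in $\mathbf{R}$ such that $\partial(A \cup B) \neq \partial A \cup \partial B$ and $\partial(A \cap B) \neq (A \cap \partial B) \cup (B \cap \partial A) \cup (\partial A \cap \partial B)$.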
8. This exercise is used many times from Section 10.4 onward. Let $Y$ be a subspace of $X$.
(a) Show that a set $V$ is open in $Y$ if and only if there is an open set $U$ in $X$ such that $V = U \cap Y$.
(b) Show that a set $E$ is closed in $Y$ if and only if there is a closed set $A$ in $X$ such that $E = A \cap Y$.
9. Let $f : \mathbf{R} \to \mathbf{R}$. Prove that $f$ is continuous on $\mathbf{R}$ if and only if $f^{-1}(I)$ is open in $\mathbf{R}$ for every open interval $I$.
10. Let $V$ be a subset of a metric space $X$.
(a) Prove that $V$ is open in $X$ if and only if there is a collection of open balls $\{B_\alpha : \alpha \in A\}$ such that
$$V = \bigcup_{\alpha \in A} B_\alpha.$$
(b) What happens to this result if "open" is replaced by "closed"?
10.4 COMPACT SETS
In Chapter 3 we proved the Extreme Value Theorem for functions defined on $\mathbf{R}$. In this section we shall extend that result to functions defined on an arbitrary metric space. To replace the hypothesis "closed, bounded interval" used in the real case, we introduce "compactness," a concept that gives us a powerful tool for extending local results to global ones (see especially Remark 10.44, Theorem 10.52, and Theorem 12.46). Since compactness of $E$ depends on how $E$ can be "covered" by a collection of open sets, we begin by introducing the following terminology.
10.41 DEFINITION. Let $\mathcal{V} = \{V_\alpha\}_{\alpha \in A}$ be a collection of subsets of a metric space $X$ and suppose that $E$ is a subset of $X$.
(i) $\mathcal{V}$ is said to cover $E$ (or be a covering of $E$) if and only if
$$E \subseteq \bigcup_{\alpha \in A} V_\alpha.$$
(ii) $\mathcal{V}$ is said to be an open covering of $E$ if and only if $\mathcal{V}$ covers $E$ and each $V_\alpha$ is open.
(iii) Let $\mathcal{V}$ be a covering of $E$. $\mathcal{V}$ is said to have a finite (respectively, countable) subcovering if and only if there is a finite (respectively, countable) subset $A_0$ of $A$ such that $\{V_\alpha\}_{\alpha \in A_0}$ covers $E$.
Notice that the collections of open intervals
are open coverings of the interval $(0, 1)$. The first covering of $(0, 1)$ has no finite subcover but any member of the second covering covers $(0, 1)$. Thus an open covering of an arbitrary set may or may not have a finite subcovering. Sets for which every open covering has a finite subcovering are important enough to be given a name.
10.42 DEFINITION. A subset $H$ of a metric space $X$ is said to be compact if and only if every open covering of $H$ has a finite subcover.
To get a feeling for what this definition means, we make some elementary observations concerning compact sets in general.
10.43 Remark. The empty set and all finite subsets of a metric space are compact. PROOF. These statements follow immediately from Definition 10.42. The empty set needs no set to cover it, and any finite set H can be covered by finitely many sets, one set for each element in H. I
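By way of contrast with Remark 10.43, the following sketch exhibits a set that fails Definition 10.42. The particular covering $V_k = (1/(k+1), 1)$ is our own illustrative choice (an assumption, not a collection taken from the text); it covers $(0, 1)$ but no finite subcollection does.

```latex
% Assumes amsmath. The covering V_k is a hypothetical choice made for illustration.
\begin{align*}
V_k &:= \Bigl(\tfrac{1}{k+1},\, 1\Bigr), \quad k \in \mathbf{N},
  & \bigcup_{k \in \mathbf{N}} V_k &= (0, 1), \\
\bigcup_{j=1}^{m} V_{k_j} &= \Bigl(\tfrac{1}{K+1},\, 1\Bigr), \quad K := \max\{k_1, \dots, k_m\},
  & \tfrac{1}{K+1} &\in (0,1) \setminus \bigcup_{j=1}^{m} V_{k_j}.
\end{align*}
% First row: given x in (0,1), pick k with 1/(k+1) < x, so x lies in V_k.
% Second row: the V_k are nested, so a finite union is the largest member,
% and it always misses the point 1/(K+1).
```

Hence $(0, 1)$ is not compact in $\mathbf{R}$, even though it is bounded.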
Since the empty set and finite sets are also closed, it is natural to ask whether there is a relationship between compact sets and closed sets in general. The following three results address this question in an arbitrary metric space.
10.44 Remark. A compact set is always closed.
PROOF. Suppose that $H$ is compact but not closed. Then $H$ is nonempty and (by Theorem 10.16) there is a convergent sequence $x_k \in H$ whose limit $x$ does not belong to $H$. For each $y \in H$, set $r(y) := \rho(x, y)/2$. Since $x$ does not belong to $H$, $r(y) > 0$; hence, each $B_{r(y)}(y)$ is open and contains $y$; i.e., $\{B_{r(y)}(y) : y \in H\}$ is an open covering of $H$. Since $H$ is compact, we can choose points $y_j$ and radii $r_j := r(y_j)$ such that $\{B_{r_j}(y_j) : j = 1, 2, \dots, N\}$ covers $H$. Set $r := \min\{r_1, \dots, r_N\}$. (This is a finite set of positive numbers, so $r$ is also positive.) Since $x_k \to x$ as $k \to \infty$, $x_k \in B_r(x)$ for large $k$. But $x_k \in B_r(x) \cap H$ implies $x_k \in B_{r_j}(y_j)$ for some $j \in \mathbf{N}$. Therefore, it follows from the choices of $r_j$ and $r$, and from the triangle inequality, that
$$r_j \ge \rho(x_k, y_j) \ge \rho(x, y_j) - \rho(x_k, x) = 2r_j - \rho(x_k, x) > 2r_j - r \ge 2r_j - r_j = r_j,$$
a contradiction. ∎
The following result is a partial converse of Remark 10.44 (see also Theorem 10.50).
10.45 Remark. A closed subset of a compact set is compact.
PROOF. Let $E$ be a closed subset of $H$, where $H$ is compact in $X$, and suppose that $\mathcal{V} = \{V_\alpha\}_{\alpha \in A}$ is an open covering of $E$. Now $E^c = X \setminus E$ is open; hence, $\mathcal{V} \cup \{E^c\}$ is an open covering of $H$. Since $H$ is compact, there is a finite set $A_0 \subseteq A$ such that
$$H \subseteq E^c \cup \Bigl( \bigcup_{\alpha \in A_0} V_\alpha \Bigr).$$
But $E \cap E^c = \emptyset$. Therefore, $E$ is covered by $\{V_\alpha\}_{\alpha \in A_0}$. ∎
Here is the connection between closed bounded sets and compact sets.
10.46 THEOREM. Let $H$ be a subset of a metric space $X$. If $H$ is compact, then $H$ is closed and bounded.
PROOF. Suppose that $H$ is compact. By Remark 10.44, $H$ is closed. It is also bounded. Indeed, fix $b \in X$ and observe that $\{B_n(b) : n \in \mathbf{N}\}$ covers $X$. Since $H$ is compact, it follows that
$$H \subset \bigcup_{n=1}^{N} B_n(b)$$
for some $N \in \mathbf{N}$. Since these balls are nested, we conclude that $H \subset B_N(b)$; i.e., $H$ is bounded. ∎
10.47 Remark. The converse of Theorem 10.46 is false for arbitrary metric spaces.
PROOF. Let $X = \mathbf{R}$ be the discrete metric space introduced in Example 10.3. Since $\sigma(0, x) \le 1$ for all $x \in \mathbf{R}$, every subset of $X$ is bounded. Since $x_k \to x$ in $X$ implies $x_k = x$ for large $k$, every subset of $X$ is closed. Thus $[0, 1]$ is a closed bounded subset of $X$. Since $\{\{x\}\}_{x \in [0,1]}$ is an uncountable open covering of $[0, 1]$ that has no finite subcover, we conclude that $[0, 1]$ is closed and bounded, but not compact. ∎
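The proof of Remark 10.47 uses, without comment, that singletons are open in the discrete space. The short derivation below spells this out. It is a sketch that assumes the discrete metric of Example 10.3 is the usual one, $\sigma(x, y) = 0$ if $x = y$ and $\sigma(x, y) = 1$ otherwise; that formula is not restated in this section.

```latex
% Assumes amsmath and that \sigma is the standard discrete metric (an assumption).
\begin{align*}
\sigma(x, y) &=
  \begin{cases} 0, & x = y, \\ 1, & x \neq y, \end{cases}
  & B_{1/2}(x) &= \{y \in \mathbf{R} : \sigma(x, y) < \tfrac12\} = \{x\}.
\end{align*}
% Every singleton is an open ball, so every subset of X is a union of open sets:
\[
  E = \bigcup_{x \in E} \{x\} = \bigcup_{x \in E} B_{1/2}(x)
  \quad \text{is open for every } E \subseteq \mathbf{R}.
\]
% A finite subcollection \{x_1\}, \dots, \{x_N\} of the covering of [0,1] by singletons
% covers only the N points x_1, \dots, x_N, and [0,1] is infinite.
```

This is the precise sense in which the discrete space has "too many" open sets.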
The problem here is that the discrete space has too many open sets. To identify a large class of metric spaces for which the converse of Theorem 10.46 DOES hold, we need a property that cuts the "number of essential" open sets down to a reasonable size.
10.48 DEFINITION. A metric space $X$ is said to be separable if and only if it contains a countable dense subset; i.e., there is a countable subset $Z$ of $X$ such that for every point $a \in X$ there is a sequence $x_k \in Z$ such that $x_k \to a$ as $k \to \infty$.
We have seen (Theorem 9.3) that all Euclidean spaces are separable. The space $C[a, b]$ is also separable (see Exercise 7, p. 519). Hence, the hypothesis of separability is not an unusual requirement. The following result makes clear what we meant above by "number of essential" open sets. It shows that every open covering of a set in a separable metric space has a countable subcovering.
10.49 THEOREM [LINDELÖF]. Let $E$ be a subset of a separable metric space $X$. If $\{V_\alpha\}_{\alpha \in A}$ is a collection of open sets and $E \subseteq \bigcup_{\alpha \in A} V_\alpha$, then there is a countable subset $A_0$ of $A$ such that
$$E \subseteq \bigcup_{\alpha \in A_0} V_\alpha.$$
PROOF. Let $Z$ be a countable dense subset of $X$, and consider the collection $\mathcal{T}$ of open balls with centers in $Z$ and rational radii. This collection is countable. Moreover, it "approximates" all other open sets in the following sense:
CLAIM. Given any open ball $B_r(x) \subset X$, there is a ball $B_q(a) \in \mathcal{T}$ such that $x \in B_q(a)$ and $B_q(a) \subseteq B_r(x)$.
PROOF OF CLAIM. Let $B_r(x) \subset X$ be given. By Definition 10.48, choose $a \in Z$ such that $\rho(x, a) < r/4$, and choose by Theorem 1.24 a rational $q \in \mathbf{Q}$ such that $r/4 < q < r/2$. Since $r/4 < q$, we have $x \in B_q(a)$. Moreover, if $y \in B_q(a)$, then
$$\rho(x, y) \le \rho(x, a) + \rho(a, y) < q + \frac{r}{4} < \frac{r}{2} + \frac{r}{4} < r.$$
Therefore, $B_q(a) \subseteq B_r(x)$. This establishes the claim.
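To prove the theorem, let $x \in E$. By hypothesis, $x \in V_\alpha$ for some $\alpha \in A$. Hence, by the claim, there is a ball $B_x \in \mathcal{T}$ such that
(4) $x \in B_x \subseteq V_\alpha$.
The collection $\mathcal{T}$ is countable, hence so is the subcollection
(5) $\{U_1, U_2, \dots\} := \{B_x : x \in E\}$.
By (4), for each $k \in \mathbf{N}$ there is at least one $\alpha_k \in A$ such that $U_k \subseteq V_{\alpha_k}$. Hence, by (5),
$$E \subseteq \bigcup_{x \in E} B_x = \bigcup_{k \in \mathbf{N}} U_k \subseteq \bigcup_{k \in \mathbf{N}} V_{\alpha_k}.$$
Thus set $A_0 := \{\alpha_k : k \in \mathbf{N}\}$. ∎
We are prepared to obtain a converse of Theorem 10.46. (For the definition of the Bolzano-Weierstrass Property, see Exercise 9, p. 300.)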
10.50 THEOREM [HEINE-BOREL]. Let $X$ be a separable metric space that satisfies the Bolzano-Weierstrass Property and $H$ be a subset of $X$. Then $H$ is compact if and only if it is closed and bounded.
PROOF. By Theorem 10.46, every compact set is closed and bounded. Conversely, suppose to the contrary that $H$ is closed and bounded but not compact. Let $\mathcal{V}$ be an open covering of $H$ that has no finite subcover of $H$. By Lindelöf's Theorem, we may suppose that $\mathcal{V} = \{V_k\}_{k \in \mathbf{N}}$; i.e.,
(6) $H \subseteq \bigcup_{k \in \mathbf{N}} V_k$.
By the choice of $\mathcal{V}$, $\bigcup_{j=1}^{k} V_j$ cannot contain $H$ for any $k \in \mathbf{N}$. Thus we can choose a point
(7) $x_k \in H \setminus \bigcup_{j=1}^{k} V_j$
for each $k \in \mathbf{N}$. Since $H$ is bounded, the sequence $x_k$ is bounded. Hence, by the Bolzano-Weierstrass Property, there is a subsequence $x_{k_\nu}$ that converges to some $x$ as $\nu \to \infty$. Since $H$ is closed, $x \in H$. Hence, by (6), $x \in V_N$ for some $N \in \mathbf{N}$. But $V_N$ is open; hence, there is an $M \in \mathbf{N}$ such that $\nu \ge M$ implies $k_\nu > N$ and $x_{k_\nu} \in V_N$. This contradicts (7). We conclude that $H$ is compact. ∎
Since $\mathbf{R}^n$ satisfies the hypotheses of Theorem 10.50, it follows that a subset of a Euclidean space is compact if and only if it is closed and bounded. We now turn our attention to uniform continuity on an arbitrary metric space.
10.51 DEFINITION. Let $X$ be a metric space, $E$ be a nonempty subset of $X$, and $f : E \to Y$. Then $f$ is said to be uniformly continuous on $E$ (notation: $f : E \to Y$ is uniformly continuous) if and only if given $\varepsilon > 0$ there is a $\delta > 0$ such that $\rho(x, a) < \delta$ and $x, a \in E$ imply $\tau(f(x), f(a)) < \varepsilon$.
In the real case, we proved that uniform continuity and continuity were equivalent on closed bounded intervals. That result, whose proof relied on the Bolzano-Weierstrass Theorem, is not true in an arbitrary metric space. If we strengthen the hypothesis from closed and bounded to compact, however, the result is valid for any metric space.
10.52 THEOREM. Suppose that $E$ is a compact subset of $X$ and $f : X \to Y$. Then $f$ is uniformly continuous on $E$ if and only if $f$ is continuous on $E$.
PROOF. If $f$ is uniformly continuous on a set, then it is continuous, whether or not the set is compact. Conversely, suppose that $f$ is continuous on $E$. Given $\varepsilon > 0$ and $a \in E$, choose $\delta(a) > 0$ such that
(8) $x \in B_{\delta(a)}(a)$ and $x \in E$ imply $\tau(f(x), f(a)) < \dfrac{\varepsilon}{2}$.
Since $a \in B_\delta(a)$ for all $\delta > 0$, it is clear that $\{B_{\delta(a)/2}(a) : a \in E\}$ is an open covering of $E$. Since $E$ is compact, choose finitely many points $a_j \in E$ and numbers $\delta_j := \delta(a_j)$ such that
(9) $E \subseteq \bigcup_{j=1}^{N} B_{\delta_j/2}(a_j)$.
Set $\delta := \min\{\delta_1/2, \dots, \delta_N/2\}$.
Suppose that $x, a \in E$ and $\rho(x, a) < \delta$. By (9), $x$ belongs to $B_{\delta_j/2}(a_j)$ for some $1 \le j \le N$. Hence,
$$\rho(a, a_j) \le \rho(a, x) + \rho(x, a_j) < \frac{\delta_j}{2} + \frac{\delta_j}{2} = \delta_j;$$
i.e., $a$ also belongs to $B_{\delta_j}(a_j)$. It follows, therefore, from the choice of $\delta_j$ that
$$\tau(f(x), f(a)) \le \tau(f(x), f(a_j)) + \tau(f(a_j), f(a)) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
This proves that $f$ is uniformly continuous on $E$. ∎
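To see that compactness cannot simply be dropped from Theorem 10.52, consider the following standard counterexample. It is a sketch of our own choosing (the function $f(x) = 1/x$ on the noncompact set $(0, 1)$, with the usual metric on $\mathbf{R}$), not an example stated in this section.

```latex
% Assumes amsmath. f(x) = 1/x on (0,1) is an illustrative choice, with \rho = \tau = |\cdot|.
\begin{align*}
x_n &:= \frac{1}{n}, \qquad a_n := \frac{1}{n+1}, \qquad n \ge 2,
  \qquad x_n, a_n \in (0, 1), \\
|x_n - a_n| &= \frac{1}{n(n+1)} \longrightarrow 0 \quad \text{as } n \to \infty, \\
|f(x_n) - f(a_n)| &= |n - (n+1)| = 1 \quad \text{for every } n \ge 2.
\end{align*}
% For \varepsilon = 1 and any \delta > 0, choose n with 1/(n(n+1)) < \delta:
% then |x_n - a_n| < \delta but |f(x_n) - f(a_n)| = 1 is not less than \varepsilon.
```

So $f$ is continuous on $(0, 1)$ but not uniformly continuous there; the finite subcover step (9) in the proof is exactly what fails for this set.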
EXERCISES
1. Identify which of the following sets are compact and which are not. If $E$ is not compact, find the smallest compact set $H$ (if there is one) such that $E \subset H$.
(a) $\{1/k : k \in \mathbf{N}\} \cup \{0\}$.
(b) $\{(x, y) \in \mathbf{R}^2 : a \le x^2 + y^2 \le b\}$ for real numbers $0 < a < b$.
(c) $\{(x, y) \in \mathbf{R}^2 : y = \sin(1/x) \text{ for some } x \in (0, 1]\}$.
(d) $\{(x, y) \in \mathbf{R}^2 : |xy| \le 1\}$.
2. Let $A, B$ be compact subsets of $X$. Prove that $A \cup B$ and $A \cap B$ are compact.
3. Suppose that $E \subseteq \mathbf{R}$ is compact and nonempty. Prove that $\sup E, \inf E \in E$.
4. Suppose that $\{V_\alpha\}_{\alpha \in A}$ is a collection of nonempty open sets in $X$ that satisfies $V_\alpha \cap V_\beta = \emptyset$ for all $\alpha \neq \beta$ in $A$. Prove that if $X$ is separable, then $A$ is countable. What happens to this result when "open" is omitted?
5. Prove that if $V$ is open in a separable metric space $X$, then there are open balls $B_1, B_2, \dots$ such that
$$V = \bigcup_{j \in \mathbf{N}} B_j.$$
Prove that every open set in $\mathbf{R}$ is a countable union of open intervals.
6. Let $E \subseteq X$ be closed.
(a) Prove that $\partial E \subseteq E$.
(b) Prove that $\partial E = E$ if and only if $E^o = \emptyset$.
(c) Show that (b) is false if $E$ is not closed.
7. Prove directly that the discrete space $\mathbf{R}$ is not separable.
8. (a) Prove that Cantor's Intersection Theorem holds for nested compact sets in an arbitrary metric space; i.e., if $H_1, H_2, \dots$ is a nested sequence of nonempty compact sets in $X$, then
$$\bigcap_{k=1}^{\infty} H_k \neq \emptyset.$$
(b) Prove that $(\sqrt{2}, \sqrt{3}) \cap \mathbf{Q}$ is closed and bounded but not compact in the metric space $\mathbf{Q}$ introduced in Example 10.5.
(c) Show that Cantor's Intersection Theorem does not hold in an arbitrary metric space if "compact" is replaced by "closed and bounded."
9. Prove that the Bolzano-Weierstrass Property does not hold for $C[a, b]$ and $\|f\|$ (see Example 10.6). Namely, prove that if $f_n(x) = x^n$, then $\|f_n\|$ is bounded but $\|f_{n_k} - f\|$ does not converge for any $f \in C[0, 1]$ and any subsequence $\{n_k\}$.
10. Let $X$ be a metric space.
(a) Prove that if $E \subseteq X$ is compact, then $E$ is sequentially compact (see Exercise 10, p. 296).
(b) Prove that if $X$ is separable and satisfies the Bolzano-Weierstrass Property, then a set $E \subseteq X$ is sequentially compact if and only if it is compact.
10.5 CONNECTED SETS
We have introduced open sets (analogues of open intervals), closed sets (analogues of closed intervals), and compact sets (analogues of closed bounded intervals) in order to develop a calculus of functions of several variables in Chapters 11 through 13, which parallels that developed for functions of a single variable in Chapters 2 through 5. Some of the earlier theory, however, depended on properties of intervals not yet discussed. For example, the proof of the Intermediate Value Theorem tacitly used the fact that an interval is connected, i.e., is unbroken and all of one piece. We shall also use connected sets in Chapter 13 to provide a sufficiently broad definition of surfaces for computational ease. Thus we introduce the following idea.
10.53 DEFINITION. Let $X$ be a metric space.
(i) A pair of nonempty open sets $U, V$ in $X$ is said to separate $X$ if and only if $X = U \cup V$ and $U \cap V = \emptyset$.
(ii) $X$ is said to be connected if and only if $X$ cannot be separated by any pair of open sets $U, V$.
Loosely speaking, a connected space is all in one piece, i.e., cannot be broken into smaller, nonempty, open pieces which do not share any common points. Indeed, we shall see that $\mathbf{R}$, under the usual metric, is connected. On the other hand, under the discrete metric, $\mathbf{R}$ is not connected (since $(-\infty, 0]$ and $(0, \infty)$ are both "open" in the discrete space). Recall (Example 10.4) that every subset of $X$ is a metric space. Hence Definition 10.53 also defines what it means for a subset $E$ of $X$ to be connected. One can always find two subsets of an arbitrary metric space that are connected: (1) The empty set is connected, since it can never be written as the union of nonempty sets. (2) Every singleton $E = \{a\}$ is also connected since if $E = U \cup V$ where $U$ and $V$ are disjoint and both nonempty, then $E$ has at least two points. To obtain deeper results about connectivity, it is convenient to introduce the following concepts. (These concepts will also be used to study continuous functions in the next section.)
10.54 DEFINITION. Let $X$ be a metric space and $E \subseteq X$.
(i) A set $U \subseteq E$ is said to be relatively open in $E$ if and only if there is a set $V$ open in $X$ such that $U = E \cap V$.
(ii) A set $A \subseteq E$ is said to be relatively closed in $E$ if and only if there is a set $C$ closed in $X$ such that $A = E \cap C$.
For example, the set $E$ of Example 10.35 is relatively open in the subspace $Y := \{(x, y) : -1 \le x \le 1\}$ and relatively closed in the subspace $Z := \{(x, y) : -|x| < y < |x|\}$. Indeed, $V = Z$ is open in $\mathbf{R}^2$ (it contains none of its boundary), $A = Y$ is closed in $\mathbf{R}^2$ (it contains all its boundary), and $E = V \cap Y$, $E = A \cap Z$.
Recall (Exercise 8, p. 306) that a subset $A$ of $E$ is open (respectively, closed) in the subspace $E$ if and only if it is relatively open (respectively, relatively closed) in the set $E$. Thus all Definition 10.54 does is codify the "subspace topology." By Definition 10.53, then, a set $E$ is connected if there are no nonempty sets $U, V$, relatively open in $E$, such that $E = U \cup V$ and $U \cap V = \emptyset$. The following result, which is usually easier to use than Definition 10.53, shows that when "separating" a nonconnected set, we can use open sets instead of relatively open sets. (The converse of this result is also true, but harder to prove; see Theorem 10.57.)
10.55 Remark. Let $E \subseteq X$. If there exists a pair of open sets $A, B$ in $X$ which separate $E$; i.e., if $E \subseteq A \cup B$, $A \cap B = \emptyset$, $A \cap E \neq \emptyset$, and $B \cap E \neq \emptyset$, then $E$ is not connected.
PROOF. Set $U = A \cap E$ and $V = B \cap E$. It suffices to prove that $U$ and $V$ are relatively open in $E$ and separate $E$. It is clear by hypothesis and the remarks above that $U$ and $V$ are nonempty, they are both relatively open in $E$, and $U \cap V = \emptyset$. It remains to prove that $E = U \cup V$. But $E$ is a subset of $A \cup B$, so $E \subseteq U \cup V$. On the other hand, both $U$ and $V$ are subsets of $E$, so $E \supseteq U \cup V$. We conclude that $E = U \cup V$. ∎
Thus when looking for "separations" of a given set $E \subset X$, we can confine our attention to open sets in $X$. Here are several examples. The set $\mathbf{Q}$ is not connected since the pair $A = (-\infty, \sqrt{2})$, $B = (\sqrt{2}, \infty)$ separate $\mathbf{Q}$. The bowtie set $E$ of Example 10.35 is not connected since $\{(x, y) : x < 0\}$ and $\{(x, y) : x > 0\}$ are open in $\mathbf{R}^2$ (neither of them contains any of its boundary points) and separate $E$. Notice that the sets of Examples 10.36 and 10.38 are both connected in $\mathbf{R}^2$. There is a simple description of all connected subsets of $\mathbf{R}$.
10.56 THEOREM. A subset $E$ of $\mathbf{R}$ is connected if and only if $E$ is an interval.
PROOF. Let $E$ be a connected subset of $\mathbf{R}$. If $E$ is empty or contains only one point, then $E$ is a degenerate interval. Hence we may suppose that $E$ contains at least two points. Set $a = \inf E$ and $b = \sup E$. Notice that $-\infty \le a < b \le \infty$. Suppose for simplicity that $a, b \notin E$; i.e., $E \subseteq (a, b)$. If $E \neq (a, b)$, then there is an $x \in (a, b)$ such that $x \notin E$. By the Approximation Property, $E \cap (a, x) \neq \emptyset$ and $E \cap (x, b) \neq \emptyset$, and by assumption, $E \subseteq (a, x) \cup (x, b)$. Hence, $E$ is separated by the open sets $(a, x)$, $(x, b)$, a contradiction.
Conversely, if $I$ is an interval which is not connected, then there are sets $U, V$, relatively open in $I$, which separate $I$, i.e., $I = U \cup V$, $U \cap V = \emptyset$, and there are points $x_1 \in I \cap U$ and $x_2 \in I \cap V$. We may suppose that $x_1 < x_2$. Consider the set
$$W = \{t \in I : \text{the interval } (x_1, t) \text{ satisfies } (x_1, t) \subseteq U\}.$$
Notice once and for all that since the intersection of two open intervals is an open interval, any set that is relatively open in $I$ contains an interval about each of its points. Since $U$ is relatively open, it follows that $W \neq \emptyset$. Since $V$ is relatively open, it also follows that $x_2 \notin W$ and $W$ is bounded above by some $c < x_2$. Thus $x_3 = \sup W$ is a finite number that belongs to $(x_1, c] \subset I$. In particular, either $x_3 \in U$ or $x_3 \in V$.
Suppose that $x_3 \in U$. Since $x_3 > x_1$, we can choose $\delta > 0$ so small that $x_3 - \delta > x_1$ and $(x_3 - \delta, x_3 + \delta) \subset U$. Since $x_3 = \sup W$, we can choose by the Approximation Property a $t \in W$ such that $t > x_3 - \delta$ and $(x_1, t) \subset U$. It follows that $(x_1, x_3 + \delta) = (x_1, t) \cup (x_3 - \delta, x_3 + \delta) \subset U$; i.e., $x_3$ is not the supremum of $W$, a contradiction.
On the other hand, if $x_3 \in V$, the same reasoning shows us that there is a $\delta > 0$ such that $(x_3 - \delta, x_3 + \delta) \subset V$ and a $t \in W$ such that $t > x_3 - \delta$ and $(x_3 - \delta, t) \subset U$. It follows that $(x_3 - \delta, t) \subset U \cap V$; i.e., $U \cap V \neq \emptyset$, a contradiction. Thus the pair $U, V$ does not separate $I$, and $I$ must be connected. ∎
We can use this result to prove that a real function is continuous on a closed, bounded interval if and only if its graph is closed and connected (see Theorem 9.52).
We close this section by showing that the converse of Remark 10.55 is also true. This result is optional because we do not use it elsewhere.
*10.57 THEOREM. Let $E \subseteq X$. If there exist sets $U, V$, relatively open in $E$, such that $U \cap V = \emptyset$, $E = U \cup V$, $U \neq \emptyset$, and $V \neq \emptyset$, then there is a pair of open sets $A, B$ that separate $E$.
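PROOF. We first show that
(9) $\overline{U} \cap V = \emptyset$.
Indeed, since $V$ is relatively open in $E$, there is a set $\Omega$, open in $X$, such that $V = E \cap \Omega$. Since $U \cap V = \emptyset$, it follows that $U \subset \Omega^c$. This last set is closed in X. Therefore,
$$\overline{U} \subseteq \overline{\Omega^c} = \Omega^c, \quad \text{so that} \quad \overline{U} \cap V \subseteq \Omega^c \cap \Omega = \emptyset;$$
i.e., (9) holds.
Next, we use (9) to construct the set $B$. Set
$$\delta_x = \inf\{\rho(x, u) : u \in U\}, \quad x \in V, \qquad \text{and} \qquad B = \bigcup_{x \in V} B_{\delta_x/2}(x).$$
Clearly, $B$ is open in $X$. Since $\delta_x > 0$ for each $x \notin \overline{U}$ (see Exercise 5, p. 306), $B$ contains $V$; hence, $B \cap E \supseteq V$. The reverse inclusion also holds since by construction $B \cap U = \emptyset$ and by hypothesis $E = U \cup V$. Therefore, $B \cap E = V$.
Similarly, we can construct an open set $A$ such that $A \cap E = U$ by setting
$$\varepsilon_y = \inf\{\rho(v, y) : v \in V\}, \quad y \in U, \qquad \text{and} \qquad A = \bigcup_{y \in U} B_{\varepsilon_y/2}(y).$$
To prove that the pair $A, B$ separate $E$, it remains to prove that $A \cap B = \emptyset$. Suppose to the contrary that there is a point $a \in A \cap B$. Then $a \in B_{\delta_x/2}(x)$ for some $x \in V$ and $a \in B_{\varepsilon_y/2}(y)$ for some $y \in U$. We may suppose that $\delta_x \le \varepsilon_y$. Then
$$\rho(x, y) \le \rho(x, a) + \rho(a, y) < \frac{\delta_x}{2} + \frac{\varepsilon_y}{2} \le \varepsilon_y.$$
Therefore, $\rho(x, y) < \inf\{\rho(v, y) : v \in V\}$. Since $x \in V$, this is impossible. We conclude that $A \cap B = \emptyset$. ∎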
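The construction in the proof of Theorem 10.57 can be carried out by hand in a simple case. The computation below is an illustrative sketch only: the set $E = [0, 1) \cup (1, 2]$ in $\mathbf{R}$ and the names $U$, $V$ attached to its two pieces are our own choices, not an example from the text.

```latex
% Assumes amsmath. E, U, V below are hypothetical choices made for illustration;
% \delta_x, \varepsilon_y, A, B are as in the proof of Theorem 10.57.
\begin{align*}
E &= [0,1) \cup (1,2], & U &= [0,1) = E \cap (-1, 1), & V &= (1,2] = E \cap (1, 3), \\
\delta_x &= \inf_{u \in U} |x - u| = x - 1 \quad (x \in V),
  & B_{\delta_x/2}(x) &= \Bigl(\tfrac{x+1}{2},\, \tfrac{3x-1}{2}\Bigr),
  & B &= \bigcup_{x \in V} B_{\delta_x/2}(x) = \bigl(1, \tfrac52\bigr), \\
\varepsilon_y &= \inf_{v \in V} |v - y| = 1 - y \quad (y \in U),
  & B_{\varepsilon_y/2}(y) &= \Bigl(\tfrac{3y-1}{2},\, \tfrac{y+1}{2}\Bigr),
  & A &= \bigcup_{y \in U} B_{\varepsilon_y/2}(y) = \bigl(-\tfrac12, 1\bigr).
\end{align*}
% Conclusion: A \cap B = \emptyset, A \cap E = U, B \cap E = V,
% so the open sets A and B separate E in the sense of Remark 10.55.
```

Halving the radii is what keeps $A$ and $B$ apart: each ball reaches at most halfway toward the other piece.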
EXERCISES
1. (a) Let $a \le b$ and $c \le d$ be real numbers. Sketch a graph of the rectangle
$$[a, b] \times [c, d] := \{(x, y) : x \in [a, b],\ y \in [c, d]\}$$
and decide whether this set is compact or connected. Explain your answers.
(b) Sketch a graph of the set
$$B_1(-2, 0) \cup B_1(2, 0) \cup \{(x, 0) : -1 < x < 1\}$$
and decide whether this set is compact or connected. Explain your answers.
2. (a) Sketch a graph of the set
$$\{(x, y) : x^2 + 2y^2 < 6,\ y \ge 0\}$$
and decide whether this set is relatively open or relatively closed in the subspace $\{(x, y) : y \ge 0\}$. Do the same for the subspace $\{(x, y) : x^2 + 2y^2 < 6\}$. Explain your answers.
(b) Sketch a graph of the set
and decide whether this set is relatively open or relatively closed in the subspace $B_1(0, 0)$. Do the same for the subspace $B_{\sqrt{2}}(2, 0)$. Explain your answers.
3. Prove that the intersection of connected sets in $\mathbf{R}$ is connected. Show that this is false if "$\mathbf{R}$" is replaced by "$\mathbf{R}^2$."
4. Prove that if $E \subseteq \mathbf{R}$ is connected, then $E^o$ is also connected. Show that this is false if "$\mathbf{R}$" is replaced by "$\mathbf{R}^2$."
5. Suppose that $E \subset X$ is connected and $E \subseteq A \subseteq \overline{E}$. Prove that $A$ is connected.
6. Suppose that $\{E_\alpha\}_{\alpha \in A}$ is a collection of connected sets in a metric space $X$ such that $\bigcap_{\alpha \in A} E_\alpha \neq \emptyset$. Prove that
$$E = \bigcup_{\alpha \in A} E_\alpha$$
is connected.
7. This exercise is used in Section 10.6. Let $H \subseteq X$. Prove that $H$ is compact if and only if every cover $\{E_\alpha\}_{\alpha \in A}$ of $H$, where the $E_\alpha$'s are relatively open in $H$, has a finite subcover.
8. A set $E$ in a metric space is called clopen if it is both open and closed.
(a) Prove that every metric space has at least two clopen sets.
(b) Prove that a metric space is connected if and only if it contains exactly two clopen sets.
9. Let $X$ be a metric space. Prove that $X$ is connected if and only if every nonempty proper subset of $X$ has a nonempty boundary.
*10. This exercise is used to prove *Corollary 11.29.
(a) A set $E \subseteq \mathbf{R}^n$ is said to be polygonally connected if and only if any two points $a, b \in E$ can be connected by a polygonal path in $E$; i.e., there exist points $x_k \in E$, $k = 1, \dots, N$, such that $x_0 = a$, $x_N = b$ and $L(x_{k-1}; x_k) \subseteq E$ for $k = 1, \dots, N$. Prove that every polygonally connected set in $\mathbf{R}^n$ is connected.
(b) Let $E \subseteq \mathbf{R}^n$ be open and $x_0 \in E$. Let $U$ be the set of points $x \in E$ that can be polygonally connected in $E$ to $x_0$. Prove that $U$ is open.
(c) Prove that every open connected set in $\mathbf{R}^n$ is polygonally connected.
10.6 CONTINUOUS FUNCTIONS
In this section we discuss the behavior of images and inverse images of open sets, closed sets, compact sets, and connected sets under continuous functions. We shall use these results many times in the sequel. Recall that if $X$ and $Y$ are metric spaces (with respective metrics $\rho$ and $\tau$), then a function $f : X \to Y$ is continuous on $X$ if and only if given $a \in X$ and $\varepsilon > 0$ there is a $\delta > 0$ such that $\rho(x, a) < \delta$ implies $\tau(f(x), f(a)) < \varepsilon$, i.e., such that
(10) $f(B_\delta(a)) \subseteq B_\varepsilon(f(a))$.
This observation can be used to give the following simple but powerful characterization of continuous functions, which can be stated without using the metric of $X$ (see also Exercise 3).
10.58 THEOREM. Let $X$ and $Y$ be metric spaces, and let $f : X \to Y$. Then $f$ is continuous if and only if $f^{-1}(V)$ is open in $X$ for every open $V$ in $Y$.
PROOF. Suppose that $f$ is continuous on $X$ and $V$ is open in $Y$. We may suppose that $f^{-1}(V)$ is nonempty. Let $a \in f^{-1}(V)$, i.e., $f(a) \in V$. Since $V$ is open, choose $\varepsilon > 0$ such that $B_\varepsilon(f(a)) \subseteq V$. Since $f$ is continuous at $a$, choose $\delta > 0$ such that (10) holds. Evidently,
(11) $B_\delta(a) \subseteq f^{-1}(B_\varepsilon(f(a))) \subseteq f^{-1}(V)$.
Since $a \in f^{-1}(V)$ was arbitrary, we have shown that every point in $f^{-1}(V)$ is interior to $f^{-1}(V)$. Thus $f^{-1}(V)$ is open.
Conversely, let $\varepsilon > 0$ and $a \in X$. The ball $V = B_\varepsilon(f(a))$ is open in $Y$. By hypothesis, $f^{-1}(V)$ is open. Since $a \in f^{-1}(V)$, it follows that there is a $\delta > 0$ such that $B_\delta(a) \subseteq f^{-1}(V)$. This means that if $\rho(x, a) < \delta$, then $\tau(f(x), f(a)) < \varepsilon$. Therefore, $f$ is continuous at $a \in X$. ∎
By using the subspace (i.e., relative) topology, we see that Theorem 10.58 contains the following criterion for $f$ to be continuous on a subset of $X$.
10.59 COROLLARY. Let $E \subseteq X$ and $f : E \to Y$. Then $f$ is continuous on $E$ if and only if $f^{-1}(V) \cap E$ is relatively open in $E$ for all open sets $V$ in $Y$.
We shall refer to Theorem 10.58 and its corollary by saying that open sets are invariant under inverse images by continuous functions. It is interesting to notice that closed sets are also invariant under inverse images by continuous functions (see Exercises 3 and 4). It is natural to ask whether compact sets and connected sets are invariant under inverse images by continuous functions. The following examples show that the answer to this question is "no."
10.60 Examples. (i) If $f(x) = 1/x$ and $H = [0, 1]$, then $f$ is continuous on $(0, \infty)$ and $H$ is compact, but $f^{-1}(H) = [1, \infty)$ is not compact.
(ii) If $f(x) = x^2$ and $E = (1, 4)$, then $f$ is continuous on $\mathbf{R}$ and $E$ is connected, but $f^{-1}(E) = (-2, -1) \cup (1, 2)$ is not connected.
The next two results show that compact sets and connected sets are invariant under images, rather than inverse images, by continuous functions.
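10.61 THEOREM. If $H$ is compact in $X$ and $f : H \to Y$ is continuous on $H$, then $f(H)$ is compact in $Y$.
PROOF. Suppose that $\{V_\alpha\}_{\alpha \in A}$ is an open covering of $f(H)$. By Theorem 1.43,
$$H \subseteq f^{-1}(f(H)) \subseteq f^{-1}\Bigl( \bigcup_{\alpha \in A} V_\alpha \Bigr) = \bigcup_{\alpha \in A} f^{-1}(V_\alpha).$$
Hence, by Corollary 10.59, $\{f^{-1}(V_\alpha)\}_{\alpha \in A}$ is a covering of $H$ whose sets are all relatively open in $H$. Since $H$ is compact, there are indices $\alpha_1, \alpha_2, \dots, \alpha_N$ such that
$$H \subseteq \bigcup_{j=1}^{N} f^{-1}(V_{\alpha_j})$$
(see Exercise 7, p. 316). It follows from Theorem 1.43 that
$$f(H) \subseteq f\Bigl( \bigcup_{j=1}^{N} f^{-1}(V_{\alpha_j}) \Bigr) = \bigcup_{j=1}^{N} f(f^{-1}(V_{\alpha_j})) \subseteq \bigcup_{j=1}^{N} V_{\alpha_j}.$$
Therefore, $f(H)$ is compact. ∎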
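Two quick computations illustrate Theorem 10.61 and the role of its hypothesis. Both functions and both domains are our own illustrative choices (not examples from the text), with the usual metric on $\mathbf{R}$ and Theorem 10.50 used to recognize compact subsets of $\mathbf{R}$.

```latex
% Assumes amsmath. The functions f, g and the sets below are hypothetical examples.
\begin{align*}
f(x) &= x^2, & H &= [-1, 2] \ \text{(closed and bounded, hence compact)},
  & f(H) &= [0, 4] \ \text{(compact)}, \\
g(x) &= \frac{1}{x}, & E &= (0, 1] \ \text{(not closed, hence not compact)},
  & g(E) &= [1, \infty) \ \text{(unbounded, not compact)}.
\end{align*}
% The second row does not contradict Theorem 10.61: g is continuous on E,
% but E is not compact, so the theorem makes no claim about g(E).
```

The first row is an instance of the theorem; the second shows that continuity alone does not force the image to be compact.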
10.62 THEOREM. If $E$ is connected in $X$ and $f : E \to Y$ is continuous on $E$, then $f(E)$ is connected in $Y$.
PROOF. Suppose that $f(E)$ is not connected. By Definition 10.53, there exists a pair $U, V \subset Y$ of relatively open sets in $f(E)$ that separate $f(E)$. By Exercise 4, $f^{-1}(U) \cap E$ and $f^{-1}(V) \cap E$ are relatively open in $E$. Since $f(E) = U \cup V$, we have
$$E = (f^{-1}(U) \cap E) \cup (f^{-1}(V) \cap E).$$
Since $U \cap V = \emptyset$, we also have $f^{-1}(U) \cap f^{-1}(V) = \emptyset$. Thus $f^{-1}(U) \cap E$, $f^{-1}(V) \cap E$ is a pair of relatively open sets that separate $E$. Hence, by Definition 10.53, $E$ is not connected, a contradiction. ∎
(Note: Theorems 10.61 and 10.62 do not hold if "compact" or "connected" are replaced by "open" or "closed." For example, if $f(x) = x^2$ and $V = (-1, 1)$, then $f$ is continuous on $\mathbf{R}$ and $V$ is open, but $f(V) = [0, 1)$ is neither open nor closed.)
Suppose that $f$ is a real function continuous on a closed bounded interval $[a, b]$. Then the function $F(x) = (x, f(x))$ is continuous from $[a, b]$ into $\mathbf{R}^2$. Since the graph of $y = f(x)$ for $x \in [a, b]$ is the image of $[a, b]$ under $F$, it follows from Theorems 10.61 and 10.62 that the graph of $f$ is compact and connected. It is interesting to note that this property actually characterizes continuity of real functions (see Theorem 9.51).
To illustrate the power of the topological point of view presented above, compare the proofs of the following theorem and Exercise 5 with those of Theorems 3.26 and 3.29.
10.63 THEOREM [EXTREME VALUE THEOREM]. Let $H$ be a nonempty, compact set in a metric space $X$ and suppose that $f : H \to \mathbf{R}$ is continuous. Then
$$M := \sup\{f(x) : x \in H\} \quad \text{and} \quad m := \inf\{f(x) : x \in H\}$$
are finite real numbers and there exist points $x_M, x_m \in H$ such that $M = f(x_M)$ and $m = f(x_m)$.
PROOF. By symmetry, it suffices to prove the result for $M$. Since $H$ is compact, $f(H)$ is compact. Hence, by Theorem 10.46, $f(H)$ is closed and bounded. Since $f(H)$ is bounded, $M$ is finite. By the Approximation Property, choose $x_k \in H$ such that $f(x_k) \to M$ as $k \to \infty$. Since $f(H)$ is closed, $M \in f(H)$. Therefore, there is an $x_M \in H$ such that $M = f(x_M)$. A similar argument shows that $m$ is finite and attained on $H$. ∎
The following analogue of Theorem 4.26 will be used in Chapter 13 to examine change of parametrizations of curves and surfaces.
10.64 THEOREM. Let $X$ and $Y$ be metric spaces. If $H$ is a compact subset of $X$ and $f : H \to Y$ is 1-1 and continuous, then $f^{-1}$ is continuous on $f(H)$.
PROOF. By Exercise 4a, it suffices to show that $(f^{-1})^{-1}$ takes closed sets in $X$ to relatively closed sets in $f(H)$. Let $E$ be closed in $X$. Then $E \cap H$ is a closed subset of $H$, so by Remark 10.45, $E \cap H$ is compact. Hence, by Theorem 10.61, $f(E \cap H)$ is compact, in particular closed. Since $f$ is 1-1, $f(E \cap H) = f(E) \cap f(H)$ (see Exercise 6, p. 33). Since $f(E \cap H)$ and $f(H)$ are closed, it follows that $f(E) \cap f(H)$ is relatively closed in $f(H)$. Since $(f^{-1})^{-1} = f$, we conclude that $(f^{-1})^{-1}(E) \cap f(H)$ is relatively closed in $f(H)$. ∎
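The compactness hypothesis in Theorem 10.64 cannot be omitted. The following standard counterexample is a sketch of our own choosing (wrapping the half-open interval $[0, 2\pi)$ around the unit circle), not an example stated in the text.

```latex
% Assumes amsmath. The map f and the points p_n are hypothetical illustrative choices.
\begin{align*}
f &: [0, 2\pi) \to \mathbf{R}^2, \qquad f(t) = (\cos t, \sin t), \\
f&\ \text{is 1-1 and continuous, and } f([0, 2\pi)) = \{(x, y) : x^2 + y^2 = 1\}, \\
p_n &:= f\bigl(2\pi - \tfrac1n\bigr) \longrightarrow (1, 0) = f(0) \quad \text{as } n \to \infty, \\
f^{-1}(p_n) &= 2\pi - \tfrac1n \longrightarrow 2\pi \neq 0 = f^{-1}(1, 0).
\end{align*}
% Hence f^{-1} is not continuous at (1,0). There is no conflict with Theorem 10.64:
% [0, 2\pi) is not closed in R, so it is not compact.
```

If the domain is replaced by a compact interval such as $[0, \pi]$, the same map does have a continuous inverse on its image, as the theorem predicts.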
If you are interested in how to use these topological ideas to study real functions further, you may read Section 9.5 now.

EXERCISES
1. Let $f(x) = \sin x$ and $g(x) = x/|x|$ if $x \neq 0$ and $g(0) = 0$.
(a) Find $f(E)$ and $g(E)$ for $E = (0,\pi)$, $E = [0,\pi]$, $E = (-1,1)$, and $E = [-1,1]$, and explain some of your answers by appealing to results in this section.
(b) Find $f^{-1}(E)$ and $g^{-1}(E)$ for $E = (0,1)$, $E = [0,1]$, $E = (-1,1)$, and $E = [-1,1]$, and explain some of your answers by appealing to results in this section.
2. Let $f(x) = \sqrt{x}$ and $g(x) = 1/x$ if $x \neq 0$ and $g(0) = 0$.
(a) Find $f(E)$ and $g(E)$ for $E = (0,1)$, $E = [0,1)$, and $E = [0,1]$, and explain some of your answers by appealing to results in this section.
(b) Find $f^{-1}(E)$ and $g^{-1}(E)$ for $E = (-1,1)$ and $E = [-1,1]$, and explain some of your answers by appealing to results in this section.
3. Let $X$ be a metric space and $f : X \to Y$. Prove that $f$ is continuous if and only if $f^{-1}(C)$ is closed in $X$ for every set $C$ closed in $Y$.
4. Suppose that $E \subseteq X$ and $f : E \to Y$.
(a) Prove that $f$ is continuous on $E$ if and only if $f^{-1}(A) \cap E$ is relatively closed in $E$ for all closed sets $A$ in $Y$.
(b) Suppose that $f$ is continuous on $E$. Prove that if $V$ is relatively open in $f(E)$, then $f^{-1}(V)$ is relatively open in $E$, and if $A$ is relatively closed in $f(E)$, then $f^{-1}(A)$ is relatively closed in $E$.
5. [INTERMEDIATE VALUE THEOREM]. Let $E$ be a connected subset of a metric space $X$. If $f : E \to \mathbf{R}$ is continuous, $f(a) \neq f(b)$ for some $a, b \in E$, and $y$ is a number that lies between $f(a)$ and $f(b)$, then prove that there is an $x \in E$ such that $f(x) = y$. (You may use Theorem 10.56.)
6. Let $X$ be a metric space, $Y$ be a Euclidean space, and $H$ be a nonempty compact subset of $X$.
(a) Suppose that $f : H \to Y$ is continuous. Prove that
$$\|f\|_H := \sup_{x \in H} \|f(x)\|_Y$$
is finite and there exists an $x_0 \in H$ such that $\|f(x_0)\|_Y = \|f\|_H$.
(b) A sequence of functions $f_k : H \to Y$ is said to converge uniformly on $H$ to a function $f : H \to Y$ if and only if given $\varepsilon > 0$ there is an $N \in \mathbf{N}$ such that
$$k \ge N \ \text{and}\ x \in H \quad\text{imply}\quad \|f_k(x) - f(x)\|_Y < \varepsilon.$$
Show that $\|f_k - f\|_H \to 0$ as $k \to \infty$ if and only if $f_k \to f$ uniformly on $H$ as $k \to \infty$.
(c) Prove that a sequence of functions $f_k$ converges uniformly on $H$ if and only if, given $\varepsilon > 0$, there is an $N \in \mathbf{N}$ such that
$$k, j \ge N \quad\text{implies}\quad \|f_k - f_j\|_H < \varepsilon.$$
7. Suppose that $E$ is a compact subset of a metric space $X$.
(a) If $f, g : E \to \mathbf{R}^n$ are uniformly continuous, prove that $f + g$ and $f \cdot g$ are uniformly continuous. Did you need compactness for both results?
(b) If $g : E \to \mathbf{R}$ is continuous on $E$ and $g(x) \neq 0$ for $x \in E$, prove that $1/g$ is a bounded function.
(c) If $f, g : E \to \mathbf{R}$ are uniformly continuous on $E$ and $g(x) \neq 0$ for $x \in E$, prove that $f/g$ is uniformly continuous on $E$.
8. Let $X$ and $Y$ be metric spaces, $E \subseteq X$, and $f : E \to Y$.
(a) If $f$ is uniformly continuous on $E$ and $x_n \in E$ is Cauchy in $X$, prove that $f(x_n)$ is Cauchy in $Y$.
(b) Suppose that $D$ is a dense subspace of $X$; i.e., $D \subset X$ and $\overline{D} = X$. If $Y$ is complete and $f : D \to Y$ is uniformly continuous on $D$, prove that $f$ has a continuous extension to $X$; i.e., prove that there is a continuous function $g : X \to Y$ such that $g(x) = f(x)$ for all $x \in D$.
Chapter 11
11.1 PARTIAL DERIVATIVES AND PARTIAL INTEGRALS

The most natural way to define derivatives and integrals of functions of several variables is to allow one variable to move at a time. The corresponding objects, partial derivatives and partial integrals, are the subjects of this section. Our main goal is to identify conditions under which partial derivatives, partial integrals, and evaluation of limits commute with each other, e.g., under which the limit of a partial integral is the partial integral of a limit.

We begin with some notation. The Cartesian product of a finite collection of sets $E_1, E_2, \ldots, E_n$ is the set of ordered $n$-tuples defined by
$$E_1 \times E_2 \times \cdots \times E_n := \{(x_1, x_2, \ldots, x_n) : x_j \in E_j \text{ for } j = 1, 2, \ldots, n\}.$$
Thus the Cartesian product of $n$ subsets of $\mathbf{R}$ is a subset of $\mathbf{R}^n$. By a rectangle in $\mathbf{R}^n$ (or an $n$-dimensional rectangle) we mean a Cartesian product of $n$ closed, bounded intervals. An $n$-dimensional rectangle $H = [a_1, b_1] \times \cdots \times [a_n, b_n]$ is called an $n$-dimensional cube with side $s$ if $|b_j - a_j| = s$ for $j = 1, \ldots, n$.

Let $f : \{x_1\} \times \cdots \times \{x_{j-1}\} \times [a,b] \times \{x_{j+1}\} \times \cdots \times \{x_n\} \to \mathbf{R}$. We shall denote the function
$$g(t) := f(x_1, \ldots, x_{j-1}, t, x_{j+1}, \ldots, x_n), \qquad t \in [a,b],$$
by $f(x_1, \ldots, x_{j-1}, \cdot, x_{j+1}, \ldots, x_n)$. If $g$ is integrable on $[a,b]$, then the partial integral of $f$ on $[a,b]$ with respect to $x_j$ is defined by
$$\int_a^b f(x_1, \ldots, x_n)\, dx_j := \int_a^b g(t)\, dt.$$
If $g$ is differentiable at some $t_0 \in (a,b)$, then the partial derivative (or first-order partial derivative) of $f$ at $(x_1, \ldots, x_{j-1}, t_0, x_{j+1}, \ldots, x_n)$ with respect to $x_j$ is defined by
$$f_{x_j}(x_1, \ldots, x_{j-1}, t_0, x_{j+1}, \ldots, x_n) := \frac{\partial f}{\partial x_j}(x_1, \ldots, x_{j-1}, t_0, x_{j+1}, \ldots, x_n) := g'(t_0).$$
Thus the partial derivative $f_{x_j}$ exists at a point $\mathbf{a}$ if and only if the limit
$$\frac{\partial f}{\partial x_j}(\mathbf{a}) := \lim_{h \to 0} \frac{f(\mathbf{a} + h\mathbf{e}_j) - f(\mathbf{a})}{h}$$
exists. (Some authors use $f_j$ to denote the partial derivative $f_{x_j}$. To avoid confusing first-order partial derivatives with sequences and components of functions, we will not use this notation.)

We extend partial derivatives to vector-valued functions in the following way. Suppose that $\mathbf{a} = (a_1, \ldots, a_n) \in \mathbf{R}^n$ and $f = (f_1, f_2, \ldots, f_m) : \{a_1\} \times \cdots \times \{a_{j-1}\} \times I \times \{a_{j+1}\} \times \cdots \times \{a_n\} \to \mathbf{R}^m$, where $j \in \{1, 2, \ldots, n\}$ is fixed and $I$ is an open interval containing $a_j$. If for each $k = 1, 2, \ldots, m$ the first-order partial derivative $\partial f_k / \partial x_j$ exists at $\mathbf{a}$, then we define the first-order partial derivative of $f$ with respect to $x_j$ to be the vector-valued function
$$f_{x_j}(\mathbf{a}) := \frac{\partial f}{\partial x_j}(\mathbf{a}) := \left(\frac{\partial f_1}{\partial x_j}(\mathbf{a}), \ldots, \frac{\partial f_m}{\partial x_j}(\mathbf{a})\right).$$
Higher-order partial derivatives are defined by iteration. For example, the second-order partial derivative of $f$ with respect to $x_j$ and $x_k$ is defined by
$$f_{x_j x_k} := \frac{\partial^2 f}{\partial x_k\, \partial x_j} := \frac{\partial}{\partial x_k}\left(\frac{\partial f}{\partial x_j}\right)$$
when it exists. Second-order partial derivatives are called mixed when $j \neq k$. This brings us to the following important collection of functions.
11.1 DEFINITION. Let $V$ be a nonempty, open subset of $\mathbf{R}^n$, let $f : V \to \mathbf{R}^m$, and let $p \in \mathbf{N}$.
(i) $f$ is said to be $C^p$ on $V$ if and only if each partial derivative of $f$ of order $k \le p$ exists and is continuous on $V$.
(ii) $f$ is said to be $C^\infty$ on $V$ if and only if $f$ is $C^p$ on $V$ for all $p \in \mathbf{N}$.

Clearly, if $f$ is $C^p$ on $V$ and $q < p$, then $f$ is $C^q$ on $V$. By making obvious modifications in Definition 11.1 using Definition 4.6, we can also define what it means for a function to be $C^p$ on a rectangle $H$. We shall denote the collection of functions that are $C^p$ on an open set $V$, respectively, on a rectangle $H$, by $C^p(V)$, respectively, by $C^p(H)$.
For simplicity, we shall state all results in this section for the case $n = 2$ and $m = 1$, using $x$ for $x_1$ and $y$ for $x_2$. (It is too cumbersome to do otherwise.) It is clear that with appropriate changes in notation, these results also hold for any $n, m \in \mathbf{N}$.

Since partial derivatives and partial integrals are essentially one-dimensional ideas, each one-dimensional result about derivatives and integrals contains information about partial derivatives and partial integrals. Here are three examples. By the Product Rule (Theorem 4.10), if $f_x$ and $g_x$ exist, then
$$\frac{\partial}{\partial x}(fg) = f\,\frac{\partial g}{\partial x} + g\,\frac{\partial f}{\partial x}.$$
By the Mean Value Theorem (Theorem 4.15), if $f(\cdot, y)$ is continuous on $[a,b]$ and the partial derivative $f_x(\cdot, y)$ exists on $(a,b)$, then there is a point $c \in (a,b)$ (which may depend on $y$ as well as $a$ and $b$) such that
$$f(b,y) - f(a,y) = (b-a)\,\frac{\partial f}{\partial x}(c,y);$$
and by the Fundamental Theorem of Calculus (Theorem 5.28), if $f(\cdot, y)$ is continuous on $[a,b]$, then
$$\frac{\partial}{\partial x}\int_a^x f(t,y)\,dt = f(x,y),$$
and if the partial derivative $f_x(\cdot, y)$ exists and is integrable on $[a,b]$, then
$$\int_a^b \frac{\partial f}{\partial x}(x,y)\,dx = f(b,y) - f(a,y).$$
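As a rough numerical companion to the last identity, the sketch below freezes $y$, approximates $f_x(\cdot, y)$ by a central difference quotient, and integrates it with a midpoint rule. The test function, the step sizes, and the helper names `partial_x` and `partial_integral_x` are illustrative assumptions, not part of the text.

```python
import math

def f(x, y):
    # Illustrative choice (an assumption): a function that is smooth on R^2.
    return x * x * y + math.sin(x * y)

def partial_x(func, x, y, h=1e-5):
    # Central difference quotient in the first variable; y is held fixed.
    return (func(x + h, y) - func(x - h, y)) / (2.0 * h)

def partial_integral_x(func, a, b, y, n=2000):
    # Composite midpoint rule for the partial integral of func(., y) over [a, b].
    width = (b - a) / n
    return width * sum(func(a + (i + 0.5) * width, y) for i in range(n))

a, b, y = 0.0, 1.0, 0.7
# Left side: partial integral of the (approximate) partial derivative f_x.
lhs = partial_integral_x(lambda x, yy: partial_x(f, x, yy), a, b, y)
# Right side: f(b, y) - f(a, y), as in the Fundamental Theorem of Calculus.
rhs = f(b, y) - f(a, y)
print(lhs, rhs)
```

The two printed values should agree up to discretization error; this illustrates the identity for one function and is no substitute for the proof.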
Our first result about the commutation of partial derivatives, partial integrals, and evaluation of limits deals with interchanging two first-order partial derivatives (see also Exercise 10, p. 339).
11.2 THEOREM. Suppose that $V$ is open in $\mathbf{R}^2$, that $(a,b) \in V$, and that $f : V \to \mathbf{R}$. If $f$ is $C^1$ on $V$, and if one of the mixed second partial derivatives of $f$ exists on $V$ and is continuous at the point $(a,b)$, then the other mixed second partial derivative exists at $(a,b)$ and
$$\frac{\partial^2 f}{\partial y\,\partial x}(a,b) = \frac{\partial^2 f}{\partial x\,\partial y}(a,b).$$
NOTE: These hypotheses are met if $f \in C^2(V)$.

PROOF. Suppose that $f_{yx}$ exists on $V$ and is continuous at the point $(a,b)$. Consider
$$\Delta(h,k) := f(a+h, b+k) - f(a+h, b) - f(a, b+k) + f(a,b),$$
defined for $|h|, |k| < r/\sqrt{2}$, where $r > 0$ is so small that $B_r(a,b) \subset V$. Apply the Mean Value Theorem twice to choose scalars $s, t \in (0,1)$ such that
$$\Delta(h,k) = k\,\frac{\partial f}{\partial y}(a+h, b+tk) - k\,\frac{\partial f}{\partial y}(a, b+tk) = hk\,\frac{\partial^2 f}{\partial x\,\partial y}(a+sh, b+tk).$$
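Since this last mixed partial derivative is continuous at the point $(a,b)$, we have
$$\lim_{k \to 0}\lim_{h \to 0} \frac{\Delta(h,k)}{hk} = \frac{\partial^2 f}{\partial x\,\partial y}(a,b). \tag{1}$$
On the other hand, the Mean Value Theorem also implies that there is a scalar $u \in (0,1)$ such that
$$\Delta(h,k) = f(a+h, b+k) - f(a, b+k) - f(a+h, b) + f(a,b) = h\,\frac{\partial f}{\partial x}(a+uh, b+k) - h\,\frac{\partial f}{\partial x}(a+uh, b).$$
Hence, it follows from (1) that
$$\lim_{k \to 0}\lim_{h \to 0} \frac{1}{k}\left(\frac{\partial f}{\partial x}(a+uh, b+k) - \frac{\partial f}{\partial x}(a+uh, b)\right) = \lim_{k \to 0}\lim_{h \to 0} \frac{\Delta(h,k)}{hk} = \frac{\partial^2 f}{\partial x\,\partial y}(a,b).$$
Since $f_x$ is continuous on $B_r(a,b)$, we can let $h = 0$ in the first expression. We conclude by definition that
$$\frac{\partial^2 f}{\partial y\,\partial x}(a,b) = \frac{\partial^2 f}{\partial x\,\partial y}(a,b). \quad ∎$$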
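We shall refer to the conclusion of Theorem 11.2 by saying the first-order partial derivatives of $f$ commute. Thus, if $f$ is $C^2$ on an open subset $V$ of $\mathbf{R}^n$, if $\mathbf{a} \in V$, and if $j \neq k$, then
$$\frac{\partial^2 f}{\partial x_j\,\partial x_k}(\mathbf{a}) = \frac{\partial^2 f}{\partial x_k\,\partial x_j}(\mathbf{a}).$$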
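The second difference $\Delta(h,k)$ used in the proof of Theorem 11.2 also gives a quick numerical estimate of a mixed partial derivative. The sketch below is only an illustration: the function $e^x \sin y$, the base point, and the step size are assumptions chosen for convenience.

```python
import math

def second_difference(f, a, b, h, k):
    # Delta(h, k) = f(a+h, b+k) - f(a+h, b) - f(a, b+k) + f(a, b).
    return f(a + h, b + k) - f(a + h, b) - f(a, b + k) + f(a, b)

def f(x, y):
    # Illustrative C^2 function (an assumption); both mixed partials equal exp(x) * cos(y).
    return math.exp(x) * math.sin(y)

a, b = 0.5, 0.3
step = 1e-4
# Delta(h, k) / (h k) approximates the common value of the mixed partials at (a, b).
approx = second_difference(f, a, b, step, step) / (step * step)
exact = math.exp(a) * math.cos(b)
print(approx, exact)
```

Because $\Delta(h,k)$ is symmetric in the roles of $x$ and $y$, the same quotient approximates both mixed partial derivatives when $f$ is $C^2$.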
The following example shows that Theorem 11.2 is false if the assumption about continuity of the second-order partial derivative is dropped.

11.3 Example. Prove that
$$f(x,y) = \begin{cases} xy\,\dfrac{x^2 - y^2}{x^2 + y^2} & (x,y) \neq (0,0) \\ 0 & (x,y) = (0,0) \end{cases}$$
is $C^1$ on $\mathbf{R}^2$, both mixed second partial derivatives of $f$ exist on $\mathbf{R}^2$, but the first-order partial derivatives of $f$ do not commute at $(0,0)$; i.e., $f_{xy}(0,0) \neq f_{yx}(0,0)$.

PROOF. By the one-dimensional Product and Quotient Rules,
$$\frac{\partial f}{\partial x}(x,y) = xy\,\frac{4xy^2}{(x^2+y^2)^2} + y\,\frac{x^2 - y^2}{x^2 + y^2}$$
for $(x,y) \neq (0,0)$. Since $2|xy| \le x^2 + y^2$, we have $|f_x(x,y)| \le 2|y|$. Therefore, $f_x(x,y) \to 0$ as $(x,y) \to (0,0)$. On the other hand, by definition
$$\frac{\partial f}{\partial x}(0,y) = \lim_{h \to 0} y\,\frac{h^2 - y^2}{h^2 + y^2} = -y$$
for all $y \in \mathbf{R}$; hence, $f_x(0,0) = 0$. This proves that $f_x$ exists and is continuous on $\mathbf{R}^2$ with value $-y$ at $(0,y)$. A similar argument shows that $f_y$ exists and is continuous on $\mathbf{R}^2$ with value $x$ at $(x,0)$. It follows that the mixed second partial derivatives of $f$ exist on $\mathbf{R}^2$, and
$$f_{xy}(0,0) = \lim_{k \to 0}\frac{f_x(0,k) - f_x(0,0)}{k} = -1 \neq 1 = \lim_{h \to 0}\frac{f_y(h,0) - f_y(0,0)}{h} = f_{yx}(0,0). \quad ∎$$
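The failure of commutation in Example 11.3 can be observed numerically. The sketch below assumes the function of the example is $f(x,y) = xy(x^2 - y^2)/(x^2 + y^2)$ with $f(0,0) = 0$, which is consistent with the limit computed in the proof; the nested difference quotients and their step sizes are illustrative choices.

```python
def f(x, y):
    # Assumed form of the function in Example 11.3 (consistent with the proof's limit).
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)

def fx(x, y, h=1e-6):
    # Central difference approximation of the first-order partial f_x.
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)

def fy(x, y, h=1e-6):
    # Central difference approximation of the first-order partial f_y.
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

def fxy_at_origin(k=1e-3):
    # Differentiate f_x with respect to y at (0, 0).
    return (fx(0.0, k) - fx(0.0, -k)) / (2.0 * k)

def fyx_at_origin(k=1e-3):
    # Differentiate f_y with respect to x at (0, 0).
    return (fy(k, 0.0) - fy(-k, 0.0)) / (2.0 * k)

print(fxy_at_origin(), fyx_at_origin())
```

The inner step is taken much smaller than the outer step so that the inner quotient resolves $f_x(0, k) \approx -k$ and $f_y(k, 0) \approx k$ before the outer quotient is formed.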
The following result shows that we can interchange a limit sign and a partial integral sign when the integrand is continuous on a rectangle.
11.4 THEOREM. Let $H = [a,b] \times [c,d]$ be a rectangle and suppose that $f : H \to \mathbf{R}$ is continuous. If
$$F(y) = \int_a^b f(x,y)\,dx,$$
then $F$ is continuous on $[c,d]$; i.e.,
$$\lim_{\substack{y \to y_0 \\ y \in [c,d]}} \int_a^b f(x,y)\,dx = \int_a^b \lim_{\substack{y \to y_0 \\ y \in [c,d]}} f(x,y)\,dx$$
for all $y_0 \in [c,d]$.

PROOF. For each $y \in [c,d]$, $f(\cdot, y)$ is continuous on $[a,b]$. Hence, by Theorem 5.10, $F(y)$ exists for $y \in [c,d]$. Fix $y_0 \in [c,d]$ and let $\varepsilon > 0$. Since $H$ is compact, $f$ is uniformly continuous on $H$. Hence, choose $\delta > 0$ such that $\|(x,y) - (z,w)\| < \delta$ and $(x,y), (z,w) \in H$ imply
$$|f(x,y) - f(z,w)| < \frac{\varepsilon}{b-a}.$$
Since $|y - y_0| = \|(x,y) - (x,y_0)\|$, it follows that
$$|F(y) - F(y_0)| \le \int_a^b |f(x,y) - f(x,y_0)|\,dx < \varepsilon$$
for all $y \in [c,d]$ that satisfy $|y - y_0| < \delta$. We conclude that $F$ is continuous on $[c,d]$. ∎

The following result shows that we can interchange a derivative and an integral sign when the first-order partial derivative of the integrand is sufficiently smooth. We will refer to this process as differentiating under the integral sign.
11.5 THEOREM. Let $H = [a,b] \times [c,d]$ be a rectangle in $\mathbf{R}^2$ and let $f : H \to \mathbf{R}$. Suppose that $f(\cdot, y)$ is integrable on $[a,b]$ for each $y \in [c,d]$, and that the partial derivative $f_y(x, \cdot)$ exists on $[c,d]$ for each $x \in [a,b]$. If the two-variable function $f_y(x,y)$ is continuous on $H$, then
$$\frac{d}{dy}\int_a^b f(x,y)\,dx = \int_a^b \frac{\partial f}{\partial y}(x,y)\,dx$$
for all $y \in [c,d]$.
NOTE: These hypotheses are met if $f \in C^1(H)$.
PROOF.
Recall that "$f_y(x, \cdot)$ exists on $[c,d]$" means that $f_y(x, \cdot)$ exists on $(c,d)$, and
$$f_y(x,c) := \lim_{h \to 0+} \frac{f(x, c+h) - f(x,c)}{h}, \qquad f_y(x,d) := \lim_{h \to 0-} \frac{f(x, d+h) - f(x,d)}{h}$$
both exist (see Definition 4.6). Hence, it suffices to show that
$$\lim_{h \to 0+} \int_a^b \frac{f(x, y+h) - f(x,y)}{h}\,dx = \int_a^b \frac{\partial f}{\partial y}(x,y)\,dx$$
for $y \in [c,d)$, and
$$\lim_{h \to 0-} \int_a^b \frac{f(x, y+h) - f(x,y)}{h}\,dx = \int_a^b \frac{\partial f}{\partial y}(x,y)\,dx$$
for $y \in (c,d]$. The arguments are similar; we provide the details only for the first identity.

Fix $x \in [a,b]$ and $y \in [c,d)$, and let $h > 0$ be so small that $y + h \in [c,d)$. Let $\varepsilon > 0$. By uniform continuity, choose a $\delta > 0$ so small that $|y - c| < \delta$ and $x \in [a,b]$ imply $|f_y(x,y) - f_y(x,c)| < \varepsilon/(b-a)$. By the Mean Value Theorem, choose a point $c(x;h)$ between $y$ and $y + h$ such that
$$F(x,y,h) := \frac{f(x, y+h) - f(x,y)}{h} = \frac{\partial f}{\partial y}(x, c(x;h)).$$
Since $|c(x;h) - y| = c(x;h) - y \le h$, it follows that if $0 < h < \delta$, then
$$\left|\int_a^b F(x,y,h)\,dx - \int_a^b \frac{\partial f}{\partial y}(x,y)\,dx\right| \le \int_a^b \left|\frac{\partial f}{\partial y}(x, c(x;h)) - \frac{\partial f}{\partial y}(x,y)\right| dx < \varepsilon.$$
Therefore,
$$\frac{d}{dy}\int_a^b f(x,y)\,dx = \int_a^b \frac{\partial f}{\partial y}(x,y)\,dx. \quad ∎$$

Thus if $H = [a_1, b_1] \times \cdots \times [a_n, b_n]$ is an $n$-dimensional rectangle, if $f$ is $C^1$ on $H$, and $k \neq j$, then
$$\frac{\partial}{\partial x_k}\int_{a_j}^{b_j} f(x_1, \ldots, x_n)\,dx_j = \int_{a_j}^{b_j} \frac{\partial f}{\partial x_k}(x_1, \ldots, x_n)\,dx_j. \tag{2}$$
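Differentiating under the integral sign can be checked numerically on a simple case. In the sketch below the integrand $\sin(xy)$ on $[0,1]$, the point $y_0$, and the quadrature and difference step sizes are assumptions made for illustration; `midpoint` is a hypothetical helper, not a library routine.

```python
import math

def midpoint(g, a, b, n=2000):
    # Composite midpoint rule for the integral of g over [a, b].
    width = (b - a) / n
    return width * sum(g(a + (i + 0.5) * width) for i in range(n))

def f(x, y):
    # Illustrative integrand (an assumption); it is C^1 on any rectangle.
    return math.sin(x * y)

def f_y(x, y):
    # Partial derivative of f with respect to y, computed by hand.
    return x * math.cos(x * y)

def F(y, a=0.0, b=1.0):
    # F(y) is the partial integral of f(., y) over [a, b].
    return midpoint(lambda x: f(x, y), a, b)

y0, h = 0.8, 1e-5
# Left side of Theorem 11.5: derivative of the integral (central difference).
derivative_of_integral = (F(y0 + h) - F(y0 - h)) / (2.0 * h)
# Right side of Theorem 11.5: integral of the partial derivative.
integral_of_derivative = midpoint(lambda x: f_y(x, y0), 0.0, 1.0)
print(derivative_of_integral, integral_of_derivative)
```

For this integrand one can also compare against the closed form $F(y) = (1 - \cos y)/y$, valid for $y \neq 0$.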
The rest of this section contains optional material that shows what happens to the results above when the improper integral is used.
We begin by borrowing a concept from the theory of infinite series.
*11.6 DEFINITION. Let $a < b$ be extended real numbers, let $I$ be an interval in $\mathbf{R}$, and suppose that $f : (a,b) \times I \to \mathbf{R}$. The improper integral
$$\int_a^b f(x,y)\,dx$$
is said to converge uniformly on $I$ if and only if $f(\cdot, y)$ is improperly integrable on $(a,b)$ for each $y \in I$ and given $\varepsilon > 0$ there exist real numbers $A, B \in (a,b)$ such that
$$\left|\int_a^b f(x,y)\,dx - \int_\alpha^\beta f(x,y)\,dx\right| < \varepsilon$$
for all $a < \alpha < A$, $B < \beta < b$, and all $y \in I$.

For most applications, the following simple test for uniform convergence of an improper integral will be used instead of Definition 11.6 (compare with Theorem 7.15).

*11.7 THEOREM [WEIERSTRASS M-TEST]. Suppose that $a < b$ are extended real numbers, that $I$ is an interval in $\mathbf{R}$, that $f : (a,b) \times I \to \mathbf{R}$, and that $f(\cdot, y)$ is locally integrable on the interval $(a,b)$ for each $y \in I$. If there is a function $M : (a,b) \to \mathbf{R}$, absolutely integrable on $(a,b)$, such that
$$|f(x,y)| \le M(x)$$
for all $x \in (a,b)$ and $y \in I$, then
$$\int_a^b f(x,y)\,dx$$
converges uniformly on $I$.

PROOF. Let $\varepsilon > 0$. By hypothesis and the Comparison Test for improper integrals, $\int_a^b f(x,y)\,dx$ exists and is finite for each $y \in I$. Moreover, since $M(x)$ is improperly integrable on $(a,b)$, there exist real numbers $A, B$ such that $a < A < B < b$ and
$$\int_a^A M(x)\,dx + \int_B^b M(x)\,dx < \varepsilon.$$
Thus for each $a < \alpha < A < B < \beta < b$ and each $y \in I$, we have
$$\left|\int_a^b f(x,y)\,dx - \int_\alpha^\beta f(x,y)\,dx\right| \le \int_a^\alpha |f(x,y)|\,dx + \int_\beta^b |f(x,y)|\,dx \le \int_a^A M(x)\,dx + \int_B^b M(x)\,dx < \varepsilon. \quad ∎$$
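The estimate in this proof can be illustrated numerically. The sketch below takes $f(x,y) = \cos(xy)/(1+x^2)$ on $(0,\infty)$ with dominating function $M(x) = 1/(1+x^2)$; these choices, the cutoff $B$, and the helper names are assumptions made for illustration. It compares a piece of the tail of the integral, for several values of $y$, with the single $y$-independent bound given by the tail of $M$.

```python
import math

def M(x):
    # Dominating function: |f(x, y)| <= M(x) for every y.
    return 1.0 / (1.0 + x * x)

def f(x, y):
    # Illustrative integrand (an assumption, not taken from the text).
    return math.cos(x * y) / (1.0 + x * x)

def midpoint(g, a, b, n=20000):
    # Composite midpoint rule for the integral of g over [a, b].
    width = (b - a) / n
    return width * sum(g(a + (i + 0.5) * width) for i in range(n))

def tail_of_M(B):
    # Exact value of the improper integral of M over (B, infinity).
    return math.pi / 2.0 - math.atan(B)

B = 100.0
bound = tail_of_M(B)
# Pieces of the tail of the integral of f(., y), for several values of y.
pieces = [abs(midpoint(lambda x: f(x, y), B, 200.0)) for y in (0.0, 0.5, 1.0, 3.0, 10.0)]
print(bound, pieces)
```

The point of the M-test is visible in the output: one bound, depending only on $B$, controls the tail for every $y$ at once.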
The following is an improper integral analogue of Theorem 11.4.
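*11.8 THEOREM. Suppose that $a < b$ are extended real numbers, that $c < d$ are finite real numbers, and that $f : (a,b) \times [c,d] \to \mathbf{R}$ is continuous. If
$$F(y) = \int_a^b f(x,y)\,dx$$
converges uniformly on $[c,d]$, then $F$ is continuous on $[c,d]$; i.e.,
$$\lim_{\substack{y \to y_0 \\ y \in [c,d]}} \int_a^b f(x,y)\,dx = \int_a^b \lim_{\substack{y \to y_0 \\ y \in [c,d]}} f(x,y)\,dx$$
for all $y_0 \in [c,d]$.

PROOF. Let $\varepsilon > 0$ and $y_0 \in [c,d]$. Choose real numbers $A, B$ such that $a < A < B < b$ and
$$\left|F(y) - \int_A^B f(x,y)\,dx\right| < \frac{\varepsilon}{3}$$
for all $y \in [c,d]$. By Theorem 11.4, choose $\delta > 0$ such that
$$\left|\int_A^B \big(f(x,y) - f(x,y_0)\big)\,dx\right| < \frac{\varepsilon}{3}$$
for all $y \in [c,d]$ that satisfy $|y - y_0| < \delta$. Then
$$|F(y) - F(y_0)| \le \left|F(y) - \int_A^B f(x,y)\,dx\right| + \left|\int_A^B \big(f(x,y) - f(x,y_0)\big)\,dx\right| + \left|F(y_0) - \int_A^B f(x,y_0)\,dx\right| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon$$
for all $y \in [c,d]$ that satisfy $|y - y_0| < \delta$. ∎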
The proof of Theorem 11.5 can be modified to prove the following result.
*11.9 THEOREM. Suppose that $a < b$ are extended real numbers, that $c < d$ are finite real numbers, that $f : (a,b) \times [c,d] \to \mathbf{R}$ is continuous, and that the improper integral
$$F(y) = \int_a^b f(x,y)\,dx$$
exists for all $y \in [c,d]$. If $f_y(x,y)$ exists and is continuous on $(a,b) \times [c,d]$ and if
$$\phi(y) = \int_a^b \frac{\partial f}{\partial y}(x,y)\,dx$$
converges uniformly on $[c,d]$, then $F$ is differentiable on $[c,d]$ and $F'(y) = \phi(y)$; i.e.,
$$\frac{d}{dy}\int_a^b f(x,y)\,dx = \int_a^b \frac{\partial f}{\partial y}(x,y)\,dx$$
for all $y \in [c,d]$.
For a result about interchanging two partial integrals, see Theorem 12.31 and Exercise 10, p. 419.
EXERCISES
1. Compute all mixed second-order partial derivatives of each of the following functions and verify that the mixed partial derivatives are equal.
(a) $f(x,y) = xe^y$.
(b) $f(x,y) = \cos(xy)$.
(c) $f(x,y) = \dfrac{x+y}{x^2+1}$.
2. Compute all first-order partial derivatives of each of the following functions and find where they are continuous.
(a) f(x,y)
= x 2 +sin(xy).
(b) f(x,y,z) =~.
l+z
3. For each of the following functions, compute continuous.
(c) f(x,y)
= Jx 2 +y2.
f x, and determine where it is
(x, y) =I (0,0)
(a)
(x, y) = (0,0).
(x, y) =I (0,0)
(b)
(x,y) = (0,0).
4. Suppose that $H = [a,b] \times [c,d]$ is a rectangle, $f : H \to \mathbf{R}$ is continuous, and $g : [a,b] \to \mathbf{R}$ is integrable. Prove that
$$F(y) = \int_a^b g(x) f(x,y)\,dx$$
is uniformly continuous on $[c,d]$.
5. Evaluate each of the following expressions. (a)
dd
(b)
6. Suppose that
Y
f
11 J
x 2y2
+ xy + y + 2 dx
at y = 0.
-1
is a continuous real function.
(a) If f01 f(x) dx
= 1,
prove that
1 2
lim Y-->O
2
2
f(lx-1I)e x y+y dx=2.
0
(b) If f is C1 on Rand f; f'(x) sin x dx = e, prove that
r
+ flY + x) dx =
e + lim f(x) cos(y5 y-->O 10
0.
*7. Evaluate each of the following expressions.
(a) $\displaystyle\lim_{y \to 0+} \int_0^1 \frac{x \cos y}{\sqrt[3]{1 - x + y}}\,dx$.
(b) $\displaystyle\frac{d}{dy}\int_\pi^\infty \frac{e^{-xy}\sin x}{x}\,dx$ at $y = 1$.
*8. (a) Prove that
$$\int_0^1 \frac{\cos(x^2 + y^2)}{\sqrt{x}}\,dx$$
converges uniformly on $(-\infty, \infty)$.
(b) Prove that $\int_0^\infty e^{-xy}\,dx$ converges uniformly on $[1, \infty)$.
(c) Prove that $\int_0^\infty y e^{-xy}\,dx$ exists for each $y \in [0, \infty)$ and converges uniformly on any $[a,b] \subset (0, \infty)$, but does not converge uniformly on $[0,1]$.

*11.10 DEFINITION. The Laplace transform of a function $f : (0, \infty) \to \mathbf{R}$ is said to exist at a point $s \in (0, \infty)$ if and only if the integral
$$\mathcal{L}\{f\}(s) := \int_0^\infty e^{-st} f(t)\,dt$$
converges. (Note: This integral is improper at $\infty$ and may be improper at 0.)

*9. Prove that
(a) $\mathcal{L}\{1\}(s) = \dfrac{1}{s}$, $s > 0$.
(b) $\mathcal{L}\{t^n\}(s) = \dfrac{n!}{s^{n+1}}$, $s > 0$, $n \in \mathbf{N}$.
(c) $\mathcal{L}\{e^{at}\}(s) = \dfrac{1}{s - a}$, $s > a$, $a \in \mathbf{R}$.
(d) $\mathcal{L}\{\cos(bt)\}(s) = \dfrac{s}{s^2 + b^2}$, $s > 0$, $b \in \mathbf{R}$.
(e) $\mathcal{L}\{\sin(bt)\}(s) = \dfrac{b}{s^2 + b^2}$, $s > 0$, $b \in \mathbf{R}$.
*10. Suppose that $f : (0, \infty) \to \mathbf{R}$ is continuous and bounded and that $\mathcal{L}\{f\}$ exists at some $a \in (0, \infty)$. Let
$$\phi(t) = \int_0^t e^{-au} f(u)\,du, \qquad t \in (0, \infty).$$
(a) Prove that
for all $N \in \mathbf{N}$.
(b) Prove that the integral $\int_0^\infty e^{-(s-a)t}\phi(t)\,dt$ converges uniformly on $[b, \infty)$ for any $b > a$ and
$$\int_0^\infty e^{-st} f(t)\,dt = (s - a)\int_0^\infty e^{-(s-a)t}\phi(t)\,dt, \qquad s > a.$$
(c) Prove that $\mathcal{L}\{f\}$ exists, is continuous on $(a, \infty)$, and satisfies
$$\lim_{s \to \infty} \mathcal{L}\{f\}(s) = 0.$$
(d) Let $g(t) = t f(t)$ for $t \in (0, \infty)$. Prove that $\mathcal{L}\{f\}$ is differentiable on $(a, \infty)$ and
$$\frac{d}{ds}\mathcal{L}\{f\}(s) = -\mathcal{L}\{g\}(s)$$
for all $s \in (a, \infty)$.
(e) If, in addition, $f'$ is continuous and bounded on $(0, \infty)$, prove that
$$\mathcal{L}\{f'\}(s) = s\,\mathcal{L}\{f\}(s) - f(0)$$
for all $s \in (a, \infty)$.
*11. Using Exercises 9 and 10, find the Laplace transforms of each of the following functions. (a) $te^t$.
(b) $t \sin \pi t$.
(c) ecost.
11.2 DEFINITION OF DIFFERENTIABILITY
In this section we define what it means for a vector function to be differentiable at a point. Whatever our definition, we expect two things: If $f$ is differentiable at $\mathbf{a}$, then (1) $f$ will be continuous at $\mathbf{a}$, and (2) all first-order partial derivatives of $f$ will exist at $\mathbf{a}$. Working by analogy with the one-variable case, we guess that $f$ is differentiable at $\mathbf{a}$ if and only if all its first-order partial derivatives exist at $\mathbf{a}$. The following example shows this guess is wrong.

11.11 Example. Prove that the first-order partial derivatives of
$$f(x,y) = \begin{cases} x + y & x = 0 \text{ or } y = 0 \\ 1 & \text{otherwise} \end{cases}$$
exist at $(0,0)$, but $f$ is not continuous at $(0,0)$.

PROOF. Since $\lim_{x \to 0} f(x,x) = 1 \neq 0 = f(0,0)$, it is clear that $f$ is not continuous at $(0,0)$. On the other hand, the first-order partial derivatives of $f$ exist since
$$f_x(0,0) = \lim_{h \to 0} \frac{f(h,0) - f(0,0)}{h} = 1$$
and similarly, $f_y(0,0) = 1$. ∎

Even if we restrict our attention to those functions $f$ that are continuous and have first-order partial derivatives, we still cannot be sure that $f$ is differentiable (see Exercise 7). How, then, shall we define differentiability in $\mathbf{R}^n$? When a mathematical analogy breaks down, it is often helpful to reformulate the problem in the original setting. For functions of one variable, we found that $f$ is differentiable at $a$ if and only if there is a linear function $T \in \mathcal{L}(\mathbf{R}; \mathbf{R})$ such that
$$\lim_{h \to 0} \frac{f(a+h) - f(a) - T(h)}{h} = 0$$
(see Theorem 4.3). Thus $f$ is differentiable at $a \in \mathbf{R}$ if and only if there is a $T \in \mathcal{L}(\mathbf{R}; \mathbf{R})$ such that the function $\varepsilon(h) := f(a+h) - f(a) - T(h)$ converges to zero so fast that $\varepsilon(h)/h \to 0$ as $h \to 0$. This leads us to the following definition.
11.12 DEFINITION. Let $f$ be a vector function from $n$ variables to $m$ variables.
(i) $f$ is said to be differentiable at a point $\mathbf{a} \in \mathbf{R}^n$ if and only if there is an open set $V$ containing $\mathbf{a}$ such that $f : V \to \mathbf{R}^m$ and there is a $T \in \mathcal{L}(\mathbf{R}^n; \mathbf{R}^m)$ such that the function
$$\varepsilon(\mathbf{h}) := f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - T(\mathbf{h})$$
(defined for $\mathbf{h}$ sufficiently small) satisfies $\varepsilon(\mathbf{h})/\|\mathbf{h}\| \to \mathbf{0}$ as $\mathbf{h} \to \mathbf{0}$.
(ii) $f$ is said to be differentiable on a set $E$ if and only if $E$ is nonempty, and $f$ is differentiable at every point in $E$.

Since every linear transformation in $\mathcal{L}(\mathbf{R}^n; \mathbf{R}^m)$ can be represented by an $m \times n$ matrix (see Theorem 8.15), a vector function $f$ is differentiable at a point $\mathbf{a}$ if and only if either of the following conditions holds: There exists an $m \times n$ matrix $B$ such that
$$\lim_{\mathbf{h} \to \mathbf{0}} \frac{f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - B\mathbf{h}}{\|\mathbf{h}\|} = \mathbf{0} \tag{3}$$
or such that
$$\lim_{\mathbf{h} \to \mathbf{0}} \frac{\|f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - B\mathbf{h}\|}{\|\mathbf{h}\|} = 0.$$
We shall use these three descriptions interchangeably. The following result shows that Definition 11.12 rules out pathology such as Example 11.11.
11.13 THEOREM. Let $f$ be a vector function. If $f$ is differentiable at $\mathbf{a}$, then $f$ is continuous at $\mathbf{a}$.

PROOF. Suppose that $f$ is differentiable at $\mathbf{a}$. Then by (3), there is an $m \times n$ matrix $B$ and a $\delta > 0$ such that $\|f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - B\mathbf{h}\| \le \|\mathbf{h}\|$ for all $\|\mathbf{h}\| < \delta$. By the triangle inequality (see Theorem 8.6iii) and the definition of the operator norm, it follows that
$$\|f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a})\| \le \|B\|\,\|\mathbf{h}\| + \|\mathbf{h}\|$$
for $\|\mathbf{h}\| < \delta$. Since $\|B\|$ is a finite real number, we conclude from the Squeeze Theorem that $f(\mathbf{a} + \mathbf{h}) \to f(\mathbf{a})$ as $\mathbf{h} \to \mathbf{0}$; i.e., $f$ is continuous at $\mathbf{a}$. ∎

By Exercise 7, the existence of first-order partial derivatives is not enough to conclude that a function is differentiable. The converse of this result, however, is true.

11.14 THEOREM. Let $f$ be a vector function. If $f$ is differentiable at $\mathbf{a}$, then all first-order partial derivatives of $f$ exist at $\mathbf{a}$.
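PROOF. Let $B = [b_{ij}]$ be an $m \times n$ matrix that satisfies (3). Fix $1 \le j \le n$ and set $\mathbf{h} = t\mathbf{e}_j$ for some $t > 0$. Since $\|\mathbf{h}\| = t$, we have
$$\frac{f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - B\mathbf{h}}{\|\mathbf{h}\|} = \frac{f(\mathbf{a} + t\mathbf{e}_j) - f(\mathbf{a})}{t} - B\mathbf{e}_j.$$
Take the limit of this identity as $t \to 0+$, using (3) and the definition of matrix multiplication. We obtain
$$\lim_{t \to 0+} \frac{f(\mathbf{a} + t\mathbf{e}_j) - f(\mathbf{a})}{t} = B\mathbf{e}_j = (b_{1j}, \ldots, b_{mj}).$$
A similar argument shows that the limit of this quotient as $t \to 0-$ also exists and equals $(b_{1j}, \ldots, b_{mj})$. Since a vector function converges if and only if each of its components converges (see Theorem 9.15), it follows that the first-order partial derivative of each component $f_i$ with respect to $x_j$ exists at $\mathbf{a}$ and satisfies
$$\frac{\partial f_i}{\partial x_j}(\mathbf{a}) = b_{ij}$$
for $i = 1, 2, \ldots, m$. In particular,
$$B = \left[\frac{\partial f_i}{\partial x_j}(\mathbf{a})\right]_{m \times n} := \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1}(\mathbf{a}) & \cdots & \dfrac{\partial f_1}{\partial x_n}(\mathbf{a}) \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_m}{\partial x_1}(\mathbf{a}) & \cdots & \dfrac{\partial f_m}{\partial x_n}(\mathbf{a}) \end{bmatrix}. \tag{4}$$
∎

If all first-order partial derivatives of a vector function $f$ exist at a point $\mathbf{a}$, we shall use the notation
$$Df(\mathbf{a}) := \left[\frac{\partial f_i}{\partial x_j}(\mathbf{a})\right]_{m \times n}.$$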
We shall call this matrix the total derivative of f at a (as opposed to partial derivatives) when f is differentiable at a (in the sense of Definition 11.12). The proof of Theorem 11.14 tells us something very useful. If f is differentiable at a, then there is only one linear transformation T that satisfies Definition 11.12, and equivalently, only one matrix B that satisfies (3): the total derivative of f at a. We shall refer to this fact as the uniqueness of the total derivative. If n = 1 or m = 1, the total derivative D f is an m x 1 or 1 x n matrix, hence can be identified with a vector. Most applied mathematicians represent D f in these cases by different notations. For the case n = 1,
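$$Df(\mathbf{a}) = \begin{bmatrix} f_1'(\mathbf{a}) \\ \vdots \\ f_m'(\mathbf{a}) \end{bmatrix}$$
is sometimes denoted in vector notation by
$$f'(\mathbf{a}) := (f_1'(\mathbf{a}), \ldots, f_m'(\mathbf{a})).$$
For the case $m = 1$,
$$Df(\mathbf{a}) = \begin{bmatrix} \dfrac{\partial f}{\partial x_1}(\mathbf{a}) & \cdots & \dfrac{\partial f}{\partial x_n}(\mathbf{a}) \end{bmatrix}$$
is sometimes denoted in vector notation by
$$\nabla f(\mathbf{a}) := \left(\frac{\partial f}{\partial x_1}(\mathbf{a}), \ldots, \frac{\partial f}{\partial x_n}(\mathbf{a})\right).$$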
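($\nabla f$ is called the gradient of $f$ because it identifies the direction of steepest ascent. For this connection and a relationship between gradients and directional derivatives, see Exercise 7, p. 351.)

If we strengthen the conclusion of Theorem 11.14, we can obtain a reverse implication.

11.15 THEOREM. Let $V$ be open in $\mathbf{R}^n$, let $\mathbf{a} \in V$, and suppose that $f : V \to \mathbf{R}^m$. If all first-order partial derivatives of $f$ exist in $V$ and are continuous at $\mathbf{a}$, then $f$ is differentiable at $\mathbf{a}$.
NOTE: These hypotheses are met if $f$ is $C^1$ on $V$.

PROOF. Since a function converges if and only if its components converge (see Theorem 9.15), we may suppose that $m = 1$. By definition, then, it suffices to show that
$$\lim_{\mathbf{h} \to \mathbf{0}} \frac{f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - \nabla f(\mathbf{a}) \cdot \mathbf{h}}{\|\mathbf{h}\|} = 0.$$
Let $\mathbf{a} = (a_1, \ldots, a_n)$. Choose $r > 0$ so small that $B_r(\mathbf{a}) \subset V$. Fix $\mathbf{h} = (h_1, \ldots, h_n) \neq \mathbf{0}$ in $B_r(\mathbf{0})$. By telescoping and using the one-dimensional Mean Value Theorem, we can choose numbers $c_j$ between $a_j$ and $a_j + h_j$ such that
$$f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) = f(a_1 + h_1, \ldots, a_n + h_n) - f(a_1, a_2 + h_2, \ldots, a_n + h_n) + \cdots + f(a_1, \ldots, a_{n-1}, a_n + h_n) - f(a_1, \ldots, a_n)$$
$$= \sum_{j=1}^n h_j\,\frac{\partial f}{\partial x_j}(a_1, \ldots, a_{j-1}, c_j, a_{j+1} + h_{j+1}, \ldots, a_n + h_n).$$
Therefore,
$$f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - \nabla f(\mathbf{a}) \cdot \mathbf{h} = \mathbf{h} \cdot \boldsymbol{\delta}, \tag{5}$$
where $\boldsymbol{\delta} \in \mathbf{R}^n$ is the vector with components
$$\delta_j = \frac{\partial f}{\partial x_j}(a_1, \ldots, a_{j-1}, c_j, a_{j+1} + h_{j+1}, \ldots, a_n + h_n) - \frac{\partial f}{\partial x_j}(a_1, \ldots, a_n).$$
Since the first-order partial derivatives of $f$ are continuous at $\mathbf{a}$, $\delta_j \to 0$ for each $1 \le j \le n$; i.e., $\|\boldsymbol{\delta}\| \to 0$ as $\mathbf{h} \to \mathbf{0}$. Moreover, by the Cauchy-Schwarz Inequality and (5),
$$0 \le \frac{|f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - \nabla f(\mathbf{a}) \cdot \mathbf{h}|}{\|\mathbf{h}\|} = \frac{|\mathbf{h} \cdot \boldsymbol{\delta}|}{\|\mathbf{h}\|} \le \|\boldsymbol{\delta}\|. \tag{6}$$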
It follows from the Squeeze Theorem that the first quotient in (6) converges to 0 as $\mathbf{h} \to \mathbf{0}$. Thus $f$ is differentiable at $\mathbf{a}$ by definition. ∎

If all first-order partial derivatives of a function $f$ exist and are continuous at a point $\mathbf{a}$ (respectively, on an open set $V$), we shall call $f$ continuously differentiable at $\mathbf{a}$ (respectively, on $V$). By Theorem 11.15, every continuously differentiable function is differentiable. In particular, every function that is $C^p$ on an open set $V$, for some $1 \le p \le \infty$, is continuously differentiable on $V$.

These results suggest the following procedure to determine whether a function $f$ is differentiable at a point $\mathbf{a}$.
(1) Compute all first-order partial derivatives of $f$ at $\mathbf{a}$. If one of these does not exist, then $f$ is not differentiable at $\mathbf{a}$ (Theorem 11.14).
(2) If all first-order partial derivatives exist and are continuous at $\mathbf{a}$, then $f$ is differentiable at $\mathbf{a}$ (Theorem 11.15).
(3) If the first-order partial derivatives of $f$ exist but one of them fails to be continuous at $\mathbf{a}$, then use the definition of differentiability directly. By the uniqueness of the total derivative, this will involve trying to verify (3) for $B = Df(\mathbf{a})$. A decision about whether this limit exists and equals zero will involve methods outlined in Section 9.2.

We close with some examples.
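Before working by hand, it can help to watch the quotient in (3) numerically. The sketch below uses the map $f(x,y) = (\cos(xy), \ln x - e^y)$ at $\mathbf{a} = (1,1)$, with the matrix of first-order partials computed by hand; the direction $\mathbf{h} = (t,t)$, the values of $t$, and the helper names are illustrative assumptions. A shrinking ratio is evidence for differentiability, not a proof, since only one direction of approach is sampled.

```python
import math

def f(x, y):
    # Vector function from R^2 to R^2, defined for x > 0.
    return (math.cos(x * y), math.log(x) - math.exp(y))

def total_derivative(x, y):
    # Matrix of first-order partial derivatives, computed by hand from f.
    return ((-y * math.sin(x * y), -x * math.sin(x * y)),
            (1.0 / x, -math.exp(y)))

def remainder_ratio(a, h):
    # ||f(a + h) - f(a) - B h|| / ||h|| with B = Df(a), as in condition (3).
    fa = f(a[0], a[1])
    fah = f(a[0] + h[0], a[1] + h[1])
    B = total_derivative(a[0], a[1])
    eps = [fah[i] - fa[i] - (B[i][0] * h[0] + B[i][1] * h[1]) for i in range(2)]
    return math.hypot(eps[0], eps[1]) / math.hypot(h[0], h[1])

a = (1.0, 1.0)
ratios = [remainder_ratio(a, (t, t)) for t in (1e-1, 1e-2, 1e-3, 1e-4)]
print(ratios)
```

For a function that is $C^2$ near $\mathbf{a}$, one expects the ratio to shrink roughly in proportion to $\|\mathbf{h}\|$.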
11.16 Example. Is $f(x,y) = (\cos(xy), \ln x - e^y)$ differentiable at $(1,1)$?

SOLUTION. Since $f_x = (-y\sin(xy), 1/x)$ and $f_y = (-x\sin(xy), -e^y)$ both exist and are continuous at any $(x,y) \in \mathbf{R}^2$ with $x > 0$, $f$ is differentiable at any such $(x,y)$, in particular, at $(1,1)$. ∎

11.17 Example. Is
(x, y) =I- (0,0) (x, y) = (0,0) differentiable at (O,O)? SOLUTION. Again we begin by looking at the first-order partial derivatives of f. By the one-dimensional Quotient Rule,
al 2x 2 y ax (x, y) = (x2 + y2)2 But this makes sense only when (x, y) =I- (0,0). Hence we cannot rely on the rules of differentiation alone to compute partial derivatives. To see whether the partial derivatives exist at (0,0) we must return to the definition:
$$\frac{\partial f}{\partial x}(0,0) = \lim_{h \to 0} \frac{f(h,0) - f(0,0)}{h} = \lim_{h \to 0} \frac{0}{h} = 0.$$
Thus $f_x(0,0) = 0$ DOES exist even though the formula approach above crashed.
Similarly, by definition,
of (0, 0) = lim f(O, k) - f(O, 0) = lim ~. oy k-.O k k-.O k Since this last limit does not exist, fy(O,O) does not exist. Hence f cannot be differentiable at (0,0). I Our final example shows that the converse of Theorem 11.15 is false.
11.18 Example. Prove that
$$f(x,y) = \begin{cases} (x^2 + y^2)\sin\dfrac{1}{\sqrt{x^2 + y^2}} & (x,y) \neq (0,0) \\ 0 & (x,y) = (0,0) \end{cases}$$
is differentiable on $\mathbf{R}^2$ but not continuously differentiable at $(0,0)$.

PROOF. If $(x,y) \neq (0,0)$, then we can use the one-dimensional Product Rule to verify that
$$f_x(x,y) = \frac{-x}{\sqrt{x^2 + y^2}}\cos\frac{1}{\sqrt{x^2 + y^2}} + 2x\sin\frac{1}{\sqrt{x^2 + y^2}}.$$
Thus $f$ is differentiable on $\mathbf{R}^2 \setminus \{(0,0)\}$. Since $f_x(x,0)$ has no limit as $x \to 0$, the partial derivative $f_x$ is not continuous at $(0,0)$. A similar statement holds for $f_y$. Thus to check differentiability at $(0,0)$ we must return to the definition. First, we compute the partial derivatives at $(0,0)$. By definition,
$$f_x(0,0) = \lim_{t \to 0} \frac{f(t,0) - f(0,0)}{t} = \lim_{t \to 0} t\sin\frac{1}{|t|} = 0,$$
and similarly, $f_y(0,0) = 0$. Thus, both first partials exist at $(0,0)$. To prove that $f$ is differentiable at $(0,0)$, we must verify (3) for $\mathbf{a} = (0,0)$ and $B = \nabla f(\mathbf{a})$. Since $\nabla f(0,0) = (0,0)$,
$$\frac{f(h,k) - f(0,0) - \nabla f(0,0) \cdot (h,k)}{\|(h,k)\|} = \sqrt{h^2 + k^2}\,\sin\frac{1}{\sqrt{h^2 + k^2}} \to 0$$
as $(h,k) \to (0,0)$; i.e., $f$ is differentiable at $(0,0)$. ∎
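Both halves of this example can be seen numerically. The sketch below assumes the function is $f(x,y) = (x^2+y^2)\sin(1/\sqrt{x^2+y^2})$ with $f(0,0) = 0$, which matches the formula for $f_x$ and the final quotient in the proof; the sample points are illustrative choices. The remainder quotient at the origin shrinks with $\|(h,k)\|$, while $f_x(x,0)$ keeps returning to values near $-1$ and $+1$ as $x \to 0+$.

```python
import math

def f(x, y):
    # Assumed form of the function in Example 11.18.
    r = math.hypot(x, y)
    return 0.0 if r == 0.0 else r * r * math.sin(1.0 / r)

def fx(x, y):
    # Product Rule formula away from the origin; f_x(0, 0) = 0 by definition.
    r = math.hypot(x, y)
    if r == 0.0:
        return 0.0
    return -x / r * math.cos(1.0 / r) + 2.0 * x * math.sin(1.0 / r)

def remainder_ratio(h, k):
    # (f(h, k) - f(0, 0) - grad f(0, 0) . (h, k)) / ||(h, k)||, with grad f(0, 0) = (0, 0).
    return f(h, k) / math.hypot(h, k)

steps = (1e-2, 1e-4, 1e-6)
ratios = [remainder_ratio(t, t) for t in steps]
# Along the x-axis, cos(1/x) equals 1 at x = 1/(2000 pi) and -1 at x = 1/(2001 pi).
fx_low = fx(1.0 / (2000.0 * math.pi), 0.0)
fx_high = fx(1.0 / (2001.0 * math.pi), 0.0)
print(ratios, fx_low, fx_high)
```

The quotient is bounded by $\|(h,k)\|$ in absolute value, which is exactly the squeeze used in the proof.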
EXERCISES
1. For each of the following functions, prove that $f$ is differentiable on its domain and compute $Df$.
(a) $f(x,y) = (\sin x, xy, \cos y)$.
(b) $f(s,t,u,v) = (st + u^2, uv - s^2)$.
(c) $f(t) = (\log t, 1/(1+t))$.
(d) $f(r,\theta) = (r\cos\theta, r\sin\theta)$.
2. Prove that $f(x,y) = \sqrt{|xy|}$ is not differentiable at $(0,0)$.
3. Prove that the following function is not differentiable at $(0,0)$.
°< II(x,y)11 <
7r
(x,y) = (0,0)
4. Let $r > 0$, $f : B_r(\mathbf{0}) \to \mathbf{R}$, and suppose that there exists an $\alpha > 1$ such that $|f(\mathbf{x})| \le \|\mathbf{x}\|^\alpha$ for all $\mathbf{x} \in B_r(\mathbf{0})$.
(a) Prove that $f$ is differentiable at $\mathbf{0}$.
(b) What happens to this result when $\alpha = 1$?
5. Prove that if $\alpha > 1/2$, then
$$f(x,y) = \begin{cases} |xy|^\alpha \log(x^2 + y^2) & (x,y) \neq (0,0) \\ 0 & (x,y) = (0,0) \end{cases}$$
is differentiable on $\mathbf{R}^2$.
6. Prove that
(x, y)
=f (0,0)
(x, y) = (0,0) is differentiable on R2 for all a < 3/2. 7. Prove that
(x, y)
=f (0,0)
(x, y) = (0,0) is continuous on $\mathbf{R}^2$, has first-order partial derivatives everywhere on $\mathbf{R}^2$, but $f$ is not differentiable at $(0,0)$.
8. This exercise is used several times in this chapter and the next. Let $T \in \mathcal{L}(\mathbf{R}^n; \mathbf{R}^m)$. Prove that $T$ is differentiable everywhere on $\mathbf{R}^n$ with
$$DT(\mathbf{a}) = T.$$
9. Let $V$ be open in $\mathbf{R}^n$, $\mathbf{a} \in V$, and $f : V \to \mathbf{R}^m$.

*11.19 DEFINITION. If $\mathbf{u}$ is a unit vector in $\mathbf{R}^n$, i.e., $\|\mathbf{u}\| = 1$, then the directional derivative of $f$ at $\mathbf{a}$ in the direction $\mathbf{u}$ is defined by
$$D_{\mathbf{u}} f(\mathbf{a}) := \lim_{t \to 0} \frac{f(\mathbf{a} + t\mathbf{u}) - f(\mathbf{a})}{t}$$
when this limit exists.
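(a) Prove that $D_{\mathbf{u}} f(\mathbf{a})$ exists for $\mathbf{u} = \mathbf{e}_k$ if and only if $f_{x_k}(\mathbf{a})$ exists, in which case
$$D_{\mathbf{e}_k} f(\mathbf{a}) = \frac{\partial f}{\partial x_k}(\mathbf{a}).$$
(b) Show that if $f$ has directional derivatives at $\mathbf{a}$ in all directions $\mathbf{u}$, then the first-order partial derivatives of $f$ exist at $\mathbf{a}$. Use Example 11.11 to show that the converse of this statement is false.
(c) Prove that the directional derivatives of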
(x, y)
-I-
(0,0)
(x, y) = (0,0) exist at (0,0) in all directions u, but at (0,0).
f
is neither continuous nor differentiable
10. Let $r > 0$, $(a,b) \in \mathbf{R}^2$, $f : B_r(a,b) \to \mathbf{R}$, and suppose that the first-order partial derivatives $f_x$ and $f_y$ exist in $B_r(a,b)$ and are differentiable at $(a,b)$.
(a) Set $\Delta(h) = f(a+h, b+h) - f(a+h, b) - f(a, b+h) + f(a,b)$ and prove for $h$ sufficiently small that
$$\frac{\Delta(h)}{h} = f_y(a+h, b+th) - f_y(a,b) - \nabla f_y(a,b) \cdot (h, th) - \big(f_y(a, b+th) - f_y(a,b) - \nabla f_y(a,b) \cdot (0, th)\big) + h f_{yx}(a,b)$$
for some $t \in (0,1)$.
(b) Prove that
$$\lim_{h \to 0} \frac{\Delta(h)}{h^2} = f_{yx}(a,b).$$
(c) Prove that
11.3 DERIVATIVES, DIFFERENTIALS, AND TANGENT PLANES In this section we begin to explore the analogy between D f and f'. First we examine how the total derivative interacts with the algebra of functions.
11.20 THEOREM. Let $\alpha \in \mathbf{R}$, $a \in \mathbf{R}^n$, and suppose that $f$ and $g$ are vector functions. If $f$ and $g$ are differentiable at $a$, then $f + g$, $\alpha f$, and $f \cdot g$ are all differentiable at $a$. In fact,
$$D(f + g)(a) = Df(a) + Dg(a), \tag{7}$$
$$D(\alpha f)(a) = \alpha Df(a), \tag{8}$$
and
$$D(f \cdot g)(a) = g(a)Df(a) + f(a)Dg(a). \tag{9}$$
(The sum that appears on the right side of (7) represents matrix addition, and the products that appear on the right side of (9) represent matrix multiplication.)
PROOF.
The proofs of these rules are similar. We provide the details only for (9). Let
$$T = g(a)Df(a) + f(a)Dg(a). \tag{10}$$
Since $g(a)$ and $f(a)$ are $1 \times m$ matrices, and $Df(a)$ and $Dg(a)$ are $m \times n$ matrices, $T$ is a $1 \times n$ matrix, the right size for the total derivative of $f \cdot g$. By the uniqueness of the total derivative, we need only show that
$$\lim_{h \to 0} \frac{(f \cdot g)(a+h) - (f \cdot g)(a) - T(h)}{\|h\|} = 0.$$
Since by (10),
$$\begin{aligned}
(f \cdot g)(a+h) - (f \cdot g)(a) - T(h) &= (f \cdot g)(a+h) - (f \cdot g)(a) - g(a)Df(a)(h) - f(a)Dg(a)(h) \\
&= \bigl(f(a+h) - f(a) - Df(a)(h)\bigr) \cdot g(a+h) + \bigl(Df(a)(h)\bigr) \cdot \bigl(g(a+h) - g(a)\bigr) \\
&\qquad + f(a) \cdot \bigl(g(a+h) - g(a) - Dg(a)(h)\bigr) \\
&=: T_1(h) + T_2(h) + T_3(h),
\end{aligned}$$
it suffices to verify $T_j(h)/\|h\| \to 0$ as $h \to 0$ for $j = 1, 2, 3$.
Set $\varepsilon(h) = f(a+h) - f(a) - Df(a)(h)$ and $\delta(h) = g(a+h) - g(a) - Dg(a)(h)$ for $h$ sufficiently small. Since $f$ and $g$ are differentiable at $a$, we know that $\varepsilon(h)/\|h\|$ and $\delta(h)/\|h\|$ both converge to zero as $h \to 0$. To estimate $T_1$, use the Cauchy-Schwarz Inequality and the definition of $\varepsilon$ to verify
$$|T_1(h)| \le \|f(a+h) - f(a) - Df(a)(h)\|\,\|g(a+h)\| = \|\varepsilon(h)\|\,\|g(a+h)\|.$$
Since $g$ is continuous at $a$ (Theorem 11.13) and $\varepsilon(h)/\|h\| \to 0$ as $h \to 0$, it follows that $|T_1(h)|/\|h\| \to 0$ as $h \to 0$. A similar argument shows that $|T_3(h)|/\|h\| \to 0$ as $h \to 0$. To estimate $T_2$, observe by the Cauchy-Schwarz Inequality and the definition of the operator norm (see Theorem 8.17) that
$$|T_2(h)| \le \|Df(a)(h)\|\,\|g(a+h) - g(a)\| \le \|Df(a)\|\,\|h\|\,\|g(a+h) - g(a)\|.$$
Thus $|T_2(h)|/\|h\| \le \|Df(a)\|\,\|g(a+h) - g(a)\| \to 0$ as $h \to 0$. We conclude that $f \cdot g$ is differentiable at $a$ and its total derivative is $T$. ∎
Formula (7) is called the Sum Rule; (8) is sometimes called the Homogeneous Rule; and (9) is called the Dot Product Rule. (We note that a quotient rule also holds for real-valued functions: see Exercise 4.)
Continuing to explore the analogy between $Df$ and $f'$, let $g$ be a real function and $f$ be a vector function from $n$ variables to one variable. We know that $g$ is differentiable at a point $a$ if and only if the curve $y = g(x)$ has a unique tangent line at $(a, g(a))$, in which case $g'(a)$ is the slope of that tangent line. What happens in the multidimensional case? Working by analogy, $f$ should be differentiable at a point $a$ if and only if the surface $z = f(x)$ has a unique tangent hyperplane at the point $(a, f(a)) := (a_1, \ldots, a_n, f(a_1, \ldots, a_n)) \in \mathbf{R}^{n+1}$. Moreover, it would be nice if the normal vector $n$ of that tangent hyperplane were somehow related to the total derivative $\nabla f(a)$. We shall show that both of these observations are correct, and that the relationship between $n$ and $\nabla f(a)$ is a simple one (see (12) and (13), and Exercise 8, p. 369). Thus, for the case $m = 1$, Definition 11.12 captures both the analytic and geometric spirit of the one-dimensional derivative. First we define what we mean by a tangent hyperplane.
11.21 DEFINITION. Let $S$ be a subset of $\mathbf{R}^m$ and $c \in S$. A hyperplane $\Pi$ with normal $n$ is said to be tangent to $S$ at $c$ if and only if $c \in \Pi$ and
$$\lim_{k \to \infty} n \cdot \frac{c_k - c}{\|c_k - c\|} = 0 \tag{11}$$
for all sequences $c_k \in S \setminus \{c\}$ that converge to $c$.
Notice by (3) in Section 8.1 that (11) is equivalent to assuming that the angle between $n$ and $c_k - c$ converges to $\pi/2$ for all sequences $c_k \in S \setminus \{c\}$ that converge to $c$. Hence the definition of a "tangent hyperplane" makes geometric sense. (See Figure 11.1 for the case when $n = 3$, $c = (a, b, f(a,b))$, and $S$ is the surface $z = f(x,y)$. There, $\theta_{h,k}$ represents the angle between $n$ and the vector from $c$ to $(a+h, b+k, f(a+h, b+k))$.) Also notice that if $\Pi$ is a tangent hyperplane to $S$ at $c$, then an equation of $\Pi$ is given by $n \cdot (x - c) = 0$.
It is easy to see that surfaces generated by differentiable vector functions have tangent hyperplanes.
11.22 THEOREM. Suppose that $V$ is open in $\mathbf{R}^n$, that $a \in V$, and that $f : V \to \mathbf{R}$. If $f$ is differentiable at $a$, then the surface
$$S := \{(x, z) \in \mathbf{R}^{n+1} : z = f(x) \text{ and } x \in V\}$$
has a tangent hyperplane at $(a, f(a))$ with normal
$$n = (\nabla f(a), -1). \tag{12}$$
Figure 11.1 (the surface $z = f(x,y)$ and its normal $n$)

PROOF. Let $c_k \in S$ with $c_k \ne (a, f(a))$ and $c_k \to (a, f(a))$. Then $c_k = (a_k, f(a_k))$ for some $a_k \in V$ and $a_k \to a$ as $k \to \infty$. For small $h$, set $\varepsilon(h) = f(a+h) - f(a) - \nabla f(a) \cdot h$, define $n$ by (12), and write $c := (a, f(a))$. Since $\|c_k - c\| \ge \|a_k - a\|$ and $n \cdot (c_k - c) = \nabla f(a) \cdot (a_k - a) - (f(a_k) - f(a)) = -\varepsilon(a_k - a)$, it is clear by (12) that
$$0 \le \frac{|n \cdot (c_k - c)|}{\|c_k - c\|} \le \frac{|\varepsilon(a_k - a)|}{\|a_k - a\|}.$$
Since $\varepsilon(h)/\|h\| \to 0$ as $h \to 0$, it follows from the Squeeze Theorem that $n$ satisfies (11) for $c := (a, f(a))$. ∎
For the case $n = 2$, this result contains the following observation. If $f$ is a real-valued function of two variables that is differentiable at a point $(a,b)$, then the surface $z = f(x,y)$ has a tangent plane at $(a, b, f(a,b))$ with normal
$$n = (f_x(a,b), f_y(a,b), -1) =: (\nabla f(a,b), -1). \tag{13}$$
Moreover, an equation of that tangent plane is given by
$$z = f(a,b) + \nabla f(a,b) \cdot (x - a, y - b). \tag{14}$$
Notice that this is completely analogous to the real case. Namely, if $g$ is differentiable at $a$, then the tangent line to $y = g(x)$ at the point $(a, g(a))$ is
$$y = g(a) + g'(a)(x - a). \tag{15}$$
It is interesting to note that the converse of Theorem 11.22 is also true (see Theorem 11.27).
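As a numerical illustration of (13) and (14), the following Python sketch builds the tangent plane of a sample surface and evaluates the quotient from Definition 11.21 along shrinking increments. The surface $z = x^2 + y^2$, the base point, and all function names are assumptions chosen for illustration; they do not come from the text.

```python
import math

def f(x, y):
    # Illustrative surface (an assumption, not taken from the text): z = x^2 + y^2.
    return x * x + y * y

def grad_f(x, y):
    # Hand-computed gradient (f_x, f_y) = (2x, 2y).
    return (2.0 * x, 2.0 * y)

def tangent_plane(a, b):
    """Return T(x, y) = f(a,b) + grad f(a,b) . (x - a, y - b), as in (14)."""
    fa = f(a, b)
    fx, fy = grad_f(a, b)
    return lambda x, y: fa + fx * (x - a) + fy * (y - b)

def tangency_ratio(a, b, h, k):
    """|n . (c_hk - c)| / ||c_hk - c|| with n = (f_x, f_y, -1) as in (13)."""
    fx, fy = grad_f(a, b)
    n = (fx, fy, -1.0)
    d = (h, k, f(a + h, b + k) - f(a, b))
    dot = sum(ni * di for ni, di in zip(n, d))
    return abs(dot) / math.sqrt(sum(di * di for di in d))

if __name__ == "__main__":
    T = tangent_plane(1.0, 2.0)
    print("plane value at the base point:", T(1.0, 2.0))
    for s in (1e-1, 1e-2, 1e-3):
        # The ratio should shrink with s if the plane is tangent.
        print(s, tangency_ratio(1.0, 2.0, s, -s))
```

The printed ratios shrink roughly in proportion to the step size, which is what tangency in the sense of Definition 11.21 predicts for this smooth surface.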
Figure 11.2 (the curve $y = f(x)$ with its tangent at $(a, f(a))$; the points $a$ and $a + \Delta x$ are marked on the $x$ axis)
There is another analogy between $Df$ and $f'$ worth mentioning. Recall that if $f$ is a real function, then the change in $y = f(x)$ as $x$ moves from $a$ to $a + \Delta x$ is defined by $\Delta y = f(a + \Delta x) - f(a)$. For many concrete situations, it is convenient and useful to approximate $\Delta y$ by the Leibnizian differential $dy := f'(a)\,dx$, where $dx = \Delta x$ is a small real number (see Figure 11.2). Does a similar situation prevail for functions on $\mathbf{R}^n$?
To answer this question, suppose that $z = f(x)$ is a vector function from $n$ variables to one variable, differentiable at $a$, that $\Delta z := f(a + \Delta\mathbf{x}) - f(a)$, where $\Delta\mathbf{x} := (\Delta x_1, \ldots, \Delta x_n)$, and that $d\mathbf{x} = \Delta\mathbf{x}$ is a vector with small norm. Comparing (14) and (15), we define the first total differential of a vector function from $n$ variables to one variable to be
$$dz := \nabla f(a) \cdot \Delta\mathbf{x} := \sum_{j=1}^{n} \frac{\partial f}{\partial x_j}(a)\,dx_j.$$
Is $dz$ a good approximation to $\Delta z$?
11.23 Remark. Let $f : \mathbf{R}^n \to \mathbf{R}$ be differentiable at $a$ and $\Delta\mathbf{x} = (\Delta x_1, \ldots, \Delta x_n)$. Then
$$\frac{\Delta z - dz}{\|\Delta\mathbf{x}\|} \to 0 \quad \text{as } \Delta\mathbf{x} \to 0.$$
In particular, the differential $dz$ approximates $\Delta z$.
PROOF. By definition, if $f$ is differentiable at $a$, then $\varepsilon(h) := f(a+h) - f(a) - \nabla f(a) \cdot h$ satisfies $\varepsilon(h)/\|h\| \to 0$ as $h \to 0$. Since $\Delta z = f(a+h) - f(a)$ for $h := \Delta\mathbf{x}$ and $dz = \nabla f(a) \cdot h$, it follows that $(\Delta z - dz)/\|\Delta\mathbf{x}\| \to 0$ as $\Delta\mathbf{x} \to 0$. ∎
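Remark 11.23 can be observed numerically. The sketch below is only an illustration under stated assumptions: the function $f(x,y,z) = xyz$, the base point $(1,2,3)$, the increment direction, and the helper name `differential_gap` are all hypothetical choices, not taken from the text.

```python
import math

def f(x, y, z):
    # Illustrative function (an assumption, not from the text).
    return x * y * z

def grad_f(x, y, z):
    # Hand-computed gradient of x*y*z.
    return (y * z, x * z, x * y)

def differential_gap(a, delta):
    """Return (Delta z - dz) / ||delta|| for the increment vector delta at the point a."""
    shifted = tuple(ai + di for ai, di in zip(a, delta))
    delta_z = f(*shifted) - f(*a)                             # actual change
    dz = sum(gi * di for gi, di in zip(grad_f(*a), delta))    # first total differential
    norm = math.sqrt(sum(di * di for di in delta))
    return (delta_z - dz) / norm

if __name__ == "__main__":
    a = (1.0, 2.0, 3.0)
    for s in (1e-1, 1e-2, 1e-3, 1e-4):
        delta = (s, -s, 2.0 * s)
        print(s, differential_gap(a, delta))
```

For this polynomial the gap is of second order in the increment, so the printed quotient decays linearly as the increment shrinks.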
But Figure 11.2 contains very useful geometric information. We cannot visualize $z = f(x)$ in arbitrary dimensions, but we can when $n = 2$. If $z = f(x,y)$, do the total differential $dz$ and the increment $\Delta z$ play an analogous geometric role in $\mathbf{R}^3$ that $dy$ and $\Delta y$ played in $\mathbf{R}^2$? The picture corresponding to Figure 11.2 involves a tangent plane and a wedge-shaped region (see Figure 11.3).

Figure 11.3 (the wedge-shaped region over the base points $(a,b)$, $(a + dx, b)$, and $(a + dx, b + dy)$, with vertical edge $A$)

Namely, let $z_0 = f(a,b)$ and consider the wedge-shaped region $W$ with vertical sides parallel to the $xz$ and $yz$ planes whose base has vertices $c_0 := (a, b, z_0)$, $c_1 := (a + \Delta x, b, z_0)$, $c_2 := (a, b + \Delta y, z_0)$, $c_3 := (a + \Delta x, b + \Delta y, z_0)$, and whose top is tangent to $z = f(x,y)$ at $c_0$. Let $A$ represent the length of the vertical edge of $W$ based at $c_1$, $B$ the length of the edge based at $c_2$, and $C$ the length of the edge based at $c_3$. If $dz$ is to play the same role in Figure 11.3 that $dy$ plays in Figure 11.2, then it must be the case that $C = dz$. This is actually easy to verify. Since the diagonals of rectangles bisect one another, the line segment from the intersection of the diagonals in the base of $W$ to the intersection of the diagonals in the top of $W$ must be parallel to the $z$ axis. Thus the length $D$ of this line segment can be computed in two ways. On the one hand, $D = C/2$. On the other hand, $D = (A + B)/2$. Therefore, $C = A + B$. But from one-dimensional calculus, $A = f_x(a,b)\,dx$ and $B = f_y(a,b)\,dy$. Consequently,
$$C = A + B = \frac{\partial f}{\partial x}(a,b)\,dx + \frac{\partial f}{\partial y}(a,b)\,dy = dz.$$
We conclude that the first total differential of vector functions plays exactly the same role that it did for real functions. We close this section with some optional material about tangent planes and applications of the first total differential.
First, we discuss applications of the first total differential. By Remark 11.23, if $f$ is differentiable at $a$, then the differential of $f$ can be used to approximate the change of $f$ as $x$ moves from $a$ to $a + h$ for $h$ sufficiently small. Here is a practical example.
*11.24 Example. Use differentials to approximate the change of $f(x,y) = x^2 y - y^3$ as $(x,y)$ moves from $(0,1)$ to $(0.02, 1.01)$.
SOLUTION. Let $z = x^2 y - y^3$, $a = 0$, and $b = 1$. Then $dx = 0.02$ and $dy = 0.01$. Since $dz = 2xy\,dx + (x^2 - 3y^2)\,dy$, we have
$$\Delta z \approx 0(0.02) - 3(0.01) = -0.03.$$
Note that $\Delta z = f(0.02, 1.01) - f(0, 1) = -0.029897\ldots$ is very close to $-0.03$. ∎
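The arithmetic in Example 11.24 can be checked with a few lines of Python. This is a sketch only; the function names `f` and `dz` are illustrative labels for the quantities in the example.

```python
def f(x, y):
    # The function from Example 11.24.
    return x ** 2 * y - y ** 3

def dz(a, b, dx, dy):
    # First total differential at (a, b): 2ab dx + (a^2 - 3b^2) dy.
    return 2.0 * a * b * dx + (a * a - 3.0 * b * b) * dy

approx = dz(0.0, 1.0, 0.02, 0.01)        # differential approximation
exact = f(0.02, 1.01) - f(0.0, 1.0)      # actual increment

if __name__ == "__main__":
    print("dz      =", approx)
    print("Delta z =", exact)
    print("gap     =", abs(exact - approx))
```

The gap between the two values is of the order $10^{-4}$, consistent with the second-order size of the increment.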
*11.25 Example. Use differentials to approximate $(5.97)\sqrt[4]{16.03}$.
SOLUTION. Let $z = y\sqrt[4]{x}$, $a = 16$, and $b = 6$. Then $dx = 0.03$ and $dy = -0.03$. Since
$$dz = \frac{y}{4\sqrt[4]{x^3}}\,dx + \sqrt[4]{x}\,dy,$$
we have
$$\Delta z \approx \frac{6(0.03)}{4\sqrt[4]{(16)^3}} + \sqrt[4]{16}\,(-0.03) = -0.054375.$$
Thus,
$$z \approx 6\sqrt[4]{16} - 0.054375 = 11.945625.$$
Note that the actual value of $5.97\sqrt[4]{16.03}$ is $11.945593\ldots$. Thus our approximation is good to three decimal places. ∎
*11.26 Example. Find the maximum percentage error for the calculated value of the volume of a right circular cylinder if the radius can be measured with a maximum error of 3% and the altitude can be measured with a maximum error of 2%.
SOLUTION. The volume of a right circular cylinder is $V = \pi r^2 h$, where $r$ is the radius and $h$ is the altitude. Hence, the differential of $V$ is $dV = 2\pi r h\,dr + \pi r^2\,dh$. Thus
$$\frac{dV}{V} = \frac{2\,dr}{r} + \frac{dh}{h}.$$
Since the percentage error of a variable $x$ is $\Delta x / x \approx dx/x$, it follows that the maximum percentage error in calculating the volume $V$ is approximately 8%:
$$\frac{dV}{V} = 2(\pm 0.03) + (\pm 0.02) = \pm 0.08. ∎$$
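Both computations above are easy to reproduce. The following Python sketch recomputes the estimate of Example 11.25 and compares the 8% bound of Example 11.26 with the exact worst case; the variable names are illustrative, and the worst-case comparison is an added check rather than part of the text.

```python
# Example 11.25: z = y * x^(1/4) near (x, y) = (16, 6).
a, b = 16.0, 6.0
dx, dy = 0.03, -0.03
dz = b / (4.0 * a ** 0.75) * dx + a ** 0.25 * dy   # first total differential
approx = b * a ** 0.25 + dz                        # 12 + dz
exact = 5.97 * 16.03 ** 0.25

# Example 11.26: V = pi r^2 h, relative errors 3% in r and 2% in h.
rel_r, rel_h = 0.03, 0.02
differential_bound = 2.0 * rel_r + rel_h               # dV/V = 2 dr/r + dh/h
worst_case = (1.0 + rel_r) ** 2 * (1.0 + rel_h) - 1.0  # exact relative change of V

if __name__ == "__main__":
    print("11.25: dz =", dz, " approx =", approx, " exact =", exact)
    print("11.26: differential bound =", differential_bound, " exact worst case =", worst_case)
```

The exact worst case in the cylinder example comes out slightly above 8%, which shows that the differential gives a first-order estimate rather than a strict bound.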
Finally, we show that the converse of Theorem 11.22 holds. (The proof presented here is based on Taylor [13].1) 1 Angus E. Taylor, Advanced Calculus (Boston: Ginn and Company, 1955). Reprinted with permission of John Wiley & Sons, Inc.
*11.27 THEOREM. Let $V$ be open in $\mathbf{R}^2$, let $(a,b) \in V$, and let $f : V \to \mathbf{R}$. Then $f$ is differentiable at $(a,b)$ if and only if $z = f(x,y)$ has a nonvertical tangent plane $\Pi$ at $c := (a, b, f(a,b))$, in which case $\Pi = \Pi_n(c)$ and
$$n = (-f_x(a,b), -f_y(a,b), 1). \tag{16}$$
PROOF. If $f$ is differentiable at $(a,b)$, then by Theorem 11.22, $z = f(x,y)$ has a nonvertical tangent plane with normal given by (16).
Conversely, suppose that the surface $S := \{(x,y,z) : z = f(x,y) \text{ for } (x,y) \in V\}$ has a nonvertical tangent plane $\Pi$ at $c$. Then the third component, say $\gamma$, of any normal of $\Pi$ is nonzero. Fix such a normal and multiply it by $1/\gamma$. Thus we may suppose that $\Pi$ has a normal of the form $n = (n_1, n_2, 1)$. Let $c_{h,k} := (a+h, b+k, f(a+h, b+k))$ be a point on $S$ near but not equal to $c$ and notice that $n \cdot (c_{h,k} - c) = \delta + \Delta z$, where $\Delta z := f(a+h, b+k) - f(a,b)$ and $\delta := n_1 h + n_2 k$. Hence by Definition 11.21,
$$\varepsilon := \frac{\delta + \Delta z}{\sqrt{h^2 + k^2 + (\Delta z)^2}} \to 0 \tag{17}$$
as $(h,k) \to (0,0)$. Use the quadratic formula to solve (17) for $\Delta z$, obtaining
$$\Delta z = \frac{-\delta \pm |\varepsilon|\sqrt{\delta^2 + (1 - \varepsilon^2)(h^2 + k^2)}}{1 - \varepsilon^2}. \tag{18}$$
Notice that
$$|\delta| = |(n_1, n_2) \cdot (h,k)| \le \|n\|\,\|(h,k)\|.$$
Hence, it follows from (18) that
$$\frac{|\Delta z + \delta|}{\|(h,k)\|} \le |G(\varepsilon)|, \tag{19}$$
where $G(\varepsilon) := \bigl(\varepsilon^2\|n\| + |\varepsilon|\sqrt{\|n\|^2 + (1 - \varepsilon^2)}\bigr)/|\varepsilon^2 - 1|$. Since $\varepsilon \to 0$ as $(h,k) \to (0,0)$, we conclude by (19) and the Squeeze Theorem that
$$0 \le \frac{|f(a+h, b+k) - f(a,b) + (n_1, n_2) \cdot (h,k)|}{\|(h,k)\|} \le |G(\varepsilon)| \to 0$$
as $(h,k) \to (0,0)$. Therefore, $f$ is differentiable at $(a,b)$ and $\nabla f(a,b) = (-n_1, -n_2)$. ∎

EXERCISES
1. For each of the following, find $D(f+g)(a)$ and $D(3f - 2g)(a)$.
(a) $f(t) = t^2 + 1$, $g(t) = \log t - \dfrac{1}{t}$, $a = 1$.
(b) $f(x,y) = x - y$,
(c) $f(x,y) = xy$, $g(x,y) = x\sin x - \cos y$, $a = (\pi, \pi)$.
(d) $f(x,y,z) = (x - z, x + z)$, $a = (1,1,1)$.
(e) $f(x,y) = (x, y, \pi^2)$, $g(x,y) = (y, x, xy)$, $a = (1,-1)$.
2. Let $V$ be open in $\mathbf{R}^n$, let $a \in V$, let $f, g : V \to \mathbf{R}^3$, and suppose that $f$ and $g$ are differentiable at $a$.
(a) [CROSS-PRODUCT RULE] For the case $n = 1$, prove that $f \times g$ is differentiable at $a$ and
$$(f \times g)'(a) = f(a) \times g'(a) + f'(a) \times g(a).$$
(b) What happens to part (a) when $n > 1$?
(c) Suppose that $f(a) = (2,1,2)$, $g(a) = (1,2,1)$,
Df(a) =
[~ ~ ~], 1 1
and
Dg(a) =
[-~
1
1
_°1
°
~]. -1
Find $D(f \cdot g)(a)(1,1,1)$ and $D(f \times g)(a)(1,1,1)$.
3. Prove (7) and (8) in Theorem 11.20.
4. [QUOTIENT RULE] Let $f : \mathbf{R}^n \to \mathbf{R}$ be differentiable at $a$ with $f(a) \ne 0$.
(a) Show that for $\|h\|$ sufficiently small, $f(a+h) \ne 0$.
(b) Prove that $Df(a)(h)/\|h\|$ is bounded for all $h \in \mathbf{R}^n \setminus \{0\}$.
(c) If $T := -Df(a)/f^2(a)$, show that
$$\frac{1}{f(a+h)} - \frac{1}{f(a)} - T(h) = \frac{f(a) - f(a+h) + Df(a)(h)}{f(a)f(a+h)} + \frac{(f(a+h) - f(a))\,Df(a)(h)}{f^2(a)f(a+h)}$$
for $\|h\|$ sufficiently small.
(d) Prove that $1/f(x)$ is differentiable at $x = a$ and
$$D\left(\frac{1}{f}\right)(a) = -\frac{Df(a)}{f^2(a)}.$$
(e) Prove that if $f$ and $g$ are real-valued vector functions that are differentiable at some $a$, and if $g(a) \ne 0$, then
$$D\left(\frac{f}{g}\right)(a) = \frac{g(a)Df(a) - f(a)Dg(a)}{g^2(a)}.$$
5. For each of the following functions, find an equation of the tangent plane to $z = f(x,y)$ at $c$. (a) $f(x,y) = x^3 \sin y$, $c = (0,0,0)$. (b) $f(x,y) = x^3 y - x y^3$, $c = (1,1,0)$.
6. Find all points on the paraboloid $z = x^2 + y^2$ (see Appendix D) where the tangent plane is parallel to the plane $x + y + z = 1$. Find equations of the corresponding tangent planes. Sketch the graphs of these functions to see that your answer agrees with your intuition.
7. Let $\mathcal{H}$ be the hyperboloid of one sheet, given by $x^2 + y^2 - z^2 = 1$.
(a) Prove that at every point $(a,b,c) \in \mathcal{H}$, $\mathcal{H}$ has a tangent plane whose normal is given by $(-a, -b, c)$.
(b) Find an equation of each plane tangent to $\mathcal{H}$ that is perpendicular to the $xy$ plane.
(c) Find an equation of each plane tangent to $\mathcal{H}$ that is parallel to the plane $x + y - z = 1$.
*8. Compute the differential of each of the following functions.
(b) $z = \sin(xy)$.
(c) $z = \dfrac{xy}{1 + x^2 + y^2}$.
*9. Let $w = x^2 y + z$. Use differentials to approximate $\Delta w$ as $(x,y,z)$ moves from $(1,2,1)$ to $(1.01, 1.98, 1.03)$. Compare your approximation with the actual value of $\Delta w$.
*10. The time $T$ it takes for a pendulum to complete one full swing is given by
$$T = 2\pi\sqrt{\frac{L}{g}},$$
where $g$ is the acceleration due to gravity and $L$ is the length of the pendulum. If $g$ can be measured with a maximum error of 1%, how accurately must $L$ be measured (in terms of percentage error) so that the calculated value of $T$ has a maximum error of 2%?
*11. Suppose that
$$\frac{1}{w} = \frac{1}{x} + \frac{1}{y} + \frac{1}{z},$$
where each variable $x, y, z$ can be measured with a maximum error of $p\%$. Prove that the calculated value of $w$ also has a maximum error of $p\%$.
11.4 CHAIN RULE
Here is the Chain Rule for vector functions.
11.28 THEOREM [CHAIN RULE]. Suppose that $a \in \mathbf{R}^n$, that $g$ is a vector function from $n$ variables to $m$ variables, and that $f$ is a vector function from $m$ variables to $p$ variables. If $g$ is differentiable at $a$ and $f$ is differentiable at $g(a)$, then $f \circ g$ is differentiable at $a$ and
$$D(f \circ g)(a) = Df(g(a))\,Dg(a). \tag{20}$$
(The product $Df(g(a))\,Dg(a)$ is matrix multiplication.)
PROOF. Set $T = Df(g(a))\,Dg(a)$ and observe that $T$, the product of a $p \times m$ matrix with an $m \times n$ matrix, is a $p \times n$ matrix, the right size for the total derivative of $f \circ g$. By the uniqueness of the total derivative, we must show that
$$\lim_{h \to 0} \frac{f(g(a+h)) - f(g(a)) - T(h)}{\|h\|} = 0. \tag{21}$$
Let $b = g(a)$. Set
$$\varepsilon(h) = g(a+h) - g(a) - Dg(a)(h) \tag{22}$$
and
$$\delta(k) = f(b+k) - f(b) - Df(b)(k) \tag{23}$$
for $h$ and $k$ sufficiently small. By hypothesis, $\varepsilon(h)/\|h\| \to 0$ in $\mathbf{R}^m$ as $h \to 0$ in $\mathbf{R}^n$, and $\delta(k)/\|k\| \to 0$ in $\mathbf{R}^p$ as $k \to 0$ in $\mathbf{R}^m$.
Fix $h$ small and set $k = g(a+h) - g(a)$. Since (23) and (22) imply
$$\begin{aligned}
f(g(a+h)) - f(g(a)) &= f(b+k) - f(b) = Df(b)(k) + \delta(k) \\
&= Df(b)\bigl(Dg(a)(h) + \varepsilon(h)\bigr) + \delta(k) \\
&= T(h) + Df(b)(\varepsilon(h)) + \delta(k),
\end{aligned}$$
we have
$$f(g(a+h)) - f(g(a)) - T(h) = Df(b)(\varepsilon(h)) + \delta(k) =: T_1(h) + T_2(h).$$
It remains to verify that $T_j(h)/\|h\| \to 0$ as $h \to 0$ for $j = 1, 2$. Since $\varepsilon(h)/\|h\| \to 0$ as $h \to 0$ and $Df(b)(h)$ is matrix multiplication, it is clear that $T_1(h)/\|h\| \to Df(b)(0) = 0$ as $h \to 0$. On the other hand, by (22), the triangle inequality, and the definition of the operator norm, we have
$$\|k\| := \|g(a+h) - g(a)\| = \|Dg(a)(h) + \varepsilon(h)\| \le \|Dg(a)\| \cdot \|h\| + \|\varepsilon(h)\|.$$
Thus $\|k\|/\|h\|$ is bounded for $h$ sufficiently small. Since $k \to 0$ in $\mathbf{R}^m$ as $h \to 0$ in $\mathbf{R}^n$, it follows that
$$\frac{\|T_2(h)\|}{\|h\|} = \frac{\|k\|}{\|h\|} \cdot \frac{\|\delta(k)\|}{\|k\|} \to 0$$
as $h \to 0$. We conclude that $f \circ g$ is differentiable at $a$ and the derivative is $Df(g(a))\,Dg(a)$. ∎
The Chain Rule can be used to compute individual partial derivatives without writing out the entire matrices $Df$ and $Dg$. For example, suppose that $f(u_1, \ldots, u_m)$ is differentiable from $\mathbf{R}^m$ to $\mathbf{R}$, $g(x_1, \ldots, x_n)$ is differentiable from $\mathbf{R}^n$ to $\mathbf{R}^m$, and $z = f(g(x_1, \ldots, x_n))$. Since $Df = \nabla f$ and the $j$th column of $Dg$ consists of first-order partial derivatives, with respect to $x_j$, of the components $u_k := g_k(x_1, \ldots, x_n)$, it follows from the Chain Rule and the definition of matrix multiplication that
$$\frac{\partial z}{\partial x_j} = \sum_{k=1}^{m} \frac{\partial f}{\partial u_k}\,\frac{\partial u_k}{\partial x_j}$$
for $j = 1, 2, \ldots, n$. Here are two concrete examples that illustrate this principle.
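Formula (20) can also be sanity-checked numerically before working the examples by hand. The Python sketch below multiplies the matrices $Df(g(a))$ and $Dg(a)$ and compares the product with central difference quotients of $f \circ g$. The particular maps (polar coordinates for $g$ and $f(x,y) = x^2 y$) and all helper names are assumptions made for illustration.

```python
import math

def g(r, theta):
    # Illustrative inner map (an assumption): polar coordinates.
    return (r * math.cos(theta), r * math.sin(theta))

def Dg(r, theta):
    # 2 x 2 Jacobian matrix of g.
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -r * s], [s, r * c]]

def f(x, y):
    # Illustrative outer map (an assumption).
    return x * x * y

def Df(x, y):
    # 1 x 2 Jacobian matrix (the gradient) of f.
    return [[2.0 * x * y, x * x]]

def matmul(A, B):
    # Plain matrix product of nested lists.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def chain_rule(r, theta):
    # D(f o g)(a) = Df(g(a)) Dg(a), returned as a row [d/dr, d/dtheta].
    x, y = g(r, theta)
    return matmul(Df(x, y), Dg(r, theta))[0]

def central_difference(r, theta, eps=1e-6):
    # Finite-difference approximation of the same two partial derivatives.
    F = lambda rr, tt: f(*g(rr, tt))
    d_r = (F(r + eps, theta) - F(r - eps, theta)) / (2.0 * eps)
    d_t = (F(r, theta + eps) - F(r, theta - eps)) / (2.0 * eps)
    return [d_r, d_t]

if __name__ == "__main__":
    print("chain rule        :", chain_rule(2.0, 0.7))
    print("finite differences:", central_difference(2.0, 0.7))
```

The two rows agree to within the accuracy of the difference quotients, as (20) predicts for these smooth maps.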
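11.29 Examples. (i) If $F, G, H : \mathbf{R}^2 \to \mathbf{R}$ are differentiable and $z = F(x,y)$, where $x = G(r,\theta)$ and $y = H(r,\theta)$, then
$$\frac{\partial z}{\partial r} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial r} \quad \text{and} \quad \frac{\partial z}{\partial \theta} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial \theta}.$$
(ii) If $f : \mathbf{R}^3 \to \mathbf{R}$ and $\phi, \psi, \sigma : \mathbf{R} \to \mathbf{R}$ are differentiable and $w = f(x,y,z)$, where $x = \phi(t)$, $y = \psi(t)$, and $z = \sigma(t)$, then
$$\frac{dw}{dt} = \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt} + \frac{\partial w}{\partial z}\frac{dz}{dt}.$$

EXERCISES
1. Let $F : \mathbf{R}^3 \to \mathbf{R}$ and $f, g, h : \mathbf{R}^2 \to \mathbf{R}$ be $C^2$ functions. If $w = F(x,y,z)$, where $x = f(p,q)$, $y = g(p,q)$, and $z = h(p,q)$, find formulas for $w_p$, $w_q$, and $w_{pp}$.
2. Let $r > 0$, let $a \in \mathbf{R}^n$, and suppose that $g : B_r(a) \to \mathbf{R}^m$ is differentiable at $a$.
(a) If $f : B_r(g(a)) \to \mathbf{R}$ is differentiable at $g(a)$, prove that the partial derivatives of $h = f \circ g$ are given by
$$\frac{\partial h}{\partial x_j}(a) = \nabla f(g(a)) \cdot \frac{\partial g}{\partial x_j}(a)$$
for $j = 1, 2, \ldots, n$.
(b) If $n = m$ and $f : B_r(g(a)) \to \mathbf{R}^n$ is differentiable at $g(a)$, prove that
$$\det(D(f \circ g)(a)) = \det(Df(g(a)))\,\det(Dg(a)).$$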
3. Let $f, g : \mathbf{R} \to \mathbf{R}$ be twice differentiable. Prove that $u(x,y) := f(xy)$ satisfies
$$x\frac{\partial u}{\partial x} - y\frac{\partial u}{\partial y} = 0,$$
and $v(x,y) := f(x - y) + g(x + y)$ satisfies the wave equation; i.e.,
$$\frac{\partial^2 v}{\partial x^2} = \frac{\partial^2 v}{\partial y^2}.$$
4. Let $u : \mathbf{R} \to [0, \infty)$ be differentiable. Prove that for each $(x,y,z) \ne (0,0,0)$,
$$F(x,y,z) := u\bigl(\sqrt{x^2 + y^2 + z^2}\bigr)$$
satisfies
5. Let
$t > 0$, $x \in \mathbf{R}$.
(a) Prove that $u$ satisfies the heat equation; i.e., $u_{xx} - u_t = 0$ for all $t > 0$ and $x \in \mathbf{R}$.
(b) If $a > 0$, prove that $u(x,t) \to 0$, as $t \to 0+$, uniformly for $x \in [a, \infty)$.
6. Suppose that $I$ is a nonempty, open interval and $f : I \to \mathbf{R}^m$ is differentiable on $I$. If $f(I) \subseteq \partial B_r(0)$ for some fixed $r > 0$, prove that $f(t)$ is orthogonal to $f'(t)$ for all $t \in I$.
7. Let $V$ be open in $\mathbf{R}^n$, $a \in V$, $f : V \to \mathbf{R}$, and let $f$ be differentiable at $a$.
(a) Prove that the directional derivative $D_u f(a)$ exists (see Exercise 9, p. 338), for each $u \in \mathbf{R}^n$ such that $\|u\| = 1$, and $D_u f(a) = \nabla f(a) \cdot u$.
(b) If $\nabla f(a) \ne 0$ and $\theta$ represents the angle between $u$ and $\nabla f(a)$, prove that $D_u f(a) = \|\nabla f(a)\| \cos\theta$.
(c) Show that as $u$ ranges over all unit vectors in $\mathbf{R}^n$, the maximum of $D_u f(a)$ is $\|\nabla f(a)\|$, and it occurs when $u$ is parallel to $\nabla f(a)$.
8. Let $z = F(x,y)$ be differentiable at $(a,b)$ with $F_y(a,b) \ne 0$, and let $I$ be an open interval containing $a$. Prove that if $f : I \to \mathbf{R}$ is differentiable at $a$, $f(a) = b$, and $F(x, f(x)) = 0$ for all $x \in I$, then
$$\frac{df}{dx}(a) = -\frac{\dfrac{\partial F}{\partial x}(a,b)}{\dfrac{\partial F}{\partial y}(a,b)}.$$
352
DIFFERENTIABILITY ON Rn
9. Let f, 9 : R2 ---t R be differentiable and satisfy the Cauchy-Riemann equations, i.e., that af = ag and af = _ ag ax ay ay ax . If u(r, B)
= f(r cos B, r sin B), and v(r, B) = g(r cos B, r sin B), prove that au 1 av = ar -:;: aB'
av ar
1 au r aB
10. Let f : R2 ---t R be C2 on R2 and set u(r, B) satisfies the Laplace equation, i.e., if
r
# o.
= f(r cos B, r sin B). If f
prove for each r # 0 that
11.5 MEAN VALUE THEOREM AND TAYLOR'S FORMULA
Using $Df$ as a replacement for $f'$, we guess that the multidimensional analogue of the Mean Value Theorem is
$$f(x) - f(a) = Df(c)(x - a)$$
for some $c$ "between" $x$ and $a$, i.e., some $c \in L(x;a)$, the line segment from $a$ to $x$. The following result shows that our guess is wrong for functions $f : \mathbf{R}^n \to \mathbf{R}^m$ when $m > 1$.
11.30 Remark. The function $f(t) = (\cos t, \sin t)$ is differentiable on $\mathbf{R}$ and satisfies $f(2\pi) = f(0)$, but there is no $c \in \mathbf{R}$ such that $Df(c) = (0,0)$.
PROOF. $Df(t) = (-\sin t, \cos t)$ exists and is continuous for $t \in \mathbf{R}$, but $(0,0) \ne (-\sin t, \cos t)$ for $t \in \mathbf{R}$. ∎
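Remark 11.30 is easy to see numerically: the derivative of the circle map has norm 1 everywhere, so it never vanishes, yet for any single fixed direction $u$ one can still find a point $c$ where $u \cdot Df(c) = 0$. The Python sketch below checks both facts; the helper `c_for` and the sample vector $u = (3,4)$ are illustrative assumptions, not part of the text.

```python
import math

def f(t):
    # The curve from Remark 11.30.
    return (math.cos(t), math.sin(t))

def Df(t):
    # Its derivative (-sin t, cos t).
    return (-math.sin(t), math.cos(t))

def norm(v):
    return math.hypot(v[0], v[1])

def c_for(u):
    """A point c in [0, 2*pi) with u . Df(c) = 0, for a nonzero vector u."""
    # u . (-sin c, cos c) vanishes when (cos c, sin c) is parallel to u.
    return math.atan2(u[1], u[0]) % (2.0 * math.pi)

if __name__ == "__main__":
    end, start = f(2.0 * math.pi), f(0.0)
    print("f(2 pi) - f(0):", (end[0] - start[0], end[1] - start[1]))
    grid = [2.0 * math.pi * i / 1000 for i in range(1001)]
    print("smallest ||Df(t)|| on the grid:", min(norm(Df(t)) for t in grid))
    u = (3.0, 4.0)
    c = c_for(u)
    print("c =", c, " u . Df(c) =", u[0] * Df(c)[0] + u[1] * Df(c)[1])
```

The point $c$ depends on $u$, which is why no single $c$ can make the whole vector $Df(c)$ vanish.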
The following is a correct version of the Mean Value Theorem for multivariable functions.
11.31 THEOREM [MEAN VALUE THEOREM ON $\mathbf{R}^n$]. Let $V$ be open in $\mathbf{R}^n$ and suppose that $f : V \to \mathbf{R}^m$ is differentiable on $V$. If $x, a \in V$ and $L(x;a) \subseteq V$, then for each $u \in \mathbf{R}^m$, there is a $c \in L(x;a)$ such that
$$u \cdot (f(x) - f(a)) = u \cdot (Df(c)(x - a)).$$
PROOF. Let
$$g(t) = a + t(x - a), \qquad t \in \mathbf{R},$$
and notice by Exercise 8, p. 338, that $g : \mathbf{R} \to \mathbf{R}^n$ is differentiable with $Dg(t) = x - a$ for all $t \in \mathbf{R}$. Since $L(x;a) \subseteq V$ and $V$ is open, choose $\delta > 0$ such that $g(t) \in V$ for all $t \in I_\delta := (-\delta, 1 + \delta)$. By the Chain Rule,
$$D(f \circ g)(t) = Df(g(t))(x - a), \qquad t \in I_\delta. \tag{24}$$
Fix $u \in \mathbf{R}^m$, and consider the function
$$F(t) = u \cdot (f \circ g)(t), \qquad t \in I_\delta.$$
The function $F$ is a real-valued function on $I_\delta$. By (9) (the Dot Product Rule) and (24), $F$ is differentiable on $I_\delta$ with
$$F'(t) = u \cdot D(f \circ g)(t) = u \cdot (Df(g(t))(x - a)).$$
Hence, by the one-dimensional Mean Value Theorem, there is a $t_0 \in (0,1)$ such that
$$u \cdot (f(x) - f(a)) = F(1) - F(0) = F'(t_0) = u \cdot (Df(g(t_0))(x - a)).$$
Thus set $c = g(t_0)$. ∎

Figure 11.4 (a nonconvex set $V$ in $\mathbf{R}^2$)

Sets that satisfy the hypothesis "$L(x;a) \subseteq V$" come up often enough to warrant a name.
11.32 DEFINITION. A subset $E$ of $\mathbf{R}^n$ is said to be convex if and only if $L(x;a) \subseteq E$ for all $x, a \in E$.
Using this terminology, we see that the Mean Value Theorem holds for any $C^1$ function on a convex, open set $V$. It is easy to see that any ball and any rectangle is convex. For example, if $x, a \in B_r(b)$, then
$$\|((1 - t)a + tx) - b\| = \|(1 - t)(a - b) + t(x - b)\| < (1 - t)r + tr = r.$$
On the other hand, Figure 11.4 is an example of a nonconvex set in $\mathbf{R}^2$ (because the line segment that joins $a$ to $b$ contains some points outside $V$).
Our next result shows that the Mean Value Theorem for scalar-valued functions recaptures the simplicity of the one-dimensional version (see also Exercises 1 and 5).
11.33 COROLLARY. Let $V$ be convex and open in $\mathbf{R}^n$ and suppose that $f : V \to \mathbf{R}$. If $f$ is differentiable on $V$ and $a + h$, $a$ both belong to $V$, then there is a $0 < t < 1$ such that
$$f(a + h) - f(a) = \nabla f(a + th) \cdot h := \sum_{j=1}^{n} \frac{\partial f}{\partial x_j}(a + th)\,h_j. \tag{25}$$
PROOF. Let $u$ be a nonzero scalar, and suppose that $a + h$, $a$ both belong to $V$. Since $V$ is convex, $L(a + h; a) \subseteq V$. Hence, by Theorem 11.31,
$$u(f(a + h) - f(a)) = u(\nabla f(c) \cdot h)$$
for some $c \in L(a + h; a)$. Dividing this equation by $u$ and choosing $t \in (0,1)$ such that $c = a + th$, we conclude that (25) holds. ∎
As in the one-dimensional case, the Mean Value Theorem is used most often to obtain information about a function from properties of its derivative. Here is a typical example.
11.34 COROLLARY. Let $V$ be an open set in $\mathbf{R}^n$, let $H$ be a compact subset of $V$, and suppose that $f : V \to \mathbf{R}^m$ is $C^1$ on $V$. If $E$ is a convex subset of $H$, then there is a constant $M$ (which depends on $H$ and $f$ but not on $E$) such that
$$\|f(x) - f(a)\| \le M\|x - a\|$$
for all $x, a \in E$.
PROOF. Since $H$ is compact and the entries of $Df$ are continuous on $H$, we have by the Extreme Value Theorem (Theorem 9.32 or 10.63) and the proof of Theorem 8.17 that the operator norm of $Df$ is bounded on $H$, i.e., that
$$M := \sup_{c \in H} \|Df(c)\|$$
is finite. Notice that $M$ depends only on $H$ and $f$.
Let $x, a \in E$ and $u = f(x) - f(a)$. Since $E$ is convex, $L(x;a) \subseteq E$. Hence, by Theorem 11.31, there is a $c \in L(x;a)$ such that
$$\|f(x) - f(a)\|^2 = u \cdot (f(x) - f(a)) = u \cdot (Df(c)(x - a)) = (f(x) - f(a)) \cdot (Df(c)(x - a)).$$
It follows from the Cauchy-Schwarz Inequality and the definition of the operator norm that
$$\|f(x) - f(a)\|^2 \le \|f(x) - f(a)\|\,\|Df(c)\|\,\|x - a\|.$$
If $\|f(x) - f(a)\| = 0$, there is nothing to prove. Otherwise, we can divide the inequality above by $\|f(x) - f(a)\|$ to obtain
$$\|f(x) - f(a)\| \le \|Df(c)\|\,\|x - a\| \le M\|x - a\|. ∎$$
As the following optional result shows, for some applications of the Mean Value Theorem, the convexity hypothesis can be replaced by connectivity. (This is an analogue of the one-dimensional result: If $f' = 0$ on $[a,b]$, then $f$ is constant on $[a,b]$.)
*11.35 COROLLARY. Suppose that $V$ is open and connected in $\mathbf{R}^n$ and that $f : V \to \mathbf{R}^m$ is differentiable on $V$. If $Df(c) = 0$ for all $c \in V$, then $f$ is constant on $V$.
PROOF. Fix $a \in V$, and let $x \in V$. Since $V$ is open and connected, $V$ is polygonally connected (see Exercise 10, p. 277). Thus, there exist points $x_0 = a, x_1, \ldots, x_k = x$ such that $L(x_{j-1}; x_j) \subseteq V$ for $j = 1, 2, \ldots, k$ (see Figure 11.4). Let $u = f(x) - f(a)$ and choose by Theorem 11.31 points $c_j \in L(x_{j-1}; x_j)$ such that
$$u \cdot (f(x_j) - f(x_{j-1})) = u \cdot (Df(c_j)(x_j - x_{j-1})) = 0$$
for $j = 1, 2, \ldots, k$. Summing over $j$ and telescoping, we see by the choice of $u$ that
$$0 = \sum_{j=1}^{k} u \cdot (f(x_j) - f(x_{j-1})) = u \cdot (f(x) - f(a)) = \|f(x) - f(a)\|^2.$$
Therefore, $f(x) = f(a)$. ∎
To obtain a multidimensional version of Taylor's Formula, we need to define higher-order differentials. Let $p \ge 1$, let $V$ be open in $\mathbf{R}^n$, let $a \in V$, and let $f : V \to \mathbf{R}$. We shall say that $f$ has a $p$th-order total differential at $a$ if and only if the $(p-1)$st-order partial derivatives of $f$ exist on $V$ and are differentiable at $a$, in which case we shall use the notation
$$D^{(p)} f(a; h) := \sum_{i_1 = 1}^{n} \cdots \sum_{i_p = 1}^{n} \frac{\partial^p f}{\partial x_{i_1} \cdots \partial x_{i_p}}(a)\,h_{i_1} \cdots h_{i_p}, \qquad h = (h_1, \ldots, h_n).$$
Notice that
$$D^{(p)} f(a; h) = \sum_{j=1}^{n} \frac{\partial}{\partial x_j}\Bigl(D^{(p-1)} f(\,\cdot\,; h)\Bigr)(a)\,h_j$$
for $p > 1$. Also notice that if $z = f(x)$, then $D^{(1)} f(a; \Delta\mathbf{x})$ is the first total differential $dz$ defined in Section 11.3, and also is the total derivative of $f$ at $a$ evaluated at $\Delta\mathbf{x}$:
$$D^{(1)} f(a; \Delta\mathbf{x}) := \sum_{j=1}^{n} \frac{\partial f}{\partial x_j}(a)\,\Delta x_j = \nabla f(a) \cdot \Delta\mathbf{x} = Df(a)(\Delta\mathbf{x}).$$
For the case $n = 2$, this differential has a simple geometric interpretation (see Figure 11.3).
Although total differentials look messy to evaluate, when $f$ is a sufficiently smooth function of two variables they are relatively easy to calculate using binomial coefficients (see the next example and Exercise 4).
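11.36 Example. Suppose that $f : V \to \mathbf{R}$ is $C^2$ on $V$. Find a formula for the second total differential of $f$ at $(a,b) \in V$.
SOLUTION. By definition,
$$D^{(2)} f((a,b); (h,k)) = h^2 \frac{\partial^2 f}{\partial x^2}(a,b) + hk \frac{\partial^2 f}{\partial x\,\partial y}(a,b) + hk \frac{\partial^2 f}{\partial y\,\partial x}(a,b) + k^2 \frac{\partial^2 f}{\partial y^2}(a,b).$$
But by Theorem 11.2, $f_{xy}(a,b) = f_{yx}(a,b)$. Therefore,
$$D^{(2)} f((a,b); (h,k)) = h^2 \frac{\partial^2 f}{\partial x^2}(a,b) + 2hk \frac{\partial^2 f}{\partial x\,\partial y}(a,b) + k^2 \frac{\partial^2 f}{\partial y^2}(a,b). ∎$$
Thus the second total differential of $f(x,y) = (xy)^2$ is
$$D^{(2)} f((x,y); (h,k)) = 2y^2 h^2 + 8xy\,hk + 2x^2 k^2.$$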
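The second total differential of $f(x,y) = (xy)^2$ can be cross-checked against the second derivative of $t \mapsto f(x + th, y + tk)$ at $t = 0$, which is how $D^{(2)}$ enters the proof of Taylor's Formula. The Python sketch below does this with a central second difference; the sample point, the increment, and the function names are illustrative assumptions.

```python
def f(x, y):
    # The function from the end of Example 11.36.
    return (x * y) ** 2

def second_differential(x, y, h, k):
    # Closed form: h^2 f_xx + 2hk f_xy + k^2 f_yy with
    # f_xx = 2y^2, f_xy = 4xy, f_yy = 2x^2.
    return 2.0 * y * y * h * h + 8.0 * x * y * h * k + 2.0 * x * x * k * k

def second_difference(x, y, h, k, eps=1e-3):
    # Central second difference of F(t) = f(x + t h, y + t k) at t = 0.
    F = lambda t: f(x + t * h, y + t * k)
    return (F(eps) - 2.0 * F(0.0) + F(-eps)) / (eps * eps)

if __name__ == "__main__":
    args = (1.0, 2.0, 0.5, -1.0)
    print("closed form      :", second_differential(*args))
    print("second difference:", second_difference(*args))
```

The two values agree up to the discretization error of the difference quotient.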
Here is a multidimensional version of Taylor's Formula.
11.37 THEOREM [TAYLOR'S FORMULA ON $\mathbf{R}^n$]. Let $p \in \mathbf{N}$, let $V$ be open in $\mathbf{R}^n$, let $x, a \in V$, and suppose that $f : V \to \mathbf{R}$. If the $p$th total differential of $f$ exists on $V$ and $L(x;a) \subseteq V$, then there is a point $c \in L(x;a)$ such that
$$f(x) = f(a) + \sum_{k=1}^{p-1} \frac{1}{k!} D^{(k)} f(a; h) + \frac{1}{p!} D^{(p)} f(c; h)$$
for $h := x - a$.
NOTE: These hypotheses are met if $V$ is convex and $f$ is $C^p$ on $V$.
PROOF. Let $h = x - a$. As in the proof of Theorem 11.31, choose $\delta > 0$ so small that $a + th \in V$ for $t \in I_\delta := (-\delta, 1 + \delta)$. The function $F(t) = f(a + th)$ is differentiable on $I_\delta$ and, by the Chain Rule,
$$F'(t) = Df(a + th)(h) = \sum_{k=1}^{n} \frac{\partial f}{\partial x_k}(a + th)\,h_k.$$
In fact, a simple induction argument can be used to verify
$$F^{(j)}(t) = \sum_{i_1 = 1}^{n} \cdots \sum_{i_j = 1}^{n} \frac{\partial^j f}{\partial x_{i_1} \cdots \partial x_{i_j}}(a + th)\,h_{i_1} \cdots h_{i_j}$$
for $j = 1, 2, \ldots, p$. Thus
$$F^{(j)}(0) = D^{(j)} f(a; h) \quad \text{and} \quad F^{(p)}(t) = D^{(p)} f(a + th; h) \tag{26}$$
for $j = 1, \ldots, p - 1$, and $t \in I_\delta$.
We have proved that $F : I_\delta \to \mathbf{R}$ has a derivative of order $p$ everywhere on $I_\delta \supset [0,1]$. Therefore, by the one-dimensional Taylor Formula and (26),
$$\begin{aligned}
f(x) - f(a) = F(1) - F(0) &= \sum_{j=1}^{p-1} \frac{1}{j!} F^{(j)}(0) + \frac{1}{p!} F^{(p)}(t) \\
&= \sum_{j=1}^{p-1} \frac{1}{j!} D^{(j)} f(a; h) + \frac{1}{p!} D^{(p)} f(a + th; h)
\end{aligned}$$
for some $t \in (0,1)$. Thus set $c = a + th$. ∎
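To see Taylor's Formula at work for $p = 3$, one can compare a function with its second-order Taylor polynomial and watch the remainder shrink like $\|h\|^3$. The Python sketch below does this for $f(x,y) = e^x \sin y$ at the point $(0.5, 1)$; the function, the base point, and the helper names are assumptions made for illustration, and the partial derivatives are entered by hand.

```python
import math

def f(x, y):
    # Illustrative smooth function (an assumption, not from the text).
    return math.exp(x) * math.sin(y)

def taylor2(a, b, h, k):
    """f(a,b) + D^(1) f((a,b);(h,k)) + (1/2) D^(2) f((a,b);(h,k))."""
    ex, s, c = math.exp(a), math.sin(b), math.cos(b)
    fx, fy = ex * s, ex * c                    # first-order partials
    fxx, fxy, fyy = ex * s, ex * c, -ex * s    # second-order partials
    d1 = fx * h + fy * k
    d2 = fxx * h * h + 2.0 * fxy * h * k + fyy * k * k
    return ex * s + d1 + 0.5 * d2

def remainder(a, b, h, k):
    # Size of the third-order remainder (1/3!) D^(3) f(c; h) in Theorem 11.37.
    return abs(f(a + h, b + k) - taylor2(a, b, h, k))

if __name__ == "__main__":
    for s in (1e-1, 5e-2, 1e-2, 5e-3):
        print(s, remainder(0.5, 1.0, s, s))
```

Halving the increment reduces the printed remainder by a factor close to 8, in line with a remainder that is cubic in $h$.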
EXERCISES
1. Let $f : \mathbf{R}^n \to \mathbf{R}$. Suppose that for each unit vector $u \in \mathbf{R}^n$, the directional derivative $D_u f(a + tu)$ exists for $t \in [0,1]$ (see Definition 11.19). Prove that
$$f(a + u) - f(a) = D_u f(a + tu)$$
for some $t \in (0,1)$.
2. Suppose that $r, \alpha$ are positive numbers, $E$ is a convex subset of $\mathbf{R}^n$ such that $E \subset B_r(0)$, and there exists a sequence $x_k \in E$ such that $x_k \to 0$ as $k \to \infty$. If $f : B_r(0) \to \mathbf{R}$ is continuously differentiable and $|f(x)| \le \|x\|^{\alpha}$ for all $x \in E$, prove that there is an $M > 0$ such that $|f(x)| \le M\|x\|$ for $x \in E$.
3. (a) Write out an expression in powers of $(x + 1)$ and $(y - 1)$ for $f(x,y) = x^2 + xy + y^2$.
(b) Write Taylor's Formula for $f(x,y) = \sqrt{x} + \sqrt{y}$, $a = (1,4)$, and $p = 3$.
(c) Write Taylor's Formula for $f(x,y) = e^{xy}$, $a = (0,0)$, and $p = 4$.
4. Suppose that $f : \mathbf{R}^2 \to \mathbf{R}$ is $C^p$ on $B_r(x_0, y_0)$ for some $r > 0$. Prove that given $(x,y) \in B_r(x_0, y_0)$, there is a point $(c,d)$ on the line segment between $(x_0, y_0)$ and $(x,y)$ such that
$$f(x,y) = f(x_0, y_0) + \sum_{k=1}^{p-1} \frac{1}{k!} \left( \sum_{j=0}^{k} \binom{k}{j} (x - x_0)^j (y - y_0)^{k-j} \frac{\partial^k f}{\partial x^j\,\partial y^{k-j}}(x_0, y_0) \right) + \frac{1}{p!} \sum_{j=0}^{p} \binom{p}{j} (x - x_0)^j (y - y_0)^{p-j} \frac{\partial^p f}{\partial x^j\,\partial y^{p-j}}(c, d).$$
5. Let $r > 0$, $a, b \in \mathbf{R}$, $f : B_r(a,b) \to \mathbf{R}$ be differentiable, and $(x,y) \in B_r(a,b)$.
(a) Compute the derivative of $g(t) = f(tx + (1 - t)a, y) + f(a, ty + (1 - t)b)$.
(b) Prove that there are numbers $c$ between $a$ and $x$, and $d$ between $b$ and $y$, such that
$$f(x,y) - f(a,b) = (x - a) f_x(c, y) + (y - b) f_y(a, d).$$
(This is Exercise 12.20 in Apostol [1].)
6. [INTEGRAL FORM OF TAYLOR'S FORMULA]. Let $p \in \mathbf{N}$, $V$ be an open set in $\mathbf{R}^n$, $x, a \in V$, and $f : V \to \mathbf{R}$ be $C^p$ on $V$. If $L(x;a) \subset V$ and $h = x - a$, prove that
$$f(x) - f(a) = \sum_{k=1}^{p-1} \frac{1}{k!} D^{(k)} f(a; h) + \frac{1}{(p-1)!} \int_0^1 (1 - t)^{p-1} D^{(p)} f(a + th; h)\,dt.$$
7. Suppose that $V$ is open in $\mathbf{R}^n$, $f : V \to \mathbf{R}$ is $C^2$ on $V$, and $f_{x_j}(a) = 0$ for some $a \in V$ and all $j = 1, \ldots, n$. Prove that if $H$ is a compact convex subset of $V$, then there is a constant $M$ such that for all $x \in H$
$$|f(x) - f(a)| \le M\|x - a\|^2.$$
8. Suppose that $V$ is an open subset of $\mathbf{R}^2$, $(a,b) \in V$, and $f : V \to \mathbf{R}$ is $C^3$ on $V$. Prove that
$$\lim_{r \to 0} \frac{4}{\pi r^2} \int_0^{2\pi} f(a + r\cos\theta, b + r\sin\theta)\cos(2\theta)\,d\theta = f_{xx}(a,b) - f_{yy}(a,b).$$
9. Suppose that $V$ is an open subset of $\mathbf{R}^2$, $H = [a,b] \times [0,c] \subset V$, $u : V \to \mathbf{R}$ is $C^2$ on $V$, and $u(x_0, t_0) \ge 0$ for all $(x_0, t_0) \in \partial H$.
(a) Show that given $\varepsilon > 0$, there is a compact set $K \subset H^o$ such that $u(x,t) \ge -\varepsilon$ for all $(x,t) \in H \setminus K$.
(b) Suppose that $u(x_1, t_1) = -\ell < 0$ for some $(x_1, t_1) \in H^o$, and choose $r > 0$ so small that $2rt_1 < \ell$. Apply part (a) to $\varepsilon := \ell/2 - rt_1$ to choose the compact set $K$, and prove that the minimum of
$$w(x,t) := u(x,t) + r(t - t_1)$$
on $H$ occurs at some $(x_2, t_2) \in K$.
(c) Prove that if $u$ satisfies the heat equation, i.e., $u_{xx} - u_t = 0$ on $V$, and if $u(x_0, t_0) \ge 0$ for all $(x_0, t_0) \in \partial H$, then $u(x,t) \ge 0$ for all $(x,t) \in H$.
10. (a) Prove that every convex set in $\mathbf{R}^n$ is connected.
(b) Show that the converse of part (a) is false.
*(c) Suppose that $f : \mathbf{R} \to \mathbf{R}$. Prove that $f$ is convex (as a function) if and only if $E := \{(x,y) : y \ge f(x)\}$ is convex (as a set in $\mathbf{R}^2$).
11.6 INVERSE FUNCTION THEOREM
By the one-dimensional Inverse Function Theorem (Theorem 4.27), if 9 : R -+ R is =I- 0, then g-l is differentiable at Yo = g(xo) and
1-1 and differentiable with g'(xo)
(g
-l)'() 1 Yo = g'(xo)'
In this section we obtain a multivariable analogue of this result, i.e., an Inverse Function Theorem for vector functions $f$ from $n$ variables to $n$ variables. What shall we use for hypotheses? We needed $g$ to be 1-1 so that the inverse function $g^{-1}$ existed. For the same reason, we shall assume that $f$ is 1-1. We needed $g'(x_0)$ to be nonzero so that we could divide by it. In the multidimensional case, $Df(a)$ is a matrix, hence "divisibility" corresponds to invertibility. Since a matrix is invertible if and only if it has a nonzero determinant (see Appendix C), we shall assume that the Jacobian of $f$ is nonzero:
$$\Delta_f(a) := \det(Df(a)) \neq 0.$$
The word Jacobian is used because it was Jacobi who first recognized the importance of $\Delta_f$ and its connection with volume (see Exercise 6, p. 431).
The proof of the Inverse Function Theorem on $\mathbf{R}^n$ is not simple. It lies somewhat deeper than the previous results of this chapter, and we precede it by three preliminary results that explore the consequences of the hypothesis $\Delta_f \neq 0$.
If $f^{-1}$ is differentiable, then $f^{-1}$ is continuous; hence, $f = (f^{-1})^{-1}$ must take open sets to open sets (see Theorem 9.26 or 10.58). Our first preliminary result, a step in the right direction, shows that if $f$ is 1-1 and its Jacobian is nonzero at $a$, then $f(a)$ is interior to $f(B_r(a))$.
11.38 Lemma. Let $V$ be open in $\mathbf{R}^n$, $f : V \to \mathbf{R}^n$, $a \in V$, and $r > 0$ be so small that $\overline{B_r(a)} \subset V$. Suppose that $f$ is continuous and 1-1 on $\overline{B_r(a)}$, and its first-order partial derivatives exist at every point in $B_r(a)$. If $\Delta_f \neq 0$ on $B_r(a)$, then there is a $\rho > 0$ such that $B_\rho(f(a)) \subset f(B_r(a))$.
STRATEGY: The idea behind this proof is simple. Let $y \in B_\rho(f(a))$, where $\rho$ is to be determined later. To verify $B_\rho(f(a)) \subset f(B_r(a))$, we must show that $y = f(b)$ for some $b \in B_r(a)$; i.e., $f(b) - y = 0$. If such a $b$ exists, we should be able to find it by choosing a $b \in B_r(a)$ that minimizes $\|f(b) - y\|$. This strategy has a problem: By the Extreme Value Theorem, the continuous function $\|f(b) - y\|$ assumes its minimum on the compact set $\overline{B_r(a)}$, not on the open set $B_r(a)$; hence, although such a $b$ exists and belongs to the closure of $B_r(a)$, it might not belong to $B_r(a)$ itself. By controlling $\rho$, we can eliminate this problem. If $\rho < m$, where $m$ is the minimal distance from $f(\partial B_r(a))$ to $f(a)$, then $b$ cannot belong to $\partial B_r(a)$. Thus $b \in B_r(a)$, as required. Here are the details.
PROOF. Let
$$g(x) = \|f(x) - f(a)\|, \qquad x \in \overline{B_r(a)}.$$
By hypothesis, $g : \overline{B_r(a)} \to \mathbf{R}$ is continuous. Since $f$ is 1-1, $g(x) > 0$ for all $x \neq a$. Since $\partial B_r(a)$ is compact, it follows that
$$m = \inf_{x \in \partial B_r(a)} g(x) > 0.$$
Set $\rho = m/2$ and fix $y \in B_\rho(f(a))$. Since the function $h(x) := \|f(x) - y\|$ is continuous on the compact set $\overline{B_r(a)}$, it attains its minimum there. Thus there is a $b \in \overline{B_r(a)}$ such that $h(b) \le h(x)$ for all $x \in \overline{B_r(a)}$.
To show that $b \in B_r(a)$, suppose to the contrary that $b \notin B_r(a)$. Then $b \in \partial B_r(a)$. Since $h(a) = \|f(a) - y\| < \rho$, the minimum, $h(b)$, must also satisfy $h(b) < \rho$. Since $b \in \partial B_r(a)$, it follows from the triangle inequality and the choice of $\rho$ that
$$\rho > h(b) = \|f(b) - y\| \ge \|f(b) - f(a)\| - \|f(a) - y\| = g(b) - h(a) > 2\rho - \rho = \rho,$$
a contradiction.
It remains to prove that $y = f(b)$. Notice that since $h(b) \ge 0$, $h^2(b)$ is the minimum of $h^2$ on $B_r(a)$. Thus by one-dimensional calculus,
$$0 = \frac{\partial h^2}{\partial x_k}(b) = \sum_{j=1}^n 2\,(f_j(b) - y_j)\, \frac{\partial f_j}{\partial x_k}(b), \qquad k = 1, \dots, n.$$
This is a system of $n$ linear equations in $n$ unknowns, $f_j(b) - y_j$. Since the matrix of coefficients of this system has determinant $2^n \Delta_f(b) \neq 0$, it follows from Cramer's Rule (see Appendix C) that this system has only the trivial solution; i.e., $f_j(b) - y_j = 0$ for all $j = 1, \dots, n$. In particular, $y = f(b)$. ∎
Next, we show that $f^{-1}$ is continuous when $f$ is 1-1 and $\Delta_f$ is nonzero.
11.39 THEOREM. Let $V$ be open and nonempty in $\mathbf{R}^n$, and $f : V \to \mathbf{R}^n$ be continuous. If $f$ is 1-1 and has first-order partial derivatives on $V$, and if $\Delta_f \neq 0$ on $V$, then $f^{-1}$ is continuous on $f(V)$.
PROOF. By Theorem 9.26 or 10.58 (applied to $f^{-1}$), it suffices to show that $f(W)$ is open in $\mathbf{R}^n$ for every open $W \subseteq V$ in $\mathbf{R}^n$. Let $b \in f(W)$; i.e., $b = f(a)$ for some $a \in W$. Since $W$ is open, choose $q > 0$ such that $B_q(a) \subset W$. Fix $0 < r < q$, and notice that $\overline{B_r(a)} \subset W$. Since $f$ is 1-1 on $V \supseteq W$, apply Lemma 11.38 to choose $\rho > 0$ such that
$$B_\rho(b) = B_\rho(f(a)) \subset f(B_r(a)).$$
Since $f(B_r(a)) \subset f(W)$, this proves that $f(W)$ is open. ∎
Our final preliminary result shows that if the Jacobian of a continuously differentiable function $f$ is nonzero at a point, then $f$ must be 1-1 near that point. (This will provide a key step in the proof of Theorem 11.41.)
11.40 Lemma. Let $V$ be open in $\mathbf{R}^n$ and $f : V \to \mathbf{R}^n$ be $C^1$ on $V$. If $\Delta_f(a) \neq 0$ for some $a \in V$, then there is an $r > 0$ such that $B_r(a) \subset V$, $f$ is 1-1 on $B_r(a)$, $\Delta_f(x) \neq 0$ for all $x \in B_r(a)$, and
$$\det\left[\frac{\partial f_i}{\partial x_j}(c_i)\right]_{n \times n} \neq 0$$
for all $c_1, \dots, c_n \in B_r(a)$.
STRATEGY: The idea behind the proof is simple. If $f$ is not 1-1 on some $B_r(a)$, then there exist $x, y \in B_r(a)$ such that $x \neq y$ and $f(x) = f(y)$. Since $L(x; y) \subset B_r(a)$, we have by Corollary 11.33 (the Mean Value Theorem) that
$$0 = f_i(y) - f_i(x) = \sum_{k=1}^n \frac{\partial f_i}{\partial x_k}(c_i)\,(y_k - x_k) \tag{27}$$
for $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n)$, $c_i \in L(x; y)$, and $i = 1, \dots, n$. Notice that (27) is a system of $n$ linear equations in $n$ unknowns, $(y_k - x_k)$. If we can show, for sufficiently small $r$, that the matrix of coefficients of (27) has nonzero determinant for any choice of $c_i \in B_r(a)$, then by Cramer's Rule the linear system (27) has only one solution: $y_k - x_k = 0$ for $k = 1, \dots, n$. This would imply that $x = y$, a contradiction. Here are the details.
PROOF. To show that there is an $r > 0$ such that the determinant of the matrix of coefficients of the linear system (27) is nonzero for all $c_i \in B_r(a)$, let $V^{(n)} = V \times \cdots \times V$ represent the $n$-fold Cartesian product of $V$ with itself, and define $h : V^{(n)} \to \mathbf{R}$ by
$$h(c_1, \dots, c_n) = \det\left[\frac{\partial f_i}{\partial x_j}(c_i)\right]_{n \times n}.$$
Since the determinant of a matrix is defined using products and differences of its entries (see Appendix C), we have by hypothesis that $h$ is continuous on $V^{(n)}$. Since $h(a, \dots, a) = \Delta_f(a) \neq 0$, it follows that there is an $r > 0$ such that $B_r(a) \subset V$ and $h(c_1, \dots, c_n) \neq 0$ for $c_i \in B_r(a)$. In particular, the determinant of the matrix of coefficients of the linear system (27) is nonzero for all $c_i \in B_r(a)$, and $\Delta_f(x) = h(x, \dots, x) \neq 0$ for all $x \in B_r(a)$. ∎
We now prove a multidimensional version of the Inverse Function Theorem.
11.41 THEOREM [INVERSE FUNCTION THEOREM]. Let $V$ be open in $\mathbf{R}^n$ and $f : V \to \mathbf{R}^n$ be $C^1$ on $V$. If $\Delta_f(a) \neq 0$ for some $a \in V$, then there exists an open set $W$ containing $a$ such that
(i) $f$ is 1-1 on $W$,
(ii) $f^{-1}$ is $C^1$ on $f(W)$, and
(iii) for each $y \in f(W)$,
$$D(f^{-1})(y) = [Df(f^{-1}(y))]^{-1},$$
where $[\ ]^{-1}$ represents matrix inversion (see Theorem C.5).
PROOF. By Lemma 11.40, there is an open ball $B$ centered at $a$ such that $f$ is 1-1 and $\Delta_f \neq 0$ on $B$, and
$$\Delta := \det\left[\frac{\partial f_i}{\partial x_j}(c_i)\right]_{n \times n} \neq 0$$
for all $c_i \in B$. Let $B_0$ be an open ball centered at $a$ which is smaller than $B$; i.e., the radius of $B_0$ is strictly less than the radius of $B$. Then $B_0 \subset B$, $f$ is 1-1 on $B_0$ and, by Theorem 11.39, $f^{-1}$ is continuous on $f(B_0)$. Let $W$ be any open ball centered at $a$ which is smaller than $B_0$. Then $f$ is 1-1 on $W$ and $f(W)$ is open.
To show that the first-order partial derivatives of $f^{-1}$ exist and are continuous on $f(W)$, fix $y_0 \in f(W)$ and $1 \le i, k \le n$. Choose $t \in \mathbf{R} \setminus \{0\}$ so small that $y_0 + te_k \in f(W)$, and choose $x_0, x_1 \in W$ such that $x_0 = f^{-1}(y_0)$ and $x_1 = f^{-1}(y_0 + te_k)$. Observe that for each $i = 1, 2, \dots, n$,
$$f_i(x_1) - f_i(x_0) = \begin{cases} t & k = i \\ 0 & k \neq i. \end{cases}$$
Hence by Corollary 11.33 (the Mean Value Theorem), there exist points $c_i \in L(x_0; x_1)$ such that
$$\nabla f_i(c_i) \cdot \frac{x_1 - x_0}{t} = \frac{f_i(x_1) - f_i(x_0)}{t} = \begin{cases} 1 & k = i \\ 0 & k \neq i, \end{cases} \qquad i = 1, 2, \dots, n. \tag{28}$$
Let $x_0(j)$ (respectively, $x_1(j)$) denote the $j$th component of $x_0$ (respectively, $x_1$). Since (28) is a system of $n$ linear equations in $n$ variables $(x_1(j) - x_0(j))/t$ whose coefficient matrix has determinant $\Delta$ (which is nonzero by the choice of $B$), we see by Cramer's Rule that the solutions of (28) satisfy
$$\frac{(f^{-1})_j(y_0 + te_k) - (f^{-1})_j(y_0)}{t} = \frac{x_1(j) - x_0(j)}{t} = Q_j(t), \tag{29}$$
where $Q_j(t)$ is a quotient of determinants whose entries are 0's or 1's, or first-order partial derivatives of components of $f$ evaluated at the $c_i$'s. Since $t \to 0$ implies $x_1 \to x_0$, $c_i \to x_0$, and $y_0 + te_k \to y_0$, it follows that $Q_j(t)$ converges to $Q_j$, a quotient of determinants whose entries are 0's or 1's, or first-order partial derivatives of components of $f$ evaluated at $x_0 = f^{-1}(y_0)$. Since $f^{-1}$ is continuous on $f(W)$, it follows that $Q_j$ is continuous at each $y_0 \in f(W)$. Taking the limit of (29) as $t \to 0$, we see that the first-order partial derivatives of $(f^{-1})_j$ exist at $y_0$ and equal $Q_j$; i.e., $f^{-1}$ is continuously differentiable on $f(W)$.
It remains to verify (iii). Fix $y \in f(W)$, and observe, by the Chain Rule and Exercise 8, p. 338, that
$$I = D(f \circ f^{-1})(y) = Df(f^{-1}(y))\, D(f^{-1})(y).$$
By the uniqueness of matrix inverses, we conclude that
$$D(f^{-1})(y) = [Df(f^{-1}(y))]^{-1}.$$
∎
Of course, the value $D(f^{-1})(y)$ is not unique because $f^{-1}$ may have several branches. For example, if $f(x) = x^2$, then $f^{-1}(1) = \pm 1$, depending on whether we take the inverse of $f(x)$ near $x = 1$ or $x = -1$ (compare with Example 1.32).
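Formula (iii) of Theorem 11.41 can be sanity-checked numerically. The sketch below uses a map chosen here purely for illustration (it does not appear in the text), $f(x, y) = (e^x \cos y, e^x \sin y)$, whose local inverse near points with $-\pi < y < \pi$ is available in closed form. It compares a central-difference Jacobian of that inverse at $f(a)$ with the matrix inverse of $Df(a)$. The helper names, the base point, and the step size are assumptions.

```python
import math

def f(x, y):
    """Illustrative map f(x, y) = (e^x cos y, e^x sin y); not from the text."""
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def f_inv(s, t):
    """Branch of the inverse of f taking values with -pi < y < pi."""
    return (0.5 * math.log(s * s + t * t), math.atan2(t, s))

def Df(x, y):
    """Exact Jacobian matrix of f at (x, y), stored as a list of rows."""
    ex = math.exp(x)
    return [[ex * math.cos(y), -ex * math.sin(y)],
            [ex * math.sin(y), ex * math.cos(y)]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def numeric_jacobian(func, p, q, eps=1e-6):
    """Central-difference Jacobian of func: R^2 -> R^2 at the point (p, q)."""
    fp, fm = func(p + eps, q), func(p - eps, q)
    gp, gm = func(p, q + eps), func(p, q - eps)
    return [[(fp[i] - fm[i]) / (2 * eps), (gp[i] - gm[i]) / (2 * eps)]
            for i in range(2)]

a = (0.3, 0.7)                        # base point; the Jacobian of f there is e^{0.6}
y0 = f(*a)                            # the image point y0 = f(a)
lhs = numeric_jacobian(f_inv, *y0)    # approximates D(f^{-1})(y0)
rhs = inverse_2x2(Df(*a))             # [Df(f^{-1}(y0))]^{-1}
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_gap)
```

The two matrices should agree up to the discretization error of the finite differences; a different branch of the inverse would change `f_inv` but not the comparison.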
11.42 Remark. The hypothesis "$\Delta_f \neq 0$" in Theorem 11.39 can be relaxed.
PROOF. If $f(x) = x^3$, then $f : \mathbf{R} \to \mathbf{R}$ and its inverse $f^{-1}(x) = \sqrt[3]{x}$ are continuous on $\mathbf{R}$, but $\Delta_f(0) = f'(0) = 0$. ∎
11.43 Remark. The hypothesis "$\Delta_f \neq 0$" in Theorem 11.41 cannot be relaxed. In fact, if $f : B_r(a) \to \mathbf{R}^n$ is differentiable at $a$ and its inverse $f^{-1}$ exists and is differentiable at $f(a)$, then $\Delta_f(a) \neq 0$.
PROOF. Suppose to the contrary that $f$ is differentiable at $a$ but $\Delta_f(a) = 0$. By Exercise 8, p. 338, and the Chain Rule,
$$I = D(f^{-1} \circ f)(a) = D(f^{-1})(f(a))\, Df(a).$$
Taking the determinant of this identity, we have
$$1 = \Delta_{f^{-1}}(f(a))\, \Delta_f(a) = 0,$$
a contradiction. ∎
11.44 Remark. The hypothesis "$f$ is $C^1$ on $V$" in Theorem 11.41 cannot be relaxed.
PROOF. If $f(x) = x + 2x^2 \sin(1/x)$, $x \neq 0$, and $f(0) = 0$, then $f : \mathbf{R} \to \mathbf{R}$ is differentiable on $V := (-1, 1)$ and $f'(0) = 1 \neq 0$. However, since
$$f\left(\frac{2}{(4k - 1)\pi}\right) < f\left(\frac{2}{(4k + 1)\pi}\right) < f\left(\frac{2}{(4k - 3)\pi}\right)$$
for $k \in \mathbf{N}$, $f$ is not 1-1 on any open set that contains 0. Therefore, no open subset of $f(V)$ can be chosen on which $f^{-1}$ exists. ∎
Although Theorem 11.41 says $f$ must be 1-1 on some subset $W$ of $V$, it does not say that $f$ is 1-1 on $V$.
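The failure of injectivity in Remark 11.44 can be observed numerically. The sketch below assumes the sample points $a_k = 2/((4k-1)\pi)$ and $b_k = 2/((4k+1)\pi)$, where $\sin(1/a_k) = -1$ and $\sin(1/b_k) = 1$; these points and the helper names are choices made for this illustration. Since $0 < b_k < a_k$ while $f(0) < f(b_k)$ and $f(b_k) > f(a_k)$, the continuous function $f$ is not monotone on $[0, a_k]$, hence not 1-1 there, and $a_k \to 0$.

```python
import math

def f(x):
    """f(x) = x + 2 x^2 sin(1/x) for x != 0, with f(0) = 0."""
    return x + 2 * x * x * math.sin(1 / x) if x != 0 else 0.0

def sample_points(k):
    """Illustrative points near 0 where sin(1/x) equals -1 and +1."""
    a = 2 / ((4 * k - 1) * math.pi)   # sin(1/a) = -1
    b = 2 / ((4 * k + 1) * math.pi)   # sin(1/b) = +1
    return a, b

checks = []
for k in range(1, 50):
    a, b = sample_points(k)
    # f rises from 0 to b and then falls from b to a, although 0 < b < a
    checks.append(0 < b < a and f(0) < f(b) and f(b) > f(a))

not_monotone_near_zero = all(checks)
print(not_monotone_near_zero)
```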
11.45 Remark. The set $W$ chosen in Theorem 11.41 is in general a proper subset of $V$, even when $V$ is connected.
PROOF. Set $f(x, y) = (x^2 - y^2, xy)$ and $V = \mathbf{R}^2 \setminus \{(0, 0)\}$. Then $\Delta_f = 2(x^2 + y^2) \neq 0$ for $(x, y) \in V$, but $f(x, -y) = f(-x, y)$ for all $(x, y) \in \mathbf{R}^2$. Thus $f$ is not 1-1 on $V$. ∎
Sometimes functions from $p$ variables to $n$ variables are defined implicitly by relations on $\mathbf{R}^{n+p}$. On rare occasions, such a relation can be solved explicitly as follows.
*11.46 Example. If $x_0^2 + s_0^2 + t_0^2 = 1$ and $x_0 \neq 0$, prove that there exist an $r > 0$ and a function $g(s, t)$, continuously differentiable on $B_r(s_0, t_0)$, such that $x_0 = g(s_0, t_0)$ and
$$x^2 + s^2 + t^2 = 1$$
for $x = g(s, t)$ and $(s, t) \in B_r(s_0, t_0)$.
[Figure 11.5]
PROOF. Solve $x^2 + s^2 + t^2 = 1$ for $x$ to obtain
$$x = \pm\sqrt{1 - s^2 - t^2}.$$
Which sign shall we take? If $x_0 > 0$, set $g(s, t) = \sqrt{1 - s^2 - t^2}$. By the Chain Rule,
$$\frac{\partial g}{\partial s} = \frac{-s}{\sqrt{1 - s^2 - t^2}} \quad\text{and}\quad \frac{\partial g}{\partial t} = \frac{-t}{\sqrt{1 - s^2 - t^2}}.$$
Thus $g$ is differentiable at any point $(s, t)$ that lies inside the two-dimensional unit ball, i.e., that satisfies $s^2 + t^2 < 1$. Since $x_0^2 + s_0^2 + t_0^2 = 1$ and $x_0 > 0$, $(s_0, t_0, x_0)$ lies on the boundary of the three-dimensional unit ball in $stx$ space a distance $x_0$ units above the $st$ plane (see Figure 11.5). In particular, if $r := 1 - \sqrt{1 - x_0^2}$ and $(s, t) \in B_r(s_0, t_0)$, then $s^2 + t^2 < 1$. Therefore, $g$ is continuously differentiable on $B_r(s_0, t_0)$. If $x_0 < 0$, a similar argument works for $g(s, t) = -\sqrt{1 - s^2 - t^2}$. ∎
We cannot expect that all relations can be solved explicitly as we did in Example 11.46. It is most fortunate, therefore, that once we know that a solution exists, we can often approximate that solution by numerical methods. The crux of the matter, then, is which relations have solutions?
In order to state a result about the existence of solutions to a relation, we introduce additional notation. Let $V$ be an open subset of $\mathbf{R}^n$, $f : V \to \mathbf{R}^m$, and $a \in V$. Then the partial Jacobian of $f$ generated by a subset $\{k_1, k_2, \dots, k_n\}$ of $\{1, 2, \dots, m\}$ at the point $a$ is the number
$$\frac{\partial(f_{k_1}, \dots, f_{k_n})}{\partial(x_1, \dots, x_n)}(a) := \det\left[\frac{\partial f_{k_i}}{\partial x_j}(a)\right]_{n \times n},$$
provided that all these partial derivatives exist. For the case $n = m$, the corresponding partial Jacobian is just the Jacobian $\Delta_f(a)$. We shall use partial Jacobians again in Chapter 12 to discuss change of variables for integrals in $\mathbf{R}^n$, and in Chapter 13 to introduce differential forms of order 2.
Here is a result about the existence of solutions to relations. (In this theorem we use the notation $(x, t)$ to represent the vector $(x_1, \dots, x_n, t_1, \dots, t_p)$.)
11.47 THEOREM [THE IMPLICIT FUNCTION THEOREM]. Suppose that $V$ is open in $\mathbf{R}^{n+p}$, and $F = (F_1, \dots, F_n) : V \to \mathbf{R}^n$ is $C^1$ on $V$. Suppose further that $F(x_0, t_0) = 0$ for some $(x_0, t_0) \in V$, where $x_0 \in \mathbf{R}^n$ and $t_0 \in \mathbf{R}^p$. If
$$\frac{\partial(F_1, \dots, F_n)}{\partial(x_1, \dots, x_n)}(x_0, t_0) \neq 0,$$
then there is an open set $W \subset \mathbf{R}^p$ containing $t_0$ and a unique continuously differentiable function $g : W \to \mathbf{R}^n$ such that $g(t_0) = x_0$, and $F(g(t), t) = 0$ for all $t \in W$.
STRATEGY: The idea behind the proof is simple. If $F$ took its range in $\mathbf{R}^{n+p}$ instead of $\mathbf{R}^n$ and had nonzero Jacobian, then by the Inverse Function Theorem, $F^{-1}$ would exist and be differentiable on some open set. Presumably, the first $n$ components of $F^{-1}$ would solve $F$ for the variables $x_1, \dots, x_n$. Thus we should extend $F$ (in the simplest possible way) to a function $\tilde F$ that takes its range in $\mathbf{R}^{n+p}$ and has nonzero Jacobian, and apply the Inverse Function Theorem to $\tilde F$. Here are the details.
PROOF. For each $(x, t) \in V$, set
$$\tilde F(x, t) = (F(x, t), t) = (F_1(x, t), \dots, F_n(x, t), t_1, \dots, t_p). \tag{30}$$
Clearly, $\tilde F : V \to \mathbf{R}^{n+p}$ and
$$D\tilde F = \begin{bmatrix} \left[\dfrac{\partial F_i}{\partial x_j}\right]_{n \times n} & B \\ O_{p \times n} & I_{p \times p} \end{bmatrix},$$
where $O_{p \times n}$ represents a zero matrix, $I_{p \times p}$ represents an identity matrix, and $B$ represents a certain $n \times p$ matrix whose entries are first-order partial derivatives of
$F_j$'s with respect to $t_k$'s. Expanding the determinant of $D\tilde F$ along the bottom rows first, we see by hypothesis that
$$\Delta_{\tilde F}(x_0, t_0) = 1 \cdot \frac{\partial(F_1, \dots, F_n)}{\partial(x_1, \dots, x_n)}(x_0, t_0) \neq 0.$$
Since $\tilde F(x_0, t_0) = (0, t_0)$, it follows from the Inverse Function Theorem that there exist open sets $\Omega_1$ containing $(x_0, t_0)$ and $\Omega_2 := \tilde F(\Omega_1)$ containing $(0, t_0)$ such that $\tilde F$ is 1-1 on $\Omega_1$, and $G := \tilde F^{-1}$ is 1-1 and continuously differentiable on $\Omega_2$. Let $\phi = (G_1, \dots, G_n)$. Since $G = \tilde F^{-1}$ is 1-1 from $\Omega_2$ onto $\Omega_1$, it is evident by (30) that
$$\phi(\tilde F(x, t)) = x \tag{31}$$
for all $(x, t) \in \Omega_1$ and
$$\tilde F(\phi(x, t), t) = (x, t) \tag{32}$$
for all $(x, t) \in \Omega_2$. Define $g$ on $W := \{t \in \mathbf{R}^p : (0, t) \in \Omega_2\}$ by $g(t) = \phi(0, t)$. Since $\Omega_2$ is open in $\mathbf{R}^{n+p}$, $W$ is open in $\mathbf{R}^p$. Since $G$ is continuously differentiable on $\Omega_2$ and $\phi$ represents the first $n$ components of $G$, $g$ is continuously differentiable on $W$. By the definition of $g$, the choice of $x_0$, and (31), we have
$$g(t_0) = \phi(0, t_0) = \phi(\tilde F(x_0, t_0)) = x_0.$$
Moreover, by (30) and (32) we have $F(\phi(x, t), t) = x$ for all $(x, t) \in \Omega_2$. Specializing to the case $x = 0$, we obtain $F(g(t), t) = 0$ for $t \in W$.
It remains to show uniqueness. But if $h : W \to \mathbf{R}^n$ satisfies $F(h(t), t) = 0 = F(g(t), t)$; i.e., $\tilde F(h(t), t) = (0, t) = \tilde F(g(t), t)$, then $g(t) = h(t)$ for all $t \in W$, since $\tilde F$ is 1-1 on $\Omega_1$. ∎
Theorem 11.47 is an existence theorem. It states that a solution $g$ exists without giving us any idea how to find it. Fortunately, for many applications it is not as important to be able to write an explicit formula for $g$ as it is to know that $g$ exists. Here is an example for which an explicit solution is unobtainable.
11.48 Example. Prove that there is a function $g(s, t)$, continuously differentiable on some $B_r(1, 0)$, such that $1 = g(1, 0)$, and
$$sx^2 + tx^3 + 2\sqrt{t + s} + t^2x^4 - x^5\cos t - x^6 = 1$$
for $x = g(s, t)$ and $(s, t) \in B_r(1, 0)$.
PROOF. If $F(x, s, t) = sx^2 + tx^3 + 2\sqrt{t + s} + t^2x^4 - x^5\cos t - x^6 - 1$, then $F(1, 1, 0) = 0$, and $F_x = 2sx + 3tx^2 + 4t^2x^3 - 5x^4\cos t - 6x^5$ is nonzero at the point $(1, 1, 0)$. Applying the Implicit Function Theorem to $F$, with $n = 1$, $p = 2$, $x_0 = 1$, and $(s_0, t_0) = (1, 0)$, we conclude that such a $g$ exists. ∎
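Although no explicit formula for $g$ is available in Example 11.48, values of $g$ can be approximated once existence is known. The following is a minimal sketch, assuming Newton's method in the variable $x$ started from $x_0 = 1$; the function name `g_approx`, the tolerance, and the sample point $(s, t) = (1.05, 0.02)$ are illustrative assumptions, and convergence is only expected for $(s, t)$ close to $(1, 0)$.

```python
import math

def F(x, s, t):
    """The relation of Example 11.48, written as F(x, s, t) = 0."""
    return (s * x**2 + t * x**3 + 2 * math.sqrt(t + s)
            + t**2 * x**4 - x**5 * math.cos(t) - x**6 - 1)

def F_x(x, s, t):
    """Partial derivative of F with respect to x."""
    return (2 * s * x + 3 * t * x**2 + 4 * t**2 * x**3
            - 5 * x**4 * math.cos(t) - 6 * x**5)

def g_approx(s, t, x_start=1.0, tol=1e-12, max_iter=50):
    """Newton iteration in x for F(x, s, t) = 0, started at x_start."""
    x = x_start
    for _ in range(max_iter):
        step = F(x, s, t) / F_x(x, s, t)
        x -= step
        if abs(step) < tol:
            break
    return x

x_base = g_approx(1.0, 0.0)      # should reproduce g(1, 0) = 1
x_near = g_approx(1.05, 0.02)    # approximate value of g at a nearby point
residual = F(x_near, 1.05, 0.02)
print(x_base, x_near, residual)
```

Newton's method is a natural choice here because $F_x(1, 1, 0) = -9 \neq 0$, which is exactly the nondegeneracy condition used in the proof.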
Even when an explicit solution is obtainable, it is frequently easier to apply the Implicit Function Theorem than it is to solve a relation explicitly for one or more of its variables. Indeed, consider Example 11.46 again. Let $F(x, s, t) = 1 - x^2 - s^2 - t^2$ and notice that $F_x = -2x$. Thus, by the Implicit Function Theorem, a continuously differentiable solution $x = g(s, t)$ exists for each $x_0 \neq 0$.
The following example shows that the Implicit Function Theorem can be used to show that several differentiable solutions exist simultaneously.
11.49 Example. Prove that there exist functions $u, v : \mathbf{R}^4 \to \mathbf{R}$, continuously differentiable on some ball $B$ centered at the point $(x, y, z, w) = (2, 1, -1, -2)$, such that $u(2, 1, -1, -2) = 4$, $v(2, 1, -1, -2) = 3$, and the equations
both hold for all $(x, y, z, w)$ in $B$.
PROOF. Set $n = 2$, $p = 4$, and
Then $F(4, 3, 2, 1, -1, -2) = (0, 0)$, and
$$\frac{\partial(F_1, F_2)}{\partial(u, v)} = \det\begin{bmatrix} 2u & 2v \\ 2u/x^2 & 2v/y^2 \end{bmatrix} = 4uv\left(\frac{1}{y^2} - \frac{1}{x^2}\right).$$
This determinant is nonzero when $u = 4$, $v = 3$, $x = 2$, and $y = 1$. Therefore, such functions $u, v$ exist by the Implicit Function Theorem. ∎
EXERCISES
1. For each of the following functions, prove that $f^{-1}$ exists and is differentiable in some nonempty, open set containing $(a, b)$, and compute $D(f^{-1})(a, b)$.
(a) $f(u, v) = (3u - v, 2u + 5v)$ at $(a, b)$.
(b) $f(u, v) = (u + v, \sin u + \cos v)$ at $(a, b) = (0, 1)$.
(c) $f(u, v) = (uv, u^2 + v^2)$ at $(a, b) = (2, 5)$.
(d) $f(u, v) = (u^3 - v^2, \sin u - \log v)$ at $(a, b) = (-1, 0)$.
2. For each of the following functions, find out whether the given expression can be solved for $z$ in a nonempty, open set $V$ containing $(0, 0, 0)$. Is the solution differentiable near $(0, 0)$?
(a) $xyz + \sin(x + y + z) = 0$.
(b) $x^2 + y^2 + z^2 + \sqrt[3]{2xy + 3z + 8} = 2$.
(c) $xyz(2\cos y - \cos z) + (z\cos x - x\cos y) = 0$.
(d) $x + y + z + g(x, y, z) = 0$, where $g$ is any continuously differentiable function that satisfies $g(0, 0, 0) = 0$ and $g_z(0, 0, 0) > 0$.
3. Prove that there exist functions $u(x, y)$, $v(x, y)$, and $w(x, y)$, and an $r > 0$ such that $u, v, w$ are continuously differentiable and satisfy the equations
$$\begin{aligned} u^5 + xv^2 - y + w &= 0 \\ v^5 + yu^2 - x + w &= 0 \\ w^4 + y^5 - x^4 &= 1 \end{aligned}$$
on $B_r(1, 1)$, and $u(1, 1) = 1$, $v(1, 1) = 1$, $w(1, 1) = -1$.
4. Find conditions on a point $(x_0, y_0, u_0, v_0)$ such that there exist real-valued functions $u(x, y)$ and $v(x, y)$ that are continuously differentiable near $(x_0, y_0)$ and satisfy the simultaneous equations
$$\begin{aligned} xu^2 + yv^2 + xy &= 9 \\ xv^2 + yu^2 - xy &= 7. \end{aligned}$$
Prove that the solutions satisfy $u^2 + v^2 = 16/(x + y)$.
5. Given nonzero numbers $x_0, y_0, u_0, v_0, s_0, t_0$ that satisfy the simultaneous equations
$$\begin{aligned} u^2 + sx + ty &= 0 \\ v^2 + tx + sy &= 0 \\ 2s^2x + 2t^2y - 1 &= 0 \\ s^2x - t^2y &= 0, \end{aligned} \tag{*}$$
prove that there exist functions $u(x, y)$, $v(x, y)$, $s(x, y)$, $t(x, y)$, and an open ball $B$ containing $(x_0, y_0)$, such that $u, v, s, t$ are continuously differentiable and satisfy (*) on $B$, and such that $u(x_0, y_0) = u_0$, $v(x_0, y_0) = v_0$, $s(x_0, y_0) = s_0$, and $t(x_0, y_0) = t_0$.
6. Let $E = \{(x, y) : 0 < y < x\}$ and set $f(x, y) = (x + y, xy)$ for $(x, y) \in E$.
(a) Prove that $f$ is 1-1 from $E$ onto $\{(s, t) : s > 2\sqrt{t},\ t > 0\}$ and find a formula for $f^{-1}(s, t)$.
(b) Use the Inverse Function Theorem to compute $D(f^{-1})(f(x, y))$ for $(x, y) \in E$.
(c) Use the formula you obtained in part (a) to compute $D(f^{-1})(s, t)$ directly. Check to see that this answer agrees with the one you found in part (b).
7. Suppose that $f : \mathbf{R}^2 \to \mathbf{R}^2$ has continuous first-order partial derivatives in some ball $B_r(x_0, y_0)$, $r > 0$. Prove that if $\Delta_f(x_0, y_0) \neq 0$, then
$$\frac{\partial f_1^{-1}}{\partial x}(f(x_0, y_0)) = \frac{\partial f_2/\partial y\,(x_0, y_0)}{\Delta_f(x_0, y_0)}, \qquad \frac{\partial f_1^{-1}}{\partial y}(f(x_0, y_0)) = \frac{-\partial f_1/\partial y\,(x_0, y_0)}{\Delta_f(x_0, y_0)},$$
and
$$\frac{\partial f_2^{-1}}{\partial x}(f(x_0, y_0)) = \frac{-\partial f_2/\partial x\,(x_0, y_0)}{\Delta_f(x_0, y_0)}, \qquad \frac{\partial f_2^{-1}}{\partial y}(f(x_0, y_0)) = \frac{\partial f_1/\partial x\,(x_0, y_0)}{\Delta_f(x_0, y_0)}.$$
8. This exercise is used in Section e11.7. Let $F : \mathbf{R}^3 \to \mathbf{R}$ be continuously differentiable at $(a, b, c)$ with $\nabla F(a, b, c) \neq 0$.
(a) Prove that the graph of the relation $F(x, y, z) = 0$; i.e., the set $\mathcal{G} := \{(x, y, z) : F(x, y, z) = 0\}$, has a tangent plane at $(a, b, c)$.
(b) Prove that a normal of the tangent plane to $\mathcal{G}$ at $(a, b, c)$ is given by $\nabla F(a, b, c)$.
9. Suppose that $f := (u, v) : \mathbf{R} \to \mathbf{R}^2$ is $C^2$ and $(x_0, y_0) = f(t_0)$.
(a) Prove that if $\nabla f(t_0) \neq 0$, then $u'(t_0)$ and $v'(t_0)$ cannot both be zero.
(b) If $\nabla f(t_0) \neq 0$, show that either there is a $C^1$ function $g$ such that $g(x_0) = t_0$ and $u(g(x)) = x$ for $x$ near $x_0$, or there is a $C^1$ function $h$ such that $h(y_0) = t_0$ and $v(h(y)) = y$ for $y$ near $y_0$.
e11.7 OPTIMIZATION
This section uses no material from any other enrichment section.
In this section we discuss how to find extreme values of differentiable functions of several variables.
11.50 DEFINITION. Let $V$ be open in $\mathbf{R}^n$, let $a \in V$, and suppose that $f : V \to \mathbf{R}$.
(i) $f(a)$ is called a local minimum of $f$ if and only if there is an $r > 0$ such that $f(a) \le f(x)$ for all $x \in B_r(a)$.
(ii) $f(a)$ is called a local maximum of $f$ if and only if there is an $r > 0$ such that $f(a) \ge f(x)$ for all $x \in B_r(a)$.
(iii) $f(a)$ is called a local extremum of $f$ if and only if $f(a)$ is a local maximum or a local minimum of $f$.
The following result shows that as in the one-dimensional case, extrema of real-valued differentiable functions occur among points where the "derivative" is zero.
11.51 Remark. If the first-order partial derivatives of $f$ exist at $a$, and $f(a)$ is a local extremum of $f$, then $\nabla f(a) = 0$.
PROOF. The one-dimensional function $g(t) = f(a_1, \dots, a_{j-1}, t, a_{j+1}, \dots, a_n)$ has a local extremum at $t = a_j$ for each $j = 1, \dots, n$. Hence, by the one-dimensional theory,
$$\frac{\partial f}{\partial x_j}(a) = g'(a_j) = 0.$$
∎
As in the one-dimensional case, $\nabla f(a) = 0$ is necessary but not sufficient for $f(a)$ to be a local extremum.
11.52 Remark. There exist continuously differentiable functions that satisfy $\nabla f(a) = 0$ such that $f(a)$ is neither a local maximum nor a local minimum.
PROOF. Consider
[Figure 11.6]
Since the first-order partial derivatives of $f$ exist and are continuous everywhere on $\mathbf{R}^2$, $f$ is continuously differentiable on $\mathbf{R}^2$. Moreover, it is evident that $\nabla f(0) = 0$, but $f(0)$ is not a local extremum (see Figure 11.6). ∎
The fact that the graph of this function resembles a saddle motivates the following terminology.
11.53 DEFINITION. Let $V$ be open in $\mathbf{R}^n$, let $a \in V$, and let $f : V \to \mathbf{R}$ be differentiable at $a$. Then $a$ is called a saddle point of $f$ if $\nabla f(a) = 0$ and there is an $r_0 > 0$ such that given any $0 < \rho < r_0$ there are points $x, y \in B_\rho(a)$ that satisfy
$$f(x) < f(a) < f(y).$$
By the Extreme Value Theorem, if $f$ is continuous on a compact set $H$, then it attains its maximum and minimum on $H$; i.e., there exist points $a, b \in H$ such that
$$f(a) = \sup_{x \in H} f(x) \quad\text{and}\quad f(b) = \inf_{x \in H} f(x).$$
When $f$ is a function of two variables, these points can be found by combining Remark 11.51 with one-dimensional techniques.
11.54 Example. Find the maximum and minimum of $f(x, y) = x^2 - x + y^2 - 2y$ on $H = \overline{B_1(0, 0)}$.
SOLUTION. Since $\nabla f(x, y) = (0, 0)$ implies $(x, y) = (1/2, 1)$, which lies outside $H$, $f$ has no local extrema inside $H$. Thus the extrema of $f$ on $H$ must occur on $\partial H$. Using polar coordinates, we can describe $\partial H$ by $(x, y) = (\cos\theta, \sin\theta)$, where $0 \le \theta < 2\pi$. Set
$$h(\theta) := f(\cos\theta, \sin\theta) = 1 - \cos\theta - 2\sin\theta.$$
Notice that the derivative of $h$ is zero when $\tan\theta = 2$; i.e., $\theta = \arctan 2 \approx 1.10715$ or $\theta = \arctan 2 + \pi \approx 4.24874$. Therefore, candidates for the extrema of $f$ on $\partial H$ are $(x, y) \approx (0.4472, 0.8944)$ and $(x, y) \approx (-0.4472, -0.8944)$. Checking the sign of $h''(\theta)$, we see that the first point corresponds to a minimum, and the second point corresponds to a maximum. Therefore, the maximum of $f$ on $H$ is $f(-0.4472, -0.8944) \approx 3.236$, and the minimum of $f$ on $H$ is $f(0.4472, 0.8944) \approx -1.236$. ∎
Using the second-order total differential $D^{(2)} f$ introduced in Section 11.5, we can obtain a multidimensional analogue of the Second Derivative Test. First, we prove a technical result.
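Returning briefly to Example 11.54, the boundary extrema found there can be cross-checked by brute force. The sketch below samples $h(\theta) = f(\cos\theta, \sin\theta)$ on a uniform grid (the grid size is an arbitrary assumption) and compares the sampled extremes with $1 \pm \sqrt{5}$, the exact values of $1 - \cos\theta - 2\sin\theta$ at $\tan\theta = 2$.

```python
import math

def f(x, y):
    """The function of Example 11.54."""
    return x * x - x + y * y - 2 * y

def h(theta):
    """Restriction of f to the unit circle, parametrized by the angle theta."""
    return f(math.cos(theta), math.sin(theta))

N = 20000                                   # number of sample angles (assumed)
values = [h(2 * math.pi * i / N) for i in range(N)]
h_min, h_max = min(values), max(values)

# The only critical point of f is (1/2, 1); check that it lies outside H.
critical_outside = math.sqrt(0.5 ** 2 + 1.0 ** 2) > 1
print(h_min, h_max, critical_outside)
```

A grid search of this kind only approximates the extremes; it is meant as a check on the calculus argument, not a replacement for it.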
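11.55 Lemma. Let $V$ be open in $\mathbf{R}^n$, $a \in V$, and $f : V \to \mathbf{R}$. If all second-order partial derivatives of $f$ exist at $a$ and $D^{(2)} f(a; h) > 0$ for all $h \neq 0$, then there is an $m > 0$ such that
$$D^{(2)} f(a; x) \ge m\|x\|^2 \tag{33}$$
for all $x \in \mathbf{R}^n$.
PROOF. Set $H = \{x \in \mathbf{R}^n : \|x\| = 1\}$ and consider the function
$$g(x) := D^{(2)} f(a; x) := \sum_{j=1}^n \sum_{k=1}^n \frac{\partial^2 f}{\partial x_k\, \partial x_j}(a)\, x_j x_k, \qquad x \in \mathbf{R}^n.$$
By hypothesis, $g$ is continuous and positive on $\mathbf{R}^n \setminus \{0\}$, hence on $H$. Since $H$ is compact, it follows from the Extreme Value Theorem that $g$ has a positive minimum $m$ on $H$.
Clearly, (33) holds for $x = 0$. If $x \neq 0$, then $x/\|x\| \in H$, and it follows from the choice of $g$ and $m$ that
$$D^{(2)} f(a; x) = \|x\|^2\, g\!\left(\frac{x}{\|x\|}\right) \ge m\|x\|^2.$$
We conclude that (33) holds for all $x \in \mathbf{R}^n$. ∎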
11.56 THEOREM [SECOND DERIVATIVE TEST]. Let $V$ be open in $\mathbf{R}^n$, $a \in V$, and suppose that $f : V \to \mathbf{R}$ satisfies $\nabla f(a) = 0$. Suppose further that the second-order total differential of $f$ exists on $V$ and is continuous at $a$.
(i) If $D^{(2)} f(a; h) > 0$ for all $h \neq 0$, then $f(a)$ is a local minimum of $f$.
(ii) If $D^{(2)} f(a; h) < 0$ for all $h \neq 0$, then $f(a)$ is a local maximum of $f$.
(iii) If $D^{(2)} f(a; h)$ takes on both positive and negative values for $h \in \mathbf{R}^n$, then $a$ is a saddle point of $f$.
PROOF. Choose $r > 0$ such that $B_r(a) \subset V$, and suppose for a moment that there is a function $\varepsilon : B_r(0) \to \mathbf{R}$ such that $\varepsilon(h) \to 0$ as $h \to 0$ and
$$f(a + h) - f(a) = \frac{1}{2} D^{(2)} f(a; h) + \|h\|^2 \varepsilon(h) \tag{34}$$
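for $h$ sufficiently small. If $D^{(2)} f(a; h) > 0$ for $h \neq 0$, then (33) and (34) imply
$$f(a + h) - f(a) \ge \left(\frac{m}{2} + \varepsilon(h)\right)\|h\|^2$$
for $h$ sufficiently small. Since $m > 0$ and $\varepsilon(h) \to 0$ as $h \to 0$, it follows that $f(a + h) - f(a) > 0$ for $h$ sufficiently small; i.e., $f(a)$ is a local minimum. Similarly, if $D^{(2)} f(a; h) < 0$ for $h \neq 0$, then $f(a)$ is a local maximum. This proves parts (i) and (ii).
To prove part (iii), fix $h \in \mathbf{R}^n$ and notice that (34) implies
$$f(a + th) - f(a) = t^2\left(\frac{1}{2} D^{(2)} f(a; h) + \|h\|^2 \varepsilon(th)\right)$$
for $t \in \mathbf{R}$. Since $\varepsilon(th) \to 0$ as $t \to 0$, it follows that $f(a + th) - f(a)$ takes on the same sign as $D^{(2)} f(a; h)$ for $t$ small. In particular, if $D^{(2)} f(a; h)$ takes on both positive and negative values as $h$ varies, then $a$ is a saddle point.
It remains to find a function $\varepsilon : B_r(0) \to \mathbf{R}$ such that $\varepsilon(h) \to 0$ as $h \to 0$, and (34) holds for all $h$ sufficiently small. Set $\varepsilon(0) = 0$ and
$$\varepsilon(h) = \frac{f(a + h) - f(a) - \frac{1}{2} D^{(2)} f(a; h)}{\|h\|^2}, \qquad h \in B_r(0),\ h \neq 0.$$
By the definition of $\varepsilon(h)$, (34) holds for $h \in B_r(0)$. Does $\varepsilon(h) \to 0$ as $h \to 0$? Fix $h = (h_1, h_2, \dots, h_n) \in B_r(0)$. Since $\nabla f(a) = 0$, Taylor's Formula implies
$$f(a + h) - f(a) = \frac{1}{2!} D^{(2)} f(c; h)$$
for some $c \in L(a; a + h)$; i.e.,
$$f(a + h) - f(a) - \frac{1}{2} D^{(2)} f(a; h) = \frac{1}{2} \sum_{j=1}^n \sum_{k=1}^n \left(\frac{\partial^2 f}{\partial x_k\, \partial x_j}(c) - \frac{\partial^2 f}{\partial x_k\, \partial x_j}(a)\right) h_j h_k.$$
Since $|h_j h_k| \le \|h\|^2$ and the second-order partial derivatives of $f$ are continuous at $a$, it follows that
$$0 \le |\varepsilon(h)| \le \frac{1}{2} \sum_{j=1}^n \sum_{k=1}^n \left|\frac{\partial^2 f}{\partial x_k\, \partial x_j}(c) - \frac{\partial^2 f}{\partial x_k\, \partial x_j}(a)\right| \to 0$$
as $h \to 0$. We conclude by the Squeeze Theorem that $\varepsilon(h) \to 0$ as $h \to 0$. ∎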
The following result shows that the strict inequalities in Theorem 11.56 cannot be relaxed.
11.57 Remark. If $D^{(2)} f(a; h) \ge 0$, then $f(a)$ can be a local minimum or $a$ can be a saddle point.
PROOF. $f(0, 0)$ is a local minimum of $f(x, y) = x^4 + y^2$, and $(0, 0)$ is a saddle point of $f(x, y) = x^3 + y^2$. ∎
In practice, it is not easy to determine the sign of $D^{(2)} f(a; h)$. For the case $n = 2$, the second total differential $D^{(2)} f(a; h)$ is a quadratic form, i.e., has the form $Ah^2 + 2Bhk + Ck^2$. The following result shows that the sign of a quadratic form is determined completely by the discriminant $D = B^2 - AC$.
11.58 Lemma. Let $A, B, C \in \mathbf{R}$, $D = B^2 - AC$, and $\phi(h, k) = Ah^2 + 2Bhk + Ck^2$.
(i) If $D < 0$, then $A$ and $\phi(h, k)$ have the same sign for all $(h, k) \neq (0, 0)$.
(ii) If $D > 0$, then $\phi(h, k)$ takes on both positive and negative values as $(h, k)$ varies over $\mathbf{R}^2$.
PROOF. (i) If $D < 0$, then $A \neq 0$ and $A\phi(h, k)$ is a sum of two squares:
$$A\phi(h, k) = A^2h^2 + 2ABhk + ACk^2 = (Ah + Bk)^2 + |D|k^2.$$
Since $A \neq 0 \neq D$, at least one of these squares is positive for each $(h, k) \neq (0, 0)$. It follows that $A$ and $\phi(h, k)$ have the same sign for all $(h, k) \neq (0, 0)$.
(ii) If $D > 0$, then either $A \neq 0$ or $B \neq 0$. If $A \neq 0$, then $A\phi(h, k)$ is a difference of two squares:
$$A\phi(h, k) = (Ah + Bk - \sqrt{D}\,k)(Ah + Bk + \sqrt{D}\,k).$$
The lines $Ah + Bk - \sqrt{D}\,k = 0$ and $Ah + Bk + \sqrt{D}\,k = 0$ divide the $hk$ plane into four open regions (see Figure 11.7). Since $A\phi(h, k)$ is positive on two of these regions and negative on the other two, it follows that $\phi(h, k)$ takes on both positive and negative values as $(h, k)$ varies over $\mathbf{R}^2$.
If $A = 0$ and $B \neq 0$, then
$$\phi(h, k) = 2Bhk + Ck^2 = (2Bh + Ck)k.$$
Since $B \neq 0$, the lines $2Bh + Ck = 0$ and $k = 0$ divide the $hk$ plane into four open regions. As before, $\phi(h, k)$ takes on both positive and negative values as $(h, k)$ varies over $\mathbf{R}^2$. ∎
This result leads us to the following simple test for extrema and saddle points.
11.59 THEOREM. Let $V$ be open in $\mathbf{R}^2$, $(a, b) \in V$, and suppose that $f : V \to \mathbf{R}$ satisfies $\nabla f(a, b) = 0$. Suppose further that the second-order total differential of $f$ exists on $V$ and is continuous at $(a, b)$, and set
$$D = f_{xy}^2(a, b) - f_{xx}(a, b)\, f_{yy}(a, b).$$
(i) If $D < 0$ and $f_{xx}(a, b) > 0$, then $f(a, b)$ is a local minimum.
(ii) If $D < 0$ and $f_{xx}(a, b) < 0$, then $f(a, b)$ is a local maximum.
(iii) If $D > 0$, then $(a, b)$ is a saddle point.
[Figure 11.7: the lines $Ah + Bk - \sqrt{D}\,k = 0$ and $Ah + Bk + \sqrt{D}\,k = 0$ dividing the $hk$ plane into regions I, II, III, IV.]
PROOF. Set $A = f_{xx}(a, b)$, $B = f_{xy}(a, b)$, and $C = f_{yy}(a, b)$. Apply Theorem 11.56 and Lemma 11.58. ∎
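The test in Theorem 11.59 is mechanical enough to be written as a small routine. The following sketch assumes the three second-order partial derivatives at a critical point have already been computed by hand; the function name `classify` and the sample functions are illustrative choices, and the case $D = 0$ is reported as inconclusive.

```python
def classify(fxx, fxy, fyy):
    """Apply the discriminant test at a critical point (a, b).

    fxx, fxy, fyy are the second-order partials of f at (a, b).
    """
    D = fxy * fxy - fxx * fyy          # discriminant D = B^2 - AC
    if D < 0:
        # D < 0 forces fxx * fyy > 0, so fxx is nonzero here
        return "local minimum" if fxx > 0 else "local maximum"
    if D > 0:
        return "saddle point"
    return "inconclusive"              # D = 0: the test gives no information

# Second partials at (0, 0) for a few sample functions (computed by hand):
results = {
    "x^2 + y^2": classify(2.0, 0.0, 2.0),
    "-x^2 - y^2": classify(-2.0, 0.0, -2.0),
    "x^2 - y^2": classify(2.0, 0.0, -2.0),
    "x^2": classify(2.0, 0.0, 0.0),
}
for name, verdict in results.items():
    print(name, "->", verdict)
```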
(For a discriminant that works for functions of three variables, see Widder [14], p. 134.)
11.60 Remark. If the discriminant $D = 0$, $f(a, b)$ may be a local maximum, a local minimum, or $(a, b)$ may be a saddle point.
PROOF. The function $f(x, y) = x^2$ has zero discriminant at $(a, b) = (0, 0)$, and $0 = f(0, 0)$ is a local minimum for $f$. On the other hand, $f(x, y) = x^3$ has zero discriminant at $(a, b) = (0, 0)$, and $(0, 0)$ is a saddle point for $f$. ∎
In practice, one often wishes to optimize a function subject to certain constraints. (For example, we do not simply want to build the cheapest shipping container, but the cheapest shipping container that will fit in a standard railway car and will not fall apart after several trips.)
11.61 DEFINITION. Let $V$ be open in $\mathbf{R}^n$, $a \in V$, and $f, g_j : V \to \mathbf{R}$ for $j = 1, 2, \dots, m$.
(i) $f(a)$ is called a local minimum of $f$ subject to the constraints $g_j(a) = 0$, $j = 1, \dots, m$, if and only if there is a $\rho > 0$ such that $x \in B_\rho(a)$ and $g_j(x) = 0$ for all $j = 1, \dots, m$ imply $f(x) \ge f(a)$.
(ii) $f(a)$ is called a local maximum of $f$ subject to the constraints $g_j(a) = 0$, $j = 1, \dots, m$, if and only if there is a $\rho > 0$ such that $x \in B_\rho(a)$ and $g_j(x) = 0$ for all $j = 1, \dots, m$ imply $f(x) \le f(a)$.
11.62 Example. Find all points on the ellipsoid $x^2 + 2y^2 + 3z^2 = 1$ (see Appendix D) that lie closest to or farthest from the origin.
SOLUTION. We must optimize the distance formula $\sqrt{x^2 + y^2 + z^2}$; equivalently, we must optimize the function $f(x, y, z) = x^2 + y^2 + z^2$, subject to the constraint $g(x, y, z) = x^2 + 2y^2 + 3z^2 - 1 = 0$. Using $g$ to eliminate the variable $x$ in $f$, we see that $f$ takes on the form $\phi(y, z) = 1 - y^2 - 2z^2$. Solving $\nabla\phi(y, z) = (0, 0)$, we obtain $(y, z) = (0, 0)$; i.e., $x^2 = 1$. Thus, elimination of $x$ leads to the points $(\pm 1, 0, 0)$. Similarly, elimination of $y$ leads to $(0, \pm 1/\sqrt{2}, 0)$, and elimination of $z$ leads to $(0, 0, \pm 1/\sqrt{3})$. Checking the distance formula, we see that the maximum distance is 1, which occurs at the points $(\pm 1, 0, 0)$, and the minimum distance is $1/\sqrt{3}$, which occurs at the points $(0, 0, \pm 1/\sqrt{3})$. (The points $(0, \pm 1/\sqrt{2}, 0)$ are saddle points, i.e., correspond neither to a maximum nor to a minimum.) ∎
[Figure 11.8]
Optimizing a function subject to constraints, as above, by eliminating one or more of the variables is called the direct method. There is another, more geometric method for solving Example 11.62. Notice that the points on the ellipsoid $g(x, y, z) = x^2 + 2y^2 + 3z^2 - 1 = 0$ that are closest to and farthest from the origin occur at points where the tangent planes of the ellipsoid $g(x, y, z) = 0$ and the sphere $f(x, y, z) = 1$ are parallel (see Figure 11.8). Recall that two nonzero vectors $a$ and $b$ are parallel if and only if $a + \lambda b = 0$ for some scalar $\lambda \neq 0$. Since normal vectors of the tangent planes of $f(x, y, z) = 1$ and $g(x, y, z) = 0$ are $\nabla f$ and $\nabla g$ (see Exercise 8, p. 369), it follows that extremal points $(x, y, z)$ of $f(x, y, z)$ subject to the constraint $g(x, y, z) = 0$ must satisfy
$$\nabla f(x, y, z) + \lambda \nabla g(x, y, z) = 0 \tag{35}$$
for some $\lambda \neq 0$. For the case at hand, (35) implies $(2x, 2y, 2z) + \lambda(2x, 4y, 6z) = (0, 0, 0)$. Combining this equation with the constraint $g(x, y, z) = 0$, we have four
equations in four unknowns:
$$x(\lambda + 1) = 0, \quad y(2\lambda + 1) = 0, \quad z(3\lambda + 1) = 0, \quad\text{and}\quad x^2 + 2y^2 + 3z^2 = 1.$$
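These four equations can be checked mechanically. Since at most one of $\lambda + 1$, $2\lambda + 1$, $3\lambda + 1$ can vanish, at most one of $x, y, z$ can be nonzero, and the constraint then fixes its size. The sketch below builds the resulting candidates and evaluates the left-hand sides of the system at each of them; the helper name `residuals` and the way the candidates are generated are assumptions made for illustration.

```python
import math

def residuals(x, y, z, lam):
    """Left-hand sides of the four equations, each of which should vanish."""
    return (x * (lam + 1), y * (2 * lam + 1), z * (3 * lam + 1),
            x * x + 2 * y * y + 3 * z * z - 1)

# (axis index, multiplier lambda, coefficient of that axis in the constraint)
cases = ((0, -1.0, 1.0), (1, -0.5, 2.0), (2, -1.0 / 3.0, 3.0))

candidates = []
for axis, lam, coeff in cases:
    for sign in (1.0, -1.0):
        point = [0.0, 0.0, 0.0]
        point[axis] = sign / math.sqrt(coeff)   # size forced by the constraint
        candidates.append((point[0], point[1], point[2], lam))

# Largest violation of any equation over all six candidates
worst = max(abs(r) for c in candidates for r in residuals(*c))

# Distinct distances from the origin attained by the candidates
distances = sorted({round(math.sqrt(x * x + y * y + z * z), 12)
                    for x, y, z, _ in candidates})
print(worst, distances)
```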
Solving these equations, we obtain three pairs of solutions: $(\pm 1, 0, 0)$ (when $\lambda = -1$), $(0, \pm 1/\sqrt{2}, 0)$ (when $\lambda = -1/2$), and $(0, 0, \pm 1/\sqrt{3})$ (when $\lambda = -1/3$). Hence, we obtain the same solutions with the geometric method as we did with the direct method.

The following result shows that the geometric method is valid, even in the case when the functions have nothing to do with spheres and ellipsoids, and even when several constraints are used. This is fortunate since the direct method cannot be used unless the constraints are relatively simple.

11.63 THEOREM [LAGRANGE MULTIPLIERS]. Let $m < n$, $V$ be open in $\mathbf{R}^n$, and $f, g_j : V \to \mathbf{R}$ be $C^1$ on $V$ for $j = 1, 2, \dots, m$. Suppose that there is an $a \in V$ such that

$$\frac{\partial(g_1, \dots, g_m)}{\partial(x_1, \dots, x_m)}(a) \neq 0.$$

If $f(a)$ is a local extremum of $f$ subject to the constraints $g_k(a) = 0$, $k = 1, \dots, m$, then there exist scalars $\lambda_1, \lambda_2, \dots, \lambda_m$ such that

$$\nabla f(a) + \sum_{k=1}^{m} \lambda_k \nabla g_k(a) = 0. \tag{36}$$
PROOF. Equation (36) is a system of $n$ equations in $m$ unknowns, $\lambda_1, \lambda_2, \dots, \lambda_m$:

$$\sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial x_j}(a) = -\frac{\partial f}{\partial x_j}(a), \qquad j = 1, 2, \dots, n. \tag{37}$$

The first $m$ of these equations form a system of $m$ linear equations in $m$ variables whose matrix of coefficients has a nonzero determinant, hence uniquely determine the $\lambda_k$'s. What remains to be seen is that because $f(a)$ is a local extremum subject to the constraints $g_k(a) = 0$, these same $\lambda_k$'s also satisfy (37) for $j = m + 1, \dots, n$.

This is a question about implicit functions. Let $p = n - m$. As in the proof of the Implicit Function Theorem, write vectors in $\mathbf{R}^{m+p}$ in the form $x = (y, t) = (y_1, \dots, y_m, t_1, \dots, t_p)$, so that $t_\ell = x_{m+\ell}$. We must show that

$$0 = \frac{\partial f}{\partial t_\ell}(a) + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial t_\ell}(a) \tag{38}$$
for $\ell = 1, \dots, p$. Let $g = (g_1, \dots, g_m)$, and choose $y_0 \in \mathbf{R}^m$, $t_0 \in \mathbf{R}^p$ such that $a = (y_0, t_0)$. By hypothesis, $g(y_0, t_0) = 0$ and the Jacobian of $g$ (with respect to the variables $y_j$) is nonzero at $(y_0, t_0)$. Hence, by the Implicit Function Theorem, there is an open set $W \subset \mathbf{R}^p$ that contains $t_0$, and a function $h : W \to \mathbf{R}^m$ such that $h$ is continuously differentiable on $W$, $h(t_0) = y_0$, and

$$g(h(t), t) = 0, \qquad t \in W. \tag{39}$$

For each $t \in W$ and $k = 1, \dots, m$, set

$$G_k(t) = g_k(h(t), t) \quad\text{and}\quad F(t) = f(h(t), t).$$

We shall use the functions $G_1, \dots, G_m$ and $F$ to verify (38) for $\ell = 1, \dots, p$. Fix such an $\ell$. By (39), each $G_k$ is identically zero on $W$, hence has derivative zero there. Since $t_0 \in W$ and $(h(t_0), t_0) = (y_0, t_0) = a$, it follows from the Chain Rule that
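$$DG_k(t_0) = \begin{bmatrix} \dfrac{\partial g_k}{\partial x_1}(a) & \cdots & \dfrac{\partial g_k}{\partial x_n}(a) \end{bmatrix}
\begin{bmatrix}
\dfrac{\partial h_1}{\partial t_1}(t_0) & \cdots & \dfrac{\partial h_1}{\partial t_p}(t_0) \\
\vdots & \ddots & \vdots \\
\dfrac{\partial h_m}{\partial t_1}(t_0) & \cdots & \dfrac{\partial h_m}{\partial t_p}(t_0) \\
1 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & 1
\end{bmatrix}.$$

Hence, the $\ell$th component of $DG_k(t_0)$ is given by

$$0 = \sum_{j=1}^{m} \frac{\partial g_k}{\partial x_j}(a) \frac{\partial h_j}{\partial t_\ell}(t_0) + \frac{\partial g_k}{\partial t_\ell}(a) \tag{40}$$

for $k = 1, 2, \dots, m$. Multiplying (40) by $\lambda_k$ and adding, we obtain

$$0 = \sum_{j=1}^{m} \left( \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial x_j}(a) \right) \frac{\partial h_j}{\partial t_\ell}(t_0) + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial t_\ell}(a).$$

Hence, it follows from (37) that

$$0 = -\sum_{j=1}^{m} \frac{\partial f}{\partial x_j}(a) \frac{\partial h_j}{\partial t_\ell}(t_0) + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial t_\ell}(a). \tag{41}$$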
Suppose that $f(a)$ is a local maximum subject to the constraints $g(a) = 0$ (the case of a local minimum is similar). Set $E_0 = \{x \in V : g(x) = 0\}$, and choose an $n$-dimensional open ball $B(a)$ such that

$$x \in B(a) \cap E_0 \quad\text{implies}\quad f(x) \le f(a). \tag{42}$$

Since $h$ is continuous, choose a $p$-dimensional open ball $B(t_0)$ such that $t \in B(t_0)$ implies $(h(t), t) \in B(a)$. By (42), $F(t_0)$ is a local maximum of $F$ on $B(t_0)$. Hence, $\nabla F(t_0) = 0$. Applying the Chain Rule as above, we obtain

$$0 = \sum_{j=1}^{m} \frac{\partial f}{\partial x_j}(a) \frac{\partial h_j}{\partial t_\ell}(t_0) + \frac{\partial f}{\partial t_\ell}(a) \tag{43}$$

(compare with (40)). Adding (43) and (41), we conclude that

$$0 = \frac{\partial f}{\partial t_\ell}(a) + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial t_\ell}(a). \qquad ∎$$
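The Lagrange condition (36) can be sanity-checked numerically on the data of Example 11.62. The sketch below is only an illustration: the helper names (`grad_f`, `grad_g`, `lagrange_residual`) are hypothetical, and the candidate points and multipliers are the ones reported in that example. It verifies that each pair satisfies $\nabla f + \lambda \nabla g = 0$ together with the constraint, and lists the distances to the origin.

```python
import math

def grad_f(x, y, z):
    # f(x, y, z) = x^2 + y^2 + z^2
    return (2 * x, 2 * y, 2 * z)

def grad_g(x, y, z):
    # g(x, y, z) = x^2 + 2y^2 + 3z^2 - 1
    return (2 * x, 4 * y, 6 * z)

def g(x, y, z):
    return x * x + 2 * y * y + 3 * z * z - 1

# (point, multiplier) pairs reported in Example 11.62
candidates = []
for sign in (1.0, -1.0):
    candidates.append(((sign, 0.0, 0.0), -1.0))
    candidates.append(((0.0, sign / math.sqrt(2), 0.0), -0.5))
    candidates.append(((0.0, 0.0, sign / math.sqrt(3)), -1.0 / 3.0))

def lagrange_residual(point, lam):
    # Largest absolute entry of grad f + lam * grad g, together with |g|
    gf, gg = grad_f(*point), grad_g(*point)
    parts = [abs(a + lam * b) for a, b in zip(gf, gg)]
    return max(parts + [abs(g(*point))])

residuals = [lagrange_residual(p, lam) for p, lam in candidates]
origin = (0.0, 0.0, 0.0)
distances = sorted({round(math.dist(p, origin), 12) for p, _ in candidates})

print(max(residuals))  # essentially zero, up to rounding
print(distances)       # the three critical distances, smallest first
```

The smallest and largest printed distances correspond to the minimum $1/\sqrt{3}$ and the maximum $1$ found in the example; the middle value belongs to the saddle points.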
11.64 Example. Find all extrema of $x^2 + y^2 + z^2$ subject to the constraints $x - y = 1$ and $y^2 - z^2 = 1$.

SOLUTION. Let $f(x,y,z) = x^2 + y^2 + z^2$, $g(x,y,z) = x - y - 1$, and $h(x,y,z) = y^2 - z^2 - 1$. Then (36) takes on the form $\nabla f + \lambda \nabla g + \mu \nabla h = 0$; i.e.,

$$(2x, 2y, 2z) + \lambda(1, -1, 0) + \mu(0, 2y, -2z) = (0, 0, 0).$$

In particular, $2x + \lambda = 0$, $2y + 2\mu y - \lambda = 0$, and $2z - 2\mu z = 0$. From this last equation, either $\mu = 1$ or $z = 0$.

If $\mu = 1$, then $\lambda = 4y$. Since $2x + \lambda = 0$, we find that $x = -2y$. From $g = 0$ we obtain $-3y = 1$; i.e., $y = -1/3$. Substituting this into $h = 0$, we obtain $z^2 = -8/9$, a contradiction.

If $z = 0$, then from $h = 0$ we obtain $y = \pm 1$. Since $g = 0$, we obtain $x = 2$ when $y = 1$, and $x = 0$ when $y = -1$. Thus, the only candidates for extrema of $f$ subject to the constraints $g = 0 = h$ are $f(2, 1, 0) = 5$ and $f(0, -1, 0) = 1$.

To decide whether these are maxima, minima, or neither, look at the problem from a geometric point of view. The problem requires us to find points on the intersection of the plane $x - y = 1$ and the hyperbolic cylinder $y^2 - z^2 = 1$ which lie closest to the origin. Evidently, both of these points correspond to local minima, and there is no maximum (see Figure 11.9). In particular, the minimum of $x^2 + y^2 + z^2$ subject to the given constraints is $1$, attained at the point $(0, -1, 0)$. ∎

EXERCISES

1. Find all local extrema of each of the following functions.
(a) $f(x, y) = x^2 - xy + y^3 - y$.
(b) $f(x, y) = \sin x + \cos y$.
(c) $f(x, y, z) = e^{x+y} \cos z$.
(d) $f(x, y) = ax^2 + bxy + cy^2$, where $a \neq 0$ and $b^2 - 4ac \neq 0$.
[Figure 11.9]
2. For each of the following, find the maximum and minimum of $f$ on $H$.
(a) $f(x, y) = x^2 + 2x - y^2$ and $H = \{(x, y) : x^2 + 4y^2 \le 4\}$.
(b) $f(x, y) = x^2 + 2xy + 3y^2$, and $H$ is the region bounded by the triangle with vertices $(1,0)$, $(1,2)$, $(3,0)$.
(c) $f(x, y) = x^3 + 3xy - y^3$, and $H = [-1,1] \times [-1,1]$.

3. For each of the following, use Lagrange multipliers to find all extrema of $f$ subject to the given constraints.
(a) $f(x, y) = x + y^2$ and $x^2 + y^2 = 4$.
(b) $f(x, y) = x^2 - 4xy + 4y^2$ and $x^2 + y^2 = 1$.
(c) $f(x, y, z) = xy$, $x^2 + y^2 + z^2 = 1$ and $x + y + z = 0$.
(d) $f(x, y, z, w) = 3x + y + w$, $3x^2 + y + 4z^3 = 1$ and $-x^3 + 3z^4 + w = 0$.

4. Let $f : \mathbf{R}^n \to \mathbf{R}^m$ be differentiable at $a$, and $g : \mathbf{R}^m \to \mathbf{R}$ be differentiable at $b = f(a)$. Prove that if $g(b)$ is a local extremum of $g$, then $\nabla(g \circ f)(a) = 0$.

5. Let $V$ be open in $\mathbf{R}^2$, $(a, b) \in V$, and $f : V \to \mathbf{R}$ have second-order partial derivatives on $V$ with $f_x(a, b) = f_y(a, b) = 0$. If the second-order partial derivatives of $f$ are continuous at $(a, b)$ and exactly two of the three numbers $f_{xx}(a, b)$, $f_{xy}(a, b)$, and $f_{yy}(a, b)$ are zero, prove that $(a, b)$ is a saddle point if $f_{xy}(a, b) \neq 0$.

6. Let $V$ be an open set in $\mathbf{R}^n$, $a \in V$, and $f : V \to \mathbf{R}$ be $C^2$ on $V$. If $f(a)$ is a local minimum of $f$, prove that $D^{(2)} f(a)(h) \ge 0$ for all $h \in \mathbf{R}^n$.

7. Let $a, b, c, D, E$ be real numbers with $c \neq 0$.
(a) If $DE > 0$, find all extrema of $ax + by + cz$ subject to the constraint $z = Dx^2 + Ey^2$. Prove that a maximum occurs when $cD < 0$ and a minimum when $cD > 0$.
(b) What can you say when $DE < 0$?
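8. [IMPLICIT METHOD].
(a) Suppose that $f, g : \mathbf{R}^3 \to \mathbf{R}$ are differentiable at a point $(a, b, c)$, and $f(a, b, c)$ is an extremum of $f$ subject to the constraint $g(x, y, z) = k$, where $k$ is a constant. Prove that

$$\frac{\partial f}{\partial x}(a, b, c) \frac{\partial g}{\partial z}(a, b, c) - \frac{\partial f}{\partial z}(a, b, c) \frac{\partial g}{\partial x}(a, b, c) = 0$$

and

$$\frac{\partial f}{\partial y}(a, b, c) \frac{\partial g}{\partial z}(a, b, c) - \frac{\partial f}{\partial z}(a, b, c) \frac{\partial g}{\partial y}(a, b, c) = 0.$$

(b) Use part (a) to find all extrema of $f(x, y, z) = 4xy + 2xz + 2yz$ subject to the constraint $xyz = 16$.

9. This exercise is used in Section e14.4.
(a) Let $p > 1$. Find all extrema of $f(x) = \sum_{k=1}^{n} x_k^2$ subject to the constraint $\sum_{k=1}^{n} |x_k|^p = 1$.
(b) Prove that

[displayed inequality missing from the source]

for all $x_1, \dots, x_n \in \mathbf{R}$, $n \in \mathbf{N}$, and $1 \le p \le 2$.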
Chapter 12
INTEGRATION ON $\mathbf{R}^n$

12.1 JORDAN REGIONS

In this section we define grids (a multidimensional analogue of partitions) and use them to identify special subsets of $\mathbf{R}^n$, called Jordan regions, that have a well-defined volume. In the next section, when we define integrals of multivariable functions on Jordan regions, grids will play the role that partitions did in the one-variable case.

Throughout this chapter, $R$ will represent a nondegenerate $n$-dimensional rectangle; i.e.,

$$R = [a_1, b_1] \times \cdots \times [a_n, b_n], \tag{1}$$

where $a_j < b_j$ for $j = 1, 2, \dots, n$. A grid on $R$ is a collection of $n$-dimensional rectangles $\mathcal{G} = \{R_1, \dots, R_p\}$ obtained by subdividing the sides of $R$; i.e., for each $j = 1, \dots, n$ there are integers $\nu_j \in \mathbf{N}$ and partitions $P_j = P_j(\mathcal{G}) = \{x_k^{(j)} : k = 0, \dots, \nu_j\}$ of $[a_j, b_j]$ such that $\mathcal{G}$ is the collection of rectangles of the form $I_1 \times \cdots \times I_n$, where each $I_j = [x_{k-1}^{(j)}, x_k^{(j)}]$ for some $k = 1, \dots, \nu_j$ (see Figure 12.1).

[Figure 12.1. A grid on a rectangle $R$; axes $x$ and $y$.]

A grid $\mathcal{G}$ is said to be finer than a grid $\mathcal{H}$ if and only if each partition $P_j(\mathcal{G})$ is finer than the corresponding partition $P_j(\mathcal{H})$, $j = 1, \dots, n$. Notice once and for all that given two grids $\mathcal{G}$ and $\mathcal{H}$, there is a grid $\mathcal{I}$ that is finer than both $\mathcal{G}$ and $\mathcal{H}$. (Such a grid can be constructed by taking $P_j(\mathcal{I}) = P_j(\mathcal{G}) \cup P_j(\mathcal{H})$ for $j = 1, \dots, n$.)

If $R$ is an $n$-dimensional rectangle of the form (1), then the volume of $R$ is defined to be $|R| = (b_1 - a_1) \cdots (b_n - a_n)$. (When $n = 1$, we shall call $|R|$ the length of $R$, and when $n = 2$, we shall call $|R|$ the area of $R$.) Notice that given $\varepsilon > 0$ there exists a rectangle $R^*$ such that $R \subset (R^*)^o$ and $|R^*| = |R| + \varepsilon$. Indeed, since $b_j - a_j + 2\delta \to b_j - a_j$ as $\delta \to 0$, we can choose $\delta > 0$ so small that $R^* := [a_1 - \delta, b_1 + \delta] \times \cdots \times [a_n - \delta, b_n + \delta]$ satisfies $|R^*| = |R| + \varepsilon$.

We want to define the integral of a multivariable function on a variety of sets, for example, the integral of a function of two variables on rectangles, disks, triangles, ellipses, and the integral of a function of three variables on balls, cones, ellipsoids, pyramids, and so on. One property these regions all have in common is that they have a well-defined "area" or "volume."

How shall we define the volume of a general set $E$? Let $R$ be a rectangle that contains $E$. If $E$ is simple enough, we should be able to get a good approximation for the volume of $E$ by choosing a sufficiently fine grid $\mathcal{G}$ on $R$ and adding up the collective volumes of all rectangles in $\mathcal{G}$ that intersect $E$. Accordingly, we define the outer sums of $E$ with respect to a grid $\mathcal{G}$ on a rectangle $R$ by
$$V(E; \mathcal{G}) := \sum_{R_j \cap \overline{E} \neq \emptyset} |R_j|,$$
where the empty sum is by definition zero. Notice once and for all, since the empty sum is defined to be zero, that $V(\emptyset; \mathcal{G}) = 0$ for all grids $\mathcal{G}$.

Figure 12.2 illustrates an outer sum for a particular set $E$ and grid $\mathcal{G}$. The rectangles that intersect $E$ have been shaded; those that cover $\partial E$ are darker than those that are contained in $E^o$. Notice that even for this crude grid, the shaded region is a fair approximation to the "volume" of $E$. The following result shows that as the grids get finer, the outer sum approximations to the volume of $E$ get better.

12.1 Remark. Let $R$ be an $n$-dimensional rectangle.
(i) Let $E$ be a subset of $R$, and let $\mathcal{G}$, $\mathcal{H}$ be grids on $R$. If $\mathcal{G}$ is finer than $\mathcal{H}$, then $V(E; \mathcal{G}) \le V(E; \mathcal{H})$.
(ii) If $A$ and $B$ are subsets of $R$ and $A \subseteq B$, then $V(A; \mathcal{G}) \le V(B; \mathcal{G})$.
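The monotonicity in part (i) can be observed on a concrete set. The following is a minimal sketch, not a construction from the text: it assumes $E$ is the closed disk of radius $1/2$ centered at $(1/2, 1/2)$ and uses uniform grids with $2^m$ subintervals per side on $[0,1] \times [0,1]$, so that each grid is finer than the previous one. Exact rational arithmetic avoids rounding issues for cells that just touch the disk. The function names are hypothetical.

```python
from fractions import Fraction

def sq_dist_to_interval(c, lo, hi):
    # Squared distance from the number c to the interval [lo, hi]
    if c < lo:
        return (lo - c) ** 2
    if c > hi:
        return (c - hi) ** 2
    return Fraction(0)

def outer_sum_disk(m):
    """Outer sum V(E; G_m) for the closed disk E of radius 1/2 centered at (1/2, 1/2).

    G_m is the (assumed) uniform grid on [0,1] x [0,1] with 2**m subintervals per side.
    """
    n = 2 ** m
    side = Fraction(1, n)
    c, r2 = Fraction(1, 2), Fraction(1, 4)
    total = Fraction(0)
    for i in range(n):
        dx = sq_dist_to_interval(c, i * side, (i + 1) * side)
        for j in range(n):
            dy = sq_dist_to_interval(c, j * side, (j + 1) * side)
            # A closed cell meets the closed disk iff its nearest point is within the radius
            if dx + dy <= r2:
                total += side * side
    return total

outer_sums = [outer_sum_disk(m) for m in range(1, 6)]
print([float(s) for s in outer_sums])  # a non-increasing list, as Remark 12.1(i) predicts
```

Each value overestimates the area $\pi/4$ of the disk, and the overestimate shrinks as the grid is refined.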
[Figure 12.2]

[Figure 12.3]
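PROOF. (i) Since $\mathcal{G}$ is finer than $\mathcal{H}$, each $Q \in \mathcal{H}$ is a finite union of $R_j$'s in $\mathcal{G}$. If $Q \cap \overline{E} \neq \emptyset$, then some of the $R_j$'s in $Q$ intersect $\overline{E}$ and others might not (see Figure 12.3, where the darker lines represent the grid $\mathcal{H}$, the lighter lines represent $\mathcal{G} \setminus \mathcal{H}$, and the $R_j$'s that intersect $E$ are shaded). Let $\mathcal{I}_1 = \{R \in \mathcal{G} : R \cap \overline{E} \neq \emptyset\}$ and $\mathcal{I}_2 = \{R \in \mathcal{G} \setminus \mathcal{I}_1 : R \subseteq Q \text{ for some } Q \in \mathcal{H} \text{ with } Q \cap \overline{E} \neq \emptyset\}$. Then

$$V(E; \mathcal{H}) = \sum_{R \in \mathcal{I}_1} |R| + \sum_{R \in \mathcal{I}_2} |R| \ge \sum_{R \in \mathcal{I}_1} |R| = V(E; \mathcal{G}).$$

(ii) If $A \subseteq B$, then $\overline{A} \subseteq \overline{B}$ (see Exercise 3, p. 254). Thus, every rectangle that appears in the sum $V(A; \mathcal{G})$ also appears in the sum $V(B; \mathcal{G})$. Since all $|R_j|$'s are nonnegative, it follows that $V(A; \mathcal{G}) \le V(B; \mathcal{G})$. ∎

In view of all this, we guess that the volume of a set $E$ can be computed by taking the infimum of all outer sums of $E$. Unfortunately, this guess is wrong unless some restriction is made on the set $E$. To see why a restriction is necessary, notice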
that any reasonable definition of volume should satisfy the following property: if $E = A \cup B$, where $B = E \setminus A$, then the volume of $E$ must equal the sum of the volumes of $A$ and $B$. The following example shows that this property does not hold if $A$ is spread out too much.

12.2 Example. If $R = [0,1] \times [0,1]$, $A = \{(x, y) : x, y \in \mathbf{Q} \cap [0,1]\}$, and $B = R \setminus A$, then $V(A; \mathcal{G}) + V(B; \mathcal{G}) = 2V(R; \mathcal{G})$ no matter how fine $\mathcal{G}$ is.

PROOF. Let $\mathcal{G} = \{R_1, \dots, R_p\}$ be a grid on $R$. Since each $R_j$ is nondegenerate, it is clear by the Density of Rationals (Theorem 1.24) that $R_j \cap A \neq \emptyset$ for all $j \in [1, p]$. Hence $V(A; \mathcal{G}) = |R| = 1$. Similarly, the Density of the Irrationals (Exercise 3, p. 23) implies $V(B; \mathcal{G}) = |R| = 1$. ∎

The real problem with $A$ is that the boundary of $A$, $\partial A := \overline{A} \setminus A^o = R$, is too big. To avoid this type of pathology, we will restrict our attention to "Jordan" regions, that is, to sets whose boundaries are small in the following sense. (See also Remark 12.6ii.)

12.3 DEFINITION. Let $E$ be a subset of $\mathbf{R}^n$. Then $E$ is said to be a Jordan region if and only if given $\varepsilon > 0$ there is a rectangle $R \supseteq E$, and a grid $\mathcal{G} = \{R_1, \dots, R_p\}$ on $R$, such that

$$V(\partial E; \mathcal{G}) := \sum_{R_j \cap \partial E \neq \emptyset} |R_j| < \varepsilon.$$

(This last sum is the outer sum of $\partial E$ since $\overline{\partial E} = \partial E$ by Theorem 8.36.)
Recall that $E$ is covered by $\{Q_k\}_{k=1}^{p}$ means that $E \subseteq \bigcup_{k=1}^{p} Q_k$. Thus a set is a Jordan region if and only if its boundary is so thin that it can be covered by rectangles whose total volume is as small as one wishes (see the darkly shaded rectangles in Figure 12.2). Notice once and for all that by definition, a Jordan region is contained in some rectangle $R$, hence bounded. The converse of this statement is false. The set $A$ in Example 12.2 is bounded but not a Jordan region.

We are now prepared to define what we mean by the volume of a Jordan region. Working by analogy to upper sum approximations of integrals, we shall define the volume of a Jordan region $E$ to be the infimum of the outer sums $V(E; \mathcal{G})$ over all grids $\mathcal{G}$ on some rectangle $R$ containing $E$.

12.4 DEFINITION. Let $E$ be a Jordan region in $\mathbf{R}^n$ and let $R$ be an $n$-dimensional rectangle that satisfies $E \subseteq R$. The volume (or Jordan content) of $E$ is defined by

$$\operatorname{Vol}(E) := \inf_{\mathcal{G}} V(E; \mathcal{G}) := \inf\{V(E; \mathcal{G}) : \mathcal{G} \text{ ranges over all grids on } R\}.$$

We shall sometimes call $\operatorname{Vol}(E)$ length when $n = 1$ and area when $n = 2$. Notice, then, that the empty set is of length, area, and volume zero.

Before we continue, we need to show that $\operatorname{Vol}(E)$ does not depend on the rectangle $R$ chosen to generate the grids $\mathcal{G}$. To this end, let $R$ and $Q$ be rectangles that
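contain $E$. Since the intersection of two rectangles is a rectangle, we may suppose that $E \subseteq Q \subset R$. Since $Q \subset R$, it is easy to see that

$$\inf_{\mathcal{H} \text{ on } Q} V(E; \mathcal{H}) \le \inf_{\mathcal{G} \text{ on } R} V(E; \mathcal{G}).$$

On the other hand, let $\mathcal{H}$ be any grid on $Q$. For each $\varepsilon > 0$, choose a rectangle $Q^*$ such that $Q \subset (Q^*)^o$ and $|Q^*| = |Q| + \varepsilon$. Let $\mathcal{H}_0$ be the grid formed by adding the endpoints of $Q^*$ and $R$ to $\mathcal{H}$; i.e., if

$$Q^* = [\alpha_1, \beta_1] \times \cdots \times [\alpha_n, \beta_n] \quad\text{and}\quad R = [\gamma_1, \delta_1] \times \cdots \times [\gamma_n, \delta_n],$$

then $P_j(\mathcal{H}_0) = P_j(\mathcal{H}) \cup \{\alpha_j, \beta_j, \gamma_j, \delta_j\}$. Then $\mathcal{G}_0 := \mathcal{H}_0 \cap R$ is a grid on $R$ whose rectangles that intersect $\overline{E}$ are either part of $\mathcal{H}$ to begin with, or the thin ones formed by adding the endpoints of $Q^*$. Hence,

$$V(E; \mathcal{H}) + \varepsilon \ge V(E; \mathcal{G}_0) \ge \inf_{\mathcal{G} \text{ on } R} V(E; \mathcal{G}).$$

It follows that

$$\inf_{\mathcal{H} \text{ on } Q} V(E; \mathcal{H}) \le \inf_{\mathcal{G} \text{ on } R} V(E; \mathcal{G}) \le \inf_{\mathcal{H} \text{ on } Q} V(E; \mathcal{H}) + \varepsilon$$

for every $\varepsilon > 0$. By letting $\varepsilon \to 0$, we verify that the definition of volume does not depend on the rectangle $R$.

In general, it is not easy to decide whether or not a given set is a Jordan region. Topology alone cannot resolve this problem since there are open sets in $\mathbf{R}^n$ that are not Jordan regions (see Spivak [12], p. 56). In practice, however, it is usually easy to show that the boundary of a specific set can be covered by thin rectangles. We illustrate this fact with rectangles first. In the process, we also show that the two definitions of the volume of a rectangle (length $\times$ width $\times \cdots$ versus the infimum of outer sums) agree.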
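12.5 Remark. If $R$ is an $n$-dimensional rectangle, then $R$ is a Jordan region in $\mathbf{R}^n$ and $\operatorname{Vol}(R) = |R|$.

PROOF. Let $\varepsilon > 0$ and suppose that

$$R = [a_1, b_1] \times \cdots \times [a_n, b_n].$$

Since $b_j - a_j - 2\delta \to b_j - a_j$ as $\delta \to 0$, we can choose $\delta > 0$ so small that if

$$Q = [a_1 + \delta, b_1 - \delta] \times \cdots \times [a_n + \delta, b_n - \delta],$$

then $|R| - |Q| < \varepsilon$. Let $\mathcal{G}_0 := \{H_1, \dots, H_q\}$ be the grid on $R$ determined by

$$P_j(\mathcal{G}_0) = \{a_j, a_j + \delta, b_j - \delta, b_j\}.$$

Then it is clear that an $H_j \in \mathcal{G}_0$ intersects $\partial R$ if and only if $H_j \neq Q$. Hence,

$$V(\partial R; \mathcal{G}_0) := \sum_{H_j \cap \partial R \neq \emptyset} |H_j| = |R| - |Q| < \varepsilon.$$

This proves that $R$ is a Jordan region. To compute the volume of $R$ by Definition 12.4, let $\mathcal{G} = \{R_1, \dots, R_p\}$ be any grid on $R$. Since $R_j \cap \overline{R} \neq \emptyset$ for all $R_j \in \mathcal{G}$, it follows from the definition that $V(R; \mathcal{G}) = |R|$. Taking the infimum of this identity over all grids $\mathcal{G}$ on $R$, we find that $\operatorname{Vol}(R) = |R|$. ∎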
We shall soon see that spheres, ellipsoids, and, in fact, all "projectable regions" (just about anything you can draw) are Jordan regions (see Theorem 12.39). The method of proof frequently involves the following observations.

12.6 Remark. Suppose that $E$ is a bounded subset of $\mathbf{R}^n$.
(i) $E$ is a Jordan region of volume zero if and only if there is an absolute constant $C$, that does not depend on $E$, such that for each $\varepsilon > 0$ one can find a grid $\mathcal{G}$ that satisfies $V(E; \mathcal{G}) < C\varepsilon$.
(ii) $E$ is a Jordan region if and only if $\operatorname{Vol}(\partial E) = 0$.
(iii) If $E$ is a set of volume zero and $A \subseteq E$, then $A$ is a Jordan region and $\operatorname{Vol}(A) = 0$.

PROOF. By Definitions 12.3 and 12.4, and Remark 12.1ii, it suffices to prove (i). Let $E$ be a Jordan region of volume zero, and let $\varepsilon > 0$. By the Approximation Property for Infima, there is a grid $\mathcal{G}$ such that $V(E; \mathcal{G}) < \varepsilon$. Hence set $C = 1$.

Conversely, let $\varepsilon > 0$ and suppose that there is a grid $\mathcal{G}$ such that $V(E; \mathcal{G}) < C\varepsilon$. Then $\partial E = \overline{E} \setminus E^o \subseteq \overline{E}$ implies

$$0 \le \alpha := \inf_{\mathcal{G}} V(\partial E; \mathcal{G}) \le \beta := \inf_{\mathcal{G}} V(E; \mathcal{G}) \le C\varepsilon.$$

Since $\varepsilon > 0$ was arbitrary, it follows that $\alpha = \beta = 0$. Since $\alpha = 0$, we can use the Approximation Property for Infima to choose a grid $\mathcal{H}$ such that $V(\partial E; \mathcal{H}) < \varepsilon$. Thus $E$ is a Jordan region. Since $\beta = 0$, we conclude by Definition 12.4 that $\operatorname{Vol}(E) = 0$. ∎

To evaluate integrals of multivariable functions over unions of sets, we introduce the following concept.
12.7 DEFINITION. Let $\mathcal{E} := \{E_\ell\}_{\ell \in \mathbf{N}}$ be a collection of subsets of $\mathbf{R}^n$.
(i) $\mathcal{E}$ is said to be nonoverlapping if and only if $E_j \cap E_k$ is of volume zero for $j \neq k$.
(ii) $\mathcal{E}$ is said to be pairwise disjoint if and only if $E_j \cap E_k = \emptyset$ for $j \neq k$.

Notice that since $\emptyset$ is of volume zero, every collection of pairwise disjoint sets is nonoverlapping. (The converse of this statement is false; see Exercise 6.)
According to Definition 12.3, a Jordan region is a set whose boundary can be covered by small rectangles from some grid $\mathcal{G}$. The following result, when combined with Remark 12.6ii, shows that the grid is unnecessary. Indeed, any set whose boundary can be covered by finitely many rectangles (or squares) whose total volume can be made arbitrarily small is a Jordan region.

12.8 THEOREM. Let $E$ be a subset of $\mathbf{R}^n$. Then $E$ is a Jordan region of volume zero if and only if for every $\varepsilon > 0$ there is a finite collection of cubes $Q_k$ of the same size, i.e., all with sides of length $s$, such that

$$E \subset \bigcup_{k=1}^{q} Q_k \quad\text{and}\quad \sum_{k=1}^{q} |Q_k| < \varepsilon.$$

PROOF. If $\operatorname{Vol}(E) = 0$, then by definition there exists a grid $\mathcal{G}$ such that if $\{R_1, \dots, R_p\}$ represents all rectangles in $\mathcal{G}$ that intersect $\overline{E}$, then

$$E \subset \bigcup_{j=1}^{p} R_j \quad\text{and}\quad \sum_{j=1}^{p} |R_j| < \frac{\varepsilon}{2}.$$

By increasing the size of the $R_j$'s slightly, we may suppose that the sides of each $R_j$ have rational lengths, and $\sum_{j=1}^{p} |R_j| < \varepsilon$. (These rectangles may no longer be nonoverlapping.) The lengths of the sides of the $R_j$'s have a common denominator, say $d$. By using a grid fine enough, we can divide each $R_j$ into cubes $Q_k^{(j)}$, for $k = 1, 2, \dots, \nu_j$ and some choice of $\nu_j \in \mathbf{N}$, such that each $Q_k^{(j)}$ has sides of common length $s = 1/d$. Since $|R_j| = \sum_{k=1}^{\nu_j} |Q_k^{(j)}|$, it follows that

$$\sum_{j=1}^{p} \sum_{k=1}^{\nu_j} |Q_k^{(j)}| = \sum_{j=1}^{p} |R_j| < \varepsilon.$$
Conversely, if such cubes exist, let $R$ be a rectangle that contains the union of the $Q_k$'s and suppose that

$$Q_k = [a_1^{(k)}, b_1^{(k)}] \times \cdots \times [a_n^{(k)}, b_n^{(k)}].$$

For each $j = 1, 2, \dots, n$, the endpoints $a_j^{(1)}, b_j^{(1)}, \dots, a_j^{(q)}, b_j^{(q)}$ can be arranged in increasing order to form a partition of the $j$th side of $R$. Thus there is a grid $\mathcal{G} = \{R_1, \dots, R_p\}$ so fine that each $Q_k$ is a union of the $R_j$'s (see Figure 12.4). Since $V(E; \mathcal{G}) \le \sum_{k=1}^{q} |Q_k| < \varepsilon$, we conclude from Remark 12.6i that $\operatorname{Vol}(E) = 0$. ∎
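The subdivision step in the first half of the proof (rectangles with rational side lengths are cut into cubes of common side $s = 1/d$) can be made concrete. The sketch below is an illustration under stated assumptions: the two rectangles are hypothetical, each is described only by its side lengths, and `split_into_cubes` is a name introduced here. It computes a common denominator $d$, counts the cubes needed for each rectangle, and confirms that the total volume is unchanged.

```python
from fractions import Fraction
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def split_into_cubes(rects):
    """rects: list of rectangles, each a tuple of rational side lengths (Fractions).

    Returns (s, counts): the common cube side s = 1/d and, for each rectangle,
    the number of cubes of side s needed to tile it.
    """
    d = 1
    for sides in rects:
        for length in sides:
            d = lcm(d, length.denominator)
    s = Fraction(1, d)
    counts = []
    for sides in rects:
        count = 1
        for length in sides:
            count *= int(length / s)  # each length is an integer multiple of s
        counts.append(count)
    return s, counts

# Two hypothetical rectangles in the plane, given by their side lengths
rects = [(Fraction(1, 2), Fraction(1, 3)), (Fraction(3, 4), Fraction(1, 6))]
s, counts = split_into_cubes(rects)
cube_volume = s ** len(rects[0])
total_cubes = sum(counts) * cube_volume
total_rects = sum(a * b for a, b in rects)
print(s, counts, total_cubes, total_rects)
```

Here the common side is $1/12$, and the two totals agree, which is the identity $|R_j| = \sum_k |Q_k^{(j)}|$ summed over $j$.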
This characterization of sets of volume zero can be used to show that all balls in $\mathbf{R}^n$ are Jordan regions (see Exercise 7). (For a formula of the volume of a ball in $\mathbf{R}^n$, see Theorem 12.69.) Here are some additional corollaries of Theorem 12.8.

[Figure 12.4. Cubes $Q_1$, $Q_2$, $Q_3$ inside a rectangle $R$.]

12.9 COROLLARY. If $E_1$ and $E_2$ are Jordan regions, then $E_1 \cup E_2$ is a Jordan region and $\operatorname{Vol}(E_1 \cup E_2) \le \operatorname{Vol}(E_1) + \operatorname{Vol}(E_2)$.

PROOF. We begin by proving that $E_1 \cup E_2$ is a Jordan region. Since $E_1$ and $E_2$ are Jordan regions, use Theorem 12.8 to choose squares $\{S_j\}$ that cover $\partial E_1$ (respectively, squares $\{Q_k\}$ that cover $\partial E_2$) such that

$$\sum_j |S_j| < \frac{\varepsilon}{2} \quad\text{and}\quad \sum_k |Q_k| < \frac{\varepsilon}{2}.$$

But by Theorem 8.37 or 10.40, $\partial(E_1 \cup E_2) \subseteq \partial E_1 \cup \partial E_2$. Thus $\{S_j\} \cup \{Q_k\}$ is a collection of squares that covers $\partial(E_1 \cup E_2)$ whose total volume is less than $\varepsilon$. Hence by Theorem 12.8 and Remark 12.6ii, $E_1 \cup E_2$ is a Jordan region.

To estimate the volume of $E_1 \cup E_2$, let $\mathcal{G}$ be a grid on a rectangle that contains $E_1 \cup E_2$. If $R_j$ intersects $\overline{E_1 \cup E_2}$, then by Theorem 8.37 (or 10.40) $R_j$ intersects $\overline{E_1}$ or $\overline{E_2}$ (or both). Hence, $V(E_1 \cup E_2; \mathcal{G}) \le V(E_1; \mathcal{G}) + V(E_2; \mathcal{G})$. Taking the infimum of this inequality over all grids $\mathcal{G}$, we obtain

$$\operatorname{Vol}(E_1 \cup E_2) \le \operatorname{Vol}(E_1) + \operatorname{Vol}(E_2). \qquad ∎$$

By iterating this result, we see that the collection of Jordan regions is closed under finite unions. This is also the case for intersections and set differences (see Exercise 4).

Our next corollary of Theorem 12.8 shows that certain kinds of images of Jordan regions are Jordan regions. We shall use this to obtain a change-of-variables formula in Section 12.4.
12.10 COROLLARY. Suppose that $V$ is a bounded, open set in $\mathbf{R}^n$ and that $\phi : V \to \mathbf{R}^n$ is 1-1 and $C^1$ on $V$ with $\Delta_\phi \neq 0$.
(i) If $E$ is of volume zero and $\overline{E} \subset V$, then $\phi(E)$ is of volume zero.
(ii) If $\{E_k\}_{k \in \mathbf{N}}$ is a nonoverlapping collection of sets in $\mathbf{R}^n$ with $\overline{E_k} \subset V$ for all $k \in \mathbf{N}$, then $\{\phi(E_k)\}_{k \in \mathbf{N}}$ is a nonoverlapping collection of sets in $\mathbf{R}^n$.
(iii) If $E$ is a Jordan region and $\overline{E} \subset V$, then $\phi(E)$ is a Jordan region.

PROOF. (i) Since $\overline{E} \subset V$, for each $x \in \overline{E}$ there is an $r(x) > 0$ such that $\overline{B_{r(x)}(x)} \subset V$. Hence by the Borel Covering Lemma, there exist finitely many $x_k \in \overline{E}$ such that the bounded open set

$$U := \bigcup_{k=1}^{N} B_{r(x_k)}(x_k)$$

satisfies $\overline{E} \subset U$. Set $H := \overline{U}$ and notice that $H$ is compact, and $\overline{E} \subset H^o \subset H \subset V$.

We claim that there is a constant $C$, depending only on $H$, $\phi$, and $n$, such that if $Q$ is a cube contained in $H$, then there is a cube $\widetilde{Q}$ such that

$$\phi(Q) \subseteq \widetilde{Q} \quad\text{and}\quad |\widetilde{Q}| \le C|Q|. \tag{2}$$

To prove this claim, notice by Corollary 11.34 that there is an $M > 0$, which depends only on $H$, $\phi$, and $n$, such that

$$\|\phi(x) - \phi(y)\| \le M \|x - y\| \tag{3}$$

for all $x, y \in R$ and all rectangles $R \subseteq H$. Let $Q$ be a cube of side $s$ contained in $H$. By Remark 8.7, $\|x - y\| \le s\sqrt{n}$ for $x, y \in Q$. Hence by (3), we have

$$\|\phi(x) - \phi(y)\| \le M s \sqrt{n}, \qquad x, y \in Q.$$

Fix $x \in Q$. It follows that $\phi(Q)$ is a subset of the cube

$$\widetilde{Q} := [\phi_1(x) - Ms\sqrt{n}, \phi_1(x) + Ms\sqrt{n}] \times \cdots \times [\phi_n(x) - Ms\sqrt{n}, \phi_n(x) + Ms\sqrt{n}].$$
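Since $\operatorname{Vol}(\widetilde{Q}) = (2Ms\sqrt{n})^n = (2M\sqrt{n})^n |Q|$, it follows that (2) holds for $C := (2M\sqrt{n})^n$.

To prove (i), let $\varepsilon > 0$. Since $\overline{E} \subset H^o$, use Theorem 12.8 to choose cubes $Q_1, \dots, Q_q$ such that $Q_j \subset H$,

$$E \subset \bigcup_{j=1}^{q} Q_j, \quad\text{and}\quad \sum_{j=1}^{q} |Q_j| < \varepsilon.$$

Let $\mathcal{G}$ be any grid whose rectangles $R_k$ are either nonoverlapping with the cubes $\{\widetilde{Q}_1, \dots, \widetilde{Q}_q\}$ or a union of some subset of them. (Just use the endpoints of the $\widetilde{Q}_j$'s to generate the grid $\mathcal{G}$.) Since by Theorem 1.43 and the left side of (2) we have

$$\phi(E) \subseteq \phi\Bigl(\bigcup_{j=1}^{q} Q_j\Bigr) = \bigcup_{j=1}^{q} \phi(Q_j) \subseteq \bigcup_{j=1}^{q} \widetilde{Q}_j,$$

it follows from Remark 12.1ii, and the definition of outer sums, that

$$V(\phi(E); \mathcal{G}) \le V\Bigl(\bigcup_{j=1}^{q} \widetilde{Q}_j; \mathcal{G}\Bigr) \le \sum_{j=1}^{q} V(\widetilde{Q}_j; \mathcal{G}) = \sum_{j=1}^{q} |\widetilde{Q}_j|.$$

We conclude by the right side of (2) and the choice of the $Q_j$'s that

$$V(\phi(E); \mathcal{G}) \le C \sum_{j=1}^{q} |Q_j| < C\varepsilon.$$

In particular, $\operatorname{Vol}(\phi(E)) = 0$ by Remark 12.6i.

(ii) By part (i), if $E_k \cap E_j$ is of volume zero, then so is $\phi(E_k \cap E_j)$. But by Exercise 6, p. 33 (since $\phi$ is 1-1),

$$\phi(E_k \cap E_j) = \phi(E_k) \cap \phi(E_j).$$

Thus $\{\phi(E_k)\}$ is nonoverlapping when $\{E_k\}$ is.

(iii) By part (i) and Remark 12.6iii, it suffices to prove that $\partial(\phi(E)) \subseteq \phi(\partial E)$. By Theorem 11.39, the set $\phi(E^o)$ is open and by Theorem 9.29 (or 10.61), the set $\phi(\overline{E})$ is closed. It follows from Theorem 8.32 (or 10.34) that $\phi(E^o) \subseteq (\phi(E))^o$ and $\overline{\phi(E)} \subseteq \phi(\overline{E})$. Therefore, since $\phi$ is 1-1,

$$\partial(\phi(E)) = \overline{\phi(E)} \setminus (\phi(E))^o \subseteq \phi(\overline{E}) \setminus \phi(E^o) = \phi(\overline{E} \setminus E^o) = \phi(\partial E). \qquad ∎$$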
We close this section with some optional results that will not be used elsewhere. They show that the volume of a set can also be approximated from below using inner sums.

We introduced outer sums (analogues of upper sums) and defined the volume of a Jordan region as the infimum of all outer sums. In order to calculate the volume of a specific set, it is sometimes convenient to have inner sums (analogues of the lower sums we used to define integrals in Chapter 5).

Given $E \subset \mathbf{R}^n$, a subset of some $n$-dimensional rectangle $R$, and $\mathcal{G} = \{R_j : j = 1, \dots, p\}$, a grid on $R$, the inner sums of $E$ with respect to $\mathcal{G}$ are defined by

$$v(E; \mathcal{G}) := \sum_{R_j \subset E^o} |R_j|,$$

where the empty sum is again interpreted to be zero. Thus $v(E; \mathcal{G}) = 0$ for all grids $\mathcal{G}$ and all sets $E$ satisfying $E^o = \emptyset$.

Inner and outer sums can be used to define inner and outer volume of any bounded set, in the same way that upper and lower sums were used to define upper and lower integrals of any bounded function (see Definition 12.13). If $\mathcal{G}$ is fine enough, the inner sum of a Jordan region $E$ with respect to $\mathcal{G}$
should approximate $\operatorname{Vol}(E)$; just as $V(E; \mathcal{G})$ overestimated $\operatorname{Vol}(E)$, each $v(E; \mathcal{G})$ underestimates $\operatorname{Vol}(E)$. (In Figure 12.2, the underestimate $v(E; \mathcal{G})$ is represented by the lightly shaded rectangles. You might refine the grid there and revisualize the inner and outer sums to illustrate that these estimates get better as the grid gets finer.)

Since $v(E; \mathcal{H})$ is either zero or a sum of nonnegative terms, it is clear that $v(E; \mathcal{H}) \ge 0$ for all grids $\mathcal{H}$. If we combine this observation with the proof of Remark 12.1i, we can also establish the following result.

12.11 Remark. Let $R$ be an $n$-dimensional rectangle, let $E$ be a subset of $R$, and let $\mathcal{G}$, $\mathcal{H}$ be grids on $R$. If $\mathcal{G}$ is finer than $\mathcal{H}$, then

$$0 \le v(E; \mathcal{H}) \le v(E; \mathcal{G}) \le V(E; \mathcal{G}) \le V(E; \mathcal{H}).$$

This leads us to the following fundamental principle.

12.12 Remark. Let $R$ be an $n$-dimensional rectangle and $E$ be a subset of $R$. If $\mathcal{G}$ and $\mathcal{H}$ are grids on $R$, then

$$0 \le v(E; \mathcal{G}) \le V(E; \mathcal{H}).$$

PROOF. Let $\mathcal{I}$ be a grid finer than both $\mathcal{G}$ and $\mathcal{H}$. By Remark 12.11,

$$0 \le v(E; \mathcal{G}) \le v(E; \mathcal{I}) \le V(E; \mathcal{I}) \le V(E; \mathcal{H}). \qquad ∎$$
Using the sums $v(E; \mathcal{G})$ and $V(E; \mathcal{G})$, we can define inner and outer volume of any bounded set $E$.

12.13 DEFINITION. Let $E$ be a bounded subset of $\mathbf{R}^n$ and let $R$ be an $n$-dimensional rectangle that satisfies $E \subseteq R$. The inner volume of $E$ is defined by

$$\underline{\operatorname{Vol}}(E) := \sup\{v(E; \mathcal{G}) : \mathcal{G} \text{ ranges over all grids on } R\},$$

and the outer volume of $E$ is defined by

$$\overline{\operatorname{Vol}}(E) := \inf\{V(E; \mathcal{G}) : \mathcal{G} \text{ ranges over all grids on } R\}.$$

As before, we can show that this definition is independent of the rectangle $R$ chosen to generate the grids $\mathcal{G}$. When $E$ is a Jordan region, the outer and inner volumes of $E$ both equal the volume of $E$.
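12.14 THEOREM. Let $E$ be a bounded subset of $\mathbf{R}^n$. Then $E$ is a Jordan region if and only if $\underline{\operatorname{Vol}}(E) = \overline{\operatorname{Vol}}(E)$.

PROOF. Let $E \subset \mathbf{R}^n$ and suppose that $R$ is a rectangle that contains $E$. We shall show that for all grids $\mathcal{G}$ on $R$,

$$V(E; \mathcal{G}) - v(E; \mathcal{G}) = V(\partial E; \mathcal{G}). \tag{4}$$

If $E^o = \emptyset$, then $\partial E = \overline{E}$ and (4) is obvious. Otherwise, suppose that $R_j \in \mathcal{G}$ is a rectangle which appears in the sum represented by the left side of (4); i.e., $R_j$ intersects $\overline{E}$ but $R_j$ is not a subset of $E^o$. If $R_j$ does not appear in the sum represented by the right side of (4), then $R_j \cap \partial E = \emptyset$. It follows that the pair $E^o$, $(\mathbf{R}^n \setminus E)^o$ separates $R_j$, a contradiction since all rectangles are connected (see Remark 9.34). Therefore, every rectangle that appears in the sum represented by the left side of (4) also appears in the sum represented by the right side; i.e.,

$$V(E; \mathcal{G}) - v(E; \mathcal{G}) \le V(\partial E; \mathcal{G}).$$

On the other hand, suppose that $R_j \in \mathcal{G}$ is a rectangle which appears in the sum represented by the right side of (4); i.e., $R_j \cap \overline{\partial E} \neq \emptyset$. Recall from Theorems 8.24 and 8.36 (or 10.39 and 10.31) that $\partial E = \overline{E} \setminus E^o$ is closed, so $R_j \cap \partial E \neq \emptyset$. It follows that $R_j$ intersects $\overline{E}$ but $R_j$ is not a subset of $E^o$. Thus every rectangle that appears in the sum represented by the right side of (4) also appears in the sum represented by the left side; i.e., $V(E; \mathcal{G}) - v(E; \mathcal{G}) \ge V(\partial E; \mathcal{G})$. This proves (4).

To prove the theorem, suppose that $E$ is a Jordan region. By Remark 12.6ii, $\operatorname{Vol}(\partial E) = 0$. Since by (4), $V(\partial E; \mathcal{G}) = V(E; \mathcal{G}) - v(E; \mathcal{G}) \ge \overline{\operatorname{Vol}}(E) - \underline{\operatorname{Vol}}(E)$, it follows (by taking the infimum of this last inequality over all grids $\mathcal{G}$) that

$$0 = \operatorname{Vol}(\partial E) \ge \overline{\operatorname{Vol}}(E) - \underline{\operatorname{Vol}}(E) \ge 0. \tag{5}$$

Thus $\underline{\operatorname{Vol}}(E) = \overline{\operatorname{Vol}}(E)$.

Conversely, suppose that $\underline{\operatorname{Vol}}(E) = \overline{\operatorname{Vol}}(E)$. By the Approximation Property, given $\varepsilon > 0$, there exist grids $\mathcal{H}_1$ and $\mathcal{H}_2$ such that

$$\overline{\operatorname{Vol}}(E) + \varepsilon > V(E; \mathcal{H}_1) \quad\text{and}\quad \underline{\operatorname{Vol}}(E) - \varepsilon < v(E; \mathcal{H}_2).$$

If $\mathcal{G}$ is a grid on $R$ that is finer than both $\mathcal{H}_1$ and $\mathcal{H}_2$, it follows from Remark 12.11 that $\overline{\operatorname{Vol}}(E) + \varepsilon > V(E; \mathcal{G})$ and $\underline{\operatorname{Vol}}(E) - \varepsilon < v(E; \mathcal{G})$. Subtracting these inequalities, we see by (4) that

$$0 \le V(\partial E; \mathcal{G}) = V(E; \mathcal{G}) - v(E; \mathcal{G}) < \overline{\operatorname{Vol}}(E) - \underline{\operatorname{Vol}}(E) + 2\varepsilon = 2\varepsilon.$$

Hence $E$ is a Jordan region by definition. ∎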
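Identity (4) can be checked exactly on a simple region. The sketch below is an illustration only, under the assumption that $E$ is the closed disk of radius $1/2$ centered at $(1/2, 1/2)$ and that the grid is the uniform $n \times n$ grid on $[0,1] \times [0,1]$; the names `near_far` and `disk_sums` are hypothetical. A cell meets the closed disk when its nearest point to the center is within the radius, lies in the open disk when its farthest corner is strictly within the radius, and meets the circle when the radius lies between those two distances.

```python
from fractions import Fraction

C, R2 = Fraction(1, 2), Fraction(1, 4)  # center coordinate and squared radius

def near_far(lo, hi):
    # Squared distances from C to the nearest and farthest points of [lo, hi]
    near = max(lo - C, C - hi, Fraction(0)) ** 2
    far = max(abs(lo - C), abs(hi - C)) ** 2
    return near, far

def disk_sums(n):
    """Return (V(E), v(E), V(boundary of E)) for the uniform n-by-n grid on [0,1]^2."""
    side = Fraction(1, n)
    cell = side * side
    outer = inner = boundary = Fraction(0)
    for i in range(n):
        nx, fx = near_far(i * side, (i + 1) * side)
        for j in range(n):
            ny, fy = near_far(j * side, (j + 1) * side)
            near, far = nx + ny, fx + fy
            if near <= R2:          # cell meets the closed disk
                outer += cell
            if far < R2:            # cell lies in the open disk
                inner += cell
            if near <= R2 <= far:   # cell meets the circle
                boundary += cell
    return outer, inner, boundary

for n in (4, 8, 16, 32):
    V, v, dV = disk_sums(n)
    print(n, float(v), float(V), float(dV), V - v == dV)
```

For each grid the last printed value confirms (4), and the boundary sum shrinks as $n$ grows, in line with the disk being a Jordan region whose inner and outer volumes agree.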
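EXERCISES

1. (a) For $m = 1, 2, 3$, let $\mathcal{G}_m$ be the grid on $[0,1] \times [0,1]$ generated by

[displayed formula missing from the source]

where $j = 1, 2$. For each of the following sets, compute $V(E; \mathcal{G}_m)$.
($\alpha$) $E = \{(x, y) \in [0,1] \times [0,1] : x = 0 \text{ or } y = 0\}$.
($\beta$) $E = \{(x, y) \in [0,1] \times [0,1] : y \le x\}$.
($\gamma$) $E = \{(x, y) \in [0,1] \times [0,1] : (2x - 1)^2 + (2y - 1)^2 \le 1\}$.
*(b) For each $E$ in part (a), compute $v(E; \mathcal{G}_m)$.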
2. (a) Prove that every finite subset of $\mathbf{R}^n$ is a Jordan region of volume zero.
(b) Show that even in $\mathbf{R}^2$, part (a) is not true if "finite" is replaced by "countable."
(c) By an interval in $\mathbf{R}^2$ we mean a set of the form

$$\{(x, c) : a \le x \le b\} \quad\text{or}\quad \{(c, y) : a \le y \le b\}$$

for some $a, b, c \in \mathbf{R}$. Prove that every interval in $\mathbf{R}^2$ is a Jordan region.

3. This exercise is used in Section e12.6. Let $E \subset \mathbf{R}^n$. The translation of $E$ by an $x \in \mathbf{R}^n$ is the set

$$x + E = \{y \in \mathbf{R}^n : y = x + z \text{ for some } z \in E\},$$

and the dilation of $E$ by a scalar $\alpha > 0$ is the set

$$\alpha E = \{y \in \mathbf{R}^n : y = \alpha z \text{ for some } z \in E\}.$$

(a) Prove that $E$ is a Jordan region if and only if $x + E$ is a Jordan region, in which case $\operatorname{Vol}(x + E) = \operatorname{Vol}(E)$.
(b) Prove that $E$ is a Jordan region if and only if $\alpha E$ is a Jordan region, in which case $\operatorname{Vol}(\alpha E) = \alpha^n \operatorname{Vol}(E)$.

4. This exercise is used in Section e12.5. Suppose that $E_1$, $E_2$ are Jordan regions in $\mathbf{R}^n$.
(a) Prove that if $E_1 \subseteq E_2$, then $\operatorname{Vol}(E_1) \le \operatorname{Vol}(E_2)$.
(b) Prove that $E_1 \cap E_2$ and $E_1 \setminus E_2$ are Jordan regions.
(c) Prove that if $E_1$, $E_2$ are nonoverlapping, then $\operatorname{Vol}(E_1 \cup E_2) = \operatorname{Vol}(E_1) + \operatorname{Vol}(E_2)$.
(d) If $E_2 \subseteq E_1$, prove that $\operatorname{Vol}(E_1 \setminus E_2) = \operatorname{Vol}(E_1) - \operatorname{Vol}(E_2)$.
(e) Prove that $\operatorname{Vol}(E_1 \cup E_2) = \operatorname{Vol}(E_1) + \operatorname{Vol}(E_2) - \operatorname{Vol}(E_1 \cap E_2)$.

5. Let $E$ be a Jordan region in $\mathbf{R}^n$.
(a) Prove that $E^o$ and $\overline{E}$ are Jordan regions.
(b) Prove that $\operatorname{Vol}(E^o) = \operatorname{Vol}(\overline{E}) = \operatorname{Vol}(E)$.
(c) Prove that $\operatorname{Vol}(E) > 0$ if and only if $E^o \neq \emptyset$.
(d) Let $f : [a, b] \to \mathbf{R}$ be continuous on $[a, b]$. Prove that the graph of $y = f(x)$, $x \in [a, b]$, is a Jordan region in $\mathbf{R}^2$.
(e) Does part (d) hold if "continuous" is replaced by "integrable"? How about "bounded"?

6. Prove that every grid is a nonoverlapping collection of Jordan regions.

7. (a) Prove that the boundary of an open ball $B_r(a)$ is given by

$$\partial B_r(a) = \{x : \|x - a\| = r\}.$$

(b) Prove that $B_r(a)$ is a Jordan region for all $a \in \mathbf{R}^n$ and all $r \ge 0$.
*8. Show that if $E \subset \mathbf{R}^n$ is bounded and has only finitely many cluster points, then $E$ is a Jordan region.

*9. A set $E \subset \mathbf{R}^n$ is said to be of measure zero if and only if given $\varepsilon > 0$ there is a sequence of rectangles $R_1, R_2, \dots$ that covers $E$ such that $\sum_{k=1}^{\infty} |R_k| < \varepsilon$.
(a) Prove that if $E \subset \mathbf{R}^n$ is of volume zero, then $E$ is of measure zero.
(b) Prove that if $E \subset \mathbf{R}^n$ is at most countable, then $E$ is of measure zero.
(c) Prove that there is a set $E \subset \mathbf{R}^2$ of measure zero that does not have zero area, in fact, is not even a Jordan region.
12.2 RIEMANN INTEGRATION ON JORDAN REGIONS

By analogy with the one-variable case, the integral of a nonnegative function $f$ over a Jordan region $E$ should be the volume of the set $\{(x, t) : x \in E, \ 0 \le t \le f(x)\}$. We should be able to approximate this volume by using $(n + 1)$-dimensional rectangles whose heights approximate $t = f(x)$ and whose bases belong to some grid on $E$ (see Figure 12.5). This leads us to the following definition (compare with Definition 5.13).
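12.15 DEFINITION. Let $E$ be a Jordan region in $\mathbf{R}^n$, let $f : E \to \mathbf{R}$ be a bounded function, let $R$ be an $n$-dimensional rectangle such that $E \subseteq R$, and let $\mathcal{G} = \{R_1, \dots, R_p\}$ be a grid on $R$. Extend $f$ to $\mathbf{R}^n$ by setting $f(\mathbf{x}) = 0$ for $\mathbf{x} \in \mathbf{R}^n \setminus E$.
(i) The upper sum of $f$ on $E$ with respect to $\mathcal{G}$ is
$$U(f, \mathcal{G}) := \sum_{R_j \cap E \ne \emptyset} M_j |R_j|,$$
where $M_j = \sup_{\mathbf{x} \in R_j} f(\mathbf{x})$.
(ii) The lower sum of $f$ on $E$ with respect to $\mathcal{G}$ is
$$L(f, \mathcal{G}) := \sum_{R_j \cap E \ne \emptyset} m_j |R_j|,$$
where $m_j = \inf_{\mathbf{x} \in R_j} f(\mathbf{x})$.
(iii) The upper and lower integrals of $f$ on $E$ are defined by
$$(L)\int_E f(\mathbf{x})\,d\mathbf{x} := (L)\int_E f\,dV := \sup_{\mathcal{G}} L(f, \mathcal{G})$$
and
$$(U)\int_E f(\mathbf{x})\,d\mathbf{x} := (U)\int_E f\,dV := \inf_{\mathcal{G}} U(f, \mathcal{G}),$$
where the supremum and infimum are taken over all grids $\mathcal{G}$ on $R$.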
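The upper and lower sums of this definition can be computed directly in a simple case. The following is a minimal sketch, not part of the text: it assumes $E$ is the closed unit disk in $\mathbf{R}^2$, $f(x, y) = x^2 + y^2$, and $\mathcal{G}$ is the uniform $m \times m$ grid on $R = [-1, 1] \times [-1, 1]$. The function name `upper_lower_sums` is hypothetical. For this particular $f$ the suprema and infima over each cell are available in closed form, which is what the code uses.

```python
import math

def upper_lower_sums(m):
    """Return (U(f, G), L(f, G)) for f(x, y) = x^2 + y^2 on the closed unit
    disk E, where G is the uniform m-by-m grid on R = [-1, 1] x [-1, 1]
    and f is extended by zero off E (illustrative assumption)."""
    h = 2.0 / m
    upper = lower = 0.0
    for i in range(m):
        x0, x1 = -1.0 + i * h, -1.0 + (i + 1) * h
        # distance from 0 to the interval [x0, x1]
        nx = 0.0 if x0 <= 0.0 <= x1 else min(abs(x0), abs(x1))
        fx = max(abs(x0), abs(x1))
        for j in range(m):
            y0, y1 = -1.0 + j * h, -1.0 + (j + 1) * h
            ny = 0.0 if y0 <= 0.0 <= y1 else min(abs(y0), abs(y1))
            fy = max(abs(y0), abs(y1))
            near = nx * nx + ny * ny   # min of x^2 + y^2 over the cell
            far = fx * fx + fy * fy    # max of x^2 + y^2 over the cell
            if near > 1.0:
                continue               # the cell does not intersect E
            # sup of the extended f: the cell either lies in E or crosses the circle
            M_j = min(far, 1.0)
            # inf of the extended f: 0 as soon as the cell contains a point off E
            m_j = near if far <= 1.0 else 0.0
            upper += M_j * h * h
            lower += m_j * h * h
    return upper, lower

up, lo = upper_lower_sums(200)
print(lo, math.pi / 2, up)   # the exact integral pi/2 lies between the two sums
```

Refining the grid shrinks the gap between the two sums, which is the behavior the definition of integrability below asks for.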
Figure 12.5
Modifying the proofs of Remarks 5.7, 5.8, and 5.14, we can prove the following result.
12.16 Remark. Let $E$ be a nonempty Jordan region in $\mathbf{R}^n$, let $f : E \to \mathbf{R}$ be bounded, and let $R$ be a rectangle that contains $E$.
(i) If $\mathcal{G}$ and $\mathcal{H}$ are grids on $R$, then $L(f, \mathcal{G}) \le U(f, \mathcal{H})$.
(ii) The upper and lower integrals of $f$ over $E$ exist, do not depend on the choice of $R$, and satisfy
$$(L)\int_E f(\mathbf{x})\,d\mathbf{x} \le (U)\int_E f(\mathbf{x})\,d\mathbf{x}. \tag{6}$$
12.17 DEFINITION. A real-valued bounded function $f$ defined on a Jordan region $E$ is said to be (Riemann) integrable on $E$ if and only if for every $\varepsilon > 0$ there is a grid $\mathcal{G}$ such that $U(f, \mathcal{G}) - L(f, \mathcal{G}) < \varepsilon$.
By modifying the proof of Theorem 5.15, we can establish the following result.
12.18 Remark. Let $E$ be a Jordan region in $\mathbf{R}^n$ and suppose that $f : E \to \mathbf{R}$ is bounded. Then $f$ is integrable on $E$ if and only if
$$(L)\int_E f(\mathbf{x})\,d\mathbf{x} = (U)\int_E f(\mathbf{x})\,d\mathbf{x}. \tag{7}$$
When $f$ is integrable on $E$, we denote the common value in (7) by
$$\int_E f(\mathbf{x})\,d\mathbf{x} \quad\text{or}\quad \int_E f\,dV$$
and call it the integral of $f$ over $E$. For $n = 2$ (respectively, $n = 3$) we shall frequently denote the integral $\int_E f\,dV$ by $\iint_E f\,dA$ (respectively, by $\iiint_E f\,dV$).
The following result shows that evaluation of Riemann integrals over Jordan regions reduces to evaluation of Riemann integrals over rectangles.
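12.19 THEOREM. Let $E$ be a Jordan region in $\mathbf{R}^n$, let $R$ be an $n$-dimensional rectangle that contains $E$, and suppose that $f : E \to \mathbf{R}$ is integrable on $E$. If
$$g(\mathbf{x}) = \begin{cases} f(\mathbf{x}) & \mathbf{x} \in E \\ 0 & \mathbf{x} \notin E, \end{cases}$$
then $g$ is integrable on $R$ and
$$\int_E f(\mathbf{x})\,d\mathbf{x} = \int_R g(\mathbf{x})\,d\mathbf{x}. \tag{8}$$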
PROOF. By Definition 12.15, the upper and lower sums of $f$ and $g$ are identical; hence, they have the same upper and lower integrals. It follows from Remark 12.18 that they have the same integrals. ∎
This last proof worked because we defined the upper and lower integrals of a function $f$ on $E$ by extending $f$ to be zero off $E$. We did this to be sure that $U(f; \mathcal{G})$ was an overestimate of the integral of $f$ and $L(f; \mathcal{G})$ was an underestimate. Unfortunately, the abrupt change from $f$ to 0 at the boundary of $E$ introduces additional complications. The next result shows that since the boundary of $E$ is of volume zero, we can ignore what happens at the boundary.
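12.20 THEOREM. Let $E$ be a Jordan region and suppose that $f : E \to \mathbf{R}$ is bounded. Then given $\varepsilon > 0$ there is a grid $\mathcal{G}_0$ such that if $\mathcal{G} := \{R_1, \dots, R_p\}$ is any grid finer than $\mathcal{G}_0$ and $M_j$, $m_j$ are defined as in Definition 12.15, then
$$\left| (U)\int_E f(\mathbf{x})\,d\mathbf{x} - \sum_{R_j \subset E^o} M_j |R_j| \right| < \varepsilon$$
and
$$\left| (L)\int_E f(\mathbf{x})\,d\mathbf{x} - \sum_{R_j \subset E^o} m_j |R_j| \right| < \varepsilon.$$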
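PROOF. Let $\varepsilon > 0$ and choose $M > 0$ such that $|f(\mathbf{x})| \le M$ for all $\mathbf{x} \in E$. Since $\mathrm{Vol}(\partial E) = 0$, we can choose a grid $\mathcal{H}_1$ such that $V(\partial E; \mathcal{H}_1) < \varepsilon/(2M)$. Moreover, by the Approximation Property of Infima, we can choose a grid $\mathcal{H}_2$ such that
$$(U)\int_E f(\mathbf{x})\,d\mathbf{x} \le U(f; \mathcal{H}_2) < (U)\int_E f(\mathbf{x})\,d\mathbf{x} + \frac{\varepsilon}{2}.$$
Let $\mathcal{G}_0$ be a grid finer than both $\mathcal{H}_1$ and $\mathcal{H}_2$, and suppose that $\mathcal{G} = \{R_1, \dots, R_p\}$ is finer than $\mathcal{G}_0$. Since each $R_j$ is connected, it is easy to see that if $R_j$ intersects $E$ but $R_j$ is not a subset of $E^o$, then $R_j$ intersects $\partial E$. (Indeed, if $R_j \cap \partial E = \emptyset$, then the pair $E^o$, $(\mathbf{R}^n \setminus E)^o$ separates $R_j$, a contradiction since all rectangles are connected; see Remark 9.34.) Since $\mathcal{G}$ is finer than $\mathcal{H}_1$ and $\mathcal{H}_2$, it follows that
$$\left| (U)\int_E f(\mathbf{x})\,d\mathbf{x} - \sum_{R_j \subset E^o} M_j |R_j| \right| \le \left| (U)\int_E f(\mathbf{x})\,d\mathbf{x} - U(f, \mathcal{G}) \right| + \sum_{R_j \cap \partial E \ne \emptyset} |M_j|\,|R_j| < \frac{\varepsilon}{2} + \sum_{R_j \cap \partial E \ne \emptyset} |M_j|\,|R_j| \le \frac{\varepsilon}{2} + M\,V(\partial E; \mathcal{G}) < \varepsilon.$$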
A similar proof establishes the inequality involving lower sums and lower integrals. ∎
It follows that if $E^o = \emptyset$, then the upper and lower integrals of any bounded $f$ are zero; i.e., $\int_E f(\mathbf{x})\,d\mathbf{x} = 0$.
Can we avoid worrying about the boundary by redefining the numbers $M_j$ and $m_j$ in Definition 12.15? For example, why not just define $M_j = \sup_{\mathbf{x} \in R_j \cap E} f(\mathbf{x})$? This approach will not work because the infimum of these upper sums will not equal the integral of $f$. For example, suppose that $\mathcal{G}_0 = \{[0, 1] \times [0, 1]\}$ and $\mathcal{G} = \{R_1, R_2, R_3, R_4\}$, where the $R_j$'s are formed by bisecting the sides of $\mathcal{G}_0$; i.e., each $R_j$ is exactly one-fourth of the unit square. Let $E = R_1$ and suppose that $f = 1$ on $R_1^o$, but $f = -1$ otherwise. If $M_j$ is defined as above, then $U(f, \mathcal{G}_0) = 1$ but $U(f, \mathcal{G}) = -1/2$, which is LESS than $\int_E f(\mathbf{x})\,d\mathbf{x} = 1/4$. Evidently, in order to define the integral of $f$ on $E$ by looking at grids on a rectangle that contains $E$, we must extend $f$ to be zero off $E$.
Our first application of Theorem 12.20 is an analogue of Theorem 5.10.
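Before turning to that application, the arithmetic in the counterexample above can be checked with exact rational numbers. This is a small sketch under the stated setup ($E = R_1 = [0, 1/2] \times [0, 1/2]$, $f = 1$ on $R_1^o$ and $f = -1$ on $\partial R_1$); the helper names `overlap`, `naive_sup`, and `extended_sup` are hypothetical and encode the two competing choices of $M_j$.

```python
from fractions import Fraction as F

half = F(1, 2)
E = ((F(0), half), (F(0), half))          # E = R_1 = [0, 1/2] x [0, 1/2]

def overlap(cell):
    """Side lengths of cell ∩ E, or None if the intersection is empty."""
    sides = []
    for (a, b), (c, d) in zip(cell, E):
        lo, hi = max(a, c), min(b, d)
        if lo > hi:
            return None
        sides.append(hi - lo)
    return sides

def area(cell):
    (a, b), (c, d) = cell
    return (b - a) * (d - c)

def naive_sup(cell):
    # sup of f over cell ∩ E: the value 1 is attained only if the cell
    # contains interior points of E, i.e. the overlap has positive area
    return 1 if all(s > 0 for s in overlap(cell)) else -1

def extended_sup(cell):
    # sup over the whole cell of f extended by zero (Definition 12.15):
    # a cell that only touches the boundary of E sees the values -1 and 0
    return 1 if all(s > 0 for s in overlap(cell)) else 0

def upper_sum(grid, sup):
    return sum(sup(c) * area(c) for c in grid if overlap(c) is not None)

G0 = [((F(0), F(1)), (F(0), F(1)))]
G = [((F(0), half), (F(0), half)), ((half, F(1)), (F(0), half)),
     ((F(0), half), (half, F(1))), ((half, F(1)), (half, F(1)))]

print(upper_sum(G0, naive_sup), upper_sum(G, naive_sup))   # 1 and -1/2
print(upper_sum(G, extended_sup))                          # 1/4
```

With the zero extension the upper sum on the finer grid equals the integral $1/4$, whereas the restricted supremum undershoots it.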
12.21 THEOREM. If $E$ is a closed Jordan region in $\mathbf{R}^n$ and $f : E \to \mathbf{R}$ is continuous on $E$, then $f$ is integrable on $E$.
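PROOF. Since by hypothesis $E$ is closed and bounded, $f$ is bounded on $E$ (apply the Extreme Value Theorem and the Heine-Borel Theorem). To show that $f$ is integrable on $E$, let $\varepsilon > 0$ and $R$ be a rectangle that contains $E$. By Theorem 12.20, there is a grid $\mathcal{G}_0$ on $R$ such that if $\mathcal{G} = \{R_1, \dots, R_p\}$ is any grid that is finer than $\mathcal{G}_0$, then
$$\left| (U)\int_E f(\mathbf{x})\,d\mathbf{x} - \sum_{R_j \subset E^o} M_j |R_j| \right| < \frac{\varepsilon}{2} \quad\text{and}\quad \left| (L)\int_E f(\mathbf{x})\,d\mathbf{x} - \sum_{R_j \subset E^o} m_j |R_j| \right| < \frac{\varepsilon}{2}. \tag{9}$$
Since $f$ is uniformly continuous on $E$, choose $\delta > 0$ such that
$$\|\mathbf{x} - \mathbf{y}\| < \delta \ \text{ and } \ \mathbf{x}, \mathbf{y} \in E \quad\text{imply}\quad |f(\mathbf{x}) - f(\mathbf{y})| < \varepsilon.$$
Make $\mathcal{G}$ finer by insisting that for each $R_j \in \mathcal{G}$, $\|\mathbf{x} - \mathbf{y}\| < \delta$ when $\mathbf{x}, \mathbf{y} \in R_j$. Then the choice of $\delta$ implies that $M_j - m_j < \varepsilon$ for all $j$ that satisfy $R_j \subset E$. Hence it follows from Remark 12.16 and (9) that
$$0 \le (U)\int_E f(\mathbf{x})\,d\mathbf{x} - (L)\int_E f(\mathbf{x})\,d\mathbf{x} < \varepsilon + \sum_{R_j \subset E^o} (M_j - m_j)|R_j| < \varepsilon + \varepsilon V(E; \mathcal{G}) \le \varepsilon(1 + |R|).$$
Since $\varepsilon > 0$ was arbitrary, we conclude that $(U)\int_E f(\mathbf{x})\,d\mathbf{x} = (L)\int_E f(\mathbf{x})\,d\mathbf{x}$; i.e., $f$ is integrable on $E$. ∎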
In Theorem 12.21, the hypothesis that E be closed can be weakened if we insist that f be uniformly continuous on E (see Exercise 4b). All one needs to do is apply Exercise 8, p. 277. The following result shows that the volume of a Jordan region can be computed by integration.
12.22 THEOREM. If $E$ is a closed Jordan region, then
$$\mathrm{Vol}(E) = \int_E 1\,d\mathbf{x}.$$
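PROOF. Let $R$ be a rectangle containing $E$ and $\mathcal{G} = \{R_1, \dots, R_p\}$ be a grid on $R$. Define $f(\mathbf{x}) = 1$ for $\mathbf{x} \in E$ and $f(\mathbf{x}) = 0$ for $\mathbf{x} \notin E$, and notice by Theorem 12.21 that $f$ is integrable on $E$. Since $R_j \cap E \ne \emptyset$ implies $R_j \cap \overline{E} \ne \emptyset$, and $M_j(f) = 1$ when $R_j \cap E \ne \emptyset$, it is clear, by the definition of upper sums and outer sums, that $U(f, \mathcal{G}) \le V(E; \mathcal{G})$. Taking the infimum of this inequality over all grids $\mathcal{G}$, and applying Theorem 12.21 together with Definitions 12.15 and 12.4, we have
$$\int_E 1\,d\mathbf{x} = \inf_{\mathcal{G}} U(f, \mathcal{G}) \le \inf_{\mathcal{G}} V(E; \mathcal{G}) = \mathrm{Vol}(E).$$
On the other hand, since $\mathrm{Vol}(\partial E) = 0$, given $\varepsilon > 0$ we can choose $\mathcal{G}$ so that $V(\partial E; \mathcal{G}) < \varepsilon$. Since $m_j(f) = 0$ when $R_j \cap E^c \ne \emptyset$, and $m_j(f) = 1$ when $R_j \subseteq E$, it follows that
$$\int_E 1\,d\mathbf{x} \ge L(f; \mathcal{G}) = \sum_{R_j \cap E \ne \emptyset} m_j |R_j| = \sum_{R_j \subseteq E} |R_j| \ge \sum_{R_j \cap E \ne \emptyset} |R_j| - \sum_{R_j \cap \partial E \ne \emptyset} |R_j| = V(E; \mathcal{G}) - V(\partial E; \mathcal{G}) \ge \mathrm{Vol}(E) - \varepsilon.$$
Since $\varepsilon > 0$ was arbitrary, it follows that $\int_E 1\,d\mathbf{x} \ge \mathrm{Vol}(E)$. ∎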
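The two estimates in this proof can be watched numerically. The sketch below is an illustration only, assuming $E$ is the closed unit disk and $\mathcal{G}$ is the uniform $m \times m$ grid on $[-1, 1] \times [-1, 1]$; the function name `disk_sums` is hypothetical. It returns the outer sum $V(E; \mathcal{G})$ and the boundary sum $V(\partial E; \mathcal{G})$, so that $V(E; \mathcal{G}) - V(\partial E; \mathcal{G}) \le \mathrm{Vol}(E) \le V(E; \mathcal{G})$.

```python
import math

def disk_sums(m):
    """Return (V(E; G), V(dE; G)) for the closed unit disk E and the
    uniform m-by-m grid G on [-1, 1] x [-1, 1] (illustrative assumption)."""
    h = 2.0 / m
    outer = boundary = 0.0
    for i in range(m):
        x0, x1 = -1.0 + i * h, -1.0 + (i + 1) * h
        nx = 0.0 if x0 <= 0.0 <= x1 else min(abs(x0), abs(x1))
        fx = max(abs(x0), abs(x1))
        for j in range(m):
            y0, y1 = -1.0 + j * h, -1.0 + (j + 1) * h
            ny = 0.0 if y0 <= 0.0 <= y1 else min(abs(y0), abs(y1))
            fy = max(abs(y0), abs(y1))
            near = nx * nx + ny * ny   # squared distance to the nearest point of the cell
            far = fx * fx + fy * fy    # squared distance to the farthest corner
            if near <= 1.0:            # the cell meets E
                outer += h * h
                if far >= 1.0:         # the cell also reaches the unit circle
                    boundary += h * h
    return outer, boundary

outer, bdry = disk_sums(400)
print(outer - bdry, math.pi, outer)    # lower estimate, Vol(E) = pi, upper estimate
```

As the grid is refined, `bdry` tends to zero, which is exactly the statement $\mathrm{Vol}(\partial E) = 0$ used above.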
In Theorem 12.22, the hypothesis that E be closed can be dropped (combine Exercise 5b, p. 393, with Theorem 12.24). As in the one-dimensional case, the integral of a sum of functions over a union of regions can be broken into simpler pieces.
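12.23 THEOREM [LINEAR PROPERTIES]. Let $E$ be a Jordan region in $\mathbf{R}^n$, let $f, g : E \to \mathbf{R}$, and let $\alpha$ be a scalar.
(i) If $f, g$ are integrable on $E$, then so are $\alpha f$ and $f + g$. In fact,
$$\int_E \alpha f(\mathbf{x})\,d\mathbf{x} = \alpha \int_E f(\mathbf{x})\,d\mathbf{x} \tag{10}$$
and
$$\int_E (f(\mathbf{x}) + g(\mathbf{x}))\,d\mathbf{x} = \int_E f(\mathbf{x})\,d\mathbf{x} + \int_E g(\mathbf{x})\,d\mathbf{x}. \tag{11}$$
(ii) If $E_1, E_2 \subseteq E$ are nonoverlapping Jordan regions and $f$ is integrable on both $E_1$ and $E_2$, then $f$ is integrable on $E_1 \cup E_2$ and
$$\int_{E_1 \cup E_2} f(\mathbf{x})\,d\mathbf{x} = \int_{E_1} f(\mathbf{x})\,d\mathbf{x} + \int_{E_2} f(\mathbf{x})\,d\mathbf{x}. \tag{12}$$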
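PROOF. We suppose for simplicity that $\alpha > 0$. Let $\varepsilon > 0$ and choose a grid $\mathcal{G}$ such that
$$U(f, \mathcal{G}) - \varepsilon < \int_E f(\mathbf{x})\,d\mathbf{x} < L(f, \mathcal{G}) + \varepsilon. \tag{13}$$
Notice that $U(\alpha f, \mathcal{G}) = \alpha U(f, \mathcal{G})$ and $L(\alpha f, \mathcal{G}) = \alpha L(f, \mathcal{G})$. Multiplying (13) by $\alpha$ we obtain
$$U(\alpha f, \mathcal{G}) - \alpha\varepsilon < \alpha \int_E f(\mathbf{x})\,d\mathbf{x} < L(\alpha f, \mathcal{G}) + \alpha\varepsilon.$$
In particular,
$$\inf_{\mathcal{G}} U(\alpha f, \mathcal{G}) < \alpha \int_E f(\mathbf{x})\,d\mathbf{x} + \alpha\varepsilon \quad\text{and}\quad \sup_{\mathcal{G}} L(\alpha f, \mathcal{G}) > \alpha \int_E f(\mathbf{x})\,d\mathbf{x} - \alpha\varepsilon.$$
Taking the limit of these inequalities as $\varepsilon \to 0$, we conclude that
$$\inf_{\mathcal{G}} U(\alpha f, \mathcal{G}) \le \alpha \int_E f(\mathbf{x})\,d\mathbf{x} \le \sup_{\mathcal{G}} L(\alpha f, \mathcal{G}).$$
This proves (10).
To prove (11), choose a grid $\mathcal{G}$ such that
$$U(f, \mathcal{G}) - \varepsilon < \int_E f(\mathbf{x})\,d\mathbf{x} < L(f, \mathcal{G}) + \varepsilon$$
and
$$U(g, \mathcal{G}) - \varepsilon < \int_E g(\mathbf{x})\,d\mathbf{x} < L(g, \mathcal{G}) + \varepsilon.$$
Adding these inequalities, we have
$$U(f, \mathcal{G}) + U(g, \mathcal{G}) - 2\varepsilon < \int_E f(\mathbf{x})\,d\mathbf{x} + \int_E g(\mathbf{x})\,d\mathbf{x} < L(f, \mathcal{G}) + L(g, \mathcal{G}) + 2\varepsilon.$$
By definition, $U(f + g, \mathcal{G}) \le U(f, \mathcal{G}) + U(g, \mathcal{G})$ and $L(f + g, \mathcal{G}) \ge L(f, \mathcal{G}) + L(g, \mathcal{G})$. Therefore,
$$U(f + g, \mathcal{G}) - 2\varepsilon < \int_E f(\mathbf{x})\,d\mathbf{x} + \int_E g(\mathbf{x})\,d\mathbf{x} < L(f + g, \mathcal{G}) + 2\varepsilon;$$
i.e.,
$$\inf_{\mathcal{G}} U(f + g, \mathcal{G}) \le \int_E f(\mathbf{x})\,d\mathbf{x} + \int_E g(\mathbf{x})\,d\mathbf{x} \le \sup_{\mathcal{G}} L(f + g, \mathcal{G}).$$
This proves (11).
To prove (12), let $\varepsilon > 0$ and apply Theorem 12.20 three times to choose a grid $\mathcal{G}_0$ so that if $\mathcal{G} = \{R_1, \dots, R_p\}$ is finer than $\mathcal{G}_0$ then
$$\left| \int_{E_i} f(\mathbf{x})\,d\mathbf{x} - \sum_{R_j \subset E_i^o} M_j |R_j| \right| < \varepsilon \tag{14}$$
for $i = 1, 2$, and
$$\left| (U)\int_{E_1 \cup E_2} f(\mathbf{x})\,d\mathbf{x} - \sum_{R_j \subset (E_1 \cup E_2)^o} M_j |R_j| \right| < \varepsilon. \tag{15}$$
Since $E_1$ and $E_2$ are nonoverlapping, we may also assume that
$$V(E_1 \cap E_2; \mathcal{G}) < \varepsilon. \tag{16}$$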
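Let $M = \max\{|M_1|, \dots, |M_p|\}$. Since each $R_j$ is connected and $E_1^o \cap E_2^o = \emptyset$, it is easy to see that each $R_j \subset (E_1 \cup E_2)^o$ satisfies one and only one of the following three conditions: (i) $R_j \subset E_1^o$; (ii) $R_j \subset E_2^o$; or (iii) $R_j \cap E_1 \cap E_2 \ne \emptyset$. Hence, it follows from (15), (16), and (14) that
$$(U)\int_{E_1 \cup E_2} f(\mathbf{x})\,d\mathbf{x} < \varepsilon + \sum_{R_j \subset (E_1 \cup E_2)^o} M_j |R_j| \le \varepsilon + \sum_{R_j \subset E_1^o} M_j |R_j| + \sum_{R_j \subset E_2^o} M_j |R_j| + M\,V(E_1 \cap E_2; \mathcal{G}) < 3\varepsilon + \int_{E_1} f(\mathbf{x})\,d\mathbf{x} + \int_{E_2} f(\mathbf{x})\,d\mathbf{x} + M\varepsilon.$$
Since $\varepsilon > 0$ was arbitrary, we obtain
$$(U)\int_{E_1 \cup E_2} f(\mathbf{x})\,d\mathbf{x} \le \int_{E_1} f(\mathbf{x})\,d\mathbf{x} + \int_{E_2} f(\mathbf{x})\,d\mathbf{x}.$$
A similar argument establishes
$$(L)\int_{E_1 \cup E_2} f(\mathbf{x})\,d\mathbf{x} \ge \int_{E_1} f(\mathbf{x})\,d\mathbf{x} + \int_{E_2} f(\mathbf{x})\,d\mathbf{x}.$$
Thus (12) holds. ∎
The following result shows that the value of an integral remains the same when the integrand is changed on a set of volume zero (compare with Exercise 6, p. 116).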
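12.24 THEOREM. Let $E$ be a Jordan region in $\mathbf{R}^n$, and suppose that $f, g : E \to \mathbf{R}$ are bounded functions.
(i) If $E_0$ is of volume zero, then $g$ is integrable on $E_0$ and
$$\int_{E_0} g(\mathbf{x})\,d\mathbf{x} = 0.$$
(ii) If $f$ is integrable on $E$ and if there is a subset $E_0$ of $E$ such that $\mathrm{Vol}(E_0) = 0$ and $f(\mathbf{x}) = g(\mathbf{x})$ for all $\mathbf{x} \in E \setminus E_0$, then $g$ is integrable on $E$ and
$$\int_E g(\mathbf{x})\,d\mathbf{x} = \int_E f(\mathbf{x})\,d\mathbf{x}.$$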
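PROOF. (i) If $E_0^o \ne \emptyset$, then $E_0$ contains a ball, hence a nondegenerate rectangle, so $\mathrm{Vol}(E_0) > 0$, a contradiction. Since $E_0^o = \emptyset$, it follows from Theorem 12.20 that
$$(U)\int_{E_0} g(\mathbf{x})\,d\mathbf{x} = (L)\int_{E_0} g(\mathbf{x})\,d\mathbf{x} = 0.$$
(ii) Since $f = g$ on $E \setminus E_0$, it follows from Theorem 12.23ii and part (i) that
$$\int_E g(\mathbf{x})\,d\mathbf{x} = \int_{E \setminus E_0} g(\mathbf{x})\,d\mathbf{x} + \int_{E_0} g(\mathbf{x})\,d\mathbf{x} = \int_{E \setminus E_0} f(\mathbf{x})\,d\mathbf{x} + \int_{E_0} f(\mathbf{x})\,d\mathbf{x} = \int_E f(\mathbf{x})\,d\mathbf{x}. \ ∎$$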
This suggests a way to define the integral of $f$ on $E$ when $f$ is not defined on all of $E$. Indeed, if $f$ is defined on $E \setminus E_0$, where $E$ is a Jordan region and $E_0$ is of volume zero, and the function
$$g(\mathbf{x}) := \begin{cases} f(\mathbf{x}) & \mathbf{x} \in E \setminus E_0 \\ 0 & \mathbf{x} \in E_0 \end{cases}$$
is integrable on $E$, then we shall define
$$\int_E f(\mathbf{x})\,d\mathbf{x} := \int_E g(\mathbf{x})\,d\mathbf{x}.$$
For example,
$$\int_0^2 \frac{x^2 - 1}{x - 1}\,dx = \int_0^2 (x + 1)\,dx = 4.$$
Henceforth, the phrase "$f : E \to \mathbf{R}$ is integrable" includes the possibility that $f$ may not be defined on a subset of $E$ of volume zero.
The following result is a multidimensional analogue of Theorems 5.21 and 5.22.
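12.25 THEOREM [COMPARISON THEOREM FOR MULTIPLE INTEGRALS]. Let $E$ be a Jordan region in $\mathbf{R}^n$ and suppose that $f, g : E \to \mathbf{R}$ are integrable on $E$.
(i) If $f(\mathbf{x}) \le g(\mathbf{x})$ for $\mathbf{x} \in E$, then
$$\int_E f(\mathbf{x})\,d\mathbf{x} \le \int_E g(\mathbf{x})\,d\mathbf{x}.$$
(ii) If $m, M$ are scalars that satisfy $m \le f(\mathbf{x}) \le M$ for $\mathbf{x} \in E$, then
$$m\,\mathrm{Vol}(E) \le \int_E f(\mathbf{x})\,d\mathbf{x} \le M\,\mathrm{Vol}(E).$$
(iii) The function $|f|$ is integrable on $E$ and
$$\left| \int_E f(\mathbf{x})\,d\mathbf{x} \right| \le \int_E |f(\mathbf{x})|\,d\mathbf{x}. \tag{17}$$
PROOF. (i) If $f \le g$ on $E$, then $L(f, \mathcal{G}) \le L(g, \mathcal{G})$ for any grid $\mathcal{G}$. Taking the supremum of this inequality over all grids $\mathcal{G}$ verifies part (i).
(ii) By Theorem 12.22, (10), and part (i),
$$m\,\mathrm{Vol}(E) = \int_E m\,d\mathbf{x} \le \int_E f(\mathbf{x})\,d\mathbf{x} \le \int_E M\,d\mathbf{x} = M\,\mathrm{Vol}(E).$$
(iii) Let $\varepsilon > 0$ and choose by Definition 12.17 a grid $\mathcal{G} = \{R_1, \dots, R_p\}$ such that
$$U(f, \mathcal{G}) - L(f, \mathcal{G}) < \varepsilon. \tag{18}$$
By repeating the argument that verified (10) in Section 5.2, we have
$$\sup_{\mathbf{x} \in R_j} |f(\mathbf{x})| - \inf_{\mathbf{x} \in R_j} |f(\mathbf{x})| \le \sup_{\mathbf{x} \in R_j} f(\mathbf{x}) - \inf_{\mathbf{x} \in R_j} f(\mathbf{x}).$$
Hence, it follows from (18) that
$$U(|f|, \mathcal{G}) - L(|f|, \mathcal{G}) \le U(f, \mathcal{G}) - L(f, \mathcal{G}) < \varepsilon.$$
Thus $|f|$ is integrable on $E$. Since $-|f| \le f \le |f|$, we conclude by part (i) that
$$-\int_E |f(\mathbf{x})|\,d\mathbf{x} \le \int_E f(\mathbf{x})\,d\mathbf{x} \le \int_E |f(\mathbf{x})|\,d\mathbf{x}. \ ∎$$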
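12.26 THEOREM [MEAN VALUE THEOREM FOR MULTIPLE INTEGRALS]. Let $E$ be a Jordan region in $\mathbf{R}^n$ and let $f, g : E \to \mathbf{R}$ be integrable on $E$ with $g(\mathbf{x}) \ge 0$ for all $\mathbf{x} \in E$.
(i) There is a number $c$ satisfying
$$\inf_{\mathbf{x} \in E} f(\mathbf{x}) \le c \le \sup_{\mathbf{x} \in E} f(\mathbf{x}) \tag{19}$$
such that
$$c \int_E g(\mathbf{x})\,d\mathbf{x} = \int_E f(\mathbf{x}) g(\mathbf{x})\,d\mathbf{x}. \tag{20}$$
(ii) There is a number $c$ satisfying (19) such that
$$c\,\mathrm{Vol}(E) = \int_E f(\mathbf{x})\,d\mathbf{x}.$$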
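PROOF. (i) By hypothesis, the product $fg$ is integrable on $E$ (see Exercise 7). Let $m = \inf_{\mathbf{x} \in E} f(\mathbf{x})$ and $M = \sup_{\mathbf{x} \in E} f(\mathbf{x})$. Since $g \ge 0$, Theorem 12.25 implies
$$m \int_E g(\mathbf{x})\,d\mathbf{x} \le \int_E f(\mathbf{x}) g(\mathbf{x})\,d\mathbf{x} \le M \int_E g(\mathbf{x})\,d\mathbf{x}. \tag{21}$$
If $\int_E g(\mathbf{x})\,d\mathbf{x} = 0$, then (21) implies $\int_E f(\mathbf{x}) g(\mathbf{x})\,d\mathbf{x} = 0$ and (20) holds for any $c$. If $\int_E g(\mathbf{x})\,d\mathbf{x} \ne 0$, then (20) holds for
$$c = \frac{\int_E f(\mathbf{x}) g(\mathbf{x})\,d\mathbf{x}}{\int_E g(\mathbf{x})\,d\mathbf{x}}.$$
(ii) Apply part (i) to $g(\mathbf{x}) = 1$. ∎
We close this section with some optional material that generalizes a concept introduced in Section 9.5.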
*12.27 DEFINITION. A set $E \subset \mathbf{R}^n$ is said to be of measure zero if and only if for every $\varepsilon > 0$ there is a countable collection of rectangles $\{R_j\}_{j \in \mathbf{N}}$ such that
$$E \subset \bigcup_{j=1}^{\infty} R_j \quad\text{and}\quad \sum_{j=1}^{\infty} |R_j| < \varepsilon.$$
*12.28 Remark. If $E_1, E_2, \dots$ is a sequence of subsets of $\mathbf{R}^n$ and each $E_k$ is of measure zero, then
$$E := \bigcup_{k=1}^{\infty} E_k$$
is also of measure zero.
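PROOF. Let $\varepsilon > 0$. For each $k \in \mathbf{N}$, choose a collection of rectangles $\{R_j^{(k)}\}_{j \in \mathbf{N}}$ that covers $E_k$ such that
$$\sum_{j=1}^{\infty} |R_j^{(k)}| < \frac{\varepsilon}{2^k}.$$
Clearly, the collection $\{R_j^{(k)}\}_{j,k \in \mathbf{N}}$ is countable, covers $E$, and
$$\sum_{k=1}^{\infty} \sum_{j=1}^{\infty} |R_j^{(k)}| \le \sum_{k=1}^{\infty} \frac{\varepsilon}{2^k} = \varepsilon.$$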
Consequently, $E$ is of measure zero. ∎
Every singleton $E = \{\mathbf{a}\}$ in $\mathbf{R}^n$ is of measure zero. In fact, by comparing Definition 12.27 with Theorem 12.8, it is clear that every set of volume zero is a set of measure zero. The converse of this statement is false. Indeed, for each $a \in \mathbf{R}$ the set $\{(a, y) : y \in [0, 1]\}$ is of volume zero, hence is of measure zero. Thus, by Remark 12.28, $E := \mathbf{Q} \times [0, 1]$ is a set of measure zero. On the other hand, it is clear that $\underline{\mathrm{Vol}}(E) = 0 < 1 \le \overline{\mathrm{Vol}}(E)$, so $E$ is not a set of volume zero; in fact, $E$ is not even a Jordan region.
An analogue of Lebesgue's Theorem holds for multiple integrals.
*12.29 THEOREM. Let $E$ be a Jordan region and let $f : E \to \mathbf{R}$ be bounded.
(i) Then $f$ is Riemann integrable on $E$ if and only if the set of points of discontinuity of $f$ on $E$ is of measure zero.
(ii) Suppose that $V$ is an open set in $\mathbf{R}^n$ such that $E \subset V$, and that $\phi : V \to \mathbf{R}^n$ is 1-1 and $\phi^{-1}$ is $C^1$ on $\phi(V)$ with $\Delta_{\phi^{-1}} \ne 0$. If $f$ is integrable on $\phi(E)$, then $f \circ \phi$ is integrable on $E$.
PROOF. (i) This part can be verified by modifying the proof of Theorem 9.49 (see Spivak [12], p. 53).
(ii) By part (i) and Corollary 12.10iii, it suffices to show that the set of points of discontinuity of $f \circ \phi$ on $E$ is a set of measure zero. Let $\varepsilon > 0$. Since $f$ is integrable on $\phi(E)$, its set of points of discontinuity, $D$, can be covered by squares $Q_k$ such that $\sum_{k=1}^{\infty} |Q_k| < \varepsilon$. Set $\psi = \phi^{-1}$ and apply (2), with $\psi$ in place of $\phi$, to choose an absolute constant $C$ and squares $Q_k^*$ such that $\psi(Q_k) \subseteq Q_k^*$ and $|Q_k^*| \le C|Q_k|$. Then $\{Q_k^*\}$ covers $\psi(D) = \phi^{-1}(D)$ and
$$\sum_{k=1}^{\infty} |Q_k^*| \le C \sum_{k=1}^{\infty} |Q_k| < C\varepsilon.$$
Hence, $\phi^{-1}(D) = \psi(D)$ is a set of measure zero. But since $D$ is the set of points of discontinuity of $f$ on $\phi(E)$, $\phi^{-1}(D)$ is the set of points of discontinuity of $f \circ \phi$ on $E$. Hence $f \circ \phi$ is Riemann integrable by part (i). ∎
EXERCISES
1. Using Exercise 1, p. 17, compute the upper and lower sums $U(f, \mathcal{G}_m)$, $L(f, \mathcal{G}_m)$ for $m \in \mathbf{N}$, where $f(x, y) = xy$ and $\mathcal{G}_m$ is determined by
for $j = 1, 2$. Prove that
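2. Let $E$ be a Jordan region in $\mathbf{R}^n$ and $f, g$ be integrable on $E$ with
$$\int_E f(\mathbf{x})\,d\mathbf{x} = 5 \quad\text{and}\quad \int_E g(\mathbf{x})\,d\mathbf{x} = 2.$$
(a) Find
$$\int_{E^o} (2f(\mathbf{x}) - 3g(\mathbf{x}))\,d\mathbf{x}, \quad \int_E (2f(\mathbf{x}) - 3g(\mathbf{x}))\,d\mathbf{x}, \quad\text{and}\quad \int_{\overline{E}} (2f(\mathbf{x}) - 3g(\mathbf{x}))\,d\mathbf{x}.$$
(b) If $h$ is integrable on $E$ and $g(\mathbf{x}) \le h(\mathbf{x}) \le f(\mathbf{x})$, prove that $\int_E h(\mathbf{x})\,d\mathbf{x} \ne \pi/2$.
(c) Suppose that $n = 2$ and $E \subseteq [0, 1] \times [0, 1]$. If $g(x, y) \le f(x, y)$ for all $(x, y) \in E$, prove that there is a $0 \le t_0 \le 1$ such that
$$\iint_E x^2 (f(x, y) - g(x, y))\,dA = 3t_0.$$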
3. Let Q := [0,1] x [0,1]' A := {(x, y) E Q : y ::; x}, B := {(x, y) E Q : y 2: x}, and let f be integrable on Q (hence, on A-see Exercise 4a) with JJA fdA = 4. (a) If JJQ fdA = 3, find JJB fdA, and compute the value of JJB(2f (b) If f 2: a on A and E:= {(x,y) E Q: y::; if:t4}, prove that
+ 5) dA.
4. (a) Let $E_1 \subset E$ be Jordan regions in $\mathbf{R}^n$. If $f : E \to \mathbf{R}$ is integrable on $E$, prove that $f$ is integrable on $E_1$.
(b) If $f$ is uniformly continuous on a Jordan region $E$, prove that $f$ is integrable on $E$.
(c) If $f : \mathbf{R}^n \to \mathbf{R}$ is continuous on $\mathbf{R}^n$, prove that $f$ is integrable on any Jordan region in $\mathbf{R}^n$.
5. This exercise is used in Sections 12.4, 13.5, and 13.6. Let $E$ be an open Jordan region in $\mathbf{R}^n$ and $\mathbf{x}_0 \in E$. If $f : E \to \mathbf{R}$ is integrable on $E$ and continuous at $\mathbf{x}_0$, prove that
$$\lim_{r \to 0+} \frac{1}{\mathrm{Vol}(B_r(\mathbf{x}_0))} \int_{B_r(\mathbf{x}_0)} f(\mathbf{x})\,d\mathbf{x} = f(\mathbf{x}_0).$$
6. (a) Suppose that $E$ is a Jordan region in $\mathbf{R}^n$ and that $f_k : E \to \mathbf{R}$ are integrable on $E$ for $k \in \mathbf{N}$. If $f_k \to f$ uniformly on $E$ as $k \to \infty$, prove that $f$ is integrable on $E$ and
$$\lim_{k \to \infty} \int_E f_k(\mathbf{x})\,d\mathbf{x} = \int_E f(\mathbf{x})\,d\mathbf{x}.$$
(b) Prove that
$$\lim_{k \to \infty} \iint_E \cos(x/k)\,e^{y/k}\,dA$$
exists, and find its value for any Jordan region $E$ in $\mathbf{R}^2$.
7. Let $E$ be a Jordan region in $\mathbf{R}^n$ and let $f, g : E \to \mathbf{R}$ be integrable on $E$.
(a) Modifying the proof of Corollary 5.23, prove that $fg$ is integrable on $E$.
(b) Prove that $f \vee g$ and $f \wedge g$ are integrable on $E$ (see Exercise 9, p. 65).
8. Let $H$ be a closed, connected, nonempty Jordan region and let $f : H \to \mathbf{R}$ be continuous on $H$.
(a) If $g : H \to \mathbf{R}$ is integrable and nonnegative on $H$, prove that there is an $\mathbf{x}_0 \in H$ such that
$$f(\mathbf{x}_0) \int_H g(\mathbf{x})\,d\mathbf{x} = \int_H f(\mathbf{x}) g(\mathbf{x})\,d\mathbf{x}.$$
(b) If $H^o \ne \emptyset$, prove that there is an open set $V$ and a point $\mathbf{x}_0 \in V \cap H$ such that
$$\int_H f(\mathbf{x})\,d\mathbf{x} = f(\mathbf{x}_0)\,\mathrm{Vol}(H).$$
9. Prove the following special case of Theorem 12.29i. If $E$ is a closed nonempty Jordan region in $\mathbf{R}^n$, $E_0$ is a nonempty Jordan region of volume zero, and $f : E \to \mathbf{R}$ is a bounded function that is continuous on $E \setminus E_0$, then $f$ is integrable on $E$.
10. Suppose that $V$ is open in $\mathbf{R}^n$ and $f : V \to \mathbf{R}$ is continuous. Prove that if
$$\int_E f(\mathbf{x})\,d\mathbf{x} = 0$$
for all nonempty Jordan regions $E \subset V$, then $f = 0$ on $V$.
11. Suppose that $E$ is a Jordan region and $f : E \to \mathbf{R}$ is integrable.
(a) If $f(E) \subseteq H$, for some compact set $H$, and $\phi : H \to \mathbf{R}$ is continuous, prove that $\phi \circ f$ is integrable on $E$.
*(b) Show that part (a) is false if $\phi$ has even one point of discontinuity.
12.3 ITERATED INTEGRALS
If $f(x_1, \dots, x_k, \dots, x_j, \dots, x_n)$ is defined for $x_k \in [c, d]$ and $x_j \in [a, b]$, $j \ne k$, then we shall call
$$\int_c^d \int_a^b f(x_1, \dots, x_n)\,dx_j\,dx_k := \int_c^d \left( \int_a^b f(x_1, \dots, x_n)\,dx_j \right) dx_k$$
an iterated integral when the integrals on the right side exist. In a similar way, we define higher-order iterated integrals.
In the preceding section we defined the Riemann integral of a multivariable function but developed no practical way to evaluate it. In this section we show that for a large collection of Jordan regions $E$, integrals over $E$ can be evaluated using iterated integrals.
For simplicity, we begin with the two-dimensional case. Recall that for each $\phi : [a, b] \to \mathbf{R}$, $(U)\int_a^b \phi(x)\,dx$ represents the upper Riemann integral of $\phi$, and $(L)\int_a^b \phi(x)\,dx$ represents the lower Riemann integral of $\phi$.
12.30 Lemma. Let $R = [a, b] \times [c, d]$ be a two-dimensional rectangle and suppose that $f : R \to \mathbf{R}$ is bounded. If $f(x, \cdot)$ is integrable on $[c, d]$ for each $x \in [a, b]$, then
$$(L)\iint_R f\,dA \le (L)\int_a^b \left( \int_c^d f(x, y)\,dy \right) dx \le (U)\int_a^b \left( \int_c^d f(x, y)\,dy \right) dx \le (U)\iint_R f\,dA. \tag{22}$$
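PROOF. Let $R_{ij} = [x_{i-1}, x_i] \times [y_{j-1}, y_j]$, where $\{x_0, \dots, x_k\}$ is a partition of $[a, b]$ and $\{y_0, \dots, y_\ell\}$ is a partition of $[c, d]$. Then $\mathcal{G} = \{R_{ij} : i = 1, 2, \dots, k,\ j = 1, 2, \dots, \ell\}$ is a grid on $R$. Let $\varepsilon > 0$, choose $\mathcal{G}$ so that
$$U(f, \mathcal{G}) - \varepsilon < (U)\iint_R f\,dA, \tag{23}$$
and set
$$M_{ij} = \sup_{(x, y) \in R_{ij}} f(x, y). \tag{24}$$
Since
$$(U)\int_a^b \phi(x)\,dx = \sum_{i=1}^{k} (U)\int_{x_{i-1}}^{x_i} \phi(x)\,dx$$
and
$$(U)\int_a^b (\phi(x) + \psi(x))\,dx \le (U)\int_a^b \phi(x)\,dx + (U)\int_a^b \psi(x)\,dx$$
for any bounded functions $\phi$ and $\psi$ defined on $[a, b]$ (see Exercise 7, p. 116), we can write
$$(U)\int_a^b \left( \int_c^d f(x, y)\,dy \right) dx = \sum_{i=1}^{k} (U)\int_{x_{i-1}}^{x_i} \left( \sum_{j=1}^{\ell} \int_{y_{j-1}}^{y_j} f(x, y)\,dy \right) dx \le \sum_{i=1}^{k} \sum_{j=1}^{\ell} (U)\int_{x_{i-1}}^{x_i} \left( \int_{y_{j-1}}^{y_j} f(x, y)\,dy \right) dx \le \sum_{i=1}^{k} \sum_{j=1}^{\ell} M_{ij} (x_i - x_{i-1})(y_j - y_{j-1}) = U(f, \mathcal{G}).$$
It follows from (23) that
$$(U)\int_a^b \left( \int_c^d f(x, y)\,dy \right) dx < (U)\iint_R f\,dA + \varepsilon.$$
Taking the limit of this inequality as $\varepsilon \to 0$, we obtain
$$(U)\int_a^b \left( \int_c^d f(x, y)\,dy \right) dx \le (U)\iint_R f\,dA.$$
Similarly,
$$(L)\int_a^b \left( \int_c^d f(x, y)\,dy \right) dx \ge (L)\iint_R f\,dA. \ ∎$$
We are now prepared to show that under reasonable conditions, a double integral over a rectangle reduces to an iterated integral.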
12.31 THEOREM [FUBINI'S THEOREM]. Let $R = [a, b] \times [c, d]$ be a two-dimensional rectangle and let $f : R \to \mathbf{R}$. Suppose that $f(x, \cdot)$ is integrable on $[c, d]$ for each $x \in [a, b]$, that $f(\cdot, y)$ is integrable on $[a, b]$ for each $y \in [c, d]$, and that $f$ is integrable on $R$ (as a function of two variables). Then
$$\iint_R f\,dA = \int_a^b \int_c^d f(x, y)\,dy\,dx = \int_c^d \int_a^b f(x, y)\,dx\,dy. \tag{25}$$
NOTE: These hypotheses hold if $f$ is continuous on the rectangle $[a, b] \times [c, d]$.
PROOF. For each $x \in [a, b]$, set $g(x) = \int_c^d f(x, y)\,dy$. Since $f$ is integrable on $R$, Lemma 12.30 implies
$$\iint_R f\,dA = (U)\int_a^b g(x)\,dx = (L)\int_a^b g(x)\,dx.$$
Hence, $g$ is integrable on $[a, b]$ and the first identity in (25) holds. Reversing the roles of $x$ and $y$, we obtain
$$\iint_R f\,dA = \int_c^d \int_a^b f(x, y)\,dx\,dy.$$
Hence, the second identity in (25) holds. ∎
The second identity in Fubini's Theorem is as important as the first. It tells us that, under certain conditions, the order of integration in an iterated integral can be reversed. Frequently, one of these iterated integrals is easier to evaluate than the other.
12.32 Example. Find
$$\int_0^1 \int_0^1 y^3 e^{xy^2}\,dy\,dx.$$
SOLUTION. This iterated integral looks tough to integrate. However, if we change the order of integration, using Fubini's Theorem, we obtain
$$\int_0^1 \int_0^1 y^3 e^{xy^2}\,dx\,dy = \int_0^1 y(e^{y^2} - 1)\,dy = \frac{e - 2}{2}. \ ∎$$
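The value in this example can be checked numerically in both orders of integration. The following is a rough sketch, not a method from the text: it assumes a plain midpoint rule with $n$ subintervals per variable, and the helper name `midpoint` is hypothetical.

```python
import math

def midpoint(g, a, b, n):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def f(x, y):
    return y ** 3 * math.exp(x * y * y)

n = 200
# inner integral in y, outer integral in x (the order in which the example is posed)
dy_dx = midpoint(lambda x: midpoint(lambda y: f(x, y), 0.0, 1.0, n), 0.0, 1.0, n)
# inner integral in x, outer integral in y (the order used in the solution)
dx_dy = midpoint(lambda y: midpoint(lambda x: f(x, y), 0.0, 1.0, n), 0.0, 1.0, n)
exact = (math.e - 2) / 2

print(dy_dx, dx_dy, exact)   # the three numbers should agree to several digits
```

Both approximations agree with $(e - 2)/2$ up to discretization error, as Fubini's Theorem predicts for this continuous integrand.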
The following three remarks show that the hypotheses of Fubini's Theorem cannot be relaxed.
12.33 Remark. There exists a function $f : \mathbf{R}^2 \to \mathbf{R}$ such that $f(x, \cdot)$ and $f(\cdot, y)$ are both integrable on $[0, 1]$, but the iterated integrals are not equal.
PROOF. Set
$$f(x, y) = \begin{cases} 2^{2n} & (x, y) \in [2^{-n}, 2^{-n+1}) \times [2^{-n}, 2^{-n+1}),\ n \in \mathbf{N}, \\ -2^{2n+1} & (x, y) \in [2^{-n-1}, 2^{-n}) \times [2^{-n}, 2^{-n+1}),\ n \in \mathbf{N}, \\ 0 & \text{otherwise.} \end{cases}$$
Notice that for each fixed $y_0 \in [0, 1)$, $f(x, y_0)$ takes on only two nonzero values and is integrable on $[0, 1)$ in $x$. For example, if $y_0 \in [2^{-n}, 2^{-n+1})$, then $f(x, y_0) = 2^{2n}$ for $x \in [2^{-n}, 2^{-n+1})$, and $f(x, y_0) = -2^{2n+1}$ for $x \in [2^{-n-1}, 2^{-n})$; hence, $f(x, y_0)$ is bounded on $[0, 1)$, and
$$\int_0^1 f(x, y_0)\,dx = \int_{2^{-n}}^{2^{-n+1}} 2^{2n}\,dx - \int_{2^{-n-1}}^{2^{-n}} 2^{2n+1}\,dx = 2^n - 2^n = 0. \tag{26}$$
The same is true for $f(x_0, y)$ when $x_0 \in [0, 1/2)$, but when $x_0 \in [1/2, 1)$, $f(x_0, y)$ takes on only one nonzero value, namely, $f(x_0, y) = 4$ when $y \in [1/2, 1)$, and equals zero otherwise. It follows that
$$\int_0^1 \int_0^1 f(x, y)\,dy\,dx = \int_{1/2}^1 \int_{1/2}^1 4\,dy\,dx = 1.$$
On the other hand, by (26) we have
$$\int_0^1 \int_0^1 f(x, y)\,dx\,dy = 0.$$
Thus the iterated integrals of $f$ are not equal. (Of course, by Fubini's Theorem, $f$ itself cannot be Riemann integrable on $[0, 1) \times [0, 1)$. In fact, $f$ is not even bounded.) ∎
Thus the rightmost equality of (25) need not hold when the double integral of $f$ does not exist.
12.34 Remark. There exists a bounded function $f : \mathbf{R}^2 \to \mathbf{R}$ such that $f(x, \cdot)$ and $f(\cdot, y)$ are both integrable on $[0, 1]$, but $f$ is not integrable on $[0, 1] \times [0, 1]$.
PROOF.
Set
$$f(x, y) = \begin{cases} 1 & (x, y) = \left( \dfrac{p}{2^n}, \dfrac{q}{2^n} \right),\ 0 < p, q < 2^n, \\ 0 & \text{otherwise.} \end{cases}$$
Notice that if $x_0 = p/2^n$, then $f(x_0, y) = 1$ only when $y = q/2^n$ for some $q = 1, 2, \dots, 2^n - 1$. Hence, for each fixed $x_0 \in [0, 1]$, $f(x_0, y) = 0$ except for finitely many $y$'s. It follows from Exercise 6, p. 116, that
$$\int_0^1 f(x, y)\,dy = 0$$
for all $x \in [0, 1]$. A similar statement holds for the $dx$ integral. Consequently,
$$\int_0^1 \int_0^1 f(x, y)\,dy\,dx = \int_0^1 \int_0^1 f(x, y)\,dx\,dy = 0.$$
To see that the double integral of $f$ does not exist, let $R_j := [a, b] \times [c, d]$ be a nondegenerate rectangle in $[0, 1] \times [0, 1]$. It is easy to verify that $[a, b]$ and $[c, d]$ both contain irrational points, and points of the form $p/2^n$ (just use density of irrationals, and repeat the proof of Theorem 1.24 with $2^n$ in place of $n$). Thus if $\mathcal{G} = \{R_j\}$ is a grid on $[0, 1] \times [0, 1]$, then $M_j(f) = 1$ and $m_j(f) = 0$ for all $j$, and $U(f, \mathcal{G}) - L(f, \mathcal{G}) = 1 - 0 = 1$. Hence, $f$ is not integrable on $[0, 1] \times [0, 1]$. ∎
Thus we cannot be sure a function of several variables is integrable just because its iterated integrals exist and are equal. (See also Exercises 5 and 9.)
The next result is starred because it uses Lebesgue's characterization of Riemann integrability (see Theorems 9.49 and 12.29i).
*12.35 Remark. There exists a function $f : \mathbf{R}^2 \to \mathbf{R}$ such that $f$ is integrable on $[0, 1] \times [0, 1]$, $f(\cdot, y)$ is integrable on $[0, 1]$ for all $y \in [0, 1]$, but $f(x, \cdot)$ is not integrable on $[0, 1]$ for infinitely many $x \in [0, 1]$.
PROOF. Let
$$f(x, y) = \begin{cases} 0 & \text{when } x = 0 \text{ or when } x \text{ or } y \text{ is irrational,} \\ 1/q & \text{when } x, y \in \mathbf{Q} \text{ and } x = p/q \text{ is in reduced form.} \end{cases}$$
By the argument of Example 3.33, the function $f$ is continuous and zero on the set $([0, 1] \setminus \mathbf{Q}) \times [0, 1]$. Hence, by Lebesgue's Theorem, $f$ is integrable on the square $R = [0, 1] \times [0, 1]$. By computing its lower sums, it follows that $\iint_R f\,dA = 0$. Similarly, for each $y \in [0, 1]$, $f(\cdot, y)$ is integrable on $[0, 1]$ with $\int_0^1 f(x, y)\,dx = 0$. Thus
$$\iint_R f\,dA = \int_0^1 \int_0^1 f(x, y)\,dx\,dy = 0.$$
On the other hand, since for each nonzero $x \in \mathbf{Q}$ the function $f(x, \cdot)$ is nowhere continuous, it cannot be integrable on $[0, 1]$. Therefore, the other iterated integral in Fubini's Theorem does not exist. ∎
Fubini's Theorem shows us how to evaluate a double integral over a rectangle by means of iterated integrals. The following result shows that the integral of a continuous function over a rectangle in $\mathbf{R}^n$ can be evaluated using $n$ partial integrals.
12.36 Lemma. Let $R = [a_1, b_1] \times \cdots \times [a_n, b_n]$ be an $n$-dimensional rectangle and let $f : R \to \mathbf{R}$ be integrable on $R$. If, for each $\mathbf{x} := (x_1, \dots, x_{n-1}) \in R_n := [a_1, b_1] \times \cdots \times [a_{n-1}, b_{n-1}]$, the function $f(\mathbf{x}, \cdot)$ is integrable on $[a_n, b_n]$, then
$$\int_{a_n}^{b_n} f(\mathbf{x}, t)\,dt$$
is integrable on $R_n$, and
$$\int_R f(\mathbf{x}, t)\,d(\mathbf{x}, t) = \int_{R_n} \int_{a_n}^{b_n} f(\mathbf{x}, t)\,dt\,d\mathbf{x}. \tag{27}$$
PROOF. By repeating the argument of Lemma 12.30, we have
$$(L)\int_R f(\mathbf{x}, t)\,d(\mathbf{x}, t) \le (L)\int_{R_n} \int_{a_n}^{b_n} f(\mathbf{x}, t)\,dt\,d\mathbf{x} \le (U)\int_{R_n} \int_{a_n}^{b_n} f(\mathbf{x}, t)\,dt\,d\mathbf{x} \le (U)\int_R f(\mathbf{x}, t)\,d(\mathbf{x}, t)$$
for any bounded $f$. Since $f$ is integrable on $R$, it follows that (27) holds. ∎
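Applying (27) repeatedly peels off one coordinate at a time, and this recursion is easy to mirror in code. The sketch below is only an illustration: it assumes a midpoint rule with $n$ subintervals in each coordinate, and the name `iterated_integral` and the sample integrand are hypothetical.

```python
def iterated_integral(f, box, n=40):
    """Approximate the integral of f over the rectangle
    box = [(a1, b1), ..., (ak, bk)] by integrating one coordinate at a
    time (last coordinate innermost) with the midpoint rule."""
    def integrate(prefix, remaining):
        if not remaining:
            return f(*prefix)              # all coordinates fixed: evaluate f
        (a, b), rest = remaining[0], remaining[1:]
        h = (b - a) / n
        # fix the next coordinate at each midpoint and integrate the rest
        return h * sum(integrate(prefix + (a + (k + 0.5) * h,), rest)
                       for k in range(n))
    return integrate((), tuple(box))

# sample integrand on [0, 1] x [0, 2] x [0, 3]; the exact value is
# (1/2)(8/3)(3) + (1)(2)(9/2) = 13
value = iterated_integral(lambda x, y, z: x * y * y + z,
                          [(0.0, 1.0), (0.0, 2.0), (0.0, 3.0)])
print(value)
```

The recursion depth equals the dimension, matching the $n$ partial integrals mentioned before the lemma.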
Using this result in conjunction with Theorem 12.20, we can evaluate integrals over a large collection of nonrectangular Jordan regions. To this end, we shall call a nonempty set $E \subset \mathbf{R}^n$ a projectable region if and only if there is a closed Jordan region $H \subset \mathbf{R}^{n-1}$, an index $j \in \{1, \dots, n\}$, and continuous functions $\phi, \psi : H \to \mathbf{R}$ such that
$$E = \{(x_1, \dots, x_n) \in \mathbf{R}^n : (x_1, \dots, \widehat{x}_j, \dots, x_n) \in H \text{ and } \phi(x_1, \dots, \widehat{x}_j, \dots, x_n) \le x_j \le \psi(x_1, \dots, \widehat{x}_j, \dots, x_n)\}.$$
(The notation $\widehat{x}_j$ means that the variable $x_j$ is missing; hence, $(x_1, \dots, \widehat{x}_j, \dots, x_n)$ is a point in $\mathbf{R}^{n-1}$.) In this case, we say that $E$ is generated by $j$, $H$, $\phi$, and $\psi$.
We are more specific for regions in $\mathbf{R}^2$ and $\mathbf{R}^3$. A set $E \subset \mathbf{R}^2$ is called a region of type I if and only if
$$E = \{(x, y) : x \in [a, b],\ \phi(x) \le y \le \psi(x)\}$$
and a region of type II if and only if
$$E = \{(x, y) : y \in [a, b],\ \phi(y) \le x \le \psi(y)\},$$
where $\phi, \psi : [a, b] \to \mathbf{R}$ are continuous functions. Similarly, a set $E \subset \mathbf{R}^3$ is called a region of type I if and only if
$$E = \{(x, y, z) : (x, y) \in H,\ \phi(x, y) \le z \le \psi(x, y)\},$$
a region of type II if and only if
$$E = \{(x, y, z) : (x, z) \in H,\ \phi(x, z) \le y \le \psi(x, z)\},$$
and a region of type III if and only if
$$E = \{(x, y, z) : (y, z) \in H,\ \phi(y, z) \le x \le \psi(y, z)\},$$
where $\phi, \psi : H \to \mathbf{R}$ are continuous functions and $H$ is a closed Jordan region in $\mathbf{R}^2$.
Figure 12.6 (the curves $y = x^2$ and $y = x$)
12.37 Example. Prove that the set $E$ in $\mathbf{R}^2$ bounded by $y = x$ and $y = x^2$ is a region of types I and II.
PROOF. The set $E$ can be described by
$$\{(x, y) : x^2 \le y \le x,\ x \in [0, 1]\}$$
or
$$\{(x, y) : y \le x \le \sqrt{y},\ y \in [0, 1]\}$$
(see Figure 12.6). ∎
12.38 Example. Prove that the set $E$ of points $(x, y, z)$ that satisfy $4x^2 + y^2 + z^2 \le 1$ is a region of types I, II, and III.
PROOF. The set $E$, an ellipsoid, can be described by
$$E = \{(x, y, z) : -\sqrt{1 - 4x^2 - y^2} \le z \le \sqrt{1 - 4x^2 - y^2},\ (x, y) \in H\},$$
where $H = \{(x, y) : 4x^2 + y^2 \le 1\}$. A similar argument shows that $E$ is of types II and III. ∎
Figure 12.7
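Before we show how to evaluate multiple integrals over projectable regions, we introduce additional terminology. For each $k = 1, \dots, n$ the set
$$\Pi_k := \{(x_1, \dots, x_n) \in \mathbf{R}^n : x_k = 0\}$$
will be called a coordinate hyperplane. Given a set $E \subseteq \mathbf{R}^n$, the projection of $E$ onto the coordinate hyperplane $\Pi_k$ is the set $E_k$ of points $(x_1, \dots, x_{k-1}, 0, x_{k+1}, \dots, x_n)$ such that $(x_1, \dots, x_k, \dots, x_n) \in E$ for some $x_k \in \mathbf{R}$. For example, in $\mathbf{R}^3$ the coordinate hyperplane $\Pi_1$ corresponds to the $yz$ plane, and the projection of the three-dimensional ball $B_r(x_0, y_0, z_0)$ onto $\Pi_1$ is essentially the two-dimensional ball $B_r(y_0, z_0)$ (see Figure 12.7).
The following result shows that multiple integrals over most projectable regions can be evaluated using iterated integrals.
12.39 THEOREM. Let $E$ be a projectable region in $\mathbf{R}^n$ generated by $j$, $H$, $\phi$, and $\psi$. Then $E$ is a Jordan region in $\mathbf{R}^n$. Moreover, if $f : E \to \mathbf{R}$ is continuous on $E$, then
$$\int_E f(\mathbf{x})\,d\mathbf{x} = \int_H \left( \int_{\phi(x_1, \dots, \widehat{x}_j, \dots, x_n)}^{\psi(x_1, \dots, \widehat{x}_j, \dots, x_n)} f(x_1, \dots, x_n)\,dx_j \right) d(x_1, \dots, \widehat{x}_j, \dots, x_n). \tag{28}$$
PROOF. By symmetry, we may suppose that $j = n$. Thus
$$E = \{(\mathbf{x}, t) : \mathbf{x} \in H \text{ and } \phi(\mathbf{x}) \le t \le \psi(\mathbf{x})\}.$$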
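To show that $E$ is a Jordan region, we must show that the volume of $\partial E$ is zero. Now $\partial E$ is made up of "lower-dimensional pieces," a bottom $B = \{(\mathbf{x}, t) : \mathbf{x} \in H \text{ and } t = \phi(\mathbf{x})\}$, a top $T = \{(\mathbf{x}, t) : \mathbf{x} \in H \text{ and } t = \psi(\mathbf{x})\}$, and a side $S = \{(\mathbf{x}, t) : \mathbf{x} \in \partial H \text{ and } \phi(\mathbf{x}) \le t \le \psi(\mathbf{x})\}$. (Figure 12.8 illustrates the situation for the case $n = 3$.) Hence, we must show that $B$, $T$, and $S$ are of volume zero.
Figure 12.8 (the top surface is labeled $z = \psi(x, y)$)
To estimate the volume of $B$, notice that since $H$ is compact, $\phi$ is uniformly continuous on $H$. Thus, given $\varepsilon > 0$, there is a $\delta > 0$ such that
$$\mathbf{x}, \mathbf{y} \in H \ \text{ and } \ \|\mathbf{x} - \mathbf{y}\| < \delta \quad\text{imply}\quad |\phi(\mathbf{x}) - \phi(\mathbf{y})| < \varepsilon. \tag{29}$$
Since $H$ is bounded, $H$ is contained in some $(n - 1)$-dimensional cube $Q$. Divide $Q$ into subcubes $Q_1, \dots, Q_p$ such that $\mathbf{x}, \mathbf{y} \in Q_k$ implies $\|\mathbf{x} - \mathbf{y}\| < \delta$, and let $R_k = Q_k \times [\phi(\mathbf{a}_k) - 2\varepsilon, \phi(\mathbf{a}_k) + 2\varepsilon]$ for some $\mathbf{a}_k \in Q_k$, $k = 1, 2, \dots, p$. Then $\mathcal{G} := \{R_1, \dots, R_p\}$ is a grid in $\mathbf{R}^n$, and by (29),
$$V(B; \mathcal{G}) \le \sum_{k=1}^{p} |R_k| = 4\varepsilon \sum_{k=1}^{p} |Q_k| = 4\varepsilon |Q|.$$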
It follows from Remark 12.6i that B is of volume zero. A similar argument shows that T is of volume zero. To estimate the volume of S, set M
= sup1j;(x) and m = inf ¢(x). zEH
zEH
Since H is a Jordan region, choose a grid {Ql, ... , Qp} in R n-l such that
Set Rk = Qk x [m, M] and observe that
g := {R 1 , ... , Rp} is a grid in R n , and
p
V(S; g) :::;
L k=l
IRkl < (M - m)€.
Hence it follows from Remark 12.6i that $S$ is of volume zero. We conclude that $\partial E$ is of volume zero; i.e., $E$ is a Jordan region.
To prove (28), let $R = [a_1, b_1] \times \cdots \times [a_n, b_n]$ be an $n$-dimensional rectangle that contains $E$, and define $g$ on $R$ by $g(\mathbf{x}, t) = f(\mathbf{x}, t)$ when $(\mathbf{x}, t) \in E$, and $g(\mathbf{x}, t) = 0$ otherwise. By Theorem 12.20 and Lemma 12.36,
But for each $\mathbf{x} = (x_1, \dots, x_{n-1}) \in H$, we have
$$g(\mathbf{x}, t) = \begin{cases} f(\mathbf{x}, t) & \phi(\mathbf{x}) \le t \le \psi(\mathbf{x}) \\ 0 & \text{otherwise.} \end{cases}$$
Therefore,
$$\int_{a_n}^{b_n} g(\mathbf{x}, t)\,dt = \int_{\phi(\mathbf{x})}^{\psi(\mathbf{x})} f(\mathbf{x}, t)\,dt. \ ∎$$
Figure 12.9
Although we have stated Theorem 12.39 for continuous $f$, the result is evidently true whenever Lemma 12.36 applies, e.g., if $f$ is integrable on $E$ and $f(\mathbf{x}, \cdot)$ is integrable on $[a_n, b_n]$ for each fixed $\mathbf{x} \in H$.
If the set $H$ is itself projectable, then Theorem 12.39 can be applied again to $H$. Thus if $E$ is nice enough, an integral over $E$ can be evaluated using $n$ partial integrals. We close this section with several examples that illustrate this principle for the cases $n = 2$ and $n = 3$.
12.40 Example. Find the integral of $f(x, y, z) = x$ over the region $E$ bounded by $z = 1 - x - y$, $x = 0$, $y = 0$, and $z = 0$.
SOLUTION. The surfaces $z = 0$ and $z = 1 - x - y$ intersect when $y = 1 - x$. The projection $E_3$ is bounded by the curves $x = 0$, $y = 0$, and $y = 1 - x$. These last two curves intersect when $x = 1$. Thus $E$ is a region of type I:
$$E = \{(x, y, z) : 0 \le x \le 1,\ 0 \le y \le 1 - x,\ 0 \le z \le 1 - x - y\}$$
(see Figure 12.9). It follows that
$$\iiint_E f\,dV = \int_0^1 \int_0^{1-x} \int_0^{1-x-y} x\,dz\,dy\,dx = \int_0^1 \int_0^{1-x} (x - x^2 - xy)\,dy\,dx = \frac{1}{2} \int_0^1 (x - 2x^2 + x^3)\,dx = \frac{1}{24}. \ ∎$$
Figure 12.10
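The value $1/24$ obtained in Example 12.40 can be confirmed by evaluating the same iterated integral numerically, with limits that depend on the outer variables. This is a minimal sketch assuming a midpoint rule with $n$ subintervals per variable; the names `midpoint` and `tetrahedron_integral` are hypothetical.

```python
def midpoint(g, a, b, n):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def tetrahedron_integral(n=60):
    """Iterated integral of f(x, y, z) = x over
    0 <= x <= 1, 0 <= y <= 1 - x, 0 <= z <= 1 - x - y (dz, then dy, then dx)."""
    return midpoint(
        lambda x: midpoint(
            lambda y: midpoint(lambda z: x, 0.0, 1.0 - x - y, n),
            0.0, 1.0 - x, n),
        0.0, 1.0, n)

print(tetrahedron_integral(), 1 / 24)
```

The nesting of the lambdas follows the order $dz\,dy\,dx$ of the type I description of $E$.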
12.41 Example. Find the integral of $f(x, y, z) = x^2$ over the region $E$ bounded by $|x| = 1$, $z = x^2 - y^2$, where $z \ge 0$.

SOLUTION. The surfaces $z = 0$ and $z = x^2 - y^2$ intersect when $x^2 - y^2 = 0$; i.e., $y = \pm x$. The curves $y = \pm x$ and $|x| = 1$ intersect when $x = \pm 1$. Thus the region $E$ is of type I:
$$E = \{(x, y, z) : -1 \le x \le 1,\ -|x| \le y \le |x|,\ 0 \le z \le x^2 - y^2\}$$
(see Figure 12.10). It follows that
[Figure 12.11]
Although Theorem 12.39 can be used in conjunction with Theorem 12.23 to handle the case when $E$ is a finite union of projectable subregions, we can sometimes avoid breaking $E$ into subregions by changing our point of view. Here is a typical example.

12.42 Example. Find the integral of $f(x, y, z) = x - z$ over the region bounded by $z = y^2$, $z = 1$, $z = x$, and $x = 0$.

SOLUTION. The region $E$ is a union of two regions of type I (see Figure 12.11, where the "back" of $E$ is that portion of the plane $x = 0$ which is bounded by the parabola $z = y^2$, $x = 0$, here represented by a dashed line). Therefore, we must use two integrals if we integrate $dz$ first: the integral where $z$ varies between $y^2$ and 1, and the integral where $z$ varies from $x$ to 1. This is complicated to set up. The solution is simpler if we integrate $dx$ first. Indeed, $E$ is a single region of type III since $E = \{(x, y, z) : -1 \le y \le 1,\ y^2 \le z \le 1,\ 0 \le x \le z\}$.
Thus,
$$\iiint_E f\,dV = \int_{-1}^{1} \int_{y^2}^{1} \int_0^{z} (x - z)\,dx\,dz\,dy = -\frac{1}{2}\int_{-1}^{1}\int_{y^2}^{1} z^2\,dz\,dy = \frac{1}{6}\int_{-1}^{1} (y^6 - 1)\,dy = -\frac{2}{7}.$$
∎
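The point of Example 12.42, that the order of integration changes the bookkeeping but not the value, can be illustrated numerically. In the sketch below (helper names are illustrative, not from the text), `dx_first` uses the single type III description, while `dz_first` projects $E$ onto the $xy$ plane and lets $z$ run from $\max(x, y^2)$ to 1, which encodes the two type I pieces in one formula.

```python
def midpoint(g, lo, hi, n=60):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def f(x, y, z):
    return x - z


def dx_first():
    # Type III description: -1 <= y <= 1, y^2 <= z <= 1, 0 <= x <= z.
    return midpoint(
        lambda y: midpoint(
            lambda z: midpoint(lambda x: f(x, y, z), 0.0, z),
            y * y, 1.0),
        -1.0, 1.0)


def dz_first():
    # Projection onto the xy plane: 0 <= x <= 1, -1 <= y <= 1,
    # with z running from max(x, y^2) up to 1.
    return midpoint(
        lambda x: midpoint(
            lambda y: midpoint(lambda z: f(x, y, z), max(x, y * y), 1.0),
            -1.0, 1.0),
        0.0, 1.0)


value_dx_first = dx_first()
value_dz_first = dz_first()
print(value_dx_first, value_dz_first, -2 / 7)
```

Both approximations should land close to $-2/7$; the `max` in `dz_first` is exactly the case split that makes the $dz$-first setup awkward by hand.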
EXERCISES 1. Evaluate each of the following iterated integrals.
(a) 1
1
1\x2+ Y )dxdY.
(b)
1111
r/ r/ 2
Yxy+xdxdy. (c)
Jo Jo
2
ycos(xy) dydx.
2. Evaluate each of the following iterated integrals. Write each as an integral over a region E, and sketch E in each case.
[11x2+l
(a) Jo
x
(x+y)dydx.
(d)
[lJl 11
Jo 3. For each of the following, evaluate
vy
..jx3+zdzdxdy.
x3
JE f(x) dx.
f(x,y) = xVfj and E is bounded by y = x and y = x 2. f(x, y) = x + y and E is the triangle with vertices (0,0), (0,1), (2,0). f(x, y) = x and E is bounded by y = x = -Vfj, and y = 4. f(x, y, z) = x and E is the set of points (x, y, z) such that 0 :::; z :::; 1 - x 2, o :::; Y :::; x 2 + z2, and x 2: o. 4. Compute the volume of each of the following regions. (a) E is bounded by the surfaces x + y + z = 3, z = 0, and x 2 + y2 = 1. (b) E lies under the plane z = x + y and over the region in the xy plane bounded by the curves x = x = 2Vfj, x + y = 3. (c) E is bounded by z = y2, X = y2 + Z2, X = 0, z = 1. (d) E is bounded by y = x 3, X = z2, Z = x 2, and y = o. (a) (b) (c) (d)
,;x,
JY72,
5. (a) Verify that the hypotheses of Fubini's Theorem hold when $f$ is continuous on $R$.
(b) Modify the proof of Remark 12.33 to show that Fubini's Theorem might not hold for a nonintegrable $f$, even if $f(x, y)$ is continuous in each variable separately; i.e., if $f(x, \cdot)$ is continuous for each $x \in [a, b]$ and $f(\cdot, y)$ is continuous for each $y \in [c, d]$.

6. (a) Suppose that $f_k$ is integrable on $[a_k, b_k]$ for $k = 1, \dots, n$, and set $R = [a_1, b_1] \times \cdots \times [a_n, b_n]$. Prove that

(b) If $Q = [0, 1]^n$ and $y := (1, 1, \dots, 1)$, prove that
7. The greatest integer in a real number $x$ is the integer $[x] := n$ that satisfies $n \le x < n + 1$. An interval $[a, b]$ is called Z-asymmetric if $b + a \ne [b] + [a] + 1$.
(a) Suppose that $R$ is a two-dimensional Z-asymmetric rectangle; i.e., both of its sides are Z-asymmetric. If $\psi(x, y) := (x - [x] - 1/2)(y - [y] - 1/2)$, prove that $\iint_R \psi\,dA = 0$ if and only if at least one side of $R$ has integer length.
(b) Suppose that $R$ is tiled by rectangles $R_1, \dots, R_N$; i.e., the $R_j$'s are Z-asymmetric, nonoverlapping, and $R = \bigcup_{j=1}^{N} R_j$. Prove that if each $R_j$ has at least one side of integer length and $R$ is Z-asymmetric, then $R$ has at least one side of integer length.

8. Let $E$ be a nonempty Jordan region in $\mathbf{R}^2$ and $f : E \to [0, \infty)$ be integrable on $E$. Prove that the volume of $Q = \{(x, y, z) : (x, y) \in E,\ 0 \le z \le f(x, y)\}$ (as given by Definition 12.5) satisfies
$$\operatorname{Vol}(Q) = \iint_E f\,dA.$$
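9. Let $R = [a, b] \times [c, d]$ be a two-dimensional rectangle and $f : R \to \mathbf{R}$ be bounded.
(a) Prove that
$$(L)\iint_R f\,dA \le (L)\int_a^b \left( (X)\int_c^d f(x, y)\,dy \right) dx \le (U)\int_a^b \left( (X)\int_c^d f(x, y)\,dy \right) dx \le (U)\iint_R f\,dA$$
for $X = U$ or $X = L$.
(b) Prove that if $f$ is integrable on $R$, then
$$\iint_R f\,dA = \int_a^b \left( (L)\int_c^d f(x, y)\,dy \right) dx = \int_a^b \left( (U)\int_c^d f(x, y)\,dy \right) dx.$$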
(c) Compute the two iterated integrals in part (b) for
$$f(x, y) = \begin{cases} 1 & y \in \mathbf{Q} \\ 0 & y \notin \mathbf{Q} \end{cases}$$
and $R = [0, 1] \times [0, 1]$. Prove that $f$ is not integrable on $R$.

*10. [FUBINI'S THEOREM FOR IMPROPER INTEGRALS]. If $a < b$ are extended real numbers, $c < d$ are finite real numbers, $f : (a, b) \times [c, d] \to \mathbf{R}$ is continuous, and
$$F(y) = \int_a^b f(x, y)\,dx$$
converges uniformly on $[c, d]$, prove that
$$\int_c^d f(x, y)\,dy$$
is improperly integrable on $(a, b)$ and
$$\int_c^d \int_a^b f(x, y)\,dx\,dy = \int_a^b \int_c^d f(x, y)\,dy\,dx.$$
12.4 CHANGE-OF-VARIABLES

Recall (Exercise 11, p. 136) that if $\phi : [a, b] \to \mathbf{R}$ is continuously differentiable and $\phi' \ne 0$ on $[a, b]$, then
$$\int_{\phi([a,b])} f(t)\,dt = \int_{[a,b]} f(\phi(x))\,|\phi'(x)|\,dx$$
for all $f$ integrable on $\phi([a, b])$. We shall generalize this result to functions of several variables; namely, we shall identify conditions under which
$$\int_{\phi(E)} f(u)\,du = \int_E f(\phi(x))\,|\Delta_\phi(x)|\,dx \tag{30}$$
holds. (At this point you may wish to read the discussion following the proof of Theorem 12.46 to see that $\Delta_\phi$ takes on a familiar form when $\phi$ is the change from polar to rectangular coordinates.)

It takes six or seven hypotheses to establish (30). These hypotheses fall into two categories.

1) Hypotheses made so that the change of variables is possible. Since the one-dimensional result required $\phi$ to be continuously differentiable and $\phi' \ne 0$ (which together imply that $\phi$ is 1-1), we expect hypotheses for (30) to be: $\phi$ is 1-1, continuously differentiable, and $\Delta_\phi \ne 0$.

2) Hypotheses made so that the integrals in (30) exist. There are four of these: $E$ is a Jordan region, $\phi(E)$ is a Jordan region, $f$ is integrable on $\phi(E)$, and $f \circ \phi\,|\Delta_\phi|$ is integrable on $E$.

In practice, only the first and third of the hypotheses in category 2) need be verified. Indeed, if $\phi$ satisfies all hypotheses in category 1) and $E$ is Jordan, then $\phi(E)$ is Jordan by Corollary 12.10iii, and, when $f$ is integrable on $\phi(E)$, $f \circ \phi\,|\Delta_\phi|$ is integrable on $E$ (see Theorem 12.29ii and Exercise 7a, p. 406). Moreover, the remaining hypotheses in category 2) can usually be verified by inspection. The reason for this is twofold. Most functions met in practice are continuous (or nearly so), hence integrable, and $E$ is frequently projectable, hence a Jordan region. Therefore, the crucial hypotheses for (30) are those in category 1), namely, that $\phi$ be 1-1, continuously differentiable, and $\Delta_\phi \ne 0$.

To give an outline of a proof of (30), we introduce the following terminology. A function $f$ is said to satisfy a certain property $P$ "locally" on a set $E$ if and only if given $a \in E$ there is an open set $W$ containing $a$ such that $f$ satisfies $P$ on $W \cap E$. A function $f$ is said to satisfy the property $P$ "globally" on $E$ if and only if $f$ satisfies $P$ for all points in $E$.
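The role of the absolute value in the one-dimensional formula recalled at the start of this section can be seen on a decreasing substitution. The sketch below is illustrative only: the choices $\phi(x) = 1/x$ on $[1, 2]$ and $f(t) = t^2$ are assumptions made for the example, not taken from the text. Here $\phi([1, 2]) = [1/2, 1]$ and both sides equal $7/24$, while dropping the absolute value flips the sign.

```python
def midpoint(g, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def phi(x):
    # Decreasing, continuously differentiable, phi' != 0 on [1, 2].
    return 1.0 / x


def dphi(x):
    return -1.0 / (x * x)


def f(t):
    return t * t


# Left side: integral of f over the image interval phi([1, 2]) = [1/2, 1].
lhs = midpoint(f, 0.5, 1.0)
# Right side: integral of f(phi(x)) |phi'(x)| over [1, 2].
rhs = midpoint(lambda x: f(phi(x)) * abs(dphi(x)), 1.0, 2.0)
# Without the absolute value, the sign is wrong for a decreasing phi.
signed = midpoint(lambda x: f(phi(x)) * dphi(x), 1.0, 2.0)
print(lhs, rhs, signed)
```

The same mechanism explains why (30) carries $|\Delta_\phi|$ rather than $\Delta_\phi$: the integral over the set $\phi(E)$ does not remember orientation.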
To prove (30), we first obtain several preliminary results which culminate in a "local" change-of-variables formula (see Lemma 12.45) and then use this to obtain a "global" change-of-variables formula for functions $\phi$ that are $C^1$ on an open set that contains $\overline{E}$ (see Theorem 12.46). Throughout this discussion, we assume that $\Delta_\phi$ is never zero. In Section 12.5, we work much harder to show that the condition "$\Delta_\phi \ne 0$" can be relaxed on a set of volume zero (see Theorem 12.65).

Since every Jordan region can be approximated by rectangles, and every integrable function is almost continuous, hence, locally nearly constant, we should consider (30) first in the case when $\phi(E)$ is a rectangle $R$ and $f$ is identically 1; i.e., we should prove that
$$|R| = \int_{\phi^{-1}(R)} |\Delta_\phi(x)|\,dx. \tag{31}$$
Our first preliminary result shows that this case is a step in the right direction.

12.43 Lemma. Let $W$ be open in $\mathbf{R}^n$, let $\phi : W \to \mathbf{R}^n$ be 1-1 and continuously differentiable on $W$ with $\Delta_\phi \ne 0$ on $W$, and suppose that $\phi^{-1}$ is continuously differentiable on $\phi(W)$ with $\Delta_{\phi^{-1}} \ne 0$ on $\phi(W)$. Suppose further that (31) holds for every $n$-dimensional rectangle $R \subset \phi(W)$. If $E$ is a Jordan region with $\overline{E} \subset W$, if $f$ is integrable on $\phi(E)$, and if $f \circ \phi$ is integrable on $E$, then
$$\int_{\phi(E)} f(u)\,du = \int_E (f \circ \phi)(x)\,|\Delta_\phi(x)|\,dx.$$
PROOF. We may suppose that $W$ is nonempty. Let $E$ be a fixed Jordan region that satisfies $\overline{E} \subset W$ and suppose that $f$ is integrable on $\phi(E)$. Set $f^+ = (|f| + f)/2$ and $f^- = (|f| - f)/2$. Then $f^+$ and $f^-$ are both nonnegative and integrable on $\phi(E)$, and $f = f^+ - f^-$ (see Exercise 7, p. 65, and Exercise 2, p. 125). Since the integral of a difference is the difference of the integrals, it suffices to prove the lemma for the case when $f \ge 0$.

Let $\varepsilon > 0$. Since $f$ is integrable on $\phi(E)$, choose a grid $\mathcal{G} = \{R_1, \dots, R_p\}$ such that
(32)
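where $M_j = \sup_{u \in R_j} f(u) := \sup_{x \in \phi^{-1}(R_j)} f(\phi(x))$. Moreover, since $\overline{\phi(E)} = \phi(\overline{E}) \subset \phi(W)$, we may suppose, by refining $\mathcal{G}$ if necessary, that $R_j \cap \phi(E) \ne \emptyset$ implies $R_j \subset \phi(W)$. Hence, by Corollary 12.10ii, $\{\phi^{-1}(R_j)\}_{R_j \cap \phi(E) \ne \emptyset}$ is a nonoverlapping collection of Jordan regions whose union satisfies
$$\Omega_1 := \bigcup_{R_j \cap \phi(E) \ne \emptyset} \phi^{-1}(R_j) \supseteq \phi^{-1}(\phi(E)) = E.$$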
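Hence, (32), (31), Theorem 12.25, and Theorem 12.23 imply
$$\begin{aligned}
\int_{\phi(E)} f(u)\,du &\ge \sum_{R_j \cap \phi(E) \ne \emptyset} M_j |R_j| - \varepsilon \\
&= \sum_{R_j \cap \phi(E) \ne \emptyset} M_j \int_{\phi^{-1}(R_j)} |\Delta_\phi(x)|\,dx - \varepsilon \\
&\ge \sum_{R_j \cap \phi(E) \ne \emptyset} \int_{\phi^{-1}(R_j)} f(\phi(x))\,|\Delta_\phi(x)|\,dx - \varepsilon \\
&= \int_{\Omega_1} f(\phi(x))\,|\Delta_\phi(x)|\,dx - \varepsilon \\
&\ge \int_E f(\phi(x))\,|\Delta_\phi(x)|\,dx - \varepsilon.
\end{aligned}$$
(For this last step, we used the fact that $f \ge 0$.) Since $\varepsilon > 0$ was arbitrary, we obtain
$$\int_{\phi(E)} f(u)\,du \ge \int_E f(\phi(x))\,|\Delta_\phi(x)|\,dx.$$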
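On the other hand, by Theorem 12.20 there is a grid $\mathcal{H} = \{Q_1, \dots, Q_p\}$ such that
$$\int_{\phi(E)} f(u)\,du \le \sum_{Q_j \subset (\phi(E))^o} m_j |Q_j| + \varepsilon,$$
where $m_j = \inf_{u \in Q_j} f(u) := \inf_{x \in \phi^{-1}(Q_j)} f(\phi(x))$. Repeating the steps above with
$$\Omega_2 := \bigcup_{Q_j \subset (\phi(E))^o} \phi^{-1}(Q_j) \subseteq \phi^{-1}(\phi(E)) = E$$
in place of $\Omega_1$, we see that
$$\begin{aligned}
\int_{\phi(E)} f(u)\,du &\le \sum_{Q_j \subset (\phi(E))^o} m_j |Q_j| + \varepsilon \\
&\le \int_{\Omega_2} f(\phi(x))\,|\Delta_\phi(x)|\,dx + \varepsilon \\
&\le \int_E f(\phi(x))\,|\Delta_\phi(x)|\,dx + \varepsilon.
\end{aligned}$$
We conclude that
$$\int_{\phi(E)} f(u)\,du = \int_E f(\phi(x))\,|\Delta_\phi(x)|\,dx.$$
∎

Next, we show that (31) holds locally near points $a$ when $\Delta_\phi(a) \ne 0$ and $\phi$ is 1-1 and $C^1$.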
12.44 Lemma. Let $V$ be open in $\mathbf{R}^n$ and $\phi : V \to \mathbf{R}^n$ be 1-1 and continuously differentiable on $V$. If $\Delta_\phi(a) \ne 0$ for some $a \in V$, then there exists an open rectangle $W$ such that $a \in W \subset V$, $\Delta_\phi$ is nonzero on $W$, $\phi^{-1}$ is $C^1$ and its Jacobian is nonzero on $\phi(W)$, and such that if $R$ is an $n$-dimensional rectangle contained in $\phi(W)$, then $\phi^{-1}(R)$ is Jordan and (31) holds.

PROOF. The proof is by induction on $n$. If $n = 1$ and $\phi'(a) \ne 0$, then $\phi'$ is nonzero on some open interval $I$ containing $a$. Hence, by Exercise 11, p. 136, (31) holds for "rectangles" (i.e., intervals) in $\phi(I)$.

Suppose that (31) holds on $\mathbf{R}^{n-1}$, for some $n > 1$. Let $\phi : V \to \mathbf{R}^n$ be 1-1 and $C^1$ on $V$ with $\Delta_\phi(a) \ne 0$. Since $\Delta_\phi(a) \ne 0$, we can use continuity of $\Delta_\phi$ and the Inverse Function Theorem to choose an open set $W \subset V$, containing $a$, such that $\phi$ is 1-1 and $\Delta_\phi \ne 0$ on $W$, and $\phi^{-1}$ is 1-1, $C^1$ and $\Delta_{\phi^{-1}} \ne 0$ on $\phi(W)$. By making $W$ smaller, if necessary, we may suppose that $W$ is an open rectangle; i.e., there exist open intervals $I_j$ such that $W = I_1 \times \cdots \times I_n$.

To apply the inductive hypothesis, we must break $\phi$ into "lower-dimensional" pieces. To this end, for each $x = (x_1, \dots, x_n) \in W$, set

Notice that $\phi = \sigma \circ \psi$, hence by the Chain Rule, $\Delta_\phi(x) = \Delta_\sigma(\psi(x))\,\Delta_\psi(x)$. In particular, by the choice of $W$,
$$\Delta_\psi(x) \ne 0 \quad \text{and} \quad \Delta_\sigma(\psi(x)) \ne 0 \quad \text{for all } x \in W. \tag{33}$$
To show the inductive hypothesis can be used on $\psi$, fix $t \in I_1$. Set $W_0 = I_2 \times \cdots \times I_n$ and $\phi^t(y) = (\phi_2(t, y), \dots, \phi_n(t, y))$ for each $y \in W_0$. Then $\phi^t : W_0 \to \mathbf{R}^{n-1}$ is 1-1 and $C^1$ on $W_0$, and, by (33), $\Delta_{\phi^t}(y) = \Delta_\psi(t, y) \ne 0$ for all $y \in W_0$. It follows from the inductive hypothesis that if $Q_0$ is an $(n - 1)$-dimensional rectangle that satisfies $Q_0 \subset \phi^t(W_0)$, then $(\phi^t)^{-1}(Q_0)$ is Jordan and
$$|Q_0| = \int_{(\phi^t)^{-1}(Q_0)} |\Delta_{\phi^t}(y)|\,dy. \tag{34}$$
($W_0$, hence $W$, may have gotten smaller again.) Let $Q = I_0 \times Q_0$ be any $n$-dimensional rectangle in $\psi(W)$ and integrate (34) with respect to $t$ over $I_0$ to verify

But the first component of $\psi$ satisfies $\psi_1(t, y) = t$ for all $y \in W$, so $\Delta_{\phi^t} = \Delta_\psi$ and $\psi^{-1}(Q)$ is the union of the "$t$-sections" $(\phi^t)^{-1}(Q_0)$ as $t$ ranges over $I_0$. Hence, we can continue the identity above as follows:

(35)
In particular, it follows from Lemma 12.43 that
$$\int_{\psi(E)} g(u)\,du = \int_E g(\psi(x))\,|\Delta_\psi(x)|\,dx \tag{36}$$
for all Jordan regions $E$ that satisfy $\overline{E} \subset W$, provided that $g$ is integrable on $\psi(W)$ and $g \circ \psi$ is integrable on $E$. Similarly, we can use the inductive hypothesis to prove that (31) holds for $\sigma$ in place of $\phi$ for all $n$-dimensional rectangles $R$ contained in $\phi(W)$. Hence, for each such rectangle $R$, we have by (36) (with $E = \psi^{-1}(\sigma^{-1}(R)) = \phi^{-1}(R)$ and $g = |\Delta_\sigma|$) and the Chain Rule that
$$|R| = \int_{\sigma^{-1}(R)} |\Delta_\sigma(u)|\,du = \int_{\psi^{-1}(\sigma^{-1}(R))} |\Delta_\sigma(\psi(x))|\,|\Delta_\psi(x)|\,dx = \int_{\phi^{-1}(R)} |\Delta_\phi(x)|\,dx.$$
∎
By combining Lemmas 12.43 and 12.44, we obtain the following local version of the change-of-variables formula we want.

12.45 Lemma. Suppose that $V$ is open in $\mathbf{R}^n$, $a \in V$, and $\phi : V \to \mathbf{R}^n$ is continuously differentiable on $V$. If $\Delta_\phi(a) \ne 0$, then there exists an open rectangle $W \subset V$ containing $a$ such that if $E$ is Jordan with $\overline{E} \subset W$, if $f \circ \phi$ is integrable on $E$, and if $f$ is integrable on $\phi(E)$, then
$$\int_{\phi(E)} f(u)\,du = \int_E f(\phi(x))\,|\Delta_\phi(x)|\,dx. \tag{37}$$
This local change-of-variables formula contains the following global result.

12.46 THEOREM. Suppose that $V$ is open in $\mathbf{R}^n$ and that $\phi : V \to \mathbf{R}^n$ is 1-1 and continuously differentiable on $V$. If $\Delta_\phi \ne 0$ on $V$, if $E$ is a Jordan region with $\overline{E} \subset V$, if $f \circ \phi$ is integrable on $E$, and if $f$ is integrable on $\phi(E)$, then
$$\int_{\phi(E)} f(u)\,du = \int_E f(\phi(x))\,|\Delta_\phi(x)|\,dx. \tag{38}$$
PROOF. Let $f : \phi(E) \to \mathbf{R}$ be integrable, and set $H := \overline{E}$. By Lemma 12.45, given $a \in H$ there is an open rectangle $W_a$ such that $a \in W_a \subset V$ and
$$\int_{\phi(E_i)} f(u)\,du = \int_{E_i} f(\phi(x))\,|\Delta_\phi(x)|\,dx \tag{39}$$
for every Jordan region $E_i$ that satisfies $\overline{E_i} \subset W_a$. Let $Q_a$ be an open rectangle that satisfies $a \in Q_a \subset \overline{Q_a} \subset W_a$. Then for each $a \in H$ there is an $r(a) > 0$ such that $B_{r(a)}(a) \subset Q_a$. Since the Jordan region $E$ is bounded, $H$ is compact by the Heine-Borel Theorem. Thus there exist $a_j$ such that $H$ is covered by $B_{r(a_j)}(a_j)$, $j = 1, 2, \dots, N$. Hence the open rectangles $Q_j := Q_{a_j}$ satisfy
$$H \subset \bigcup_{j=1}^{N} Q_j.$$

[Figure 12.12: polar coordinates, $(r, \theta) = (x, y)$]
Let $R$ be a large rectangle that contains $H$ and $\mathcal{G} = \{R_1, \dots, R_p\}$ be a grid on $R$ so fine that each rectangle in $\mathcal{G}$ which intersects $H$ is a subset of some $Q_j$. (This is possible since there are only finitely many $Q_j$'s; just use the endpoints of the $Q_j$'s to generate $\mathcal{G}$.) Let $E_i = R_i \cap E$. Then $\overline{E_i} \subseteq R_i \cap H \subseteq Q_j \subset W_{a_j}$ for some $j \in \{1, \dots, N\}$; i.e., (39) holds. Moreover, the collection $\{E_1, \dots, E_p\}$ is a nonoverlapping family of nonempty Jordan regions whose union is $E$; hence, by Theorem 1.43 and Corollary 12.10ii, the collection $\{\phi(E_i) : i = 1, \dots, p\}$ is a nonoverlapping family of nonempty Jordan regions whose union is $\phi(E)$. It follows from Theorem 12.23 and (39) that
$$\int_{\phi(E)} f(u)\,du = \sum_{i=1}^{p} \int_{\phi(E_i)} f(u)\,du = \sum_{i=1}^{p} \int_{E_i} f(\phi(x))\,|\Delta_\phi(x)|\,dx = \int_E f(\phi(x))\,|\Delta_\phi(x)|\,dx.$$
∎
[Figure 12.13]

Again, we note that in Theorem 12.46 the hypothesis that $f \circ \phi$ be integrable is superfluous (see Theorem 12.29ii).

To see how all this works out in practice, we begin with a familiar change of variables in $\mathbf{R}^2$. Recall that polar coordinates in $\mathbf{R}^2$ have the form
$$x = r\cos\theta, \qquad y = r\sin\theta,$$
where $r = \|(x, y)\|$ and $\theta$ is the angle measured counterclockwise from the positive $x$ axis to the line segment $L((0, 0); (x, y))$ (see Figure 12.12). Set $\phi(r, \theta) = (r\cos\theta, r\sin\theta)$ and observe that
$$\Delta_\phi(r, \theta) = \det\begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix} = r(\cos^2\theta + \sin^2\theta) = r. \tag{40}$$
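The determinant in (40) can be spot-checked numerically. The sketch below is an illustration under assumptions: the names `polar_map` and `jacobian_det_2d`, the step size `h`, and the sample points are all chosen for the example. It estimates the four partial derivatives by central differences and compares the resulting determinant with $r$.

```python
import math


def polar_map(r, theta):
    """phi(r, theta) = (r cos(theta), r sin(theta))."""
    return (r * math.cos(theta), r * math.sin(theta))


def jacobian_det_2d(F, s, t, h=1e-6):
    """Central-difference estimate of det DF at the point (s, t)."""
    x_sp, y_sp = F(s + h, t)
    x_sm, y_sm = F(s - h, t)
    x_tp, y_tp = F(s, t + h)
    x_tm, y_tm = F(s, t - h)
    dx_ds = (x_sp - x_sm) / (2 * h)
    dy_ds = (y_sp - y_sm) / (2 * h)
    dx_dt = (x_tp - x_tm) / (2 * h)
    dy_dt = (y_tp - y_tm) / (2 * h)
    return dx_ds * dy_dt - dx_dt * dy_ds


# Compare the numerical determinant with r at a few sample points (r, theta).
samples = [(0.5, 0.3), (1.0, 2.0), (2.5, -1.2)]
errors = [abs(jacobian_det_2d(polar_map, r, th) - r) for r, th in samples]
print(max(errors))
```

The same finite-difference helper can be pointed at any planar change of variables to check a hand-computed Jacobian before using it in (38).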
Thus we abbreviate the change-of-variables formula from polar coordinates to rectangular coordinates by $dx\,dy = r\,dr\,d\theta$.

Although $\phi$ is not 1-1 (e.g., $\phi(0, \theta) = (0, 0)$ for all $\theta \in \mathbf{R}$) and its Jacobian is not everywhere nonzero, this does not prevent us from applying Theorem 12.46 (i.e., changing variables from polar coordinates to rectangular coordinates, and vice versa). Indeed, since $\phi$ is 1-1 on $\Omega := \{(r, \theta) : r > 0,\ 0 \le \theta < 2\pi\}$ and its Jacobian is nonzero off the set $Z := \{(r, \theta) : r = 0\}$, we can apply Theorem 12.46 to $E \cap \{(r, \theta) : r > 0\}$ and let $r \downarrow 0$. Since the end result is the same as if we applied Theorem 12.46 directly without this intermediate step, we shall do so below without any further comments. This works in part because the set $Z$, where the hypotheses of category 1) fail, is a set of volume zero (see Theorem 12.65).

The next two examples show that polar coordinates can be used to evaluate integrals that cannot be computed easily using rectangular coordinates.

12.47 Example. Find the volume of the region $E$ bounded by $z = x^2 + y^2$, $x^2 + y^2 = 4$, and $z = 0$.

SOLUTION. Clearly, $E$ lies under the function $f(x, y) = x^2 + y^2$ over the region $B = B_2(0, 0)$ (see Figure 12.13). Using polar coordinates, we obtain
$$\operatorname{Vol}(E) = \iint_B (x^2 + y^2)\,dA = \int_0^{2\pi}\int_0^2 r^3\,dr\,d\theta = 8\pi.$$
∎
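12.48 Example. Evaluate
$$\iint_E \frac{x^2 + y^2}{x}\,dA, \tag{42}$$
where $E = \{(x, y) : a^2 \le x^2 + y^2 \le 1 \text{ and } 0 \le y \le x\}$ for some $0 < a < 1$.

SOLUTION. Changing to polar coordinates, we see that
$$\iint_E \frac{x^2 + y^2}{x}\,dA = \int_0^{\pi/4}\int_a^1 \frac{r^3}{r\cos\theta}\,dr\,d\theta = \frac{1 - a^3}{3}\int_0^{\pi/4} \sec\theta\,d\theta.$$
To integrate $\sec\theta$, multiply and divide by $\sec\theta + \tan\theta$. Using the change of variables $u = \sec\theta + \tan\theta$, we obtain
$$\int_0^{\pi/4} \sec\theta\,d\theta = \int_0^{\pi/4} \frac{\sec\theta\tan\theta + \sec^2\theta}{\sec\theta + \tan\theta}\,d\theta = \int_1^{1+\sqrt{2}} \frac{du}{u} = \log(1 + \sqrt{2}).$$
Consequently,
$$\iint_E \frac{x^2 + y^2}{x}\,dA = \frac{1 - a^3}{3}\log(1 + \sqrt{2}). \tag{43}$$
∎

Recall that cylindrical coordinates in $\mathbf{R}^3$ have the form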
$$x = r\cos\theta, \qquad y = r\sin\theta, \qquad z = z,$$
where $r = \|(x, y, 0)\|$ and $\theta$ is the angle measured counterclockwise from the positive $x$ axis to the line segment $L((0, 0, 0); (x, y, 0))$. It is easy to see that this change of variables is 1-1 on $\Omega := \{(r, \theta, z) : r > 0,\ 0 \le \theta < 2\pi,\ z \in \mathbf{R}\}$, and its Jacobian, $r$, is nonzero off $Z := \{(r, \theta, z) : r = 0\}$. We shall abbreviate the change-of-variables formula from cylindrical coordinates to rectangular coordinates by $dx\,dy\,dz = r\,dz\,dr\,d\theta$. (Note that $Z$ is a set of volume zero. As with polar coordinates, application of Theorem 12.46 can be justified by applying it first for $r > 0$, and then taking the limit as $r \downarrow 0$.)

12.49 Example. Find the volume of the region $E$ that lies inside the paraboloid $x^2 + y^2 + z = 4$, outside the cylinder $x^2 - 2x + y^2 = 0$, and above the plane $z = 0$.

SOLUTION. The paraboloid $z = 4 - x^2 - y^2$ has vertex $(0, 0, 4)$ and opens downward about the $z$ axis. The cylinder $x^2 - 2x + y^2 = (x - 1)^2 + y^2 - 1 = 0$ has base centered at $(1, 0)$ with radius 1. Hence, the projection $E_3$ lies inside the circle $x^2 + y^2 = 4$ and outside the circle $x^2 + y^2 = 2x$ (see Figure 12.14). This last circle can be described in polar coordinates by $r^2 = 2r\cos\theta$; i.e., $r = 2\cos\theta$. Thus
$$\operatorname{Vol}(E) = \iiint_E 1\,dV = \iint_{E_3}\int_0^{4 - r^2} dz\,dA = \int_{-\pi/2}^{\pi/2}\int_{2\cos\theta}^{2} (4 - r^2)\,r\,dr\,d\theta + \int_{\pi/2}^{3\pi/2}\int_0^{2} (4 - r^2)\,r\,dr\,d\theta = \frac{11}{2}\pi.$$
∎
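The two-piece polar setup in Example 12.49 is easy to get wrong, so a numerical cross-check is worthwhile. The following is a minimal sketch (the helper names `midpoint` and `radial` are illustrative): it evaluates the inner integral of $(4 - r^2)\,r$ from a lower radius up to 2, then integrates over the right half-plane, where the lower radius is $2\cos\theta$, and over the left half-plane, where it is 0.

```python
import math


def midpoint(g, lo, hi, n=300):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def radial(r_lo):
    """Integral of (4 - r^2) r dr from r_lo to 2: height of E times the polar factor r."""
    return midpoint(lambda r: (4.0 - r * r) * r, r_lo, 2.0)


# Right half-plane: the small cylinder removes 0 <= r < 2 cos(theta).
right_half = midpoint(lambda t: radial(2.0 * math.cos(t)), -math.pi / 2, math.pi / 2)
# Left half-plane: the full disk 0 <= r <= 2 contributes.
left_half = midpoint(lambda t: radial(0.0), math.pi / 2, 3 * math.pi / 2)

volume = right_half + left_half
print(volume, 11 * math.pi / 2)
```

The two pieces come out near $3\pi/2$ and $4\pi$ respectively, which add to $11\pi/2$.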
[Figure 12.14]

[Figure 12.15: spherical coordinates, $(\rho, \varphi, \theta) = (x, y, z)$]
Recall that spherical coordinates in $\mathbf{R}^3$ have the form
$$x = \rho\sin\varphi\cos\theta, \qquad y = \rho\sin\varphi\sin\theta, \qquad z = \rho\cos\varphi,$$
where $\rho = \|(x, y, z)\|$, $\theta$ is the angle measured counterclockwise from the positive $x$ axis to the line segment $L((0, 0, 0); (x, y, 0))$, and $\varphi$ is the angle measured from the positive $z$ axis to the vector $(x, y, z)$ (see Figure 12.15). Notice that this change of variables is 1-1 on $\{(\rho, \varphi, \theta) : \rho > 0,\ 0 < \varphi < \pi,\ 0 \le \theta < 2\pi\}$ and its Jacobian, $\rho^2\sin\varphi$ (see Exercise 8), is nonzero off $Z := \{(\rho, \varphi, \theta) : \varphi = 0, \pi,\ \rho = 0\}$, a Jordan region of volume zero. Hence, application of Theorem 12.46 can be justified by applying it first for $\rho > 0$ and $0 < \varphi < \pi$, and then taking the limit as $\rho, \varphi \downarrow 0$ and $\varphi \uparrow \pi$. Since the end result is the same as applying Theorem 12.46 directly to any projectable region in $\mathbf{R}^3$, we shall do so, without further comments, when changing variables to or from spherical coordinates. We shall abbreviate the change-of-variables formula from spherical coordinates to rectangular coordinates by $dx\,dy\,dz = \rho^2\sin\varphi\,d\rho\,d\varphi\,d\theta$. (For spherical coordinates in $\mathbf{R}^n$, see the proof of Theorem 12.69.)
12.50 Example. Find
$$\iiint_Q x\,dV,$$
where $Q = B_3(0, 0, 0) \setminus B_2(0, 0, 0)$.

SOLUTION. Using spherical coordinates, we have
$$\iiint_Q x\,dV = \int_0^{2\pi}\int_0^{\pi}\int_2^3 \rho\sin\varphi\cos\theta\,(\rho^2\sin\varphi)\,d\rho\,d\varphi\,d\theta = 0.$$
∎
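A numerical sketch of Example 12.50 follows; the function name `spherical_integral` and the resolution `n` are illustrative assumptions. It integrates over the shell $2 \le \rho \le 3$ with the factor $\rho^2\sin\varphi$ and, as a side check of that factor, also recovers the shell's volume $\frac{4}{3}\pi(3^3 - 2^3)$.

```python
import math


def midpoint(g, lo, hi, n):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def spherical_integral(f, rho_lo, rho_hi, n=50):
    """Integrate f(x, y, z) over the shell rho_lo <= rho <= rho_hi,
    using dV = rho^2 sin(phi) d rho d phi d theta."""
    def over_theta(theta):
        def over_phi(phi):
            def over_rho(rho):
                x = rho * math.sin(phi) * math.cos(theta)
                y = rho * math.sin(phi) * math.sin(theta)
                z = rho * math.cos(phi)
                return f(x, y, z) * rho * rho * math.sin(phi)
            return midpoint(over_rho, rho_lo, rho_hi, n)
        return midpoint(over_phi, 0.0, math.pi, n)
    return midpoint(over_theta, 0.0, 2.0 * math.pi, n)


# The integral of x over the shell vanishes by symmetry in theta.
integral_of_x = spherical_integral(lambda x, y, z: x, 2.0, 3.0)
# Integrating 1 recovers the shell volume (4/3) pi (27 - 8).
shell_volume = spherical_integral(lambda x, y, z: 1.0, 2.0, 3.0)
print(integral_of_x, shell_volume, 4 * math.pi * 19 / 3)
```

The vanishing of the first integral reflects that $\int_0^{2\pi}\cos\theta\,d\theta = 0$; the second value is accurate only to a couple of decimal places at this resolution.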
Theorem 12.46 can be used for other changes of variables besides polar, cylindrical, and spherical coordinates.
12.51 Example. Find
$$\iint_E \sin(x + y)\cos(2x - y)\,dA,$$
where $E$ is the region bounded by $y = 2x - 1$, $y = 2x + 3$, $y = -x$, and $y = -x + 1$.
SOLUTION. Let $\phi(x, y) = (2x - y, x + y)$ and observe that the integral in question looks like the right side of (38) except that the Jacobian is missing. By Cramer's Rule, for each fixed $u, v \in \mathbf{R}$, the system $u = 2x - y$, $v = x + y$ has a unique solution in $x, y$. Hence, $\phi$ is 1-1 on $\mathbf{R}^2$. It is obviously continuously differentiable, and its Jacobian,
$$\Delta_\phi(x, y) = \frac{\partial(u, v)}{\partial(x, y)} = \det\begin{bmatrix} 2 & -1 \\ 1 & 1 \end{bmatrix} = 3,$$
is a nonzero constant. Hence, we can make adjustments to the integral in question so that it is precisely the right side of (38):
$$\iint_E \sin(x + y)\cos(2x - y)\,dA = \frac{1}{3}\iint_E f \circ \phi(x, y)\,\Delta_\phi(x, y)\,d(x, y),$$
where $f(u, v) = \cos u \sin v$. It remains to compute the left side of (38), i.e., to find what happens to $E$ under $\phi$. Notice that $y = 2x - 1$ implies $u = 1$, $y = 2x + 3$ implies $u = -3$, $y = -x$ implies $v = 0$, and $y = -x + 1$ implies $v = 1$. Thus $\phi(E) = [-3, 1] \times [0, 1]$. Applying Theorem 12.46 and the preliminary step taken above, we find that
$$\iint_E \sin(x + y)\cos(2x - y)\,dA = \frac{1}{3}\int_0^1\int_{-3}^{1} \sin v\cos u\,du\,dv = \frac{1}{3}(\sin(1) + \sin(3))(1 - \cos(1)).$$
∎
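The linear change of variables in Example 12.51 can be checked against a direct computation. In the sketch below (function names are illustrative), `direct_integral` treats $E$ as a region in the $xy$ plane with $-1 \le x \le 2/3$ and $\max(2x - 1, -x) \le y \le \min(2x + 3, 1 - x)$, a description obtained from the four bounding lines, while `transformed_integral` integrates $\frac{1}{3}\cos u\sin v$ over the rectangle $[-3, 1] \times [0, 1]$.

```python
import math


def midpoint(g, lo, hi, n=400):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def integrand(x, y):
    return math.sin(x + y) * math.cos(2 * x - y)


def direct_integral():
    # E in xy coordinates: 2x - 1 <= y <= 2x + 3 and -x <= y <= 1 - x.
    def inner(x):
        y_lo = max(2 * x - 1, -x)
        y_hi = min(2 * x + 3, 1 - x)
        return midpoint(lambda y: integrand(x, y), y_lo, y_hi)
    return midpoint(inner, -1.0, 2.0 / 3.0)


def transformed_integral():
    # After u = 2x - y, v = x + y: (1/3) * integral of cos(u) sin(v) over [-3, 1] x [0, 1].
    total = midpoint(
        lambda v: midpoint(lambda u: math.cos(u) * math.sin(v), -3.0, 1.0),
        0.0, 1.0)
    return total / 3.0


closed_form = (math.sin(1) + math.sin(3)) * (1 - math.cos(1)) / 3
value_direct = direct_integral()
value_transformed = transformed_integral()
print(value_direct, value_transformed, closed_form)
```

All three numbers should agree closely; the factor $1/3$ is the reciprocal of the constant Jacobian $\Delta_\phi = 3$.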
EXERCISES 1. Evaluate each of the following integrals.
(a)
(b)
0:::; a < b.
(c)

2. For each of the following, find $\iint_E f\,dA$.
(a) $f(x, y) = \cos(3x^2 + y^2)$ and $E$ is the set of points satisfying $x^2 + y^2/3 \le 1$.
(b) $f(x, y) = y\sqrt{x - 2y}$ and $E$ is bounded by the triangle with vertices $(0, 0)$, $(4, 0)$, and $(4, 2)$.

3. For each of the following, find $\iiint_E f\,dV$.
(a) $f(x, y, z) = z^2$ and $E$ is the set of points satisfying $x^2 + y^2 + z^2 \le 6$ and $z \ge x^2 + y^2$.
(b) $f(x, y, z) = e^z$ and $E$ is the set of points satisfying $x^2 + y^2 + z^2 \le 9$, $x^2 + y^2 \le 1$, and $z \ge 0$.
(c) $f(x, y, z) = (x - y)z$ and $E$ is the set of points satisfying $x^2 + y^2 + z^2 \le 4$, $z \ge x^2 + y^2$, and $x \ge 0$.
J
4. (a) Prove that the volume bounded by the ellipsoid
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$$
is $4\pi abc/3$.
(b) Let $a, b, c, d$ be positive numbers and $r^2 < d^2/(b^2 + c^2)$. Find the volume of the region bounded by $y^2 + z^2 = r^2$, $x = 0$, and $ax + by + cz = d$.
(c) Show that for any $a \ge 0$, the volume of the region bounded by the cylinders $x^2 + z^2 = a^2$ and $y^2 + z^2 = a^2$ is $16a^3/3$.

5. (a) Compute $\iint_E \sqrt{x - y}\,\sqrt{x + 2y}\,dA$, where $E$ is the parallelogram with vertices $(0, 0)$, $(2/3, -1/3)$, $(1, 0)$, $(1/3, 1/3)$.
(b) Compute $\iint_E \sqrt[3]{2x^2 - 5xy - 3y^2}\,dA$, where $E$ is the parallelogram bounded by the lines $y = x/3$, $y = (x - 1)/3$, $y = -2x$, $y = 1 - 2x$.
(c) Find
$$\iint_E e^{(y - x)/(y + x)}\,dA,$$
where $E$ is the trapezoid with vertices $(1, 1)$, $(2, 2)$, $(2, 0)$, $(4, 0)$.
(d) Given $\int_0^1 (1 - x) f(x)\,dx = 5$, find
$$\int_0^1\int_0^x f(x - y)\,dy\,dx.$$
6. Suppose that $V$ is nonempty and open in $\mathbf{R}^n$ and $f : V \to \mathbf{R}^n$ is continuously differentiable with $\Delta_f \ne 0$ on $V$. Prove that
$$\lim_{r \to 0+} \frac{\operatorname{Vol}(f(B_r(x_0)))}{\operatorname{Vol}(B_r(x_0))} = |\Delta_f(x_0)|$$
for every $x_0 \in V$.

7. Show that Vol is rotation invariant in $\mathbf{R}^2$; i.e., if $\phi$ is a rotation on $\mathbf{R}^2$ (see Exercise 9, p. 241) and $E$ is a Jordan region in $\mathbf{R}^2$, then $\operatorname{Vol}(\phi(E)) = \operatorname{Vol}(E)$.

8. (a) Compute the Jacobian of the change of variables from spherical coordinates to rectangular coordinates.
(b) Assuming that Vol is translation and rotation invariant (see Exercise 3, p. 393, and Exercise 7), verify the following classical formulas: the volume of a sphere of radius $r$ is $\frac{4}{3}\pi r^3$, and the volume of a right circular cone of altitude $h$ and radius $r$ is $\pi r^2 h/3$.

9. Let $v_j = (v_{j1}, \dots, v_{jn}) \in \mathbf{R}^n$, $j = 1, \dots, n$, be fixed. The parallelepiped determined by the vectors $v_j$ is the set
$$P(v_1, \dots, v_n) := \{t_1 v_1 + \cdots + t_n v_n : t_j \in [0, 1]\},$$
and the determinant of the $v_j$'s is the number $\det(v_1, \dots, v_n) := \det[v_{jk}]_{n \times n}$. Prove that $\operatorname{Vol}(P(v_1, \dots, v_n)) = |\det(v_1, \dots, v_n)|$. Check this formula for $n = 2$ and $n = 3$ to see that it agrees with the classical formulas for the area of a parallelogram and the volume of a parallelepiped.

[10]. This exercise is used in Section e12.6.
(a) Prove that the improper integral $\int_0^\infty e^{-x^2}\,dx$ converges to a finite real number.
(b) Prove that if $I$ is the value of the integral in part (a), then
$$I^2 = \lim_{N \to \infty} \int_0^{\pi/2}\int_0^N e^{-r^2}\,r\,dr\,d\theta.$$
(c) Show that
$$\int_0^\infty e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}.$$
(d) Let $Q_k$ represent the $n$-dimensional cube $[-k, k] \times \cdots \times [-k, k]$. Find
$$\lim_{k \to \infty} \int_{Q_k} e^{-\|x\|^2}\,dx.$$
11. Let $H \subset V \subset \mathbf{R}^n$, with $H$ convex and $V$ open, and let $\phi : V \to \mathbf{R}^n$ be $C^1$.
(a) Show that if $E$ is a closed subset of $H^o$ and
$$\varepsilon_h(x) := \phi(x + h) - \phi(x) - D\phi(x)(h)$$
for $x \in V$ and $h$ small, then $\varepsilon_h(x)/\|h\| \to 0$ uniformly on $E$, as $h \to 0$.
So >(x) - S 0 >(y) = x - y for x,y E R, and IIT(x,y) II ~ M€ when Ilx -yll < 8. (c) Use parts (a) and (b) to prove that if 6.> is nonzero on V, x E HO, and € is sufficiently small, then there exist numbers Ce > 0, which depend only on H, >, n, and €, and a 8> 0 such that Ce --+ 1 as € --+ 0 and Vol (So>(Q)) ~ CelQI for all cubes Q C H that contain x and satisfy Vol (Q) < 8. (d) Use part (c) and Exercise 9 to prove that if 6.> is nonzero on V and x E HO, then given any sequence of cubes Qj that satisfy x E Qj and Vol (Qj) --+ 0 as j --+ 00, it is also the case that Vol (>(Qj))/IQjl--+ I6.> (x) I as j --+ 00.
e12.5 PARTITIONS OF UNITY
This section uses results from Section 9.4.
In this section we show that a smooth function can be broken into a sum of smooth functions, each of which is zero except on a small set, and use this to prove a global change-of-variables formula when the Jacobian is nonzero off a set of volume zero. Later, this same technique will be used to prove the Fundamental Theorem of Calculus on manifolds (see Theorem 15.44).
12.52 DEFINITION. Let $f : \mathbf{R}^n \to \mathbf{R}$.
(i) The support of $f$ is the closure of the set of points at which $f$ is nonzero; i.e.,
$$\operatorname{spt} f := \overline{\{x \in \mathbf{R}^n : f(x) \ne 0\}}.$$
(ii) A function f is said to have compact support if and only if spt f is a compact set.
12.53 Example. If
$$f(x) = \begin{cases} 1 & x \in \mathbf{Q} \\ 0 & x \notin \mathbf{Q}, \end{cases}$$
then $\operatorname{spt} f = \mathbf{R}$.
12.54 Example. If x E (0,1) x E (1,2) otherwise,
then $\operatorname{spt} f = [0, 2]$.
Since the support of a function is always closed, a function $f$ on $\mathbf{R}^n$ has compact support if and only if $\operatorname{spt} f$ is bounded (see the Heine-Borel Theorem). The following result shows that if two functions have compact support, then so does their sum (see also Exercises 1 and 2).

12.55 Remark. If $f, g : \mathbf{R}^n \to \mathbf{R}$, then
$$\operatorname{spt}(f + g) \subseteq \operatorname{spt} f \cup \operatorname{spt} g.$$

PROOF. If $(f + g)(x) \ne 0$, then $f(x) \ne 0$ or $g(x) \ne 0$. Thus
$$\{x \in \mathbf{R}^n : (f + g)(x) \ne 0\} \subseteq \{x \in \mathbf{R}^n : f(x) \ne 0\} \cup \{x \in \mathbf{R}^n : g(x) \ne 0\}.$$
Since the closure of a union equals the union of its closures (see Theorem 8.37 or 10.40), it follows that $\operatorname{spt}(f + g) \subseteq \operatorname{spt} f \cup \operatorname{spt} g$. ∎

Let $p \in \mathbf{N}$ or $p = \infty$. The symbol $C_c^p(\mathbf{R}^n)$ will denote the collection of functions $f : \mathbf{R}^n \to \mathbf{R}$ that are $C^p$ on $\mathbf{R}^n$ and have compact support. In particular, it follows from Remark 12.55 that if $f_j \in C_c^p(\mathbf{R}^n)$ for $j = 1, \dots, N$, then
$$\sum_{j=1}^{N} f_j \in C_c^p(\mathbf{R}^n).$$
We will use this observation several times below.

If $f$ is analytic (a condition stronger than $C^\infty$) and has compact support, then $f$ is identically zero (see Exercise 3 below). Thus it is not at all obvious that $C_c^\infty(\mathbf{R}^n)$ contains anything but the zero function. Nevertheless, we shall show that $C_c^\infty(\mathbf{R}^n)$ not only contains nonzero functions, but has enough functions to "approximate" any compact set (see Theorem 12.58 and Exercise 6). First, we deal with the one-dimensional case.

12.56 Lemma. For every $a < b$ there is a function $\phi \in C_c^\infty(\mathbf{R})$ such that $\phi(t) > 0$ for $t \in (a, b)$ and $\phi(t) = 0$ for $t \notin (a, b)$.

PROOF. The function
$$f(t) = \begin{cases} e^{-1/t^2} & t \ne 0 \\ 0 & t = 0 \end{cases}$$
belongs to $C^\infty(\mathbf{R})$ and $f^{(j)}(0) = 0$ for all $j \in \mathbf{N}$ (see Exercise 3, p. 101). Hence,
$$\phi(t) = \begin{cases} e^{-1/(t-a)^2}\,e^{-1/(t-b)^2} & t \in (a, b) \\ 0 & \text{otherwise} \end{cases}$$
belongs to $C^\infty(\mathbf{R})$, satisfies $\phi(t) > 0$ for $t \in (a, b)$, and $\operatorname{spt}\phi = [a, b]$. ∎

Next, we show that there exists a nonzero $C^\infty$ function that is constant everywhere except on a small interval.
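12.57 Lemma. For each $\delta > 0$ there is a function $\psi \in C^\infty(\mathbf{R})$ such that $0 \le \psi \le 1$ on $\mathbf{R}$, $\psi(t) = 0$ for $t \le 0$, and $\psi(t) = 1$ for $t > \delta$.

PROOF. By Lemma 12.56, choose $\phi \in C_c^\infty(\mathbf{R})$ such that $\phi(t) > 0$ for $t \in (0, \delta)$ and $\phi(t) = 0$ for $t \notin (0, \delta)$. Set
$$\psi(t) = \frac{\int_{-\infty}^{t} \phi(u)\,du}{\int_{-\infty}^{\infty} \phi(u)\,du}.$$
By the Fundamental Theorem of Calculus, $\psi \in C^\infty(\mathbf{R})$, by construction $0 \le \psi \le 1$, and
$$\psi(t) = \begin{cases} 0 & t \le 0 \\ 1 & t > \delta. \end{cases}$$
∎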
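The constructions in Lemmas 12.56 and 12.57 are concrete enough to evaluate numerically. The sketch below is illustrative only: the names `bump` and `smooth_step` are not from the text, and the integrals defining $\psi$ are approximated by a midpoint rule over $[0, \delta]$ (where $\phi$ is supported) rather than computed exactly. It is meant for moderate values such as $\delta = 1$; for very small $\delta$ the exponentials underflow in floating point.

```python
import math


def midpoint(g, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def bump(t, a, b):
    """phi from Lemma 12.56: positive on (a, b) and zero elsewhere."""
    if not (a < t < b):
        return 0.0
    da = (t - a) ** 2
    db = (t - b) ** 2
    if da == 0.0 or db == 0.0:
        # Squares can underflow extremely close to the endpoints.
        return 0.0
    return math.exp(-1.0 / da) * math.exp(-1.0 / db)


def smooth_step(t, delta):
    """psi from Lemma 12.57: 0 for t <= 0, 1 for t >= delta, increasing in between."""
    if t <= 0.0:
        return 0.0
    if t >= delta:
        return 1.0
    total = midpoint(lambda u: bump(u, 0.0, delta), 0.0, delta)
    partial = midpoint(lambda u: bump(u, 0.0, delta), 0.0, t)
    return partial / total


values = [smooth_step(t, 1.0) for t in (-0.5, 0.25, 0.5, 0.75, 1.5)]
print(values)
```

Because the bump on $(0, 1)$ is symmetric about $t = 1/2$, the step passes through $1/2$ there; composing such a step with a positive smooth function is how the proof of Theorem 12.58 flattens it to the constant 1.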
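Finally, we use these one-dimensional $C^\infty$ functions to construct nonzero functions in $C_c^\infty(\mathbf{R}^n)$.

12.58 THEOREM [$C^\infty$ VERSION OF URYSOHN'S LEMMA]. Let $H$ be compact and nonempty, let $V$ be open in $\mathbf{R}^n$, and let $H \subset V$. Then there is an $h \in C_c^\infty(\mathbf{R}^n)$ such that $0 \le h(x) \le 1$ for all $x \in \mathbf{R}^n$, $h(x) = 1$ for all $x \in H$, and $\operatorname{spt} h \subset V$.

PROOF. Let $\phi \in C_c^\infty(\mathbf{R})$ satisfy $\phi(t) > 0$ for $t \in (-1, 1)$ and $\phi(t) = 0$ for $t \notin (-1, 1)$. For each $\varepsilon > 0$ and each $x \in \mathbf{R}^n$, let $Q_\varepsilon(x)$ represent the $n$-dimensional cube
$$Q_\varepsilon(x) = \{y \in \mathbf{R}^n : |y_j - x_j| \le \varepsilon \text{ for all } j = 1, \dots, n\}.$$
Set
$$g_\varepsilon(y) = \phi\left(\frac{y_1}{\varepsilon}\right) \cdots \phi\left(\frac{y_n}{\varepsilon}\right), \tag{44}$$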
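and observe by Theorem 4.10 (the Product Rule) that $g_\varepsilon$ is $C^\infty$ on $\mathbf{R}^n$. By construction, $g_\varepsilon(y) \ge 0$ on $\mathbf{R}^n$, $g_\varepsilon(y) > 0$ for all $y$ in the open ball $B_\varepsilon(0)$, and the support of $g_\varepsilon$ is a subset of the cube $Q_\varepsilon(0)$. In particular, $g_\varepsilon \in C_c^\infty(\mathbf{R}^n)$.

We will use sums of translates of these $g_\varepsilon$'s to construct a $C^\infty$ function, supported on $V$, that is strictly positive on $H$. It is here that the compactness of $H$ enters in a crucial way. For each $x \in H$, choose $\varepsilon := \varepsilon(x) > 0$ such that $Q_\varepsilon(x) \subset V$. Set
$$h_x(y) = g_\varepsilon(y - x),$$
and notice that $h_x \ge 0$ on $\mathbf{R}^n$, $h_x(y) > 0$ for all $y \in B_\varepsilon(x)$, $h_x(y) = 0$ for all $y \notin Q_\varepsilon(x)$, and $h_x \in C_c^\infty(\mathbf{R}^n)$. Since $H$ is compact and
$$H \subset \bigcup_{x \in H} B_\varepsilon(x),$$
choose points $x_j \in H$ and positive numbers $\varepsilon_j = \varepsilon(x_j)$, $j = 1, \dots, N$, such that
$$H \subset B_{\varepsilon_1}(x_1) \cup \cdots \cup B_{\varepsilon_N}(x_N).$$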
Set $Q = Q_{\varepsilon_1}(x_1) \cup \cdots \cup Q_{\varepsilon_N}(x_N)$ and $f = h_{x_1} + \cdots + h_{x_N}$. Clearly, $Q$ is compact, $Q \subset V$, and $f$ is $C^\infty$ on $\mathbf{R}^n$. If $x \notin Q$, then $x \notin Q_{\varepsilon_j}(x_j)$ for all $j$, hence $f(x) = 0$. Thus $\operatorname{spt} f \subseteq Q$. If $x \in H$, then $x \in B_{\varepsilon_j}(x_j)$ for some $j$, hence $f(x) > 0$.

It remains to flatten $f$ so that it is identically 1 on $H$. This is where Lemma 12.57 comes in. Since $f > 0$ on the compact set $H$, $f$ has a nonzero minimum on $H$. Thus there is a $\delta > 0$ such that $f(x) > \delta$ for $x \in H$. By Lemma 12.57, choose $\psi \in C^\infty(\mathbf{R})$ such that $\psi(t) = 0$ when $t \le 0$, and $\psi(t) = 1$ when $t > \delta$. Set $h = \psi \circ f$. Clearly, $h \in C_c^\infty(\mathbf{R}^n)$, $\operatorname{spt} h \subseteq Q \subset V$, and since $f > \delta$ on $H$, $h = 1$ on $H$. Finally, since $0 \le \psi \le 1$, the same is true of $h$. ∎

This result leads directly to a decomposition theorem for $C^\infty$ functions.
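12.59 THEOREM [$C^\infty$ PARTITIONS OF UNITY]. Let $\Omega \subset \mathbf{R}^n$ be nonempty and let $\{V_\alpha\}_{\alpha \in A}$ be an open covering of $\Omega$. Then there exist functions $\phi_j \in C_c^\infty(\mathbf{R}^n)$ and indices $\alpha_j \in A$, $j \in \mathbf{N}$, such that the following properties hold.

(i) $\phi_j \ge 0$ for all $j \in \mathbf{N}$.

(ii) $\operatorname{spt}\phi_j \subset V_{\alpha_j}$ for all $j \in \mathbf{N}$.

(iii) $\sum_{j=1}^{\infty} \phi_j(x) = 1$ for all $x \in \Omega$.

(iv) If $H$ is a nonempty compact subset of $\Omega$, then there is a nonempty open set $W \supseteq H$ and an integer $N$ such that $\phi_j(x) = 0$ for all $j \ge N$ and $x \in W$. In particular,
$$\sum_{j=1}^{N} \phi_j(x) = 1 \quad \text{for all } x \in W.$$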
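PROOF. For each $x \in \Omega$, choose a bounded open set $W(x)$ and an index $\alpha \in A$ such that
$$x \in W(x) \subset \overline{W(x)} \subset V_\alpha.$$
Then $\mathcal{W} = \{W(x) : x \in \Omega\}$ is an open covering of $\Omega$, and by Lindelöf's Theorem, we may suppose that $\mathcal{W}$ is countable; i.e., $\mathcal{W} = \{W_j\}_{j \in \mathbf{N}}$. By construction, given $j \in \mathbf{N}$, there is an index $\alpha_j \in A$ such that
$$W_j \subset \overline{W_j} \subset V_{\alpha_j}.$$
Choose by Theorem 12.58 functions $h_j \in C_c^\infty(\mathbf{R}^n)$ such that $0 \le h_j \le 1$ on $\mathbf{R}^n$, $h_j = 1$ on $W_j$, and $\operatorname{spt} h_j \subset V_{\alpha_j}$ for $j \in \mathbf{N}$. Set $\phi_1 = h_1$ and for $j > 1$, set
$$\phi_j = (1 - h_1) \cdots (1 - h_{j-1})\,h_j.$$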
Then $\phi_j \ge 0$ on $\mathbf{R}^n$, and $\phi_j \in C_c^\infty(\mathbf{R}^n)$ with $\operatorname{spt}\phi_j \subseteq \operatorname{spt} h_j \subset V_{\alpha_j}$ for $j \in \mathbf{N}$. This proves parts (i) and (ii).

An easy induction argument establishes
$$\sum_{j=1}^{k} \phi_j = 1 - (1 - h_1) \cdots (1 - h_k)$$
for $k \in \mathbf{N}$. If $x \in \Omega$, then $x \in W_{j_0}$ for some $j_0$, so $1 - h_{j_0}(x) = 0$. Thus
$$\sum_{j=1}^{k} \phi_j(x) = 1 - 0 = 1$$
for $k \ge j_0$. If $H$ is a compact subset of $\Omega$, then $H \subset W_1 \cup \cdots \cup W_N$ for some $N \in \mathbf{N}$. If $W = W_1 \cup \cdots \cup W_N$, then $x \in W$ implies $h_k(x) = 1$ for some $1 \le k \le N$; i.e., $\phi_j(x) = 0$ for all $j > N$. Hence,
$$\sum_{j=1}^{N} \phi_j(x) = \sum_{j=1}^{\infty} \phi_j(x) = 1$$
for all $x \in W$. ∎
A sequence of functions $\{\phi_j\}_{j \in \mathbf{N}}$ is called a ($C^0$) partition of unity on $\Omega$ subordinate to a covering $\{V_\alpha\}_{\alpha \in A}$ if and only if $\Omega$ and all the $V_\alpha$'s are open and nonempty, the $\phi_j$'s are all continuous with compact support and satisfy statements (i) through (iv) of Theorem 12.59. By a $C^p$ partition of unity on $\Omega$ we shall mean a partition of unity on $\Omega$ whose functions $\phi_j$ are also $C^p$ on $\Omega$. By Theorem 12.59, given any open covering $\mathcal{V}$ of any nonempty set $\Omega \subseteq \mathbf{R}^n$ and any extended real number $p \ge 0$, there exists a $C^p$ partition of unity on $\Omega$ subordinate to $\mathcal{V}$.
$C^p$ partitions of unity can be used to decompose a function $f$ into a sum of functions $f_j$ that have small support and are as smooth as $f$. For example, let $f$ be defined on a set $\Omega$, $\{\phi_j\}_{j \in \mathbf{N}}$ be a $C^p$ partition of unity on $\Omega$ subordinate to a covering $\{V_j\}_{j \in \mathbf{N}}$, and $f_j = f\phi_j$. Then
$$\sum_{j=1}^\infty f_j(x) = \sum_{j=1}^\infty f(x)\phi_j(x) = f(x) \sum_{j=1}^\infty \phi_j(x) = f(x)$$
for all $x \in \Omega$. If $f$ is continuous on $\Omega$ and $p \ge 0$, then each $f_j$ is continuous on $\Omega$; if $f$ is continuously differentiable on $\Omega$ and $p \ge 1$, then each $f_j$ is continuously differentiable on $\Omega$. Thus, $f$ can be written as a sum of functions $f_j$ that are as smooth as $f$. This allows us to pass from local results to global ones; e.g., if we know that a certain property holds on small open sets in $\Omega$, then we can show that a similar property holds on all of $\Omega$ by using a partition of unity subordinate to a covering of $\Omega$ which consists of small open sets.
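The construction $\phi_1 = h_1$, $\phi_j = (1-h_1)\cdots(1-h_{j-1})h_j$ from the proof of Theorem 12.59 can be tried numerically in one dimension. The following is only a sketch: the intervals, the helper names `smooth_step`, `plateau`, and `partition`, and the particular choice of the functions $h_j$ are assumptions made for illustration and do not come from the text.

```python
import math

def smooth_step(t):
    """C-infinity function equal to 0 for t <= 0 and 1 for t >= 1 (built from exp(-1/t))."""
    def g(s):
        return math.exp(-1.0 / s) if s > 0 else 0.0
    return g(t) / (g(t) + g(1.0 - t))

def plateau(x, a, b, margin=0.5):
    """Smooth h with 0 <= h <= 1, h = 1 on [a, b], h = 0 outside (a - margin, b + margin)."""
    return smooth_step((x - (a - margin)) / margin) * smooth_step(((b + margin) - x) / margin)

def partition(x, intervals):
    """Return ([phi_1(x), ..., phi_k(x)], (1 - h_1(x)) ... (1 - h_k(x)))."""
    phis, running = [], 1.0          # running = (1 - h_1) ... (1 - h_{j-1})
    for a, b in intervals:
        h = plateau(x, a, b)
        phis.append(running * h)     # phi_j = (1 - h_1) ... (1 - h_{j-1}) h_j
        running *= 1.0 - h
    return phis, running

# Hypothetical sets W_1, W_2, W_3 whose union contains [0, 3].
intervals = [(0.0, 1.0), (0.8, 2.0), (1.9, 3.0)]
for x in [0.0, 0.5, 0.9, 1.5, 1.95, 2.7, 3.0, 3.2]:
    phis, running = partition(x, intervals)
    # Telescoping identity: phi_1 + ... + phi_k = 1 - (1 - h_1) ... (1 - h_k).
    assert abs(sum(phis) - (1.0 - running)) < 1e-12
    print(x, sum(phis))
```

At every sample point in $[0,3]$ some $h_j$ equals 1, so the printed sum is 1 there; at $x = 3.2$, which lies outside every $W_j$, the sum falls strictly between 0 and 1.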
To illustrate the power of this point of view, we now show that the integral can be extended from Jordan regions to open bounded sets, even though such sets are not always Jordan regions. This extension is a multidimensional version of the improper integral. (The proofs of Theorems 12.63 and 12.64 are based on Spivak [12].¹)
STRATEGY: The idea behind this extension is fairly simple. Let $V$ be a bounded open set and let $f$ be locally integrable on $V$; i.e., $f : V \to \mathbf{R}$ is integrable on every closed Jordan region $H \subset V$. For each $x \in V$, choose an open Jordan region $V(x)$ so small that $x \in V(x) \subset V$. (For example, $V(x)$ could be an open ball.) Then $\{V(x)\}_{x \in V}$ is an open covering of $V$, and by Lindelöf's Theorem it has a countable subcover, say $\mathcal{V} = \{V_j\}_{j \in \mathbf{N}}$. Let $\{\phi_j\}_{j \in \mathbf{N}}$ be a partition of unity on $V$ subordinate to $\mathcal{V}$. Since $f = \sum_{j=1}^\infty f\phi_j$ on $V$, it is natural to define
$$\int_V f(x)\,dx = \sum_{j=1}^\infty \int_{V_j} \phi_j(x) f(x)\,dx.$$
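Before we can proceed, we must answer two questions. Does this series converge? And if it does, will its value change when the partition of unity changes? The next two results answer these questions.
12.60 Lemma. Let $V$ be a bounded open set in $\mathbf{R}^n$ and let $\mathcal{V} = \{V_j\}_{j \in \mathbf{N}}$ be a sequence of nonempty open Jordan regions in $V$ that satisfies
$$V = \bigcup_{j=1}^\infty V_j.$$
Suppose that $f : V \to \mathbf{R}$ is bounded on $V$ and integrable on each $V_j$. If $\{\phi_j\}_{j \in \mathbf{N}}$ is a partition of unity on $V$ subordinate to $\mathcal{V}$, then the series
$$\sum_{j=1}^\infty \int_{V_j} \phi_j(x) f(x)\,dx \qquad (45)$$
converges absolutely.
PROOF. Let $M = \sup_{x \in V} |f(x)|$ and let $R$ be an $n$-dimensional rectangle that contains $V$. Since $\phi_j \ge 0$, $\operatorname{spt} \phi_j \subset V_j$, and $\sum_{j=1}^N \phi_j \le 1$ on $\mathbf{R}^n$, we have
$$\sum_{j=1}^N \left| \int_{V_j} \phi_j(x) f(x)\,dx \right| \le \sum_{j=1}^N \int_{V_j} |\phi_j(x) f(x)|\,dx \le M \int_R \sum_{j=1}^N \phi_j(x)\,dx \le M \operatorname{Vol}(R)$$
for every $N \in \mathbf{N}$.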
1 M. Spivak, Calculus on Manifolds (New York: W. A. Benjamin, Inc., 1965). Reprinted with permission of Addison-Wesley Publishing Company.
Therefore, the series in (45) converges absolutely. I The value of the series in (45) depends neither on the partition of unity chosen nor the covering V.
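12.61 Lemma. Let $V$ be a bounded, nonempty, open set in $\mathbf{R}^n$. Suppose that $\mathcal{V} = \{V_j\}_{j \in \mathbf{N}}$ and $\mathcal{W} = \{W_k\}_{k \in \mathbf{N}}$ are sequences of nonempty open Jordan regions in $\mathbf{R}^n$ such that
$$V = \bigcup_{j=1}^\infty V_j = \bigcup_{k=1}^\infty W_k.$$
Suppose further that $f : V \to \mathbf{R}$ is bounded and locally integrable on $V$. If $\{\phi_j\}_{j \in \mathbf{N}}$ is a partition of unity on $V$ subordinate to $\mathcal{V}$ and $\{\psi_k\}_{k \in \mathbf{N}}$ is a partition of unity on $V$ subordinate to $\mathcal{W}$, then
$$\sum_{j=1}^\infty \int_{V_j} \phi_j(x) f(x)\,dx = \sum_{k=1}^\infty \int_{W_k} \psi_k(x) f(x)\,dx. \qquad (46)$$
PROOF. By Lemma 12.60, both sums in (46) converge absolutely. By Exercise 5, $\{\phi_j \psi_k\}_{j,k \in \mathbf{N}}$ is a partition of unity on $V$ subordinate to the covering $\{V_j \cap W_k\}_{j,k \in \mathbf{N}}$. Thus
$$\sum_{j=1}^\infty \sum_{k=1}^\infty \int_{V_j \cap W_k} \phi_j(x)\psi_k(x) f(x)\,dx$$
also converges absolutely. Fix $j \in \mathbf{N}$. Since $\operatorname{spt} \phi_j$ is compact, choose $N \in \mathbf{N}$ so large that $\psi_k(x) = 0$ for $k > N$ and $x \in \operatorname{spt} \phi_j$. Hence,
$$\int_{V_j} \phi_j(x) f(x)\,dx = \int_{V_j} \phi_j(x) \sum_{k=1}^N \psi_k(x) f(x)\,dx = \sum_{k=1}^N \int_{V_j \cap W_k} \phi_j(x)\psi_k(x) f(x)\,dx = \sum_{k=1}^\infty \int_{V_j \cap W_k} \phi_j(x)\psi_k(x) f(x)\,dx.$$
Thus
$$\sum_{j=1}^\infty \int_{V_j} \phi_j(x) f(x)\,dx = \sum_{j=1}^\infty \sum_{k=1}^\infty \int_{V_j \cap W_k} \phi_j(x)\psi_k(x) f(x)\,dx.$$
Reversing the roles of $j$ and $k$ we also have
$$\sum_{k=1}^\infty \int_{W_k} \psi_k(x) f(x)\,dx = \sum_{k=1}^\infty \sum_{j=1}^\infty \int_{V_j \cap W_k} \phi_j(x)\psi_k(x) f(x)\,dx.$$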
Since these series are absolutely convergent, we may reverse the order of summation in the last double series. ∎
Using Lemma 12.61, we define the integral of a locally integrable function $f$ over a bounded open set $V$ as follows.
12.62 DEFINITION. Let $V$ be a bounded, nonempty, open set in $\mathbf{R}^n$ and let $f : V \to \mathbf{R}$ be bounded and locally integrable on $V$. The integral of $f$ on $V$ is defined to be
$$I_V(f) := \sum_{j=1}^\infty \int_{V_j} \phi_j(x) f(x)\,dx,$$
where $\{\phi_j\}_{j \in \mathbf{N}}$ is any partition of unity on $V$ subordinate to an open covering $\mathcal{V} = \{V_j\}_{j \in \mathbf{N}}$ such that each $V_j$ is a nonempty Jordan region and
$$V = \bigcup_{j=1}^\infty V_j.$$
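The following result shows that this definition agrees with the old one when $V$ is a Jordan region. Thus, we shall use the notation $\int_V f(x)\,dx$ for $I_V(f)$.
12.63 THEOREM. If $E$ is a nonempty, open Jordan region in $\mathbf{R}^n$ and $f : E \to \mathbf{R}$ is integrable on $E$, then
$$\int_E f(x)\,dx = I_E(f).$$
PROOF. Let $\varepsilon > 0$. Since $E$ is a Jordan region, choose a grid $\mathcal{G} = \{Q_1, \ldots, Q_p\}$ of some $n$-dimensional rectangle $R \supseteq E$ such that
$$\sum_{Q_\ell \cap \partial E \ne \emptyset} |Q_\ell| < \varepsilon. \qquad (47)$$
Let
$$H = \bigcup_{Q_\ell \subset E} Q_\ell.$$
Clearly, $H$ is compact and by (47), $\operatorname{Vol}(E \setminus H) < \varepsilon$ (see Exercise 4d, p. 393). Set $M = \sup_{x \in E} |f(x)|$. Let $\{R_j\}_{j \in \mathbf{N}}$ be a sequence of rectangles such that $R_j \subset E$ and $E = \bigcup_{j=1}^\infty R_j$, and let $\{\phi_j\}_{j \in \mathbf{N}}$ be a partition of unity on $E$ subordinate to $\mathcal{V} = \{R_j\}_{j \in \mathbf{N}}$. Since $H$ is compact, choose $N_1 \in \mathbf{N}$ such that $\phi_j(x) = 0$ for $j > N_1$ and $x \in H$. Then, for any $N \ge N_1$, we have
$$\left| \int_E f(x)\,dx - \sum_{j=1}^N \int_{R_j} \phi_j(x) f(x)\,dx \right| = \left| \int_E f(x)\,dx - \sum_{j=1}^N \int_E \phi_j(x) f(x)\,dx \right| \le \int_E \left| f(x) - \sum_{j=1}^N \phi_j(x) f(x) \right| dx \le M \int_{E \setminus H} \left| 1 - \sum_{j=1}^N \phi_j(x) \right| dx \le M \operatorname{Vol}(E \setminus H) < M\varepsilon.$$
We conclude that $I_E(f)$ exists and equals $\int_E f(x)\,dx$. ∎
We now prove a change-of-variables formula valid for all open bounded sets.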
12.64 THEOREM. Suppose that $V$ is a bounded, nonempty, open set in $\mathbf{R}^n$, that $\phi : V \to \mathbf{R}^n$ is 1-1 and continuously differentiable on $V$, and that $\phi(V)$ is bounded. If $\Delta_\phi \ne 0$ on $V$, then
$$\int_{\phi(V)} f(u)\,du = \int_V f(\phi(x))\,|\Delta_\phi(x)|\,dx$$
for all bounded $f : \phi(V) \to \mathbf{R}$, provided that $f$ is locally integrable on $\phi(V)$.
PROOF. For each $a \in V$, choose by Theorem 12.45 an open rectangle $W(a)$ such that $W(a) \subset V$ and
$$\int_{\phi(W(a))} f(u)\,du = \int_{W(a)} f(\phi(x))\,|\Delta_\phi(x)|\,dx. \qquad (48)$$
Set $\mathcal{W} = \{W(a)\}_{a \in V}$. Then $\mathcal{W}$ is an open covering of $V$. By Lindelöf's Theorem, we may assume that $\mathcal{W} = \{W_j\}_{j \in \mathbf{N}}$. Let $\{\phi_j\}_{j \in \mathbf{N}}$ be a partition of unity on $V$ subordinate to $\mathcal{W}$, i.e., a sequence of $C^\infty$ functions such that
$$\operatorname{spt} \phi_j \subset W_j \subset V, \quad j \in \mathbf{N}, \qquad \text{and} \qquad \sum_{j=1}^\infty \phi_j(x) = 1$$
for all $x \in V$. By Corollary 12.10, each $\phi(W_j)$ is a Jordan region. By Theorem 11.39, each $\phi(W_j)$ is open, and by Exercise 4, $\{\phi_j \circ \phi^{-1}\}_{j \in \mathbf{N}}$ is a partition of unity on $\phi(V)$ subordinate to the open covering $\{\phi(W_j)\}_{j \in \mathbf{N}}$. Hence, by Definition 12.62 and (48),
$$\int_{\phi(V)} f(u)\,du = \sum_{j=1}^\infty \int_{\phi(W_j)} (\phi_j \circ \phi^{-1})(u) f(u)\,du = \sum_{j=1}^\infty \int_{W_j} \phi_j(x) f(\phi(x))\,|\Delta_\phi(x)|\,dx = \int_V f(\phi(x))\,|\Delta_\phi(x)|\,dx. \ ∎$$
Finally, we are prepared to prove a change-of-variables formula for functions whose Jacobians are zero on a set of volume zero.
12.65 THEOREM [CHANGE OF VARIABLES FOR MULTIPLE INTEGRALS]. Suppose that $W$ is open in $\mathbf{R}^n$, that $\phi : W \to \mathbf{R}^n$ is continuously differentiable, and that $E$ is a Jordan region with $E \subset W$. If $\phi(E)$ is a Jordan region and if there exists a closed set $Z$ such that $E \cap Z$ is of volume zero, and such that $\phi$ is 1-1 and $\Delta_\phi(x) \ne 0$ for all $x \in E^o \setminus Z$, then
$$\int_{\phi(E)} f(u)\,du = \int_E f(\phi(x))\,|\Delta_\phi(x)|\,dx,$$
provided that $f$ is integrable on $\phi(E)$.
PROOF. Set $V := E^o \setminus Z$ and observe that $V$ is open and bounded. Since $\phi(E) \supseteq \phi(V)$, $\phi(V)$ is also bounded. By hypothesis, $E \setminus E^o \subseteq \partial E$ and $E \cap Z$ are of volume zero. Moreover, by Corollary 12.10, $\phi(E \setminus E^o)$ and $\phi(E \cap Z)$ are of volume zero. Since $E = V \cup (E \cap Z) \cup (E \setminus E^o)$ and $\phi(E) = \phi(V) \cup \phi(E \cap Z) \cup \phi(E \setminus E^o)$, it follows from Theorems 12.23, 12.24, and 12.64 that
$$\int_{\phi(E)} f(u)\,du = \int_{\phi(V)} f(u)\,du = \int_V f(\phi(x))\,|\Delta_\phi(x)|\,dx = \int_E f(\phi(x))\,|\Delta_\phi(x)|\,dx. \ ∎$$
We close by noting that as general as it is, even this result can be improved. If something called the Lebesgue integral is used instead of the Riemann integral, the condition that $\Delta_\phi \ne 0$ can be dropped altogether (see Spivak [12], p. 72).
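EXERCISES
1. If $f, g : \mathbf{R}^n \to \mathbf{R}$, prove that $\operatorname{spt}(fg) \subseteq \operatorname{spt} f \cap \operatorname{spt} g$.
2. Prove that if $f, g \in C_c^\infty(\mathbf{R}^n)$, then so are $fg$ and $\alpha f$ for any scalar $\alpha$.
*3. Prove that if $f$ is analytic on $\mathbf{R}$ and $f(x_0) \ne 0$ for some $x_0 \in \mathbf{R}$, then $f \notin C_c^\infty(\mathbf{R})$.
4. Suppose that $V$ is a bounded, open set in $\mathbf{R}^n$, and $\phi : V \to \mathbf{R}^n$ is 1-1 and continuously differentiable on $V$ with $\Delta_\phi \ne 0$ on $V$. Let $\mathcal{W} = \{W_j\}_{j \in \mathbf{N}}$ be an open covering of $V$ and $\{\phi_j\}_{j \in \mathbf{N}}$ be a $C^p$ partition of unity on $V$ subordinate to $\mathcal{W}$, where $p \ge 1$. Prove that $\{\phi_j \circ \phi^{-1}\}_{j \in \mathbf{N}}$ is a $C^1$ partition of unity on $\phi(V)$ subordinate to the open covering $\{\phi(W_j)\}_{j \in \mathbf{N}}$.
5. Let $V$ be open in $\mathbf{R}^n$ and $\mathcal{V} = \{V_j\}_{j \in \mathbf{N}}$, $\mathcal{W} = \{W_k\}_{k \in \mathbf{N}}$ be coverings of $V$. If $\{\phi_j\}_{j \in \mathbf{N}}$ is a $C^p$ partition of unity on $V$ subordinate to $\mathcal{V}$ and $\{\psi_k\}_{k \in \mathbf{N}}$ is a $C^p$ partition of unity on $V$ subordinate to $\mathcal{W}$, prove that $\{\phi_j \psi_k\}_{j,k \in \mathbf{N}}$ is a $C^p$ partition of unity on $V$ subordinate to the covering $\{V_j \cap W_k\}_{j,k \in \mathbf{N}}$.
6. Show that given any compact Jordan region $H \subset \mathbf{R}^n$, there is a sequence of $C^\infty$ functions $\phi_j$ such that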
$^e$12.6 GAMMA FUNCTION AND VOLUME
The last result of this section uses Dini's Theorem from Section 9.5.
In this section we introduce the gamma function and use it to find a formula for the volume of any $n$-dimensional ball and an asymptotic estimate of $n!$.
Recall that if $f : (0, \infty) \to \mathbf{R}$ is locally integrable on $(0, \infty)$, then
$$\int_0^\infty f(t)\,dt = \lim_{\substack{x \to 0+ \\ y \to \infty}} \int_x^y f(t)\,dt.$$
In particular, it is easy to check that $\int_0^\infty e^{-\alpha t}\,dt$ is finite for all $\alpha > 0$.
12.66 DEFINITION. The gamma function is defined by
$$\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt, \qquad x \in (0, \infty),$$
when this (improper) integral converges.
By definition,
$$\Gamma(1) = \int_0^\infty e^{-t}\,dt = 1,$$
and
$$\Gamma(1/2) = \int_0^\infty t^{-1/2} e^{-t}\,dt = 2 \int_0^\infty e^{-u^2}\,du = \sqrt{\pi}.$$
(We used the change of variables $t = u^2$ and Exercise 10, p. 431.) It turns out that $\Gamma(x)$ is defined for all $x \in (0, \infty)$.
12.67 THEOREM. For each $x \in (0, \infty)$, $\Gamma(x)$ exists and is finite, $\Gamma(x+1) = x\Gamma(x)$, and $\Gamma(n) = (n-1)!$ for $n \in \mathbf{N}$.
PROOF. Write
$$\Gamma(x) = \int_0^1 t^{x-1} e^{-t}\,dt + \int_1^\infty t^{x-1} e^{-t}\,dt =: I_1 + I_2.$$
By l'Hôpital's Rule,
$$\lim_{t \to \infty} e^{-t/2} t^y = 0$$
for all $y \in \mathbf{R}$. Hence, $e^{-t} t^{x-1} \le e^{-t/2}$ for $t$ large and it follows from Theorem 5.43 (the Comparison Theorem) that $I_2$ is finite for all $x \in \mathbf{R}$. To show that $I_1$ is finite for $x > 0$, suppose first that $x \ge 1$. Then $t^{x-1} \le 1$ for all $t \in [0,1]$ and
$$I_1 = \int_0^1 t^{x-1} e^{-t}\,dt \le \int_0^1 e^{-t}\,dt = 1 - \frac{1}{e} < \infty.$$
Therefore, $\Gamma(x)$ is finite for all $x \ge 1$. Next, suppose that $0 < x < 1$. Then $x + 1 \ge 1$, so $\Gamma(x+1)$ is finite. Integration by parts yields
$$\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt = \left. \frac{t^x e^{-t}}{x} \right|_{t=0}^{\infty} + \frac{1}{x} \int_0^\infty t^x e^{-t}\,dt = \frac{1}{x}\,\Gamma(x+1).$$
Therefore, $\Gamma(x)$ is finite when $0 < x < 1$. This argument also verifies $x\Gamma(x) = \Gamma(x+1)$ for $x \in (0, \infty)$. Since $\Gamma(1) = 1$, it follows that $\Gamma(n) = (n-1)!$ for all $n \in \mathbf{N}$. ∎
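The identities in Theorem 12.67 and the value $\Gamma(1/2) = \sqrt{\pi}$ can be spot-checked numerically. The sketch below compares a crude midpoint-rule approximation of the defining integral with Python's built-in `math.gamma`; the helper `gamma_numeric`, its truncation point, and its step count are illustrative assumptions, and the quadrature is only intended for $x \ge 1$, where the integrand is bounded near 0.

```python
import math

def gamma_numeric(x, upper=60.0, steps=200_000):
    """Midpoint-rule approximation of the integral of t^(x-1) e^(-t) over (0, upper).

    Assumes x >= 1; the tail beyond `upper` is negligible for moderate x.
    """
    h = upper / steps
    return h * sum(((k + 0.5) * h) ** (x - 1.0) * math.exp(-(k + 0.5) * h)
                   for k in range(steps))

# Defining integral versus the library value.
for x in (1.0, 2.5, 4.0):
    print(x, gamma_numeric(x), math.gamma(x))

# Gamma(1/2) = sqrt(pi), and Gamma(n) = (n - 1)! for n = 1, ..., 6.
print(math.gamma(0.5), math.sqrt(math.pi))
print([math.gamma(n) for n in range(1, 7)])

# Functional equation Gamma(x + 1) = x Gamma(x) at a non-integer point.
print(math.gamma(3.5), 2.5 * math.gamma(2.5))
```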
The gamma function can be used to evaluate certain integrals that cannot be evaluated by using elementary techniques of integration.
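12.68 THEOREM. If $x, y \in (0, \infty)$, then
$$\text{(i)} \qquad \int_0^1 v^{y-1} (1-v)^{x-1}\,dv = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)},$$
and
$$\text{(ii)} \qquad \int_0^{\pi/2} \cos^{2x-1}\varphi\, \sin^{2y-1}\varphi\,d\varphi = \frac{\Gamma(x)\Gamma(y)}{2\Gamma(x+y)}.$$
In particular,
$$\text{(iii)} \qquad \int_0^\pi \sin^{k-2}\varphi\,d\varphi = \frac{\Gamma((k-1)/2)\,\Gamma(1/2)}{\Gamma(k/2)}$$
holds for all integers $k > 2$.
PROOF. To prove part (i), make the change of variables $v = u/(1+u)$ and write
$$\int_0^1 v^{y-1} (1-v)^{x-1}\,dv = \int_0^\infty \left( \frac{u}{1+u} \right)^{y-1} \left( 1 - \frac{u}{1+u} \right)^{x-1} \frac{du}{(1+u)^2} = \int_0^\infty u^{y-1} \left( \frac{1}{1+u} \right)^{x+y} du.$$
It follows from two more changes of variables ($s = t/(1+u)$ and $w = su$) and Fubini's Theorem that
$$\Gamma(x+y) \int_0^1 v^{y-1} (1-v)^{x-1}\,dv = \int_0^\infty \int_0^\infty u^{y-1} \left( \frac{1}{1+u} \right)^{x+y} t^{x+y-1} e^{-t}\,dt\,du$$
$$= \int_0^\infty \int_0^\infty u^{y-1} s^{x+y-1} e^{-s(u+1)}\,ds\,du = \int_0^\infty s^{x-1} e^{-s} \left( \int_0^\infty u^{y-1} s^y e^{-su}\,du \right) ds$$
$$= \int_0^\infty s^{x-1} e^{-s} \left( \int_0^\infty w^{y-1} e^{-w}\,dw \right) ds = \Gamma(x)\Gamma(y).$$
To prove part (ii) use the change of variables $v = \sin^2\varphi$ and part (i) to verify
$$\int_0^{\pi/2} \cos^{2x-1}\varphi\, \sin^{2y-1}\varphi\,d\varphi = \frac{1}{2} \int_0^1 v^{y-1} (1-v)^{x-1}\,dv = \frac{\Gamma(x)\Gamma(y)}{2\Gamma(x+y)}.$$
Specializing to the case $y = (k-1)/2$ and $x = 1/2$, we obtain part (iii). ∎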
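As a sanity check on Theorem 12.68, parts (i) and (iii) can be compared against a simple quadrature. This is a rough sketch under stated assumptions: the function names `beta_integral`, `beta_gamma`, and `sine_integral` are hypothetical, the midpoint rule is used only for exponents that keep the integrand bounded ($x, y \ge 1$, $k \ge 3$), and `math.gamma` supplies the right-hand sides.

```python
import math

def beta_integral(x, y, steps=200_000):
    """Midpoint rule for the integral of v^(y-1) (1-v)^(x-1) over (0, 1); assumes x, y >= 1."""
    h = 1.0 / steps
    return h * sum(((k + 0.5) * h) ** (y - 1.0) * (1.0 - (k + 0.5) * h) ** (x - 1.0)
                   for k in range(steps))

def beta_gamma(x, y):
    """Right-hand side of part (i): Gamma(x) Gamma(y) / Gamma(x + y)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

def sine_integral(k, steps=200_000):
    """Midpoint rule for the integral of sin^(k-2) over (0, pi); assumes an integer k >= 3."""
    h = math.pi / steps
    return h * sum(math.sin((j + 0.5) * h) ** (k - 2) for j in range(steps))

# Part (i): for x = 2, y = 3 both sides equal 1/12.
print(beta_integral(2.0, 3.0), beta_gamma(2.0, 3.0))
print(beta_integral(1.5, 2.5), beta_gamma(1.5, 2.5))

# Part (iii): for k = 3 both sides equal 2.
for k in (3, 4, 5):
    rhs = math.gamma((k - 1) / 2.0) * math.gamma(0.5) / math.gamma(k / 2.0)
    print(k, sine_integral(k), rhs)
```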
The connection between the gamma function and volume is contained in the following result.
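12.69 THEOREM. If $r > 0$ and $a \in \mathbf{R}^n$, then
$$\operatorname{Vol}(B_r(a)) = \frac{2 r^n \pi^{n/2}}{n\,\Gamma(n/2)}.$$
PROOF. By translation invariance (see Exercise 3, p. 393) and Theorem 12.22, $\operatorname{Vol}(B_r(a)) = \int_B 1\,dx$ for $B = B_r(0)$. We suppose for simplicity that $n \ge 2$, and introduce a change of variables in $\mathbf{R}^n$ analogous to spherical coordinates. Namely, let
$$x_1 = \rho \cos\varphi_1, \quad x_2 = \rho \sin\varphi_1 \cos\varphi_2, \quad x_3 = \rho \sin\varphi_1 \sin\varphi_2 \cos\varphi_3, \quad \ldots,$$
$$x_{n-1} = \rho \sin\varphi_1 \cdots \sin\varphi_{n-2} \cos\theta, \quad \text{and} \quad x_n = \rho \sin\varphi_1 \cdots \sin\varphi_{n-2} \sin\theta,$$
where $0 \le \rho \le r$, $0 \le \theta \le 2\pi$, and $0 \le \varphi_j \le \pi$ for $j = 1, \ldots, n-2$. The Jacobian of this change of variables is
$$\Delta = \rho^{n-1} \sin^{n-2}\varphi_1\, \sin^{n-3}\varphi_2 \cdots \sin\varphi_{n-2}. \qquad (49)$$
Hence, by a change of variables and Theorem 12.68iii,
$$\operatorname{Vol}(B_r(a)) = \int_B 1\,dx = \int_0^r \int_0^\pi \cdots \int_0^\pi \int_0^{2\pi} \rho^{n-1} \sin^{n-2}\varphi_1 \cdots \sin\varphi_{n-2}\,d\theta\,d\varphi_1 \cdots d\varphi_{n-2}\,d\rho$$
$$= \frac{2\pi r^n}{n} \cdot \frac{\Gamma((n-1)/2)\,\Gamma(1/2)}{\Gamma(n/2)} \cdot \frac{\Gamma((n-2)/2)\,\Gamma(1/2)}{\Gamma((n-1)/2)} \cdots \frac{\Gamma(1)\,\Gamma(1/2)}{\Gamma(3/2)}.$$
Canceling all superfluous factors and substituting the value $\sqrt{\pi}$ for $\Gamma(1/2)$, we conclude that
$$\operatorname{Vol}(B_r(a)) = \frac{2\pi r^n}{n} \cdot \frac{\Gamma^{n-2}(1/2)}{\Gamma(n/2)} = \frac{2 r^n \pi^{n/2}}{n\,\Gamma(n/2)}. \ ∎$$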
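This formula agrees with what we already know. For $n = 1$ we have
$$\operatorname{Vol}(B_r(0)) = \frac{2 r \pi^{1/2}}{\Gamma(1/2)} = 2r,$$
for $n = 2$ we have
$$\operatorname{Vol}(B_r(0)) = \frac{2 r^2 \pi}{2\,\Gamma(1)} = \pi r^2,$$
and for $n = 3$ we have
$$\operatorname{Vol}(B_r(0)) = \frac{2 r^3 \pi^{3/2}}{(3/2)\,\Gamma(1/2)} = \frac{4}{3}\pi r^3.$$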
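The formula of Theorem 12.69 is easy to evaluate directly. The short sketch below (the function name `ball_volume` is an illustrative choice, not notation from the text) reproduces the three familiar cases above and the four-dimensional value $\pi^2 r^4/2$.

```python
import math

def ball_volume(n, r):
    """Volume of an n-dimensional ball of radius r: 2 r^n pi^(n/2) / (n Gamma(n/2))."""
    return 2.0 * r ** n * math.pi ** (n / 2.0) / (n * math.gamma(n / 2.0))

r = 1.5
print(ball_volume(1, r), 2.0 * r)                         # length of an interval
print(ball_volume(2, r), math.pi * r ** 2)                # area of a disk
print(ball_volume(3, r), 4.0 * math.pi * r ** 3 / 3.0)    # volume of a ball
print(ball_volume(4, r), math.pi ** 2 * r ** 4 / 2.0)     # four-dimensional ball
```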
We close this section with an asymptotic estimate of $n!$. First, we obtain an integral representation for $n!/(n^{n+1/2} e^{-n})$.
12.70 Lemma. If $\phi(x) = x - \log x - 1$, $x > 0$, then
$$\frac{n!}{n^{n+1/2} e^{-n}} = \int_{-\sqrt{n}}^\infty e^{-n\phi(1 + t/\sqrt{n})}\,dt.$$
PROOF. By Definition 12.66 and Theorem 12.67, we can write
$$n! = \Gamma(n+1) = \int_0^\infty x^n e^{-x}\,dx.$$
Making two changes of variables (first $x = ny$, then $y = 1 + t/\sqrt{n}$), we conclude that
$$\frac{n!}{n^{n+1/2} e^{-n}} = \frac{1}{\sqrt{n}} \int_0^\infty \left( \frac{x}{n} \right)^n e^{-x+n}\,dx = \sqrt{n} \int_0^\infty y^n e^{-n(y-1)}\,dy = \sqrt{n} \int_0^\infty e^{-n\phi(y)}\,dy = \int_{-\sqrt{n}}^\infty e^{-n\phi(1 + t/\sqrt{n})}\,dt. \ ∎$$
Next, we derive some inequalities that will be used, in conjunction with Dini's Theorem, to evaluate the limit of the integral that appears in Lemma 12.70.
12.71 Lemma. If $\phi(x) = x - \log x - 1$, $x > 0$, then
$$(x-1)\phi'(x) - 2\phi(x) > 0 \quad \text{for } 0 < x < 1 \qquad \text{and} \qquad (x-1)\phi'(x) - 2\phi(x) < 0 \quad \text{for } x > 1.$$
Moreover, there is an absolute constant $M > 0$ such that
$$\phi(x) \ge M(x-1)^2 \quad \text{for } 0 < x < 2 \qquad (50)$$
and
$$\phi(x) \ge M(x-1) \quad \text{for } x \ge 2. \qquad (51)$$
PROOF. Let $\psi(x) = 2\log x - x + 1/x$ and observe that $(x-1)\phi'(x) - 2\phi(x) = \psi(x)$. Since $\psi'(x) = -(x-1)^2/x^2 < 0$ for all $x \ne 1$, $\psi$ is decreasing on $(0, \infty)$. Since $\psi(1) = 0$, it follows that $\psi > 0$ on $(0,1)$ and $\psi < 0$ on $(1, \infty)$. This proves the first pair of inequalities.
To prove the second pair of inequalities, observe first that by Taylor's Formula,
$$\phi(x) = \phi(1) + \phi'(1)(x-1) + \phi''(c)\frac{(x-1)^2}{2!} = \frac{(x-1)^2}{2c^2}$$
for some $c$ between $x$ and 1. Thus $\phi(x) \ge (x-1)^2/8$ for all $0 < x < 2$. Next, observe, since $\phi(x) > 0$ for $x > 1$ and $\phi(x)/(x-1) \to 1$ as $x \to \infty$, that $\phi(x)/(x-1)$ has a positive minimum, say $m$, on $[2, \infty)$. Thus (50) and (51) hold for $M := \min\{m, 1/8\}$. ∎
Our final preliminary result evaluates the limit of the integral that appears in Lemma 12.70.
12.72 Lemma. If $\phi(x) = x - \log x - 1$ for $x > 0$, and $F_n(t) = e^{-n\phi(1 + t/\sqrt{n})}$ for $n \in \mathbf{N}$ and $t > -\sqrt{n}$, then
$$\lim_{n \to \infty} \int_{-\sqrt{n}}^\infty F_n(t)\,dt = \int_{-\infty}^\infty e^{-t^2/2}\,dt.$$
STRATEGY: The idea behind the proof is simple. By l'Hôpital's Rule,
$$\lim_{n \to \infty} n\phi\left( 1 + \frac{t}{\sqrt{n}} \right) = \lim_{n \to \infty} \frac{t}{2} \cdot \frac{\phi'(1 + t/\sqrt{n})}{1/\sqrt{n}} = \lim_{n \to \infty} \frac{t^2}{2}\,\phi''\left( 1 + \frac{t}{\sqrt{n}} \right) = \frac{t^2}{2},$$
so $F_n(t) \to e^{-t^2/2}$, as $n \to \infty$, for every $t \in \mathbf{R}$. Thus $\int_{-\sqrt{n}}^\infty F_n(t)\,dt$ should converge to $\int_{-\infty}^\infty e^{-t^2/2}\,dt$ as $n \to \infty$. To prove this, we break the integral over $(-\sqrt{n}, \infty)$ into three pieces: one over $(-\sqrt{n}, -\sqrt{a})$, one over $(\sqrt{a}, \infty)$, and one over $(-\sqrt{a}, \sqrt{a})$. Since $e^{-t^2/2}$ is integrable on $(-\infty, \infty)$, the first two integrals should be small for $a$ sufficiently large. Once $a$ is fixed, we shall use Dini's Theorem on the third integral. Here are the details.
PROOF. Let 1£
> 0 and observe that
for any a > 0 and n EN, provided that n > a. Hence, it suffices to prove that IIj I :::; 1£/4 for j = 1, 2, 3, 4, and n, a sufficiently large. Let M be the constant given in Lemma 12.71, and choose a > 0 so large that oo 1£ 2 1£ e- M t dt < e- Mt dt < -, (52) Va 4 Itl~Va 4'
J
1
and (53) By (53), II21 < 1£/4. To estimate IIjl for j ¥- 2, fix t > -Va and consider the function G(x) e- x q,(1+ t /yx), x > O. By the Product Rule,
G'(x) = e- x q,(1+ t /yx) (_t_ ¢' 2yX
(1 + yX
_t ) _ ¢
e-xq,(y)
=
2
((y -l)¢'(y) - 2¢(y)),
(1 + yX
_t ))
where y = 1 + t/y'x. Thus by Lemma 12.71, G'(x) > 0 for x > a, -Va < t < 0, and G'(x) < 0 for x > 0, t > O. It follows that for each t E (-Va, 0), Fn(t) i e- t2 / 2 as n --+ 00, and for each t E (0,00), Fn(t)! e- t2 / 2 as n --+ 00. Hence, by Dini's Theorem (Theorem 9.41),
as n --+ 00. Thus, we can choose an N E N so large that n ~ N implies remains to estimate IIj I for j = 3,4. To this end, let n > max{N, a}. By (50) and (51), for and n¢
(1 + In) ~ In ~ nM
Mt
Ih I < c.
It
n!
~
-.;n < t < .;n for t ~
.;n.
Since n > a, it follows that
We conclude by (52) that
1131 + 1141 < c/2, as required. I
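The pointwise behavior used in the proof of Lemma 12.72 can be observed numerically: for fixed $t < 0$ the values $F_n(t)$ increase toward $e^{-t^2/2}$, and for fixed $t > 0$ they decrease toward it. The sketch below only samples a few values of $n$ and $t$; the function names `phi` and `F` mirror the notation of the lemma, and the sample points are arbitrary choices.

```python
import math

def phi(x):
    """phi(x) = x - log x - 1 for x > 0."""
    return x - math.log(x) - 1.0

def F(n, t):
    """F_n(t) = exp(-n phi(1 + t / sqrt(n))), defined for t > -sqrt(n)."""
    return math.exp(-n * phi(1.0 + t / math.sqrt(n)))

for t in (-1.0, 1.0):
    limit = math.exp(-t * t / 2.0)
    values = [F(n, t) for n in (4, 25, 100, 10_000, 1_000_000)]
    # Increasing in n when t < 0, decreasing in n when t > 0.
    print(t, values, limit)
```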
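12.73 THEOREM [STIRLING'S FORMULA]. For $n \in \mathbf{N}$ sufficiently large, $n! \approx \sqrt{2\pi}\,(n^{n+1/2}) e^{-n}$; i.e.,
$$\lim_{n \to \infty} \frac{n!}{\sqrt{2\pi}\,(n^{n+1/2}) e^{-n}} = 1.$$
PROOF. By Exercise 10, p. 431, and the change of variables $t = \sqrt{2}\,u$, we have
$$\int_{-\infty}^\infty e^{-t^2/2}\,dt = \sqrt{2} \int_{-\infty}^\infty e^{-u^2}\,du = \sqrt{2\pi}.$$
Therefore, it follows from Lemmas 12.70 and 12.72 that
$$\lim_{n \to \infty} \frac{n!}{\sqrt{2\pi}\,n^{n+1/2} e^{-n}} = \lim_{n \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-\sqrt{n}}^\infty e^{-n\phi(1 + t/\sqrt{n})}\,dt = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty e^{-t^2/2}\,dt = 1. \ ∎$$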
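Stirling's Formula can be illustrated by computing the ratio $n!/(\sqrt{2\pi}\,n^{n+1/2}e^{-n})$ for growing $n$. The sketch below works with logarithms (via `math.lgamma`) so that large factorials do not overflow; the name `stirling_ratio` is an illustrative choice.

```python
import math

def stirling_ratio(n):
    """n! / (sqrt(2 pi) n^(n + 1/2) e^(-n)), computed through logarithms."""
    log_factorial = math.lgamma(n + 1)                        # log(n!)
    log_stirling = 0.5 * math.log(2.0 * math.pi) + (n + 0.5) * math.log(n) - n
    return math.exp(log_factorial - log_stirling)

for n in (1, 10, 100, 1000, 10_000):
    print(n, stirling_ratio(n))
```

The printed ratios decrease toward 1, consistent with the limit in Theorem 12.73.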
EXERCISES 1. Show that
2. Show that
(I ----::::=d=x= = ,Jrr.
10
1:
3. Show that
y'-logx e7rt -e' dt = r(ll').
4. Show that the volume of a four-dimensional ball of radius $r$ is $\pi^2 r^4/2$, and the volume of a five-dimensional ball of radius $r$ is $8\pi^2 r^5/15$. 5. Verify (49). 6. For $n > 2$, prove that the volume of the $n$-dimensional ellipsoid
is
2
n/2
Vol(E)= al···an ll' nr(n/2)
.
7. For n > 2, prove that the volume of the n-dimensional cone
C = {(Xl,'" ,Xn )
:
is Vol (C)
=
(h/r)Jx~ + ... + X~ ~ Xl ~ h} 2hr n - 11l'(n-I)/2 n(n _ l)r((n - 1)/2)'
8. Find the value of
for each kEN. 9. If 1 : BI (0) --+ R is differentiable with
1(0)=0
and
11V'I(x)ll~l
for x E BI (0), prove that the following exists and equals O. lim ( k--+oo
10. (a) Prove that
r
1B1 (0)
II(xW dx.
is differentiable on (0,00) with
r'(x) =
1
00
e-ttX-Ilogtdt.
* (b) Prove that r is Coo and convex on (0, 00 ) .
Chapter 13
Fundamental Theorems of Vector Calculus
13.1 CURVES
According to the dictionary, a curve is a smooth line that bends, without corners: a one-dimensional object with length but no breadth. Of course, this definition is too imprecise. It is also too restrictive. Our concept of a curve will include not only "smooth" objects such as the graphs of the function $y = x^2$ and the relation $x^2 + y^2 = 1$, but also objects with "corners," such as the graph of $y = |x|$.
Recall that if $I \subseteq \mathbf{R}$ and $\phi : I \to \mathbf{R}^m$, then the image of $I$ under $\phi$ is the set
$$\phi(I) = \{x \in \mathbf{R}^m : x = \phi(t) \text{ for some } t \in I\}.$$
Also recall that given $a, b \in \mathbf{R}^m$ with $b \ne 0$, the image of $\mathbf{R}$ under $\phi(t) := a + tb$ is the straight line through $a$ in the direction of $b$. This is the simplest type of curve in $\mathbf{R}^m$.
A naive attempt to define a general curve in $\mathbf{R}^m$ is to insist that it simply be the image of an interval under some continuous function $\phi : \mathbf{R} \to \mathbf{R}^m$. It turns out that this definition is too broad. There are continuous functions (called "space-filling curves") which take the unit interval $[0,1]$ onto the unit square $[0,1] \times [0,1]$ (see Boas [2]). One way to fix this definition is to use homeomorphisms, i.e., continuous functions whose inverses are also continuous. Since we are interested primarily in the differential structure of curves, we take a different approach, using differentiable functions to define curves (see Definition 13.1).
We begin by extending the definition of partial differentiation to include functions defined on nonopen domains. Let $m, n, p \in \mathbf{N}$, and $E$ be a nonempty subset of $\mathbf{R}^n$. A function $f : E \to \mathbf{R}^m$ is said to be $C^p$ (on $E$) if and only if there is an open set $V \supseteq E$ and a function $g : V \to \mathbf{R}^m$ whose partial derivatives of orders $j \le p$ exist and are continuous on $V$ such that $f(x) = g(x)$ for all $x \in E$. In this case we define the partial derivatives of $f$ to be equal to the partial derivatives of $g$; e.g., $\partial f_j/\partial x_k(x) = \partial g_j/\partial x_k(x)$ for $k = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, and $x \in E$. A function $f : E \to \mathbf{R}^m$ is said to be $C^\infty$ (on $E$) if and only if $f$ is $C^p$ on $E$ for all $p \in \mathbf{N}$. Notice that this agrees with Definition 4.6 when $n = 1$. Also notice that the Mean Value Theorem and the Inverse Function Theorem hold for functions in $C^1(E)$. Henceforth, $p$ will denote an element of $\mathbf{N}$ or the extended real number $\infty$.
13.1 DEFINITION. A subset $C$ of $\mathbf{R}^m$ is called a $C^p$ curve (in $\mathbf{R}^m$) if and only if there is a nondegenerate interval $I$ (bounded or unbounded) and a $C^p$ function $\phi : I \to \mathbf{R}^m$ such that $\phi$ is 1-1 on $I^o$ and $C = \phi(I)$. In this case, the pair $(\phi, I)$ is called a parametrization of $C$, and $C$ is called the trace of $(\phi, I)$. The equations
$$x_j = \phi_j(t), \qquad t \in I, \quad j = 1, \ldots, m,$$
are called the parametric equations of $C$ induced by the parametrization $(\phi, I)$. Thus the straight line through $a$ in the direction of $b$ is a $C^\infty$ curve with parametrization $\phi(t) := a + tb$, $I = \mathbf{R}$. For most applications, we must assume more about curves.
13.2 DEFINITION. A $C^p$ curve is called an arc if and only if it has a parametrization $(\phi, I)$ where $I = [a, b]$ for some $a, b \in \mathbf{R}$. In this case, we shall call $\phi(a)$ and $\phi(b)$ the endpoints of $C$. An arc is said to be closed if and only if $\phi(a) = \phi(b)$. Thus $L(a; b)$, the line segment from $a$ to $b$, is an arc (see p. 229). The circle $x^2 + y^2 = a^2$ is an example of a closed arc (see Example 13.4). A closed arc is said to be simple if and only if it does not intersect itself except possibly at its endpoints. Simple closed arcs are also called Jordan curves because of the Jordan Curve Theorem. This theorem states that every simple closed arc $C$ in $\mathbf{R}^2$ separates $\mathbf{R}^2$ into two pieces, a bounded connected set $E$ and an unbounded connected set $\Omega$, where $\partial E = \partial\Omega = C$ (see Griffiths [3]). (It is interesting to note that the set $E$ is not necessarily a Jordan region. This fact was discovered by W. F. Osgood.¹)
Before we start developing a theory of curves, we look at several additional examples to see how broad Definitions 13.1 and 13.2 really are. First, we show that curves, as defined in Definition 13.1, include graphs of $C^p$ real functions.
13.3 Example. Let $I$ be an interval and let $f : I \to \mathbf{R}$ be a $C^p$ function. Prove that the graph of $y = f(x)$ on $I$ is a $C^p$ curve in $\mathbf{R}^2$.
PROOF. Let $\phi(t) = (t, f(t))$. Then $\phi$ is $C^p$ and 1-1 on $I$, and $\phi(I)$ is the graph of $y = f(x)$ as $x$ varies over $I$. (We shall call $(\phi, I)$ the trivial parametrization of $y = f(x)$.) ∎
¹ "A Jordan Curve of Positive Area," Transactions of the American Mathematical Society, vol. 4 (1903), pp. 107-112.
[Figure 13.1]
By an explicit curve we mean a curve of the form $\phi(I)$, where $\phi(t) = (t, f(t))$ (respectively, $\phi(t) = (f(t), t)$) for some $C^p$ function $f : I \to \mathbf{R}$. Notice, then, that an explicit curve is a set of points $(x, y)$ that satisfy $y = f(x)$ (respectively, $x = f(y)$) for some real $C^p$ function $f$. We have just proved that every explicit curve is a curve in $\mathbf{R}^2$. The following result shows that the converse of this statement is false.
13.4 Example. Prove that the circle $x^2 + y^2 = a^2$ is a $C^\infty$ Jordan curve in $\mathbf{R}^2$.
PROOF. This circle can be described in polar coordinates by $r = a$, i.e., in rectangular coordinates by $x = a\cos\theta$, $y = a\sin\theta$. Set $\phi(t) = (a\cos t, a\sin t)$ and $I = [0, 2\pi]$. Then $\phi$ is $C^\infty$, 1-1 on $[0, 2\pi)$, and $\phi(I)$ is the set of points $(x, y) \in \mathbf{R}^2$ such that $x^2 + y^2 = a^2$. (The trace of this parametrization is sketched in Figure 13.1. The arrow shows the direction the point $\phi(t)$ moves as $t$ grows larger. For example, $\phi(0) = (a, 0)$ and $\phi(\pi/2) = (0, a)$.) ∎
Recall that the graph of a $C^p$ function on an interval is "smooth," i.e., has a tangent line at each of its points. The following example shows that this is not the case for a general $C^p$ curve.
13.5 Example. Let $\phi(t) = (\cos^3 t, \sin^3 t)$ and $I = [0, 2\pi]$. Show that $(\phi, I)$ is a parametrization of a $C^\infty$ Jordan curve in $\mathbf{R}^2$ that has "corners." (This curve is called an astroid.)
PROOF. Clearly, $\phi$ is $C^\infty$ on $I$ and 1-1 on $[0, 2\pi)$. Let $x = \cos^3 t$ and $y = \sin^3 t$ and observe by a double-angle formula that
$$x^2 + y^2 = \cos^6 t + \sin^6 t = 1 - 3\sin^2 t \cos^2 t = 1 - \frac{3}{4}\sin^2(2t).$$
Hence, $\sqrt{x^2 + y^2}$ varies from a maximum of 1 (attained when $t = 0, \pi/2, \pi, 3\pi/2, 2\pi$) to a minimum of $1/2$ (attained when $t = \pi/4, 3\pi/4, 5\pi/4, 7\pi/4$). Since $I$ is connected and $\phi$ is differentiable, hence, continuous, the set $\phi(I)$ must also be connected. Plotting a few points, we see that $\phi(I)$ is a four-cornered star, starting at $(1, 0)$ and moving in a counterclockwise direction from $\partial B_1(0,0)$ to $\partial B_{1/2}(0,0)$ and back again (see Figure 13.2). As $t$ runs from 0 to $2\pi$, this curve makes one complete circuit. ∎
[Figure 13.2]
We have enough examples to begin to explore the theory of curves. First, we define the "length" of a curve. (For a geometric justification of this definition, see Theorem 13.17.)
13.6 DEFINITION. Let $C$ be a $C^p$ arc and $(\phi, I)$ be one of its parametrizations. The arc length of $C$, as measured by $(\phi, I)$, is defined to be
$$L(C) := \int_I \|\phi'(t)\|\,dt.$$
For example, let $(\phi, I)$ be the parametrization of the circle of radius $a$ given by Example 13.4. Since $\|\phi'(t)\| = a$ for all $t \in [0, 2\pi]$, it is easy to check that $L(C) = 2\pi a$, exactly what it should be. This also demonstrates why we insisted that parametrizations be 1-1 on the interior of their domains. If $\phi$ were not 1-1 on $(0, 2\pi)$, some part of the circle might be traced more than once, giving an inflated value of its arc length.
In general, if $\phi'$ is continuous on a closed, bounded interval $I$, then $\|\phi'\|$ is integrable on $I$; hence, $L(C)$ is finite for any parametrization of a $C^p$ arc $C$. This is not necessarily the case if $C$ is merely the continuous image of an interval (the space-filling curve is continuous but its length is infinite) or if $C$ is the image of an open interval (see Exercise 4).
When $C$ is an explicit curve, say $y = f(x)$ on $[a, b]$, and $(\phi, I)$ is the trivial parametrization, Definition 13.6 becomes
$$L(C) = \int_a^b \sqrt{1 + (f'(x))^2}\,dx.$$
This agrees with the formula for arc length introduced in elementary calculus texts.
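Definition 13.6 lends itself to a direct numerical check. The sketch below approximates $\int_I \|\phi'(t)\|\,dt$ with a midpoint rule for three parametrizations: the circle of Example 13.4 (with the radius arbitrarily set to 2), the astroid of Example 13.5, and the trivial parametrization of $y = x^2$ on $[0,1]$. The helper names and step counts are assumptions made for illustration.

```python
import math

def arc_length(dphi, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of ||phi'(t)|| over [a, b]."""
    h = (b - a) / steps
    return h * sum(math.hypot(*dphi(a + (k + 0.5) * h)) for k in range(steps))

def circle_prime(t, radius=2.0):
    """Derivative of phi(t) = (radius cos t, radius sin t)."""
    return (-radius * math.sin(t), radius * math.cos(t))

def astroid_prime(t):
    """Derivative of phi(t) = (cos^3 t, sin^3 t)."""
    return (-3.0 * math.cos(t) ** 2 * math.sin(t), 3.0 * math.sin(t) ** 2 * math.cos(t))

def parabola_prime(t):
    """Derivative of the trivial parametrization phi(t) = (t, t^2)."""
    return (1.0, 2.0 * t)

circle_len = arc_length(circle_prime, 0.0, 2.0 * math.pi)
astroid_len = arc_length(astroid_prime, 0.0, 2.0 * math.pi)
parabola_len = arc_length(parabola_prime, 0.0, 1.0)

print(circle_len, 4.0 * math.pi)      # 2 pi a with a = 2
print(astroid_len)                    # ||phi'(t)|| = 3 |sin t cos t|, which integrates to 6
print(parabola_len, (2.0 * math.sqrt(5.0) + math.asinh(2.0)) / 4.0)
```

For the parabola, the comparison value is the closed form of $\int_0^1 \sqrt{1 + 4x^2}\,dx$, which is the explicit-curve formula above with $f(x) = x^2$.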
Before we continue, it is important to realize that even the simplest curve can have many different parametrizations. For example, the line segment $\{(x, y) \in \mathbf{R}^2 : y = x,\ 0 < x \le 1\}$ is the trace of $\phi(t) = (t, t)$ on $(0, 1]$, of $\psi(t) = (t/2, t/2)$ on $(0, 2]$, and of $\sigma(t) = (1/t, 1/t)$ on $[1, \infty)$. Although these functions trace the same line segment, each of them traces it differently. The function $\psi$ traces the line "twice as slowly" as $\phi$, and $\sigma$ traces the line "backward" from $\phi$. Therefore, a parametrization $(\phi, I)$ of $C$ is a specific way of tracing out the points in $C$.
At this point, it is natural to ask: Does the value of arc length, $L(C)$, remain the same if we change parametrizations of $C$? To answer this question, we begin by showing that any two parametrizations of the same arc are related by a one-dimensional change of variables $\tau$.
13.7 Remark. Let $I$, $J$ be closed bounded intervals and let $\phi : I \to \mathbf{R}^m$ be 1-1 and continuous. Then $\phi(I) = \psi(J)$ for some continuous $\psi : J \to \mathbf{R}^m$ if and only if there is a continuous function $\tau$ from $J$ onto $I$ such that $\psi = \phi \circ \tau$.
PROOF. Since $I$ is closed and bounded and $\phi$ is 1-1 and continuous on $I$, $\phi^{-1}$ is continuous from $\phi(I)$ onto $I$ (see Theorem 9.33 or 10.64). Since $\psi(J) = \phi(I)$, it follows that $\tau := \phi^{-1} \circ \psi$ is continuous from $J$ onto $I$. Conversely, if $\tau$ is any continuous function from $J$ onto $I$, then $\psi = \phi \circ \tau$ is continuous from $J$ onto $\phi(I)$; i.e., $\psi(J) = \phi(I)$. ∎
Thus if $(\phi, I)$ and $(\psi, J)$ are $C^p$ parametrizations of the same arc and $\phi$ is 1-1, then there is a continuous function $\tau : J \to I$, called the transition from $J$ to $I$, such that $\psi = \phi \circ \tau$, or equivalently, $\tau = \phi^{-1} \circ \psi$. It follows that if the transition is $C^p$, hence differentiable, then by the Chain Rule,
$$\psi'(u) = \phi'(\tau(u))\,\tau'(u), \qquad u \in J. \qquad (1)$$
We are prepared to prove that the definition of arc length does not depend on the parametrization chosen provided that the transition has a nonzero derivative.
13.8 Remark. If $(\phi, I)$ and $(\psi, J)$ are $C^p$ parametrizations of the same arc, if $\psi = \phi \circ \tau$, where $\tau$ takes $J$ onto $I$ and satisfies $\tau'(u) \ne 0$ for all $u \in J$, then
$$\int_I \|\phi'(t)\|\,dt = \int_J \|\psi'(u)\|\,du.$$
PROOF. By hypothesis, $\tau(J) = I$. Hence, it follows from (1) and the Change-of-Variables Formula (Theorem 12.46) that
$$\int_I \|\phi'(t)\|\,dt = \int_{\tau(J)} \|\phi'(t)\|\,dt = \int_J \|\phi'(\tau(u))\|\,|\tau'(u)|\,du = \int_J \|\psi'(u)\|\,du. \ ∎$$
We note that the condition $\tau' \ne 0$ can be relaxed at finitely many points in $J$ (see Exercise 8). One productive way to think about different parametrizations of a curve $C$ is to interpret $\phi(t)$ as the position of a particle moving along $C$ at time $t$. Different parametrizations of $C$ represent different flight plans, some faster, some slower, some forward, and some backward, but all tracing out the same set of points.
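Remark 13.8 can be illustrated numerically. In the sketch below, the unit circle is traced by $\phi(t) = (\cos t, \sin t)$ on $I = [0, 2\pi]$ and by $\psi = \phi \circ \tau$ on $J = [0, 1]$, where the transition $\tau(u) = \pi(u + u^2)$ is a hypothetical choice with $\tau(J) = I$ and $\tau' > 0$; the derivative of $\psi$ is computed from the chain rule (1). All names are assumptions made for illustration.

```python
import math

def length(speed, a, b, steps=50_000):
    """Midpoint-rule approximation of the integral of `speed` over [a, b]."""
    h = (b - a) / steps
    return h * sum(speed(a + (k + 0.5) * h) for k in range(steps))

def phi_prime(t):
    """Derivative of phi(t) = (cos t, sin t) on I = [0, 2 pi]."""
    return (-math.sin(t), math.cos(t))

def tau(u):
    """Transition from J = [0, 1] onto I; tau(0) = 0 and tau(1) = 2 pi."""
    return math.pi * (u + u * u)

def tau_prime(u):
    """tau'(u) = pi (1 + 2u), which is positive on J."""
    return math.pi * (1.0 + 2.0 * u)

def psi_prime(u):
    """Chain rule (1): psi'(u) = phi'(tau(u)) tau'(u)."""
    dx, dy = phi_prime(tau(u))
    return (dx * tau_prime(u), dy * tau_prime(u))

L_phi = length(lambda t: math.hypot(*phi_prime(t)), 0.0, 2.0 * math.pi)
L_psi = length(lambda u: math.hypot(*psi_prime(u)), 0.0, 1.0)
print(L_phi, L_psi, 2.0 * math.pi)
```

Both computed lengths agree with $2\pi$ even though $\psi$ traverses the circle at a varying speed.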
[Figure 13.3]
13.9 Remark. Let $(\phi, I)$ be a parametrization of a $C^p$ curve, and let $x_0 = \phi(t_0)$ for some $t_0 \in I^o$. If $\phi(t)$ represents the position of a moving particle at time $t$, then $\|\phi'(t_0)\|$ is the speed of that particle at position $x_0$ and, when $\phi'(t_0) \ne 0$, $\phi'(t_0)$ is a vector that points in the direction of flight at $x_0$.
PROOF. Let $t_0 \in I^o$ and notice that, for each sufficiently small $h > 0$, the quotient
$$\frac{\phi(t_0 + h) - \phi(t_0)}{h}$$
is a vector that points in the direction of flight along the curve $C$ (see Figure 13.3). To calculate the speed of the particle, define the natural parameter of the curve $C := \phi(I)$ by
$$s := \ell(t) := \int_a^t \|\phi'(u)\|\,du, \qquad t \in [a, b]. \qquad (2)$$
By the Fundamental Theorem of Calculus, $ds/dt = \ell'(t) = \|\phi'(t)\|$. Thus, the change of arc length $s$ with respect to time at $t_0$; i.e., the speed of the particle at $x_0$, is precisely $\|\phi'(t_0)\|$. ∎
By elementary calculus, every explicit $C^p$ curve is "smooth," i.e., has a tangent line at each of its points. The astroid (Example 13.5) shows that this might not be the case for a general curve. Is there an easy way to recognize when a general $C^p$ curve has a tangent line (in the sense of Definition 11.21) at a given point on its trace? To answer this question, let $(\phi, I)$ be the parametrization of the astroid given in Example 13.5, and notice that $\phi'(t) = 0$ when $t = 0, \pi/2, \pi, 3\pi/2, 2\pi$, i.e., exactly at the points where the astroid $\phi(I)$ fails to have a tangent line. (Notice that if we use the flight plan analogy, this condition makes much sense. It is impossible to draw a curve at a corner without pausing to make the direction change, i.e., without making the velocity of the drawing device zero.) Could the answer to our question be this simple? Does a curve with parametrization $(\phi, I)$ have a tangent line at each point where $\phi' \ne 0$?
13.10 Remark. If $(\phi, I)$ is a parametrization of a $C^p$ curve $C$ in $\mathbf{R}^2$, and $\phi'(t_0) \ne 0$ for some $t_0 \in I^o$, then $C$ has a tangent line at $(x_0, y_0) := \phi(t_0)$.
STRATEGY: By elementary calculus, the graph of every differentiable function has a tangent line at each of its points. The curve $C$ is given by $x = \phi_1(t)$, $y = \phi_2(t)$. If we could solve the first equation for $t$, then by the second equation $C$ is an explicit curve: $y = \phi_2 \circ \phi_1^{-1}(x)$. Thus we must decide: Is $\phi_1^{-1}$ differentiable? This looks like a job for the Implicit Function Theorem.
PROOF. Let $(\phi_1, \phi_2)$ represent the components of $\phi$. Since $\phi'(t_0) \ne 0$, we may suppose that $\phi_1'(t_0) \ne 0$. Set $F(x, t) = \phi_1(t) - x$ and observe by the Implicit Function Theorem that there is an open interval $J_0$ containing $x_0$ and a continuously differentiable function $g : J_0 \to I$ such that $\phi_1(g(x)) = x$ for all $x \in J_0$ and $g(x_0) = t_0$. Thus the graph of $y = f(x) := \phi_2 \circ g(x)$, $x \in J_0$, coincides with the trace of $\phi$ on $g(J_0)$, i.e., near $(x_0, y_0)$. It follows that $C$ has a tangent line at $(x_0, y_0)$. ∎
Accordingly, we make the following definition.
13.11 DEFINITION. Let $(\phi, I)$ be a parametrization of a $C^p$ curve $C$.
(i) $(\phi, I)$ is said to be smooth at $t_0 \in I$ if and only if $\phi'(t_0) \neq 0$.
(ii) $(\phi, I)$ is called smooth if and only if it is smooth at each point of $I$, in which case we call $\phi'$ the tangent vector of $C$ induced by $(\phi, I)$.
(iii) A curve is called smooth if and only if it has a smooth parametrization, unless it is a closed arc, in which case we also insist that one of its smooth parametrizations $(\psi, [c, d])$ satisfy $\psi'(c) = \psi'(d)$.

By definition, then, if a curve $C$ has a smooth parametrization, then $C$ is smooth. The converse of this statement is false, even for arcs.
13.12 Remark. Every smooth arc has nonsmooth parametrizations.

PROOF. Let $(\phi, [a, b])$ be a smooth parametrization of a smooth arc $C$. We may suppose (by a preliminary change of variables) that $0 \in (a, b)$. Then $\psi(t) := \phi(t^3)$, $J = [\sqrt[3]{a}, \sqrt[3]{b}]$, is a parametrization of $C$. It is NOT smooth, however, since $\psi'(t) = \phi'(t^3) \cdot 3t^2 = 0$ when $t = 0$. ∎
This raises another question: When does a change in parametrization preserve smoothness? To answer this question, suppose that $(\phi, I)$ and $(\psi, J)$ are parametrizations of the same curve, with $\phi$ 1-1 and $(\phi, I)$ smooth. If the transition $\tau$, from $J$ to $I$, is differentiable, then by (1), $(\psi, J)$ is smooth if and only if $\tau'(u) \neq 0$ for all $u \in J$. This leads us to the following definition.
13.13 DEFINITION. Two $C^p$ parametrizations $(\phi, I)$, $(\psi, J)$ are said to be smoothly equivalent if and only if they are smooth parametrizations of the same curve, and there is a $C^p$ function $\tau$, called the transition from $J$ to $I$, such that $\psi = \phi \circ \tau$, $\tau(J) = I$, and $\tau'(u) \neq 0$ for all $u \in J$.

Thus, by Remark 13.8, the arc length of a curve is the same under smoothly equivalent parametrizations. Notice that since $\tau'$ is continuous and nonzero, either $\tau'$ is positive on $J$ or $\tau'$ is negative on $J$. Hence, by Theorem 4.24, a transition $\tau$ between two smoothly equivalent parametrizations is always 1-1.
The following integral can be interpreted as the mass of a wire on a curve with density $g$ (see Appendix E).
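13.14 DEFINITION. Let $C$ be a smooth arc in $\mathbf{R}^m$ with parametrization $(\phi, I)$, and let $g : C \to \mathbf{R}$ be continuous. Then the line integral of $g$ on $C$ is

$$\int_C g\,ds := \int_I g(\phi(t))\,\|\phi'(t)\|\,dt. \tag{3}$$

For an explicit curve $C$ given by $y = f(x)$, $x \in [a, b]$, this integral becomes

$$\int_C g\,ds = \int_a^b g(x, f(x))\sqrt{1 + |f'(x)|^2}\,dx.$$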
We note that by Definition 13.6, the line integral (3) equals the arc length of $C$ when $g = 1$. This explains the notation $ds$. Indeed, the parameter $s$ represents arc length (see (2)) and, by the Fundamental Theorem of Calculus, $ds/dt = \|\phi'(t)\|$. Hence, the Leibnizian differential of $s$ satisfies $ds = \|\phi'(t)\|\,dt$. We also note that the line integral of a function $g$ on a curve is the same under smoothly equivalent parametrizations (see Exercise 8). Since a line integral is a one-dimensional integral, it can often be evaluated by the techniques discussed in Chapter 5.
13.15 Example. Find $\int_C g\,ds$ where $g(x, y) = 2x + y$, $C = \phi(I)$, $\phi(t) = (\cos t, \sin t)$, and $I = [0, \pi/2]$.

SOLUTION. Since $\|\phi'(t)\| = \|(-\sin t, \cos t)\| = 1$, we have

$$\int_C g\,ds = \int_0^{\pi/2} (2\cos t + \sin t)\,dt = 3.$$ ∎
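As a numerical sanity check on Example 13.15, the following minimal sketch approximates the integral in (3) with a composite midpoint rule. The helper name `line_integral` and the choice of quadrature rule are illustrative assumptions, not part of the text; any standard one-dimensional rule from Chapter 5 would do.

```python
import math

def line_integral(g, phi, dphi, a, b, n=2000):
    """Approximate the line integral of g over phi([a, b]) by applying the
    composite midpoint rule to g(phi(t)) * ||phi'(t)|| (hypothetical helper)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        dx, dy = dphi(t)
        speed = math.hypot(dx, dy)  # ||phi'(t)||
        total += g(*phi(t)) * speed
    return total * h

# Example 13.15: g(x, y) = 2x + y on the quarter circle phi(t) = (cos t, sin t).
value = line_integral(
    lambda x, y: 2 * x + y,
    lambda t: (math.cos(t), math.sin(t)),
    lambda t: (-math.sin(t), math.cos(t)),
    0.0, math.pi / 2,
)
print(value)  # should be close to the exact value 3
```

The derivative $\phi'$ is passed in by hand here; this keeps the sketch short and avoids introducing finite-difference error into the speed $\|\phi'(t)\|$.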
For even the simplest applications, we must have a theory rich enough to handle curves, such as the boundary of the unit square $\partial([0, 1] \times [0, 1])$, which are not smooth but a union of smooth pieces. Consequently, we extend the theory developed above to finite unions of smooth curves as follows.

A subset $C$ of $\mathbf{R}^m$ is called a piecewise smooth curve (respectively, a piecewise smooth arc) if and only if $C = \bigcup_{j=1}^N C_j$, where each $C_j$ is a smooth curve (respectively, arc), and for each $j \neq k$ either $C_j$ and $C_k$ are disjoint or they intersect at a single point. Thus a piecewise smooth curve might consist of several disjoint smooth pieces, such as the boundary of an annulus $0 < a^2 < x^2 + y^2 < b^2$, or several connected pieces with corners, such as the boundary of the perforated square $([0, 3] \times [0, 3]) \setminus ([1, 2] \times [1, 2])$.

Let $C = \bigcup_{j=1}^N C_j$ be a piecewise smooth curve. By a parametrization of $C$ we mean a collection of smooth parametrizations $(\phi_j, I_j)$ of $C_j$. Two parametrizations $\bigcup_{j=1}^N (\phi_j, I_j)$ and $\bigcup_{j=1}^N (\psi_j, J_j)$ of $C$ are said to be smoothly equivalent if and only if $(\phi_j, I_j)$ and $(\psi_j, J_j)$ are smoothly equivalent for each $j \in \{1, \dots, N\}$. Finally, if $C$ is a piecewise smooth arc, then the arc length of $C$ is defined by

$$L(C) := \sum_{j=1}^N L(C_j),$$

and the line integral on $C$ of a continuous function $g : C \to \mathbf{R}$ is defined by

$$\int_C g\,ds = \sum_{j=1}^N \int_{C_j} g\,ds.$$

Figure 13.4
13.16 Example. Parametrize the boundary $C$ of the unit square $[0, 1] \times [0, 1]$ and compute $\int_C g\,ds$, where $g(x, y) = x^2 + y^3$.

SOLUTION. $C$ has four smooth pieces that can be parametrized by

$$\phi_1(t) = (t, 0), \quad \phi_2(t) = (1, t), \quad \phi_3(t) = (t, 1), \quad \phi_4(t) = (0, t),$$

for $t \in [0, 1]$. Since $\|\phi_j'(t)\| = 1$, we have by definition,

$$\int_C g\,ds = \int_0^1 t^2\,dt + \int_0^1 (1 + t^3)\,dt + \int_0^1 (t^2 + 1)\,dt + \int_0^1 t^3\,dt = \frac{19}{6}.$$ ∎
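The arithmetic in Example 13.16 can be checked exactly. The sketch below is an illustration only: it encodes each of the four integrands as a list of polynomial coefficients (a representation chosen here for convenience, with the hypothetical helper `integrate_poly_01`) and integrates term by term over $[0, 1]$ using rational arithmetic.

```python
from fractions import Fraction

def integrate_poly_01(coeffs):
    """Exact integral over [0, 1] of sum(coeffs[n] * t**n), using
    the fact that the integral of t**n over [0, 1] is 1/(n + 1)."""
    return sum(Fraction(c, n + 1) for n, c in enumerate(coeffs))

# g(x, y) = x**2 + y**3 restricted to each side; every side has speed 1.
sides = [
    [0, 0, 1],     # phi_1(t) = (t, 0): integrand t^2
    [1, 0, 0, 1],  # phi_2(t) = (1, t): integrand 1 + t^3
    [1, 0, 1],     # phi_3(t) = (t, 1): integrand t^2 + 1
    [0, 0, 0, 1],  # phi_4(t) = (0, t): integrand t^3
]
total = sum(integrate_poly_01(s) for s in sides)
print(total)  # 19/6
```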
We close this section with a geometric justification of Definition 13.6 that will not be used elsewhere.
The arc length of some non-$C^p$ curves can be defined by using line segments for approximation (see Figure 13.4). Namely, we say that a curve $C$ with parametrization $(\phi, I)$ is rectifiable if and only if

$$\|C\| := \sup\left\{ \sum_{j=1}^k \|\phi(t_j) - \phi(t_{j-1})\| : \{t_0, t_1, \dots, t_k\} \text{ is a partition of } I \right\}$$

is finite, in which case we call $\|C\|$ the arc length of $C$. The following result shows that every $C^p$ arc is rectifiable, and the two definitions we have given for arc length agree.
*13.17 THEOREM. If $C$ is a $C^p$ arc, then $\|C\|$ is finite, and $L(C) = \|C\|$.

STRATEGY: The idea behind the proof is simple. By the Mean Value Theorem, each term $\|\phi(t_j) - \phi(t_{j-1})\|$ that appears in the definition of $\|C\|$ is approximately $\|\phi'(t_j)\|(t_j - t_{j-1})$, a term of a Riemann sum of the integral of $\|\phi'(t)\|$. Thus, we begin by controlling the size of $\|\phi'(t_j)\|$.

PROOF. Let $\varepsilon > 0$, write $\phi = (\phi_1, \phi_2, \dots, \phi_m)$, and set

$$F(x_1, \dots, x_m) = \left( \sum_{\ell=1}^m |\phi_\ell'(x_\ell)|^2 \right)^{1/2}$$

for $(x_1, \dots, x_m)$ in the cube $I^m := I \times \cdots \times I$. By hypothesis, $F$ is continuous on $I^m$, and $I^m$ is evidently closed and bounded. Thus, $F$ is uniformly continuous on $I^m$; i.e., there is a $\delta > 0$ such that

$$x, y \in I^m \quad\text{and}\quad \|x - y\| < \delta \quad\text{imply}\quad |F(x) - F(y)| < \frac{\varepsilon}{2|I|}.$$

Let $P = \{u_0, \dots, u_N\}$ be any partition of $I$. By Theorem 5.18, choose a partition $P_0 = \{t_0, t_1, \dots, t_k\}$ of $I$ finer than $P$ such that $\|P_0\| < \delta/\sqrt{m}$ and

$$\int_I \|\phi'(t)\|\,dt - \frac{\varepsilon}{2} < \sum_{j=1}^k \|\phi'(t_j)\|(t_j - t_{j-1}) < \int_I \|\phi'(t)\|\,dt + \frac{\varepsilon}{2}.$$

Fix $\ell \in \{1, \dots, m\}$ and $j \in \{1, \dots, k\}$. By Theorem 4.15 (the one-dimensional Mean Value Theorem), choose a point $c_j(\ell) \in [t_{j-1}, t_j]$ such that

$$\phi_\ell(t_j) - \phi_\ell(t_{j-1}) = \phi_\ell'(c_j(\ell))(t_j - t_{j-1}).$$

Since $\|P_0\| < \delta/\sqrt{m}$, we have $|F(t_j, \dots, t_j) - F(c_j(1), \dots, c_j(m))| < \varepsilon/(2|I|)$. Since $\phi'(t) = (\phi_1'(t), \dots, \phi_m'(t))$, we also have $F(t_j, \dots, t_j) = \|\phi'(t_j)\|$ and

$$F(c_j(1), \dots, c_j(m))\,(t_j - t_{j-1}) = \|\phi(t_j) - \phi(t_{j-1})\|.$$

Since the lengths $t_j - t_{j-1}$ sum to $|I|$, it follows that

$$\sum_{j=1}^k \|\phi'(t_j)\|(t_j - t_{j-1}) - \frac{\varepsilon}{2} < \sum_{j=1}^k \|\phi(t_j) - \phi(t_{j-1})\| < \sum_{j=1}^k \|\phi'(t_j)\|(t_j - t_{j-1}) + \frac{\varepsilon}{2}.$$

Combining this double inequality with the preceding one, we obtain

$$\int_I \|\phi'(t)\|\,dt - \varepsilon < \sum_{j=1}^k \|\phi(t_j) - \phi(t_{j-1})\| < \int_I \|\phi'(t)\|\,dt + \varepsilon.$$

Using the left-hand inequality and the definition of $\|C\|$, we have

$$L(C) - \varepsilon = \int_I \|\phi'(t)\|\,dt - \varepsilon < \sum_{j=1}^k \|\phi(t_j) - \phi(t_{j-1})\| \le \|C\|.$$

Since $\varepsilon > 0$ was arbitrary, it follows from Definition 13.6 that $L(C) \le \|C\|$. On the other hand, since $P_0 = \{t_0, t_1, \dots, t_k\}$ is finer than $P$, it follows from the right-hand inequality that

$$\sum_{i=1}^N \|\phi(u_i) - \phi(u_{i-1})\| \le \sum_{j=1}^k \|\phi(t_j) - \phi(t_{j-1})\| < \int_I \|\phi'(t)\|\,dt + \varepsilon = L(C) + \varepsilon.$$

Taking the supremum over all partitions $\{u_0, \dots, u_N\}$ of $I$, we have $\|C\| \le L(C) + \varepsilon$; i.e., $\|C\| \le L(C)$. ∎
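The agreement of the two notions of arc length can be observed numerically. The following sketch (an illustration, not part of the proof) computes the polygonal sums from the definition of rectifiability for the quarter circle $\phi(t) = (\cos t, \sin t)$, $t \in [0, \pi/2]$, whose arc length is $\pi/2$. Uniform partitions are an assumption made for simplicity, and the helper name `polygonal_length` is hypothetical.

```python
import math

def polygonal_length(phi, a, b, k):
    """Sum of ||phi(t_j) - phi(t_{j-1})|| over the uniform partition of
    [a, b] into k subintervals."""
    pts = [phi(a + (b - a) * j / k) for j in range(k + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def quarter_circle(t):
    return (math.cos(t), math.sin(t))

# Each refinement doubles the number of pieces, so the sums are nondecreasing
# and stay below the arc length pi/2.
for k in (1, 4, 16, 64, 256):
    print(k, polygonal_length(quarter_circle, 0.0, math.pi / 2, k))
print("pi/2 =", math.pi / 2)
```

The printed sums increase with $k$ and approach $\pi/2$ from below, as one expects from the supremum in the definition of $\|C\|$.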
EXERCISES

1. Let $\psi(t) = (a \sin t, a \cos t)$, $\sigma(t) = (a \cos 2t, a \sin 2t)$, $I = [0, 2\pi)$, and $J = [0, \pi)$. Sketch the traces of $(\psi, I)$ and $(\sigma, J)$. Note the "direction of flight" and the "speed" of each parametrization. Compare these parametrizations with the one given in Example 13.4.
2. Let $\mathbf{a}, \mathbf{b} \in \mathbf{R}^m$, $\mathbf{b} \neq 0$, and set $\phi(t) = \mathbf{a} + t\mathbf{b}$. Show that $C = \phi(\mathbf{R})$ is a smooth unbounded curve that contains $\mathbf{a}$ and $\mathbf{a} + \mathbf{b}$. Prove that the angle between $\phi(t_1) - \phi(0)$ and $\phi(t_2) - \phi(0)$ for any $t_1, t_2 \neq 0$ is $0$ or $\pi$.
3. Let $I$ be an interval and $f : I \to \mathbf{R}$ be continuously differentiable with $|f(\theta)|^2 + |f'(\theta)|^2 \neq 0$ for all $\theta \in I$. Prove that the graph of $r = f(\theta)$ (in polar coordinates) is a smooth $C^1$ curve in $\mathbf{R}^2$.
*4. Show that the curve $y = \sin(1/x)$, $0 < x \le 1$, is not rectifiable. Thus show that Theorem 13.17 can be false if $C$ is not an arc.
5. Sketch the trace and compute the arc length of each of the following.
(a) $\phi(t) = (e^t \sin t, e^t \cos t, e^t)$, $t \in [0, 2\pi]$.
(b) $y^3 = x^2$ from $(-1, 1)$ to $(1, 1)$.
(c) $\phi(t) = (t^2, t^2, t^2)$, $t \in [0, 2]$.
(d) The astroid of Example 13.5.
6. For each of the following, find a (piecewise) smooth parametrization of $C$ and compute $\int_C g\,ds$.
(a) $C$ is the curve $y = \sqrt{9 - x^2}$, $x \ge 0$, and $g(x, y) = xy^2$.
(b) $C$ is the portion of the ellipse $x^2/a^2 + y^2/b^2 = 1$, $a, b > 0$, which lies in the first quadrant and $g(x, y) = xy$.
(c) $C$ is the intersection of the surfaces $x^2 + z^2 = 4$ and $y = x^2$, and $g(x, y, z) = \sqrt{1 + yz^2}$.
(d) $C$ is the triangle with vertices $(0, 0, 0)$, $(1, 0, 0)$, and $(0, 2, 0)$, and $g(x, y, z) = x + y + z^3$.
7. Let $C$ be a smooth arc and $g_k : C \to \mathbf{R}$ be continuous for $k \in \mathbf{N}$.
(a) If $g_k \to g$ uniformly on $C$, prove that $\int_C g_k\,ds \to \int_C g\,ds$ as $k \to \infty$.
*(b) Let $\{g_k\}$ be pointwise monotone and let $g_k \to g$ pointwise on $C$ as $k \to \infty$. If $g$ is continuous on $C$, prove that $\int_C g_k\,ds \to \int_C g\,ds$ as $k \to \infty$.
8. Let $(\phi, I)$ be a parametrization of a smooth arc in $\mathbf{R}^m$, and let $\tau : J \to \mathbf{R}$ be a $C^1$ function, 1-1 from $J$ onto $I$. If $\tau'(u) \neq 0$ for all but finitely many $u \in J$, $\psi = \phi \circ \tau$, and $g : \phi(I) \to \mathbf{R}$ is continuous, prove that

$$\int_I g(\phi(t))\,\|\phi'(t)\|\,dt = \int_J g(\psi(u))\,\|\psi'(u)\|\,du.$$

9. [FOLIUM OF DESCARTES]. Let $C$ be the piecewise smooth curve $\phi(I_1 \cup I_2)$, where $I_1 = (-\infty, -1)$, $I_2 = (-1, \infty)$, and

$$\phi(t) = \left( \frac{3t}{1 + t^3}, \frac{3t^2}{1 + t^3} \right).$$

Show that if $(x, y) = \phi(t)$, then $x^3 + y^3 = 3xy$. Sketch $C$.
10. The absolute curvature of a smooth curve with parametrization $(\psi, I)$ at a point $x_0 = \psi(t_0)$ is the number

$$\kappa(x_0) = \lim_{t \to t_0} \frac{\theta(t)}{\ell(t)},$$

when this limit exists, where $\theta(t)$ is the angle between $\psi'(t)$ and $\psi'(t_0)$, and $\ell(t)$ is the arc length of $\psi(I)$ from $\psi(t)$ to $\psi(t_0)$. (Thus $\kappa$ measures how rapidly $\theta(t)$ changes with respect to arc length.)
(a) Given $\mathbf{a}, \mathbf{b} \in \mathbf{R}^n$, $\mathbf{b} \neq 0$, prove that the absolute curvature of the line $A = \psi(I)$, where $\psi(t) := \mathbf{a} + t\mathbf{b}$ and $I := (-\infty, \infty)$, is zero at each point $x_0$ on $A$.
(b) Prove that the absolute curvature of the circle of radius $r$ (namely, $C = \psi(I)$, where $\psi(t) = (r \cos t, r \sin t)$ and $I = [0, 2\pi)$) is $1/r$ at each point $x_0$ on $C$.
11. Let $C$ be a smooth $C^2$ arc with parametrization $(\phi, [a, b])$, and let $s = \ell(t)$ be given by (2). The natural parametrization of $C$ is the pair $(\nu, [0, L])$, where

$$\nu(s) = (\phi \circ \ell^{-1})(s) \quad\text{and}\quad L = L(C).$$
(a) Prove that $\|\nu'(s)\| = 1$ for all $s \in [0, L]$ and the arc length of a subcurve $(\nu, [c, d])$ of $C$ is $d - c$. (This is why $(\nu, [0, L])$ is called the natural parametrization.)
(b) Show that $\nu'(s)$ and $\nu''(s)$ are orthogonal for each $s \in [0, L]$.
(c) Prove that the absolute curvature (see Exercise 10 above) of $(\nu, [0, L])$ at $x_0 = \nu(s_0)$ is $\kappa(x_0) = \|\nu''(s_0)\|$.
(d) Show that if $x_0 = \phi(t_0) = \nu(s_0)$ and $m = 3$, then

$$\kappa(x_0) = \|\nu'(s_0) \times \nu''(s_0)\| = \frac{\|\phi'(t_0) \times \phi''(t_0)\|}{\|\phi'(t_0)\|^3}.$$

(e) Prove that the absolute curvature of an explicit $C^p$ curve $y = f(x)$ at $(x_0, y_0)$ under the trivial parametrization is

$$\kappa = \frac{|y''(x_0)|}{(1 + (y'(x_0))^2)^{3/2}}.$$
13.2 ORIENTED CURVES

Every parametrization $(\phi, I)$ of a smooth curve $C$ determines a "direction of flight" along $C$, i.e., determines the direction $\phi(t)$ moves as $t$ increases on $I$; equivalently, the direction in which the tangent vector $\phi'(t)$ points. This direction is called the orientation of $C$ induced by $(\phi, I)$. (The arrows in Figures 13.1 and 13.2 represent the orientation of the given parametrization.)

If $C$ is smooth, and $(\phi, I)$ is one of its smooth parametrizations, then the unit tangent vector of $C$ at $x_0 = \phi(t_0)$ is defined by

$$T(x_0) := \phi'(t_0)/\|\phi'(t_0)\|.$$

Suppose that $(\phi, I)$ and $(\psi, J)$ are smoothly equivalent parametrizations of the same curve with transition $\tau$. Since $\tau'$ is continuous and nonzero, either $\tau'(u) > 0$ for all $u \in J$ or $\tau'(u) < 0$ for all $u \in J$. In the first case, the vectors $\phi'(\tau(u))$ and $\psi'(u)$ point in the same direction (see (1) in Section 13.1); hence, these parametrizations determine the same orientation and same unit tangent. In the second case, the vectors $\phi'(\tau(u))$ and $\psi'(u)$ point in opposite directions; hence, they determine different orientations and opposite unit tangents. Accordingly, we make the following definition.

13.18 DEFINITION. Two parametrizations $(\phi, I)$ and $(\psi, J)$ are said to be orientation equivalent if and only if they are smoothly equivalent and the transition $\tau$ from $J$ to $I$ satisfies $\tau'(u) > 0$ for all $u \in J$.

In practice, a curve and its orientation are often described geometrically. The reader must provide a parametrization that traces the curve in the prescribed orientation. Here are two typical examples.
Figure 13.5

13.19 Example. Find a smooth parametrization of the curve $C$ in $\mathbf{R}^3$, oriented in the clockwise direction when viewed from high up on the positive $z$ axis, formed by intersecting the surfaces $x^2 + 5y^2 = 5$ and $z = x^2$.

SOLUTION. The elliptical cylinder $x^2 + 5y^2 = 5$ intersects the parabolic cylinder $z = x^2$ to form a "sagging ellipse" (the shaded region in Figure 13.5 represents that part of $z = x^2$ which lies inside the cylinder $x^2 + 5y^2 = 5$). Using $x = \sqrt{5} \sin t$, $y = \cos t$ to incorporate clockwise motion around the ellipse $x^2 + 5y^2 = 5$, we see that $z = x^2 = 5 \sin^2 t$. Thus a smooth parametrization of $C$ with clockwise orientation is $\phi(t) = (\sqrt{5} \sin t, \cos t, 5 \sin^2 t)$ on $I = [0, 2\pi]$. ∎

13.20 Example. Find a smooth parametrization of the curve $C$ in $\mathbf{R}^3$, oriented from right to left when viewed from far out the line $y = x$ in the $xy$ plane, formed by intersecting the surfaces $z = x^2 - y^2$ and $x + y = 1$.

SOLUTION. The saddle surface $z = x^2 - y^2$ intersects the plane $x + y = 1$ to form a curve that cuts across the surface. Using $x = t$ as a parameter to incorporate right-to-left orientation, we see that $y = 1 - t$ and $z = t^2 - (1 - t)^2 = 2t - 1$. Thus a smooth parametrization of $C$ is $\phi(t) = (t, 1 - t, 2t - 1)$ on $I = \mathbf{R}$. In particular, $C$ is a line in the direction $(1, -1, 2)$ passing through the point $(0, 1, -1)$. ∎

The following integral arises naturally in the study of fluids, electricity, and magnetism (e.g., see the discussion below).

13.21 DEFINITION. Let $C$ be a smooth arc in $\mathbf{R}^m$ with unit tangent $T$, and let $(\phi, I)$ be a smooth parametrization of $C$. If $F : C \to \mathbf{R}^m$ is continuous, then the oriented line integral of $F$ along $C$ is
$$\int_C F \cdot T\,ds := \int_C F \cdot d\phi := \int_I F(\phi(t)) \cdot \phi'(t)\,dt. \tag{4}$$
The notation $F \cdot d\phi$ is self-explanatory. The notation $F \cdot T\,ds$ is consistent with equation (3) in Section 13.1. Indeed, $T = \phi'(t)/\|\phi'(t)\|$ and $ds = \|\phi'(t)\|\,dt$, so in the expression $F \cdot T\,ds$, the scalars $\|\phi'(t)\|$ cancel each other out.

What does this number represent? If $F$ represents the flow of a fluid, then $F \cdot T$ is the tangential component of $F$, i.e., a measure of fluid flow in the direction to which the tangent $T$ points (see Appendix E). For example, suppose that $C$ is the unit circle oriented in the counterclockwise direction and $F(x, y) = (-y, x)$. The unit tangent to $C$ at a point $(x, y)$ is $(-y, x)$, so $F$ points in the same direction that $T$ does. Hence, $F \cdot T = 1$ is an indication that the fluid is flowing "with the tangent" rather than against it. On the other hand, if $G(x, y) = (y, -x)$ and $H(x, y) = (x, y)$, then $G \cdot T = -1$ because the fluid is flowing against the tangent, and $H \cdot T = 0$ because the fluid is flowing orthogonally to $T$ (i.e., neither with nor against it). Therefore, the integral of $F \cdot T\,ds$ over $C$ is a measure of the circulation of $F$ around $C$ in the direction of the tangent vector. If this integral is positive, it means that the net flow of the fluid is with $T$ rather than against $T$.

Since an oriented line integral is a one-dimensional integral, it can often be evaluated by techniques introduced in Chapter 5. Here is a typical example.

13.22 Example. Describe the trace of $\phi(t) = (\cos t, \sin t, t)$, $t \in I = [0, 4\pi]$, and compute

$$\int_C F \cdot T\,ds,$$

where $F(x, y, z) = (1, \cos z, xy)$ and $C = \phi(I)$.
SOLUTION. Let $(x, y, z) = \phi(t)$. Since $x^2 + y^2 = 1$, the trace of $\phi$ lies on the cylinder $x^2 + y^2 = 1$, $0 \le z \le 4\pi$. As $t$ increases, the point $(x, y)$ winds around the unit circle $x^2 + y^2 = 1$ in a counterclockwise direction. Thus the trace of $\phi$ is a spiral (called the circular helix) that winds around the cylinder $x^2 + y^2 = 1$ (see Figure 13.6). As $t$ runs from $0$ to $4\pi$, this spiral winds around the cylinder twice, and $z$ runs from $0$ to $4\pi$. Since $\phi'(t) = (-\sin t, \cos t, 1)$, we have

$$\int_C F \cdot T\,ds = \int_0^{4\pi} (1, \cos t, \cos t \sin t) \cdot (-\sin t, \cos t, 1)\,dt = \int_0^{4\pi} (-\sin t + \cos^2 t + \sin t \cos t)\,dt = 2\pi.$$ ∎
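The value in Example 13.22 can be confirmed numerically. The sketch below approximates the right-hand side of (4) with a composite midpoint rule; the helper name `oriented_integral` and the quadrature rule are illustrative assumptions rather than anything prescribed by the text.

```python
import math

def oriented_integral(F, phi, dphi, a, b, n=4000):
    """Midpoint-rule approximation of the integral over [a, b] of
    F(phi(t)) . phi'(t) (hypothetical helper)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += sum(f * d for f, d in zip(F(*phi(t)), dphi(t)))
    return total * h

def helix(t):
    return (math.cos(t), math.sin(t), t)

def dhelix(t):
    return (-math.sin(t), math.cos(t), 1.0)

def F(x, y, z):
    return (1.0, math.cos(z), x * y)

value = oriented_integral(F, helix, dhelix, 0.0, 4 * math.pi)
print(value, 2 * math.pi)  # the two numbers should agree closely
```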
The following result shows that unlike the line integral $\int_C g\,ds$, the oriented line integral $\int_C F \cdot T\,ds$ can give different values for different smoothly equivalent parametrizations of the same curve.

13.23 Remark. If $(\phi, I)$ and $(\psi, J)$ are smoothly equivalent but not orientation equivalent, then

$$\int_I F(\phi(t)) \cdot \phi'(t)\,dt = -\int_J F(\psi(u)) \cdot \psi'(u)\,du.$$
PROOF. Let $\tau$ be the transition from $J$ to $I$. Since $\tau'$ is continuous and nonzero, it is either positive on $J$ or negative on $J$. Since $(\phi, I)$ and $(\psi, J)$ are not orientation equivalent, it follows that $\tau'$ is negative on $J$; i.e., $|\tau'(u)| = -\tau'(u)$ for $u \in J$. Combining this observation with the Change-of-Variables Formula (Theorem 12.46) and (1) in Section 13.1, we conclude that

$$\int_I F(\phi(t)) \cdot \phi'(t)\,dt = \int_J F(\phi(\tau(u))) \cdot \phi'(\tau(u))\,|\tau'(u)|\,du = -\int_J F(\psi(u)) \cdot \psi'(u)\,du.$$ ∎

Figure 13.6
By the same method, one can show that the oriented integral (4) gives identical values for orientation equivalent parametrizations of the same curve (see Exercise 5). Therefore, to evaluate an oriented integral over a curve $C$ whose orientation has been described geometrically, we can use any smooth parametrization of $C$ and adjust the sign of the integral to reflect the prescribed orientation. Here is a typical example.

13.24 Example. Find

$$\int_C F \cdot T\,ds,$$

where $F(x, y) = (y, xy)$ and $C$ is the unit circle $x^2 + y^2 = 1$ oriented in the clockwise direction.

SOLUTION. The parametrization $\phi(t) = (\cos t, \sin t)$, $t \in [0, 2\pi]$, of $C$ has counterclockwise orientation (see Example 13.4). Thus, by Remark 13.23,

$$\int_C F \cdot T\,ds = -\int_0^{2\pi} (\sin t, \sin t \cos t) \cdot (-\sin t, \cos t)\,dt = \int_0^{2\pi} (\sin^2 t - \sin t \cos^2 t)\,dt = \pi.$$ ∎
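Remark 13.23 and Example 13.24 can be illustrated together. In the sketch below, the clockwise parametrization $\psi(u) = \phi(2\pi - u) = (\cos u, -\sin u)$ is an assumption introduced here (its transition $\tau(u) = 2\pi - u$ has $\tau' = -1 < 0$); the helper `oriented_integral` is likewise hypothetical. The two computed values should be negatives of each other.

```python
import math

def oriented_integral(F, phi, dphi, a, b, n=4000):
    """Midpoint-rule approximation of the integral over [a, b] of
    F(phi(t)) . phi'(t) (hypothetical helper)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += sum(f * d for f, d in zip(F(*phi(t)), dphi(t)))
    return total * h

def F(x, y):
    return (y, x * y)

# Counterclockwise parametrization phi(t) = (cos t, sin t).
ccw = oriented_integral(
    F,
    lambda t: (math.cos(t), math.sin(t)),
    lambda t: (-math.sin(t), math.cos(t)),
    0.0, 2 * math.pi,
)

# Clockwise parametrization psi(u) = phi(2*pi - u) = (cos u, -sin u).
cw = oriented_integral(
    F,
    lambda u: (math.cos(u), -math.sin(u)),
    lambda u: (-math.sin(u), -math.cos(u)),
    0.0, 2 * math.pi,
)

print(ccw, cw)  # approximately -pi and +pi
```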
There is another way to represent the oriented integral (4) that uses differential notation. Recall that if $x_j = \phi_j(t)$, then $dx_j = \phi_j'(t)\,dt$. Hence, formally, $F(\phi(t)) \cdot \phi'(t)\,dt$ looks like

$$F_1\,dx_1 + \cdots + F_m\,dx_m.$$

This last expression is called a differential form of degree 1 on $\mathbf{R}^m$ (more briefly, a 1-form) and the functions $F_j$ are called its coefficients. A 1-form is said to be continuous on a set $E$ if and only if each of its coefficients is continuous on $E$. The oriented integral of a continuous 1-form on a smooth arc $C$ in $\mathbf{R}^m$ is defined by

$$\int_C F_1\,dx_1 + \cdots + F_m\,dx_m := \int_C F \cdot T\,ds,$$
where $F = (F_1, \dots, F_m)$. The following example illustrates the fact that differential forms provide a shorthand for the way that an oriented line integral is computed (so we can avoid parametrizing explicit curves).

13.25 Example. Find

$$\int_C y\,dx + \cos x\,dy,$$

where $C$ is the explicit curve $y = x^2 + \sin x$ oriented from $(0, 0)$ to $(\pi, \pi^2)$.

SOLUTION. Since $y = x^2 + \sin x$ and $dy = (2x + \cos x)\,dx$, we have

$$\int_C y\,dx + \cos x\,dy = \int_0^\pi (x^2 + \sin x)\,dx + \int_0^\pi \cos x\,(2x + \cos x)\,dx = \frac{\pi^3}{3} + \frac{\pi}{2} - 2.$$ ∎

Let $C = \bigcup_{j=1}^N C_j$ be a piecewise smooth arc in $\mathbf{R}^m$ (see the discussion preceding Example 13.16) and $T_j$ be a unit tangent vector for $C_j$. If $F : C \to \mathbf{R}^m$ is continuous, then the oriented line integral of $F$ along $C$ induced by the tangents $T_j$ is defined to be

$$\int_C F \cdot T\,ds = \sum_{j=1}^N \int_{C_j} F \cdot T_j\,ds.$$

If $\omega$ is a 1-form continuous on $C$, then the oriented integral of $\omega$ along $C$ is defined to be

$$\int_C \omega = \sum_{j=1}^N \int_{C_j} \omega.$$

13.26 Example. Find

$$\int_C xy\,dx + (x^2 + y^2)\,dy,$$
where $C$ is the boundary of $Q = [0, 1] \times [0, 1]$ oriented in the counterclockwise direction.

SOLUTION. The boundary $C = \partial Q$ consists of four smooth pieces (see Figure 13.7): $C_1$ (which lies in the line $x = 0$), $C_2$ (in $y = 0$), $C_3$ (in $x = 1$), and $C_4$ (in $y = 1$). For $C_1$, let $x = 0$ and $y$ run from $1$ to $0$ (to maintain counterclockwise orientation on $C$). Then

$$\int_{C_1} xy\,dx + (x^2 + y^2)\,dy = \int_1^0 y^2\,dy = -\frac{1}{3}.$$

Similarly, the integrals over $C_2$, $C_3$, and $C_4$ are $0$, $4/3$, and $-1/2$. Hence,

$$\int_C F \cdot T\,ds = -\frac{1}{3} + 0 + \frac{4}{3} - \frac{1}{2} = \frac{1}{2}.$$ ∎

Figure 13.7
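The four pieces of Example 13.26 can be checked with a short numerical sketch. It integrates the 1-form $P\,dx + Q\,dy$ along straight segments, each parametrized linearly on $[0, 1]$; the helper `segment_integral` and the midpoint rule are assumptions made for illustration only.

```python
def segment_integral(P, Q, start, end, n=2000):
    """Integrate the 1-form P dx + Q dy along the straight segment from
    `start` to `end`, parametrized by phi(t) = start + t * (end - start),
    t in [0, 1], using the composite midpoint rule (hypothetical helper)."""
    (x0, y0), (x1, y1) = start, end
    dx, dy = x1 - x0, y1 - y0  # phi'(t) is constant on a segment
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = x0 + t * dx, y0 + t * dy
        total += P(x, y) * dx + Q(x, y) * dy
    return total * h

def P(x, y):
    return x * y

def Q(x, y):
    return x * x + y * y

# Corners listed so that consecutive pairs give C1, C2, C3, C4 in
# counterclockwise order, starting at (0, 1).
corners = [(0, 1), (0, 0), (1, 0), (1, 1), (0, 1)]
pieces = [segment_integral(P, Q, p, q) for p, q in zip(corners, corners[1:])]
print(pieces)       # approximately [-1/3, 0, 4/3, -1/2]
print(sum(pieces))  # approximately 1/2
```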
EXERCISES

1. For each of the following, sketch the trace of $(\phi, \mathbf{R})$, describe its orientation, and verify that it is a subset of the surface $S$.
(a) $\phi(t) = (3t, 3 \sin t, \cos t)$, $S = \{(x, y, z) : y^2 + 9z^2 = 9\}$.
(b) $\phi(t) = (t^2, t^3, t^2)$, $S = \{(x, y, z) : z = x\}$.
(c) $\phi(t) = (t, t^2, \sin t)$, $S = \{(x, y, z) : y = x^2\}$.
(d) $\phi(t) = (\cos t, \sin t, \cos t)$, $S = \{(x, y, z) : y^2 + z^2 = 1\}$.
(e) $\phi(t) = (\sin t, \sin t, t)$, $S = \{(x, y, z) : y = x\}$.
2. For each of the following, find a (piecewise) smooth parametrization of $C$ and compute $\int_C F \cdot T\,ds$.
(a) $C$ is the curve $y = x^2$ from $(1, 1)$ to $(3, 9)$, and $F(x, y) = (xy, y - x)$.
(b) $C$ is the intersection of the elliptical cylinder $y^2 + 2z^2 = 1$ with the plane $x = -1$, oriented in the counterclockwise direction when viewed from far out the positive $x$ axis, and $F(x, y, z) = (\sqrt{x^3 + y^3 + 5}, z, x^2)$.
(c) $C$ is the intersection of the bent plane $y = |x|$ with the elliptical cylinder $x^2 + 3z^2 = 1$, oriented in the clockwise direction when viewed from far out the positive $y$ axis, and $F(x, y, z) = (z, -z, x + y)$.
3. For each of the following, compute $\int_C \omega$.
(a) $C$ is the polygonal path consisting of the line segment from $(1, 1)$ to $(2, 1)$ followed by the line segment from $(2, 1)$ to $(2, 3)$, and $\omega = y\,dx + x\,dy$.
(b) $C$ is the intersection of $z = x^2 + y^2$ and $x^2 + y^2 + z^2 = 1$, oriented in the counterclockwise direction when viewed from high up the positive $z$ axis, and $\omega = dx + (x + y)\,dy + (x^2 + xy + y^2)\,dz$.
(c) $C$ is the boundary of the rectangle $R = [a, b] \times [c, d]$, oriented in the counterclockwise direction, and $\omega = xy\,dx + (x + y)\,dy$.
(d) $C$ is the intersection of $y = x$ and $y = z^2$, $0 \le z \le 1$, oriented from left to right when viewed from far out the $y$ axis, and $\omega = yx\,dx + \cos y\,dy - dz$.
4. (a) Let $c \in \mathbf{R}$, $\delta > 0$, and set $\tau(u) = \delta u + c$ for $u \in \mathbf{R}$. Prove that if $(\phi, I)$ is a smooth parametrization of some curve, if $J = \tau^{-1}(I)$, and if $\psi = \phi \circ \tau$, then $(\psi, J)$ is orientation equivalent to $(\phi, I)$.
(b) Prove that if $(\phi, I)$ is a parametrization of some smooth arc, then it has an orientation equivalent parametrization of the form $(\psi, [0, 1])$.
(c) Obtain an analogue of (b) for piecewise smooth curves.
5. Let $(\phi, I)$ be a smooth parametrization of some arc and $\tau$ be a $C^1$ function, 1-1 from $J$ onto $I$, that satisfies $\tau'(u) > 0$ for all but finitely many $u \in J$. If $\psi = \phi \circ \tau$, prove that

$$\int_I F(\phi(t)) \cdot \phi'(t)\,dt = \int_J F(\psi(u)) \cdot \psi'(u)\,du$$

for any continuous $F : \phi(I) \to \mathbf{R}^m$.
6. This exercise is used in Section 13.5. Let $f : [a, b] \to \mathbf{R}$ be $C^1$ on $[a, b]$ with $f'(t) \neq 0$ for $t \in [a, b]$. Prove that the explicit curve $x = f^{-1}(y)$, as $y$ runs from $f(a)$ to $f(b)$, is orientation equivalent to the explicit curve $y = f(x)$, as $x$ runs from $a$ to $b$.
7. Let $V \neq \emptyset$ be open in $\mathbf{R}^2$. A function $F : V \to \mathbf{R}^2$ is said to be conservative on $V$ if and only if there is a function $f : V \to \mathbf{R}$ such that $F = \nabla f$ on $V$. Let $(x, y) \in V$ and let $F = (P, Q) : V \to \mathbf{R}^2$ be continuous on $V$.
(a) Suppose that $C(x)$ is a horizontal line segment terminating at $(x, y)$, i.e., a line segment of the form $L((x_1, y); (x, y))$, oriented from $(x_1, y)$ to $(x, y)$. If $C(x)$ is a subset of $V$, prove that

$$\frac{\partial}{\partial x} \int_{C(x)} F \cdot T\,ds = P(x, y).$$

Make and prove a similar statement for $\partial/\partial y$ and vertical line segments in $V$ terminating at $(x, y)$.
(b) Let $(x_0, y_0) \in V$. Prove that

$$\int_C F \cdot T\,ds = 0 \tag{$*$}$$

for all closed piecewise smooth curves $C \subset V$ if and only if for all $(x, y) \in V$, the integrals

$$f(x, y) := \int_{C(x, y)} F \cdot T\,ds$$

give the same value for all piecewise smooth curves $C(x, y)$ that start at $(x_0, y_0)$, end at $(x, y)$, and stay inside $V$.
(c) Prove that $F$ is conservative on $V$ if and only if $(*)$ holds for all closed piecewise smooth curves $C$ that are subsets of $V$.
(d) Prove that if $F$ is $C^1$ and satisfies $(*)$ for all closed piecewise smooth curves $C$ that are subsets of $V$, then

$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}.$$

(Note: If $V$ is nice enough, the converse of this statement also holds (see Exercise 8, p. 504, or Theorem 15.45).)
*8. Let $f : [0, 1] \to \mathbf{R}$ be increasing and continuously differentiable on $[0, 1]$ and let $T$ be the right triangle whose vertices are $(0, f(0))$, $(1, f(0))$, and $(1, f(1))$. If $c$ represents the hypotenuse of $T$, $a$ and $b$ represent the legs of $T$, and $L$ represents the arc length of the explicit curve $y = f(x)$, $x \in [0, 1]$, prove that $c \le L \le a + b$.
13.3 SURFACES

In this section we define surfaces and unoriented surface integrals, concepts that are two-dimensional analogues of arcs and the line integrals discussed in Section 13.1.

Recall that a smooth arc is parametrized on a closed, bounded interval. On what shall we parametrize a smooth surface? Evidently, we need to use some type of closed, bounded set in $\mathbf{R}^2$. Although we could use rectangles, such a restriction would be awkward when dealing with explicit surfaces with curved projections, or with surfaces described by cylindrical or spherical coordinates. It is much more efficient to build greater generality into the definition of surface, using two-dimensional regions instead of rectangles, i.e., using sets of the following type for $m = 2$.
13.27 DEFINITION. An $m$-dimensional region is a set $E \subset \mathbf{R}^m$ such that $E = \overline{V}$ for some nonempty, open, connected Jordan region $V$ in $\mathbf{R}^m$.

Notice that every closed, bounded interval is a one-dimensional region, every two-dimensional rectangle and the closure of every two-dimensional ball or ellipse is a two-dimensional region, and every three-dimensional rectangle and the closure of every three-dimensional ball or ellipsoid is a three-dimensional region.
13.28 DEFINITION. A subset $S$ of $\mathbf{R}^3$ is called a $C^p$ surface (in $\mathbf{R}^3$) if and only if there is a pair $(\phi, E)$ such that $E$ is a two-dimensional region, $\phi : E \to \mathbf{R}^3$ is $C^p$ on $E$ and 1-1 on $E^o$, and $S = \phi(E)$. In this case we call $(\phi, E)$ a parametrization of $S$, $S$ the trace of $(\phi, E)$, and the equations

$$x = \phi_1(u, v), \quad y = \phi_2(u, v), \quad z = \phi_3(u, v), \quad (u, v) \in E,$$

the parametric equations of $S$ induced by $(\phi, E)$.

Figure 13.8

Earlier, we called the graph of a function $z = f(x, y)$ a surface. The following result shows that this designation is compatible with Definition 13.28 when $f$ is $C^p$.

13.29 Example. Let $E$ be a two-dimensional region and let $f : E \to \mathbf{R}$ be a $C^p$ function. Prove that the graph of $z = f(x, y)$ is a $C^p$ surface.

PROOF. If $\phi(u, v) = (u, v, f(u, v))$, then $\phi$ is $C^p$ and 1-1 on $E$, and $\phi(E)$ is the graph of $z = f(x, y)$. (This is called the trivial parametrization of $z = f(x, y)$.) ∎
In a similar way we define trivial parametrizations of surfaces of the form $x = f(y, z)$ and $y = f(x, z)$. For example, the trivial parametrization of the surface $x = f(y, z)$, $(y, z) \in E$, is given by $(\phi, E)$, where $\phi(u, v) = (f(u, v), u, v)$. By an explicit surface over $E$ we shall mean a surface of the form $x = f(y, z)$, $y = f(x, z)$, or $z = f(x, y)$, where $f : E \to \mathbf{R}$ is a $C^p$ function and $E$ is a two-dimensional region. By the proof of Example 13.29, every explicit surface is a $C^p$ surface. The next four examples, which provide model parametrizations for certain kinds of surfaces, show that not every surface is an explicit surface.

13.30 Example. Show that the truncated cylinder $x^2 + y^2 = 1$, $0 \le z \le 2$, is a $C^\infty$ surface.

PROOF. Let $\phi(u, v) = (\cos u, \sin u, v)$ and $E = [0, 2\pi] \times [0, 2]$, and notice that $\phi$ is 1-1 on $E^o$ and $C^\infty$ on $E$. The corresponding parametric equations are $x = \cos u$, $y = \sin u$, $z = v$. Clearly, $x^2 + y^2 = 1$. Thus $\phi(E)$ is a subset of the cylinder $x^2 + y^2 = 1$, $0 \le z \le 2$. Since $E$ is connected, so is $\phi(E)$. To see that $\phi(E)$ is the entire cylinder, look at the images of horizontal line segments in $E$. The image of the line segment $v = v_0$ is a circle lying in the plane $z = v_0$, centered at $(0, 0, v_0)$, of radius 1 (see Figure 13.8). Thus, as $v_0$ ranges from $0$ to $2$, the images of horizontal lines $v = v_0$ cover the entire cylinder $x^2 + y^2 = 1$, $0 \le z \le 2$. ∎
13.31 Example. Show that the sphere $x^2 + y^2 + z^2 = a^2$ is a $C^\infty$ surface.

PROOF. Let $\phi(u, v) = (a \cos u \cos v, a \sin u \cos v, a \sin v)$ and $E = [0, 2\pi] \times [-\pi/2, \pi/2]$. Clearly, $\phi$ is $C^\infty$ on $E$. The corresponding parametric equations are $x = a \cos u \cos v$, $y = a \sin u \cos v$, $z = a \sin v$. Since $x^2 + y^2 = a^2 \cos^2 v$, we have $x^2 + y^2 + z^2 = a^2$. Thus $\phi(E)$ is a subset of the sphere centered at the origin of radius $a$. The image of the horizontal line segment $v = v_0$ is a circle, lying in the plane $z = a \sin v_0$, centered at $(0, 0, a \sin v_0)$ of radius $a \cos v_0$ (see Figure 13.9). The image of the top edge (respectively, bottom edge) of $E$, i.e., of the horizontal line $v = \pi/2$ (respectively, $v = -\pi/2$), is the north pole $(0, 0, a)$ (respectively, the south pole $(0, 0, -a)$). Thus, as $v_0$ ranges from $-\pi/2$ to $\pi/2$, the images of horizontal lines $v = v_0$ cover the entire sphere $x^2 + y^2 + z^2 = a^2$. ∎

Figure 13.9

Let $C$ represent the circle in the $xz$ plane centered at $(a, 0, 0)$ of radius $b$, where $a > b$. The torus centered at the origin with radii $a > b$ is the donut-shaped surface obtained by revolving $C$ about the $z$ axis (see Figure 13.10).

13.32 Example. Show that the torus centered at the origin with radii $a > b$ is a $C^\infty$ surface.

PROOF. Let $\phi(u, v) = ((a + b \cos v) \cos u, (a + b \cos v) \sin u, b \sin v)$ and $E = [-\pi, \pi] \times [-\pi, \pi]$, and notice that $\phi$ is 1-1 on $E^o$ and $C^\infty$ on $E$. The image of $u = 0$ is a circle in the $xz$ plane centered at $(a, 0, 0)$ of radius $b$. The images of horizontal lines $v = v_0$ are circles, parallel to the $xy$ plane, centered at $(0, 0, b \sin v_0)$ of radius $a + b \cos v_0$. The image of the lines $v = \pm\pi$ is a circle in the $xy$ plane centered at $(0, 0, 0)$ of radius $a - b$. Thus $\phi(E)$ covers the entire torus. ∎
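One quick way to gain confidence in a model parametrization is to sample it and test the samples against an implicit equation of the surface. The sketch below does this for the torus of Example 13.32, using the standard implicit description $(\sqrt{x^2 + y^2} - a)^2 + z^2 = b^2$ of the surface obtained by revolving the circle $C$ about the $z$ axis. This equation is not stated in the text and is brought in here as an assumption; the radii $a = 2$, $b = 1$, the helper names, and the random sampling are likewise illustrative choices.

```python
import math
import random

def torus(u, v, a, b):
    """The parametrization phi(u, v) of Example 13.32."""
    r = a + b * math.cos(v)
    return (r * math.cos(u), r * math.sin(u), b * math.sin(v))

def torus_defect(p, a, b):
    """|(sqrt(x^2 + y^2) - a)^2 + z^2 - b^2|, which vanishes exactly on
    the torus with radii a > b (assumed implicit equation)."""
    x, y, z = p
    return abs((math.hypot(x, y) - a) ** 2 + z * z - b * b)

random.seed(0)
a, b = 2.0, 1.0
worst = max(
    torus_defect(
        torus(random.uniform(-math.pi, math.pi),
              random.uniform(-math.pi, math.pi), a, b),
        a, b,
    )
    for _ in range(1000)
)
print(worst)  # on the order of rounding error
```

A check of this kind only shows that $\phi(E)$ is a subset of the torus; that $\phi(E)$ covers the entire torus is the content of the argument in the proof.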
13.33 Example. Let $b > 0$. Show that the truncated cone $z = \sqrt{x^2 + y^2}$, $0 \le z \le b$, is a $C^\infty$ surface.

PROOF. Let $(x, y, z) = \phi(u, v) = (v \cos u, v \sin u, v)$ and $E = [0, 2\pi] \times [0, b]$, and notice that $\phi$ is 1-1 on $E^o$ and $C^\infty$ on $E$. Clearly, $x^2 + y^2 = z^2$ and $0 \le z \le b$. Thus $\phi(E)$ is a subset of the given cone. The image of a horizontal line $v = v_0$, $0 < v_0 \le b$, is a circle in the plane $z = v_0$ centered at $(0, 0, v_0)$ of radius $v_0$ (see Figure 13.11). Thus $\phi(E)$ is the cone $z = \sqrt{x^2 + y^2}$, $0 \le z \le b$. Notice that the image of the line $v = 0$ is the vertex $(0, 0, 0)$. ∎
Let S be aCP surface with parametrization (¢, E), and suppose that (uo, vo) E EO.
13.3
Surfaces
471
(-n, -n)
(a
+ b, 0,0)
(0, a + b, 0)
Figure 13.10
z
(O,b)
(2n, b)
v = vo .......
~ (0,0)
(2n, 0)
x
y
Figure 13.11 If ¢ = (¢l, ¢2, ¢3), then by the Implicit Function Theorem (see the proof of Remark 13.10), one can show that if at least one of the partial Jacobians is nonzero at (uo, vo); i.e., if
(5) for some i =1= j, then there is a CP explicit surface (1jJ, B) such that (xo, Yo, zo) := ¢(uo, vo) E 1jJ(B) and 1jJ(B) C ¢(E). Since differentiable explicit surfaces have tangent planes (see Theorem 11.22), it follows that if (5) is satisfied for some i =1= j and (xo, Yo, zo) = ¢(uo, vo), then S has a tangent plane at (xo, Yo, zo). The following result shows how to use a parametrization of a surface to compute a normal to its tangent plane. 13.34 Remark. Let S be a CP surface, let (¢, E) be one of its parametrizations, and set ¢ =: (¢l, ¢2, ¢3). If (5) holds at some (uo, vo) E EO and some i =1= j, then a normal to the tangent plane of Sat (xo, Yo, zo) = ¢(uo,vo) is given by
Figure 13.12
PROOF. Let $\Pi$ be the tangent plane to $S$ at $\phi(u_0, v_0)$. To compute a normal to $\Pi$ we need only find two vectors that lie in $\Pi$. But $\phi_u(u_0, v_0)$ is tangent to the curve $\phi(u, v_0)$ and $\phi_v(u_0, v_0)$ is tangent to the curve $\phi(u_0, v)$ (see Figure 13.3). Hence, $\phi_u(u_0, v_0)$ and $\phi_v(u_0, v_0)$ both lie in $\Pi$ (see Figure 13.12). Therefore, a normal to $\Pi$ at $(x_0, y_0, z_0)$ is given by the cross product $\phi_u(u_0, v_0) \times \phi_v(u_0, v_0)$. ∎
If $(\phi, E)$ is a parametrization of a $C^1$ surface $S$, we shall use the notation

$$N_\phi(u, v) := \phi_u(u, v) \times \phi_v(u, v), \qquad (u, v) \in E,$$

to represent the vector (6). When (5) holds for some $i \neq j$, we shall call $N_\phi(u_0, v_0)$ the normal induced by $\phi$ on $S$. It is easy to check that if $z = f(x, y)$ is an explicit surface and $\phi$ is its trivial parametrization, then $N_\phi = (-f_x, -f_y, 1)$. This is precisely the normal we used for explicit surfaces before (see Theorem 11.22).

Normal vectors play the same role for surfaces that tangent vectors played for curves. (For example, we shall use normal vectors to define area of surfaces, smooth surfaces, and orientation of surfaces.) Indeed, many of the concepts for curves can be brought over to surfaces by replacing $\phi'$ by $N_\phi$. For example, compare the following definition with Definition 13.11.

13.35 DEFINITION. Let $(\phi, E)$ be a parametrization of a $C^p$ surface.
(i) $(\phi, E)$ is said to be smooth at a point $(u_0, v_0) \in E$ if and only if $N_\phi(u_0, v_0) \neq 0$ (equivalently, if and only if $\|N_\phi(u_0, v_0)\| > 0$).
(ii) $(\phi, E)$ is said to be smooth if and only if it is smooth at each point in $E$.
(iii) $(\phi, E)$ is said to be smooth off a set $E_0 \subset E$ if and only if $(\phi, E)$ is smooth at each point in $E \setminus E_0$.
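The claim that the trivial parametrization of $z = f(x, y)$ induces the normal $(-f_x, -f_y, 1)$ can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the author's method: it approximates $\phi_u \times \phi_v$ by central differences for a hypothetical choice $f(x, y) = x^2 y + \sin y$ and compares the result with the closed form. The helper names `cross` and `induced_normal` are made up for this example.

```python
import math

def cross(p, q):
    # Cross product of two vectors in R^3.
    return (p[1]*q[2] - p[2]*q[1],
            p[2]*q[0] - p[0]*q[2],
            p[0]*q[1] - p[1]*q[0])

def induced_normal(phi, u, v, h=1e-6):
    """Central-difference approximation of N_phi = phi_u x phi_v at (u, v)."""
    pu = [(a - b)/(2*h) for a, b in zip(phi(u + h, v), phi(u - h, v))]
    pv = [(a - b)/(2*h) for a, b in zip(phi(u, v + h), phi(u, v - h))]
    return cross(pu, pv)

def f(x, y):
    # Hypothetical explicit surface z = f(x, y), chosen only for illustration.
    return x*x*y + math.sin(y)

def trivial(x, y):
    # Trivial parametrization (x, y) -> (x, y, f(x, y)).
    return (x, y, f(x, y))

x0, y0 = 0.7, -0.4
numeric = induced_normal(trivial, x0, y0)
# Closed form (-f_x, -f_y, 1) with f_x = 2xy and f_y = x^2 + cos y.
closed_form = (-2*x0*y0, -(x0*x0 + math.cos(y0)), 1.0)
print(numeric, closed_form)
```

Since the third component is always 1, the normal never vanishes, which is the reason the trivial parametrization is smooth in the sense of Definition 13.35.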
Notice that the trivial parametrization of an explicit surface is always smooth. Analogous to the situation for curves, a surface with a smooth parametrization must have a tangent plane at each of its points (see Exercise 7). On the other hand, a surface with tangent planes at each point can have nonsmooth parametrizations. For example, the parametrization $\phi$ of the sphere given in Example 13.31 satisfies

$$\|N_\phi\| = \|(a^2\cos u\cos^2 v,\ a^2\sin u\cos^2 v,\ a^2\sin v\cos v)\| = a^2|\cos v|,$$

hence is not smooth when $v = \pm\pi/2$. (This happens because this parametrization takes the lines $v = \pm\pi/2$ to the north and south pole, hence is not 1-1 there.)

We shall call a surface $S$ smooth if and only if for each point $x_0 \in S$ there is a parametrization $(\phi, E)$ of $S$ that is smooth at $(u_0, v_0)$, where $x_0 = \phi(u_0, v_0)$. Other authors call a surface smooth only when it has a smooth parametrization. This definition is inadequate for most "closed" surfaces, i.e., surfaces that are the boundary of some three-dimensional region, because those surfaces have no (globally) smooth parametrizations. (See, for example, the discussion of the parametrization of the sphere in the preceding paragraph. The sphere is smooth by our definition, however, since we can find other parametrizations that are "smooth" at the north and south poles, e.g., the trivial parametrizations of each hemisphere. This is typical. Every surface smooth by our definition is a union of surfaces with smooth parametrizations; see Exercise 7, p. 487.)

The following result shows what happens to the normal vector $N_\phi$ under a change of parameter.
13.36 THEOREM. Let $(\phi, E)$ and $(\psi, B)$ be parametrizations of the same $C^p$ surface. If $\tau$ is a $C^1$ function that takes $B$ into $E$ such that $\psi = \phi \circ \tau$, then

$$N_\psi(u, v) = \Delta_\tau(u, v)\, N_\phi(\tau(u, v))$$

for each $(u, v) \in B$.

PROOF. By definition, $N_\psi = (\Delta_{(\psi_2,\psi_3)}, \Delta_{(\psi_3,\psi_1)}, \Delta_{(\psi_1,\psi_2)})$. Since, by hypothesis, $(\psi_i, \psi_j) = (\phi_i, \phi_j) \circ \tau$ for $i, j = 1, 2, 3$, it follows from the Chain Rule that $\Delta_{(\psi_i,\psi_j)}(u, v) = \Delta_\tau(u, v)\, \Delta_{(\phi_i,\phi_j)}(\tau(u, v))$ for any $(u, v) \in B$. Therefore, $N_\psi = \Delta_\tau \cdot (N_\phi \circ \tau)$ on $B$. ∎
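The identity of Theorem 13.36 is easy to test at a single point. The sketch below is illustrative only: it assumes the unit-sphere parametrization for $\phi$ and a hypothetical change of parameter $\tau(s, t) = (s + t^2, st + t)$, whose Jacobian determinant is $\Delta_\tau(s, t) = (s + 1) - 2t^2$. Both normals are approximated by central differences; all helper names are made up for this example.

```python
import math

def cross(p, q):
    return (p[1]*q[2] - p[2]*q[1],
            p[2]*q[0] - p[0]*q[2],
            p[0]*q[1] - p[1]*q[0])

def induced_normal(f, s, t, h=1e-6):
    """Central-difference approximation of f_s x f_t at (s, t)."""
    fs = [(a - b)/(2*h) for a, b in zip(f(s + h, t), f(s - h, t))]
    ft = [(a - b)/(2*h) for a, b in zip(f(s, t + h), f(s, t - h))]
    return cross(fs, ft)

def phi(u, v):
    # Unit sphere, parametrized as in the text with a = 1.
    return (math.cos(u)*math.cos(v), math.sin(u)*math.cos(v), math.sin(v))

def tau(s, t):
    # Hypothetical C^1 change of parameter (an assumption for this sketch).
    return (s + t*t, s*t + t)

def delta_tau(s, t):
    # Jacobian determinant of tau: det [[1, 2t], [t, s + 1]].
    return (s + 1.0) - 2.0*t*t

def psi(s, t):
    return phi(*tau(s, t))

s0, t0 = 0.3, 0.5
lhs = induced_normal(psi, s0, t0)                                   # N_psi(s0, t0)
rhs = tuple(delta_tau(s0, t0)*c for c in induced_normal(phi, *tau(s0, t0)))
print(lhs, rhs)
```

The two printed vectors should agree up to finite-difference error; where $\Delta_\tau$ is negative the same computation shows the normal reversing direction.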
This leads us to the following definition (compare with Definition 13.13).
13.37 DEFINITION. Two $C^p$ parametrizations $(\phi, E)$, $(\psi, B)$ are said to be smoothly equivalent if and only if they are smooth parametrizations of the same surface and there is a $C^p$ function $\tau$, that takes $B$ onto $E$, such that $\psi = \phi \circ \tau$ and $\Delta_\tau(u, v) \neq 0$ for all $(u, v) \in B$. The function $\tau$ is called the transition from $B$ to $E$.

Analogous to Definitions 13.6 and 13.14, we define surface area and the surface integral as follows.
13.38 DEFINITION. Let $S$ be a smooth $C^p$ surface and $(\phi, E)$ be one of its parametrizations.
(i) The surface area of $S$ is defined to be

$$\sigma(S) := \int_E \|N_\phi(u, v)\|\, d(u, v).$$

(ii) If $g : S \to \mathbf{R}$ is continuous, then the surface integral of $g$ on $S$ is defined to be

$$\iint_S g\, d\sigma := \int_E g(\phi(u, v))\, \|N_\phi(u, v)\|\, d(u, v). \tag{7}$$

The surface integral (7) can be interpreted as the mass of a membrane with shape $\phi(E)$ and density $g$ (see Appendix E). For an explicit $C^p$ surface $S$ given by $z = f(x, y)$, $(x, y) \in E$, this integral looks like

$$\iint_S g\, d\sigma = \int_E g(x, y, f(x, y)) \sqrt{1 + f_x^2(x, y) + f_y^2(x, y)}\, d(x, y). \tag{8}$$
It can be argued on heuristic grounds that this is the right definition for surface area (see Appendix E). In fact, we could have defined the surface area of $S$ by approximating it with planar regions, as we defined $\|C\|$ below Example 13.16 by approximating it by line segments (see Price [10], p. 360). This approach, however, works only under suitable restrictions. Indeed, even when using triangular regions to approximate a bounded cylinder, the total area of the approximating regions may become infinite (see Spivak [12], p. 130). We prefer Definition 13.38i because it is both direct and easy to use.

Notice that by Theorem 12.24, (7) makes sense when the normal $N_\phi(u, v)$ is undefined on a set of area zero. Thus the surface integral can be defined for some nonsmooth surfaces, e.g., for cones. It is easy to see that surface area and the surface integral are invariant under smoothly equivalent parametrizations, even when the condition $\Delta_\tau \neq 0$ is relaxed on a closed set of area zero (see Exercise 5). It is also easy to see that if a surface $S$ is a subset of $\mathbf{R}^2$, then its surface area, as defined by Definition 13.38, is the same as the area of $S$, as defined by Definition 12.3 (see Exercise 4).

To compute a surface integral, one must find a suitable parametrization of the given surface and apply Definition 13.38.
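The recipe "parametrize, then apply Definition 13.38" translates directly into a small numerical routine. The sketch below is one possible illustration, assuming a midpoint rule on a rectangular parameter domain and a central-difference approximation of $N_\phi$; the function `surface_integral` and its arguments are hypothetical names, not an established API. As a check it recovers the area $4\pi a^2$ of the sphere of radius $a$.

```python
import math

def cross(p, q):
    return (p[1]*q[2] - p[2]*q[1],
            p[2]*q[0] - p[0]*q[2],
            p[0]*q[1] - p[1]*q[0])

def normal(phi, u, v, h=1e-5):
    """Central-difference approximation of N_phi = phi_u x phi_v."""
    pu = [(a - b)/(2*h) for a, b in zip(phi(u + h, v), phi(u - h, v))]
    pv = [(a - b)/(2*h) for a, b in zip(phi(u, v + h), phi(u, v - h))]
    return cross(pu, pv)

def surface_integral(g, phi, u_range, v_range, n=200):
    """Midpoint-rule approximation of the integral of g(phi) * ||N_phi|| over E."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0)/n, (v1 - v0)/n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5)*du
        for j in range(n):
            v = v0 + (j + 0.5)*dv
            N = normal(phi, u, v)
            total += g(*phi(u, v))*math.sqrt(sum(c*c for c in N))*du*dv
    return total

a = 2.0

def sphere(u, v):
    return (a*math.cos(u)*math.cos(v), a*math.sin(u)*math.cos(v), a*math.sin(v))

# Surface area is the surface integral of the constant function 1.
area = surface_integral(lambda x, y, z: 1.0, sphere,
                        (0.0, 2*math.pi), (-math.pi/2, math.pi/2))
print(area, 4*math.pi*a*a)
```

Because the midpoint rule never evaluates the integrand on the lines $v = \pm\pi/2$, the vanishing of $N_\phi$ at the poles causes no difficulty, in line with the remark that a set of area zero can be ignored.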
13.39 Example. Find $\iint_S g\, d\sigma$, where $S$ is the hemisphere $z = \sqrt{a^2 - x^2 - y^2}$ and $g(x, y, z) = \sqrt{z}$.
SOLUTION. Let $\phi$ be the function defined in Example 13.31 and $E = [0, 2\pi] \times [0, \pi/2]$. Then $(\phi, E)$ is a parametrization of the hemisphere $S$ and $\|N_\phi\| = a^2\cos v$. Therefore,

$$\iint_S g\, d\sigma = \int_0^{\pi/2}\!\!\int_0^{2\pi} a^2\cos v\,\sqrt{a\sin v}\, du\, dv = 2\pi a^{5/2}\int_0^{\pi/2} \cos v\,\sqrt{\sin v}\, dv = \frac{4\pi}{3}\, a^{5/2}. \ ∎$$
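The value obtained in Example 13.39 can be confirmed with a one-dimensional quadrature, since the integrand does not depend on $u$. The following minimal sketch assumes a plain midpoint rule and the sample radius $a = 2$; the function name `example_13_39` is hypothetical.

```python
import math

def example_13_39(a, n=20000):
    """Midpoint rule for 2*pi * integral of a^2 cos v sqrt(a sin v) over [0, pi/2]."""
    dv = (math.pi/2)/n
    inner = sum(a*a*math.cos((j + 0.5)*dv)*math.sqrt(a*math.sin((j + 0.5)*dv))
                for j in range(n))*dv
    return 2*math.pi*inner      # the u-integral contributes the factor 2*pi

a = 2.0
numeric = example_13_39(a)
exact = 4*math.pi/3*a**2.5      # (4*pi/3) * a^(5/2)
print(numeric, exact)
```

The integrand behaves like $\sqrt{v}$ near $v = 0$, so the midpoint rule converges more slowly than for a smooth integrand; a fine grid is used for that reason.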
Continuity of $g$ is assumed in Definition 13.38 only so that the integral on the right-hand side of (7) makes sense. If one of the iterated integrals is a convergent improper integral, we can extend the definition of the surface integral in the obvious way. Using this observation, we now offer a second solution to Example 13.39 using the trivial parametrization.

ALTERNATIVE SOLUTION. The explicit surface $z = \sqrt{a^2 - x^2 - y^2}$ has normal $N = (-z_x, -z_y, 1) = (x/z, y/z, 1)$. (This normal does not exist on $\partial B_a(0, 0)$, but since $\partial B_a(0, 0)$ is of area zero, we can ignore it when integrating over $B_a(0, 0)$.) Notice that on $S$, $\|N\| = a/z$. Thus, by (8) and polar coordinates,

$$\iint_S g\, d\sigma = \int_{B_a(0,0)} \frac{a\sqrt{z}}{z}\, d(x, y) = a\int_0^{2\pi}\!\!\int_0^a r\,(a^2 - r^2)^{-1/4}\, dr\, d\theta = \frac{4\pi}{3}\, a^{5/2}.$$
(The inner integral (with respect to $r$) is an improper integral.) ∎

For even the simplest applications, we must have a theory rich enough to handle surfaces, such as the boundary of the unit cube $\partial([0, 1] \times [0, 1] \times [0, 1])$, which are not smooth but a union of smooth pieces. Consequently, we shall extend the theory developed above to finite unions of smooth surfaces. This expanded theory will be introduced using informal geometric descriptions instead of formal statements. For now, these vague descriptions will suffice because the concrete surfaces that arise in practice are easy to visualize. (Chapter 15 contains a rigorous and more mathematically satisfying treatment of these ideas.)

Before describing piecewise smooth surfaces, we must distinguish between interior points (points that lie "inside" a surface) and boundary points (points that lie on the "edge" of a surface). To illustrate the difference, consider the truncated cylinder $S$ parametrized by $(\phi, E)$ in Example 13.30. A point $(x, y, z) \in S$ lies inside $S$ if $0 < z < 2$, and on its edge if $z = 0$ or $z = 2$. (Look at Figure 13.8 to see why this terminology is appropriate.) Naively, we might guess that $(x, y, z)$ lies on the edge of $\phi(E)$ if and only if $(x, y, z) \notin \phi(E^\circ)$. This guess is incorrect, even for the cylinder; for example, $(1, 0, 1) = \phi(0, 1)$ does not belong to $\phi(E^\circ)$ but does not belong to an edge of the cylinder either. (Instead, it lies on a "seam" of $S$.) Evidently, to define the interior and boundary of a general surface $S$, we must describe it geometrically. We cannot define them by using a particular parametrization $(\phi, E)$.

Accordingly, let $S$ be a $C^p$ surface in $\mathbf{R}^3$. Imagine yourself standing on a point $(x, y, z) \in S$. We shall say that $(x, y, z)$ is interior to $S$ if you are surrounded on all sides by points in $S$; i.e., if you take a sufficiently small step in any direction you remain on $S$.
We shall denote the set of interior points of a surface $S$ by $\operatorname{Int}(S)$ and shall define the (manifold) boundary of a surface $S$ by $\partial S := S \setminus \operatorname{Int}(S)$. We have used the same notation to denote the boundary of a surface as we did to denote the boundary of a set (see Definition 8.34 or 10.37) even though these concepts are not the same. We made this choice because it homogenizes the statements of all the fundamental theorems of multidimensional calculus. To avoid ambiguity, we shall henceforth refer to the boundary of a region $E$ (i.e., to $E \setminus E^\circ$) as the topological boundary of $E$. No confusion will arise because the only boundary
we use in connection with surfaces is the manifold boundary, and the only boundary we use in connection with $m$-dimensional regions is the topological boundary.

A surface $S$ is said to be closed if and only if $\partial S = \emptyset$. For example, if $a > 0$, then the sphere $x^2 + y^2 + z^2 = a^2$ is closed, but the hemisphere $z = \sqrt{a^2 - x^2 - y^2}$ (respectively, the truncated paraboloid $z = x^2 + y^2$, $z \le 1$) is not closed, since its boundary is $x^2 + y^2 = a^2$, $z = 0$ (respectively, $x^2 + y^2 = 1$, $z = 1$).

By the Jordan Curve Theorem, a closed arc $C$ divides $\mathbf{R}^2$ into two or more disjoint connected sets, the bounded components "surrounded" by $C$ and the unbounded component that lies "outside" $C$. This is not the case for closed surfaces. Indeed, there are closed smooth surfaces (the Klein bottle is one example) that surround no points, hence do not divide $\mathbf{R}^3$ into disjoint sets (see Griffiths [3], p. 22, or Hocking and Young [4], p. 237).

A set $S \subset \mathbf{R}^3$ will be called a piecewise smooth surface if and only if $S = \bigcup_{j=1}^N S_j$, where each $S_j = \phi_j(E_j)$ is a smooth surface and for each $j \neq k$ either $S_j \cap S_k$ is empty, or a portion of the boundary of $S_j$ is matched to a portion of the boundary of $S_k$. Thus a piecewise smooth surface might consist of disjoint pieces, such as the topological boundary of the corona $0 < a \le \|(x, y, z)\| \le b$, or connected pieces with ridges, such as the concentric boxes $\partial[([0, 3] \times [0, 3] \times [0, 3]) \setminus ([1, 2] \times [1, 2] \times [1, 2])]$. We make the further restriction that the intersection of any three $S_j$'s is at most finite. This prevents a piecewise smooth surface from doubling back on itself more than once along any given edge.

Let $S = \bigcup_{j=1}^N S_j$ be a piecewise smooth surface. By a parametrization of $S$ we mean a collection of smooth parametrizations $(\phi_j, E_j)$ of $S_j$. Two parametrizations $(\phi_j, E_j)$, $(\psi_j, B_j)$ are said to be smoothly equivalent if and only if $(\phi_j, E_j)$ is smoothly equivalent to $(\psi_j, B_j)$ for $j = 1, \ldots, N$.
The boundary, $\partial S$, of $S$ is defined to be the union of all points that belong to the closure of an unmatched portion of $\partial S_j$. (For example, the boundary of the box formed by removing the face $z = 1$ from the unit cube $[0, 1] \times [0, 1] \times [0, 1]$ is the unit square in the plane $z = 1$, and the boundary of the union of $x^2 + y^2 = 1$, $-3 \le z \le 0$, and $z = \sqrt{1 - x^2 - y^2}$ is the unit circle in the plane $z = -3$.) The surface area of $S$ is defined by

$$\sigma(S) = \sum_{j=1}^N \sigma(S_j)$$

and the surface integral of a real-valued function $g$ continuous on $S$ is defined by

$$\iint_S g\, d\sigma = \sum_{j=1}^N \iint_{S_j} g\, d\sigma.$$
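13.40 Example. Let $S$ be the tetrahedron formed by taking the topological boundary of the region bounded by $x = 0$, $y = 0$, $z = 0$, and $x + y + z = 1$. Find a piecewise smooth parametrization of $S$ and compute $\iint_S g\, d\sigma$, where $g(x, y, z) = x + y^2 + z^3$.

SOLUTION. The tetrahedron has four faces that can be parametrized by $\phi_1(u, v) = (u, v, 0)$, $\phi_2(u, v) = (0, u, v)$, $\phi_3(u, v) = (u, 0, v)$, $\phi_4(u, v) = (u, v, 1 - u - v)$, where $(u, v)$ belongs to $E$, the triangular region with vertices $(0, 0)$, $(1, 0)$, and $(0, 1)$. Since $\|N_{\phi_j}\| = 1$ for $j = 1, 2, 3$ and $\|N_{\phi_4}\| = \sqrt{3}$, we have

$$\iint_S g\, d\sigma = \int_E (u + v^2)\, d(u, v) + \int_E (u^2 + v^3)\, d(u, v) + \int_E (u + v^3)\, d(u, v) + \sqrt{3}\int_E \left(u + v^2 + (1 - u - v)^3\right) d(u, v) = \frac{1}{4} + \frac{2}{15} + \frac{13}{60} + \frac{3\sqrt{3}}{10} = \frac{3(2 + \sqrt{3})}{10}. \ ∎$$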
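The face-by-face computation in Example 13.40 can be checked numerically. The sketch below is an illustration, not the book's computation: it integrates $g \circ \phi_j$ over the triangle $E$ with a nested midpoint rule, weights each face by $\|N_{\phi_j}\|$ (1 for the three coordinate faces, $\sqrt{3}$ for the slanted face), and compares the sum with $3(2+\sqrt{3})/10$, the value these four integrals work out to by hand. The helper `integrate_over_triangle` is a hypothetical name.

```python
import math

def g(x, y, z):
    return x + y**2 + z**3

# (parametrization, norm of the induced normal) for the four faces.
faces = [
    (lambda u, v: (u, v, 0.0), 1.0),
    (lambda u, v: (0.0, u, v), 1.0),
    (lambda u, v: (u, 0.0, v), 1.0),
    (lambda u, v: (u, v, 1.0 - u - v), math.sqrt(3.0)),
]

def integrate_over_triangle(h, n=200):
    """Midpoint rule for the integral of h over the triangle 0 <= v <= 1 - u, 0 <= u <= 1."""
    total = 0.0
    du = 1.0/n
    for i in range(n):
        u = (i + 0.5)*du
        dv = (1.0 - u)/n            # the v-interval [0, 1 - u] shrinks with u
        for j in range(n):
            v = (j + 0.5)*dv
            total += h(u, v)*du*dv
    return total

total = sum(norm*integrate_over_triangle(lambda u, v, p=phi: g(*p(u, v)))
            for phi, norm in faces)
expected = 3*(2 + math.sqrt(3))/10
print(total, expected)
```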
EXERCISES

1. For each of the following, find the surface area of $S$.
(a) $S$ is the conical shell given by $z = \sqrt{x^2 + y^2}$, where $a \le z \le b$.
(b) $S$ is the sphere given in Example 13.31.
(c) $S$ is the torus given in Example 13.32.

2. For each of the following, find a (piecewise) smooth parametrization of $S$ and of $\partial S$, and compute $\iint_S g\, d\sigma$.
(a) $S$ is the portion of the surface $z = x^2 - y^2$ that lies above the $xy$ plane and between the planes $x = 1$ and $x = -1$, and $g(x, y, z) = \sqrt{1 + 4x^2 + 4y^2}$.
(b) $S$ is the surface $y = x^3$, $0 \le y \le 8$, $0 \le z \le 4$, and $g(x, y, z) = x^3 z$.
(c) $S$ is the portion of the hemisphere $z = \sqrt{9 - x^2 - y^2}$ that lies outside the cylinder $2x^2 + 2y^2 = 9$, and $g(x, y, z) = x + y + z$.

3. Find a parametrization $(\phi, E)$ of the ellipsoid

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$$

that is smooth off the topological boundary $\partial E$.

4. (a) Suppose that $E$ is a two-dimensional region and $S = \{(x, y, z) \in \mathbf{R}^3 : (x, y) \in E \text{ and } z = 0\}$. Prove that

$$\operatorname{Area}(E) = \iint_S d\sigma \quad\text{and}\quad \iint_S g\, d\sigma = \int_E g(x, y, 0)\, d(x, y)$$

for each continuous $g : E \to \mathbf{R}$.
(b) Let $f : [a, b] \to \mathbf{R}$ be a $C^p$ function, let $C$ be the curve in $\mathbf{R}^2$ determined by $z = f(x)$, $a \le x \le b$, and let $S$ be the surface in $\mathbf{R}^3$ determined by $z = f(x)$, $a \le x \le b$, $c \le y \le d$. Show that $\sigma(S) = (d - c)L(C)$.
(c) Let $f : [a, b] \to \mathbf{R}$ be a $C^p$ function and let $S$ be the surface obtained by revolving the curve $y = f(x)$, $a \le x \le b$, about the $x$ axis. Prove that the surface area of $S$ is

$$\sigma(S) = 2\pi\int_a^b |f(x)|\sqrt{1 + |f'(x)|^2}\, dx.$$

5. Suppose that $\psi(B)$ and $\phi(E)$ are $C^p$ surfaces, and $\psi = \phi \circ \tau$, where $\tau$ is a $C^1$ function from $B$ onto $E$.
(a) If $(\psi, B)$ and $(\phi, E)$ are smooth and $\tau$ is 1-1 with $\Delta_\tau \neq 0$ on $B$, prove that

$$\int_E g(\phi(u, v))\, \|N_\phi(u, v)\|\, du\, dv = \int_B g(\psi(s, t))\, \|N_\psi(s, t)\|\, ds\, dt$$

for all continuous $g : \phi(E) \to \mathbf{R}$.
*(b) If $Z$ is a closed subset of $B$ of area zero such that $(\psi, B)$ is smooth off $Z$, $\tau$ is 1-1, and $\Delta_\tau \neq 0$ on $B^\circ \setminus Z$, prove that

$$\int_E g(\phi(u, v))\, \|N_\phi(u, v)\|\, du\, dv = \int_B g(\psi(s, t))\, \|N_\psi(s, t)\|\, ds\, dt$$

for all continuous $g : \phi(E) \to \mathbf{R}$.

6. Let $f : B_3(0, 0) \to \mathbf{R}$ be differentiable with $\|\nabla f(x, y)\| \le 1$ for all $(x, y) \in B_3(0, 0)$. Prove that if $S$ is the paraboloid $2z = x^2 + y^2$, $0 \le z \le 4$, then

$$\iint_S |f(x, y) - f(0, 0)|\, d\sigma \le 40\pi.$$

7. Let $\phi(E)$ be a $C^p$ surface and $(x_0, y_0, z_0) = \phi(u_0, v_0)$, where $(u_0, v_0) \in E^\circ$. If $N_\phi(u_0, v_0) \neq 0$, prove that $\phi(E)$ has a tangent plane at $(x_0, y_0, z_0)$.

8. Let $S = \psi(B)$ be a smooth surface. Set $E = \|\psi_u\|$, $F = \psi_u \cdot \psi_v$, and $G = \|\psi_v\|$. Prove that the surface area of $S$ is

$$\int_B \sqrt{E^2 G^2 - F^2}\, d(u, v).$$

9. Suppose that $S$ is a $C^1$ surface with parametrization $(\phi, E)$ that is smooth at $(x_0, y_0, z_0) = \phi(u_0, v_0)$. Let $(\psi, I)$ be a parametrization of a $C^1$ curve in $E$ that passes through the point $(u_0, v_0)$ (i.e., there is a $t_0 \in I$ such that $\psi(t_0) = (u_0, v_0)$). Prove that $(\phi \circ \psi)'(t_0) \cdot (\phi_u \times \phi_v)(u_0, v_0) = 0$.
Figure 13.13

13.4 ORIENTED SURFACES

Recall that a smooth curve $\phi(I)$ is oriented by using the tangent vector $\phi'(t)$ to choose a "positive direction." Analogously, a smooth surface $S = \phi(E)$ will be oriented by using the normal vector $N_\phi$ to choose a "positive side." Since smooth surfaces are by definition connected, such a choice will be possible if $S$ has two, and only two, sides.

A new complication arises here. There are smooth surfaces that have only one side. (The following example of such a surface can be made out of paper by taking a long narrow strip by the narrow edges, twisting it once, and gluing the narrow edges together.)
13.41 Example [MÖBIUS STRIP]. Sketch the trace of $(\phi, E)$, where $\phi(u, v) = ((2 + v\sin(u/2))\cos u, (2 + v\sin(u/2))\sin u, v\cos(u/2))$ and $E = [-\pi, \pi] \times [-1, 1]$.

SOLUTION. The image of the horizontal line $v = 0$ under $\phi$ is $(2\cos u, 2\sin u, 0)$, i.e., the circle in the $xy$ plane centered at the origin of radius 2. The image of each vertical line $u = u_0$ is a line segment in $\mathbf{R}^3$ that rotates through space as $u_0$ increases. For example, the image of $u = 0$ is $(2, 0, v)$, $-1 \le v \le 1$, and the image of $u = \pm\pi$ is the seam $S_0 := (-2 \mp v, 0, 0)$, $-1 \le v \le 1$, i.e., the set of points $\{(x, 0, 0) : -3 \le x \le -1\}$. Thus the trace of $(\phi, E)$ is given in Figure 13.13. ∎
To avoid such anomalies, we introduce the following concepts. The unit normal of a smooth surface $S$, at a point $(x_0, y_0, z_0)$ on $S$, induced by one of its parametrizations $(\phi, E)$ is the vector $n(x_0, y_0, z_0) = N_\phi(u_0, v_0)/\|N_\phi(u_0, v_0)\|$, where $\phi(u_0, v_0) = (x_0, y_0, z_0)$. Evidently, the unit normal $n$ is well-defined only when

$$\frac{N_\phi(u_0, v_0)}{\|N_\phi(u_0, v_0)\|} = \frac{N_\phi(u_1, v_1)}{\|N_\phi(u_1, v_1)\|} \neq 0$$

for all $(u_j, v_j) \in E$ that satisfy $\phi(u_j, v_j) = (x_0, y_0, z_0)$ for $j = 0, 1$. This will surely be the case if $\phi$ is 1-1 and smooth on $E$. If $\phi$ fails to be 1-1 on $E$, however, the unit normal $n$ might not be well-defined, even though $(\phi, E)$ is smooth on $E$ (see the
Möbius strip in Figure 13.13, where $\phi(\pi, v) = \phi(-\pi, -v)$ but $N_\phi(\pi, v) = -N_\phi(-\pi, -v)$ for all $v$).

A smooth surface $S$ is said to be orientable if and only if it has a smooth parametrization $(\phi, E)$ that induces an unambiguous unit normal $n$ on $S$ that varies continuously over $S$; i.e., if $\phi(u_0, v_0) = \phi(u_1, v_1)$, then $N_\phi(u_0, v_0)$ points in the same direction as $N_\phi(u_1, v_1)$, and if $(u_2, v_2)$ is near $(u_0, v_0)$, then $N_\phi(u_2, v_2)$ points in approximately the same direction as $N_\phi(u_0, v_0)$. (A formal definition of orientable will be given in Section 15.2.) If $S$ is orientable, then its unit normal can be used to choose a "positive" side (the side from which $n$ points). Henceforth, by a parametrization of an orientable surface $S$ we mean a smooth $(\phi, E)$ that induces an unambiguous unit normal on $S$.
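The failure of the Möbius strip to carry an unambiguous unit normal can be seen numerically at a single point of the seam. The sketch below is illustrative and relies on a central-difference approximation of $N_\phi$: it evaluates the parametrization of Example 13.41 at $(\pi, 0)$ and $(-\pi, 0)$, which are mapped to the same point $(-2, 0, 0)$, and shows that the two induced normals are opposite. Helper names are hypothetical.

```python
import math

def cross(p, q):
    return (p[1]*q[2] - p[2]*q[1],
            p[2]*q[0] - p[0]*q[2],
            p[0]*q[1] - p[1]*q[0])

def induced_normal(phi, u, v, h=1e-6):
    """Central-difference approximation of N_phi = phi_u x phi_v."""
    pu = [(a - b)/(2*h) for a, b in zip(phi(u + h, v), phi(u - h, v))]
    pv = [(a - b)/(2*h) for a, b in zip(phi(u, v + h), phi(u, v - h))]
    return cross(pu, pv)

def mobius(u, v):
    # Parametrization of the Mobius strip from Example 13.41.
    r = 2.0 + v*math.sin(u/2)
    return (r*math.cos(u), r*math.sin(u), v*math.cos(u/2))

p_plus, p_minus = mobius(math.pi, 0.0), mobius(-math.pi, 0.0)
n_plus = induced_normal(mobius, math.pi, 0.0)      # approximately (0, 0, -2)
n_minus = induced_normal(mobius, -math.pi, 0.0)    # approximately (0, 0, 2)
print(p_plus, p_minus)
print(n_plus, n_minus)
```

Both normals are nonzero, so the parametrization is smooth at these parameter values; the obstruction is only that the two normals at the same point of the surface disagree in direction.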
13.42 DEFINITION. Two parametrizations $(\phi, E)$ and $(\psi, B)$ are said to be orientation equivalent if and only if they are parametrizations of the same orientable surface, smoothly equivalent with transition $\tau$, and $\Delta_\tau(u, v) > 0$ for all $(u, v) \in B$.

By Theorem 13.36, if $(\phi, E)$ and $(\psi, B)$ are orientation equivalent, then the normal vectors they generate point in the same direction. Thus the positive side chosen by $(\phi, E)$ is the same as the positive side chosen by $(\psi, B)$. Oriented surface integrals can be defined using the unit normal in the same way that oriented line integrals were defined using the unit tangent (compare the following definition with Definition 13.21).
13.43 DEFINITION. Let $S$ be a smooth orientable surface with unit normal $n$ determined by a parametrization $(\phi, E)$. If $F : S \to \mathbf{R}^3$ is continuous, then the oriented surface integral of $F$ on $S$ is

$$\iint_S F \cdot n\, d\sigma := \int_E (F \circ \phi)(u, v) \cdot N_\phi(u, v)\, d(u, v).$$

The notation of the leftmost integral is consistent with the notation in (7) since $n = N_\phi/\|N_\phi\|$ and $d\sigma = \|N_\phi\|\, d(u, v)$.
Notice that the trivial parametrization always induces an unambiguous normal on an explicit surface. In fact, if $S = \{(x, y, z) : z = f(x, y),\ (x, y) \in E\}$, Definition 13.43 takes the form

$$\iint_S F \cdot n\, d\sigma = \int_E F(x, y, f(x, y)) \cdot (-f_x, -f_y, 1)\, d(x, y). \tag{9}$$
Things are not so simple for smooth surfaces that are the boundary of a three-dimensional region (such as the sphere) and for surfaces that are not smooth (such as the cone), because their parametrizations have at least one point where the normal is zero, hence the unit normal cannot be defined. Nevertheless, as was the case for the oriented line integral, the oriented surface integral can be defined when the normal fails to exist on some set of area zero (see Exercise 4). One needs to be careful, however, with the definition of orientable. If the collection of nonsmooth points cuts across the entire surface (such as the peak of a pup tent or the edge of a
pyramid), one has difficulty defining what it means to have a "continuously varying" normal. We shall address this problem for piecewise smooth surfaces at the end of this section. In the meantime, notice that one can define what it means for a surface $S = \phi(E)$ to be orientable if the set of singularities (i.e., the set of $(x, y, z) \in \mathbf{R}^3$ such that $(x, y, z) = \phi(u, v)$ for some $(u, v) \in E$ that satisfies $N_\phi(u, v) = 0$) is finite. In particular, the standard parametrizations of spheres and cones can be used in Definition 13.43.

What does an oriented surface integral represent? If $F$ represents the flow of an incompressible fluid at points on a surface $S$, then $F \cdot n$ represents the normal component of $F$, i.e., the amount of fluid that flows in the direction of $n$ (see Appendix E). Thus the integral of $F \cdot n\, d\sigma$ on $S$, a measure of the flow of the fluid across the surface $S$ in the direction of $n$, is sometimes called the flux of $F$ across $S$. In particular, we should not be surprised when many of these integrals turn out to be zero.

It is easy to see that the integral of $F \cdot n\, d\sigma$ on a surface $S$ does not change when orientation equivalent parametrizations are used (see Exercise 4). The following result shows that a change of orientation changes the value of the oriented surface integral by a minus sign.

13.44 Remark. If $(\phi, E)$ and $(\psi, B)$ are smoothly equivalent but not orientation equivalent, then

$$\int_E F(\phi(u, v)) \cdot N_\phi(u, v)\, d(u, v) = -\int_B F(\psi(s, t)) \cdot N_\psi(s, t)\, d(s, t).$$

PROOF. Let $\tau$ be the transition from $B$ to $E$. Since $\Delta_\tau$ is continuous and nonzero on the connected set $B$, and $(\phi, E)$ and $(\psi, B)$ are not orientation equivalent, we have $\Delta_\tau < 0$ on $B$. Hence, it follows from Theorem 13.36 and Theorem 12.46 (the Change-of-Variables Formula) that

$$\int_B F(\psi(s, t)) \cdot N_\psi(s, t)\, d(s, t) = -\int_B |\Delta_\tau(s, t)|\, (F \circ \phi \circ \tau)(s, t) \cdot (N_\phi \circ \tau)(s, t)\, d(s, t) = -\int_{\tau(B)} F(\phi(u, v)) \cdot N_\phi(u, v)\, d(u, v) = -\int_E F(\phi(u, v)) \cdot N_\phi(u, v)\, d(u, v). \ ∎$$
Therefore, when evaluating an oriented integral on a surface $S$ whose orientation has been described geometrically, we can use any smooth parametrization of $S$ and adjust the sign of the integral to reflect the prescribed orientation. Here is a typical example.

13.45 Example. Find the value of $\iint_S F \cdot n\, d\sigma$, where $F(x, y, z) = (xy, x - y, z)$, $S$ is the planar region $x + y + z = 1$, $(x, y) \in [0, 1] \times [0, 1]$, and $n$ is the downward-pointing normal.
SOLUTION. The usual normal $(1, 1, 1)$ of the plane $x + y + z = 1$ points upward rather than downward. Thus, by Remark 13.44,

$$\iint_S F \cdot n\, d\sigma = -\int_0^1\!\!\int_0^1 (xy,\ x - y,\ 1 - x - y) \cdot (1, 1, 1)\, dx\, dy = -\frac{1}{4}. \ ∎$$
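The sign adjustment in Example 13.45 can be mirrored in a few lines of code. The sketch below is a minimal illustration: it integrates $F \cdot (1, 1, 1)$ over the unit square using the trivial parametrization (which induces the upward normal) and then negates the result to account for the prescribed downward orientation. The helper `midpoint_2d` is a hypothetical name.

```python
def F(x, y, z):
    return (x*y, x - y, z)

def midpoint_2d(h, a, b, c, d, n=100):
    """Midpoint rule for the integral of h(x, y) over [a, b] x [c, d]."""
    dx, dy = (b - a)/n, (d - c)/n
    return sum(h(a + (i + 0.5)*dx, c + (j + 0.5)*dy)
               for i in range(n) for j in range(n))*dx*dy

def upward_integrand(x, y):
    # F evaluated on the plane z = 1 - x - y, dotted with N = (1, 1, 1).
    return sum(F(x, y, 1.0 - x - y))

flux_up = midpoint_2d(upward_integrand, 0.0, 1.0, 0.0, 1.0)
flux_down = -flux_up            # Remark 13.44: reversing orientation flips the sign
print(flux_down)
```

The integrand simplifies to $xy - 2y + 1$, which is linear in each variable, so the midpoint rule reproduces the value $-1/4$ up to rounding.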
It is convenient to have a "differential" version of oriented surface integrals. To see how to define differentials of degree 2, let $S = \phi(E)$ be a smooth orientable surface and $x = \phi_1(u, v)$, $y = \phi_2(u, v)$, $z = \phi_3(u, v)$. By Remark 13.34,

$$N_\phi = \left(\frac{\partial(y, z)}{\partial(u, v)},\ \frac{\partial(z, x)}{\partial(u, v)},\ \frac{\partial(x, y)}{\partial(u, v)}\right).$$

Therefore, the oriented surface integral of a function $F = (P, Q, R) : \phi(E) \to \mathbf{R}^3$ has the form

$$\int_E \left(P\,\frac{\partial(y, z)}{\partial(u, v)} + Q\,\frac{\partial(z, x)}{\partial(u, v)} + R\,\frac{\partial(x, y)}{\partial(u, v)}\right) d(u, v) =: \iint_S P\, dy\, dz + Q\, dz\, dx + R\, dx\, dy;$$

i.e., we should define differentials of degree 2 by

$$dy\, dz := \frac{\partial(y, z)}{\partial(u, v)}\, d(u, v), \qquad dz\, dx := \frac{\partial(z, x)}{\partial(u, v)}\, d(u, v), \qquad\text{and}\qquad dx\, dy := \frac{\partial(x, y)}{\partial(u, v)}\, d(u, v).$$

(These are two-dimensional analogues of the differential $dy = f'(x)\, dx$.) By a 2-form (or a differential form of degree 2) on a set $\Omega \subset \mathbf{R}^3$ we mean an expression of the form $P\, dy\, dz + Q\, dz\, dx + R\, dx\, dy$, where $P, Q, R : \Omega \to \mathbf{R}$. A 2-form is said to be continuous on $\Omega$ if and only if its coefficients $P, Q, R$ are continuous on $\Omega$. The oriented integral of a continuous 2-form on a smooth surface $S$ oriented with a unit normal $n$ is defined by

$$\iint_S P\, dy\, dz + Q\, dz\, dx + R\, dx\, dy = \iint_S (P, Q, R) \cdot n\, d\sigma.$$
Differential forms of degree 1 were formal devices used in certain computations, e.g., to compute an oriented line integral or to estimate the increment of a function. Similarly, differential forms of degree 2 are formal devices that will be used in certain computations, e.g., to compute an oriented surface integral. They can also be used to unify the three fundamental theorems of vector calculus presented in the next two sections (see Exercise 4, p. 549). (There is a less formal but time-consuming way to introduce differentials in which the differential $dx$ can be interpreted as the derivative of the projection operator $(x, y, z) \mapsto x$ (see Spivak [12], p. 89).)
In general, the boundary of a surface is a curve. Since the boundary of the Möbius strip is a simple closed curve, the boundary of a surface may be orientable even when the surface is not.

Let $S$ be an oriented surface with a piecewise smooth boundary $\partial S$. The orientation of $S$ can be used to induce an orientation on $\partial S$ in the following way. Imagine yourself standing close to $\partial S$ on the positive side of $S$. The direction of positive flow on $\partial S$ moves from right to left; i.e., as you walk around the boundary on the positive side of $S$ in the direction of positive flow, the surface lies on your left. This orientation of $\partial S$ is called the positive orientation, the right-hand orientation, or the orientation on $\partial S$ induced by the orientation of $S$.

When $S$ is a subset of $\mathbf{R}^2$, i.e., of the $xy$ plane, we shall say that $\partial S$ is oriented positively if it carries the orientation induced by the upward-pointing normal on $S$, i.e., the normal that points toward the upper half space $z \ge 0$. Thus if $S$ is a bounded subset of $\mathbf{R}^2$ whose boundary is a connected piecewise smooth closed curve, then the usual orientation on $S$ induces a counterclockwise orientation on $\partial S$ when viewed from high up on the positive $z$ axis. This is not the case, however, when $E$ has interior "holes." For example, if $E = \{(x, y) : a^2 < x^2 + y^2 < b^2\}$ for some $a > 0$, then the positive orientation is counterclockwise on $\{(x, y) : x^2 + y^2 = b^2\}$, but clockwise on $\{(x, y) : x^2 + y^2 = a^2\}$.

A formal definition of the positive or induced orientation will be given in Section 15.2. In the meantime, the informal geometric description given above is sufficient to identify the induced orientation in most concrete situations. Here is a typical example.

13.46 Example. Let $S$ be the truncated paraboloid $z = x^2 + y^2$, $0 \le z \le 4$, with outward-pointing normal. Parametrize $\partial S$ with positive orientation.

SOLUTION. The boundary of $S$ is the circle $x^2 + y^2 = 4$ that lies in the $z = 4$ plane. The positive orientation is clockwise when viewed from high up the $z$ axis. Therefore, a parametrization of $\partial S$ is given by $\phi(t) = (2\sin t, 2\cos t, 4)$, $t \in [0, 2\pi]$. ∎
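Whether a closed curve runs clockwise or counterclockwise when viewed from above can be read off from the sign of the area enclosed by its projection onto the $xy$ plane. The sketch below is an illustration only: it applies the shoelace formula to a polygonal approximation of the curve $\phi(t) = (2\sin t, 2\cos t, 4)$ from Example 13.46; a negative signed area corresponds to clockwise traversal. The function names are hypothetical.

```python
import math

def boundary(t):
    # Parametrization of the boundary circle from Example 13.46.
    return (2*math.sin(t), 2*math.cos(t), 4.0)

def signed_area_xy(curve, n=1000):
    """Shoelace formula for the xy-projection of a closed curve on [0, 2*pi].

    Positive for counterclockwise traversal (viewed from the positive z axis),
    negative for clockwise traversal.
    """
    pts = [curve(2*math.pi*k/n) for k in range(n)]
    s = 0.0
    for k in range(n):
        x0, y0, _ = pts[k]
        x1, y1, _ = pts[(k + 1) % n]
        s += x0*y1 - x1*y0
    return 0.5*s

area = signed_area_xy(boundary)
print(area)     # close to -4*pi, so the circle is traversed clockwise
```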
How do we extend these ideas to piecewise smooth surfaces? If $S = \bigcup S_j$, it is not enough to assume that each $S_j$ is orientable, because the Möbius strip is the union of two orientable surfaces, namely $\phi(E_1)$ and $\phi(E_2)$, where $\phi$ is given by Example 13.41 and $E_k = [\pi(k - 2), \pi(k - 1)] \times [-1, 1]$, $k = 1, 2$. We shall say that a piecewise smooth surface $S = \bigcup S_j$ is orientable if and only if one can use the normals $\pm N$
$$\iint_S F \cdot n\, d\sigma = \sum_{j=1}^N \iint_{S_j} F \cdot n_j\, d\sigma.$$
The following three examples provide further explanation of these ideas.

13.47 Example. Evaluate

$$\iint_S F \cdot n\, d\sigma,$$

where $S$ is the topological boundary of the solid bounded by the cylinder $x^2 + y^2 = 1$, and the planes $z = 0$, $z = 2$, $n$ is the outward-pointing normal, and $F(x, y, z) = (x, 0, y)$.

Figure 13.14

SOLUTION. This surface has three smooth pieces: a vertical side $S_1$, a bottom $S_2$, and a top $S_3$ (see Figure 13.14). Parametrize $S_1$ by $\phi(u, v) = (\cos u, \sin u, v)$, where $E = [0, 2\pi] \times [0, 2]$. Thus $N_\phi = (\cos u, \sin u, 0)$ and

$$\iint_{S_1} F \cdot n\, d\sigma = \int_0^2\!\!\int_0^{2\pi} \cos^2 u\, du\, dv = 2\pi.$$

Since the outward-pointing unit normal to $S_2$ is $n = (0, 0, -1)$, we see by Exercise 4a, p. 477, that

$$\iint_{S_2} F \cdot n\, d\sigma = -\int_{B_1(0,0)} y\, d(x, y) = -\int_0^{2\pi}\!\!\int_0^1 r^2\sin\theta\, dr\, d\theta = 0.$$

Similarly, the integral on $S_3$ is also zero. Therefore,

$$\iint_S F \cdot n\, d\sigma = 2\pi + 0 + 0 = 2\pi. \ ∎$$
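The piecewise computation in Example 13.47 can be reproduced numerically by summing the flux through the side, the bottom, and the top. The sketch below is a minimal illustration using a midpoint rule; the side uses the parametrization and normal from the solution, and the two disks are integrated in polar coordinates with the area factor $r$. The helper names are hypothetical.

```python
import math

def F(x, y, z):
    return (x, 0.0, y)

def dot(p, q):
    return sum(a*b for a, b in zip(p, q))

def midpoint_2d(h, a, b, c, d, n=200):
    """Midpoint rule for the integral of h(s, t) over [a, b] x [c, d]."""
    ds, dt = (b - a)/n, (d - c)/n
    return sum(h(a + (i + 0.5)*ds, c + (j + 0.5)*dt)
               for i in range(n) for j in range(n))*ds*dt

# Side S1: phi(u, v) = (cos u, sin u, v), N_phi = (cos u, sin u, 0) points outward.
side = midpoint_2d(
    lambda u, v: dot(F(math.cos(u), math.sin(u), v), (math.cos(u), math.sin(u), 0.0)),
    0.0, 2*math.pi, 0.0, 2.0)

def disk(z, n_z):
    # Flux through the unit disk at height z with unit normal (0, 0, n_z).
    return midpoint_2d(
        lambda r, t: dot(F(r*math.cos(t), r*math.sin(t), z), (0.0, 0.0, n_z))*r,
        0.0, 1.0, 0.0, 2*math.pi)

bottom = disk(0.0, -1.0)    # outward normal (0, 0, -1)
top = disk(2.0, 1.0)        # outward normal (0, 0, 1)
total = side + bottom + top
print(side, bottom, top, total)
```

The top and bottom contributions vanish because the integrand $\pm y$ is odd in $y$ over a disk centered at the origin.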
13.48 Example. Find $\iint_S F \cdot n\, d\sigma$, where $F(x, y, z) = (x + z^2, x, z)$, $S$ is the topological boundary of the solid bounded by the paraboloid $z = x^2 + y^2$ and the plane $z = 1$, and $n$ is the outward-pointing normal.
SOLUTION. The surface $S$ has two smooth pieces: the paraboloid $S_1$ given by $z = x^2 + y^2$, $0 \le z \le 1$, and the disk $S_2$ given by $x^2 + y^2 \le 1$, $z = 1$. The trivial parametrization of $S_1$ is $\phi(u, v) = (u, v, u^2 + v^2)$, $(u, v) \in B_1(0, 0)$. Note that $N_\phi = (-2u, -2v, 1)$ points inward (the wrong way). Thus, by Remark 13.44 and polar coordinates,

$$\iint_{S_1} F \cdot n\, d\sigma = -\int_{B_1(0,0)} \left(u + (u^2 + v^2)^2,\ u,\ u^2 + v^2\right) \cdot (-2u, -2v, 1)\, d(u, v) = \int_0^1\!\!\int_0^{2\pi} \left(2r^2\cos^2\theta + 2r^5\cos\theta + 2r^2\cos\theta\sin\theta - r^2\right) r\, d\theta\, dr = 0.$$

Since the unit outward-pointing normal of $S_2$ is $n = (0, 0, 1)$ and $F \cdot n = z = 1$ on $S_2$, we see by Exercise 4a, p. 477, that

$$\iint_{S_2} F \cdot n\, d\sigma = \int_{B_1(0,0)} d(x, y) = \operatorname{Area}(B_1(0, 0)) = \pi.$$

Therefore,

$$\iint_S F \cdot n\, d\sigma = 0 + \pi = \pi. \ ∎$$

Figure 13.15
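The two contributions in Example 13.48 can be checked with the same kind of quadrature. The sketch below is illustrative: on the paraboloid it uses the trivial parametrization, whose normal $(-2u, -2v, 1)$ points inward, and therefore negates the integrand; on the disk it uses $n = (0, 0, 1)$. Both pieces are integrated in polar coordinates with a midpoint rule. Helper names are hypothetical.

```python
import math

def F(x, y, z):
    return (x + z*z, x, z)

def dot(p, q):
    return sum(a*b for a, b in zip(p, q))

def polar_midpoint(h, n=200):
    """Midpoint rule over the unit disk: integral of h(r, theta) * r dr dtheta."""
    dr, dt = 1.0/n, 2*math.pi/n
    return sum(h((i + 0.5)*dr, (j + 0.5)*dt)*(i + 0.5)*dr
               for i in range(n) for j in range(n))*dr*dt

def paraboloid_integrand(r, t):
    u, v = r*math.cos(t), r*math.sin(t)
    N = (-2*u, -2*v, 1.0)                    # induced normal, points inward
    return -dot(F(u, v, u*u + v*v), N)       # sign flipped for the outward normal

def disk_integrand(r, t):
    return dot(F(r*math.cos(t), r*math.sin(t), 1.0), (0.0, 0.0, 1.0))

flux_paraboloid = polar_midpoint(paraboloid_integrand)
flux_disk = polar_midpoint(disk_integrand)
total = flux_paraboloid + flux_disk
print(flux_paraboloid, flux_disk, total)
```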
13.49 Example. Compute $\iint_S F \cdot n\, d\sigma$, where $F(x, y, z) = (x, y, z)$, $S$ is the topological boundary of the solid bounded by the hyperboloid of one sheet $x^2 + y^2 - z^2 = 1$ and the planes $z = -1$, $z = \sqrt{3}$, and $n$ is the outward-pointing normal to $S$.
SOLUTION. The surface $S$ has three smooth pieces: a top $S_1$, a side $S_2$, and a bottom $S_3$ (see Figure 13.15). Using $n = (0, 0, 1)$ for $S_1$, we have

$$\iint_{S_1} F \cdot n\, d\sigma = \int_{B_2(0,0)} \sqrt{3}\, d(x, y) = \sqrt{3}\,\operatorname{Area}(B_2(0, 0)) = 4\sqrt{3}\,\pi.$$

Similarly,

$$\iint_{S_3} F \cdot n\, d\sigma = 2\pi.$$

To integrate $F \cdot n$ on $S_2$, let $z = u$ and note that $x^2 + y^2 = 1 + u^2$. Thus $\phi(u, v) = (\sqrt{1 + u^2}\cos v, \sqrt{1 + u^2}\sin v, u)$, $(u, v) \in [-1, \sqrt{3}] \times [0, 2\pi]$, is a parametrization of $S_2$. Since $N_\phi = (-\sqrt{1 + u^2}\cos v, -\sqrt{1 + u^2}\sin v, u)$ points inward and

$$F \cdot N_\phi = (\sqrt{1 + u^2}\cos v, \sqrt{1 + u^2}\sin v, u) \cdot (-\sqrt{1 + u^2}\cos v, -\sqrt{1 + u^2}\sin v, u) = -(1 + u^2) + u^2 = -1,$$

we have

$$\iint_{S_2} F \cdot n\, d\sigma = -\int_{-1}^{\sqrt{3}}\!\!\int_0^{2\pi} (-1)\, dv\, du = 2\pi(1 + \sqrt{3}).$$

Therefore,

$$\iint_S F \cdot n\, d\sigma = 4\sqrt{3}\,\pi + 2\pi + 2\pi(1 + \sqrt{3}) = 2\pi(2 + 3\sqrt{3}). \ ∎$$
EXERCISES 1. For each of the following, find a (piecewise) smooth parametrization of
agrees with the induced orientation, and compute
las F . T ds.
as that
(a) S is the truncated paraboloid y = 9 - x 2 - z2, Y 2 0, with outward-pointing normal, and F(x, y, z) = (x 2y, y 2x, X + y + z). (b) S is the portion of the plane x + 2y + z = 1 that lies in the first octant, with normal that points away from the origin, and F(x, y, z) = (x - y, y - x, xz 2). (c) S is the truncated paraboloid z = x 2 + y2, 1 ::; z ::; 4, with outward-pointing normal, and F(x, y, z) = (5y + cos z, 4x - sin z, 3x cos z + 2ysinz).
2. For each of the following, compute
lIs F . n der.
(a) S is the truncated paraboloid z = x 2 + y2, 0 ::; Z ::; 1, n is the outwardpointing normal, and F(x,y,z) = (x,y,z). (b) S is the truncated half cylinder z = V4 - y2, 0 ::; x ::; 1, n is outwardpointing normal, and F(x, y, z) = (x 2 + y2, yz, z2). (c) S is the torus in Example 13.32, n is the outward-pointing normal, and
F(x,y,z)
=
(y,-x,z).
(d) S is the portion of z = x 2 that lies inside the cylinder x 2 + y2 = 1, n is the upward-pointing normal, and F(x, y, z) = (y2 z, cos(2+ log(2 -x 2 _y2)), x 2z).
13.4
487
Oriented surfaces
3. For each of the following, compute
\[ \iint_S \omega. \]
(a) $S$ is the portion of the surface $z = x^4 + y^2$ that lies over the unit square $[0,1] \times [0,1]$, with upward-pointing normal, and $\omega = x\, dy\, dz + y\, dz\, dx + z\, dx\, dy$.
(b) $S$ is the upper hemisphere $z = \sqrt{a^2 - x^2 - y^2}$, with outward-pointing normal, and $\omega = x\, dy\, dz + y\, dz\, dx$.
(c) $S$ is the spherical cap $z = \sqrt{a^2 - x^2 - y^2}$ that lies inside the cylinder $x^2 + y^2 = b^2$, $0 < b < a$, with upward-pointing normal, and $\omega = xz\, dy\, dz + dz\, dx + z\, dx\, dy$.
(d) $S$ is the truncated cone $z = 2\sqrt{x^2 + y^2}$, $0 \le z \le 2$, with normal that points away from the $z$ axis, and $\omega = x\, dy\, dz + y\, dz\, dx + z^2\, dx\, dy$.

4. Suppose that $\psi(B)$ and $\phi(E)$ are $C^p$ surfaces, and $\psi = \phi \circ \tau$, where $\tau$ is a $C^1$ function from $B$ onto $E$.
(a) If $(\psi, B)$ and $(\phi, E)$ are smooth, and $\tau$ is 1-1 with $\Delta_\tau > 0$ on $B$, prove for all continuous $F : \phi(E) \to \mathbf{R}^3$ that
\[ \int_E F(\phi(u, v)) \cdot N_\phi(u, v)\, d(u, v) = \int_B F(\psi(s, t)) \cdot N_\psi(s, t)\, d(s, t). \]
*(b) Let $Z$ be a closed subset of $B$ of area zero, $(\psi, B)$ be smooth off $Z$, and $\tau$ be 1-1 with $\Delta_\tau > 0$ on $B^o \setminus Z$. Prove for all continuous $F : \phi(E) \to \mathbf{R}^3$ that
\[ \int_E F(\phi(u, v)) \cdot N_\phi(u, v)\, d(u, v) = \int_B F(\psi(s, t)) \cdot N_\psi(s, t)\, d(s, t). \]
5. Let $E$ be the solid tetrahedron bounded by $x = 0$, $y = 0$, $z = 0$, and $x + y + z = 1$, and suppose that its topological boundary, $T = \partial E$, is oriented with outward-pointing normal. Prove for all $C^1$ functions $P, Q, R : E \to \mathbf{R}$ that
\[ \iint_{\partial E} P\, dy\, dz + Q\, dz\, dx + R\, dx\, dy = \iiint_E (P_x + Q_y + R_z)\, dV. \]
6. Let $T$ be the topological boundary of the tetrahedron in Exercise 5, with outward-pointing normal, and $S$ be the surface obtained by taking away the slanted face from $T$; i.e., $S$ has three triangular faces, one each in the planes $x = 0$, $y = 0$, $z = 0$. If $\partial S$ is oriented positively, prove for all $C^1$ functions $P, Q, R : S \to \mathbf{R}$ that
\[ \int_{\partial S} P\, dx + Q\, dy + R\, dz = \iint_S (R_y - Q_z)\, dy\, dz + (P_z - R_x)\, dz\, dx + (Q_x - P_y)\, dx\, dy. \]

7. Suppose that $S$ is a smooth surface.
(a) Show that there exist smooth parametrizations $(\phi_j, E_j)$ of portions of $S$ such that $S = \bigcup_{j=1}^N \phi_j(E_j)$.
(b) Show that there exist nonoverlapping surfaces $S_j$ with smooth parametrizations such that $S = \bigcup_{j=1}^N S_j$. What happens if $S$ is orientable?
13.5 THEOREMS OF GREEN AND GAUSS

Recall by the Fundamental Theorem of Calculus that if $f$ is a $C^1$ function, then
\[ f(b) - f(a) = \int_a^b f'(t)\, dt. \]
Thus the integral of the derivative $f'$ on $[a, b]$ is completely determined by the values $f$ takes on the topological boundary $\{a, b\}$ of $[a, b]$. In the next two sections we shall obtain analogues of this theorem for functions $F : \Omega \to \mathbf{R}^m$, where $\Omega$ is a surface or an $m$-dimensional region, $m = 2$ or $3$. Namely, we shall show that the integral of a "derivative" of $F$ on $\Omega$ is completely determined by the values $F$ takes on the "boundary" of $\Omega$. Which "derivative" and "boundary" we use depends on whether $\Omega$ is a surface or an $m$-dimensional region and whether $m = 2$ or $3$.

Our first fundamental theorem applies to two-dimensional regions in the plane.
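13.50 THEOREM [GREEN'S THEOREM]. Let $E$ be a two-dimensional region whose topological boundary $\partial E$ is a piecewise smooth $C^1$ curve oriented positively. If $P, Q : E \to \mathbf{R}$ are $C^1$ and $F = (P, Q)$, then
\[ \int_{\partial E} F \cdot T\, ds = \iint_E \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA. \]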
PROOF FOR SPECIAL REGIONS. Suppose for simplicity that $E$ is of types I and II. Write the integral on the left in differential notation,
\[ \int_{\partial E} P\, dx + Q\, dy = \int_{\partial E} P\, dx + \int_{\partial E} Q\, dy =: I_1 + I_2. \]
We evaluate $I_1$ first. Since $E$ is of type I, choose continuous functions $f, g : [a, b] \to \mathbf{R}$ such that
\[ E = \{(x, y) \in \mathbf{R}^2 : a \le x \le b,\ f(x) \le y \le g(x)\}. \]
Thus $\partial E$ has a top $y = g(x)$, a bottom $y = f(x)$, and (possibly) one or two vertical sides (see Figure 13.16). Since the positive orientation is counterclockwise, the trivial parametrization of the top is $y = g(x)$, where $x$ runs from $b$ to $a$, and of the bottom is $y = f(x)$, where $x$ runs from $a$ to $b$. Since $dx = 0$ on any vertical curve, the contribution of the vertical sides to $I_1$ is zero. Thus it follows from Definition 13.21 and the one-dimensional Fundamental Theorem of Calculus that
\[
\begin{aligned}
I_1 = \int_{\partial E} P\, dx &= \int_a^b P(x, f(x))\, dx + \int_b^a P(x, g(x))\, dx \\
&= -\int_a^b \bigl( P(x, g(x)) - P(x, f(x)) \bigr)\, dx \\
&= -\int_a^b \int_{f(x)}^{g(x)} \frac{\partial P}{\partial y}(x, y)\, dy\, dx = -\iint_E \frac{\partial P}{\partial y}\, dA.
\end{aligned}
\]
[Figure 13.16: a region $E$ of type I; one vertical side of $\partial E$ is labeled.]
Since $E$ is of type II, a similar argument establishes
\[ I_2 = \int_{\partial E} Q\, dy = \iint_E \frac{\partial Q}{\partial x}\, dA. \]
(Here, we have changed parametrizations of $\partial E$, e.g., replaced $y = f(x)$ by $x = f^{-1}(y)$. The value of the oriented integral does not change because these parametrizations are orientation equivalent; see Exercise 6, p. 467.) Adding $I_1$ and $I_2$ completes the proof. ∎
The assumption that $E$ be of types I and II was made to keep the proof simple. For a proof of Green's Theorem as stated, see Theorem 15.44 and the reference that follows it. In the meantime, it is easy to check that Green's Theorem holds for any two-dimensional region that can be divided into a finite number of regions, each of which is of types I and II. For example, consider the region $E$ illustrated in Figure 13.17. Notice that although $E$ is not of type II, it can be divided into $E_1$, $E_2$, both of which are of types I and II. Applying Theorem 13.50 to each piece, we find
\[
\begin{aligned}
\iint_E \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA
&= \iint_{E_1} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA + \iint_{E_2} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA \\
&= \int_{\partial E_1} F \cdot T\, ds + \int_{\partial E_2} F \cdot T\, ds \\
&= \int_{\partial E} F \cdot T\, ds + \int_{C \cap \partial E_1} F \cdot T\, ds + \int_{C \cap \partial E_2} F \cdot T\, ds,
\end{aligned}
\]
where $C$ is the common border between $E_1$ and $E_2$. Since $\partial E_1$ and $\partial E_2$ are oriented in the counterclockwise direction, the orientation of $C \cap \partial E_1$ is different from the orientation of $C \cap \partial E_2$. Since a change of orientation changes the sign of the integral, the integrals along $C$ drop out. The end result is the integral of $F \cdot T\, ds$ on $\partial E$, as promised.

Green's Theorem is often used to avoid tedious parametrizations.

13.51 Example. Find $\int_{\partial E} F \cdot T\, ds$, where $E = [0, 2] \times [1, 3]$, $\partial E$ has the counterclockwise orientation, and $F(x, y) = (xy, x^2 + y^2)$.
[Figure 13.17]

SOLUTION. Since $\partial E$ has four sides, direct evaluation requires four separate parametrizations. However, by Green's Theorem,
\[ \int_{\partial E} F \cdot T\, ds = \int_0^2 \int_1^3 (2x - x)\, dy\, dx = 4. \quad ∎ \]
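As a numerical cross-check of Example 13.51, the following sketch carries out the "tedious" direct evaluation: it integrates $F \cdot T\, ds$ along each of the four sides of $\partial E$ with a midpoint rule and compares the total with the value 4 obtained from Green's Theorem. The helper name `segment_integral` and the step count are illustrative choices, not part of the text.

```python
def F(x, y):
    # The field of Example 13.51: F = (P, Q) = (xy, x^2 + y^2).
    return (x * y, x * x + y * y)

def segment_integral(p0, p1, n=2000):
    """Midpoint-rule approximation of the integral of P dx + Q dy along p0 -> p1."""
    dx = (p1[0] - p0[0]) / n
    dy = (p1[1] - p0[1]) / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        x = p0[0] + t * (p1[0] - p0[0])
        y = p0[1] + t * (p1[1] - p0[1])
        P, Q = F(x, y)
        total += P * dx + Q * dy
    return total

# Corners of E = [0,2] x [1,3], listed counterclockwise.
corners = [(0.0, 1.0), (2.0, 1.0), (2.0, 3.0), (0.0, 3.0)]
line_integral = sum(
    segment_integral(corners[i], corners[(i + 1) % 4]) for i in range(4)
)
print(line_integral)  # should be close to 4
```

The four side integrals are $2$, $50/3$, $-6$, and $-26/3$, which indeed sum to 4.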
Green's Theorem is also used to avoid difficult integrals.
13.52 Example. Find $\int_{\partial E} F \cdot T\, ds$, where $E = B_1(0, 0)$, $\partial E$ has the clockwise orientation, and $F = (xy^2, \arctan(\log(y + 3)) - x)$.

SOLUTION. The second component of $F$ looks tough to integrate. However, by Green's Theorem,
\[ \int_{\partial E} F \cdot T\, ds = -\iint_{B_1(0,0)} (-1 - 2xy)\, dx\, dy = \int_0^{2\pi} \int_0^1 (1 + 2r^2 \cos\theta \sin\theta)\, r\, dr\, d\theta = \pi. \]
(Note: The minus sign appears because $\partial E$ is oriented in the clockwise direction.) ∎

By Green's Theorem, the "derivative" used to obtain a fundamental theorem of calculus for two-dimensional regions in $\mathbf{R}^2$ is $Q_x - P_y$. Here are the "derivatives" that will be used when $\Omega$ is a surface in $\mathbf{R}^3$ or a three-dimensional region.
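13.53 DEFINITION. Let $E$ be a subset of $\mathbf{R}^3$ and let $F = (P, Q, R) : E \to \mathbf{R}^3$ be $C^1$ on $E$. The curl of $F$ is
\[ \operatorname{curl} F = \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z},\ \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x},\ \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right), \]
and the divergence of $F$ is
\[ \operatorname{div} F = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z}. \]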
[Figure 13.18: a type I region lying over $B$, with bottom $z = f(x, y)$ and a vertical side.]

Notice that if $F = (P, Q, 0)$, where $P$ and $Q$ are as in Green's Theorem, then $\operatorname{curl} F \cdot k = Q_x - P_y$ is the derivative used for Green's Theorem. These derivatives take on a more easily remembered form by using the notation
\[ \nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right). \]
Indeed, $\operatorname{curl} F = \nabla \times F$ and $\operatorname{div} F = \nabla \cdot F$.

If $E$ is a three-dimensional region whose topological boundary is a piecewise smooth orientable surface, then the positive orientation on $\partial E$ is determined by the unit normal that points away from $E^o$. If $E$ is convex, this means that $n$ points outward. This is not the case, however, when $E$ has interior "bubbles." For example, if $E = \{x : a \le \|x\| \le b\}$ for some $a > 0$, then $n$ points away from the origin on $\{x : \|x\| = b\}$ but toward the origin on $\{x : \|x\| = a\}$.

Our next fundamental theorem applies to the case when $\Omega$ is a three-dimensional region. This result is also called the Divergence Theorem.
13.54 THEOREM [GAUSS'S THEOREM]. Let $E$ be a three-dimensional region whose topological boundary $\partial E$ is a piecewise smooth $C^1$ surface oriented positively. If $F : E \to \mathbf{R}^3$ is $C^1$ on $E$, then
\[ \iint_{\partial E} F \cdot n\, d\sigma = \iiint_E \operatorname{div} F\, dV. \]

PROOF FOR SPECIAL REGIONS. Suppose for simplicity that $E$ is a region of types I, II, and III. Let $F = (P, Q, R)$ and write the surface integral in differential form:
\[ \iint_{\partial E} F \cdot n\, d\sigma = \iint_{\partial E} P\, dy\, dz + \iint_{\partial E} Q\, dz\, dx + \iint_{\partial E} R\, dx\, dy =: I_1 + I_2 + I_3. \]
We evaluate $I_3$ first.
Since $E$ is of type I, there exist a two-dimensional region $B \subset \mathbf{R}^2$ and continuous functions $f, g : B \to \mathbf{R}$ such that
\[ E = \{(x, y, z) \in \mathbf{R}^3 : (x, y) \in B,\ f(x, y) \le z \le g(x, y)\}. \]
Thus $\partial E$ has a top $z = g(x, y)$, a bottom $z = f(x, y)$, and (possibly) a vertical side (see Figure 13.18). Any normal to $\partial E$ on the vertical side is parallel to the $xy$ plane. Since $dx\, dy$ is the third component of a normal to $\partial E$, it must be zero on the vertical portion. Therefore, $I_3$ can be evaluated by integrating over the top and bottom of $\partial E$. Notice that, by hypothesis, the unit normal on the bottom portion points downward and the unit normal on the top portion points upward. By using trivial parametrizations and Theorem 5.28 (the Fundamental Theorem of Calculus), we obtain
\[
\begin{aligned}
I_3 = \iint_{\partial E} R\, dx\, dy
&= \int_B \bigl( R(x, y, g(x, y)) - R(x, y, f(x, y)) \bigr)\, d(x, y) \\
&= \int_B \int_{f(x,y)}^{g(x,y)} \frac{\partial R}{\partial z}(x, y, z)\, dz\, d(x, y) = \iiint_E \frac{\partial R}{\partial z}\, dV.
\end{aligned}
\]
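Similarly, since $E$ is also of types II and III, the same argument applied in the other two coordinate directions gives
\[ I_1 = \iiint_E \frac{\partial P}{\partial x}\, dV \quad \text{and} \quad I_2 = \iiint_E \frac{\partial Q}{\partial y}\, dV. \]
Adding $I_1 + I_2 + I_3$ verifies the theorem. ∎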
The assumption that $E$ be of types I, II, and III was made to keep the proof simple. For a proof of Gauss's Theorem as stated, see Theorem 15.44 and the reference that follows it. In the meantime, it is easy to check that Gauss's Theorem holds for any three-dimensional region $E$ that can be divided into a finite number of regions $E_j$, each of which is of types I, II, and III. For example, if $E = E_1 \cup E_2$, then
\[
\begin{aligned}
\iiint_E \operatorname{div} F\, dV &= \iiint_{E_1} \operatorname{div} F\, dV + \iiint_{E_2} \operatorname{div} F\, dV \\
&= \iint_{\partial E} F \cdot n\, d\sigma + \iint_{S \cap \partial E_1} F \cdot n\, d\sigma + \iint_{S \cap \partial E_2} F \cdot n\, d\sigma,
\end{aligned}
\]
where $S$ is the common surface between $E_1$ and $E_2$. Since $E_1$ and $E_2$ have outward-pointing normals, the orientation of $S \cap \partial E_1$ is different from the orientation of $S \cap \partial E_2$, and the integrals over $S$ cancel each other out.

The next two examples show that, like Green's Theorem, Gauss's Theorem can be used to avoid difficult integrals and tedious parametrizations.
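13.55 Example. Use Theorem 13.54 to evaluate $\iint_S F \cdot n\, d\sigma$, where $S$ is the topological boundary of the solid $E = \{(x, y, z) : x^2 + y^2 \le z \le 1\}$, $n$ is the outward-pointing normal, and $F(x, y, z) = (2x + z^2, x^5 + z^7, \cos(x^2) + \sin(y^3) - z^2)$.

SOLUTION. Since $\operatorname{div} F = 2 - 2z$, it follows from Gauss's Theorem and a change to cylindrical coordinates that
\[ \iint_S F \cdot n\, d\sigma = \iiint_E (2 - 2z)\, dV = \int_0^{2\pi} \int_0^1 \int_{r^2}^1 (2 - 2z)\, r\, dz\, dr\, d\theta = 2\pi \int_0^1 (1 - r^2)^2\, r\, dr = \frac{\pi}{3}. \quad ∎ \]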
13.56 Example. Evaluate $\iint_{\partial Q} F \cdot n\, d\sigma$, where $Q$ is the unit cube $[0, 1] \times [0, 1] \times [0, 1]$, $n$ is the outward-pointing normal, and $F(x, y, z) = (2x - z, x^2 y, -xz^2)$.

SOLUTION. Since $\partial Q$ has six sides, direct evaluation of this integral requires six separate integrals. However, by Gauss's Theorem,
\[ \iint_{\partial Q} F \cdot n\, d\sigma = \int_0^1 \int_0^1 \int_0^1 (2 + x^2 - 2xz)\, dx\, dy\, dz = \frac{11}{6}. \quad ∎ \]
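The six face integrals that Gauss's Theorem lets us avoid in Example 13.56 can still be summed numerically as a check. The sketch below is one way to do so, under the assumption that a two-point Gauss-Legendre rule is used on each face (it is exact for polynomials of degree at most 3 in each variable, which covers every face integrand here). The name `face_flux` is an illustrative helper, not notation from the text.

```python
import math

def F(x, y, z):
    # The field of Example 13.56.
    return (2 * x - z, x * x * y, -x * z * z)

# Two-point Gauss-Legendre nodes on [0, 1]; each node carries weight 1/2.
NODES = (0.5 - 0.5 / math.sqrt(3), 0.5 + 0.5 / math.sqrt(3))
WEIGHT = 0.5

def face_flux(axis, value, sign):
    """Flux of F through the face {x_axis = value} with outward normal sign * e_axis."""
    total = 0.0
    for s in NODES:
        for t in NODES:
            point = [s, t]
            point.insert(axis, value)  # put the fixed coordinate in place
            total += sign * F(*point)[axis] * WEIGHT * WEIGHT
    return total

# Sum over the six faces: x, y, z = 0 (normal -e_axis) and x, y, z = 1 (normal +e_axis).
flux = sum(
    face_flux(axis, value, sign)
    for axis in range(3)
    for value, sign in ((0.0, -1.0), (1.0, 1.0))
)
print(flux, 11 / 6)
```

Face by face the contributions are $1/2$ and $3/2$ (for $x = 0, 1$), $0$ and $1/3$ (for $y = 0, 1$), and $0$ and $-1/2$ (for $z = 0, 1$), whose sum is $11/6$.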
These definitions and results take on new meaning when examined in the context of fluid flow. When $F$ represents the flow of an incompressible fluid near a point $a$, $\operatorname{curl} F(a)$ measures the tendency of the fluid to swirl in a counterclockwise direction about $a$ (see Exercise 6, p. 503), and $\operatorname{div} F(a)$ measures the tendency of the fluid to spread out from $a$ (see Exercise 7). (This explains the etymology of the words curl and divergence.) For example, if $F(x, y, z) = (x, y, z)$, then the fluid is not swirling at all, but spreading straight out from the origin. Accordingly, $\operatorname{curl} F = 0$ and $\operatorname{div} F = 3$. On the other hand, if $G(x, y, z) = (y, -x, 0)$, then the fluid is swirling around in a circular motion about the origin. Accordingly, $\operatorname{curl} G = (0, 0, -2)$ but $\operatorname{div} G = 0$. Note the minus sign in the third component of $\operatorname{curl} G$. This fluid swirls about the origin in a clockwise direction, so it runs against counterclockwise motion.

When the fluid flows over a two-dimensional region $E \subset \mathbf{R}^2$, the integral of $F \cdot T\, ds$ over a curve $C$ represents the circulation of the fluid around $C$ in the direction of $T$ (see the comments following Definition 13.21). Thus Green's Theorem tells us that the circulation of a fluid around $\partial E$ in the direction of the tangent is determined by how strongly the fluid swirls inside $E$. When $F$ represents the flow of an incompressible fluid through a three-dimensional region $E \subset \mathbf{R}^3$ and $S = \partial E$, the integral $\iint_S F \cdot n\, d\sigma$ represents the flux of the fluid across the surface $S$ (see the comments following Definition 13.43). Thus Gauss's Theorem tells us that the flux of the fluid across $S = \partial E$ is determined by how strongly the fluid is spreading out inside $E$.

We close this section by admitting that the interpretations of curl and divergence given above are imperfect at best. For example, the vector field $F(x, y, z) = (0, z, 0)$ has curl $(-1, 0, 0)$. Here the fluid is shearing in layers with flow parallel to the $xy$ plane in the direction of the positive $y$ axis when $z > 0$. Although the fluid is not swirling, it does tend to rotate a stick placed in the fluid parallel to the $z$ axis (e.g., the line segment $\{(0, 1, z) : 0 \le z \le 1\}$) because more force is applied to the top than the bottom. This tendency toward rotation is reflected by the value of the curl. (Notice that the rotation is clockwise and the curl has a negative first component.)
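The curl and divergence values quoted for these three flows can be checked directly from Definition 13.53. The following sketch approximates the partial derivatives by central differences (exact up to rounding for the linear fields used here). The function names `partial`, `curl`, and `div`, the names given to the fields, and the sample point are illustrative assumptions. For $G(x, y, z) = (y, -x, 0)$ the definition gives $\operatorname{curl} G = (0, 0, Q_x - P_y) = (0, 0, -2)$.

```python
def partial(field, i, j, p, h=1e-5):
    """Central-difference estimate of the partial derivative of component i in variable j at p."""
    hi, lo = list(p), list(p)
    hi[j] += h
    lo[j] -= h
    return (field(*hi)[i] - field(*lo)[i]) / (2 * h)

def curl(field, p):
    # (R_y - Q_z, P_z - R_x, Q_x - P_y), as in Definition 13.53.
    return (
        partial(field, 2, 1, p) - partial(field, 1, 2, p),
        partial(field, 0, 2, p) - partial(field, 2, 0, p),
        partial(field, 1, 0, p) - partial(field, 0, 1, p),
    )

def div(field, p):
    # P_x + Q_y + R_z.
    return sum(partial(field, i, i, p) for i in range(3))

def radial(x, y, z):   # fluid spreading straight out from the origin
    return (x, y, z)

def swirl(x, y, z):    # fluid circling the origin clockwise
    return (y, -x, 0.0)

def shear(x, y, z):    # fluid shearing in layers parallel to the xy plane
    return (0.0, z, 0.0)

p = (0.3, -0.7, 1.2)   # an arbitrary sample point
for name, field in (("radial", radial), ("swirl", swirl), ("shear", shear)):
    print(name, curl(field, p), div(field, p))
```

Because all three fields are linear, the output does not depend on the choice of the sample point.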
EXERCISES

1. For each of the following, evaluate $\int_C F \cdot T\, ds$.
(a) $C$ is the topological boundary of the two-dimensional region in the first quadrant bounded by $x = 0$, $y = 0$, and $y = \sqrt{4 - x^2}$, oriented in the counterclockwise direction, and $F(x, y) = (\sin(\sqrt{x^3 - x^2}), xy)$.
(b) $C$ is the perimeter of the rectangle with vertices $(0, 0)$, $(2, 0)$, $(0, 3)$, $(2, 3)$, oriented in the counterclockwise direction, and $F(x, y) = (e^y, \log(x + 1))$.
(c) $C = C_1 \cup C_2$, where $C_1 = \partial B_1(0, 0)$ oriented in the counterclockwise direction, $C_2 = \partial B_2(0, 0)$ oriented in the clockwise direction, and $F(x, y) = (f(x^2 + y^2), xy^2)$, where $f$ is a $C^1$ function on $[1, 2]$.

2. For each of the following, evaluate $\int_C \omega$.
(a) $C$ is the topological boundary of the rectangle $[a, b] \times [c, d]$, oriented in the counterclockwise direction, and $\omega = (f(x) + y)\, dx + xy\, dy$, where $f : [0, 1] \to \mathbf{R}$ is any continuous function.
(b) $C$ is the topological boundary of the two-dimensional region bounded by $y = x^2$ and $y = x$, oriented in the clockwise direction, and $\omega = y f(x)\, dx + (x^2 + y^2)\, dy$, where $f : [0, 1] \to \mathbf{R}$ is $C^1$ and satisfies $\int_0^1 x f(x)\, dx = \int_0^1 x^2 f(x)\, dx$.
(c) $C$ is the topological boundary of a two-dimensional region $E$ that satisfies the hypotheses of Green's Theorem, oriented positively, and $\omega = e^x \sin y\, dy - e^x \cos y\, dx$.

3. For each of the following, evaluate $\iint_S F \cdot n\, d\sigma$, where $n$ is the outward-pointing normal.
(a) $S$ is the topological boundary of the rectangle $[0, 1] \times [0, 2] \times [0, 3]$ and $F(x, y, z) = (x + e^z, y + e^z, e^z)$.
(b) $S$ is the truncated cylinder $x^2 + y^2 = 1$, $0 \le z \le 1$, together with the disks $x^2 + y^2 \le 1$, $z = 0, 1$, and $F(x, y, z) = (x^2, y^2, z^2)$.
(c) $S$ is the topological boundary of $E$, where $E \subset \mathbf{R}^3$ is bounded by $z = 2 - x^2$, $z = x^2$, $y = 0$, $z = y$, and $F(x, y, z) = (x + f(y, z), y + g(x, z), z + h(x, y))$, where $f, g, h : \mathbf{R}^2 \to \mathbf{R}$ are $C^1$.
(d) $S$ is the ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$ and $F(x, y, z) = (x|y|, y|z|, z|x|)$.
4. For each of the following, find $\iint_S \omega$, where $n$ is the outward-pointing normal.
(a) $S$ is the topological boundary of the three-dimensional region enclosed by $y = x^2$, $z = 0$, $z = 1$, $y = 4$, and $\omega = xyz\, dy\, dz + (x^2 + y^2 + z^2)\, dz\, dx + (x + y + z)\, dx\, dy$.
(b) $S$ is the truncated hyperboloid of one sheet $x^2 - y^2 + z^2 = 1$, $0 \le y \le 1$, together with the disks $x^2 + z^2 \le 1$, $y = 0$, and $x^2 + z^2 \le 2$, $y = 1$, and $\omega = xy|z|\, dy\, dz + x^2|z|\, dz\, dx + (x^3 + y^3)\, dx\, dy$.
(c) $S$ is the topological boundary of $E$, where $E \subset \mathbf{R}^3$ is bounded by the surfaces $x^2 + y + z^2 = 4$ and $4x + y + 2z = 5$, and $\omega = (x + y^2 + z^2)\, dy\, dz + (x^2 + y + z^2)\, dz\, dx + (x^2 + y^2 + z)\, dx\, dy$.
5. (a) Prove that if $E$ is a Jordan region whose topological boundary is a piecewise smooth curve oriented in the counterclockwise direction, then
\[ \operatorname{Area}(E) = \frac{1}{2} \int_{\partial E} x\, dy - y\, dx. \]
(b) Find the area enclosed by the loop in the Folium of Descartes, i.e., by
\[ \phi(t) = \left( \frac{3t}{1 + t^3}, \frac{3t^2}{1 + t^3} \right), \quad t \in [0, \infty). \]
(c) Find an analogue of part (a) for the volume of a Jordan region $E$ in $\mathbf{R}^3$.
(d) Compute the volume of the torus with radii $a > b$ (see Example 13.32).

6. (a) Show that Green's Theorem does not hold if continuity of $P$, $Q$ is relaxed at one point in $E$. (Hint: Consider $P = y/(x^2 + y^2)$, $Q = -x/(x^2 + y^2)$, and $E = B_1(0, 0)$.)
(b) Show that Gauss's Theorem does not hold if continuity of $F$ is relaxed at one point in $E$.
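7. This exercise is used in Section 13.6. Suppose that $V \ne \emptyset$ is an open set in $\mathbf{R}^3$ and $F : V \to \mathbf{R}^3$ is $C^1$. Prove that
\[ \operatorname{div} F(x_0) = \lim_{r \to 0+} \frac{1}{\operatorname{Vol}(B_r(x_0))} \iint_{\partial B_r(x_0)} F \cdot n\, d\sigma \]
for each $x_0 \in V$, where $n$ is the outward-pointing normal of $B_r(x_0)$.

8. Let $F, G : \mathbf{R}^3 \to \mathbf{R}^3$ and $f : \mathbf{R}^3 \to \mathbf{R}$ be differentiable. Prove the following analogues of the Sum and Product Rules for the "derivatives" curl and divergence.
(a) $\nabla \times (F + G) = (\nabla \times F) + (\nabla \times G)$.
(b) $\nabla \times (fF) = f(\nabla \times F) + (\nabla f \times F)$.
(c) $\nabla \cdot (fF) = \nabla f \cdot F + f(\nabla \cdot F)$.
(d) $\nabla \cdot (F + G) = \nabla \cdot F + \nabla \cdot G$.
(e) $\nabla \cdot (F \times G) = (\nabla \times F) \cdot G - (\nabla \times G) \cdot F$.

9. This exercise is used in Section 13.6. Let $E \subset \mathbf{R}^3$. Recall that the gradient of a $C^1$ function $f : E \to \mathbf{R}$ is defined by
\[ \operatorname{grad} f := \nabla f := (f_x, f_y, f_z). \]
(a) Prove that if $f$ is $C^2$ at $x_0$, then $\operatorname{curl} \operatorname{grad} f(x_0) = 0$.
(b) If $F : E \to \mathbf{R}^3$ is $C^1$ on $E$ and $C^2$ at $x_0 \in E$, prove that $\operatorname{div} \operatorname{curl} F(x_0) = 0$.
(c) Suppose that $E$ satisfies the hypotheses of Gauss's Theorem and $f : E \to \mathbf{R}$ is a $C^2$ function that is harmonic on $E$ (see Exercise 10d). If $F = \operatorname{grad} f$ on $E$, prove that
\[ \iint_{\partial E} f F \cdot n\, d\sigma = \iiint_E \|F\|^2\, dV. \]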
10. Let $E$ be a set in $\mathbf{R}^m$. For each $u : E \to \mathbf{R}$ that has second-order partial derivatives on $E$, Laplace's equation is defined by
\[ \Delta u := \sum_{j=1}^m \frac{\partial^2 u}{\partial x_j^2}. \]
(a) Show that if $u$ is $C^2$ on $E$, then $\Delta u = \nabla \cdot (\nabla u)$ on $E$.
(b) [GREEN'S FIRST IDENTITY]. Show that if $E \subset \mathbf{R}^3$ satisfies the hypotheses of Gauss's Theorem, then
\[ \iiint_E (u \Delta v + \nabla u \cdot \nabla v)\, dV = \iint_{\partial E} u \nabla v \cdot n\, d\sigma \]
for all $C^2$ functions $u, v : E \to \mathbf{R}$.
(c) [GREEN'S SECOND IDENTITY]. Show that if $E \subset \mathbf{R}^3$ satisfies the hypotheses of Gauss's Theorem, then
\[ \iiint_E (u \Delta v - v \Delta u)\, dV = \iint_{\partial E} (u \nabla v - v \nabla u) \cdot n\, d\sigma \]
for all $C^2$ functions $u, v : E \to \mathbf{R}$.
(d) A function $u : E \to \mathbf{R}$ is said to be harmonic on $E$ if and only if $u$ is $C^2$ on $E$ and $\Delta u(x) = 0$ for all $x \in E$. Suppose that $E$ is a nonempty open region in $\mathbf{R}^3$ that satisfies the hypotheses of Gauss's Theorem. If $u$ is harmonic on $E$, $u$ is continuous on $\overline{E}$, and $u = 0$ on $\partial E$, prove that $u = 0$ on $\overline{E}$.
(e) Suppose that $V$ is open and nonempty in $\mathbf{R}^2$, $u$ is $C^2$ on $V$, and $u$ is continuous on $\overline{V}$. Prove that $u$ is harmonic on $V$ if and only if
\[ \int_{\partial E} (u_x\, dy - u_y\, dx) = 0 \]
for all two-dimensional regions $E \subset V$ that satisfy the hypotheses of Green's Theorem.
13.6 STOKES'S THEOREM

Our final fundamental theorem applies to surfaces in $\mathbf{R}^3$ whose boundaries are curves.

13.57 THEOREM [STOKES'S THEOREM]. Let $S$ be an oriented, piecewise smooth $C^2$ surface in $\mathbf{R}^3$ with unit normal $n$. If the boundary $\partial S$ is a piecewise smooth $C^1$ curve oriented positively and $F : S \to \mathbf{R}^3$ is $C^1$, then
\[ \int_{\partial S} F \cdot T\, ds = \iint_S \operatorname{curl} F \cdot n\, d\sigma. \]
[Figure 13.19: the region $E$ with boundary parametrization $(\psi(t), \sigma(t))$.]
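PROOF FOR EXPLICIT SURFACES. Suppose for simplicity that $S$ is an explicit $C^2$ surface that lies over $E$, a two-dimensional region that satisfies the hypotheses of Green's Theorem. Let $F = (P, Q, R)$ be $C^1$ on $S$ and write the line integral in differential notation:
\[ \int_{\partial S} F \cdot T\, ds = \int_{\partial S} P\, dx + Q\, dy + R\, dz. \]
Without loss of generality, suppose that $S$ is determined by $z = f(x, y)$, $(x, y) \in E$, where $f : E \to \mathbf{R}$ is a $C^2$ function and $S$ is oriented with the upward-pointing normal. Thus $n = N/\|N\|$, where $N = (-f_x, -f_y, 1)$. Let $(\psi(t), \sigma(t))$, $t \in [a, b]$, be a piecewise smooth parametrization of $\partial E$ oriented in the counterclockwise direction. Then
\[ \phi(t) = (\psi(t), \sigma(t), f(\psi(t), \sigma(t))), \quad t \in [a, b], \]
is a piecewise smooth parametrization of $\partial S$ which is oriented positively (see Figure 13.19). If $x = \psi(t)$, $y = \sigma(t)$, and $z = f(\psi(t), \sigma(t))$, then $dx = \psi'(t)\, dt$, $dy = \sigma'(t)\, dt$, and, by the Chain Rule,
\[ dz = \left( \frac{\partial z}{\partial x} \psi'(t) + \frac{\partial z}{\partial y} \sigma'(t) \right) dt = \frac{\partial z}{\partial x}\, dx + \frac{\partial z}{\partial y}\, dy. \]
Thus, by definition,
\[ \int_{\partial S} P\, dx + Q\, dy + R\, dz = \int_{\partial E} \left( P + R \frac{\partial z}{\partial x} \right) dx + \left( Q + R \frac{\partial z}{\partial y} \right) dy. \tag{10} \]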
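We shall apply Green's Theorem to this last integral. By the Chain Rule and the Product Rule,
\[ \frac{\partial}{\partial x} \left( Q + R \frac{\partial z}{\partial y} \right) = \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z} \frac{\partial z}{\partial x} + \frac{\partial R}{\partial x} \frac{\partial z}{\partial y} + \frac{\partial R}{\partial z} \frac{\partial z}{\partial x} \frac{\partial z}{\partial y} + R \frac{\partial^2 z}{\partial x\, \partial y} \]
and
\[ \frac{\partial}{\partial y} \left( P + R \frac{\partial z}{\partial x} \right) = \frac{\partial P}{\partial y} + \frac{\partial P}{\partial z} \frac{\partial z}{\partial y} + \frac{\partial R}{\partial y} \frac{\partial z}{\partial x} + \frac{\partial R}{\partial z} \frac{\partial z}{\partial y} \frac{\partial z}{\partial x} + R \frac{\partial^2 z}{\partial y\, \partial x}. \]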
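Since $z = f(x, y)$ is $C^2$, the mixed second-order partial derivatives above are equal. Therefore,
\[
\begin{aligned}
\frac{\partial}{\partial x} \left( Q + R \frac{\partial z}{\partial y} \right) - \frac{\partial}{\partial y} \left( P + R \frac{\partial z}{\partial x} \right)
&= \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \left( -\frac{\partial z}{\partial x} \right) + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \left( -\frac{\partial z}{\partial y} \right) + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \\
&= \operatorname{curl} F \cdot N.
\end{aligned}
\]
Hence, it follows from (10), Green's Theorem, and (9) that
\[ \int_{\partial S} F \cdot T\, ds = \int_E \operatorname{curl} F \cdot N\, d(x, y) = \iint_S \operatorname{curl} F \cdot n\, d\sigma. \quad ∎ \]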
The assumption that $S$ be an explicit surface over a "Green's region" was made to keep the proof simple. For a proof of Stokes's Theorem as stated, see Theorem 15.44 and the reference that follows it. In the meantime, it is easy to check that Stokes's Theorem holds for any surface that can be divided into a finite number of such explicit surfaces. As before, the common boundaries appear twice, each time in a different orientation, hence cancel each other out.

Stokes's Theorem can be used to replace complicated line integrals by simple surface integrals.

13.58 Example. Compute $\int_C F \cdot T\, ds$, where $C$ is the circle $x^2 + z^2 = 1$, $y = 0$, oriented in the counterclockwise direction when viewed from far out on the $y$ axis, and $F(x, y, z) = (x^2 z + \sqrt{x^3 + x^2 + 2},\ xy,\ xy + \sqrt{z^3 + z^2 + 2})$.

SOLUTION. Since $\operatorname{curl} F = (x, x^2 - y, y)$, using Stokes's Theorem is considerably easier than trying to integrate $F \cdot T\, ds$ directly. Let $S$ be the disk $x^2 + z^2 \le 1$, $y = 0$, and notice that $\partial S = C$. Since $C$ is oriented in the counterclockwise direction, the normal to $S$ must point toward the positive $y$ axis; i.e., $n = (0, 1, 0)$. Thus $\operatorname{curl} F \cdot n = x^2 - y = x^2$ on $S$, and Stokes's Theorem implies that
\[ \int_C F \cdot T\, ds = \iint_S x^2\, dA = \int_0^{2\pi} \int_0^1 r^3 \cos^2\theta\, dr\, d\theta = \frac{\pi}{4}. \quad ∎ \]
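The line integral in Example 13.58 can also be approximated directly, which gives an independent check on the value $\pi/4$. The sketch below assumes the parametrization $\phi(t) = (\sin t, 0, \cos t)$, $t \in [0, 2\pi]$, chosen here (it is not given in the text) because $\phi(t) \times \phi'(t) = (0, 1, 0)$, so it is compatible with the normal $n = (0, 1, 0)$. The midpoint rule is used because it converges very quickly for smooth periodic integrands.

```python
import math

def F(x, y, z):
    # The field of Example 13.58; both radicands stay >= 2 on the unit circle.
    return (
        x * x * z + math.sqrt(x**3 + x**2 + 2),
        x * y,
        x * y + math.sqrt(z**3 + z**2 + 2),
    )

def line_integral(n=4000):
    """Midpoint-rule approximation of the integral of F . T ds over C."""
    total = 0.0
    dt = 2 * math.pi / n
    for i in range(n):
        t = (i + 0.5) * dt
        x, y, z = math.sin(t), 0.0, math.cos(t)      # phi(t)
        dx, dy, dz = math.cos(t), 0.0, -math.sin(t)  # phi'(t)
        P, Q, R = F(x, y, z)
        total += (P * dx + Q * dy + R * dz) * dt
    return total

approx = line_integral()
print(approx, math.pi / 4)
```

The square-root terms contribute nothing: each is a function of a single variable integrated against its own differential around a closed curve. What remains is $\int_0^{2\pi} \sin^2 t \cos^2 t\, dt = \pi/4$.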
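In Example 13.58, we could have chosen any surface $S$ whose boundary is $C$. Thus Stokes's Theorem can also be used to replace complicated surface integrals by simpler ones.

13.59 Example. Find $\iint_S \operatorname{curl} F \cdot n\, d\sigma$, where $S$ is the semiellipsoid $9x^2 + 4y^2 + 36z^2 = 36$, $z \ge 0$, $n$ is the upward-pointing normal, and
\[ F(x, y, z) = \left( \cos x \sin z + xy,\ x^3,\ e^{x^2 + z^2} - e^{y^2 + z^2} + \tan(xy) \right). \]

SOLUTION. Let $C = \partial S$. The integral of $\operatorname{curl} F \cdot n\, d\sigma$ over $S$ and the integral of $F \cdot T\, ds$ over $C$ are both complicated. But, by Stokes's Theorem, the integral of $F \cdot T\, ds$ over $C$ is the same as the integral of $\operatorname{curl} F \cdot n\, d\sigma$ over any oriented $C^2$ surface $E$ satisfying $\partial E = C$. Let $E$ be the two-dimensional region $9x^2 + 4y^2 \le 36$. On $E$, $n = (0, 0, 1)$. Thus we only need the third component of $\operatorname{curl} F$:
\[ \frac{\partial}{\partial x}(x^3) - \frac{\partial}{\partial y}(\cos x \sin z + xy) = 3x^2 - x. \]
Therefore,
\[ \iint_S \operatorname{curl} F \cdot n\, d\sigma = \int_E (3x^2 - x)\, d(x, y). \]
Let $x = 2r \cos\theta$ and $y = 3r \sin\theta$. By a change of variables (with Jacobian $6r$),
\[ \iint_S \operatorname{curl} F \cdot n\, d\sigma = \int_0^{2\pi} \int_0^1 (12 r^2 \cos^2\theta - 2r \cos\theta)\, 6r\, dr\, d\theta = 18\pi. \quad ∎ \]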
Stokes's Theorem can also be used to replace complicated surface integrals by simple line integrals.

13.60 Example. Let $S$ be the union of the truncated paraboloid $z = x^2 + y^2$, $0 \le z \le 1$, and the truncated cylinder $x^2 + y^2 = 1$, $1 \le z \le 3$. Compute
\[ \iint_S F \cdot n\, d\sigma, \]
where $n$ is the outward-pointing normal and $F(x, y, z) = (x + z^2, 0, -z - 3)$.
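SOLUTION. The boundary of $S$ is $x^2 + y^2 = 1$, $z = 3$. To use Stokes's Theorem, we must find a function $G = (P, Q, R) : S \to \mathbf{R}^3$ such that $\operatorname{curl} G = F$, i.e., such that
\[ \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} = x + z^2, \tag{11} \]
\[ \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} = 0, \tag{12} \]
and
\[ \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} = -z - 3. \tag{13} \]
Starting with (11), set
\[ \frac{\partial Q}{\partial z} = -x \quad \text{and} \quad \frac{\partial R}{\partial y} = z^2. \tag{14} \]
The left side of (14) implies that $Q = -xz + g(x, y)$ for some $g : \mathbf{R}^2 \to \mathbf{R}$. Similarly, the right side of (14) leads to $R = z^2 y + h(x, z)$ for some $h : \mathbf{R}^2 \to \mathbf{R}$. Thus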
$Q_x = -z + g_x$ will solve (13) if we set $g = 0$ and $P_y = 3$; i.e., $P = 3y + \sigma(x, z)$ for some $\sigma : \mathbf{R}^2 \to \mathbf{R}$. Hence, $P_z - R_x = \sigma_z - h_x$ will satisfy (12) if $\sigma = h = 0$. Therefore, $P = 3y$, $Q = -xz$, and $R = yz^2$; i.e., $G = (3y, -xz, yz^2)$. Parametrize $\partial S$ by $\phi(t) = (\sin t, \cos t, 3)$, $t \in [0, 2\pi]$, and observe that
\[ (G \circ \phi) \cdot \phi' = (3\cos t, -3\sin t, 9\cos t) \cdot (\cos t, -\sin t, 0) = 3\cos^2 t + 3\sin^2 t = 3. \]
Consequently, Stokes's Theorem implies that
\[ \iint_S F \cdot n\, d\sigma = \iint_S \operatorname{curl} G \cdot n\, d\sigma = \int_{\partial S} G \cdot T\, ds = \int_0^{2\pi} 3\, dt = 6\pi. \quad ∎ \]
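Both steps of the solution to Example 13.60 lend themselves to a quick numerical check: that the vector potential $G = (3y, -xz, yz^2)$ really satisfies $\operatorname{curl} G = F$, and that the boundary integral equals $6\pi$. The sketch below uses central differences for the curl (exact up to rounding here, since every component of $G$ is at most quadratic in each variable). The helper names and the sample point are illustrative assumptions.

```python
import math

def F(x, y, z):
    return (x + z * z, 0.0, -z - 3.0)

def G(x, y, z):
    # The vector potential found in the solution.
    return (3.0 * y, -x * z, y * z * z)

def partial(field, i, j, p, h=1e-4):
    """Central-difference estimate of the partial derivative of component i in variable j at p."""
    hi, lo = list(p), list(p)
    hi[j] += h
    lo[j] -= h
    return (field(*hi)[i] - field(*lo)[i]) / (2 * h)

def curl(field, p):
    return (
        partial(field, 2, 1, p) - partial(field, 1, 2, p),
        partial(field, 0, 2, p) - partial(field, 2, 0, p),
        partial(field, 1, 0, p) - partial(field, 0, 1, p),
    )

def boundary_integral(n=1000):
    """Midpoint-rule approximation of the integral of G . T ds over the circle phi(t) = (sin t, cos t, 3)."""
    total = 0.0
    dt = 2 * math.pi / n
    for i in range(n):
        t = (i + 0.5) * dt
        point = (math.sin(t), math.cos(t), 3.0)
        velocity = (math.cos(t), -math.sin(t), 0.0)
        total += sum(g * v for g, v in zip(G(*point), velocity)) * dt
    return total

p = (0.7, -0.2, 1.5)  # an arbitrary sample point
print(curl(G, p), F(*p))
print(boundary_integral(), 6 * math.pi)
```

A second check comes from Gauss's Theorem: $\operatorname{div} F = 0$, so the flux through $S$ equals minus the flux through the disk $x^2 + y^2 \le 1$, $z = 3$ with upward normal, which is $-(-6)\pi = 6\pi$.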
The solution to Example 13.60 involved finding a function $G$ that satisfied $\operatorname{curl} G = F$. This function is not unique. Indeed, we could have begun with
\[ \frac{\partial Q}{\partial z} = -z^2 \quad \text{and} \quad \frac{\partial R}{\partial y} = x \]
instead of (14). This leads to a different solution:
\[ G(x, y, z) = (zy, -(3x + z^3/3), xy). \]
The technique used to solve Example 13.60, however, is perfectly valid. Indeed, by Stokes's Theorem the value of the oriented line integral of $G \cdot T$ will be the same for all $C^1$ functions $G$ that satisfy $\operatorname{curl} G = F$.

This technique works only when the system of partial differential equations $\operatorname{curl} G = F$ has a solution $G$. To avoid searching for something that does not exist, we must be able to discern beforehand whether such a solution exists. To discover how to do this, suppose that $G$ is a $C^2$ function that satisfies $\operatorname{curl} G = F$ on some set $E$. Then $\operatorname{div} F = 0$ on $E$ by Exercise 9b, p. 496. Thus the condition $\operatorname{div} F = 0$ is necessary for existence of a solution $G$ to $\operatorname{curl} G = F$. The following result shows that if $E$ is nice enough, this condition is also sufficient (see also Theorem 15.45).
13.61 THEOREM. Let $\Omega$ be a ball or a rectangle with nonempty interior, and let $F : \Omega \to \mathbf{R}^3$ be $C^1$ on $\Omega$. Then the following three statements are equivalent.
(i) There is a $C^2$ function $G : \Omega \to \mathbf{R}^3$ such that $\operatorname{curl} G = F$ on $\Omega$.
(ii) If $E$ and $S = \partial E$ satisfy the hypotheses of Gauss's Theorem and $E \subset \Omega$, then
\[ \iint_S F \cdot n\, d\sigma = 0. \tag{15} \]
(iii) The identity $\operatorname{div} F = 0$ holds everywhere on $\Omega$.

PROOF. If (i) holds, then $\operatorname{div} F = \operatorname{div}(\operatorname{curl} G) = 0$ since the first-order partial derivatives of $G$ commute. Thus (15) holds by Gauss's Theorem. (This works for any set $\Omega$.)
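If (ii) holds, then by Gauss's Theorem and Exercise 7, p. 495,
\[ \operatorname{div} F(x_0) = \lim_{r \to 0+} \frac{1}{\operatorname{Vol}(B_r(x_0))} \iiint_{B_r(x_0)} \operatorname{div} F\, dV = \lim_{r \to 0+} \frac{1}{\operatorname{Vol}(B_r(x_0))} \iint_{\partial B_r(x_0)} F \cdot n\, d\sigma = 0 \]
for each $x_0 \in \Omega^o$. Since $\operatorname{div} F$ is continuous on $\Omega$, it follows that $\operatorname{div} F = 0$ everywhere on $\Omega$. (This works for any three-dimensional region $\Omega$.)

Finally, suppose that (iii) holds. Let $F = (p, q, r)$ and suppose for simplicity that $G = (0, Q, R)$. If $\operatorname{curl} G = F$, then
\[ R_y - Q_z = p, \quad -R_x = q, \quad Q_x = r. \tag{16} \]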
If $\Omega$ is a ball, let $(x_0, y_0, z_0)$ be its center; if $\Omega$ is a rectangle, let $(x_0, y_0, z_0)$ be any point in $\Omega$. Then given any $(x, y, z) \in \Omega$, the line segment from $(x_0, y, z)$ to $(x, y, z)$ is a subset of $\Omega$. Hence we can integrate the last two identities in (16) from $x_0$ to $x$, obtaining
\[ R = -\int_{x_0}^x q(u, y, z)\, du + g(y, z) \quad \text{and} \quad Q = \int_{x_0}^x r(u, y, z)\, du + h(y, z) \]
for some $g, h : \mathbf{R}^2 \to \mathbf{R}$. Differentiating under the integral sign (Theorem 11.5), and using condition (iii), the first identity becomes
\[
\begin{aligned}
p = R_y - Q_z &= -\int_{x_0}^x \bigl( q_y(u, y, z) + r_z(u, y, z) \bigr)\, du + g_y - h_z \\
&= \int_{x_0}^x p_x(u, y, z)\, du + g_y - h_z = p(x, y, z) - p(x_0, y, z) + g_y - h_z.
\end{aligned}
\]
Thus (16) can be solved by $g_y = p(x_0, y, z)$ and $h = 0$; i.e.,
\[ Q = \int_{x_0}^x r(u, y, z)\, du \quad \text{and} \quad R = \int_{y_0}^y p(x_0, v, z)\, dv - \int_{x_0}^x q(u, y, z)\, du. \quad ∎ \]
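The proof that (iii) implies (i) is constructive, so it can be tried out numerically. The sketch below builds $G = (0, Q, R)$ from the two integral formulas at the end of the proof, approximating the integrals with Simpson's rule, and then checks $\operatorname{curl} G = F$ by central differences. The base point $(x_0, y_0) = (0, 0)$, the test field (the divergence-free field of Example 13.60), and all helper names are assumptions made for illustration.

```python
def simpson(f, a, b, n=20):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def make_G(F, x0=0.0, y0=0.0):
    """Return G = (0, Q, R) built from F = (p, q, r) as in the proof of Theorem 13.61."""
    def G(x, y, z):
        Q = simpson(lambda u: F(u, y, z)[2], x0, x)          # integral of r(u, y, z) du
        R = (simpson(lambda v: F(x0, v, z)[0], y0, y)        # integral of p(x0, v, z) dv
             - simpson(lambda u: F(u, y, z)[1], x0, x))      # minus integral of q(u, y, z) du
        return (0.0, Q, R)
    return G

def partial(field, i, j, p, h=1e-4):
    hi, lo = list(p), list(p)
    hi[j] += h
    lo[j] -= h
    return (field(*hi)[i] - field(*lo)[i]) / (2 * h)

def curl(field, p):
    return (
        partial(field, 2, 1, p) - partial(field, 1, 2, p),
        partial(field, 0, 2, p) - partial(field, 2, 0, p),
        partial(field, 1, 0, p) - partial(field, 0, 1, p),
    )

def F(x, y, z):
    # div F = 1 + 0 - 1 = 0, so condition (iii) holds.
    return (x + z * z, 0.0, -z - 3.0)

G = make_G(F)
p = (0.4, -0.3, 0.8)  # an arbitrary sample point
print(curl(G, p), F(*p))
```

For this $F$ the formulas give $Q = -(z + 3)x$ and $R = yz^2$ in closed form, that is, $G = (0, -(z + 3)x, yz^2)$, which is yet another solution of $\operatorname{curl} G = F$.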
We notice that Theorem 13.61 holds for any three-dimensional region $\Omega$ that satisfies the following property: There is a point $(x_0, y_0, z_0) \in \Omega$ such that the line segments $L((x_0, y, z); (x, y, z))$ and $L((x_0, y_0, z); (x_0, y, z))$ are both subsets of $\Omega$ for all $(x, y, z) \in \Omega$. However, as the following result shows, Theorem 13.61 is false without some restriction on $\Omega$.

13.62 Remark. Let $\Omega = B_1(0, 0, 0) \setminus \{(0, 0, 0)\}$ and
\[ F(x, y, z) = \left( \frac{x}{w^{3/2}}, \frac{y}{w^{3/2}}, \frac{z}{w^{3/2}} \right), \]
where $w = w(x, y, z) = x^2 + y^2 + z^2$. Then $\operatorname{div} F = 0$ on $\Omega$, but there is no $G$ that satisfies $\operatorname{curl} G = F$.

PROOF.
By definition,
\[ \operatorname{div} F = \frac{-2x^2 + y^2 + z^2}{w^{5/2}} + \frac{x^2 - 2y^2 + z^2}{w^{5/2}} + \frac{x^2 + y^2 - 2z^2}{w^{5/2}} = 0. \]
Let $S$ represent the unit sphere $\partial B_1(0, 0, 0)$ oriented with the outward-pointing normal, and suppose that there is a $G$ such that $\operatorname{curl} G = F$. On the one hand, since $F = (x, y, z) = n$ on $S$ implies that $F \cdot n = x^2 + y^2 + z^2 = 1$, we have
\[ \iint_S F \cdot n\, d\sigma = \iint_S 1\, d\sigma = \sigma(S) = 4\pi. \tag{17} \]
On the other hand, dividing $S$ into the upper hemisphere $S_1$ and the lower hemisphere $S_2$, we have by Stokes's Theorem that
\[ \iint_S F \cdot n\, d\sigma = \iint_{S_1} F \cdot n\, d\sigma + \iint_{S_2} F \cdot n\, d\sigma = \int_{\partial S_1} G \cdot T_1\, ds + \int_{\partial S_2} G \cdot T_2\, ds = 0. \tag{18} \]
This last step follows from the fact that $\partial S_1 = \partial S_2$ and $T_1 = -T_2$. Since (17) and (18) are incompatible, we conclude that there is no $G$ that satisfies $\operatorname{curl} G = F$. ∎
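The two facts that collide in Remark 13.62, namely $\operatorname{div} F = 0$ away from the origin and flux $4\pi$ through the unit sphere, can both be observed numerically. The sketch below estimates the divergence by central differences at a few sample points and the flux by a midpoint rule in spherical coordinates. The sample points, grid size, and helper names are illustrative choices.

```python
import math

def F(x, y, z):
    # F = (x, y, z) / w^(3/2) with w = x^2 + y^2 + z^2.
    w = x * x + y * y + z * z
    return (x / w**1.5, y / w**1.5, z / w**1.5)

def div(field, p, h=1e-4):
    """Central-difference estimate of the divergence of field at p."""
    total = 0.0
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        total += (field(*hi)[i] - field(*lo)[i]) / (2 * h)
    return total

def flux_through_unit_sphere(n=200):
    """Midpoint rule for the flux of F through the unit sphere, outward normal."""
    total = 0.0
    dphi = math.pi / n
    dtheta = 2 * math.pi / n
    for i in range(n):
        phi = (i + 0.5) * dphi
        for j in range(n):
            theta = (j + 0.5) * dtheta
            normal = (
                math.sin(phi) * math.cos(theta),
                math.sin(phi) * math.sin(theta),
                math.cos(phi),
            )
            f_dot_n = sum(a * b for a, b in zip(F(*normal), normal))
            total += f_dot_n * math.sin(phi) * dphi * dtheta
    return total

samples = [(0.3, -0.4, 0.5), (0.1, 0.2, -0.6), (-0.5, 0.5, 0.5)]
divergences = [div(F, p) for p in samples]
flux = flux_through_unit_sphere()
print(divergences)         # each entry should be near 0
print(flux, 4 * math.pi)   # the flux should be near 4*pi
```

If $F$ were $C^1$ on the whole ball, Gauss's Theorem would force the flux to be zero; the singularity at the origin is what allows a nonzero flux alongside zero divergence.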
EXERCISES

1. For each of the following, evaluate $\int_C F \cdot T\, ds$.
(a) $C$ is the curve formed by intersecting the cylinder $x^2 + y^2 = 1$ with $z = -x$, oriented in the counterclockwise direction when viewed from high on the positive $z$ axis, and $F(x, y, z) = (xy^2, 0, xyz)$.
(b) $C$ is the intersection of the cubic cylinder $z = y^3$ and the circular cylinder $x^2 + y^2 = 3$, oriented in the clockwise direction when viewed from high up the positive $z$ axis, and $F(x, y, z) = (e^x + z, xy, ze^y)$.

2. For each of the following, evaluate $\iint_S \operatorname{curl} F \cdot n\, d\sigma$.
(a) $S$ is the "bottomless" surface in the upper half space $z \ge 0$ bounded by $y = x^2$, $z = 1 - y$, $n$ is the outward-pointing normal, and $F(x, y, z) = (x \sin z^3, y \cos z^3, x^3 + y^3 + z^3)$.
(b) $S$ is the truncated paraboloid $z = 3 - x^2 - y^2$, $z \ge 0$, $n$ is the outward-pointing normal, and $F(x, y, z) = (y, xyz, y)$.
(c) $S$ is the hemisphere $z = \sqrt{10 - x^2 - y^2}$, $n$ is the inward-pointing normal, and $F(x, y, z) = (x, x, x^2 y^3 \log(z + 1))$.
(d) $S$ is the "bottomless" tetrahedron in the upper half space $z \ge 0$ bounded by $x = 0$, $y = 0$, $x + 2y + 3z = 1$, $z \ge 0$, $n$ is the outward-pointing normal, and $F(x, y, z) = (xy, yz, xz)$.
3. For each of the following, evaluate $\iint_S F \cdot n\, d\sigma$ using Stokes's Theorem or Gauss's Theorem.
(a) $S$ is the sphere $x^2 + y^2 + z^2 = 1$, $n$ is the outward-pointing normal, and $F(x, y, z) = (xz^2, x^2 y - z^3, 2xy + y^2 z)$.
(b) $S$ is the portion of the plane $z = y$ that lies inside the ball $B_1(0)$, $n$ is the upward-pointing normal, and $F(x, y, z) = (xy, xz, -yz)$.
(c) $S$ is the truncated cone $y = 2\sqrt{x^2 + z^2}$, $2 \le y \le 4$, $n$ is the outward-pointing normal, and $F(x, y, z) = (x, -2y, z)$.
(d) $S$ is a union of truncated paraboloids $z = 4 - x^2 - y^2$, $0 \le z \le 4$, and $z = x^2 + y^2 - 4$, $-4 \le z \le 0$, $n$ is the outward-pointing normal, and $F(x, y, z) = (x + y^2 + \sin z, x + y^2 + \cos z, \cos x + \sin y + z)$.
(e) $S$ is the union of three surfaces $z = x^2 + y^2$ ($0 \le z \le 2$), $2 = x^2 + y^2$ ($2 \le z \le 5$), and $z = 7 - x^2 - y^2$ ($5 \le z \le 6$), $n$ is the outward-pointing normal, and $F(x, y, z) = (2y, 2z, 1)$.

4. For each of the following, evaluate $\iint_S \omega$ using Stokes's Theorem or Gauss's Theorem.
(a) $S$ is the topological boundary of the cylindrical solid $y^2 + z^2 \le 9$, $0 \le x \le 2$, with outward-pointing normal, and $\omega = xy\, dy\, dz + (x^2 - z^2)\, dz\, dx + xz\, dx\, dy$.
(b) $S$ is the truncated cylinder $x^2 + z^2 = 8$, $0 \le y \le 1$, with outward-pointing normal, and $\omega = (x - 2z)\, dy\, dz - y\, dz\, dx$.
(c) $S$ is the topological boundary of $R = [0, \pi/2] \times [0, 1] \times [0, 3]$, with outward-pointing normal, and $\omega = e^y \cos x\, dy\, dz + x^2 z\, dz\, dx + (x + y + z)\, dx\, dy$.
(d) $S$ is the intersection of the elliptic cylindrical solid $2x^2 + z^2 \le 1$ and the plane $x = y$, with normal that points toward the positive $x$ axis, and $\omega = x\, dy\, dz - y\, dz\, dx + \sin y\, dx\, dy$.

5. Prove that Green's Theorem is a corollary of Stokes's Theorem.

6. Let $\Pi$ be a plane in $\mathbf{R}^3$ with unit normal $n$ and $x_0 \in \Pi$. For each $r > 0$, let $S_r$ be the disk in $\Pi$ centered at $x_0$ of radius $r$; i.e., $S_r = B_r(x_0) \cap \Pi$. Prove that if $F : B_1(x_0) \to \mathbf{R}^3$ is $C^1$ and $\partial S_r$ carries the orientation induced by $n$, then
\[ \operatorname{curl} F(x_0) \cdot n = \lim_{r \to 0+} \frac{1}{\sigma(S_r)} \int_{\partial S_r} F \cdot T\, ds. \]
7. Let $S$ be an orientable surface with unit normal $n$ and nonempty boundary $\partial S$ that satisfies the hypotheses of Stokes's Theorem.
(a) Suppose that $F : S \to \mathbf{R}^3 \setminus \{0\}$ is $C^1$, that $\partial S$ is smooth, and that $T$ is the unit tangent vector on $\partial S$ induced by $n$. If the angle between $T(x_0)$ and $F(x_0)$ is never obtuse for any $x_0 \in \partial S$, and $\iint_S \operatorname{curl} F \cdot n\, d\sigma = 0$, prove that $T(x_0)$ and $F(x_0)$ are orthogonal for all $x_0 \in \partial S$.
(b) If $F, F_k : S \to \mathbf{R}^3$ are $C^1$ and $F_k \to F$ uniformly on $\partial S$, prove that
\[ \lim_{k \to \infty} \iint_S \operatorname{curl} F_k \cdot n\, d\sigma = \iint_S \operatorname{curl} F \cdot n\, d\sigma. \]
8. Let E be a two-dimensional region such that if $(x, y) \in E$, then the line segments from $(0,0)$ to $(x,0)$ and from $(x,0)$ to $(x,y)$ are both subsets of E. If $F : E \to \mathbf{R}^2$ is $C^1$, prove that the following three statements are equivalent.
(a) $F = \nabla f$ on E for some $f : E \to \mathbf{R}$.
(b) $F = (P, Q)$ is exact; i.e., $Q_x = P_y$ on E.
(c) $\int_C F \cdot T \, ds = 0$ for all piecewise smooth curves $C = \partial\Omega$ oriented counterclockwise, where $\Omega$ is a two-dimensional region that satisfies the hypotheses of Green's Theorem, and $\Omega \subset E$.
9. Let $\Omega$ be a three-dimensional region and $F : \Omega \to \mathbf{R}^3$ be $C^1$ on $\Omega$. Suppose further that for each $(x, y, z) \in \Omega$, both the line segments $L((x,y,0); (x,y,z))$ and $L((x,0,0); (x,y,0))$ are subsets of $\Omega$. Prove that the following statements are equivalent.
(a) There is a $C^2$ function $G : \Omega \to \mathbf{R}^3$ such that $\operatorname{curl} G = F$ on $\Omega$.
(b) If F, E, and $S = \partial E$ satisfy the hypotheses of Gauss's Theorem and $E \subset \Omega$, then
$$\iint_S F \cdot n \, d\sigma = 0.$$
(c) The identity $\operatorname{div} F = 0$ holds everywhere on $\Omega$.
10. Suppose that E satisfies the hypotheses of Gauss's Theorem and S satisfies the hypotheses of Stokes's Theorem.
(a) If $f : S \to \mathbf{R}$ is a $C^2$ function and $F = \operatorname{grad} f$ on S, prove that
$$\int_{\partial S} (fF) \cdot T \, ds = 0.$$
(b) If $G : E \to \mathbf{R}^3$ is a $C^2$ function and $F = \operatorname{curl} G$ on E, prove that
$$\iint_{\partial E} (fF) \cdot n \, d\sigma = \iiint_E \operatorname{grad} f \cdot F \, dV.$$
Note: You may wish to use Exercises 8 and 9, p. 495.
11. Let F be $C^1$ and exact on $\mathbf{R}^2 \setminus \{(0,0)\}$ (see Exercise 8b).
(a) Suppose that $C_1$ and $C_2$ are disjoint smooth simple curves, oriented in the counterclockwise direction, and E is a two-dimensional region whose topological boundary $\partial E$ is the union of $C_1$ and $C_2$. (Note: This means that E has a hole with one of the $C_j$'s as the outer boundary and the other as the inner boundary.) If $(0,0) \notin E$, prove that
$$\int_{C_1} F \cdot T \, ds = \int_{C_2} F \cdot T \, ds.$$
(b) Suppose that E is a two-dimensional region that satisfies $(0,0) \in E^o$. If $\partial E$ is a smooth simple curve oriented in the counterclockwise direction, and
$$F(x, y) = \left( \frac{-y}{x^2 + y^2}, \frac{x}{x^2 + y^2} \right),$$
compute $\int_{\partial E} F \cdot T \, ds$.
(c) State and prove an analogue of part (a) for functions $F : \mathbf{R}^3 \setminus \{(0,0,0)\} \to \mathbf{R}^3$, three-dimensional regions, and smooth surfaces.
Chapter 14
Fourier Series
e14.1 INTRODUCTION This section uses no material from any other enrichment section.
In Chapter 7 we studied power series and their partial sums, classical polynomials. In this chapter we shall study the following objects.
14.1 DEFINITION. Let $a_k, b_k \in \mathbf{R}$ and let N be a nonnegative integer.
(i) A trigonometric series is a series of the form
$$\frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx).$$
(ii) A trigonometric polynomial of order N is a function $P : \mathbf{R} \to \mathbf{R}$ of the form
$$P(x) = \frac{a_0}{2} + \sum_{k=1}^{N} (a_k \cos kx + b_k \sin kx).$$
(Here, $\cos kx$ is shorthand for $\cos(kx)$, and $\sin kx$ is shorthand for $\sin(kx)$.)
Calculus was invented with the tacit assumption that power series provided a unified function theory; i.e., every function has a power series expansion (see Klein [5]). When Cauchy showed that this assumption was false (see Remark 7.41), mathematicians began to wonder whether some other type of series would provide a unified function theory. Euler (respectively, Fourier) had shown that the position of a vibrating string (respectively, the temperature along a metal rod) can be represented by trigonometric series. Thus, it was natural to ask: Does every function have a trigonometric series expansion? In this chapter we shall examine this question, and the following calculation will help to answer it.
14.2 Lemma [ORTHOGONALITY]. Let k, j be nonnegative integers. Then
(i)
$$\int_{-\pi}^{\pi} \cos kx \cos jx \, dx = \begin{cases} 2\pi & k = j = 0 \\ \pi & k = j \ne 0 \\ 0 & k \ne j, \end{cases}$$
(ii)
$$\int_{-\pi}^{\pi} \sin kx \sin jx \, dx = \begin{cases} \pi & k = j \ne 0 \\ 0 & \text{otherwise,} \end{cases}$$
and
(iii)
$$\int_{-\pi}^{\pi} \sin kx \cos jx \, dx = 0.$$
PROOF. Let
$$I = \int_{-\pi}^{\pi} \cos kx \cos jx \, dx.$$
If $k = j = 0$, then $I = \int_{-\pi}^{\pi} dx = 2\pi$. If $k = j \ne 0$, then by a half-angle formula and elementary integration, we have
$$I = \int_{-\pi}^{\pi} \cos^2 kx \, dx = \frac{1}{2} \int_{-\pi}^{\pi} (1 + \cos 2kx) \, dx = \pi.$$
If $k \ne j$, then by a sum-angle formula and elementary integration, we have
$$I = \frac{1}{2} \int_{-\pi}^{\pi} (\cos(k + j)x + \cos(k - j)x) \, dx = 0.$$
This proves part (i). Similar arguments prove parts (ii) and (iii). ∎
Notice once and for all that the question concerning representation of functions by trigonometric series has a built-in limitation. A function $f : \mathbf{R} \to \mathbf{R}$ is said to be periodic (of period $2\pi$) if and only if $f(x + 2\pi) = f(x)$ for all $x \in \mathbf{R}$. Since $\cos kx$ and $\sin kx$ are periodic, it is clear that every trigonometric polynomial is periodic. Therefore, any function that is the pointwise or uniform limit of a trigonometric series must also be periodic. For this reason, we will usually restrict our attention to the interval $[-\pi, \pi]$ and assume that $f(-\pi) = f(\pi)$.
The following definition, which introduces a special type of trigonometric series, plays a crucial role in the representation of periodic functions by trigonometric series.
14.3 DEFINITION. Let f be integrable on $[-\pi, \pi]$ and let N be a nonnegative integer.
(i) The Fourier coefficients of f are the numbers
$$a_k(f) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos kx \, dx, \qquad k = 0, 1, \ldots,$$
and
$$b_k(f) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin kx \, dx, \qquad k = 1, 2, \ldots.$$
(ii) The Fourier series of f is the trigonometric series
$$(Sf)(x) = \frac{a_0(f)}{2} + \sum_{k=1}^{\infty} (a_k(f) \cos kx + b_k(f) \sin kx).$$
(iii) The partial sum of Sf of order N is the trigonometric polynomial defined, for each $x \in \mathbf{R}$, by $(S_0 f)(x) = a_0(f)/2$ if $N = 0$, and
$$(S_N f)(x) = \frac{a_0(f)}{2} + \sum_{k=1}^{N} (a_k(f) \cos kx + b_k(f) \sin kx)$$
if $N \in \mathbf{N}$.
The following result shows why Fourier series play such an important role in the representation of periodic functions by trigonometric series.
14.4 THEOREM [FOURIER]. If a trigonometric series
$$S := \frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx)$$
converges uniformly on R to a function f, then S is the Fourier series of f; i.e., $a_k = a_k(f)$ for $k = 0, 1, \ldots$, and $b_k = b_k(f)$ for $k = 1, 2, \ldots$.
PROOF. Fix an integer $k \ge 0$. Since
$$f(x) = \frac{a_0}{2} + \sum_{j=1}^{\infty} (a_j \cos jx + b_j \sin jx)$$
converges uniformly and $\cos kx$ is bounded,
$$f(x) \cos kx = \frac{a_0}{2} \cos kx + \sum_{j=1}^{\infty} (a_j \cos jx \cos kx + b_j \sin jx \cos kx) \tag{1}$$
also converges uniformly. Since f is the uniform limit of continuous functions, f is continuous, hence integrable on $[-\pi, \pi]$. Integrating (1) term by term and using orthogonality, we obtain
$$a_k(f) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos kx \, dx = \frac{a_0}{2\pi} \int_{-\pi}^{\pi} \cos kx \, dx + \sum_{j=1}^{\infty} \left( \frac{a_j}{\pi} \int_{-\pi}^{\pi} \cos kx \cos jx \, dx + \frac{b_j}{\pi} \int_{-\pi}^{\pi} \cos kx \sin jx \, dx \right) = a_k.$$
A similar argument establishes that $b_k(f) = b_k$. ∎
There are two central questions in the study of trigonometric series.
CONVERGENCE QUESTION. Given a function $f : \mathbf{R} \to \mathbf{R}$, periodic on R and integrable on $[-\pi, \pi]$, does the Fourier series of f converge to f?
UNIQUENESS QUESTION. If a trigonometric series S converges to some function f integrable on $[-\pi, \pi]$, is S the Fourier series of f?
We shall answer these questions for pointwise and uniform convergence when f is continuous and of bounded variation. We notice in passing that by Theorem 14.4, the answer to the Uniqueness Question is "yes" if uniform convergence is used. The following special trigonometric polynomials arise naturally in connection with the Convergence Question (see Exercise 2).
14.5 DEFINITION. Let N be a nonnegative integer.
(i) The Dirichlet kernel of order N is the function defined, for each $x \in \mathbf{R}$, by $D_0(x) = 1/2$ if $N = 0$, and
$$D_N(x) = \frac{1}{2} + \sum_{k=1}^{N} \cos kx$$
if $N \in \mathbf{N}$.
(ii) The Fejér kernel of order N is the function defined, for each $x \in \mathbf{R}$, by $K_0(x) = 1/2$ if $N = 0$, and
$$K_N(x) = \frac{1}{2} + \sum_{k=1}^{N} \left( 1 - \frac{k}{N+1} \right) \cos kx \tag{2}$$
if $N \in \mathbf{N}$.
The following result shows that there is a simple relationship between Fejér kernels and Dirichlet kernels.
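14.6 Remark. If N is a nonnegative integer, then
$$K_N(x) = \frac{D_0(x) + \cdots + D_N(x)}{N+1}$$
for all $x \in \mathbf{R}$.
PROOF. The identity is trivial if $N = 0$. To prove the identity for $N \in \mathbf{N}$, fix $x \in \mathbf{R}$. By definition,
$$K_N(x) = \frac{1}{N+1} \left( \frac{N+1}{2} + \sum_{k=1}^{N} (N - k + 1) \cos kx \right) = \frac{1}{N+1} \left( \frac{1}{2} + \frac{N}{2} + \sum_{k=1}^{N} \sum_{j=k}^{N} \cos kx \right)$$
$$= \frac{1}{N+1} \left( \frac{1}{2} + \sum_{j=1}^{N} \left( \frac{1}{2} + \sum_{k=1}^{j} \cos kx \right) \right) = \frac{D_0(x) + \cdots + D_N(x)}{N+1}. \; ∎$$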
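The relationship between the two kernels can be sanity-checked numerically. The following is a minimal sketch, not part of the text: the function names and the sample grid are illustrative choices, and only Python's standard `math` module is assumed. It evaluates the Fejér kernel from its defining sum and compares it with the average of Dirichlet kernels.

```python
import math

def dirichlet(N, x):
    """D_N(x) = 1/2 + sum_{k=1}^N cos(kx), the Dirichlet kernel of order N."""
    return 0.5 + sum(math.cos(k * x) for k in range(1, N + 1))

def fejer(N, x):
    """K_N(x) = 1/2 + sum_{k=1}^N (1 - k/(N+1)) cos(kx), the Fejer kernel of order N."""
    return 0.5 + sum((1 - k / (N + 1)) * math.cos(k * x) for k in range(1, N + 1))

def dirichlet_average(N, x):
    """(D_0(x) + ... + D_N(x)) / (N + 1)."""
    return sum(dirichlet(j, x) for j in range(N + 1)) / (N + 1)

# Largest discrepancy over a few orders N and an arbitrary set of sample points.
max_gap = max(
    abs(fejer(N, x) - dirichlet_average(N, x))
    for N in range(0, 8)
    for x in (-3.0, -1.2, 0.0, 0.7, 2.5)
)
print(max_gap)
```

The printed gap should be at the level of floating-point rounding, which is consistent with the identity but of course does not replace the proof.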
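The next result shows that Dirichlet and Fejér kernels can be represented by quotients of trigonometric functions.
14.7 THEOREM. If $x \in \mathbf{R}$ cannot be written in the form $2k\pi$ for any $k \in \mathbf{Z}$, then
$$D_N(x) = \frac{\sin\left(N + \frac{1}{2}\right)x}{2 \sin \frac{x}{2}} \tag{3}$$
and
$$K_N(x) = \frac{2}{N+1} \left( \frac{\sin\left(\frac{N+1}{2}\right)x}{2 \sin \frac{x}{2}} \right)^2 \tag{4}$$
for $N = 0, 1, \ldots$.
PROOF. The formulas are trivial for $N = 0$. Fix $N \in \mathbf{N}$. Applying a sum-angle formula and telescoping, we have
$$\left( D_N(x) - \frac{1}{2} \right) \sin \frac{x}{2} = \sum_{k=1}^{N} \cos kx \sin \frac{x}{2} = \frac{1}{2} \sum_{k=1}^{N} \left( \sin\left(k + \tfrac{1}{2}\right)x - \sin\left(k - \tfrac{1}{2}\right)x \right) = \frac{1}{2} \left( \sin\left(N + \tfrac{1}{2}\right)x - \sin \frac{x}{2} \right).$$
Solving this equation for $D_N(x)$ verifies (3).
Let $k \in \mathbf{N}$. By (3) and another sum-angle formula,
$$D_k(x) \sin^2 \frac{x}{2} = \frac{1}{2} \sin \frac{x}{2} \sin\left(k + \tfrac{1}{2}\right)x = \frac{1}{4} (\cos kx - \cos(k+1)x).$$
This identity also holds for $k = 0$. Applying Remark 14.6 and telescoping, we have
$$(N+1) K_N(x) \sin^2 \frac{x}{2} = \sum_{k=0}^{N} D_k(x) \sin^2 \frac{x}{2} = \frac{1}{4} \sum_{k=0}^{N} (\cos kx - \cos(k+1)x) = \frac{1}{4} (1 - \cos(N+1)x) = \frac{1}{2} \sin^2\left(\frac{N+1}{2}\right)x.$$
Solving this equation for $K_N(x)$ verifies (4). ∎
These identities will be used in the next section to obtain a partial answer to the Convergence Question.
The next two examples illustrate the general principle that the Fourier coefficients of many common functions can be computed using integration by parts.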
14.8 Example. Prove that the Fourier series of $f(x) = x$ is
$$2 \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} \sin kx.$$
PROOF. Since $x \cos kx$ is odd and $x \sin kx$ is even, we see that $a_k(f) = 0$ for $k = 0, 1, \ldots$, and
$$b_k(f) = \frac{2}{\pi} \int_0^{\pi} x \sin kx \, dx$$
for $k = 1, 2, \ldots$. Integrating by parts, we conclude that
$$b_k(f) = \frac{2}{\pi} \left( \left. -\frac{x \cos kx}{k} \right|_0^{\pi} + \frac{1}{k} \int_0^{\pi} \cos kx \, dx \right) = \frac{2(-1)^{k+1}}{k}. \; ∎$$
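The coefficients found in Example 14.8 can be compared with a direct numerical evaluation of the integrals that define Fourier coefficients. The sketch below is illustrative only: the midpoint rule, the number of subintervals, and the function names `integrate`, `a_coeff`, and `b_coeff` are assumptions made for this check and do not come from the text.

```python
import math

def integrate(g, a, b, n=20000):
    """Composite midpoint rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def a_coeff(f, k):
    """Approximate a_k(f) = (1/pi) * integral of f(x) cos(kx) over [-pi, pi]."""
    return integrate(lambda x: f(x) * math.cos(k * x), -math.pi, math.pi) / math.pi

def b_coeff(f, k):
    """Approximate b_k(f) = (1/pi) * integral of f(x) sin(kx) over [-pi, pi]."""
    return integrate(lambda x: f(x) * math.sin(k * x), -math.pi, math.pi) / math.pi

f = lambda x: x
for k in range(1, 6):
    closed_form = 2 * (-1) ** (k + 1) / k   # value derived by integration by parts
    print(k, a_coeff(f, k), b_coeff(f, k), closed_form)
```

The cosine coefficients should come out numerically zero (the integrand is odd), and the sine coefficients should agree with $2(-1)^{k+1}/k$ up to the quadrature error.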
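14.9 Example. Prove that the Fourier series of $f(x) = |x|$ is
$$\frac{\pi}{2} - \frac{4}{\pi} \sum_{k=1}^{\infty} \frac{\cos(2k-1)x}{(2k-1)^2}.$$
PROOF. Since $|x| \cos kx$ is even and $|x| \sin kx$ is odd, we see that $b_k(f) = 0$ for $k = 1, 2, \ldots$, and
$$a_k(f) = \frac{2}{\pi} \int_0^{\pi} x \cos kx \, dx$$
for $k = 0, 1, \ldots$. If $k = 0$, then
$$a_0(f) = \frac{2}{\pi} \int_0^{\pi} x \, dx = \pi;$$
i.e., $a_0(f)/2 = \pi/2$. If $k > 0$, then integration by parts yields
$$a_k(f) = \frac{2}{\pi} \left( \left. \frac{x \sin kx}{k} \right|_0^{\pi} - \frac{1}{k} \int_0^{\pi} \sin kx \, dx \right) = \frac{2}{\pi k^2} (\cos k\pi - 1) = \begin{cases} 0 & \text{if } k \text{ is even} \\ -\dfrac{4}{\pi k^2} & \text{if } k \text{ is odd.} \end{cases} \; ∎$$
EXERCISES
1. Compute the Fourier series of (a) $x^2$ and (b) $\cos^2 x$.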
2. Prove that if $f : \mathbf{R} \to \mathbf{R}$ is integrable on $[-\pi, \pi]$, then
$$(S_N f)(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) D_N(x - t) \, dt$$
for all $x \in [-\pi, \pi]$ and $N \in \mathbf{N}$.
3. Show that if f, g are integrable on $[-\pi, \pi]$ and $\alpha \in \mathbf{R}$, then
$$a_k(f + g) = a_k(f) + a_k(g), \qquad a_k(\alpha f) = \alpha a_k(f), \qquad k = 0, 1, \ldots,$$
and
$$b_k(f + g) = b_k(f) + b_k(g), \qquad b_k(\alpha f) = \alpha b_k(f), \qquad k = 1, 2, \ldots.$$
4. Let $f : \mathbf{R} \to \mathbf{R}$ be differentiable and periodic and $f'$ be integrable on $[-\pi, \pi]$. Prove that
$$a_k(f') = k b_k(f) \qquad \text{and} \qquad b_k(f') = -k a_k(f), \qquad k \in \mathbf{N}.$$
5. Suppose that $f_N : [-\pi, \pi] \to \mathbf{R}$ are integrable and $f_N \to f$ uniformly on $[-\pi, \pi]$ as $N \to \infty$.
(a) Prove that $a_k(f_N) \to a_k(f)$ and $b_k(f_N) \to b_k(f)$, as $N \to \infty$, uniformly in k.
(b) Show that part (a) holds under the weaker hypothesis
$$\lim_{N \to \infty} \int_{-\pi}^{\pi} |f(x) - f_N(x)| \, dx = 0.$$
6. Let
$$f(x) = \begin{cases} \dfrac{x}{|x|} & x \ne 0 \\ 0 & x = 0. \end{cases}$$
(a) Compute the Fourier coefficients of f.
(b) Prove that
$$(S_{2N} f)(x) = \frac{2}{\pi} \int_0^x \frac{\sin 2Nt}{\sin t} \, dt$$
for $x \in [-\pi, \pi]$ and $N \in \mathbf{N}$.
*(c) [GIBBS'S PHENOMENON]. Prove that
$$\lim_{N \to \infty} (S_{2N} f)\left( \frac{\pi}{2N} \right) = \frac{2}{\pi} \int_0^{\pi} \frac{\sin t}{t} \, dt \approx 1.179.$$
e14.2 SUMMABILITY OF FOURIER SERIES This section uses material from Section 14.1.
The Convergence Question posed in Section 14.1 is very difficult to answer, even for continuous functions. In this section we replace it with an easier question and show that the answer to this question is "yes." Namely, we shall show that the Fourier series of any continuous periodic function f is uniformly summable to f. By summable, we mean the following concept.
14.10 DEFINITION. A series $\sum_{k=0}^{\infty} a_k$ with partial sums $s_N = \sum_{k=0}^{N} a_k$ is said to be Cesàro summable to L if and only if its Cesàro means
$$\sigma_N := \frac{s_0 + \cdots + s_N}{N+1}$$
converge to L as $N \to \infty$.
The following result shows that summability is a generalization of convergence.
14.11 Remark. If $\sum_{k=0}^{\infty} a_k$ converges to a finite number L, then it is Cesàro summable to L.
PROOF. Let $\varepsilon > 0$. Choose $N_1 \in \mathbf{N}$ such that $k \ge N_1$ implies that $|s_k - L| < \varepsilon/2$. Use the Archimedean Principle to choose $N_2 \in \mathbf{N}$ such that $N_2 > N_1$ and
$$\frac{1}{N_2} \sum_{k=0}^{N_1} |s_k - L| < \frac{\varepsilon}{2}.$$
If $N > N_2$, then
$$|\sigma_N - L| \le \frac{1}{N+1} \sum_{k=0}^{N_1} |s_k - L| + \frac{1}{N+1} \sum_{k=N_1+1}^{N} |s_k - L| < \frac{\varepsilon}{2} + \frac{N - N_1}{N+1} \cdot \frac{\varepsilon}{2} < \varepsilon. \; ∎$$
The converse of Remark 14.11 is false. Indeed, although the series $\sum_{k=0}^{\infty} (-1)^k$ does not converge, its Cesàro means satisfy
$$\sigma_N = \begin{cases} \dfrac{N+2}{2(N+1)} & N \text{ is even} \\ \dfrac{1}{2} & N \text{ is odd,} \end{cases}$$
whence $\sigma_N \to 1/2$ as $N \to \infty$.
It is easier to show that a series is Cesàro summable than to show that it converges. Thus the following question is easier to answer than the Convergence Question.
SUMMABILITY QUESTION. Given a function $f : \mathbf{R} \to \mathbf{R}$, periodic on R and integrable on $[-\pi, \pi]$, is Sf Cesàro summable to f?
The Cesàro means of a Fourier series Sf are denoted by
$$(\sigma_N f)(x) := \frac{(S_0 f)(x) + \cdots + (S_N f)(x)}{N+1}, \qquad N = 0, 1, \ldots.$$
The following result shows that the Cesàro means of a Fourier series can always be represented by an integral equation. This is important because it allows us to estimate the remainder $\sigma_N f - f$, using techniques of integration.
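14.12 Lemma. Let $f : \mathbf{R} \to \mathbf{R}$ be periodic on R and integrable on $[-\pi, \pi]$. Then
$$(\sigma_N f)(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x - t) K_N(t) \, dt$$
for all $N = 0, 1, \ldots$, and all $x \in \mathbf{R}$.
PROOF. Fix $j, N \in \mathbf{N}$ and $x \in \mathbf{R}$. By definition and a sum-angle formula,
$$a_j(f) \cos jx + b_j(f) \sin jx = \frac{1}{\pi} \int_{-\pi}^{\pi} f(u) \cos ju \cos jx \, du + \frac{1}{\pi} \int_{-\pi}^{\pi} f(u) \sin ju \sin jx \, du$$
$$= \frac{1}{\pi} \int_{-\pi}^{\pi} f(u) (\cos ju \cos jx + \sin ju \sin jx) \, du = \frac{1}{\pi} \int_{-\pi}^{\pi} f(u) \cos j(x - u) \, du.$$
Summing this identity over integers $j = 1, 2, \ldots, k$ and adding $a_0(f)/2$, we have
$$(S_k f)(x) = \frac{a_0(f)}{2} + \sum_{j=1}^{k} (a_j(f) \cos jx + b_j(f) \sin jx) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(u) \left( \frac{1}{2} + \sum_{j=1}^{k} \cos j(x - u) \right) du = \frac{1}{\pi} \int_{-\pi}^{\pi} f(u) D_k(x - u) \, du$$
for $k = 0, 1, \ldots$. Making the change of variables $t = x - u$ and using the fact that both f and $D_k$ are periodic, we obtain
$$(S_k f)(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x - t) D_k(t) \, dt, \qquad k = 0, 1, \ldots.$$
We conclude by Remark 14.6 that
$$(\sigma_N f)(x) = \frac{1}{N+1} \sum_{k=0}^{N} \frac{1}{\pi} \int_{-\pi}^{\pi} f(x - t) D_k(t) \, dt = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x - t) K_N(t) \, dt. \; ∎$$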
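A concrete way to see what averaging the partial sums does is to compute Cesàro means for a function whose coefficients are known. The sketch below uses $f(x) = x$, whose sine coefficients are $2(-1)^{k+1}/k$ (Example 14.8), and compares the plain average of partial sums with the same quantity written using the weights $1 - k/(N+1)$ that appear in the Fejér kernel. The function names and the sample point `x0` are illustrative assumptions, not notation from the text.

```python
import math

def b(k):
    """Sine coefficient b_k of f(x) = x, namely 2(-1)^(k+1)/k."""
    return 2 * (-1) ** (k + 1) / k

def partial_sum(n, x):
    """(S_n f)(x) for f(x) = x; every cosine coefficient of this f is zero."""
    return sum(b(k) * math.sin(k * x) for k in range(1, n + 1))

def cesaro_mean(N, x):
    """(sigma_N f)(x) = ((S_0 f)(x) + ... + (S_N f)(x)) / (N + 1)."""
    return sum(partial_sum(n, x) for n in range(N + 1)) / (N + 1)

def fejer_weighted_sum(N, x):
    """The same mean, written with the weights 1 - k/(N+1) of the Fejer kernel."""
    return sum((1 - k / (N + 1)) * b(k) * math.sin(k * x) for k in range(1, N + 1))

x0 = 1.0  # an arbitrary point at which f is continuous, so f(x0) = 1.0
for N in (5, 20, 80):
    print(N, partial_sum(N, x0), cesaro_mean(N, x0), fejer_weighted_sum(N, x0))
```

The last two columns should agree to rounding error, and both should drift toward $f(x_0) = 1$ as N grows.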
To answer the Summability Question we need to know more about Fejer kernels. The following result shows that Fejer kernels satisfy some very nice properties.
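14.13 Lemma. For each nonnegative integer N,
$$K_N(t) \ge 0 \quad \text{for all } t \in \mathbf{R}, \tag{5}$$
and
$$\frac{1}{\pi} \int_{-\pi}^{\pi} K_N(t) \, dt = 1. \tag{6}$$
Moreover, for each $0 < \delta < \pi$,
$$\lim_{N \to \infty} \int_{\delta}^{\pi} |K_N(t)| \, dt = 0. \tag{7}$$
PROOF. Fix $N \ge 0$. If $t = 2j\pi$ for some $j \in \mathbf{Z}$, then $D_k(t) = k + 1/2 \ge 0$ for all $k \ge 0$, whence $K_N(t) \ge 0$. If $t \ne 2j\pi$ for any $j \in \mathbf{Z}$, then by Theorem 14.7,
$$K_N(t) = \frac{2}{N+1} \left( \frac{\sin\left(\frac{N+1}{2}\right)t}{2 \sin \frac{t}{2}} \right)^2 \ge 0.$$
This proves (5). By Definition 14.5 and orthogonality,
$$\int_{-\pi}^{\pi} K_N(t) \, dt = \int_{-\pi}^{\pi} \left( \frac{1}{2} + \sum_{k=1}^{N} \left( 1 - \frac{k}{N+1} \right) \cos kt \right) dt = \pi.$$
This proves (6). To prove (7), fix $0 < \delta < \pi$ and observe that $\sin t/2 \ge \sin \delta/2$ for $t \in [\delta, \pi]$. Hence, it follows from Theorem 14.7 that
$$\int_{\delta}^{\pi} |K_N(t)| \, dt \le \frac{2}{N+1} \int_{\delta}^{\pi} \left( \frac{\sin\left(\frac{N+1}{2}\right)t}{2 \sin \frac{\delta}{2}} \right)^2 dt \le \frac{\pi}{2(N+1) \sin^2 \frac{\delta}{2}}.$$
Since $\delta$ is fixed, this last expression tends to 0 as $N \to \infty$. ∎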
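The three properties of the Fejér kernel (nonnegativity, total integral $\pi$ over $[-\pi, \pi]$, and vanishing mass away from the origin) can be observed numerically. The following is a rough sketch under stated assumptions: the kernel is evaluated from its defining cosine sum, integrals are approximated with a midpoint rule, and the value of `delta`, the grid sizes, and the helper names are arbitrary illustrative choices. The last printed column is the bound $\pi / (2(N+1)\sin^2(\delta/2))$ used to control the tail.

```python
import math

def fejer(N, t):
    """K_N(t) = 1/2 + sum_{k=1}^N (1 - k/(N+1)) cos(kt)."""
    return 0.5 + sum((1 - k / (N + 1)) * math.cos(k * t) for k in range(1, N + 1))

def midpoint(g, a, b, n=4000):
    """Composite midpoint rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

delta = 0.5  # arbitrary choice with 0 < delta < pi
for N in (4, 16, 64):
    # Smallest kernel value on a uniform grid of [-pi, pi].
    grid_min = min(fejer(N, -math.pi + 2 * math.pi * i / 1000) for i in range(1001))
    # Integral over a full period, normalized by pi.
    normalized_total = midpoint(lambda t: fejer(N, t), -math.pi, math.pi) / math.pi
    # Mass of the kernel on [delta, pi] and the bound used in the tail estimate.
    tail = midpoint(lambda t: abs(fejer(N, t)), delta, math.pi)
    bound = math.pi / (2 * (N + 1) * math.sin(delta / 2) ** 2)
    print(N, grid_min, normalized_total, tail, bound)
```

As N increases, the tail column should shrink while staying below the bound, and the normalized total should remain 1 up to rounding.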
Using these properties, we can answer the Summability Question for continuous functions (see also Exercises 6 and 8).
14.14 THEOREM [FEJÉR]. Suppose that $f : \mathbf{R} \to \mathbf{R}$ is periodic on R and integrable on $[-\pi, \pi]$.
(i) If
$$L = \lim_{h \to 0} \frac{f(x_0 + h) + f(x_0 - h)}{2}$$
exists for some $x_0 \in \mathbf{R}$, then $(\sigma_N f)(x_0) \to L$ as $N \to \infty$.
(ii) If f is continuous on some closed interval I, then $\sigma_N f \to f$ uniformly on I as $N \to \infty$.
PROOF. Since f is periodic, we may suppose that $x_0 \in [-\pi, \pi]$. Fix $N \in \mathbf{N}$. By (6), Lemma 14.12, and a change of variables,
$$(\sigma_N f)(x_0) - L = \frac{1}{\pi} \int_{-\pi}^{\pi} K_N(t) (f(x_0 - t) - L) \, dt = \frac{2}{\pi} \int_0^{\pi} K_N(t) \left( \frac{f(x_0 + t) + f(x_0 - t)}{2} - L \right) dt =: \frac{2}{\pi} \int_0^{\pi} K_N(t) F(x_0, t) \, dt. \tag{8}$$
Let $\varepsilon > 0$ and choose $0 < \delta < \pi$ such that $|t| < \delta$ implies that $|F(x_0, t)| < \varepsilon/3$. By (5) and (6) we have
$$\frac{2}{\pi} \int_0^{\delta} K_N(t) |F(x_0, t)| \, dt < \frac{2\varepsilon}{3\pi} \int_0^{\delta} K_N(t) \, dt \le \frac{2\varepsilon}{3}. \tag{9}$$
On the other hand, choose by (7) an $N_1 \in \mathbf{N}$ such that $N \ge N_1$ implies that $\int_{\delta}^{\pi} K_N(t) \, dt < \varepsilon/(3M)$, where $M := \sup_{t \in \mathbf{R}} |F(x_0, t)|$. Then
$$\frac{2}{\pi} \int_{\delta}^{\pi} K_N(t) |F(x_0, t)| \, dt \le \frac{2M}{\pi} \int_{\delta}^{\pi} K_N(t) \, dt < \frac{\varepsilon}{3},$$
and it follows from (8) and (9) that
$$|(\sigma_N f)(x_0) - L| < \varepsilon \tag{10}$$
for all $N \ge N_1$. This proves part (i).
To prove part (ii), suppose that f is continuous on some closed interval I. Since f is periodic, we may suppose that $I \subseteq [-\pi, \pi]$. Thus I is closed and bounded, and f is uniformly continuous on I. Repeating the estimates above, we see that (10) holds uniformly for all $x_0 \in I$. ∎
14.15 COROLLARY. If $f : \mathbf{R} \to \mathbf{R}$ is continuous and periodic, then $\sigma_N f$ converges to f uniformly on R as $N \to \infty$.
14.16 COROLLARY [COMPLETENESS]. If $f : \mathbf{R} \to \mathbf{R}$ is continuous and periodic, and $a_{k-1}(f) = b_k(f) = 0$ for $k \in \mathbf{N}$, then $f(x) = 0$ for all $x \in \mathbf{R}$.
PROOF. By hypothesis, $(\sigma_N f)(x) = 0$ for all $N \in \mathbf{N}$ and $x \in \mathbf{R}$. Hence, by Corollary 14.15, $f(x) = \lim_{N \to \infty} (\sigma_N f)(x) = 0$ for all $x \in \mathbf{R}$. ∎
14.17 COROLLARY. Let $f : \mathbf{R} \to \mathbf{R}$ be continuous and periodic. Then there is a sequence of trigonometric polynomials $T_1, T_2, \ldots$, such that $T_N \to f$ uniformly on R.
PROOF. Set $T_N = \sigma_N f$ for $N \in \mathbf{N}$, and apply Corollary 14.15. ∎
This result can be used to prove the following density result for classical polynomials, i.e., polynomials of the form $P(x) = \sum_{k=0}^{n} c_k x^k$.
14.18 THEOREM [WEIERSTRASS APPROXIMATION THEOREM]. Let $[a, b]$ be a closed bounded interval, and suppose that $f : [a, b] \to \mathbf{R}$ is continuous on $[a, b]$. Given $\varepsilon > 0$ there is a (classical) polynomial P on R such that
$$|f(x) - P(x)| < \varepsilon$$
for all $x \in [a, b]$.
PROOF. By considering $g(t) := f(a + (b - a)t/\pi)$, which is continuous on $[0, \pi]$, we may suppose that $a = 0$ and $b = \pi$. Let $\varepsilon > 0$. Extend f from $[0, \pi]$ to R so that f is continuous and periodic. (For example, we could insist that the graph of $y = f(x)$ on $[\pi, 2\pi]$ is the chord from $(\pi, f(\pi))$ to $(2\pi, f(0))$ and then define $f(x + 2k\pi) := f(x)$ for $k \in \mathbf{Z}$.) By Corollary 14.17, there is a trigonometric polynomial T such that
$$|T(x) - f(x)| < \frac{\varepsilon}{2}$$
for $x \in \mathbf{R}$. Since each $\cos kx$ and $\sin kx$ is analytic on R, so is T. Since analytic functions are uniform limits of their Taylor series, it follows that there is a polynomial P on R such that
$$|T(x) - P(x)| < \frac{\varepsilon}{2}$$
for $x \in [-\pi, \pi] \supseteq [a, b]$. We conclude that
$$|f(x) - P(x)| \le |f(x) - T(x)| + |T(x) - P(x)| < \varepsilon$$
for all $x \in [a, b]$. ∎
EXERCISES
1. Let $E \subseteq \mathbf{R}$ and suppose that $f, f_k : \mathbf{R} \to \mathbf{R}$ are bounded functions. Prove that if $\sum_{k=0}^{\infty} f_k(x)$ converges to $f(x)$ uniformly on E, then
$$\sigma_N(x) := \sum_{k=0}^{N} \left( 1 - \frac{k}{N+1} \right) f_k(x)$$
converges to $f(x)$ uniformly on E as $N \to \infty$.
2. If $f : \mathbf{R} \to \mathbf{R}$ is periodic on R and integrable on $[-\pi, \pi]$, prove that the Cesàro means of Sf are uniformly bounded; i.e., there is an $M > 0$ such that
$$|(\sigma_N f)(x)| \le M$$
for all $x \in \mathbf{R}$ and $N \in \mathbf{N}$.
3. Let
$$S = \frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx)$$
be a trigonometric series and set
$$\sigma_N(x) = \frac{a_0}{2} + \sum_{k=1}^{N} \left( 1 - \frac{k}{N+1} \right) (a_k \cos kx + b_k \sin kx)$$
for $x \in \mathbf{R}$ and $N \in \mathbf{N}$. Prove that S is the Fourier series of some continuous periodic function $f : \mathbf{R} \to \mathbf{R}$ if and only if $\sigma_N$ converges uniformly on R, as $N \to \infty$.
4. Let f be integrable on $[-\pi, \pi]$ and $L \in \mathbf{R}$.
(a) Prove that if $(\sigma_N f)(x_0) \to L$ as $N \to \infty$ and if $(Sf)(x_0)$ converges, then $(S_N f)(x_0) \to L$.
(b) Prove that
$$\sin \sqrt{2}\pi + \sum_{k=1}^{\infty} \frac{4(-1)^k \sin \sqrt{2}\pi}{2 - k^2} \cos kx$$
converges to $\sqrt{2}\pi \cos \sqrt{2}x$ uniformly on $[-\pi, \pi]$.
5. Suppose that $f : [a, b] \to \mathbf{R}$ is continuous and
$$\int_a^b x^n f(x) \, dx = 0$$
for all integers $n \ge 0$.
(a) Evaluate $\int_a^b P(x) f(x) \, dx$ for any polynomial P on R.
(b) Prove that $\int_a^b |f(x)|^2 \, dx = 0$.
(c) Show that $f(x) = 0$ for all $x \in [a, b]$.
6. [SUMMABILITY KERNELS]. Let $\phi_N : \mathbf{R} \to \mathbf{R}$ be a sequence of continuous, periodic functions on R that satisfy
$$\int_0^{2\pi} \phi_N(t) \, dt = 1 \qquad \text{and} \qquad \int_0^{2\pi} |\phi_N(t)| \, dt \le M < \infty$$
for all $N \in \mathbf{N}$, and
$$\lim_{N \to \infty} \int_{\delta}^{2\pi - \delta} |\phi_N(t)| \, dt = 0$$
for each $0 < \delta < 2\pi$. Suppose that $f : \mathbf{R} \to \mathbf{R}$ is continuous and periodic. Prove that
$$\lim_{N \to \infty} \int_0^{2\pi} f(x - t) \phi_N(t) \, dt = f(x)$$
uniformly for $x \in \mathbf{R}$.
7. Let $[a, b]$ be a nondegenerate, closed, bounded interval.
(a) Prove that given any polynomial P on R and any $\varepsilon > 0$, there is a polynomial Q on R, with rational coefficients, such that $|P(x) - Q(x)| < \varepsilon$ for all $x \in [a, b]$.
*(b) Prove that the space $C[a, b]$ (see Example 10.6) is separable.
*8. A sequence of functions $f_N : \mathbf{R} \to \mathbf{R}$ is said to converge almost everywhere to a function f if and only if there is a set E of measure zero such that $f_N(x) \to f(x)$, as $N \to \infty$, for every $x \in \mathbf{R} \setminus E$. Suppose that f is periodic on R. Prove that if f is Riemann integrable on $[-\pi, \pi]$, then $\sigma_N f \to f$ almost everywhere as $N \to \infty$.
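e14.3 GROWTH OF FOURIER COEFFICIENTS This section uses material from Sections 5.5 and 14.2.
By Theorem 14.14, a continuous periodic function f is determined completely by its Fourier coefficients. In this section we ask to what extent smoothness of f affects the growth of these coefficients. We begin with a computational result.
14.19 Lemma. If $f : \mathbf{R} \to \mathbf{R}$ is integrable on $[-\pi, \pi]$ and N is a nonnegative integer, then
$$\frac{1}{\pi} \int_{-\pi}^{\pi} f(x) (S_N f)(x) \, dx = \frac{1}{\pi} \int_{-\pi}^{\pi} |(S_N f)(x)|^2 \, dx = \frac{|a_0(f)|^2}{2} + \sum_{k=1}^{N} \left( |a_k(f)|^2 + |b_k(f)|^2 \right). \tag{11}$$
PROOF. Fix $N \ge 0$. Since f and $S_N f$ are integrable on $[-\pi, \pi]$, both integrals in (11) exist. By definition and orthogonality,
$$\frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \frac{a_0(f)}{2} \, dx = \frac{|a_0(f)|^2}{2} = \frac{1}{\pi} \int_{-\pi}^{\pi} (S_N f)(x) \frac{a_0(f)}{2} \, dx.$$
Similarly,
$$\frac{1}{\pi} \int_{-\pi}^{\pi} f(x) a_k(f) \cos kx \, dx = |a_k(f)|^2 = \frac{1}{\pi} \int_{-\pi}^{\pi} (S_N f)(x) a_k(f) \cos kx \, dx$$
and
$$\frac{1}{\pi} \int_{-\pi}^{\pi} f(x) b_k(f) \sin kx \, dx = |b_k(f)|^2 = \frac{1}{\pi} \int_{-\pi}^{\pi} (S_N f)(x) b_k(f) \sin kx \, dx$$
for $k \in \mathbf{N}$ with $k \le N$. Adding these identities for $k = 0, \ldots, N$ verifies (11). ∎
Next, we use this result to identify a growth condition satisfied by the Fourier coefficients of any Riemann integrable function.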
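14.20 THEOREM [BESSEL'S INEQUALITY]. If $f : \mathbf{R} \to \mathbf{R}$ is (Riemann) integrable on $[-\pi, \pi]$, then $\sum_{k=1}^{\infty} |a_k(f)|^2$ and $\sum_{k=1}^{\infty} |b_k(f)|^2$ are convergent series. In fact,
$$\frac{|a_0(f)|^2}{2} + \sum_{k=1}^{\infty} \left( |a_k(f)|^2 + |b_k(f)|^2 \right) \le \frac{1}{\pi} \int_{-\pi}^{\pi} |f(x)|^2 \, dx. \tag{12}$$
PROOF. Fix $N \in \mathbf{N}$. By Lemma 14.19,
$$0 \le \frac{1}{\pi} \int_{-\pi}^{\pi} |f(x) - (S_N f)(x)|^2 \, dx = \frac{1}{\pi} \int_{-\pi}^{\pi} |f(x)|^2 \, dx - \left( \frac{|a_0(f)|^2}{2} + \sum_{k=1}^{N} \left( |a_k(f)|^2 + |b_k(f)|^2 \right) \right).$$
Therefore,
$$\frac{|a_0(f)|^2}{2} + \sum_{k=1}^{N} \left( |a_k(f)|^2 + |b_k(f)|^2 \right) \le \frac{1}{\pi} \int_{-\pi}^{\pi} |f(x)|^2 \, dx$$
for all $N \in \mathbf{N}$. Taking the limit of this inequality as $N \to \infty$ verifies (12). Since $|f|^2$ is Riemann integrable when f is, it follows that both $\sum_{k=1}^{\infty} |a_k(f)|^2$ and $\sum_{k=1}^{\infty} |b_k(f)|^2$ are convergent series. ∎
14.21 COROLLARY [RIEMANN-LEBESGUE LEMMA]. If f is integrable on $[-\pi, \pi]$, then
$$\lim_{k \to \infty} a_k(f) = \lim_{k \to \infty} b_k(f) = 0.$$
PROOF. Since the terms of a convergent series converge to zero, it follows from Bessel's inequality that $a_k(f)$ and $b_k(f)$ converge to zero as $k \to \infty$. ∎
Our next major result shows that Bessel's inequality is actually an identity when f is continuous and periodic. First, we show that the partial sums of the Fourier series of a function f are the best approximations to f in the following sense.
14.22 Lemma. Let $N \in \mathbf{N}$. If f is (Riemann) integrable on $[-\pi, \pi]$ and
$$T_N = \frac{c_0}{2} + \sum_{k=1}^{N} (c_k \cos kx + d_k \sin kx)$$
is any trigonometric polynomial of degree N, then
$$\int_{-\pi}^{\pi} |f(x) - (S_N f)(x)|^2 \, dx \le \int_{-\pi}^{\pi} |f(x) - T_N(x)|^2 \, dx.$$
PROOF. Notice by (11) that
$$\int_{-\pi}^{\pi} |f(x) - T_N(x)|^2 \, dx = \int_{-\pi}^{\pi} |f(x) - (S_N f)(x) + (S_N f)(x) - T_N(x)|^2 \, dx$$
$$= \int_{-\pi}^{\pi} |f(x) - (S_N f)(x)|^2 \, dx + 2 \int_{-\pi}^{\pi} (f(x) - (S_N f)(x)) ((S_N f)(x) - T_N(x)) \, dx + \int_{-\pi}^{\pi} |(S_N f)(x) - T_N(x)|^2 \, dx$$
$$\ge \int_{-\pi}^{\pi} |f(x) - (S_N f)(x)|^2 \, dx + 2 \int_{-\pi}^{\pi} ((S_N f)(x) T_N(x) - f(x) T_N(x)) \, dx.$$
This last term is zero since, by orthogonality,
$$\frac{1}{\pi} \int_{-\pi}^{\pi} (S_N f)(x) T_N(x) \, dx = \frac{c_0 a_0(f)}{2} + \sum_{j=1}^{N} (c_j a_j(f) + d_j b_j(f))$$
$$= \frac{c_0}{2\pi} \int_{-\pi}^{\pi} f(x) \, dx + \sum_{j=1}^{N} \left( \frac{c_j}{\pi} \int_{-\pi}^{\pi} f(x) \cos jx \, dx + \frac{d_j}{\pi} \int_{-\pi}^{\pi} f(x) \sin jx \, dx \right) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) T_N(x) \, dx.$$
Consequently,
$$\int_{-\pi}^{\pi} |f(x) - T_N(x)|^2 \, dx \ge \int_{-\pi}^{\pi} |f(x) - (S_N f)(x)|^2 \, dx. \; ∎$$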
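14.23 THEOREM [PARSEVAL'S IDENTITY]. If $f : \mathbf{R} \to \mathbf{R}$ is periodic and continuous, then
$$\frac{|a_0(f)|^2}{2} + \sum_{k=1}^{\infty} \left( |a_k(f)|^2 + |b_k(f)|^2 \right) = \frac{1}{\pi} \int_{-\pi}^{\pi} |f(x)|^2 \, dx. \tag{13}$$
PROOF. By Bessel's inequality, we need only show that the left side of (13) is greater than or equal to the right side of (13). Since f is continuous and periodic, $\sigma_N f \to f$ uniformly on R as $N \to \infty$ by Fejér's Theorem. Hence, it follows from Lemmas 14.19 and 14.22 that
$$\frac{1}{\pi} \int_{-\pi}^{\pi} |f(x)|^2 \, dx - \left( \frac{|a_0(f)|^2}{2} + \sum_{k=1}^{N} \left( |a_k(f)|^2 + |b_k(f)|^2 \right) \right) = \frac{1}{\pi} \int_{-\pi}^{\pi} |f(x) - (S_N f)(x)|^2 \, dx \le \frac{1}{\pi} \int_{-\pi}^{\pi} |f(x) - (\sigma_N f)(x)|^2 \, dx \to 0$$
as $N \to \infty$. In particular,
$$\frac{1}{\pi} \int_{-\pi}^{\pi} |f(x)|^2 \, dx \le \frac{|a_0(f)|^2}{2} + \sum_{k=1}^{\infty} \left( |a_k(f)|^2 + |b_k(f)|^2 \right). \; ∎$$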
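Parseval's Identity can be illustrated with the continuous periodic function $f(x) = |x|$, whose coefficients were computed in Example 14.9 ($a_0 = \pi$, $a_k = -4/(\pi k^2)$ for odd k, and all other coefficients zero). The sketch below assumes the normalization in which $\frac{1}{\pi}\int_{-\pi}^{\pi} |f|^2$ is compared with $|a_0|^2/2 + \sum (|a_k|^2 + |b_k|^2)$; the helper names are illustrative, and the integral is entered in closed form, $\frac{1}{\pi}\int_{-\pi}^{\pi} x^2\,dx = 2\pi^2/3$.

```python
import math

def a(k):
    """Cosine coefficients of f(x) = |x|: a_0 = pi, a_k = -4/(pi k^2) for odd k, else 0."""
    if k == 0:
        return math.pi
    return 0.0 if k % 2 == 0 else -4 / (math.pi * k * k)

def coefficient_side(N):
    """|a_0|^2 / 2 + sum_{k=1}^N |a_k|^2; every b_k vanishes because |x| is even."""
    return a(0) ** 2 / 2 + sum(a(k) ** 2 for k in range(1, N + 1))

# (1/pi) times the integral of |x|^2 over [-pi, pi], evaluated in closed form.
integral_side = 2 * math.pi ** 2 / 3

for N in (1, 5, 50):
    print(N, coefficient_side(N), integral_side)
```

The partial sums of squared coefficients increase toward the integral from below, in line with Bessel's inequality becoming an identity in the limit.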
The Riemann-Lebesgue Lemma can be improved if f is smooth and periodic. In fact, the following result shows that the smoother f is, the more rapidly its Fourier coefficients converge to zero.
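14.24 THEOREM. Let $f : \mathbf{R} \to \mathbf{R}$ and $j \in \mathbf{N}$. If $f^{(j)}$ exists and is integrable on $[-\pi, \pi]$ and $f^{(\ell)}$ is periodic for each $0 \le \ell < j$, then
$$\lim_{k \to \infty} k^j a_k(f) = \lim_{k \to \infty} k^j b_k(f) = 0. \tag{14}$$
PROOF. Fix $k \in \mathbf{N}$. Since f is periodic, integration by parts yields
$$a_k(f') = \frac{1}{\pi} \int_{-\pi}^{\pi} f'(x) \cos kx \, dx = \frac{k}{\pi} \int_{-\pi}^{\pi} f(x) \sin kx \, dx = k b_k(f).$$
Similarly, $b_k(f') = -k a_k(f)$, hence $a_k(f'') = k b_k(f') = -k^2 a_k(f)$. Iterating, we obtain
$$|a_k(f^{(j)})| = \begin{cases} k^j |a_k(f)| & \text{when } j \text{ is even,} \\ k^j |b_k(f)| & \text{when } j \text{ is odd.} \end{cases}$$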
A similar identity holds for $|b_k(f^{(j)})|$. Since the Riemann-Lebesgue Lemma implies that $a_k(f^{(j)})$ and $b_k(f^{(j)}) \to 0$ as $k \to \infty$, it follows that $k^j a_k(f) \to 0$ and $k^j b_k(f) \to 0$ as $k \to \infty$. ∎
This result shows that if f is continuously differentiable and periodic, then $k a_k(f)$ and $k b_k(f)$ both converge to zero as $k \to \infty$. Recall that if f is continuously differentiable on $[-\pi, \pi]$, then f is of bounded variation (see Remark 5.51). Thus it is natural to ask: How rapidly do $k a_k(f)$ and $k b_k(f)$ grow when f is a function of bounded variation? To answer this question, let $\{x_0, x_1, \ldots, x_n\}$ be a partition of $[-\pi, \pi]$. Using Riemann sums, the Mean Value Theorem, Abel's Formula, and $\sin kx_0 = \sin kx_n = 0$, we can convince ourselves that
$$\pi a_k(f) = \int_{-\pi}^{\pi} f(x) \cos kx \, dx \approx \sum_{j=1}^{n} f(x_j) \cos kx_j (x_j - x_{j-1}) \approx \frac{1}{k} \sum_{j=1}^{n} f(x_j) (\sin kx_j - \sin kx_{j-1}) = \frac{1}{k} \sum_{j=1}^{n-1} (f(x_j) - f(x_{j+1})) \sin kx_j.$$
Since the absolute value of this last sum is bounded by $\operatorname{Var} f$, we guess that $k|a_k(f)| \le \operatorname{Var} f/\pi$. To prove that our guess is correct, suppose for a moment that f is increasing, periodic, and differentiable on $[-\pi, \pi]$, and $\phi(x) = \sin kx$. Then by Definition 14.3, periodicity, integration by parts, and the Fundamental Theorem of Calculus, we can estimate the Fourier coefficients of f as follows:
$$\pi k |a_k(f)| = \left| \int_{-\pi}^{\pi} f(x) \phi'(x) \, dx \right| = \left| f(x) \phi(x) \Big|_{-\pi}^{\pi} - \int_{-\pi}^{\pi} f'(x) \phi(x) \, dx \right| = \left| \int_{-\pi}^{\pi} f'(x) \phi(x) \, dx \right| \le \int_{-\pi}^{\pi} f'(x) \, dx$$
$$= \sum_{j=1}^{n} \int_{x_{j-1}}^{x_j} f'(x) \, dx = \sum_{j=1}^{n} (f(x_j) - f(x_{j-1})) \le \operatorname{Var} f.$$
The following result shows that this estimate is valid even when f is neither differentiable nor increasing.
14.25 Lemma. Suppose that f and $\phi$ are periodic, where f is of bounded variation on $[-\pi, \pi]$ and $\phi$ is continuously differentiable on $[-\pi, \pi]$. If $M := \sup_{x \in [-\pi, \pi]} |\phi(x)|$, then
$$\left| \int_{-\pi}^{\pi} f(x) \phi'(x) \, dx \right| \le M \operatorname{Var} f. \tag{15}$$
PROOF. Since f is of bounded variation and $\phi'$ is continuous on $[-\pi, \pi]$, the product $f\phi'$ is integrable on $[-\pi, \pi]$ (see Corollary 5.23 and the comments following Corollary 5.57). Let $\varepsilon > 0$ and set $C = \sup_{x \in [-\pi, \pi]} |f(x)|$. Since $\phi'$ is uniformly continuous and $f\phi'$ is integrable on $[-\pi, \pi]$, choose a partition $P = \{x_0, x_1, \ldots, x_{2n}\}$ of $[-\pi, \pi]$ such that
$$w, c \in [x_{j-1}, x_j] \quad \text{implies} \quad |\phi'(w) - \phi'(c)| < \frac{\varepsilon}{4\pi C} \tag{16}$$
and
$$\left| \sum_{j=1}^{2n} f(w_j) \phi'(w_j) (x_j - x_{j-1}) - \int_{-\pi}^{\pi} f(x) \phi'(x) \, dx \right| < \frac{\varepsilon}{2} \tag{17}$$
for any choice of $w_j \in [x_{j-1}, x_j]$. Set
$$A := \sum_{j=1}^{2n} f(w_j) (\phi(x_j) - \phi(x_{j-1})),$$
where $w_j = x_j$ when j is even, $w_j = x_{j-1}$ when j is odd. By the Mean Value Theorem, choose $c_j \in [x_{j-1}, x_j]$ such that $\phi(x_j) - \phi(x_{j-1}) = \phi'(c_j)(x_j - x_{j-1})$. Then
$$A = \sum_{j=1}^{2n} f(w_j) \phi'(c_j) (x_j - x_{j-1}).$$
Hence it follows from (17) and (16) that
$$\left| A - \int_{-\pi}^{\pi} f(x) \phi'(x) \, dx \right| < \left| \sum_{j=1}^{2n} f(w_j) \phi'(c_j) (x_j - x_{j-1}) - \sum_{j=1}^{2n} f(w_j) \phi'(w_j) (x_j - x_{j-1}) \right| + \frac{\varepsilon}{2}$$
$$\le \sum_{j=1}^{2n} |f(w_j)| \, |\phi'(c_j) - \phi'(w_j)| (x_j - x_{j-1}) + \frac{\varepsilon}{2} \le \frac{\varepsilon}{4\pi} \sum_{j=1}^{2n} (x_j - x_{j-1}) + \frac{\varepsilon}{2} = \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
Combining this observation with the triangle inequality, we obtain
$$\left| \int_{-\pi}^{\pi} f(x) \phi'(x) \, dx \right| \le |A| + \varepsilon. \tag{18}$$
On the other hand, by the choice of the $w_j$'s,
$$A = \sum_{j=1}^{n} f(x_{2j-2}) (\phi(x_{2j-1}) - \phi(x_{2j-2})) + \sum_{j=1}^{n} f(x_{2j}) (\phi(x_{2j}) - \phi(x_{2j-1}))$$
$$= \sum_{j=1}^{n} \phi(x_{2j-1}) (f(x_{2j-2}) - f(x_{2j})) + \sum_{j=1}^{n} (f(x_{2j}) \phi(x_{2j}) - f(x_{2j-2}) \phi(x_{2j-2})).$$
Since f and $\phi$ are periodic, this last sum telescopes to 0. Therefore,
$$|A| = \left| \sum_{j=1}^{n} \phi(x_{2j-1}) (f(x_{2j-2}) - f(x_{2j})) \right| \le \sum_{j=1}^{n} |\phi(x_{2j-1})| \, |f(x_{2j-2}) - f(x_{2j})| \le M \operatorname{Var} f.$$
This, together with (18), proves that
$$\left| \int_{-\pi}^{\pi} f(x) \phi'(x) \, dx \right| \le M \operatorname{Var} f + \varepsilon.$$
Taking the limit of this inequality as $\varepsilon \to 0$, we conclude that (15) holds. ∎
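We now estimate the rate of growth of Fourier coefficients of functions of bounded variation.
14.26 THEOREM. If $f : \mathbf{R} \to \mathbf{R}$ is periodic and of bounded variation on $[-\pi, \pi]$, then
$$|a_k(f)| \le \frac{\operatorname{Var} f}{k\pi} \qquad \text{and} \qquad |b_k(f)| \le \frac{\operatorname{Var} f}{k\pi}$$
for $k \in \mathbf{N}$.
PROOF. Fix $k \in \mathbf{N}$ and set $\phi(x) = \sin kx$. Then $\phi$ is periodic and $\phi'(x) = k \cos kx$ is continuously differentiable on $[0, 2\pi]$. Hence, it follows from Lemma 14.25 that
$$|k a_k(f)| = \left| \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) k \cos kx \, dx \right| = \frac{1}{\pi} \left| \int_{-\pi}^{\pi} f(x) \phi'(x) \, dx \right| \le \frac{\operatorname{Var} f}{\pi}.$$
A similar argument proves that $|k b_k(f)| \le \operatorname{Var} f/\pi$. ∎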
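The bound $k|a_k(f)| \le \operatorname{Var} f/\pi$ can be observed on $f(x) = |x|$, whose total variation on $[-\pi, \pi]$ is $2\pi$. The sketch below is illustrative: the variation is estimated by summing increments over a uniform partition (in general this only gives a lower bound for $\operatorname{Var} f$, though it is essentially exact for a piecewise monotone function such as $|x|$), the coefficients are approximated with a midpoint rule, and all names and grid sizes are assumptions made for this check.

```python
import math

def variation(f, a, b, n=2000):
    """Sum of |f(x_j) - f(x_{j-1})| over a uniform partition of [a, b]."""
    xs = [a + (b - a) * j / n for j in range(n + 1)]
    return sum(abs(f(xs[j]) - f(xs[j - 1])) for j in range(1, n + 1))

def a_coeff(f, k, n=20000):
    """Midpoint-rule approximation of a_k(f) = (1/pi) * integral of f(x) cos(kx) on [-pi, pi]."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        x = -math.pi + (i + 0.5) * h
        total += f(x) * math.cos(k * x)
    return h * total / math.pi

f = abs
var_f = variation(f, -math.pi, math.pi)  # close to 2*pi for |x|
for k in range(1, 8):
    print(k, k * abs(a_coeff(f, k)), var_f / math.pi)
```

For this function the left column equals $4/(\pi k)$ for odd k and 0 for even k, comfortably below the bound $\operatorname{Var} f/\pi = 2$.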
EXERCISES
1. If f is integrable on $[-\pi, \pi]$ and $\alpha \in \mathbf{R}$, prove that
$$\lim_{k \to \infty} \int_{-\pi}^{\pi} f(x) \sin(k + \alpha)x \, dx = 0.$$
2. Prove that there is no continuous function whose Fourier coefficients satisfy $|a_k(f)| \ge 1/\sqrt{k}$ for $k \in \mathbf{N}$.
3. Prove that if $f : \mathbf{R} \to \mathbf{R}$ belongs to $C^2(\mathbf{R})$ and $f, f'$ are both periodic, then Sf converges to f uniformly and absolutely on R. (See also Exercise 5 in Section 14.4.)
4. If $f : \mathbf{R} \to \mathbf{R}$ belongs to $C^{\infty}(\mathbf{R})$ and $f^{(j)}$ is periodic for all $j \ge 0$, prove that Sf is term-by-term differentiable on R. In fact, show that
$$\frac{d^j f}{dx^j}(x) = \sum_{k=1}^{\infty} \frac{d^j}{dx^j} (a_k(f) \cos kx + b_k(f) \sin kx)$$
uniformly for all $j \in \mathbf{N}$.
5. Let $f : \mathbf{R} \to \mathbf{R}$ be periodic on R, integrable on $[-\pi, \pi]$, and $a_k(f) \ge 0$ for $k = 0, 1, \ldots$.
(a) Prove that $(S_k f)(0) \ge (S_j f)(0)$ for all $k \ge j \ge 0$.
(b) Prove that $(S_N f)(0) \le 2(\sigma_{2N} f)(0)$ for $N \in \mathbf{N}$.
(c) Prove that $\sum_{k=1}^{\infty} |a_k(f)| < \infty$.
(d) Suppose that f is also even. Prove that f must be continuous and Sf converges uniformly and absolutely on R.
6. Let $f : \mathbf{R} \to \mathbf{R}$ be continuous and periodic. The modulus of continuity of f is defined by
$$\omega(f, \delta) = \sup_{t \in [0, 2\pi],\ |h| \le \delta} |f(t + h) - f(t)|.$$
(a) Show that
$$a_k(f) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left( f(u) - f\left( u + \frac{\pi}{k} \right) \right) \cos ku \, du$$
for $k \in \mathbf{N}$.
(b) Prove that
$$|a_k(f)| \le \omega\left( f, \frac{\pi}{k} \right)$$
for $k \in \mathbf{N}$.
(c) Use part (b) to give a different proof of the Riemann-Lebesgue Lemma in the special case when f is periodic and continuous.
e14.4 CONVERGENCE OF FOURIER SERIES This section uses material from Sections 5.5, 14.2, and 14.3.
We shall prove that under certain conditions, a summable series must also be convergent. Such results, called Tauberian theorems, will be used to obtain a partial answer to the Convergence Question posed in Section 14.1 and further results concerning the growth of Fourier coefficients. The following result was the first Tauberian theorem discovered.
14.27 THEOREM [TAUBER]. Let $a_k \ge 0$ and $L \in \mathbf{R}$. If $\sum_{k=0}^{\infty} a_k$ is Cesàro summable to L, then
$$\sum_{k=0}^{\infty} a_k = L.$$
PROOF. By Remark 14.11, it suffices to prove that $\sum_{k=0}^{\infty} a_k < \infty$. Suppose to the contrary that $\sum_{k=0}^{\infty} a_k = \infty$. Then given $M > 0$, there is an $n_0 \in \mathbf{N}$ such that $n \ge n_0$ implies that $s_n := \sum_{k=0}^{n} a_k \ge M$. Let $N > n_0$. Then
$$\sigma_N := \frac{s_0 + s_1 + \cdots + s_{n_0}}{N+1} + \frac{s_{n_0+1} + \cdots + s_N}{N+1} \ge 0 + \frac{N - n_0}{N+1} M.$$
Taking the limit of this last inequality as $N \to \infty$, we obtain $L \ge M$ for all $M > 0$. We conclude that $L = \infty$, a contradiction. ∎
This result can be used to improve the Riemann-Lebesgue Lemma for certain types of functions.
14.28 COROLLARY. Let $f : \mathbf{R} \to \mathbf{R}$ be periodic on R and integrable on $[-\pi, \pi]$. If $a_k(f) = 0$ and $b_k(f) \ge 0$ for $k \in \mathbf{N}$, then
$$\sum_{k=1}^{\infty} \frac{b_k(f)}{k} < \infty.$$
PROOF. By considering $g = f - a_0(f)/2$ we may suppose that $a_0(f) = 0$. Let
$$F(x) = \int_0^x f(t) \, dt.$$
By Theorem 5.26, F is continuous on R. Since $a_0(f) = 0$, F is also periodic. Hence, by Fejér's Theorem, $(\sigma_N F)(0) \to F(0) = 0$ as $N \to \infty$. Integrating by parts, we obtain $a_k(F) = -b_k(f)/k \le 0$ and $b_k(F) = a_k(f)/k = 0$. It follows that $\sum_{k=1}^{\infty} b_k(f)/k$ is Cesàro summable (to $a_0(F)/2$) and has nonnegative terms. We conclude by Tauber's Theorem that $\sum_{k=1}^{\infty} b_k(f)/k$ converges. ∎
We are now in a position to see that the converse of the Riemann-Lebesgue Lemma is false. Indeed, if
$$\sum_{k=2}^{\infty} \frac{\sin kx}{\log k}$$
were the Fourier series of some integrable function, then by Corollary 14.28,
$$\sum_{k=2}^{\infty} \frac{1}{k \log k}$$
would converge, a contradiction of the Integral Test.
The following result is one of the deepest Tauberian theorems.
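14.29 THEOREM [HARDY]. Let $E \subseteq \mathbf{R}$ and suppose that $f_k : E \to \mathbf{R}$ is a sequence of functions that satisfies
$$|k f_k(x)| \le M \tag{19}$$
for all $x \in E$, all $k \in \mathbf{N}$, and some $M > 0$. If $\sum_{k=0}^{\infty} f_k$ is uniformly Cesàro summable to f on E, then $\sum_{k=0}^{\infty} f_k$ converges uniformly to f on E.
PROOF. Fix $x \in E$ and suppose without loss of generality that $M \ge 1$. For each $n = 0, 1, \ldots$, set
$$s_n(x) = \sum_{k=0}^{n} f_k(x), \qquad \sigma_n(x) = \frac{s_0(x) + \cdots + s_n(x)}{n+1},$$
and consider the delayed averages
$$\sigma_{n,k}(x) := \frac{s_n(x) + \cdots + s_{n+k}(x)}{k+1}$$
defined for $n, k \ge 0$. Let $0 < \varepsilon < 1$. For each $n \in \mathbf{N}$ with $n \ge 2M/\varepsilon$, choose $k = k(n) \in \mathbf{N}$ such that $k \le n\varepsilon/(2M) < k + 1$. Then
$$\frac{n-1}{k+1} < \frac{n}{k+1} < \frac{2M}{\varepsilon} < \infty. \tag{20}$$
Moreover, since
$$\sigma_{n,k}(x) - s_n(x) = \frac{(s_n(x) - s_n(x)) + \cdots + (s_{n+k}(x) - s_n(x))}{k+1} = \sum_{j=n+1}^{n+k} \left( 1 - \frac{j - n}{k+1} \right) f_j(x),$$
it follows from (19) and the choice of $k = k(n)$ that
$$|\sigma_{n,k}(x) - s_n(x)| \le \sum_{j=n+1}^{n+k} |f_j(x)| \le M \sum_{j=n+1}^{n+k} \frac{1}{j} \le \frac{Mk}{n+1} < \frac{\varepsilon}{2}. \tag{21}$$
Since $\sigma_n \to f$ uniformly on E, choose $N \in \mathbf{N}$ such that
$$n \ge N \quad \text{and} \quad x \in E \quad \text{imply} \quad |\sigma_n(x) - f(x)| < \frac{\varepsilon^2}{12M}. \tag{22}$$
Since
$$\sigma_{n,k}(x) = \left( 1 + \frac{n}{k+1} \right) \sigma_{n+k}(x) - \left( \frac{n}{k+1} \right) \sigma_{n-1}(x),$$
it follows from (20), (21), and (22) that
$$|s_n(x) - f(x)| \le |s_n(x) - \sigma_{n,k}(x)| + |\sigma_{n,k}(x) - f(x)|$$
$$< \frac{\varepsilon}{2} + \left( 1 + \frac{n}{k+1} \right) |\sigma_{n+k}(x) - f(x)| + \left( \frac{n}{k+1} \right) |\sigma_{n-1}(x) - f(x)|$$
$$< \frac{\varepsilon}{2} + \left( 1 + \frac{2M}{\varepsilon} \right) \left( \frac{\varepsilon^2}{12M} \right) + \frac{2M}{\varepsilon} \left( \frac{\varepsilon^2}{12M} \right) = \frac{\varepsilon}{2} + \frac{\varepsilon^2}{12M} + \frac{\varepsilon}{3} < \frac{\varepsilon}{2} + \frac{\varepsilon}{12} + \frac{\varepsilon}{3} < \varepsilon$$
for any $n > N$ with $n \ge 2M/\varepsilon$ and any $x \in E$. We conclude that $s_n \to f$ uniformly on E as $n \to \infty$. ∎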
We are prepared to answer the Convergence Question posed in Section 14.1 for piecewise continuous functions of bounded variation.
14.30 THEOREM [DIRICHLET-JORDAN]. If $f : \mathbf{R} \to \mathbf{R}$ is periodic on R and of bounded variation on $[-\pi, \pi]$, then
$$\lim_{N \to \infty} (S_N f)(x) = \frac{f(x+) + f(x-)}{2}$$
for every $x \in \mathbf{R}$. If f is also continuous on some closed interval I, then
$$\lim_{N \to \infty} (S_N f)(x) = f(x)$$
uniformly on I.
PROOF. Since f is periodic and of bounded variation, the one-sided limits $f(x+)$ and $f(x-)$ exist for each $x \in \mathbf{R}$, and f is Riemann integrable on $[-\pi, \pi]$ (see the comments that follow the proof of Corollary 5.57). Hence, by Fejér's Theorem, both conclusions hold if $S_N$ is replaced by $\sigma_N$. Since Theorem 14.26 implies that
$$|a_k(f) \cos kx + b_k(f) \sin kx| \le \frac{2 \operatorname{Var} f}{k\pi}$$
for $k \in \mathbf{N}$, it follows from Hardy's Theorem that both conclusions hold as stated. ∎
We close this section with an application of Fourier series to an extremal problem. We will show that among all smooth simple closed curves in $\mathbf{R}^2$ with a given arc length, the largest area is enclosed by a circle. (The proof presented here comes from Marsden [7].)
14.31 THEOREM [ISOPERIMETRIC PROBLEM]. Let $E$ be a region in $\mathbf{R}^2$ whose topological boundary $C = \partial E$ is a smooth closed simple curve of length $2\pi$. If $A = \operatorname{Area}(E)$, then $A \le \pi$. Moreover, $A = \pi$ if and only if $E = B_1(a, b)$ for some $a, b \in \mathbf{R}$.

PROOF. Let $(\nu, [0, 2\pi])$ be the natural parametrization of $C$; i.e., $\|\nu'(s)\| = 1$ for all $s \in [0, 2\pi]$. Set

$$P(s) = \nu_1(s) - a, \qquad Q(s) = \nu_2(s) - b, \qquad\text{and}\qquad \phi(s) = (P(s), Q(s))$$

for $s \in [0, 2\pi]$, where $(\nu_1, \nu_2) := \nu$. Clearly, $(\phi, [0, 2\pi])$ is a smooth parametrization of $\partial E - (a, b)$ whose trace is a smooth closed simple curve with arc length $2\pi$ that encloses a region with area $A$. Moreover,

$$|P'(s)|^2 + |Q'(s)|^2 = 1, \tag{23}$$

$$\frac{1}{2\pi}\int_0^{2\pi} P(s)\,ds = 0, \qquad \frac{1}{2\pi}\int_0^{2\pi} Q(s)\,ds = 0, \tag{24}$$
and by Green's Theorem,

$$A = \iint_E dA = \int_{\partial E} x\,dy = \int_0^{2\pi} P(s)\,Q'(s)\,ds. \tag{25}$$
Let ak, bk (respectively, Ck, dk) represent the Fourier coefficients of P (respectively,
Q). Since (
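$$P(s) = \sum_{k=1}^{\infty} (a_k \cos ks + b_k \sin ks), \qquad Q(s) = \sum_{k=1}^{\infty} (c_k \cos ks + d_k \sin ks), \tag{26}$$

$$P'(s) = \sum_{k=1}^{\infty} (k b_k \cos ks - k a_k \sin ks), \quad\text{and}\quad Q'(s) = \sum_{k=1}^{\infty} (k d_k \cos ks - k c_k \sin ks) \tag{27}$$

uniformly on $[0, 2\pi]$. Hence, by (23) and Parseval's Identity,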
Moreover, by (25) and orthogonality,
(28) It follows that
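In particular, $A \le \pi$ and $A = \pi$ if and only if $a_1 = d_1$, $c_1 = -b_1$, and $a_k = b_k = c_k = d_k = 0$ for $k \ge 2$. Suppose that $A = \pi$. Then $P(s) = a_1 \cos s + b_1 \sin s$ and $Q(s) = -b_1 \cos s + a_1 \sin s = -P(s + \pi/2)$. Thus $P'(s) = -Q(s)$ and $Q'(s) = -P''(s) = P(s)$ for all $s \in [0, 2\pi]$. It follows from (23) that

$$P^2(s) + Q^2(s) = 1$$

for all $s \in [0, 2\pi]$; i.e., $C$ is the circle of radius 1 centered at $(a, b)$. ∎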
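The inequality $A \le \pi$ can be illustrated numerically. The following is a minimal sketch, not part of the proof: the family of ellipses, the helper names, and the midpoint-rule step count are all choices made here for illustration. Each ellipse is rescaled so that its perimeter is $2\pi$, and the enclosed area is compared with $\pi$.

```python
import math

def ellipse_perimeter(a, b, n=20000):
    """Approximate the perimeter of the ellipse (a cos t, b sin t) by the midpoint rule."""
    dt = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        # speed of the parametrization t -> (a cos t, b sin t)
        total += math.hypot(a * math.sin(t), b * math.cos(t)) * dt
    return total

def area_at_length_2pi(a, b):
    """Rescale the ellipse so its perimeter is 2*pi and return the enclosed area pi*a*b."""
    scale = 2 * math.pi / ellipse_perimeter(a, b)
    return math.pi * (a * scale) * (b * scale)

# Axis ratios b/a; the ratio 1.0 is the circle.
areas = {ratio: area_at_length_2pi(1.0, ratio) for ratio in (1.0, 0.8, 0.5, 0.2)}
for ratio, area in areas.items():
    print(f"b/a = {ratio:.1f}: area = {area:.6f}, pi - area = {math.pi - area:.6f}")
```

In this sketch only the circle attains the value $\pi$, and the area drops as the ellipse becomes more eccentric, which is consistent with the theorem.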
EXERCISES

1. Suppose that $f$ is continuous and of bounded variation on $[-\pi, \pi]$. Prove that $S_N f \to f$ pointwise on $(-\pi, \pi)$ and uniformly on any $[a, b] \subset (-\pi, \pi)$.

2. (a) Prove that

$$x = 2 \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} \sin kx$$

pointwise on $(-\pi, \pi)$ and uniformly on any $[a, b] \subset (-\pi, \pi)$.

(b) Prove that

$$|x| = \frac{\pi}{2} - \frac{4}{\pi} \sum_{k=1}^{\infty} \frac{\cos(2k-1)x}{(2k-1)^2}$$

uniformly on $[-\pi, \pi]$.

(c) Find a value for

$$\sum_{k=1}^{\infty} \frac{1}{(2k-1)^2}.$$

3. Prove that if $f$ is continuous, odd, and periodic, then $\sum_{k=1}^{\infty} b_k(f)/k$ converges.

4. Let $L \in \mathbf{R}$. A series $\sum_{k=0}^{\infty} a_k$ is said to be Abel summable to $L$ if and only if

$$\lim_{r \to 1-} \sum_{k=0}^{\infty} a_k r^k = L.$$

(a) Let $S_k = \sum_{j=0}^{k} a_j$. Prove that

$$\sum_{k=0}^{\infty} a_k r^k = (1 - r) \sum_{k=0}^{\infty} S_k r^k = (1 - r)^2 \sum_{k=0}^{\infty} (k+1)\sigma_k r^k,$$

provided that any one of these series converges for all $0 < r < 1$.

(b) Prove that if $\sum_{k=0}^{\infty} a_k$ is Cesàro summable to $L$, then it is Abel summable to $L$.

(c) Prove that if $f$ is continuous, periodic, and of bounded variation on $\mathbf{R}$, then $Sf$ is Abel summable to $f$ uniformly on $\mathbf{R}$.

(d) Show that if $a_k \ge 0$ and $\sum_{k=0}^{\infty} a_k$ is Abel summable to $L$, then $\sum_{k=0}^{\infty} a_k$ converges to $L$.

5. [BERNSTEIN] Let $f : \mathbf{R} \to \mathbf{R}$ be periodic and $\alpha > 0$. Suppose that $f$ is Lipschitz of order $\alpha$; i.e., there is a constant $M > 0$ such that

$$|f(x + h) - f(x)| \le M|h|^{\alpha}$$

for all $x, h \in \mathbf{R}$.

(a) Prove that

$$\frac{1}{\pi} \int_{-\pi}^{\pi} |f(x + h) - f(x - h)|^2\,dx = 4 \sum_{k=1}^{\infty} \left(a_k^2(f) + b_k^2(f)\right) \sin^2 kh$$

holds for each $h \in \mathbf{R}$.

(b) If $h = \pi/2^{n+1}$, prove that $\sin^2 kh \ge 1/2$ for all $k \in [2^{n-1}, 2^n]$.

(c) Combine parts (a) and (b) to prove that

for $n = 1, 2, 3, \ldots$.

(d) Assuming that

(see Exercise 9, p. 380), prove that if $f$ is Lipschitz of order $\alpha$ for some $\alpha > 1/2$, then $Sf$ converges absolutely and uniformly on $\mathbf{R}$.

(e) Prove that if $f : \mathbf{R} \to \mathbf{R}$ is periodic and continuously differentiable, then $Sf$ converges absolutely and uniformly on $\mathbf{R}$.

*6. Suppose that $f : \mathbf{R} \to \mathbf{R}$ is periodic and of bounded variation on $[-\pi, \pi]$. Prove that $S_N f \to f$ almost everywhere as $N \to \infty$ (see Exercise 8, p. 519).
e14.5 UNIQUENESS
This section uses material from Section 14.4.
In this section we examine the Uniqueness Question posed in Section 14.1. We begin with the following generalization of the second derivative.

14.32 DEFINITION. Let $x_0 \in \mathbf{R}$ and let $I$ be an open interval containing $x_0$. A function $F : I \to \mathbf{R}$ is said to have a second symmetric derivative at $x_0$ if and only if

$$D_2 F(x_0) = \lim_{h \to 0+} \frac{F(x_0 + 2h) + F(x_0 - 2h) - 2F(x_0)}{4h^2}$$

exists.

14.33 Remark. Let $x_0 \in \mathbf{R}$ and let $I$ be an open interval containing $x_0$. If $F$ is differentiable on $I$ and $F''(x_0)$ exists, then $F$ has a second symmetric derivative at $x_0$ and $D_2 F(x_0) = F''(x_0)$.

PROOF. Set $G(t) = F(x_0 + 2t) + F(x_0 - 2t)$ for $t \in I$ and $H(t) = 4t^2$ and fix $t \in I$. By Theorem 4.15 (the Generalized Mean Value Theorem),

$$\frac{F(x_0 + 2t) + F(x_0 - 2t) - 2F(x_0)}{4t^2} = \frac{G(t) - G(0)}{H(t) - H(0)} = \frac{G'(c)}{H'(c)} = \frac{F'(x_0 + 2c) - F'(x_0 - 2c)}{4c}$$

for some $c$ between 0 and $t$. Since $c \to 0$ as $t \to 0$, it follows that

$$\begin{aligned}
D_2 F(x_0) &= \lim_{c \to 0} \frac{F'(x_0 + 2c) - F'(x_0 - 2c)}{4c} \\
&= \frac{1}{2} \lim_{c \to 0} \left( \frac{F'(x_0 + 2c) - F'(x_0)}{2c} + \frac{F'(x_0) - F'(x_0 - 2c)}{2c} \right) \\
&= \frac{1}{2}\left(F''(x_0) + F''(x_0)\right) = F''(x_0). \; ∎
\end{aligned}$$
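Remark 14.33 can be checked numerically for a smooth function. The sketch below is illustrative only: the test function $F = \sin$, the point $x_0 = 0.7$, and the step sizes are arbitrary choices, and `second_symmetric_quotient` is a hypothetical helper that evaluates the difference quotient from Definition 14.32.

```python
import math

def second_symmetric_quotient(F, x0, h):
    """Quotient from Definition 14.32: (F(x0+2h) + F(x0-2h) - 2F(x0)) / (4h^2)."""
    return (F(x0 + 2 * h) + F(x0 - 2 * h) - 2 * F(x0)) / (4 * h * h)

x0 = 0.7
exact = -math.sin(x0)  # F''(x0) for F = sin
errors = []
for h in (1e-1, 1e-2, 1e-3):
    q = second_symmetric_quotient(math.sin, x0, h)
    errors.append(abs(q - exact))
    print(f"h = {h:g}: quotient = {q:.8f}, error = {errors[-1]:.2e}")
```

For this choice of $F$ the quotient equals $-\sin x_0\,(\sin h/h)^2$, so the error shrinks roughly like $h^2$ as $h \to 0+$.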
The converse of Remark 14.33 is false. Indeed, if

$$F(x) = \begin{cases} 1 & x > 0 \\ 0 & x = 0 \\ -1 & x < 0, \end{cases}$$

then $D_2 F(0) = 0$ but $F''(0)$ does not exist. The following result reinforces further the analogy between the second derivative and the second symmetric derivative (see also Exercises 1 and 5).

14.34 Lemma. Let $[a, b]$ be a closed bounded interval. If $F : [a, b] \to \mathbf{R}$ is continuous on $[a, b]$ and $D_2 F(x) = 0$ for all $x \in (a, b)$, then $F$ is linear on $[a, b]$; i.e., there exist constants $m, \gamma$ such that $F(x) = mx + \gamma$ for all $x \in [a, b]$.

PROOF. Let $\varepsilon > 0$. By hypothesis,

$$\phi(x) := F(x) - F(a) - \left(\frac{F(b) - F(a)}{b - a}\right)(x - a) + \varepsilon(x - a)(x - b)$$

is continuous on $[a, b]$, and by Remark 14.33,

$$D_2 \phi(x) = D_2 F(x) + 2\varepsilon = 2\varepsilon \tag{29}$$

for $x \in (a, b)$. We claim that $\phi(x) \le 0$ for $x \in [a, b]$. Clearly, $\phi(a) = \phi(b) = 0$. If $\phi(x) > 0$ for some $x \in (a, b)$, then $\phi$ attains its maximum at some $x_0 \in (a, b)$. By Exercise 1, $D_2 \phi(x_0) \le 0$, hence by (29), $2\varepsilon \le 0$, a contradiction. This proves the claim.

Fix $x \in [a, b]$. We have shown that

$$F(x) - F(a) - \left(\frac{F(b) - F(a)}{b - a}\right)(x - a) \le \varepsilon(x - a)(b - x).$$

A similar argument establishes that

$$F(x) - F(a) - \left(\frac{F(b) - F(a)}{b - a}\right)(x - a) \ge -\varepsilon(x - a)(b - x).$$

Therefore,

$$\left| F(x) - F(a) - \left(\frac{F(b) - F(a)}{b - a}\right)(x - a) \right| \le \varepsilon(x - a)(b - x) \le \varepsilon(b - a)^2.$$

Taking the limit of this inequality as $\varepsilon \to 0$, we conclude that

$$F(x) = F(a) + \left(\frac{F(b) - F(a)}{b - a}\right)(x - a)$$

for all $x \in [a, b]$; i.e., $F$ is linear on $[a, b]$. ∎
14.35 DEFINITION. The second formal integral of a trigonometric series,

$$S = \frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx),$$

is the function

$$F(x) = \frac{a_0}{4} x^2 - \sum_{k=1}^{\infty} \frac{1}{k^2} (a_k \cos kx + b_k \sin kx).$$

By the Weierstrass M-Test, if the coefficients of $S$ are bounded, then the second formal integral of $S$ converges uniformly on $\mathbf{R}$. In particular, the second formal integral always exists when the coefficients of $S$ converge to zero.

Notice that the second formal integral of a trigonometric series $S$ is the result of integrating $S$ twice term by term. Hence, it is not unreasonable to expect that two derivatives of the second formal integral $F$ might recapture the original series $S$. Although this statement is not quite correct, the following result shows there is a simple connection between the limit of the series $S$ and the second symmetric derivative of $F$.
14.36 THEOREM [RIEMANN]. Suppose that

$$S = \frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx)$$

is a trigonometric series whose coefficients $a_k, b_k \to 0$ as $k \to \infty$ and let $F$ be the second formal integral of $S$. If $S(x_0)$ converges to $L$ for some $x_0 \in \mathbf{R}$, then $D_2 F(x_0) = L$.

PROOF. Let $F_N$ denote the partial sums of $F$. After several applications of Theorem B.3, we observe that

$$\begin{aligned}
\lim_{h \to 0} \frac{F_N(x_0 + 2h) + F_N(x_0 - 2h) - 2F_N(x_0)}{4h^2} &= \lim_{h \to 0} \left( \frac{a_0}{2} + \sum_{k=1}^{N} (a_k \cos kx_0 + b_k \sin kx_0) \left(\frac{\sin kh}{kh}\right)^2 \right) \\
&= \frac{a_0}{2} + \sum_{k=1}^{N} (a_k \cos kx_0 + b_k \sin kx_0)
\end{aligned}$$

holds for any $N \in \mathbf{N}$. Therefore, it suffices to show that given $\varepsilon > 0$, there is an $N \in \mathbf{N}$ such that

$$|R_N| := \left| \sum_{k=N+1}^{\infty} (a_k \cos kx_0 + b_k \sin kx_0) \left(\frac{\sin kh}{kh}\right)^2 \right| < \varepsilon \tag{30}$$

for all $|h| \le 1$.
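Let

$$A_k = \sum_{j=k+1}^{\infty} (a_j \cos jx_0 + b_j \sin jx_0) \qquad\text{and}\qquad B_k = \left(\frac{\sin kh}{kh}\right)^2$$

for $k \in \mathbf{N}$. Since $A_n \to 0$ as $n \to \infty$, we have by Abel's Formula that

$$\begin{aligned}
R_N &:= \lim_{n \to \infty} \sum_{k=N+1}^{n} (A_{k-1} - A_k) B_k \\
&= \lim_{n \to \infty} \left( (A_N - A_n) B_n - \sum_{k=N+1}^{n-1} (A_N - A_k)(B_{k+1} - B_k) \right) \\
&= A_N B_{N+1} + \sum_{k=N+1}^{\infty} A_k (B_{k+1} - B_k).
\end{aligned} \tag{31}$$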
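Moreover, by the Fundamental Theorem of Calculus,

$$|B_{k+1} - B_k| = \left| \int_{kh}^{(k+1)h} \frac{d}{dt} \left(\frac{\sin t}{t}\right)^2 dt \right|. \tag{32}$$

Since

$$\frac{d}{dt} \left(\frac{\sin t}{t}\right)^2 = 2 \left(\frac{\sin t}{t}\right) \left(\frac{t \cos t - \sin t}{t^2}\right)$$

is bounded near $t = 0$ and is bounded by $2(t + 1)/t^3 \le 3/t^2$ for $t \ge 2$, it is clear that the improper integral

$$C = \int_0^{\infty} \left| \frac{d}{dt} \left(\frac{\sin t}{t}\right)^2 \right| dt$$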
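converges. Since $\{B_k\}$ is bounded and $A_N \to 0$ as $N \to \infty$, we can choose an $N \in \mathbf{N}$ such that

$$|A_N B_{N+1}| < \frac{\varepsilon}{2} \qquad\text{and}\qquad |A_k| < \frac{\varepsilon}{2C} \quad\text{for } k > N. \tag{33}$$

It follows from (32) that

$$\left| \sum_{k=N+1}^{\infty} A_k (B_{k+1} - B_k) \right| \le \frac{\varepsilon}{2C} \sum_{k=N+1}^{\infty} \left| \int_{kh}^{(k+1)h} \frac{d}{dt} \left(\frac{\sin t}{t}\right)^2 dt \right| < \frac{\varepsilon}{2C} \int_0^{\infty} \left| \frac{d}{dt} \left(\frac{\sin t}{t}\right)^2 \right| dt = \frac{\varepsilon}{2}.$$

Combining this inequality with (31) and (33), we conclude that $|R_N| < \varepsilon$. ∎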
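Riemann's Theorem can be illustrated on a concrete series. The following is a rough numerical sketch under stated assumptions: the series $S = \sum_{k \ge 1} (\sin kx)/k$ is chosen here as an example (its coefficients tend to zero, and its sum is known to be $(\pi - x)/2$ for $0 < x < 2\pi$), and the truncation level `K`, the point `x0`, and the step `h` are arbitrary. Since $a_0 = 0$, Definition 14.35 gives $F(x) = -\sum_{k \ge 1} (\sin kx)/k^3$.

```python
import math

K = 20000  # truncation level for the series (an arbitrary choice for this sketch)

def F(x):
    """Truncated second formal integral of S = sum sin(kx)/k, namely -sum sin(kx)/k^3."""
    return -sum(math.sin(k * x) / k**3 for k in range(1, K + 1))

x0, h = 1.0, 0.01
# Difference quotient whose limit as h -> 0 defines D_2 F(x0)
quotient = (F(x0 + 2 * h) + F(x0 - 2 * h) - 2 * F(x0)) / (4 * h * h)
L = (math.pi - x0) / 2  # value of S(x0) for 0 < x0 < 2*pi
print(f"quotient = {quotient:.6f}, L = {L:.6f}")
```

The quotient agrees with $L$ to within the truncation error of the series, as the theorem predicts for $D_2 F(x_0)$.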
The following result shows that the hypotheses of Riemann's Theorem are satisfied by any trigonometric series that converges pointwise on a nondegenerate interval.
14.37 THEOREM [CANTOR-LEBESGUE LEMMA]. If

$$S = \frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx)$$

is a trigonometric series that converges pointwise on a nondegenerate interval $[a, b]$, then its coefficients satisfy $a_k, b_k \to 0$ as $k \to \infty$.

PROOF. Set $\rho_0 = a_0/2$ and $\rho_k^2 = a_k^2 + b_k^2$ for $k \in \mathbf{N}$. If the result is false, then there is a $\delta > 0$ such that $\rho_k > \delta$ for infinitely many $k \in \mathbf{N}$. Set $\theta_0 = 0$ and for each $k \in \mathbf{N}$ define $\theta_k \in \mathbf{R}$ so that $a_k = \rho_k \cos k\theta_k$, $b_k = \rho_k \sin k\theta_k$. By a sum-angle formula,

$$\frac{a_0}{2} + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx) = \sum_{k=0}^{n} \rho_k \cos k(x - \theta_k)$$

for each $x \in \mathbf{R}$ and $n \in \mathbf{N}$. Since $S$ converges on $[a, b]$, it follows that

$$\lim_{k \to \infty} \rho_k \cos k(x - \theta_k) = 0 \tag{34}$$

for all $x \in [a, b]$.

Set $I_0 = [a, b]$ and $k_0 = 1$. Fix $j \ge 0$ and suppose that a closed interval $I_j \subseteq I_0$ and an integer $k_j > k_0$ have been chosen. Choose $k_{j+1} > k_j$ such that $k_{j+1}|I_j| > 2\pi$ and $\rho_{k_{j+1}} > \delta$. Clearly, $k_{j+1}(x - \theta_{k_{j+1}})$ runs over an interval of length $> 2\pi$ as $x$ runs over $I_j$. Hence, we can choose a closed interval $I_{j+1} \subseteq I_j$ such that $x \in I_{j+1}$ implies

By induction, then, there exist integers $1 < k_1 < k_2 < \cdots$ and a nested sequence of closed intervals $I_0 \supseteq I_1 \supseteq \cdots$ such that (35) for $x \in I_j$, $j \in \mathbf{N}$. By the Nested Interval Property, there is an $x \in I_j$ for all $j \in \mathbf{N}$. This $x$ must satisfy (35) for all $j \in \mathbf{N}$ and must belong to $[a, b]$ by construction. Since this contradicts (34), we conclude that $\rho_k \to 0$ as $k \to \infty$. ∎

We are now prepared to answer the Uniqueness Question for continuous functions of bounded variation.

14.38 THEOREM [CANTOR]. Suppose that

$$S = \frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx)$$

converges pointwise on $[-\pi, \pi]$ to a function $f$ which is periodic and continuous on $\mathbf{R}$, and of bounded variation on $[-\pi, \pi]$. Then $S$ is the Fourier series of $f$; i.e., $a_k = a_k(f)$ for $k = 0, 1, \ldots$, and $b_k = b_k(f)$ for $k = 1, 2, \ldots$.
PROOF. Suppose first that $f(x) = 0$ for all $x \in \mathbf{R}$. By the Cantor-Lebesgue Lemma, the coefficients $a_k, b_k$ tend to zero as $k \to \infty$. Thus the second formal integral $F$ of $S$ is continuous on $\mathbf{R}$ and by Riemann's Theorem has a second symmetric derivative that satisfies $D_2 F(x) = 0$ for $x \in \mathbf{R}$. It follows that $F$ is linear on $\mathbf{R}$; i.e., there exist numbers $m$ and $\gamma$ such that

$$mx + \gamma = \frac{a_0}{4} x^2 - \sum_{k=1}^{\infty} \frac{1}{k^2} (a_k \cos kx + b_k \sin kx)$$

for $x \in \mathbf{R}$. Since the series in this expression is periodic, it must be the case that $m = a_0 = 0$; i.e.,

$$\gamma + \sum_{k=1}^{\infty} \frac{1}{k^2} (a_k \cos kx + b_k \sin kx) = 0$$

for all $x \in \mathbf{R}$. Since this series converges uniformly, it follows from Theorem 14.4 that $\gamma = 0$ and $a_k = b_k = 0$ for $k \in \mathbf{N}$. This proves the theorem when $f = 0$.

If $f$ is periodic, continuous, and of bounded variation on $[-\pi, \pi]$, then $S_N f \to f$ uniformly on $\mathbf{R}$ by Theorem 14.30. Hence, the series $S - Sf$ converges pointwise on $\mathbf{R}$ to zero. It follows from the case already considered that $a_k - a_k(f) = 0$ for $k = 0, 1, \ldots$, and $b_k - b_k(f) = 0$ for $k = 1, 2, \ldots$. ∎
EXERCISES

1. Suppose that $F : \mathbf{R} \to \mathbf{R}$ has a second symmetric derivative at some $x_0$. Prove that if $F(x_0)$ is a local maximum, then $D_2 F(x_0) \le 0$, and if $F(x_0)$ is a local minimum, then $D_2 F(x_0) \ge 0$.

2. Prove that if the coefficients of a trigonometric series are bounded, then its second formal integral converges uniformly on $\mathbf{R}$.

3. Prove that if $f : \mathbf{R} \to \mathbf{R}$ is periodic, then there exists at most one trigonometric series that converges to $f$ pointwise on $\mathbf{R}$.

4. Suppose that $f : \mathbf{R} \to \mathbf{R}$ is periodic, piecewise continuous, and of bounded variation on $\mathbf{R}$. Prove that if $S$ is a trigonometric series that converges to $(f(x+) + f(x-))/2$ for all $x \in \mathbf{R}$, then $S$ is the Fourier series of $f$.

*5. Suppose that $F : (a, b) \to \mathbf{R}$ is continuous and $D_2 F(x) > 0$ for all $x \in (a, b)$. Prove that $F$ is convex on $(a, b)$.
Chapter 15

Differentiable Manifolds

This chapter is considerably more abstract than those preceding it. Our aim is to show that the theorems of Green, Gauss, and Stokes are special cases of a more general theory in which differential forms of degree 1 and 2 are replaced by differential forms of degree $n$, and curves and surfaces are replaced by $n$-dimensional manifolds. Differential forms of degree $n$ are introduced in Section 15.1, $n$-dimensional manifolds are introduced in Section 15.2, and an $n$-dimensional version of Stokes's Theorem is proved in Section 15.3.

e15.1 DIFFERENTIAL FORMS ON $\mathbf{R}^n$

This section uses no material from any other enrichment section.

We introduced differential forms of degree 1 and 2 in Sections 13.2 and 13.4. In this section we introduce differential forms of degree $r$. It turns out that as far as calculus is concerned, the actual definition of differential forms is not as important as their algebraic structure. For this reason, we begin with the following formal definition. (For a more constructive approach to differential forms that interprets $dx_i$ as a derivative of the projection operator $(x_1, \ldots, x_n) \mapsto x_i$, see Spivak [12], p. 89.)

15.1 DEFINITION. Let $0 \le r \le n$ and let $V$ be open in $\mathbf{R}^n$.

(i) A 0-form (or differential form of degree $r = 0$) on $V$ is a function $f : V \to \mathbf{R}$.

(ii) Let $r > 0$. An $r$-form (or differential form of degree $r$) on $V$ is an expression of the form

$$\omega = \sum f_{i_1, \ldots, i_r}\, dx_{i_1} \cdots dx_{i_r}, \tag{1}$$

where the sum is taken over all integers $i_j$ that satisfy $1 \le i_1 < i_2 < \cdots < i_r \le n$, each coefficient function $f_{i_1, \ldots, i_r}$ is a 0-form on $V$, and the $dx_{i_j}$'s are symbols that (for us) will take on meaning only in the context of integration (see Definition 15.38). If all the coefficient functions are zero, then $\omega$ is called the zero $r$-form and is denoted by 0.

(iii) Two $r$-forms, $\omega = \sum f_{i_1, \ldots, i_r}\, dx_{i_1} \cdots dx_{i_r}$ and $\eta = \sum g_{i_1, \ldots, i_r}\, dx_{i_1} \cdots dx_{i_r}$, are said to be equal on $V$ if and only if $f_{i_1, \ldots, i_r}(x) = g_{i_1, \ldots, i_r}(x)$ for all $1 \le i_1 < i_2 < \cdots < i_r \le n$ and all $x \in V$.

(iv) An $r$-form $\omega$ is said to be decomposable on $V$ if and only if there exist integers $1 \le i_1 < \cdots < i_r \le n$ and a 0-form $f$ such that

$$\omega = f\, dx_{i_1} \cdots dx_{i_r}$$

on $V$.

(v) An $r$-form is said to be continuous (respectively, $C^p$) on $V$ if and only if all of its coefficient functions are continuous (respectively, $C^p$) on $V$.

(vi) The support of an $r$-form (notation: $\operatorname{spt} \omega$) is the union of the supports of its coefficient functions; i.e., if $\omega$ is given by (1), then

$$\operatorname{spt} \omega = \bigcup \operatorname{spt}(f_{i_1, \ldots, i_r}).$$

If $\operatorname{spt} \omega \subseteq E$, then $\omega$ is said to be supported on $E$.
Let $V$ be open in $\mathbf{R}^n$. Since there is only one collection of indices that satisfies $1 \le i_1 < \cdots < i_r \le n$ for $r = n$, an $n$-form on $V$ is an expression of the form $\omega = f\, dx_1 \cdots dx_n$ for some 0-form $f$ (i.e., a function) on $V$. Thus every $n$-form on $V \subseteq \mathbf{R}^n$ is decomposable. At the other extreme, a general 1-form on $V$ is an expression of the form

$$\omega = \sum_{j=1}^{n} f_j\, dx_j,$$

where each $f_j$ is a 0-form on $V$. An example of a 1-form is the total differential of a differentiable function $z = f(x, y)$, i.e., $dz = f_x\, dx + f_y\, dy$. An $(n-1)$-form on $\mathbf{R}^n$ is an expression of the form

$$\omega = \sum_{j=1}^{n} f_j\, dx_1 \cdots \widehat{dx_j} \cdots dx_n.$$

The notation $\widehat{dx_j}$ indicates that the differential $dx_j$ is missing. Thus, a 2-form on $\mathbf{R}^3$ is an expression of the form

$$\omega = f_1\, dy\, dz + f_2\, dx\, dz + f_3\, dx\, dy. \tag{2}$$
In Chapter 13, we used Jacobians to define differential forms of degree 2 on a smooth orientable surface $S = (\phi, E)$ and to associate with each 2-form an oriented integral on $S$ (see also Exercise 5). In the same way, we shall associate $n$-forms on $\mathbf{R}^n$ with oriented integrals over certain geometric objects called $n$-dimensional manifolds. First, we introduce an algebraic structure on the collection of differential forms that is compatible with this identification.

Addition of differential forms can be realized by grouping like terms and simplifying coefficients. For example, the sum of $x^2\, dy\, dz + y\, dx\, dy$ and $(1 - x^2)\, dy\, dz$ is

$$x^2\, dy\, dz + y\, dx\, dy + (1 - x^2)\, dy\, dz = dy\, dz + y\, dx\, dy.$$

In particular, if $V$ is open in $\mathbf{R}^n$ and

$$\omega = \sum f_{i_1, \ldots, i_r}\, dx_{i_1} \cdots dx_{i_r}, \qquad \eta = \sum g_{i_1, \ldots, i_r}\, dx_{i_1} \cdots dx_{i_r}$$

are $r$-forms on $V$, then

$$\omega + \eta = \sum (f_{i_1, \ldots, i_r} + g_{i_1, \ldots, i_r})\, dx_{i_1} \cdots dx_{i_r}.$$

It is clear that addition of differential forms satisfies the usual laws of algebra, e.g., the Commutative Law and the Associative Law. The product of a 0-form (this includes scalars) and an $r$-form can be defined by

$$g \left( \sum f_{i_1, \ldots, i_r}\, dx_{i_1} \cdots dx_{i_r} \right) = \sum g f_{i_1, \ldots, i_r}\, dx_{i_1} \cdots dx_{i_r}.$$

It is clear that if $\omega, \eta$ are $r$-forms and $f, g$ are 0-forms, then

$$f(\omega + \eta) = f\omega + f\eta \qquad\text{and}\qquad (f + g)\omega = f\omega + g\omega.$$
Multiplication of differential forms of degrees r, s > 0 is somewhat more complicated to describe. To explain what happens, recall that if S = (
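$$dy\, dz = \frac{\partial(y, z)}{\partial(u, v)}\, d(u, v), \qquad dz\, dx = \frac{\partial(z, x)}{\partial(u, v)}\, d(u, v), \qquad\text{and}\qquad dx\, dy = \frac{\partial(x, y)}{\partial(u, v)}\, d(u, v),$$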
where x =
for $k, j = 1, \ldots, n$. Next, we multiply two differential forms by assuming the distributive law holds, grouping like terms, and simplifying the resulting expression using the Nilpotent Property and the Anticommutative Property. For example, the product of $x^2\, dx$ and $y\, dy + z\, dz$ is

$$(x^2\, dx)(y\, dy + z\, dz) = x^2 y\, dx\, dy + x^2 z\, dx\, dz,$$

and the product of $\sin x\, dz$ and $x^2\, dx + xy\, dy + \log z\, dz$ is

$$(\sin x\, dz)(x^2\, dx + xy\, dy + \log z\, dz) = -xy \sin x\, dy\, dz - x^2 \sin x\, dx\, dz.$$

In particular, if $\omega = \sum_{j=1}^{N} \omega_j$ and $\eta = \sum_{k=1}^{L} \eta_k$ is a sum of differential forms, then

$$\omega\eta = \sum_{j=1}^{N} \sum_{k=1}^{L} \omega_j \eta_k.$$
Although the Anticommutative Property and the Nilpotent Property may seem strange, they are natural consequences of the fact that $dx\, dy$ comes not from an iterated integral but an oriented integral. For example, the Anticommutative Property reflects the fact that when orientation is changed, the sign of the integral changes.

15.2 Example. Find $\omega + \eta$, $\omega - \eta$, and $\omega\eta$ if $\omega = x^2\, dx\, dz + xy\, dy\, dz$ and $\eta = 2y\, dx\, dz$.

SOLUTION. By definition,

$$\omega + \eta = (x^2 + 2y)\, dx\, dz + xy\, dy\, dz = xy\, dy\, dz - (x^2 + 2y)\, dz\, dx,$$

$$\omega - \eta = (x^2 - 2y)\, dx\, dz + xy\, dy\, dz = xy\, dy\, dz + (2y - x^2)\, dz\, dx,$$

and

$$\begin{aligned}
\omega\eta &= (x^2\, dx\, dz + xy\, dy\, dz)(2y\, dx\, dz) \\
&= 2x^2 y\, dx\, dz\, dx\, dz + 2xy^2\, dy\, dz\, dx\, dz \\
&= -2x^2 y\, dx\, dx\, dz\, dz - 2xy^2\, dy\, dx\, dz\, dz = 0. \; ∎
\end{aligned}$$
Using products of 1-forms, we see that an $r$-form is an expression of the form

$$\sum f_{i_1, \ldots, i_r}\, dx_{i_1} \cdots dx_{i_r},$$

where the sum is taken over all integers $i_j \in \{1, \ldots, n\}$; i.e., it is no longer necessary that the $i_j$'s increase in $j$. Because of the connection between 2-forms and oriented surface integrals, we will frequently use

$$\omega = P\, dy\, dz + Q\, dz\, dx + R\, dx\, dy$$

to represent a generic 2-form on $\mathbf{R}^3$ rather than (2). Here is a summary of the algebraic laws satisfied by addition and multiplication of differential forms.
15.3 THEOREM. Let $V$ be open in $\mathbf{R}^n$, let $f$ be a 0-form on $V$, let $\omega$ be an $r$-form on $V$, let $\eta$ be an $s$-form on $V$, and let $\theta$ be a $t$-form on $V$.

(i) If $r = s$, then $\omega + \eta$ is an $r$-form, $\omega + \eta = \eta + \omega$, and $(\omega + \eta)\theta = \omega\theta + \eta\theta$. If $r = s = t$, then $(\omega + \eta) + \theta = \omega + (\eta + \theta)$.

(ii) For any $r$ and $s$, $\omega\eta = (-1)^{rs}\eta\omega$.

(iii) For any $r$, $s$, and $t$, $(\omega\eta)\theta = \omega(\eta\theta)$ and $f(\omega\eta) = (f\omega)\eta = \omega(f\eta)$.

PROOF. Properties (i) and (iii) hold, by definition. To prove (ii), we may suppose that $\omega$ and $\eta$ are decomposable; i.e., suppose that $\omega = f\, dx_{i_1} \cdots dx_{i_r}$ and $\eta = g\, dx_{j_1} \cdots dx_{j_s}$. By definition, the product of $\omega$ and $\eta$ is the $(r + s)$-form

$$\omega\eta = fg\, dx_{i_1} \cdots dx_{i_r}\, dx_{j_1} \cdots dx_{j_s}.$$

Successive applications of the Anticommutative Property yield

$$\begin{aligned}
\omega\eta &= fg\, dx_{i_1} \cdots dx_{i_r}\, dx_{j_1} \cdots dx_{j_s} \\
&= (-1)^r fg\, dx_{j_1}\, dx_{i_1}\, dx_{i_2} \cdots dx_{i_r}\, dx_{j_2} \cdots dx_{j_s} \\
&= \cdots = (-1)^{rs} gf\, dx_{j_1} \cdots dx_{j_s}\, dx_{i_1} \cdots dx_{i_r} = (-1)^{rs}\eta\omega.
\end{aligned}$$

This completes the proof of part (ii). ∎
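The sign rule in part (ii) can be checked mechanically. The sketch below is a simplified model, not the general definition: it assumes constant coefficients, stores a form as a dictionary from increasing index tuples to numbers, and uses the hypothetical helpers `normalize` and `wedge`. Repeated indices are discarded (Nilpotent Property) and each transposition flips the sign (Anticommutative Property).

```python
from itertools import product

def normalize(indices):
    """Sort a tuple of indices; return (sign, sorted tuple), with sign 0 if an index repeats."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()                          # Nilpotent Property: dx_k dx_k = 0
    sign = 1
    for i in range(len(idx)):                 # bubble sort: every swap is one use of
        for j in range(len(idx) - 1 - i):     # the Anticommutative Property
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def wedge(w, eta):
    """Product of two forms stored as {increasing index tuple: constant coefficient}."""
    out = {}
    for (I, f), (J, g) in product(w.items(), eta.items()):
        sign, K = normalize(I + J)
        if sign:
            out[K] = out.get(K, 0) + sign * f * g
    return {K: c for K, c in out.items() if c != 0}

# Forms on R^4; the index k stands for dx_k.
w = {(1,): 2, (3,): -1}            # 1-form  2 dx_1 - dx_3
eta = {(1, 2): 5, (2, 4): 3}       # 2-form  5 dx_1 dx_2 + 3 dx_2 dx_4
theta = {(2,): 1, (4,): 7}         # 1-form  dx_2 + 7 dx_4

print(wedge(w, eta), wedge(eta, w))      # rs = 2: the two products agree
print(wedge(w, theta), wedge(theta, w))  # rs = 1: the two products differ by a sign
```

Here $rs$ is even for the pair $(\omega, \eta)$ and odd for the pair $(\omega, \theta)$, so the two printed lines show the cases $\omega\eta = \eta\omega$ and $\omega\theta = -\theta\omega$.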
In Section 11.3 we introduced the total differential of a function $z = f(x, y)$ of two variables as $dz = f_x\, dx + f_y\, dy$. This gives us two definitions for $dz\, dx$ and $dy\, dz$, one using Jacobians and one "multiplying" the total differential $dz$ by the 1-forms $dx$ and $dy$. These two definitions are compatible. Indeed, using the trivial parametrization of the surface $z = f(x, y)$, the Jacobian definition yields

$$dy\, dz = -f_x\, d(x, y) \qquad\text{and}\qquad dz\, dx = -f_y\, d(x, y). \tag{3}$$

On the other hand, multiplying the 1-form $dz = f_x\, dx + f_y\, dy$ on the left by $dy$ we have, by the Nilpotent Property and the Anticommutative Property, that

$$dy\, dz = dy\,(f_x\, dx + f_y\, dy) = f_x\, dy\, dx + f_y\, dy\, dy = -f_x\, dx\, dy.$$

A similar computation leads to $dz\, dx = -f_y\, dx\, dy$. Thus if we identify $d(x, y)$ with $dx\, dy$, (3) holds no matter which definition we use. (Identification of $d(x, y)$ with $dx\, dy$ is justified by using the "identity chart"; see Remark 15.41.)

The following result contains an important computation that relates the $n$-fold product of 1-forms on $\mathbf{R}^n$ to the determinant operator.
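15.4 THEOREM. Let $V$ be open in $\mathbf{R}^n$ and let $\omega_1, \omega_2, \ldots, \omega_n$ be 1-forms on $V$. If $A = [a_{ij}]_{n \times n}$ is a real matrix, then

$$\left( \sum_{j=1}^{n} a_{1j}\omega_j \right) \cdots \left( \sum_{j=1}^{n} a_{nj}\omega_j \right) = (\det A)\, \omega_1 \cdots \omega_n.$$

PROOF. The proof is by induction on $n$. If $n = 1$, there is nothing to prove. Suppose that the theorem holds for some integer $(n - 1) \ge 1$. By Theorem 15.3 and the Nilpotent Property, we have

Continue this string of identities using the Anticommutative Property, the inductive hypothesis, the definition of $\det A$ in terms of cofactors of $A$, and the Nilpotent Property. We obtain

$$\begin{aligned}
\left( \sum_{j=1}^{n} a_{1j}\omega_j \right) \cdots \left( \sum_{j=1}^{n} a_{nj}\omega_j \right) &= a_{11} \det A_{11}\, (\omega_1 \omega_2 \cdots \omega_n) + (-1)^1 a_{21} \det A_{21}\, (\omega_1 \omega_2 \cdots \omega_n) \\
&\quad + \cdots + (-1)^{n-1} a_{n1} \det A_{n1}\, (\omega_1 \omega_2 \cdots \omega_n) \\
&\quad + \left( \sum_{j=2}^{n} a_{1j}\omega_j \right) \det A_{11}\, (\omega_2 \cdots \omega_n) \\
&= (\det A)\, \omega_1 \cdots \omega_n + 0 = (\det A)\, \omega_1 \cdots \omega_n. \; ∎
\end{aligned}$$

The derivative of a differential form is defined as follows. (This derivative can be used to unify the three operators grad, curl, and div; see Exercise 4.)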
15.5 DEFINITION. Let $V$ be open in $\mathbf{R}^n$ and suppose that $\omega$ is a $C^1$ $r$-form on $V$.

(i) If $\omega = f$ is a 0-form, then the exterior derivative of $\omega$ is the 1-form

$$d\omega := \sum_{j=1}^{n} \frac{\partial f}{\partial x_j}\, dx_j.$$

(ii) If $\omega = f\, dx_{i_1} \cdots dx_{i_r}$ is a decomposable $r$-form, $r > 0$, then the exterior derivative of $\omega$ is the $(r + 1)$-form

$$d\omega := df\, dx_{i_1} \cdots dx_{i_r}.$$

(iii) If $\omega$ is a differential form of degree $r > 0$, i.e., $\omega = \sum_{j=1}^{N} \omega_j$, where each $\omega_j$ is a decomposable $r$-form, then the exterior derivative of $\omega$ is the $(r + 1)$-form

$$d\omega := \sum_{j=1}^{N} d\omega_j.$$

(iv) If $\omega$ is a $C^2$ $r$-form on $V$, then the second exterior derivative of $\omega$ is $d^2\omega := d(d\omega)$.
15.6 Example. Find $d\omega$ and $d^2\omega$ if $\omega(x, y, z, t) = xy\, dx\, dy + (x + z + t)\, dz\, dt$.

SOLUTION. By definition,

$$d\omega = (y\, dx + x\, dy)\, dx\, dy + (dx + dz + dt)\, dz\, dt = dx\, dz\, dt;$$

hence $d^2\omega = (d1)\, dx\, dz\, dt = 0$. ∎
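The computation in Example 15.6 can be reproduced mechanically. The following is a minimal sketch under simplifying assumptions made here: coefficient functions are restricted to polynomials, stored as dictionaries from exponent tuples to coefficients; forms are dictionaries from increasing index tuples to such polynomials; and `sort_sign`, `partial`, and `d` are hypothetical helper names. The function `d` applies Definition 15.5 term by term and then reorders the differentials.

```python
def sort_sign(indices):
    """Return (sign, sorted tuple) for a tuple of indices; sign is 0 if an index repeats."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()
    sign = 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def partial(poly, j):
    """Partial derivative in variable j of a polynomial {exponent tuple: coefficient}."""
    out = {}
    for exps, c in poly.items():
        if exps[j] > 0:
            new = exps[:j] + (exps[j] - 1,) + exps[j + 1:]
            out[new] = out.get(new, 0) + c * exps[j]
    return out

def d(form, n):
    """Exterior derivative: d(f dx_I) = sum_j (df/dx_j) dx_j dx_I, with indices reordered."""
    out = {}
    for I, f in form.items():
        for j in range(n):
            sign, K = sort_sign((j,) + I)
            if sign == 0:
                continue
            target = out.setdefault(K, {})
            for exps, c in partial(f, j).items():
                target[exps] = target.get(exps, 0) + sign * c
    cleaned = {}
    for K, p in out.items():                  # drop zero coefficients and empty terms
        p = {e: c for e, c in p.items() if c != 0}
        if p:
            cleaned[K] = p
    return cleaned

# Variables (x, y, z, t) are numbered 0..3; exponents are listed in that order.
omega = {
    (0, 1): {(1, 1, 0, 0): 1},                                    # xy dx dy
    (2, 3): {(1, 0, 0, 0): 1, (0, 0, 1, 0): 1, (0, 0, 0, 1): 1},  # (x + z + t) dz dt
}
d_omega = d(omega, 4)
dd_omega = d(d_omega, 4)
print(d_omega)    # the single term with indices (0, 2, 3) and coefficient 1, i.e. dx dz dt
print(dd_omega)   # the empty dictionary, i.e. the zero form
```

In this representation the result `{(0, 2, 3): {(0, 0, 0, 0): 1}}` stands for $dx\, dz\, dt$, in agreement with the solution above.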
It is clear that for 0-forms, the exterior derivative satisfies the following rules.

15.7 Remark. Let $\omega$ and $\eta$ be $C^1$ 0-forms on some open set $V \subset \mathbf{R}^n$, and let $a$ be a scalar. Then $d(a\omega)$, $d(\omega + \eta)$, and $d(\omega\eta)$ are continuous 1-forms on $V$ with

$$d(a\omega) = a\, d\omega, \qquad d(\omega + \eta) = d\omega + d\eta, \qquad\text{and}\qquad d(\omega\eta) = \eta\, d\omega + \omega\, d\eta.$$

Analogues of these rules hold for arbitrary $r$-forms.

15.8 THEOREM. Let $V$ be open in $\mathbf{R}^n$ and let $a$ be a scalar. If $\omega$ is a $C^1$ $r$-form on $V$ and $\eta$ is a $C^1$ $s$-form on $V$, then $d(a\omega)$ and $d(\omega + \eta)$ (when $r = s$) are continuous $(r + 1)$-forms on $V$, and $d(\omega\eta)$ is a continuous $(r + s + 1)$-form on $V$. Moreover,

$$d(a\omega) = a\, d\omega, \qquad d(\omega + \eta) = d\omega + d\eta \quad (\text{when } r = s),$$

and

$$d(\omega\eta) = d\omega\, \eta + (-1)^r \omega\, d\eta. \tag{4}$$
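PROOF. By Definition 15.5iii, we may suppose that $\omega$ and $\eta$ are decomposable; i.e., $\omega = f\, dx_{i_1} \cdots dx_{i_r}$ and $\eta = g\, dx_{j_1} \cdots dx_{j_s}$. By Definition 15.5ii and Remark 15.7,

$$d(a\omega) = d(af)\, dx_{i_1} \cdots dx_{i_r} = a\, df\, dx_{i_1} \cdots dx_{i_r} = a\, d\omega.$$

Similarly, if $r = s$ and $i_\nu = j_\nu$, $\nu = 1, \ldots, r$, then

$$d(\omega + \eta) = d(f + g)\, dx_{i_1} \cdots dx_{i_r} = (df + dg)\, dx_{i_1} \cdots dx_{i_r} = d\omega + d\eta.$$

To prove (4), consider first the case $r = 0$; i.e., $\omega = f$ is a 0-form. By Remark 15.7 and Theorem 15.3ii,

$$\begin{aligned}
d(\omega\eta) &= d(fg)\, dx_{j_1} \cdots dx_{j_s} = (g\, df + f\, dg)\, dx_{j_1} \cdots dx_{j_s} \\
&= df\, g\, dx_{j_1} \cdots dx_{j_s} + f\, dg\, dx_{j_1} \cdots dx_{j_s} = d\omega\, \eta + \omega\, d\eta.
\end{aligned}$$

Next, suppose that $r > 0$. If $i_\nu = j_\mu$ for some indices $\nu$ and $\mu$, then the Nilpotent Property implies that $\omega\eta = 0 = d\omega\, \eta = \omega\, d\eta$. On the other hand, if all the indices are distinct, then since $g$ is a 0-form and $dg$ is a 1-form, we have by Theorem 15.3ii that

$$\begin{aligned}
d(\omega\eta) &= d(fg\, dx_{i_1} \cdots dx_{i_r}\, dx_{j_1} \cdots dx_{j_s}) \\
&= (g\, df + f\, dg)\, dx_{i_1} \cdots dx_{i_r}\, dx_{j_1} \cdots dx_{j_s} \\
&= df\, dx_{i_1} \cdots dx_{i_r}\, g\, dx_{j_1} \cdots dx_{j_s} + (-1)^r f\, dx_{i_1} \cdots dx_{i_r}\, dg\, dx_{j_1} \cdots dx_{j_s} \\
&= d\omega\, \eta + (-1)^r \omega\, d\eta. \; ∎
\end{aligned}$$

Equation (4) is called the Product Rule. The following result shows that the second exterior derivative of a $C^2$ $r$-form is always zero. (By Exercise 4, this result generalizes Exercises 9a and b, p. 495.)

15.9 THEOREM. If $\omega$ is a $C^2$ $r$-form on an open set $V \subseteq \mathbf{R}^n$, then $d^2\omega = 0$.

PROOF. We may suppose that $\omega$ is decomposable; i.e., $\omega = f\, dx_{i_1} \cdots dx_{i_r}$. The proof is by induction on $r$. Suppose that $r = 0$; i.e., $\omega = f$. By the Nilpotent Property, the Anticommutative Property, and the fact that the second-order mixed partial derivatives of $f$ commute, we have

$$d^2 f = d\left( \sum_{j=1}^{n} \frac{\partial f}{\partial x_j}\, dx_j \right) = \sum_{j=1}^{n} \sum_{k=1}^{n} \frac{\partial^2 f}{\partial x_k\, \partial x_j}\, dx_k\, dx_j = \sum_{k < j} \left( \frac{\partial^2 f}{\partial x_k\, \partial x_j} - \frac{\partial^2 f}{\partial x_j\, \partial x_k} \right) dx_k\, dx_j = 0.$$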
Suppose that $r = 1$; i.e., $\omega = f\, dx_k$. Since all first-order partial derivatives of the function 1 are zero, we have by definition that $d^2 x_k = d(1\, dx_k) = 0$. Thus, by the Product Rule and the case $r = 0$,

$$d^2\omega = d(df\, dx_k) = d^2 f\, dx_k - df\, d^2 x_k = 0.$$

Finally, suppose that there is an $r > 1$ such that the theorem holds for all $s$-forms, $0 \le s < r$. By definition,

$$d\omega = d(f\, dx_{i_1} \cdots dx_{i_{r-1}})\, dx_{i_r}.$$

Hence, by the Product Rule and the inductive hypothesis (for $s = 1$ and $s = r - 1$), we have

$$d^2\omega = d^2(f\, dx_{i_1} \cdots dx_{i_{r-1}})\, dx_{i_r} + (-1)^r d(f\, dx_{i_1} \cdots dx_{i_{r-1}})\, d^2 x_{i_r} = 0. \; ∎$$
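The following definition shows how to use a continuously differentiable function $\phi : \mathbf{R}^n \to \mathbf{R}^m$ to transform differentials from $\mathbf{R}^m$ to $\mathbf{R}^n$. (This concept will be used later to define integration of $r$-forms over manifolds.)

15.10 DEFINITION. Let $U$ be open in $\mathbf{R}^n$, let $V$ be open in $\mathbf{R}^m$, let $\phi : U \to V$ be $C^1$ on $U$, and suppose that

$$\omega = \sum f_{i_1, \ldots, i_r}\, dx_{i_1} \cdots dx_{i_r}$$

is an $r$-form on $V$. Then the differential transform (induced by $\phi$) of $\omega$ is the $r$-form on $U$ defined by

$$\phi^*(\omega) := \sum \phi^*(f_{i_1, \ldots, i_r})\, \phi^*(dx_{i_1}) \cdots \phi^*(dx_{i_r}),$$

where $\phi^*(f) = f \circ \phi$ for every 0-form $f$ and

$$\phi^*(dx_i) := d\phi_i = \sum_{j=1}^{n} \frac{\partial \phi_i}{\partial u_j}\, du_j$$

for every $i = 1, 2, \ldots, m$.

For the next several remarks, let $U$ be open in $\mathbf{R}^n$, $V$ be open in $\mathbf{R}^m$, and $\phi : U \to V$.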
15.11 Remark. If $\omega$ is a $C^1$ $r$-form on $V$ and $\phi$ is $C^2$ on $U$, then $\phi^*(\omega)$ is a $C^1$ $r$-form on $U$.

PROOF. By definition, $\phi^*(f)(u) = f(\phi(u))$ is a $C^1$ 0-form on $U$ for every $C^1$ 0-form $f$ on $V$ and

$$\phi^*(dx_i) = \sum_{j=1}^{n} \frac{\partial \phi_i}{\partial u_j}\, du_j$$

is a $C^1$ 1-form on $U$ for $i = 1, 2, \ldots, m$. Hence, it is clear that $\phi^*(\omega)$ is a $C^1$ $r$-form on $U$ when $\omega$ is a $C^1$ $r$-form on $V$. ∎
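15.12 Remark. The differential transform $\phi^*$ is linear; i.e., if $\omega$ and $\eta$ are $r$-forms on $V$ and $\phi$ is $C^1$ on $U$, then

$$\phi^*(\omega + \eta) = \phi^*(\omega) + \phi^*(\eta).$$

PROOF. We may suppose that $\omega = f\, dx_{i_1} \cdots dx_{i_r}$ and $\eta = g\, dx_{i_1} \cdots dx_{i_r}$. Thus,

$$\begin{aligned}
\phi^*(\omega + \eta) &= \phi^*((f + g)\, dx_{i_1} \cdots dx_{i_r}) \\
&= (f \circ \phi + g \circ \phi)\, \phi^*(dx_{i_1}) \cdots \phi^*(dx_{i_r}) \\
&= \phi^*(\omega) + \phi^*(\eta). \; ∎
\end{aligned}$$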
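15.13 Remark. The differential transform $\phi^*$ is multiplicative; i.e., if $\omega$ is an $r$-form on $V$ and $\eta$ is an $s$-form on $V$ and $\phi$ is $C^1$ on $U$, then

$$\phi^*(\omega\eta) = \phi^*(\omega)\, \phi^*(\eta).$$

PROOF. We may suppose that $\omega = f\, dx_{i_1} \cdots dx_{i_r}$ and $\eta = g\, dx_{j_1} \cdots dx_{j_s}$. Thus,

$$\begin{aligned}
\phi^*(\omega\eta) &= \phi^*((fg)\, dx_{i_1} \cdots dx_{i_r}\, dx_{j_1} \cdots dx_{j_s}) \\
&= (f \circ \phi)(g \circ \phi)\, \phi^*(dx_{i_1}) \cdots \phi^*(dx_{i_r})\, \phi^*(dx_{j_1}) \cdots \phi^*(dx_{j_s}) \\
&= \phi^*(\omega)\, \phi^*(\eta). \; ∎
\end{aligned}$$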
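15.14 Remark. The differential transform $\phi^*$ and the exterior derivative $d$ commute; i.e., if $\omega$ is a $C^1$ $r$-form on $V$ and $\phi$ is $C^2$ on $U$, then

$$\phi^*(d\omega) = d(\phi^*(\omega)). \tag{5}$$

PROOF. We may suppose that $\omega$ is decomposable. The proof is by induction on $r$. Suppose that $r = 0$; i.e., $\omega = f$. Then, by definition and the Chain Rule,

$$\phi^*(d\omega) = \sum_{i=1}^{m} \left( \frac{\partial f}{\partial x_i} \circ \phi \right) d\phi_i = \sum_{j=1}^{n} \left( \sum_{i=1}^{m} \left( \frac{\partial f}{\partial x_i} \circ \phi \right) \frac{\partial \phi_i}{\partial u_j} \right) du_j = \sum_{j=1}^{n} \frac{\partial (f \circ \phi)}{\partial u_j}\, du_j = d(\phi^*(\omega)).$$

Suppose that $r = 1$; i.e., $\omega = f\, dx_k$. Then, by definition, the multiplicative property of $\phi^*$, and the case $r = 0$, we have

$$\phi^*(d\omega) = \phi^*(df\, dx_k) = \phi^*(df)\, \phi^*(dx_k) = d(f \circ \phi)\, d\phi_k.$$

On the other hand, since $\phi^*(\omega) = (f \circ \phi)\, \phi^*(dx_k) = (f \circ \phi)\, d\phi_k$, it follows from the Product Rule, the Nilpotent Property, and Theorem 15.9 that

$$d(\phi^*(\omega)) = d(f \circ \phi)\, d\phi_k + (f \circ \phi)\, d^2\phi_k = d(f \circ \phi)\, d\phi_k.$$

Thus (5) holds when $\omega$ is a 1-form.

Finally, suppose that there is an $r > 1$ such that (5) holds for all $s$-forms, $0 \le s < r$. Let $\omega$ be a decomposable $r$-form and write $\omega = \theta\eta$, where $\theta$ is a 1-form and $\eta$ is an $(r - 1)$-form. By the Product Rule,

$$d\omega = (d\theta)\eta - \theta\, d\eta.$$

Hence, it follows from the inductive hypothesis, the Product Rule, and the multiplicative property of $\phi^*$ that

$$\begin{aligned}
\phi^*(d\omega) &= \phi^*(d\theta)\, \phi^*(\eta) - \phi^*(\theta)\, \phi^*(d\eta) \\
&= d(\phi^*\theta)\, \phi^*(\eta) - \phi^*(\theta)\, d(\phi^*\eta) = d((\phi^*\theta)(\phi^*\eta)) \\
&= d(\phi^*(\theta\eta)) = d(\phi^*(\omega)). \; ∎
\end{aligned}$$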
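The following result shows that differential transforms can be used to define the oriented line and surface integrals introduced in Sections 13.2 and 13.4 (see Exercise 5).

15.15 THEOREM [FUNDAMENTAL THEOREM OF DIFFERENTIAL TRANSFORMS]. Let $m \ge n$, let $U$ be open in $\mathbf{R}^n$, let $V$ be open in $\mathbf{R}^m$, and suppose that $\phi : U \to V$ is $C^1$ on $U$. If

$$\omega = \sum_{1 \le i_1 < \cdots < i_n \le m} f_{i_1, \ldots, i_n}\, dx_{i_1} \cdots dx_{i_n}$$

is an $n$-form on $V$, then

$$\phi^*(\omega) = \sum_{1 \le i_1 < \cdots < i_n \le m} (f_{i_1, \ldots, i_n} \circ \phi)\, \frac{\partial(\phi_{i_1}, \ldots, \phi_{i_n})}{\partial(u_1, \ldots, u_n)}\, du_1 \cdots du_n.$$

PROOF. We may suppose that $\omega$ is decomposable. If $n = 1$, i.e., $\omega = f\, dx_j$, then by definition, $\phi^*(\omega) = \phi^*(f)\, \phi^*(dx_j) = (f \circ \phi)\, \phi_j'\, du$. If $n > 1$, i.e., $\omega = f\, dx_{i_1} \cdots dx_{i_n}$, then by Definition 15.10 and Theorem 15.4,

$$\phi^*(\omega) = (f \circ \phi) \left( \sum_{j=1}^{n} \frac{\partial \phi_{i_1}}{\partial u_j}\, du_j \right) \cdots \left( \sum_{j=1}^{n} \frac{\partial \phi_{i_n}}{\partial u_j}\, du_j \right) = (f \circ \phi)\, \frac{\partial(\phi_{i_1}, \ldots, \phi_{i_n})}{\partial(u_1, \ldots, u_n)}\, du_1 \cdots du_n. \; ∎$$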
EXERCISES

1. Algebraically simplify the following differential forms.

(a) $3(dx + dy)\, dz + 2(dx + dz)\, dy$.

(b) $(x\, dy - y\, dx)(x\, dz - z\, dy)$.

(c) $(x^2\, dx\, dy - \cos x\, dy\, dz)(y^2\, dy + \cos x\, dw) - (x^3\, dy\, dz - \sin x\, dy\, dw)(y^3\, dy + \sin x\, dz)$.

2. Compute the exterior derivatives of the following differential forms.

(a) $x^2\, dy - y^2\, dx$.

(b) $\sin(xy)\, dz\, dw + \cos(zw)\, dx\, dy$.

(c) $\sqrt{x^2 + y^2}\, dy\, dz - \sqrt{x^2 + y^2}\, dx\, dz$.

(d) $(e^{xy}\, dz + e^{yz}\, dx)(\sin x\, dy + \cos y\, dx)$.

3. (a) Prove that if $\omega$ is an $r$-form, $r$ odd, then $\omega^2 = 0$.

(b) Prove that if $\omega_j$ are decomposable $r$-forms, $r$ even, and $\omega = \sum_{j=1}^{N} \omega_j$, then

$$\omega^2 = 2 \sum_{\substack{k, j = 1 \\ j < k}}^{N} \omega_j \omega_k.$$

4. This exercise is used in Section 15.3. If $f, g$ are 0-forms and $\omega, \eta$ are $r$-forms, define $(f, g) \cdot (\omega, \eta) = f\omega + g\eta$.

(a) Prove that if $f : \mathbf{R}^3 \to \mathbf{R}$ is $C^1$ and $\operatorname{grad} f := (f_x, f_y, f_z)$, then the exterior derivative of the 0-form $\omega = f$ can be written in the form

$$d\omega = (\operatorname{grad} f) \cdot (dx, dy, dz).$$

(b) Prove that if $F = (P, Q, R) : \mathbf{R}^3 \to \mathbf{R}^3$ is $C^1$, then the exterior derivative of the 1-form $\omega = P\, dx + Q\, dy + R\, dz$ can be written in the form

$$d\omega = (\operatorname{curl} F) \cdot (dy\, dz, dz\, dx, dx\, dy),$$

and the exterior derivative of the 2-form $\eta = P\, dy\, dz + Q\, dz\, dx + R\, dx\, dy$ can be written in the form

$$d\eta = (\operatorname{div} F)\, dx\, dy\, dz.$$

5. This exercise is used in Section 15.3. Let $I$ be an interval and $E$ be a Jordan region. Define the integral of a continuous 1-form $\omega = f\, dt$ on $I$ and a continuous 2-form $\eta = g\, du\, dv$ on $E$ by

$$\int_I \omega = \int_I f(t)\, dt \qquad\text{and}\qquad \iint_E \eta = \iint_E g(u, v)\, d(u, v).$$

(a) Let $C = (\phi, I)$ be a smooth simple $C^1$ curve in $\mathbf{R}^2$, $F = (P, Q) : \phi(I) \to \mathbf{R}^2$ be continuous, and $\omega = P\, dx + Q\, dy$. Prove that

(b) Let $S = (\phi, E)$ be a smooth simple orientable $C^1$ surface in $\mathbf{R}^3$, let $F = (P, Q, R) : \phi(E) \to \mathbf{R}^3$ be continuous, and suppose that $\eta = P\, dy\, dz + Q\, dz\, dx + R\, dx\, dy$. Prove that
e15.2 DIFFERENTIABLE MANIFOLDS

This section uses no material from any other enrichment section.
In Chapter 13 we introduced one-dimensional objects (curves), two-dimensional objects (surfaces), and corresponding oriented integrals. We shall extend these ideas to higher dimensions.

A problem that surfaced several times in Chapter 13 is that one parametrization by itself does not fully describe a surface. For example, the boundary of a surface could not be defined using one parametrization alone. To avoid this problem, we adopt a different point of view here. Instead of thinking of a surface as a particular parametrization $\phi : E \to S$, we will think of a surface as a set of points $S$ together with a class of functions $h_\alpha : S \to E$ related to each other in a natural way. (Each $h_\alpha^{-1}$ can be thought of as a parametrization of a piece of $S$.)

This point of view has been used by mapmakers for centuries. The earth (a particular surface) can be described by an atlas, which is itself a collection of two-dimensional maps (or charts) that represent overlapping portions of its surface. If we know how the individual charts fit together (see (7)), we can study the whole surface by using this atlas.
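15.16 DEFINITION. Let $M$ be a set.
(i) An n-dimensional chart of $M$ at a point $x \in M$ is a pair $(V, h)$, where $x \in V$, $V \subseteq M$, $h : V \to \mathbf{R}^n$ is 1-1, and $h(V)$ is open in $\mathbf{R}^n$. (We shall drop the adjective "n-dimensional" when no confusion arises.)
(ii) An n-dimensional $C^p$ atlas of $M$ is a collection
$$\mathcal{A} = \{(V_\alpha, h_\alpha) : \alpha \in A\} \tag{6}$$
of n-dimensional charts of $M$ such that $h_\beta(V_\alpha \cap V_\beta)$ is open in $\mathbf{R}^n$,
$$h_\alpha \circ h_\beta^{-1} \text{ is } C^p \text{ on } h_\beta(V_\alpha \cap V_\beta) \text{ for all } \alpha, \beta \in A, \tag{7}$$
and
$$M = \bigcup_{\alpha \in A} V_\alpha.$$
The functions $h_\alpha \circ h_\beta^{-1}$ are called the transition maps of the atlas $\mathcal{A}$.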
Notice, then, that if $(V, h)$ is a chart of $M$, then $(h^{-1}, h(V))$ is a parametrization of a portion $V$ of $M$.
15.17 Example. If $C = \phi(I)$, where $(\phi, I)$ is a simple curve and $I$ is an open interval, prove that $\{(C, \phi^{-1})\}$ is a one-dimensional $C^\infty$ atlas of $C$.
PROOF. Since $(\phi, I)$ is simple, $\phi^{-1}$ exists on $\phi(I)$. $(C, \phi^{-1})$ is evidently a one-dimensional chart of $C$, so $\{(C, \phi^{-1})\}$ is an atlas of $C$. Since the transition map $(\phi^{-1} \circ \phi)(x) = x$ is the identity function on $I$, this atlas is $C^\infty$. ∎
A similar argument establishes the following two remarks.
15.18 Remark. If $V$ is open in $\mathbf{R}^2$, $\phi : V \to \mathbf{R}^m$ is 1-1 on $V$, and $S = \phi(V)$, then $\{(S, \phi^{-1})\}$ is a two-dimensional $C^\infty$ atlas of $S$.
15.19 Remark. If $V$ is open in $\mathbf{R}^n$ and $I(x) = x$ is the identity function on $\mathbf{R}^n$, then $\{(V, I)\}$ is an n-dimensional $C^\infty$ atlas of $V$. (We shall call $(V, I)$ the identity chart.)
Not all atlases consist of one chart.
15.20 Example. For each $t \in \mathbf{R}$, set $\phi(t) = (\cos t, \sin t)$, $\psi(t) = (\cos(t + \pi), \sin(t + \pi))$, $V = \phi(I)$, and $U = \psi(I)$, where $I = (0, 2\pi)$. If $h = \phi^{-1}$ on $V$ and $g = \psi^{-1}$ on $U$, prove that $\mathcal{A} = \{(V, h), (U, g)\}$ is a one-dimensional $C^\infty$ atlas of the unit circle $x^2 + y^2 = 1$.
PROOF. Let $M$ represent the set of points $(x, y)$ such that $x^2 + y^2 = 1$. Since $\phi$ (respectively, $\psi$) is 1-1 from $I$ onto $V$ (respectively, $I$ onto $U$) and $V \cup U = M$, $(V, h)$ and $(U, g)$ are charts that cover $M$. It is easy to see that the transition maps are $C^\infty$. For example, $g(V \cap U) = (0, \pi) \cup (\pi, 2\pi)$ and, on the interval $(0, \pi)$, $(h \circ g^{-1})(t) = t + \pi$. Thus $\mathcal{A}$ is a $C^\infty$ atlas of $M$. ∎
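The transition map of Example 15.20 can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, the chart $h = \phi^{-1}$ is realized with `math.atan2`, and the formula $t - \pi$ on $(\pi, 2\pi)$ is not stated in the text but follows from the same computation as the stated formula $t + \pi$ on $(0, \pi)$.

```python
import math

def psi(t):
    """Parametrization psi(t) = (cos(t + pi), sin(t + pi)) from Example 15.20."""
    return (math.cos(t + math.pi), math.sin(t + math.pi))

def h(point):
    """Chart h = phi^{-1}: the angle of `point`, reduced to [0, 2*pi)."""
    x, y = point
    return math.atan2(y, x) % (2 * math.pi)

def transition(t):
    """Transition map (h o g^{-1})(t) = h(psi(t)) for t in g(V n U)."""
    return h(psi(t))

# Sample both components of g(V n U) = (0, pi) U (pi, 2*pi).
# The value t = pi is excluded because psi(pi) = (1, 0) does not lie in V.
left = [0.1 * k for k in range(1, 31)]             # 0.1, ..., 3.0 (all < pi)
right = [math.pi + 0.1 * k for k in range(1, 31)]  # all < 2*pi

max_err_left = max(abs(transition(t) - (t + math.pi)) for t in left)
max_err_right = max(abs(transition(t) - (t - math.pi)) for t in right)
```

On each component the transition map is a translation, which is why it is $C^\infty$ with derivative 1.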
The following concept is a replacement for smooth equivalence of parametrizations.
15.21 DEFINITION. Two n-dimensional atlases $\mathcal{A}$, $\mathcal{B}$ of $M$ are said to be $C^p$ compatible (notation: $\mathcal{A} \sim \mathcal{B}$) if and only if $\mathcal{A} \cup \mathcal{B}$ is an n-dimensional $C^p$ atlas on $M$.
Notice that $C^p$ compatibility is an equivalence relation (see Exercise 2), i.e., any atlas $\mathcal{A}$ is $C^p$ compatible with itself; if $\mathcal{A}$ is $C^p$ compatible with $\mathcal{B}$, then $\mathcal{B}$ is $C^p$ compatible with $\mathcal{A}$; and if $\mathcal{A}$ is $C^p$ compatible with $\mathcal{B}$, and $\mathcal{B}$ is $C^p$ compatible with $\mathcal{C}$, then $\mathcal{A}$ is $C^p$ compatible with $\mathcal{C}$. (For some elementary remarks about equivalence relations and equivalence classes, see Appendix F.)
15.22 DEFINITION. An n-dimensional $C^p$ manifold is a set $M$ together with an equivalence class $\mathcal{A}$ of n-dimensional $C^p$ atlases on $M$. By an atlas of $M$ we mean an atlas in $\mathcal{A}$. By a chart of $M$ we mean a chart in some atlas of $M$.
Atlases of an n-dimensional manifold $M$ can be used to "pull back" concepts from $\mathbf{R}^n$ to $M$.
15.23 DEFINITION. Let $M$ be an n-dimensional $C^p$ manifold and let $\mathcal{A}$ be an atlas of $M$. A set $W \subseteq M$ is said to be open if and only if $h(V \cap W)$ is open in $\mathbf{R}^n$ for all charts $(V, h) \in \mathcal{A}$.
The following result shows that this definition does not depend on the atlas chosen from the manifold structure of $M$.
15.24 Remark. Let $\mathcal{A}$ and $\mathcal{B}$ be $C^p$ compatible atlases of $M$ and suppose that $W \subseteq M$. Then $h(V \cap W)$ is open in $\mathbf{R}^n$ for all $(V, h) \in \mathcal{A}$ if and only if $g(U \cap W)$ is open in $\mathbf{R}^n$ for all $(U, g) \in \mathcal{B}$.
PROOF. Let $W \subseteq M$ such that $h(V \cap W)$ is open in $\mathbf{R}^n$ for all $(V, h) \in \mathcal{A}$ and suppose that $(U, g) \in \mathcal{B}$. If $W \cap U = \emptyset$, then $g(W \cap U) = \emptyset$ is open in $\mathbf{R}^n$ by definition. If $W \cap U \ne \emptyset$, choose $(V, h) \in \mathcal{A}$ such that $W \cap V \cap U \ne \emptyset$. Since $h(W \cap V)$ and $g(U)$ are open in $\mathbf{R}^n$ and the transition map $h \circ g^{-1}$ is $C^p$, hence continuous, it follows that
$$g(W \cap V \cap U) = (g \circ h^{-1})(h(W \cap V)) \cap g(U) = (h \circ g^{-1})^{-1}(h(W \cap V)) \cap g(U)$$
is open in $\mathbf{R}^n$. Since
$$g(W \cap U) = \bigcup_{(V,h) \in \mathcal{A}} g(W \cap V \cap U),$$
we conclude that $g(W \cap U)$ is open in $\mathbf{R}^n$. Reversing the roles of $\mathcal{A}$ and $\mathcal{B}$ proves the converse. ∎
Using open sets, we can define what we mean by continuity of a function on a manifold (compare with Exercise 6, p. 276, or Theorem 10.58).
15.25 DEFINITION. Let $M$ be an n-dimensional $C^p$ manifold and let $\mathcal{A}$ be an atlas of $M$.
(i) A function $f : M \to \mathbf{R}^k$ is said to be continuous on $M$ if and only if $f^{-1}(U)$ is open in $M$ for every open set $U \subset \mathbf{R}^k$.
(ii) A function $f : \mathbf{R}^k \to M$ is said to be continuous on a set $E \subset \mathbf{R}^k$ if and only if $f^{-1}(W) \cap E$ is relatively open in $E$ for every open set $W$ in $M$.
15.26 Remark. If $(V, h)$ is a chart from an atlas $\mathcal{A}$ of $M$, then $h$ is a homeomorphism; i.e., $h$ is continuous on $V$ and $h^{-1}$ is continuous on $h(V)$.
PROOF. If $W \subseteq V$ is open in $M$, then $(h^{-1})^{-1}(W) = h(W)$ is open in $\mathbf{R}^n$. Hence, $h^{-1}$ is continuous on $h(V)$ by Definition 15.25. On the other hand, suppose that $O \subset h(V)$ is open in $\mathbf{R}^n$ and $W = h^{-1}(O)$. Let $(U, g)$ be any chart in $\mathcal{A}$. Then $g(U \cap W) = g(U) \cap (g \circ h^{-1})(O)$ is open in $\mathbf{R}^n$; i.e., $W$ is open in $M$ by Definition 15.23. Hence, $h$ is continuous on $V$. ∎
[Figure 15.1: a chart $h$ carrying a point $x$ of the surface to $h(x) \in h(V)$.]
To define the boundary of a manifold, we introduce the following terminology. By a half-space of $\mathbf{R}^n$ we mean a set of the form
$$\{(x_1, \dots, x_n) : x_j \ge a\} \quad \text{or} \quad \{(x_1, \dots, x_n) : x_j \le a\},$$
where $a \in \mathbf{R}$ and $j \in \{1, 2, \dots, n\}$. We shall refer to the special case
$$\mathcal{H}_1 := \{(x_1, \dots, x_n) : x_1 \le 0\}$$
as left half-space. If $n = 2$, we shall refer to half-spaces as half-planes.
A simple curve parametrized on an open interval is a one-dimensional manifold (see Example 15.17). What about surfaces? A smooth surface with empty boundary is a two-dimensional manifold, but the restriction in Definition 15.16i that $h(V)$ be open prevents any surface whose boundary is nonempty from being a manifold. For example, the cylinder $x^2 + y^2 = 1$, $0 \le z \le 1$, does not satisfy Definition 15.16 because there is no way to construct an "open" chart at points on its boundary (see Remark 15.29). Loosely speaking, this is because at a point on the boundary, the surface does not look like an open set but rather like a relatively open set in a half-plane (see Figure 15.1). Accordingly, we make the following definition.
15.27 DEFINITION. Let $M$ be a set.
(i) An n-dimensional chart-with-smooth-boundary of $M$ at a point $x \in M$ is a pair $(V, h)$, where $x \in V$, $V \subseteq M$, $h : V \to \mathbf{R}^n$ is 1-1 on $V$, and $h(V)$ is relatively open in some half-space $\mathcal{H}$ of $\mathbf{R}^n$. If $h(V) \cap \partial\mathcal{H} = \emptyset$, then $(V, h)$ is called an interior chart. If $h(V) \cap \partial\mathcal{H} \ne \emptyset$, then $(V, h)$ is called a boundary chart.
(ii) An n-dimensional $C^p$ atlas-with-smooth-boundary of $M$ is a collection
$$\mathcal{A} = \{(V_\alpha, h_\alpha) : \alpha \in A\} \tag{8}$$
of n-dimensional charts-with-smooth-boundary of $M$ such that $h_\beta(V_\alpha \cap V_\beta)$ is relatively open in some half-space $\mathcal{H}$,
$$h_\alpha \circ h_\beta^{-1} \text{ is } C^p \text{ on } h_\beta(V_\alpha \cap V_\beta) \text{ for all } \alpha, \beta \in A, \tag{9}$$
and
$$M = \bigcup_{\alpha \in A} V_\alpha.$$
The functions $h_\alpha \circ h_\beta^{-1}$ are called the transition maps of the atlas $\mathcal{A}$.
(iii) Two n-dimensional atlases-with-smooth-boundary $\mathcal{A}$, $\mathcal{B}$ of $M$ are said to be $C^p$ compatible (notation: $\mathcal{A} \sim \mathcal{B}$) if and only if $\mathcal{A} \cup \mathcal{B}$ is an n-dimensional $C^p$ atlas-with-smooth-boundary on $M$.
It is easy to check that $C^p$ compatibility of atlases-with-smooth-boundary is an equivalence relation. We also note that, since any open subset of $\mathcal{H}$ is relatively open in $\mathcal{H}$, every atlas is an atlas-with-smooth-boundary. We now expand the definition of manifold, using atlases-with-smooth-boundary.
15.28 DEFINITION. (i) An n-dimensional $C^p$ manifold-with-smooth-boundary is a set $M$ together with an equivalence class of n-dimensional atlases-with-smooth-boundary.
(ii) A point $x \in M$ is said to be a boundary point if and only if it belongs to $V$ for some boundary chart $(V, h)$ of $M$, with $h(V)$ relatively open in some half-space $\mathcal{H}$, and $h(x) \in \partial\mathcal{H}$. The collection of all boundary points, called the boundary of $M$, is denoted by $\partial M$.
The following result shows that if the transition maps of a manifold have nonzero Jacobian, then the definition of boundary point does not depend on the chart $(V, h)$.
15.29 Remark. Let $M$ be an n-dimensional $C^p$ manifold-with-smooth-boundary, $x \in M$, and $(U, g)$, $(V, h)$ be charts of $M$ at $x$ whose transition map $\phi = g \circ h^{-1}$ satisfies [...]
$$g(x) \in \phi(\Omega) \subseteq (g \circ h^{-1})(h(U \cap V)) = g(U \cap V) \subseteq g(U).$$
Hence, $g(x)$ belongs to the interior of $g(U)$, i.e., cannot belong to $\partial\mathcal{H}$. ∎
Is Definition 15.28 general enough to include every smooth surface whose boundary is made up of smooth curves? At first glance, the answer to this question seems to be no because of the restriction that $h(V)$ be relatively open in some half-plane, i.e., part of its boundary be a straight line. Nevertheless, if $S$ is a smooth surface with smooth boundary, one can always find a smoothly equivalent parametrization $(\psi, B)$ of $S$ such that $\partial B$ is made up of straight lines (see Munkres [8], p. 51). In particular, every smooth surface with smooth boundary is a two-dimensional manifold-with-smooth-boundary.
Our goal is to prove Stokes's Theorem for manifolds-with-smooth-boundary. We shall deal exclusively with manifolds $M$ that are subsets of $\mathbf{R}^m$ for some $m \ge n$. In this case we have two competing concepts: open sets defined by the manifold structure (Definition 15.23) and open sets defined by the relative topology (Definition 8.26 or 10.54). The purpose of the following definition is to make sure that these concepts coincide.
15.30 DEFINITION. An n-dimensional manifold $M$ is said to be continuously embedded in $\mathbf{R}^m$ if and only if the following three conditions hold:
(i) $m \ge n$;
(ii) $M$ is a closed subset of $\mathbf{R}^m$;
(iii) if $(V, h)$ is a chart of $M$, then $V$ is relatively open in $M$ and $h$ is a homeomorphism from the relative topology on $V$ to the usual topology on $\mathbf{R}^n$; i.e., if $U$ is relatively open in $V$ then $h(U)$ is open in $\mathbf{R}^n$, and if $O$ is open in $\mathbf{R}^n$, then $h^{-1}(O) \cap V$ is relatively open in $V$.
From now on, by a manifold we mean a manifold-with-smooth-boundary (whose boundary may or may not be empty) that is continuously embedded in some $\mathbf{R}^m$. We now define what it means for a manifold to be orientable.
15.31 DEFINITION. Let $m \ge n$ and $M \subset \mathbf{R}^m$.
(i) A $C^p$ atlas (8) is said to be oriented if and only if
$$\Delta_{h_\alpha \circ h_\beta^{-1}}(u) > 0 \tag{10}$$
for all $u \in h_\beta(V_\alpha \cap V_\beta)$ and $\alpha, \beta \in A$ (compare with Definition 13.42).
(ii) An n-dimensional $C^p$ manifold $M$ is said to be orientable if and only if it has an oriented atlas.
(iii) Two oriented $C^p$ atlases $\mathcal{A}$, $\mathcal{B}$ of an orientable manifold $M$ are said to be orientation compatible if and only if $\mathcal{A} \cup \mathcal{B}$ is an oriented atlas. (Note that orientation compatibility is an equivalence relation.)
(iv) An orientation of an n-dimensional orientable $C^p$ manifold $M$ is an equivalence class of oriented $C^p$ atlases. If $\mathcal{A}$ is an oriented $C^p$ atlas of $M$, then the orientation generated by $\mathcal{A}$ is the orientation of $M$ that contains $\mathcal{A}$.
An orientation of a manifold $M$ can be used to induce an orientation of $\partial M$ in the following way.
15.32 DEFINITION. Let $M$ be a manifold with orientation $\mathcal{O}$ and let $\mathcal{A}$ be an atlas of $M$ consisting of charts $(V, h)$ from $\mathcal{O}$ that satisfy $h_1(x) \le 0$ for all $x \in V$, and $h_1(x) = 0$ if and only if $x \in \partial M \cap V$. The orientation induced on $\partial M$ by $\mathcal{A}$ is the orientation of $\partial M$ generated by the atlas
$$\tilde{\mathcal{A}} = \{(\tilde V, \tilde h) : (V, h) \in \mathcal{A}\},$$
where
$$\tilde V = V \cap \partial M \quad \text{and} \quad \tilde h(x) = (h_2(x), \dots, h_n(x)).$$
The following result shows that $\tilde{\mathcal{A}}$ is an oriented atlas of $\partial M$ when $\mathcal{O}$ is an orientation of $M$.
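15.33 Remark. Suppose that $(V, h)$ and $(U, g)$ are charts from an orientation $\mathcal{O}$ of an oriented manifold $M$ that satisfy $h_1(x) \le 0$ (respectively, $g_1(x) \le 0$) for $x \in V$ (respectively, $x \in U$), and $h_1(x) = 0$ (respectively, $g_1(x) = 0$) if and only if $x \in \partial M \cap V$ (respectively, $x \in \partial M \cap U$). If $\tilde h = (h_2, \dots, h_n)$ and $\tilde g = (g_2, \dots, g_n)$, then $\Delta_{\tilde h \circ \tilde g^{-1}}(u) > 0$ for all $u \in \tilde g(\tilde V \cap \tilde U)$.
PROOF. Let $(t, u) = (t, u_2, \dots, u_n)$ represent a general point in $\mathbf{R}^n$, $\phi = h \circ g^{-1}$ be the transition from $g(U)$ to $h(V)$, and $\phi_1$ be the first component of $\phi$. By Remark 15.29, $\phi$ takes boundary points to boundary points. Since $h_1(x) = g_1(x) = 0$ for $x \in \partial M \cap V \cap U$, it follows that $\phi_1(0, u) = 0$ for all $u \in \tilde g(\tilde V \cap \tilde U)$. Consequently, the partial derivatives of $\phi_1$ with respect to $u_2, \dots, u_n$ vanish at $(0, u)$, and the first row of the Jacobian matrix $D(h \circ g^{-1})(0, u)$ is given by
$$\left(\frac{\partial \phi_1}{\partial t}(0, u),\ 0,\ \dots,\ 0\right).$$
It follows that
$$\Delta_{h \circ g^{-1}}(0, u) = \frac{\partial \phi_1}{\partial t}(0, u) \cdot \Delta_{\tilde h \circ \tilde g^{-1}}(u).$$
Moreover, the conditions $h_1 \le 0$ on $V$ and $g_1 \le 0$ on $U$ imply
$$\frac{\partial \phi_1}{\partial t}(0, u) = \lim_{t \to 0-} \frac{\phi_1(t, u) - \phi_1(0, u)}{t} = \lim_{t \to 0-} \frac{\phi_1(t, u)}{t} \ge 0.$$
Since $\Delta_{h \circ g^{-1}} > 0$ on $g(V \cap U)$, this derivative cannot be zero, and we conclude that $\Delta_{\tilde h \circ \tilde g^{-1}}(u) > 0$ for each $u \in \tilde g(\tilde V \cap \tilde U)$. ∎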
For two-dimensional manifolds, the condition $h_1(x) \le 0$ makes the induced orientation, as defined above, agree with the right-handed orientation introduced in Section 13.4 (see Figure 15.1).
Our definition of n-dimensional manifolds is quite general, but not general enough. It does not include n-dimensional rectangles. (There is no way to parametrize a corner of a two-dimensional rectangle by using relatively open sets in a half-plane.) One way to fix this is to extend the definition of charts to include "corner" charts. This extension is still not general enough to include all piecewise smooth curves; for example, it does not include curves with cusps (e.g., $y = x^{2/3}$). The theory can be extended once again by taking limits of "manifolds with corners." For details, see Loomis and Sternberg [6].
We will take a less ambitious approach by treating the rectangular case separately. By a chart of an n-dimensional region $R$ in $\mathbf{R}^n$ (this includes all n-dimensional rectangles) we mean a pair $(E, h)$ where $R^o \subseteq E \subseteq R$ and $h : V \to \mathbf{R}^n$ is 1-1 and continuously differentiable on some open set $V$ that contains $R$ with $\Delta_h \ne 0$ on $V$. (Notice that, by the Inverse Function Theorem, $h$ is a homeomorphism on $V$ and that, by Corollary 12.10iii, $h(E)$ is a Jordan region.) Using such charts, we can define atlases and manifolds in the same way as above. The end result is that we can consider any n-dimensional region in $\mathbf{R}^n$ to be a manifold.
Notice that if $I$ represents the identity function on $\mathbf{R}^n$, i.e., $I(x) = x$ for all $x \in \mathbf{R}^n$, and if $R$ is an n-dimensional rectangle, then $\{(R, I)\}$ is an atlas of $R$. The orientation generated by the identity chart is called the usual orientation on $R$. Also notice that since $R$ is closed, the "manifold" boundary of $R$ is precisely its topological boundary.
When $R$ is a rectangle, what orientation is induced on $\partial R$ by the usual orientation of $R$? The following result answers this question by showing how to find an atlas of the induced orientation for arbitrary $n \in \mathbf{N}$. Notice that, for the special case $n = 2$, this orientation on $\partial R$ is counterclockwise orientation (see Figure 15.2).
[Figure 15.2: the rectangle with vertices $(a_1, a_2)$, $(b_1, a_2)$, $(b_1, b_2)$, $(a_1, b_2)$, with arrows indicating the orientation of its sides.]
15.34 THEOREM. Let $R = [a_1, b_1] \times \cdots \times [a_n, b_n]$ be an n-dimensional rectangle. For each $j = 1, \dots, n$, set
$$R_j^\ell = [a_1, b_1] \times \cdots \times \{a_j\} \times \cdots \times [a_n, b_n],$$
$$R_j^r = [a_1, b_1] \times \cdots \times \{b_j\} \times \cdots \times [a_n, b_n],$$
$$h_j(x_1, \dots, x_n) = \begin{cases} (a_1 - x_1, -x_2, x_3, \dots, x_n) & j = 1 \\ (a_j - x_j, (-1)^j x_1, x_2, \dots, \widehat{x_j}, \dots, x_n) & j \ne 1 \end{cases}$$
and
$$g_j(x_1, \dots, x_n) = \begin{cases} [\dots] & j = 1 \\ [\dots] & j \ne 1. \end{cases}$$
(The notation $\widehat{x_j}$ indicates that this variable is missing.) If $V_j = R^o \cup R_j^\ell$ and $U_j = R^o \cup R_j^r$, then
$$\mathcal{A} = \{(V_j, h_j), (U_j, g_j) : j = 1, \dots, n\}$$
is an oriented atlas of $R$ that is compatible with the usual orientation. In particular, if $\tilde V_j = V_j \cap R_j^\ell$, $\tilde U_j = U_j \cap R_j^r$, $\tilde h_j$, and $\tilde g_j$ are defined as in Definition 15.32, then
$$\tilde{\mathcal{A}} = \{(\tilde V_j, \tilde h_j), (\tilde U_j, \tilde g_j) : j = 1, \dots, n\}$$
is an oriented atlas of $\partial R$ that belongs to the orientation induced by the usual orientation.
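PROOF. Fix $1 \le j \le n$ and let $I(x) = x$ represent the identity function on $\mathbf{R}^n$. By definition, if $j = 1$, then
$$\Delta_{h_1 \circ I^{-1}} = \det \begin{bmatrix} -1 & 0 & 0 & \cdots & 0 \\ 0 & -1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix} = 1 > 0.$$
If $j \ne 1$, then by factoring $-1$ out of the first row and interchanging $j - 1$ rows we have
$$\Delta_{h_j \circ I^{-1}} = (-1)^j \det \begin{bmatrix} (-1)^j & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix} = (-1)^{2j} = 1 > 0.$$
Thus the chart $(V_j, h_j)$ is compatible with the usual orientation on $R$.
Let $h_{j1}$ represent the first component of the function $h_j$, $j = 1, \dots, n$. Clearly, if $x \in R$, then $h_{j1}(x) \le 0$; and $h_{j1}(x) = 0$ if and only if $x_j = a_j$, i.e., if and only if $x \in R_j^\ell$. Thus $(\tilde V_j, \tilde h_j)$ belongs to the orientation induced on $\partial R$ by the usual orientation. A similar argument works for the "right-hand" boundaries $R_j^r$. ∎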
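The determinant computation in this proof can be confirmed mechanically for small dimensions. The following is a minimal sketch, not part of the text: `chart_matrix` and `det` are hypothetical helper names, and the matrix built is the Jacobian of $h_j$ as defined in Theorem 15.34 (the constant shift $a_j$ does not affect it).

```python
def chart_matrix(n, j):
    """Jacobian matrix of h_j from Theorem 15.34 (1-based j, n >= 2)."""
    def e(k, sign=1):
        # Row vector sign * e_k in R^n (k is 1-based).
        row = [0] * n
        row[k - 1] = sign
        return row

    if j == 1:
        # h_1(x) = (a_1 - x_1, -x_2, x_3, ..., x_n)
        return [e(1, -1), e(2, -1)] + [e(k) for k in range(3, n + 1)]
    # h_j(x) = (a_j - x_j, (-1)^j x_1, x_2, ..., x_j omitted, ..., x_n)
    return [e(j, -1), e(1, (-1) ** j)] + [e(k) for k in range(2, n + 1) if k != j]

def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        if entry != 0:
            minor = [row[:col] + row[col + 1:] for row in m[1:]]
            total += (-1) ** col * entry * det(minor)
    return total

# Jacobians of h_j for every face index j in dimensions 2 through 6.
dets = {(n, j): det(chart_matrix(n, j)) for n in range(2, 7) for j in range(1, n + 1)}
```

Every entry of `dets` should equal 1, matching the value $(-1)^{2j} = 1$ obtained in the proof.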
We mentioned in Chapter 13 that a connected smooth curve or connected smooth orientable surface has only two orientations. This is a general principle shared by all connected orientable $C^1$ manifolds (see Theorem 15.36). First, we prove the following result.
15.35 Lemma. If $M$ is a connected orientable $C^1$ manifold and $\mathcal{A}$, $\mathcal{B}$ are oriented atlases of $M$, then either
$$\Delta_{h \circ g^{-1}}(u) > 0$$
for all $(V, h) \in \mathcal{A}$, $(U, g) \in \mathcal{B}$, and $u \in g(V \cap U)$, or
$$\Delta_{h \circ g^{-1}}(u) < 0$$
for all $(V, h) \in \mathcal{A}$, $(U, g) \in \mathcal{B}$, and $u \in g(V \cap U)$.
PROOF. Set
$$A = \{x \in M : \Delta_{h \circ g^{-1}}(g(x)) > 0 \text{ for some } (V, h) \in \mathcal{A} \text{ and } (U, g) \in \mathcal{B}\}$$
and
$$B = \{x \in M : \Delta_{h \circ g^{-1}}(g(x)) < 0 \text{ for some } (V, h) \in \mathcal{A} \text{ and } (U, g) \in \mathcal{B}\}.$$
We must show that $M = A$ or $M = B$. Since $M$ is connected, it suffices to show that $A$ and $B$ are (relatively) open in $M$, $M = A \cup B$, and $A \cap B = \emptyset$ (see Definition 8.28 or 10.53).
To show that $A$ is open in $M$, let $x_0 \in A$ and choose $(V, h) \in \mathcal{A}$, $(U, g) \in \mathcal{B}$ such that $\Delta_{h \circ g^{-1}}(g(x_0)) > 0$. Then $x_0 \in \Omega := g^{-1}(\Delta_{h \circ g^{-1}}^{-1}((0, \infty)))$. Since $h \circ g^{-1}$ is continuously differentiable on $g(U)$, its Jacobian is continuous on $g(U)$. Hence $\Omega$, the inverse image of the open set $(0, \infty)$ under the continuous function $\Delta_{h \circ g^{-1}} \circ g$, must be open in $M$ (Exercise 6, p. 276, or Theorem 10.58). It follows that $A$ is open in $M$. A similar argument proves that $B$ is open in $M$.
To show that $M = A \cup B$ we must show that $\Delta_{h \circ g^{-1}}(g(x)) \ne 0$ for all $x \in M$. Suppose to the contrary that $\Delta_{h \circ g^{-1}}(g(x)) = 0$ for some $x \in M$. Since $\mathcal{A}$ is an atlas of $M$, choose $(W, \sigma) \in \mathcal{A}$ such that $x \in W$ and set $u = \sigma(x)$. By (10) and the Chain Rule,
$$0 < \Delta_{h \circ \sigma^{-1}}(u) = \Delta_{h \circ g^{-1}}(g(x))\,\Delta_{g \circ \sigma^{-1}}(u) = 0,$$
a contradiction. Thus $M = A \cup B$.
Finally, to show that $A \cap B$ is empty, suppose to the contrary that there is an $x \in A \cap B$. By definition, this means that there exist charts $(V_i, h_i) \in \mathcal{A}$ and $(U_i, g_i) \in \mathcal{B}$ such that
$$\Delta_{h_1 \circ g_1^{-1}}(g_1(x)) > 0 \quad \text{and} \quad \Delta_{h_2 \circ g_2^{-1}}(g_2(x)) < 0 \tag{11}$$
for $i = 1, 2$. Since $\mathcal{A}$ is oriented, we have by (10) and the Chain Rule that
$$\begin{aligned} 0 < \Delta_{h_1 \circ h_2^{-1}}(h_2(x)) &= \Delta_{h_1 \circ g_1^{-1}}(g_1 \circ g_2^{-1} \circ g_2 \circ h_2^{-1} \circ h_2(x)) \\ &\qquad \cdot \Delta_{g_1 \circ g_2^{-1}}(g_2 \circ h_2^{-1} \circ h_2(x))\,\Delta_{g_2 \circ h_2^{-1}}(h_2(x)) \\ &= \Delta_{h_1 \circ g_1^{-1}}(g_1(x))\,\Delta_{g_1 \circ g_2^{-1}}(g_2(x))\,\Delta_{g_2 \circ h_2^{-1}}(h_2(x)). \end{aligned}$$
By (11), the first (respectively, third) of these factors is positive (respectively, negative). Hence, the second factor must be negative. But the second factor is positive since both $g_1$ and $g_2$ come from the same oriented atlas $\mathcal{B}$. This contradiction proves the lemma. ∎
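15.36 THEOREM. Let $M$ be a connected orientable $C^p$ manifold. Then $M$ has exactly two orientations.
PROOF. We first show that $M$ has at least two orientations. Let $\mathcal{A} = \{(V_\alpha, h_\alpha) : \alpha \in A\}$ be an oriented atlas of $M$ with $h_\alpha = (h_{\alpha 1}, \dots, h_{\alpha n})$, and consider
$$\mathcal{B} = \{(V_\alpha, g_\alpha) : \alpha \in A\},$$
where $g_\alpha = (-h_{\alpha 1}, h_{\alpha 2}, \dots, h_{\alpha n})$. Clearly, $\mathcal{B}$ is a $C^p$ atlas of $M$. Since
$$\Delta_{g_\alpha \circ g_\beta^{-1}}(u_1, u_2, \dots, u_n) = \Delta_{h_\alpha \circ h_\beta^{-1}}(-u_1, u_2, \dots, u_n) > 0,$$
$\mathcal{B}$ is oriented. Since
$$\Delta_{g_\alpha \circ h_\beta^{-1}} = -\Delta_{h_\alpha \circ h_\beta^{-1}} < 0,$$
$\mathcal{B}$ is not orientation compatible with $\mathcal{A}$. Thus $M$ has at least two orientations.
To show that $M$ has no more than two orientations, suppose to the contrary that $M$ has three distinct orientations. Let $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{O}$ be atlases from each of these orientations, and choose $(V, h) \in \mathcal{A}$, $(U, g) \in \mathcal{B}$ such that $V \cap U \ne \emptyset$. Since these orientations are distinct, there exist $(W_i, \sigma_i) \in \mathcal{O}$, $i = 1, 2$, such that
$$\Delta_{h \circ \sigma_1^{-1}}(\sigma_1(x)) < 0 \quad \text{and} \quad \Delta_{\sigma_2 \circ g^{-1}}(g(y)) < 0$$
for some $x \in V \cap W_1$ and $y \in U \cap W_2$. By Lemma 15.35, $\Delta_{h \circ \sigma^{-1}} \circ \sigma < 0$ and $\Delta_{\sigma \circ g^{-1}} \circ g < 0$ on $M$ for all $(W, \sigma) \in \mathcal{O}$. Let $x \in V \cap U$ and choose $(W, \sigma) \in \mathcal{O}$ such that $x \in W$. By the Chain Rule,
$$\Delta_{h \circ g^{-1}}(g(x)) = \Delta_{h \circ \sigma^{-1}}(\sigma(x))\,\Delta_{\sigma \circ g^{-1}}(g(x)).$$
This is a product of two negative numbers, hence, positive. It follows from Lemma 15.35 that $\Delta_{h \circ g^{-1}} \circ g > 0$ on $U$ for all $(V, h) \in \mathcal{A}$ and $(U, g) \in \mathcal{B}$. Therefore, $\mathcal{A}$ is orientation compatible with $\mathcal{B}$, a contradiction. ∎
EXERCISES
1. Let $M$ be a $C^p$ manifold (not necessarily continuously embedded in some $\mathbf{R}^m$).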
(a) If $\{V_\alpha\}_{\alpha \in A}$ is a collection of open sets in $M$, prove that $\bigcup_{\alpha \in A} V_\alpha$ is open in $M$.
(b) If $V_1, \dots, V_N$ are open in $M$, prove that $\bigcap_{j=1}^N V_j$ is open in $M$.
2. Prove that $C^p$ compatibility and orientation compatibility are equivalence relations.
3. Prove that the boundary of an n-dimensional $C^p$ manifold-with-smooth-boundary is an $(n-1)$-dimensional manifold.
4. This exercise is used in Section 15.3. Translation on $\mathbf{R}^n$ by an $a \in \mathbf{R}^n$ is defined by $\sigma(x) = x + a$ for $x \in \mathbf{R}^n$. Dilation on $\mathbf{R}^n$ by a $\delta > 0$ is defined by $\sigma(x) = \delta x$ for $x \in \mathbf{R}^n$.
(a) Prove that if $\mathcal{A}$ is an oriented atlas of a manifold $M$ and $\sigma$ is a translation or a dilation, then $\mathcal{B} = \{(V, \sigma \circ h) : (V, h) \in \mathcal{A}\}$ is an atlas of $M$ that is orientation compatible with $\mathcal{A}$.
(b) Let $\mathcal{A}$ be an orientation of a manifold $M$ and let $x \in M$ be an interior point. Prove that there is a chart $(V, h)$ at $x$ such that $h(V) = B_1(0)$ and $h(x) = 0$.
5. Let $\mathcal{A}$ be an n-dimensional $C^\infty$ atlas of a manifold $M$. A function $f : M \to \mathbf{R}^k$ is said to be $C^p$ on $M$ if and only if $f \circ h^{-1} : h(V) \to \mathbf{R}^k$ is $C^p$ for all charts $(V, h) \in \mathcal{A}$.
(a) Prove that this definition is independent of the atlas $\mathcal{A}$.
(b) Prove that the composition of $C^p$ functions is a $C^p$ function.
(c) Prove that if $(V, h)$ is a chart of $M$, then $h$ is a $C^\infty$ function on $V$.
6. Prove that the sphere $x_1^2 + \cdots + x_n^2 = a^2$ is an $(n-1)$-dimensional manifold in $\mathbf{R}^n$.
e15.3 STOKES'S THEOREM ON MANIFOLDS. This section uses material from Sections 9.5, 12.5, 15.1, and 15.2.
We shall define oriented integrals of n-forms on n-dimensional manifolds and obtain a fundamental theorem of calculus for these integrals. Recall that, for us, a manifold $M$ is closed and continuously embedded in $\mathbf{R}^m$ (see Definition 15.30). In particular, given $W$ open in $M$, there is an open set $O \subset \mathbf{R}^m$ such that $O \cap M = W$. This assumption is not essential, and all results stated in this section are valid without it. We make it to simplify the proof that partitions of unity exist on a manifold.
15.37 Lemma [$C^\infty$ PARTITIONS OF UNITY ON A COMPACT MANIFOLD]. Let $M$ be a compact n-dimensional $C^p$ manifold in $\mathbf{R}^m$ with orientation $\mathcal{O}$, and let $\{U_\alpha\}_{\alpha \in A}$ be an open covering of $M$. Then there exist $C^\infty$ functions [...] $(1 - \psi_1) \cdots (1 - \psi_N)$. Since $\{O_j\}$ covers $M$, this verifies (iv). ∎
We shall call the functions [...]
$$\omega = f\,dx_{i_1} \cdots dx_{i_r},$$
where $f$ is a 0-form on some open set $O$ that contains $M$. We are now prepared to define the integral of a differential form on an oriented manifold. (This definition includes oriented line integrals and oriented surface integrals; see Exercise 5, p. 549.)
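15.38 DEFINITION. Let $m \ge n$, let $M$ be an n-dimensional oriented $C^p$ manifold in $\mathbf{R}^m$, let $\omega$ be a continuous n-form on $M$, and suppose that $\mathcal{O}$ is an orientation of $M$.
(i) If $\phi : V \to \mathbf{R}$ is continuous on $V$ for some chart $(V, h) \in \mathcal{O}$, if $\operatorname{spt} \phi \cap M \subseteq V$, and if $f\,dx_1 \cdots dx_n := (h^{-1})^*(\phi\omega)$, then the oriented integral of $\phi\omega$ on $M$ is defined by the following Riemann integral:
$$\int_M \phi\omega = \int_{h(V)} f(x)\,dx_1 \cdots dx_n.$$
(ii) If $M$ is compact, then the oriented integral of $\omega$ on $M$ is defined by
$$\int_M \omega = \sum_{j=1}^N \int_M \phi_j\omega,$$
where $\phi_1, \dots, \phi_N$ is any $C^\infty$ partition of unity on $M$ subordinate to the orientation $\mathcal{O}$.
The following two remarks show that these definitions make sense.
15.39 Remark. The value of $\int_M \phi\omega$ does not depend on the chart chosen.
PROOF. Let $(U, g) \in \mathcal{O}$ be another chart that satisfies $U \supseteq \operatorname{spt} \phi \cap M$. We may suppose that $\omega$ is decomposable; i.e., $\omega = f\,dx_{i_1} \cdots dx_{i_n}$. Let $H = h^{-1}$ and $G = g^{-1}$. Since $\mathcal{O}$ is an orientation, $\Delta_{h \circ g^{-1}} \ge 0$ on $g(U \cap V)$. Moreover, by the Chain Rule,
$$\Delta_{(G_{i_1}, \dots, G_{i_n})} = \Delta_{(H_{i_1}, \dots, H_{i_n})} \circ h \circ g^{-1} \cdot \Delta_{h \circ g^{-1}}.$$
Since $h(V \cap U) = (h \circ g^{-1}) \circ g(V \cap U)$ and $\phi$ is supported in $V \cap U$, it follows from the Fundamental Theorem of Differential Transforms and a change of variables in $\mathbf{R}^n$ that
$$\begin{aligned} \int_{h(V)} (h^{-1})^*(\phi\omega)(u)\,du &= \int_{h(V)} (\phi \circ h^{-1})(u)\,(f \circ h^{-1})(u)\,\Delta_{(H_{i_1}, \dots, H_{i_n})}(u)\,du \\ &= \int_{g(U)} (\phi \circ g^{-1})(v)\,(f \circ g^{-1})(v)\,\Delta_{(H_{i_1}, \dots, H_{i_n})}(h \circ g^{-1}(v))\,|\Delta_{h \circ g^{-1}}(v)|\,dv \\ &= \int_{g(U)} (\phi \circ g^{-1})(v)\,(f \circ g^{-1})(v)\,\Delta_{(G_{i_1}, \dots, G_{i_n})}(v)\,dv \\ &= \int_{g(U)} (g^{-1})^*(\phi\omega)(v)\,dv. \end{aligned}$$
∎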
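Remark 15.39 can be illustrated numerically on a one-dimensional example. The sketch below rests on assumptions that are not in the text: the manifold is the open upper half of the unit circle, the form is $\omega = x\,dy$, and the two inverse charts are $H(u) = (\cos u, \sin u)$ on $(0, \pi)$ and $G(v) = H(v^2)$ on $(0, \sqrt{\pi})$, so that the transition map $v \mapsto v^2$ has positive derivative $2v$. The helper `midpoint` is a hypothetical name for a composite midpoint rule.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the Riemann integral of f over [a, b]."""
    width = (b - a) / n
    return width * sum(f(a + (k + 0.5) * width) for k in range(n))

def pullback_H(u):
    """Coefficient of du in H^*(x dy) for H(u) = (cos u, sin u)."""
    return math.cos(u) * math.cos(u)

def pullback_G(v):
    """Coefficient of dv in G^*(x dy) for G(v) = H(v**2).

    The Chain Rule contributes the Jacobian 2v > 0 of the transition map.
    """
    return pullback_H(v * v) * 2.0 * v

integral_H = midpoint(pullback_H, 0.0, math.pi)
integral_G = midpoint(pullback_G, 0.0, math.sqrt(math.pi))
```

Both quadratures approximate $\pi/2$, as the change-of-variables step in the proof predicts.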
15.40 Remark. The value of $\int_M \omega$ does not depend on the $C^\infty$ partition of unity $\phi_1, \dots, \phi_N$ chosen.
PROOF. If $\psi_1, \dots, \psi_L$ is another $C^\infty$ partition of unity subordinate to $\mathcal{O}$, then
$$\sum_{j=1}^N \int_M \phi_j\omega = \sum_{j=1}^N \sum_{k=1}^L \int_M \phi_j\psi_k\omega = \sum_{k=1}^L \int_M \psi_k\omega.$$
∎
The following result justifies the identification of $d(x, y)$ with $dx\,dy$ made below (3) on p. 542. (See also Exercise 5, p. 549.)
15.41 Remark. If $R$ is an n-dimensional rectangle with the usual orientation, and $\omega = f\,dx_1 \cdots dx_n$ is an n-form on some open $\Omega \supset R$, then
$$\int_R \omega = \int_R f(x)\,dx.$$
PROOF. By hypothesis, $(R, I)$ is a chart of $R$. Hence, by Definition 15.38i,
$$\int_R \omega = \int_{I(R)} I^*(\omega)(x)\,dx = \int_R f(x)\,dx.$$
∎
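We shall prove that the oriented integral of the exterior derivative $d\omega$ of a differential form on a manifold $M$ is determined by the behavior of $\omega$ on $\partial M$.
STRATEGY: The idea behind the proof is straightforward. First, we prove the result when $M$ is a rectangle (Lemma 15.42). (This case follows directly from the one-dimensional Fundamental Theorem of Calculus because the boundary of a rectangle moves in only one dimension at a time.) Next, we pull back this result to sufficiently small charts on $M$ (Lemma 15.43). Finally, by using a $C^\infty$ partition of unity subordinate to a covering by sufficiently small charts, we establish the general result. (The proofs of Lemma 15.43 and Theorem 15.45 presented here come from Spivak [12].¹)
¹ M. Spivak, Calculus on Manifolds (New York: W. A. Benjamin, Inc., 1965). Reprinted with permission of Addison-Wesley Publishing Company.
15.42 Lemma. Let $R = [a_1, b_1] \times \cdots \times [a_n, b_n]$ be an n-dimensional rectangle, let $\omega$ be a $C^1$ $(n-1)$-form on an open set $U$ that contains $R$, and suppose that $R$ has the usual orientation. If $\partial R$ carries the induced orientation, then
$$\int_R d\omega = \int_{\partial R} \omega.$$
PROOF. Let $\tilde{\mathcal{A}} = \{(\tilde V_j, \tilde h_j), (\tilde U_j, \tilde g_j) : j = 1, \dots, n\}$ be the atlas of $\partial R$ introduced in Theorem 15.34, and set $H_j = \tilde h_j^{-1}$, $G_j = \tilde g_j^{-1}$. We claim that
$$H_j^*(\omega) = (-1)^j (f_j \circ H_j)\,du_1 \cdots du_{n-1} \tag{12}$$
and
$$G_j^*(\omega) = (-1)^{j-1} (f_j \circ G_j)\,du_1 \cdots du_{n-1} \tag{13}$$
for any $(n-1)$-form $\omega = \sum_{i=1}^n f_i\,dx_1 \cdots \widehat{dx_i} \cdots dx_n$ on $U$.
To prove (12), fix $j$ and notice by construction that
$$H_j(u_1, \dots, u_{n-1}) = \begin{cases} (a_1, -u_1, u_2, \dots, u_{n-1}) & j = 1 \\ ((-1)^j u_1, u_2, \dots, a_j, \dots, u_{n-1}) & j > 1. \end{cases}$$
Representing the $i$th component of $H_j$ by $H_{ji}$, we have
$$\Delta_{(H_{j1}, \dots, \widehat{H_{ji}}, \dots, H_{jn})} = \begin{cases} (-1)^j & i = j \\ 0 & i \ne j. \end{cases}$$
Hence, by the Fundamental Theorem of Differential Transforms,
$$H_j^*(\omega) = \sum_{i=1}^n (f_i \circ H_j)\,\Delta_{(H_{j1}, \dots, \widehat{H_{ji}}, \dots, H_{jn})}\,du_1 \cdots du_{n-1} = (-1)^j (f_j \circ H_j)\,du_1 \cdots du_{n-1}.$$
This proves (12). A similar argument proves (13).
Using (12) and (13), a change of variables in the second variable when $j = 1$ (respectively, in the first variable when $j > 1$), the Fundamental Theorem of Calculus in the $j$th variable, and Fubini's Theorem, we see that
$$\begin{aligned} \int_{\partial R} \omega &= \sum_{j=1}^n \left( \int_{R_j^\ell} \omega + \int_{R_j^r} \omega \right) \\ &= \sum_{j=1}^n (-1)^j \left( \int_{\tilde h_j(\tilde V_j)} (f_j \circ \tilde h_j^{-1})(u)\,du - \int_{\tilde g_j(\tilde U_j)} (f_j \circ \tilde g_j^{-1})(u)\,du \right) \\ &= \sum_{j=1}^n (-1)^j \int_{a_1}^{b_1} \cdots \widehat{\int_{a_j}^{b_j}} \cdots \int_{a_n}^{b_n} \bigl( f_j(x_1, \dots, a_j, \dots, x_n) - f_j(x_1, \dots, b_j, \dots, x_n) \bigr)\,dx_1 \cdots \widehat{dx_j} \cdots dx_n; \end{aligned}$$
i.e.,
$$\int_{\partial R} \omega = \sum_{j=1}^n (-1)^{j+1} \int_R \frac{\partial f_j}{\partial x_j}(x)\,dx. \tag{14}$$
On the other hand, it is clear that
$$d(g\,dx_1 \cdots \widehat{dx_j} \cdots dx_n) = (-1)^{j+1} \frac{\partial g}{\partial x_j}\,dx_1 \cdots dx_n$$
for any differentiable function $g$. Thus
$$d\omega = \sum_{j=1}^n (-1)^{j+1} \frac{\partial f_j}{\partial x_j}\,dx_1 \cdots dx_n.$$
We conclude by Remark 15.41 and (14) that
$$\int_R d\omega = \sum_{j=1}^n (-1)^{j+1} \int_R \frac{\partial f_j}{\partial x_j}(x)\,dx = \int_{\partial R} \omega.$$
∎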
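For $n = 2$, Lemma 15.42 is Green's Theorem on a rectangle, and it can be sanity-checked with a short computation. The sketch below is illustrative only: the rectangle $[0,1] \times [0,2]$, the form $\omega = P\,dx + Q\,dy$ with $P = -xy^2$ and $Q = x^2y$, and the helper `midpoint` are all assumptions chosen for the example. The boundary is traversed counterclockwise, which is the induced orientation for $n = 2$ noted before Theorem 15.34.

```python
def midpoint(f, a, b, n=200):
    """Composite midpoint rule for the Riemann integral of f over [a, b]."""
    width = (b - a) / n
    return width * sum(f(a + (k + 0.5) * width) for k in range(n))

A1, B1, A2, B2 = 0.0, 1.0, 0.0, 2.0  # R = [A1, B1] x [A2, B2]

def P(x, y):
    return -x * y * y

def Q(x, y):
    return x * x * y

def d_omega(x, y):
    """Coefficient of dx dy in d(P dx + Q dy), namely Q_x - P_y = 4xy."""
    return 2 * x * y + 2 * x * y

# Left side: iterated integral of d(omega) over R (Fubini's Theorem).
interior = midpoint(lambda x: midpoint(lambda y: d_omega(x, y), A2, B2), A1, B1)

# Right side: the four sides of the boundary, traversed counterclockwise.
bottom = midpoint(lambda x: P(x, A2), A1, B1)   # y = A2, x increasing
right = midpoint(lambda y: Q(B1, y), A2, B2)    # x = B1, y increasing
top = -midpoint(lambda x: P(x, B2), A1, B1)     # y = B2, x decreasing
left = -midpoint(lambda y: Q(A1, y), A2, B2)    # x = A1, y decreasing
boundary = bottom + right + top + left
```

Both sides equal 4 here; the integrands are linear in each variable, so the midpoint rule is exact up to rounding.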
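15.43 Lemma. Let $M$ be an n-dimensional orientable $C^2$ manifold-with-smooth-boundary and let $\mathcal{O}$ be an orientation of $M$. For each $x \in M$ there is a chart $(V, h) \in \mathcal{O}$ at $x$ such that if $\eta$ is any $C^1$ $(n-1)$-form supported in $V$, then
$$\int_V d\eta = \int_{V \cap \partial M} \eta.$$
PROOF. Suppose first that $x \notin \partial M$. Then there is a chart $(U, h)$ at $x$ such that $h(U)$ is open in $\mathbf{R}^n$. Let $R$ be an n-dimensional rectangle such that
$$h(x) \in R^o \subset R \subset h(U)$$
and set $V = h^{-1}(R^o)$ (see Figure 15.3). Let $\eta$ be an $(n-1)$-form supported in $V$. We may suppose that $\eta$ is decomposable; i.e.,
$$(h^{-1})^*(\eta) = f\,dx_1 \cdots \widehat{dx_j} \cdots dx_n.$$
By definition and (5),
$$(h^{-1})^*(d\eta) = d((h^{-1})^*(\eta)) = (-1)^{j-1} \frac{\partial f}{\partial x_j}\,dx_1 \cdots dx_n.$$
Since $\operatorname{spt} f \subset h(V) = R^o$, it follows from Definition 15.38 and Lemma 15.42 (using the identity chart on $R$) that
$$\int_V d\eta = \int_{h(V)} (h^{-1})^*(d\eta) = \int_R d((h^{-1})^*(\eta)) = \int_{\partial R} (h^{-1})^*(\eta).$$
Since $\operatorname{spt}((h^{-1})^*\eta) \subset R^o$ and $R^o \cap \partial R = \emptyset$, this last integral is zero; i.e., $\int_V d\eta = 0$. On the other hand, $h(U)$ is open in $\mathbf{R}^n$ so $h(V) \subseteq h(U)$ contains no boundary points of $M$. Therefore, $V \cap \partial M = \emptyset$ and
$$\int_{V \cap \partial M} \eta = 0 = \int_V d\eta.$$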
[Figure 15.3: the chart $h$ carries a neighborhood $U$ of $x$ onto $h(U)$, which contains the rectangle $R$.]
[Figure 15.4: a boundary chart $h$ with $h(x) = 0$, and the rectangle $R$ inside $h(U)$.]
Next, suppose that $x \in \partial M$. Let $(U, h)$ be a chart at $x$ such that $h(U)$ is relatively open in the left half-space $\mathcal{H}_1$ and $h(x) = 0$. Let $R$ be an n-dimensional rectangle such that $R^o \subset R \subset h(U)$ and $R \cap \partial\mathcal{H}_1 = R_1^r$ (see Figure 15.4), and set $V = h^{-1}(R^o \cup R_1^r)$. Let $\eta$ be a decomposable $C^1$ $(n-1)$-form supported on $V$, with
$$(h^{-1})^*(\eta) = f\,dx_1 \cdots \widehat{dx_j} \cdots dx_n,$$
and $u = (u_1, \dots, u_{n-1}) \in \partial R$. Then $f$ is identically zero on $\partial R \setminus R_1^r$, and it follows from Definition 15.38 and Lemma 15.42 that
$$\begin{aligned} \int_{V \cap \partial M} \eta = \int_{R_1^r} f(u)\,du &= \sum_{j=1}^n (-1)^j \left( \int_{R_j^\ell} f(u)\,du - \int_{R_j^r} f(u)\,du \right) \\ &= \int_{\partial R} (h^{-1})^*(\eta)(u)\,du = \int_R (h^{-1})^*(d\eta)(x)\,dx \\ &= \int_{h(V)} (h^{-1})^*(d\eta)(x)\,dx = \int_V d\eta. \end{aligned}$$
∎
We are now prepared to prove the general result.
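15.44 THEOREM [STOKES'S THEOREM ON MANIFOLDS]. Let $M$ be a compact n-dimensional oriented $C^2$ manifold-with-smooth-boundary. If $\omega$ is a $C^1$ $(n-1)$-form on $M$, then
$$\int_M d\omega = \int_{\partial M} \omega.$$
PROOF. By Lemma 15.43, choose an open covering $\mathcal{V} = \{V_x\}_{x \in M}$ of $M$ such that $x \in V_x$ and
$$\int_{V_x} d\eta = \int_{V_x \cap \partial M} \eta \tag{15}$$
for all $(n-1)$-forms $\eta$ supported in $V_x$. Since $M$ is compact, choose open sets $V_j = V_{x_j}$, $j = 1, \dots, N$, that cover $M$ and a $C^\infty$ partition of unity $\phi_1, \dots, \phi_N$ on $M$ such that $\operatorname{spt} \phi_j \cap M \subseteq V_j$. Set $\eta_j = \phi_j\omega$ and observe that $\operatorname{spt} \eta_j \cap M \subseteq \operatorname{spt} \phi_j \cap M \subseteq V_j$ for each $j$ and $\omega = \sum_{j=1}^N \eta_j$. Hence, by (15),
$$\int_M d\omega = \sum_{j=1}^N \int_{V_j} d\eta_j = \sum_{j=1}^N \int_{V_j \cap \partial M} \eta_j = \int_{\partial M} \omega.$$
∎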
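A quick numerical instance of Theorem 15.44 uses the closed unit disk, a two-dimensional manifold-with-smooth-boundary. The sketch below makes several choices that are assumptions rather than content of the text: the form is $\omega = x\,dy$ (so $d\omega = dx\,dy$), the boundary circle is parametrized counterclockwise by $(\cos t, \sin t)$, the area integral is evaluated in polar coordinates, and `midpoint` is a hypothetical quadrature helper.

```python
import math

def midpoint(f, a, b, n=2000):
    """Composite midpoint rule for the Riemann integral of f over [a, b]."""
    width = (b - a) / n
    return width * sum(f(a + (k + 0.5) * width) for k in range(n))

# Boundary side: pull x dy back along (cos t, sin t), giving cos(t) * cos(t) dt.
boundary_integral = midpoint(lambda t: math.cos(t) * math.cos(t), 0.0, 2 * math.pi)

# Interior side: d(omega) = dx dy, integrated in polar coordinates as r dr dtheta.
interior_integral = 2 * math.pi * midpoint(lambda r: r, 0.0, 1.0)
```

Both values approximate $\pi$, the area of the disk.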
This result extends Theorems 13.50, 13.54, and 13.57 (the theorems of Green, Gauss, and Stokes) to regions with smooth boundaries (see Exercises 4 and 5, p. 549). Theorem 15.44 also holds for manifolds with singularities, i.e., piecewise smooth boundaries. (For a treatment of manifolds with singularities, see Loomis and Sternberg [6].)
We close this section with an n-dimensional analogue of Theorem 13.61. A set $V \subset \mathbf{R}^n$ is said to be star-shaped (centered at 0) if and only if for each $x \in V$ the line segment between $x$ and 0 lies in $V$; i.e., $tx \in V$ for all $0 \le t \le 1$. An r-form $\omega$ is said to be exact on $V$ if and only if there is an $(r-1)$-form $\eta$ on $V$ such that $d\eta = \omega$.
15.45 THEOREM [POINCARÉ LEMMA]. Let $V$ be an open star-shaped set in $\mathbf{R}^n$ and let $\omega$ be a $C^1$ r-form on $V$. Then $\omega$ is exact on $V$ if and only if $d\omega = 0$ on $V$.
PROOF. For each r-form
$$\omega = \sum f_{i_1, \dots, i_r}(x)\,dx_{i_1} \cdots dx_{i_r}$$
on $V$, define an $(r-1)$-form $A(\omega)$ on $V$ by
$$[\dots]\ dx_{i_1} \cdots \widehat{dx_{i_k}} \cdots dx_{i_r}. \tag{16}$$
Since $V$ is star-shaped and $f$ is defined on $V$, the integrals in (16) make sense for each $x \in V$. Thus $A(\omega)$ is an $(r-1)$-form on $V$. We claim that
$$A(d\omega) + d(A(\omega)) = \omega \tag{17}$$
for every r-form $\omega$ on $V$.
To prove (17) we may suppose that $\omega$ is decomposable; i.e., $\omega = f\,dx_{i_1} \cdots dx_{i_r}$. By definition,
$$d\omega = \sum_{j=1}^n \frac{\partial f}{\partial x_j}\,dx_j\,dx_{i_1} \cdots dx_{i_r}$$
is an $(r+1)$-form on $V$. Letting $i_0 = j$, we have by (16) that
$$[\dots]\ dx_j\,dx_{i_1} \cdots \widehat{dx_{i_k}} \cdots dx_{i_r}. \tag{18}$$
On the other hand, by the Product Rule, differentiating under the integral sign (see Theorem 11.5), and the Chain Rule, we have [...]
Thus, by the Anticommutative Property, the exterior derivative of $A(\omega)$ is [...] (19)
Adding (18) and (19), we obtain by the Product Rule and the one-dimensional Fundamental Theorem of Calculus that [...]
This proves (17).
Theorem 15.45 is now easy to prove. If $\omega$ is exact and $C^1$, then there is a $C^2$ $(r-1)$-form $\eta$ such that $d\eta = \omega$. Thus $d\omega = d^2\eta = 0$ by Theorem 15.9. (This part works whether or not $V$ is star-shaped.) Conversely, if $d\omega = 0$, then by (17), $d(A(\omega)) = \omega$. Thus set $\eta = A(\omega)$. ∎
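The converse direction of Theorem 15.45 can be illustrated for a closed 1-form on $\mathbf{R}^2$. The sketch below makes assumptions that go beyond the surviving text: it uses the standard radial-integration construction for $r = 1$, $\eta(x, y) = \int_0^1 \bigl(P(tx, ty)\,x + Q(tx, ty)\,y\bigr)\,dt$, as a stand-in for the operator $A$ (whose defining formula (16) is not reproduced above), and it picks $\omega = 2xy\,dx + x^2\,dy$ as the example. The names `simpson`, `potential`, and `gradient` are hypothetical helpers.

```python
def simpson(f, a, b, n=100):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    width = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * width)
    return total * width / 3

def P(x, y):
    return 2 * x * y   # omega = P dx + Q dy is closed: P_y = Q_x = 2x

def Q(x, y):
    return x * x

def potential(x, y):
    """eta(x, y): integral over [0, 1] of P(tx, ty) * x + Q(tx, ty) * y dt."""
    return simpson(lambda t: P(t * x, t * y) * x + Q(t * x, t * y) * y, 0.0, 1.0)

def gradient(f, x, y, step=1e-5):
    """Central-difference approximation to (df/dx, df/dy) at (x, y)."""
    return ((f(x + step, y) - f(x - step, y)) / (2 * step),
            (f(x, y + step) - f(x, y - step)) / (2 * step))

point = (0.7, -1.3)
eta_x, eta_y = gradient(potential, *point)
```

At the sample point the computed gradient of $\eta$ matches $(P, Q)$, i.e., $d\eta = \omega$; here $\eta(x, y) = x^2y$.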
EXERCISES
1. Compute
$$\int_{\partial B_a(0,0,0,0)} x^3\,dy\,dz\,dw + y^2\,dx\,dz\,dw.$$
2. Compute $\int_M \sum_{j=1}^n x_j^2\,dx_1 \cdots \widehat{dx_j} \cdots dx_n$, where $M$ is the boundary of the unit n-dimensional rectangle $Q = [0, a_1] \times \cdots \times [0, a_n]$.
3. Let $E$ be a compact n-dimensional Jordan region in $\mathbf{R}^n$, $n > 1$. If $\partial E$ is an $(n-1)$-dimensional manifold, prove that [...] if $n$ is odd, and [...] if $n$ is even.
4. Let $r \in \mathbf{N}$, $m > n = 2r + 2$, $V$ be a star-shaped open set in $\mathbf{R}^m$, and $M$ be a compact n-dimensional $C^2$ manifold-with-smooth-boundary in $\mathbf{R}^m$. If $M \subset V$ and $\omega$ is an exact $C^1$ $(r+1)$-form on $V$ with $\omega = d\eta$, prove that
Appendices
A. ALGEBRAIC LAWS
In this section we derive several consequences of the ordered field axioms (i.e., Postulates 1 and 2 in Section 1.1).
A.1 THEOREM. Let $x, a \in \mathbf{R}$.
(i) If $a = x + a$, then $x = 0$.
(ii) If $a = x \cdot a$ and $a \ne 0$, then $x = 1$.
PROOF. (i) Since the additive inverse of $a$ exists, we can add $-a$ to the equation $a = x + a$. Using the Associative Property and the fact that 0 is the additive identity, we obtain
$$0 = a + (-a) = (x + a) + (-a) = x + (a + (-a)) = x + 0 = x.$$
(ii) Since the multiplicative inverse of $a$ exists, we can multiply $a = x \cdot a$ by $a^{-1}$. Using the Associative Property, and the fact that 1 is the multiplicative identity, we obtain
$$1 = a \cdot a^{-1} = (x \cdot a) \cdot a^{-1} = x \cdot (a \cdot a^{-1}) = x \cdot 1 = x.$$
∎
Theorem A.1 shows that the additive and multiplicative identities are unique. The following result shows that additive and multiplicative inverses are also unique. Thus "unique" can be dropped from the statements in Postulate 1.
A.2 THEOREM. (i) If $a, b \in \mathbf{R}$ and $a + b = 0$, then $b = -a$. (ii) If $a, b \in \mathbf{R}$ and $ab = 1$, then $b = a^{-1}$.
PROOF. (i) By hypothesis and the Associative Property,
$$-a = -a + (a + b) = (-a + a) + b = 0 + b = b.$$
(ii) Since $1 \ne 0$, $a \ne 0$. Thus it follows from hypothesis and the Associative Property that
$$a^{-1} = a^{-1}(ab) = (a^{-1}a)b = 1 \cdot b = b. \quad ∎$$
A.3 THEOREM. For all $a, b \in \mathbf{R}$, $0 \cdot a = 0$, $-a = (-1) \cdot a$, $-(-a) = a$, $(-1)^2 = 1$, and $-(a - b) = b - a$.
PROOF. Since 1 is the multiplicative identity and 0 is the additive identity, it follows from the Distributive Property that
$$a + 0 \cdot a = 1 \cdot a + 0 \cdot a = (1 + 0) \cdot a = 1 \cdot a = a.$$
Hence, by Theorem A.1, $0 \cdot a = 0$. Similarly,
$$a + (-1) \cdot a = (1 + (-1)) \cdot a = 0 \cdot a = 0.$$
Since additive inverses are unique, it follows that $(-1) \cdot a = -a$. Since $-a + a = a + (-a) = 0$, a similar argument proves that $-(-a) = a$. Substituting $a = -1$, we have
$$(-1)(-1) = -(-1) = 1.$$
Finally, for any $a, b \in \mathbf{R}$, we also have
$$-(a - b) = (-1)(a - b) = (-1)a + (-1)(-b) = -a + b = b - a. \quad ∎$$
A.4 THEOREM. Let $a, b, c \in \mathbf{R}$. (i) If $a \cdot b = 0$, then $a = 0$ or $b = 0$. (ii) If $a \cdot b = a \cdot c$ and $a \ne 0$, then $b = c$.
PROOF. (i) If $a = 0$, we are done. If $a \ne 0$, then multiplying the identity $0 = a \cdot b$ by $a^{-1}$, we have
$$0 = a^{-1} \cdot 0 = a^{-1} \cdot (a \cdot b) = (a^{-1} \cdot a) \cdot b = 1 \cdot b = b.$$
(ii) If $a \cdot b = a \cdot c$, then by Theorem A.3 we have
$$a \cdot (b - c) = a \cdot (b + (-1)c) = a \cdot b + (-1)a \cdot c = a \cdot b - a \cdot c = 0.$$
Since $a \ne 0$, it follows from part (i) that $b - c = 0$; i.e., $b = c$. ∎
A subset $E$ of $\mathbf{R}$ is called inductive if
(1) $1 \in E$
and
(2) for every $x \in E$, $x + 1$ also belongs to $E$.
Notice by Postulate 1 that $\mathbf{R}$ is an inductive set. Define $\mathbf{N}$ to be the set of elements that belong to ALL inductive sets, and set $\mathbf{Z} := \{k \in \mathbf{R} : k \in \mathbf{N},\ -k \in \mathbf{N},\ \text{or } k = 0\}$. Notice that $\mathbf{N}$ is the smallest inductive set; i.e., $\mathbf{N} \subseteq E$ for any inductive set $E$. Indeed, if $k \in \mathbf{N}$ and $E$ is inductive, then by definition, $k \in E$. We first show that $\mathbf{N}$ and $\mathbf{Z}$, as defined above, satisfy the assumptions we made in Remark 1.1.
A.5 THEOREM. (i) Given $n \in \mathbf{Z}$, one and only one of the following statements holds: $n \in \mathbf{N}$, $-n \in \mathbf{N}$, or $n = 0$. (ii) $n \in \mathbf{N}$ implies $n + 1 \in \mathbf{N}$ and $n \ge 1$. (iii) If $n \in \mathbf{N}$ and $n \ne 1$, then $n - 1 \in \mathbf{N}$. (iv) If $n \in \mathbf{Z}$ and $n > 0$, then $n \in \mathbf{N}$.
PROOF. (i) Since $(0, \infty)$ is an inductive set, all elements of $\mathbf{N}$ are positive. By the definition of $\mathbf{Z}$, given $n \in \mathbf{Z}$, one of the following statements holds: $n \in \mathbf{N}$, $-n \in \mathbf{N}$, or $n = 0$. It follows that either $n > 0$, $n < 0$, or $n = 0$. Since the Trichotomy Property implies that only one of these conditions can hold for a given $n$, property (i) is proved.
(ii) Since $\mathbf{N}$ is inductive, $n \in \mathbf{N}$ implies $n + 1 \in \mathbf{N}$. Since $[1, \infty)$ is an inductive set, every $n \in \mathbf{N}$ satisfies $n \ge 1$.
(iii) Suppose to the contrary that there is an $n_0 \in \mathbf{N}$ such that $n_0 \ne 1$ and $n_0 - 1 \notin \mathbf{N}$. Consider the set $E := \{k \in \mathbf{N} : k \ne n_0\}$. Since $1 \ne n_0$, $1 \in E$. If $x \in E$, then $x \in \mathbf{N}$; hence, $x \ne n_0 - 1$. It follows that $x + 1 \ne n_0$; i.e., $x + 1 \in E$. Thus $E$ is an inductive set. Since $E \subset \mathbf{N}$, this contradicts the fact that $\mathbf{N}$ was defined to be the smallest inductive set.
(iv) Suppose to the contrary that $n \in \mathbf{Z}$, $n > 0$, but $n \notin \mathbf{N}$. Then by part (i), $-n \in \mathbf{N}$, so by part (ii), $-n \ge 1$. Using the second Multiplicative Property, it follows that $n \le -1 < 0$, a contradiction. ∎
A.6 COROLLARY. If $x, e \in \mathbf{N}$ and $x < e$, then $x + 1 \le e$.
PROOF. By hypothesis, $e - x > 0$. Hence, by Theorem A.5ii and iv, it suffices to show that $e - x \in \mathbf{Z}$. Consider the set $A = \{k \in \mathbf{N} : k - 1 \in \mathbf{Z}\}$. Clearly, $1 \in A$. Moreover, if $k \in A$, then $(k + 1) - 1 = k \in \mathbf{N} \subset \mathbf{Z}$. Thus $A$ is an inductive set, i.e., contains $\mathbf{N}$. In particular, $e \in A$. Similarly, $B = \{k \in \mathbf{N} : e - k \in \mathbf{Z}\}$ is an inductive set, hence contains $x$. We conclude that $e - x \in \mathbf{Z}$. ∎
We close this section by proving that under mild assumptions, the Axiom of Induction and the Well-Ordering Principle are equivalent.
A.7 THEOREM. Suppose that the Ordered Field axioms and Theorem A.5iii hold. Then the Axiom of Induction holds if and only if the Well-Ordering Principle holds.
PROOF. By the proof of Theorem 1.11 (which used Theorem A.5iii at a crucial spot), the Well-Ordering Principle implies the Axiom of Induction. Conversely, if the Axiom of Induction holds, then the elements of $\mathbf{N}$ belong to every inductive set. This is the "definition" of $\mathbf{N}$ we made below (2) above. It follows that all results above are valid. (We will use Theorem A.5ii and Corollary A.6.) Suppose that $E$ is a nonempty subset of $\mathbf{N}$, and consider the set
$$A := \{x \in \mathbf{N} : x \le e \text{ for all } e \in E\}.$$
$A$ is nonempty since $1 \in A$. $A$ is not the whole set $\mathbf{N}$ since if $e_0 \in E$, then $e_0 + 1$ cannot belong to $A$. Hence, by the Axiom of Induction, $A$ cannot be an inductive set. In particular, there is an $x \in A$ such that $x + 1 \notin A$. We claim that this $x$ is a least element of $E$; i.e., $x$ is a lower bound of $E$ and $x \in E$. That $x$ is a lower bound of $E$ is obvious, since by construction, $x \in A$ implies $x \le e$ for all $e \in E$. On the other hand, if $x \notin E$, then $x < e$ for all $e \in E$. Hence, by Corollary A.6, $x + 1 \le e$ for all $e \in E$, i.e., $x + 1 \in A$, a contradiction of the choice of $x$. ∎
[Figure B.1a] [Figure B.1b]
B. TRIGONOMETRY
In this section we derive some trigonometric identities by using elementary geometry and algebra. Let $(x, y)$ be a point on the unit circle $x^2 + y^2 = 1$ and $\theta$ be the angle measured counterclockwise from the positive $x$ axis to the line segment from $(0,0)$ to $(x, y)$ (see Figure B.1a). (We shall refer to $(x, y)$ as the point determined by the angle $\theta$.) Define
$$\sin\theta = y, \quad \cos\theta = x, \quad \text{and} \quad \tan\theta = \frac{y}{x}.$$
By the Law of Similar Triangles, given a right triangle with base angle $\theta$, altitude $a$, base $b$, and hypotenuse $h$ (see Figure B.1b), $\sin\theta = a/h$, $\cos\theta = b/h$, and $\tan\theta = a/b = \sin\theta/\cos\theta$.
B.1 THEOREM. Given a circle $C : x^2 + y^2 = r^2$ of radius $r$, let $s(\theta)$ represent the length of the arc on $C$ swept out by $\theta$, and $A(\theta)$ represent the area of the angular sector swept out by $\theta$ (see Figure B.1a). If the angle $\theta$ is measured in radians (not degrees), then
$$s(\theta) = r\theta \quad \text{and} \quad A(\theta) = \frac{r^2\theta}{2}.$$
PROOF. Since there are $2\pi$ radians in a complete circle and the circumference of a circle of radius $r$ is $2\pi r$, we have
$$\frac{s(\theta)}{2\pi r} = \frac{\theta}{2\pi};$$
i.e., $s(\theta) = r\theta$. Similarly, since the area of a circle is $\pi r^2$, we have
$$\frac{A(\theta)}{\pi r^2} = \frac{\theta}{2\pi};$$
i.e., $A(\theta) = r^2\theta/2$. ∎
[Figure B.2a] [Figure B.2b]
B.2 THEOREM. (i) $\sin(0) = 0$ and $\cos(0) = 1$. (ii) For any $\theta \in \mathbf{R}$, $|\sin\theta| \le 1$, $|\cos\theta| \le 1$, $\sin(-\theta) = -\sin\theta$, $\cos(-\theta) = \cos\theta$, and $\sin^2\theta + \cos^2\theta = 1$. (iii) If $\theta$ is measured in radians, then $\sin(\pi/2) = 1$, $\cos(\pi/2) = 0$, $\sin(\theta + 2\pi) = \sin\theta$, and $\cos(\theta + 2\pi) = \cos\theta$. Moreover, if $0 < \theta < \pi/2$, then $0 < \theta\cos\theta < \sin\theta < \theta$. (iv) If $\theta \in \mathbf{R}$ is measured in radians, then $|\sin\theta| \le |\theta|$.
PROOF. Let $\theta \in \mathbf{R}$ and $(x, y)$ be the point on the unit circle determined by $\theta$.
(i) If $\theta = 0$, then $(x, y) = (1, 0)$ (see Figure B.1a). Hence, $\sin(0) = 0$ and $\cos(0) = 1$.
(ii) Clearly, $|\sin\theta| = |y| \le \sqrt{x^2 + y^2} = 1$, and similarly, $|\cos\theta| \le 1$. By definition (see Figure B.2a), $\sin(-\theta) = -y = -\sin\theta$ and $\cos(-\theta) = x = \cos\theta$. Moreover,
$$\sin^2\theta + \cos^2\theta = x^2 + y^2 = 1.$$
(iii) If $\theta = \pi/2$, then $(x, y) = (0, 1)$, so $\sin(\pi/2) = 1$ and $\cos(\pi/2) = 0$. Fix $\theta \in (0, \pi/2)$ and consider Figure B.2b. Since $\sin\theta$ is the altitude of triangle $ABC$ and the shortest distance between two points is a straight line, we have by Theorem B.1 that $\sin\theta < s(\theta) = \theta$. On the other hand, the triangle $ABC$ is a proper subset of the angular sector swept out by $\theta$, which is a proper subset of the triangle $ABD$. Hence, $\text{Area}(ABC) < A(\theta) < \text{Area}(ABD)$. Since the area of a triangle is one-half the product of its base and its altitude, it follows from Theorem B.1 that
(3) $$\frac{\sin\theta}{2} < \frac{\theta}{2} < \frac{\tan\theta}{2}.$$
But $0 < \cos\theta < 1$ for all $\theta \in (0, \pi/2)$. Multiplying (3) by $2\cos\theta$, we conclude that
(4) $$\sin\theta\cos\theta < \theta\cos\theta < \sin\theta.$$
(iv) By part (iii), $|\sin\theta| = \sin\theta \le \theta = |\theta|$ for all $0 \le \theta \le \pi/2$. Since $\sin(-\theta) = -\sin\theta$, it follows that $|\sin\theta| \le |\theta|$ for all $\theta \in [-\pi/2, \pi/2]$. But if $\theta \notin [-\pi/2, \pi/2]$, then $|\sin\theta| \le 1 < \pi/2 < |\theta|$. Therefore, $|\sin\theta| \le |\theta|$ for all $\theta \in \mathbf{R}$. ∎
The next result shows how to compute the sine and cosine of a sum of angles.
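B.3 THEOREM. (i) [SUM-ANGLE FORMULAS]. If $\theta, \varphi \in \mathbf{R}$, then
$$\cos(\theta \pm \varphi) = \cos\theta\cos\varphi \mp \sin\theta\sin\varphi$$
and
$$\sin(\theta \pm \varphi) = \sin\theta\cos\varphi \pm \cos\theta\sin\varphi.$$
(ii) [DOUBLE-ANGLE FORMULAS]. If $\theta \in \mathbf{R}$, then
$$\cos^2\theta = \frac{1 + \cos(2\theta)}{2}, \qquad \sin^2\theta = \frac{1 - \cos(2\theta)}{2},$$
and
$$\cos\theta = 1 - 2\sin^2(\theta/2).$$
(iii) [SHIFT FORMULAS]. If $\varphi$ is measured in radians, then
$$\sin\varphi = \cos\left(\frac{\pi}{2} - \varphi\right) \quad \text{and} \quad \cos\varphi = \sin\left(\frac{\pi}{2} - \varphi\right)$$
for all $\varphi \in \mathbf{R}$.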
[Figure B.3]
PROOF. Suppose first that $\theta > \varphi$. Consider the chord $A$ cut from the unit circle by a central angle $\theta - \varphi$, and the chord $B$ cut from the unit circle by a central angle $\varphi - \theta$ (see Figure B.3). Since $\sin^2\theta + \cos^2\theta = 1$, we have
(5) $$A^2 = (\cos\theta - \cos\varphi)^2 + (\sin\theta - \sin\varphi)^2 = 2 - 2(\cos\theta\cos\varphi + \sin\theta\sin\varphi)$$
and
$$B^2 = (\cos(\theta - \varphi) - 1)^2 + (\sin(\theta - \varphi))^2 = 2 - 2\cos(\theta - \varphi).$$
Since $|\theta - \varphi| = |\varphi - \theta|$, the lengths of these chords must be equal. Thus
(6) $$\cos(\theta - \varphi) = \cos\theta\cos\varphi + \sin\theta\sin\varphi$$
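for $\theta > \varphi$. A similar argument establishes (6) for $\theta \le \varphi$. Since $\cos(-\varphi) = \cos\varphi$ and $\sin(-\varphi) = -\sin\varphi$ by Theorem B.2ii, replacing $\varphi$ by $-\varphi$ in (6) gives
$$\cos(\theta + \varphi) = \cos\theta\cos\varphi - \sin\theta\sin\varphi.$$
This and (6) verify the first identity in part (i). Applying this identity to $\theta = \pi/2$, we see by Theorem B.2iii that
$$\cos\left(\frac{\pi}{2} - \varphi\right) = \cos\frac{\pi}{2}\cos\varphi + \sin\frac{\pi}{2}\sin\varphi = \sin\varphi.$$
Hence,
$$\sin(\theta \pm \varphi) = \cos\left(\left(\frac{\pi}{2} - \theta\right) \mp \varphi\right) = \cos\left(\frac{\pi}{2} - \theta\right)\cos\varphi \pm \sin\left(\frac{\pi}{2} - \theta\right)\sin\varphi = \sin\theta\cos\varphi \pm \cos\theta\sin\varphi,$$
where the last step uses $\cos(\pi/2 - \theta) = \sin\theta$ and $\sin(\pi/2 - \theta) = \cos(\pi/2 - (\pi/2 - \theta)) = \cos\theta$. This proves the second identity in part (i). Specializing to the case $\theta = \pi/2$, we obtain $\sin(\pi/2 - \varphi) = \cos\varphi$. Finally, by part (i) and Theorem B.2ii,
$$\cos(2\theta) = \cos(\theta + \theta) = \cos^2\theta - \sin^2\theta = 2\cos^2\theta - 1.$$
Hence, $\cos^2\theta = (1 + \cos(2\theta))/2$. Similar arguments establish the rest of part (ii). ∎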
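The chord argument behind (5) and (6) can be spot-checked numerically. The following is only a sketch: the helper names (`point`, `chord_sq`, `check_sum_angle`) and the sample angles are illustrative assumptions, not part of the text, and a floating-point check is of course no substitute for the proof.

```python
import math

def point(angle):
    """Point on the unit circle determined by an angle in radians."""
    return (math.cos(angle), math.sin(angle))

def chord_sq(p, q):
    """Squared distance between two points of the plane."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def check_sum_angle(theta, phi, tol=1e-12):
    """Return True if the two chords in (5) agree and the formulas in B.3(i) hold numerically."""
    a_sq = chord_sq(point(theta), point(phi))        # chord A in (5)
    b_sq = chord_sq(point(theta - phi), point(0.0))  # chord B in (5)
    cos_rhs = math.cos(theta) * math.cos(phi) + math.sin(theta) * math.sin(phi)
    sin_rhs = math.sin(theta) * math.cos(phi) + math.cos(theta) * math.sin(phi)
    return (
        abs(a_sq - b_sq) < tol
        and abs(math.cos(theta - phi) - cos_rhs) < tol
        and abs(math.sin(theta + phi) - sin_rhs) < tol
    )

# Hypothetical sample angles (radians), chosen only for illustration.
SAMPLES = [(2.0, 0.5), (0.3, 1.7), (-1.2, 4.0), (math.pi / 2, math.pi / 3)]
RESULTS = [check_sum_angle(t, p) for t, p in SAMPLES]
```

Each entry of `RESULTS` reports whether the identities held for one sample pair within the stated tolerance.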
We close this section with the Law of Cosines, a generalization of the Pythagorean Theorem.
B.4 THEOREM [LAW OF COSINES]. If $T$ is a triangle with sides of length $a$, $b$, $c$, and $\theta$ is the angle opposite the side of length $c$, then $c^2 = a^2 + b^2 - 2ab\cos\theta$.
PROOF. Suppose without loss of generality that $\theta$ is acute, and rotate $T$ so $b$ is its base. Let $h$ be the altitude of $T$, and notice that $h$ cuts a right triangle out of $T$ whose sides are $a$ and $h$ and the angle opposite $h$ is $\theta$. By the definition of $\sin\theta$ and $\cos\theta$, $h = a\sin\theta$ and the length $d$ of the base of this right triangle is $d = b - a\cos\theta$. Substituting these values into the equation $c^2 = h^2 + d^2$ (which follows directly from the Pythagorean Theorem), we obtain
$$c^2 = (a\sin\theta)^2 + (b - a\cos\theta)^2 = a^2\sin^2\theta + a^2\cos^2\theta + b^2 - 2ab\cos\theta = a^2 + b^2 - 2ab\cos\theta. \quad ∎$$
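As a small illustration of Theorem B.4, the sketch below computes the third side of a triangle in two ways: directly from the Law of Cosines, and from the altitude construction used in the proof. The function names are hypothetical, and angles are assumed to be in radians.

```python
import math

def third_side(a, b, theta):
    """Side opposite the angle theta (radians) enclosed by sides a and b, by the Law of Cosines."""
    return math.sqrt(a * a + b * b - 2.0 * a * b * math.cos(theta))

def third_side_via_altitude(a, b, theta):
    """Same side, computed as in the proof: h = a sin(theta), d = b - a cos(theta), c^2 = h^2 + d^2."""
    h = a * math.sin(theta)
    d = b - a * math.cos(theta)
    return math.hypot(h, d)
```

For a right angle the formula reduces to the Pythagorean Theorem, e.g. `third_side(3, 4, math.pi / 2)` is 5 up to rounding.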
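C. MATRICES AND DETERMINANTS
In this section we prove several elementary results about matrices and determinants. We assume that the student is familiar with the concept of row and column reduction to canonical form. Recall that an $m \times n$ matrix $B$ is a rectangular array that has $m$ rows and $n$ columns:
$$B = [b_{ij}]_{m \times n} = \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots & \vdots & & \vdots \\ b_{m1} & b_{m2} & \cdots & b_{mn} \end{bmatrix}.$$
The notation $b_{ij}$ indicates the entry in the $i$th row and $j$th column. We shall call $B$ real if all its entries $b_{ij}$ belong to $\mathbf{R}$.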
C.1 DEFINITION. Let $B = [b_{ij}]_{m \times n}$ and $C = [c_{ij}]_{p \times q}$ be real matrices.
(i) $B$ and $C$ are said to be equal if $m = p$, $n = q$, and $b_{ij} = c_{ij}$ for $i = 1, 2, \ldots, m$, and $j = 1, 2, \ldots, n$.
(ii) The $m \times n$ zero matrix is the matrix $O = O_{m \times n} = [b_{ij}]_{m \times n}$ where $b_{ij} = 0$ for $i = 1, \ldots, m$, and $j = 1, \ldots, n$.
(iii) The $n \times n$ identity matrix is the matrix $I = I_{n \times n} = [b_{ij}]_{n \times n}$ where $b_{ii} = 1$ for $i = 1, \ldots, n$, and $b_{ij} = 0$ for $i \ne j$, $i, j = 1, \ldots, n$.
(iv) The product of a matrix $B$ and a scalar $\alpha$ is defined by $\alpha B = [\alpha b_{ij}]_{m \times n}$.
(v) The negative of a matrix $B$ is defined by $-B = (-1)B$.
(vi) When $m = p$ and $n = q$, the sum of $B$ and $C$ is defined by $B + C = [b_{ij} + c_{ij}]_{m \times n}$.
(vii) When $n = p$, the product of $B$ and $C$ is defined by
$$BC = \left[\sum_{\nu=1}^{n} b_{i\nu}c_{\nu j}\right]_{m \times q}.$$
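The operations in Definition C.1 translate directly into code. The sketch below represents a matrix as a list of rows; the function names and the sample matrices `B_SAMPLE` and `C_SAMPLE` are illustrative assumptions and are not taken from the text.

```python
def scalar_mul(alpha, B):
    """alpha * B = [alpha * b_ij], as in Definition C.1(iv)."""
    return [[alpha * entry for entry in row] for row in B]

def mat_add(B, C):
    """B + C = [b_ij + c_ij]; defined only when B and C have the same size."""
    if len(B) != len(C) or len(B[0]) != len(C[0]):
        raise ValueError("sum is defined only for matrices of the same size")
    return [[b + c for b, c in zip(rb, rc)] for rb, rc in zip(B, C)]

def mat_mul(B, C):
    """BC = [sum over v of b_iv * c_vj]; defined only when B is m x n and C is n x q."""
    n = len(B[0])
    if n != len(C):
        raise ValueError("product is defined only when columns of B equal rows of C")
    return [
        [sum(B[i][v] * C[v][j] for v in range(n)) for j in range(len(C[0]))]
        for i in range(len(B))
    ]

# Hypothetical 2 x 2 matrices used only to illustrate the definitions.
B_SAMPLE = [[1, 2], [3, 4]]
C_SAMPLE = [[0, 1], [1, 0]]
BC = mat_mul(B_SAMPLE, C_SAMPLE)  # [[2, 1], [4, 3]]
CB = mat_mul(C_SAMPLE, B_SAMPLE)  # [[3, 4], [1, 2]]
```

For these sample matrices `BC` and `CB` differ, which illustrates that matrix multiplication need not be commutative.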
C.2 Example. Compute B
SOLUTION.
+ C, 3B,
-C, BC, and CB, where
By definition, 3B
=
[~ ~],
These operations do not satisfy all the usual laws of algebra. (For example, the last two computations show that matrix multiplication is not commutative.) Here is a list of algebraic laws satisfied by real matrices.
C.3 THEOREM. Let $A = [a_{ij}]$, $B = [b_{ij}]$, and $C = [c_{ij}]$ be real matrices and $\alpha$, $\beta$ be scalars.
(i) $(\alpha + \beta)C = \alpha C + \beta C$.
(ii) If $B + C$ is defined, then $\alpha(B + C) = \alpha B + \alpha C$, and $B + C = C + B$.
(iii) If $BC$ is defined, then $\alpha(BC) = (\alpha B)C = B(\alpha C)$.
(iv) If $AB$ and $AC$ are defined, then $A(B + C) = AB + AC$. If $BA$ and $CA$ are defined, then $(B + C)A = BA + CA$.
(v) If $A + B$ and $B + C$ are defined, then $(A + B) + C = A + (B + C)$. If $AB$ and $BC$ are defined, then $(AB)C = A(BC)$.
(vi) If $B$ is an $m \times n$ matrix, then $B + O_{m \times n} = B$, $BO_{n \times q} = O_{m \times q}$, $O_{p \times m}B = O_{p \times n}$, and $0 \cdot B = O_{m \times n}$.
(vii) If $B$ is an $n \times n$ matrix, then $I_{n \times n}B = BI_{n \times n} = B$.
PROOF.
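By definition,
$$(\alpha + \beta)C = [(\alpha + \beta)c_{ij}] = [\alpha c_{ij}] + [\beta c_{ij}] = \alpha C + \beta C$$
and
$$\alpha(B + C) = \alpha[b_{ij} + c_{ij}] = [\alpha(b_{ij} + c_{ij})] = [\alpha b_{ij}] + [\alpha c_{ij}] = \alpha B + \alpha C.$$
A similar argument establishes $B + C = C + B$. Let $B$ be an $m \times n$ matrix and $C$ be an $n \times q$ matrix. By definition,
$$\alpha(BC) = \alpha\left[\sum_{\nu=1}^{n} b_{i\nu}c_{\nu j}\right] = \left[\sum_{\nu=1}^{n} (\alpha b_{i\nu})c_{\nu j}\right] = (\alpha B)C.$$
A similar argument establishes $\alpha(BC) = B(\alpha C)$. Let $A$ be an $m \times n$ matrix and $B$, $C$ be $n \times q$ matrices. By definition,
$$A(B + C) = \left[\sum_{\nu=1}^{n} a_{i\nu}(b_{\nu j} + c_{\nu j})\right] = \left[\sum_{\nu=1}^{n} a_{i\nu}b_{\nu j}\right] + \left[\sum_{\nu=1}^{n} a_{i\nu}c_{\nu j}\right] = AB + AC.$$
A similar argument establishes $(B + C)A = BA + CA$. Let $A$ be an $m \times n$ matrix, $B$ be an $n \times p$ matrix, and $C$ be a $p \times q$ matrix. By definition,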
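$$(AB)C = \left[\sum_{\nu=1}^{n} a_{i\nu}b_{\nu j}\right][c_{jk}] = \left[\sum_{j=1}^{p}\left(\sum_{\nu=1}^{n} a_{i\nu}b_{\nu j}\right)c_{jk}\right] = \left[\sum_{\nu=1}^{n}\left(a_{i\nu}\sum_{j=1}^{p} b_{\nu j}c_{jk}\right)\right] = A\left[\sum_{j=1}^{p} b_{\nu j}c_{jk}\right] = A(BC).$$
A similar argument establishes $(A + B) + C = A + (B + C)$. By definition,
$$BO_{n \times q} = \left[\sum_{\nu=1}^{n} b_{i\nu} \cdot 0\right] = O_{m \times q}, \qquad O_{p \times m}B = \left[\sum_{\nu=1}^{m} 0 \cdot b_{\nu j}\right] = O_{p \times n},$$
and
$$0 \cdot B = [0 \cdot b_{ij}] = O_{m \times n}.$$
Since $I = [\delta_{ij}]$, where
$$\delta_{ij} = \begin{cases} 1 & i = j, \\ 0 & i \ne j, \end{cases}$$
we have $I_{n \times n}B = \left[\sum_{\nu=1}^{n} \delta_{i\nu}b_{\nu j}\right] = [b_{ij}] = B = BI_{n \times n}$. ∎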
A square matrix is a matrix with as many rows as columns. Clearly, if $B$ and $C$ are square real matrices of the same size, then both $B + C$ and $BC$ are defined. This gives room for more algebraic structure. An $n \times n$ real matrix $B$ is said to be invertible if and only if there is an $n \times n$ matrix $B^{-1}$, called the inverse of $B$, that satisfies
$$B^{-1}B = BB^{-1} = I_{n \times n}.$$
The following result shows that matrix inverses are unique.
C.4 THEOREM. Let $A$, $B$ be $n \times n$ real matrices. If $B$ is invertible and $BA = I$, then $B^{-1} = A$.
PROOF. By Theorem C.3 and definition,
$$B^{-1} = B^{-1}I = B^{-1}(BA) = (B^{-1}B)A = IA = A. \quad ∎$$
If B = [bij]nxn is square, recall that the minor matrix Bij of B is the (n-1) x (n-1) matrix obtained by removing the ith row and the jth column from B. For example, if
B=
[~11 ~22 ~33] ,
then B2l
= [;
~].
Minor matrices can be used to define an operation on square real matrices (the determinant) that makes invertible matrices easy to identify (see Theorem C.6). The determinant can be defined recursively as follows. Let $B$ be an $n \times n$ real matrix.
(i) If $n = 1$, then the determinant of $B$ is defined by $\det[b] = b$.
(ii) If $n = 2$, then the determinant of $B$ is defined by
$$\det\begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc.$$
(iii) If $n > 2$, then the determinant of $B$ is defined recursively by
$$\det[b_{ij}]_{n \times n} = b_{11}\det B_{11} - b_{12}\det B_{12} + \cdots + (-1)^{n-1}b_{1n}\det B_{1n},$$
where $B_{1j}$ are minor matrices of $B$.
The following result shows what an elementary column operation does to the determinant of a matrix.
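The recursive definition of the determinant given above can be sketched in a few lines. This is a direct transcription of cases (i) through (iii), with matrices as lists of rows and 0-based indices; the names `minor` and `det` and the sample matrix are illustrative assumptions. It is meant to mirror the definition, not to be efficient.

```python
def minor(B, i, j):
    """Minor matrix B_ij: remove row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(B) if k != i]

def det(B):
    """Determinant by expansion along the first row, following cases (i)-(iii)."""
    n = len(B)
    if n == 1:
        return B[0][0]
    if n == 2:
        return B[0][0] * B[1][1] - B[0][1] * B[1][0]
    # (-1)**j with 0-based j matches the alternating signs +, -, +, ... of the definition.
    return sum((-1) ** j * B[0][j] * det(minor(B, 0, j)) for j in range(n))

# Hypothetical sample matrix used only for illustration.
SAMPLE = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
SAMPLE_DET = det(SAMPLE)  # 1*(50-48) - 2*(40-42) + 3*(32-35) = -3
```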
C.5 THEOREM. Let $B = [b_{ij}]$ and $C = [c_{ij}]$ be $n \times n$ real matrices, $n \ge 2$.
(i) If $C$ is obtained from $B$ by interchanging two columns, then $\det C = -\det B$.
(ii) If $C$ is obtained from $B$ by multiplying one column of $B$ by a scalar $\alpha$, then $\det C = \alpha\det B$.
(iii) If $C$ is obtained from $B$ by multiplying one column of $B$ by a scalar and adding it to another column of $B$, then $\det C = \det B$.
PROOF. Since
$$\det\begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc = -(bc - ad) = -\det\begin{bmatrix} b & a \\ d & c \end{bmatrix},$$
part (i) holds for $2 \times 2$ matrices. Suppose that part (i) holds for $(n-1) \times (n-1)$ matrices. Suppose further that there are indices $j_0 < j_1$ such that $b_{ij_0} = c_{ij_1}$ and $b_{ij_1} = c_{ij_0}$ for $i = 1, \ldots, n$. By the inductive hypothesis, $\det C_{1j} = -\det B_{1j}$ for $j \ne j_0$ and $j \ne j_1$,
$$\det C_{1j_0} = (-1)^{j_1 - j_0 - 1}\det B_{1j_1}, \quad \text{and} \quad \det C_{1j_1} = (-1)^{j_0 - j_1 + 1}\det B_{1j_0}.$$
Hence, by definition,
$$\det C = c_{11}\det C_{11} - c_{12}\det C_{12} + \cdots + (-1)^{n-1}c_{1n}\det C_{1n} = -b_{11}\det B_{11} + b_{12}\det B_{12} + \cdots - (-1)^{n-1}b_{1n}\det B_{1n} = -\det B.$$
Thus (i) holds for all $n \in \mathbf{N}$. Similar arguments establish parts (ii) and (iii). ∎
In the same way we can show that Theorem C.5 holds if "column" is replaced by "row." It follows that we can compute the determinant of a real matrix by expanding along any row or any column, with an appropriate adjustment of signs. For example, to expand along the $i$th row, interchange the $i$th row with the first row, expand along the new first row, and use Theorem C.5 to relate everything back to $B$. In particular, we see that
$$\det B = \sum_{j=1}^{n} (-1)^{i+j}b_{ij}\det B_{ij} = \sum_{i=1}^{n} (-1)^{i+j}b_{ij}\det B_{ij},$$
where the first sum expands along any fixed row $i$ and the second along any fixed column $j$.
The numbers $(-1)^{i+j}\det B_{ij}$ are called the cofactors of $b_{ij}$ in $\det B$. The operations in Theorem C.5 are called elementary column operations. They can be simulated by matrix multiplication. Indeed, an elementary matrix is a matrix obtained from the identity matrix by a single elementary column operation. Thus elementary matrices fall into three categories: $E(i \leftrightarrow j)$, the matrix obtained by interchanging the $i$th and $j$th columns of $I$; $E(\alpha i)$, the matrix obtained by multiplying the $i$th column of $I$ by $\alpha \ne 0$; and $E(\alpha i + j)$, the matrix obtained by multiplying the $i$th column of $I$ by $\alpha \ne 0$ and adding it to the $j$th column. Notice that an elementary column operation on $B$ can be obtained by multiplying $B$ by an elementary matrix; e.g., $E(i \leftrightarrow j)B$ is the matrix obtained by interchanging the $i$th and $j$th columns of $B$. These observations can be used to show that the determinant is multiplicative.
C.6 THEOREM. If $B$, $C$ are $n \times n$ real matrices, then $\det(BC) = \det B\det C$. Moreover, $B$ is invertible if and only if $\det(B) \ne 0$.
PROOF. It is easy to check that
$$\det(E(i \leftrightarrow j)) = -1, \quad \det(E(\alpha i)) = \alpha, \quad \text{and} \quad \det(E(\alpha i + j)) = 1.$$
Hence, by Theorem C.5,
(7) $$\det(EA) = \det E\det A$$
holds for any $n \times n$ matrix $A$ and any $n \times n$ elementary matrix $E$. The matrix $B$ can be reduced, by a sequence of elementary column operations, to a matrix $V$, where $V = I$ if $B$ is invertible and $V$ has at least one zero column if $B$ is not invertible (see Noble and Daniel [9], p. 85). It follows that there exist elementary matrices $E_1, \ldots, E_p$ such that $B = E_1 \cdots E_pV$. Hence, by (7),
$$\det(B) = \det(E_1 \cdots E_pV) = \det(E_1)\det(E_2 \cdots E_pV) = \cdots = \det(E_1 \cdots E_p)\det(V).$$
In particular, $B$ is invertible if and only if $\det B \ne 0$. Suppose that $B$ is invertible. Then $V = I$ and by (7),
$$\det(BC) = \det(E_1 \cdots E_p)\det(VC) = \det B\det C.$$
If $B$ is not invertible, then $BC$ is not invertible either (see Noble and Daniel [9], p. 204). Hence, $\det(BC) = 0$ and we have
$$\det(BC) = 0 = \det B\det C. \quad ∎$$
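Theorems C.5(i) and C.6 can be spot-checked on small integer matrices, where the arithmetic is exact. The sketch below is self-contained; the helper names and the sample matrices `B_SAMPLE` and `C_SAMPLE` are illustrative assumptions, and a check on a few matrices is only a sanity test, not a proof.

```python
def det(B):
    """Determinant by cofactor expansion along the first row (n >= 1)."""
    n = len(B)
    if n == 1:
        return B[0][0]
    total = 0
    for j in range(n):
        sub = [row[:j] + row[j + 1:] for row in B[1:]]  # minor matrix B_1j
        total += (-1) ** j * B[0][j] * det(sub)
    return total

def mat_mul(B, C):
    """Product of an m x n matrix B and an n x q matrix C."""
    return [
        [sum(B[i][v] * C[v][j] for v in range(len(C))) for j in range(len(C[0]))]
        for i in range(len(B))
    ]

def swap_columns(B, j0, j1):
    """Copy of B with columns j0 and j1 interchanged (0-based indices)."""
    result = [row[:] for row in B]
    for row in result:
        row[j0], row[j1] = row[j1], row[j0]
    return result

# Hypothetical integer matrices chosen only for illustration.
B_SAMPLE = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
C_SAMPLE = [[2, 0, 1], [1, 3, 2], [1, 1, 2]]
PRODUCT_RULE_HOLDS = det(mat_mul(B_SAMPLE, C_SAMPLE)) == det(B_SAMPLE) * det(C_SAMPLE)
SWAP_RULE_HOLDS = det(swap_columns(B_SAMPLE, 0, 2)) == -det(B_SAMPLE)
```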
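The transpose of a matrix $B = [b_{ij}]$ is the matrix $B^T$ obtained from $B$ by making the $i$th row of $B$ the $i$th column of $B^T$; i.e., the $(i, j)$th entry of $B^T$ is $b_{ji}$. The adjoint of an $n \times n$ matrix $B$ is the transpose of the matrix of cofactors of $B$; i.e.,
$$\operatorname{adj}(B) = \left[(-1)^{i+j}\det B_{ij}\right]^T.$$
The adjoint can be used to give an explicit formula for the inverse of an invertible matrix.
C.7 THEOREM. Suppose that $B$ is a square real matrix. If $B$ is invertible, then
(8) $$B^{-1} = \frac{1}{\det B}\operatorname{adj}(B).$$
PROOF. Set $[c_{ij}] = B\operatorname{adj}(B)$. By definition,
$$c_{ij} = (-1)^{j+1}b_{i1}\det B_{j1} + (-1)^{j+2}b_{i2}\det B_{j2} + \cdots + (-1)^{j+n}b_{in}\det B_{jn}.$$
If $i = j$, then $c_{ij}$ is an expansion of the determinant of $B$ along the $i$th row of $B$; i.e., $c_{ii} = \det B$. If $i \ne j$, then $c_{ij}$ is a determinant of a matrix with two identical rows so $c_{ij}$ is zero. It follows that
$$B\operatorname{adj}(B) = \det B \cdot I.$$
We conclude by Theorem C.4 that (8) holds. ∎
In particular,
(9) $$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.$$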
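Formula (8) can be turned into a short routine that works in exact rational arithmetic. This is a sketch under stated assumptions: matrices are lists of rows of size at least $2 \times 2$, the names `adjoint` and `inverse` are hypothetical, and the method is shown for its closeness to the theorem rather than for efficiency.

```python
from fractions import Fraction

def minor(B, i, j):
    """Minor matrix B_ij: remove row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(B) if k != i]

def det(B):
    """Determinant by cofactor expansion along the first row (n >= 1)."""
    if len(B) == 1:
        return B[0][0]
    return sum((-1) ** j * B[0][j] * det(minor(B, 0, j)) for j in range(len(B)))

def adjoint(B):
    """Transpose of the matrix of cofactors: entry (i, j) is the cofactor of b_ji."""
    n = len(B)
    if n < 2:
        raise ValueError("this sketch assumes an n x n matrix with n >= 2")
    return [[(-1) ** (i + j) * det(minor(B, j, i)) for j in range(n)] for i in range(n)]

def inverse(B):
    """B^{-1} = adj(B) / det(B), as in (8); raises if det(B) = 0."""
    d = det(B)
    if d == 0:
        raise ZeroDivisionError("matrix is not invertible")
    return [[Fraction(entry, d) for entry in row] for row in adjoint(B)]

# For a 2 x 2 matrix this reproduces formula (9).
INV_2X2 = inverse([[1, 2], [3, 4]])  # [[-2, 1], [3/2, -1/2]]
```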
The following result shows how the determinant can be used to solve systems of linear equations. (This result is of great theoretical interest but of little practical use because it requires lots of storage to use on a computer. Most packaged routines that solve systems of linear equations use methods more efficient than Cramer's Rule, e.g., Gaussian elimination.)
C.8 THEOREM [CRAMER'S RULE]. Let $c_1, c_2, \ldots, c_n \in \mathbf{R}$ and $B = [b_{ij}]_{n \times n}$ be a square real matrix. The system
(10) $$\begin{aligned} b_{11}x_1 + b_{12}x_2 + \cdots + b_{1n}x_n &= c_1 \\ b_{21}x_1 + b_{22}x_2 + \cdots + b_{2n}x_n &= c_2 \\ &\ \,\vdots \\ b_{n1}x_1 + b_{n2}x_2 + \cdots + b_{nn}x_n &= c_n \end{aligned}$$
of $n$ linear equations in $n$ unknowns has a unique solution if and only if the matrix $B$ has a nonzero determinant, in which case
$$x_j = \frac{\det C(j)}{\det B},$$
where $C(j)$ is obtained from $B$ by replacing the $j$th column of $B$ by the column matrix $[c_1 \ldots c_n]^T$. In particular, if $c_j = 0$ for all $j$ and $\det B \ne 0$, then the system (10) has only the trivial solution $x_j = 0$ for $j = 1, 2, \ldots, n$.
PROOF. The system (10) is equivalent to the matrix equation
$$BX = C,$$
where $B = [b_{ij}]$, $X = [x_1 \ldots x_n]^T$, and $C = [c_1 \ldots c_n]^T$. If $\det B \ne 0$, then by Theorem C.7,
$$X = B^{-1}C = \frac{1}{\det B}\operatorname{adj}(B)C.$$
By definition, $\operatorname{adj}(B)C$ is a column matrix whose $j$th "row" is the number
$$(-1)^{1+j}c_1\det B_{1j} + (-1)^{2+j}c_2\det B_{2j} + \cdots + (-1)^{n+j}c_n\det B_{nj} = \det C(j).$$
(We expanded the determinant of $C(j)$ along the $j$th column.) Thus $x_j = \det C(j)/\det B$. Conversely, if $BX = C$ has a unique solution, $B$ can be row reduced to $I$. Thus $B$ is invertible; i.e., $\det B \ne 0$. ∎
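Cramer's Rule can be sketched directly from the statement of Theorem C.8. The code below uses exact rational arithmetic and a cofactor-expansion determinant; the names `cramer` and `replace_column` and the sample system are illustrative assumptions. As the text notes, this is of theoretical rather than practical interest.

```python
from fractions import Fraction

def det(B):
    """Determinant by cofactor expansion along the first row (n >= 1)."""
    if len(B) == 1:
        return B[0][0]
    total = 0
    for j in range(len(B)):
        sub = [row[:j] + row[j + 1:] for row in B[1:]]  # minor matrix B_1j
        total += (-1) ** j * B[0][j] * det(sub)
    return total

def replace_column(B, j, c):
    """C(j): copy of B with column j (0-based) replaced by the entries of c."""
    return [row[:j] + [c[i]] + row[j + 1:] for i, row in enumerate(B)]

def cramer(B, c):
    """Solve BX = c by x_j = det C(j) / det B; raises if det B = 0."""
    d = det(B)
    if d == 0:
        raise ZeroDivisionError("det B = 0: no unique solution")
    return [Fraction(det(replace_column(B, j, c)), d) for j in range(len(B))]

# Hypothetical system: 2x + y = 5, x + 3y = 10, whose solution is x = 1, y = 3.
SOLUTION = cramer([[2, 1], [1, 3]], [5, 10])
```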
D. QUADRIC SURFACES
A quadric surface is a surface that is the graph of a relation in $\mathbf{R}^3$ of the form
$$Ax^2 + By^2 + Cz^2 + Dx + Ey + Fz + Gxy + Hyz + Izx = J,$$
where $A, B, \ldots, J \in \mathbf{R}$ and not all $A, B, C, G, H, I$ are zero. We shall only consider the cases when $G = H = I = 0$. These include the following special types.
1. The ellipsoid, the graph of $Ax^2 + By^2 + Cz^2 = 1$, where $A$, $B$, $C$ are all positive.
2. The hyperboloid of one sheet, the graph of $Ax^2 + By^2 + Cz^2 = 1$, where two of $A$, $B$, $C$ are positive and the other is negative.
3. The hyperboloid of two sheets, the graph of $Ax^2 + By^2 + Cz^2 = 1$, where two of $A$, $B$, $C$ are negative and the other is positive.
4. The cone, the graph of $Ax^2 + By^2 + Cz^2 = 0$, where two of $A$, $B$, $C$ are positive and the other is negative.
5. The paraboloid, the graph of $z = Ax^2 + By^2$, where $A$, $B$ are both positive or both negative.
6. The hyperbolic paraboloid, the graph of $z = Ax^2 + By^2$, where one of $A$, $B$ is positive and the other is negative.
The trace of a surface $S$ in a plane $\Pi$ is defined to be the intersection of $S$ with $\Pi$. Graphs of many surfaces, including all quadrics, can be visualized by looking at their traces in various planes. We illustrate this technique with a typical example of each type of quadric.
[Figure D.1]
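The first three types in the list above are distinguished only by the signs of $A$, $B$, $C$, which makes them easy to sort mechanically. The sketch below is an illustration with hypothetical function names; it assumes the normalized form $Ax^2 + By^2 + Cz^2 = 1$ with all three coefficients nonzero, and also reports the trace in a horizontal plane $z = a$.

```python
def classify_central_quadric(A, B, C):
    """Classify the graph of A x^2 + B y^2 + C z^2 = 1 by counting positive coefficients."""
    coeffs = (A, B, C)
    if any(c == 0 for c in coeffs):
        raise ValueError("this sketch assumes A, B, C are all nonzero")
    positives = sum(1 for c in coeffs if c > 0)
    names = {
        3: "ellipsoid",
        2: "hyperboloid of one sheet",
        1: "hyperboloid of two sheets",
        0: "empty set",  # all coefficients negative: no real points satisfy the equation
    }
    return names[positives]

def trace_in_horizontal_plane(A, B, C, a):
    """Return (A, B, rhs) describing the trace A x^2 + B y^2 = rhs in the plane z = a."""
    return (A, B, 1 - C * a * a)
```

For instance, `trace_in_horizontal_plane(1, 1, -1, 2)` returns `(1, 1, 5)`, i.e. the circle $x^2 + y^2 = 5$ in the plane $z = 2$.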
D.1 Example. The ellipsoid $3x^2 + y^2 + 2z^2 = 6$.
SOLUTION. The trace of this surface in the $xy$ plane is the ellipse $3x^2 + y^2 = 6$. The trace of this surface in the $yz$ plane is the ellipse $y^2 + 2z^2 = 6$, and its trace in the $xz$ plane is the ellipse $3x^2 + 2z^2 = 6$. This surface is sketched in Figure D.1. ∎
[Figure D.2] [Figure D.3]
D.2 Example. The hyperboloid of one sheet $x^2 + y^2 - z^2 = 1$.
SOLUTION. The trace of this surface in the plane $z = a$ is the circle $x^2 + y^2 = 1 + a^2$. The trace of this surface in $x = 0$ is the hyperbola $y^2 - z^2 = 1$. This surface is sketched in Figure D.2. ∎
D.3 Example. The hyperboloid of two sheets $x^2 - y^2 - z^2 = 1$.
SOLUTION. The trace of this surface in the plane $z = 0$ is the hyperbola $x^2 - y^2 = 1$. The trace of this surface in $y = 0$ is the hyperbola $x^2 - z^2 = 1$. This surface has no trace in $x = 0$. This surface is sketched in Figure D.3. ∎
D.4 Example. The cone $z^2 = x^2 + y^2$.
SOLUTION. The trace of this surface in the plane $z = a$ is the circle $x^2 + y^2 = a^2$. The trace of this surface in $y = 0$ is a pair of lines $z = \pm x$. This surface is sketched in Figure D.4. ∎
[Figure D.4] [Figure D.5]
D.5 Example. The paraboloid $z = x^2 + y^2$.
SOLUTION. If $a > 0$, the trace of this surface in the plane $z = a$ is the circle $x^2 + y^2 = a$. The trace of this surface in $y = 0$ is the parabola $z = x^2$. This surface is sketched in Figure D.5. ∎
D.6 Example. The hyperbolic paraboloid $z = x^2 - y^2$.
SOLUTION. The trace of this surface in the plane $z = a$ is the hyperbola $a = x^2 - y^2$. (It opens up around the $xz$ plane when $a > 0$, and around the $yz$ plane when $a < 0$.) The trace of this surface in the plane $y = 0$ is the parabola $z = x^2$. This surface is sketched in Figure D.6. (Note: The scale along the $x$ axis has been exaggerated to enhance perspective, so the hyperbolas below the $z = 0$ plane are barely discernible.) ∎
[Figure D.6]
E. VECTOR CALCULUS AND PHYSICS
Throughout this section $C = (\varphi, I)$ is a smooth arc in $\mathbf{R}^2$, $S = (\psi, E)$ is a smooth surface in $\mathbf{R}^3$, $\{t_0, \ldots, t_N\}$ is a partition of $I$, and $\{R_1, \ldots, R_N\}$ is a grid on $E$.
E.1 Remark. The integral
(11) $$\iint_S d\sigma = \iint_E \|N_\psi(u, v)\|\,d(u, v)$$
can be interpreted as the surface area of $S$.
Let $(u_j, v_j)$ be the lower left-hand corner of $R_j$ and suppose that $R_j$ has sides $\Delta u$, $\Delta v$ (see Figure E.1). If $R_j$ is small enough, the trace of each piece $S_j = (\psi, R_j)$ is approximately equal to the parallelogram determined by the vectors $\Delta u\,\psi_u$ and $\Delta v\,\psi_v$. Hence, by Exercise 7 in Section 8.2,
$$A(S_j) \approx \|(\Delta u\,\psi_u(u_j, v_j)) \times (\Delta v\,\psi_v(u_j, v_j))\| = \|N_\psi(u_j, v_j)\|\,\Delta u\,\Delta v = \|N_\psi(u_j, v_j)\|\,|R_j|.$$
Summing over $j$, we obtain
$$A(S) \approx \sum_{j=1}^{N} \|N_\psi(u_j, v_j)\|\,|R_j|,$$
which is a Riemann sum of the integral (11).
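The Riemann sum just described can be evaluated numerically. The sketch below is an illustration under stated assumptions: the surface is the unit sphere with the (hypothetical) parametrization $\psi(u, v) = (\sin u\cos v, \sin u\sin v, \cos u)$ on $E = [0, \pi] \times [0, 2\pi]$, its partial derivatives are supplied by hand, and the grid is uniform with sample points at lower left-hand corners as in the remark.

```python
import math

def cross(a, b):
    """Cross product of two vectors in R^3."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def norm(a):
    """Euclidean norm of a vector in R^3."""
    return math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)

def surface_area(psi_u, psi_v, u_range, v_range, n):
    """Riemann sum of ||psi_u x psi_v|| over an n-by-n uniform grid, sampled at lower-left corners."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u, v = u0 + i * du, v0 + j * dv
            total += norm(cross(psi_u(u, v), psi_v(u, v))) * du * dv
    return total

def sphere_u(u, v):
    """Partial derivative of the assumed sphere parametrization with respect to u."""
    return (math.cos(u) * math.cos(v), math.cos(u) * math.sin(v), -math.sin(u))

def sphere_v(u, v):
    """Partial derivative of the assumed sphere parametrization with respect to v."""
    return (-math.sin(u) * math.sin(v), math.sin(u) * math.cos(v), 0.0)

SPHERE_AREA = surface_area(sphere_u, sphere_v, (0.0, math.pi), (0.0, 2 * math.pi), 200)
```

With this grid the sum should be close to the exact area $4\pi$ of the unit sphere.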
[Figure E.1]
E.2 Remark. If $w$ is a thin wire lying along $C$, whose density (mass per unit length) at a point $(x, y)$ is given by $g(x, y)$, then
$$\int_C g\,ds$$
can be interpreted as the mass of $w$.
Since mass is the product of density and length, an approximation to the mass of the piece of $w$ lying along $C_k = (\varphi, [t_{k-1}, t_k])$ is given by
$$m_k = g(\varphi(t_k))\,L(C_k) = g(\varphi(t_k))\int_{t_{k-1}}^{t_k} \|\varphi'(t)\|\,dt$$
(see Definition 13.9). Summing over $k$, an approximation to the mass of $w$ is
$$\sum_{k=1}^{N} g(\varphi(t_k))\int_{t_{k-1}}^{t_k} \|\varphi'(t)\|\,dt,$$
which is nearly a Riemann sum of the integral
$$\int_I g(\varphi(t))\|\varphi'(t)\|\,dt = \int_C g\,ds.$$
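The mass integral of Remark E.2 can be approximated by the Riemann sum $\sum_k g(\varphi(t_k))\|\varphi'(t_k)\|(t_k - t_{k-1})$. The sketch below is only an illustration: the function name `wire_mass`, the uniform partition, and the sample wire (the upper unit semicircle with density $g(x, y) = y$, whose exact mass is $\int_0^\pi \sin t\,dt = 2$) are assumptions, not taken from the text.

```python
import math

def wire_mass(g, phi, dphi, a, b, n):
    """Riemann sum of g(phi(t)) * ||phi'(t)|| over a uniform partition of [a, b], right endpoints."""
    dt = (b - a) / n
    total = 0.0
    for k in range(1, n + 1):
        t = a + k * dt
        x, y = phi(t)
        dx, dy = dphi(t)
        total += g(x, y) * math.hypot(dx, dy) * dt
    return total

def semicircle(t):
    """Assumed parametrization of the upper unit semicircle, t in [0, pi]."""
    return (math.cos(t), math.sin(t))

def semicircle_derivative(t):
    """Derivative of the assumed parametrization."""
    return (-math.sin(t), math.cos(t))

def density(x, y):
    """Hypothetical density: proportional to the height y."""
    return y

MASS = wire_mass(density, semicircle, semicircle_derivative, 0.0, math.pi, 1000)
```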
The following remark has a similar justification.
E.3 Remark. If $S$ is a thin sheet of metal whose density at a point $(x, y, z)$ is given by $g(x, y, z)$, then
$$\iint_S g\,d\sigma$$
can be interpreted as the mass of $S$.
Work done by a force F acting on an object as it moves a distance d is defined to be W = Fd. There are many situations where the force changes from point to point. Examples include the force of gravity (which is weaker at higher altitudes), the velocity of a fluid flowing through a constricted tube (which gets faster at places where the tube narrows), the force on an electron moving through an electric field, and the force on a copper coil moving through a magnetic field.
[Figure E.2a] [Figure E.2b]
E.4 Remark. If an object acted on by a force $F : \mathbf{R}^3 \to \mathbf{R}$ moves along the curve $C = (\varphi, I)$, then the unoriented line integral
$$\int_C F\,ds$$
can be interpreted as the work done by $F$ along $C$.
An approximation to the work done along $C_k = (\varphi, [t_{k-1}, t_k])$ is
$$W_k = F(\varphi(t_k))\,L(C_k) = F(\varphi(t_k))\int_{t_{k-1}}^{t_k} \|\varphi'(t)\|\,dt.$$
Summing over $k$, we find that an approximation to the total work along $C$ is given by
$$\sum_{k=1}^{N} F(\varphi(t_k))\int_{t_{k-1}}^{t_k} \|\varphi'(t)\|\,dt,$$
which is nearly a Riemann sum of the integral
$$\int_I F(\varphi(t))\|\varphi'(t)\|\,dt = \int_C F\,ds.$$
The following remark explains why $F \cdot T$ is called the tangential component of $F$ and $F \cdot n$ is called the normal component of $F$.
E.5 Remark. Let $u$ be a unit vector in $\mathbf{R}^2$ (respectively, $\mathbf{R}^3$) and $F$ be a function whose range is a subset of $\mathbf{R}^2$ (respectively, $\mathbf{R}^3$). If $\ell$ is the line in the direction $u$ passing through the origin, then $|F \cdot u|$ is the length of the projection of $F$ onto $\ell$ (see Figure E.2a).
Let $\theta$ represent the angle between $u$ and $F$. By (3) in Section 8.1,
$$|F \cdot u| = |\cos\theta|\,\|F\|\,\|u\| = |\cos\theta|\,\|F\|.$$
Hence, by trigonometry, $|F \cdot u|$ is the length of the projection of $F$ onto $\ell$. Notice that $F \cdot u$ is positive when $\theta$ is acute and negative when $\theta$ is obtuse.
Combining Remarks E.4 and E.5, we see that $\int_C F \cdot T\,ds$ represents the work done by the tangential component of a force field $F : \mathbf{R}^3 \to \mathbf{R}^3$ along $C$.
E.6 Remark. If $S = (\psi, E)$ is a thin membrane submerged in an incompressible fluid that passes through $S$, and $F(x, y, z)$ represents the velocity vector of the flow of that fluid at the point $(x, y, z)$, then the oriented integral of $F \cdot n$ can be interpreted as the volume of fluid flowing through $S$ in unit time.
Let $\{E_j\}$ be a grid that covers $E$, and let $h$ be the length of the line segment obtained by projecting $F$ onto the normal line to $S$ at a point $(x_j, y_j, z_j) \in \psi(E_j)$ (see Figure E.2b). If $E_j$ is so small that $F$ is essentially constant on the trace of $S_j = (\psi, E_j)$, then an approximation to the volume of fluid passing through $S_j$ per unit time is given by
$$V_j = A(S_j) \cdot h = A(S_j) \cdot F(x_j, y_j, z_j) \cdot n = A(S_j)\,F(\psi(u_j, v_j)) \cdot N_\psi(u_j, v_j)/\|N_\psi(u_j, v_j)\|.$$
Summing over $j$ and replacing $A(S_j)$ by $\|N_\psi\|\,|E_j|$ (see Remark E.1), we see that an approximation to the volume $V$ of fluid passing through $S$ per unit time is given by
$$\sum_{j=1}^{N} F(\psi(u_j, v_j)) \cdot N_\psi(u_j, v_j)\,|E_j|.$$
This is a Riemann sum of the oriented integral
$$\iint_E F(\psi(u, v)) \cdot N_\psi(u, v)\,dA = \iint_S F \cdot n\,d\sigma.$$
F. EQUIVALENCE RELATIONS
A partition of a set $X$ is a family of nonempty sets $\{E_\alpha\}_{\alpha \in A}$ such that
$$X = \bigcup_{\alpha \in A} E_\alpha \quad \text{and} \quad E_\alpha \cap E_\beta = \emptyset$$
for $\alpha \ne \beta$. A binary relation $\sim$ on $X$ is a subset of $X \times X$. If $(x, y)$ belongs to $\sim$, we shall write $x \sim y$. Examples of binary relations include $=$ on $\mathbf{R}$, $\le$ on $\mathbf{R}$, and "parallel to" on the class of straight lines in $\mathbf{R}^2$. A binary relation is called an equivalence relation if it satisfies three additional properties.
[REFLEXIVE PROPERTY] For every $x \in X$, $x \sim x$.
[SYMMETRIC PROPERTY] If $x \sim y$, then $y \sim x$.
[TRANSITIVE PROPERTY] If $x \sim y$ and $y \sim z$, then $x \sim z$.
Notice that $=$ is an equivalence relation on $\mathbf{R}$, "parallel to" is an equivalence relation on the class of straight lines in $\mathbf{R}^2$, but $\le$ is not an equivalence relation on $\mathbf{R}$ (it fails to satisfy the Symmetric Property). If $\sim$ is an equivalence relation on a set $X$, then
$$\bar{x} := \{y \in X : y \sim x\}$$
is called the equivalence class of $X$ that contains $x$.
F.1 THEOREM. If $\sim$ is an equivalence relation on a set $X$, then the set of equivalence classes $\{\bar{x} : x \in X\}$ forms a partition of $X$.
PROOF. Since $\sim$ is reflexive, each equivalence class $\bar{x}$ contains $x$; i.e., $\bar{x}$ is nonempty and the union of all the equivalence classes is $X$. Suppose that $\bar{x} \cap \bar{y} \ne \emptyset$; i.e., some $z \in X$ belongs to both these equivalence classes. Then $z \sim x$ and $z \sim y$. By the Symmetric Property and the Transitive Property, we have $x \sim y$; i.e., $y \in \bar{x}$. By the Transitive Property, it follows that $\bar{y} \subseteq \bar{x}$. Reversing the roles of $x$ and $y$, we also have $\bar{x} \subseteq \bar{y}$. Thus $\bar{x} = \bar{y}$. ∎
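Theorem F.1 can be illustrated on a finite set. The sketch below groups elements into equivalence classes and then checks the two defining properties of a partition; the function names and the sample relation (congruence modulo 3 on $\{0, 1, \ldots, 9\}$) are assumptions chosen for illustration, and the grouping is only correct when `related` really is an equivalence relation.

```python
def equivalence_classes(elements, related):
    """Group elements into classes; assumes `related` is an equivalence relation."""
    classes = []
    for x in elements:
        for cls in classes:
            # By symmetry and transitivity, comparing with one representative suffices.
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes

def is_partition(classes, elements):
    """Check that the classes are nonempty, pairwise disjoint, and cover `elements`."""
    if any(len(cls) == 0 for cls in classes):
        return False
    flat = [x for cls in classes for x in cls]
    return len(flat) == len(set(flat)) and set(flat) == set(elements)

def congruent_mod_3(x, y):
    """Hypothetical sample relation: x ~ y when x - y is divisible by 3."""
    return (x - y) % 3 == 0

X = list(range(10))
CLASSES = equivalence_classes(X, congruent_mod_3)  # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```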
1. APOSTOL, TOM M., Mathematical Analysis. Reading, Mass.: Addison-Wesley Publishing Co., 1974.
2. BOAS, RALPH P., JR., Primer of Real Functions, Carus Monograph 13. New York: Mathematical Association of America and John Wiley & Sons, Inc., 1960.
3. GRIFFITHS, HUBERT B., Surfaces. London and New York: Cambridge University Press, 1976.
4. HOCKING, JOHN G. AND GAIL S. YOUNG, Topology. Reading, Mass.: Addison-Wesley Publishing Co., 1961.
5. KLINE, MORRIS, Mathematical Thought from Ancient to Modern Times. New York: Oxford University Press, 1972.
6. LOOMIS, LYNN H. AND SHLOMO STERNBERG, Advanced Calculus. Reading, Mass.: Addison-Wesley Publishing Co., 1968.
7. MARSDEN, JERROLD E., Elementary Classical Analysis. New York: W.H. Freeman & Co., 1990.
8. MUNKRES, JAMES R., Elementary Differential Topology, Annals of Mathematical Studies. Princeton, N.J.: Princeton University Press, 1963.
9. NOBLE, BEN AND JAMES W. DANIEL, Applied Linear Algebra, 2nd ed. Upper Saddle River, N.J.: Prentice Hall, Inc., 1977.
10. PRICE, G. BALEY, Multivariable Analysis. New York: Springer-Verlag New York, Inc., 1984.
11. RUDIN, WALTER, Principles of Mathematical Analysis, 3rd ed. New York: McGraw-Hill Book Co., 1976.
12. SPIVAK, MICHAEL, Calculus on Manifolds. New York: W.A. Benjamin, Inc., 1965.
13. TAYLOR, ANGUS E., Advanced Calculus. Boston: Ginn and Company, 1955.
14. WIDDER, DAVID V., Advanced Calculus, 2nd ed. Upper Saddle River, N.J.: Prentice Hall, Inc., 1961.
15. ZYGMUND, ANTONI, Trigonometric Series, Vol. I, 2nd ed. London and New York: Cambridge University Press, 1968.
Answers and Hints to Selected Exercises
CHAPTER 1 1.1 Ordered Field Axioms
2. (a) (-3,7). (b) (-3,5). (c) (-1,-1/2) U (1,00). (d) (-2,1). 3. (b) Consider the cases c = 0 and c =1= o. 4. To prove (7), multiply the first inequality in (7) by c and the second inequality in (7) by b. Prove (8) and (9) by contradiction. 5. (a) Apply (6) to 1- a. (b) Apply (6) to a -1. (c) Observe that (Va - Vb)2 ;::: o. 6. (a) Use uniqueness of multiplicative inverses to prove that (nq)-l = n-1q-l. (b) Use part (a). (c) Use proof by contradiction for the sum. Use a similar argument for the product, and identify all rationals q such that xq E Q for a given x E R \ Q. (d) Use the Multiplicative Properties. 7. (a) Prove that Ixl ::; 1 implies Ix + 11 ::; 2. (b) Prove that -1 ::; x ::; 2 implies Ix + 21 ::; 4. 8. (a) n > 99. (b) n ;::: 20. (c) n ;::: 23. 9. Show first that the given inequality is equivalent to 2a1b1a2b2 ::; a~bf + aib~. 10. (a) Observe that Ixy - abl = Ixy - xb + xb - abl and Ixl < lal + E. 11. (a) The Trichotomy Property implies (i); the Additive and Multiplicative Properties imply (ii). 1.2 The Well-Ordering Principle
3. $\sum_{k=1}^{n} \binom{n}{k} x^{n-k} h^{k-1}$.
4. See Exercise 5 in Section 1.1.
5. Observe that $x^2 - x - 2 < 0$ for all $0 < x < 2$.
6. (b) First prove that $2n + 1 < 2^n$ for $n = 3, 4, \ldots$.
8. (a) Show that $n^2 + 3n$ cannot be the square of an integer when $n > 1$. (b) The expression is rational if and only if $n = 9$.
10. (a) This recursion, discovered by P.W. Wade, generates all Pythagorean triples $a, b, c$ that satisfy $c - b = 1$.
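Hints 6(b) and 8(a) of Section 1.2 rest on two elementary facts that can be spot-checked by machine before proving them. The following Python sketch is an illustration only (a finite check is not a proof); the helper name `is_perfect_square` and the search ranges are assumptions made for this example.

```python
from math import isqrt


def is_perfect_square(m: int) -> bool:
    """Return True when the nonnegative integer m is a perfect square."""
    r = isqrt(m)
    return r * r == m


# Hint 6(b): 2n + 1 < 2^n for n = 3, 4, ... (checked on a finite range only).
exponential_wins = all(2 * n + 1 < 2 ** n for n in range(3, 200))

# Hint 8(a): n^2 + 3n should never be a perfect square when n > 1.
square_hits = [n for n in range(2, 10_000) if is_perfect_square(n * n + 3 * n)]

print(exponential_wins, square_hits)
```

The reason no hit can occur is that $(n+1)^2 < n^2 + 3n < (n+2)^2$ for $n > 1$, so the value is trapped strictly between consecutive squares.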
1.3 The Completeness Axiom
1. (a) $\inf E = 1$, $\sup E = 8$. (b) $\inf E = (3 - \sqrt{29})/2$, $\sup E = (3 + \sqrt{29})/2$. (c) $\inf E = a$, $\sup E = b$. (d) $\inf E = 0$, $\sup E = \sqrt{2}$. (e) $\inf E = 0$, $\sup E = 2$. (f) $\inf E = -1$, $\sup E = 2$. (g) $\inf E = 0$, $\sup E = 3/2$.
2. Prove that $\sup E$ must be an integer.
3. Notice that $a - \sqrt{2} < b - \sqrt{2}$, and use Exercise 6c in Section 1.1.
5. (b) Apply Theorem 1.20 to $-E$.
6. (b) Apply the Completeness Axiom to $-E$.
8. $t_1 \le t_2 \le \cdots$.
9. Use the proof of Theorem 1.24 as a model.
10. After showing that $\sup A$ and $\sup B$ exist, prove that $\max\{\sup A, \sup B\} \le \sup E$.
1.4 Functions, Countability, and the Algebra of Sets
1. (a) $f^{-1}(x) = (x + 7)/3$. (b) $f^{-1}(x) = 1/\log x$. (c) $f^{-1}(x) = \arctan x$. (d) $f^{-1}(x) = (-3 + \sqrt{33 + 4x})/2$. (e) $f^{-1}(x) = (x - 2)/3$ when $x \le 2$, $f^{-1}(x) = x - 2$ when $2 < x \le 4$, and $f^{-1}(x) = (x + 2)/3$ when $x > 4$. (f) $f^{-1}(x) = (1 - \sqrt{1 - 4x^2})/2x$ when $x \neq 0$, and $f^{-1}(0) = 0$.
2. If
5. (a) $[-1,2]$. (b) $[0,1]$. (c) $[0,1]$. (d) $\{0\}$.
6. First prove that $f(A) \setminus f(B) \subseteq f(A \setminus B)$ and $A \subseteq f^{-1}(f(A))$ hold whether or not $f$ is 1-1.
10. (a) Prove it by induction on $n$. (b) Use Exercise 9.
CHAPTER 2 2.1 Limits of Sequences 2. (b) Apply Definition 2.1 with c/3 in place of c. 4. (b) Definition 2.1 works for any c, including c/e.
2.2 Limit Theorems
1. (c) You may use Exercise 4.
2. (a) $-3$. (b) $1/5$. (c) $\sqrt{2}$ (see Exercise 4). (d) $0$.
4. You may wish to prove that $\sqrt{x_n} - \sqrt{x} = (x_n - x)/(\sqrt{x_n} + \sqrt{x})$.
5. Use Theorem 1.24.
7. (a) or (b) If $x = \lim_{n\to\infty} x_n$ exists, what is $\lim_{n\to\infty} x_{n+1}$?
9. (a) See Exercise 1c in Section 1.2.
10. (a) Modify the proof of Theorem 1.24.
2.3 The Bolzano-Weierstrass Theorem
1. You only need to prove that $\{x_n\}$ has a convergent subsequence, not actually find it.
4. See Exercise 4a in Section 1.2.
5. Prove that $x \le \sqrt{2x + 3}$ for $-3/2 \le x \le 3$.
6. See Exercise 4b in Section 1.2.
8. Prove that $\{x_n\}$ is monotone.
9. (a) See Exercise 5c in Section 1.1.
2.4 Cauchy Sequences
2. You may use Theorem 2.29.
6. You may use Exercise 4.
7. Is it Cauchy? (See Exercise 1c in Section 1.2.)
8. (a) Use the Bolzano-Weierstrass Theorem.
2.5 Limits Supremum and Infimum
1. (a) $2, 4$. (b) $-1, 1$. (c) $-1, 1$. (d) $1/2, 1/2$. (e) $0, 0$. (f) $0, \infty$. (g) $\infty, \infty$.
4. (a) First prove that $\inf_{k \ge n} x_k + \inf_{k \ge n} y_k \le \inf_{k \ge n}(x_k + y_k)$. (c) By (b), the first and final inequalities can only be strict if neither $\{x_n\}$ nor $\{y_n\}$ converges.
7. Let $s = \inf_{n \in \mathbf{N}}(\sup_{k \ge n} x_k)$ and consider the cases $s = \infty$, $s = -\infty$, and $s \in \mathbf{R}$.
8. Let $s = \liminf_{n\to\infty} x_n$ and consider the cases $s = \infty$, $s = 0$, and $0 < s < \infty$.
CHAPTER 3
3.1 Two-Sided Limits
1. For all parts, see Example 3.3.
2. (b) The limit is zero. (c) Does it exist as an extended real number? Why not?
3. (a) $1/2$. (b) $3/2$. (c) $0$. (d) $n$. (e) $0$.
7. (b) Use Exercise 8.
9. (b) Use Exercise 8.
3.2 One-Sided Limits and Limits at Infinity
2. (a) $-\infty$. (b) $0$. (c) $0$. (d) $1$. (e) $\infty$.
3. (a) $-3$. (b) $0$. (c) $-\infty$. (d) $\pi/2$. (e) $0$. (f) It does not exist.
4. (b) Use Theorem 3.8.
6. You may use $\cos x \to 1$ as $x \to 0$.
8. Prove that if $f(x)$ does not converge to $L$ as $x \to \infty$, then there is a sequence $\{x_n\}$ such that $x_n \to \infty$ but $f(x_n)$ does not converge to $L$ as $n \to \infty$.
9. See Exercise 5 in Section 2.2.
3.3 Continuity
1. (c) Recall that $2^x = e^{x \log 2}$.
2. (c) Recall that $\sqrt{x}$ is continuous on $[0,\infty)$.
8. (b) Use part (a) to show that $f(x) = f(mx/m) = mf(x/m)$ first. (d) If the statement is true, then $m$ must equal $f(1)$.
9. Begin by showing that $f(0) = 1$.
3.4 Uniform Continuity
1. (c) Recall that $\sin 2x - \sin 2a = 2\sin(x - a)\cos(x + a)$.
6. (c) and (e) Prove that $f(x) = x$ and $g(x) = x^2$ are both uniformly continuous on $(0,1)$ but only one of them is uniformly continuous on $[0,\infty)$.
7. (a) This is a function analogue of the Monotone Convergence Theorem.
9. You may wish to prove that if $P(x) = a_nx^n + \cdots + a_0$ is a polynomial of degree $n \ge 1$ whose leading coefficient satisfies $a_n > 0$, then $P(x) \to \infty$ as $x \to \infty$.
CHAPTER 4
4.1 The Derivative
2. Use Definition 4.1 directly.
4. (a) Use (ii) and (vi) to prove that $\sin x \to 0$ as $x \to 0$. Use (iii) to prove that $\cos x \to 1$ as $x \to 0$. (b) First prove that $\sin x = \sin(x - x_0)\cos x_0 + \cos(x - x_0)\sin x_0$ for any $x, x_0 \in \mathbf{R}$. (c) Inequality (vi) and $0 \le 1 - \cos x \le 1 - \cos^2 x$ play a prominent role here. (d) Use (iv) and part (c).
4.2 Differentiability Theorems
1. (a) $(5x^2 - 6x + 3)/(2\sqrt{x})$, $x > 0$. (b) $-(2x + 1)/(x^2 + x - 1)^2$, $x \neq (-1 \pm \sqrt{5})/2$. (c) $(1 + \log x)x^x$, $x > 0$. (d) $f'(x) = (3x^2 + 4x - 1)(x^3 + 2x^2 - x - 2)/|x^3 + 2x^2 - x - 2|$ for $x \neq 1, -1, -2$.
2. (a) $3a + c$. (b) $(2b - d)/8$. (c) $bc$. (d) $bc$.
3. No, $f$ is not differentiable at $0$.
8. (a) Observe that $y^n = x^m$ and use Exercise 6 in Section 4.1 together with the Chain Rule. (b) To handle the case $q < 0$, first prove that $(x^{-1})' = -x^{-2}$ for all $x \neq 0$.
4.3 Mean Value Theorem
1. (a) $3$. (b) $-\infty$. (c) $e^{1/6}$. (d) $1$. (e) $-1/\pi$. (f) $-1$.
3. (b) First prove that if $g(x) = e^{-1/x^2}/x^k$ for some $k \in \mathbf{N}$, then $g(x) \to 0$ as $x \to 0$. Next, prove that given $n \in \mathbf{N}$, there are integers $N = N(n) \in \mathbf{N}$ and $a_k = a_k^{(n)} \in \mathbf{Z}$ such that
$$f^{(n)}(x) = \begin{cases} \displaystyle\sum_{k=1}^{N} \frac{a_k}{x^k}\, e^{-1/x^2} & x \neq 0 \\ 0 & x = 0. \end{cases}$$
(Note: Although for each $n \in \mathbf{N}$ many of the $a_k$'s are zero, this fact is not needed in this exercise.)
4. (b) Find the maximum of $f(x) = \log x/x^{\alpha}$ for $x \in [1,\infty)$.
11. This is the only exercise in this section that has nothing to do with the Mean Value Theorem or l'Hôpital's Rule.
12. (a) Compare with Exercise 3 in Section 4.1.
4.4 Monotone Functions and Inverse Function Theorem
1. (a) $a > -3$. (b) $a \ge -3/4$. (c) $f$ is strictly decreasing on $(-\infty,1]$ and strictly increasing on $[1,\infty)$.
2. (a) $1/\pi$. (b) $1/e$. (c) $1/\pi$.
3. (b) $1/4e$.
4. Observe that if $x = \sin y$, then $\cos y = \sqrt{1 - x^2}$.
6. $f(x) = \pm\sqrt{\alpha x + c}$ for some $c \in \mathbf{R}$.
8. Use Theorem 4.29.
9. Use Darboux's Theorem.
10. Use Darboux's Theorem and Lemma 4.28.
CHAPTER 5 5.1 Riemann Integral
4. (a) Use the Sign Preserving Property.
5. First show that $\int_I f(x)\,dx = 0$ for all subintervals $I$ of $[a,b]$.
8. (a) Notice that $|x_j - x_{j-1}| \le \|P\|$ for each $j = 1, 2, \ldots, n$.
5.2 Riemann Sums
1. (a) $1/4$. (b) $\pi a^2/4$. (c) $9$. (d) $(3/2)(b^2 - a^2) + (b - a)$. (Note: If $a \ge -1/3$ or $b \le -1/3$, the integral represents the area of a trapezoid; if $a < -1/3 < b$, the integral represents the difference of the areas of two triangles, one above the $x$ axis and the other below the $x$ axis.)
3. Do not forget that $f$ is bounded.
5. (b) You may use the fact that $\int x^n\,dx = x^{n+1}/(n + 1)$.
8. (a) If $|f(x_0)| > M - \varepsilon/2$ for some $x_0 \in [a,b]$, can you choose a nondegenerate interval $I$ such that $|f(x)| > M - \varepsilon$ for all $x \in I$? (b) See Example 2.21.
5.3 Fundamental Theorem of Calculus
1. (a) $15$. (b) $1$. (c) $(4^{100} - 1)/300$. (d) $(e^2 + 1)/4$. (e) $(e^{\pi/2} + 1)/2$. (f) $4\sqrt{3} - 2\sqrt{11}$.
3. (a) $2xf(x^2)$. (b) $h(t) + \sin t \cdot h(\cos t)$. (c) $g(-t)$. (d) Integrate by parts with $u = f(x)$.
8. Use the Fundamental Theorem of Calculus.
10. (a) See Exercise 4 in Section 5.1. (b) See Exercise 3 in Section 4.3.
5.4 Improper Riemann Integration
1. (a) $3/2$. (b) $\pi$. (c) $3/2$. (d) $4$.
2. (a) $p > 1$. (b) $p < 1$. (c) $p > 1$. (d) $p > 1$. (e) $p > 1$.
3. Compare with Example 5.44.
4. (a) Diverges. (b) Diverges. (c) Converges. (d) Converges. (e) Converges.
9. Integrate by parts first.
10. (a) You might begin by verifying $\sin x \ge \sqrt{2}/2$ for $x \in [\pi/4, \pi/2]$ and $\sin x \ge 2x/\pi$ for $x \in [0, \pi/4]$.
5.5 Functions of Bounded Variation
4. Combine Lemma 4.28 and Theorem 3.39.
9. For the bounded case, prove that $(L)\int_a^b |f'(x)|\,dx \le \operatorname{Var} f \le (U)\int_a^b |f'(x)|\,dx$.
5.6 Convex Functions
5. Use Remark 5.60.
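The two inequalities suggested in hint 10(a) of Section 5.4 can be sanity-checked numerically before they are proved. The Python sketch below samples each interval on a uniform grid; the helper name `sample`, the grid size, and the small tolerance are assumptions of this illustration, and passing the check does not replace the proof (concavity of $\sin x$ on $[0, \pi/2]$ is the usual argument).

```python
from math import sin, sqrt, pi


def sample(a: float, b: float, n: int = 1000) -> list:
    """Return n + 1 equally spaced points of [a, b], endpoints included."""
    return [a + (b - a) * i / n for i in range(n + 1)]


TOL = 1e-12  # absorbs rounding where equality holds (x = pi/4 and x = 0)

# sin x >= sqrt(2)/2 on [pi/4, pi/2]
first_ok = all(sin(x) >= sqrt(2) / 2 - TOL for x in sample(pi / 4, pi / 2))

# sin x >= 2x/pi on [0, pi/4]
second_ok = all(sin(x) >= 2 * x / pi - TOL for x in sample(0.0, pi / 4))

print(first_ok, second_ok)
```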
CHAPTER 6
6.1 Introduction
2. (a) $1/(1 + \pi)$. (b) $5/6$. (c) $21/4$. (d) $e/(e - 2)$.
3. (a) $1$. (b) $\log(2/3)$. (c) $-1 + \pi/4$.
4. $|x| \le 1$.
6. (b) Consider the geometric series.
7. (c) Notice that if the partial sums of $\sum_{k=1}^{\infty} b_k$ are bounded, then $b = 0$.
8. (b) See Exercise 7b. (d) First prove that if $a_k \ge 0$ and $\sum_{k=0}^{\infty} a_k$ diverges, then $\sum_{k=0}^{\infty} a_k = \infty$.
9. (a) Is $na_{2n} \le \sum_{k=n}^{\infty} a_k$?
10. Use Corollary 6.9.
6.2 Series with Nonnegative Terms
1. (c) If $p > 1$, are there constants $C > 0$ and $q > 1$ such that $\log k/k^p \le Ck^{-q}$?
2. (a) No, you cannot apply the p-Series Test to $k^{1-1/k}$ because the exponent $p := (1 - 1/k)$ is NOT constant, but depends on $k$. (d) Try the Integral Test.
3. It converges when $p > 1$ and diverges when $0 \le p \le 1$.
5. See Exercise 7 in Section 4.4.
9. It diverges when $0 < q \le 1$ and converges when $q > 1$.
6.3 Absolute Convergence
2. (a) Convergent. (b) Divergent. (c) Convergent. (d) Divergent. (e) Convergent. (f) Divergent. (g) Convergent.
6. (a) $(1,\infty)$. (b) $\emptyset$. (c) $(-\infty,-1) \cup (1,\infty)$. (d) $(1/2,\infty)$. (e) $(-\infty,\log_2(e))$. (Use Stirling's formula when $p = \log_2(e)$.) (f) $(1,\infty)$.
9. (a) See Exercise 8 in Section 6.1.
10. See Definition 2.32.
6.4 Alternating Series
1. (d) Use Example 6.34.
2. (a) $[-1,1)$. (b) $(-\infty,\infty)$. (c) $(-1,1]$. (d) $[-3,-1]$.
3. (a) Absolutely convergent. (b) Absolutely convergent. (c) Absolutely convergent. (d) Conditionally convergent. (e) Absolutely convergent.
5. See Exercise 6.34.
6. See Exercise 6.34.
8. Is it Cauchy?
9. Let $c_k = \sum_{j=k}^{\infty} a_jb_j$ and apply Abel's Formula to
6.5 Estimation of Series
1. (a) At most 100 terms. (b) At most 15 terms. (c) At most 10 terms. (To prove that $\{a_k\}$ is monotone, show that $a_{k+1}/a_k < 1$.)
2. (a) $p > 1$.
3. (a) $n = 5$. (b) $n = 7$. (c) $n = 10$. (d) $n = 7$.
6.6 Additional Tests
1. (a) Divergent. (b) Absolutely convergent. (c) Divergent. (d) Absolutely convergent.
2. (a) Absolutely convergent for $p > 0$ and divergent for $p \le 0$. (b) Absolutely convergent for $p > 0$ and divergent for $p \le 0$. (c) Absolutely convergent for $|p| < 1/e$, conditionally convergent for $p = -1/e$, and divergent otherwise. (Use Stirling's formula when $p = \pm 1/e$.)
4. It actually converges absolutely.
CHAPTER 7
7.1 Uniform Convergence of Sequences
2. (a) $2$. (b) $4$. (c) $\pi/2$.
4. (a) Use Exercise 3c.
6. (a) This is different from Theorem 7.9 because $E$ is not necessarily an interval.
7. Modify the proof of Example 4.21 to show that $(1 + x/n)^n \uparrow e^x$ as $n \to \infty$. To prove this is a uniform limit, choose $N$ so large that $[a,b] \subset [-N,N]$ and find the maximum of $e^x - (1 + x/N)^N$ on $[a,b]$.
7.2 Uniform Convergence of Series
1. (b) Recall that $|\sin\theta| \le |\theta|$ for all $\theta \in \mathbf{R}$.
5. See Exercise 3a in Section 6.1.
6. Is there a connection between $\sum_{k=1}^{\infty} k^{-1}\sin(xk^{-1})$ and $\sum_{k=1}^{\infty} \cos(xk^{-1})$?
7. Use Abel's Formula.
9. See Example 6.32.
7.3 Power Series
1. (a) $(-2,2)$. (b) $(3/4,5/4)$. (c) $[-1,1)$. (d) $[-1/\sqrt{2},1/\sqrt{2}]$. (Use Raabe's Test for the endpoints.)
2. (a) $f(x) = 3x^2/(1 - x^3)$ for $x \in (-1,1)$. (b) $f(x) = (2 - x)/(1 - x)^2$ for $x \in (-1,1)$. (c) $f(x) = 2(\log x + 1/x - 1)/(1 - x)$ for $x \in (0,2)$, $x \neq 1$, and $f(1) = 0$. (d) $f(x) = \log(1/(1 - x^3))/x^3$ for $x \in [-1,1)$, $x \neq 0$, and $f(0) = 1$.
4. Use Exercise 8 in Section 2.5 to prove that if $\limsup |a_k/a_{k+1}| < R$, then there is an $r < R$ such that $\{|a_kr^k|\}$ is increasing for $k$ large, i.e., that $\sum_{k=1}^{\infty} a_kr^k$ diverges for some $r < R$.
8. Use the method of Example 7.36 to estimate $|f'(x)|$.
9. First prove that the radius of convergence of $\sum_{k=0}^{\infty} a_kx^k$ is $\ge 1$.
10. (a) Use Theorem 6.35 to estimate $\log(n!) = \sum_{k=1}^{n} \log k$. (b) $x \in (-1/e, 1/e)$.
7.4 Analytic Functions
1. (a) $\cos(3x) = \sum_{k=0}^{\infty} (-9)^kx^{2k}/(2k)!$. (b) $2^x = \sum_{k=0}^{\infty} x^k\log^k 2/k!$. (c) $\cos^2 x = 1 + \sum_{k=1}^{\infty} (-1)^k2^{2k-1}x^{2k}/(2k)!$. (d) $\sin^2 x + \cos^2 x = 1$. (e) $x^3e^{x^2} = \sum_{k=0}^{\infty} x^{2k+3}/k!$.
2. (a) $\log(1 - x) = -\sum_{k=1}^{\infty} x^k/k$. (b) $x^2/(1 - x^3) = \sum_{k=0}^{\infty} x^{3k+2}$. (c) $e^x/(1 - x) = \sum_{k=0}^{\infty} \bigl(\sum_{j=0}^{k} 1/j!\bigr)x^k$. (d) $x^3/(1 - x)^2 = \sum_{k=1}^{\infty} kx^{k+2}$. (e) $\arcsin x = \sum_{k=0}^{\infty} \binom{-1/2}{k}(-1)^kx^{2k+1}/(2k + 1)$. (Use Theorem 7.54.)
3. (a) $(x^2 - 1)e^x = -1 - x + \sum_{k=2}^{\infty} (k^2 - k - 1)x^k/k!$. (b) $e^x\cos x = \sum_{k=0}^{\infty} \bigl(\sum_{j \in A_k} (-1)^j/((2j)!(k - 2j)!)\bigr) \cdot x^k$, where $A_k := \{j \in \mathbf{N} : 0 \le j \le k/2\}$. (c) $\sin x/e^x = \sum_{k=1}^{\infty} \bigl(\sum_{j \in A_k} (-1)^{k-j+1}/((2j + 1)!(k - 2j - 1)!)\bigr) \cdot x^k$, where $A_k := \{j \in \mathbf{N} : 0 \le j \le (k - 1)/2\}$. (d) $f(x) = \sum_{k=0}^{\infty} x^k\log^{k+1} a/(k + 1)!$.
4. (a) $\log_{10} x = \sum_{k=1}^{\infty} (-1)^{k+1}(x - 1)^k/(k\log 10)$, valid for $x \in (0,2]$. (b) $x^2 + 2x - 1 = 2 + 4(x - 1) + (x - 1)^2$, valid for $x \in \mathbf{R}$. (c) $e^x = \sum_{k=0}^{\infty} e(x - 1)^k/k!$, valid for $x \in \mathbf{R}$.
10. See Exercise 4 in Section 5.1 and use analytic continuation.
11. First use the Binomial Series to verify that $(1 + x)^{\beta} \le 1 + x^{\beta}$ for any $0 < x < 1$.
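A quick way to gain confidence in a power series answer such as those in Section 7.4 is to compare a partial sum with the closed form at a few sample points. The sketch below does this for answer 1(c), $\cos^2 x = 1 + \sum_{k \ge 1} (-1)^k 2^{2k-1} x^{2k}/(2k)!$, and answer 3(a), $(x^2 - 1)e^x = -1 - x + \sum_{k \ge 2} (k^2 - k - 1)x^k/k!$. The function names, the truncation lengths, and the sample points are assumptions chosen for illustration.

```python
from math import cos, exp, factorial


def cos_squared_series(x: float, terms: int = 30) -> float:
    """Partial sum of 1 + sum_{k>=1} (-1)^k 2^(2k-1) x^(2k) / (2k)!."""
    return 1.0 + sum(
        (-1) ** k * 2 ** (2 * k - 1) * x ** (2 * k) / factorial(2 * k)
        for k in range(1, terms)
    )


def x2_minus_1_exp_series(x: float, terms: int = 40) -> float:
    """Partial sum of -1 - x + sum_{k>=2} (k^2 - k - 1) x^k / k!."""
    return -1.0 - x + sum(
        (k * k - k - 1) * x ** k / factorial(k) for k in range(2, terms)
    )


# Largest discrepancy between partial sum and closed form at a few points.
points = [-1.5, 0.3, 2.0]
err_cos = max(abs(cos_squared_series(x) - cos(x) ** 2) for x in points)
err_exp = max(abs(x2_minus_1_exp_series(x) - (x * x - 1) * exp(x)) for x in points)
print(err_cos, err_exp)
```

Both discrepancies should be at the level of floating-point rounding, since both series converge for every real $x$.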
7.5 Applications
1. The first seven places of the only real root are given by $-0.3176721$.
6. Choose $r_0$ as in the proof of Theorem 7.60, define $\{x_n\}$ by (19), and find a $\delta$ so that $|f(x_0)| \le \delta$ implies $|x_n - x_{n-1}| < r_0^{n+1}$.
CHAPTER 8
8.1 Algebraic Structure
2. (a) $(a, a, a)$, $a \neq 0$. (b) $(a, (20 - 8a)/7, (8 + a)/7)$, $a \neq 0$. (c) x - 2y
5. (vi) Write $\|x \times y\|^2 = (x \times y) \cdot (x \times y)$ and use parts (iv) and (v).
8.2 Planes and Linear Transformations
1. 2. 3. 4. 5.
Z
= 4.
(a) 3x + 3y - Z = 6. (a) x - 2y + Z = -1. (b) Use Exercise la in Section 1.2. x - y + 2z + 5w = 1. Use a, b, c to produce a normal to II. (a) Use the linear property to compute T(l, 0) and T(O, 1).
(b)
6. (a)
[
7/2 b 7/2
a
-1/2 1/2
-7r] -3
for any choice of $a, b \in \mathbf{R}$.
8. If $(x_0, y_0, z_0)$ does not lie on $\Pi$, let $(x_2, y_2, z_2)$ be a point on $\Pi$ different from $(x_1, y_1, z_1)$, let $\theta$ represent the angle between $t_0 := (x_0 - x_2, y_0 - y_2, z_0 - z_2)$ and the normal $(a, b, c)$, and compute $\cos\theta$ two different ways, once in terms of $t_0$ and a second time in terms of the distance from $(x_0, y_0, z_0)$ to $\Pi$.
10. (a) T = [2x cos x].
8.3 Topology of $\mathbf{R}^n$
1. Diamonds and squares.
2. (a) Closed. (b) Closed. (c) Neither open nor closed. (d) Open.
5. Notice that $a \in E^c$ and $E^c$ is open.
9. Try a proof by contradiction.
8.4 Interior, Closure, and Boundary
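1. (a) $E^o = (a,b)$, $\overline{E} = [a,b]$, $\partial E = \{a,b\}$. (b) $E^o = \emptyset$, $\overline{E} = E \cup \{0\}$, $\partial E = \overline{E}$. (c) $E^o = E$, $\overline{E} = [0,1]$, $\partial E = \{1/n : n \in \mathbf{N}\} \cup \{0\}$. (d) $E^o = \mathbf{R}$, $\overline{E} = \mathbf{R}$, $\partial E = \emptyset$.
2. (a) $E^o = \{(x,y) : x^2 + 4y^2 < 1\}$ and $\partial E = \{(x,y) : x^2 + 4y^2 = 1\}$. (b) $E^o = \emptyset$ and $\partial E = E$. (c) $E^o = \{(x,y) : y > x^2, 0 < y < 1\}$, $\overline{E} = \{(x,y) : y \ge x^2, 0 \le y \le 1\}$, and $\partial E = \{(x,y) : y = x^2, 0 \le y \le 1\} \cup \{(x,1) : -1 \le x \le 1\}$. (d) $\overline{E} = \{x^2 - y^2 \le 1, -1 \le y \le 1\}$ and $\partial E = \{(x,y) : x^2 - y^2 = 1, -\sqrt{2} \le x \le \sqrt{2}\} \cup \{(x,1) : -\sqrt{2} \le x \le \sqrt{2}\} \cup \{(x,-1) : -\sqrt{2} \le x \le \sqrt{2}\}$.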
9. (c) Use part (b). You may assume that Rn is connected.
CHAPTER 9
9.1 Limits of Sequences
2. (a) $(0, -3)$. (b) $(1,0,1)$. (c) $(-1/2,1,0)$.
9. (b) Show that a set $C$ is relatively closed in $E$ if and only if $E \setminus C$ is relatively open in $E$.
10. (b) Use the Bolzano-Weierstrass Theorem.
9.2 Limits of Functions
1. (a) $\operatorname{Dom} f = \{(x,y) : x \neq 1, y \neq 1\}$ and the limit is $(0,3)$. (b) $\operatorname{Dom} f = \{(x,y) : x \neq 0, y \neq 0, \text{ and } x/y \neq k\pi/2, k \text{ odd}\}$ and the limit is $(1,0,1)$. (c) $\operatorname{Dom} f = \{(x,y) : (x,y) \neq (0,0)\}$ and the limit is $(0,0)$. (d) $\operatorname{Dom} f = \{(x,y) : (x,y) \neq (1,1)\}$ and the limit is $(0,0)$.
2. (a) $\lim_{y\to 0}\lim_{x\to 0} f(x,y) = \lim_{x\to 0}\lim_{y\to 0} f(x,y) = 0$, but $f(x,y)$ has no limit as $(x,y) \to (0,0)$. (b) $\lim_{y\to 0}\lim_{x\to 0} f(x,y) = 1/2$, $\lim_{x\to 0}\lim_{y\to 0} f(x,y) = 1$, so $f(x,y)$ has no limit as $(x,y) \to (0,0)$. (c) $\lim_{y\to 0}\lim_{x\to 0} f(x,y) = \lim_{x\to 0}\lim_{y\to 0} f(x,y) = 0$, and $f(x,y) \to 0$ as $(x,y) \to (0,0)$.
9.3 Continuous Functions
2. (b) Note that $f^{-1}(-1,1)$ is not open. Does this contradict Theorem 9.39?
5. For (b) implies (a), suppose not, and use the Sequential Characterization of Continuity.
6. (b), (c) You may wish to prove that $A$ is relatively closed in $E$ if and only if $E \setminus A$ is relatively open in $E$.
8. See Theorem 3.40.
10. (a) A polygonal path in $E$ can be described as the image of a continuous function $f : [0,1] \to E$. Use this to prove that every polygonal path is connected. (c) Prove that if $E$ is not polygonally connected, then there are nonempty open sets $U, V \subset E$ such that $U \cap V = \emptyset$ and $U \cup V = E$.
9.4 Compact Sets
1. (a) Compact. (b) Compact. (c) Not compact. $H = E \cup \{(0,y) : -1 \le y \le 1\}$. (d) Not compact. There is no compact set $H$ that contains $E$.
5. See Exercise 4 in Section 8.3 and the proof of the Borel Covering Lemma.
6. (a) Use Theorem 9.33.
9.5 Applications
2. See Exercise 7 in Section 7.2.
4. (a) $\omega_f(t) = 1$ for all $t$. (b) $\omega_f(t) = 0$ if $t \neq 0$ and $\omega_f(0) = 1$. (c) $\omega_f(t) = 0$ if $t \neq 0$ and $\omega_f(0) = 2$.
7. (a) $\sqrt{\pi/2}$. (b) $f(0)/3$. (c) $1/4$. (d) $(e^4 - 1)/(2e^2)$.
9. (d) You may wish to use Lemma 4.28.
CHAPTER 10
10.1 Introduction
10. (a) Show that if $E$ is not bounded, then there exist $x_n \in E$ and $a \in X$ such that $\rho(x_n, a) \to \infty$ as $n \to \infty$.
10.2 Limits of Functions
1. (a) $\mathbf{R}$. (b) $[a,b]$. (c) $\emptyset$. (d) $\{x\}$ if $E$ is infinite, $\emptyset$ if $E$ is finite. (e) $\emptyset$.
9. (b) See the proof of Theorem 3.26.
10.3 Interior, Closure, and Boundary
1. (a) $E^o = (a,b)$, $\overline{E} = [a,b]$, $\partial E = \{a,b\}$. (b) $E^o = \emptyset$, $\overline{E} = E \cup \{0\}$, $\partial E = \overline{E}$. (c) $E^o = E$, $\overline{E} = [0,1]$, $\partial E = \{1/n : n \in \mathbf{N}\} \cup \{0\}$. (d) $E^o = \mathbf{R}$, $\overline{E} = \mathbf{R}$, $\partial E = \emptyset$.
2. (a) Closed. $E^o = \{(x,y) : x^2 + 4y^2 < 1\}$ and $\partial E = \{(x,y) : x^2 + 4y^2 = 1\}$. (b) Closed. $E^o = \emptyset$ and $\partial E = E$. (c) Neither open nor closed. $E^o = \{(x,y) : y > x^2, 0 < y < 1\}$, $\overline{E} = \{(x,y) : y \ge x^2, 0 \le y \le 1\}$, and $\partial E = \{(x,y) : y = x^2, 0 \le y \le 1\} \cup \{(x,1) : -1 \le x \le 1\}$. (d) Open. $\overline{E} = \{x^2 - y^2 \le 1, -1 \le y \le 1\}$ and $\partial E = \{(x,y) : x^2 - y^2 = 1, -\sqrt{2} \le x \le \sqrt{2}\} \cup \{(x,1) : -\sqrt{2} \le x \le \sqrt{2}\} \cup \{(x,-1) : -\sqrt{2} \le x \le \sqrt{2}\}$.
5. Notice that $a \in E^c$ and $E^c$ is open.
8. (a) See the description of relative balls following Definition 10.22.
9. To show that $f$ is continuous at $a$, consider the open interval $I = (f(a) - \varepsilon, f(a) + \varepsilon)$.
10.4 Compact Sets
1. (a) Compact. (b) Compact. (c) Not compact. $H = E \cup \{(0,y) : -1 \le y \le 1\}$. (d) Not compact. There is no compact set $H$ that contains $E$.
5. See Exercise 10 in Section 10.3.
8. (a) Notice that if $\bigcap H_k = \emptyset$, then $\{X \setminus H_k\}$ covers $X$.
10. (a) Let $x_k \in E$. Does $E$ contain a point $a$ such that each $B_r(a)$, $r > 0$, contains $x_k$ for infinitely many $k$'s? (b) See Exercise 10a in Section 10.1.
10.5 Connected Sets
6. Try a proof by contradiction.
9. Use Exercise 8.
10. (a) A polygonal path in $E$ can be described as the image of a continuous function $f : [0,1] \to E$. Use this to prove that every polygonal path is connected. (c) Prove that if $E$ is not polygonally connected, then there are nonempty open sets $U, V \subset E$ such that $U \cap V = \emptyset$ and $U \cup V = E$.
10.6 Continuous Functions
2. (b) Note that $f^{-1}(-1,1)$ is not open. Does this contradict Theorem 10.58?
4. (a) You may wish to prove that $A$ is relatively closed in $E$ if and only if $E \setminus A$ is relatively open in $E$.
8. See Theorem 3.40.
CHAPTER 11
11.1 Partial Derivatives and Partial Integrals
1. (a) $f_{xy} = f_{yx} = e^y$. (b) $f_{xy} = f_{yx} = -\sin(xy) - xy\cos(xy)$. (c) $f_{xy} = f_{yx} = -2x/(x^2 + 1)^2$.
2. (a) $f_x = 2x + y\cos(xy)$ and $f_y = x\cos(xy)$ are continuous everywhere on $\mathbf{R}^2$. (b) $f_x = y/(1 + z)$, $f_y = x/(1 + z)$, and $f_z = -xy/(1 + z)^2$ are continuous except when $z = -1$. (c) $f_x = x/\sqrt{x^2 + y^2}$ and $f_y = y/\sqrt{x^2 + y^2}$ are continuous everywhere except at the origin.
3. (a) $f_x = (2x^5 + 4x^3y^2 - 2xy^4)/(x^2 + y^2)^2$ for $(x,y) \neq (0,0)$ and $f_x(0,0) = 0$. $f_x$ is continuous on $\mathbf{R}^2$. (b) $f_x = (2x/3) \cdot (2x^2 + 4y^2)/(x^2 + y^2)^{4/3}$ for $(x,y) \neq (0,0)$ and $f_x(0,0) = 0$. $f_x$ is continuous on $\mathbf{R}^2$.
5. (a) $1$. (b) $\sqrt{2}/2$.
6. (a) $2$. (b) $0$.
7. (a) $9/10$. (b) $e^{-\pi}/2$.
10. (c) Choose $\delta > 0$ such that $|\phi(t)| < \varepsilon$ for $0 \le t < \delta$ and break the integral in part (b) into two pieces, one corresponding to $0 \le t \le \delta$ and the other to $\delta \le t < \infty$. (d) Combine part (b) with Theorem 11.8.
11. (a) $\mathcal{L}\{te^t\} = 1/(s - 1)^2$. (b) $\mathcal{L}\{t\sin \pi t\} = 2s\pi/(s^2 + \pi^2)^2$. (c) $\mathcal{L}\{t^2\cos t\} = 2(s^3 - 3s)/(s^2 + 1)^3$.
11.2 The Definition of Differentiability
COSX
1. (a)
(b)
Df(x,y) =
~
[
t Df(8,t,u,V)= [ -28
°
x -siny
1.
° 2uv 8
uO].
_ [ l/t ] Df(t)- -l/(l+t)2 .
(c)
Df(r ())
(d)
,
= [c~s() 8m ()
-rSin()]. r cos ()
4. (b) The function f might not be differentiable when 11.3 Derivatives, Differentials, and Tangent Planes 1. (a)
D(f+g)(a)=[3
-3]
(c)
D(f + g) (a) = [0
11" ]
(e)
= 1.
D(f + g) (a) = 4 and D(3f - 2g)(a) = 2.
(b)
(d)
Q
[2 1 0]
D(f + g) (a) = 3 -2 1
D(f + 9)(&)
and
D(3f-2g)(a)=[-1
= [511" 311"].
and
D(3f - 2g)(a)
and
D(3f - 2g)(a) = [_11
~ [}, : 1 and
D(3f - 29)(&)
1].
-;
-;].
~ [-} ~~].
2. (c) $D(f \cdot g)(a)(1,1,1) = 6$ and $D(f \times g)(a)(1,1,1) = (-4,0,1)$.
5. (a) $z = 0$. (b) $2x - 2y - z = 0$.
6. $(-1/2, -1/2, 1/2)$, $2x + 2y + 2z = -1$.
7. (b) $ax + by = 1$, where $a^2 + b^2 = 1$. (c) $x + y - z = \pm 1$.
8. (a) $dz = 2x\,dx + 2y\,dy$. (b) $dz = y\cos(xy)\,dx + x\cos(xy)\,dy$. (c) $dz = (1 - x^2 + y^2)y/(1 + x^2 + y^2)^2\,dx + (1 + x^2 - y^2)x/(1 + x^2 + y^2)^2\,dy$.
9. $dw = 0.05$ and $\Delta w \approx 0.049798$.
10. $L$ must be measured with no more than 5% error.
11.4 Chain Rule
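3. $$\frac{\partial w}{\partial p} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial p} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial p} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial p}, \qquad \frac{\partial w}{\partial q} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial q} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial q} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial q}.$$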
6. Compute the derivative of $f \cdot f$ using the Dot Product Rule.
10. Notice that by Exercise 10 in Section 11.2, this result still holds if "$f$ is in $C^2$" is replaced by "the first-order partial derivatives of $f$ are differentiable."
11.5 Mean Value Theorem and Taylor's Formula
3. (a) $f(x,y) = 1 - (x + 1) + (y - 1) + (x + 1)^2 + (x + 1)(y - 1) + (y - 1)^2$. (b) $\sqrt{x} + \sqrt{y} = 3 + (x - 1)/2 + (y - 4)/4 - (x - 1)^2/8 - (y - 4)^2/64 + (x - 1)^3/(16\sqrt{c^5}) + (y - 4)^3/(16\sqrt{d^5})$ for some $(c,d) \in L((x,y);(1,4))$. (c) $e^{xy} = 1 + xy + ((dx + cy)^4 + 12(dx + cy)^2xy + 12x^2y^2)e^{cd}/4!$ for some $(c,d) \in L((x,y);(0,0))$.
4. Notice that by Exercise 10 in Section 11.2, this result still holds if "$f$ is in $C^p$" is replaced by "the $(p - 1)$st-order partial derivatives of $f$ are differentiable."
8. Apply Taylor's Formula to $f(a + x, b + y)$ for $p = 3$, $x = r\cos\theta$, and $y = r\sin\theta$, and prove that
$$\frac{4}{\pi r^2}\int_0^{2\pi} f(a + r\cos\theta, b + r\sin\theta)\cos(2\theta)\,d\theta = f_{xx}(a,b) - f_{yy}(a,b) + F(r),$$
where F(r) is a function that converges to 0 as r -+ O. 9. (c) Let (X2, t2) be the point identified in part (b), and observe by one-dimensional theory that Ut(X2, t2) = O. Use this observation and Taylor's Formula to obtain the contradiction Wxx (X2, t2) - Wt(X2, t 2) : : .: O.
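The expansion in answer 3(b) of Section 11.5 can be checked numerically: the quadratic part should miss $\sqrt{x} + \sqrt{y}$ by exactly the cubic remainder, which is squeezed between the values obtained by putting $(c,d)$ at the two ends of the segment. The sketch below does this at one sample point with $x > 1$ and $y > 4$; the sample point and the function names are assumptions made for this illustration.

```python
from math import sqrt


def quadratic_part(x: float, y: float) -> float:
    """Second-order Taylor polynomial of sqrt(x) + sqrt(y) about (1, 4)."""
    return 3 + (x - 1) / 2 + (y - 4) / 4 - (x - 1) ** 2 / 8 - (y - 4) ** 2 / 64


def remainder(x: float, y: float) -> float:
    """Actual error of the quadratic part."""
    return sqrt(x) + sqrt(y) - quadratic_part(x, y)


x, y = 1.2, 4.5  # sample point (assumed), with x > 1 and y > 4

# Remainder form: (x-1)^3/(16 sqrt(c^5)) + (y-4)^3/(16 sqrt(d^5)),
# with 1 <= c <= x and 4 <= d <= y, so it lies between these two values.
upper = (x - 1) ** 3 / (16 * sqrt(1 ** 5)) + (y - 4) ** 3 / (16 * sqrt(4 ** 5))
lower = (x - 1) ** 3 / (16 * sqrt(x ** 5)) + (y - 4) ** 3 / (16 * sqrt(y ** 5))

r = remainder(x, y)
print(lower, r, upper)
```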
11.6 Inverse Function Theorem 1. (a) Since f(x, y) = (a, b) always has a solution by Cramer's Rule and D f is constant,
Drl( b) = [5/17 1/17] a, -2/17 3/17 (b) Since f((4k
(see Theorem C.7).
+ 1)1T/2, -(4k + 1)1T/2) =
Drl(O,l) =
[~ _~]
-n
[~
or
f(2h, -2h)
= (0, I),
(see Theorem C.7)
depending on which branch of f- 1 you choose. (c) Since f(±2, ±1) = f(±l, ±2) = (2,5), D
r
1 (2
,
5) = [=1=1/3 ±2/3
±1/3] =1=1/6
or
[±2/3 =1=1/3
=1=1/6] ±1/3
(see Theorem C.7)
depending on which branch of f- 1 you choose. (d) Since f(O, 1) = (-1,0), one branch of f- 1 satisfies
D(f-l)(-l 0) ,
=
[-1/2 -1/2
1] 0
(see Theorem C.7).
4. $F(x_0, y_0, u_0, v_0) = (0,0)$, $x_0^2 \neq y_0^2$, and $u_0 \neq 0 \neq v_0$, where $F(x, y, u, v) = (xu^2 + yv^2 + xy - 9, xv^2 + yu^2 - xy - 7)$.
6. (a) $f^{-1}(s,t) = ((s + \sqrt{s^2 - 4t})/2, (s - \sqrt{s^2 - 4t})/2)$. (b)
$$D(f^{-1})(f(x,y)) = \begin{bmatrix} x/(x - y) & 1/(y - x) \\ y/(y - x) & 1/(x - y) \end{bmatrix} \quad \text{(see Theorem C.7)}.$$
8. (a) See Theorem 11.27.
11.7 Optimization
1. (a) $f(1/3, 2/3) = -13/27$ is a local minimum and $(-1/4, -1/2)$ is a saddle point. (b) Let $k, j \in \mathbf{Z}$. $f((2k + 1)\pi/2, j\pi) = 2$ is a local maximum if $k$ and $j$ are even, $f((2k + 1)\pi/2, j\pi) = -2$ is a local minimum if $k$ and $j$ are odd, and $((2k + 1)\pi/2, j\pi)$ is a saddle point if $k + j$ is odd. (c) This function has no local extrema. (d) $f(0,0) = 0$ is a local minimum if $a > 0$ and $b^2 - 4ac < 0$, a local maximum if $a < 0$ and $b^2 - 4ac < 0$, and $(0,0)$ is a saddle point if $b^2 - 4ac > 0$.
2. (a) $f(2,0) = 8$ is the maximum and $f(-4/5, \pm\sqrt{21}/5) = -9/5$ is the minimum. (b) $f(1,2) = 17$ is the maximum and $f(1,0) = 1$ is the minimum. (c) $f(1,1) = f(-1,-1) = 3$ is the maximum and $f(-1,1) = -5$ is the minimum.
3. (a) $f(-2,0) = -2$ is the minimum and $f(1/2, \pm\sqrt{15}/2) = 17/4$ is the maximum. (b) $f(\pm 2/\sqrt{5}, \pm 1/\sqrt{5}) = 0$ is the minimum and $f(\pm 1/\sqrt{5}, \mp 2/\sqrt{5}) = 5$ is the maximum. (c) $\lambda = xy$, $3\mu = x + y$, $f(\pm 1/\sqrt{2}, \mp 1/\sqrt{2}, 0) = -1/2$ is the minimum and $f(\pm 1/\sqrt{6}, \pm 1/\sqrt{6}, \mp 2/\sqrt{6}) = 1/6$ is the maximum. (d) $f(1,-2,0,1) = 2$ is the minimum and $f(1,2,-1,-2) = 3$ is the maximum.
7. (b) If $DE < 0$, then $ax + by + cz$ has no extremum subject to the constraint $z = Dx^2 + Ey^2$.
8. (b) $f(2,2,4) = 48$ is the minimum. There is no maximum.
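Answer 6 of Section 11.6 can be cross-checked with the Inverse Function Theorem: the derivative of $f^{-1}$ at $f(x,y)$ should be the inverse of the matrix $Df(x,y)$. The sketch below assumes that the underlying map is $f(x,y) = (x + y, xy)$, which is what the stated inverse $f^{-1}(s,t) = ((s + \sqrt{s^2 - 4t})/2, (s - \sqrt{s^2 - 4t})/2)$ suggests; that assumption, the function names, and the test points are all illustrative and are not taken from the exercise itself.

```python
from math import sqrt


def jacobian_f(x: float, y: float) -> list:
    """Df(x, y) for the assumed map f(x, y) = (x + y, x * y)."""
    return [[1.0, 1.0], [y, x]]


def inverse_2x2(m: list) -> list:
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def claimed_inverse_derivative(x: float, y: float) -> list:
    """The matrix given in answer 6(b)."""
    return [[x / (x - y), 1 / (y - x)], [y / (y - x), 1 / (x - y)]]


def f_inverse(s: float, t: float) -> tuple:
    """The branch of the inverse given in answer 6(a)."""
    root = sqrt(s * s - 4 * t)
    return ((s + root) / 2, (s - root) / 2)


def max_entry_gap(p: list, q: list) -> float:
    """Largest absolute difference between corresponding entries."""
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))


gaps = [
    max_entry_gap(inverse_2x2(jacobian_f(x, y)), claimed_inverse_derivative(x, y))
    for (x, y) in [(3.0, 1.0), (-2.0, 5.0), (0.5, 0.25)]
]
print(gaps, f_inverse(4.0, 3.0))
```

The check needs $x \neq y$, which is exactly where the assumed $Df$ is invertible.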
CHAPTER 12
12.1 Jordan Regions
1. $\alpha$) $V(E; \mathcal{G}_1) = 3/4$, $V(E; \mathcal{G}_2) = 7/16$, $V(E; \mathcal{G}_3) = 15/64$; $v(E; \mathcal{G}_m) = 0$ for all $m$. $\beta$) $V(E; \mathcal{G}_1) = 1$, $v(E; \mathcal{G}_1) = 0$; $V(E; \mathcal{G}_2) = 13/16$, $v(E; \mathcal{G}_2) = 0$; $V(E; \mathcal{G}_3) = 43/64$, $v(E; \mathcal{G}_3) = 5/32$. $\gamma$) $V(E; \mathcal{G}_1) = 1$, $v(E; \mathcal{G}_1) = 0$; $V(E; \mathcal{G}_2) = 1$, $v(E; \mathcal{G}_2) = 1/4$; $V(E; \mathcal{G}_3) = 15/16$, $v(E; \mathcal{G}_3) = 1/2$.
2. (c) First prove that $E$ is a Jordan region if and only if there exist grids $\mathcal{G}_m$ such that $V(\partial E; \mathcal{G}_m) \to 0$ as $m \to \infty$.
3. (a) Is it true for rectangles?
4. (d) Apply part (c) to $E_1 = (E_1 \setminus E_2) \cup E_2$. (e) Apply parts (c) and (d) to
$$(E_1 \cup E_2) = (E_1 \setminus (E_1 \cap E_2)) \cup (E_2 \setminus (E_1 \cap E_2)) \cup (E_1 \cap E_2).$$
5. (a) See Theorem 8.15 or 10.39. (b) You may use Exercise 4a.
12.2 Riemann Integration on Jordan Regions
2. (a) $4$.
3. (a) $-1$. (b) $1/2$.
5. Show that the difference converges to zero as $r \to 0+$.
6. (b) $\operatorname{Area}(E)$.
8. (b) The hypothesis $H^o \neq \emptyset$ can be dropped when $\inf_{x \in H} f(x) < f(x_0) < \sup_{x \in H} f(x)$.
11. (a) Let $\varepsilon > 0$ and choose $\delta$ by uniform continuity of $\phi$. Choose a grid $\mathcal{G}$ such that $U(f, \mathcal{G}) - L(f, \mathcal{G}) < \delta^2$. Then break $U(\phi \circ f, \mathcal{G}) - L(\phi \circ f, \mathcal{G})$ into two pieces: those $j$ that satisfy $M_j(\phi \circ f) - m_j(\phi \circ f) < \delta$, and those $j$ that satisfy $M_j(\phi \circ f) - m_j(\phi \circ f) \ge \delta$. These two pieces are small for different reasons. (b) Use Example 3.34 and Theorem 12.29.
12.3 Iterated Integrals
1. (a) $5/6$. (b) $8(2\sqrt{2} - 1)/9$. (c) $(1 - \cos(\pi^2/4)) \cdot (2/\pi)$.
2. (a) $E = \{(x,y) : 0 \le x \le 1, x \le y \le x^2 + 1\}$ and $\int_0^1\int_x^{x^2+1} (x + y)\,dy\,dx = 71/60$. (b) $E = \{(x,y,z) : 0 \le y \le 1, \sqrt{y} \le x \le 1, 0 \le z \le x^2 + y^2\}$ and $\int_0^1\int_{\sqrt{y}}^1 (x^2 + y^2)\,dx\,dy = 26/105$. (c) $E = \{(x,y) : 0 \le x \le 1, 0 \le y \le x\}$ and $\int_0^1\int_y^1 \sin(x^2)\,dx\,dy = (1 - \cos(1))/2$. (d) $E = \{(x,y,z) : 0 \le x \le 1, 0 \le y \le x^2, x^3 \le z \le 1\}$ and $\int_0^1\int_{\sqrt{y}}^1\int_{x^3}^1 \sqrt{x^3 + z}\,dz\,dx\,dy = 4(2\sqrt{2} - 1)/45$.
3. (a) $2/35$. (b) $1$. (c) $492/5$. (d) $1/8$.
4. (a) $3\pi$. (b) $91/30$. (c) $88/105$. (d) $1/18$.
7. (a) See Exercise 6.
12.4 Change of Variables
1. (a) $\pi(1 - \cos 4)/4$. (b) $3/10$. (c) $(\sqrt{2} + \log(1 + \sqrt{2}))(b^3 - a^3)/6$. (Recall that the indefinite integral of $\sec\theta$ is $\int \sec\theta\,d\theta = \log|\sec\theta + \tan\theta| + C$.)
2. (a) $(\pi\sqrt{3}/3)\sin 3$. (b) $16^2/(3 \cdot 5 \cdot 7)$.
3. (a) $(6\sqrt{6} - 7)4\pi/5$. (b) $\pi(4e^3 - 1 - 2(\sqrt{8} - 1)e^{\sqrt{8}})$. (c) $16\sqrt{2}/15$.
4. (b) $\pi r^2d/a$.
5. (a) $4/27$. (b) $9/112$. (c) $3(e - 1)/e$. (d) $5$. (Use the change of variables $x = u + v$, $y = u - v$.)
6. See Exercise 5 in Section 12.2.
9. See Exercise 7 in Section 8.2.
10. (d) $\pi^{n/2}$.
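The values $71/60$ and $26/105$ in answers 2(a) and 2(b) of Section 12.3 are easy to confirm numerically. The sketch below approximates an iterated integral $\int_a^b\int_{\ell(s)}^{u(s)} f(s,t)\,dt\,ds$ with a nested midpoint rule; the function name `iterated_integral`, the resolution `n`, and the tolerance used in the comparison are assumptions of this illustration, not a method taken from the text.

```python
from math import sqrt


def iterated_integral(f, a, b, lower, upper, n=400):
    """Nested midpoint rule for int_a^b int_{lower(s)}^{upper(s)} f(s, t) dt ds."""
    total = 0.0
    hs = (b - a) / n
    for i in range(n):
        s = a + (i + 0.5) * hs            # midpoint of the i-th outer cell
        lo, hi = lower(s), upper(s)
        ht = (hi - lo) / n
        inner = sum(f(s, lo + (j + 0.5) * ht) for j in range(n))
        total += inner * ht * hs
    return total


# 2(a): outer variable x in [0, 1], inner variable y in [x, x^2 + 1].
val_a = iterated_integral(lambda x, y: x + y, 0.0, 1.0,
                          lambda x: x, lambda x: x * x + 1)

# 2(b): outer variable y in [0, 1], inner variable x in [sqrt(y), 1].
val_b = iterated_integral(lambda y, x: x * x + y * y, 0.0, 1.0,
                          lambda y: sqrt(y), lambda y: 1.0)

print(val_a, 71 / 60)
print(val_b, 26 / 105)
```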
12.5 Partitions of Unity
3. See Theorem 7.58.
12.6 Gamma Function and Volume
5. Let $\psi_n$ represent the spherical change of variables in $\mathbf{R}^n$ and observe that the cofactor $|A_1|$ of $-\rho\sin\varphi_1\sin\varphi_2 \cdots \sin\varphi_{n-2}\sin\theta$ in the matrix $D\psi_n$ is identical to $\Delta_{\psi_{n-1}}$ if in $\Delta_{\psi_{n-1}}$, $\theta$ is replaced by $\varphi_{n-2}$ and each entry in the last row of $D\psi_{n-1}$ is multiplied by $\sin\theta$.
8. $r^2\operatorname{Vol}(B_r(0))/(n + 2)$.
CHAPTER 13
13.1 Curves
5. (a) This curve spirals up the cone $x^2 + y^2 = z^2$ from $(0,1,1)$ to $(0, e^{2\pi}, e^{2\pi})$ and has arc length $\sqrt{3}(e^{2\pi} - 1)$. (b) This curve coincides with the graph of $x = \pm y^{3/2}$, $0 \le y \le 1$ (looking like a stylized gull in flight) and has arc length $2(\sqrt{13^3} - 1)/27$. (c) This curve is a straight-line segment from $(0,0,0)$ to $(4,4,4)$ and has arc length $4\sqrt{3}$. (d) The arc length of the astroid is $6$.
6. (a) $27$. (b) $ab(a^2 + ab + b^2)/(3(a + b))$. (c) $12\pi$. (d) $(5 + 3\sqrt{5})/2$.
7. (b) Use Dini's Theorem.
9. Analyze what happens to $(x,y)$ and $dy/dx := (dy/dt)/(dx/dt)$ as $t \to -\infty$, $t \to -1-$, $t \to -1+$, $t \to 0$, and $t \to \infty$. For example, prove that as $t \to -1-$, the trace of $\phi(t)$ lies in the fourth quadrant and is asymptotic to the line $y = -x$.
11. (b) Take the derivative of $v' \cdot v'$ using the Dot Product Rule. (d) Observe that $\phi(t) = v(f(t))$ and use the Chain Rule to compute $\phi'(t)$ and $\phi''(t)$. Then calculate $\phi' \times \phi''$ directly.
13.2 Oriented Curves
1. (a) A spiral on the elliptic cylinder $y^2 + 9z^2 = 9$ oriented clockwise when viewed from far out the $x$ axis. (b) A cubical parabola (it looks like a stylized gull in flight) on the plane $z = x$ oriented from left to right when viewed from far out the plane $y = x$. (c) A sine wave on the parabolic cylinder $y = x^2$ oriented from right to left when viewed from far out the $y$ axis. (d) An ellipse sliced by the plane $x = z$ out of the cylinder $y^2 + z^2 = 1$ oriented clockwise when viewed from far out the $x$ axis. (e) A sine wave traced vertically on the plane $y = x$ oriented from below to above when viewed from far out the $x$ axis.
2. (a) $128/3$. (b) $-\pi\sqrt{2}/2$. (c) $0$.
3. (a) $5$. (b) $\pi(-1 + \sqrt{5})/2$. (c) $|R|(2 - a - b)/2$. (d) $-\sin(1) + 1/3$.
4. (c) There exist functions $\psi$ and $\tau$ on $[0,1]$ that are $C^1$ on $(0,1) \setminus \{j/N : j = 1, \ldots, N\}$ such that $\tau' > 0$ and $\psi = \phi_j \circ \tau$ on $((j - 1)/N, j/N)$ for each $j = 1, \ldots, N$.
7. (c) If $F$ is conservative, consider the case when $C$ is smooth first. If (*) holds, use parts (a) and (b) to prove that $F$ is conservative.
8. Use Jensen's Inequality.
13.3 Surfaces
1. (a) $\sqrt{2}\pi(b^2 - a^2)$. (b) $4\pi a^2$. (c) $4\pi^2ab$.
2. (a) $\phi(u,v) = (u, v, u^2 - v^2)$, $E = \{(u,v) : -1 \le u \le 1, -|u| \le v \le |u|\}$, $\psi_1(t) = (1, t, 1 - t^2)$, $\psi_2(t) = (-1, t, 1 - t^2)$, $\psi_3(t) = (t, t, 0)$, $\psi_4(t) = (t, -t, 0)$, $I_1 = I_2 = I_3 = I_4 = [-1,1]$, and $\iint_S g\,d\sigma = 22/3$. (b) $\phi(u,v) = (u, u^3, v)$, $E = [0,2] \times [0,4]$, $\psi_1(t) = (t, t^3, 4)$, $\psi_2(t) = (t, t^3, 0)$, $\psi_3(t) = (0, 0, t)$, $\psi_4(t) = (2, 8, t)$, $I_1 = I_2 = [0,2]$, $I_3 = I_4 = [0,4]$, and $\iint_S g\,d\sigma = (4/27)(145^{3/2} - 1)$. (c) $\phi(u,v) = (3\cos u\cos v, 3\sin u\cos v, 3\sin v)$, $E = [0,2\pi] \times [\pi/4, \pi/2]$, $\psi_1(t) = ((3/\sqrt{2})\cos t, (3/\sqrt{2})\sin t, 3/\sqrt{2})$, $\psi_2(t) = (3\cos t, 3\sin t, 0)$, $I = [0, 2\pi]$, and $\iint_S g\,d\sigma = 27\pi/2$.
5. (b) Use Theorem 12.65.
6. If you got $52\pi$, you gave up too much when you replaced $\|(x,y)\|$ by $3$.
13.4 Oriented Surfaces
1. (a) Since the $x$ axis lies to the left of the $yz$ plane when viewed from far out the positive $y$ axis, the boundary can be parametrized by $\phi(t) = (3\sin t, 0, 3\cos t)$, $I = [0, 2\pi]$, and $\int_{\partial S} F \cdot T\,ds = -9\pi$. (b) The boundary can be parametrized by $\phi_1(t) = (0, -t, 1 + 2t)$, $I_1 = [-1/2, 0]$; $\phi_2(t) = (t, 0, 1 - t)$, $I_2 = [0,1]$; and $\phi_3(t) = (-t, (1 + t)/2, 0)$, $I_3 = [-1, 0]$; and $\int_{\partial S} F \cdot T\,ds = -1/12$. (c) The boundary can be parametrized by $\phi_1(t) = (2\sin t, 2\cos t, 4)$, $I_1 = [0, 2\pi]$, and $\phi_2(t) = (\cos t, \sin t, 1)$, $I_2 = [0, 2\pi]$, and $\int_{\partial S} F \cdot T\,ds = 3\pi$.
2. (a) $\pi/2$. (b) $16$. (c) $2\pi^2ab^2$. (d) $\pi/8$.
3. (a) $-14/15$. (b) $4\pi a^3/3$. (c) $(3b^4 + 8a^3 - 8(a^2 - b^2)^{3/2})\pi/12$. (d) $-2\pi/3$.
4. (b) Use Theorem 12.65.
13.5 Theorems of Green and Gauss
1. (a) $8/3$. (b) $3\log 3 + 2(1 - e^3)$. (c) $-15\pi/4$.
2. (a) $(b - a)(c - d)(c + d - 2)/2$. (b) $-1/6$. (c) $0$.
3. (a) $2(5 + e^3)$. (b) $\pi$. (c) $8$. (d) $\pi abc(a + b + c)/2$.
4. (a) $224/3$. (b) $2(8\sqrt{2} - 7)/15$. (c) $24\pi$.
5. (b) $3/2$. (c) $\operatorname{Vol}(E) = (1/3)\int_{\partial E} x\,dy\,dz + y\,dz\,dx + z\,dx\,dy$. (d) $2\pi^2ab^2$.
9. (c) Use Exercise 8 and Gauss's Theorem.
10. (e) Use Green's Theorem and Exercise 5 in Section 12.2.
13.6 Stokes's Theorem
1. (a) $-\pi/4$. (b) $27\pi/4$.
2. (a) $0$. (b) $-3\pi$. (c) $-10\pi$. (d) $-1/12$.
3. (a) $\pi^2/5$. (b) $-\pi/(8\sqrt{2})$. (c) $28\pi$ (not $-28\pi$ because $i \times k = -j$). (d) $32\pi$. (e) $-\pi$.
4. (a) $18\pi$. (b) $8\pi$. (c) $3(1 - e) + 3\pi/2$. (d) $0$.
10. (b) $2\pi$.
CHAPTER 14
14.1 Introduction
1. (a) $a_0(x^2) = 2\pi^2/3$, $a_k(x^2) = 4(-1)^k/k^2$, and $b_k(x^2) = 0$ for $k = 1, 2, \ldots$. (b) All Fourier coefficients of $\cos^2 x$ are zero except $a_0(\cos^2 x) = 1/2$ and $a_2(\cos^2 x) = 1/2$.
6. (a) $a_k(f) = 0$ for $k = 0, 1, \ldots$, $b_k(f) = 4/(k\pi)$ when $k$ is odd and $0$ when $k$ is even.
(c) You may wish to use Theorem 9.42.
14.2 Summability of Fourier Series
5. (c) See Exercise 4b in Section 5.1.
8. See Theorem 9.51.
14.3 Growth of Fourier Coefficients
4. See Exercise 4a in Section 14.2 and Theorem 7.12.
14.4 Convergence of Fourier Series
1. Note: It is not assumed that $f$ is periodic.
2. (c) $\pi^2/8$.
4. (a) Use Abel's Formula. For the first identity, you must show that $S_N\rho^N \to 0$ as $N \to \infty$ for all $\rho \in (0,1)$ if $\sum_{k=0}^{\infty} a_kr^k$ converges for all $r \in (0,1)$.
5. (a) Prove that for each fixed $h$, $a_k(f(x + h)) = a_k(f)\cos kh + b_k(f)\sin kh$.
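The Fourier coefficients of $x^2$ quoted in answer 1(a) of Section 14.1 can be reproduced numerically. The sketch below assumes the normalization $a_k(f) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos kx\,dx$ and $b_k(f) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin kx\,dx$, which is the one consistent with $a_0(x^2) = 2\pi^2/3$; the function names and the midpoint-rule resolution are likewise assumptions made for this illustration.

```python
from math import cos, sin, pi


def midpoint_integral(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def a_coeff(f, k):
    """Approximate (1/pi) * int_{-pi}^{pi} f(x) cos(kx) dx (assumed convention)."""
    return midpoint_integral(lambda x: f(x) * cos(k * x), -pi, pi) / pi


def b_coeff(f, k):
    """Approximate (1/pi) * int_{-pi}^{pi} f(x) sin(kx) dx (assumed convention)."""
    return midpoint_integral(lambda x: f(x) * sin(k * x), -pi, pi) / pi


def square(x):
    return x * x


a0_gap = abs(a_coeff(square, 0) - 2 * pi ** 2 / 3)
ak_gaps = [abs(a_coeff(square, k) - 4 * (-1) ** k / k ** 2) for k in range(1, 6)]
bk_sizes = [abs(b_coeff(square, k)) for k in range(1, 6)]
print(a0_gap, max(ak_gaps), max(bk_sizes))
```

The sine coefficients vanish because $x^2$ is even, which the symmetric midpoint grid reproduces up to rounding.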
CHAPTER 15
15.1 Differential Forms on $\mathbf{R}^n$ 1. (a) $dy\,dz - 3\,dz\,dx + 2\,dx\,dy$. (b) $x^2\,dy\,dz + xy\,dz\,dx + yz\,dx\,dy$. (c) $x^2\cos x\,dx\,dy\,dw\,dy\,dz\,dw$. 2. (a) $(2x+2y)\,dx\,dy$. (b) $y\cos(xy)\,dx\,dz\,dw + x\cos(xy)\,dy\,dz\,dw - w\sin(zw)\,dx\,dy\,dz - z\sin(zw)\,dx\,dy\,dw$. (c) $((x+y)/\sqrt{x^2+y^2})\,dx\,dy\,dz$. (d) $(y\sin x\,e^{yz} - y\sin x\,e^{xy} + x\cos y\,e^{xy} - \cos x\,e^{xy})\,dx\,dy\,dz$.
15.2 Differentiable Manifolds 5. (a) Note: If $B$ is another atlas of $M$, then $A$ and $B$ are compatible.
15.3 Stokes's Theorem on Manifolds 1.
$\pi^2 a^6/4$. 2. $\sum_{j=1}^{n} (-1)^{j-1} a_1 \cdots \widehat{a_j} \cdots a_n \cdot a_j$.
Abel summable, 207 Abel's Formula, 174 Abel's Test, 177 Abel's Theorem, 200 Absolute curvature, 460 Absolute value, 8 Absolutely convergent series, 165, 192 rearrangements of, 168 Absolutely integrable, 139 Accuracy (see Rate of approximation) Additive identity, 3 Additive inverse, 3 Additive Property, 4 Adherent point, 166 Algebra of sets, 2, 31 Algebraic number, 34 Almost everywhere continuous, 281 Almost everywhere convergent, 519 Alternating Series Test, 175, 179 Analytic continuation, 217 Analytic function, 208 Angle between vectors, 231 Anticommutative Property, 540 Approximation Property, 18, 23
Arc, 450 (See also Curve) Arc length, 452, 457 Arc length differential, 456 Archimedean Principle, 19 Archimedes' approximation to $\pi$, 49 Area, 108, 110, 384 of a parallelogram, 241 of a rectangle, 381, 385 surface, 474, 476 (See also Volume) Arithmetic mean, 12 Arithmetic-Geometric mean, 49 Associative Properties, 2 Astroid, 451 Atlas, 550 oriented, 555 -with-smooth-boundary, 553 Axiom(s), Completeness, 19 Field, 2 of Induction, 13 Order, 4 Well-Ordering, 13, 572 Balls, in a metric space, 292 relative, 297
Subject index
Balls, in Rn, 242 volume of, 444 Basis, usual, 232 Bernoulli's Inequality, 97 Bernstein's Theorem, 216, 531 Bessel functions, 197 Bessel's Inequality, 520 Bijection, 25 Binomial coefficients, 15, 214 Binomial Formula, 16 Binomial Series, 215 Bolzano-Weierstrass Property, 300 Bolzano-Weierstrass Theorem, 47, 258 Borel Covering Lemma, 259 Bound, lower, 21 upper, 18 Boundary, manifold, 475 of a manifold, 554 of a set, 251, 304 of a surface, 475 topological, 475 Boundary point, 475, 554 Bounded function, 73 Bounded sequence, 37, 256, 294 above, below, 37 uniformly, 191 Bounded set, 21 above, 18 below, 21 interval, 10 Bounded variation, function of, 142 Cantor set, 288 Cantor-Lebesgue Lemma, 536 Cantor's Diagonalization Argument, 28 Cantor's Intersection Theorem, 311 Cantor's Uniqueness Theorem, 536 Cartesian product, 2, 321 Cauchy Criterion for absolute convergence, 165 for sequences, 50, 258, 295 for series, 157 for uniform convergence, 189 Cauchy sequence, 49, 256, 295 Cauchy-Riemann equations, 352
Cauchy-Schwarz Inequality, 229 Cauchy's example, 209 Cauchy's Mean Value Theorem, 96 Cauchy's Theorem, 50 Cesaro means, 513 Cesaro summability, 159, 513 Chain Rule, 92, 349 Change of variables, on an interval, 130 on a Jordan region, 424, 440 on an open set, 439 Chart, 550 -with-smooth-boundary, 553 Chord, 85, 147 Circular helix, 463 Clopen, 255, 316 Closed ball, 242, 292 Closed curve, 450 Closed form, of the Dirichlet kernel, 510 of a power series, 205 Closed Graph Theorem, 286 Closed interval, 10 Closed set, 243, 292 limit characterization, 259, 294 Closed surface, 476 Closure of a set, 250, 302 Closure Properties, 2, 4 Cluster point, 52, 297 Commutation (See Interchange the order of) Commutative Properties, 2 Compact set, 259, 277, 307 sequentially compact, 52, 262, 296 Compact support, 432, 539 Comparison Test, 162 Comparison Theorem, for functions, 63 for improper integrals, 138 for integrals, 121 for multiple integrals, 402 for sequences, 43, 44(6b) for series, 162 Compatible atlas, 551, 555 Complement, 2 Complete Archimedean ordered field, 5 Complete metric space, 295 Completeness Axiom, 19
Completeness of the trigonometric system, 516 Component, of a function, 263 of a vector, 225 Composition, 73 continuity of, 73, 276(3b), 300 Concave function, 147 Conditionally convergent series, 165 rearrangements of, 170 Conditionally integrable, 139 Cone, 584 parametrization of, 470 Connected set, 246, 312 characterization in R, 247 polygonally connected, 277 Conservative vector field, 467(7) Constraints, extrema with, 374, 376 Continuity, 72, 271, 299 characterization by open sets, 270, 276(5,6), 317 by sequences, 72, 271, 299 uniform, 80, 271, 310 Continuous, function, 72, 271, 299 image of a compact set, 273, 279(7), 317 image of a connected set, 274, 318 inverse image, 272, 276(6), 317 on a manifold, 552 on a metric space, 299 on R, 72 on Rn, 271 uniformly, 80, 271, 310 Continuously differentiable, 89, 336 Continuously embedded, 555 Continuously extended, 82 Contradiction, 6 Convergence, absolute, 165, 192 almost everywhere, 519, 532 conditional, 165 of a function, 58, 66, 264, 298 interval of, 199 in a metric space, 293, 298 pointwise, 184, 192 radius of, 198 of sequences, 35, 256, 293
of series, 155 uniform, 186, 192, 327 (See also Limit) Convergence Question, 509 Convex functions, 147 continuity, 149 differentiability, 152 Convex set, 353 Coordinate function, 263 Coordinate hyperplane, 413 Coordinates, 225 Corona, 476 cos x, 90(4), 573 Taylor series of, 211 Countable set, 27 at most countable, 27 Countable subcovering, 277, 307 Covering, cover, 259, 277, 307(2a) Cramer's Rule, 583 Cross product, 232, 347 Cube with side s, 321 Curl, 490, 495(8) Curvature, 460(10) Curve, 450, 456 arc length of, 452, 456 closed, 450 explicit, 451 parametrization of, 450, 456 piecewise smooth, 456 rectifiable, 456 simple, 450 smooth, 455 Cylinder, parametrization of, 469 Cylindrical coordinates, 427 Darboux's Theorem, 105 Decimal expansion, 44, 155 Decreasing function, 102 Decreasing sequence, 45, 280 Deductive proof, 6 Degenerate interval, 11 DeMorgan's Laws, 31 Dense set, 277, 320 Density, of irrationals, 23 of rationals, 20
Derivative, 85, 88, 334 curl, 490, 495(8) definition in Rn, 334 directional, 338, 351(7) divergence, 490, 495(8) exterior, 543 of higher order, 86, 355 of an integral, 127, 326, 329 partial, 322, 450 of a real-valued function, 85, 88 second symmetric, 532 of a series, 192, 201 total, 334 of a vector-valued function, 334, 450 (See also Differentiable function) Determinant, 580 Diagonalization Argument, 28 Differentiable function, 85, 88, 333 at a point in R, 85 at a point in Rn, 333 on a manifold, 560(5) on a set, 88, 450 (See also Derivative) Differentiable manifold (See Manifold) Differential, Leibnizian, 130 total, 343, 355 use for approximation, 345 Differential form (See Form) Differential Transform(s), 546 Fundamental Theorem of, 548 Differentiating under an integral sign, 326, 329 Differentiation of an integral, 127 Differentiation of a series, 192, 201 Dilation, 393(3) Dimension, 225 Dini's Theorem, 280 Directional derivative, 338, 351(7) Direct method, 375 Dirichlet function, 77 Dirichlet kernel, 509 Dirichlet-Jordan Theorem, 529 Dirichlet's Test, 174 for uniform convergence, 193 Discontinuous, 76
Discrete metric, 291 Discriminant for local extrema, 373 Disjoint collection, 386 Distance, 11, 228 Distributive Law, 3 Divergence of a function, 490, 495(8) Divergence Test, 156 Divergence Theorem, 491 Divergent sequence, 41 Divergent series, 155 Domain, of a real-valued function, 2 of a relation, 2 maximal domain, 263 Dominated, 73, 266 Dot product, 226 Double angle formulas, 575 Double integral (See Iterated integral) Double series, 194 e, 99, 211, 222 Element (of a set), 1 Ellipsoid, 583 Empty set, 1 Empty sum, 382 Endpoints, 11, 450 Equality of forms, 539 of functions, 2 of ordered pairs, 2 of rationals, 4 of sets, 2 of vectors, 225 Equivalence relation, 590 Equivalent, orientation, 461, 480 smoothly, 455, 456, 473, 476 Estimation, 177 Euclidean distance, 228 Euclidean norm, 227 Euclidean space, 225 Euclid's Theorem, 220 Euler's proof, 220 Euler's Theorem, 222 Exact form, 567 Exact function, 504(8) Explicit curve, 451
Explicit surface, 469 Exponential function, 5, 134(5), 211 Extended real number, 23 Comparison Tests, 43, 71(7) limits, 41, 67 Sequential Characterization, 68 Squeeze Theorem, 44(6), 71(7) Exterior derivative, 543 Extrema with constraints, 374, 376 Extreme Value Theorem, 74, 274 Extremum, local, 369 Factorial, 15 via Gamma function, 442 Fejer kernel, 509 Fejer's Theorem, 515 Field Axioms, 2 Finer, 107, 381 Finite difference, 174 Finite set, 27 Finite subcovering, 259, 277, 307 First Mean Value Theorem, 122 First-order partial, 322 First partials commute, 324 Folium of Descartes, 460(9) Form, decomposable, 539 of degree r, 538 of degree 1, 465 of degree 2, 482 derivative of, 543 exact, 567 Fundamental Theorem, 548 oriented integral of, 562 product of, 540 sum of, 540 Fourier coefficients, 507 Fourier series, 508 Fourier's Theorem, 508 Fubini's Theorem, 408, 419(10) Function, 2 analytic, 208 bounded, 73 of bounded variation, 142 component, 263 continuous, 72, 271, 299, 552
convex, 147 coordinate, 263 differentiable, 85, 88, 334 domain, 24, 263 exponential, 134(5) harmonic, 496(10d) increasing, 102 inverse, 25 limit, 58, 66, 264, 298 linear, 236 logarithmic, 134(4) monotone, 102 nowhere continuous, 77 nowhere differentiable, 223 periodic, 507 product of, 61, 264 real, 24 real-valued, 24, 263 Riemann integrable, 110, 395 support, 432 sum of, 61, 264 trigonometric, 90(4), 573 uniformly continuous, 80, 271, 310 variation of, 142 vector-valued, 263 Fundamental Theorem of Calculus, 127 Fundamental Theorem of Differential Transforms, 548 Gamma function, 442 Gauss's Theorem, 491 Generalized Binomial Coefficients, 214 Generalized Mean Value Theorem, 96 Geometric mean, 12 Geometric series, 156, 158(1) Gibbs's phenomenon, 512(6) Global property, 420 Gradient, 335, 351(7), 495(9) Greatest lower bound (See Infimum) Green's Identities, 496 Green's Theorem, 488 Grid, 381 Half space, 553 Hardy's Tauberian Theorem, 527
Harmonic function, 496(10d) Harmonic series, 155 Heat equation, 351(5), 358 Heine-Borel Theorem, 259, 309 Helix, 463 Holds for large k, 160 Homogeneous Rule, 92 Hyperbolic paraboloid, 584 Hyperboloid, 584 Hyperplane, 235, 413 i, 232 Image, inverse image, 32 Implicit Function Theorem, 365 Implicit Method, 380(8) Improper integral, 137 absolutely integrable, 139 Comparison Theorem, 138 conditionally integrable, 139 convergence, 137 Fubini's Theorem, 419 uniform convergence, 327 Increasing function, 102 Increasing sequence, 45 of functions, 280 Indeterminate form, 42, 99 Induced orientation, 483, 555 Induction, 13 Inductive set, 571 Inequality, Bernoulli's, 97 Bessel's, 520 Cauchy-Schwarz, 229 Jensen's, 150 triangle, 9, 230, 290 Infimum, 21 Infinite series, 154 (See also Series) Infinity, 23 Inner product, 226 Inner sum, 390 Inner volume, 391 Integers, 4 Integrability, 110, 395 of the absolute value, 121, 402 of a continuous function, 111, 397
Lebesgue's characterization, 284 of a monotone function, 116(8) of a product, 122, 406(7) of a sum, 119, 399 Integrable, on an interval, 110 on a Jordan region, 395 locally, 137, 437 on an open set, 439 (See also Improper integral) Integral, 112, 395 differentiation of, 127, 326, 329 on an interval, 112 iterated, 407 on a Jordan region, 395 on a line (unoriented), 456, 457 lower, 112, 395 on a manifold, 562 on an open set, 439 oriented line, 462, 465 oriented surface, 480, 483 on a set of volume zero, 401 on a surface (unoriented), 474, 476 upper, 112, 395 (See also Integrability) Integral Test, 160, 177 Integration by parts, 129 Interchange the order of, 73 a derivative and an integral, 326, 329 iterated limits, 269 a limit and a derivative, 189 a limit and a function, 73 a limit and an integral, 188, 325 mixed partial derivatives, 323, 339(10) the order of integration, 408, 419(10) the order of summation, 172(7), 195 (See also Term-by-term) Interior, of a set, 250, 302 of a surface, 475 Intermediate Value Theorem, 75, 105, 277(9) Intersect, two sets, 251, 304 Intersection, image of, 32 of two sets, 2 of a family of sets, 31
Interval, 10 Interval of Convergence, 199 Inverse function, 25 Inverse Function Theorem, 103, 361 Inverse image, 32 Inversion, 509 Invertible matrix, 579 Irrational number, 4, 222 Isoperimetric Problem, 529 Iterated integral, 407 Iterated limits, 268 j, 232 Jacobian, 359 Jensen's Inequality, 150 Jordan content, 384 Jordan Curve Theorem, 450 Jordan region, 384 characterizations, 386, 391 image of, 389 projectable, 412 union of, 388 of volume zero, 386
k, 232 Kernels, 509, 518(6)
Lagrange Multipliers, 376 Lagrange's Theorem, 214 Laplace Transform, 330 Laplace's equation, 352, 496(10) Large k, holds for, 160 Law of Cosines, 577 Least element, 13 Least upper bound (See Supremum) Lebesgue's Theorem, 284, 404 Left-half space, 553 Left-hand derivative, 151 Left-hand limit, 66 Leibnizian differential, 130 Length, 11, 384, 452 l'Hôpital's Rule, 97 Lie between, 75 Limit, of functions, 58, 66, 264, 298 iterated, 268
left-hand, right-hand, 66 one-sided, 66 of sequences, 35, 256, 293 sequential characterization, 60, 68, 265, 298 of series, 155 two-sided, 58, 67 Limit Comparison Test, 163 Limit infimum, 52 Limit supremum, 52, 166 Lindelöf's Theorem, 278, 309 Line, parametric form, 229, 450 Line integral, 456, 457 oriented, 462, 465 Line segment, 229 Linear function, 236 Linear Properties, 119, 399 Lipschitz class, 531(5) Local extrema, 90(3), 369 with constraints, 374, 376 discriminant for, 373 Lagrange Multipliers, 376 Locally integrable, 137, 437 Logarithm function, 5, 134 Logarithm Test, 181 Lower bound, 21 Lower integral, 112, 395 Lower Riemann sum, 108 Lower sum, 394 Maclaurin series, 208 Manifold, 552 integral on, 562 orientable, 555 -with-smooth-boundary, 554 Manifold boundary, 475 Mathematical Induction, 6, 13 Matrix, algebra of, 577 inverse, 582 representation of linear functions, 238 representation of vectors, 237 transpose, 582 Maximum, of a function, 74, 90(3), 150 in R2, 369, 370
Mean(s), arithmetic, 12 arithmetic-geometric, 49 Cesaro, 513 geometric, 12 Mean Value Theorem, 96, 352 for integrals, 122, 124, 403 Cauchy's, 96 Generalized, 96 real-valued functions, 96 vector-valued functions, 352 Measure zero, 281, 394(9), 403 Metric space, 290 Minimum, of a function, 74, 90(3), 150 Mixed second partials, 322 Möbius strip, parametrization of, 479 Modulus of continuity, 526 Monotone Convergence Theorem, 45, 48 Monotone functions, 102 continuity of, 104 integrability of, 116(8) Monotone Property, 22, 23 Monotone sequences, 45, 280 Multiplication (See Product) Multiplicative, 8, 547 Multiplicative identity, 3 Multiplicative inverse, 3 Multiplicative Properties, 4 Natural logarithm, 5, 134(4) Natural numbers, 4 Natural parametrization, 454, 460 n-dimensional rectangle, 321, 381 n-dimensional region, 468 Negative part, 11, 65 Nested Interval Property, 46 Nested sets, 46 Newton's Theorem, 220 Nilpotent Property, 540 Nondegenerate interval, 11 Nonempty set, 2 Nonnegative number, 4 Nonoverlapping collection, 386 Norm, of a continuous function, 292 Euclidean, 227 $\ell^1$-, 227
of a linear function, 239 of a matrix, 239 operator, 239 of a partition, 107 sup-, 227 of a vector, 227 Normal vector to an explicit surface, 341 induced, 472 to a (hyper)plane, 235 to a surface in parametric form, 472 unit, 479, 483 Nowhere continuous function, 77 Nowhere differentiable function, 223 Numbers, algebraic, 34 irrational, 4 natural, 4 rational, 4 real, 2 transcendental, 34 One-sided derivative, 88, 151 One-sided limits, 66 One-to-one function, 25 Onto function, 25 Open ball, 242, 292 Open covering, 277, 307(2a) Open interval, 10 Open set in a manifold, 552 in a metric space, 292 in Rn, 243 Operator norm, 239 Order Axioms, 4 Ordered n-tuple, 225 Ordered pair, 2 Orientable, manifold, 555 surface, 480, 483 Orientation, of a curve, 461 induced, 483, 555 of a manifold, 555 positive, 483 of a surface, 480, 483 usual, 557 Orientation compatible, 555
Orientation equivalent, 461, 480 Oriented integral, line, 462, 465 on a manifold, 562 surface, 480, 483 Oriented positively, 483 Oriented surface (See Orientable) Orthogonal vectors, 227 Orthogonality of the trigonometric system, 507 Oscillation of a function, 282 Outer sum, 382 Outer volume, 391 Pairwise disjoint sets, 386 Paraboloid, 584 Parallel vectors, 227 Parallelepiped, 241(7), 431(9) Parallelogram, 228, 241(7) Parametric equations (See Parametrization) Parametrization, of curves, 450, 456 natural, 454, 460 of surfaces, 468, 476 trivial, 450, 469 Parseval's Identity, 521 Partial derivative(s) commutation of, 323, 339(10) first-order, 322 mixed, 322 at a point, 322 second-order, 322 on a set, 450 Partial differential equations, Cauchy-Riemann equations, 352 heat equation, 351(5), 358 Laplace's equation, 352, 496(10) wave equation, 351(3) Partial integral, 321 Partial Jacobian, 364 Partial sum, 14, 154, 508 Partial summation, 174 Partition, 107, 590 Partition of unity, 435, 561 Pascal's triangle, 15 Periodic function, 507
Piecewise smooth curve, 456 Piecewise smooth surface, 476 integral on, 476, 483 orientable, 483 oriented integral, 483 surface area, 476 Pigeonhole Principle, 34(9) Plane, 235 distance from a point, 241(8) tangent to a surface, 341, 471 Poincaré Lemma, 567 Point in Rn, 225 Point of accumulation (See cluster) Pointwise convergence, 184, 192 Pointwise increasing, 280 Pointwise sum, of functions, 61, 264 Polar coordinates, 425 Polygonally connected, 277 Polynomial, 71(4), 271(4) trigonometric, 506 Positive definite, 9, 290 Positive number(s), 4, 13 Positive orientation, 483, 491 Positive part, 11, 65 Postulates, 2, 4, 13, 19 Power Rule, 94(8), 129 Power series, 197 closed form, 205 differentiation, 201 integration, 202 interval of convergence, 199 product of, 203 radius of convergence, 198 uniqueness of, 208 Prime numbers, 220 Principle Archimedean, 19 of Mathematical Induction, 13 Well-Ordering, 13 Product Cartesian, 2, 321 cross, 232 of determinants, 581
Product of forms, 540 of functions, 61, 264 inner product, 226 of matrices, 577 of power series, 204 scalar, 226 Product of series, 204 Product Rule, 92, 340, 544 Projectable region, 412 Projection, 413 Proof by contradiction, 6 Proof by induction, 13 Proper maximum, 150 Proper subset, 2 p-Series Test, 162 Quadratic form, 373 Quadric surface, 583 Quotient, derivative of, 92, 347(4) limit of, 62 pointwise, 61 Quotient Rule, 92, 347(4) Raabe's Test, 182 Radius of convergence, 198 Rate of approximation, 177-180 Ratio Test, 167, 180 Rational number, 4 Real function, 24 Real number, 2 Real-valued, 24, 263 Rearrangement of series, 168, 170 Rectangle, 321 connectivity, 275 volume of, 381, 385 Rectifiable curve, 457 Refinement, 107 Reflection, 22 Region, Jordan, 384 n-dimensional, 468 projectable, 412 of types I, II, or III, 412 Relation, 2, 590 Relative balls, 297
Relatively open/closed, 246, 249(8), 255(11), 263(9), 276(6), 313 Remainder term, 209 integral form, 214, 358(6) Riemann integral (See Integrable and Integral) Riemann-Lebesgue Lemma, 520 Riemann sums, 117 lower, upper, 108 Riemann's Theorem, 170, 534 Right-hand derivative, 151 Right-hand limit, 66 Right-hand orientation, 483 Rolle's Theorem, 94 Root Test, 167, 180 Rotation invariant, 431(7) Rotations, 241(9) Saddle point, 370 Scalar, 225 Scalar product, 226 with a function, 61 Secant line, 85 Second Derivative Test, 371 Second formal integral, 534 Second Mean Value Theorem, 124 Second symmetric derivative, 532 Separable, 257, 308 Separate(s) a set, 246, 312 Sequence, 35 bounded, 37, 256, 294 Cauchy, 49, 256, 294 convergent, 35, 256, 294 divergent, 41 increasing, 45 monotone, 45 pointwise convergent, 184 uniformly bounded, 191(5) uniformly convergent, 186 Sequential Characterization of Continuity, 72, 271, 299 of Limits, 60, 68, 265, 298 Sequentially compact, 52, 296 Series, 154 absolutely convergent, 165, 192
alternating, 175 Cauchy Criterion, 157 conditionally convergent, 165 convergent, 155 divergent, 155 Fourier, 508 geometric, 156, 158(1) pointwise convergent, 192 power, 197 product of, 203-204 rearrangements, 168, 170 telescopic, 156 trigonometric, 506 uniformly convergent, 192 Set, 1 bounded, 21 Cantor, 288 closed, 243, 292 compact, 277, 307 connected, 246, 312 countable, 27 empty, 1 inductive, 571 open, 243, 292, 552 sequentially compact, 52, 296 uncountable, 27 of volume zero, 386 Shift Formulas, 575 Sign-Preserving Property, 75 Simple closed curve, 451 sin x, 90(4), 573 Taylor series of, 211 Singularities, 481 Smooth curve, 455 Smooth surface, 472 Smoothly equivalent, curves, 455 surfaces, 473, 476 Space, compact metric, 307 complete metric, 295 connected metric, 312 of continuous functions, 292 Euclidean, 225 metric, 290 separable, 308 Space-filling curve, 449
Sphere, parametrization of, 470 Spherical coordinates, 428 in Rn, 444 Squeeze Theorem for extended limits, 44(6) for functions, 63, 265 on a metric space, 299 for sequences, 39, 44(6) Star-shaped, 567 Stirling's Formula, 447 Stokes's Theorem, 496, 566 Straight line in Rn, 229 Strictly increasing, function, 102 sequence, 45, 280 Subcovering, 259, 277, 307 Subordinate, 436 Subsequence, 37 Subset, 2 proper, 2 Subspace, 291 Subspace topology, 313 Sum, 2 of differential forms, 540 of functions, 61, 264 of matrices, 577 of a series, 155 of two series, 157 Sum-angle formulas, 575 Sum Rule, 92, 340, 544 Summability kernels, 518(6) Summability Question, 513 Summable Abel, 207 Cesaro, 159, 513 Summation by parts, 174 Sup-norm, 227 Support, 432, 539 Supremum, 18 Surface, 468 area of, 474 closed, 476 explicit, 469 integral, 474, 476 quadric, 583 orientation of, 480, 483
Surface parametrization of, 468, 476 piecewise smooth, 476 simple, 468 smooth, 472, 473 Surface area, 474, 476 Surface integral, 474, 476 oriented, 480, 483 Surjection (See Onto functions) Symmetric, 9, 290 Symmetric derivative, second, 532 Tangent line, 86, 454 Tangent plane(s), 341, 478(7) and differentiability, 346 Tangent vector, 455 unit, 461 Tauber's Theorem, 526 Taylor expansions, 208 remainder term, 209, 214, 356 Taylor series (See Taylor expansions) Taylor's Formula, 209, 356, 358 Telescopes, 109 Telescopic Series, 156 Term-by-term, differentiation, 192, 201 integration, 192, 202 Test, Abel's, 177 Alternating, 175, 179 Comparison, 162 Dirichlet's, 174, 193 Divergence, 156 integral, 160, 177 Limit Comparison, 163 Logarithmic, 181 p-Series, 162 Raabe's, 182 Ratio, 167, 180 Root, 167, 180 Second Derivative, 371 Weierstrass M-Test, 193, 327 Topological boundary, 475 Torus, parametrization of, 470 Total derivative, 334 uniqueness of, 334 Total differential, 343, 355
Total variation, 144 Trace, of a curve, 450 of a surface, 468 Transcendental number, 34 Transition, between two curves, 453-5 between two surfaces, 473 Transition maps, 551 Transitive Property, 4 Translation, 393(3) Transpose of a matrix, 582 Triangle inequalities, 9, 230, 290 Trichotomy Property, 4 Trigonometric functions, 90(4), 573 Trigonometric polynomial, 506 Trigonometric series, 506 Trivial parametrization, 450, 469 Twice differentiable, 86 Two-sided limits, 58, 67 Type I, II, or III, 412 Uncountable set, 27 Uniform Cauchy Criterion, 189 Uniform convergence, 186, 192, 327 (See also Term-by-term) Uniformly bounded, 191(5) Uniformly continuous, 80, 271, 310 characterization on intervals, 82 integrability of, 405(4) Union, of a family of sets, 31 image of, 32 of two sets, 2 Unique, 25 Uniqueness, of identities and inverses, 570 of power series, 208 of the total derivative, 334 of trigonometric series, 536 Uniqueness Question, 509 Unit normal vector, 479, 483 Unit tangent vector, 461 Unity, partition of, 435, 561 Upper bound, 18 Upper integral, 112, 395 Upper Riemann sum, 108 Upper sum, 394
Urysohn's Lemma, 434 Usual basis, 232 Usual metric, 290 Usual orientation, 557 Vacuous implication, 244 Value of a series, 155 Variables, change of, 130, 424, 440 Variation, bounded, 142 total, 144 Vector(s), 225 angle between, 231 components, 225 difference, 226 equality, 225 identification with matrices, 237 parallel, 227 sum, 226 Vector function, 263 continuity, 271 differentiability, 333, 560(5)
Volume, 384 of a ball, 444 connection with determinants, 431(9) connection with $\times$, 241(7) integral form, 398 of a parallelepiped, 241(7), 431(9) of a rectangle, 381 of volume zero, 386 Wave equation, 351(3) Weierstrass Approximation Theorem, 517 Weierstrass M-Test, 193, 327 Weierstrass's Theorem, 223 Well-Ordering Principle, 13, 572 Zero, of area, 386 of volume, 386 Zero form, 538 Zero vector, 225
Notation index
Symbol, description ... page(s) defined
$\emptyset$, the empty set ... 1
$\infty$, $-\infty$, infinity (negative infinity) ... 23
$\in$, $\notin$, an element of (not an element of) ... 1
$\subseteq$, $\subset$, is a subset of (is a proper subset of) ... 2
$A \cup B$, $A \cap B$, the union (intersection) of $A$ and $B$ ... 2
$A \setminus B$, $B^c$, the complement of $B$ relative to $A$ (to a universal set) ... 2
$A \times B$, the Cartesian product of $A$ and $B$ ... 2
$\bigcup_{k \in \mathbf{N}}$, $\bigcup_{\alpha \in A}$, the union of a sequence (a family) of sets ... 30, 31
$\bigcap_{k \in \mathbf{N}}$, $\bigcap_{\alpha \in A}$, the intersection of a sequence (a family) of sets ... 30, 31
$\lim_{x \to a+}$, $\lim_{x \to a-}$, the right-hand (left-hand) limit ... 66
$\sup E$, $\inf E$, the supremum (infimum) of a set $E$ ... 18, 21
$\sum_{k=1}^{n} a_k$, $\sum_{k=1}^{\infty} a_k$, a finite (infinite) sum or series ... 14, 155
$\mathbf{N}$, the set of natural numbers ... 4
$\mathbf{Q}$, the set of rational numbers ... 4
$\mathbf{R}$, the set of real numbers ... 2
$\mathbf{Z}$, the set of integers ... 4
$\le$, $<$, less than or equal (strictly less than) ... 4
$|a|$, the absolute value of $a$ ... 8
$a^+$, $a^-$, the positive (negative) part of a number $a$ ... 11
$(a,b)$, $[a,b]$, open (closed) interval with endpoints $a$ and $b$ ... 10
$|I|$, $|R|$, the length of an interval $I$ (the volume of a rectangle $R$) ... 11, 381
$x_n \uparrow a$, $x_n \downarrow a$, an increasing (decreasing) sequence which converges to $a$ ... 45
$\limsup$, $\liminf$, the limit supremum (infimum) ... 52, 166
$e^x$, the exponential function ... 5, 134(5)
$\log x$, the natural logarithm of $x$ ... 5, 134(4)
$R_n^{f,x_0}$, the remainder term of the Taylor expansion of $f$ at $x_0$ ... 209
$\sin x$, $\cos x$, sine (cosine) of $x$ ... 90(4), 573
$x^\alpha$, an irrational power of $x$ ... 5, 135(5e)
$\Gamma(x)$, the gamma function evaluated at $x$ ... 442
$\mathbf{R}^n$, n-dimensional Euclidean space ... 225
$e_j$, the usual basis of $\mathbf{R}^n$ ... 232
$\|x\|$, $\|x\|_\infty$, $\|x\|_1$, the norm, the sup-norm (the $\ell^1$-norm) of a vector $x$ ... 227
$\|B\|$, the operator norm of a matrix (or linear function) ... 239
$B_r(a)$, the open ball centered at $a$ of radius $r$ ... 242, 292
$L(a;b)$, the line segment between $a$ and $b$ ... 229
$\Pi_n(a)$, the (hyper)plane with normal $n$ passing through $a$ ... 235
$E^o$, $\overline{E}$, the interior (closure) of a set $E$ ... 250, 302
$\partial E$, $\partial S$, the boundary of a set $E$ (surface $S$) ... 251, 304, 475
$\mathrm{Area}(E)$, $\mathrm{Vol}(E)$, the area (volume) of a set $E$ ... 384
$\overline{\mathrm{Vol}}(E)$, $\underline{\mathrm{Vol}}(E)$, the outer (inner) volume of $E$ ... 391
$L(C)$, the arc length of a curve $C$ ... 452, 457
$\sigma(S)$, the surface area of a surface $S$ ... 474, 476
$f : X \to Y$, a function from $X$ to $Y$ ... 2
$f^+$, $f^-$, the positive (negative) part of a function $f$ ... 65
$f \circ g$, the composition of $f$ with $g$ ... 73
$f^{-1}$, the inverse function of $f$ ... 26
$f(E)$, $f^{-1}(E)$, the image (inverse image) of $E$ under $f$ ... 32
$f(a+)$, $f(a-)$, the right-hand (left-hand) limit of $f$ at $a$ ... 66
$f'$, $Df$, the derivative (total derivative) of $f$ ... 85, 334
$f^{(k)}$, the derivative of $f$ of order $k$ ... 86
$D_R f$, $D_L f$, the right-hand (left-hand) derivative of $f$ ... 151
$\Delta_f$, the Jacobian of $f$ ... 359
$\nabla f$, the gradient of $f$ ... 335
$\mathrm{curl}\,F$, $\mathrm{div}\,F$, the curl (divergence) of $F$ ... 490
$D^{(p)} f(a;h)$, the $p$th total differential of $f$ at $a$ and $h$ ... 355
$d\omega$, $\phi^*(\omega)$, the exterior derivative (differential transform) of $\omega$ ... 543, 546
$C^p$, continuously differentiable of order $p$ ... 89, 322
$C^\infty$, $C_c^\infty$, infinitely differentiable (of compact support) ... 89, 322, 432
$\|P\|$, the norm of a partition $P$ ... 107
$U(f,P)$, $L(f,P)$, the upper (lower) Riemann sum of $f$ over a partition $P$ ... 108
$U(f,\mathcal{G})$, $L(f,\mathcal{G})$, the upper (lower) sum of $f$ over a grid $\mathcal{G}$ ... 394, 395
$(U)\int$, $(L)\int$, the upper (lower) Riemann integral ... 112, 395
$\mathrm{Var}(\phi)$, $V(\phi,P)$, the variation of $\phi$ (over $P$) ... 142
$V(E;\mathcal{G})$, $v(E;\mathcal{G})$, outer (inner) sum of $E$ with respect to $\mathcal{G}$ ... 382, 390
$\phi'$, $N_\phi$, the tangent (normal) vector induced by $\phi$ ... 455, 472
$T$, $n$, the unit tangent (normal) vector of a curve (surface) ... 461, 479
$\int_E f\,dA$, $\int_E f\,dV$, the Riemann integral of $f$ on $E \subset \mathbf{R}^2$ ($E \subset \mathbf{R}^n$) ... 395
$\int_C g\,ds$, the line integral of $g$ over $C$ ... 456, 457
$\int_C F \cdot T\,ds$, the oriented line integral of $F$ along $C$ ... 462, 465
$\iint_S g\,d\sigma$, the surface integral of $g$ over $S$ ... 474, 476
$\iint_S F \cdot n\,d\sigma$, the oriented surface integral of $F$ on $S$ ... 480, 483
$\int_M \omega$, the integral of a differential form $\omega$ on a manifold $M$ ... 562
$a_k(f)$, $b_k(f)$, Fourier coefficients of $f$ ... 507
$D_n$, $K_n$, the Dirichlet (Fejér) kernel of order $n$ ... 509
$(S_N f)(x)$, the $N$th partial sum of the Fourier series of $f$ evaluated at $x$ ... 508
$(\sigma_N f)(x)$, the $N$th Cesàro mean of the Fourier series of $f$ evaluated at $x$ ... 513