ADVANCED CALCULUS An Introduction to Linear Analysis
Leonard F. Richardson
WILEY-INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION
Copyright © 2008 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data:

Richardson, Leonard F.
Advanced calculus : an introduction to linear analysis / Leonard F. Richardson.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-470-23288-0 (cloth)
1. Calculus. I. Title.
QA303.2.R53 2008
515--dc22
2008007377

Printed in Mexico

10 9 8 7 6 5 4 3 2
To Joan, Daniel, and Joseph
CONTENTS

Preface
Acknowledgments
Introduction

PART I  ADVANCED CALCULUS IN ONE VARIABLE

1  Real Numbers and Limits of Sequences
   1.1  The Real Number System
        Exercises
   1.2  Limits of Sequences & Cauchy Sequences
        Exercises
   1.3  The Completeness Axiom and Some Consequences
        Exercises
   1.4  Algebraic Combinations of Sequences
        Exercises
   1.5  The Bolzano-Weierstrass Theorem
        Exercises
   1.6  The Nested Intervals Theorem
        Exercises
   1.7  The Heine-Borel Covering Theorem
        Exercises
   1.8  Countability of the Rational Numbers
        Exercises
   1.9  Test Yourself
        Exercises

2  Continuous Functions
   2.1  Limits of Functions
        Exercises
   2.2  Continuous Functions
        Exercises
   2.3  Some Properties of Continuous Functions
        Exercises
   2.4  Extreme Value Theorem and Its Consequences
        Exercises
   2.5  The Banach Space C[a, b]
        Exercises
   2.6  Test Yourself
        Exercises

3  Riemann Integral
   3.1  Definition and Basic Properties
        Exercises
   3.2  The Darboux Integrability Criterion
        Exercises
   3.3  Integrals of Uniform Limits
        Exercises
   3.4  The Cauchy-Schwarz Inequality
        Exercises
   3.5  Test Yourself
        Exercises

4  The Derivative
   4.1  Derivatives and Differentials
        Exercises
   4.2  The Mean Value Theorem
        Exercises
   4.3  The Fundamental Theorem of Calculus
        Exercises
   4.4  Uniform Convergence and the Derivative
        Exercises
   4.5  Cauchy's Generalized Mean Value Theorem
        Exercises
   4.6  Taylor's Theorem
        Exercises
   4.7  Test Yourself
        Exercises

5  Infinite Series
   5.1  Series of Constants
        Exercises
   5.2  Convergence Tests for Positive Term Series
        Exercises
   5.3  Absolute Convergence and Products of Series
        Exercises
   5.4  The Banach Space l_1 and Its Dual Space
        Exercises
   5.5  Series of Functions: The Weierstrass M-Test
        Exercises
   5.6  Power Series
        Exercises
   5.7  Real Analytic Functions and C^∞ Functions
        Exercises
   5.8  Weierstrass Approximation Theorem
        Exercises
   5.9  Test Yourself
        Exercises

PART II  ADVANCED TOPICS IN ONE VARIABLE

6  Fourier Series
   6.1  The Vibrating String and Trigonometric Series
        Exercises
   6.2  Euler's Formula and the Fourier Transform
        Exercises
   6.3  Bessel's Inequality and l_2
        Exercises
   6.4  Uniform Convergence & Riemann Localization
        Exercises
   6.5  L^2-Convergence & the Dual of l_2
        Exercises
   6.6  Test Yourself
        Exercises

7  The Riemann-Stieltjes Integral
   7.1  Functions of Bounded Variation
        Exercises
   7.2  Riemann-Stieltjes Sums and Integrals
        Exercises
   7.3  Riemann-Stieltjes Integrability Theorems
        Exercises
   7.4  The Riesz Representation Theorem
        Exercises
   7.5  Test Yourself
        Exercises

PART III  ADVANCED CALCULUS IN SEVERAL VARIABLES

8  Euclidean Space
   8.1  Euclidean Space as a Complete Normed Vector Space
        Exercises
   8.2  Open Sets and Closed Sets
        Exercises
   8.3  Compact Sets
        Exercises
   8.4  Connected Sets
        Exercises
   8.5  Test Yourself
        Exercises

9  Continuous Functions on Euclidean Space
   9.1  Limits of Functions
        Exercises
   9.2  Continuous Functions
        Exercises
   9.3  Continuous Image of a Compact Set
        Exercises
   9.4  Continuous Image of a Connected Set
        Exercises
   9.5  Test Yourself
        Exercises

10  The Derivative in Euclidean Space
   10.1  Linear Transformations and Norms
         Exercises
   10.2  Differentiable Functions
         Exercises
   10.3  The Chain Rule in Euclidean Space
         10.3.1  The Mean Value Theorem
         10.3.2  Taylor's Theorem
         Exercises
   10.4  Inverse Functions
         Exercises
   10.5  Implicit Functions
         Exercises
   10.6  Tangent Spaces and Lagrange Multipliers
         Exercises
   10.7  Test Yourself
         Exercises

11  Riemann Integration in Euclidean Space
   11.1  Definition of the Integral
         Exercises
   11.2  Lebesgue Null Sets and Jordan Null Sets
         Exercises
   11.3  Lebesgue's Criterion for Riemann Integrability
         Exercises
   11.4  Fubini's Theorem
         Exercises
   11.5  Jacobian Theorem for Change of Variables
         Exercises
   11.6  Test Yourself
         Exercises

Appendix A: Set Theory
   A.1  Terminology and Symbols
        Exercises
   A.2  Paradoxes

Problem Solutions
References
Index
PREFACE
Why this Book was Written

The course known as Advanced Calculus (or Introductory Analysis) stands at the summit of the requirements for senior mathematics majors. An important objective of this course is to prepare the student for a critical challenge that he or she will face in the first year of graduate study: the course called Analysis I, Lebesgue Measure and Integration, or Introductory Functional Analysis.

We live in an era of rapid change on a global scale. And the author and his department have been testing ways to improve the preparation of mathematics majors for the challenges they will face.

During the past quarter century the United States has emerged as the destination of choice for graduate study in mathematics. The influx of well-prepared, talented students from around the world brings considerable benefit to American graduate programs. The international students usually arrive better prepared for graduate study in mathematics (in particular, better prepared in analysis) than their typical U.S. counterparts. There are many reasons for this, including (a) school systems abroad that are oriented toward teaching only the brightest students, and (b) the self-selection that is part of a student taking the step of travel abroad to study in a foreign culture.

The presence of strongly prepared international students in the classroom raises the level at which courses are taught. Thus it is appropriate at the present time, in the early years of the new millennium, for college and university mathematics departments to reconsider their advanced calculus courses with an eye toward preparing graduates for the international environment in American graduate schools. This is a challenge, but it is also an opportunity for American students and international students to learn side-by-side with, and also about, one another. It is more important than ever to teach undergraduate advanced calculus or analysis in such a way as to prepare and reorient the student for graduate study as it is today in mathematics.

Another recent change is that applied mathematics has emerged on a large scale as an important component of many mathematics departments. In applied and numerical mathematics, functional analysis at the graduate level plays a very important role.

Yet another change that is emerging is that undergraduates planning careers in the secondary teaching of mathematics are being required to major in mathematics instead of education. These students must be prepared to teach the next generation of young people for the world in which they will live. Whether or not the mathematics major is planning an academic career, he or she will benefit from better preparation in advanced calculus for careers in the emerging world.

The author has taught mathematics majors and graduate students for thirty-seven years. He has served as director of his department's graduate program for nearly two decades. All the changes described above are present today in the author's department. This book has been written in the hope of addressing the following needs.

1. Students of mathematics should acquire a sense of the unity of mathematics. Hence a course designed for senior mathematics majors should have an integrative effect. Such a course should draw upon at least two branches of mathematics to show how they may be combined with illuminating effect.

2. Students should learn the importance of rigorous proof and develop skill in coherent written exposition to counter the universal temptation to engage in wishful thinking. Students need practice composing and writing proofs of their own, and these must be checked and corrected.

3. The fundamental theorems of the introductory calculus courses need to be established rigorously, along with the traditional theorems of advanced calculus, which are required for this purpose.

4. The task of establishing the rigorous foundations of calculus should be enlivened by taking this opportunity to introduce the student to modern mathematical structures that were not presented in introductory calculus courses.

5. Students should learn the rigorous foundations of calculus in a manner that reorients thinking in the directions taken by modern analysis. The classic theorems should be couched in a manner that reflects the perspectives of modern analysis.
Features of this Text

The author has attempted to address the needs presented above in the following manner.

1. The two parts of mathematics that have been studied by nearly every mathematics major prior to the senior year are introductory calculus, including calculus of several variables, and linear algebra. Thus the author has chosen to highlight the interplay between the calculus and linear algebra, emphasizing the role of the concepts of a vector space, a linear transformation (including a linear functional), a norm, and a scalar product. For example, the customary theorem concerning uniform limits of continuous functions is interpreted as a completeness theorem for C[a, b] as a vector space equipped with the sup-norm. The elementary properties of the Riemann integral gain coherence expressed as a theorem establishing the integral as a bounded linear functional on a convenient function-space. Similarly, the family of absolutely convergent series is presented from the perspective that it is a complete normed vector space equipped with the l_1-norm.

2. Many exercises are offered for each section of the text. These are essential to the course. An exercise preceded by a dagger symbol † is cited at some point in the text. Such citations refer to the exercise by section and number. An exercise preceded by a diamond symbol ◊ is a hard problem. If a hard problem will be cited later in the text, then there will be a footnote to say precisely where it will be cited. This is intended to help the professor decide whether or not an exercise should be assigned to a particular class based upon his or her planned coverage for the course. Topics that can be omitted at the professor's discretion without disturbing continuity of the course are so indicated by means of footnotes.

3. At the end of each chapter there is a brief section called Test Yourself, consisting of short questions to test the student's comprehension of the basic concepts and theorems. The answers to these short questions, and also to other selected short questions, appear in an appendix. There are no proofs provided among those answers to selected questions. The reason is that there are many possible correct proofs for each exercise. Only the professor or the professor's designated assistant will be able to properly evaluate and correct the student's writing in exercises requiring proofs.

4. The Introduction to this book is intended to introduce the student to both the importance and the challenges of writing proofs. The guidance provided in the introduction is followed by corresponding illustrative remarks that appear after the first proof in each of the five chapters of Part I of this text.

5. Whether a professor chooses to collect written assignments or to have students present proofs at the board in front of the class, each student must regularly construct and write proofs. The coherence and the presentation of the arguments must be criticized.
6. Most of the traditional theorems of elementary differential and integral calculus are developed rigorously. Since the orientation of the course is toward the role of normed vector spaces, Cauchy completeness is the most natural form of the completeness concept to use. Thus we present the system of real numbers as a Cauchy-complete Archimedean ordered field. The traditional theorems of advanced calculus are presented. These include the elements of the study of integrable and differentiable functions, extreme value theorems, Mean Value Theorems, and convergence theorems, the polynomial approximation theorem of Weierstrass, the inverse and implicit function theorems, Lebesgue's theorem for Riemann integrability, and the Jacobian theorem for change of variables.

7. Students learn in this course such concepts as those of a complete normed vector space (real Banach space) and a bounded linear functional. This is not a course in functional analysis. Rather the central theorems and examples of advanced calculus are treated as instances and motivations for the concepts of functional analysis. For example, the space of bounded sequences is shown to be the dual space of the space of absolutely summable sequences.

8. The concept of this book is that the student is guided gradually from the study of the topology of the real line to the beginning theorems and concepts of graduate analysis, expressed from a modern viewpoint. Many traditional theorems of advanced calculus list properties that amount to stating that a certain set of functions forms a vector space and that this space is complete with respect to a norm. By phrasing the traditional theorems in this light, we help the student to mentally organize the knowledge of advanced calculus in a coherent and meaningful manner while acquiring a helpful reorientation toward modern graduate-level analysis.
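The norms mentioned in the features above can be written out explicitly. What follows is only a sketch in standard notation, not a quotation from the later chapters; the symbols f_n, x = (x_k), y = (y_k), and T_y are names assumed here for illustration.

```latex
% Sup-norm on C[a,b]; the supremum is finite because a continuous
% function on a closed bounded interval is bounded.
\[
  \|f\|_{\sup} = \sup_{a \le t \le b} |f(t)|, \qquad f \in C[a,b].
\]
% Uniform convergence is exactly convergence in the sup-norm.
\[
  f_n \to f \ \text{uniformly on } [a,b]
  \quad\Longleftrightarrow\quad
  \|f_n - f\|_{\sup} \to 0 .
\]
% The l_1-norm of an absolutely summable sequence x = (x_k).
\[
  \|x\|_{1} = \sum_{k=1}^{\infty} |x_k| < \infty .
\]
% Each bounded sequence y = (y_k) defines a bounded linear functional
% T_y on l_1 (T_y is an assumed name for this illustration).
\[
  T_y(x) = \sum_{k=1}^{\infty} x_k\, y_k,
  \qquad
  |T_y(x)| \le \Bigl(\sup_{k} |y_k|\Bigr)\, \|x\|_{1} .
\]
```

Read this way, the statement that a uniform limit of continuous functions is continuous says that a Cauchy sequence in C[a, b] with respect to the sup-norm converges to an element of C[a, b]. The last inequality shows why bounded sequences are natural candidates for the dual space of the absolutely summable sequences.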
Course Plans that Are Supported by this Book

Part I of this book consists of five chapters covering most of the standard one-variable topics found in two-semester advanced calculus courses. These chapters are arranged in order of dependence, with the later chapters depending on the earlier ones. Though the topics are mainly the ones typically found, they have been reoriented here from the viewpoint of linear spaces, norms, completeness, and linear functionals.

Part II offers a choice of two mutually independent advanced one-variable topics: either Fourier series or Stieltjes integration. It is especially the case in Part II that each professor's individual judgment about the readiness of his or her class should guide what is taught. Some of these topics will not be for the average student, but will make excellent reading material for the student seeking honors credit or writing a senior thesis. Individual reading courses can be employed very effectively to provide advanced experience for the prospective graduate student.

In Chapter 6 the introduction of Fourier series is aided by inclusion of complex-valued functions of a real variable. This is the only chapter in which complex-valued functions appear, and with these the Hermitian inner product is introduced. The chapter includes l_2 and its self-duality, convergence in the L^2-norm,¹ the uniform convergence of Fourier series of smooth functions, and the Riemann localization theorem. The study of a vibrating string is presented to motivate the chapter.

¹ The L^2 norm is used here exclusively with the Riemann integral.

Chapter 7, which is about Stieltjes integration, includes functions of bounded variation and the Riesz Representation Theorem, presenting the dual space of C[a, b] in terms of Stieltjes integration. The latter theorem of F. Riesz is the hardest one presented in this book. It is not required for the later chapters. However, it is an excellent theorem for a promising student planning subsequent doctoral study, and it requires only what has been learned previously in this course. It is a century since the discovery of the Riesz Representation Theorem. The author thinks it is time for it to take its place in an undergraduate text for the twenty-first century.

Part III is about several-variable advanced calculus, including the inverse and implicit function theorems, and the Jacobian theorems for multiple integrals. Where the first two parts place emphasis on infinite-dimensional linear spaces of functions, the third part emphasizes finite-dimensional spaces and the derivative as a linear transformation.

At Louisiana State University, Advanced Calculus is offered as a three-semester triad of courses.² The first semester is taken by all and is the starting point regardless of the subsequent choices. But the other two semesters can be taken in either order. This enables the Department to offer all three semesters each year, with the first semester offered in both fall and spring, and the two other courses being offered with only one of them each semester. These courses are not rushed. One must allow sufficient time for the typical undergraduate mathematics major to learn to prove theorems and to absorb the new concepts. It is the author's experience that all too often, courses in analysis are inadvertently sabotaged by packing too much subject matter into one term. It is best to teach students to take enough time to learn well and learn deeply.

² Mathematics majors planning careers in high-school teaching take at least the first semester, while the others must take at least two of the three semesters. Those students who are contemplating graduate study in mathematics are advised strongly to take all three semesters.

A few words about testing procedures may be helpful too. At the author's institution, and at many others also, it is important to teach Advanced Calculus in a manner that is suitable for both those students who are preparing for graduate study in mathematics and those who are not. The author finds that it is appropriate to divide each test into two approximately equal parts: one for short questions of the type represented in the Test Yourself sections of this book, and the other consisting of proofs representative of those assigned and collected for homework. Although one would like each student to excel in both, there are many students who excel in one class of question but not the other. And there are indeed many students who do better in proofs than in the concept-testing short questions. Thus tests that combine both types of question provide fuller information about each student and give an opportunity for more students to show what they can do. The author always gives a choice of questions in each of the two categories: typically eight out of twelve for the short questions, and two out of three for the proofs, for a one-hour test. The pass rate in these courses is actually high, despite the depth of the subject. Naturally, each professor will need to determine the best approach to testing for his or her own class.

It is most common for colleges and universities to offer either a single semester or else a two-semester sequence in Advanced Calculus or Undergraduate Analysis. Below the author has indicated practical syllabi for a one-semester course, as well as three alternative versions of a two-semester course. It should be understood that, depending on the readiness of the class, it may be possible to do more.
• Single-semester course: Sections 1.1-1.8, 2.1-2.4, 3.1-3.3, and 4.1-4.3.

• Two-semester course leading to Stieltjes integration:
  1. Chapters 1-3 for the first semester
  2. Chapters 4, 5, and 7 for the second semester

• Two-semester course leading to Fourier series:
  1. Chapters 1-3 for the first semester
  2. Chapters 4-6 for the second semester

• Two-semester course leading to the inverse and implicit function theorems:
  1. Sections 1.1-1.8, 2.1-2.4, 3.1-3.3, and 4.1-4.3 for the first semester
  2. Sections 8.1-8.3, 9.1-9.3, and 10.1-10.3 for the second semester

• Three-semester course, with parts 2 and 3 interchangeable in order:
  1. Chapters 1-3 for the first semester
  2. Either
     (a) Chapters 4-6 for the second semester or
     (b) Chapters 4, 5, and 7 for the second semester
  3. Sections 8.1-8.3, 9.1-9.3, and 10.1-10.3 for the third semester, and with Chapter 11 if there is sufficient time.

No doubt there are other possible combinations. Whatever is the choice made, the author hopes that the whole academic community of mathematicians will devote an increased number of courses to the teaching of analysis to undergraduate mathematics majors.

LEONARD F. RICHARDSON
Baton Rouge, Louisiana
August, 2007
ACKNOWLEDGMENTS
It is a pleasure to thank several colleagues at Louisiana State University who have contributed useful ideas, corrections, and suggestions. They are Professors Jacek Cygan, Mark Davidson, Charles Delzell, Raymond Fabec, Jerome Hoffman, Richard Litherland, Gestur Olafsson, Ambar Sengupta, Lawrence Smolinsky, and Peter Wolenski. Several of these colleagues taught classes using the manuscript that became this book. It is a pleasure also to thank Professor Kenneth Ross, of the University of Oregon, who provided many helpful corrections to the first printing. Of course the errors that remain are entirely my own responsibility, and further corrections and suggestions from the reader will be much appreciated.

In the academic year 1962-1963 I was a student in an advanced calculus course taught by Professor Frank J. Hahn at Yale University. His inclusion in that course of the Riesz Representation Theorem and its proof was a highlight of my undergraduate education. Though I didn't realize it at the time, that course likely was the source of the idea for this book.

Professor Hahn was a young member of the Yale faculty when I was a student in his advanced calculus course that included the Riesz theorem. He was an extraordinary and generous teacher. I became his PhD student, but his death intervened about a year later. Then Professor George D. Mostow adopted me as his student. Professor Mostow took an interest in improving undergraduate education in mathematics, having co-authored a book [14] that had as one of its goals the earlier inclusion and integration of abstract algebra into the undergraduate curriculum.

I have been very fortunate with regard to my teachers. They taught lessons that grow over time like branches, integral parts of one tree. I am grateful for the opportunity to record my gratitude and indebtedness to them.

My book is intended to facilitate the integration of linear spaces, functionals and transformations, both finite- and infinite-dimensional, into Advanced Calculus. It is not a new idea that mathematics should be taught to undergraduate students in a manner that demonstrates the overarching coherence of the subject. As mathematics grows, in both pure and applied directions, the need to emphasize its unity remains a pressing objective.

Questions and observations from students over the years have resulted in numerous exercises and explanatory remarks. It has been a privilege to share some of my favorite mathematics with students, and I hope the experience has been a good one for them.

I am grateful to John Wiley & Sons for the opportunity to offer this book, as well as the course it represents and advocates, to a wider audience. I appreciate especially the role of Ms. Susanne Steitz-Piller, the Mathematics and Statistics Editor of John Wiley & Sons, in making this opportunity available. She and her colleagues provided valued advice, support, and technical assistance, all of which were needed to transform a professor's course notes into a book.

L.F.R.
INTRODUCTION
Why Advanced Calculus is Important

What is the meaning of knowledge? And what is the meaning of learning? The author believes these are questions that must be addressed in order to grasp the purpose of advanced calculus. In primary and secondary education, and also in some introductory college courses, we are asked to accept many statements or claims and to remember them, perhaps to apply them. Individuals vary greatly in temperament and are more willing or less willing to acquiesce in the acceptance of what is taught. But whether or not we are inclined to do so, we must ask responsible questions about the basis upon which knowledge rests. Here are a few examples.

• Have we been taught accurate renditions of the history of our civilization? Is there nothing to indicate that history is presented sometimes in a biased or misleading way?

• Were we taught correct claims about the nature of the physical or biological world? Are there not examples of famous claims regarding the natural sciences, endorsed ardently, yet proven in time to be false?
• How do we know what is or is not true about mathematics? Is there no record of error or disagreement? Is there an infallible expert who can be trusted to tell correctly the answers to all questions?
• If there are authorities who can be trusted without doubt to instruct us correctly, what will be our fate when these authorities, perhaps older than ourselves, die? Can we not learn for ourselves to determine the difference between truth and falsehood, between valid reason and error?

In the serious study of history, one must learn how to search for records or evidence and how to appraise its reliability. In the natural sciences, one must learn to construct sound experiments or to conduct accurate observations so as to distinguish between truth and wishful thinking. And in the study of mathematics it is through logical proof by deductive reasoning that we can check our thinking or our guesswork.

Learning how to confirm the foundations of our knowledge transforms us from receptacles for the claims made by others into stewards for the knowledge mankind has acquired through millennia of exertion. It is both our right as human beings and our responsibility to assume this role.

Throughout our lives, we find ourselves with the need to resolve the conflict between opposing forces. On the one hand, the human mind is impulsive, eager to leap from one spot to another that may have a clearer view. This spark is an engine of creativity. We would not be human in its absence. It is also our Achilles' heel. Training and self-discipline are required that we may distinguish the worthwhile leaps of imagination from the faulty ones.

A vital aspect of the self-discipline that must be learned by each student of mathematics is that proofs must be written down, scrutinized step-by-step, and rewritten wherever there is doubt. In a proof the reasoning must be solid and secure from start to finish. There is no one among us who can reliably devise a proof mentally, leaving it unwritten and unscrutinized. Indeed, mankind's capacity for wishful thinking is boundless. Discipline in the standard of logical proof is severe, and it is essential to our task.

Mathematics is not a spectator sport. It can be learned only by doing. It is necessary but never sufficient to watch proofs being constructed by an experienced practitioner. The latter activity (which includes attendance in class and active participation, as well as careful study of the text) can help one to learn good technique. But only the effort of writing our own proofs can teach each of us by trial and error how to do it. See this as not only a warning but also good news that strenuous effort in this work is effective. From more than three decades of teaching as well as personal experience, the author can assure each student that this is so.

It is possible also to assure the student that through vigorous effort in mathematics the student may come to enjoy this subject very much and to relish the light that it can shed. Even a seemingly small question can be a portal to a whole world of unforeseen surprise and wonder. In this spirit it is a pleasure to welcome the student and the reader to advanced calculus.
Learning to Write Proofs: A Guide for the Perplexed Student I want to do my proof-writing homework, but I don't know how to begin! It is an oftheard lament. In elementary mathematics courses, the student is provided customarily with a set of instructions, or algorithms, that will lead upon implementation to the solution of certain types of problems. Thus many conscientious students have requested instructions for writing proofs. All sets of instructions for writing proofs, however, suffer from one defect: They do not work. Yet one can learn to write proofs, and there are many living mathematicians and successful mathematics students whose existence proves this point. The author believes that learning to write proofs is not a matter of following theorem-proving instructions. The answer lies rather in learning how to study advanced calculus. The student, having been in school for much of his or her life, may bridle at the suggestion that he or she has not learned how to study. Yet in the case of studying theoretical mathematics, that is very likely to be true. Every single theorem and every single proof that is presented in this book, or by the student's professor in class, is a vivid example of theorem-proving technique. But to benefit from these fine examples, the student must learn how to study. Mathematicians find that the best way to read mathematics is with paper and pencil! This means that it is the reader's task to figure out how to think about the theorem and its proof and to write it down coherently. In reading the proofs of theorems in this text, or in the study of proofs presented by one's teacher in class, the student must understand that what is written is much more than a body of facts to be remembered and reproduced upon demand. Each proof has a story that guided the author in its writing. 
There is a beginning (the hypotheses), a challenge (the objective to be achieved), and a plan that might, with hard work, skill, and good fortune, lead to the desired conclusion. It will take time and a concerted effort for the student to learn to think about the statements and proofs of the presented theorems in this light. Such practice will cultivate the ability to read the exercises as well in a fruitful manner. With experience at recognizing the story of the proof or problem at hand, the student will be in a position to develop technique through the work done in the exercises. The first step, before attempting to read a proof, is to read the statement of the theorem carefully, trying to get an overall picture of its content. The student should make sure he or she knows precisely the definition of each term used in the statement of the theorem. Without that information, it is impossible to understand even the claim of the theorem, let alone its proof. If a term or a symbol in the statement of a theorem or exercise is not recognized, look in the index! Write on paper what you find. After clarifying explicitly the meaning of each term used, if the student does not see what the theorem is attempting to achieve, it is often helpful to write down a few examples to see what difficulties might arise, leading to the need for the theorem. Working with examples is the mathematical equivalent of laboratory work for a natural scientist. At this point the student will have read the statement of the theorem at least twice, and probably more often than that, accumulating written notes on a scratch pad along the way. Read the theorem again! Remember that in constructing
a building or a bridge, it is not a waste of time to dwell upon the foundation. The author has assured many students, from freshman to doctoral level, that the way to make faster progress is to slow down, especially at the outset. If you were planning a grand two-week backpacking trip in a national park, would you simply run out of the house? Of course not; you would plan and make preparations for the coming adventure. At this point we suppose the reader understands the statement of the theorem and wishes next to learn why the claimed conclusion is true. How does the author or teacher in class overcome the obstacles at hand? Read the whole proof a first time, taking written notes as to what combination of steps the author has chosen to proceed from the hypotheses to the conclusions. This first reading of the proof itself can be likened to one's first look at a road map drawn for a cross-country trip. It will give one an overall sense of the journey ahead. But taking the trip, or walking the walk, is another matter. Having noted that the journey ahead can be divided into segments, much like a trip with several overnight stops, the student should begin in earnest at the beginning. For each leg of the journey, it is important to understand thoroughly, and to write on paper, the logical justification of each individual step. There must be no magical disappearance from point A and reappearance at point B! No external authority can be substituted for the student's own understanding of each step taken. It is both the right and the responsibility of the student to understand in full detail.³ By studying the theorems in this book in the manner explained above, the student will cultivate the modes of thinking that will enable him or her to write the proofs that are required in the exercises. The exercises are a vital part of this course, and the proof exercises are the most important of all.
There is an answer section for selected short-answer exercises among the appendices of this book. It includes all the answers to the Test Yourself self-tests at the ends of the chapters. But the student will not find solutions to the proof exercises there. That is because it is not satisfactory merely to copy a written proof. Many correct proofs are possible. Only an experienced teacher can judge the correctness and the quality of the proofs you write. The student can and must depend upon his or her professor or the professor's designated assistant to read and correct proofs written as exercises. One of the ways that a teacher can help a student is by explaining that he or she has been where the student stands. The student is not alone and can meet the challenges ahead much as his or her teacher has done before. When the author was young, he had long walks to and from school: about twenty minutes each way at a brisk pace. It was a favorite pastime during these walks to review mentally the logical structure of advanced calculus, reconstructing the proofs of theorems about Riemann integrals or uniform convergence from the axioms of the real number system. Many colleagues within mathematics, and some from theoretical physics, have shared with the author similar experiences from their own lives. It is the active engagement with a subject that builds firm understanding and that incorporates the knowledge gained into one's own mind. Experiences in life can be enjoyed only once for the first time. The student is about to embark on a mathematical adventure with advanced calculus for his or her first time. Neither the author nor your teacher can do this again. But we can wish you a wonderful journey, and we do.
³ The student should reread this introduction before reading Remark 1.1.1, which appears after the proof of the first theorem in this book. Corresponding remarks appear following the first proof in each of the five chapters of Part I of this book.
PART I
ADVANCED CALCULUS IN ONE VARIABLE
CHAPTER 1
REAL NUMBERS AND LIMITS OF SEQUENCES
1.1 THE REAL NUMBER SYSTEM
During the 19th century, as applications of the differential and integral calculus in the physical sciences grew in importance and complexity, it became apparent that intuitive use of the concept of limit was inadequate. Intuitive arguments could lead to seemingly correct or incorrect conclusions in important examples. Much effort and creativity went into placing the calculus on a rigorous foundation so that such problems could be resolved. In order to see how this process unfolded, it is helpful to look far back into the history of mathematics. Approximately 2000 years ago, Greek mathematicians placed Euclidean geometry on the foundations of deductive logic. Axioms were chosen as assumptions, and the major theorems of geometry were proven, using fairly rigorous logic, in an orderly progression. These ancient mathematicians also had concepts of numbers. They used natural numbers, known also as counting numbers, the set of which is denoted by $\mathbb{N} = \{1, 2, 3, \ldots, n, n+1, \ldots\}$. This is the endless sequence of numbers beginning with 1 and proceeding without end by adding 1 at each step. Also used were positive rational numbers, which we
denote as $\mathbb{Q}^+ = \{\, p/q \mid p, q \in \mathbb{N} \,\}$.
These numbers were regarded as representing proportions of positive whole numbers. Members of the Pythagorean school of geometry discovered that there was no ratio of positive whole numbers that could serve as a square root for 2. (See Exercise 1.11.) This was disturbing to them because it meant that the side and the diagonal of a square must be incommensurable. That is, the side and the diagonal of a square cannot both be measured as a whole number multiple of some other line segment, or unit. So great was these geometers' consternation over the failure of the set of rational numbers to provide the proportion between the side and the diagonal of a square that confidence in the logical capacity of algebra was diminished. Mathematical reasoning was phrased, to the extent possible, in terms of geometry. For example, today we would express the area of a circle algebraically as $A = \pi r^2$. We could express this common formula alternatively as $A = \frac{\pi}{4} d^2$, where d is the diameter of the circle. But the ancient Greeks put it this way: The areas of two circles are in the same proportion as the areas of the squares on their diameters. The squares were constructed, each with a side coinciding with the diameter of the corresponding circle, and the areas of the squares were in the same proportion as the areas of the circles. Much later, in the 17th century, Isaac Newton continued to be influenced by this perspective. In his celebrated work on the calculus, Principia Mathematica, we can see repeatedly that where we would use an algebraic calculation, he used a geometrical argument, even if greater effort is required. The reader interested in the history of mathematics may enjoy the book The Exact Sciences in Antiquity by Otto Neugebauer [15] and the one by Carl Boyer [3], The History of the Calculus. It took until the 19th century for mathematicians to liberate themselves from their misgivings regarding algebra.
It came to be understood that the real numbers, the numbers that correspond to the points on an endless geometrical line, could be placed on a systematic logical foundation just as had been done for geometry nearly two thousand years earlier. Most of the axioms that were needed to prove the properties of the real number system were already quite familiar from the arithmetic of the rational numbers. There was one crucial new axiom needed: the Completeness Axiom of the Real Number System. Once this axiom had been added, the theorems of the calculus could be proven rigorously, and future development of the subject of Mathematical Analysis in the 20th century was facilitated. Although we will not attempt the laborious task of rigorously proving every familiar property of the real number system, we will sketch the axioms that summarize familiar properties, and we will explain carefully the completeness axiom. With the latter axiom in hand, we will develop the theory of the calculus with great care. Students interested in studying the full and formal development of the real number system are referred to J. M. H. Olmsted's book [16], or to a stylistically distinctive classic by E. Landau [12].
In addition to the set $\mathbb{N}$ of natural numbers, we will consider the set $\mathbb{Z}$ of integers, or whole numbers. Thus
$\mathbb{Z} = \{0, \pm 1, \pm 2, \ldots\} = \{\pm n \mid n \in \mathbb{N}\} \cup \{0\}.$
We need also the full set of rational numbers:
$\mathbb{Q} = \{\, p/q \mid p, q \in \mathbb{Z},\ q \neq 0 \,\}.$
We list in Table 1.1 the axioms for a general Archimedean Ordered Field $\mathbb{F}$. You will observe that the set $\mathbb{Q}$ is an Archimedean ordered field. However, the set $\mathbb{R}$ of real numbers, which we will define in Section 1.3, will obey all the axioms for an Archimedean ordered field together with one more axiom, called the Completeness Axiom, which is not satisfied by $\mathbb{Q}$.
Table 1.1 Archimedean Ordered Field
An Archimedean Ordered Field $\mathbb{F}$ is a set with two operations, called addition and multiplication. There is also an order relation, denoted by $a < b$. These satisfy the following properties:
1. Closure: If a and b are elements of $\mathbb{F}$, then $a + b \in \mathbb{F}$ and $ab \in \mathbb{F}$.
2. Commutativity: If a and b are elements of $\mathbb{F}$, then $a + b = b + a$ and $ab = ba$.
3. Associativity: If a, b, and c are elements of $\mathbb{F}$, then $a + (b + c) = (a + b) + c$ and $a(bc) = (ab)c$.
4. Distributivity: If a, b, and c are elements of $\mathbb{F}$, then $a(b + c) = ab + ac$.
5. Identity: There exist elements 0 and 1 in $\mathbb{F}$ such that $0 + a = a$ and $1a = a$, for all $a \in \mathbb{F}$. Moreover, $0 \neq 1$.
6. Inverses: If $a \in \mathbb{F}$, then there exists $-a \in \mathbb{F}$ such that $-a + a = 0$. Also, for all $a \neq 0$, there exists $a^{-1} = \frac{1}{a} \in \mathbb{F}$ such that $a \cdot \frac{1}{a} = 1$.
7. Transitivity: If $a < b$ and $b < c$, then $a < c$.
8. Preservation of Order: If $a < b$ and if $c \in \mathbb{F}$, then $a + c < b + c$. Moreover, if $c > 0$, then $ac < bc$.
9. Trichotomy: For all a and b in $\mathbb{F}$, exactly one of the following three statements will be true: $a < b$, or $a = b$, or $a > b$ (which means $b < a$).
10. Archimedean Property: If $\epsilon > 0$ and if $M > 0$, then there exists $n \in \mathbb{N}$ such that $n\epsilon > M$. (In this general context, $\mathbb{N}$ is defined as the smallest subset of $\mathbb{F}$ that contains 1 and is closed under addition.)
There is an old adage that loosely paraphrases the Archimedean Property found in the table: If you save a penny a day, eventually you will become a millionaire (or a billionaire, etc.).
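The penny-a-day adage can be checked with a few lines of exact rational arithmetic. The following is a minimal sketch, not part of the text: the function name `archimedean_n` and the dollar amounts are illustrative assumptions, and the code works in the rational numbers only.

```python
from fractions import Fraction
import math

def archimedean_n(eps, M):
    """Return the smallest natural number n with n * eps > M (eps, M > 0)."""
    eps, M = Fraction(eps), Fraction(M)
    if eps <= 0 or M <= 0:
        raise ValueError("eps and M must both be positive")
    # n * eps > M means n > M / eps; floor(M / eps) + 1 is the least such integer.
    return math.floor(M / eps) + 1

# A penny a day (eps = 1/100 dollar): days until savings exceed one million dollars.
days = archimedean_n(Fraction(1, 100), 1_000_000)
print(days)
```

The Archimedean Property only asserts that some such n exists; the sketch happens to return the least one, which is more than the axiom promises.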
From the axioms for an Archimedean ordered field, many familiar properties of the real numbers can be deduced. In particular, the behavior of all the operations used in solving equations and inequalities follows directly, with the exception that we have not established yet that roots of positive numbers, such as square roots, exist. Here we will concentrate on those properties that received less emphasis in elementary mathematics courses. The order axioms are particularly useful for analysis. In this connection, it is important to make the following definition. Definition 1.1.1 We define
$|a| = \begin{cases} a & \text{if } a \ge 0, \\ -a & \text{if } a < 0. \end{cases}$
We think of $|a|$ as representing the distance of a from 0 on the number line. Note that $|a|$ is always nonnegative. The absolute value satisfies a vital inequality known as the Triangle Inequality.
Theorem 1.1.1 For all a and b in $\mathbb{R}$,
$|a + b| \le |a| + |b|.$
Proof: Observe that
$-|a| \le a \le |a|$
and
$-|b| \le b \le |b|,$
so that
$-(|a| + |b|) \le a + b \le |a| + |b|. \qquad (1.1)$
Thus, if $a + b \ge 0$,
$|a + b| = a + b \le |a| + |b|.$
But if $a + b < 0$, then from the first inequality in Equation (1.1), we obtain
$|a + b| = -(a + b) \le |a| + |b|.$
We see that whether $a + b$ is negative or nonnegative, we have in either case that
$|a + b| \le |a| + |b|.$
•
Remark 1.1.1 If the student has not yet read the Introduction, including the discussion of Learning to Write Proofs on page xxiii, this should be done now. It was explained that in order to learn to write proofs, the student must learn first how to study the theorems and proofs that are presented in this book. Let us note how the remarks made there apply to the short proof of the first theorem in this book. First we read carefully the statement of Theorem 1.1.1. We note that this is a theorem about absolute values, so we reread Definition 1.1.1 to insure that we know the meaning of this concept. Since the absolute value of a number a depends upon the sign of a, we should test the claimed inequality in the theorem with several
pairs of numbers: two positive numbers, two negative numbers, and two numbers of opposite sign. The reader should do this, with examples of his or her choice of numbers, noting that the triangle inequality in real application gives either equality, if the two numbers have the same sign, or else strict inequality, if the two numbers have opposite sign. This gives us an intuitive appreciation that the triangle inequality ought to be true. Now how do we prove it? Testing more examples will not suffice, because infinitely many pairs are possible. Many correct proofs can be given, but we will discuss the one chosen by the author. The next step in writing a proof requires some playfulness or inquisitiveness on the part of the student. In theoretical mathematics we are discouraged from following rote procedures in the hope of finding an answer without thought. To bypass thought would be to bypass mathematics itself. The student should not even consider such a route, just as he or she should not substitute a pill for a good meal. We see by playing with the definition of absolute value that $|a|$ must be equal to either $a$ or $-a$. This reminds us of what we observed when checking pairs of specific numbers of the same or opposite sign, as explained above. The playfulness appears when we choose to write this as $-|a| \le a \le |a|$ for all a, even though the truth of this double inequality hinges upon a being equal to either the left side or the right side. Then we do the same for b, recognizing that a and b do play symmetrical roles in the statement of the theorem. Then we add the two double inequalities, obtaining Equation (1.1). The remainder of the proof unfolds from considering that the value of $|a + b|$ hinges upon the sign of $a + b$. This analysis of the proof of the triangle inequality is representative of what the student should do with each proof in this book, and with each proof presented in class by his or her professor.
Take a fresh sheet of paper and write out a full analysis of the proof, including the perceived rationale for the course that it takes. Work on this until you are sure you understand correctly. If in doubt, ask your teacher! This is the way to learn advanced mathematics, and it is what the student must do to learn to prove theorems.
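The testing step recommended in Remark 1.1.1 can also be carried out mechanically. The sketch below is only an illustration: the helper name `triangle_gap` and the particular pairs are assumptions of this example, and, as the remark stresses, checking finitely many pairs builds intuition but proves nothing.

```python
def triangle_gap(a, b):
    """Return |a| + |b| - |a + b|, which Theorem 1.1.1 says is never negative."""
    return abs(a) + abs(b) - abs(a + b)

# Two positive numbers, two negative numbers, and pairs of opposite sign.
pairs = [(3, 5), (-3, -5), (3, -5), (-3, 5)]
for a, b in pairs:
    gap = triangle_gap(a, b)
    kind = "equality" if gap == 0 else "strict inequality"
    print(f"a = {a:>2}, b = {b:>2}: |a + b| = {abs(a + b)}, |a| + |b| = {abs(a) + abs(b)} ({kind})")
```

Pairs of the same sign give a gap of zero, while pairs of opposite sign give a positive gap, matching the observation made in the remark.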
EXERCISES
1.1 Let $\epsilon > 0$. Determine how large $n \in \mathbb{N}$ must be to ensure that the given inequality is satisfied, and use the Archimedean Property to establish that such n exist.
a) $\frac{1}{n} < \epsilon$?
b) $\frac{1}{n^2} < \epsilon$?
c) $\frac{1}{\sqrt{n}} < \epsilon$? (Assume that $\sqrt{n}$ exists in $\mathbb{R}$.)
1.2 Prove the uniqueness of the additive inverse $-a$ of a. (Hint: Suppose that $x + a = 0 = y + a$ and prove that $x = y$.)
1.3 Use the Axiom of Distributivity to prove that $a0 = 0$ for all $a \in \mathbb{R}$, and use this to prove that $(-1)(-1) = 1$.
1.4 Prove that $(-1)a = -a$ for all $a \in \mathbb{R}$.
1.5 Prove the uniqueness of the multiplicative inverse $a^{-1}$ of a for all $a \neq 0$ in $\mathbb{R}$.
1.6 Prove: For all a and b in $\mathbb{R}$, $|ab| = |a||b|$. (Hint: Consider the three cases a and b both nonnegative, a and b both negative, and a and b of opposite sign.)
1.7 Prove: For all a, b, c in $\mathbb{R}$, $|a - c| \le |a - b| + |b - c|$. (Hint: Use the triangle inequality.)
1.8 Let $\epsilon > 0$. Find a number $\delta > 0$ small enough so that $|a - b| < \delta$ and $|c - b| < \delta$ implies $|a - c| < \epsilon$.
1.9 † Prove: For all a and b in $\mathbb{R}$, $\big||a| - |b|\big| \le |a - b|$. Intuitively, this says that $|a|$ and $|b|$ cannot be farther apart than a and b are. (Hint: Write $|a| = |(a - b) + b|$ and use the triangle inequality. Then do the same thing for $|b|$.)
1.10 Prove or give a counterexample:
a) If $a < b$ and $c < d$, then $a - c < b - d$.
b) If $a < b$ and $c < d$, then $a + c < b + d$.
1.11 † This exercise leads in three parts to a proof that there is no rational number the square of which is 2. The reader will need to know from another source that each rational number can be written in the form $\frac{m}{n}$ in lowest terms. This means that m and n have no common factors other than $\pm 1$.
a) If $m \in \mathbb{Z}$ is odd, prove that $m^2$ is odd.
b) If $m \in \mathbb{Z}$ is such that $m^2$ is even, prove that m is even.
c) Suppose there exists $\frac{m}{n} \in \mathbb{Q}$, expressed in lowest terms, such that $\left(\frac{m}{n}\right)^2 = 2$. Prove that m and n are both even, resulting in a contradiction.
(Hint: For this problem, if the student has not taken any class in number theory, the following definitions may be helpful. A number n is called even if and only if it can be written as $n = 2k$ for some integer k. A number n is called odd if and only if it can be written as $n = 2k - 1$ for some integer k.)
1.2 LIMITS OF SEQUENCES & CAUCHY SEQUENCES
By a sequence $x_n$ of elements of a set $S$ we mean that to each natural number $n \in \mathbb{N}$ there is assigned an element $x_n \in S$. Unless otherwise stated, we will deal with
sequences of real numbers. We can think of a sequence as an endless list of real numbers, or we could equivalently think of a sequence as being a function whose domain is $\mathbb{N}$ and whose range lies in $\mathbb{R}$. It is very important to define the concept of the limit of a sequence. Intuitively, we say that $x_n$ approaches the real number L as n approaches infinity, written $x_n \to L \in \mathbb{R}$ as $n \to \infty$, provided we can force $|x_n - L|$ to become as small as we like just by making n sufficiently big. This is also written with the symbols $\lim_{n \to \infty} x_n = L$. The advantage of writing the definition symbolically as follows is that this definition provides inequalities that can be solved to determine whether or not $x_n \to L$.
Definition 1.2.1 A sequence $x_n \to L \in \mathbb{R}$ as $n \to \infty$ if and only if for all $\epsilon > 0$, there exists $N \in \mathbb{N}$ corresponding to $\epsilon$ such that
$n \ge N \implies |x_n - L| < \epsilon.$
If there exists a number L such that $x_n \to L$, we say $x_n$ is convergent. Otherwise we say that $x_n$ is divergent.
See Exercise 1.12.
• EXAMPLE 1.1
We claim that if $x_n = \frac{1}{n}$, then $x_n \to 0$.
Proof: Let $\epsilon > 0$. We need $N \in \mathbb{N}$ such that $n \ge N$ implies
$\left|\frac{1}{n} - 0\right| < \epsilon.$
That is, we need to solve the inequality $\frac{1}{n} < \epsilon$. Multiplying both sides of this inequality by the positive number $\frac{n}{\epsilon}$, we see that $\frac{1}{\epsilon} < n$. That is, if we pick $N \in \mathbb{N}$ such that $N > \frac{1}{\epsilon}$, then
$n \ge N \implies \frac{1}{n} \le \frac{1}{N} < \epsilon.$
We know that such an N exists in $\mathbb{N}$ since $\frac{1}{\epsilon}$ and 1 are both positive. Thus there exists $N \in \mathbb{N}$ such that $N \cdot 1 = N > \frac{1}{\epsilon}$ by the Archimedean Principle. •
The student should note that the value of N does indeed correspond to $\epsilon$. If $\epsilon > 0$ is made smaller, then N must be chosen larger.
• EXAMPLE 1.2
Let $|r| < 1$. We claim that $r^n \to 0$ as $n \to \infty$. Let $\epsilon > 0$. We need to find $N \in \mathbb{N}$ such that $n \ge N$ implies
$|r^n - 0| < \epsilon.$
In the special case in which $r = 0$, it would suffice to take $N = 1$. So suppose $r \neq 0$. Then we need to solve
$|r|^n < \epsilon$, or equivalently $\left(\frac{1}{|r|}\right)^n > \frac{1}{\epsilon}.$
Note that we do not proceed by taking nth roots of both sides of this inequality, since we have not yet established the existence of such roots for all positive real numbers. Since $|r| < 1$, $\frac{1}{|r|} = 1 + p > 1$ for some $p > 0$. Thus
$\left(\frac{1}{|r|}\right)^n = (1 + p)^n = (1 + p)(1 + p) \cdots (1 + p) = 1^n + np + \cdots + p^n > np.$
By transitivity of inequalities, it would suffice to find $N \in \mathbb{N}$ such that $Np > \frac{1}{\epsilon}$. Such integers N exist because of the Archimedean property. So pick $N \in \mathbb{N}$ such that $Np > \frac{1}{\epsilon}$ and we find that $n \ge N$ implies $np \ge Np > \frac{1}{\epsilon}$ so that
$|r^n - 0| = |r|^n < \epsilon.$
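The choices of N made in Examples 1.1 and 1.2 can be computed explicitly. The following is a rough sketch under stated assumptions: the function names are hypothetical, the inputs are restricted to rational numbers so that the arithmetic is exact, and the final checks cover only a finite range of n, so they illustrate the argument rather than prove it.

```python
from fractions import Fraction
import math

def least_n_exceeding(bound):
    """Least natural number N with N > bound, for a positive rational bound."""
    return math.floor(bound) + 1

def n_for_reciprocal(eps):
    """Example 1.1: an N with N > 1/eps, so that n >= N implies 1/n < eps."""
    eps = Fraction(eps)
    if eps <= 0:
        raise ValueError("eps must be positive")
    return least_n_exceeding(1 / eps)

def n_for_power(r, eps):
    """Example 1.2: an N with N*p > 1/eps, where 1/|r| = 1 + p."""
    r, eps = Fraction(r), Fraction(eps)
    if abs(r) >= 1 or eps <= 0:
        raise ValueError("need |r| < 1 and eps > 0")
    if r == 0:
        return 1                  # the special case treated separately in the text
    p = 1 / abs(r) - 1            # p > 0 because |r| < 1
    return least_n_exceeding(1 / (eps * p))

eps = Fraction(1, 100)
N1 = n_for_reciprocal(eps)
N2 = n_for_power(Fraction(9, 10), eps)
# Spot-check both implications on a finite range of n (a check, not a proof).
assert all(Fraction(1, n) < eps for n in range(N1, N1 + 500))
assert all(Fraction(9, 10) ** n < eps for n in range(N2, N2 + 50))
print(N1, N2)
```

The N returned for $r^n$ is far from the smallest one that works, because the estimate $(1 + p)^n > np$ discards most of the binomial expansion. The definition of a limit asks only that some N exist, so this costs nothing.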
Notice that if $x_n$ is convergent, then after some finite number N of terms, all subsequent terms are bunched very close to one another: in fact, within $\epsilon$ of some number L. This motivates the following definition and theorem.
Definition 1.2.2 A sequence $x_n$ is called a Cauchy sequence if and only if, for all $\epsilon > 0$, there exists $N \in \mathbb{N}$, corresponding to $\epsilon$, such that n and $m \ge N$ implies
$|x_n - x_m| < \epsilon.$
Theorem 1.2.1 If $x_n$ is convergent, then $x_n$ is a Cauchy sequence.
Proof: Suppose $x_n$ is convergent: say $x_n \to L$. Let $\epsilon > 0$. Then, since $\frac{\epsilon}{2} > 0$ as well, we see there exists $N \in \mathbb{N}$, corresponding to $\epsilon$, such that $n \ge N$ implies $|x_n - L| < \frac{\epsilon}{2}$. Then, if n and $m \ge N$, we have
$|x_n - x_m| = |(x_n - L) + (L - x_m)| \le |x_n - L| + |L - x_m| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$
•
Remark 1.2.1 We make some remarks here to help the student to write his or her own detailed analysis of the proof of Theorem 1.2.1, as recommended in the introduction, on page xxiii. The student should begin with the intuitive understanding that if $x_n \to L$, then $x_n$ will be very close to L for all sufficiently big n. The point is that
we want both $x_n$ and $x_m$ to be so close to L that $x_n$ and $x_m$ must be within $\epsilon$ of one another. The student should use visualization to recognize that since $x_n$ and $x_m$ can be on opposite sides of L, we will need both $x_n$ and $x_m$ to be within $\frac{\epsilon}{2}$ of L. Then the triangle inequality for real numbers assures that $x_n$ and $x_m$ are no more than $\epsilon$ apart. The student should write a careful analysis of every proof in this course, whether proved in the text or by the professor in class.
• EXAMPLE 1.3
We claim the sequence $x_n = (-1)^{n+1}$ is divergent. In fact, if $x_n$ were convergent, then $x_n$ would have to be Cauchy. But $|x_n - x_{n+1}| = 2$, for all n. Thus, if $0 < \epsilon \le 2$, it is impossible to find $N \in \mathbb{N}$ such that n and $m \ge N$ implies $|x_n - x_m| < \epsilon$.
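The key observation in Example 1.3, that consecutive terms of $(-1)^{n+1}$ always differ by 2, is easy to tabulate. This is a small illustrative sketch; the function name `x` and the range of indices are arbitrary choices made for this example.

```python
def x(n):
    """The n-th term of the sequence x_n = (-1)**(n+1), for n = 1, 2, 3, ..."""
    return (-1) ** (n + 1)

# Distances between consecutive terms for n = 1, ..., 20.
gaps = [abs(x(n) - x(n + 1)) for n in range(1, 21)]
print(gaps)
```

Since every gap equals 2, no tail of the sequence can have all its terms within $\epsilon$ of one another once $\epsilon \le 2$, which is why the Cauchy condition fails.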
Definition 1.2.3 A sequence $x_n$ is called bounded if and only if there exists $M \in \mathbb{R}$ such that $|x_n| \le M$, for all $n \in \mathbb{N}$.
Theorem 1.2.2 If $x_n$ is Cauchy, then $x_n$ must be bounded.
Remark 1.2.2 Observe that if $x_n$ is convergent, then it is Cauchy, so this theorem implies that every convergent sequence is bounded.
Proof: We will show that every Cauchy sequence is bounded. In fact, taking $\epsilon = 1$, we see that there exists $N \in \mathbb{N}$ such that n and $m \ge N$ implies $|x_n - x_m| < 1$. In particular, $n \ge N$ implies
$|x_n| - |x_N| \le \big||x_n| - |x_N|\big| \le |x_n - x_N| < 1$
so that
$|x_n| < 1 + |x_N|.$
If we let
$M = \max\{|x_1|, \ldots, |x_{N-1}|, 1 + |x_N|\},$
making M the largest element of the indicated set of N numbers, then $|x_n| \le M$ for all $n \in \mathbb{N}$.
•
• EXAMPLE 1.4
If $x_n = n$, then $x_n$ is not convergent. If $x_n$ were convergent, then $x_n$ would be bounded. But for all $M > 0$, there exists $n \in \mathbb{N}$, corresponding to M, such that $n > M$ by the Archimedean Property. So $x_n$ is not bounded. It is also convenient to define the concepts $x_n \to \infty$ and $x_n \to -\infty$. However, $\infty$ is not a real number, so we have not defined anything like $|x_n - \infty|$ and thus cannot prove such a difference is less than $\epsilon$. (Compare this with the discussion on page 9.) We adopt the following definition.
Definition 1.2.4 We write $x_n \to \infty$ if and only if for all $M > 0$ there exists $N \in \mathbb{N}$ such that $n \ge N$ implies $x_n > M$. Similarly, we write $x_n \to -\infty$ if and only if for all $m < 0$ there exists $N \in \mathbb{N}$ such that $n \ge N$ implies $x_n < m$.
EXERCISES
1.12 † Use Definition 1.2.1 to prove that the limit of a convergent sequence $x_n$ is unique. That is, prove that if $x_n \to L$ and $x_n \to M$ then $L = M$.
1.13 Let $x_n = \begin{cases} [\ldots] & \text{if } n < 100, \\ [\ldots] & \text{if } n \ge 100. \end{cases}$ Prove that $x_n$ converges and find $\lim x_n$.
1.14 Let $x_n = [\ldots]$. Prove $x_n$ converges and find the limit.
1.15 Let $x_n = [\ldots]$. Prove $x_n$ converges and find the limit.
1.16 Let $x_n = [\ldots]$. Prove $x_n$ converges and find the limit.
1.17 Let $x_n = [\ldots]$. Does $x_n$ converge or diverge? Prove your claim.
1.18 Let $x_n = [\ldots]$. Does $x_n$ converge or diverge? Prove your claim.
1.19 † Prove: If $s_n \le t_n \le u_n$ for all n and if both $s_n \to L$ and $u_n \to L$ then $t_n \to L$ as $n \to \infty$ as well. (This is sometimes called the squeeze theorem or the sandwich theorem for sequences.)
1.20 Prove or give a counterexample:
a) $x_n + y_n$ converges if and only if both $x_n$ and $y_n$ converge.
b) $x_n y_n$ converges if and only if both $x_n$ and $y_n$ converge.
c) If $x_n y_n$ converges, then $\lim x_n y_n = \lim x_n \lim y_n$.
1.21 Let $x_n = \frac{\sin n}{n}$. Prove $x_n$ converges, and find the limit.
1.22 † Suppose $a \le x_n \le b$ for all n and suppose further that $x_n \to L$. Prove: $L \in [a, b]$. (Hint: If $L < a$ or if $L > b$, obtain a contradiction.)
1.23 Suppose $s_n \le t_n \le u_n$ for all n, $s_n \to a < b$, and $u_n \to b$. Prove or give a counterexample: $\lim_{n \to \infty} t_n \in [a, b]$.
1.24 For each of the following sequences:
i. Determine whether or not the sequence is Cauchy and explain why.
ii. Find $\lim_{n \to \infty} |x_{n+1} - x_n|$.
a) $x_n = (-1)^n n$
b) $x_n = [\ldots]$
c) $x_n = [\ldots]$
d) $x_n$ is described as follows:
$0, 1, \tfrac{1}{2}, 0, \tfrac{1}{3}, \tfrac{2}{3}, 1, \tfrac{3}{4}, \tfrac{1}{2}, \tfrac{1}{4}, 0, \tfrac{1}{5}, \tfrac{2}{5}, \tfrac{3}{5}, \tfrac{4}{5}, 1, \ldots.$
1.25 † Prove: The sequence $x_n$ is Cauchy if and only if for all $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that for all $k \ge N$, we have $|x_k - x_N| < \epsilon$.
1.26 Prove that if $x_n \to \infty$ then $x_n$ is not Cauchy.
1.27 Let $x_n \neq 0$, for all $n \in \mathbb{N}$. Prove: $|x_n| \to \infty$ if and only if $\frac{1}{x_n} \to 0$.
1.3 THE COMPLETENESS AXIOM AND SOME CONSEQUENCES
Consider the following sequence of decimal approximations to $\sqrt{2}$:
$x_1 = 1,\ x_2 = 1.4,\ x_3 = 1.41,\ x_4 = 1.414,\ \ldots.$
Each $x_k$ is a rational number, having only finitely many nonzero decimal places. For each k, the last nonzero decimal digit of $x_k$ is selected in such a way that $x_k^2 < 2$ yet if that last digit were one bigger the square would be larger than 2. The number $x_k^2$ cannot equal 2, since there is no $\sqrt{2}$ in the rational number system. Naturally we hope for $x_k$ to converge and for $\lim x_k = \sqrt{2}$. Indeed, $x_k$ is a Cauchy sequence. We can see this by observing that if m and n are greater than or equal to N, then
$|x_m - x_n| < \frac{1}{10^{N-1}}.$
Since the sequence of successive powers of $\frac{1}{10}$ converges to 0, if $\epsilon > 0$ we can pick N large enough to ensure that $\frac{1}{10^{N-1}} < \epsilon$. Since there is no $\sqrt{2}$ in $\mathbb{Q}$, there are Cauchy sequences in $\mathbb{Q}$ that have no limit in the set $\mathbb{Q}$ of rational numbers. It is reasonable, knowing from geometrical considerations that there should be a $\sqrt{2} \in \mathbb{R}$, to select the following axiom as the final axiom for the real number system.
Completeness Axiom of $\mathbb{R}$. Every Cauchy sequence of real numbers has a limit in the set $\mathbb{R}$ of real numbers.
In Example 1.10 we will see that in fact the completeness axiom does imply that there exists a $\sqrt{2}$ in $\mathbb{R}$.
Remark 1.3.1 In books that use a different but equivalent version of the Completeness Axiom, the statement that every Cauchy sequence of real numbers converges to a real number is called the Cauchy Criterion for sequences.
Definition 1.3.1 The set $\mathbb{R}$ of real numbers is an Archimedean ordered field satisfying the Completeness Axiom.
Thus a sequence of real numbers converges if and only if it is Cauchy. We remark that it can be proven, although we will not do so here, that any two complete
Archimedean ordered fields must be isomorphic in the sense of algebra. The interested reader can find a proof in the book [16] by Olmsted. On the other hand, the reader can find an explicit construction of a set having all the properties of a complete Archimedean ordered field, beginning from the natural numbers, in the book [12] by Landau. In the next chapter, after studying the Intermediate Value Theorem, we will see easily that $\mathbb{R}$, with the Completeness Axiom, does possess an $\sqrt[n]{p}$ for each $p > 0$ and for all $n \in \mathbb{N}$. Most of the current chapter, however, will deal with other consequences of completeness, which we will begin exploring right now.
Definition 1.3.2 A number M is called an upper bound for a set $A \subset \mathbb{R}$ if and only if for all $a \in A$ we have $a \le M$. Similarly, a number m is called a lower bound for A if and only if for all $a \in A$ we have $a \ge m$. A set A of real numbers is called bounded provided that it has both an upper bound and a lower bound. A least upper bound for a set A is an upper bound L for A with the property that no number $L' < L$ is an upper bound of A. A least upper bound is denoted by lub(A).
Note that not every subset of $\mathbb{R}$ has an upper or a lower bound. For example, $\mathbb{N}$ has no upper bound, and $\mathbb{Z}$ has neither an upper nor a lower bound. It is important to bear in mind also that many bounded sets of real numbers have neither a largest nor a smallest element. For example, this is true for the set of numbers in the open interval (0, 1). The reader should prove this claim as an informal exercise.
Theorem 1.3.1 If a nonempty set S has an upper bound, then S has a least upper bound.
Remark 1.3.2 If S has an upper bound, then its least upper bound is denoted by lub(S). If lub(S) exists, then it must have a unique value L. The reader should prove that no number greater or smaller than L could satisfy the definition of lub(S).
Proof: Since $S \neq \emptyset$, there exists $s \in S$. Select any number $a_1 < s$ so that $a_1$ is too small to be an upper bound for S. Let $b_1$ be any upper bound of S. We will use a process known as interval halving, in which we will cut the interval $[a_1, b_1]$ in half again and again without end. The midpoint between $a_1$ and $b_1$ is $\frac{a_1 + b_1}{2}$.
i. If $\frac{a_1 + b_1}{2}$ is an upper bound for S, then let $b_2 = \frac{a_1 + b_1}{2}$ and let $a_2 = a_1$.
ii. But if $\frac{a_1 + b_1}{2}$ is not an upper bound for S, then let $a_2 = \frac{a_1 + b_1}{2}$ and let $b_2 = b_1$.
Thus we have chosen $[a_2, b_2]$ to be one of the two half-intervals of $[a_1, b_1]$, and we have done this in such a way that $b_2$ is again an upper bound of S and $a_2$ is too small to be an upper bound for S. Now we cut $[a_2, b_2]$ in half and select a half-interval of it to be $[a_3, b_3]$ in the same way we did for $[a_2, b_2]$. Note that
$|b_N - a_N| = \frac{|b_1 - a_1|}{2^{N-1}} \to 0$
as $N \to \infty$. Thus if $\epsilon > 0$, there exists $N \in \mathbb{N}$, corresponding to $\epsilon$, such that $|b_N - a_N| < \epsilon$. But, if $n \ge N$, then $a_n$ and $b_n \in [a_N, b_N]$, so n and $m \ge N$ implies $|a_n - a_m| < \epsilon$ and also $|b_n - b_m| < \epsilon$. Thus $a_n$ and $b_n$ are Cauchy sequences. Hence $a_n \to a$ and $b_n \to b$, for some real numbers a, b. By Exercise 1.22, a and b are in $[a_N, b_N]$, for all N. Thus $0 \le |a - b| < \epsilon$, for all $\epsilon > 0$. Thus $|a - b| = 0$ and $a = b$. We claim that the number $L = a = b$ is the least upper bound of S. Note that for each k we have $a_k \le L \le b_k$, since for all $j \ge k$ we have $a_j$ and $b_j \in [a_k, b_k]$. First, observe that if $s \in S$, then $s \le L$. In fact, if we did have $s > L$, then, since $b_k \to L$, for some big enough value of k we would have $|b_k - L| < |s - L|$ and so $b_k < s$. But this is impossible, since $b_k$ is an upper bound of S. Thus $s \le L$ and L is an upper bound of S. Finally, we claim L is the least upper bound of S. In fact, suppose $L' < L$. Then since $a_k \to L$, there exists k such that $L' < a_k$. But $a_k$ is not an upper bound of S. Thus $L'$ cannot be an upper bound of S. •
Remark 1.3.3 The proof of Theorem 1.3.1 is the most difficult proof presented thus far in this book. It proceeds by the method of interval-halving. This method can be likened to the way that a first baseman and a second baseman in a baseball game will attempt to tag a base-runner out by throwing the ball back and forth between them, steadily reducing the distance between them until one baseman is close enough to tag the runner. Interval halving is a very useful method of calculating roots of equations with a computer, provided it is possible to tell from the endpoints of each half-interval which half would need to contain the root. The student should take careful note of how the method of interval-halving produces two natural Cauchy sequences, $a_n$ and $b_n$, corresponding to the left and right endpoints of the selected half-intervals.
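The interval-halving construction used in the proof of Theorem 1.3.1 translates almost line for line into a short program. The sketch below is an illustration under stated assumptions, not the author's code: the function name `halve_toward_lub`, the predicate interface, and the example set $S = \{x \ge 0 : x^2 < 2\}$ are all choices made for this example, and the arithmetic is carried out with exact rational numbers.

```python
from fractions import Fraction

def halve_toward_lub(is_upper_bound, a, b, steps):
    """Interval halving as in the proof of Theorem 1.3.1.

    On entry a must not be an upper bound of S and b must be one.  Both
    facts remain true after every step, and b - a is halved each time.
    """
    a, b = Fraction(a), Fraction(b)
    for _ in range(steps):
        mid = (a + b) / 2
        if is_upper_bound(mid):
            b = mid      # case i: the midpoint is an upper bound
        else:
            a = mid      # case ii: the midpoint is too small
    return a, b

# Assumed example set: S = {x >= 0 : x*x < 2}.
# A number m is an upper bound of S exactly when m >= 0 and m*m >= 2.
def is_upper_bound_of_S(m):
    return m >= 0 and m * m >= 2

a, b = halve_toward_lub(is_upper_bound_of_S, 1, 2, 30)
print(float(a), float(b), float(b - a))
```

After 30 halvings the endpoints differ by exactly $2^{-30}$, and they squeeze the least upper bound of this particular S, which is $\sqrt{2}$. A program can only run finitely many steps; it is the Completeness Axiom that guarantees the two Cauchy sequences of endpoints actually have a common limit in $\mathbb{R}$.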
Corollary 1.3.1 If S is any nonempty set of real numbers that has a lower bound, then S has a greatest lower bound.
For the proof see Exercise 1.28 in this section. Remark 1.3.4 If S has a lower bound, then its greatest lower bound is denoted by glb(S).
Since not every subset $S \subset \mathbb{R}$ has either an upper or a lower bound, least upper bounds and greatest lower bounds do not exist in every case. Thus we introduce the concepts of the supremum and the infimum of an arbitrary set $S \subset \mathbb{R}$.
Definition 1.3.3 Let $S$ be any nonempty subset of $\mathbb{R}$. Define the supremum of $S$, denoted $\sup(S)$, to be the least upper bound of $S$ if $S$ is bounded above, and define $\sup(S) = \infty$ if $S$ has no upper bound. Similarly, define the infimum of $S$, denoted $\inf(S)$, to be the greatest lower bound of $S$ if $S$ is bounded below, and define $\inf(S) = -\infty$ if $S$ has no lower bound.
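Thus
$$\sup(S) = \begin{cases} \operatorname{lub}(S) & \text{if } S \text{ is bounded above,} \\ \infty & \text{if } S \text{ is not bounded above} \end{cases}$$
and
$$\inf(S) = \begin{cases} \operatorname{glb}(S) & \text{if } S \text{ is bounded below,} \\ -\infty & \text{if } S \text{ is not bounded below.} \end{cases}$$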
• EXAMPLE 1.5
Let $S = \{x \mid 0 < x < 1\} = (0, 1)$. Then $\sup(S) = 1$ and $\inf(S) = 0$.
Proof: Clearly, 1 is an upper bound of $S$. But if $M < 1$, then there exists $x \in S \cap (M, 1)$. Thus $M$ cannot be an upper bound of $S$. Hence 1 is the least upper bound of $S$. The argument for $\inf(S)$ is similar. •
• EXAMPLE 1.6
Observe that $\sup(\mathbb{N}) = \infty$ and $\inf(\mathbb{N}) = 1$. This follows because $\mathbb{N}$ has no upper bound, but $\mathbb{N}$ does have a least element, namely 1.
Definition 1.3.4 We call a sequence $x_n$ increasing provided $x_n \le x_{n+1}$ for all $n \in \mathbb{N}$, and then we write this symbolically as $x_n \nearrow$. Similarly, we call $x_n$ a decreasing sequence if $x_n \ge x_{n+1}$ for all $n \in \mathbb{N}$, which we denote by $x_n \searrow$. In either case, we call $x_n$ a monotone sequence. Similarly, if $x_n < x_{n+1}$ for all $n \in \mathbb{N}$, we write $x_n \uparrow$ and call $x_n$ strictly monotone increasing. And if $x_n > x_{n+1}$ for all $n \in \mathbb{N}$, we write $x_n \downarrow$ and call $x_n$ strictly monotone decreasing.
Theorem 1.3.2 If $x_k$ is an increasing sequence, then $x_k \to \sup\{x_n\}$. Similarly, if $x_k$ is a decreasing sequence, then $x_k \to \inf\{x_n\}$.
Remark 1.3.5 If $\sup\{x_n\} = L$, a real number, then this theorem says the increasing sequence $x_n \to L$, and this is an instance of convergence. But if $\sup\{x_n\} = \infty$, we write $x_n \to \infty$, and this is called divergence to infinity. We do not consider the latter circumstance as convergence because we cannot make $|x_n - \infty| < \epsilon$. In fact, $x_n - \infty$ is meaningless, since $\infty$ is not a real number and the arithmetic operations of real numbers are not defined for $\infty$. Similar remarks apply if $x_n$ is a decreasing sequence.
Proof: Consider the case of $x_n$ increasing. If $\{x_n\}$ is not bounded above, so that the supremum is infinite, we see that for all $M \in \mathbb{R}$ there exists $N$ such that $x_N > M$. Then $n \ge N$ implies $x_n \ge x_N > M$ too, and we call this divergence of $x_n$ to infinity, denoted by $x_n \to \infty$. Now suppose $x_n$ is bounded above, so $\sup\{x_n\} = L$ is the least upper bound of the set of numbers $\{x_n\}$. We must show that $x_n \to L$. Let $\epsilon > 0$. Since $L - \epsilon < L$, $L - \epsilon$ cannot be an upper bound of $\{x_n\}$, so there exists $N$ such that $L \ge x_N > L - \epsilon$. Thus for all $n \ge N$ we have
$$L - \epsilon < x_N \le x_n \le L,$$
so $n \ge N$ implies $|x_n - L| < \epsilon$; that is, $x_n \to L$. The case in which $x_n$ decreases is Exercise 1.29. •
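The step "there exists $N$ such that $L \ge x_N > L - \epsilon$" can be made concrete on one example. The sketch below assumes the illustrative sequence $x_n = 1 - 1/2^n$ (not taken from the text), which is increasing with supremum 1; the helper name `first_index_within` is hypothetical.

```python
from fractions import Fraction


def x(n):
    """Increasing sequence x_n = 1 - 1/2^n, bounded above with sup = 1."""
    return 1 - Fraction(1, 2 ** n)


def first_index_within(eps, sup=Fraction(1), limit=10_000):
    """Smallest N with sup - eps < x_N, found by direct search."""
    for N in range(1, limit + 1):
        if x(N) > sup - eps:
            return N
    raise RuntimeError("no index found within the search limit")


eps = Fraction(1, 1000)
N = first_index_within(eps)
# Because x_n is increasing, every later term stays between sup - eps and sup.
tail_ok = all(1 - eps < x(n) <= 1 for n in range(N, N + 50))
```

Exact rational arithmetic is used so that the inequalities are checked without rounding error; only finitely many tail terms are inspected, so this illustrates the proof rather than replacing it.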
Corollary 1.3.2 A monotone sequence converges if and only if it is bounded.
Proof: Exercise 1.30.
One inconvenience in the concept of limit is that $\lim x_n$ does not exist for every sequence $x_n$. One may not be sure in advance whether a given sequence is convergent or divergent. However, there are two related concepts called the Limit Superior⁴ and the Limit Inferior which are always defined.
Definition 1.3.5 Let $x_n$ be any sequence of real numbers. Denote $T_n = \{x_k \mid k \ge n\}$, which we call the nth tail of the sequence $x_n$. Note that $T_{n+1} \subseteq T_n$ for all $n$. Define $i_n = \inf(T_n)$ and $s_n = \sup(T_n)$.
It is easy to see that $i_n \le s_n$, for all $n$. Moreover, as $n$ increases, the set $T_n$ of which one takes sup or inf shrinks to a subset of what it was the step before. Thus $i_n$ increases and $s_n$ decreases. Consequently, $i_k \to \sup\{i_n \mid n \in \mathbb{N}\}$ and $s_k \to \inf\{s_n \mid n \in \mathbb{N}\}$. Recall that this horizontal-arrow notation means convergence if the sequence is approaching a real number, but it indicates a special type of divergence if the sequence is approaching plus or minus infinity.
Definition 1.3.6 We define the limit superior of $x_n$ by
$$\limsup x_n = \inf\{s_n \mid n \in \mathbb{N}\} = \inf\{\sup(T_n) \mid n \in \mathbb{N}\}$$
and we define the limit inferior of $x_n$ by
$$\liminf x_n = \sup\{i_n \mid n \in \mathbb{N}\} = \sup\{\inf(T_n) \mid n \in \mathbb{N}\},$$
4 The lim sup and lim inf appear only occasionally in this book, but the concepts are presented because they are intrinsically interesting. Also they are very useful to know for further study in graduate courses. On the other hand, the sup, inf, lub, and glb appear often and are needed throughout this book.
where $T_n$ is the nth tail of the sequence $x_n$. Of course, lim sup and lim inf may be real numbers or they may be $\pm\infty$.
Theorem 1.3.3 Let $L \in \mathbb{R}$ and let $x_n$ be a sequence of real numbers. Then $x_n \to L$ if and only if $\limsup x_n = L = \liminf x_n$.
Proof: First, suppose $x_n \to L$. Thus if $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that $n \ge N$ implies $|x_n - L| < \epsilon/2$, which implies $s_n = \sup(T_n) \le L + \epsilon/2$ and $i_n = \inf(T_n) \ge L - \epsilon/2$. Thus
$$L - \frac{\epsilon}{2} \le i_n \le s_n \le L + \frac{\epsilon}{2}$$
for all $n \ge N$. Since $\epsilon > 0$ was arbitrary, $s_n \to L = \limsup x_n$ and $i_n \to L = \liminf x_n$.
For the opposite implication, suppose $\limsup x_n = \liminf x_n = L \in \mathbb{R}$. Let $\epsilon > 0$. There exists $N_1$ such that $n \ge N_1$ implies $\sup(T_n) \le L + \epsilon/2$ and there exists $N_2$ such that $n \ge N_2$ implies $\inf(T_n) \ge L - \epsilon/2$. Let $N = \max\{N_1, N_2\}$. Then $n \ge N$ implies $|x_n - L| \le \epsilon/2 < \epsilon$. Thus $x_n \to L$. •
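The tails $T_n$ and the numbers $i_n = \inf(T_n)$, $s_n = \sup(T_n)$ can be explored numerically. The sketch below is only an approximation under stated assumptions: it uses the illustrative sequence $x_n = (-1)^n + 1/n$ and truncates each tail at a finite `horizon`, so `min` and `max` stand in for inf and sup. For this sequence the truncated maximum equals the true supremum of the tail, while the truncated minimum only approaches the true infimum $-1$ as the horizon grows.

```python
from fractions import Fraction


def x(n):
    """Illustrative sequence x_n = (-1)^n + 1/n (an assumption of this sketch)."""
    return (-1) ** n + Fraction(1, n)


def truncated_tail_bounds(n, horizon):
    """Return (min, max) of x_n, ..., x_horizon as stand-ins for i_n and s_n."""
    tail = [x(k) for k in range(n, horizon + 1)]
    return min(tail), max(tail)


H = 2001
i_10, s_10 = truncated_tail_bounds(10, H)
i_100, s_100 = truncated_tail_bounds(100, H)
```

The values $s_{10} = 11/10$ and $s_{100} = 101/100$ decrease toward 1, while the lower values sit just above $-1$, consistent with $\limsup x_n = 1$ and $\liminf x_n = -1$ for this sequence, which therefore does not converge.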
EXERCISES
1.28 † Prove Corollary 1.3.1. (Hint: Let $-S = \{-s \mid s \in S\}$. Which theorem can you apply to the set $-S$?)
1.29 † Prove the case in which $x_n$ decreases in Theorem 1.3.2.
1.30 † Prove Corollary 1.3.2.
1.31 Find $\sup(S)$ and $\inf(S)$ for each set $S$ below, and justify your conclusions.
a) $S = \{(-1)^n \mid n \in \mathbb{N}\}$.
b) $S = \{(-1)^n n \mid n \in \mathbb{N}\}$.
c) $S = \{x \in \mathbb{R} \mid x^2 < 1\}$.
1.32 Suppose $A$ and $B$ are subsets of $\mathbb{R}$, both nonempty, with the special property that $a \le b$ for all $a \in A$ and for all $b \in B$. Prove: $\sup(A) \le \inf(B)$. (Hint: Every $b$ is an upper bound of $A$. So how does $\sup(A)$ relate to each $b \in B$?)
1.33 Prove that every real number $M \in \mathbb{R}$ is both an upper bound and a lower bound of the empty set, $\emptyset$.
1.34 Let $x_n = \frac{n-1}{n}$. Show that $x_n$ is convergent and find $\lim x_n$. Justify your conclusions.
1.35 Let $x_n = (1.5)^n$, for all $n \in \mathbb{N}$. Find $\sup(T_n)$ and $\inf(T_n)$, where $T_n$ is the nth tail of the sequence, and explain. Find $\liminf x_n$ and $\limsup x_n$.
1.36 Prove or give a counterexample: If $x_n$ increases and $y_n$ increases, then $(x_n + y_n)$ is monotone.
1.37 Prove or give a counterexample: If $x_n$ increases and $y_n$ increases, then $(x_n - y_n)$ is monotone.
1.38 Prove or give a counterexample: If $x_n$ increases and $y_n$ increases, then the product $(x_n y_n)$ is monotone.
1.39 Prove: $x_n$ is a constant sequence if and only if $x_n$ is both monotone increasing and monotone decreasing.
1.40 Let $x_n = \left(-\frac{1}{2}\right)^n$. Find $\inf(T_n)$, $\sup(T_n)$, $\limsup x_n$, and $\liminf x_n$. Does $\lim_{n \to \infty} x_n$ exist? (Hint: $T_n$ is the nth tail of the sequence $x_n$.)
1.41 Let $x_n = (-1)^n + \frac{1}{n}$. Find $\limsup x_n$ and $\liminf x_n$. Does $\lim_{n \to \infty} x_n$ exist?
1.42 Give an example of a sequence $x_n \to \infty$ for which $x_n$ is not monotone.
1.43 Let $x_n$ be any sequence of real numbers. Prove: $x_n$ diverges to $\infty$ if and only if $\liminf x_n = \limsup x_n = \infty$.
1.44 Let $x_n$ be any sequence of real numbers. Prove: $x_n$ diverges to $-\infty$ if and only if $\liminf x_n = \limsup x_n = -\infty$.
1.45 Prove that $\liminf x_n \le \limsup x_n$, for every bounded sequence $x_n$ of real numbers. (Hint: The result of problem 5 may help.)
1.46 Let $x_n$ be any unbounded sequence of real numbers. Let $s_n$ and $i_n$ be defined as in the proof of Theorem 1.3.3.
a) If $\{x_n \mid n \in \mathbb{N}\}$ has no upper bound, prove $s_n = \infty$ for all $n$, so that $\limsup x_n = \infty$.
b) If $\{x_n \mid n \in \mathbb{N}\}$ has no lower bound, prove $i_n = -\infty$ for all $n$, so that $\liminf x_n = -\infty$.
c) In either of the two cases above, conclude that $\liminf x_n \le \limsup x_n$.

1.4 ALGEBRAIC COMBINATIONS OF SEQUENCES
If $s_n$ is some algebraic combination of other sequences, then we may be able to determine whether or not $s_n$ converges if we know the behavior of the other sequences of which $s_n$ is composed.
Theorem 1.4.1 Suppose $x_n$ and $y_n$ both converge, with $x_n \to L$ and $y_n \to M$ as $n \to \infty$. Then
i. $x_n + y_n \to L + M$.
ii. $x_n - y_n \to L - M$.
iii. $x_n y_n \to LM$.
iv. $\frac{x_n}{y_n} \to \frac{L}{M}$, provided that $M \ne 0$ and $y_n \ne 0$, for all $n \in \mathbb{N}$.
In order to prove this four-part theorem, it is helpful first to introduce the following definition and the two lemmas that follow it. Definition 1.4.1 A sequence that converges to zero is called a null sequence.
Lemma 1.4.1 The sequence $x_n \to L \in \mathbb{R}$ if and only if $x_n - L \to 0$.
Proof of Lemma. We remark that in words we are proving that $x_n \to L$ if and only if $x_n - L$ is a null sequence. By definition, $x_n \to L \in \mathbb{R}$ if and only if for all $\epsilon > 0$ there exists $N \in \mathbb{N}$, corresponding to $\epsilon$, such that $n \ge N$ implies $|x_n - L| < \epsilon$. This is equivalent to $|(x_n - L) - 0| < \epsilon$, which is equivalent to the statement that $(x_n - L) \to 0$, since $|x_n - L| = |(x_n - L) - 0|$. •
Lemma 1.4.2 If $s_n \to 0$ and if $t_n$ is bounded, then $s_n t_n \to 0$.
Proof: We are proving that a null sequence times a bounded sequence must be a null sequence. There exists $M > 0$ such that $|t_n| \le M$, for all $n \in \mathbb{N}$. Let $\epsilon > 0$. Since $s_n \to 0$, there exists $N$ such that $n \ge N$ implies $|s_n - 0| = |s_n| < \frac{\epsilon}{M}$. Now, $n \ge N$ implies
$$|s_n t_n - 0| = |s_n|\,|t_n| < \frac{\epsilon}{M} \cdot M = \epsilon.$$
•
With the preceding definition and two lemmas in hand, we proceed to the main task of proving the theorem. Proof:
i. Let $\epsilon > 0$. There exists $N_1$ such that $n \ge N_1$ implies $|x_n - L| < \epsilon/2$, and there exists $N_2$ such that $n \ge N_2$ implies $|y_n - M| < \epsilon/2$. Now let $N = \max\{N_1, N_2\}$. Then $n \ge N$ implies
$$|(x_n + y_n) - (L + M)| \le |x_n - L| + |y_n - M| < \epsilon.$$
ii. This proof is almost identical to the preceding case.
iii. Since $y_n$ converges, $y_n$ is bounded. And
$$x_n y_n - LM = x_n y_n - L y_n + L y_n - LM = (x_n - L) y_n + L (y_n - M) \to 0 + 0 = 0$$
using the two lemmas and the first part, proven above.
iv. Because of the third part, proven above, it suffices to prove that $\frac{1}{y_n} \to \frac{1}{M}$. But
$$\left| \frac{1}{y_n} - \frac{1}{M} \right| = \frac{|y_n - M|}{|y_n M|} = |y_n - M| \cdot \frac{1}{|y_n M|}.$$
Since $|y_n - M| \to 0$, it suffices to show $\frac{1}{|y_n M|}$ is bounded. There exists $N$ such that $n \ge N$ implies $|y_n - M| < \frac{|M|}{2}$. Thus $|y_n| > \frac{|M|}{2}$ and $\frac{1}{|y_n M|} < \frac{2}{|M|^2}$. Thus $\frac{1}{|y_n M|}$ is bounded by
$$\max\left\{ \frac{1}{|y_1 M|}, \ldots, \frac{1}{|y_{N-1} M|}, \frac{2}{|M|^2} \right\}.$$
•
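As a worked illustration of how the four parts of Theorem 1.4.1 combine, consider the sequence $x_n = \frac{2n^2 + 1}{n^2 + 3n}$. This particular sequence is chosen here for illustration and is not taken from the text. Dividing numerator and denominator by $n^2$ and using the fact that $\frac{1}{n} \to 0$ gives:

```latex
\[
\frac{2n^2 + 1}{n^2 + 3n}
  = \frac{2 + \frac{1}{n}\cdot\frac{1}{n}}{1 + 3\cdot\frac{1}{n}}
  \;\longrightarrow\;
  \frac{2 + 0\cdot 0}{1 + 3\cdot 0} = 2
  \qquad \text{as } n \to \infty .
\]
```

Part iii handles the products $\frac{1}{n}\cdot\frac{1}{n}$ and $3\cdot\frac{1}{n}$, part i handles the two sums, and part iv handles the quotient. Part iv applies because the denominator $1 + \frac{3}{n}$ is never zero for $n \in \mathbb{N}$ and its limit is $1 \ne 0$.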
EXERCISES
1.47 Give examples of divergent sequences $x_n$ and $y_n$ such that $x_n + y_n$ converges.
1.48 Let $a \in \mathbb{R}$ be arbitrary. Give examples of sequences $x_n \to \infty$ and $y_n \to \infty$ such that $x_n - y_n \to a$.
1.49 Give examples of divergent sequences $x_n$ and $y_n$ such that $x_n y_n$ converges.
1.50 Let the real number $a \ge 0$ be arbitrary. Give examples of sequences $x_n \to \infty$ and $y_n \to \infty$ such that $\frac{x_n}{y_n} \to a$.
1.51 Prove or else give a counterexample: If $x_n + y_n$ converges and if $x_n - y_n$ converges, then $x_n$ converges and $y_n$ converges.
1.52 Prove or else give a counterexample: If $ad - bc \ne 0$ and if $a x_n + b y_n \to L$ and $c x_n + d y_n \to M$ as $n \to \infty$, then $x_n$ converges and $y_n$ converges.
1.53 Suppose for all $n \in \mathbb{N}$ we have $y_n \ne 0$. Prove or else give a counterexample: If both $x_n y_n$ and $\frac{x_n}{y_n}$ converge, then $x_n$ converges and $y_n$ converges.
1.54 Prove or else give a counterexample:
a) A bounded sequence times a convergent sequence must be convergent.
b) A null sequence times a bounded sequence must be a null sequence.
1.55 a) If $q(n) = b_k n^k + b_{k-1} n^{k-1} + \cdots + b_1 n + b_0$ is a polynomial in the variable $n \in \mathbb{N}$ with $b_k \ne 0$, show that there exists $N \in \mathbb{N}$ such that $n \ge N$ implies $q(n) \ne 0$.
b) Show that

provided that $b_k \ne 0$ and $k$ is a positive integer.
1.56 ◊ †⁵ Define the nth Cesàro mean of a sequence $x_n$ by
$$\sigma_n = \frac{1}{n}(x_1 + \cdots + x_n)$$
for all $n \in \mathbb{N}$.
a) Suppose $x_n \to L$ as $n \to \infty$. Prove: $\sigma_n \to L$ as $n \to \infty$. (Hint: Write $|\sigma_n - L| = \left| \frac{\sum_{k=1}^{n} (x_k - L)}{n} \right|$.)
b) Give an example of a divergent sequence $x_n$ for which $\sigma_n$ converges.
1.57 Let $x_n$ and $y_n$ be any two bounded sequences of real numbers. Prove that
$$\limsup (x_n + y_n) \le \limsup x_n + \limsup y_n.$$
Give an example in which strict inequality occurs.
1.58 Let $x_n$ and $y_n$ be any two bounded sequences of real numbers. Prove that
$$\liminf (x_n + y_n) \ge \liminf x_n + \liminf y_n.$$
Give an example in which strict inequality occurs.
1.5 THE BOLZANO-WEIERSTRASS THEOREM
A subsequence of a sequence $x_n$ is a sequence consisting of some (but not necessarily all) of the terms of the sequence $x_n$. The terms appear in the same order as they appeared in $x_n$, but with omissions. We formalize this concept in the following definition.
Definition 1.5.1 Let $n_k$ be any strictly increasing sequence of natural numbers, so that
$$n_1 < n_2 < \cdots < n_k < \cdots.$$
Then we call $x_{n_k}$ a subsequence of $x_n$.
We remark that since $n_1 \ge 1$, it follows that $n_2 \ge 2, \ldots,$ and $n_k \ge k$, for all $k$. An alternative way to think about and to notate subsequences is to write that $n_k = \phi(k)$, where $\phi : \mathbb{N} \to \mathbb{N}$ is a strictly increasing function, in the sense that $j < k \implies \phi(j) < \phi(k)$. Then we could alternatively write $x_{n_k}$ as $x_{\phi(k)}$.
• EXAMPLE 1.7
Let $x_n = n^2$, for all $n \in \mathbb{N}$. If $n_k = 2k$, then $x_{n_k} = (2k)^2$ is the sequence of squares of even natural numbers.
⁵ This exercise is used to develop the Fejér kernel for Fourier series in Exercise 6.47.
Theorem 1.5.1 If $x_n$ converges to the limit $L$ as $n \to \infty$, then every subsequence $x_{n_k} \to L$ as $k \to \infty$.
Proof: Let $\epsilon > 0$. There exists $N \in \mathbb{N}$ such that $n \ge N$ implies $|x_n - L| < \epsilon$. Since $n_k \ge k$ for all $k$, it follows that $k \ge N \implies |x_{n_k} - L| < \epsilon$. •
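The index-function view of a subsequence, $n_k = \phi(k)$ with $\phi$ strictly increasing, translates directly into code. The following is a small sketch; the helper name `subsequence` is hypothetical, and only finitely many indices can be checked for monotonicity.

```python
def subsequence(x, phi, count):
    """Return the first `count` terms x(phi(1)), ..., x(phi(count))."""
    indices = [phi(k) for k in range(1, count + 1)]
    # phi must be strictly increasing for this to be a subsequence.
    if any(a >= b for a, b in zip(indices, indices[1:])):
        raise ValueError("phi is not strictly increasing")
    return [x(n) for n in indices]


# Example 1.7: x_n = n^2 with n_k = 2k gives the squares of even numbers.
squares_of_evens = subsequence(lambda n: n * n, lambda k: 2 * k, 5)
```

Because $\phi$ is strictly increasing on the natural numbers, $\phi(k) \ge k$, which is the fact used in the proof of Theorem 1.5.1.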
Corollary 1.5.1 If $x_n$ has two subsequences that converge to different limits, then $x_n$ is not convergent.
Theorem 1.5.1 should be compared carefully with the following example.
• EXAMPLE 1.8
Let $x_n = (-1)^{n+1}$. The sequence $x_n$ is bounded but is not convergent. The subsequences $x_{2k-1} \to 1$ and $x_{2k} \to -1$ as $k \to \infty$.
We have learned previously that every convergent sequence is bounded. Although the student has seen several examples of bounded sequences that are not convergent, we do have the following very important theorem.
Theorem 1.5.2 (Bolzano-Weierstrass) Let $x_n$ be any bounded sequence of real numbers, so that there exists $M \in \mathbb{R}$ such that $|x_n| \le M$ for all $n$. Then there exists a convergent subsequence $x_{n_k}$ of $x_n$. That is, there exists a subsequence $x_{n_k}$ that converges to some $L \in [-M, M]$.
Proof: We will use the method of interval-halving introduced previously to prove the existence of least upper bounds. Let $a_1 = -M$ and $b_1 = M$. So $x_n \in [a_1, b_1]$, for all $n \in \mathbb{N}$. Let $x_{n_1} = x_1$. Now divide $[a_1, b_1]$ in half using the midpoint $\frac{a_1 + b_1}{2} = 0$.
i. If there exist $\infty$-many values of $n$ such that $x_n \in [a_1, 0]$, then let $a_2 = a_1$ and $b_2 = 0$.
ii. But if there do not exist $\infty$-many such terms in $[a_1, 0]$, then there exist $\infty$-many such terms in $[0, b_1]$. In that case let $a_2 = 0$ and $b_2 = b_1$.
Now since there exist $\infty$-many terms of $x_n$ in $[a_2, b_2]$, pick any $n_2 > n_1$ such that $x_{n_2} \in [a_2, b_2]$. Next divide $[a_2, b_2]$ in half and pick one of the halves $[a_3, b_3]$ having $\infty$-many terms of $x_n$ in it. Then pick $n_3 > n_2$ such that $x_{n_3} \in [a_3, b_3]$. Observe that
$$|b_k - a_k| = \frac{2M}{2^{k-1}} \to 0$$
as $k \to \infty$. So if $\epsilon > 0$, there exists $K$ such that $k \ge K$ implies $|b_k - a_k| < \epsilon$. Thus if $j$ and $k \ge K$, we have $|x_{n_j} - x_{n_k}| < \epsilon$ as well. Hence $x_{n_k}$ is a Cauchy sequence and must converge. Since $[-M, M]$ is a closed interval, we know from a previous exercise that $x_{n_k} \to L$ as $k \to \infty$ for some $L \in [-M, M]$. •
EXERCISES
1.59 Give an example of a bounded sequence that does not converge.
1.60 Use Corollary 1.5.1 to prove that the sequence $x_n = (-1)^n + \frac{1}{n}$ does not converge.
1.61 Suppose $x_n \to \infty$. Prove that every subsequence $x_{n_k} \to \infty$ as $k \to \infty$ as well. (Hint: The sequence $x_n$ is divergent, so it is not enough to quote Theorem 1.5.1.)
1.62 Use the following steps to prove that the sequence $x_n$ has no convergent subsequences if and only if $|x_n| \to \infty$ as $n \to \infty$.
a) Suppose that the sequence $x_n$ has no convergent subsequences. Let $M > 0$. Prove that there exist at most finitely many values of $n$ such that $x_n \in [-M, M]$. Explain why this implies $|x_n| \to \infty$ as $n \to \infty$.
b) Suppose $|x_n| \to \infty$ as $n \to \infty$. Show that $x_n$ has no convergent subsequence. (Hint: Exercise 1.61 may help.)
1.63 Give an example in which $y_j > 0$ for all $j$ and $y_j \to 0$ yet $y_j$ is not monotone.
1.64 The following questions provide an easy, alternative proof of the Bolzano-Weierstrass Theorem.
a) Use the following steps to prove that every sequence $x_n$ of real numbers has a monotone subsequence. Denote the nth tail of the sequence by $T_n = \{x_j \mid j \ge n\}$.
(i) Suppose the following special condition is satisfied: For each $n \in \mathbb{N}$, $T_n$ has a smallest element. Prove that there exists an increasing subsequence $x_{n_i}$.
(ii) Suppose the condition above fails, so that there exists $N \in \mathbb{N}$ such that $T_N$ has no smallest element. Prove that there exists a decreasing subsequence $x_{n_i}$.
b) Give an easy alternative proof of the Bolzano-Weierstrass Theorem.
1.65 Prove: A sequence $x_n \to L \in \mathbb{R}$ if and only if every subsequence $x_{n_i}$ possesses a sub-subsequence $x_{n_{i_j}}$ that converges to $L$ as $j \to \infty$. (Hint: To prove the if part, suppose false and write out the logical negation of convergence of $x_n$ to $L$.)
1.66 Prove or give a counterexample: A sequence $x_n \in \mathbb{R}$ converges if and only if every subsequence $x_{n_i}$ possesses a sub-subsequence $x_{n_{i_j}}$ that converges as $j \to \infty$.
1.6 THE NESTED INTERVALS THEOREM
Having used the method of interval-halving twice already, it is natural to consider the following theorem.
Theorem 1.6.1 (Nested Intervals Theorem) Suppose
$$[a_1, b_1] \supseteq [a_2, b_2] \supseteq \cdots \supseteq [a_k, b_k] \supseteq \cdots$$
is a decreasing nest of closed finite intervals. Suppose also that
$$b_k - a_k \to 0 \quad \text{as } k \to \infty.$$
Then there exists exactly one point $L \in \bigcap_{k=1}^{\infty} [a_k, b_k]$. Moreover, $a_k \to L$ and $b_k \to L$ as $k \to \infty$.
Proof: Let $\epsilon > 0$. Then there exists $K$ such that $k \ge K$ implies $|b_k - a_k| < \epsilon$. But, for all $k \ge K$, $a_k \in [a_K, b_K]$. Thus $j, k \ge K$ implies $|a_j - a_k| < \epsilon$. Hence the sequence $a_k$ is a Cauchy sequence, so there exists a point $L$ such that $a_k \to L$. Since $k \ge n$ implies for all $n$ that $a_k \in [a_n, b_n]$, it follows that $L \in [a_n, b_n]$ for all $n$ and that
$$L \in \bigcap_{k=1}^{\infty} [a_k, b_k].$$
Now, if $L' \in \bigcap_{k=1}^{\infty} [a_k, b_k]$ also, then $|L - L'| \le |b_k - a_k| \to 0$, which implies $L = L'$. Hence the point $L$ is unique. Observe that $|b_k - L| \le |b_k - a_k| \to 0$ so that $b_k \to L$ as claimed. •
The reader is aware that there are real numbers that are not rational. For example, we will prove that there is a square root of 2 in $\mathbb{R}$ in Example 1.10. Yet we know that no rational number can be a square root of 2, as was shown in Exercise 1.11. Despite the fact that not every real number is rational, every finitely long decimal expansion represents a rational number, and common sense tells us that we may approximate any real number as closely as we wish by using a suitable but finitely long decimal expansion. This observation gives rise to the following definition of what it means for a subset $S \subseteq \mathbb{R}$ to be dense in $\mathbb{R}$.
Definition 1.6.1 A subset $S \subseteq \mathbb{R}$ is called dense in $\mathbb{R}$ if and only if for all $x \in \mathbb{R}$ there exists a sequence $s_k$ of elements of $S$ such that $s_k \to x$.
• EXAMPLE 1.9 We will show that $\mathbb{Q}$ is dense in $\mathbb{R}$.
Proof: Let $x \in \mathbb{R}$. If $x \in \mathbb{Q}$, we could simply let $s_k \equiv x$ so that $s_k \to x$, being a constant sequence. So suppose $x \notin \mathbb{Q}$, so that $x$ is irrational. Then there exists $n \in \mathbb{Z}$ such that $n < x < n + 1$. Let $a_1 = n$ and $b_1 = n + 1$, both rational numbers. Then the midpoint is also a rational number, and $x$ must lie in one half-interval but not the other. Let $[a_2, b_2]$ be the half-interval containing $x$. Now cut $[a_2, b_2]$ in half again and select $[a_3, b_3]$ containing $x$ again. Note that
$$|b_k - a_k| = \frac{1}{2^{k-1}} \to 0.$$
Since $x \in [a_k, b_k]$ for all $k$, $|a_k - x| \to 0$, so $a_k \to x$, and $a_k \in \mathbb{Q}$ for all $k$. Thus we have a sequence of rational numbers converging to $x$ in this case as well. (Note that $b_k$ would have served just as well as $a_k$.) •
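The construction in Example 1.9 can be sketched with exact rational arithmetic. The code below is an illustration under an assumption: since an irrational $x$ cannot be stored exactly, the caller supplies a hypothetical predicate `is_below(q)` that reports whether a rational $q$ satisfies $q < x$. For $x = \sqrt{2}$ and $q > 0$, that test is simply $q^2 < 2$.

```python
from fractions import Fraction


def rational_approximations(is_below, n, steps):
    """Halve [n, n+1] repeatedly, keeping the half that contains x.

    is_below(q) must return True exactly when the rational q is less than x.
    Returns the list of left endpoints a_1, a_2, ... and the final right endpoint.
    """
    a, b = Fraction(n), Fraction(n + 1)
    lefts = [a]
    for _ in range(steps):
        mid = (a + b) / 2  # the midpoint of two rationals is rational
        if is_below(mid):
            a = mid
        else:
            b = mid
        lefts.append(a)
    return lefts, b


# Hypothetical usage: rational left endpoints approaching sqrt(2) in (1, 2).
lefts, b_last = rational_approximations(lambda q: q * q < 2, 1, 20)
```

After 20 halvings the enclosing interval has length exactly $1/2^{20}$, matching the formula $|b_k - a_k| = 1/2^{k-1}$ with $k = 21$.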
Remark. Because $\mathbb{R}$ is complete and because the set $\mathbb{Q} \subset \mathbb{R}$ is dense in $\mathbb{R}$, it follows that any set of numbers that contains limits for all its Cauchy sequences and that contains $\mathbb{Q}$ must also contain $\mathbb{R}$. For this reason $\mathbb{R}$ is called the completion of $\mathbb{Q}$.
• EXAMPLE 1.10 We will show that $\sqrt{2}$ exists in $\mathbb{R}$.
Proof: Recall that in the first paragraph of Section 1.3 we constructed an increasing sequence $x_k$ as follows:
$$x_1 = 1, \quad x_2 = 1.4, \quad x_3 = 1.41, \quad x_4 = 1.414, \quad \ldots$$
Here $x_k$ is the largest k-digit decimal greater than 1 such that $x_k^2 < 2$. We could have constructed also a decreasing sequence $y_k$ by letting $y_k$ be the smallest k-digit decimal such that $y_k^2 > 2$. Thus
$$|y_k - x_k| \le \frac{1}{10^{k-1}} \to 0$$
as $k \to \infty$. We see that the intervals $[x_k, y_k]$ satisfy the hypotheses of Theorem 1.6.1. Thus there exists a unique
$$L \in \bigcap_{k=1}^{\infty} [x_k, y_k]$$
such that $x_k \to L$ and $y_k \to L$. Hence $x_k^2 \to L^2$, so that $L^2 \le 2$, and $y_k^2 \to L^2$, so that $L^2 \ge 2$. Thus $L^2 = 2$ and $L = \sqrt{2}$ exists in $\mathbb{R}$. •
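The decimal sequences $x_k$ and $y_k$ of Example 1.10 can be computed exactly with integer arithmetic. This is a sketch, and the function name `decimal_bounds` is an assumption. It relies on the fact that $2 \cdot 10^{2(k-1)}$ is never a perfect square, so the integer square root gives the largest integer $m$ with $m^2 < 2 \cdot 10^{2(k-1)}$.

```python
from fractions import Fraction
from math import isqrt


def decimal_bounds(k):
    """Return (x_k, y_k): decimals with k-1 places bracketing sqrt(2)."""
    scale = 10 ** (k - 1)
    m = isqrt(2 * scale * scale)  # largest integer with m^2 <= 2 * scale^2
    return Fraction(m, scale), Fraction(m + 1, scale)


xs = [decimal_bounds(k)[0] for k in range(1, 5)]
```

The first four left endpoints are 1, 1.4, 1.41, and 1.414, and each pair satisfies $y_k - x_k = 1/10^{k-1}$, so the intervals $[x_k, y_k]$ shrink as the Nested Intervals Theorem requires.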
EXERCISES
1.67 Give an example of a decreasing nest of nonempty open finite intervals such that $\bigcap_{k=1}^{\infty} (a_k, b_k) = \emptyset$, the empty set.
1.68 Give an example of a decreasing nest of open intervals
$$(a_1, b_1) \supseteq (a_2, b_2) \supseteq \cdots$$
such that $b_k - a_k \to 0$ yet $\bigcap_{k=1}^{\infty} (a_k, b_k) \ne \emptyset$.
1.69 Give an example of a decreasing nest of infinite intervals with empty intersection.
1.70 Prove or give a counterexample: If $a_n \uparrow$, $b_n \downarrow$, and $(a_n, b_n)$ is a decreasing nest of finite open intervals, then there exists $L \in \mathbb{R}$ such that
$$\bigcap_{n=1}^{\infty} (a_n, b_n) = \{L\}.$$
1.71 Show that every open interval $(a, b) \subset \mathbb{R}$, with $0 < b - a$ but no matter how small, must contain a rational number. (Hint: Apply Example 1.9.)
1.72 † Let $I$ denote the set of all irrational numbers. The following steps will lead to the conclusion that $I$ is dense in $\mathbb{R}$. (You may assume it is known that $\sqrt{2} \in I$.) Let $x \in \mathbb{R}$. We must show there exists a sequence $s_k$ of elements of $I$ converging to $x$.
a) Show that if $\frac{m}{n}$ is any nonzero rational number then $\frac{m}{n}\sqrt{2}$ is irrational. (Hint: Suppose the claim is false, and deduce a contradiction.)
b) Now suppose $x$ is any real number. Explain why there exists a sequence $t_k$ of nonzero elements of $\mathbb{Q}$ converging to $\frac{x}{\sqrt{2}}$. Define a sequence $s_k$ of elements of $I$ converging to $x$.
1.73 Show that every open interval $(a, b)$, with $b - a > 0$ but no matter how small, must contain an irrational number. (Hint: Use the result of Exercise 1.72.)
1.74
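Is the set $\left\{ \frac{m}{2^n} \mid m \in \mathbb{Z},\ n \in \mathbb{N} \right\}$ dense in $\mathbb{R}$? Prove your conclusion.
1.75 ◊ Let $D \ne \emptyset$ be a subset of the set of strictly positive real numbers, and let $S = \{nd \mid n \in \mathbb{Z},\ d \in D\}$. Prove: $S$ is dense in $\mathbb{R}$ if and only if $\inf(D) = 0$.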
1.7 THE HEINE-BOREL COVERING THEOREM
Although the study of continuous functions belongs to the next chapter, let us think in advance on an intuitive level about this concept. A function $f : \mathbb{R} \to \mathbb{R}$ is said to be everywhere continuous provided that for each point $p \in \mathbb{R}$, $f(x)$ remains very close to $f(p)$ provided that $x$ is kept sufficiently close to $p$. For example, the set
$$S = \{x \mid |f(x) - f(p)| < \epsilon\}$$
should contain some sufficiently small open interval around $p$, although $S$ may also include points far away from $p$.
Consider next an open interval $(a, b)$ that is contained in the set of values achieved by $f$. Let $O = \{x \mid f(x) \in (a, b)\}$. For each $p \in O$ there will be a corresponding small number $\epsilon > 0$ such that
$$(f(p) - \epsilon, f(p) + \epsilon) \subseteq (a, b).$$
Because $f$ is continuous at $p$, there will be a small open interval around $p$ that is contained in $O$. This example motivates the concept of an open set, which generalizes the familiar notion of an open interval.
Definition 1.7.1 A set $O \subseteq \mathbb{R}$ is called an open subset of $\mathbb{R}$ provided for each $x \in O$ there exists $r_x > 0$ such that
$$(x - r_x, x + r_x) \subseteq O.$$
Thus $O$ is called open provided that each $x \in O$ has some (perhaps very small) open interval of radius $r_x > 0$ around it that is entirely in $O$.
• EXAMPLE 1.11 We claim that every open interval $(a, b)$ is an open set. In fact, if $x \in (a, b)$, then $a < x < b$ and we can let
$$r_x = \min\{|x - a|, |x - b|\}.$$
Then $(x - r_x, x + r_x) \subseteq (a, b)$.
Theorem 1.7.1 Every open subset $O \subseteq \mathbb{R}$ is a union of (perhaps infinitely many) open intervals. Moreover, every union of open sets is an open set.
Proof: Let $O \subseteq \mathbb{R}$ be open. Then, using the notation of Definition 1.7.1, you will show in Exercise 1.78 that
$$O = \bigcup_{x \in O} (x - r_x, x + r_x).$$
To prove the second conclusion, let $O = \bigcup_{\alpha \in A} O_\alpha$ be any union⁶ of open sets. Let $x \in O$. We know there exists $\alpha_0 \in A$ such that $x \in O_{\alpha_0}$, which is open. Thus there exists $r_x > 0$ such that $(x - r_x, x + r_x) \subseteq O_{\alpha_0} \subseteq O$. Thus $O$ is open. •
Definition 1.7.2 An open cover of a set $S \subseteq \mathbb{R}$ is a collection
$$\mathcal{O} = \{O_\alpha \mid \alpha \in A\}$$
of (perhaps infinitely many) open sets $O_\alpha$, where $\alpha$ ranges over some index set $A$, such that $S \subseteq \bigcup_{\alpha \in A} O_\alpha$.
⁶ When denoting an arbitrary union of open sets, it is customary to use a so-called index set, such as the set $A$ used here. One should think of $A$ as being a set of labels, or names, used to tag, or identify, the sets of which the union is being formed. One cannot always index sets by means of natural numbers, because there exist sets so large that they cannot be uniquely indexed by natural numbers. Even the infinite set $\mathbb{N}$ is too small. The reader will learn more about this in Theorem 1.15.
In analysis, it is often necessary to try to control small-scale local variations of some structure defined on a domain $D$. Under suitable conditions, one can control variations by restricting one's view to a very small open set surrounding each given point of $D$. Then in the large we cover the whole domain $D$ with a family $\mathcal{O}$ of these (possibly small) open sets whose union contains $D$. Usually $\mathcal{O}$ will have infinitely many open sets as members, or elements of itself. Within each one of the open sets that are elements of $\mathcal{O}$ the fine structure varies only slightly. We hope for the availability of a finite subcover, consisting of only finitely many of the open sets belonging to $\mathcal{O}$, so as to produce uniform controls on fine-scale variations for the entire large domain $D$. Below, we show an example of an open covering of a set for which there is no finite subcover. This will motivate the Heine-Borel Theorem which follows.
• EXAMPLE 1.12 Consider the set $S = (0, 2)$, a finite open interval. We claim that
$$(0, 2) = \bigcup_{n \in \mathbb{N}} \left( \tfrac{1}{n}, 2 \right).$$
In fact, for each $x \in (0, 2)$ there exists $n \in \mathbb{N}$ such that $x \in \left(\frac{1}{n}, 2\right)$. (Make sure you see why this is so.) Thus $\mathcal{O} = \left\{ \left(\frac{1}{n}, 2\right) \mid n \in \mathbb{N} \right\}$ is an open cover of $S$. However, it is impossible to select any finite subset of $\mathcal{O}$ that covers $S$. The reason is that any finite subset of $\mathcal{O}$ would have a largest value $n_0$ of $n$ for which $\frac{1}{n}$ would be the left-hand endpoint of an interval belonging to the chosen finite subset of $\mathcal{O}$. Thus the finite subset would fail to cover any points to the left of $\frac{1}{n_0}$.
Remark 1.7.1 Note that the term finite interval means an interval of finite length. Any finite interval with strictly positive length has infinitely many distinct points within it. Thus the word finite in finite interval means the same thing as bounded. On the other hand, a finite set means a set with finitely many elements. In Example 1.12, a finite subset of a set of intervals means a collection of finitely many of those intervals. This does not mean that the intervals in question have finitely many points.
The Heine-Borel theorem is one of the most important in advanced calculus. But it is the most abstract theorem presented thus far in this book, and the reader will need time and experience to absorb fully its significance. It is recommended to consider Exercise 1.80 below after reading the statement of the theorem.
Theorem 1.7.2 (Heine-Borel) Suppose the closed finite interval
$$[a, b] \subseteq \bigcup_{\alpha \in A} O_\alpha,$$
where $\mathcal{O} = \{O_\alpha \mid \alpha \in A\}$ is an open cover of $[a, b]$. Then there exists a finite set $F = \{\alpha_1, \ldots, \alpha_n\} \subseteq A$ such that
$$[a, b] \subseteq \bigcup_{\alpha \in F} O_\alpha = \bigcup_{i=1}^{n} O_{\alpha_i}.$$
The collection $\{O_{\alpha_1}, \ldots, O_{\alpha_n}\} \subseteq \mathcal{O}$ is called a finite subcover of $[a, b]$.
Proof: We suppose the theorem were false. We will deduce a logical self-contradiction from that supposition. This will prove the theorem. So suppose the Heine-Borel theorem were false. Thus we can assume the given cover does not admit a finite subcover of $[a, b]$. Let $a_1 = a$ and $b_1 = b$, and let $c = \frac{a_1 + b_1}{2}$. Then each of the intervals $[a_1, c]$ and $[c, b_1]$ is covered by $\bigcup_{\alpha \in A} O_\alpha$. If both of these half-intervals had finite subcovers, then the whole interval $[a, b]$ would have a finite subcover, since the union of two finite families is still finite. Since we are supposing $[a, b]$ has no finite subcover, pick a half-interval $[a_2, b_2]$ that has no finite subcover. Now cut $[a_2, b_2]$ in half and reason the same way for $[a_2, b_2]$ as we did for $[a_1, b_1]$. We obtain a decreasing nest of intervals
$$[a_1, b_1] \supseteq \cdots \supseteq [a_k, b_k] \supseteq \cdots$$
such that each $[a_k, b_k]$ is covered by $\bigcup_{\alpha \in A} O_\alpha$ but has no finite subcover. However,
$$|b_k - a_k| = \frac{b - a}{2^{k-1}} \to 0$$
as $k \to \infty$. By the nested intervals theorem, there exists a unique
$$x \in \bigcap_{k=1}^{\infty} [a_k, b_k] \subseteq [a, b].$$
Since $x \in [a, b]$, there exists $\alpha \in A$ such that $x \in O_\alpha$. So there exists $r_x > 0$ such that $(x - r_x, x + r_x) \subseteq O_\alpha$. Now pick $k$ big enough so that $b_k - a_k < r_x$. Thus
$$x \in [a_k, b_k] \subset (x - r_x, x + r_x) \subseteq O_\alpha$$
and we have covered $[a_k, b_k]$ with a single open set $O_\alpha$ from the original cover. This is a (very small) finite subcover. This contradicts the statement that $[a_k, b_k]$ could not have a finite subcover. This contradiction proves the Heine-Borel theorem. •
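To see why the closedness of $[a, b]$ matters, the failure described in Example 1.12 can be checked mechanically. The sketch below uses hypothetical helper names: given any finite list of indices $n$, it produces a point of $(0, 2)$ that none of the chosen intervals $\left(\frac{1}{n}, 2\right)$ contains.

```python
from fractions import Fraction


def uncovered_point(ns):
    """Given finitely many indices n, return a point of (0, 2) missed by every (1/n, 2)."""
    n0 = max(ns)
    # 1/n0 <= 1/n for each chosen n, so 1/n0 lies in none of the open intervals.
    return Fraction(1, n0)


def covered(x, ns):
    """True when x lies in at least one of the open intervals (1/n, 2)."""
    return any(Fraction(1, n) < x < 2 for n in ns)


chosen = [1, 3, 7, 50]
p = uncovered_point(chosen)
```

Adding one more interval with a larger index covers $p$ but leaves a new gap further to the left, so no finite selection from this cover ever covers all of $(0, 2)$.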
EXERCISES
1.76 Show that a closed finite interval $[a, b]$ is not an open set.
1.77 Show that a half-closed finite interval $(a, b]$ is not an open set.
1.78 Let $O$ be any open subset of $\mathbb{R}$, and for each $x \in O$ let $r_x$ be defined as in the proof of Theorem 1.7.1. Complete the proof of that theorem by showing that $O = \bigcup_{x \in O} (x - r_x, x + r_x)$.
1.79 The empty set $\emptyset$ satisfies the definition of being open. Explain.
1.80 Find an open cover of the interval $(-1, 1)$ that has no finite subcover. Justify your claims.
1.81 Find an open cover of the interval $(-\infty, \infty)$ that has no finite subcover. Justify your claims.
1.82 Let $E \subseteq \mathbb{R}$ be any unbounded set. Find an open cover of $E$ that has no finite subcover. Prove that you have chosen an open cover and that it has no finite subcover.
1.83 Let $E = \left\{ \frac{1}{n} \mid n \in \mathbb{N} \right\}$. Find an open cover $\mathcal{O} = \{O_n \mid n \in \mathbb{N}\}$ of $E$ that has no finite subcover, and prove that $\mathcal{O}$ is an open cover and that $\mathcal{O}$ has no finite subcover.
1.84 ◊ We call $p$ a cluster point of $E$, provided that for all $\epsilon > 0$ there exists $e \in E$ such that $0 < |e - p| < \epsilon$. (See Definition 2.1.1.) Let $E \subset \mathbb{R}$ be any set with the property that there is a cluster point $p$ of $E$ such that $p \notin E$. Show that there exists an open cover of $E$ that has no finite subcover. Justify your claims. (Note: Exercise 1.83 is an example of the claim of this exercise.)
1.85 True or False: Finitely many of the open sets in the collection

would suffice to cover $[0, 1]$.
1.86 Prove or give a counterexample: Every open cover of a finite subset of $\mathbb{R}$ has a finite subcover. (Note: For the real line, the phrase finite subset does not mean the same thing as finite interval.)
1.8 COUNTABILITY OF THE RATIONAL NUMBERS
Definition 1.8.1 A set $S$ is called countable if it is an infinite set for which it is possible to arrange all the elements of $S$ into a sequence. That is, $S$ is countable if $S = \{s_1, s_2, \ldots, s_k, \ldots\}$ with each element of $S$ listed exactly once in the sequence. Equivalently, we may say that $S$ is countable if and only if there exists a function $s : \mathbb{N} \to S$ that is both one-to-one, which is also called injective, and onto $S$. Onto maps are often called surjective. The term $s_n$ in the definition above would be $s(n)$ in this notation.
• EXAMPLE 1.13
Let $E$ denote the set of all even natural numbers. Thus $E \subseteq \mathbb{N}$. We claim that $E$ is countable. In fact, the elements of $E$ can be arranged into a sequence by means of a function $s : n \mapsto 2n$ that is both an injection and a surjection of $\mathbb{N}$ onto $E$. That is, the sequence is given by $s_n = 2n$. It may surprise the reader that the elements of an infinite set can be paired one-to-one with those of a proper subset.
• EXAMPLE 1.14
We will prove the surprising and useful fact that the set $\mathbb{Q}$ of all rational numbers is countable. It is important to understand that if a sequence $s_n$ is to include all the rational numbers, then these numbers cannot be listed in order of size. That is, if $s_n < s_{n+1}$, both in $\mathbb{Q}$, then $\frac{s_n + s_{n+1}}{2}$ lies between them and is again rational. Hence there is no next smallest rational number after $s_n$. We can explain how to list the rational numbers in a sequence, disregarding the order relation, as follows. We are going to consider a table of numbers with infinitely many rows. The entry in the $m$th row and $n$th column will be the fraction $\frac{n}{m}$. Here $m \in \mathbb{N}$ and $n \in \mathbb{Z}$. Thus there will be a first row, in which each denominator is understood to be 1, but no last row. Each row will extend endlessly to the left and to the right. We can draw only part of this table below.
...   -4     -3     -2     -1     0     1     2     3     4    ...
...  -4/2   -3/2   -2/2   -1/2   0/2   1/2   2/2   3/2   4/2   ...
...  -4/3   -3/3   -2/3   -1/3   0/3   1/3   2/3   3/3   4/3   ...
...  -4/4   -3/4   -2/4   -1/4   0/4   1/4   2/4   3/4   4/4   ...
...
We will describe a systematic expanding search pattern that reaches each term on the infinite table after some finite number of terms in the sequence described below. We will list side-by-side those terms $\frac{n}{m}$ for which
$$|m| + |n| = k,$$
beginning with $k = 1$, $k = 2$, and so on. If parentheses are placed around a number, we are skipping that number because it was already listed previously. Here is the resulting list:
$$0,\ -1,\ \left(\tfrac{0}{2} = 0\right),\ 1,\ -2,\ -\tfrac{1}{2},\ \left(\tfrac{0}{3} = 0\right),\ \tfrac{1}{2},\ 2,\ -3,\ \left(-\tfrac{2}{2} = -1\right),\ -\tfrac{1}{3},\ \left(\tfrac{0}{4} = 0\right),\ \tfrac{1}{3},\ \left(\tfrac{2}{2} = 1\right),\ 3,\ \ldots.$$
It is clear that this expanding search pattern eventually reaches any rational number $\frac{n}{m}$ that one might choose, and each rational number is listed exactly
once in the resulting endless sequence. The first several terms of the sequence $s_n$, corresponding to $k = 1, 2, 3, 4$, are
$$0,\ -1,\ 1,\ -2,\ -\tfrac{1}{2},\ \tfrac{1}{2},\ 2,\ -3,\ -\tfrac{1}{3},\ \tfrac{1}{3},\ 3,\ \ldots.$$
We will see several applications of the countability of $\mathbb{Q}$ in this book. However, for now we describe a startling example.
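The expanding search pattern of Example 1.14 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `enumerate_rationals` and the use of a `seen` set to skip repeated values are choices made here, not part of the text. For each $k$ the code scans the fractions $\frac{n}{m}$ with $m + |n| = k$, from the most negative numerator to the most positive.

```python
from fractions import Fraction
from itertools import islice

def enumerate_rationals():
    """Yield each rational once, scanning n/m with m + |n| = k for k = 1, 2, ..."""
    seen = set()  # values already listed (the parenthesized terms are skipped)
    k = 1
    while True:
        # For fixed k, n runs from -(k-1) to k-1 and the denominator is m = k - |n| >= 1.
        for n in range(-(k - 1), k):
            q = Fraction(n, k - abs(n))
            if q not in seen:
                seen.add(q)
                yield q
        k += 1

# The terms produced by k = 1, 2, 3, 4.
first_terms = list(islice(enumerate_rationals(), 11))
```

Every fraction $\frac{n}{m}$ is reached when $k = m + |n|$, which is why no rational number is missed.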
• EXAMPLE 1.15 We will describe a set $O$ that is both open and dense in $\mathbb{R}$, yet which is quite small. Let $\epsilon > 0$ be a small positive number. Consider the line segment $[0, \epsilon]$ of length $\epsilon$. We will construct a sequence of intervals $(a_k, b_k)$, each of length $\frac{\epsilon}{2^k}$. That is, the first interval $(a_1, b_1)$ will have length $\frac{\epsilon}{2}$. This leaves half of $[0, \epsilon]$ remaining. But for $(a_2, b_2)$ we will use only half that remainder: namely, $\frac{\epsilon}{4}$. Then $b_3 - a_3$ will be taken to be $\frac{\epsilon}{8}$, or half of the remaining $\frac{\epsilon}{4}$ from the original interval $[0, \epsilon]$. Let $\mathbb{Q} = \{s_1, s_2, \ldots, s_k, \ldots\}$, which can be arranged since $\mathbb{Q}$ is countable, as explained above. Let $(a_1, b_1)$ be centered around $s_1$, $(a_2, b_2)$ centered around $s_2$, and in general $(a_k, b_k)$ will be centered around the point $s_k$. For any finite subcollection of the intervals $(a_k, b_k)$, $k = 1, 2, 3, \ldots$, the sum of the lengths of the finitely many intervals chosen must be less than $\epsilon$. That is because the whole infinite sequence of intervals is chosen by cutting $\epsilon$ in half again and again without end.
Now consider that $\mathbb{Q}$ is dense in $\mathbb{R}$. But if we let $O = \bigcup_{k=1}^{\infty} (a_k, b_k)$, then $\mathbb{Q} \subset O$ and so $O$ is also dense in $\mathbb{R}$. Moreover, $O$ is open by Theorem 1.7.1. We claim that $O$ is a small set in the following sense. Let $[a, b]$ be any closed finite interval of length $\ge \epsilon$. We claim it is impossible for $[a, b] \subseteq O$. In fact, if $[a, b]$ were a subset of $O$, then
$$[a, b] \subseteq \bigcup_{k=1}^{\infty} (a_k, b_k),$$
an open cover of $[a, b]$. By the Heine-Borel theorem, there must be a finite number of intervals from among the $(a_k, b_k)$'s that cover $[a, b]$. Yet the sum of the lengths of these finitely many intervals must be less than $\epsilon \le b - a$. This is impossible.
It is interesting to compare the preceding example with Exercise 1.91. The interested student can learn much more about surprising subsets of the line in the book by Gelbaum and Olmsted [7]. It is natural to wonder at this point whether or not perhaps every infinite set is countable. The answer is no, as is shown by the following surprising theorem.
Theorem 1.8.1 (Cantor) The set $\mathbb{R}$ of real numbers is uncountable. (That is, it is impossible to include all the real numbers in a sequence.)
Proof: We begin by noting that the (possibly endless) decimal expansions of real numbers are not unique, because an infinite tail of 9's can always be replaced by an expansion ending in an infinite tail of 0's. For example,
$$0.999\ldots = 1.000\ldots$$
This is understood in the sense that if $x_n = 0.999\ldots9$ with $n$ 9's, then
$$|x_n - 1| = \frac{1}{10^n} \to 0$$
as $n \to \infty$. But if we agree not to allow endless tails of 9's, then decimal expansions of real numbers are unique. Moreover, every infinite decimal representation corresponds to a real number. The reason for this fact is as follows. Consider any infinite decimal expression. It could be written in terms of a whole number $K$ in the form
$$K + 0.d_1 d_2 d_3 \ldots d_n \ldots,$$
where $d_n$ is the $n$th digit to the right of the decimal point. Then let
$$x_n = K + 0.d_1 d_2 \ldots d_n.$$
It follows that if $m$ and $n$ are both greater than $N$, then
$$|x_n - x_m| < \frac{1}{10^N} \to 0$$
as $N \to \infty$. Hence the sequence $x_n$ of truncations of the endless decimal expression to $n$ digits is itself a Cauchy sequence. By the completeness axiom this sequence $x_n$ must converge to a limit $x \in \mathbb{R}$. That is why we say that the endless decimal represents $x$.
Now we suppose that Cantor's theorem were false and deduce a contradiction. Suppose therefore that all real numbers could be placed into a sequence. Then there would be a subsequence $x_n$ containing all the real numbers in $[0, 1)$. We denote the decimal expansions of the numbers $x_n$ in a vertical column below.
$$x_1 = .d_{11} d_{12} d_{13} \ldots d_{1k} \ldots$$
$$x_2 = .d_{21} d_{22} d_{23} \ldots d_{2k} \ldots$$
$$x_3 = .d_{31} d_{32} d_{33} \ldots d_{3k} \ldots$$
Now we obtain a contradiction by constructing a number $x \in [0, 1)$ that is not in the sequence $x_n$. We define $x$ by the digits $d_k$ in its decimal expansion. If $d_{11} \neq 0$, we let $d_1 = 0$. If $d_{11} = 0$, we let $d_1 = 1$. If $d_{22} \neq 0$, we let $d_2 = 0$. If $d_{22} = 0$, we let $d_2 = 1$. In general, if $d_{kk} \neq 0$, we let $d_k = 0$, but if $d_{kk} = 0$, then we let $d_k = 1$. We observe that $x = .d_1 d_2 d_3 \ldots d_k \ldots \in [0, 1)$, yet $x \notin \{x_n\}$ since for all $n$, $x$ differs from $x_n$ in the $n$th decimal digit. (Since every digit of $x$ is 0 or 1, its expansion has no endless tail of 9's, so the uniqueness of such expansions guarantees $x \neq x_n$.) •
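The digit rule in the proof can be tried on a finite list of expansions. The sketch below is only an illustration: the function name `diagonal_digits` and the sample rows are hypothetical, and a finite list can of course only mimic the first few steps of the infinite construction.

```python
def diagonal_digits(expansions):
    """Return the digits d_1 d_2 ... d_k chosen by the diagonal rule.

    Row k of `expansions` holds the decimal digits of x_k as a string.
    """
    digits = []
    for k, row in enumerate(expansions):
        # Rule from the proof: d_k = 0 if d_kk != 0, and d_k = 1 if d_kk == 0.
        digits.append("0" if row[k] != "0" else "1")
    return "".join(digits)

# Hypothetical sample: the first four digits of four numbers in [0, 1).
rows = ["1415", "0000", "5050", "9990"]
x = diagonal_digits(rows)
```

By construction the $k$th digit of `x` differs from the $k$th digit of row $k$, which is exactly the property the proof uses.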
EXERCISES
1.87 † a) If $A$ and $B$ are each countable sets, show that $A \cup B$ is countable. (Hint: For each set, consider a sequence of all elements, and show how to splice the sequences together to make one sequence. Remember that the sets need not be disjoint.)
b) Prove that the union of countably many finite sets is either countable or finite.
1.88 † If $A_n$ is a countable set for each $n \in \mathbb{N}$, show that
$$\bigcup_{n=1}^{\infty} A_n$$
is again a countable set. (Hint: Explain why each set $A_n$ can be written in the form
$$A_n = \{a_{nk} \mid k \in \mathbb{N}\},$$
but these sets need not be disjoint from one another. Consider an array similar to that displayed in the proof of Cantor's Theorem in this section, but reason in a manner similar to that in Example 1.14.)
1.89 Is the set $\mathbb{Z}$ of integers countable? Why or why not? How about the set of all odd positive integers? Even integers?
1.90 Show that the set $I$ of all irrational numbers must be uncountable. (Hint: Use Exercise 1.87.)
1.91 A subset $E \subset \mathbb{R}$ is called closed if and only if its complement $\mathbb{R} \setminus E$ is open. (For example, $\mathbb{R}$ itself is a closed set since $\mathbb{R} \setminus \mathbb{R} = \emptyset$ is an open set.) Prove that a closed set $E$ that is also dense in $\mathbb{R}$ must be all of $\mathbb{R}$. (Hint: Suppose the claim were false, so that $\mathbb{R} \setminus E$ is a nonempty open set. Deduce a contradiction.)
1.92 Referring to the definition in Exercise 1.91, answer the following questions.
a) Prove that every closed finite interval $[a, b]$ is a closed set.
b) Give an example of a subset $E \subset \mathbb{R}$ for which $E$ is neither open nor closed. Justify your example.
c) Give an example of a set $S \subseteq \mathbb{R}$ that is both open and closed.
1.93 ◊ Prove that every open set $S \subseteq \mathbb{R}$ can be expressed as the union of a countable set of open intervals. Hint: Let $S \cap \mathbb{Q} = \{q_n \mid n \in \mathbb{N}\}$ be a sequence listing all the rational numbers in $S$. Let $r_n = \sup\{r \mid (q_n - r, q_n + r) \subseteq S\}$.
1.94 Prove that every subset $E$ of $\mathbb{R}$ is the union of some family of closed sets. Can every subset $E$ of $\mathbb{R}$ be the union of a family of open sets? Prove your answer.
1.95 Let $S = \mathbb{Q} \cap [0, 1]$. Then $S$ is countable, so we can write
$$S = \{s_n \mid n \in \mathbb{N}\}.$$
We follow the model of Example 1.15 using $\epsilon = 1/2$. Thus, for each $n$, $(a_n, b_n)$ is an open interval centered about $s_n$ and $b_n - a_n = \frac{1}{2^{n+1}}$.
a) Show that $O = \bigcup_{n=1}^{\infty} (a_n, b_n)$ is an open subset of $\mathbb{R}$, and that every point of $[0, 1]$ is the limit of a sequence of points from $O$.
b) Use the Heine-Borel Theorem to prove that $\mathcal{O} = \{(a_n, b_n) \mid n \in \mathbb{N}\}$ is not an open cover of $[0, 1]$.
1.96 ◊ A real number $a$ is called an algebraic number provided there exists a polynomial equation $p(x) = 0$ with integer coefficients such that $p(a) = 0$.
a) Let $P_{N,n}$ denote the set of all polynomials with integer coefficients of the form $p(x) = a_n x^n + \cdots + a_1 x + a_0$ for which the sum of the absolute values of the coefficients is bounded by $N$. That is,
$$\sum_{k=0}^{n} |a_k| \le N.$$
Show that $P_{N,n}$ is a finite set.
b) Prove that the set of algebraic numbers is countable. (Hint: Consider first the set of those numbers that are roots of a polynomial equation of degree $n$ with integer coefficients.)
1.97 A real number is called transcendental provided that it is not algebraic. Prove that the set of all transcendental numbers is uncountable.
Remark 1.8.1 The method of proof employed in Cantor's theorem is known as the Cantor diagonalization process after its inventor, Georg Cantor (1845-1918). The discovery that some infinite sets are significantly larger than others, as uncountable sets are larger than countable ones, led to the invention of the subject of transfinite arithmetic. The student who is curious to learn more about this may enjoy the classic book by E. Kamke [11]. It is interesting to note that Cantor embarked upon his study of transfinite sets with particular applications to analysis in mind. So-called trigonometric series, or Fourier series, are representations of suitable functions as sums of perhaps infinitely many sine and cosine waves of various periods. Such representations had been shown by Fourier to be very useful for the solution of the heat equation in physics. There were, however, major difficulties regarding the uniqueness of these representations and the actual pointwise convergence of the sums of sine and cosine waves to the function under study. In the long run, it turned out that a different development undertaken by Henri Lebesgue (the Lebesgue integral) was more effective than set theory for this application. However, Cantor's research cast a new light upon the whole of mathematics, far beyond the applications that motivated the initial study. This is a good example of how investigation of an interesting question can lead to vast and totally unanticipated branches of mathematical knowledge.
The interested reader can find this and many other historical topics in mathematics at the website of the MacTutor History of Mathematics archive (see footnote 7) at the University of St. Andrews in Scotland.
1.9 TEST YOURSELF Test Yourself sections, found at the end of each chapter, contain short questions to check your understanding of basic concepts and examples. Proofs are not tested in these sections, since proofs must be read individually by the student's teacher or teaching assistant.
EXERCISES 1.98 E = 1 ~0 • Find a number 8 > 0 small enough so that Ia- bl < 8 and lc- bl < 8 implies Ia- cl <E. 1.99 The sequence Xn begins as follows: 0, 1, ~, 2, ~, ~, 3, continues according to the same pattern. a) True or False: limn-.oo lxn - Xn+ll = 0. b) True or False: Xn is a Cauchy sequence.
i, 144 , ~, 4, ... and
1
1.100 Give an example of two sequences, Xn and Yn =/:. 0 such that XnYn converges, ~ converges, but neither Xn nor Yn converges. 1.101 Let Xn = (( -l)n + 1) limsupxn.
+ 2~
for all n E N. Find both liminf Xn and
1.102 Give an example of two sequences of real numbers $x_n$ and $y_n$ for which $\liminf (x_n + y_n) = 0$ but $\liminf x_n = -\infty = \liminf y_n$.
1.103 State True or Give a Counterexample: If $x_n$ is an unbounded sequence, then $x_n$ has no convergent subsequences.
1.104 Give an example of a decreasing nest of nonempty open intervals $(a_n, b_n)$ such that $b_n - a_n \to 0$ but $\bigcap_{n=1}^{\infty} (a_n, b_n) = \emptyset$.
1.105
True or False: The set S = {;:; I m E Z, n E N} is dense in R
1.106 Let $E = \{\frac{1}{n} \mid n \in \mathbb{N}\}$. Find an open cover $\mathcal{O} = \{O_n \mid n \in \mathbb{N}\}$ of $E$ that has no finite subcover.
1.107 True or False: The set $\mathbb{Q}$ is closed in the real line $\mathbb{R}$.
1.108 True or False: The set $S = \{0.d_1 d_2 \ldots d_n \mid n \in \mathbb{N}\}$ of all finitely long decimal expansions (with each $d_i$ an integer between 0 and 9) is countable.
1.109
True or False: The setS= { ~V2~
~
E
Q} is uncountable.
7 http://www-history.mcs.st-andrews.ac.uk/history/
1.110 Let Xn = 1 + implies lxn - 11 < f.
<Jn". Iff> 0 find aN EN sufficiently big so that n''?: N
1.111 True or Give a Counterexample: A bounded sequence times a convergent sequence must be a convergent sequence.
1.112 Find $\bigcap_{n=1}^{\infty} (-\infty, -n]$.
1.113
Give an example of an open cover 0 = {On
lin E N} of the set
such that S has no finite subcover from 0. 1.114
Let
$$E = \left\{\tfrac{1}{n} \mid n \in \mathbb{N}\right\} \cup \{0\}.$$
True or False: The set $\mathbb{R} \setminus E$, that is, the complement of $E$, is an open subset of $\mathbb{R}$.
CHAPTER 2
CONTINUOUS FUNCTIONS
2.1 LIMITS OF FUNCTIONS
During the 19th century, many mathematicians worked to identify those classes of functions to which useful techniques, such as differentiation, integration, and decomposition into infinite series could be applied correctly. Mathematicians and physical scientists had worked with many types of functions described by formulas. These included polynomials, quotients of polynomials, and combinations of these with roots and powers. Sines, cosines, tangents, exponential and logarithmic functions played a role too. But in order to discover a coherent body of theorems that would explain to which functions what techniques could be properly applied, it was necessary to identify the underlying properties that functions would need to have for certain theorems to work. Those properties could be shared by functions that might be described in terms of very different-looking formulas.
The formal concept of a function includes all the familiar examples and many others that are not so easily described. A function $f$ assigns a numerical value $f(x)$ to each point $x$ in the domain $D_f$ on which $f$ is defined. The range of $f$ is the set $R_f = \{f(x) \mid x \in D_f\}$.
Advanced Calculus: An Introduction to Linear Analysis. By Leonard F. Richardson Copyright© 2008 John Wiley & Sons, Inc.
Since derivatives, integrals, and infinite series are all defined in terms of limits, we need to define the concept of the limit of $f(x)$ as $x \to a$. In order to understand the subtleties of the definition that is required, we look ahead to one of the most important applications of this concept, which will be the definition of the derivative of a function. The student will recall that in elementary calculus the derivative of $f$ at a point $x$ is defined as the limit as $h \to 0$ of the difference quotient
$$Q(h) = \frac{f(x + h) - f(x)}{h}.$$
Thus $f'(x) = \lim_{h \to 0} Q(h)$. It is important to note that we wish to define the limit of $Q(h)$ as $h \to 0$, although $Q$ is not even defined at 0. Thus we are forced to define $\lim_{x \to a} f(x)$ in such a way that it will be irrelevant whether or not the function $f$ is actually defined at $a$. On the other hand, for $\lim_{x \to a} f(x)$ to make sense, it will have to be possible for $x$ to become as close as we like to $a$ without $x$ being $a$ itself. Thus we formulate the following preliminary concept.
Definition 2.1.1 A point $a$ is called a cluster point of the set $D$ if and only if for all $\delta > 0$ there exists $x \in D$ such that $0 < |x - a| < \delta$.
Thus a cluster point $a$ of a set $D$ has the property that it is always possible to find points $x \in D$ for which $x \neq a$ and yet $x$ is as close to $a$ as we like.
• EXAMPLE 2.1 Let $D = [0, 1)$, the interval closed at 0 but open at 1. Then the set of cluster points of $D$ is the interval $[0, 1]$. See Exercise 2.1.
• EXAMPLE 2.2 Let $D = \{\frac{1}{n} \mid n \in \mathbb{N}\}$. Then $D$ has only one cluster point: namely, the point 0. Note that in this example $0 \notin D$. See Exercise 2.2.
We are ready to define the concept of the limit of a function, which the reader should compare with Exercise 2.18.
Definition 2.1.2 Let $a$ be any cluster point of the domain $D_f$ of a function $f$. Then
$$\lim_{x \to a} f(x) = L$$
if and only if for each $\epsilon > 0$ there exists $\delta > 0$ such that
$$x \in D_f \text{ and } 0 < |x - a| < \delta \implies |f(x) - L| < \epsilon.$$
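Definition 2.1.2 can be probed numerically, although a finite check never replaces a proof. The sketch below is an illustration only: the helper `satisfies_epsilon_delta`, the sample function $3x + 2$, and the sampling scheme are assumptions introduced here. It tests the implication at finitely many points $x$ with $0 < |x - a| < \delta$.

```python
def satisfies_epsilon_delta(f, a, L, eps, delta, samples=1000):
    """Spot-check |f(x) - L| < eps at sample points with 0 < |x - a| < delta."""
    for i in range(1, samples + 1):
        h = delta * i / (samples + 1)  # guarantees 0 < h < delta
        for x in (a - h, a + h):
            if abs(f(x) - L) >= eps:
                return False
    return True

def line(x):
    """Hypothetical sample function f(x) = 3x + 2, whose limit at a = 1 is 5."""
    return 3 * x + 2

eps = 0.01
good = satisfies_epsilon_delta(line, 1.0, 5.0, eps, delta=eps / 3)  # delta = eps/|m|
bad = satisfies_epsilon_delta(line, 1.0, 5.0, eps, delta=1.0)       # delta too large
```

A result of `True` shows only that no sampled point violates the inequality, while `False` exhibits a genuine violation for that choice of $\delta$.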
• EXAMPLE 2.3 We present four useful examples of limits.
1. We claim that for all $m$ and $b$ in $\mathbb{R}$ we have
$$\lim_{x \to a} (mx + b) = ma + b.$$
We understand implicitly that the domain of $mx + b$ is the whole real line $\mathbb{R}$. Now, let $\epsilon > 0$. We need $\delta > 0$ such that $0 < |x - a| < \delta$ implies
$$|(mx + b) - (ma + b)| < \epsilon.$$
But the latter inequality is equivalent to
$$|m(x - a)| = |m||x - a| < \epsilon,$$
to which the solution is
$$|x - a| < \frac{\epsilon}{|m|},$$
provided that $m \neq 0$, so that $\delta = \frac{\epsilon}{|m|}$ will do. (If $m = 0$, then all $x$ would satisfy the required inequality.)
2. We claim that
$$\lim_{x \to 1} \frac{x^2 - 1}{x - 1} = 2.$$
Since the concept of limit does not permit $x = 1$, we can say that for all $x \neq 1$ we have $\frac{x^2 - 1}{x - 1} = x + 1 \to 2$ as $x \to 1$.
3. Let
$$f(x) = \begin{cases} 0 & \text{if } x \neq 0, \\ 1 & \text{if } x = 0. \end{cases}$$
Then $\lim_{x \to 0} f(x) = 0 \neq f(0)$.
4. Let
$$f(x) = \begin{cases} 1 & \text{if } x \ge 0, \\ 0 & \text{if } x < 0. \end{cases}$$
Then $\lim_{x \to 0} f(x)$ does not exist. In fact, no matter how small we make $\delta > 0$, the inequality $0 < |x - 0| < \delta$ will be satisfied by points $x$ for which $f(x)$ can be either 1 or 0, and we cannot force both 1 and 0 to be simultaneously within arbitrarily small $\epsilon > 0$ of any one number $L$.
In mathematics it is very important to have equivalent forms of definitions. For some purposes, one form of a definition is most convenient to use; for other purposes, another form is more suitable.
Theorem 2.1.1 Suppose $a$ is a cluster point of the domain $D_f$ of $f$. Then the following two statements are logically equivalent.
i. $\lim_{x \to a} f(x) = L$.
ii. For every sequence of points $x_n \in D_f \setminus \{a\}$ such that $x_n \to a$, we have the sequence $f(x_n) \to L$. (This is called the Sequential Criterion for Limits.)
Proof: Let us prove first that (i) implies (ii). So suppose (i) and consider any sequence of points $x_n \in D_f \setminus \{a\}$ such that $x_n \to a$. We know for all $\epsilon > 0$ there exists $\delta > 0$ such that $x \in D_f$ and $0 < |x - a| < \delta$ implies $|f(x) - L| < \epsilon$. But since $x_n \to a$, there exists $N$ such that $n \ge N$ implies $0 < |x_n - a| < \delta$. This implies $|f(x_n) - L| < \epsilon$, so that $f(x_n) \to L$.
Next we prove (ii) implies (i). So suppose (ii) is true. We need to show $\lim_{x \to a} f(x) = L$. We will show that if this conclusion were false then a contradiction would result. But if it were false that $\lim_{x \to a} f(x) = L$, then there would exist $\epsilon > 0$ such that for all $\delta > 0$ there exists $x \in D_f$ such that $0 < |x - a| < \delta$ and yet $|f(x) - L| \ge \epsilon$. In particular, if we let $\delta_n = \frac{1}{n}$, then we get $x_n$ such that
$$0 < |x_n - a| < \frac{1}{n} \text{ and } |f(x_n) - L| \ge \epsilon.$$
Now, $x_n \in D_f \setminus \{a\}$ and $x_n \to a$ yet $f(x_n) \not\to L$. This contradicts (ii), which was assumed to be true. •
Remark 2.1.1 Since this is the first theorem of Chapter 2, we remind the reader of the importance of writing out a full analysis of each proof in the reader's own words and to his or her own satisfaction. This has been discussed in the Introduction on page xxiii. For the first implication, (i) implies (ii), the reasoning follows from a careful reading of the relevant definitions. The opposite implication is trickier and benefits much from the choice of an indirect method of proof. Without the indirect proof, it would be very difficult to establish that there is a suitable $\delta$ corresponding to each $\epsilon > 0$, just from knowing that $x_n \to a$ implies $f(x_n) \to L$.
If we have two functions $f$ and $g$ we can form the sum, difference, and product of these functions with the domain of this combination being $D_f \cap D_g$. On the other hand, the domain $D_{f/g}$ of the quotient will be only those points of $D_f \cap D_g$ for which $g(x) \neq 0$.
Corollary 2.1.1 Suppose $a$ is a cluster point of the intersection $D_f \cap D_g$ of the domains of $f$ and $g$, and suppose further that $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = M$. Then
i. $\lim_{x \to a} (f \pm g)(x) = L \pm M$
ii. $\lim_{x \to a} (fg)(x) = LM$
iii. $\lim_{x \to a} \frac{f}{g}(x) = \frac{L}{M}$ provided that $M \neq 0$ and that $a$ is a cluster point of $D_{f/g}$.
Proof: Consider first conclusion (i). Consider a sequence of points $x_n \in (D_f \cap D_g) \setminus \{a\}$ such that $x_n \to a$. Then
$$(f \pm g)(x_n) = f(x_n) \pm g(x_n) \to L \pm M$$
by the corresponding theorem for limits of sequences. The proof of (ii) is very similar. The proof of (iii) is Exercise 2.3. •
Remark 2.1.2 It is possible to make up many variations on the definition of limit of a function, for assorted specialized uses. Here are two examples, in which we adapt the sequential condition of Theorem 2.1.1 to generalize the concept of limit. The reader should compare the definitions below with the statements in Exercise 2.17.
Definition 2.1.3 We define limits at infinity and also one-sided limits as follows.
i. If the domain $D_f$ is not bounded above, we say $f(x) \to L$ as $x \to \infty$ provided that for all sequences $x_n \in D_f$ such that $x_n \to \infty$ we have $f(x_n) \to L$.
ii. If $a$ is a cluster point of $D_f \cap (a, \infty)$, we say $\lim_{x \to a+} f(x) = L$, provided that for all sequences $x_n \in D_f \cap (a, \infty)$ such that $x_n \to a$ we have $f(x_n) \to L$. In this case, we write $f(a+) = \lim_{x \to a+} f(x)$.
In both cases, $L$ can be a real number, in which case we speak of convergence of $f(x)$ to $L$. However, if $L$ is $\pm\infty$, then we speak of the divergence of $f(x)$ to $L$.
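The sequential definitions above lend themselves to a quick experiment. The sketch below is an illustration under stated assumptions: the function `step` (1 for $x \ge 0$, 0 for $x < 0$) and the three test sequences are choices made here. Sequences approaching 0 from the right and from the left give different limits, and an alternating sequence gives values that do not settle, which by the Sequential Criterion of Theorem 2.1.1 rules out a two-sided limit at 0.

```python
def step(x):
    """Step function assumed for this sketch: 1 for x >= 0 and 0 for x < 0."""
    return 1 if x >= 0 else 0

# x_n = 1/n approaches 0 from the right; x_n = -1/n approaches 0 from the left.
right = [step(1 / n) for n in range(1, 50)]
left = [step(-1 / n) for n in range(1, 50)]

# x_n = (-1)^n / n approaches 0 while changing sides at every term.
alternating = [step((-1) ** n / n) for n in range(1, 50)]
```

Here `right` is constantly 1 and `left` is constantly 0, while `alternating` keeps taking both values.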
EXERCISES
2.1 † Prove the claim of Example 2.1.
2.2 † Prove the claim of Example 2.2.
2.3 † Prove part (iii) of Corollary 2.1.1.
2.4 Find all the cluster points of the set
2.5 Find all the cluster points of the set $\mathbb{Z}$ of all integers, and justify your conclusion.
2.6 Show that if
$$f(x) = \begin{cases} \sin\frac{1}{x} & \text{if } 0 < x \le 1, \\ 0 & \text{if } x = 0, \end{cases}$$
then $\lim_{x \to 0+} f(x)$ does not exist. (See Fig. 2.2.)
2.7 a) Suppose $f(x) \le g(x) \le h(x)$ for all $x \in (a - \delta, a + \delta) \setminus \{a\}$. Suppose also that $\lim_{x \to a} f(x)$ and $\lim_{x \to a} h(x)$ both exist and equal $L \in \mathbb{R}$. Prove that $\lim_{x \to a} g(x)$ exists and equals $L$. (This statement is sometimes called the squeeze theorem or the sandwich theorem for functions.)
b) Use the squeeze theorem for functions to prove that
$$\lim_{x \to 0} \frac{\sin x}{x}$$
exists and equals 1. (Hint: You may assume your prior knowledge that $\lim_{x \to 0} \sec x = 1$.)
2.8 † A function $f$ is called monotone increasing, denoted by $f \nearrow$, provided whenever $x_1 < x_2$ in $D_f$ we have $f(x_1) \le f(x_2)$. Prove: If $f$ is monotone increasing on $\mathbb{R}$, then for all $a \in \mathbb{R}$, $\lim_{x \to a+} f(x)$ exists and is a real number. The latter limit is called the limit from the right, and it is denoted by $f(a+)$. (Hint: Let $S = \{f(x) \mid x > a\}$ and show that $\inf(S)$ is a real number $L$. Then show that for every sequence $x_n \to a+$ we must have $f(x_n) \to L$.)
2.9 † Adapt Definition 2.1.3 to formulate the concept of a limit from the left, $\lim_{x \to a-} f(x)$, and prove that if $f$ is increasing on $\mathbb{R}$ then for all $a \in \mathbb{R}$ we have $\lim_{x \to a-} f(x) = f(a-)$ exists and is a real number.
2.10 a) Suppose $f : D \to \mathbb{R}$ and $a$ is a cluster point of $D \cap (a, \infty)$. Prove that $\lim_{x \to a+} f(x)$ exists and equals $L \in \mathbb{R}$ if and only if for each $\epsilon > 0$ there exists $\delta > 0$ such that $x \in D \cap (a, a + \delta)$ implies $|f(x) - L| < \epsilon$.
b) Suppose $f : D \to \mathbb{R}$ and $a$ is a cluster point of $D \cap (-\infty, a)$. Prove that $\lim_{x \to a-} f(x)$ exists and equals $L \in \mathbb{R}$ if and only if for each $\epsilon > 0$ there exists $\delta > 0$ such that $x \in D \cap (a - \delta, a)$ implies $|f(x) - L| < \epsilon$.
2.11 Use the Definition you constructed in Exercise 2.9 to prove that if $a$ is a cluster point of both $D_f \cap (a, \infty)$ and $D_f \cap (-\infty, a)$, then $\lim_{x \to a} f(x)$ exists and equals $L$ if and only if $\lim_{x \to a+} f(x)$ and $\lim_{x \to a-} f(x)$ both exist and equal $L$. (Hint: The result of Exercise 2.10 may be helpful.)
2.12 † A function $f$ is said to have a jump discontinuity at $a$ if and only if $\lim_{x \to a-} f(x) \neq \lim_{x \to a+} f(x)$ although both one-sided limits exist. Generalize Exercises 2.8 and 2.9 to show that if a monotone function $f$ has a discontinuity at a point then that discontinuity must be a jump discontinuity.
2.13 Find
a) $\lim_{x \to a} \frac{x^2 - a^2}{x - a}$
b) $\lim_{x \to a} \frac{x^n - a^n}{x - a}$, where $n \in \mathbb{N}$.
2.14 Show
provided that $b_n \neq 0$.
2.15 Suppose there exists $a \in \mathbb{R}$ such that $x > a$ implies $f(x) > 0$. Prove:
$$\lim_{x \to \infty} f(x) = \infty \iff \lim_{x \to \infty} \frac{1}{f(x)} = 0.$$
2.16 Prove the Cauchy Criterion for Limits of Functions: Suppose $a$ is a cluster point of the domain $D_f$ of $f$. Then $\lim_{x \to a} f(x)$ exists if and only if for all $\epsilon > 0$ there exists $\delta > 0$ such that $x$ and $y$ in
$$(D_f \setminus \{a\}) \cap (a - \delta, a + \delta)$$
implies $|f(x) - f(y)| < \epsilon$.
2.17 † Use Definition 2.1.3 to prove the following statements.
a) Suppose $f(x) \to L \in \mathbb{R}$ as $x \to \infty$. Prove that for each $\epsilon > 0$ there exists $M > 0$ such that for all $x \in D_f \cap (M, \infty)$ we have $|f(x) - L| < \epsilon$.
b) Suppose $D_f$ is not bounded above, and suppose that for all $\epsilon > 0$ there exists $M > 0$ such that $x \in D_f \cap (M, \infty) \implies |f(x) - L| < \epsilon$. Prove that $f(x) \to L \in \mathbb{R}$ as $x \to \infty$.
c) Suppose that $a$ is a cluster point of $D_f \cap (a, \infty)$ and $\lim_{x \to a+} f(x)$ exists and equals $L \in \mathbb{R}$. Prove that for each $\epsilon > 0$ there exists $\delta > 0$ such that $x \in D_f \cap (a, a + \delta)$ implies $|f(x) - L| < \epsilon$.
d) Suppose that $a$ is a cluster point of $D_f \cap (a, \infty)$, and suppose that for each $\epsilon > 0$ there exists $\delta > 0$ such that $x \in D_f \cap (a, a + \delta)$ implies $|f(x) - L| < \epsilon$. Prove that $\lim_{x \to a+} f(x)$ exists and equals $L \in \mathbb{R}$.
2.18 Let $f : E \to \mathbb{R}$ and $\epsilon > 0$ be arbitrary. If $p$ is not a cluster point of $E$ and if $L \in \mathbb{R}$ is arbitrary, prove that there exists $\delta > 0$ with the following property: for all $e \in E$ such that $0 < |e - p| < \delta$ we have $|f(e) - L| < \epsilon$. (This exercise explains why the concept of limit of a function is meaningless at points that are not cluster points of the domain $E$ of $f$.)
2.2 CONTINUOUS FUNCTIONS The intuitive concept is that a function is continuous (on an interval) provided its graph can be drawn without lifting the pencil from the paper. The formal concept is meant to embody the requirement that there can be no abrupt jumps or gaps in the values of the function, but it says more than this.
Definition 2.2.1 A function $f$ is called continuous at a point $a \in D$, the domain of $f$, provided that for all $\epsilon > 0$ there exists $\delta > 0$ such that
$$x \in D \cap (a - \delta, a + \delta) \implies |f(x) - f(a)| < \epsilon.$$
If $f$ is continuous at every point $a \in D$, we say $f \in C(D)$, the set of all continuous functions on $D$. We remark that in the special case in which $D$ is an interval, such as $[a, b] \subset \mathbb{R}$, we denote $C(D)$ by $C[a, b]$.
There is a surprising degenerate case of continuity. If there exists $\delta > 0$ such that $D_f \cap (a - \delta, a + \delta) = \{a\}$, then the stipulation in the definition of continuity is true, in that there are no points in the designated intersection other than $a$ itself, and of course $|f(a) - f(a)| < \epsilon$.
Definition 2.2.2 We call a point $a \in D$ an isolated point of $D \subseteq \mathbb{R}$, provided that there exists $\delta > 0$ such that $D \cap (a - \delta, a + \delta) = \{a\}$.
In other words, if $D_f$ has any isolated points $a$, then $f$ is automatically continuous at $a$. The more interesting case, however, is that of a nonisolated point $a \in D_f$. Note that $a \in D_f$ is a nonisolated point, provided that $a \in D_f$ and that $a$ is a cluster point of $D_f$.
Theorem 2.2.1 A function $f$ is continuous at a cluster point $a \in D_f$ if and only if $\lim_{x \to a} f(x)$ exists and
$$\lim_{x \to a} f(x) = f(a).$$
Proof: First suppose $f$ is continuous at $a \in D_f$. Then for all $\epsilon > 0$ there exists $\delta > 0$ such that $|x - a| < \delta$ and $x \in D_f$ implies $|f(x) - f(a)| < \epsilon$. Hence for all $x \in D_f$ such that $0 < |x - a| < \delta$ we have $|f(x) - f(a)| < \epsilon$, which implies $\lim_{x \to a} f(x)$ exists and $\lim_{x \to a} f(x) = f(a)$.
Now suppose $\lim_{x \to a} f(x)$ exists and is $f(a)$. Then for all $\epsilon > 0$ there exists $\delta > 0$ such that $0 < |x - a| < \delta$ and $x \in D_f$ implies $|f(x) - f(a)| < \epsilon$. Since the inequality holds trivially when $x = a$, this implies for all $x \in D_f \cap (a - \delta, a + \delta)$ we have $|f(x) - f(a)| < \epsilon$, so that $f$ is continuous at $a$. •
Corollary 2.2.1 A function $f : D \to \mathbb{R}$ is continuous at $a \in D$ if and only if it satisfies the following Sequential Criterion for Continuity: For every sequence of points $x_n \in D$ such that $x_n \to a$ we have $f(x_n) \to f(a)$.
Proof:
i. Suppose $a$ is an isolated point of $D$. In this case there is little to prove, since $f$ is automatically continuous at $a$ and since $x_n$ as in the Sequential Criterion must have the property that there exists $N \in \mathbb{N}$ such that $n \ge N \implies x_n = a$ and $f(x_n) = f(a)$, which implies that $f(x_n) \to f(a)$.
ii. Suppose $a$ is a cluster point of $D$. We will apply Theorem 2.1.1 and Theorem 2.2.1. If $f$ is continuous at $a$, then $\lim_{x \to a} f(x)$ exists and equals $f(a)$. Hence the Sequential Criterion for Limits implies that the Sequential Criterion for Continuity is satisfied. Conversely, if the Sequential Criterion for Continuity is satisfied, then $\lim_{x \to a} f(x)$ exists and equals $f(a)$.
•
Theorem 2.2.2 Suppose both $f$ and $g$ are continuous at $a \in D_f \cap D_g$. Then
i. $f \pm g$ is continuous at $a$.
ii. $fg$ is continuous at $a$.
iii. $\frac{f}{g}$ is continuous at $a$ provided $g(a) \neq 0$.
iv. If, moreover, $h$ is continuous at $f(a)$, then the composition $h \circ f$ is continuous at $a$, where $h \circ f(x) = h(f(x))$.
Proof: We will apply the Sequential Criterion for Continuity (Corollary 2.2.1) throughout.
i. For the first case, bear in mind that $D_{f \pm g} = D_f \cap D_g$. If $x_n \to a$ with all $x_n \in D_f \cap D_g$, then $f(x_n) \to f(a)$ and $g(x_n) \to g(a)$. We conclude that $\lim_{x \to a} (f \pm g)(x)$ exists and equals
$$\lim_{x \to a} f(x) \pm \lim_{x \to a} g(x) = f(a) \pm g(a).$$
ii. This part has a nearly identical proof.
iii. This part is left to the reader in Exercise 2.19.
iv. We note that $D_{h \circ f} = \{x \in D_f \mid f(x) \in D_h\}$. Now let $x_n \in D_{h \circ f}$ such that $x_n \to a$. Then letting $y_n = f(x_n)$, we have
$$\lim_{n \to \infty} h(f(x_n)) = \lim_{n \to \infty} h(y_n) = h(f(a))$$
since $y_n \to f(a)$ as $n \to \infty$, because $f$ and $h$ are continuous at $a$ and $f(a)$, respectively. •
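Part (ii) of Theorem 2.2.2 is described as having a nearly identical proof. Written out with the same Sequential Criterion, it might read as follows; the appeal to the product rule for limits of sequences is an assumption about what the earlier chapter provides.

```latex
\[
  x_n \in D_f \cap D_g,\quad x_n \to a
  \;\Longrightarrow\;
  (fg)(x_n) = f(x_n)\,g(x_n) \longrightarrow f(a)\,g(a) = (fg)(a).
\]
```

Since this holds for every such sequence, Corollary 2.2.1 gives the continuity of $fg$ at $a$.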
• EXAMPLE 2.4
Examples of Continuity and Discontinuity.
1. Let $i(x) = x$ for all $x \in \mathbb{R}$. Then $i$ is continuous at every point $a \in \mathbb{R}$. In fact, if $x_n \to a$ we have $i(x_n) = x_n \to a = i(a)$ for all $a \in \mathbb{R}$.
2. Let $f(x) = c$, a constant. It is easy to check that $f$ satisfies the Sequential Criterion for Continuity at every $a \in \mathbb{R}$, so that $f \in C(\mathbb{R})$.
3. Let $f(x) = x^2$, for all $x \in \mathbb{R}$. By Theorem 2.2.2, $f \in C(\mathbb{R})$. This generalizes easily to show that $g(x) = x^n$ is in $C(\mathbb{R})$ for all $n \in \mathbb{N}$. Now apply Theorem 2.2.2 to see that every polynomial $p(x) = a_n x^n + \cdots + a_1 x + a_0$ is continuous on all of $\mathbb{R}$ as well. Moreover, every rational function $Q(x) = \frac{p(x)}{q(x)}$, where $p$ and $q$ are polynomials, is continuous wherever defined.
4. Let
f(x) = {
~
if X 2: 0, ifx
< 0.
Then f is discontinuous at 0 but continuous everywhere else. Why? 5. Let
f(x) = {
~
if X 2: 0, if X= -1.
Then f is continuous wherever it is defined. Note that -1 is an isolated point ofDJ.
6. Let
$$f(x) = \begin{cases} x^2 & \text{if } x \in \mathbb{Q}, \\ -x^2 & \text{if } x \notin \mathbb{Q}. \end{cases}$$
We claim that $f$ is continuous at $x = p$ if and only if $p = 0$. Perhaps the easiest way to prove this is to utilize Corollary 2.2.1. Consider two sequences $x_n \in \mathbb{Q}$ and $\tilde{x}_n \notin \mathbb{Q}$ such that $x_n \to p$ and $\tilde{x}_n \to p$. If $f$ is continuous at $p$, then
$$f(x_n) = x_n^2 \to p^2 = f(p) \text{ and } f(\tilde{x}_n) = -\tilde{x}_n^2 \to -p^2 = f(p).$$
Thus if $f$ is continuous at $p$, then $p^2 = -p^2$ and $p = 0$. It remains to prove that $f$ actually is continuous at 0. Suppose now that $x_n$ is any sequence converging to 0. We need to prove that $f(x_n) \to f(0) = 0$. Let $\epsilon > 0$. Note that the sequence $x_n^2$ converges to 0, as does the sequence $-x_n^2$. Moreover, there exists $N \in \mathbb{N}$, corresponding to $\epsilon$, such that $n \ge N$ implies
$$|x_n^2 - 0| < \epsilon.$$
It follows also that $n \ge N$ implies $|-x_n^2 - 0| < \epsilon$. Now let $n \ge N$. If $x_n \in \mathbb{Q}$, then $|f(x_n) - 0| = x_n^2 < \epsilon$. If $x_n \notin \mathbb{Q}$, then
$$|f(x_n) - 0| = |-x_n^2 - 0| = x_n^2 < \epsilon.$$
Thus $n \ge N$ implies $|f(x_n) - f(0)| < \epsilon$, so that $f(x_n) \to f(0) = 0$.
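The two-sequence argument of item 6 can be illustrated numerically. Floating-point numbers cannot record whether a real number is rational, so the sketch below passes that information as an explicit flag; the names `f` and `tail_values`, and the particular sequences $p + \frac{1}{n}$ and $p + \frac{\sqrt{2}}{n}$ for rational $p$, are assumptions made for this illustration only.

```python
import math

def f(x, is_rational):
    """f(x) = x^2 on the rationals and -x^2 on the irrationals.

    The caller supplies rationality as a flag, since a float cannot encode it.
    """
    return x * x if is_rational else -x * x

def tail_values(p, n):
    """Values of f far along a rational and an irrational sequence tending to p.

    Assumes p is rational, so p + 1/n is rational and p + sqrt(2)/n is irrational.
    """
    rational_point = p + 1 / n
    irrational_point = p + math.sqrt(2) / n
    return f(rational_point, True), f(irrational_point, False)

at_one = tail_values(1, 10**6)   # the two values approach 1 and -1
at_zero = tail_values(0, 10**6)  # both values approach 0
```

At $p = 1$ the two sequences of values head toward different numbers, while at $p = 0$ they agree, matching the claim that $f$ is continuous only at 0.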
EXERCISES
2.19 † Prove part (iii) of Theorem 2.2.2. The domain $D_{f/g}$ is that subset of $D_f \cap D_g$ consisting of points $x$ for which $g(x) \neq 0$.
2.20 † Prove that if $a(x) = |x|$, the absolute value function, then $a \in C(\mathbb{R})$. (Use Exercise 1.9.)
2.21 † Let $Q(x) = \sqrt{x}$, which is defined for all $x \ge 0$. Prove: $Q \in C[0, \infty)$. (Hint: If $a \ge 0$ and $\epsilon > 0$, we seek $\delta > 0$ such that $x \ge 0$ and $|x - a| < \delta$ implies $|Q(x) - Q(a)| < \epsilon$. Begin by showing that $|\sqrt{x} - \sqrt{a}|^2 \le |x - a|$.)
2.22 Let $f$ be continuous at $a$, and let $\epsilon > 0$. Show there exists $\delta > 0$ such that $x_1, x_2 \in D_f \cap (a - \delta, a + \delta)$ implies $|f(x_1) - f(x_2)| < \epsilon$.
2.23 Let
ifx E Q, ifx
rJ_
Q.
Find all points pat which f is continuous. Prove your conclusion.
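2.24 Let
$$f(x) = \begin{cases} 1 - x & \text{if } x \in \mathbb{Q}, \\ 1 - x^2 & \text{if } x \notin \mathbb{Q}. \end{cases}$$
Prove that $f$ is continuous at $p$ if and only if $p \in \{0, 1\}$.

2.25 † Let
$$f(x) = \begin{cases} \dfrac{x^n - a^n}{x - a} & \text{if } x \neq a, \\ c & \text{if } x = a, \end{cases}$$
where $n \in \mathbb{N}$. Find the value of $c$ that makes $f \in C(\mathbb{R})$.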
2.26 ◊ Let $f$ be monotone increasing on $[a, b]$. For all $p \in (a, b)$, we denote $\lim_{x \to p+} f(x) = f(p+)$ and $\lim_{x \to p-} f(x) = f(p-)$, both of which exist. Let
$$j(p) = f(p+) - f(p-)$$
denote the height of the jump at $p$. Let
$$j(b) = f(b) - f(b-) \quad\text{and}\quad j(a) = f(a+) - f(a),$$
and let
$$E_n = \left\{ x \;\middle|\; j(x) \ge \tfrac{1}{n} \right\}$$
for all $n \in \mathbb{N}$.
a) Prove: $f$ is continuous at $x \in [a, b]$ $\Leftrightarrow$ $j(x) = 0$.
b) Show that the set $E$ of points of discontinuity of $f$ is given by
$$E = \bigcup_{n \in \mathbb{N}} E_n.$$
c) If $a = z_0 \le x_1 < z_1 < x_2 < z_2 < \cdots < x_k \le z_k = b$, prove that $j(x_i) \le f(z_i) - f(z_{i-1})$ for all $i$.
d) Prove that $E_n$ is finite for all $n$. Hint: If $p_1 < \cdots < p_k$ all lie in $E_n$, show that
$$\frac{k}{n} \le \sum_{i=1}^{k} j(p_i) \le f(b) - f(a).$$
e) Prove: The set $E$ is either countable or else finite.
2.27 Let $f : \mathbb{R} \to \mathbb{R}$ be any function such that $f(x + y) = f(x) + f(y)$. (Such a function is called a homomorphism of the additive group $\mathbb{R}$ in abstract algebra.) Suppose also that $f$ is continuous, and let $c = f(1)$.
a) For each $n \in \mathbb{N}$, prove that $f\left(\frac{1}{n}\right) = \frac{c}{n}$.
b) Let $x \in \mathbb{Q}$, so that $x$ can be written as $\frac{m}{n}$. Prove that $f(x) = cx$.
c) Use the continuity of $f$ to prove that $f(x) = cx$ for all $x \in \mathbb{R}$.
2.28 † ◊ ⁸ Let $f : D \to \mathbb{R}$ and suppose $p \in D$.
a) Let $s(\delta) = \sup\{f(x) \mid x \in D \cap (p - \delta, p + \delta)\}$. Prove that $s$ is a monotone increasing function of $\delta \in (0, \infty)$.
b) Let $i(\delta) = \inf\{f(x) \mid x \in D \cap (p - \delta, p + \delta)\}$. Prove that $i$ is a monotone decreasing function of $\delta \in (0, \infty)$.
c) Define the oscillation $o(p) = \lim_{\delta \to 0+} (s(\delta) - i(\delta))$ and prove that $o(p)$ exists.
d) Prove that $f$ is continuous at a point $p \in D$ if and only if $o(p) = 0$.

⁸ This exercise is used only for the proof of Theorem 11.3.1, which is Lebesgue's criterion for Riemann integrability.

2.3 SOME PROPERTIES OF CONTINUOUS FUNCTIONS

Lemma 2.3.1 (Preservation of Sign) Suppose $f$ is continuous at $p$ and suppose $f(p) > 0$. Then there exists $\delta > 0$ such that $x \in D_f \cap (p - \delta, p + \delta)$ implies $f(x) > 0$.

Proof: Let $\epsilon = \frac{f(p)}{2}$, which is positive. Then there exists $\delta > 0$ such that
$$x \in D_f \cap (p - \delta, p + \delta) \implies |f(x) - f(p)| < \epsilon,$$
which implies
$$0 < \frac{f(p)}{2} < f(x) < \frac{3 f(p)}{2}.$$
This proves the lemma. ∎

See Exercises 2.29 and 2.30 for generalizations of this lemma.
Definition 2.3.1 We say that $f$ has the Intermediate Value Property on an interval $I$ if and only if for all $a$ and $b$ in $I$ with $a < b$ and for all $k$ strictly between $f(a)$ and $f(b)$, there exists $c \in (a, b)$ such that $f(c) = k$.

Theorem 2.3.1 (Intermediate Value Theorem) Suppose $f$ is continuous on the interval $I$. Then $f$ has the Intermediate Value Property on $I$.

Proof: We will suppose $a$ and $b$ are in $I$ with $a < b$. As a first case we assume $f(a) < k < f(b)$. For the opposite case see Exercise 2.31. We will let $a_1 = a$ and $b_1 = b$ and we will use the method of interval-halving. Let $m = \frac{a_1 + b_1}{2}$.

1. If $f(m) > k$, then let $b_2 = m$ and $a_2 = a_1$.

2. Otherwise, let $a_2 = m$ and $b_2 = b_1$.

Then halve the interval $[a_2, b_2]$ and proceed as above to construct a decreasing nest of closed finite intervals $[a_n, b_n]$. Note that $|a_n - b_n| \to 0$ as $n \to \infty$. By the Nested Intervals Theorem (Theorem 1.6.1) there exists a unique
$$c \in \bigcap_{n=1}^{\infty} [a_n, b_n]$$
such that $a_n \to c$ and $b_n \to c$ as $n \to \infty$. Since $c \in [a, b]$, $f$ is continuous at $c$. By construction, $f(a_n) \le k$ and $f(b_n) \ge k$ for all $n$. Thus
$$f(c) = \lim_{n \to \infty} f(a_n) \le k \quad\text{and}\quad f(c) = \lim_{n \to \infty} f(b_n) \ge k.$$
These two inequalities can be satisfied only if $f(c) = k$. ∎
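The interval-halving step in the proof above can be sketched numerically. The following is a minimal illustration in Python, not part of the original text: the function name `bisect_root`, the tolerance `tol`, and the iteration cap `max_iter` are assumptions introduced here, and floating-point arithmetic only approximates the exact nested intervals used in the proof.

```python
def bisect_root(f, a, b, k=0.0, tol=1e-12, max_iter=200):
    """Interval-halving search for c in [a, b] with f(c) = k.

    Assumes f is continuous on [a, b] and f(a) < k < f(b),
    mirroring the first case in the proof of Theorem 2.3.1.
    """
    if not (f(a) < k < f(b)):
        raise ValueError("need f(a) < k < f(b)")
    for _ in range(max_iter):
        m = (a + b) / 2.0
        if f(m) > k:
            b = m  # step 1 of the proof: keep f(b_n) > k
        else:
            a = m  # step 2 of the proof: keep f(a_n) <= k
        if b - a < tol:
            break
    return (a + b) / 2.0


def poly(x):
    """The polynomial of Exercise 2.32."""
    return x**3 + 3 * x**2 - 2 * x - 1


# Solve x**2 = 2 on [0, 3]: here f(0) = 0 < 2 < 9 = f(3).
root = bisect_root(lambda x: x**2, 0.0, 3.0, k=2.0)

# A root of the polynomial from Exercise 2.32 in (0, 1).
poly_root = bisect_root(poly, 0.0, 1.0)
print(root, poly_root)
```

The loop keeps the invariant $f(a_n) \le k < f(b_n)$, which is exactly what forces the common limit $c$ to satisfy $f(c) = k$ when $f$ is continuous.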
EXAMPLE 2.5

Now we can prove that $\mathbb{R}$ contains an $n$th root $\sqrt[n]{p}$ for all $n \in \mathbb{N}$ and for all $p > 0$. Let $f(x) = x^n$ and note that $f \in C(\mathbb{R})$. Also, $f(0) = 0 < p$ and
$$f(1 + p) = (1 + p)^n = 1^n + n 1^{n-1} p^1 + \cdots + n 1^1 p^{n-1} + p^n > np \ge p.$$
Thus $f(0) < p < f(1 + p)$ and so there exists $c \in (0, 1 + p)$ such that $f(c) = c^n = p$, and $c = \sqrt[n]{p}$.

Let us comment a bit more upon this important example. In Section 1.3 we considered a sequence of successive decimal approximations to $\sqrt{2}$. Thus, $a_k$ was the largest $k$-decimal place rational number such that $a_k^2 < 2$. Thus
$$\left(a_k + \frac{1}{10^k}\right)^2 > 2 \quad\text{and}\quad |a_k - \sqrt{2}| < 10^{-k} \to 0$$
as $k \to \infty$. Hence $a_k \to \sqrt{2}$.

We remark that the Intermediate Value Theorem, like the method of interval-halving itself, is very useful for the task of solving equations that involve continuous functions. See Exercises 2.32 and 2.33.

If $f \in C[a, b]$, for a closed finite interval $[a, b]$, then we can conclude more than just the continuity of $f$ at each point $c \in [a, b]$. The next theorem shows that $f$ will be uniformly continuous on $[a, b]$. We present a definition first.
Definition 2.3.2 We say $f$ is uniformly continuous on a domain $D$ if and only if for all $\epsilon > 0$ there exists $\delta > 0$ such that for all $x_1$ and $x_2$ in $D$ such that $|x_2 - x_1| < \delta$ we have $|f(x_2) - f(x_1)| < \epsilon$.

That is, $f$ is uniformly continuous on $D$ provided there exists $\delta > 0$ that satisfies the requirements for continuity at $x_1$ uniformly, meaning at all $x_1 \in D$. Intuitively, uniform continuity of $f$ on $D$ imposes a restriction on how fast $f$ can change its values on the entire domain $D$.
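Theorem 2.3.2 If $f \in C[a, b]$, then $f$ is uniformly continuous on $[a, b]$.

Proof: We suppose the theorem were false and we will deduce a contradiction. Thus we suppose $f$ is not uniformly continuous on $[a, b]$. Hence there exists $\epsilon > 0$ such that no $\delta > 0$ will suffice to meet the uniform continuity condition. Hence, if $\delta_n = \frac{1}{n}$, there exist $x_n, y_n \in [a, b]$ such that $|x_n - y_n| < \frac{1}{n}$ but
$$|f(x_n) - f(y_n)| \ge \epsilon.$$
Since $x_n$ is a bounded sequence, the Bolzano-Weierstrass Theorem guarantees the existence of a convergent subsequence $x_{n_k} \to p \in [a, b]$ as $k \to \infty$. Now,
$$|y_{n_k} - p| \le |y_{n_k} - x_{n_k}| + |x_{n_k} - p| < \frac{1}{n_k} + |x_{n_k} - p| \to 0.$$
Thus $y_{n_k} \to p$ as well. Therefore, since $f$ is continuous at $p$,
$$f(x_{n_k}) \to f(p) \quad\text{and}\quad f(y_{n_k}) \to f(p).$$
This means that
$$|f(x_{n_k}) - f(y_{n_k})| \to 0.$$
But $|f(x_{n_k}) - f(y_{n_k})| \ge \epsilon$ for all $k$ and so we have a contradiction. ∎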
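The difference between a single $\delta$ that works everywhere and a $\delta$ that must shrink from point to point can be explored numerically. The sketch below is an illustration added here, not part of the original text: the helper name `worst_gap`, the grid size `N`, and the sample values of `delta` are assumptions, and a finite grid can only suggest, never prove, what happens to the supremum in Definition 2.3.2. It compares $\sqrt{x}$ on $[0, 1]$ with $\frac{1}{x}$ on $(0, 1)$.

```python
import math


def worst_gap(f, points, delta):
    """Largest |f(x1) - f(x2)| over sample pairs with 0 < x2 - x1 < delta.

    `points` must be sorted in increasing order. This is only a grid
    estimate of the supremum in Definition 2.3.2, not a proof.
    """
    worst = 0.0
    for i, x1 in enumerate(points):
        j = i + 1
        while j < len(points) and points[j] - x1 < delta:
            worst = max(worst, abs(f(x1) - f(points[j])))
            j += 1
    return worst


N = 2000  # hypothetical grid resolution
closed_grid = [k / N for k in range(N + 1)]  # samples of [0, 1]
open_grid = [k / N for k in range(1, N)]     # samples of (0, 1)

for delta in (0.1, 0.01, 0.001):
    gap_sqrt = worst_gap(math.sqrt, closed_grid, delta)
    gap_inv = worst_gap(lambda x: 1.0 / x, open_grid, delta)
    print(f"delta={delta}: sqrt gap ~ {gap_sqrt:.4f}, 1/x gap ~ {gap_inv:.1f}")
```

On this grid the gap for $\sqrt{x}$ shrinks with $\delta$, consistent with Theorem 2.3.2, while the gap for $\frac{1}{x}$ stays large near $0$ however small $\delta$ is taken.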
EXERCISES

2.29 † Suppose $f$ is continuous at $p$ and $f(p) > c$. Prove: There exists $\delta > 0$ such that $x \in D_f \cap (p - \delta, p + \delta)$ implies $f(x) > c$. (Hint: Consider $g(x) = f(x) - c$.)

2.30 † Suppose $f$ is continuous at $p$ and $f(p) < c$. Prove: There exists $\delta > 0$ such that $x \in D_f \cap (p - \delta, p + \delta)$ implies $f(x) < c$. (Hint: Consider $g(x) = -f(x)$.)

2.31 † Suppose $f \in C[a, b]$ and $f(a) > k > f(b)$. Prove: There exists $c \in (a, b)$ such that $f(c) = k$. This will complete the proof of Theorem 2.3.1. (Hint: Consider $g(x) = -f(x)$.)

2.32 Let $p(x) = x^3 + 3x^2 - 2x - 1$. Prove that the polynomial equation $p(x) = 0$ has a root somewhere in the interval $(0, 1)$.

2.33 Let $p(x) = x^4 + x^3 - 2x^2 + x + 1$. Prove that the polynomial equation $p(x) = 0$ has a root in $(-1, 0)$.

2.34 Let $f(x) = \cos x - x - \sin x$. Prove: There exists a solution to the equation $f(x) = 0$ on the interval $[0, \pi/6]$. (You may assume that $\sin x$ and $\cos x$ are continuous on $\mathbb{R}$.)

2.35 Let $p(x) = a_{2n+1} x^{2n+1} + \cdots + a_1 x + a_0$ be any polynomial of odd degree. Prove: The equation $p(x) = 0$ has at least one real root. (Hint: Prove $p(x) \to \pm\infty$ depending on whether $x \to \pm\infty$ and consider then that $p$ must have both positive and negative values.)

2.36 A function $f : (-a, a) \to \mathbb{R}$ is called an odd function provided $f(-x) = -f(x)$ for all $x \in (-a, a)$. A function $f : (-a, a) \to \mathbb{R}$ is called an even function provided $f(-x) = f(x)$ for all $x \in (-a, a)$. Prove or give a counterexample for each of the following statements.
a) If $f$ is an odd continuous function on $(-a, a)$, then there exists a solution to the equation $f(x) = 0$.
b) The composition of two odd functions must be odd.
c) Every function of an even function is even.
d) An even function of an odd function is even.
e) An even function of any function is even.
f) Every function $f : (-a, a) \to \mathbb{R}$ is the sum of an even function with an odd function.

2.37 † Prove the following fixed point theorem: Suppose $f \in C[0, 1]$ and suppose $0 \le f(x) \le 1$ for all $x \in [0, 1]$. Then there exists $c \in [0, 1]$ such that $f(c) = c$. (The point $c$ is then called a fixed point for the function $f$. Hint: Consider $g(x) = f(x) - x$.)

2.38 Prove the following theorem: Suppose $f \in C[0, 1]$ and suppose $0 \le f(x) \le 1$ for all $x \in [0, 1]$. Let $n \in \mathbb{N}$. Then there exists $c \in [0, 1]$ such that $f(c) = c^n$. (Hint: Compare this problem with Exercise 2.37.)
Figure 2.1 $y = \frac{1}{x}$.
2.39 Suppose $f$ and $g$ are in $C[a, b]$ and suppose $[f(a) - g(a)][f(b) - g(b)] \le 0$. Prove that there exists $c \in [a, b]$ such that $f(c) = g(c)$. (Hint: Compare this problem with Exercise 2.37.)

2.40 Suppose $f$ is uniformly continuous on $(a, m]$ and also on $[m, b)$. Prove: $f$ is uniformly continuous on $(a, b)$.

2.41 Let $f(x) = \frac{1}{x}$ for all $x \in (0, 1)$. Is $f$ continuous on $(0, 1)$? Is $f$ uniformly continuous on $(0, 1)$? Justify your conclusions. (See Fig. 2.1.)

2.42 Let $f(x) = \frac{1}{x}$ for all $x \in (1, \infty)$. Is $f$ continuous on $(1, \infty)$? Is $f$ uniformly continuous on $(1, \infty)$? Is $f$ uniformly continuous on $(0, \infty)$? Justify your conclusions. (See Fig. 2.1.)

2.43 Let $f(x) = x^2$ for all $x \in \mathbb{R}$. Is $f \in C(\mathbb{R})$? Is $f$ uniformly continuous on $\mathbb{R}$? Justify your conclusions.

2.44 Suppose $f$ is uniformly continuous on $D$ and suppose $E \subset D$. Prove: $f$ is uniformly continuous on $E$ as well.

2.45 Is $f(x) = x^2$ uniformly continuous on $(0, 1)$? Why or why not?

2.46 Let $Q(x) = \sqrt{x}$. Prove that the function $Q$ is uniformly continuous on $[0, \infty)$. (Hint: See Exercise 2.21.)

2.47 Let
$$f(x) = \begin{cases} \sin\frac{1}{x} & \text{if } 0 < x \le 1, \\ 0 & \text{if } x = 0. \end{cases}$$
You may assume that the function $\sin x$ is continuous. (See Fig. 2.2.)
a) Show that $f$ does have the intermediate value property (Definition 2.3.1) on $[0, 1]$, but $f \notin C[0, 1]$.
b) Show $f \in C(0, 1]$ but $f$ is not uniformly continuous on $(0, 1]$.

Figure 2.2 $y = \sin\frac{1}{x}$.

2.4 EXTREME VALUE THEOREM AND ITS CONSEQUENCES
Definition 2.4.1 We say a function $f$ is bounded on a domain $D$, provided that there exist $m$ and $M$ in $\mathbb{R}$ such that
$$m \le f(x) \le M$$
for all $x \in D$.

Observe that $f(x) = x$ is continuous on $\mathbb{R}$ but $f$ is not bounded on $\mathbb{R}$. And $g(x) = \frac{1}{x}$ is continuous on $(0, 1)$ but is not bounded, since it lacks an upper bound, though there is a lower bound, $0$. We have, however, the following important theorem pertaining to continuous functions on closed, finite intervals.

Theorem 2.4.1 If $f \in C[a, b]$, then $f$ is bounded on $[a, b]$.
Proof: By Theorem 2.3.2, $f$ is uniformly continuous on $[a, b]$. Thus, for example, if we pick the positive number $\epsilon = 1$, there exists $\delta > 0$ such that for all $x', x \in [a, b]$ such that $|x' - x| < \delta$ we must have $|f(x') - f(x)| < 1$. Now consider the sequence of evenly spaced points defined as follows:
$$x_1 = a,\quad x_2 = a + \delta,\quad x_3 = a + 2\delta,\quad \ldots,\quad x_k = a + (k - 1)\delta,\quad \ldots.$$
By the Archimedean Property of $\mathbb{R}$, there exists $N \in \mathbb{N}$ such that $a + N\delta > b$, whereas $a + (N - 1)\delta \le b$. In other words, although $\delta$ may be a small positive number, if we take enough steps of size $\delta$ we must eventually pass $b$. Now consider the following inequalities.
$$\begin{aligned}
x \in [x_1, x_2) &\implies f(x_1) - 1 < f(x) < f(x_1) + 1 \\
x \in [x_2, x_3) &\implies f(x_2) - 1 < f(x) < f(x_2) + 1 \\
x \in [x_3, x_4) &\implies f(x_3) - 1 < f(x) < f(x_3) + 1 \\
&\;\;\vdots
\end{aligned}$$
If we let
$$m = \min\{f(x_1) - 1, f(x_2) - 1, \ldots, f(x_N) - 1\}$$
and if we let
$$M = \max\{f(x_1) + 1, f(x_2) + 1, \ldots, f(x_N) + 1\},$$
then we have for all $x \in [a, b]$, $m < f(x) < M$. ∎
Observe that the function $f(x) = x$ is bounded on $D = (0, 1)$. However, $f$ has no largest value and no smallest value, since for all $x \in (0, 1)$, there exist $x' < x < x''$ with $x', x'' \in (0, 1)$.

Definition 2.4.2 We say that $f$ has a maximum value $M$ on a domain $D$, provided that there exists $x_M \in D$ such that $f(x) \le f(x_M) = M$ for all $x \in D$. Similarly, we say $f$ has a minimum value $m$ on $D$, provided that there exists $x_m \in D$ such that $f(x) \ge f(x_m) = m$ for all $x \in D$.

The next theorem establishes that on a closed, finite interval $[a, b]$, every continuous function must have both a maximum and a minimum value.

Theorem 2.4.2 (Extreme Value Theorem) If $f \in C[a, b]$, then $f$ has both a maximum and a minimum value on $[a, b]$.
Proof: By Theorem 2.4.1, we know that the nonempty set $\{f(x) \mid x \in [a, b]\}$ is bounded both above and below. Let $M = \sup\{f(x) \mid x \in [a, b]\}$. Thus $M$ is the least upper bound of the range of $f$. Hence for all $k \in \mathbb{N}$, $M - \frac{1}{k}$ is too small to be an upper bound, so there exists $x_k \in [a, b]$ such that
$$M - \frac{1}{k} < f(x_k) \le M.$$
Since $x_k$ is a bounded sequence, the Bolzano-Weierstrass Theorem guarantees that there exists a convergent subsequence $x_{n_k}$. Thus there exists $x_M$ such that $x_{n_k} \to x_M$ as $k \to \infty$. Since $[a, b]$ is closed, $x_M \in [a, b]$. But $n_k \ge k$, since $n_k$ is an increasing sequence of natural numbers. Thus
$$M - \frac{1}{k} \le M - \frac{1}{n_k} < f(x_{n_k}) \le M$$
for all $k \in \mathbb{N}$. Since $M - \frac{1}{k} \to M$ as $k \to \infty$, and because $f$ is continuous at $x_M$, $f(x_M) = \lim_{k \to \infty} f(x_{n_k}) = M$. For the minimum point, see Exercise 2.48. ∎
EXAMPLE 2.6

Let $p(x) = x^2 - x + 1$. We claim that $p$ must have a minimum value on $\mathbb{R}$. The problem is that although $p \in C(\mathbb{R})$, $\mathbb{R}$ is not a finite closed interval, so the Extreme Value Theorem does not apply directly. Consider
$$p(x) = x^2 \left(1 - \frac{1}{x} + \frac{1}{x^2}\right).$$
The second factor approaches $1$ as $x \to \pm\infty$. Thus, there exists $M$ such that $|x| > M$ implies
$$1 - \frac{1}{x} + \frac{1}{x^2} > \frac{1}{2}.$$
Observe that $p(0) = 1$ and if $|x| > \sqrt{2}$ then $x^2 > 2$. Hence if we set $a = \max\{\sqrt{2}, M\}$, then $|x| > a$ implies $p(x) > 1 = p(0)$. Now, $p \in C[-a, a]$ so the Extreme Value Theorem guarantees that there exists $x_m \in [-a, a]$ such that $p(x_m) \le p(x)$ for all $x \in [-a, a]$. But this implies that
$$p(x_m) \le p(0) = 1 < p(x)$$
for all $x$ outside $[-a, a]$, since $p(x)$ exceeds $1$ there. Hence $p(x_m) \le p(x)$ for all $x \in \mathbb{R}$.

Theorem 2.4.3 Let $f \in C[a, b]$. Then the range of $f$ is a closed, finite interval.
Proof: Recall that the range of $f$ is $f([a, b]) = \{f(x) \mid x \in [a, b]\}$. Since $f$ achieves a maximum value $M$ and a minimum value $m$, the Intermediate Value Theorem assures us that $f$ achieves every value between $M$ and $m$. Hence
$$f([a, b]) = [m, M].$$
∎
Definition 2.4.3 If $f$ is any function whatever (not necessarily continuous) on a domain of definition $D$, we define the sup-norm of $f$ to be
$$\|f\|_{\sup} = \sup\{|f(x)| \mid x \in D\}.$$
Thus $\|f\|_{\sup}$ may be either a nonnegative real number or else $\infty$. However, if $f \in C[a, b]$, then $|f| \in C[a, b]$ as well, since the absolute value function is continuous and compositions of continuous functions must be continuous. Thus, for all $f \in C[a, b]$, $\|f\|_{\sup} < \infty$, since $|f|$ is bounded. For any function $f$ the reader will show in Exercise 2.50 that $f$ is bounded if and only if $\|f\|_{\sup} < \infty$.
Theorem 2.4.4 Let $f$ and $g$ be any bounded functions on $[a, b]$ and $\alpha \in \mathbb{R}$. Then

i. $\|f\|_{\sup} = 0$ if and only if $f = 0$.
ii. $\|\alpha f\|_{\sup} = |\alpha| \|f\|_{\sup}$.
iii. $\|f + g\|_{\sup} \le \|f\|_{\sup} + \|g\|_{\sup}$.

Item (iii) is called the triangle inequality.

Proof: Items (i) and (ii) are immediate from the definitions. For (iii), consider that for all $x \in [a, b]$ we must have
$$|(f + g)(x)| \le |f(x)| + |g(x)| \le \|f\|_{\sup} + \|g\|_{\sup}.$$
But $\|f + g\|_{\sup}$ is the least upper bound of the set of values $|(f + g)(x)|$, and the right side above is an upper bound of the same set of values. ∎

In Theorem 2.2.2 we learned that if $f$ and $g$ are in $C[a, b]$, then $f \pm g \in C[a, b]$ too, as is $\alpha f$ for all $\alpha \in \mathbb{R}$. Thus continuous functions on $[a, b]$ can be added and subtracted, giving other continuous functions. The constant function $0$ is an additive identity: $f + 0 = f$. And addition and subtraction of functions in $C[a, b]$, as well as multiplication by constants, enjoy all the usual properties shared by any so-called vector space. We list the axioms of a general (real or complex) vector space in Table 2.1. (See [10] for the concept of a vector space.) In linear algebra, the student will have studied vectors in $\mathbb{R}^n$ primarily, and also the concept of an abstract vector space. So in effect we have learned here that $C[a, b]$ is a vector space. In a vector space, it is convenient to have defined a type of nonnegative real-valued function on vectors called a norm. (In physics, the norm of a vector is called its magnitude.)
Definition 2.4.4 A normed vector space $V$, over a field $\mathbb{F}$ (which is either $\mathbb{R}$ or $\mathbb{C}$), is a vector space equipped with a norm $\|\cdot\|$, a real-valued function on $V$ such that for all $v$ and $w$ in $V$ and all $\alpha \in \mathbb{F}$:

i. $\|v\| \ge 0$, and $\|v\| = 0$ if and only if $v = 0$ (Positive Definite).
ii. $\|\alpha v\| = |\alpha| \|v\|$ (Homogeneous).
iii. $\|v + w\| \le \|v\| + \|w\|$ (Triangle Inequality).

Here $|\alpha|$ refers to the absolute value of $\alpha$ if $\alpha \in \mathbb{R}$ and to the modulus of $\alpha$ if $\alpha \in \mathbb{C}$.

Thus the sup-norm is an example of a norm, and $C[a, b]$ is a normed vector space, or normed linear space, as it is also called. The reason a norm is convenient is that it gives us a concept analogous to that of the length of a vector, and then the distance between the vectors, or continuous functions $f$ and $g$, would be understood to be $\|f - g\|_{\sup}$. In the next section we will study convergence of sequences of continuous functions on $[a, b]$ in terms of the sup-norm concept of distance.

In a linear algebra class the student will have learned that in a finite-dimensional vector space $V$ there is a minimal natural number $n$, called the dimension, such that any set of more than $n$ vectors must be linearly dependent. That is, if $v_1, \ldots, v_{n+1}$ are in $V$ then there exist $\alpha_1, \ldots, \alpha_{n+1}$ in $\mathbb{R}$, not all $0$, such that
$$\sum_{k=1}^{n+1} \alpha_k v_k = 0.$$
Table 2.1 Axioms of a Vector Space

A vector space (or linear space) over a field $\mathbb{F}$, which may be either the field $\mathbb{R}$ of real numbers or the field $\mathbb{C}$ of complex numbers, is a set $V$ of elements called vectors that is equipped with two operations: addition of vectors and multiplication of vectors by scalars (also called numbers) from $\mathbb{F}$. The following axioms must be satisfied for $V$ to be a vector space.

1. If $u$ and $v$ are in $V$ and $a \in \mathbb{F}$, then $u + v \in V$ and $av \in V$. (Closure under addition of vectors and scalar multiplication)

2. If $u$ and $v$ are in $V$, then $u + v = v + u$. (Commutativity of vector addition)

3. If $u$, $v$, and $w$ are in $V$, then $u + (v + w) = (u + v) + w$. (Associativity of vector addition)

4. There exists a unique vector $0 \in V$ such that $u + 0 = u$ for all $u \in V$. (Existence of additive identity)

5. For each $u \in V$ there exists a unique vector $-u \in V$ such that $-u + u = 0$. (Existence of additive inverse)

6. For all $u \in V$, $1u = u$. (The scalar $1 \in \mathbb{F}$ is a multiplicative identity.)

7. For all $a$ and $b$ in $\mathbb{F}$ and for all $u \in V$, $(ab)u = a(bu)$. (Associativity of scalar multiplication)

8. For all $a$ and $b$ in $\mathbb{F}$ and all $u$ and $v$ in $V$, $a(u + v) = au + av$ and $(a + b)u = au + bu$. (Distributivity of scalar multiplication)

The field $\mathbb{F}$ will almost always be the field $\mathbb{R}$ of real numbers in this book. The one exception is that in Chapter 6 (Fourier series) $\mathbb{F}$ can be the field $\mathbb{C}$ of complex numbers.
(See, for example, [10].) In Exercise 2.57 below, the reader will show that the vector space $C[a, b]$ must be infinite-dimensional because there is no $n \in \mathbb{N}$ that can serve as the dimension of this vector space.
EXERCISES

2.48 † Complete the proof of Theorem 2.4.2 by proving for all $f \in C[a, b]$ the existence of a minimum point $x_m$, such that $f(x_m) \le f(x)$ for all $x \in [a, b]$.

2.49 Let $p(x) = a_{2n} x^{2n} + \cdots + a_1 x + a_0$ be any polynomial of even degree. Prove: If $a_{2n} > 0$, then $p$ has a minimum value on $\mathbb{R}$.

2.50 Prove: A function $f$ is bounded if and only if $\|f\|_{\sup} < \infty$. (Hint: See Definition 2.4.1.)

2.51 Give an alternative proof for Theorem 2.4.1: If $f \in C[a, b]$, then $f$ is bounded. (Hint: Suppose false. Then $|f|$ is not bounded above. By Exercise 2.50, for all $n \in \mathbb{N}$ there exists $x_n \in [a, b]$ such that $|f(x_n)| > n$. Now apply the Bolzano-Weierstrass Theorem to $x_n$ and deduce a contradiction of the fact that $f \in C[a, b]$.)

2.52 Prove that $C[a, b]$ is a vector space.

2.53 Let $B[a, b]$ denote the set of all bounded functions on $[a, b]$. Is $B[a, b]$ a vector space? Prove your conclusion.

2.54 For each $n \in \mathbb{N}$, let $X_n$ be the set of polynomials of degree equal to $n$, and let $P_n = \bigcup_{k=0}^{n} X_k$. Prove or give a counterexample:
a) The set $X_n$ is a vector space.
b) The set $P_n$ is a vector space.

2.55 Let $P$ denote the set of all functions $f$ on $[0, 1]$ such that $f(x) \ge 0$ for all $x \in [0, 1]$. Is $P$ a vector space? Justify your conclusion.

2.56 Let $P$ denote the set of all functions $f$ on $[0, 1]$ such that there exists at least one point $x \in [0, 1]$, perhaps depending on $f$, for which $f(x) > 0$. Is $P$ a vector space? Justify your conclusion.

2.57 † Let $S = \{1, x, x^2, x^3, \ldots, x^n, \ldots\}$, a list of infinitely many continuous functions on $[a, b]$. Show that if $n \in \mathbb{N}$ and if $\alpha_0, \alpha_1, \ldots, \alpha_n$ are not all $0$, then it is not possible for $\sum_{k=0}^{n} \alpha_k x^k = 0$, the zero function. (Remark: This exercise shows that $C[a, b]$ is an infinite-dimensional vector space, because it cannot have finite dimension $n$ for any $n \in \mathbb{N}$.)

2.58 Give an example of $f \in C(a, b)$ for which the range of $f$ is
a) a finite open interval.
b) an infinite open interval.

2.59 Find $\|f\|_{\sup}$ if
a) $f(x) = x$ on $(-1, \frac{1}{2})$.
b) f(x) = {o_xz
if x E Q, ifx ~ Q.
2.5 THE BANACH SPACE C[a, b]

We begin with a definition that is applicable whenever a sequence of functions $f_n$ is defined on a common domain, $D$. We do not require that the functions be continuous in the following definition.

Definition 2.5.1 If $f$ and $f_n$ are all defined on a domain $D$, we say that $f_n \to f$ uniformly on $D$ provided
$$\|f_n - f\|_{\sup} \to 0$$
as $n \to \infty$. (Here the sup-norm is taken over the domain $D$.)

A helpful way to visualize the concept of uniform convergence is to picture, for each given $\epsilon > 0$, the curves $f(x) + \epsilon$ and $f(x) - \epsilon$. Then $f_n \to f$ uniformly on $D$ if and only if for each $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that $n \ge N$ implies that the graph of $y = f_n(x)$ is sandwiched between the graph of $f(x) + \epsilon$ and that of $f(x) - \epsilon$. The following theorem shows that uniform convergence behaves very well with respect to the continuity of functions.

Theorem 2.5.1 Suppose $f_n \in C(D)$ for all $n$, and suppose also that $f_n \to f$ uniformly on $D$. Then $f \in C(D)$.

In words, this theorem says that a uniform limit of continuous functions must be continuous.
Proof: Let $x \in D$ be arbitrary. We must show $f$ is continuous at $x$. Let $\epsilon > 0$. We need to show there exists $\delta > 0$ such that $x' \in D \cap (x - \delta, x + \delta)$ implies $|f(x') - f(x)| < \epsilon$. Since $f_n \to f$ uniformly on $D$, there exists $N \in \mathbb{N}$ such that $k \ge N$ implies $\|f_k - f\|_{\sup} < \frac{\epsilon}{3}$. In particular, this means
$$\|f_N - f\|_{\sup} < \frac{\epsilon}{3}.$$
Since $f_N$ is continuous at $x$, there exists $\delta > 0$ such that $x' \in D \cap (x - \delta, x + \delta)$ implies
$$|f_N(x') - f_N(x)| < \frac{\epsilon}{3}.$$
We claim that this $\delta$ works for the function $f$ as well. In fact, if $x' \in D \cap (x - \delta, x + \delta)$ then
$$\begin{aligned}
|f(x') - f(x)| &= \left|[f(x') - f_N(x')] + [f_N(x') - f_N(x)] + [f_N(x) - f(x)]\right| \\
&\le |f(x') - f_N(x')| + |f_N(x') - f_N(x)| + |f_N(x) - f(x)| \\
&< \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon.
\end{aligned}$$
∎

Definition 2.5.2 We say that the sequence of functions $f_n \to f$ pointwise on $D$ if and only if, for all $x \in D$, the sequence of real numbers $f_n(x)$ converges to the real number $f(x)$.

The following example will illustrate the difference between uniform and pointwise convergence.

EXAMPLE 2.7
Let $f_n(x) = x^n$ for all $x \in [0, 1]$. We claim $f_n$ does not converge uniformly on $[0, 1]$. If $0 \le x < 1$, then we know $x^n \to 0$. But if $x = 1$, $x^n = 1^n \to 1$. Thus $f_n \in C[0, 1]$ for all $n \in \mathbb{N}$, and $f_n \to f$ pointwise on $D = [0, 1]$, where
$$f(x) = \begin{cases} 0 & \text{if } 0 \le x < 1, \\ 1 & \text{if } x = 1. \end{cases}$$
So the continuous functions $f_n \to f$ and yet $f \notin C[0, 1]$. The function $f$ could not have failed to be continuous on $[0, 1]$ if the convergence had been uniform.

EXAMPLE 2.8

Now consider $f_n(x) = x^n$, but on the domain $x \in [0, \frac{1}{2}]$. Here we see that
$$\|f_n - 0\|_{\sup} = f_n\left(\frac{1}{2}\right) = \frac{1}{2^n} \to 0,$$
so $f_n \to 0$ uniformly on $D = [0, \frac{1}{2}]$. This illustrates how the question of uniform convergence is affected by the choice of domain as well as by the sequence of functions $f_n$. Note that on $[0, \frac{1}{2}]$ the limit function $0$ is indeed continuous.
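The contrast between Examples 2.7 and 2.8 can be checked with a few lines of arithmetic. The sketch below is an illustration added here, with hypothetical helper names `witness_gap` and `sup_on_half`. On $[0, 1)$ it evaluates $|f_n(x) - f(x)| = x^n$ at the sample point $x = 1 - \frac{1}{n}$, which gives a lower bound on $\|f_n - f\|_{\sup}$ that never drops below $\frac{1}{4}$. On $[0, \frac{1}{2}]$ it uses the exact value $2^{-n}$.

```python
def witness_gap(n):
    """|f_n(x) - f(x)| at x = 1 - 1/n, where f_n(x) = x**n and f(x) = 0 for x < 1.

    This is a lower bound for the sup-norm of f_n - f on [0, 1].
    """
    x = 1.0 - 1.0 / n
    return x ** n


def sup_on_half(n):
    """Exact sup of x**n on [0, 1/2]; x**n is increasing, so it is attained at 1/2."""
    return 0.5 ** n


for n in (2, 10, 100, 1000):
    print(n, witness_gap(n), sup_on_half(n))
```

Since the witness values stay bounded away from $0$, the sup-norm on $[0, 1]$ cannot tend to $0$, while on $[0, \frac{1}{2}]$ it decays geometrically.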
Theorem 2.5.2 If $f_n \to f$ uniformly on $D$, then $f_n \to f$ pointwise on $D$, but not conversely.

Proof: Observe that for each $x \in D$, we must have
$$|f_n(x) - f(x)| \le \|f_n - f\|_{\sup}.$$
Thus if $\|f_n - f\|_{\sup} \to 0$, it follows that $|f_n(x) - f(x)| \to 0$ for each $x \in D$. That the converse is false is shown by Example 2.7. ∎

EXAMPLE 2.9
Let
$$f_n(x) = x^{1 + \frac{1}{n}}.$$
It is not hard to see that $f_n(x) \to f(x) = x$ pointwise for all $x \in [0, 1]$. We claim that this convergence is actually uniform. For this purpose it is convenient to make use of the derivative, even though we have not yet reached our chapter on this subject, using what we recall from elementary calculus about derivatives. Let $g_n(x) = f(x) - f_n(x)$ for all $x$, so that $g_n(0) = 0 = g_n(1)$ and $g_n(x) \ge 0$ for all $x \in [0, 1]$. The reader can set $g_n'(x) = 0$ and check that $g_n(x)$ achieves its maximum value at
$$x_n = \frac{1}{\left(1 + \frac{1}{n}\right)^n}$$
and that
$$\|f - f_n\|_{\sup} = g_n(x_n) = \frac{1}{\left(1 + \frac{1}{n}\right)^n} \left(1 - \frac{1}{1 + \frac{1}{n}}\right) \to \frac{1}{e}(1 - 1) = 0.$$
Thus $f_n \to f$ uniformly on $[0, 1]$. The graph showing this sequence of functions converging uniformly to $f(x) = x$ is shown by the right-hand half of the graph in Fig. 4.5.

Theorem 2.5.2 assures us that the only possible uniform limit $f$ for a sequence of functions $f_n$ would be the pointwise limit $f$. If the sequence fails to be pointwise convergent, then it cannot be uniformly convergent. However, a sequence of functions can be pointwise convergent on a domain $D$ without being uniformly convergent on $D$.

Definition 2.5.3 In a vector space $V$ equipped with a norm, we say that $v_n \to v$ in the norm if and only if $\|v_n - v\| \to 0$ as $n \to \infty$. We call a sequence $v_n$ a Cauchy sequence if and only if for all $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that $n \ge m \ge N$ implies $\|v_n - v_m\| < \epsilon$. A normed vector space will be called complete if and only if every Cauchy sequence converges. A complete normed vector space is known also as a (real) Banach Space, after the discoverer, Stefan Banach.

Theorem 2.5.3 In any normed vector space $V$, if a sequence $v_n \in V$ converges, then it is a Cauchy sequence.
Proof: First, suppose $v_n \to v$, which we understand to mean that $\|v_n - v\| \to 0$. We need to show that the sequence $v_n$ is Cauchy. Let $\epsilon > 0$. Then there exists $N \in \mathbb{N}$, corresponding to $\epsilon$, such that $n \ge N$ implies that $\|v_n - v\| < \frac{\epsilon}{2}$. Hence, if $n$ and $m \ge N$,
$$\|v_n - v_m\| = \|[v_n - v] + [v - v_m]\| \le \|v_n - v\| + \|v - v_m\| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$
Thus the sequence $v_n$ is Cauchy. ∎

EXAMPLE 2.10

In the vector space $C[a, b]$, equipped with the sup-norm, we say that $f_n \to f$ in the norm if and only if $\|f_n - f\|_{\sup} \to 0$ as $n \to \infty$. This is also called uniform convergence. (See Definition 2.5.1.) We call the sequence $f_n$ a Cauchy sequence in the sup-norm (or a uniformly Cauchy sequence) if and only if for all $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that $n \ge m \ge N$ implies $\|f_n - f_m\|_{\sup} < \epsilon$.
Theorem 2.5.4 The vector space $C[a, b]$ equipped with the sup-norm is a complete normed vector space.

Proof: We suppose that $f_n$ is a Cauchy sequence (in the sup-norm sense defined above), and we must prove there exists $f \in C[a, b]$ such that $f_n \to f$. Let $\epsilon > 0$. Then there exists $N \in \mathbb{N}$ such that $n \ge m \ge N$ implies $\|f_n - f_m\|_{\sup} < \frac{\epsilon}{2}$. For each fixed $x \in [a, b]$,
$$|f_n(x) - f_m(x)| \le \|f_n - f_m\|_{\sup} < \frac{\epsilon}{2}.$$
Thus, for all fixed $x \in [a, b]$, the sequence $f_n(x)$ is a Cauchy sequence of real numbers. By the Completeness Axiom, $f_n(x)$ must converge. So we can define $f(x) = \lim_{n \to \infty} f_n(x)$, for all $x \in [a, b]$. Now, for all $n$ and $m \ge N$,
$$|f_n(x) - f_m(x)| < \frac{\epsilon}{2}.$$
Holding $x$ temporarily fixed in $[a, b]$, let $m \to \infty$, and we find that $|f_n(x) - f(x)| \le \frac{\epsilon}{2}$. Since this works for each such $x$, $\|f_n - f\|_{\sup} \le \frac{\epsilon}{2} < \epsilon$, so $f_n \to f$ uniformly on $[a, b]$, and $f \in C[a, b]$ by Theorem 2.5.1. ∎

Because uniform limits of continuous functions must be continuous, it is natural to ask the following question. Suppose $f_n \in C[a, b]$ for all $n$, and suppose $f_n \to f$ pointwise. If the function $f \in C[a, b]$, would this guarantee that actually $f_n \to f$ uniformly as well? The answer is no, as the following example shows.

EXAMPLE 2.11
We define a sequence $f_n \in C[0, 1]$, for all $n \ge 2$, as follows. Let
$$f_n(x) = \begin{cases} nx & \text{if } 0 \le x \le \frac{1}{n}, \\ 2 - nx & \text{if } \frac{1}{n} < x \le \frac{2}{n}, \\ 0 & \text{if } \frac{2}{n} < x \le 1. \end{cases}$$
The student should draw the graph of $f_n(x)$ to observe its properties clearly. One sees that $f_n(0) = 0 \to 0$. But if $0 < x \le 1$, then there exists $N \in \mathbb{N}$ such that $n \ge N$ implies $\frac{2}{n} < x$, which implies $f_n(x) = 0$. Thus $f_n \to 0$ pointwise on $[0, 1]$, and indeed the constant function $0$ is continuous on $[0, 1]$ as well. Nevertheless, the student sees from the graph that $\|f_n - 0\|_{\sup} = 1 \not\to 0$, so $f_n$ does not converge uniformly on $[0, 1]$.

Although the answer to the question posed just above Example 2.11 is no, there is what may be called a partial converse to Theorem 2.5.1.
Theorem 2.5.5 (Dini) Suppose $f_n \in C[a, b]$ for all $n \in \mathbb{N}$, and suppose
$$f_n(x) \to f(x)$$
at least pointwise on $[a, b]$, where $f \in C[a, b]$ as well. If for each fixed $x \in [a, b]$ the sequence of numbers $f_n(x)$ is monotone decreasing, then $f_n \to f$ uniformly on $[a, b]$. (There is also an increasing version of this theorem; see Exercise 2.70.)

Proof: Let us suppose for all $x \in [a, b]$ the sequence of numbers $f_n(x)$ is a decreasing sequence converging to the limit $f(x)$. We wish to show that $\|f_n - f\|_{\sup} \to 0$ as $n \to \infty$. For convenience, denote $h_n = f_n - f$, so $h_n \in C[a, b]$ and for all $x \in [a, b]$, $h_n(x)$ is a decreasing sequence of nonnegative numbers approaching $0$. Let $\epsilon > 0$. For each fixed $x \in [a, b]$, there exists $N_x \in \mathbb{N}$ such that $n \ge N_x$ implies $0 \le h_n(x) < \frac{\epsilon}{2}$. Since $h_{N_x} \in C[a, b]$, there exists $r_x > 0$ such that $|x' - x| < r_x$ and $x' \in [a, b]$ imply that
$$|h_{N_x}(x') - h_{N_x}(x)| < \frac{\epsilon}{2}.$$
Hence $0 \le h_n(x') \le h_{N_x}(x') < \epsilon$ for all $n \ge N_x$ and for all $x' \in (x - r_x, x + r_x)$ with $x' \in [a, b]$. Next observe that
$$[a, b] \subset \bigcup_{x \in [a, b]} (x - r_x, x + r_x),$$
an open cover of $[a, b]$. By the Heine-Borel Theorem, there exists a finite subcover:
$$[a, b] \subset \bigcup_{k=1}^{m} (x_k - r_{x_k}, x_k + r_{x_k}).$$
Now let
$$N = \max\{N_{x_k} \mid k = 1, 2, \ldots, m\}.$$
If $n \ge N$ and $x \in [a, b]$, then there exists $k \in \{1, \ldots, m\}$ such that
$$x \in (x_k - r_{x_k}, x_k + r_{x_k})$$
and $N \ge N_{x_k}$ as well. Thus $0 \le h_n(x) < \epsilon$; and since $h_n \in C[a, b]$, we have $\|h_n - 0\|_{\sup} < \epsilon$. ∎
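The role of monotonicity in Dini's theorem can be illustrated by comparing Example 2.9 with Example 2.11. The following sketch is an illustration added here, not part of the original text: the helper names and the grid size `steps` are assumptions, and the grid maximum is only an approximation to the true supremum. For $h_n(x) = x - x^{1 + 1/n}$ the sequence $x^{1+1/n}$ increases to $x$ on $[0, 1]$, so the increasing version of the theorem applies and the sup-norm shrinks. The tent functions of Example 2.11 are not monotone in $n$ at a fixed $x$, and their sup-norm stays at $1$.

```python
def h(n, x):
    """h_n(x) = x - x**(1 + 1/n), the gap f - f_n from Example 2.9."""
    return x - x ** (1.0 + 1.0 / n)


def sup_h_closed_form(n):
    """Maximum of h_n on [0, 1], attained at x_n = (1 + 1/n)**(-n)."""
    x_n = (1.0 + 1.0 / n) ** (-n)
    return x_n * (1.0 - 1.0 / (1.0 + 1.0 / n))


def sup_h_grid(n, steps=10_000):
    """Grid estimate of the same maximum (an approximation, not a proof)."""
    return max(h(n, k / steps) for k in range(steps + 1))


def tent(n, x):
    """The tent function f_n of Example 2.11 (n >= 2)."""
    if x <= 1.0 / n:
        return n * x
    if x <= 2.0 / n:
        return 2.0 - n * x
    return 0.0


for n in (1, 2, 5, 50):
    print(n, sup_h_closed_form(n), sup_h_grid(n))

# At the fixed point x = 0.25 the tent values rise and then fall,
# so the monotonicity hypothesis of Dini's theorem fails.
print([tent(n, 0.25) for n in (2, 4, 8, 16)])
```

The peak value `tent(n, 1.0 / n)` remains essentially $1$ for every $n$, which is why pointwise convergence to the continuous limit $0$ does not upgrade to uniform convergence in Example 2.11.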
EXERCISES In some of these exercises it will be convenient to use the derivative to aid in finding a sup-norm. 2.60 Prove that f n ( x) = ~ ---> 0 pointwise on lR but not uniformly on R However, prove the convergence is uniform on [0, 1].
=
x•~n. Prove: fn ---> 0 uniformly on R
2.61
Let fn(x)
2.62
Show that fn(x) = x~n converges uniformly on [0,1] but not on [0, oo ).
2.63
2.64
Let fn(x) = xe-nx for all x E [0, oo). a) Find llfnllsup for all n. b) Prove that .fn---> 0 uniformly on [0, oo). Let fn(x)
= nxe-nx for all x
E
[0, oo).
a) Find llfnllsup for all n. b) Determine whether or not fn converges uniformly on your claim.
[0, oo), and prove
2
2.65
Let fn(x) = xe-nx for all x E R a) Find llfn[[sup for all n. b) Determine whether or not fn converges uniformly on lR and prove your claim.
2.66
Let fn(x) = H:x2 for all X E R a) Find the pointwise limit of fn· b) Does fn converge uniformly on IR? Prove your conclusion.
2.67 Let $f_n(x) = \sin^n x$ and let $\delta > 0$.
a) Prove: $f_n$ converges uniformly on $[0, \frac{\pi}{2} - \delta]$, but not uniformly on $[0, \frac{\pi}{2}]$.
b) In which sense does $f_n$ converge on $[0, \pi/2)$? Prove your claim.
2.68 Suppose $f_n(x) = 1 - x^n$. Decide whether or not $f_n$ converges uniformly on each interval, and prove your conclusion.
a) $[0, 1]$
b) $[0, \frac{1}{2}]$
c) $[0, b]$, where $0 \le b < 1$
d) $[0, 1)$

2.69 Let
$$f_n(x) = \begin{cases} \ldots & \text{if } 0 < x < \frac{1}{n}, \\ \ldots & \text{if } x \in [0, 1] \setminus \left(0, \frac{1}{n}\right), \end{cases}$$
for all $n \in \mathbb{N}$. For each $x \in [0, 1]$, find $\lim_{n \to \infty} f_n(x)$. Is the convergence uniform on $[0, 1]$? Prove your conclusion.
2.70 † Complete the proof of Theorem 2.5.5 by treating the case in which, for all $x \in [a, b]$, the sequence $f_n(x)$ is an increasing sequence of numbers converging to the limit $f(x)$. If $f_n$ and $f$ are all continuous, prove the convergence is uniform. (Hint: Let $g_n = -f_n$ and use the case proven already.)
2.6 TEST YOURSELF
EXERCISES
2.71 Is the set $\{f_{a,b}(x) = ax + b \mid a \in \mathbb{Q} \text{ and } b \in \mathbb{Q}\}$ of all linear functions with rational coefficients countable or uncountable?

2.72 True or Give a Counterexample: Every open subset of $\mathbb{R}$ is the union of countably many open intervals.

2.73 True or Give a Counterexample: If $f : [0, 1] \to \mathbb{R}$ is bounded on $[0, 1)$ but not continuous at $0$, then $f(x)$ converges to some $L \neq f(0)$ as $x \to 0+$.

2.74 Find all the cluster points of the set $\mathbb{Q}$ of all rational numbers.

2.75 True or Give a Counterexample: If $f : [0, 1] \to [0, 1]$ is a continuous function then there exists $c \in [0, 1]$ such that $f(c) = \sqrt{c}$.

2.76 Give an example of a sequence of unbounded functions $f_n(x)$ that converges uniformly to $f(x) = \log x$ on $\mathbb{R}$.

2.77 Give an example of a bounded, continuous function on $(0, 1)$ that is not uniformly continuous on $(0, 1)$.

2.78 Is the given set a vector space or not a vector space?
a) The set $P_{2n-1}$ of all polynomials of odd degree.
b) The set of all bounded functions on $\mathbb{R}$.

2.79 Find $\|f\|_{\sup}$ if $f(x) = x^3$ on $(-1, 0.5)$.

2.80
fn(x) =
e:. Answer True or False:
a) fn converges pointwise on R
b) fn converges uniformly on R
2.81 Let $f_n(x) = xe^{-nx}$ for all $x \in [0, \infty)$. Find $\|f_n\|_{\sup}$.
2.82 Suppose $f_n(x) = 1 - x^n$. True or False: $f_n$ converges uniformly on the given interval. a) $[0, b]$, where $0 \le b < 1$ b) $[0, 1]$ c) $[0, 1)$
2.83 Give an example of a sequence $x_n \to 0+$ for which $\sin\frac{1}{x_n} = 1$ for all $n \in \mathbb{N}$.
2.84 Let $f(x) = x \sin\frac{1}{x}$ for all $x \in (0, 1)$. Let $\epsilon > 0$. Find a value of $\delta > 0$ such that if $x$ and $x'$ are in $(0, \delta)$, then $|f(x) - f(x')| < \epsilon$.
2.85 True or False: The function $f(x) = x \sin\frac{1}{x}$ is uniformly continuous on $(0, 1)$.
2.86 Suppose $f$ is a monotone increasing function on $\mathbb{R}$. True or Give a Counterexample: $\lim_{x \to 0+} f(x) = f(0)$.
2.87 Let $\epsilon > 0$. Find a value of $\delta > 0$ for which whenever $x$ and $a$ are both in $[0, \infty)$ with $|x - a| < \delta$, this implies $|\sqrt{x} - \sqrt{a}| < \epsilon$.
2.88 Let $f_n(x) = 1 + \sin^n\frac{\pi x}{2}$ for all $x \in [0, 1]$. True or False: The sequence $f_n$ is uniformly convergent on a) $[0, \frac{1}{2}]$ b) $[0, b]$ for each $b \in [0, 1)$. c) $[0, 1)$
CHAPTER 3
RIEMANN INTEGRAL
3.1
DEFINITION AND BASIC PROPERTIES
The Riemann integral, denoted by $\int_a^b f(x)\,dx$, is an especially useful concept for both pure and applied mathematics. But it is much more difficult to define than the other, simpler limits studied earlier in this book. The integral is defined to meet the following objectives. If $f$ is a positive-valued function defined on an interval $[a, b]$, then the integral should be the area of the region of the $xy$-plane above the interval $[a, b]$ on the $x$-axis and below the graph of $y = f(x)$. If $f$ is not strictly positive, then the integral is intended to be a signed area, by which we mean the area between the $x$-axis and the positive part of the graph of $f$, minus the area between the $x$-axis and the negative part of the graph of $y = f(x)$.

To these ends, the integral is defined as a limit of so-called Riemann sums, which are determined by $f$, by a so-called partition $P$ of $[a, b]$, and by a choice of so-called evaluation points $\bar{x}_i$. Now we define all these terms carefully.

Definition 3.1.1 A partition $P$ is an ordered list of finitely many points starting with $a$ and ending with $b > a$. Thus $P = \{x_0, x_1, \ldots, x_n\}$, where
$$a = x_0 < x_1 < \cdots < x_n = b.$$
Advanced Calculus: An Introduction to Linear Analysis. By Leonard F. Richardson Copyright © 2008 John Wiley & Sons, Inc.
These points are regarded as partitioning $[a, b]$ into $n$ contiguous subintervals $[x_{i-1}, x_i]$, $i = 1, \ldots, n$. The length of the $i$th subinterval is given by $\Delta x_i = x_i - x_{i-1}$. The mesh of the partition is denoted and defined by
$$\|P\| = \max\{\Delta x_i \mid i = 1, 2, \ldots, n\}.$$
In each of the $n$ subintervals, we select an arbitrary evaluation point $\bar{x}_i \in [x_{i-1}, x_i]$. We define the Riemann sum
$$P(f, \{\bar{x}_i\}) = \sum_{i=1}^{n} f(\bar{x}_i)\,\Delta x_i.$$
Each summand of the Riemann sum is understood to represent the area of a rectangle of base $[x_{i-1}, x_i]$ and height $f(\bar{x}_i)$, if $f(\bar{x}_i) \ge 0$, or minus such an area if $f(\bar{x}_i) < 0$.

Definition 3.1.2 The Riemann integral on a closed finite interval $[a, b]$ of a bounded function $f$ is said to exist and have the value $L \in \mathbb{R}$, provided that for each $\epsilon > 0$ there exists $\delta > 0$ such that $\|P\| < \delta$ implies
$$|P(f, \{\bar{x}_i\}) - L| < \epsilon \tag{3.1}$$
independent of the choice of the partition and the evaluation points. In this case we write
$$\int_a^b f(x)\,dx = L$$
and we write also that
$$\lim_{\|P\| \to 0} P(f, \{\bar{x}_i\}) = \int_a^b f(x)\,dx.$$
But it must be understood that this limit is in the very intricate sense of Definition 3.1.2. The family of all so-called Riemann integrable functions $f$ is denoted by $\mathcal{R}[a, b]$. Thus, $f \in \mathcal{R}[a, b]$ if and only if $\int_a^b f(x)\,dx$ exists.
It must be emphasized that the required inequality (3.1) must be satisfied independent of the choice of $P$ and independent of the choice of $\{\bar{x}_i\}$, just so long as $\|P\| < \delta$.

EXAMPLE 3.1 Let $f(x) = c$, a constant, for all $x \in [a, b]$. Then $f \in \mathcal{R}[a, b]$ and
$$\int_a^b f(x)\,dx = c(b - a).$$
The proof is simple: Just observe that for all partitions $P$ of $[a, b]$, and for all chosen evaluation points $\{\bar{x}_i\}$, we have
$$P(f, \{\bar{x}_i\}) = \sum_{i=1}^{n} c\,\Delta x_i = c \sum_{i=1}^{n} \Delta x_i = c(b - a).$$
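The definitions of mesh and Riemann sum can be checked numerically. The following is a minimal sketch, not part of the text: the function names `mesh` and `riemann_sum`, the sample partition of $[0, 2]$, and the constant $c = 5$ are all illustrative assumptions. Exact rational arithmetic is used so that the identity of Example 3.1 holds without rounding error.

```python
from fractions import Fraction

def mesh(partition):
    """Return ||P||, the length of the longest subinterval of the partition."""
    return max(v - u for u, v in zip(partition, partition[1:]))

def riemann_sum(f, partition, eval_points):
    """Return P(f, {x_bar_i}): the sum of f(x_bar_i) * delta_x_i."""
    total = 0
    for left, right, x_bar in zip(partition, partition[1:], eval_points):
        if not left <= x_bar <= right:
            raise ValueError("each evaluation point must lie in its subinterval")
        total += f(x_bar) * (right - left)
    return total

# A deliberately uneven partition of [a, b] = [0, 2] (hypothetical sample data).
P = [Fraction(0), Fraction(1, 3), Fraction(1), Fraction(7, 4), Fraction(2)]
left_points = P[:-1]                                   # x_bar_i = x_{i-1}
mid_points = [(u + v) / 2 for u, v in zip(P, P[1:])]   # x_bar_i = midpoint

c = Fraction(5)
constant = lambda x: c   # the constant function of Example 3.1
```

Whatever evaluation points are chosen, `riemann_sum(constant, P, ...)` returns $c(b - a)$, which is the content of Example 3.1.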
EXAMPLE 3.2 Let
$$f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \cap [0, 1], \\ 0 & \text{if } x \in [0, 1] \setminus \mathbb{Q}. \end{cases}$$
Then $f \notin \mathcal{R}[0, 1]$. This can be understood as follows. No matter how small we make $\|P\|$, each subinterval determined by the partition must contain both rational and irrational numbers (see Section 1.6). We are free to pick all $\bar{x}_i \in \mathbb{Q}$, in which case
$$P(f, \{\bar{x}_i\}) = \sum_{i=1}^{n} 1 \cdot \Delta x_i = 1 - 0 = 1.$$
But we are free also to select all $\bar{x}_i \notin \mathbb{Q}$, in which case $P(f, \{\bar{x}_i\}) = 0$. We cannot bring both values of the Riemann sums within $\epsilon$ of the same number $L$ if $0 < \epsilon \le \frac{1}{2}$.

Theorem 3.1.1 If $f \in \mathcal{R}[a, b]$, then $f$ is bounded on $[a, b]$.

Proof: We will suppose the theorem were false and deduce a contradiction. Denote by $L$ the value of $\int_a^b f(x)\,dx$. Then choose $\epsilon = 1$. There exists $\delta > 0$ such that $\|P\| < \delta$ implies $|P(f, \{\bar{x}_k\}) - L| < 1$, so that
$$|P(f, \{\bar{x}_k\})| < |L| + 1. \tag{3.2}$$
Let us fix $P$ of such mesh, so as to satisfy this inequality. But if $f$ is unbounded on $[a, b]$, there exists $i \in \{1, \ldots, n\}$ such that $f$ is also unbounded on $[x_{i-1}, x_i]$. However,
$$|P(f, \{\bar{x}_k\})| = \left|\sum_{k=1}^{n} f(\bar{x}_k)\,\Delta x_k\right| \ge |f(\bar{x}_i)|\,\Delta x_i - \left|\sum_{k \ne i} f(\bar{x}_k)\,\Delta x_k\right|$$
since $|\alpha + \beta| \ge \bigl||\alpha| - |\beta|\bigr|$ (see Exercise 3.1). But we can hold $\bar{x}_k$ fixed for all $k \ne i$, and vary $\bar{x}_i \in [x_{i-1}, x_i]$ at will so as to make $|f(\bar{x}_i)|$ as large as we wish, violating the bound of inequality (3.2). ∎

Remark 3.1.1 We remind the reader again of the guidance for the proving of theorems in the Introduction on page xxiii. The reader should write out a full analysis of the proof of Theorem 3.1.1 as explained earlier, and just as should be done with regard to every theorem. Note how the indirect proof begins with the supposition that a Riemann integrable function exists that is unbounded. Then we appeal to the definition of the Riemann integral and show that there must be a partition with a subinterval on which the function is unbounded. This introduces an impossible variability in the contribution of that particular interval to the value of the Riemann sum, on account of the total freedom of choice of evaluation point in each interval of the partition.
EXAMPLE 3.3 Let
$$f(x) = \begin{cases} \frac{1}{x} & \text{if } 0 < x \le 1, \\ 0 & \text{if } x = 0. \end{cases}$$
Then $f \notin \mathcal{R}[0, 1]$, since $f$ is unbounded on $[0, 1]$.
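The mechanism in the proof of Theorem 3.1.1 can be seen in a small computation. This is a sketch under stated assumptions: it takes the unbounded function to be $1/x$ on $(0, 1]$ with value 0 at $x = 0$, uses a uniform partition with ten subintervals, and introduces the helper names `riemann_sum` and `sum_with_first_point` purely for illustration. All evaluation points are held fixed except the one in the first subinterval, which is moved toward 0.

```python
from fractions import Fraction

def f(x):
    """Assumed unbounded example: 1/x on (0, 1], and 0 at x = 0."""
    return 1 / x if x != 0 else Fraction(0)

def riemann_sum(f, partition, eval_points):
    """Return the Riemann sum for the given partition and evaluation points."""
    return sum(f(t) * (v - u)
               for u, v, t in zip(partition, partition[1:], eval_points))

n = 10
P = [Fraction(k, n) for k in range(n + 1)]   # uniform partition, mesh 1/n
right_points = P[1:]                         # fixed choices x_bar_k = x_k

def sum_with_first_point(t):
    """Vary only the evaluation point in [x_0, x_1]; keep the others fixed."""
    return riemann_sum(f, P, [t] + right_points[1:])

# Move the first evaluation point toward 0: t = 1/10, 1/100, ..., 1/100000.
sums = [sum_with_first_point(Fraction(1, 10 ** j)) for j in range(1, 6)]
```

The partition never changes, yet the sums grow without bound as the single free evaluation point approaches 0, so no one number $L$ can be within 1 of all of them.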
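Theorem 3.1.2 Let $f$ and $g$ be in $\mathcal{R}[a, b]$ and $c \in \mathbb{R}$. Then
i. $f + g \in \mathcal{R}[a, b]$ and
$$\int_a^b (f + g)(x)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.$$
ii. $cf \in \mathcal{R}[a, b]$ and
$$\int_a^b cf(x)\,dx = c \int_a^b f(x)\,dx.$$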
Remark 3.1.2 Observe that Theorem 3.1.2 says $\mathcal{R}[a, b]$ is a vector space, since sums (and differences) of Riemann integrable functions are again Riemann integrable, and the same is true for constant multiples of Riemann integrable functions. Moreover, if we define a mapping $T : \mathcal{R}[a, b] \to \mathbb{R}$ by
$$T(f) = \int_a^b f(x)\,dx$$
then Theorem 3.1.2 says that $T$ is a linear function, since $T(cf + g) = cT(f) + T(g)$. A linear function from a (real) vector space to $\mathbb{R}$ is called a linear functional.

Proof: i. Let $\epsilon > 0$, let
$$L = \int_a^b f(x)\,dx \quad\text{and}\quad M = \int_a^b g(x)\,dx.$$
We know there exists $\delta_1 > 0$ such that $\|P\| < \delta_1$ implies
$$|P(f, \{\bar{x}_i\}) - L| < \frac{\epsilon}{2}.$$
Similarly, we know there exists $\delta_2 > 0$ such that $\|P\| < \delta_2$ implies
$$|P(g, \{\bar{x}_i\}) - M| < \frac{\epsilon}{2}.$$
Now let $\delta = \min\{\delta_1, \delta_2\} > 0$. Then $\|P\| < \delta$ implies
$$|P(f + g, \{\bar{x}_i\}) - (L + M)| = \bigl|[P(f, \{\bar{x}_i\}) - L] + [P(g, \{\bar{x}_i\}) - M]\bigr| \le |P(f, \{\bar{x}_i\}) - L| + |P(g, \{\bar{x}_i\}) - M| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$
Thus $f + g \in \mathcal{R}[a, b]$ and $\int_a^b (f + g)(x)\,dx = L + M$.
ii. This part is Exercise 3.2.
∎

Theorem 3.1.3 Suppose $a \le b \le c$. If $f \in \mathcal{R}[a, b]$ and $f \in \mathcal{R}[b, c]$, then $f \in \mathcal{R}[a, c]$ and
$$\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx.$$

Proof: Let $\epsilon > 0$, $L = \int_a^b f(x)\,dx$ and $M = \int_b^c f(x)\,dx$. By hypothesis, there exist $\delta_1 > 0$ and $\delta_2 > 0$ such that if $P_1$ is a partition of $[a, b]$ with $\|P_1\| < \delta_1$ and if $P_2$ is a partition of $[b, c]$ with $\|P_2\| < \delta_2$, then we have
$$|P_1(f, \{\bar{x}_i\}) - L| < \frac{\epsilon}{3} \quad\text{and}\quad |P_2(f, \{\bar{x}_i\}) - M| < \frac{\epsilon}{3}.$$
Let $B$ denote $\|f\|_{\sup}$, where the sup-norm refers to the interval $[a, c]$ and is finite. Let
$$\delta = \min\left\{\delta_1, \delta_2, \frac{\epsilon}{9B}\right\}.$$
Suppose now that $P$ is any partition of $[a, c]$ with $\|P\| < \delta$. Unfortunately, $P$ needn't be the union of a partition of $[a, b]$ with a partition of $[b, c]$, since we might not have $b \in P$. So we let $P' = P \cup \{b\} = P_1 \cup P_2$, so $\|P'\| < \delta$ too, and the same is true for $P_1$ and $P_2$, which are partitions of $[a, b]$ and $[b, c]$, respectively. Then
$$|P(f, \{\bar{x}_i\}) - (L + M)| \le |P(f, \{\bar{x}_i\}) - P'(f, \{\bar{x}_i\})| + |P'(f, \{\bar{x}_i\}) - (L + M)|$$
$$\le |P(f, \{\bar{x}_i\}) - P'(f, \{\bar{x}_i\})| + |P_1(f, \{\bar{x}_i\}) - L| + |P_2(f, \{\bar{x}_i\}) - M| < \epsilon,$$
which we justify as follows. The sum of the final two summands on the right is less than $\frac{2\epsilon}{3}$ by the choice of $\delta$. For the first summand we observe that if $b$ was in $P$ from the outset, then $P = P'$ and the first summand is then zero. But if $b \notin P$, we see the only nonzero contributions come from the one interval of $P$ that contains $b$, and from the two intervals of $P'$ that have $b$ as an endpoint, since the contributions of the other intervals cancel. Thus the first summand is less than $3\delta \cdot B \le \epsilon/3$. ∎
Remark 3.1.3 If $a \le b$, we define $\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx$, which we understand to mean that if $a = b$, then the integral is zero. This enables us to extend Theorem 3.1.3 as follows. First,
$$\int_a^b f(x)\,dx + \int_b^a f(x)\,dx = 0 = \int_a^a f(x)\,dx$$
because of our definition. And if $a \le c \le b$ we have
$$\int_a^c f(x)\,dx + \int_c^b f(x)\,dx = \int_a^b f(x)\,dx$$
by Theorem 3.1.3 above, so that
$$\int_a^c f(x)\,dx = \int_a^b f(x)\,dx - \int_c^b f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx.$$
If $c < a$, then a similar argument could be given.
EXERCISES
3.1 Prove: For all $\alpha$ and $\beta$ in $\mathbb{R}$, $|\alpha + \beta| \ge \bigl||\alpha| - |\beta|\bigr|$. Hint: $|\alpha + \beta| = |\alpha - (-\beta)|$.
3.2 † Prove part (ii) of Theorem 3.1.2.
3.3 † Let $p \in [a, b]$ and define the indicator function
$$1_{\{p\}}(x) = \begin{cases} 1 & \text{if } x = p, \\ 0 & \text{if } x \ne p. \end{cases}$$
Prove that $1_{\{p\}} \in \mathcal{R}[a, b]$ and
$$\int_a^b 1_{\{p\}}(x)\,dx = 0.$$
Hint: Find an upper bound on the value of $P(1_{\{p\}}, \{\bar{x}_i\})$.
3.4 † Suppose $h(x) = 0$ except at finitely many points $x_1, x_2, \ldots, x_k$ in $[a, b]$. Prove: $h \in \mathcal{R}[a, b]$ and
$$\int_a^b h(x)\,dx = 0.$$
(Hint: Use Exercise 3.3 above and Theorem 3.1.2.)
3.5 Suppose $f \in \mathcal{R}[a, b]$ and $g(x) = f(x)$ except at finitely many points in $[a, b]$. Prove: $g \in \mathcal{R}[a, b]$ and
$$\int_a^b g(x)\,dx = \int_a^b f(x)\,dx.$$
(Hint: Write $h(x) = g(x) - f(x)$ and apply Exercise 3.4.)
3.6 † Suppose $f \in \mathcal{R}[a, b]$ and $f(x) \ge 0$ for all $x \in [a, b]$. Prove:
$$\int_a^b f(x)\,dx \ge 0.$$
(Hint: Find a lower bound for all Riemann sums $P(f, \{\bar{x}_i\})$.)
3.7 Suppose $f$ and $g$ are in $\mathcal{R}[a, b]$ and $f(x) \le g(x)$ for all $x \in [a, b]$. Prove:
$$\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx.$$
(Hint: Let $h(x) = g(x) - f(x)$. Use Theorem 3.1.2 and Exercise 3.6 above.)
3.8 † Define a step function on $[a, b]$ as follows. We call $\sigma$ a step function if there exists a partition
$$a = x_0 < x_1 < \cdots < x_n = b$$
such that $\sigma(x) = c_i$ for all $x \in (x_{i-1}, x_i)$, $i = 1, \ldots, n$. Thus $\sigma$ is constant on each open interval $(x_{i-1}, x_i)$. The values of $\sigma$ at the points of the set $\{x_0, x_1, \ldots, x_n\}$ are arbitrary. Prove: The function $\sigma \in \mathcal{R}[a, b]$ and
$$\int_a^b \sigma(x)\,dx = \sum_{i=1}^{n} c_i\,\Delta x_i.$$
3.9 † Suppose $f \in \mathcal{R}[a, b]$ and $m \le f(x) \le M$, for all $x \in [a, b]$. Prove:
$$m(b - a) \le \int_a^b f(x)\,dx \le M(b - a).$$
3.10 † (Mean Value Theorem for Integrals) Let $f \in C[a, b]$. You may use the result of Theorem 3.2.2 stating that $C[a, b] \subset \mathcal{R}[a, b]$. Prove: There exists $\bar{x} \in [a, b]$ such that
$$\int_a^b f(x)\,dx = f(\bar{x})(b - a).$$
(Hint: Let $m = \inf\{f(x) \mid x \in [a, b]\}$ and $M = \sup\{f(x) \mid x \in [a, b]\}$. Use Exercise 3.9 above and the Intermediate Value Theorem for continuous functions.)
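3.11 Suppose $f \in \mathcal{R}[a, b]$. Prove that
$$\lim_{n \to \infty} \frac{b - a}{n} \sum_{k=1}^{n} f\left(a + k\,\frac{b - a}{n}\right) = \int_a^b f(x)\,dx.$$
3.12 Express
$$\lim_{n \to \infty} \frac{2}{n} \sum_{k=1}^{n} \cos\left(1 + \frac{2k}{n}\right)$$
as an integral. (Be sure to include the lower and upper limits of integration.)
3.13 † Let $f \in C[a, b]$. Show that there exists a sequence of step functions $\sigma_n$ (Exercise 3.8) such that $\sigma_n \to f$ uniformly on $[a, b]$. (Hint: The function $f$ is uniformly continuous on $[a, b]$.)
3.14 ◊ Let
$$f(x) = \begin{cases} 1 & \text{if } x \in \{\frac{1}{n} \mid n \in \mathbb{N}\}, \\ 0 & \text{if } x \in [0, 1] \setminus \{\frac{1}{n} \mid n \in \mathbb{N}\}. \end{cases}$$
Prove: $f \in \mathcal{R}[0, 1]$ and
$$\int_0^1 f(x)\,dx = 0.$$
Hint: Let $\epsilon > 0$ and show there exists $\delta > 0$ such that $\|P\| < \delta$ implies $|P(f, \{\bar{x}_i\})| < \epsilon$.
3.15 Let $f$ be defined as in Exercise 3.14. If $\sigma$ is any step function (Exercise 3.8) on $[0, 1]$, prove that $\|\sigma - f\|_{\sup} \ge \frac{1}{2}$.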
3.2 THE DARBOUX INTEGRABILITY CRITERION
Here we will state and prove an alternative but equivalent criterion for the Riemann integrability of a function $f : [a, b] \to \mathbb{R}$. The new criterion will be very useful for proving important theorems about the Riemann integral. In particular, the Riemann sum of a function $f$ depends on the choice of a partition $P$ and arbitrary evaluation points $\{\bar{x}_i\}$. It is helpful to give a condition for Riemann integrability of $f$ that does not refer to evaluation points. Since every $f \in \mathcal{R}[a, b]$ must be bounded on $[a, b]$, we can define Upper Sums and Lower Sums for $f$ as follows.

Definition 3.2.1 Let $f$ be any bounded function on $[a, b]$ and let $P$ be any partition of $[a, b]$. Let
$$M_i = \sup\{f(x) \mid x \in [x_{i-1}, x_i]\} \quad\text{and}\quad m_i = \inf\{f(x) \mid x \in [x_{i-1}, x_i]\}.$$
Note that $M_i$ and $m_i$ are in $\mathbb{R}$. Define the upper sum and the lower sum by
$$U(f, P) = \sum_{i=1}^{n} M_i\,\Delta x_i \quad\text{and}\quad L(f, P) = \sum_{i=1}^{n} m_i\,\Delta x_i.$$
Observe that
$$L(f, P) \le P(f, \{\bar{x}_i\}) \le U(f, P)$$
for all $P$ and for all $\{\bar{x}_i\}$, since $m_i \le f(\bar{x}_i) \le M_i$ for all $i$. We can say more.

Lemma 3.2.1 If $P$ and $P'$ are any two partitions of $[a, b]$, then $L(f, P) \le U(f, P')$.
Proof: We begin with the special case in which $P$ and $P'$ differ by a single point. Let $P = \{x_0, \ldots, x_n\}$ be the first partition, and let $z \in (x_{k-1}, x_k)$, for one specific index $k$ between 1 and $n$. Let $P' = P \cup \{z\}$. Thus $P'$ differs from $P$ only in that the $k$th interval of $P$ has been subdivided into two smaller intervals by $P'$, and those smaller intervals will be the $k$th and $(k + 1)$th determined by $P'$. Let $m'_k$, $m'_{k+1}$, $M'_k$, and $M'_{k+1}$ denote the inf and sup of $f$ for the two smaller intervals. Now $m'_k \ge m_k$ and $m'_{k+1} \ge m_k$, whereas $M'_k \le M_k$ and $M'_{k+1} \le M_k$. Thus
$$m_k\,\Delta x_k \le m'_k(z - x_{k-1}) + m'_{k+1}(x_k - z) \le M'_k(z - x_{k-1}) + M'_{k+1}(x_k - z) \le M_k\,\Delta x_k.$$
Hence
$$L(f, P) \le L(f, P') \le U(f, P') \le U(f, P).$$
So far, we have shown that adding one point to $P$ lowers the upper sum and raises the lower sum. If we add a finite number of points to $P$ to form $P'$, we use this argument repeated finitely often to show the same inequality. Now, suppose $P$ and $P'$ are any two partitions of $[a, b]$. Then consider their so-called mutual refinement $P \cup P' = P''$. Then we have
$$L(f, P) \le L(f, P'') \le U(f, P'') \le U(f, P').$$
∎

Definition 3.2.2 Define the upper integral and the lower integral of $f$ on $[a, b]$ by
$$\overline{\int_a^b} f = \inf\{U(f, P) \mid P \text{ is a partition of } [a, b]\}$$
and
$$\underline{\int_a^b} f = \sup\{L(f, P) \mid P \text{ is a partition of } [a, b]\},$$
respectively. Since every lower sum is less than or equal to every upper sum, it follows from Exercise 1.32 that
$$\underline{\int_a^b} f \le \overline{\int_a^b} f.$$
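The chain of inequalities in Lemma 3.2.1 can be verified on a concrete case. The sketch below is illustrative only: it assumes an increasing function, $f(x) = x^2$ on $[0, 1]$, so that the infimum and supremum on each subinterval are the values at the left and right endpoints, and the helper names `darboux_sums` and `mutual_refinement` are hypothetical.

```python
from fractions import Fraction

def darboux_sums(f, partition):
    """Return (L(f, P), U(f, P)) for an INCREASING function f.

    For increasing f the infimum on [x_{i-1}, x_i] is f(x_{i-1}) and the
    supremum is f(x_i), so no search for extreme values is needed.
    """
    pairs = list(zip(partition, partition[1:]))
    lower = sum(f(u) * (v - u) for u, v in pairs)
    upper = sum(f(v) * (v - u) for u, v in pairs)
    return lower, upper

def mutual_refinement(p, q):
    """Return the partition P'' = P union P' as a sorted list without repeats."""
    return sorted(set(p) | set(q))

square = lambda x: x * x   # increasing on [0, 1]

P1 = [Fraction(0), Fraction(1, 2), Fraction(1)]
P2 = [Fraction(0), Fraction(1, 3), Fraction(2, 3), Fraction(1)]
P3 = mutual_refinement(P1, P2)

L1, U1 = darboux_sums(square, P1)
L2, U2 = darboux_sums(square, P2)
L3, U3 = darboux_sums(square, P3)
```

Here `L1 <= L3 <= U3 <= U2`, which is the inequality $L(f, P) \le L(f, P'') \le U(f, P'') \le U(f, P')$ for these two partitions, and the value $\frac{1}{3}$ of the integral lies between `L3` and `U3`.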
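Theorem 3.2.1 (Darboux Integrability Criterion) Let $f$ be bounded on $[a, b]$. Then $f \in \mathcal{R}[a, b]$ if and only if
$$\lim_{\|P\| \to 0} [U(f, P) - L(f, P)] = 0.$$
When this condition is satisfied,
$$\int_a^b f(x)\,dx = \underline{\int_a^b} f = \overline{\int_a^b} f.$$

Proof: For the right-to-left implication, we suppose that
$$\lim_{\|P\| \to 0} [U(f, P) - L(f, P)] = 0.$$
Since
$$L(f, P) \le \underline{\int_a^b} f \le \overline{\int_a^b} f \le U(f, P)$$
for all $P$, we have
$$0 \le \overline{\int_a^b} f - \underline{\int_a^b} f \le U(f, P) - L(f, P) \to 0$$
as $\|P\| \to 0$. It follows from the squeeze theorem that
$$\underline{\int_a^b} f = \overline{\int_a^b} f,$$
which we denote by $L$. But then we know that if $\epsilon > 0$ there exists $\delta > 0$ such that for all $P$ with mesh less than $\delta$ we must have both $L$ and $P(f, \{\bar{x}_i\})$ between $L(f, P)$ and $U(f, P)$, and thus less than $\epsilon$ apart, independent of the choice of evaluation points $\bar{x}_i$. Thus letting $\|P\| \to 0$, we see that
$$\lim_{\|P\| \to 0} P(f, \{\bar{x}_i\}) = L$$
so that $f \in \mathcal{R}[a, b]$ and $\int_a^b f(x)\,dx = L$.

For the left-to-right implication we suppose $f \in \mathcal{R}[a, b]$, and write
$$L = \int_a^b f(x)\,dx.$$
Let $\epsilon > 0$. There exists $\delta > 0$ such that $\|P\| < \delta$ implies
$$|P(f, \{\bar{x}_i\}) - L| < \frac{\epsilon}{4},$$
independent of the choice of $\{\bar{x}_i\}$. We claim it is possible to choose $\{\bar{x}_i\}$ such that
$$|P(f, \{\bar{x}_i\}) - U(f, P)| < \frac{\epsilon}{4},$$
and we could choose $\{\bar{x}'_i\}$ such that $|P(f, \{\bar{x}'_i\}) - L(f, P)| < \frac{\epsilon}{4}$. To prove the first claim, denote $P = \{a = x_0 < \cdots < x_n = b\}$ and select $\bar{x}_i \in [x_{i-1}, x_i]$ such that
$$M_i - f(\bar{x}_i) < \frac{\epsilon}{4(b - a)}$$
for all $i$. Then
$$0 \le U(f, P) - P(f, \{\bar{x}_i\}) = \sum_{i=1}^{n} (M_i - f(\bar{x}_i))\,\Delta x_i < \frac{\epsilon}{4(b - a)} \sum_{i=1}^{n} \Delta x_i = \frac{\epsilon}{4}.$$
A similar argument can be given for the second claim. (See Exercise 3.16.) Now, if $\|P\| < \delta$ we have
$$|U(f, P) - L(f, P)| \le |U(f, P) - P(f, \{\bar{x}_i\})| + |P(f, \{\bar{x}_i\}) - L| + |L - P(f, \{\bar{x}'_i\})| + |P(f, \{\bar{x}'_i\}) - L(f, P)| < 4\left(\frac{\epsilon}{4}\right) = \epsilon.$$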
∎

Theorem 3.2.2 The space $C[a, b] \subset \mathcal{R}[a, b]$. (That is, every continuous function on a closed finite interval is Riemann integrable.)

Proof: Let $f \in C[a, b]$. By Theorem 2.3.2, $f$ is also uniformly continuous on $[a, b]$. If $\epsilon > 0$, there exists $\delta > 0$ such that $|x_1 - x_2| < \delta$ implies
$$|f(x_1) - f(x_2)| < \frac{\epsilon}{b - a}$$
for all $x_1$ and $x_2$ in $[a, b]$. Now let $P = \{a = x_0 < \cdots < x_n = b\}$ such that $\|P\| < \delta$. In each $[x_{i-1}, x_i]$ we have $M_i - m_i < \frac{\epsilon}{b - a}$ since $M_i$ and $m_i$ are actual values of $f$ achieved at points of this interval, less than $\delta$ apart. Thus
$$U(f, P) - L(f, P) = \sum_{i=1}^{n} (M_i - m_i)\,\Delta x_i < \epsilon.$$
Hence
$$\lim_{\|P\| \to 0} [U(f, P) - L(f, P)] = 0,$$
so $f \in \mathcal{R}[a, b]$. ∎
Theorem 3.2.3 If $f$ is monotone on $[a, b]$, then $f \in \mathcal{R}[a, b]$.

Proof: Suppose first that $f$ is increasing. Then, for each partition $P$ of $[a, b]$ we have
$$U(f, P) - L(f, P) = \sum_{i=1}^{n} [f(x_i) - f(x_{i-1})]\,\Delta x_i \le \|P\| \sum_{i=1}^{n} [f(x_i) - f(x_{i-1})] = \|P\|\,[f(b) - f(a)] \to 0$$
as $\|P\| \to 0$. Thus $f \in \mathcal{R}[a, b]$. Finally if $f$ is decreasing, then $-f$ is integrable by the first part of this proof, and so $f = -(-f) \in \mathcal{R}[a, b]$ as well. ∎

Here is a variant of the Darboux Integrability Criterion which is often useful. (See Exercises 3.26 and 3.27 for applications of this variant.)
Theorem 3.2.4 Let $f$ be bounded on $[a, b]$. If for all $\epsilon > 0$ there exists a partition $P_0$ of $[a, b]$ such that $U(f, P_0) - L(f, P_0) < \epsilon$, then $f \in \mathcal{R}[a, b]$.

Proof: Let $M = \|f\|_{\sup} \in \mathbb{R}$, and let $\epsilon > 0$. By hypothesis there exists $P_0 = \{x_0, \ldots, x_N\}$ such that $U(f, P_0) - L(f, P_0) < \frac{\epsilon}{3}$. Let
$$\delta = \frac{\epsilon}{12MN}.$$
Let $\|P\| < \delta$. We wish to show that $U(f, P) - L(f, P) < \epsilon$, establishing that $f \in \mathcal{R}[a, b]$ by the Darboux Integrability Criterion. Let $P' = P \cup P_0$, so that $U(f, P') - L(f, P') < \frac{\epsilon}{3}$ because $P' \supseteq P_0$. It will suffice to show that
$$|U(f, P) - U(f, P')| < \frac{\epsilon}{3} \quad\text{and}\quad |L(f, P) - L(f, P')| < \frac{\epsilon}{3}$$
because we see from the triangle inequality that
$$|U(f, P) - L(f, P)| \le |U(f, P) - U(f, P')| + |U(f, P') - L(f, P')| + |L(f, P') - L(f, P)|.$$
But $P' \supseteq P$ and has at most $N$ points not already in $P$. Thus at most $N$ subintervals from $P$ and at most $2N$ intervals from $P'$ are not common to both upper sums. Hence
$$|U(f, P) - U(f, P')| < NM\delta + 2NM\delta < \frac{\epsilon}{3}.$$
Similar reasoning applies to the lower sums. ∎
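EXERCISES
3.16 † Complete the proof of Theorem 3.2.1 by showing that we could choose $\bar{x}'_i \in [x_{i-1}, x_i]$ for all $i$ such that $|P(f, \{\bar{x}'_i\}) - L(f, P)| < \frac{\epsilon}{4}$.
3.17 Let
$$f(x) = \begin{cases} 1 & \text{if } x = 1, \\ 0 & \text{if } x \in [0, 2] \setminus \{1\}. \end{cases}$$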
Find a partition P of [0, 2] for which U(f, P) - L(f, P) 3.18
<
k·
Suppose
f(x) =
{~
if X= 0, ifxE
(n~l'~],
where n E N. Prove: f E R[O, 1], even though f has infinitely many discontinuities. (Hint: Consider Theorem 3.2.3.) 3.19
Suppose
f(x) =
{~-~)n
ifx
= 0,
if X E (
n~l, ~]
,
where n E N. Prove: f E R[O, 1], even though f has infinitely many discontinuities. (Hint: Use the Darboux Criterion.) 3.20
† Let $f$ be any real-valued function on a domain $D \subseteq \mathbb{R}$. Define
$$f^+(x) = \begin{cases} f(x) & \text{if } f(x) \ge 0, \\ 0 & \text{if } f(x) < 0, \end{cases}$$
and let
$$f^-(x) = \begin{cases} -f(x) & \text{if } f(x) < 0, \\ 0 & \text{if } f(x) \ge 0 \end{cases}$$
for all $x \in D$. Prove that
$$f(x) = f^+(x) - f^-(x) \quad\text{and}\quad |f(x)| = f^+(x) + f^-(x)$$
for all $x \in D$. (Hint: Just check the cases based on the sign of $f(x)$.)
3.21 † Suppose $f \in \mathcal{R}[a, b]$. Prove: $f^+$ and $f^-$ are in $\mathcal{R}[a, b]$. Hint: Show that
$$U(f^+, P) - L(f^+, P) \le U(f, P) - L(f, P).$$
3.22 If $f \in \mathcal{R}[a, b]$, prove $|f| \in \mathcal{R}[a, b]$. (Hint: Use Exercises 3.20 and 3.21 above.)
3.23 Prove or give a counterexample: $|f| \in \mathcal{R}[a, b] \iff f \in \mathcal{R}[a, b]$.
3.24 † If $f \in \mathcal{R}[a, b]$, prove:
$$\left|\int_a^b f(x)\,dx\right| \le \int_a^b |f(x)|\,dx.$$
(Hint: Write the left side by expressing $f = f^+ - f^-$ and use the fact that $f^+$ and $f^-$ are both nonnegative functions.)
3.25 Suppose $f \in C[a, b]$ is everywhere nonnegative, and suppose there exists $p \in [a, b]$ such that $f(p) > 0$. Prove: $\int_a^b f(x)\,dx > 0$.
3.26 † Suppose $f \in C(a, b)$ and also that $f$ is bounded on $[a, b]$. Prove: $f \in \mathcal{R}[a, b]$. (Hint: Use Theorem 3.2.4.)
t Let
3.27
ifO<x:S:l, if X= 0. Prove:
f
E R[O, 1]. (Hint: You may use Exercise 3.26.)
3.28 If $f \in \mathcal{R}[a, b]$ and if $[c, d] \subset [a, b]$, prove: $f \in \mathcal{R}[c, d]$. Hint: If $P$ is any partition of $[c, d]$, $P$ can be extended to a partition $P^*$ of $[a, b]$ with $\|P^*\| \le \|P\|$. Show that $U(f, P) - L(f, P) \le U(f, P^*) - L(f, P^*)$.
3.29 † (9) Let $f \in \mathcal{R}[a, b]$ and let $\epsilon > 0$. Prove there exist step functions (Exercise 3.8) satisfying $\sigma(x) \le f(x) \le \sigma'(x)$ for all $x \in [a, b]$, so that
$$\int_a^b \sigma(x)\,dx \le \int_a^b f(x)\,dx \le \int_a^b \sigma'(x)\,dx$$
such that
$$\int_a^b [\sigma'(x) - \sigma(x)]\,dx < \epsilon$$
and implying also that
$$\int_a^b |f(x) - \sigma(x)|\,dx < \epsilon.$$
(Hint: Use upper and lower Darboux sums.)
3.30 Let $f$ be a bounded function on $[a, b]$.
a) Prove that $f \in \mathcal{R}[a, b] \iff \underline{\int_a^b} f = \overline{\int_a^b} f$.
b) Let $f = 1_{\mathbb{Q}}$. Find the numerical values of $\overline{\int_a^b} f$ and $\underline{\int_a^b} f$. Explain why $f$ is not Riemann integrable on $[a, b]$ if $a < b$.
3.31 † ◊ (10) Let $f$ and $g$ be bounded functions on $[a, b]$. Prove that
a) $\overline{\int_a^b} (f + g)(x)\,dx \le \overline{\int_a^b} f(x)\,dx + \overline{\int_a^b} g(x)\,dx$.
b) $\underline{\int_a^b} (f + g)(x)\,dx \ge \underline{\int_a^b} f(x)\,dx + \underline{\int_a^b} g(x)\,dx$.

9 This exercise is used in the proof of Theorem 6.5.1, which establishes the convergence of Fourier series in the $L^2$-norm.
3.3 INTEGRALS OF UNIFORM LIMITS
The following example shows that pointwise limits of Riemann integrable functions need not be Riemann integrable.

EXAMPLE 3.4 Since the set $\mathbb{Q}$ of rational numbers is countable, the same is true of the set $S$ of all rational numbers in $[0, 1]$. So write $S = \{q_n \mid n \in \mathbb{N}\}$. Now define a function
$$f_n(x) = \begin{cases} 1 & \text{if } x \in \{q_1, \ldots, q_n\}, \\ 0 & \text{for all other } x \in [0, 1]. \end{cases}$$
The reader can observe that for all $x \in [0, 1]$, $f_n(x) \to 1_S(x)$, where
$$1_S(x) = \begin{cases} 1 & \text{if } x \in S, \\ 0 & \text{if } x \in [0, 1] \setminus S, \end{cases}$$
the indicator function of the set of rational numbers in $[0, 1]$. Now we know that each $f_n \in \mathcal{R}[0, 1]$, by Exercise 3.4. However, the pointwise limit of the sequence $f_n$ is $1_S$, which is not Riemann integrable, as explained in Example 3.2 and also in Exercise 3.30.

The following example shows that if $f_n \in \mathcal{R}[a, b]$ for all $n$ and if $f_n \to f$ pointwise, then even if $f \in \mathcal{R}[a, b]$ as well, still it is possible that
$$\int_a^b f_n(x)\,dx \not\to \int_a^b f(x)\,dx$$
as $n \to \infty$.
EXAMPLE 3.5
Let
ifO<xS::~,
if~< X S:: 1, ifx = 0 for all n EN. 10This
exercise is used in the proof of Theorem 11.4.1, which is Fubini's theorem.
In Exercise 3.32, the student will prove that $f_n(x) \to f(x) = 0$ pointwise on $[0, 1]$. Also, it is clear that $f_n \in \mathcal{R}[0, 1]$ for all $n$, and $f \in \mathcal{R}[0, 1]$ as well. Yet
$$\int_0^1 f_n(x)\,dx = 1 \to 1 \ne 0 = \int_0^1 f(x)\,dx.$$
Thus it is false in general that
$$\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b \lim_{n \to \infty} f_n(x)\,dx.$$
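The failure of the interchange can be illustrated with a short exact computation. The sketch below does not reproduce the function of Example 3.5; it assumes a spike of height $n$ on $(0, \frac{1}{n}]$ that vanishes elsewhere on $[0, 1]$, which is one sequence consistent with the stated facts that every integral equals 1 while the pointwise limit is 0. The helper names are hypothetical.

```python
from fractions import Fraction

def f_n(n, x):
    """Assumed spike: height n on (0, 1/n], zero elsewhere on [0, 1]."""
    return Fraction(n) if 0 < x <= Fraction(1, n) else Fraction(0)

def integral_f_n(n):
    """Exact integral of the spike: a rectangle of base 1/n and height n."""
    return Fraction(n) * Fraction(1, n)

def first_index_where_zero(x, limit=10_000):
    """Return the first n with f_n(x) = 0.

    For this spike, f_m(x) = 0 for every m beyond that index as well,
    which is why f_n(x) -> 0 at each fixed x.
    """
    for n in range(1, limit + 1):
        if f_n(n, x) == 0:
            return n
    return None

integrals = [integral_f_n(n) for n in (1, 10, 100, 1000)]
```

Each entry of `integrals` equals 1, while at any fixed point such as $x = \frac{1}{7}$ the values $f_n(x)$ are eventually 0. The limit of the integrals is therefore 1, and the integral of the limit is 0.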
The following theorem shows, however, that the integral behaves much better with respect to uniform convergence.
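Theorem 3.3.1 Suppose $f_n \in \mathcal{R}[a, b]$ for all $n \in \mathbb{N}$, and suppose $f_n \to f$ uniformly on $[a, b]$. Then $f \in \mathcal{R}[a, b]$ and
$$\int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx.$$

Proof: First we will use the Darboux Criterion to prove that $f \in \mathcal{R}[a, b]$. Let $\epsilon > 0$. Then there exists $N \in \mathbb{N}$ such that $n \ge N$ implies
$$\|f_n - f\|_{\sup} < \frac{\epsilon}{3(b - a)}.$$
Also, there exists $\delta > 0$ such that $\|P\| < \delta$ implies $U(f_N, P) - L(f_N, P) < \frac{\epsilon}{3}$. Let $P = \{a = x_0 < \cdots < x_m = b\}$ have mesh less than $\delta$, and observe that on each interval $[x_{k-1}, x_k]$ we have
$$m_k^N - \frac{\epsilon}{3(b - a)} \le f_N(x) - \frac{\epsilon}{3(b - a)} < f(x) < f_N(x) + \frac{\epsilon}{3(b - a)} \le M_k^N + \frac{\epsilon}{3(b - a)},$$
where $M_k^N$ is the supremum of $f_N$ on $[x_{k-1}, x_k]$ and $m_k^N$ is the corresponding infimum. Thus
$$M_k - m_k \le M_k^N - m_k^N + \frac{2\epsilon}{3(b - a)}.$$
Now it follows that
$$|U(f, P) - L(f, P)| = \sum_{k=1}^{m} (M_k - m_k)\,\Delta x_k \le \sum_{k=1}^{m} \left(M_k^N - m_k^N + \frac{2\epsilon}{3(b - a)}\right) \Delta x_k = U(f_N, P) - L(f_N, P) + \frac{2\epsilon}{3} < \epsilon.$$
Thus $f \in \mathcal{R}[a, b]$ as claimed. With this proven, we can now show easily that $\int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx$ as $n \to \infty$. We apply Exercise 3.24 as follows:
$$\left|\int_a^b f_n(x)\,dx - \int_a^b f(x)\,dx\right| = \left|\int_a^b [f_n(x) - f(x)]\,dx\right| \le \int_a^b |f_n(x) - f(x)|\,dx \le \int_a^b \|f_n - f\|_{\sup}\,dx = \|f_n - f\|_{\sup}(b - a) \to 0$$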
as $n \to \infty$. ∎

Definition 3.3.1 If $V$ is a vector space equipped with a norm $\|\cdot\|$, we call
$$T : V \to \mathbb{R}$$
a linear functional provided
$$T(\alpha x + y) = \alpha T(x) + T(y)$$
for all $x$ and $y$ in $V$ and for all $\alpha \in \mathbb{R}$. (Compare with Remark 3.1.2.) We say that a sequence $x_n \to x \in V$ $\iff$ $\|x_n - x\| \to 0$ as $n \to \infty$. (Compare with Definition 2.5.1.) A linear functional $T$ is called continuous at $x$ if and only if for each sequence $x_n$ in $V$ such that $x_n \to x$ we have $T(x_n) \to T(x)$. $T$ is called continuous if and only if $T$ is continuous at each $x \in V$.

Now recall that $C[a, b]$ is a complete normed vector space equipped with the sup-norm. If we define $T : C[a, b] \to \mathbb{R}$ by
$$T(f) = \int_a^b f(x)\,dx$$
then $T$ is linear by Theorem 3.1.2 and $T$ is continuous by Theorem 3.3.1. Thus $T$ is a continuous linear functional on $C[a, b]$. The student will meet some more continuous linear functionals on $C[a, b]$ in the exercises below, and all of the continuous linear functionals on $C[a, b]$ will be identified by the Riesz Representation Theorem, Theorem 7.4.1.
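The continuity of $T$ rests on the estimate $|T(f_n) - T(f)| \le \|f_n - f\|_{\sup}(b - a)$ from the proof of Theorem 3.3.1. The sketch below checks that estimate on one hand-picked sequence, which is an assumption made for illustration: $f_n(x) = x + x^n/n$ and $f(x) = x$ on $[0, 1]$, where both the sup-norm distance and the integrals are known in closed form. The names `sup_distance`, `T_f_n`, `gaps`, and `bounds` are hypothetical.

```python
from fractions import Fraction

A, B = Fraction(0), Fraction(1)   # the interval [a, b] = [0, 1]

def sup_distance(n):
    """||f_n - f||_sup on [0, 1] for f_n(x) = x + x**n / n and f(x) = x.

    The difference x**n / n is increasing on [0, 1], so its sup is at x = 1.
    """
    return Fraction(1, n)

def T_f_n(n):
    """T(f_n): the exact integral of x + x**n / n over [0, 1]."""
    return Fraction(1, 2) + Fraction(1, n * (n + 1))

T_f = Fraction(1, 2)              # T(f): the exact integral of x over [0, 1]

# Compare |T(f_n) - T(f)| with the bound ||f_n - f||_sup * (b - a).
gaps = {n: abs(T_f_n(n) - T_f) for n in (1, 2, 5, 50)}
bounds = {n: (B - A) * sup_distance(n) for n in gaps}
```

In every case `gaps[n] <= bounds[n]`, and both tend to 0 as $n$ grows, in agreement with $T(f_n) \to T(f)$ for a uniformly convergent sequence.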
Lemma 3.3.1 Let $V$ be any vector space over the real numbers, equipped with a norm $\|\cdot\|$. A linear functional $T : V \to \mathbb{R}$ is continuous $\iff$ $T$ is continuous at 0.

Proof: If $T$ is continuous (at all $x \in V$) then it must be continuous at 0. So we prove the opposite implication. Note that since
$$T(0) = T(0 + 0) = T(0) + T(0),$$
we must have $T(0) = 0$. Suppose $T$ is continuous at 0: That is, $\|x_n\| \to 0$ implies $T(x_n) \to 0 = T(0)$. Let $x \in V$ be arbitrary and suppose $x_n \to x$: i.e., $\|x_n - x\| \to 0$. By hypothesis,
$$T(x_n - x) = T(x_n) - T(x) \to 0,$$
so $T(x_n) \to T(x)$. ∎
Definition 3.3.2 A linear functional $T$ on a normed vector space $V$ is called bounded if and only if there exists $K \in \mathbb{R}$ such that $|T(x)| \le K\|x\|$, for all $x \in V$.

Theorem 3.3.2 If $T$ is a linear functional on a normed vector space $V$, then $T$ is continuous if and only if $T$ is bounded.

Proof: For the implication from right to left, suppose that $T$ is bounded. It will suffice to prove $T$ is continuous at 0. So suppose $\|x_n\| \to 0$. Then
$$|T(x_n)| \le K\|x_n\| \to 0$$
and this implies that $T(x_n) \to 0$. For the implication from left to right, suppose that $T$ is continuous. We will prove $T$ is bounded by contradiction. So suppose the claim were false. Then for all $n \in \mathbb{N}$ there exists $x_n \in V$ such that
$$|T(x_n)| > n\|x_n\|.$$
Let $c_n = (\sqrt{n}\,\|x_n\|)^{-1}$ (since $x_n \ne 0$) and then
$$|T(c_n x_n)| > \sqrt{n}$$
for all $n \in \mathbb{N}$. Denote $y_n = c_n x_n$ and observe that $\|y_n\| = \frac{1}{\sqrt{n}} \to 0$, yet $T(y_n)$ fails to converge to 0; in fact it is unbounded. ∎

Because of Theorem 3.3.2, continuous linear functionals on normed linear spaces are often called bounded linear functionals. Observe that if $f \in C[a, b]$ and if we define $T(f) = \int_a^b f(x)\,dx$, then
$$|T(f)| \le \int_a^b \|f\|_{\sup}\,dx = (b - a)\|f\|_{\sup}.$$
Thus $T$ is bounded, with constant $K = b - a$.
EXERCISES
3.32 † In Example 3.5, show that $f_n(x) \to 0$ pointwise on $[0, 1]$.
3.33 In each example, find the pointwise limit and then use Theorem 3.3.1 to show that the convergence is not uniform. a) Let $f_n$ be as in Example 3.4. b) Let $f_n$ be as in Example 3.5.
3.34 Prove: $\int_1^2 (1 + x^2)^{-n}\,dx \to 0$ as $n \to \infty$. (Hint: Use Theorem 3.3.1, but be sure to prove the necessary hypotheses are satisfied.)
3.35
Find lim n->oo
!
2
1
1
1 + (1 +X2 )n dx
and justify your conclusion.
3.36
Let
ifO:Sx:S~, if..!.< X<- 1 n for each n > 1. Find a) limn-+oo fn(x) for each
b) lirnn-+oo
J
1 0
.T
E
[0, 1], and
fn(x) dx.
3.37 Prove: $\int_0^{\pi/4} \sin^n(x)\,dx \to 0$ as $n \to \infty$. (Hint: Use Theorem 3.3.1, but be sure to prove the necessary hypotheses are satisfied.)
3.38 In the following exercise, the student may use techniques of integration learned in elementary calculus, including the concept of improper integrals. (See page 135.) Find the mistake in the following reasoning. We calculate readily that
$$\int_1^n \frac{1}{t \ln n}\,dt = 1$$
for all $n$. Since $\frac{1}{t \ln n} \to 0$ uniformly on $[1, \infty)$, it follows by taking the limit on both sides that
$$\int_1^\infty 0\,dt = 1.$$
Explain why the latter conclusion is false, and find the error in the reasoning.
3.39 Fix $p \in [a, b]$ and, for all $f \in C[a, b]$, equipped with the sup-norm, define $T : C[a, b] \to \mathbb{R}$ by $T(f) = f(p)$. Prove: $T$ is a continuous linear functional on $C[a, b]$.
3.40
LetT: C[O, 1]---> lR be defined by
Prove that $T$ is a bounded linear functional. Find a constant $K$ for which
$$|T(f)| \le K\|f\|_{\sup}$$
for all $f \in C[0, 1]$.
3.41 Let $T : C[0, 3] \to \mathbb{R}$ be defined by $T(f) = f(1) - f(2)$. Prove: $T$ is a bounded linear functional on $C[0, 3]$, equipped with the sup-norm.
3.42 † Consider the vector space $\mathbb{R}^2$ and let $x = (x_1, x_2)$. Define the Euclidean norm by
$$\|x\| = \sqrt{x_1^2 + x_2^2}$$
and add vectors componentwise as usual.
a) † Prove: If $T : \mathbb{R}^2 \to \mathbb{R}$ is linear, then there exists $\alpha = (\alpha_1, \alpha_2) \in \mathbb{R}^2$ such that, for all $x$ we have
$$T(x) = \alpha_1 x_1 + \alpha_2 x_2.$$
(Hint: Write $x = x_1(1, 0) + x_2(0, 1)$.)
b) † Prove that for all $\alpha \in \mathbb{R}^2$, the corresponding linear functional $T$ given in Exercise 3.42.a above is continuous with respect to the Euclidean norm. (Theorem 3.3.2 is helpful.)
c) Let $c_i : \mathbb{R}^2 \to \mathbb{R}$ be defined by $c_i(x) = x_i$, $i = 1, 2$. Prove: $c_i$ is a continuous linear functional on $\mathbb{R}^2$.
3.43 a) Show that $\mathcal{R}[a, b]$, equipped with the sup-norm, is a complete normed vector space.
b) Suppose that $T$ and $T'$ are both bounded linear functionals defined on $\mathcal{R}[a, b]$ as in (a) above. Suppose $T(\sigma) = T'(\sigma)$ for each step function $\sigma$. Prove that $T(f) = T'(f)$ for all $f \in C[a, b]$. (Hint: Use Exercise 3.13.)
3.44 Let $\mathcal{P}$ denote the vector space of all real polynomial functions (all nonnegative integer degrees allowed) on the interval $[0, 1]$. Endow this vector space with the sup-norm.
a) Show that the set $\{x^k \mid k \in \mathbb{N} \cup \{0\}\}$ is a basis for $\mathcal{P}$.
b) Define $T : \mathcal{P} \to \mathbb{R}$ by letting $T(x^k) = k$ and show that the domain of $T$ can be extended to all of $\mathcal{P}$ so as to make $T$ a linear functional on $\mathcal{P}$.
c) Prove that $T$ is not bounded, and thus not continuous.
Remark 3.3.1 In Exercise 3.42 above, the reader has shown that every linear functional defined on $\mathbb{R}^2$ is continuous. It is not hard to see that the same argument applies to $\mathbb{R}^n$ for each $n \in \mathbb{N}$ as well. Thus the reader may wonder why theorems are presented for the purpose of proving that a linear functional is continuous, or equivalently, bounded. The reason is that if a normed vector space is infinite-dimensional, then unbounded linear functionals do exist, although this cannot happen in finite-dimensional spaces. To demonstrate the existence of unbounded linear functionals, we need the concept of a Hamel basis for a vector space $V$.

Definition 3.3.3 A Hamel basis $B$ is a (generally uncountable) set $\{e_\alpha \mid \alpha \in A\}$ having the following two properties.
i. $B$ is linearly independent: That is, for each finite set $F \subset A$ we have
$$\sum_{\alpha \in F} c_\alpha e_\alpha = 0 \implies \text{each coefficient } c_\alpha = 0.$$
ii. For each vector $v \in V$ there exists a finite set $F \subset A$ such that
$$v = \sum_{\alpha \in F} c_\alpha e_\alpha.$$

It follows easily from the definition that each vector $v \in V$ can be expressed uniquely as a linear combination of finitely many of the basis vectors. The existence of Hamel bases is generally proved in graduate courses in functional analysis [8], as an application of the Axiom of Choice from set theory. But if the reader will accept the existence of Hamel bases, then we can demonstrate the existence of unbounded linear functionals on any infinite-dimensional normed vector space.

Let $V$ be any infinite-dimensional normed linear space. Let $B = \{e_\alpha \mid \alpha \in A\}$ be any Hamel basis for $V$. Define a linear map $T : V \to \mathbb{R}$ as follows. Select a countable set $\{e_i \mid i \in \mathbb{N}\} \subset B$ and define $T(e_i) = i\|e_i\|$ for each $i \in \mathbb{N}$. Then define $T$ to be 0 on each remaining basis vector from $B$. Extend $T$ in the usual way to a linear transformation of $V \to \mathbb{R}$, and observe that $T$ is unbounded since $\frac{|T(e_i)|}{\|e_i\|}$ is an unbounded sequence. Hence $T$ is an unbounded linear functional. This generalizes Exercise 3.44 above.

Another interesting feature of Exercise 3.42 is that each of the standard coordinate functions in $\mathbb{R}^2$ is a bounded linear functional. One way to think of coordinates is that coordinates are bounded linear functionals (real-valued linear functions that are continuous) defined on a normed vector space $V$ having the property that if two vectors $v_1$ and $v_2$ are distinct, then there exists a coordinate function $C$ such that $C(v_1) \ne C(v_2)$. In a graduate course in functional analysis [8] the reader will study the Hahn-Banach Theorem. One consequence of this theorem is that in every Banach space there are enough bounded linear functionals to serve this same function of distinguishing among the vectors of $V$.
3.4 THE CAUCHY-SCHWARZ INEQUALITY$^{11}$
If $x = (x_1, x_2)$ and $y = (y_1, y_2) \in \mathbb{R}^2$, the student will recall that
$$x \cdot y = x_1 y_1 + x_2 y_2 = \|x\|\,\|y\| \cos\theta$$
is the scalar product of $x$ with $y$, known also as the dot product, or the inner product, and denoted also by $\langle x, y \rangle$. Here $\theta$ is the angle between $x$ and $y$. Since $|\cos\theta| \le 1$ for all $\theta$, it follows that
$$|x \cdot y| \le \|x\|\,\|y\|.$$
This is called the Cauchy-Schwarz inequality in $\mathbb{R}^2$. Here
$$\|x\| = \sqrt{x_1^2 + x_2^2} = \sqrt{x \cdot x}$$
and $\|\cdot\|$ has all the properties of a norm on $\mathbb{R}^2$. A similar result holds in $\mathbb{R}^n$ as well. It is natural to consider the following generalization of these concepts to $f, g \in \mathcal{R}[a, b]$. Just as vectors in the plane have two components, we can think of a function $f \in \mathcal{R}[a, b]$ as being a vector with infinitely many components, namely the numbers $f(x)$ corresponding to the uncountably many $x \in [a, b]$. But we cannot possibly add all the values of $f(x)g(x)$ for all $x \in [a, b]$. Thus we try defining a scalar product of $f$ and $g$ by
$$\langle f, g \rangle = \int_a^b f(x)g(x)\,dx.$$
We need to show however that the product of two Riemann integrable functions is always integrable. This is the next theorem.
11 This section will be used in Chapter 6 for the study of Fourier series.

Theorem 3.4.1 If $f$ and $g$ are in $\mathcal{R}[a, b]$, then the product $fg \in \mathcal{R}[a, b]$ as well.
Proof:
i. We begin with the special case in which $f(x) \ge 0$ and $g(x) \ge 0$ for all $x \in [a, b]$. Let $P = \{x_0, x_1, \ldots, x_n\}$ be any partition. Fix for the moment any one subinterval $[x_{k-1}, x_k]$ and denote by $M_f$, $M_g$, and $M_{fg}$ the sup over this interval of $f$, $g$, and $fg$, respectively. Since $f(x)g(x) \le M_f M_g$ for all $x \in [x_{k-1}, x_k]$, it follows that $M_{fg} \le M_f M_g$. Similarly, for the infima we have $m_{fg} \ge m_f m_g$ on each such subinterval. Hence
$$M_{fg} - m_{fg} \le M_f M_g - m_f m_g = (M_f - m_f)M_g + (M_g - m_g)m_f \le (M_f - m_f)\|g\|_{\sup} + (M_g - m_g)\|f\|_{\sup}.$$
Multiplying by $\Delta x_k = x_k - x_{k-1}$ and summing over $k = 1, 2, \ldots, n$, we find that
$$\lim_{\|P\| \to 0} [U(fg, P) - L(fg, P)] \le \lim_{\|P\| \to 0} [U(f, P) - L(f, P)]\,\|g\|_{\sup} + \lim_{\|P\| \to 0} [U(g, P) - L(g, P)]\,\|f\|_{\sup} = 0 + 0 = 0.$$
Thus $fg \in \mathcal{R}[a, b]$ by the Darboux Integrability Criterion.
ii. Now suppose $f$ and $g$ in $\mathcal{R}[a, b]$ are arbitrary. Since each such function is bounded, there exists $K \in \mathbb{R}$ such that $f(x) + K \ge 0$ and $g(x) + K \ge 0$ for all $x \in [a, b]$. Since $\mathcal{R}[a, b]$ is a vector space, $f + K$ and $g + K \in \mathcal{R}[a, b]$. By Case (i),
$$(f + K)(g + K) = fg + Kf + Kg + K^2 = F \in \mathcal{R}[a, b]$$
as well. But then $fg = F - Kf - Kg - K^2 \in \mathcal{R}[a, b]$ also.
•
Definition 3.4.1 For all $f$ and $g$ in $\mathcal{R}[a, b]$ we define
$$\langle f, g \rangle = \int_a^b f(x)g(x)\,dx.$$
Define also the $L^2$-norm$^{12}$ of $f$ by
$$\|f\|_2 = \sqrt{\langle f, f \rangle} = \sqrt{\int_a^b |f(x)|^2\,dx}.$$
Note that $\langle f, g \rangle = \langle g, f \rangle$. We observe also that this product is linear in $f$ and $g$ separately: That is, $\langle f_1 + cf_2, g \rangle = \langle f_1, g \rangle + c\langle f_2, g \rangle$ and $\langle f, g_1 + cg_2 \rangle = \langle f, g_1 \rangle + c\langle f, g_2 \rangle$. Also, $\langle f, f \rangle \ge 0$ for all $f \in \mathcal{R}[a, b]$. One property of the scalar product of $\mathbb{R}^2$ fails to carry over to this case, however: it is possible for $\langle f, f \rangle = 0$ without $f(x) \equiv 0$ on $[a, b]$. This means that $\|f\|_2$ is not a true norm since it is possible for $\|f\|_2 = 0$ even though $f \neq 0 \in \mathcal{R}[a, b]$. (See Exercises 3.45 and 3.46.)

12 The letter $L$ in $L^2$ stands for Lebesgue, and ideally the $L^2$-norm is studied in the context of the Lebesgue integral. It is still a useful norm to consider however in the context of Riemann integration, as we use the term here.
Remark 3.4.1 It would be possible to remedy the failure of $\|\cdot\|_2$ to be a true norm by forming the vector space of equivalence classes of functions in $\mathcal{R}[a, b]$. Here $f$ would be considered equivalent to $g$ if and only if $\|f - g\|_2 = 0$. However we will not do this here because of another serious defect: even with such a procedure $\mathcal{R}[a, b]$ would still not yield a complete normed linear space with respect to the 2-norm $\|\cdot\|_2$. Thus one may wonder how much one would have to enlarge the space $\mathcal{R}[a, b]$ to endow it with a limit for each Cauchy sequence in the 2-norm. The answer is that one must define what is called the Lebesgue Integral, a very subtle refinement of the Riemann Integral that can integrate every $f \in \mathcal{R}[a, b]$, with the same results, but which can also integrate many more functions. This subject is left for a graduate course in real analysis, however. Even though the 2-norm is not a true norm on $\mathcal{R}[a, b]$, we can still use our geometrical intuition about norms (or lengths) of vectors to prove the famous Cauchy-Schwarz inequality.
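Theorem 3.4.2 (Cauchy-Schwarz) Suppose $f$ and $g$ are in $\mathcal{R}[a, b]$. Then
$$\left| \int_a^b f(x)g(x)\,dx \right| \le \left( \int_a^b f(x)^2\,dx \right)^{1/2} \left( \int_a^b g(x)^2\,dx \right)^{1/2}.$$
Remark 3.4.2 The theorem means that in the notation introduced above
$$|\langle f, g \rangle| \le \|f\|_2\,\|g\|_2,$$
just like the analogous inequality for dot products in $\mathbb{R}^2$.
Proof: For all $t \in \mathbb{R}$, define a polynomial
$$p(t) = \langle tf + g, tf + g \rangle = \int_a^b [tf(x) + g(x)]^2\,dx.$$
Observe that $p(t) \ge 0$ for all $t$. By linearity of $\langle \cdot, \cdot \rangle$ in each variable (or by elementary algebra with the integral formulation) we see that
$$p(t) = \|f\|_2^2\,t^2 + 2\langle f, g \rangle\,t + \|g\|_2^2 = at^2 + bt + c,$$
where $a = \|f\|_2^2$, $b = 2\langle f, g \rangle$, and $c = \|g\|_2^2$. But a quadratic polynomial with $a > 0$ satisfies $p(t) \ge 0$ for all $t \in \mathbb{R}$ if and only if $b^2 - 4ac \le 0$, which is equivalent to $b^2 \le 4ac$. (If $a = 0$, then $p(t) = bt + c \ge 0$ for all $t$ forces $b = 0$, so $b^2 \le 4ac$ holds in this case too.) Hence
$$4\langle f, g \rangle^2 \le 4\|f\|_2^2\,\|g\|_2^2, \quad\text{that is,}\quad |\langle f, g \rangle| \le \|f\|_2\,\|g\|_2.$$
•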
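The discriminant argument in the proof can be checked numerically. The following is a minimal sketch, assuming the illustrative choices $f(x) = \sin x$ and $g(x) = x$ on $[0, \pi]$ (not taken from the text) and approximating each integral by a midpoint Riemann sum; the helper names `inner` and `norm2` are hypothetical.

```python
import math

def inner(f, g, a, b, n=20000):
    """Midpoint Riemann sum approximating <f, g>, the integral of f(x) g(x) over [a, b]."""
    dx = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * dx
        total += f(x) * g(x)
    return total * dx

def norm2(f, a, b, n=20000):
    """Approximate 2-norm: the square root of <f, f>."""
    return math.sqrt(inner(f, f, a, b, n))

# Illustrative choice (an assumption, not from the text): f(x) = sin x, g(x) = x on [0, pi].
f = math.sin
g = lambda x: x
a, b = 0.0, math.pi

lhs = abs(inner(f, g, a, b))
rhs = norm2(f, a, b) * norm2(g, a, b)

# Coefficients of p(t) = a2 t^2 + b1 t + c0 from the proof, and its discriminant.
a2 = inner(f, f, a, b)
b1 = 2.0 * inner(f, g, a, b)
c0 = inner(g, g, a, b)
discriminant = b1 * b1 - 4.0 * a2 * c0

print(f"|<f,g>| = {lhs:.6f}, ||f||_2 ||g||_2 = {rhs:.6f}, b^2 - 4ac = {discriminant:.6f}")
```

Because the same sample points are used for all three sums, the computed values obey the finite-dimensional Cauchy-Schwarz inequality, so the comparison is not affected by the quadrature error.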
EXERCISES In the following exercises we will assume you know from elementary calculus courses how to evaluate integrals by means of antiderivatives. The necessary Fundamental Theorem of the Calculus will be proven in Chapter 4.
3.45 † Give an example of $f \in \mathcal{R}[a, b]$ such that $\|f\|_2 = 0$ yet $f(x) \not\equiv 0$ on $[a, b]$.
3.46 † Prove that $\|\cdot\|_2$ does satisfy the triangle inequality:
$$\|f + g\|_2 \le \|f\|_2 + \|g\|_2.$$
(Hint: Write $\|f + g\|_2^2 = \langle f + g, f + g \rangle$, expand using linearity in each variable, and apply the Cauchy-Schwarz inequality.)
3.47 Suppose $f \in \mathcal{R}[a, b]$.
a) If there exists $\delta > 0$ such that $f(x) \ge \delta$ for all $x \in [a, b]$, prove that $\frac{1}{f} \in \mathcal{R}[a, b]$. Hint: On each interval $[x_{k-1}, x_k]$ of any partition, compare $M_k^{1/f} = \sup \frac{1}{f(x)}$ and $m_k^{1/f} = \inf \frac{1}{f(x)}$ with the corresponding numbers $M_k^f$ and $m_k^f$. Then use the Darboux criterion.
b) Now suppose only that $|f(x)| \ge \delta > 0$ for all $x \in [a, b]$, for some fixed $\delta > 0$. Show again that $\frac{1}{f} \in \mathcal{R}[a, b]$. Hint: Apply (a) to $f^2$, and then use Theorem 3.4.1.
3.48 Use the Cauchy-Schwarz inequality to show:
a) $\int_0^{\pi} \sqrt{x \sin x}\,dx \le \pi$.
b) $\int_0^{\pi/4} (1 + \tan x)\sqrt{x}\,\sec x\,dx \le \frac{\pi}{4}\sqrt{\frac{7}{6}}$.
3.49 Use the triangle inequality for $\|\cdot\|_2$ to show
$$\left[ \int_0^{\pi/2} \left( \sqrt{\cos x} + x \right)^2 dx \right]^{1/2} \le 1 + \sqrt{\frac{\pi^3}{24}}.$$
3.50 Prove: If $f \in \mathcal{R}[0, 1]$, then
$$\int_0^1 x f(x)\,dx \le \frac{1}{\sqrt{3}} \left[ \int_0^1 [f(x)]^2\,dx \right]^{1/2}.$$
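3.51 Prove: If $f \in \mathcal{R}[0, 1]$, then
$$\left[ \int_0^1 \left( \sqrt{\cos x} + f(x) \right)^2 dx \right]^{1/2} \le \sqrt{\sin 1} + \left[ \int_0^1 (f(x))^2\,dx \right]^{1/2}.$$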
3.52 † If $f$ and $g$ are in $\mathcal{R}[a, b]$, we say $f$ is orthogonal to $g$, denoted by $f \perp g$, if and only if $\langle f, g \rangle = 0$. Prove that
$$f \perp g \iff \|f + g\|_2^2 = \|f\|_2^2 + \|g\|_2^2.$$
(This is a modern analogue of the Pythagorean Theorem.)
3.53 Let $f$ and $g$ be in $\mathcal{R}[a, b]$, with $\|g\|_2 > 0$.
a) Find a constant $c \in \mathbb{R}$ such that $(f - cg) \perp g$. (Hint: Use the definition of orthogonality in Exercise 3.52.)
b) Now let $f(x) = x$ and $g(x) = 1$ on $[0, 1]$. Find the value of $c$ for which $(f - cg) \perp g$. Using this value of $c$, would $(f - cg) \perp g$ on the different interval $[-1, 1]$?
3.54 Let $f(x) = \sin x$ and $g(x) = \cos x$ on $[0, \pi]$. Prove $f \perp g$.
3.55 Suppose $\|f\|_2 \|g\|_2 > 0$, where $f$ and $g$ are in $\mathcal{R}[a, b]$.
a) Prove: $|\langle f, g \rangle| = \|f\|_2 \|g\|_2$ if and only if there exists $t \in \mathbb{R}$ such that
$$\int_a^b [g(x) + tf(x)]^2\,dx = 0.$$
(Hint: Let $p(t) = \langle tf + g, tf + g \rangle \ge 0$, as in the proof of the Cauchy-Schwarz inequality.)
b) The preceding part says that we get equality in the Cauchy-Schwarz inequality if and only if $g$ is very close to being a constant multiple of $f$. What happens to the Cauchy-Schwarz inequality if $\|f\|_2 = 0$ or if $\|g\|_2 = 0$?
3.56 Let $f(x) = \sin x$ on $[0, 1]$. Give an example of $g$, with $\|g\|_2 > 0$, for which $|\langle f, g \rangle| = \|f\|_2 \|g\|_2$.
3.57 Let $f, g \in \mathcal{R}[a, b]$. Show that
$$\|f + g\|_2 = \|f\|_2 + \|g\|_2$$
if and only if $\langle f, g \rangle = \|f\|_2 \|g\|_2$.
3.58 Let $g \in \mathcal{R}[a, b]$, and define $T_g : C[a, b] \to \mathbb{R}$ by
$$T_g(f) = \int_a^b f(x)g(x)\,dx$$
for all $f \in C[a, b]$. Prove: $T_g$ is a bounded linear functional on the Banach space $C[a, b]$, equipped with the sup-norm.
Remark 3.4.3 On a cover of the Notices of the American Mathematical Society [ 17] the reader can see a photograph of a clay tablet from the Yale Babylonian Collection.
The sketch on this tablet indicates that Babylonian mathematicians were aware of the geometrical reasoning that justifies the Pythagorean Theorem of plane geometry for the case of a right isosceles triangle. This was approximately one thousand years before the life of Pythagoras - about three thousand years before the present time. It is considered the earliest known record indicating a geometrical proof. We mention this because it is connected to an interesting circle of ideas that we have encountered. We have used the quadratic formula to prove the Cauchy-Schwarz inequality for the vector space $\mathcal{R}[a, b]$. The reader could easily look ahead to Theorem 8.1.1 to see that essentially the same method of proof establishes the Cauchy-Schwarz inequality in every vector space equipped with a scalar product $\langle x, y \rangle$, which includes also the Euclidean plane and n-dimensional Euclidean space. The Cauchy-Schwarz inequality yields as a consequence the triangle inequality of plane geometry, but in the general context of a vector space with an inner product, a so-called inner product space. Although the concept of an inner product space appears at first abstract, the geometry that is employed hinges on the quadratic formula, which embodies the method of completing the square learned in high school algebra. This method has been used in school exercises for at least four thousand years. Again, the earliest records appear on clay tablets that were school-room exercise tablets from Babylon [15]. And in Exercise 3.52 above, the reader has proven a version of the Pythagorean Theorem for the vector space $\mathcal{R}[a, b]$. The same proof works in the Euclidean plane, in n-dimensional Euclidean space, or in any inner product space. It is a simple application of the inner product of two vectors. In other words, we have learned some elements of the geometry of the function space $\mathcal{R}[a, b]$ by applying the quadratic formula from four thousand years ago.
Study of this abstract and modern subject from functional analysis has shed light on Euclidean Geometry. The tool on which this rests is four thousand years old, and preceded Euclidean Geometry by two millennia. So here is a question that is not part of Mathematics: Is this light shed upon Euclidean Geometry a new light, or an old one?
3.5 TEST YOURSELF
EXERCISES 3.59
Let f n ( x)
=
l+~x2 for all x E JR. Find:
= limn-+oo fn(x) for all x E JR. b) llfnllsup for each n E N. c) Is fn uniformly convergent on IR? (Yes or No) a) The pointwise limit f(x)
3.60 Does $f_n(x) = x^n$ converge pointwise on
a) $(-1, 0]$?
b) $[-1, 0]$?
3.61 Let
$$f(x) = \begin{cases} 1 & \text{if } 0 \le x < 1, \\ 2 & \text{if } 1 \le x \le 2. \end{cases}$$
If $\epsilon > 0$, find a $\delta > 0$ such that
independent of the choice of evaluation points Xi·
3.62 Evaluate
$$\lim_{n \to \infty} \frac{2}{n} \sum_{k=1}^{n} \cos\left( 1 + \frac{2k}{n} \right)$$
by expressing it as a suitable definite integral and evaluating this by means of the Fundamental Theorem.
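3.63 Let
$$f(x) = \begin{cases} 1 & \text{if } x = 1, \\ 0 & \text{if } x \in [0, 2] \setminus \{1\}. \end{cases}$$
Find a partition $P$ of $[0, 2]$ for which $U(f, P) - L(f, P) < \frac{1}{49}$.
3.64 True or Give a Counterexample: If $|f| \in \mathcal{R}[0, 1]$ and if $f$ is bounded, then $f \in \mathcal{R}[0, 1]$.
3.65 Suppose the linear map $T : C[0, 1] \to \mathbb{R}$ is given by
$$T(f) = \int_0^1 f(x) \sin(x)\,dx.$$
Find a bound $K$ such that $|T(f)| \le K\|f\|_{\sup}$ for all $f \in C[0, 1]$.
3.66 Let $T : \mathcal{R}\left[\frac{1}{e}, e\right] \to \mathbb{R}$ be the linear functional defined by
$$T(f) = \int_{1/e}^{e} \frac{f(x)}{x}\,dx.$$
Find a constant $K$ for which $|T(f)| \le K\|f\|_{\sup}$ for all $f \in \mathcal{R}\left[\frac{1}{e}, e\right]$.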
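3.67 Let
$$f(x) = \begin{cases} x^2 & \text{if } x \in \mathbb{Q}, \\ -x^2 & \text{if } x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases}$$
a) Find $\overline{\int_0^1} f(x)\,dx$ and $\underline{\int_0^1} f(x)\,dx$.
b) True or False: The function $f \in \mathcal{R}[0, 1]$.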
3.68 Suppose f + g and f- g are both Riemann integrable on [a, b]. True or False: Both f and g are in R[a, b].
3.69 Decide whether or not each set is a vector space (over the real numbers):
a) $C[a, b]$, the set of continuous functions on $[a, b]$.
b) $P_{10}(\mathbb{R})$, the set of all polynomials of degree exactly equal to 10.
3.70 Let $f_n(x) = xe^{-nx}$ for all $x \in [0, \infty)$. Find $\|f_n\|_{\sup}$ for all $n$.
3.71 Find $\lim_{n \to \infty} \int_{\pi/4}^{\pi/2} \cos^n x\,dx$.
3.72 Let $T : P[0, 1] \to \mathbb{R}$ be defined by $Tp = \deg p$ for each polynomial $p$. True or False: $T$ is a linear map.
3.73 Let
$$f(x) = \begin{cases} x^2 & \text{if } x \in \mathbb{Q}, \\ -x^2 & \text{if } x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases}$$
Find $\overline{\int} f(x)\,dx$ and $\underline{\int} f(x)\,dx$.
CHAPTER 4
THE DERIVATIVE
The first limit that is used intensively by every first year calculus student is the derivative. Here we will define the derivative carefully and prove the essential theorems about the derivative. We will discuss carefully the concept of the differential, because this will highlight the very important fact that a function is differentiable if and only if its increments can be approximated locally by a linear function.
4.1 DERIVATIVES AND DIFFERENTIALS
Definition 4.1.1 If $x_0 \in D_f$ is a cluster point of the domain $D_f$ of a function $f$, we define the derivative of $f$ at $x_0$ to be
$$f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h},$$
provided that this limit exists. (Note that in accordance with the concept of limit, $h \to 0$ through values such that $x_0 + h \in D_f$. It is also common to denote $x = x_0 + h$ and to write the limit criterion as
$$f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}$$
if the limit exists.) This limit is often interpreted geometrically as representing the slope of a tangent line to the graph of $y = f(x)$ at the point $(x_0, f(x_0))$. The difference quotient of which $f'(x_0)$ is the limit is the slope of a chord joining $(x_0, f(x_0))$ to the point $(x_0 + h, f(x_0 + h))$. In effect, if the limit defining $f'(x_0)$ exists, we take the tangent line to be defined by the equation $y - f(x_0) = f'(x_0)(x - x_0)$, which describes the unique line through the base point $(x_0, f(x_0))$ with the slope $f'(x_0)$.
• EXAMPLE 4.1
We present several examples of derivatives.
i. Let $f(x) = |x|$. Then we claim $f'(0)$ does not exist. In fact,
$$\lim_{h \to 0} \frac{|0 + h| - |0|}{h} = \lim_{h \to 0} \frac{|h|}{h}$$
does not exist, since if $h_n \to 0+$ (i.e., from the right side of 0), then $\frac{|h_n|}{h_n} \to 1$, but if $h_n \to 0-$ (i.e., from the left), then $\frac{|h_n|}{h_n} \to -1$. Since these one-sided limits are different, the limit of the function does not exist.
ii. Again we take $f(x) = |x|$, but we show that if $x_0 > 0$, $f'(x_0) = 1$. In fact, we can use $h$ such that $|h| < x_0$, and then
$$\lim_{h \to 0} \frac{|x_0 + h| - |x_0|}{h} = \lim_{h \to 0} \frac{h}{h} = 1.$$
iii. Now let $f(x) = x^n$, $n \in \mathbb{N}$. Then
$$f'(x) = \lim_{h \to 0} \frac{(x + h)^n - x^n}{h} = \lim_{h \to 0} \frac{nx^{n-1}h + \frac{n(n-1)}{2}x^{n-2}h^2 + \cdots + h^n}{h} = nx^{n-1}.$$
iv. If $f(x) = c$, a constant, then $f'(x) = 0$. (See Exercise 4.3.)
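The one-sided behavior in parts (i) and (iii) of the example can be observed directly by tabulating difference quotients. This is a small sketch, assuming a hypothetical helper `diff_quotient` and the illustrative test case $x^3$ at $x_0 = 2$ (so that $nx^{n-1} = 12$); it illustrates the limits rather than proving them.

```python
def diff_quotient(f, x0, h):
    """Slope of the chord from (x0, f(x0)) to (x0 + h, f(x0 + h))."""
    return (f(x0 + h) - f(x0)) / h

steps = [10.0 ** (-k) for k in range(1, 8)]

# f(x) = |x| at x0 = 0: quotients from the right and from the left.
right = [diff_quotient(abs, 0.0, h) for h in steps]
left = [diff_quotient(abs, 0.0, -h) for h in steps]

# f(x) = x^3 at x0 = 2 (illustrative choice): quotients approach 3 * 2^2 = 12.
cube = [diff_quotient(lambda x: x ** 3, 2.0, h) for h in steps]

for h, r, l, c in zip(steps, right, left, cube):
    print(f"h = {h:.0e}: right = {r:+.1f}, left = {l:+.1f}, cube = {c:.7f}")
```

The right-hand and left-hand quotients for $|x|$ stay at $+1$ and $-1$ for every step size, so no single limit can exist at 0, while the quotients for $x^3$ settle toward 12.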
There is another way to understand the concept of a tangent line that is also very useful, although the definition may seem a bit more technical at the beginning. Imagine now a straight line through $(x_0, f(x_0))$ with slope $m$. Thus the rise of this line corresponding to an increment $h$ in the x-variable is given by a linear function $L(h) = mh$. On the other hand, the rise of the graph of $y = f(x)$ will be denoted by $\Delta f = f(x_0 + h) - f(x_0)$. The intuitive idea is that the graph $y = f(x)$ has a tangent line at $(x_0, f(x_0))$, provided that there exists a linear function $L(h) = mh$ that approximates $\Delta f$ accurately for all sufficiently small values of $h$. Then the tangent line corresponds to the linear function $L(h)$ that gives this approximation. We make the following formal definition based on this idea.
Definition 4.1.2 The function $f$ is called differentiable at a cluster point $x_0 \in D_f$, provided that there exists a linear function $L(h) = mh$ such that the function given by
$$\epsilon(h) = \frac{\Delta f - L(h)}{h} \to 0$$
as $h \to 0$.
The intuitive idea is that if $\epsilon(h) \to 0$ as $h \to 0$ then $\Delta f - L(h) \to 0$ much faster than $h \to 0$.
Theorem 4.1.1 A function $f$ is differentiable at a cluster point $x_0 \in D_f$ if and only if $f'(x_0)$ exists. If $f$ is differentiable at $x_0$, then $L(h) = f'(x_0)h$ is denoted by $df_{x_0}(h)$ and
$$\Delta f = f(x_0 + h) - f(x_0) = df_{x_0}(h) + \epsilon(h)h, \tag{4.1}$$
where $\epsilon(h) \to 0$ as $h \to 0$ (in such a way that $x_0 + h \in D_f$).
Proof: For the implication from left to right, we suppose that $f$ is differentiable at $x_0$. Then there exists $m \in \mathbb{R}$ such that
$$\epsilon(h) = \frac{\Delta f - mh}{h} \to 0$$
as $h \to 0$. Thus
$$\frac{\Delta f}{h} = \frac{f(x_0 + h) - f(x_0)}{h} = m + \epsilon(h) \to m$$
as $h \to 0$, so $f'(x_0)$ exists and equals $m$. For the opposite direction of implication, we suppose $f'(x_0)$ exists. Let $m = f'(x_0)$. If we define $L(h) = mh$ and let
$$\epsilon(h) = \frac{\Delta f - L(h)}{h} = \frac{f(x_0 + h) - f(x_0)}{h} - f'(x_0),$$
then the hypothesis implies $\epsilon(h) \to 0$ as $h \to 0$.
•
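Equation (4.1) can be explored numerically by computing the error function for a candidate slope $m$. The sketch below assumes the illustrative function $f(x) = x^3$ at $x_0 = 1$ (where $f'(1) = 3$) and a hypothetical helper `epsilon`; it compares the correct slope with a deliberately wrong one.

```python
def epsilon(f, m, x0, h):
    """Error function (delta_f - m*h) / h for the candidate slope m at the point x0."""
    delta_f = f(x0 + h) - f(x0)
    return (delta_f - m * h) / h

cube = lambda x: x ** 3   # illustrative function (an assumption); f'(1) = 3
steps = [10.0 ** (-k) for k in range(1, 7)]

eps_right_slope = [epsilon(cube, 3.0, 1.0, h) for h in steps]
eps_wrong_slope = [epsilon(cube, 2.0, 1.0, h) for h in steps]

for h, e_ok, e_bad in zip(steps, eps_right_slope, eps_wrong_slope):
    print(f"h = {h:.0e}: eps(m=3) = {e_ok:.6f}, eps(m=2) = {e_bad:.6f}")
```

For $m = 3$ the error function equals $3h + h^2$ and shrinks with $h$, while for $m = 2$ it approaches 1. This is consistent with the theorem: the only slope for which $\epsilon(h) \to 0$ is $m = f'(x_0)$.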
Remark 4.1.1 We remind the student here to review the Introduction regarding Learning to Write Proofs on page xxiii. The reader should write out a careful analysis of what makes each proof in this course work. We will do this together for the first proof of the chapter. We are trying to prove that a function $f$ is differentiable if and only if its increments $\Delta f$ can be approximated well locally by the differential, $df$, in the technical sense described in the statement of the theorem. For the implication from left to right, we suppose $f$ is differentiable, so that $\Delta f$ can be approximated by $df$ using a suitable error-function $\epsilon$. This enables us to express the difference quotient for the derivative in terms of $df$ and $\epsilon$, and the assumption that $\epsilon \to 0$ as $h \to 0$ implies that the limit that yields the derivative exists. For the opposite implication we assume $f'(x)$ exists and we define $\epsilon$ by means of Equation (4.1). All that remains is to prove that $\epsilon \to 0$ as $h \to 0$. By writing $\epsilon$ in terms of the difference quotient and the derivative, the existence of the derivative implies that $\epsilon \to 0$ as $h \to 0$.
Corollary 4.1.1 If $f$ is differentiable at $x_0$ then $f$ is continuous at $x_0$.
Proof: We need to prove $\lim_{x \to x_0} f(x) = f(x_0)$. This is equivalent to proving $\lim_{x \to x_0} (f(x) - f(x_0)) = 0$. Denote $h = x - x_0$, and we have
$$\lim_{h \to 0} (f(x_0 + h) - f(x_0)) = \lim_{h \to 0} [df_{x_0}(h) + \epsilon(h)h] = \lim_{h \to 0} [f'(x_0) + \epsilon(h)]h = 0.$$
•
Theorem 4.1.2 Suppose $f'(x)$ and $g'(x)$ both exist. Then
i. $(f \pm g)'(x)$ exists and $(f \pm g)'(x) = f'(x) \pm g'(x)$.
ii. $(fg)'(x)$ exists and $(fg)'(x) = f'(x)g(x) + f(x)g'(x)$.
iii. $\left(\frac{f}{g}\right)'(x)$ exists and $\left(\frac{f}{g}\right)'(x) = \frac{g(x)f'(x) - f(x)g'(x)}{g(x)^2}$, if $x \in D_{f/g}$ is a cluster point of $D_{f/g}$.
Proof: Let us prove case (i) for $f + g$.
$$\lim_{h \to 0} \frac{(f + g)(x + h) - (f + g)(x)}{h} = \lim_{h \to 0} \left( \frac{f(x + h) - f(x)}{h} + \frac{g(x + h) - g(x)}{h} \right) = f'(x) + g'(x).$$
The proof for $f - g$ is very similar. For (ii) and (iii) see Exercises 4.4 and 4.5.
•
Theorem 4.1.3 (The Chain Rule) Suppose $g$ is differentiable at $x_0$ and $f$ is differentiable at $g(x_0)$. Then $(f \circ g)'(x_0)$ exists and $(f \circ g)'(x_0) = f'(g(x_0))g'(x_0)$.
Proof: To show that the derivative of the composition $f \circ g$ exists at $x_0$, we denote $k = g(x_0 + h) - g(x_0) \to 0$ as $h \to 0$ since $g$ is continuous at $x_0$ by Corollary 4.1.1. Also, $k = g'(x_0)h + \epsilon_g(h)h$, where $\epsilon_g(h) \to 0$ as $h \to 0$. Next we form the following difference quotient, where $\epsilon_f$ denotes the error function of $f$ at $g(x_0)$, noting that $\epsilon_f(k) \to 0$ as $k \to 0$:
$$\frac{f(g(x_0 + h)) - f(g(x_0))}{h} = \frac{f'(g(x_0))k + \epsilon_f(k)k}{h} = \frac{f'(g(x_0))[g'(x_0) + \epsilon_g(h)]h + \epsilon_f(k)[g'(x_0) + \epsilon_g(h)]h}{h} \to f'(g(x_0))g'(x_0)$$
as $h \to 0$.
•
EXERCISES
4.1 If $f(x) = |x|$ and if $x_0 < 0$, show that $f'(x_0) = -1$. (See Example 4.1(ii).)
4.2 Prove directly from Definition 4.1.1 that if $f'(a)$ exists, then $f$ must be continuous at $a$. (Hint: Suppose $x_n \to a$ with each $x_n \in D_f \setminus \{a\}$. Prove that $f(x_n) - f(a) \to 0$ as $x_n \to a$.)
4.3 † Prove the conclusion of Example 4.1(iv).
4.4 † Prove conclusion (ii) of Theorem 4.1.2. (Hint: You may use the conclusion of Corollary 4.1.1.)
4.5 † a) Prove: if $g'(x)$ exists and $g(x) \neq 0$, then $\left(\frac{1}{g}\right)'(x)$ exists and is equal to $-\frac{g'(x)}{g(x)^2}$. (Hint: You may use the conclusion of Corollary 4.1.1.)
b) Now use part (a) above together with Theorem 4.1.2(ii) to prove conclusion (iii) of Theorem 4.1.2.
4.6 Let
$$f(x) = \begin{cases} x \sin\frac{1}{x} & \text{if } x \in (0, 1], \\ 0 & \text{if } x = 0, \end{cases}$$
as in Fig. 4.1.
a) Prove: $f \in C[0, 1]$. (Be sure to consider $x = 0$. You can use the squeeze theorem for functions.)
b) Prove: $f'(x)$ exists for all $x \in (0, 1]$, but $f'(0)$ does not exist. (We assume you know about the derivative of $\sin x$ from elementary calculus.)
Figure 4.1 $f(x) = x \sin\frac{1}{x}$, with envelope $u(x) = x$, $l(x) = -x$.
4.7 † Let
$$f(x) = \begin{cases} x^2 \sin\frac{1}{x} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}$$
See Fig. 7.1, p. 217.
a) Prove: $f'(x)$ exists for all $x \in \mathbb{R}$. (Be sure to consider $x = 0$.)
b) Prove: $f \in C(\mathbb{R})$.
c) Prove: $f'$ is not continuous at $x = 0$.
4.8
Let
$$f(x) = \begin{cases} x^2 & \text{if } x \in \mathbb{Q}, \\ 0 & \text{if } x \notin \mathbb{Q}. \end{cases}$$
Prove that $f$ is continuous at one and only one point $x = a$. Find the value of $a$ and prove that $f'(a)$ exists.
4.9 Suppose $f$ and $g$ are differentiable and suppose that $f(g(x)) = x$. Suppose $x \in D_g$ is a cluster point of $D_g$.
a) Prove:
$$g'(x) = \frac{1}{f'(g(x))}.$$
b) Let $f$ be a restriction of the sine to a domain on which it is injective. Suppose that $f$ and $g = f^{-1}$ are differentiable. Use the result of (a) to derive a familiar formula for $g'$. (This will be treated further in Theorem 10.4.3.)
4.2 THE MEAN VALUE THEOREM
The theorems in this section play a vital role in such diverse applications as extreme value problems, proofs of inequalities, and as we shall see in Section 4.3, even the proof of the Fundamental Theorem of Calculus.
Definition 4.2.1 The local maximum and local minimum points of $f$, defined below, are called local extreme points.
i. A function $f$ is said to have a local maximum point at $a \in D_f$ if and only if there exists $\delta > 0$ such that $x \in D_f \cap (a - \delta, a + \delta)$ implies $f(x) \le f(a)$.
ii. The point $a \in D_f$ is called a local minimum point if and only if there exists $\delta > 0$ such that $x \in D_f \cap (a - \delta, a + \delta)$ implies $f(x) \ge f(a)$.
The idea behind the preceding definition is that a local extreme point need not be either the highest or lowest point on the entire graph of the function, but will be a local high point or a local low point, meaning just within its own vicinity. To study local extreme points with the derivative, we need the concept of an interior point of a domain.
Definition 4.2.2 A point $a$ is called an interior point of $D_f$ if and only if there exists $\delta > 0$ such that $(a - \delta, a + \delta) \subseteq D_f$. The set of all interior points of $D_f$ is denoted by $D_f^o$, the interior of $D_f$. In other words, an interior point $a$ of a domain $D$ is a point which has no points from the complement of $D$ within some small specified radius $\delta > 0$ of $a$.
Theorem 4.2.1 If $f$ is differentiable at a local extreme point $\mu \in D_f^o$, then $f'(\mu) = 0$.
Proof:
i. Consider first the case in which $f$ has a local maximum point at $\mu$. Note that there exists $\delta > 0$ such that $0 < h < \delta$ implies
$$\frac{f(\mu + h) - f(\mu)}{h} \le 0,$$
whereas $-\delta < h < 0$ implies
$$\frac{f(\mu + h) - f(\mu)}{h} \ge 0.$$
Thus
$$\lim_{h \to 0+} \frac{f(\mu + h) - f(\mu)}{h} \le 0, \quad\text{whereas}\quad \lim_{h \to 0-} \frac{f(\mu + h) - f(\mu)}{h} \ge 0.$$
Because $f'(\mu)$ exists, it follows that
$$f'(\mu) = \lim_{h \to 0} \frac{f(\mu + h) - f(\mu)}{h} = 0$$
since it is both nonnegative and nonpositive.
ii. If $f$ has a local minimum at $\mu$, consider $g = -f$, which has a local maximum at $\mu$, to reach the desired conclusion.
•
Theorem 4.2.2 (Rolle's Theorem) If $f \in C[a, b]$ is such that
$$f(a) = 0 = f(b)$$
and such that $f'(x)$ exists at least for all $x \in (a, b)$, then there exists $\mu \in (a, b)$ such that $f'(\mu) = 0$.
Remark 4.2.1 Rolle's Theorem says that if the chord joining the two endpoints $(a, f(a))$ and $(b, f(b))$ lies on the x-axis, then there exists a tangent line at some point of the graph that is parallel to the chord.
Proof: If $f \equiv 0$ then the claim of the theorem is trivial. If there exists $x_0 \in [a, b]$ such that $f(x_0) > 0$ then the maximum point of $f$ on $[a, b]$ must occur at an interior point $\mu \in (a, b)$. By Theorem 4.2.1, $f'(\mu) = 0$. The remaining case, in which there exists $x_0 \in [a, b]$ such that $f(x_0) < 0$, is similar.
•
Theorem 4.2.3 (Mean Value Theorem) Suppose $f \in C[a, b]$ and differentiable at least on $(a, b)$. Then there exists $\mu \in (a, b)$ such that
$$f'(\mu) = \frac{f(b) - f(a)}{b - a}.$$
Proof: Observe that the straight line that is the chord joining the endpoints of the graph off is described by the equation
y = f(a)
+
f(b)- f(a) (x- a). b-a
Now, for all x E [a, b], let
h(x)
= f(x)- y = f(x)-
f(a)-
f(b)- f(a) (x- a), b-a
THE MEAN VALUE THEOREM
107
[h) Figure 4.2
f (x) = v'l -
x 2 , chord with parallel tangent.
the difference in height between the graph off and the graph of the chord. Observe that hE C[a, b] and h(a) = 0 = h(b). Also, h' exists at least on (a, b). Thus Rolle's Theorem implies there exists J-t E (a, b) such that h' (J-t) = 0. This implies that
Corollary 4.2.1
Iff'
=0 on an interval I, then f is a constant function on I.
•
Proof: Fix any a E I and let x E I be arbitrary. Then the Mean Value Theorem implies there exists J-t E I such that
f(x)- f(a) = f'(J-t)(x- a)= O(x- a)= 0
=f(a), a constant function. • Corollary 4.2.2 If f' = g' on an interval I, then there exists c E IR such that f(x) =g(x) +c. Proof: Let h( x) = f (x) - g (x) and observe that h' =0 on I. Now use Corollary so that f(x)
4.2.1. Corollary 4.2.3 Suppose f E C[a, b] is differentiable at least on (a, b). Then
•
i. If $f'(x) \ge 0$ for all $x \in (a, b)$, then $f$ is increasing (denoted by $f \nearrow$) on $[a, b]$.
ii. If $f'(x) > 0$ for all $x \in (a, b)$, then $f$ is strictly increasing (denoted by $f \uparrow$) on $[a, b]$.
iii. If $f'(x) \le 0$ for all $x \in (a, b)$, then $f$ is decreasing (denoted by $f \searrow$) on $[a, b]$.
iv. If $f'(x) < 0$ for all $x \in (a, b)$, then $f$ is strictly decreasing (denoted by $f \downarrow$) on $[a, b]$.
Proof:
i. Suppose that $f'(x) \ge 0$ for all $x \in (a, b)$. Then if $x_1 < x_2$ are points of $[a, b]$, the Mean Value Theorem implies there exists $\mu \in (a, b)$ such that
$$f(x_2) - f(x_1) = f'(\mu)(x_2 - x_1) \ge 0$$
so that $f(x_2) \ge f(x_1)$, and $f$ is increasing on $[a, b]$.
For the remaining cases, see Exercises 4.10-4.12.
•
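As a numerical illustration of the Mean Value Theorem (Theorem 4.2.3), one can search for a point $\mu$ where the tangent is parallel to the chord. The sketch below assumes the illustrative function $f(x) = x^3$ on $[0, 2]$ and uses bisection on $h'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}$. Bisection needs $h'$ to be continuous and to change sign between the endpoints, which holds for this example but is an extra assumption that the theorem itself does not supply; the helper name `mvt_point` is hypothetical.

```python
import math

def mvt_point(f, fprime, a, b, tol=1e-12):
    """Find mu in (a, b) with f'(mu) equal to the chord slope, by bisection.

    Assumes fprime(x) - slope changes sign between a and b (not guaranteed in general).
    """
    slope = (f(b) - f(a)) / (b - a)
    h_prime = lambda x: fprime(x) - slope   # derivative of h in the proof
    lo, hi = a, b
    if h_prime(lo) * h_prime(hi) > 0:
        raise ValueError("no sign change at the endpoints; bisection does not apply")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if h_prime(lo) * h_prime(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

f = lambda x: x ** 3            # illustrative choice (an assumption)
fprime = lambda x: 3.0 * x ** 2
mu = mvt_point(f, fprime, 0.0, 2.0)
print(f"mu = {mu:.9f}, f'(mu) = {fprime(mu):.9f}, chord slope = {(f(2.0) - f(0.0)) / 2.0:.9f}")
```

Here the chord slope is 4, and solving $3\mu^2 = 4$ by hand gives $\mu = 2/\sqrt{3}$, which the search reproduces.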
• EXAMPLE 4.2
If $x > 0$, then $\sin x < x$.
Proof: Let $f(x) = x - \sin x$. Then $f(0) = 0$ and $f'(x) = 1 - \cos x > 0$ on $(0, 2\pi)$. By Corollary 4.2.3, $f$ is increasing strictly on $[0, 2\pi]$, so that $f(x) > 0$ for all $x \in (0, 2\pi]$. Hence $\sin x < x$ if $0 < x \le 2\pi$. If $x > 2\pi$, however, it follows that $\sin x \le 1 < x$ as well. •
Definition 4.2.3 A function $f : I \to \mathbb{R}$, where $I$ is an interval, is called monotone on $I$ if and only if $f$ is either increasing throughout $I$ or else decreasing throughout $I$. The function $f$ is called strictly monotone provided it is either strictly increasing or else strictly decreasing.
Definition 4.2.4 A function $f : D \to \mathbb{R}$, where $D \subseteq \mathbb{R}$, is called injective if and only if $f$ is one-to-one. That is, $f$ is injective if and only if
$$f(x_1) = f(x_2) \implies x_1 = x_2$$
for all $x_1$ and $x_2$ in $D$.
• EXAMPLE 4.3
A function $f$ that is strictly monotone on an interval $I$ must be injective on $I$.
EXERCISES
4.10 † Prove Case ii of Corollary 4.2.3.
4.11 † Prove Case iii of Corollary 4.2.3.
4.12 † Prove Case iv of Corollary 4.2.3.
4.13 Give an example of a differentiable function $f$ for which $f' \in C[a, b]$ such that $f$ increases strictly on $[a, b]$ yet it is false that $f'(x) > 0$ for all $x \in [a, b]$.
4.14 Give an example of $f$ increasing strictly on $[a, b]$ yet $f \notin C[a, b]$.
4.15 † Let $n \in \mathbb{N}$. Prove:
$$(1 - x^2)^n \ge 1 - nx^2$$
for all $x \in [0, 1]$. Hint: Let $f(x) = (1 - x^2)^n - (1 - nx^2)$.
4.16 Prove: $\log(1 + x) < x$ for all $x > 0$.
4.17 Prove: $1 - \frac{x^2}{2} < \cos x$ for all $x \neq 0$.
4.18 Prove:
$$\frac{x}{1 + x^2} < \tan^{-1} x < x$$
for all $x > 0$.
4.19 Let $p$ be a polynomial of degree $n$ and let
Prove that the number of elements in the set $E$, denoted by $|E|$, satisfies the inequality $|E| \le n + 1$. (Hint: Apply Rolle's Theorem.)
4.20
Suppose $|f'(x)| \le M \in \mathbb{R}$ for all $x \in I$, an interval.
a) Prove: $f$ is uniformly continuous on $I$.
b) If $I$ is a finite interval $(a, b)$, prove that $\lim_{x \to a+} f(x)$ exists. (Hint: It is enough for this problem to know that $f$ is uniformly continuous, by the previous part. Consider first any sequence $x_n \to a+$ and prove that $f(x_n)$ is Cauchy.)
c) Consider the function $s(x) = \sin\frac{1}{x}$ on $(0, 1)$. Does $s$ satisfy the hypothesis of this exercise? Investigate statements (a) and (b) for the case of the function $s$.
4.21 Let $q(x) = \sqrt{x}$ for all $x \in [0, 1]$. True or False:
a) The function $q$ is uniformly continuous on $(0, 1)$.
b) The function $q'$ is bounded on $(0, 1)$.
4.22 † Suppose $f \in C[a, b]$. Suppose $f$ is differentiable at $a$ and at $b$, and also that $f'(a)f'(b) < 0$. Prove $f$ is not injective on $[a, b]$. (Hint: Consider the extreme points of $f$.)
4.23 If $D \subseteq \mathbb{R}$, then prove $D^o$, the interior of $D$, is the union of all open subsets of $D$.
4.3 THE FUNDAMENTAL THEOREM OF CALCULUS
Theorem 4.3.1 (Fundamental Theorem, Version 1) Suppose $f \in \mathcal{R}[a, b]$, and suppose also there exists $F$ such that $F'(x) = f(x)$ on $[a, b]$. Then
$$\int_a^b f(x)\,dx = F(b) - F(a).$$
Remark 4.3.1 We will see by example below that it is possible for a derivative to exist throughout $[a, b]$ and yet not be integrable, which implies that a derivative can exist without being either continuous or monotone, for example.
Proof: By hypothesis if $\epsilon > 0$ there exists $\delta > 0$ such that $\|P\| < \delta$ implies
$$\left| P(f, \{\bar{x}_i\}) - \int_a^b f(x)\,dx \right| < \epsilon$$
for every choice of evaluation points $\bar{x}_i \in [x_{i-1}, x_i]$. Let $P = \{x_0, \ldots, x_n\}$ be any partition of $[a, b]$ such that $\|P\| < \delta$. By the Mean Value Theorem in each sub-interval $[x_{i-1}, x_i]$ we are free to pick $\bar{x}_i$ in such a way that $F'(\bar{x}_i)\Delta x_i = F(x_i) - F(x_{i-1})$. Then
$$P(f, \{\bar{x}_i\}) = \sum_{i=1}^{n} f(\bar{x}_i)\Delta x_i = \sum_{i=1}^{n} F'(\bar{x}_i)\Delta x_i = \sum_{i=1}^{n} [F(x_i) - F(x_{i-1})] = F(b) - F(a).$$
Since this implies $\left| F(b) - F(a) - \int_a^b f(x)\,dx \right| < \epsilon$ for all $\epsilon > 0$, we must have $\int_a^b f(x)\,dx = F(b) - F(a)$. •
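The telescoping step in the proof can be seen concretely. The following is a minimal sketch, assuming the illustrative pair $f(x) = x^2$, $F(x) = x^3/3$ on $[0, 1]$, for which the Mean Value Theorem point in $[x_{i-1}, x_i]$ can be written in closed form as $\bar{x}_i = \sqrt{(x_{i-1}^2 + x_{i-1}x_i + x_i^2)/3}$. The function names and the uneven partition are hypothetical choices.

```python
import math

def mvt_riemann_sum(partition):
    """Riemann sum of f(x) = x^2 using the Mean Value Theorem points of F(x) = x^3 / 3."""
    total = 0.0
    for left, right in zip(partition, partition[1:]):
        # F'(c) * (right - left) = F(right) - F(left) holds exactly for this c.
        c = math.sqrt((left * left + left * right + right * right) / 3.0)
        total += c * c * (right - left)
    return total

def left_riemann_sum(partition):
    """Riemann sum of f(x) = x^2 using left endpoints, for comparison."""
    return sum(left * left * (right - left)
               for left, right in zip(partition, partition[1:]))

n = 7
partition = [(i / n) ** 2 for i in range(n + 1)]   # deliberately uneven partition of [0, 1]

mvt_sum = mvt_riemann_sum(partition)
left_sum = left_riemann_sum(partition)
print(f"MVT points: {mvt_sum:.12f}, left endpoints: {left_sum:.12f}, F(1) - F(0) = {1.0 / 3.0:.12f}")
```

With the Mean Value Theorem points, even this coarse partition reproduces $F(1) - F(0) = 1/3$ up to rounding, whereas the left-endpoint sum for the same partition is visibly smaller.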
• EXAMPLE 4.4
Let
$$F(x) = \begin{cases} x^2 \sin\frac{1}{x^2} & \text{if } 0 < |x| \le 1, \\ 0 & \text{if } x = 0. \end{cases}$$
The student can show that $F'(0)$ exists and equals 0, and $F'(x)$ can be calculated using the Chain Rule at any $x \neq 0$. Thus $F'$ exists throughout $[-1, 1]$, yet $F' \notin \mathcal{R}[-1, 1]$ and also $F' \notin C[-1, 1]$. (See Exercise 4.24.)
Theorem 4.3.2 (Fundamental Theorem, Version 2) Suppose $f \in C[a, b]$, and define
$$F(x) = \int_a^x f(t)\,dt$$
for all $x \in [a, b]$. Then $F'(x)$ exists and $F'(x) = f(x)$.
Proof: The difference quotient
$$\frac{F(x + h) - F(x)}{h} = \frac{1}{h} \int_x^{x+h} f(t)\,dt = \frac{1}{h} f(\bar{x})h = f(\bar{x}) \to f(x)$$
as $h \to 0$. Here $\bar{x}$ comes from the Mean Value Theorem for Integrals (Exercise 3.10), and $f(\bar{x}) \to f(x)$ since $\bar{x} \to x$ as $h \to 0$ and $f \in C[a, b]$. •
We have seen in Example 4.4 that a derivative can exist without being continuous. Nevertheless, all derivatives on intervals do share one property in common with continuous functions, as we see below.
Theorem 4.3.3 (Intermediate Value Theorem for Derivatives) If $f'(x)$ exists for all $x \in I$, an interval, then $f'$ has the Intermediate Value Property on $I$: Namely, if $a, b \in I$ and if $y$ lies strictly between $f'(a)$ and $f'(b)$, then there exists $x$ between $a$ and $b$ such that $f'(x) = y$.
Proof: Suppose $a < b$ and $f'(a) < y < f'(b)$. (For the opposite inequality, just consider $g = -f$.) Define $\phi(x) = f(x) - yx$ so that $\phi \in C[a, b]$ and is also differentiable on $[a, b]$. It would suffice to show there exists $x \in (a, b)$ such that $\phi'(x) = 0$. By the Extreme Value Theorem, $\phi$ must have both a maximum and a minimum point in $[a, b]$. If such an extreme point is also an interior point, we are done. So we will suppose the extreme points occur only at the endpoints $a$ and $b$, and we will deduce a contradiction. We will present the argument in three cases.
i. Suppose $\phi(a)$ is a maximum point and $\phi(b)$ is a minimum point. Then
$$\phi'(a) = \lim_{h \to 0+} \frac{\phi(a + h) - \phi(a)}{h} \le 0$$
so that $f'(a) \le y$. And
$$\phi'(b) = \lim_{h \to 0-} \frac{\phi(b + h) - \phi(b)}{h} \le 0$$
so that $f'(b) \le y$ as well. This contradicts the hypothesis.
ii. Suppose $\phi(a)$ is a minimum and $\phi(b)$ is a maximum. Then a similar contradiction can be deduced. (See Exercise 4.30.)
iii. Finally, if one endpoint is both a maximum and a minimum point, then $\phi$ is a constant function and $\phi'(x) \equiv 0$.
•
EXERCISES
4.24 † Prove that $F'(x)$ exists for all $x \in [-1, 1]$ in Example 4.4, yet $F'$ is not bounded, and so $F' \notin \mathcal{R}[-1, 1]$ and also $F' \notin C[-1, 1]$.
4.25 Let
$$F(x) = \int_0^{x^2} e^{-t^2}\,dt$$
for all $x \in \mathbb{R}$. Find $F'(x)$. (Hint: Let $G(u) = \int_0^u e^{-t^2}\,dt$ where $u = U(x) = x^2$. Thus $F = G \circ U$.)
4.26 Suppose $f \in \mathcal{R}[a, b]$ and let
$$F(x) = \int_a^x f(t)\,dt$$
for all $x \in [a, b]$. Prove: $F \in C[a, b]$. (Hint: $f$ must be bounded.)
4.27
Let
$$f(x) = \begin{cases} \sin\frac{1}{x} & \text{if } 0 < x \le 1, \\ 0 & \text{if } x = 0. \end{cases}$$
Let
$$F(x) = \int_0^x f(t)\,dt$$
for all $x \in [0, 1]$. Prove: $F \in C[0, 1]$. (Hint: See Exercise 3.27.)
4.28 Suppose/ E R[a,b] andletF(x) = J: f(t)dtforallx E [a,b]. If f(x) 2 0 for all x E [a, b], prove F is increasing on [a, b]. 4.29
Suppose f' and g' exist and suppose f', g' E R[a, b]. Prove:
1b
f(x)g'(x) dx = f(b)g(b)- f(a)g(a)
(Hint: Consider J:(fg)'(x) dx.)
4.30
t Prove case (ii) of Theorem 4.3.3.
-1b
g(x)f'(x) dx.
Figure 4.3  $f(x) = 2x \sin\left(\frac{1}{x}\right) - \cos\left(\frac{1}{x}\right)$.

4.31
Suppose
f(x)
= {
~
x < 1,
ifO
~
if1
~X~
2.
a) Does there exist a function $F$ on $[0, 2]$ such that $F'(x) = f(x)$? b) Let $F(x) = \int_0^x f(t)\,dt$ for all $x \in [0, 2]$. Prove that $F \in C[0, 2]$ but that it is false that $F'(x) = f(x)$ on $[0, 2]$.
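4.32 Let
\[ F(x) = \begin{cases} x^2 \sin\frac{1}{x} & \text{if } 0 < |x| \le 1, \\ 0 & \text{if } x = 0 \end{cases} \]
and let $f(x) = F'(x)$. (See Fig. 4.3.)
a) Prove that $F'(x)$ exists for all $x \in [-1, 1]$.
b) Find $f(x)$ for all $x \in [-1, 1]$, and prove $f \in \mathcal{R}[-1, 1] \setminus C[-1, 1]$. (Hint: Apply Exercise 3.26.)
c) Find $\int_{-1}^{1} f(x)\,dx$.

4.33 Find $\int_0^1 g(x)\,dx$ if
\[ g(x) = \begin{cases} 2x \cos\frac{\pi}{x} + \pi \sin\frac{\pi}{x} & \text{if } x \in (0, 1], \\ 0 & \text{if } x = 0. \end{cases} \]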
4.34 Suppose $f : (-a, a) \to \mathbb{R}$ is a differentiable function. Either prove the following statements or give counterexamples:
a) If $f$ is an odd function, then $f'$ is an even function.
b) If $f$ is an even function, then $f'$ is an odd function.
c) ◊ If $f'$ is odd, then $f$ is even. (Hint: Use Theorem 4.5.1.)
d) If $f'$ is even, then $f$ is odd.

4.4 UNIFORM CONVERGENCE AND THE DERIVATIVE
We have learned that if $f_n \in C[a, b]$ and if $\|f_n - f\|_{\sup} \to 0$, then $f \in C[a, b]$ as well. We have seen that if $f_n \in \mathcal{R}[a, b]$ and if $\|f_n - f\|_{\sup} \to 0$, then $f \in \mathcal{R}[a, b]$ and, moreover,
\[ \int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx. \]
It is understandable that the student now anticipates learning a virtually identical result for differentiation. The following examples show, however, that differentiation behaves in a more delicate manner with respect to uniform convergence (meaning sup-norm convergence).

• EXAMPLE 4.5
Let $f_n(x) = \frac{\sin nx}{n}$ for all $x \in \mathbb{R}$, and let $g(x) \equiv 0$. Then
\[ \|f_n - g\|_{\sup} = \sup\left\{ \frac{|\sin nx|}{n} \,\middle|\, x \in \mathbb{R} \right\} = \frac{1}{n} \to 0 \]
so that $f_n \to g$ uniformly on $\mathbb{R}$. However, $f_n'(x) = \cos nx \not\to g'(x) = 0$ for some values of $x$. For example, if $x = 0$, then $\cos nx \equiv \cos 0 = 1 \not\to 0$, and this failure of $f_n'(x)$ to converge to $g'(x)$ occurs at many other values of $x$ as well. (See Fig. 4.4.)

Figure 4.4  $f_n(x) = \frac{1}{n} \sin nx$, $n = 10, 20, 40$.
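The behavior in Example 4.5 can be observed numerically. The following Python sketch is only an illustration: the function names and the grid size are choices made here, not part of the text. It approximates the sup norm of $f_n$ on a grid over one period and evaluates the exact derivative at $x = 0$.

```python
import math

def f_n(n, x):
    """f_n(x) = sin(n x) / n from Example 4.5."""
    return math.sin(n * x) / n

def f_n_prime(n, x):
    """Exact derivative f_n'(x) = cos(n x)."""
    return math.cos(n * x)

def sup_norm_on_grid(n, samples=20001):
    """Approximate sup |f_n| over one period [0, 2*pi] on a uniform grid."""
    return max(abs(f_n(n, 2 * math.pi * i / (samples - 1)))
               for i in range(samples))

for n in (10, 20, 40):
    # The grid sup tracks 1/n, while the derivative at 0 stays equal to 1.
    print(n, sup_norm_on_grid(n), 1 / n, f_n_prime(n, 0.0))
```

A grid maximum only approximates a supremum from below, so this is a sanity check and not a proof; the exact value $1/n$ comes from the computation in the example.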
• EXAMPLE 4.6

Let
\[ f_n(x) = |x|^{1 + \frac{1}{n}} \]
for all $x \in [-1, 1]$. Then $f_n$ is increasing as $n$ increases and $f_n(x) \to f(x) = |x|$ uniformly on $[-1, 1]$ by Dini's Theorem. (An alternative approach to proving this uniform convergence is illustrated in Example 2.9.) If $x > 0$, then $f(x) \equiv x$ and $f'(x) = 1$. If $x < 0$, then $f(x) \equiv -x$ and $f'(x) = -1$. Note that $f'(0)$ does not exist, but
\[ \lim_{x \to 0+} \frac{f_n(x) - f_n(0)}{x - 0} = \lim_{x \to 0+} \frac{x^{1 + \frac{1}{n}}}{x} = \lim_{x \to 0+} x^{\frac{1}{n}} = 0 \]
and
\[ \lim_{x \to 0-} \frac{f_n(x) - f_n(0)}{x - 0} = \lim_{x \to 0-} \frac{(-x)^{1 + \frac{1}{n}}}{x} = \lim_{x \to 0-} -(-x)^{\frac{1}{n}} = 0, \]
so that $f_n'(0) = 0 \to 0$ and yet $f'(0)$ does not exist. (See Fig. 4.5.)

Figure 4.5  $|x|^{1 + \frac{1}{n}}$ increases with $n$ and approaches $|x|$.
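The uniform convergence in Example 4.6 can be checked numerically. The sketch below is an illustration under assumptions made here: the helper names are invented, and the closed form in `sup_gap_exact` comes from a routine calculus computation (setting the derivative of $x - x^{1+1/n}$ to zero on $(0,1)$), which is not carried out in the text.

```python
def gap(n, x):
    """|x| - |x|**(1 + 1/n): the pointwise gap between the limit |x| and f_n."""
    a = abs(x)
    return a - a ** (1.0 + 1.0 / n)

def sup_gap_on_grid(n, samples=100001):
    """Approximate the sup of the gap over [-1, 1] on a uniform grid."""
    return max(gap(n, -1.0 + 2.0 * i / (samples - 1)) for i in range(samples))

def sup_gap_exact(n):
    """Maximum of x - x**(1+1/n) on [0, 1], attained at x = (n/(n+1))**n."""
    x = (n / (n + 1.0)) ** n
    return x / (n + 1.0)

for n in (1, 5, 50, 500):
    # Both columns shrink toward 0 as n grows, consistent with uniform convergence.
    print(n, sup_gap_on_grid(n), sup_gap_exact(n))
```

The printed values decrease roughly like a constant times $1/n$, which is one concrete way to see $\|f_n - f\|_{\sup} \to 0$ even though the limit function has a corner at $0$.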
Nevertheless, there is a useful theorem about uniform convergence and the derivative, as we see below.

Theorem 4.4.1 Suppose $f_n$ is defined on a finite interval $I$ and $f_n' \in C(I)$. Suppose $f_n'$ converges uniformly on $I$. Suppose moreover that there exists at least one point $a \in I$ such that $f_n(a)$ is a convergent sequence of real numbers. Then there exists a differentiable function $f$ such that $f_n \to f$ uniformly on $I$, and
\[ f'(x) = \lim_{n \to \infty} f_n'(x) \]
on $I$.

Remark 4.4.1 Notice that the key unexpected hypothesis is that it is the sequence of derivatives $f_n'$ which we must assume to be uniformly convergent.

Proof: Let $g$ denote temporarily the uniform limit of $f_n'$ on $I$, so $g \in C[a, x]$ for each $x$ such that $[a, x] \subseteq I$. By the Fundamental Theorem (Version 1) we can define $f(x)$ for all $x \in I$ as follows:
\[ f_n(x) = f_n(a) + \int_a^x f_n'(t)\,dt \to \lim_{n \to \infty} f_n(a) + \int_a^x g(t)\,dt = f(x). \]
By the Fundamental Theorem (Version 2), $f'(x) = g(x)$ for all $x \in I$. Thus $f_n \to f$ at least pointwise. To show $\|f_n - f\|_{\sup} \to 0$, observe that
\[ |f_n(x) - f(x)| = \left| f_n(a) + \int_a^x f_n'(t)\,dt - f(a) - \int_a^x g(t)\,dt \right|. \]
Denoting by $L$ the finite length of $I$, we have
\[ \|f_n - f\|_{\sup} \le |f_n(a) - f(a)| + L \|f_n' - g\|_{\sup} \to 0 + 0 = 0 \]
as $n \to \infty$. •
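The final estimate in the proof can be tested on a concrete family. The sketch below assumes a hypothetical example chosen for this illustration, $f_n(x) = x + \sin(nx)/n^2$ on $I = [0, 2]$ with $a = 0$, so that $f_n' \to g \equiv 1$ uniformly and $f(x) = x$. The names and the grid are not from the text.

```python
import math

A, B = 0.0, 2.0                      # the interval I = [A, B], of length L = B - A
GRID = [A + (B - A) * i / 4000 for i in range(4001)]

def f_n(n, x):
    return x + math.sin(n * x) / n**2

def f_n_prime(n, x):
    return 1.0 + math.cos(n * x) / n

def g(x):
    return 1.0                       # uniform limit of the derivatives

def f(x):
    return x                         # lim f_n(A) + integral of g from A to x

def sup_on_grid(h):
    """Grid approximation of the sup norm of h on I."""
    return max(abs(h(x)) for x in GRID)

def check(n):
    """Return both sides of ||f_n - f|| <= |f_n(a) - f(a)| + L * ||f_n' - g||."""
    lhs = sup_on_grid(lambda x: f_n(n, x) - f(x))
    rhs = abs(f_n(n, A) - f(A)) + (B - A) * sup_on_grid(lambda x: f_n_prime(n, x) - g(x))
    return lhs, rhs

for n in (1, 5, 25):
    print(n, check(n))
```

In this example the left side is at most $1/n^2$ and the right side equals $2/n$, so the inequality is far from sharp; the point is only that control of the derivatives plus convergence at one point controls the functions.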
EXERCISES

4.35 Let
\[ f_n(x) = \frac{1}{n} \sin(n^2 x). \]
Prove: $f_n$ converges uniformly on $\mathbb{R}$ to a differentiable function, yet $f_n'(0)$ diverges.

4.36 In Example 4.5, find a value of $x$ for which $f_n'(x)$ diverges as $n \to \infty$.

4.37 Give an example of a sequence $f_n$ for which $f_n' \to 0$ uniformly on $\mathbb{R}$, yet $f_n(x)$ diverges for all $x \in \mathbb{R}$.

4.38 Let $f_n(x) = \sin^n x$ for all $x \in [0, \pi]$. Prove that $f_n'$ is not uniformly convergent on $[0, \pi]$. (Hint: Suppose false and apply Theorem 4.4.1 to deduce a contradiction.)

4.39 Denote by $C^1[a, b]$ the space of functions having continuous derivatives on $[a, b]$. Define $\|f\| = \|f\|_{\sup} + \|f'\|_{\sup}$ for each $f \in C^1[a, b]$.
a) Prove that $\|\cdot\|$ is a norm on the vector space $C^1[a, b]$.
b) ◊ Prove that $C^1[a, b]$ is a complete normed linear space using the norm $\|\cdot\|$.

4.5 CAUCHY'S GENERALIZED MEAN VALUE THEOREM
Theorem 4.5.1 (Cauchy's Generalized Mean Value Theorem) Let $f$ and $g$ be in $C[a, b]$ and suppose that both $f$ and $g$ are differentiable at least on $(a, b)$, with $g'(x) \ne 0$ for any $x \in (a, b)$. Then there exists $\bar{x} \in (a, b)$ such that
\[ \frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(\bar{x})}{g'(\bar{x})}. \]
Remark 4.5.1 Observe that the pair of functions x = g(t), y = f(t) with t E [a, b] defines a smooth curve in the xy-plane parametrically. This curve need not be the graph of a function of the form y = ¢( x), since it is possible that more than one y corresponds to x. However, geometrically, the conclusion of the Generalized MVT is essentially the same as that of the ordinary MVT: There is a tangent line to the parametric curve that is parallel to the chord joining the endpoints. (See Fig. 4.6.)
Figure 4.6  $x = t \cos t$, $y = t \sin t$, chord with parallel tangent.
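The conclusion of Theorem 4.5.1 can be checked numerically on a simple pair of functions. The sketch below uses $f = \sin$ and $g = \cos$ on $[0, \pi/2]$, a choice made only for this illustration, and locates the intermediate point by bisection. The bisection step assumes that $f'(t) - m\,g'(t)$ is continuous and changes sign, where $m$ is the chord slope; the theorem itself needs no such assumption.

```python
import math

def cauchy_point(f, g, df, dg, a, b, tol=1e-12):
    """Bisection for a root of h'(t) = f'(t) - slope * g'(t) on (a, b).

    Assumes h' is continuous and changes sign between a and b; this is an
    extra assumption of the sketch, not a hypothesis of the theorem.
    """
    slope = (f(b) - f(a)) / (g(b) - g(a))
    hp = lambda t: df(t) - slope * dg(t)
    lo, hi = a, b
    if hp(lo) * hp(hi) > 0:
        raise ValueError("h' does not change sign on [a, b]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if hp(lo) * hp(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

t_bar = cauchy_point(math.sin, math.cos,
                     math.cos, lambda t: -math.sin(t),
                     0.0, math.pi / 2)
slope = (math.sin(math.pi / 2) - math.sin(0.0)) / (math.cos(math.pi / 2) - math.cos(0.0))
# The ratio f'/g' at the located point should match the chord slope.
print(t_bar, math.cos(t_bar) / -math.sin(t_bar), slope)
```

For this pair the parametric curve is a quarter of the unit circle, and the located parameter is the midpoint of the arc, where the tangent is parallel to the chord.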
Proof: Using the parametric functions defined in the remark above, we write an equation for the straight line through the endpoints of the parametric curve:
\[ y = f(a) + \frac{f(b) - f(a)}{g(b) - g(a)} \bigl(x - g(a)\bigr). \]
Note that $g(b) - g(a) \ne 0$ because of the ordinary Mean Value Theorem. Now we define a function $h(t)$ to be the difference in height between the parametric curve and the chord at the point with first coordinate $x = g(t)$, $t \in [a, b]$. Thus
\[ h(t) = f(t) - \left[ f(a) + \frac{f(b) - f(a)}{g(b) - g(a)} \bigl(g(t) - g(a)\bigr) \right] \]
for all $t \in [a, b]$. We see that $h \in C[a, b]$ and $h'$ exists at least on $(a, b)$. And $h(a) = 0 = h(b)$. By Rolle's Theorem, there exists $\bar{x} \in (a, b)$ such that
\[ h'(\bar{x}) = 0 = f'(\bar{x}) - \frac{f(b) - f(a)}{g(b) - g(a)}\, g'(\bar{x}). \]
•

As an application of the Cauchy Generalized Mean Value Theorem, we prove one of many useful versions of L'Hôpital's Rule.
Theorem 4.5.2 (L'Hôpital's Rule) Suppose $f$ and $g$ are in $C[a, b]$, both differentiable at least on $(a, b)$, with $g'(x)$ nowhere vanishing. If
\[ \lim_{x \to a} f(x) = 0 = \lim_{x \to a} g(x) \]
and if
\[ \lim_{x \to a} \frac{f'(x)}{g'(x)} = L, \]
then $\lim_{x \to a} \frac{f(x)}{g(x)}$ exists and
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = L. \]

Proof: Observe that under the hypotheses we must have $f(a) = 0 = g(a)$. By Cauchy's Generalized MVT we have
\[ \frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f'(\bar{x})}{g'(\bar{x})} \to L \]
as $x \to a$, because this forces $\bar{x} \to a$. •

Remark 4.5.2 We observe that a similar theorem could be proved with the limit as $x \to b$, and that $g(x)$ cannot be $0$ for $x \in (a, b)$ in this theorem, since $g(a) = 0$ and $g'$ never vanishes.
• EXAMPLE 4.7

We give some examples of limits that can be computed by means of L'Hôpital's Rule.

i. $\displaystyle \lim_{x \to 1} \frac{\log x}{x - 1} = \lim_{x \to 1} \frac{1/x}{1} = 1$.

ii. $\displaystyle \lim_{x \to 0} \frac{1 - \cos x}{x^2} = \lim_{x \to 0} \frac{\sin x}{2x} = \lim_{x \to 0} \frac{\cos x}{2} = \frac{1}{2}$.
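The two limits in Example 4.7 can be sanity-checked by evaluating the quotients near the limit point. This Python sketch is an illustration only; the function names are invented here, and very small step sizes are avoided because the subtraction $1 - \cos x$ loses accuracy in floating point.

```python
import math

def ratio_log(x):
    """log(x) / (x - 1), the quotient from part (i)."""
    return math.log(x) / (x - 1.0)

def ratio_cos(x):
    """(1 - cos x) / x**2, the quotient from part (ii)."""
    return (1.0 - math.cos(x)) / (x * x)

for k in range(1, 6):
    h = 10.0 ** (-k)
    # First column tends to 1, second column tends to 1/2 as h shrinks.
    print(h, ratio_log(1.0 + h), ratio_cos(h))
```

Such a table suggests the value of a limit but proves nothing; the rule itself supplies the proof.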
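Next, we prove two of many possible variations on L'Hôpital's Rule. (A similar theorem could be proven for $x \to -\infty$.)

Theorem 4.5.3 Suppose $f$ and $g$ are both differentiable on $(b, \infty)$ with $g'(x)$ nowhere $0$ and $b > 0$. Suppose that
\[ \lim_{x \to \infty} f(x) = 0 = \lim_{x \to \infty} g(x) \]
and also that
\[ \lim_{x \to \infty} \frac{f'(x)}{g'(x)} = L. \]
Then $\lim_{x \to \infty} \frac{f(x)}{g(x)}$ exists and is equal to $L$.

Proof: Define
\[ F(u) = \begin{cases} f\left(\frac{1}{u}\right) & \text{if } 0 < u < \frac{1}{b}, \\ 0 & \text{if } u = 0 \end{cases} \]
and
\[ G(u) = \begin{cases} g\left(\frac{1}{u}\right) & \text{if } 0 < u < \frac{1}{b}, \\ 0 & \text{if } u = 0. \end{cases} \]
We see that $F, G \in C[0, 1/b)$. Also, $G'(u) = -g'(1/u)/u^2 \ne 0$ on $(0, 1/b)$. Thus
\[ \lim_{x \to \infty} \frac{f(x)}{g(x)} = \lim_{u \to 0+} \frac{F(u)}{G(u)} = \lim_{u \to 0+} \frac{F'(u)}{G'(u)} = \lim_{u \to 0+} \frac{-f'\left(\frac{1}{u}\right)/u^2}{-g'\left(\frac{1}{u}\right)/u^2} = \lim_{u \to 0+} \frac{f'\left(\frac{1}{u}\right)}{g'\left(\frac{1}{u}\right)} = \lim_{x \to \infty} \frac{f'(x)}{g'(x)} = L. \]
•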
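Theorem 4.5.4 Suppose for all $x > b$ that $f$ and $g$ are both differentiable with $g'(x)$ nowhere $0$. Suppose that
\[ \lim_{x \to \infty} f(x) = \infty = \lim_{x \to \infty} g(x) \]
and that
\[ \lim_{x \to \infty} \frac{f'(x)}{g'(x)} = L. \]
Then $\lim_{x \to \infty} \frac{f(x)}{g(x)}$ exists and is equal to $L$.

Proof: If $x > x_0 > b$, there exists $\bar{x}$ between $x_0$ and $x$ such that
\[ \frac{f'(\bar{x})}{g'(\bar{x})} = \frac{f(x) - f(x_0)}{g(x) - g(x_0)} = \frac{f(x)}{g(x)} \left[ \frac{1 - \frac{f(x_0)}{f(x)}}{1 - \frac{g(x_0)}{g(x)}} \right], \]
where the expression in brackets $\to 1$ as $x \to \infty$ and where $\bar{x}$ depends upon $x$. We can write this as
\[ \frac{f(x)}{g(x)} = \frac{f'(\bar{x})}{g'(\bar{x})} \left[ \frac{1 - \frac{g(x_0)}{g(x)}}{1 - \frac{f(x_0)}{f(x)}} \right]. \]
If $\epsilon > 0$, there exists $x_0$ such that $x > x_0$ implies
\[ L - \frac{\epsilon}{2} < \frac{f'(x)}{g'(x)} < L + \frac{\epsilon}{2}. \]
Let $1 > \eta > 0$. There exists $B > x_0$ such that $x \ge B$ implies
\[ \left| \frac{1 - \frac{g(x_0)}{g(x)}}{1 - \frac{f(x_0)}{f(x)}} - 1 \right| < \eta. \]
Thus
\[ \left(L - \frac{\epsilon}{2}\right)(1 - \eta) < \frac{f(x)}{g(x)} < \left(L + \frac{\epsilon}{2}\right)(1 + \eta), \]
since $1 - \eta > 0$. By picking $\eta > 0$ sufficiently small we can guarantee that
\[ \left| \frac{f(x)}{g(x)} - L \right| < \epsilon \]
for all $x \ge B$. Thus $\frac{f(x)}{g(x)} \to L$ as $x \to \infty$. •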
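• EXAMPLE 4.8

Here are more examples utilizing L'Hôpital's Rule.

i. $\displaystyle \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n = \lim_{x \to \infty} \left(1 + \frac{1}{x}\right)^x = \lim_{x \to \infty} e^{x \log\left(1 + \frac{1}{x}\right)} = e^{\lim_{x \to \infty} \frac{x}{1 + x}} = e$.

ii. $\displaystyle \lim_{x \to \infty} \frac{x^n}{e^x} = \lim_{x \to \infty} \frac{n x^{n-1}}{e^x} = \cdots = \lim_{x \to \infty} \frac{n!}{e^x} = 0$. Here $n \in \mathbb{N}$.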
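Both limits in Example 4.8 are easy to observe numerically. The Python sketch below is an illustration with names chosen here; it evaluates $(1 + 1/n)^n$ for growing $n$ and $x^n / e^x$ for a fixed $n$ and growing $x$.

```python
import math

def compound(n):
    """(1 + 1/n)**n, which tends to e as n grows."""
    return (1.0 + 1.0 / n) ** n

def power_over_exp(n, x):
    """x**n / e**x, which tends to 0 as x grows for each fixed n."""
    return x ** n / math.exp(x)

for n in (10, 100, 10**4, 10**6):
    print(n, compound(n), math.e - compound(n))

for x in (10.0, 50.0, 100.0):
    print(x, power_over_exp(3, x))
```

The first table shows slow convergence (the gap shrinks roughly like a constant over $n$), while the second shows how quickly the exponential overwhelms a fixed power.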
EXERCISES

4.40 Let $f(t) = t^2$ and $g(t) = t^3$, for all $t \in [0, 1]$.
a) Find the value(s) of $\bar{t}_1 \in [0, 1]$ such that
\[ f(1) - f(0) = f'(\bar{t}_1)(1 - 0). \]
b) Find the value(s) of $\bar{t}_2 \in [0, 1]$ such that
\[ g(1) - g(0) = g'(\bar{t}_2)(1 - 0). \]
c) Find the value(s) of $\bar{t} \in [0, 1]$ such that
\[ \frac{f(1) - f(0)}{g(1) - g(0)} = \frac{f'(\bar{t})}{g'(\bar{t})}. \]

4.41 Let $p$ be a polynomial of degree $n$ and let $E = \{x \mid e^x = p(x)\}$. Prove that the number of elements in the set $E$, denoted by $|E|$, satisfies the inequality $|E| \le n + 1$. (Hint: Use Cauchy's Generalized Mean Value Theorem.)

4.42 Find the error in the following attempt to apply L'Hôpital's Rule:
\[ \lim_{x \to 2} \frac{x^2 - x - 2}{x^2 - 2x} = \lim_{x \to 2} \frac{2x - 1}{2x - 2} = \lim_{x \to 2} \frac{2}{2} = 1. \]
Show that in fact $\lim_{x \to 2} \frac{x^2 - x - 2}{x^2 - 2x} = \frac{3}{2}$.

4.43 Find
(Hint: See Example 4.8(i).)
4.44 If $n \in \mathbb{N}$ and if $p > 0$, find $\lim_{x \to \infty} \frac{(\log x)^n}{x^p}$.

4.45 If $P$ is any polynomial, find
\[ \lim_{h \to 0} \frac{P(x + 3h) + P(x - 3h) - 2P(x)}{h^2}. \]

4.46 Find
where $k \in \mathbb{N}$.
4.6 TAYLOR'S THEOREM

We will see that Taylor's Theorem is another generalization of the Mean Value Theorem. Denote by $C^n(a - r, a + r)$ the set of all functions $f$ such that at least the first $n$ derivatives of $f$ are continuous: thus $f^{(n)} \in C(a - r, a + r)$. We would like to approximate $f$ by means of a polynomial of degree $n$, expressed in powers of $x - a$:
\[ f(x) \approx c_0 + c_1 (x - a) + \cdots + c_n (x - a)^n. \]
If we actually had $f(x) = c_0 + c_1 (x - a) + \cdots + c_n (x - a)^n$, then it is easy to see by repeated differentiation and substitution of $x = a$ that
\[ f(a) = c_0, \quad f'(a) = c_1, \quad \ldots, \quad f^{(n)}(a) = n!\, c_n. \]
Thus we could in this case write
\[ c_k = \frac{f^{(k)}(a)}{k!} \]
for all $k = 0, \ldots, n$, with the understandings that $0! = 1$ and $f^{(0)}(a) = f(a)$, by definition.

Definition 4.6.1 We define the $n$th Taylor Polynomial
\[ P_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!} (x - a)^k \]
for all $f \in C^n(a - r, a + r)$, where $r > 0$.

The problem is to describe the difference $f(x) - P_n(x)$, which we call $R_n(x)$, the $n$th Taylor Remainder term. This remainder is identically zero only if $f$ is actually equal to the polynomial $P_n$.
Theorem 4.6.1 (Taylor's Theorem) If $f^{(n+1)}(x)$ exists for all $x$ in the interval $(a - r, a + r)$, then we have $f(x) = P_n(x) + R_n(x)$, where
\[ R_n(x) = \frac{f^{(n+1)}(\mu)}{(n+1)!} (x - a)^{n+1} \]
for some suitable value of $\mu$ between $a$ and $x$.

Remark 4.6.1 We can regard Taylor's Theorem as a generalization of the Mean Value Theorem in the following sense. The Mean Value Theorem says that if $f'$ exists on $(a - r, a + r)$, then
\[ f(x) = f(a) + f'(\mu)(x - a) \]
for all $x \in (a - r, a + r)$ and for some suitable $\mu$ strictly between $a$ and $x$. Note that $f(a) = P_0(x)$, a constant polynomial. And $f'(\mu)(x - a)$ is in the correct form to be $R_0(x)$. Thus the Mean Value Theorem implies the special case of Taylor's Theorem in which $n = 0$. Note that if $f^{(n+1)}(x)$ exists for all $x \in (a - r, a + r)$, then $f \in C^n(a - r, a + r)$.

Proof: Fix $x$ for now. If $x = a$ we see easily that $f(a) = P_n(a)$, so that $R_n(a) = 0$. On the other hand, if $x \ne a$, then
\[ \frac{(x - a)^{n+1}}{(n+1)!} \ne 0, \]
so there exists $K \in \mathbb{R}$ such that
\[ R_n(x) = \frac{K}{(n+1)!} (x - a)^{n+1}. \]
Our goal is to show there exists $\mu$ between $a$ and $x$ such that $K = f^{(n+1)}(\mu)$. The trick is to replace $a$ by a variable $\alpha$ and to define
\[ h(\alpha) = f(x) - \left[ \sum_{k=0}^{n} \frac{f^{(k)}(\alpha)}{k!} (x - \alpha)^k + \frac{K}{(n+1)!} (x - \alpha)^{n+1} \right]. \]
If we set $\alpha = a$, we see that
\[ h(a) = f(x) - \left[ P_n(x) + \frac{K}{(n+1)!} (x - a)^{n+1} \right] = 0 \]
by definition of $K$. On the other hand, if we set $\alpha = x$, then we see that
\[ h(x) = f(x) - [f(x) + 0 + \cdots + 0] = 0. \]
Moreover, as a function of $\alpha$, $h$ is differentiable for all $\alpha$ between $a$ and $x$. By Rolle's Theorem, there exists $\mu$ between $a$ and $x$ such that $h'(\mu) = 0$. However,
\[ h'(\alpha) = 0 - \Bigl\{ f'(\alpha) + \bigl(-f'(\alpha) + f''(\alpha)(x - \alpha)\bigr) + \Bigl(-f''(\alpha)(x - \alpha) + \frac{f^{(3)}(\alpha)}{2!}(x - \alpha)^2\Bigr) + \cdots + \Bigl(-\frac{f^{(n)}(\alpha)}{(n-1)!}(x - \alpha)^{n-1} + \frac{f^{(n+1)}(\alpha)}{n!}(x - \alpha)^n\Bigr) - \frac{K}{n!}(x - \alpha)^n \Bigr\} \]
\[ = -\left[ \frac{f^{(n+1)}(\alpha)}{n!}(x - \alpha)^n - \frac{K}{n!}(x - \alpha)^n \right], \]
because the sum in braces telescopes. Since $\mu \ne x$, it follows from $h'(\mu) = 0$ that $K = f^{(n+1)}(\mu)$. •
• EXAMPLE 4.9

Let $f(x) = e^x$ and write $f(x) = P_n(x) + R_n(x)$ in powers of $x = x - 0$. We claim that $R_n(x) \to 0$ as $n \to \infty$ for all $x \in \mathbb{R}$, so that
\[ \lim_{n \to \infty} \sum_{k=0}^{n} \frac{1}{k!} x^k = e^x \]
for all $x \in \mathbb{R}$. It suffices to show for all $x$ that
\[ R_n(x) = \frac{f^{(n+1)}(\mu)}{(n+1)!} x^{n+1} = \frac{e^{\mu}}{(n+1)!} x^{n+1} \to 0 \]
as $n \to \infty$. However,
\[ 0 \le |R_n(x)| = \frac{e^{\mu}}{(n+1)!} |x|^{n+1} \le e^{|x|} \frac{|x|^{n+1}}{(n+1)!}. \]
Thus it suffices to show that for all $x \in \mathbb{R}$,
\[ \frac{|x|^{n+1}}{(n+1)!} \to 0 \]
as $n \to \infty$. So fix $x$ arbitrarily and observe that there exists $N \in \mathbb{N}$ such that $\frac{|x|}{N} < \frac{1}{2}$. Now, for all $n \ge N$, write $n = N + k$, and
\[ \frac{|x|^n}{n!} \le \frac{|x|^N}{N!} \cdot \frac{1}{2^k} \to 0 \]
as $k \to \infty$, which is forced since $n \to \infty$.
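The remainder estimate in Example 4.9 can be compared with actual errors. The Python sketch below is an illustration; the function names are chosen here. It computes $P_n(x)$ for $f(x) = e^x$ about $a = 0$ and compares $|e^x - P_n(x)|$ with the bound $e^{|x|} |x|^{n+1} / (n+1)!$ from the example.

```python
import math

def taylor_exp(x, n):
    """P_n(x) = sum_{k=0}^n x**k / k!, the Taylor polynomial of exp about 0."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k          # builds x**k / k! from the previous term
        total += term
    return total

def remainder_bound(x, n):
    """The bound e**|x| * |x|**(n+1) / (n+1)! used in Example 4.9."""
    return math.exp(abs(x)) * abs(x) ** (n + 1) / math.factorial(n + 1)

for x in (-3.0, 0.5, 4.0):
    for n in (5, 10, 20):
        err = abs(math.exp(x) - taylor_exp(x, n))
        print(x, n, err, remainder_bound(x, n))
```

The bound is usually pessimistic, since it replaces $e^{\mu}$ by $e^{|x|}$, but it shrinks to zero for every fixed $x$, which is all the argument needs.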
EXERCISES

4.47 Suppose $f$ is a polynomial of degree $n$ on $(a - r, a + r)$. Prove that $f(x) \equiv P_n(x)$, the $n$th Taylor Polynomial defined in Definition 4.6.1. (Hint: Prove that $R_n(x) \equiv 0$.)

4.48 If $x > 0$, prove that
(Hint: Use $e^x = P_1(x) + R_1(x)$.)

4.49 Prove that $e$ is irrational. (Hint: Suppose false, so that $e = \frac{p}{q}$, where $p, q \in \mathbb{N}$. Write $e = e^1 = P_n(1) + R_n(1)$, multiply both sides by $n!$, and deduce a contradiction when $n \in \mathbb{N}$ is sufficiently large.)

4.50 Expand the polynomial $p(x) = 3x^3 + 2x^2 - x + 1$ as a polynomial in powers of $(x - 1)$: that is, show that
\[ p(x) = \sum_{k=0}^{3} c_k (x - 1)^k \]
and find the values of the constants $c_0, \ldots, c_3$.

4.51 Let $f(x) = \sin x$ and find the $n$th Taylor Polynomial $P_n(x)$ in powers of $x = x - 0$ (that is, use $a = 0$). Prove that $P_n(x) \to \sin x$ as $n \to \infty$ for all $x \in \mathbb{R}$. (See Fig. 4.7.) Prove also that no polynomial $P(x) \equiv \sin x$ on any interval of positive length.

Figure 4.7  $\sin(x)$, $P_5(x)$, and $P_7(x)$ on $[0, \pi]$. Can you see which curve belongs to which function?

4.52 Let $f(x) = \cos x$ and find the $n$th Taylor Polynomial $P_n(x)$ in powers of $x = x - 0$ (i.e., use $a = 0$). Prove that $P_n(x) \to \cos x$ as $n \to \infty$ for all $x \in \mathbb{R}$.
4.7 TEST YOURSELF

EXERCISES

4.53 True or False: $\ln(1 + x) < x$ for all $x > 0$.

4.54 Give an example of a function $F$ that is differentiable at every $x \in [0, 1]$ yet $F'(x)$ is not bounded on $[0, 1]$.

4.55 True or False: If $f \in \mathcal{R}[a, b]$ and $F(x) = \int_a^x f(t)\,dt$ for all $x \in [a, b]$, then $F'(x) = f(x)$ for all $x \in [a, b]$.

4.56 True or False: If $F'(x)$ exists and is Riemann integrable on $[a, b]$, then
\[ \int_a^b F'(x)\,dx = F(b) - F(a). \]

4.57 Suppose that the derivative $f'(x)$ exists and is bounded on the closed interval $[a, b]$ and that $f'$ is continuous on the open interval $(a, b)$ but is not continuous at $a$ or at $b$. True or False: The function $f'$ is Riemann integrable on $[a, b]$.

4.58 Give an example of an unbounded function $f$ on $[0, 1]$ that is equal everywhere to the derivative of another function $F$.

4.59 Define $\|f\| = \|f\|_{\sup} + \|f'\|_{\sup}$ for each $f \in C^1[a, b]$. True or False: If $T : C^1[a, b] \to \mathbb{R}$ is defined by
\[ Tf = f'\left(\frac{a + b}{2}\right), \]
then $T$ is continuous with respect to $\|\cdot\|$.

4.60 Let $f(t) = t^2$ and $g(t) = t^3$, for all $t \in [0, 1]$. Find the value(s) of $\bar{t} \in [0, 1]$ such that
\[ \frac{f(1) - f(0)}{g(1) - g(0)} = \frac{f'(\bar{t})}{g'(\bar{t})}. \]
4.61 If $P$ is any polynomial, find $\lim_{h \to 0} \frac{P(x + 4h) + P(x - 4h) - 2P(x)}{h^2}$.
4.62
Fmd hmx-+oo
4.63
Let T(f) =
.
.
(logox/'o x . ·
J01 f(x) sin~ dx for each f
E
R[O, 1]. Is T continuous?
CHAPTER 5

INFINITE SERIES

An infinite series is a sum of infinitely many numbers. Infinite series appear throughout pure and applied mathematics, and they were important even long ago. For example, the student probably recalls that the infinite decimal expansion of the fraction $\frac{1}{3}$ is $0.33\underline{3}\ldots$. The underlined $\underline{3}$ connotes endless repetition of the digit 3. Such an endless decimal expansion can be understood as representing
\[ \frac{3}{10} + \frac{3}{100} + \frac{3}{1000} + \cdots, \]
where again the 3 dots indicate that the additions continue without end. What does it mean to add infinitely many numbers? Can anyone actually perform an infinite number of additions? These are some of the questions we will address in the present chapter.
5.1 SERIES OF CONSTANTS

If $x$ is a sequence of real numbers, we can think of $x$ as being a function defined on the natural numbers $\mathbb{N}$ as follows: for all $k \in \mathbb{N}$, $x(k) = x_k \in \mathbb{R}$. Usually the $k$th term of the sequence $x$ is denoted $x_k$ rather than $x(k)$, but it is useful to have the symbol $x$ for the sequence as a whole (viewed as a function, for example) so that we can write conveniently about various global properties of the sequence, as we will see in the next several sections. The first thing we wish to understand about a sequence $x$ is whether or not it is possible to define in a useful way the concept of the sum of the infinitely many numbers $x_k$. What is clear is that for all $n \in \mathbb{N}$ we can define the $n$th partial sum
\[ s_n = \sum_{k=1}^{n} x_k = x_1 + x_2 + \cdots + x_n \]
by virtue of finitely many repetitions of the closure of $\mathbb{R}$ under addition.
Definition 5.1.1 We say the infinite series
\[ \sum_{k=1}^{\infty} x_k = x_1 + x_2 + \cdots + x_n + \cdots \]
converges to the sum $s$ provided the sequence of partial sums $s_n \to s$ as $n \to \infty$. If $\sum_{k=1}^{\infty} x_k$ converges to $s$, then we call the sequence $x$ summable. A series that does not converge is called divergent.

Theorem 5.1.1 (nth Term Test) If $x$ is a summable sequence, then $x_n \to 0$ as $n \to \infty$.

Proof: Denote $s_n = \sum_{k=1}^{n} x_k$. Since $x$ is summable, there exists $L \in \mathbb{R}$ such that $s_n \to L$ as $n \to \infty$. Hence the sequence $s_{n-1} \to L$ as well. Therefore
\[ x_n = s_n - s_{n-1} \to L - L = 0. \]
•

Remark 5.1.1 We remind the reader here, for one final time, of the importance of learning how to read mathematical proofs, and how this contributes to learning to prove theorems oneself. The reader should review the Introduction on page xxiii. It is very important for each reader to write out a careful analysis of each proof studied, taking careful note of exactly how it works and how each step is justified. The proof of the nth term test is very brief. Even so, it offers lessons. The reader will see in Example 5.1 that it is possible for $x_n \to 0$, although $\sum_{n=1}^{\infty} x_n$ may diverge. The insight that enables us to prove this theorem is that we can express $x_n$ in terms of $s_n$ and $s_{n-1}$. Then one must recognize that if $s_n \to s$, so also must $s_{n-1} \to s$. In fact, because $s_n \to s$ we know that for each $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that $n \ge N$ implies that $|s_n - s| < \epsilon$. Hence if $n \ge N + 1$ we must have $|s_{n-1} - s| < \epsilon$ as well, showing that $s_{n-1} \to s$, lagging only one step behind $s_n$ itself.

It is important to understand that the nth term test can be used to prove divergence of an infinite series, but it is a one-directional implication and does not prove convergence, as shown by the following important example.
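• EXAMPLE 5.1

Let $x_k = \frac{1}{k}$ for all $k \in \mathbb{N}$. We claim that even though $x_k = \frac{1}{k} \to 0$ as $k \to \infty$, the so-called harmonic series $\sum_{k=1}^{\infty} x_k$ diverges. As usual, let $s_n$ denote the $n$th partial sum. If $s_n$ were convergent, then $s_n$ would be bounded, and every subsequence, such as $s_{2^n}$, would be bounded as well. However,
\[ s_{2^n} = 1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \cdots + \left(\frac{1}{2^{n-1} + 1} + \cdots + \frac{1}{2^n}\right) \ge 1 + \frac{n}{2} \to \infty \]
as $n \to \infty$. This is a contradiction. Hence the harmonic series diverges. See Exercise 5.1.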
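The grouping bound in Example 5.1 can be verified exactly for small $n$ with rational arithmetic. The Python sketch below is an illustration; the function name is chosen here, and exact fractions are used only to avoid any doubt about round-off.

```python
from fractions import Fraction

def harmonic(m):
    """Exact m-th partial sum of the harmonic series as a rational number."""
    return sum(Fraction(1, k) for k in range(1, m + 1))

for n in range(1, 11):
    s = harmonic(2 ** n)
    # s_{2^n} stays at or above 1 + n/2, which grows without bound.
    print(n, float(s), 1 + n / 2)
```

The table also shows how slowly the partial sums grow: doubling the number of terms adds well under 1 to the total.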
• EXAMPLE 5.2

Since the harmonic series diverges, it is interesting to note that the so-called alternating harmonic series
\[ \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots \]
converges. This is a consequence of the following theorem.

Theorem 5.1.2 (Alternating Series Test) Suppose the sequence $x_k$ is a decreasing sequence converging to the limit $0$. That is, suppose $x_1 \ge x_2 \ge \cdots$ and $x_k \to 0$ as $k \to \infty$. Then
\[ \sum_{k=1}^{\infty} (-1)^{k+1} x_k = x_1 - x_2 + x_3 - x_4 + \cdots \]
converges. Moreover, if $s$ denotes the sum of the infinite series, then we have the following estimate of the difference between the partial sum $s_n$ and the sum $s$:
\[ |s - s_n| \le x_{n+1} \]
for all $n$.
Proof: Observe that $s_1 = x_1 \ge s_2 = x_1 - x_2$. However, $s_3 = s_2 + x_3 \le s_1$ because the nonnegative sequence $x_k$ decreases. Extending this reasoning, we observe the following:
\[ s_2 \le s_4 \le s_6 \le \cdots \le s_5 \le s_3 \le s_1. \]
In other words, the subsequence $s_{2n}$ is increasing with every odd partial sum as an upper bound, and $s_{2n-1}$ is decreasing with every even partial sum as a lower bound. Thus $s_{2n} \nearrow L$, the least upper bound of the even partial sums. Hence $L$ is a lower bound of the odd partial sums. And
\[ s_{2n-1} \searrow G \ge L, \]
where $G$ is the greatest lower bound of the odd partial sums. And
\[ 0 \le G - L \le s_{2n-1} - s_{2n} = x_{2n} \to 0. \]
Thus $G - L = 0$ and $L = G$. Moreover, since the successive terms of the sequence $s_n$ are on opposite sides of $L$ for all $n$,
\[ |s_n - L| \le |s_{n+1} - s_n| = x_{n+1}. \]
Thus $s_n \to L = s$ and the theorem is proven. •
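The error estimate of Theorem 5.1.2 can be observed on the alternating harmonic series. The sketch below relies on one fact that is not established in the text at this point: the sum of the alternating harmonic series is $\ln 2$. That value is used here only as a reference for the comparison, and the function name is chosen for this illustration.

```python
import math

def alt_harmonic(n):
    """n-th partial sum of 1 - 1/2 + 1/3 - 1/4 + ..."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

LIMIT = math.log(2)   # assumed value of the sum (a standard fact, not proved here)

for n in (1, 2, 10, 11, 1000):
    s_n = alt_harmonic(n)
    # The error never exceeds the first omitted term x_{n+1} = 1/(n+1).
    print(n, s_n, abs(s_n - LIMIT), 1 / (n + 1))
```

Even partial sums fall below the limit and odd ones above it, matching the interlacing used in the proof.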
One of the things we learn from the harmonic and the alternating harmonic series is that a series $\sum_{k=1}^{\infty} x_k$ can converge even though $\sum_{k=1}^{\infty} |x_k|$ fails to converge. We see next why the opposite phenomenon cannot occur.

Definition 5.1.2 A series
\[ \sum_{k=1}^{\infty} x_k \]
is called absolutely convergent if the series
\[ \sum_{k=1}^{\infty} |x_k| \]
converges. In this case the sequence $x$ is called absolutely summable. A series $\sum_{k=1}^{\infty} x_k$ that converges but fails to converge absolutely is called conditionally convergent, or conditionally summable.

Thus the alternating harmonic series is conditionally convergent.
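Theorem 5.1.3 Every absolutely summable sequence $x$ is summable.

Proof: If
\[ s_n = \sum_{k=1}^{n} x_k, \]
it will suffice to show that $s_n$ is a Cauchy sequence. What we know by hypothesis is that
\[ \sigma_n = \sum_{k=1}^{n} |x_k| \]
is Cauchy. Thus if $\epsilon > 0$, there exists $N$ such that $n > m \ge N$ implies $|\sigma_n - \sigma_m| < \epsilon$. However, if $n > m \ge N$, then
\[ |s_n - s_m| = \left| \sum_{k=m+1}^{n} x_k \right| \le \sum_{k=m+1}^{n} |x_k| = |\sigma_n - \sigma_m| < \epsilon. \]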
•

• EXAMPLE 5.3

We present the Geometric Series Test. We call the series
\[ \sum_{k=0}^{\infty} a r^k = a + ar + ar^2 + \cdots + ar^n + \cdots \]
a geometric series with common ratio $r$. Observe that if the constant $a \ne 0$ and $|r| \ge 1$, then $ar^n \not\to 0$ as $n \to \infty$, so $\sum_{k=0}^{\infty} a r^k$ diverges by the nth term test. However, we claim that if $|r| < 1$ then the corresponding geometric series must be absolutely convergent, and thus convergent as well. To prove
\[ \sum_{k=0}^{\infty} a r^k \]
converges provided $|r| < 1$, it suffices to prove
\[ s_n = \sum_{k=0}^{n} a r^k = a + ar + ar^2 + \cdots + ar^n \]
converges. But $s_n - r s_n = a - ar^{n+1}$. Hence
\[ s_n = \frac{a - ar^{n+1}}{1 - r} \to \frac{a}{1 - r} \]
as $n \to \infty$. That is,
\[ \sum_{k=0}^{\infty} a r^k = \frac{a}{1 - r}, \tag{5.1} \]
provided that $|r| < 1$. Applying this with $|a|$ and $|r|$ in place of $a$ and $r$, we see that
\[ \sum_{k=0}^{\infty} |a r^k| \]
converges as well. For example, the series
\[ \sum_{k=1}^{\infty} \left(-\frac{1}{2}\right)^k \]
converges absolutely.
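The identity $s_n - r s_n = a - ar^{n+1}$ and formula (5.1) can be checked directly. The Python sketch below is an illustration with function names chosen here; it sums the terms one by one and compares the result with the closed form and with the limit $a/(1-r)$.

```python
def geometric_partial(a, r, n):
    """s_n = a + a r + ... + a r**n, summed term by term."""
    total, term = 0.0, a
    for _ in range(n + 1):
        total += term
        term *= r
    return total

def closed_form(a, r, n):
    """(a - a r**(n+1)) / (1 - r), obtained from s_n - r s_n = a - a r**(n+1)."""
    return (a - a * r ** (n + 1)) / (1.0 - r)

a, r = 3.0, -0.5
for n in (5, 20, 200):
    # The last column is the limit a / (1 - r) from (5.1).
    print(n, geometric_partial(a, r, n), closed_form(a, r, n), a / (1.0 - r))
```

With $a = 3$ and $r = -1/2$ the limit is $2$, and the partial sums alternate around it while closing in geometrically fast.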
• EXAMPLE 5.4

The geometric series test enables us to resolve the famous problem known as Zeno's paradox. The ancient Greek philosopher Zeno proposed this scenario. The legendary warrior Achilles was challenged to a foot race against a tortoise. For fairness, the tortoise is given a head start of $d$ yards. The claim is that Achilles can never catch the tortoise. Suppose the tortoise runs at $r$ times the speed of Achilles, where $0 < r < 1$. The idea is that if it takes Achilles $t$ seconds to reach the point where the tortoise was at the beginning of the race, then when he reaches that point the tortoise will have advanced to a position $dr$ yards ahead of Achilles' new location. Then it will take $tr$ seconds to reach the new location of the tortoise. However, during that time the tortoise will have advanced to a position $dr^2$ yards ahead of Achilles. In effect, each time Achilles reaches the previous position of the tortoise, the determined quadruped has moved a bit farther ahead. So Achilles can never reach the tortoise. The resolution is that the time required for Achilles to reach the tortoise is the sum of an infinite series:
\[ T = \sum_{n=0}^{\infty} t r^n = t \sum_{n=0}^{\infty} r^n = \frac{t}{1 - r}, \]
which is finite since $|r| < 1$, by the geometric series test. Of course, Zeno realized that the tortoise would lose the race to Achilles. Zeno's point was that in his era it was not yet understood how to deal with limits, such as the sum of an infinite series, with logical rigor.
EXERCISES

5.1 † In Example 5.1, try regrouping $s_{2^n - 1}$ in such a way as to show $s_{2^n - 1} \le n$, for all $n$. If $s_N \ge 100$, how big must $N$ be? Remark: The reader may be surprised at how many terms a computer would have to add to make the partial sums rise to a total of only 100, even though the harmonic series diverges to $\infty$. How many years would your PC have to compute in order to add the required number of terms to make $s_N \ge 100$? What would be the round-off error from so many additions?

5.2 Test for absolute convergence, conditional convergence, or divergence:
a) 2::~ 1 (-1)k+l
b) † $\displaystyle \sum_{k=1}^{\infty} \frac{1}{k(k+1)}$ (Hint: Write
\[ \frac{1}{k(k+1)} = \frac{1}{k} - \frac{1}{k+1} \]
and calculate $s_n$. This is an example of what is called a telescoping series.) c) $\sum_{k=1}^{\infty} a_k$, where if $k$ is a perfect square, if $k$ isn't a perfect square.
if $k = 2j - 1$, if $k = 2j$.

5.3 Using Exercise 5.2.b above as a model, find a formula for $a_k$ such that
\[ s_n = \sum_{k=1}^{n} a_k = \sqrt{n} \]
for all $n$, so that $s_n$ diverges to $\infty$ although $a_n \to 0$ as $n \to \infty$.

5.4 Find a formula for $a_k$ so that $s_n = \sum_{k=1}^{n} a_k = \log n$, for all $n$.

5.5 Give an example of $x_k \to 0$ for which $\sum_{k=1}^{\infty} (-1)^{k+1} x_k$ diverges.

5.6 If $x$ and $y$ are sequences and $c \in \mathbb{R}$, we define $(cx)_k = c x_k$ and $(x + y)_k = x_k + y_k$. Prove: If $x$ and $y$ are summable, then
a) $x + y$ is summable and
\[ \sum_{k=1}^{\infty} (x_k + y_k) = \sum_{k=1}^{\infty} x_k + \sum_{k=1}^{\infty} y_k. \]
b) $cx$ is summable and
\[ \sum_{k=1}^{\infty} c x_k = c \sum_{k=1}^{\infty} x_k. \]
We note that this exercise shows that the family of summable sequences is a vector space.
5.7 A function $f : \mathbb{R} \to \mathbb{R}$ is called a contraction of $\mathbb{R}$ if and only if there exists a constant $r \in [0, 1)$ such that for all $x$ and $x'$ in $\mathbb{R}$ we have
\[ |f(x) - f(x')| \le r |x - x'|. \]
Let $f$ be a contraction of $\mathbb{R}$, with corresponding constant $r$.
a) Prove that $f$ is uniformly continuous on $\mathbb{R}$.
b) Let $x_0 \in \mathbb{R}$ be arbitrary and define a sequence $x_n$ by $x_n = f(x_{n-1})$, for each $n \in \mathbb{N}$. Show that $|x_{n+1} - x_n| \le r^n |x_1 - x_0|$.
c) Prove that the sequence $x_n$ in 5.7.b is a Cauchy sequence. (Hint: Exercise 1.25 may be helpful.)
d) Let $p = \lim_{n \to \infty} x_n$, and prove that $f(p) = p$. (A point $q$ is called a fixed point of $f$ if and only if $f(q) = q$. You have just shown that every contraction of $\mathbb{R}$ has a fixed point.)
e) If $p$ and $q$ are both fixed points of $f$, prove that $p = q$.
f) Let $f : \mathbb{R} \to \mathbb{R}$ be a differentiable function such that $\|f'\|_{\sup} = r < 1$. Prove that $f$ is a contraction of $\mathbb{R}$.

5.2 CONVERGENCE TESTS FOR POSITIVE TERM SERIES
Since every absolutely summable sequence is summable, it is desirable to have tests designed to determine whether or not a series of exclusively nonnegative terms converges. We will see that absolutely convergent series play a very important role in the applications of infinite series. Throughout the present section, all terms of infinite series will be nonnegative unless otherwise noted. We observe first that if $x_k \ge 0$ for all $k \in \mathbb{N}$, then
\[ s_n = \sum_{k=1}^{n} x_k \]
is a monotone increasing sequence: $s_n$ is increasing as $n$ increases strictly. Thus we know that
\[ s_n \to \sup\{ s_m \mid m \in \mathbb{N} \}. \]
It follows that $s_n$ is convergent if and only if $\sup\{ s_m \mid m \in \mathbb{N} \} < \infty$. This observation leads us to the following useful test.

Theorem 5.2.1 (Comparison Test) Suppose there exists $K \in \mathbb{N}$ such that
\[ 0 \le x_k \le y_k \]
for all $k \ge K$. (In words, suppose $y_k$ eventually dominates $x_k$.) Then we have the following conclusions.

i. If $\sum_{k=1}^{\infty} y_k$ converges, then $\sum_{k=1}^{\infty} x_k$ converges.

ii. If $\sum_{k=1}^{\infty} x_k$ diverges, then $\sum_{k=1}^{\infty} y_k$ diverges.

Proof: Let $s_n = \sum_{k=1}^{n} x_k$ and $\sigma_n = \sum_{k=1}^{n} y_k$. Let $B = \sum_{k=1}^{K-1} |x_k - y_k|$.

i. We know that $\sigma_n \nearrow \sigma$ for some $\sigma \in \mathbb{R}$. But
\[ s_n \le \sigma_n + B \le \sigma + B \]
for all $n \in \mathbb{N}$. Since the sequence $s_n$ is bounded above by $\sigma + B \in \mathbb{R}$, $s_n$ converges to its least upper bound, and so $\sum_{k=1}^{\infty} x_k$ converges.

ii. This part is a consequence of the previous part, being its contrapositive. In particular, if $\sum_{k=1}^{\infty} y_k$ did not diverge then it would converge. Then part (i) would imply that $\sum_{k=1}^{\infty} x_k$ converges, which contradicts the hypothesis.

•
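The mechanism of the Comparison Test can be watched on a concrete pair of series. In the Python sketch below, the series $x_k = 1/(k\,2^k)$ and the dominating geometric series $y_k = 1/2^k$ are examples chosen for this illustration, with $K = 1$ so that $B = 0$; by formula (5.1) the dominating series has sum $1$.

```python
def partial_sums(term, n):
    """Return the list [s_1, ..., s_n] of partial sums of term(1) + term(2) + ..."""
    total, sums = 0.0, []
    for k in range(1, n + 1):
        total += term(k)
        sums.append(total)
    return sums

x = lambda k: 1.0 / (k * 2 ** k)   # the series being tested
y = lambda k: 1.0 / 2 ** k         # dominating geometric series, with sum 1

s = partial_sums(x, 50)
sigma = partial_sums(y, 50)

for n in (1, 5, 10, 50):
    # s_n is increasing and trapped below sigma_n, which is itself at most 1.
    print(n, s[n - 1], sigma[n - 1])
```

An increasing sequence that is bounded above must converge, which is exactly how part (i) of the theorem concludes.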
For the next theorem, recall from elementary calculus that if $f \in \mathcal{R}[1, b]$ for all $b \ge 1$, we define the improper integral
\[ \int_1^{\infty} f(x)\,dx = \lim_{b \to \infty} \int_1^b f(x)\,dx \]
if this limit exists. The infinite length of the interval on which the integral takes place is responsible for the term improper integral.

Theorem 5.2.2 (Integral Test) Suppose $f$ is a monotone decreasing function approaching $0$ on $[1, \infty)$: that is, $f \searrow 0$ on $[1, \infty)$. Let $a_k = f(k)$, for all $k \in \mathbb{N}$. Then $\sum_{k=1}^{\infty} a_k$ converges if and only if
\[ \int_1^{\infty} f(x)\,dx < \infty. \]

Remark 5.2.1 Since $f(x) \ge 0$ for all $x \ge 1$ in this theorem, $\int_1^b f(x)\,dx$ is an increasing function of $b$. Thus the condition $\int_1^{\infty} f(x)\,dx < \infty$ means the same thing as the statement that
\[ \int_1^{\infty} f(x)\,dx = \lim_{b \to \infty} \int_1^b f(x)\,dx \]
exists in this case.

Proof:
i. First we prove the implication from right to left, which is the if part. Because $f$ is decreasing, we have
\[ a_2 \le \int_1^2 f(x)\,dx, \quad a_3 \le \int_2^3 f(x)\,dx, \quad \ldots, \quad a_n \le \int_{n-1}^{n} f(x)\,dx. \]
Thus
\[ s_n = \sum_{k=1}^{n} a_k \le a_1 + \int_1^n f(x)\,dx \le a_1 + \int_1^{\infty} f(x)\,dx < \infty \]
for all $n \in \mathbb{N}$. Thus $s_n$ converges.

ii. Now we prove the implication from left to right, which is the only if part. Thus suppose $\sum_{k=1}^{\infty} a_k < \infty$. Observe for all $k$ that $a_k \ge \int_k^{k+1} f(x)\,dx$. Thus
\[ \int_1^n f(x)\,dx \le \sum_{k=1}^{n-1} a_k \le \sum_{k=1}^{\infty} a_k < \infty \]
for all $n \in \mathbb{N}$. Thus $\int_1^b f(x)\,dx$ is an increasing function of $b$ that is bounded above. Hence
\[ \lim_{b \to \infty} \int_1^b f(x)\,dx = \int_1^{\infty} f(x)\,dx \]
exists.
•

Theorem 5.2.3 (Ratio Test) Suppose $x_k > 0$ for all $k$ and suppose
\[ \frac{x_{k+1}}{x_k} \to L \]
as $k \to \infty$. Then we have the following conclusions.

i. If $L > 1$, $\sum_{k=1}^{\infty} x_k$ diverges.

ii. If $L < 1$, $\sum_{k=1}^{\infty} x_k$ converges.

iii. If $L = 1$, the test fails to determine convergence or divergence.
Proof:

i. If $L > 1$, then there exists $K \in \mathbb{N}$ such that $k > K$ implies $\frac{x_{k+1}}{x_k} > 1$, which implies $0 < x_k < x_{k+1}$, and $x_k \not\to 0$. Thus $\sum_{k=1}^{\infty} x_k$ fails to converge because of the nth term test.

ii. If $L < 1$, fix any number $r$ such that $L < r < 1$. Then there exists $K \in \mathbb{N}$ such that $k \ge K$ implies $\frac{x_{k+1}}{x_k} < r$, which implies $x_{k+1} < x_k r$. It follows that for all $j \in \mathbb{N}$ we have $x_{K+j} < x_K r^j$. Thus the convergent geometric series
\[ \sum_{j=1}^{\infty} x_K r^j \quad \text{dominates} \quad \sum_{j=1}^{\infty} x_{K+j}, \]
which must also converge. Hence the partial sums of $\sum_{k=1}^{\infty} x_k$ are all bounded above by
\[ \sum_{k=1}^{K} x_k + \sum_{j=1}^{\infty} x_{K+j}, \]
so that $\sum_{k=1}^{\infty} x_k$ converges.

iii. See Exercise 5.14.

•

EXERCISES
5.8 † Prove the Limit Comparison Test: Suppose $x_k \ge 0$ and $y_k > 0$ for all $k \in \mathbb{N}$. Suppose $\frac{x_k}{y_k} \to L \in \mathbb{R}$.
a) If $L > 0$ then $x$ is summable if and only if $y$ is summable. (Hint: Apply the comparison test.)
b) If $L \ge 0$ and if $y$ is summable, then $x$ is summable as well.

5.9
t Use the result of Exercise 5.8 above to test for convergence: a) b) c)
2k L.....k=l 2k2-1 "'00 k-1 L.....k=l k2" "'00
"'oo I L.....k=l 2k-1
d) I:~~
1 2k
5.10 † Now suppose $x$ and $y$ are sequences that need not have entirely nonnegative terms. If $x$ is absolutely summable and if $y$ is bounded, prove that $xy$ is absolutely summable, where we define the product $xy$ by $(xy)_k = x_k y_k$ for all $k \in \mathbb{N}$, so that $\sum_{k=1}^{\infty} x_k y_k$ resembles an inner product of two vectors.

5.11 † Apply the integral test to prove:
a) If $p > 1$, then $\sum_{k=1}^{\infty} \frac{1}{k^p}$ is convergent.
b) If $0 \le p \le 1$, then $\sum_{k=1}^{\infty} \frac{1}{k^p}$ is divergent.

5.12 Test for convergence or divergence:
a)
E%:1 ,1
E%"=2 ( k,;g k) E%': 1 krk- 1 , where \r\ < 1.
"00
5.13
5k 3
c) wk=1 k5+1
b)
5.14 Prove the final part of Theorem 5.2.3 by finding two sequences, $x_k$ and $y_k$, such that $\frac{x_{k+1}}{x_k} \to 1$ and $\frac{y_{k+1}}{y_k} \to 1$, but $\sum_{k=1}^{\infty} x_k$ converges whereas $\sum_{k=1}^{\infty} y_k$ diverges. Prove your claims. (Hint: Consider Exercise 5.11.)
5.15 Prove that the series $\sum_{k=0}^{\infty} \frac{x^k}{k!}$ is absolutely convergent for all $x \in \mathbb{R}$. Note that $0! = 1$ by definition.
5.16 Prove the nth Root Test: Suppose $x_k \ge 0$ for all $k \in \mathbb{N}$, and suppose $\sqrt[k]{x_k} \to L$ as $k \to \infty$. Then we have the following conclusions.
a) If $L > 1$, $\sum_{k=1}^{\infty} x_k$ diverges.
b) If $L < 1$, $\sum_{k=1}^{\infty} x_k$ converges.
c) If $L = 1$, then the test fails.
Hint: Try to do for kth roots what was done for ratios in the proof of Theorem 5.2.3.
5.17 † Let $x_k \ge 0$ for all $k \in \mathbb{N}$. Note that $\lim_{k\to\infty} \sqrt[k]{x_k}$ need not exist.
a) If $\limsup \sqrt[k]{x_k} < 1$, prove that the sequence $x$ is summable.
b) If $\limsup \sqrt[k]{x_k} > 1$, prove that $\sum_{k=1}^{\infty} x_k = \infty$.
5.18 In Theorem 5.2.3, $\lim \frac{x_{k+1}}{x_k}$ need not exist. Let $S = \limsup \frac{x_{k+1}}{x_k}$.
a) Prove: If $S < 1$, then the positive-term series $\sum_{k=1}^{\infty} x_k$ converges.
b) Give an example in which $S > 1$ yet the series $\sum_{k=1}^{\infty} x_k$ converges.
c) Make up and prove a valid lim inf version of the ratio test.
5.19
Test for convergence or divergence: "00
k'
Ii< b) E%':o~ a)
C)
wk=O
E%':2 (Ia;k)k
5.3 ABSOLUTE CONVERGENCE AND PRODUCTS OF SERIES
In adding lists of finitely many real numbers, we know that the numbers may be added in any order we find convenient: the result will be independent of order. With infinite series, however, this is actually false. The following definition and example show what can go wrong.
Definition 5.3.1 A series $\sum_{j=1}^{\infty} y_j$ is called a rearrangement of the series $\sum_{k=1}^{\infty} x_k$ provided every term $y_j$ appears among the $x_k$'s and each $x_k$ appears exactly once among the $y_j$'s. Thus the only difference between the sequence $y$ and the sequence $x$ is that the terms appear in a possibly different order. Note that for $y$ to be a rearrangement of the sequence $x$, $y$ must be a single sequence containing each term of $x$ exactly one time.
•
EXAMPLE 5.5
Consider the alternating harmonic series
$$\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots,$$
which converges by the alternating series test, but is only conditionally convergent since the harmonic series diverges. In Exercise 5.9 the reader has shown that the series of positive terms is divergent; that is,
$$\sum_{k=1}^{\infty} \frac{1}{2k-1} = 1 + \frac{1}{3} + \frac{1}{5} + \cdots = \infty,$$
and that the series of negative terms diverges as well:
$$\sum_{k=1}^{\infty} \frac{-1}{2k} = -\frac{1}{2} - \frac{1}{4} - \frac{1}{6} - \cdots = -\infty.$$
We will show that we can rearrange the terms of the alternating harmonic series in such a way that the new series diverges. We will construct the rearranged series as follows. Begin by selecting enough of the positive terms so that their sum exceeds 2. Then follow these terms with the first negative term of the alternating harmonic series, namely $-\frac{1}{2}$. After this negative term, add several more positive terms, using enough to increase the total sum thus far by at least 2. We can do this, since the sum of the positive terms is infinite. Follow this with the second negative term: $-\frac{1}{4}$. We see that after the first negative term is included, the partial sum to this point is $> 1$. After the second negative term is included, the partial sum to that point is $> 2$. And so on. The partial sums diverge to infinity.
Because of this example, we regard conditionally convergent series as inherently unstable, in the sense that their convergence depends critically on the order in which the terms appear: the convergence relies on cancellations between positive and negative terms. The next theorem shows that absolutely convergent series do not exhibit this instability.
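The construction in Example 5.5 can be tried numerically. The following is a minimal sketch, not part of the original text: the function name `divergent_rearrangement` and the choice of three blocks are illustrative assumptions, and floating-point sums stand in for exact arithmetic.

```python
def divergent_rearrangement(num_blocks):
    """Greedy rearrangement of the alternating harmonic series (Example 5.5).

    Returns the partial sums recorded right after each negative term.
    """
    total = 0.0
    odd = 1    # denominator of the next unused positive term 1/odd
    even = 2   # denominator of the next unused negative term -1/even
    checkpoints = []
    for _ in range(num_blocks):
        gained = 0.0
        # Add unused positive terms until this block contributes more than 2.
        while gained <= 2.0:
            gained += 1.0 / odd
            odd += 2
        total += gained
        # Follow the block with the next unused negative term.
        total -= 1.0 / even
        even += 2
        checkpoints.append(total)
    return checkpoints


checkpoints = divergent_rearrangement(3)
for n, s in enumerate(checkpoints, start=1):
    print(f"after negative term {n}: partial sum = {s:.4f}")
```

Each block of positive terms is much longer than the one before it, which is why only a few blocks are computed here; the n-th recorded partial sum exceeds n, as claimed in the example.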
Theorem 5.3.1 (Dirichlet) Suppose $\sum_{k=1}^{\infty} x_k$ is any absolutely convergent series, and suppose $\sum_{k=1}^{\infty} y_k$ is any rearrangement of this series. Then $\sum_{k=1}^{\infty} y_k$ is also absolutely convergent, and
$$\sum_{k=1}^{\infty} y_k = \sum_{k=1}^{\infty} x_k.$$
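Proof: Define
$$x_k^+ = \begin{cases} x_k & \text{if } x_k \ge 0, \\ 0 & \text{if } x_k < 0 \end{cases} \qquad\text{and}\qquad x_k^- = \begin{cases} -x_k & \text{if } x_k \le 0, \\ 0 & \text{if } x_k > 0. \end{cases}$$
Observe that $x_k = x_k^+ - x_k^-$ and $|x_k| = x_k^+ + x_k^-$. Note also that $0 \le x_k^+ \le |x_k|$ and $0 \le x_k^- \le |x_k|$. Define corresponding terminology for the sequence $y$. We see that
$$\sum_{k=1}^{n} x_k^+ \le \sum_{k=1}^{n} |x_k| \le \sum_{k=1}^{\infty} |x_k| < \infty,$$
so that
$$\sum_{k=1}^{\infty} x_k^+ = P.$$
Similarly,
$$\sum_{k=1}^{\infty} x_k^- = Q.$$
Now,
$$s_n = \sum_{k=1}^{n} x_k = \sum_{k=1}^{n} (x_k^+ - x_k^-) = \sum_{k=1}^{n} x_k^+ - \sum_{k=1}^{n} x_k^- \to P - Q.$$
Thus the absolute summability of $x$ implies $\sum_{k=1}^{\infty} x_k = P - Q$. Now we consider $y_k^+$ and $y_k^-$, similarly defined. Since each $y_k^+$ is equal to some $x_j^+$, given $n$ there exists $N$ such that
$$\sum_{k=1}^{n} y_k^+ \le \sum_{k=1}^{N} x_k^+ \le P.$$
Thus $\sum_{k=1}^{\infty} y_k^+ = P' \le P < \infty$. Similarly, $\sum_{k=1}^{\infty} y_k^- = Q' \le Q < \infty$. Now,
$$\sum_{k=1}^{n} |y_k| = \sum_{k=1}^{n} y_k^+ + \sum_{k=1}^{n} y_k^- \le P' + Q'$$
for all $n$, so $y$ is absolutely summable, as claimed. Moreover, we see as for $x$ that
$$\sum_{k=1}^{\infty} y_k = P' - Q'.$$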
To show that the sum of the $y_k$'s is identical to that for the $x_k$'s it would suffice to show that $P' = P$ and $Q' = Q$. What we know is that because $y$ is a rearrangement of $x$, $P' \le P$ and $Q' \le Q$. However, $x$ is also a rearrangement of $y$. Thus $P \le P'$ and $Q \le Q'$, so that $P = P'$ and $Q = Q'$, as desired. •
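The decomposition used in the proof can be checked on a concrete series. The sketch below is an illustration only, assuming the example series $x_k = (-1)^{k+1}/2^k$ (not taken from the text), for which $P = 2/3$, $Q = 1/3$, and the sum is $1/3$. The rearrangement pattern "two positive terms, then one negative term" is likewise an arbitrary choice.

```python
from itertools import count, islice


def x(k):
    """Illustrative absolutely summable sequence: x_k = (-1)**(k+1) / 2**k."""
    return (-1) ** (k + 1) / 2 ** k


def rearranged():
    """Rearrangement: two positive terms (odd k), then one negative term (even k)."""
    odd = count(1, 2)
    even = count(2, 2)
    while True:
        yield x(next(odd))
        yield x(next(odd))
        yield x(next(even))


n = 60
P = sum(max(x(k), 0.0) for k in range(1, n + 1))    # partial sum of x_k^+
Q = sum(max(-x(k), 0.0) for k in range(1, n + 1))   # partial sum of x_k^-
original_sum = sum(x(k) for k in range(1, n + 1))
rearranged_sum = sum(islice(rearranged(), n))

print(P, Q, original_sum, rearranged_sum)
```

The first 60 terms of the rearranged series are not the same 60 terms as in the original order, yet both partial sums agree with $P - Q$ to within rounding, as Theorem 5.3.1 predicts for the limits.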
The Cauchy Product
Sometimes, as in Exercise 5.20, Theorem 5.3.1 helps us find the sum of an absolutely convergent series. But the primary importance of this theorem is that there are applications of series that create a need for order-independent sums, as guaranteed by this theorem for absolutely convergent series. Later in this chapter, we will study the expansion of functions as sums of infinite series of functions, for example, power series expansions of the form
$$f(x) = \sum_{k=0}^{\infty} c_k x^k.$$
If $f$ and $g$ are two functions with such expansions, we would like to know how to multiply two such infinite series together in such a way that the product will equal $f(x)g(x)$. In Exercise 5.10 we encountered the so-called inner product of two sequences $x$ and $y$, for which $(xy)_k = x_k y_k$. But $\sum_{k=1}^{\infty} x_k y_k$ does not usually equal $\left(\sum_{k=1}^{\infty} x_k\right)\left(\sum_{k=1}^{\infty} y_k\right)$. For example, if
$$x_k = \frac{(-1)^{k+1}}{\sqrt{k}} = y_k,$$
then $x$ and $y$ are conditionally summable, but
$$\sum_{k=1}^{\infty} x_k y_k = \infty,$$
which is divergent. Thus the sum of the inner product could not be the product of the two convergent sums. Even for absolutely convergent series, this problem is apparent. For example, if
$$x_k = \frac{1}{2^k} = y_k,$$
then
$$\sum_{k=0}^{\infty} x_k = \sum_{k=0}^{\infty} y_k = 2.$$
In this case
$$\sum_{k=0}^{\infty} x_k y_k = \sum_{k=0}^{\infty} \frac{1}{4^k} = \frac{4}{3},$$
but the product of the sums is 4. If we seek a method of multiplying infinite series that will have the product of the sums equal to the sum of the product series, then we need to take our cue from the distributive law for finite arithmetic processes. Thus
$$\left(\sum_{j=1}^{\infty} x_j\right)\left(\sum_{k=1}^{\infty} y_k\right)$$
needs to be the sum of all products of the form $x_j y_k$ where $j$ and $k$ vary independently in $\mathbb{N}$. These individual products, which will be the summands of the new series, can be conveniently arranged in an infinite rectangular array as shown below.
$$\begin{array}{cccc} x_1y_1 & x_1y_2 & x_1y_3 & \cdots \\ x_2y_1 & x_2y_2 & x_2y_3 & \cdots \\ x_3y_1 & x_3y_2 & x_3y_3 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{array}$$
In Section 1.8 we saw that such a family as this, arranged in an infinite table, is a countable set and can be organized in many different ways into a sequence. One easy pattern for doing this is to begin with
$$x_1y_1,\ x_1y_2,\ x_2y_1,\ x_1y_3,\ x_2y_2,\ x_3y_1,\ \ldots.$$
Here we are proceeding systematically along a family of diagonals on which we have $x_j y_k$ with $j + k = 2$, then $j + k = 3$, then $j + k = 4$, etc. But there are many other patterns that work, and we want to make sure that the sum of the resulting series is independent of the order in which the terms have been lined up in a sequence. This means we will need to make sure we have an absolutely convergent series in at least one order, which will then mean every rearrangement is also absolutely convergent and has the same sum. If this is established, then we can write
$$\sum_{j,k \in \mathbb{N}} x_j y_k,$$
where this notation carries the meaning that the sum will be independent of the ordering of the terms into a sequence.
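The diagonal pattern just described can be explored numerically with the geometric example $x_k = y_k = 1/2^k$ indexed from $k = 0$. The sketch below is illustrative only: the helper name `cauchy_product` and the truncation to 60 terms are assumptions made for the example. It groups the products $x_j y_k$ by diagonal and compares the result with the termwise product and with the product of the two sums.

```python
def cauchy_product(x, y):
    """Diagonal sums c_l = sum of x[j] * y[l - j] over j = 0..l (0-indexed)."""
    n = min(len(x), len(y))
    return [sum(x[j] * y[l - j] for j in range(l + 1)) for l in range(n)]


# Truncated geometric sequence x_k = 1/2**k, k = 0, 1, ..., 59.
x = [0.5 ** k for k in range(60)]

termwise = sum(a * b for a, b in zip(x, x))   # sum of x_k * y_k
c = cauchy_product(x, x)                      # c_l = (l + 1) / 2**l
diagonal_total = sum(c)
product_of_sums = sum(x) * sum(x)

print(termwise, diagonal_total, product_of_sums)
```

The termwise sum stays near 4/3, while the diagonal grouping reproduces the product of the sums, 4, up to the truncation error.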
Theorem 5.3.2 If $x$ and $y$ are both absolutely summable sequences, then the countable family of terms of the form $x_j y_k$ is absolutely summable in every order. Moreover,
$$\sum_{j,k \in \mathbb{N}} x_j y_k = \left(\sum_{j=1}^{\infty} x_j\right)\left(\sum_{k=1}^{\infty} y_k\right).$$
Proof: Fix temporarily a sequence $z$ that is comprised precisely of the terms of the form $x_j y_k$ to be summed. Let $s_n = \sum_{i=1}^{n} z_i$ and let $N$ denote the largest index $j$ or $k$ that appears among the terms $x_j y_k$ corresponding to $z_1, z_2, \ldots, z_n$. Clearly,
$$\sum_{i=1}^{n} |z_i| \le \left(\sum_{j=1}^{N} |x_j|\right)\left(\sum_{k=1}^{N} |y_k|\right) \le \left(\sum_{j=1}^{\infty} |x_j|\right)\left(\sum_{k=1}^{\infty} |y_k|\right) < \infty$$
for all $n \in \mathbb{N}$. Thus $z$ is absolutely summable. By Theorem 5.3.1, the terms $x_j y_k$ are absolutely summable and have the same sum independent of how they are ordered into a sequence. That is, $\sum_{j,k=1}^{\infty} x_j y_k = \lim_{n\to\infty} s_n = s$. It remains only to show that $s$ has the value claimed in the theorem. Since the reasoning thus far in the present proof was independent of the choice of $z$, let us pick $z$ now according to convenience as follows. We will follow a pattern of expanding squares in the upper left-hand corner:
$$x_1y_1;\quad x_1y_2,\ x_2y_2,\ x_2y_1;\quad x_1y_3,\ x_2y_3,\ x_3y_3,\ x_3y_2,\ x_3y_1;\ \ldots$$
where we have listed explicitly the terms from the squares that are 1 by 1, 2 by 2, and 3 by 3. So $s_n \to s$. Hence $s_{n^2} \to s$ as well, since this is a subsequence of the convergent sequence $s_n$. However,
$$s_{n^2} = \left(\sum_{j=1}^{n} x_j\right)\left(\sum_{k=1}^{n} y_k\right) \to \left(\sum_{j=1}^{\infty} x_j\right)\left(\sum_{k=1}^{\infty} y_k\right)$$
as $n \to \infty$. Thus $s$ has the value claimed in the theorem. •

• EXAMPLE 5.6
There is a special form of the product of two sequences, called the Cauchy Product, that we will use in the study of power series. Consider the following ordering for the sequence $z$ used in the proof of Theorem 5.3.2:
$$x_1y_1,\ x_1y_2,\ x_2y_1,\ x_1y_3,\ x_2y_2,\ x_3y_1,\ \ldots.$$
Here we are proceeding systematically along the family of diagonals on which we have the terms $x_j y_k$ with $j + k = 2$, $j + k = 3$, $j + k = 4$, and in general $j + k = l$. Thus
$$c_l = \sum_{j+k=l} x_j y_k,$$
the sum of the terms on the $l$th diagonal. Then the sums $\sum_{l=2}^{n} c_l$ form a subsequence of the sequence of partial sums of $z$. Thus
$$\sum_{l=2}^{\infty} c_l = \left(\sum_{j=1}^{\infty} x_j\right)\left(\sum_{k=1}^{\infty} y_k\right).$$
The absolutely summable sequence $c$ is called the Cauchy Product of the sequences $x$ and $y$. Thus the sum of the Cauchy Product sequence is the product of the sum of $x$ with the sum of $y$. Of course, the same idea works with sequences $x$ and $y$ indexed by integers starting with 0 instead of 1. Then $c_l$ is indexed by $l \ge 0$.
Another important application of absolute convergence appears in the following theorem about so-called double summations.
Theorem 5.3.3 If the countable set
$$\{a_{j,k} \mid (j,k) \in \mathbb{N}^2\}$$
is absolutely summable and if
$$c_k = \sum_{j=1}^{\infty} a_{j,k},$$
then $c_k$ is also absolutely summable, and
$$\sum_{k=1}^{\infty} c_k = \sum_{j,k} a_{j,k}.$$
Remark 5.3.1 Here $c_k$ denotes the sum of all the entries $a_{j,k}$ in the $k$th column of the infinite matrix or array indexed by $j$ for the rows and $k$ for the columns. Thus this theorem asserts that if the double sum is absolutely convergent, then the sum of the individual column sums is absolutely convergent and has the same sum as that of the original double sum. At first this might appear to be a special case of the rearrangements addressed in Dirichlet's theorem. This is not the case. Each term $c_k$ represents the sum of an infinite series found in one entire column of the infinite square array $a_{j,k}$. Dirichlet's theorem addresses the rearrangement of one infinite series into one (differently ordered) infinite series, not into a series of infinitely many different subseries of the original series.
Proof: Let $S = \sum_{j,k} a_{j,k}$ and $P = \sum_{j,k} |a_{j,k}|$. We observe that for each $k$ we have $|c_k| \le \sum_{j=1}^{\infty} |a_{j,k}|$ and for each $K \in \mathbb{N}$ we have
$$\sum_{k=1}^{K} |c_k| \le P.$$
(See Exercise 5.28.a.) Thus $c_k$ is absolutely summable. It remains to show that
$$\sum_{k=1}^{\infty} c_k = \sum_{j,k} a_{j,k}.$$
Let $S' = \sum_{k=1}^{\infty} c_k$. We must show $S' = S$. Let $\epsilon > 0$ be arbitrary. There exists $K_1$ such that $K \ge K_1$ implies
$$\left| S' - \sum_{k=1}^{K} c_k \right| < \frac{\epsilon}{2}.$$
Also, there exists $K_2$ such that $K \ge K_2$ implies
$$\sum_{j \in \mathbb{N},\, k > K_2} |a_{j,k}| < \frac{\epsilon}{2},$$
so that
$$\left| S - \sum_{j \in \mathbb{N},\, k \le K} a_{j,k} \right| < \frac{\epsilon}{2}.$$
(See Exercises 5.28.b and 5.28.c.) Selecting $K = \max(K_1, K_2)$ we see by the triangle inequality that $|S - S'| < \epsilon$ for all $\epsilon$. Hence $S = S'$. •
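A small numerical sketch may help fix the meaning of Theorem 5.3.3. It assumes the illustrative array $a_{j,k} = 1/(2^j 3^k)$ (not from the text), whose double sum is $1 \cdot \frac{1}{2} = \frac{1}{2}$, and truncates every sum at a finite index $N$; the names `a`, `by_columns`, and `by_diagonals` are hypothetical.

```python
def a(j, k):
    """Illustrative absolutely summable array: a_{j,k} = 1 / (2**j * 3**k)."""
    return 1.0 / (2 ** j * 3 ** k)


N = 40  # truncation index; the neglected tails are far below rounding error

# Column sums c_k = sum over j of a_{j,k}, then the sum of the column sums.
column_sums = [sum(a(j, k) for j in range(1, N + 1)) for k in range(1, N + 1)]
by_columns = sum(column_sums)

# The same family summed as a single sequence, ordered along diagonals j + k = l.
by_diagonals = sum(a(j, l - j) for l in range(2, 2 * N + 1) for j in range(1, l))

print(by_columns, by_diagonals)
```

Both totals agree with 1/2 to within rounding: the sum of the column sums matches the sum of the family taken in one particular sequential order.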
Law of the Unconscious Statistician
Absolute summability is particularly useful in the study of probability. A fundamental object of study in probability is a random variable. A random variable $X$ is called discrete if it has at most countably many values, so that the range $S_X = \{x_n \mid n \in \mathbb{N}\}$. To each value $x_n$ of $X$ there is assigned a probability $P(X = x_n) \in [0, 1]$, and $\sum_{n=1}^{\infty} P(X = x_n) = 1$.
Definition 5.3.2 If $X$ is a discrete random variable with range $S_X$, the expectation
$$\mathcal{E}(X) = \sum_{x \in S_X} x P(X = x),$$
provided this series is absolutely convergent.
The requirement of absolute convergence is very important in the definition of expectation, though it is commonly omitted from elementary textbooks about probability. Consider for example a discrete random variable $X$ for which $S_X = \mathbb{Q}$, the set of all rational numbers. There is no one preferred or most natural way in which to order $\mathbb{Q}$ into a sequence. If we did not require absolute convergence in the definition of expectation, then the value of $\mathcal{E}(X)$ could be any real number, or even $\pm\infty$, depending on how we ordered $S_X$ in the sum. Thus expectation would fail to be well-defined.
An important theorem in elementary probability pertains to a function $g$ defined on the range $S_X$ of a discrete random variable $X$. If we let $Y = g(X)$, then $Y$ is again a random variable and is necessarily discrete since the image under $g$ of a countable set is at most countable itself. Thus $S_Y$ is necessarily countable at most. It is a fundamental property of the probability function $P$ that
$$P(Y = y) = \sum_{x \in g^{-1}(y)} P(X = x).$$
Here, $g^{-1}(y) = \{x \in S_X \mid g(x) = y\}$. By the definition of expectation we know that
$$\mathcal{E}(Y) = \sum_{y \in S_Y} y P(Y = y).$$
The following important theorem is sometimes called the Law of the Unconscious Statistician because it is frequently not recognized as a theorem that requires proof in elementary texts.
Theorem 5.3.4 (Law of the Unconscious Statistician) Let $X$ be a discrete random variable and let $Y = g(X)$ be another discrete random variable. If the sum
$$\sum_{x \in S_X} g(x) P(X = x)$$
is absolutely convergent, then $\mathcal{E}(Y)$ exists and equals the latter sum.
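For a finite range the statement can be checked directly. The sketch below is an illustration under assumed data: $X$ uniform on $\{-2, -1, 0, 1, 2\}$ and $g(x) = x^2$ are choices made here, not taken from the text. It computes the expectation of $Y = g(X)$ once from the distribution of $Y$ and once from the sum over the range of $X$.

```python
from collections import defaultdict
from fractions import Fraction

# Assumed example: X is uniform on {-2, -1, 0, 1, 2}.
pmf_x = {x: Fraction(1, 5) for x in (-2, -1, 0, 1, 2)}


def g(x):
    return x * x


# P(Y = y) is the sum of P(X = x) over the preimage of y under g.
pmf_y = defaultdict(Fraction)
for x, p in pmf_x.items():
    pmf_y[g(x)] += p

expectation_from_y = sum(y * p for y, p in pmf_y.items())      # definition of E(Y)
expectation_lotus = sum(g(x) * p for x, p in pmf_x.items())    # sum over S_X

print(expectation_from_y, expectation_lotus)
```

Exact rational arithmetic is used so that the two values can be compared for equality; with a finite range no convergence question arises, which is exactly what the absolute convergence hypothesis takes care of in the countable case.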
Proof: We will apply Theorem 5.3.3. Let $S_Y = \{y_k \mid k \in \mathbb{N}\}$. We can denote $g^{-1}(y_k) = \{x_{j,k}\}$, which is indexed by $j$ and is a countable set at most. We can write
$$\sum_{x \in S_X} g(x) P(X = x) = \sum_{(j,k) \in \mathbb{N}^2} a_{j,k},$$
where
$$a_{j,k} = g(x_{j,k}) P(X = x_{j,k})$$
for all $(j, k)$ if $g^{-1}(y_k)$ is countable. If, however, $g^{-1}(y_k)$ is finite, we use the lower values of $j$ to index the actual members of $g^{-1}(y_k)$ and we enter 0 for $a_{j,k}$ for all larger values of $j$. Thus for each $y_k \in S_Y$ we have the corresponding column sum
$$c_k = \sum_{j=1}^{\infty} a_{j,k} = y_k P(Y = y_k).$$
Since the double sum is absolutely convergent by hypothesis, Theorem 5.3.3 tells us that $\mathcal{E}(Y) = \sum_k c_k$ exists and that it is given by the following absolutely convergent sum:
$$\mathcal{E}(Y) = \sum_{x \in S_X} g(x) P(X = x).$$
•
EXERCISES
5.20 † Use the formula for the sum of an absolutely convergent series given in the proof of Theorem 5.3.1 to find the sum of the series
$$\frac{1}{3} - \frac{1}{4} + \frac{1}{9} - \frac{1}{16} + - \cdots + \frac{1}{3^k} - \frac{1}{4^k} + - \cdots.$$
5.21 If $c$ is a summable sequence and if $x \in [0, 1]$, prove that $\sum_{k=0}^{\infty} c_k x^k$ converges.
5.22 Suppose $|f'(x)|$ exists and is less than 1 for all $x \in (0, 1]$.
a) Prove: $\lim_{n\to\infty} f\left(\frac{1}{n}\right)$ exists and equals some number $L \in \mathbb{R}$. Hint: Show that the series
is absolutely summable.
b) Prove that $\lim_{x \to 0+} f(x)$ exists and equals $L$.
5.23 † If $\sum_{k=1}^{\infty} x_k$ is conditionally convergent, prove that
$$\sum_{k=1}^{\infty} x_k^+ = \infty = \sum_{k=1}^{\infty} x_k^-.$$
Here $x_k^+$ and $x_k^-$ are defined as in the proof of Theorem 5.3.1.
5.24 Use the result of Exercise 5.23 above to show that every conditionally convergent series can be rearranged so that it diverges to $-\infty$.
5.25 If $x$ is conditionally summable and $y$ is a sequence that is not identically 0, prove that the countable family of terms $x_j y_k$ cannot be absolutely summable in any order.
5.26 Let $c$ denote the Cauchy Product of the sequence $x$ with the sequence $y$, and let $d$ be the Cauchy Product of $y$ with $x$. Show that $c_l = d_l$ for all $l \in \mathbb{N}$.
5.27 Let $x, y \in \mathbb{R}$, and form the sequences
$$e_j = \frac{x^j}{j!}, \qquad f_k = \frac{y^k}{k!},$$
$j, k = 0, 1, \ldots$. Let $c$ be the Cauchy Product of $e$ with $f$. Use the Binomial Theorem to prove that
$$c_l = \frac{(x + y)^l}{l!}.$$
(The reader who knows that $e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}$ will see that this exercise gives a direct proof from the Maclaurin series expansion of $e^x$ that $e^x e^y = e^{x+y}$.)
5.28
t Provide missing details as follows in the proof of Theorem 5.3.3. a) Prove that
(Hint: Consider 2:::= 1
(
I:f= 1 a1,k) and let .J ----+ oo.)
b) Prove that there exists K 1 as claimed. c) Prove that there exists K 2 as claimed. Hint: Consider
5.29 Demonstrate the importance of the assumption in Theorem 5.3.3 that the sum over all $(j, k) \in \mathbb{N}^2$ be absolutely convergent by giving an example in which each column sum $c_k = 0$, making the sequence $c_k$ absolutely summable, and yet
$$\sum_{(j,k) \in \mathbb{N}^2} |a_{j,k}|$$
diverges.
5.4 THE BANACH SPACE $l_1$ AND ITS DUAL SPACE
Recall that a Banach space is any vector space, equipped with a nonnegative function called a norm, having the property of being complete with respect to the norm: i.e., such that every sequence that is Cauchy in the norm converges to a point of the space. We have seen that $\mathbb{R}^n$ is a Banach space in the Euclidean norm, and $C[a, b]$ is a Banach space in the sup-norm. Now we will define another space, together with a nonnegative function that we will show makes this space a Banach space.
Definition 5.4.1 Denote by $l_1$ the set of all absolutely summable sequences $x$. If $x$ is any sequence, let
$$\|x\|_1 = \sum_{k=1}^{\infty} |x_k|.$$
Thus $x \in l_1$ if and only if $\|x\|_1 < \infty$.
Remark 5.4.1 The reader should be sure to use lower case $l_1$ to denote this space, as $L^1$ is used to denote a different space of absolutely integrable functions in the sense of Lebesgue. Both spaces are named after Henri Lebesgue.
Theorem 5.4.1 The set $l_1$ is a vector space, $\|\cdot\|_1$ is a norm, and with this norm $l_1$ is a Banach space.
Proof: For all $x$ and $y$ in $l_1$ and $c \in \mathbb{R}$, $(x + y)_k = x_k + y_k$ and $(cx)_k = c x_k$. To show that $l_1$ is closed under addition and scalar multiplication, it will suffice to show that
$$\|x + y\|_1 \le \|x\|_1 + \|y\|_1 < \infty$$
and
$$\|cx\|_1 = |c|\,\|x\|_1 < \infty,$$
which will simultaneously establish two of the properties required of a norm. However,
$$\|x + y\|_1 = \sum_{k=1}^{\infty} |x_k + y_k| \le \sum_{k=1}^{\infty} (|x_k| + |y_k|) = \sum_{k=1}^{\infty} |x_k| + \sum_{k=1}^{\infty} |y_k| = \|x\|_1 + \|y\|_1 < \infty.$$
The second property is Exercise 5.30. To complete the proof that $\|\cdot\|_1$ is indeed a norm, see Exercise 5.31. To complete the proof, we need to show $l_1$ is complete in the so-called $l_1$-norm. To avoid confusion with the indices, we will denote a Cauchy sequence $\{x^{(n)}\}_{n=1}^{\infty} \subset l_1$. This means that for all $n \in \mathbb{N}$ we have $x^{(n)} \in l_1$, so that $x^{(n)}$ is itself a sequence with $k$th term denoted by $x_k^{(n)} \in \mathbb{R}$. Since $x^{(n)}$ is a Cauchy sequence in the $l_1$-norm, for all $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that
$$m, n \ge N \implies \|x^{(m)} - x^{(n)}\|_1 = \sum_{k=1}^{\infty} \left|x_k^{(m)} - x_k^{(n)}\right| < \frac{\epsilon}{2}. \qquad (5.2)$$
However, this implies for all fixed $k$ that $m, n \ge N$ implies $\left|x_k^{(m)} - x_k^{(n)}\right| < \frac{\epsilon}{2}$ as well. Hence, for all $k$, $\{x_k^{(n)}\}_{n=1}^{\infty}$ is a Cauchy sequence in $\mathbb{R}$. Thus we can define a sequence $x$ by setting $x_k = \lim_{n\to\infty} x_k^{(n)}$, for all $k \in \mathbb{N}$. We need to show that $x \in l_1$ and that $\|x^{(m)} - x\|_1 \to 0$ as $m \to \infty$. Fix arbitrarily $p \in \mathbb{N}$ and $m, n \ge N$, so that Equation (5.2) above implies that
$$\sum_{k=1}^{p} \left|x_k^{(m)} - x_k^{(n)}\right| < \frac{\epsilon}{2}.$$
Letting $n \to \infty$ we see that
$$\sum_{k=1}^{p} \left|x_k^{(m)} - x_k\right| \le \frac{\epsilon}{2}.$$
Here we have used the finite upper summation limit $p$ so that we could employ $p$ iterations of the theorem about limits of sums of convergent sequences. Since
$$\sum_{k=1}^{p} \left|x_k^{(m)} - x_k\right| \le \frac{\epsilon}{2}$$
for all $p \in \mathbb{N}$, we have
$$\sum_{k=1}^{\infty} \left|x_k^{(m)} - x_k\right| = \|x^{(m)} - x\|_1 \le \frac{\epsilon}{2}.$$
Hence $x^{(m)} - x \in l_1$, and $x^{(m)} \in l_1$ implies $x^{(m)} - [x^{(m)} - x] = x \in l_1$ as well. Since $m \ge N$ implies
$$\|x^{(m)} - x\|_1 \le \frac{\epsilon}{2} < \epsilon,$$
we see that $x^{(m)} \to x$ as $m \to \infty$. •
For examples illustrating the concepts in the proof of Theorem 5.4.1, see Exercises 5.36-5.38.
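As a further numerical illustration of the objects in the proof of Theorem 5.4.1, the sketch below uses an assumed example, not one of the exercises: $x^{(n)}$ has $k$th term $2^{-k}$ for $k \le n$ and 0 otherwise, and only the first 40 coordinates are stored. The names `x_n`, `l1_distance`, and `LENGTH` are hypothetical.

```python
def x_n(n, length):
    """First `length` coordinates of x^(n): 2**-k for k <= n, and 0 afterwards."""
    return [2.0 ** -k if k <= n else 0.0 for k in range(1, length + 1)]


def l1_distance(u, v):
    """Truncated l_1 distance: sum of |u_k - v_k| over the stored coordinates."""
    return sum(abs(a - b) for a, b in zip(u, v))


LENGTH = 40  # assumption: coordinates beyond this index are ignored

# Coordinatewise limit x_k = 2**-k, truncated to the same length.
limit = [2.0 ** -k for k in range(1, LENGTH + 1)]

distances = [l1_distance(x_n(m, LENGTH), limit) for m in (5, 10, 20)]
cauchy_gap = l1_distance(x_n(5, LENGTH), x_n(10, LENGTH))

print(distances, cauchy_gap)
```

The distances to the coordinatewise limit shrink like $2^{-m}$, and the gap between $x^{(5)}$ and $x^{(10)}$ is $2^{-5} - 2^{-10}$, so in this example the sequence is Cauchy in the $l_1$-norm and converges in that norm to its coordinatewise limit.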
In Section 3.3 we identified several different bounded linear functionals on the Banach space $C[a, b]$, and we showed in Exercises 3.42.a and 3.42.b of that section that all the bounded linear functionals on $\mathbb{R}^2$ could be identified with the elements of $\mathbb{R}^2$ itself. (The bounded linear functionals on $\mathbb{R}^n$ can be identified with $\mathbb{R}^n$ in a similar way.) Here we are going to identify all the bounded linear functionals on $l_1$. But before we do this, let us observe that in the exercises just cited, we showed that the bounded linear functionals on $\mathbb{R}^2$ correspond to $\mathbb{R}^2$ itself, and $\mathbb{R}^2$ is a Banach space equipped with the Euclidean norm. It is not ordinarily the case that the family of bounded linear functionals on a Banach space can be identified with the space itself. But it is true that this family of bounded linear functionals is always a Banach space. We will prove this first.
Definition 5.4.2 Let $V$ denote any normed linear space. Let $V'$ be the set of all $T : V \to \mathbb{R}$ such that $T$ is linear and bounded. We call $V'$ the dual space of $V$.
Theorem 5.4.2 Let $V$ be any normed linear space. Then $V'$ is a Banach space, with norm defined by
$$\|T\| = \inf \{ K \mid |T(v)| \le K\|v\| \ \forall v \in V \}.$$
Remark 5.4.2 The symbol $\|T\|$ denotes the norm of the bounded linear functional $T$ that is defined in the theorem. In Exercise 5.32.a, the student will show that $|T(v)| \le \|T\| \cdot \|v\|$ for all $v \in V$.
Proof: The reader will recall from a course in linear algebra that the sum of any two linear maps is linear and that any constant times a linear map is linear, so we will see that $V'$ is a vector space if we can show that the function $\|\cdot\|$ defined on $V'$ is a norm, which will prove also that the sum of two bounded linear functionals is again bounded, and the same for scalar multiples. Observe first that
$$|T(v)| \le \|T\| \cdot \|v\|$$
for all $v \in V$, so that
$$|cT(v)| = |c|\,|T(v)| \le |c| \cdot \|T\| \cdot \|v\|,$$
so that $\|cT\| \le |c| \cdot \|T\| < \infty$. But for $c \ne 0$ we have $T = \frac{1}{c}(cT)$, so that $\|T\| \le \frac{1}{|c|}\|cT\|$. Thus $\|cT\| = |c|\,\|T\|$ (the case $c = 0$ being trivial). Observe next that
$$|(T_1 + T_2)(v)| \le |T_1(v)| + |T_2(v)| \le (\|T_1\| + \|T_2\|)\|v\|.$$
Thus we see that $\|T_1 + T_2\| \le \|T_1\| + \|T_2\|$. To complete the proof that $\|\cdot\|$ is a norm on $V'$, see Exercise 5.32.b. It remains to be shown that $V'$ is complete in the given norm. Let $T_n$ be any Cauchy sequence in $V'$. Let $\epsilon > 0$. Then there exists $N$ such that $m$ and $n \ge N$ implies
$$\|T_m - T_n\| < \frac{\epsilon}{2}.$$
Thus, for all $v \in V$,
$$|T_m(v) - T_n(v)| \le \frac{\epsilon}{2}\|v\|.$$
Hence $\{T_n(v)\}_{n=1}^{\infty}$ is a Cauchy sequence in $\mathbb{R}$ and we can define
$$T(v) = \lim_{n\to\infty} T_n(v)$$
for all $v \in V$. The proof that $T$ is linear is Exercise 5.33. Finally, we must show that $T$ is bounded and that $\|T_n - T\| \to 0$. But if $m$ and $n \ge N$, we know that
$$|T_m(v) - T_n(v)| \le \frac{\epsilon}{2}\|v\|.$$
Letting $n \to \infty$, we see that
$$|T_m(v) - T(v)| \le \frac{\epsilon}{2}\|v\|$$
for all $v \in V$. Thus
$$\|T_m - T\| \le \frac{\epsilon}{2} < \epsilon,$$
so $T_m \to T$ in norm. Moreover,
$$\|T\| = \|T_m - (T_m - T)\| \le \|T_m\| + \|T_m - T\| < \infty,$$
so $T$ is bounded as claimed. Thus $V'$ is a Banach space.
•
Now we will proceed to identify $l_1'$ as a Banach space.
Definition 5.4.3 Denote by $l_\infty$ the set of all bounded sequences, and
$$\|y\|_\infty = \sup \{ |y_k| \mid k \in \mathbb{N} \} \quad \forall y \in l_\infty.$$
The norm $\|\cdot\|_\infty$ is also called a sup-norm.
Theorem 5.4.3 For all $y \in l_\infty$ and for all $x \in l_1$ define
$$T_y(x) = \sum_{k=1}^{\infty} y_k x_k.$$
We claim that $l_\infty$ is a Banach space, the mapping $T : y \to T_y$ from $l_\infty$ is linear, injective (meaning one-to-one), and surjective (meaning onto) $l_1'$, and
$$\|T_y\| = \|y\|_\infty,$$
which means that $T$ preserves norms.
Remark 5.4.3 Because of the properties described in Theorem 5.4.3, the mapping $T$ is called an isomorphism from the Banach space $l_\infty$ to $l_1'$.
Proof: For all $y \in l_\infty$,
$$T_y : x \to \sum_{k=1}^{\infty} y_k x_k$$
is an absolutely convergent series (Exercise 5.10), and $T_y : l_1 \to \mathbb{R}$ is clearly linear. $T_y$ is bounded since
$$|T_y(x)| = \left| \sum_{k=1}^{\infty} y_k x_k \right| \le \sum_{k=1}^{\infty} |y_k x_k| \le \|y\|_\infty \|x\|_1$$
for all $x \in l_1$. Thus $\|T_y\| \le \|y\|_\infty$. The reader will prove that $\|T_y\| = \|y\|_\infty$ (Exercise 5.34). The mapping $T : y \to T_y$ carries $l_\infty$ linearly into $l_1'$. Since $\|T_y\| = \|y\|_\infty$, $T_y = 0$ if and only if $y = 0$. That is, the kernel of the map $T : y \to T_y$ is $\{0\}$, so $T$ is one-to-one from $l_\infty$ into $l_1'$. It remains to be proven that $T$ is onto $l_1'$. So we let $L \in l_1'$ be arbitrary, and we must show there exists $y \in l_\infty$ such that $L = T_y$. For each $j \in \mathbb{N}$, let $e^{(j)} \in l_1$ be defined by
$$e_k^{(j)} = \begin{cases} 1 & \text{if } k = j, \\ 0 & \text{if } k \ne j. \end{cases}$$
Define a sequence $y$ by letting $y_j = L(e^{(j)})$ for all $j \in \mathbb{N}$. Then
$$|y_j| = |L(e^{(j)})| \le \|L\| \cdot \|e^{(j)}\|_1 = \|L\|$$
for all $j$. Hence $y \in l_\infty$. We claim that for all $x \in l_1$, $L(x) = T_y(x)$, so that $L = T_y$. Observe that
$$\sum_{k=n+1}^{\infty} |x_k| = \|x\|_1 - \sum_{k=1}^{n} |x_k| \to 0$$
as $n \to \infty$. Therefore
$$\sum_{k=1}^{n} y_k x_k = L\left( \sum_{k=1}^{n} x_k e^{(k)} \right) \to L(x)$$
as $n \to \infty$, since every bounded linear functional $L$ must be continuous. That is, $L(x) = \sum_{k=1}^{\infty} y_k x_k = T_y(x)$.
Finally, we explain briefly how the mapping $y \to T_y$ establishes that $l_\infty$ must be a Banach space itself, and why this Banach space is said to be isomorphic to $l_1'$. We know already that the sup-norm is a norm, but this follows also from the properties of $T$. For example,
$$\|y + z\|_\infty = \|T_{y+z}\| = \|T_y + T_z\| \le \|T_y\| + \|T_z\| = \|y\|_\infty + \|z\|_\infty.$$
The other properties of a norm can be established for the $l_\infty$-norm similarly. To see that $l_\infty$ is complete, suppose $y^{(n)}$ is a sequence of bounded sequences that is Cauchy in the $l_\infty$-norm. Thus $T_{y^{(n)}}$ is a Cauchy sequence in $l_1'$. Hence there exists $y \in l_\infty$ such that $T_{y^{(n)}} \to T_y$. It follows that $y^{(n)} \to y$ since
$$\|y^{(n)} - y\|_\infty = \|T_{y^{(n)}} - T_y\| \to 0.$$
The mapping $T$ is called an isomorphism of Banach spaces because it preserves all the vector space operations, it preserves the norm, and it preserves convergence and completeness in that norm. •
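The pairing $T_y(x)$ and the bound $|T_y(x)| \le \|y\|_\infty \|x\|_1$ from the proof can be tried on finite truncations. The following sketch is illustrative only: the particular sequences, the truncation length `n`, and the helper names are assumptions, and finite lists stand in for infinite sequences.

```python
def pairing(y, x):
    """Truncated T_y(x): sum of y_k * x_k over the stored coordinates."""
    return sum(yk * xk for yk, xk in zip(y, x))


def l1_norm(x):
    return sum(abs(t) for t in x)


def sup_norm(y):
    return max(abs(t) for t in y)


def unit(j, n):
    """First n coordinates of e^(j): 1 in position j, 0 elsewhere (1-indexed)."""
    return [1.0 if k == j else 0.0 for k in range(1, n + 1)]


n = 50
x = [(-1) ** k / 2 ** k for k in range(1, n + 1)]   # absolutely summable
y = [k / (k + 1) for k in range(1, n + 1)]          # bounded by 1

value = pairing(y, x)
bound = sup_norm(y) * l1_norm(x)

# Applying the functional to e^(j) recovers the j-th coordinate of y.
recovered = [pairing(y, unit(j, n)) for j in range(1, n + 1)]

print(value, bound, recovered == y)
```

The last line mirrors the step $y_j = L(e^{(j)})$ in the proof: the values of the functional on the sequences $e^{(j)}$ determine the bounded sequence $y$.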
EXERCISES
5.30 † Prove for all $x \in l_1$ and for all $c \in \mathbb{R}$, $\|cx\|_1 = |c|\,\|x\|_1 < \infty$.
5.31 † Show for all $x \in l_1$ that $\|x\|_1 \ge 0$ and $\|x\|_1 = 0$ if and only if $x = 0$, the identically 0 sequence (i.e., $x_k = 0$ for all $k$).
5.32 Let $V$ be a normed linear space.
a) † If $T \in V'$, prove $|T(v)| \le \|T\| \cdot \|v\|$ for all $v \in V$.
b) † If $T \in V'$, prove $T = 0$, the zero functional, if and only if $\|T\| = 0$.
5.33 † In the proof of Theorem 5.4.2, we defined a function $T : V \to \mathbb{R}$ by
$$T(v) = \lim_{n\to\infty} T_n(v),$$
where $T_n \in V'$. Prove that $T$ is linear.
5.34 † If $y \in l_\infty$, $x \in l_1$, and $T_y(x) = \sum_{k=1}^{\infty} y_k x_k$, prove $\|T_y\| = \|y\|_\infty$. (Hint: The direction "$\le$" was established in the proof of Theorem 5.4.3. Try applying $T_y$ to $e^{(j)}$ from the proof of that theorem.)
5.35 Prove directly that $l_\infty$ with the sup-norm is a Banach space, without appealing to Theorem 5.4.3.
=
W., for all
x~n), for all k
E N. Show
5.36 For all n E N define a sequence x
5.37
For all n E N define a sequence x
xk
a) Show that x
-
{1
1
p:
h
by letting
if k :S: n, if k
> n.
h by showing llx(n) 11 1 < oo.
b) Define a sequence $x$ by letting $x_k = \lim_{n\to\infty} x_k^{(n)}$, for all $k \in \mathbb{N}$. Is $x \in l_1$? Prove your answer, yes or no.
c) Is $x^{(n)}$ a Cauchy sequence in $l_1$? Prove your answer, yes or no.
5.38
For all n E N define a sequence x(n) E h by letting x(n) _ 'k
-
{i
if k :::; n,
0
if k
> n.
a) Show that $x^{(n)} \in l_1$ by showing $\|x^{(n)}\|_1 < \infty$.
b) Define a sequence $x$ by letting $x_k = \lim_{n\to\infty} x_k^{(n)}$, for all $k \in \mathbb{N}$. Is $x \in l_1$? Prove your answer, yes or no. Prove that $\|x^{(n)} - x\|_\infty \to 0$ as $n \to \infty$.
c) Is $x^{(n)}$ a Cauchy sequence in $l_1$? Prove your answer, yes or no.
d) Is $x^{(n)}$ a Cauchy sequence in the sup-norm? Prove your answer, yes or no.
e) Is $l_1$ complete in the sup-norm?
5.5 SERIES OF FUNCTIONS: THE WEIERSTRASS M-TEST
Definition 5.5.1 Suppose for all $n \in \mathbb{N}$ that $f_n$ is defined on the domain $D$. Let
$$s_n(x) = \sum_{k=1}^{n} f_k(x).$$
If $s_n(x)$ converges to the number $s(x)$ for each $x \in D$ we say that $\sum_{k=1}^{\infty} f_k$ converges pointwise to $s$ on $D$. However, if
$$\|s_n - s\|_{\sup} \to 0$$
as $n \to \infty$ on $D$, then we say the series $\sum_{k=1}^{\infty} f_k$ converges uniformly to the sum $s$ on $D$.
•
EXAMPLE 5.7
Let $f_k(x) = x^k$, for all $x \in D = [0, 1)$. Since $|x| < 1$,
$$\sum_{k=1}^{\infty} f_k(x) = \sum_{k=1}^{\infty} x^k = \frac{x}{1 - x},$$
pointwise convergent on $D$. However, we can see as follows that this series is not uniformly convergent on $D$. Suppose it were uniformly convergent. Then the sequence $s_n$ would be Cauchy in the sup-norm. Let $\epsilon = 1$; then there exists $N$ such that $n > m \ge N$ implies $\|s_n - s_m\|_{\sup} < 1$. That is, $\sum_{k=m+1}^{n} x^k < 1$ for all $n > m \ge N$ and for all $x \in D$. Hence
$$\lim_{x \to 1-} \sum_{k=m+1}^{n} x^k \le 1.$$
However, $\lim_{x \to 1-} \sum_{k=m+1}^{n} x^k = n - m > 1$ whenever $n > m + 1$. This is a contradiction. Hence the convergence is not uniform on $[0, 1)$.
Exercise 5.40 shows that the series in Example 5.7 does converge uniformly on a smaller domain. The next theorem shows that when we do have uniform convergence of an infinite series, this permits strong conclusions of interest in the calculus.
Theorem 5.5.1 Let $f_k$ be defined on a domain $D$ for all $k \in \mathbb{N}$. Then we have the following conclusions.
i. If $f_k \in C(D)$ for all $k$ and if $\sum_{k=1}^{\infty} f_k$ converges uniformly to $f$ on $D$, then $f \in C(D)$ as well.
ii. If $f_k \in \mathcal{R}[a, b]$ for all $k$ and if $\sum_{k=1}^{\infty} f_k$ converges uniformly to $f$ on $[a, b]$, then $f \in \mathcal{R}[a, b]$ and
$$\int_a^b f(x)\,dx = \sum_{k=1}^{\infty} \int_a^b f_k(x)\,dx.$$
iii. If $f_k \in C^1[a, b]$ and if there exists $c \in [a, b]$ such that $\sum_{k=1}^{\infty} f_k(c)$ converges and if $\sum_{k=1}^{\infty} f_k'$ converges uniformly on $[a, b]$, then $\sum_{k=1}^{\infty} f_k$ converges uniformly to a differentiable function $f$ on $[a, b]$ and
$$f'(x) = \sum_{k=1}^{\infty} f_k'(x).$$
Proof:
i. For all $n \in \mathbb{N}$, $s_n = \sum_{k=1}^{n} f_k \in C[a, b]$ and $\|s_n - f\|_{\sup} \to 0$ as $n \to \infty$. Thus $f \in C[a, b]$.
ii. This is Exercise 5.41.
iii. By hypothesis, $s_n' = \sum_{k=1}^{n} f_k' \to g$ uniformly on $[a, b]$. And $s_n(c)$ converges for some $c \in [a, b]$. By the corresponding theorem on uniform limits and derivatives for sequences, we see that $s_n$ converges uniformly to some differentiable function $f$ and $f'(x) = g(x) = \sum_{k=1}^{\infty} f_k'(x)$.
•
Because uniform convergence is so useful, it is helpful to have a convenient test that can identify many (though not all) uniformly convergent series.
Theorem 5.5.2 (Weierstrass M-test) Let $f_k$ be defined on a domain $D$ for all $k \in \mathbb{N}$. Let
$$M_k = \|f_k\|_{\sup} < \infty$$
for all $k$. If $\sum_{k=1}^{\infty} M_k < \infty$ then $\sum_{k=1}^{\infty} f_k$ converges both absolutely and uniformly on $D$. Moreover,
$$\sum_{k=1}^{\infty} |f_k|$$
converges uniformly on $D$ as well.
Proof: Let $\epsilon > 0$. By hypothesis, $a_n = \sum_{k=1}^{n} M_k$ is a Cauchy sequence, so there exists $N \in \mathbb{N}$ such that $n > m \ge N$ implies $|a_n - a_m| < \epsilon$. Denoting
$$s_n = \sum_{k=1}^{n} f_k,$$
we see that
$$\|s_n - s_m\|_{\sup} = \left\| \sum_{k=m+1}^{n} f_k \right\|_{\sup} \le \sum_{k=m+1}^{n} \|f_k\|_{\sup} = \sum_{k=m+1}^{n} M_k = |a_n - a_m| < \epsilon.$$
Thus $s_n$ is Cauchy in the sup-norm, so $\sum_{k=1}^{\infty} f_k$ converges uniformly. The exact same argument as that just given still works if we replace $f_k$ by $|f_k|$ in every step. This implies that $\sum_{k=1}^{\infty} |f_k|$ converges uniformly on $D$ as well. •
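The key inequality in this proof, $\|s_n - s_m\|_{\sup} \le \sum_{k=m+1}^{n} M_k$, can be observed numerically. The sketch below assumes the illustrative family $f_k(x) = \cos(kx)/2^k$ on $D = [0, 2\pi]$ with $M_k = 2^{-k}$ (an example chosen here, not from the text) and approximates the sup-norm by a maximum over a finite grid, which can only underestimate the true supremum.

```python
import math


def f(k, x):
    """Illustrative term f_k(x) = cos(k x) / 2**k, so that M_k = 2**-k."""
    return math.cos(k * x) / 2 ** k


def partial_sum(n, x):
    return sum(f(k, x) for k in range(1, n + 1))


# Finite grid on [0, 2*pi] standing in for the domain D.
grid = [i * 2 * math.pi / 400 for i in range(401)]


def grid_sup_distance(m, n):
    """Maximum over the grid of |s_n(x) - s_m(x)|."""
    return max(abs(partial_sum(n, x) - partial_sum(m, x)) for x in grid)


def tail_bound(m, n):
    """Sum of M_k for k = m+1, ..., n."""
    return sum(1 / 2 ** k for k in range(m + 1, n + 1))


for m, n in [(1, 5), (5, 20), (10, 30)]:
    print(m, n, grid_sup_distance(m, n), tail_bound(m, n))
```

At $x = 0$ every cosine equals 1, so for this family the bound is attained and the two printed columns agree up to rounding.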
• EXAMPLE 5.8
Let
$$f(x) = \frac{1}{1 - x}$$
for all $x \in (-1, 1)$. We know from the Geometric Series formula that
$$f(x) = \sum_{k=0}^{\infty} x^k$$
provided that $|x| < 1$. We wish to express
$$f'(x) = \frac{1}{(1 - x)^2}$$
as an infinite series in powers of $x$. Suppose we fix $x_0 \in (-1, 1)$ and select a real number $r$ for which $|x_0| < r < 1$. We apply the Weierstrass M-Test by letting $M_k = k r^{k-1}$ to conclude that the series
$$\sum_{k=1}^{\infty} k x^{k-1}$$
of term-by-term derivatives converges uniformly on $[-r, r]$ by applying the Ratio Test (Theorem 5.2.3) to $\sum_{k=1}^{\infty} k r^{k-1}$. It follows then from Theorem 5.5.1 that $f'(x)$ exists on $[-r, r]$ and, specifically,
$$f'(x_0) = \sum_{k=1}^{\infty} k x_0^{k-1}.$$
Note that this formula has been established for all values of $x_0 \in (-1, 1)$.
EXERCISES
5.39 Let
$$f_k(x) = \frac{(-1)^{k+1}}{k} x^k.$$
a) Prove: $\sum_{k=1}^{\infty} f_k(x)$ converges uniformly on $[0, 1]$. (Hint: Use the error estimate from Theorem 5.1.2.)
b) Find $\|f_k\|_{\sup}$ on $[0, 1]$. Can the Weierstrass M-test be used to prove the uniform convergence of $\sum_{k=1}^{\infty} f_k(x)$ on $[0, 1]$? Why or why not?
5.40 If $0 \le a < 1$, prove that $\sum_{k=1}^{\infty} x^k$ converges uniformly on $[0, a]$.
5.41 Prove part (ii) of Theorem 5.5.1.
5.42 If $\sum_{k=1}^{\infty} f_k$ converges uniformly on $D$, prove: $\|f_n\|_{\sup} \to 0$ as $n \to \infty$. Is the converse true? Prove or give a counterexample.
5.43 Determine whether or not each of the following series converges uniformly on the indicated domain. a) L::;: 1 e-kx on [1, oo). b) d)
on lR
c) e)
·
L::;: 1 sink x on [0, 8], where
0
5.44
sin
kx uk=l ~
'"'00
<8<
L::;: 1 sinkxon [0,~). L::;: 1 tank x on [o, ~].
~·
Let
f(x)
=
f: si~:x k=l
on JR. Prove that f E and find an expression for series. Justify all your conclusions.
C1 (IR)
5.45
Suppose 00
L k=O
ak
cos 2krrx
f' (x)
in terms of an infinite
converges uniformly on $[0, 1]$ to a function $f \in \mathcal{R}[0, 1]$. Prove that
$$a_k = 2 \int_0^1 f(x) \cos 2k\pi x \, dx,$$
if $k > 0$, and $a_0 = \int_0^1 f(x)\,dx$. (Hint: First show how $\int_0^1 \cos 2j\pi x \cos 2k\pi x \, dx$ depends upon $j$ and $k$, and apply a theorem about uniformly convergent series.)
5.6 POWER SERIES
Definition 5.6.1 A power series in powers of $(x - a)$ is an infinite series
$$\sum_{k=0}^{\infty} c_k (x - a)^k,$$
where $c$ is a sequence of real coefficients and $a \in \mathbb{R}$ is the base point.
Given any power series, our first goal is to determine the set of values of the real variable $x$ for which the power series converges. The next theorem shows that the set of points of convergence is always an interval (finite or infinite) centered about $a$. Note that the set of points of convergence of a power series is never empty, since $x = a$ is always a point of convergence.
Theorem 5.6.1 Given any power series
$$\sum_{k=0}^{\infty} c_k (x - a)^k,$$
there exists a radius of convergence $0 \le R \le \infty$ such that the series converges absolutely everywhere inside $(a - R, a + R)$, diverges everywhere outside $[a - R, a + R]$, and converges uniformly on every closed finite interval
$$[\alpha, \beta] \subset (a - R, a + R).$$
Remark 5.6.1 If $R = \infty$, we interpret $(a - R, a + R)$ as $(-\infty, \infty)$. It is always a delicate question in any example to determine what happens at the endpoints $a \pm R$ of the interval of convergence, if $R < \infty$.
Proof: If we were to replace the variable $x$ by $x' = x - a$, then we would be proving that the series $\sum_{k=0}^{\infty} c_k x'^k$ has the claimed convergence properties in $(-R, R)$. Then we would replace $x'$ by $x - a$ and have the general result. So, without loss of generality, it is sufficient to prove the theorem for the case $a = 0$.

i. We begin by proving the following claim: If the series $\sum_{k=0}^{\infty} c_k x^k$ converges for $x = x_1 \neq 0$ and if $[\alpha, \beta] \subset (-|x_1|, |x_1|)$, then $\sum_{k=0}^{\infty} c_k x^k$ converges both absolutely and uniformly on $[\alpha, \beta]$.

Pick $r > 0$ such that
$$-|x_1| < -r \le \alpha \le \beta \le r < |x_1|.$$
Thus for all $x \in [\alpha, \beta]$, we have $|x| \le r < |x_1|$, so that
$$|c_k x^k| = |c_k x_1^k| \cdot \left|\frac{x}{x_1}\right|^k \le |c_k x_1^k| \cdot \left|\frac{r}{x_1}\right|^k.$$
Since $\sum_{k=0}^{\infty} c_k x_1^k$ converges, $c_k x_1^k \to 0$, and so $|c_k x_1^k| \le B < \infty$ for all $k$ and for some $B$. Since $0 \le r < |x_1|$, $\left|\frac{r}{x_1}\right| < 1$, and
$$\sum_{k=0}^{\infty} B \left|\frac{r}{x_1}\right|^k$$
is a convergent geometric series. But
$$\|c_k x^k\|_{\sup} \le B \left|\frac{r}{x_1}\right|^k$$
on $[\alpha, \beta]$, so $\sum_{k=0}^{\infty} \|c_k x^k\|_{\sup} < \infty$. Hence $\sum_{k=0}^{\infty} c_k x^k$ converges both absolutely and uniformly on $[\alpha, \beta]$ by the Weierstrass M-test. Observe that every $x \in (-|x_1|, |x_1|)$ lies in some such interval $[\alpha, \beta]$. Thus we have also proven that $\sum_{k=0}^{\infty} c_k x^k$ converges absolutely on the interval $(-|x_1|, |x_1|)$.

ii. Now let $I$ be the set of all points at which $\sum_{k=0}^{\infty} c_k x^k$ converges, and let
$$R = \sup \{ |x_1| \mid x_1 \in I \}.$$
If $[\alpha, \beta] \subset (-R, R)$, then there exists $x_1 \in I$ such that
$$[\alpha, \beta] \subset (-|x_1|, |x_1|).$$
Hence the series converges uniformly and absolutely on $[\alpha, \beta]$. And it follows that the series converges absolutely on $(-R, R)$. On the other hand, if $x \notin [-R, R]$, then $|x|$ is too big to permit $x \in I$. Hence $x \notin [-R, R]$ implies $x \notin I$. That is, $(-R, R) \subseteq I \subseteq [-R, R]$.
•

Remark 5.6.2 We observe that if $|x - a| < R$, then $\sum_{k=0}^{\infty} |c_k (x - a)^k|$ converges. But if $|x - a| > R$, then $\sum_{k=0}^{\infty} |c_k (x - a)^k|$ diverges. For this reason, we can determine the radius $R$ of convergence of $\sum_{k=0}^{\infty} c_k (x - a)^k$ by testing the positive-term series $\sum_{k=0}^{\infty} |c_k (x - a)^k|$ for convergence. The ratio test is frequently useful for this purpose.
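The remark on the ratio test can be explored numerically. The following is only a sketch, not part of the text: the helper name `ratio_estimates` and the sample coefficients $c_k = 1/k$ are illustrative assumptions. It relies on the standard fact that when $\lim |c_k / c_{k+1}|$ exists, that limit equals the radius of convergence $R$.

```python
from fractions import Fraction

def ratio_estimates(coeff, ks):
    """Return |c_k / c_(k+1)| for each k in ks.

    When these ratios have a limit, the ratio test shows that the
    limit is the radius of convergence R of sum c_k x^k.
    """
    return [abs(Fraction(coeff(k)) / Fraction(coeff(k + 1))) for k in ks]

def c(k):
    # Hypothetical sample coefficients: c_k = 1/k, i.e. the series sum x^k / k.
    return Fraction(1, k)

sample_ks = [10, 100, 1000]
estimates = ratio_estimates(c, sample_ks)
for k, est in zip(sample_ks, estimates):
    # The ratios (k + 1) / k approach 1, suggesting R = 1.
    print(k, float(est))
```

Exact rational arithmetic is used so that the printed ratios are not blurred by rounding. The sketch says nothing about the endpoints $a \pm R$, which must still be tested separately.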
• EXAMPLE 5.9

We will find the interval $I$ of convergence of $\sum_{k=1}^{\infty} \frac{x^k}{k}$. Here $a = 0$, and we begin by finding $R$. Applying the ratio test to the corresponding series of absolute values, we compute
$$\lim_{k \to \infty} \left| \frac{x^{k+1}/(k+1)}{x^k / k} \right| = \lim_{k \to \infty} \frac{k}{k+1} |x| = |x|.$$
Thus the series converges (absolutely) for $|x| < 1$ and diverges for $|x| > 1$. The endpoints must be tested separately. At $x = 1$ we get the harmonic series, which diverges. But when $x = -1$, we get
$$\sum_{k=1}^{\infty} \frac{(-1)^k}{k},$$
which is $-1$ times the alternating harmonic series, which converges. Thus
$$I = [-1, 1).$$

• EXAMPLE 5.10
Theorem 5.6.1 enables us to use the definite integral to obtain power series expansions for an assortment of interesting functions. Here we will obtain a power series expansion for $\tan^{-1} x$. We begin with the geometric series formula
$$\frac{1}{1 - r} = \sum_{k=0}^{\infty} r^k$$
for all $r \in (-1, 1)$. Now set $r = -t^2$, and restrict $t$ to the interval $(-1, 1)$ to ensure $|r| < 1$ as required. Be sure to observe that Theorem 5.5.1 enables us to integrate both sides of the expansion
$$\frac{1}{1 + t^2} = \sum_{k=0}^{\infty} (-1)^k t^{2k},$$
but the integrals must be definite integrals (with upper and lower limits of integration). Thus, if $|x| < 1$, we have
$$\int_0^x \frac{1}{1 + t^2} \, dt = \sum_{k=0}^{\infty} (-1)^k \int_0^x t^{2k} \, dt,$$
yielding
$$\tan^{-1} x = \sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1} x^{2k+1}. \tag{5.3}$$
By our derivation, we know only that Equation (5.3) is valid for $x \in (-1, 1)$. Yet we know $\tan^{-1} x$ is a continuous function defined on $[-1, 1]$. In Exercise 5.46 the reader will show that Equation (5.3) remains valid at $x = \pm 1$, hence for all $x \in [-1, 1]$.
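Equation (5.3) can be checked numerically inside the interval of convergence. The sketch below is an illustration only: the function name `arctan_partial_sum` and the sample point $x = 0.5$ are assumptions, not part of the text. It compares a partial sum of the series with the library value of the inverse tangent.

```python
import math

def arctan_partial_sum(x, n_terms):
    """Sum the first n_terms terms of the series in Equation (5.3)."""
    total = 0.0
    for k in range(n_terms):
        total += (-1) ** k * x ** (2 * k + 1) / (2 * k + 1)
    return total

x = 0.5                      # sample point with |x| < 1
approx = arctan_partial_sum(x, 20)
error = abs(approx - math.atan(x))
print(approx, math.atan(x), error)
```

Because the series is alternating with decreasing terms for $0 < x < 1$, the error after 20 terms is bounded by the next term, $x^{41}/41$, which is far below ordinary floating-point display precision at $x = 0.5$. Convergence becomes much slower as $|x|$ approaches 1.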
EXERCISES

5.46 a) Show that the series on the right side of Equation (5.3) converges uniformly on $[-1, 1]$ so that if we let
$$g(x) = \sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1} x^{2k+1},$$
then $g \in C[-1, 1]$. (Hint: Use the alternating series test.)
b) Use the preceding result to show that Equation (5.3) remains valid for all $x \in [-1, 1]$.
c) Use the preceding result to find an infinite series the sum of which is $\pi$. (Hint: What is $\tan^{-1}(1)$?)

5.47 a) Use the geometric series formula [Equation (5.1)] to derive a power series expansion of $\frac{1}{t}$ in powers of $(t - 1)$, valid for $|t - 1| < 1$. Does the resulting power series converge (to some real number) at $t = 2$? What about $t = 0$?
b) Integrating from $t = 1$ to $t = x$, for all $x \in (0, 2)$, find a power series expansion in powers of $(x - 1)$ for $\log x$.
c) Prove that the power series representation of $\log x$ found in (b) remains valid for $x = 2$. What about $x = 0$? Explain your conclusions.
5.48 Show that the series
$$\sum_{k=1}^{\infty} \frac{x^k}{k}$$
does not converge uniformly on $(-1, 1)$. (Hint: Show that the sequence of partial sums is not Cauchy in the sup-norm.)

5.49 Find
$$\sum_{k=1}^{\infty} \frac{1}{k 2^k}.$$
(Hint: Begin with the series $\sum_{k=0}^{\infty} t^k$ and integrate between two appropriate limits.)

5.50
Find the interval of convergence for each of the following power series. ""'oo
oo
oo
~
xk
b) L:k=o -k+l
(-l)k+l(x-1)2k+l
d) ""'oo
a) wk=O kf
xk
(x
+ 1) k
c) L:k=l 2k+l wk=l k" e) L:;%': 0 if,xk. Hint: For the endpoints, it may help to recall the power series expansion of ex.
5.51 Use Exercise 5.17 to prove that the radius $R$ of convergence of the power series $\sum_{k=0}^{\infty} a_k x^k$ is given by
$$R = \frac{1}{\limsup \sqrt[k]{|a_k|}}$$
provided that $\limsup \sqrt[k]{|a_k|}$ is a strictly positive real number. Give a suitable interpretation of this formula for $R$ in the case that $\limsup \sqrt[k]{|a_k|} \in \{0, \infty\}$.
5.7 REAL ANALYTIC FUNCTIONS AND $C^\infty$ FUNCTIONS

If $f(x) = \sum_{k=0}^{\infty} c_k (x - a)^k$ for all $x \in (a - R, a + R)$, we would like to show that
$$f'(x) = \sum_{k=1}^{\infty} k c_k (x - a)^{k-1},$$
the sum of the term-by-term derivatives of the power series for $f$. In order to use Theorem 5.5.1 to prove this conclusion, we will need to know how the radius $R'$ of convergence for
$$\sum_{k=1}^{\infty} k c_k (x - a)^{k-1}$$
compares with $R$. Because of the factor $k$ multiplying $c_k$ in the derived series, we would expect that $x - a$ should be smaller: that is, we expect that $R' \le R$. Actually, we will prove the surprising fact that $R' = R$.
Theorem 5.7.1 If $R$ is the radius of convergence of
$$\sum_{k=0}^{\infty} c_k (x - a)^k$$
and if $R'$ is the radius of convergence of
$$\sum_{k=1}^{\infty} k c_k (x - a)^{k-1},$$
then $R = R'$.

Proof: As in the proof of Theorem 5.6.1, we could change variables to $x' = x - a$, and in the $x'$-variable the base point $a$ would be replaced by 0. So without loss of generality we can give the proof under the assumption that $a = 0$.

i. First we will prove that $R' \le R$. For this it will suffice to show that if $|x| < R'$, then $|x| \le R$. (See Exercise 5.52.) Since $|x| < R'$,
$$\sum_{k=1}^{\infty} k |c_k x^{k-1}| < \infty.$$
But then the series
$$\sum_{k=1}^{\infty} |c_k x^k| = |x| \sum_{k=1}^{\infty} |c_k x^{k-1}|$$
converges by the Comparison Test, since $|c_k x^{k-1}| \le k |c_k x^{k-1}|$ for all $k \ge 1$. Hence $\sum_{k=0}^{\infty} c_k x^k$ converges absolutely, so $R' \le R$.

ii. Now we must prove the counterintuitive claim that $R \le R'$. As in the first case, we will show that if $|x| < R$ then $|x| \le R'$. Because of the delicate comparison needed for case (ii), we pick $x_1$ such that $|x| < |x_1| < R$. So we are seeking to prove that if
$$\sum_{k=0}^{\infty} |c_k x_1^k| < \infty, \quad \text{then} \quad \sum_{k=1}^{\infty} k |c_k x^{k-1}| < \infty.$$
Since $x_1 \neq 0$, we have
$$k |c_k x^{k-1}| = \frac{k r^{k-1}}{|x_1|} \, |c_k x_1^k|, \qquad \frac{k r^{k-1}}{|x_1|} \to 0 \text{ as } k \to \infty, \tag{1}$$
where $r = \left| \frac{x}{x_1} \right| < 1$. Here we justify the limit (1) by means of the $n$th Term Test, since $\sum_{k=1}^{\infty} k r^{k-1}$ converges by the ratio test (Exercise 5.13). Thus
$$\sum_{k=1}^{\infty} k |c_k x^{k-1}| < \infty$$
by the Limit Comparison Test (Exercise 5.8). Hence $|x| \le R'$, as claimed.
•

Definition 5.7.1 If there exists $R > 0$ and if there exists a coefficient sequence $c$ such that
$$f(x) = \sum_{k=0}^{\infty} c_k (x - a)^k$$
for all $x \in (a - R, a + R)$, then we call $f$ (real) analytic at $x = a$.

Theorem 5.7.2 If $f$ is (real) analytic at $a$ and if $R$ is as in Definition 5.7.1, then $f \in C^\infty(a - R, a + R)$ and
$$c_k = \frac{f^{(k)}(a)}{k!}$$
for all $k \ge 0$.

Remark 5.7.1 If $k = 0$, we interpret $f^{(0)}$ as being $f$ itself, and recall the convention that $0! = 1$. This theorem states that no function can be equal to the sum of a power series on an open interval without being infinitely differentiable (i.e., without the derivative $f^{(k)}$ being continuous on the open interval for all $k$). Moreover, if $f$ is analytic on $(a - R, a + R)$, then the coefficients of the power series must be the coefficients from Taylor's Theorem (Theorem 4.6.1). In other words, if a function can be expressed as the sum of a series in powers of $(x - a)$, its coefficients are unique.

Proof: Let $x \in (a - R, a + R)$, so there exist $\alpha$ and $\beta$ such that
$$x \in (\alpha, \beta) \subset [\alpha, \beta] \subset (a - R, a + R).$$
Both the series $\sum_{k=0}^{\infty} c_k (x - a)^k$ and the series $\sum_{k=1}^{\infty} k c_k (x - a)^{k-1}$ converge uniformly and absolutely on $[\alpha, \beta]$. Thus $f'(x)$ exists and
$$f'(x) = \sum_{k=1}^{\infty} k c_k (x - a)^{k-1}.$$
Since the derived series has the same radius $R$ of convergence as the original series, this argument can be repeated as often as we like to prove that $f^{(p)}(x)$ exists and
$$f^{(p)}(x) = \sum_{k=p}^{\infty} k(k - 1) \cdots (k - [p - 1]) \, c_k (x - a)^{k-p}$$
for each $p \in \mathbb{N}$. Since both $f^{(p)}$ and $f^{(p+1)}$ exist on $(a - R, a + R)$,
$$f^{(p)} \in C(a - R, a + R),$$
so $f \in C^\infty(a - R, a + R)$. It remains to determine the coefficients $c_k$. We begin by substituting $x = a$ into
$$f(x) = \sum_{k=0}^{\infty} c_k (x - a)^k,$$
yielding
$$c_0 = f(a) = \frac{f^{(0)}(a)}{0!}.$$
Next we substitute $x = a$ into
$$f'(x) = \sum_{k=1}^{\infty} k c_k (x - a)^{k-1},$$
obtaining
$$c_1 = \frac{f'(a)}{1!}.$$
In general, we substitute $x = a$ into
$$f^{(p)}(x) = \sum_{k=p}^{\infty} k(k - 1) \cdots (k - [p - 1]) \, c_k (x - a)^{k-p},$$
showing that
$$c_p = \frac{f^{(p)}(a)}{p!}.$$

•

The reader may be surprised to learn that although every function that is (real) analytic at $a$ must be in $C^\infty(a - R, a + R)$ for some $R > 0$, there exist $f \in C^\infty(\mathbb{R})$ such that $f$ is not analytic at 0.

• EXAMPLE 5.11
Let
$$f(x) = \begin{cases} e^{-1/x^2} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}$$
If $x \neq 0$, then $f'(x) = \frac{2}{x^3} f(x)$. By direct calculation we see that
$$f'(0) = \lim_{x \to 0} \frac{f(x)}{x} = 0.$$
(See Exercise 5.57.) If $x \neq 0$,
$$f''(x) = \frac{-6}{x^4} f(x) + \frac{4}{x^6} f(x).$$
In Exercise 5.57 the reader will show also that $f''(0) = 0$. The industrious reader may then devise a proof by the method of mathematical induction that $f^{(k)}(0)$ exists and equals 0 for all $k \in \mathbb{N}$. If we construct the Maclaurin series
$$\sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k,$$
this converges uniformly and absolutely to the sum 0 for all $x \in \mathbb{R}$. However, this sum does not equal $f(x)$ on any interval $(-R, R)$, with $R > 0$. Hence a $C^\infty$ function need not be analytic.

Knowing that proving the convergence of a (Taylor) power series
$$\sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!} (x - a)^k$$
is insufficient to establish that it converges to the given function $f(x)$, how can we establish (other than by the methods in Section 5.6) that a given function
$$f \in C^\infty(a - R, a + R)$$
is analytic? In fact, the $n$th partial sum of this series is just the $n$th Taylor Polynomial $P_n(x)$ of Theorem 4.6.1. To show that $P_n(x) \to f(x)$ as $n \to \infty$, we recall from the latter theorem that the Taylor Remainder
$$R_n(x) = \frac{f^{(n+1)}(\mu)}{(n + 1)!} (x - a)^{n+1}.$$
It is necessary and sufficient to show that $R_n(x) \to 0$, bearing in mind that we cannot control the location of $\mu$ between $a$ and $x$.
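Returning briefly to Example 5.11, a short numerical experiment can make the flatness of $e^{-1/x^2}$ at the origin plausible. This is a sketch only: the sample points and the names `ratios` and `gap` are illustrative assumptions, and floating-point evidence is no substitute for the induction argument the example calls for.

```python
import math

def f(x):
    """The function of Example 5.11: exp(-1/x^2) for x != 0, and 0 at x = 0."""
    return math.exp(-1.0 / (x * x)) if x != 0 else 0.0

# f(x) / x^n is extremely small near 0 for each fixed n, which is
# consistent with every derivative of f vanishing at the origin.
ratios = {n: f(0.05) / 0.05 ** n for n in (1, 5, 10)}

# Every Maclaurin coefficient is 0, so every partial sum is 0,
# yet f itself is strictly positive away from the origin.
maclaurin_sum = 0.0
gap = f(0.5) - maclaurin_sum

print(ratios)
print(gap)
```

The positive value of `gap` is the point of the example: the Maclaurin series converges, but not to $f$.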
• EXAMPLE 5.12

Let $f(x) = e^x$ for all $x \in \mathbb{R}$. We claim that
$$f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k = \sum_{k=0}^{\infty} \frac{x^k}{k!}$$
for all $x \in \mathbb{R}$. We need to show that
$$R_n(x) = \frac{f^{(n+1)}(\mu)}{(n + 1)!} x^{n+1} \to 0$$
as $n \to \infty$, where $\mu$ is somewhere between 0 and $x$. But for our function
$$|R_n(x)| = \left| \frac{e^{\mu}}{(n+1)!} x^{n+1} \right| \le e^{|x|} \frac{|x|^{n+1}}{(n+1)!} \to 0$$
as $n \to \infty$ by virtue of the $n$th Term Test, since
$$\sum_{n=0}^{\infty} \frac{|x|^{n+1}}{(n+1)!}$$
converges by the ratio test. This proves the claim.
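The remainder estimate in Example 5.12 can be compared with actual errors. The following sketch is illustrative only: the function names `taylor_exp` and `remainder_bound` and the sample point $x = -3$ are assumptions, not part of the text.

```python
import math

def taylor_exp(x, n):
    """nth Taylor polynomial of e^x about 0, evaluated at x."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

def remainder_bound(x, n):
    """The bound e^{|x|} |x|^{n+1} / (n+1)! from Example 5.12."""
    return math.exp(abs(x)) * abs(x) ** (n + 1) / math.factorial(n + 1)

x = -3.0  # sample point
checks = [
    (n, abs(math.exp(x) - taylor_exp(x, n)), remainder_bound(x, n))
    for n in (5, 10, 20)
]
for n, actual, bound in checks:
    print(n, actual, bound)
```

In each row the actual error stays below the bound, and both shrink rapidly as $n$ grows, which is what the factorial in the denominator predicts.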
Theorem 5.7.3 Suppose $f$ and $g$ are both real analytic at $x = a$. Then the product $fg$ is also analytic at $x = a$.

Proof: By hypothesis, there exist positive numbers $R_1$ and $R_2$ such that $|x - a| < R_1$ implies
$$f(x) = \sum_{k=0}^{\infty} a_k (x - a)^k$$
and $|x - a| < R_2$ implies
$$g(x) = \sum_{k=0}^{\infty} b_k (x - a)^k,$$
with both series being absolutely convergent. Let $R = \min\{R_1, R_2\}$. If
$$c_k (x - a)^k = \sum_{l=0}^{k} a_l (x - a)^l \, b_{k-l} (x - a)^{k-l}$$
is the $k$th term of the Cauchy product of the sequences $a_k (x - a)^k$ and $b_k (x - a)^k$, then $|x - a| < R$ implies
$$f(x) g(x) = \sum_{k=0}^{\infty} c_k (x - a)^k.$$

•
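The coefficient formula $c_k = \sum_{l=0}^{k} a_l b_{k-l}$ implicit in the proof of Theorem 5.7.3 is easy to compute for truncated coefficient lists. The sketch below is an illustration under stated assumptions: the helper name `cauchy_product` is hypothetical, and the test case uses the well-known expansion $1/(1-x)^2 = \sum_{k=0}^{\infty} (k+1) x^k$ for $|x| < 1$.

```python
from fractions import Fraction

def cauchy_product(a, b):
    """Coefficients c_k = sum_{l=0}^{k} a_l * b_(k-l) of the product series.

    Only the first min(len(a), len(b)) coefficients are returned, since
    later ones would need terms beyond the truncated input lists.
    """
    n = min(len(a), len(b))
    return [sum(a[l] * b[k - l] for l in range(k + 1)) for k in range(n)]

# Coefficients of the geometric series 1/(1 - x) = sum x^k.
geometric = [Fraction(1)] * 8

# The product series should represent 1/(1 - x)^2 = sum (k + 1) x^k.
product = cauchy_product(geometric, geometric)
print([int(c) for c in product])
```

Exact fractions are used so the same helper can be applied to coefficient lists such as $1/k!$ without rounding.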
Figure 5.1 $f(x) = x^4$ if $0 < x < 1$.

EXERCISES
5.52 Suppose $|x| < R'$ implies $|x| \le R$. Prove that $R' \le R$.

5.53 Use Examples 5.10 and 5.12 with Theorem 5.7.3 to show that
$$h(x) = e^x \tan^{-1}(x)$$
is real analytic at 0. Find the coefficient of $x^4$ in the power series expansion of $h$.

5.54 Let $f(x) = \tan^{-1} x$. Find $f^{(100)}(0)$ and $f^{(101)}(0)$. [Hint: Use Theorem 5.7.2 and Equation (5.3).]

5.55 Prove that the following functions are not analytic at 0.
a) $f(x) = |x|$
b) For each $k \in \mathbb{N}$ let
$$f(x) = \begin{cases} x^k & \text{if } x > 0, \\ 0 & \text{if } x \le 0. \end{cases}$$

5.56 Let
$$f(x) = \begin{cases} x^4 & \text{if } 0 < x < 1, \\ 0 & \text{if } -1 < x \le 0. \end{cases}$$
Prove your answers to the following questions, and see Fig. 5.1.
a) Is $f \in C^\infty(0, 1)$?
b) Is $f \in C^\infty(-1, 0)$?
c) Is $f \in C^\infty(-1, 1)$?

5.57 † Let
$$f(x) = \begin{cases} e^{-1/x^2} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}$$
Prove:
a) $f'(0)$ exists and equals 0. (Hint: Use L'Hôpital's Rule. See Exercises 4.8 and 4.46.)
b) $f''(0)$ exists and equals 0.
c) $f^{(k)}(0)$ exists and equals 0 for all $k \in \mathbb{N}$. (Hint: Use induction to prove that $f^{(n)}(x)$ has a useful form for each $n \in \mathbb{N}$.)
5.58 Prove that $\sin x$ is analytic at 0 by showing that its Taylor series about $a = 0$ converges to $\sin x$ for all $x \in \mathbb{R}$.

5.59 Let
$$f(x) = \begin{cases} \dfrac{\sin x}{x} & \text{if } x \neq 0, \\ c & \text{if } x = 0. \end{cases}$$
Prove that by selecting $c$ suitably, $f$ becomes a real analytic function at 0, with a power series expansion in powers of $x$ that converges to $f(x)$ for all $x$.

5.60 Let the function $f(x)$ be as in Exercise 5.57. Define the function $g$ by
$$g(x) = f(x) f(x - 1).$$
Prove that $g \in C^\infty(\mathbb{R})$, $g(x) \ge 0$ for all $x$, and $g^{(k)}(0) = 0 = g^{(k)}(1)$ for all $k = 0, 1, 2, \ldots$. (See Fig. 5.2.)

Figure 5.2 $g(x) = f(x) f(x - 1)$.

5.61 Let
$$h(x) = \begin{cases} 0 & \text{if } x \le 0, \\ \dfrac{\int_0^x g(t)\,dt}{\int_0^1 g(t)\,dt} & \text{if } 0 < x < 1, \\ 1 & \text{if } x \ge 1, \end{cases}$$
where $g$ is as in Exercise 5.60 above. Prove that $h \in C^\infty(\mathbb{R})$, $h(x) = 0$ for all $x \le 0$, $h(x) = 1$ for all $x \ge 1$, $h^{(k)}(x) = 0$ at $x = 0$ and at $x = 1$ for all $k \in \mathbb{N}$, and $0 \le h(x) \le 1$ for all $x$.

5.62 †◊¹³ Let $[a, b]$ and $[c, d]$ denote any two closed finite intervals of the $x$- and $y$-axes, respectively. Construct a function $l \in C^\infty(\mathbb{R})$ such that $l(x) = c$ for all $x \le a$, $l(x) = d$ for all $x \ge b$, and $c \le l(x) \le d$ for all $x \in [a, b]$. (Hint: Adjust the function $h$ from Exercise 5.61 above. The reader can find extensions of these conclusions to $\mathbb{R}^n$ in [20].)
13 This exercise is used to prove Theorem 6.5.1, which establishes the convergence of Fourier series in the $L^2$-norm.

5.8 WEIERSTRASS APPROXIMATION THEOREM

We saw in Section 5.7 that if $f$ is real analytic at $a$, then $f \in C^\infty(a - R, a + R)$ and
$$f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!} (x - a)^k$$
for all $x \in (a - R, a + R)$, where $R > 0$ is the radius of convergence. Thus on every closed finite interval $[\alpha, \beta] \subset (a - R, a + R)$, the sequence of $n$th degree Taylor polynomials $P_n \to f$ uniformly. That is,
$$\|P_n - f\|_{\sup} \to 0$$
as $n \to \infty$. On the other hand, we saw an example of a function $f \in C^\infty(\mathbb{R})$ for which the sequence $P_n$ of Taylor polynomials converged everywhere to 0, not to $f$.

The following theorem is not a theorem about power series, but it is located here to stand in contrast to the theorems of the preceding two sections. We will see that every continuous function on a closed, finite interval $[a, b]$ is the uniform limit of some sequence of polynomials. This includes the function of Example 5.11 on $[-1, 1]$ for example, though the polynomials in this case could not be the Taylor polynomials.

Theorem 5.8.1 (Weierstrass Approximation Theorem) Let $a < b$, both $a$ and $b$ in $\mathbb{R}$. If $f \in C[a, b]$ then there exists a sequence of polynomials $p_n$ such that
$$\|f - p_n\|_{\sup} \to 0$$
as $n \to \infty$.

Proof: There are five parts to the proof of this famous theorem. The first two parts are reductions to a somewhat simpler case.

i. We claim it suffices to prove the theorem for $[0, 1]$. Suppose the theorem were true for $[0, 1]$ and $f \in C[a, b]$. Let
$$u = \frac{x - a}{b - a}$$
so that $u$ goes from 0 to 1 as $x$ goes from $a$ to $b$. Define
$$g(u) = f(x) = f((b - a)u + a).$$
Thus $g \in C[0, 1]$, and by hypothesis there exists a polynomial $p$ such that $|g(u) - p(u)| < \epsilon$ for all $u \in [0, 1]$. Thus
$$\left| f(x) - p\left( \frac{x - a}{b - a} \right) \right| < \epsilon$$
for all $x \in [a, b]$. But $p\left( \frac{x - a}{b - a} \right)$ is still a polynomial in $x$.

ii. We claim it suffices to prove the theorem only for $f \in C[0, 1]$ such that $f(0) = 0 = f(1)$. Suppose the theorem were known true when $f(0) = 0 = f(1)$. Let $f \in C[0, 1]$ be arbitrary and let $L(x) = f(0) + x[f(1) - f(0)]$. Then $f - L$ is still continuous and vanishes at the endpoints. By hypothesis, there exists a polynomial $p$ such that $\|f - L - p\|_{\sup} < \epsilon$. But $L + p$ is still a polynomial.

By virtue of parts (i) and (ii), we can assume henceforth without loss of generality that
$$f \in C_{(0,1)}(\mathbb{R}) = \{ f \in C(\mathbb{R}) \mid x \notin (0, 1) \implies f(x) = 0 \}.$$
Definition 5.8.1 Suppose for all $n \in \mathbb{N}$ we have a sequence $k_n$ in $\mathcal{R}[-1, 1]$ of nonnegative functions on $[-1, 1]$ such that
(a) $\int_{-1}^{1} k_n(x)\,dx = 1$ for all $n$, and
(b) for all $0 < \delta < 1$ we have
$$\|k_n\|_{\sup} \to 0$$
on $\{x \mid \delta \le |x| \le 1\}$.
Such a sequence of functions is called an approximate identity.

iii. We suppose for now that we have an approximate identity $k_n$ and we let
$$f_n(x) = \int_{-1}^{1} f(x + t) k_n(t)\,dt$$
for all $f \in C_{(0,1)}(\mathbb{R})$ and we claim that $\|f_n - f\|_{\sup} \to 0$ on $[0, 1]$ as $n \to \infty$. The function $f_n$ is called the convolution of $f$ with $k_n$. Each function $k_n$ can be called also the kernel function of the integral operator that produces $f_n$ given $f$ and $n$.

Let $\epsilon > 0$. By uniform continuity of $f$ (Exercise 5.66), there exists $\delta > 0$ such that $|t| \le \delta$ implies $|f(x) - f(x + t)| \le \epsilon/2$, for all $x \in [0, 1]$. Let $M = \|f\|_{\sup}$ on $[0, 1]$ and let $S_n = \|k_n\|_{\sup}$ on $\{x \mid \delta \le |x| \le 1\}$. Thus $S_n \to 0$ as $n \to \infty$. Then, since $\int_{-1}^{1} k_n(x)\,dx = 1$,
$$\begin{aligned}
|f(x) - f_n(x)| &= \left| f(x) - \int_{-1}^{1} f(x + t) k_n(t)\,dt \right| \\
&\le \int_{-1}^{1} |f(x) - f(x + t)| k_n(t)\,dt \\
&= \int_{-1}^{-\delta} |f(x) - f(x + t)| k_n(t)\,dt + \int_{-\delta}^{\delta} |f(x) - f(x + t)| k_n(t)\,dt + \int_{\delta}^{1} |f(x) - f(x + t)| k_n(t)\,dt \\
&< 2 M S_n + \frac{\epsilon}{2} + 2 M S_n \\
&= \frac{\epsilon}{2} + 4 M S_n < \frac{3\epsilon}{4}
\end{aligned}$$
if $n$ is taken sufficiently big to make $S_n < \frac{\epsilon}{16M}$. For $n$ this big,
$$\|f - f_n\|_{\sup} \le \frac{3\epsilon}{4} < \epsilon.$$
Thus $\|f - f_n\|_{\sup} \to 0$ as $n \to \infty$.
iv. Suppose $k_n$ and $f_n$ are as above. Suppose also that each function $k_n$ happens to be a polynomial. Then we claim that $f_n$ is also a polynomial. Letting $u = x + t$, and since $x \in [0, 1]$ and $f = 0$ off $[0, 1]$,
$$f_n(x) = \int_{x-1}^{x+1} f(u) k_n(u - x)\,du = \int_{0}^{1} f(u) k_n(u - x)\,du.$$
But $k_n(u - x)$ is a polynomial of some degree $N$ in $(u - x)$ and can be written in the form $q_0(u) + q_1(u) x + \cdots + q_N(u) x^N$. Thus
$$f_n(x) = \int_0^1 f(u) q_0(u)\,du + \cdots + \int_0^1 f(u) q_N(u)\,du \cdot x^N,$$
which is a polynomial in $x$.
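v. Let $Q_n(x) = c_n (1 - x^2)^n$, a polynomial in $x$ with $c_n$ chosen so that
$$\int_{-1}^{1} Q_n(x)\,dx = 1.$$
We claim that $Q_n$ is an approximate identity. Clearly, $Q_n(x) \ge 0$ on $[-1, 1]$ and $Q_n \in \mathcal{R}[-1, 1]$. See Fig. 5.3.

Figure 5.3 $Q_{10}(x)$.

(a) We claim that
$$(1 - x^2)^n \ge 1 - n x^2,$$
for all $x \in [0, 1]$. In fact, this was Exercise 4.15.

(b) We claim $c_n < \sqrt{n}$. To this end we calculate that
$$\int_{-1}^{1} (1 - x^2)^n\,dx = 2 \int_0^1 (1 - x^2)^n\,dx \ge 2 \int_0^{1/\sqrt{n}} (1 - x^2)^n\,dx \overset{(1)}{\ge} 2 \int_0^{1/\sqrt{n}} (1 - n x^2)\,dx = \frac{4}{3\sqrt{n}} > \frac{1}{\sqrt{n}},$$
so that $c_n < \sqrt{n}$. We have used the inequality from Exercise 4.15 to justify the step marked by (1).

(c) If $0 < \delta < 1$, then
$$Q_n(x) < \sqrt{n}\,(1 - \delta^2)^n$$
on $\delta \le |x| \le 1$. But $\sqrt{n}\,(1 - \delta^2)^n \to 0$ as $n \to \infty$ since
$$\sum_{n=1}^{\infty} \sqrt{n}\,(1 - \delta^2)^n$$
converges by the ratio test. Thus $\|Q_n\|_{\sup} \to 0$ on $\{x \mid \delta \le |x| \le 1\}$ as $n \to \infty$.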
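Parts (iii) and (v) of the proof can be illustrated numerically by convolving a sample function with the kernels $Q_n$. This is a rough sketch, not the author's construction in code: the test function $\sin \pi x$ on $(0, 1)$, the midpoint-rule quadrature, the grid sizes, and the name `kernel_approx` are all assumptions made for illustration, and the integrals are only approximated.

```python
import math

def f(x):
    # Sample function: continuous on R and identically 0 off (0, 1).
    return math.sin(math.pi * x) if 0.0 < x < 1.0 else 0.0

def kernel_approx(f, n, x, m=2000):
    """Approximate f_n(x) = integral over [-1, 1] of f(x + t) Q_n(t) dt.

    Q_n(t) = c_n (1 - t^2)^n, with c_n fixed by requiring that the same
    midpoint rule gives total integral 1 for Q_n.
    """
    h = 2.0 / m
    ts = [-1.0 + (j + 0.5) * h for j in range(m)]
    weights = [(1.0 - t * t) ** n for t in ts]
    c_n = 1.0 / (h * sum(weights))
    return c_n * h * sum(f(x + t) * w for t, w in zip(ts, weights))

grid = [i / 20 for i in range(21)]          # sample points in [0, 1]
errors = [
    max(abs(f(x) - kernel_approx(f, n, x)) for x in grid)
    for n in (10, 50, 200)
]
print(errors)
```

The maximum error over the grid decreases as $n$ grows, in line with $\|f - f_n\|_{\sup} \to 0$, though slowly: the largest errors occur near the endpoints 0 and 1, where the extended function has corners.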
•

EXERCISES
5.63
Show that the conclusion of the Weierstrass Theorem would be false for
f(x) = ~ on (0,1), thereby verifying the necessity of the use of a closed interval [a,b]. 5.64
Show that the Weierstrass Theorem would be false if we replaced $[a, b]$ with the infinite interval $\mathbb{R}$. (Hint: Consider $f(x) = e^x$ on $\mathbb{R}$.)

5.65 a) Suppose $f \in C[0, 1]$ has the property that
$$\int_0^1 f(x) x^k\,dx = 0$$
for all $k = 0, 1, 2, 3, \ldots$. Prove that $f(x) = 0$ on $[0, 1]$. (Hint: Deduce a conclusion about the value of $\int_0^1 f(x) p(x)\,dx$ if $p$ is any polynomial. Then apply the Weierstrass Approximation Theorem.)
b) Define $T_k(f) = \int_0^1 f(x) x^k\,dx$ for all $k = 0, 1, 2, 3, \ldots$. Prove: For each $k = 0, 1, 2, 3, \ldots$, $T_k$ is a bounded linear functional on $C[0, 1]$, equipped with the sup-norm.
c) Suppose $T_k(f) = T_k(g)$ for all $k = 0, 1, 2, 3, \ldots$, where $f$ and $g$ are in $C[0, 1]$. Prove: $f(x) \equiv g(x)$ on $[0, 1]$.

5.66 † Prove that every $f \in C_0(\mathbb{R})$ is uniformly continuous on $\mathbb{R}$.

5.67 Construct an approximate identity on $[-1, 1]$ consisting entirely of step functions, and justify your claims.

5.68 Let $f \in C[0, 1]$ and let $\phi_n(x) = (n + 1) x^n$ on $[0, 1]$. Let
$$T_f(\psi) = \int_0^1 \psi(x) f(x)\,dx$$
for all $\psi \in C[0, 1]$.
a) Prove: $\lim_{n \to \infty} T_f(\phi_n) = f(1)$ for all $f \in C[0, 1]$. (Hint: Find
$$\lim_{n \to \infty} \int_0^1 n x^{n+k}\,dx$$
for each nonnegative integer $k$. What could you conclude if $f$ were a polynomial?)
b) Prove that $\phi_n \to 0$ pointwise on $[0, 1] \setminus \{1\}$ but not uniformly.
c) Prove that $T_f$ is a bounded linear functional on the Banach space $C[0, 1]$, but $T_f(\phi_n)$ does not converge to $T_f(\lim_{n \to \infty} \phi_n)$.

5.69 Consider the vector space $P$ from Exercise 3.44. Determine whether or not $P$ is complete, with the norm from that exercise. Prove your conclusion.
5.9 TEST YOURSELF

EXERCISES

5.70 Determine absolute convergence, conditional convergence, or divergence:
a) $\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k(k+1)}$
b) $\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{\sqrt{k}}$

5.71 Give an example of $x_k \to 0$ for which $\sum_{k=1}^{\infty} (-1)^{k+1} x_k$ diverges.
5.72 Let f : 1R -+ 1R be a differentiable function such that False: There exists a real number p E IR such that f(p) = p. 5.73
llf'llsup =
~· True or
True or False: The conditionally summable sequences of real numbers form
a vector space. 5.74
Determine convergence or divergence: a)
E~z (ki~gk)
b) E~ 1 krk- 1 , where
5.75
lrl < 1.
Find two sequences, Xk and Yk such that
E~ 1 xk converges, whereas E~= 1 Yk diverges.
5.76
Xk+l Xk
-+ 1
and
Yk+l Yk
-+ 1,
but
Determine convergence or divergence: "'00
kk
a) LJk=o kf
b) E~2
(lo; k)k
00 ~ has a rearrangement that diverges 5.77 True or False: The series Ek= 1 - k2 to oo.
5.78
Give an example of two pairs of sequences Xk and Yk such that
and state the numerical values of the sums of each of the three series for each pair!
5.79 Let Xk and Yk be any two absolutely summable sequences, with the index k 2: 1. Write out the term c5 of the Cauchy product of x andy. 5.80
For each n E N define a sequence x(n) E
h by letting
jjx(n) II 1 for each fixed n E N. ll!kllsup and determine whether or not L:~ Mk
for all k E N. Calculate 5.81 Find Mk = each case. a) b)
L.::%: 1 e-kx on [2, oo ). L.::%: 1 sink x on [0, 1r /2].
converges, in
PART II
ADVANCED TOPICS IN ONE VARIABLE
CHAPTER 6

FOURIER SERIES¹⁴
Periodic (and nearly periodic) phenomena have played a large role in human activities since early recorded history. The cycles of day and night, the phases of the moon, the tides, the flooding of rivers, the seasons, and the migrations of birds and animals affect human beings' lives. Civilization as we know it is based upon agriculture, and this requires anticipation of the cycles in the world around us. Almost three thousand years ago Babylonian astronomers successfully predicted the times of lunar and solar eclipses by expressing these complicated events as summations of numerous simpler periodic events. These remarkable predictions were accurate to the extent of predicting eclipses that would be visible at least from some part of the world. Greek astronomers approximately two thousand years ago built a bronze device that predicted retrograde motions of the outer planets, which cause these planets to appear to reverse direction sometimes in their paths viewed against the background of distant stars. This device predicted the retrograde motions by compounding the effects of multiple periodic circular motions.

14 This chapter is not required for any subsequent chapters.
Advanced Calculus: An Introduction to Linear Analysis. By Leonard F. Richardson Copyright © 2008 John Wiley & Sons, Inc.
In modern physics the color of light is determined by the frequency of oscillation of an electromagnetic wave, and the pitch of a musical note is determined by the frequency of oscillation of a vibrating string, membrane, or air column in a musical instrument. But the timbre of a musical instrument is determined by combining numerous oscillations of different frequencies, each with its own characteristic amplitude. Thus the recognizable differences of concert tone A on a flute, a piano, or a violin reflect in each case the summation of many different frequencies of relatively small amplitude added to the lowest frequency, which determines the perceived pitch. In all the examples listed above, complex oscillations are analyzed as sums of many simple oscillations. This subject is known as harmonic analysis; it is known also as Fourier analysis after Joseph Fourier, who successfully applied it to analyze the flow of heat through a metal rod in 1822. There are many distinguished books about Fourier series, though most presume knowledge of the Lebesgue integral. The author recommends especially the book by Dym and McKean [6]. It provides a useful summary of the needed properties of the Lebesgue integral in the beginning.
6.1 THE VIBRATING STRING AND TRIGONOMETRIC SERIES

We will begin with an important example from physics,¹⁵ which explains why one might wish to express all functions from a broad class as the sums of infinite series of sine waves and cosine waves.

• EXAMPLE 6.1

Suppose a piece of music wire (such as a violin, guitar, or piano string) is stretched taut and fastened down at the endpoints of the interval $[0, 1]$ on the $x$-axis. Now either a finger or an implement is used to stretch the string away from the axis, into some initial plane curve described by a displacement function $f(x)$. At time $t = 0$ the string is released, and, depending on the manner in which the plucking or hammering of the string was done, the string has an initial velocity at each point $x \in [0, 1]$ given by $g(x)$. This leads to the following partial differential equation governing the function $u(x, t)$, which is the displacement of the string at coordinate $x$ and at time $t$:
$$\frac{\partial^2 u}{\partial t^2} = a^2 \frac{\partial^2 u}{\partial x^2}. \tag{6.1}$$
Here $a$ is a constant reflecting the tension to which the string is tightened and the density of the steel in the string. Since our purpose here is to motivate a mathematical problem and not to keep track of the effect of physical constants on the solution, we will take $a = 1$ henceforth. The boundary conditions on the string and the initial conditions on the displacement and velocity at time $t = 0$ are given as follows.

15 Neither first-year physics nor differential equations (ordinary or partial) is required for this course. The Example is presented for physical motivation and historical perspective.
$$\begin{cases} u(0, t) = 0, \\ u(1, t) = 0, \\ u(x, 0) = f(x), \\ \dfrac{\partial u}{\partial t}(x, 0) = g(x). \end{cases} \tag{6.2}$$

What is needed is a technique that may enable us to construct a function $u(x, t)$ which satisfies Equation (6.1) and conditions (6.2). A classic method of solution is to seek solutions having the special simplified form
$$u(x, t) = F(x) G(t),$$
where $F$ and $G$ must be determined. (This method is called separation of variables.) Substitution into Equation (6.1) yields
$$\frac{F''(x)}{F(x)} = \frac{G''(t)}{G(t)},$$
which proves that both sides must be equal to the same constant since the left side is independent of $t$ and the right side is independent of $x$. If that constant were a nonnegative number $\lambda^2$, then we would have
$$F''(x) = \lambda^2 F(x), \tag{6.3}$$
which has the general solution
$$F(x) = c_1 e^{\lambda x} + c_2 e^{-\lambda x}$$
when $\lambda > 0$. But then there would be no solution for $F(x)$ other than the identically zero solution given the boundary conditions (6.2). (See Exercise 6.5.) Thus we take the constant to be a negative number $-\lambda^2$, which, together with the boundary conditions, yields the solutions
$$F_n(x) = c \sin 2n\pi x \tag{6.4}$$
for some integer $n \in \mathbb{Z}$ and an arbitrary constant $c$. Similar work for the function $G$ yields a sequence of corresponding solutions
$$u_n(x, t) = (A_n \cos 2n\pi t + B_n \sin 2n\pi t) \sin 2n\pi x.$$
We observe that Equation (6.1) will be satisfied by any function of the form
$$u(x, t) = \sum_{n=0}^{\infty} (A_n \cos 2n\pi t + B_n \sin 2n\pi t) \sin 2n\pi x, \tag{6.5}$$
provided that all questions of convergence of the series itself and its twice derived series can be managed.
The goal of this example is to satisfy both the boundary conditions and the initial conditions in (6.2). In order to achieve this, we will need to be able to express both $f$ and $g$ as sums of infinite trigonometric series:
$$f(x) = \sum_{n=1}^{\infty} A_n \sin 2n\pi x$$
and
$$g(x) = \sum_{n=1}^{\infty} 2n\pi B_n \sin 2n\pi x.$$
f (x)
=
L an cos 2mrx + bn sin 2mrx
(6.6)
n=O
in a suitably convergent manner. (We have included the index n = 0 in the summation in order to allow for nonzero constant functions.) Also, we need to ask whether given such a function f we can find the coefficients an and bn. Theorem 6.1.1 If the series in Equation (6.6) converges uniformly on f, then f E C[O, 1],
[0, 1] to some
function
an= { bn = for each n > 0, and ao =
2J01 f(x)cos2mrxdx, 2 f0 f(x) sin 2mrx dx
f01 f(x) dx.
1
(6.7)
Moreover, if the series
00
L (- 2mran sin 2mrx + 2mrbn cos 2mrx) n=O
converges uniformly on [0, 1], then f is differentiable and 00
f'(x) =
L -2mran sin 2mrx + 2mrbn cos 2mrx.
(6.8)
n=O
Proof: Since each summand on the right side of Equation (6.6) is a continuous function on $[0, 1]$, the uniform convergence of this series guarantees the continuity of the sum, $f$, by Theorem 5.5.1. We can prove Equation (6.7) by using Theorem 5.5.1 to compute the integrals on the right sides as sums of infinite series of integrals, and showing that all but one integral vanishes. (See Exercise 6.7.) There will be no loss of generality if we take $b_0 = 0$ always. Equation (6.8) can be proven similarly, and this is left to Exercise 6.8.

•

Definition 6.1.1 A function $f : \mathbb{R} \to \mathbb{R}$ is called periodic with period $T$, provided that $f(x + T) = f(x)$ for all $x \in \mathbb{R}$.

If the trigonometric series in Equation (6.6) converges at least pointwise to a function $f$ on $[0, 1]$, then the same series converges for all $x \in \mathbb{R}$ to a function of period 1 since each summand of the series has period 1. Thus any function $f$ given as in Equation (6.6) can be viewed as being a periodic function of period 1 on the entire real line. We observe that if $\phi : [0, 1) \to \mathbb{R}$ is any function defined on the left-closed and right-open unit interval, then $\phi$ can be extended to a function $\phi_e$ of period 1 on the real line by letting $\phi_e(x) = \phi(x - \lfloor x \rfloor)$, where the floor function $\lfloor x \rfloor$ denotes the greatest integer that does not exceed $x$. A periodic function on the real line that is not constantly zero cannot be Riemann integrable, since its support is unbounded. However, we have the following concept.

Definition 6.1.2 If $f : \mathbb{R} \to \mathbb{R}$ is periodic with period $T$, we call $f$ Riemann integrable on the circle if its restriction to any closed interval of length $T$ is Riemann integrable.

The reader should note that if $f$ has period $T$, then $f(a) = f(a + T)$ for each real number $a$. Exercise 6.4 establishes that the integral of a periodic integrable function on the circle is independent of the choice of the interval $[a, a + T]$ over which the integration takes place. An interesting way to picture the domain of a periodic function geometrically (or topologically) is as follows. Imagine a closed interval $[a, a + T]$ of length $T$.
Picture the interval bent into a circle with the endpoints joined together to form a single point. This is a circle of circumference $T$. One can also picture this circle algebraically as the quotient group of the additive group of real numbers modulo the additive subgroup of integer multiples of $T$. That is, the circle of circumference $T$ can be pictured as $\mathbb{R}/T\mathbb{Z}$.

EXERCISES

6.1 Let $a < b$ and $\phi : [a, b) \to \mathbb{R}$.
a) Prove that $\phi$ can be extended to a periodic function $\phi_e$ defined on the entire real line, with period $T = b - a$.
b) Let $f(x) = \phi_e(a + (b - a)x)$ for each $x \in \mathbb{R}$. Prove that $f$ has period 1.
c) If $f$ can be expanded into a series according to Equation (6.6), derive a similar series expansion for $\phi_e$.
6.2
Prove that the series
converges uniformly to a periodic function $f$ of period $2\pi$ in $C^1(\mathbb{R})$, the continuously differentiable functions on the real line.
6.3 Prove that the series
$$\sum_{n=0}^{\infty} \frac{\sin 2n\pi x}{2^n}$$
converges uniformly to a periodic function $f$ of period 1 in $C^\infty(\mathbb{R})$, the infinitely differentiable functions on the real line.
6.4 † Suppose $f : \mathbb{R} \to \mathbb{R}$ has period $T$ and suppose that the restriction of $f$ to the interval $[0, T]$ is Riemann integrable. Prove that $f$ is Riemann integrable on every closed interval of length $T$ and that
$$\int_a^{a+T} f(x)\,dx = \int_0^T f(x)\,dx$$
for all $a \in \mathbb{R}$.
6.5 In Equation (6.3) prove that there is no nonconstant solution for $F$ if we take $\lambda$ negative. (Hint: Use the boundary conditions in Equation (6.2).)
6.6
Explain why negative integers n are not needed in Equation (6.5).
6.7
Prove Equation (6.7).
6.8
Prove Equation (6.8).
6.2
EULER'S FORMULA AND THE FOURIER TRANSFORM
We will see that it is helpful both computationally and conceptually to recast the concept of trigonometric series so that it applies to complex-valued functions as well as to real-valued ones. First we remind the reader of some elementary properties of the set $\mathbb{C}$ of complex numbers. The properties are listed in Table 6.1. The first six axioms listed are identical to the field axioms for the real numbers. Axiom 7 applies to $\mathbb{C}$ but not to $\mathbb{R}$. Because squares of complex numbers can be negative, there is no order relation for $\mathbb{C}$, and of course it follows that there is no Archimedean property for $\mathbb{C}$ either. Similarly, there is no concept of positivity or negativity for nonreal complex numbers, though these features are retained by the subfield of real numbers. We regard $\mathbb{R}$ as being a subset of $\mathbb{C}$ consisting of all those complex numbers having the form
$$x + i0 = x + 0 = x.$$
Definition 6.2.1 If $z = x + iy \in \mathbb{C}$, then we define the conjugate $\bar{z}$ of $z$ by
$$\bar{z} = x - iy$$
and the modulus of $z$ to be the nonnegative real number $|z|$ given by
$$|z|^2 = z\bar{z} = x^2 + y^2.$$
Table 6.1 Field of Complex Numbers

The set $\mathbb{C} = \{x + iy \mid x \in \mathbb{R},\ y \in \mathbb{R}\}$ of all complex numbers is a field with two operations, called addition and multiplication. These satisfy the following properties:
i. Closure: If $w$ and $z$ are elements of $\mathbb{C}$, then $w + z \in \mathbb{C}$ and $wz \in \mathbb{C}$.
ii. Commutativity: If $w$ and $z$ are elements of $\mathbb{C}$, $w + z = z + w$ and $wz = zw$.
iii. Associativity: If $v$, $w$, and $z$ are elements of $\mathbb{C}$, $v + (w + z) = (v + w) + z$ and $v(wz) = (vw)z$.
iv. Distributivity: If $v$, $w$, and $z$ are elements of $\mathbb{C}$, $v(w + z) = vw + vz$.
v. Identity: There exist elements 0 and 1 in $\mathbb{C}$ such that $0 + z = z$ and $1z = z$, for all $z \in \mathbb{C}$. Moreover, $0 \neq 1$.
vi. Inverses: If $z \in \mathbb{C}$, there exists $-z \in \mathbb{C}$ such that $-z + z = 0$. Also, for all $z \neq 0$, there exists $z^{-1} = \frac{1}{z} \in \mathbb{C}$ such that $z \cdot \frac{1}{z} = 1$.
vii. The number $i^2 = -1$, the additive inverse of the number 1.

We call $x$ and $y$, respectively, the real and the imaginary parts of $z = x + iy$, and these are denoted by
$$x = \Re(z), \qquad y = \Im(z).$$
We remark that $z\bar{z}$ is a nonnegative real number for all $z \in \mathbb{C}$. The complex numbers can be modeled conveniently in a geometrical manner by identifying $z = x + iy$ with the point $(x, y)$ in the Cartesian plane. We think of the $x$-axis as the real axis and the $y$-axis as the imaginary axis in this picture. The plane can be equipped with polar coordinates $(r, \theta)$ as well as with rectangular coordinates $(x, y)$. The relationship between these two systems of coordinates is that $x = r\cos\theta$ and $y = r\sin\theta$. Thus each complex number $z$ can be written in the polar form
$$z = r(\cos\theta + i\sin\theta).$$
Definition 6.2.2 A sequence of complex numbers $z_n$ is said to converge to $z \in \mathbb{C}$ if and only if $|z_n - z| \to 0$ as $n \to \infty$.

It is easily verified that $z_n \to z$ if and only if both $\Re z_n \to \Re z$ and $\Im z_n \to \Im z$. (See Exercise 6.10.)
Definition 6.2.3 An infinite series $\sum_{n=0}^{\infty} z_n$ of complex numbers is said to converge to a complex number $S$ if and only if the sequence
$$S_N = \sum_{n=0}^{N} z_n$$
converges to $S$.

We leave it to Exercise 6.11 for the reader to prove that $\sum_{n=0}^{\infty} z_n$ converges to a complex number $S$ if and only if both $\sum_{n=0}^{\infty} \Re z_n$ converges to $\Re S$ and $\sum_{n=0}^{\infty} \Im z_n$ converges to $\Im S$.

Definition 6.2.4 For each complex number $z \in \mathbb{C}$ we define
$$e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!}, \tag{6.9}$$
which is shown to converge in Theorem 6.2.1.
Theorem 6.2.1 For each complex number $z$, the series in Equation (6.9) converges. Moreover, for each real number $x$, we have Euler's formula:
$$e^{ix} = \sum_{n=0}^{\infty} \frac{(ix)^n}{n!} = \cos x + i\sin x. \tag{6.10}$$
Proof: We note first that $\sum_{n=0}^{\infty} \frac{|z|^n}{n!}$ converges since the real exponential series converges absolutely for every real number. But it is immediate that $|\Re(z^n)| \le |z|^n$ and also that $|\Im(z^n)| \le |z|^n$. Thus the sum of the real parts in Equation (6.9) converges absolutely, as does the sum of the imaginary parts. Now we apply Exercise 6.11 to the series appearing in Equation (6.9) in order to conclude that the series expansion that defines $e^z$ converges for every $z \in \mathbb{C}$. For Euler's formula, we observe that
$$i^n = \begin{cases} (-1)^k & \text{if } n = 2k, \\ (-1)^k i & \text{if } n = 2k + 1. \end{cases}$$
It follows that
$$e^{ix} = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k}}{(2k)!} + i\sum_{k=0}^{\infty} \frac{(-1)^k x^{2k+1}}{(2k+1)!} = \cos x + i\sin x.$$
∎
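As a numerical illustration of Theorem 6.2.1, the following Python sketch sums finitely many terms of the series in Equation (6.9) at $z = ix$ and compares the result with $\cos x + i\sin x$. The names `exp_series` and `euler_gap`, and the choice of 40 terms, are assumptions made for this illustration and do not come from the text.

```python
import math

def exp_series(z, terms=40):
    """Partial sum of sum_{n>=0} z**n / n! (Equation (6.9)) using `terms` terms.

    The number of terms is an illustrative assumption, not a value from the text.
    """
    total = 0j
    term = 1 + 0j                  # the n = 0 term, z**0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)        # turn z**n / n! into z**(n+1) / (n+1)!
    return total

def euler_gap(x, terms=40):
    """Modulus of (partial sum at z = ix) minus (cos x + i sin x)."""
    return abs(exp_series(1j * x, terms) - complex(math.cos(x), math.sin(x)))

if __name__ == "__main__":
    for x in (0.0, 1.0, math.pi / 2, math.pi):
        print(x, euler_gap(x))
```

For moderate values of $|x|$ the gap should be at the level of floating-point rounding, which is consistent with, but of course does not prove, Equation (6.10).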
Corollary 6.2.1 For each real number $x$, we have
$$\cos x = \frac{e^{ix} + e^{-ix}}{2} \tag{6.11}$$
$$\sin x = \frac{e^{ix} - e^{-ix}}{2i} \tag{6.12}$$
Proof: Apply Euler's formula. ∎
We observe next that the $N$th partial sum of the trigonometric series in Equation (6.6) can be rewritten using Corollary 6.2.1 in the form
$$\sum_{n=0}^{N} \left(a_n \cos 2\pi nx + b_n \sin 2\pi nx\right) = \sum_{n=-N}^{N} c_n e^{2\pi inx}, \tag{6.13}$$
where $c_0 = a_0$ and
$$c_n = \frac{a_{|n|} - i\,\mathrm{sgn}(n)\,b_{|n|}}{2}$$
if $n \neq 0$. (Here sgn denotes the signum function.) We would like to convert the formulae in Equation (6.7) for $a_n$ and $b_n$ into direct formulae for the calculation of $c_n$. In order to do this, we will need a suitable definition for the integral of a complex-valued function of a real variable.
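The passage from the real form to the complex form in Equation (6.13) can be checked numerically. The sketch below builds the coefficients $c_n$ from given $a_n$ and $b_n$ (with $b_0 = 0$) by the rule stated above and evaluates both sides of Equation (6.13). The function names and the sample coefficients are hypothetical choices made for illustration only.

```python
import cmath
import math

def complex_coefficients(a, b):
    """Map real coefficients a[0..N], b[0..N] (with b[0] = 0) to {n: c_n}, |n| <= N.

    Uses c_0 = a_0 and c_n = (a_|n| - i sgn(n) b_|n|) / 2 for n != 0.
    """
    c = {0: complex(a[0])}
    for n in range(1, len(a)):
        c[n] = (a[n] - 1j * b[n]) / 2      # sgn(n) = +1
        c[-n] = (a[n] + 1j * b[n]) / 2     # sgn(-n) = -1
    return c

def real_form(a, b, x):
    """Left side of (6.13): sum of a_n cos(2 pi n x) + b_n sin(2 pi n x)."""
    return sum(a[n] * math.cos(2 * math.pi * n * x)
               + b[n] * math.sin(2 * math.pi * n * x)
               for n in range(len(a)))

def complex_form(c, x):
    """Right side of (6.13): sum of c_n e^{2 pi i n x} over n = -N..N."""
    return sum(cn * cmath.exp(2j * math.pi * n * x) for n, cn in c.items())

if __name__ == "__main__":
    a = [0.5, 1.0, -2.0, 0.25]     # sample values, chosen arbitrarily
    b = [0.0, 3.0, 0.5, -1.0]
    c = complex_coefficients(a, b)
    for x in (0.0, 0.1, 0.37):
        print(x, real_form(a, b, x), complex_form(c, x))
```

When $a_n$ and $b_n$ are real, the rule gives $c_{-n} = \overline{c_n}$, so the imaginary parts on the right side cancel in pairs.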
Definition 6.2.5 Suppose that $f : [a, b] \to \mathbb{C}$, and write
$$f(x) = \Re f(x) + i\,\Im f(x)$$
for each $x \in [a, b]$. If the two real-valued functions $\Re f(x)$ and $\Im f(x)$ are both Riemann integrable on $[a, b]$, then we say that $f \in \mathcal{R}([a, b], \mathbb{C})$, and we define
$$\int_a^b f(x)\,dx = \int_a^b \Re f(x)\,dx + i\int_a^b \Im f(x)\,dx.$$
Moreover, for each $k \in \mathbb{Z}$ we define the $k$th character by
$$\chi_k(x) = e^{2\pi ikx}. \tag{6.14}$$
Remark 6.2.1 Each function $\chi_k$ maps $\mathbb{R} \to \mathbb{C}$ in such a way that $\chi_k : \mathbb{Z} \to \{1\}$. In algebraic terms, this means that each function $\chi_k$ is well-defined on the additive quotient group $\mathbb{R}/\mathbb{Z}$, which is interpreted geometrically as a circle of circumference 1. Note that $\chi_{-k}(x) = \overline{\chi_k(x)}$, the complex conjugate of $\chi_k(x)$. See Exercise 6.23.
Corollary 6.2.2 If $f$ is a Riemann integrable function on $[0, 1]$ with uniformly convergent trigonometric series, then the $N$th partial sum of that series can be expressed in the form
$$S_N = \sum_{n=-N}^{N} c_n \chi_n(x),$$
where
$$c_n = \int_0^1 f(x)\overline{\chi_n(x)}\,dx$$
for each integer $n$, positive, negative, or zero.

We leave the proof of this corollary to the reader in Exercise 6.13. The reader should note that as $N \to \infty$ the summation that defines $S_N$ leads to what is naturally expressed as a sum from $-\infty$ to $\infty$. This leads us to the formulation of the general concept of the Fourier series of any Riemann integrable function $f$ on $[0, 1]$, without making any claims initially regarding convergence of that series to such a general function.
Definition 6.2.6 Let $f \in \mathcal{R}([0, 1], \mathbb{C})$. We define the Fourier series of $f$ to be the (doubly) infinite series
$$S(f) = \sum_{n=-\infty}^{\infty} \hat{f}(n)\chi_n(x), \tag{6.15}$$
where we define the numbers $\hat{f}(n)$, called the Fourier coefficients of $f$, by
$$\hat{f}(n) = \int_0^1 f(x)\overline{\chi_n(x)}\,dx \tag{6.16}$$
for each $n \in \mathbb{Z}$. The function $\hat{f} : \mathbb{Z} \to \mathbb{C}$ is called the Fourier transform of $f$. Even if the doubly infinite summation in Equation (6.15) were not convergent in any sense, we would call
$$S_N(f) = \sum_{n=-N}^{N} \hat{f}(n)\chi_n(x)$$
the $N$th partial sum of the Fourier series of the Riemann integrable function $f$.

We note in advance that in Theorem 6.5.1 we will prove that the Fourier series of every Riemann integrable function does in fact converge to $f$ in the sense of the $L^2$-norm, which will be defined later. For the case of pointwise convergence, we observe that since the summation in Equation (6.15) is over a countable set of indices, it will be independent of order, provided that it is absolutely$^{16}$ convergent. Thus there will be no loss of generality if we understand that the double summation in Equation (6.15) means the limit of $S_N$ as $N \to \infty$, whether the convergence be pointwise, uniform, or in the sense of the $L^2$-norm, which remains to be defined. In the next section we will begin to investigate the convergence of $S_N$. First we will need to know that the standard computational properties of the exponential function carry over to the case of complex exponents.

$^{16}$ An infinite series of complex numbers $c_n$ is called absolutely convergent, provided that the sum of $|c_n|$ is finite, where the symbol that looks like an absolute value actually means modulus in this context. The theorems concerning independence of order of summation apply to absolutely convergent series of complex numbers as well as to real series. The proofs are identical, even down to the use of the same absolute value symbol for the modulus.

Theorem 6.2.2 If $w$ and $z$ are complex numbers, then $e^{w+z} = e^w e^z$.
Remark 6.2.2 In Exercise 5.27 the reader showed that the multiplicative property of the exponential function
$$e^{x+y} = e^x e^y$$
for all real numbers $x$ and $y$ follows from the binomial theorem. We will not require that exercise for what follows. The proof below of the multiplicative property of $e^z$ for all complex numbers $z$ is based on the multiplicative property of $e^x$ for all real numbers $x$, which we assume to be well known to the reader.

Proof: Because $e^x e^y = e^{x+y}$, the Cauchy product of the series expansions on the left-hand side yields
$$\sum_{n=0}^{\infty} \sum_{k=0}^{n} \frac{x^k y^{n-k}}{k!\,(n-k)!} = \sum_{n=0}^{\infty} \frac{(x+y)^n}{n!}.$$
Since real convergent power series are unique, it follows that
$$\sum_{k=0}^{n} \frac{x^k y^{n-k}}{k!\,(n-k)!} = \frac{(x+y)^n}{n!},$$
which can be observed to follow alternatively from the binomial theorem. The complex series
$$e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!}$$
is said to converge absolutely, though for complex series this means that the sum of the moduli converges. It follows that the Cauchy product can be applied to the expansions of $e^w$ and $e^z$, and this results in the conclusion of the theorem. ∎

Definition 6.2.7 If $f(x) = u(x) + iv(x)$ maps an interval $I$ to $\mathbb{C}$, where $u$ and $v$ are real-valued, we say that $f$ is differentiable on $I$ if $u$ and $v$ are differentiable, and then we define $f'(x) = u'(x) + iv'(x)$.
In Exercise 6.12 the reader will show that
$$\frac{d\,e^{(a+ib)x}}{dx} = (a + ib)e^{(a+ib)x}, \tag{6.17}$$
$$\int_c^x e^{(a+ib)t}\,dt = \left.\frac{e^{(a+ib)t}}{a + ib}\right|_{t=c}^{x} \tag{6.18}$$
for any $c$ and $x$ in $I$, except that Equation (6.18) requires $a + ib \neq 0$.
EXERCISES

6.9 Let $z = x + iy$ and $z' = x' + iy'$ be complex numbers. Show that
a) $\overline{zz'} = \bar{z}\,\overline{z'}$.
b) $|zz'| = |z||z'|$.
c) $|z + z'| \le |z| + |z'|$.
d) $\Re(zz') = xx' - yy'$.
e) $\Im(zz') = xy' + x'y$.
f) $\chi_{-n}(x) = \overline{\chi_n(x)}$.

6.10 † Use Definition 6.2.2 to prove that a complex sequence $z_n \to z$ if and only if both $\Re z_n \to \Re z$ and $\Im z_n \to \Im z$.

6.11 † Use Definition 6.2.3 to prove that a complex series $\sum_{n=0}^{\infty} z_n$ converges to a complex number $S$ if and only if both
$$\sum_{n=0}^{N} \Re(z_n) \to \Re S \quad\text{and}\quad \sum_{n=0}^{N} \Im(z_n) \to \Im S$$
as $N \to \infty$.
6.12 † Prove Equations (6.17) and (6.18) for any real numbers $a$ and $b$ and for any $c$ and $x$ in $I$. For the second formula assume $a + ib \neq 0$. (Hint: Use Equation (6.9), or else use Euler's formula (Theorem 6.2.1) together with the multiplicative property of the exponential function.)

6.13 Prove Corollary 6.2.2.

6.14 Use Euler's formula together with Theorem 6.2.2 to prove that
$$\cos(x + y) = \cos x\cos y - \sin x\sin y \quad\text{and}\quad \sin(x + y) = \sin x\cos y + \cos x\sin y.$$

6.15 Give an example of $f \in \mathcal{R}([a, b], \mathbb{C})$ for which $\int_a^b f(x)^2\,dx < 0$.

6.16 † Show that $\left|\int_a^b f(x)\,dx\right| \le \int_a^b |f(x)|\,dx$ even for $f \in \mathcal{R}([a, b], \mathbb{C})$. (See Definition 6.2.5.)

6.17 † Use the result of Exercise 6.12 to prove that for all integers $n$ and $m$ we have
$$\int_0^1 \chi_n(x)\overline{\chi_m(x)}\,dx = \begin{cases} 0 & \text{if } n \neq m, \\ 1 & \text{if } n = m. \end{cases}$$
This equation establishes that the functions $\chi_n(x)$ are what are called mutually orthogonal characters. The set $\{\chi_k \mid k \in \mathbb{Z}\}$ is called an orthonormal set.

6.18 † Suppose $f \in \mathcal{R}[0, 1]$ has period 1, and let $a \in \mathbb{R}$. Define the $a$-translate $f_a$ of $f$ by $f_a(x) = f(x + a)$. Let $k \in \mathbb{Z}$ and define the $k$-rotation $f\chi_k$ of $f$ by $(f\chi_k)(x) = e^{2\pi ikx} f(x)$. Prove that the Fourier transform converts translations into rotations, and rotations into translations, in the following sense.
a) $\widehat{f_a}(n) = \chi_n(a)\hat{f}(n)$.
b) $\widehat{f\chi_k}(n) = \hat{f}(n - k)$.

6.19 Calculate the Fourier transform $\hat{f}(n)$ for each $n \in \mathbb{Z}$ if $f$ is a periodic function of period 1 defined on an interval of length 1 as shown below. Write also the formal Fourier series $S(f)$ for each given function. (Integration by parts is sometimes helpful for these problems.)
a) Let
$$f(x) = 1_{[a,b]}(x) = \begin{cases} 1 & \text{if } x \in [a, b], \\ 0 & \text{if } x \in [0, 1) \setminus [a, b], \end{cases}$$
where $0 \le a < b < 1$.
b) $f(x) = 1 + \cos 2\pi nx + \sin 4\pi nx$ on $[0, 1)$.
c) $f(x) = x$ on $[0, 1)$.
d) $f(x) = x$ on $\left[-\tfrac{1}{2}, \tfrac{1}{2}\right)$.
e) $f(x) = x^2$ on $\left[-\tfrac{1}{2}, \tfrac{1}{2}\right)$.
f) $f(x) = e^x$ on $[0, 1)$.

6.20 Suppose that $f$ is an even function on $\left[-\tfrac{1}{2}, \tfrac{1}{2}\right]$ and $g$ is an odd function on the same domain. Prove that
a) $\hat{f}$ is even.
b) $\hat{g}$ is odd.
6.21 For the periodic function of period 1 given by $f(x) = x$ for $0 \le x < 1$, find
$$\lim_{N \to \infty} S_N(x) = \sum_{n=-\infty}^{\infty} \hat{f}(n)\chi_n(x)$$
at the point $x = 0$. Is the value of this sum the same as $f(0)$? (You can use the data from Exercise 6.19.)

6.22 Suppose that the function $f \in \mathcal{R}[0, 1]$ is a periodic function with period 1, and suppose that $f$ is real-valued. Prove that
a) $\hat{f}(-n) = \overline{\hat{f}(n)}$.
b) $S_N(f)$ is real-valued for each $N \in \mathbb{N}$.

6.23 The functions $\chi_n(x)$ defined in Equation (6.14) are called characters, for which reason the Greek letter $\chi$ is used to represent them.$^{17}$ Show that $\chi_n(x)$ is a homomorphism of the additive group $\mathbb{R}/\mathbb{Z}$ onto the multiplicative group of complex numbers of modulus 1. That is, show that
$$\chi_n(x + y) = \chi_n(x)\chi_n(y).$$
$^{17}$ This exercise is only for students who have studied some abstract algebra. It is not necessary for any subsequent topics.
6.3
BESSEL'S INEQUALITY AND $l_2$
Bessel's inequality is very important in the study of Fourier series, especially with regard to convergence. In order to introduce this inequality, we need the structure of an inner product space for the space of complex-valued Riemann integrable functions. In Definition 6.2.5 we defined the integral of any complex-valued function $f$ for which $\Re f$ and $\Im f$ are both Riemann integrable. We say that such functions $f$ are in the space $\mathcal{R}([a, b], \mathbb{C})$. And in Definition 3.4.1 we defined the inner product (or scalar product) of any two real-valued Riemann integrable functions as
$$\langle f, g \rangle = \int_a^b f(x)g(x)\,dx.$$
This definition will not suffice for the purpose of making the Riemann integrable complex-valued functions into an inner product space. The difficulty is that
$$\langle f, f \rangle = \int_a^b f(x)^2\,dx$$
can be negative when $f$ is complex-valued (compare Exercise 6.15) and is thus not the square of a norm. We need to show how one can define a special kind of inner product (called a Hermitian inner product) on the vector space $\mathcal{R}([a, b], \mathbb{C})$ over the complex numbers. (See Exercise 6.24.)
Definition 6.3.1 In any complex vector space $V$ (Table 2.1 on page 59, using complex scalars), we call a function
$$\langle \cdot, \cdot \rangle : V \times V \to \mathbb{C}$$
a Hermitian scalar product$^{18}$ if and only if it has the following three properties.
i. $\langle ax + y, z \rangle = a\langle x, z \rangle + \langle y, z \rangle$ for all $a \in \mathbb{C}$ and for all $x$, $y$, and $z$ in $V$. (Linearity in the First Variable)
ii. $\langle x, y \rangle = \overline{\langle y, x \rangle}$ for all $x$ and $y$ in $V$. (Conjugate Symmetry)
iii. $\langle x, x \rangle \ge 0$ for all $x \in V$, and $\langle x, x \rangle = 0 \Leftrightarrow x = 0 \in V$. (Positive Definiteness)

$^{18}$ This may be called also a Hermitian inner product.

We show next how to introduce the structure of a Hermitian scalar product in $\mathcal{R}([a, b], \mathbb{C})$.

Definition 6.3.2 In the vector space $\mathcal{R}([a, b], \mathbb{C})$, we define
$$\langle f, g \rangle = \int_a^b f(x)\overline{g(x)}\,dx \tag{6.19}$$
and we define the $L^2$-norm$^{19}$ of $f \in \mathcal{R}([a, b], \mathbb{C})$ by
$$\|f\|_2 = \sqrt{\langle f, f \rangle} = \left(\int_a^b |f(x)|^2\,dx\right)^{\frac{1}{2}}. \tag{6.20}$$
We will see shortly that this product is a Hermitian scalar product, with one subtle proviso to be described momentarily. We see that
$$\langle f, f \rangle = \int_a^b |f(x)|^2\,dx \ge 0$$
since $f(x)\overline{f(x)} = |f(x)|^2$, where the vertical bars in $|f(x)|$ connote the modulus of the complex value of $f(x)$. Also, we see that in the special case in which $f$ and $g$ happen to be real-valued, the Hermitian inner product $\langle f, g \rangle$ is the same as the inner product defined for real-valued integrable functions in Section 3.4. Properties (i) and (ii) of the definition of a Hermitian inner product are left to the reader to verify in Exercise 6.25. The one property that is not satisfied by $\mathcal{R}([a, b], \mathbb{C})$ is positive definiteness: we could have $\langle f, f \rangle = 0$ without having $f = 0$. We can remedy this deficiency if we agree to understand that two Riemann integrable functions are to be considered equivalent provided that the integral of the absolute value of their difference is zero. Thus it is really the equivalence classes that are the vectors in the Hermitian inner product space $\mathcal{R}([a, b], \mathbb{C})$.

Definition 6.3.3 Let $f$ and $g$ be in the space $\mathcal{R}([a, b], \mathbb{C})$. We say that $f$ is equivalent to $g$, written as $f \sim g$, provided that $\int_a^b |f(x) - g(x)|\,dx = 0$.
See Exercise 6.28.

Theorem 6.3.1 In a complex vector space $V$ equipped with a Hermitian scalar product as defined in Definition 6.3.1, we define
$$\|x\| = \sqrt{\langle x, x \rangle} \tag{6.21}$$
for all vectors $x \in V$. The function $\|\cdot\|$ as defined in Equation (6.21) is a norm, as in Definition 2.4.4, where scalars $c$ are taken as complex and $|c|$ is interpreted as the modulus of $c$. Moreover, the Cauchy-Schwarz inequality is satisfied:
$$|\langle x, y \rangle| \le \|x\|\,\|y\|.$$
Proof: To prove the Cauchy-Schwarz inequality, we fix $x$ and $y$, and we proceed as follows. If $x = 0$ the Cauchy-Schwarz inequality is trivial. So suppose $x \neq 0$. Observe that $\langle cx + y, cx + y \rangle \ge 0$ for all $c \in \mathbb{C}$. By linearity of the scalar product in the first variable and conjugate-linearity in the second variable we see that
$$\|x\|^2 |c|^2 + 2\Re\left(c\langle x, y \rangle\right) + \|y\|^2 \ge 0$$
for all $c \in \mathbb{C}$. Let
$$c = -\frac{\overline{\langle x, y \rangle}}{\|x\|^2}.$$
An easy calculation shows that $|\langle x, y \rangle|^2 \le \|x\|^2 \|y\|^2$.

The first two conditions of Definition 2.4.4 are easily verified for $\|\cdot\|$. The third condition, the triangle inequality, is left for Exercise 6.27. ∎

$^{19}$ This norm is named for Henri Lebesgue, inventor of the Lebesgue integral. Here we use the norm only in the context of the Riemann integral, however.
We are ready to introduce Bessel's inequality. The reader should recall from Exercise 6.17 that the characters $\chi_k$ with $k \in \mathbb{Z}$ comprise an orthonormal set with respect to the Hermitian inner product. Moreover, the Fourier coefficients of a Riemann integrable function $f$ are defined by Equation (6.16) in such a way that
$$\hat{f}(k) = \langle f, \chi_k \rangle$$
using the Hermitian inner product.

Theorem 6.3.2 (Bessel's inequality) Let $f \in \mathcal{R}([0, 1], \mathbb{C})$ be a periodic function of period 1. Then we have Bessel's inequality
$$\sum_{k=-\infty}^{\infty} |\hat{f}(k)|^2 \le \|f\|_2^2, \tag{6.22}$$
with the right side being necessarily finite. Moreover, we have
$$\left\| f - \sum_{k=-n}^{n} \hat{f}(k)\chi_k \right\|_2^2 = \|f\|_2^2 - \sum_{k=-n}^{n} |\hat{f}(k)|^2,$$
so that equality holds in Bessel's inequality if and only if
$$\left\| f - \sum_{k=-n}^{n} \hat{f}(k)\chi_k \right\|_2 \to 0 \tag{6.23}$$
as $n \to \infty$.

Remark 6.3.1 We will show in Theorem 6.5.1 that the limit shown in (6.23) exists and is zero. Then we will have established equality in Bessel's inequality. The resulting equality is called the Plancherel identity.

Proof: We calculate carefully the following nonnegative Hermitian scalar product:
$$\left\langle f - \sum_{k=-n}^{n} \hat{f}(k)\chi_k,\ f - \sum_{k=-n}^{n} \hat{f}(k)\chi_k \right\rangle = \|f\|_2^2 - \sum_{k=-n}^{n} \hat{f}(k)\overline{\hat{f}(k)} = \|f\|_2^2 - \sum_{k=-n}^{n} |\hat{f}(k)|^2 \ge 0.$$
Thus the partial sums of the terms $|\hat{f}(k)|^2$ are all bounded above by $\|f\|_2^2$, and this yields Bessel's inequality. ∎
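Bessel's inequality can be observed numerically. The sketch below approximates the Fourier coefficients of Equation (6.16) by a midpoint rule and compares the partial sums of $|\hat{f}(k)|^2$ with $\|f\|_2^2$. The sample function $f(x) = x$ on $[0, 1)$, the function names, and the number of sample points are all assumptions made for illustration; the midpoint rule only approximates the integrals in the text.

```python
import cmath
import math

def fourier_coefficient(f, k, samples=2000):
    """Midpoint-rule approximation of the integral over [0, 1] of f(x) * conj(chi_k(x))."""
    total = 0j
    for j in range(samples):
        x = (j + 0.5) / samples
        total += f(x) * cmath.exp(-2j * math.pi * k * x)   # conj(chi_k) = e^{-2 pi i k x}
    return total / samples

def norm_squared(f, samples=2000):
    """Midpoint-rule approximation of ||f||_2^2 on [0, 1]."""
    return sum(abs(f((j + 0.5) / samples)) ** 2 for j in range(samples)) / samples

def bessel_partial_sum(f, n, samples=2000):
    """Sum of |f^(k)|^2 for k = -n..n, using the approximate coefficients."""
    return sum(abs(fourier_coefficient(f, k, samples)) ** 2 for k in range(-n, n + 1))

if __name__ == "__main__":
    f = lambda x: x                     # sample function, an illustrative choice
    print("||f||^2    ~", norm_squared(f))
    for n in (1, 10, 50):
        print("n =", n, " sum ~", bessel_partial_sum(f, n))
```

The partial sums should increase with $n$ while staying below the approximate value of $\|f\|_2^2$, in line with Equation (6.22).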
Definition 6.3.4 We define the space $l_2$ of square-summable complex double sequences as follows:
$$l_2 = \left\{ c : \mathbb{Z} \to \mathbb{C} \ \middle|\ \sum_{k=-\infty}^{\infty} |c_k|^2 < \infty \right\}.$$
Remark 6.3.2 We should observe that the Fourier transform, denoted by $\mathcal{F} : f \to \hat{f}$, is a mapping of $\mathcal{R}([0, 1], \mathbb{C})$ into $l_2$. One might hope that the Fourier transform maps $\mathcal{R}([0, 1], \mathbb{C})$ onto $l_2$, but this is not true. It is commonly proven in a graduate course in harmonic analysis that the Fourier transform can be extended to the much larger space of all Lebesgue measurable square-integrable functions, which is denoted $L^2$, and that the Fourier transform is an invertible map from $L^2$ onto $l_2$. However, the development of the Lebesgue integral requires approximately one semester at the graduate level.

The space $l_2$ of square-summable double sequences has many interesting properties.

Theorem 6.3.3 The space $l_2$ is a vector space over $\mathbb{C}$, and for each pair of elements $c$ and $d$ in $l_2$, we can define a Hermitian inner product by the formula
$$\langle c, d \rangle = \sum_{k=-\infty}^{\infty} c_k \overline{d_k} \tag{6.24}$$
and a norm by the formula
$$\|c\|_2 = \sqrt{\langle c, c \rangle} = \left(\sum_{k=-\infty}^{\infty} |c_k|^2\right)^{\frac{1}{2}}$$
that satisfies the Cauchy-Schwarz inequality
$$|\langle c, d \rangle| \le \|c\|_2 \|d\|_2.$$
Proof: The first task is to show that the sum in Equation (6.24) is convergent. In fact, we will show that
$$\sum_{k=-\infty}^{\infty} |c_k \overline{d_k}| \le \|c\|_2 \|d\|_2, \tag{6.25}$$
and this will establish absolute convergence. For each $c \in l_2$, define the truncation
$${}_n c = (c_{-n}, \ldots, c_0, \ldots, c_n) \in \mathbb{C}^{2n+1}.$$
For these truncated sequences there is no convergence problem, and it is easy to see that we define a Hermitian scalar product by the formula
$$\langle {}_n c, {}_n d \rangle = \sum_{k=-n}^{n} c_k \overline{d_k}.$$
Theorem 6.3.1 can be applied to the truncated sequences to prove that
$$\sum_{k=-n}^{n} |c_k|\,|d_k| \le \|{}_n c\|_2\, \|{}_n d\|_2 \le \|c\|_2 \|d\|_2$$
for each value of $n \in \mathbb{N}$. This establishes Equation (6.24) and it establishes the Cauchy-Schwarz inequality
$$|\langle c, d \rangle| \le \|c\|_2 \|d\|_2$$
for $l_2$ at the same time. It remains only to prove that $l_2$ is a vector space, which we leave to the reader in Exercise 6.30. ∎
EXERCISES

6.24 Prove that $\mathcal{R}([a, b], \mathbb{C})$ satisfies all the axioms to be a vector space over the field of complex numbers.

6.25 Prove that $\mathcal{R}([a, b], \mathbb{C})$ satisfies properties (i) and (ii) of Definition 6.3.1.

6.26 † Prove that in a Hermitian inner product space we have
a) $\langle f, cg \rangle = \bar{c}\langle f, g \rangle$, where $c \in \mathbb{C}$.
b) $\langle f, g + h \rangle = \langle f, g \rangle + \langle f, h \rangle$.

6.27 † If $\|x\|$ is defined by means of a Hermitian inner product, prove the Triangle Inequality:
$$\|x + y\| \le \|x\| + \|y\|.$$
(Hint: Use the Cauchy-Schwarz inequality.)

6.28 Suppose that $f$, $f_1$, $g$, and $g_1$ are all Riemann integrable complex-valued functions on $[a, b]$ such that $f \sim f_1$ and $g \sim g_1$ as in Definition 6.3.3. Prove that
$$\langle f, g \rangle = \langle f_1, g_1 \rangle.$$
(Hint: Use the Cauchy-Schwarz inequality.)
6.29 † ◊$^{20}$ Suppose that $f \in \mathcal{R}([0, 1], \mathbb{C})$ has period 1 and that $c_{-n}, \ldots, c_n$ are any $2n + 1$ complex constants. Then
$$\left\| f - \sum_{k=-n}^{n} \hat{f}(k)\chi_k \right\|_2 \le \left\| f - \sum_{k=-n}^{n} c_k \chi_k \right\|_2$$
with equality holding if and only if $c_k = \hat{f}(k)$ for each $k$. Hint: Write the difference inside the right-side norm as the sum of two sums of differences, one of these involving $\left(\hat{f}(k) - c_k\right)\chi_k$.

6.30 † Complete the proof of Theorem 6.3.3 by proving that if $c$ and $d$ are in $l_2$ and if $a \in \mathbb{C}$ then $\|ac + d\|_2^2 < \infty$. (Hint: Write the square of the norm as a Hermitian scalar product and apply the Cauchy-Schwarz inequality.)

6.31 † ◊$^{21}$ Prove that $l_2$ is complete, in the sense that each Cauchy sequence in the $l_2$-norm converges to a square-summable complex sequence. (Hint: Prove that if $c^{(n)} \in l_2$ is a Cauchy sequence, then $c_k^{(n)}$ converges to some $c_k \in \mathbb{C}$ for each $k \in \mathbb{Z}$. Then prove that $\sum_{k=-\infty}^{\infty} |c_k|^2 < \infty$ and that $\|c^{(n)} - c\|_2 \to 0$ as $n \to \infty$.)
6.4
UNIFORM CONVERGENCE & RIEMANN LOCALIZATION
In this section we will prove that if a periodic function $f \in C^p(\mathbb{R})$ has period 1, then the Fourier series $S(f)$ converges uniformly to $f$, provided that $1 \le p \le \infty$. In other words, a periodic function on the circle will have a uniformly convergent Fourier series provided that $f$ has at least a first-order continuous derivative. Before we state formally and prove the main result, which is Theorem 6.4.2, it is important to remark that if $f$ is not differentiable this theorem can fail. It is known that if $f$ is only continuous but not differentiable, then the Fourier series of $f$ can diverge for uncountably many values of $x$. Divergence is even more erratic behavior than convergence to a value different from $f(x)$, a phenomenon the reader has seen in Exercise 6.21. The main theorem will require a preliminary result that introduces the Dirichlet kernel.

Theorem 6.4.1 If $f \in \mathcal{R}[0, 1]$ is a periodic function of period 1, the $n$th partial sum $S_n$ of its Fourier series is given by
$$S_n(x) = \int_{-\frac{1}{2}}^{\frac{1}{2}} f(t)D_n(x - t)\,dt, \tag{6.26}$$
where the Dirichlet kernel $D_n$ is defined by
$$D_n(x) = \begin{cases} \dfrac{\sin(2n+1)\pi x}{\sin \pi x} & \text{if } x \notin \mathbb{Z}, \\[1ex] 2n + 1 & \text{if } x \in \mathbb{Z} \end{cases} \tag{6.27}$$
and is shown in Fig. 6.1. Also, $\int_{-\frac{1}{2}}^{\frac{1}{2}} D_n(x)\,dx = 1$ for each $n \in \mathbb{N}$.

Figure 6.1 Dirichlet kernel $D_n$ for $n = 3$.

$^{20}$ This exercise is used in the proof of Theorem 6.5.1, establishing the convergence of Fourier series in the $L^2$-norm.
$^{21}$ This exercise is cited in Remark 6.5.2.
Remark 6.4.1 The Dirichlet kernel does not converge to zero for x bounded away from the origin. It depends for its work on rapid oscillations in sign to produce cancelations, together with most of its integral being nearly 1 over a small interval around the origin.
Proof: We observe that
$$\begin{aligned}
S_n(x) &= \sum_{k=-n}^{n} \left(\int_0^1 f(t)\overline{\chi_k(t)}\,dt\right)\chi_k(x) = \sum_{k=-n}^{n} \int_0^1 f(t)\chi_k(x - t)\,dt \\
&= \int_0^1 f(t) \sum_{k=-n}^{n} \chi_k(x - t)\,dt = \int_{-\frac{1}{2}}^{\frac{1}{2}} f(t)\left(\sum_{k=-n}^{n} e^{2\pi ik(x-t)}\right)dt
\end{aligned}$$
since the integrand has period 1 and can be integrated with the same result on any interval of length 1 (Exercise 6.4). It will suffice to prove that the sum inside the integrand is the Dirichlet kernel evaluated at $x - t$. We reason as follows using Euler's formula and the sum of a geometric series.
$$\begin{aligned}
\sum_{k=-n}^{n} e^{2\pi ikx} &= e^{-2\pi inx} \sum_{k=0}^{2n} e^{2\pi ikx} = e^{-2\pi inx}\,\frac{1 - e^{2\pi i(2n+1)x}}{1 - e^{2\pi ix}} \\
&= \frac{e^{-2\pi inx} - e^{2\pi i(n+1)x}}{1 - e^{2\pi ix}} = \frac{e^{-i(2n+1)\pi x} - e^{i(2n+1)\pi x}}{e^{-i\pi x} - e^{i\pi x}} \\
&= \frac{\sin(2n+1)\pi x}{\sin \pi x} = D_n(x),
\end{aligned}$$
provided that the denominator in the geometric series formula is not zero, which is equivalent to $x \notin \mathbb{Z}$. If $x \in \mathbb{Z}$, then the sum is clearly $2n + 1$. Finally, since we have shown above that $D_n(x) = \sum_{k=-n}^{n} e^{2\pi ikx}$, it follows readily that $\int_{-\frac{1}{2}}^{\frac{1}{2}} D_n(x)\,dx = 1$ for each $n \in \mathbb{N}$. ∎
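The two descriptions of the Dirichlet kernel used in the proof above, the finite sum of characters and the closed form in Equation (6.27), can be compared numerically, together with the claim that the kernel integrates to 1 over $[-\frac{1}{2}, \frac{1}{2}]$. The following is a minimal sketch; the function names and the midpoint-rule integration with 1000 sample points are illustrative assumptions.

```python
import cmath
import math

def dirichlet_sum(n, x):
    """Sum of e^{2 pi i k x} for k = -n..n; the imaginary parts cancel in pairs."""
    return sum(cmath.exp(2j * math.pi * k * x) for k in range(-n, n + 1)).real

def dirichlet_closed(n, x):
    """Closed form (6.27): sin((2n+1) pi x) / sin(pi x), or 2n + 1 when x is an integer."""
    if float(x).is_integer():
        return float(2 * n + 1)
    return math.sin((2 * n + 1) * math.pi * x) / math.sin(math.pi * x)

def integral_over_period(n, samples=1000):
    """Midpoint-rule approximation of the integral of D_n over [-1/2, 1/2]."""
    h = 1.0 / samples
    return sum(dirichlet_closed(n, -0.5 + (j + 0.5) * h) for j in range(samples)) * h

if __name__ == "__main__":
    n = 3                                   # the case pictured in Fig. 6.1
    for x in (0.0, 0.1, 0.25, -0.3):
        print(x, dirichlet_sum(n, x), dirichlet_closed(n, x))
    print("integral ~", integral_over_period(n))
```

Because $D_n$ is a trigonometric polynomial, an equally spaced rule over a full period is a natural choice here, though any reasonable quadrature would serve for the comparison.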
The following famous lemma is very useful.

Lemma 6.4.1 (Riemann-Lebesgue Lemma) If $f \in \mathcal{R}([0, 1], \mathbb{C})$, then
$$\hat{f}(n) \to 0$$
as $|n| \to \infty$.

The proof of this lemma is left to Exercise 6.32.

Definition 6.4.1 If $f \in \mathcal{R}([a, b], \mathbb{C})$, we define the $L^1$-norm of $f$ by the formula
$$\|f\|_1 = \int_a^b |f(x)|\,dx,$$
where the vertical bars indicate the modulus of the complex-valued function $f$.

If we identify functions $f$ and $g$ as equivalent if $\int_a^b |f(x) - g(x)|\,dx = 0$, then $\|\cdot\|_1$ is a legitimate norm$^{22}$ on $\mathcal{R}([a, b], \mathbb{C})$. (See Exercise 6.33.)

Lemma 6.4.2 Let $f \in C^p(\mathbb{R})$ have period 1, with $1 \le p < \infty$. If $n \neq 0$, then
$$|\hat{f}(n)| \le \frac{\|f^{(p)}\|_1}{(2\pi |n|)^p}. \tag{6.28}$$

$^{22}$ However, it can be shown that the normed vector space $\mathcal{R}([a, b], \mathbb{C})$ is not complete in the $L^1$-norm. (See Exercise 6.50.b.) The completion of this space with respect to the $L^1$-norm requires the Lebesgue integral.
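Proof: We begin by applying integration by parts to
$$\begin{aligned}
\widehat{f'}(n) &= \int_0^1 f'(x)\overline{\chi_n(x)}\,dx \\
&= f(x)\overline{\chi_n(x)}\Big|_0^1 - \int_0^1 f(x)\,\overline{\chi_n}'(x)\,dx \\
&= 2\pi in\hat{f}(n).
\end{aligned}$$
We iterate this argument a total of $p$ times, obtaining
$$\hat{f}(n) = \frac{\widehat{f^{(p)}}(n)}{(2\pi in)^p}. \tag{6.29}$$
Finally, we observe that for each function $g \in \mathcal{R}([0, 1], \mathbb{C})$ we have
$$|\hat{g}(n)| = \left|\int_0^1 g(x)\overline{\chi_n(x)}\,dx\right| \le \int_0^1 |g(x)\overline{\chi_n(x)}|\,dx = \|g\|_1.$$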
∎

Theorem 6.4.2 Let $f \in C^p(\mathbb{R})$ have period 1, and suppose $1 \le p < \infty$. Then the $N$th partial sum
$$S_N = \sum_{n=-N}^{N} \hat{f}(n)\chi_n(x)$$
converges uniformly to $f$ on the real line. Moreover,
$$\|S_N - f\|_{\sup} \le \frac{K}{N^{p-\frac{1}{2}}}$$
for some constant $K$ that is independent of $N$ but dependent upon $f$ and $p$.

Proof: It would suffice for the first part of the theorem to give a proof for $p = 1$, but the first part follows from the inequality that is the second part, and that is what we will prove. Our first step will be to prove that the sequence $S_n$ is Cauchy in the sup-norm. For each $n \in \mathbb{N}$ and for each $m \ge n$ we have
$$\begin{aligned}
|S_m(x) - S_n(x)| &\le \sum_{|k| > n} |\hat{f}(k)| \\
&\overset{(1)}{\le} \left(\sum_{|k| > n} \left|\widehat{f^{(p)}}(k)\right|^2\right)^{\frac{1}{2}} \left(\sum_{|k| > n} \frac{1}{(2\pi k)^{2p}}\right)^{\frac{1}{2}} \\
&\overset{(2)}{\le} \|f^{(p)}\|_2 \left(2\int_n^{\infty} (2\pi x)^{-2p}\,dx\right)^{\frac{1}{2}} = \frac{\|f^{(p)}\|_2}{(2\pi)^{p-\frac{1}{2}}\, n^{p-\frac{1}{2}} \sqrt{\pi(2p - 1)}} \to 0
\end{aligned}$$
as $n \to \infty$. For inequality (1) we have used Equation (6.29) and the Cauchy-Schwarz inequality for $l_2$. For inequality (2) we have used both Bessel's inequality and the integral test for infinite series of positive terms. This proves that
$$\|S_m - S_n\|_{\sup} \le \frac{K}{n^{p-\frac{1}{2}}}$$
and $S_m$ is uniformly convergent to some continuous function $\phi$. Letting $m \to \infty$, we see that $\|\phi - S_n\|_{\sup} \to 0$ as $n \to \infty$ and that the convergence takes place at the rate claimed. It remains still to prove that $S_n \to f$ or, in other words, that $\phi = f$. Since uniform convergence is established already, we need prove only pointwise convergence to $f$. We fix $x$ arbitrarily and observe that
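$$\begin{aligned}
S_n(x) - f(x) &= \int_{-\frac{1}{2}}^{\frac{1}{2}} f(x + y)D_n(y)\,dy - f(x)\int_{-\frac{1}{2}}^{\frac{1}{2}} D_n(y)\,dy \\
&= \int_{-\frac{1}{2}}^{\frac{1}{2}} \frac{f(x + y) - f(x)}{\sin \pi y}\,\sin \pi(2n + 1)y\,dy \\
&= \int_{-\frac{1}{2}}^{\frac{1}{2}} Q(y)\,\frac{e^{i\pi(2n+1)y} - e^{-i\pi(2n+1)y}}{2i}\,dy,
\end{aligned}$$
where we have used Euler's formula, and we define
$$Q(y) = \begin{cases} \dfrac{f(x + y) - f(x)}{\sin \pi y} & \text{if } y \neq 0, \\[1ex] \dfrac{f'(x)}{\pi} & \text{if } y = 0. \end{cases}$$
Next we define
$$Q^{\pm}(y) = Q(y)e^{\pm i\pi y}, \tag{6.30}$$
and the reader will show in Exercise 6.37 that each of these functions is continuous and thus Riemann integrable. Finally, we see that
$$S_n(x) - f(x) = \frac{1}{2i}\left(\widehat{Q^+}(-n) - \widehat{Q^-}(n)\right) \to 0$$
as $n \to \infty$ by the Riemann-Lebesgue lemma. ∎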
The following theorem provides very important information about the behavior of the pointwise convergence of Fourier series.

Theorem 6.4.3 (Riemann Localization) Suppose that $f$ and $g$ are Riemann integrable functions of period 1, and suppose there exists a subinterval $(a, b) \subset [0, 1]$ such that $f(x) = g(x)$ for all $x \in (a, b)$. Then $S_n(f) - S_n(g) \to 0$ uniformly on each interval $[c, d] \subset (a, b)$.

Remark 6.4.2 The Fourier coefficients $\hat{f}(n)$ of a Riemann integrable function of period 1 are called global objects because they result from an integral of $f$ against a character over the whole domain $[0, 1]$, which we can think of as being the entire circle $\mathbb{R}/\mathbb{Z}$. The word global is used to connote the fact that the behavior of $f$ over its entire domain is reflected in the calculation of $\hat{f}(n)$ for each value of $n$. On the other hand, a convergence property of the Fourier series $S(f)$ in a very small interval that reflects the behavior of $f$ only in that small interval is called local in nature. The Riemann Localization Theorem is a striking example of local-global duality. The reader should note that the integrable functions $f$ and $g$ could differ as much as we like outside the possibly minute interval $(a, b)$, and also that neither $S(f)$ nor $S(g)$ need be convergent even pointwise on $(a, b)$, and yet $S_n(f) - S_n(g)$ must converge uniformly to zero on any closed proper subinterval of $(a, b)$.

Figure 6.2 $S_{10}$ for $F(x) = e^{|x|}\left[1 - 1_{(-.25, .25)}\right]$ on $[-.5, .5)$ with period 1.
Proof: We begin by making a few simplifications that incur no loss of generality. Observe first that $S_n(f) - S_n(g) = S_n(h)$, where $h = f - g$ and $h(x) \equiv 0$ on $(a, b)$. Let $x_0 = \frac{a+b}{2}$ and $r = \frac{b-a}{2}$, so that $(a, b) = (x_0 - r, x_0 + r)$. Our goal is to prove that $S_n(h) \to 0$ uniformly on each $[c, d] \subset (x_0 - r, x_0 + r)$. As a final simplification, we would like to represent the circle $\mathbb{R}/\mathbb{Z}$ as $\left[-\frac{1}{2}, \frac{1}{2}\right)$, and it would be convenient to have $x_0 = 0$. To this end, define $F(x) = h(x + x_0) = h_{x_0}(x)$. Thus $F \equiv 0$ on $(-r, r)$. If $[c, d] \subset (a, b)$ then there exists $\delta \in (0, r)$ such that $[x_0 - \delta, x_0 + \delta] \supseteq [c, d]$. If we can prove that $S_n(F) \to 0$ uniformly on $[-\delta, \delta]$ then $S_n(h) \to 0$ uniformly on $[c, d]$, since $S_n(h)$ is simply an $x_0$-translation of $S_n(F)$. (An example of a function such as $F$ in this proof is shown in Fig. 6.2 together with the tenth partial sum of its Fourier series.)

We proceed with the proof of the Riemann Localization theorem. Recall that
$$S_n(x) = \int_{-\frac{1}{2}}^{\frac{1}{2}} F(x + y)D_n(y)\,dy = \int_{-\frac{1}{2}}^{\frac{1}{2}} \frac{F(x + y)}{\sin \pi y}\,\sin(2n + 1)\pi y\,dy.$$
We define two functions $F^+$ and $F^-$ by
$$F^{\pm}(y) = \frac{F(x + y)e^{\pm i\pi y}}{2i\sin \pi y}$$
for all $x \in \left[-\frac{1}{2}, \frac{1}{2}\right)$. If $|x| \le \delta$, then since $F \equiv 0$ on $(-r, r)$, we have $F^{\pm}(y) \equiv 0$ if $|y| < r - \delta = \delta'$. It follows that $F^{\pm} \in \mathcal{R}\left(\left[-\frac{1}{2}, \frac{1}{2}\right], \mathbb{C}\right)$. We calculate from Euler's formula that
$$S_n(x) = \widehat{F^+}(-n) - \widehat{F^-}(n) \to 0$$
as $n \to \infty$ by the Riemann-Lebesgue lemma. It remains to prove that the convergence of $S_n$ to 0 is uniform on $[-\delta, \delta]$. If $x_1$ and $x_2$ are in $[-\delta, \delta]$, then
$$|S_n(x_1) - S_n(x_2)| \le \frac{1}{\sin \pi\delta'} \int_{-\frac{1}{2}}^{\frac{1}{2}} |F(x_1 + y) - F(x_2 + y)|\,dy = \frac{\|F_{x_1} - F_{x_2}\|_1}{\sin \pi\delta'},$$
independent of $n$. Moreover, the right-hand side is uniformly continuous as a function of $x_1$ and/or $x_2$ in the circle, $\mathbb{R}/\mathbb{Z}$, which is compact. (Here we use Exercise 6.39.) Thus for each $\varepsilon > 0$ there exists $\eta > 0$ such that $|x_2 - x_1| < \eta$ implies
$$\frac{\|F_{x_1} - F_{x_2}\|_1}{\sin \pi\delta'} < \frac{\varepsilon}{2}$$
for all $x_1$ and $x_2$ in the circle $\mathbb{R}/\mathbb{Z}$. Now we lay out an $\eta$-net of points
$$-\delta = x_1 < x_2 < \cdots < x_m = \delta$$
such that $x_k - x_{k-1} < \eta$ for each $k$. We pick a value of $N \in \mathbb{N}$ large enough so that $n \ge N$ implies $|S_n(x_k)| < \frac{\varepsilon}{2}$ for each $k = 1, \ldots, m$. Then $n \ge N$ implies that $|S_n(x)| < \varepsilon$ for all $x$ such that $|x| \le \delta$. ∎
Remark 6.4.3 In Exercise 6.40 the reader will use the Riemann localization theorem to show that if an integrable function of period 1 is smooth on an interval but not globally smooth, there will still be uniform convergence on a closed subinterval of the interval of smoothness, provided one keeps at least some positive distance $\delta > 0$ away from the endpoints. See Fig. 6.3. Notice that just before and just after each long vertical jump, the Fourier approximation $S_5$ seems to overshoot the function by a larger amount than is the case well within the interval of smoothness. That overshoot near the jump is known as Gibbs' phenomenon. By using a Fourier approximation $S_n$ with larger values of $n$, one can narrow the support of that large overshoot, confining it closer in the $x$-coordinate to the jump point, but one cannot eliminate the greater overshoot near that jump point. Gibbs' phenomenon is visible also in Fig. 6.2.

Figure 6.3 $S_{10}$ for $f(x) = 10e^x$, $-0.5 \le x < 0.5$, with period 1.
EXERCISES

6.32 † Prove Lemma 6.4.1. (Hint: The result follows quickly from the fact that $\hat{f} \in l^2$, because of Bessel's inequality.)

6.33 Prove that $\|\cdot\|_1$ in Definition 6.4.1 has all the properties required to be a norm on the vector space of equivalence classes of Riemann integrable functions $\mathcal{R}([a, b], \mathbb{C})$.

6.34 If $f$ and $g$ are functions of period 1 in $C^1([0, 1], \mathbb{C})$ and if $\hat{f}(k) = \hat{g}(k)$ for all $k \in \mathbb{Z}$, prove that $f(x) = g(x)$ for all $x$.

6.35 Find the Fourier series that converges uniformly to the periodic function $f(x) = \cos 2\pi x + \sin 4\pi x \sin 6\pi x$ of period 1.

6.36 Prove that an absolutely convergent Fourier series must be uniformly convergent as well.

6.37 † Show that the functions $Q^+$ and $Q^-$ in Equation (6.30) are continuous.

6.38 If $f$ is a Riemann integrable function of period 1, prove that $f \in C^{\infty}([0, 1], \mathbb{C})$ if and only if $k^n \hat{f}(k) \to 0$ as $|k| \to \infty$ for all $n \in \mathbb{N}$.
6.39 † If $f \in \mathcal{R}([0, 1], \mathbb{C})$ is a function of period 1, and if we denote the $t$-translate of $f$ by $f_t(x) = f(x + t)$, then the function $\phi(t) = \|f_t - f\|_1$ is a continuous function of $t \in \mathbb{R}/\mathbb{Z}$, the circle. (Hint: Let $\epsilon > 0$. Show that there is a step function $\sigma$ such that $\|f - \sigma\|_1 < \frac{\epsilon}{3}$. Next use a double application of the triangle inequality for the $L^1$-norm.)

6.40 ◊ Suppose $f \in C^1((a, b), \mathbb{C})$ for some $(a, b) \subset [0, 1]$ and that $f$ is a Riemann integrable function of period 1. Prove that the Fourier series $S(f)$ converges uniformly to $f$ on each proper closed subinterval $[c, d] \subset (a, b)$. See Fig. 6.3. Note that in the figure $a = -0.5$ and $b = 0.5$, and $[a, b)$ is also a valid domain for the study of functions of period 1. (Hint: Use Exercise 5.62 to create a smooth periodic function that agrees with $f$ on $(a, b)$. Apply Theorem 6.4.3.)
6.5 $L^2$-CONVERGENCE & THE DUAL OF $l^2$
We have shown in Equation (6.22) that if $f \in \mathcal{R}([0, 1], \mathbb{C})$ is a function of period 1, then the double sequence $\hat{f}(n)$ is in $l^2$, and
$$\sum_{-\infty}^{\infty} |\hat{f}(n)|^2 \le \|f\|_2^2$$
with equality holding if and only if $\|S_n(f) - f\|_2 \to 0$ as $n \to \infty$. The case of equality in Bessel's inequality is called the Plancherel identity, which we will establish with Theorem 6.5.1. The Plancherel identity can be interpreted as an infinite-dimensional version of the Pythagorean Theorem for $\mathcal{R}[0, 1]$, which we interpret as an infinite-dimensional Hermitian inner product space, utilizing the equivalence relation $f \sim g$ in $\mathcal{R}[0, 1]$ if and only if $\int_0^1 |f(x) - g(x)|^2\, dx = 0$. Some applications of this identity are given in Exercise 6.44.
Theorem 6.5.1 If $f \in \mathcal{R}([0, 1], \mathbb{C})$ is a function of period 1, then
$$\|S_n(f) - f\|_2 \to 0$$
as $n \to \infty$. Consequently we have the Plancherel identity:
$$\sum_{-\infty}^{\infty} |\hat{f}(n)|^2 = \|f\|_2^2. \qquad (6.31)$$
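The Plancherel identity can be checked numerically on a simple example. The sketch below is an illustration only: it assumes the test function $f(x) = e^x$ on $[0, 1)$, extended with period 1, and uses the hand-computed closed form $\hat{f}(n) = (e - 1)/(1 - 2\pi i n)$, neither of which comes from the text. The partial sums of $\sum |\hat{f}(n)|^2$ should increase toward $\|f\|_2^2 = (e^2 - 1)/2$.

```python
import math


def fourier_coeff_exp(n: int) -> complex:
    """n-th Fourier coefficient of f(x) = e^x on [0, 1), period 1.

    Assumption (hand computation, not taken from the text):
    integral_0^1 e^x e^{-2 pi i n x} dx = (e - 1) / (1 - 2 pi i n).
    """
    return (math.e - 1.0) / complex(1.0, -2.0 * math.pi * n)


def bessel_partial_sum(N: int) -> float:
    """Sum of |f^(n)|^2 over all integers n with |n| <= N."""
    return sum(abs(fourier_coeff_exp(n)) ** 2 for n in range(-N, N + 1))


# ||f||_2^2 = integral_0^1 e^{2x} dx = (e^2 - 1) / 2
norm_sq = (math.e ** 2 - 1.0) / 2.0

for N in (10, 100, 1000, 10000):
    # Bessel's inequality: every partial sum stays below norm_sq;
    # Plancherel: the gap shrinks to zero as N grows.
    print(N, bessel_partial_sum(N), norm_sq - bessel_partial_sum(N))
```

The gap decays only like $1/N$ here, which reflects the jump discontinuity of the periodic extension at the integers.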
Remark 6.5.1 A celebrated theorem of Lennart Carleson [5] established that the Fourier series of any square-integrable Lebesgue measurable function must converge to $f(x)$ pointwise except on a set of Lebesgue measure zero,$^{23}$ and that theorem does apply to all the functions covered by our theorem above. However, a set of points can have Lebesgue measure zero and still be an uncountably infinite set. There are examples known of continuous functions $f$ for which $S_n(f)$ is actually divergent for infinitely many values of $x$. And there is an example of a Lebesgue integrable function $f$ for which the Fourier series diverges at each point $x$! The extraordinary pathologies of Fourier series in regard to pointwise convergence, even for continuous functions, make theorems like the one we are about to prove very interesting and useful.

23 The reader will not need to know about Lebesgue measure here. The definition of a set of Lebesgue measure zero, known also as a Lebesgue null set, is, however, provided in this book as Definition 11.2.1.

Proof: Suppose first that $f$ is real-valued, and let $\epsilon > 0$. By Exercise 3.29, there exist step functions $\sigma(x) \le f(x) \le \sigma'(x)$ for all $x \in [a, b]$, so that
$$\int_a^b \sigma(x)\, dx \le \int_a^b f(x)\, dx \le \int_a^b \sigma'(x)\, dx,$$
such that $\left| \int_a^b \sigma'(x)\, dx - \int_a^b \sigma(x)\, dx \right| < \epsilon$, implying also that
$$\int_a^b |f(x) - \sigma(x)|\, dx < \epsilon \quad \text{and} \quad \int_a^b |f(x) - \sigma'(x)|\, dx < \epsilon.$$
This follows immediately from the use of the upper sums and the lower sums from the Darboux integrability criterion. These sums are integrals of step functions with heights corresponding to the infimum and the supremum of $f$ on the intervals of a partition. Consider next a step function having only two values, each on a subinterval of strictly positive length. Let $p$ be the point at which the single jump discontinuity occurs. By Exercise 5.62 we know that for each $\delta > 0$ there exists a function $\phi \in C^{\infty}[0, 1]$ such that $\phi(x) = \sigma(x)$ except on an interval $(p - \delta, p + \delta)$. In effect, we are connecting the two steps with a smooth ($C^{\infty}$) curve which departs from the lower step very near the point of jump discontinuity and joins the upper step smoothly only slightly to the opposite side of the jump discontinuity. This permits us to make $\|\phi - \sigma\|_1$ as small as we like. For general step functions we can iterate the process just described at each of the finitely many jump discontinuities of $\sigma$. Moreover, we can do this keeping $\|\phi\|_{\sup} = \|\sigma\|_{\sup} \le \|f\|_{\sup}$. (If $f$ is complex-valued, the same approximations can be produced by working separately with the real and imaginary parts, $\Re(f)$ and $\Im(f)$.) In this manner we establish that there exists a sequence of functions $\phi_n \in C^{\infty}[0, 1]$ with period readily adjusted to be 1, and having the properties
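$$\|f - \phi_n\|_1 \to 0 \quad \text{and} \quad \|\phi_n\|_{\sup} \le \|f\|_{\sup} = M < \infty$$
for all $n$. Since $|f - \phi_n| \le 2M$ pointwise, it follows that
$$\|f - \phi_n\|_2^2 = \int_0^1 |f - \phi_n|^2\, dx \le 2M \|f - \phi_n\|_1 \to 0$$
as $n \to \infty$. By Theorem 6.4.2 we know that $S_k(\phi_n) \to \phi_n$ uniformly on $[0, 1]$ as $k \to \infty$. By Exercise 6.41 it follows that $S_k(\phi_n) \to \phi_n$ as $k \to \infty$ in the $L^2$-norm. Thus for each $\epsilon > 0$ we can first choose $\phi = \phi_n$ with $\|f - \phi\|_2 < \frac{\epsilon}{2}$ and then a number $K$ such that $k \ge K$ implies $\|\phi - S_k(\phi)\|_2 < \frac{\epsilon}{2}$, so that $\|f - S_k(\phi)\|_2 < \epsilon$. By Exercise 6.29, the Fourier coefficients of $f$ provide the optimal $L^2$ approximation to $f$. Thus
$$\|f - S_k(f)\|_2 \le \|f - S_k(\phi)\|_2 < \epsilon$$
for all $k \ge K$. This implies the theorem. ∎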
Remark 6.5.2 Thanks to Theorem 6.5.1, we know that iff and g are in R([O, 1],
i= k,
Theorem 6.5.1 can be considered a version of the Riesz-Fischer Theorem, restricted according to the limitations of the Riemann integral. As stated originally by Riesz, this theorem said that an $l^2$-sequence of coefficients corresponds to an $L^2$-convergent series of functions in an expansion using an orthonormal basis, such as the trigonometric basis $\{e^{2\pi i n x}\}$ used for Fourier series.
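where $\delta_{jk}$ is called the Kronecker delta. If $c \in l^2$, then $c = \sum_{k=-\infty}^{\infty} c_k e^{(k)}$ because
$$\Big\| c - \sum_{|k| \le n} c_k e^{(k)} \Big\|_2^2 = \sum_{|k| > n} |c_k|^2 \to 0$$
as $n \to \infty$ for any square summable sequence $c$. Now let $T$ be any bounded linear functional on $l^2$, and define $d_n = \overline{T(e^{(n)})}$ for each $n$. It follows from the linearity and the continuity of $T$ that
$$T\Big( \sum_{|k| \le n} c_k e^{(k)} \Big) = \sum_{|k| \le n} c_k \overline{d_k} \to T(c)$$
as $n \to \infty$. Hence
$$T(c) = \langle c, d \rangle \text{ for all } c \in l^2. \qquad (6.32)$$
We need to prove that $d \in l^2$. The $n$th truncation of $d$ is given by
$${}^n d = \sum_{|k| \le n} d_k e^{(k)} \in l^2$$
for each $n$. Hence $|T({}^n d)| \le \|T\| \|{}^n d\|_2$, which implies by Equation (6.32) that
$$\|{}^n d\|_2^2 \le \|T\| \|{}^n d\|_2$$
for each $n$. Thus $\|d\|_2^2 \le \|T\| \|d\|_2$, which implies that $\|d\|_2 \le \|T\| < \infty$. Hence $d \in l^2$. Moreover,
$$|T(c)| = |\langle c, d \rangle| \le \|c\|_2 \|d\|_2$$
by the Cauchy-Schwarz inequality, which implies that $\|T\| \le \|d\|_2$, which implies that $\|T\| = \|d\|_2$.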
∎

EXERCISES

6.41 Let $S_k(\phi)$ denote the $k$th partial sum of the Fourier series for a Riemann integrable function $\phi$. Prove that if $\|S_k(\phi) - \phi\|_{\sup} \to 0$ as $k \to \infty$, then
$$\|S_k(\phi) - \phi\|_2 \to 0$$
as $k \to \infty$, thereby completing the proof of Theorem 6.5.1.

6.42 If $f$ and $g$ are in $\mathcal{R}([0, 1], \mathbb{C})$ and if $\hat{f} = \hat{g}$, prove that $\|f - g\|_2 = 0$.
6.43 Prove the following generalization of the Plancherel identity, called Parseval's identity: for each $f$ and $g$ in $\mathcal{R}([0, 1], \mathbb{C})$ we have
$$\langle f, g \rangle = \langle \hat{f}, \hat{g} \rangle.$$
Parseval's identity shows that the Fourier transform preserves the Hermitian scalar product. (Hint: Apply the Plancherel identity to $\langle f + g, f + g \rangle$ to prove that
$$\Re \langle f, g \rangle = \Re \langle \hat{f}, \hat{g} \rangle.$$
Then do something similar with $if$ to obtain the desired conclusion.)
6.44 Use Exercise 6.19 together with the Plancherel identity [Equation (6.31)] to find the sums of the following infinite series.
a) $\sum_{n \in \mathbb{N}} \frac{1}{n^2}$
b) $\sum_{n \in \mathbb{N}} \frac{1}{(2n)^2}$
c) $\sum_{n \in \mathbb{N}} \frac{1}{(2n-1)^2}$
6.45 Suppose that $f \in \mathcal{R}[0, 1]$ is a function of period 1. If $\langle f, \chi_n \rangle = 0$ for each $n \in \mathbb{Z}$, prove that $\|f\|_2 = 0$, so that $f$ is equivalent to zero in the normed vector space of equivalence classes in $\mathcal{R}[0, 1]$ with the $L^2$-norm.

6.46 Let $c$ be a nonzero element of $l^2$. Denote
$$c^{\perp} = \{ d \in l^2 \mid \langle d, c \rangle = 0 \},$$
the orthogonal complement of the one-dimensional subspace $\mathbb{C}c$ of $l^2$.
a) Prove that $c^{\perp}$ is a vector subspace of $l^2$.
b) Prove that each $x \in l^2$ has a unique decomposition $x = zc + d$, where $z \in \mathbb{C}$ and $d \in c^{\perp}$.
c) Prove that there exists a bounded linear functional $T$ on $l^2$ such that $T(c) = 1$ and $T : c^{\perp} \to \{0\}$.
d) Suppose now that $c$ and $d$ are any two nonzero elements of $l^2$, and suppose further that neither element is a scalar multiple of the other. (That is, the two are linearly independent.) Prove that there exists a bounded linear functional $T : l^2 \to \mathbb{C}$ such that $T(c) \ne T(d)$.
6.47 ◊ It is known that the Fourier series of a continuous, periodic function need not converge uniformly: in fact it can diverge for an uncountably infinite set of values of $x$, though it is not easy to give an example and we do not do so here. On the other hand, every uniformly convergent Fourier series must converge to a continuous function. The following steps will establish Fejér's Theorem: Every continuous function of period 1 is the uniform limit of the Cesàro means (Exercise 1.56) of the partial sums of its Fourier series.
a) Define the $n$th Fejér kernel $F_n$ to be the average of the first $n$ Dirichlet kernels. (See Fig. 6.4.) Fill in the missing steps in the following calculation:
$$F_n(x) = \frac{1}{n} \sum_{k=0}^{n-1} D_k(x) = \frac{1}{n} \sum_{k=0}^{n-1} \frac{\sin (2k + 1)\pi x}{\sin \pi x} = \frac{1}{n \sin \pi x}\, \Im\Big( \sum_{k=0}^{n-1} \big(e^{i \pi x}\big)^{2k+1} \Big) = \frac{1}{n} \left[ \frac{\sin n \pi x}{\sin \pi x} \right]^2.$$
b) Prove that
$$\int_{-1/2}^{1/2} F_n(x)\, dx = 1.$$
c) Prove that if $0 < \delta < \frac12$, then $F_n \to 0$ uniformly on the domain $\delta \le |x| \le \frac12$.
d) Define
$$\sigma_n(x) = \frac{1}{n} \sum_{k=0}^{n-1} S_k(x)$$
and prove that if $f$ is continuous of period 1, then
$$\sigma_n(x) - f(x) = \int_{-1/2}^{1/2} [f(x + y) - f(x)] F_n(y)\, dy \to 0$$
uniformly in $x$ as $n \to \infty$.
e) Use Fejér's theorem to give an alternative proof for the Weierstrass Polynomial Approximation Theorem (Theorem 5.8.1).
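Before filling in the algebra of part a), it can be reassuring to confirm numerically that the two ends of the chain of equalities agree. The following is a sketch only, with function names chosen here for illustration; it compares the average of Dirichlet kernels with the closed form and approximates the integral of part b) by a midpoint rule. It checks the formulas at sample points and is not a proof.

```python
import math


def dirichlet(k: int, x: float) -> float:
    """Dirichlet kernel D_k(x) = sin((2k+1) pi x) / sin(pi x), x not an integer."""
    return math.sin((2 * k + 1) * math.pi * x) / math.sin(math.pi * x)


def fejer_average(n: int, x: float) -> float:
    """F_n(x) as the average of D_0, ..., D_{n-1}."""
    return sum(dirichlet(k, x) for k in range(n)) / n


def fejer_closed(n: int, x: float) -> float:
    """F_n(x) from the closed form (1/n) (sin(n pi x) / sin(pi x))^2."""
    return (math.sin(n * math.pi * x) / math.sin(math.pi * x)) ** 2 / n


def fejer_integral(n: int, m: int = 200) -> float:
    """Midpoint-rule approximation of the integral of F_n over [-1/2, 1/2].

    m is even, so no midpoint lands on x = 0 where the quotient is 0/0.
    """
    h = 1.0 / m
    return sum(fejer_closed(n, -0.5 + (j + 0.5) * h) for j in range(m)) * h


for x in (0.1, 0.25, -0.37, 0.49):
    print(x, fejer_average(5, x), fejer_closed(5, x))
print(fejer_integral(5))
```

Unlike the Dirichlet kernel, the closed form is visibly nonnegative, which is the feature that parts c) and d) exploit.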
6.48
Let
fn(x) = 1[~,l](x)
rx1
for each $n$, so that $f_n \in \mathcal{R}[0, 1]$. Prove that the sequence $f_n$ is Cauchy in the $L^2$-norm. Prove that if $f \in \mathcal{R}[0, 1]$, then $\|f - f_n\|_2$ fails to converge to zero, so that $\mathcal{R}[0, 1]$ is not complete in the $L^2$-norm.
6.49 ◊ Let $x_n$ be a sequence of all the rational numbers in $(0, 1)$. Thus the sequence $x_n$ is dense in $[0, 1]$. Let
fn(x) = {
{/lx~xnl'
0,
X= Xn
Figure 6.4 Fejér kernel $F_5(x)$.
and define
for each $N \in \mathbb{N}$. Prove that $F_N$ is Riemann integrable on $[0, 1]$ for each $N$, and $\{F_N\}$ is a Cauchy sequence in the $L^2$-norm, but that there is no $f \in \mathcal{R}[0, 1]$ such that $\|F_N - f\|_2 \to 0$. Hence $\mathcal{R}[0, 1]$ is not complete in the $L^2$-norm.
6.50 ◊ Following Example 1.15, construct an open dense set
$$O = \bigcup_{n \in \mathbb{N}} O_n, \text{ with } O_n = \bigcup_{k=1}^{n} (a_k, b_k) \subset [0, 1]$$
such that
Let $f_n = 1_{O_n}$.
a) Prove that $f_n$ is a sequence of Riemann integrable functions that is also a Cauchy sequence in the $L^2$-norm, but that there is no $f \in \mathcal{R}[0, 1]$ such that $\|f_n - f\|_2 \to 0$ as $n \to \infty$. Hence $\mathcal{R}[0, 1]$ is not complete in the $L^2$-norm.
b) Show also that $\mathcal{R}[0, 1]$ is not complete in the $L^1$-norm.
6.6 TEST YOURSELF
EXERCISES 6.51 Suppose f is an even Riemann integrable function on the real line with period equal to 1. If 1
j_
f(x) dx = 3, 1 2
then find a)
Ji_ f(x) dx 2
b)
6.52
J01 f(x) dx
c) J~ f(x) sin 21rx dx
If
f(x) =
L
sin~;nx,
nEN
then find a)
f01 f(x) cos61rxdx
b) J~ j(x)sin61rxdx
6.53 a) Find the numerical value of
b) Find the numerical value of
c) Find the numerical value of
6.54 Find the Hermitian inner product (Definition 6.19) $\langle f, g \rangle$ if $f(x) = x$ and $g(x) = ix$ for all $x \in [0, 1)$, both functions extended to $\mathbb{R}$ so as to be periodic with period 1.

6.55 True or False: If $\langle \cdot, \cdot \rangle$ is a Hermitian inner product on a complex vector space $V$, then
$$\langle x, cy + z \rangle = c \langle x, y \rangle + \langle x, z \rangle$$
for all $c \in \mathbb{C}$.
6.56 Suppose that $f \in \mathcal{R}[-\frac12, \frac12]$. Decide which of the terms real-valued, pure imaginary-valued,$^{25}$ even, and odd apply correctly to $\hat{f}(n)$ if
a) $f$ is both real-valued and even.
b) $f$ is both real-valued and odd.

6.57 Let $f(x) = \cos^3 2\pi x$.
a) Find the trigonometric Fourier series expansion (in terms of $\cos 2\pi n x$ and $\sin 2\pi n x$) for the function $f(x)$.
b) Find $\int_0^1 f(x) \cos 2\pi x\, dx$.

6.58 Suppose the sequence of partial sums
$$S_n = \sum_{k=-n}^{n} c_k e^{2\pi i k x}$$
converges uniformly. True or False: The sum $\sum_{n \in \mathbb{Z}} |c_n|^2 < \infty$.
6.59 Use Exercise 6.19 together with the Plancherel identity [Equation (6.31)] to find the numerical value of the sum of each of the following infinite series:
a)
b) $\sum_{n \in \mathbb{N} \setminus 3\mathbb{N}} \frac{1}{n^4}$, where $3\mathbb{N} = \{3n \mid n \in \mathbb{N}\}$.

25 Being pure imaginary-valued means being a real number times $i$. This is a special subset of the complex numbers, often denoted $i\mathbb{R}$.
CHAPTER 7
THE RIEMANN-STIELTJES INTEGRAL$^{26}$

The Riemann integral of a function $f$ provides a continuous analog of the process of summation of numerical values $f(\bar{x}_i)$, with each such value weighted by the width $\Delta x_i$ of the interval $[x_{i-1}, x_i]$ from which $\bar{x}_i$ is selected. There are many reasons for generalizing this concept to allow for the weighting of the numerical values $f(\bar{x}_i)$ by numbers different from $\Delta x_i$. The Riemann-Stieltjes integral allows for the replacement of $\Delta x_i$ by $\Delta g_i = g(x_i) - g(x_{i-1})$, where $g$ is a function of bounded variation. The concept of bounded variation is explained in the first section. A good example to have in mind would be the probabilistic expectation of a game of chance in which there is a winning given by $f(x)$ if a random number turns out to be $x$. However, $x$ may be more likely to be in some intervals than in others, and this difference is measured by the function $g$. The Riemann integral is an example of a bounded linear functional on the normed vector space $C[a, b]$. The Riesz Representation Theorem will establish that every such bounded linear functional comes from a Riemann-Stieltjes integral with respect to a suitable function $g$ of bounded variation.

26 This chapter is not required for any subsequent chapters.
Advanced Calculus: An Introduction to Linear Analysis. By Leonard F. Richardson Copyright © 2008 John Wiley & Sons, Inc.
7.1 FUNCTIONS OF BOUNDED VARIATION
The idea behind the concept of the total variation of a function f defined on [a, b] is to measure the extent to which the values of f oscillate up and down. One can think of it as being an odometer measurement of the amount of vertical travel, with all distances counted as positive.
Definition 7.1.1 Let $f : [a, b] \to \mathbb{R}$.

i. If $P = \{a = x_0 < x_1 < \cdots < x_n = b\}$ is any partition of $[a, b]$, we define
$$P(f) = \sum_{k=1}^{n} |f(x_k) - f(x_{k-1})|.$$

ii. We define the total variation of $f$ on $[a, b]$ by
$$V_a^b(f) = \sup_P \{ P(f) \},$$
where the supremum is taken over all partitions of $[a, b]$.

iii. Since $f$ is also defined on the interval $[a, x]$ for all $x \in [a, b]$, we define $V_a^x(f)$, the total variation function, to be the total variation of $f$ on the interval $[a, x]$. This is understood to be a function of $x$.
iv. We say $f$ has bounded variation on $[a, b]$, denoted by $f \in BV[a, b]$, if and only if $V_a^b(f) < \infty$.

EXAMPLE 7.1 Suppose $f$ is monotone on $[a, b]$. Then we claim $V_a^b(f) = |f(b) - f(a)|$, so that $f \in BV[a, b]$. We prove this first for $f$ increasing on $[a, b]$. Then for all $P$ the sum telescopes, and we have
$$P(f) = \sum_{k=1}^{n} |f(x_k) - f(x_{k-1})| = \sum_{k=1}^{n} [f(x_k) - f(x_{k-1})] = f(b) - f(a) = |f(b) - f(a)|.$$
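The sum $P(f)$ of Definition 7.1.1 is straightforward to compute for any finite partition. The sketch below uses helper names (`variation_sum`, `uniform_partition`) and test functions chosen here purely for illustration. For the increasing function $x^3$ on $[-1, 2]$ every partition gives the same value $f(b) - f(a) = 9$, as in Example 7.1, while for $\sin x$ on $[0, 2\pi]$ the sums approach the total variation 4 from below.

```python
import math
from typing import Callable, List, Sequence


def variation_sum(f: Callable[[float], float], partition: Sequence[float]) -> float:
    """P(f) = sum over k of |f(x_k) - f(x_{k-1})| for an increasing list of points."""
    return sum(abs(f(x1) - f(x0)) for x0, x1 in zip(partition, partition[1:]))


def uniform_partition(a: float, b: float, n: int) -> List[float]:
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * k / n for k in range(n + 1)]


def cube(x: float) -> float:
    return x ** 3


# Monotone case: the value does not depend on the partition.
for n in (1, 7, 100):
    print(n, variation_sum(cube, uniform_partition(-1.0, 2.0, n)))

# Non-monotone case: refining the partition can only increase P(f).
for n in (2, 4, 1000):
    print(n, variation_sum(math.sin, uniform_partition(0.0, 2.0 * math.pi, n)))
```

A computed $P(f)$ is only a lower bound for $V_a^b(f)$; the supremum in the definition is what detects unbounded variation.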
Theorem 7.1.1 Suppose $f \in C[a, b]$ and $f'$ is bounded at least on $(a, b)$. Then $f \in BV[a, b]$ and $V_a^b(f) \le \|f'\|_{\sup}(b - a)$.

Proof: We apply the Mean Value Theorem for derivatives as follows. For all $P$, we have
$$P(f) = \sum_{k=1}^{n} |f(x_k) - f(x_{k-1})| = \sum_{k=1}^{n} |f'(\bar{x}_k)| \Delta x_k \le \|f'\|_{\sup} \sum_{k=1}^{n} \Delta x_k = \|f'\|_{\sup}(b - a).$$
Here the Mean Value Theorem guarantees the existence of suitable points $\bar{x}_k$ in $[x_{k-1}, x_k]$. ∎
EXAMPLE 7.2 Let
$$f(x) = \begin{cases} x^2 \sin\left(\frac{\pi}{x}\right) & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}$$
We claim $f \in BV[0, 1]$. See Fig. 7.1.

Figure 7.1 $f(x) = x^2 \sin\left(\frac{\pi}{x}\right)$, with envelope $u(x) = x^2$, $l(x) = -x^2$.

In fact, we showed in Exercise 4.7 that $f'(0) = 0$ and that for all $x \ne 0$ we have $f'(x) = 2x \sin\left(\frac{\pi}{x}\right) - \pi \cos\left(\frac{\pi}{x}\right)$. Thus on $[0, 1]$ we have
$$\|f'\|_{\sup} \le 2 + \pi < \infty,$$
so $f \in BV[0, 1]$, as claimed.

Theorem 7.1.2 Let $f \in BV[a, b]$, and let $a \le x \le y \le b$. Then
$$V_a^y(f) = V_a^x(f) + V_x^y(f).$$
Proof:

i. First we will prove that
$$V_a^y(f) \le V_a^x(f) + V_x^y(f).$$
So let $P$ be any partition of $[a, y]$. It is possible that $x \notin P$, so let $P^* = P \cup \{x\}$. Then $P^* = P' \cup P''$, where $P'$ is a partition of $[a, x]$ and $P''$ is a partition of $[x, y]$. By Exercise 7.4,
$$P(f) \le P^*(f) = P'(f) + P''(f) \le V_a^x(f) + V_x^y(f).$$
Now we take the supremum over all $P$ to obtain
$$V_a^y(f) \le V_a^x(f) + V_x^y(f).$$

ii. We will complete the proof by showing that
$$V_a^y(f) \ge V_a^x(f) + V_x^y(f).$$
For this we let $P_1$ be any partition of $[a, x]$ and $P_2$ any partition of $[x, y]$, and let $P = P_1 \cup P_2$, a particular partition of $[a, y]$. Then
$$P_1(f) + P_2(f) = P(f) \le V_a^y(f)$$
for all $P_1$ and $P_2$. We take the supremum first over all $P_1$, showing that
$$V_a^x(f) + P_2(f) \le V_a^y(f).$$
Then we take the supremum over all $P_2$, and we find that
$$V_a^x(f) + V_x^y(f) \le V_a^y(f).$$
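∎

EXAMPLE 7.3 Let
$$f(x) = \begin{cases} x \sin\left(\frac{\pi}{x}\right) & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}$$
We claim that $V_0^1(f) = \infty$, so that $f \notin BV[0, 1]$. It will help the reader to sketch the graph of this function, indicating its infinitely many oscillations as $x \to 0+$. It is helpful to sketch the lines $y = x$ and $y = -x$ as helper lines. See Fig. 4.1. We observe that the graph of $f$ touches the two helper lines wherever
$$\frac{\pi}{x_k} = \frac{(2k + 1)\pi}{2},$$
an odd multiple of $\frac{\pi}{2}$. Although these are not extreme points of $f$, because $f'(x_k) \ne 0$, we can still see that
$$V_{x_{k+1}}^{x_k}(f) \ge |f(x_k) - f(x_{k+1})|.$$
Moreover,
$$|f(x_k)| = x_k = \frac{2}{2k + 1},$$
with the sign of $f(x_k)$ alternating. Thus
$$|f(x_k) - f(x_{k+1})| = x_k + x_{k+1} = \frac{2}{2k + 1} + \frac{2}{2k + 3} = \frac{8k + 8}{4k^2 + 8k + 3}.$$
By repeated application of Theorem 7.1.2, we have for all $N \in \mathbb{N}$,
$$V_0^1(f) \ge V_{x_N}^{x_{N-1}}(f) + V_{x_{N-1}}^{x_{N-2}}(f) + \cdots + V_{x_2}^{x_1}(f) \ge \sum_{k=1}^{N-1} \frac{8k + 8}{4k^2 + 8k + 3} \to \infty$$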
as $N \to \infty$. Hence $V_0^1(f) = \infty$, so $f \notin BV[0, 1]$.

Theorem 7.1.3 The function $f \in BV[a, b]$ if and only if there exist two monotone increasing functions $g$ and $h$ such that $f = g - h$ on $[a, b]$.

Proof:

i. We prove the if implication (from right to left) first. In this case, since $g$ and $h$ are monotone, $g, h \in BV[a, b]$, so $f = g - h \in BV[a, b]$ as well, by Exercise 7.6.

ii. Now we prove the only if part (from left to right). Suppose $f \in BV[a, b]$. If we let $g(x) = V_a^x(f)$, Theorem 7.1.2 implies that $g$ is increasing on $[a, b]$. Let $h(x) = g(x) - f(x)$, so $f = g - h$. It suffices to prove that $h$ is increasing on $[a, b]$ as well. So let $x < y$ in $[a, b]$. We need to show that
$$V_a^x(f) - f(x) \le V_a^y(f) - f(y),$$
which is equivalent to showing that
$$f(y) - f(x) \le V_a^y(f) - V_a^x(f) = V_x^y(f)$$
by Theorem 7.1.2 again. However, $P = \{x, y\}$ is a partition of $[x, y]$, so $f(y) - f(x) \le |f(y) - f(x)| = P(f) \le V_x^y(f)$.
∎

Remark 7.1.1 Theorem 7.1.3 can be called a representation theorem for $BV[a, b]$. What this means is that the theorem shows that a function $f$ lies in the set $BV[a, b]$ if and only if it is the difference between two monotone increasing functions. In some sense it is easier to understand the concept of a function being monotone increasing than it is to grasp the concept of a function having bounded variation. Thus this representation theorem for $BV[a, b]$ expresses every such function as the difference between two simpler and seemingly more familiar objects. It asserts also that every such difference has bounded variation. However, one may assume too easily that monotone increasing functions are easy to understand! Consider Exercise 7.10, in which the reader will construct a bounded monotone increasing function that has a jump discontinuity at each rational number on the $x$-axis. It is very difficult to picture the graph. See Exercise 2.12 for the definition of a jump discontinuity.
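The divergence in Example 7.3 can be watched numerically. The sketch below is illustrative only: it evaluates $P(f)$ for $f(x) = x \sin(\pi/x)$ on the partition $\{0, x_N, \ldots, x_1, 1\}$ built from the touching points $x_k = 2/(2k+1)$ and compares it with the lower bound $\sum_{k=1}^{N-1} (8k+8)/(4k^2+8k+3)$. The function names are choices made here, not notation from the text.

```python
import math


def f(x: float) -> float:
    """f(x) = x sin(pi / x) for x != 0, and f(0) = 0."""
    return x * math.sin(math.pi / x) if x != 0.0 else 0.0


def touch_point(k: int) -> float:
    """x_k = 2 / (2k + 1), where the graph touches y = x or y = -x."""
    return 2.0 / (2 * k + 1)


def variation_on_touch_points(N: int) -> float:
    """P(f) for the partition 0 < x_N < ... < x_1 < 1 of [0, 1]."""
    pts = [0.0] + [touch_point(k) for k in range(N, 0, -1)] + [1.0]
    return sum(abs(f(b) - f(a)) for a, b in zip(pts, pts[1:]))


def lower_bound(N: int) -> float:
    """Sum over k = 1, ..., N-1 of x_k + x_{k+1} = (8k + 8) / (4k^2 + 8k + 3)."""
    return sum((8 * k + 8) / (4 * k * k + 8 * k + 3) for k in range(1, N))


for N in (10, 100, 1000, 10000):
    print(N, variation_on_touch_points(N), lower_bound(N))
```

Both columns grow roughly like $2 \log N$, a harmonic-type divergence, so no finite bound can serve as $V_0^1(f)$.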
EXERCISES

7.1 † Suppose $f$ is decreasing on $[a, b]$. Prove $V_a^b(f) = |f(b) - f(a)|$, so that $f \in BV[a, b]$.

7.2 Prove: If $f \in BV[a, b]$, then $f$ is bounded on $[a, b]$.

7.3 Let $f(x) = \sin(x^{100})$. Prove that $f \in BV[0, 10]$.

7.4 † Let $P$ be any partition of $[a, b]$, $x' \in [a, b]$, and $f : [a, b] \to \mathbb{R}$. Let $P^* = P \cup \{x'\}$. Prove: $P(f) \le P^*(f) \le V_a^b(f)$.

7.5 Let $P$ be any partition of $[a, b]$, $a \le b < x'$, and let $P' = P \cup \{x'\}$, a partition of $[a, x']$. Prove: $P(f) \le P'(f) \le V_a^{x'}(f)$.

7.6 Prove that $V_a^b(cf + g) \le |c| V_a^b(f) + V_a^b(g)$, so that $BV[a, b]$ is a vector space. (Hint: Consider $P(cf + g)$ and apply the triangle inequality.)

7.7 Let
$$f(x) = \begin{cases} x^2 \sin\left(\frac{\pi}{x^2}\right) & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}$$
See Fig. 7.2.
a) Prove that $f \notin BV[0, 1]$.
b) Prove that $f'$ exists on $[0, 1]$. Is $\|f'\|_{\sup} < \infty$? Explain.

7.8 Prove that $f \in BV[a, b]$ if and only if $f$ is the difference of two monotone functions.

7.9 If $f, g \in BV[a, b]$, prove that $fg \in BV[a, b]$. (Caution: the product of two monotone functions need not be monotone.)
Let the rational numbers in (0, 1) be listed in a sequence Xk. kEN. Let
ifO:::;x<xk, if Xk Let
00
f(x) =
L fk(x). k=l
Prove:
:::; X:::;
1.
EXERCISES
Figure 7.2
221
f(x) = x 2 sin {;fir), with envelope u(x) = x 2 , l(x) = -x 2 •
a) The series E~ 1
7.11
!k converges uniformly on [0, 1]. is increasing on [0, 1]. E 'R[O, 1]. has a jump discontinuity at every rational point Xk in {0, 1). (This means
c) d)
f f f
e)
although both one-sided limits exist.) f is continuous at each irrational value of x in [0, 1].
b)
Prove or give a counterexample: a) Iff E BV[a, b], then f E 'R[a, b].
Iff E 'R[a, b], then f E BV[a, b]. t Let f E BV[a, b] and suppose f is continuous at x 0
b)
7.12 E [a, b]. Prove: vax(f) is also continuous at x = xo. (Hint: Use Theorem 7.1.1, Example 7.1, and Exercise 7.6 above. You will need to use Exercises 2.8-2.12 to show that we can represent f = (h - ¢2 with ¢1 and ¢2 increasing and with ¢1 and ¢2 continuous at xo.) 7.13 <0 Follow the steps below to construct a function f E BV[O, 1] that is monotone increasing on [0, 1] and differentiable with f'(x) 2: 0 for all x E [0, 1], yet f' is unbounded on [0, 1], which implies also that f' ¢ C[O, 1].
a) Let Xn = 2~ for all n E N. Thus Xn '\. 0 as n ----> oo. Now use the function l from Exercise 5.62, to define f E C00 [Xn, Xn-l] so that
We require that f'(x) ;:::: 0 for all x and that f' vanish at at all x in the interval
Xn,
at
Xn-1.
and
b) Now link together the segments of the graph of $f$ smoothly for all the intervals indexed by $n$, and let $f(0) = 0$. Prove that $f'$ is unbounded although $f'(x)$ exists for all $x \in [0, 1]$, including $x = 0$. (Hint: Use the Mean Value Theorem.)

7.14 Suppose $f'(x)$ exists for all $x \in [a, b]$, and suppose $f' \in \mathcal{R}[a, b]$. Use the Fundamental Theorem of Calculus to prove that $f \in BV[a, b]$ and
$$V_a^b(f) \le \int_a^b |f'(x)|\, dx.$$

7.15 Let
$$g(x) = \begin{cases} \sin\left(\frac{\pi}{x}\right) & \text{if } 0 < x \le 1, \\ 0 & \text{if } x = 0. \end{cases}$$
Let $f(x) = \int_0^x g(t)\, dt$. Use Theorem 7.1.1 to prove $f \in BV[0, 1]$. (Hint: Use Exercise 3.27.)

7.16 Let $\mathcal{S}[a, b]$ denote the family of step functions defined on $[a, b]$ (Exercise 3.8), and let $\overline{\mathcal{S}}[a, b]$ denote the set of all uniform limits of step functions.
a) Prove: $\overline{\mathcal{S}}[a, b]$ is a complete normed vector space equipped with the sup-norm.
b) Prove: $BV[a, b]$ is not a complete normed vector space in the sup-norm.
c) Prove: $\overline{\mathcal{S}}[a, b] \supseteq C[a, b]$.
d) Prove: $\overline{\mathcal{S}}[a, b] \supseteq BV[a, b]$.
e) Prove: $\overline{\mathcal{S}}[a, b] \subseteq \mathcal{R}[a, b]$.
f) Let $V$ be any sup-norm complete vector space of functions that contains $BV[a, b]$. Prove that $V \supseteq \overline{\mathcal{S}}[a, b]$. (Because of this result, we call $\overline{\mathcal{S}}[a, b]$ the sup-norm completion of $BV[a, b]$. In general, the completion of a normed vector space $V$ can be defined as the intersection of all those complete normed vector spaces that contain $V$ as a normed vector subspace.)
g) Prove: $f \in \overline{\mathcal{S}}[a, b]$ if and only if $f(x+)$ and $f(x-)$ both exist for each $x \in (a, b)$ and $f(a+)$ and $f(b-)$ both exist. (See Exercises 2.8 and 2.9.) (Hint: Use the Heine-Borel Theorem for the implication from right to left.)
7.2 RIEMANN-STIELTJES SUMS AND INTEGRALS
The Riemann integral is defined by
$$\int_a^b f(x)\, dx = \lim_{\|P\| \to 0} P(f, \mu) = \lim_{\|P\| \to 0} \sum_{k=1}^{n} f(\mu_k) \Delta x_k,$$
where $\mu = \{\mu_k \in [x_{k-1}, x_k] \mid k = 1, 2, \ldots, n\}$ is a set of arbitrary evaluation points for $f$ and $\Delta x_k$ measures the length of the $k$th subinterval determined by the partition $P$ of $[a, b]$. The Riemann-Stieltjes integral differs in one important way from the latter concept. Instead of weighting each subinterval by its length $\Delta x_k$ as in the Riemann sums, we will let the change of a second function $g$ on that interval serve as the weight of the interval. There are many reasons for making such an extension of the concept of the integral. For example, the interval $[a, b]$ might be the space of possible outcomes of a probabilistic experiment. Then $\Delta g_k = g(x_k) - g(x_{k-1})$ could represent the probability of the outcome landing in the interval $[x_{k-1}, x_k]$ of possibilities, and the function $f$ could be the value in some sense of such an outcome. In this illustration, $\int_a^b f\, dg$ would be a probabilistically expected value to result from running the experiment. Another reason for extending the concept of integration in this way is that it will enable us to give a complete description of the dual space $C'[a, b]$ of the Banach space $C[a, b]$ of continuous functions on $[a, b]$.
Definition 7.2.1 Suppose $f, g : [a, b] \to \mathbb{R}$ and $P = \{x_0, x_1, \ldots, x_n\}$ is any partition of $[a, b]$. Let
$$\mu = \{\mu_k \in [x_{k-1}, x_k] \mid k = 1, 2, \ldots, n\}$$
and let $\Delta g_k = g(x_k) - g(x_{k-1})$ for all $k = 1, 2, \ldots, n$. Define the Riemann-Stieltjes sum
$$P(f, g, \mu) = \sum_{k=1}^{n} f(\mu_k) \Delta g_k.$$
We say that $f$ is Riemann-Stieltjes integrable with respect to $g$ on $[a, b]$ if and only if there exists $L \in \mathbb{R}$ such that for all $\epsilon > 0$ there exists $\delta > 0$ such that
$$\|P\| < \delta \implies |P(f, g, \mu) - L| < \epsilon,$$
independent of the choice of $P$ and of $\mu$ subject to the stipulations above. If this condition holds, we write
$$\int_a^b f\, dg = \lim_{\|P\| \to 0} P(f, g, \mu) = L$$
and $f \in \mathcal{RS}([a, b], g)$, the class of Riemann-Stieltjes integrable functions on $[a, b]$ with respect to $g$. Here $f$ is called the integrand and $g$ is called the integrator.
EXAMPLE 7.4 Let $f \in C[a, b]$ and let
$$g(x) = \begin{cases} 0 & \text{if } a \le x < t, \\ 1 & \text{if } t \le x \le b, \end{cases}$$
where $t$ is some fixed real number. Let $P$ be any partition of $[a, b]$. In order that $\Delta g_k \ne 0$, it is necessary and sufficient that $x_{k-1} < t \le x_k$. Thus
$$\int_a^b f\, dg = \lim_{\|P\| \to 0} f(\mu_k) \cdot 1 = f(t),$$
since $\mu_k \to t$ as $\|P\| \to 0$. Similar reasoning could be applied to any step function.
Theorem 7.2.1 Let $f \in \mathcal{R}[a, b]$ and suppose $g \in C^1[a, b]$, so that $g'$ is continuous. Then $\int_a^b f\, dg$ exists and
$$\int_a^b f\, dg = \int_a^b f(x) g'(x)\, dx,$$
which is a Riemann integral.

Remark 7.2.1 Notice that in the special case in which $g(x) = x$, we have $g' \equiv 1$ and $\int_a^b f\, dg = \int_a^b f(x)\, dx$, the ordinary Riemann integral of $f$.
Proof: We apply the Mean Value Theorem for derivatives to write
$$P(f, g, \mu) = \sum_{k=1}^{n} f(\mu_k) g'(\bar{x}_k) \Delta x_k \qquad (7.1)$$
$$= \sum_{k=1}^{n} f(\mu_k) g'(\mu_k) \Delta x_k + \sum_{k=1}^{n} f(\mu_k) [g'(\bar{x}_k) - g'(\mu_k)] \Delta x_k, \qquad (7.2)$$
where $\bar{x}_k \in [x_{k-1}, x_k]$. Since $f$ and $g'$ are in $\mathcal{R}[a, b]$, so is their product, and the first sum in Equation (7.2) converges to $\int_a^b f(x) g'(x)\, dx$ as $\|P\| \to 0$. Thus it suffices to prove that the second sum in Equation (7.2) converges to 0 as $\|P\| \to 0$. We prove this as follows. Let $\epsilon > 0$. Since $g'$ is uniformly continuous on $[a, b]$, and since $f$ is bounded, there exists $\delta > 0$ such that $\|P\| < \delta$ implies $|\bar{x}_k - \mu_k| < \delta$, which implies
$$|g'(\bar{x}_k) - g'(\mu_k)| < \frac{\epsilon}{\|f\|_{\sup}(b - a)},$$
making the second sum in Equation (7.2) less than $\epsilon$. (Note that if $b - a = 0$ or if $\|f\|_{\sup} = 0$, the claim about the second sum is trivial.) ∎
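The Riemann-Stieltjes sum $P(f, g, \mu)$ of Definition 7.2.1 translates directly into code, which gives a quick numerical check of Theorem 7.2.1. The sketch below is an illustration under assumed choices, $f(x) = x$ and $g(x) = x^2$ on $[0, 1]$, for which the theorem predicts $\int_0^1 f\, dg = \int_0^1 x \cdot 2x\, dx = 2/3$. The helper names are hypothetical.

```python
from typing import Callable, List, Sequence


def rs_sum(
    f: Callable[[float], float],
    g: Callable[[float], float],
    partition: Sequence[float],
    tags: Sequence[float],
) -> float:
    """P(f, g, mu) = sum of f(mu_k) * (g(x_k) - g(x_{k-1})).

    tags[k] must lie in [partition[k], partition[k+1]].
    """
    return sum(
        f(m) * (g(x1) - g(x0))
        for x0, x1, m in zip(partition, partition[1:], tags)
    )


def uniform_partition(a: float, b: float, n: int) -> List[float]:
    return [a + (b - a) * k / n for k in range(n + 1)]


def identity(x: float) -> float:
    return x


def square(x: float) -> float:
    return x * x


P = uniform_partition(0.0, 1.0, 1000)
left = rs_sum(identity, square, P, P[:-1])          # left endpoints as tags
right = rs_sum(identity, square, P, P[1:])          # right endpoints as tags
mid = rs_sum(identity, square, P, [(u + v) / 2 for u, v in zip(P, P[1:])])
print(left, mid, right)  # all three should be close to 2/3
```

The three tag choices bracket the limit, and their spread shrinks with the mesh, which is what independence of the choice of $\mu$ means in practice.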
Theorem 7.2.2 Suppose $f_i \in \mathcal{RS}([a, b], g_j)$ for $i$ and $j$ in $\{1, 2\}$. Let $c \in \mathbb{R}$. Then

i. $\int_a^b (c f_1 + f_2)\, dg_1$ exists and
$$\int_a^b (c f_1 + f_2)\, dg_1 = c \int_a^b f_1\, dg_1 + \int_a^b f_2\, dg_1.$$

ii. $\int_a^b f_1\, d(c g_1 + g_2)$ exists and
$$\int_a^b f_1\, d(c g_1 + g_2) = c \int_a^b f_1\, dg_1 + \int_a^b f_1\, dg_2.$$

Remark 7.2.2 This theorem says that $\int_a^b f\, dg$ is separately linear in each of its two variables $f$ and $g$. It also establishes that $\mathcal{RS}([a, b], g)$ is always a vector space.
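Proof: We will prove the first part here. (See Exercise 7.18 for the second part.)
$$\left| P(c f_1 + f_2, g_1, \mu) - \left[ c \int_a^b f_1\, dg_1 + \int_a^b f_2\, dg_1 \right] \right| \le |c| \left| P(f_1, g_1, \mu) - \int_a^b f_1\, dg_1 \right| + \left| P(f_2, g_1, \mu) - \int_a^b f_2\, dg_1 \right| \to |c| \cdot 0 + 0 = 0$$
as $\|P\| \to 0$. ∎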
In some ways the Riemann-Stieltjes integral has surprisingly different properties from those of the Riemann integral. Consider the following example.

EXAMPLE 7.5 Let
$$f(x) = \begin{cases} 0 & \text{if } x \in [0, 1], \\ 1 & \text{if } x \in (1, 2] \end{cases}$$
and let
$$g(x) = \begin{cases} 0 & \text{if } x \in [0, 1), \\ 1 & \text{if } x \in [1, 2]. \end{cases}$$
Then we make the following observations.

i. $\int_0^1 f\, dg = f(1) \cdot 1 = 0$, since $f \in C[0, 1]$ and $g$ is a step function on $[0, 1]$.

ii. $\int_1^2 f\, dg = 0$, since on $[1, 2]$ we have $g'(x) \equiv 0$, so
$$\int_1^2 f\, dg = \int_1^2 f(x) g'(x)\, dx = 0.$$

iii. $\int_0^2 f\, dg$ does not exist (that is, $f \notin \mathcal{RS}([0, 2], g)$), in stark contrast to the properties of the Riemann integral. Let us prove this claim as follows. No matter how small we make $\|P\|$, we can still have $x_{k-1} < 1 < x_k$, so that $\Delta g_k = 1$, and $f(\mu_k)$ can be either 0 or 1 depending upon how we choose $\mu_k$. Thus $P(f, g, \mu)$ can be either 0 or 1, and cannot be forced to converge to a limit merely by requiring $\|P\| \to 0$. Note that $f$ being Riemann-Stieltjes integrable on both $[0, 1]$ and $[1, 2]$ with respect to $g$ fails to force $f$ to be in $\mathcal{RS}([0, 2], g)$.
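The failure in part iii of Example 7.5 is easy to reproduce. The sketch below is an illustration with helper names chosen here. It uses uniform partitions of $[0, 2]$ with an odd number of subintervals, so that the point 1 always lies strictly inside one subinterval, and evaluates the Riemann-Stieltjes sum with left-endpoint tags and with right-endpoint tags.

```python
from typing import Callable, Sequence


def rs_sum(
    f: Callable[[float], float],
    g: Callable[[float], float],
    partition: Sequence[float],
    tags: Sequence[float],
) -> float:
    """P(f, g, mu) = sum of f(mu_k) * (g(x_k) - g(x_{k-1}))."""
    return sum(
        f(m) * (g(x1) - g(x0))
        for x0, x1, m in zip(partition, partition[1:], tags)
    )


def f(x: float) -> float:
    return 0.0 if x <= 1.0 else 1.0   # jumps just to the right of 1


def g(x: float) -> float:
    return 0.0 if x < 1.0 else 1.0    # jumps at 1


def left_and_right_sums(n: int):
    """Sums over the uniform partition of [0, 2] into n (odd) subintervals."""
    P = [2.0 * k / n for k in range(n + 1)]
    return rs_sum(f, g, P, P[:-1]), rs_sum(f, g, P, P[1:])


for n in (3, 101, 100001):
    print(n, left_and_right_sums(n))  # left tags give 0.0, right tags give 1.0
```

However fine the mesh, the two tag choices disagree by exactly 1, so no single limit $L$ can satisfy Definition 7.2.1.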
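Theorem 7.2.3 Let $a < b < c$. If $\int_a^b f\, dg$, $\int_b^c f\, dg$, and $\int_a^c f\, dg$ all exist, then
$$\int_a^c f\, dg = \int_a^b f\, dg + \int_b^c f\, dg.$$

Proof: Let $\epsilon > 0$. There exists $\delta_1 > 0$ such that if $P_1$ is a partition of $[a, b]$ with $\|P_1\| < \delta_1$, then
$$\left| P_1(f, g, \mu) - \int_a^b f\, dg \right| < \frac{\epsilon}{3}.$$
And there exists $\delta_2 > 0$ such that if $P_2$ is a partition of $[b, c]$ with $\|P_2\| < \delta_2$, then
$$\left| P_2(f, g, \mu) - \int_b^c f\, dg \right| < \frac{\epsilon}{3}.$$
And there exists $\delta_3 > 0$ such that if $P_3$ is a partition of $[a, c]$ with $\|P_3\| < \delta_3$, then $\left| P_3(f, g, \mu) - \int_a^c f\, dg \right| < \frac{\epsilon}{3}$. So let $\delta = \min\{\delta_1, \delta_2, \delta_3\} > 0$, and let $P_1$ and $P_2$ be any partitions of $[a, b]$ and of $[b, c]$, respectively, with $\|P_i\| < \delta$, $i = 1, 2$. And let $P = P_1 \cup P_2$, a partition of $[a, c]$ with $\|P\| < \delta$ also, so that $P(f, g, \mu) = P_1(f, g, \mu) + P_2(f, g, \mu)$. Then we have
$$\left| \int_a^c f\, dg - \left( \int_a^b f\, dg + \int_b^c f\, dg \right) \right| = \left| \left( \int_a^c f\, dg - P(f, g, \mu) \right) + \left( P(f, g, \mu) - \left[ \int_a^b f\, dg + \int_b^c f\, dg \right] \right) \right|$$
$$\le \left| \int_a^c f\, dg - P(f, g, \mu) \right| + \left| P_1(f, g, \mu) - \int_a^b f\, dg \right| + \left| P_2(f, g, \mu) - \int_b^c f\, dg \right| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon.$$
Since $\epsilon > 0$ is arbitrary, the left side must be zero. ∎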
Example 7.5 is a special case of the following theorem that the reader should bear in mind when dealing with the Riemann-Stieltjes integral.
Theorem 7.2.4 If $f$ and $g$ are both discontinuous at the same point $c \in [a, b]$, then $f \notin \mathcal{RS}([a, b], g)$.

Proof: Consider first the case in which there exists $\epsilon_f > 0$ and there exists $\epsilon_g > 0$ such that for all $\delta > 0$ there exist $\mu'$, $x'$ with
$$c < \mu' < x' < c + \delta, \quad |g(x') - g(c)| \ge \epsilon_g \quad \text{and} \quad |f(\mu') - f(c)| \ge \epsilon_f.$$
No matter how small we make $\|P\|$, the value of $P(f, g, \mu)$ can fluctuate by at least the fixed positive amount $\epsilon_f \epsilon_g$ by choosing $P$ with $c$ and $x'$ in it and then choosing either $c$ or $\mu'$ as the evaluation point for $f$ in the interval between $c$ and $x'$. Thus $\int_a^b f\, dg$ fails to exist. The other cases are very similar. ∎
EXERCISES

7.17 Suppose that $f \in C[a, b]$, $t \in (a, b)$, and
$$g(x) = \begin{cases} c_1 & \text{if } a \le x < t, \\ c & \text{if } x = t, \\ c_2 & \text{if } t < x \le b. \end{cases}$$
Prove that $f \in \mathcal{RS}([a, b], g)$ and $\int_a^b f\, dg = f(t)(c_2 - c_1)$.

7.18 Prove the second part of Theorem 7.2.2.

7.19 Find $\int_1^2 x\, d(\log x)$.

7.20 Find $\int_1^2 (x + x^3)\, d(\tan^{-1} x)$.

7.21 Find I~ $x\, d\lfloor x \rfloor$. (Note: The floor function $\lfloor x \rfloor$ denotes the greatest integer that does not exceed $x$.)

7.22 Let
$$f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \cap [a, b], \\ 0 & \text{if } x \in [a, b] \setminus \mathbb{Q}. \end{cases}$$
Prove that $\int_a^b f\, dg$ exists if and only if $g$ is a constant function.

7.23 Let the continuous function $f \in \mathcal{RS}([a, b], g)$, $p \in (a, b)$, and $h(x) = g(x)$ for all $x \in [a, b] \setminus \{p\}$. Prove: $f \in \mathcal{RS}([a, b], h)$ and $\int_a^b f\, dh = \int_a^b f\, dg$.

7.3 RIEMANN-STIELTJES INTEGRABILITY THEOREMS
The following theorem shows a remarkable symmetry between the roles of the integrand and the integrator functions in the concept of the Riemann-Stieltjes integral.
Theorem 7.3.1 (Integration by Parts) If $f \in RS([a, b], g)$ then $g \in RS([a, b], f)$ and
$$\int_a^b f\,dg + \int_a^b g\,df = f(b)g(b) - f(a)g(a) = (fg)\Big|_a^b.$$
Remark 7.3.1 This theorem appears for ordinary Riemann integration in the more familiar formula for integration by parts, written as follows:
$$\int_a^b f\,dg = (fg)\Big|_a^b - \int_a^b g\,df.$$
Proof: It will suffice to show that $\lim_{\|P\| \to 0} P(g, f, \mu)$ exists and
$$\lim_{\|P\| \to 0} P(g, f, \mu) = (fg)\Big|_a^b - \int_a^b f\,dg.$$
Equivalently, it will suffice to show
$$(fg)\Big|_a^b - P(g, f, \mu) \to \int_a^b f\,dg \quad\text{as } \|P\| \to 0.$$
In fact,
$$\begin{aligned}
(fg)\Big|_a^b - P(g, f, \mu) &= f(b)g(b) - f(a)g(a) - \{ g(\mu_1)[f(x_1) - f(x_0)] + g(\mu_2)[f(x_2) - f(x_1)] + \cdots + g(\mu_n)[f(x_n) - f(x_{n-1})] \} \\
&= f(a)[g(\mu_1) - g(a)] + f(x_1)[g(\mu_2) - g(\mu_1)] + \cdots + f(b)[g(b) - g(\mu_n)] \\
&\to \int_a^b f\,dg
\end{aligned}$$
as $\|P\| \to 0$, since $\{a, \mu_1, \mu_2, \ldots, \mu_n, b\}$ is a partition $P'$ of $[a, b]$ and $\|P'\| \le 2\|P\| \to 0$.
•
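The rearrangement used in the proof above can be checked numerically. The following is a minimal sketch, not part of the original text: the functions `f` and `g`, the helper names `rs_sum` and `random_tagged_partition`, and the partition size are all illustrative assumptions. It verifies that $(fg)|_a^b - P(g, f, \mu)$ equals a Riemann-Stieltjes sum for $\int_a^b f\,dg$ over $P' = \{a, \mu_1, \ldots, \mu_n, b\}$, up to rounding error.

```python
import random

def rs_sum(f, g, xs, mus):
    """Riemann-Stieltjes sum: sum of f(mu_k) * [g(x_k) - g(x_{k-1})]."""
    return sum(f(mu) * (g(xs[k + 1]) - g(xs[k])) for k, mu in enumerate(mus))

def random_tagged_partition(a, b, n, rng):
    """A partition of [a, b] into n pieces with one evaluation point per piece."""
    xs = [a] + sorted(rng.uniform(a, b) for _ in range(n - 1)) + [b]
    mus = [rng.uniform(xs[k], xs[k + 1]) for k in range(n)]
    return xs, mus

f = lambda x: x        # illustrative integrand (assumption)
g = lambda x: x * x    # illustrative integrator (assumption)
a, b = 0.0, 1.0
xs, mus = random_tagged_partition(a, b, 2000, random.Random(0))

# Left side of the identity: (fg)|_a^b - P(g, f, mu).
lhs = f(b) * g(b) - f(a) * g(a) - rs_sum(g, f, xs, mus)

# Right side: a sum for f dg over P' = {a, mu_1, ..., mu_n, b},
# using x_0, x_1, ..., x_n as the evaluation points.
rhs = rs_sum(f, g, [a] + mus + [b], xs)

identity_gap = abs(lhs - rhs)
# Both sums taken over the same tagged partition: close to (fg)|_a^b = 1.
parts_total = rs_sum(f, g, xs, mus) + rs_sum(g, f, xs, mus)
```

The identity holds exactly for every tagged partition, so `identity_gap` reflects only floating-point error, while `parts_total` approaches $f(b)g(b) - f(a)g(a)$ only as the mesh shrinks.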
• EXAMPLE 7.6
We evaluate $\int_{-1}^{2} x\,d|x| = 2|2| - (-1)|-1| - \int_{-1}^{2} |x|\,dx = 4 + 1 - \frac{5}{2}$, where the right-hand integral can be read directly from a graph.
Theorem 7.3.2 If $f \in C[a, b]$ and $g \in BV[a, b]$, then $f \in RS([a, b], g)$.
Remark 7.3.2 By Theorem 7.3.1, this theorem implies also that $g \in RS([a, b], f)$.
Proof: Since $g \in BV[a, b]$, $g = g_1 - g_2$, where $g_1$ and $g_2$ are both increasing on $[a, b]$. Thus it will suffice to prove the claim in the theorem for the case in which $g$ is increasing on $[a, b]$. The proof will be very similar in concept to the proof of Riemann integrability of each $f \in C[a, b]$. Let $P$ be any partition of $[a, b]$, let $M_k = \max_{x \in [x_{k-1}, x_k]} f(x)$ and $m_k = \min_{x \in [x_{k-1}, x_k]} f(x)$. Then
$$U(f, g, P) = \sum_{k=1}^{n} M_k\,\Delta g_k \quad\text{and}\quad L(f, g, P) = \sum_{k=1}^{n} m_k\,\Delta g_k.$$
Clearly,
$$L(f, g, P) \le P(f, g, \mu) \le U(f, g, P)$$
for all $P$ and $\mu$. It is easy to show, just as we did for Riemann sums in Chapter 3, that
$$P' \supseteq P \implies L(f, g, P) \le L(f, g, P') \le U(f, g, P') \le U(f, g, P).$$
Thus for all $P$ and $P'$ we have $L(f, g, P) \le U(f, g, P')$. We define the upper integral $\overline{\int_a^b} f\,dg$ to be the infimum of all the upper sums, and the lower integral $\underline{\int_a^b} f\,dg$ to be the supremum of all the lower sums, again just as for Riemann integration. Since every lower sum is less than or equal to every upper sum, we have
$$\underline{\int_a^b} f\,dg \le \overline{\int_a^b} f\,dg,$$
and we claim that these two are actually equal. Let $\epsilon > 0$. It would suffice to show that
$$\overline{\int_a^b} f\,dg - \underline{\int_a^b} f\,dg < \epsilon.$$
For this it would be sufficient to prove there exists $\delta > 0$ such that $\|P\| < \delta$ implies $U(f, g, P) - L(f, g, P) < \epsilon$. We may assume $g(b) > g(a)$, since otherwise the increasing function $g$ is constant and every sum is zero. By uniform continuity of $f$, there exists $\delta > 0$ such that $|x - x'| < \delta$ implies
$$|f(x) - f(x')| < \frac{\epsilon}{g(b) - g(a)}.$$
Thus if $\|P\| < \delta$, then $M_k - m_k < \frac{\epsilon}{g(b) - g(a)}$ for all $k$, and this implies
$$U(f, g, P) - L(f, g, P) < \epsilon.$$
Thus
$$\underline{\int_a^b} f\,dg = L = \overline{\int_a^b} f\,dg,$$
which defines the number $L$. Hence $\|P\| < \delta$ implies both $L$ and $P(f, g, \mu)$ must lie between $L(f, g, P)$ and $U(f, g, P)$, and hence within $\epsilon$ of each other. That is, $\|P\| < \delta$ implies $|P(f, g, \mu) - L| < \epsilon$, and the theorem is proven. •
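The squeezing of upper and lower sums in the proof of Theorem 7.3.2 can be observed numerically. This is only a sketch under stated assumptions: the choices $f = \exp$ and $g(x) = x^3$ on $[0, 1]$, the uniform partitions, and the helper name `upper_lower` are illustrative and do not come from the text.

```python
import math

def upper_lower(f, g, a, b, n, samples=50):
    """Estimate U(f, g, P) and L(f, g, P) on the uniform partition with n pieces.

    The max and min of f on each subinterval are estimated by sampling;
    this is exact here only because the chosen f is monotone.
    """
    U = L = 0.0
    for k in range(n):
        x0 = a + (b - a) * k / n
        x1 = a + (b - a) * (k + 1) / n
        vals = [f(x0 + (x1 - x0) * i / samples) for i in range(samples + 1)]
        dg = g(x1) - g(x0)      # nonnegative because g is increasing
        U += max(vals) * dg
        L += min(vals) * dg
    return U, L

f = math.exp                     # continuous integrand (assumption)
g = lambda x: x ** 3             # increasing integrator (assumption)

gaps = []
for n in (10, 100, 1000):
    U, L = upper_lower(f, g, 0.0, 1.0, n)
    gaps.append(U - L)

# For this smooth g the integral equals the ordinary integral of
# exp(x) * 3x^2 over [0, 1], which is 3e - 6.
exact = 3 * math.e - 6
```

The gap $U - L$ shrinks with the mesh, and the common limit is trapped between the last pair of lower and upper sums.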
Theorem 7.3.3 For each $g \in BV[a, b]$, define $T_g : C[a, b] \to \mathbb{R}$ by
$$T_g(f) = \int_a^b f\,dg.$$
Then $T_g$ is a continuous linear functional on $C[a, b]$ and
$$\|T_g\| \le V_a^b(g).$$
Remark 7.3.3 The symbol $\|T_g\|$ is called the norm of $T_g$. Norms of bounded linear functionals were defined in Remark 5.4.2.
Proof: By Theorem 7.3.2 $T_g$ is a linear functional. It suffices to prove $T_g$ is bounded. However, for all partitions $P$ of $[a, b]$, we have
$$|P(f, g, \mu)| = \left|\sum_{k=1}^{n} f(\mu_k)\,\Delta g_k\right| \le \sum_{k=1}^{n} |f(\mu_k)|\,|\Delta g_k| \le \|f\|_{\sup} \sum_{k=1}^{n} |\Delta g_k| = \|f\|_{\sup} P(g) \le \|f\|_{\sup} V_a^b(g).$$
Thus
$$|T_g(f)| = \left|\lim_{\|P\| \to 0} P(f, g, \mu)\right| \le \|f\|_{\sup} V_a^b(g)$$
for all $f \in C[a, b]$. It follows that $\|T_g\| \le V_a^b(g)$.
•
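The chain of inequalities in the proof of Theorem 7.3.3 holds for every tagged partition, which makes it easy to spot-check. The sketch below assumes illustrative choices ($f = \cos$ and the non-monotone integrator $g(x) = \sin 3x$ on $[0, 2]$); the helper names are hypothetical.

```python
import math
import random

def rs_sum(f, g, xs, mus):
    """Riemann-Stieltjes sum: sum of f(mu_k) * [g(x_k) - g(x_{k-1})]."""
    return sum(f(mu) * (g(xs[k + 1]) - g(xs[k])) for k, mu in enumerate(mus))

def variation_sum(g, xs):
    """P(g): the sum of |g(x_k) - g(x_{k-1})| over the partition xs."""
    return sum(abs(g(xs[k + 1]) - g(xs[k])) for k in range(len(xs) - 1))

f = math.cos                         # continuous integrand (assumption)
g = lambda x: math.sin(3 * x)        # integrator of bounded variation (assumption)

rng = random.Random(1)
xs = [0.0] + sorted(rng.uniform(0.0, 2.0) for _ in range(499)) + [2.0]
mus = [rng.uniform(xs[k], xs[k + 1]) for k in range(500)]

sup_f = 1.0                          # sup norm of cos on [0, 2], attained at 0
lhs = abs(rs_sum(f, g, xs, mus))     # |P(f, g, mu)|
rhs = sup_f * variation_sum(g, xs)   # ||f||_sup * P(g), itself at most ||f||_sup * V(g)
```

Since $P(g) \le V_a^b(g)$ for every partition, the computed `rhs` is itself a lower estimate of the bound in the theorem.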
EXERCISES 7.24
Evaluate I~ 1 x d(lxl
7.25
Evaluate Io'~~"
12
+ [x]).
xd(cosx).
Let p E [a, b] and define T: C[a, b] ----> lR by T(f) = f(p), a so-called point evaluation. Prove that Tis a bounded linear functional on C[a, b] with liT II = 1.
7.26
=
7.27 LetT be as in Exercise 7.26. Find a function g E BV[a, b] such that T_9 (!) T(f), where T 9 is defined in Theorem 7.3.3. Can you find gin this exercise in such a way that Vd'(g) = 1 = I!TI!? Explain. 7.28 Letg E BV[a,b] andsupposeh(x) = g(x) exceptatonepointx = p E (a, b). Show that hE BV[a, b] and Th T9 • Must Vd'(h) = Vd'(g)? If yes, prove it. If no, give a counterexample.
=
Let g E BV[O, 2] such that
7.29
ifO::S:x<1, if X= 1, if 1 <X:'::: 2. Prove that for all I E C[O, 2], T 9 (!) depends only upon the difference between c 1 and c3, and is independent of c2.
IIT'.9 II
< V02 (g).
7.30
Give an example of g E BV[O, 2] for which
7.31
Let l(x) be defined as in Example 7.3. Let 1[-!,,I] be the indicator function
of the interval [ ~, 1], and let ln(x)
= l(x )1[ -f.,l] (x) for all x
E [0, 1]. Prove:
a) In E BV[O, 1] for all n EN. b) In----> I uniformly on [0, 1]. c) Prove or give a counterexample: the uniform limit of a sequence of func-
tions of bounded variation must be of bounded variation. 7.32
Let {
O
if~ :':::X:'::: 1, ifO <X<.!. n
l(x) = {
~
if(}< X:'::: 1, if X= 0.
x2
Yn(x) = and let
Prove: a) The sequence Yn converges uniformly on [0, 1] to g(x) b) IE RS([O, 1], Yn) for all n EN. c) ltf.RS([0,1],g).
= x 2 as n ---->
oo.
7.4 THE RIESZ REPRESENTATION THEOREM 27
27 The Riesz Representation Theorem is not required for any other part of this book. It is included here because it is very important, depends only upon advanced calculus topics presented earlier in this book, and provides an excellent introduction to advanced, graduate level analysis.
We have seen in Theorem 7.3.3 that to each $\alpha \in BV[a, b]$, there corresponds a bounded linear functional $T_\alpha : C[a, b] \to \mathbb{R}$ by
$$T_\alpha(f) = \int_a^b f\,d\alpha$$
and that $\|T_\alpha\| \le V_a^b(\alpha)$ for all $\alpha$. Thus $T_\alpha \in C'[a, b]$, the dual space of the Banach space $C[a, b]$. Our next theorem establishes that every $T \in C'[a, b]$ can be represented as being $T_\alpha$ for some suitable $\alpha \in BV[a, b]$. We adapt here the constructive method of proof presented in the classic book Functional Analysis by Frigyes Riesz and Béla Sz.-Nagy [18] in 1955. The proof is a considerably larger undertaking than those appearing earlier in the present advanced calculus text. It requires however only theorems with which the reader is familiar already. 28
Theorem 7.4.1 (Riesz Representation Theorem) Let $T \in C'[a, b]$. Then there exists $\alpha \in BV[a, b]$ such that $T = T_\alpha$. That is,
$$T(f) = T_\alpha(f) = \int_a^b f\,d\alpha$$
for all $f \in C[a, b]$. Moreover, $\alpha$ can be selected so that $V_a^b(\alpha) = \|T\|$ and so that $\alpha(a) = 0$.
Proof: Since the proof is substantial, we present first the intuitive idea that motivates it. We are given some $T \in C'[a, b]$, so for all $f \in C[a, b]$, $T(f) \in \mathbb{R}$, and $T$ is both bounded and linear. Let $M = \|T\|$, for convenience. Note that the only kind of function we have a right to apply $T$ to is a continuous function $f$. For example, we would need to justify application of $T$ to such a function as
$$1_{[a,t)}(x) = \begin{cases} 1 & \text{if } a \le x < t, \\ 0 & \text{if } t \le x \le b \end{cases}$$
since the latter function is not continuous at $t$. Observe that $1_{[a,a)}$ is the indicator function of the empty set, which is therefore identically zero. We are going to prove that it is possible to extend the domain of definition of $T$ to a larger space that includes all functions such as $1_{[a,t)}$ while keeping $\|T\| = M$ even on this larger space. 29 Then we will define
$$\alpha(t) = \begin{cases} T(1_{[a,t)}) & \text{if } t \in [a, b), \\ T(1_{[a,b]}) & \text{if } t = b. \end{cases}$$
28 Note that there is more than one theorem in the subject of functional analysis with the name Riesz Representation Theorem. Another famous one, known also as the Riesz-Fischer Theorem, describes the relationship between $l^2$ and $L^2$, using orthonormal bases. It can be interpreted also as describing the convergence of Fourier series in the square-norm, which the reader has met in Remark 6.5.2. Both theorems will likely be encountered again in graduate courses.
29 The construction of the necessary extension of $T$ occupies most of the lengthy proof of the Riesz Representation Theorem. A much shorter proof [8] exists, in which the extension is available automatically from the Hahn-Banach Theorem, which is beyond the scope of this book. Although the shorter, more advanced proof exists, there is intellectual merit in the constructive proof because the Hahn-Banach Theorem depends upon the Axiom of Choice, whereas the constructive proof is independent of that axiom.
Observe that $\alpha(a) = T(0) = 0$. We will prove that $\alpha \in BV[a, b]$ and that $T = T_\alpha$. The intuitive motivation comes from the following formal observation, in which we pretend to be able to compute $\int_a^b 1_{[a,t)}\,d\alpha$ for some not yet known $\alpha \in BV[a, b]$. (This is a pretense since this Riemann-Stieltjes integral cannot even exist unless $\alpha$ happens to be continuous at $t$.) We take a partition $P$ and calculate
$$P(1_{[a,t)}, \alpha, \mu) = \sum_{k=1}^{n} 1_{[a,t)}(\mu_k)[\alpha(x_k) - \alpha(x_{k-1})] = \sum_{k=1}^{l} [\alpha(x_k) - \alpha(x_{k-1})] = \alpha(x_l) - \alpha(a) = \alpha(x_l),$$
where $x_l$ is the last partition point for which $\mu_l$ lies inside $[a, t)$. So we can hope that $P(1_{[a,t)}, \alpha, \mu) \to \alpha(t)$ as $\|P\| \to 0$. That is, $\int_a^b 1_{[a,t)}\,d\alpha = T(1_{[a,t)})$. Now we proceed to the rigorous arguments.
We have $T \in C'[a, b]$ and $\|T\| = M < \infty$. We are going to show how to extend $T$ to every bounded function $f$ having the property that there exists a sequence $f_k \in C[a, b]$ such that for all $x \in [a, b]$ we have $f_k(x) \nearrow f(x)$. Notice that $f$ need not be continuous, so Dini's theorem does not apply here, and that $f$ could be a function such as $1_{[a,t)}$, for all $t \in [a, b]$ (Exercise 7.33). The proof will proceed in five parts.
i. Let $B^+[a, b]$ denote the set of all bounded functions $f$ for which there exists $\{f_k\} \subset C[a, b]$ with $f_k(x) \nearrow f(x)$ for all $x \in [a, b]$. We claim that if $f \in B^+[a, b]$ and $f_k$ is as just described then the sequence $T(f_k)$ is convergent. Moreover, if $g_k \in C[a, b]$ such that for all $x \in [a, b]$ we have $g_k(x) \nearrow f(x)$, then
$$\lim_{k \to \infty} T(g_k) = \lim_{k \to \infty} T(f_k).$$
This will enable us to define $T(f) = \lim_{k \to \infty} T(f_k)$. So let $f \in B^+[a, b]$ and let $B = \|f\|_{\sup} < \infty$. (We remark that of course $C[a, b] \subset B^+[a, b]$.) In order to show that the sequence $T(f_n)$ converges, it will suffice to show that
$$T(f_n) = T(f_1) + \sum_{k=2}^{n} [T(f_k) - T(f_{k-1})]$$
converges. We will show that
$$T(f_1) + \sum_{k=2}^{\infty} [T(f_k) - T(f_{k-1})]$$
converges absolutely. Let
$$a_k = \operatorname{sgn}[T(f_k) - T(f_{k-1})],$$
where sgn denotes the signum function. We have
$$\sum_{k=2}^{n} |T(f_k) - T(f_{k-1})| = T\left(\sum_{k=2}^{n} a_k (f_k - f_{k-1})\right) \le M(B + \|f_1\|_{\sup})$$
for all $n$ since
$$\left|\sum_{k=2}^{n} a_k (f_k - f_{k-1})(x)\right| \le \sum_{k=2}^{n} (f_k - f_{k-1})(x) = (f_n - f_1)(x) \le B + \|f_1\|_{\sup} < \infty,$$
which means that
$$\left\|\sum_{k=2}^{n} a_k (f_k - f_{k-1})\right\|_{\sup} \le B + \|f_1\|_{\sup}.$$
Thus $\sum_{k=2}^{\infty} [T(f_k) - T(f_{k-1})]$ is absolutely convergent, and $T(f_n)$ converges. In order to complete the first part, we need to show that if in addition to $f_k$ we have also a sequence $g_k \in C[a, b]$ such that $g_k(x) \nearrow f(x)$ for all $x \in [a, b]$, then
$$\lim_{k \to \infty} T(g_k) = \lim_{k \to \infty} T(f_k).$$
We know that $f_k(x) \nearrow f(x)$ and $g_k(x) \nearrow f(x)$ for all $x$. It follows that $f_k - \frac{1}{k} \nearrow f$ and $g_k - \frac{1}{k} \nearrow f$ pointwise as well, these two sequences being strictly increasing at each $x$. And $\left\|\frac{1}{k}\right\|_{\sup} \to 0$ implies $T\left(\frac{1}{k}\right) \to 0$. So it would suffice to show that $f_k \nearrow f$ and $g_k \nearrow f$ strictly and pointwise on $[a, b]$ implies
$$\lim_{k \to \infty} T(f_k) = \lim_{k \to \infty} T(g_k).$$
What we know from the first part of this proof is that $T(f_k)$ and $T(g_k)$ both converge. We claim that for each $k$ there exists $j$ such that $f_k < g_j$ for all $x$. If that were false for some $k$, then consider $x_j$ such that $f_k(x_j) \ge g_j(x_j)$, for all $j$. Since $x_j \in [a, b]$, the Bolzano-Weierstrass theorem implies there exists a subsequence $x_{j_i} \to p \in [a, b]$. By continuity of $f_k$ and of each $g_j$, and because the $g_j$ increase with $j$, $f_k(p) \ge g_j(p)$ for all $j$. But $g_j(p) \nearrow f(p)$, so $f(p) \le f_k(p)$. Yet $f_k(p) \nearrow f(p)$ is a strictly increasing sequence. This is a contradiction. Hence we see that $f_k$ and $g_k$ have subsequences such that
$$f_{k_1} < g_{j_1} < f_{k_2} < g_{j_2} < \cdots$$
and this sequence also increases strictly at each point to $f$. But that means $T$ of this sequence converges, and so each subsequence of this sequence converges to the same limit. That is,
$$\lim_{i \to \infty} T(f_{k_i}) = \lim_{i \to \infty} T(g_{j_i}),$$
which implies in turn that $\lim_{k \to \infty} T(f_k) = \lim_{k \to \infty} T(g_k)$.
ii. Now we are able to define
T(f) = lim T(fk) k--->oo for all f E B+[a, b]. If c > 0, clearly cfk / cf at each x too, so the set B+ [a, b] is closed under multiplication by positive scalars. Similarly, fr and h E B+ [a, b] implies !I + h E B+ [a, b] (Exercise 7.34). Now let
B[a,b]
= {f =!I- hI JI,h E B+[a,b]}.
Then B[a, b] is a vector space of functions (Exercise 7.6). Moreover, we claim that we can define
T(J) = T(JI) - T(h) for all f E B[a, b]. For this extension ofT to be well-defined, we need to know that if
f = !I - h = 91
- 92
are two representations off, where
then
T(JI)- T(h) = T(g1)- T(g2) and T so-defined is linear on B[a, b] (Exercise 7.36). iii. Next, we wish to show that T, as extended above, remains bounded on the vector space B[a, b]. We will show that
IT(J)I :::; Mllfllsup, whereM is still the normofTasgiveninitially onC[a.b]. So let j, g E n+ [a, b] and suppose fk, 9k E C[a, b] such that for all x E [a, b] we have fk(x) / f(x) and 9k(x) / g(x). Although fk- 9k --+ f(x)- g(x), it is not necessarily true that
[fk(x)- 9k(x)] / [f(x)- g(x)].
What is more serious for our purposes is that we may have
$$\|f_k - g_k\|_{\sup} > \|f - g\|_{\sup} = K.$$
But we can fix this last problem by a method known as truncation as follows. We define a new sequence of functions $\phi_n \in C[a, b]$ (see Exercise 7.37) by
$$\phi_n(x) = \begin{cases} f_n(x) & \text{if } |f_n(x) - g_n(x)| \le K, \\ g_n(x) + K & \text{if } f_n(x) - g_n(x) > K, \\ g_n(x) - K & \text{if } f_n(x) - g_n(x) < -K. \end{cases} \tag{7.3}$$
Also,
$$\|\phi_n - g_n\|_{\sup} \le K = \|f - g\|_{\sup}.$$
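The truncation in Equation (7.3) amounts to clipping $f_n$ to the band between $g_n - K$ and $g_n + K$. The sketch below is illustrative only: the helper name `truncate` and the sample functions are assumptions, chosen so that all three cases of (7.3) occur.

```python
def truncate(fn, gn, K):
    """Return phi_n of Equation (7.3): fn clipped to the band [gn - K, gn + K]."""
    def phi(x):
        return min(max(fn(x), gn(x) - K), gn(x) + K)
    return phi

fn = lambda x: 3.0 * x      # illustrative f_n (assumption)
gn = lambda x: 0.0          # illustrative g_n (assumption)
K = 1.0
phi = truncate(fn, gn, K)

grid = [-1.0 + 2.0 * i / 200 for i in range(201)]
# Largest distance from phi to gn on the grid; Equation (7.3) forces it to be at most K.
max_dev = max(abs(phi(x) - gn(x)) for x in grid)
```

Writing the three cases as a single `min(max(...))` also makes the continuity asked for in Exercise 7.37 plausible, since the maximum and minimum of continuous functions are continuous.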
We need to know that c/Jn(x) / f(x) for all x E [a,b]. This can be seen from the geometrical meaning of the truncation defined above in Equation (7.3) as follows. Consider the band in the plane trapped between the graphs of 9n(x) + K and 9n(x) - K, over the interval [a, b]. The whole band moves upwards as n increases strictly since 9n (x) increases. If the graph of f n slips either over the top or under the bottom of the band, then we truncate the graph of fn with the upper or lower boundary curve, respectively. This produces c/Jn· It is clear that
c/Jn(x) S max{fn(x), 9n(x)} S f(x) for all n and for all x. To see that c/Jn (x) / , consider the fact that for each x and for each n, c/Jn (x) must be either the middle, the upper, or the lower value permitted by Equation (7.3). The only way it is conceivable that c/Jn(x) ~ c/Jn+I (x) is if c/Jn+I (x) is a lower value among the three possibilities than is c/Jn (X). For example, if c/Jn (X) = 9n (X) + K and c/Jn+ 1(X) = 9n+ 1(X) - K, then
c/Jn+I(x)
9n+I(x)- K > fn+I(x) ~ fn(x) > 9n(x) + K = c/Jn(x).
=
On the other hand, if c/Jn (x) = f n (x), we could have
c/Jn+I(x) = 9n+I(x)- K > fn+1(x) ~ fn(x) = c/Jn(x). In each case, c/Jn(x)
S c/Jn+I(x). Moreover, since
9n(x) / g(x), fn(x) / f(x) and lf(x)- g(x)l S K for all x, lim c/Jn(x) = lim fn(x) = f(x).
n-+oo
n-+oo
Now we can reason as follows.
IT(!- g)l
=I
n-+CXJ
=
lim IT(¢n- 9n)l ~ MK
lim T(¢n)- lim T(gn)l n-+oo
n--->oo
Thus even on B[a, b] we have the extended T with
liT II =
M.
iv. Now we define a(t) = T(1[a,t)) for all t E [a, b), and a(b) = T(1[a,bj)· We observe that a(a) = T(1[a,a)) = T(O) = 0. The reason for the slight difference in the way a(b) is defined will become clear in part v of this proof. We claim that
vd'(a) ~
IITII =
M.
To prove this, we let P = { xo, x1, ... , xn} be any partition of [a, b], and we form n
P(a) =
2: la(xk)- a(Xk-I)I. k=l
Observe that for all k we can select a number Ek E { ±1} such that
la(xk)- a(Xk-dl
=
Ek[a(xk)- a(xk_I)]
=T
(Ek[1[a,xk)- 1[a,Xk-dl) ·
Thus we can write
P(a) = T
{tf.k
[1[a,xk) -1[a,xk_ 1 ) ] }
= T(f), where we denote by f the argument ofT. Note that lf(x) I ~ 1 for all x, so that IT(!) I :S MIIJIIsup = M. Hence
Vd'(a) :S
IITII·
If we can show that T = Tw then we will know from Theorem 7.3.3 that the opposite inequality holds as well, and then we will know that IITII = v;(a). v. We claim that T = Ta. To prove this, we let P = {xo,XI, ... ,xn} be any partition of [a, b]. Let f E C[a, b]. We need to show that T(f) = Ta(f). Consider
P(f, a, x)
=
n
n
k=l
k=l
2: f(xk)[a(xk)- a(Xk-d] = 2: f(xk)~ak
(7.4)
This is a Riemann-Stieltjes sum for
I:
fda, with f.Lk selected to be Xk for fda as
all k = l, ... ,n. Thus the sum in Equation (7.4) converges to IIPII --; 0.
I:
On the other hand, P(f, a, x) = T(
0 there exists 8 > 0 such that liP I < 8 implies II!-
P(f,a,x) = T(
IIPII --; 0.
Hence T(f) =
I:
fda.
•
We have observed already that if a E BV[a, b], then
Ta+c =To: E C'[a, b] for all c E R This enables us to restrict our attention, without loss of generality, to those a E BV[a, b] for which a( a) = 0, which we assume henceforth. Also, changing the value of a at just one point, or finitely many points, in (a, b) has no effect on To:. We must ask ourselves under what circumstances in general To: and Tp will be the same linear functionals on C[a, b], where a and f3 E BV[a, b] with a(a) = 0 = f3(a). First, suppose p E (a, b) is a point at which a E BV[a, b] happens to be continuous. Consider the continuous function fn that is identically I on [a,p], 0 on [P + ~' b], and linear on [p, p + ~] . We see that
Now the third integral is 0 by definition of fn· The middle integral is bounded in absolute value by as n --; oo since
vt+~ (a)= vt+~ (a)- Vf(a),
v;
which approaches 0 since (a) is continuous function of x at each point of continuity of a (Exercise 7.12). But the first integral is just a(p)- a(a) = a(p ). Thus we see that a(p) is determined uniquely by the action of the original linear functional T at each point p of continuity of a. Namely,
a(p) = lim T(fn). n--->oo
Now suppose p E (a, b) is a point of discontinuity of a. We know from Exercise 2.25 that a monotone function has at most countably many discontinuities. Since a is a difference of two monotone functions, the same is true for a, because the union of two countable sets is again countable. Moreover, lim a(x) = a(p+)
x-+p+
and lim o:(x) = o:(p-)
X-+p-
both exist since this is true for monotone functions. Also, every interval (p, p + o) or (p - o, p) of positive length contains uncountably many points, and hence contains points of continuity of a. Thus a(p+) and a(p-) are uniquely determined by a at points of continuity, which means these one-sided limits are determined uniquely by the initial linear functional T that is being represented as T 0 • All that remains is to ask what is the value of o:(p) itself. But changing the value of a at p has no effect on To: since if r(x) = 1 at x = p and 0 everywhere else, T 1 (f) = 0 for all f E C[a, b]. But if we require a to be right-continuous on (a, b), then a(x) is uniquely determined by T. (See Exercise 7.39.) Definition 7.4.1 We define
$$BV_0[a, b] = \{\alpha \in BV[a, b] \mid \alpha(a) = 0,\ \alpha \text{ right-continuous on } (a, b)\}.$$
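As a concrete illustration of a member of $BV_0[a, b]$, a right-continuous unit step at an interior point $p$ represents the point evaluation $T(f) = f(p)$ of Exercise 7.26. The sketch below is an assumption-laden example rather than part of the text: the interval, the point `p`, the test function, and the helper name `rs_sum_uniform` are all illustrative.

```python
import math

def rs_sum_uniform(f, alpha, a, b, n):
    """Riemann-Stieltjes sum on the uniform partition, evaluated at midpoints."""
    total = 0.0
    for k in range(n):
        x0 = a + (b - a) * k / n
        x1 = a + (b - a) * (k + 1) / n
        total += f(0.5 * (x0 + x1)) * (alpha(x1) - alpha(x0))
    return total

a, b, p = 0.0, 2.0, 0.7
# Right-continuous step with alpha(a) = 0; its total variation is 1.
alpha = lambda x: 1.0 if x >= p else 0.0
f = math.cos                      # any continuous f would do (assumption)

approx = rs_sum_uniform(f, alpha, a, b, 10_000)
```

Only the subinterval containing the jump contributes, so the sum is the value of $f$ at an evaluation point within one mesh width of $p$, and it tends to $f(p)$ by continuity.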
Remark 7.4.1 By virtue of the reasoning above, for all $T \in C'[a, b]$ there exists a unique $\alpha \in BV_0[a, b]$ such that $T = T_\alpha$. Note that we do not require right-continuity at $x = a$ since we have agreed to let $\alpha(a) = 0$, and we may need a jump discontinuity on the right at $a$, for example to provide for the functional $T(f) = f(a)$. And right-continuity would be a vacuous requirement at $x = b$, in addition to which we are forced to let $\alpha(b) = T(1_{[a,b]})$ if $\alpha(a) = 0$. Moreover, $\|T\| = V_a^b(\alpha)$, and $\alpha \to T_\alpha$ is a linear map of the vector space $BV_0[a, b]$ injectively (meaning one-to-one) and surjectively (meaning onto) $C'[a, b]$. (A map that is both injective and surjective is called a bijection.) Since $C'[a, b]$ is a Banach space it follows that $BV_0[a, b]$ is also a Banach space with the so-called total variation norm $\|\alpha\|_{tv} = V_a^b(\alpha)$. Moreover, $C'[a, b]$ and $BV_0[a, b]$ are isomorphic as Banach spaces. Some of the details of the Banach space properties are in Exercises 7.39 and 7.40 below.
EXERCISES
7.33 Lett E [a, b]. Find a sequence fk E C[a, b] such that fk(x) /' l[a,t)(x) for all X E [a, b]. 7.34 Prove: /I, h E B+[a, b] implies !I+ h E B+[a, b], where B+[a, b] is as defined in part (1) of the proof of Theorem 7 .4.1. Show also that if c ~ 0 then
cfi E B+[a,b].
7.35 Prove: B[a, b] is a vector space of functions, where B[a, b] is defined in part (2) of the proof of Theorem 7 .4.1. 7.36
If
f = h - h = 91 - 92 are two representations off, where fi, /2, 9b 92 E fl+[a, b], then T(fi) - T(h) = T(g!) - T(92). Prove that Tis linear on B[a, b]. 7.37
Prove that the functions
c/Jn defined in Equation (7.3) are continuous.
7.38 a) Prove that 1 [a,t] 1[a,t] E
i
B+ [a, b] if a :::; t < b, but
-1 [a,t] E
B+ [a, b] so that
B[a,b].
b) Let a< c
< d:::; b. Prove that 1[c,d) i B+[a, b], but 1[c,d) E B[a, b].
7.39 Let (3 E BV[a, b] and {Pk I k E N} be the set of points in (a, b) at which (3 is discontinuous. Let f3n be the same as (3 except that f3n(Pk) = f3(pk+) for all k = 1, ... , n. And let (3' be the same as (3 except that (3' (Pk) = f3(Pk+) for all k E N. Show that (3' E BV[a, b] and that
v:(f3n - (3')
----t
0
as n----> oo. Prove that Tf3' = Tf3. 7.40 Let an be a Cauchy sequence in BVo [a, b], in the sense of the total variation norm: For all f > 0 there exists N E N such that n and m ;::: N implies
Prove that there exists a E BV0 [a, b] such that
as n----> oo. That is, prove that BV0 [a, b] is complete. (Hint: Consider the sequence
which is already known to be complete, and use the uniqueness of the correspondence between a and Ta in Remark 7.4.1.)
7.5 TEST YOURSELF
EXERCISES 7.41
Let
f(x)
f
True or False:
E
= {
~ 2 (sin(~) +sin(~))
if X =/= 0, if X= 0.
BV[O, 1]. Explain.
7.42 Give an example of an integrable function f (x) that is not of bounded variation on [0, 1].
I:
7.43 Find X d is no less than x.)
r l (Note: The ceiling function r l denotes the least integer that X
X
7.44
True or False: The Riemann-Stieltjes integral I~ 1 Lx Jd sgn x exists.
7.45
True or False: The Riemann-Stieltjes integral 1
1
7r
x 2 sin - d sgn x X
-1
exists. (We interpret the integrand function at x = 0 as having the value 0.)
= 1{1}·
7.46
Let g(x)
7.47 that
LetT : C[O, 2]
Find I 0 tan- 1 x dg.
---->
2
R by T(f)
1
Find a function g E BV[O, 2] such
2
T(f)
for all
7.48
= 2f(1). =
f
dg,
f E C[O, 2]. Suppose that f(x) =ex for all x. Let ifO ~ x < 1, if X= 1, if 1
Evaluate I 0
2
7.49
I: f
f
<X~
2.
dg.
True or Give a Counterexample: If dg exists and
I: f dg and Ibc f dg both exist,
then
PART III
ADVANCED CALCULUS IN SEVERAL VARIABLES
CHAPTER 8
EUCLIDEAN SPACE
8.1 EUCLIDEAN SPACE AS A COMPLETE NORMED VECTOR SPACE
Throughout pure and applied mathematics, it is necessary to consider functions of more than one variable. If a function depends on $n$ real variables, $x_1, x_2, \ldots, x_n$, it is possible to combine these $n$ real variables into one vector variable
$$x = (x_1, x_2, \ldots, x_n).$$
In this notation, $x$ is not a real number, but rather an $n$-tuple of real numbers. In addition, it is often necessary to consider functions that have vector values instead of real values. Up to this point, our course has focused on real-valued functions of one real variable. In this chapter we will begin the rigorous study of vector-valued functions of vector variables. In the study of real-valued functions of a single real variable, we saw the advantages of considering normed vector spaces of more than one dimension. For example, the reader has seen that the vector space $C[a, b]$ has infinitely many dimensions (Exercise 2.57). In the present chapter, we will focus on finite-dimensional vector spaces equipped with what is called the Euclidean norm. However, we begin with a more general context.
Advanced Calculus: An Introduction to Linear Analysis. By Leonard F. Richardson Copyright © 2008 John Wiley & Sons, Inc.
Definition 8.1.1 In any (real) vector space $V$ (Table 2.1, p. 59), we call a function $(\cdot, \cdot) : V \times V \to \mathbb{R}$ a scalar product if and only if it has the following three properties.
i. $(ax + y, z) = a(x, z) + (y, z)$ for all $a \in \mathbb{R}$ and for all $x$, $y$, and $z$ in $V$.
ii. $(x, y) = (y, x)$ for all $x$ and $y$ in $V$.
iii. $(x, x) \ge 0$ for all $x \in V$ and $(x, x) = 0 \iff x = 0 \in V$.
• EXAMPLE 8.1
In the vector space $\mathbb{R}^n$, define the Euclidean scalar product by
$$(x, y) = \sum_{i=1}^{n} x_i y_i. \tag{8.1}$$
The reader should verify that this product satisfies all three conditions to be called a scalar product.
Theorem 8.1.1 In any (real) vector space $V$ equipped with a scalar product as defined in Definition 8.1.1, we define
$$\|x\| = \sqrt{(x, x)} \tag{8.2}$$
for all $x \in V$. The function $\|\cdot\|$ as defined in Equation (8.2) is a norm, as in Definition 2.4.4, and the Cauchy-Schwarz Inequality is satisfied:
$$|(x, y)| \le \|x\|\,\|y\|.$$
Proof: To prove the Cauchy-Schwarz inequality, we fix $x$ and $y$, and we proceed as follows. For all $t \in \mathbb{R}$, define a polynomial
$$p(t) = (tx + y, tx + y).$$
Observe that $p(t) \ge 0$ for all $t$. By linearity of $(\cdot, \cdot)$ in each variable we see that
$$p(t) = \|x\|^2 t^2 + 2(x, y)t + \|y\|^2 = at^2 + bt + c,$$
where $a = \|x\|^2$, $b = 2(x, y)$, and $c = \|y\|^2$. But the quadratic polynomial $p(t) \ge 0$ for all $t \in \mathbb{R}$ if and only if $b^2 - 4ac \le 0$, which is equivalent to $b^2 \le 4ac$. Hence
$$|(x, y)| \le \|x\|\,\|y\|.$$
The first two conditions of Definition 2.4.4 are easily verified for $\|\cdot\|$. The third condition, the triangle inequality, is left for Exercise 8.1. •
Definition 8.1.2 If a vector space $V$ is equipped with a norm $\|\cdot\|$, we define the open ball of radius $r \ge 0$ about $p \in V$ by
$$B_r(p) = \{v \in V \mid \|v - p\| < r\}.$$
Similarly, we define the closed ball of radius $r \ge 0$ about $p \in V$ by
$$\bar{B}_r(p) = \{v \in V \mid \|v - p\| \le r\}.$$
The bar above the symbol $B_r$ indicates that the set $\bar{B}_r$ is the closed ball, meaning that it includes the spherical boundary surface.
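The discriminant argument in the proof of Theorem 8.1.1 can be traced numerically for the Euclidean scalar product of Equation (8.1). This is a sketch under stated assumptions: the vectors are random samples in $\mathbb{R}^5$, and the helper names `dot`, `norm`, and `p` are illustrative.

```python
import math
import random

def dot(x, y):
    """Euclidean scalar product of Equation (8.1)."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Norm induced by the scalar product, as in Equation (8.2)."""
    return math.sqrt(dot(x, x))

rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(5)]   # sample vectors (assumption)
y = [rng.uniform(-1, 1) for _ in range(5)]

# Coefficients of p(t) = (tx + y, tx + y) = a t^2 + b t + c.
a, b, c = dot(x, x), 2 * dot(x, y), dot(y, y)
discriminant = b * b - 4 * a * c

def p(t):
    """The polynomial from the proof, computed directly from its definition."""
    z = [t * xi + yi for xi, yi in zip(x, y)]
    return dot(z, z)
```

A nonpositive `discriminant` is exactly the statement $4(x, y)^2 \le 4\|x\|^2\|y\|^2$, which is the Cauchy-Schwarz inequality after taking square roots.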
• EXAMPLE 8.2 In the familiar Cartesian plane of Euclidean geometry, with the norm of a vector being its geometrical length, we have
$$(x, y) = x_1 y_1 + x_2 y_2 = \|x\|\,\|y\| \cos\theta,$$
where $\theta$ is the angle between $x$ and $y$. Thus the Cauchy-Schwarz inequality in the plane follows from the fact that $|\cos\theta| \le 1$ for all $\theta$. (The Cauchy-Schwarz inequality follows alternatively from the argument given in Theorem 8.1.1.) In the plane of Euclidean geometry, $B_1(0)$ is the region strictly inside the circle of radius 1 centered at the origin. The reader should check that $B_0(p) = \emptyset$, the empty set, for all points $p$.
Definition 8.1.3 If $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ are in $\mathbb{R}^n$, the set of all $n$-tuples of real numbers, we define the Euclidean scalar product of $x$ and $y$ as in Equation (8.1) and the Euclidean norm of $x$ as in Equation (8.2). The Euclidean space $\mathbb{E}^n$ of $n$ dimensions is defined to be the vector space $\mathbb{R}^n$ equipped with the Euclidean scalar product and the Euclidean norm.
The reader should have checked that the Euclidean scalar product as defined above does satisfy the three properties required to be a scalar product, and that the Euclidean norm is in fact a norm. We remark that it is very common when dealing with $\mathbb{E}^n$ to call it $\mathbb{R}^n$ informally. This means that we should suppose that the Euclidean inner product and norm are in use unless stated explicitly to the contrary. (See Exercise 8.4.)
Theorem 8.1.2 The Euclidean normed vector space $\mathbb{E}^n$ has the following two properties.
i. The Euclidean space $\mathbb{E}^n$ is complete in the sense of Definition 2.5.3.
ii. For a sequence of vectors $x^{(j)}$ in $\mathbb{E}^n$, the sequence $x^{(j)} \to x \in \mathbb{E}^n \iff x_k^{(j)} \to x_k$ as $j \to \infty$, for each $k = 1, 2, \ldots, n$.
Remark 8.1.1 The second part of the theorem tells us that convergence of a sequence in $\mathbb{E}^n$ is equivalent to convergence in each separate coordinate sequence.
Proof: The reader will recall that the vectors $x \in \mathbb{R}^n$, which are the vectors of $\mathbb{E}^n$, comprise a vector space using the familiar operations under which
$$cx = (cx_1, \ldots, cx_n)$$
and
$$x + y = (x_1 + y_1, \ldots, x_n + y_n).$$
We begin by proving claim (i). To see that every convergent sequence in any normed vector space is Cauchy in the sense of Definition 2.5.3, see Exercise 8.8. We will prove that every Cauchy sequence $x^{(k)}$ in $\mathbb{E}^n$ converges. By hypothesis, for each $\epsilon > 0$ there exists $K$ such that $j$ and $k \ge K$ implies $\|x^{(j)} - x^{(k)}\| < \epsilon$. However, for each $l = 1, \ldots, n$, we have
$$\left|x_l^{(j)} - x_l^{(k)}\right| \le \left\|x^{(j)} - x^{(k)}\right\| < \epsilon.$$
Thus the sequence $x_l^{(j)}$ is Cauchy in $\mathbb{R}$, and so it has a limit: $x_l^{(j)} \to x_l$ as $j \to \infty$, for each $l = 1, \ldots, n$. Now we denote $x = (x_1, \ldots, x_n)$. Then we conclude that
$$\left\|x^{(j)} - x\right\| = \sqrt{\sum_{l=1}^{n} \left(x_l^{(j)} - x_l\right)^2} \to 0$$
as $j \to \infty$, by the theorems governing limits of sequences. Hence the original Cauchy sequence does converge in the sense of the norm of $\mathbb{E}^n$. Note that we have also proven that if each sequence of coordinates converges, then the sequence of vectors in $\mathbb{E}^n$ converges as well. We leave the second claim to Exercise 8.9.
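The two inequalities behind Theorem 8.1.2, namely that each coordinate error is at most the norm error and that the norm error is at most $\sqrt{n}$ times the largest coordinate error, can be watched on a sample sequence. The sequence below is an illustrative assumption, not one taken from the text.

```python
import math

def norm(v):
    """Euclidean norm of a vector given as a sequence of floats."""
    return math.sqrt(sum(c * c for c in v))

def term(j):
    """j-th term of an illustrative sequence in E^2 with limit (0, 1)."""
    return (1.0 / j, j * math.sin(1.0 / j))

limit = (0.0, 1.0)
errors = []
for j in (1, 10, 100, 1000):
    diff = [c - l for c, l in zip(term(j), limit)]
    coord = max(abs(d) for d in diff)       # largest coordinate error
    errors.append((coord, norm(diff)))      # paired with the norm error
```

Each pair in `errors` satisfies `coord <= norm <= sqrt(2) * coord`, which is why convergence in norm and convergence in every coordinate are equivalent.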
• EXAMPLE 8.3 Here are two examples of convergence and divergence, which means failure of convergence, for sequences of vectors.
(1;
2 i. Let x(j) = 2-+:_ 3 , j sin (}-)). Then x(j) ----) ( ~, 1) E IE as j ----) oo, because of the convergence of the sequence in each coordinate separately. 1
u. In IE 2 , the sequence x(j) = (},
j) diverges as j ----) oo.
Figure 8.1 The unit ball $B_1(0)$ in the taxicab norm.
• EXAMPLE 8.4
Here is an example of a non-Euclidean norm that can be placed on the vector space $\mathbb{R}^2$. Define the taxicab norm on $\mathbb{R}^2$ by
$$\|x\|_t = |x_1| + |x_2|.$$
In Exercise 8.4 the reader will show that the taxicab norm satisfies the requirements of Definition 2.4.4 to be a norm. To understand the name taxicab norm consider a city in which all roads run parallel to the x-axis or the y-axis. The length of an honest taxicab ride would be determined using the taxicab norm. Fig. 8.1 shows the unit ball around the origin in the taxicab norm.
EXERCISES
8.1 † Complete the proof of Theorem 8.1.1 by proving the triangle inequality:
$$\|x + y\| \le \|x\| + \|y\|$$
for all $x$ and $y$ in $V$. That is, prove that the norm defined in terms of a given scalar product does satisfy the triangle inequality. (Hint: Apply the Cauchy-Schwarz Inequality.)
8.2 In $\mathbb{E}^n$, $\|x - y\|$ is interpreted geometrically as being the distance between $x$ and $y$. Prove the geometrical version of the triangle inequality in $\mathbb{E}^n$:
$$\|x - z\| \le \|x - y\| + \|y - z\|$$
for all $x$, $y$, and $z$ in $\mathbb{E}^n$. Interpret this inequality in terms of a geometrical triangle in $\mathbb{E}^n$.
8.3 Two vectors $x$ and $y$ in $\mathbb{E}^n$ are called orthogonal if and only if $(x, y) = 0$. We will prove the Pythagorean Theorem in $\mathbb{E}^n$.
a) Prove: The vectors $x$ and $y$ are orthogonal if and only if
$$\|x + y\|^2 = \|x\|^2 + \|y\|^2.$$
b) Interpret part (a) as a theorem about triangles in $\mathbb{E}^n$ with a vertex at the origin.
c) Let $x$, $y$, and $z$ be in $\mathbb{E}^n$. Prove that the triangle with vertices at $x$, $y$ and $z$ is a right triangle with right angle at $x$ if and only if
$$\|y - x\|^2 + \|z - x\|^2 = \|y - z\|^2.$$
8.4 † Define the taxicab norm as in Example 8.4. Show that $\|\cdot\|_t$ satisfies the requirements of Definition 2.4.4 to be a norm.
8.5 Find necessary and sufficient conditions on the real numbers a 1 and a2 to assure that (x, Y)a = a1X1Y1 + a2X2Y2 satisfies Definition 8.1.1 and is thus a bona fide scalar product on the two-dimensional vector space !R2 • Then sketch the unit ball around the origin in !R2 using the norm I · II a that is determined by this scalar product.
8.6
In lE 1 , what is the open ball B 1 ( 0)? Answer the same question in JE3 .
8.7
Find the lirn1_, 00 x(j) in JE 2 , or state that it does not exist. a) xUl
=
(}.(l+}r).
b) x<Jl = (}.
-jk ).
8.8 t Prove that every convergent sequence x
t Show that x<Jl
~ x in the sense of the norm of lEn if and only if the sequence
x~J) ~
x1 for each
8.10
Suppose tk ~ t E lR and x(k) ~ x E JE"' ask ~ oo. Prove: tkx(k) ~ tx as
l = 1, ... , n, completing the proof of Theorem 8.1.2.
k~oo.
8.11 A sequence x<Jl in lEn is called bounded if there exists a number M > 0 such that llx(j) II ::; M for all j E N. a) Prove that every convergent sequence in lEn is bounded.
b) Give an example of a bounded sequence in IE 2 that is not convergent. c) t (The Bolzano-Weierstrass Theorem) Prove that every bounded sequence in lEn has a convergent subsequence. (Hint: One approach is to give a proof by induction on the dimension n. Or one can work informally as follows, applying the Bolzano-Weierstrass Theorem for lR a total of n times. Here is a notational suggestion: If one wishes to denote a subsequence of a subsequence, use two strictly increasing functions ¢1 and ¢ 2 mapping N into itself, and x(t o¢ 2 (j)) can be used as in Definition 1.5.1 to denote the jth term of a subsequence of a subsequence of a sequence xJ .)
8.12 Prove: A sequence $x_n \to L \in \mathbb{E}^k$ if and only if every subsequence $x_{n_i}$ possesses a sub-subsequence $x_{n_{i_j}}$ that converges to $L$ as $j \to \infty$. (Hint: To prove the only if part, suppose false and write out the logical negation of convergence of $x_n$ to $L$.)
8.13 Prove or give a counterexample: A sequence $x_n \in \mathbb{E}^k$ converges if and only if every subsequence $x_{n_i}$ possesses a sub-subsequence $x_{n_{i_j}}$ that converges as $j \to \infty$.
8.14 a) Suppose a vector space $V$ is equipped with an inner product $\langle\cdot,\cdot\rangle$, and suppose we define a corresponding norm by $\|x\|^2 = \langle x, x\rangle$. Prove the Parallelogram Law:
$$\|x+y\|^2 + \|x-y\|^2 = 2\|x\|^2 + 2\|y\|^2.$$
b) Prove that the taxicab norm defined in Exercise 8.4 does not correspond as in part (a) above to any inner product on $\mathbb{R}^2$.
c) Under the hypotheses of part (a) above, prove the identity
$$\langle x, y\rangle = \frac{1}{4}\left(\|x+y\|^2 - \|x-y\|^2\right).$$
d) (For this part, you might be able to negotiate extra credit from your teacher.) Suppose only that $V$ has a norm. Define what is hoped to be a scalar product on $V$ by the formula in the preceding part. Prove that this defines a legitimate scalar product on $V$ provided the norm satisfies the Parallelogram Law of part (a). (Hints: Positivity and symmetry are easy to verify. To show additivity of the product in the first variable, express $4\langle x+y, z\rangle$ in terms of the identity given in the previous part. Then add and subtract $\|x-y+z\|^2$ to show that
Then let $x$ and $y$ change roles and add the results. When this is done, show that $\langle ax, y\rangle = a\langle x, y\rangle$ for all $a \in \mathbb{N}$. Extend this to $a \in \mathbb{Q}$. Then justify that the inner product is a continuous function of $x$ and extend to all $a \in \mathbb{R}$.)
8.15 Prove that Euclidean space is self-dual. That is, prove that $T : \mathbb{E}^n \to \mathbb{R}$ is a bounded linear functional if and only if there exists $y \in \mathbb{E}^n$ such that $T(x) = \langle x, y\rangle$ for each $x \in \mathbb{E}^n$.
8.2 OPEN SETS AND CLOSED SETS
In the calculus of one variable, it is possible to accomplish much with functions defined on an interval. Intervals are quite simple sets to consider as domains on which functions may be defined. Many useful theorems concerning the continuity, differentiability, or integrability of functions can be proven based on whether a domain is a closed or an open interval, or a finite or an infinite interval. In two or more variables, however, the domain of definition of a function can be much more complicated than an interval. For example, the real-valued function
$$f(x, y) = \frac{1}{x\sqrt{1 - 2x^2 - 3y^2}}$$
is defined on the domain that is the region lying inside but not on an ellipse, excluding those points inside the ellipse that are on the y-axis. It is easy to construct more intricate examples than this. Hence a rigorous study of the calculus of functions defined on a Euclidean space $\mathbb{E}^n$ necessitates a careful study of those topological properties of sets in $\mathbb{E}^n$ that are needed to prove important theorems regarding continuity, differentiability, and integrability of such functions. Most of the definitions and theorems work for all vector spaces $V$ equipped with a norm, so we state them in this context.
Definition 8.2.1 In a normed vector space $V$ a subset $S \subseteq V$ is called an open set if and only if for each $x \in S$ there exists a number $r > 0$ corresponding to $x$ such that the open ball $B_r(x) \subseteq S$.
Expressed informally, a set $S$ is said to be open if each point $p$ of $S$ has at least some small open ball, or buffer zone, around itself that remains entirely inside $S$. The radius of the buffer zone may need to be smaller for some points of $S$ than for others.
• EXAMPLE 8.5 The first example is the open ball $B_r(x)$, which we claim is an open set. (The language open ball would be a poor choice if the claim just made were false, but a name alone does not establish that a definition is satisfied.)
Proof: To show that the definition of openness is satisfied, we must show that if $y \in B_r(x)$ then there exists $d > 0$ such that $B_d(y) \subseteq B_r(x)$. We claim that $d = r - \|y - x\| > 0$ works. In fact, if $z \in B_d(y)$, then
$$\|z - x\| \le \|z - y\| + \|y - x\| < d + \|y - x\| = r,$$
proving that $z \in B_r(x)$ as claimed. •
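The choice $d = r - \|y - x\|$ in Example 8.5 can be probed numerically. The following Python sketch is only an illustration: the helper names (`norm`, `buffer_radius`, `sample_in_ball`, `check_example`) and the sample data are assumptions introduced here, and random sampling can support the claim but does not prove it.

```python
import math
import random


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(c * c for c in v))


def buffer_radius(x, r, y):
    """Return d = r - ||y - x||, the radius used in Example 8.5."""
    d = r - norm([yi - xi for yi, xi in zip(y, x)])
    if d <= 0:
        raise ValueError("y does not lie in the open ball B_r(x)")
    return d


def sample_in_ball(center, radius, rng):
    """Pick a point z with ||z - center|| < radius (random direction and length)."""
    direction = [rng.gauss(0.0, 1.0) for _ in center]
    length = norm(direction)
    if length == 0.0:
        return list(center)
    scale = 0.999 * radius * rng.random() / length
    return [c + scale * t for c, t in zip(center, direction)]


def check_example(x, r, y, trials=1000, seed=0):
    """Return True if every sampled point of B_d(y) also lies in B_r(x)."""
    rng = random.Random(seed)
    d = buffer_radius(x, r, y)
    for _ in range(trials):
        z = sample_in_ball(y, d, rng)
        if norm([zi - xi for zi, xi in zip(z, x)]) >= r:
            return False
    return True


# Hypothetical sample data: x is the center, r the radius, y a point of B_r(x).
x, r, y = [0.0, 0.0, 0.0], 2.0, [1.0, 1.0, 0.5]
print(buffer_radius(x, r, y), check_example(x, r, y))
```

Points $y$ closer to the boundary of $B_r(x)$ produce a smaller $d$, which matches the remark that the buffer radius may need to be smaller for some points than for others.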
Theorem 8.2.1 Suppose $A$ is an arbitrary set. Suppose for each $\alpha \in A$ there is an open set $O_\alpha \subseteq V$ indexed by $\alpha$. Then $O = \bigcup_{\alpha \in A} O_\alpha$ is an open set.
Remark 8.2.1 In this theorem, the index set $A$ may be finite or infinite, countable or uncountable.
Proof: In words, this theorem says that the union of a family of open sets must always be open. Let $x \in O$. We must show there exists $r > 0$ such that $B_r(x) \subseteq O$. But there exists $\alpha \in A$ such that $x \in O_\alpha$. Since $O_\alpha$ is open, there exists $r > 0$ such that $B_r(x) \subseteq O_\alpha \subseteq O$. •
It is natural at this point to ask whether the intersection of an arbitrary family of open sets must be open. This is false, as shown in Exercise 8.21.
Theorem 8.2.2 If $O_1, \ldots, O_k$ is a family of finitely many open sets, then $O = \bigcap_{j=1}^{k} O_j$ is open.
Proof: In words, this says the intersection of finitely many open sets must be open. For each $x \in O$, $x \in O_j$ for each $j$. Hence there exists for each $j$ a number $r_j > 0$ such that $B_{r_j}(x) \subseteq O_j$. Let $r = \inf\{r_1, \ldots, r_k\}$, so that $r > 0$. Then $B_r(x) \subseteq O$. •
Definition 8.2.2 In a normed vector space $V$, a set $S \subseteq V$ is called a closed set if and only if its complement $S^c = V \setminus S$ is open.
We will use the following definition to prove an alternative and equivalent form of the concept of closed set.
Definition 8.2.3 A point $p$ is called a cluster point of a subset $S$ of a normed vector space $V$ if and only if there exists a sequence of vectors $x^{(j)} \in S \setminus \{p\}$ such that $x^{(j)} \to p$ as $j \to \infty$. If $p \in S$ but $p$ is not a cluster point of $S$, then $p$ is called an isolated point of $S$.
Note that a cluster point of $S$ may or may not be an element of the set $S$. An isolated point of $S$ must, however, be an element of $S$.
Theorem 8.2.3 A subset $S$ of a normed vector space $V$ is closed if and only if every cluster point $p$ of $S$ is an element of $S$.
Proof: Suppose first that $S$ is closed. Let $p$ be any cluster point of $S$. We need to prove that $p \in S$. By hypothesis, there is a sequence $x^{(j)} \in S \setminus \{p\}$ such that $x^{(j)} \to p$ as $j \to \infty$. Thus, for every open ball $B_r(p)$ there exists $J \in \mathbb{N}$ such that $j \ge J \implies x^{(j)} \in B_r(p)$. Hence every open ball centered at $p$ contains points of $S$, so $p \notin V \setminus S$ since the latter set is open. Hence $p \in S$.
Next, we suppose every cluster point of $S$ is in $S$. We need to prove that the complement of $S$ is open. Let $x \in S^c = V \setminus S$. We claim there exists $r > 0$ such that $B_r(x) \subseteq S^c = V \setminus S$. Suppose this claim were false. Then for each $j \in \mathbb{N}$ there exists $x^{(j)} \in S \setminus \{x\}$ such that $x^{(j)} \in B_{1/j}(x)$. But then $x^{(j)} \to x$. Thus $x$ is a cluster point of $S$, which forces $x \in S$. This is a contradiction, which proves the claim. •
• EXAMPLE 8.6
Every closed ball $\bar{B}_r(x)$ is a closed set.
Proof: We will show that the complement of the closed ball is open. If $y \in V \setminus \bar{B}_r(x)$, let $\delta = \|y - x\| - r$. We claim that $B_\delta(y)$ is contained in the complement of the closed ball. So let $z \in B_\delta(y)$. Then
$$\|z - x\| + \|z - y\| \ge \|x - y\|$$
by the triangle inequality, so
$$\|z - x\| \ge \|x - y\| - \|z - y\| > \|x - y\| - \delta = r.$$
Thus $z$ lies outside the closed ball, as required. •
• EXAMPLE 8.7
It follows from the preceding example that for each $p \in \mathbb{E}^n$, the singleton set $\{p\} = \bar{B}_0(p)$ is a closed set. The reader should show that this singleton set $\{p\}$ is not an open set. Note that this does not follow from some linguistic relationship between the words open and closed in everyday parlance. Instead, it is necessary to show that if $r > 0$ then $B_r(p)$ is not a subset of $\{p\}$.
Theorem 8.2.4 Suppose $A$ is an arbitrary set. ($A$ may be finite or infinite, countable or uncountable.) Suppose for each $\alpha \in A$ there is a closed set $F_\alpha \subseteq \mathbb{E}^n$ indexed by $\alpha$. Then $\mathcal{F} = \bigcap_{\alpha \in A} F_\alpha$ is a closed set.
Proof: It suffices to observe that
$$\mathcal{F} = \Big(\bigcup_{\alpha \in A} F_\alpha^c\Big)^c.$$
Since each set $F_\alpha^c$ is open, being the complement of the closed set $F_\alpha$, the union is open and its complement, $\mathcal{F}$, is closed. •
EXERCISES
8.16 Prove that Theorem 8.2.2 implies that the empty set $\emptyset$ is open in any normed vector space $V$, and that the whole space $V$ is closed.
8.17
For each of the following sets, state whether it is open, closed, both, or neither. a) A= {x E JE2 1 x 1 ~ 0 }. b) B= {xElE2 Ix1 =0}. c) C = {x E lE 2 Ix1?: 0 andx2 > 0}. d) E = {X E !E2 IX} + X2 > 0}. e) F = {x E !E2 Ix1 + x 2 :::; 0}. f) D = AUB. g) G = {x E lE 2 < + x~ < 4}.
It xi
8.18 Prove that a point $p$ is a cluster point of a set $D \subseteq \mathbb{E}^n$ if and only if for each $\delta > 0$ there exists $x \in D$ such that $0 < \|x - p\| < \delta$.
8.19 Prove that if $p$ is an isolated point of $D$, then once $\delta > 0$ is sufficiently small the set $\{x \in D \mid 0 < \|x - p\| < \delta\} = \emptyset$, the empty set.
8.20 Let $O$ denote an arbitrary open subset of $\mathbb{E}^n$. Prove that $O$ is the union of a (possibly infinite) family of open balls.
8.21 † Let $p \in \mathbb{E}^n$. Show that the singleton set $\{p\}$ can be written as the intersection of an infinite sequence of open balls, thereby showing that intersections of open sets need not be open. (Hint: Define a sequence $r_n > 0$ for which $\inf\{r_n \mid n \in \mathbb{N}\} = 0$.)
8.22 Prove or give a counterexample: No set $S \subseteq \mathbb{E}^n$ can be both open and closed.
8.23 Prove or give a counterexample: Every set $S \subseteq \mathbb{E}^n$ must be either open or closed.
8.24 Prove that if $F_1, \ldots, F_k$ is a family of finitely many closed sets, then $\mathcal{F} = \bigcup_{i=1}^{k} F_i$ is closed.
8.25 Use Example 8.7 to prove that every set $S \subseteq \mathbb{E}^n$ is the union of a family of closed sets.
8.26 † If $S \subset \mathbb{E}^n$, define the interior of $S$, denoted by $S^\circ$, to be
$$S^\circ = \{x \in S \mid \exists\, r > 0 \text{ such that } B_r(x) \subseteq S\}.$$
Prove that $S \subseteq \mathbb{E}^n$ is an open set if and only if $S = S^\circ$.
8.27 Prove that $(\bar{B}_r(x))^\circ = B_r(x)$ in the space $\mathbb{E}^n$.
8.28 Define the closure $\bar{S}$ of $S \subseteq \mathbb{E}^n$ to be the intersection of all closed sets that contain $S$. Prove that $\bar{S}$ is a closed set and that $\bar{S} = S \cup C$, where $C$ is the set of all cluster points of $S$.
8.29 Prove: The closed ball $\bar{B}_r(x)$, defined in Definition 8.1.2, is actually the closure of the open ball $B_r(x)$, provided $r > 0$.
8.30 Define the boundary of $S \subseteq \mathbb{E}^n$, denoted by $\partial S$, to be $\partial S = \bar{S} \setminus S^\circ$. Find the boundary of $B_r(x)$ in $\mathbb{E}^n$.
8.31 Define a set $S \subseteq \mathbb{E}^n$ to be dense in $\mathbb{E}^n$ if and only if $\bar{S} = \mathbb{E}^n$. Denote by $\mathbb{Q}^n$ the subset of $\mathbb{E}^n$ consisting of vectors with exclusively rational coordinates. Prove that $\mathbb{Q}^n$ is dense in $\mathbb{E}^n$.
8.32 Prove or give a counterexample: Every open set in $\mathbb{E}^n$ can be expressed as the union of either a finite or a countable family of open balls.
8.33 Find $(\mathbb{Q}^n)^\circ$ and $\partial\mathbb{Q}^n$. Justify your answers. (Hint: It will help to consider the density of the set of irrational numbers in the set $\mathbb{R}$.)
8.3 COMPACT SETS
In Chapters 1 through 3, we saw that closed, finite intervals $[a, b]$ have very useful properties as domains of continuous functions. Such intervals were very important for establishing an Extreme Value Theorem (Theorem 2.4.2) and a uniform continuity theorem (Theorem 2.3.2), for example. In this section we seek to describe the class of subsets of $\mathbb{E}^n$ that have similar properties as domains for functions of vector variables. We take our cue from the Heine-Borel Theorem (Theorem 1.7.2) for functions defined on intervals $[a, b]$.
Definition 8.3.1 If $E \subseteq \mathbb{E}^n$, we call a family $\mathcal{O} = \{O_\alpha \mid \alpha \in A\}$ of open sets $O_\alpha$ an open cover of $E$ if and only if
$$E \subseteq \bigcup_{\alpha \in A} O_\alpha.$$
If there exists a finite subset $F$ of the index set $A$ such that
$$E \subseteq \bigcup_{\alpha \in F} O_\alpha,$$
then we call $\mathcal{O}_F = \{O_\alpha \mid \alpha \in F\}$ a finite subcover of $E$. We call the set $E$ compact if and only if every open cover of $E$ has a finite subcover. A set $E$ is called bounded if and only if $\sup\{\|x\| \mid x \in E\} < \infty$.
The reader should take care not to confuse the concept of an open cover with the union of its elements.
• EXAMPLE 8.8 Let
$$S = \{(m, n) \mid m \in \mathbb{Z},\ n \in \mathbb{Z}\} = \mathbb{Z}^2 = \mathbb{Z} \times \mathbb{Z}.$$
We claim that the subset $S$ of $\mathbb{E}^2$ is not compact. For a proof, we observe that
$$S \subset \bigcup_{(m,n) \in S} B_1((m, n)).$$
However, the open cover $\{B_1((m, n)) \mid m \in \mathbb{Z},\ n \in \mathbb{Z}\}$ of $S$ has no finite subcover. Do you see why not?
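One way to approach the question in Example 8.8: distinct lattice points are at distance at least 1 from each other, so each open ball $B_1((m, n))$ contains exactly one point of $S$, namely its own center. A finite subfamily therefore covers only finitely many lattice points. The Python sketch below illustrates this on a bounded window of the lattice; the function names and the window size are assumptions made for this illustration.

```python
import math


def lattice_points_in_ball(center, window=3):
    """List the lattice points within `window` steps of the center (in each
    coordinate) that lie in the open unit ball B_1(center)."""
    cm, cn = center
    hits = []
    for m in range(cm - window, cm + window + 1):
        for n in range(cn - window, cn + window + 1):
            if math.hypot(m - cm, n - cn) < 1:  # strict inequality: open ball
                hits.append((m, n))
    return hits


def uncovered_point(centers):
    """Given finitely many centers, return a lattice point missed by every
    ball B_1(c): any lattice point that is not itself a center will do."""
    chosen = set(centers)
    m = 0
    while (m, 0) in chosen:
        m += 1
    return (m, 0)


centers = [(0, 0), (1, 0), (5, 7)]
p = uncovered_point(centers)
print(lattice_points_in_ball((2, -5)), p)
```

The search in `uncovered_point` always terminates because only finitely many centers are given, which is exactly where finiteness of the subfamily enters the argument.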
• EXAMPLE 8.9
The set $B_1(0)$ is not a compact subset of $\mathbb{E}^2$. For a proof, show that
$$\mathcal{O} = \{B_r(0) \mid 0 < r < 1\}$$
is an open cover of $B_1(0)$, yet there is no finite subcover. The reader should show why this is true.
The following theorem provides a powerful generalization of the phenomena exhibited by the two preceding examples.
Theorem 8.3.1 (Heine-Borel Theorem for $\mathbb{E}^n$) A set $E \subseteq \mathbb{E}^n$ is compact if and only if $E$ is both closed and bounded.
Proof: We leave the proof that every compact set must be both bounded and closed to Exercises 8.38 and 8.39, respectively. Here we suppose that $E$ is both closed and bounded, and we will prove that $E$ is necessarily compact. We suppose that this were false, and deduce a contradiction. Thus we assume there exists an open cover $\mathcal{O} = \{O_\alpha \mid \alpha \in A\}$ of $E$ for which there exists no finite subcover. Since $E$ is bounded, there exists $R > 0$ such that $E \subseteq B_R(0)$, which is, in turn, contained in the hypercube $[-R, R]^{\times n}$, an $n$-fold Cartesian product of the interval $[-R, R]$. It will be convenient in this proof to denote the hypercube by a pair of diagonally opposite corner-vectors $a_1 = (-R, -R, \ldots, -R)$ and $b_1 = (R, R, \ldots, R)$. Thus we will denote the cube
$$[a_1, b_1] = [-R, R]^{\times n}.$$
Next we subdivide the cube $[a_1, b_1]$ into $2^n$ congruent subcubes as follows. Simply bisect the interval $[-R, R]$ on each of the $n$ coordinate axes, and form all $2^n$ possible Cartesian products of a half-interval from each of the axes. Each such subcube can be denoted $[a, b]$, where the coordinates of $a = (a_1, a_2, \ldots, a_n)$ denote the left-hand endpoints of the $n$ chosen subintervals on the axes, and a similar convention is employed for $b$ using right-hand endpoints. Observe that for each such subcube $[a, b] \subset [a_1, b_1]$, the set $E \cap [a, b]$ is covered by $\mathcal{O}$. Among the $2^n$ subcubes formed, there must exist at least one subcube $[a_2, b_2]$ having the property that $E \cap [a_2, b_2]$ has no finite subcover from $\mathcal{O}$. (Otherwise, there would exist a finite subcover for $E$ itself.) Now we repeat the process by subdividing $[a_2, b_2]$ into $2^n$ subcubes and we select a subcube $[a_3, b_3]$ in the same manner as before. In this way, we obtain a decreasing nest
$$[a_1, b_1] \supset [a_2, b_2] \supset \cdots \supset [a_k, b_k] \supset \cdots$$
of subcubes having the property that for each $k \in \mathbb{N}$ the set $E_k = E \cap [a_k, b_k]$ is covered by $\mathcal{O}$ but has no finite subcover.
Select a point $p_k \in E_k$ for each $k \in \mathbb{N}$. If $j$ and $k \ge N$ then $p_j$ and $p_k \in E_N$. Hence
$$\|p_j - p_k\| \le \frac{R\sqrt{n}}{2^{N-2}} \to 0$$
as $N \to \infty$. Thus the sequence $p_k$ is a Cauchy sequence, and $p_k \to p \in E_N$ for each $N \in \mathbb{N}$. (Here we use the fact that $E_N$ is a closed set.) Since $p \in E$, there exists $\alpha \in A$ such that $p \in O_\alpha$. Since $O_\alpha$ is open, there exists $r > 0$ such that $B_r(p) \subseteq O_\alpha$. Now select $N \in \mathbb{N}$ such that $k \ge N$ implies
$$\frac{R\sqrt{n}}{2^{k-2}} < r,$$
so that $p \in E_N \subset B_r(p) \subseteq O_\alpha$. Thus we have covered $E_N$ with a single set $O_\alpha$ from $\mathcal{O}$, contradicting the claim that $E_N$ has no finite subcover from $\mathcal{O}$. •
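The bound $R\sqrt{n}/2^{k-2}$ used in the proof is the diameter of the $k$th cube $[a_k, b_k]$: its side length is $2R/2^{k-1}$, and the diagonal of an $n$-dimensional cube is $\sqrt{n}$ times its side. The Python sketch below carries out the repeated bisection and compares the measured diagonal with that formula. It is a sketch under stated assumptions: the rule `pick_upper` that chooses which half to keep is hypothetical and merely stands in for "select a subcube with no finite subcover", which cannot be computed in general.

```python
import math


def nested_cubes(R, n, steps, pick_upper):
    """Yield (k, a_k, b_k) for a nest of cubes obtained by repeated bisection.

    pick_upper(k, axis) -> bool is a hypothetical selection rule deciding
    which half of each coordinate interval is kept at step k.
    """
    a = [-R] * n
    b = [R] * n
    yield 1, list(a), list(b)
    for k in range(2, steps + 1):
        for axis in range(n):
            mid = (a[axis] + b[axis]) / 2
            if pick_upper(k, axis):
                a[axis] = mid
            else:
                b[axis] = mid
        yield k, list(a), list(b)


def diameter(a, b):
    """Euclidean distance between the opposite corners a and b."""
    return math.sqrt(sum((bi - ai) ** 2 for ai, bi in zip(a, b)))


R, n = 1.5, 3
for k, a, b in nested_cubes(R, n, steps=6, pick_upper=lambda k, axis: (k + axis) % 2 == 0):
    predicted = R * math.sqrt(n) / 2 ** (k - 2)
    print(k, diameter(a, b), predicted)
```

Whatever selection rule is used, the diameters shrink geometrically, which is what forces the chosen points $p_k$ to form a Cauchy sequence.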
EXERCISES
8.34 If $E$ is a compact subset of $\mathbb{E}^n$, prove (without using the Heine-Borel Theorem) that every closed subset of $E$ is compact.
8.35 Use only the definition of compactness to show that every finite subset of $\mathbb{E}^n$ is compact.
8.36 a) Let $S = \{x^{(j)} \mid j \in \mathbb{N}\} \subset \mathbb{E}^n$ be any convergent sequence. Show that $S$ is a compact set if and only if $\lim_{j\to\infty} x^{(j)} \in S$.
b) Let S = { xn=(1+ 2'!;, 2'!;) | n = 0, 1, 2, 3, ... } ⊂ $\mathbb{E}^2$. True or False: $S$ is compact.
8.37 Let $S = \{x^{(j)} \mid j \in \mathbb{N}\} \subset \mathbb{E}^n$ be any sequence. Prove or give a counterexample: The set $S$ is a compact set if and only if the sequence $x^{(j)}$ is convergent to an element of $S$.
8.38 † Prove part of the Heine-Borel theorem by showing that if $E \subseteq \mathbb{E}^n$ is compact, then $E$ is bounded. (Hint: Show how to cover $E$ with suitable open balls centered at 0.)
8.39 † Prove part of the Heine-Borel theorem by showing that if $E \subseteq \mathbb{E}^n$ is compact, then $E$ is closed. (Hint: Suppose that $p$ is any point not in $E$. Show how to cover $E$ with the complements of suitable closed balls centered at $p$. Conclude that $E^c$ is open.)
8.40 † Let $f$ be a real-valued function defined on a closed finite interval $[a, b]$ in $\mathbb{E}^1$. Define the graph $G_f = \{(x_1, x_2) \in \mathbb{E}^2 \mid x_2 = f(x_1),\ x_1 \in [a, b]\}$.
a) Prove: If $f \in C[a, b]$, then $G_f$ is a compact subset of $\mathbb{E}^2$. (Hint: Use the Heine-Borel Theorem.)
b) Let
$$g(x_1) = \begin{cases} \sin\dfrac{\pi}{x_1} & \text{if } x_1 \in (0, 1], \\ 0 & \text{if } x_1 = 0. \end{cases}$$
Is the graph $G_g$ a compact subset of $\mathbb{E}^2$? Prove your conclusion.
8.41 Let $E_1 \supseteq E_2 \supseteq \cdots \supseteq E_k \supseteq \cdots$ be a decreasing nest of nonempty closed subsets of $\mathbb{E}^n$.
a) Give an example to show it is possible for $\bigcap_{k=1}^{\infty} E_k$ to be empty.
b) If $E_1$ is compact, show that $\bigcap_{k=1}^{\infty} E_k \ne \emptyset$. (Hint: Select a point $x_k \in E_k$ for each $k \in \mathbb{N}$. Apply Exercise 8.11.c.)
8.42 For each of the following subsets of $\mathbb{E}^n$, determine whether or not it is compact and justify your conclusion.
a) $B_r(x)$, with $r > 0$.
b) $\bar{B}_r(x)$, with $r > 0$.
c) $S^{n-1} = \bar{B}_r(x) \setminus B_r(x)$, with $r > 0$.
8.43 Let $K \subset \mathbb{E}^n$ be compact. Suppose $f$ and $g$ are in $C(K)$ and suppose that $f(x) = 0$ for each $x \in K$ such that $g(x) = 0$. Prove: If $\epsilon > 0$, there exists $M \ge 0$ such that
$$|f(x)| < M|g(x)| + \epsilon$$
for all $x \in K$. (Hint: Use the Heine-Borel Theorem. For interesting applications of this exercise in approximation theory, see [13].)
8.4 CONNECTED SETS
In order to generalize the Intermediate Value Theorem (Theorem 2.3.1) to functions defined on Euclidean space, it will be necessary to identify a class of subsets of $\mathbb{E}^n$ with properties sufficiently similar to those of intervals. Let us begin by writing a formal definition of the concept of interval.
Definition 8.4.1 An interval is any subset $I$ of $\mathbb{R}$ such that for all $a$ and $b$ in $I$ with $a \le b$ the set $\{x \in \mathbb{R} \mid a \le x \le b\} \subseteq I$.
This concept includes all open, closed, half-open and half-closed, finite or infinite intervals. (See Exercise 8.44.) The difficulty in generalizing this concept is that there is no natural linear ordering, or notion of inequality, in $\mathbb{E}^n$ when $n > 1$. The concept we seek is that of being a connected set. In order to define this concept it is convenient to begin with its opposite: a set is called disconnected, in the sense that it can be separated into two parts, or components.
Definition 8.4.2 We say that a set $E \subseteq V$, a normed vector space, is disconnected if there exists a pair of disjoint open subsets $A$ and $B$ of $V$ for which $E \subseteq A \cup B$ and such that both $E_1 = E \cap A \ne \emptyset$ and $E_2 = E \cap B \ne \emptyset$. In this case, we say that $A$ and $B$ separate $E$. If $E$ is not disconnected, then $E$ is called connected.
• EXAMPLE 8.10 Let $E = \mathbb{Q}$, the set of all rational numbers. We claim that $E$ is a disconnected subset of $\mathbb{E}^1$. In fact, let
$$A = \{x \in \mathbb{E}^1 \mid x < \sqrt{2}\} \quad\text{and}\quad B = \{x \in \mathbb{E}^1 \mid x > \sqrt{2}\}.$$
Then $A$ and $B$ are disjoint open sets that separate $E$. The key to this construction is that there exists $\sqrt{2} \in \mathbb{E}^1 \setminus \mathbb{Q}$.
In Exercise 8.45 the reader will prove that a subset of $\mathbb{E}^1$ is connected if and only if it is an interval. The set $\mathbb{Q}$, which is shown above to be disconnected, is not an interval. We show below that the concept of a set $E$ being disconnected can be expressed without reference to open sets $A$ and $B$ as in the definition. This will give us a more visually intuitive concept of the meaning of connectivity.
Theorem 8.4.1 A set $E \subseteq V$, a normed vector space, is disconnected if and only if we can decompose $E$ as $E = E_1 \cup E_2$, both sets nonempty, where $E_1 \cap E_2 = \emptyset$ and where neither $E_1$ nor $E_2$ contains any cluster points of the other set.
Proof: First we suppose that $E$ is separated. Thus $A$ and $B$ exist as in the definition. If $e \in E_1$, then there exists $r > 0$ such that $B_r(e) \subseteq A$. Since $E_2$ is disjoint from $A$, $e$ is not a cluster point of $E_2$. Similarly, $E_2$ contains no cluster point of $E_1$.
Next we suppose that $E = E_1 \cup E_2$, a disjoint union, where neither $E_1$ nor $E_2$ has any cluster point of the other, and each set is nonempty. Let $e \in E$. Since $e$ is in one of the two sets $E_1$ or $E_2$ without being a cluster point of the other, there exists $r_e > 0$ such that $B_{r_e}(e)$ is disjoint from the set to which $e$ does not belong. Let
$$A = \bigcup_{e \in E_1} B_{r_e/2}(e).$$
Similarly, we let
$$B = \bigcup_{e \in E_2} B_{r_e/2}(e).$$
It follows that $A$ and $B$ are open sets and that $A \cap E = E_1$ and $B \cap E = E_2$. We need prove only that $A \cap B = \emptyset$. Suppose this claim were false. Then there exists $p \in A \cap B$. Hence there exist $e_1 \in E_1$ and $e_2 \in E_2$ such that $\|e_1 - p\| < \frac{r_{e_1}}{2}$ and $\|e_2 - p\| < \frac{r_{e_2}}{2}$. By the triangle inequality for norms, it follows that
$$\|e_1 - e_2\| < \frac{r_{e_1} + r_{e_2}}{2} \le \max\{r_{e_1}, r_{e_2}\}.$$
This is a contradiction. •
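The half-radius construction in the proof of Theorem 8.4.1 can be tried out on small finite sets, for which neither set can contain a cluster point of the other. The Python sketch below is an illustration under stated assumptions: the function names and the sample points are introduced here, and $r_e$ is taken to be the distance from $e$ to the other set, which is one admissible choice since the open ball $B_{r_e}(e)$ then misses that set.

```python
import math


def separation_radii(E1, E2):
    """For each point e take r_e = distance from e to the other set, so that
    the open ball B_{r_e}(e) contains no point of the other set."""
    r1 = {e: min(math.dist(e, q) for q in E2) for e in E1}
    r2 = {e: min(math.dist(e, q) for q in E1) for e in E2}
    return r1, r2


def half_balls_disjoint(E1, E2):
    """In Euclidean space, open balls B_s(p) and B_t(q) intersect exactly when
    ||p - q|| < s + t. Check that no ball B_{r_e/2}(e) centered in E1 meets
    a ball B_{r_e/2}(e) centered in E2."""
    r1, r2 = separation_radii(E1, E2)
    return all(
        math.dist(e1, e2) >= r1[e1] / 2 + r2[e2] / 2
        for e1 in E1
        for e2 in E2
    )


# Hypothetical sample sets in the plane (points are tuples so they can be dict keys).
E1 = [(0.0, 0.0), (1.0, 0.0)]
E2 = [(3.0, 0.0), (0.0, 2.5)]
print(half_balls_disjoint(E1, E2))
```

With the full radii $r_e$ in place of $r_e/2$, the balls around $(1, 0)$ and $(3, 0)$ in this sample would overlap, which suggests why the proof halves the radii before forming $A$ and $B$.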
Theorem 8.4.2 Let $f$ be any continuous real-valued function defined on an interval $I$. Let $G_f$ denote the graph of $f$ as defined in Exercise 8.40. We claim that $G_f$ is a connected subset of $\mathbb{E}^2$.
Proof: We suppose the claim were false, and we will deduce a contradiction. If $G_f$ were disconnected, then $G_f = E_1 \cup E_2$, a disjoint union of two nonempty sets, neither one containing any cluster point of the other. So there exist $(x_i, f(x_i)) \in E_i$, for $i = 1, 2$. Suppose without loss of generality that $x_1 < x_2$. (We know $x_1 \ne x_2$, since $G_f$ is the graph of a function.) Define
$$c = \sup\{t \in [x_1, x_2] \mid (q, f(q)) \in E_1,\ \forall q \in [x_1, t]\}.$$
Thus $x_1 \le c \le x_2$. If $(c, f(c)) \in E_1$ then $c < x_2$. Hence for $n \in \mathbb{N}$ there exists
$$c_n \in \left(c, c + \tfrac{1}{n}\right) \cap [x_1, x_2]$$
for which $(c_n, f(c_n)) \in E_2$. This would imply that $(c_n, f(c_n)) \to (c, f(c))$, by continuity of $f$. Hence $(c, f(c))$ is a cluster point of $E_2$, which is a contradiction. On the other hand, if $(c, f(c)) \in E_2$, then $c > x_1$ and $(c, f(c))$ is a cluster point of $E_1$. This yields a similar contradiction. •
Corollary 8.4.1 The Euclidean space $\mathbb{E}^n$ is connected, for each $n \in \mathbb{N}$.
Proof: Suppose the corollary were false. Then $\mathbb{E}^n = E_1 \cup E_2$, a disjoint union of two nonempty sets, neither of which contains any of the other's cluster points. Let $p \in E_1$ and $q \in E_2$. Let
$$\phi(t) = p + t(q - p)$$
for each $t \in [0, 1]$. Then $\phi(0) \in E_1$ and $\phi(1) \in E_2$. Define
$$c = \sup\{t \in [0, 1] \mid \phi(s) \in E_1,\ \forall s \in [0, t]\}.$$
Now complete the proof just as in Theorem 8.4.2. •
Theorem 8.4.3 Suppose $E$ and $F$ are nondisjoint connected subsets of a normed vector space $V$. Then $G = E \cup F$ is connected.
Proof: Suppose false. Then there exist nonempty disjoint open sets $A$ and $B$ such that $G \subseteq A \cup B$ and $A \cap G \ne \emptyset \ne B \cap G$. Since $E$ cannot be separated, $E$ is contained entirely in one open set or the other. Without loss of generality, suppose $E \subseteq A$. Similarly, $F$ is contained entirely in one of the open sets. Since $B \cap G \ne \emptyset$, $F \subseteq B$. But this is impossible since $E \cap F \ne \emptyset$ and $A \cap B = \emptyset$. •
EXERCISES
8.44 † Prove that if $I \subseteq \mathbb{R}$ is an interval, as defined in Definition 8.4.1, then $I$ must have one of the following forms: $[a, b]$, $(a, b)$, $[a, b)$ or $(a, b]$. In this notation $-\infty \le a \le b \le \infty$, but closed endpoints must be finite. (Hint: Consider $\sup(I)$ and $\inf(I)$.)
8.45 †
a) Prove that every connected subset $S \subseteq \mathbb{E}^1$ is an interval.
b) Prove that every interval $I$ is a connected subset of $\mathbb{E}^1$. In particular, this includes the claim that the real line $\mathbb{E}^1$ is itself a connected set. (Hint: Consider Theorem 8.4.2 and the function $f(x) = 0$. Explain how connectedness in $\mathbb{E}^2$ implies connectedness in $\mathbb{E}^1$.)
8.46 Prove that the union of the $x_1$-axis and the $x_2$-axis in $\mathbb{E}^2$ is a connected set.
8.47 Prove that $S^1 = \{x \in \mathbb{E}^2 \mid \|x\| = 1\}$ is a connected set.
8.48 Suppose E ~ JE2 such that for all x E E, x 2 = 0. Suppose E is a connected subset oflE1, which we identify with the x 1-axis oflE 2. Prove that E is also connected as a subset of lE 2. 8.49 Let E = { x E JE2 1 x 1x 2 > 0 }. Is E connected? Prove your conclusion. 8.50 Let E = { x E lE 2 1 x1x2 ::::: 0 }. Is E connected? Prove your conclusion. 8.51
Let E = { x
E lE2 Ixi
=
1}.
Is E connected? Prove your conclusion.
8.52 In $\mathbb{E}^2$, let $E = \{x \mid x_1 = 0\} \cup \{x \mid x_1x_2 = 1,\ x_1 > 0\}$. Is $E$ connected? Prove your conclusion.
8.53 Let
$$g(x) = \begin{cases} x\sin\dfrac{\pi}{x} & \text{if } x > 0, \\ 0 & \text{if } x = 0. \end{cases}$$
Prove that the graph of $g$ is a connected subset of $\mathbb{E}^2$.
8.54
Let if X> 0, if X= 0.
Prove that the graph of $f$ is a connected subset of $\mathbb{E}^2$. (Hint: Try writing $G_f \subseteq A \cup B$ with $A$ and $B$ open and disjoint. Prove that $G_f$ must be contained entirely within one of the two open sets.)
8.55 Let $E \subseteq \mathbb{E}^n$ be any connected set and let $p$ be a cluster point of $E$. Prove or disprove: $E \cup \{p\}$ is connected.
8.56 Let $E \subseteq \mathbb{E}^n$ be any connected set and let $\bar{E}$ be the closure of $E$. Prove or disprove: $\bar{E}$ is connected.
8.57 Let $f(x) = e^{-x^2}$ for all $x \in \mathbb{R}$. Denote the graph of $f$ by $G_f$ and the $x$-axis in $\mathbb{E}^2$ by $G_0$. Prove or disprove: $S = G_f \cup G_0$ is a connected subset of $\mathbb{E}^2$. (Hint: Apply Theorem 8.4.1.)
8.58 Prove or give a counterexample: Every open set $O \subseteq \mathbb{E}^n$ can be expressed as the union of a finite or countable family of disjoint open balls. (Hint: See Exercise 8.32.)
8.5 TEST YOURSELF
EXERCISES
8.59 Let $x = (5, -2, 3) \in \mathbb{E}^3$. Give an example of a nonzero vector $y \in \mathbb{E}^3$ for which $\|x\|^2 + \|y\|^2 = \|x + y\|^2$.
8.60 Let $S \subseteq \mathbb{E}^n$ be a set of vectors for which each sequence of vectors in $S$ has a convergent subsequence. True or False: $S$ must be bounded.
8.61 The taxicab norm in $\mathbb{R}^2$ is defined by $\|x\|_t = |x_1| + |x_2|$. Draw a sketch of the unit ball $B_1(0)$ in $\mathbb{R}^2$ using the taxicab norm. Label your axes.
8.62 Give an example of a set $D \subseteq \mathbb{E}^2$ and a point $p$ which is an isolated point of $D$.
8.63 Give an example of a set $S \subseteq \mathbb{E}^2$ for which $S$ is dense in $\mathbb{E}^2$ but has empty interior.
8.64 True or False: The interior of the closure of an open set $S \subseteq \mathbb{E}^n$ need not be $S$ itself.
8.65 Let $f(x) = \sin\frac{1}{x}$ for all $x \ne 0$. True or False: The graph $G_f$ is compact.
8.66 Let $E_1 \supseteq E_2 \supseteq \cdots \supseteq E_k \supseteq \cdots$ be a decreasing nest of closed subsets of $\mathbb{E}^n$. True or False: $\bigcap_{k=1}^{\infty} E_k$ must be nonempty.
8.67 Let S = { Xn = (1 + ~, ~,) | n = 0, 1, 2, 3, ... } ⊂ $\mathbb{E}^2$. True or False: $S$ is compact.
8.68 Let
$$g(x) = \begin{cases} x^2\cos\dfrac{1}{x} & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}$$
True or False: The graph $G_g$ is connected.
8.69 For each one of the following two sets, decide whether it is connected or not connected.
a) $S = \{x \in \mathbb{E}^2 \mid |x_2| = |x_1|\}$.
b) $T = \{x \in \mathbb{E}^2 \mid |x_2| < |x_1|\}$.
8.70 True or False: If $S \ne \emptyset$ is open in $\mathbb{E}^n$, then S nonempty disjoint open balls of finite radius.
CHAPTER 9
CONTINUOUS FUNCTIONS ON EUCLIDEAN SPACE
9.1 LIMITS OF FUNCTIONS
We will consider vector-valued functions $f : D \to \mathbb{E}^m$, where the domain of definition of $f$ is a set $D \subseteq V$, a normed vector space. (Most often, $V$ will be $\mathbb{E}^n$.) Since for each $x \in D$ we have $f(x) \in \mathbb{E}^m$, we can write
$$f(x) = (f_1(x), \ldots, f_m(x)),$$
where $f_i(x) \in \mathbb{R}$ for all $x \in D$. In elementary courses in the calculus of several variables, $V$ is always $\mathbb{E}^n$ and it is common to write the real-valued function $f_i(x)$ in the form $f_i(x_1, \ldots, x_n)$. In order to define the concept $\lim_{x\to a} f(x)$, we need to restrict ourselves to the case in which $a$ is a cluster point of $D$. Recall that a cluster point of $D$ need not be an element of $D$. A cluster point $a$ of a set $D$ has the property that it is always possible to find points $x \in D$ for which $x \ne a$ and yet $x$ is as close to $a$ as we like. If $a$ is not a cluster point, it will be impossible to define $\lim_{x\to a} f(x)$ because it is impossible for $x$ to approach $a$. For example, if $a$ were an isolated point of $D$, then for all sufficiently small $\delta > 0$ we obtain the set
$$\{x \mid 0 < \|x - a\| < \delta,\ x \in D\} = \emptyset,$$
the empty set. Every statement one might make about each element $x$ of the empty set, such as the statement $\|f(x) - L\| < \epsilon$, is vacuously true.
Definition 9.1.1 Let $a$ be any cluster point of the domain $D_f$ of a function $f$. Then $\lim_{x\to a} f(x) = L \in \mathbb{E}^m$ if and only if for all $\epsilon > 0$ there exists $\delta > 0$ such that $x \in D_f$ and $0 < \|x - a\| < \delta$ implies $\|f(x) - L\| < \epsilon$.
This definition is a natural adaptation of Definition 2.1.2 to vector-valued functions defined on Euclidean space. We have the following theorem that permits more such adaptations to be developed.
Theorem 9.1.1 Suppose $a$ is a cluster point of the domain $D_f$ of $f$. Then the following two statements are equivalent.
i. $\lim_{x\to a} f(x) = L$.
ii. For every sequence of points $x_n \in D_f \setminus \{a\}$ such that $x_n \to a$, we have the sequence $f(x_n) \to L$.
Proof: Let us prove first that (i) implies (ii). So suppose (i) and consider any sequence of points $x_n \in D_f \setminus \{a\}$ such that $x_n \to a$. We know for each $\epsilon > 0$ there exists $\delta > 0$ such that $x \in D_f$ and $0 < \|x - a\| < \delta$ implies $\|f(x) - L\| < \epsilon$. But since $x_n \to a$, there exists $N$ such that $n \ge N$ implies $0 < \|x_n - a\| < \delta$. This implies $\|f(x_n) - L\| < \epsilon$, so that $f(x_n) \to L$.
Next we prove (ii) implies (i). So suppose (ii) is true. We need to show $\lim_{x\to a} f(x) = L$. We will show that if this conclusion were false then a self-contradiction would result. But if it were false that $\lim_{x\to a} f(x) = L$, then there exists $\epsilon > 0$ such that for all $\delta > 0$ there exists $x \in D_f$ such that $0 < \|x - a\| < \delta$ and yet $\|f(x) - L\| \ge \epsilon$. In particular, if we let $\delta_n = \frac{1}{n}$ then we get $x_n$ such that $0 < \|x_n - a\| < \frac{1}{n}$ and yet $\|f(x_n) - L\| \ge \epsilon$. Now, $x_n \in D_f \setminus \{a\}$ and $x_n \to a$, yet $f(x_n) \not\to L$. This contradicts (ii), which was assumed to be true. •
The theorem above combined with Theorem 8.1.2 and Theorem 2.1.1 yields the following immediate corollary.
Corollary 9.1.1 Let $a$ be any cluster point of the domain $D_f$. Then $\lim_{x\to a} f(x)$ exists and equals $L = (l_1, \ldots, l_m)$ if and only if $\lim_{x\to a} f_i(x)$ exists and equals $l_i$ for all $i = 1, \ldots, m$.
This corollary enables us to understand the concept of convergence of vector-valued functions $f$ in terms of convergence of each real-valued component function $f_i$.
Remark 9.1.1 If two functions $f$ and $g$ are defined on subsets of the same space $\mathbb{E}^n$, with values in the same space $\mathbb{E}^m$, then we can form the sum or difference of these functions with the domain of this combination being $D_f \cap D_g$. If, furthermore, $m = 1$, so that $f$ and $g$ are real-valued functions of vector variables, then we can also multiply and divide the two functions. The domain $D_{f/g}$ of the quotient will be only those points of $D_f \cap D_g$ for which $g(x) \ne 0$. (Multiplication and division are not defined, however, if $f$ and $g$ map $\mathbb{E}^n$ to $\mathbb{E}^m$ with $m > 1$.)
Corollary 9.1.2 Let $f$ and $g$ be defined on domains in $V$, a normed vector space, with values in $\mathbb{E}^m$, as in the Remark above. Suppose $a$ is a cluster point of the intersection $D_f \cap D_g$ of the domains of $f$ and $g$, $\lim_{x\to a} f(x) = L$ and $\lim_{x\to a} g(x) = M$. Then
i. $\lim_{x\to a}(f \pm g)(x) = L \pm M$.
ii. If $m = 1$, then $\lim_{x\to a}(fg)(x) = LM$.
iii. If $m = 1$, then $\lim_{x\to a}\frac{f}{g}(x) = \frac{L}{M}$, provided that $M \ne 0$ and that $a$ is a cluster point of $D_{f/g}$.
Proof: The proofs are very similar to those of Corollary 2.1.1. Here we will treat only conclusion (i). Consider a sequence of points $x_n \in D_f \cap D_g \setminus \{a\}$ such that $x_n \to a$. Then $(f \pm g)(x_n) = f(x_n) \pm g(x_n) \to L \pm M$, by the corresponding theorem for limits of sequences. The other proofs are similar. •
This corollary suggests that limits of functions on $\mathbb{E}^n$ are very similar in behavior to limits of real-valued functions of one real variable, subject only to certain essential hypotheses to ensure the requisite operations are defined for the objects in question. However, the reader must be very careful: it is much harder for the limit of even a real-valued function of two or more variables to exist than it is for functions of only one variable. See Exercise 9.4.
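The warning above can be made concrete with Theorem 9.1.1: if two sequences approach $a$ and the function values along them tend to different numbers, the limit cannot exist. The Python sketch below uses a standard textbook-style illustration, $f(x) = x_1^2 x_2/(x_1^4 + x_2^2)$; this particular formula is an assumption chosen here for illustration and is not claimed to be the function of any exercise in the text. Along every line $x_2 = kx_1$ the values tend to 0, yet along the parabola $x_2 = x_1^2$ they stay at $\frac{1}{2}$.

```python
def f(x1, x2):
    """Illustrative function (an assumption, not taken from the text):
    f(x) = x1^2 * x2 / (x1^4 + x2^2) for x != 0, and f(0) = 0."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return x1 ** 2 * x2 / (x1 ** 4 + x2 ** 2)


def along_line(k, t):
    """Value of f at the point (t, k*t) on the line x2 = k*x1."""
    return f(t, k * t)


def along_parabola(t):
    """Value of f at the point (t, t^2) on the parabola x2 = x1^2."""
    return f(t, t * t)


# Approach the origin along a line and along the parabola.
for t in (1e-1, 1e-2, 1e-3):
    print(t, along_line(2.0, t), along_parabola(t))
```

Checking straight lines alone is therefore not enough to establish that a limit exists in two or more variables.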
Definition 9.1.2 Let $x \in \mathbb{E}^n$ and let $k = (k_1, \ldots, k_n)$ be any $n$-tuple of nonnegative integers. We define
$$x^k = x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n},$$
a real-valued monomial on $\mathbb{E}^n$. A real-valued polynomial on $\mathbb{E}^n$ is any function of the form $p(x) = \sum_{j=1}^{N} c_j x^{k_j}$, where each $c_j \in \mathbb{R}$ and each $k_j$ is an $n$-tuple of nonnegative integers. A rational function is any ratio of two real-valued polynomials. A rational function is defined wherever the denominator is nonzero.
In the exercises below, the reader will show that the behavior of polynomials and rational functions on $\mathbb{E}^n$ with respect to limits is very similar to the behavior of such functions on $\mathbb{R}$.
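The multi-index notation of Definition 9.1.2 translates directly into code. The following Python sketch is one possible encoding, with names (`monomial`, `polynomial`, `rational`) that are assumptions of this illustration: a polynomial is stored as a list of pairs $(c_j, k_j)$, and a rational function raises an error outside its domain.

```python
from math import prod


def monomial(x, k):
    """Evaluate x^k = x_1^{k_1} * ... * x_n^{k_n} for a multi-index k."""
    if len(x) != len(k):
        raise ValueError("x and k must have the same length")
    if any((not isinstance(ki, int)) or ki < 0 for ki in k):
        raise ValueError("exponents must be nonnegative integers")
    return prod(xi ** ki for xi, ki in zip(x, k))


def polynomial(terms):
    """Build p(x) = sum_j c_j * x^{k_j} from a list of (c_j, k_j) pairs."""
    def p(x):
        return sum(c * monomial(x, k) for c, k in terms)
    return p


def rational(p, q):
    """Build the ratio p/q, defined wherever q(x) is nonzero."""
    def ratio(x):
        denom = q(x)
        if denom == 0:
            raise ZeroDivisionError("x is outside the domain of the rational function")
        return p(x) / denom
    return ratio


# Hypothetical example on E^2: p(x) = 3*x1^2 - x1*x2 + 5 and q(x) = x1 - x2.
p = polynomial([(3.0, (2, 0)), (-1.0, (1, 1)), (5.0, (0, 0))])
q = polynomial([(1.0, (1, 0)), (-1.0, (0, 1))])
print(p((2.0, 3.0)), rational(p, q)((2.0, 3.0)))
```

The multi-index $(0, \ldots, 0)$ gives the constant monomial 1, so constant terms need no special case.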
Definition 9.1.3 Let $f : D \to \mathbb{E}^m$. Suppose $D \subseteq \mathbb{E}^n$, where $n \ge 2$.
i. If the domain $D_f$ is not a bounded subset of $\mathbb{E}^n$, we say $\lim_{x\to\infty} f(x) = L$ provided for all sequences $x_n \in D_f$ such that $\|x_n\| \to \infty$ we have $f(x_n) \to L$.
ii. If $a$ is a cluster point of $D_f$ we say $\lim_{x\to a} f(x) = \infty$ provided for all sequences $x_n \in D_f$ such that $x_n \to a$ we have $\|f(x_n)\| \to \infty$.
Remark 9.1.2 In case (i) we say that $f(x)$ converges to $L$ as $x$ diverges to infinity. In case (ii) we say that $f(x)$ diverges to $\infty$ as $x$ converges to $a$. The use of the word converges versus diverges hinges on whether or not the limit exists as a point of $\mathbb{E}^m$.
Remark 9.1.3 The restriction to $n \geq 2$ in Definition 9.1.3 is employed in order to avoid conflict with the normal terminology in $\mathbb{E}^1$. For functions of a single real variable, it is customary to distinguish between $x \to \infty$ and $x \to -\infty$ and to distinguish both of these concepts from $|x| \to \infty$. The rationale for this is that in $\mathbb{E}^1$ there are only two directions in which to move away from 0 as far as one pleases. But in $\mathbb{E}^n$ for $n \geq 2$ there are infinitely many directions along which one might wander as far from 0 as one pleases. Thus distinguishing among directions in the concept of divergence to infinity becomes hopeless.
EXERCISES
9.1 Write a detailed proof for Corollary 9.1.1.
9.2 Let $p$ be any real-valued polynomial on $\mathbb{E}^n$, as in Definition 9.1.2. In the parts below, prove that $\lim_{x \to a} p(x)$ exists and equals $p(a)$.
a) If $m(x) = x_i$, $1 \leq i \leq n$, prove that $\lim_{x \to a} m(x) = m(a)$.
b) If $m(x) = x^k$, prove that $\lim_{x \to a} m(x) = m(a)$.
c) Let $p$ be any polynomial on $\mathbb{E}^n$. Prove that $\lim_{x \to a} p(x)$ exists and equals $p(a)$. That is, prove that a polynomial is continuous everywhere. (See Definition 9.2.1.)
9.3 Let $Q(x)$ be any real-valued rational function on $\mathbb{E}^n$, as in Definition 9.1.2. Prove that $\lim_{x \to a} Q(x)$ exists and equals $Q(a)$, provided that $a \in D_Q$. That is, prove that a rational function is continuous wherever it is defined. (See Definition 9.2.1.)
9.4
† Let $f : \mathbb{E}^2 \to \mathbb{R}$ by the formula
$$f(x) = \begin{cases} \dfrac{x_1^2 x_2}{x_1^4 + x_2^2} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}$$
(See Figs. 9.1 and 9.2.)
a) Show that if $x \to 0$ along either the $x_1$- or the $x_2$-coordinate axis, $f(x) \to 0$.
b) Show that if $x \to 0$ along any straight line $x_2 = k x_1$ through the origin, $f(x) \to 0$.
c) Show that $\lim_{x \to 0} f(x)$ does not exist. Hint: Show there exist points $x$ arbitrarily close to 0 at which $f(x) = \frac{1}{2}$.
d) Does $\lim_{x \to \infty} f(x)$ exist? Prove your conclusion.
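The behavior described in Exercise 9.4 can be observed numerically. The following Python sketch is only an illustration and not part of the exercise. It assumes that the function is the standard example $f(x) = x_1^2 x_2 / (x_1^4 + x_2^2)$ with $f(0) = 0$; the helper names `along_line` and `along_parabola` are hypothetical. It samples $f$ along a straight line $x_2 = k x_1$ and along the parabola $x_2 = x_1^2$.

```python
def f(x1, x2):
    """Assumed formula: x1^2 * x2 / (x1^4 + x2^2), with f(0, 0) = 0."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return (x1 * x1 * x2) / (x1 ** 4 + x2 * x2)


def along_line(k, t):
    """Value of f at the point (t, k*t) on the line x2 = k*x1."""
    return f(t, k * t)


def along_parabola(t):
    """Value of f at the point (t, t^2) on the parabola x2 = x1^2."""
    return f(t, t * t)


# Approach the origin along a line and along the parabola.
for t in (1e-1, 1e-2, 1e-3):
    print(t, along_line(2.0, t), along_parabola(t))
```

Along the line the sampled values shrink with $t$, whereas along the parabola they stay at $\frac{1}{2}$ up to rounding, so no single value can serve as the limit at 0.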
9.5 Let $f : \mathbb{E}^2 \to \mathbb{R}$ by the formula if $x \neq 0$, if $x = 0$.
Prove that $\lim_{x \to 0} f(x) = 0$. (Hint: Let $\epsilon > 0$. Find $\delta > 0$ such that $\|x\| < \delta$ implies $|f(x)| < \epsilon$. Compare with Exercise 9.4.)
Figure 9.1 $f(x_1, x_2) = \dfrac{x_1^2 x_2}{x_1^4 + x_2^2}$ in the first quadrant.
9.6
For x E E 2 , evaluate each of the following limits. a) limx_,o 11 11 . b) limx_,oo 11 11 .
9.7 Let $f(x) = x \sin\frac{1}{x}$, $f : \mathbb{E}^1 \setminus \{0\} \to \mathbb{R}$. Prove each statement True or False:
a) $\lim_{x \to 0} f(x) = 0$.
b) $\lim_{x \to 0} \frac{1}{f(x)} = \infty$.
c) $\lim_{x \to 0} \frac{1}{|f(x)|} = \infty$.
9.8 Let $a$ be a cluster point of the domain $D_f$ of a nonvanishing (that is, nowhere-0) function $f(x)$. Prove that
$$\lim_{x \to a} f(x) = 0 \iff \lim_{x \to a} \frac{1}{\|f(x)\|} = \infty.$$
9.9 Suppose the domain $D \subseteq \mathbb{E}^n$ of a function $f : D \to \mathbb{E}^m$ is not bounded. Prove: $\lim_{x \to \infty} f(x) = L \in \mathbb{E}^m$ if, and only if, for each $\epsilon > 0$ there exists $M > 0$ such that $x \in D \setminus B_M(0)$ implies $\|f(x) - L\| < \epsilon$.
Figure 9.2 $f(x_1, x_2) = \dfrac{x_1^2 x_2}{x_1^4 + x_2^2}$ in all four quadrants.
9.10 Suppose that $a$ is a cluster point of the domain $D \subseteq \mathbb{E}^n$ of a function $f : D \to \mathbb{E}^m$. Prove: $\lim_{x \to a} f(x) = \infty$ if, and only if, for each $M > 0$ there exists $\delta > 0$ such that $x \in (D \setminus \{a\}) \cap B_\delta(a)$ implies $\|f(x)\| > M$.
9.11 Prove the following Cauchy Criterion for the existence of the limit of a function. Let $f : D \to \mathbb{E}^m$ and let $a$ be a cluster point of $D \subseteq \mathbb{E}^n$. Then $\lim_{x \to a} f(x)$ exists if and only if for each $\epsilon > 0$ there exists $\delta > 0$ such that $x$ and $x' \in D \cap (B_\delta(a) \setminus \{a\})$ implies $\|f(x) - f(x')\| < \epsilon$.
9.2 CONTINUOUS FUNCTIONS
The following definition is analogous to Definition 2.2.1.
Definition 9.2.1 Let $V$ be any normed vector space. A function $f : D \to \mathbb{E}^m$ is called continuous at a point $a \in D \subseteq V$ provided that for each $\epsilon > 0$ there exists a corresponding $\delta > 0$ such that $x \in D \cap B_\delta(a)$ implies
$$\|f(x) - f(a)\| < \epsilon.$$
If $f$ is continuous at every point $a \in D$, we say $f \in C(D)$, the family of all continuous functions on $D$ with values in $\mathbb{E}^m$. In this notation, the value of $m$ is fixed. If there is more than one possible range-space $\mathbb{E}^m$ under discussion, we will denote the family of continuous functions instead as $C(D, \mathbb{E}^m)$.
If $D_f$ has any isolated points $a$, then $f$ is automatically continuous at $a$. The more interesting case, however, is that of a nonisolated point $a \in D_f$.
Theorem 9.2.1 A function $f$ is continuous at a cluster point $a \in D$ if and only if $\lim_{x \to a} f(x)$ exists and equals $f(a)$.
Proof: First suppose $f$ is continuous at $a \in D$. Then for each $\epsilon > 0$ there exists $\delta > 0$ such that $\|x - a\| < \delta$ and $x \in D$ implies $\|f(x) - f(a)\| < \epsilon$. Hence for all $x \in D$ such that $0 < \|x - a\| < \delta$ we have $\|f(x) - f(a)\| < \epsilon$, which implies $\lim_{x \to a} f(x)$ exists and equals $f(a)$. Now suppose $\lim_{x \to a} f(x)$ exists and is $f(a)$. Then for all $\epsilon > 0$ there exists $\delta > 0$ such that $0 < \|x - a\| < \delta$ and $x \in D$ implies $\|f(x) - f(a)\| < \epsilon$. This implies for all $x \in D \cap B_\delta(a)$ we have $\|f(x) - f(a)\| < \epsilon$, so that $f$ is continuous at $a$. •
This theorem enables us to prove some properties of combinations of continuous functions that are analogous to the results in Theorem 2.2.2 to the extent possible.
Theorem 9.2.2 Let $f$ and $g$ be defined on domains in a normed vector space $V$ with values in $\mathbb{E}^m$. Suppose $f$ and $g$ are both continuous at $a$. Then
i. $f \pm g$ is continuous at $a$.
ii. If $m = 1$, then the function $fg$ is continuous at $a$.
iii. If $m = 1$ and if $g(a) \neq 0$, then $\frac{f}{g}$ is continuous at $a$.
iv. If we write $f = (f_1, \ldots, f_m)$ in terms of its scalar-valued components, then $f$ is continuous at $a$ if and only if $f_1, \ldots, f_m$ are all continuous at $a$.
Proof: This is a simple application of Corollary 9.1.2 and Corollary 9.1.1 and it is left to the Exercises. Note that the proof is trivial if $a$ is not a cluster point of the domain of the appropriate combination of $f$ and $g$. •
The following definition is useful for the study of continuous functions in Euclidean space.
Definition 9.2.2 A set $E$ is said to be relatively open in $D \subseteq V$, a normed vector space, provided there exists an open set $U \subseteq V$ such that $E = U \cap D$. Similarly, a set $E$ is said to be relatively closed in $D \subseteq V$ provided there exists a closed set $K \subseteq V$ such that $E = K \cap D$.
• EXAMPLE 9.1
In $\mathbb{E}^1$, $[0, 1)$ is relatively open in $[0, 2]$, since $[0, 1) = (-1, 1) \cap [0, 2]$. Similarly, $(0, 1]$ is relatively closed in $(0, 2]$, since $(0, 1] = [-1, 1] \cap (0, 2]$.
Theorem 9.2.3 A subset $E \subseteq D$ is relatively open in $D$ if and only if for each $x \in E$ there exists $\delta_x > 0$ such that $B_{\delta_x}(x) \cap D \subseteq E$.
Proof: First suppose that $E$ is relatively open in $D$. Then there exists an open set $U$ such that $U \cap D = E$. If $x \in E \subseteq U$, it follows that there exists a corresponding $\delta_x > 0$ such that $B_{\delta_x}(x) \subseteq U$. Thus $B_{\delta_x}(x) \cap D \subseteq U \cap D = E$. This completes the proof in one direction. For the other direction see Exercise 9.27. •
Corollary 9.2.1 A subset $E \subseteq D$ is relatively open in $D$ if and only if its complement in $D$, $D \setminus E$, is relatively closed in $D$.
The proof of this corollary is left to the Exercises. It is a remarkable and very useful fact that the continuity of a function f on a domain D is closely connected to the way in which open sets and closed sets are transformed by the inverse of f. In this terminology we make no supposition that f is invertible, as we see below.
Definition 9.2.3 Let $f : D \to \mathbb{E}^m$, where $D \subseteq V$, a normed vector space. We define the inverse image of $E \subseteq \mathbb{E}^m$ by $f^{-1}(E) = \{x \in D \mid f(x) \in E\}$.
Theorem 9.2.4 Let $f$ be defined on a domain $D \subseteq V$, a normed vector space, where $f : D \to \mathbb{E}^m$. Then $f \in C(D)$ if and only if $f^{-1}(U)$ is relatively open in $D$ for each open set $U \subseteq \mathbb{E}^m$.
Proof: First we suppose $f \in C(D)$. Let $U \subseteq \mathbb{E}^m$ be open. We need to show $f^{-1}(U)$ is relatively open in $D$. If $f^{-1}(U) = \emptyset$ then there is nothing to prove. Without loss of generality, let $x \in f^{-1}(U)$, so that $f(x) \in U$. Since $U$ is open, there exists $r > 0$ such that $B_r(f(x)) \subseteq U$. Because $f$ is continuous at $x$, there exists $\delta_x > 0$ such that
$$f : B_{\delta_x}(x) \cap D \to B_r(f(x)).$$
Thus $B_{\delta_x}(x) \cap D \subseteq f^{-1}(U)$. By Theorem 9.2.3, $f^{-1}(U)$ is relatively open in $D$.
Next we suppose $f^{-1}(U)$ is relatively open in $D$ for each open set $U \subseteq \mathbb{E}^m$, and we must prove that for each $x \in D$, $f$ is continuous at $x$. Let $\epsilon > 0$. We need to show that there exists $\delta > 0$ such that $f : B_\delta(x) \cap D \to B_\epsilon(f(x))$. Since $f^{-1}(B_\epsilon(f(x)))$ is relatively open in $D$, there exists $\delta > 0$ such that
$$B_\delta(x) \cap D \subseteq f^{-1}(B_\epsilon(f(x))).$$
(See Theorem 9.2.3.) Thus $f$ is continuous at $x$. •
EXERCISES
9.12 Define $\pi_i : \mathbb{E}^n \to \mathbb{R}$ by $\pi_i(x) = x_i$, $i = 1, \ldots, n$. Prove that $\pi_i$ is continuous.
9.13 † If $V$ is any normed vector space, define $N : V \to \mathbb{R}$ by $N(x) = \|x\|$. Prove that $N$ is continuous. Hint: See Exercise 2.20. Show that $\big| \|x\| - \|y\| \big| \leq \|x - y\|$.
9.14 Let $f : \mathbb{E}^2 \to \mathbb{E}^2$ by the formula if $x \neq 0$, if $x = 0$.
Determine whether or not $f \in C(\mathbb{E}^2)$ and prove your conclusion.
9.15
Let f : IE 2 ----+ lE2 by the formula f(x) = { (x1x2,sin (0,0)
(~))
ifx:f:O, ifx
= 0.
a) Determine whether or not f E C (1E 2 ) and prove your conclusion.
b) Let $S = \{x \mid f(x) = 0\}$. Is the complement of $S$ an open set? Prove your conclusion.
9.16 Suppose $f : \mathbb{E}^2 \to \mathbb{R}$ has the property that for each fixed value $x_2 = b$ the function $g_b(x_1) = f(x_1, b)$ is a continuous function of $x_1$. Suppose also that for each fixed value $x_1 = a$ the function $h_a(x_2) = f(a, x_2)$ is a continuous function of $x_2$. Does it follow that $f \in C(\mathbb{E}^2, \mathbb{R})$? If yes, prove it. If no, give a counterexample and prove that it is a counterexample. (Remark: This question is often expressed verbally as follows. Does the continuity of $f(x)$ in each variable $x_1$, $x_2$ separately imply joint continuity, meaning continuity as a function of the vector variable $x$? See Problem 9.14.)
9.17 Prove all four parts of Theorem 9.2.2.
9.18 Prove Corollary 9.2.1.
9.19 Give an example of a continuous function $f : \mathbb{E}^1 \to \mathbb{E}^2$ that maps the open set $(-2\pi, 2\pi)$ onto the circle $x_1^2 + x_2^2 = 1$, which is not an open set.
9.20 Prove that if $D \subseteq \mathbb{E}^n$ is open then a relatively open subset $E \subseteq D$ is open.
9.21 Prove that $C(D, \mathbb{E}^m)$ is a vector space.
9.22 Let $D = \{x \mid 1 < \|x\| \leq 3\} \subset \mathbb{E}^2$. Show that the set $E = \{x \mid 1 < \|x\| \leq 2\}$ is relatively closed in $D$, and that the set $F = \{x \mid 2 < \|x\| \leq 3\}$ is relatively open in $D$.
9.23 Identify the relatively open subsets of the set $\mathbb{Z}$ contained in $\mathbb{E}^1$. Identify the relatively closed subsets as well.
9.24 Give an example of a function $f \in C([0, 1), \mathbb{R})$ and an open subset $U \subseteq \mathbb{R}$ such that $f^{-1}(U)$ is not open.
9.25 Let $f$ be defined on a domain $D \subseteq \mathbb{E}^n$, where $f : D \to \mathbb{E}^m$. Prove that $f \in C(D)$ if and only if $f^{-1}(K)$ is relatively closed in $D$ for each closed set $K \subseteq \mathbb{E}^m$.
9.26 Is the set $\{x \in \mathbb{E}^4 \mid x_1 x_4 - x_2 x_3 = 1\}$ open, closed, neither, or both?
9.27 † Complete the proof of Theorem 9.2.3 by showing that $E \subseteq D$ is relatively open in $D$ if for each $x \in E$ there exists $\delta_x > 0$ such that $B_{\delta_x}(x) \cap D \subseteq E$.
9.28 Let $f : D \to \mathbb{E}^m$ be continuous on its domain $D \subseteq \mathbb{E}^n$, and suppose $g : f(D) \to \mathbb{E}^p$ is continuous on $f(D)$. Prove: The composition $g \circ f \in C(D, \mathbb{E}^p)$, where $g \circ f(x) = g(f(x)) \in \mathbb{E}^p$, for each $x \in D$. (Hint: Use Theorem 9.2.4.)
9.3 CONTINUOUS IMAGE OF A COMPACT SET
In the Extreme Value Theorem (Theorem 2.4.2) we saw that a continuous real-valued function on a closed finite interval $[a, b]$ must achieve both a maximum and a minimum value. The type of set analogous to $[a, b]$ that enables us to prove an Extreme Value Theorem for real-valued functions on $\mathbb{E}^n$ is a compact set. (If $f$ were not real-valued but rather $\mathbb{E}^m$-valued with $m > 1$, then it would not make sense to speak of an extreme value for $f$ since there is no natural order relation among the vectors of $\mathbb{E}^m$.) We begin with a more general theorem that does not require real values, however.
Theorem 9.3.1 Let $D \subset \mathbb{E}^n$ be compact, and suppose $f : D \to \mathbb{E}^m$ is continuous. Then the set $f(D) = \{f(x) \mid x \in D\}$ is a compact subset of $\mathbb{E}^m$.
Proof: In words, this theorem states that the continuous image of a compact set is compact. We begin by letting $\mathcal{O} = \{O_\alpha \mid \alpha \in A\}$ be an arbitrary open cover of $f(D)$, where $A$ is an index set and each $O_\alpha$ is an open set in $\mathbb{E}^m$. It will suffice to prove that there exists a finite subcover of $f(D)$ selected from the given open cover $\mathcal{O}$. Since $f$ is continuous, each set $V_\alpha = f^{-1}(O_\alpha)$ is relatively open in $D$. Thus there exists an open set $U_\alpha$ in $\mathbb{E}^n$ such that $V_\alpha = U_\alpha \cap D$. Since $f(D)$ is covered by $\mathcal{O}$, $\mathcal{U} = \{U_\alpha \mid \alpha \in A\}$ is an open cover of $D$. Since $D$ is compact, there exists a finite subcover $\{U_j \mid j = 1, 2, \ldots, p\}$ of $D$. Thus $D = \bigcup_{j=1}^{p} V_j$ and $f(D) \subseteq \bigcup_{j=1}^{p} O_j$. This is a finite subcover of $f(D)$, which must therefore be compact. •
Now we can prove an Extreme Value Theorem for $\mathbb{E}^n$.
Theorem 9.3.2 (Extreme Value Theorem for $\mathbb{E}^n$) If $D \subset \mathbb{E}^n$ is compact and if $f : D \to \mathbb{R}$ is continuous, then $f$ achieves both a maximum value and a minimum value on $D$.
Proof: Since $f(D)$ is a compact subset of $\mathbb{R}$, it follows that $f(D)$ is both closed and bounded. Let $M = \sup f(D)$ and $m = \inf f(D)$, both of which are necessarily real-valued (not infinite). It will suffice to prove that both $M$ and $m$ are elements of $f(D)$. We will see that this follows from the fact that $f(D)$ is closed. Since $M$ is the least upper bound of $f(D)$, $M - \frac{1}{k}$ fails to be an upper bound of $f(D)$, and this is true for every $k \in \mathbb{N}$. Thus, for each $k \in \mathbb{N}$ we have a point $x_k \in D$ such that
$$M - \frac{1}{k} < f(x_k) \leq M.$$
If there exists $k$ such that $f(x_k) = M$, then $M \in f(D)$ and we are finished. Otherwise, the sequence $f(x_k) \to M$, with none of the points of this sequence being $M$, so that $M$ is a cluster point of $f(D)$. Hence $M \in f(D)$ as claimed. The similar argument for $m$ is an exercise. •
Here is a surprising theorem that can be quite useful.
Theorem 9.3.3 If $f : D \to \mathbb{E}^m$ is a continuous, one-to-one function defined on a compact set $D \subset \mathbb{E}^n$, then $f^{-1}$ is continuous.
Proof: Since $f$ is one-to-one (also called injective), it makes sense to define $f^{-1}(y) = x$ if and only if $f(x) = y$. An invertible continuous function that has a continuous inverse is called a homeomorphism. Thus in words the theorem says that a continuous injective map of a compact set is a homeomorphism. It suffices to prove the continuity of $f^{-1}$ at any cluster point $y \in f(D)$. If $y_j \in f(D)$ and $y_j \to y$, we need to show that
$$x_j = f^{-1}(y_j) \to x = f^{-1}(y).$$
We claim that the entire sequence $x_j$ must converge to $x$. If this were false, then there would exist $\epsilon > 0$ and a subsequence $x_{j_i}$ for which $\|x_{j_i} - x\| \geq \epsilon$ for all $i$. In that case, the bounded sequence $x_{j_i}$ must itself have a subsequence $x_{j_{i_l}}$ that converges to some vector $x'$, which lies in $D$ because $D$ is closed, and $x' \neq x$, since $\|x' - x\| \geq \epsilon$. But $f(x') = y = f(x)$ by the continuity of $f$. This is a contradiction since $f$ is one-to-one. •
We can extend the concept of the sup-norm to vector-valued functions as follows.
Definition 9.3.1 If $f : D \to \mathbb{E}^m$ where $D \subseteq \mathbb{E}^n$, we define the sup-norm of $f$ to be
$$\|f\|_{\sup} = \sup \{\|f(x)\| \mid x \in D\}.$$
It is left to the Exercises to show that this is in fact a norm. Next we can extend the concept of uniform convergence as follows to vector-valued functions.
Definition 9.3.2 If $f$ and $f_j$ are all defined on a domain $D$ with values in $\mathbb{E}^m$, we say that $f_j \to f$ uniformly on $D$ provided that $\|f_j - f\|_{\sup} \to 0$ as $j \to \infty$.
The following theorem shows that uniform convergence behaves very well with respect to the continuity of functions.
Theorem 9.3.4 Suppose $f_j \in C(D, \mathbb{E}^m)$ for all $j$, and suppose also that $f_j \to f$ uniformly on $D$. Then $f \in C(D, \mathbb{E}^m)$.
The proof is left to the Exercises.
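The role of the sup-norm in uniform convergence can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the text: it uses the sequence $f_j(x) = x^j$ (my choice of example), and it approximates the supremum by a maximum over a finite grid, which can only underestimate the true sup-norm. The function names are hypothetical.

```python
def sup_distance_on_grid(g, h, a, b, n=1000):
    """Approximate sup |g(x) - h(x)| over [a, b] using n + 1 equally spaced points."""
    points = (a + (b - a) * i / n for i in range(n + 1))
    return max(abs(g(x) - h(x)) for x in points)


def power(j):
    """The function f_j(x) = x**j."""
    return lambda x: x ** j


def zero(x):
    """Pointwise limit of x**j on [0, 1/2]."""
    return 0.0


def limit_on_unit_interval(x):
    """Pointwise limit of x**j on [0, 1]: 0 for x < 1 and 1 at x = 1."""
    return 1.0 if x >= 1.0 else 0.0


for j in (5, 20, 80):
    d_half = sup_distance_on_grid(power(j), zero, 0.0, 0.5)
    d_unit = sup_distance_on_grid(power(j), limit_on_unit_interval, 0.0, 1.0)
    print(j, d_half, d_unit)
```

On $[0, \frac{1}{2}]$ the grid distances shrink like $2^{-j}$, which is consistent with uniform convergence to 0. On $[0, 1]$ they stay close to 1, which is consistent with Theorem 9.3.4: the pointwise limit there is discontinuous, so the convergence cannot be uniform.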
EXERCISES
9.29 Prove or give a counterexample: If a continuous injective function $f$ mapping $(-1, 1)$ to $\mathbb{E}^1$ has closed range then its range must be bounded.
9.30 Prove or give a counterexample: If a continuous injective function $f$ mapping $(-1, 1)$ to $\mathbb{E}^1$ has bounded range, then its range must be closed.
9.31 Give an example of a noncontinuous image of a compact set that is not compact because
a) it is closed but not bounded.
b) it is bounded but not closed.
9.32 The set $E = (0, 1]$ is a bounded subset of $\mathbb{E}^1$ and $E$ is relatively closed in the set $F = (0, 2]$. Is $E$ compact? Justify your answer.
9.33 Let $D = \{\frac{1}{n} \mid n \in \mathbb{N}\} \cup \{0\} \subset \mathbb{E}^1$. Prove each of the following statements.
a) $D$ is compact.
b) If $f \in C(D, \mathbb{E}^m)$, then the function $\|f\|$ must achieve both a maximum value and a minimum value on $D$.
c) If $g : D \to \mathbb{E}^m$, then $g \in C(D, \mathbb{E}^m)$ if and only if $g(\frac{1}{n}) \to g(0)$ as $n \to \infty$. (Caution: Be sure to prove that if the latter condition is satisfied and if $\{x_k \mid k \in \mathbb{N}\} \subseteq D$ is such that $x_k \to 0$, then $g(x_k) \to g(0)$.)
9.34 Complete the proof of Theorem 9.3.2 by proving that $m$ is a cluster point of $f(D)$.
9.35 Let $\phi : [0, 1] \to \mathbb{E}^2$ be defined by $\phi(t) = (\cos[2\pi t], \sin[2\pi t])$. Let $f = \phi|_{[0,1)}$ be the restriction of $\phi$ to $[0, 1)$.
a) Show that $f$ maps the interval $[0, 1)$ one-to-one and continuously onto $S^1$, the circle of radius 1 in $\mathbb{E}^2$.
b) Show that $f^{-1}$ is not continuous. Why doesn't this contradict Theorem 9.3.3? Evaluate or prove nonexistent
c) The function $\phi$ maps the compact set $[0, 1]$ continuously onto the circle $S^1$ of radius 1 in $\mathbb{E}^2$. Does $\phi$ have a continuous inverse? Does this contradict Theorem 9.3.3?
9.36 Let $f : [0, \infty) \to \mathbb{E}^2$ be defined by $f(t) = (\cos[4 \tan^{-1} t], \sin[4 \tan^{-1} t])$. Show that $f$ maps the interval $[0, \infty)$ one-to-one and continuously onto the circle of radius 1 in $\mathbb{E}^2$. Show that $f^{-1}$ is not continuous.
Figure 9.3 $f(x_1, x_2) = x_1 x_2 + \dfrac{96}{x_1} + \dfrac{96}{x_2}$.
9.37 a) Suppose $S \subseteq \mathbb{E}^n$, $K \subset \mathbb{E}^m$, and $K$ is compact. Suppose $f : S \to K$ is one-to-one and onto $K$ and that $f^{-1}$ is continuous. Prove: $S$ is compact.
b) Give an example of a non-compact set $S \subseteq \mathbb{E}^n$, a compact set $K \subset \mathbb{E}^m$, and a function $f : S \to K$ that is one-to-one, continuous, and into $K$, although $f^{-1}$ is continuous as well.
9.38 Prove that the sup-norm introduced in Definition 9.3.1 satisfies Definition 2.4.4 and is thus a norm.
9.39 Prove Theorem 9.3.4. (Hint: Adapt the proof of Theorem 2.5.1.)
9.40 Let $D \subseteq \mathbb{E}^n$ and $f_j : D \to \mathbb{E}^m$, for all $j \in \mathbb{N}$. Prove that in the normed vector space $C(D, \mathbb{E}^m)$ equipped with the sup-norm, a sequence $f_j$ converges if and only if it is a Cauchy sequence. Prove that $C(D, \mathbb{E}^m)$ is a complete normed linear space. (Hint: Adapt the proof of Theorem 2.5.4.)
9.41 † Let $D \subseteq \mathbb{E}^n$ and $f : D \to \mathbb{E}^m$. We call $f$ uniformly continuous on $D$ if and only if for each $\epsilon > 0$ there exists $\delta > 0$ such that $\|x - x'\| < \delta$ implies $\|f(x) - f(x')\| < \epsilon$, for all $x$ and $x' \in D$. If $D$ is compact and if $f \in C(D, \mathbb{E}^m)$, prove that $f$ is uniformly continuous on $D$. (Hint: Suppose false and use the Bolzano-Weierstrass Theorem to deduce a contradiction.)
9.42 It is commonly necessary to adapt Theorem 9.3.2 to special circumstances in which the domain $D$ is not compact. Suppose we wish to construct an open-topped
rectangular box of volume 48 with base measuring $x_1$ by $x_2$ in such a way as to achieve minimum total surface area
$$f(x_1, x_2) = x_1 x_2 + \frac{96}{x_1} + \frac{96}{x_2}$$
for the five faces of the box. Prove: The function $f$ has a minimum value on the domain described by the conditions $x_1 > 0$ and $x_2 > 0$. See Fig. 9.3. (Hint: The idea is to reduce the problem to one on a compact domain. It may be helpful to consider the curves $x_1 x_2 = k$ for positive constants $k$, as well as the lines $x_1 = a$ and $x_2 = b$ for constants $a$ and $b$.)
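The reduction to a compact domain suggested in the hint can be explored numerically before attempting the proof. The Python sketch below is an illustration, not a proof and not the author's method. It assumes the surface-area function $f(x_1, x_2) = x_1 x_2 + 96/x_1 + 96/x_2$ (base plus four sides of a box of volume 48), and it searches the compact square $[0.5, 20]^2$, a region I chose for illustration. The comparison point $x_1 = x_2 = 96^{1/3}$ comes from setting both partial derivatives to zero, which is a calculation added here rather than taken from the text.

```python
def surface_area(x1, x2):
    """Assumed total area of the five faces: base x1*x2 plus sides 96/x1 + 96/x2."""
    return x1 * x2 + 96.0 / x1 + 96.0 / x2


def grid_minimum(lo, hi, step):
    """Smallest sampled value of surface_area on the square [lo, hi]^2."""
    n = int(round((hi - lo) / step))
    best = None
    for i in range(n + 1):
        x1 = lo + i * step
        for j in range(n + 1):
            x2 = lo + j * step
            value = surface_area(x1, x2)
            if best is None or value < best[0]:
                best = (value, x1, x2)
    return best


value, x1, x2 = grid_minimum(0.5, 20.0, 0.05)
side = 96.0 ** (1.0 / 3.0)  # candidate critical point x1 = x2 = 96^(1/3)
print(value, x1, x2)
print(surface_area(side, side), side)
```

On the edges of this square the function is much larger than at the interior candidate (for instance, $x_1 = 0.5$ already forces $96/x_1 = 192$), which is the kind of estimate the hint asks the reader to make rigorous.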
9.4 CONTINUOUS IMAGE OF A CONNECTED SET
We saw in Theorem 2.3.1 that every continuous function defined on an interval has the Intermediate Value Property, which is stated in Definition 2.3.1. In order to generalize the Intermediate Value Theorem to Euclidean Space, we will need to consider functions defined on a connected subset $D \subseteq \mathbb{E}^n$, and we will need to restrict our attention to real-valued functions so that the concept of $k$ being between $f(a)$ and $f(b)$ will make sense. But first we prove a more general theorem.
Theorem 9.4.1 Let $D \subseteq V$, a normed vector space, be a connected set, and suppose $f \in C(D, \mathbb{E}^m)$. Then the range $f(D)$ is a connected subset of $\mathbb{E}^m$.
Proof: Suppose the theorem were false: We will deduce a contradiction. By Theorem 8.4.1, the range $f(D)$ can be separated as $f(D) = A \cup B$, where $A \neq \emptyset \neq B$, $A \cap B = \emptyset$, and neither $A$ nor $B$ contains any cluster points of the other set. It follows that $D = f^{-1}(A) \cup f^{-1}(B)$, where the two nonempty sets $A' = f^{-1}(A)$ and $B' = f^{-1}(B)$ are disjoint. We claim that neither $A'$ nor $B'$ can have any cluster point of the other set. Suppose false: For example, suppose there exists $a' \in A'$ that is also a cluster point of $B'$. Then there exists a sequence $b_j' \in B'$ such that $b_j' \to a'$. By continuity, $b_j = f(b_j') \to a = f(a')$. Since each $b_j \in B$ and $a \in A$, the set $A$ contains a cluster point of $B$, which contradicts the choice of $A$ and $B$. It follows that neither $A'$ nor $B'$ can have a cluster point of the other set, so that $A'$ and $B'$ separate $D$, which contradicts the hypothesis that $D$ is connected. Hence $f(D)$ is connected. •
Theorem 9.4.2 (Intermediate Value Theorem) Let $D$ be a connected subset of $V$, a normed vector space, and suppose $f \in C(D, \mathbb{E}^1)$, so that $f$ is real-valued. If $a$ and $b \in D$ and if $f(a) < k < f(b)$, then there exists $c \in D$ such that $f(c) = k$.
Proof: By the preceding theorem, we see that $f(D)$ is a connected subset of $\mathbb{E}^1$ and is thus an interval $I$. (See Exercise 8.45.) Since $f(a)$ and $f(b) \in I$, it follows that $k \in I = f(D)$. Thus there exists $c \in D$ such that $f(c) = k$. •
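The theorem asserts only that some point $c$ exists. In the special case where the straight-line segment from $a$ to $b$ lies in $D$ (for example, when $D$ is convex), one way to locate such a point is to apply bisection to the one-variable function $t \mapsto f(a + t(b - a))$. The Python sketch below illustrates that idea under these assumptions; the function name `find_level_point` and the sample function are hypothetical and not taken from the text.

```python
def find_level_point(f, a, b, k, tol=1e-10):
    """Bisection for f(c) = k on the segment from a to b.

    Assumes f is continuous on the segment and f(a) < k < f(b).
    """
    def point(t):
        return tuple(ai + t * (bi - ai) for ai, bi in zip(a, b))

    if not (f(a) < k < f(b)):
        raise ValueError("need f(a) < k < f(b)")
    lo, hi = 0.0, 1.0  # invariant: f(point(lo)) < k <= f(point(hi))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(point(mid)) < k:
            lo = mid
        else:
            hi = mid
    return point(0.5 * (lo + hi))


def squared_norm(p):
    """Sample continuous real-valued function on the plane."""
    return p[0] ** 2 + p[1] ** 2


c = find_level_point(squared_norm, (0.0, 0.0), (3.0, 4.0), 1.0)
print(c, squared_norm(c))
```

For this sample function the segment from $(0,0)$ to $(3,4)$ meets the level set $f = 1$ at the parameter $t = \frac{1}{5}$, so the computed point should be close to $(0.6, 0.8)$.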
Figure 9.4 $f(x_1, x_2) = \left(x_1, x_2, \dfrac{1}{x_1 x_2}\right)$.
EXERCISES
9.43 † A subset $D \subseteq \mathbb{E}^n$ is called convex if and only if for each pair of points $a$ and $b \in D$ the straight-line segment joining them is contained in $D$. That is,
$$S(a, b) = \{\phi(t) \mid \phi(t) = a + t(b - a),\ t \in [0, 1]\} \subseteq D.$$
a) Prove that every convex subset of $\mathbb{E}^n$ is connected.
b) Prove that $\mathbb{E}^n$ is a connected set for each $n \in \mathbb{N}$.
9.44 The plane $\mathbb{E}^2 = \{x \mid x_1 x_2 = 0\} \,\dot\cup\, Q_I \,\dot\cup \cdots \dot\cup\, Q_{IV}$. Here $Q_I$, for example, denotes the open first quadrant, where $x_1 > 0$ and $x_2 > 0$, and the dots denote disjointness of the union. Prove: Each of the five sets in the union is connected.
9.45 Let $f : D \to \mathbb{E}^3$ by
$$f(x) = \left(x_1, x_2, \frac{1}{x_1 x_2}\right),$$
where $D$ is the Euclidean plane without either of the two coordinate axes. Prove: $f(D)$ can be written as the union of 4 mutually disjoint sets, each set connected and each set maximal in the sense that it could not be enlarged with additional points from $f(D)$ without losing connectivity. See Fig. 9.4.
9.46 Prove that each open ball $B_r(a)$ is a convex set in $\mathbb{E}^n$. Do the same thing for each closed ball $\bar{B}_r(a)$.
9.47
Consider the polynomial p E C
(1E2 , lE 1 )
p(x) = xix~ - 2xix2
given by
+ x1x2 -
3.
Show that there exists $x \in \mathbb{E}^2$ such that $p(x) = 0$.
9.48 Consider the function $f \in C(\mathbb{E}^2, \mathbb{E}^2)$ given by $f(x) = e^{x_1}(\cos x_2, \sin x_2)$. Note that $f(1, 0) = (e, 0)$ and $f(1, \pi) = (-e, 0)$. Is there a point $x \in \mathbb{E}^2$ such that $f(x) = (0, 0)$? Justify your claim.
9.49 Suppose $D \subset \mathbb{E}^n$ is compact and $f \in C(D, \mathbb{E}^m)$ is one-to-one. Prove: If $f(D)$ is connected, then $D$ is connected.
9.50 Show how to use Theorem 9.4.1 to give an alternative proof of Theorem 8.4.2.
9.51 Show how to use Theorem 9.4.1 to give an alternative proof that $\mathbb{E}^n$ is connected. (Hint: If false, show that the continuous image of an interval into $\mathbb{E}^n$ could fail to be connected, which is a contradiction.)
9.5 TEST YOURSELF
EXERCISES
9.52 True or Give a Counterexample: If a set $S = A \cup B \subset \mathbb{E}^n$ is the disjoint union of two nonempty sets, neither $A$ nor $B$ containing a cluster point of the other, then there is a positive distance $\delta > 0$ such that for all $a \in A$ and $b \in B$ we have $\|a - b\| \geq \delta$.
9.53 Let if $x > 0$, if $x = 0$.
True or False: The graph of $f$ is a connected set.
9.54 Give an example of a function $f : \mathbb{E}^2 \to \mathbb{R}$ for which
$$\lim_{x_1 \to 0} f(x_1, k x_1) = 0$$
for all $k \in \mathbb{R}$, yet $\lim_{x \to 0} f(x)$ does not exist.
9.55 State the Cauchy Criterion for the existence of $\lim_{x \to a} f(x)$ at a cluster point $a$ of $D_f$.
9.56
True or False: Iff : lEn __, lR is defined by f(x) =
"'n (
)i
L.t~=~ ~:112 xi ' 2
then the set $S = \{x \mid f(x) = 0\}$ is a closed set.
9.57 True or False: If $f \in C(D, \mathbb{E}^m)$ with $D \subset \mathbb{E}^n$ and if $O$ is open in $\mathbb{E}^m$, then $f^{-1}(O)$ is an open set in $\mathbb{E}^n$.
9.58
Let ifx
# 0,
ifx = 0. Either find limx---+O f(x) or else state that it does not exist.
In
9.59 Let D = {0} U { 2~ E N} C IE 1 • True or False: If g : D g E C(D, IEm) if and only if g ( 2~) --> g(O) as n--> oo.
-->
IEm, then
9.60 True or Give a Counterexample: If $S$ is a relatively closed subset of $B_1(0) \subset \mathbb{E}^n$, then $S$ must be compact.
9.61 True or False: $B_1(0, 0) \cup B_1(1, 1)$ is a connected subset of $\mathbb{E}^2$.
9.62 Let $f \in C(\mathbb{E}^1, \mathbb{E}^2)$. Suppose $f(0) = (1, 0)$ and $f(1) = (-1, 0)$. True or False: there exists a number $x$ between 0 and 1 such that $f(x) = (0, 0)$.
9.63 Let $D = \{0\} \cup \{(x_1, \sin\frac{1}{x_1}) \mid x_1 \neq 0\}$. True or False: If $f \in C(D, \mathbb{E}^2)$, then $f(D)$ is connected.
CHAPTER 10
THE DERIVATIVE IN EUCLIDEAN SPACE
10.1 LINEAR TRANSFORMATIONS AND NORMS
We saw in Theorem 4.1.1 that a differentiable function $f : D \to \mathbb{R}$ defined on a domain $D \subseteq \mathbb{R}$ is a function for which the increments are given by
$$\Delta f = f(x + h) - f(x) = L(h) + \epsilon(h) h, \tag{10.1}$$
where $L$ is linear and $\epsilon(h) \to 0$ as $h \to 0$. We saw that if a function is differentiable, then the derivative $f'(x)$ exists and we have
$$L(h) = f'(x) h.$$
What the linear function $L$ actually does is assign to the increment $h$ in the variable $x$ the corresponding rise of the tangent line to the graph of $f$ at $(x, f(x))$. The problem in generalizing this concept to $f : \mathbb{E}^n \to \mathbb{E}^m$ is that in this context $f$, $L$, $h$, and $\epsilon$ will all need to be replaced by vectors and by vector-valued functions $\mathbf{f}$, $\mathbf{L}$, $\mathbf{h}$, and $\boldsymbol{\epsilon}(\mathbf{h})$. Unfortunately, it would be meaningless to multiply the vector values of $\boldsymbol{\epsilon}(\mathbf{h})$ and $\mathbf{h}$. Thus, for purposes of generalization to $f : \mathbb{E}^n \to \mathbb{E}^m$ it will be convenient to reformulate Equation (10.1) in the following equivalent form. A differentiable function $f : D \to \mathbb{R}$ is a function for which the increments are given by
$$\Delta f = f(x + h) - f(x) = L(h) + \tilde{\epsilon}(h), \tag{10.2}$$
where $L$ is linear and
$$\frac{|\tilde{\epsilon}(h)|}{|h|} \to 0$$
as $h \to 0$. In this reformulation, $\tilde{\epsilon}(h) = \epsilon(h) h$. When we pass to functions $f : \mathbb{E}^n \to \mathbb{E}^m$, we cannot divide by a vector $h$. However, we can divide by its norm, which means the same thing as the length of the vector increment.
Since for functions $f : \mathbb{R} \to \mathbb{R}$ the derivative $f'(x)$ is the single coefficient of a $1 \times 1$ matrix representing the linear map $L$, it is easy to overlook the important distinction between the number $f'(x)$ and the linear transformation $L$ for which this number is its sole matrix coefficient. But for $f : \mathbb{E}^n \to \mathbb{E}^m$ the linear transformation $L$ will map $\mathbb{E}^n$ to $\mathbb{E}^m$, and it will be represented by an $m \times n$ matrix.
In the present section we begin by reviewing some basic information about linear transformations and their matrices. We will need to introduce the concept of the norm of a matrix, which may be new to the reader. Let us denote by $\mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ the set of all linear transformations $L : \mathbb{E}^n \to \mathbb{E}^m$. (In the special case in which $m = n$, we will write $\mathcal{L}(\mathbb{E}^n)$ in place of $\mathcal{L}(\mathbb{E}^n, \mathbb{E}^n)$.) The reader will recall [10] that if $L$ and $L' \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ and if $c \in \mathbb{R}$, then $cL + L' \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$, so that $\mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ is a vector space. Let $x \in \mathbb{E}^n$ and $L \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$. We can decompose
$$x = \sum_{j=1}^{n} x_j e_j,$$
where $e_j = (0, \ldots, 0, 1, 0, \ldots, 0)$, where all entries are 0 except for the $j$th entry, which is 1. The family $\{e_j \mid j = 1, \ldots, n\}$ is called the standard basis of $\mathbb{E}^n$. For the range space $\mathbb{E}^m$ we will denote the standard basis $\{f_1, \ldots, f_m\}$ to avoid confusion. Now we can write $L(x) = \sum_{j=1}^{n} L(e_j) x_j$. We can introduce an $m \times n$ matrix where each column vector
$$c_j = \begin{bmatrix} l_{1j} \\ l_{2j} \\ \vdots \\ l_{mj} \end{bmatrix}$$
and $L(e_j) = \sum_{i=1}^{m} l_{ij} f_i$. Then $L(x) = [L]x$, in which the $m \times n$ matrix $[L]$ multiplies the column vector $x$ in the following customary way. We express the matrix $[L]$ in terms of its $m$ row vectors:
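$$[L] = \begin{bmatrix} R_1 \\ R_2 \\ \vdots \\ R_m \end{bmatrix}.$$
Thus $L(x)$ can be represented by the column vector
$$[L] \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} R_1 \cdot x \\ R_2 \cdot x \\ \vdots \\ R_m \cdot x \end{bmatrix}.$$
Hence $L(x) = \sum_{i=1}^{m} (R_i \cdot x) f_i$. For the purposes of this book, we will need to introduce a norm on the vector space $\mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$. We begin with the following theorem.
Theorem 10.1.1 Let $L \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ with matrix $[L] = [l_{ij}]_{m \times n}$. Let
$$M = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} l_{ij}^2}.$$
Then for each $x \in \mathbb{E}^n$ we have
$$\|L(x)\| \leq M \|x\|.$$
Proof: In the notation introduced just above the statement of the theorem, we proceed as follows using the Cauchy-Schwarz inequality and the orthogonality of the basis vectors, each of which has norm 1.
$$\|L(x)\|^2 = \Big\| \sum_{i=1}^{m} (R_i \cdot x) f_i \Big\|^2 = \sum_{i=1}^{m} (R_i \cdot x)^2 \leq \sum_{i=1}^{m} \|R_i\|^2 \|x\|^2 = M^2 \|x\|^2.$$
•
We are ready to introduce a norm on $\mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ by generalizing a definition from Theorem 5.4.2 as follows.
Definition 10.1.1 For each $L \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ we define a norm by
$$\|L\| = \inf \{K \mid \|L(x)\| \leq K \|x\| \ \forall x \in \mathbb{E}^n\}.$$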
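The relation between the bound $M$ of Theorem 10.1.1 and the norm of Definition 10.1.1 can be checked on a small example. The Python sketch below is an illustration under assumptions that are mine: it takes $M$ to be the square root of the sum of the squares of the matrix entries, which is the constant that the Cauchy-Schwarz argument produces, and it estimates $\|L\|$ from below by sampling unit vectors in the plane. All function names are hypothetical.

```python
import math


def apply(matrix, x):
    """Matrix-vector product: entry i is the dot product of row R_i with x."""
    return [sum(l * xj for l, xj in zip(row, x)) for row in matrix]


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in v))


def entrywise_bound(matrix):
    """Assumed constant M: square root of the sum of squares of all entries."""
    return math.sqrt(sum(l * l for row in matrix for l in row))


def sampled_norm_2d(matrix, samples=3600):
    """Lower estimate of ||L|| for a matrix with two columns.

    Takes the largest ||L(x)|| over equally spaced unit vectors x.
    """
    best = 0.0
    for i in range(samples):
        theta = 2.0 * math.pi * i / samples
        x = [math.cos(theta), math.sin(theta)]
        best = max(best, norm(apply(matrix, x)))
    return best


L = [[3.0, 0.0], [0.0, 4.0]]
M = entrywise_bound(L)
estimate = sampled_norm_2d(L)
print(M, estimate)
```

For this diagonal example the sampled value is close to 4 while $M = 5$, which suggests that $M$ is an admissible constant $K$ in Definition 10.1.1 but need not be the smallest one.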
Remark 10.1.1 In Exercise 10.1, the reader will show that
$$\|L(x)\| \leq \|L\| \cdot \|x\|$$
for all $x \in \mathbb{E}^n$. In order to show that the preceding definition satisfies Definition 2.4.4 of a norm on a vector space, the most significant step is the triangle inequality, which we leave to Exercise 10.4.
Remark 10.1.2 In Exercise 10.10 the reader will prove that $\mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ is a complete normed linear space.
• EXAMPLE 10.1
If $T \in \mathcal{L}(\mathbb{E}^n)$, the linear transformation $T : \mathbb{E}^n \to \mathbb{E}^n$ is called invertible if and only if $T$ is both one-to-one and onto $\mathbb{E}^n$. This can be expressed also as requiring that $T$ be both an injection and a surjection, which is called a bijection. This condition is equivalent to the existence of a map $T^{-1} \in \mathcal{L}(\mathbb{E}^n)$ such that
$$T \circ T^{-1} = T^{-1} \circ T = I,$$
the identity transformation of $\mathbb{E}^n \to \mathbb{E}^n$. We will have a special interest in invertible linear transformations of $\mathbb{E}^n$. We define
$$\mathcal{GL}(n, \mathbb{R}) = \{T \in \mathcal{L}(\mathbb{E}^n) \mid T^{-1} \text{ exists}\}.$$
It will be useful to recall from linear algebra that $T^{-1}$ exists if and only if the determinant is nonzero: $\det T \neq 0$. In this connection it may help the reader to remember that the determinant of the matrix of a linear transformation is independent of the basis with respect to which the matrix is written. This is because if $B$ is the standard basis and $B'$ is any other basis, there will be a change of basis matrix $A$ such that
$$[T]_{B'} = A^{-1} [T]_B A,$$
to which we apply the multiplicative property of the determinant function to see that
$$\det [T]_{B'} = (\det A)^{-1} \det [T]_B \det A = \det [T]_B.$$
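The basis independence of the determinant can be checked on a concrete $2 \times 2$ example. The Python sketch below is only an illustration: it assumes the change of basis relation $[T]_{B'} = A^{-1}[T]_B A$, and the particular matrices and helper names are hypothetical choices. Exact rational arithmetic from the standard library avoids rounding issues.

```python
from fractions import Fraction


def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def det2(P):
    """Determinant of a 2 x 2 matrix."""
    return P[0][0] * P[1][1] - P[0][1] * P[1][0]


def inverse2(P):
    """Inverse of a 2 x 2 matrix with nonzero determinant."""
    d = det2(P)
    return [[P[1][1] / d, -P[0][1] / d],
            [-P[1][0] / d, P[0][0] / d]]


# Matrix of T in the standard basis B, and a change of basis matrix A (det A = 1).
T_B = [[Fraction(2), Fraction(1)], [Fraction(0), Fraction(3)]]
A = [[Fraction(1), Fraction(2)], [Fraction(1), Fraction(3)]]

# Assumed relation: [T]_{B'} = A^{-1} [T]_B A.
T_Bprime = matmul(inverse2(A), matmul(T_B, A))
print(T_Bprime, det2(T_B), det2(T_Bprime))
```

The two matrices differ entry by entry, yet both determinants equal 6, as the multiplicative property predicts.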
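EXERCISES
10.1 † If $L \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$, prove $\|L(x)\| \leq \|L\| \|x\|$ for all $x \in \mathbb{E}^n$. Hint: Express $\|L\|$ in terms of the set $\left\{ \frac{\|L(x)\|}{\|x\|} \,\middle|\, x \neq 0 \right\}$.
10.2 If $L \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$, prove $L = 0$, the zero linear transformation, if and only if $\|L\| = 0$.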
10.3 If $L \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$, prove $\|cL\| = |c| \|L\|$ for all $c \in \mathbb{R}$.
10.4 † If $L$ and $L' \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$, prove the triangle inequality
$$\|L + L'\| \leq \|L\| + \|L'\|.$$
10.5
Let A and B be in .C
(1E 2 )
[A] = (
with matrices in the standard basis
b 8) and [B] = ( g ?) .
Find IIAII, IIBII, and IIA + Bll·
10.6 Give an example of $X \in \mathcal{L}(\mathbb{E}^2)$ for which $\|X + X\| < \|X\| + \|X\|$ or else prove impossible.
10.7 Let $L \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$. Prove that $L$ is a uniformly continuous function from $\mathbb{E}^n \to \mathbb{E}^m$, as defined in Exercise 9.41. (Hint: Use the result of Exercise 10.1.)
10.8 Let $L \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ and $T \in \mathcal{L}(\mathbb{E}^m, \mathbb{E}^k)$, so that $T \circ L \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^k)$.
a) Prove: $\|T \circ L\| \leq \|T\| \|L\|$.
b) Now let $k = m = n$. Denote $T^2 = T \circ T$ and $T^{j+1} = T^j \circ T$ for all $j \in \mathbb{N}$. Show that $\|T^j\| \leq \|T\|^j$.
c) Give an example in which $\|T^2\| < \|T\|^2$. (Hint: Find a linear transformation $T \neq 0$ for which $T^2 = 0 \in \mathcal{L}(\mathbb{E}^n)$.)
10.9 † Let $T^{(k)} \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ for each $k \in \mathbb{N}$. Prove that the sequence $T^{(k)} \to T$ as $k \to \infty$ in the norm of $\mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ if and only if for each $i, j$ the sequence of matrix coefficients $t_{ij}^{(k)} \to t_{ij}$. Hint: Prove that for each $i, j$ we have
10.10 Prove that the normed vector space $\mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ is complete, meaning that each Cauchy sequence in the norm converges to an element of $\mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$.
10.11 Suppose $X \in \mathcal{L}(\mathbb{E}^n)$ is such that $\|X\| < 1$.
a) Prove that
$$T_K = \sum_{k=0}^{K} X^k$$
is Cauchy, so that $T_K$ converges to some $T \in \mathcal{L}(\mathbb{E}^n)$ as $K \to \infty$.
b) Prove that $(I - X) T_K \to I$ as $K \to \infty$.
c) Prove that $(I - X)^{-1}$ exists and equals $T$. (Hints: Here the power $X^0 = I$, the identity transformation in $\mathcal{L}(\mathbb{E}^n)$. See Exercise 10.8.)
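The convergence claimed in Exercise 10.11 can be observed numerically before it is proved. The Python sketch below is an illustration only. It assumes the partial sums are $T_K = \sum_{k=0}^{K} X^k$ with $X^0 = I$, and it uses a particular $2 \times 2$ matrix $X$ of my choosing, for which the sum of the squares of the entries is less than 1, so that $\|X\| < 1$ by Theorem 10.1.1. The helper names are hypothetical.

```python
def matmul(P, Q):
    """Product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    """The n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def partial_sum(X, K):
    """T_K = I + X + X^2 + ... + X^K."""
    n = len(X)
    total = identity(n)
    power = identity(n)
    for _ in range(K):
        power = matmul(power, X)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total


X = [[0.2, 0.1], [0.0, 0.3]]
I2 = identity(2)
I_minus_X = [[I2[i][j] - X[i][j] for j in range(2)] for i in range(2)]

T_60 = partial_sum(X, 60)
product = matmul(I_minus_X, T_60)
error = max(abs(product[i][j] - I2[i][j]) for i in range(2) for j in range(2))
print(T_60, error)
```

The printed error measures how far $(I - X) T_{60}$ is from $I$; algebraically this product equals $I - X^{61}$, so the error should be at the level of rounding.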
10.12
Suppose A E .C (lE 2 ) has the matrix
[A]= (
~
)2)
with respect to the standard basis of lE 2 • Prove that
$$\sqrt{13} \leq \|A\| \leq 4.$$
10.13 † Show that the set of linear transformations $\{T_{i,j}\}$ defined by $T_{i,j} : e_i \to f_j$ and $T_{i,j} : e_l \to 0$ for all $l \ne i$ is a basis for the finite-dimensional vector space $\mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$, and show that the dimension is $\dim \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m) = mn$.
10.14 Let the function $f : \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m) \to \mathbb{R}$ be defined by $f(T) = t_{i_0 j_0}$, where $[T] = [t_{ij}]_{m \times n}$, $1 \le i \le m$, and $1 \le j \le n$. Prove that $f$ is uniformly continuous. (Hint: Show that $f$ is a linear map from $\mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ to $\mathbb{R}$ and use the hint from Exercise 10.9.)
10.15 Define $S \subset \mathcal{L}(\mathbb{E}^2)$ by letting $S = \{A \in \mathcal{L}(\mathbb{E}^2) \mid a_{11} \ne 0\}$, where the matrix $[A] = [a_{ij}]_{2 \times 2}$. Prove: $S$ is an open but not connected subset of $\mathcal{L}(\mathbb{E}^2)$.
10.16 † Prove that the determinant function $\det : \mathcal{L}(\mathbb{E}^n) \to \mathbb{R}$ is continuous. (Hint: The determinant is a polynomial in the matrix coefficients. See [10].)
10.17 Prove that $\mathcal{GL}(n, \mathbb{R})$, defined in Example 10.1, is a group with the operation being composition of the linear transformations. That is, if $S$, $T$, and $U \in \mathcal{GL}(n, \mathbb{R})$ and if $I$ is the identity transformation, then
a) $S \circ T \in \mathcal{GL}(n, \mathbb{R})$.
b) $S \circ (T \circ U) = (S \circ T) \circ U$.
c) $I \in \mathcal{GL}(n, \mathbb{R})$.
d) $S^{-1} \in \mathcal{GL}(n, \mathbb{R})$.
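10.18 Show that the subset $\mathcal{GL}(n, \mathbb{R}) \subset \mathcal{L}(\mathbb{E}^n)$ is not a vector space.
10.19 Show that the group $\mathcal{GL}(n, \mathbb{R})$ is not commutative if $n > 1$. That is, show there exist $S$ and $T \in \mathcal{GL}(n, \mathbb{R})$ such that $S \circ T \ne T \circ S$.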
10.20 Prove that $\mathcal{GL}(n, \mathbb{R})$ is an open subset of the normed vector space $\mathcal{L}(\mathbb{E}^n)$. (Hint: Use the result of Exercise 10.16 together with the fact that $T$ is invertible if and only if $\det(T) \ne 0$.)
10.21 Let $S \in \mathcal{GL}(n, \mathbb{R})$, and let $a = \|S^{-1}\|$. Use the following steps to prove that the open ball $B_{1/a}(S) \subset \mathcal{GL}(n, \mathbb{R})$.
a) Show that $a > 0$.
b) Show for each $x \in \mathbb{E}^n$ that $\|x\| \le a\|S(x)\|$, so that $\|S(x)\| \ge \frac{1}{a}\|x\|$.
c) Consider any $T \in B_{1/a}(S) \subset \mathcal{L}(\mathbb{E}^n)$. Write $\|T(x)\| = \|S(x) - [S - T](x)\|$. Prove that $\|T(x)\| \ge \rho\|x\|$, where $\rho = \frac{1}{a} - \|S - T\| > 0$. (Hint: See Exercise 9.13.)
d) Show that the kernel $\ker(T) = \{x \mid T(x) = 0\} = \{0\}$, so that $T$ is both one-to-one and onto $\mathbb{E}^n$. Hence $T$ is invertible.
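10.22 Prove that $\mathcal{GL}(n, \mathbb{R})$ is not a connected subset of $\mathcal{L}(\mathbb{E}^n)$. (Hint: Consider the result of Exercise 10.16. Use the fact that $T$ is invertible if and only if $\det(T) \ne 0$.)
10.23 If $T \in \mathcal{L}(\mathbb{E}^n)$, let $[T]_B = (t_{ij})_{n \times n}$ and denote the trace of $T$ by
\[ \operatorname{tr}(T) = \sum_{i=1}^{n} t_{ii}. \]
a) If $B'$ is any other basis for $\mathbb{E}^n$, and if $[T]_{B'} = (t'_{ij})_{n \times n}$, then prove that
\[ \sum_{i=1}^{n} t_{ii} = \sum_{i=1}^{n} t'_{ii}. \]
Hint: Use the theorem from Linear Algebra that $\operatorname{tr}(AB) = \operatorname{tr}(BA)$.
b) Denote by $\mathfrak{sl}(\mathbb{E}^n) = \{T \in \mathcal{L}(\mathbb{E}^n) \mid \operatorname{tr}(T) = 0\}$. Prove that $\mathfrak{sl}(\mathbb{E}^n)$ is a closed subset of $\mathcal{L}(\mathbb{E}^n)$.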
10.2
DIFFERENTIABLE FUNCTIONS
Recall from the opening of the preceding section our intended meaning of the concept of differentiability of $f$ at a point $x \in D \subseteq \mathbb{E}^n$. We mean that the increments $\Delta f = f(x + h) - f(x)$ should be locally approximated by a linear transformation $A \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ applied to the increment $h$. The formal definition is as follows.
Definition 10.2.1 Suppose $D \subseteq \mathbb{E}^n$ and $f : D \to \mathbb{E}^m$. If $x \in D$ is a cluster point of $D$ we say that $f$ is differentiable at $x$ and that $f'(x) = A \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ if
\[ \lim_{h \to 0} \frac{\|f(x + h) - f(x) - A(h)\|}{\|h\|} = 0. \]
Remark 10.2.1 The concept of derivative is most useful at points in the interior of the domain of definition. At such points, for all sufficiently small $h$, we have $x + h \in D$. However, the definition still makes sense even for boundary points of $D$, at which only some $h$ will be admissible, however short $h$ may be. In that case, it is understood in this definition that $h \to 0$ through values of $h$ such that $x + h \in D$, the domain of $f$.
An equivalent form of the condition in this definition is the following statement.
Theorem 10.2.1 Suppose $D \subseteq \mathbb{E}^n$ and $f : D \to \mathbb{E}^m$. Let $x \in D$ be a cluster point of $D$. Then $f$ is differentiable at $x$ and $f'(x) = A \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ if and only if
\[ \Delta f = f(x + h) - f(x) = A(h) + \epsilon(h), \quad \text{where } \frac{\|\epsilon(h)\|}{\|h\|} \to 0 \text{ as } h \to 0. \]
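The proof of this theorem is left to Exercise 10.33.
Theorem 10.2.2 If $f$ is differentiable at $x_0 \in D$, then $f$ is continuous at $x_0$.
Proof: Write $A = f'(x_0)$ and let $\epsilon(h)$ be as in Theorem 10.2.1. For each $x \in D$, denote $h = x - x_0$. Note that $A$ is continuous and $A(0) = 0$. Moreover, $\epsilon(h) \to 0$ as $h \to 0$ because $\|\epsilon(h)\|/\|h\| \to 0$. Then
\[ f(x) - f(x_0) = A(h) + \epsilon(h) \to 0 + 0 = 0 \]
as $x \to x_0$, so that $h \to 0$. Thus $f(x) \to f(x_0)$ as $x \to x_0$.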
•
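The limit in Definition 10.2.1 can be explored numerically. The following is only a sketch under stated assumptions: the map `f`, the base point, and the direction of the increments are hypothetical choices made for illustration, and `A` is the candidate derivative built from the partial derivatives of that particular `f`. The quotient $\|f(x+h) - f(x) - A(h)\|/\|h\|$ should shrink as $h$ shrinks.

```python
import math

def f(x):
    # Hypothetical sample map f : E^2 -> E^2 (not taken from the text).
    x1, x2 = x
    return (x1 * x1 + x2, math.sin(x1) * x2)

def A(x, h):
    # Candidate derivative f'(x) applied to h, from the partials of this f.
    x1, x2 = x
    h1, h2 = h
    return (2 * x1 * h1 + h2,
            math.cos(x1) * x2 * h1 + math.sin(x1) * h2)

def norm(v):
    # Euclidean norm.
    return math.sqrt(sum(c * c for c in v))

def quotient(x, h):
    # ||f(x+h) - f(x) - A(h)|| / ||h||, the quotient in Definition 10.2.1.
    xh = tuple(a + b for a, b in zip(x, h))
    err = tuple(p - q - r for p, q, r in zip(f(xh), f(x), A(x, h)))
    return norm(err) / norm(h)

x = (1.0, 2.0)
# Shrink h along the fixed direction (1, -1).
ratios = [quotient(x, (s, -s)) for s in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

A decreasing list of quotients is consistent with differentiability at the chosen point, although a numerical experiment along one direction is of course not a proof.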
If $f : \mathbb{R} \to \mathbb{R}$, we have seen that the differentiability of $f$ at $x$ is equivalent to the existence of
\[ \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \]
But if $f : D \to \mathbb{E}^m$ where $D \subseteq \mathbb{E}^n$ with $n > 1$, then the relationship between differentiability and the existence of the so-called partial derivatives is significant but imperfect. (See Exercise 10.38.)
Definition 10.2.2 Let $f : D \to \mathbb{E}^m$, where $D$ is a subset of $\mathbb{E}^n$, equipped with the standard orthonormal basis $B$. Then the partial derivatives are defined by
\[ \frac{\partial f_i}{\partial x_j} = \lim_{t \to 0} \frac{f_i(x + t e_j) - f_i(x)}{t} \]
for all $1 \le i \le m$, $1 \le j \le n$, provided these limits exist. Furthermore, the directional derivatives are defined by
\[ D_v f(x) = \lim_{t \to 0} \frac{f(x + tv) - f(x)}{t}, \]
provided that the limit exists.
Remark 10.2.2 In elementary courses on the calculus of several variables, it is customary to define the directional derivative for real-valued functions, though not for vector-valued functions, and to limit the directional derivative to directions specified by unit vectors $v$. We do not require $v$ to be a unit vector here, because if we did this we would lose the very satisfying property of $D_v f(x)$ presented in Exercise 10.42.
Theorem 10.2.3 Let $f : D \to \mathbb{E}^m$ be differentiable at $x \in D$, where $D$ is an open subset of $\mathbb{E}^n$. Then the directional derivatives
\[ D_v f(x) = \lim_{t \to 0} \frac{f(x + tv) - f(x)}{t} \]
exist and equal $f'(x)v$ for all $v \in \mathbb{E}^n$. Moreover, the partial derivatives $\partial f_i / \partial x_j$ exist for all $1 \le i \le m$, $1 \le j \le n$. With respect to the standard orthonormal basis $B$, the matrix
\[ [f'(x)]_B = \left[ \frac{\partial f_i}{\partial x_j} \right]_{m \times n}. \]
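Proof: In this theorem, the purpose of the assumption that $D$ is open is to ensure that for all $v \in \mathbb{E}^n$ we will have $x + tv \in D$ for all sufficiently small $t$. Denote $f'(x) = A \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$. For the directional derivatives (the case $v = 0$ being trivial) we write
\[ \Delta f = f(x + tv) - f(x) = A(tv) + \epsilon(tv) \]
and we see that
\[ \frac{\Delta f}{t} = Av + \frac{\epsilon(tv)}{\|tv\|} \cdot \frac{|t|\,\|v\|}{t} \to Av = D_v f(x) \]
as $t \to 0$. Here we use the fact that $\epsilon(tv)/\|tv\| \to 0$ as $t \to 0$ because its norm approaches zero as a result of the hypothesis of differentiability. And we use the fact that $|t|\,\|v\|/t$ is bounded. It is shown in Exercise 10.34 that $\partial f_i / \partial x_j$ is the $i$th component of $D_{e_j} f$. Thus each partial derivative $\partial f_i / \partial x_j$ exists because of the first part.
Denote the matrix $[f'(x)]_B = A = [a_{ij}]_{m \times n}$. We need to show that $\partial f_i / \partial x_j = a_{ij}$. Let $C_j$ denote the $j$th column vector of $A$, which is the matrix form of the vector $D_{e_j} f = A e_j$. The $i$th entry of $C_j$ is $a_{ij}$, and it is also the $i$th component of $D_{e_j} f$, namely $\partial f_i / \partial x_j$. This completes the proof that
\[ [f'(x)]_B = \left[ \frac{\partial f_i}{\partial x_j} \right]_{m \times n}. \]
•
Corollary 10.2.1 In Theorem 10.2.3, if $f'(x)$ exists, then it is unique. •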
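Theorem 10.2.3 says that the difference quotient defining $D_v f(x)$ converges to the matrix product $[f'(x)][v]$. A minimal numerical sketch follows; the function `f`, the point `x`, and the vector `v` are hypothetical choices for illustration only, and the step `t = 1e-6` is an arbitrary small value.

```python
import math

def f(x):
    # Hypothetical differentiable map f : E^2 -> E^2 (an assumption).
    x1, x2 = x
    return (x1 * x2, x1 + math.exp(x2))

def jacobian(x):
    # Matrix of partial derivatives [d f_i / d x_j] for the f above.
    x1, x2 = x
    return ((x2, x1),
            (1.0, math.exp(x2)))

def apply(J, v):
    # Matrix-vector product J v.
    return tuple(sum(a * b for a, b in zip(row, v)) for row in J)

def directional_quotient(x, v, t):
    # (f(x + t v) - f(x)) / t, the quotient whose limit is D_v f(x).
    xt = tuple(a + t * b for a, b in zip(x, v))
    return tuple((p - q) / t for p, q in zip(f(xt), f(x)))

x = (1.0, 0.5)
v = (3.0, -2.0)  # deliberately not a unit vector, as in Remark 10.2.2
exact = apply(jacobian(x), v)
approx = directional_quotient(x, v, 1e-6)
print(exact, approx)
```

The two printed vectors should agree to several decimal places; the residual discrepancy reflects the finite step size rather than any failure of the theorem.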
EXAMPLE 10.2
Let $f : \mathbb{E}^2 \to \mathbb{E}^1$ be defined by $f(x_1, x_2) = x_1 x_2$, with $0 \le x_1 \le 1$ and $0 \le x_2 \le 1$. (See Fig. 10.1.) Since the matrix is given by
\[ [f'(x_1, x_2)] = \begin{pmatrix} x_2 & x_1 \end{pmatrix}, \]
we calculate that
\[ [f'(0.5, 0.5)] \begin{pmatrix} 0.1 \\ 0.1 \end{pmatrix} = \begin{pmatrix} 0.5 & 0.5 \end{pmatrix} \begin{pmatrix} 0.1 \\ 0.1 \end{pmatrix} = 0.05 + 0.05 = 0.1. \]
This is the rise of the plane that is tangent to the graph of this function at $(0.5, 0.5, 0.25)$ as both of the first two coordinates are increased from 0.5 each to 0.6 each. We compare this result with the actual rise of the function, which is $f(0.6, 0.6) - f(0.5, 0.5) = 0.36 - 0.25 = 0.11$. Thus the linear estimate 0.1 given by the derivative differs from the actual rise 0.11 of the function by an error of 0.01.
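The arithmetic of Example 10.2 can be reproduced in a few lines. This is a small sketch; the helper names `f_prime`, `linear_rise`, and `actual_rise` are introduced here for illustration and are not notation from the text.

```python
def f(x1, x2):
    # The function of Example 10.2.
    return x1 * x2

def f_prime(x1, x2):
    # The 1 x 2 matrix (x2  x1) of f'(x1, x2).
    return (x2, x1)

a = (0.5, 0.5)   # base point
h = (0.1, 0.1)   # increment

row = f_prime(*a)
linear_rise = row[0] * h[0] + row[1] * h[1]          # rise of the tangent plane
actual_rise = f(a[0] + h[0], a[1] + h[1]) - f(*a)    # rise of the graph
error = actual_rise - linear_rise

print(round(linear_rise, 10), round(actual_rise, 10), round(error, 10))
```

For this particular $f$ the error equals the product $h_1 h_2$ of the increments, which is why it is small compared with $\|h\|$.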
Figure 10.1 $x_3 = x_1 x_2$.
• EXAMPLE 10.3
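This example is a computational illustration of the meaning of the derivative as a linear approximation to the increments in $f$. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ by $f(r, \theta) = (r\cos\theta, r\sin\theta)$. A simple calculation using Theorem 10.2.3 shows that
\[ \left[ f'\!\left(1, \tfrac{\pi}{4}\right) \right] \begin{pmatrix} \tfrac{1}{10} \\ \tfrac{1}{10} \end{pmatrix} = \begin{pmatrix} \tfrac{\sqrt{2}}{2} & -\tfrac{\sqrt{2}}{2} \\ \tfrac{\sqrt{2}}{2} & \tfrac{\sqrt{2}}{2} \end{pmatrix} \begin{pmatrix} \tfrac{1}{10} \\ \tfrac{1}{10} \end{pmatrix} = \begin{pmatrix} 0 \\ \tfrac{\sqrt{2}}{10} \end{pmatrix}. \]
The reader should use a calculator to compare this result with the numerical approximation $f\!\left(\tfrac{11}{10}, \tfrac{\pi}{4} + \tfrac{1}{10}\right) - f\!\left(1, \tfrac{\pi}{4}\right) \approx (-0.011, 0.145)$.
Although Exercise 10.38 shows that the existence of all the partial derivatives $\partial f_i / \partial x_j$ is not sufficient to ensure that $f$ is differentiable, we do have the following definition and theorem.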
Definition 10.2.3 Let $D \subseteq \mathbb{E}^n$ and $f : D \to \mathbb{E}^m$. We say that $f$ is continuously differentiable, denoted by $f \in \mathcal{C}^1(D, \mathbb{E}^m)$, provided that $f$ is differentiable at every point $a \in D$ and that $f' : D \to \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ is continuous. (That is, we require that $\|f'(x) - f'(a)\| \to 0$ as $x \to a$, for each $a \in D$.)
Theorem 10.2.4 Let $D \subseteq \mathbb{E}^n$ be an open set and $f : D \to \mathbb{E}^m$. Then $f \in \mathcal{C}^1(D, \mathbb{E}^m)$ if and only if every partial derivative $\partial f_i / \partial x_j$ is continuous on $D$.
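Proof: First we suppose that $f \in \mathcal{C}^1(D, \mathbb{E}^m)$. We note that $\partial f_i / \partial x_j$ is the $i$th component of $D_{e_j} f$. However,
\[ \left| \frac{\partial f_i}{\partial x_j}(x) - \frac{\partial f_i}{\partial x_j}(a) \right| \le \|D_{e_j} f(x) - D_{e_j} f(a)\| = \|(f'(x) - f'(a)) e_j\| \le \|f'(x) - f'(a)\|\,\|e_j\| \to 0 \]
as $x \to a$ because $f'$ is continuous. Thus $\partial f_i / \partial x_j$ is continuous at each $a \in D$.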
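Next, we assume that each of the partial derivatives $\partial f_i / \partial x_j$ is continuous on $D$, and we must prove that $f \in \mathcal{C}^1(D, \mathbb{E}^m)$. We let $A$ be the linear transformation that has the matrix
\[ [A]_{m \times n} = \left[ \frac{\partial f_i}{\partial x_j}(x) \right] \]
with respect to the standard bases. We need to prove first that
\[ f(x + h) - f(x) = A(h) + \epsilon(h), \]
where $\|\epsilon(h)\|/\|h\| \to 0$ as $h \to 0$. However, this is equivalent to proving for each $i = 1, \ldots, m$ that
\[ f_i(x + h) - f_i(x) = R_i \cdot h + \epsilon_i(h), \]
where $|\epsilon_i(h)|/\|h\| \to 0$ as $h \to 0$ and where $R_i$ denotes the $i$th row vector of the matrix $A$. Thus we can fix $i$ arbitrarily and prove the latter condition for $f_i$, which is real-valued.
Let $\epsilon > 0$. There exists $\delta > 0$ such that $B_\delta(x) \subseteq D$ and such that $\|h\| < \delta$ implies
\[ \left| \frac{\partial f_i}{\partial x_j}(x + h) - \frac{\partial f_i}{\partial x_j}(x) \right| < \frac{\epsilon}{n} \]
for each $j = 1, \ldots, n$. Suppose that $\|h\| < \delta$ and denote $h = \sum_{j=1}^{n} h_j e_j$. Denote $x_0 = x$ and
\[ x_k = x_0 + \sum_{j=1}^{k} h_j e_j \in B_\delta(x), \]
which is a convex set as defined in Exercise 9.43, for all $k = 1, \ldots, n$. Applying the Mean Value Theorem from one-variable calculus along the segment from $x_{k-1}$ to $x_k$, on which only the $k$th coordinate varies, we find the existence of real numbers $t_k \in (0, 1)$ to see that
\[ f_i(x + h) - f_i(x) = \sum_{k=1}^{n} \bigl( f_i(x_k) - f_i(x_{k-1}) \bigr) = \sum_{k=1}^{n} \frac{\partial f_i}{\partial x_k}(x_{k-1} + t_k h_k e_k)\, h_k = \sum_{k=1}^{n} \left( \frac{\partial f_i}{\partial x_k}(x) + \eta_k \right) h_k, \]
where $|\eta_k| < \epsilon/n$, since each point $x_{k-1} + t_k h_k e_k$ lies within distance $\|h\| < \delta$ of $x$. It follows that
\[ |f_i(x + h) - f_i(x) - R_i \cdot h| \le \sum_{k=1}^{n} |\eta_k h_k| \le \sum_{k=1}^{n} \frac{\epsilon}{n}\|h\| = \epsilon\|h\|. \]
This proves that $f_i$ is differentiable, and so is $f$. To see that $f'(x)$ is continuous we apply Theorem 10.1.1 to see that
\[ \|f'(x) - f'(a)\| \le \sqrt{\sum_{i,j} \left( \frac{\partial f_i}{\partial x_j}(x) - \frac{\partial f_i}{\partial x_j}(a) \right)^2} \to 0 \]
as $x \to a$ since the partial derivatives are continuous by hypothesis.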
•
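The key device in the proof of Theorem 10.2.4 is the path $x_0, x_1, \ldots, x_n$ that changes one coordinate at a time, so that the total increment of $f_i$ telescopes into $n$ one-variable increments. The sketch below illustrates this for a hypothetical real-valued function `f` on $\mathbb{E}^3$ with continuous partial derivatives; the function, the point, and the increment are assumptions chosen only for illustration.

```python
import math

def f(x):
    # Hypothetical real-valued function with continuous partials.
    return x[0] * x[1] + math.sin(x[2])

def grad(x):
    # Row vector R of partial derivatives of the f above.
    return (x[1], x[0], math.cos(x[2]))

def corner_points(x, h):
    # x_0 = x and x_k = x_0 + sum_{j <= k} h_j e_j, as in the proof.
    pts = [tuple(x)]
    for k in range(len(x)):
        p = list(pts[-1])
        p[k] += h[k]
        pts.append(tuple(p))
    return pts

x = (1.0, 2.0, 0.5)
h = (1e-3, -2e-3, 1.5e-3)

pts = corner_points(x, h)
# One-variable increments f(x_k) - f(x_{k-1}); their sum telescopes.
steps = [f(pts[k + 1]) - f(pts[k]) for k in range(len(h))]
total = f(pts[-1]) - f(x)

row_dot_h = sum(g * hk for g, hk in zip(grad(x), h))
remainder = total - row_dot_h
norm_h = math.sqrt(sum(c * c for c in h))
print(sum(steps), total, abs(remainder) / norm_h)
```

The last printed number is the quotient $|\epsilon_i(h)|/\|h\|$ for this one increment; it is small because the partial derivatives vary little across $B_\delta(x)$.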
The following theorem will be useful when we study the Inverse Function Theorem.
Theorem 10.2.5 Let $f$ and $g \in \mathcal{C}^1(D, \mathbb{E}^m)$, where $D$ is an open subset of $\mathbb{E}^n$. Define $\phi : D \to \mathbb{R}$ by $\phi(x) = f(x) \cdot g(x)$, a scalar product in $\mathbb{E}^m$. Then $\phi \in \mathcal{C}^1(D, \mathbb{R})$ and
\[ \phi'(x)h = g(x) \cdot f'(x)h + f(x) \cdot g'(x)h \]
for all $h \in \mathbb{E}^n$.
Remark 10.2.3 In the equation that is the conclusion of this theorem, the dot products indicated on the right have meaning because both sides of the equation are applied to an arbitrary vector $h \in \mathbb{E}^n$. Remembering that matrix multiplication is associative when it is defined, the conclusion can be expressed as follows in terms of matrices:
\[ [\phi'(x)]_{1 \times n} [h]_{n \times 1} = [g(x)]_{1 \times m} [f'(x)]_{m \times n} [h]_{n \times 1} + [f(x)]_{1 \times m} [g'(x)]_{m \times n} [h]_{n \times 1} \]
or
\[ [\phi'(x)]_{1 \times n} = [g(x)]_{1 \times m} [f'(x)]_{m \times n} + [f(x)]_{1 \times m} [g'(x)]_{m \times n}. \]
Proof: We write
\begin{align*}
\phi(x + h) - \phi(x) &= (f(x + h) - f(x)) \cdot g(x + h) + (g(x + h) - g(x)) \cdot f(x) \\
&= (f'(x)h + \epsilon_1(h)) \cdot (g(x) + g'(x)h + \epsilon_2(h)) + f(x) \cdot (g'(x)h + \epsilon_2(h)) \\
&= g(x) \cdot f'(x)h + f(x) \cdot g'(x)h + \epsilon(h),
\end{align*}
where $\epsilon(h)$ is a real-valued function defined by the latter equation. It remains only to prove that
\[ \frac{|\epsilon(h)|}{\|h\|} \to 0 \]
as $h \to 0$, which is left to Exercise 10.36. The continuity of the derivative follows from the continuity of its matrix coefficients. •
EXERCISES
n6,
10.24 Follow the model of Example 10.3 to estimate f ~ + / 0 ) - f ( 1, ~). Compare this linear estimate with the actual difference estimated numerically using a calculator. 10.25 In each of the following examples we have f : JEI ___, IE2. Find all x for which f' (x) exists, and find the matrix [f' (x)]. Iff' (x) does not exist for all x, prove this also. a) f(x) = (cosx,sinx). b) f(x)
=
(x, v?).
10.26 For each of the following functions f, Show that f'(x) exists, find the matrix [f'(x)] and calculate det f'(x). a) f: IE 2 ___, IE 2 by f(x) = (xi cos x2, XI sin x2). b) f: JE 3 ___, JE 3 by f(x) =(XI COSX2,XI sinx2,X3). c) f : IE 3 ___, IE3 by f(x) = (xi sin X3 cos x2, XI sin X3 sin x2, XI cos x3). 10.27 Let f: IE2 ___, lE 2 be defined by f(x) = (ex 1 cosx 2, ex 1 sinx2). a) Prove that f E CI (lE 2, JE 2). b) Calculate the matrix [f' (x)], with respect to the standard basis for JE 2. c) Find dct[f'(x)]. d) Calculate Dvf(x), where v = (1,
V3).
Suppose fECI (1E 2,1E 2) is such that the matrix [f'(1, 2)] = ( _12 34 ) . Find the directional derivative D(l,- 2Jf(l, 2). 10.28
10.29 In elementary courses in the calculus of several variables, one learns about the gradient
\[ \nabla f = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} e_i \]
of a differentiable function $f : \mathbb{E}^n \to \mathbb{R}$. Determine the relationship between $\nabla f$ and $f'(x)$ as defined in this section.
10.30 t Let T E .C(!En, !Em). Prove that T is differentiable at each point x E lEn and that T' = T. That is, show that Tis its own local linear approximation. Let f E C (D, IE2 ), where Dis an open subset of IE2 • Suppose also that f' E C (D, .C (IE 2 , IE3 )). Prove or give a counterexample:
10.31
for all x1 andx 2 in D and c E R
10.32 † Suppose that $f$ and $g : D \to \mathbb{E}^m$ and that $x \in D$ is a cluster point of $D \subseteq \mathbb{E}^n$. Let $c \in \mathbb{R}$. If both $f$ and $g$ are differentiable at $x$, prove that $(cf + g)'(x)$ exists and equals $cf'(x) + g'(x)$.
10.33 † Prove Theorem 10.2.1.
10.34 Prove that if $f$ is differentiable, then $\partial f_i / \partial x_j$ is the $i$th component of $D_{e_j} f$.
10.35 Prove Corollary 10.2.1.
10.36 † Complete the proof of Theorem 10.2.5 by proving the remaining limit.
10.37 Give an example of a function $f$ that is continuous at $x$ but not differentiable at $x$. Prove that your example has the required properties.
10.38
t Define f: IE2
--?
IE 1 by
f(x)
=
{
~
~1+x2
ifx E IE2 ifx
\
{0},
\
{0},
= 0.
See Figs. 10.2 and 10.3. 8 2 a) Prove: UXl and ..E.L 8 X2 exist at each x E IE • b) Prove: f is not differentiable at 0.
-P-
10.39
Define f
: IE2
--?
IE 1 by ifx E IE2 ifx = 0.
See Figs. 9.1 and 9.2. a) Prove: The directional derivative Dvf(x) exists at each x E IE2 and for each v E IE2 \ {0}.
b) Prove: $f$ is not differentiable at $0$.
Figure 10.2 The graph of the function $f$ of Exercise 10.38 in the first quadrant.
10.40 Prove that $f \in \mathcal{C}^1(D, \mathbb{E}^m)$, where $D$ is an open subset of $\mathbb{E}^n$, if and only if the directional derivative $D_v f \in \mathcal{C}(D, \mathbb{E}^m)$ for all $v \in \mathbb{E}^n$.
10.41 Let $U \subset \mathbb{E}^n$ be an open set. Suppose that $f : U \to \mathbb{R}$ and that there is a number $M \in \mathbb{R}$ such that $\left| \frac{\partial f}{\partial x_j}(x) \right| \le M$ for all $x \in U$ and for all $j = 1, \ldots, n$.
a) Prove that $f \in \mathcal{C}(U, \mathbb{R})$. (Hint: Let $p \in U$ and take $r > 0$ such that $B_r(p) \subseteq U$. Prove that $f$ is continuous at $p$. Adapt the technique from the proof of Theorem 10.2.4.)
b) Now suppose that $f : U \to \mathbb{E}^m$ and that there is a number $M \in \mathbb{R}$ such that $\left| \frac{\partial f_i}{\partial x_j}(x) \right| \le M$ for all $x \in U$ and for all $j = 1, \ldots, n$ and all $i = 1, \ldots, m$. Prove that $f \in \mathcal{C}(U, \mathbb{E}^m)$.
10.42 If $f \in \mathcal{C}^1(\mathbb{E}^n, \mathbb{E}^m)$ and $v$ and $w \in \mathbb{E}^n$, express $D_{tv+w} f(x)$ in terms of $D_v f(x)$, $D_w f(x)$ and $t$, for each $t \in \mathbb{R}$. Conclude that the mapping $V : \mathbb{E}^n \to \mathbb{E}^m$ defined by $V(v) = D_v f(x)$ for fixed $f$ and fixed $x$ is a linear map.
Figure 10.3 The graph of the function $f$ of Exercise 10.38 in all four quadrants.
10.43 Let f and g E C1 (lE 2 , JE 2 ). Define¢ : JE 2 --7 JR. by ¢(x) = f(x) · g(x), a scalar product in JE 2 • Find the matrix [¢' (0)] given that
[f(O)] = (1, 1), [f'(O)] = ( !2
[g(O)]
r ).
=
(-1,-1),
[g'(O)] = ( _31 ~ ) .
10.3 THE CHAIN RULE IN EUCLIDEAN SPACE
Here we generalize Theorem 4.1.3 (the Chain Rule) to $\mathbb{E}^n$.
Theorem 10.3.1 (The Chain Rule) Suppose $D \subseteq \mathbb{E}^n$ and $g : D \to \mathbb{E}^m$ is differentiable at $x_0 \in D$. Suppose $f : g(D) \to \mathbb{E}^p$ is differentiable at $g(x_0)$. Then $f \circ g$ is differentiable at $x_0$ and $(f \circ g)'(x_0) = f'(g(x_0))\,g'(x_0)$, a composition of linear transformations.
Proof: To show the composition $f \circ g$ is differentiable at $x_0$, we denote
\[ k = g(x_0 + h) - g(x_0) \to 0 \]
as $h \to 0$ since $g$ is continuous at $x_0$ by Theorem 10.2.2. Also,
\[ k = g'(x_0)h + \tilde{\epsilon}(h), \]
where $\|\tilde{\epsilon}(h)\|/\|h\| \to 0$ as $h \to 0$. Next we observe that
\[ \Delta f = f(g(x_0 + h)) - f(g(x_0)) = f'(g(x_0))k + \epsilon(k), \]
where $\|\epsilon(k)\|/\|k\| \to 0$ as $k \to 0$. Therefore
\[ \Delta f = f'(g(x_0))\,g'(x_0)h + f'(g(x_0))\tilde{\epsilon}(h) + \epsilon(g'(x_0)h + \tilde{\epsilon}(h)). \]
Denoting $\hat{\epsilon}(h) = f'(g(x_0))\tilde{\epsilon}(h) + \epsilon(g'(x_0)h + \tilde{\epsilon}(h))$, it suffices to show that
\[ \frac{\|\hat{\epsilon}(h)\|}{\|h\|} = \frac{\|f'(g(x_0))\tilde{\epsilon}(h) + \epsilon(g'(x_0)h + \tilde{\epsilon}(h))\|}{\|h\|} \le \frac{\|f'(g(x_0))\tilde{\epsilon}(h)\|}{\|h\|} + \frac{\|\epsilon(g'(x_0)h + \tilde{\epsilon}(h))\|}{\|h\|} \to 0 \]
as $h \to 0$. Since
\[ \frac{\|f'(g(x_0))\tilde{\epsilon}(h)\|}{\|h\|} \le \|f'(g(x_0))\| \frac{\|\tilde{\epsilon}(h)\|}{\|h\|} \to 0 \]
as $h \to 0$, it suffices to show that
\[ \frac{\|\epsilon(g'(x_0)h + \tilde{\epsilon}(h))\|}{\|h\|} = \frac{\|\epsilon(k)\|}{\|h\|} \to 0 \]
as $h \to 0$. Thus it suffices to show that if $\eta > 0$, then for all sufficiently small $\|h\|$ we have $\|\epsilon(k)\| \le \eta\|h\|$. Since $1 > 0$ there exists $\delta_1 > 0$ such that $\|h\| < \delta_1$ implies $\|\tilde{\epsilon}(h)\| \le 1 \cdot \|h\|$, which implies in turn that $\|k\| \le (1 + \|g'(x_0)\|)\|h\|$. Also, if
\[ \eta_2 = \frac{\eta}{1 + \|g'(x_0)\|}, \]
then there exists $\delta_2 > 0$ such that $\|k\| < \delta_2$ implies
\[ \|\epsilon(k)\| \le \eta_2\|k\|. \]
Thus it suffices to pick $h$ such that
\[ \|h\| < \min\left\{ \delta_1, \frac{\delta_2}{1 + \|g'(x_0)\|} \right\}, \]
since then
\[ \|\epsilon(k)\| \le \eta_2\|k\| \le \eta_2 (1 + \|g'(x_0)\|)\|h\| = \eta\|h\|. \]
•
10.3.1 The Mean Value Theorem
In one-variable calculus, the Mean Value Theorem (Theorem 4.2.3) plays a very important role. In Exercise 10.49 the reader will see that a direct adaptation of that theorem to vector-valued functions is not possible. However, the following version of the theorem is true and is useful.
Theorem 10.3.2 (Mean Value Theorem) Suppose $f : D \to \mathbb{E}^m$ is a differentiable function, where $D$ is a convex subset of $\mathbb{E}^n$, as defined in Exercise 9.43. Suppose $M = \sup_{x \in D} \|f'(x)\| < \infty$. Then, for all $a$ and $b$ in $D$,
\[ \|f(b) - f(a)\| \le M\|b - a\|. \]
Proof: Observe that the straight-line segment
\[ S = \{a + t(b - a) \mid 0 \le t \le 1\} \subseteq D \]
because $D$ is convex. Define a differentiable function $\phi : [0, 1] \to \mathbb{R}$ by
\[ \phi(t) = (f(b) - f(a)) \cdot f(a + t(b - a)) \]
for all $t \in [0, 1]$. By the ordinary Mean Value Theorem from one-variable calculus together with Exercise 10.45.b, we obtain for some $\bar{t} \in (0, 1)$
\begin{align*}
\|f(b) - f(a)\|^2 &= \phi(1) - \phi(0) = \phi'(\bar{t})(1 - 0) \\
&= (f(b) - f(a)) \cdot \bigl( f'(a + \bar{t}(b - a))(b - a) \bigr) \\
&\le \|f(b) - f(a)\|\,\|f'(a + \bar{t}(b - a))\|\,\|b - a\| \\
&\le \|f(b) - f(a)\|\,M\,\|b - a\|,
\end{align*}
where the inequality comes from the Cauchy-Schwarz inequality combined with the Chain Rule and the property of the norm of a linear transformation. If $f(b) = f(a)$ there is nothing to prove; otherwise we divide by $\|f(b) - f(a)\|$. Therefore
\[ \|f(b) - f(a)\| \le M\|b - a\|. \]
•
The following corollary is an immediate consequence of the proof of the Mean Value Theorem.
Corollary 10.3.1 Suppose $f : D \to \mathbb{E}^m$ is a differentiable function. Suppose $x$ and $y \in D$ and suppose the straight-line segment $L$ between $x$ and $y$ lies in $D$. Suppose $M = \sup_{w \in L} \|f'(w)\| < \infty$. Then
\[ \|f(x) - f(y)\| \le M\|x - y\|. \]
Remark 10.3.1 The corollary differs from the Mean Value Theorem in that $D$ does not need to be convex. Instead, $x$ and $y$ are special points for which $L \subseteq D$. In the corollary, $M$ is the supremum only over $L$, not over $D$.
If $f : D \to \mathbb{E}^m$, with $D \subseteq \mathbb{E}^n$, it makes sense to speak of a local extreme point (either a local maximum or a local minimum) at a point $x \in D$ if and only if $m = 1$. We say that $f : D \to \mathbb{R}$ has a local extreme point at $x \in D$, provided that there is a number $r > 0$ such that $f(x)$ is either the largest or the smallest value achieved by $f$ in the open ball $B_r(x)$.
Theorem 10.3.3 (Local Extreme Point) Suppose that $D \subseteq \mathbb{E}^n$ and that $f : D \to \mathbb{R}$ has a local extreme point at $x \in D^\circ$, the interior of $D$. If $f$ is differentiable at $x$, then $f'(x) = 0 \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^1)$.
Proof: Let $v \in \mathbb{E}^n \setminus \{0\}$. Define $\phi(t) = f(x + tv)$. Thus $\phi$, which is defined for all $t$ sufficiently near 0 because $x$ is an interior point of $D$, has a local extreme point at $t = 0$. By the first derivative test from one-variable calculus, combined with the multivariable chain rule, we see that $\phi'(0) = f'(x)v = 0$ for all nonzero vectors $v$. Thus $f'(x) = 0 \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^1)$. Note that the latter condition is equivalent to the condition that $\frac{\partial f}{\partial x_i}(x) = 0$ for all $i = 1, \ldots, n$. •
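The inequality of the Mean Value Theorem (Theorem 10.3.2) can be spot-checked numerically. The sketch below rests on assumptions made only for illustration: the map `f`, the convex set $D = \mathbb{E}^2$, and the sampling box are hypothetical choices. For this `f` the derivative is the diagonal matrix with entries $\cos x_1$ and $-\sin x_2$, so $M = \sup \|f'(x)\| = 1$.

```python
import math
import random

def f(x):
    # Hypothetical map f : E^2 -> E^2 with f'(x) = diag(cos x1, -sin x2).
    return (math.sin(x[0]), math.cos(x[1]))

M = 1.0  # sup of ||f'(x)|| over D = E^2 for the f above

def norm(v):
    # Euclidean norm.
    return math.sqrt(sum(c * c for c in v))

def gap(a, b):
    # M*||b - a|| - ||f(b) - f(a)||, nonnegative by Theorem 10.3.2.
    diff_f = tuple(p - q for p, q in zip(f(b), f(a)))
    diff_x = tuple(p - q for p, q in zip(b, a))
    return M * norm(diff_x) - norm(diff_f)

random.seed(0)  # fixed seed so that the sample is reproducible
pairs = [((random.uniform(-5, 5), random.uniform(-5, 5)),
          (random.uniform(-5, 5), random.uniform(-5, 5)))
         for _ in range(1000)]
smallest_gap = min(gap(a, b) for a, b in pairs)
print(smallest_gap)
```

A nonnegative smallest gap over the sample is what the theorem predicts; a random sample can support the inequality but cannot establish it.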
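10.3.2 Taylor's Theorem
Next we will generalize Taylor's Theorem (Theorem 4.6.1). For this purpose we restrict our attention to functions $f : D \to \mathbb{R}$, where $D$ is an open, convex subset of $\mathbb{E}^n$. Note that for such functions $f'(x)$ is a $1 \times n$ matrix, and its one row vector is the gradient of $f$, denoted by
\[ \nabla f(x) = \left( \frac{\partial f}{\partial x_1}(x), \ldots, \frac{\partial f}{\partial x_n}(x) \right). \]
We denote by $\mathcal{C}^{N+1}(D, \mathbb{R})$ the set of all functions $f : D \to \mathbb{R}$ such that $f$ and each of its derivatives of order up to $N$ has in turn a continuous derivative. (See Exercise 10.53 for elaboration of this concept, including its equivalence to the continuity of all partial derivatives of order up to and including $N + 1$.) We will let $k$ denote an arbitrary $n$-tuple $(k_1, \ldots, k_n)$ of nonnegative integers, and we will denote $|k| = \sum_{i=1}^{n} k_i$. We will denote
\[ \frac{\partial^{|k|} f}{\partial x^k} = \frac{\partial^{k_1 + \cdots + k_n} f}{\partial x_1^{k_1} \cdots \partial x_n^{k_n}} \]
and
\[ x^k = x_1^{k_1} \cdots x_n^{k_n}. \]
Theorem 10.3.4 (Taylor's Theorem in $\mathbb{E}^n$) Let $f \in \mathcal{C}^{N+1}(D, \mathbb{R})$, where $D$ is an open, convex subset of $\mathbb{E}^n$. Let $a$ and $b \in D$. Then there exists a point $\mu$ on the straight-line segment between $a$ and $b$ such that
\[ f(b) = \sum_{K=0}^{N} \frac{1}{K!} \left( \sum_{j=1}^{n} (b_j - a_j) \frac{\partial}{\partial x_j} \right)^{K} f \,\Bigg|_{x=a} + R_N(b), \tag{10.3} \]
where
\[ R_N(b) = \frac{1}{(N+1)!} \left( \sum_{j=1}^{n} (b_j - a_j) \frac{\partial}{\partial x_j} \right)^{N+1} f \,\Bigg|_{x=\mu}. \]
Equivalently, we can express these formulas in the form
\[ f(b) = \sum_{|k| \le N} \frac{\partial^{|k|} f}{\partial x^k}(a) \frac{(b - a)^k}{k_1! k_2! \cdots k_n!} + R_N(b), \tag{10.4} \]
where
\[ R_N(b) = \sum_{|k| = N+1} \frac{\partial^{|k|} f}{\partial x^k}(\mu) \frac{(b - a)^k}{k_1! k_2! \cdots k_n!}. \]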
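The multi-index sum in formula (10.4) can be evaluated directly in a simple case. The sketch below is an illustration under stated assumptions, not part of the theorem: it takes the hypothetical function $f(x_1, x_2) = e^{x_1 + 2x_2}$ with $a = 0$, for which every partial derivative $\partial^{|k|} f / \partial x^k$ at $0$ equals $2^{k_2}$, and compares the Taylor polynomials of increasing degree $N$ with $f(b)$.

```python
import math
from itertools import product

def f(x):
    # Hypothetical example function f(x1, x2) = exp(x1 + 2*x2).
    return math.exp(x[0] + 2 * x[1])

def taylor_poly(b, N):
    # Sum over multi-indices k = (k1, k2) with |k| <= N of
    #   (d^k f)(0) * b^k / (k1! k2!),
    # where (d^k f)(0) = 2**k2 for the f above and a = 0.
    total = 0.0
    for k1, k2 in product(range(N + 1), repeat=2):
        if k1 + k2 <= N:
            total += (2 ** k2) * b[0] ** k1 * b[1] ** k2 / (
                math.factorial(k1) * math.factorial(k2))
    return total

b = (0.3, -0.2)
# |R_N(b)| for N = 0, 1, ..., 5
errors = [abs(f(b) - taylor_poly(b, N)) for N in range(6)]
print(errors)
```

For this choice of $f$ and $b$ the remainders decrease rapidly with $N$, in agreement with the factorials in the denominators of (10.4).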
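Proof: Define a function $\phi : \mathbb{R} \to \mathbb{R}$ by $\phi(t) = f(a + t(b - a)) = (f \circ \psi)(t)$, where $\psi(t) = a + t(b - a)$ for all $t \in [0, 1]$. Observe first that in matrix form
\begin{align*}
[\phi'(t)] &= [f'(\psi(t))][\psi'(t)] = \begin{pmatrix} \frac{\partial f}{\partial x_1}(\psi(t)) & \cdots & \frac{\partial f}{\partial x_n}(\psi(t)) \end{pmatrix} \begin{pmatrix} b_1 - a_1 \\ \vdots \\ b_n - a_n \end{pmatrix} \\
&= \left( (b_1 - a_1) \frac{\partial}{\partial x_1} + \cdots + (b_n - a_n) \frac{\partial}{\partial x_n} \right) f \,\Bigg|_{\psi(t)}.
\end{align*}
Repeating this argument $k$ times for each of the $n$ summands in the last line above, we see that
\[ \phi^{(k)}(t) = \left( \sum_{j=1}^{n} (b_j - a_j) \frac{\partial}{\partial x_j} \right)^{k} f \,\Bigg|_{\psi(t)}. \]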
The rest of the proof of formula (10.3) is a matter of applying Taylor's Theorem (Theorem 4.6.1) for one variable to $\phi(t)$. In order to prove formula (10.4), we expand the powers of the differential operator in formula (10.3), using the expansion given by the so-called multinomial theorem:
\[ (x_1 + \cdots + x_n)^K = \sum_{|k| = K} \binom{K}{k_1, \ldots, k_n} x_1^{k_1} \cdots x_n^{k_n}, \]
where the coefficients
\[ \binom{K}{k_1, \ldots, k_n} = \frac{K!}{k_1! \cdots k_n!} \]
are the multinomial coefficients of degree $K$. (See [2].) However, in order to carry out this expansion, we need to know Clairaut's theorem. This theorem, which is proven as Exercise 11.46, assures under the hypothesis of continuity of the partial derivatives that there is independence of the order of composition of the differential operators. That is, continuity of the partial derivatives assures that
\[ \frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}. \]
This permits the regrouping and collecting of the like partial derivatives in the expansions utilized above. •
We remark that sometimes it is useful to know that the sum of the multinomial coefficients of order $N$ is
\[ \sum_{|k| = N} \frac{N!}{k_1! \cdots k_n!} = n^N. \]
This follows from the fact that the sum represents the number of ways to make $N$ successive selections, with repetition allowed, from a set of $n$ things. Equivalently, it is the number of elements in the set of functions from a set of $N$ elements to a set of $n$ elements.
EXERCISES
10.44 Suppose f g(O) = Xo,
IE 2
-+
IE 2 and g : IE3
g'(o) = (
-+
IE2 are both differentiable. If
_!2 -,} ~ ) '
and
f'(xo) = (
~ ~1
) ,
find the matrix [(f o g)'(O)] using the standard bases. Suppose that T E .C(!Em, JEP) and that f : !En
10.45
-+
!Em is differentiable.
a) Prove that T o f is differentiable at each x E !En and that
(To f)'(x) =To f'(x). b)
10.46
t If a E !Em is a constant vector, prove that (a· f)'(x) exists and equals a· f'(x). Suppose f E C1(IE3, JE 2) and the matrix
[f'(O)] = (
1 g ~ ).
If (a, b) E JE2 , express the matrix [( (a, b) ·f)' (0)] in terms of a and b.
10.47 Suppose $f : D \to \mathbb{E}^m$ is differentiable at $x \in D$ and suppose there exists an inverse function $g : f(D) \to D$ that is differentiable at $f(x)$. Find the relationship between $f'$ and $g'$.
10.48 Suppose $y = g(x)$ is differentiable in an open ball around $x \in \mathbb{E}^n$. Suppose $z = f(y)$ is differentiable in an open ball around $g(x)$, where $g : \mathbb{E}^n \to \mathbb{E}^m$ and $f : \mathbb{E}^m \to \mathbb{E}^p$. Apply the chain rule to the differentiable function $f \circ g : \mathbb{E}^n \to \mathbb{E}^p$ to compute $\partial z_i / \partial x_j$ for all $1 \le i \le p$ and all $1 \le j \le n$.
10.49 † Let $f : \mathbb{E}^1 \to \mathbb{E}^2$ be defined by $f(t) = (\cos t, \sin t)$. Show that there does not exist a point $\bar{t} \in [0, 2\pi]$ for which
\[ f(2\pi) - f(0) = f'(\bar{t})(2\pi - 0). \]
10.50 Suppose that $f'(x)$ exists and that $\|f'(x)\|$ is bounded on a convex set $D \subseteq \mathbb{E}^n$, where $f : D \to \mathbb{E}^m$. Prove that $f$ is uniformly continuous on $D$, as defined in Exercise 9.41.
10.51 Suppose $f : \mathbb{E}^1 \to \mathbb{E}^2$ by $f(x) = (\cos x, \sin x)$. Prove or give a counterexample: For all real numbers $a$ and $b$ we have
\[ \|f(b) - f(a)\| \le |b - a|. \]
10.52 Suppose $f'(x)$ exists for all $x$ in a nonempty open set $D \subseteq \mathbb{E}^n$, where $f : D \to \mathbb{E}^m$.
a) Suppose $D$ is convex. If $f'(x) \equiv 0 \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$, prove that $f$ is a constant function on $D$.
b) Suppose next that $D$ is an open connected set in $\mathbb{E}^n$, but not necessarily convex. If $f'(x) \equiv 0 \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ on $D$, prove that $f$ is a constant function on $D$. (Hint: Show that if $x \in D$, then there exists an $r > 0$ such that $f$ remains constant on $B_r(x)$. Show then that the set $f^{-1}(c)$ is an open subset of $\mathbb{E}^n$ for each $c \in \mathbb{E}^m$.)
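10.53 † Let $D$ be an open subset of $\mathbb{E}^n$ and $f \in \mathcal{C}^1(D, \mathbb{E}^m)$. Thus $f' : D \to \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$. We can denote
\[ f'(x) = \sum_{i,j} \frac{\partial f_i}{\partial x_j}(x)\, T_{ij}, \]
where $T_{ij}$ is defined in Exercise 10.13. We say $f \in \mathcal{C}^2(D, \mathbb{E}^m)$ if and only if $f'$ has a continuous derivative. Use $\{T_{ij}\}$ as a basis of $\mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ to show that $f \in \mathcal{C}^2(D, \mathbb{E}^m)$ if and only if $\frac{\partial^2 f_i}{\partial x_j \partial x_k}$ exists and is continuous for all $i$, $j$, and $k$.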
10.54
n
Let D = {X E lE 2 IIIxll > 1} {X E lE 2 1 X2 # 0 if Xi ~ 0}. a) Show that D is open, but not convex. b) Apply Exercise 5.62 to showthatthereexistsg E c=(JR) such thatg(O) := 1 if 0 ;::: g( 0) = -1 if 0 ~ and g' (0) is bounded on R c) Let f : D---+ JR. be defined by
i,
i,
g (tan-i
f(x) = {
(~))
if x E D with
xi
1
if x E D with xi
-1
if X E D with
> 0, ~
Xi ~
0, x 2 > 0, 0, X2 < 0.
Show that f E Ci(D) and that f' is bounded on D. < oo such that
d) Show that there does not exist M
lf(x)- f(x')l
~
Mllx- x'll
identically on D.
10.55 Let f : lE2 ---+ JR. be defined by f(x) = sin(xi + x 2 ). Selecting a = 0 in Taylor's Theorem, use either formula (10.3) or the expanded version to prove that RN(b)---+ 0 as N---+ oo. 10.56 The following exercise works in any vector space equipped with a scalar product, but the reader may let f E lEm and let {ek I k = 1, ... , n} be an orthonormal subset of lEm, where n ~ m. Note that (ej, ek) = 0 if j # k but equals 1 if j = k. Prove that
has an absolute minimum value on lEn, which is achieved by selecting ak = (f, ek) for all k = 1, ... , n. Hint: Apply Theorems 10.2.5 and 10.3.3 to
We remark that this exercise can be applied usefully in function spaces such as R[a, b], giving rise to the standard formulas for Fourier coefficients and also coefficients of other orthogonal-function expansions.
10.4 INVERSE FUNCTIONS
Even for differentiable functions $f : \mathbb{R} \to \mathbb{R}$, the function $f$ need not be invertible. And if the function $f$ has an inverse, that inverse need not be differentiable. For example, $f(x) = x^2$ is continuously differentiable on $\mathbb{R}$, yet it is not one-to-one and hence has no inverse. Another good example is this one: Let $f(x) = x^3$. Now $f$ is one-to-one on $\mathbb{R}$, but $f^{-1}(x) = \sqrt[3]{x}$, which has no derivative at the origin. On the other hand, suppose $D$ is an open subset of $\mathbb{R}$ and suppose that $f \in \mathcal{C}^1(D, \mathbb{R})$ and $f'(a) \ne 0$ at some $a \in D$. By continuity of $f'$ we know that $f'$ remains nonzero on some interval $I = (a - \delta, a + \delta) \subseteq D$. By the Mean Value Theorem, the restriction $f|_I$ of $f$ to $I$ is one-to-one and thus has an inverse. We will generalize this theorem to $\mathbb{E}^n$ and we will prove a theorem that shows under suitable hypotheses that the inverse exists and is also continuously differentiable on a suitably small open ball. We begin with the following theorem, which provides an interesting contrast to the Mean Value Theorem (Theorem 10.3.2).
Theorem 10.4.1 (Magnification Theorem) Let $f \in \mathcal{C}^1(D, \mathbb{E}^n)$, where $D$ is an open subset of $\mathbb{E}^n$. If $x \in D$ is such that $\det f'(x) \ne 0$, then there exists an open ball $B_r(x)$ and a constant $\alpha > 0$ such that for all $y$ and $z$ in $B_r(x)$ we have
\[ \|f(y) - f(z)\| \ge \alpha\|y - z\|. \]
Consequently, $f$ is one-to-one and thus invertible on $B_r(x)$.
Proof: Begin by denoting $T = f'(x) \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^n) \subset \mathcal{C}^1(\mathbb{E}^n, \mathbb{E}^n)$. Observe that $T$ is invertible and that both $T$ and $T^{-1}$ have strictly positive norms. For all $y$ and $z$ in $\mathbb{E}^n$ we have
\[ \|y - z\| = \|T^{-1}(T(y) - T(z))\| \le \|T^{-1}\|\,\|T(y) - T(z)\|. \]
Letting $\alpha = \frac{1}{2\|T^{-1}\|}$, we have $\|T(y) - T(z)\| \ge 2\alpha\|y - z\|$. Suppose we could show that for sufficiently small $r > 0$ and for all $y$ and $z$ in $B_r(x)$ we have
\[ \|(f(y) - f(z)) - (T(y) - T(z))\| \le \alpha\|y - z\|. \tag{10.5} \]
Then it would follow that $\|f(y) - f(z)\| \ge \alpha\|y - z\|$ since if that were not true, then it would follow that $\|T(y) - T(z)\| < 2\alpha\|y - z\|$, which is false. Hence we will know that $f$ is both one-to-one and invertible on $B_r(x)$ once we have proven Equation (10.5). However, for all $y$ and $z$ in $B_r(x)$ we have
\[ \|(f(y) - f(z)) - (T(y) - T(z))\| = \|(f - T)(y) - (f - T)(z)\| \le M\|y - z\|, \]
where
\[ M = \sup_{w \in B_r(x)} \|(f - T)'(w)\| = \sup_{w \in B_r(x)} \|f'(w) - f'(x)\|. \]
Here we are using the Mean Value Theorem 10.3.2 together with Exercises 10.30 and 10.32 for the basic properties of differentiation, as well as the convexity of a ball. Since $f'$ is continuous, we can choose $r > 0$ so as to ensure that $M < \alpha$. •
The next theorem will complete our preparation for the Inverse Function Theorem.
Theorem 10.4.2 (Open Mapping Theorem) Let $f \in \mathcal{C}^1(D, \mathbb{E}^n)$, where $D$ is an open subset of $\mathbb{E}^n$. Suppose that for all $x \in D$ we have $\det f'(x) \ne 0$. Then $f(D)$ is an open subset of $\mathbb{E}^n$.
Proof: Let $x \in D$ and $y = f(x)$. We seek an open ball $B_\rho(y) \subseteq f(D)$. Thanks to the Magnification Theorem, we can find $r > 0$ such that $f$ is one-to-one even on a suitably chosen closed ball $\bar{B}_r(x) \subseteq D$. Denote by $S^{n-1} = \partial B_r(x)$ the sphere that is the boundary of the $n$-dimensional ball $B_r(x)$. Thus $f(S^{n-1})$ is a compact subset of $\mathbb{E}^n$. Since the distance function $d(y, u) = \|y - u\|$ is continuous as a function of $u \in f(S^{n-1})$, it follows that
\[ m = \min \{ \|y - u\| \mid u \in f(S^{n-1}) \} > 0 \]
since $y \notin f(S^{n-1})$, because $f$ is one-to-one. We claim that $\rho = \frac{m}{2}$ satisfies the requirements of the theorem. Let $z \in B_\rho(y)$. We need to prove $z \in f(D)$. We note that the distance of $z$ from each point in $f(S^{n-1})$ is greater than $\rho$, whereas $\|z - y\| < \rho$. If we let
\[ g(v) = \|z - f(v)\|^2 = (z - f(v)) \cdot (z - f(v)) \]
for all $v \in \bar{B}_r(x)$, then the minimum value of $g$ must be less than $\rho^2$ and must occur at an interior point of $\bar{B}_r(x)$. At that minimum point we must have $g'(v) = 0 \in \mathcal{L}(\mathbb{E}^n, \mathbb{R})$. By Theorem 10.2.5 we know that $g'(v) = 0$ implies
\[ (z - f(v)) \cdot f'(v)h = 0 \]
for all $h \in \mathbb{E}^n$. Since $f'(v)$ is a nonsingular linear transformation of $\mathbb{E}^n \to \mathbb{E}^n$ it is also onto $\mathbb{E}^n$. Thus $z - f(v) = 0$, which implies that $z \in f(D)$. •
Observe that under the hypotheses of the Open Mapping Theorem, if $f$ is invertible then $f^{-1}$ is continuous. We are ready to state and prove the Inverse Function Theorem.
Theorem 10.4.3 (Inverse Function Theorem) Let $f \in \mathcal{C}^1(D, \mathbb{E}^n)$, where $D$ is an open subset of $\mathbb{E}^n$. Suppose that for all $x \in D$ we have $\det f'(x) \ne 0$. If $f$ is one-to-one on $D$, then $f^{-1} \in \mathcal{C}^1(f(D), D)$ and
\[ (f^{-1})'(f(x)) = (f'(x))^{-1}. \tag{10.6} \]
Remark 10.4.1 We mention that the function det f' (x) is often called the Jacobian off, so that the condition in the theorem is that the Jacobian not vanish on D. If the Jacobian does not vanish, the mapping f is called nonsingular. In Exercise 10.65 you will show that the Jacobian is independent of the choice of basis for lEn.
Proof of the Theorem. By hypothesis $f^{-1}$ exists and is continuous by the Open Mapping Theorem. (Alternatively, this could have been shown using Theorem 9.3.3 applied to a suitable closed ball.) We need to prove that $f^{-1} \in \mathcal{C}^1(f(D), \mathbb{E}^n)$ and that its derivative is given by the formula in the theorem. Denote $y = f(x)$ and $y + k = f(x + h)$. Write
\[ f^{-1}(y + k) - f^{-1}(y) = (f'(x))^{-1}k + \epsilon(k), \tag{10.7} \]
which defines $\epsilon(k)$. To show that $f^{-1}$ is differentiable at $f(x)$ and that $(f'(x))^{-1}$ serves as the derivative of $f^{-1}$ at $y$ it is necessary and sufficient to show that
\[ \frac{\|\epsilon(k)\|}{\|k\|} \to 0 \]
as $k \to 0$. Since $f$ is invertible, $h \ne 0$ if and only if $k \ne 0$. Applying the linear map $f'(x)$ to both sides of Equation (10.7), and noting that the left side of (10.7) equals $h$, we obtain
\[ f'(x)h = f(x + h) - f(x) + f'(x)\epsilon(k). \]
Since $f$ is differentiable at $x$, we know that
\[ \frac{\|f'(x)\epsilon(k)\|}{\|h\|} \to 0 \]
as $h \to 0$, which is equivalent to $k \to 0$ since both $f$ and $f^{-1}$ are continuous. It is not hard to show using a composition of a linear transformation with its own inverse that
\[ \|f'(x)\epsilon(k)\| \ge \frac{1}{\|(f'(x))^{-1}\|}\|\epsilon(k)\|, \]
which implies that
\[ \frac{\|\epsilon(k)\|}{\|h\|} \to 0. \]
By the Magnification Theorem we have the bound $\alpha\|h\| \le \|k\|$ for some suitable constant $\alpha > 0$ and for $h$ of sufficiently small norm. This inequality implies that
\[ \frac{\|\epsilon(k)\|}{\|k\|} \le \frac{1}{\alpha} \cdot \frac{\|\epsilon(k)\|}{\|h\|} \to 0 \]
as $k \to 0$. This proves that $f^{-1}$ is differentiable at $y$. The formula in Equation (10.6) follows immediately from application of the Chain Rule to
\[ f^{-1} \circ f(x) = x, \]
recognizing that both functions in the composition are differentiable. Thus we obtain
\[ (f^{-1})'(f(x))\, f'(x) = I, \]
the identity transformation. Hence
\[ (f^{-1})'(f(x)) = (f'(x))^{-1}. \]
It remains only to show that $(f^{-1})' \in \mathcal{C}(f(D), \mathcal{L}(\mathbb{E}^n))$. We can rewrite Equation (10.6) in the form
\[ (f^{-1})'(y) = \bigl( f'(f^{-1}(y)) \bigr)^{-1} \]
for $y \in f(D)$. We note that this expresses $(f^{-1})'$ as the composition of three maps. The first map is $f^{-1}$, which was just shown to be continuous. The second map is $f'$, which is assumed to be continuous. Finally, the map that sends an invertible linear transformation to its own inverse is a continuous mapping of $\mathcal{GL}(n, \mathbb{R})$ to itself, as is shown in Exercise 10.67. Since compositions of continuous maps are continuous, this completes the proof.
•
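The derivative formula just proved can be checked numerically. The following is a minimal sketch, not part of the proof: it assumes the sample map $f(x) = (e^{x_1}\cos x_2, e^{x_1}\sin x_2)$ (chosen here only for illustration), an explicit local inverse valid for $-\pi < x_2 < \pi$, and hypothetical helper names. It compares $(f'(x))^{-1}$ with a central-difference estimate of $(f^{-1})'(f(x))$.

```python
import math

def f(x1, x2):
    # Sample C^1 map of the plane (an illustrative choice, not from the text's proof).
    return (math.exp(x1) * math.cos(x2), math.exp(x1) * math.sin(x2))

def f_inverse(y1, y2):
    # Explicit local inverse, valid near points whose x2 lies in (-pi, pi).
    return (0.5 * math.log(y1 * y1 + y2 * y2), math.atan2(y2, y1))

def jacobian_f(x1, x2):
    # Matrix of f'(x) in the standard basis.
    e = math.exp(x1)
    return [[e * math.cos(x2), -e * math.sin(x2)],
            [e * math.sin(x2), e * math.cos(x2)]]

def inverse_2x2(a):
    # Inverse of a 2 x 2 matrix by the adjugate formula.
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def numerical_jacobian(func, p, step=1e-6):
    # Central differences; column j holds the partial derivatives in variable j.
    cols = []
    for j in range(2):
        plus, minus = list(p), list(p)
        plus[j] += step
        minus[j] -= step
        fp, fm = func(*plus), func(*minus)
        cols.append([(fp[i] - fm[i]) / (2 * step) for i in range(2)])
    return [[cols[j][i] for j in range(2)] for i in range(2)]

x = (0.3, 0.7)
y = f(*x)
predicted = inverse_2x2(jacobian_f(*x))    # (f'(x))^{-1}
measured = numerical_jacobian(f_inverse, y)  # estimate of (f^{-1})'(f(x))
max_gap = max(abs(predicted[i][j] - measured[i][j])
              for i in range(2) for j in range(2))
print(max_gap)
```

The printed gap should be at the level of finite-difference error, which is what the theorem predicts for any point where the Jacobian does not vanish.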
EXERCISES
10.57 Let $f : \mathbb{E}^1 \to \mathbb{E}^1$ be defined by
\[ f(x) = \begin{cases} x^3 \sin\frac{1}{x} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases} \]
Show that $f \in C^1(\mathbb{E}^1, \mathbb{E}^1)$ but that $f$ is not invertible in any open interval $(-\delta, \delta)$ around the origin. Explain why the Magnification Theorem fails to ensure invertibility of $f$ in this exercise.
10.58 Let $f \in C^1(\mathbb{E}^1, \mathbb{E}^1)$ be defined by $f(x) = \sin x$ for all $x \in \mathbb{E}^1$. Is $f(\mathbb{E}^1)$ open? If yes, prove it. If no, explain which hypothesis of the Open Mapping Theorem fails to be true in this example.
10.59 Suppose $D \subset \mathbb{E}^n$ and $S \subset \mathbb{E}^m$ are both open sets. Suppose $f : D \to S$ is differentiable, one-to-one and onto $S$, and suppose $f^{-1}$ is differentiable also. Prove that $n = m$. Conclude that if $m \neq n$, then $\mathbb{E}^n$ and $\mathbb{E}^m$ are not diffeomorphic, meaning that there is no differentiable one-to-one map of $\mathbb{E}^n$ onto $\mathbb{E}^m$ with differentiable inverse. (Hint: Apply the Chain Rule to the composition of $f$ and $f^{-1}$ in either order. If $T \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$, recall a theorem from linear algebra concerning the rank and the nullity of $T$.)
10.60 For each of the following functions, find: (i) all points $x$ at which the Jacobian of $f$ does not vanish; (ii) a ball $B_r(x)$ on which $f$ has a differentiable inverse, for those $x$ identified in part (i).
a) $f : \mathbb{E}^2 \to \mathbb{E}^2$ by $f(x) = (x_1 \cos x_2, x_1 \sin x_2)$.
b) $f : \mathbb{E}^2 \to \mathbb{E}^2$ by $f(x) = (e^{x_1} \cos x_2, e^{x_1} \sin x_2)$.
c) $f : \mathbb{E}^3 \to \mathbb{E}^3$ by $f(x) = (x_1 \cos x_2, x_1 \sin x_2, x_3)$.
d) $f : \mathbb{E}^3 \to \mathbb{E}^3$ by $f(x) = (x_1 \sin x_3 \cos x_2, x_1 \sin x_3 \sin x_2, x_1 \cos x_3)$.
10.61 Suppose $f \in C(\mathbb{E}^1, \mathbb{E}^1)$ and that $f$ is locally injective, meaning that for each $x \in \mathbb{E}^1$ there is a corresponding $r > 0$ such that $f$ restricted to $B_r(x)$ is injective (meaning one-to-one). Prove that $f$ must be injective on $\mathbb{E}^1$. (Hint: Suppose false and deduce a contradiction.)
10.62 Give an example of a function $f \in C^1(\mathbb{E}^2, \mathbb{E}^2)$ that is locally injective at each point $x \in \mathbb{E}^2$ for which $x_1 \neq 0$ and that has $\det f'(x) \neq 0$ (that is, nonvanishing Jacobian) if $x_1 \neq 0$, yet for which $f$ is not injective on $\{x \in \mathbb{E}^2 \mid x_1 > 0\}$.
10.63 Suppose $f \in C^1(\mathbb{E}^1, \mathbb{E}^1)$ and that $f'$ is nowhere zero on $\mathbb{E}^1$. Prove that $f$ must be invertible on $\mathbb{E}^1$ and that $f^{-1} \in C^1(f(\mathbb{E}^1), \mathbb{E}^1)$.
Figure 10.4  $2x + 4x^2 \sin\frac{1}{x}$.
10.64 Let $f : \mathbb{E}^1 \to \mathbb{E}^1$ be defined by
\[ f(x) = \begin{cases} 2x + 4x^2 \sin\frac{1}{x} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases} \]
Show that $f$ is not injective (and therefore not invertible) in any open interval $(-r, r)$ with $r > 0$, although $f'$ exists and is bounded in the open interval $(-1, 1)$. Which hypothesis of the Magnification Theorem fails, saving that theorem from being contradicted by the noninvertibility of $f$ on $(-r, r)$ for all $r > 0$? Justify your conclusion. (Be sure to find the value of $f'(0)$ and to justify your result. It may help to apply Exercise 4.22. See Fig. 10.4.)
10.65 If $f : \mathbb{E}^n \to \mathbb{E}^n$ is differentiable, prove that the Jacobian of $f$ is independent of the choice of basis used in $\mathbb{E}^n$. (Hint: Use a change-of-basis matrix and a property of determinants that you learned in a course in linear algebra.)
10.66 Let $g \in C^1(\mathbb{E}^n, \mathbb{E}^n)$ and $f \in C^1(\mathbb{E}^n, \mathbb{E}^n)$. Denote the Jacobian $J_g(x) = \det g'(x)$. Express the Jacobian $J_{f \circ g}$ in terms of the Jacobians $J_f$ and $J_g$. Justify your conclusion.
10.67 † Show that the mapping $T \to T^{-1}$ is a continuous mapping from $GL(n, \mathbb{R})$ to itself. (Hint: It is sufficient to prove that each matrix coefficient of the inverse matrix of $T$ is a continuous function of the original matrix. You may use a formula from linear algebra for matrix inversion using determinants.)
10.68 Give an alternative coordinate-free proof of Exercise 10.67 by using the result of Exercise 10.11.
10.5 IMPLICIT FUNCTIONS
In elementary calculus courses we learn a process called implicit differentiation. We are taught to take an equation of the form $f(x, y) = 0$ and differentiate both sides using the chain rule, supposing that $y$ is a function of $x$ that satisfies the equation. That is, $y = g(x)$ and $f(x, g(x)) = 0$. This yields the result
\[ g'(x) = -\frac{\partial f / \partial x}{\partial f / \partial y}, \]
where we assume the denominator is not zero. Although this is a very useful process, it can lead easily to nonsense. For example, consider the equation
\[ x^2 + y^2 + 1 = 0. \]
We could perform implicit differentiation mechanically to obtain the apparent result
\[ \frac{dy}{dx} = -\frac{x}{y} \]
whenever $y \neq 0$. If we do not think about what we are doing, we may imagine that there is a function $y = g(x)$ such that $x^2 + g(x)^2 + 1 = 0$ with the given derivative. Of course there is no such function! There is not even a single pair of real numbers $x$ and $y$ such that $x^2 + y^2 + 1 = 0$.

This shows us that we need a theorem that will enable us to know whether or not there really is a differentiable function $y = g(x)$ satisfying the given equation $f(x, y) = 0$ and with the derivative obtained according to the method of implicit differentiation. This theorem, the Implicit Function Theorem, is stated and proven below. It is very powerful. One need only consider what a formidable problem it is to solve even a polynomial equation of degree 5 or higher to appreciate the significance of being able to find the derivative of a function $y = g(x)$ even though it is very unlikely that we can solve the equation $f(x, y) = 0$ explicitly to express $y$ in terms of $x$ using only finitely many elementary operations.

We will adopt the following notation for the work of this section. Let $f$ be in $C^1(D, \mathbb{E}^m)$ where $D$ is an open subset of $\mathbb{E}^{n+m}$, and denote $(x, y) \in \mathbb{E}^{n+m}$ where $x \in \mathbb{E}^n$ and $y \in \mathbb{E}^m$. Denote the Jacobian
\[ \frac{\partial(f_1, \ldots, f_m)}{\partial(y_1, \ldots, y_m)}(x_0, y_0) = \det\left(\frac{\partial f_i}{\partial y_j}(x_0, y_0)\right)_{m \times m}. \]
Theorem 10.5.1 (Implicit Function Theorem) Suppose $f \in C^1(D, \mathbb{E}^m)$, where $D$ is an open subset of $\mathbb{E}^{n+m}$. Suppose there is a point $(x_0, y_0)$ in $D$ such that $f(x_0, y_0) = 0$. Suppose that the Jacobian
\[ \frac{\partial(f_1, \ldots, f_m)}{\partial(y_1, \ldots, y_m)}(x_0, y_0) \neq 0. \]
Then there exist open sets $U$ and $V$ in $\mathbb{E}^n$ and $\mathbb{E}^m$, respectively, such that $x_0 \in U$ and $y_0 \in V$ with the following properties:
i. For each $x \in U$ there exists a unique $y \in V$, denoted by $y = g(x)$, such that $f(x, g(x)) = 0$.
ii. The function $g \in C^1(U, \mathbb{E}^m)$.
Proof: The proof of this theorem is not much longer than its statement. We will apply the Inverse Function Theorem. Define $\tilde f(x, y) = (x, f(x, y)) \in \mathbb{E}^{n+m}$. Then the $(n + m) \times (n + m)$ matrix
\[ [\tilde f'(x_0, y_0)] = \begin{pmatrix} I_{n \times n} & 0_{n \times m} \\ \left(\frac{\partial f}{\partial x}(x_0, y_0)\right)_{m \times n} & \left(\frac{\partial f}{\partial y}(x_0, y_0)\right)_{m \times m} \end{pmatrix}. \]
Therefore
\[ \det \tilde f'(x_0, y_0) = \frac{\partial(f_1, \ldots, f_m)}{\partial(y_1, \ldots, y_m)}(x_0, y_0) \neq 0 \]
because of the block-triangular form of the matrix. Also, it is easy to see that $\tilde f \in C^1(D, \mathbb{E}^{n+m})$ and that the determinant of its derivative is a continuous function. By the Inverse Function Theorem together with the Magnification Theorem, there exists an open ball of the form
\[ B_r(x_0, y_0) \subset D \]
on which the Jacobian remains nonvanishing and on which $\tilde f$ has a continuously differentiable inverse. Applying Exercise 10.83 we see that $\tilde f(B_r(x_0) \times B_r(y_0))$ is open in $\mathbb{E}^{n+m}$. Since this open set contains $\tilde f(x_0, y_0) = (x_0, 0)$, there exists $\rho > 0$ such that
\[ B_\rho(x_0) \times \{0\} \subset \tilde f(B_r(x_0) \times B_r(y_0)). \]
Thus for each $x \in B_\rho(x_0)$ there exists a unique $y \in B_r(y_0)$ such that
\[ (x, y) = \tilde f^{-1}(x, 0). \]
If we denote $g(x) = y$, we see that $f(x, g(x)) = 0$. In the notation of the theorem, we can take $U = B_\rho(x_0)$ and $V = B_r(y_0)$, the Cartesian product of these two sets being in $B_r(x_0, y_0)$. •

Remark 10.5.1 If one is presented with an $m \times (n + m)$ matrix, it is difficult to see by inspection whether or not there exist $m$ column vectors from this matrix that together would form an $m \times m$ matrix with nonzero determinant. Thus it is hard to judge by inspection whether or not the conditions of the implicit function theorem are satisfied. It is helpful to review the concept of rank from linear algebra. The rank$(T)$ of a linear transformation $T : V \to W$ is $\dim T(V)$, the dimension of the image of $T$. $T$ can be represented as a matrix in many ways, depending on the chosen bases for $V$ and for $W$, the domain and range vector spaces. But for any matrix, it is not hard to prove that the rank of the corresponding linear transformation is the dimension of the span of the set of all the column vectors of the matrix, which is called the column rank of the matrix. There is also a concept of row rank, which is the dimension of the span of the set of all the rows of the matrix. It is a fundamental theorem in linear algebra that rank$(T)$, the column rank of $[T]$, and the row rank of $[T]$ must all be the same number. If the row rank or the column rank of $[T]_{m \times (n+m)}$ is equal to $m$, then there must be $m$ linearly independent columns, and the student will then be able to look for those columns, whose determinant will yield the needed nonzero Jacobian as in the implicit function theorem. The row rank will be $m$ if and only if the set of all $m$ rows of the matrix is linearly independent. We remark also that the open sets $U$ and $V$ can be challenging to identify in any given example, unless it is a relatively simple example such as one in which it is easy to solve for the implicitly determined function explicitly.
The delicacy of the problem is reflected in the difficulty of establishing the existence of U and V in the proof of the implicit function theorem.
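The search for $m$ suitable columns described in Remark 10.5.1 can be mechanized for small matrices. The following is a minimal sketch under stated assumptions: the function names and the sample $2 \times 4$ matrix are hypothetical, and the brute-force search over all column subsets is only practical when $n + m$ is small. Each returned tuple lists columns that could serve as the $y$ variables in the implicit function theorem.

```python
from itertools import combinations

def determinant(matrix):
    # Determinant by Gaussian elimination with partial pivoting.
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det  # a row swap changes the sign
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return det

def independent_column_sets(matrix, tol=1e-9):
    # All choices of m columns of an m x (n + m) matrix whose
    # m x m submatrix has nonzero determinant.
    m = len(matrix)
    total = len(matrix[0])
    found = []
    for cols in combinations(range(total), m):
        sub = [[row[j] for j in cols] for row in matrix]
        if abs(determinant(sub)) > tol:
            found.append(cols)
    return found

# Hypothetical 2 x 4 derivative matrix (m = 2, n = 2), for illustration only.
A = [[1.0, 2.0, 3.0, 4.0],
     [2.0, 4.0, 6.0, 9.0]]
usable = independent_column_sets(A)
print(usable)
```

In this sample the first three columns are proportional to one another, so every usable pair must include the last column.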
Let us consider in detail an example in which we apply the implicit function theorem to a specific real-valued function of three real variables . • EXAMPLE 10.4
Let $F : \mathbb{E}^3 \to \mathbb{R}$ by
\[ F(x) = x_1^{2/3} + x_2^{2/3} + x_3^{2/3} - 1. \]
For which points on the surface defined by $F(x) = 0$ is it true that at least one variable is determined as a continuously differentiable function of the others, restricted to suitable open sets? Fig. 10.5 shows just the upper half of this surface. The reader can see in the figure that the entire surface defined by $F(x) = 0$ has twelve sharp edges and six sharp points, or cusps.

Figure 10.5  Upper half of $x_1^{2/3} + x_2^{2/3} + x_3^{2/3} = 1$, $x_3 \ge 0$.

We will begin our analysis of the problem of this example by seeing what information we can get from the implicit function theorem. To apply this theorem, we need to have $F \in C^1$. But with respect to the standard basis, the matrix of the derivative of $F$ is
\[ [F'(x)] = \tfrac{2}{3}\left[x_1^{-1/3}, x_2^{-1/3}, x_3^{-1/3}\right], \]
provided that $x_1 x_2 x_3 \neq 0$. One sees that $F'$ exists only in the open set $D = \{x \mid x_1 x_2 x_3 \neq 0\}$, which is the union of the eight open octants of three-dimensional space which exclude the three coordinate planes. Moreover, $F \in C^1(D, \mathbb{E}^1)$. Since $\frac{\partial F}{\partial x_i} \neq 0$ on $D$ for any $i = 1, 2, 3$, each of the variables can be expressed as a $C^1$ function of the other two, in suitably restricted open sets. For example, suppose $F(x_0) = 0$, where $x_0 = (a, b, c)$. If we solve for $x_3$ in terms of the first two, we can restrict $(x_1, x_2)$ to the open set $U$, which is that part of the open quadrant of the $x_1 x_2$-plane to which $(a, b)$ belongs and is strictly inside the curve
\[ x_1^{2/3} + x_2^{2/3} = 1. \]
In order to ensure the existence of a unique continuously differentiable solution for $x_3$ over the domain $U$ chosen as above, we restrict $x_3$ to the open set $V = (0, 1)$ or else $V = (-1, 0)$, choosing the interval to which $c$ belongs. The reader sees that uniqueness of the solution is a delicate matter here, since if $(x_1, x_2, x_3)$ is on the full graph, then so is $(x_1, x_2, -x_3)$, which wrecks uniqueness unless $x_3$ is suitably restricted.

Finally, we mention that we have not shown yet that unique $C^1$ solutions fail to exist if at least one of the three coordinates is zero. That information does not come from the implicit function theorem, but depends on elementary calculations, which we leave to Exercise 10.76.

We can translate the Implicit Function Theorem into a statement about the partial derivatives of the component functions $g_i$ of the implicitly determined function $g(x)$. Since $f(x, g(x)) = 0 \in \mathbb{E}^m$ for all $x \in U \subseteq \mathbb{E}^n$, it is natural to define $\tilde g(x) = (x, g(x)) \in \mathbb{E}^{n+m}$. Observe that
\[ f \circ \tilde g(x) = 0 \quad \text{for all } x \in U. \]
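The derivative $(f \circ \tilde g)'(x) = f'(\tilde g(x))\, \tilde g'(x) = 0 \in \mathcal{L}(\mathbb{E}^n, \mathbb{E}^m)$ for all $x \in U$, where $\tilde g(x) = (x, g(x))$. Expressed in terms of matrices this composition of linear transformations becomes the following matrix equation.
\[
\begin{pmatrix}
\frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} & \frac{\partial f_1}{\partial y_1} & \cdots & \frac{\partial f_1}{\partial y_m} \\
\vdots & & \vdots & \vdots & & \vdots \\
\frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_n} & \frac{\partial f_m}{\partial y_1} & \cdots & \frac{\partial f_m}{\partial y_m}
\end{pmatrix}_{m \times (n+m)}
\begin{pmatrix}
1 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & 1 \\
\frac{\partial g_1}{\partial x_1} & \cdots & \frac{\partial g_1}{\partial x_n} \\
\vdots & & \vdots \\
\frac{\partial g_m}{\partial x_1} & \cdots & \frac{\partial g_m}{\partial x_n}
\end{pmatrix}_{(n+m) \times n}
= 0_{m \times n}. \tag{10.8}
\]
Suppose we fix a value of $j$ between 1 and $n$, and we wish to calculate $\frac{\partial g_i}{\partial x_j}$ for each value of $i = 1, \ldots, m$. For this we consider the $j$th column of the product matrix, obtaining a system of $m$ equations in $m$ unknowns, $\frac{\partial g_1}{\partial x_j}, \ldots, \frac{\partial g_m}{\partial x_j}$, as follows.
\[ \frac{\partial f_i}{\partial x_j} + \sum_{k=1}^{m} \frac{\partial f_i}{\partial y_k}\,\frac{\partial g_k}{\partial x_j} = 0, \qquad i = 1, \ldots, m. \]
One can calculate the unique solution to this system of equations using Cramer's Rule from linear algebra, since the Jacobian
\[ \frac{\partial(f_1, \ldots, f_m)}{\partial(y_1, \ldots, y_m)} \neq 0. \]
We mention however that for large $m$ it is generally simpler to compute the solutions to the system of equations using the method of row reduction, taught in all introductory courses in linear algebra.

If we denote
\[ \left(\frac{\partial f_i}{\partial x_j}(x, g(x))\right)_{m \times n} = \left(\frac{df}{dx}\right) \quad \text{and} \quad \left(\frac{\partial f_i}{\partial y_j}(x, g(x))\right)_{m \times m} = \left(\frac{df}{dy}\right), \]
and write $\left(\frac{dg}{dx}\right)$ for the $m \times n$ matrix with entries $\frac{\partial g_i}{\partial x_j}$, then we can rewrite Equation (10.8) as follows.
\[ \left(\frac{df}{dx}\right)_{m \times n} + \left(\frac{df}{dy}\right)_{m \times m} \left(\frac{dg}{dx}\right)_{m \times n} = 0_{m \times n}. \tag{10.9} \]
This yields the full solution for the partial derivatives $\frac{\partial g_i}{\partial x_j}$ in matrix form as
\[ \left(\frac{dg}{dx}\right)_{m \times n} = -\left(\frac{df}{dy}\right)^{-1}_{m \times m} \left(\frac{df}{dx}\right)_{m \times n}. \tag{10.10} \]
Although Equation (10.10) has theoretical significance, we remark again that when one solves for the partial derivatives of the implicitly defined functions it is often more convenient to solve the systems of equations expressed either in the form of Equation (10.8) or Equation (10.9) by the elementary method of row reduction, without inverting the matrix for $\frac{df}{dy}$.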
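Equation (10.10) can be tried out on a small case. The sketch below is illustrative only: the constraint map $f(x, y_1, y_2) = (x^2 + y_1^2 + y_2^2 - 3,\ x + y_1 - y_2 - 1)$, with $n = 1$ and $m = 2$, and all function names are assumptions made for this example. It solves the $2 \times 2$ system by Cramer's Rule, as the text suggests for small $m$, and compares the result at the solution point $(1, 1, 1)$ with a finite-difference derivative of $g$, where $g(x)$ is computed by Newton's method.

```python
def f(x, y1, y2):
    # Hypothetical constraint map from E^(1+2) to E^2 (n = 1, m = 2).
    return (x * x + y1 * y1 + y2 * y2 - 3.0, x + y1 - y2 - 1.0)

def df_dx(x, y1, y2):
    # The m x n block of partial derivatives in x (here 2 x 1).
    return [2.0 * x, 1.0]

def df_dy(x, y1, y2):
    # The m x m block of partial derivatives in y (here 2 x 2).
    return [[2.0 * y1, 2.0 * y2],
            [1.0, -1.0]]

def solve_2x2(a, b):
    # Solve a z = b by Cramer's Rule.
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det]

def dg_dx(x, y1, y2):
    # Equation (10.10): dg/dx = -(df/dy)^{-1} (df/dx).
    rhs = [-v for v in df_dx(x, y1, y2)]
    return solve_2x2(df_dy(x, y1, y2), rhs)

def g(x, guess=(1.0, 1.0), iterations=20):
    # Implicit function y = g(x), found by Newton's method in y for fixed x.
    y1, y2 = guess
    for _ in range(iterations):
        r = f(x, y1, y2)
        step = solve_2x2(df_dy(x, y1, y2), [-r[0], -r[1]])
        y1 += step[0]
        y2 += step[1]
    return (y1, y2)

x0 = 1.0
formula = dg_dx(x0, 1.0, 1.0)
h = 1e-5
gp, gm = g(x0 + h), g(x0 - h)
finite_difference = [(gp[i] - gm[i]) / (2 * h) for i in range(2)]
print(formula, finite_difference)
```

At $(1, 1, 1)$ the formula gives $y_1'(1) = -1$ and $y_2'(1) = 0$, which can be confirmed by hand by differentiating both constraints with respect to $x$.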
• EXAMPLE 10.5
We will describe a curve in $\mathbb{E}^3$ as the intersection of two surfaces, defined by the equations $f_1(x_1, x_2, x_3) = 0$ and $f_2(x_1, x_2, x_3) = 0$. Suppose $f_1$ and $f_2$ are in $C^1(\mathbb{E}^3, \mathbb{R})$. Such descriptions of curves as intersections of two surfaces in $\mathbb{E}^3$ are common in introductory courses in the calculus of several variables. We will see how such a description can arise from the implicit function theorem, by denoting $f = (f_1, f_2) \in C^1(\mathbb{E}^3, \mathbb{E}^2)$. If at a particular point in the intersection of surfaces we have
\[ \frac{\partial(f_1, f_2)}{\partial(x_2, x_3)} \neq 0, \]
then the implicit function theorem guarantees the existence locally of solutions $x_2 = x_2(x_1)$ and $x_3 = x_3(x_1)$ to the vector equation $f(x) = 0$. We could apply Equation (10.9) to find $\frac{dx_2}{dx_1}$ and $\frac{dx_3}{dx_1}$ along the curve of intersection, which is parameterized in this way by the variable $x_1$. However, as an alternative to remembering Equation (10.9) the student can apply the Chain Rule directly to find the derivatives once the implicit function theorem has been used to assure the existence of differentiable solutions. Since $f(x_1, x_2(x_1), x_3(x_1)) = 0$, we denote $g(x_1) = (x_1, x_2(x_1), x_3(x_1))$, so that $f \circ g(x_1) = 0$. Next we apply the Chain Rule to differentiate with respect to $x_1$ on both sides of the latter equation. This yields a matrix equation
\[ f'(g(x_1))\, g'(x_1) = 0. \]
Thus we have
\[
\begin{pmatrix}
\frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \frac{\partial f_1}{\partial x_3} \\
\frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \frac{\partial f_2}{\partial x_3}
\end{pmatrix}
\begin{pmatrix} 1 \\ x_2'(x_1) \\ x_3'(x_1) \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\]
This matrix equation yields two real equations which can be solved for $x_2'(x_1)$ and $x_3'(x_1)$, because
\[ \frac{\partial(f_1, f_2)}{\partial(x_2, x_3)} \neq 0. \]

In general, when we are given a function $f \in C^1(\mathbb{E}^{n+m})$, we may not know whether $m$ variables $y$ can be selected for which the equation $f(x, y) = 0$ has differentiable solutions in terms of the remaining $n$ variables $x$, at least locally in a neighborhood of some point in the solution set. If the rank of the linear transformation $f'(p)$ is $m$, then there exist $m$ columns of the matrix of the derivative that collectively form an $m \times m$ nonsingular matrix, which means that the determinant of that square matrix is not zero. Those $m$ selected columns correspond to the components of the vector $y$ for which there exists a locally differentiable and unique solution in terms of the remaining variables, which we could then label collectively as the vector $x$.
Figure 10.6  The curve $F(x) = 1$ of Exercise 10.74.

EXERCISES
10.69 Define $f \in C^1(\mathbb{E}^3, \mathbb{E}^2)$ by $f(x_1, x_2, x_3) = (x_1 x_2 \cos x_3, x_2 \sin x_3)$. Find the Jacobian
\[ \frac{\partial(f_1, f_2)}{\partial(x_2, x_3)}. \]
10.70 Find the Jacobian
\[ \frac{\partial(f_1, f_2)}{\partial(x_1, x_3)}. \]
10.71 Let $f \in C^1(\mathbb{E}^4, \mathbb{E}^1)$ be such that the matrix
\[ [f'(x)]_{1 \times 4} = [x_2, x_1, x_4, x_3]. \]
Find all $x_0 = (x_1, x_2, x_3, x_4)$ at which the implicit function theorem guarantees that there is a local $C^1$ solution of the equation $f(x) = f(x_0)$ for $x_3$ in terms of the other three variables.
10.72 Many laws of nature, such as the ideal gas law, can be written in the form $F(x_1, x_2, x_3) = 0$. Suppose that $F \in C^1(\mathbb{E}^3, \mathbb{E}^1)$ and that for all $i = 1, 2, 3$ we have $\frac{\partial F}{\partial x_i} \neq 0$. Prove that
\[ \frac{\partial x_1}{\partial x_2}\,\frac{\partial x_2}{\partial x_3}\,\frac{\partial x_3}{\partial x_1} = -1. \]
State the hypotheses needed on the points $(x_1, x_2, x_3)$ to ensure that this conclusion is valid.
10.73 Suppose $F(x_1, \ldots, x_n) = 0$, where $F \in C^1(\mathbb{E}^n, \mathbb{E}^1)$. Suppose that for all $i = 1, \ldots, n$ we have $\frac{\partial F}{\partial x_i} \neq 0$. Find the numerical value of the product
\[ \prod_{i=1,\ldots,n} \frac{\partial x_i}{\partial x_{i+1}}, \]
where we make the notational agreement that $x_{n+1}$ means $x_1$. State the hypotheses needed on the points $(x_1, x_2, \ldots, x_n)$ to ensure that this conclusion is valid.
10.74
Let F : IE 2
--->
lR by
F(x) =
1
1
x? + x~.
For which points on the curve defined by F(x) = 1 is it true that at least one variable is determined as a continuously differentiable function of the other, restricted to suitable open sets U and V? Find U and V and justify your conclusion. (See Fig. 10.6.) 10.75
Let F : JE3
--->
lR by 4
F(x) = xt
4
4
+ xJ +xi.
For which points on the surface defined by $F(x) = 1$ is it true that at least one variable is determined as a continuously differentiable function of the others, restricted to suitable open sets $U$ and $V$, which you should find? Justify your conclusion. (See Fig. 10.7.)

Figure 10.7  Upper half of the surface $F(x) = 1$ of Exercise 10.75, $x_3 \ge 0$.
10.76 Let
\[ F(x) = x_1^{2/3} + x_2^{2/3} + x_3^{2/3} - 1 \]
as in Example 10.4. Suppose that $c = 0$ for the point $x_0 = (a, b, c)$. Show that the equation $F(x) = 0$ cannot be solved uniquely for $x_3$ as a $C^1$ function in terms of $x_1, x_2$, even when the variables are restricted to suitable open neighborhoods of $c \in \mathbb{R}$ and of $(a, b) \in \mathbb{E}^2$. At that same point, is each of the other two variables, $x_1$ and $x_2$, a locally unique $C^1$ function of the remaining two? Justify your conclusions.
10.77 Suppose that $f = (f_1, f_2) \in C^1(\mathbb{E}^3, \mathbb{E}^2)$ and $x_0 = (a, b, c) \in \mathbb{E}^3$ is in the solution set of the equation $f(x) = 0$. Suppose this equation can be solved for $x_2 = x_2(x_1)$ and $x_3 = x_3(x_1)$, both $C^1$ functions in a neighborhood of $x_0$ with $x_2(a) = b$ and $x_3(a) = c$. If the matrix
[f'(xo)] = ( __\
6 ~ ),
then find $x_2'(a)$ and $x_3'(a)$. (Hint: Apply the Chain Rule to differentiate
\[ f(x_1, x_2(x_1), x_3(x_1)) = 0 \]
with respect to $x_1$ on both sides.)
10.78 Consider the curve C of intersection in IE 3 of the two surfaces with the equations fi(x) = 0 and h(x) = 0, where
and
h(xb X2, X3) =X~+ X~- 1. (See Fig. 10.8.) Show that the point ( 2,
~' 4)
lies on the curve C and that in a
neighborhood of this point the curve $C$ can be parameterized as $x_1 = x_1(x_3)$ and $x_2 = x_2(x_3)$. Find $x_1'(4)$ and $x_2'(4)$.

Figure 10.8  Intersection of cylinder with hyperbolic cylinder.

10.79 Consider the curve of intersection in $\mathbb{E}^3$ of the two surfaces with the equations $f_1(x) = 0$ and $f_2(x) = 0$, where
and
a) Explain why $f = (f_1, f_2) \in C^1(\mathbb{E}^3, \mathbb{E}^2)$.
b) Find all points $(x_2, x_3)$ for which $\frac{\partial(f_1, f_2)}{\partial(x_2, x_3)} \neq 0$.
c) At each point identified in item (b), apply Equation (10.10) to find $\frac{dx_2}{dx_1}$ and $\frac{dx_3}{dx_1}$ along the curve of intersection described as $(x_2, x_3) = g(x_1)$.
10.80 Describe the 3-sphere $S^3$ in $\mathbb{E}^4$ as the solution set of $f(x) = 0$, where
\[ f(x) = \|x\|^2 - 1. \]
If $x = (x_1, x_2, x_3, x_4) \in S^3$, show by applying the Implicit Function Theorem that for at least one value of $i$ the coordinate $x_i$ can be expressed as a continuously differentiable function of the three other coordinates, with $x_i$ and the triplet of other coordinates restricted to suitable open sets $V$ and $U$ which you should identify explicitly. Then find the solution for $x_i$ explicitly, without using the Implicit Function Theorem.
10.81 Let $SL(2, \mathbb{R})$ denote the set of all two-by-two real matrices
\[ X = \begin{pmatrix} x_1 & x_2 \\ x_3 & x_4 \end{pmatrix} \]
for which $\det X = 1$. If $X \in SL(2, \mathbb{R})$, show by applying the Implicit Function Theorem that at least one of the four coordinates $x_1, x_2, x_3, x_4$ can be expressed as a continuously differentiable function of the three other coordinates, with the variables restricted to suitable open sets which you should identify explicitly. Then solve explicitly for that identified coordinate, without using the Implicit Function Theorem.
10.82 Define the so-called orthogonal group of the plane by
\[ O(2) = \{X \in GL(\mathbb{E}^2) \mid X X^t = I\}, \]
where $X^t$ denotes the transpose of $X$ and $I$ is the identity transformation. Denote the matrix
\[ X = \begin{pmatrix} x_1 & x_2 \\ x_3 & x_4 \end{pmatrix}. \]
a) Prove that $O(2)$ is a group by showing that it is closed under the operations of multiplication and taking inverses.
b) Prove that $X \in O(2) \iff f(x) = 0 \in \mathbb{E}^3$, where
\[ f(x) = (x_1 x_3 + x_2 x_4,\ x_1^2 + x_2^2 - 1,\ x_3^2 + x_4^2 - 1) \]
for each $x \in \mathbb{E}^4$.
c) Prove that the rank of the matrix $[f'(x)]$ is 3 if $X \in O(2)$. (Hint: How many linearly independent rows does this matrix have?)
d) Use the Implicit Function Theorem to identify three of the four coordinates of $x$ that can be expressed as $C^1$ functions of the remaining coordinate, in a neighborhood of the identity element $I \in O(2)$.
e) Use Equation (10.10) to find the derivatives at the identity element of the three chosen dependent variables with respect to the chosen independent one.
f) This part is not about the implicit function theorem but it is interesting. Use your knowledge of determinants to show as follows that $O(2)$ is not connected.
i. Show that if $A \in O(2)$, then $\det A = \pm 1$ and show that both cases actually occur.
ii. Denote by $SO(2)$ the set of elements of the orthogonal group with determinant equal to 1. Show that $SO(2)$ is a (sub)group of $O(2)$. (It is called the special orthogonal group of the plane.)
iii. Prove that the set of elements of $O(2)$ that have determinant equal to $-1$ is not closed under multiplication.
10.83 Suppose $U$ is an open set in $\mathbb{E}^n$ and $V$ is an open set in $\mathbb{E}^m$. Prove that the Cartesian product $U \times V = \{(x, y) \mid x \in U, y \in V\}$ is an open set in $\mathbb{E}^{n+m}$.
10.6 TANGENT SPACES AND LAGRANGE MULTIPLIERS$^{30}$
If a differentiable function $G$ is defined by $G = (G_1, \ldots, G_k) : \mathbb{E}^{n+k} \to \mathbb{E}^k$, then the surface
\[ S = \{x \mid G(x) = v\} \]
is called the level surface for $G(x) = v$. Note that each of the functions $G_i : \mathbb{E}^{n+k} \to \mathbb{R}$. If we denote by $S_i$ the level surface for the equation $G_i(x) = v_i$, then
\[ S = \bigcap_{i=1}^{k} S_i. \]
Suppose that $x^0 = (x_1^0, \ldots, x_{n+k}^0) \in S$ and that
\[ G'(x^0) \in \mathcal{L}(\mathbb{E}^{n+k}, \mathbb{E}^k) \]
has rank $k$. Let $\delta_{i,j} = 1$ if $i = j$ and 0 if $i \neq j$. With respect to the standard basis
\[ \{e_j = (\delta_{1,j}, \ldots, \delta_{n+k,j}) \mid j = 1, 2, \ldots, n + k\} \]
for $\mathbb{E}^{n+k}$ and the analogous smaller basis for $\mathbb{E}^k$, we note that the matrix
\[ [G'(x^0)] = \left(\frac{\partial G_i}{\partial x_j}(x^0)\right)_{k \times (n+k)} \]
has its whole set of $k$ row vectors linearly independent, and these row vectors are the gradient vectors
\[ \nabla G_1(x^0), \ldots, \nabla G_k(x^0). \]

$^{30}$ This section can be omitted without disturbing the continuity of the book.
Let $\phi : \mathbb{R} \to S$ be a differentiable function for which $\phi(0) = x^0$. Then we call the vector $v = \phi'(0)$ a tangent vector to $S$ at $x^0$. Of course there are infinitely many tangent vectors at any one point on a differentiable surface because there are so many different differentiable curves in the surface passing through that point.

Definition 10.6.1 The tangent space $T_{x^0}(S)$ at $x^0 \in S$ is the set of all tangent vectors to $S$ at $x^0$. The translate
\[ x^0 + T_{x^0}(S) \]
is called the tangent plane to the surface $S$, with point of tangency at $x^0$.

In the next theorem, we will prove that the tangent space is always a vector subspace of $\mathbb{E}^{n+k}$. A translate $a + V = \{a + x \mid x \in V\}$ of a vector subspace $V$ of $\mathbb{E}^n$ is called an affine subspace of $\mathbb{E}^n$. An affine subspace is a vector subspace if and only if $a \in V$. (See Exercise 10.84.)

Theorem 10.6.1 Let $G : \mathbb{E}^{n+k} \to \mathbb{E}^k$ be a differentiable function. Let
\[ S = \{x \mid G(x) = v\}. \]
Suppose $G'(x^0)$ has rank $k$. Then the tangent space $T_{x^0}(S)$ is the vector subspace
\[ \bigcap_{i=1}^{k} \left(\nabla G_i(x^0)\right)^{\perp} \]
of dimension $n$. In words, $T_{x^0}(S)$ is the orthogonal complement of the span of the $k$ gradient vectors $\nabla G_1(x^0), \ldots, \nabla G_k(x^0)$.

Proof: Suppose first that $v = \phi'(0) \in T_{x^0}(S)$. Here $\phi$ maps into each level surface $G_i(x) = v_i$. We will show that $v \perp \nabla G_i(x^0)$ for each $i = 1, \ldots, k$. In fact, $G_i(\phi(t)) = v_i$, a real constant. We differentiate using the Chain Rule to find that
\[ G_i'(\phi(0))\,\phi'(0) = 0. \]
In terms of the standard matrix representation of the left side of the latter equation, we have $\nabla G_i(x^0) \cdot \phi'(0) = 0$, so that $v \perp \nabla G_i(x^0)$. This shows that $T_{x^0}(S)$ is a subset of $\left(\nabla G_i(x^0)\right)^{\perp}$ for each $i$. This implies that
\[ T_{x^0}(S) \subseteq \bigcap_{i=1}^{k} \left(\nabla G_i(x^0)\right)^{\perp}. \]
The hypothesis that rank$(G'(x^0)) = k$ implies that the intersection on the right has dimension $n$, so $T_{x^0}(S)$ spans a subspace of dimension at most $n$. If we can show that the tangent space is at least $n$-dimensional, then it will have to be the entire orthogonal complement of the span of the gradient vectors as claimed. Thus it will suffice to produce a linearly independent set of $n$ vectors in the tangent space.

Because the rank of a matrix is also the number of linearly independent column vectors, it follows that the matrix $[G'(x^0)]$ has $k$ independent columns. We can rearrange the order of the $n + k$ elements of the standard basis of $\mathbb{E}^{n+k}$ to arrange that the first $k$ columns are linearly independent. By the Implicit Function Theorem, there exists an open set $U \subset \mathbb{E}^k$ containing $(x_1^0, \ldots, x_k^0)$ and an open set $V \subset \mathbb{E}^n$ containing $(x_{k+1}^0, \ldots, x_{k+n}^0)$ such that there are unique differentiable functions
\[ x_1 = \psi_1(x_{k+1}, \ldots, x_{k+n}), \quad \ldots, \quad x_k = \psi_k(x_{k+1}, \ldots, x_{k+n}) \]
solving the equation
\[ G(x) = v. \]
Next we define $n$ differentiable curves on $S$ by the equations
\[
\begin{aligned}
\phi_1(t) &= \left(\psi_1(x_{k+1}^0 + t, x_{k+2}^0, \ldots, x_{k+n}^0), \ldots, \psi_k(x_{k+1}^0 + t, x_{k+2}^0, \ldots, x_{k+n}^0);\ x_{k+1}^0 + t, x_{k+2}^0, \ldots, x_{k+n}^0\right), \\
&\ \ \vdots \\
\phi_n(t) &= \left(\psi_1(x_{k+1}^0, \ldots, x_{k+n-1}^0, x_{k+n}^0 + t), \ldots, \psi_k(x_{k+1}^0, \ldots, x_{k+n-1}^0, x_{k+n}^0 + t);\ x_{k+1}^0, \ldots, x_{k+n-1}^0, x_{k+n}^0 + t\right).
\end{aligned}
\]
In comparing the vectors $\phi_i'(0)$ for $i = 1, \ldots, n$, observe that for each of these vectors the final $n$ entries, which follow the semicolon, are all 0 except for a single entry that is 1. The location of the 1 is different for each of these vectors. Thus the $n$ vectors are independent and the theorem is proved. •

Corollary 10.6.1 Let $G : \mathbb{E}^{k+n} \to \mathbb{E}^k$ be a differentiable function and let
\[ S = \{x \mid G(x) = v\}. \]
Suppose $x^0$ is a local extreme point of a differentiable function $f : S \to \mathbb{R}$ and that $G'(x^0)$ has rank $k$. Then there exist numbers $\lambda_1, \ldots, \lambda_k$ such that
\[ \nabla f(x^0) = \lambda_1 \nabla G_1(x^0) + \cdots + \lambda_k \nabla G_k(x^0). \tag{10.11} \]
The numbers $\lambda_1, \ldots, \lambda_k$ are called Lagrange multipliers.

Proof: If $\phi : \mathbb{R} \to S$ is a differentiable curve on $S$ with $\phi(0) = x^0$, let
\[ \psi(t) = f(\phi(t)). \]
Since this function has an extreme point at 0, we have
\[ \psi'(0) = \nabla f(\phi(0)) \cdot \phi'(0) = 0. \]
Since every tangent vector has the form $\phi'(0)$ for such a curve, $\nabla f(x^0)$ is orthogonal to the tangent space $T_{x^0}(S)$, which was identified in Theorem 10.6.1. Since the co-dimension of $T_{x^0}(S)$ is $k$, it follows that $\nabla f(x^0)$ lies in the span of the $k$ vectors $\nabla G_1(x^0), \ldots, \nabla G_k(x^0)$. This proves the corollary. •

The method of Lagrange multipliers permits an optimization problem to be replaced by a problem of solving a system of equations. From the $k + n$ components of the vectors in Equation (10.11), we obtain a system of $k + n$ equations in the $n + 2k$ unknowns $x_1, \ldots, x_{k+n}, \lambda_1, \ldots, \lambda_k$. We get $k$ additional equations from the $k$ components of the equation $G(x) = v$. Thus we obtain a system of $n + 2k$ equations in $n + 2k$ unknowns. Although we have replaced a calculus problem with an algebraic problem, the algebraic problem can be challenging. Nevertheless, the method of Lagrange multipliers is a powerful tool for optimization problems.
• EXAMPLE 10.6
Figure 10.9  $x_1^4 + x_2^4 + x_3^4 = 1$.

We will begin with a three-dimensional example. Consider the surface $S$ defined by the equation
\[ x_1^4 + x_2^4 + x_3^4 = 1 \]
in $\mathbb{E}^3$, shown in Fig. 10.9. We will find both the maximum and the minimum values of the function
\[ f(x) = x_1^2 + x_2^2 + x_3^2 \]
on $S$. (In effect, we are determining the closest and furthest distances from the origin on $S$.) In this example, we denote $x = (x_1, x_2, x_3)$. Observe that if we define
\[ G(x) = x_1^4 + x_2^4 + x_3^4, \]
then $S = G^{-1}(\{1\})$. Hence $S$ is closed because $G$ is continuous. $S$ is also bounded. (Why?) Hence the function $f$ must achieve both a maximum and a minimum value somewhere on $S$. Since $S$ is smooth at all points and since $\nabla G$ is nonvanishing on $S$, the extreme points must occur at those points for which $\nabla f(x) = \lambda \nabla G(x)$. This yields the following system of equations.
\[
\begin{aligned}
x_1(1 - 2\lambda x_1^2) &= 0, \\
x_2(1 - 2\lambda x_2^2) &= 0, \\
x_3(1 - 2\lambda x_3^2) &= 0, \\
x_1^4 + x_2^4 + x_3^4 &= 1.
\end{aligned}
\]
The reader should check the following statements by making the necessary calculations.
• If none of the three variables is zero, then at a point satisfying the system of equations we must have $f(x_1, x_2, x_3) = \sqrt{3}$.
• If exactly one of the three variables is zero, then at a point satisfying the system of equations we must have $f(x_1, x_2, x_3) = \sqrt{2}$.
• If exactly two of the variables are zero, then at a point satisfying the system we must have $f(x_1, x_2, x_3) = 1$.
It follows that the maximum value of $f$ on $S$ is $\sqrt{3}$. But the reader should be able to explain why at least one of the variables must be nonzero. Thus the minimum value is 1. There is also an easy way to explain even from the outset why $f(x_1, x_2, x_3) \ge 1$ everywhere on $S$.
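The three cases of Example 10.6 can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the candidates use positive coordinates only (signs do not affect $f$), and the random sampling is a sanity check on the bounds rather than a proof.

```python
import math
import random

def f(x):
    # Objective of Example 10.6.
    return x[0] ** 2 + x[1] ** 2 + x[2] ** 2

def G(x):
    # Constraint function; S is the set where G(x) = 1.
    return x[0] ** 4 + x[1] ** 4 + x[2] ** 4

def candidate(nonzero_count):
    # Critical point whose nonzero coordinates share one value s,
    # where nonzero_count * s**4 = 1.
    s = nonzero_count ** -0.25
    return tuple(s if i < nonzero_count else 0.0 for i in range(3))

# Value of f at the critical points with 1, 2, and 3 nonzero coordinates.
values = {k: f(candidate(k)) for k in (1, 2, 3)}

def random_point_on_S(rng):
    # Scale a random nonzero vector radially so that G = 1.
    while True:
        v = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        if G(v) > 1e-12:
            scale = G(v) ** -0.25
            return [scale * c for c in v]

rng = random.Random(0)
samples = [f(random_point_on_S(rng)) for _ in range(10000)]
print(values, min(samples), max(samples))
```

The sampled values of $f$ should stay between 1 and $\sqrt{3}$, in agreement with the minimum and maximum found by the method of Lagrange multipliers.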
EXERCISES
10.84 Prove that the tangent plane $x^0 + T_{x^0}(S)$ is a vector subspace of $\mathbb{E}^n$ if and only if $x^0 \in T_{x^0}(S)$.
10.85 Describe both the tangent space and the tangent plane to the sphere
\[ S^{n-1} = \Big\{x \in \mathbb{E}^n \ \Big|\ \sum_{i=1}^{n} x_i^2 = 1\Big\} \]
at the point
\[ x^0 = \left(\frac{1}{\sqrt{n}}, \frac{1}{\sqrt{n}}, \ldots, \frac{1}{\sqrt{n}}\right). \]
10.86 The sphere $S^3 \subset \mathbb{E}^4$ is defined by
\[ S^3 = \Big\{x \ \Big|\ \sum_{i=1}^{4} x_i^2 = 1\Big\}. \]
Define $f : S^3 \to \mathbb{R}$ by
\[ f(x) = \sum_{i=1}^{4} a_i x_i, \]
where $a_i$ is a constant for each $i \in \{1, 2, 3, 4\}$. Show that the maximum and minimum values of $f$ on $S^3$ are
\[ \pm\sqrt{\sum_{i=1}^{4} a_i^2}. \]
10.87 The group $SL(2, \mathbb{R})$ of matrices was defined in Exercise 10.81. Let $f$, mapping $SL(2, \mathbb{R})$ to $\mathbb{R}$, be defined by
\[ f(x) = x_1^2 + x_2^2 + x_3^2 + x_4^2, \]
where we identify the matrix
\[ X = \begin{pmatrix} x_1 & x_2 \\ x_3 & x_4 \end{pmatrix} \in SL(2, \mathbb{R}) \]
with the vector $x = (x_1, x_2, x_3, x_4)$ constrained to the surface $S$ in $\mathbb{E}^4$ defined by the equation
\[ x_1 x_4 - x_2 x_3 = 1. \]
a) Prove that $f$ achieves a minimum value on $S$ but that it has no maximum.
b) Use the method of Lagrange multipliers to find the minimum value of $f$ on $S$.
10.88
Let 4
J(x) =
I:xr i=l
for all x E JE 4 . Let 8 1 be the surface in determined by
and let 82 be the surface defined by
Let 8 = 81 n 82. a) Prove that f(x) has a minimum value on 8 but no maximum. b) Find the minimum value of f(x) on 8.
10.7 TEST YOURSELF
EXERCISES 10.89
Let A and B be in £ {lE2 ) with matrices in the standard basis given by [A] = (
Find
6 8)
and [B] = (
8 ~ ).
$\|A\|$, $\|B\|$, and $\|A + B\|$.
10.90 True or False: It is possible to give an example of $X \in \mathcal{L}(\mathbb{E}^2)$ for which
\[ \|X + X\| < \|X\| + \|X\|. \]
10.91 Define $S \subset \mathcal{L}(\mathbb{E}^2)$ by letting $S = \{A \in \mathcal{L}(\mathbb{E}^2) \mid a_{11} > 1\}$, where the matrix $[A] = [a_{ij}]_{2 \times 2}$. True or False: $S$ is a connected subset of $\mathcal{L}(\mathbb{E}^2)$.
10.92 Give an example of $T \in \mathcal{L}(\mathbb{E}^2)$ for which
\[ \|T^2\| < \|T\|^2. \]
10.93 Let $f \in C(D, \mathbb{E}^2)$, where $D$ is an open subset of $\mathbb{E}^2$. Suppose also that $f' \in C(D, \mathcal{L}(\mathbb{E}^2, \mathbb{E}^2))$. True or False:
\[ f'(c x_1 + x_2) = c f'(x_1) + f'(x_2) \]
for all $x_1$ and $x_2$ in $D$ and $c \in \mathbb{R}$.
Let f : JE 2 ----) JE 2 be defined by f(x) Dvf (1, i), where v = (2, 1).
10.94 10.95
Suppose f : JE 2
----)
JE 2 and g : JE 3
=
(ext cosx 2 , ext sinx 2 ). Calculate
----)
JE 2 are both differentiable. Let
g(O) = Xo, [g'(O)] = ( !2 --:} and
[f'(xo)] = (
5
~1
~2 ) ) .
in the standard bases. Find the matrix $[(f \circ g)'(0)]$ using the standard bases.
10.96 True or give a Counterexample: If $f \in C(\mathbb{E}^1, \mathbb{E}^1)$ is locally injective, meaning that for each $x \in \mathbb{E}^1$ there is a corresponding $r > 0$ such that $f$ restricted to $B_r(x)$ is injective (meaning one-to-one), then $f$ must be injective on $\mathbb{E}^1$.
10.97 Find all points $x$ at which the Jacobian of $f$ does not vanish, and for each such $x$ a ball $B_r(x)$ on which $f$ has a differentiable inverse, if $f : \mathbb{E}^3 \to \mathbb{E}^3$ by $f(x) = (x_1 \cos x_2, x_1 \sin x_2, x_3)$.
10.98 Let $f \in C^1(\mathbb{E}^4, \mathbb{E}^1)$ be such that the matrix
\[ [f'(x)]_{1 \times 4} = [x_2, x_1, x_4, x_3]. \]
Find all $x_0 = (x_1, x_2, x_3, x_4)$ at which the implicit function theorem guarantees that there is a local $C^1$ solution of the equation $f(x) = f(x_0)$ for $x_3$ in terms of the other three variables.
10.99 Give an example of a locally injective map $f : \mathbb{E}^1 \to \mathbb{E}^2$ which is not injective.
10.100 Suppose that $f = (f_1, f_2) \in C^1(\mathbb{E}^3, \mathbb{E}^2)$ and $x_0 = (a, b, c) \in \mathbb{E}^3$ is in the solution set of the equation $f(x) = 0$. Suppose this equation can be solved for $x_2 = x_2(x_1)$ and $x_3 = x_3(x_1)$, both $C^1$ functions in a neighborhood of $x_0$ with $x_2(a) = b$ and $x_3(a) = c$. If the matrix $[f'(x_0)]$
= ( _\
6 ~ ),
then find x;(a) and x;(a). (Hint: Apply the Chain Rule to differentiate
with respect to x1 on both sides.)
CHAPTER 11
RIEMANN INTEGRATION IN EUCLIDEAN SPACE
11.1 DEFINITION OF THE INTEGRAL
For functions $f : [a, b] \to \mathbb{R}$ the Riemann integral was motivated by the desire to represent the signed area trapped between the graph of $f$ and the $x$-axis. We saw in Chapter 3 that every Riemann integrable function on a closed, finite interval $[a, b]$ must be bounded. One can motivate integrals of bounded real-valued functions mapping a bounded domain of definition $D_f \subset \mathbb{E}^n$ to $\mathbb{R}$ in terms of signed volumes for $n = 2$, or in terms of mass (as the integral of a density function) for $n = 3$ if $f(x) \ge 0$ for all $x \in D_f$. For $n > 3$ the applications are less visual though still important.
The fundamental concept of the integral remains the same as it was for $n = 1$. We would like to partition the domain of $f$ into very small pieces. In each piece we select an arbitrary evaluation point at which to evaluate $f$. Then we would like the finite sums of the lengths (respectively area, volume, etc.) of the pieces weighted by the values of $f$ at the chosen evaluation points to converge as the mesh of the partition approaches zero. Such a limit, if it exists, is called a Riemann integral. We saw for Riemann integrals in $\mathbb{R}^1$ that the existence of a Riemann integral could be characterized very conveniently and easily using the Darboux Criterion (Section 3.2). This will be our starting point for Riemann integration in $\mathbb{E}^n$: we will begin by defining upper and lower integrals for functions defined in $\mathbb{E}^n$ much as we did for $\mathbb{R}^1$.
The reader will recall from a first course in the calculus of several variables that functions of interest are seldom defined on a rectangular domain. Most often, useful examples of functions of several variables are defined on domains that are bounded by several intersecting curves (in the plane), or by several intersecting surfaces (in three-dimensional space). We will begin, however, by considering functions defined on a rectangular block. At the end of this section we will show how to extend Riemann integration on $\mathbb{E}^n$ to functions defined on more general bounded domains.
Definition 11.1.1 Let $a$ and $b \in \mathbb{E}^n$. We define a closed rectangular block
$$[a, b] = [a_1, b_1] \times \cdots \times [a_n, b_n]$$
to be a Cartesian product of closed finite intervals on the $n$ axes. Note that if for any $i$ we have $a_i > b_i$ then $[a_i, b_i] = \emptyset$, the empty set, which is both closed and open. We denote the interior of a closed rectangular block as the open rectangular block
$$(a, b) = (a_1, b_1) \times \cdots \times (a_n, b_n).$$
(See Exercise 11.1.) In either case we define the measure of the block $B$ by
$$\mu(B) = \prod_{1 \le i \le n} (b_i - a_i)$$
if $a_i < b_i$ for all $i$. If $a_i \ge b_i$ for at least one $i$, then we define $\mu(B) = 0$.
Note that $B$ degenerates into the empty set if $a_i > b_i$ for some $i$, and has at most $n - 1$ strictly positive dimensions if $a_i = b_i$ for some $i$. Note also that $\mu(B)$ is called length in $\mathbb{E}^1$, area in $\mathbb{E}^2$, and volume in $\mathbb{E}^3$. Next, we define what is meant by a partition of a block $B$ and by the mesh of a partition.
Definition 11.1.2 A partition $P$ of a block $B = [a, b]$ is a finite set
$$P = \{B_i = [\mathbf{a}_i, \mathbf{b}_i] \mid i = 1, \ldots, N\}$$
such that $B = \bigcup_{i=1,\ldots,N} B_i$ and $B_i^\circ \cap B_j^\circ = \emptyset$ for all pairs $\{i, j\}$ such that $i \ne j$. We define the mesh of $P$ by
$$\|P\| = \max_{[a,b] \in P} \sqrt{\sum_{1 \le i \le n} (b_i - a_i)^2},$$
where the sum runs over the $n$ coordinates of the block $[a, b]$.
A partition $P'$ is called a refinement of a partition $P$, provided that for each block $B_i' \in P'$, $B_i'$ is a subset of some block $B_j$ of the partition $P$.
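The two definitions above can be made concrete with a small numerical sketch. It assumes that blocks are stored as pairs of corner tuples `(a, b)` and that the mesh is the largest block diagonal; the helper `uniform_partition` is a hypothetical name introduced only for this illustration and handles planar blocks only.

```python
import math

def measure(a, b):
    """Measure of the block [a, b]: the product of the edge lengths b_i - a_i,
    or 0 if a_i >= b_i for at least one i (Definition 11.1.1)."""
    if any(ai >= bi for ai, bi in zip(a, b)):
        return 0.0
    return math.prod(bi - ai for ai, bi in zip(a, b))

def mesh(partition):
    """Mesh of a partition given as a list of (a, b) corner pairs.
    Assumption: the mesh is the largest diagonal among the blocks."""
    return max(
        math.sqrt(sum((bi - ai) ** 2 for ai, bi in zip(a, b)))
        for a, b in partition
    )

def uniform_partition(a, b, k):
    """Hypothetical helper: cut the planar block [a, b] into k*k congruent sub-blocks."""
    h1, h2 = (b[0] - a[0]) / k, (b[1] - a[1]) / k
    return [
        ((a[0] + i * h1, a[1] + j * h2),
         (a[0] + (i + 1) * h1, a[1] + (j + 1) * h2))
        for i in range(k) for j in range(k)
    ]

P = uniform_partition((0.0, 0.0), (2.0, 1.0), 4)
print(sum(measure(a, b) for a, b in P))  # total measure of the sub-blocks
print(mesh(P))                           # largest diagonal among the sub-blocks
```

The sub-blocks overlap only along shared faces, so their measures add up to the measure of the whole block, and refining the grid (larger `k`) shrinks the mesh.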
We continue our generalization to $\mathbb{E}^n$ of the contents of Section 3.2 by defining the upper and lower sums for a bounded function $f$ on a closed rectangular block $B$.
Definition 11.1.3 Let $f : B \to \mathbb{R}$, where $B$ is a closed rectangular block in $\mathbb{E}^n$. Let $P = \{B_i \mid i = 1, \ldots, N\}$ be any partition of $B$. For each $B_i \in P$ with $\mu(B_i) > 0$, we denote $M_i = \sup\{f(x) \mid x \in B_i\}$ and $m_i = \inf\{f(x) \mid x \in B_i\}$. We define the upper sum by
$$U(f, P) = \sum_{1 \le i \le N} M_i\,\mu(B_i)$$
and the lower sum by
$$L(f, P) = \sum_{1 \le i \le N} m_i\,\mu(B_i),$$
where both sums are over those $i$ between $1$ and $N$ with $\mu(B_i) > 0$.
Theorem 11.1.1 Let $P$ and $P'$ be any two partitions of a closed rectangular block $B$ and let $f$ be any bounded function on $B$. Then $L(f, P') \le U(f, P)$.
Proof: Suppose first that a block $B_i' \subseteq B_j$ happens to occur. The suprema and infima of $f$ on the two blocks are related as follows:
$$m_j \le m_i' \le M_i' \le M_j.$$
Next we observe that for all $B_i' \in P'$ and $B_j \in P$ the intersection $B_i' \cap B_j$ is again a rectangular block, though of course it could be a block of measure zero, whether empty or not. Since $P'$ is a partition of $B$, $\{B_i' \cap B_j \mid B_i' \in P'\}$ is itself a partition of $B_j$. Let $M_{i,j}'$ be the supremum of $f$ on $B_i' \cap B_j$. Then
$$\sum_i M_{i,j}'\,\mu(B_i' \cap B_j) \le M_j\,\mu(B_j).$$
Now define the mutual refinement $P''$ of $P$ and $P'$ to be
$$P'' = \{B_i' \cap B_j \mid B_i' \in P',\ B_j \in P\}.$$
Then we see that
$$L(f, P') \le L(f, P'') \le U(f, P'') \le U(f, P),$$
which is what we needed to prove. ∎
Just as in Section 3.2, we see now that the family of all upper sums (for the various possible partitions) is bounded below by each lower sum, and vice versa. Hence we may make the following definition.
Definition 11.1.4 Let $f$ be any bounded function on a closed rectangular block $B$. Define the upper integral
$$\overline{\int_B} f = \inf\{U(f, P) \mid P \text{ is a partition of } B\}$$
and define the lower integral
$$\underline{\int_B} f = \sup\{L(f, P) \mid P \text{ is a partition of } B\}.$$
Theorem 11.1.2 For each bounded function $f$ on a block $B$, we have
$$\underline{\int_B} f \le \overline{\int_B} f.$$
The proof is left to Exercise 11.2.
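Definition 11.1.5 We call a bounded function $f$ integrable on a closed block $B$ and write $f \in \mathcal{R}(B)$ if and only if $\underline{\int_B} f = \overline{\int_B} f$. If $f \in \mathcal{R}(B)$, we define
$$\int_B f(x)\,dx = \underline{\int_B} f = \overline{\int_B} f.$$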
Theorem 11.1.3 (Darboux Integrability Criterion) Let $f : B \to \mathbb{R}$ be a bounded function on a rectangular closed block $B$. Then $f \in \mathcal{R}(B)$ if and only if for each $\epsilon > 0$ there exists a partition $P$ for which $U(f, P) - L(f, P) < \epsilon$.
The proof is left to Exercise 11.4.
Theorem 11.1.4 The set $\mathcal{R}[a, b]$ is a vector space, and the map $T : \mathcal{R}[a, b] \to \mathbb{R}$ defined by $T(f) = \int_B f(x)\,dx$ is linear.
The proof is left to Exercise 11.5.
Theorem 11.1.5 If $f \in C[a, b]$, then $f \in \mathcal{R}[a, b]$.
Proof: Since $f$ is continuous on the block $B = [a, b]$, which is both closed and bounded, we know that $f$ is uniformly continuous on $B$. Thus if $\epsilon > 0$, there exists a $\delta > 0$ such that $\|x - x'\| < \delta$ implies that
$$|f(x) - f(x')| < \frac{\epsilon}{2\mu(B)},$$
where we assume $\mu(B) > 0$. Now if we partition $B$ with a finite set of blocks $P = \{B_i \mid i = 1, \ldots, p\}$ with $\|P\| < \delta$, it follows that on each block $B_i$ we have
$$|M_i - m_i| \le \frac{\epsilon}{2\mu(B)}.$$
Hence
$$U(f, P) - L(f, P) \le \frac{\epsilon}{2} < \epsilon.$$
It follows from the Darboux Criterion that $f \in \mathcal{R}(B)$. ∎
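As a rough numerical illustration of the Darboux Criterion, the following sketch computes upper and lower sums over uniform grid partitions of a planar block. It assumes a function that is nondecreasing in each coordinate, so that the supremum and infimum on each sub-block are attained at opposite corners; the name `darboux_sums` and the test function $f(x, y) = x + y$ are choices made for this example only.

```python
def darboux_sums(f, a, b, n):
    """Upper and lower sums of f over the planar block [a, b], using the
    uniform partition into n*n congruent sub-blocks.

    Assumption: f is nondecreasing in each coordinate, so on every sub-block
    the supremum is f at the upper-right corner and the infimum is f at the
    lower-left corner."""
    (a1, a2), (b1, b2) = a, b
    h1, h2 = (b1 - a1) / n, (b2 - a2) / n
    area = h1 * h2
    upper = lower = 0.0
    for i in range(n):
        for j in range(n):
            x0, y0 = a1 + i * h1, a2 + j * h2
            lower += f(x0, y0) * area
            upper += f(x0 + h1, y0 + h2) * area
    return upper, lower

f = lambda x, y: x + y
for n in (1, 10, 100):
    U, L = darboux_sums(f, (0.0, 0.0), (1.0, 1.0), n)
    print(n, U, L, U - L)
```

For this particular $f$ on the unit square the gap $U(f, P) - L(f, P)$ works out to $2/n$, so it can be made smaller than any given $\epsilon > 0$ by refining the grid.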
We turn our attention now to the definition of the Riemann integral for functions defined on any bounded subset $D_f \subset \mathbb{E}^n$. Then the domain $D_f$ is contained in some sufficiently large rectangular block $[a, b]$. We would like to make the following definition.
Definition 11.1.6 Suppose that the domain $D_f$ of a function $f$ is contained in some sufficiently large rectangular block $[a, b]$. We will extend the definition of $f$ so that $f(x) = 0$ at each point $x \in [a, b] \setminus D_f$. We define
$$\int_{D_f} f(x)\,dx = \int_{[a,b]} f(x)\,dx,$$
provided that this integral exists.
Theorem 11.3.1 will show that for some sets $D$, even $f(x) \equiv 1$ is not integrable on $D$. Also, for Definition 11.1.6 to make sense, we must show that it is independent of the choice of block $[a, b] \supseteq D_f$. For this purpose, it suffices to prove the following theorem.
Theorem 11.1.6 Let $f$ be any bounded real-valued function on a bounded domain $D_f \subset \mathbb{E}^n$. Suppose $D_f \subseteq [a, b]$ and $D_f \subseteq [a', b']$. Extend the definition of $f$ to be zero identically on the complement of $D_f$ in each block. Then
$$\overline{\int_{[a,b]}} f = \overline{\int_{[a',b']}} f \quad \text{and} \quad \underline{\int_{[a,b]}} f = \underline{\int_{[a',b']}} f.$$
Proof: Let $[A, B]$ be any closed rectangular block that contains both $[a, b]$ and $[a', b']$. It will suffice to prove that
$$\overline{\int_{[a,b]}} f = \overline{\int_{[A,B]}} f \quad \text{and} \quad \underline{\int_{[a,b]}} f = \underline{\int_{[A,B]}} f.$$
The two arguments are so similar that we will present a proof of only the first. Let $\epsilon > 0$. There exists a partition $P$ of $[A, B]$ for which
$$U(f, P) - \overline{\int_{[A,B]}} f < \frac{\epsilon}{3}.$$
Without disturbing this inequality, we can refine $P$ in such a way that no block $B_k \in P$ intersects both the interior of $[a, b]$ and the interior of its complement. Thus the collection $P'$ consisting of those blocks of $P$ that lie in the smaller block $[a, b]$ is a partition of $[a, b]$ that has the property that $U(f, P') - \overline{\int_{[a,b]}} f < \frac{\epsilon}{3}$. In order to establish that
$$\left| \overline{\int_{[a,b]}} f - \overline{\int_{[A,B]}} f \right| < \epsilon,$$
it will suffice to show that for $P$ suitably selected we have
$$|U(f, P) - U(f, P')| < \frac{\epsilon}{3}.$$
Since $f$ vanishes on the complement of $[a, b]$, the difference under consideration consists of a sum over only those blocks of $P$ that are outside $[a, b]$ but have one face lying on a face of $[a, b]$. Thus
$$|U(f, P) - U(f, P')| \le \|f\|_{\sup}\, A\, \|P\|,$$
where $A$ denotes the $(n-1)$-dimensional surface area of the block $[a, b]$. By refining $P$ we can ensure that $\|P\|$ is small enough to accomplish this goal. ∎
EXERCISES
11.1 † Prove that $B^\circ = (a, b)$ is the interior of the rectangular block $B = [a, b]$, where interior is defined in Exercise 8.26.
11.2 Prove that
$$\underline{\int_B} f \le \overline{\int_B} f$$
for all bounded functions $f : B \to \mathbb{R}$ defined on a closed rectangular block $B$.
11.3 † Let $f$ and $g$ be bounded functions on a rectangular block $B$, and let $c \in \mathbb{R}$.
a) Prove that $\overline{\int_B} (f + g) \le \overline{\int_B} f + \overline{\int_B} g$.
b) Prove that $\underline{\int_B} (f + g) \ge \underline{\int_B} f + \underline{\int_B} g$.
c) Compare $\overline{\int_B} cf$ and $\underline{\int_B} cf$ with $\overline{\int_B} f$ and $\underline{\int_B} f$. Take note of the effect of $c$ being either positive or negative.
11.4 † Prove Theorem 11.1.3. (Hint: Apply Definition 11.1.5 directly. Compare with Theorem 3.2.4.)
11.5 † The following parts will prove Theorem 11.1.4, which can be compared with Theorem 3.1.2. Apply the parts of Exercise 11.3 to show each of the following parts.
a) $\mathcal{R}[a, b]$ is closed under addition.
b) $\mathcal{R}[a, b]$ is closed under scalar multiplication.
c) The map $T : \mathcal{R}[a, b] \to \mathbb{R}$ defined by $T(f) = \int_B f(x)\,dx$ is linear.
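11.6 Let
$$f(x) = \begin{cases} 1 & \text{if } x_1 = 1, \\ 0 & \text{if } x_1 \in [0, 2] \setminus \{1\} \end{cases}$$
be a function defined on a block $B = [0, 2] \times [0, 1] \subset \mathbb{E}^2$. Find a partition $P$ of $B$ for which
$$U(f, P) - L(f, P) < \frac{1}{8}.$$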
11.7 † Let $f$ be any real-valued function on a rectangular block $B \subseteq \mathbb{E}^n$. Define
$$f^+(x) = \begin{cases} f(x) & \text{if } f(x) \ge 0, \\ 0 & \text{if } f(x) < 0 \end{cases}$$
and let
$$f^-(x) = \begin{cases} -f(x) & \text{if } f(x) < 0, \\ 0 & \text{if } f(x) \ge 0 \end{cases}$$
for all $x \in B$. Prove that
$$f(x) = f^+(x) - f^-(x) \quad \text{and} \quad |f(x)| = f^+(x) + f^-(x)$$
for all $x \in B$. (Hint: Just check the cases based on the sign of $f(x)$.)
11.8 † Suppose $f \in \mathcal{R}[a, b]$. Prove: $f^+$ and $f^-$ are in $\mathcal{R}[a, b]$. Hint: Show that
$$U(f^+, P) - L(f^+, P) \le U(f, P) - L(f, P).$$
11.9 If $f \in \mathcal{R}[a, b]$, prove $|f| \in \mathcal{R}[a, b]$. (Hint: Use Exercises 11.7 and 11.8 above.)
11.10 Give an example of $f$ such that $|f| \in \mathcal{R}[a, b]$ yet $f \notin \mathcal{R}[a, b]$.
11.11 † If $f \in \mathcal{R}[a, b]$, prove that
$$\left| \int_{[a,b]} f(x)\,dx \right| \le \int_{[a,b]} |f(x)|\,dx.$$
(Hint: Write the left side by expressing $f = f^+ - f^-$ and use the fact that $f^+$ and $f^-$ are both nonnegative functions.)
11.12 Let the linear map $T : \mathcal{R}[a, b] \to \mathbb{R}$ be defined by $T(f) = \int_{[a,b]} f(x)\,dx$. Prove that $T$ is a bounded linear functional.
11.13 Suppose $f \in C(a, b)$ and also that $f$ is bounded on $[a, b]$. Prove that $f$ is Riemann integrable on $[a, b]$.
11.14
Let f: [(0, 0), (1, 1)]
f(x) =
{
--+
.! sin ~
lR be defined by ___L_
X!X2
if 0 <Xi ~ 1, i if X1X2 = 0.
=
1, 2,
Prove that f is Riemann integrable on the closed rectangular box
[(0, 0), (1, 1)] = [0, 1]X 2 • (See Fig. 11.1.) Show also that set for which x 1x2 = 0.
f
is not continuous at any point in the uncountable
Figure 11.1
Let a= (0, 0) and b = (1, 1) E lE 2 • Let f
11.15
f(x) Prove:
f(x) = -21 sin_,.._, XtX2
f
~
= {
~
if X1 E Ql and
: [a, b] X2
---7
lR by
E Ql,
otherwise.
R.[a, b].
11.2 LEBESGUE NULL SETS AND JORDAN NULL SETS
Even in the study of Riemann integrability on the real line, it was difficult to see what exactly is the connection between the Riemann integrability of $f$ and continuity of $f$. We know that every continuous function is Riemann integrable, but so is every step function and every monotone function, and even every function of bounded variation. In Exercise 3.14 we saw a Riemann integrable function with a countably infinite set of discontinuities. And in Exercise 7.10 we saw a function $f \in \mathcal{R}[0, 1]$ that had a discontinuity at every rational point of $[0, 1]$. Thus we saw that a function can be Riemann integrable even with its points of discontinuity constituting a set that is both countably infinite and dense. Finally, in Exercise 11.14 we saw a function $f \in \mathcal{R}[(0, 0), (1, 1)]$ that was discontinuous at each point in the unit square for which either $x_1 = 0$ or $x_2 = 0$. This set of points of discontinuity is uncountably infinite. It is therefore with good reason that we say the relationship between continuity and Riemann integrability appears to this point in our study to be mysterious. The precise
formulation of a necessary and sufficient criterion for Riemann integrability of a bounded function on a bounded domain in terms of continuity was established early in the twentieth century by the French mathematician Henri Lebesgue. Lebesgue's name is associated most often with a more refined concept of integration called the Lebesgue Integral, and related to this is the concept of the Lebesgue measure of a set. We will not delve into these graduate-level topics in this course. But both the concept of a Lebesgue null set and the related concept of a Jordan null set are accessible to us at this point and they are essential to understanding Riemann integrability in terms of continuity.
Definition 11.2.1 A set $S \subseteq \mathbb{E}^n$ is called a Lebesgue null set if and only if for each $\epsilon > 0$ there exists a countable family of open blocks $B_i^\circ$ such that
$$S \subseteq \bigcup_{i \in \mathbb{N}} B_i^\circ \quad \text{and} \quad \sum_{i=1}^{\infty} \mu(B_i) < \epsilon.$$
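As an illustration of Definition 11.2.1 on the real line, the sketch below covers an enumerated list of rational numbers with open intervals whose lengths shrink geometrically, so that the total length stays below a preassigned $\epsilon$. Only a finite initial segment of the enumeration can be computed, and the function names and the particular lengths $\epsilon / 2^{k+2}$ are assumptions made for this example.

```python
from fractions import Fraction

def rationals_in_unit_interval(max_den):
    """Distinct rationals p/q in [0, 1] with denominator q <= max_den, listed in
    a fixed order (a finite initial segment of an enumeration)."""
    seen, out = set(), []
    for q in range(1, max_den + 1):
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
    return out

def cover(points, eps):
    """Give the k-th point (k = 0, 1, ...) an open interval of length
    eps / 2**(k + 2) centered at that point."""
    return [
        (x - eps / 2 ** (k + 3), x + eps / 2 ** (k + 3))
        for k, x in enumerate(points)
    ]

eps = Fraction(1, 100)
pts = rationals_in_unit_interval(12)
intervals = cover(pts, eps)
total = sum(hi - lo for lo, hi in intervals)
print(len(pts), float(total), total < eps)
```

Because the lengths form a geometric series with sum at most $\epsilon / 2$, the bound on the total length does not depend on how many points of the enumeration are covered.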
In Example 1.15 we saw that the set of rational numbers is a Lebesgue null set, because we showed there how to cover $\mathbb{Q}$ with a countable sequence of open intervals with the sum of the lengths being less than any preassigned $\epsilon > 0$. This argument can be generalized to any countable set, as the reader will see in the Exercises. Closely related to the concept of a Lebesgue null set is that of a Jordan null set, which we define next.
Definition 11.2.2 A set $S \subseteq \mathbb{E}^n$ is called a Jordan null set if and only if for each $\epsilon > 0$ there exists a finite family of open blocks $B_i^\circ$ such that
$$S \subseteq \bigcup_{i=1}^{p} B_i^\circ \quad \text{and} \quad \sum_{i=1}^{p} \mu(B_i) < \epsilon.$$
In the Exercises the reader will show that every finite set is a Jordan null set, but that the set of rational numbers is not a Jordan null set. The following theorem establishes a useful relationship between Lebesgue and Jordan null sets.
Theorem 11.2.1 If a compact set $K \subset \mathbb{E}^n$ is a Lebesgue null set, then $K$ is a Jordan null set.
Proof: Let $\epsilon > 0$. By hypothesis, there exists a countable family of open blocks $B_n^\circ$ such that $K \subseteq \bigcup_{n \in \mathbb{N}} B_n^\circ$ and such that $\sum_{n \in \mathbb{N}} \mu(B_n) < \epsilon$. Since $K$ is compact, there exists a finite subcollection $\{B_{j_1}^\circ, \ldots, B_{j_p}^\circ\}$ the union of which still covers $K$. But then $\sum_{k=1}^{p} \mu(B_{j_k}) < \epsilon$. ∎
We will apply the concepts of Lebesgue and Jordan null sets especially to certain sets of points closely related to the concept of continuity of a function. Specifically, we will need the concept of the oscillation of a function at a point. Intuitively, oscillation at a jump discontinuity should be the height of the jump, and oscillation equal to zero should mean continuity.
Definition 11.2.3 Let $f : D \to \mathbb{R}$, where $D \subseteq \mathbb{E}^n$. We define the oscillation of $f$ at a point $x \in D$ to be
$$o(f, x) = \lim_{r \to 0+} \left( \sup_{y \in D \cap B_r(x)} f(y) - \inf_{y \in D \cap B_r(x)} f(y) \right).$$
It is left to the reader to show that this limit exists in Exercise 11.25.
Theorem 11.2.2 Let $f : D \to \mathbb{R}$, where $D \subseteq \mathbb{E}^n$. Then $f$ is continuous at $x \in D$ if and only if $o(f, x) = 0$.
Proof: First, suppose $f$ is continuous at $x$. Then if $\epsilon > 0$ there exists $\delta > 0$ such that $0 < r < \delta$ implies
$$\sup_{y \in D \cap B_r(x)} f(y) - \inf_{y \in D \cap B_r(x)} f(y) \le \epsilon.$$
Since this is true for all $\epsilon > 0$, it follows that $o(f, x) = 0$. Next, suppose that $o(f, x) = 0$. We must show that $f$ is continuous at $x$. Let $\epsilon > 0$. Then there exists $\delta > 0$ such that $0 < r < \delta$ implies
$$\sup_{y \in D \cap B_r(x)} f(y) - \inf_{y \in D \cap B_r(x)} f(y) < \epsilon.$$
It follows that for all $y \in D \cap B_r(x)$ we have $|f(y) - f(x)| < \epsilon$. Thus $f$ is continuous at $x$. ∎
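The behavior of the oscillation at a jump and at a point of continuity can be explored numerically. The sketch below works on the real line with $D = \mathbb{R}$ and estimates the supremum minus the infimum over $B_r(x)$ by sampling finitely many points, which in general yields only a lower bound; the name `oscillation_estimate` and the sampling scheme are assumptions made for this example.

```python
def oscillation_estimate(f, x, radii, samples=2001):
    """For each radius r, estimate sup f - inf f over the open interval
    (x - r, x + r) from equally spaced interior sample points.
    Sampling can only underestimate the true difference."""
    estimates = []
    for r in radii:
        ys = [x - r + 2 * r * k / (samples + 1) for k in range(1, samples + 1)]
        values = [f(y) for y in ys]
        estimates.append(max(values) - min(values))
    return estimates

step = lambda y: 1.0 if y >= 0 else 0.0   # unit jump at 0
square = lambda y: y * y                  # continuous everywhere

print(oscillation_estimate(step, 0.0, [1.0, 0.1, 0.001]))    # stays at the jump height
print(oscillation_estimate(step, 0.5, [0.1, 0.01]))          # zero away from the jump
print(oscillation_estimate(square, 0.0, [1.0, 0.1, 0.01]))   # shrinks as r decreases
```

The estimates for the step function at the jump do not shrink with $r$, in line with the intuition that the oscillation at a jump discontinuity is the height of the jump, while the estimates for the continuous function tend to zero.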
The following theorem will be useful in establishing Lebesgue's Criterion for Riemann integrability.
Theorem 11.2.3 Let $f : D \to \mathbb{R}$, where $D \subseteq \mathbb{E}^k$, and let $\delta > 0$. Then the set $E = \{x \in D \mid o(f, x) \ge \delta\}$ is a closed set.
Proof: Suppose $x_n \in E$ for all $n \in \mathbb{N}$ and suppose $x_n \to x$ as $n \to \infty$. We need to prove that $x \in E$. For each $\epsilon > 0$ there exists $x_n \in B_\epsilon(x)$, and there exists $r > 0$ such that $B_r(x_n) \subseteq B_\epsilon(x)$. Also,
$$\sup_{y \in D \cap B_r(x_n)} f(y) - \inf_{y \in D \cap B_r(x_n)} f(y) \ge \delta.$$
However, this implies that
$$\sup_{y \in D \cap B_\epsilon(x)} f(y) - \inf_{y \in D \cap B_\epsilon(x)} f(y) \ge \delta,$$
which completes the proof since this is true for all $\epsilon > 0$. ∎
The following theorem is useful when studying the Jacobian Theorem for changes of variables in Riemann integrals on lEn.
Theorem 11.2.4 Let $O \subset \mathbb{E}^n$ be an open set, $B$ a closed block in $O$, and let $\phi$ be in $C^1(O, \mathbb{E}^n)$.
(i) If $S \subset B^\circ$ is a Jordan null set, then $\phi(S)$ is also a Jordan null set.
(ii) If $S \subset B^\circ$ is a Lebesgue null set, then $\phi(S)$ is also a Lebesgue null set.
Proof: Denote $M = \|\phi'\|_{\sup} < \infty$ and let $\epsilon > 0$. We begin by proving item (i). By hypothesis, there exists a finite sequence of closed blocks $B_1, \ldots, B_N$ such that
$$S \subset \bigcup_{k=1}^{N} B_k^\circ \quad \text{and} \quad \sum_{k=1}^{N} \mu(B_k) < \frac{\epsilon}{2 M^n n^{n/2}}.$$
Note that the intersection of two rectangular blocks (with edges parallel to the axes) must again be a rectangular block. Thus without loss of generality we can assume that each block $B_k \subseteq B$. It is easy to see that any closed rectangular block $R$ is contained in a union of finitely many cubes for which the sum of the volumes of the cubes is less than twice $\mu(R)$. This is true even though the edges of $R$ need not be commensurable, because we can slightly enlarge the edges of $R$ to make all the (extended) edges commensurable. Thus, without loss of generality, we can assume that each block $B_k$ used above is actually a cube. If we denote by $D_k$ the diagonal measurement of $B_k$ and $s_k$ its edge-length, then $D_k = s_k \sqrt{n}$ and
$$\mu(B_k) = \frac{D_k^n}{n^{n/2}}.$$
On the other hand, by Theorem 10.3.2, each individual coordinate function of the image $\phi(B_k)$ is limited to at most an interval of length $M D_k$. Hence $\phi(B_k)$ is contained in a cube $B_k'$ such that $\mu(B_k') \le M^n n^{n/2} \mu(B_k)$. By slightly enlarging the cubes $B_k'$ we can assure that $\phi(B_k) \subset (B_k')^\circ$ and that
$$\mu(B_k') \le 2 M^n n^{n/2} \mu(B_k).$$
Thus
$$\phi(S) \subseteq \bigcup_{k=1}^{N} \phi(B_k^\circ) \subset \bigcup_{k=1}^{N} (B_k')^\circ \quad \text{and} \quad \sum_{k=1}^{N} \mu(B_k') < \epsilon.$$
Item (ii) is very similar and is left as an exercise for the reader. ∎
EXERCISES
11.16 † Suppose $E_k \subset \mathbb{E}^n$ is a Lebesgue null set, for each $k \in \mathbb{N}$. Prove that $\bigcup_{k \in \mathbb{N}} E_k$ is also a Lebesgue null set. (Hint: See Exercise 1.88.)
11.17 Let $S \subset \mathbb{E}^n$ be any countable set. Show that $S$ is a Lebesgue null set. (Hint: See Example 1.15.)
11.18 Prove that the union of finitely many Jordan null sets is a Jordan null set. Give an example to show that the union of countably many Jordan null sets need not be a Jordan null set.
11.19 Show that if $F \subset \mathbb{E}^n$ is a finite set, then $F$ is a Jordan null set.
11.20 Prove that the set of rational numbers in the interval $[0, 1]$ is not a Jordan null set, but that it is a Lebesgue null set.
11.21 Give an example of a countable subset of $\mathbb{E}^2$ that is not a Jordan null set.
11.22 † Prove that every Jordan null set is a Lebesgue null set.
11.23 Prove that every convergent sequence $x_n$ is a Jordan null set.
11.24 Prove that an interval $I$ of strictly positive length is never a Lebesgue null set.
11.25 Prove that the limit in Definition 11.2.3 exists. (Hint: Show that the function of $r$ in the definition is monotone increasing on $(0, \infty)$. Compare with Exercise 2.28.)
11.26 Prove that $o(f, x) \ge 0$ for all $x$ and for all $f : D \to \mathbb{R}$, $D \subseteq \mathbb{E}^n$.
11.27 † Prove that the set of points of discontinuity of $f : D \to \mathbb{R}$, $D \subseteq \mathbb{E}^n$, is the set
$$\bigcup_{n \in \mathbb{N}} \left\{ x \in D \,\middle|\, o(f, x) \ge \frac{1}{n} \right\}.$$
11.28 Prove that every subset of a Lebesgue null set is a Lebesgue null set.
11.29 † Suppose that the set $E$ of points of discontinuity of $f : D \to \mathbb{R}$, $D \subseteq \mathbb{E}^n$, is a Lebesgue null set, and suppose that $D$ is a bounded subset of $\mathbb{E}^n$. Let
$$E_k = \left\{ x \in D \,\middle|\, o(f, x) \ge \frac{1}{k} \right\}.$$
Prove that $E_k$ is a Jordan null set. (Hint: Use Theorem 11.2.3.)
11.3 LEBESGUE'S CRITERION FOR RIEMANN INTEGRABILITY
The main objective of this section is to prove the following theorem.
Theorem 11.3.1 (Lebesgue's Criterion for Riemann Integrability) Let $f : B \to \mathbb{R}$ be any bounded function defined on a closed, bounded rectangular block $B \subset \mathbb{E}^n$. Let $S = \{x \in B \mid f \text{ is discontinuous at } x\}$. Then $f \in \mathcal{R}(B)$ if and only if $S$ is a Lebesgue null set.
Proof: First suppose that $f \in \mathcal{R}(B)$. We will prove that $S$ is a Lebesgue null set. By Exercise 11.27 we know that $S = \bigcup_{k \in \mathbb{N}} S_k$, where
$$S_k = \left\{ x \in B \,\middle|\, o(f, x) \ge \frac{1}{k} \right\}.$$
By Exercise 11.16, we see that it suffices to prove for each $k \in \mathbb{N}$ that $S_k$ is a Lebesgue null set. By Exercise 11.22, it suffices to prove that $S_k$ is a Jordan null set. Let $\epsilon > 0$. There exists a partition $P = \{B_1, \ldots, B_p\}$ of $B$ for which
$$U(f, P) - L(f, P) < \frac{\epsilon}{k}.$$
Denote
$$P' \triangleq \bigcup_{i=1}^{p} B_i^\circ$$
and observe that $B \setminus P'$ is a Jordan null set, being a union of degenerate blocks, and so is $S_k \setminus P'$. Thus it suffices to prove that
$$S_k' = S_k \cap P'$$
is a Jordan null set. If $x \in S_k'$, then $x$ lies in the interior of exactly one block $B_i$ of the partition $P$, and on that block we have $M_i - m_i \ge o(f, x) \ge \frac{1}{k}$. Hence
$$U(f, P) - L(f, P) \ge \frac{1}{k} \sum_{B_i \cap S_k' \ne \emptyset} \mu(B_i).$$
It follows that
$$\sum_{B_i \cap S_k' \ne \emptyset} \mu(B_i) < \epsilon,$$
and thus $S_k'$ is a Jordan null set as claimed.
For the opposite direction of implication in the theorem, we suppose that $S$ is a Lebesgue null set, and we must prove that $f \in \mathcal{R}(B)$. By hypothesis, each set $S_k$ is a Lebesgue null set. By Exercise 11.29, each $S_k$ is a Jordan null set. Let $\epsilon > 0$. We need to show there exists a partition $P$ of $B$ such that
$$U(f, P) - L(f, P) < \epsilon.$$
Select $k \in \mathbb{N}$ such that
$$k > \frac{4\mu(B)}{\epsilon}.$$
Without loss of generality, suppose $\|f\|_{\sup} > 0$. Since $S_k$ is a Jordan null set, there exists a finite set of blocks $B_1, \ldots, B_l$ such that $B_i^\circ \cap B_j^\circ = \emptyset$ if $i \ne j$,
$$S_k \subseteq \bigcup_{i=1}^{l} B_i^\circ \quad \text{and} \quad \sum_{i=1}^{l} \mu(B_i) < \frac{\epsilon}{4\|f\|_{\sup}}.$$
Hence
$$\sum_{i=1}^{l} (M_i - m_i)\,\mu(B_i) < \frac{\epsilon}{2}.$$
Let
$$K = B \setminus \bigcup_{i=1}^{l} B_i^\circ$$
be a compact set. For each $x \in K$ we have
$$o(f, x) < \frac{1}{k}.$$
Thus there exists an open block $B_x^\circ$ containing $x$ such that on this block we have
$$M_x - m_x < \frac{2}{k}.$$
There exists a finite subcollection $B_{l+1}, \ldots, B_N$ that covers $K$, by the Heine-Borel Theorem. Without loss of generality, we can now intersect each $B_j$ with $B$ itself to ensure that each $B_j \subseteq B$, and we can refine this set to make a partition, with $B_{j_1}^\circ \cap B_{j_2}^\circ = \emptyset$ for all $j_1 \ne j_2$, and to ensure that each $B_j \subseteq K$. And
$$\sum_{j=l+1}^{N} (M_j - m_j)\,\mu(B_j) < \frac{2}{k}\,\mu(B) < \frac{\epsilon}{2}.$$
Hence $U(f, P) - L(f, P) < \epsilon$, so $f \in \mathcal{R}(B)$. ∎
EXERCISES 11.30 For each of the following functions on IE 1 , determine whether or not the function is Riemann integrable on [0, 1], and prove your conclusion. a) Let f(x) = 1Qn[O,l]• the indicator of the rational numbers in the unit interval. b) Let ifxE(0,1], f(x) = 1,!; if X= 0.
{~in
c) Let
if x E ( n~l'
~], n EN,
if X= 0. (See Fig. 11.2, in which [1/x] denotes the integer part of 1/x.)
11.31 For each of the following functions on JE 2 , determine whether or not the function is Riemann integmble on [0, 1] x 2 , the unit square, and prove your conclusion. a) Let f(x) = 1Qx2n[O,l] x2 (x), the indicator of the rational number pairs in the unit square. b) Let sin _1L_ if Xi E (0, 1], i = 1, 2, f(x) = XtX2 {0 if X1X2 = 0.
Figure 11.2 $(-1)^{[1/x]}$. Note the ambiguity of the plot near $x = 0$.
(See Fig. 11.1.)
11.32 Prove that the product of two Riemann integrable functions on $B \subset \mathbb{E}^n$ is Riemann integrable.
11.33 † Let $B$ be a closed rectangular block and $S \subset B$ a Jordan null set.
a) If $h$ is a bounded function that is zero everywhere on $B$ except on $S$, prove: $h \in \mathcal{R}(B)$ and
$$\int_B h(x)\,dx = 0.$$
b) Suppose $f$ and $g \in \mathcal{R}(B)$ and suppose $f(x) = g(x)$ for all $x \in B \setminus S$. Prove
$$\int_B f(x)\,dx = \int_B g(x)\,dx.$$
c) Let $P = \{B_j \mid j = 1, \ldots, n\}$ be a partition of $B$. Prove that
$$\int_B f(x)\,dx = \sum_{j=1}^{n} \int_{B_j} f(x)\,dx.$$
(Hint: Let $f_j(x) = f(x) 1_{B_j}(x)$ and consider $\sum_{j=1}^{n} f_j$.)
11.34 † Suppose $f(x) \ge 0$ for all $x \in B \subset \mathbb{E}^n$. Suppose $f \in \mathcal{R}(B)$, where $B$ is a bounded rectangular block. Suppose $\int_B f(x)\,dx = 0$. Prove that
$$S = \{x \in B \mid f(x) > 0\}$$
is a Lebesgue null set. (Hint: Show that
$$S_n = \left\{ x \in B \,\middle|\, f(x) \ge \frac{1}{n} \right\}$$
is a Jordan null set.)
11.35 Let $f$ and $g \in \mathcal{R}(B)$ where $B = [0, 1]^{\times n} \subset \mathbb{E}^n$. Extend the domain of $f$, making $f(x)$ identically zero off $B$. Prove that the convolution
$$f * g(x) \triangleq \int_B f(x - y) g(y)\,dy$$
exists for all $x \in \mathbb{E}^n$. (Hint: Use Exercise 11.32.)
11.36 A set $E \subseteq B$, a bounded closed block in $\mathbb{E}^n$, is called Jordan measurable provided that $1_E \in \mathcal{R}(B)$, where the indicator function $1_E$ is defined on page 83. If $E$ is Jordan measurable, we define its Jordan content to be
$$v(E) = \int_B 1_E(x)\,dx.$$
Prove: A bounded set $E \subset \mathbb{E}^n$ is Jordan measurable if and only if its boundary, $\partial E$, is a Lebesgue null set.
11.37 Let $B \subset \mathbb{E}^n$ be a closed, bounded rectangular block, and let $E \subset B$ be a Jordan null set. Prove that $E$ is Jordan measurable, as defined in Exercise 11.36, and that the Jordan content $v(E) = 0$.
11.38 Let $B \subset \mathbb{E}^n$ be a closed, bounded rectangular block, and let $E \subset B$ be a Jordan measurable set for which the Jordan content $v(E) = 0$. Prove that $E$ is a Jordan null set.
11.4 FUBINI'S THEOREM
In an elementary course in the calculus of functions of several real variables, students learn to calculate double and triple integrals by the method of iteration. A theorem attributed to Fubini may have been mentioned without proof, stating that if $B = [a_1, b_1] \times [a_2, b_2]$, then
$$\iint_B f(x, y)\,d(x, y) = \int_{a_1}^{b_1} \left( \int_{a_2}^{b_2} f(x, y)\,dy \right) dx = \int_{a_2}^{b_2} \left( \int_{a_1}^{b_1} f(x, y)\,dx \right) dy.$$
Although this statement is correct for continuous functions on $B$, many Riemann integrable functions are far from continuous. The following example illustrates one of the difficulties that can be encountered. Specifically, it is not clear that for any given fixed value of $x$ the one-variable integral $\int_{a_2}^{b_2} f(x, y)\,dy$ will exist. If it exists, there is still the question of whether or not the outer integral with respect to $x$ exists.
• EXAMPLE 11.1 Let f
: [0, 1] x 2 --> lR be defined by if x 1 =1- 0.5, if x 2 E Ql and if x2
X1
= 0.5,
¢ Q and x1 = 0.5.
We see that
$$\int_0^1 f(0.5, x_2)\,dx_2$$
does not exist. We see also that $\int_0^1 \left( \int_0^1 f(x_1, x_2)\,dx_1 \right) dx_2$ exists and equals $0$. Moreover, Lebesgue's Criterion tells us that $\int_{[0,1]^{\times 2}} f(x)\,dx$ does exist and is easily seen to be equal to $0$. Phenomena of this type motivate the use of upper and lower integrals in the following statement of Fubini's Theorem.
Theorem 11.4.1 (Fubini's Theorem) Let $[a, b]$ be a closed rectangular block in $\mathbb{E}^m$ and let $[c, d]$ be a closed rectangular block in $\mathbb{E}^n$, so that $B = [a, b] \times [c, d]$ is a closed rectangular block in $\mathbb{E}^{m+n}$. Let $f : B \to \mathbb{R}$ be a bounded function. Then, denoting $x \in \mathbb{E}^m$ and $y \in \mathbb{E}^n$, we have
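i.
$$\underline{\int_B} f(x, y)\,d(x, y) \overset{(i)}{\le} \underline{\int_a^b} \left( \overline{\int_c^d} f(x, y)\,dy \right) dx \le \overline{\int_a^b} \left( \overline{\int_c^d} f(x, y)\,dy \right) dx \overset{(ii)}{\le} \overline{\int_B} f(x, y)\,d(x, y),$$
ii.
$$\underline{\int_B} f(x, y)\,d(x, y) \le \underline{\int_a^b} \left( \underline{\int_c^d} f(x, y)\,dy \right) dx \le \overline{\int_a^b} \left( \underline{\int_c^d} f(x, y)\,dy \right) dx \le \overline{\int_B} f(x, y)\,d(x, y).$$
iii. If $f \in \mathcal{R}(B)$, then we have
$$\int_B f(x, y)\,d(x, y) = \int_a^b \left( \underline{\int_c^d} f(x, y)\,dy \right) dx = \int_a^b \left( \overline{\int_c^d} f(x, y)\,dy \right) dx.$$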
Remarks. In this theorem, we could just as well have reversed the order of iteration from $dy\,dx$ to $dx\,dy$. The student will be most familiar with this theorem in the special case in which the blocks $[a, b]$ and $[c, d]$ are replaced by intervals $[a, b]$ and $[c, d]$, so that $m = n = 1$. The theorem as stated above can be used in any Euclidean space $\mathbb{E}^p$ to successively reduce an integral over a $p$-dimensional block to $p$ iterated integrals over intervals. This decomposition is simplest to write if all the lower-dimensional integrals in the iteration exist, so that it is not necessary to resort to lower or upper integrals, which always exist for a bounded integrand on a closed, bounded block. The latter condition is certainly met if $f \in C(B)$.
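For a continuous integrand with $m = n = 1$, the equality of the two orders of iteration can be checked numerically. The sketch below approximates each one-variable integral with a midpoint Riemann sum; the function names and the test integrand $f(x, y) = x y^2$ on $[0, 1] \times [0, 2]$ are assumptions chosen for this example, and the sums are approximations rather than exact integrals.

```python
def midpoint_integral(g, lo, hi, n=200):
    """Midpoint Riemann sum of g over [lo, hi] with n equal subintervals."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def iterated_dy_dx(f, a, b, c, d, n=200):
    """Approximate the integral over x in [a, b] of the integral over y in [c, d]."""
    inner = lambda x: midpoint_integral(lambda y: f(x, y), c, d, n)
    return midpoint_integral(inner, a, b, n)

def iterated_dx_dy(f, a, b, c, d, n=200):
    """Approximate the integral over y in [c, d] of the integral over x in [a, b]."""
    inner = lambda y: midpoint_integral(lambda x: f(x, y), a, b, n)
    return midpoint_integral(inner, c, d, n)

f = lambda x, y: x * y ** 2   # continuous on [0, 1] x [0, 2]; exact integral is 4/3
print(iterated_dy_dx(f, 0.0, 1.0, 0.0, 2.0))
print(iterated_dx_dy(f, 0.0, 1.0, 0.0, 2.0))
```

Both orders give values close to $4/3$. For integrands that are merely Riemann integrable on the block, the inner integral may fail to exist for some values of the outer variable, which is why the theorem is stated with upper and lower integrals.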
Proof: In order to analyze $\underline{\int_B} f(x, y)\,d(x, y)$ and $\overline{\int_B} f(x, y)\,d(x, y)$, we need to consider arbitrary partitions $P$ of $B$. We will show first that actually it suffices to take the supremum and infimum of the lower and upper sums, respectively, over those partitions that arise as Cartesian products of arbitrary partitions $P_1$ of $[a, b]$ and $P_2$ of $[c, d]$ as follows. We denote
$$P_1 \times P_2 = \{R_i \times S_j \mid R_i \in P_1,\ S_j \in P_2\},$$
which is a partition of $B$, though not an arbitrary partition of $B$. However, define orthogonal projections $\pi_1 : \mathbb{E}^{m+n} \to \mathbb{E}^m$ by $\pi_1(x) = (x_1, \ldots, x_m)$ and $\pi_2 : \mathbb{E}^{m+n} \to \mathbb{E}^n$ by $\pi_2(x) = (x_{m+1}, \ldots, x_{m+n})$. If $P$ denotes an arbitrary partition of $B$, define $\pi_1(P)$ to be the partition obtained by using all the nonempty intersections of all the blocks of the form $\pi_1(B')$, where $B' \in P$. Define $\pi_2(P)$ to be the partition obtained by using all the nonempty intersections of the blocks of the form $\pi_2(B')$, where $B' \in P$. Thus $\pi_1(P)$ is a partition of $[a, b]$ and $\pi_2(P)$ is a partition of $[c, d]$. Furthermore, $\pi_1(P) \times \pi_2(P)$ is a refinement of $P$, and refinements raise lower sums and lower upper sums. This justifies the claim that to prove (i), for example, it suffices to begin with a partition $P = P_1 \times P_2$, where $P_1$ is an arbitrary partition of $[a, b]$ and $P_2$ is an arbitrary partition of $[c, d]$.
Part (ii) is very similar to part (i), and part (iii) follows from the first two parts. To prove part (i), we proceed as follows. First we note that there is need to prove only inequalities (i) and (ii). Observe that
$$\overline{\int_{[c,d]}} f(x, y)\,dy = \sum_j \overline{\int_{S_j}} f(x, y)\,dy$$
by Exercise 11.33.c. Next we apply Exercise 11.3 to see that
$$\overline{\int_{[a,b]}} \left( \overline{\int_{[c,d]}} f(x, y)\,dy \right) dx \le \sum_j \overline{\int_{[a,b]}} \left( \overline{\int_{S_j}} f(x, y)\,dy \right) dx = \sum_{i,j} \overline{\int_{R_i}} \left( \overline{\int_{S_j}} f(x, y)\,dy \right) dx \le U(f, P).$$
Thus
$$\overline{\int_{[a,b]}} \left( \overline{\int_{[c,d]}} f(x, y)\,dy \right) dx \le U(f, P)$$
for all partitions $P$ of $B$. Hence
$$\overline{\int_{[a,b]}} \left( \overline{\int_{[c,d]}} f(x, y)\,dy \right) dx \le \overline{\int_B} f(x, y)\,d(x, y),$$
which proves inequality (ii). The proof of inequality (i) is similar and is left to the Exercises. ∎
EXERCISES
Let
11.39
ifx1 tf- Q, if X1 E Q.
Prove: a) b)
f01 (I; j(x1, x2) dx2) f01 (I; f(xb x2) dx1)
dx1 does not exist. dx2 does not exist.
c) /ro, 1] x2 f(x) dx does not exist. Let f: [0, l]x 2
11.40
--?
JR. be defined by if X1 E Q, if X1 tf_ Q.
Prove: a) The function f is not continuous at any point of [0, 1] x 2 except for those at which X2 =~,SO that j tf_ R ([0, ljx 2 ). b)
f01 (I; f(xb x2) dx2)
c) Both
dx1 = 1.
1 (1 11
1 j(x1,x2)dx1) dx2
1
and
1
1
(
j(x1, x2) dx1) dx2
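exist. Find their values.
11.41 Let $\mathbb{P}$ denote the set of all prime numbers. Let
$$S = \bigcup_{p \in \mathbb{P}} \left\{ \left( \frac{m}{p}, \frac{n}{p} \right) \,\middle|\, m \text{ and } n \in \{1, 2, \ldots, p - 1\} \right\}.$$
a) Prove: $S$ is dense in the unit square $B = [0, 1]^{\times 2}$. (Hint: The set $\mathbb{P}$ has no upper bound.)
b) Let $S^c = B \setminus S$ and let $f = 1_{S^c}$, the indicator function of $S^c$. Prove that $f \notin \mathcal{R}(B)$.
c) Prove that
$$\int_0^1 \left( \int_0^1 f(x_1, x_2)\,dx_2 \right) dx_1 \quad \text{and} \quad \int_0^1 \left( \int_0^1 f(x_1, x_2)\,dx_1 \right) dx_2$$
both exist and equal $1$. (Hint: You may assume the uniqueness of factorizations into primes.)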
11.42
Suppose
and
1(1 1
1 J(x1,x2)dx2) dx1
both exist and are equal. Prove that the set of numbers XI for which
does not exist is a Lebesgue null set in the real line. (Hint: Use Exercise 11.34.)
11.43 Prove Inequality (i) of Theorem 11.4.1.
11.44 Prove Parts (ii) and (iii) of Theorem 11.4.1.
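11.45 (Differentiation across the integral sign) Let $B = [a, b] \times [c, d] \subset \mathbb{E}^2$ and denote both $(x, t)$ and $(x, y) \in \mathbb{E}^2$. Suppose that
• $f : B \to \mathbb{R}$.
• $f(\cdot, t) \in \mathcal{R}[a, b]$ for each fixed $t \in [c, d]$.
• $\frac{\partial f}{\partial t} \in \mathcal{R}(B)$.
• $\frac{\partial f}{\partial t}(x, \cdot) \in \mathcal{R}[c, d]$ for each fixed $x \in [a, b]$.
• $\int_a^b \frac{\partial f}{\partial t}(x, t)\,dx \in C[c, d]$ as a function of $t$.
a) Prove that
$$\frac{\partial}{\partial y} \int_a^b f(x, y)\,dx$$
exists and equals
$$\int_a^b \frac{\partial f}{\partial y}(x, y)\,dx.$$
(Hint: Use Fubini's Theorem to prove that the difference
$$\int_a^b f(x, y)\,dx - \int_c^y \left( \int_a^b \frac{\partial f}{\partial t}(x, t)\,dx \right) dt$$
is independent of $y$, and apply the Fundamental Theorem of Calculus.)
b) The Fourier transform of a function $\phi \in \mathcal{R}[0, 1]$ may be defined to be the complex-valued function $\hat{\phi} : \mathbb{R} \to \mathbb{C}$ defined by
$$\hat{\phi}(\alpha) = \int_0^1 \phi(x) \cos(2\pi\alpha x)\,dx - i \int_0^1 \phi(x) \sin(2\pi\alpha x)\,dx.$$
Prove that the derivative
$$\frac{d\hat{\phi}}{d\alpha}(\alpha) = -2\pi i\,\hat{\psi}(\alpha),$$
where $\psi(x) = x\phi(x)$.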
11.46 (Clairaut's Theorem) Suppose $f \in \mathcal{C}^2(D, \mathbb{R})$, where $D$ is an open subset of $\mathbb{E}^2$. Prove that
$$\frac{\partial^2 f}{\partial x_1\, \partial x_2} = \frac{\partial^2 f}{\partial x_2\, \partial x_1}.$$
(Hint: Suppose false. Show by continuity that there is a closed rectangular block $R \subset D$ on which one of the two mixed partial derivatives remains strictly larger than the other. Conclude that the double integral of the first over $R$ is strictly greater than the double integral of the second. Now show the contradiction that both integrals must be equal by integrating each second-order partial derivative over $R$ using Fubini's theorem. Clairaut's theorem can be readily generalized to higher-order mixed partial derivatives.)
11.5
JACOBIAN THEOREM FOR CHANGE OF VARIABLES
In one-variable calculus, the student has learned how to change variables in a single integral. Suppose we are given a surjection (i.e., an onto mapping) $g : [c, d] \to [a, b]$ that is a $C^1$ function with nonvanishing derivative. Since $g'(x)$ is not zero for any value of $x \in [c, d]$, either $g'(x) > 0$ for all $x$ or else $g'(x) < 0$ for all $x$. In either case, $g$ is one-to-one. If $g'(x) > 0$ for all $x$ and if $f \in \mathcal{R}[a, b]$, then $f \circ g \in \mathcal{R}[c, d]$ (why?) and
$$\int_c^d f \circ g(x)\, g'(x)\, dx = \int_a^b f(y)\, dy.$$
On the other hand, if $g'(x) < 0$ for all $x$, then $f \circ g \in \mathcal{R}[c, d]$ and
$$-\int_c^d f \circ g(x)\, g'(x)\, dx = \int_a^b f(y)\, dy.$$
The reader should note the minus sign. In either case, we can write that
$$\int_{g^{-1}[a,b]} f \circ g(x)\, |g'(x)|\, dx = \int_{[a,b]} f(y)\, dy.$$
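As a numerical sanity check of the one-variable formula with $|g'(x)|$, the following Python sketch compares both sides for an orientation-reversing map. The particular choices $g(x) = 3 - 2x$ on $[c, d] = [0, 1]$ (so that $[a, b] = [1, 3]$), $f = \cos$, and the helper `midpoint_integral` are illustrative assumptions and do not come from the text.

```python
import math

def midpoint_integral(func, lo, hi, steps=20000):
    """Approximate the integral of func over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (k + 0.5) * width) for k in range(steps))

# Assumed example data: g maps [c, d] = [0, 1] onto [a, b] = [1, 3]
# with g'(x) = -2 < 0, so g reverses orientation.
def g(x):
    return 3.0 - 2.0 * x

def g_prime(x):
    return -2.0

def f(y):
    return math.cos(y)

# Left side: integral over [c, d] of f(g(x)) * |g'(x)|.
left_side = midpoint_integral(lambda x: f(g(x)) * abs(g_prime(x)), 0.0, 1.0)
# Right side: integral over [a, b] of f(y).
right_side = midpoint_integral(f, 1.0, 3.0)
# Closed form of the right side, for comparison.
exact = math.sin(3.0) - math.sin(1.0)

print(left_side, right_side, exact)
```

The three printed numbers should agree to several decimal places. Dropping the absolute value in `left_side` would change its sign, which is the minus sign the reader was asked to note.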
In this section we will prove a generalization of this change-of-variables theorem to Euclidean spaces of all finite dimensions $n \ge 1$.
Theorem 11.5.1 Let $g \in C^1([c, d], \mathbb{E}^n)$ be a one-to-one nonsingular function defined on a closed rectangular block $[c, d] \subset \mathbb{E}^n$. Suppose $[a, b] \subset g(c, d)$, which is necessarily an open set. Let $f \in \mathcal{R}[a, b]$, and extend $f$ to be identically zero on $g[c, d] \setminus [a, b]$. Then $f \circ g \in \mathcal{R}[c, d]$ and
$$\int_{g^{-1}[a,b]} f \circ g(x) \left| \frac{\partial g}{\partial x} \right| dx = \int_{[a,b]} f(x)\, dx.$$
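Remark 11.5.1 Here we define
$$\int_{g^{-1}[a,b]} f \circ g(x) \left| \frac{\partial g}{\partial x} \right| dx = \int_{[c,d]} f \circ g(x) \left| \frac{\partial g}{\partial x} \right| dx$$
since $f \circ g$ vanishes off $g^{-1}[a, b]$, the boundary of which is a Jordan null set.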
Proof: We observe that since $g$ is a nonsingular $C^1$-function that is invertible globally, the inverse is also a $C^1$-function by Theorem 10.4.3. Note that $f \in \mathcal{R}[a, b]$, and the boundary of $[a, b]$ is a Lebesgue null set, as is the set of points of discontinuity of $f$ itself. Hence the set of points of discontinuity of $f \circ g$ in $[c, d]$ is a Lebesgue null set by Theorem 11.2.4. This shows that $f \circ g \in \mathcal{R}[c, d]$. Let $\epsilon > 0$. By Theorem 11.1.3 we know there exists a partition $P$ for which
$$|U(f, P) - L(f, P)| < \epsilon.$$
Corresponding to Definition 11.1.3 we can define upper and lower step functions $U(x)$ and $L(x)$ as follows. On each block $B_i \in P$ we let $U(x) = M_i$ and $L(x) = m_i$ on the interior $B_i^{\circ}$. On the null set that is the boundary of $B_i$, we let
$$U(x) = \max\{M_j \mid x \in \partial B_j\} \quad \text{and} \quad L(x) = \min\{m_j \mid x \in \partial B_j\}.$$
Thus $L(x) \le f(x) \le U(x)$ for all $x \in g[c, d]$ and
$$\int_{[a,b]} (U - L)(x)\, dx = \int_{g[c,d]} (U - L)(x)\, dx < \epsilon.$$
Now we will suppose that we have already proven the change of variables theorem for indicator functions of closed rectangular blocks. (We will prove this special case of the change of variables theorem as the last stage of the proof of Theorem 11.5.1.) Since the step functions U and L are linear combinations of finitely many indicator functions of blocks (except on a Jordan null set, which does not affect the Riemann integral), this will imply the validity of the change of variables theorem for U and L as well. It follows that
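$$\int_{[a,b]} L(x)\, dx = \int_{g^{-1}[a,b]} L \circ g(x) \left| \frac{\partial g}{\partial x} \right| dx \le \int_{g^{-1}[a,b]} f \circ g(x) \left| \frac{\partial g}{\partial x} \right| dx \le \int_{g^{-1}[a,b]} U \circ g(x) \left| \frac{\partial g}{\partial x} \right| dx = \int_{[a,b]} U(x)\, dx.$$
Hence
$$\left| \int_{g^{-1}[a,b]} f \circ g(x) \left| \frac{\partial g}{\partial x} \right| dx - \int_{[a,b]} f(x)\, dx \right| < \epsilon.$$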
Since $\epsilon > 0$ is arbitrary, the desired equality is established. The proof of the theorem will be complete once we have proven it for the special case of the indicator function of a rectangular block.
Without loss of generality, we may assume that $f = 1_B$, an indicator function of a rectangular block $B = [A, B]$. Since the change of variables theorem is known to be true for $\mathbb{E}^n$ if $n = 1$, we proceed by induction. Thus we assume the theorem is true for $\mathbb{E}^k$ for all $k < n$ and we must establish the theorem for $\mathbb{E}^n$. It is clear that the validity of the theorem is independent of the order in which the coordinate axes are listed in $\mathbb{E}^n$. Since at each point at least one of the partial derivatives $\frac{\partial g_1}{\partial x_j} \ne 0$, we can change the order in which we list the coordinate axes in the domain space of $g$ to ensure that $\frac{\partial g_1}{\partial x_1} \ne 0$. Since $g \in C^1$, we can ensure by making $B$ sufficiently small that $\frac{\partial g_1}{\partial x_1}$ has constant sign on $g^{-1}(B)$. Moreover, the Jacobian $\frac{\partial g}{\partial x}$ must have constant sign on $g^{-1}(B)$ since it cannot vanish and $g^{-1}(B)$ is connected. Thus we can assume without loss of generality (by changing the order of the basis in the domain space) that $\frac{\partial g}{\partial x} > 0$ on $g^{-1}(B)$. In addition to the mapping $g : g^{-1}(B) \to B = [A, B]$, we define a mapping $h$ from $g^{-1}(B)$ onto an image that we call $H$ by letting
$$h(x) = (g_1(x), x_2, x_3, \ldots, x_n).$$
Since $h'(x) = \frac{\partial g_1}{\partial x_1}$ has constant sign on its domain, and since only the first coordinate of $x$ is altered by $h$, we see that $h$ is invertible on $H$ and that the inverse is a $C^1$
function. Now we consider the mapping
$$g \circ h^{-1} : H \to B$$
and we note that
$$g \circ h^{-1}(x) = (x_1, g_2(x), \ldots, g_n(x))$$
so that the first coordinate is unaffected by this composite mapping. We denote by $B_{x_1}$ the cross section of $B$ determined by fixing the value of $x_1$, and we define the cross section $H_{x_1}$ of $H$ in the same way. Note that this cross section of $H$ must be indexed by the subscript $x_1$ since it, unlike $B_{x_1}$, is not independent of $x_1$. Observe also that
$$\left( g \circ h^{-1} \right)' = 1 \cdot \frac{\partial(g_2, \ldots, g_n)}{\partial(x_2, \ldots, x_n)} = \left( g \circ h^{-1} \big|_{H_{x_1}} \right)'.$$
We are ready to complete the proof of the change of variables theorem as follows. We proceed by computing the volume of the box $B$, making use of the hypothesis that the theorem works already in $n - 1$ variables and in 1 variable. We denote by $H_{x_1}$ the set of elements of $H$ with the first coordinate fixed at the value $x_1$. Also, let $\pi$ project $\mathbb{E}^n$ orthogonally onto the hyperplane determined by setting $x_1 = 0$, and denote $H' = \pi(H) = \pi(g^{-1}(B))$, since $h$ alters only the first coordinate.
In the equality labeled (i), we have used the $(n - 1)$-dimensional version of the change of variables formula, noting that $g \circ h^{-1} : H_{x_1} \to B_{x_1}$. In the equality labeled (ii), the change in order of iteration is justified by Fubini's Theorem together with the existence of the Riemann integral on the product space. This integrability is a consequence of the boundary of the domain being a Jordan null set, by Theorem 11.2.4. The inner integral exists except perhaps on a Lebesgue null set in the outer variable, on which either the upper or lower form of the inner integral can be used with no effect on the value of the iterated integral. In the equality labeled (iii), we have used the one-dimensional version of the change of variables formula for upper integrals (Exercise 11.55), bearing in mind that $h^{-1}$ alters only the first coordinate of $x$. Also, we use here the definition that the Riemann integral of $f$ over a bounded set is the Riemann integral of a trivial extension of $f$ to any closed rectangular block containing that set. ■
• EXAMPLE 11.2
Consider polar coordinates, for which $g : (r, \theta) \to (x_1, x_2)$ where
$$x_1 = r\cos\theta \quad \text{and} \quad x_2 = r\sin\theta.$$
Thus $g$ maps the rectangle $[0, a] \times [0, 2\pi]$ onto a punctured circular disk of radius $a$. The area of this disk is
$$\int_0^a \int_0^{2\pi} |g'(r, \theta)|\, d\theta\, dr = \pi a^2$$
since $g'(r, \theta) = r$, as the reader should verify easily.
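The computation in Example 11.2 can also be checked numerically. The sketch below forms the Jacobian determinant of $(r, \theta) \mapsto (r\cos\theta, r\sin\theta)$ from its four partial derivatives and sums its absolute value over a grid on $[0, a] \times [0, 2\pi]$. The function name `polar_area`, the grid sizes, and the radius $a = 2$ are assumptions made only for illustration.

```python
import math

def polar_area(radius, r_steps=400, theta_steps=400):
    """Midpoint Riemann sum of |det g'(r, theta)| over [0, radius] x [0, 2*pi]."""
    dr = radius / r_steps
    dtheta = 2.0 * math.pi / theta_steps
    total = 0.0
    for i in range(r_steps):
        r = (i + 0.5) * dr
        for j in range(theta_steps):
            theta = (j + 0.5) * dtheta
            # Partial derivatives of x1 = r cos(theta), x2 = r sin(theta).
            dx1_dr, dx1_dtheta = math.cos(theta), -r * math.sin(theta)
            dx2_dr, dx2_dtheta = math.sin(theta), r * math.cos(theta)
            # Determinant of the 2 x 2 Jacobian matrix; it simplifies to r.
            jacobian = dx1_dr * dx2_dtheta - dx1_dtheta * dx2_dr
            total += abs(jacobian) * dr * dtheta
    return total

a = 2.0  # assumed radius, chosen only for illustration
approx_area = polar_area(a)
print(approx_area, math.pi * a ** 2)
```

Because the integrand $r$ is linear in $r$ and constant in $\theta$, the midpoint sum agrees with $\pi a^2$ up to rounding error.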
EXERCISES
11.47 If $f \in \mathcal{R}[a, b]$ and if we have a surjection $g \in C^1([c, d], [a, b])$ such that $g'$ is nonvanishing on $[c, d]$, prove that $f \circ g \in \mathcal{R}[c, d]$.
11.48 In spherical coordinates, $g(\rho, \theta, \phi) = (x_1, x_2, x_3)$, where
$$x_1 = \rho \sin\phi \cos\theta, \quad x_2 = \rho \sin\phi \sin\theta, \quad \text{and} \quad x_3 = \rho \cos\phi.$$
Calculate $g'(\rho, \theta, \phi)$ and use it to find the volume of a sphere of radius $a$.
11.49
11.50 Find the Jacobian of the transformation that corresponds to cylindrical coordinates in elementary calculus.
11.51 Let $S \subset \mathbb{E}^n$ be any Jordan measurable set and let $T \in \mathcal{L}(\mathbb{E}^n)$. Prove that $T(S)$ is Jordan measurable, and that
$$\mu(T(S)) = |\det(T)|\, \mu(S).$$
11.52 Let $T : \mathbb{E}^3 \to \mathbb{E}^3$ be the affine transformation defined by
$$T(x_1, x_2, x_3) = (x_1 + a,\; x_2 + b,\; x_3 + c + a x_2)$$
where $a$, $b$, and $c$ are real constants. Find the volume of $T(B_1(0))$, where $B_1(0)$ is the unit ball around the origin.
11.53 Prove that the change of variables theorem remains valid even if the transformation $g$ has the property that $g'$ vanishes on a Jordan null set. (Hint: What can you say about $g(S)$, if $S$ is a Jordan null set?)
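11.54 Let $B_1(0)$ denote the closed ball of radius 1 centered at $0 \in \mathbb{E}^n$. Let
$$I_n = \int_{B_1(0)} \frac{1}{\|x\|^2}\, dx,$$
which we define by
$$I_n = \lim_{r \to 0+} \int_{\{x \in \mathbb{E}^n \,\mid\, r \le \|x\| \le 1\}} \frac{1}{\|x\|^2}\, dx.$$
For parts (a) and (b) below, use polar or spherical coordinates.
a) Prove that $I_n = \infty$ if $n = 1$ or if $n = 2$.
b) Prove that $I_3 < \infty$.
c) Introduce spherical coordinates in $\mathbb{E}^n$ as follows. Let $\rho = \|x\|$. Let
$$\phi_n = \cos^{-1} \frac{x_n}{\rho} \quad \text{and} \quad \rho_{n-1} = \rho \sin\phi_n$$
for $0 \le \phi_n \le \pi$. Let
$$\phi_{n-1} = \cos^{-1} \frac{x_{n-1}}{\rho_{n-1}} \quad \text{and} \quad \rho_{n-2} = \rho_{n-1} \sin\phi_{n-1}$$
for $0 \le \phi_{n-1} \le \pi$, ..., and let
$$\phi_3 = \cos^{-1} \frac{x_3}{\rho_3} \quad \text{and} \quad \rho_2 = \rho_3 \sin\phi_3$$
for $0 \le \phi_3 \le \pi$. Let $x_1 = \rho_2 \cos\theta$, $x_2 = \rho_2 \sin\theta$, with $0 \le \theta < 2\pi$. Prove that
$$\frac{\partial(x_1, \ldots, x_n)}{\partial(\rho, \theta, \phi_3, \ldots, \phi_n)}$$
is $\rho^{n-1}$ times a polynomial in the sines and cosines of $\theta, \phi_3, \ldots, \phi_n$.
d) Prove that if $n > 3$, then $I_n < \infty$.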
11.55 Let $g \in C^1((a, b), \mathbb{E}^1)$ be nonsingular (meaning $g'$ is nowhere zero) and let $S \subseteq [c', d'] \subset g(a, b)$ be any bounded subset of $g(a, b)$. Let $f$ be any bounded real-valued function on $S$. Prove:
$$\overline{\int_S} f(x)\, dx = \overline{\int_{g^{-1}(S)}} f \circ g(x)\, |g'(x)|\, dx.$$
(Hint: Interpret $f$ as being identically zero off $S$. Consider any partition $P$ of $[c', d'] = g[c, d]$, and compare upper sums over both domains.)
11.56 Fix any $a \in \mathbb{E}^n$ and define $T \in \mathcal{L}(\mathbb{E}^n)$ by $T(x) = x + a$. If $B$ is any closed finite box in $\mathbb{E}^n$ and if $f \in \mathcal{R}(B)$, then $f \in \mathcal{R}(T^{-1}B)$ and
$$\int_B f(x)\, dx = \int_{T^{-1}B} f(Tx)\, dx.$$
That is, the Riemann integral on $\mathbb{E}^n$ is invariant under all translations.
11.57 Denote by $O(n)$ the set of all $n \times n$ matrices $A$ for which $AA^t = I$, the identity matrix, where $A^t$ denotes the transpose of the matrix $A$. Prove:
a) $O(n)$ is a group under the operation of matrix multiplication. This is called the orthogonal group on $\mathbb{E}^n$.
b) $|\det A| = 1$ for all $A \in O(n)$.
c) If $B$ is any closed finite box in $\mathbb{E}^n$ and if $f \in \mathcal{R}(B)$, then $f \in \mathcal{R}(A^{-1}B)$ and
$$\int_B f(x)\, dx = \int_{A^{-1}B} f(Ax)\, dx.$$
That is, the Riemann integral on $\mathbb{E}^n$ is invariant under orthogonal transformations.
11.6 TEST YOURSELF
EXERCISES
11.58 Give an example of a function $f : B \to \mathbb{R}$ for which $\underline{\int_B} f = 0$ and $\overline{\int_B} f = 1$, where $B = [0, 1]^{\times 2}$.
Give an example of functions
f
and g mapping B ---+ JR. such that
where B = [0, 1]x 2 •
11.60 Give an example of a function f : B ---+ JR. for which integrable but f- is Riemann integrable, where B = [0, 1] x 2 . 11.61
f
is not Riemann
Let the linear map $T : \mathcal{R}[a, b] \to \mathbb{R}$ be defined by
$$T(f) = \int_{[a,b]} f(x)\, dx.$$
Find $\|T\|$.
11.62 Let $f = 1_{\mathbb{Q} \cap [0,1]}$, the indicator function of the set of all rational numbers in the unit interval of $\mathbb{E}^1$. Find the set of points of discontinuity of $f$.
11.63 Let $S = \{x \in \mathbb{E}^2 \mid x_1 \notin \mathbb{Q},\ x_2 \notin \mathbb{Q}\}$. Is $S$ Jordan measurable?
11.64 True or False: The set $\left\{ (-1)^n + \frac{1}{n} \,\middle|\, n \in \mathbb{N} \right\}$ is a Jordan null set in $\mathbb{E}^1$.
11.65 Let
$$f(x) = \begin{cases} \sin\frac{1}{x}, & x \ne 0, \\ 0, & x = 0. \end{cases}$$
Find the oscillation $o(f, 0)$.
11.66 Find the Jordan measure of the set of points of discontinuity of the function shown in Fig. 11.1.
11.67 Suppose $F(x_1, x_2) = f(x_1)\, 1_{\mathbb{Q}}(x_2)$. Find all functions $f(x_1)$ for which $F \in \mathcal{R}[0, 1]^{\times 2}$.
11.68 Let $f(x) = (-x, -y, xy - z)$ for all $x \in \mathbb{E}^3$.
a) Find the matrix $[f'(x)]$.
b) Find the volume of the image of a sphere of radius 2 under the function $f$ in this exercise.
APPENDIX A SET THEORY
A.1
TERMINOLOGY AND SYMBOLS
When the author was young, it was easy for a professor to know whether or not students were familiar with set-theoretic symbols and language from high school. The answer was no. Thus professors took care to explain what these terms mean. Today it is less clear. High-schools teach some set theory, but maybe not very much. And since students seldom get to use this language in their high-school problem solving, they may not have developed facility with it. The same is true with introductory college courses. There is likely to be some use of set theory, but it is not clear what background is common to all students. The formal study of set theory refers particularly to the study of infinite sets (also called transfinite sets), and the reader is referred to [11] for a deep and serious study of that subject. Here we begin by defining the commonly used symbols that appear in this text and giving a few illustrative examples and theorems.
For reasons that we will sketch briefly in Example A.7, logic requires us to limit the frame of reference in any set-theoretic discussion to some universal set, which is chosen for convenience or suitability to a given purpose. In the study of sets in the abstract, this universal set is commonly designated by the letter $X$. It is common to designate a subset $A$ of $X$, denoted by $A \subset X$, by writing that $A$ is the set of all elements of $X$ that possess some property.
• EXAMPLE A.1
Let $A = \{x \in \mathbb{R} \mid -1 \le x \le 1\}$. This is read: $A$ is the set of all those real numbers $x$ that have the property that $x$ lies between $-1$ and $1$ inclusive. It is common to write the set $A$ in interval notation for the real numbers as
$$A = [-1, 1].$$
Some authors use the symbol $A \subset X$ to mean proper subset, which means that $A$ is contained in $X$ but $A$ is not equal to $X$. The author of this book will usually write $A \subsetneq X$ for this, however. If it is important to stress the possibility of a subset of $X$ being equal to $X$ itself, the author will often write $A \subseteq X$. Two very important operations between sets are union and intersection.
Definition A.1.1 The union of $A$ and $B$, denoted by $A \cup B$, is the set of all $x \in X$ such that either $x \in A$ or $x \in B$. This can be written as $A \cup B = \{x \in X \mid x \in A \text{ or } x \in B\}$. The intersection of $A$ and $B$ is
$$A \cap B = \{x \in X \mid x \in A \text{ and } x \in B\}.$$
Note that in mathematics we use the word or in the inclusive sense, not in the exclusive sense. Thus $x \in A \cup B$, provided that $x \in A$, $x \in B$, or both.
• EXAMPLE A.2
Continuing Example A.1, in which $A = [-1, 1] \subset \mathbb{R}$, we can define $B = \{x \in \mathbb{R} \mid 0 \le x < 2\} = [0, 2)$. Then we have $A \cup B = [-1, 2)$ in interval notation, and $A \cap B = [0, 1]$.
Definition A.1.2 If $A \cap B = \emptyset$, the empty set, we call $A$ and $B$ disjoint sets.
Another important operation is the difference between two sets.
Definition A.1.3 The difference between two sets $A$ and $B$, denoted by $A \setminus B$, is the set of all those elements of $A$ that are not in $B$.
• EXAMPLE A.3
Continuing Example A.2, we have $A \setminus B = [-1, 0)$.
Definition A.1.4 Within the context of a universal set $X$, we denote $X \setminus A = A^c$, which is called the complement of $A$. Thus the complement of a set has meaning only within the context of a previously specified universal set.
• EXAMPLE A.4
Continuing Example A.2, the set $A^c = (-\infty, -1) \cup (1, \infty)$, provided we are working with reference to the universal set $\mathbb{R}$.
A different but sometimes useful concept of subtraction of two sets is the following.
Definition A.1.5 The symmetric difference of two sets $A$ and $B$ is denoted and defined as follows: $A \triangle B = (A \setminus B) \cup (B \setminus A)$.
• EXAMPLE A.5
In terms of Example A.2, $A \triangle B = [-1, 0) \cup (1, 2)$.
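The operations defined so far can be tried out on finite sets, where they can be listed explicitly. The following Python sketch is only an illustration: the universal set $X = \{0, 1, \ldots, 9\}$ and the sets $A$ and $B$ are assumed example data, not the intervals of Example A.2.

```python
# Assumed finite example data inside the universal set X = {0, 1, ..., 9}.
X = set(range(10))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

union = A | B                  # union of A and B
intersection = A & B           # intersection of A and B
difference = A - B             # elements of A that are not in B
complement_of_A = X - A        # complement of A relative to X
# Symmetric difference, written out exactly as in Definition A.1.5.
symmetric_difference = (A - B) | (B - A)

print(sorted(union))                  # [1, 2, 3, 4, 5, 6]
print(sorted(intersection))           # [3, 4]
print(sorted(difference))             # [1, 2]
print(sorted(complement_of_A))        # [0, 5, 6, 7, 8, 9]
print(sorted(symmetric_difference))   # [1, 2, 5, 6]
```

Python also provides the operator `A ^ B` for the symmetric difference; on this data it gives the same set as the expression built from Definition A.1.5.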
Theorem A.1.1 $A \triangle B = (A \cup B) \setminus (A \cap B)$.
Proof: This proof illustrates a frequently useful approach to proving that two sets, $A \triangle B$ and $(A \cup B) \setminus (A \cap B)$, are the same. The plan is to show that each of the latter two sets is a subset of the other. We will begin by proving that
$$A \triangle B \subseteq (A \cup B) \setminus (A \cap B).$$
By Definition A.1.5, if $x \in A \triangle B$, either $x \in A$ but $x \notin B$, or vice versa. (Observe that in this case the union that defines $A \triangle B$ does not permit $x$ to be in both sets.) If $x \in A$ but $x \notin B$, then $x \in A \cup B$ but $x \notin A \cap B$. Hence $x \in (A \cup B) \setminus (A \cap B)$. On the other hand, if $x \in B$ but $x \notin A$, we can still conclude that $x \in (A \cup B) \setminus (A \cap B)$. Therefore, $A \triangle B \subseteq (A \cup B) \setminus (A \cap B)$ as claimed. We leave it to the reader to verify by similar reasoning that
$$A \triangle B \supseteq (A \cup B) \setminus (A \cap B).$$
■
In this book, we need to apply the concepts of union and intersection to infinite families of sets, that is, to infinite sets of sets. We will clarify this with an example.
• EXAMPLE A.6
Suppose to each $x \in \mathbb{R}$ we associate an interval $A_x = (x - 1, x + 1)$. If $S \subset \mathbb{R}$ we can define the union over $x \in S$ of the sets $A_x$ to be the set of all those real numbers that lie in at least one of the sets $A_x$ with $x \in S$. Thus in the example
$$\bigcup_{x \in [0,3]} A_x = \bigcup \{A_x \mid x \in [0, 3]\} = (-1, 4).$$
Similarly, the intersection of a set of sets is the set of all those elements that belong simultaneously to every one of the sets. Continuing the example of this paragraph, we have
$$\bigcap_{x \in [0, 1.5]} A_x = (0.5, 1).$$
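Theorem A.1.2 (De Morgan's Laws) Suppose for each $s \in S$, a set of indices, there is associated a set $A_s \subseteq X$. Then
i. $\left( \bigcup_{s \in S} A_s \right)^c = \bigcap_{s \in S} A_s^c$
ii. $\left( \bigcap_{s \in S} A_s \right)^c = \bigcup_{s \in S} A_s^c$
Proof: To prove the first equality, we will prove that the set that is written as the left side is contained in the set that is the right side. Then we will prove the reverse inclusion, thereby establishing equality. So suppose that $x \in \left( \bigcup_{s \in S} A_s \right)^c$. This tells us that for all $s \in S$ we have $x \notin A_s$. This tells us in turn that $x \in A_s^c$ for each $s \in S$. Consequently, $x \in \bigcap_{s \in S} A_s^c$, which establishes that
$$\left( \bigcup_{s \in S} A_s \right)^c \subseteq \bigcap_{s \in S} A_s^c.$$
To prove the reverse inclusion, suppose that $x \in \bigcap_{s \in S} A_s^c$. This means that $x \in A_s^c$ for each $s \in S$. Hence $x \notin A_s$ for any $s \in S$. Thus $x \in \left( \bigcup_{s \in S} A_s \right)^c$. This tells us that
$$\bigcap_{s \in S} A_s^c \subseteq \left( \bigcup_{s \in S} A_s \right)^c,$$
which completes the proof of equality of the left and right sides of the first law. We leave the second law to the reader to prove as Exercise A.1. ■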
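Both laws can be checked mechanically on a small finite family, which may help the reader see what each side of the equalities contains. In the sketch below, the universal set $X = \{0, 1, \ldots, 11\}$, the index set $S = \{0, 1, 2\}$, and the three sets $A_s$ are assumed example data chosen only for illustration; a check on one family is of course not a proof.

```python
# Assumed universal set and indexed family (A_s) with index set S = {0, 1, 2}.
X = set(range(12))
family = {
    0: {0, 1, 2, 3},
    1: {2, 3, 4, 5},
    2: {3, 5, 7, 11},
}

def complement(subset):
    """Complement relative to the universal set X."""
    return X - subset

# Union and intersection of the whole family.
union_all = set().union(*family.values())
intersection_all = set(X).intersection(*family.values())

# Union and intersection of the complements A_s^c.
union_of_complements = set().union(*(complement(A_s) for A_s in family.values()))
intersection_of_complements = set(X).intersection(
    *(complement(A_s) for A_s in family.values()))

first_law_holds = complement(union_all) == intersection_of_complements
second_law_holds = complement(intersection_all) == union_of_complements
print(first_law_holds, second_law_holds)   # True True
```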
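EXERCISES
A.1 Prove that
$$\left( \bigcap_{s \in S} A_s \right)^c = \bigcup_{s \in S} A_s^c.$$
A.2 If a set $E_\alpha$ is a closed subset of $\mathbb{R}$ for each $\alpha \in A$, an index set, prove that $\bigcap_{\alpha \in A} E_\alpha$ must be a closed set.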
A.2
PARADOXES
We turn next to a brief introduction to some of the logical problems that can arise in set theory and in the logical use of language.
• EXAMPLE A.7
We explain in this example why it is necessary to fix a frame of reference, called a universal set, at the beginning of any logical discussion involving set theory. At first glance it might seem reasonable to define a set $S = \{A \mid A \text{ is a set}\}$. Thus we appear to have defined an extraordinarily big set $S$ that has as its elements every set that exists. In other words, $A \in S \Leftrightarrow A$ is a set. (This is a concept that is reminiscent of the optimistic concept of a war to end all wars.) We will explain now why there can be no such set as $S$. We will show that the alleged definition of $S$ introduces a paradox, known as Cantor's paradox, with the result that the claimed definition is not a legitimate definition but is only a misleading sequence of words. In order to do this we must use a theorem that we will not prove here, but which the reader can find in the book [11] by E. Kamke. We begin with a (legitimate!) definition.
Definition A.2.1 The power set of any set $X$, denoted by $\mathcal{P}(X)$, is defined to be the set of all subsets of $X$. Thus $A \in \mathcal{P}(X)$ if and only if $A \subseteq X$.
If $X$ happens to be a finite set, with $n$ elements, it can be proved that the number of elements in $\mathcal{P}(X)$ must be $2^n$. (The reader can find this fact in [11] or in [2].) However, there is a more difficult theorem in set theory that states the following.
Theorem A.2.1 For every set $X$, $\mathcal{P}(X)$ has more elements than $X$ itself.
For a proof, the ambitious reader can consult Kamke's book [11]. (The proof is a powerful generalization of the method the reader will see on a simpler level in Cantor's Theorem 1.8.1.) This theorem is true even when $X$ is an infinite set, and it means that it is not possible to pair each element of $\mathcal{P}(X)$ one-to-one with the elements of $X$ itself or with the elements of any subset of $X$ itself. Now consider the alleged set $S$ defined in the present Example. Since each element of $\mathcal{P}(S)$ is itself a set in its own right, the elements of $\mathcal{P}(S)$ are all elements of $S$ itself! This means that $\mathcal{P}(S) \subseteq S$ and the elements of $\mathcal{P}(S)$ can certainly be paired with themselves on a one-to-one basis. This contradicts Theorem A.2.1, which concludes the proof that there cannot exist such a set as
$S$.
As a consequence of logical problems such as are illustrated in Example A.7, mathematicians begin set-theoretic reasoning with some set that exists, and new sets are introduced using only legitimate operations of set theory, such as forming the power set of a given set. Not every sequence of words is a legitimate definition. Any subject in mathematics begins with definitions and axioms, and within that framework theorems are proven. A theorem must be a sentence that can be called a proposition in the sense of logic. A proposition is a sentence that must be either true or false within the context of the given axiomatic system. (That there is no in-between status for a proposition is sometimes called the law of the excluded middle.) That not every sentence is a proposition was explained in an amusing way by Bertrand Russell, a very distinguished twentieth-century mathematical logician and philosopher.
• EXAMPLE A.8
Here we present Russell's Paradox. A (male) barber puts a sign in his shop window advertising as follows: I will shave all those men and only those men who do not shave themselves. Although the claim is ambitious and may provoke skepticism, it is probably not apparent at first sight that the barber's sign bears only a meaningless sequence of words, and not a valid proposition. The problem is this: Must the barber shave himself? If he shaves himself, his own advertisement prohibits him from doing so. But if he does not shave himself, his advertisement requires him to do so. Here is a more serious version of Russell's Paradox for the purposes of mathematics. Let $S = \{A \mid A \notin A\}$. In words, $S$ is the set of all those sets that are not elements of themselves. The reader may be wondering about an example of a set that is an element of itself. Try this one: $B = \{A \mid A \text{ is not an Edsel}\}$. It seems clear that $B \in B$ since the set $B$ is not an Edsel, at least not literally. There is a logical paradox regarding the proposed set $S$. The question is this: Is $S \in S$?
If the answer is yes, then $S$ cannot be in $S$. But if the answer is no, then $S$ must be in $S$. A rigorous treatment of symbolic logic was undertaken by Russell and many others during the twentieth century in order to develop formal methods of avoiding such paradoxes by excluding the possibility of constructing such a definition as the one we attempted to give in this example for the set $S$. The student may have observed that mathematics professors try to be very careful in the use of language and that we tend to be quite critical of the student's mathematical writing. Centuries of experience have proven this to be a wise practice.
PROBLEM SOLUTIONS
SOLUTIONS FOR CHAPTER 1
1.8 We can use any value of 8 satisfying the double inequality 0 < 8 < ~· 1.24.b This sequence is not bounded (because of the Archimedean property) and so it cannot be Cauchy. 1.24.d This sequence is not Cauchy because it repeats the values 0 and 1 infinitely often. Thus beyond each 0 there is a 1, and beyond each 1 there is a 0. No matter how big N is, there will be n and m greater than N such that Xn - Xm = 1 - 0 = 1 which cannot be made less than € if 0 < t: < 1. 1.31.a sup( B)
= 1 and inf(S) = -1.
1.31.b sup( B) = oo and inf(S) = -oo. 1.41 lim sup Xn
= 1 and lim inf Xn =
1.48 For example, Xn
= a + n,
Yn
1.50 For example,
= an,
= n.
Xn
1.63 For example, let Yi
=
Yn
~ (1
-1. The limit does not exist.
= n.
+ (-l)n ). 365
1.68 For example, if welet (an, bn) = (- ~, ~), then
1.79 The empty set $\emptyset$ is open because it is true (in a vacuous sense) that for each $x \in \emptyset$ there is an $r > 0$ such that $(x - r, x + r) \subseteq \emptyset$. This is true because there is no element $x$ in $\emptyset$, so that the claim is true for each (nonexistent) element of $\emptyset$. 1.81 For example, $\mathcal{O} = \{(-n, n) \mid n \in \mathbb{N}\}$ is such an open cover. The reader should take care not to confuse the set $\mathcal{O}$ with the union of its member intervals! We leave it to the reader to prove that $\mathcal{O}$ is an open cover and has the claimed property. 1.83 For example, let 0 reader.
= {( 2~, 2) I n
1.92.c One example is
0, the empty set.
1.98 We can take 8 =
2 0•
6
E
N}.
We leave the necessary proofs to the
Can you give another example?
The idea is to use the triangle inequality.
1.99.a True.
1.99.b False since the sequence is unbounded.
1.100 For example, we can use $x_n = y_n = (-1)^n$.
1.101 We have $\liminf x_n = 0$ and $\limsup x_n = 2$. Most errors with this question result from forgetting the meaning of $\limsup x_n = \lim_{n \to \infty} \sup(T_n)$, where $T_n$ is the $n$th tail of the sequence.
1.102 For example, we can use $x_n = (-1)^n n$ and $y_n = (-1)^{n+1} n$.
1.103 Counterexample: Let $x_n = (1 + (-1)^n)\, n$. Then the subsequence $x_{2n-1}$ is convergent.
1.104 For example, we can use (an, bn)
= (0, ~)for all n.
1.105 True. 1.106 For example, one can use On
=
(~, 2).
1.107 False. For example, there is a sequence of rational numbers that converges to
v'2. 1.108 This is true since the set S is the union of countably many finite sets. 1.109 False. The setS is indexed by the set Q, which is countable. 1.110 We can take N > ~· Such N exist because of the Archimedean property of the real number system.
1.111 We give a counterexample. The constant sequence Xn = 1 is convergent, and the sequence Yn = ( -1 )n is bounded. But XnYn is divergent. 1.112
0.
1.113 For example, we can let 0 = { 1.114 True, since Ec
(~, 1) n EN}· J
= UnEN ( n~ 1 , ~) U ( -oo, 0) U ( (1, oo ), a union of open sets.
SOLUTIONS FOR CHAPTER 2
2.4 Each real number $x \in \mathbb{R}$ is a cluster point of the set $\mathbb{Q}$ because the set $\mathbb{Q}$ is dense in the set $\mathbb{R}$.
2.13.b The limit is $n a^{n-1}$.
2.25 $c = n a^{n-1}$. The reader should prove this claim.
2.45 $f(x) = x^2$ is uniformly continuous on $(0, 1)$. The reader should explain why.
2.58.b For example, let $f(x) = \tan x$ for all $x \in \left( -\frac{\pi}{2}, \frac{\pi}{2} \right)$.
2.59.b $\|f\|_{\sup} = \infty$.
2.64.a $\|f_n\|_{\sup} = e^{-1}$.
2.68.d The sequence of functions does not converge uniformly. The student should take care to distinguish this case from the one that precedes it.
2.71 The set is countable. Write $\mathbb{Q} = \{r_i \mid i \in \mathbb{N}\}$ and arrange the set of all possible functions $f_{a,b}$ as
$$\bigcup_{i \in \mathbb{N}} \{f_{r_i, r_j} \mid j \in \mathbb{N}\}.$$
Then invoke the fact that the union of countably many countable sets is countable. 2.72 True. Pick a suitable open interval for each rational number that lies in the given open set. 2.73 Counterexample:
f(x) =
{~in~
if 0 <X< 1, if X= 0.
The indicated limit fails to exist and thus is not equal to any number L. 2. 74 The set JR. is the set of all cluster points of the set Q.
2.75 True. Apply the Intermediate Value theorem to $g(x) = f(x) - \sqrt{x}$ on $[0, 1]$.
2.76 For example, let fn(x) = logx- ~· 2.77 For example, f(x) =sin~ on (0, 1). This was a homework problem.
2.78.a This is not a vector space because the difference between two such polynomials can have non-odd degree.
2.78.b Yes, this is a vector space. 2.79
$\|f\|_{\sup} = 1$.
2.80.a True, since $f_n$ converges pointwise on $\mathbb{R}$ because $f_n(x) \to 0$ for each fixed $x$ as $n \to \infty$.
2.80.b False since $f_n \to 0$ pointwise but $\|f_n - 0\|_{\sup} = \infty$ and does not approach 0 as $n \to \infty$.
2.81 You can use the derivative from elementary calculus for this exercise, to deterfor all n E N. mine that 11/nllsup =
;e
2.82.a True. 2.82.b False. 2.82.c False. It is very important to distinguish this case from the preceding one. 2.83 For example, let Xn = 4 n~l. 2.84 We can use 8 =
~·
2.85 True, because this function can be extended to be continuous on the closed, finite interval [0, 1].
2.86 We give a counterexample. Let
$$f(x) = \begin{cases} x & \text{if } x \le 0, \\ x + 1 & \text{if } x > 0. \end{cases}$$
2.87 We can use $\delta = \epsilon^2$ because $|\sqrt{x} - \sqrt{a}| \le \sqrt{|x - a|}$.
2.88.a True.
2.88.b True.
2.88.c False.
SOLUTIONS FOR CHAPTER 3
3.12 $\lim_{n \to \infty} \frac{2}{n} \sum_{k=1}^{n} \cos\left( 1 + \frac{2k}{n} \right) = \int_1^3 \cos x\, dx = \sin 3 - \sin 1$.
3.17 For example, we could choose P = {0, 0.95, 1.05, 2}. 3.33.a fn(x) converges to 1 if and only if x E Qn [0, 1]. Proofs are left to the reader. 3.33.b fn(x) 3.35 limn---.oo
----t
0 for all x E [0, 1]. Proofs are left to the reader.
J12 1 + (1+;2 )n dx ----> 1. Proofs are left to the reader.
3.45 For example, let f be the indicator function of a single point. 3.56 For example, let g(x) 3.59.a We find that f(x) 3.59.b llfnllsup
= 2 }n
3.59.c Thus f n
---->
= csin x for some constant c =1- 0.
= 0 for all x
----t
0 as n
----t
E
R
00.
0 uniformly on R
3.60.a For the first interval the answer is yes. In fact lfn(x) - 01 n ----t oo, provided that -1 < x :::; 1.
=
lxln
---->
0 as
3.60.b For the second interval, the answer is no since the sequence ( -1 )n diverges. 3.61 We can pick 8
=f.
Then
II PII < 8 implies that IP(f, {xi})- 31
3.62 This is the limit of a sequence of Riemann sums leading to $\int_1^3 \cos(x)\, dx = \sin(3) - \sin(1)$.
3.63 For example, we can let $P = \{0, .99, 1.01, 2\}$, a four-point partition of $[0, 2]$.
3.64 We give a counterexample. Let
f(x) = { 3.65 IT(!) I :::; llfllsup 1- cos 1. 3.66 We can take K 3.67.a
E Q n [0, 1],
1
if
-1
if X E [0, 1] \ Q.
J01 sin(x) dx =
X
(1- cos 1)11/llsup• so we can take K
.
= f 1e x1 dx = 2 .
f01 f(x) dx = ! and f01 f(x) dx = -!.
=
3.67.b False. 3.68 True since R[a, b] is a vector space, so that lie in R[a, b] . 3.69.a
(f+g)t(f-g)
and
(f+g);(f-g)
both
C[a, b] is a vector space.
3.69.b Pw(.IR) is not a vector space. For example, it has no zero vector.
;e.
3.70 llfnllsup = One can use the derivative from elementary calculus for this problem to determine the existence and value of the maximum value of fn· 3.71 limn_, 00
J;/
2 cosn 4
x dx = 0 since the integrand converges uniformly to zero.
3.72 This is false because T(f +g) 3.73
f01 f(x) dx =
~and
-=1-
Tf
+ Tg, if g =-f.
J; f(x) dx = :/.
SOLUTIONS FOR CHAPTER 4
4.32.c $\int_{-1}^{1} f(x)\, dx = 2\sin(1)$.
4.45 $\lim_{h \to 0} \frac{P(x + 3h) + P(x - 3h) - 2P(x)}{h^2} = 9P''(x)$.
4.53 True. Let $f(x) = x - \ln(1 + x)$. We see that $f(0) = 0$ and $f'(x) > 0$ for all $x > 0$. Thus the Mean Value theorem tells us that $f$ is increasing and $f(x) > 0$ for all $x > 0$.
4.54 For such an example we can use
F(x) = { ~2sin
(;b)
if 0 <X~ 1, if X= 0.
4.55 False. Use any step function with distinct values on two contiguous intervals to give a counterexample. 4.56 True. This is the first version of the Fundamental theorem of the calculus. 4.57 This is true for any function with the given properties-not only those functions which are derivatives.
4.58 For example, we can use
f( X ) = 4.59 True:
{2
° 1 2 1 xsm ::::2-- cos ::::2
X
X
X
0
if 0 < X :S 1, if X= 0.
liT II :::; 1. The reader could show as an additional problem that II Til = 1.
4.60 t = ~4.61 The limit is 16P"(x). 4.62 The limit is 0. 4.63 Yes, because
IITII
:S 1.
SOLUTIONS FOR CHAPTER 5
5.3 Let $a_1 = 1$ and let $a_n = \sqrt{n} - \sqrt{n - 1}$ for each $n > 1$.
5.9.a Divergent. 5.9.b Convergent. 5.12.a Divergent. 5.19.a Hint: Use the Ratio Test together with the definition of e as a limit. 5.19.b Convergent. 5.20
i·
5.43.a Converges uniformly by theM-test. 5.43.c Not uniformly convergent on [0, ~). 5.49 The sum is In 2. 5.50.e The interval of convergence is I = (-e, e). The reader should prove divergence at the endpoints by applying the nth term test and using the hint. 5.54 j< 100) (0)
= 0 and j< 101 ) (0) = 100!.
5.70.a Absolutely convergent. 5.70.b Conditionally convergent. (_J\k+l
5.71 Let Xk =
~·for example.
5.72 True. 5.73 False. This set of series is not closed under subtraction, for example. 5.74.a Diverges. 5.74.b Converges (absolutely). 5.75 Let Xk =
p and Yk = f• for example.
5.76.a Diverges, by the ratio test. 5.76.b Converges, by the comparison test. 5.77 False, because the series is absolutely convergent. 5.78 For example, let Xk =
Yk =
f,..
Then
The student should be able to write another such example.
5.81.a $M_k = e^{-2k}$, so the series $\sum_k M_k$ converges.
5.81.b $M_k = 1$, so the series $\sum_k M_k$ diverges.
SOLUTIONS FOR CHAPTER 6
6.15 For example, let $f(x) = ix$ for all $x \in [a, b]$. There are infinitely many possible answers.
$n \in \mathbb{Z} \setminus \{0\}$, $n = 0$.
6.19.c J(n)
=
{i;n,
n
=
~
{ 0,
2,..n
0,
n=O.
2,
6.19.d J( n)
=/=-
'
n
=/=-
0,
n=O.
j( n) =
6.19.e
{
~ ~,. 2 n 2 ' 12,
n "I 0, n=O.
6.35 Hint: Write each trigonometric function using Euler's formula, and multiply using the multiplicative property of the exponential function.
6.51.a
Jl f(x) dx = 1.
6.51.c
J01 f(x)sin27rxdx =
6.52.b
J01 f (x) sin 61rx dx = i.
6.53.a
4-.
2
0.
6.53.b i. 6.53.c ~· 6.54 -~.
6.55 False. The product is not linear in the second variable.
6.56.a $\hat{f}(n)$ is both even and real-valued.
6.56.b $\hat{f}(n)$ is both odd and imaginary-valued.
6.57.a f(x) = ~ cos61rx + ~ cos27rx. 6.57.b
l
6.58 True, because $\lim S_n$ is continuous, hence Riemann integrable.
6.59.a $\frac{\pi^4}{90}$.
6.59.b $\frac{8\pi^4}{729}$.
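The value $\pi^4/90$ in 6.59.a coincides with the classical sum $\sum_{n \ge 1} 1/n^4$. Assuming that this is the series intended in the exercise (the problem statement is not reproduced here), the reader can compare partial sums with the closed form. The function name and the cutoffs below are illustrative.

```python
import math


def partial_sum_inverse_fourth_powers(N):
    """Return sum_{n=1}^{N} 1/n^4, adding the smallest terms first."""
    return math.fsum(1.0 / n**4 for n in range(N, 0, -1))


target = math.pi**4 / 90.0
for N in (10, 100, 1000):
    # The tail beyond N is smaller than 1/(3 N^3), so agreement improves quickly.
    print(N, partial_sum_inverse_fourth_powers(N), target)
```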
SOLUTIONS FOR CHAPTER 7
7.19 1. 7.21 6. 7.24 2.
7.27 For example, we can use $g = 1_{(p,b]}$ if $a \le p < b$, or $g = 1_{\{p\}}$ if $p = b$.
7.30 For example, let $g = 1_{\{1\}}$.
7.41 This is false, as the reader should be able to deduce from the fact that BV[0, 1] is a vector space.
7.42 For example, let
$$f(x) = \begin{cases} \sin\frac{\pi}{x}, & x \ne 0, \\ 0, & x = 0. \end{cases}$$
7.43 6. 7.44 False.
7.45 True.
7.46 0.
7.47 We can choose $g(x) = 2 \cdot 1_{[1,2]}$, twice the indicator function of the subinterval $[1, 2] \subset [0, 2]$.
7.48 $\int_0^2 f\,dg = 3e$.
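For the step integrator chosen in 7.47, every Riemann-Stieltjes sum reduces to a single term at the jump, so for an integrand f continuous at 1 the sums tend to $2f(1)$. The sketch below illustrates this with the assumed integrand $f = \exp$; the uniform partition and the midpoint evaluation points are also illustrative choices.

```python
import math


def g(x):
    """Integrator from solution 7.47: twice the indicator function of [1, 2]."""
    return 2.0 if 1.0 <= x <= 2.0 else 0.0


def riemann_stieltjes_sum(f, integrator, a, b, n):
    """Riemann-Stieltjes sum on a uniform partition with midpoint evaluation points."""
    total = 0.0
    for i in range(1, n + 1):
        left = a + (b - a) * (i - 1) / n
        right = a + (b - a) * i / n
        # Only the subinterval containing the jump of the integrator contributes.
        total += f(0.5 * (left + right)) * (integrator(right) - integrator(left))
    return total


for n in (10, 100, 1000):
    print(n, riemann_stieltjes_sum(math.exp, g, 0.0, 2.0, n), 2.0 * math.e)
```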
7.49 We can take $f = 1_{(a,b)}$ and $g = 1_{(b,c)}$. Then on the full interval [a, c] both functions have a discontinuity at b. But on [a, b] or the interval [b, c], only one function or the other one has a discontinuity.
SOLUTIONS FOR CHAPTER 8
8.7.a The limit is (0, e).
8.7.b The limit does not exist.
8.17.a Open.
8.17.c Neither.
8.17.e Closed.
8.42.a Not compact.
8.42.c Compact.
8.59 Any y perpendicular to x will suffice. For example: let y = (1, 1, -1).
8.60 True. If not, there exists a sequence for which $\|x^{(k)}\| \to \infty$, and no subsequence converges.
8.61 This is a square box with vertices at (±1, 0) and (0, ±1).
8.62 For example, let $D = \{(m, n) \mid m \in \mathbb{Z} \text{ and } n \in \mathbb{Z}\}$. Then the point (0, 0) is an isolated point of D.
8.63 The set S = Qx 2 is an example.
8.64 True. For example, this is true for Example 1.15.
8.65 False. The graph is not closed, lacking any points on the y-axis between 1 and -1.
8.66 False. For example, let $E_k = \mathbb{E}^n \setminus B_k(0)$.
8.67 True, because $x_n \to x_1$ as $n \to \infty$.
8.68 True, because g is continuous.
8.69.a This means that $x_2 = \pm x_1$, the graph of which consists of two intersecting straight lines, each of which is a connected set. Thus S is connected.
8.69.b The two open sets $A = \{x \in \mathbb{E}^2 \mid x_1 > 0\}$ and $B = \{x \in \mathbb{E}^2 \mid x_1 < 0\}$ separate T, so that T is not connected. There is no point in T corresponding to $x_1 = 0$.
8.70 False. For example, this would make $\mathbb{E}^n$ disconnected! For the two open separating sets, just take any one of the open balls for A and the union of all the others for B. But $\mathbb{E}^n$ is connected.
SOLUTIONS FOR CHAPTER 9
9.6.a The limit does not exist.
9.19 For example, let $f(x) = (\cos x, \sin x)$.
9.26 Closed.
9.52 Here is a counterexample. Let A be the graph of $x_2 = \frac{1}{x_1}$ in $\mathbb{E}^2$, and let B be the $x_1$-axis.
9.53 True. This follows from the fact that any open set containing the point (0, 1) must intersect the graph of $f|_{(0,\infty)}$, which is connected since the restriction of f is continuous on $(0, \infty)$.
9.54 Let if $x \ne 0$, if $x = 0$.
9.55 $\lim_{x \to a} f(x)$ exists if and only if for each $\epsilon > 0$ there exists $\delta > 0$ such that x and y in $B_\delta(a) \cap (D \setminus \{a\})$ implies $\|f(x) - f(y)\| < \epsilon$.
9.56 True, since $f \in C(\mathbb{E}^n, \mathbb{R})$ and $S = f^{-1}(\{0\})$, the pre-image of a closed set.
9.57 False, since the pre-image of an open set under a continuous function need only be relatively open in D.
9.58 This limit does not exist, since the limit along either axis is zero, but the limit along the line $x_2 = x_1$ is ~
9.59 True. Necessity follows directly from continuity at 0. Sufficiency is nearly the same as Exercise 9.33.b.
9.60 Here is a counterexample. $B_1(0)$ is a relatively closed subset of itself, but it is not closed, so it cannot be compact, although it is bounded.
9.61 True, since A and B are connected, being convex, and their union is connected since $A \cap B \ne \emptyset$.
9.62 False. Let $f(x) = (\cos \pi x, \sin \pi x)$. Then $\|f(x)\| = 1$.
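The counterexample in 9.62 has constant norm although the function itself is not constant. A short numerical check is sketched below; the sample points are arbitrary choices made for illustration.

```python
import math


def f(x):
    """The map from solution 9.62: x -> (cos(pi x), sin(pi x))."""
    return (math.cos(math.pi * x), math.sin(math.pi * x))


def norm(v):
    """Euclidean norm of a vector in the plane."""
    return math.hypot(v[0], v[1])


for x in (0.0, 0.25, 0.5, 1.0, 1.7):
    # The norm stays at 1 while the point f(x) moves around the unit circle.
    print(x, f(x), norm(f(x)))
```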
9.63 True, because the continuous image of a connected set is connected.
SOLUTIONS FOR CHAPTER 10
10.5 $\|A\| = \|B\| = \|A + B\| = 1$.
10.25.a $f'(x)$ exists for all x.
10.26.a $\det f'(x) = x_1$.
10.26.c detf'(x) = x~sinx 3 • 10.28 D(l,- 2Jf(1, 2) 10.44 (
-3 4
3 -4
= ( ~150 ).
6 ) -3 .
10.48 ~ = ~m_ ~£11.J... 8x; L..Jk-1 8yk 8x;
10.89 $\|A\| = 1 = \|B\|$, and $\|A + B\| = 1$.
10.90 This is false, since $\|X + X\| = \|2X\| = 2\|X\|$.
10.91 This is false, since the function $c_{1,1}(A) = a_{1,1}$ is a continuous function of A and $c_{1,1}(S)$ is not a connected subset of $\mathbb{E}^1$.
10.92 Let T have the matrix (
~
8) in the standard basis.
10.93 This is false, since $f'(x)$ is not linear in $x \in D$. Rather, $f'(x)$ is a linear map from $\mathbb{E}^2$ to $\mathbb{E}^3$ for each fixed $x \in \mathbb{E}^2$. Moreover, D need not be a vector space, so that $cx_1 + x_2$ need not be in D.
-~ ) ( 2 ) _ e ( 2v'3- 1 )
~
1
-2
2+v'3
.
10.95 The matrix
10.96 This is true, because if we did have $f(a) = f(b)$ for some $a < b$ there would be an extreme point someplace in the open interval (a, b), because of the extreme value theorem and because f cannot be constant on [a, b]. For such an extreme point p and $r > 0$, f could not be injective on $(p - r, p + r)$.
10.97 By direct calculation, the Jacobian $\frac{\partial(f_1, f_2, f_3)}{\partial(x_1, x_2, x_3)} = x_1$. Thus f has nonvanishing Jacobian off the plane $x_1 = 0$, and for each such x we can take a ball of radius $r = \min(|x_1|, \pi)$.
10.98 The Jacobian
Thus the equation $f(x) = f(x_0)$ has a local $C^1$-solution for $x_3$ in terms of the other three variables in a neighborhood of $x_0$ provided $x_4 \ne 0$.
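10.99 Let $f(x) = (\cos x, \sin x)$ for all $x \in [0, 2\pi)$.
10.100 Denote $g(x_1) = (x_1, x_2(x_1), x_3(x_1))$. Thus $f \circ g(x_1) = 0$. Hence
$$[f'(x_0)][g'(a)] = \begin{pmatrix} 1 & 1 & 3 \\ -1 & 2 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ x_2'(a) \\ x_3'(a) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$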
This yields the equations
$$1 + x_2'(a) + 3x_3'(a) = 0, \qquad -1 + 2x_2'(a) = 0$$
and the solutions $x_2'(a) = \frac{1}{2}$ and $x_3'(a) = -\frac{1}{2}$.
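Taking the two equations in 10.100 to read $1 + x_2'(a) + 3x_3'(a) = 0$ and $-1 + 2x_2'(a) = 0$ (an assumption, since the subscripts are hard to read in the printed solution), the derivatives can be recovered with exact rational arithmetic. The helper `solve_2x2` is a hypothetical name introduced only for this sketch.

```python
from fractions import Fraction


def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*u + a12*v = b1 and a21*u + a22*v = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("singular system")
    u = Fraction(b1 * a22 - a12 * b2, det)
    v = Fraction(a11 * b2 - b1 * a21, det)
    return u, v


# Unknowns: u = x_2'(a) and v = x_3'(a) (assumed labeling).
#    1 + u + 3v = 0   is rewritten as    u + 3v = -1
#   -1 + 2u     = 0   is rewritten as   2u + 0v =  1
u, v = solve_2x2(1, 3, 2, 0, -1, 1)
print(u, v)
```

Under the assumed equations this gives $x_2'(a) = 1/2$ and $x_3'(a) = -1/2$.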
SOLUTIONS FOR CHAPTER 11
11.6 For example, we can let
P = {((0,0), (.95, 1)), ((.95,0), (1.05, 1)), ((1.05, 1), (2, 1))}.
11.48 $g'(\rho, \theta, \phi) = \rho^2 \sin\phi$ and the volume is ~$\pi a^3$.
11.52 The volumes are identical.
11.58 For example, let f be the indicator function of the set of points for which both coordinates are rational.
11.59 Let f be the indicator of the points with both coordinates rational, and let g be the indicator of the points with both coordinates irrational.
11.60 For example, let f be the indicator function of the set of points having first coordinate rational.
11.62 The function f is not continuous anywhere on [0, 1].
11.63 No, S is not Jordan measurable, since its boundary is not a Lebesgue null set.
11.64 True, because it is a compact subset of a Lebesgue null set.
11.65 $o(f, 0) = 2$.
11.66 The Jordan measure is zero. 11.67 We must have f(xi) = 0 except on a Lebesgue null set. This follows from Lebesgue's theorem. 11.68.a
[f'(x)] = ( 11.68.b The volume is 3 ~11".
-1
8 (/0
y )
_:1 .
REFERENCES
I. Tom M. Apostol, Mathematical Analysis: A Modem Approach to Advanced Calculus, Addison-Wesley, Reading, MA, 1957. 2. C. Berge, Principles ofCombinatorics, Academic Press, New York, 1971. 3. Carl B. Boyer, The History of the Calculus, Dover Publications, New York, 1949.
4. R. Creighton Buck, Advanced Calculus, McGraw-Hill, New York, 1956.
5. Lennart Carleson, On convergence and growth of partial sumas of Fourier series, Acta Mathematica, Vol. 116, 1966, pp. 135-157. 6. H. Dym and H. P. McKean, Fourier Series and Integrals, Academic Press, New York, 1972. 7. Bernard R. Gelbaum and John M. H. Olmsted, Counterexamples in Analysis, Holden-Day, San Francisco, 1964. 8. Casper Goffman and George Pedrick, First Course in Functional Analysis, Prentice-Hall, Englewood Cliffs, NJ, 1965. 9. Kenneth Hoffman, Analysis in Euclidean Space, Prentice-Hall, Englewood Cliffs, NJ, 1975. 10. Kenneth Hoffman and Ray Kunze, Linear Algebra, Prentice-Hall, Englewood Cliffs, NJ, 1971. II. E. Kamke, Theory of Sets, Dover Publications, New York, 1950. 12. E. Landau, Foundations of Analysis, Chelsea Publishing Co., Broomall, PA, 1960. Advanced Calculus: An Introduction to Linear Analysis. By Leonard F. Richardson Copyright © 2008 John Wiley & Sons, Inc.
379
380
REFERENCES
13. H. E. Lomeli and C. L. Garcia, Variations on a theorem of Korovkin, American Math Monthly, Vol. 113, No. 8, 2006, pp. 744-750. 14. G. D. Mostow, J. H. Sampson and J.-P. Meyer, Fundamental Structures of Algebra, McGraw Hill, New York, 1963. 15. Otto Neugebauer, The Exact Sciences in Antiquity, Brown University Press, Providence, RI, 1957. 16. John M. H. Olmsted, The Real Number System, Appleton-Century-Crofts, New York, 1962. 17. J. Dauben, Review: The Universal History of Numbers, Notices of the American Mathematical Society, Vol. 49, No. I, January 2002, pp. 32-38. 18. Frigyes Riesz and Bela Sz.-Nagy, Functional Analysis, Frederick Ungar Publishing Co., New York, 1955. 19. Walter Rudin, Principles of Mathematical Analysis, McGraw-Hill, New York, 1964. 20. Michael Spivak, Calculus on Manifolds, W. A. Benjamin, New York, 1965.