MEASURE AND INTEGRATION A Concise Introduction to Real Analysis
Leonard F. Richardson
WILEY
A JOHN WILEY & SONS, INC., PUBLICATION
Copyright © 2009 by John Wiley & Sons, Inc. All rights reserved. Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic format. For information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
Richardson, Leonard F. Measure and integration : a concise introduction to real analysis / Leonard F. Richardson. p. cm. Includes bibliographical references. ISBN 978-0-470-25954-2 (cloth) 1. Lebesgue integral. 2. Measure theory. 3. Mathematical analysis. I. Title. QA312.R45 2008 515'.42-dc22 2009009714
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
To Joan, Daniel, and Joseph
CONTENTS

Preface
Acknowledgments
Introduction

1 History of the Subject
  1.1 History of the Idea
  1.2 Deficiencies of the Riemann Integral
  1.3 Motivation for the Lebesgue Integral

2 Fields, Borel Fields, and Measures
  2.1 Fields, Monotone Classes, and Borel Fields
  2.2 Additive Measures
  2.3 Carathéodory Outer Measure
  2.4 E. Hopf's Extension Theorem
    2.4.1 Fields, $\sigma$-Fields, and Measures Inherited by a Subset

3 Lebesgue Measure
  3.1 The Finite Interval $[-N, N)$
  3.2 Measurable Sets, Borel Sets, and the Real Line
    3.2.1 Lebesgue Measure on $\mathbb{R}$
  3.3 Measure Spaces and Completions
    3.3.1 Minimal Completion of a Measure Space
    3.3.2 A Nonmeasurable Set
  3.4 Semimetric Space of Measurable Sets
  3.5 Lebesgue Measure in $\mathbb{R}^n$
  3.6 Jordan Measure in $\mathbb{R}^n$

4 Measurable Functions
  4.1 Measurable Functions
    4.1.1 Baire Functions of Measurable Functions
  4.2 Limits of Measurable Functions
  4.3 Simple Functions and Egoroff's Theorem
    4.3.1 Double Sequences
    4.3.2 Convergence in Measure
  4.4 Lusin's Theorem

5 The Integral
  5.1 Special Simple Functions
  5.2 Extending the Domain of the Integral
    5.2.1 The Class $\mathcal{L}^+$ of Nonnegative Measurable Functions
    5.2.2 The Class $\mathcal{L}$ of Lebesgue Integrable Functions
    5.2.3 Convex Functions and Jensen's Inequality
  5.3 Lebesgue Dominated Convergence Theorem
  5.4 Monotone Convergence and Fatou's Theorem
  5.5 Completeness of $L^1(X, \mathfrak{A}, \mu)$ and the Pointwise Convergence Lemma
  5.6 Complex-Valued Functions

6 Product Measures and Fubini's Theorem
  6.1 Product Measures
  6.2 Fubini's Theorem
  6.3 Comparison of Lebesgue and Riemann Integrals

7 Functions of a Real Variable
  7.1 Functions of Bounded Variation
  7.2 A Fundamental Theorem for the Lebesgue Integral
  7.3 Lebesgue's Theorem and Vitali's Covering Theorem
  7.4 Absolutely Continuous and Singular Functions

8 General Countably Additive Set Functions
  8.1 Hahn Decomposition Theorem
  8.2 Radon-Nikodym Theorem
  8.3 Lebesgue Decomposition Theorem

9 Examples of Dual Spaces from Measure Theory
  9.1 The Banach Space $L^p(X, \mathfrak{A}, \mu)$
  9.2 The Dual of a Banach Space
  9.3 The Dual Space of $L^p(X, \mathfrak{A}, \mu)$
  9.4 Hilbert Space, Its Dual, and $L^2(X, \mathfrak{A}, \mu)$
  9.5 Riesz-Markov-Saks-Kakutani Theorem

10 Translation Invariance in Real Analysis
  10.1 An Orthonormal Basis for $L^2(T)$
  10.2 Closed, Invariant Subspaces of $L^2(T)$
    10.2.1 Integration of Hilbert Space Valued Functions
    10.2.2 Spectrum of a Subset of $L^2(T)$
  10.3 Schwartz Functions: Fourier Transform and Inversion
  10.4 Closed, Invariant Subspaces of $L^2(\mathbb{R})$
    10.4.1 The Fourier Transform in $L^2(\mathbb{R})$
    10.4.2 Translation-Invariant Subspaces of $L^2(\mathbb{R})$
    10.4.3 The Fourier Transform and Direct Integrals
  10.5 Irreducibility of $L^2(\mathbb{R})$ Under Translations and Rotations
    10.5.1 Position and Momentum Operators
    10.5.2 The Heisenberg Group

Appendix: The Banach-Tarski Theorem
  A.1 The Limits to Countable Additivity

References
Index
PREFACE
The purpose of this textbook is to provide a concise introduction to Measure and Integration, in the context of abstract measure spaces. It is written in the hope that it has sufficient versatility to provide the prerequisites for beginning graduate courses in harmonic analysis, probability, and functional analysis. For both mathematical and pedagogical reasons, it is helpful to the student to emphasize examples and exercises from the real line and Euclidean space. This emphasis also reflects the special theorems that should be established for the cases of the line and Euclidean space.
The author's department spans diverse branches of both pure and applied mathematics, making it desirable to teach sufficient measure and integration in one semester. This helps to make room in the introductory part of the graduate curriculum for both complex analysis and functional analysis. The choice of textbook for achieving this goal, covering both abstract measure spaces and Euclidean space in one semester, has been problematic.
In 1963, the author was fortunate to be a student in the course "Measure and Integration" taught by Professor Shizuo Kakutani (1911-2004) at Yale University. His course achieved all the objectives cited above. It also succeeded in conveying to the student a sense of awe for the profound insights the subject has to offer for mathematics.
The author had hoped that a book on this subject might be written by Professor Kakutani himself. Since this did not happen, and because the author's notes from Professor Kakutani's course seemed potentially useful if converted into a textbook, this book has been written. Many exercises have been added, these being very important to a sound core graduate course in analysis. Several favorite topics in real analysis augment the original course. There are two concluding chapters, intended for the reader who would like to continue this study after the end of the semester. These chapters provide some favorite applications to functional analysis, and to the study of translation-invariant function spaces on the line and the circle. The excursion in the latter direction concludes with a brief proof that $L^2(\mathbb{R})$ is irreducible under the combined actions of translation in the (real) domain, and rotation in the (complex) range, of the functions. There is a very brief introduction to the Heisenberg Commutation Relation and the role of the Heisenberg group in the Stone-von Neumann theorem, which is not proven here. The role of translation invariance in real analysis, beginning with Lebesgue measure and the Lebesgue integral themselves, reflects the author's interest in the underlying group-theoretic structures in analysis.
ACKNOWLEDGMENTS
Real analysis, and how to teach it, is a perennial topic of discussion among analysis professors, and I have benefited in more ways than I could enumerate from conversations with colleagues for 35 years. I feel that I have benefited in the writing of this book especially from discussions with Professors Jacek Cygan, Mark Davidson, Raymond Fabec, and P. Sundar. It has been a great pleasure and privilege to teach measure and integration to numerous graduate students over the years. They have all contributed to my enjoyment of the subject and to the choices that went into this book. For the autumn semesters of 2007 and 2008, incoming graduate classes at LSU learned measure and integration from a preliminary version of this book. These fine students kindly pointed out numerous typos, and they asked questions that enabled me to improve the exposition. Some found interesting resources in the digital literature. Especially helpful to me were two advanced graduate assistants, Jens Christensen and Anna Zemlyanova. They did a superlative job of grading the weekly homework assignments collected during these two terms, and their observations were especially helpful to me in improving the text.
This book began with my own notebook from Professor Shizuo Kakutani's course in measure theory, taught in 1963 at Yale University. It was Professor Kakutani's way to leave it to students to write detailed notes for the course, filling in many proofs that were omitted in class. This work was the principal assignment whereby students earned academic credit. I have added numerous exercises to the original course, believing these to be essential to the teaching of present-day graduate students. Some exercises require the student to complete a proof or to extend a theorem, thereby incorporating in a formal manner duties of the kind that Professor Kakutani left as an assignment. Many other exercises are problems typical of PhD qualifying examination questions in real analysis. These are intended to show students how the subject of measure and integration is used in analysis, and to help them prepare for milestone tests such as qualifiers.
No doubt the best features of this text were learned from Professor Kakutani. Also, there are ideas borrowed from several favorite books, which are cited in the References. These are recommended highly for further reading. The supplementary topics in the last chapter, and a number of the exercises in the text, reflect my interest in the role of group actions in analysis, which in this course means mainly the action of the real line or the Euclidean vector group. In this I have benefited from the teaching of my own major professor at Yale from 1967 to 1970, Professor George D. Mostow, to whom I am indebted beyond words. The shortcomings and errors of this book are, of course, exclusively my own. I welcome gratefully corrections and suggestions from readers.
I am grateful to John Wiley & Sons for the opportunity to offer this book, as well as the course it represents and advocates, to a wider audience. I appreciate especially the role of Ms. Susanne Steitz-Filler, the Mathematics and Statistics Editor of John Wiley & Sons, in making this opportunity available. She and her colleagues provided valued advice, support, and technical assistance, all of which were needed to transform a professor's course notes into a book.
L.F.R.
INTRODUCTION
The reader of this book is likely to be a first-semester graduate student. The author advises this reader, gently but firmly, to expect that what follows is an intense and sustained exercise in logical reasoning. The effort will be great, but the reader may find that the reward is greater and that this subject is an important cornerstone of modern mathematics. Strategies employed commonly by undergraduates for the study of mathematics may not be appropriate for the graduate student. Because the reader is assumed to be a beginning graduate student in mathematics, I offer here some advice regarding how to study mathematics in graduate school.
In reading the proofs of theorems in this text or in the study of proofs presented by one's teacher in class, the student must understand that what is written is much more than a body of facts to be remembered and reproduced upon demand. Each proof has a story that guided the author in its writing. There is a beginning (the hypotheses), a challenge (the objective to be achieved), and a plan that might, with hard work, skill, and good fortune, lead to the desired conclusion. Thus each step of a proof should be seen by the student as meaningful and purposeful, and not as a procedure that is being followed blindly. It will take time and concerted effort for the student to learn to think about the statements and proofs of the presented theorems in this light. Such practice will also cultivate the ability to do the exercises in a fruitful manner. With experience in recognizing the story of the proof or problem at hand, the student will be in a position to develop his or her own technique through the work done in the exercises.
The first step, before attempting to read a proof, is to read the statement of the theorem carefully, trying to get an overall picture of its content. The student should make sure that he or she knows precisely the definition of each term used in the statement of the theorem. Without that information, it is impossible to understand even the claim of the theorem, let alone its proof. If a term or a symbol in the statement of a theorem or exercise is not recognized, look in the index! Write down what you find. After clarifying explicitly the meaning of each term used, if the student does not see what the theorem is attempting to achieve, it is often helpful to write down a few examples to see what difficulties might arise, leading to the need for the theorem. Working with examples is the mathematical equivalent of laboratory work for a natural scientist. Indeed, the student should accumulate a tool kit of examples that illustrate what may go wrong with a theorem if the hypotheses are not satisfied. One also should learn examples that show why stronger conclusions than those stated may fail to be valid with the given hypotheses. Many exercises in the text are intended to provide examples that assist one in these ways to grasp the meaning of the theorems, and the student must understand that the exercises are a very important component of the study of measure and integration. The experience gained from the exercises will help the student to understand which steps in the solution of a problem require careful justification.
CHAPTER 1
HISTORY OF THE SUBJECT
1.1 HISTORY OF THE IDEA
In a broad sense, much of mathematics is devoted to the decomposition, or analysis, of a whole entity into its component parts and the reconstruction, or synthesis, of the whole from its parts. There is in this an expectation that the whole is somehow equal to the sum of its parts, which are simpler individually. We discuss briefly a few examples of this aspect of mathematics.
It was nearly three thousand years ago that Babylonian astronomers successfully predicted the times of lunar and solar eclipses by expressing these complicated events as summations of numerous simpler periodic events. The predictions were fairly accurate, to the extent of predicting eclipses that would be visible at least from some part of the world. That remarkable achievement may be interpreted as the first appearance on Earth of harmonic (or Fourier) analysis.
Measurement has been a special interest for mathematicians and scientists since the early days of civilization. Two thousand years ago, the classical geometers of Greece made profound contributions to the study of measurement. A line segment could be measured by a shorter segment if the short one could be laid off end to end in such a way that the long segment would be seen to be an exact positive integer multiple of the short one. Two segments were called commensurable if both could be measured by a common shorter segment, or unit of length. And the discovery by Pythagoras or his associates of incommensurable segments was a momentous event in the history of mathematics. Inherent in the definition Greek geometers used for measurement of segments was the concept that the measure of the whole segment must be the sum of the measures of its nonoverlapping parts, as indicated by the lengths marked off according to the shorter segment.
The Greek geometers pushed this technique much farther, successfully studying the circumference and area of a circle and the surface area and volume of a sphere. These challenges were met by the application of what is called the principle of exhaustion. The method was to approximate a geometrical measurement from below, and at each successive iteration of the approximation technique, at least half of what remained to be counted was to be included. (Greek geometers understood that if too little were taken at each stage, even an endless succession of steps might not approach the goal.) For example, the area of a circle is approximated from within by means of an inscribed $2^n$-sided regular polygon. As $n$ increases, the area of the resulting $2^n$-sided polygon grows in a computable manner. (See Figure 1.1.) We begin with an inscribed square consuming the bulk of the area of the circle. The next polygon is an octagon, which adds the areas of four thin triangles to that of the square. At each stage, one can construct the $2^{n+1}$-gon by bisecting the arcs that are subtended by the sides of the inscribed $2^n$-gon. In effect, the $(n+1)$th stage of the approximation results from adding a small increment to what was obtained at the $n$th stage. Such classical achievements in geometry may have been the first instances in which a measure (of area, for example) was sought that would be countably additive, meaning that the measure of the whole should be the sum of the measures of its infinite sequence of nonoverlapping parts.
Figure 1.1 Area of a circle by exhaustion.
Greek geometers and philosophers may not have been entirely convinced of the validity of the principle of exhaustion. The expression of some dissatisfaction with this geometric technique may have been the purpose of the famous paradox of Zeno, in which the legendary warrior Achilles was pitted unsuccessfully in a footrace with a persistent tortoise.
The notion that the whole should be the sum of even an infinite sequence of its nonoverlapping parts appears very strongly in modern analysis, both pure and applied. For example, in 1822 Joseph Fourier presented a seminal paper on the heat equation to the French Academy, in which he introduced the use of infinite trigonometric series for the purpose of determining the solution of the heat equation on a finite interval, or rod [6]. Fourier's method required that the hypothetical solution function be expressible as the sum of an infinite trigonometric series:
Unfortunately, these so-called Fourier series can diverge for very large sets of numbers $x$ in the domain of even a rather nice function $f$. Yet, if one ignored the embarrassing reality that $f(x)$ need not be the sum of its parts, Fourier's method actually worked. And the search was on for suitable concepts and tools to analyze correctly the right classes of functions for which the Fourier series would converge in a useful sense to the functions being represented. The efforts took place on a grand scale. Even the theory of sets was invented by Georg Cantor for the purpose of analyzing sets of convergence and divergence of Fourier series. Though Cantor's approach was not sufficient for the needs of Fourier series, it became a cornerstone of modern mathematics. Early in the twentieth century Henri Lebesgue invented his new and very refined concept of the integral, based on the measure of suitable subsets of the line. Lebesgue measure became the foundation not only for Fourier analysis, but also for probability, and for functional analysis, which permeates modern analysis.
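The principle of exhaustion described earlier in this section can be checked numerically. The following is a minimal sketch rather than part of the historical method: it assumes a circle of radius 1, the function name `inscribed_polygon_area` is hypothetical, and it uses the library value of $\pi$ only to measure how much area remains uncounted at each stage.

```python
import math

def inscribed_polygon_area(n):
    """Area of the regular 2**n-sided polygon inscribed in a circle of radius 1."""
    sides = 2 ** n
    # The polygon splits into `sides` isosceles triangles, each with two sides
    # of length 1 meeting at the center with angle 2*pi/sides.
    return 0.5 * sides * math.sin(2 * math.pi / sides)

# Start with the inscribed square (n = 2) and keep bisecting the arcs.
for n in range(2, 8):
    area = inscribed_polygon_area(n)
    remainder = math.pi - area  # area of the circle not yet counted
    print(n, area, remainder)
```

In this sketch the uncounted area shrinks by more than half from each stage to the next, which is the condition that the principle of exhaustion requires.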
1.2 DEFICIENCIES OF THE RIEMANN INTEGRAL
The Riemann integral is the integral of elementary calculus. It is the integral developed intuitively by Newton and Leibniz and put to great use in the classical sciences. Before undertaking the considerable work of developing the Lebesgue integral, the reader and student need to become acquainted with the deficiencies of the Riemann integral. This will motivate the effort that follows. First, we review the definition of the Riemann integral of a bounded real-valued function $f$ on a closed, finite interval $[a, b]$ of the real line.
Definition 1.2.1 A partition $P$ is an ordered list of finitely many points starting with $a$ and ending with $b > a$. Thus $P = \{x_0, x_1, \ldots, x_n\}$, where
\[ a = x_0 < x_1 < \cdots < x_n = b. \]
These points are regarded as partitioning $[a, b]$ into $n$ contiguous subintervals, $[x_{i-1}, x_i]$, $i = 1, \ldots, n$. The length of the $i$th subinterval is given by $\Delta x_i = x_i - x_{i-1}$. The mesh of the partition is denoted and defined by
\[ \|P\| = \max\{\Delta x_i \mid i = 1, 2, \ldots, n\}. \]
Definition 1.2.2 Let $f$ be any bounded function on $[a, b]$ and let $P$ be any partition of $[a, b]$. Let $M_i = \sup\{f(x) \mid x \in [x_{i-1}, x_i]\}$ and $m_i = \inf\{f(x) \mid x \in [x_{i-1}, x_i]\}$. Define the upper sum,
\[ U(f, P) = \sum_{i=1}^{n} M_i \, \Delta x_i, \]
and the lower sum,
\[ L(f, P) = \sum_{i=1}^{n} m_i \, \Delta x_i. \]
We say that $f$ is Riemann integrable on $[a, b]$ with $\int_a^b f(x)\,dx = L$ if and only if both $L(f, P) \to L$ and $U(f, P) \to L$ as $\|P\| \to 0$.
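Definition 1.2.2 can be made concrete with a small computation. The sketch below is an illustration under stated assumptions, not part of the definition: it takes the hypothetical example $f(x) = x^2$ on $[0, 1]$ with uniform partitions, and it relies on $f$ being increasing there, so that the supremum and infimum on each subinterval are attained at the right and left endpoints. Exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def uniform_partition(a, b, n):
    """Points x_0 < x_1 < ... < x_n of the uniform partition of [a, b]."""
    return [a + (b - a) * Fraction(i, n) for i in range(n + 1)]

def mesh(points):
    """The mesh ||P||: the largest subinterval length."""
    return max(x1 - x0 for x0, x1 in zip(points, points[1:]))

def upper_lower_sums_increasing(f, points):
    """U(f, P) and L(f, P), assuming f is increasing on [a, b].

    For increasing f, M_i = f(x_i) and m_i = f(x_{i-1}).
    """
    pairs = list(zip(points, points[1:]))
    upper = sum(f(x1) * (x1 - x0) for x0, x1 in pairs)
    lower = sum(f(x0) * (x1 - x0) for x0, x1 in pairs)
    return upper, lower

def square(x):
    return x * x

P = uniform_partition(Fraction(0), Fraction(1), 100)
U, L = upper_lower_sums_increasing(square, P)
print(mesh(P), float(L), float(U))
```

For this example the gap $U(f, P) - L(f, P)$ equals the mesh, so both sums are squeezed toward the same limit $1/3$ as the mesh shrinks.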
Note that $M_i$ and $m_i$ are real numbers in Definition 1.2.2 because $f$ is bounded.
EXAMPLE 1.1
Since the set $\mathbb{Q}$ of rational numbers is countably infinite, the same is true of the set $S$ of all rational numbers in $[a, b]$ for any $a < b$. So, write $S = \{q_n \mid n \in \mathbb{N}\}$. Now define the functions
\[ f_n(x) = \begin{cases} 1 & \text{if } x \in \{q_1, \ldots, q_n\}, \\ 0 & \text{if } x \in [a, b] \setminus \{q_1, \ldots, q_n\}. \end{cases} \]
It is known from advanced calculus¹ that each function $f_n$ lies in $\mathcal{R}[a, b]$, the set of Riemann integrable functions on $[a, b]$. In the following exercises, the reader will prove that the pointwise limit of the sequence $f_n$ is not Riemann integrable.
¹ See [20], for example.
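The construction in Example 1.1 depends on listing the rationals in a single sequence. The following sketch shows one way to do this for $[0, 1]$; the ordering by denominator and the names `rationals_in_unit_interval` and `f` are assumptions made for illustration, since the example itself does not fix a particular enumeration.

```python
from fractions import Fraction
from itertools import islice

def rationals_in_unit_interval():
    """Yield each rational number in [0, 1] exactly once, ordered by denominator."""
    yield Fraction(0)
    yield Fraction(1)
    q = 2
    while True:
        for p in range(1, q):
            r = Fraction(p, q)
            if r.denominator == q:  # skip fractions that are not in lowest terms
                yield r
        q += 1

def f(n, x):
    """f_n(x): 1 if x is among the first n listed rationals, and 0 otherwise."""
    first_n = set(islice(rationals_in_unit_interval(), n))
    return 1 if x in first_n else 0

first_five = list(islice(rationals_in_unit_interval(), 5))
print(first_five)
print([f(n, Fraction(1, 3)) for n in range(1, 7)])
```

With this enumeration the value $f_n(1/3)$ is 0 until $1/3$ appears in the list and is 1 from then on, which is the pointwise convergence at a rational point that the exercises ask the reader to establish in general.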
EXERCISES
The exercises below refer to the functions $f_n$ in Example 1.1.
1.1 Prove that each function $f_n$ lies in $\mathcal{R}[a, b]$, the set of Riemann integrable functions on $[a, b]$.
1.2 Prove that for each $x$ in $[a, b]$, $f_n(x) \to 1_S(x)$, where
\[ 1_S(x) = \begin{cases} 1 & \text{if } x \in S, \\ 0 & \text{if } x \in [a, b] \setminus S, \end{cases} \]
the indicator function of the set $S$ of rational numbers in $[a, b]$.
1.3 Prove that the function $1_S$ is not Riemann integrable. That is,
\[ \lim_{n \to \infty} \int_a^b f_n(x)\,dx \neq \int_a^b 1_S(x)\,dx \]
because the latter integral does not exist.
The failure of the pointwise limit of a sequence of Riemann integrable functions to be Riemann integrable is considered a serious shortcoming of the Riemann integral. The following example will illustrate a deficiency that is shared by the Riemann integral and the Lebesgue integral that we will define.
EXAMPLE 1.2
Let
for all $n \in \mathbb{N}$. The reader should do the following exercise.
EXERCISE
1.4 Let $f_n$ be as in Example 1.2. Prove that $f_n(x) \to f(x) = 0$ pointwise on $[0, 1]$. Also, it is clear that $f_n \in \mathcal{R}[0, 1]$ for all $n$, and $f \in \mathcal{R}[0, 1]$ as well. Yet
Thus it occurs for some convergent sequences of functions that
\[ \lim_{n \to \infty} \int_a^b f_n(x)\,dx \neq \int_a^b \lim_{n \to \infty} f_n(x)\,dx \tag{1.1} \]
even when all the integrals exist. For the Lebesgue integral, however, Theorem 5.3.1 will identify useful conditions under which equality would be guaranteed in Equation (1.1).
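The phenomenon in Equation (1.1) can be seen with a standard sequence of tall, narrow spikes. The sketch below uses a hypothetical sequence chosen for illustration, which is not necessarily the one defined in Example 1.2: $f_n(x) = n$ for $0 < x < 1/n$ and $f_n(x) = 0$ elsewhere on $[0, 1]$. The names `f` and `integral_of_f` are likewise assumptions.

```python
from fractions import Fraction

def f(n, x):
    """Hypothetical spike: f_n(x) = n for 0 < x < 1/n, and 0 elsewhere on [0, 1]."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_of_f(n):
    """Exact integral of f_n over [0, 1]: a rectangle of height n and base 1/n."""
    return n * Fraction(1, n)

x = Fraction(1, 10)
print([f(n, x) for n in (5, 9, 10, 11, 1000)])     # eventually 0 at this fixed x
print([integral_of_f(n) for n in (1, 10, 1000)])   # the integrals do not shrink
```

At every fixed point the values are eventually 0, yet every integral equals 1, so the limit of the integrals differs from the integral of the limit.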
EXAMPLE 1.3
Let
for each $n \in \mathbb{N}$. The reader should check that each $f_n$ is Riemann integrable but that if $f(x) = \lim_{n \to \infty} f_n(x)$, then $f \notin \mathcal{R}[0, 1]$ because $f$ is not bounded. The reader should recall from elementary calculus that $f$ is, however, improperly Riemann integrable. In Exercise 5.44 the reader will see a generalization of this example that satisfies Lebesgue convergence theorems but that cannot be corrected with improper Riemann integration.
1.3 MOTIVATION FOR THE LEBESGUE INTEGRAL
The Lebesgue integral begins with a seemingly simple reversal of the intuitively appealing process of Definition 1.2.2. Instead of partitioning the interval $[a, b]$ on the $x$-axis into subintervals and considering the range of values of a bounded function $f$ on each small subinterval, Lebesgue began with the interval $[m, M]$ on the $y$-axis, where
\[ M = \sup\{f(x) \mid x \in [a, b]\} \quad \text{and} \quad m = \inf\{f(x) \mid x \in [a, b]\}. \]
Thus $P = \{y_0, y_1, \ldots, y_n\}$, where $m = y_0 < y_1 < \cdots < y_n = M$. Next, instead of forming a sum of the lengths $\Delta x_i$ of the $x$-intervals weighted by the heights $M_i$ or $m_i$, Lebesgue sought to form a sum of the heights, $y_i$, each weighted by some suitable concept of the length, or measure $\mu$, of the set $f^{-1}([y_{i-1}, y_i])$, the set of points $x$ for which $f(x) \in [y_{i-1}, y_i]$. The difficulty is that the set $f^{-1}([y_{i-1}, y_i])$ does not need to be an interval. Indeed, $f^{-1}([y_{i-1}, y_i])$ can be a very complicated subset of the $x$-axis.²
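The reversal described above can be tried on a case where the preimages are easy to measure. The sketch below is an illustration under simplifying assumptions: it uses the hypothetical example $f(x) = x^2$ on $[0, 1]$, for which each set $f^{-1}([y_{i-1}, y_i])$ happens to be the interval $[\sqrt{y_{i-1}}, \sqrt{y_i}]$, so that ordinary length can stand in for the measure $\mu$. The function name `lebesgue_style_sums` is an assumption.

```python
import math

def lebesgue_style_sums(n):
    """Partition the range [0, 1] of f(x) = x**2 into n equal pieces and weight
    each height by the length of the preimage of the corresponding y-interval."""
    ys = [i / n for i in range(n + 1)]
    lower = 0.0
    upper = 0.0
    for y_lo, y_hi in zip(ys, ys[1:]):
        # f is increasing on [0, 1], so the preimage of [y_lo, y_hi]
        # is the interval [sqrt(y_lo), sqrt(y_hi)].
        length = math.sqrt(y_hi) - math.sqrt(y_lo)
        lower += y_lo * length
        upper += y_hi * length
    return lower, upper

for n in (10, 100, 1000):
    print(n, lebesgue_style_sums(n))
```

Both sums close in on $1/3$, the same value the Riemann sums give. The general theory is needed precisely because, for less tame functions, the preimages are not intervals and their "length" must first be defined.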
EXERCISE
1.5 Give an example of a real-valued function $f : \mathbb{R} \to \mathbb{R}$ for which
\[ f^{-1}\left(\left[-\tfrac{1}{2}, \tfrac{1}{2}\right]\right) = \mathbb{R} \setminus \mathbb{Q}, \]
the set of irrational numbers.
² The comparison of Riemann with Lebesgue integration has been likened to a story about a smart merchant who sorts money into denominations before counting the day's receipts. Riemann adds the figures as they come in, but Lebesgue sorts first according to values. Lebesgue integration is subtler, however, than this analogy suggests, because the sets $f^{-1}([y_{i-1}, y_i])$ can be very intricate indeed.
It turns out that the definition on the real line of the Lebesgue integral (a wonderful improvement upon the Riemann integral) is very simple once one has defined a suitable concept of the positive real-valued measure of a subset of the line. The desired measure should agree with the concept of length when applied to a subset that is an interval. The key property that one needs for a concept of the measure of a set is that if one takes any infinite sequence of mutually disjoint sets $E_i$, one needs to have
\[ \mu\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} \mu(E_i). \]
That is, one needs a countably additive measure on subsets of the line which generalizes the concept of length of an interval. Unfortunately, no measure exists that agrees with the concept of the length of an interval and that can be defined on all the subsets of the line. Thus it turns out that defining the family of Lebesgue measurable sets is a very serious undertaking in the construction of the Lebesgue integral. And that is why we will begin our task in the next chapter with the definition of Lebesgue measurable sets and the definition of Lebesgue measure on those sets. This will turn out to be a lengthy task. (Confession: the pun is intended.)
We can see in advance how the Lebesgue integral will resolve some of the deficiencies of the Riemann integral. Suppose that we have defined already a countably additive Lebesgue measure that generalizes the concept of the length of an interval in the real line. The set $S = \mathbb{Q} \cap [a, b]$ of Exercise 1.3 is a countably infinite set. That is, the points of $S$ can be arranged into a single infinite sequence: $S = \{s_n \mid n \in \mathbb{N}\}$. Each point is an interval of length zero. Thus it will need to be the case that the Lebesgue measure
\[ l(S) = \sum_{n=1}^{\infty} l\{s_n\} = 0. \]
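The conclusion that a countable set has measure zero can also be approached by covering. The sketch below is a side illustration that assumes a property not yet discussed at this point, namely that the measure of a set is at most the total length of any sequence of intervals covering it. Under that assumption, one covers the $n$th point by an interval of length $\varepsilon / 2^n$. The name `total_cover_length` is hypothetical.

```python
from fractions import Fraction

def total_cover_length(eps, terms):
    """Total length of intervals I_1, ..., I_terms, where I_n has length eps / 2**n."""
    return sum(eps / 2 ** n for n in range(1, terms + 1))

eps = Fraction(1, 1000)
for terms in (1, 5, 20, 50):
    print(terms, float(total_cover_length(eps, terms)))
```

However many points are covered, the total length stays below $\varepsilon$, and $\varepsilon$ can be chosen as small as one likes.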
If the reader finds it believable in advance that the Lebesgue integral of a constant function on a measurable set will be that constant times the Lebesgue measure of the set, then
\[ \int_a^b 1_S(x)\,dx = 1 \cdot 0 = 0 \]
in the sense of Lebesgue integration. The reader can understand at this point why Lebesgue measure is required to be only countably additive. If Lebesgue measure were to be uncountably additive,³ then every set would have measure zero because every set is a disjoint union of singleton sets. Thus the theory of Lebesgue measure would collapse. It will be seen in the coming chapters that the Lebesgue measure of each interval on the line will be its Euclidean length, that each Riemann integrable function will still be Lebesgue integrable, and that the value of that integral will be unchanged. Thus the reader is advised not to forget everything that he or she has learned before!
³ One could define a concept of the sum of an uncountable family $\{a_\alpha \mid \alpha \in A\}$ of nonnegative real numbers indexed by an uncountable set $A$. For example, the sum could be taken to mean the supremum of the sums over all countable subsets of $A$. It is a simple exercise to show that, with this definition, a sum must be infinite unless $a_\alpha = 0$ for all $\alpha$ outside some countable subset of $A$.
EXAMPLE 1.4
The fact that the Lebesgue measure of a countable set, such as the set of rational numbers, is zero will resolve another shortcoming of the Riemann integral. Let $S = \mathbb{Q} \cap [0, 1] = \{q_n \mid n \in \mathbb{N}\}$, as before. Define the functions
\[ f_n(x) = \begin{cases} n & \text{if } x \in \{q_1, \ldots, q_n\}, \\ 0 & \text{if } x \in [0, 1] \setminus \{q_1, \ldots, q_n\}. \end{cases} \]
It is easy to see that $f_n(x)$ diverges to $\infty$ on the dense set $S$, whereas $f_n(x) \to 0$ at every other value of $x \in [0, 1]$. We can define a function
\[ f(x) = \begin{cases} \infty & \text{if } x \in S, \\ 0 & \text{if } x \in [0, 1] \setminus S. \end{cases} \]
This function $f$ is not real-valued at the points of $S$; we say that it is extended real-valued. But because the set $S$ has Lebesgue measure zero, it will turn out that $f$ is Lebesgue integrable and that $\int_0^1 f(x)\,dx = 0$, in the sense of Lebesgue. Thus we do have
\[ \lim_{n \to \infty} \int_0^1 f_n(x)\,dx = \int_0^1 f(x)\,dx \]
despite the fact that the pointwise limit of $f_n$ exists only in the extended real-valued sense. Here we have benefited from the fact that the functions $f_n$ are uniformly bounded, except on a set of measure zero. The reader should note that the function $f$ is not even improperly Riemann integrable in any plausible sense.
Before proceeding to the task at hand, we explain why it is necessary to develop both Lebesgue measure and the Lebesgue integral for functions mapping a domain $X$ that is an abstract set into the set $\mathbb{R}$ of real numbers. One reason for working at this level of generality, in which $X$ is simply a set (not necessarily a set of real numbers) and $f : X \to \mathbb{R}$, is that it is important to define the Lebesgue integral for functions of several variables. That is, we wish also to be able to integrate $f : \mathbb{R}^n \to \mathbb{R}$. An element of $\mathbb{R}^n$ is not a real number, but rather an $n$-tuple of real numbers. Moreover, in higher analysis, both pure and applied, it is necessary to work with functions defined on groups, such as the important classical groups of matrices, and this requires knowledge of Lebesgue integration on abstract sets. Moreover, the study of Fourier inversion for functions defined on groups requires the introduction of a measure on what is called the dual object⁴ of the group, and the Fourier transform must be integrated on that object.
⁴ The dual object of a group is the set of all unitary equivalence classes of irreducible unitary representations of the underlying group. When endowed with a usable topology, such objects can be quite complex topologically.
Finally, there is a very important motivation for the abstract study of measure theory from probability. In a probability model, the outcomes of an experiment are pictured as points in a so-called sample space $X$. An event is conceptualized as a subset $E \subseteq X$. The idea behind this is that $E$ denotes the event that the experiment yields a result that is an element of $E$. For example, $X$ could be the real line. The experiment could be measuring the temperature of the mathematics classroom at 3 P.M. on a certain day. The interval $E = [80, 90]$ would represent the event that the temperature turns out to be between 80° and 90°F. In probability theory, one wishes very strongly to have a concept of the probability $p(E)$ that has the following properties:
i. The probability $p(E) \in [0,1]$ for each event $E$.
ii. The probability measure $p$ is additive on all countably infinite sequences of mutually disjoint events.

As discussed above, this cannot be done for all subsets of $\mathbb{R}$, at least not with a measure that generalizes reasonably the length of an interval. Thus for probability also, we must study the concept of measurable sets and the measure of such a set.

The sample space for a probability model need not be a subset of the real line. For example, Brownian motion is the type of motion exhibited by a particle suspended in a fluid. In order to study Brownian motion by means of probability theory, one must place a measure on the set of all possible paths that a Brownian motion may follow.$^{5}$ Since the sample space of an experiment could be a set quite different from $\mathbb{R}$, we must develop the theory of measure and integration on abstract sets.

$^{5}$ It turns out that these paths are continuous, but they are nowhere differentiable with probability 1!
CHAPTER 2
FIELDS, BOREL FIELDS, AND MEASURES
In this chapter we introduce the families of subsets of a general set $X$ on which we will be able to ascertain the existence of a countably additive measure. The concept will agree with that of the length of an interval in the real line, and with that of the volume of a rectangular box in Euclidean space of arbitrary finite dimension.

2.1 FIELDS, MONOTONE CLASSES, AND BOREL FIELDS

Let $X$ be an abstract set. We will denote by $\mathfrak{P}(X)$ the power set of $X$, which is the set of all the subsets of $X$.

Definition 2.1.1 A nonempty family $\mathfrak{A} \subseteq \mathfrak{P}(X)$ is called a field of sets (also called an algebra of sets) provided that for all $A, B \in \mathfrak{A}$ we have
$$A \cup B \in \mathfrak{A}, \qquad A \cap B \in \mathfrak{A}, \qquad X \setminus A \in \mathfrak{A}.$$

Measure and Integration: A Concise Introduction to Real Analysis. By Leonard F. Richardson. Copyright © 2009 John Wiley & Sons, Inc.
Often, we will begin with a family of sets that we wish to have as elements of a field of sets, and we would like to know the minimal field that contains these sets.
Definition 2.1.2 Let $\mathfrak{A}$ be any family of subsets of $X$. Define the field $\mathbb{F}(\mathfrak{A})$, generated by $\mathfrak{A}$, to be the intersection of all the fields that contain $\mathfrak{A}$.

To see that there exists a field containing $\mathfrak{A}$, do Exercise 2.6. One can see that $\mathbb{F}(\mathfrak{A})$ is a field, since if $A$ and $B$ are in every field that contains $\mathfrak{A}$, the same is true of $A \cup B$, $A \cap B$, $X \setminus A$, and $X \setminus B$. Observe that $\mathbb{F}(\mathfrak{A})$ is contained in every field that contains $\mathfrak{A}$. Thus it is reasonable to call $\mathbb{F}(\mathfrak{A})$ the minimal field that contains $\mathfrak{A}$.

EXAMPLE 2.1

Let $X = [0,1) = \{x \mid 0 \le x < 1\}$ be the real unit interval that is left-closed and right-open. Let $\mathfrak{E}$ denote the family of all the disjoint unions of finitely many intervals of the form $[a,b) \subseteq [0,1)$.$^{6}$ It is easy to see that $\mathfrak{E}$ is a field, since the union of two sets of this form, and the complement in $[0,1)$ of a set of this form, can again be written as a disjoint union of finitely many intervals of the given form. This field will play an important role in the development of Lebesgue measure on the real line, and it is called the field of elementary sets in $[0,1)$. The reader should check, as an example, that the field generated by the set of all intervals of the form $[a,b) \subseteq [0,1)$ is the field $\mathfrak{E}$ of all elementary sets in $[0,1)$.

It is easy to generalize this example to define the field of elementary sets in any finite interval $[-N, N)$, for $N \in \mathbb{N}$. We can take an elementary set to be any finite union of left-closed, right-open intervals in $[-N, N)$.

EXERCISES
2.1 Show that the concept of a field of sets would be the same if we had omitted closure under unions from the three criteria listed in Definition 2.1.1.

2.2 Show that if $\mathfrak{A}$ is a field of subsets of $X$, then $X \in \mathfrak{A}$ and the empty set $\emptyset \in \mathfrak{A}$.

It is easy to see that a field is a family $\mathfrak{A}$ of subsets of $X$ that is closed under the operations of taking unions and intersections of finitely many members of $\mathfrak{A}$ as well as closed under complementation.
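As a concrete illustration of these closure properties, the field of elementary sets of Example 2.1 can be modeled on a computer. The following is only a minimal sketch and not part of the original development: the function names `normalize`, `union`, `complement`, and `intersection` are hypothetical, an elementary set is assumed to be stored as a tuple of pairs `(a, b)` standing for $[a,b)$, and exact rational endpoints are assumed so that no rounding occurs.

```python
from fractions import Fraction

def normalize(intervals):
    """Return the sorted, merged, pairwise disjoint pieces [a, b) with a < b."""
    pieces = sorted((a, b) for a, b in intervals if a < b)
    merged = []
    for a, b in pieces:
        if merged and a <= merged[-1][1]:
            # Overlapping or abutting pieces are fused into one interval.
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return tuple(merged)

def union(E, F):
    """Union of two elementary sets, again as disjoint half-open pieces."""
    return normalize(tuple(E) + tuple(F))

def complement(E, lo=0, hi=1):
    """Complement of E inside [lo, hi): the gaps between consecutive pieces."""
    gaps, left = [], lo
    for a, b in normalize(E):
        gaps.append((left, a))
        left = b
    gaps.append((left, hi))
    return normalize(gaps)  # empty gaps such as [0, 0) are discarded here

def intersection(E, F):
    """Intersection via De Morgan, as suggested by Exercise 2.1."""
    return complement(union(complement(E), complement(F)))

Q = Fraction
E = normalize([(Q(0), Q(1, 4)), (Q(1, 2), Q(3, 4))])   # [0,1/4) u [1/2,3/4)
F = normalize([(Q(1, 8), Q(5, 8))])                    # [1/8,5/8)
print(union(E, F), complement(E), intersection(E, F))
```

Each operation returns another finite disjoint union of left-closed, right-open intervals, which is the closure property that makes the family a field. Defining `intersection` through complements and unions is a design choice made here to mirror Exercise 2.1, not the only possibility.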
Definition 2.1.3 A family $\mathfrak{A} \subseteq \mathfrak{P}(X)$ is called a monotone class if and only if it has the following two properties:
i. If $A_1 \subseteq A_2 \subseteq \cdots \subseteq A_n \subseteq \cdots$ is an increasing sequence of sets $A_n \in \mathfrak{A}$, then $\bigcup_{n=1}^{\infty} A_n \in \mathfrak{A}$.
ii. If $A_1 \supseteq A_2 \supseteq \cdots \supseteq A_n \supseteq \cdots$ is a decreasing sequence of sets $A_n \in \mathfrak{A}$, then $\bigcap_{n=1}^{\infty} A_n \in \mathfrak{A}$.

In words, a monotone class is a family $\mathfrak{A}$ of subsets of $X$ that is closed under the operations of taking unions of countable increasing sequences of members of $\mathfrak{A}$ and of taking intersections of countable decreasing sequences of members of $\mathfrak{A}$.

$^{6}$ Left-closed, right-open intervals are used only to make it easy to describe a field of elementary sets. The convenience is that the complement of $[a,b) \subseteq [0,1)$ is a union of finitely many intervals of the same form. Later we will define the Borel sets and we will see that all intervals, and all open sets and closed sets, are among the Borel sets.
Definition 2.1.4 A Borel field is a field $\mathfrak{A}$ that is also a monotone class.

Theorem 2.1.1 A family $\mathfrak{A}$ of subsets of $X$ is a Borel field if and only if it is closed under the operations of complementation and of taking unions of countable sequences of elements of $\mathfrak{A}$.

Remark 2.1.1 Because of Theorem 2.1.1, a Borel field is often called either a $\sigma$-field or a $\sigma$-algebra. The letter $\sigma$ connotes set-theoretic summation, or union, of arbitrary countable sequences.
EXERCISES

2.3 Prove Theorem 2.1.1.

2.4 Let $f : X \to Y$ be any function from one set into another. Let $\mathfrak{A} \subseteq \mathfrak{P}(Y)$ be any $\sigma$-algebra of subsets of $Y$. Prove: The family
$$f^{-1}(\mathfrak{A}) = \{ f^{-1}(A) \mid A \in \mathfrak{A} \}$$
is also a $\sigma$-algebra.

2.5 Show that the field of elementary sets in $[0,1)$, as defined in Example 2.1, is not a Borel field. Hint: consider
Definition 2.1.5 Let $\mathfrak{A}$ be any family of subsets of $X$. Define the Borel field, $\mathbb{B}(\mathfrak{A})$, generated by $\mathfrak{A}$, to be the intersection of all the $\sigma$-fields that contain $\mathfrak{A}$, and define the monotone class, $\mathbb{M}(\mathfrak{A})$, generated by $\mathfrak{A}$ to be the intersection of all the monotone classes that contain $\mathfrak{A}$. An element of $\mathbb{B}(\mathfrak{A})$ is called a Borel set.

Whether or not a subset $S$ of an abstract set $X$ is a Borel set depends upon what field $\mathfrak{A}$ is used to determine the Borel field $\mathbb{B}(\mathfrak{A})$. For certain important examples, such as the real line $\mathbb{R}$ or Euclidean space $\mathbb{R}^n$, the choice of $\mathfrak{A}$ will be the field of elementary sets defined in Examples 2.2-2.4.
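On a finite set $X$ the generated field can be computed by brute force, which may help make the "intersection of all fields" definition less abstract. The sketch below is illustrative only: the name `generated_field` is hypothetical, and it relies on the assumption that $X$ is finite, in which case every field is automatically a Borel field, so the same loop yields both $\mathbb{F}(\mathfrak{A})$ and $\mathbb{B}(\mathfrak{A})$. Nothing of the kind is available for the infinite sets that are the real subject of this chapter.

```python
from itertools import combinations

def generated_field(X, family):
    """Smallest family of subsets of the finite set X that contains `family`
    and is closed under complementation and pairwise union."""
    X = frozenset(X)
    # Every field contains X and the empty set (Exercise 2.2), so seed with them.
    field = {frozenset(A) for A in family} | {X, frozenset()}
    changed = True
    while changed:
        changed = False
        current = list(field)
        for A in current:
            if X - A not in field:          # close under complements
                field.add(X - A)
                changed = True
        for A, B in combinations(current, 2):
            if A | B not in field:          # close under unions
                field.add(A | B)
                changed = True
    return field

# Hypothetical example: the field generated by {1} and {2} inside {1, 2, 3, 4}.
field = generated_field({1, 2, 3, 4}, [{1}, {2}])
print(sorted(sorted(A) for A in field))
```

The loop terminates because the power set of a finite set is finite. Closure under intersections is not checked separately, in keeping with Exercise 2.1.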
EXERCISES
2.6 Let $\mathfrak{A}$ be any family of subsets of $X$. Show that $\mathfrak{A}$ is contained in each of the following: a field, a Borel field, and a monotone class.

2.7 Give an example of an infinite set $X$ and a family $\mathfrak{A} \subseteq \mathfrak{P}(X)$ for which $\mathbb{M}(\mathfrak{A})$ is not a field of sets.

The following theorem will be useful.

Theorem 2.1.2 If $\mathfrak{A}$ is a field of subsets of $X$, then $\mathbb{M}(\mathfrak{A}) = \mathbb{B}(\mathfrak{A})$.

Proof: Since each Borel field is a monotone class, it follows that $\mathbb{M}(\mathfrak{A}) \subseteq \mathbb{B}(\mathfrak{A})$. This much would be true even if $\mathfrak{A}$ had not been a field. Thus it will suffice to prove that $\mathbb{M}(\mathfrak{A}) \supseteq \mathbb{B}(\mathfrak{A})$. Because $\mathbb{M}(\mathfrak{A})$ is a monotone class, it will suffice to prove that $\mathbb{M}(\mathfrak{A})$ is a field (and consequently a Borel field as well). Thus it will suffice to show that if $A$ and $B$ lie in $\mathbb{M}(\mathfrak{A})$, we must have
i. $A \cap B \in \mathbb{M}(\mathfrak{A})$ and
ii. $X \setminus A \in \mathbb{M}(\mathfrak{A})$.
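We present the rather subtle proof of these two requirements as follows. We begin by fixing temporarily $A \in \mathbb{M}(\mathfrak{A})$ and defining
$$\mathfrak{B}_A = \{ B \in \mathfrak{P}(X) \mid A \cap B \in \mathbb{M}(\mathfrak{A}) \}. \tag{2.1}$$
We will prove that $\mathfrak{B}_A$ must be a monotone class. Suppose that $B_n$ is an increasing sequence of elements of $\mathfrak{B}_A$. We wish to show that $\bigcup_{n=1}^{\infty} B_n \in \mathfrak{B}_A$. But
$$A \cap \bigcup_{n=1}^{\infty} B_n = \bigcup_{n=1}^{\infty} (A \cap B_n) \in \mathbb{M}(\mathfrak{A}),$$
since the latter set is a monotone class and since $A \cap B_n$ is an increasing sequence of sets in $\mathbb{M}(\mathfrak{A})$. The reader should do Exercise 2.8 to prove that $\mathfrak{B}_A$ is a monotone class.

I. Now that we know that $\mathfrak{B}_A$ is a monotone class, to complete the proof of (i) we need to prove that $\mathfrak{B}_A \supseteq \mathfrak{A}$ so that we will know that $\mathfrak{B}_A \supseteq \mathbb{M}(\mathfrak{A})$.

(a) We start with a special case by restricting the set $A$ to be in $\mathfrak{A}$, fixed temporarily. That $\mathfrak{B}_A \supseteq \mathfrak{A}$ follows in this special case from the fact that $\mathfrak{A}$ is a field. Since we know already that $\mathfrak{B}_A$ is a monotone class, and since in subcase (a) we know that $\mathfrak{B}_A \supseteq \mathfrak{A}$, it follows that $\mathfrak{B}_A \supseteq \mathbb{M}(\mathfrak{A})$. Thus we have shown that $\mathbb{M}(\mathfrak{A})$ is closed under the operation of taking intersections with all $A \in \mathfrak{A}$.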
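(b) For the full requirement (i), we relax the initial restriction and allow $A \in \mathbb{M}(\mathfrak{A})$ to be fixed arbitrarily as in Equation (2.1), and we let $\mathfrak{B}_A$ be defined still by Equation (2.1). And now we know from subcase (a) that $\mathfrak{B}_A \supseteq \mathfrak{A}$: if $B \in \mathfrak{A}$, then subcase (a), applied with $B$ in the role of the fixed set, gives $A \cap B \in \mathbb{M}(\mathfrak{A})$. Since $\mathfrak{B}_A$ is a monotone class, it follows that $\mathfrak{B}_A \supseteq \mathbb{M}(\mathfrak{A})$, which is therefore closed under the operation of intersection.

II. Let
$$\mathfrak{A}' = \{ A \in \mathfrak{P}(X) \mid X \setminus A \in \mathbb{M}(\mathfrak{A}) \}.$$
Since $\mathfrak{A}' \supseteq \mathfrak{A}$ because $\mathfrak{A}$ is a field, it suffices to show that $\mathfrak{A}'$ is a monotone class and therefore contains $\mathbb{M}(\mathfrak{A})$. So we let $A_n$ be an increasing sequence of elements of $\mathfrak{A}'$ and we wish to show that $\bigcup_{n=1}^{\infty} A_n \in \mathfrak{A}'$. But
$$X \setminus \bigcup_{n=1}^{\infty} A_n = \bigcap_{n=1}^{\infty} (X \setminus A_n) \in \mathbb{M}(\mathfrak{A})$$
because $\bigcap_{n=1}^{\infty} (X \setminus A_n)$ is the intersection of a decreasing sequence of sets in the monotone class $\mathbb{M}(\mathfrak{A})$. Exercise 2.9 will complete the proof of Theorem 2.1.2.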
EXERCISES

2.8 To complete the proof that $\mathfrak{B}_A$ is a monotone class, the reader should prove that $\mathfrak{B}_A$, as defined in Equation (2.1), is closed under the operation of taking intersections of decreasing sequences in $\mathfrak{B}_A$.

2.9 To complete the proof of Theorem 2.1.2, the reader should prove that $\mathfrak{A}'$ is closed under the operation of intersection of decreasing sequences of sets.

2.10 Give an example of a set $X$ and a subset $\mathfrak{A} \subseteq \mathfrak{P}(X)$ that is not a field, yet $\mathbb{M}(\mathfrak{A}) = \mathbb{B}(\mathfrak{A})$.
Remark 2.1.2 The reader should not be surprised if the proof of the preceding theorem seems quite abstract. We will study a variety of theorems to help us understand which sets are Borel sets or measurable sets in familiar examples. But we will also learn theorems that show us that our intuition regarding Borel sets and measurable sets is limited. The reason that the proof of Theorem 2.1.2 does not give us a sense of intuition about the family of Borel sets is that $\mathbb{B}(\mathfrak{A})$ is defined as the intersection of a huge family of $\sigma$-fields containing $\mathfrak{A}$. The reader has a very easy example of a $\sigma$-field that contains every $\mathfrak{A}$, but the totality of such $\sigma$-fields is immense.

Be alert to the different contexts in which the term Borel set is used. In this book, in the context of the real line, a Borel set means a set that is in the Borel field generated by the field of finite unions of left-closed, right-open intervals. We will see that this is the same as the Borel field generated by the family of open sets or by the family of closed sets. In the context of an abstract field $\mathfrak{A}$ within the power set $\mathfrak{P}(X)$ of an abstract set $X$, the term Borel set refers to any element of the Borel field $\mathbb{B}(\mathfrak{A})$ generated by $\mathfrak{A}$.

EXAMPLE 2.2

Let $\mathfrak{E}$ be the field of elementary subsets of the unit interval $[0,1)$, as in Example 2.1. The elements $E \in \mathbb{B}(\mathfrak{E})$ are called the Borel sets of the unit interval. Sometimes $\mathbb{B}(\mathfrak{E})$ is denoted as $\mathfrak{B}(X)$ where $X = [0,1)$, so that the notation connotes the Borel subsets of the unit interval with respect to the standard field of elementary sets. The reader should note that every interval within $[0,1)$ is a Borel set, regardless of whether it is half-closed, closed, or open.$^{7}$ But this scarcely scratches the surface of the family of Borel sets, as the reader will see later in Example 3.11.

Having observed earlier, in Example 2.1, that there is an immediate extension of the concept of an elementary set to any finite interval $[-N, N) \subset \mathbb{R}$, we can extend the concept of a Borel set to $[-N, N)$ as well. Moreover, a subset $S$ of the real line $\mathbb{R}$ itself is called Borel provided that $S \cap [-N, N)$ is Borel for each $N \in \mathbb{N}$.

EXAMPLE 2.3
Let
$$X = [0,1) \times [0,1) = [0,1)^2,$$
the unit square in the plane $\mathbb{R}^2$. Let $I_k$ and $J_k$ denote left-closed, right-open subintervals of $[0,1)$ for each $k = 1, \ldots, n$. Let $\mathfrak{E}$ denote the family of all subsets of $X$ of the form
$$E = \bigcup_{k=1}^{n} I_k \times J_k,$$
where $n \in \mathbb{N}$ is arbitrary but finite and varying, depending on the choice of $E \in \mathfrak{E}$. The reader should be able to generalize this example to cover elementary sets and Borel sets in $X_N = [-N, N) \times [-N, N)$ for each $N \in \mathbb{N}$. A subset $S$ of $\mathbb{R}^2$ is called a Borel set in $\mathbb{R}^2$ if $S \cap X_N$ is Borel for each $N \in \mathbb{N}$.
EXERCISE

2.11 Prove that the set $\mathfrak{E}$ of Example 2.3 is a field.

The field $\mathfrak{E}$ of Example 2.3 is called the field of elementary sets in the unit square, and $\mathbb{B}(\mathfrak{E})$ is called the field of Borel sets in the unit square, also denoted as $\mathbb{B}([0,1)^2)$.

$^{7}$ This is true because each singleton set, $\{x\}$, is a Borel set, as the reader should prove easily.
EXAMPLE 2.4

Let
$$X = [0,1)^k,$$
a $k$-fold Cartesian product of copies of the unit interval. The set $X$ is called the unit cube in $\mathbb{R}^k$. Let $\mathfrak{E}$ denote the field of all finite unions of $k$-dimensional boxes of the form $E = I_1 \times \cdots \times I_k$, where each $I_j$ is an interval of the form $[a_j, b_j) \subseteq [0,1)$. Then the set $\mathbb{B}(\mathfrak{E})$ is called the field of Borel sets of the unit cube in $\mathbb{R}^k$. We can extend the concepts of elementary set and Borel set to any cube $[-N, N)^k$ for $N \in \mathbb{N}$. And a subset $S \subseteq \mathbb{R}^k$ is called a Borel set provided that $S \cap [-N, N)^k$ is a Borel set for each $N \in \mathbb{N}$.
EXAMPLE 2.5

Here we define Borel sets in the product of infinitely many copies of the unit interval $[0,1)$, the Tychonoff cube $X$ of arbitrary infinite dimension.$^{8}$ Let $\Gamma$ be any arbitrary (infinite) index set. (We do not assume that $\Gamma$ is countable.) For each $\gamma \in \Gamma$, let $S_\gamma = [0,1)$, the unit interval. According to the Axiom of Choice [16], there exists a function $x$ on $\Gamma$ such that for each $\gamma \in \Gamma$ we have $x(\gamma) \in S_\gamma$. Thus the function $x$ chooses one element from each set $S_\gamma$. We define
$$X = \prod_{\gamma \in \Gamma} S_\gamma = \{ x \mid x(\gamma) \in S_\gamma \ \ \forall \gamma \in \Gamma \}.$$
The Axiom of Choice tells us that the Tychonoff cube $X$ is nonempty. Intuitively, we think of $x(\gamma)$ as representing the $\gamma$th coordinate of the point $x \in X$.

Now let $\Lambda$ be any finite subset of $\Gamma$. For each $\gamma \in \Lambda$ we specify some subinterval $[a_\gamma, b_\gamma) \subseteq S_\gamma$. By a cylinder set in $X$ we mean a set $R \subseteq X$ such that $x \in R$ if and only if
$$x(\gamma) \in \begin{cases} [a_\gamma, b_\gamma) & \text{if } \gamma \in \Lambda, \\ S_\gamma & \text{if } \gamma \in \Gamma \setminus \Lambda. \end{cases}$$
This means that a cylinder set $R$ is defined by restricting only some finite collection of coordinates to particular subintervals of $S_\gamma$. We define an elementary set in $X$ to be the union
$$E = \bigcup_{k=1}^{n} R_k$$
of finitely many cylinder sets $R_k$ in $X$. The set $\mathfrak{E}$ of elementary sets is a field, and $\mathbb{B}(\mathfrak{E})$ is the set of Borel sets in the Tychonoff cube $X$.

$^{8}$ It is common to take the Tychonoff cube to be an arbitrary product of copies of the closed unit interval $[0,1]$, so that the Tychonoff cube will be compact. The set $X$ defined here is a subset of the compact cube.
Remark 2.1.3 The discussion up to this point has presented what may be called an external construction of the Borel sets corresponding to a field of elementary sets. The word external connotes the fact that $\mathbb{B}(\mathfrak{E})$ is defined as the intersection of all Borel fields that contain $\mathfrak{E}$. For those who have studied transfinite ordinal numbers and transfinite induction [16], we sketch briefly here an internal construction of $\mathbb{B}(\mathfrak{E})$. The interested reader can consult reference [8] for details.

For the case of Euclidean space $\mathbb{R}^n$ or the cube therein, the procedure may be described briefly as follows. We begin with a field $\mathfrak{A}$ that happens not to be a Borel field, meaning that it is not a monotone class. So we construct a larger field by tacking on the unions and intersections of all increasing and decreasing nested sequences of sets from $\mathfrak{A}$. The resulting family is called $\mathfrak{A}_1$. The same procedure is applied to subsets of $\mathfrak{A}_1$ in order to augment $\mathfrak{A}_1$ so as to produce $\mathfrak{A}_2$. We do this for each finite ordinal number $n$, and then for the first infinite ordinal, $\omega$, we define the union of all the infinitely many families already constructed to be $\mathfrak{A}_\omega$. Then we apply the original procedure to $\mathfrak{A}_\omega$ to produce $\mathfrak{A}_{\omega+1}$, etc. At each nonlimit ordinal number we apply the original procedure, and at each limit ordinal, we take the union over all its predecessors and repeat the process. When this has been done for all ordinal numbers $\alpha < \Omega$, the first uncountable ordinal, the process stops, having reached a Borel field. The constructed field is the same one, called $\mathbb{B}(\mathfrak{A})$, that we obtained much more easily by the external method.

A good example of an interesting Borel set that the reader will meet is the Cantor set, to be defined in Example 3.11. It is a common but serious error to imagine that a subset of the interval $[0,1)$ is a Borel set if and only if it is a union or an intersection of countably many intervals, or intervals of the specified type. This is false, and the elaborate transfinite induction we have described suggests this fact.

2.2 ADDITIVE MEASURES

Let $\mathbb{R}^* = \mathbb{R} \cup \{\infty, -\infty\}$ denote the extended real number system, and
$$[0, \infty] = \{ x \in \mathbb{R}^* \mid 0 \le x \le \infty \}.$$
Let $X$ be an abstract set and $\mathfrak{A}$ a field of subsets of $X$.

Definition 2.2.1 A function $\mu : \mathfrak{A} \to [0, \infty]$ is called a finitely additive measure if and only if
i. If $A = \bigcup_{i=1}^{n} A_i$ is the union of finitely many mutually disjoint sets $A_i \in \mathfrak{A}$, then
$$\mu(A) = \sum_{i=1}^{n} \mu(A_i).$$
ii. $\mu(\emptyset) = 0$.

If $\mu(X) < \infty$, then $\mu$ is called finite; otherwise, $\mu$ is called infinite. The measure $\mu$ is called approximately finite if, for all $A \in \mathfrak{A}$ with $\mu(A) = \infty$ and for all real numbers $M > 0$, there exists $B \in \mathfrak{A}$ with $B \subseteq A$ such that $M < \mu(B) < \infty$.
EXAMPLE 2.6

Let $X$ be any infinite set and $\mathfrak{A} = \mathfrak{P}(X)$, the power set of $X$. Let $\mu(A) = 0$ if $A = \emptyset$ and $\mu(A) = \infty$ if $A \neq \emptyset$. Then $\mu$ is finitely additive but it is not approximately finite. Next, define a finitely additive measure $\nu$ by letting $\nu(A)$ be the number of elements in $A$ if $A$ is finite, and letting $\nu(A) = \infty$ if $A$ is infinite. Then $\nu$ is approximately finite.

EXAMPLE 2.7

Let $X = [0,1)$, the unit interval, and let $\mathfrak{E}$ be the field of elementary sets in the unit interval. Thus if $E \in \mathfrak{E}$, we can express
$$E = \bigcup_{i=1}^{n} [a_i, b_i),$$
a disjoint union of finitely many left-closed, right-open intervals.$^{9}$ Suppose we are given a nondecreasing function $f : [0,1] \to \mathbb{R}$ for which $f(0) = 0$. We wish to define
$$\mu(E) = \sum_{i=1}^{n} \big( f(b_i) - f(a_i) \big)$$
and to show that $\mu$ is a finite, finitely additive measure on $\mathfrak{E}$. The work is left to Exercise 2.13.

Because finitely additive measures are not sufficient for the purposes of analysis, we make the following definition.
Definition 2.2.2 A countably additive measure on a field $\mathfrak{A}$ is a finitely additive measure $\mu$ with the following property. If $A_i \in \mathfrak{A}$ is an infinite sequence of disjoint sets for which
$$A = \bigcup_{i=1}^{\infty} A_i \in \mathfrak{A},$$
then we have
$$\mu(A) = \sum_{i=1}^{\infty} \mu(A_i).$$

$^{9}$ Here we understand the convention to be that $[a, b) = \{ x \in \mathbb{R} \mid a \le x < b \}$, with $a \le b$. Note that $[a, a) = \emptyset$.
Note that we do not require $\mathfrak{A}$ to be a $\sigma$-field in the preceding definition.

Definition 2.2.3 A measure $\mu$ that is defined on a field $\mathfrak{A} \subseteq \mathfrak{P}(X)$ is called $\sigma$-finite if $X = \bigcup_{i \in \mathbb{N}} X_i$, with $X_i \in \mathfrak{A}$ and $\mu(X_i) < \infty$ for each $i \in \mathbb{N}$.

Note that a finite measure is $\sigma$-finite as well, because $X = \bigcup_{i \in \mathbb{N}} X$, the union of an infinite sequence of copies of $X$ itself.
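The finite (hence $\sigma$-finite) measure of Example 2.7 is a convenient one to experiment with numerically. The following sketch is an illustration under stated assumptions, not the author's construction: the helper name `mu` and the particular choice $f(x) = x^2$ are hypothetical, an elementary set is assumed to be given as a list of disjoint pairs `(a, b)` standing for $[a, b)$, and exact rationals are used so that equalities can be tested without rounding.

```python
from fractions import Fraction

def mu(f, pieces):
    """Sum of f(b) - f(a) over the disjoint pieces [a, b) of an elementary set."""
    return sum(f(b) - f(a) for a, b in pieces)

def f(x):
    # Hypothetical nondecreasing function on [0, 1] with f(0) = 0.
    return x * x

Q = Fraction
one_piece = [(Q(0), Q(1, 2))]                        # [0, 1/2)
two_pieces = [(Q(0), Q(1, 4)), (Q(1, 4), Q(1, 2))]   # the same set, cut at 1/4
rest = [(Q(1, 2), Q(1))]                             # [1/2, 1)

# The two decompositions of [0, 1/2) receive the same measure (telescoping sum),
# and the measures of [0, 1/2) and [1/2, 1) add up to mu(X) = f(1) - f(0).
print(mu(f, one_piece), mu(f, two_pieces), mu(f, one_piece) + mu(f, rest))
```

The agreement of the first two values is a single instance of the well-definedness that Exercise 2.13 asks the reader to prove in general; a numerical check of this kind is of course no substitute for that proof.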
EXERCISES

2.12 Show that condition (ii) of Definition 2.2.1 can be replaced by the following condition: (ii') $\mu(\emptyset) < \infty$.

2.13 The decomposition of $E$ used to define $\mu$ in Example 2.7 is not unique. Prove that $\mu$ is nonetheless well defined. Prove also that there is a bijection between the set of all finite, finitely additive measures on $\mathfrak{E}$ in that example and the set of all nondecreasing real-valued functions $f$ on $[0,1]$ for which $f(0) = 0$.

2.14 Prove that every infinite $\sigma$-finite countably additive measure $\mu$ is approximately finite.

2.15 Let $X$ be any set, and let $\mathfrak{A} = \mathfrak{P}(X)$, the power set of $X$. Define the counting measure $\nu(E)$ to be the cardinality of $E$ for each finite set $E \subseteq X$. Let $\nu(E) = \infty$ for each infinite subset of $X$.
a) Prove that $\mathfrak{A}$ is a $\sigma$-field in $\mathfrak{P}(X)$, and that $\nu$ is countably additive on $\mathfrak{A}$.$^{10}$
b) Prove that $\nu$ is a $\sigma$-finite measure on $\mathfrak{A}$ if and only if $X$ is at most a countably infinite set.

2.16 Give an example of a finitely additive measure $\mu$ on the set $\mathbb{N}$ of all natural numbers such that $\mu$ is not countably additive.$^{11}$

2.3 CARATHÉODORY OUTER MEASURE

In this section we define still another type of set function, called a Carathéodory outer measure. We will show that for this type of set function, there exists a special class of sets, called measurable sets, on which the set function is a countably additive measure. To begin with, we do not assume that the Carathéodory outer measure is even finitely additive, but rather only that it is what is called countably subadditive. (In the next section, we will determine the necessary and sufficient conditions for defining a Carathéodory outer measure that extends a given finitely additive measure on a given field of sets.)

$^{10}$ This exercise establishes that the triplet $(X, \mathfrak{A}, \nu)$ is a measure space, in the sense that will be presented in Definition 3.3.1.

$^{11}$ An example of a finitely additive measure, defined on suitable subsets of the Euclidean plane, that fails to be countably additive will be provided by Exercise 3.28.
Definition 2.3.1 Let $X$ be a nonempty set, and let $\mu : \mathfrak{P}(X) \to [0, \infty]$. Then $\mu$ is called a Carathéodory outer measure if and only if it satisfies the following three conditions:
i. $\mu(\emptyset) = 0$.
ii. $A \subseteq B \implies \mu(A) \le \mu(B)$, which is called monotonicity.$^{12}$
iii. $A \subseteq \bigcup_{i=1}^{\infty} A_i \implies \mu(A) \le \sum_{i=1}^{\infty} \mu(A_i)$, which is called countable subadditivity.

Next, we define the $\sigma$-field $\mathfrak{A}$ of subsets $A \in \mathfrak{P}(X)$ on which a Carathéodory outer measure will turn out to be countably additive.

Definition 2.3.2 Let $\mu$ be a Carathéodory outer measure defined on $\mathfrak{P}(X)$. A subset $A \subseteq X$ is called $\mu$-measurable$^{13}$ if, and only if, for all $W \in \mathfrak{P}(X)$ we have
$$\mu(W) = \mu(W \cap A) + \mu(W \cap A^c),$$
where $A^c = X \setminus A$, the complement of $A$.

We observe that without the condition in this definition, all we would have known is that $\mu(W) \le \mu(W \cap A) + \mu(W \cap A^c)$. We could describe the definition in words as stating that $A$ is $\mu$-measurable if, and only if, $A$ and its complement split every set $W$ additively with respect to the Carathéodory outer measure $\mu$. Next, we will show that the family of all $\mu$-measurable sets $A$ is a $\sigma$-field on which $\mu$ is a countably additive measure.
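For a very small set $X$ the splitting condition of Definition 2.3.2 can be tested exhaustively, which shows how restrictive it can be. The sketch below is illustrative only: the names `powerset`, `measurable_sets`, and `mu` are hypothetical, and the set function `mu` is an assumed toy example (value 0 on $\emptyset$, 2 on $X$, and 1 on every other subset of a three-point set), which one can check by hand to be a Carathéodory outer measure.

```python
from itertools import chain, combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    X = list(X)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(X, r) for r in range(len(X) + 1))]

def measurable_sets(X, mu):
    """All A such that mu(W) == mu(W & A) + mu(W - A) for every W in P(X)."""
    subsets = powerset(X)
    return [A for A in subsets
            if all(mu(W) == mu(W & A) + mu(W - A) for W in subsets)]

X = frozenset({0, 1, 2})

def mu(A):
    # Hypothetical outer measure: 0 on the empty set, 2 on X, 1 otherwise.
    if not A:
        return 0
    return 2 if len(A) == len(X) else 1

# For this mu only the empty set and X split every test set W additively.
print(measurable_sets(X, mu))
# For the counting measure (Python's built-in len) every subset is measurable.
print(len(measurable_sets(X, len)))
```

In the first case a singleton such as $\{0\}$ fails the test with $W = \{0, 1\}$, since $\mu(W) = 1$ while $\mu(\{0\}) + \mu(\{1\}) = 2$. The family of measurable sets is nonetheless a $\sigma$-field, in agreement with the theorem that follows.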
Theorem 2.3.1 Let $\mu$ be a Carathéodory outer measure defined on $\mathfrak{P}(X)$. Then the family $\mathfrak{A}$ of all $\mu$-measurable subsets of $X$ is a $\sigma$-field, and $\mu$ is countably additive on $\mathfrak{A}$.

Proof: It will suffice to prove the following three statements:
i. $A \in \mathfrak{A}$ implies that $A^c = X \setminus A \in \mathfrak{A}$.
ii. $A$ and $B$ in $\mathfrak{A}$ implies that $A \cap B \in \mathfrak{A}$.
iii. If the mutually disjoint sets $A_i$ lie in $\mathfrak{A}$ for all $i \in \mathbb{N}$, then
$$A = \bigcup_{i=1}^{\infty} A_i \in \mathfrak{A} \quad \text{and} \quad \mu(A) = \sum_{i=1}^{\infty} \mu(A_i).$$

Note that properties (i) and (ii) will suffice to establish that $\mathfrak{A}$ is a field. The field property is important for the proof that property (iii) suffices to establish closure of $\mathfrak{A}$ under countable unions even for nondisjoint sequences of sets in $\mathfrak{A}$.$^{14}$ Otherwise, one might think that property (ii) is superfluous in light of Theorem 2.1.1.

We begin by proving (i). But this is immediate since if $A$ is $\mu$-measurable, we know for each $W \in \mathfrak{P}(X)$ that
$$\mu(W) = \mu(W \cap A) + \mu(W \cap A^c),$$
and this implies that $A^c$ is measurable as well by the symmetry of the definition.

$^{12}$ This property is redundant, but is stated here for emphasis because it is important. See Exercise 2.17.

$^{13}$ For a simpler but equivalent definition in the special case in which $\mu(X) < \infty$, see Theorem 2.3.2.

Next, we prove part (ii). We must show that if $A$ and $B$ are $\mu$-measurable, then
$$\mu(W) = \mu(W \cap (A \cap B)) + \mu(W \cap (A \cap B)^c),$$
where $(A \cap B)^c = A^c \cup B^c$. We know that
$$\begin{aligned} \mu(W) &= \mu(W \cap A) + \mu(W \cap A^c) \\ &= \mu(W \cap (A \cap B)) + \mu(W \cap (A \cap B^c)) + \mu(W \cap A^c) \\ &\ge \mu(W \cap (A \cap B)) + \mu(W \cap (A \cap B)^c), \end{aligned}$$
because we have
$$(A \cap B)^c = A^c \cup B^c = A^c \cup (A \cap B^c).$$
This means that the subadditive measure $\mu$ is also superadditive when we split $W$ using the pair $A \cap B$ and $(A \cap B)^c$. Hence $W$ is split additively and $A \cap B$ is measurable.

Finally, we need to prove part (iii). Keep in mind that the sets $A_i$ are mutually disjoint. For arbitrary $W$ we have
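$$\mu(W) = \mu\Big(W \cap \bigcup_{i=1}^{n} A_i\Big) + \mu\Big(W \cap \Big(\bigcup_{i=1}^{n} A_i\Big)^c\Big) \ge \sum_{i=1}^{n} \mu(W \cap A_i) + \mu(W \cap A^c) \tag{2.2}$$
by monotonicity of $\mu$. (The finite union is measurable because $\mathfrak{A}$ is a field, and the first term equals $\sum_{i=1}^{n} \mu(W \cap A_i)$ because the sets $A_i$ are disjoint and measurable.) Since this is true for all $n$, we have also that
$$\mu(W) \ge \sum_{i=1}^{\infty} \mu(W \cap A_i) + \mu(W \cap A^c).$$
Also, by subadditivity,
$$\mu(W \cap A) \le \sum_{i=1}^{\infty} \mu(W \cap A_i).$$
Hence Equation (2.2) tells us that
$$\mu(W) \ge \mu(W \cap A) + \mu(W \cap A^c),$$
and then subadditivity of $\mu$ implies that
$$\mu(W) = \mu(W \cap A) + \mu(W \cap A^c).$$
It follows that $A \in \mathfrak{A}$, so that $\mathfrak{A}$ is a $\sigma$-field by virtue of Exercise 2.18. And by subadditivity applied to Inequality (2.2), we have
$$\mu(W) = \sum_{i=1}^{\infty} \mu(W \cap A_i) + \mu(W \cap A^c). \tag{2.3}$$
Applying Equation (2.3) to $A$ itself in place of $W$, we see that
$$\mu(A) = \sum_{i=1}^{\infty} \mu(A_i),$$
because $\mu(\emptyset) = 0$. The proof will be completed by Exercise 2.18.

The preceding theorem gives us a way of obtaining a countably additive measure on an identifiable $\sigma$-field of measurable sets, provided we can come up with a suitable Carathéodory outer measure. The Hopf Extension Theorem of the next section will make use of this theorem to enable us to start with a finitely additive measure on a field $\mathfrak{A}$ (of elementary sets) and extend that measure to a countably additive measure defined at least on the set $\mathbb{B}(\mathfrak{A})$, the Borel field generated by $\mathfrak{A}$.

$^{14}$ See Exercise 2.18.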
EXERCISES

2.17 In Definition 2.3.1, show that properties (i) and (iii) imply (ii). Moreover, $\mu$ is also finitely subadditive.

2.18 Show that, in part (iii) of the proof of Theorem 2.3.1, closure under countable disjoint unions implies closure under countable arbitrary unions.
2.4 E. HOPF'S EXTENSION THEOREM
Suppose we are given an abstract set $X$, a field $\mathfrak{A} \subseteq \mathfrak{P}(X)$, and a finitely additive measure $\mu$ defined on $\mathfrak{A}$. We ask under what conditions it will be possible to extend $\mu$ to a countably additive measure on the $\sigma$-field $\mathbb{B}(\mathfrak{A})$ and whether such an extension will be unique. We will call $\mathfrak{A}$ a field of abstract elementary sets. The field $\mathbb{B}(\mathfrak{A})$ will be called a field of abstract Borel sets.

In a typical application of the Hopf Extension Theorem, we will have a specific set $X$ for the underlying space and a specific field $\mathfrak{A}$ that we will use for elementary sets for that space. Usually there will be some naturally defined finitely additive measure $\mu$ on $\mathfrak{A}$. For example, in the next chapter, we will pick the field of elementary sets to be the one that is generated by the left-closed, right-open intervals in the real line, and we will use Euclidean length to determine the finitely additive measure $\mu$. Then the Hopf Extension Theorem will be used to produce Lebesgue measure. Different choices of elementary sets and of finitely additive measures of elementary sets may be made to produce different countably additive measures on the real line or elsewhere.

Recall that, by Definition 2.2.2, a finitely additive measure $\mu$ is said to be countably additive on a field $\mathfrak{A}$ if and only if for each sequence of disjoint sets $A_i \in \mathfrak{A}$, such that $A = \bigcup_{i \in \mathbb{N}} A_i \in \mathfrak{A}$, we have
$$\mu(A) = \sum_{i=1}^{\infty} \mu(A_i).$$
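Theorem 2.4.1 Let $\mu$ be a finitely additive measure on a field $\mathfrak{A} \subseteq \mathfrak{P}(X)$. In order that there exist a countably additive extension $\mu^*$ of $\mu$ defined on a $\sigma$-field $\mathfrak{A}^* \supseteq \mathfrak{A}$, it is necessary and sufficient that $\mu$ be countably additive on $\mathfrak{A}$.$^{15}$ Moreover, if $(X, \mathfrak{A}, \mu)$ is $\sigma$-finite, then the extension $\mu^*$ is unique.

Proof: We prove first the necessity of the condition. We assume that there exists a countably additive extension $\mu^*$ of $\mu$. Since it is an extension, we must have $\mu^*(A) = \mu(A)$ for all $A \in \mathfrak{A}$. Thus, if
$$A = \bigcup_{i \in \mathbb{N}} A_i \in \mathfrak{A},$$
with the sets $A_i \in \mathfrak{A}$ for all $i$ being mutually disjoint, then
$$\mu(A) = \mu^*(A) = \sum_{i \in \mathbb{N}} \mu^*(A_i) = \sum_{i \in \mathbb{N}} \mu(A_i).$$
Thus $\mu$ is countably additive on $\mathfrak{A}$.

Next, we will prove the sufficiency of the condition. We will define the function $\mu^*$ on the power set $\mathfrak{P}(X)$ by
$$\mu^*(B) = \inf \Big\{ \sum_{i=1}^{\infty} \mu(B_i) \ \Big|\ B \subseteq \bigcup_{i=1}^{\infty} B_i,\ B_i \in \mathfrak{A} \Big\}, \tag{2.4}$$

$^{15}$ The reader should note that such a $\sigma$-field $\mathfrak{A}^*$ must contain $\mathbb{B}(\mathfrak{A})$ as well.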
where the infimum is taken over all countable coverings of $B$ by sets $B_i$ in $\mathfrak{A}$. Recall that every field $\mathfrak{A}$ includes as an element the whole set $X$, so that the family of countable coverings is nonempty. Note that in Equation (2.4), we do not require the sets $B_i$ to be disjoint. We will prove first that $\mu^*$ is a Carathéodory outer measure. There are three properties to be confirmed:

1. $\mu^*(\emptyset) = 0$. This is easy to confirm, since $\emptyset \in \mathfrak{A}$, so that it can be covered by itself.

2. If $A \subseteq B$, then $\mu^*(A) \le \mu^*(B)$. This follows immediately from the fact that every covering of $B$ covers $A$ as well, so that the infimum defining $\mu^*(A)$ is taken over a larger family of coverings and is therefore no larger.

3. If $B \subseteq \bigcup_{n=1}^{\infty} B_n$, we must show that $\mu^*(B) \le \sum_{n=1}^{\infty} \mu^*(B_n)$. (Here, $B$ and $B_n$ are arbitrary subsets of $X$: They need not be elements of $\mathfrak{A}$.) This will be immediate if
$$\sum_{n=1}^{\infty} \mu^*(B_n) = \infty.$$
So we will assume that $\sum_{n=1}^{\infty} \mu^*(B_n) < \infty$ and we will prove (3) as follows. Let $\varepsilon > 0$. We can cover each set $B_n \subseteq \bigcup_{i=1}^{\infty} B_{n,i}$ in such a way that each $B_{n,i} \in \mathfrak{A}$ and
$$\sum_{i=1}^{\infty} \mu(B_{n,i}) < \mu^*(B_n) + \frac{\varepsilon}{2^n}.$$
Since the family $\{ B_{n,i} \mid i \in \mathbb{N},\ n \in \mathbb{N} \}$ is countable$^{16}$ and covers $B$, it follows from absolute convergence that
$$\mu^*(B) \le \sum_{n \in \mathbb{N},\, i \in \mathbb{N}} \mu(B_{n,i}) \overset{(i)}{=} \sum_{n=1}^{\infty} \sum_{i=1}^{\infty} \mu(B_{n,i}) \le \sum_{n=1}^{\infty} \mu^*(B_n) + \varepsilon$$
for all $\varepsilon > 0$. Here we use the fact that the union of countably many countable sets is countable. For the equality (i) we have used the properties of absolutely convergent double series$^{17}$ with respect to summation first by rows and then by columns, or vice versa. This is sufficient to prove (3).

Next, we must show that $\mu^*$ is an extension of $\mu$ and that $\mathfrak{A}$ is contained within the family of $\mu^*$-measurable sets.

a. We will show that for all $A \in \mathfrak{A}$, we have $\mu^*(A) = \mu(A)$. What is clear is that since $A \subseteq A \cup \emptyset \cup \emptyset \cup \cdots$ we must have $\mu^*(A) \le \mu(A)$. In order to prove the opposite inequality, we will make use of the hypothesis of countable additivity on $\mathfrak{A}$, as in Definition 2.2.2. So, suppose we have a countable covering of $A$: $A \subseteq \bigcup_{i=1}^{\infty} A_i$, with each $A_i \in \mathfrak{A}$. We can construct from this covering a countable disjoint covering by sets $B_i \in \mathfrak{A}$ defined as follows: Let $B_1 = A_1 \cap A$, $B_2 = (A_2 \cap A) \setminus B_1$, and in general
$$B_n = (A_n \cap A) \setminus \bigcup_{i=1}^{n-1} B_i.$$
Now $A$ is covered by a countable disjoint family of sets $B_i \in \mathfrak{A}$. Countable additivity on $\mathfrak{A}$ tells us that
$$\mu(A) = \sum_{i=1}^{\infty} \mu(B_i) \le \sum_{i=1}^{\infty} \mu(A_i)$$
for all countable covers by sets $A_i$, where we are using the monotonicity of $\mu$. Therefore, $\mu(A) \le \mu^*(A)$. Hence $\mu(A) = \mu^*(A)$.

b. Since we know that $\mu^*$ is a Carathéodory outer measure, we can denote by $\mathfrak{A}^*$ the family of all $\mu^*$-measurable sets, as in Definition 2.3.2. We know that $\mathfrak{A}^*$ is a $\sigma$-field. If we can show that $\mathfrak{A} \subseteq \mathfrak{A}^*$, then we will know that $\mu^*$ is an extension of $\mu$, and it will follow also that $\mathbb{B}(\mathfrak{A}) \subseteq \mathfrak{A}^*$. So, we let $A \in \mathfrak{A}$, and we let $W \in \mathfrak{P}(X)$. According to Definition 2.3.2, we must prove that
$$\mu^*(W) = \mu^*(W \cap A) + \mu^*(W \cap A^c).$$
Because of subadditivity of $\mu^*$, it will suffice to prove that
$$\mu^*(W) \ge \mu^*(W \cap A) + \mu^*(W \cap A^c).$$

$^{16}$ If $C$ is any countable collection of nonnegative numbers, then the concept of summation of that countable set of nonnegative numbers is as follows:
$$\sum_{c \in C} c = \sup \Big\{ \sum_{c \in F} c \ \Big|\ F \subseteq C,\ F \text{ finite} \Big\},$$
meaning that $F$ is required to be finite. In advanced calculus, this is treated also for sums of real numbers of arbitrary sign, provided that the sum as defined above is finite for the absolute values. See, for example, [20].

$^{17}$ A double series is a sum of the form $\sum_{i \in \mathbb{N},\, j \in \mathbb{N}} x_{i,j}$, and absolute convergence implies that
$$\sum_{i \in \mathbb{N},\, j \in \mathbb{N}} x_{i,j} = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} x_{i,j} = \sum_{j=1}^{\infty} \sum_{i=1}^{\infty} x_{i,j},$$
which the reader can find in many advanced calculus texts, such as [20]. This is also a special case of Fubini's Theorem, which is Theorem 6.2.2 in the present text.
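This inequality would be immediate if $\mu^*(W) = \infty$, so we will assume that $\mu^*(W) < \infty$.

To this end, we let $\varepsilon > 0$. There exist $A_i \in \mathfrak{A}$ such that $W \subseteq \bigcup_{i \in \mathbb{N}} A_i$ and
$$\sum_{i=1}^{\infty} \mu(A_i) < \mu^*(W) + \varepsilon.$$
Note that
$$\mu^*(W \cap A) \le \sum_{i=1}^{\infty} \mu(A_i \cap A)$$
and $\mu^*(W \cap A^c) \le \sum_{i=1}^{\infty} \mu(A_i \cap A^c)$. Thus
$$\mu^*(W \cap A) + \mu^*(W \cap A^c) \le \sum_{i=1}^{\infty} \mu(A_i) < \mu^*(W) + \varepsilon,$$
using absolute convergence of the series and finite additivity of $\mu$ on $\mathfrak{A}$. Since $\varepsilon > 0$ is arbitrary, the proof is complete except for uniqueness, which is treated in Exercise 2.21.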
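The construction in the proof can be traced by hand, or by machine, in a toy case. The following sketch is only an illustration under strong assumptions: $X$ has four points, the field consists of $\emptyset$, two "atoms" $P$ and $Q$, and $X$, and the weights $\mu(P) = 1$, $\mu(Q) = 2$ are arbitrary choices. Because the field is finite and $\mu \ge 0$, repeating a set in a covering can only increase the sum, so the infimum in Equation (2.4) is assumed here to be computable as a minimum over nonempty sub-families of the field. All function names are hypothetical.

```python
from itertools import chain, combinations

def nonempty_subfamilies(sets):
    """All nonempty sub-families of a finite family of sets."""
    sets = list(sets)
    return chain.from_iterable(
        combinations(sets, r) for r in range(1, len(sets) + 1))

def outer_measure(B, field, mu):
    """mu*(B): the least total mu over sub-families of `field` that cover B."""
    B = frozenset(B)
    return min(sum(mu[A] for A in cover)
               for cover in nonempty_subfamilies(field)
               if B <= frozenset().union(*cover))

def powerset(X):
    X = list(X)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(X, r) for r in range(len(X) + 1))]

def is_measurable(A, X, field, mu):
    """Caratheodory test of Definition 2.3.2 for the outer measure mu*."""
    A = frozenset(A)
    return all(
        outer_measure(W, field, mu)
        == outer_measure(W & A, field, mu) + outer_measure(W - A, field, mu)
        for W in powerset(X))

# Hypothetical data: a four-point set with a field generated by two atoms.
X = frozenset({0, 1, 2, 3})
P, Q = frozenset({0, 1}), frozenset({2, 3})
field = [frozenset(), P, Q, X]
mu = {frozenset(): 0, P: 1, Q: 2, X: 3}

print(outer_measure({0}, field, mu), outer_measure({0, 2}, field, mu))
print(is_measurable(P, X, field, mu), is_measurable({0}, X, field, mu))
```

In this example $\mu^*$ agrees with $\mu$ on the field, and the atom $P$ passes the measurability test while the singleton $\{0\}$ does not, since $\mu^*(P) = 1$ but $\mu^*(\{0\}) + \mu^*(\{1\}) = 2$. This matches steps (a) and (b) of the proof in a case where every sum is finite.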
We turn our attention next to the questions of the uniqueness and regularity of the extension.

Definition 2.4.1 A Carathéodory outer measure $\mu$ defined on $\mathfrak{P}(X)$ is called regular with respect to a field $\mathfrak{A} \subseteq \mathfrak{P}(X)$ if and only if for each $A \in \mathfrak{P}(X)$ there exists a Borel set $B \in \mathfrak{B}(\mathfrak{A})$ such that $A \subseteq B$ and $\mu(A) = \mu(B)$.$^{18}$
EXERCISES

2.19 Let $\mu$ be a finite, finitely additive measure$^{19}$ on a field $\mathfrak{A} \subseteq \mathfrak{P}(X)$. Prove$^{20}$ that $\mu$ is countably additive on $\mathfrak{A}$ if and only if for each decreasing sequence
$$A_1 \supseteq A_2 \supseteq \cdots \supseteq A_n \supseteq \cdots$$
with $\bigcap_{n=1}^{\infty} A_n = \emptyset$, we have $\mu(A_n) \to 0$.

2.20 The Carathéodory outer measure on $\mathfrak{P}(X)$, constructed in Theorem 2.4.1, is regular. Hint: Let $A \in \mathfrak{P}(X)$. For each $n \in \mathbb{N}$ pick
$$B_n = \bigcup_{i \in \mathbb{N}} A_{n,i} \supseteq A,$$
with each $A_{n,i} \in \mathfrak{A}$, for which $\mu^*(B_n) < \mu^*(A) + \frac{1}{n}$.

18 See Exercise 2.20.
19 Finiteness of the measure means that $\mu(X) < \infty$.
20 Variations on this exercise appear in Exercises 5.20, 5.24, and 8.10.
2.21 Suppose that both $\mu_1$ and $\mu_2$ are countably additive extensions of the measure $\mu$ from the field $\mathfrak{A}$ to $\mathfrak{B}(\mathfrak{A})$. Suppose that $\mu$ is countably additive on $\mathfrak{A}$.
a) Suppose that $\mu(X) < \infty$. Prove that $\mu_1(B) = \mu_2(B)$ for all $B \in \mathfrak{B}(\mathfrak{A})$. (Hint: Show that the set
$$\{B \in \mathfrak{B}(\mathfrak{A}) \mid \mu_1(B) = \mu_2(B)\}$$
is closed under complementation and under taking unions of increasing sequences, making it a monotone class that contains $\mathfrak{A}$. The finiteness of $\mu(X)$ will be helpful for complementation.)
b) Now replace the hypothesis that $\mu(X) < \infty$ in part (a) with the hypothesis that $\mu$ is $\sigma$-finite on $X$. Prove that $\mu_1(B) = \mu_2(B)$ for all $B \in \mathfrak{B}(\mathfrak{A})$. (Hint: Use Corollary 2.4.1.)
Theorem 2.4.2 Let $\mu$ be a regular Carathéodory outer measure on $\mathfrak{P}(X)$ such that $\mu(X) < \infty$. Then a subset $A \subseteq X$ is $\mu$-measurable if and only if
$$\mu(X) = \mu(A) + \mu(A^c).$$

Proof: Necessity is immediate, so we will prove sufficiency. We need to prove for each $W \in \mathfrak{P}(X)$ that
$$\mu(W) = \mu(W \cap A) + \mu(W \cap A^c).$$
Because of regularity, there exists a measurable set $V \supseteq W$ such that $\mu(V) = \mu(W)$. Observe that, since $V$ is measurable, we have
$$\mu(A) = \mu(A \cap V) + \mu(A \cap V^c), \qquad \mu(A^c) = \mu(A^c \cap V) + \mu(A^c \cap V^c). \tag{2.5}$$
By hypothesis and by Equations (2.5),
$$\mu(X) = \mu(A) + \mu(A^c) = \mu(V) + \mu(V^c) = [\mu(A \cap V) + \mu(A^c \cap V)] + [\mu(A \cap V^c) + \mu(A^c \cap V^c)].$$
On the other hand, it follows from subadditivity that
$$\mu(A \cap V) + \mu(A^c \cap V) \ge \mu(V), \qquad \mu(A \cap V^c) + \mu(A^c \cap V^c) \ge \mu(V^c).$$
Since all of these quantities are finite and the two left sides add up to $\mu(V) + \mu(V^c)$, it follows that
$$\mu(A \cap V) + \mu(A^c \cap V) = \mu(V), \qquad \mu(A \cap V^c) + \mu(A^c \cap V^c) = \mu(V^c).$$
By the choice of $V$ and by the preceding equations,
$$\mu(W) = \mu(V) = \mu(V \cap A) + \mu(V \cap A^c) \ge \mu(W \cap A) + \mu(W \cap A^c) \ge \mu(W).$$
Hence
$$\mu(W \cap A) + \mu(W \cap A^c) = \mu(W),$$
and $A$ is measurable. ∎

2.4.1 Fields, $\sigma$-Fields, and Measures Inherited by a Subset
In Definition 3.3.1, we will see that a triplet $(X, \mathfrak{A}, \mu)$ is called a measure space, provided that $X$ is a set, $\mathfrak{A} \subseteq \mathfrak{P}(X)$ is a $\sigma$-field, and $\mu$ is a countably additive measure defined on $\mathfrak{A}$.

Definition 2.4.2 A triplet $(X, \mathfrak{A}, \mu)$ is a premeasure space provided that $X$ is a set, $\mathfrak{A} \subseteq \mathfrak{P}(X)$ is a field, and $\mu$ is a finitely additive measure defined on $\mathfrak{A}$.

Thus the Hopf Extension Theorem provides necessary and sufficient conditions for a premeasure space to be extended to a full-fledged measure space. Note that there exist premeasure spaces that cannot be extended to measure spaces.$^{21}$

It is often useful to consider the restriction of a measure $\mu$, given to us in either a premeasure space or a measure space, to a subfield of the power set $\mathfrak{P}(S)$ for some set $S \in \mathfrak{A}$. An especially important instance is the situation in which $(X, \mathfrak{A}, \mu)$ is $\sigma$-finite, so that $X = \bigcup_{i \in \mathbb{N}} X_i$ with $\mu(X_i) < \infty$ for each $i \in \mathbb{N}$.

Definition 2.4.3 If $(X, \mathfrak{A}, \mu)$ is any premeasure space, define the premeasure space inherited by $S \in \mathfrak{A}$ to be the triplet $(S, \mathfrak{A}_S, \mu)$, where
$$\mathfrak{A}_S = \{A \cap S \mid A \in \mathfrak{A}\} \subseteq \mathfrak{P}(S),$$
and we retain the symbol $\mu$ for the restriction to $\mathfrak{A}_S$ of the given measure on $\mathfrak{A}$.

Since $\mathfrak{A}$ is a field, it is clear that $\mathfrak{A}_S \subseteq \mathfrak{A}$, so that $\mu$ is defined on $\mathfrak{A}_S$. Moreover, it is easily checked that $\mathfrak{A}_S$ is itself a subfield of the field $\mathfrak{P}(S)$, with the understanding that complementation in $\mathfrak{A}_S$ will be with respect to $S$, not with respect to $X$. That is, for $A \in \mathfrak{A}_S$, we define $A^c = S \setminus A$. Again, because $\mathfrak{A}$ is a field, the set $S^c$ also inherits a premeasure space from $(X, \mathfrak{A}, \mu)$.

21 For example, see Exercise 2.16.
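Theorem 2.4.3 If $(X, \mathfrak{A}, \mu)$ is any premeasure space and if $S \in \mathfrak{A}$, then an arbitrary set $B$ belongs to $\mathfrak{B}(\mathfrak{A})$ if and only if $B = B_1 \cup B_2$, where $B_1 \in \mathfrak{B}(\mathfrak{A}_S)$ and $B_2 \in \mathfrak{B}(\mathfrak{A}_{S^c})$.

We are to understand in this theorem that $\mathfrak{B}(\mathfrak{A}_S) \subseteq \mathfrak{P}(S)$. That is, we treat $S$ as the universal set in the definition of the Borel field generated by $\mathfrak{A}_S$.

Proof: We observe that if $\mathfrak{B}$ is a $\sigma$-field in $\mathfrak{P}(X)$ containing $\mathfrak{A}$, then both of the following two conditions are met: $\mathfrak{B}_S$ is a $\sigma$-field in $\mathfrak{P}(S)$ containing $\mathfrak{A}_S$, and $\mathfrak{B}_{S^c}$ is a $\sigma$-field in $\mathfrak{P}(S^c)$ containing $\mathfrak{A}_{S^c}$. Conversely, if $\mathfrak{B}_1$ is a $\sigma$-field in $\mathfrak{P}(S)$ containing $\mathfrak{A}_S$ and if $\mathfrak{B}_2$ is a $\sigma$-field in $\mathfrak{P}(S^c)$ containing $\mathfrak{A}_{S^c}$, then we define
$$\mathfrak{B} = \{B_1 \cup B_2 \mid B_1 \in \mathfrak{B}_1,\ B_2 \in \mathfrak{B}_2\}.$$
Then it is clear that $\mathfrak{B}_S = \mathfrak{B}_1$ and $\mathfrak{B}_{S^c} = \mathfrak{B}_2$, and that $\mathfrak{B}$ is a $\sigma$-field in $\mathfrak{P}(X)$ containing $\mathfrak{A}$. The conclusion follows from Definition 2.1.5, in which the Borel field generated by a given field is the intersection of all $\sigma$-fields containing the given field. ∎

We note that the preceding theorem does not involve $\mu$ and relates only to the pair $(X, \mathfrak{A})$, which is called a premeasurable space.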
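Corollary 2.4.1 Suppose $X = \bigcup_{i \in \mathbb{N}} X_i$, with each $X_i \in \mathfrak{A}$, a field contained in $\mathfrak{P}(X)$. Then $B \in \mathfrak{B}(\mathfrak{A})$ if and only if
$$B \cap X_i \in \mathfrak{B}(\mathfrak{A}_{X_i}) \quad \text{for each } i \in \mathbb{N}.$$

Proof: This is a countable adaptation of the proof of Theorem 2.4.3. The reader should check the details in the same manner as for that theorem. A set $\mathfrak{B}$ is a $\sigma$-field containing $\mathfrak{A}$ if and only if the following condition is met for each $i \in \mathbb{N}$: $\mathfrak{B}_{X_i} \supseteq \mathfrak{A}_{X_i}$, with $\mathfrak{B}_{X_i}$ being a $\sigma$-field in $\mathfrak{P}(X_i)$. ∎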
CHAPTER 3
LEBESGUE MEASURE
3.1 THE FINITE INTERVAL $[-N, N)$

Let $X_N = [-N, N) \subset \mathbb{R}$, a left-closed, right-open finite interval. Denote an interval $[a_k, b_k)$ by $J_k$. We will call a set $E = \bigcup_{k=1}^{n} J_k$ an elementary set in $X_N$, provided that
$$-N \le a_1 \le b_1 \le a_2 \le b_2 \le \cdots \le a_n \le b_n \le N,$$
and we will denote by $\mathfrak{E}$ the field of such elementary sets in $X_N$. The choice of left-closed, right-open intervals may appear to be very restrictive, but this choice makes $\mathfrak{E}$ easily into a field. And it is not a limitation in the long run, since we will show soon that each singleton $\{x\}$ turns out to be a Borel set in $X_N$, and thus every interval is a Borel set in $X_N$. Moreover, we will extend the study of Lebesgue measure to all of $\mathbb{R}$ by means of $\sigma$-finiteness. By Exercise 2.13, we can define a finitely additive measure $\mu$ on $\mathfrak{E}$ by the formula
$$\mu(E) = \sum_{k=1}^{n} (b_k - a_k)$$
for each $E \in \mathfrak{E}$. Then $\mu(X_N) = 2N$. Since this is a finite total measure, we will apply the result of Exercise 2.19 to prove in the next theorem the extendability of $\mu$ to a countably additive measure on $\mathfrak{B}(\mathfrak{E})$. Note that $\mu$ is simply the extension to the field generated by intervals $J_k$ of the measure that is Euclidean length on each interval $J_k$.
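The definitions above can be tried out on a small example. The following is only a sketch, assuming that an elementary set is represented as a list of pairs `(a, b)` standing for the intervals $[a, b)$ inside $[-N, N)$; the function names `normalize`, `measure`, and `complement` are illustrative choices, not notation from the text. It computes $\mu(E)$ and shows that the complement of an elementary set in $X_N$ is again a finite union of left-closed, right-open intervals.

```python
from fractions import Fraction

def normalize(intervals):
    """Merge half-open intervals [a, b) into a sorted list of disjoint ones."""
    merged = []
    for a, b in sorted((a, b) for a, b in intervals if a < b):
        if merged and a <= merged[-1][1]:
            # Overlapping or adjacent pieces are fused into one interval.
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def measure(intervals):
    """mu(E): the sum of the lengths b_k - a_k of the disjoint intervals."""
    return sum((b - a for a, b in normalize(intervals)), Fraction(0))

def complement(intervals, N):
    """Complement of E inside X_N = [-N, N), assuming E is contained in X_N."""
    result, left = [], Fraction(-N)
    for a, b in normalize(intervals):
        if left < a:
            result.append((left, a))
        left = b
    if left < N:
        result.append((left, Fraction(N)))
    return result

N = 2
E = [(Fraction(-2), Fraction(-1)), (Fraction(0), Fraction(1, 2))]
E_c = complement(E, N)
```

In this example the measures of `E` and `E_c` add up to $2N$, as finite additivity requires. Exact rational arithmetic is used so that the comparison of lengths is not affected by rounding.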
Theorem 3.1.1 Let $\mathfrak{E}$ be the field of disjoint unions of finitely many intervals $J_k = [a_k, b_k)$ in $[-N, N)$, as described above. There is a countably additive measure $\mu^*$ defined on $\mathfrak{B}(\mathfrak{E})$ that coincides with the measure $\mu$ on $\mathfrak{E}$, where $\mu$ of an interval is its Euclidean length.

Proof: By the Hopf Extension Theorem, Theorem 2.4.1, it is sufficient to prove that $\mu$ is countably additive on $\mathfrak{E}$. By Exercise 2.19, it suffices to show that for each decreasing sequence of elementary sets
$$E_1 \supseteq E_2 \supseteq \cdots \supseteq E_n \supseteq \cdots$$
such that $\bigcap_{n=1}^{\infty} E_n = \emptyset$, we have $\mu(E_n) \to 0$. If this were false, then there would exist $\delta > 0$ such that $\mu(E_n) \ge \delta$ for all $n \in \mathbb{N}$. For each $n \in \mathbb{N}$, we can pick an elementary set $E'_n$, the closure $\overline{E'_n}$ of which is a compact set$^{22}$ $\overline{E'_n} \subseteq E_n$, and such that
$$\mu(E_n \setminus E'_n) < \frac{\delta}{2^{n+1}}.$$
We should note that $\overline{E'_n}$ is not an elementary set, because it is not a union of finitely many left-closed, right-open intervals. Thus we take care not to apply $\mu$ to $\overline{E'_n}$. It is important to note also that the sets $E'_n$ need not form a decreasing sequence. Although
$$E_n \supseteq E_{n+1} \supseteq E'_{n+1},$$
the fact that $E'_n$ misses a part of $E_n$ having measure less than $\frac{\delta}{2^{n+1}}$ means that $E'_n$ misses at most that much of $E'_{n+1}$. The most we can conclude is that
$$\mu(E'_{n+1} \setminus E'_n) < \frac{\delta}{2^{n+1}}.$$
Also, $\bigcap_{n=1}^{K} E'_n \subseteq \bigcap_{n=1}^{K} E_n = E_K$, and
$$E_K \setminus \bigcap_{n=1}^{K} E'_n \subseteq \bigcup_{n=1}^{K} (E_n \setminus E'_n).$$
It follows that for each $K \in \mathbb{N}$ we have
$$\mu\Bigl(\bigcap_{n=1}^{K} E'_n\Bigr) \ge \mu(E_K) - \sum_{n=1}^{K} \frac{\delta}{2^{n+1}} > \delta - \frac{\delta}{2} = \frac{\delta}{2} > 0.$$
Thus
$$\bigcap_{n=1}^{K} \overline{E'_n} \supseteq \bigcap_{n=1}^{K} E'_n \ne \emptyset$$
for all $K \in \mathbb{N}$. Yet we know that
$$\bigcap_{n=1}^{\infty} \overline{E'_n} \subseteq \bigcap_{n=1}^{\infty} E_n = \emptyset.$$
But each $\overline{E'_n}$ is a closed subset of the compact set $[-N, N]$. By the finite intersection property (Exercise 3.1) for compact spaces, this is impossible.

Since the hypotheses of the Hopf Extension Theorem are satisfied, there is a countably additive measure $l$, the restriction of $\mu^*$ to the $\mu^*$-measurable sets in $\mathfrak{P}(X_N)$, that extends the concept of the length of an interval. The measure $l$ is called Lebesgue measure, and its domain is the set of all Lebesgue measurable sets in $X_N$. ∎

22 For each interval $[a, b)$ among the finitely many comprising the elementary set $E_n$, we use an interval of the form $[a, b - \delta)$ for a suitable, very small $\delta > 0$. Thus we can construct $E'_n$ so that its closure will be closed in both the topology of $\mathbb{R}$ and the relative topology of $[-N, N)$.
We have constructed a Lebesgue measure for $X_N$ for each $N \in \mathbb{N}$. It is important to note that the Lebesgue measure defined on $X_N$ is the restriction to $[-N, N)$ of the Lebesgue measure defined on $X_{N'}$ for each $N' > N$. This follows from Theorem 2.4.3 concerning the Borel field inherited by a subset and from Exercise 2.21 establishing the uniqueness of the Hopf extension measure for $\sigma$-finite measure spaces.

We saw in Exercise 2.20 that the outer measure constructed in the Hopf Extension Theorem (Theorem 2.4.1) is regular. This means that every set $S \in \mathfrak{P}(X)$ is a subset of a Borel set $B \in \mathfrak{B}(\mathfrak{A})$ having the same measure as the outer measure of $S$. We apply these concepts to Lebesgue measure constructed in the present section on the interval $[-N, N)$. The field of elementary sets $\mathfrak{E} \subset \mathfrak{B}(\mathfrak{E}) \subset \mathfrak{L}$, where we use the letter $\mathfrak{L}$ to denote the family of all measurable sets in the case of Lebesgue measure.

EXERCISE

3.1 A topological space $X$ is called compact provided that it has the Heine-Borel property: Every open covering of $X$ has a finite subcover.
a) Suppose $X$ is compact and $\mathfrak{F}$ is a family of closed subsets of $X$ such that
$$\bigcap_{F \in \mathfrak{F}} F = \emptyset.$$
Prove that there is a finite subcollection of $\mathfrak{F}$ with empty intersection.
b) Prove the finite intersection property for compact sets: If every intersection of finitely many members of $\mathfrak{F}$ is nonempty, then
$$\bigcap_{F \in \mathfrak{F}} F \ne \emptyset.$$
3.2 MEASURABLE SETS, BOREL SETS, AND THE REAL LINE

The following theorem is concerned with abstract measures on sets, as constructed using the Hopf Extension Theorem. We should recall Exercise 2.20, which tells us that if a countably additive measure $\mu$ is generated by the Hopf Extension Theorem from a finitely additive measure $\mu$ on a field $\mathfrak{A}$, then the outer measure $\mu^*$ is regular in the following sense: For each set $S \in \mathfrak{P}(X)$, there is a Borel set $B \in \mathfrak{B}(\mathfrak{A})$ such that $S \subseteq B$ and $\mu^*(S) = \mu(B)$.

Theorem 3.2.1 Let $\mu$ be a finite, countably additive measure, constructed from some finitely additive measure on a field $\mathfrak{A} \subseteq \mathfrak{P}(X)$, using the Hopf Extension Theorem (2.4.1). Then a set $S$ in $\mathfrak{P}(X)$ is measurable$^{23}$ if and only if there exist Borel sets $B_*$ and $B^*$ in $\mathfrak{B}(\mathfrak{A})$ such that
$$B_* \subseteq S \subseteq B^* \quad \text{and} \quad \mu(B_*) = \mu^*(S) = \mu(B^*),$$
where $\mu^*$ is the outer measure defined on $\mathfrak{P}(X)$, as in the proof of the Hopf Extension Theorem.

Proof: We begin with necessity. Suppose that $S$ is measurable. Then we know that $\mu(S) + \mu(S^c) = \mu(X)$. Because $\mu^*$ is regular, there exists a Borel set $B^* \supseteq S$ such that $\mu(S) = \mu(B^*)$. We know that the measurable sets form a $\sigma$-field, so $S^c$ is measurable too. So there exists a Borel set $B \supseteq S^c$ such that $\mu(S^c) = \mu(B)$. If we let $B_* = B^c$, then $B_* \subseteq S$ and $\mu(B_*) = \mu(X) - \mu(B) = \mu(X) - \mu(S^c) = \mu(S)$.

Next, we prove sufficiency. Suppose there exist Borel sets $B_* \subseteq S \subseteq B^*$ such that $\mu(B_*) = \mu^*(S) = \mu(B^*)$. By Theorem 2.4.2, we know that $S$ will be measurable provided that
$$\mu^*(S) + \mu^*(S^c) = \mu(X).$$
Since $\mu^*$ is subadditive, it suffices to prove that $\mu^*(S) + \mu^*(S^c) \le \mu(X)$. We know that
$$\mu(B_*) + \mu((B_*)^c) = \mu(X) = \mu(B^*) + \mu((B^*)^c),$$
so that $\mu((B_*)^c) = \mu((B^*)^c)$. By hypothesis $\mu^*(S) = \mu(B^*)$, and by the monotonicity of $\mu^*$ we see that $\mu^*(S^c) \le \mu((B_*)^c) = \mu((B^*)^c)$. Thus
$$\mu^*(S) + \mu^*(S^c) \le \mu(B^*) + \mu((B^*)^c) = \mu(X).$$
∎

Corollary 3.2.1 Let $\mu$ be a $\sigma$-finite, countably additive measure, constructed from some finitely additive measure on a field $\mathfrak{A} \subseteq \mathfrak{P}(X)$, using the Hopf Extension Theorem (2.4.1). Then a set $S$ in $\mathfrak{P}(X)$ is measurable if and only if there exist Borel sets $B_*$ and $B^*$ in $\mathfrak{B}(\mathfrak{A})$ such that
$$B_* \subseteq S \subseteq B^* \quad \text{and} \quad \mu(B^* \setminus B_*) = 0.$$

Proof: Apply $\sigma$-finiteness, taking note of the fact that the union of countably many sets of $\mu$-measure zero will have $\mu$-measure zero. ∎

The reader should note that if $S$ is a measurable set, then $\mu^*(S) = \mu(S)$. That is, if $S \in \mathfrak{L}$, then there exist $B^*$ and $B_* \in \mathfrak{B} \subseteq \mathfrak{L}$ such that
$$\mu(B_*) = \mu(S) = \mu(B^*).$$

23 The term $\mu$-measurable would be used if there were any ambiguity regarding the measure under consideration.
This is the first result of several that will help us to explore the nature of Lebesgue measurable sets. The exercises below provide very important additional insights that will be useful frequently.
Definition 3.2.1 A set $B \subseteq \mathbb{R}$ is called an $F_\sigma$-set if and only if it can be expressed as the union of countably many closed sets. A set $B \subseteq \mathbb{R}$ is called a $G_\delta$-set if and only if it can be expressed as the intersection of countably many open sets.$^{24}$

In view of Exercise 3.2 below, $F_\sigma$-sets and $G_\delta$-sets are special types of Borel sets in the real line.

EXERCISES

3.2 Prove that every open set $G \subseteq [-N, N)$ is a Borel set and that every closed set $F \subseteq [-N, N)$ is a Borel set. In particular, each singleton set $\{p\}$ is a Borel set, and each interval is a Borel set. (Hint: You may use the fact from advanced calculus that every open subset of $\mathbb{R}$ is the union of countably many open intervals.)

3.3 Suppose that $S \subseteq [-N, N)$. Use Theorem 3.2.1 to prove that $S$ is a measurable subset of $[-N, N)$ if and only if for each $\epsilon > 0$ there exist an open set $G \supseteq S$ and a closed set $F \subseteq S$ such that $\mu(G) - \mu(F) < \epsilon$. (Hint: The measure of a Borel set is the same as its outer measure. To produce $G$, think about expanding the half-closed, half-open intervals very slightly to make them open.)

3.4 Prove that in Theorem 3.2.1 we can take the set $B_*$ to be an $F_\sigma$-set and the set $B^*$ to be a $G_\delta$-set. (Hint: Use Exercise 3.3.)

Remark 3.2.1 The measure $\mu^*$ on the power set of $X_N = [-N, N)$ that we have defined in this section is called Lebesgue outer measure and is denoted also as $l^*$. We define the Lebesgue inner measure by the formula
$$l_*(S) = l(X_N) - l^*(S^c).$$
It follows from Theorem 2.4.2 that $S \subseteq X_N$ is Lebesgue measurable if and only if $l_*(S) = l^*(S)$, in which case $l(S) = l_*(S) = l^*(S)$.

Definition 3.2.2 We will call the measure $l$ defined in Remark 3.2.1 Lebesgue measure on $X_N = [-N, N)$. The Lebesgue measurable sets in $X_N$ are the sets in the family
$$\mathfrak{L}(X_N) = \{A \in \mathfrak{P}(X_N) \mid l^*(W) = l^*(W \cap A) + l^*(W \cap A^c) \ \ \forall\, W \in \mathfrak{P}(X_N)\}.$$

24 These notations are universal in the mathematical literature. The name $G_\delta$ comes from the German word Gebiet, meaning a domain, which is normally open, and from the Greek $\delta$, analogous to the Roman d, for the German word Durchschnitt, meaning intersection. The name $F_\sigma$ comes from the French word fermé, meaning closed, and the Greek letter $\sigma$, analogous to the Roman s, for sum.
3.2.1 Lebesgue Measure on $\mathbb{R}$

Next, we extend the definitions of Borel sets, measurable sets, and Lebesgue measure to $\mathbb{R}$ from $X_N = [-N, N)$. We will apply Corollary 2.4.1 so as to make use of the $\sigma$-finiteness of $\mathbb{R}$ with respect to Lebesgue measure. One may replace the decomposition $\mathbb{R} = \bigcup_{N \in \mathbb{N}} X_N$ by
$$\mathbb{R} = \bigcup_{N \in \mathbb{N}} S_N,$$
where $S_N = X_N \setminus X_{N-1}$ is a sequence of mutually disjoint Borel sets of finite Lebesgue measure.

Definition 3.2.3 We say that $A \in \mathfrak{B}(\mathbb{R})$, the family of all Borel sets in the real line, if and only if $A \cap S_N \in \mathfrak{B}(S_N)$ for each $N \in \mathbb{N}$. Call $A \in \mathfrak{L}(\mathbb{R})$, the family of all measurable sets in the real line, if and only if $A \cap S_N \in \mathfrak{L}(S_N)$ for each $N \in \mathbb{N}$. Finally, define the Lebesgue measure of $A \in \mathfrak{L}(\mathbb{R})$ by
$$l(A) = \sum_{N \in \mathbb{N}} l(A \cap S_N).$$

Theorem 3.2.2 Let $\mathfrak{O}$ denote the set of all open subsets of $\mathbb{R}$. Then the family $\mathfrak{B}$ of all Borel sets in the line is generated by $\mathfrak{O}$:
$$\mathfrak{B}(\mathbb{R}) = \mathfrak{B}(\mathfrak{O}).$$

Proof: The reader will provide a proof in Exercise 3.5.
Remark 3.2.2 We began the study of Borel sets and of measurable sets in the real line by considering intervals of the form $[a, b)$, because we need Lebesgue measure to generalize the concept of the length of an interval. The use of left-closed, right-open intervals was a convenience for describing a field of sets so that we could use the theorem that the monotone class and Borel field generated by the elementary sets would be the same. At this point, it is advantageous to think of the $\sigma$-field of Borel sets in the real line as being generated by its topology, that is, by the family of open sets. First, this establishes that the family of Borel sets is canonical, meaning that it is independent of the choice of decomposition of $\mathbb{R}$ into the union of countably many elementary sets of finite measure. Moreover, this interpretation generalizes to topological spaces that are different from $\mathbb{R}$.

EXERCISES
3.5 Prove each of the following statements about sets $A \subseteq \mathbb{R}$. Let $\mathfrak{E}_N$ be the field of elementary sets in $S_N = X_N \setminus X_{N-1}$ and denote
$$\mathfrak{E}(\mathbb{R}) = \Bigl\{ E = \bigcup_{n=1}^{N} E_n \Bigm| E_n \in \mathfrak{E}_n \ \forall\, n \le N,\ N \in \mathbb{N} \Bigr\}.$$
a) Prove that $\mathfrak{B}(\mathbb{R})$ is the $\sigma$-field generated by the field $\mathfrak{E}(\mathbb{R})$. (Hint: Use Corollary 2.4.1.)
b) Prove Theorem 3.2.2: Show that $\mathfrak{B}(\mathbb{R})$ is the $\sigma$-field generated by the family of all open sets in $\mathbb{R}$, or equivalently, generated by the family of all closed sets.
c) Prove that $A \in \mathfrak{L}(\mathbb{R})$ if and only if there exist $B^*$ and $B_*$ in $\mathfrak{B}(\mathbb{R})$ such that $B_* \subseteq A \subseteq B^*$ and $l(B^* \setminus B_*) = 0$. Explain why it would be insufficient on $\mathbb{R}$ to know only that $\mu(B_*) = \mu(A) = \mu(B^*)$.
d) Show that $l$ is a countably additive measure on both $\mathfrak{B}(\mathbb{R})$ and $\mathfrak{L}(\mathbb{R})$. (Hint: You may use theorems from advanced calculus concerning absolutely convergent double series.)

3.6 Let $E$ be any measurable subset of $\mathbb{R}$. Define the set
$$E + c = \{e + c \mid e \in E\}.$$
We will prove the translation invariance of Lebesgue measure.
a) Suppose first that $E$ is an open set, and prove that $l(E) = l(E + c)$. (Hint: You may use the fact that every open subset of the real line is the union of countably many disjoint open intervals.)
b) Prove that translation preserves the family of $G_\delta$-sets, the family of $F_\sigma$-sets, and the family of Lebesgue null sets.
c) Prove that $E$ is measurable if and only if $E + c$ is measurable.
d) Prove that $l(E) = l(E + c)$ for each Lebesgue measurable set $E$.
e) Translations modulo 1: Now let $T_c : [0, 1) \to [0, 1)$ by
$$T_c(x) = x + c - \lfloor x + c \rfloor,$$
where the floor function $\lfloor x \rfloor$ denotes the greatest integer that does not exceed $x$. Prove that $T_c$ preserves both the measurability and the measure of each measurable subset of $[0, 1)$. (Hint: $T_c$ is the identity map if $c \in \mathbb{Z}$. Partition $E$ into two suitable subsets.)
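The effect of translation modulo 1 on a single interval can be seen in a short computation. This is a minimal sketch, not a solution to the exercise: it assumes the set is one interval $[a, b) \subseteq [0, 1)$ with rational endpoints, and the names `translate_mod_one` and `total_length` are illustrative only. The image is either one interval or two (when the translate straddles 1 and the overhang wraps around to 0), which is the partition into two pieces suggested in the hint.

```python
from fractions import Fraction
from math import floor

def translate_mod_one(interval, c):
    """Image of [a, b), a subset of [0, 1), under T_c(x) = x + c - floor(x + c)."""
    a, b = interval
    shift = c - floor(c)           # T_c depends only on the fractional part of c
    lo, hi = a + shift, b + shift  # the translated interval lies inside [0, 2)
    if hi <= 1:
        return [(lo, hi)]
    if lo >= 1:
        return [(lo - 1, hi - 1)]
    # The interval straddles 1: the part beyond 1 wraps around to start at 0.
    return [(Fraction(0), hi - 1), (lo, Fraction(1))]

def total_length(intervals):
    """Sum of the lengths of disjoint intervals [a, b)."""
    return sum((b - a for a, b in intervals), Fraction(0))

J = (Fraction(1, 2), Fraction(9, 10))
image = translate_mod_one(J, Fraction(1, 4))
```

Here `image` consists of two intervals whose lengths add up to the length of `J`, and an integer value of `c` returns `J` unchanged.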
3.7 Let $S = (\mathbb{R} \setminus \mathbb{Q}) \cap [0, 1]$, and let $l$ denote Lebesgue measure.
a) Show that $l(S) = 1$.
b) Let $\epsilon > 0$. Construct a closed subset $F \subseteq S$ such that $l(F) > 1 - \epsilon$.
c) Explain why it is impossible for $F$ to contain any interval of length greater than zero.

3.8 Let $S \in \mathfrak{L}$ be a Lebesgue measurable subset of $\mathbb{R}$. Suppose that for each finite interval $I \subseteq \mathbb{R}$ we have $l(S \cap I) < l(I)$. Prove that $l(S) = 0$.

3.9 Let $E \in \mathfrak{L}(\mathbb{R})$ be any Lebesgue measurable subset of the real line for which $l(E) < \infty$. Let $f(x) = l(E \cap (-\infty, x))$.
a) Prove that $f \in C(\mathbb{R})$, the vector space of continuous real-valued functions on the real line.
b) Let $\alpha \in (0, 1)$. Prove that there exists a measurable set $E_\alpha \subseteq E$ such that $l(E_\alpha) = \alpha\, l(E)$.

3.3 MEASURE SPACES AND COMPLETIONS
Definition 3.3.1 A measure space is any triplet $(X, \mathfrak{A}, \mu)$ where $X$ is a set, $\mathfrak{A}$ is a $\sigma$-field of subsets of $X$, and $\mu$ is a countably additive measure on $\mathfrak{A}$. In a measure space, a set $A \in \mathfrak{A}$ is called a null set provided that $\mu(A) = 0$. A measure space is called complete provided that every subset of a null set belongs to $\mathfrak{A}$. A pair $(X, \mathfrak{A})$ is called a measurable space provided that $\mathfrak{A}$ is a $\sigma$-field in $\mathfrak{P}(X)$.

EXERCISE

3.10 Let $X = \mathbb{R}$, the real line, and $\mathfrak{A} = \mathfrak{P}(\mathbb{R})$, the $\sigma$-algebra of all subsets of $X$. Define $\nu(E) = \#(E \cap \mathbb{Z})$, the number of integers in $E$.
a) Prove that $\nu$ is a countably additive measure on $\mathfrak{A}$.
b) Let $f(x) = \nu([0, x))$. Is $f$ continuous? Justify your answer.
c) Is $(X, \mathfrak{A}, \nu)$ complete? Justify your answer.

If $A$ is a null set and if $B \in \mathfrak{A}$ is a subset of $A$, then the monotonicity of $\mu$ implies that $\mu(B) = 0$, so that $B$ is a null set. The reason that not every measure space is complete is that a subset of a null set may fail to belong to $\mathfrak{A}$. However, we do have the following corollary of the Hopf Extension Theorem.

Corollary 3.3.1 Let $(X, \mathfrak{A}^*, \mu^*)$ be any measure space produced by applying the Hopf Extension Theorem (Theorem 2.4.1) to produce a Carathéodory outer measure $\mu^*$, where $\mathfrak{A}^*$ is the $\sigma$-field of all $\mu^*$-measurable sets. Then $(X, \mathfrak{A}^*, \mu^*)$ is a complete measure space. In particular, each of the Lebesgue measure spaces $([-N, N), \mathfrak{L}[-N, N), l)$ is complete, as is $(\mathbb{R}, \mathfrak{L}(\mathbb{R}), l)$.

Proof: We will not suppose that $\mu^*(X) < \infty$, since this is not needed. Suppose that $A \in \mathfrak{A}^*$ is a null set, and let $B \subseteq A$. We need to prove that $B$ is measurable. Thus we need to show that for each $W \in \mathfrak{P}(X)$ we have
$$\mu^*(W) = \mu^*(W \cap B) + \mu^*(W \cap B^c).$$
Since $A$ is measurable, we know that
$$\mu^*(W) = \mu^*(W \cap A) + \mu^*(W \cap A^c),$$
and
$$0 \le \mu^*(W \cap B) \le \mu^*(W \cap A) \le \mu^*(A) = 0.$$
Hence $\mu^*(W \cap B) = \mu^*(W \cap A) = 0$. It follows that
$$\mu^*(W) = \mu^*(W \cap A^c) \le \mu^*(W \cap B^c) \le \mu^*(W)$$
by monotonicity. Thus
$$\mu^*(W) = \mu^*(W \cap A^c) = \mu^*(W \cap B^c) = \mu^*(W \cap B) + \mu^*(W \cap B^c).$$
Hence $B \in \mathfrak{A}^*$. The real cases $([-N, N), \mathfrak{L}[-N, N), l)$ are covered directly by the preceding general argument. And the case of $(\mathbb{R}, \mathfrak{L}(\mathbb{R}), l)$ is covered by the completeness of $l$ on $[-N, N)$ for each $N \in \mathbb{N}$, since Lebesgue measure on the real line is $\sigma$-finite. ∎

It will follow from Exercise 3.11 that the family of Lebesgue measurable subsets of $[0, 1)$ has cardinality $2^c$, which is strictly greater than $c = 2^{\aleph_0}$, the cardinality of the set $\mathbb{R}$ of all real numbers. The exercise asks us to construct a Borel set that is known as the middle thirds Cantor set, $C$.
EXERCISE

3.11 Let $C_1$ be the set $[0, 1] \setminus \left(\frac{1}{3}, \frac{2}{3}\right)$, a union of two disjoint closed intervals, each of length $\frac{1}{3}$. Let
$$C_2 = C_1 \setminus \left( \left(\tfrac{1}{9}, \tfrac{2}{9}\right) \cup \left(\tfrac{7}{9}, \tfrac{8}{9}\right) \right),$$
so that again we discard the middle thirds, leaving four disjoint closed intervals, each of length $\frac{1}{9}$. We proceed inductively in this manner, each time discarding middle thirds, producing a descending chain of nested sets $C_1 \supseteq C_2 \supseteq C_3 \supseteq \cdots$. Define the Cantor set
$$C = \bigcap_{n=1}^{\infty} C_n.$$
a) Prove that $C$ is a Borel set with measure zero.
b) Use ternary expansions of the real numbers to produce a bijection between the elements of $C$ and the set of all infinite sequences of 0s and 2s. Explain why this set is uncountable and has the cardinality $c$ of the continuum of real numbers.
c) Prove that $C$ is a closed, nowhere dense subset of the unit interval $[0, 1]$.
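The inductive construction of the sets $C_n$ can be carried out explicitly for small $n$. The sketch below is an illustration only; it assumes each $C_n$ is stored as a list of closed intervals with rational endpoints, and the names `remove_middle_thirds` and `cantor_stage` are not from the text. It shows that $C_n$ consists of $2^n$ intervals of total length $(2/3)^n$, which is the quantity that tends to zero in part (a).

```python
from fractions import Fraction

def remove_middle_thirds(intervals):
    """Replace each closed interval [a, b] by its two outer closed thirds."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))
        result.append((b - third, b))
    return result

def cantor_stage(n):
    """Closed intervals whose union is C_n, starting from C_0 = [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = remove_middle_thirds(intervals)
    return intervals

def total_length(intervals):
    """Sum of the lengths of the disjoint intervals."""
    return sum((b - a for a, b in intervals), Fraction(0))

C2 = cantor_stage(2)
```

Since $C \subseteq C_n$ for every $n$, monotonicity bounds the measure of $C$ by `total_length(cantor_stage(n))` for each $n$.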
Remark 3.3.1 We sketch here a proof that the Borel measure space $(X_N, \mathfrak{B}, \mu)$, consisting of all the Borel sets in the interval $X_N = [-N, N)$, is not complete.$^{25}$ The argument will not be self-contained, since it will depend upon outside knowledge of transfinite induction and how that process can be used to construct the $\sigma$-algebra $\mathfrak{B}$. What we will do is to show that there are not enough Borel sets to account for all the subsets of the middle thirds Cantor set. We will explain why the set $\mathfrak{B}$ of all Borel sets in $X_N$ has the cardinality
$$c = 2^{\aleph_0} < 2^{c} = \#\mathfrak{P}(C),$$
where $C$ is the middle thirds Cantor set.$^{26}$ The interested reader can consult [8] for the details of the following construction by transfinite induction of $\mathfrak{B}$.

For each set $\mathfrak{S} \subseteq \mathfrak{P}(X)$, denote by $\mathfrak{S}^*$ the set of all unions of countably many members of $\mathfrak{S}$ and all differences of elements of $\mathfrak{S}$. Let $k = \#\mathfrak{S}$, the (generally infinite) cardinality of $\mathfrak{S}$. Sequences in $\mathfrak{S}$ correspond to functions $f : \mathbb{N} \to \mathfrak{S}$, and the cardinality of the set of these is the (infinite) cardinal number $k^{\aleph_0}$. Suppose for the moment that $k = c = 2^{\aleph_0}$. Then
$$k^{\aleph_0} = \left(2^{\aleph_0}\right)^{\aleph_0} = 2^{\aleph_0 \times \aleph_0} = 2^{\aleph_0} = c, \quad \text{since } \aleph_0 \times \aleph_0 = \max(\aleph_0, \aleph_0) = \aleph_0.$$
Thus the cardinality
$$\#\mathfrak{S}^* = c = \#\mathfrak{S}.$$
First, let $\mathfrak{E} \subseteq \mathfrak{P}(X)$ be the field of elementary sets. It is easy to see that the cardinality of $\mathfrak{E}$ is $c$. Let $\mathfrak{E}_0 = \mathfrak{E}$, and for each finite or transfinite ordinal number $\alpha > 0$ let
$$\mathfrak{E}_\alpha = \Bigl( \bigcup_{\beta < \alpha} \mathfrak{E}_\beta \Bigr)^*.$$
Let $\Omega$ denote the first uncountable ordinal number. It can be shown that
$$\mathfrak{B}(\mathfrak{E}) = \bigcup_{\alpha < \Omega} \mathfrak{E}_\alpha$$
and that the cardinality of $\mathfrak{B}(\mathfrak{E})$ is $c \times c = c$, the product of any two infinite cardinal numbers being the larger of the two.

25 A different proof, based on the Cantor function, of the existence of measurable sets that are not Borel sets is given in Exercise 7.13.c.
26 The smallest infinite cardinal number is the cardinality of the set $\mathbb{N}$ of natural numbers and it is written as $\aleph_0$ (read aleph-null). The next two symbols we have used are bet and gimel, which are the next two letters of the Hebrew alphabet, or aleph-bet. Each infinite cardinal number is an equivalence class of infinite sets. Each infinite ordinal number is an order-preserving equivalence class of well-ordered sets.
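3.3.1 Minimal Completion of a Measure Space

Theorem 3.3.1 For each measure space $(X, \mathfrak{A}, \mu)$ there exists a minimal completion $(X, \overline{\mathfrak{A}}, \overline{\mu})$. That is, $(X, \overline{\mathfrak{A}}, \overline{\mu})$ is a complete measure space that is an extension of $(X, \mathfrak{A}, \mu)$ in the sense that $\overline{\mathfrak{A}} \supseteq \mathfrak{A}$ and the restriction $\overline{\mu}|_{\mathfrak{A}} = \mu$. Moreover, every complete extension of $(X, \mathfrak{A}, \mu)$ is an extension of $(X, \overline{\mathfrak{A}}, \overline{\mu})$.

Proof: Let $(X, \mathfrak{A}, \mu)$ be any measure space. Let
$$\mathfrak{T} = \{A \mid A \in \mathfrak{A},\ \mu(A) = 0\},$$
$$\mathfrak{N} = \{B \mid B \subseteq A \text{ for some } A \in \mathfrak{T}\},$$
$$\overline{\mathfrak{A}} = \{(A \setminus N_1) \cup N_2 \mid A \in \mathfrak{A},\ N_1 \in \mathfrak{N},\ N_2 \in \mathfrak{N}\},$$
and let $\overline{\mu}((A \setminus N_1) \cup N_2) = \mu(A)$. It is easy to see that $\overline{\mathfrak{A}}$ includes every subset of every null set, so that $(X, \overline{\mathfrak{A}}, \overline{\mu})$ is complete. The reader should check that $\overline{\mathfrak{A}}$ is a $\sigma$-field. This follows from the fact that countable unions of null sets are null sets. Also, every complete extension of $(X, \mathfrak{A}, \mu)$ must extend $(X, \overline{\mathfrak{A}}, \overline{\mu})$, so that the latter completion is minimal. ∎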
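The construction in this proof can be carried out by brute force on a finite set. The following sketch is a hypothetical example, not taken from the text: $X = \{1, 2, 3, 4\}$, the field consists of all unions of the three blocks $\{1\}$, $\{2, 3\}$, $\{4\}$, and the block $\{2, 3\}$ is given measure zero. The names `blocks`, `field`, `mu`, and `completion` are illustrative. The code forms every set $(A \setminus N_1) \cup N_2$ and checks along the way that the assigned value $\mu(A)$ does not depend on the representation chosen.

```python
from itertools import chain, combinations

def powerset(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

# Hypothetical example: three blocks partition X = {1, 2, 3, 4}; values are weights.
blocks = {frozenset({1}): 1, frozenset({2, 3}): 0, frozenset({4}): 2}

# The field of all unions of blocks (a sigma-field, since X is finite).
field = {frozenset(chain.from_iterable(chosen)) for chosen in powerset(blocks)}

def mu(A):
    """Measure of a set in the field: total weight of the blocks it contains."""
    return sum(w for block, w in blocks.items() if block <= A)

def completion(field, mu):
    """Return mu-bar as a dict on all sets (A - N1) | N2, N1 and N2 subsets of null sets."""
    negligible = {N for A in field if mu(A) == 0 for N in powerset(A)}
    mu_bar = {}
    for A in field:
        for N1 in negligible:
            for N2 in negligible:
                S = (A - N1) | N2
                # setdefault returns any value already stored, so this checks
                # that mu-bar does not depend on the representation of S.
                assert mu_bar.setdefault(S, mu(A)) == mu(A)
    return mu_bar

mu_bar = completion(field, mu)
```

In this example the original field has 8 members, while the completion contains all 16 subsets of $X$, including the sets $\{2\}$ and $\{3\}$, which are subsets of a null set but were not in the original field.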
Corollary 3.3.2 A $\sigma$-finite measure space $(X, \mathfrak{L}, \mu)$, arising from the Hopf Extension Theorem using the $\sigma$-field $\mathfrak{L}$ of all the measurable sets, is the (minimal) completion of the Borel measure space $(X, \mathfrak{B}, \mu)$. (Here, $\mathfrak{B}$ is the Borel field generated by the original field in the hypothesis of the Hopf theorem.)

Proof: We have seen that $(X, \mathfrak{L}, \mu)$ is complete by Corollary 3.3.1. By Corollary 3.2.1, every measurable set $A$ in a $\sigma$-finite measure space lies between two Borel sets as follows: $B_* \subseteq A \subseteq B^*$, where $\mu(B^* \setminus B_*) = 0$. Thus
$$A = (B^* \cup N_1) \setminus N_2,$$
where $N_1$ and $N_2$ are subsets of the null set $B^* \setminus B_*$. Hence each $A \in \mathfrak{L}$ must lie in the minimal completion of $(X, \mathfrak{B}, \mu)$, as defined in the proof of Theorem 3.3.1. ∎

3.3.2 A Nonmeasurable Set

We prove here the existence of a nonmeasurable set for Lebesgue measure on the unit interval.
EXAMPLE 3.1 We will prove the existence of a nonmeasurable set, specifically within the unit interval of the real line.$^{27}$ Define a translation mapping of $X = [0, 1)$ onto itself by
$$T_c(x) = x + c - \lfloor x + c \rfloor,$$
where $\lfloor x + c \rfloor$ denotes the greatest integer that does not exceed $x + c$. This can be understood as translation modulo one. It follows from Exercise 3.6 that if $E \subseteq X$ is measurable, then $T_c(E)$ is measurable as well, and $l(E) = l(T_c(E))$.

Now define an equivalence relation on $X$ by $x \sim y$ if and only if $x - y \in \mathbb{Q}$, the set of rational numbers. Decompose $X$ into disjoint equivalence classes:
$$X = \bigcup_{\gamma} A_\gamma.$$
By the Axiom of Choice there exists a set
$$\mathcal{A} = \{x_\gamma\}$$
consisting of exactly one member $x_\gamma$ from each of the equivalence classes $A_\gamma$. We claim that $\mathcal{A} \notin \mathfrak{L}$. Note that $x_\gamma - x_{\gamma'} \in \mathbb{Q}$ if and only if $\gamma = \gamma'$. Hence the family of sets $\{T_c(\mathcal{A}) \mid c \in \mathbb{Q} \cap [0, 1)\}$ is a countable family of mutually disjoint sets. Moreover,
$$X = \bigcup_{c \in \mathbb{Q} \cap X} T_c(\mathcal{A}).$$
If $\mathcal{A}$ were measurable, then either $l(\mathcal{A}) = 0$ or $l(\mathcal{A}) > 0$. In the former case, countable additivity of Lebesgue measure would imply that $l(X) = 0$, which is impossible since $l(X) = 1$. But if $l(\mathcal{A}) > 0$, it would follow that $l(X) = \infty$, which is impossible as well. Hence $\mathcal{A} \notin \mathfrak{L}$.

EXERCISE

3.12 Let $S$ be any nonmeasurable subset of $\mathbb{R}$. Prove that the outer measure $l^*(S) > 0$.

27 A more general result is presented soon as Theorem 3.4.4, the proof of which utilizes Steinhaus's theorem from Exercise 3.21. A further development of the construction of a nonmeasurable set appears in the Appendix as Example A.1.
3.4 SEMIMETRIC SPACE OF MEASURABLE SETS
Definition 3.4.1 A set $S$ equipped with a function $d : S \times S \to [0, \infty)$ is called a metric space $(S, d)$, with metric $d$, provided that $d$ has the following properties:
i. $d(a, b) = d(b, a)$ for all $a$ and $b$ in $S$.
ii. $d(a, b) = 0$ if and only if $a = b$.
iii. $d(a, c) \le d(a, b) + d(b, c)$ for all $a$, $b$, and $c$ in $S$.
Property (iii) is called the triangle inequality. If the pair $(S, d)$ lacks the only if part of property (ii) but has the others, then it is called a semimetric space. In any metric or semimetric space, a sequence $a_n$ is called a Cauchy sequence if and only if for each $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that $n$ and $m$ greater than or equal to $N$ implies that $d(a_n, a_m) < \epsilon$. A metric or semimetric space is called complete if and only if for each Cauchy sequence $a_n$ in $S$ there exists $a \in S$ such that $a_n \to a$ in the sense that $d(a_n, a) \to 0$ as $n \to \infty$.$^{28}$
Theorem 3.4.1 Let $(X, \mathfrak{A}, \mu)$ be any finite measure space. For each pair of sets $A$ and $B$ in $\mathfrak{A}$, define the symmetric difference $A \,\Delta\, B = (A \cup B) \setminus (A \cap B)$ and define $d(A, B) = \mu(A \,\Delta\, B)$. Then the pair $(\mathfrak{A}, d)$ is a complete semimetric space.

The reader should note that this concept of completeness as a metric space does not have the same meaning as completeness of a measure space in the previous sense, in which every subset of a null set is a null set. The present theorem does not require the measure space to be complete in the sense of Theorem 3.3.1.

Proof: To prove that $d$ is a semimetric, everything except the triangle inequality is obvious. To establish the triangle inequality, it would suffice to prove what we will call the triangle inequality for sets:
$$A \,\Delta\, C \subseteq (A \,\Delta\, B) \cup (B \,\Delta\, C). \tag{3.1}$$
To prove this containment, we note first that if $x \in A \,\Delta\, C$, then either $x \in A$ or $x \in C$, but not both. Thus $A \,\Delta\, C = (A \setminus C) \cup (C \setminus A)$. Hence if $x \in A \,\Delta\, C$, it follows that either $x \in A \setminus C$, and thus
$$x \in B, \text{ and } x \in B \,\Delta\, C, \quad \text{or} \quad x \notin B, \text{ and } x \in A \,\Delta\, B,$$
or else $x \in C \setminus A$, and thus
$$x \in B, \text{ and } x \in A \,\Delta\, B, \quad \text{or} \quad x \notin B, \text{ and } x \in B \,\Delta\, C.$$

28 However, for limits to be unique, we need the space to be a full-fledged metric space.
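The containment (3.1) and the resulting triangle inequality can be checked exhaustively on a small finite measure space. This is only an illustrative sketch under stated assumptions: the space $X = \{0, 1, 2, 3\}$ and the point weights in `weights` are hypothetical, and the point $2$ is given weight zero so that $d$ is a semimetric rather than a metric.

```python
from itertools import combinations

# Hypothetical finite measure space: X = {0, 1, 2, 3} with point weights.
weights = {0: 1, 1: 2, 2: 0, 3: 3}
subsets = [frozenset(c) for r in range(len(weights) + 1) for c in combinations(weights, r)]

def mu(A):
    """Measure of A: the sum of the weights of its points."""
    return sum(weights[x] for x in A)

def d(A, B):
    """d(A, B) = mu(A symmetric-difference B); in Python, A ^ B is that set."""
    return mu(A ^ B)

triples = [(A, B, C) for A in subsets for B in subsets for C in subsets]

# Equation (3.1): the symmetric difference of A and C lies in the union of the other two.
containment_holds = all((A ^ C) <= ((A ^ B) | (B ^ C)) for A, B, C in triples)

# The triangle inequality for d follows by monotonicity and subadditivity of mu.
triangle_holds = all(d(A, C) <= d(A, B) + d(B, C) for A, B, C in triples)
```

Both checks succeed on all 4096 triples, and `d(frozenset({2}), frozenset())` is zero even though the two sets differ, which is why only a semimetric is obtained.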
In order to prove that the semimetric space is complete, we take a Cauchy sequence $\{A_n \mid A_n \in \mathfrak{A} \ \forall n \in \mathbb{N}\}$. That is, for each $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that $n$ and $m \ge N$ implies that $d(A_n, A_m) < \epsilon$. We need to prove that there exists $A \in \mathfrak{A}$ such that $d(A_n, A) \to 0$ as $n \to \infty$. For each $k \in \mathbb{N}$ there exists $n_k \in \mathbb{N}$ such that $n$ and $m \ge n_k$ implies that

\[ d(A_n, A_m) < \frac{1}{2^k}. \]

Moreover, we can select the sequence $n_k$ so that $n_k < n_{k+1}$ for all $k$. We will define the limit superior of the sequence of sets $A_{n_k}$ by

\[ A = \limsup_{k \to \infty} A_{n_k} = \bigcap_{p=1}^{\infty} \bigcup_{k=p}^{\infty} A_{n_k}, \]

the set of all elements that are present in infinitely many of the sets $A_{n_k}$. Observe that $A \in \mathfrak{A}$ because $\mathfrak{A}$ is a $\sigma$-field. We will prove that $d(A_n, A) \to 0$ as $n \to \infty$. Denote

\[ C_p = \bigcup_{k=p}^{\infty} A_{n_k} \quad \text{and} \quad D_p = \bigcap_{k=p}^{\infty} A_{n_k}. \]

Observe that $C_p$ is a descending sequence of sets with intersection equal to $A$. On the other hand, $D_p$ is an ascending sequence of sets with union denoted as

\[ \liminf_{k \to \infty} A_{n_k} = \bigcup_{p=1}^{\infty} \bigcap_{k=p}^{\infty} A_{n_k}. \]
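The limit inferior of the sequence of sets $A_{n_k}$ is therefore the set of all elements that are present in all but finitely many of the sets $A_{n_k}$. Observe that

\[ D_p \subseteq A_{n_p} \subseteq C_p \quad \text{and} \quad D_p \subseteq A \subseteq C_p, \]

so that

\[ d(A_{n_p}, A) \le \mu(C_p \setminus D_p) \overset{(i)}{\le} \sum_{k=p}^{\infty} \mu\left(A_{n_k} \,\Delta\, A_{n_{k+1}}\right) < \sum_{k=p}^{\infty} \frac{1}{2^k} = \frac{1}{2^{p-1}} \to 0 \]

as $p \to \infty$. The inequality (i) is justified by the fact that in order for an element $x$ to be in the union of the $p$th tail of the sequence of sets but not in its intersection, there must be at least one of the sets $A_{n_k}$ in the union that contains $x$ and another set $A_{n_j}$ that does not contain $x$. Iterated application of Equation (3.1), using in place of the set $B$ each of the sets indexed between $n_k$ and $n_j$, shows that there must be at least one index $i$ such that this element lies in $A_{n_i} \,\Delta\, A_{n_{i+1}}$. In fact, suppose for specificity that $n_j < n_k$. Then

\[ A_{n_j} \,\Delta\, A_{n_k} \subseteq \bigcup_{i=j}^{k-1} \left( A_{n_i} \,\Delta\, A_{n_{i+1}} \right). \]

Thus $d(A_{n_p}, A) \to 0$ as $p \to \infty$. Hence if $n > n_p$ we have

\[ d(A_n, A) \le d(A_n, A_{n_p}) + d(A_{n_p}, A) < \frac{1}{2^p} + \frac{1}{2^{p-1}} \to 0 \]

as $p \to \infty$.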
Remark 3.4.1 We could replace the semimetric space of measurable sets in Theorem 3.4.1 by a metric space in which the points of the space are equivalence classes. Define $A \sim B$ if and only if $\mu(A \,\Delta\, B) = 0$. That is, two sets $A$ and $B$ are equivalent if and only if the measure of their symmetric difference is zero. We form the quotient space $\mathfrak{A}/\mathfrak{N}$, where $\mathfrak{N}$ denotes the family of null sets in $\mathfrak{A}$. And if $\tilde{A}$ and $\tilde{B}$ are the equivalence classes of $A$ and $B$, we define the full-fledged metric $d(\tilde{A}, \tilde{B}) = d(A, B)$. The reader should verify that the value of the metric is independent of which representatives are selected from each of the two equivalence classes.

In the next theorem, we apply what we have learned about the metric on the space of measurable sets to Lebesgue measure on $X_N = [-N, N)$.

Theorem 3.4.2 In the semimetric space $(\mathfrak{A}, d)$ obtained from the Borel measure space $([-N, N), \mathfrak{A}, l)$ by letting $d(A, B) = l(A \,\Delta\, B)$, the field $\mathfrak{E}$ of elementary sets is a dense subset of $\mathfrak{A}$. That is, if $B \in \mathfrak{A}$, then there exists a sequence of elementary sets $E_n \in \mathfrak{E}$ such that $E_n \to B$ in the sense that $d(B, E_n) \to 0$ as $n \to \infty$.

The proof is Exercise 3.13.
EXERCISES

3.13 Prove Theorem 3.4.2. (Hint: Use the definition of Carathéodory outer measure to show that there exists a suitable sequence of elementary sets.)

3.14 In the semimetric space $(\mathfrak{L}, d)$ formed from $([-N, N), \mathfrak{L}, l)$, where $l$ is Lebesgue measure, prove that each of the following sets is dense:
a) $\mathfrak{O}$, the set of all open sets in $[-N, N)$.
b) $\mathfrak{K}$, the set of all closed sets in $[-N, N)$.

3.15 Let $(X, \mathfrak{A}, \mu)$ be any finite measure space arising from the Hopf Extension Theorem applied to a finitely additive measure on some field $\mathfrak{E}$.
a) Prove that Theorem 3.4.2 is true for $(X, \mathfrak{A}, \mu)$.
b) Explain why we need the measure $\mu(X) < \infty$ in part (a).

The following concept is purely topological and is not part of measure theory, although it can be applied to metric spaces that arise from measure theory as explained above.
Definition 3.4.2 If $(X, d)$ is any metric space, a subset $S \subseteq X$ is said to be a set of the first category if and only if $S$ can be written as the union of countably many nowhere dense sets.²⁹ A subset $S \subseteq X$ is said to be of the second category if and only if it is not of the first category.

For example, the set $\mathbb{Q}$ is a set of the first category in $\mathbb{R}$, equipped with its usual Euclidean topology, since $\mathbb{Q}$ is a countable union of singleton sets. And each singleton is clearly closed and nowhere dense.

Theorem 3.4.3 (Baire Category Theorem) A complete metric space is a set of the second category.

The proof is given in Exercise 3.16. In the Baire Category theorem the concepts of being closed, dense, or nowhere dense are understood in terms of the given metric. The reader can apply the Baire Category theorem to the complete metric space $\mathfrak{L}/\mathfrak{N}$ of equivalence classes of Lebesgue measurable sets of the measure space $([-N, N), \mathfrak{L}, l)$ in order to prove that measurable sets with a certain bizarre property are ubiquitous. See Exercise 3.17. The Baire Category theorem has many other interesting consequences, some of which are treated in the exercises below.

²⁹ Nowhere dense means that the closure $\bar{S}$ of $S$ contains no open set other than the empty set.
EXERCISES

3.16 Prove the Baire Category theorem (3.4.3). Here is a suggested outline.
a) Show that it suffices to prove that in a complete metric space the intersection of countably many open, dense sets must be nonempty.
b) Let $O_k$ be a sequence of open, dense subsets of the metric space $(X, \rho)$. Prove that there exists a decreasing nest of closed balls $\bar{B}_{r_k}(x_k) \subseteq O_k$ with $r_k \to 0$.
c) Use completeness to prove that $\bigcap_{k=1}^{\infty} O_k \ne \emptyset$.
3.17 Call the Lebesgue measurable set $S \subset [-N, N)$ a ghost set if and only if $S$ has the property that for every interval $I \subset [-N, N)$ we have $0 < l(S \cap I) < l(I)$.³⁰
a) Show that the family $\mathfrak{G}$ of ghost sets in the unit interval is a set of the second category.
b) Determine whether $\mathfrak{G}$ is closed under
   i. unions of two elements.
   ii. complementation of an element.
   iii. intersection of two elements.
c) Let $G$ be a ghost set in $[-1, 1)$ and prove that $1_G$ is not Riemann integrable on $[-1, 1]$. (This exercise shows that in the sense of Baire category, most indicator functions of measurable sets in a finite interval are not Riemann integrable. However, the reader will see that they are all Lebesgue integrable.)
3.18
a) Use the Baire Category theorem³¹ to prove that the set $(\mathbb{R} \setminus \mathbb{Q}) \cap [a, b]$ is not an $F_\sigma$ set if $a < b$.
b) Show that the set $\mathbb{Q} \cap [a, b]$ is not a $G_\delta$ set if $a < b$. (Hint: If $\mathbb{Q}$ and $\mathbb{Q}^c$ were both $G_\delta$ sets, then the empty set would be the intersection of countably many dense open sets, which would violate the Baire Category theorem.)
c) Give an example of a subset $S$ of the real line that is a Borel set but is neither an $F_\sigma$ nor a $G_\delta$ set.

3.19 Let $S$ be the set of points $x$ at which a function $f : \mathbb{R} \to \mathbb{R}$ is discontinuous.
a) Prove that $S$ must be an $F_\sigma$-set. (Hint: It may be helpful to consider the concept of the oscillation
\[ o_f(p) = \limsup_{x \to p} f(x) - \liminf_{x \to p} f(x), \]
as defined in advanced calculus.³² You may use the fact that a function $f$ is continuous at $p$ if and only if $o_f(p) = 0$.)
b) Prove that it is impossible for the set $S$ to be precisely the set $\mathbb{Q}^c$ of all irrational numbers. (Use the result of Exercise 3.18.a.)
c) Give an example of a bounded monotone function $f$ for which the set $S$ of all points of discontinuity is precisely $\mathbb{Q}$.³³

³⁰ The behavior of ghost sets makes an interesting contrast with Exercise 3.8.
³¹ In this problem, think of $[a, b]$ as being a complete metric space with respect to ordinary Euclidean distance on the real line.
3.20 Equip the space $C(\mathbb{R})$ of all continuous functions on the real line with the standard sup-norm: $\|f\|_{\sup} = \sup\{ |f(x)| \mid x \in \mathbb{R} \}$. Prove that for each $n \in \mathbb{N}$ the set
is an open, dense subset of $C(\mathbb{R})$. Apply the Baire Category Theorem to prove that the set of nowhere differentiable continuous functions is a set of the second category in $C(\mathbb{R})$.³⁴ (Hint: Think in terms of sawtooth functions.)
3.21 Prove the following theorem of Steinhaus. Suppose $A \subset \mathbb{R}$ is Lebesgue measurable and suppose its Lebesgue measure $l(A) > 0$. Denote
\[ A - A = \{ x - y \mid x \in A,\ y \in A \}. \]
Prove that $A - A$ contains an open interval around 0. Explain why it would suffice to give a proof if $0 < l(A) < \infty$. Let
\[ f(x) = l((x + A) \cap A). \]
It may help to use the following procedure.³⁵
a) Show that if $f(x) > 0$, then $x \in A - A$.
b) Show that Steinhaus's theorem follows if $f$ is continuous at 0.
c) Show that $|l(A) - l(B)| \le l(A \,\Delta\, B) = d(A, B)$ if $A$ and $B$ are measurable sets of finite measure and if $d$ is the semimetric for the space of measurable sets.
d) Show that $f$ is continuous at each point $x$ if $A$ is a finite interval. Then prove the same thing if $A$ is an elementary set. (It will be easier to give a proof for these two special cases.)
e) Use the triangle inequality for $d$, together with the result of part (d), to show that $f$ is continuous at each $x \in \mathbb{R}$, assuming only that $A \in \mathfrak{L}$ with finite measure.

³² See, for example, the book [20].
³³ See, for example, Exercise 7.10 in [20].
³⁴ This problem is not about measure and integration. We include it here because of its general interest as a corollary to the Baire Category Theorem.
³⁵ An alternative proof of Steinhaus's theorem is given in Exercise 6.10.

With the help of Exercise 3.21 we can prove a generalization of Example 3.1. We will show that every subset $P \subseteq \mathbb{R}$ of strictly positive measure must contain a nonmeasurable set.
Theorem 3.4.4 If $P \in \mathfrak{L}(\mathbb{R})$ and if $l(P) > 0$, then there exists a nonmeasurable subset of $P$.

Proof: Let $\mathbb{Q}$ denote the set of all rational numbers, as usual. We will define again an equivalence relation on $\mathbb{R}$ by
\[ x \sim y \iff x - y \in \mathbb{Q}. \]
By the Axiom of Choice, we can find a cross section $\Gamma$³⁶ of the quotient space $\mathbb{R}/\sim$. We can express the real line as a disjoint union
\[ \mathbb{R} = \bigcup_{q \in \mathbb{Q}} (\Gamma + q), \]
since if $\gamma + q = \gamma' + q'$, then $\gamma - \gamma' \in \mathbb{Q}$. This would make $\gamma \sim \gamma'$ and thus $\gamma = \gamma'$, since $\Gamma$ is a cross section of $\mathbb{R}/\sim$. If $P \cap (\Gamma + q)$ were not Lebesgue measurable for some $q \in \mathbb{Q}$, then we would be finished because $P$ would have a nonmeasurable subset. However, if
\[ P \cap (\Gamma + q) = P_q \in \mathfrak{L}(\mathbb{R}) \]
for each $q \in \mathbb{Q}$, then we observe that
\[ P_q - P_q \subseteq (\Gamma + q) - (\Gamma + q) = \Gamma - \Gamma, \]
where $\Gamma - \Gamma$ is disjoint from the dense set $\mathbb{Q} \setminus \{0\}$ for the reasons explained just above. Thus the difference $P_q - P_q$ of a supposedly Lebesgue measurable set with itself fails to include any open interval around 0. By Steinhaus's theorem (Exercise 3.21), $P_q$ must have measure zero. Hence $P$ itself is the union of countably many disjoint null sets, which contradicts the hypothesis that $l(P) > 0$. More information about the nonmeasurable subsets of $P$ can be found in [10].

EXERCISE

3.22 Prove that the cardinality of the family of all nonmeasurable subsets of $\mathbb{R}$ is the same as the cardinality of the family of all measurable sets: $2^{2^{\aleph_0}}$. (Hint: If $P$ is a nonmeasurable set and $S$ is any disjoint measurable set, what can you conclude about $S \cup P$?)

³⁶ That is, $\Gamma$ contains exactly one element of each equivalence class in $\mathbb{R}$.
3.5 LEBESGUE MEASURE IN $\mathbb{R}^n$

By the $2N$-cube in $\mathbb{R}^n$ we mean
\[ Q_N = [-N, N)^n = [-N, N) \times \cdots \times [-N, N), \]
a Cartesian product with $n$ factors, each being a copy of the interval $[-N, N)$. We will denote a typical rectangular block in $Q_N$ by
\[ R = J_1 \times \cdots \times J_n, \]
a Cartesian product of $n$ intervals of the form $J_i = [a_i, b_i)$. We define the measure
\[ l(R) = \prod_{i=1}^{n} (b_i - a_i), \]
the product of the lengths of the $n$ intervals. By an elementary set $E \in \mathfrak{E}$ we will mean the union of finitely many rectangular blocks as defined above, and $l$ is a finitely additive measure on $\mathfrak{E}$.
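The volume formula for a rectangular block, and its additivity when a block is cut into two disjoint blocks, can be sketched as follows. This is only an illustration: the helper names `block_volume` and `split_block` and the sample block are assumptions, and exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from math import prod

def block_volume(block):
    """Volume of a block given as a list of (a_i, b_i) pairs, one per factor [a_i, b_i)."""
    return prod((Fraction(b) - Fraction(a) for a, b in block), start=Fraction(1))

def split_block(block, axis, cut):
    """Split a block into two disjoint blocks along one axis, assuming a < cut < b there."""
    a, b = block[axis]
    left = block[:axis] + [(a, cut)] + block[axis + 1:]
    right = block[:axis] + [(cut, b)] + block[axis + 1:]
    return left, right

R = [(0, 2), (-1, 1), (0, 3)]           # the block [0,2) x [-1,1) x [0,3)
R1, R2 = split_block(R, axis=1, cut=0)  # two disjoint half-open blocks with union R
```

Here `block_volume(R1) + block_volume(R2)` agrees with `block_volume(R)`, which is the simplest instance of finite additivity on the elementary sets. The half-open convention is what makes the two pieces disjoint.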
Theorem 3.5.1 The family $\mathfrak{E}(Q_N)$ of elementary sets is a field of sets contained in $\mathfrak{P}([-N, N)^n)$. There is a countably additive measure $l$ defined on the generated Borel field $\mathfrak{B} = \mathfrak{B}(\mathfrak{E}(Q_N))$ that coincides with the measure $l$ on $\mathfrak{E}(Q_N)$, where $l$ of a rectangular block is its Euclidean volume.

The extension of $l$ from $\mathfrak{E}$ to $\mathfrak{B}$ proceeds as it did in Theorem 3.1.1 for the one-dimensional case, using the Hopf Extension Theorem. The proof is virtually identical, since the main tool is the finite intersection property for compact sets. That principle, presented in Exercise 3.1, is valid for $\mathbb{R}^n$ as well. We leave it to the reader to check this fact.

Next we extend the definitions of Borel sets, measurable sets, and Lebesgue measure to $\mathbb{R}^n$ from $Q_N = [-N, N)^n$. We will make use of the $\sigma$-finiteness of Euclidean space, just as we did for the line in Definition 3.2.3. One may express
\[ \mathbb{R}^n = \bigcup_{N \in \mathbb{N}} S_N, \]
where $S_N = Q_N \setminus Q_{N-1}$ for each $N \in \mathbb{N}$. Thus the mutually disjoint elementary sets $S_N$ all have finite Lebesgue measure.
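Definition 3.5.1 We say that $A \in \mathfrak{B}(\mathbb{R}^n)$, the family of all Borel sets in $\mathbb{R}^n$, if and only if $A \cap S_N \in \mathfrak{B}(S_N)$ for each $N \in \mathbb{N}$. Call $A \in \mathfrak{L}(\mathbb{R}^n)$, the family of all measurable sets in $\mathbb{R}^n$, if and only if $A \cap S_N \in \mathfrak{L}(S_N)$ for each $N \in \mathbb{N}$. Finally, define the Lebesgue measure of $A \in \mathfrak{L}(\mathbb{R}^n)$ by
\[ l(A) = \sum_{N \in \mathbb{N}} l(A \cap S_N). \]

Theorem 3.5.2 Let $\mathfrak{O}$ denote the set of all open subsets of $\mathbb{R}^n$. Then the family $\mathfrak{B}$ of all Borel sets in $\mathbb{R}^n$ is generated by $\mathfrak{O}$:
\[ \mathfrak{B}(\mathbb{R}^n) = \mathfrak{B}(\mathfrak{O}). \]

Proof: This proof is left to the reader in Exercise 3.23.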
Remark 3.5.1 There are advantages to thinking of the $\sigma$-field of Borel sets in Euclidean space as being generated by its topology, that is, by the family of open sets. This is a viewpoint that generalizes in interesting ways, for example, to topological groups. We began the study of Borel sets and of measurable sets in the real line by considering half-closed, half-open rectangular blocks, because we require Lebesgue measure to generalize the concept of the volume of a box. The use of half-closed, half-open blocks was primarily a convenience for describing a field of sets, so that we could use the theorem that the monotone class and Borel field generated by the elementary sets would be the same.
EXERCISES

3.23 Prove each of the following statements about sets $A \subseteq \mathbb{R}^n$:
a) Prove that $\mathfrak{B}(\mathbb{R}^n)$ is the $\sigma$-field generated by
b) Prove that $A \in \mathfrak{L}(\mathbb{R}^n)$ if and only if there exist $B^*$ and $B_* \in \mathfrak{B}(\mathbb{R}^n)$ such that $B_* \subseteq A \subseteq B^*$ and $l(B^* \setminus B_*) = 0$.
c) Prove that $l$ is a countably additive measure on $\mathfrak{B}(\mathbb{R}^n)$.
d) Prove that the open sets and the closed sets are Borel sets in $\mathbb{R}^n$.
e) Prove Theorem 3.5.2: $\mathfrak{B}(\mathbb{R}^n)$ is the $\sigma$-field generated by the family of open sets or, equivalently, by the family of closed sets in $\mathbb{R}^n$.

3.24 Suppose $E$ is Lebesgue measurable in $\mathbb{R}^n$, and define
\[ E + c = \{ e + c \mid e \in E \}, \]
the translate of $E$ by $c$. (Addition refers to vector addition in this context.) We will prove the translation invariance of Lebesgue measure.
a) Prove that $E$ is a null set if and only if $E + c$ is a null set.
b) Prove that $E + c$ is Lebesgue measurable.
c) Suppose that $E$ is an elementary set in $\mathbb{R}^n$, and prove that $l(E + c) = l(E)$.
d) Let $E$ be any measurable set, and prove that $l(E + c) = l(E)$.

3.25 Suppose $E$ is a subset of $\mathbb{R}^n$ and let $-E = \{ -e \mid e \in E \}$. We will prove the invariance of Lebesgue measure under reflection through the origin.
a) Suppose first that $E$ is an elementary set in $\mathbb{R}^n$, and prove that $-E$ is a Borel set and $l(-E) = l(E)$.
b) Finally, let $E$ be a measurable set, and prove that $l(-E) = l(E)$.
3.26 Suppose $A$ and $B$ are measurable subsets of $\mathbb{R}^n$, each one of strictly positive but finite measure. Prove that there exists a vector $c \in \mathbb{R}^n$ such that $l((A + c) \cap B) > 0$. (Hint: Consider the outer measure of $A$ and $B$.)

3.27 Let $\epsilon > 0$. Construct an open, dense subset $S$ of $\mathbb{R}^n$ for which the Lebesgue measure $l(S) < \epsilon$.

3.6 JORDAN MEASURE IN $\mathbb{R}^n$

Lebesgue measure will be the foundation for defining the Lebesgue integral and for proving its properties. Jordan measure is the corresponding foundation for the Riemann integral, though Jordan measure (also called Jordan content) is often not taught explicitly in advanced calculus courses. Although not required for understanding the Lebesgue integral itself, the study of Jordan measure will help us to understand the relationship between the Lebesgue integral and the Riemann integral, which the Lebesgue integral supersedes. Moreover, Jordan measure will enable us to prove Lebesgue's theorem (6.3.1), classifying all Riemann integrable functions in terms of the Lebesgue measure of the set of points of discontinuity.
Definition 3.6.1 Let $\mathfrak{E}$ be the collection of unions of finitely many closed blocks in the closed cube $\overline{Q_N}$, where $N \in \mathbb{N}$ is fixed arbitrarily. Define the outer Jordan measure for each $A \in \mathfrak{P}(Q_N)$ by
\[ \nu^*(A) = \inf\{ l(E) \mid A \subseteq E,\ E \in \mathfrak{E} \} \]
and let the inner Jordan measure be defined by
\[ \nu_*(A) = \sup\{ l(E) \mid A \supseteq E,\ E \in \mathfrak{E} \}. \]
Here $l(E)$ denotes the volume, or Lebesgue measure, of the union of finitely many rectangular blocks that comprise $E$. We define the set $\mathfrak{J}$ to be the family of all Jordan measurable sets, where $A$ is Jordan measurable if and only if $\nu_*(A) = \nu^*(A)$, in which case either number is called $\nu(A)$, the Jordan measure of $A$.

We will see that the weakness of Jordan measure stems from the need to cover a set using only unions of finitely many rectangular blocks. This has the unfortunate effect that Jordan measure is only finitely additive, and that is insufficient for the needs of analysis.
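To see inner and outer approximation by finitely many blocks at work, the sketch below brackets the area of the closed unit disk in $\mathbb{R}^2$ using closed grid squares of side $1/m$. The disk, the grid, and the function name `jordan_bounds_unit_disk` are assumptions chosen for illustration; the inner sum is a lower bound for the inner Jordan measure and the outer sum an upper bound for the outer Jordan measure, not the exact infimum or supremum of the definition.

```python
import math

def jordan_bounds_unit_disk(m: int):
    """Inner and outer grid sums for the closed unit disk, using closed squares of side 1/m."""
    inner = outer = 0
    for i in range(-m, m):
        for j in range(-m, m):
            x0, x1 = i / m, (i + 1) / m
            y0, y1 = j / m, (j + 1) / m
            # Distance from the origin to the farthest corner of the square.
            far = math.hypot(max(abs(x0), abs(x1)), max(abs(y0), abs(y1)))
            # Distance from the origin to the nearest point of the square.
            nx = 0.0 if x0 <= 0.0 <= x1 else min(abs(x0), abs(x1))
            ny = 0.0 if y0 <= 0.0 <= y1 else min(abs(y0), abs(y1))
            near = math.hypot(nx, ny)
            if far <= 1.0:
                inner += 1   # the whole square lies inside the disk
            if near <= 1.0:
                outer += 1   # the square meets the disk
    area = 1.0 / (m * m)
    return inner * area, outer * area
```

As $m$ grows, the two sums squeeze together around $\pi$, which reflects the fact that the disk is Jordan measurable: its boundary can be covered by finitely many squares of small total area.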
EXERCISES

3.28 Prove that Jordan measure is not countably additive.

3.29 Use Definition 3.2.1 and DeMorgan's Laws to prove the following properties of inner and outer Jordan measure, in relation to inner and outer Lebesgue measure, for all $A \in \mathfrak{P}(Q_N)$. The sets $E_i$ are in $\mathfrak{E}$.
b) $\nu_*(A) \le l_*(A) \le l^*(A) \le \nu^*(A)$.

It follows that every Jordan measurable set is Lebesgue measurable and that its Lebesgue measure equals its Jordan measure. We will see that not every Lebesgue measurable set is Jordan measurable, however. Thus Lebesgue measure is an extension of Jordan measure.
Theorem 3.6.1 The family $\mathfrak{J}$ of all Jordan measurable sets in $X = Q_N$ is a field, and $\nu$ is a finitely additive measure on $\mathfrak{J}$.

Proof: Let $A$ and $B$ lie in $\mathfrak{J}$. Since
\[ \nu(X) - \nu^*(X \setminus A) = \nu_*(A) \quad \text{and} \quad \nu(X) - \nu_*(X \setminus A) = \nu^*(A), \]
it follows that $X \setminus A \in \mathfrak{J}$. Both $A \cup B$ and $A \cap B$ are in $\mathfrak{J}$ since the union and intersection of any two elementary sets is again an elementary set. And $\nu$ is finitely additive on $\mathfrak{J}$ since on that field of sets $\nu$ agrees with $l$. ∎
EXERCISE

3.30 A subset $A$ of $X = Q_N$ is Jordan measurable if and only if for each $\epsilon > 0$ there exist in $\mathfrak{E}$ sets $E_1 \subseteq A \subseteq E_2$ such that $\nu(E_2 \setminus E_1) < \epsilon$.

Definition 3.6.2 A set $N \subset X = Q_N$ is called a Jordan null set if and only if $N \in \mathfrak{J}$ and $\nu(N) = 0$.

Theorem 3.6.2 A Lebesgue null set $F \subset X = Q_N$ that is closed is also a Jordan null set.

Proof: It suffices to prove that $F \in \mathfrak{J}$. It is easy to see that if $\epsilon > 0$, there exists a set $G$ that is open in $\mathbb{R}^n$ and such that $F \subset G$ and $l(G) < \epsilon$. But the set $G$ is a countable union of open rectangular blocks. Since $F$ is compact, the Heine-Borel theorem implies that $F$ can be covered with finitely many of these open rectangular blocks with measure no greater than that of $G$. Thus there exists an $E \supseteq F$ such that $E \in \mathfrak{E}$ and $\nu(E) = l(E) < \epsilon$. Thus $F$ is a Jordan null set since $\emptyset \subseteq F \subseteq E$.
Theorem 3.6.3 If $A \subseteq X = Q_N$, then $A \in \mathfrak{J}$ if and only if the boundary $\partial A = \bar{A} \setminus A^{\circ}$, the difference between the closure and the interior of $A$, is a Jordan null set.

Proof: Suppose first that $A \in \mathfrak{J}$. Since the boundary $\partial A$ is a closed set, it will suffice to show that it is a Lebesgue null set, since that will imply that it is also a Jordan null set. We know that for each $k \in \mathbb{N}$ there exist sets $E_k \subseteq A \subseteq F_k$ such that both $E_k$ and $F_k$ are in $\mathfrak{E}$ and
\[ l(F_k \setminus E_k) = l(F_k \setminus E_k^{\circ}) < \frac{1}{k}. \]
Let
\[ F = \bigcap_{k \in \mathbb{N}} F_k \quad \text{and} \quad E = \bigcup_{k \in \mathbb{N}} E_k^{\circ}. \]
It follows that $\partial A \subseteq F \setminus E$, and $l(F \setminus E) = 0$. Thus $\partial A$ is a Lebesgue null set and also a Jordan null set.

Now we suppose that $\partial A$ is a Jordan null set. For each $\epsilon > 0$, there exists $E \in \mathfrak{E}$ such that $E^{\circ} \supseteq \partial A$, with $l(E) < \epsilon$. We seek to prove that $A \in \mathfrak{J}$. Denote $X = \overline{Q_N}$ and
\[ E^{c} = X \setminus E^{\circ} = \bigcup_{k=1}^{K} B_k \in \mathfrak{E}, \]
a union of finitely many rectangular blocks $B_k$. Suppose that one of the rectangular blocks $B_k$ contained a point $p \in A$ and also a point $q \in A^{c}$. Then it would follow that the straight line segment from $p$ to $q$ must contain a boundary point of $A$, which contradicts the hypothesis that $\partial A \subseteq E^{\circ}$. (See Exercise 3.32.) Now let $E_1$ be the union of the blocks $B_k$ that lie inside $A$, and let $E_2$ be the union of the blocks that lie outside $A$. Then the elementary set $E_2^{c} \setminus E_1^{\circ} = E$, and this means that $E_1 \subseteq A \subseteq E_2^{c}$ and $l(E_2^{c} \setminus E_1) < \epsilon$. Thus $A \in \mathfrak{J}$.
EXERCISES

3.31 Let $A = \mathbb{Q} \cap [0, 1)$, the set of all rational numbers in $[0, 1)$. Prove that $\partial A$ is neither a Lebesgue null set nor a Jordan null set, so $A \notin \mathfrak{J}$. On the other hand, show that $A \in \mathfrak{B}$, the family of Borel sets.

3.32 In $\mathbb{R}^n$, show that if a rectangular block $B$ contains both a point of the set $E$ and a point of the complementary set $E^{c} = \mathbb{R}^n \setminus E$, then $B$ contains a point in $\partial E$, the boundary of $E$.
MEASURABLE FUNCTIONS

If $X$ is a set and $\mathfrak{A} \subseteq \mathfrak{P}(X)$ is a $\sigma$-field, then $(X, \mathfrak{A})$ is called a measurable space. If $\mu$ is a countably additive measure defined on $\mathfrak{A}$, then $(X, \mathfrak{A}, \mu)$ is called a measure space. In this chapter we will introduce the family of measurable functions for which we will seek to define the Lebesgue integral. We will prove the very important fact that pointwise limits of measurable functions must be measurable. This is encouraging because pointwise limits of Riemann integrable functions need not be Riemann integrable.³⁷

4.1 MEASURABLE FUNCTIONS

Definition 4.1.1 Let $(X, \mathfrak{A}, \mu)$ be a measure space.
i. If $f : X \to \mathbb{R}$, we say that $f$ is $\mathfrak{A}$-measurable provided that
\[ f^{-1}(-\infty, a) = \{ x \in X \mid f(x) < a \} \in \mathfrak{A} \]
for all $a \in \mathbb{R}$.³⁸
ii. If $f : X \to \mathbb{C}$, the complex numbers, we write $f(x) = u(x) + i v(x)$ for real-valued functions $u = \Re f$ and $v = \Im f$. We say that $f$ is $\mathfrak{A}$-measurable provided that both $u$ and $v$ are $\mathfrak{A}$-measurable.
iii. If $f : X \to S$, where $S$ is a topological space, we say that $f$ is $\mathfrak{A}$-measurable provided that $f^{-1}(G) \in \mathfrak{A}$ for every open set $G \subseteq S$.

The reader may note correctly that the concepts of measurability presented above have a formal similarity to the definition of continuity. A function between topological spaces is called continuous provided that the inverse image of each open set is open. Of course, for measurability, the inverse image of an open set must be measurable, and measurability is by no means synonymous with being open. In $\mathbb{R}^n$, every open set is Lebesgue measurable, but the converse is clearly false. The following exercises establish the equivalence of the three different forms of the definition of measurability for functions that are presented in Definition 4.1.1.

³⁷ See Example 1.3.
EXERCISES

4.1 Let $(X, \mathfrak{A}, \mu)$ be a measure space. Use the steps below to show that the concepts of measurability in Definition 4.1.1 for both real and complex-valued functions are consistent with the concept of measurability for a topological space valued function.
a) Show that if $f : X \to \mathbb{R}$ is $\mathfrak{A}$-measurable, then $f^{-1}(G) \in \mathfrak{A}$ for every open set $G \subseteq \mathbb{R}$.
b) Show that $f : X \to \mathbb{R}$ satisfies the definition of measurability if and only if $f^{-1}(G)$ is measurable for each open set $G \subseteq \mathbb{R}$. Do the same for complex-valued functions $f$.

4.2 Let $(X, \mathfrak{A}, \mu)$ be a measure space. Let $f : X \to \mathbb{R}^n$, the latter space being equipped with its usual Euclidean topology and the Borel field generated by the open sets.
a) Show that the family
\[ \mathfrak{S} = \{ A \in \mathfrak{P}(\mathbb{R}^n) \mid f^{-1}(A) \in \mathfrak{A} \} \]
is a $\sigma$-field.
b) Prove that $f$ is measurable if and only if $f^{-1}(B) \in \mathfrak{A}$ for each Borel set $B$.³⁹

4.1.1 Baire Functions of Measurable Functions

It will be important to know that many combinations of measurable functions and many functions of measurable functions are again measurable. To investigate this, we need the following definition.

³⁸ This definition is motivated by Section 1.3.
³⁹ For information about $f^{-1}$ of a measurable set, see Exercise 7.14.
Definition 4.1.2 Let $(X, \mathfrak{A}, \mu)$ be any measure space arising from the Hopf Extension Theorem, applied to some measure on a field $\mathfrak{E}$. If $f : X \to S$, where $S$ is a topological space, we say that $f$ is a Baire function provided that
\[ f^{-1}(G) \in \mathfrak{B}(\mathfrak{E}), \]
the Borel field generated by $\mathfrak{E}$, for each open set $G \subseteq S$. Baire functions may also be called Borel measurable functions.

Clearly, every Baire function is measurable and every continuous function (from $\mathbb{R}^n$ to $S$) is a Baire function. The indicator function of a measurable set that is not a Borel set would be an example of a measurable function that is not a Baire function. The indicator function of a nontrivial proper Borel subset of $\mathbb{R}^n$ is an example of a Baire function that is not continuous.
Theorem 4.1.1 Suppose each of the functions $f_1, f_2, \ldots, f_n$ is an $\mathfrak{A}$-measurable real-valued function defined on $X$. Let $\Phi : \mathbb{R}^n \to \mathbb{R}$ be a Baire function. Then $F = \Phi(f_1, f_2, \ldots, f_n)$ is an $\mathfrak{A}$-measurable function defined on $X$.

Proof: We need to show that for each open set $G \subseteq \mathbb{R}$ we have $F^{-1}(G) \in \mathfrak{A}$. Denote $f = (f_1, f_2, \ldots, f_n) : X \to \mathbb{R}^n$. We claim that $f$ is measurable. In fact, each open set $O \subseteq \mathbb{R}^n$ can be written as a union of countably many open blocks of the form
\[ B = \prod_{i=1}^{n} (a_i, b_i), \]
which is a Cartesian product of $n$ open intervals. Thus
\[ f^{-1}(B) = \bigcap_{i=1}^{n} f_i^{-1}(a_i, b_i) \in \mathfrak{A}. \]
Hence $f$ is $\mathfrak{A}$-measurable from $X$ to $\mathbb{R}^n$, since $\mathfrak{A}$ is a $\sigma$-field and is thus closed under countable unions. Then $F^{-1}(G) = (\Phi \circ f)^{-1}(G) = f^{-1}(\Phi^{-1}(G))$ for each open set $G \subseteq \mathbb{R}$. Since $\Phi^{-1}(G)$ is a Borel set, the theorem is true by the result of Exercise 4.2. ∎

We remark that in Theorem 4.1.1 it would have sufficed to have $\Phi$ defined on a set $D \subseteq \mathbb{R}^n$ provided that $(f_1, \ldots, f_n) : X \to D$.
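The key set identity in the proof, that the preimage of an open block under $f = (f_1, \ldots, f_n)$ is the intersection of the coordinate preimages, can be checked on a toy example. The sketch below assumes a finite set of integers standing in for $X$ and two made-up coordinate functions; every name in it is hypothetical.

```python
def preimage(func, domain, predicate):
    """Return the set of points x in the domain for which predicate(func(x)) holds."""
    return {x for x in domain if predicate(func(x))}

X = range(-10, 11)              # a finite stand-in for the space X

def f1(x):                      # hypothetical first coordinate function
    return x / 2

def f2(x):                      # hypothetical second coordinate function
    return x * x / 10

def f(x):                       # the vector-valued map f = (f1, f2)
    return (f1(x), f2(x))

a, b = (-2.0, 0.5), (3.0, 4.0)  # the open block B = (a1, b1) x (a2, b2)

lhs = preimage(f, X, lambda y: a[0] < y[0] < b[0] and a[1] < y[1] < b[1])
rhs = (preimage(f1, X, lambda t: a[0] < t < b[0])
       & preimage(f2, X, lambda t: a[1] < t < b[1]))
```

Both `lhs` and `rhs` come out as the same set of sample points. In the theorem the same identity is applied with each coordinate preimage known to lie in the $\sigma$-field, so their finite intersection does too.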
Remark 4.1.1 It follows from Theorem 4.1.1 that such combinations of measurable functions as the following must be measurable:

• $c_1 f_1 + c_2 f_2$, where $c_1$ and $c_2$ are constants.
• $f_1 \cdot f_2$, the product of two measurable functions.
• $f_1 / f_2$, where $f_2$ is nowhere zero.

In particular, $-f$ is also measurable. Thus $f$ is measurable if and only if
\[ \{ x \mid f(x) > a \} \in \mathfrak{A} \]
for every $a \in \mathbb{R}$. Of course, this can be seen also from the fact that $f$ is measurable if and only if $f^{-1}$ maps open sets to measurable sets.
Corollary 4.1.1 If $f_1$ and $f_2$ are $\mathfrak{A}$-measurable real-valued functions, then each of the following functions is $\mathfrak{A}$-measurable:
i. $\max(f_1, f_2) = f_1 \vee f_2$, where $f_1 \vee f_2(x) = \max(f_1(x), f_2(x))$ for each $x \in X$.
ii. $\min(f_1, f_2) = f_1 \wedge f_2$, where $f_1 \wedge f_2(x) = \min(f_1(x), f_2(x))$ for each $x \in X$.
iii. $f^{+} = f \vee 0$, known as the positive part of $f$.
iv. $f^{-} = (-f) \vee 0$, known as the negative part of $f$. (The reader should note that the negative part of a real-valued function is nonnegative.)
v. $|f| = f^{+} + f^{-}$.

Proof: It suffices to observe that $\max(x_1, x_2)$ and $\min(x_1, x_2)$ are both continuous functions from $\mathbb{R}^2$ to $\mathbb{R}$, making each of these a Baire function. For the last part, we use the fact that $\Phi(x_1, x_2) = x_1 + x_2$ is continuous and thus a Baire function. ∎

EXERCISE

4.3 Show that it is possible for $f : \mathbb{R} \to \mathbb{R}$ to be nonmeasurable and still have $|f|$ measurable.

4.2 LIMITS OF MEASURABLE FUNCTIONS
In the study of pointwise limits of measurable functions and integrable functions, we will consider sequences of functions for which $f_n(x)$ diverges to $\pm\infty$ for some values of $x$. Thus it is helpful to extend the concept of real numbers to the set $\mathbb{R}^* = \mathbb{R} \cup \{\pm\infty\}$ of extended real numbers. Measurability for an extended real-valued function means that for each $a \in \mathbb{R}$, the set
\[ f^{-1}[a, \infty] = \{ x \mid f(x) \ge a \} \in \mathfrak{A}, \]
and conversely.
Theorem 4.2.1 Let $(X, \mathfrak{A}, \mu)$ be a measure space and let $\{ f_n \mid n \in \mathbb{N} \}$ be any sequence of measurable functions from $X$ to $\mathbb{R}^*$. Then each of the five functions defined as follows is $\mathfrak{A}$-measurable:
i. $f_*(x) = \inf\{ f_n(x) \mid n \in \mathbb{N} \}$ for all $x \in X$.
ii. $f^*(x) = \sup\{ f_n(x) \mid n \in \mathbb{N} \}$ for all $x \in X$.
iii. $\underline{f}(x) = \liminf f_n(x)$ for all $x \in X$.
iv. $\overline{f}(x) = \limsup f_n(x)$ for all $x \in X$.
v. $f(x) = \lim_{n \to \infty} f_n(x)$, provided the limit exists for all $x \in X$.

Proof: For the first part, we observe that for each $a \in \mathbb{R}$,
\[ f_*^{-1}[a, \infty] = \{ x \mid \inf_n f_n(x) \ge a \} = \bigcap_{n \in \mathbb{N}} f_n^{-1}[a, \infty] \in \mathfrak{A}. \]
Thus $f_*$ is $\mathfrak{A}$-measurable. Since
\[ f^*(x) = -\inf_n \left( -f_n(x) \right), \]
it follows that $f^*$ is $\mathfrak{A}$-measurable as well. Note next that since
\[ i_n = \inf\{ f_k(x) \mid k \ge n \} \]
is an increasing sequence of extended real numbers $i_n$, we have
\[ \underline{f}(x) = \liminf f_n(x) = \sup\{ \inf\{ f_k(x) \mid k \ge n \} \mid n \in \mathbb{N} \}, \]
so that $\underline{f}$ is $\mathfrak{A}$-measurable, being the supremum of a sequence of measurable functions given as infima. Also,
\[ \limsup f_n(x) = \overline{f}(x) = \inf\{ \sup\{ f_k(x) \mid k \ge n \} \mid n \in \mathbb{N} \}, \]
with the result that $\overline{f}$ is $\mathfrak{A}$-measurable. Finally, we note that
\[ f(x) = \lim_{n \to \infty} f_n(x) \]
exists if and only if $\overline{f}(x) = \underline{f}(x)$. Thus $f$ is $\mathfrak{A}$-measurable provided that the pointwise limit exists on $X$.
EXERCISE

4.4 Suppose $f_n : X \to \mathbb{R}$ is a measurable function for each $n \in \mathbb{N}$, where $(X, \mathfrak{A}, \mu)$ is a measure space. Prove that the set
is a measurable set.
The reader should be able to give examples of pointwise convergent sequences of continuous functions for which the limit is not continuous and examples of pointwise convergent sequences of Riemann integrable functions for which the limit is not Riemann integrable. We see already that measurability must be a valuable concept for pointwise convergence since pointwise convergence does preserve measurability.
Definition 4.2.1 Let $(X, \mathfrak{A}, \mu)$ be a measure space, and let $\mathfrak{N}$ be the set of all null sets. Suppose $P(x)$ is a proposition for each $x \in X$. Then we say that $P(x)$ is valid $\mathfrak{A}$-$\mu$ almost everywhere if and only if there is a set $N \in \mathfrak{N}$ such that $x$ has the property $P$ for all $x \in X \setminus N$. This is commonly expressed as $\mu$-almost everywhere, or as almost everywhere, provided that there will be no confusion as to which measure $\mu$ or $\sigma$-algebra $\mathfrak{A}$ is in use.

Remark 4.2.1 Given a measurable function $f$, it is common to see a set-theoretic notation such as this: Let
\[ B = \inf\{ K \in \mathbb{R} \mid |f(x)| \le K \text{ a.e.} \}. \]
In this notation, each of the numbers $K$ is called an essential upper bound of $f$, meaning that $|f(x)|$ is bounded above by $K$ almost everywhere. If $B < \infty$, then we denote
\[ B = \|f\|_{\infty}, \]
the essential sup-norm of $f$.

Corollary 4.2.1 Let $f_n$ be a sequence of measurable functions on a complete measure space $(X, \mathfrak{A}, \mu)$. Suppose $f_n(x) \to f(x)$ almost everywhere on $X$. Then the function $f$ is measurable.

Proof: This follows from Theorem 4.2.1. It is understood in this context that although the function $f$ is defined by the given limit, except on a null set $N$, $f$ may have arbitrary values on $N$ itself. Then the completeness of the measure space tells us that the resulting function $f$ remains $\mathfrak{A}$-$\mu$ measurable regardless of how values are assigned to $f$ within the null set $N$. ∎
EXERCISES

4.5 Suppose $f : X \to \mathbb{R}^*$ is a measurable function that has finite values almost everywhere on the measure space $(X, \mathfrak{A}, \mu)$, where $\mu(X) > 0$. Prove that there is a set of positive measure on which $f$ is bounded.

4.6 Suppose the measurable function $f : \mathbb{R}^n \to \mathbb{R}$ has the special property that for each fixed vector $c \in \mathbb{R}^n$, the translation of $f$ given by $f_c(x) = f(x + c)$ is equal almost everywhere to $f(x)$ itself. That is, for each value of $c \in \mathbb{R}^n$, we have
\[ f(x + c) = f(x) \]
for almost all $x$. Prove that $f(x)$ is equal almost everywhere to a constant function. (Hints: Consider both the sum and the terms of
Apply Exercise 3.26 to select the special value of $n$. Divide and conquer!)

4.3 SIMPLE FUNCTIONS AND EGOROFF'S THEOREM
In this section we will consider both uniform convergence and what we will call almost uniform convergence of sequences of measurable functions. We begin with a useful definition that generalizes the notion of a step function from advanced calculus.
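Definition 4.3.1 A function $f : X \to \mathbb{R}$ is called $\mathfrak{A}$-simple (or simple if there will be no confusion regarding the $\sigma$-field $\mathfrak{A}$ that is under consideration) if and only if $f$ is $\mathfrak{A}$-measurable and
\[ \{ f(x) \mid x \in X \} \]
is a finite set. The class of simple functions is denoted by $\mathfrak{S}$. Thus, $f$ is simple if and only if we have finitely many sets
\[ A_i = f^{-1}(\alpha_i) \in \mathfrak{A}, \quad i = 1, \ldots, n, \]
such that
\[ X = \bigcup_{i=1}^{n} A_i \quad \text{and} \quad f(x) = \sum_{i=1}^{n} \alpha_i 1_{A_i}(x), \]
where $1_A$ denotes the indicator function of $A$. That is,
\[ 1_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases} \]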
Theorem 4.3.1 Every bounded, $\mathfrak{A}$-measurable, real-valued function $f$ is the uniform limit of a monotone nondecreasing sequence of simple functions and also of a monotone nonincreasing sequence of simple functions.

Proof: Define
\[ f_n(x) = \frac{k}{2^n} \quad \text{if } x \in A_k = f^{-1}\left[ \frac{k}{2^n}, \frac{k+1}{2^n} \right). \]
Since $f$ is bounded, there are only finitely many values of $k$ for which $A_k \ne \emptyset$. Thus $f_n$ achieves only finitely many values and is $\mathfrak{A}$-simple. Moreover,
\[ 0 \le f(x) - f_n(x) < \frac{1}{2^n} \quad \text{for all } x \in X, \]
so that $f_n \to f$ uniformly on $X$. It is left to the reader to verify that $f_n$ is in fact a monotone nondecreasing sequence of simple functions. Monotonicity follows from the method of interval bisection that is used in the definition of the sequence $f_n$. The reader should also produce and verify the properties of a similar nonincreasing sequence of simple functions. This can be done by applying the preceding method to the function $-f$. ∎

It is worth noting that even the pointwise limit of a sequence of simple functions must be measurable. Thus a bounded function is measurable if and only if it is the limit of a monotone sequence of simple functions.
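The dyadic construction in the proof can be tried out numerically. The sketch below rounds the values of a bounded function down to the nearest multiple of $2^{-n}$ and checks, at sample points only, the uniform error bound and the monotonicity in $n$. The choice of the sine function, the sample grid, and the names `dyadic_approx` and `check` are assumptions made purely for illustration.

```python
import math

def dyadic_approx(f, n):
    """Return f_n with f_n(x) = k / 2**n whenever k / 2**n <= f(x) < (k + 1) / 2**n."""
    scale = 2 ** n
    return lambda x: math.floor(f(x) * scale) / scale

f = math.sin                                 # a bounded function, chosen only as an example
xs = [i / 100 for i in range(-300, 301)]     # sample points standing in for X

def check(n_max=10):
    """Verify the error bound and monotonicity at the sample points for n < n_max."""
    for n in range(1, n_max):
        fn, fn_next = dyadic_approx(f, n), dyadic_approx(f, n + 1)
        for x in xs:
            assert 0 <= f(x) - fn(x) < 2.0 ** -n   # uniform error bound
            assert fn(x) <= fn_next(x)             # nondecreasing in n
    return True
```

Passing from $n$ to $n + 1$ bisects each dyadic interval, so the rounded-down value can only stay the same or move up by $2^{-(n+1)}$; that is the interval bisection argument behind the monotonicity claim.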
EXERCISES
4.7 Prove that every $\mathfrak{A}$-measurable function $f : X \to \mathbb{R}$ is the pointwise limit of a sequence of $\mathfrak{A}$-simple functions. Give an example to show that uniformity of convergence cannot be required in this exercise. (Hint: Proceed as in the proof of Theorem 4.3.1, but for each $n$ define $f_n$ using $|k| < n2^n$.)
4.8 Let $f : \mathbb{R}^n \to \mathbb{R}$ be Lebesgue measurable. Prove that $f$ is equal almost everywhere to a Borel measurable function. (Hint: The function $f$ is the pointwise limit of a sequence $\phi_n \in \mathfrak{S}$, the vector space of simple functions. Modify the functions $\phi_n$ suitably.)
Exercise 4.7 shows that if the measurable function $f$ is not bounded, then we cannot expect uniform approximation by simple functions. The following theorem provides interesting additional insight into what can be guaranteed.
Theorem 4.3.2 (Egoroff) Let $(X, \mathfrak{A}, \mu)$ be a measure space of finite total measure. Let $f_n$ be $\mathfrak{A}$-measurable for each $n \in \mathbb{N}$ and suppose that $f_n \to f$ pointwise on $X$. Then, for each $\eta > 0$, there exists a set $B \in \mathfrak{A}$ such that $\mu(B) < \eta$ and such that $f_n \to f$ uniformly on $X \setminus B$.
Proof: Note that $f$ must be measurable because of the hypotheses. Let $\epsilon > 0$ and define
$$A_n(\epsilon) = \{x \in X \mid \exists\, m \ge n \text{ such that } |f_m(x) - f(x)| \ge \epsilon\}.$$
This set is measurable since each $f_m$ is measurable. Thus $A_n(\epsilon) \in \mathfrak{A}$ for each $n \in \mathbb{N}$. Moreover, $A_1(\epsilon) \supseteq A_2(\epsilon) \supseteq \cdots \supseteq A_n(\epsilon) \supseteq \cdots$ is a decreasing nest with empty intersection because $f_n$ converges pointwise on $X$. By Exercise 2.19, $\mu(A_n(\epsilon)) \to 0$ as $n \to \infty$, since $\mu(X) < \infty$.
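For each $k \in \mathbb{N}$ there exists $n_k \in \mathbb{N}$ such that
$$\mu\bigl(A_{n_k}(1/k)\bigr) < \frac{\eta}{2^k}$$
and such that $n_1 < n_2 < \cdots$. Let
$$B = \bigcup_{k=1}^{\infty} A_{n_k}(1/k)$$
so that $\mu(B) < \eta$. If $\epsilon > 0$, we can pick $k$ such that $1/k < \epsilon$. If $x \notin B$ and $n \ge n_k$, then $|f_n(x) - f(x)| < \epsilon$. Thus $f_n \to f$ uniformly on $X \setminus B$. ∎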
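As a numerical illustration of what Egoroff's theorem does and does not assert, consider $f_n(x) = (1 - x^2)^n$ on $[-1, 1]$, whose pointwise limit is the indicator function of $\{0\}$. The sketch below is a hedged illustration only: the helper names `sup_off_B` and `sup_near_zero` and the sampling grids are assumptions, and suprema are estimated on finitely many sample points.

```python
def f_n(n, x):
    """The function f_n(x) = (1 - x^2)^n."""
    return (1.0 - x * x) ** n

def sup_off_B(n, r, samples=2001):
    """Estimate sup |f_n - 0| over [-1, 1] with B = (-r, r) removed (grid includes +-r)."""
    right = [r + (1.0 - r) * i / (samples - 1) for i in range(samples)]
    pts = right + [-x for x in right]
    return max(f_n(n, x) for x in pts)

def sup_near_zero(n):
    """Estimate sup |f_n - 1_{0}| over 0 < |x| <= 1 using points approaching 0."""
    pts = [10.0 ** (-k) for k in range(1, 9)]
    return max(f_n(n, x) for x in pts)

for n in (10, 100, 1000):
    print(n, sup_off_B(n, 0.1), sup_near_zero(n))
```

Off the interval $(-r, r)$ the estimated supremum equals $(1 - r^2)^n$ and tends to zero, while the estimate taken arbitrarily close to 0 stays near 1, so the convergence is not uniform on the whole interval.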
We remark that Egoroff's theorem is often paraphrased informally as follows: Pointwise convergence is almost uniform convergence. It is necessary, however, to understand the precise meaning of that expression as being the statement of Egoroff's theorem. It is especially important to understand that the set $B$ must normally have strictly positive measure, albeit small measure. We give an elementary example to explain this.
EXAMPLE 4.1 Let $f_n(x) = (1 - x^2)^n$ on $[-1, 1]$ for each $n \in \mathbb{N}$, using the underlying Lebesgue measure space $([-1, 1], \mathfrak{L}, l)$. Then $f_n \to 1_{\{0\}}$ pointwise, but not uniformly, on $[-1, 1]$. If we let $B = (-r, r)$ for a small positive number $r$, then it is easy to see that $f_n \to 0$ uniformly on $[-1, 1] \setminus B$. In fact,
$$\|f_n - 0\|_{\sup} = (1 - r^2)^n \to 0,$$
with the sup-norm being calculated on $[-1, 1] \setminus B$. For a set $B$ to have the effect calculated above, it must include some interval around 0 so as to exclude all sequences different from 0 but converging to 0. Thus $B$ cannot be replaced by a null set.
EXERCISE
4.9 Give an example on the real line to show that pointwise convergence of a sequence of measurable functions need not imply uniform convergence off some set of small measure. Thus finiteness of the measure space is necessary in Egoroff's theorem.
4.3.1 Double Sequences
Theorem 4.3.3 Let $(X, \mathfrak{A}, \mu)$ be a finite measure space. Let $\{f_{i,j} \mid i, j = 1, 2, \ldots\}$ be a double sequence of $\mathfrak{A}$-measurable functions defined on $X$ such that
$$\lim_{j \to \infty} f_{i,j}(x) = f_i(x), \quad \forall x \in X,$$
and
$$\lim_{i \to \infty} f_i(x) = f(x), \quad \forall x \in X.$$
Then there exists an increasing sequence $n_i$ of positive integers such that
$$\lim_{i \to \infty} f_{i,n_i}(x) = f(x)$$
$\mu$-almost everywhere on $X$.
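Proof: We can diagram the hypotheses conveniently in the form
$$f_{i,j}(x) \xrightarrow{\;j \to \infty\;} f_i(x) \xrightarrow{\;i \to \infty\;} f(x).$$
There exist $B_1 \in \mathfrak{A}$ and $n_1 \in \mathbb{N}$ such that $\mu(B_1) < \frac{1}{2}$ and
$$|f_{1,n_1}(x) - f_1(x)| < \frac{1}{2}$$
for all $x \in X \setminus B_1$ because of Egoroff's theorem. Similarly, there exist $B_i \in \mathfrak{A}$ and $n_i \in \mathbb{N}$ such that $\mu(B_i) < \frac{1}{2^i}$ and
$$|f_{i,n_i}(x) - f_i(x)| < \frac{1}{2^i}$$
for all $x \in X \setminus B_i$. Now let
$$B = \bigcap_{n=1}^{\infty} \bigcup_{i=n}^{\infty} B_i.$$
Observe that $\mu(B) = 0$. Also, $B$ is the set of all those $x$ that lie in infinitely many of the sets $B_i$. Hence if $x \in X \setminus B$, it follows that $x$ lies in at most finitely many of the sets $B_i$, for which reason $|f_{i,n_i}(x) - f(x)| \to 0$ as $i \to \infty$. ∎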
Remark 4.3.1 We defined the concept of Baire function in Definition 4.1.2. There is an alternative, equivalent definition that could be given for functions defined on the real line. The Baire class $\mathcal{B}_0$ is the class of continuous functions. $\mathcal{B}_1$ is the class of all pointwise limits of functions in $\mathcal{B}_0$. For each $\alpha < \Omega$, the smallest uncountable ordinal number, we can define $\mathcal{B}_\alpha$ to be the set of all pointwise limits of functions belonging to lower Baire classes. The family of all Baire functions as defined in Definition 4.1.2 can be shown to be
$$\bigcup_{\alpha < \Omega} \mathcal{B}_\alpha.$$
The detailed explanation can be found in the book by Casper Goffman [8]. One part of the significance of the preceding theorem is that there are functions of Baire class B2 that are not of class B1. Thus almost-everywhere convergence is the best that can be expected in the theorem.
4.3.2 Convergence in Measure
There is a concept called convergence in measure of a sequence of measurable functions $f_n$ to $f$ that is especially useful in the theory of probability. In that context, it is helpful to know that the probability of a random variable $f_n$ differing from the random variable $f$ by more than $\epsilon$ is very small.
Definition 4.3.2 On a measure space $(X, \mathfrak{A}, \mu)$, a sequence of measurable functions $f_n$ is said to converge in measure to a measurable function $f$ provided that for each $\epsilon > 0$ there exists $N \in \mathbb{N}$ such that $n \ge N$ implies that
$$\mu\{x \mid |f_n(x) - f(x)| \ge \epsilon\} < \epsilon.$$
It follows readily from the definition that $f_n \to f$ in measure if and only if for each $\epsilon > 0$ and each $\eta > 0$ there exists $N \in \mathbb{N}$ such that $n \ge N$ implies
$$\mu\{x \mid |f_n(x) - f(x)| \ge \eta\} < \epsilon.$$
Thus the definition is phrased as it is for simplicity. One gains nothing that is not already there if one uses two criteria, $\epsilon$ and $\eta$.
EXERCISES
4.10 Let $(X, \mathfrak{A}, \mu)$ be a measure space for which $\mu(X) < \infty$. Suppose $f_n$ is a sequence of measurable functions such that $f_n \to f$ almost everywhere. Prove that $f_n \to f$ in measure. (Hints: Theorem 4.3.2 is helpful. You may assume either that $f$ is measurable or that the measure space is complete, explaining why either assumption has the same effect.)
4.11 Give an example of a sequence of Lebesgue measurable functions $f_n \to 0$ in measure on $[0, 1]$ for which the sequence of numbers $f_n(x)$ fails to converge to zero for any $x \in [0, 1]$.
Exercise 4.11 adds to the significance of the theorem below. The exercise explains why, in the following theorem, we will need to pass to a subsequence that is sufficiently rapidly convergent in measure to guarantee pointwise convergence almost everywhere.
Theorem 4.3.4 Let $f$ and $f_n$ be measurable functions on a $\sigma$-finite measure space $(X, \mathfrak{A}, \mu)$ for all $n$. Suppose that $f_n \to f$ in measure. Then there exists a subsequence $f_{n_\nu}$ that converges to $f$ almost everywhere as $\nu \to \infty$.
Proof: By hypothesis, for each $\nu \in \mathbb{N}$ there exists $n_\nu \in \mathbb{N}$ such that $n \ge n_\nu$ implies that
$$\mu\{x \mid |f_n(x) - f(x)| \ge 2^{-\nu}\} < 2^{-\nu}.$$
The difficulty in establishing convergence pointwise almost everywhere is that these sets can slide around and cover a big region as we vary $n \ge n_\nu$. Thus we define the
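set
$$E_\nu = \{x \mid |f_{n_\nu}(x) - f(x)| \ge 2^{-\nu}\}$$
for the individual function $f_{n_\nu}$, and we do this for each value of $\nu$. Define
$$S = \limsup E_\nu = \bigcap_{k=1}^{\infty} \bigcup_{\nu=k}^{\infty} E_\nu.$$
It is easy to check that $S$ is a null set. Moreover, $x \notin S$ if and only if $x$ lies in only finitely many of the sets $E_\nu$. Thus if $x \notin S$, we know that for sufficiently big values of $\nu$ corresponding to $x$, we have $x \notin E_\nu$. This implies that
$$|f_{n_\nu}(x) - f(x)| < 2^{-\nu}.$$
It follows that for $x \notin S$ the sequence $f_{n_\nu}(x) \to f(x)$ as $\nu \to \infty$. ∎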
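The need to pass to a subsequence can be seen in a standard "sliding indicator" sequence on $[0, 1]$. The sketch below is an illustration only: the names `block`, `f`, and `measure_where_large`, the sample point $x = 1/3$, and the choice of subsequence $n_\nu = 2^\nu$ are assumptions made for this example, and exact rational arithmetic is used to avoid rounding at interval endpoints.

```python
from fractions import Fraction

def block(n):
    """Write n = 2**k + j with 0 <= j < 2**k and return (k, j)."""
    k = n.bit_length() - 1
    return k, n - (1 << k)

def f(n, x):
    """Indicator of the dyadic interval [j/2**k, (j+1)/2**k], evaluated at x."""
    k, j = block(n)
    return 1 if Fraction(j, 2 ** k) <= x <= Fraction(j + 1, 2 ** k) else 0

def measure_where_large(n):
    """Lebesgue measure of {x : |f_n(x)| >= eps}, for any 0 < eps <= 1."""
    k, _ = block(n)
    return Fraction(1, 2 ** k)

x = Fraction(1, 3)
full = [f(n, x) for n in range(1, 257)]        # the full sequence at x keeps returning to 1
sub = [f(2 ** v, x) for v in range(0, 12)]     # the subsequence n_v = 2**v at the same x

print(sum(full), sub, measure_where_large(2 ** 10))
```

The measures of the sets where $f_n$ is large shrink to zero, so $f_n \to 0$ in measure, yet the full sequence of values at $x$ takes the value 1 once in every dyadic block. Along the subsequence, the values at this $x$ are eventually 0.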
4.4 LUSIN'S THEOREM
Lusin's theorem is paraphrased often as stating that a measurable function on $\mathbb{R}^p$ is almost a continuous function. Such phrasings can be useful as reminders of theorems that could help us in certain situations. But it is very important to remember to interpret the paraphrasing of Lusin's theorem as meaning exactly what the theorem itself states.
Theorem 4.4.1 (Lusin) Let $f : X \to \mathbb{R}$ be a measurable function defined on a Lebesgue measurable set $X \subset \mathbb{R}^p$, for which the Lebesgue measure $l(X)$ is finite. Then for each $\eta > 0$ there exists a compact subset $K \subset X$ such that $l(X \setminus K) < \eta$ and such that $f|_K$, the restriction of $f$ to $K$, is continuous on $K$.
Proof: We will undertake four restrictions of domain in order to reach a compact set $K$ on which $f|_K$ is continuous.
i. We wish to restrict $f$ to a bounded subset of $X$ so that the closed approximations of measurable sets from within will be compact. We can do this as follows. Since
$$l(X) = \lim_{k \to \infty} l(X \cap [-k, k]^p),$$
there exists a closed cube $Q = [-K, K]^p \subset \mathbb{R}^p$ large enough so that
$$l(X) \ge l(Q \cap X) > l(X) - \frac{\eta}{8}.$$
ii. We know from Exercise 4.7 that $f$ is the pointwise limit of functions $f_n \in \mathfrak{S}$, the class of simple Lebesgue measurable functions. Write
$$f_n = \sum_{i=1}^{p_n} \alpha_i^n 1_{A_i^n},$$
a linear combination of indicator functions of disjoint measurable sets $A_i^n$, with $X = \bigcup_{i=1}^{p_n} A_i^n$. (The superscripts are labels only, not exponents.) By Exercise 3.3, for each $i$ and $n$ there exists a compact set $K_i^n \subseteq Q \cap A_i^n$ such that
$$l\bigl((Q \cap A_i^n) \setminus K_i^n\bigr) < \frac{\eta}{p_n 2^{n+1}}.$$
Each function $f_n$ is continuous on $K_i^n$ because it is constant there. Note also that the cluster points of $K_i^n$ and those of $K_j^n$ for $j \ne i$ must be distinct, since both sets are compact subsets of their respective disjoint measurable sets. That is, each set contains all its limit points and the two compact sets are disjoint. Thus $f_n$ is continuous also on
$$K_n = \bigcup_{i=1}^{p_n} K_i^n.$$
Moreover
$$l(X \setminus K_n) < \frac{\eta}{8} + \sum_{i=1}^{p_n} \frac{\eta}{p_n 2^{n+1}} = \frac{\eta}{8} + \frac{\eta}{2^{n+1}}.$$
iii. Define another compact set
$$K^* = \bigcap_{n=1}^{\infty} K_n$$
so that
$$l(X \setminus K^*) < \frac{\eta}{8} + \frac{\eta}{2} = \frac{5\eta}{8}.$$
The functions $f_n$ are continuous on $K^*$ and the sequence $f_n \to f$ pointwise on $K^*$.
iv. By Egoroff's theorem there exists a measurable set $B \subseteq K^*$ with $l(B) < \frac{\eta}{8}$ and $f_n \to f$ uniformly on $K^* \setminus B$. Since the set $K^* \setminus B$ is measurable, there exists a compact set $K \subseteq K^* \setminus B$ such that
$$l\bigl((K^* \setminus B) \setminus K\bigr) < \frac{\eta}{8},$$
which implies that $l(X \setminus K) < \eta$ and $f|_K$ is continuous on $K$, being the uniform limit on $K$ of the continuous functions $f_n$. ∎
Lusin's theorem tells us that the restriction $f|_K$ is a continuous function. That is, $f|_K \in C(K, \mathbb{R})$. In other words, $f$ is continuous as a function defined only on the restricted domain, $K$. But $f$ need not be continuous at any $k \in K$ as a function defined on $X$. This is relevant to Exercise 4.12.
EXERCISES
4.12 Let $f$ be the indicator function of the set of all irrational numbers in the interval $X = [0, 1]$.
a) Show that $f$ is nowhere continuous on $[0, 1]$.
b) Let $\eta > 0$ and find a set $B$ of measure less than $\eta$ such that $f$ is continuous on $K = X \setminus B$ and such that $K$ is compact.
4.13 Let $f : [a, b] \to \mathbb{R}$ be a measurable function. Let $\eta > 0$ and $\epsilon > 0$. Prove that there exists a measurable set $B$ such that $l(B) < \eta$ and a polynomial $p$ such that
$$\sup\{|f(x) - p(x)| \mid x \in [a, b] \setminus B\} < \epsilon.$$
(Hint: Apply the Tietze Extension theorem and the Weierstrass Approximation theorem.)
4.14* Use Lusin's theorem to prove that a measurable homomorphism $f$ of the additive group of real numbers into itself must be continuous. That is, prove that if $f$ is measurable and if
$$f(x + y) = f(x) + f(y),$$
then $f$ must be continuous. (Hints: Consider a compact subset $K \subset [0, 1]$ that nearly fills $[0, 1]$ in the sense of Lebesgue measure and such that $f$ is uniformly continuous on $K$. Prove that if $h$ is sufficiently small, then $K \cap (K + h)$ is nonempty. Prove that $f$ is continuous at 0.†)
* This problem appears again in the present book as Exercise 7.7 on page 131. See how the problem is expressed in that exercise, and read the historical footnote following it. For that second introduction of the problem, an integration-based solution is suggested.
† This method was presented in a paper by S. Banach [1].
CHAPTER 5
THE INTEGRAL
We will introduce the concept of the integral, beginning with the integration of special simple functions and some easy theorems.
5.1 SPECIAL SIMPLE FUNCTIONS
Throughout this section we let $(X, \mathfrak{A}, \mu)$ be a measure space. We introduce below the concept of a special simple function, which generalizes the concept of a step function in the study of the Riemann integral. Recall that a simple function is a measurable function that achieves only finitely many distinct values, all of them real.
Definition 5.1.1 The set $\mathfrak{S}_0$ of special simple functions consists of those real-valued simple functions $f \in \mathfrak{S}$ denoted by
$$f = \sum_{i=1}^{n} \alpha_i 1_{A_i}$$
for which $\alpha_i \ne 0$ implies that $\mu(A_i) < \infty$.
Observe that for a special simple function, if $\mu(A_i) = \infty$, then we must have $\alpha_i = 0$. Next, we define the integral of a special simple function in the most natural way. The definition will clarify why we need to assume that $\mu(A_i) < \infty$ if $\alpha_i \ne 0$.
Definition 5.1.2 If $f = \sum_{i=1}^{n} \alpha_i 1_{A_i} \in \mathfrak{S}_0$, we define
$$\int_X f\,d\mu = \sum_{i=1}^{n} \alpha_i \mu(A_i). \qquad (5.1)$$
We adopt the convention, in Equation (5.1), that if $\mu(A_i) = \infty$, so that $\alpha_i = 0$, then the corresponding summand, $\alpha_i \mu(A_i)$, is to be counted as zero in the sum. We leave it for the reader to check that the integral is well defined on $\mathfrak{S}_0$, despite the fact that the decomposition of such a function as a linear combination of indicator functions of measurable sets is not unique.
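Equation (5.1) and its convention can be tried out on a small finite model. The following is a minimal sketch under stated assumptions: the four-atom space `mu_atom`, the helper names `mu` and `integral`, and the two sample decompositions are all hypothetical, chosen only to illustrate the formula.

```python
import math

# Hypothetical finite model: X consists of four atoms with the weights below;
# the atom "d" carries infinite measure.
mu_atom = {"a": 0.5, "b": 1.5, "c": 2.0, "d": math.inf}

def mu(A):
    """Measure of a set of atoms."""
    return sum(mu_atom[x] for x in A)

def integral(terms):
    """Integral of sum_i alpha_i * 1_{A_i}, given as a list of (alpha_i, A_i) pairs."""
    total = 0.0
    for alpha, A in terms:
        if alpha == 0:
            continue  # convention: a zero coefficient contributes 0 even if mu(A) is infinite
        if math.isinf(mu(A)):
            raise ValueError("not a special simple function")
        total += alpha * mu(A)
    return total

# Two different decompositions of the same function.
f_one = [(3.0, {"a", "b"}), (-1.0, {"c"}), (0.0, {"d"})]
f_two = [(3.0, {"a"}), (3.0, {"b"}), (-1.0, {"c"})]

print(integral(f_one), integral(f_two))
```

Both decompositions give the same value, in line with the well-definedness that the text leaves to the reader. The explicit skip for a zero coefficient matters in floating-point arithmetic, where the product of zero and infinity is not a number.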
Theorem 5.1.1 (Linearity of the Integral) The space $\mathfrak{S}$ is a vector space and the space $\mathfrak{S}_0$ is a vector space. Let $I : \mathfrak{S}_0 \to \mathbb{R}$ be defined by
$$I(f) = \int_X f\,d\mu.$$
Then $I$ is a linear functional, meaning that $I(\alpha f + g) = \alpha I(f) + I(g)$ for all $f$ and $g$ in $\mathfrak{S}_0$ and $\alpha \in \mathbb{R}$.
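Proof: To prove that $\mathfrak{S}_0$ and $\mathfrak{S}$ are vector spaces, it is necessary only to prove closure under subtraction and scalar multiplication. If $f$ and $g$ are in either of these two spaces, we can write
$$f = \sum_{i=1}^{m} \alpha_i 1_{A_i} \quad\text{and}\quad g = \sum_{j=1}^{n} \beta_j 1_{B_j}.$$
We will require for convenience in this proof that
$$\bigcup_{i=1}^{m} A_i = X = \bigcup_{j=1}^{n} B_j.$$
Hence we can write
$$\alpha f + g = \sum_{\substack{1 \le i \le m \\ 1 \le j \le n}} (\alpha\alpha_i + \beta_j) 1_{A_i \cap B_j}.$$
It follows by direct application of the definition of $I$ that $I(\alpha f + g) = \alpha I(f) + I(g)$. ∎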
The following exercises and theorems present basic properties of the integral of a special simple function. The proofs will be quite simple.
EXERCISES
5.1 (Monotonicity of the Integral) Let $f$ and $g$ be in $\mathfrak{S}_0$. Prove the following statements.
a) If $f(x) \ge 0$ for all $x \in X$, then $\int_X f\,d\mu \ge 0$. (Positivity of the Integral)
b) If $f(x) \le g(x)$ for all $x \in X$, then $\int_X f\,d\mu \le \int_X g\,d\mu$. (Monotonicity of the Integral) (Hint: Apply Theorem 5.1.1.)
Theorem 5.1.2 (Triangle Inequality) Let $f$, $g$, and $h$ be in $\mathfrak{S}_0$. Define
$$d(f, g) = \int_X |f - g|\,d\mu.$$
Then $d(f, h) \le d(f, g) + d(g, h)$.
Proof: For each $x \in X$,
$$|f(x) - h(x)| \le |f(x) - g(x)| + |g(x) - h(x)|$$
by the triangle inequality for real numbers. The rest follows from Exercise 5.1. ∎
Remark 5.1.1 In the vector space $\mathfrak{S}_0$, define $f \sim g$ if and only if $d(f, g) = 0$. We can verify easily that this is an equivalence relation and that this equivalence relation partitions $\mathfrak{S}_0$ into equivalence classes. For each special simple function $f$, we denote by $\tilde{f}$ the equivalence class of $f$. With the function $d$ serving as a metric, the quotient space $\mathfrak{S}_0/\!\sim$ of equivalence classes formed from $\mathfrak{S}_0$ is a metric space. Note that the verification that $d$ is a full-fledged metric on the space of equivalence classes includes the fact that if $f(x) \ge 0$ for all $x \in X$ and if $\int_X f\,d\mu = 0$, then $f(x) = 0$ almost everywhere.
Definition 5.1.3 Let $f = \sum_{i=1}^{n} \alpha_i 1_{A_i} \in \mathfrak{S}_0$ and let $A \in \mathfrak{A}$, the $\sigma$-field of measurable sets. Then the pointwise product $f 1_A \in \mathfrak{S}_0$, and we define
$$\int_A f\,d\mu = \int_X f 1_A\,d\mu.$$
EXERCISES
5.3 (Countable Additivity as a Set Function) If $f \in \mathfrak{S}_0$ and if we have a sequence of disjoint measurable sets $E_i \in \mathfrak{A}$, then use the countable additivity of $\mu$ to prove that
$$\int_{\bigcup_{i} E_i} f\,d\mu = \sum_{i} \int_{E_i} f\,d\mu.$$
5.4 Let $X = [a, b]$, a closed finite interval, and let $S = \mathbb{Q} \cap X$. Prove that $1_S \in \mathfrak{S}_0$ with respect to Lebesgue measure on $X$. Find $\int_X 1_S\,dl$, and compare $\int_X 1_S\,dl$ with Exercise 1.3.
5.2 EXTENDING THE DOMAIN OF THE INTEGRAL
We are ready to expand the domain of functions that we can integrate. We will begin with the bounded measurable functions having a carrier of finite measure.
Definition 5.2.1 Let $\mathfrak{S}_1 \supseteq \mathfrak{S}_0$ be the family of functions $f$ that are not necessarily simple but that have the following properties:
i. The function $f$ is $\mathfrak{A}$-measurable.
ii. There is a set $A \in \mathfrak{A}$ with $\mu(A) < \infty$ such that $f(x) = 0$ for all $x \notin A$. (The set $A$ is called a carrier of $f$. Property (ii) says that $f$ has a carrier $A$ of finite measure.)
iii. There is a nonnegative constant $M \in \mathbb{R}$ such that $|f(x)| \le M$ for all $x \in A$.
The definition may be summarized in words by saying that $f \in \mathfrak{S}_1$ if and only if $f$ is bounded and measurable with a carrier of finite measure. By Theorem 4.3.1, we know that there exist both a monotone nondecreasing sequence $g_n \in \mathfrak{S}_0$ and a monotone nonincreasing sequence $h_n \in \mathfrak{S}_0$ of simple functions, both of which converge uniformly on $A$ to $f$. That is, $\|h_n - f\|_{\sup} \to 0$ and $\|g_n - f\|_{\sup} \to 0$, where
$$\|\phi\|_{\sup} = \sup\{|\phi(x)| \mid x \in A\}$$
is the sup-norm of any function $\phi$ defined on $A$. From the triangle inequality for the sup-norm, it follows that $\|h_n - g_n\|_{\sup} \to 0$ as $n \to \infty$ as well. Define
$$\Phi(\phi) = \int_A \phi\,d\mu$$
for each $\phi \in \mathfrak{S}_0$. We know from the monotonicity of the integral of a special simple function that $\Phi(g_n) \le \Phi(h_n)$ for all $n$, that $\Phi(g_n)$ is a nondecreasing sequence, and that $\Phi(h_n)$ is a nonincreasing sequence. Moreover,
$$\Phi(h_n) - \Phi(g_n) = \Phi(h_n - g_n) \le \mu(A)\,\|h_n - g_n\|_{\sup} \to 0$$
as $n \to \infty$. Recall that every bounded monotone sequence of real numbers must converge. This argument establishes also that for each monotone nondecreasing sequence $g_n \in \mathfrak{S}_0$ converging uniformly to $f$ and for each monotone nonincreasing sequence $h_n \in \mathfrak{S}_0$ converging uniformly to $f$, $\lim \Phi(g_n)$ exists and equals $\lim \Phi(h_n)$. This justifies the following definition.
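Definition 5.2.2 If $f \in \mathfrak{S}_1$, we define
$$\int_X f\,d\mu = \lim_{n \to \infty} \int_X \phi_n\,d\mu,$$
where $\phi_n$ is any monotone sequence of special simple functions converging uniformly to $f$ on its carrier set $A$ of finite measure.
If $E \in \mathfrak{A}$ and $f$ is as in Definition 5.2.2, then $f 1_E \in \mathfrak{S}_1$, and we observe that
$$\int_X f 1_E\,d\mu = \lim_{n \to \infty} \int_E \phi_n\,d\mu.$$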
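Corollary 5.2.1 Let $f \in \mathfrak{S}_1$ and let $s_n \in \mathfrak{S}_0$ be any sequence, not necessarily monotone, converging uniformly to $f$ on its carrier set $A$ of finite measure. Then
$$\int_X s_n\,d\mu \to \int_X f\,d\mu$$
as $n \to \infty$.
Proof: Let $\phi_n$ be any monotonic sequence, as in Definition 5.2.2. We know that $\int_A \phi_n\,d\mu \to \int_A f\,d\mu$. But
$$\left|\int_A s_n\,d\mu - \int_A \phi_n\,d\mu\right| \le \mu(A)\,\|s_n - \phi_n\|_{\sup} \to 0$$
as $n \to \infty$. This proves that $\lim_{n \to \infty} \int_X s_n\,d\mu$ exists and equals $\int_X f\,d\mu$. ∎
Remark 5.2.1 If $f \in \mathfrak{S}_1$, then
$$\int_X f\,d\mu = \sup\left\{\int_X \phi\,d\mu \;\middle|\; \phi \in \mathfrak{S}_0,\ \phi \le f\right\}.$$
The reader can prove this by proving two inequalities. One of the inequalities, that
$$\int_X f\,d\mu \le \sup\left\{\int_X \phi\,d\mu \;\middle|\; \phi \in \mathfrak{S}_0,\ \phi \le f\right\},$$
is immediate. For the other direction, fix any $\phi \in \mathfrak{S}_0$ with $\phi \le f$ and consider any sequence of increasing special simple functions $\phi_n$ converging uniformly to $f$. Then the sequence of maximum functions $\phi_n \vee \phi \to f$ in the same way. Now the other inequality follows directly.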
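The squeeze between lower and upper simple approximations that underlies Definition 5.2.2 and Remark 5.2.1 can be checked numerically. The sketch below is an assumption-laden illustration: it takes $f(x) = x^2$ on $[0, 1]$ with Lebesgue measure, uses the dyadic approximations from Theorem 4.3.1, and relies on the fact that each level set of this particular $f$ is an interval whose length is computable in closed form. The function names are hypothetical.

```python
import math

def lower_integral(n):
    """Integral of g_n = floor(2**n * f) / 2**n for f(x) = x**2 on [0, 1].

    The level set {g_n = k/2**n} is [sqrt(k/2**n), sqrt((k+1)/2**n)).
    """
    s = 2 ** n
    return sum((k / s) * (math.sqrt((k + 1) / s) - math.sqrt(k / s)) for k in range(s))

def upper_integral(n):
    """Integral of h_n = ceil(2**n * f) / 2**n for the same f.

    The level set {h_n = k/2**n} is (sqrt((k-1)/2**n), sqrt(k/2**n)].
    """
    s = 2 ** n
    return sum((k / s) * (math.sqrt(k / s) - math.sqrt((k - 1) / s)) for k in range(1, s + 1))

for n in (2, 6, 10):
    print(n, lower_integral(n), upper_integral(n))
```

The lower values increase, the upper values decrease, their difference is $2^{-n}$, and both bracket the value $1/3$, which is the integral of $x^2$ over $[0, 1]$.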
EXERCISE
5.5 Prove that the mapping $\Phi : \mathfrak{S}_1 \to \mathbb{R}$ defined by $\Phi(f) = \int_X f\,d\mu$ has the following properties:
a) $\mathfrak{S}_1$ is a vector space and the integral is linear.
b) The integral is positive and monotone, meaning that if $f \in \mathfrak{S}_1$ is everywhere nonnegative, then $\Phi(f) \ge 0$. If $f \le g$, then
$$\int_X f\,d\mu \le \int_X g\,d\mu.$$
c) Show that the integral satisfies the triangle inequality
$$\left|\int_X f\,d\mu\right| \le \int_X |f|\,d\mu$$
and $\int_X |f|\,d\mu = 0$ if and only if $f(x) = 0$ almost everywhere.
d) Show that $d(f, g) = \int_X |f - g|\,d\mu$ is a semimetric on $\mathfrak{S}_1$.
e) Show that the integral is countably additive as a set function:
$$\int_A f\,d\mu = \sum_{i \in \mathbb{N}} \int_{A_i} f\,d\mu,$$
provided that $A = \bigcup_{i \in \mathbb{N}} A_i$ has finite measure and the sets $A_i$ are measurable and mutually disjoint. (Hint: Show that finite additivity follows from part (a) above. Show that
$$\int_A f\,d\mu = \sum_{i=1}^{N} \int_{A_i} f\,d\mu + \int_{\bigcup_{i > N} A_i} f\,d\mu$$
and prove that one of these terms approaches zero as $N \to \infty$.)
5.2.1 The Class $\mathcal{L}^+$ of Nonnegative Measurable Functions
We are ready to take the next-to-last step in the construction of the family of integrable functions in the measure space $(X, \mathfrak{A}, \mu)$. Specifically, we will be defining the class $\mathcal{L}^+$ of all nonnegative integrable functions on the given abstract measure space. (The letter $\mathcal{L}$ is used in honor of Lebesgue.) Let $f$ be any extended real-valued, nonnegative, $\mathfrak{A}$-measurable function defined on $X$. Let $A \in \mathfrak{A}$ and define the truncation
$$f_A^M(x) = \begin{cases} f(x) & \text{if } x \in A,\ f(x) \le M, \\ M & \text{if } x \in A,\ f(x) > M, \\ 0 & \text{if } x \in X \setminus A. \end{cases} \qquad (5.2)$$
Observe that $f_A^M \in \mathfrak{S}_1$ provided that $\mu(A) < \infty$. Continue to denote $\Phi(f) = \int_X f\,d\mu$ if $f \in \mathfrak{S}_1$.
Definition 5.2.3 Let $f$ be any extended real-valued, nonnegative, $\mathfrak{A}$-measurable function defined on $X$, and define the truncation $f_A^M$ as in Equation (5.2). Then we define
$$\int_X f\,d\mu = \Phi(f) = \sup_{\substack{0 < M \\ A \in \mathfrak{A},\ \mu(A) < \infty}} \Phi\bigl(f_A^M\bigr).$$
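Definition 5.2.3 can be illustrated by watching how the integrals of truncations behave as $M$ grows. The sketch below is a rough numerical illustration under stated assumptions: it works on $(0, 1]$ with Lebesgue measure and $A = (0, 1]$, approximates integrals by a midpoint rule (the helper `truncated_integral` and its `cells` parameter are hypothetical), and compares two sample functions. For $x^{-1/2}$ a direct calculation gives $2 - 1/M$ for the truncated integral when $M \ge 1$, and for $x^{-1}$ it gives $1 + \ln M$.

```python
import math

def truncated_integral(f, M, cells=100_000):
    """Midpoint-rule approximation of the integral of min(f, M) over (0, 1]."""
    h = 1.0 / cells
    return h * sum(min(f((i + 0.5) * h), M) for i in range(cells))

inv_sqrt = lambda x: 1.0 / math.sqrt(x)   # truncated integrals stay below 2
inv = lambda x: 1.0 / x                   # truncated integrals grow without bound

for M in (10.0, 100.0, 1000.0):
    print(M, truncated_integral(inv_sqrt, M), truncated_integral(inv, M))
```

The first column of values increases toward 2, so the supremum is finite, while the second grows like $1 + \ln M$, so the supremum over all truncations is infinite.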
Definition 5.2.4 An extended real-valued, nonnegative, $\mathfrak{A}$-measurable function $f$ is said to be integrable provided that $\Phi(f) < \infty$. The class of all such functions is denoted by $\mathcal{L}^+$.
Lemma 5.2.1 If the measure space $(X, \mathfrak{A}, \mu)$ is approximately finite, and if $f \in \mathcal{L}^+$, then $f$ has a $\sigma$-finite carrier. That is, $f$ vanishes off some $\sigma$-finite set.
Proof: We are given that $\int_X f\,d\mu < \infty$. However, the minimal carrier $C$ of $f$ can be expressed as the set
$$C = \bigcup_{n \in \mathbb{N}} f^{-1}\left[\tfrac{1}{n}, \infty\right].$$
If any one of these inverse images had infinite measure, then monotonicity together with the approximate finiteness of the measure space would imply that the integral of $f$ must be infinite. This contradiction proves the lemma. ∎
Lemma 5.2.2 Assume that $f$ is nonnegative, $\mathbb{R}^*$-valued, and $\mathfrak{A}$-measurable. If $(X, \mathfrak{A}, \mu)$ is $\sigma$-finite, then $X = \bigcup_{i \in \mathbb{N}} A_i$, with $\mu(A_i) < \infty$ for each $i$, and $A_1 \subset A_2 \subset \cdots \subset A_i \subset \cdots$. Then the following are equivalent forms of Definition 5.2.3:
ii. $\Phi(f) = \sup\{\Phi(g) \mid 0 \le g \le f,\ g \in \mathfrak{S}_1\}$.
iii. $\Phi(f) = \sup\{\Phi(g) \mid 0 \le g \le f,\ g \in \mathfrak{S}_0\}$.
We leave the proofs of these equivalences to the reader. However, the third form should be noted carefully, because it does not involve the intermediate step of employing $\mathfrak{S}_1$. It is correct because $\Phi(f)$ can be approximated within $\epsilon/2$ by $\Phi(g)$, with $g \in \mathfrak{S}_1$. And $\Phi(g)$ can be approximated within $\epsilon/2$ by $\Phi(h)$, with $h \in \mathfrak{S}_0$ and with $h \le g \le f$. See Remark 5.2.1. (An inequality among functions with no variable named means that the inequality is valid for all values of the variable in the domain.) As for the several preceding domains for the integral, we define
$$\int_A f\,d\mu = \int_X f 1_A\,d\mu$$
for each $f \in \mathcal{L}^+$ and for each $A \in \mathfrak{A}$.
Theorem 5.2.1 The integral $\Phi : \mathcal{L}^+ \to \mathbb{R}$ has the following properties:
i. Additivity and Positive Homogeneity: $\Phi(\alpha f + g) = \alpha\Phi(f) + \Phi(g)$ if $\alpha \ge 0$.
ii. Monotonicity: If $f \le g$, then $\Phi(f) \le \Phi(g)$, and in particular, $0 \le g$ implies that $0 \le \Phi(g)$.
iii. Triangle Inequality: $\Phi(|f - h|) \le \Phi(|f - g|) + \Phi(|g - h|)$ if $f$, $g$, and $h$ are in $\mathcal{L}^+$.
iv. Positive Definiteness:* $\Phi(f) = 0$ if and only if $f(x) = 0$ almost everywhere.
v. Countable Additivity as a Set Function:
$$\int_{\bigcup_{i \in \mathbb{N}} A_i} f\,d\mu = \sum_{i \in \mathbb{N}} \int_{A_i} f\,d\mu$$
if the measurable sets $A_i$ are mutually disjoint.
* As in Remark 5.1.1, the property proven here becomes positive definiteness in the proper sense of the word when we form a quotient space $\mathcal{L}^+/\!\sim$, where $f \sim g$ if and only if $f = g$ almost everywhere.
Proof:
i. One should note that $\mathcal{L}^+$ is not a vector space. Observe that because $\Phi$ is monotone on $\mathfrak{S}_1$, we have
$$\Phi\bigl((f+g)_A^M\bigr) \overset{(a)}{\le} \Phi\bigl(f_A^M + g_A^M\bigr) \overset{(b)}{\le} \Phi\bigl((f+g)_A^{2M}\bigr) \overset{(c)}{\le} \Phi(f + g) \qquad (5.3)$$
for all $A$ and $M$. The inequality (a) in Inequalities (5.3), together with the finite additivity of $\Phi$ on $\mathfrak{S}_1$, gives us
$$\Phi(f + g) \le \Phi(f) + \Phi(g) < \infty.$$
This proves also that $\mathcal{L}^+$ is closed under addition. Moreover, each individual term labeled with $M$ and with $A$ is monotonically increasing with increasing value of $M$, and with increasing $A$ by inclusion, because $\Phi$ is monotone on $\mathfrak{S}_1$. The inequalities (b) and (c) in Inequalities (5.3) tell us that
$$\Phi\bigl(f_A^M\bigr) + \Phi\bigl(g_{A'}^{M'}\bigr) \le \Phi\bigl(f_{A \cup A'}^{\max(M, M')}\bigr) + \Phi\bigl(g_{A \cup A'}^{\max(M, M')}\bigr) \le \Phi(f + g)$$
for all $A$, $A'$, $M$, and $M'$. Taking the suprema separately over $A$, $M$ and $A'$, $M'$ yields
$$\Phi(f) + \Phi(g) \le \Phi(f + g),$$
proving additivity. Homogeneity under multiplication by a nonnegative scalar is easy to verify.
ii. Monotonicity is clear since $\Phi(f) \ge 0$ if $f \ge 0$.
iii. Although $\mathcal{L}^+$ is not a vector space, it is still true that if $f$ and $g$ lie in $\mathcal{L}^+$, then $|f - g| \in \mathcal{L}^+$. The triangle inequality is clear, using (i) and (ii) above, since
$$|f - g| \le |f - h| + |h - g|.$$
iv. Positive definiteness follows from the fact that $\Phi(f) = 0$ if and only if $\Phi(f_A^M) = 0$ for all $A$ and all $M$. Since $f_A^M \in \mathfrak{S}_1$, we know that $f_A^M = 0$ almost everywhere. The reader should verify that this implies the same for $f$ itself.
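v. Because $\Phi$ is finitely additive as a function of the integrand, it is also finitely additive over unions of finitely many disjoint domains. In fact, we can write $f 1_{A \cup B} = f 1_A + f 1_B$ for disjoint measurable sets $A$ and $B$. Thus we could establish that
$$\int_{\bigcup_{i=1}^{\infty} A_i} f\,d\mu = \int_{\bigcup_{i=1}^{N} A_i} f\,d\mu + \int_{\bigcup_{i=N+1}^{\infty} A_i} f\,d\mu = \sum_{i=1}^{N} \int_{A_i} f\,d\mu + \int_{\bigcup_{i=N+1}^{\infty} A_i} f\,d\mu,$$
which must then equal $\sum_{i=1}^{\infty} \int_{A_i} f\,d\mu$, provided that we can show that
$$\int_{\bigcup_{i=N+1}^{\infty} A_i} f\,d\mu \to 0 \quad\text{as } N \to \infty.$$
Let $\epsilon > 0$. There exists a truncation $f_A^M$ such that
$$\int_X f\,d\mu - \int_X f_A^M\,d\mu < \frac{\epsilon}{2}.$$
Therefore
$$\int_{\bigcup_{i=N+1}^{\infty} A_i} f\,d\mu - \int_{\bigcup_{i=N+1}^{\infty} A_i} f_A^M\,d\mu < \frac{\epsilon}{2}$$
for all $N$. However, we know that $f_A^M \in \mathfrak{S}_1$, so that the integral of $f_A^M$ is a countably additive set function. It follows that
$$\int_{\bigcup_{i=N+1}^{\infty} A_i} f_A^M\,d\mu \to 0$$
as $N \to \infty$. This implies that, for all sufficiently big $N$, we have $\int_{\bigcup_{i=N+1}^{\infty} A_i} f\,d\mu < \epsilon$, and this concludes the proof of the theorem. ∎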
5.2.2 The Class $\mathcal{L}$ of Lebesgue Integrable Functions
We are ready at last to define the general concept of a Lebesgue integrable function and the Lebesgue integral.
Definition 5.2.5 An $\mathfrak{A}$-measurable, extended real-valued function $f : X \to \mathbb{R}^*$ is said to be integrable provided that both $f^+ = \max(f, 0)$ and $f^- = \max(-f, 0)$ belong to $\mathcal{L}^+$. In this case we say that $f \in \mathcal{L}$, the set of all integrable functions, and we define
$$\Phi(f) = \Phi(f^+) - \Phi(f^-).$$
The value of $\Phi(f)$ is normally written as $\int_X f\,d\mu$.
We remark that if $f^+$ and $f^-$ are defined as above, then $f = f^+ - f^-$. Also, $|f| = f^+ + f^-$.
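The decomposition $f = f^+ - f^-$ and the resulting definition of the integral can be sketched in the simplest setting of counting measure on a finite set, where integrals are finite sums. This is only an illustration: the choice of counting measure, the sample values in `f`, and the helper names are assumptions made for the example.

```python
def positive_part(f):
    """f^+ = max(f, 0), computed pointwise."""
    return {x: max(v, 0.0) for x, v in f.items()}

def negative_part(f):
    """f^- = max(-f, 0), computed pointwise."""
    return {x: max(-v, 0.0) for x, v in f.items()}

def integral_nonneg(g):
    """Integral of a nonnegative, finitely supported g with respect to counting measure."""
    return sum(g.values())

def integral(f):
    """Phi(f) = Phi(f^+) - Phi(f^-)."""
    return integral_nonneg(positive_part(f)) - integral_nonneg(negative_part(f))

# Assumed sample function on the points 1, 2, 3, 4.
f = {1: 2.0, 2: -3.5, 3: 0.0, 4: 1.25}
abs_f = {x: abs(v) for x, v in f.items()}

print(integral(f), integral_nonneg(abs_f))
```

Here the integral of $f$ is $3.25 - 3.5 = -0.25$, while the integral of $|f| = f^+ + f^-$ is $6.75$, which bounds the absolute value of the integral of $f$.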
EXERCISE
5.6 Let $f : X \to \mathbb{R}^*$ be any $\mathfrak{A}$-measurable function. Prove that $f \in \mathcal{L}$ if and only if $|f| \in \mathcal{L}^+$. Use this to prove that $\mathcal{L}$ is a vector space.
Theorem 5.2.2 Let $f$ and $g$ be in $\mathcal{L}$, and let $\alpha$ be in $\mathbb{R}$.
i. If $f = f_1 - f_2$ with $f_1$ and $f_2$ in $\mathcal{L}^+$, then
$$\int_X f\,d\mu = \int_X f_1\,d\mu - \int_X f_2\,d\mu.$$
(We do not assume here that $f_1 = f^+$ or that $f_2 = f^-$.)
ii. Linearity: $\Phi(\alpha f + g) = \alpha\Phi(f) + \Phi(g)$.
iii. Monotonicity: If $f \le g$, then $\Phi(f) \le \Phi(g)$. Also, $\int_X |f|\,d\mu = 0$ if and only if $f = 0$ almost everywhere.
iv. Triangle Inequality: If $f \in \mathcal{L}$, then $|f| \in \mathcal{L}$ and
$$\left|\int_X f\,d\mu\right| \le \int_X |f|\,d\mu.$$
v. Countable Additivity as a Set Function: Let the integral over a measurable subset $A \subseteq X$ be defined by
$$\int_A f\,d\mu = \int_X f 1_A\,d\mu.$$
Then the integral over $A$ is countably additive as a set function of $A$.
Proof:
i. In this case, $f = f_1 - f_2 = f^+ - f^-$ expresses $f$ in two ways as the difference of functions in $\mathcal{L}^+$. Hence $f_1 + f^- = f^+ + f_2$. Thus
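$$\Phi(f_1 + f^-) = \Phi(f_1) + \Phi(f^-) = \Phi(f^+) + \Phi(f_2).$$
It follows that
$$\Phi(f) = \Phi(f^+) - \Phi(f^-) = \Phi(f_1) - \Phi(f_2).$$
ii. If $f = f_1 - f_2$, as in the preceding part, and if $\alpha > 0$, then $\Phi(\alpha f) = \alpha\Phi(f)$ by Theorem 5.2.1. And if $\alpha < 0$, then $\alpha f = |\alpha|(f_2 - f_1)$, so that
$$\Phi(\alpha f) = |\alpha|\Phi(f_2) - |\alpha|\Phi(f_1) = \alpha[\Phi(f_1) - \Phi(f_2)] = \alpha\Phi(f).$$
Now suppose also that $g = g_1 - g_2$ with $g_1$ and $g_2$ in $\mathcal{L}^+$. Then
$$\Phi(f + g) = \Phi(f_1 + g_1) - \Phi(f_2 + g_2) = \Phi(f) + \Phi(g).$$
iii. If $f \le g$, then $g = f + (g - f)$ and $\Phi(g) = \Phi(f) + \Phi(g - f)$, where $\Phi(g - f) \ge 0$ since $g - f \in \mathcal{L}^+$. Also, if $\Phi(|f|) = 0$, then $|f| = 0$ almost everywhere, and the same is true for $f$. Conversely, if $f = 0$ almost everywhere, then $\Phi(|f|) = 0$.
iv. Note that
$$\left|\int_X f\,d\mu\right| = |\Phi(f^+) - \Phi(f^-)| \le \Phi(f^+) + \Phi(f^-) = \int_X |f|\,d\mu.$$
v. We write $f = f^+ - f^-$ as before and let $A_i \subseteq A$ be a sequence of mutually disjoint sets. Then
$$\int_{\bigcup_i A_i} f^+\,d\mu = \sum_i \int_{A_i} f^+\,d\mu.$$
The latter series is absolutely convergent, being bounded by $\int_X |f|\,d\mu$, as in the preceding part. We know this from an earlier theorem establishing the countable additivity of this function for functions in $\mathcal{L}^+$. Also,
$$\int_{\bigcup_i A_i} f^-\,d\mu = \sum_i \int_{A_i} f^-\,d\mu,$$
with the series on the right side being absolutely convergent. It follows from standard theorems about absolutely convergent series that
$$\int_{\bigcup_i A_i} f\,d\mu = \sum_i \int_{A_i} f\,d\mu.$$
∎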
EXERCISES
5.7 Prove that the Lebesgue integral on $\mathbb{R}^n$ is translation-invariant. That is, if $f \in \mathcal{L}(\mathbb{R}^n)$ and if $f_t(x) = f(x + t)$ for all $x$ and $t$ in $\mathbb{R}^n$, then
$$\int_{\mathbb{R}^n} f_t\,dl = \int_{\mathbb{R}^n} f\,dl.$$
5.8 Prove that the Lebesgue integral on $\mathbb{R}^n$ is invariant under reflections through the origin. That is, if $f \in \mathcal{L}(\mathbb{R}^n)$, then
$$\int_{\mathbb{R}^n} f(-x)\,dl(x) = \int_{\mathbb{R}^n} f(x)\,dl(x).$$
5.9 Define the $\sigma$-finite counting measure $\nu$ in the measure space $(\mathbb{N}, \mathfrak{P}(\mathbb{N}), \nu)$ as in Exercise 2.15. Show that a function $f : \mathbb{N} \to \mathbb{R}$ is integrable on $\mathbb{N}$ if and only if the sequence $f(n)$ is absolutely summable. Prove that if $f$ is integrable, then
$$\int_{\mathbb{N}} f\,d\nu = \sum_{n \in \mathbb{N}} f(n).$$
5.10 Using Definition 1.2.2, show that if $f : [a, b] \to \mathbb{R}$ is a Riemann integrable function, then it must be Lebesgue integrable as well, and the Lebesgue integral has the same value as the Riemann integral of $f$. (Hint: Each Riemann integrable function can be approximated from below by step functions, which are very special simple functions.)
5.2.3 Convex Functions and Jensen's Inequality
In this subsection, we will prove a useful generalization, known as Jensen's inequality, of the triangle inequality from Theorem 5.2.2. The reader should recall from advanced calculus that a connected subset of the real line is called an interval, whether it is finitely long or infinitely long. A subset $S$ of a vector space is called convex provided that for each pair of points $P$ and $Q$ in $S$, the straight line segment joining $P$ to $Q$ lies in $S$.
Definition 5.2.6 We call a function $\phi : I \to \mathbb{R}$ a convex function on the interval $I$ if and only if the so-called epigraph,
$$\epsilon(\phi) = \{(t, y) \mid t \in I,\ y \ge \phi(t)\},$$
is a convex subset of the plane. A function $\phi$ is called concave provided that $-\phi$ is convex.
The following lemma identifies an important geometrical property of convex functions.
Lemma 5.2.3 Let $\phi : I \to \mathbb{R}$ be a convex function on the interval $I$, and let $c$ be any interior point of $I$. Then there exists a real number $m$ such that
$$m(t - c) + \phi(c) \le \phi(t) \qquad (5.4)$$
for all $t \in I$. A straight line that is the graph in the plane of $y = m(t - c) + \phi(c)$, satisfying Inequality (5.4), is called a supporting line for $\phi$ at $t = c$.
Proof: We let
$$C^- = \left\{\frac{\phi(t) - \phi(c)}{t - c} \;\middle|\; t < c\right\} \quad\text{and}\quad C^+ = \left\{\frac{\phi(t) - \phi(c)}{t - c} \;\middle|\; t > c\right\},$$
with $t$ required to belong to $I$. The convexity of $\epsilon(\phi)$ implies that each element of $C^-$ is a lower bound of $C^+$: Otherwise, the convexity of $\epsilon(\phi)$ would produce an element of the epigraph below the point $(c, \phi(c))$. The reader should do a calculation with convex combinations of points to verify this claim by letting $t_1 < c < t_2$ and showing that
$$\frac{\phi(t_1) - \phi(c)}{t_1 - c} \le \frac{\phi(t_2) - \phi(c)}{t_2 - c}.$$
Hence there is a real number $m = \inf(C^+)$. One can check readily that $m$ satisfies Inequality (5.4). ∎
Jensen's inequality pertains to integrals on probability spaces, which we define below.
Definition 5.2.7 A measure space $(X, \mathfrak{A}, \mu)$ is called a probability space, and the nonnegative measure $\mu$ is called a probability measure, provided that $\mu(X) = 1$.
Theorem 5.2.3 (Jensen's Inequality) Let $\phi$ be any convex, Borel measurable function defined on an interval $I$, which may be either finite or infinite. Let $f$ be a real-valued integrable function on a probability space $(X, \mathfrak{A}, \mu)$. Suppose that the range of $f$ is contained in $I$. Then
$$\phi\left(\int_X f\,d\mu\right) \le \int_X \phi \circ f\,d\mu,$$
provided that $\phi \circ f$ is integrable.
Proof: We begin by letting $c = \int_X f\,d\mu$. If $c$ were an endpoint of $I$, it would follow that the function $f$ is equal almost everywhere to a constant, and this would imply that Jensen's inequality is satisfied by being an equality. The main case is that in which $c$ is an interior point of $I$, and then we let $m$ denote the slope of any supporting line $y = m(t - c) + \phi(c)$ at $t = c$ for the convex function $\phi$. Thus
$$m(t - c) + \phi(c) \le \phi(t)$$
for all $t \in I$. It follows from monotonicity of the integral that
$$\int_X \phi \circ f\,d\mu \ge \int_X [m(f - c) + \phi(c)]\,d\mu = m\left(\int_X f\,d\mu - c\right) + \phi(c) = \phi(c),$$
since $\mu$ is a probability measure. ∎
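Jensen's inequality and the supporting-line step of its proof can be checked on a small discrete probability space. The following is a minimal sketch under stated assumptions: the four-point space, the values of $f$, the probabilities, and the choice $\phi = \exp$ are all illustrative, and the slope of the supporting line is taken to be the derivative of $\exp$ at $c$, which is available only because this $\phi$ happens to be differentiable.

```python
import math

def expectation(values, probs):
    """Integral of a function on a finite probability space."""
    return sum(v * p for v, p in zip(values, probs))

values = [-1.0, 0.0, 2.0, 5.0]   # assumed values of f on four atoms
probs = [0.1, 0.4, 0.3, 0.2]     # assumed probabilities of the atoms, summing to 1
phi = math.exp                   # a convex function on the real line

c = expectation(values, probs)                       # c = integral of f
lhs = phi(c)                                         # phi(integral of f)
rhs = expectation([phi(v) for v in values], probs)   # integral of phi(f)
m = math.exp(c)                                      # slope of a supporting line of exp at c

print(c, lhs, rhs)
```

With these sample values $c = 1.5$, and the left side $e^{1.5}$ is well below the right side, which is dominated by the term $0.2\,e^{5}$. The supporting line $m(t - c) + \phi(c)$ lies below $\phi$ at each of the sample values, which is the pointwise inequality that the proof integrates.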
Remark 5.2.2 Observe that since $\phi$ is Borel measurable, it follows that $\phi \circ f$ is $\mu$-measurable. (See Exercise 5.14.) The connection between Jensen's inequality and the triangle inequality, from Theorem 5.2.2, is that the absolute value function is convex. It is possible to strengthen the statement of Jensen's inequality slightly by allowing for the possibility that $\int_X \phi \circ f\,d\mu = \infty$, although in that case $\phi \circ f$ is not integrable. That is, it can occur that the positive part, $(\phi \circ f)^+$, fails to be integrable, although the negative part is integrable, because of the existence of a supporting line for the convex function $\phi$. (See Exercise 5.15.) Also, the reader should note that if $\phi$ were a concave Borel function that is defined on an interval containing the range of $f$, then Jensen's inequality would imply that
$$\phi\left(\int_X f\,d\mu\right) \ge \int_X \phi \circ f\,d\mu.$$
EXERCISES
5.11 Suppose $f$ is a nonnegative integrable function on $[0, 1]$ with respect to Lebesgue measure. Prove that
5.12 Let $f$ be a nonnegative integrable function on $[0,1]$ and let $I = \int_0^1 f\,dl$, where $l$ is Lebesgue measure. Show that
5.13 Let $(X, \mathfrak{A}, \mu)$ be a probability space. Suppose that $f$ is a measurable function with $\|f\|_{\sup} < \infty$ and that $1 \le p < q < \infty$ with $p$ and $q$ being real numbers. Prove that$^{43}$
5.14 Let $\phi$ be a convex function defined on an interval $I$. Prove that $\phi$ is continuous on the interior of $I$, thereby establishing that $\phi$ is Borel measurable. (Hint: At each $c$ in $I^{\circ}$, use a supporting line for $\phi$ and a chord to prove continuity.)
5.15 Give an example of a convex function $\phi$ on an interval of the real line, together with an integrable function $f$ such that $\phi\circ f$ is not integrable. Show that the negative part, $(\phi\circ f)^-$, must be integrable.
5.3 LEBESGUE DOMINATED CONVERGENCE THEOREM
The Lebesgue integral was developed especially to deal with limits of integrals and integrals of limits. The most famous theorem about this topic is the Lebesgue Dominated Convergence theorem. (This theorem is so widely used that it is often cited simply as LDC.)
$^{43}$For a somewhat strengthened form of this exercise, see Exercise 9.2.
Theorem 5.3.1 (Lebesgue Dominated Convergence) Let $(X, \mathfrak{A}, \mu)$ be a complete measure space. Let $f_n : X \to \mathbb{R}^*$ be a sequence of $\mathfrak{A}$-measurable functions converging pointwise almost everywhere to a function $f$. Suppose there exists an integrable function $\phi \in \mathcal{L}$ such that $|f_n(x)| \le \phi(x)$ for all $x \in X$ and for all $n \in \mathbb{N}$. Then
i. $f$ is integrable and
ii. $\int_X f\,d\mu = \lim_{n\to\infty} \int_X f_n\,d\mu$.
Remark 5.3.1 The conclusion of this theorem can be rewritten slightly in the form
$$\int_X \lim_{n\to\infty} f_n\,d\mu = \lim_{n\to\infty} \int_X f_n\,d\mu.$$
This form emphasizes that the Lebesgue Dominated Convergence theorem establishes a sufficient condition for interchanging the order of two very important limits: the integral, which is a very intricate limit, and the pointwise limit of a sequence of functions. Many of the most important and useful theorems in analysis are concerned with such an interchange in the order of taking limits. Example 1.3 is an illustration that the successive application of limit operations need not be commutative. Note that the existence of the dominating function, $\phi \in \mathcal{L}$, ensures that $f$ is finite almost everywhere.
Proof: Observe first that because $|f_n| = f_n^+ + f_n^- \le \phi$, it follows that both $f_n^+$ and $f_n^-$ lie in $\mathcal{L}^+$ and thus each $f_n \in \mathcal{L}$. Thus we could as well have assumed that each $f_n$ is integrable. Also, by Theorem 4.2.1, we know that $f$ is measurable as well. What we are observing here is that $f$ is equal almost everywhere to a measurable function. In a complete measure space, this proves that $f$ is measurable.$^{44}$
i. Suppose that $\mu(X) < \infty$ and that $\phi(x) = M$, a nonnegative real constant. Let $\epsilon > 0$. By Egoroff's theorem, there exists a set $A \in \mathfrak{A}$ such that
and such that $f_n \to f$ uniformly on $X\setminus A$. Thus there exists $N \in \mathbb{N}$ such that for all $n \ge N$ we have
for all $x \in X\setminus A$. Hence for all $n \ge N$ we have
It follows that $\int_X f_n\,d\mu \to \int_X f\,d\mu$ as $n \to \infty$.
$^{44}$In an incomplete measure space, we could require that $f_n(x) \to f(x)$ everywhere on $X$. Alternatively, we could set $f(x) = 0$ on the Lebesgue null set of nonconvergence.
ii. This will be the general case, and we will see how the first case facilitates the second. Since $\phi$ is nonnegative and integrable, there exists a special simple function, $\phi_0 \in \mathfrak{S}_0$, such that $0 \le \phi_0 \le \phi$ and
Since $f_n = f_n^+ - f_n^- \to f = f^+ - f^-$ almost everywhere on $X$, it follows that $f_n^+ = f_n \vee 0 \to f \vee 0 = f^+$ almost everywhere. Similarly, $f_n^- \to f^-$ almost everywhere. Because $f^+$ and $f^-$ are nonnegative, it will suffice to prove the theorem for $f \ge 0$. Also,
$$f_n \wedge \phi_0 \to f \wedge \phi_0$$
almost everywhere on $X$, and $0 \le f_n \wedge \phi_0 \le \phi_0$.
Because special simple functions are finite linear combinations of indicator functions of measurable sets of finite measure, the proof in case (i) applies as well to upper bounds that are special simple functions as it applies to upper bounds that are constant on a set of finite measure. Hence case (i) tells us that there is an $N \in \mathbb{N}$ such that if $n \ge N$, we have
Next, observe that since $0 \le \phi_0 \wedge f_n \le f_n \le \phi$, it follows that
and
Thus for each $n \ge N$ we have
$< \epsilon$. Thus we have proven that
$$\lim_{n\to\infty} \int_X f_n\,d\mu = \int_X f\,d\mu.$$
EXAMPLE 5.1
Let $S = \{r_n \mid n \in \mathbb{N}\} = \mathbb{Q} \cap [0, 1]$. Let $f_n = 1_{\{r_1,\ldots,r_n\}}$ for each $n \in \mathbb{N}$. Thus $|f_n| \le 1 \in \mathcal{L}[0, 1]$ for all $n$ and $f_n \to 1_S$ as $n \to \infty$. This implies by the Lebesgue Dominated Convergence theorem that $1_S \in \mathcal{L}[0, 1]$ and that
$$\int_{[0,1]} 1_S\,d\mu = \lim_{n\to\infty} \int_{[0,1]} f_n\,d\mu = 0.$$
However, $1_S$ is not even Riemann integrable on $[0,1]$.
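As a small numerical companion to the theorem, one can look at a sequence for which every hypothesis is easy to verify by hand. The Python sketch below is not from the text: it takes $f_n(x) = x^n$ on $[0,1]$, which is dominated by the constant function 1 and tends to 0 at every $x < 1$, and it uses the elementary value $\int_0^1 x^n\,dx = 1/(n+1)$. The names `f_n`, `integral_f_n`, and `sample_points` are illustrative assumptions.

```python
from fractions import Fraction

def f_n(n, x):
    """The n-th function f_n(x) = x**n on [0, 1]."""
    return x ** n

def integral_f_n(n):
    """Exact integral of x**n over [0, 1], which equals 1/(n + 1)."""
    return Fraction(1, n + 1)

# Sample points in [0, 1); at each of them x**n tends to 0 as n grows.
sample_points = [Fraction(k, 10) for k in range(10)]

# Domination: |f_n(x)| <= 1 at every sample point, for every n tried.
dominated = all(abs(f_n(n, x)) <= 1
                for n in range(1, 60) for x in sample_points)

# Pointwise smallness at the largest sample point for a large index.
tail_value = f_n(200, Fraction(9, 10))

# The integrals 1/(n + 1) shrink toward 0, the integral of the a.e. limit.
integrals = [integral_f_n(n) for n in (1, 10, 100, 1000)]
print(dominated, float(tail_value), [float(v) for v in integrals])
```

The sketch only samples finitely many points and indices, so it illustrates the conclusion of the theorem rather than proving anything.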
EXERCISES
5.16 Let the graph of fn : [O. 11 + IR be an isosceles triangle with base and altitude a,. Prove that f n -+ f = 0 on [O. 11,but that
s,o, ,
fn
dp
[A. $1
-+ *
as $n \to \infty$, provided that the sequence $a_n$ grows sufficiently rapidly as $n$ increases. Thus domination (boundedness) by an integrable function is a necessary hypothesis in Theorem 5.3.1.
5.17
Let
X fn(X) = 4 n[ - n , n ] ( . d .
Find the pointwise limit $f(x) = \lim_{n\to\infty} f_n(x)$ on $\mathbb{R}$. Prove that
Does the sequence $f_n$ satisfy the hypotheses of the Lebesgue Dominated Convergence theorem? Explain.
5.18 With respect to Lebesgue measure $l$ on $[0, 1]$, give an example of a nonmeasurable function $f : [0,1] \to \mathbb{R}$ such that $|f| \in \mathcal{L}[0, 1]$ and $\int_{[0,1]} |f|\,dl = 1$.
5.19 Let $f$ be integrable on an arbitrary measure space $(X, \mathfrak{A}, \mu)$. Let $A_n \in \mathfrak{A}$ be an arbitrary sequence of disjoint measurable sets. In Theorem 5.2.2 we proved that
$$\int_{\bigcup_{n\in\mathbb{N}} A_n} f\,d\mu = \sum_{n\in\mathbb{N}} \int_{A_n} f\,d\mu.$$
Give an alternative proof using the Lebesgue Dominated Convergence theorem.
5.20 Let $f \in \mathcal{L}[1, \infty)$ with respect to Lebesgue measure. Prove or disprove:
a) $\int_{[1,b]} f\,dl \to 0$ as $b \to 1^+$.
b) $\int_{[b,\infty)} f\,dl \to 0$ as $b \to \infty$.
c) This is a generalization of parts (a) and (b) above. Suppose that $f \in \mathcal{L}(X, \mathfrak{A}, \mu)$ is integrable on a general measure space. Prove that$^{45}$
i. If $\epsilon > 0$, there exists a set $A \in \mathfrak{A}$ of finite measure such that $\int_{X\setminus A} |f|\,d\mu < \epsilon$.
ii. $\int_A f\,d\mu \to 0$ as $\mu(A) \to 0$, where $A \in \mathfrak{A}$. (Hint: Use the definition of $\mathcal{L}(X, \mathfrak{A}, \mu)$ in terms of bounded nonnegative measurable functions with finite carrier.)
5.21 Let
Prove that $f_n \to 0$ pointwise almost everywhere. Is $\lim_{n\to\infty} \int f_n\,dl = \int \lim_{n\to\infty} f_n\,dl$? Reconcile your conclusions with the Lebesgue Dominated Convergence theorem.
5.22 Let $f \in \mathcal{L}(\mathbb{R})$. Find
and justify your conclusion.
$^{45}$Variations on this exercise appear in Exercises 5.24 and 8.10. Compare also with Exercise 2.19.
5.23
Let fn : [0, co) --+ IR be defined by X
fn(x) = - e - y
T
n
for each $n \in \mathbb{N}$. Let $f(x) = \lim_{n\to\infty} f_n(x)$. Show that for all $a \in [0, \infty)$ we have
but that Explain.
5.24 Let $E_1 \supseteq E_2 \supseteq \cdots \supseteq E_n \supseteq \cdots$ be a decreasing nest of measurable sets in the complete measure space $(X, \mathfrak{A}, \mu)$. Let $f$ be integrable on $(X, \mathfrak{A}, \mu)$ and suppose that
Prove$^{46}$ that $\int_{E_n} f\,d\mu \to 0$ as $n \to \infty$.
5.25
Let $f \in \mathcal{L}(\mathbb{R})$, and suppose also that $\int_{\mathbb{R}} |x f(x)|\,dl(x) < \infty$. Define
$$F(\alpha) = \int_{\mathbb{R}} f(x)\cos(\alpha x)\,dl(x)$$
for all $\alpha \in \mathbb{R}$. Prove that the derivative $F'(\alpha)$ exists and find its value for all $\alpha \in \mathbb{R}$. Observe that this problem deals with the interchange of order of two limits: the integral and the derivative. You will need to justify carefully bringing the derivative with respect to $\alpha$ inside the integral. (Hint: Take a sequence $h_n \to 0$ and show that
$$\lim_{n\to\infty} \frac{F(\alpha + h_n) - F(\alpha)}{h_n}$$
exists and is independent of the choice of $h_n \to 0$.)
5.26 Let $a_{m,n}$ be a double sequence of real numbers having the property that there exists a sequence $b_m$ for which $|a_{m,n}| \le |b_m|$ for all $m$ and $n$ in $\mathbb{N}$ and such that $\sum_m |b_m| < \infty$. Suppose also that $a_{m,n} \to A_m$ as $n \to \infty$ for each $m$.
a) Prove that
$$\sum_m A_m = \lim_{n\to\infty} \sum_m a_{m,n}.$$
(Hint: Apply Lebesgue Dominated Convergence, interpreting the summation on $m$ as being an integral over $\mathbb{N}$ with respect to counting measure.)
b) Give an example in which no summable sequence such as $b_m$ exists, in which the conclusion of part (a) fails.
$^{46}$Another proof is suggested in Exercise 8.10.
5.4 MONOTONE CONVERGENCE AND FATOU'S THEOREM
The following theorem extends somewhat the conclusions of the Lebesgue Dominated Convergence theorem in the case of monotone increasing sequences of nonnegative, measurable functions.
Theorem 5.4.1 (Monotone Convergence) Let $(X, \mathfrak{A}, \mu)$ be a complete measure space, and suppose $f_n : X \to \mathbb{R}$ is $\mathfrak{A}$-measurable for each $n \in \mathbb{N}$. If each $f_n \ge 0$ and if the sequence $f_n(x)$ is increasing monotonically to the limit $f(x)$ almost everywhere, then
$$\int_X f\,d\mu = \lim_{n\to\infty} \int_X f_n\,d\mu.$$
Before proving the theorem, we observe that the Monotone Convergence theorem differs from the Lebesgue Dominated Convergence theorem in that we do not assume that the sequence $f_n$ is bounded above by an integrable function. Note that $f(x)$ may be infinite. For example, the Monotone Convergence theorem will still apply even if $f_n(x)$ diverges to infinity at each $x$, though it would assert in that case only that both sides of the equation are infinite.
Proof: We know that $f$ is measurable because it is the pointwise limit almost everywhere of measurable functions, combined with the hypothesis that the measure space is complete. If $f \in \mathcal{L}(X, \mathfrak{A}, \mu)$, then the Monotone Convergence theorem is an immediate consequence of the Lebesgue Dominated Convergence theorem. Beyond the content of the Lebesgue Dominated Convergence theorem, the Monotone Convergence theorem adds only the claim that if $\int_X f\,d\mu = \infty$ then $\int_X f_n\,d\mu \to \infty$ too. Suppose therefore that $\int_X f\,d\mu = \infty$. This means that for each $M > 0$ there exists $\phi \in \mathfrak{S}_0$ such that $0 \le \phi \le f$ and $M < \int_X \phi\,d\mu < \infty$. Also, $f_n \wedge \phi$ is an increasing nonnegative sequence converging almost everywhere to the dominating function $f \wedge \phi$. Thus the Lebesgue Dominated Convergence theorem implies that
$$\int_X f_n \wedge \phi\,d\mu \to \int_X f \wedge \phi\,d\mu = \int_X \phi\,d\mu > M.$$
Hence there exists $N$ such that $n \ge N$ implies that
$$\int_X f_n\,d\mu \ge \int_X f_n \wedge \phi\,d\mu > M.$$
This proves the theorem.
As a handy application of the Monotone Convergence theorem, we prove the following result.
Theorem 5.4.2 Suppose $f$ is measurable on $(X, \mathfrak{A}, \mu)$, a $\sigma$-finite measure space. If
$$X = \bigcup_{n\in\mathbb{N}} A_n$$
is a decomposition into the union of an expanding nest of measurable sets of finite measure, then $f \in \mathcal{L}(X, \mathfrak{A}, \mu)$ if and only if
$$\lim_{n\to\infty} \int_{A_n} |f|\,d\mu < \infty.$$
Proof: Let $f_n = |f|\,1_{A_n}$ for each $n \in \mathbb{N}$. Then $f_n$ is an increasing sequence of nonnegative measurable functions converging everywhere to $|f|$. The Monotone Convergence theorem gives the conclusion that $\int_X |f|\,d\mu < \infty$ if and only if $\lim_n \int_X f_n\,d\mu < \infty$.
A simple application of this theorem would be to show that $f(x) = \frac{1}{1 + x^2}$ is integrable on $\mathbb{R}$ by showing that
$$\lim_{n\to\infty} \int_{-n}^{n} \frac{1}{1 + x^2}\,dl = \pi.$$
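The limit in this application can be watched numerically. The Python sketch below is an illustration rather than part of the text; it assumes the elementary antiderivative $\arctan$, so that the integral over $[-n, n]$ equals $2\arctan(n)$, and the function name `truncated_integral` is hypothetical.

```python
import math

def truncated_integral(n):
    """Integral of 1/(1 + x^2) over [-n, n], computed as 2*arctan(n)."""
    return 2.0 * math.atan(n)

# Integrals over the expanding nest A_n = [-n, n] for n = 1, ..., 2000.
values = [truncated_integral(n) for n in range(1, 2001)]

# The sequence is increasing, as the Monotone Convergence theorem expects.
increasing = all(a < b for a, b in zip(values, values[1:]))

# Distance from the limiting value pi at the last index.
gap = math.pi - values[-1]
print(increasing, gap)
```

The remaining gap at $n = 2000$ is $2\arctan(1/2000)$, roughly $10^{-3}$, which shows how slowly the tails of $1/(1+x^2)$ are exhausted.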
EXERCISES
5.27 Let $f(x) = \frac{1}{\sqrt{x}}$ on the interval $(0, 1)$.
a) Show that $f \in \mathcal{L}((0, 1), \mathfrak{L}, l)$.
b) If $g \in \mathfrak{S}_0$ is a special simple function satisfying $0 \le g \le f$ on $(0, 1)$, prove that $\int_{(0,1)} g\,dl \le 2$.
c) Show that $\{h \in \mathfrak{S}_0 \mid f \le h\} = \emptyset$. This helps to explain why the integral of a nonnegative measurable function is defined in terms of approximations from below by special simple functions, not from above.
Show first that g(z) = e-lzl is a Lebesgue integrable function on the real line. a) Suppose that Ifn(x)l 6 e-l"l for each x E IR and each n E IN. Suppose also that i, f n dl = 0 for each n E IN and that f n f pointwise almost everywhere on IR.Prove that f E C(R) and that i, f dl = 0. b) Now let fn(r)= e - l T l ~ [ - n , n j ( sinx. x) --f
Show that f ( x ) = e-Izl s i n r
is Lebesgue integrable on the real line and that
i,
f dl
= 0.
MONOTONE CONVERGENCE AND FATOU’S THEOREM
91
c ) Show that h ( z ) = s i n x is Ilot a Lebesgue integrable function on
IR,al-
though
s
hdl=O
[-n,n]
for all n E
IN. 5.29 Suppose that f, E C[O, 11 for each n E IN and suppose also that
fn(z) is (absolutely) convergent almost everywhere to a function Prove that CTIEN f E C[O, 11, and prove that
5.30 Let $f \in \mathcal{L}(\mathbb{R})$ be a nonnegative function. Suppose that the sequence
converges to a real number. Prove that the sequence of powers $f^n$ converges almost everywhere to the indicator function of a measurable set.
5.31 Let $f$ be a nonnegative measurable function on a $\sigma$-finite measure space $(X, \mathfrak{A}, \mu)$. Prove that $f$ is the pointwise limit of a monotone increasing sequence of special simple functions.
Another useful convergence theorem, called either Fatou’s Lemma or Fatou’s theorem, is a consequence of the Monotone Convergence theorem.
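Theorem 5.4.3 (Fatou) Suppose $(X, \mathfrak{A}, \mu)$ is a complete measure space and suppose that $f_n : X \to \mathbb{R}$ is $\mathfrak{A}$-measurable and nonnegative for each $n \in \mathbb{N}$. Suppose that $\liminf_n f_n(x) = f(x)$ almost everywhere. Then
$$\int_X f\,d\mu \le \liminf_{n\to\infty} \int_X f_n\,d\mu.$$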
Proof: By Theorem 4.2.1 we know that $f$ is measurable. Moreover,
$$\psi_n = \inf\{f_n, f_{n+1}, \ldots\}$$
is an increasing sequence of measurable functions converging pointwise almost everywhere to $f$. Since $\psi_n$ is monotone increasing, the Monotone Convergence theorem tells us that
$$\int_X \psi_n\,d\mu \to \int_X f\,d\mu.$$
Since $f_k \ge \psi_n$ for all $k \ge n$, it follows that
$$\int_X \psi_n\,d\mu \le \inf_{k \ge n} \int_X f_k\,d\mu,$$
which implies in turn that
$$\int_X f\,d\mu \le \liminf_{n\to\infty} \int_X f_n\,d\mu.$$
EXERCISES
5.33 Prove that in Fatou's theorem, we can relax the hypothesis that $f_n \ge 0$ as follows. Suppose there exists $\phi \in \mathcal{L}^+$ such that $-\phi \le f_n$. Show that the conclusion of Fatou's theorem still follows.
5.34 Give an example of a sequence of nonnegative real-valued $\mathfrak{L}$-measurable functions $f_n$ on $[0,1]$ such that $f_n \to f$ almost everywhere, yet
$$\int_{[0,1]} f\,dl < \liminf_{n\to\infty} \int_{[0,1]} f_n\,dl.$$
5.35 Give an example of a sequence of functions $f_n \in \mathcal{L}(\mathbb{R})$ for which
$$\int_{\mathbb{R}} \liminf_{n\to\infty} f_n\,d\mu > \liminf_{n\to\infty} \int_{\mathbb{R}} f_n\,d\mu.$$
Explain why your example does not violate Fatou's theorem.
5.5 COMPLETENESS OF $L^1(X, \mathfrak{A}, \mu)$ AND THE POINTWISE CONVERGENCE LEMMA
We saw in Definition 5.2.5, together with Exercise 5.6, that $\mathcal{L}(X, \mathfrak{A}, \mu)$ is a vector space. In this section we will place the structure of a normed vector space (also called a normed linear space) on $\mathcal{L}(X, \mathfrak{A}, \mu)$. First, we remind the reader what this means.
Definition 5.5.1 A (real) normed vector space is a real vector space $V$, equipped with a real-valued function called a norm, denoted by $\|\cdot\|$, provided that for all $\vec{v}$ and $\vec{w}$ in $V$, and for all $\alpha \in \mathbb{R}$, the following properties are satisfied:
i. $\|\vec{v}\| \ge 0$, and $\|\vec{v}\| = 0$ if and only if $\vec{v} = \vec{0}$.
ii. $\|\alpha\vec{v}\| = |\alpha|\,\|\vec{v}\|$.
iii. $\|\vec{v} + \vec{w}\| \le \|\vec{v}\| + \|\vec{w}\|$. (This is the triangle inequality.)
The norm that we will use is called the $L^1$-norm, named after Henri Lebesgue.
Definition 5.5.2 Let $(X, \mathfrak{A}, \mu)$ be a measure space. We define the $L^1$-norm of an integrable function $f$ by
$$\|f\|_1 = \int_X |f|\,d\mu,$$
and we note that for each $f \in \mathcal{L}(X, \mathfrak{A}, \mu)$ we have $\|f\|_1 < \infty$.
EXERCISE 5.36 Show that the $L^1$-norm on the vector space $\mathcal{L}(\mathbb{R}, \mathfrak{L}, l)$ satisfies properties (ii) and (iii) of Definition 5.5.1 but does not satisfy property (i).
The difficulty is that an integrable function $f$ can satisfy $\|f\|_1 = 0$ without $f$ being the (identically) zero function. We can remedy this defect, and turn $\mathcal{L}(X, \mathfrak{A}, \mu)$ into a normed vector space $L^1(X, \mathfrak{A}, \mu)$, as follows.
Definition 5.5.3 We call two functions $f$ and $g$ in $\mathcal{L}(X, \mathfrak{A}, \mu)$ equivalent, denoted by $f \sim g$, provided that $\|f - g\|_1 = 0$. We define
$$L^1(X, \mathfrak{A}, \mu) = \mathcal{L}(X, \mathfrak{A}, \mu)/\sim,$$
the quotient space of the vector space $\mathcal{L}(X, \mathfrak{A}, \mu)$ modulo the subspace of all functions having $L^1$-norm equal to zero. The equivalence class of an integrable function $f$ is commonly denoted $[f]$.
EXERCISE
5.37 If $f \in \mathcal{L}(X, \mathfrak{A}, \mu)$, prove that $\|f\|_1 = 0$ if and only if $f(x) = 0$ almost everywhere.
When dealing with $L^1(X, \mathfrak{A}, \mu)$, it is very common to go back and forth between considering functions and the equivalence classes of functions. It is not common to denote the integral of an equivalence class, for example. It is easy to check that if $g \in [f]$, then $\int_X f\,d\mu = \int_X g\,d\mu$, so that it does not matter which member of an equivalence class one integrates.
Subtleties arise, however, of the following kind. Sometimes one speaks of some function $f \in L^1(\mathbb{R})$ being a continuous function. What this actually means is that there exists a continuous function which can be selected for use as a representative of that equivalence class. That is, there exists a continuous function $g \in [f]$. But other representatives of $[f]$ need not be continuous even though $g \in C(\mathbb{R}) \cap [f]$. Still, if $f$ is continuous, then it is common to speak of $f \in L^1(\mathbb{R})$ being continuous, even though there is at most one individual function $g \in [f]$ that is continuous. And strictly speaking, we ought not to write $f \in L^1(\mathbb{R})$ because it is really $[f] \in L^1(\mathbb{R})$. Nevertheless, everyone writes in the simple, loose way, bearing in mind that the notation $f \in L^1(\mathbb{R})$ is just a common abuse of notation for the sake of simplicity of expression.
EXERCISES
5.38 Let $f = 1_{\mathbb{Q} \cap [0,1]} \in L^1[0,1]$ with respect to Lebesgue measure. Prove that $f$ is both continuous and differentiable on its domain in the sense that there is a unique member of the equivalence class of $f$ with those properties.
5.39 Let $f = 1_{[0,1]} \in L^1(\mathbb{R})$. Prove that $f$ is neither continuous nor differentiable, in the sense that the equivalence class of $f$ contains no function with either of those two properties.
Definition 5.5.4 A set $S \subseteq L^1(X, \mathfrak{A}, \mu)$ is said to be dense in $L^1(X, \mathfrak{A}, \mu)$ provided that for each $f \in L^1(X, \mathfrak{A}, \mu)$, and for each $\epsilon > 0$, there exists $s \in S$ such that $\|s - f\|_1 < \epsilon$.
The following theorem is frequently useful for proving theorems about integrable functions.
Theorem 5.5.1 The space $\mathfrak{S}_0$ of special simple functions is dense in the space $L^1(X, \mathfrak{A}, \mu)$.
The reader should note how we have made the conventional abuse of language in the statement of this theorem: $\mathfrak{S}_0$ is not actually contained in $L^1(X, \mathfrak{A}, \mu)$, because the latter space is a set of equivalence classes of functions, not of functions themselves. Thus it is really not the set of special simple functions but the set of equivalence classes of those functions that is dense in $L^1$! However, we will abuse language in this way for simplicity of expression, and this should not cause difficulty if the reader remains alert to it.
Proof: Let $f \in L^1(X, \mathfrak{A}, \mu)$ and write $f = f^+ - f^-$, as in Definition 5.2.5. We know that there exists a function $\phi^+ \in \mathfrak{S}_0$ such that $0 \le \phi^+ \le f^+$, and also such that
We define $\phi^-$ similarly. Let $\phi = \phi^+ - \phi^-$, and apply the triangle inequality as follows:
In the case of the real line with Lebesgue measure, there is a very special subset of $\mathfrak{S}_0$ called the step functions.
Definition 5.5.5 Define a step function on an interval $[a, b] \subset \mathbb{R}$ as follows. We call $\sigma$ a step function if there exists a partition $a = x_0 < x_1 < \cdots < x_n = b$ of $[a, b]$ into finitely many contiguous closed intervals such that $\sigma(x) = c_i$, a constant, for all $x \in (x_{i-1}, x_i)$, $i = 1, \ldots, n$. Thus $\sigma$ is constant on each open interval $(x_{i-1}, x_i)$. The values of $\sigma$ at $x_0, x_1, \ldots, x_n$ are arbitrary. Let $S$ denote the family of all step functions on the real line.
The reader will show in Exercise 5.41 that the vector space $S$ of all step functions is dense in $L^1(\mathbb{R})$. The latter fact is very useful for proving many theorems in real analysis.
Definition 5.5.6 In a normed linear space $V$, a sequence of vectors $v_n$ is called a Cauchy sequence provided that for each $\epsilon > 0$, there exists an $N \in \mathbb{N}$ such that, for all $m$ and $n$ greater than or equal to $N$, we have $\|v_n - v_m\| < \epsilon$. A normed linear space is called complete provided that, for each Cauchy sequence in $V$, there exists $v \in V$ such that $v_n \to v$, which means that $\|v_n - v\| \to 0$. A complete normed real linear space is called a real Banach space, and a complete normed complex linear space is called a Banach space.
EXERCISE 5.40
Give an example of a Cauchy sequence of functions
such that there does not exist any point $x \in [0, 1]$ for which $f_n(x)$ converges. Prove that your example has the properties that are claimed.
Theorem 5.5.2 Let $(X, \mathfrak{A}, \mu)$ be a measure space. Then $L^1(X, \mathfrak{A}, \mu)$ is a complete normed linear space.$^{47}$
Proof: Let $f_n$ be a Cauchy sequence in $L^1(X, \mathfrak{A}, \mu)$. We must show there exists $f \in L^1(X, \mathfrak{A}, \mu)$ such that $\|f_n - f\|_1 \to 0$ as $n \to \infty$. The main difficulty in the proof is to find a suitable function $f$ in $L^1(X, \mathfrak{A}, \mu)$. The reason this is challenging is that a Cauchy sequence in $L^1(X, \mathfrak{A}, \mu)$ need not converge pointwise at any point $x \in X$, as is shown in Exercise 5.40. In order to remedy this difficulty, we will prove that if a sequence of $L^1$-functions is sufficiently rapidly Cauchy, then it must converge pointwise almost everywhere.
$^{47}$That is, $L^1(X, \mathfrak{A}, \mu)$ is a real Banach space. In Section 5.6 we will show how to deal with complex-valued $L^1$-functions, and then $L^1(X, \mathfrak{A}, \mu)$ will be a full-fledged Banach space in the complex sense.
Note that since $f_n$ is Cauchy, there exists an increasing sequence of natural numbers $n_k \in \mathbb{N}$ such that if $n$ and $m$ are greater than or equal to $n_k$, then
$$\|f_n - f_m\|_1 < \frac{1}{4^k}. \qquad (5.5)$$
In particular,
$$\|f_{n_k} - f_{n_{k+1}}\|_1 < \frac{1}{4^k}$$
for each $k \in \mathbb{N}$. The following lemma will be very helpful.
Lemma 5.5.1 (Pointwise Convergence Lemma) Let $(X, \mathfrak{A}, \mu)$ be a measure space, and suppose a sequence of functions $g_k \in L^1(X, \mathfrak{A}, \mu)$ satisfies the inequality
$$\|g_k - g_{k+1}\|_1 < \frac{1}{4^k}$$
for each $k \in \mathbb{N}$. Then there exists a function $g \in L^1(X, \mathfrak{A}, \mu)$ such that $g_k(x) \to g(x)$ for almost all values of $x$. Moreover, $g_k \to g$ in the sense of $L^1$-norm convergence, meaning that $\|g_k - g\|_1 \to 0$.
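Proof: We think of the sequence $g_k$ as being very rapidly Cauchy. Note that the set
$$A_k = \left\{x \;\middle|\; |g_k(x) - g_{k+1}(x)| \ge \frac{1}{2^k}\right\}$$
is measurable, since $|g_k - g_{k+1}|$ is $\mathfrak{A}$-measurable. Furthermore,
$$\frac{1}{2^k}\,\mu(A_k) \le \int_X |g_k - g_{k+1}|\,d\mu < \frac{1}{4^k},$$
so that $\mu(A_k) < \frac{1}{2^k}$ for each $k \in \mathbb{N}$. Next, we define
$$N = \limsup A_k = \bigcap_{p=1}^{\infty} \bigcup_{k=p}^{\infty} A_k,$$
and we note that $N$ is the set of all points $x$ that lie in infinitely many of the sets $A_k$. Furthermore
$$\mu(N) \le \mu\left(\bigcup_{k=p}^{\infty} A_k\right) \le \sum_{k=p}^{\infty} \mu(A_k) < \frac{1}{2^{p-1}} \to 0$$
as $p \to \infty$. It follows that $\mu(N) = 0$, so that $N$ is an $(\mathfrak{A}, \mu)$-null set. We will show that the sequence $g_k(x)$ converges for each $x \in X\setminus N$.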
If $x \notin N$, then there exists $p \in \mathbb{N}$ such that $k \ge p$ implies that
$$|g_k(x) - g_{k+1}(x)| < \frac{1}{2^k}.$$
Therefore, if $k$ and $l$ are greater than or equal to $p$, repeated application of the triangle inequality tells us that
$$|g_k(x) - g_l(x)| \le \sum_{j=\min(k,l)}^{\infty} \frac{1}{2^j} = \frac{1}{2^{\min(k,l)-1}},$$
so that $g_k(x)$ is a Cauchy sequence of real numbers. Thus the function given by
$$g(x) = \lim_{k\to\infty} g_k(x)$$
exists almost everywhere on $X$ and is measurable.$^{48}$ Moreover, we find that
$$\|g_k - g\|_1 \le \liminf_{l\to\infty} \|g_k - g_l\|_1 \le \sum_{j=k}^{\infty} \frac{1}{4^j} = \frac{4}{3 \cdot 4^k}$$
by Fatou's theorem.$^{49}$ Thus $g_k - g \in L^1(X)$, which implies that $g \in L^1(X)$ and that $g_k \to g$ in the $L^1$-norm. This proves the pointwise convergence lemma.
To finish the proof of Theorem 5.5.2, we substitute $f_{n_k}$ from Equation (5.5) for $g_k$ in the pointwise convergence lemma. We need to show the convergence in the $L^1$-norm of $f_n$ to $f$, playing the role of $g$ from the lemma. Note that if $n \ge n_k$, then we have $\|f_n - f_{n_k}\|_1 < \frac{1}{4^k}$. It follows that if $n \ge n_k$, then we have
$$\|f_n - f\|_1 \le \|f_n - f_{n_k}\|_1 + \|f_{n_k} - f\|_1 < \frac{2}{4^{k-1}} \to 0$$
as $k \to \infty$. Hence $\|f_n - f\|_1 \to 0$ as $n \to \infty$, and $L^1(X)$ is complete.
$^{48}$If $(X, \mathfrak{A}, \mu)$ happens to be a complete measure space, then it is clear that $g$ is measurable. If the measure space is not complete, we can define $g$ to be constant on the null set $N$ of nonconvergence pointwise, and again $g$ will be measurable. For the purposes of this proof, we need to find only one $L^1$-function that can serve as the limit for the original Cauchy sequence.
$^{49}$Fatou's theorem is useful here, since we do not have a dominating function that would be required for the Lebesgue Dominated Convergence theorem.
EXERCISES
5.41 Prove the following three handy lemmas.
a) On the real line $\mathbb{R}$, prove that the vector space $S$ of step functions is dense in $\mathfrak{S}_0$ with respect to the $L^1$-norm. Explain why this implies density of $S$ in $L^1(\mathbb{R})$ as well. (Hint: If $A$ is a measurable set, prove that the indicator function $1_A$ is the limit of a sequence of step functions $\sigma_n$. That is, $\|1_A - \sigma_n\|_1 \to 0$ as $n \to \infty$. You can use the result of Exercise 3.4.)
b) Prove that each function $f \in L^1(\mathbb{R})$ is equal almost everywhere to a Borel measurable function.
c) Let $g : \mathbb{R} \to \mathbb{R}$ and suppose that $g$ is equal to a Borel measurable function $\phi$ except on a set of Lebesgue measure zero. Prove that $g$ is Lebesgue measurable.
5.42 Let $f \in L^1(\mathbb{R})$ and define the Fourier sine transform $\hat{f}$ by
$$\hat{f}(\alpha) = \int_{\mathbb{R}} f(x) \sin \alpha x\,dl(x)$$
for each $\alpha \in \mathbb{R}$. Prove that $\hat{f}(\alpha) \to 0$ as $\alpha \to \infty$. This statement is known as the Riemann-Lebesgue lemma. (Hint: Treat first the special case in which $f$ is the indicator function of an interval.)
5.43 If $f \in L^1(\mathbb{R})$, define $f_t \in L^1(\mathbb{R})$ by
the translate of $f$ by $t$. For each $t \in \mathbb{R}$, define $\phi(t)$ to be the linear transformation
$$\phi(t) : L^1(\mathbb{R}) \to L^1(\mathbb{R}),$$
given by $\phi(t) : f \to f_t$ for each $t \in \mathbb{R}$ and for each $f \in L^1(\mathbb{R})$.
a) Prove that $\phi$ is a homomorphism from the additive group of real numbers into the group $\mathcal{L}(L^1(\mathbb{R}))$ of linear transformations of the vector space $L^1(\mathbb{R})$ into itself. The group $\mathcal{L}(L^1(\mathbb{R}))$ is equipped with the operation of composition.$^{50}$
b) Fix $t \in \mathbb{R}$ and show that
for each $f \in L^1(\mathbb{R})$. (For those who know what is meant by the norm of a linear transformation, this says that $\phi(t)$ is a bounded, hence continuous, linear self-transformation of $L^1(\mathbb{R})$.$^{51}$)
$^{50}$This part is a purely algebraic question concerning the group operations.
$^{51}$This part is only for those students who know how to put a norm on the vector space of continuous linear transformations of a normed linear space. See, for example, [20] and [21].
c) Prove that $\phi(t)f$ is continuous at the point $t = 0$. This means that for each fixed $f$ in $L^1(\mathbb{R})$, we have
$$\|\phi(t)f - \phi(0)f\|_1 \to 0$$
as $t \to 0$. Prove also continuity at each value of $t \in \mathbb{R}$. In words, this exercise says that the mapping $t \to \phi(t)f$ is a continuous mapping from $\mathbb{R} \to L^1(\mathbb{R})$.
d) Fix any $t > 0$, no matter how small. Show that there exists a function $f$ in $L^1(\mathbb{R})$ such that $\|f\|_1 = 1$ yet
$$\|\phi(t)f - f\|_1 = 2.$$
Thus $\|\phi(t) - \phi(0)\|$ fails to converge to 0 as $t \to 0$, using the concept of the norm of a linear transformation of a normed linear space.$^{52}$
5.44 Denote $[0, 1] \cap \mathbb{Q} = \{q_k \mid k \in \mathbb{N}\}$, the countable set of all rational numbers in $[0,1]$. Let $f_k : [0,1] \to \mathbb{R}^*$ be defined almost everywhere by
for each $k \in \mathbb{N}$. Note that the graph of $f_k$ has a vertical asymptote at $x = q_k$. (One may decide to define $f_k(q_k)$ arbitrarily, but this is not relevant to the problem.)
a) Show that $f_k \in L^1[0,1]$ and find an upper bound for $\|f_k\|_1$ that is independent of $k$.
b) Let
$$S_n = \sum_{k=1}^{n} f_k$$
for each $n \in \mathbb{N}$. Prove that $S_n$ is a Cauchy sequence in the $L^1$-norm, and thus that $S = \lim_{n\to\infty} S_n \in L^1[0,1]$.
c) Prove for each $x \in [0,1]$ that $S_n(x)$ either diverges to infinity or else converges, and that $S(x) < \infty$ for almost all $x$.
d) Show that $S_1$ is improperly Riemann integrable in the sense of elementary calculus. Show that $S_n$ may be considered improperly Riemann integrable in a plausible sense that should be explained. Show that $\lim_{x\to q} S(x)$ is infinite at each rational point $q \in [0,1]$. Can you find any reasonable way to describe $S$ as improperly Riemann integrable?$^{53}$
$^{52}$The point here is that continuity in analysis is a very delicate issue indeed. It is very much a matter of what is the mapping, what is the domain (and its norm or topology), and what is the range (and its norm or topology).
$^{53}$This example can be interpreted also as an instance of Lebesgue Dominated Convergence of the sequence of partial sums, regarding $S$ itself as the dominating function. Unlike many examples used for Lebesgue Dominated Convergence, in this one the dominating function is not even improperly Riemann integrable. This exercise shows that Lebesgue integration obviates the need for a concept of improper integration and goes very much farther than the concept of improper integration used for the Riemann integral.
5.45 Suppose $f_n \in L^1(X, \mathfrak{A}, \mu)$ and $\|f_n\|_1 \to 0$ as $n \to \infty$. Prove that $f_n \to 0$ in measure as well.
5.6 COMPLEX-VALUED FUNCTIONS
What we have learned thus far about $L^1(X, \mathfrak{A}, \mu)$ can be extended easily to complex-valued functions. (This should not be confused with complex analysis, in which the independent variable $x$ is replaced by a complex independent variable $z$.) In contexts in which it would not be obvious in what set the values of a function lie, it is common to write such symbols as $L^1(X, \mathbb{R})$ or $L^1(X, \mathbb{C})$ to indicate (with the last entry) what type of value the function has. In more advanced subjects, such as harmonic analysis on Lie groups and representation theory, one learns about $L^1(X, \mathcal{H})$, in which the functions take their values in a complex Hilbert space $\mathcal{H}$ rather than in the real or complex field.
If $f : X \to \mathbb{C}$, we write $f(x) = u(x) + iv(x)$, where the real part of $f(x)$ is denoted by
$$u(x) = \Re f(x)$$
and the imaginary part is denoted by
$$v(x) = \Im f(x),$$
both of which are real-valued. Note that the complex modulus
$$|f(x)| = \sqrt{u(x)^2 + v(x)^2}.$$
Definition 5.6.1 A complex-valued function $f : (X, \mathfrak{A}, \mu) \to \mathbb{C}$ is called measurable provided that both $\Re f$ and $\Im f$ are measurable. We say that $f \in L^1(X, \mathbb{C})$ provided that both $\Re f \in L^1(X, \mathbb{R})$ and $\Im f \in L^1(X, \mathbb{R})$. When the latter conditions are satisfied, we write that
$$\int_X f\,d\mu = \int_X \Re f\,d\mu + i\int_X \Im f\,d\mu,$$
and we define the $L^1$-norm of $f$ to be
$$\|f\|_1 = \int_X |f|\,d\mu.$$
EXERCISES
5.46 Let $(X, \mathfrak{A}, \mu)$ be a measure space.
a) Prove that $f \in L^1(X, \mathbb{C})$ if and only if both $\Re f$ and $\Im f$ are in $L^1(X, \mathbb{R})$.
b) Prove that $f \in L^1(X, \mathbb{C})$ if and only if $\|f\|_1 < \infty$, where $\|f\|_1 = \int_X |f|\,d\mu$. (Here $|f|$ denotes the modulus of $f$.)
c) Prove that if $f \in L^1(X, \mathbb{C})$, then
$$\left|\int_X f\,d\mu\right| \le \int_X |f|\,d\mu.$$
(Hint: Denote $\int_X f\,d\mu = Re^{i\theta}$ where $R = \left|\int_X f\,d\mu\right| \ge 0$ and $\theta \in \mathbb{R}$.)
d) Prove that $L^1(X, \mathbb{C})$ is a complete normed linear space.
5.47 The Fourier transform of a complex-valued function $f \in L^1(\mathbb{R}, \mathbb{C})$ with respect to Lebesgue measure is given by
for each $\alpha \in \mathbb{R}$.
a) Prove that $|\hat{f}(\alpha)| \le \|f\|_1$ for all $\alpha \in \mathbb{R}$.
b) Prove that $\hat{f} \in C(\mathbb{R})$, the space of continuous functions on $\mathbb{R}$.
c) Prove the Riemann-Lebesgue lemma: $\hat{f}(\alpha) \to 0$ as $|\alpha| \to \infty$. (Hint: See Exercise 5.42.)
5.48 Denote by $\ell^1$ the vector space of all complex sequences $x$ that are absolutely summable:
Call the sum in Equation (5.7) the $\ell^1$-norm of the sequence $x$, denoted by $\|x\|_1$. Prove that $\ell^1$ with the given norm is a complete normed vector space.
CHAPTER 6
PRODUCT MEASURES AND FUBINI’S THEOREM
All students of mathematics learn in elementary calculus to evaluate a double integral by iteration. The theorem justifying this process is called Fubini's theorem. However, the cleanest and simplest form of Fubini's theorem appears for the first time with the Lebesgue integral. We will see in this chapter that Fubini's theorem is instrumental in proving many other important theorems of real analysis.
6.1 PRODUCT MEASURES
We assume here that we are given two measure spaces, $(X, \mathfrak{A}, \lambda)$ and $(Y, \mathfrak{B}, \mu)$. The two measure spaces are permitted to be identical. We intend to construct the product measure on a suitable $\sigma$-field contained in the power set of the Cartesian product $Z = X \times Y$. By a rectangular set $R$ in $Z$ we mean any set of the form $R = A \times B$, where $A \in \mathfrak{A}$ and $B \in \mathfrak{B}$. We will take as the family of elementary sets for the product measure
$$\mathfrak{E} = \left\{ E = \dot{\bigcup}_{i=1}^{n} R_i \;\middle|\; R_i = A_i \times B_i,\ A_i \in \mathfrak{A},\ B_i \in \mathfrak{B} \right\} \qquad (6.1)$$
where the rectangles $R_i$ are mutually disjoint and $n$ is an arbitrary natural number.
EXERCISE
6.1 Show that the set $\mathfrak{E}$ of Equation (6.1) is a field of subsets of $Z = X \times Y$. (Be sure to check closure under complementation.)
Definition 6.1.1 Define the product measure
$$\nu(E) = \sum_{i=1}^{n} \lambda(A_i)\mu(B_i)$$
for each elementary set $E \in \mathfrak{E}$ as defined by Equation (6.1).
This definition requires justification, because the decomposition given in Equation (6.1) is not unique. Suppose
$$E = \dot{\bigcup}_{i=1}^{m} A_i \times B_i = \dot{\bigcup}_{j=1}^{n} C_j \times D_j.$$
It follows from the finite additivity of each of the measures $\lambda$ and $\mu$ that no rectangular set of infinite measure can be expressed as a union of finitely many rectangular sets of finite measure. Thus
We can use the integral of a special simple function to rephrase Definition 6.1.1 in a way that expresses concisely why that definition is independent of the decomposition in the case in which
$$\sum_{i=1}^{m} \lambda(A_i)\mu(B_i) < \infty.$$
Definition 6.1.2 If $S \subseteq X \times Y$, a Cartesian product space, we define the $x$-section of $S$ by
$${}_xS = \{y \mid (x, y) \in S,\ y \in Y\}$$
and the $y$-section by
$$S_y = \{x \mid (x, y) \in S,\ x \in X\}.$$
If $E \in \mathfrak{E}$, the field of elementary sets, then
Define the $x$-section function by
$$f_E(x) = \mu({}_xE).$$
PRODUCT MEASURES
105
m
If
X(A%)p(Bi) < co,then we see fromEquations (6.2) that f~ is a nonnegative i=l
special simple function on X . Moreover, (6.3) From Equation (6.3), it is clear that Definition 6.1.1 is independent of the decomposition of the elementary set into a disjoint union of rectangular sets. In fact, if we consider two such decompositions, then we will have two different expansions of the same simple cross-section function fE , so the result of the integral fE dX must be the same either way. Furthermore, if E and E' are any two mutually disjoint elementary sets in X x Y , then the x-section function
sx
fE"E' = fE
+ fE'.
This establishes that v ( EG E' ) = v ( E )
+ v (E')
~
since integration over X is a linear function of the integrand.
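The independence of the product measure from the chosen decomposition can be checked by hand on a small example. The following is a minimal sketch, assuming two finite measure spaces in which every point carries a weight; the point names, the weights, and all function names are hypothetical and chosen only for illustration. It computes the sum of $\lambda(A_i)\mu(B_i)$ for two different decompositions of the same elementary set, and compares both with the integral of the $x$-section function.

```python
from fractions import Fraction

# Hypothetical finite measure spaces: each point carries a weight.
lam = {"x1": Fraction(1, 2), "x2": Fraction(1, 3), "x3": Fraction(1, 6)}  # measure on X
mu = {"y1": Fraction(2), "y2": Fraction(5)}                               # measure on Y


def measure(weights, subset):
    """Measure of a finite set: the sum of the weights of its points."""
    return sum((weights[p] for p in subset), Fraction(0))


def nu_from_rectangles(rectangles):
    """Sum of lam(A_i) * mu(B_i) over a list of disjoint rectangles (A_i, B_i)."""
    return sum((measure(lam, A) * measure(mu, B) for A, B in rectangles), Fraction(0))


def points(rectangles):
    """The elementary set itself, as a set of pairs (x, y)."""
    return {(x, y) for A, B in rectangles for x in A for y in B}


def nu_from_sections(E):
    """Integrate the x-section function x -> mu(x-section of E) against lam."""
    total = Fraction(0)
    for x in lam:
        x_section = {y for (u, y) in E if u == x}
        total += measure(mu, x_section) * lam[x]
    return total


# Two different decompositions of the same L-shaped elementary set.
decomposition_1 = [({"x1", "x2"}, {"y1"}), ({"x1"}, {"y2"})]
decomposition_2 = [({"x1"}, {"y1", "y2"}), ({"x2"}, {"y1"})]

E = points(decomposition_1)
same_set = E == points(decomposition_2)
value_1 = nu_from_rectangles(decomposition_1)
value_2 = nu_from_rectangles(decomposition_2)
value_sections = nu_from_sections(E)
print(same_set, value_1, value_2, value_sections)
```

Exact rational arithmetic is used so that the three values can be compared for equality rather than up to rounding.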
EXERCISE
6.2 Let $E \in \mathfrak{E}$, the field of elementary sets in the product of two measure spaces, $(X, \mathfrak{A}, \lambda)$ and $(Y, \mathfrak{B}, \mu)$. Prove that the product measure $\nu = \lambda \times \mu$ is given on $\mathfrak{E}$ by
$$\nu(E) = \int_Y \lambda(E_y) \, d\mu.$$
Our next goal is to show that, under suitable hypotheses, $\nu$ can be extended to a countably additive measure on $\mathfrak{C} = \mathcal{B}(\mathfrak{E})$, the $\sigma$-field generated by $\mathfrak{E}$.

Theorem 6.1.1 Let $(X, \mathfrak{A}, \lambda)$ and $(Y, \mathfrak{B}, \mu)$ be any two measure spaces. There exists a countably additive measure $\nu$ defined on $\mathfrak{C} = \mathcal{B}(\mathfrak{E})$ such that $\nu(A \times B) = \lambda(A)\mu(B)$ for all $A \in \mathfrak{A}$ and $B \in \mathfrak{B}$. Moreover, $\nu$ is unique, provided that $\lambda$ and $\mu$ are both $\sigma$-finite.

Proof: Uniqueness will follow at the end from Exercise 2.21, since $\nu$ will be $\sigma$-finite if both $\lambda$ and $\mu$ are $\sigma$-finite. Existence depends upon proving countable additivity within $\mathfrak{E}$. Each elementary set is a disjoint union of finitely many rectangular sets of the form
$$E = \bigcup_{i=1}^{n} A_i \times B_i,$$
with each $A_i$ and each $B_i$ measurable. To prove the countable additivity within $\mathfrak{E}$, it will suffice to prove for each rectangular set
$$R = A \times B = \bigcup_{i=1}^{\infty} A_i \times B_i \quad\text{(a disjoint union of rectangular sets)}$$
that
$$\nu(R) = \sum_{i=1}^{\infty} \lambda(A_i)\mu(B_i).$$
To this end, we define
$$f_n(x) = \sum_{i=1}^{n} \mu(B_i)\,1_{A_i}(x),$$
so that $f_n$ is a nonnegative simple function. Note that although the sets $B_i$ need not be disjoint, we do have
$${}_xR = \bigcup_{\{i \,:\, x \in A_i\}} B_i,$$
which is a disjoint union. Since $\mu$ is countably additive, it follows that the sequence $f_n$ increases monotonically toward the limit $f_R$, where
$$f_R(x) = \mu({}_xR) = \mu(B)\,1_A(x).$$
By the Monotone Convergence theorem (5.4.1) we see that
$$\nu(R) = \lambda(A)\mu(B) = \int_X f_R \, d\lambda = \lim_{n\to\infty} \int_X f_n \, d\lambda = \lim_{n\to\infty} \sum_{i=1}^{n} \lambda(A_i)\mu(B_i),$$
and this is true even if $\nu(R) = \infty$. ∎

Note that in the preceding proof we have applied the Monotone Convergence theorem only to nonnegative measurable functions defined on $X$, on which there is already a countably additive measure $\lambda$. (We are not applying Monotone Convergence to functions defined on the product space, for which we are in the process of establishing the existence of a countably additive measure.) We remark that we did not need to use the hypothesis of completeness that appeared in the Monotone Convergence theorem, because in the preceding proof $f_n$ converges everywhere to $f_R$, not merely almost everywhere.
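Definition 6.1.3 Let $(X, \mathfrak{A}, \lambda)$ and $(Y, \mathfrak{B}, \mu)$ be any two measure spaces. We will denote the Cartesian product
$$\mathfrak{A} \times \mathfrak{B} = \{A \times B \mid A \in \mathfrak{A},\ B \in \mathfrak{B}\},$$
and the field that it generates is called $\mathfrak{E}$. But we will denote by
$$\mathfrak{A} \otimes \mathfrak{B} = \mathcal{B}(\mathfrak{A} \times \mathfrak{B})$$
the Borel field generated by the field $\mathfrak{E}$. Finally, we will denote by
$$\mathfrak{A} \,\overline{\otimes}\, \mathfrak{B}$$
the completion of $\mathfrak{A} \otimes \mathfrak{B}$ with respect to the product measure $\lambda \times \mu$.

It is reasonable to wonder whether it is necessary to form the completion $\mathfrak{A} \,\overline{\otimes}\, \mathfrak{B}$ if both $\mathfrak{A}$ and $\mathfrak{B}$ happen to be complete families of measurable sets for their respective measures. The answer, that the completion does need to be formed, is confirmed by the following special case. Let $X = Y = \mathbb{R}$ and $Z = X \times Y = \mathbb{R}^2$, the Euclidean plane. Here we will assume that $\lambda = \mu = l$, Lebesgue measure on the line. And $\mathfrak{A} = \mathfrak{B} = \mathfrak{L}(\mathbb{R})$ will be the $\sigma$-field of all Lebesgue measurable sets in the line. Recall that we have constructed earlier the family $\mathfrak{B}(\mathbb{R}^n)$ of Borel sets in $\mathbb{R}^n$ and the family $\mathfrak{L}(\mathbb{R}^n)$ of Lebesgue measurable sets in $\mathbb{R}^n$ for each $n \in \mathbb{N}$.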
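Theorem 6.1.2 With the notations of Definition 6.1.3, we have
i. $\mathfrak{B}(\mathbb{R}^2) \subsetneq \mathfrak{L}(\mathbb{R}) \otimes \mathfrak{L}(\mathbb{R})$,^54
ii. $\mathfrak{L}(\mathbb{R}) \otimes \mathfrak{L}(\mathbb{R}) \subsetneq \mathfrak{L}(\mathbb{R}) \,\overline{\otimes}\, \mathfrak{L}(\mathbb{R})$,
iii. $\mathfrak{L}(\mathbb{R}) \,\overline{\otimes}\, \mathfrak{L}(\mathbb{R}) = \mathfrak{L}(\mathbb{R}^2)$.
Moreover, if we denote by $l_2$ the Lebesgue measure defined on $\mathbb{R}^2$ as in Definition 3.5.1, and by $l$ Lebesgue measure on $\mathbb{R}$, then $l \times l = l_2$.

Proof: We will justify the claims numbered (i), (ii), and (iii) in order.

i. Each Cartesian product of two intervals lies in $\mathfrak{L}(\mathbb{R}) \times \mathfrak{L}(\mathbb{R})$. Thus the $\sigma$-field $\mathfrak{B}(\mathbb{R}^2)$ that they generate is contained in $\mathfrak{L}(\mathbb{R}) \otimes \mathfrak{L}(\mathbb{R})$. Observe that the family
$$\mathfrak{S} = \{S \subseteq \mathbb{R}^2 \mid S_y \in \mathfrak{B}(\mathbb{R})\ \forall y \in \mathbb{R}\}$$
is a $\sigma$-field in the plane containing the elementary sets of the plane, as defined in Section 3.5. Thus it contains all the Borel sets of the plane. On the other hand, by Remark 3.3.1, there exists a set
$$E \in \mathfrak{L}(\mathbb{R}) \setminus \mathfrak{B}(\mathbb{R}),$$
so that $(E \times \mathbb{R})_y = E \notin \mathfrak{B}(\mathbb{R})$ for each $y$. Hence there is a set $E \times \mathbb{R}$ that is in $\mathfrak{L}(\mathbb{R}) \otimes \mathfrak{L}(\mathbb{R})$ but not in $\mathfrak{B}(\mathbb{R}^2)$. This proves the first proper containment.

ii. For the second proper containment, let $M$ be any nonmeasurable subset of $\mathbb{R}$. Then the measure $(l \times l)(\{x\} \times M) = 0$ for each singleton set $\{x\}$ in $\mathbb{R}$, being a subset of a null set. Hence
$$\{x\} \times M \in \mathfrak{L}(\mathbb{R}) \,\overline{\otimes}\, \mathfrak{L}(\mathbb{R}).$$
Thus it would suffice to show that $\{x\} \times M \notin \mathfrak{L}(\mathbb{R}) \otimes \mathfrak{L}(\mathbb{R})$. Consider the class $\mathfrak{S}$ of all sets $E \subseteq \mathbb{R}^2$ for which ${}_xE \in \mathfrak{L}(\mathbb{R})$ for a fixed $x \in \mathbb{R}$. Then it is easy to check that $\mathfrak{S}$ is a monotone class, which implies that $\mathfrak{S}$ contains the $\sigma$-field $\mathfrak{L}(\mathbb{R}) \otimes \mathfrak{L}(\mathbb{R})$. But since the set $\{x\} \times M$ lacks this property, it follows that $\{x\} \times M \notin \mathfrak{L}(\mathbb{R}) \otimes \mathfrak{L}(\mathbb{R})$.

iii. The space $\mathfrak{L}(\mathbb{R}^2)$ contains both $\mathfrak{L}(\mathbb{R}) \otimes \mathfrak{L}(\mathbb{R})$^55 and $\mathfrak{B}(\mathbb{R}^2)$. Since $\mathfrak{L}(\mathbb{R}^2)$ is the (minimal) completion of $\mathfrak{B}(\mathbb{R}^2)$, it must be also the unique minimal completion of $\mathfrak{L}(\mathbb{R}) \otimes \mathfrak{L}(\mathbb{R})$, as is $\mathfrak{L}(\mathbb{R}) \,\overline{\otimes}\, \mathfrak{L}(\mathbb{R})$.

Finally, it is clear that $(l \times l)(R) = l_2(R)$ for each rectangle
$$R = [a, b] \times [c, d].$$
Thus the two measures agree as well on the Borel sets and on the Lebesgue measurable sets in the plane $\mathbb{R}^2$. As we have seen above, the family of Lebesgue measurable sets on $\mathbb{R}^2$ coincides with the family of Lebesgue measurable sets in $\mathbb{R} \times \mathbb{R}$. ∎

54 The strict containment (i) could be proven in another way, by using a cardinality argument. The transfinite cardinal number of the right-hand side is greater than that of the left.
55 It is not hard to see that because $\mathfrak{L}(\mathbb{R}^2)$ contains the products of intervals, and because it is a $\sigma$-algebra, it must contain also all products of the form $B \times \mathbb{R}$ or $\mathbb{R} \times B$, where $B$ is a Borel set in the line. Thus it contains also the products of Borel sets. However, each Lebesgue measurable set is sandwiched between two Borel sets of the same measure. This, together with the completeness of $\mathfrak{L}(\mathbb{R}^2)$, justifies the claim.

6.2 FUBINI'S THEOREM

Theorem 6.2.1 (Fubini's Theorem, First Form) Let $(X, \mathfrak{A}, \mu)$ and $(Y, \mathfrak{B}, \nu)$ be complete $\sigma$-finite measure spaces. Let $\mathfrak{C} = \mathfrak{A} \,\overline{\otimes}\, \mathfrak{B}$. Then for each $\mu \times \nu$-measurable set $C \in \mathfrak{C}$, the section ${}_xC$ is measurable for almost all $x$, the function $f_C(x) = \nu({}_xC)$ is $\mathfrak{A}$-measurable, and
$$(\mu \times \nu)(C) = \int_X \nu({}_xC) \, d\mu(x). \tag{6.4}$$

Proof: Note that $(X \times Y, \mathfrak{A} \,\overline{\otimes}\, \mathfrak{B}, \mu \times \nu)$ must be $\sigma$-finite as well, so that
$$X \times Y = \bigcup_{n \in \mathbb{N}} X_n \times Y_n,$$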
where $\mu(X_n)\nu(Y_n) < \infty$ and $X_n \times Y_n$ is an ascending chain of sets that are rectangular and thus elementary and Borel as well. We remark that the first form of Fubini's theorem expresses the product measure $\mu \times \nu$ of a set $C \in \mathfrak{C}$ as the integral with respect to $\mu$ of the $\nu$-measures of the $x$-sections of $C$. This includes the possibility of both sides of Equation (6.4) being infinite.

Observe that the theorem follows easily from the definition of the product measure on the field $\mathfrak{E}$ of elementary sets in the special case in which $C$ is an elementary set in the product space. For the latter sets, each section ${}_xC$ is measurable as well, being a union of finitely many measurable subsets of $Y$.

Next, we wish to prove the theorem for the case in which $C$ is a Borel set, meaning that $C \in \mathfrak{A} \otimes \mathfrak{B}$. By Theorem 2.1.2, we see that Fubini's theorem would be true for all Borel sets $C$ provided that the family $\mathfrak{F}$ of sets for which the theorem is true is a monotone class.

1. We will prove first that $\mathfrak{F}$ is closed under the operation of forming the union of an increasing chain of sets $C_n \in \mathfrak{F}$. This conclusion will follow from the Monotone Convergence theorem, together with the fact that the pointwise limit almost everywhere of measurable functions must be measurable in a complete measure space. Let $C = \bigcup_n C_n$, the union of an ascending chain of sets in $\mathfrak{F}$. We let $f_n(x) = \nu({}_xC_n)$, which is a monotone increasing sequence of measurable functions defined almost everywhere. We observe that $f_n \to f_C$, where $f_C(x) = \nu({}_xC)$, because ${}_xC$ is the union of the increasing chain ${}_xC_n$ of measurable sets and because $\nu$ is countably additive. Hence
$$\int_X f_C \, d\mu = \lim_{n\to\infty} \int_X f_n \, d\mu = \lim_{n\to\infty} (\mu \times \nu)(C_n) = (\mu \times \nu)(C),$$
with the final equality following from the countable additivity of the product measure $\mu \times \nu$.

2. For a decreasing nest $C_n$, we limit ourselves first to a typical subspace,
$$S_N = X_N \times Y_N,$$
of finite product measure. Thus we assume at first that each $C_n \subseteq S_N$, which has finite measure. Define $f_n(x) = \nu({}_xC_n)$ as before, and observe that $f_1$ is an integrable function^56 dominating the decreasing sequence $f_n$, and $f_n(x) \to f_C(x) = \nu({}_xC)$ for almost all $x$. Then the Lebesgue Dominated Convergence theorem implies that
$$\int_X f_C \, d\mu = \lim_{n\to\infty} \int_X f_n \, d\mu = \lim_{n\to\infty} (\mu \times \nu)(C_n) = (\mu \times \nu)(C).$$

56 The function $f_n$ is integrable because it is a bounded, measurable function on a space $X_N$ of finite measure. Note that $f_n$ is bounded because $\nu(Y_N) < \infty$.

This shows that for each $N \in \mathbb{N}$ the family $\mathfrak{F} \cap \mathfrak{P}(S_N)$ is a monotone class within the power set of $S_N$, and thus also a $\sigma$-field. Hence $\mathfrak{F}$ contains all the Borel sets in $S_N$. However, the Borel sets of $S = X \times Y$ will be the unions of their own intersections with each of the Borel sets $S_N$. Hence $\mathfrak{F} \supseteq \mathfrak{A} \otimes \mathfrak{B}$ by part 1, in which we showed that $\mathfrak{F}$ is closed under forming unions of ascending chains. Moreover, every section ${}_xC$ of a Borel set must be measurable, as shown in the proof of part (ii) of Theorem 6.1.2.

To complete the proof of Theorem 6.2.1 for all measurable sets $C \in \mathfrak{A} \,\overline{\otimes}\, \mathfrak{B}$, recall that each measurable set differs from a Borel set by a null set. Thus it would suffice to prove that Fubini's theorem applies to all $C$ that are null sets with respect to the measure $\mu \times \nu$. If $(\mu \times \nu)(C) = 0$, then there exists a Borel set $D \supseteq C$ such that $(\mu \times \nu)(D) = 0$ also. By the previous part of this proof, it follows that
$$(\mu \times \nu)(D) = \int_X \nu({}_xD) \, d\mu = 0.$$
Hence $\nu({}_xD) = 0$ for $\mu$-almost all $x$, and thus ${}_xC \subseteq {}_xD$ is both a measurable set and a $\nu$-null set for $\mu$-almost all $x$.^57 It follows that $(\mu \times \nu)(C) = \int_X \nu({}_xC) \, d\mu$, and the proof is complete. ∎
57 One should note here that it is not necessary for each cross section of a null set in the product measure to be measurable. For example, if $M$ is nonmeasurable in $Y$ and if $N$ is a null set in $X$, then $N \times M$ is a null set in $X \times Y$. Recall that every set of positive measure contains a nonmeasurable set by Theorem 3.4.4.

Theorem 6.2.2 (Fubini's Theorem, Main Form) Let $(X, \mathfrak{A}, \mu)$ and $(Y, \mathfrak{B}, \nu)$ be two complete $\sigma$-finite measure spaces. Suppose that
$$f \in L^1(X \times Y, \mathfrak{A} \,\overline{\otimes}\, \mathfrak{B}, \mu \times \nu).$$
Then
i. For almost all $x \in X$, the function $f(x, \cdot) : y \to f(x, y)$ is integrable on $Y$.
ii. For almost all $y \in Y$, the function $f(\cdot, y) : x \to f(x, y)$ is integrable on $X$.
iii. The function $\int_Y f(x, \cdot) \, d\nu$ is integrable on $X$.
iv. The function $\int_X f(\cdot, y) \, d\mu$ is integrable on $Y$.
v. We have the following equality of double integrals, with finite values:
$$\int_X \int_Y f(x, y) \, d\nu \, d\mu = \int_{X \times Y} f \, d(\mu \times \nu) = \int_Y \int_X f(x, y) \, d\mu \, d\nu. \tag{6.5}$$
Proof: Because the roles of the two variables are symmetrical, it will suffice to prove (i), (iii), and the first equality in (v). If the conclusions are true for two functions, then they are true also for the difference of the two functions. Hence it suffices to prove the statements listed for nonnegative functions, because we can write $f = f^+ - f^-$. It follows easily from Theorem 6.2.1 that the claims are true if $f$ is the indicator function of a measurable set of finite measure. Taking finite linear combinations of such indicator functions, we see that the theorem is true if $f$ is a special simple function. By Exercise 5.31, we know that if $f$ is measurable and nonnegative, then $f$ is the pointwise limit of a monotone increasing sequence of special simple functions. Thus
$$f = \lim_{n} \phi_n. \tag{6.6}$$
The function $f(x, \cdot)$ is a measurable nonnegative function of $y$ for almost all $x$, being the pointwise limit of a sequence of functions $\phi_n(x, \cdot)$ that are measurable and integrable for almost all $x$. There is a different null set $S_n$ for each $n$, of values of $x$ for which $\phi_n(x, \cdot)$ is not measurable, but the union $\bigcup_{n \in \mathbb{N}} S_n$ of countably many null sets is a null set. It follows that
$$\int_Y f(x, \cdot) \, d\nu = \lim_{n} \int_Y \phi_n(x, \cdot) \, d\nu$$
by the Monotone Convergence theorem for almost all $x$. Thus the integral is a measurable function of $x$, and it follows again from monotone convergence that
$$\int_X \Bigl( \int_Y f(x, \cdot) \, d\nu \Bigr) d\mu = \lim_{n} \int_X \Bigl( \int_Y \phi_n(x, \cdot) \, d\nu \Bigr) d\mu = \lim_{n} \int_{X \times Y} \phi_n \, d(\mu \times \nu) = \int_{X \times Y} f \, d(\mu \times \nu).$$
Since we know now that
$$\int_X \Bigl( \int_Y f(x, \cdot) \, d\nu \Bigr) d\mu = \int_{X \times Y} f \, d(\mu \times \nu) < \infty,$$
it follows that the inner integral of the iteration must be finite almost everywhere.

Finally, Fubini's theorem follows from the fact that $f^+$ and $f^-$ are both dominated by the integrable function $|f|$ on $X \times Y$, which enables us to subtract one integral from the other since they are both finite. ∎
The following corollary to the Fubini theorem is known as Tonelli's theorem. It is a useful variation of the Fubini theorem, in which the function $f$ is given as nonnegative, but only measurable on the product space, not necessarily integrable. The conclusions under this modified hypothesis look similar, but they are subtly altered.
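Corollary 6.2.1 (Tonelli's Theorem) Let $(X, \mathfrak{A}, \mu)$ and $(Y, \mathfrak{B}, \nu)$ be two complete $\sigma$-finite measure spaces. Suppose that $f$ is a nonnegative measurable function on the product space $(X \times Y, \mathfrak{A} \,\overline{\otimes}\, \mathfrak{B}, \mu \times \nu)$. Then
i. For almost all $x \in X$, the function $f(x, \cdot) : y \to f(x, y)$ is measurable on $Y$.
ii. For almost all $y \in Y$, the function $f(\cdot, y) : x \to f(x, y)$ is measurable on $X$.
iii. The function $\int_Y f(x, \cdot) \, d\nu$ is measurable on $X$.
iv. The function $\int_X f(\cdot, y) \, d\mu$ is measurable on $Y$.
v. Whether the integrals are finite or infinite, we have
$$\int_X \int_Y f(x, y) \, d\nu \, d\mu = \int_{X \times Y} f \, d(\mu \times \nu) = \int_Y \int_X f(x, y) \, d\mu \, d\nu.$$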
Remark 6.2.1 The reader should note that in Tonelli's theorem we do not claim that the inner integrals of the iterations (in either order) are finite almost everywhere. That conclusion would be valid provided that $f$ is integrable on the product space, as stated in Fubini's theorem.

Proof: These hypotheses are sufficient for the validity of Equation (6.6). The remainder of the proof of Fubini's theorem is based on the Monotone Convergence theorem, which does not require integrability. ∎
Remark 6.2.2 Tonelli's theorem has an important practical consequence. In order to use the main form of Fubini's theorem, we need a way to confirm whether or not the measurable function $f$ is integrable. Since $|f|$ is nonnegative, we can calculate whether or not
$$\int_{X \times Y} |f| \, d(\mu \times \nu) < \infty$$
by calculating the iterated integral in either order, according to convenience. Thus, if either
$$\int_X \int_Y |f| \, d\nu \, d\mu < \infty$$
or
$$\int_Y \int_X |f| \, d\mu \, d\nu < \infty,$$
then $f$ is an integrable function on the product space, and the full strength of the Fubini theorem can be applied to $f$. If one of the two orders of iteration yields a finite result, this must be true of the other order and of the integral over the product space because of Corollary 6.2.1.

Fubini's theorem is one of the most powerful tools in real analysis. The reason is that the interchange of order of iteration of a double integral is an interchange of order of two limit operations of the most delicate kind, namely, Lebesgue integration. One of the most important uses of Fubini's theorem is to prove that the inner integral of one of the two iterated integrals exists and is finite almost everywhere by establishing the finiteness of the other order of iteration when the integrand $f$ is replaced by $|f|$. See especially Exercise 6.7(b). Several important applications are contained in the following exercises.
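The need to test $|f|$ before interchanging the order of iteration can be seen already for double series, which are iterated integrals with respect to counting measure. The following is a minimal sketch using a hypothetical double sequence chosen for illustration (it is not taken from the exercises): $a_{ij} = 1$ if $j = i$, $a_{ij} = -1$ if $j = i + 1$, and $a_{ij} = 0$ otherwise. Each row and each column has at most two nonzero terms, so every inner sum below is a complete sum, not a truncation.

```python
def a(i, j):
    """Hypothetical double sequence: +1 on the diagonal, -1 just to its right."""
    if j == i:
        return 1
    if j == i + 1:
        return -1
    return 0


def rows_then_columns(n_rows):
    """Sum over j first (row i is supported on j = i, i + 1), then over i <= n_rows."""
    return sum(sum(a(i, j) for j in range(1, i + 2)) for i in range(1, n_rows + 1))


def columns_then_rows(n_cols):
    """Sum over i first (column j is supported on i = j - 1, j), then over j <= n_cols."""
    return sum(sum(a(i, j) for i in range(1, j + 1)) for j in range(1, n_cols + 1))


def absolute_partial_sum(n):
    """Sum of |a_ij| over the first n rows; it grows without bound as n grows."""
    return sum(abs(a(i, j)) for i in range(1, n + 1) for j in range(1, i + 2))


for n in (5, 50, 500):
    print(n, rows_then_columns(n), columns_then_rows(n), absolute_partial_sum(n))
```

Every row sums to zero, while the first column sums to one and all other columns sum to zero, so the two iterated sums differ. This does not conflict with Fubini's theorem, because the sum of $|a_{ij}|$ is infinite, which is exactly what the test in Remark 6.2.2 would detect.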
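EXERCISES
6.3 Let $a_{ij} \in \mathbb{R}$ for all $i$ and $j$ in $\mathbb{N}$. Suppose that at least one of the following three sums is finite:
$$\sum_{(i,j) \in \mathbb{N} \times \mathbb{N}} |a_{ij}|, \qquad \sum_{i} \sum_{j} |a_{ij}|, \qquad \sum_{j} \sum_{i} |a_{ij}|.$$
Use Fubini's theorem to prove that
$$\sum_{(i,j) \in \mathbb{N} \times \mathbb{N}} a_{ij} = \sum_{i} \sum_{j} a_{ij} = \sum_{j} \sum_{i} a_{ij},$$
with all three sums being finite.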
6.4 Suppose that $f$ lies in $L^1(\mathbb{R}^2)$.
a) Prove that $F_n(x) = \int_{\mathbb{R}} f(x, y + n) \, dl(y)$ exists for almost all values of $x \in \mathbb{R}$.
b) Prove that $F_n \in L^1(\mathbb{R})$. Determine whether or not the sequence $F_n$ has a limit in $L^1(\mathbb{R})$.
6.5 Suppose $p : \mathbb{R}^n \to \mathbb{R}$ is a polynomial in $n$ real variables. Prove that the set $p^{-1}(0)$ is a Lebesgue null set in $\mathbb{R}^n$, unless $p$ is the identically zero polynomial. (Hint: If $n = 1$ this is simple. For the inductive step, use Fubini's theorem.)
6.6
a) Suppose $h : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ is a measurable function such that $h^{-1}(N)$ is a Lebesgue null set for each null set $N$. If $f : \mathbb{R}^n \to \mathbb{R}$ is measurable, prove that $f \circ h$ is measurable.^58
b) Show that if $k(x, y) = x - y$, mapping $\mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$, then $k^{-1}$ maps null sets to null sets. (Hint: It is not automatic that $k^{-1}$ of a null set is measurable. Prove first that $k^{-1}$ of a Borel null set is both measurable and null. Use Theorem 6.2.1. Then show that $k^{-1}$ maps null sets to measurable null sets.)
c) Show that if $f : \mathbb{R}^n \to \mathbb{R}$ is a Lebesgue measurable function, then $H(x, y) = f(x - y)$ is Lebesgue measurable on $\mathbb{R}^n \times \mathbb{R}^n$.

58 Compare this exercise with Exercise 7.14.

6.7 Suppose both $f$ and $g$ are $L^1$ functions on $\mathbb{R}^n$. In the following problems, you may use the translation invariance of both Lebesgue measure (Exercises 3.6 and 3.24) and the Lebesgue integral on $\mathbb{R}^n$ (Exercise 5.7), as well as Exercises 6.6 and 5.41.
a) Show that $h(x, y) = f(x - y)g(y)$ is an $L^1$ function on $\mathbb{R}^{2n}$. Be sure to explain why $h$ is measurable. You can use the result of Exercise 6.6.^59
b) Show that the convolution, denoted and defined by
$$f * g(x) = \int_{\mathbb{R}^n} f(x - y)g(y) \, dl(y),$$
is defined almost everywhere in $x$.
c) Show that $f * g$ is an integrable function on $\mathbb{R}^n$.
d) Show that $\|f * g\|_1 \le \|f\|_1 \|g\|_1$.
e) Show that $f * g = g * f$. (See Exercise 5.8.)
f) Show that $\widehat{(f * g)}(\alpha) = \hat{f}(\alpha)\hat{g}(\alpha)$. (See Exercise 5.47.)
6.8 Let $g \in L^1(\mathbb{R}^n, \mathfrak{L}, l)$, and define the mapping
$$T : L^1(\mathbb{R}^n, \mathfrak{L}, l) \to L^1(\mathbb{R}^n, \mathfrak{L}, l)$$
by $T(f) = f * g$. Prove that if $f_n \to f$ in the $L^1$-norm, then $T(f_n) \to T(f)$ in the $L^1$-norm. That is, prove that $T$ is a continuous mapping.
59 The continuous image, or even the homeomorphic image, of a measurable set need not be measurable. See Exercise 7.13.

6.9 Let $f$ and $g$ be in $L^1(\mathbb{R})$, and suppose also that $g$ is essentially bounded: for some $M \in \mathbb{R}$ we have $|g(x)| \le M$ for almost all $x \in \mathbb{R}$.
a) Prove that $f * g(x)$ is a continuous real-valued function defined for all $x \in \mathbb{R}$. That is, show that
$$|f * g(x) - f * g(x_0)| \to 0$$
as $x \to x_0$.
b) Let
1
for all $x$. Show that $f \in L^1(\mathbb{R})$ but $f * f$ is not continuous at $x = 0$.
6.10 Suppose $A \in \mathfrak{L}$, the family of Lebesgue measurable subsets of the real line, and $B = -A$. Suppose $0 < l(A) < \infty$. Let $f = 1_A$ and let $g = 1_B$.
a) Prove that $g * f(0) > 0$. (Hint: See Exercise 5.43.)
b) Use part (a) to prove Steinhaus's theorem: There exists an interval
$$(-\delta, \delta) \subseteq A - A = \{x - z \mid x \in A,\ z \in A\}.$$
(Hint: Compare with Exercise 3.21, which called for a different proof.)
6.11 Suppose $A$ and $B$ are measurable subsets of the real line, and suppose that $l(A)l(B) > 0$.
a) Use the convolution $1_{-A} * 1_B$, and Fubini's theorem, to prove that there is a measurable set $C$ of positive measure such that $l((A + x) \cap B) > 0$ for all $x \in C$.
b) Prove that $(B - A) \cap \mathbb{Q} \ne \emptyset$. That is, prove that the set of differences between elements of $A$ and elements of $B$ must include a rational number.
6.12 Let $f : X \to \mathbb{R}$ be a measurable function on the complete finite measure space $(X, \mathfrak{A}, \mu)$. Suppose $g(x, y) = f(x) - f(y)$ is integrable on $X \times X$. Show that $f$ is integrable on $X$ and calculate the numerical value of $\int_{X \times X} g \, d(\mu \times \mu)$.
6.13 Suppose $(X, \mathfrak{A}, \mu)$ and $(Y, \mathfrak{B}, \nu)$ are both $\sigma$-finite complete measure spaces. Suppose $f \in L^1(X)$ and $g \in L^1(Y)$. Define $h(x, y) = f(x)g(y)$, and prove that $h \in L^1(X \times Y, \mathfrak{A} \,\overline{\otimes}\, \mathfrak{B}, \mu \times \nu)$.
6.14 We investigate what is called the essential uniqueness of translation-invariant measures.
a) Let $(\mathbb{R}^n, \mathfrak{L}, l)$ be the standard Euclidean measure space with Lebesgue measure $l$ defined on the $\sigma$-field of Lebesgue measurable sets. By Exercise 3.24 we know that $l$ is translation-invariant. Suppose that $\mu$ is any other $\sigma$-finite measure defined and translation-invariant on $\mathfrak{L}$. Use Fubini's theorem to prove that $\mu = cl$ for some constant $c$. This is called the essential uniqueness of translation-invariant measure. (Hint: To prove this with Fubini's theorem, let $E \in \mathfrak{L}$ be any set of finite measure, let $Q$ be the unit cube, and write
$$\mu(E) = \int_{\mathbb{R}^n} 1_Q(y) \, dl(y) \int_{\mathbb{R}^n} 1_E(x) \, d\mu(x).$$
Then write this as a double integral over the product space $\mathbb{R}^{2n}$, and play with the translation invariance of both measures. This proof is modeled on the proof of a more general case published by Shizuo Kakutani [15].)
b) Let $\nu(E)$ be the number of elements of $E$, meaning that $\nu(E)$ is either finite or simply $\infty$, for each $E \in \mathfrak{L}$. That is, $\nu$ is $[0, \infty]$-valued. Is $\nu$ translation-invariant? Is $\nu$ a constant multiple of $l$? Do we have a counterexample to the essential uniqueness of translation-invariant measure on $\mathbb{R}^n$?
6.15 Let $f$ be a real-valued function on $\mathbb{R} \times \mathbb{R}$. Suppose that $f(\cdot, y)$ is continuous in the first variable for each fixed $y$ and that $f(x, \cdot)$ is Lebesgue measurable in the second variable for each fixed $x$. Prove that $f$ is measurable as a function on the plane. (Hint: Express $f$ as a pointwise limit of measurable functions on the plane.)
6.16 Suppose that $(X, \mathfrak{A}, \mu)$ is a complete $\sigma$-finite measure space, and let $f$ be a real-valued integrable function in $L^1(X, \mathfrak{A}, \mu)$. Let $l$ denote Lebesgue measure on the real line. Apply Fubini's theorem to the space $X \times \mathbb{R}$ to prove that
The use of a powerful tool such as Fubini's theorem can produce serious errors if the tool is applied in cases that do not satisfy the hypotheses of the theorem. Here are some examples.
6.17 Let $X = Y = \mathbb{N}$, the set of all natural numbers, and let $\mathfrak{A} = \mathfrak{P}(X)$, the power set of the set of natural numbers. Let $\mu = \nu$ be the ordinary counting measure on $\mathfrak{A}$, as defined in Exercises 2.15 and 5.9.
a) Show that $\mu \times \nu$ is counting measure on the power set of $X \times Y$ and that $\mu \times \nu$ is $\sigma$-finite.
b) Define
$$f(x, y) = \begin{cases} 2 - 2^{-x} & \text{if } x = y, \\ -2 + 2^{-x} & \text{if } x = y + 1, \\ 0 & \text{if } x \notin \{y, y + 1\}. \end{cases}$$
Show that
$$\int_X \int_Y f \, d\nu \, d\mu \ne \int_Y \int_X f \, d\mu \, d\nu$$
and explain why this does not violate Fubini's theorem.
6.18 Give an alternative solution for Exercise 5.29 by interpreting that exercise in terms of Fubini's theorem applied to a product in which one of the factors is the counting measure.
6.19
For $x \in \mathbb{R}^1$ and $t > 0$, let
It is well known that for each $t > 0$, $\int_{-\infty}^{\infty} f(x, t) \, dx = 1$. It is also known that
Prove or disprove: If $g(x, t) = \dfrac{\partial f}{\partial t}$,
SI,I'
g(z. t ) dt dz
I'
+ J,
g(r3t ) dz d t .
What is the relevance of this example to Fubini's theorem?
6.20 Let $X = Y = [0, 1]$. Let $\mu$ be Lebesgue measure on $X$ and let $\lambda$ be counting measure on $Y$. Let
Show that
and explain why this does not violate Fubini's theorem.

6.3 COMPARISON OF LEBESGUE AND RIEMANN INTEGRALS
Riemann integration corresponds to the concept of Jordan measure in a manner that is similar (but not identical) to the correspondence between the Lebesgue integral and Lebesgue measure. Although it is possible for an unbounded function to be Lebesgue integrable, this cannot occur with proper Riemann integration. Moreover, proper Riemann integrals are defined only for functions with a bounded domain $D$. Since a bounded domain $D$ can always be contained in a rectangular block with edges parallel to the axes, and since we can let $f$ be identically zero on the part of the block that is outside $D$, we will assume that $f$ is defined on such a block. We will denote such a block by the suggestive notation $[a, b]$, with the understanding that $a = (a_1, \ldots, a_n) \in \mathbb{R}^n$ and $b = (b_1, \ldots, b_n) \in \mathbb{R}^n$. Then the symbol we have chosen for a block has the form
$$[a, b] = \prod_{i=1}^{n} [a_i, b_i],$$
a Cartesian product of closed, finite intervals. Let $f$ be any bounded real-valued function on $[a, b]$. Since $f = f^+ - f^-$, a difference between two nonnegative functions, it will suffice to deal with the Riemann integration of nonnegative bounded functions $f$.^60 Let $\Delta$ denote a partition of $[a, b]$ into the union of $N$ rectangular blocks, the interiors of which are mutually disjoint:
$$\Delta = \{[x_i, y_i] \mid i = 1, \ldots, N\}.$$

60 We assume the reader knows from advanced calculus that the positive and negative parts of a Riemann integrable function must be Riemann integrable. See [20].
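Let $\Delta x_i$ denote the volume of the box $[x_i, y_i]$. On each of the $N$ blocks $[x_i, y_i]$ we let $m_i = \inf\{f(x) \mid x \in [x_i, y_i]\}$ and $M_i = \sup\{f(x) \mid x \in [x_i, y_i]\}$. We form the lower and upper sums
$$s(\Delta) = \sum_{i=1}^{N} m_i \, \Delta x_i$$
and
$$S(\Delta) = \sum_{i=1}^{N} M_i \, \Delta x_i.$$
Then we define the lower and upper Riemann integrals by
$$\underline{\int_a^b} f(x) \, dx = \sup_{\Delta} s(\Delta),$$
which is a supremum over all possible finite partitions $\Delta$ of $[a, b]$, and
$$\overline{\int_a^b} f(x) \, dx = \inf_{\Delta} S(\Delta).$$

Definition 6.3.1 A bounded real-valued function $f$ defined on $[a, b]$ is called Riemann integrable if and only if
$$\underline{\int_a^b} f(x) \, dx = \overline{\int_a^b} f(x) \, dx.$$
In the case of equality, this value is called $\int_a^b f(x) \, dx$.^61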
61 The reader who wishes to learn more about the Riemann integral in $\mathbb{R}^n$ can consult an advanced calculus text, such as [20].

Theorem 6.3.1 (Lebesgue's theorem) A bounded real-valued function $f$ on $[a, b]$ is Riemann integrable if and only if the set of points $x$ at which $f$ is not continuous is a Lebesgue null set.

Proof: Without loss of generality, we can suppose that $f$ is a nonnegative bounded function on $[a, b]$, since $f = f^+ - f^-$. Note that $f$ is Riemann integrable if and only if the same is true of both $f^+$ and $f^-$, and a similar statement applies for continuity at a point $x$. Let
$$C(f) = \{(x, y) \mid 0 \le y \le f(x)\},$$
the region between the graph of $f$ and the block $[a, b]$ in $\mathbb{R}^n$. Observe that the Jordan inner and outer measure of $C(f)$ correspond as follows to the lower and upper Riemann integrals of $f$:
$$\underline{v}(C(f)) = \underline{\int_a^b} f(x) \, dx = \sup_{0 \le g \le f} \int_a^b g(x) \, dx, \tag{6.8}$$
$$\overline{v}(C(f)) = \overline{\int_a^b} f(x) \, dx = \inf_{g \ge f} \int_a^b g(x) \, dx, \tag{6.9}$$
where $g$ varies over the step functions.^62 It follows that $f$ is Riemann integrable if and only if $C(f)$ is Jordan measurable, and the latter condition is equivalent to the boundary $\partial C(f)$ being a Jordan null set, according to Theorem 3.6.3. Since the boundary is a closed set, this is equivalent to $\partial C(f)$ being a Lebesgue null set. On the other hand, by Fubini's theorem, $\partial C(f)$ is a Lebesgue null set in the plane if and only if the $x$-section ${}_x\partial C(f)$ has linear Lebesgue measure equal to zero for almost all $x \in [a, b]$.

Suppose that the point $x = p$ is a point of discontinuity of $f$. That is, suppose it is false that $\lim_{x \to p} f(x) = f(p)$. Since $f$ is bounded, the Bolzano-Weierstrass theorem can be used to establish that there is in $[a, b]$ a sequence $x_n \to p$ such that $f(x_n) \to y_1 \ne f(p)$. Denote $f(p) = y_2$. Suppose that $y_1 < y_2$. (The argument would be nearly identical with the opposite inequality.) The reader should show that
$${}_p\partial C(f) \supseteq \{(p, y) \mid y_1 \le y \le y_2\},$$
so that ${}_p\partial C(f)$ has strictly positive linear Lebesgue measure. Moreover, if $f$ is continuous at $p$, then ${}_p\partial C(f)$ contains at most two points and is a Lebesgue null set. ∎

See Exercise 6.24.

62 In this context, by a step function we mean a finite linear combination of indicator functions of rectangular boxes.

EXAMPLE 6.1
Let $f : [0, 1] \times [0, 1] \to \mathbb{R}$ be defined by
We claim that $f$ is Riemann integrable on the closed rectangular box
$$[(0, 0), (1, 1)] = [0, 1]^2.$$
(See Figure 6.1.)^63 In fact, $f$ is bounded and continuous except at the points on the two axes:
$$S = \{(x_1, x_2) \mid x_1 x_2 = 0\}.$$
Since $S$ is a Lebesgue null set in the Lebesgue measure on $\mathbb{R}^2$, Lebesgue's theorem implies the integrability of $f$.

63 This illustration is from [20].

[Figure 6.1: graph of the function $f$ of Example 6.1.]
We remark that a Fubini theorem for the Riemann integral is much less general and more cumbersome in its statement than is the case for the Lebesgue integral. One reason for this is that one cannot be assured of the existence of the iterated integrals, and far fewer functions are Riemann integrable than Lebesgue integrable.
EXERCISES
6.21 Suppose $f : [0, 1] \to \mathbb{R}$ is given by $f = 1_{\mathbb{Q} \cap [0, 1]}$. Prove that $f$ is not Riemann integrable by applying Lebesgue's theorem.
6.22 Let $f : [0, 1] \to \mathbb{R}$ by letting
Prove by using Lebesgue's theorem that $f$ is Riemann integrable.
6.23 Let $f : [0, 1] \to \mathbb{R}$ by letting $f = 1_C$, where $C$ is the Cantor set. (See Exercise 3.11.) Prove by using Lebesgue's theorem that $f$ is Riemann integrable.
6.24 Let $f : [a, b] \to \mathbb{R}$ be a nonnegative function defined on a block in Euclidean space. Define
$$C(f) = \{(x, y) \mid 0 \le y \le f(x)\},$$
as in the proof of Theorem 6.3.1. Prove by the following steps that $f$ is continuous at $x$ if and only if $l({}_x\partial C(f)) = 0$.
a) Show that if $f$ is continuous at $p$, then ${}_p\partial C(f)$ consists of at most two points.
b) Let $p \in [a, b]$. Show that if it is false that $\lim_{x \to p} f(x)$ exists and equals $f(p)$, then ${}_p\partial C(f)$ contains an interval of positive length.
CHAPTER 7
FUNCTIONS OF A REAL VARIABLE
The purpose of this chapter is to consider the relationship between Lebesgue integration and differentiation for functions of a real variable. Thus this chapter is concerned with the adaptation to the Lebesgue integral of what is called the fundamental theorem of the calculus for a single real variable. We will show that for each Lebesgue integrable function $f$, the indefinite integral $F(x) = \int_a^x f \, dl$ has bounded variation, is differentiable almost everywhere, and $F'(x) = f(x)$ almost everywhere.
7.1 FUNCTIONS OF BOUNDED VARIATION
Definition 7.1.1 A function $f : I \to \mathbb{R}$ is said to be of bounded variation on an interval $I$, written as $f \in BV(I)$, provided that there exists a real number $M > 0$ such that the variation
$$v(\Delta) = \sum_{i=1}^{n} |f(t_i) - f(t_{i-1})| \le M$$
for every partition
$$\Delta = \{t_0 \le t_1 \le \cdots \le t_n\} \subset I$$
of $I$ by finitely many points $t_0, \ldots, t_n$.^64 If $I = [a, b]$ is a closed finite interval, we require that the partition $\Delta$ have $t_0 = a$ and $t_n = b$.

EXERCISE
7.1 Prove that $BV(I)$ is a vector space. That is, prove that if both $f$ and $g$ are in $BV(I)$, then the same is true for $\alpha f + g$ for each $\alpha \in \mathbb{R}$.

Define the positive variation and the negative variation corresponding to a partition $\Delta$ by
$$p(\Delta) = \sum_{i=1}^{n} \bigl(f(t_i) - f(t_{i-1})\bigr)^+ \quad\text{and}\quad n(\Delta) = \sum_{i=1}^{n} \bigl(f(t_i) - f(t_{i-1})\bigr)^-, \tag{7.1}$$
where, in general, $x^+$ means the positive part of the real number $x$: it replaces $x$ by zero if $x$ is negative. Similarly, $x^-$ means the negative part of $x$: it replaces $x$ by zero if $x$ is positive and by its absolute value if $x$ is negative.
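The three quantities attached to a partition, namely the variation (the sum of the absolute values of the increments of $f$), the positive variation (the sum of the positive parts of the increments), and the negative variation (the sum of their negative parts), are easy to compute directly. The following is a minimal sketch; the function `variations`, the test function, and the partition are hypothetical choices made only for illustration.

```python
def variations(f, partition):
    """Return (v, p, n): total, positive, and negative variation of f over a partition."""
    v = p = n = 0
    for t_prev, t_next in zip(partition, partition[1:]):
        d = f(t_next) - f(t_prev)
        v += abs(d)
        p += max(d, 0)    # positive part of the increment
        n += max(-d, 0)   # negative part of the increment
    return v, p, n


def f(t):
    """Hypothetical test function: decreasing on [0, 1], increasing on [1, 3]."""
    return (t - 1) ** 2


partition = [0, 1, 2, 3]
v, p, n = variations(f, partition)
print(v, p, n)
```

For every partition the positive and negative variations add up to the variation, and their difference telescopes to the increment of $f$ between the first and last partition points; the test values below illustrate both identities for this one example.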
Definition 7.1.2 If $f : [a, b] \to \mathbb{R}$ has bounded variation, and if $x \in [a, b]$, then $f$ is still of bounded variation on the subinterval $[a, x]$. Let $\Delta$ denote an arbitrary partition of $[a, x]$, and define $p(\Delta)$ and $n(\Delta)$ as in Equations (7.1). Define
$$p(x) = \sup_{\Delta} p(\Delta), \quad n(x) = \sup_{\Delta} n(\Delta), \quad\text{and}\quad v(x) = \sup_{\Delta} v(\Delta),$$
each of which is bounded above by $M$, as defined in Definition 7.1.1.
Theorem 7.1.1 A function $f : [a, b] \to \mathbb{R}$ lies in $BV[a, b]$ if and only if $f$ can be represented as
$$f = f_1 - f_2,$$
where $f_1$ and $f_2$ are both monotonically increasing functions. Moreover, if we have $f \in BV[a, b]$, and if $v$, $n$, and $p$ are as in Definition 7.1.2, then
$$f(x) - f(a) = p(x) - n(x)$$
and
$$v(x) = p(x) + n(x)$$
for all $x \in [a, b]$.

64 Note that $M$ is required to be independent of both the choice of $n \in \mathbb{N}$ and the choice of partition $\Delta$. There are uncountably many choices of $\Delta$ for any one choice of $n > 2$ and a fixed interval $[a, b]$.
Proof: Sufficiency is easy to establish. If $f$ is any monotone function on $[a, b]$, whether it is increasing or decreasing, then
$$v(\Delta) = |f(b) - f(a)|$$
for all $\Delta$. Thus $f_1$ and $f_2$ have bounded variation, and we apply Exercise 7.1.

We turn next to the proof of necessity. So we suppose that $f \in BV[a, b]$. Let $x \in [a, b]$, and let $\Delta$ be any partition of $[a, x]$. It is easy to see that
$$v(x) = \sup_{\Delta} \big(p(\Delta) + n(\Delta)\big) \overset{(i)}{\le} \sup_{\Delta} p(\Delta) + \sup_{\Delta} n(\Delta) = p(x) + n(x) \tag{7.4}$$
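from the definitions. We need to prove equality in Inequality (i) of Equation (7.4). For $\varepsilon > 0$ and some suitable partitions $\Delta_1$ and $\Delta_2$ of $[a, x]$, we have
$$p(x) - p(\Delta_1) < \frac{\varepsilon}{2} \quad\text{and}\quad n(x) - n(\Delta_2) < \frac{\varepsilon}{2}.$$
Let $\Delta = \Delta_1 \cup \Delta_2$. We claim that
$$p(\Delta) \ge p(\Delta_1) \quad\text{and}\quad n(\Delta) \ge n(\Delta_2). \tag{7.5}$$
The reader should check that this is so because partitioning an interval $[x_{i-1}, x_i]$ by means of a point $x$ of that interval forces
$$\big(f(x_i) - f(x)\big)^+ + \big(f(x) - f(x_{i-1})\big)^+ \ge \big(f(x_i) - f(x_{i-1})\big)^+$$
and
$$\big(f(x_i) - f(x)\big)^- + \big(f(x) - f(x_{i-1})\big)^- \ge \big(f(x_i) - f(x_{i-1})\big)^-.$$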
It is simply a matter of checking two cases: Either $f(x)$ lies between $f(x_i)$ and $f(x_{i-1})$, or it does not. It follows from Inequalities (7.5) that
$$v(x) \ge p(\Delta) + n(\Delta) > p(x) - \frac{\varepsilon}{2} + n(x) - \frac{\varepsilon}{2} = p(x) + n(x) - \varepsilon$$
for each $\varepsilon > 0$. This establishes that
$$v(x) = p(x) + n(x). \tag{7.6}$$
It is clear that the three functions $p$, $n$, and $v$ are all monotone increasing, since any partition $\Delta$ of $[a, x]$ can be extended to a partition $\Delta \cup \{x'\}$ of $[a, x']$ for $x' > x$. Note that
$$p(\Delta) - n(\Delta) = f(x) - f(a)$$
for each partition $\Delta$ of $[a, x]$. Since this implies that
$$p(\Delta) + f(a) = n(\Delta) + f(x),$$
it follows that
$$\sup_{\Delta} p(\Delta) + f(a) = \sup_{\Delta'} n(\Delta') + f(x).$$
Thus
$$p(x) - n(x) = f(x) - f(a).$$
We have shown that $f$ can be expressed as the difference of two monotone increasing functions having the additional special property of satisfying Equation (7.6). ∎
Remark 7.1.1 We note that the representation $f(x) - f(a) = p(x) - n(x)$ in Theorem 7.1.1 is not unique. In fact, we could take any monotone increasing function $h(x)$ and write
$$f(x) - f(a) = \big(p(x) + h(x)\big) - \big(n(x) + h(x)\big).$$
However, the reader will prove in Exercise 7.2 that only one such decomposition satisfies the requirement that $v(x) = p(x) + n(x)$.
EXERCISES

7.2 Suppose for $f$, as in Theorem 7.1.1, we had monotone increasing functions $f_1$ and $f_2$ such that
$$f(x) - f(a) = f_1(x) - f_2(x).$$
Observe that $f_1(x) - p(x) = f_2(x) - n(x)$ and prove that
$$p(x) + n(x) \le \big(f_1(x) - f_1(a)\big) + \big(f_2(x) - f_2(a)\big).$$
This establishes that the decomposition in terms of $p$ and $n$ in Theorem 7.1.1 is minimal in the sense that $f_1$ and $f_2$ must increase faster than $p$ and $n$ do, and that the difference cancels out, as in Remark 7.1.1.
7.3 If $f$ is continuous on an interval $[a, b]$ and has a bounded derivative in $(a, b)$, show that $f$ is of bounded variation on $[a, b]$. Is the boundedness of $f'$ necessary for $f$ to be of bounded variation? Justify your answer.

7.4 Let
$$f_n(x) = \begin{cases} x^n \sin(1/x) & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}$$
We claim that $f_n \in BV[0, 1]$ if $n \ge 2$ but it is not in this space if $n = 1$. See Figures 7.1 and 7.2.^65

Figure 7.1 $f(x) = x \sin(1/x)$, with envelope $u(x) = x$, $l(x) = -x$.

7.5 If both $f$ and $g$ are in $BV[a, b]$, prove that $fg \in BV[a, b]$. (Caution: The product of two monotone functions need not be monotone.)
Remark 7.1.2 Let $f : [a, b] \to \mathbb{R}$ be Lebesgue integrable, which is equivalent to $f^+$ and $f^-$ being Lebesgue integrable. The indefinite integral of $f$ is given by
$$F(x) = \int_a^x f(t)\,dl(t) = \int_a^x f^+(t)\,dl(t) - \int_a^x f^-(t)\,dl(t),$$
which is a difference of two monotone increasing functions, so that $F$ lies in $BV[a, b]$.

^65 These two figures are from [20].

Figure 7.2 $f(x) = x^2 \sin(1/x)$, with envelope $u(x) = x^2$, $l(x) = -x^2$.
Thus, although $f \in L^1(\mathbb{R}, \mathfrak{L}, l)$ need not be Riemann integrable, the indefinite integral of $f$ has bounded variation and is therefore Riemann integrable, being a difference of two monotone functions. These observations are significant for the theory of functions of a real variable, and they lead us to the Fundamental Theorem of Calculus for the Lebesgue integral in the next section. First, in preparation, the reader should solve the following easy exercise.

EXERCISE 7.6 Give an example of a Lebesgue integrable function $f$ on $[0, 1]$ for which
$$\frac{d}{dx} \int_0^x f(t)\,dl(t)$$
exists for all $x \in [0, 1]$ but fails to be equal to $f(x)$ for $x \in [0, 1] \cap \mathbb{Q}$.
7.2 A FUNDAMENTAL THEOREM FOR THE LEBESGUE INTEGRAL
Theorem 7.2.1 Let $f$ be any Lebesgue integrable function on $[a, b]$, and define the indefinite integral $F(x)$ by
$$F(x) = \int_a^x f(t)\,dl(t)$$
for all $x \in [a, b]$. Then the derivative $F'(x)$ exists for almost all $x$, and $F'(x) = f(x)$ almost everywhere.

Proof: We know that $F \in BV[a, b]$ by Remark 7.1.2. The differentiability of $F(x)$ for almost all $x$ will follow therefore from Lebesgue's theorem (7.3.1), which we will prove in the next section.^66 For now we will assume Lebesgue's theorem and prove the remaining conclusions of Theorem 7.2.1. It will suffice to give a proof for $f(x) \ge 0$ for all $x \in [a, b]$, since in general
$$f = f^+ - f^-,$$
a difference of two nonnegative integrable functions. In summary, we are assuming that $F'(x)$ exists for almost all $x$, and we must prove that $F'(x) = f(x)$ almost everywhere.

^66 This is not the same Lebesgue theorem that we saw in the preceding chapter, which classified the Riemann integrable functions. Both theorems are commonly called Lebesgue's theorem.

We consider first the case in which the function $f$ is bounded:
$$0 \le f(t) \le M \in \mathbb{R}$$
for all $t \in [a, b]$. We observe that the indefinite integral, $F(x)$, is a monotone increasing continuous function of $x$ since $f$ is both nonnegative and integrable. Moreover, the difference quotients
$$\frac{F(x + h_n) - F(x)}{h_n} \to F'(x)$$
for almost all $x$, independent of the choice of sequence $h_n \to 0$. Thus $F'$ is a measurable nonnegative function, defined almost everywhere. And it is easy to calculate that
$$\frac{F(x + h_n) - F(x)}{h_n} \le M,$$
which is an integrable constant function on $[a, b]$. This is the place where the boundedness of $f$ is helpful, because $M$ is an integrable constant on each finite interval. By Lebesgue Dominated Convergence (Theorem 5.3.1), we know for each interval $[c, d] \subset (a, b)$ that
$$\int_c^d \frac{F(x + h_n) - F(x)}{h_n}\,dl(x) \to \int_c^d F'(x)\,dl(x)$$
as $n \to \infty$. However, the left-hand side can be written as
$$\frac{1}{h_n} \int_d^{d + h_n} F\,dl - \frac{1}{h_n} \int_c^{c + h_n} F\,dl \to F(d) - F(c)$$
by the Mean Value Theorem for integrals applied to the continuous integrand $F$. It follows from the uniqueness of limits that
$$\int_c^d f(x)\,dl(x) = F(d) - F(c) = \int_c^d F'(x)\,dl(x).$$
The latter conclusion can be rewritten as
$$\int_c^d \big(F'(x) - f(x)\big)\,dl(x) = 0$$
for all $[c, d] \subset (a, b)$. It follows that $F'(x) = f(x)$ almost everywhere.^67

The next case permits $f$ to be an unbounded nonnegative integrable function. Define the truncation
$$f_n(t) = \min\{f(t), n\}$$
for each $n \in \mathbb{N}$ and for each $t \in \mathbb{R}$. Let
$$F_n(x) = \int_a^x f_n\,dl.$$
By the first case, $F_n'(x) = f_n(x)$ for almost all $x$. Since the monotone increasing sequence $f_n \to f$, we have the increasing sequence $F_n \to F$ by Monotone Convergence (Theorem 5.4.1). And $F$, like $F_n$, is a monotone increasing function of $x$. In fact, for each fixed $n$, consider the difference
$$F(x) - F_n(x) = \int_a^x (f - f_n)\,dl,$$
which is an increasing function of $x$. By Lebesgue's theorem (7.3.1), we have $F - F_n$ differentiable almost everywhere. And
$$F'(x) - f_n(x) = (F - F_n)'(x) \ge 0$$
for almost all $x$, since the derivative of an increasing function must be nonnegative wherever the derivative exists. Thus
$$F'(x) - f(x) \ge 0 \tag{7.7}$$

^67 The reader may find it interesting to compare this method of proving that a function is zero almost everywhere with Exercise 3.8. That exercise implies that no set could have the property of comprising exactly half of each interval in measure. Thus, for example, it is not possible for a measurable function to be alternately 1 and -1 on half of each interval, making the integral zero without $f$ being zero almost everywhere. These observations are not needed, however, to justify the method we have just used to prove that $F' = f$ almost everywhere.
almost everywhere. Also,
$$\int_c^d \big(F'(x) - f(x)\big)\,dl(x) \le \lim_{h_n \to 0} \int_c^d \frac{F(x + h_n) - F(x)}{h_n}\,dl(x) - \int_c^d f(x)\,dl(x) = F(d) - F(c) - \int_c^d f(x)\,dl(x) = 0,$$
where the inequality above comes from Fatou's lemma (Theorem 5.4.3). We have used also the fact that $F$ is continuous because $f$ is integrable. It follows that $F'(x) \le f(x)$ for almost all $x$. Because of Equation (7.7), $F'(x) = f(x)$ almost everywhere. ∎
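A small numerical illustration may help fix the meaning of "almost everywhere" in Theorem 7.2.1. This is a sketch under stated assumptions, not part of the proof: the step function `f` and the closed form `F` of its indefinite integral are hypothetical examples chosen because the integral can be written down by hand. The difference quotients of $F$ recover $f$ at points of continuity, while at the jump the left and right quotients disagree, so $F'$ fails to exist on a one-point (null) set.

```python
def f(t):
    # Integrable step function on [0, 1], with a jump at t = 0.5.
    return 1.0 if t >= 0.5 else 0.0


def F(x):
    # Closed form of the indefinite integral of f from 0 to x.
    return max(0.0, x - 0.5)


def difference_quotient(x, h):
    # (F(x + h) - F(x)) / h for a nonzero step h (h may be negative).
    return (F(x + h) - F(x)) / h


for x in (0.25, 0.5, 0.75):
    right = difference_quotient(x, 1e-6)
    left = difference_quotient(x, -1e-6)
    print(x, f(x), right, left)
```

At $x = 0.25$ and $x = 0.75$ both quotients agree with $f(x)$ up to rounding. At $x = 0.5$ the right quotient is near 1 and the left quotient is 0.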
EXERCISE 7.7 Let $\phi : \mathbb{R} \to \mathbb{R}$ be a measurable homomorphism of the additive group of real numbers. That is, $\phi$ is measurable, and
$$\phi(x + y) = \phi(x) + \phi(y). \tag{7.8}$$
Suppose also that $\phi$ is locally integrable,^68 meaning that for each $p \in \mathbb{R}$ there exists $r > 0$ such that
$$\int_{p - r}^{p + r} \phi\,dl$$
exists. Use Theorem 7.2.1 to prove that $\phi$ is continuous and also that $\phi$ is a linear mapping of $\mathbb{R}$ to itself.^69
7.3 LEBESGUE’S THEOREM AND VITALI’S COVERING THEOREM
In this section we will prove the theorem of Lebesgue that we have used already.
Theorem 7.3.1 (Lebesgue) Let $f \in BV[a, b]$. Then $f'(x)$ exists and is finite for almost all $x \in (a, b)$.

^68 Local integrability is not necessary for the stated conclusion to be true. Local integrability does permit, however, an easy proof using Theorem 7.2.1. See Exercise 4.14 for a hint for a fairly simple proof based on Lusin's theorem, without assuming local integrability. Thus all measurable homomorphisms of the additive group of real numbers to itself must be continuous. Exercise 7.13.c shows that measurability is not a topological property. It is interesting that measurability combined with the homomorphism property, which is also not topological, implies continuity.

^69 Cauchy proved that every continuous function satisfying Equation (7.8) must be linear. Georg Hamel showed in [11] that there exist solutions of that functional equation that are not continuous and hence not linear. It was in [11] that Hamel introduced the Hamel basis for the set of real numbers.

One could extend the theorem slightly by considering the right-hand derivative at $a$ and the left-hand derivative at $b$, but this has no effect upon the measure-theoretic claim. By Theorem 7.1.1 it suffices to prove the theorem for $f$ monotonically increasing. In order to prove the theorem of Lebesgue, we must prove first another famous theorem.
Theorem 7.3.2 (Vitali's Covering Theorem) Let $A$ be a subset of a finite open interval $(a, b)$. Let $\mathcal{I} = \{I\}$ be a family of closed subintervals $I$ of strictly positive length in $(a, b)$ with the following Vitali Property: For each $x \in A$, and for each $\delta > 0$, there exists $I \in \mathcal{I}$ such that $x \in I$^70 and $|I| < \delta$.
Then there exists a countable set of mutually disjoint intervals $I_n \in \mathcal{I}$ that cover $A$ up to a null set, meaning that
$$l\Big(A \setminus \bigcup_{n} I_n\Big) = 0.$$
In this case we will say that $A$ is covered up to a null set by the intervals $I_n$.

We remark that in Vitali's theorem the set $A \subseteq (a, b)$ does not have to be measurable, although the conclusion states that the coverage of $A$ is up to a measurable set of measure zero.

Proof: The following notation will be convenient: If $I \subset (a, b)$, denote
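$$I^c = (a, b) \setminus I.$$
We proceed to the construction of the sequence of intervals $I_n \in \mathcal{I}$ that cover $A$ up to a null set as follows. The process described in the display below may proceed without end or it may be forced to terminate.^71 We let
$$\alpha_1 = \sup\{|I| : I \in \mathcal{I}\}, \qquad I_1 \in \mathcal{I} \text{ with } |I_1| > \tfrac{1}{2}\alpha_1,$$
and, once $I_1, \ldots, I_{n-1}$ have been selected,
$$\alpha_n = \sup\{|I| : I \in \mathcal{I},\ I \subset I_1^c \cap \cdots \cap I_{n-1}^c\}, \qquad I_n \in \mathcal{I} \text{ with } I_n \subset I_1^c \cap \cdots \cap I_{n-1}^c \text{ and } |I_n| > \tfrac{1}{2}\alpha_n.$$

^70 It is acceptable for $x$ to be an endpoint of the closed interval $I$.

^71 It is interesting to note that other than requiring $\mathcal{I}$ to be a Vitali covering of $A$, the recipe for the construction of the sequence $I_n$ proceeds without reference to $A$.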
Consider first the possibility that this process terminates in $n - 1$ steps, meaning that no $I_n$ of positive length is disjoint from all those already selected. We claim in this case that
$$A \subseteq \bigcup_{j < n} I_j.$$
If it were the case that $A$ is not covered by $\bigcup_j I_j$, then there would exist $x \in A$ such that $x$ lies in the complement of $\bigcup_j I_j$, which is open. Then there must exist an $I$ from the Vitali covering that contains $x$ but has length too short to intersect the union of the selected intervals. This is a contradiction. Thus termination of the process would imply that $A$ is covered entirely by a finite sequence of intervals from the covering.

So consider the remaining and main case, that the process does not terminate. Since the length of $(a, b)$ is finite and the intervals $I_n$ selected as above must be disjoint, it follows that $|I_n| \to 0$ and $\alpha_n \to 0$ also. We claim that
$$l\Big(A \setminus \bigcup_{n} I_n\Big) = 0.$$
If this were false, then we would have the strictly positive number
$$\eta = l^*\Big(A \setminus \bigcup_{n} I_n\Big) > 0.$$
Let $J_n$ be defined as the closed interval with the same midpoint as $I_n$ but with^72
$$|J_n| = 5|I_n|.$$
Then
$$\sum_{i=1}^{\infty} |J_i| < \infty,$$
although it is not necessary to have $J_n \subset (a, b)$. There exists $N \in \mathbb{N}$ such that
$$\sum_{n=N+1}^{\infty} |J_n| < \eta.$$
Thus
$$l^*\Big(\bigcup_{n \ge N+1} J_n\Big) < \eta \le l^*\Big(A \setminus \bigcup_{n \le N} I_n\Big).$$
Thus there exists $x_0 \in A$ such that
$$x_0 \notin \bigcup_{n \le N} I_n \quad\text{and}\quad x_0 \notin J_n \text{ for any } n \ge N + 1.$$

^72 The use of the number 5 in this theorem is so distinctive that some authors refer to this theorem as the Vitali Five Theorem.

Figure 7.3 Top to bottom: $I_m$, $J_m$, $I$; $|I| > 2|I_m|$.
We will explain why this yields a contradiction. By the hypotheses of Vitali's theorem, there exists a sufficiently short interval $I \in \mathcal{I}$ such that $x_0 \in I$ and
$$I \cap \bigcup_{n \le N} I_n = \emptyset.$$
Since $\alpha_n \to 0$, there exists $n_0$ such that $\alpha_{n_0} < |I|$. Hence $I$ is too long to be disjoint from all the intervals $I_n$ with $n < n_0$. Thus
$$I \cap \bigcup_{n < n_0} I_n \ne \emptyset.$$
Let $m$ be the least value of $n$ such that $I \cap I_n \ne \emptyset$. It follows that $m \ge N + 1$. And since $x_0 \notin J_n$ if $n \ge N + 1$, we see that $x_0 \notin J_m$. But $x_0 \in I \setminus J_m$, and yet $I \cap I_m \ne \emptyset$. Since $|J_m| = 5|I_m|$, it follows that
$$|I| > 2|I_m| > \alpha_m,$$
so that $|I| > \alpha_m$. (See Figure 7.3.) Thus $I \cap I_n \ne \emptyset$ for some $n < m$, contradicting the minimality of the choice of $m$. This is the contradiction that proves Vitali's theorem. ∎
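The selection step in the proof can be tried out on a finite family of intervals, where the process always terminates. The sketch below is an illustration under simplifying assumptions, not the construction of the theorem itself: the family is finite, the rule "take an interval longer than half the supremum" is replaced by "take a longest admissible interval" (which satisfies that rule), and the helper names `greedy_disjoint` and `enlarge` are hypothetical. The fivefold enlargements play the role of the intervals $J_n$: every interval of the family that was not selected lies inside the enlargement of some selected interval.

```python
def greedy_disjoint(intervals):
    """Select pairwise disjoint closed intervals from a finite family.

    Scanning in order of decreasing length and keeping each interval that
    misses all intervals kept so far is the same as repeatedly choosing a
    longest interval disjoint from those already chosen.
    """
    chosen = []
    by_length = sorted(intervals, key=lambda iv: iv[1] - iv[0], reverse=True)
    for c, d in by_length:
        # Closed intervals [c, d] and [a, b] are disjoint iff d < a or b < c.
        if all(d < a or b < c for a, b in chosen):
            chosen.append((c, d))
    return chosen


def enlarge(interval, factor=5.0):
    """Closed interval with the same midpoint and `factor` times the length."""
    c, d = interval
    midpoint = (c + d) / 2
    half = factor * (d - c) / 2
    return (midpoint - half, midpoint + half)


def contained_in(inner, outer):
    return outer[0] <= inner[0] and inner[1] <= outer[1]


family = [(0, 1), (0.5, 1.2), (1.1, 1.3), (2, 2.5), (2.4, 4), (3.9, 4.05)]
chosen = greedy_disjoint(family)
enlarged = [enlarge(iv) for iv in chosen]
print(chosen)
```

For this family the selected intervals are $[2.4, 4]$, $[0, 1]$, and $[1.1, 1.3]$, in that order, and each of the six intervals is contained in at least one of the enlarged intervals.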
Figure 7.4 Unequal upper and lower derivatives for $f(x) = x \sin(1/x)$.

Definition 7.3.1 We define the upper and lower left- and right-hand derivatives of a function defined in a neighborhood of $x \in \mathbb{R}$ as follows:
$$D^+f(x) = \limsup_{h \to 0^+} \frac{f(x + h) - f(x)}{h}, \qquad D_+f(x) = \liminf_{h \to 0^+} \frac{f(x + h) - f(x)}{h},$$
$$D^-f(x) = \limsup_{h \to 0^-} \frac{f(x + h) - f(x)}{h}, \qquad D_-f(x) = \liminf_{h \to 0^-} \frac{f(x + h) - f(x)}{h}.$$
Recall that the lim sup and lim inf always exist within the extended real number system $\mathbb{R}^* = \mathbb{R} \cup \{\pm\infty\}$. It is not hard to prove that $f'(x)$ exists if and only if the values of all four upper and lower right and left derivatives are equal and real-valued.

EXERCISE 7.8 Let
$$f(x) = \begin{cases} x \sin(1/x) & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}$$
Find all four upper and lower one-sided derivatives of $f$ at $x = 0$. (See Figure 7.4.)
We are ready to prove Theorem 7.3.1.

Proof: Recall that we can assume without loss of generality that $f$ is an increasing function on $[a, b]$. Let
$$V = \{x \in (a, b) \mid 0 \le D^+f(x) = D^-f(x) = D_+f(x) = D_-f(x) < \infty\}.$$
We will prove that $l\big((a, b) \setminus V\big) = 0$. Note that there is no significant loss in omitting the endpoints of the interval $[a, b]$, because a two-point set is a null set. Since the monotone function $f$ lies in $BV[a, b]$, $f$ is bounded and both $f(a)$ and $f(b)$ are real-valued. We will present most of the work of the proof in the form of two lemmas.
Lemma 7.3.1 If $A = \{x \mid a < x < b,\ D^+f(x) = \infty\}$, then $A$ is measurable, and $l(A) = 0$.

Proof: We are not assuming that $A$ is measurable. However, we will prove that $l^*(A) = 0$, and this will imply that $A$ is a null set. Fix $\beta > 0$, arbitrarily large. If $x \in A$, then there exist values of $h > 0$ as small as we like such that
$$\frac{f(x + h) - f(x)}{h} > \beta.$$
Let
$$\mathcal{I} = \{[x, x + h] \subset (a, b) \mid x \in A,\ h > 0,\ f(x + h) - f(x) > \beta h\}.$$
Then $\mathcal{I}$ covers $A$ in the sense of Vitali. By Vitali's theorem, there exists a sequence $I_n \in \mathcal{I}$ of mutually disjoint intervals such that $A$ is covered up to a null set by $\bigcup_{n \in \mathbb{N}} I_n$. Write $I_n = [c_n, d_n]$. Then
$$f(d_n) - f(c_n) > \beta (d_n - c_n),$$
and
$$\sum_{n} \big(f(d_n) - f(c_n)\big) \le f(b) - f(a).$$
Thus
$$l^*(A) \le \sum_{n} |I_n| \le \frac{f(b) - f(a)}{\beta},$$
which can be made as small as we like by increasing $\beta$. Thus $l^*(A) = 0$. ∎

Note that because $f$ is monotone increasing, the lower derivative in either direction must be nonnegative. This implies that neither $D_+f(x)$ nor $D_-f(x)$ can be $-\infty$. On the other hand, if $D_+f(x) = \infty$, then the same is true for $D^+f(x)$, so $x$ belongs to the null set identified in Lemma 7.3.1. For the cases of $D^-f(x)$ and $D_-f(x)$ we let $g(x) = -f(-x)$ on $[-b, -a]$, which interchanges the upper right derivative with the upper left derivative and the lower right with the lower left. Thus we can restrict our attention without loss of generality to the case in which both one-sided upper and lower derivatives are finite, which we assume henceforth.
Lemma 7.3.2 Let $A = \{x \in (a, b) \mid D^+f(x) > D_-f(x)\}$. Then $A$ is measurable, and $l(A) = 0$.

Proof: Let $r > s$, with both numbers rational, and let
$$A_{r,s} = \{x \in (a, b) \mid D^+f(x) > r > s > D_-f(x)\}.$$
We see that
$$A = \bigcup_{(r,s) \in \mathbb{Q}^2} A_{r,s}$$
is a union of countably many sets. (Here $(r, s)$ denotes an ordered pair of rational numbers, not an interval.) Thus it suffices to show that each $A_{r,s}$ is a null set. For each $I = [c, d] \subset (a, b)$ we define $f(I) = f(d) - f(c)$. Suppose that $p = l^*(A_{r,s}) > 0$.
Note that we do not assume that $A_{r,s}$ is measurable. We will deduce a contradiction from the assumption that $p > 0$.

Here is the idea of the proof of this lemma. We will begin by identifying a sequence of mutually disjoint intervals on which the sum of the increments of $f$ from one end to the other is bounded above in terms of $s$. Then we will find a sequence of other intervals lying within the union of the first sequence on which the sum of the increments of $f$ exceeds the aforementioned bound because $r$ is larger than $s$. If we do this carefully, it will yield an impossible inequality.

Now we proceed with the details of the proof. If $\varepsilon > 0$, then there exists an open set $G \supseteq A_{r,s}$ such that $l(G) < p + \varepsilon$. Let
$$\mathcal{I} = \{I \subset G \mid f(I) < s|I|\}$$
so that $\mathcal{I}$ covers $A_{r,s}$ in the sense of Vitali. By Vitali's theorem there exists a sequence of mutually disjoint intervals $I_k \in \mathcal{I}$ such that $A_{r,s}$ is covered up to a null set by
$$\bigcup_{k \in \mathbb{N}} I_k^{\circ},$$
where $I^{\circ}$ denotes the interior of the set $I$. Here we are using the fact that the set of endpoints of the countably many intervals $I_k$ is a null set, so that the theorem is unaffected if we do not use the set of endpoints of the intervals $I_k$. Let
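$$A'_{r,s} = A_{r,s} \cap \bigcup_{k \in \mathbb{N}} I_k^{\circ},$$
so that $l(A_{r,s} \setminus A'_{r,s}) = 0$. The combination of monotonicity and subadditivity of $l^*$ implies that
$$l^*(A'_{r,s}) \le l^*(A_{r,s}) = p \le l^*(A'_{r,s}) + 0,$$
which implies that
$$l^*(A'_{r,s}) = p.$$
Now let $G' = \bigcup_{k \in \mathbb{N}} I_k^{\circ}$, which is an open set. Let
$$\mathcal{J} = \{J \subset G' \mid f(J) > r|J|\},$$
where $J$ denotes an interval. Then $\mathcal{J}$ covers $A'_{r,s}$ in the sense of Vitali. Hence there exists a sequence of disjoint intervals $J_k \in \mathcal{J}$ such that $A'_{r,s}$ is covered up to a null set by $\bigcup_k J_k$. Since $f$ is monotone increasing, $f(I)$ and $f(J)$ must always be nonnegative numbers, and
$$\sum_k f(J_k) \le \sum_k f(I_k) < s \sum_k |I_k| \le s\,l(G) < s(p + \varepsilon).$$
Yet it is true also that
$$\sum_k f(J_k) > r \sum_k |J_k| \ge r\,l^*(A'_{r,s}) = rp.$$
Hence $s(p + \varepsilon) > rp$, which implies that
$$\frac{p + \varepsilon}{p} > \frac{r}{s} > 1.$$
Since we are assuming that $p > 0$, and because $\varepsilon > 0$ can be taken as small as we like, this implies that
$$1 \ge \frac{r}{s} > 1,$$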
which is a contradiction. ∎

We are ready to complete the proof of Theorem 7.3.1. If we let
$$g(x) = -f(-x)$$
on $[-b, -a]$, then $g$ is also monotone increasing, and
$$D^+g(-x) = D^-f(x).$$
By applying Lemma 7.3.2 to $g$, we see that the set of points $x$ at which
$$D^-f(x) > D_+f(x)$$
is also a null set. Thus $D^+f = D_-f$ almost everywhere. Since there are four upper and lower one-sided derivatives at each point, it is necessary to consider the remaining pairs. For the two pairs $(D^+f(x), D_+f(x))$ and $(D^-f(x), D_-f(x))$ we can emulate the proof of Lemma 7.3.2 and then replace $f$ by $g(x) = -f(-x)$, which is still increasing, to reverse the roles of the upper and lower right derivatives. Similar work can be done to cover the pairs $(D^+f(x), D^-f(x))$ and $(D_-f(x), D_+f(x))$. ∎
7.4 ABSOLUTELY CONTINUOUS AND SINGULAR FUNCTIONS

The concept of absolute continuity for a real-valued function of a real variable is particularly important when studying the various forms of the Fundamental Theorem of Calculus for the Lebesgue integral. We present the definition of this concept below, following a review of two more elementary concepts of continuity.

Definition 7.4.1 Let $f : [a, b] \to \mathbb{R}$. Then we have the following definitions regarding $f$.

1. The function $f$ is continuous at $x_0 \in [a, b]$ if and only if for each $\varepsilon > 0$ there exists $\delta > 0$ such that $x \in [a, b]$, with $|x - x_0| < \delta$, implies that
$$|f(x) - f(x_0)| < \varepsilon.$$

2. The function $f$ is uniformly continuous on $[a, b]$ if and only if for each $\varepsilon > 0$ there exists $\delta > 0$ such that $x$ and $y$ in $[a, b]$, with $|x - y| < \delta$, implies that
$$|f(x) - f(y)| < \varepsilon.$$

3. The function $f$ is absolutely continuous on $[a, b]$ if and only if for each $\varepsilon > 0$ there exists a $\delta > 0$ such that for each $n \in \mathbb{N}$, and for each choice of nonoverlapping subintervals $[x_1, y_1], \ldots, [x_n, y_n]$ of $[a, b]$ with $\sum_{i=1}^{n} |y_i - x_i| < \delta$, we have
$$\sum_{i=1}^{n} |f(y_i) - f(x_i)| < \varepsilon.$$

The reader should take note that continuity at a point is a local concept, and the $\delta > 0$ that works in collaboration with a given $\varepsilon > 0$ may depend upon where in $[a, b]$ the point $x_0$ is located. Uniform continuity requires that there exist a suitable $\delta$ corresponding to $\varepsilon$, regardless of where in $[a, b]$ the points $x$ and $y$ are located, provided they are within $\delta$ of one another. Absolute continuity demands more, because $\delta > 0$ is required to be independent of both the location within $[a, b]$ of the $2n$ points $x_1, y_1, \ldots, x_n, y_n$ and the number $n \in \mathbb{N}$, provided only that
$$\sum_{i=1}^{n} |y_i - x_i| < \delta.$$
If we denote $E = \bigcup_{i=1}^{n} [x_i, y_i]$ in the definition of absolute continuity, then $E$ is easily Lebesgue measurable, and the definition requires that
$$l(E) < \delta \implies \sum_{i=1}^{n} |f(y_i) - f(x_i)| < \varepsilon.$$
Thus the absolute continuity of $f$ is commonly denoted as $f \ll l$, which is read as $f$ is absolutely continuous with respect to Lebesgue measure.
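The two quantities compared in the definition of absolute continuity, the total length of the intervals and the total increment of $f$ across them, can be computed directly for a finite list of nonoverlapping intervals. The sketch below is only an illustration: the helper names `total_length` and `total_increment` and the sample function $x^2$ are assumptions, not notation from the text. For $x^2$ on $[0, 1]$ each increment is at most twice the length of its interval, so $\delta = \varepsilon / 2$ works for every $n$ and every placement of the intervals.

```python
def total_length(intervals):
    # Sum of the lengths y_i - x_i of the intervals [x_i, y_i].
    return sum(y - x for x, y in intervals)


def total_increment(f, intervals):
    # Sum of |f(y_i) - f(x_i)| over the intervals [x_i, y_i].
    return sum(abs(f(y) - f(x)) for x, y in intervals)


def square(x):
    # Hypothetical sample function on [0, 1], chosen only for illustration.
    return x * x


intervals = [(0.10, 0.11), (0.40, 0.42), (0.90, 0.93)]   # nonoverlapping, in [0, 1]
length = total_length(intervals)
increment = total_increment(square, intervals)
print(length, increment)
```

Here the total length is 0.06 and the total increment is about 0.0734, which is below twice the total length.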
EXERCISES

7.9 Let
$$f(x) = \begin{cases} x^n \sin(1/x) & \text{if } x \ne 0, \\ 0 & \text{if } x = 0, \end{cases}$$
where $n \in \mathbb{N}$. Prove the following conclusions.
a) $f$ is continuous at each point of $[0, 1]$.
b) $f$ is uniformly continuous on $[0, 1]$.
c) $f$ is not absolutely continuous on $[0, 1]$ if $n = 1$, but $f$ is absolutely continuous provided $n > 1$. (Hint: Compare with Exercise 7.4.)

7.10 If $f$ is absolutely continuous on $[a, b]$, prove that $f$ has bounded variation on $[a, b]$. (Hint: If $\Delta$ is a partition of $[a, b]$, and if $S \subset [a, b]$ is a finite set of points, then the variation $v(\Delta \cup S) \ge v(\Delta)$.)

7.11 Show that the product of two absolutely continuous functions on a closed finite interval $[a, b]$ is absolutely continuous.

Definition 7.4.2 A monotone function $f$ is said to be singular with respect to Lebesgue measure (written $f \perp l$) provided that $f$ is nonconstant, yet $f'(x) = 0$ almost everywhere.
EXAMPLE 7.1

We will construct a continuous, singular function $f$, called the Cantor function, on $[0, 1]$. Each number $x \in [0, 1]$ can be expressed in a ternary expansion:
$$x = \sum_{n=0}^{\infty} \frac{a_n}{3^n}, \tag{7.9}$$
where each coefficient $a_n \in \{0, 1, 2\}$. The coefficients $a_n$ are not unique without some further restriction. For example, if we allow infinite tails of 2s and also allow 1s, this would render ternary expansions of $x$ in a nonunique manner, since
$$\sum_{n=1}^{\infty} \frac{2}{3^n} = 1.$$
Thus, if $a_0 = 0$ and if $a_n = 2$ for all $n \ge 1$, then $x = 1$ and we could have used $a_0 = 1$ and $a_n = 0$ for all $n \ge 1$.

The Cantor set, $C$, defined in Exercise 3.11, can be described arithmetically by prohibiting the use of the ternary digit 1 but allowing infinite tails of 2s. The effect is the removal of open middle thirds that results in the Cantor set. Thus
$$C = \Big\{ x = \sum_{n=1}^{\infty} \frac{a_n}{3^n} \;\Big|\; a_n \in \{0, 2\} \text{ for all } n \Big\}. \tag{7.10}$$
Notice that the relation $x \le x'$ between two points of the Cantor set, $C$, corresponds correctly with the same nonstrict inequality between the two corresponding sequences of ternary digits, using the lexicographic ordering.

The Cantor set can be pictured as follows. Delete from $[0, 1]$ the open middle third, which is $(\tfrac{1}{3}, \tfrac{2}{3})$. This deletion eliminates all $x$ with ternary expansions having $a_1 = 1$ and leaves two closed intervals of length $\tfrac{1}{3}$ each.^73 Delete the open middle third from each of the two remaining pieces, which eliminates all $x$ for which the ternary expansion has $a_2 = 1$. Continue an infinite sequence of such deletions of open middle thirds. The reader should note that the Cantor set is necessarily uncountable, because the same is true for the space of all infinite sequences on two symbols. The Cantor set includes the endpoints of the deleted open middle thirds from the construction process, but those endpoints comprise only a countable subset of the uncountable Cantor set. Thus $C$ includes uncountably many points that are more difficult to picture mentally than the endpoints from the deleted middle thirds. The Cantor set is closed and nowhere dense.

^73 The reader should note that the numbers $\tfrac{1}{3}$ and $\tfrac{2}{3}$ are not excluded.

Let each $x \in [0, 1]$ be expressed as in Equation (7.9). We will define the Cantor function $f$ first on the Cantor set $C$ by the following equation. If $x$ is in fact expanded in a ternary manner as in Equation (7.10), without the use of the digit 1, then $x \in C$, and we define $f(x)$ by means of the binary expansion
$$f(x) = \sum_{n=1}^{\infty} \frac{a_n / 2}{2^n}$$
for all $x \in C$. One can see from this definition that $f$ is monotone increasing on $C$. On each of the missing open middle thirds, we define $f$ to be locally constant. In fact, the missing open middle third $(a_N, b_N)$ is defined by the requirement $a_N \ne 1$ on the $N$th digit, and one can conclude that $f(a_N) = f(b_N)$, which is the constant value chosen for $f$ on the closed interval $[a_N, b_N]$. For example, on the first deleted middle third $f$ will be constantly equal to $\tfrac{1}{2}$, and on the next two deleted thirds $f$ will be $\tfrac{1}{4}$ and $\tfrac{3}{4}$ respectively.

Figure 7.5 Approximation to the Cantor function.

A computer rendering of the Cantor function is shown in Figure 7.5. The computer was set to connect the plotted points. This is appropriate in the sense that the Cantor function is continuous, as the reader will prove in Exercise 7.12. However, the picture is misleading as well, since it appears as though there were places on the graph with the derivative existing but different from zero. Actually, the derivative is zero wherever it is defined. If the picture were perfect, and if one could magnify it to an arbitrary degree, the seemingly upward-sloped parts of the graph would look just like the large-scale features, consisting of horizontal segments, except on the null set that is the Cantor set.
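The digit description of the Cantor function translates into a short computation. The following is a minimal sketch, not the method used to produce Figure 7.5: the function name `cantor` and the truncation parameter `digits` are assumptions. It reads off ternary digits one at a time, halves each digit 2 to get a binary digit 1, and stops at the first ternary digit 1, where the point lies in (the closure of) a deleted middle third and the function is constant. Exact rational arithmetic is used so that the values on the deleted thirds come out exactly.

```python
from fractions import Fraction


def cantor(x, digits=60):
    """Approximate the Cantor function at a rational x in [0, 1].

    The value is exact whenever a ternary digit 1 is reached or the ternary
    expansion terminates within `digits` digits; otherwise the binary series
    is truncated after `digits` terms.
    """
    x = Fraction(x)
    if x == 1:
        return Fraction(1)
    value = Fraction(0)
    weight = Fraction(1, 2)          # weight of the current binary digit
    for _ in range(digits):
        x *= 3
        digit = int(x)               # next ternary digit: 0, 1, or 2
        x -= digit
        if digit == 1:
            # x lies in [1/3, 2/3) at this scale, where the function is constant.
            return value + weight
        value += weight * (digit // 2)   # ternary 0 -> binary 0, ternary 2 -> binary 1
        weight /= 2
    return value


print(cantor(Fraction(1, 3)), cantor(Fraction(1, 9)), cantor(Fraction(7, 9)))
```

The three printed values are 1/2, 1/4, and 3/4, matching the constant values on the first three deleted middle thirds described above. A point such as 1/4, whose ternary expansion 0.0202... never uses the digit 1, lies in the Cantor set, and the sketch returns a truncation of the binary series whose sum is 1/3.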
Figure 7.6 A homeomorphism that maps a measurable set to a nonmeasurable set.
EXERCISES

7.12 Show that the Cantor function $f$, defined in Example 7.1, maps the interval $[0, 1]$ continuously onto itself, and is a monotone increasing function for which $f'(x)$ exists and equals zero almost everywhere, and such that $f(0) = 0$ and $f(1) = 1$.

7.13 Let $f$ be the Cantor function and define $\phi(x) = f(x) + x$ for all $x \in [0, 1]$. Let $C$ denote the (middle thirds) Cantor set. (See Figure 7.6.)
a) Prove that $\phi : [0, 1] \to [0, 2]$ is a homeomorphism. That is, prove that $\phi$ is injective, surjective, and bicontinuous.
b) Prove that $l\big(\phi([0, 1] \setminus C)\big) = 1$ and that $l\big(\phi(C)\big) = 1$.
c) Let $P$ be any nonmeasurable subset of $\phi(C)$. (See Theorem 3.4.4 for the existence of $P$.) Prove that $\phi^{-1}(P)$ is a Lebesgue measurable set but not a Borel set.

7.14 Let $(X, \mathfrak{A}, \mu)$ be a complete measure space. Suppose that $\psi : X \to \mathbb{R}$ is a measurable function such that $\psi^{-1}$ maps null sets to null sets. Prove that $\psi^{-1}$ maps measurable sets to measurable sets.^74 Is the mapping in Exercise 7.13.c measurable?

7.15
a) Provide an example of a function on $[0, 1]$ that is not absolutely continuous but is of bounded variation.
b) Provide examples of two different continuous functions on $[0, 1]$ that have the same derivative almost everywhere and that are both equal to zero at 0.

^74 Compare this exercise with Exercise 6.6.
Theorem 7.4.1 Let $f$ be a monotone increasing real-valued function on $[a, b]$. Then $f'$ exists almost everywhere on $[a, b]$, and we have the following conclusions:

i. $f(x) - f(a) \ge \int_a^x f'\,dl$ for all $x \in [a, b]$.

ii. Equality holds in the inequality above if and only if $f$ is absolutely continuous.

Proof: The existence of $f'$ almost everywhere follows from Theorem 7.3.1. We need to prove the two parts concerning the inequality.

i. Since $f'(t)$ exists for almost all $t$, we can pick a sequence $h_n \to 0^+$, and for each $t$ at which $f'(t)$ exists, we have
$$f'(t) = \lim_{n \to \infty} \frac{f(t + h_n) - f(t)}{h_n}.$$
It follows that $f'$ is equal almost everywhere to the limit of a sequence of measurable functions, which implies that $f'$ is measurable. Note that each difference quotient,
$$\frac{f(t + h_n) - f(t)}{h_n},$$
is nonnegative, as is $f'(t)$ wherever it exists. Next, we apply Fatou's theorem (5.4.3) as follows. For each $[c, d] \subset (a, b)$ we have
$$\int_c^d f'\,dl \le \liminf_{n \to \infty} \int_c^d \frac{f(t + h_n) - f(t)}{h_n}\,dl(t) = \liminf_{n \to \infty} \left( \frac{1}{h_n} \int_d^{d + h_n} f\,dl - \frac{1}{h_n} \int_c^{c + h_n} f\,dl \right) = f(d) - f(c)$$
for almost all $c$ and $d$, since $f$ is differentiable (and hence continuous) almost everywhere.^75 In the theorem, $\lim_{c \to a^+} f(c) \ge f(a)$ because $f$ is monotone increasing. If $f$ is not continuous at $a$, then the inequality is true a fortiori.

ii. Suppose first that equality holds in the inequality of part (i). The reader will prove that $f$ is absolutely continuous in Exercise 7.16.

^75 If $f$ is continuous at $c$ and at $d$, then the limit indicated will be $f(d) - f(c)$.
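Suppose for the opposite direction of implication that $f \ll l$, meaning that $f$ is absolutely continuous with respect to Lebesgue measure. We need to prove that equality holds in Theorem 7.4.1. Define
$$g(x) = \int_a^x f'(t)\,dl(t),$$
so that $g \ll l$ by Exercise 7.16 again. Now let $h = f - g$, and we have
$$h'(x) = f'(x) - g'(x) = 0$$
almost everywhere. Also, it is easy to check that $h = f - g$ is absolutely continuous: for each $\varepsilon > 0$ there exists $\delta > 0$ such that for nonoverlapping subintervals $[a_k, b_k]$ of $[a, b]$, we have
$$\sum_k (b_k - a_k) < \delta \implies \sum_k |h(b_k) - h(a_k)| < \varepsilon.$$
We claim that $h$ must be constant. Writing $h(I) = h(d) - h(c)$ for an interval $I$ with endpoints $c < d$, let
$$\mathcal{I} = \{I = [c, d] \subset [a, b] \mid h(I) < \varepsilon |I|\}.$$
Then $\mathcal{I}$ covers $E = \{x \in [a, b] \mid h'(x) = 0\}$ in the sense of Vitali. Hence the Vitali covering theorem (7.3.2) tells us that there exists a disjoint sequence $I_k \in \mathcal{I}$ such that $E$ is covered up to a null set by $\bigcup_{k \in \mathbb{N}} I_k$. Since $h' = 0$ almost everywhere, we see also that $[a, b]$ is covered up to a null set by
$$\bigcup_{k \in \mathbb{N}} I_k \subseteq [a, b],$$
which implies that $\sum_k |I_k| = b - a$. There exists $N \in \mathbb{N}$ such that $\sum_{k > N} |I_k| < \delta$. We can write
$$[a, b] \setminus \bigcup_{k \le N} I_k = \bigcup_{k \le p} J_k$$
for some natural number $p$, and where each $J_k$ is an interval, not closed, and $\sum_{k \le p} |J_k| < \delta$. Because of the absolute continuity criterion for the monotone increasing function $h$,
$$h(b) - h(a) = \sum_{k \le N} h(I_k) + \sum_{k \le p} h(J_k) < \varepsilon (b - a) + \varepsilon$$
for all $\varepsilon > 0$. Thus $h$ must be a constant function. But then
$$f(x) - g(x) = h(x) = h(a) = f(a).$$
Thus
$$f(x) - f(a) = \int_a^x f'\,dl. \qquad ∎$$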
Corollary 7.4.1 If $f$ is an absolutely continuous, real-valued function on $[a, b]$, then
$$f(x) - f(a) = \int_a^x f'\,dl \quad\text{for all } x \in [a, b].$$
Proof: See Exercise 7.18.a. ∎

Definition 7.4.3 A real-valued function $f$ on a measure space $(X, \mathfrak{A}, \mu)$ is called essentially bounded if and only if there exists $M \in \mathbb{R}$ such that $|f(x)| \le M$ for almost all $x$. We denote the set of all essentially bounded functions as
$$L^{\infty}(X, \mathfrak{A}, \mu)$$
and we define the essential supremum of $f$ by
$$\|f\|_{\infty} = \inf\{M \mid |f(x)| \le M \text{ for almost all } x\}.$$
EXERCISES

7.16 Let $f \in L^1(\mathbb{R})$.
a) If $\varepsilon > 0$, prove that there exists $\delta > 0$ such that if $E$ is Lebesgue measurable and if $l(E) < \delta$, then $\int_E |f|\,dl < \varepsilon$. (Hint: Use Definition 5.2.3.)
b) Prove that if equality holds in Theorem 7.4.1, then $f$ is absolutely continuous.
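7.17 Suppose that both $f$ and $\frac{\partial f}{\partial y}$ lie in $L^1([a, b] \times [c, d])$. Suppose also that $f(x, y)$ is absolutely continuous as a function of $y$ for almost all fixed values of $x$. Prove that
$$\frac{d}{dy} \int_a^b f(x, y)\,dl(x) = \int_a^b \frac{\partial f}{\partial y}(x, y)\,dl(x)$$
for almost all $y$. Take care to establish that both sides exist. (Hint: Use Fubini's theorem to prove that
$$\int_a^b f(x, y)\,dl(x) - \int_c^y \int_a^b \frac{\partial f}{\partial y}(x, t)\,dl(x)\,dl(t)$$
is a constant function of $y$.)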
7.18 Let $f$ be an absolutely continuous, real-valued function on $[a, b]$.
a) Prove Corollary 7.4.1. (Hint: Note that $f$ need not be monotone. Use Equation (7.2) to express $f$ as the difference between two monotone absolutely continuous functions.)
b) Prove that the total variation of $f$ on $[a, b]$ is equal to $\int_a^b |f'|\,dl$. (Hint: Use the result and the hint for Exercise 7.18.a to prove an inequality in one direction. Use Equation (7.3) to prove the opposite inequality.)

7.19 A real-valued function $f$ on an interval $I$ for which there exists a constant $C$ such that
$$|f(x) - f(y)| \le C|x - y|$$
for all $x$ and $y$ in $I$ is called a Lipschitz function.
a) Show that a Lipschitz function is absolutely continuous.
b) Show that an absolutely continuous function $f$ on an interval is Lipschitz if and only if $f'$ is essentially bounded.
c) Give an example of a Lipschitz function that does not satisfy the Mean Value Theorem for derivatives.

7.20
a) Provide an example of a function of unbounded variation on $[0, 1]$ that has a derivative equal to zero at almost all $x \in [0, 1]$.
b) Provide an example of a function that is absolutely continuous on $[0, 1]$ but has an unbounded derivative.

We know already that if $f \in L^1[a, b]$ and if $F(x) = \int_a^x f(t)\,dl(t)$, then $F'(x)$ exists and equals $f(x)$ almost everywhere. Thus
$$\lim_{h \to 0} \frac{1}{h} \int_x^{x + h} \big(f(t) - f(x)\big)\,dl(t) = 0 \tag{7.11}$$
for almost all x. We have the following stronger theorem, and the reader should pause to consider why the proof is not as simple as that of Equation (7.11).
Theorem 7.4.2 (Lebesgue) Let f E L ' [ a , b]. Theti
f o r almost all x.
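Proof: Suppose $\alpha$ is a given constant. We define the set $N_\alpha$ to be the minimal set such that $x \in [a, b] \setminus N_\alpha$ implies that
$$\lim_{h \to 0} \frac{1}{h} \int_x^{x+h} |f(t) - \alpha|\, dl(t) = |f(x) - \alpha|.$$
We know that $N_\alpha$ is a Lebesgue null set because of Theorem 7.2.1. We will show that we can choose the sets $N_\alpha$ independent of $\alpha$. Write the set of all rational numbers as $\mathbb{Q} = \{\alpha_i \mid i \in \mathbb{N}\}$ and let $N = \bigcup_{i \in \mathbb{N}} N_{\alpha_i}$, which is a null set. Now let $\beta$ be an arbitrary real number and pick $\alpha \in \mathbb{Q}$ such that $|\beta - \alpha| < \epsilon$. We apply the triangle inequality as follows: for each $x \in [a, b] \setminus N$,
$$\limsup_{h \to 0} \left| \frac{1}{h} \int_x^{x+h} |f(t) - \beta|\, dl(t) - |f(x) - \beta| \right| \le 2\epsilon + \lim_{h \to 0} \left| \frac{1}{h} \int_x^{x+h} |f(t) - \alpha|\, dl(t) - |f(x) - \alpha| \right| = 2\epsilon.$$
Since $\epsilon > 0$ is arbitrary, the limit relation holds with any real $\beta$ in place of $\alpha$ for every $x \notin N$. Choosing $\beta = f(x)$ completes the proof.

A point $x$ is called a density point of a measurable set $A \subseteq \mathbb{R}$ if the limit
$$\lim_{h \to 0+} \frac{l(A \cap [x - h, x + h])}{2h}$$
exists and equals 1. A density point of $A$ need not belong to $A$.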
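As a concrete illustration of the notion of a density point, the following worked computation (an added example, not taken from the text) uses the hypothetical choice $A = [0, 1]$ and evaluates the density ratio directly at an interior point and at an endpoint.

```latex
% Worked example (illustrative only): density points of A = [0,1].
% Interior point: for 0 < x < 1 and 0 < h < \min(x, 1 - x),
% the interval [x-h, x+h] is contained in A, so
\[
  \frac{l(A \cap [x-h,\, x+h])}{2h} = \frac{2h}{2h} = 1 .
\]
% Hence every x in (0,1) is a density point of A.
% Endpoint: for x = 0 and 0 < h \le 1, only [0, h] meets A, so
\[
  \frac{l(A \cap [-h,\, h])}{2h} = \frac{h}{2h} = \frac{1}{2} .
\]
% Hence 0 belongs to A but is not a density point of A.
```

Removing the single point $1/2$ from $A$ changes none of these measures, so $1/2$ is a density point of $[0, 1] \setminus \{1/2\}$ that does not belong to that set.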
EXERCISE
7.21 Let $A$ be a Lebesgue measurable subset of $\mathbb{R}$ of positive measure.
a) Apply Theorem 7.4.2 to the function $f = 1_A$, the indicator function of $A$, in order to prove that almost every point $x \in A$ is a density point of $A$.$^{76}$
b) Suppose that $A$ and $B$ are two sets of strictly positive measure in $\mathbb{R}$. Apply the preceding part to prove that there exists a translation by some $h \in \mathbb{R}$ such that $l((A + h) \cap B) > 0$.$^{77}$ (Hint: Consider two density points.)
The following surprising congruence theorem is a fairly simple consequence of any one of the Exercises 7.21, 3.26, or 6.11.
Theorem 7.4.3 (Steinhaus) Let $A$ and $B$ be any two subsets of $\mathbb{R}$ having identical, finite positive measure: $l(A) = l(B) = \alpha$ and $0 < \alpha < \infty$. Then there exist two sequences of mutually disjoint measurable sets $A_n$ and $B_n$ and null sets $N$ and $M$ such that
$$A = \bigcup_{n \in \mathbb{N}} A_n \cup N, \qquad B = \bigcup_{n \in \mathbb{N}} B_n \cup M,$$
and there exist constants $a_n$ such that $A_n + a_n = B_n$ for all $n \in \mathbb{N}$.
Proof: The function $f(x) = l((A + x) \cap B)$ is a continuous function of $x$ which approaches zero as $|x| \to \infty$ and which achieves strictly positive values at least for some $x$. Thus there exists a number $x = a_1$ which maximizes the value of $f$. Let $B_1 = B \cap (A + a_1)$, and let $A_1 = B_1 - a_1$. Define $B^1 = B \setminus B_1$ and $A^1 = A \setminus A_1$. If $B^1$ and $A^1$ happen to be null sets, we are done. If not, pick $a_2$ which maximizes $l((A^1 + x) \cap B^1)$ and define $A_2$, $B_2$, $A^2$, and $B^2$ in the same manner as in the first step. We proceed until the process terminates (in which case we are done) or else we generate in this way two infinite sequences of sets and translation numbers. In the latter case, observe that $l(A_n) = l(B_n) \to 0$ as $n \to \infty$. Let
$$N = A \setminus \bigcup_{n \in \mathbb{N}} A_n$$
and let
$$M = B \setminus \bigcup_{n \in \mathbb{N}} B_n.$$
$^{76}$ It is interesting to compare the result of this exercise concerning density points with Exercise 3.17.
$^{77}$ This part calls for a new proof of a theorem the reader has proven in an earlier exercise by a different method. See either Exercise 3.26 or Exercise 6.11.
It will suffice to prove that $N$ and $M$, which must have the same measure, are null sets. Suppose this conclusion were false. Then there exists $a \in \mathbb{R}$ such that $l((N + a) \cap M) > 0$. But then there exists $n$ such that
$$l(A_n) = l(B_n) < l((N + a) \cap M).$$
This violates the maximality property in the choice of $a_n$.
CHAPTER 8
GENERAL COUNTABLY ADDITIVE SET FUNCTIONS
In Theorem 5.2.2 the reader saw that if $f : X \to \mathbb{R}$ is integrable on the measure space $(X, \mathfrak{A}, \mu)$, then we can define a countably additive set function $\nu$ on $\mathfrak{A}$ by the formula
$$\nu(A) = \int_A f\, d\mu. \tag{8.1}$$
We learned that the set function $\nu$ can take both positive and negative values and that $\nu$ is bounded in absolute value by $\|f\|_1$. In this chapter we will study general countably additive set functions that can take both positive and negative values. Such set functions are known also as signed measures. In the Radon-Nikodym theorem we will characterize all those signed measures that arise from integrals of an integrable function with respect to a measure, $\mu$, as in Equation (8.1), as being absolutely continuous with respect to the measure $\mu$. And in the Lebesgue Decomposition theorem, we will learn how to decompose any signed measure into its absolutely continuous and singular parts, with respect to a given measure $\mu$. These concepts for signed measures will be defined as part of the work of this chapter.
Measure and Integration: A Concise Introduction to Real Analysis. By Leonard F. Richardson Copyright © 2009 John Wiley & Sons, Inc.
8.1 HAHN DECOMPOSITION THEOREM
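Definition 8.1.1 Given a $\sigma$-algebra $\mathfrak{A}$ of subsets of $X$, a function $\mu : \mathfrak{A} \to \mathbb{R}$ is called a countably additive set function (or a signed measure),$^{78}$ provided that for every sequence of mutually disjoint sets $A_n \in \mathfrak{A}$, we have
$$\mu\Big(\bigcup_{n \in \mathbb{N}} A_n\Big) = \sum_{n \in \mathbb{N}} \mu(A_n).$$
We prove first the following theorem.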
Theorem 8.1.1 If $\mu$ is a countably additive set function on a $\sigma$-field $\mathfrak{A}$, then $\mu$ is bounded on $\mathfrak{A}$. That is, there exists a real number $M$ such that $|\mu(A)| \le M$ for all $A \in \mathfrak{A}$.
Proof: We begin by restating the theorem as follows, bearing in mind that $\mu$ can have both positive and negative values on $\mathfrak{A}$, and that consequently $\mu$ need not be monotone. Let
$$\mu^*(A) = \sup\{ |\mu(B)| \mid B \subseteq A,\ B \in \mathfrak{A} \}$$
for each $A \in \mathfrak{A}$. The theorem asserts that $\mu^*(X) < \infty$.
i. We claim that both $|\mu(A)|$ and $\mu^*(A)$ are subadditive as functions of $A \in \mathfrak{A}$.$^{79}$ The inequality for $|\mu|$ follows immediately from the triangle inequality for the real numbers combined with the additivity of $\mu$: If $A$ and $B$ are $\mathfrak{A}$-measurable and disjoint, then
$$|\mu(A \,\dot{\cup}\, B)| = |\mu(A) + \mu(B)| \le |\mu(A)| + |\mu(B)|.$$
For the second inequality, we note that if $A = A_1 \,\dot{\cup}\, A_2$, a disjoint union, then $\mu^*(A) \le \mu^*(A_1) + \mu^*(A_2)$ because if an $\mathfrak{A}$-measurable set $B \subseteq A$, then
$$|\mu(B)| \le |\mu(B \cap A_1)| + |\mu(B \cap A_2)| \le \mu^*(A_1) + \mu^*(A_2).$$
Here, we have used the subadditivity of $|\mu(B)|$ as a function of $B$.
ii. We will suppose that $\mu^*(X) = \infty$ and deduce a contradiction. By hypothesis, $\mu(X) \in \mathbb{R}$. So there exists a set $B \in \mathfrak{A}$ such that
$$|\mu(B)| > |\mu(X)| + 1 \ge 1.$$
$^{78}$ Some authors allow a signed measure to be extended real-valued. In that case, it is necessary to require that $\mu$ take at most one of the two infinite values, $\infty$ or $-\infty$, in order to ensure that $\mu$ is well defined on $\mathfrak{A}$. We will restrict ourselves to real-valued, countably additive set functions here, however.
$^{79}$ We do not denote $|\mu(A)|$ in the form $|\mu|(A)$ because the latter symbol will be given a special meaning in Definition 8.1.2.
By the additivity of $\mu$,
$$|\mu(X \setminus B)| = |\mu(X) - \mu(B)| \ge |\mu(B)| - |\mu(X)| > 1.$$
Because $B$ and $X \setminus B$ are disjoint, it follows from subadditivity that either $\mu^*(B) = \infty$ or $\mu^*(X \setminus B) = \infty$. Thus there exists $B_1 \in \mathfrak{A}$ such that $|\mu(B_1)| > 1$ and $\mu^*(X \setminus B_1) = \infty$. Hence there exists $B_2 \in \mathfrak{A}$, disjoint from $B_1$, such that
$$|\mu(B_2)| > 1 \quad \text{and} \quad \mu^*(X \setminus (B_1 \cup B_2)) = \infty.$$
This process generates an infinite sequence of mutually disjoint sets $B_n$ such that $|\mu(B_n)| > 1$ for each $n \in \mathbb{N}$. Let
$$B = \bigcup_{n \in \mathbb{N}} B_n \in \mathfrak{A}$$
so that
$$\mu(B) = \sum_{n=1}^{\infty} \mu(B_n). \tag{8.2}$$
The latter series is conditionally convergent, meaning that it is convergent but not absolutely convergent. Therefore, by a familiar exercise or theorem from advanced calculus,$^{80}$ both the sum of the positive terms and the sum of the negative terms in Equation (8.2) must diverge. Hence there exists a subsequence $B_{n_j}$ such that
$$\Big|\mu\Big(\bigcup_{j \in \mathbb{N}} B_{n_j}\Big)\Big| = \Big|\sum_{j=1}^{\infty} \mu(B_{n_j})\Big| = \infty,$$
which is a contradiction.
We are ready to state and prove the Hahn Decomposition theorem.
Theorem 8.1.2 (Hahn Decomposition) Let $\mu$ be a countably additive set function on a $\sigma$-algebra $\mathfrak{A}$. Then there exists a partition, $X = P \,\dot{\cup}\, N$ into disjoint measurable sets, with the following properties for each $A \in \mathfrak{A}$:
i. If $A \subseteq P$, then we must have $\mu(A) \ge 0$.
ii. If $A \subseteq N$, then $\mu(A) \le 0$.
iii. The partition $X = P \,\dot{\cup}\, N$ is essentially unique in the following sense. If $X = P' \,\dot{\cup}\, N'$ is another such decomposition, then each measurable subset of $P \triangle P'$ is a $\mu$-null set, and each measurable subset of $N \triangle N'$ is a $\mu$-null set.
We observe that each measurable subset of $P \triangle P'$ has nonnegative measure, whereas each measurable subset of $N \triangle N'$ has nonpositive measure. Thus the essential uniqueness criterion can be restated as follows: $\mu(P \triangle P') = 0 = \mu(N \triangle N')$.
$^{80}$ See, for example, [20]. There it is shown that if a series is conditionally convergent, then the sum of the positive terms diverges and the sum of the negative terms diverges.
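Proof: Let
$$\alpha = \sup\{ \mu(A) \mid A \in \mathfrak{A} \}. \tag{8.3}$$
Then $0 \le \alpha < \infty$ by Theorem 8.1.1 and because $\mu(\emptyset) = 0$. For each $n \in \mathbb{N}$ there exists $A_n \in \mathfrak{A}$ such that
$$\mu(A_n) > \alpha - \frac{1}{2^n}.$$
Let
$$P = \liminf A_n = \bigcup_{p=1}^{\infty} \bigcap_{n=p}^{\infty} A_n$$
so that $P \in \mathfrak{A}$ and $P$ is the set of all those $x \in X$ such that $x$ is present in all but a finite number of the sets $A_n$. We will show that $\mu(P) = \alpha$. First, we need the following lemma.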
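Lemma 8.1.1 Under the hypotheses of Theorem 8.1.2, with $\alpha$ defined by Equation (8.3), if $\mu(B_1) > \alpha - \epsilon_1$ and if $\mu(B_2) > \alpha - \epsilon_2$, then $\mu(B_1 \cap B_2) > \alpha - (\epsilon_1 + \epsilon_2)$.
Proof: Because $\mu$ is additive,
$$\mu(B_1 \cap B_2) = \mu(B_1) + \mu(B_2) - \mu(B_1 \cup B_2) > (\alpha - \epsilon_1) + (\alpha - \epsilon_2) - \alpha = \alpha - (\epsilon_1 + \epsilon_2)$$
because of Equation (8.3).
Letting
$$H_q = \bigcap_{n=p}^{p+q-1} A_n, \quad \text{so that} \quad \bigcap_{q \in \mathbb{N}} H_q = \bigcap_{n=p}^{\infty} A_n,$$
we see that $H_q \supseteq H_{q+1}$ for all $q \in \mathbb{N}$. Also, by repeated application of Lemma 8.1.1,
$$\mu(H_q) > \alpha - \sum_{n=p}^{p+q-1} \frac{1}{2^n} > \alpha - \frac{1}{2^{p-1}}$$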
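for all $q \in \mathbb{N}$. By countable additivity of $\mu$, we see that
$$\mu\Big(\bigcap_{q=1}^{\infty} H_q\Big) = \mu(H_1) - \sum_{q=1}^{\infty} \mu(H_q \setminus H_{q+1}) = \lim_{N \to \infty} \mu(H_N).$$
We deduce that
$$\mu\Big(\bigcap_{n=p}^{\infty} A_n\Big) \ge \alpha - \frac{1}{2^{p-1}}.$$
It follows that $\mu\big(\bigcap_{n=p}^{\infty} A_n\big) \ge \alpha - \frac{1}{2^{q-1}}$ whenever $p \ge q$. Since the union over $p$ is the union of an increasing chain of sets, and since $\mu$ is countably additive, $\mu(P)$ is the limit of $\mu\big(\bigcap_{n=p}^{\infty} A_n\big)$ as $p \to \infty$, so that
$$\alpha - \frac{1}{2^{q-1}} \le \mu(P) \le \alpha$$
for all $q \in \mathbb{N}$, which implies that $\mu(P) = \alpha$, as claimed. Moreover, if there were a set $A \in \mathfrak{A}$ such that $A \subseteq P$ and $\mu(A) < 0$, then we would have
$$\mu(P \setminus A) = \mu(P) - \mu(A) > \mu(P),$$
which is a contradiction, since $\mu(P) = \alpha$ is the supremum of the values of $\mu$. It follows that if $A \subseteq P$, then $\mu(A) \ge 0$. Now let $N = X \setminus P$. Suppose there were a measurable set $A \subseteq N$ such that $\mu(A) > 0$. Then it would follow that $\mu(P \cup A) > \mu(P)$, which is impossible. Hence if $A \in \mathfrak{A}$ and $A \subseteq N$, it follows that $\mu(A) \le 0$. We leave the proof of essential uniqueness to Exercise 8.1.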
Definition 8.1.2 Let $\mu$ be a countably additive set function on a $\sigma$-algebra $\mathfrak{A}$, and let $P$ and $N$ be (for $\mu$) as in Theorem 8.1.2. Define the positive part, the negative part, and the variation of $\mu$ as follows:
$$\mu^+(A) = \mu(A \cap P), \qquad \mu^-(A) = |\mu(A \cap N)|, \qquad |\mu|(A) = \mu^+(A) + \mu^-(A).$$
The number $|\mu|(X) = \|\mu\|$ is called the total variation norm of $\mu$. See Exercise 8.3.
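To make the Hahn decomposition and the variation concrete, here is a small worked example (added for illustration, not part of the text). It assumes the hypothetical choices $X = [-1, 1]$ with Lebesgue measure $l$ and the signed measure $\mu(A) = \int_A x\, dl(x)$, which is of the form in Equation (8.1).

```latex
% Worked example (illustrative only): Hahn decomposition of
% \mu(A) = \int_A x \, dl(x) on X = [-1,1].
% The integrand is nonnegative on [0,1] and negative on [-1,0), so one may take
\[
  P = [0,1], \qquad N = [-1,0).
\]
% Positive part, negative part, and variation:
\[
  \mu^+(A) = \int_{A \cap [0,1]} x \, dl(x), \qquad
  \mu^-(A) = \int_{A \cap [-1,0)} |x| \, dl(x), \qquad
  |\mu|(A) = \int_A |x| \, dl(x).
\]
% Values on the whole space:
\[
  \mu^+(X) = \tfrac12, \quad \mu^-(X) = \tfrac12, \quad
  \mu(X) = 0, \quad \|\mu\| = |\mu|(X) = 1 .
\]
```

Moving the single point $0$ from $P$ to $N$ gives another valid decomposition, which illustrates why the partition is only essentially unique.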
EXERCISES
8.1 Suppose that we have two Hahn decompositions as in Theorem 8.1.2:
$$X = P \,\dot{\cup}\, N = P' \,\dot{\cup}\, N'.$$
Prove that $\mu(P \triangle P') = 0 = \mu(N \triangle N')$.
8.2 Prove the Jordan Decomposition Theorem as follows: a) Prove that $\mu^+$, $\mu^-$, and $|\mu|$, as in Definition 8.1.2, are countably additive nonnegative measures. b) Prove that the decomposition $\mu = \mu^+ - \mu^-$ is minimal in the following sense. If $\mu_1$ and $\mu_2$ are measures such that $\mu = \mu_1 - \mu_2$, then $\mu^+ \le \mu_1$ and $\mu^- \le \mu_2$.
8.3 Prove that the total variation norm, as in Definition 8.1.2, satisfies all the requirements to be a norm on the vector space $M$ of all countably additive set functions on $(X, \mathfrak{A})$, a measurable space consisting of the set $X$ and a $\sigma$-field $\mathfrak{A}$ of subsets of $X$.
8.4 Prove that $M$ is complete in the total variation norm.
8.2 RADON-NIKODYM THEOREM
Definition 8.2.1 If $\lambda$ and $\mu$ are measures on a $\sigma$-algebra $\mathfrak{A}$ of subsets of $X$, we call $\lambda$ absolutely continuous with respect to $\mu$, written as $\lambda \ll \mu$, if and only if $\mu(A) = 0$ implies $\lambda(A) = 0$ for all $A \in \mathfrak{A}$.
If a nonnegative function $f$ is in $L^1(X, \mathfrak{A}, \mu)$, and if we define
$$\mu_f(E) = \int_E f\, d\mu$$
for each $E \in \mathfrak{A}$, then $\mu_f$ will be absolutely continuous with respect to $\mu$, written $\mu_f \ll \mu$, as in the foregoing definition. We have a similar definition for the absolute continuity of one countably additive set function (signed measure) with respect to another.
Definition 8.2.2 If $\lambda$ and $\mu$ are countably additive set functions on $\mathfrak{A}$, we call $\lambda$ absolutely continuous with respect to $\mu$, written as $\lambda \ll \mu$, if and only if $\lambda(E) = 0$ for each $E \in \mathfrak{A}$ such that $|\mu|(E) = 0$. See Exercise 8.5.
Theorem 8.2.1 Suppose $\lambda$ and $\mu$ are finite (nonnegative) measures on a $\sigma$-algebra $\mathfrak{A}$ of subsets of a set $X$. Then we have the following conclusions:
i. The measure $\lambda \ll \mu$ if and only if there exists a nonnegative function $f$ in $L^1(X, \mathfrak{A}, \mu)$, called the Radon-Nikodym derivative and denoted by $\frac{d\lambda}{d\mu}$, such that for each $A \in \mathfrak{A}$ we have
$$\lambda(A) = \int_A f\, d\mu.$$
ii. Moreover, the $L^1(X, \mathfrak{A}, \mu)$-equivalence class of a Radon-Nikodym derivative, $\frac{d\lambda}{d\mu}$, is uniquely determined.
Remark 8.2.1 The notation for the Radon-Nikodym derivative suggests a chain rule (Exercise 8.9) and a change of variables formula (Exercise 8.13).
Proof: The implication from right to left in part (i) is inherent in the fourth conclusion of Theorem 5.2.2. So we will suppose here that $\lambda \ll \mu$, and we give a proof from left to right. We begin with a lemma.
Lemma 8.2.1 Under the hypotheses of the Radon-Nikodym theorem, if $\lambda \ll \mu$, and if the measure $\lambda$ is not identically zero, then there exists $\epsilon > 0$, and there exists $P \in \mathfrak{A}$ with $\mu(P) > 0$, such that if $A \in \mathfrak{A}$ and $A \subseteq P$, we have $\lambda(A) \ge \epsilon\mu(A)$. In the context of the proof of this lemma, we will write the conclusion of the lemma as an inequality as follows: $\lambda \overset{P}{\ge} \epsilon\mu$.
Proof: Since $\lambda \ll \mu$ and $\lambda$ is not identically zero, neither is $\mu$ identically zero. We claim that there exists sufficiently small $\epsilon > 0$ such that the nonnegative measure $(\lambda - \epsilon\mu)^+ \ne 0$, meaning that $(\lambda - \epsilon\mu)^+$ is strictly positive on some set. Suppose this were false. Then we would conclude from the Hahn Decomposition theorem that $\lambda(A) \le \epsilon\mu(A)$ for all $A \in \mathfrak{A}$, and for all $\epsilon > 0$. But this would force $\lambda = 0$, which would be a contradiction. Thus there exists $\epsilon > 0$ such that $(\lambda - \epsilon\mu)^+ \ne 0$. Hence there is a Hahn Decomposition $X = P \,\dot{\cup}\, N$ for the signed measure $\lambda - \epsilon\mu$, with $\mu(P) > 0$, since $(\lambda - \epsilon\mu)^+(P) > 0$. If a measurable set $A \subseteq P$, then $(\lambda - \epsilon\mu)(A) \ge 0$. Thus there exists a set $P$ of strictly positive $\mu$-measure, for which $\lambda \overset{P}{\ge} \epsilon\mu$.
If $f \in L^+(\mu)$, we define $\mu_f(A) = \int_A f\, d\mu$, as we did earlier, and we let
$$L^+(\mu, \lambda) = \{ f \in L^+(\mu) \mid \mu_f(A) \le \lambda(A)\ \forall A \in \mathfrak{A} \},$$
noting that the inequalities that define $L^+(\mu, \lambda)$ apply to all $A \in \mathfrak{A}$. By Lemma 8.2.1, we know that there exist $\epsilon > 0$ and $P \in \mathfrak{A}$, with $\mu(P) > 0$, and such that
$$\epsilon 1_P \in L^+(\mu, \lambda),$$
which therefore has a nontrivial element if $\lambda$ is not identically zero. Even if $\lambda$ were zero, the set $L^+(\mu, \lambda)$ would be nonempty since it would contain the zero function. Let
$$\alpha = \sup\{ \mu_f(X) \mid f \in L^+(\mu, \lambda) \},$$
so that $\alpha \le \lambda(X) < \infty$. Thus, for each $n \in \mathbb{N}$, there exists $f_n \in L^+(\mu, \lambda)$ such that $\mu_{f_n}(X) > \alpha - \frac{1}{n}$. Let
$$g_n = \max(f_1, \ldots, f_n) = f_1 \vee \cdots \vee f_n \in L^+(\mu, \lambda).$$
Then $g_n$ is a monotone increasing sequence of measurable functions. Moreover, $\mu_{g_n} \le \lambda$ and $\mu_{g_n}(X) > \alpha - \frac{1}{n}$. We can define $g = \lim_n g_n$, which is defined and finite almost everywhere,$^{81}$ and we see that $\mu_g \le \lambda$. We know also that $\mu_g(X) = \alpha$. It will suffice to prove that $\mu_g = \lambda$. We will suppose that the latter equation is false and deduce a contradiction. Suppose that $\lambda - \mu_g > 0$, which means that $\lambda - \mu_g$ is nonnegative and not the identically zero measure. Let $\lambda^* = \lambda - \mu_g$. Then $\lambda^* > 0$ and $\lambda^* \ll \mu$. By Lemma 8.2.1, we conclude that there exists $\epsilon' > 0$ and $P' \in \mathfrak{A}$ such that $\mu(P') > 0$, and such that $A \in \mathfrak{A}$ and $A \subseteq P'$ imply that
$$\lambda(A) - \mu_g(A) = \lambda^*(A) \ge \epsilon'\mu(A).$$
Let $h = g + \epsilon' 1_{P'}$. Then
$$\int_A h\, d\mu = \int_A g\, d\mu + \epsilon'\mu(A)$$
for each $A \in \mathfrak{A}$ such that $A \subseteq P'$. That is, $\lambda(A) \ge \mu_g(A) + \epsilon'\mu(A) = \mu_h(A)$ for each measurable $A \subseteq P'$. Hence $\int_X h\, d\mu > \alpha$, which is impossible since $h$ must lie in $L^+(\mu, \lambda)$.
The uniqueness of the Radon-Nikodym derivative up to $L^1$-equivalence is shown in Exercise 8.6.
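A short worked example may help fix the meaning of the Radon-Nikodym derivative. The computation below is an added illustration, not from the text; it assumes the hypothetical choices $X = [0, 1]$, $\mu = l$ (Lebesgue measure), and $\lambda(A) = \int_A 2x\, dl(x)$, and it contrasts this with the point mass $\delta_0$.

```latex
% Worked example (illustrative only): a Radon-Nikodym derivative on [0,1].
% The measure \lambda(A) = \int_A 2x \, dl(x) satisfies \lambda \ll l, with
\[
  \frac{d\lambda}{dl}(x) = 2x, \qquad
  \lambda([0,t]) = \int_0^t 2x \, dl(x) = t^2 \quad (0 \le t \le 1).
\]
% By contrast, the point mass \delta_0 is not absolutely continuous
% with respect to l, because
\[
  l(\{0\}) = 0 \quad \text{but} \quad \delta_0(\{0\}) = 1 ,
\]
% so no function f in L^1(l) can satisfy \delta_0(A) = \int_A f \, dl for all A.
```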
Remark 8.2.2 We remark that if $f \in L^1(X, \mathfrak{A}, \mu)$ for some measure $\mu$, then $f$ has a $\sigma$-finite carrier.$^{82}$ Thus in order to characterize those measures expressible in the form $\lambda(A) = \int_A f\, d\mu$, it would be appropriate to limit our attention to $\sigma$-finite measure spaces $(X, \mathfrak{A}, \mu)$. It is easy to extend the Radon-Nikodym theorem to the case in which $\mu$ is a $\sigma$-finite measure and $\lambda$ is a finite measure.
$^{81}$ Here we use the Monotone Convergence theorem, together with the finiteness of $\lambda(X)$.
$^{82}$ We can take for the carrier the set $|f|^{-1}(0, \infty]$.
It is simple also to give an extension of the Radon-Nikodym theorem to signed measures because each signed measure is the difference between two positive measures. Thus the Radon-Nikodym derivative in this more general context is the difference between two Radon-Nikodym derivatives for positive measures. See Exercises 8.7 and 8.8.
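EXERCISES
8.5 Let $\lambda$ and $\mu$ be countably additive set functions. Prove that the following three statements are equivalent. a) $\lambda \ll \mu$. b) $\lambda^+ \ll \mu$ and $\lambda^- \ll \mu$. c) $|\lambda| \ll |\mu|$.
8.6 Show that $\frac{d\lambda}{d\mu}$, the Radon-Nikodym derivative of $\lambda$ with respect to $\mu$ in Theorem 8.2.1, is uniquely determined as an element of $L^1(X, \mathfrak{A}, \mu)$.
8.7 Suppose $\mu$ is a $\sigma$-finite measure on a $\sigma$-algebra $\mathfrak{A}$ of subsets of $X$. Suppose $\lambda$ is another $\sigma$-finite measure on $\mathfrak{A}$ such that $\lambda \ll \mu$. Prove that there exists a nonnegative $\mu$-measurable function $f$ on $X$ such that $\lambda(A) = \int_A f\, d\mu$ for all $A \in \mathfrak{A}$. Prove that $\lambda$ is a finite measure if and only if $f \in L^1(X, \mathfrak{A}, \mu)$.
8.8 Suppose $\mu$ is a $\sigma$-finite measure on a $\sigma$-algebra $\mathfrak{A}$ of subsets of $X$. Suppose $\lambda$ is a signed real-valued measure on $\mathfrak{A}$ such that $\lambda \ll \mu$. Prove that there exists a (signed) function $f \in L^1(X, \mathfrak{A}, \mu)$ such that $\lambda(A) = \int_A f\, d\mu$ for all $A \in \mathfrak{A}$.
8.9 Suppose that the measures $\lambda$, $\mu$, $\nu$ on a measurable space $(X, \mathfrak{A})$ have the relationship $\lambda \ll \mu \ll \nu$, meaning that $\lambda \ll \mu$ and $\mu \ll \nu$, where $\lambda$ and $\mu$ are finite and $\nu$ is $\sigma$-finite. Prove that $\lambda \ll \nu$ and that
$$\frac{d\lambda}{d\nu} = \frac{d\lambda}{d\mu} \frac{d\mu}{d\nu}.$$
This can be done by means of the steps below. (You may use the result of Exercise 8.7.) a) Be sure to show (easily) that $\lambda \ll \nu$. Let
$$f = \frac{d\lambda}{d\mu}, \qquad g = \frac{d\mu}{d\nu}, \qquad h = \frac{d\lambda}{d\nu},$$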
and explain why there is a monotone nondecreasing sequence $f_n \in \mathfrak{S}_0$ such that $f_n \to f$ pointwise everywhere. b) Show that
$$\int_A f_n\, d\mu \to \lambda(A)$$
for all $A \in \mathfrak{A}$ as $n \to \infty$. c) Show that
$$\int_A f_n\, d\mu = \int_A f_n g\, d\nu$$
for all $A \in \mathfrak{A}$. d) Use Exercise 8.6 to complete the proof that $f_n g \to h$.
8.10 Give another proof of Exercise 5.24 using a Radon-Nikodym derivative. That is, let $E_1 \supseteq E_2 \supseteq \cdots \supseteq E_n \supseteq \cdots$ be a decreasing nest of measurable sets in the complete measure space $(X, \mathfrak{A}, \mu)$. Let $f$ be integrable on $(X, \mathfrak{A}, \mu)$ and suppose that
$$\mu\Big(\bigcap_{n \in \mathbb{N}} E_n\Big) = 0.$$
Prove that $\int_{E_n} f\, d\mu \to 0$ as $n \to \infty$ by letting $\nu$ be defined by $\frac{d\nu}{d\mu} = f$.
8.11 Let $l$ denote Lebesgue measure on the unit interval and let $f$ denote the Cantor function from Example 7.1. a) Does it make sense to define $\lambda = l \circ f$ by $\lambda(A) = l(f(A))$ for each measurable set $A \subseteq [0, 1]$? Why or why not? b) Suppose $A \subseteq [0, 1]$ is an $l$-null set with the property that $f(A)$ is Lebesgue measurable. Must $\lambda(A) = 0$? Why or why not?
8.12 Let $\phi$ be a continuously differentiable monotone increasing function defined on $[a, b] \subset \mathbb{R}$. Define a measure $\lambda$ on the Lebesgue measurable sets of $[a, b]$ by $\lambda(A) = l(\phi(A))$. Prove that $\lambda \ll l$ and find $\frac{d\lambda}{dl}$.
8.13 Suppose $(X, \mathfrak{A}, \mu)$ is a complete measure space and $f \in L^1(X, \mathfrak{A}, \mu)$. Suppose $\phi : X \to X$ is a bijection for which $\phi(E) \in \mathfrak{A}$ if and only if $E \in \mathfrak{A}$. Suppose $\phi$ maps $\mu$-null sets to $\mu$-null sets. Define the measure
$$\mu \circ \phi(E) = \mu(\phi(E)).$$
Show that $\mu \circ \phi \ll \mu$, and prove the change of variables formula
$$\int_X f\, d\mu = \int_X (f \circ \phi)\, \frac{d(\mu \circ \phi)}{d\mu}\, d\mu.$$
8.14 It is interesting to consider the relationship between the concept of absolute continuity of functions given in Definition 7.4.1 and that of absolute continuity of measures. a) If $\lambda$ and $\mu$ are any two finite measures on a $\sigma$-field $\mathfrak{A} \subseteq \mathfrak{P}(X)$, prove that $\lambda \ll \mu$ if and only if they satisfy the following condition: For each $\epsilon > 0$ there exists a $\delta > 0$ such that $\mu(A) < \delta$ implies that $\lambda(A) < \epsilon$. (Hint: For one direction, use the Radon-Nikodym theorem.) b) Suppose now that the finite measure $\lambda$ is defined on the Lebesgue measurable sets of $([a, b), \mathfrak{L})$. Define $f(x) = \lambda[a, x)$ for all $x \in [a, b]$. Prove that $f$ is an absolutely continuous function on $[a, b]$ if and only if $\lambda \ll l$. (Hint: From right to left is easy by part (a). For the other direction, prove that $\lambda(A) = \int_A f'\, dl$ for all $A \in \mathfrak{L}$.)
8.3 LEBESGUE DECOMPOSITION THEOREM
The Radon-Nikodym theorem addressed the classification of measures absolutely continuous with respect to a given measure. Here we study a quite different (symmetrical) relationship of singularity between two measures or countably additive set functions.
Definition 8.3.1 If $\lambda$ and $\mu$ are countably additive set functions on a measurable space $(X, \mathfrak{A})$, we call $\lambda$ singular with respect to $\mu$ if and only if $X = E \,\dot{\cup}\, F$, a disjoint union of $\mathfrak{A}$-measurable sets, such that $|\lambda|(E) = 0 = |\mu|(F)$. This is denoted as $\lambda \perp \mu$.
Theorem 8.3.1 (Lebesgue Decomposition Theorem) Let $\mathfrak{A}$ be a $\sigma$-algebra of subsets of $X$. Let $\mu$ and $\nu$ be two signed measures defined on $\mathfrak{A}$.$^{83}$ Then there exist two unique signed measures, $\nu_s$ and $\nu_a$, such that
$$\nu = \nu_s + \nu_a$$
with the properties that $\nu_s \perp \mu$ and $\nu_a \ll \mu$.
Proof: It follows from Definitions 8.2.2 and 8.3.1 that singularity or absolute continuity with respect to a signed measure $\mu$ means singularity or absolute continuity with respect to $|\mu|$. Thus we can assume without loss of generality that $\mu$ is a (nonnegative) measure. Because of Exercise 8.5 and Definition 8.3.1, we can assume without loss of generality that $\nu$ is a measure as well. The proof of the theorem is based upon the simple observation that $\nu \ll (\mu + \nu)$. Thus there exists a nonnegative measurable function $f$ such that
$$\nu(E) = \int_E f\, d(\mu + \nu)$$
for all $E \in \mathfrak{A}$. Since $\mu$ and $\nu$ are positive, we have
$$0 \le \int_E f\, d(\mu + \nu) = \nu(E) \le (\mu + \nu)(E),$$
$^{83}$ By Definition 8.1.1, our assumption implies that $\mu$ and $\nu$ are finite measures. For measures that are not signed, the present theorem can be generalized readily to the $\sigma$-finite case. See Exercise 8.18.
which implies that we have $0 \le f \le 1$ almost everywhere with respect to $\mu + \nu$. Let $A = f^{-1}\{1\}$, and let $B = X \setminus A$. Observe that $A$ and $B$ are complementary, and $0 \le f < 1$ almost everywhere on $B$. Thus
$$\nu(A) = \int_A 1\, d(\mu + \nu) = \mu(A) + \nu(A),$$
which implies that $\mu(A) = 0$. Define
$$\nu_s(E) = \nu(E \cap A) \quad \text{and} \quad \nu_a(E) = \nu(E \cap B).$$
Thus $\nu = \nu_s + \nu_a$, and $\nu_s \perp \mu$ because $\nu_s$ vanishes on subsets of $B$ and because $\mu(A) = 0$. We need to show that $\nu_a \ll \mu$. Suppose that $\mu(E) = 0$. Then
$$\nu_a(E) = \nu(E \cap B) = \int_{E \cap B} d\nu = \int_{E \cap B} f\, d(\mu + \nu) = \int_{E \cap B} f\, d\nu,$$
since $\mu(E) = 0$ by hypothesis. This implies that
$$\int_{E \cap B} (1 - f)\, d\nu = 0.$$
Since $1 - f > 0$ $\nu$-almost everywhere on $B$, it follows that $\nu_a(E) = \nu(E \cap B) = 0$, so that $\nu_a \ll \mu$. Finally, we prove uniqueness. Let $\nu = \nu_s + \nu_a = \tilde{\nu}_s + \tilde{\nu}_a$ be two Lebesgue decompositions. Thus $\nu_s - \tilde{\nu}_s = \tilde{\nu}_a - \nu_a$, with one side singular and the other side absolutely continuous with respect to $\mu$. (See Exercise 8.15.) This forces both sides to be zero, which completes the proof. (See Exercise 8.16.)
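The construction in the proof can be traced through on a simple case. The following worked example is an added illustration, not from the text; it assumes the hypothetical choices $X = [0, 1]$, $\mu = l$ (Lebesgue measure), and $\nu = l + \delta_0$, where $\delta_0$ is the unit point mass at $0$.

```latex
% Worked example (illustrative only): Lebesgue decomposition of
% \nu = l + \delta_0 with respect to \mu = l on X = [0,1].
% Here \mu + \nu = 2l + \delta_0, and \nu(E) = \int_E f \, d(\mu+\nu) forces
\[
  f(x) = \tfrac12 \ \text{ for } l\text{-almost every } x \in (0,1],
  \qquad f(0) = 1 .
\]
% Following the proof, A = f^{-1}\{1\} and B = X \setminus A:
\[
  A = \{0\} \ (\text{up to a } (\mu+\nu)\text{-null set}), \qquad
  \mu(A) = l(\{0\}) = 0 .
\]
% The two parts of \nu are therefore
\[
  \nu_s(E) = \nu(E \cap \{0\}) = \delta_0(E), \qquad
  \nu_a(E) = \nu(E \cap (0,1]) = l(E),
\]
% with \nu_s \perp l and \nu_a \ll l, and \nu = \nu_s + \nu_a.
```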
EXERCISES
8.15 Prove that a linear combination of two measures that are absolutely continuous with respect to $\mu$ on $(X, \mathfrak{A})$ must be absolutely continuous. Prove also that a linear combination of two measures that are singular with respect to $\mu$ on $(X, \mathfrak{A})$ must be singular with respect to $\mu$.
8.16 Let $\mu$ and $\nu$ be nonnegative finite measures on $(X, \mathfrak{A})$. If $\nu \perp \mu$ and $\nu \ll \mu$, prove that $\nu = 0$, the identically zero measure on $\mathfrak{A}$.
8.17 This exercise continues the work begun in Exercise 8.14. Let $f$ be a monotone increasing function on $[a, b]$ and define a measure $\mu_f$ by letting it assign to an interval $[a, x)$ the measure $\mu_f[a, x) = f(x) - f(a)$. a) Let $\mu_a$ be the absolutely continuous part of $\mu_f$ with respect to Lebesgue measure, and find the Radon-Nikodym derivative $\frac{d\mu_a}{dl}$. b) Show that the singular part $\mu_s$ and the absolutely continuous part $\mu_a$ of $\mu_f$ can be used to define absolutely continuous and singular parts of the function $f$.
8.18 Let $\nu$ be any $\sigma$-finite measure on the measure space $(X, \mathfrak{A}, \mu)$, where $\mu$ is $\sigma$-finite. Prove that there exist two unique measures $\nu_s$ and $\nu_a$ such that
$$\nu = \nu_s + \nu_a$$
with the properties that $\nu_s \perp \mu$ and $\nu_a \ll \mu$.
CHAPTER 9
EXAMPLES OF DUAL SPACES FROM MEASURE THEORY
We have seen that, for any measure space $(X, \mathfrak{A}, \mu)$, the space $L^1(X, \mathfrak{A}, \mu)$ is a Banach space. In this chapter, we will define an infinite family of Banach spaces. We will see that some of these spaces can be used to represent the Banach space of all the continuous linear mappings of other such spaces into the field of scalars.
9.1 THE BANACH SPACE $L^p(X, \mathfrak{A}, \mu)$
Recall that for a measurable function $f$ on a measure space $(X, \mathfrak{A}, \mu)$, we define the equivalence class $[f]$ to be the set of all measurable functions $g$ such that $f = g$ almost everywhere. This equivalence is denoted also by $f \sim g$.
Definition 9.1.1 Let $(X, \mathfrak{A}, \mu)$ be any measure space. For each real number $p$ in the interval $[1, \infty)$, we define the vector space
$$L^p(X, \mathfrak{A}, \mu) = \Big\{ [f] \;\Big|\; \int_X |f|^p\, d\mu < \infty,\ f\ \mathfrak{A}\text{-measurable} \Big\}.$$
It is common to say or to write that some function $f$ is an element of $L^p(X, \mathfrak{A}, \mu)$, although the elements of that vector space are actually equivalence classes of functions. In Exercise 9.1, the reader will show that $L^p$ is a vector space. It is important to define a suitable norm on $L^p(X, \mathfrak{A}, \mu)$.
Definition 9.1.2 For each $f$ in $L^p(X, \mathfrak{A}, \mu)$, $1 \le p < \infty$, we define
$$\|f\|_p = \Big( \int_X |f|^p\, d\mu \Big)^{1/p}.$$
Observe that $f \sim g$ in $L^p(X, \mathfrak{A}, \mu)$ if and only if $\|f - g\|_p = 0$. Also, $\|f\|_p$ is well defined on equivalence classes in $L^p$. One of our objectives is to prove that $\|\cdot\|_p$ is a norm on the vector space $L^p$. The triangle inequality is the only property required of a norm that is not very easy to check. To prove the triangle inequality, we begin with an important inequality for the real numbers.
Lemma 9.1.1 (Jensen’s Inequality for Real Numbers) Suppose that $\alpha \ge 0$, $\beta \ge 0$, $a > 0$, and $b > 0$ are real numbers such that $\alpha + \beta = 1$. Then
$$a^\alpha b^\beta \le \alpha a + \beta b.$$
Proof: We will prove this inequality from the more general measure theoretic Jensen’s inequality that was proven in Theorem 5.2.3. We observe first that the logarithm function is a concave function on the positive half of the real line. We define a very simple probability space by letting $X = \{0, 1\}$ and $\mathfrak{A} = \mathfrak{P}(X)$. We define a probability measure $\mu$ such that $\mu\{0\} = \alpha$ and $\mu\{1\} = \beta$. We define an integrable function $f$ by letting $f(0) = a$ and $f(1) = b$. It follows that
$$\log\Big( \int_X f\, d\mu \Big) \ge \int_X \log f\, d\mu.$$
Thus we see that
$$\log(\alpha a + \beta b) \ge \alpha \log a + \beta \log b,$$
and this implies the desired conclusion. Jensen’s inequality enables us to prove the following very important inequality.
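A quick numerical check of the lemma may be useful. The values below are hypothetical choices made only for illustration ($\alpha = 1/3$, $\beta = 2/3$, $a = 8$, $b = 1$); the special case $\alpha = \beta = 1/2$ is the familiar inequality between the geometric and arithmetic means.

```latex
% Worked example (illustrative only): Lemma 9.1.1 with
% \alpha = 1/3, \beta = 2/3, a = 8, b = 1.
\[
  a^{\alpha} b^{\beta} = 8^{1/3} \cdot 1^{2/3} = 2,
  \qquad
  \alpha a + \beta b = \tfrac{8}{3} + \tfrac{2}{3} = \tfrac{10}{3},
  \qquad
  2 \le \tfrac{10}{3} .
\]
% Special case \alpha = \beta = 1/2 (geometric mean vs. arithmetic mean):
\[
  \sqrt{ab} \le \frac{a + b}{2} .
\]
```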
Theorem 9.1.1 (Hölder’s Inequality) Let $p \ge 1$ and $q \ge 1$ be real numbers such that
$$\frac{1}{p} + \frac{1}{q} = 1.$$
Let $f \in L^p(X, \mathfrak{A}, \mu)$ and $g \in L^q(X, \mathfrak{A}, \mu)$. Then the product $fg \in L^1(X, \mathfrak{A}, \mu)$ and
$$\int_X |fg|\, d\mu \le \|f\|_p \|g\|_q. \tag{9.1}$$
Remark 9.1.1 In the special case in which $p = 1$, we take $q = \infty$, and we denote by $\|f\|_\infty$ the essential supremum of $f$. Thus $\|f\|_\infty$ is the essential sup-norm of $f$. If $p > 1$ and $q > 1$, it is common to write Hölder’s inequality as
$$\|fg\|_1 \le \|f\|_p \|g\|_q.$$
Proof: Note first that if $p = 1$ and $q = \infty$, then Hölder’s inequality, expressed by Equation (9.1), is very easy to prove from the monotonicity of the integral. So we will suppose that $p > 1$ and $q > 1$. To prove Hölder’s inequality, let
$$E = \{ x \mid |f(x)g(x)| > 0 \} \in \mathfrak{A},$$
and assume without loss of generality that $\mu(E) > 0$. We will need to consider the set $E$ in order to have the strictly positive terms to which we can apply Jensen’s inequality. For each $x$, let
$$a(x) = \frac{|f(x)|^p}{\|f\|_{L^p(E)}^p}, \qquad b(x) = \frac{|g(x)|^q}{\|g\|_{L^q(E)}^q},$$
where neither denominator can vanish because $\mu(E) > 0$. By Jensen’s inequality for each $x \in E$, we have
$$a(x)^{1/p}\, b(x)^{1/q} \le \frac{1}{p} a(x) + \frac{1}{q} b(x)$$
or
$$\frac{|f(x)g(x)|}{\|f\|_{L^p(E)} \|g\|_{L^q(E)}} \le \frac{1}{p} \frac{|f(x)|^p}{\|f\|_{L^p(E)}^p} + \frac{1}{q} \frac{|g(x)|^q}{\|g\|_{L^q(E)}^q}.$$
Next, we integrate both sides over $E$ to obtain
$$\frac{1}{\|f\|_{L^p(E)} \|g\|_{L^q(E)}} \int_E |fg|\, d\mu \le \frac{1}{p} + \frac{1}{q} = 1,$$
which implies that
$$\|fg\|_1 = \|fg\|_{L^1(E)} \le \|f\|_p \|g\|_q,$$
since $\|fg\|_{L^1(E)} = \|fg\|_1$, whereas the integrals of $|f|^p$ and of $|g|^q$ over $E$ will be less than or equal to the corresponding integrals over $X$.
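To see how sharp Hölder’s inequality can be in practice, here is a small worked computation. It is an added example, not from the text, and assumes the hypothetical choices $X = [0, 1]$ with Lebesgue measure $l$, $p = q = 2$, $f(x) = x$, and $g(x) = x^2$.

```latex
% Worked example (illustrative only): Holder's inequality with p = q = 2
% on [0,1], for f(x) = x and g(x) = x^2.
\[
  \|fg\|_1 = \int_0^1 x^3 \, dl(x) = \frac14 ,
\]
\[
  \|f\|_2 = \Big( \int_0^1 x^2 \, dl(x) \Big)^{1/2} = \frac{1}{\sqrt{3}},
  \qquad
  \|g\|_2 = \Big( \int_0^1 x^4 \, dl(x) \Big)^{1/2} = \frac{1}{\sqrt{5}} ,
\]
\[
  \|fg\|_1 = \frac14 = 0.25 \;\le\; \frac{1}{\sqrt{15}} \approx 0.258
  = \|f\|_2 \, \|g\|_2 .
\]
```

The two sides are close but not equal here because $|f|^2$ and $|g|^2$ are not proportional to one another.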
We are ready to prove the triangle inequality for the $L^p$-norms.
Theorem 9.1.2 (Minkowski’s Inequality) If $1 \le p \le \infty$ and if $f$ and $g$ are in $L^p(X, \mathfrak{A}, \mu)$, then
$$\|f + g\|_p \le \|f\|_p + \|g\|_p.$$
Proof: For $p = 1$ we have proven this already in Exercise 5.36. For $p = \infty$ the result is very easy. So we will suppose that $1 < p < \infty$. Because $L^p$ is a vector space, we know that $0 \le \|f + g\|_p < \infty$. If $\|f + g\|_p = 0$, then the theorem is very easy. We will suppose that $\|f + g\|_p > 0$. We can write
$$\|f + g\|_p^p = \int_X |f + g|^p\, d\mu \le \int_X |f|\,|f + g|^{p-1}\, d\mu + \int_X |g|\,|f + g|^{p-1}\, d\mu.$$
If we let $q = \frac{p}{p-1}$, then $\frac{1}{p} + \frac{1}{q} = 1$, and we have $|f + g|^{p-1} \in L^q$. We apply Hölder’s inequality to each summand on the right side above, obtaining
$$\|f + g\|_p^p \le (\|f\|_p + \|g\|_p)\, \big\| |f + g|^{p-1} \big\|_q. \tag{9.2}$$
Since
$$\big\| |f + g|^{p-1} \big\|_q = \|f + g\|_p^{p-1} \ne 0,$$
we can divide both sides of Equation (9.2) by $\|f + g\|_p^{p-1}$ to obtain Minkowski’s inequality.
If we introduce a metric $d(f, g) = \|f - g\|_p$, we see that $L^p$ would be only a semimetric space if we did not employ the equivalence relation indicated in the definition. With this quotient space we have made $L^p$ into a normed vector space. It remains to prove that this space is complete and is therefore a Banach space.
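Theorem 9.1.3 The normed vector space $L^p(X, \mathfrak{A}, \mu)$ is a Banach space for each real number $p \ge 1$.
Proof: We will model the proof on that for Theorem 5.5.2. Let $f_n$ be a Cauchy sequence in the $L^p$-norm. For each $k \in \mathbb{N}$, there exists a natural number $n_k$, which we may choose to be strictly increasing in $k$, such that for all $n$ and $m$ greater than or equal to $n_k$ we have
$$\|f_n - f_m\|_p < \frac{1}{4^k}.$$
In particular,
$$\|f_{n_{k+1}} - f_{n_k}\|_p < \frac{1}{4^k}.$$
Let
$$A_k = \Big\{ x \;\Big|\; |f_{n_{k+1}}(x) - f_{n_k}(x)| \ge \frac{1}{2^k} \Big\}$$
and let
$$N = \bigcap_{m=1}^{\infty} \bigcup_{k=m}^{\infty} A_k,$$
which is the set of all those points $x$ that appear in infinitely many sets $A_k$. It is easy to calculate that $\mu(N) = 0$. For all $x \notin N$, the sequence $f_{n_k}(x)$ converges to a limit $f(x)$, and $f$ is measurable, being the limit almost everywhere of a sequence of measurable functions. We need to prove that $f_n \to f$ in the $L^p$-norm. By Fatou's lemma we have
$$\int_X |f_{n_k} - f|^p\, d\mu \le \liminf_{j \to \infty} \int_X |f_{n_k} - f_{n_j}|^p\, d\mu \le \frac{1}{4^{kp}}.$$
Thus $f_{n_k} - f \in L^p$, which implies that $f \in L^p$. Also,
$$\|f_n - f\|_p \le \|f_n - f_{n_k}\|_p + \|f_{n_k} - f\|_p,$$
which can be made as small as we like by choosing $k$ sufficiently big and $n \ge n_k$. Thus $\|f_n - f\|_p \to 0$ as $n \to \infty$.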
EXERCISES
9.1 For each $1 \le p < \infty$, prove that $L^p(X, \mathfrak{A}, \mu)$ is a vector space. (Hint: Closure under scalar multiplication is easy. Show that if $f$ and $g$ are in $L^p$, then $f + g \in L^p$ too. Express $X$ as a disjoint union of two sets, depending on which of the two functions, $|f|$ or $|g|$, is larger.)
9.2 Let $(X, \mathfrak{A}, \mu)$ be a measure space, and suppose that $1 \le p \le q \le \infty$. a) Show that $L^q(X, \mathfrak{A}, \mu) \subseteq L^p(X, \mathfrak{A}, \mu)$ if $\mu(X) < \infty$. b) Show that if $f \in L^p(X, \mathfrak{A}, \mu)$ and if $\mu(X) = 1$, then $\|f\|_p \le \|f\|_q$. (See Exercise 5.13.)
9.3 Let $f$ be measurable on a measure space $(X, \mathfrak{A}, \mu)$. a) If $0 < \mu(X) < \infty$, prove that $\|f\|_p \to \|f\|_\infty$ as $p \to \infty$.$^{84}$ b) If $\mu(X) = \infty$, prove the same conclusion with the additional hypothesis that $f \in L^p$ for some $1 \le p < \infty$.
9.4 Let $n \in \mathbb{N}$. a) Show that $L^q(\mathbb{R}^n) \not\subseteq L^p(\mathbb{R}^n)$, if $1 \le p < q \le \infty$. b) Show that $L^p(\mathbb{R}^n) \not\subseteq L^q(\mathbb{R}^n)$, if $1 \le p < q \le \infty$.
9.2 THE DUAL OF A BANACH SPACE
Let $B$ be a vector space over a field $\mathbb{F}$, which may be either $\mathbb{R}$ or $\mathbb{C}$. Suppose that $B$ is equipped with a norm, as in Definition 5.5.1. We will assume that $B$ is also a Banach space, as in Definition 5.5.6. We have seen that for each real number $p \in [1, \infty)$ and for each measure space $(X, \mathfrak{A}, \mu)$, the space $L^p(X, \mathfrak{A}, \mu)$ is a Banach space.
Definition 9.2.1 If $B$ is a Banach space equipped with a norm $\|\cdot\|$, we call $T : B \to \mathbb{F}$ a linear functional provided that
$$T(\alpha x + y) = \alpha T(x) + T(y)$$
for all $x$ and $y$ in $B$ and for all $\alpha \in \mathbb{F}$. A linear functional $T$ is called continuous at $x$ if and only if, for each sequence $x_n$ in $B$ such that $x_n \to x$, we have $T(x_n) \to T(x)$. $T$ is called continuous if and only if $T$ is continuous at each $x \in B$.
EXAMPLE 9.1
Let $x = (x_1, x_2, \ldots, x_n) \in E^n$, the vector space $\mathbb{R}^n$ equipped with the Euclidean norm,
$$\|x\| = \Big( \sum_{j=1}^{n} x_j^2 \Big)^{1/2}.$$
Let $T_j : E^n \to \mathbb{R}$ by the definition $T_j(x) = x_j$. Then each $T_j$ is a continuous linear functional on $E^n$. We see that the values of the continuous linear functionals $T_j$ at $x$, with $j$ varying from 1 to $n$, determine $x$ uniquely. In any $n$-dimensional Banach space, we could fix a basis arbitrarily and use $T_1(x), \ldots, T_n(x)$ to determine the vector $x$. Such a basis is not canonical, however. A canonical version of this statement could be made as follows.
$^{84}$ This exercise explains why the essential supremum of $f$ is denoted by the symbol $\|f\|_\infty$.
Let $B'$ denote the vector space of all continuous linear functionals on $B$. Then two vectors $x$ and $y$ in $B$ are distinct if and only if there exists $T \in B'$ such that $T(x) \ne T(y)$. This can be expressed in words by saying that there are enough continuous linear functionals on a Banach space $B$ to separate points. The latter statements are a consequence of the Hahn-Banach theorem, which is a very important theorem presented in many books about functional analysis. See, for example, [9]. See also Exercise 9.13.
Lemma 9.2.1 Let $B$ be any Banach space over $\mathbb{F}$, which may be either $\mathbb{R}$ or $\mathbb{C}$, equipped with a norm $\|\cdot\|$. A linear functional $T : B \to \mathbb{F}$ is continuous if and only if $T$ is continuous at 0.
Proof: If $T$ is continuous (at all $x \in B$), then it must be continuous at 0. So we prove the opposite implication. Note that since
$$T(0) = T(0 + 0) = T(0) + T(0),$$
we must have $T(0) = 0$. Suppose $T$ is continuous at 0: That is,
$$\|x_n - 0\| = \|x_n\| \to 0$$
implies that $T(x_n) \to 0 = T(0)$. Let $x \in B$ be arbitrary and suppose that $x_n \to x$: That is, $\|x_n - x\| \to 0$. By hypothesis
$$T(x_n - x) = T(x_n) - T(x) \to 0,$$
so $T(x_n) \to T(x)$.
Definition 9.2.2 A linear functional $T$ on a Banach space $B$ is called bounded if and only if there exists a positive number $K \in \mathbb{R}$ such that $|T(x)| \le K\|x\|$ for all $x \in B$.⁸⁵ It should be stressed that the same constant $K < \infty$ must suffice for all $x \in B$, for $T$ to be bounded. The reader should note that the concept of boundedness for a linear functional does not have the same meaning as the concept of boundedness for a function.
Theorem 9.2.1 If $T$ is a linear functional on a Banach space $B$, then $T$ is continuous if and only if $T$ is bounded.

Proof: In the direction from right to left, suppose that $T$ is bounded. It will suffice to prove that $T$ is continuous at 0. So suppose that $\|x_n\| \to 0$. Then
$$|T(x_n)| \le K\|x_n\| \to 0,$$
so that $T(x_n) \to 0 = T(0)$.

Now suppose that $T$ is continuous. We will prove that $T$ is bounded by contradiction. So suppose that the claim were false. Then for all $n \in \mathbb{N}$ there exists $x_n \in B$ such that
$$|T(x_n)| > n\|x_n\|.$$
Let
$$y_n = \frac{x_n}{\sqrt{n}\,\|x_n\|},$$
where we note that $x_n \ne 0$. Note also that
$$\|y_n\| = \frac{1}{\sqrt{n}} \to 0,$$
yet $T(y_n)$ fails to converge to 0. In fact,
$$|T(y_n)| > \sqrt{n},$$
which is an unbounded sequence. This is a contradiction. ∎

Because of this theorem, continuous linear functionals on normed linear spaces are often called bounded linear functionals.

85 Definition 9.2.2 and Theorems 9.2.1 and 9.2.2 remain valid for any normed vector space, $B$, whether complete or not.
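As a quick illustration of Theorem 9.2.1 (a sketch added for orientation, not part of the original development), the coordinate functionals $T_j$ of Example 9.1 can be checked to be bounded directly. The constant $K = 1$ below is what the Euclidean norm suggests.

```latex
% Coordinate functional T_j(x) = x_j on E^n (Euclidean norm), as in Example 9.1.
\[
  |T_j(x)| = |x_j| = \sqrt{x_j^2} \le \sqrt{\sum_{k=1}^{n} x_k^2} = \|x\|
  \qquad \text{for all } x \in E^n .
\]
% Equality holds at the j-th standard basis vector e_j, because
% T_j(e_j) = 1 = \|e_j\|, so no constant smaller than K = 1 can serve as a bound.
```

By Theorem 9.2.1, boundedness with $K = 1$ yields the continuity of each $T_j$ that was asserted in Example 9.1.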
Definition 9.2.3 Let $B$ denote any (real or complex) Banach space. Let $B'$ be the set of all $T : B \to \mathbb{F}$ such that $T$ is linear and bounded. We call $B'$ the dual space of $B$. If $T \in B'$, it has a norm defined by
$$\|T\| = \sup\{|T(x)| : x \in B,\ \|x\| \le 1\}.$$
See Theorem 9.2.2.
EXERCISES

9.5 Let $f \in L^1(X, \mathfrak{A}, \mu)$ and define $T(f) = \int_X f \, d\mu$. Prove that $T$ is a bounded linear functional. Show that there is a smallest bound $K$ and find it.

9.6 Let $V$ be a finite dimensional real or complex normed vector space. Prove that every linear functional $T : V \to \mathbb{F}$ must be bounded.

9.7 Let $P$ be the vector space of all polynomials on the interval $[0, 1]$. Give an example of a linear functional $T : P \to \mathbb{R}$ such that $T$ is not bounded with respect to the sup-norm.

9.8 Let $T \in B'$. Show that $|T(v)| \le \|T\|\,\|v\|$ for all $v \in B$.
Theorem 9.2.2 If $B$ is any Banach space, then its dual space, $B'$, is a Banach space.

Proof: The reader will recall from a course in linear algebra that the sum of any two linear maps is linear, and that any constant times a linear map is linear, so we will see that $B'$ is a vector space if we can show that the function $\|\cdot\|$ defined on $B'$ is a norm, which will prove also that the sum of two bounded linear functionals is again bounded, and the same for scalar multiples. Observe first that $|T(v)| \le \|T\| \cdot \|v\|$ for all $v \in B$, so that
$$|cT(v)| = |c|\,|T(v)| \le |c| \cdot \|T\| \cdot \|v\|.$$
This implies that $\|cT\| \le |c| \cdot \|T\| < \infty$. But $T = \frac{1}{c}(cT)$ for all $c \ne 0$. Thus
$$\|T\| \le \frac{1}{|c|}\,\|cT\|.$$
Hence $\|cT\| = |c|\,\|T\|$. Observe next that
$$|(T_1 + T_2)(v)| \le |T_1(v)| + |T_2(v)| \le (\|T_1\| + \|T_2\|)\,\|v\|.$$
Thus we see that
$$\|T_1 + T_2\| \le \|T_1\| + \|T_2\|.$$
To complete the proof that $\|\cdot\|$ is a norm on $B'$, the reader should show that $\|T\| \ge 0$ for all $T$ and that $\|T\| = 0$ if and only if $T = 0$.

It remains to be shown that $B'$ is complete in the given norm. Let $T_n$ be any Cauchy sequence in $B'$. Let $\epsilon > 0$. Then there exists $N$ such that if $m$ and $n$ are greater than or equal to $N$, then $\|T_m - T_n\| < \frac{\epsilon}{2}$. Thus, for all $v \in B$,
$$|T_m(v) - T_n(v)| \le \frac{\epsilon}{2}\,\|v\|.$$
Hence $\{T_n(v)\}_{n=1}^{\infty}$ is a Cauchy sequence in $\mathbb{F}$ and we can define
$$T(v) = \lim_{n \to \infty} T_n(v)$$
for all $v \in B$. The proof that $T$ is linear is an informal exercise for the student. Finally, we must show that $T$ is bounded and that $\|T_n - T\| \to 0$. But if $m$ and $n$ are greater than or equal to $N$, we know that
$$|T_m(v) - T_n(v)| \le \frac{\epsilon}{2}\,\|v\|.$$
Letting $n \to \infty$, we see that $|T_m(v) - T(v)| \le \frac{\epsilon}{2}\|v\|$ for all $v \in B$. Thus $\|T_m - T\| \le \frac{\epsilon}{2} < \epsilon$, so $T_m \to T$ in the norm for $B'$. Moreover,
$$\|T\| = \|T_m - (T_m - T)\| \le \|T_m\| + \|T_m - T\| < \infty,$$
so $T$ is bounded as claimed. Thus $B'$ is a Banach space. ∎
Definition 9.2.4 If $f$ is a measurable function on the measure space $(X, \mathfrak{A}, \mu)$, define the essential supremum of $f$, denoted $\|f\|_\infty$, by
$$\|f\|_\infty = \inf\{K \in \mathbb{R} \cup \{\infty\} \mid |f(x)| \le K \text{ a.e.}\}.$$
Define $L^\infty(X, \mathfrak{A}, \mu)$ to be the set of all measurable functions with finite essential supremum.
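A small example may help separate the essential supremum from the ordinary supremum. The choice of Lebesgue measure on $\mathbb{R}$ and of the indicator function of the rationals is an assumption made here purely for illustration.

```latex
% Assumption: X = \mathbb{R} with Lebesgue measure \mu, and f = 1_{\mathbb{Q}}.
\[
  f(x) = 1_{\mathbb{Q}}(x) =
  \begin{cases}
    1 & \text{if } x \in \mathbb{Q}, \\
    0 & \text{if } x \notin \mathbb{Q}.
  \end{cases}
\]
% The ordinary supremum sees the value 1:
\[ \sup_{x \in \mathbb{R}} |f(x)| = 1 . \]
% But \mu(\mathbb{Q}) = 0, so |f(x)| \le 0 almost everywhere and K = 0 is admissible:
\[ \|f\|_\infty = \inf\{K \mid |f(x)| \le K \text{ a.e.}\} = 0 . \]
```

In this example the essential supremum ignores the null set on which $f$ is nonzero, so $f$ has the same essential supremum as the zero function.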
EXERCISES

9.9 Let $(X, \mathfrak{A}, \mu)$ be a measure space, and suppose that $1 < p < \infty$, and that
$$\frac{1}{p} + \frac{1}{q} = 1.$$
Let $g \in L^q(X, \mathfrak{A}, \mu)$, and define $T_g(f) = \int_X f g \, d\mu$ for each $f$ in $L^p(X, \mathfrak{A}, \mu)$.
a) Prove that $T_g$ is a bounded linear functional.
b) Prove that $\|T_g\| \le \|g\|_q$.

9.10 Let $(X, \mathfrak{A}, \mu)$ be a measure space.
a) Prove that $L^\infty(X, \mathfrak{A}, \mu)$ is a Banach space.
b) Prove that for each $g \in L^1(X, \mathfrak{A}, \mu)$, the function
$$T_g(f) = \int_X f g \, d\mu$$
is a bounded linear functional on the Banach space $L^\infty(X, \mathfrak{A}, \mu)$.
c) Prove that $\|T_g\| \le \|g\|_1$.

9.3 THE DUAL SPACE OF $L^p(X, \mathfrak{A}, \mu)$
In Exercises 9.9 and 9.10 we saw that for each $g \in L^q(X, \mathfrak{A}, \mu)$ there is a corresponding bounded linear functional
$$T_g(f) = \int_X f g \, d\mu$$
acting on $L^p(X, \mathfrak{A}, \mu)$, provided⁸⁶ that $1 \le p < \infty$, $1 < q \le \infty$, and that
$$\frac{1}{p} + \frac{1}{q} = 1.$$
In this section we will prove that all bounded linear functionals on $L^p$ arise in this way, thereby characterizing the dual space of $L^p$ in terms of $L^q$.
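The relation between the conjugate exponents $p$ and $q$ is used repeatedly in what follows, so it may be worth recording a few equivalent forms. The identities below are elementary consequences of $\frac{1}{p} + \frac{1}{q} = 1$ for $1 < p < \infty$; the numerical value $p = 3$ is an arbitrary choice made for the check.

```latex
% For 1 < p < \infty with 1/p + 1/q = 1:
\[
  q = \frac{p}{p-1}, \qquad (p-1)(q-1) = 1, \qquad (q-1)\,p = q .
\]
% Check with the (arbitrarily chosen) value p = 3:
\[
  q = \tfrac{3}{2}, \qquad
  (3-1)\left(\tfrac{3}{2}-1\right) = 2 \cdot \tfrac{1}{2} = 1, \qquad
  \left(\tfrac{3}{2}-1\right) \cdot 3 = \tfrac{3}{2} = q .
\]
```

The second and third identities are the ones that make a function of the form $|g|^{q-1}$ lie in $L^p$ exactly when $g$ lies in $L^q$.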
Theorem 9.3.1 Let $(X, \mathfrak{A}, \mu)$ be any $\sigma$-finite measure space. Let $1 \le p < \infty$, and suppose that
$$\frac{1}{p} + \frac{1}{q} = 1.$$
Let $T$ be any bounded linear functional on the Banach space $L^p(X, \mathfrak{A}, \mu)$. Then there exists a unique function $g$ in $L^q(X, \mathfrak{A}, \mu)$ such that
$$T(f) = \int_X f g \, d\mu$$
for all $f \in L^p(X, \mathfrak{A}, \mu)$. Moreover
$$\|T\| = \|g\|_q,$$
and the map $g \to T_g$ is a Banach space isomorphism of $L^q$ onto the dual space of $L^p$.

86 The equation that follows is interpreted informally when $p = 1$ and $q = \infty$.
Remark 9.3.1 This theorem can be described as a representation theorem, because it represents the dual space, $L^p(X)'$, as being isomorphic to $L^q(X)$. The mapping $g \to T_g$ is the isomorphism in the direction from $L^q$ to $(L^p)'$. It is clear that the mapping $g \to T_g$ is linear, so that this mapping will be a Banach space isomorphism if it is onto and norm-preserving (and therefore bijective).

Proof:
1. Suppose that $\mu(X) < \infty$ and that $p > 1$, so that $q < \infty$. We will begin by obtaining the required function $g$ for the case of a bounded real-valued $\mathbb{R}$-linear functional acting on real-valued $L^p$-functions $f$. For each set $E \in \mathfrak{A}$ define $\lambda(E) = T(1_E)$. Note that $1_E \in L^p$ because $\mu(X) < \infty$, so that $1_E$ does lie in the domain of definition of $T$. We will show that $\lambda$ is a countably additive set function (signed measure) on $\mathfrak{A}$. Let
$$E = \bigcup_{j=1}^{\infty} E_j$$
be a disjoint union of sets in $\mathfrak{A}$. Let
$$A_n = \bigcup_{j=1}^{n} E_j.$$
Since $\mu(E) < \infty$, we know that $\mu(E \setminus A_n) \to 0$ as $n \to \infty$. It follows that
$$\|1_{A_n} - 1_E\|_{L^p(X, \mathfrak{A}, \mu)} \to 0$$
as $n \to \infty$. Since $T$ is bounded and thus continuous, we have
$$\lambda(E) = T(1_E) = \lim_n T(1_{A_n}) = \lim_n \lambda(A_n) = \lim_n \sum_{j=1}^{n} \lambda(E_j),$$
proving that $\lambda$ is countably additive on $\mathfrak{A}$. Moreover, if $\mu(E) = 0$, then $\|1_E\|_p = 0$, forcing $T(1_E) = \lambda(E) = 0$. Thus $\lambda$ is absolutely continuous with respect to $\mu$. By the Radon-Nikodym Theorem (8.2.1), there exists a function $g \in L^1(X, \mathfrak{A}, \mu)$ such that
$$T(1_E) = \lambda(E) = \int_X 1_E\, g \, d\mu$$
for all $E \in \mathfrak{A}$. It follows easily that for each $\phi \in \mathfrak{S}_0$, we have
$$T(\phi) = \int_X \phi\, g \, d\mu.$$
Now let $f \in L^p$, and we suppose without loss of generality that $f$ is nonnegative as well. Then there exists a sequence $\phi_n \in \mathfrak{S}_0$ which increases towards $f$ as a pointwise limit almost everywhere. It follows that $\|f - \phi_n\|_p \to 0$ as $n \to \infty$. Since $T$ is continuous, we have
$$T(f) = \lim_n T(\phi_n) = \lim_n \int_X \phi_n\, g \, d\mu = \int_X f g \, d\mu.$$
Thus $T = T_g$ as claimed.

Next, we consider the general case of complex-valued linear functionals. If we restrict $T$ to real $L^p$-functions, we can write $T = \Re(T) + i\,\Im(T)$. We apply each of these two real-valued parts separately to real-valued functions $f$. This produces two results, $g_1$ and $g_2$, as in the first part of this proof. Then we let $g = g_1 + i g_2$, and we see that $T = T_g$, a bounded $\mathbb{C}$-linear functional acting on complex $L^p$-functions.
We need to prove that $g \in L^q$, and that $\|T_g\| = \|g\|_q$. We define a truncated function $f_n$ in such a way that $f_n g$ approximates $|g|^q$ from below as follows:
$$f_n(x) = \begin{cases} |g(x)|^{q-1}\, \overline{\operatorname{sgn} g(x)} & \text{if } |g(x)|^{q-1} \le n, \\ n\, \overline{\operatorname{sgn} g(x)} & \text{if } |g(x)|^{q-1} > n \end{cases}$$
for each $n \in \mathbb{N}$.⁸⁷ Being bounded on a space of finite measure, $f_n$ lies in $L^p$ for each $n \in \mathbb{N}$. Thus
$$|T(f_n)| = \left| \int_X f_n g \, d\mu \right| \le \|T\|\,\|f_n\|_p.$$

87 Here the signum function is understood to mean $\operatorname{sgn}(z) = \frac{z}{|z|}$ if $z \ne 0$, or 0 otherwise.
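The passage from the estimate $|T(f_n)| \le \|T\|\,\|f_n\|_p$ to the bound $\|g\|_q \le \|T\|$ can be sketched as follows. This is one standard route, written under the assumptions of the first case ($\mu(X) < \infty$, $1 < p < \infty$, $T = T_g$); it is not necessarily the exact chain of inequalities intended in the original text.

```latex
% Step 1: f_n g = |f_n| |g| \ge 0, and |f_n| \le |g|^{q-1} with (p-1)(q-1) = 1 gives
\[
  |f_n|^p = |f_n| \cdot |f_n|^{p-1} \le |f_n| \cdot |g|^{(q-1)(p-1)} = |f_n|\,|g| = f_n g .
\]
% Step 2: integrate, then use |T(f_n)| \le \|T\| \|f_n\|_p
% (if \|f_n\|_p = 0 the conclusion of this step is trivial):
\[
  \|f_n\|_p^p \le \int_X f_n g \, d\mu = T(f_n) \le \|T\|\,\|f_n\|_p ,
  \qquad \text{so} \qquad \|f_n\|_p^{\,p-1} \le \|T\| .
\]
% Step 3: |f_n|^p increases to |g|^{(q-1)p} = |g|^q, so monotone convergence yields
\[
  \left( \int_X |g|^q \, d\mu \right)^{(p-1)/p} \le \|T\| ,
  \qquad \text{that is,} \qquad \|g\|_q \le \|T\|
  \quad \text{since } \tfrac{p-1}{p} = \tfrac{1}{q} .
\]
```

In particular the integral of $|g|^q$ is finite, which is the membership $g \in L^q$ required by the theorem.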
Hence
$$\|g\|_q \le \|T\|.$$
The opposite inequality is contained in Exercise 9.9. This completes the proof of the first case.

2. Suppose again that $\mu(X) < \infty$ but that $p = 1$, so that $q = \infty$.
We obtain $Tf = \int_X f g \, d\mu$ as before, with $g \in L^1$. We need to show that $g \in L^\infty$, which will be contained in $L^1$ since $\mu(X) < \infty$. Suppose this were false. Then for each $K > 0$ the set
$$A_K = \{x \mid |g(x)| > K\}$$
has strictly positive measure. Define
$$f_K = \frac{1_{A_K}\, \overline{\operatorname{sgn} g}}{\mu(A_K)},$$
so that $\|f_K\|_1 = 1$, ensuring that $f_K \in L^1(X, \mathfrak{A}, \mu)$. Then we would have
$$|T f_K| = \frac{1}{\mu(A_K)} \int_{A_K} |g| \, d\mu > K$$
for all $K > 0$. This contradicts the boundedness of $T$. Hence $g \in L^\infty$ as claimed. The reader will complete the proof in Exercise 9.11.
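3. Suppose $\mu(X) = \infty$. Since the underlying measure space is $\sigma$-finite, we have
$$X = \bigcup_{k=1}^{\infty} A_k,$$
where $A_k \subseteq A_{k+1}$ for all $k$, and $\mu(A_k) < \infty$. In this case the restriction of $T$ to $L^p(A_k)$ is a bounded linear functional with norm at most $\|T\|$, so that by the first two cases
$$T(f) = \int_{A_k} f g_k \, d\mu$$
for all $f \in L^p(A_k)$. This determines $g_k \in L^q(A_k)$ uniquely almost everywhere in such a way that
$$\|g_k\|_q \le \|T\| \qquad (9.3)$$
for each $k \in \mathbb{N}$. It is clear that $g_k$ extends $g_{k-1}$ for each $k$, so that $g = \lim_k g_k$ exists pointwise. By Exercise 9.12 we know that
$$T(f) = \int_X f g \, d\mu$$
for all $f \in L^p(X, \mathfrak{A}, \mu)$. We need to show that $g \in L^q$. The functions $|g_k|$ increase toward the limit $|g|$. Thus
$$\int_X |g|^q \, d\mu = \lim_k \int_X |g_k|^q \, d\mu$$
because of the Monotone Convergence theorem (Theorem 5.4.1). Thus
$$\int_X |g|^q \, d\mu < \infty$$
if $q < \infty$ because of Inequality (9.3). If $q = \infty$, we have $\|g_k\|_\infty \le \|T\|$ for all $k$. Thus $\|g\|_\infty < \infty$, since the union of countably many null sets is a null set. ∎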
EXERCISES

9.11 Complete the proof of case 2 of Theorem 9.3.1 by proving that $\|T\| = \|g\|_\infty$.

9.12 Complete the proof of case 3 of Theorem 9.3.1 by proving that $T(f) = \int_X f g \, d\mu$ for all $f \in L^p(X, \mathfrak{A}, \mu)$.

for all $h \in L^q(X, \mathfrak{A}, \mu)$, prove that $f = g$ almost everywhere. (See Example 9.1.)
9.4 HILBERT SPACE, ITS DUAL, AND $L^2(X, \mathfrak{A}, \mu)$

Let $(X, \mathfrak{A}, \mu)$ be a $\sigma$-finite measure space. According to Theorem 9.3.1, the space $L^2(X, \mathfrak{A}, \mu)$ can be identified with its own dual space, since $\frac{1}{2} + \frac{1}{2} = 1$. The identification is made as follows. If $T \in (L^2)'$, then there exists $g \in L^2$ such that
$$T(f) = \int_X f g \, d\mu$$
for all $f \in L^2$. The equivalence class of $g$ is uniquely determined by $T$. Moreover,
$$T(f) = (f, \bar{g}),$$
where the Hermitian scalar product
$$(f, g) = \int_X f \bar{g} \, d\mu$$
for all $f$ and $g$ in $L^2$.⁸⁸ We introduce a definition.
Definition 9.4.1 In any vector space $V$ over the scalar field $\mathbb{C}$ of complex numbers, we call a function
$$(\cdot, \cdot) : V \times V \to \mathbb{C}$$
a Hermitian scalar product⁸⁹ if and only if it has the following three properties:⁹⁰

i. $(ax + y, z) = a(x, z) + (y, z)$ for all $a \in \mathbb{C}$ and for all $x$, $y$, and $z$ in $V$. (Linearity in the First Variable)

ii. $(x, y) = \overline{(y, x)}$ for all $x$ and $y$ in $V$. (Conjugate Symmetry)

iii. $(x, x) \ge 0$ for all $x \in V$, and $(x, x) = 0 \iff x = 0 \in V$. (Positive Definiteness)

A vector space $V$, equipped with a Hermitian inner product, is called a Hermitian inner product space. If a Hermitian inner product space $\mathcal{H}$ is complete with respect to the corresponding norm, as defined in Theorem 9.4.1, then it is called a Hilbert space.
Theorem 9.4.1 In a complex vector space $V$ equipped with a Hermitian scalar product, we define
$$\|x\| = \sqrt{(x, x)} \qquad (9.4)$$
for all vectors $x \in V$.⁹¹ The function $\|\cdot\|$ as defined in Equation (9.4) is a norm, as in Definition 5.5.1, where scalars $c$ are taken as complex and $|c|$ is interpreted as the modulus of $c$. Moreover the Cauchy-Schwarz Inequality is satisfied:
$$|(x, y)| \le \|x\|\,\|y\|.$$

88 We express the scalar product here in the proper form for complex-valued functions. If we were dealing only with real-valued functions, then the conjugation sign could be omitted, and the scalar product would be symmetric, not conjugate symmetric.
89 Another name for this is Hermitian inner product.
90 Part of this section is adapted from [20].
91 Not all norms in vector spaces can be given by a scalar product. See Exercise 9.18.
Proof: To prove the Cauchy-Schwarz inequality, we fix $x$ and $y$, and we proceed as follows. If $x = 0$, the Cauchy-Schwarz inequality is trivial. So, suppose $x \ne 0$. Observe that $(cx + y, cx + y) \ge 0$ for all $c \in \mathbb{C}$. By linearity of the scalar product in the first variable and conjugate linearity in the second variable, we see that
$$|c|^2 \|x\|^2 + 2\,\Re\bigl(c\,(x, y)\bigr) + \|y\|^2 \ge 0$$
for all $c \in \mathbb{C}$. (If $z = a + ib \in \mathbb{C}$, we denote the real part of $z$ by $\Re z = a$ and the imaginary part of $z$ by $\Im z = b$.) Let
$$c = -\frac{\overline{(x, y)}}{\|x\|^2}.$$
An easy calculation shows that $|(x, y)|^2 \le \|x\|^2 \|y\|^2$. The first two conditions of Definition 5.5.1 are easily verified for $\|\cdot\|$. The third condition, the triangle inequality, is left for Exercise 9.14. ∎
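The "easy calculation" in the proof of the Cauchy-Schwarz inequality can be written out as follows. The particular choice $c = -\overline{(x, y)}/\|x\|^2$ is the standard one and is assumed here; any other choice of $c$ that leads to the same conclusion would serve equally well.

```latex
% Assumption: x \ne 0 and c = -\overline{(x,y)} / \|x\|^2.
\[
  0 \le (cx + y,\, cx + y) = |c|^2 \|x\|^2 + 2\,\Re\bigl(c\,(x,y)\bigr) + \|y\|^2 .
\]
% With this c:  |c|^2 \|x\|^2 = |(x,y)|^2 / \|x\|^2  and  c\,(x,y) = -|(x,y)|^2 / \|x\|^2 .
\[
  0 \le \frac{|(x,y)|^2}{\|x\|^2} - \frac{2\,|(x,y)|^2}{\|x\|^2} + \|y\|^2
    = \|y\|^2 - \frac{|(x,y)|^2}{\|x\|^2} .
\]
% Multiplying through by \|x\|^2 > 0:
\[ |(x,y)|^2 \le \|x\|^2 \, \|y\|^2 . \]
```

Taking square roots gives the inequality in the form stated in Theorem 9.4.1.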
Corollary 9.4.1 If $V$ is a Hermitian inner product space, then the mapping $T_w$ defined by
$$T_w(v) = (v, w)$$
for each $v \in V$, is a bounded linear functional on $V$.

Proof: This follows immediately from the Cauchy-Schwarz inequality. ∎
Theorem 9.4.2 Let $\mathcal{H}$ be any Hilbert space. If a mapping $T$ of a Hilbert space $\mathcal{H}$ to the scalar field is a bounded linear functional, then there exists a unique element $w \in \mathcal{H}$ such that $T(v) = (v, w)$ for each $v \in \mathcal{H}$. Thus the dual space of $\mathcal{H}$ is identified with $\mathcal{H}$ itself.

Proof: If $T = 0$, then we can take $w = 0$, and that is the only choice that suffices. Suppose $T \ne 0 \in \mathcal{H}'$. It is easy to see that $\ker(T)$, the kernel of the linear transformation $T$, is a closed, proper subspace of $\mathcal{H}$. Moreover, if $x \notin \ker(T)$, then any other such vector $x'$ must be congruent to a scalar multiple of $x$ modulo $\ker(T)$. Thus the codimension of $\ker(T)$ is 1, and $\mathcal{H} = \ker(T) \oplus \mathbb{C}x$. Suppose for the moment that we can choose $x$ to be a unit vector orthogonal to $\ker(T)$ in the sense of the Hermitian scalar product of $x$ with any vector from $\ker(T)$ being zero. Then we can check readily that
$$T(v) = \bigl(v, \overline{T(x)}\,x\bigr),$$
so that we can take $w = \overline{T(x)}\,x$. The reader should verify that this mapping is well defined, since a unit vector orthogonal to $\ker(T)$ is defined only up to a complex scalar factor of modulus one. It remains only to show that if $V$ is a closed, proper, nontrivial subspace of $\mathcal{H}$, then there exists a nonzero vector $x \in V^\perp$, the subspace of all vectors orthogonal to each vector in $V$. The latter space is known also as the orthogonal complement of $V$. The reader will prove this in Exercise 9.19.

92 The choice of the second variable of the scalar product as the proper location for the vector that represents the action of $T$ on $v$ assures that the action on $v$ is linear, in the environment of a complex Hilbert space.
Remark 9.4.1 The map that carries $y \to T_y$ is an isomorphism of $\mathcal{H}$ onto its dual $\mathcal{H}'$ if $\mathcal{H}$ is a Hilbert space over the field $\mathbb{R}$ but not if the field is $\mathbb{C}$. It is easy to check that the mapping $y \to T_y$ is linear over the real field. But over the complex field $T_{cy} = \bar{c}\,T_y$, which causes the map $y \to T_y$ to fail to be linear. If $\mathcal{H}$ happens to be $L^2(X, \mathfrak{A}, \mu)$, then over either field, $y \to T_y$ preserves the Hilbert space norm because of Theorem 9.3.1. The latter theorem does provide a Banach space isomorphism between $L^2$ and its dual. But in the present discussion we are working in the context of an abstract, complex Hilbert space, and for this reason $T_y$ is being defined by means of the Hermitian scalar product. We can make $\mathcal{H}'$ into a Hilbert space in its own right, by introducing the following scalar product on $\mathcal{H}'$:
$$(T_y, T_z)_{\mathcal{H}'} = (z, y).$$
The reader can check easily that $(\cdot, \cdot)_{\mathcal{H}'}$ is a Hermitian scalar product.⁹³
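The conjugate-linearity mentioned in Remark 9.4.1, and the reason the arguments are interchanged in the scalar product on the dual space, can be seen in two short computations. They assume, as in Corollary 9.4.1, that $T_y(v) = (v, y)$.

```latex
% T_y(v) = (v, y). Conjugate symmetry and linearity in the first variable give
\[
  T_{cy}(v) = (v, cy) = \overline{(cy, v)} = \overline{c\,(y, v)}
            = \bar{c}\,(v, y) = \bar{c}\,T_y(v) .
\]
% Consequently c\,T_y = T_{\bar{c} y}, and the scalar product
% (T_y, T_z)_{\mathcal{H}'} = (z, y) is linear in its first variable:
\[
  (c\,T_y, T_z)_{\mathcal{H}'} = (T_{\bar{c} y}, T_z)_{\mathcal{H}'}
  = (z, \bar{c}\,y) = c\,(z, y) = c\,(T_y, T_z)_{\mathcal{H}'} .
\]
```

Had the scalar product on the dual been defined as $(y, z)$ instead of $(z, y)$, the same computation would produce the factor $\bar{c}$, and linearity in the first variable would fail.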
Definition 9.4.2 An orthonormal subset $E = \{e_\alpha \mid \alpha \in A\} \subset \mathcal{H}$, a Hermitian inner product space, is a set with the property that
$$(e_\alpha, e_\beta) = \begin{cases} 1 & \text{if } \alpha = \beta, \\ 0 & \text{if } \alpha \ne \beta. \end{cases}$$
The set $A$ is called an index set. If $f \in \mathcal{H}$, a Hilbert space with an orthonormal set indexed by $A$, then we define an abstract Fourier transform $\hat{f} : A \to \mathbb{C}$ by
$$\hat{f}(\alpha) = (f, e_\alpha).$$
The reader should note that the Fourier transform, as a function on the index set $A$, is dependent upon the choice of an indexed orthonormal set $E$. Observe that if $\hat{f}(\alpha) \ne 0$ for uncountably many values of $\alpha \in A$, then there could be no finite upper bound on the set of all possible finite sums of squares of the form $|\hat{f}(\alpha)|^2$. The next theorem will establish that for each $f \in \mathcal{H}$, we can have $\hat{f}(\alpha) \ne 0$ for at most countably many values of $\alpha$.
Theorem 9.4.3 (Bessel's Inequality) Let $f \in \mathcal{H}$, a Hilbert space with an orthonormal set $E$, indexed by $A$. Then for each finite set $F \subseteq A$, we have Bessel's inequality:
$$\sum_{\alpha \in F} |\hat{f}(\alpha)|^2 \le \|f\|^2 \qquad (9.5)$$
with the right side being necessarily finite. Moreover, we have
$$\sum_{\alpha \in A} |\hat{f}(\alpha)|^2 \le \|f\|^2.$$

Proof: We calculate carefully the following nonnegative Hermitian scalar product:
$$0 \le \Bigl(f - \sum_{\alpha \in F} \hat{f}(\alpha) e_\alpha,\ f - \sum_{\alpha \in F} \hat{f}(\alpha) e_\alpha\Bigr) = \|f\|^2 - \sum_{\alpha \in F} |\hat{f}(\alpha)|^2.$$
Thus the finite partial sums of the terms $|\hat{f}(\alpha)|^2$ are all bounded above by $\|f\|^2$, and this yields Bessel's inequality. ∎

93 One can interpret the scalar product $(\cdot, \cdot)_{\mathcal{H}'}$ as providing an isomorphism of the Hilbert space $\mathcal{H}$ onto its own second dual space, $\mathcal{H}'' = (\mathcal{H}')'$. A Hilbert space is an example of a Banach space $B$ for which $B''$ is isomorphic to $B$. Such Banach spaces are called reflexive. It is not hard to see that Theorem 9.3.1 establishes the reflexivity of the Banach spaces $L^p(X, \mathfrak{A}, \mu)$, provided that $1 < p < \infty$.
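The scalar product computed in the proof of Bessel's inequality can be expanded term by term. The sketch below assumes that the abstract Fourier transform is given by $\hat{f}(\alpha) = (f, e_\alpha)$ and that $F$ is a finite subset of the index set; the auxiliary vector $g$ is notation introduced only for this computation.

```latex
% Notation (introduced here): g = \sum_{\alpha \in F} \hat{f}(\alpha) e_\alpha,
% with \hat{f}(\alpha) = (f, e_\alpha) and F finite.
\[
  (f, g) = \sum_{\alpha \in F} \overline{\hat{f}(\alpha)}\,(f, e_\alpha)
         = \sum_{\alpha \in F} |\hat{f}(\alpha)|^2 ,
  \qquad
  (g, f) = \overline{(f, g)} = \sum_{\alpha \in F} |\hat{f}(\alpha)|^2 ,
\]
\[
  (g, g) = \sum_{\alpha, \beta \in F} \hat{f}(\alpha)\,\overline{\hat{f}(\beta)}\,(e_\alpha, e_\beta)
         = \sum_{\alpha \in F} |\hat{f}(\alpha)|^2 .
\]
% Therefore
\[
  0 \le (f - g,\, f - g) = \|f\|^2 - (f, g) - (g, f) + (g, g)
    = \|f\|^2 - \sum_{\alpha \in F} |\hat{f}(\alpha)|^2 .
\]
```

Orthonormality is used only in the evaluation of $(g, g)$, where all cross terms with $\alpha \ne \beta$ vanish.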
Definition 9.4.3 A Hilbert space is called separable provided that it has a countable orthonormal subset $E = \{e_n \mid n \in \mathbb{N}\}$, with respect to which the Plancherel identity is satisfied:
$$\|f\|^2 = \sum_{n \in \mathbb{N}} |\hat{f}(n)|^2$$
for each $f \in \mathcal{H}$. The subset $E$ is then called a countable orthonormal basis for $\mathcal{H}$.
Theorem 9.4.4 Let $\mathcal{H}$ be a separable Hilbert space with an orthonormal basis $E = \{e_n\}$, indexed by $\mathbb{N}$. Then we have the following conclusion for each $f \in \mathcal{H}$. We have
$$f = \sum_{n \in \mathbb{N}} c_n e_n,$$
in the sense of convergence with respect to the Hilbert space norm, if and only if each complex coefficient $c_n = \hat{f}(n)$.

Proof: In one direction the theorem follows immediately from Bessel's inequality. For the uniqueness of the coefficients $c_n$, expand $(f, e_n)$ as an infinite series, using the continuity of the bounded linear functional on $\mathcal{H}$ that is determined by $e_n$ with respect to the Hilbert space norm.⁹⁴

The reader will see in Exercise 9.15 that a Hilbert space is separable if and only if it has a countable, dense subset. The reader will show also that $L^2(\mathbb{R})$ is a separable Hilbert space. It is a consequence of Bessel's inequality that for each element $f$ of a separable Hilbert space $\mathcal{H}$, $\hat{f}$ lies in $\ell^2$, the separable Hilbert space of square summable sequences, which the reader will study in Exercise 9.16. Thus the Fourier transform as an operator can be viewed as a mapping
$$\hat{\ } : \mathcal{H} \to \ell^2.$$
The reader will show in Exercise 9.17 that the Fourier transform is a Hilbert space isomorphism from $\mathcal{H}$ to $\ell^2$. Thus all separable Hilbert spaces are isomorphic to one another.

94 Consider the Cauchy-Schwarz inequality.

EXERCISES

9.14 If $\|x\|$ is defined by means of a Hermitian inner product, prove the triangle inequality:
$$\|x + y\| \le \|x\| + \|y\|.$$
(Hint: Use the Cauchy-Schwarz inequality.)
9.15 Let $\mathcal{H}$ be a Hilbert space.
a) Prove that $\mathcal{H}$ is separable if and only if it possesses a countable, dense subset. That is, $\mathcal{H}$ is separable if and only if there is a countable subset $S$, the closure $\bar{S}$ of which is the whole space $\mathcal{H}$. (For the proof of sufficiency, use a countable iteration of the Gram-Schmidt orthonormalization⁹⁵ process from linear algebra.)
b) Prove that $L^2(\mathbb{R})$ is a separable Hilbert space. (Hint: Let the set $S_{\mathbb{Q}}$ be the set of all step functions $\sigma$ with rational values, and with the property that $\sigma^{-1}(q)$ is an interval with rational endpoints for each $q \in \mathbb{Q}$. Show that $S_{\mathbb{Q}}$ is dense in $L^2(\mathbb{R})$.)

9.16 Show that the square-summable sequence space
$$\ell^2 = \Bigl\{ x = (x_n)_{n \in \mathbb{N}} \Bigm| x_n \in \mathbb{C},\ \sum_{n \in \mathbb{N}} |x_n|^2 < \infty \Bigr\}$$
is a separable Hilbert space. (Hint: Show that $\ell^2 = L^2(X, \mathfrak{A}, \nu)$ for a suitable choice of $\sigma$-finite measure space $(X, \mathfrak{A}, \nu)$.)

9.17 We show in this exercise that all separable Hilbert spaces are isomorphic to $\ell^2$.
a) Prove the following generalization of the Plancherel identity, called Parseval's identity: for each $f$ and $g$ in a separable Hilbert space $\mathcal{H}$, we have⁹⁶
$$(f, g) = \sum_{n \in \mathbb{N}} \hat{f}(n)\,\overline{\hat{g}(n)}.$$
Parseval's identity shows that the Fourier transform preserves the Hermitian scalar product. (Hint: Apply the Plancherel identity to $(f + g, f + g)$ to prove that
$$\Re(f, g) = \Re(\hat{f}, \hat{g}).$$
Then do something similar with $if$ to obtain the desired conclusion.)
b) Prove that the Fourier transform is an isomorphism of a separable Hilbert space $\mathcal{H}$ onto $\ell^2$. That is, show that the Fourier transform is a linear bijection that preserves the scalar product and the norm.

95 See, for example, [12].
96 In a separable Hilbert space, we assume that a countable orthonormal basis has been chosen, and that the abstract Fourier transform has been defined with respect to that basis.
9.18 Let $V$ be a real or complex vector space.
a) Suppose $V$ is equipped with an inner product $(\cdot, \cdot)$, and suppose we define a corresponding norm by $\|x\| = \sqrt{(x, x)}$. Prove the Parallelogram Law:
$$\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2.$$
b) Prove that the taxicab norm,⁹⁷ $\|x\|_t = |x_1| + |x_2|$, does not correspond, as in 9.18.a, to any inner product on $\mathbb{R}^2$.
c) Under the hypotheses of 9.18.a, prove the identity
$$\Re(x, y) = \frac{\|x + y\|^2 - \|x - y\|^2}{4},$$
with the understanding that in a real inner product space the real part of the scalar product is the inner product. In a Hermitian inner product space, show that one can express $\Im(x, y) = -\Re(ix, y)$.
d) Suppose only that $V$ is a real or complex vector space with a norm. Define what is hoped to be a scalar product on $V$ by the formulas in 9.18.c. Prove that this defines a legitimate scalar product⁹⁸ on $V$, provided that the norm satisfies the Parallelogram Law of 9.18.a.

9.19 Prove that if $V$ is a closed, proper, nontrivial subspace of an arbitrary Hilbert space $\mathcal{H}$, then there exists a nontrivial proper closed subspace $V^\perp$, each vector of which is orthogonal to each vector of $V$, and $\mathcal{H} = V \oplus V^\perp$. This can be done by the following sequence of steps.
a) Let $x \notin V$ and let $d = \inf\{\|x - y\| \mid y \in V\}$. Show that $d > 0$.
b) Pick a sequence $y_n \in V$ such that $\|x - y_n\| \to d$ as $n \to \infty$. Apply the Parallelogram Law (from Exercise 9.18.a) to the sequences $x - y_n$ and $x - y_m$, to prove that $y_n$ is a Cauchy sequence.
c) Prove that $y = \lim_{n \to \infty} y_n$ has the property that $x - y$ is orthogonal to $V$ and is nonzero.
d) Denote the orthogonal complement of $V$ by
$$V^\perp = \{w \in \mathcal{H} \mid (v, w) = 0 \ \forall v \in V\}.$$
Prove that $V^\perp$ is a closed subspace of $\mathcal{H}$ and that
$$\mathcal{H} = V \oplus V^\perp.$$

97 The taxicab norm is defined in many advanced calculus books, such as [20].
98 For this and generalizations, see [13].
9.5 RIESZ-MARKOV-SAKS-KAKUTANI THEOREM

Early in the twentieth century, Frigyes Riesz discovered a full classification (or representation) of the dual space for the vector space $C[a, b]$, consisting of the continuous functions on $[a, b]$ and equipped with the $L^\infty$-norm, or sup-norm. This space, with the given norm, is a complete normed linear space: It is a Banach space in modern terminology. Riesz proved that each bounded linear functional on that space can be described as
$$T(f) = \int_a^b f \, d\mu,$$
using a suitable finite Borel measure $\mu$.⁹⁹ Later, Andrei A. Markov extended this theorem to the compactly supported functions on the infinite real line. A version for $C(X)$, with the hypothesis that $X$ is a compact metric space, was proven by Stanislaw Saks.¹⁰⁰ And Shizuo Kakutani¹⁰¹ generalized the theorem to cover the vector space of all continuous functions on any compact Hausdorff space. We will prove the Kakutani version of the theorem. The proof we give is adapted slightly from the one presented by Kakutani.¹⁰²

In order to be able to integrate continuous functions on a compact topological space, it is natural to use the field generated by either the open or the closed subsets for elementary sets. It will make no difference which is chosen since we are dealing with a field of sets. Either way, a continuous function will be measurable.
Definition 9.5.1 A measure $\mu$ on the Borel field $\mathfrak{B}$ generated by the open subsets of a compact space is called regular, provided that for each $E \in \mathfrak{B}$ and for each $\epsilon > 0$, we have a closed set $F \subseteq E$ and an open set $G \supseteq E$ such that
$$\mu(G \setminus F) < \epsilon.$$
If $\mu$ is a signed Borel measure, then we call $\mu$ regular provided that the variation $|\mu|$ (from Definition 8.1.2) is regular, under the definition for positive Borel measures. This is equivalent to both $\mu^+$ and $\mu^-$ being regular.
Theorem 9.5.1 (Riesz-Markov-Saks-Kakutani) Let $X$ be any compact Hausdorff space, and let $C(X)$ be the vector space of all continuous real-valued functions on $X$, equipped with the $L^\infty$-norm. Then the dual space $C(X)'$ is isomorphic as a Banach space to the space $M(X)$ of all finite, regular signed measures defined on the Borel field generated by the open (or, equivalently, the closed) sets. The space $M(X)$ will be equipped with the total variation norm, denoted by $\|\mu\|$.¹⁰³ The correspondence is given by
$$\mu \to T_\mu, \qquad T_\mu(f) = \int_X f \, d\mu.$$

99 Actually, Riesz expressed this representation as a Stieltjes integral with respect to a function of bounded variation. See, for example, [21] or [20].
100 See [23].
101 Kakutani produced this theorem as part of a paper [14] that provides a classification of all objects known as Banach lattices.
102 The author is calling the theorem after all four mathematicians of whose contributions he is aware. However, it is common to see this theorem named after the first two of these authors.

Proof:
i. We will begin by proving that the map $\mu \to T_\mu$ is both isometric and injective. Let $\mu \in M(X)$ with norm $\|\mu\|$, as in Exercise 8.3. Define
$$T_\mu(f) = \int_X f \, d\mu.$$
We see that $|T_\mu(f)| \le \|f\|_\infty \|\mu\|$, so that
$$\|T_\mu\| \le \|\mu\|.$$
We would like to show equality, however, for these two norms. For this purpose, write the Hahn decomposition $X = P \cup N$, as in Theorem 8.12. Let $\epsilon > 0$. There exist compact sets $K_P$ and $K_N$ and open sets $O_P$ and $O_N$ such that $K_P \subseteq P \subseteq O_P$ and $K_N \subseteq N \subseteq O_N$ and such that
$$|\mu|(O_P \setminus K_P) < \epsilon \quad \text{and} \quad |\mu|(O_N \setminus K_N) < \epsilon.$$
It follows from Urysohn's lemma that there exists a continuous, real-valued function $f$ with $\|f\|_\infty = 1$ such that
$$f : K_P \setminus O_N \to 1 \quad \text{and} \quad f : K_N \setminus O_P \to -1.$$
Observe that
$$|\mu|(K_P \setminus O_N) \ge \mu(P) - 2\epsilon \quad \text{and} \quad |\mu|(K_N \setminus O_P) \ge |\mu|(N) - 2\epsilon.$$
Thus
$$\left| \int_X f \, d\mu \right| \ge \|\mu\| - 8\epsilon$$
for each $\epsilon > 0$. Therefore $\|T_\mu\| \ge \|\mu\|$, making both norms equal.

Suppose now that $T$ could be expressed by two regular Borel measures, $\mu$ and $\nu$. Then $T - T = 0$ would correspond to $\mu - \nu$, and this together with the first part of the proof implies that $\|\mu - \nu\| = 0$. Hence $\mu = \nu$. It follows that the mapping from $M(X) \to C(X)'$ is an injection.

103 See Definition 8.1.2 for $\|\mu\|$.
Remark 9.5.1 The main part of the proof is to show that the injection, $\mu \to T_\mu$, is also a surjection. The reason this is difficult to prove is that we are given a bounded linear map $T : C(X) \to \mathbb{R}$, and we would like to determine a measure that could describe $T$ in the manner given above. The first guess could be that if $E \in \mathfrak{B}$, we should let $\mu(E) = T(1_E)$. This would not make sense, however, because the function $1_E$ is usually not continuous, for which reason $T(1_E)$ is undefined.

ii. We will show how we can construct a suitable measure, given a bounded linear functional $T$, with the temporary simplifying assumption that $T$ is a positive operator, as in the following definition.

Definition 9.5.2 A bounded linear functional $T \in C(X)'$ is called positive if and only if $Tf \ge 0$ for each function $f$ that is everywhere nonnegative.

The reader will need to prove, in Exercise 9.20, that each positive linear functional is also monotone, as explained in that exercise. The restriction to positive operators will be lifted in the final part of the proof. Since $T$ has a positive, finite norm, we can assume that $\|T\| = 1$ simply by adjusting $T$ by a scalar factor.¹⁰⁴ We make this assumption for simplicity in what follows. We can use $T$ to define a nonnegative function $\mu$ on the family $\mathfrak{O}$ of all open sets by
$$\mu(O) = \sup\{Tf \mid 0 \le f \le 1_O,\ f \in C(X)\}. \qquad (9.7)$$
Here $1_O$ is the indicator function of the set $O$.
Lemma 9.5.1 The set function $\mu$ has the following properties on the set $\mathfrak{O}$ of all open sets:

(a) (Null Set Additivity) $\mu(\emptyset) = 0$.

(b) (Monotonicity) $O_1 \subseteq O_2$ implies that $\mu(O_1) \le \mu(O_2)$.

(c) (Subadditivity) $\mu(O_1 \cup O_2) \le \mu(O_1) + \mu(O_2)$.

Proof: The first two parts we leave as an informal exercise for the reader. For the third part we reason as follows.

Let $\epsilon > 0$. Then there exists $f \in C(X)$ with $0 \le f \le 1_{O_1 \cup O_2}$ such that
$$Tf > \mu(O_1 \cup O_2) - \epsilon.$$
Let $C = f^{-1}[\epsilon, 1]$, so that $C$ is a compact subset of $O_1 \cup O_2$. We claim that we can write $C$ as the (not necessarily disjoint) union of closed (and thus compact) sets, $C_1$ and $C_2$, such that $C_1 \subset O_1$ and $C_2 \subset O_2$.

104 We assume that $T \ne 0$, the zero operator, since Theorem 9.5.1 would be trivial in that case.
In fact, the boundary set $\partial(O_1 \cap C) \setminus O_1$ and the boundary set $\partial(O_2 \cap C) \setminus O_2$ are mutually disjoint subsets of $C \cap O_2$ and $C \cap O_1$, respectively. Since the space $X$ is both compact and Hausdorff, there exist mutually disjoint open sets $U_2 \subseteq O_2$ and $U_1 \subseteq O_1$ such that
$$\partial(O_1 \cap C) \setminus O_1 \subseteq U_2$$
and
$$\partial(O_2 \cap C) \setminus O_2 \subseteq U_1.$$
We can select $C_1 = C \cap O_1 \setminus U_2$ and $C_2 = C \cap O_2 \setminus U_1$.

Thus there exist continuous functions $f_1$ and $f_2$ such that $f_1 \equiv 1$ on $C_1$ and $0 \le f_1 \le 1_{O_1}$, and $f_2 \equiv 1$ on $C_2$ and $0 \le f_2 \le 1_{O_2}$. It follows from the definition of the set $C$ that
$$f \le f_1 + f_2 + \epsilon.$$
Also, $T(f_1) \le \mu(O_1)$ and $T(f_2) \le \mu(O_2)$. It follows that
$$\mu(O_1 \cup O_2) - \epsilon < Tf \le T(f_1) + T(f_2) + \epsilon \le \mu(O_1) + \mu(O_2) + \epsilon, \qquad (9.8)$$
since we are assuming that $\|T\| = 1$. Because the inequalities (9.8) hold for all $\epsilon > 0$, the lemma has been proven. ∎

Our next task is to extend the domain of $\mu$ to $\mathfrak{P}(X)$, which we do as follows. We let
$$\mu^*(E) = \inf\{\mu(O) \mid O \supseteq E,\ O \in \mathfrak{O}\} \qquad (9.9)$$
for each subset $E$ of $X$. Observe that for an open set $O$ we have $\mu(O) = \mu^*(O)$.

Lemma 9.5.2 The function $\mu^*$ is a Carathéodory outer measure on the $\sigma$-field $\mathfrak{P}(X)$, the power set of $X$.

Proof: We need to show that $\mu^*(\emptyset) = 0$, and that $\mu^*$ is monotone and countably subadditive. The first requirement is a simple consequence of the fact that $\emptyset$ is open, and $1_\emptyset \equiv 0$ on $X$. Monotonicity follows immediately from the definition of $\mu^*$ as an infimum. Finally, if $E_i \in \mathfrak{P}(X)$ for each $i \in \mathbb{N}$, let $\epsilon > 0$, and take an open set $O_i \supseteq E_i$ such that
$$\mu(O_i) < \mu^*(E_i) + \frac{\epsilon}{2^i}.$$
It follows from Lemma 9.5.1 that for each $n \in \mathbb{N}$ we have
$$\mu\Bigl(\bigcup_{i=1}^{n} O_i\Bigr) \le \sum_{i=1}^{n} \mu(O_i) \le \sum_{i=1}^{\infty} \mu^*(E_i) + \epsilon.$$
Since this is true for all $\epsilon > 0$,
$$\mu^*\Bigl(\bigcup_{i=1}^{\infty} E_i\Bigr) \le \sum_{i=1}^{\infty} \mu^*(E_i),$$
and this proves the lemma. ∎

The next lemma will imply that each Borel set is $\mu^*$-measurable.
Lemma 9.5.3 Each open set $O$ is $\mu^*$-measurable.

Proof: Since we know that $\mu^*$ is subadditive, it will suffice to prove for each $E \subset X$ that
$$\mu^*(E) \ge \mu^*(E \cap O) + \mu^*(E \cap O^c).$$
As in the previous lemma, Equation (9.9) tells us that it will suffice to prove the inequality for every open set $E$. To prove this, we proceed as follows. Let $\epsilon > 0$. Then there exists $f \in C(X)$ with $0 \le f \le 1_{E \cap O}$ such that
$$Tf > \mu(E \cap O) - \epsilon.$$
Let $C = f^{-1}[\epsilon, 1]$, which is a closed subset of $E \cap O$. Similarly, there exists $g \in C(X)$ with $0 \le g \le 1_{E \setminus C}$ and $Tg > \mu(E \setminus C) - \epsilon$. Thus
$$0 \le f + g \le 1_E + \epsilon,$$
and this implies that $Tf + Tg \le \mu(E) + \epsilon$. Thus
$$\mu^*(E \cap O^c) + \mu^*(E \cap O) - 2\epsilon \le \mu(E \setminus C) + \mu(E \cap O) - 2\epsilon < \mu(E) + \epsilon,$$
which proves the lemma since $\epsilon > 0$ is arbitrary. ∎
That $\mu^*$ is a regular Carathéodory outer measure follows from Exercise 2.20. It follows that $\mu^*$ defines a regular measure on the $\sigma$-algebra of all $\mu^*$-measurable subsets of $X$.
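Lemma 9.5.4 For each $f \in C(X)$, we have
$$\int_X f \, d\mu = Tf.$$

Proof: We may assume, without loss of generality, that $f$ is not the constant zero function. Let $\epsilon > 0$, and pick a finite sequence of numbers $\alpha_i$ such that
$$\alpha_0 < \alpha_1 < \cdots < \alpha_n$$
and such that $\alpha_0 < -\|f\|_\infty < \|f\|_\infty < \alpha_n$ and $\alpha_i - \alpha_{i-1} < \epsilon$ for each $i$. Define the open set
$$E_i = f^{-1}(\alpha_i, \infty),$$
for $i = 0, 1, 2, \ldots, n$. Define the functions
$$f_i(t) = \begin{cases} 0 & \text{if } f(t) < \alpha_{i-1}, \\ \dfrac{f(t) - \alpha_{i-1}}{\alpha_i - \alpha_{i-1}} & \text{if } \alpha_{i-1} \le f(t) \le \alpha_i, \\ 1 & \text{if } f(t) > \alpha_i \end{cases}$$
for $i = 1, \ldots, n$. Thus $f_i(t) \le 1$ if $t \in E_{i-1}$, $f_i(t) = 0$ if $t \in E_{i-1}^c$, and
$$f(t) = \alpha_0 + \sum_{i=1}^{n} (\alpha_i - \alpha_{i-1}) f_i(t).$$
Hence $T(f_i) \le \mu(E_{i-1})$, by Equation (9.7). Also, $\mu(E_0) = 1$, and $\mu(E_n) = 0$. Thus
$$Tf \le \alpha_0 + \sum_{i=1}^{n} (\alpha_i - \alpha_{i-1})\,\mu(E_{i-1}) = \sum_{i=1}^{n} \alpha_i \bigl(\mu(E_{i-1}) - \mu(E_i)\bigr) \le \sum_{i=1}^{n} \alpha_{i-1} \bigl(\mu(E_{i-1}) - \mu(E_i)\bigr) + \epsilon \le \int_X f \, d\mu + \epsilon, \qquad (9.10)$$
where we use again the assumption that $\|T\| \le 1$, making $\mu(X) \le 1$ as well. Since the inequalities (9.10) are true for all $\epsilon > 0$ we have $Tf \le \int_X f \, d\mu$. But the opposite inequality follows if we replace $f$ by $-f$ and use the linearity of $T$. This proves the lemma. ∎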
iii. There is only one final step in the proof of Theorem 9.5.1. We need to prove that we can separate a general bounded linear functional $T \in C(X)'$ into the difference of two positive parts. This is accomplished in the next lemma, which is sometimes also called a Riesz representation theorem, as in [24], where it is proven in broader generality.
Lemma 9.5.5 If $T \in C(X)'$, then there exist two positive bounded linear functionals, $T^+$ and $T^-$, such that $T = T^+ - T^-$ and
$$\|T\| = T^+(1) + T^-(1).$$
Proof: Note that $T$ is not necessarily positive. Suppose $f \ge 0$, and define
$$T^+ f = \sup_{0 \le \phi \le f} T(\phi).$$
It follows that $T^+ f \ge Tf$, and $T^+(cf) = cT^+ f$ for all $c > 0$. We will need to prove that $T^+$ is additive as part of the work of proving that it has a linear extension. Now let $f$ and $g$ both be nonnegative with $0 \le \phi \le f$ and $0 \le \psi \le g$. Thus $0 \le \phi + \psi \le f + g$, so that $T^+(f + g) \ge T\phi + T\psi$. Hence
$$T^+(f + g) \ge T^+ f + T^+ g.$$
On the other hand, if $0 \le \theta \le f + g$, then
$$0 \le \min(\theta, f) \le f \quad \text{and} \quad 0 \le \theta - \min(\theta, f) \le g.$$
Thus
$$T\theta = T\big(\min(\theta, f)\big) + T\big(\theta - \min(\theta, f)\big) \le T^+ f + T^+ g.$$
It follows that $T^+$ is additive, since
$$T^+(f + g) \le T^+ f + T^+ g.$$
Now let $f \in C(X)$ be arbitrary (not necessarily positive) and pick any two constants $M$ and $N$ such that $f + M \ge 0$ and $f + N \ge 0$. Thus
$$T^+(f + M + N) = T^+(f + M) + T^+ N = T^+(f + N) + T^+ M,$$
which implies that
$$T^+(f + M) - T^+ M = T^+(f + N) - T^+ N.$$
Hence it makes sense to define
$$T^+ f = T^+(f + M) - T^+ M$$
for any $M$ such that $f + M \ge 0$. The reader should check that $T^+$ is additive by doing Exercise 9.21. Observe also that
$$T^+(-f) + T^+ f = T^+ 0 = 0,$$
so that $T^+(-f) = -T^+ f$. It follows that $T^+$ is linear. Hence we can define
$$T^- = T^+ - T,$$
so that $T^-$ must be linear, $T^- \ge 0$, and $T = T^+ - T^-$.

All that remains is to prove that $\|T\| = T^+(1) + T^-(1)$. Observe that if $T$ were positive, then $f \ge g$ would imply that $Tf \ge Tg$. But
$$\|T\| = \sup_{f \ne 0} \frac{|Tf|}{\|f\|}$$
and $\frac{|f|}{\|f\|} \le 1$. Thus if $T \ge 0$, it follows that $\|T\| = T(1)$. Since for general $T$ we have $T = T^+ - T^-$, it follows that
$$\|T\| \le \|T^+\| + \|T^-\| = T^+(1) + T^-(1).$$
On the other hand, if $0 \le \phi \le 1$, then $|2\phi - 1| \le 1$ and
$$\|T\| = \sup_{\|\theta\| \le 1} |T(\theta)| \ge T(2\phi - 1) = 2T(\phi) - T(1).$$
Taking the supremum over all such $\phi$, we obtain
$$\|T\| \ge 2T^+(1) - T(1) = T^+(1) + T^-(1).$$
It follows that $\|T\| = T^+(1) + T^-(1)$. This proves the lemma. The proof of Theorem 9.5.1 is complete.
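The decomposition in Lemma 9.5.5 can be checked by hand in the simplest possible setting. The sketch below is only an illustration under assumptions that are not in the text: $X$ is taken to be a finite set with four points (so that $C(X)$ is $\mathbb{R}^4$ with the sup-norm), and $T$ is given by hypothetical signed weights, $Tf = \sum_i w_i f_i$. In that setting the supremum defining $T^+ f$ is attained at a vertex of the box $0 \le \phi \le f$, so it can be found by brute force and compared with the positive-part formula.

```python
from itertools import product

def t_plus_bruteforce(weights, f):
    """Sup of T(phi) over 0 <= phi <= f, searched over the vertices of the box [0, f].

    A linear functional on a box attains its maximum at a vertex, so this
    finite search computes the supremum exactly (assumes f >= 0 pointwise).
    """
    best = float("-inf")
    for choice in product([0.0, 1.0], repeat=len(f)):
        phi = [c * fi for c, fi in zip(choice, f)]
        best = max(best, sum(w * p for w, p in zip(weights, phi)))
    return best

def t_plus_formula(weights, f):
    """Closed form in the finite case: keep only the positive weights."""
    return sum(max(w, 0.0) * fi for w, fi in zip(weights, f))

# Hypothetical data: T f = sum_i w_i f_i on a four-point space X.
weights = [2.0, -1.5, 0.5, -3.0]
f = [1.0, 2.0, 0.0, 4.0]
ones = [1.0] * len(weights)

t_plus_one = t_plus_bruteforce(weights, ones)   # T^+(1)
t_one = sum(weights)                            # T(1)
t_minus_one = t_plus_one - t_one                # T^-(1), using T^- = T^+ - T
norm_T = sum(abs(w) for w in weights)           # operator norm for the sup-norm

print(t_plus_bruteforce(weights, f), t_plus_formula(weights, f))
print(t_plus_one + t_minus_one, norm_T)
```

In this finite model the two printed pairs agree, which mirrors the identities $T = T^+ - T^-$ and $\|T\| = T^+(1) + T^-(1)$; the general lemma of course requires the argument given in the proof.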
EXERCISES

9.20 Prove that if $T$ is a positive linear functional on $C(X)$, then $T$ is monotone. That is, $f \le g$ implies that $Tf \le Tg$.

9.21 Prove that $T^+(f + g) = T^+ f + T^+ g$ for arbitrary continuous functions $f$ and $g$ on $X$, and that $T^+(cf) = cT^+(f)$ for all $c \ge 0$.

9.22 Suppose $\mu$ is a bounded regular (signed) Borel measure on $[0,1] \subset \mathbb{R}$, and suppose $\int_0^1 x^n\,d\mu = 0$ for each nonnegative integer $n$. Prove that $\mu = 0$.

9.23 For each bounded (signed) Borel measure $\mu$ defined on the Lebesgue measurable sets of $[0,1]$, define the Fourier-Stieltjes transform to be the function $\hat\mu : \mathbb{Z} \to \mathbb{C}$, given by
$$\hat\mu(n) = \int_0^1 e^{-2\pi i n x}\,d\mu(x)$$
for each $n \in \mathbb{Z}$. If $\mu$ and $\nu$ are two bounded Borel measures on $[0,1]$ for which $\hat\mu = \hat\nu$, prove that $\mu = \nu$. In words, this exercise states that the Fourier-Stieltjes transform of a bounded Borel measure determines that measure uniquely.
CHAPTER 10
TRANSLATION INVARIANCE IN REAL ANALYSIS
Throughout this book, we have emphasized the example of $\mathbb{R}^n$, with $n \ge 1$. An important attribute of $\mathbb{R}^n$, which distinguishes it from a general measure space, is the fact that $\mathbb{R}^n$ is a group under vector addition. In the case of $\mathbb{R}^n$, Lebesgue measure is translation-invariant, as is the Lebesgue integral. In this chapter we carry the theme of translation invariance further, classifying the closed, translation-invariant subspaces of $L^2(\mathbb{R})$ and $L^2(\mathbb{T})$, where $\mathbb{T}$ is the unit circle, meaning the quotient group $\mathbb{R}/\mathbb{Z}$. The closed, translation-invariant subspaces will be differentiation-invariant as well, meaning that differentiable functions in these subspaces will have their derivatives in the same subspaces.^105

We will begin with the decomposition of $L^2$ of the circle into a direct sum of one-dimensional translation-invariant subspaces $\mathbb{C}e^{2\pi i n x}$ of $L^2$. Thus each invariant summand in the direct sum will be the space of all complex scalar multiples of the periodic function $e^{2\pi i n x}$. The reader should note that each translate of the latter function is a complex scalar multiple of itself. This will be followed by a treatment of $L^2(\mathbb{R})$, expanded as a direct integral of one-dimensional translation-invariant spaces, with $n$ replaced by a real parameter.^106 In order to develop the decomposition of $L^2$ of a circle as indicated, it will be helpful to treat the smooth functions first. For $L^2(\mathbb{R})$ we will consider Schwartz functions in a similar role. The present chapter may be interpreted as an introduction to Fourier analysis, or to harmonic analysis, on the real line, but from the perspective of its modern group-theoretic sense.

^105 The theorems to be treated in this chapter are special cases of theorems that apply more generally to functions on suitable topological groups or on Lie groups for the consideration of differentiation.

10.1 AN ORTHONORMAL BASIS FOR $L^2(\mathbb{T})$

We regard the circle as being the quotient group of the additive group of real numbers, modulo the integers, or any nonzero multiple thereof. We will prove most of our theorems in the context of
$$\mathbb{T} = \mathbb{R}/\mathbb{Z},$$
although we will need to generalize this slightly to $\mathbb{R}/(k\mathbb{Z})$,^107 for any nonzero constant $k \in \mathbb{R}$. Sometimes we will need to stress that $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ is being considered as a group under the operation of addition, so we will denote the additive circle group by $(\mathbb{T}, +)$.^108 In the case of $\mathbb{R}/\mathbb{Z}$, the circle may be identified with the unit interval, with the numbers 0 and 1 identified. Geometrically, this can be modeled by bending the unit interval into a circle and gluing 0 and 1 together. A function
$$f : \mathbb{T} \to \mathbb{C}$$
is understood to mean a periodic function on the real line, with period 1. Thus
$$f(x) = f(x + 1),$$
meaning that the equality is valid for each real number $x$.

Remark 10.1.1 Unless we make a statement to the contrary, $L^2(\mathbb{T})$ is understood to be with respect to Lebesgue measure on the interval $[0, 1)$, with each point of the circle being identified with its unique coset representative from $[0, 1)$, which is regarded also as being a cross section for the coset space $\mathbb{R}/\mathbb{Z}$. We leave it as an exercise for the reader^109 to prove that this measure is translation-invariant with respect to the action by addition of the group $\mathbb{R}$ on $\mathbb{T}$. Observe that if $f \in L^2(\mathbb{T})$, then it is true also that $f \in L^1(\mathbb{T})$. This is necessary for the following definition, in which we introduce two key concepts: the Fourier transform and the Fourier series of $f \in L^2(\mathbb{T})$.

^106 The reader should note the absence in the latter case of the prefix sub!
^107 See Exercise 10.6.
^108 The symbol $\mathbb{T}$ is appropriate for a circle, because we think of the circle as a one-dimensional torus. A product of $n$ circle groups is an $n$-torus, which we denote by $\mathbb{T}^n$.
^109 See Exercise 10.2.
Definition 10.1.1 In the following statements, $L^2(\mathbb{T})$ and $L^1(\mathbb{T})$ refer to Lebesgue measure as described in Remark 10.1.1.

i. For each $n \in \mathbb{Z}$, we define a function $\chi_n : \mathbb{T} \to \mathbb{C}$ by
$$\chi_n(t) = e^{2\pi i n t} = \cos 2\pi n t + i \sin 2\pi n t,$$
where we have employed Euler's formula for the trigonometric exponential functions $\chi_n$. We note that $\chi_n$ is well defined on cosets of $\mathbb{Z}$.^110

ii. For each $f \in L^1(\mathbb{T})$ or in $L^2(\mathbb{T})$, we define the Fourier transform to be a function $\hat f : \mathbb{Z} \to \mathbb{C}$ by
$$\hat f(n) = \int_{\mathbb{T}} f(t)\,\overline{\chi_n(t)}\,dt = \int_0^1 f(t)\,e^{-2\pi i n t}\,dt,$$
where the integral over $\mathbb{T}$ of a periodic function $f$ of period 1 means the integral over any interval of length 1.

iii. We define the Fourier series $S(f)$ of $f \in L^1(\mathbb{T})$ to be
$$S(f)(x) = \sum_{n \in \mathbb{Z}} \hat f(n)\,\chi_n(x),$$
making no claim regarding the convergence or divergence of the Fourier series in any sense. The $N$th partial sum of the Fourier series is denoted by
$$S_N(f)(x) = \sum_{n=-N}^{N} \hat f(n)\,\chi_n(x)$$
for each $N \in \mathbb{N}$.

It is important to understand that each of the functions $\chi_n$ is a homomorphism of the additive circle group, $\mathbb{R}/\mathbb{Z}$, to the multiplicative group of the complex unit circle.^111 Especially, one must observe that
$$\chi_n(s + t) = \chi_n(s)\,\chi_n(t).$$
Theorem 10.1.1 If $f \in L^1(\mathbb{T})$, the $n$th partial sum $S_n$ of its Fourier series is given by
$$S_n(x) = \int_0^1 f(t)\,D_n(x - t)\,dt, \tag{10.1}$$
where the Dirichlet kernel $D_n$ is defined by
$$D_n(x) = \begin{cases} \dfrac{\sin (2n+1)\pi x}{\sin \pi x} & \text{if } x \notin \mathbb{Z}, \\ 2n + 1 & \text{if } x \in \mathbb{Z}. \end{cases} \tag{10.2}$$
Also, $\int_{-1/2}^{1/2} D_n(x)\,dx = 1$, for each $n \in \mathbb{N}$.^112

^110 The function $\chi_n$ is called a character because it is continuous and it has the property that $\chi_n(x + y) = \chi_n(x)\chi_n(y)$ for all $x, y \in \mathbb{R}$.
^111 See Exercise 10.5.
^112 The theorems in the present section have been adapted from the author's book [20], Advanced Calculus: An Introduction to Linear Analysis.

Figure 10.1 Dirichlet kernel $D_n$ for $n = 10$.

Remark 10.1.2 The Dirichlet kernel does not converge to zero for $x$ bounded away from the origin. It depends for its work on rapid oscillations in sign to produce cancellations, together with most of its integral being nearly 1 over a small interval around the origin. See Figure 10.1.

Proof: We observe that
$$S_n(x) = \sum_{k=-n}^{n} \hat f(k)\,e^{2\pi i k x} = \int_0^1 f(t) \sum_{k=-n}^{n} e^{2\pi i k (x - t)}\,dt,$$
since the integrand has period 1 and can be integrated with the same result on any interval of length 1. It will suffice to prove that the sum inside the integrand is the Dirichlet kernel evaluated at $x - t$. We reason as follows, using Euler's formula and the sum of a geometric series:
$$\begin{aligned} \sum_{k=-n}^{n} e^{2\pi i k x} &= e^{-2\pi i n x} \sum_{k=0}^{2n} e^{2\pi i k x} = e^{-2\pi i n x}\,\frac{1 - e^{2\pi i (2n+1) x}}{1 - e^{2\pi i x}} \\ &= \frac{e^{-2\pi i n x} - e^{2\pi i (n+1) x}}{1 - e^{2\pi i x}} = \frac{e^{-i(2n+1)\pi x} - e^{i(2n+1)\pi x}}{e^{-i\pi x} - e^{i\pi x}} \\ &= \frac{\sin (2n+1)\pi x}{\sin \pi x} = D_n(x), \end{aligned}$$
provided that the denominator in the geometric series formula is not zero, which is equivalent to $x \notin \mathbb{Z}$. If $x \in \mathbb{Z}$, then the sum is clearly $2n + 1$. Finally, since we have shown above that $D_n(x) = \sum_{k=-n}^{n} e^{2\pi i k x}$, it follows readily that
$$\int_{-1/2}^{1/2} D_n(x)\,dx = 1 \quad \text{for each } n \in \mathbb{N}.$$
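The identity between the exponential sum and the closed form of the Dirichlet kernel, together with the normalization of its integral, is easy to test numerically. The following is a minimal sketch, not part of the text: the function names are hypothetical, the integral is approximated by a midpoint rule on $[-\tfrac12, \tfrac12]$, and the tolerance used to detect integer arguments is an arbitrary choice.

```python
import cmath
import math

def dirichlet_sum(n, x):
    """D_n(x) as the sum of e^{2 pi i k x} for k = -n, ..., n (the sum is real)."""
    return sum(cmath.exp(2j * math.pi * k * x) for k in range(-n, n + 1)).real

def dirichlet_closed(n, x):
    """D_n(x) from the closed form sin((2n+1) pi x) / sin(pi x), with value 2n+1 at integers."""
    s = math.sin(math.pi * x)
    if abs(s) < 1e-12:          # x is (numerically) an integer
        return 2 * n + 1
    return math.sin((2 * n + 1) * math.pi * x) / s

def integral_over_period(n, m=2000):
    """Midpoint-rule approximation of the integral of D_n over [-1/2, 1/2]."""
    h = 1.0 / m
    return sum(dirichlet_closed(n, -0.5 + (j + 0.5) * h) for j in range(m)) * h

n = 10
for x in (0.05, 0.3, 0.77):
    print(x, dirichlet_sum(n, x), dirichlet_closed(n, x))
print(integral_over_period(n))
```

The two columns should agree to rounding error, and the last line should be very close to 1, since only the $k = 0$ term of the sum has a nonzero integral over a full period.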
The following famous lemma is very useful.
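Lemma 10.1.1 (Riemann-Lebesgue Lemma) If $f \in L^1(\mathbb{T})$, then
$$\hat f(n) \to 0 \quad \text{as } |n| \to \infty.$$
The proof of this lemma is left to Exercise 10.1. For the next lemma, note that differentiation of a function on $\mathbb{T}$ has the same meaning as differentiation of a periodic function of period 1 on $\mathbb{R}$. The notation $C^p(\mathbb{T})$ stands for the vector space of all $p$-times continuously differentiable periodic functions of period 1, where $p$ can be a natural number or $\infty$. In particular, if $f \in C^p(\mathbb{T})$, then its $p$th derivative, $f^{(p)}$, is continuous, provided that $p < \infty$.

Lemma 10.1.2 Let $f \in C^p(\mathbb{T})$, where $1 \le p < \infty$. Then, for each $n \ne 0$,
$$|\hat f(n)| \le \frac{\|f^{(p)}\|_1}{(2\pi |n|)^p}. \tag{10.3}$$
Proof: We begin by applying integration by parts to
$$\hat f(n) = \int_0^1 f(t)\,e^{-2\pi i n t}\,dt = \frac{1}{2\pi i n} \int_0^1 f'(t)\,e^{-2\pi i n t}\,dt = \frac{\widehat{f'}(n)}{2\pi i n},$$
where the boundary term vanishes because the integrand has period 1. We iterate this argument a total of $p$ times, obtaining
$$\hat f(n) = \frac{\widehat{f^{(p)}}(n)}{(2\pi i n)^p}. \tag{10.4}$$
Finally, we observe that for each function $g \in L^1(\mathbb{T})$, we have
$$|\hat g(n)| \le \int_0^1 |g(t)|\,dt = \|g\|_1.$$
It is clear that $f^{(p)}$ is integrable since it is continuous.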
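The integration-by-parts step can be illustrated numerically for $p = 1$: for a smooth function of period 1 and $n \ne 0$, one expects $\hat f(n) = \widehat{f'}(n)/(2\pi i n)$, and therefore $|\hat f(n)| \le \|f'\|_1 / (2\pi |n|)$. The sketch below rests on assumptions of its own: the test function $f(t) = e^{\cos 2\pi t}$ is an arbitrary choice, the derivative is coded by hand, and all integrals are replaced by Riemann sums on a uniform grid (which are very accurate for smooth periodic integrands).

```python
import cmath
import math

M = 4096  # number of equally spaced sample points on [0, 1)

def f(t):
    # A smooth function of period 1 (an arbitrary choice for illustration).
    return math.exp(math.cos(2 * math.pi * t))

def f_prime(t):
    # Derivative of f, computed by hand.
    return -2 * math.pi * math.sin(2 * math.pi * t) * f(t)

def fourier_coeff(g, n):
    # Riemann-sum approximation of the integral of g(t) e^{-2 pi i n t} over [0, 1).
    return sum(g(j / M) * cmath.exp(-2j * math.pi * n * j / M) for j in range(M)) / M

def l1_norm(g):
    # Riemann-sum approximation of the L^1 norm of g over one period.
    return sum(abs(g(j / M)) for j in range(M)) / M

bound_constant = l1_norm(f_prime) / (2 * math.pi)
for n in range(1, 7):
    direct = abs(fourier_coeff(f, n))
    via_parts = abs(fourier_coeff(f_prime, n)) / (2 * math.pi * n)
    print(n, direct, via_parts, bound_constant / n)
```

For each $n$ the first two printed numbers should coincide up to rounding, and both should lie below the third, which is the bound obtained from $\|f'\|_1$.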
Theorem 10.1.2 Let $f \in C^p(\mathbb{T})$, where $1 \le p < \infty$. Then the $N$th partial sum
$$S_N(x) = \sum_{n=-N}^{N} \hat f(n)\,e^{2\pi i n x}$$
converges uniformly to $f$ on the real line. Moreover,
$$\|S_N - f\|_{\sup} \le K N^{\frac12 - p},$$
for some constant $K$ that is independent of $N$ but is dependent upon $f$ and $p$.

Proof: It would suffice for the first part of the theorem to give a proof for $p = 1$, but the first part follows from the inequality that is the second part, and that is what we will prove. Our first step will be to prove that the sequence $S_n$ is Cauchy in the sup-norm and that it converges at the rate claimed. For each $n \in \mathbb{N}$ and for each $m \ge n$ we have
$$\begin{aligned} \|S_m - S_n\|_{\sup} &\le \sum_{n < |k| \le m} |\hat f(k)| \overset{(1)}{\le} \Big( \sum_{n < |k| \le m} \big|\widehat{f^{(p)}}(k)\big|^2 \Big)^{1/2} \Big( \sum_{n < |k| \le m} \frac{1}{(2\pi |k|)^{2p}} \Big)^{1/2} \\ &\overset{(2)}{\le} \frac{\|f^{(p)}\|_2}{(2\pi)^p} \Big( 2 \int_n^\infty \frac{dx}{x^{2p}} \Big)^{1/2} = K n^{\frac12 - p} \to 0 \end{aligned}$$
as $n \to \infty$. For inequality (1) we have used Equation (10.4) and the Cauchy-Schwarz inequality for $\ell^2$. For inequality (2) we have used both Bessel's inequality and the integral test for infinite series of positive terms. This proves that
$$\|S_m - S_n\|_{\sup} \le K n^{\frac12 - p}$$
for a suitable constant, $K$, that is independent of $n$. Thus $S_n$ is uniformly convergent to some continuous function $\phi$. Letting $m \to \infty$, we see that
$$\|\phi - S_n\|_{\sup} \to 0$$
as $n \to \infty$, and that the convergence takes place at the rate claimed. It remains to prove that $S_n \to f$ or, in other words, that $\phi = f$. Since uniform convergence is established already, we need prove only pointwise convergence to $f$. We fix $x$ arbitrarily and observe that
$$S_n(x) - f(x) = \int_{-1/2}^{1/2} \big( f(x - y) - f(x) \big) D_n(y)\,dy = \int_{-1/2}^{1/2} Q(y)\,\frac{e^{i\pi(2n+1)y} - e^{-i\pi(2n+1)y}}{2i}\,dy,$$
where we have used Euler's formula, and where we define
$$Q(y) = \frac{f(x - y) - f(x)}{\sin \pi y}$$
for $y \ne 0$. Next, we define
$$Q^+(y) = Q(y)\,e^{i\pi y} \quad \text{and} \quad Q^-(y) = Q(y)\,e^{-i\pi y}, \tag{10.5}$$
and the reader can check easily that each of these functions is continuous and thus integrable. Finally, we see that
$$S_n(x) - f(x) = \frac{-i}{2} \Big( \widehat{Q^+}(-n) - \widehat{Q^-}(n) \Big) \to 0$$
as $n \to \infty$ by the Riemann-Lebesgue lemma.
Theorem 10.1.3 If $f \in L^2(\mathbb{T})$, then
$$\|S_n(f) - f\|_2 \to 0$$
as $n \to \infty$. Consequently, we have the Plancherel identity:
$$\|f\|_2^2 = \sum_{n \in \mathbb{Z}} |\hat f(n)|^2. \tag{10.6}$$
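As a concrete check of the Plancherel identity $\|f\|_2^2 = \sum_{n \in \mathbb{Z}} |\hat f(n)|^2$, consider the sawtooth function $f(t) = t$ on $[0, 1)$, extended with period 1. This example is not from the text; it assumes the convention $\hat f(n) = \int_0^1 f(t) e^{-2\pi i n t}\,dt$, under which a direct integration by parts gives $\hat f(0) = \tfrac12$ and $\hat f(n) = i/(2\pi n)$ for $n \ne 0$, while $\|f\|_2^2 = \int_0^1 t^2\,dt = \tfrac13$.

```python
import math

def coeff_abs_sq(n):
    """|f^(n)|^2 for f(t) = t on [0, 1): f^(0) = 1/2 and f^(n) = i / (2 pi n) for n != 0."""
    if n == 0:
        return 0.25
    return 1.0 / (4 * math.pi ** 2 * n ** 2)

def plancherel_partial(N):
    """Sum of |f^(n)|^2 over -N <= n <= N."""
    return sum(coeff_abs_sq(n) for n in range(-N, N + 1))

norm_sq = 1.0 / 3.0  # integral of t^2 over [0, 1]

for N in (10, 100, 1000, 10000):
    print(N, plancherel_partial(N), norm_sq - plancherel_partial(N))
```

The partial sums increase toward $\tfrac13$ from below, as Bessel's inequality requires, and the gap shrinks roughly like $1/N$, reflecting the slow decay of the coefficients of a discontinuous function.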
Remark 10.1.3 A celebrated theorem of Lennart Carleson [3] established that the Fourier series of any square-integrable Lebesgue measurable function $f$ must converge to $f(x)$ pointwise except on a set of Lebesgue measure zero. However, a set of points can have Lebesgue measure zero and still be an uncountably infinite set. There are examples known of continuous functions $f$ for which $S_n(f)$ is actually divergent for infinitely many values of $x$. And there is an example of a Lebesgue integrable function $f$ for which the Fourier series diverges at each point $x$! The extraordinary pathologies of Fourier series in regard to pointwise convergence, even for continuous functions, make theorems like the one we are about to prove very interesting and useful.

Proof: In Exercise 10.3, the reader will show that $C^1(\mathbb{T})$ is dense in $L^2(\mathbb{T})$. Let $\epsilon > 0$. It follows that if $f \in L^2(\mathbb{T})$, then there exists a function $\phi \in C^1(\mathbb{T})$ such that
$$\|f - \phi\|_2 < \frac{\epsilon}{2}.$$
Since the partial sums $S_N(\phi)$ of the Fourier series of $\phi$ converge to $\phi$ uniformly on $\mathbb{T}$, it follows that there exists $N \in \mathbb{N}$ such that $n \ge N$ implies that
$$\|f - S_n(\phi)\|_2 < \epsilon.$$
By Exercise 10.4, the Fourier coefficients of $f$ provide the optimal $L^2$-approximation to $f$. Thus
$$\|f - S_n(f)\|_2 \le \|f - S_n(\phi)\|_2 < \epsilon$$
for all $n \ge N$. This implies the theorem.

EXERCISES
10.1 Prove the Riemann-Lebesgue lemma, Lemma 10.1.1. (Hint: Emulate Exercise 5.42.)

10.2 Show that the measure with respect to which we define $L^2(\mathbb{T})$, described in Remark 10.1.1, is invariant under translation by any real number.

10.3 Use the following steps to prove that the set $C^\infty(\mathbb{T})$ is dense in $L^2(\mathbb{T})$.
a) Prove that the family $\mathfrak{S}$ of all step functions is dense in $L^2(\mathbb{T})$. (Hint: See Exercise 5.41.)
b) Prove that each step function can be approximated as accurately as desired in the $L^2$-norm by means of a continuous function.
c) Prove that each continuous function on $\mathbb{T}$ can be approximated as accurately as desired by means of a $C^\infty$-function.^113

10.4 Suppose that $f$ lies in a separable Hilbert space $\mathcal{H}$, and let $\{e_k \mid k \in \mathbb{Z}\}$ be any countable orthonormal set, not necessarily a basis. Let $c_{-n}, \ldots, c_n$ be any $2n + 1$ complex constants. Then
$$\Big\| f - \sum_{k=-n}^{n} \hat f(k)\,e_k \Big\| \le \Big\| f - \sum_{k=-n}^{n} c_k\,e_k \Big\|,$$
with equality holding if and only if $c_k = \hat f(k)$ for each $k$. (Hint: See Definition 9.4.2. Write the difference inside the right-side norm as the sum of two sums of differences, one of these involving $(\hat f(k) - c_k)e_k$.)

10.5 Denote by $(\mathbb{T}, \cdot)$ the multiplicative group of complex numbers of modulus one to distinguish it from the additive group $(\mathbb{T}, +) = \mathbb{R}/\mathbb{Z}$. We call a function $\chi : \mathbb{T} \to (\mathbb{T}, \cdot)$ a character of $\mathbb{T}$ if and only if $\chi$ is both a continuous map and a homomorphism, meaning that
$$\chi(x + y) = \chi(x)\,\chi(y)$$
for all $x$ and $y$. Let $\hat{\mathbb{T}}$ denote the set of all characters of $\mathbb{T}$.
a) Prove that $\hat{\mathbb{T}}$ is a group under the operation of pointwise multiplication. We will call $\hat{\mathbb{T}}$ the character group of $\mathbb{T}$.
b) Prove that $\chi \in \hat{\mathbb{T}}$ if and only if $\chi = \chi_n$, as defined in Definition 10.1.1, for some $n \in \mathbb{Z}$. Hints: Necessity is the harder part. Let $\chi$ be any character of $\mathbb{T}$. Prove that
$$\chi(x) \int_s^t \chi(y)\,dy = \int_{s+x}^{t+x} \chi(u)\,du$$
and explain why $\chi \in C^1(\mathbb{T})$. Prove that $\chi'$ must be a constant multiple of $\chi$ itself.

10.6 Let $\mathbb{T}_T = (\mathbb{R}, +)/(T\mathbb{Z}, +)$ denote the nonstandard additive circle group, which is a circle of perimeter $T > 0$. Use the isomorphism $\tau : \mathbb{R}/\mathbb{Z} \to \mathbb{R}/(T\mathbb{Z})$ defined by $\tau(x) = Tx$ to show that
$$\Big\{ \tfrac{1}{\sqrt{T}}\,e^{2\pi i n x / T} \;\Big|\; n \in \mathbb{Z} \Big\}$$
is an orthonormal basis for $L^2(\mathbb{T}_T)$ and that the Fourier series of a function $f$ in $C^p(\mathbb{T}_T)$ converges uniformly to $f$, just as in Theorem 10.1.2. (In the case of $\mathbb{R}/(T\mathbb{Z})$, the interval $[0, T)$ plays the role of a convenient cross section for the quotient group, and a function on such a circle of perimeter $T$ would be a periodic function on $\mathbb{R}$ with period $T$.)

^113 You can use the Weierstrass polynomial approximation theorem for this. See, for example, [20].
10.2 CLOSED, INVARIANT SUBSPACES OF $L^2(\mathbb{T})$

In this section, we will classify all the closed, translation-invariant subspaces of $L^2(\mathbb{T})$. We begin with a reinterpretation of Theorem 10.1.3.

Definition 10.2.1 Let $K$ be a set of indices, which may be either finite or infinite. Suppose there is a set of Hilbert spaces $\mathcal{H}_k$, indexed by the set $K$. The direct sum
$$\mathcal{H} = \bigoplus_{k \in K} \mathcal{H}_k$$
of the Hilbert spaces $\mathcal{H}_k$ is the set of all functions $f$ such that $f(k) \in \mathcal{H}_k$ for each $k \in K$ and such that
$$\sum_{k \in K} \|f(k)\|_k^2 < \infty.$$
We define a Hermitian scalar product on $\mathcal{H}$ by
$$\langle f, g \rangle = \sum_{k \in K} \langle f(k), g(k) \rangle_k,$$
where $\langle \cdot, \cdot \rangle_k$ denotes the scalar product in the Hilbert space $\mathcal{H}_k$.

The reader will verify in Exercise 10.7 that the direct sum, as defined above, is a Hilbert space in its own right. Theorem 10.1.3 tells us that for each $f \in L^2(\mathbb{T})$, we have
$$f = \sum_{n \in \mathbb{Z}} \hat f(n)\,\chi_n$$
in the sense of $L^2$-convergence, and that the set $\{\chi_n \mid n \in \mathbb{Z}\}$ is a complete orthonormal basis for $L^2(\mathbb{T})$. We can interpret the latter theorem as providing a decomposition of the Hilbert space $L^2(\mathbb{T})$ into the direct sum of minimal, nontrivial, mutually orthogonal, translation-invariant subspaces:
$$L^2(\mathbb{T}) = \bigoplus_{n \in \mathbb{Z}} \mathbb{C}\chi_n.$$
Each of the spaces $\mathbb{C}\chi_n$ is a one-dimensional Hilbert space consisting of all the complex scalar multiples of a single character function, $\chi_n$, which does lie in $L^2(\mathbb{T})$. The one-dimensionality of each space $\mathbb{C}\chi_n$ guarantees that it has no nontrivial, proper, translation-invariant subspaces. It is easy to see that if $S \subseteq \mathbb{Z}$, then
$$\bigoplus_{n \in S} \mathbb{C}\chi_n \tag{10.7}$$
is a closed, translation-invariant subspace of $L^2(\mathbb{T})$. We will prove that every closed, translation-invariant subspace has this same form. In words, we will prove that each closed, translation-invariant subspace of $L^2(\mathbb{T})$ is the direct sum of some finite or countable collection of the one-dimensional spaces of the form $\mathbb{C}\chi_n$. We begin with a brief introduction to the integration of Hilbert space valued functions.
10.2.1 Integration of Hilbert Space Valued Functions

We begin with a brief digression in order to define the concept of the integral of a Hilbert space valued function. We have seen in Theorem 9.4.2 that the bounded linear functionals on a separable Hilbert space $\mathcal{H}$ correspond uniquely to the points of $\mathcal{H}$ itself via the bijection
$$T_\xi(\eta) = \langle \eta, \xi \rangle$$
for each $\xi \in \mathcal{H}$.

Definition 10.2.2 Suppose $(X, \mathfrak{A}, \mu)$ is a measure space and suppose that $\phi$, mapping $X$ to the separable Hilbert space $\mathcal{H}$, has the property that the function
$$x \to \langle \eta, \phi(x) \rangle$$
is $\mu$-measurable, for each fixed $\eta \in \mathcal{H}$. Suppose also that the real-valued function $\|\phi(x)\|_{\mathcal{H}}$ lies in $L^1(X, \mathfrak{A}, \mu)$. Then we define the Hilbert space value of the integral of $\phi$ with respect to $\mu$ by the equation
$$\Big\langle \eta, \int_X \phi(x)\,d\mu(x) \Big\rangle = \int_X \langle \eta, \phi(x) \rangle\,d\mu(x) \tag{10.8}$$
for each $\eta \in \mathcal{H}$.

For this definition to make sense, it is necessary to show that the right side of Equation (10.8) defines a bounded linear functional
$$T(\eta) = \int_X \langle \eta, \phi(x) \rangle\,d\mu(x),$$
so that the latter integral determines a unique vector in $\mathcal{H}$,^114 which we call the value of the Hilbert space valued integral. This is Exercise 10.8. In the following lemma, we prove a useful generalization of the triangle inequality for the Hilbert space norm of a Hilbert space valued integral.

Lemma 10.2.1 (Triangle Inequality) Suppose that $\phi : X \to \mathcal{H}$ satisfies the hypotheses in Definition 10.2.2. Then
$$\Big\| \int_X \phi(x)\,d\mu(x) \Big\| \le \int_X \|\phi(x)\|\,d\mu(x),$$
where the norm on each side of the inequality is the Hilbert space norm.

Proof: We reason as follows, using the definition of the norm of a linear functional:
$$\Big\| \int_X \phi\,d\mu \Big\| = \sup_{\|\eta\| = 1} \Big| \Big\langle \eta, \int_X \phi\,d\mu \Big\rangle \Big| = \sup_{\|\eta\| = 1} \Big| \int_X \langle \eta, \phi(x) \rangle\,d\mu(x) \Big| \le \sup_{\|\eta\| = 1} \int_X \|\eta\|\,\|\phi(x)\|\,d\mu(x) = \int_X \|\phi(x)\|\,d\mu(x).$$
Here we have used both the definition of the Hilbert space valued integral and the Cauchy-Schwarz inequality.

^114 Note that $\mathcal{H}$ is self-dual, a fact that is important here.
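10.2.2 Spectrum of a Subset of $L^2(\mathbb{T})$

Definition 10.2.3 Let $E$ be any subset of $L^2(\mathbb{T})$. We define the spectrum of $E$ to be the subset of $\hat{\mathbb{T}}$ given by
$$\hat E = \{ \chi_n \in \hat{\mathbb{T}} \mid \hat f(n) \ne 0 \text{ for some } f \in E \}.$$

Theorem 10.2.1 A linear subspace $V \subseteq L^2(\mathbb{T})$ is closed and translation invariant if and only if
$$V = \bigoplus_{\chi \in \hat V} \mathbb{C}\chi.$$
Proof: Sufficiency is established by Exercise 10.7. We will prove necessity. To this end, denote the convolution
$$f * \chi_n(x) \overset{(i)}{=} \int_{\mathbb{T}} f(x - t)\,\chi_n(t)\,dt \overset{(ii)}{=} \int_{\mathbb{T}} \chi_n(t)\,f_{-t}(x)\,dt. \tag{10.9}$$
Equality (i) expresses the definition of the convolution over the circle group. Equality (ii) rewrites the right side of Equality (i) in a manner that is useful for the present argument. Since the characters of $\mathbb{T}$ are mutually orthogonal, it will suffice to show that for each $f \in V$ and for each $n \in \mathbb{Z} = \hat{\mathbb{T}}$, we must have $f * \chi_n \in V$. Let $\epsilon > 0$, and fix $f \in V$. Observe that the mapping $t \to f_{-t}$, carrying $t$ to the translation of $f$ by $-t$, is a continuous map from $\mathbb{T}$ to $L^2(\mathbb{T})$.^115 This mapping is uniformly continuous since $\mathbb{T}$ is compact. Thus there exists $\delta > 0$ such that
$$\|f_{-t} - f_{-s}\|_2 < \epsilon \quad \text{whenever } |t - s| < \delta.$$
The plan of the proof is to show that the convolution integral in Equation (10.9) is a limit of a finite linear combination of translations and is therefore contained within the closed subspace $V$. Let $\frac{1}{m} < \delta$. We reason as follows, applying Lemma 10.2.1:
$$\begin{aligned} \Big\| f * \chi_n - \sum_{k=0}^{m-1} \Big( \int_{k/m}^{(k+1)/m} \chi_n(t)\,dt \Big) f_{-k/m} \Big\|_2 &= \Big\| \sum_{k=0}^{m-1} \int_{k/m}^{(k+1)/m} \chi_n(t) \big( f_{-t} - f_{-k/m} \big)\,dt \Big\|_2 \\ &\le \sum_{k=0}^{m-1} \int_{k/m}^{(k+1)/m} \|f_{-t} - f_{-k/m}\|_2\,dt < \sum_{k=0}^{m-1} \epsilon\,\frac{1}{m} = \epsilon \end{aligned}$$
for all sufficiently large values of $m$. This completes the proof that $f * \chi_n \in V$, since $V$ is closed. Hence $V$ contains the closed linear span of all the characters in its own spectrum. The definition of the spectrum completes the proof.

^115 See Exercise 10.9.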
Remark 10.2.1 Although we will not prove it in this book, the fact that the irreducible translation-invariant subspaces are one-dimensional is a reflection of the circle group being both abelian and compact. The real line is abelian, but noncompact, so $L^2(\mathbb{R})$ is a direct integral of one-dimensional irreducible, invariant spaces (but not a direct sum of subspaces). Harmonic analysis can be carried out on matrix groups, on Lie groups, and on locally compact Hausdorff topological groups. For groups that are compact but not abelian, minimal translation-invariant subspaces will be finite-dimensional, but they need not be one-dimensional. For groups $G$ that are neither compact nor abelian, minimal translation-invariant spaces in the direct integral decomposition of $L^2(G)$ can be infinite-dimensional. An interesting example of such an infinite-dimensional, irreducible space under the action of a nonabelian, noncompact group is given in Section 10.5, where the group under consideration is the Heisenberg group.^116 For a well written elementary introduction to topics mentioned in this remark, see the book [19] by Pukanszky.

^116 Such groups play an important role in quantum mechanics.

EXERCISES

10.7 Prove that Definition 10.2.1 makes the direct sum of an indexed family of Hilbert spaces into a Hilbert space.

10.8 Prove that Equation (10.8) determines the integral uniquely as a vector in $\mathcal{H}$ by identifying its action as a point in the dual space.

10.9 Let $f \in L^2(\mathbb{T})$. Prove that the mapping $t \to f_{-t}$ is a continuous map from $\mathbb{T}$ to $L^2(\mathbb{T})$. (Hint: See Exercises 9.15 and 5.43.)
10.3 SCHWARTZ FUNCTIONS: FOURIER TRANSFORM AND INVERSION

The study of the representation of functions $f \in L^2(\mathbb{T})$ by Fourier series was facilitated by the study of $C^\infty(\mathbb{T})$ first. For the real line itself, there is a special role played by the Schwartz functions, which we will define below. First, we define the characters of $\mathbb{R}$ and the Fourier transform.

Definition 10.3.1 Let $f \in L^1(\mathbb{R})$ and let $\gamma \in \mathbb{R}$. We make the following definitions.

i. For each $\gamma \in \mathbb{R}$, we define the character $\chi_\gamma : \mathbb{R} \to (\mathbb{T}, \cdot)$ by
$$\chi_\gamma(x) = e^{2\pi i \gamma x}.$$
Denote the set of all characters of $\mathbb{R}$ by
$$\hat{\mathbb{R}} = \{ \chi_\gamma \mid \gamma \in \mathbb{R} \}.$$

ii. Define the Fourier transform of the function $f$ to be $\hat f : \mathbb{R} \to \mathbb{C}$, given by
$$\hat f(\gamma) = \int_{\mathbb{R}} f(x)\,\overline{\chi_\gamma(x)}\,dx = \int_{\mathbb{R}} f(x)\,e^{-2\pi i \gamma x}\,dx.$$

iii. Define the inverse Fourier transform of $g \in L^1(\mathbb{R})$ to be
$$\check g(x) = \int_{\mathbb{R}} g(\gamma)\,e^{2\pi i \gamma x}\,d\gamma.$$

In Exercise 5.47, the reader will have proven that the Fourier transform of each function $f \in L^1(\mathbb{R})$ is continuous and that it vanishes at $\pm\infty$.^117 The reader should check the simple extension that the inverse Fourier transform of an $L^1$-function has the same properties: In fact, $\check g(x) = \hat g(-x)$. In Exercise 10.10, the reader will show that $\hat{\mathbb{R}}$ is the abelian group of all continuous homomorphisms of $(\mathbb{R}, +) \to (\mathbb{T}, \cdot)$. We leave it as an informal and very easy exercise for the reader to prove that the integral defining the Fourier transform of each integrable function $f$ on $\mathbb{R}$ exists. The wish would be to show that
$$\check{\hat f} = f$$
for each $f \in L^1(\mathbb{R})$. However, it is not hard to give examples of integrable functions $f$ for which $\hat f$ is not integrable, leaving its inverse Fourier transform undefined. (See Exercise 10.11.) We are ready to define the family of all Schwartz^118 functions on $\mathbb{R}$. These functions will not share the difficulty just described for $L^1(\mathbb{R})$.

^117 This is the Riemann-Lebesgue lemma.

Definition 10.3.2 We define the set of all Schwartz functions on the real line to be
$$\mathcal{S}(\mathbb{R}) = \Big\{ f \in C^\infty(\mathbb{R}) \;\Big|\; \lim_{|x| \to \infty} x^k f^{(n)}(x) = 0 \text{ for all integers } k, n \ge 0 \Big\}.$$
In words, Schwartz functions can be described as being those smooth functions on the line that vanish rapidly at plus and minus infinity, as does every derivative of every finite order $n$. The role of the index $k$ is to ensure that the Fourier transform of each derivative of a Schwartz function vanishes more rapidly at plus and minus infinity than the reciprocal of any polynomial. The reader can check easily that every Schwartz function is integrable and that every derivative of a Schwartz function is Schwartz.
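A standard example of a rapidly decreasing smooth function is the Gaussian $g(x) = e^{-\pi x^2}$, and it is a classical fact that this function is its own Fourier transform under the convention $\hat f(\gamma) = \int_{\mathbb{R}} f(x)\,e^{-2\pi i \gamma x}\,dx$. The following sketch checks this numerically. The convention just stated is an assumption made here for the illustration, as are the function names, the truncation of the integral to $[-8, 8]$, and the use of a trapezoid rule.

```python
import cmath
import math

def gaussian(x):
    """The Gaussian e^{-pi x^2}, a rapidly decreasing smooth function."""
    return math.exp(-math.pi * x * x)

def fourier_transform(f, gamma, half_width=8.0, m=4000):
    """Trapezoid-rule approximation of the integral of f(x) e^{-2 pi i gamma x}
    over [-half_width, half_width]; the tails are ignored, which is harmless
    only for functions that decay as fast as the Gaussian."""
    h = 2 * half_width / m
    total = 0j
    for j in range(m + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, m) else 1.0
        total += weight * f(x) * cmath.exp(-2j * math.pi * gamma * x)
    return total * h

for gamma in (0.0, 0.5, 1.3):
    approx = fourier_transform(gaussian, gamma)
    print(gamma, approx.real, gaussian(gamma), abs(approx.imag))
```

For each $\gamma$ the real part of the numerical transform should match $e^{-\pi\gamma^2}$ closely and the imaginary part should be negligible, in keeping with the fact that the transform of this particular function is again rapidly decreasing.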
Theorem 10.3.1 The Fourier transform, $\hat{\ } : f \to \hat f$ for each $f \in L^1(\mathbb{R})$, has the following properties when restricted to $\mathcal{S}(\mathbb{R})$:

i. The Fourier transform is a bijection of $\mathcal{S}(\mathbb{R})$ onto itself.

ii. $\check{\hat f} = f$ for each $f \in \mathcal{S}(\mathbb{R})$.

iii. The Fourier transform is an $L^2$-isometry of $\mathcal{S}(\mathbb{R})$, meaning that $\|f\|_2 = \|\hat f\|_2$ for each $f \in \mathcal{S}(\mathbb{R})$.^119

Proof: i. We can produce a variant of Lemma 10.1.2, applying integration by parts to show that for Schwartz functions on the real line we have
$$\hat f(\gamma) = \frac{\widehat{f^{(p)}}(\gamma)}{(2\pi i \gamma)^p} \tag{10.10}$$
for all $\gamma \ne 0$. We see easily that
$$\big|\widehat{f^{(p)}}(\gamma)\big| \le \|f^{(p)}\|_1.$$
From this it follows readily that the Fourier transform $\hat f$ of a Schwartz function $f$ must be rapidly decreasing. We need to prove that $\hat f$ is differentiable and that it is in $C^\infty(\mathbb{R})$ as well. We will apply the Lebesgue Dominated Convergence theorem, together with the mean value inequality for complex-valued functions of a real variable.^120 Select an arbitrary sequence $h_n \to 0$. It is necessary to prove that
$$(\hat f)'(\gamma) = \lim_{n \to \infty} \frac{\hat f(\gamma + h_n) - \hat f(\gamma)}{h_n} = -2\pi i\,\widehat{(xf)}(\gamma) \tag{10.11}$$
by showing that this limit exists and is independent of the choice of $h_n \to 0$. We leave the verification of this to the reader in Exercise 10.15. Applying Equation (10.11), and using the fact that $xf \in \mathcal{S}(\mathbb{R})$, we see that $(\hat f)'(\gamma) \to 0$ fast as $|\gamma| \to \infty$. By mathematical induction, we establish that the Fourier transform maps $\mathcal{S}(\mathbb{R})$ into itself. We leave the proof that the Fourier transform is surjective until the end of this proof.

For the second and third parts of the proof, we will treat first the case of functions $f \in C_c^\infty(\mathbb{R})$, which is the linear subspace of $\mathcal{S}(\mathbb{R})$ consisting of all infinitely differentiable functions with compact support.

^118 Schwartz functions are named for Laurent Schwartz, who lived later than the inventors of the Cauchy-Schwarz inequality. The difference in spelling is not accidental.
^119 Clearly, each Schwartz function on the line is also square integrable.

ii. If $f \in C_c^\infty(\mathbb{R})$, then there exists a sufficiently big real number $T > 0$ such that the interval $[-\frac{T}{2}, \frac{T}{2})$ contains the support of $f$. By Exercise 10.6, we claim that
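$$f(x) \overset{(a)}{=} \sum_{n \in \mathbb{Z}} \frac{1}{T} \Big( \int_{-T/2}^{T/2} f(t)\,e^{-2\pi i n t / T}\,dt \Big) e^{2\pi i n x / T} \overset{(b)}{=} \sum_{n \in \mathbb{Z}} \hat f\Big(\frac{n}{T}\Big)\,e^{2\pi i (n/T) x}\,\frac{1}{T} \overset{(c)}{\longrightarrow} \int_{\mathbb{R}} \hat f(\gamma)\,e^{2\pi i \gamma x}\,d\gamma$$
as $T \to \infty$. In Equality (a) we use a Hilbert space Fourier series using an orthonormal basis, and in Equality (b) we use the form of the Fourier transform for the real line (rather than the circle) on the right-hand side. The convergence claim (c) is what we will prove now for $f \in C_c^\infty(\mathbb{R})$. The details are as follows.

For a suitable constant $c$, we have
$$|\hat f(\gamma)| \le \frac{c}{1 + \gamma^2}$$
for all $\gamma$ because $\hat f$ is Schwartz. Assume without loss of generality that $T > 1$, and define the function $h \in L^1(\mathbb{R})$ by
$$h(\gamma) = \sup_{|t - \gamma| \le 1} \frac{c}{1 + t^2}.$$
Then
$$\sum_{n \in \mathbb{Z}} \Big| \hat f\Big(\frac{n}{T}\Big) \Big|\,1_{[\frac{n}{T}, \frac{n+1}{T})}(\gamma) \le h(\gamma),$$
and we note that the supports of the functions of $\gamma$ on the left-hand side are mutually disjoint. Also,
$$\sum_{n \in \mathbb{Z}} \hat f\Big(\frac{n}{T}\Big)\,e^{2\pi i (n/T) x}\,1_{[\frac{n}{T}, \frac{n+1}{T})}(\gamma) \to \hat f(\gamma)\,e^{2\pi i \gamma x}$$
pointwise on $\mathbb{R}$, as $T \to \infty$. Now the Lebesgue Dominated Convergence theorem tells us that
$$\sum_{n \in \mathbb{Z}} \hat f\Big(\frac{n}{T}\Big)\,e^{2\pi i (n/T) x}\,\frac{1}{T} \to \int_{\mathbb{R}} \hat f(\gamma)\,e^{2\pi i \gamma x}\,d\gamma = \check{\hat f}(x).$$

^120 This inequality is a special case of the Euclidean space Mean Value Theorem, interpreting $f : \mathbb{R} \to \mathbb{C}$ as though it were $\mathbb{R}^2$-valued. See [20], for example.

iii. Again for this part, we assume for now that $f \in C_c^\infty(\mathbb{R})$. Just as in the preceding part, we have for all sufficiently big $T > 0$ that
$$\|f\|_2^2 = \sum_{n \in \mathbb{Z}} \Big| \hat f\Big(\frac{n}{T}\Big) \Big|^2 \frac{1}{T} \to \int_{\mathbb{R}} |\hat f(\gamma)|^2\,d\gamma = \|\hat f\|_2^2$$
as $T \to \infty$. Hence $\|f\|_2 = \|\hat f\|_2$, making the Fourier transform an isometric injection of $C_c^\infty(\mathbb{R})$ into $\mathcal{S}(\mathbb{R})$, with respect to the $L^2$-norm.

It remains to show that the Fourier transform maps $\mathcal{S}(\mathbb{R})$ isometrically onto itself and that Fourier inversion works for all Schwartz functions. To these ends, we let $f$ be an arbitrary Schwartz function. We would like to approximate $f$ with a sequence of compactly supported smooth functions. Let $\phi \in C_c^\infty(\mathbb{R})$ such that $\phi(x) = 1$ on $[-\frac12, \frac12]$, with $\|\phi\|_\infty = 1$, and the support of $\phi$ contained in $[-1, 1]$.^121 We define our approximating sequence as follows:
$$f_n(x) = \phi\Big(\frac{x}{n}\Big) f(x). \tag{10.12}$$
Observe that
$$\|f_n - f\|_1 \to 0 \quad \text{and} \quad \|f_n - f\|_2 \to 0$$
as $n \to \infty$. Now we make a two-part estimation on $|\hat f_n(\gamma)|$. First,
$$|\hat f_n(\gamma)| \le \|f_n\|_1 \le \|f\|_1$$
for all $\gamma$, and in particular for $|\gamma| \le 1$. But for $|\gamma| > 1$, we make the following estimates:
$$|\hat f_n(\gamma)| = \frac{\big|\widehat{f_n''}(\gamma)\big|}{4\pi^2 \gamma^2} \le \frac{\|f_n''\|_1}{4\pi^2 \gamma^2} \le \frac{C}{\gamma^2},$$
with $C$ independent of $n$. Hence the sequence $|\hat f_n|$ is dominated in the sense of the Lebesgue Dominated Convergence theorem, with the same being true of $|\hat f_n|^2$. It follows that
$$\check{\hat f}(x) = \lim_{n \to \infty} \int_{\mathbb{R}} \hat f_n(\gamma)\,e^{2\pi i \gamma x}\,d\gamma = \lim_{n \to \infty} f_n(x) = f(x).$$
Also,
$$\|\hat f\|_2 = \lim_{n \to \infty} \|\hat f_n\|_2 = \lim_{n \to \infty} \|f_n\|_2 = \|f\|_2.$$
Hence the Fourier transform is an isometric injection of $\mathcal{S}(\mathbb{R})$ into itself, and we can see that this transformation is surjective by Exercise 10.13.b.

^121 The existence of such smooth functions $\phi$ can be found in many advanced calculus texts, such as [20].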
EXERCISES

10.10 Prove that the set $\widehat{\mathbb{R}}$ is an abelian group under the operation of pointwise multiplication, and that every continuous homomorphism of the additive group $(\mathbb{R}, +)$ to the multiplicative group $(\mathbb{T}, \cdot)$ of the complex unit circle is an element of $\widehat{\mathbb{R}}$. (Hint: If $\chi$ is any nontrivial homomorphism of $(\mathbb{R}, +) \to (\mathbb{T}, \cdot)$, then it has a discrete kernel, $\ker(\chi) = p\mathbb{Z}$ for some positive real number $p$. Now consider the nonstandard circle, $\mathbb{R}/p\mathbb{Z}$.)

10.11 Find and justify an example of a function $f \in L^1(\mathbb{R})$ for which $\hat f \notin L^1(\mathbb{R})$.

10.12 Let $p$ be any polynomial, $c > 0$, and $x_0 \in \mathbb{R}$. Show that

$$p(x)\, e^{-c(x - x_0)^2}$$

is a Schwartz function.

10.13 Define the function $\delta_c : \mathbb{R} \to \mathbb{R}$ by $\delta_c x = cx$ for each $c \in \mathbb{R}$. Prove that, for each Schwartz function $f \in \mathcal{S}(\mathbb{R})$, we have the following identities:

a) $\hat{\hat f} = f \circ \delta_{-1}$.

b) $\hat{\hat{\hat{\hat f}}} = f$. This is sometimes expressed with the operator equation $\wedge^4 = I$, the identity operator. In words, the fourth power of the Fourier transform is the identity operator. (This exercise may prompt the reader to search for eigenfunctions of the Fourier transform operator, corresponding to the four complex fourth roots of unity. In fact the Hermite functions, which are the

$$h_n(x) = e^{\pi x^2}\, \frac{d^n}{dx^n}\, e^{-2\pi x^2}$$

up to a constant factor, are eigenfunctions. The reader can learn more about this in [5], in which it is shown that $h_n$ is an eigenfunction for the eigenvalue $(-i)^n$.)

10.14 If $f$ and $g$ are in $\mathcal{S}(\mathbb{R})$, show that

(See Exercise 6.7.b.)

10.15 Prove Equation (10.11) for every function $f \in \mathcal{S}(\mathbb{R})$.
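The eigenfunction remark in Exercise 10.13 can be explored numerically before it is proved. The following Python sketch is an illustration only, not a proof: it approximates $\hat f(\gamma) = \int_{\mathbb{R}} f(x)\, e^{-2\pi i x \gamma}\, dx$ by a midpoint rule on a truncated interval and compares the result with the expected eigenvalue relations for $e^{-\pi x^2}$ (eigenvalue $1$) and $x e^{-\pi x^2}$ (eigenvalue $-i$). The helper name `fourier_transform`, the truncation radius, and the number of steps are assumptions made here for the illustration.

```python
import cmath
import math


def fourier_transform(f, gamma, radius=8.0, steps=4000):
    """Midpoint-rule approximation of the integral of
    f(x) * exp(-2*pi*i*x*gamma) over [-radius, radius]."""
    h = 2.0 * radius / steps
    total = 0j
    for k in range(steps):
        x = -radius + (k + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * x * gamma)
    return total * h


def h0(x):
    """Gaussian exp(-pi x^2)."""
    return math.exp(-math.pi * x * x)


def h1(x):
    """x * exp(-pi x^2), proportional to the first Hermite function."""
    return x * math.exp(-math.pi * x * x)


if __name__ == "__main__":
    for gamma in (0.0, 0.5, 1.0):
        # Each printed difference should be close to zero.
        print(gamma,
              abs(fourier_transform(h0, gamma) - h0(gamma)),
              abs(fourier_transform(h1, gamma) - (-1j) * h1(gamma)))
```

Because both test functions decay like a Gaussian, truncating the integral to a bounded interval discards only a negligible tail; for slowly decaying functions a much larger radius would be needed.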
10.4 CLOSED, INVARIANT SUBSPACES OF $L^2(\mathbb{R})$

We will begin by defining the Fourier transform of each $f \in L^2(\mathbb{R})$ by means of Schwartz approximations. Then we will use the $L^2$-Fourier transform to classify all the closed, translation-invariant subspaces of $L^2(\mathbb{R})$.
10.4.1 The Fourier Transform in $L^2(\mathbb{R})$

We caution the reader that we cannot define

$$\hat f(\gamma) = \int_{\mathbb{R}} f(x)\, e^{-2\pi i x \gamma}\, dx$$

since $L^2(\mathbb{R}) \not\subset L^1(\mathbb{R})$, and thus the integral need not exist. This deficiency can be remedied, thanks to the density of $\mathcal{S}(\mathbb{R})$ in $L^2(\mathbb{R})$.
Lemma 10.4.1 The vector space $\mathcal{S}(\mathbb{R})$ of Schwartz functions is dense in the Hilbert space $L^2(\mathbb{R})$.

Proof: It will suffice to prove that $C_c^\infty(\mathbb{R})$ is dense in $L^2(\mathbb{R})$. We have shown, in Exercise 5.41, that the space $S$ of step functions¹²² is dense in $L^1(\mathbb{R})$. This implies that $S$ is dense in $L^2(\mathbb{R})$ as well, since each integral of a nonnegative function is the supremum of the integrals of its truncations in both domain and range. Very much as we did for Equation (10.12), in the proof of Theorem 10.3.1, we can construct a $C_c^\infty(\mathbb{R})$-function $\phi$ that comes as close as we like in $L^2$-norm to any given indicator function of an interval, and hence as close as we like to any step function $f$. The method is to provide a $C^\infty$-interpolation between any two arbitrary heights, $a$ and $b$, over a width in the domain that is positive but as small as desired. In this manner, a square integrable step function can be approximated as closely as desired by means of a $C_c^\infty$-function, completing the proof of the lemma.
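The interpolation step in this proof can be made concrete. The sketch below is one possible construction, assumed here for illustration and not necessarily the one used for Equation (10.12): it builds a $C^\infty$ step from $e^{-1/t}$, uses it to smooth the indicator function of $[0, 1]$ over ramps of width $\varepsilon$, and estimates the $L^2$-distance between the indicator and its smoothed version by a midpoint rule. Since the two functions differ only on the two ramps, and by at most $1$ there, the distance is at most $\sqrt{2\varepsilon}$. All function names are hypothetical.

```python
import math


def flat_exponential(t):
    """exp(-1/t) for t > 0 and 0 otherwise: smooth, with all derivatives 0 at t = 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0


def smooth_step(t):
    """C-infinity function equal to 0 for t <= 0 and to 1 for t >= 1."""
    a = flat_exponential(t)
    b = flat_exponential(1.0 - t)
    return a / (a + b)


def smoothed_indicator(x, eps):
    """Smooth function supported in [0, 1] that equals 1 on [eps, 1 - eps].

    Assumes 0 < eps <= 1/2, so that the two ramps do not overlap."""
    return smooth_step(x / eps) * smooth_step((1.0 - x) / eps)


def l2_distance_to_indicator(eps, steps=20000):
    """Midpoint-rule estimate of the L^2 distance between the indicator of
    [0, 1] and its smoothed version; both functions vanish outside [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        d = 1.0 - smoothed_indicator(x, eps)
        total += d * d * h
    return math.sqrt(total)


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        print(eps, l2_distance_to_indicator(eps), math.sqrt(2.0 * eps))
```

Shrinking the ramp width drives the distance to zero, which is the quantitative content of the phrase "as small as desired" in the proof.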
If we take a sequence of Schwartz functions $f_n$ such that $\|f_n - f\|_2 \to 0$, then the fact that $f_n$ is Cauchy in the $L^2$-norm implies that the sequence $\widehat{f_n}$ is also $L^2$-Cauchy, thanks to Theorem 10.3.1. We make the following definition, which will require justification.

Definition 10.4.1 For $f \in L^2(\mathbb{R})$, we define $\hat f$ to be the $L^2$-limit of $\widehat{f_n}$ for each sequence of Schwartz functions $f_n \to f$ in the $L^2$-norm.

It should be noted that we have not defined $\hat f$ pointwise as a function of $\gamma \in \mathbb{R}$. Rather, we have defined the Fourier transform as an $L^2$-equivalence class by invoking the completeness of $L^2$. We must show that this definition is independent of the choice of the sequence of Schwartz functions $f_n$. Also, we have at this point two different definitions of the Fourier transform for those functions that are in $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, and these must be shown to agree.
Lemma 10.4.2 If $f_n$ and $g_n$ are any two sequences of Schwartz functions converging in the $L^2$-norm to $f$, then

$$\lim_{n \to \infty} \widehat{f_n} = \lim_{n \to \infty} \widehat{g_n}$$

as elements of $L^2(\mathbb{R})$.

Proof: Applying Theorem 10.3.1 again, we see that

$$\|\widehat{f_n} - \widehat{g_n}\|_2 = \|f_n - g_n\|_2 \le \|f_n - f\|_2 + \|f - g_n\|_2 \to 0.$$

122 The reader should take care not to confuse the space $S$ of step functions with the space $\mathcal{S}(\mathbb{R})$ of Schwartz functions.
123 See Exercise 5.62 in [20] for the details of this useful technique.
Corollary 10.4.1 The Fourier transform is a linear isometry of $L^2(\mathbb{R})$ onto itself. Moreover, the inverse Fourier transform, denoted by $\check{\ }$, is a well-defined isometric surjection of $L^2(\mathbb{R})$. Both transforms preserve the Hermitian scalar product, and this is called Parseval's Identity for $L^2(\mathbb{R})$.

Proof: If we have Schwartz functions $f_n \to f$ and $g_n \to g$, then $\alpha f_n + g_n \to \alpha f + g$, and

$$(\alpha f + g)^\wedge = \lim_{n \to \infty} (\alpha f_n + g_n)^\wedge = \alpha \hat f + \hat g,$$

proving linearity. The Fourier transform is an $L^2$-isometry because

$$\|\hat f\|_2 = \lim_{n \to \infty} \|\widehat{f_n}\|_2 = \lim_{n \to \infty} \|f_n\|_2 = \|f\|_2.$$
To see that the Fourier transform is a surjection from $L^2$ onto itself, let $f \in L^2(\mathbb{R})$ be arbitrary. We pick Schwartz functions $f_n \to f$, and we invoke Exercise 10.13.b, which tells us that

$$f = \lim_{n \to \infty} f_n = \lim_{n \to \infty} \bigl(\hat{\hat{\hat{f_n}}}\bigr)^\wedge = \bigl(\hat{\hat{\hat{f}}}\bigr)^\wedge.$$

Thus $f$ lies in the range of the Fourier transform. This shows also that the fourth power of the Fourier transform is equal to the identity operator on $L^2(\mathbb{R})$, just as it is on $\mathcal{S}(\mathbb{R})$. Moreover, the third power of the Fourier transform must be the inverse Fourier transform: $\hat{\hat{\hat{g}}} = \check g$ for all $g \in L^2(\mathbb{R})$. The final conclusion follows from the Parallelogram Law, which determines the Hermitian scalar product of a complex inner product space in terms of its associated norm.

We should note that if $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, then we have defined $\hat f$ in two distinct ways: first as an integral and then as a limit of transforms of Schwartz functions. The following lemma establishes that these two definitions for the Fourier transform of such a function coincide.
Lemma 10.4.3 For each $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, the $L^2$-Fourier transform of $f$ is equal almost everywhere to $\int_{\mathbb{R}} f(x)\, e^{-2\pi i x \gamma}\, dx$.

Proof: Let $f_N = f\, 1_{[-N, N]}$ for each $N \in \mathbb{N}$. We see that $f_N$ converges in the norms of both $L^1$ and $L^2$ to $f$. Denote by $\widehat{f_N}$ the $L^1$-Fourier transform of $f_N$. Lebesgue Dominated Convergence tells us that $\widehat{f_N} \to$ the $L^1$-Fourier transform of $f$. From Lemma 5.5.1 concerning the pointwise convergence of rapidly $L^1$-Cauchy sequences, we know that there is a subsequence $\widehat{f_{N_j}}$ that converges pointwise almost everywhere to the $L^1$-transform of $f$. However, since $f_{N_j}$ converges also in the $L^2$-norm to $f$, the $L^2$-transforms $\widehat{f_{N_j}}$ converge in $L^2$-norm to $\hat f$. Passing to a suitable subsequence again (as in the proof of completeness of $L^p$) would ensure pointwise convergence almost everywhere to a function in the $L^2$-equivalence class of the $L^2$-Fourier transform of $f$. Thus it would suffice to know that for the compactly supported function $f_N$, the $L^1$- and $L^2$-Fourier transforms coincide. To this end, pick a sequence $\phi_n$ in $C_c^\infty[-N, N]$ that converges in $L^2$ to $f_N$. Note that

$$\widehat{\phi_n}(\gamma) = \langle \phi_n, \chi_\gamma \rangle \to \langle f_N, \chi_\gamma \rangle$$
and that the latter Hermitian scalar product is the $L^1$-Fourier transform of $f_N$, expressed in terms of the scalar product for $L^2[-N, N]$. Here we benefit from the fact that the characters $\chi_\gamma \in L^2[-N, N]$, though they do not belong to $L^2(\mathbb{R})$. By passing implicitly to a subsequence in $n$ if needed, we can assume without loss of generality that

for almost all $\gamma$. But since $\phi_n$ converges in $L^2$ to $f_N$, we have $\widehat{\phi_n}$ converging in the $L^2$-norm to the $L^2$-transform $\widehat{f_N}$. Passing again, as needed, to a suitable subsequence, we get pointwise convergence almost everywhere. Thus the two concepts of Fourier transform for $f_N$ agree almost everywhere, and the proof is complete.
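As a small numerical companion to Lemma 10.4.3 and Corollary 10.4.1, consider the indicator function $f = 1_{[-1/2, 1/2]}$, which lies in $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ with $\|f\|_2^2 = 1$. The integral formula gives $\hat f(\gamma) = \sin(\pi\gamma)/(\pi\gamma)$, and the isometry property predicts $\|\hat f\|_2^2 = 1$ as well. The Python sketch below (function names and step size are choices made here, not taken from the text) integrates $|\hat f|^2$ over $[-R, R]$ by a midpoint rule; the truncated values should increase toward $1$ as $R$ grows, with a tail that decays only like $1/R$.

```python
import math


def sinc(gamma):
    """Integral of exp(-2*pi*i*x*gamma) over [-1/2, 1/2], i.e. the
    L^1-Fourier transform of the indicator function of [-1/2, 1/2]."""
    if gamma == 0.0:
        return 1.0
    return math.sin(math.pi * gamma) / (math.pi * gamma)


def truncated_energy(radius, h=0.01):
    """Midpoint-rule approximation of the integral of sinc(gamma)^2
    over [-radius, radius]."""
    steps = int(round(2.0 * radius / h))
    total = 0.0
    for k in range(steps):
        gamma = -radius + (k + 0.5) * h
        total += sinc(gamma) ** 2 * h
    return total


if __name__ == "__main__":
    for radius in (10.0, 50.0, 200.0):
        print(radius, truncated_energy(radius))
```

The slow convergence reflects the fact that $\hat f$ decays only like $1/|\gamma|$: it is square integrable, but the computation gives no reason to expect it to be integrable.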
10.4.2 Translation-Invariant Subspaces of $L^2(\mathbb{R})$

If $H$ is a nontrivial closed, proper subspace of $L^2(\mathbb{R})$, then Exercise 9.19 ensures that $H$ has an orthogonal complement $H^\perp$ such that $L^2(\mathbb{R}) = H \oplus H^\perp$. If $H$ is a translation-invariant subspace, then the reader will show in Exercise 10.16 that this implies the translation invariance of $H^\perp$ as well.
Lemma 10.4.4 Let $E$ be any Lebesgue measurable subset of the real line, $\mathbb{R}$, and let

$$H = \{ f \in L^2(\mathbb{R}) \mid \hat f(\gamma) = 0 \text{ on } E^c = \mathbb{R} \setminus E \}. \tag{10.13}$$

Then $H$ is a closed, translation-invariant subspace of $L^2(\mathbb{R})$, and we call $E$ the spectrum, $\widehat{H}$, of the closed, translation-invariant subspace $H$.

Another measurable set, $E'$, would have the property that $\widehat{H} = E'$ if and only if $l(E \,\triangle\, E') = 0$, so that $E \sim E'$. That is, the spectrum of a closed, translation-invariant subspace of $L^2(\mathbb{R})$ is determined up to a null set.
Proof: Recall that the Fourier transform is an isometry of $L^2(\mathbb{R})$. It follows that if $f_n \to f$ is any Cauchy sequence in $H$, then $\widehat{f_n}$ is Cauchy as well, and $\widehat{f_n} \to \hat f$ in $L^2$-norm. Some subsequence of $\widehat{f_n}$ is pointwise convergent almost everywhere. This implies that $\hat f(\gamma) = 0$ almost everywhere on $E^c$, so that $f \in H$, which is thus shown to be closed. We claim that $H$ is translation-invariant as well. Let $f \in H$ and take any sequence of Schwartz functions $f_n \to f$ in the $L^2$-norm. For each real number $a$, denote $f_a(x) = f(x + a)$, the translation of $f$ by $a$. Then $\|(f_n)_a - f_a\|_2 \to 0$ as well. But

$$\widehat{(f_n)_a}(\gamma) = \chi_a(\gamma)\, \widehat{f_n}(\gamma).$$

It follows again by a subsequence argument that $\widehat{f_a}(\gamma) = 0$ almost everywhere on $E^c$, proving that $f_a \in H$. ∎

The next theorem asserts that every closed, translation-invariant subspace of $L^2(\mathbb{R})$ is determined by its spectrum, as in Equation (10.13). This will yield a bijection between the family of all closed, translation-invariant subspaces of $L^2(\mathbb{R})$ and the metric space of measurable subsets of $\mathbb{R}$, in which two sets are identified if the measure of their symmetric difference is zero.
Theorem 10.4.1 For each closed, translation-invariant subspace $H$ in $L^2(\mathbb{R})$, there is a measurable set $E$, determined up to a null set, which serves as the spectrum, $\widehat{H}$, of $H$ as in Equation (10.13).

Proof: Let $H$ be any closed, proper, nontrivial, translation-invariant subspace of $L^2(\mathbb{R})$. Then

$$L^2(\mathbb{R}) = H \oplus H^\perp$$

by Exercise 9.19. For each vector $f \in L^2(\mathbb{R})$, there exists a unique decomposition of the form $f = Pf + P^\perp f$, where $Pf \in H$ and $P^\perp f \in H^\perp$. The mappings $P$ and $P^\perp$ are called the orthogonal projections onto $H$ and $H^\perp$, respectively. Since $L^2(\mathbb{R})$ is separable, there exists a countable dense subset $\{f_n\}$, and the reader will show in Exercise 10.17 that the set $\{ Pf_n \mid n \in \mathbb{N} \}$ is dense in $H$. Thus $H$ is separable, and it has an orthonormal basis $\{ e_n \mid n \in \mathbb{N} \}$. Define a measurable set

$$E = \bigcup_{n \in \mathbb{N}} \{ \gamma \in \mathbb{R} \mid \widehat{e_n}(\gamma) \ne 0 \}.$$

Note that for each $f \in H$,

$$\hat f = \sum_{n \in \mathbb{N}} \langle f, e_n \rangle\, \widehat{e_n},$$

an $L^2$-convergent sum, so that $\hat f\, 1_{E^c} = 0$ almost everywhere. We need to show that if $f \in L^2(\mathbb{R})$ is chosen subject only to the requirement that $\hat f = 0$ almost everywhere on $E^c$, then $f \in H$, implying that $E$ is the spectrum, $\widehat{H}$, of $H$. To this end, let

$$g = f - \sum_{n \in \mathbb{N}} \langle f, e_n \rangle\, e_n,$$

and it will suffice to prove that $g = 0$. We know at this point that $\hat g = 0$ on $E^c$, so it will suffice to prove that $\hat g\, 1_E = 0$ as well. As we have observed earlier, for each translation of an $L^2$ function $\phi$ by $y$, $\widehat{\phi_y}(\gamma) = \chi_y(\gamma)\, \hat\phi(\gamma)$ almost everywhere. So take $\phi \in H$ and recall that, by its definition, $g \in H^\perp$. Thus, by Parseval's Identity (Corollary 10.4.1) for $L^2(\mathbb{R})$,¹²⁴

$$0 = \langle g, \phi_y \rangle = \langle \hat g, \widehat{\phi_y} \rangle = \langle \hat g, \chi_y \hat\phi \rangle = \int_{\mathbb{R}} \hat g(\gamma)\, \overline{\hat\phi(\gamma)}\, e^{-2\pi i y \gamma}\, d\gamma$$

for all $y \in \mathbb{R}$. Thus $\hat g\, \overline{\hat\phi} = 0$ almost everywhere, including on $E$. Since $\phi$ was arbitrary, and could have been any of the functions $e_n$ for example, it follows that $\hat g = 0$ even on $E$, so that $g = 0$. ∎
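A concrete instance of Equation (10.13) may help fix ideas. Take $E = [-1, 1]$, so that $H$ consists of the band-limited functions whose transforms vanish off $[-1, 1]$. The function $f(x) = \bigl(\sin(\pi x)/(\pi x)\bigr)^2$ has Fourier transform $\max(0, 1 - |\gamma|)$, so $f \in H$, and translation invariance predicts that every translate $f_a(x) = f(x + a)$ also has a transform vanishing off $E$. The sketch below checks this numerically for one translate; the function names, the truncation radius, and the step size are assumptions of this illustration, and the truncation error is of order $1/R$.

```python
import cmath
import math


def f(x):
    """sinc(x)^2, whose Fourier transform is the triangle max(0, 1 - |gamma|)."""
    if x == 0.0:
        return 1.0
    s = math.sin(math.pi * x) / (math.pi * x)
    return s * s


def transform_of_translate(a, gamma, radius=200.0, h=0.01):
    """Midpoint-rule approximation of the Fourier transform at gamma of
    the translate f_a(x) = f(x + a), truncated to [-radius, radius]."""
    steps = int(round(2.0 * radius / h))
    total = 0j
    for k in range(steps):
        x = -radius + (k + 0.5) * h
        total += f(x + a) * cmath.exp(-2j * math.pi * x * gamma)
    return total * h


if __name__ == "__main__":
    a = 0.3
    # Inside the spectrum E = [-1, 1] the modulus should be near 1 - |gamma|;
    # outside E it should be near zero.
    print(abs(transform_of_translate(a, 0.5)))
    print(abs(transform_of_translate(a, 1.5)))
```

Translation changes only the phase of the transform, through the factor $\chi_a(\gamma)$, which is why the modulus inside $E$ is unaffected by the choice of $a$.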
10.4.3 The Fourier Transform and Direct Integrals

In Theorem 10.2.1, we showed how to decompose $L^2(\mathbb{T})$ into the direct sum of one-dimensional, translation-invariant subspaces of $L^2(\mathbb{T})$. Each one-dimensional translation-invariant subspace of $L^2(\mathbb{T})$ is called also an irreducible translation-invariant subspace, because it has no proper, closed, nontrivial, translation-invariant subspaces. Indeed, each of the one-dimensional spaces has no nontrivial, proper subspaces at all. We will see that the direct sum of Hilbert spaces, as defined in Definition 10.2.1, is a special case of the concept of a direct integral of Hilbert spaces.
Definition 10.4.2 Let $(A, \mathfrak{A}, \mu)$ be a measure space. Suppose there is a set of Hilbert spaces $\mathcal{H}_\alpha$ indexed by the set $A$. The direct integral

$$\mathcal{H} = \int_A^{\oplus} \mathcal{H}_\alpha\, d\mu$$

of the Hilbert spaces $\mathcal{H}_\alpha$ is the set of all functions $f$ such that $f(\alpha) \in \mathcal{H}_\alpha$ for each $\alpha \in A$ and such that

$$\int_A \|f(\alpha)\|_\alpha^2\, d\mu(\alpha) < \infty.$$

We define a Hermitian scalar product on $\mathcal{H}$ by

$$\langle f, g \rangle = \int_A \langle f(\alpha), g(\alpha) \rangle_\alpha\, d\mu(\alpha),$$

where $\langle \cdot, \cdot \rangle_\alpha$ denotes the Hermitian scalar product in the Hilbert space $\mathcal{H}_\alpha$, and where $\|\cdot\|_\alpha$ denotes the corresponding Hilbert space norm.

The reader will show in Exercise 10.18 that the direct integral is itself a Hilbert space. Moreover, the direct sum is a special case of the direct integral, in which the measure $\mu$ is counting measure on a countable space. Thus we see that the Fourier transform provides a decomposition of $L^2(\mathbb{T})$ as a countable, discrete direct integral of irreducible, closed, translation-invariant subspaces of $L^2(\mathbb{T})$ itself. The situation is different for $L^2(\mathbb{R})$. Here the Fourier transform provides an isomorphism of Hilbert spaces between $L^2(\mathbb{R})$ and the direct integral over the real line of one-dimensional, irreducible, translation-invariant spaces $\mathcal{H}_\alpha = \mathbb{C}\chi_\alpha$. Thus

$$L^2(\mathbb{R}) \cong \int_{\mathbb{R}}^{\oplus} \mathcal{H}_\alpha\, d\alpha.$$

However, it is important to note that the spaces $\mathcal{H}_\alpha$ are not subspaces of $L^2(\mathbb{R})$, since each nonzero function in $\mathcal{H}_\alpha$ has constant modulus. Thus the Fourier transform in $L^2(\mathbb{R})$ leads one to a more abstract version of harmonic analysis in which the original space, $L^2(\mathbb{R})$, is analyzed in terms of irreducible, translation-invariant spaces that exist only externally to that original space.

124 Here we use the Fourier transform according to Definition 10.4.1, not in the sense of an abstract Fourier transform in any separable Hilbert space with respect to a specified orthonormal basis.

EXERCISES
10.16 If $H$ is any closed, translation-invariant subspace of $L^2(\mathbb{R})$, then $H^\perp$ is also closed and translation-invariant. (Hint: Use the fact that Lebesgue measure is translation-invariant.)

10.17 Show that each closed vector subspace of a separable Hilbert space is separable.

10.18
a) Prove that the direct integral of Hilbert spaces, as in Definition 10.4.2, is itself a Hilbert space with respect to the scalar product from that definition.
b) Show that the direct sum of Hilbert spaces in Definition 10.2.1 is a special case of the direct integral of Hilbert spaces.

10.5 IRREDUCIBILITY OF $L^2(\mathbb{R})$ UNDER TRANSLATIONS AND ROTATIONS

It is a consequence of Theorem 10.4.1 that every nontrivial, closed, proper, translation-invariant subspace of $L^2(\mathbb{R})$ has nontrivial closed, proper, translation-invariant subspaces. Here we will show that if we act upon $L^2(\mathbb{R})$ with all translations and rotations, then $L^2(\mathbb{R})$ has no nontrivial, closed, proper invariant subspaces. That is, we will show that $L^2(\mathbb{R})$ is irreducible with respect to the combined action of translations and rotations.

By rotations, in this context, we mean all operators that act on functions $f \in L^2(\mathbb{R})$ by multiplication $M_{\chi_\alpha}$ by a character $\chi_\alpha(x) = e^{2\pi i \alpha x}$. Thus

$$(M_{\chi_\alpha} f)(x) = \chi_\alpha(x)\, f(x)$$

for all $x \in \mathbb{R}$. A rotation rotates the values of $f$ in the complex plane. The reader should note easily that $M_{\chi_\alpha}$ does map $L^2(\mathbb{R})$ into itself. Thus we will prove here that if $H$ is a closed, nontrivial subspace of $L^2(\mathbb{R})$ that is invariant under all translations and all multiplications by characters, then $H = L^2(\mathbb{R})$. After this work is done, we will explain the connection of this theorem with the Heisenberg group, the Heisenberg Uncertainty Principle, the Schrödinger model of the position and momentum operators in quantum mechanics, and a theorem of Stone and von Neumann.
Theorem 10.5.1 Let $H$ be any closed, nontrivial subspace of $L^2(\mathbb{R})$ that is invariant under the actions of all the multiplications $M_{\chi_\alpha}(f)(x) = \chi_\alpha(x) f(x)$ and invariant under all the translations $T_x(f)(t) = f(t + x)$. Then $H = L^2(\mathbb{R})$.

Proof: By Theorem 10.4.1, there exists a measurable set $\widehat{H}$, called the spectrum of $H$, such that

$$H = \{ f \in L^2(\mathbb{R}) \mid \hat f\, 1_{\widehat{H}^c} = 0 \text{ a.e.} \}.$$

We need to prove that $\widehat{H}^c$ is a Lebesgue null set. Suppose that $l(\widehat{H}^c) > 0$. We will deduce a contradiction. Since $H$ is nontrivial, there exists a function $f \ne 0$ in $L^2(\mathbb{R})$ for which $f \in H$. Thus the set

$$S_f = \{ \alpha \in \mathbb{R} \mid \hat f(\alpha) \ne 0 \}$$

has strictly positive Lebesgue measure. We know from any one of the Exercises 3.26, 6.11, or 7.21¹²⁵ that there exists $\beta \in \mathbb{R}$ such that

$$l\bigl( (\beta + S_f) \cap \widehat{H}^c \bigr) > 0.$$

We claim that

$$(M_{\chi_\beta} f)^\wedge(\alpha) = \hat f(\alpha - \beta)$$

for almost all $\alpha$. This would imply that

$$S_{M_{\chi_\beta} f} = \beta + S_f.$$

For functions in $L^1(\mathbb{R})$ this follows immediately from the definition of the Fourier transform. If $f \in L^2(\mathbb{R})$ we can define $f_n = f\, 1_{[-n, n]}$, so that $f_n$ lies in $L^1(\mathbb{R})$ for each $n$ and $f_n \to f$ in the $L^2$-norm. By passing to a suitable subsequence $f_{n_j}$ of the sequence $f_n$, we can be assured that $\widehat{f_{n_j}} \to \hat f$ almost everywhere as well, and this proves the claim. The proof of the theorem is complete, since $M_{\chi_\beta} f \in H$ and because $\widehat{H}^c$ is disjoint from $\widehat{H}$. ∎
125 The reader may enjoy noting how any one of the measure-theoretic exercises cited here can play the crucial role in the proof of this theorem.
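The two transform rules used in this section, namely that translation by $a$ multiplies the transform by the character $\chi_a$, and that multiplication by $\chi_\beta$ shifts the transform by $\beta$, can be checked numerically on the Gaussian $e^{-\pi x^2}$, which is its own Fourier transform. The sketch below uses the conventions $f_a(x) = f(x + a)$ and $\chi_\beta(x) = e^{2\pi i \beta x}$ from the text; the helper names and the quadrature parameters are assumptions made for this illustration.

```python
import cmath
import math


def gaussian(x):
    """exp(-pi x^2), which equals its own Fourier transform."""
    return math.exp(-math.pi * x * x)


def transform(f, gamma, radius=10.0, steps=5000):
    """Midpoint-rule approximation of the integral of
    f(x) * exp(-2*pi*i*x*gamma) over [-radius, radius]."""
    h = 2.0 * radius / steps
    total = 0j
    for k in range(steps):
        x = -radius + (k + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * x * gamma)
    return total * h


A = 0.7      # translation parameter a
BETA = 1.25  # index of the character chi_beta


def translated(x):
    """f_a(x) = f(x + a) for the Gaussian f."""
    return gaussian(x + A)


def modulated(x):
    """(M_{chi_beta} f)(x) = exp(2*pi*i*beta*x) * f(x) for the Gaussian f."""
    return cmath.exp(2j * math.pi * BETA * x) * gaussian(x)


if __name__ == "__main__":
    for gamma in (-1.0, 0.3, 2.0):
        shift_rule = cmath.exp(2j * math.pi * A * gamma) * gaussian(gamma)
        # Both printed differences should be close to zero.
        print(gamma,
              abs(transform(translated, gamma) - shift_rule),
              abs(transform(modulated, gamma) - gaussian(gamma - BETA)))
```

In the proof of Theorem 10.5.1 it is the second rule that moves the set $S_f$ to $\beta + S_f$, while the first rule leaves $S_f$ unchanged.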
10.5.1 Position and Momentum Operators

It is natural to consider the action of $\mathbb{R}$ by translation upon $L^2(\mathbb{R})$, but the reader may wonder why we consider here the combined actions of translation and rotation on $L^2(\mathbb{R})$. In order to address this question, we present a brief description of the quantum mechanical formalism for the position and momentum operators of one isolated quantum (particle) having only one degree of freedom, meaning that it is able to move only along the real line, $\mathbb{R}$. Our discussion is only a sketch, and we will not concern ourselves with physical constants, however important, treating them as though they were 1 wherever this is convenient. Our main purpose is to explain how the combined action of translations and rotations stems from the action of a nonabelian group, called the Heisenberg group, on $L^2(\mathbb{R})$ and what this action has to do with physics.

The state of the particle is interpreted as being a complex-valued function $\phi$ in $L^2(\mathbb{R})$, having the additional property that $\|\phi\|_2 = 1$. This makes $|\phi|^2$ into a probability density function, meaning that the probability that the position $x$ of the quantum is between the coordinates $a$ and $b$ is given by

$$P(a \le x \le b) = \int_a^b |\phi(x)|^2\, dx.$$

The expected value of the position is given by

$$\int_{\mathbb{R}} x\, |\phi(x)|^2\, dx.$$

It is thus natural to define the position operator $P$ by

$$(P\phi)(x) = x\phi(x),$$

so that the expectation of the position is $\langle P\phi, \phi \rangle$, and this scalar product is a real number, although the function $\phi$ is complex-valued. Note that the domain of $P$ is

$$D_P = \{ \phi \in L^2(\mathbb{R}) \mid x\phi \in L^2(\mathbb{R}) \}$$

and that this is a dense, but not closed, subspace of $L^2(\mathbb{R})$.

Conceptually, the quantum can be regarded as being the state function $\phi$, and it can be interpreted as being present at all locations in the support of $\phi$. Probability enters the picture as soon as one makes a macroscopic observation or measurement to try to detect the presence of the quantum. As evidence for this abstract notion of the position of a particle, physicists cite an experiment in which a single quantum is released on one side of a barrier that has two parallel slits cut into it. If one places detectors at the two slits, it will turn out that the quantum passes through either one slit or the other, not through both. The probability of the quantum being located at either slit is governed by the probability distribution $|\phi|^2$. However, if no detectors are placed at the slits and a detection screen is placed opposite the wall that separates the quantum from the detector, then it turns out that a diffraction pattern appears on the screen. The pattern that is produced shows that the quantum has passed wave-like through both slits, creating a diffraction pattern on the detection screen by interference with itself.

The Fourier transform $\hat\phi$ permits one to express $\phi$ as an integral of characters $\chi_\nu$. Each index $\nu$ corresponds to a pure frequency having an energy level $E = h\nu$, $h$ being Planck's constant. The index $\nu$ is taken also as corresponding to momentum.
We note that $\|\hat\phi\|_2^2 = 1$, thanks to the Plancherel identity. Thus $|\hat\phi|^2$ is a probability density function. The probability that the momentum $M$ is between $a$ and $b$ is given by

$$\int_a^b |\hat\phi(\nu)|^2\, d\nu.$$

The expected value of the momentum is given by

$$\int_{\mathbb{R}} \nu\, |\hat\phi(\nu)|^2\, d\nu.$$

However,

$$\nu\, \hat\phi(\nu) = \frac{1}{2\pi i}\, \widehat{\phi'}(\nu).$$

Hence we define the momentum operator $Q$ by

$$Q\phi = \frac{-i}{2\pi}\, \phi',$$

and the domain $D_Q$ of $Q$ is the set of those square integrable functions $\phi$ such that the derivative $\phi'$ exists and is square integrable. This domain is a dense, but not closed, subspace of $L^2(\mathbb{R})$. The reader will prove in Exercise 10.19 that

$$PQ - QP = \frac{i}{2\pi}\, I, \tag{10.14}$$

where $I$ is the identity operator, restricted to the domain

$$D = Q^{-1}(D_P) \cap P^{-1}(D_Q).$$
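Equation (10.14) can be verified mechanically on a convenient family of functions in $D$. The sketch below represents $\phi(x) = p(x)\, e^{-\pi x^2}$ by the coefficient list of the polynomial $p$; on such functions $P$ acts by $p \mapsto xp$, and, since $\frac{d}{dx}\bigl(p\, e^{-\pi x^2}\bigr) = (p' - 2\pi x p)\, e^{-\pi x^2}$, the operator $Q = \frac{-i}{2\pi}\frac{d}{dx}$ acts by $p \mapsto \frac{-i}{2\pi}(p' - 2\pi x p)$. The representation and all function names are assumptions of this illustration; it checks the commutation relation on this family only and is not a substitute for Exercise 10.19.

```python
import math


def poly_add(p, q):
    """Add two polynomials given as coefficient lists (lowest degree first)."""
    n = max(len(p), len(q))
    return [(p[k] if k < len(p) else 0) + (q[k] if k < len(q) else 0)
            for k in range(n)]


def poly_scale(c, p):
    """Multiply every coefficient of p by the scalar c."""
    return [c * a for a in p]


def times_x(p):
    """Coefficients of x * p(x)."""
    return [0] + list(p)


def derivative(p):
    """Coefficients of p'(x)."""
    return [k * p[k] for k in range(1, len(p))]


def position(p):
    """P acting on p(x) exp(-pi x^2): multiply by x."""
    return times_x(p)


def momentum(p):
    """Q = (-i / 2 pi) d/dx acting on p(x) exp(-pi x^2)."""
    inner = poly_add(derivative(p), poly_scale(-2.0 * math.pi, times_x(p)))
    return poly_scale(-1j / (2.0 * math.pi), inner)


def commutator(p):
    """Coefficients representing (PQ - QP) applied to p(x) exp(-pi x^2)."""
    return poly_add(position(momentum(p)),
                    poly_scale(-1, momentum(position(p))))


if __name__ == "__main__":
    p = [1, 2 - 1j, 0, 3]
    print(commutator(p))                        # up to rounding, (i / 2 pi) * p
    print(poly_scale(1j / (2.0 * math.pi), p))
```

The terms of degree higher than that of $p$ cancel between $PQ$ and $QP$, up to rounding, which is why the commutator is a multiple of the identity even though $P$ and $Q$ each change the polynomial factor.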
According to the Heisenberg Uncertainty Principle, the failure of the operators $P$ and $Q$ to commute with one another means that the result of the combination of position and momentum operators is dependent upon the order in which the two operators are applied. Physically, each measurement (of position or momentum) alters the subsequent measurement of the other. Thus it matters which operation is performed first and which second.

10.5.2 The Heisenberg Group

In our brief survey of the quantum mechanical formalism for position and momentum, we have derived Equation (10.14) from the definitions given here for the position and momentum operators, $P$ and $Q$, respectively. More generally, Equation (10.14) is taken as fundamental, and the formulas given here for $P$ and $Q$ are Schrödinger's model (concrete realization, or example) of operators satisfying the Heisenberg Commutation Relation. A fundamental mathematical question that arises is whether or not Schrödinger's model is unique in some suitable sense as an operator solution to Equation (10.14).

The Heisenberg group, $\mathbb{H}$, is the group of all real matrices of the form

$$\begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix}$$

with the operation being ordinary matrix multiplication. Since these upper-triangular matrices depend only upon the three real parameters $x$, $y$, and $z$, we denote the matrix by the more efficient symbol $(x, y, z)$, understanding this to represent the full $3 \times 3$ matrix. The reader should check easily that the multiplication in the Heisenberg group is given by

$$(x, y, z)(x', y', z') = (x + x',\; y + y',\; z + z' + xy')$$

and that this multiplication is nonabelian. (See Exercises 10.20 and 10.21.) For each $(x, y, z) \in \mathbb{H}$, we define the following action on $L^2(\mathbb{R})$:

$$\pi_{(x, y, z)} = e^{2\pi i z}\, M_{\chi_y} T_x, \tag{10.15}$$

where $T_x$ denotes translation by $x$ and $M_{\chi_y}$ denotes multiplication by the character $\chi_y$. Of course, $L^2$-functions are defined only almost everywhere, so the preceding equation should be understood as applying pointwise almost everywhere. The reader will show in Exercise 10.22 that $\pi$ is a continuous homomorphism of the group $\mathbb{H}$ into the group of norm-preserving Hilbert space automorphisms of $L^2(\mathbb{R})$. We see at once that the operators $\pi_{(x, 0, 0)}$ for all $x \in \mathbb{R}$ include all translation operators on $L^2(\mathbb{R})$. The operators $\pi_{(0, y, 0)}$ provide all the rotation operators. Theorem 10.5.1 tells us that $L^2(\mathbb{R})$ has no nontrivial closed, proper invariant subspaces under the action of $\pi$. The representation $\pi$ is said to be irreducible because of the absence of nontrivial, closed, proper $\pi$-invariant subspaces. It is called unitary because $\pi$ acts in such a way as to preserve the Hermitian scalar product of $L^2(\mathbb{R})$. It is a simple calculation to show that

$$P = \frac{1}{2\pi i}\, \frac{\partial}{\partial y}\bigg|_{y=0} \pi_{(0, y, 0)} \quad\text{and}\quad Q = \frac{1}{2\pi i}\, \frac{\partial}{\partial x}\bigg|_{x=0} \pi_{(x, 0, 0)}. \tag{10.16}$$

Here it must be understood again that $P$ and $Q$ are defined only on that dense subspace of $L^2(\mathbb{R})$ consisting of functions that have images under $P$ and $Q$ that remain in $L^2(\mathbb{R})$. We see that the position and momentum operators of Schrödinger arise naturally by differentiation of the representation $\pi$ of the Heisenberg group, $\mathbb{H}$. In this context, the uniqueness property of the Schrödinger model for the solution operators $P$ and $Q$ of the Heisenberg Commutation Relation is established by a famous theorem of Stone and von Neumann. This theorem asserts that the representation $\pi$ is determined uniquely up to isomorphism of Hilbert space by its restriction to the center $Z(\mathbb{H}) = \{ (0, 0, z) \mid z \in \mathbb{R} \}$. In other words, any other representation $\sigma$, having the same restriction to the center, must have the property that there is an isomorphism $\tau$ of Hilbert space such that

$$\tau \circ \pi_{(x, y, z)} = \sigma_{(x, y, z)} \circ \tau$$

for all $(x, y, z) \in \mathbb{H}$. The reader who would like to study carefully the ideas sketched in the present section is referred to [19] and [4].
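The group law of the Heisenberg group is easy to experiment with. The following sketch assumes the upper-triangular parametrization in which $x$ and $z$ occupy the first row and $y$ the second row above the diagonal, which is the arrangement consistent with the stated product $(x, y, z)(x', y', z') = (x + x', y + y', z + z' + xy')$. It compares the triple formula with ordinary matrix multiplication and exhibits a pair of elements that do not commute. The function names, including the formula used in `inverse`, are introduced here for illustration.

```python
def heisenberg_matrix(x, y, z):
    """The 3 x 3 upper-triangular matrix abbreviated by the triple (x, y, z)."""
    return [[1, x, z],
            [0, 1, y],
            [0, 0, 1]]


def mat_mul(a, b):
    """Ordinary product of two 3 x 3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def multiply(g, h):
    """Group law on triples: (x, y, z)(x', y', z') = (x+x', y+y', z+z'+x*y')."""
    (x, y, z), (x2, y2, z2) = g, h
    return (x + x2, y + y2, z + z2 + x * y2)


def inverse(g):
    """Inverse triple, obtained by solving multiply(g, h) = (0, 0, 0) for h."""
    x, y, z = g
    return (-x, -y, -z + x * y)


if __name__ == "__main__":
    g, h = (1, 2, 3), (4, 5, 6)
    print(multiply(g, h), multiply(h, g))   # the third coordinates differ
    print(multiply(g, inverse(g)))
```

Only the third coordinate detects the order of multiplication, and elements of the form $(0, 0, z)$ commute with everything, in agreement with the description of the center $Z(\mathbb{H})$.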
EXERCISES

10.19 Prove Equation (10.14) by showing that

$$(PQ - QP)\phi = \frac{i}{2\pi}\, \phi$$

for each $\phi \in D$, the Heisenberg Commutation Relation for the position and momentum operators in quantum mechanics.

10.20 Show that the Heisenberg group is a nonabelian group, closed under both multiplication and inversion.

10.21 Show that Lebesgue measure on $\mathbb{R}^3$ is invariant under the operation of right translation by an arbitrary element of the Heisenberg group. That is, define for each point $(a, b, c)$ in $\mathbb{R}^3$ the mapping $T_{(a, b, c)} : \mathbb{R}^3 \to \mathbb{R}^3$ by

$$T_{(a, b, c)}(x, y, z) = (x + a,\; y + b,\; z + c + xb).$$

Show that this map preserves both Lebesgue measurability and Lebesgue measure. Show that Lebesgue measure is invariant under left translation as well.

10.22 Show that the action $\pi$ defined by Equation (10.15) is a homomorphism of the multiplicative group $\mathbb{H}$ into the group of all norm-preserving Hilbert space automorphisms of the space $L^2(\mathbb{R})$. In particular, show that

$$\pi_{(x, y, z)}\, \pi_{(x', y', z')} = \pi_{(x + x',\; y + y',\; z + z' + xy')},$$

and show that $\pi$ is continuous in the sense that $\|\pi_{(x, y, z)}\phi - \phi\|_2 \to 0$ for each $\phi \in L^2(\mathbb{R})$, as $(x, y, z) \to (0, 0, 0)$ in the sense of convergence in $\mathbb{R}^3$.

10.23 Prove Equations (10.16) by direct calculation.
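Before attempting Exercise 10.22, the homomorphism property can be tested pointwise on a sample function. The sketch below assumes that the action has the form $(\pi_{(x, y, z)} f)(t) = e^{2\pi i z}\, e^{2\pi i y t}\, f(t + x)$, which is the form forced by the product rule stated in Exercise 10.22 together with the conventions $T_x f(t) = f(t + x)$ and $\chi_y(t) = e^{2\pi i y t}$; if Equation (10.15) is normalized differently, the code should be adjusted to match. The names `pi_rep` and `multiply` are hypothetical.

```python
import cmath
import math


def multiply(g, h):
    """Heisenberg group law on triples (x, y, z)."""
    (x, y, z), (x2, y2, z2) = g, h
    return (x + x2, y + y2, z + z2 + x * y2)


def pi_rep(g, f):
    """Return the function pi_g f, assuming
    (pi_g f)(t) = exp(2 pi i z) * exp(2 pi i y t) * f(t + x)."""
    x, y, z = g

    def image(t):
        return cmath.exp(2j * math.pi * (z + y * t)) * f(t + x)

    return image


def gaussian(t):
    """Sample function exp(-pi t^2) in L^2(R)."""
    return math.exp(-math.pi * t * t)


if __name__ == "__main__":
    g = (0.3, -1.1, 0.25)
    h = (-0.7, 0.4, 1.5)
    composed = pi_rep(g, pi_rep(h, gaussian))      # pi_g applied to pi_h f
    combined = pi_rep(multiply(g, h), gaussian)    # pi_{gh} applied to f
    for t in (-1.0, 0.0, 0.5, 2.0):
        print(t, abs(composed(t) - combined(t)))   # each should be near zero
```

Since $|(\pi_g f)(t)| = |f(t + x)|$ pointwise, the norm-preserving property follows from the translation invariance of Lebesgue measure; the continuity statement in the exercise requires a separate argument.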
APPENDIX: THE BANACH-TARSKI THEOREM
A.1 THE LIMITS TO COUNTABLE ADDITIVITY
In Section 1.1, we considered the pivotal role of the discovery of incommensurable line segments in the development of Euclidean geometry. We identified commensurability problems as belonging to the early development of measure theory. In the pages that followed, we have learned much about Lebesgue's theory of measure and integration. We have seen how the Lebesgue theory yields complete normed linear spaces such as $L^p(X, \mathfrak{A}, \mu)$. We have learned about the dual spaces of the latter spaces, and about the dual of the space $C(X)$, if $X$ is a compact Hausdorff space. It would be difficult to overstate the importance of Lebesgue measure and integration throughout modern pure and applied analysis. The reader has seen that much effort must be made to ensure that we deal only with measurable sets and measurable functions in Lebesgue's theory. We showed in Example 3.1, using the Axiom of Choice, that no translation-invariant, countably additive measure can be defined on all the subsets $E \subseteq \mathbb{R}$ if $0 < \mu[0, 1) < \infty$.
In this appendix, we will discuss a theorem that is as challenging for modern mathematicians as incommensurable line segments were for their ancient Greek forerunners. It is a theorem of Banach and Tarski, and it is often called the Banach-Tarski Paradox. It is not a paradox in fact, but a theorem. This theorem is described as a paradox because of the extraordinarily counterintuitive nature of its conclusions. We begin with a definition.
Definition A.1.1 Two sets $A$ and $B$ in a metric space $(X, \rho)$ are said to be congruent by finite decomposition provided that there is a natural number $n$ such that it is possible to decompose $A$ and $B$ into disjoint unions

$$A = \bigcup_{k=1}^{n} A_k \quad\text{and}\quad B = \bigcup_{k=1}^{n} B_k$$

in such a way that $A_k$ and $B_k$ are congruent for each $k \le n$. This is denoted by

$$A \overset{f}{\cong} B.$$

Congruence means that there exists a bijection $T_k : A_k \to B_k$ such that

$$\rho(x, y) = \rho(T_k x, T_k y)$$

for all $x$ and $y$ in $A_k$. Such a mapping $T_k$ is also called a bijective isometry. We denote congruence as $A_k \cong B_k$. We call $A$ and $B$ congruent by countable decomposition provided that the decomposition of $A$ and $B$ into mutually congruent pieces can be accomplished by using countably many pieces. This is denoted by

$$A \overset{\infty}{\cong} B.$$
One may prove readily as an exercise that a linear isometry of Euclidean space must preserve the measure of any Lebesgue measurable set. The reader will note that Steinhaus's¹ theorem (7.4.3) shows that two sets of the same Lebesgue measure in the real line must be congruent by countable decomposition. The Banach-Tarski theorem is much more startling.

Theorem A.1.1 (Banach-Tarski) Let $A$ and $B$ be two subsets of $\mathbb{R}^n$, each having nonempty interior, and with $n \ge 3$. Then $A \overset{f}{\cong} B$. (For example, any two spherical balls are congruent by finite decomposition regardless of the difference in their radii.) For dimensions 1 and 2, $A \overset{\infty}{\cong} B$.

We remark that this theorem implies that for $\mathbb{R}^n$, with $n \ge 3$, there cannot exist even a finitely additive measure defined on all subsets that is invariant under Euclidean motions. For $n = 1$ or $n = 2$, there is no countably additive measure that is defined on all subsets that is invariant under Euclidean motions. That is, in each of these two sets of circumstances, nonmeasurable sets must exist. The proof given by Banach and Tarski can be found in their original paper [2]. There is also a modern treatise on the subject [25]. The reasoning comes primarily from abstract set theory and from the study of groups of linear transformations acting on vector spaces. We give a simple example below that illustrates the Banach-Tarski theorem.

1 As an historical sidelight, we remark that Steinhaus played a pivotal role in Banach's decision to become a professional mathematician. Further information is available in [17].

EXAMPLE A.1
We will show that there exists a subset $S \subset [0, 2)$, in the real line, for which $S \overset{\infty}{\cong} \mathbb{R}$. That is, $S \subset [0, 2)$ will be congruent by countable decomposition to the entire real line. The congruence mappings will be translations. The example begins with the proof in Example 3.1 that there exists a nonmeasurable subset of the line. There we defined an equivalence relation:

$$x \sim y \iff x - y \in \mathbb{Q}.$$

Note that each real number is equivalent modulo rational translation to numbers in the interval $[0, 1)$. We use the Axiom of Choice to select an uncountable set $C$ in $[0, 1)$ having the property that $C$ consists of one element from each equivalence class in $\mathbb{R}/\!\sim$. The set $C$ is called a cross section of $\mathbb{R}/\!\sim$. If we were still in Example 3.1, we would proceed to explain why the set $C$ must be nonmeasurable. Instead, the example takes a surprising turn. Let $U = \mathbb{Q} \cap [0, 1]$, and let $C_q = C + q$. Define

$$S = \bigcup_{q \in U} C_q,$$

the disjoint union of the countably many translates of $C$ by $q \in U$. Thus $S \subset [0, 2)$. Let

$$\tau : U \to \mathbb{Q}$$

be a bijection between the two countable sets, $U$ and $\mathbb{Q}$. Let

$$x_q = \tau(q) - q$$

for each $q \in U$. Observe that

$$\bigcup_{q \in U} (C_q + x_q) = \bigcup_{q \in U} \bigl( C + \tau(q) \bigr) = \bigcup_{r \in \mathbb{Q}} (C + r) = \mathbb{R},$$

as claimed.

2 The author learned the following example from the website of Professor Terence Tao at UCLA. This surprise ending for the famous example of a nonmeasurable set deserves to be better known, because it is so simple, compelling, and delightful.
This example shows with convincing simplicity an instance of a congruence, by countable decomposition, between two seemingly incongruous sets. It provides a compelling explanation of why it is necessary to check for measurability in analysis. It gives also a startling sense of the geometrical possibilities of nonmeasurable sets. The theorem of Banach and Tarski demonstrates that the tools of Lebesgue measure and integration, with all their strength, cannot measure all sets, even with finite additivity. More striking is the capacity of nonmeasurable sets to be reconfigured in astounding ways, with geometrical perfection, losing nary a point. Thanks to Lebesgue and many other mathematicians, the understanding of measure and integration has been advanced in many ways that are invaluable to analysis. Yet the mysteries that stand are more profound than those that came before. This is a challenge and an invitation to the student to engage in the search to expand human knowledge. It affirms for us the grandeur of truth, the finiteness of ourselves, and the good fortune to be allowed a glimpse.
REFERENCES
1. S. Banach, Sur l’kquation fonctionnelle f(r + y) = f(r) + f ( y ) . Fiindamrnta Marhematicae, Tom 1, Warszawa, 1920.’ 2. S. Banach and A. Tarski, Sur la dicomposition des ensembles en parties respectivement congruentes. Fundamenta Mathematicae, Tom 6, Warszawa, 1924.’ 3. Lennart Carleson, On convergence and growth of partial sumas of Fourier series. Acra Mathematica, Vol. 116, 1966, pp. 135-157. 4. L. Corwin and F. P. Greenleaf, Representations ofNilpotent Lie Groups and Their Applications. Cambridge University Press, Cambridge, 1990. 5 . H. Dym and H. P. McKean, Fourier Series and Integrals. Academic Press, New York, 1972.
6. J. B. J. Fourier, Thkorie Analytique de la Chaleur. Firmin Didot, Paris, 1822. 7. B. Gelbaum and J. Olmsted, Counrerexamples in Analysis. Holden-Day, San Francisco, 1964. 8. C. Goffman, Real Functions. Rinehart and Company, New York, 1953. 9. C. Goffman and G. Pedrick, First Course in Functional Analysis. Prentice-Hall, Englewood Cliffs, NJ, 1965. ]This historic paper is available online from the Institute for Computational Mathematics at the University of Warsaw in Poland at http://matwbn.icm.edu.pl/.
Measure and Integration: A Concise Introduction to Real Analysis. By Leonard F. Richardson. Copyright © 2009 John Wiley & Sons, Inc.
10. P. Halmos, Measure Theory. D. van Nostrand Company, New York, 1950.
11. G. Hamel, Eine Basis aller Zahlen und die unstetigen Lösungen der Funktionalgleichung f(x + y) = f(x) + f(y). Mathematische Annalen, Vol. 60, 1905, pp. 459-462.
12. Kenneth Hoffman and Ray Kunze, Linear Algebra. Prentice-Hall, Englewood Cliffs, NJ, 1971.
13. P. Jordan and J. v. Neumann, On inner products in linear, metric spaces. Annals of Mathematics, Vol. 36, No. 3, 1935, pp. 719-723.
14. S. Kakutani, Concrete representation of abstract (M)-spaces (A characterization of the space of continuous functions). Annals of Mathematics, 2nd Ser., Vol. 42, No. 4, 1941, pp. 994-1024.
15. S. Kakutani, A proof of the uniqueness of Haar’s measure. Annals of Mathematics, Vol. 49, 1948, pp. 225-226.
16. E. Kamke, Theory of Sets, translated by F. Bagemihl. Dover Publications, New York, 1950.
17. The MacTutor History of Mathematics, Archive of the University of St. Andrews, Fife, Scotland.²
18. L. Nachbin, The Haar Integral. D. van Nostrand Company, New York, 1965.
19. L. Pukanszky, Leçons sur les Représentations des Groupes. Dunod, Paris, 1967.
20. L. Richardson, Advanced Calculus: An Introduction to Linear Analysis. John Wiley & Sons, 2008.
21. F. Riesz and B. Sz.-Nagy, Functional Analysis. Frederick Ungar Publishing Company, New York, 1955.
22. S. Saks, Theory of the Integral, 2nd ed., translated by L. C. Young. Hafner Publishing Company, New York. (First ed. Warsaw, 1937.³)
23. S. Saks, Integration in abstract metric spaces. Duke Mathematics Journal, Vol. 4, 1938, pp. 408-411.
24. G. E. Shilov and B. L. Gurevich, Integral, Measure and Derivative: A Unified Approach, translated by R. A. Silverman. Prentice-Hall, Englewood Cliffs, NJ, 1966.
25. S. Wagon, The Banach-Tarski Paradox. Cambridge University Press, Cambridge, 1986.
²This archive is at http://www-history.mcs.st-andrews.ac.uk/history/index.html.
³This historic book is available online from the Institute for Computational Mathematics at the University of Warsaw in Poland at http://matwbn.icm.edu.pl/.
INDEX
(ℝ, +), 196
(𝕋, ·), 203
1_A(x), 61
1_S, 5
A Δ B, 43
A^c, 21
B*, 171
D^+, 135
D^-, 135
D_+, 135
D_-, 135
E^o, 54
F_σ, 35
G_δ, 35
I^o, 137
L^1(X, ℂ), 100
L^1(X, ℝ), 100
L^1(X, 𝔄, μ), 93
L^2(𝕋), 196
L^p, 165
L^∞, 173
L^∞(X, 𝔄, μ), 146
U,,, 220
Q_N, 50
S_N, 197
s,, 104
2.
226
$, 157
1, 40
f dp, 71. 76
f * g. 114
7-1, dp, 219 f dp, 73 5.y fdP9 78 A l p , 161 X < p. 156
f i I . 140 f < 1. 140
JA
5:
,s
P
> cp.
157 37 liin inf A,, 44 lim sup A,, 44 BV(1). 124 C P ( T ) , 199 cy (R).210 L , 78 L+,75 M , 156 s.95 S(IR),209 U@B, 107 U@B,107 2l x 23, 107 2-p-a.e.. 60 B(IR),36 B(W),50 3. 52 C, 33, 36 C(IR). 36 C(IRn), 50 ??(X), 11 R[a, b], 4. 5 6 , 61 60, 69 61. 72 a.e., 60 p*. 24 X
[XI,
155 p - , 155 p f . 156 I-(+.
z,
134 a-algebra, 13 a-field. 13 ~ ( 4 )81. v, 58, 158
1-4 155
c.,, 58, 85
4 208
E , 206 r?, 216 8. 208 ?, 203 101, 208 .S, 104 f * x,207
f?
f f f
-
g, 93 v g. 58 A
9. 58
f+. 78 f - , 78
l, 33, 36
l^1, 101
l^2, 183
n(A), 124
n(x), 124
p(A), 124
p(x), 124
v(A), 124
v(x), 124
x-section function, 105
absolutely continuous
  function, 140
    bounded variation, 140
  measure, 156
algebra, 11
  σ-, 13
almost everywhere, 60
  𝔄-μ-almost everywhere, 60
Baire
  Category theorem, 46
  function, 57
Banach space, 95
  complex, 95
  isomorphism, 175
  real, 95
  reflexive, 181
Banach-Tarski
  example, 227
  paradox, 226
  theorem, 226
Bessel’s inequality, 182
Borel
  function, 57
Borel field, 13
Borel measure
  regular, 185
  signed regular, 185
Borel sets
  abstract, 13, 16, 24
  closed sets, 35
  generated by topology, 37, 51
  in ℝ, 16
  in ℝ^k, 16
  in Tychonoff cube, 17
  in unit k-cube, 17
  in unit interval, 16
  in unit square, 16
  not F_σ, 47
  not G_δ, 47
  open sets, 35, 36, 51
bounded variation, 124
  differentiability, 132
Cantor function, 141
Cantor set, 39, 141
Carathéodory
  outer measure, 21
Cauchy sequence, 43
  normed linear space, 95
Cauchy-Schwarz inequality, 180
character, 197, 203
  group, 203
  of ℝ, 208
circle
  as multiplicative group, 203
  as quotient group, 196
  nonstandard, 203
complete
  normed linear space, 95
complex number
  imaginary part, 180
  real part, 180
concave function, 81
continuous
  function
    absolutely, 140
  measure
    absolutely, 156
convergence
  Fatou’s theorem, 91
  in measure, 65, 100
  Lebesgue Dominated, 84
  monotone, 89
convex
  function, 81
    supporting line, 81
convolution, 114, 207
countably additive
  measure, 20
  set function, 79
counting measure, 20
cylinder set, 17
decreasing sequence, 13
dense
  subset of L^2, 202
  subset of L^1, 94
density point, 148
derivative
  lower, 135
  one-sided, 135
  under integral sign, 147
  upper, 135
Dirichlet kernel, 198
dual space, 172
  of L^p, 175
Egoroff’s theorem, 62
elementary sets, 12
  abstract, 24
  in ℝ^n, 50
  in finite interval, 31
  in unit square, 16
  product measure, 103
epigraph, 81
equivalence class
  in L^1, 93
essential
  sup-norm, 60, 167
  supremum, 146, 173
essentially bounded, 146
exponential function
  trigonometric, 197
Fatou’s theorem, 91
field
  σ-, 13
  elementary sets in ℝ, 12
  Borel, 13
  elementary sets in ℝ^n, 50
  generated by 𝔄, 12, 13
finite intersection property, 33
floor function, 37
Fourier
  series, 197
  transform
    eigenfunctions, 213
    for L^1(ℝ), 208
    for L^2(ℝ), 214
    Hermite functions, 213
    in Hilbert space, 181
    inverse, 208
    on circle, 197
Fourier transform
  for L^1(ℝ)
    continuity, 208
    Riemann-Lebesgue lemma, 208
Fubini’s theorem
  first form, 108
  main form, 111
function
  ℝ*-valued, 58
  𝔄-measurable, 56
    complex, 56
    topological space valued, 56
  𝔄-simple, 61
    special, 69
  x-section, 105
  indicator, 5
  absolutely continuous, 140
    bounded variation, 140
    part, 163
  Baire, 57
  Borel, 57
  Cantor, 141
  carrier
    σ-finite, 158
    of finite measure, 72
  complex valued, 100
  concave, 81
  continuous, 139
  convex, 81
  essentially bounded, 146
  extended ℝ-valued
    measurable, 58
  Hilbert space valued, 204
  imaginary part, 100
  indicator, 61
  integrable, 78
  Lebesgue integral, 78
  measurable
    Borel equivalent, 62, 98
    homomorphism of ℝ, 68, 131
  real part, 100
  Schwartz, 209
    dense in L^2(ℝ), 214
  simple, 61
    special, 69
  singular, 140
    part, 163
  step, 95
    dense in L^1(ℝ), 98
  truncation, 74
  uniformly continuous, 139
functional
  bounded, 171
  continuous, 170
  linear, 170
Fundamental theorem
  Lebesgue integral, 129
Hölder's inequality, 167
Hahn Decomposition theorem, 154
Heisenberg
  Commutation Relation, 222
  group, 223
  Uncertainty Principle, 222
Hermitian
  inner product, 179
  space, 179
Hilbert space, 179
  direct integral, 219
  direct sum, 204
  orthonormal basis, 182
  separable, 182
Hopf Extension Theorem, 24
  uniqueness, 28
increasing sequence, 13
indefinite integral, 127
index set, 181
indicator function, 5
inequality
  Jensen's, 82
inherited measure space, 29
inner
  Lebesgue measure, 36
  product
    Hermitian, 179
integrable
  locally, 131
integral
  additive on domains, 72
  as set function, 71, 73, 76
  complex valued, 100
  counting measure, 81
  Hilbert space valued, 204
  indefinite, 127, 129
  Lebesgue
    reflection invariance, 80
    translation invariance, 80
  linear functional, 70
  on L^+, 75
  on 𝔖_1, 73
  sign
    derivative under, 147
  special simple function, 70
irreducible, 219
Jensen's inequality, 82
  for real numbers, 166
Jordan
  content, 52
  measure, 52
    inner, 52
    outer, 52
  null set, 53
Jordan Decomposition theorem, 156
Jordan measurable, 52
kernel
  Dirichlet, 198
LDC, 84, 212
Lebesgue
  Decomposition theorem, 161
  Dominated Convergence, 84
  measure
    outer, 36
  theorem, 118, 132
linear functional, 170
  bounded, 171
  continuous, 170
  positive, 187
linear space
  normed
    complete, 95
lower sum
  on ℝ, 4
Lusin’s theorem, 66
measurable
  function, 56
  homomorphism of ℝ, 68, 131
  set, 21
  space, 38
measure
  absolutely continuous, 156
  approximately finite, 19
  Carathéodory outer, 21
    regular, 27
  countably additive
    on 𝔄, 20, 24, 27
  counting, 20
  finite, 19
    σ-, 20
  finitely additive, 19
  infinite, 19
  Jordan, 52
  Lebesgue, 33, 36, 50
    complete, 38
    essential uniqueness, 115
    in ℝ, 36, 50
    inner, 36
    on [-N, N), 33
    outer, 36
    reflection invariance, 52
    translation invariance, 37, 51
  norm of, 155
  probability, 82
  product
    in ℝ^2, 107
    on E, 104
  signed, 151, 152
  singular, 161
  space, 38
    complete, 38
measure space, 38
  completion, 41
    minimal, 41
  extension, 41
  inherited, 29
metric, 43
metric space, 43
Minkowski’s inequality, 167
momentum operator, 222
monotone class, 13
  generated by 𝔄, 13
monotone convergence theorem, 89
nonstandard circle, 203
norm, 93
  L^1, 93
  L^p, 166
  in a vector space, 93
  of functional, 172
  total variation, 155
null set, 38
orthogonal complement, 181
orthogonal projection, 217
orthonormal basis
  Hilbert space, 182
outer
  Lebesgue measure, 36
  measure, 36
    Carathéodory, 21, 25
    Lebesgue, 36
Parallelogram Law, 184
Parseval’s Identity
  for L^2(ℝ), 215
  separable Hilbert space, 184
partition, 4, 124
  mesh, 4
Plancherel identity, 182, 201
Pointwise Convergence Lemma, 96
position operator, 221
power set, 11
premeasurable space, 30
premeasure space, 29
probability
  measure, 82
  space, 82
Radon-Nikodym
  derivative, 157
  theorem, 157
real numbers
  extended, 18
rectangular set, 103
reflection invariance, 52
regular
  Borel measure, 185
  signed Borel measure, 185
representation theorem
  LP-LP duality, 175
  Riesz-Markov-Saks-Kakutani, 186
Riemann
  -Lebesgue lemma, 98, 199, 208
  integral, 4, 81, 118
    deficiencies, 3
    lower, 118
    upper, 118
rotation, 220
scalar product
  Hermitian, 179
section
  x-, 104
  y-, 104
semimetric, 43
separable Hilbert space, 182
separable Hilbert space L^2(ℝ), 183
sequence
  absolutely summable, 101
  Cauchy, 43
    normed linear space, 95
set
  Cantor, 39, 141
  category
    first, 46
    second, 46
  convex, 81
  index, 181
  interior, 137
  Lebesgue not Jordan, 54
  measurable, 21
  nonmeasurable, 42
    in any P, l(P) > 0, 49
  null, 38
set function
  countably additive, 79, 151, 152
sets
  F_σ, 35
  G_δ, 35
  algebra of, 11
  Borel
    in ℝ, 36
    in ℝ^n, 50
  congruent, 226
    by countable decomposition, 226
    by finite decomposition, 226
  field of, 11
  Lebesgue measurable, 33
  limit inferior, 44
  limit superior, 44
  measurable, 21, 28
    between F_σ and G_δ, 35
    between open and closed, 35
    density point of, 148
    in ℝ, 36
    in ℝ^n, 50
    Lebesgue, 33
signed Borel measure
  regular, 185
signed measure, 152
space
  normed vector, 93
    dual, 172
  measurable, 38, 55
  measure, 38, 55
    inherited, 29
  metric, 43
    complete, 43
    semi-, 43
  premeasurable, 30
  premeasure, 29
    inherited, 29
  probability, 82
spectrum, 206, 216
state function, 221
Steinhaus theorem, 48, 115
  congruent decomposition, 1-19
step function, 95
sup-norm, 72
  essential, 167
supporting line
  convex function, 81
symmetric difference, 43
Tonelli's theorem, 112
transform
  Fourier, 98
  Fourier-Stieltjes, 193
translation invariance, 37, 51
triangle inequality, 43
  complex space, 183
  norms, 93
  sets, 43
Tychonoff cube, 17
Uncertainty Principle, 222
upper sum
  on ℝ, 4
variation, 123
  bounded, 124
  negative, 124
  positive, 124
  total, 155
Vitali
  covering theorem, 132
  property, 132