VI. SUMMABLE FUNCTIONS
sum S may be infinite, if E has infinite measure. (Consider the function $f(x) = \frac{1}{1+x^2}$ on the entire line, for example.) We should then have to establish all of the needed properties, such as additivity and homogeneity, for this integral, and then proceed by stages to general summable functions. It is quicker and simpler to use the properties of summable functions already established in Chapters V and VI. For this purpose, we make the following definition.

Let $f(x)$ be a measurable function defined on the measurable set $E$, which may be bounded or unbounded. If $E$ is bounded, then the definition of summability given in § 2 is perfectly adequate. If $E$ is unbounded, then no set $E[-n, n]$ is equal to $E$. Define the function $f_{[n]}(x)$ as being equal to $f(x)$ on the interval $[-n, n]$ and being undefined elsewhere. Now suppose that $f(x)$ is non-negative and measurable and defined on $E$; $f(x)$ may be bounded or unbounded. Then we make the following definition.

DEFINITION 1.
$$\int_E f(x)\,dx = \lim_{n\to\infty} \int_{E[-n,n]} f_{[n]}(x)\,dx.$$
Each of the integrals on the right-hand side in the above definition is defined, although it may be equal to $+\infty$ (see Definition, § 1). It is obvious that the integrals on the right-hand side form an increasing sequence of non-negative numbers, and so the limit must exist, as a finite number or as $+\infty$. (We make the assumption that an increasing sequence which from some point on is equal to $+\infty$ has limit equal to $+\infty$.)
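The limit in Definition 1 can be watched numerically. The sketch below is only an illustration under stated assumptions: it takes E to be the whole line, uses the sample function 1/(1+x^2) (whose integral over [-n, n] is 2 arctan n) and the constant function 1, and the function names are hypothetical. The first sequence increases to a finite limit, the second increases without bound.

```python
import math

def integral_of_sample(n):
    """Integral of 1/(1+x^2) over [-n, n], from the antiderivative arctan."""
    return 2.0 * math.atan(n)

def integral_of_constant_one(n):
    """Integral of the constant function 1 over [-n, n]."""
    return 2.0 * n

cutoffs = [1, 10, 100, 1000]
# Both lists are increasing in n, as the text observes for non-negative f.
sample_values = [integral_of_sample(n) for n in cutoffs]
constant_values = [integral_of_constant_one(n) for n in cutoffs]

for n, s, c in zip(cutoffs, sample_values, constant_values):
    print(n, s, c)
```

Under these assumptions the first column of values approaches pi, so the sample function is summable on the line, while the constant function has integral $+\infty$.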
Next, suppose that $f(x)$ is of arbitrary sign on the set $E$. Then, introducing the functions $f_+(x)$ and $f_-(x)$ as in § 2, we make a second definition.

DEFINITION 2.
$$\int_E f(x)\,dx = \int_E f_+(x)\,dx - \int_E f_-(x)\,dx,$$
provided that at least one of the integrals appearing on the right-hand side is finite. Otherwise, the integral $\int_E f(x)\,dx$ is undefined. If $\int_E f(x)\,dx$ exists and is finite, then $f(x)$ is said to be summable on the set $E$.
All of the theorems, corollaries, and remarks of §§ 1 and 2 are true for integrals taken over unbounded sets. Verification of most of these involves only a simple passage to the limit on both sides of an equality or inequality and may be left to the reader. Some of the arguments, however, are a little subtle. For example, let us give a proof of Fatou's theorem (Theorem 9, § 1).

Proof of Fatou's theorem for unbounded E. Let $n$ be an arbitrary positive integer. Then, clearly,
$$\lim_{m\to\infty} f_{m[n]}(x) = F_{[n]}(x)$$
for almost all $x \in E[-n, n]$. By Fatou's theorem for bounded sets, therefore, we have
$$\int_{E[-n,n]} F_{[n]}(x)\,dx \le \sup_m \Big\{ \int_{E[-n,n]} f_{m[n]}(x)\,dx \Big\}. \tag{1}$$
Taking limits of both sides of (1) as $n \to \infty$, we obtain
$$\int_E F(x)\,dx \le \lim_{n\to\infty} \Big[ \sup_m \Big\{ \int_{E[-n,n]} f_{m[n]}(x)\,dx \Big\} \Big]. \tag{2}$$
Since each integral $\int_{E[-n,n]} f_{m[n]}(x)\,dx$ is an increasing function of $n$, it is clear that
$$\sup_m \Big\{ \int_{E[-n,n]} f_{m[n]}(x)\,dx \Big\}$$
increases as $n$ increases. Hence the limit in (2) exists. It is also obvious that
$$\int_{E[-n,n]} f_{m[n]}(x)\,dx \le \int_E f_m(x)\,dx$$
4. EDITOR'S APPENDIX
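for all $m$ and $n$. Therefore
$$\sup_m \Big\{ \int_{E[-n,n]} f_{m[n]}(x)\,dx \Big\} \le \sup_m \Big\{ \int_E f_m(x)\,dx \Big\}$$
for all $n$, and hence
$$\lim_{n\to\infty} \Big[ \sup_m \Big\{ \int_{E[-n,n]} f_{m[n]}(x)\,dx \Big\} \Big] \le \sup_m \Big\{ \int_E f_m(x)\,dx \Big\}. \tag{3}$$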
Combining (2) and (3), we obtain Fatou's theorem.

Having Fatou's theorem for functions defined on unbounded sets $E$, it is easy to prove the theorem of B. Levi (Theorem 10, § 1) by repeating verbatim the argument given in the text. The proof of Theorem 12, § 1, can be carried over to the case of unbounded $E$ and not necessarily bounded $E_k$ by a slight paraphrasing.

The fundamental theorem on passage to the limit under the integral sign, and the one most widely used in applications of Lebesgue integration, differs slightly from Theorem 1, § 3. We therefore state and prove it in detail. The reader should note that the set $E$ referred to may have either finite or infinite measure.

THEOREM 1 (LEBESGUE'S THEOREM ON DOMINATED CONVERGENCE). Let $E$ be a measurable set, and let $f_n(x)$ be a sequence of summable functions defined on $E$ having the property that $|f_n(x)| \le s(x)$ for all $n$ and almost all $x$, where $s(x)$ is a summable non-negative function on $E$. Then if $\lim_{n\to\infty} f_n(x) = F(x)$ exists almost everywhere on $E$, the relation
$$\lim_{n\to\infty} \int_E f_n(x)\,dx = \int_E F(x)\,dx \tag{4}$$
holds.

Proof. First of all, it is obvious that $F(x)$ is summable and that $|F(x)| \le s(x)$ almost everywhere. Now define the sequence of functions $p_n(x)$ by the following rule:
$$p_n(x) = \lim_{k\to\infty} \{\min[f_n(x), f_{n+1}(x), \ldots, f_{n+k-1}(x)]\}, \qquad n = 1, 2, 3, \ldots.$$
It is easy to see that the functions $p_n(x)$ are a non-decreasing sequence of measurable functions, that $|p_n(x)| \le s(x)$ almost everywhere, and that
$$\lim_{n\to\infty} p_n(x) = \liminf_{n\to\infty} f_n(x) = F(x)$$
for almost all $x \in E$. The sequence of functions $s(x) + p_n(x)$ is therefore an increasing sequence of non-negative summable functions. Applying the theorem of B. Levi (Theorem 10, § 1), we have
$$\lim_{n\to\infty} \int_E [s(x) + p_n(x)]\,dx = \int_E \lim_{n\to\infty} [s(x) + p_n(x)]\,dx. \tag{5}$$
The right-hand side of (5) is obviously equal to
$$\int_E s(x)\,dx + \int_E F(x)\,dx. \tag{6}$$
The left-hand side of (5) is equal to
$$\int_E s(x)\,dx + \lim_{n\to\infty} \int_E p_n(x)\,dx. \tag{7}$$
Since $f_n(x) \ge p_n(x)$ for almost all $x \in E$, we have
$$\liminf_{n\to\infty} \int_E f_n(x)\,dx \ge \lim_{n\to\infty} \int_E p_n(x)\,dx. \tag{8}$$
Assembling (5), (6), (7), and (8), we obtain
$$\liminf_{n\to\infty} \int_E f_n(x)\,dx \ge \int_E F(x)\,dx. \tag{9}$$
We next define a sequence of functions $q_n(x)$ by the following rule:
$$q_n(x) = \lim_{k\to\infty} \{\max[f_n(x), f_{n+1}(x), \ldots, f_{n+k-1}(x)]\}, \qquad n = 1, 2, 3, \ldots.$$
The functions $q_n(x)$ obviously form a non-increasing sequence of functions such that $|q_n(x)| \le s(x)$ and such that
$$\lim_{n\to\infty} q_n(x) = \limsup_{n\to\infty} f_n(x) = F(x)$$
for almost all $x \in E$. We consider then the increasing sequence of non-negative functions $s(x) - q_n(x)$. Again applying the theorem of B. Levi and repeating the argument used above, we infer that
$$\limsup_{n\to\infty} \int_E f_n(x)\,dx \le \int_E F(x)\,dx. \tag{10}$$
Comparing (9) and (10), we see that
$$\lim_{n\to\infty} \int_E f_n(x)\,dx$$
exists and that (4) holds. This of course completes the proof.

COROLLARY. Let $f(x)$ be a summable function defined on $E$. Extend the definition of $[f]_N$ given in § 1 by the rule
$$[f(x)]_N = \begin{cases} -N & \text{if } f(x) < -N, \\ f(x) & \text{if } -N \le f(x) \le N, \\ N & \text{if } f(x) > N. \end{cases}$$
Then
$$\int_E f(x)\,dx = \lim_{N\to\infty} \int_E [f(x)]_N\,dx.$$
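The corollary can be illustrated numerically. The following sketch rests on assumptions that are not in the text: E = (0, 1], the sample summable function f(x) = x^(-1/2) (whose integral is 2), a midpoint rule in place of the Lebesgue integral, and hypothetical helper names. For this f the truncated integral equals 2 - 1/N, which tends to 2 as N grows.

```python
def truncate(value, N):
    """The truncation [f]_N of the corollary: clip value to [-N, N]."""
    return max(-N, min(N, value))

def midpoint_integral(func, a, b, steps=100_000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def f(x):
    """Sample unbounded summable function on (0, 1] (an assumption)."""
    return x ** -0.5

def truncated_integral(N):
    """Approximate integral of [f]_N over (0, 1]."""
    return midpoint_integral(lambda x: truncate(f(x), N), 0.0, 1.0)

for N in (1, 4, 10, 100):
    print(N, truncated_integral(N), 2.0 - 1.0 / N)
```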
One can now prove Theorem 8, § 2, by applying the preceding corollary and repeating the proof of Theorem 8, § 2. The theorems of § 3, barring Theorem 1, are of a somewhat specialized nature and are needed comparatively rarely in applications. We therefore omit a discussion of their extensions to sequences of functions defined on sets of infinite measure. We note merely that all of them are true for arbitrary sets of finite measure, whether bounded or unbounded.
Exercises for Chapters V and VI.

1. If $f_n(x) \ge 0$ and $\int_E f_n(x)\,dx \to 0$, then $f_n(x) \Rightarrow 0$, but it is not necessarily the case that $f_n(x)$ tends to 0 almost everywhere.

2. The relation
$$\int_E \frac{|f_n|}{1 + |f_n|}\,dx \to 0$$
is equivalent to $f_n(x) \Rightarrow 0$.

3. If $a_n \to 0$, there exists a sequence of non-negative measurable functions $u_n(x)$ such that
$$\sum_{n=1}^{\infty} a_n \int_E u_n(x)\,dx < +\infty,$$
but the functions $u_n(x)$ do not tend to 0 at any of the points of $E$.

4. If the integral
$$\int_E \varphi(x) f(x)\,dx$$
exists for every summable function $f(x)$, then the function $\varphi(x)$ is bounded almost everywhere (Lebesgue).

5. Let a measurable finite function $f(x)$ be defined on the set $E$. Consider a doubly infinite sequence of numbers
$$\ldots, y_{-3}, y_{-2}, y_{-1}, y_0, y_1, y_2, y_3, \ldots$$
($y_k \to +\infty$, $y_{-k} \to -\infty$, $y_{k+1} - y_k$ [...] A necessary and sufficient condition that the function $f(x)$ be summable is that the series $\sum_{k=-\infty}^{\infty} y_k\, m e_k$ be absolutely convergent.

6. If, under the conditions of Exercise 5, the series $\sum y_k\, m e_k$ converges absolutely, then for $\lambda \to 0$, its sum tends to $\int_E f(x)\,dx$.

7. The limit of a uniformly convergent sequence of functions integrable (R) is a function integrable (R).

8. The characteristic function of the Cantor perfect set $P_0$ is integrable (R). What is its integral over $[0, 1]$?

9. Let $f(x)$ and $g(x)$ be two non-negative measurable functions defined on the set $E$. If $E_y = E(g \ge y)$, then
$$\int_E f(x) g(x)\,dx = \int_0^{+\infty} \Phi(y)\,dy,$$
where $\Phi(y) = \int_{E_y} f(x)\,dx$. (D. K. Faddeyev)

10. Let $E_1, E_2, \ldots, E_n$ be measurable subsets of $[0, 1]$. If each of the points of $[0, 1]$ belongs to at least $q$ of these sets, then at least one of them has measure $\ge \frac{q}{n}$. (L. V. Kantorovič)

11. Let a summable function $f(x)$ be defined on $[a, b]$. Further, let $\alpha$ be a constant such that [...] $b - a$. If for every set $e$ of measure $\alpha$, $\int_e f(x)\,dx = 0$, then $f(x) \sim 0$. (M. K. Gavurin)

12. Let $f(x)$ be summable on $[a, b]$ and equal to 0 outside of $[a, b]$. If [...] $\int_{x-h}^{x+h} f(t)\,dt$, then
$$\int_a^b |\varphi(x)|\,dx \le \int_a^b |f(x)|\,dx$$
(A. N. Kolmogorov).

13. Let a summable function $f(x)$ be defined on $[a, b]$. If $\int_a^c f(x)\,dx = 0$ for all $c$ ($a < c$ [...] $q$. Prove that
$$\inf_{e \in S} \Big\{ \int_e f(x)\,dx \Big\} > 0.$$
15. Let $M = \{f(x)\}$ be a family of functions summable on $[a, b]$. If the functions of the family have equi-absolutely continuous integrals, then there exists an increasing positive function $\Phi(u)$ defined for $0 \le u < +\infty$ which tends to $+\infty$ with $u$ and such that
$$\int_a^b |f(x)| \cdot \Phi(|f(x)|)\,dx < A$$
for all $f(x)$ of $M$, where the constant $A$ does not depend on the choice of $f(x)$. (de la Vallée-Poussin)
CHAPTER VII
SQUARE-SUMMABLE FUNCTIONS
§ 1. FUNDAMENTAL DEFINITIONS. INEQUALITIES. NORM
In the present chapter, we consider a highly important class of functions: functions with summable squares. For simplicity, we will assume that all functions under consideration are defined on a certain closed interval $E = [a, b]$.

DEFINITION. A measurable function $f(x)$ is said to be a function with summable square or a square-summable function if
$$\int_a^b f^2(x)\,dx < +\infty.$$
The set of all functions with summable square is usually designated by the symbol $L_2$.

THEOREM 1. Every function with summable square is summable, i.e., $L_2 \subset L$.

This theorem follows from the obvious inequality
$$|f(x)| \le \frac{1 + f^2(x)}{2}.$$
In exactly the same way, the inequality
$$|f(x) g(x)| \le \frac{f^2(x) + g^2(x)}{2}$$
implies the following result.

THEOREM 2. The product of two functions, each with summable square, is a summable function.

Theorem 2 and the identity
$$(f \pm g)^2 = f^2 \pm 2fg + g^2$$
yield another simple fact.

THEOREM 3. The sum and the difference of two functions in $L_2$ are in $L_2$.

Finally, it is completely obvious that if $f(x)$ is in $L_2$ then all functions of the form $k f(x)$, where $k$ is a finite constant, are also in $L_2$.

THEOREM 4 (THE INEQUALITY OF CAUCHY-BUNYAKOVSKI-SCHWARZ, OR CBS INEQUALITY). If $f(x) \in L_2$ and $g(x) \in L_2$, then
$$\Big( \int_a^b f(x) g(x)\,dx \Big)^2 \le \Big[ \int_a^b f^2(x)\,dx \Big] \cdot \Big[ \int_a^b g^2(x)\,dx \Big]. \tag{1}$$
Proof. Consider the quadratic function
$$\psi(u) = A u^2 + 2 B u + C,$$
in which the coefficients $A$, $B$, $C$ are real and $A > 0$. If this function is non-negative for all real values of $u$, then
$$B^2 \le AC. \tag{2}$$
If this were not so, we would have
$$\psi\Big( -\frac{B}{A} \Big) = \frac{1}{A} (AC - B^2) < 0.$$
Having made this observation, we set
$$\psi(u) = \int_a^b [u f(x) + g(x)]^2\,dx = u^2 \int_a^b f^2\,dx + 2u \int_a^b f g\,dx + \int_a^b g^2\,dx.$$
This quadratic is non-negative and hence satisfies condition (2), which is equivalent to the present theorem.¹

COROLLARY. If $f(x) \in L_2$, then
$$\int_a^b |f(x)|\,dx \le \sqrt{b - a} \cdot \sqrt{\int_a^b f^2(x)\,dx}. \tag{3}$$
Upon setting $g = 1$ in (1) and replacing $f(x)$ by $|f(x)|$, we obtain (3).

THEOREM 5 (MINKOWSKI'S INEQUALITY). If $f(x) \in L_2$ and $g(x) \in L_2$, then
$$\sqrt{\int_a^b (f + g)^2\,dx} \le \sqrt{\int_a^b f^2\,dx} + \sqrt{\int_a^b g^2\,dx}.$$
Proof. Taking the square root of both sides of the CBS inequality, we find
$$\int_a^b f g\,dx \le \sqrt{\int_a^b f^2\,dx} \cdot \sqrt{\int_a^b g^2\,dx}.$$
Multiplying this inequality by 2 and adding
$$\int_a^b f^2\,dx + \int_a^b g^2\,dx$$
to both sides, we obtain
$$\int_a^b (f + g)^2\,dx \le \Big[ \sqrt{\int_a^b f^2\,dx} + \sqrt{\int_a^b g^2\,dx} \Big]^2,$$
which implies Minkowski's inequality.

Minkowski's inequality permits us to consider the space of functions $L_2$ from a new point of view. Namely, if we associate with every function $f(x) \in L_2$ the number
$$\|f\| = \sqrt{\int_a^b f^2(x)\,dx},$$

¹ We assume $\int_a^b f^2\,dx > 0$. If $\int_a^b f^2\,dx = 0$, $f(x)$ is equivalent to zero and inequality (1) becomes the trivial identity $0 = 0$.

then the following assertions are valid:
I. $\|f\| \ge 0$, and $\|f\| = 0$ if and only if $f(x) \sim 0$;
II. $\|kf\| = |k| \cdot \|f\|$ and, in particular, $\|-f\| = \|f\|$;
III. $\|f + g\| \le \|f\| + \|g\|$.
The number $\|f\|$ is called the norm of the function $f(x)$.

The analogy between $\|f\|$ and the absolute value $|x|$ of a real (or complex) number $x$ is very obvious. This analogy is the source of a number of important and beautiful constructions. Roughly speaking, the essential significance of absolute values in analysis is that with their aid, we can carry out the measurement of distances on the line:
$$\rho(x, y) = |x - y|.$$
The norm introduced in $L_2$ permits us to regard the set $L_2$ as a space in which it is also possible to measure distances, if we take the number
$$\rho(f, g) = \|f - g\|$$
as the distance between the elements $f$ and $g$ of $L_2$. If we agree to consider functions which are equivalent as identical, then the distance $\rho(f, g)$ just defined possesses the usual properties:
1) $\rho(f, g) \ge 0$, and $\rho(f, g) = 0$ if and only if $f = g$;
2) $\rho(f, g) = \rho(g, f)$;
3) $\rho(f, g) \le \rho(f, h) + \rho(h, g)$.
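As a numerical sanity check of the CBS inequality (1), the corollary (3), and Minkowski's inequality, the sketch below approximates the integrals by a midpoint rule. The interval [0, 1], the sample functions f(x) = x and g(x) = e^x, and all helper names are illustrative assumptions, not part of the text.

```python
import math

def midpoint_integral(func, a, b, steps=20_000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def inner(f, g, a, b):
    """Approximate the integral of f(x) * g(x) over [a, b]."""
    return midpoint_integral(lambda x: f(x) * g(x), a, b)

def norm(f, a, b):
    """Approximate the norm ||f|| = sqrt(integral of f^2) on [a, b]."""
    return math.sqrt(inner(f, f, a, b))

a, b = 0.0, 1.0
f = lambda x: x            # hypothetical sample function
g = lambda x: math.exp(x)  # hypothetical sample function

# CBS inequality (1): (integral of f*g)^2 <= (integral of f^2) * (integral of g^2)
cbs_left = inner(f, g, a, b) ** 2
cbs_right = inner(f, f, a, b) * inner(g, g, a, b)

# Corollary (3): integral of |f| <= sqrt(b - a) * ||f||
l1_left = midpoint_integral(lambda x: abs(f(x)), a, b)
l1_right = math.sqrt(b - a) * norm(f, a, b)

# Minkowski's inequality: ||f + g|| <= ||f|| + ||g||
mink_left = norm(lambda x: f(x) + g(x), a, b)
mink_right = norm(f, a, b) + norm(g, a, b)

print(cbs_left <= cbs_right, l1_left <= l1_right, mink_left <= mink_right)
```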
If such a function $\rho(x, y)$ is defined for every pair of elements of a set $A$ of elements of an arbitrary nature, then the set $A$ is called a metric space. Thus $L_2$ is a metric space. D. Hilbert first developed this point of view regarding $L_2$, so that $L_2$ is frequently called a Hilbert space.*

§ 2. MEAN CONVERGENCE
The concept of norm permits us to introduce the idea of limit in Hilbert space with the aid of almost the same expressions as used in the ordinary case of the line.

DEFINITION 1. The element $f$ of the space $L_2$ is called a limit of the sequence $f_1, f_2, f_3, \ldots$ of elements of the same space, if for every $\varepsilon > 0$, there exists a natural number $N$ such that
$$\|f_n - f\| < \varepsilon$$
for all $n > N$. We express this situation by saying that the sequence $\{f_n\}$ converges to the element $f$; we write in the usual way $\lim_{n\to\infty} f_n = f$ and $f_n \to f$.
* The space $L_2$ described here is only one example of a large number of spaces which are known as Hilbert spaces. In fact, for distinct pairs of real numbers $a < b$ and $a' < b'$, it is plain that one must distinguish between $L_2(a, b)$ and $L_2(a', b')$. There are a great many Hilbert spaces distinct from these spaces $L_2(a, b)$, as well. For an axiomatic treatment of not quite the most general Hilbert space, see M. H. Stone, Linear Transformations in Hilbert Space and Their Applications to Analysis, New York, Amer. Math. Soc. Colloq. Publ., 1932.-E. H.
We draw the reader's attention to the profound difference between the relations $f_n(x) \to f(x)$ and $f_n \to f$. The first means that for fixed $x$, the numerical sequence $\{f_n(x)\}$ converges to the limit $f(x)$ in the usual sense. The second expression, however, means that the sequence of elements of $L_2$ converges to the element $f \in L_2$ in the sense of Definition 1. In the usual symbols of the theory of functions, the relation $f_n \to f$ means that
$$\lim_{n\to\infty} \int_a^b [f_n(x) - f(x)]^2\,dx = 0.$$
This new form of the convergence of a sequence of functions is called mean convergence or convergence in the mean.

THEOREM 1. If a sequence $\{f_n(x)\}$ converges in the mean to the function $f(x)$, then it converges in measure to $f(x)$.

Proof. Let $\sigma$ be a fixed positive number and let
$$A_n(\sigma) = E(|f_n - f| \ge \sigma).$$
Then
$$\int_a^b (f_n - f)^2\,dx \ge \int_{A_n(\sigma)} (f_n - f)^2\,dx \ge \sigma^2 \cdot m A_n(\sigma),$$
and, since $\sigma$ is fixed,
$$m A_n(\sigma) \to 0.$$
This implies that $f_n \Rightarrow f$.

COROLLARY. If the sequence $\{f_n(x)\}$ converges in the mean to $f(x)$, then there exists a subsequence $\{f_{n_k}(x)\}$ of $\{f_n(x)\}$ which converges to $f(x)$ almost everywhere.

This corollary is established by referring to Theorem 1 and Riesz's theorem (Theorem 4, § 3, Ch. IV). However, it can be proved without referring to convergence in measure at all. Namely, if
$$\lim_{n\to\infty} \int_a^b (f_n - f)^2\,dx = 0,$$
then it is possible to find $n_1 < n_2 < n_3 < \ldots$ such that
$$\int_a^b (f_{n_k} - f)^2\,dx < \frac{1}{2^k}.$$
Then the series
$$\sum_{k=1}^{\infty} \int_a^b (f_{n_k} - f)^2\,dx$$
converges and by the corollary of Theorem 11, § 1, Chapter VI,
$$f_{n_k}(x) \to f(x)$$
almost everywhere on $[a, b]$.

We note that convergence in the mean of the sequence $\{f_n(x)\}$ to the function $f(x)$ does not imply its convergence to $f(x)$ almost everywhere. This is illustrated, say, by
the example constructed in § 3, Chapter IV. Conversely, it is possible that $f_n(x) \to f(x)$ at every point of $[a, b]$ while $f_n$ does not converge to $f$ in the mean.

EXAMPLE. Let a sequence $\{f_n(x)\}$ be defined on $[0, 1]$ by the requirements $f_n(x) = n$ for $0 < x < \frac{1}{n}$ and $f_n(x) = 0$ at all other points of $[0, 1]$. Then it is clear that for arbitrary $x \in [0, 1]$
$$\lim_{n\to\infty} f_n(x) = 0,$$
but at the same time
$$\int_0^1 f_n^2(x)\,dx = \int_0^{1/n} n^2\,dx = n \to +\infty.$$
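The example can be checked with exact rational arithmetic. The sketch below is a minimal illustration with hypothetical function names: it evaluates f_n at a fixed sample point and computes the integral of f_n^2 directly as (height squared) times (length of the interval (0, 1/n)).

```python
from fractions import Fraction

def f_n(n, x):
    """f_n(x) = n on the open interval (0, 1/n), and 0 elsewhere on [0, 1]."""
    return n if 0 < x < Fraction(1, n) else 0

def squared_norm(n):
    """Integral of f_n^2 over [0, 1]: the value n^2 on an interval of length 1/n."""
    return Fraction(n * n) * Fraction(1, n)

x = Fraction(1, 10)  # a fixed sample point (an assumption for illustration)
pointwise_values = [f_n(n, x) for n in range(1, 21)]
norms = [squared_norm(n) for n in range(1, 21)]

print(pointwise_values)  # nonzero only while 1/n > x
print(norms)             # grows like n
```

At the fixed point the values vanish as soon as 1/n <= x, while the squared norms equal n and so are unbounded, in agreement with the example.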
THEOREM 2 (UNIQUENESS OF THE LIMIT). A sequence $f_1, f_2, f_3, \ldots$ of elements of $L_2$ can have at most one limit.

Proof. Assume that $f_n \to f$ and $f_n \to g$; then
$$\|f - g\| \le \|f - f_n\| + \|f_n - g\|,$$
and since the right member of this inequality has limit zero and the left member is constant and non-negative, it follows that
$$\|f - g\| = 0.$$
Therefore $f - g \sim 0$ and $f = g$, as was to be proved.

It is possible to give another proof of the theorem. If $f_n \to f$ and $f_n \to g$, then the sequence $\{f_n(x)\}$ converges in measure to $f(x)$ and to $g(x)$, so that $f(x) \sim g(x)$; but we have agreed to consider equivalent functions to be one element of the space.

THEOREM 3 (CONTINUITY OF THE NORM). If $f_n \to f$, then $\|f_n\| \to \|f\|$.

Proof. The obvious inequalities
$$\|f_n\| \le \|f\| + \|f_n - f\|, \qquad \|f\| \le \|f_n\| + \|f_n - f\|$$
imply that
$$\big|\, \|f_n\| - \|f\| \,\big| \le \|f_n - f\|;$$
this inequality implies the present theorem.

COROLLARY. The norms of the elements of a convergent sequence in $L_2$ are bounded.

DEFINITION 2. A sequence $\{f_n\}$ of points of the space $L_2$ is said to be a Cauchy sequence or a fundamental sequence, if to every $\varepsilon > 0$, there corresponds a natural number $N$ such that
$$\|f_n - f_m\| < \varepsilon$$
for all $m, n > N$.

THEOREM 4. If the sequence $\{f_n\}$ has a limit, then it is a Cauchy sequence.

Proof. Let
$$\lim_{n\to\infty} f_n = f.$$
Having taken an arbitrary $\varepsilon > 0$, we find an $N$ such that for $n > N$,
$$\|f_n - f\| < \frac{\varepsilon}{2}.$$
Now if $n > N$ and $m > N$, then
$$\|f_n - f_m\| \le \|f_n - f\| + \|f - f_m\| < \varepsilon,$$
which proves the theorem.

The converse theorem is much deeper.

THEOREM 5. If $\{f_n\}$ is a Cauchy sequence, then it has a limit.
Proof. Consider the convergent series
$$\sum_{k=1}^{\infty} \frac{1}{2^k}.$$
For every $k$, choose an $n_k$ such that
$$\|f_n - f_m\| < \frac{1}{2^k}$$
for $n \ge n_k$ and $m \ge n_k$. We may suppose without loss of generality that
$$n_1 < n_2 < n_3 < \ldots,$$
so that
$$\|f_{n_{k+1}} - f_{n_k}\| < \frac{1}{2^k}$$
and hence
$$\sum_{k=1}^{\infty} \|f_{n_{k+1}} - f_{n_k}\| < +\infty.$$
By inequality (3) of § 1,
$$\int_a^b |f_{n_{k+1}} - f_{n_k}|\,dx \le \sqrt{b - a}\, \|f_{n_{k+1}} - f_{n_k}\|,$$
so that the series
$$\sum_{k=1}^{\infty} \int_a^b |f_{n_{k+1}} - f_{n_k}|\,dx$$
converges. By Theorem 11, § 1, Chapter VI, it follows that the series
$$|f_{n_1}(x)| + \sum_{k=1}^{\infty} |f_{n_{k+1}}(x) - f_{n_k}(x)|$$
converges almost everywhere. Therefore the series
$$f_{n_1}(x) + \sum_{k=1}^{\infty} \{f_{n_{k+1}}(x) - f_{n_k}(x)\}$$
converges almost everywhere. The convergence of this last series almost everywhere is obviously equivalent to the existence of the finite limit
$$\lim_{k\to\infty} f_{n_k}(x)$$
almost everywhere. We introduce the function $f(x)$, equal to this limit everywhere where
it exists and is finite and equal to zero at those points where this limit does not exist or is infinite. The function $f(x)$ is plainly measurable and by definition,
$$f_{n_k}(x) \to f(x)$$
almost everywhere on $[a, b]$. Our problem is to establish that this function is an element of Hilbert space and that it is the $L_2$-limit of the sequence $\{f_n\}$. For this purpose, having selected an arbitrary $\varepsilon > 0$, we find a natural number $N$ such that
$$\|f_n - f_m\| < \varepsilon$$
for all $m, n > N$. If $k_0$ is a number such that $n_{k_0} > N$, then
$$\int_a^b (f_n - f_{n_k})^2\,dx < \varepsilon^2$$
for all $n > N$ and arbitrary $k > k_0$. From this, upon applying Fatou's theorem (Theorem 9, § 1, Chapter VI) to the sequence of functions $\{(f_n - f_{n_k})^2\}$ ($k > k_0$), we find that
$$\int_a^b (f_n - f)^2\,dx \le \varepsilon^2,$$
i.e.,
$$\|f_n - f\| \le \varepsilon$$
for all $n > N$. Since $f_n - f \in L_2$, it follows that $f \in L_2$. Also $\lim f_n = f$. This completes the proof.

The property of Hilbert space $L_2$, established in the preceding theorem, is called completeness. The reader has, of course, noticed that Theorems 4 and 5 are analogues of the Bolzano-Cauchy test for convergence. The Bolzano-Cauchy test is one of the numerous forms of the property of continuity of the real line Z. This property can be expressed by any one of the following assertions:
A. If the points of the real line Z are divided into two classes X and Y such that every point of the class X lies to the left of every point of the class Y, then either there is a largest element in the class X or a smallest element in the class Y.
B. A set which is bounded above admits a least upper bound.
C. A bounded monotonic increasing sequence has a finite limit.
D. If $\{d_n\}$ is a sequence of nested closed intervals whose lengths tend to zero, then there exists a point belonging to all the segments $d_n$.
E. The Bolzano-Cauchy property: every Cauchy sequence $\{x_n\}$ has a finite limit.
It is sufficient to remove one point from the straight line Z to make all the theorems indicated above untrue. Of the theorems A, B, C, D, E, only the last one E is formulated without the aid of the concept of order of the points on the straight line. It uses only the notion of distance. Therefore, it is natural that E be carried over as the definition of the property of completeness to more complicated spaces than the real line.

DEFINITION 3. A set $A$ contained in $L_2$ is said to be everywhere dense in $L_2$ if every point of $L_2$ is the limit of a sequence of points belonging to $A$. In function-theoretical language, Definition 3 reads as follows: a class of functions
$A \subset L_2$ is everywhere dense in $L_2$ if every function in $L_2$ is the limit (in the sense of mean convergence) of a sequence of functions selected from $A$. It is easy to see that a necessary and sufficient condition that the set $A = \{g\}$ be everywhere dense in $L_2$ is that for an arbitrary point $f \in L_2$ and arbitrary $\varepsilon > 0$, it is possible to find a point $g \in A$ such that
$$\|f - g\| < \varepsilon.$$

THEOREM 6. Each of the following classes of functions:
M, the class of bounded measurable functions;
C, the class of continuous functions;
P, the class of polynomials;
S, the class of step functions;
is everywhere dense in $L_2$. If the closed interval $[a, b]$ is $[-\pi, \pi]$, then T, the class of trigonometric polynomials, is also everywhere dense.

Proof. 1) Let $f(x) \in L_2$. Choosing an arbitrary $\varepsilon > 0$, we infer from the absolute continuity of the integral that there exists a $\delta > 0$ such that the relations
$$e \subset [a, b], \qquad me < \delta$$
imply the inequality
$$\int_e f^2(x)\,dx < \varepsilon^2.$$
By Theorem 1, § 4, Chapter IV, there exists a measurable bounded function $g(x)$ such that
$$mE(f \ne g) < \delta;$$
we may define $g(x)$ as 0 at all points of the set $E(f \ne g)$. Then
$$\|f - g\|^2 = \int_a^b (f - g)^2\,dx = \int_{E(f \ne g)} (f - g)^2\,dx = \int_{E(f \ne g)} f^2\,dx < \varepsilon^2,$$
i.e.,
$$\|f - g\| < \varepsilon.$$
This proves the theorem for the class M.

2) Let $f(x) \in L_2$ and $\varepsilon > 0$. Choose a function $g(x) \in M$ such that
$$\|f - g\| < \frac{\varepsilon}{2}.$$
Let $|g(x)| \le K$. There exists a continuous function $\varphi(x)$ such that $|\varphi(x)| \le K$ and
$$mE(g \ne \varphi) < \frac{\varepsilon^2}{16 K^2}.$$
Obviously
$$\|g - \varphi\|^2 = \int_a^b (g - \varphi)^2\,dx = \int_{E(g \ne \varphi)} (g - \varphi)^2\,dx \le 4K^2 \cdot mE(g \ne \varphi) < \frac{\varepsilon^2}{4},$$
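and hence
$$\|f - \varphi\| < \varepsilon.$$
This proves the present theorem for the class C.

3) Let $f(x) \in L_2$ and $\varepsilon > 0$. Let $\varphi(x) \in C$ have the property that
$$\|f - \varphi\| < \frac{\varepsilon}{2}.$$
In accordance with Weierstrass's theorem (Theorem 2, § 5, Chapter IV), there is a polynomial $P(x)$ such that
$$|\varphi(x) - P(x)| < \frac{\varepsilon}{2\sqrt{b - a}}$$
for all $x \in [a, b]$. It follows that
$$\|\varphi - P\|^2 = \int_a^b (\varphi - P)^2\,dx \le \frac{\varepsilon^2}{4(b - a)} \cdot (b - a) = \frac{\varepsilon^2}{4},$$
from which the inequalities
$$\|\varphi - P\| \le \frac{\varepsilon}{2} \qquad \text{and} \qquad \|f - P\| < \varepsilon$$
are obvious. The theorem is thus proved for the class P.

4) Let $f(x) \in L_2$ and $\varepsilon > 0$. Let $\varphi(x) \in C$ have the property that
$$\|f - \varphi\| < \frac{\varepsilon}{2}.$$
Divide the interval $[a, b]$ by the points
$$c_0 = a < c_1 < \ldots < c_n = b$$
into parts such that the oscillation of $\varphi(x)$ is less than $\frac{\varepsilon}{2\sqrt{b - a}}$ in each of the subintervals $[c_k, c_{k+1}]$. Now define a step function $s(x)$ by setting
$$s(x) = \varphi(c_k) \qquad (c_k \le x < c_{k+1};\ k = 0, 1, \ldots, n - 2),$$
$$s(x) = \varphi(c_{n-1}) \qquad (c_{n-1} \le x \le b).$$
We then have
$$|s(x) - \varphi(x)| < \frac{\varepsilon}{2\sqrt{b - a}}$$
everywhere on $[a, b]$, and therefore
$$\|s - \varphi\| \le \frac{\varepsilon}{2}.$$
It follows that
$$\|f - s\| < \varepsilon,$$
and the theorem is proved for the class S.

5) Finally, let $[a, b] = [-\pi, \pi]$ and $f(x) \in L_2$. Selecting an arbitrary $\varepsilon > 0$, we find, as above, a function $\varphi(x)$ continuous on $[-\pi, \pi]$ such that
$$\|f - \varphi\| < \frac{\varepsilon}{2}.$$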
Let $|\varphi(x)| \le K$. For $0 < \delta < 2\pi$, we define the continuous function $\psi(x)$ on the segment $[-\pi, \pi]$ by setting $\psi(x) = \varphi(x)$ for $x \in [-\pi + \delta, \pi]$, $\psi(-\pi) = \varphi(\pi)$, and defining $\psi(x)$ as a linear function on the closed interval $[-\pi, -\pi + \delta]$; we choose $\delta > 0$ so that
$$\delta < \frac{\varepsilon^2}{64 K^2}.$$
The function $\psi(x)$ is obviously continuous on $[-\pi, \pi]$, but, what is now the main thing, it has the property that $\psi(-\pi) = \psi(\pi)$. Furthermore, $|\psi(x)| \le K$, so that
$$\|\varphi - \psi\|^2 = \int_{-\pi}^{\pi} (\varphi - \psi)^2\,dx = \int_{-\pi}^{-\pi + \delta} (\varphi - \psi)^2\,dx \le 4K^2 \delta < \frac{\varepsilon^2}{16}.$$
Accordingly,
$$\|f - \psi\| < \frac{3\varepsilon}{4}.$$
By Weierstrass's theorem (Theorem 4, § 5, Ch. IV), there exists a trigonometric polynomial $T(x)$ such that for all $x \in [-\pi, \pi]$,
$$|\psi(x) - T(x)| < \frac{\varepsilon}{4\sqrt{2\pi}}.$$
Then
$$\|\psi - T\| \le \frac{\varepsilon}{4},$$
and consequently
$$\|f - T\| < \varepsilon.$$
This completes the proof.

The concept of weak convergence of a sequence of functions plays an important role in many questions.

DEFINITION 4. A sequence of functions $f_1(x), f_2(x), \ldots$, in $L_2$, is said to converge weakly to a function $f(x) \in L_2$ if the equality
$$\lim_{n\to\infty} \int_a^b g(x) f_n(x)\,dx = \int_a^b g(x) f(x)\,dx$$
holds for every function $g(x) \in L_2$. We will not study this new form of convergence in detail and will limit ourselves to one theorem.

THEOREM 7. If a sequence of functions $\{f_n(x)\}$ converges in the mean to the function $f(x)$, then it also converges weakly to this function.
Proof. Let $g(x) \in L_2$. Then the CBS inequality yields
$$\Big\{ \int_a^b g(x) [f_n(x) - f(x)]\,dx \Big\}^2 \le \Big[ \int_a^b g^2(x)\,dx \Big] \cdot \Big[ \int_a^b [f_n(x) - f(x)]^2\,dx \Big].$$
Therefore
$$\Big| \int_a^b g f_n\,dx - \int_a^b g f\,dx \Big| \le \|g\| \cdot \|f_n - f\| \to 0,$$
as was to be proved.
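The estimate in this proof can be observed numerically. The sketch below is an illustration under assumptions not found in the text: the interval [0, 1], the sample functions f(x) = x, g(x) = e^x and f_n(x) = x + sin(nx)/n (which converge to f in the mean), a midpoint rule for the integrals, and hypothetical helper names.

```python
import math

def midpoint_integral(func, a, b, steps=20_000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def norm(h, a, b):
    """Approximate the norm ||h|| on [a, b]."""
    return math.sqrt(midpoint_integral(lambda x: h(x) ** 2, a, b))

a, b = 0.0, 1.0
f = lambda x: x            # hypothetical limit function
g = lambda x: math.exp(x)  # hypothetical test function in L2

def f_n(n):
    """Hypothetical sequence converging to f in the mean."""
    return lambda x: x + math.sin(n * x) / n

def weak_gap(n):
    """|integral of g*f_n - integral of g*f| over [a, b]."""
    return abs(midpoint_integral(lambda x: g(x) * (f_n(n)(x) - f(x)), a, b))

def cbs_bound(n):
    """The bound ||g|| * ||f_n - f|| from the proof of Theorem 7."""
    return norm(g, a, b) * norm(lambda x: f_n(n)(x) - f(x), a, b)

results = [(n, weak_gap(n), cbs_bound(n)) for n in (1, 5, 25)]
for n, gap, bound in results:
    print(n, gap, bound)
```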
§ 3. ORTHOGONAL SYSTEMS

DEFINITION 1. Two measurable functions $f(x)$ and $g(x)$ defined on the closed interval $[a, b]$ are said to be orthogonal if
$$\int_a^b f(x) g(x)\,dx = 0.$$

DEFINITION 2. A measurable function $f(x)$ defined on $[a, b]$ is said to be normalized if
$$\int_a^b f^2(x)\,dx = 1.$$

DEFINITION 3. A system of functions $\omega_1(x), \omega_2(x), \omega_3(x), \ldots$, defined on the segment $[a, b]$, is said to be an orthonormal system if every function of the system is normalized and every pair of distinct functions of the system are orthogonal. In other words, a system of functions $\{\omega_k(x)\}$ is orthonormal if
$$\int_a^b \omega_i(x) \omega_k(x)\,dx = \begin{cases} 1 & (i = k), \\ 0 & (i \ne k). \end{cases}$$
It is clear that every orthonormal system is contained in $L_2$. A classical example of an orthonormal system is the trigonometric system
$$\frac{1}{\sqrt{2\pi}},\ \frac{\cos x}{\sqrt{\pi}},\ \frac{\sin x}{\sqrt{\pi}},\ \frac{\cos 2x}{\sqrt{\pi}},\ \frac{\sin 2x}{\sqrt{\pi}},\ \ldots, \tag{1}$$
defined on the segment $[-\pi, +\pi]$.

Suppose that $f(x) \in L_2$ and that $f(x)$ is a linear combination of functions of a given orthonormal system:
$$f(x) = c_1 \omega_1(x) + \ldots + c_n \omega_n(x).$$
Multiplying this equality by $\omega_k(x)$ ($k = 1, \ldots, n$) and integrating, we find
$$c_k = \int_a^b f(x) \omega_k(x)\,dx,$$
i.e., the coefficients $c_1, \ldots, c_n$ are completely determined. In particular, if
$$T(x) = A + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx),$$
then
$$A = \frac{1}{2\pi} \int_{-\pi}^{\pi} T(x)\,dx, \qquad a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} T(x) \cos kx\,dx, \qquad b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} T(x) \sin kx\,dx.$$
These formulas were discovered by Fourier for the trigonometric system. It is natural to give the following general definition.

DEFINITION 4. Let $\{\omega_k(x)\}$ be an orthonormal system and let $f(x)$ be a function in $L_2$. The numbers
$$c_k = \int_a^b f(x) \omega_k(x)\,dx$$
are called the Fourier coefficients of the function $f(x)$ in the system $\{\omega_k(x)\}$. The series
$$\sum_{k=1}^{\infty} c_k \omega_k(x)$$
is called the Fourier series of the function $f(x)$ in the system $\{\omega_k(x)\}$.

We now consider how near the partial sum of the Fourier series of the function $f(x)$,
$$S_n(x) = \sum_{k=1}^{n} c_k \omega_k(x),$$
is, in $L_2$, to the function $f(x)$, i.e., we will compute $\|f - S_n\|$. To do this, we first evaluate the integrals
$$\int_a^b f(x) S_n(x)\,dx \qquad \text{and} \qquad \int_a^b S_n^2(x)\,dx.$$
For the first integral we have
$$\int_a^b f(x) S_n(x)\,dx = \sum_{k=1}^{n} c_k \int_a^b f(x) \omega_k(x)\,dx = \sum_{k=1}^{n} c_k^2.$$
In exactly the same way,
$$\int_a^b S_n^2(x)\,dx = \sum_{i,k} c_i c_k \int_a^b \omega_i(x) \omega_k(x)\,dx = \sum_{k=1}^{n} c_k^2.$$
It follows that
$$\|f - S_n\|^2 = \int_a^b (f^2 - 2 f S_n + S_n^2)\,dx = \int_a^b f^2\,dx - \sum_{k=1}^{n} c_k^2, \tag{2}$$
that is,
$$\|f - S_n\|^2 = \|f\|^2 - \sum_{k=1}^{n} c_k^2. \tag{3}$$
Equation (3) is called Bessel's identity. Since its left member is non-negative, we obtain Bessel's inequality from it:
$$\sum_{k=1}^{n} c_k^2 \le \|f\|^2.$$
Since the number $n$ in Bessel's inequality is an arbitrary positive integer, Bessel's inequality can be written in the strengthened form
$$\sum_{k=1}^{\infty} c_k^2 \le \|f\|^2. \tag{4}$$
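Bessel's inequality can be observed on a concrete function. The sketch below is a minimal illustration under assumptions not in the text: the sample function f(x) = x on [-pi, pi], the first 21 functions of the trigonometric system (1), a midpoint rule for the integrals, and hypothetical helper names. For this f the sine coefficient of sin(kx)/sqrt(pi) is known in closed form to be 2 sqrt(pi) (-1)^(k+1) / k, and the squared norm is 2 pi^3 / 3.

```python
import math

def midpoint_integral(func, a, b, steps=20_000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def trig_system(count):
    """First `count` functions of the orthonormal trigonometric system (1)."""
    funcs = [lambda x: 1.0 / math.sqrt(2.0 * math.pi)]
    k = 1
    while len(funcs) < count:
        funcs.append(lambda x, k=k: math.cos(k * x) / math.sqrt(math.pi))
        funcs.append(lambda x, k=k: math.sin(k * x) / math.sqrt(math.pi))
        k += 1
    return funcs[:count]

def fourier_coefficients(f, system, a, b):
    """Fourier coefficients c_k = integral of f * omega_k over [a, b]."""
    return [midpoint_integral(lambda x, w=w: f(x) * w(x), a, b) for w in system]

f = lambda x: x  # hypothetical sample function
system = trig_system(21)
coeffs = fourier_coefficients(f, system, -math.pi, math.pi)

bessel_sum = sum(c * c for c in coeffs)
norm_squared = midpoint_integral(lambda x: f(x) ** 2, -math.pi, math.pi)

print(bessel_sum, norm_squared, bessel_sum <= norm_squared)
```

The gap between the two printed numbers is the quantity $\|f - S_n\|^2$ of Bessel's identity (3) for the chosen number of terms.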
In particular, we may have
$$\sum_{k=1}^{\infty} c_k^2 = \|f\|^2; \tag{5}$$
this equation is called Parseval's identity. It has a very simple meaning. Namely, Bessel's identity (3) and (5) permit us to write
$$\lim_{n\to\infty} \|f - S_n\| = 0.$$
In other words, Parseval's identity implies that the partial sums $S_n(x)$ of the Fourier series of the function $f(x)$ converge (in the sense of the metric in $L_2$, that is, in the mean) to $f(x)$.

DEFINITION 5. The orthogonal system $\{\omega_k(x)\}$ is said to be closed if Parseval's identity holds for all functions in $L_2$.

THEOREM 1. If the system $\{\omega_k(x)\}$ is closed, and if $f(x)$ and $g(x)$ are functions in $L_2$, we have
$$\int_a^b f(x) g(x)\,dx = \sum_{k=1}^{\infty} a_k b_k,$$
where
$$a_k = \int_a^b f(x) \omega_k(x)\,dx, \qquad b_k = \int_a^b g(x) \omega_k(x)\,dx.$$

Proof. The numbers $a_k + b_k$ are the Fourier coefficients for the sum $f(x) + g(x)$. Therefore
$$\|f + g\|^2 = \sum_{k=1}^{\infty} (a_k + b_k)^2,$$
and direct computation gives
$$\int_a^b f^2\,dx + 2 \int_a^b f g\,dx + \int_a^b g^2\,dx = \sum_{k=1}^{\infty} a_k^2 + 2 \sum_{k=1}^{\infty} a_k b_k + \sum_{k=1}^{\infty} b_k^2.$$
This is equivalent to the theorem.
The formula just established is a generalized form of Parseval's identity.

COROLLARY. If the orthonormal system $\{\omega_k(x)\}$ is closed and $f(x) \in L_2$, then the Fourier series of $f(x)$ in the system $\{\omega_k(x)\}$ can be integrated term by term over an arbitrary measurable set $E \subset [a, b]$, i.e.,
$$\int_E f(x)\,dx = \sum_{k=1}^{\infty} c_k \int_E \omega_k(x)\,dx.$$
In fact, if we take the characteristic function of the set $E$ for $g(x)$, then it is clear that $g(x)$ is square-summable. We thus have a special case of the generalized Parseval's identity. It is interesting to note that the Fourier series itself, $\sum c_k \omega_k(x)$, need not converge pointwise to the function $f(x)$.

THEOREM 2 (V. A. STEKLOV). Let $A$ be a class of functions everywhere dense in $L_2$. If Parseval's identity holds for all functions of the class $A$, then the system $\{\omega_k(x)\}$ is closed.

Proof. Let $f(x)$ be a function of the class $L_2$. We form the partial sums of its Fourier series
$$S_n(x) = \sum_{k=1}^{n} c_k \omega_k(x)$$
and emphasize their dependence on the function $f(x)$ by writing them as $S_n(f)$. It is easy to verify that
1) $S_n(kf) = k S_n(f)$,
2) $S_n(f_1 + f_2) = S_n(f_1) + S_n(f_2)$,
3) $\|S_n(f)\| \le \|f\|$.
The first two properties are trivial. The third follows from Bessel's inequality
$$\|S_n\|^2 = \sum_{k=1}^{n} c_k^2 \le \|f\|^2.$$
Now take an arbitrary $\varepsilon > 0$ and find a function $g(x) \in A$ such that
$$\|f - g\| < \frac{\varepsilon}{3}.$$
Furthermore,
$$\|S_n(g) - S_n(f)\| = \|S_n(g - f)\| \le \|g - f\| < \frac{\varepsilon}{3}.$$
Since Parseval's identity holds for $g(x)$, we have
$$\|g - S_n(g)\| < \frac{\varepsilon}{3}$$
for $n > n_0$, and, consequently,
$$\|f - S_n(f)\| < \varepsilon$$
for $n > n_0$. This proves the theorem.

COROLLARY 1. If Parseval's identity holds for all of the functions $1, x, x^2, x^3, \ldots$, then the system $\{\omega_k(x)\}$ is closed.

Consider, in fact, the polynomial
$$P(x) = A_0 + A_1 x + \ldots + A_m x^m.$$
Then
$$S_n(P) = \sum_{k=0}^{m} A_k S_n(x^k),$$
and accordingly
$$\|P - S_n(P)\| \le \sum_{k=0}^{m} |A_k| \cdot \|x^k - S_n(x^k)\|.$$
The right-hand member of this inequality has limit 0 as $n \to \infty$. Therefore Parseval's identity holds for every polynomial, and the class of polynomials P is everywhere dense in $L_2$.

Do closed systems exist? Another corollary of Steklov's theorem supplies the answer.

COROLLARY 2. The trigonometric system (1) is closed.

It suffices to verify Parseval's identity for an arbitrary trigonometric polynomial
$$T(x) = A + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx),$$
but this is obvious,² because $T(x)$ is a linear combination of functions of the system (1).

THEOREM 3 (F. RIESZ-E. FISCHER). Let an orthonormal system $\{\omega_k(x)\}$ be defined on the closed interval $[a, b]$. If the numbers $c_1, c_2, c_3, \ldots$ are such that
$$\sum_{k=1}^{\infty} c_k^2 < +\infty,$$
there is a function $f(x) \in L_2$ such that:
1) the numbers $c_k$ are the Fourier coefficients of $f(x)$;
2) Parseval's identity holds for $f(x)$.

² Let $f(x) = \sum_{k=1}^{n} c_k \omega_k(x)$. Then, multiplying this equation by $f(x)$ and integrating, we obtain at once the completeness formula
$$\int_a^b f^2(x)\,dx = \sum_{k=1}^{n} c_k^2.$$

Proof. Let
$$S_n(x) = \sum_{k=1}^{n} c_k \omega_k(x).$$
We shall show that the sequence $S_1, S_2, \ldots$ is a Cauchy sequence. For this purpose, let us evaluate $\|S_m - S_n\|$ for m > n:
$$\|S_m - S_n\|^2 = \sum_{k=n+1}^{m} c_k^2.$$
For every $\varepsilon > 0$, there exists a natural number N such that for m > n > N,
$$\sum_{k=n+1}^{m} c_k^2 < \varepsilon^2,$$
or, equivalently,
$$\|S_m - S_n\| < \varepsilon.$$
This implies that $\{S_n\}$ is a Cauchy sequence. Since $L_2$ is a complete metric space, there exists a function $f(x) \in L_2$ such that $\|S_n - f\| \to 0$. This function f satisfies our requirements. In fact, by Theorem 7, §2, the sequence $\{S_n(x)\}$ converges weakly to f(x), i.e.,
$$\lim_{n \to \infty} \int_a^b S_n(x)\,g(x)\,dx = \int_a^b f(x)\,g(x)\,dx$$
for every $g(x) \in L_2$. In particular, upon setting $g(x) = \omega_i(x)$, we obtain
$$\int_a^b f(x)\,\omega_i(x)\,dx = \lim_{n \to \infty} \int_a^b S_n(x)\,\omega_i(x)\,dx.$$
For n > i, we have
$$\int_a^b S_n(x)\,\omega_i(x)\,dx = c_i.$$
Accordingly,
$$\int_a^b f(x)\,\omega_i(x)\,dx = c_i$$
and the function f(x) satisfies requirement (1). It also follows that $S_n(x)$ is the partial sum of the Fourier series of the function f(x), and the relation $\|S_n - f\| \to 0$ shows that Parseval's identity holds for f(x).
REMARK. There exists only one function satisfying both conditions of the Riesz-Fischer theorem. Assume that there are two such functions, f(x) and g(x). By the first condition, they have a common Fourier series. The second condition implies that $S_n \to f$ and $S_n \to g$, which implies that f = g.

It is interesting to see if the preceding remark is still valid when we omit the second condition of the theorem. We need the following definition to answer this question.

DEFINITION 6. A system of functions $\{\varphi_k(x)\}$, defined on the closed interval [a, b], and belonging to $L_2$, is said to be complete if there exists no function different from zero³ in $L_2$ which is orthogonal to all the functions $\varphi_k(x)$.

We note that in the foregoing definition, no requirement is made that the system {
$$\int_a^b f(x)\,\omega_k(x)\,dx = \int_a^b g(x)\,\omega_k(x)\,dx \qquad (k = 1, 2, 3, \ldots),$$
then their difference, being orthogonal to all functions of the system, must be identically zero. Conversely, if the system is not complete and h(x) is a function different from zero and orthogonal to all functions of the system, then it is sufficient to add h(x) to any function f(x) satisfying the first condition in order to obtain a function distinct from f(x) also satisfying the first condition.

³ Recall that a function equivalent to zero is taken to be identically zero.

In the case of orthonormal systems, the concepts of closure and completeness coincide.

THEOREM 4. An orthonormal system $\{\omega_k(x)\}$ is complete if and only if it is closed.

Proof. Suppose that the system $\{\omega_k(x)\}$ is closed. If a function $f(x) \in L_2$ is orthogonal to all functions $\omega_k(x)$, then all of its Fourier coefficients are zero. The Parseval identity gives
$$\|f\|^2 = \sum_{k=1}^{\infty} c_k^2 = 0,$$
and the function f(x) is identically zero, i.e., the system is complete.

Conversely, suppose that the system $\{\omega_k(x)\}$ is complete. Assume that Parseval's identity fails for some function $g(x) \in L_2$. Then, necessarily,
$$\sum_{k=1}^{\infty} c_k^2 < \|g\|^2,$$
where the numbers
$$c_k = \int_a^b g(x)\,\omega_k(x)\,dx$$
are the Fourier coefficients of the function g(x). On the basis of the Riesz-Fischer
theorem, a function f(x) can be found such that
$$\int_a^b f(x)\,\omega_k(x)\,dx = c_k.$$
In this case, the difference f(x) - g(x) is orthogonal to all functions of the system and
$$f(x) = g(x),$$
since the system is complete. This contradicts the condition
l\fl!
~c~
(6)
k-1
converge. By the Riesz-Fischer theorem, the series
$$\sum_{k=1}^{\infty} c_k \omega_k(x) \tag{7}$$
is the Fourier series of some function $f(x) \in L_2$, and its partial sums
$$S_n(x) = \sum_{k=1}^{n} c_k \omega_k(x)$$
converge in the mean to f(x). Therefore, a partial sequence $\{S_{n_i}(x)\}$ of them can be formed which converges [to the function f(x)] almost everywhere on [a, b]. It turns out that the choice of the indices $n_i$ can be carried out without specifying the system $\{\omega_k(x)\}$, simply by using the series (6). A large number of investigations have been devoted to this problem. We introduce only the simplest results which are pertinent here.

THEOREM 5 (S. KACZMARZ). Let
$$r_n = \sum_{k=n}^{\infty} c_k^2.$$
If the natural numbers $n_1 < n_2 < \cdots$ have the property that
$$\sum_{i=1}^{\infty} r_{n_i} < +\infty, \tag{K}$$
then the sequence $\{S_{n_i}(x)\}$ converges almost everywhere.

Proof. Bessel's identity shows that
$$\|S_{n-1} - f\|^2 = \|f\|^2 - \sum_{k=1}^{n-1} c_k^2 = r_n.$$
(We assume that f(x) is the function which satisfies both conditions of the Riesz-Fischer
theorem.) Therefore, it follows from condition (K) that
$$\sum_{i=1}^{\infty} \int_a^b (S_{n_i - 1} - f)^2\,dx < +\infty$$
and, by the corollary to Theorem 11, §1, Chapter VI, we have
$$S_{n_i - 1}(x) \to f(x)$$
almost everywhere on [a, b]. On the other hand,
$$\sum_{i=1}^{\infty} \int_a^b (S_{n_i} - S_{n_i - 1})^2\,dx = \sum_{i=1}^{\infty} c_{n_i}^2 < +\infty,$$
and, by virtue of Theorem 11, §1, Chapter VI, we have
$$S_{n_i}(x) - S_{n_i - 1}(x) \to 0$$
almost everywhere on [a, b]. It follows that
$$S_{n_i}(x) \to f(x),$$
as we wished to prove.

THEOREM 6 (H. RADEMACHER). Let $\psi(k)$ be a positive, increasing function defined for k = 1, 2, 3, ..., such that $\lim_{k \to \infty} \psi(k) = \infty$, and such that
$$\sum_{k=1}^{\infty} \psi(k)\,c_k^2 < +\infty. \tag{8}$$
If the natural numbers $1 = n_1 < n_2 < n_3 < \cdots$ have the property that
$$\psi(n_i) \ge i, \tag{R}$$
then the sequence $\{S_{n_i}(x)\}$ converges almost everywhere.

Proof. We shall show that the numbers $n_i$ satisfying condition (R) also satisfy condition (K), and the theorem follows. For this purpose, we note that condition (8) can be written as
$$\sum_{i=1}^{\infty} \sum_{k=n_i}^{n_{i+1}-1} \psi(k)\,c_k^2 < +\infty. \tag{9}$$
Condition (R) implies that
$$\sum_{i=1}^{\infty} i \sum_{k=n_i}^{n_{i+1}-1} c_k^2 < +\infty. \tag{10}$$
Consequently the double series
$$\begin{array}{llll}
\displaystyle\sum_{k=n_1}^{n_2-1} c_k^2 & + \displaystyle\sum_{k=n_2}^{n_3-1} c_k^2 & + \displaystyle\sum_{k=n_3}^{n_4-1} c_k^2 & + \cdots \\
 & + \displaystyle\sum_{k=n_2}^{n_3-1} c_k^2 & + \displaystyle\sum_{k=n_3}^{n_4-1} c_k^2 & + \cdots \\
 & & + \displaystyle\sum_{k=n_3}^{n_4-1} c_k^2 & + \cdots \\
 & & & + \cdots
\end{array} \tag{11}$$
converges when summed by columns. Therefore it converges and is summable by rows. This is equivalent to condition (K) because the sum of the i-th row is $r_{n_i}$.

REMARK. Conditions (K) and (R) are equivalent. In fact, we have already seen that numbers $n_i$ satisfying condition (R) also satisfy condition (K). Conversely, let the numbers $n_i$ satisfy condition (K). Then, summing series (11) by rows, we obtain a finite sum. Summing it by columns, we see that (10) is fulfilled. If we set
$$\psi(k) = i \qquad (n_i \le k < n_{i+1}),$$
then (10) can be written in the form (9) or (8). Therefore $\psi(k)$ satisfies the conditions of Rademacher's theorem and the numbers $n_i$ satisfy condition (R).

§ 4. THE SPACE $l_2$
Points in two-dimensional Euclidean space $R_2$ are of course ordered pairs $(a_1, a_2)$ of real numbers. With each point $M(a_1, a_2) \in R_2$, we may consider also the vector x with initial point (0, 0) and terminal point M. The co-ordinates $a_1$ and $a_2$ of the point M are the projections of the vector x on the two co-ordinate axes. Therefore, a pair of numbers $(a_1, a_2)$ can be considered not only as a point M but also as a vector x. This point of view is very useful. Namely, given two vectors $x = (a_1, a_2)$ and $y = (b_1, b_2)$, one can form their sum
$$x + y = (a_1 + b_1,\ a_2 + b_2),$$
and one can multiply a vector $x = (a_1, a_2)$ by a real number k:
$$kx = (ka_1, ka_2).$$
Such operations cannot be carried out for points, if they are considered as purely geometric entities. The length of the vector $x = (a_1, a_2)$ is the number
$$\|x\| = \sqrt{a_1^2 + a_2^2}.$$
(This is merely the Pythagorean theorem.) The inner product (x, y) of two vectors $x = (a_1, a_2)$ and $y = (b_1, b_2)$ is the product of their lengths by the cosine of the angle between them:
$$(x, y) = \|x\| \cdot \|y\| \cdot \cos\theta.$$
In terms of the projections of the vectors, we have the well-known formula
$$(x, y) = a_1 b_1 + a_2 b_2.$$
Knowing this product and the lengths of both vectors, it is easy to find the angle between them from the relation
$$\cos\theta = \frac{(x, y)}{\|x\| \cdot \|y\|}.$$
In particular the vectors x and y are orthogonal if and only if
$$a_1 b_1 + a_2 b_2 = 0.$$
The above discussion can be repeated verbatim for three-dimensional space $R_3$.

1) A triple of numbers $x = (a_1, a_2, a_3)$, taken in a definite order, can be considered either as a point of the space $R_3$ or as a vector lying in $R_3$.

2) In the second interpretation, we can multiply a vector by a number and add two vectors. The length $\|x\|$ of the vector $x = (a_1, a_2, a_3)$, as the Pythagorean theorem shows, is
$$\|x\| = \sqrt{a_1^2 + a_2^2 + a_3^2}.$$

3) Given vectors $x = (a_1, a_2, a_3)$ and $y = (b_1, b_2, b_3)$, we can form their scalar product
$$(x, y) = a_1 b_1 + a_2 b_2 + a_3 b_3.$$

4) Knowing the scalar product (x, y) and the lengths of the vectors, we can find the angle $\theta$ between them:
$$\cos\theta = \frac{(x, y)}{\|x\| \cdot \|y\|} \qquad (0 \le \theta \le \pi).$$

5) Finally, the condition of orthogonality of vectors is
$$a_1 b_1 + a_2 b_2 + a_3 b_3 = 0.$$

Generalizing these relations, we can introduce the concept of n-dimensional Euclidean space $R_n$, whose points and vectors are ordered n-tuples of real numbers
$$x = (a_1, a_2, \ldots, a_n).$$
The length of the vector $x = (a_1, \ldots, a_n)$ is defined to be the number
$$\|x\| = \sqrt{\sum_{k=1}^{n} a_k^2}.$$
The inner product of vectors x and y cannot be defined by means of the angle between x and y, and we must take the equation
$$(x, y) = \sum_{k=1}^{n} a_k b_k$$
for its definition. The angle $\theta$ can be defined, on the other hand, from the scalar product by the relation
$$\cos\theta = \frac{(x, y)}{\|x\| \cdot \|y\|}.$$
For this definition to be reasonable, we must prove that
$$|(x, y)| \le \|x\| \cdot \|y\|.$$
When this has been proved (this is done below), it is natural to consider the vectors $x = (a_1, \ldots, a_n)$ and $y = (b_1, \ldots, b_n)$ as orthogonal if
$$(x, y) = \sum_{k=1}^{n} a_k b_k = 0.$$
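The definitions of length, inner product, and angle in n-dimensional space can be tried out numerically. The following is only a minimal sketch: the function names `inner`, `norm`, and `angle` and the two sample vectors are illustrative choices, not notation from the text.

```python
import math

def inner(x, y):
    """Inner product (x, y) = sum of a_k * b_k over the co-ordinates."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Length ||x|| = sqrt((x, x))."""
    return math.sqrt(inner(x, x))

def angle(x, y):
    """Angle between two non-zero vectors, taken in [0, pi]."""
    c = inner(x, y) / (norm(x) * norm(y))
    # Guard against rounding pushing the cosine slightly outside [-1, 1].
    c = max(-1.0, min(1.0, c))
    return math.acos(c)

# Two sample vectors of R_4 (hypothetical data chosen for illustration).
x = (1.0, 2.0, 2.0, 0.0)
y = (2.0, -1.0, 0.0, 5.0)

# The inequality |(x, y)| <= ||x|| * ||y|| is what makes the cosine meaningful.
assert abs(inner(x, y)) <= norm(x) * norm(y)
```

For these particular vectors the inner product vanishes, so they are orthogonal in the sense just defined and the computed angle is a right angle up to rounding.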
Continuing this process of generalization, we naturally come to the concept of the infinite-dimensional space $R_\infty$, which is also designated by $l_2$. In this, we restrict ourselves to the vector treatment of the space in question.

DEFINITION. An infinite sequence of real numbers
$$x = (a_1, a_2, a_3, \ldots)$$
is said to be an element of the space $l_2$ if
$$\|x\| = \sqrt{\sum_{k=1}^{\infty} a_k^2} < +\infty.$$
The number $\|x\|$ is called the length or the norm of the vector x. It is easy to see that if $x \in l_2$, then, for all real numbers k, the vector
$$kx = (ka_1, ka_2, ka_3, \ldots)$$
is also in $l_2$. Furthermore,
$$\|kx\| = |k| \cdot \|x\|,$$
and in particular $\|-x\| = \|x\|$. If the vector
$$y = (b_1, b_2, b_3, \ldots)$$
is in $l_2$ as well as x, then the vector sum
$$x + y = (a_1 + b_1,\ a_2 + b_2,\ a_3 + b_3,\ \ldots)$$
also lies in $l_2$, because
$$(a_k + b_k)^2 \le 2\,(a_k^2 + b_k^2).$$
It follows from the inequality
$$|a_k b_k| \le \frac{1}{2}\,(a_k^2 + b_k^2)$$
that the series
$$(x, y) = \sum_{k=1}^{\infty} a_k b_k$$
converges absolutely. Its sum is called the inner product of the vectors x and y.

A close connection exists between the spaces $l_2$ and $L_2$. Let $\{\omega_k(x)\}$ be any complete orthonormal system in $L_2$, and let f(x) be a function in $L_2$. Computing the Fourier coefficients
$$c_k = \int_a^b f(x)\,\omega_k(x)\,dx,^4$$
we obtain an infinite sequence
$$x = (c_1, c_2, c_3, \ldots).$$
It follows from Parseval's identity,
$$\|x\| = \sqrt{\sum_{k=1}^{\infty} c_k^2} = \|f\| < \infty,$$
that x is an element of $l_2$. In this way, we obtain a single-valued mapping of $L_2$ into $l_2$. It is easy to see from the foregoing discussion of $L_2$, however, that this mapping is one-to-one and that the image of $L_2$ is all of $l_2$. First, it is clear that distinct elements of $L_2$ have distinct corresponding elements of $l_2$, by virtue of the completeness of the system $\{\omega_k(x)\}$. By the Riesz-Fischer theorem, every element of $l_2$ is the sequence of Fourier coefficients of a function in $L_2$.

The above correspondence possesses other properties than that of being one-to-one onto. If f corresponds to x and g corresponds to y, then kf corresponds to kx and f + g corresponds to x + y. In other words, no linear relation
$$k_1 f_1 + \cdots + k_n f_n = 0$$
among the elements of $L_2$ is altered if we replace the elements $f_1, \ldots, f_n$ by elements of $l_2$ which correspond to them [the vector (0, 0, 0, ...) in $l_2$ is written as 0]. Upon combining this fact and the fact that
$$\|x\| = \|f\|,$$
the complete geometric identity of the spaces $L_2$ and $l_2$ becomes clear. For this reason, $l_2$ is also called Hilbert space.

⁴ We hope that the reader will not be confused by the fact that we use the letter x to denote the argument of the functions f(x), $\omega_k(x)$, ... and also to denote vectors of the space $l_2$.
Let
$$x = (a_1, a_2, a_3, \ldots), \qquad y = (b_1, b_2, b_3, \ldots)$$
be two vectors in $l_2$, and let f and g be the elements of $L_2$ corresponding to them. Parseval's identity yields
$$\int_a^b f(x)\,g(x)\,dx = \sum_{k=1}^{\infty} a_k b_k = (x, y).$$
It is thus natural to call the integral
$$\int_a^b f(x)\,g(x)\,dx$$
the scalar product of the elements f and g and to write it as (f, g). Thus
$$(f, g) = (x, y).$$
In this notation, the CBS inequality is written
$$|(f, g)| \le \|f\| \cdot \|g\|.$$
This implies the possibility of defining an angle $\theta$ between any two non-zero elements f and g of the space $L_2$:
$$\cos\theta = \frac{(f, g)}{\|f\| \cdot \|g\|}.$$
In particular, the definition given above of orthogonality of functions f(x) and g(x),
$$(f, g) = 0,$$
is equivalent to the condition that the angle between them be $\pi/2$. Further, if $\omega(x)$ is a normalized function,
$$\|\omega\| = 1,$$
then it can be considered as a unit vector of the space $L_2$ (or $l_2$). In this case, we can define the projection of the vector f in the direction of $\omega$ in the usual way:
$$\mathrm{Pr}_\omega f = \|f\| \cdot \cos\theta,$$
where $\theta$ is the angle between the vectors f and $\omega$. In other words,
$$\mathrm{Pr}_\omega f = \int_a^b f(x)\,\omega(x)\,dx.$$
Thus the Fourier coefficients of the function f(x) in an orthonormal system⁵ $\{\omega_k(x)\}$ are projections of the vector f in the directions characterized by the functions of the system.

⁵ This is not necessarily the system used to establish the relation between $L_2$ and $l_2$.
In n-dimensional Euclidean space, the length of a vector $x = (a_1, a_2, \ldots, a_n)$ is the number
$$\|x\| = \sqrt{\sum_{k=1}^{n} a_k^2}.$$
This is a generalization of the Pythagorean theorem because the numbers $a_k$ are the projections of the vector x on the co-ordinate axes. Let us consider m ($m \le n$) of these projections. In order to find out whether all n projections on the axes have been taken into consideration, we can simply compare the numbers m and n, and also we can observe whether for every vector x, the equality
$$\|x\|^2 = \sum_{k=1}^{m} a_k^2$$
is true (because if m < n, then necessarily vectors x exist for which $\sum_{k=1}^{m} a_k^2 < \|x\|^2$). Finally, we can see whether there exist directions orthogonal to all m axes taken into consideration.

For the infinite-dimensional space $L_2$, every orthonormal system $\{\omega_k(x)\}$ is a system of orthogonal co-ordinate axes. In checking to see whether all possible directions have been taken into consideration in setting up this system, one cannot simply count, as in the finite-dimensional case. Generalizing the two other methods indicated for n-dimensional space, we naturally arrive at the definitions of closed and complete orthonormal systems. The basic reason for the equivalence of these two definitions now becomes clear.

Up to this point, the connection between the spaces $l_2$ and $L_2$ has been used to establish certain new points of view regarding $L_2$ (which, of course, is very important). We will show that this connection is also useful for obtaining new facts. First of all, the inequality
$$|(x, y)| \le \|x\| \cdot \|y\| \tag{1}$$
is equivalent to the CBS inequality
$$|(f, g)| \le \|f\| \cdot \|g\|.$$
Therefore (1) is valid, and the valid inequality
$$\|f + g\| \le \|f\| + \|g\|$$
can be written in the form
$$\|x + y\| \le \|x\| + \|y\|. \tag{2}$$
Inequalities (1) and (2) are of a purely algebraic character. Further, the completeness of the space $l_2$ follows automatically from the completeness of the space $L_2$ (i.e., if $x_1, x_2, \ldots$ is a sequence for which $\|x_n - x_m\| \to 0$ as $m, n \to \infty$, then $x_1, x_2, \ldots$ has a limit in $l_2$).
Since every n-dimensional space $R_n$ is a closed subset of the space $l_2$, everything stated above (inequalities (1) and (2) and completeness) is applicable to $R_n$ as well.*

In conclusion, we treat another problem in which the connection between $L_2$ and $l_2$ turns out to be very useful. Let g be a fixed element of $L_2$. Consider the function
$$\Phi(f) = (f, g) \tag{3}$$
defined for all $f \in L_2$. The function $\Phi$ possesses the obvious properties

1) $\Phi(f_1 + f_2) = \Phi(f_1) + \Phi(f_2)$,
2) $|\Phi(f)| \le K \cdot \|f\|$ ($K = \|g\|$).

Every function $\Phi(f)$ defined for the elements of $L_2$, whose values are real numbers and which possesses properties 1) and 2), is said to be a bounded linear functional on the space $L_2$. Every bounded linear functional on the space $L_2$ has the form (3).

THEOREM (F. RIESZ). If $\Phi(f)$ is a bounded linear functional in the space $L_2$, then there exists one and only one element $g \in L_2$ such that for every $f \in L_2$,
$$\Phi(f) = (f, g).$$
* The author seems here to be flogging a dead horse. The inequality $|(x, y)| \le \|x\| \cdot \|y\|$, true for all $x, y \in l_2$, can be proved very simply by a direct method. The proof of Theorem 4, §1, can be imitated directly to obtain this inequality for elements of $R_n$, and a passage to the limit gives the inequality for $l_2$. Completeness of $R_n$ is obvious, and completeness of $l_2$ can be proved by a simple argument, very much like the proof of Theorem 5, §2. (E. H.)

Proof. Suppose that we have set up, using some complete orthonormal system in $L_2$, a one-to-one mapping of $L_2$ onto $l_2$ which preserves norms and carries sums and scalar multiples into the corresponding sums and scalar multiples. It is clear that the linear functional $\Phi$ on $L_2$ can be considered as a functional on $l_2$, where $\Phi(x)$ is defined as being equal to $\Phi(f)$, whenever x is the element of $l_2$ corresponding to $f \in L_2$. It is also clear that $\Phi(x_1 + x_2) = \Phi(x_1) + \Phi(x_2)$ and $|\Phi(x)| \le K \cdot \|x\|$, i.e., $\Phi$ is a bounded linear functional on $l_2$. We shall show that there exists an element $y \in l_2$ such that
$$\Phi(x) = (x, y) \tag{4}$$
for all $x \in l_2$. We prove first that the functional $\Phi$ is homogeneous, i.e., that
$$\Phi(ax) = a\,\Phi(x) \tag{5}$$
for all real a. Relation (5) obviously holds if a is a natural number. From this, it is easy to verify that it is also fulfilled in the case when a has the form $\frac{1}{m}$, where m is a natural number, and hence (5) holds for every positive rational number a. Further, designating the vector (0, 0, ...) by 0, we have
$$\Phi(0) = \Phi(0 + 0) = 2\,\Phi(0),$$
which implies that $\Phi(0) = 0$. Thus (5) holds for a = 0. Finally, it follows from the equalities
$$0 = \Phi(0) = \Phi[x + (-x)] = \Phi(x) + \Phi(-x)$$
that $\Phi(-x) = -\Phi(x)$, and hence (5) is true for every rational number a. It remains
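to consider the case when a is irrational. In this case, let r be any rational number. Then the equality
$$\Phi(rx) = r\,\Phi(x) \tag{6}$$
goes over into (5) in the limit as $r \to a$. This is obvious for the right member of (6). On the other hand,
$$|\Phi(rx) - \Phi(ax)| = |\Phi[(r - a)x]| \le K \cdot |r - a| \cdot \|x\|,$$
so that the left member of (6) tends to $\Phi(ax)$.

Now let
$$e_k = (0, \ldots, 0, 1, 0, \ldots)$$
(unit in the k-th place) and set
$$A_k = \Phi(e_k) \qquad (k = 1, 2, \ldots).$$
The vector
$$y = (A_1, A_2, A_3, \ldots)$$
is in $l_2$. In fact, if
$$y_n = (A_1, A_2, \ldots, A_n, 0, 0, \ldots),$$
then $y_n = \sum_{k=1}^{n} A_k e_k$, and
$$\Phi(y_n) = \sum_{k=1}^{n} A_k\,\Phi(e_k) = \sum_{k=1}^{n} A_k^2.$$
The inequality $|\Phi(x)| \le K \cdot \|x\|$ applied to $y_n$ gives
$$\sum_{k=1}^{n} A_k^2 \le K \sqrt{\sum_{k=1}^{n} A_k^2},$$
and this shows that
$$\|y\| \le K,$$
since n is arbitrary. The element y is the one sought, i.e., (4) holds for all $x \in l_2$. Let
$$x = (a_1, a_2, a_3, \ldots)$$
be an element of $l_2$. We set
$$x_n = (a_1, a_2, \ldots, a_n, 0, 0, \ldots).$$
Then $x_n = \sum_{k=1}^{n} a_k e_k$ and
$$\Phi(x_n) = \sum_{k=1}^{n} a_k\,\Phi(e_k) = \sum_{k=1}^{n} A_k a_k. \tag{7}$$
As $n \to \infty$, the right member of this equation tends to (x, y). On the other hand,
$$|\Phi(x) - \Phi(x_n)| = |\Phi(x - x_n)| \le K \cdot \|x - x_n\| = K \sqrt{\sum_{k=n+1}^{\infty} a_k^2} \to 0,$$
and (7) goes over into (4) in the limit.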
Let g be the element of $L_2$ which corresponds to the element y in the correspondence described above. Let f be an arbitrary element of $L_2$ and let x be the element of $l_2$ corresponding to it. Then
$$\Phi(f) = \Phi(x) = (x, y) = (f, g).$$
It remains to verify that there is only one element g in $L_2$ satisfying the relation
$$\Phi(f) = (f, g)$$
for arbitrary f. If there were two such elements $g_1$ and $g_2$, then we would have
$$(f, g_1 - g_2) = (f, g_1) - (f, g_2) = 0$$
for all $f \in L_2$. Setting $f = g_1 - g_2$, we have
$$(g_1 - g_2,\ g_1 - g_2) = \|g_1 - g_2\|^2 = 0,$$
and therefore $g_1 = g_2$. This completes the present proof.
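The construction used in the proof can be illustrated in the finite-dimensional space $R_n$, where no limiting argument is needed. The sketch below is only an illustration under stated assumptions: the functional `phi` is a hypothetical example, and the helper names `basis_vector` and `representing_vector` are not from the text. The representing vector is obtained exactly as in the proof, by evaluating the functional on the unit vectors $e_k$.

```python
def inner(x, y):
    """Inner product (x, y) = sum of a_k * b_k."""
    return sum(a * b for a, b in zip(x, y))

def basis_vector(k, n):
    """The vector e_k of R_n: unit in the k-th place (0-based), zeros elsewhere."""
    return [1.0 if j == k else 0.0 for j in range(n)]

def representing_vector(phi, n):
    """Build y = (A_1, ..., A_n) with A_k = phi(e_k), as in the proof."""
    return [phi(basis_vector(k, n)) for k in range(n)]

def phi(x):
    """A hypothetical linear functional on R_4, chosen only for illustration."""
    return 2.0 * x[0] - x[1] + 0.5 * x[3]

n = 4
y = representing_vector(phi, n)

# For any x, the value phi(x) should agree with the inner product (x, y).
x = [1.0, 3.0, -2.0, 4.0]
assert abs(phi(x) - inner(x, y)) < 1e-12
```

In $l_2$ itself the same recipe produces infinitely many co-ordinates $A_k$, and the boundedness of the functional is what guarantees that the resulting sequence is square-summable.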
§ 5. LINEARLY INDEPENDENT SYSTEMS
DEFINITION 1. A system of functions $\varphi_1(x), \varphi_2(x), \ldots, \varphi_n(x)$, defined on [a, b], is said to be linearly dependent if it is possible to find a set of constants $A_1, A_2, \ldots, A_n$, at least one of which is different from zero, such that
$$A_1\varphi_1(x) + A_2\varphi_2(x) + \cdots + A_n\varphi_n(x) \sim 0. \tag{1}$$
If, however, there is no such set of constants, i.e., if (1) implies that
$$A_1 = A_2 = \cdots = A_n = 0,$$
then the system $\{\varphi_k(x)\}$ is said to be linearly independent. It is easy to see that if at least one function of the system {
$$\sum_{k=1}^{n} A_k \omega_k(x) \sim 0,$$
then, multiplying this equality by $\omega_i(x)$ and integrating, we find that
$$A_i = 0 \qquad (i = 1, 2, \ldots, n).$$
This implies that $\{\omega_k(x)\}$ is a linearly independent system.

THEOREM 2. The system of functions $x^{n_1}, x^{n_2}, \ldots, x^{n_i}$, where the exponents $n_1, n_2, \ldots, n_i$ are distinct non-negative integers, is linearly independent on every closed interval.

This theorem follows from the fact that an integral polynomial can have only a finite number of roots.

DEFINITION 2. A denumerable system of functions $\varphi_1(x), \varphi_2(x), \ldots$ is said to be linearly independent if every finite subset of it is linearly independent.
For example, every denumerable orthonormal system and the system $1, x, x^2, \ldots$ are linearly independent. Let $\varphi_1(x), \ldots, \varphi_n(x)$ be a system of functions in $L_2$, defined on the closed interval [a, b]. As above, we set
$$(f, g) = \int_a^b f(x)\,g(x)\,dx$$
for two arbitrary functions f(x) and g(x) in $L_2$. Let us form the determinant
$$\Delta_n = \begin{vmatrix}
(\varphi_1, \varphi_1) & (\varphi_1, \varphi_2) & \cdots & (\varphi_1, \varphi_n) \\
\vdots & \vdots & & \vdots \\
(\varphi_n, \varphi_1) & (\varphi_n, \varphi_2) & \cdots & (\varphi_n, \varphi_n)
\end{vmatrix}.$$

DEFINITION 3. The determinant $\Delta_n$ is called the Gram determinant of the system of functions $\{\varphi_k(x)\}$.

THEOREM 3. The system of functions
$$\varphi_1(x),\ \varphi_2(x),\ \ldots,\ \varphi_n(x) \tag{2}$$
is linearly dependent if and only if its Gram determinant is zero.

Proof. Suppose that the system (2) is linearly dependent. Then there exists a set of constants $A_1, \ldots, A_n$, at least one of which is different from zero, such that
$$A_1\varphi_1(x) + A_2\varphi_2(x) + \cdots + A_n\varphi_n(x) \sim 0. \tag{3}$$
Multiplying this equation by $\varphi_i(x)$ (i = 1, 2, ..., n) and integrating, we obtain
$$\begin{aligned}
A_1(\varphi_1, \varphi_1) + A_2(\varphi_1, \varphi_2) + \cdots + A_n(\varphi_1, \varphi_n) &= 0, \\
&\ \ \vdots \\
A_1(\varphi_n, \varphi_1) + A_2(\varphi_n, \varphi_2) + \cdots + A_n(\varphi_n, \varphi_n) &= 0.
\end{aligned} \tag{4}$$
If we regard the numbers $A_k$ as unknowns in the equations (4), we see that these equations form a linear system of equations with determinant $\Delta_n$. The homogeneous system (4) is satisfied for the numbers $A_k$ and, since not all $A_k$ are zero, it follows that
$$\Delta_n = 0, \tag{5}$$
which proves the necessity of this condition.

Suppose next that $\Delta_n = 0$. Then the homogeneous linear system of equations (4) has a solution different from zero. Let $A_1, A_2, \ldots, A_n$ be this solution, so that equations (4) are identities. We rewrite these identities in the form
$$\begin{aligned}
\int_a^b \varphi_1(x)\,[A_1\varphi_1(x) + \cdots + A_n\varphi_n(x)]\,dx &= 0, \\
&\ \ \vdots \\
\int_a^b \varphi_n(x)\,[A_1\varphi_1(x) + \cdots + A_n\varphi_n(x)]\,dx &= 0.
\end{aligned}$$
Multiplying these equations by $A_1, \ldots, A_n$, respectively, and adding, we find that
$$\int_a^b [A_1\varphi_1(x) + \cdots + A_n\varphi_n(x)]^2\,dx = 0.$$
Relation (3) follows at once, and the system $\{\varphi_k(x)\}$ is linearly dependent.

COROLLARY. If $\Delta_n \ne 0$, then none of the determinants $\Delta_1, \Delta_2, \Delta_3, \ldots, \Delta_{n-1}$ is zero.⁶ In fact, if $\Delta_n \ne 0$, the system {
(91• 'f1) ('ft• 90 · • • (91• 'fn-1)
(x) =
... . ... . .... .. . - .. .. .. . . . . - . . .. . . . . .
(6)
Then
(k
Hence, if the system $\{\varphi_k(x)\}$ is linearly independent, then $\psi_n(x)$ is not equivalent to zero (since $\Delta_{n-1} \ne 0$). Multiplying equation (7) by $\psi_n(x)$, integrating and referring to the lemma, we find
$$\int_a^b \psi_n^2(x)\,dx = \Delta_{n-1}\Delta_n, \tag{8}$$
so that the determinants $\Delta_n$ and $\Delta_{n-1}$ (not equal to zero) have the same sign. For the same reason, the determinants $\Delta_{n-1}$ and $\Delta_{n-2}$, etc., have the same sign. Therefore $\Delta_n$ has the same sign as $\Delta_1 = (\varphi_1, \varphi_1) > 0$. We have thus established the following theorem.

THEOREM 4. The Gram determinant of a linearly independent system is positive.

⁶ $\Delta_1$ is taken to be $(\varphi_1, \varphi_1)$.

The reasoning developed above leads to a simple proof of the following fact.

THEOREM 5 (E. SCHMIDT). Let a finite or denumerable linearly independent system
qJ 1 (x), qJ 2 (x), ... be given on the closed interval [a, b]. Then it is possible to construct an orthonormal system w 1 (x), w 2 (x), ... such that I) each function Wn (x) is a linear combinatio.n of the .first nfunctions of the system { ((lk (x)} and, conversely, 2) every function
(n;;:::: 2), where !Jin (x) is defined by inequality (6). it is clear from (7) that
This defines the required system. In fact, n
wn
(.x:) = ~ ak(/lk(x), k=l
so that the system { wk(x)} satisfies the first requirement of the theorem. It follows from the lemma that ljln (x), and hence, wn (x) also, is orthogonal to all the functions qJ 1 (x), ... ,
9n (x) =
~
bkwk
(9)
(x).
k=l
This is trivial for n = I. Suppose that it has been proved for all n < m. Then, by (7),
whence, replacing ~jim (x) in the right member by V Ll.m-l Ll.m wm (x), and each
is linearly independent on the closed interval [-1, 1]. Applying the process of orthogonalization indicated in the theorem to it, we obtain a system of polynomials
$$L_0(x),\ L_1(x),\ L_2(x),\ \ldots \tag{10}$$
orthonormal on [-1, +1], where $L_n(x)$ is a polynomial of degree n.⁷ The polynomials (10) are called Legendre polynomials.

⁷ By Theorem 5, the degree of $L_n(x)$ is not greater than n. But it is also not less than n because
$$x^n = \sum_{k=0}^{n} a_k L_k(x).$$
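The orthogonalization of the powers of x on [-1, 1] can be carried out by machine in exact rational arithmetic. The sketch below is a hedged illustration only: it uses the classical Gram-Schmidt recursion (subtracting projections) rather than the determinant formula of the text, and it stops before normalization, so the output polynomials agree with the Legendre polynomials (10) only up to positive constant factors. Polynomials are stored as coefficient lists, lowest degree first; all function names are illustrative.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q):
    """Scalar product (p, q): integral of p(x) q(x) over [-1, 1], computed exactly.

    The integral of x^m over [-1, 1] is 0 for odd m and 2/(m+1) for even m.
    """
    total = Fraction(0)
    for m, c in enumerate(poly_mul(p, q)):
        if m % 2 == 0:
            total += c * Fraction(2, m + 1)
    return total

def orthogonalize(polys):
    """Classical Gram-Schmidt without normalization; degrees must be increasing."""
    result = []
    for p in polys:
        r = list(p)
        for u in result:
            coeff = inner(p, u) / inner(u, u)
            padded = u + [Fraction(0)] * (len(r) - len(u))
            r = [a - coeff * b for a, b in zip(r, padded)]
        result.append(r)
    return result

# The system 1, x, x^2, x^3 as coefficient lists.
monomials = [[Fraction(0)] * n + [Fraction(1)] for n in range(4)]
ortho = orthogonalize(monomials)
```

Dividing each resulting polynomial by the square root of its own scalar square would give an orthonormal system; that step is omitted here to keep the arithmetic exact.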
THEOREM 6. The system of Legendre polynomials is closed.

Proof. According to Schmidt's theorem,
$$x^n = \sum_{k=0}^{n} a_k L_k(x). \tag{11}$$
At the beginning of §3, we showed that the coefficients $a_k$ are the Fourier coefficients of the function $x^n$ in the orthonormal system $\{L_k(x)\}$. Multiplying (11) by $x^n$ and integrating between the limits -1 and +1, we find that
$$\|x^n\|^2 = \sum_{k=0}^{n} a_k^2,$$
i.e., Parseval's identity is valid for each of the functions $x^n$ (n = 0, 1, 2, ...). By Corollary 1 of Theorem 2, §3, our theorem is proved.

COROLLARY. The system of functions $1, x, x^2, \ldots$ is complete in $L_2([-1, 1])$.

§ 6. THE SPACES $L_p$ AND $l_p$
In this section we will take up a certain generalization of the space $L_2$.

DEFINITION 1. A measurable function (as above, we restrict our discussion to functions defined on some closed interval [a, b]) is said to be p-th power summable, where $p \ge 1$, if
$$\int_a^b |f(x)|^p\,dx < +\infty.$$
It is customary to denote the set of such functions by the symbol $L_p$. Obviously, $L_1 = L$.

THEOREM 1. If f(x) is p-th power summable (p > 1), then f(x) is summable, i.e., $L_p \subset L$.

If we set E = [a, b], A = E(|f| < 1), B = E - A, the summability of the function f(x) on the set A is obvious, and its summability on the set B follows from the fact that $|f(x)| \le |f(x)|^p$ on this set. The following theorem is proved similarly.

THEOREM 2. The sum of two functions in $L_p$ is again a function in $L_p$.

Let f(x) and g(x) be in $L_p$. Setting
$$E = [a, b], \qquad A = E(|f| \le |g|), \qquad B = E - A,$$
we will have for $x \in A$:
$$|f(x) + g(x)|^p \le \{|f(x)| + |g(x)|\}^p \le 2^p\,|g(x)|^p,$$
and consequently
$$\int_A |f(x) + g(x)|^p\,dx \le 2^p \int_A |g(x)|^p\,dx < +\infty.$$
We show similarly that the integral $\int_B |f + g|^p\,dx$ is finite. We note the further obvious fact that all functions kf(x), where k is a finite constant, are in $L_p$ if f(x) is in $L_p$.
>
SPACES
Lp
AND
lp
197
1. The number
q=-'-~ p-1
is called the index conjugate to p. Since
..!... + ..!_= 1 p q the index conjugate to q, is p. That is, p and q are mutually conjugate indices. In particular, if p = 2, then q = 2 also. Thus, the index 2 is self-conjugate. (Certain important properties of the space L 2, which are not ¥alid for other LP, have their root in this fact.) THEOREM 3 (HOLDER'S INEQUALITY). Iff (x) E LP and g (x) E Lq, where p and q are mutually conjugate and p > 1, then the product f(x) g (x) is summable and the inequality
1
b
r
\J!(x)g(x)dx\<:Ji a
is valid. Proof. Let 0 < a.
b
qr
..
flJ(x)IPd:c·
Jl
a
<
b
(1)
fJg(x)lqdx, a
1. Consider the function
<
(0 <x -\-oo) Its derivative, lj!' (x) = a. [ .xa1], is positive for 0 < x < 1 and negative for x > 1. Therefore the maximum value of the function t\1 (x) is attained for x = 1. Thus ~
(x) = xa-ax
1-
1 (x) <: o/(1) =
1 - a,
whence it follows that (2) for a~ x > 0 .. Let A .and B be tw~ positive numbers .. If we substitute x = mult1ply tbe·mequality thus obtamed by B, we obtam
NBl-« <: aA
~
in (2) and
+ (1- a) B.
Let p and q be the two mutually conjugate indices mentioned above. If we set 1 1 a. = -, 1 - a. =-, we see that p q
(3) Tne last inequality has been proved for A > 0 and B > 0, but it obviously holds also when A or B or both A and B are zero. Having established (3), we consider the functions f(x) and g (x) of the present theorem. If at least one of them is equivalent to zero, all assertions of the theorem are obvious. Excluding this trivial case, we can assert that both the integrals b
J]f(x)]Pdx, a
VII.
198
SQUARE-SUMMABLE FUNCTIONS
are strictly positive and therefore we may consider the functions
~(x)=
V
/(x)
1(X)=
'
fl t cx>IP dx
a
·if j
g(x)
1g
<x>lq dx:
a
If we set in (3), we obtain
I cp (x)l(x)J-< I 'P ~)IP whence the summability of the product product f(x) g (x). Besides, noting that
+ ll ~)lq ,
(4)
(x) y (x) follows and with it that of the
u
b
JI ~(x)[Pdx= Jlr(x)j11 dx= 1 a
a
and integrating (4), we find b
J!~(x)!(x)jdx.< :,
+ .~
=1.
a
The inequality
r
.. I
~~
Jlf(x) g(x)j dx <: V Jjf(x)jt' dx · V J\g (x) [lldx, b
a
p
q
a
b
a
follows at once ; this is a stronger inequality than (1). Holder's inequality is a generalization of the CBS inequality, which is merely Holder's inequality with p = 2. THEOREM 4. (MINKOWSKJ's INEQUALITY). If f(x) E LP and g (x) E LP (p > 1), then (5)
Proof. The theorem is very simple for p = 1. Consider then the case p > 1. Let q be the index conjugate to p. By Theorem 2, the sum f(x) + g(x) is in $L_p$, and, therefore, $|f(x) + g(x)|^{p/q}$ is in $L_q$. Now substitute |f(x)| for f(x) and $|f(x) + g(x)|^{p/q}$ for g(x) in Hölder's inequality. This yields
$$\int_a^b |f(x)| \cdot |f(x) + g(x)|^{p/q}\,dx \le \left( \int_a^b |f(x)|^p\,dx \right)^{1/p} \left( \int_a^b |f(x) + g(x)|^p\,dx \right)^{1/q}. \tag{6}$$
Similarly,
$$\int_a^b |g(x)| \cdot |f(x) + g(x)|^{p/q}\,dx \le \left( \int_a^b |g(x)|^p\,dx \right)^{1/p} \left( \int_a^b |f(x) + g(x)|^p\,dx \right)^{1/q}. \tag{7}$$
However, $p = 1 + \frac{p}{q}$. This implies that
$$|f + g|^p = |f + g| \cdot |f + g|^{p/q} \le |f| \cdot |f + g|^{p/q} + |g| \cdot |f + g|^{p/q}.$$
From these relations and (6) and (7), it follows that
$$\int_a^b |f + g|^p\,dx \le \left[ \left( \int_a^b |f|^p\,dx \right)^{1/p} + \left( \int_a^b |g|^p\,dx \right)^{1/p} \right] \left( \int_a^b |f + g|^p\,dx \right)^{1/q},$$
and the theorem follows [cancelling $\left( \int_a^b |f + g|^p\,dx \right)^{1/q}$ is permissible if this integral is different from zero, and otherwise the theorem is obvious].

Minkowski's inequality as given here is a generalization of the inequality of Theorem 5, §1, to which it reduces for p = 2. The inequalities of Hölder and Minkowski for finite sums are established in a very similar way:
$$\sum_{k=1}^{n} |a_k b_k| \le \left( \sum_{k=1}^{n} |a_k|^p \right)^{1/p} \left( \sum_{k=1}^{n} |b_k|^q \right)^{1/q}, \tag{8}$$
$$\left( \sum_{k=1}^{n} |a_k + b_k|^p \right)^{1/p} \le \left( \sum_{k=1}^{n} |a_k|^p \right)^{1/p} + \left( \sum_{k=1}^{n} |b_k|^p \right)^{1/p}. \tag{9}$$

DEFINITION. Let $f(x) \in L_p$. The number
$$\|f\| = \left( \int_a^b |f(x)|^p\,dx \right)^{1/p}$$
is called the norm of the function f(x) (considered as an element of $L_p$).

The following properties of the norm are obvious:

I. $\|f\| \ge 0$, and $\|f\| = 0$ if and only if $f(x) \sim 0$.
II. $\|kf\| = |k| \cdot \|f\|$ and in particular $\|-f\| = \|f\|$.
III. $\|f + g\| \le \|f\| + \|g\|$.

The introduction of a norm allows us to establish the same geometric terminology for $L_p$ as introduced above for $L_2$. The convergence of a sequence of elements $\{f_n(x)\}$ in $L_p$ to a limit $f(x) \in L_p$ in the norm means that
$$\lim_{n \to \infty} \int_a^b |f_n(x) - f(x)|^p\,dx = 0.$$
This form of convergence is called convergence in the mean of order p. As for $L_2$, we can show that the convergence of a sequence in the mean of order p implies its convergence to the same limit in measure. As for $L_2$, we can establish the continuity of the norm and the uniqueness of limits, if no distinction is made between equivalent functions. Exactly as in $L_2$, the concept of a Cauchy sequence is introduced. It is proved that a sequence of elements of $L_p$ has a limit if and only if it is a Cauchy sequence (the space $L_p$ is complete). Since there is nothing essentially new in passing from $L_2$ to $L_p$, we will not stop to give proofs of all of these statements. We note also without proof that each of the
classes M, C, P, S and T (the last for $b - a = 2\pi$), considered in Theorem 6, §2, is everywhere dense in $L_p$.

The concept of weak convergence for p > 1 is introduced as follows. A sequence $\{f_n(x)\} \subset L_p$ converges weakly to $f(x) \in L_p$ if the equality
$$\lim_{n \to \infty} \int_a^b f_n(x)\,g(x)\,dx = \int_a^b f(x)\,g(x)\,dx \tag{10}$$
holds for all functions g(x) in $L_q$, where q is the index conjugate to p. With the aid of Hölder's inequality, it is easy to show that a sequence converging in the mean converges weakly (to the same limit). If p = 1, the conjugate index does not exist. In this case, we say the sequence $\{f_n(x)\} \subset L$ converges weakly to $f(x) \in L$ if equality (10) holds for every measurable and bounded function g(x). It is clear that here also convergence in the mean (of first order) implies weak convergence to the same limit.

We mention briefly one further class of spaces of importance in analysis. These are the spaces $l_p$, where $p \ge 1$. The space $l_p$ is the set of all sequences
$$x = (x_1, x_2, x_3, \ldots)$$
of real numbers $x_k$ for which
$$\|x\| = \left( \sum_{k=1}^{\infty} |x_k|^p \right)^{1/p} < +\infty.$$
The number $\|x\|$ is called the norm of the element $x \in l_p$. As in the case of $l_2$, we define the sum $x + y$ of two elements of $l_p$ and the product $kx$ of an element $x \in l_p$ by a number $k$. The norm possesses the usual properties.
I. $\|x\| \ge 0$, where $\|x\| = 0$ if and only if $x = 0$.⁸
II. $\|kx\| = |k| \cdot \|x\|$ and, in particular, $\|-x\| = \|x\|$.
III. $\|x + y\| \le \|x\| + \|y\|$.
The first two of these properties are clear; the third follows from (9).⁹ With the aid of the concept of norm, the ideas of the limit of elements of $l_p$, Cauchy sequences, sets everywhere dense in $l_p$, etc., can be introduced. One can prove that the limit of a sequence of elements in $l_p$ is unique, that the norm is continuous, and that the space $l_p$ possesses the property of completeness. We shall not discuss this in detail.
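The three norm properties can be checked numerically on finite sequences, which may be regarded as elements of $l_p$ with all later terms zero. The following is only a minimal sketch: the names `lp_norm`, `add` and `scale` are illustrative helpers and not notation from the text, and the tolerances merely absorb floating-point rounding.

```python
def lp_norm(x, p):
    """Norm of a finite sequence x viewed as an element of l_p (requires p >= 1)."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def add(x, y):
    """Termwise sum x + y of two sequences of equal length."""
    return [s + t for s, t in zip(x, y)]


def scale(k, x):
    """Product kx of the sequence x by the number k."""
    return [k * t for t in x]


x = [1.0, -2.0, 0.5]
y = [0.25, 3.0, -1.0]
for p in (1, 1.5, 2, 3):
    # Property II: ||kx|| = |k| * ||x||
    assert abs(lp_norm(scale(-4.0, x), p) - 4.0 * lp_norm(x, p)) < 1e-12
    # Property III (a consequence of Minkowski's inequality)
    assert lp_norm(add(x, y), p) <= lp_norm(x, p) + lp_norm(y, p) + 1e-12
```

A check of this kind illustrates the properties on particular vectors; it is of course no substitute for the proof via inequality (9).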
§7. EDITOR'S APPENDIX TO CHAPTER VII
The notion of square-summable function need not be confined to finite intervals $[a, b]$. In fact, one may consider any measurable set $E$ on the line and study the class of all measurable functions $f(x)$ defined on $E$ for which
$$\int_E f^2(x)\, dx$$
is finite. All of the definitions, notation, and results of §1, and of §2 up to and including Definition 3, can be carried over to this more general class of Hilbert spaces with absolutely no change. Details are left to the reader.
⁸ That is, $x = (0, 0, 0, \ldots)$.
⁹ Inequality (9) deals with finite sums. By taking limits, the inequality (9) can be immediately generalized to sums of infinite series.
In discussing Theorem 6, §2, and its extension to spaces of square-summable functions on an arbitrary measurable set, it is convenient to restrict ourselves to the case $E = (-\infty, \infty)$. We have then the following result.
THEOREM 1. The following classes of functions are dense subsets of $L_2(-\infty, \infty)$: the class of bounded measurable functions which are square-summable over $(-\infty, \infty)$; the class of continuous functions $f(x)$ on $(-\infty, \infty)$ such that $f(x) = 0$ for $|x| > A$ (the constant $A$ depending upon the function $f(x)$); the class of step functions vanishing outside of finite intervals (the interval depending upon the particular function); the set of all functions of the form $p(x)\, e^{-x^2}$, where $p(x)$ is a polynomial.
Proof. The proofs that the first three function classes mentioned are dense in $L_2(-\infty, \infty)$ are very similar and are quite simple. Let us show, for example, that the continuous functions vanishing outside of finite intervals form a dense subspace of $L_2(-\infty, \infty)$. If $f(x)$ is an arbitrary function in $L_2(-\infty, \infty)$, and if $\varepsilon$ is an arbitrary positive number, then by Lebesgue's theorem on dominated convergence (Theorem 1, §4, Chapter VI), we infer that there exists a positive number $A$ such that
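$$\int_{-\infty}^{-A} f^2(x)\, dx < \frac{\varepsilon^2}{25}, \qquad \int_{A}^{\infty} f^2(x)\, dx < \frac{\varepsilon^2}{25}.$$
Now, by Theorem 6, §2, there exists a continuous function $\theta(x)$ defined on the closed interval $[-A, A]$ such that
$$\int_{-A}^{A} |f(x) - \theta(x)|^2\, dx < \frac{\varepsilon^2}{25}. \tag{1}$$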
(Note that if $f \in L_2(-\infty, \infty)$, then $f \in L_2(a, b)$ for all numbers $a < b$.) The function $\theta(x)$ can be extended to be a continuous function on $(-\infty, \infty)$ which vanishes identically outside of the interval $(-A - \delta, A + \delta)$ ($\delta > 0$) by making $\theta(x)$ linear in the intervals $[-A - \delta, -A]$ and $[A, A + \delta]$. It is easy to see that $\delta$ can be chosen so small that
$$\int_{-\infty}^{-A} \theta^2(x)\, dx < \frac{\varepsilon^2}{25}, \qquad \int_{A}^{\infty} \theta^2(x)\, dx < \frac{\varepsilon^2}{25}.$$
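We now have, making free use of Theorem 5, §1, that
$$\|f - \theta\| \le \left\{ \int_{-\infty}^{-A} |f(x) - \theta(x)|^2\, dx \right\}^{1/2} + \left\{ \int_{-A}^{A} |f(x) - \theta(x)|^2\, dx \right\}^{1/2} + \left\{ \int_{A}^{\infty} |f(x) - \theta(x)|^2\, dx \right\}^{1/2}$$
$$\le \left\{ \int_{-\infty}^{-A} f^2(x)\, dx \right\}^{1/2} + \left\{ \int_{-\infty}^{-A} \theta^2(x)\, dx \right\}^{1/2} + \left\{ \int_{-A}^{A} |f(x) - \theta(x)|^2\, dx \right\}^{1/2} + \left\{ \int_{A}^{\infty} f^2(x)\, dx \right\}^{1/2} + \left\{ \int_{A}^{\infty} \theta^2(x)\, dx \right\}^{1/2} < 5 \cdot \frac{\varepsilon}{5} = \varepsilon.$$
The proof that the class of all functions $p(x)\, e^{-x^2}$, where $p(x)$ is a polynomial, is a dense subclass of $L_2(-\infty, \infty)$ is somewhat more complicated. For a proof, we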
refer the reader to E. C. Titchmarsh, Introduction to the Theory of Fourier Integrals, Oxford University Press, 1937, page 79, Theorem 55.
We return to a discussion of the theorems and definitions of the text, as they are applied to $L_2(E)$ for unbounded $E$ and in particular to $L_2(-\infty, \infty)$. The definition of weak convergence (Definition 4, §2) remains the same for general $L_2(E)$, and Theorem 7, §2, obviously remains true. Orthogonality is defined in general $L_2(E)$ just as for $L_2(a, b)$. In fact, all of the definitions of §3 may be taken over without change to our general $L_2(E)$. All theorems of §3, except for the corollaries to Theorem 2 and the corollary to Theorem 4, are true for general $L_2(E)$, and the proofs can be repeated word for word.
In §4, a one-to-one correspondence is set up between $L_2(a, b)$ and the sequence space $l_2$, preserving sums, scalar multiples, norms, and inner products. It is clear that all that is needed for this purpose is a complete orthonormal sequence in $L_2$. Thus, if we can find a complete orthonormal system
$$\omega_1(x),\ \omega_2(x),\ \ldots,\ \omega_n(x),\ \ldots$$
in $L_2(E)$, we can assert that $L_2(E)$ is geometrically identical with $l_2$. There exist complete orthonormal systems in $L_2(E)$ for every measurable set $E$. For a proof of this fact, we refer the reader to M. H. Stone, Linear Transformations in Hilbert Space and Their Applications to Analysis, Amer. Math. Soc. Colloq. Publ. Vol. XV, 1932, Chapter I.
All definitions and theorems of §5 can be carried over with no change from the case of $L_2(a, b)$ to $L_2(E)$ for arbitrary measurable $E$, barring Theorem 6 and its corollary, which refer specifically to $L_2(-1, 1)$.
The spaces $L_p$ defined in §6 can be defined over every measurable set $E$, and all of the theorems of §6 dealing with $L_p(a, b)$ are also true for $L_p(E)$. One needs merely to replace integrals of the form $\int_a^b$ by integrals of the form $\int_E$.
Exercises for Chapter VII
1. Let $\{f_n(x)\}$ be a system of functions in $L_2$ converging in measure to $F(x)$. If $\|f_n\| \le K$, then the sequence $\{f_n(x)\}$ converges weakly to $F(x)$. (F. Riesz)
2. If the sequence $\{f_n(x)\}$ converges weakly in $L_2$ to $F(x)$, then $\|f_n\| \le K$. (H. Lebesgue)
3. Weak convergence in $L_2$ of the sequence $\{f_n(x)\}$ to $F(x)$ does not imply convergence in measure of $\{f_n(x)\}$ to $F(x)$.
4. If the sequence $\{f_n(x)\}$ converges weakly in $L_2$ to $F(x)$, and, in addition, $\|f_n\| \to \|F\|$, then the sequence $\{f_n(x)\}$ converges to $F(x)$ in the mean. (F. Riesz)
5. If the integral $\int_a^b f(x)\, g(x)\, dx$ exists for every $f(x) \in L_2$, then $g(x) \in L_2$. (H. Lebesgue)
6. Every orthonormal system is at most denumerable.
7. If $\{\omega_k(x)\}$ is a closed orthonormal system on the segment $[a, b]$, then $\sum_{k=1}^{\infty} \omega_k^2(x) = +\infty$ almost everywhere on $[a, b]$. (W. Orlicz)
8. Under the same conditions, $\sum_{k=1}^{\infty} \int_e \omega_k^2\, dx = +\infty$ for an arbitrary measurable set $e$ of measure $me > 0$. (W. Orlicz)
9. No finite system of functions is complete in $L_2$.
10. Let $\{\omega_k(x)\}$ ($k = 1, 2, \ldots, n$) be an orthonormal system and let $f(x) \in L_2$. Of all linear combinations $\sum_{k=1}^{n} A_k \omega_k(x)$, the norm of the difference $\left\| f - \sum_{k=1}^{n} A_k \omega_k \right\|$ has the least value when $A_k = (f, \omega_k)$ ($k = 1, 2, \ldots, n$).
11. Let $\{\omega_k(x)\}$ be a complete orthonormal system of functions. If $\{\psi_k(x)\}$ is a system of functions in $L_2$ such that $\sum_{k=1}^{\infty} \int_a^b [\omega_k(x) - \psi_k(x)]^2\, dx < 1$, then the system $\{\psi_k(x)\}$ is also complete. (N. K. Bari)
12. Let the function $f(x) \in L_2$ be defined on $[-\pi, \pi]$; suppose that $f(x + 2\pi) = f(x)$ and let
$$g_n(x) = \int_{1/n}^{\pi} \frac{f(x + t) - f(x - t)}{t}\, dt.$$
The functions $g_n(x)$ converge in the mean on $[-\pi, \pi]$ to a function $q(x) \in L_2$, where
$$\|q\| \le \|f\| \cdot 2\int_0^{\pi} \frac{\sin t}{t}\, dt,$$
and it is impossible to reduce the value of the multiplier of $\|f\|$. (I. P. Natanson)
13. Let $f(x) \in L_2$ on $[a, b]$ and let $f(x) = 0$ outside $[a, b]$. If
$$\varphi(x) = \frac{1}{2h} \int_{x-h}^{x+h} f(t)\, dt,$$
then $\|\varphi\| \le \|f\|$. (A. N. Kolmogorov)
14. Using the same notation, the functions $\varphi(x)$ converge in the mean in $L_2$ to $f(x)$ as $h \to 0$. (A. N. Kolmogorov)
15. Extend the results of exercises 1, 2, 3, 4, 5, 13 and 14 from $L_2$ to $L_p$ for $p > 1$.
16. Prove the completeness of the space $L_p$ for $p \ge 1$.
17. Prove the completeness of the space $l_p$ for $p \ge 1$.
18. Prove the completeness of the space $m$ of all bounded sequences $x = \{x_k\}$, where $\|x\| = \sup\{|x_k|\} < \infty$.
19. If in the set $C$ of all functions continuous on $[a, b]$ we introduce the norm $\|f\| = \max |f(x)|$, and sums and scalar multiples are as usual, then the space obtained is complete.
20. The system of functions $\psi_k(x)$ is said to be complete in the class of functions $A$ if, in the latter, there are no functions distinct from zero which are orthogonal to all the $\psi_k(x)$. It does not follow from the completeness of an orthogonal system in the class $(R)$ of all Riemann-integrable functions that this system is closed. (G. M. Fichtenholz)
21. If $1 < r < p$, then $L_p \subset L_r$.
22. If $1 < r < p$, then $l_p \supset l_r$.
23. Let $\{f_n(x)\} \subset L_p$ ($p > 1$) be a sequence of functions, convergent in the mean of order $p$ to the function $F(x) \in L_p$. Show that $\{f_n(x)\}$ converges to $F(x)$ in the mean of order $r$, where $1 < r < p$.
25. If the sequence $\{a_k\}$ is such that the series $\sum_{k=1}^{\infty} a_k x_k$ converges for every sequence $\{x_k\} \in l_p$ ($p > 1$), then $\{a_k\} \in l_q$, where $\frac{1}{p} + \frac{1}{q} = 1$.
26. If the sequence $\{a_k\}$ is such that the series $\sum_{k=1}^{\infty} a_k x_k$ converges for every sequence $\{x_k\} \in l = l_1$, then $\{a_k\} \in m$, i.e., $\sup\{|a_k|\} < +\infty$.
27. If $p > 1$ and the equality sign holds in Minkowski's inequality (5), then $g(x) = K f(x)$, where $K \ge 0$.
CHAPTER VIII
FUNCTIONS OF FINITE VARIATION. THE STIELTJES INTEGRAL
§ 1. MONOTONIC FUNCTIONS
As is known, a function $f(x)$ defined on the closed interval $[a, b]$ is said to be increasing if
$$f(x) \le f(y) \quad \text{for} \quad x < y. \tag{1}$$
If $f(x) < f(y)$ for $x < y$, then $f(x)$ is said to be a strictly increasing function. Analogously, a function $f(x)$ is said to be decreasing (strictly decreasing) if $f(x) \ge f(y)$ [$f(x) > f(y)$] for $x < y$. Increasing and decreasing functions are called monotonic or monotone (strictly monotonic). If the function $f(x)$ decreases, then $-f(x)$ increases. This simple remark permits us to consider only increasing functions in many problems involving monotonic functions. Monotonic functions will always be considered finite.
Let $f(x)$ be an increasing function defined on $[a, b]$ and let $a \le x_0 < b$. It is an elementary and well-known fact that for any sequence of points $x_1, x_2, x_3, \ldots$, tending to $x_0$ and lying to the right of $x_0$,
$$x_n \to x_0, \qquad x_n > x_0,$$
the limit
$$\lim_{n \to \infty} f(x_n)$$
exists and is finite. This limit is nothing but
$$\inf\{f(x)\} \quad (x_0 < x \le b)$$
and hence does not depend on the choice of the sequence $\{x_n\}$. It is denoted by $f(x_0 + 0)$. The symbol $f(x_0 - 0)$ is defined similarly. It is easy to see that
$$f(x_0 - 0) \le f(x_0) \le f(x_0 + 0) \quad (a < x_0 < b)$$
and that $f(a) \le f(a + 0)$, $f(b - 0) \le f(b)$.
Hence the function $f(x)$ is continuous at the point $x_0$ if and only if
$$f(x_0 - 0) = f(x_0) = f(x_0 + 0).$$
[For $x_0 = a$ ($x_0 = b$), we need to consider only the one-sided limit $f(a + 0)$ ($f(b - 0)$).] The numbers
$$f(x_0) - f(x_0 - 0) \quad \text{and} \quad f(x_0 + 0) - f(x_0)$$
are called the left and right saltus, respectively, of the function $f(x)$ at the point $x_0$, and their sum $f(x_0 + 0) - f(x_0 - 0)$ is called the saltus of the function $f(x)$ at this point. [For the points $a$ and $b$, only one-sided saltus are considered.]
LEMMA. Let an increasing function $f(x)$ be defined on $[a, b]$. Let $x_1, x_2, \ldots, x_n$ be arbitrary points lying in $(a, b)$. Then
$$[f(a + 0) - f(a)] + \sum_{k=1}^{n} [f(x_k + 0) - f(x_k - 0)] + [f(b) - f(b - 0)] \le f(b) - f(a). \tag{2}$$
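Proof. We may suppose that
$$a < x_1 < x_2 < \ldots < x_n < b.$$
Set $a = x_0$, $b = x_{n+1}$ and choose points $y_0, y_1, \ldots, y_n$ such that
$$x_k < y_k < x_{k+1} \quad (k = 0, 1, \ldots, n).$$
Then
$$f(x_k + 0) - f(x_k - 0) \le f(y_k) - f(y_{k-1}) \quad (k = 1, 2, \ldots, n),$$
$$f(a + 0) - f(a) \le f(y_0) - f(a),$$
$$f(b) - f(b - 0) \le f(b) - f(y_n).$$
Adding these inequalities, we obtain (2).
THEOREM 1. The set of points of discontinuity of an increasing function $f(x)$ defined on $[a, b]$ is at most denumerable. If $x_1, x_2, x_3, \ldots$ are all the points of discontinuity lying in $(a, b)$, then
$$[f(a + 0) - f(a)] + \sum_{k=1}^{\infty} [f(x_k + 0) - f(x_k - 0)] + [f(b) - f(b - 0)] \le f(b) - f(a). \tag{3}$$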
Proof. Designate the set of all points of discontinuity of the function $f(x)$ by $H$ and the set of those points of discontinuity of this function at which its saltus is greater than $\frac{1}{k}$ by $H_k$. Obviously,
$$H = H_1 + H_2 + H_3 + \ldots,$$
and the denumerability of $H$ follows from the fact that each $H_k$ is finite (by the Lemma, the sum of the saltus at any finite set of points cannot exceed $f(b) - f(a)$). Inequality (3) follows from (2) upon taking the limit as $n \to \infty$.
Let $f(x)$ be an increasing function defined on $[a, b]$. Define the function $s(x)$ by setting
$$s(a) = 0,$$
$$s(x) = [f(a + 0) - f(a)] + \sum_{x_k < x} [f(x_k + 0) - f(x_k - 0)] + [f(x) - f(x - 0)] \quad (a < x \le b).$$
The function $s(x)$ is called the saltus function of the function $f(x)$. Clearly, it is also an increasing function.
THEOREM 2. The difference
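$$\varphi(x) = f(x) - s(x)$$
between an increasing function and its saltus function is an increasing and continuous function.
Proof. Let $a \le x < y \le b$. Applying inequality (3) to the closed interval $[x, y]$, we find that
$$s(y) - s(x) \le f(y) - f(x), \tag{4}$$
as a simple calculation shows. This implies that
$$\varphi(x) \le \varphi(y),$$
so that $\varphi(x)$ is an increasing function. Further, letting $y$ tend to $x$ in (4), we obtain
$$s(x + 0) - s(x) \le f(x + 0) - f(x). \tag{5}$$
On the other hand, it follows easily from the definition of the function $s(x)$ that
$$f(x + 0) - f(x) \le s(y) - s(x)$$
for $x < y$. Taking the limit as $y \to x$, we obtain
$$f(x + 0) - f(x) \le s(x + 0) - s(x).$$
From this inequality and (5), it is clear that
$$f(x + 0) - f(x) = s(x + 0) - s(x).$$
Hence,
$$\varphi(x + 0) = \varphi(x);$$
by a very similar argument, we can show that $\varphi(x - 0) = \varphi(x)$. It follows that $\varphi(x)$ is a continuous function, as we wished to prove.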
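The decomposition $f = \varphi + s$ can be made concrete on a small example. The sketch below is only an illustration under stated assumptions: the increasing function is a hypothetical one on $[0, 1]$, built from the continuous part $x$ and two jumps listed in `JUMPS`; the names `f`, `saltus` and `phi` are chosen here and do not come from the text. Exact rational arithmetic is used so that the identity $\varphi(x) = x$ can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical increasing function on [0, 1]: continuous part x plus jumps.
# Each entry is (point, left_saltus, right_saltus), meaning
# f(p) - f(p-0) = left_saltus and f(p+0) - f(p) = right_saltus.
JUMPS = [
    (Fraction(1, 4), Fraction(1, 2), Fraction(0)),
    (Fraction(1, 2), Fraction(1, 3), Fraction(2, 3)),
]


def f(x):
    """The increasing function itself."""
    total = x
    for p, left, right in JUMPS:
        if x >= p:
            total += left      # the left saltus is already included in f(p)
        if x > p:
            total += right     # the right saltus appears only beyond p
    return total


def saltus(x):
    """Saltus function s(x): full saltus at points left of x, plus the left saltus at x.

    Here f(a+0) = f(a) at a = 0, so the first bracket in the definition vanishes.
    """
    total = Fraction(0)
    for p, left, right in JUMPS:
        if p < x:
            total += left + right
        elif p == x:
            total += left
    return total


def phi(x):
    """The difference f - s, which Theorem 2 asserts is increasing and continuous."""
    return f(x) - saltus(x)
```

In this example `phi` reduces to the continuous part, so `phi(Fraction(1, 2))` equals `Fraction(1, 2)` even though `f` jumps by a total of 1 at that point.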
§ 2. MAPPING OF SETS. DIFFERENTIATION OF MONOTONE FUNCTIONS
Let a function $f(x)$ be defined on a certain abstract set $A$. Let $E$ be an arbitrary subset of $A$. We may consider the set of all points $y$ having the form $f(x)$, for $x \in E$, and call this set $f(E)$. Thus the function $f$ induces a mapping of the family of all subsets of $A$ into the family of all subsets of the image set $f(A)$ of $A$. One may say that $f(E)$, for $E \subset A$, consists exactly of those points $y$ for which the equation $f(x) = y$ has a solution in the set $E$. The set $f(E)$ is called the image of $E$, and $E$ is called the inverse image of the set $f(E)$.
THEOREM 1. If $E_1 \subset E_2$, then $f(E_1) \subset f(E_2)$. If $E = \sum_{n=1}^{\infty} E_n$, then $f(E) = \sum_{n=1}^{\infty} f(E_n)$.
This theorem is obvious.
The theory of mappings is particularly simple when the mapping function establishes a one-to-one correspondence between the sets $A$ and $f(A)$. Then there also exists an inverse function $x = g(y)$, defined on the set $f(A)$ and having values lying in the set $A$. We define $g(y)$ by the condition that $x = g(y)$ if and only if $y = f(x)$. It is easy to see that in this case,
$$f\Big(\prod_{n=1}^{\infty} E_n\Big) = \prod_{n=1}^{\infty} f(E_n).$$
In particular, if two sets $E_1$ and $E_2$ are disjoint, then their images $f(E_1)$ and $f(E_2)$ are also disjoint. As an example of such a well-behaved mapping, we may consider the mapping given by a continuous, strictly increasing function $f(x)$ defined on the closed interval $A = [a, b]$. In this case, $f(A) = [f(a), f(b)]$.
The notion of a mapping of sets is very useful in studying differentiation.
DEFINITION. The number $\lambda$ (finite or infinite) is said to be a derived number of the function $f(x)$ at the point $x_0$ if there exists a sequence $h_1, h_2, h_3, \ldots$ ($h_n \ne 0$), tending to zero, such that
$$\lim_{n \to \infty} \frac{f(x_0 + h_n) - f(x_0)}{h_n} = \lambda.$$
We express in symbols the assertion that $\lambda$ is a derived number of the function $f(x)$ at the point $x_0$ as follows:
$$\lambda = Df(x_0).$$
If the (finite or infinite) derivative $f'(x_0)$ exists at the point $x_0$, then it will be a derived number $Df(x_0)$, and in this case the function $f(x)$ has no other derived numbers at the point $x_0$.
As an example, let us consider the Dirichlet function $\psi(x)$, which is equal to zero for irrational values of $x$ and is equal to one for rational values of $x$. Let $x_0$ be a rational number. Then the ratio
$$\frac{\psi(x_0 + h) - \psi(x_0)}{h}$$
equals zero for rational $h$ and $-\frac{1}{h}$ when $h$ is irrational. From this it follows that the function $\psi(x)$ has three derived numbers at the point $x_0$: $-\infty$, $0$, and $+\infty$. It is easy to verify that the function $\psi(x)$ has no other derived numbers at the point $x_0$. The same is true when $x_0$ is irrational.
THEOREM 2. If the function $f(x)$ is defined on $[a, b]$, then derived numbers exist at every point $x \in [a, b]$.
Proof. Let $x_0 \in [a, b]$ and let $\{h_n\}$ ($h_n \ne 0$) be a sequence, tending to zero, such that $x_0 + h_n \in [a, b]$. Set
$$\sigma_n = \frac{f(x_0 + h_n) - f(x_0)}{h_n}.$$
If the sequence $\{\sigma_n\}$ is bounded, then, by the Bolzano-Weierstrass theorem, a subsequence can be extracted from it having some limit $\lambda$ which will be a derived number of the function $f(x)$ at the point $x_0$: $\lambda = Df(x_0)$. If the sequence $\{\sigma_n\}$ is unbounded (suppose, for example, that it is unbounded above), then a subsequence $\{\sigma_{n_k}\}$ tending to $+\infty$ can be extracted from it. In this case, $+\infty = Df(x_0)$.
THEOREM 3. A function $f(x)$ defined on $[a, b]$ has a derivative $f'(x_0)$ at the point $x_0 \in [a, b]$ if and only if all derived numbers of $f(x)$ at this point are equal.
The necessity of the condition, already noted above, is quite trivial. To prove its sufficiency, suppose that it holds, and let $\lambda$ be the common value of all derived numbers of $f(x)$ at the point $x_0$. The existence of the derivative $f'(x_0)$ will be proved once we show that for every sequence $\{h_n\}$ ($h_n \ne 0$) with limit zero, the relation
$$\lim_{n \to \infty} \frac{f(x_0 + h_n) - f(x_0)}{h_n} = \lambda$$
holds. Assume that this is not the case. Then there exists at least one sequence $\{h_n\}$ ($h_n \to 0$, $h_n \ne 0$) for which the ratios
$$\sigma_n = \frac{f(x_0 + h_n) - f(x_0)}{h_n}$$
do not have the limit $\lambda$. But this implies [we suppose that $-\infty < \lambda < +\infty$; if $\lambda = \pm\infty$, the reasoning only becomes simpler] the existence of an $\varepsilon > 0$ such that an infinite set of the numbers $\sigma_n$ lie outside the interval $(\lambda - \varepsilon, \lambda + \varepsilon)$. This infinite set contains a subsequence $\{\sigma_{n_k}\}$ which tends to a finite or infinite limit $\mu$. The number $\mu$ is a derived number of the function $f(x)$ at the point $x_0$ different from $\lambda$. The existence of such a derived number contradicts our hypothesis, and thus the theorem is proved.
LEMMA 1. If the function $f(x)$ is monotonically increasing on $[a, b]$, then all of its derived numbers are non-negative.
This lemma is obvious.
LEMMA 2. Let $f(x)$ be a strictly increasing function defined on $[a, b]$. If at every point of the set $E \subset [a, b]$ there exists at least one derived number $Df(x)$ such that
$$Df(x) \le p,$$
then
$$m^* f(E) \le p \cdot m^* E.$$
Proof. Take any $\varepsilon > 0$ and choose a bounded open set $G$ such that
$$E \subset G, \qquad mG < m^* E + \varepsilon.$$
Further, let $p_0$ be any number such that $p_0 > p$. If $x_0 \in E$, then there exists a sequence $\{h_n\}$ with limit zero such that
$$\lim_{n \to \infty} \frac{f(x_0 + h_n) - f(x_0)}{h_n} = Df(x_0) \le p.$$
For all sufficiently large $n$ the closed interval¹ $[x_0, x_0 + h_n]$ is contained entirely in the set $G$. In addition, for all $n$ sufficiently large,
$$\frac{f(x_0 + h_n) - f(x_0)}{h_n} < p_0.$$
We shall suppose that both of these situations hold for all $n$. We next introduce the closed intervals
$$d_n(x_0) = [x_0, x_0 + h_n], \qquad \Delta_n(x_0) = [f(x_0), f(x_0 + h_n)].$$
Since the function $f(x)$ is monotonic increasing, it is clear that
$$f[d_n(x_0)] \subset \Delta_n(x_0).$$
The lengths of these intervals are
$$m\, d_n(x_0) = |h_n|, \qquad m\, \Delta_n(x_0) = |f(x_0 + h_n) - f(x_0)|.$$
Therefore
$$m\, \Delta_n(x_0) < p_0 \cdot m\, d_n(x_0).$$
But $h_n \to 0$; this implies that among the intervals $\Delta_n(x_0)$ there is an arbitrarily small one. Since the image $f(E)$ of the set $E$ consists of the points $f(x_0)$ which lie in the intervals $\Delta_n(x_0)$, $f(E)$ is covered by all the intervals $\Delta_n(x)$ ($x \in E$) in the sense of Vitali.² (See §8, Ch. III.) By Vitali's theorem (Theorem 1, §8, Ch. III), it is possible to select from the family of these intervals a countable sequence of pairwise disjoint intervals $\{\Delta_{n_i}(x_i)\}$ ($i = 1, 2, 3, \ldots$) such that
$$m\Big[f(E) - \sum_{i=1}^{\infty} \Delta_{n_i}(x_i)\Big] = 0.$$
It is clear that
$$m^* f(E) \le \sum_{i=1}^{\infty} m\, \Delta_{n_i}(x_i) < p_0 \sum_{i=1}^{\infty} m\, d_{n_i}(x_i).$$
Now note that not only the intervals $\Delta_{n_i}(x_i)$ but also the intervals $d_{n_i}(x_i)$ are pairwise disjoint.³ Therefore
$$\sum_{i=1}^{\infty} m\, d_{n_i}(x_i) = m\Big[\sum_{i=1}^{\infty} d_{n_i}(x_i)\Big].$$
Since
$$\sum_{i=1}^{\infty} d_{n_i}(x_i) \subset G,$$
we have
$$m^* f(E) < p_0 \cdot mG < p_0\, (m^* E + \varepsilon).$$
If $p_0 \to p$ and $\varepsilon \to 0$, the last inequality becomes in the limit
$$m^* f(E) \le p \cdot m^* E,$$
which is the inequality we set out to establish.
¹ This is for the case $h_n > 0$. If $h_n < 0$, it is necessary to write $[x_0 + h_n, x_0]$. However, we may agree to denote the set of numbers lying between $\alpha$ and $\beta$ by $[\alpha, \beta]$ even if $\alpha > \beta$.
² We here use the fact that $f(x)$ is strictly increasing. Otherwise some of the intervals $\Delta_n(x)$ could degenerate to points, and it would then be impossible to apply Vitali's theorem.
³ In fact, if $z$ were in the intersection $d_{n_i}(x_i) \cdot d_{n_k}(x_k)$, then $f(z)$ would be in the intersection $\Delta_{n_i}(x_i) \cdot \Delta_{n_k}(x_k)$. This requires the hypothesis that $f$ is strictly increasing, and hence one-to-one.
The following lemma has a similar, although technically more complicated, proof.
LEMMA 3. Let $f(x)$ be a strictly increasing function defined on $[a, b]$. If at every point of the set $E \subset [a, b]$ there exists at least one derived number $Df(x)$ such that
$$Df(x) \ge q \quad (q \ge 0),$$
then
$$m^* f(E) \ge q \cdot m^* E.$$
Proof. The lemma is trivial if $q = 0$. We may suppose then that $q > 0$. Let $q_0$ be any positive number less than $q$ and let $\varepsilon$ be any positive number. Let $G$ be a bounded open set such that⁴
$$G \supset f(E), \qquad mG < m^* f(E) + \varepsilon.$$
Let $S$ be the set of points $x \in E$ at which $f(x)$ is continuous. The set $E - S$ is at most denumerable, because a monotonic function has at most a denumerable set of points of discontinuity (Theorem 1, §1). If $x_0 \in E$, we can find a sequence $\{h_n\}$ for which
$$h_n \to 0, \qquad \lim_{n \to \infty} \frac{f(x_0 + h_n) - f(x_0)}{h_n} = Df(x_0) \ge q.$$
We may suppose that for all $n$,
$$\frac{f(x_0 + h_n) - f(x_0)}{h_n} > q_0.$$
Therefore, introducing again the intervals
$$d_n(x_0) = [x_0, x_0 + h_n], \qquad \Delta_n(x_0) = [f(x_0), f(x_0 + h_n)],$$
we have
$$m\, \Delta_n(x_0) > q_0 \cdot m\, d_n(x_0).$$
If $x_0 \in S$, the whole interval $[f(x_0), f(x_0 + h_n)]$ will lie entirely in the set $G$ for sufficiently large $n$. We may suppose that this is the case for all $n$. The set $S$ is covered by the intervals $d_n(x)$ (where $x \in S$) in the sense of Vitali. Therefore there exists a denumerable sequence of pairwise disjoint intervals $\{d_{n_i}(x_i)\}$ such that
$$m\Big[S - \sum_{i=1}^{\infty} d_{n_i}(x_i)\Big] = 0.$$
But then
$$m^* S \le \sum_{i=1}^{\infty} m\, d_{n_i}(x_i) < \frac{1}{q_0} \sum_{i=1}^{\infty} m\, \Delta_{n_i}(x_i).$$
⁴ Note that the set $f(E)$ is bounded: it is in fact contained in the closed interval $[f(a), f(b)]$.
The segments $\Delta_{n_i}(x_i)$, as well as the $d_{n_i}(x_i)$, are pairwise disjoint [here we need the hypothesis that the function $f(x)$ is strictly increasing]. This implies that
$$\sum_{i=1}^{\infty} m\, \Delta_{n_i}(x_i) = m\Big[\sum_{i=1}^{\infty} \Delta_{n_i}(x_i)\Big] \le mG < m^* f(E) + \varepsilon.$$
Thus
$$m^* S < \frac{1}{q_0}\, [m^* f(E) + \varepsilon].$$
Taking the limit as $\varepsilon \to 0$ and $q_0 \to q$, we find that
$$m^* f(E) \ge q \cdot m^* S.$$
Since $m^* E \le m^* S + m^*(E - S) = m^* S$, the lemma follows.
COROLLARY. The set of points at which at least one derived number of an increasing function $f(x)$ is infinite is of measure zero.
First of all, suppose that the function is strictly increasing. If we had
$$m^* E(Df(x) = +\infty) > 0,$$
the image of this set would necessarily have infinite outer measure, which is an absurdity because this image lies on the segment $[f(a), f(b)]$. This proves the corollary for a strictly increasing function. If $f(x)$ is not strictly increasing, let
$$g(x) = f(x) + x.$$
Then $g(x)$ is a strictly increasing function. But
$$\frac{g(x + h) - g(x)}{h} = \frac{f(x + h) - f(x)}{h} + 1.$$
Hence the set of points where at least one $Df(x) = +\infty$ coincides with the same set for $g(x)$ and therefore has zero measure.
LEMMA 4. Let $f(x)$ be an increasing function defined on the closed interval $[a, b]$, and let $p$ and $q$ be two numbers such that $p < q$. If at every point $x$ of the set $E_{p,q} \subset [a, b]$ there exist two derived numbers $D_1 f(x)$ and $D_2 f(x)$ such that
$$D_1 f(x) < p < q < D_2 f(x),$$
then
$$m E_{p,q} = 0.$$
Suppose first that $f(x)$ is strictly increasing. Then both Lemmas 2 and 3 are applicable, and, accordingly, we may write
$$m^* f(E_{p,q}) \le p \cdot m^* E_{p,q}, \qquad m^* f(E_{p,q}) \ge q \cdot m^* E_{p,q}.$$
It follows that
$$q \cdot m^* E_{p,q} \le p \cdot m^* E_{p,q},$$
and therefore $m^* E_{p,q} = 0$. If $f(x)$ is not strictly increasing, then, as above, we set $g(x) = f(x) + x$ and apply the part of the lemma already proved to $g(x)$, substituting $p + 1$ and $q + 1$ for $p$ and $q$.
We can now establish the main theorem of this section.
THEOREM 4. Every increasing⁵ function $f(x)$ defined on a closed interval $[a, b]$ has a finite derivative $f'(x)$ at almost all points $x \in [a, b]$.
⁵ Note that we do not suppose $f(x)$ continuous.
Proof. Let $E$ denote the set of those points of $[a, b]$ at which the derivative $f'(x)$ does not exist. If $x_0 \in E$, there are two distinct derived numbers $D_1 f(x_0)$ and $D_2 f(x_0)$; suppose that $D_1 f(x_0) < D_2 f(x_0)$. There obviously exist rational numbers $p$ and $q$ such that
$$D_1 f(x_0) < p < q < D_2 f(x_0).$$
It follows that
$$E = \sum_{(p,\, q)} E_{p,q},$$
where $E_{p,q}$ is the set of those $x$ in $[a, b]$ at which two derived numbers $D_1 f(x)$ and $D_2 f(x)$ exist satisfying the inequalities
$$D_1 f(x) < p < q < D_2 f(x),$$
and the summation is extended over all pairs $(p, q)$ of rational numbers for which $p < q$. According to Lemma 4, every set $E_{p,q}$ has measure zero. Since there are clearly only a denumerable number of sets $E_{p,q}$, it follows that $E$ is the sum of a denumerable family of sets of measure 0 and hence itself has measure 0. We have thus shown that the derivative $f'(x)$ exists almost everywhere on $[a, b]$. Since $f'(x) = +\infty$ is possible only on a set of measure 0 (see the Corollary to Lemma 3), the theorem is proved.
Henceforth, in referring to the derivative $f'(x)$ of an increasing function, we shall suppose it defined for all $x$ in $[a, b]$. To accomplish this, we agree once and for all to define $f'(x) = 0$ at those points $x$ where $f(x)$ has no derivative.
THEOREM 5. If $f(x)$ is an increasing function defined on $[a, b]$, then its derivative $f'(x)$ is measurable and
$$\int_a^b f'(x)\, dx \le f(b) - f(a),$$
so that $f'(x)$ is summable.
Proof. Extend the definition of the function $f(x)$, setting
$$f(x) = f(b) \quad \text{if} \quad b < x \le b + 1.$$
Then at every point $x$ such that $f'(x)$ is the derivative of $f(x)$ [except, perhaps, the point $x = b$, where $f'(b)$ was previously only the left-side derivative], we have
$$f'(x) = \lim_{n \to \infty} n \left[ f\left(x + \frac{1}{n}\right) - f(x) \right].$$
This implies that $f'(x)$ is the limit of an almost everywhere convergent sequence of measurable⁶ functions. Hence $f'(x)$ is a measurable function. Since $f'(x)$ is non-negative, we can speak of its Lebesgue integral
$$\int_a^b f'(x)\, dx.$$
By Fatou's Theorem [Chapter VI, §1],
$$\int_a^b f'(x)\, dx \le \sup \left\{ n \int_a^b \left[ f\left(x + \frac{1}{n}\right) - f(x) \right] dx \right\}.$$
⁶ The functions $f(x)$ and $f\left(x + \frac{1}{n}\right)$ are increasing and are therefore measurable. In fact, $E(f > c)$ is either the void set or an interval.
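But
$$\int_a^b f\left(x + \frac{1}{n}\right) dx = \int_{a + \frac{1}{n}}^{b + \frac{1}{n}} f(x)\, dx$$
(there is no need to cite the theorem on the change of variable in a Lebesgue integral because $f(x)$ is monotonic, and the integral can be taken in the Riemann sense). Consequently, we have
$$\int_a^b \left[ f\left(x + \frac{1}{n}\right) - f(x) \right] dx = \int_b^{b + \frac{1}{n}} f(x)\, dx - \int_a^{a + \frac{1}{n}} f(x)\, dx = \frac{1}{n} f(b) - \int_a^{a + \frac{1}{n}} f(x)\, dx \le \frac{1}{n}\, [f(b) - f(a)].$$
From this, the inequality
$$\int_a^b f'(x)\, dx \le f(b) - f(a)$$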
follows readily.
We are accustomed to think that the integral of the derivative of a function is equal to the difference of the values of the function itself, computed at the ends of the interval of integration:
$$\int_a^b f'(x)\, dx = f(b) - f(a).$$
From this point of view, the inequality obtained in Theorem 5 seems somewhat less than perfect. However, it is impossible always to obtain equality in the inequality of Theorem 5, even if the function $f(x)$ is continuous.
EXAMPLE. Let $P_0$ be the Cantor perfect set. Its complementary intervals can be divided into groups, putting into the first group the interval $\left(\frac{1}{3}, \frac{2}{3}\right)$, into the second the two intervals $\left(\frac{1}{9}, \frac{2}{9}\right)$, $\left(\frac{7}{9}, \frac{8}{9}\right)$, into the third the four intervals $\left(\frac{1}{27}, \frac{2}{27}\right)$, $\left(\frac{7}{27}, \frac{8}{27}\right)$, $\left(\frac{19}{27}, \frac{20}{27}\right)$, $\left(\frac{25}{27}, \frac{26}{27}\right)$, and so on. There will be $2^{n-1}$ intervals in the $n$-th group. We define a function $\theta(x)$ by the following description:
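$$\theta(x) = \frac{1}{2} \quad \text{if} \quad x \in \left(\frac{1}{3}, \frac{2}{3}\right),$$
$$\theta(x) = \frac{1}{4} \quad \text{if} \quad x \in \left(\frac{1}{9}, \frac{2}{9}\right), \qquad \theta(x) = \frac{3}{4} \quad \text{if} \quad x \in \left(\frac{7}{9}, \frac{8}{9}\right).$$
In the four intervals of the third group, the function $\theta(x)$ equals $\frac{1}{8}$, $\frac{3}{8}$, $\frac{5}{8}$, $\frac{7}{8}$ respectively. And in general, on the $2^{n-1}$ intervals of the $n$-th group, we set the function $\theta$ equal to
$$\frac{1}{2^n},\ \frac{3}{2^n},\ \frac{5}{2^n},\ \ldots,\ \frac{2^n - 1}{2^n}$$
respectively.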
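The grouping of the complementary intervals and the values assigned on them can be generated mechanically, which makes it easy to inspect the first few groups. The sketch below is only an illustration: the function names `cantor_groups` and `is_increasing` are chosen here and are not notation from the text, and exact fractions are used so that endpoints and values can be compared without rounding.

```python
from fractions import Fraction


def cantor_groups(n_groups):
    """For groups 1..n_groups, list the removed open intervals with the value of theta on each."""
    closed = [(Fraction(0), Fraction(1))]  # closed intervals still remaining, left to right
    groups = []
    for n in range(1, n_groups + 1):
        removed, next_closed = [], []
        for a, b in closed:
            third = (b - a) / 3
            removed.append((a + third, b - third))   # open middle third
            next_closed.append((a, a + third))
            next_closed.append((b - third, b))
        # Values 1/2^n, 3/2^n, ..., (2^n - 1)/2^n, assigned from left to right.
        values = [Fraction(2 * j - 1, 2 ** n) for j in range(1, len(removed) + 1)]
        groups.append(list(zip(removed, values)))
        closed = next_closed
    return groups


def is_increasing(groups):
    """Check that theta increases from left to right over all listed intervals."""
    flat = sorted(item for group in groups for item in group)
    values = [value for _, value in flat]
    return all(u < v for u, v in zip(values, values[1:]))
```

For three groups the listed intervals have total length $\frac{1}{3} + \frac{2}{9} + \frac{4}{27} = \frac{19}{27} = 1 - \left(\frac{2}{3}\right)^3$, in agreement with the fact that the Cantor open set has measure 1.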
The function $\theta(x)$ is defined on the Cantor open set $G_0$. It is constant on every component interval of this set and is an increasing function on the set $G_0$ as a whole.⁷ We extend the definition of the function $\theta(x)$ by defining it at the points of the set $P_0$. To do this, we set
$$\theta(0) = 0, \qquad \theta(1) = 1.$$
At those points $x_0 \in P_0$ which lie between 0 and 1, we set
$$\theta(x_0) = \sup\{\theta(x)\} \quad (x \in G_0,\ x < x_0).$$
It is easy to see that this definition does not alter the monotonicity of the function $\theta(x)$, which is now defined everywhere on the closed interval $[0, 1]$.
Having established the monotonicity of $\theta(x)$, we can easily prove that it is continuous. This follows from the fact that the set of values taken on by the function $\theta(x)$ on the set $G_0$ is everywhere dense in $[0, 1]$. [In fact, if an increasing function $f(x)$ has a point of discontinuity $x_0$, then at least one of the intervals $(f(x_0 - 0), f(x_0))$ and $(f(x_0), f(x_0 + 0))$ contains no points $f(x)$.]
Thus, $\theta(x)$ is a continuous increasing function. Besides, almost everywhere on $[0, 1]$, we have
$$\theta'(x) = 0.$$
(This relation obviously holds at every point of the set $G_0$.) Therefore
$$\int_0^1 \theta'(x)\, dx = 0 < 1 = \theta(1) - \theta(0).$$
Further on, we shall describe the conditions under which there is equality in (3).
In conclusion, we prove a theorem which is useful in many situations.
THEOREM 6. For every set $E$ of measure zero on the closed interval $[a, b]$, there exists a continuous increasing function $\sigma(x)$ such that
$$\sigma'(x) = +\infty$$
at all points $x \in E$.
Proof. For every natural number $n$, let $G_n$ be a bounded open set such that
$$G_n \supset E, \qquad mG_n < \frac{1}{2^n}.$$
Let
$$\psi_n(x) = m\{G_n \cdot [a, x]\}.$$
The function $\psi_n(x)$ is increasing, non-negative, continuous, and satisfies the inequality
$$\psi_n(x) < \frac{1}{2^n}.$$
Therefore, the function
$$\sigma(x) = \sum_{n=1}^{\infty} \psi_n(x)$$
is also increasing, non-negative, and continuous. If $x_0 \in E$ and $|h|$ is sufficiently small, the whole interval $[x_0, x_0 + h]$ lies in the set $G_n$ [for fixed $n$]. For such $h$ (assuming $h > 0$ for simplicity), we have
$$\psi_n(x_0 + h) = m\,(G_n \cdot [a, x_0] + G_n \cdot (x_0, x_0 + h]) = \psi_n(x_0) + h.$$
Thus
$$\frac{\psi_n(x_0 + h) - \psi_n(x_0)}{h} = 1.$$
It follows that for every natural number $N$, if $|h|$ is sufficiently small, we have
$$\frac{\sigma(x_0 + h) - \sigma(x_0)}{h} \ge \sum_{n=1}^{N} \frac{\psi_n(x_0 + h) - \psi_n(x_0)}{h} = N,$$
so that $\sigma'(x_0) = +\infty$. This proves the theorem.
⁷ A simple inductive proof can be given for this fact. It is suggested that the reader work out a detailed proof.
§ 3. FUNCTIONS OF FINITE VARIATION
In this section, we discuss the theory of an important class of functions, the functions of finite variation, which are intimately connected with monotonic functions.
Let a function $f(x)$ be defined and finite on the interval $[a, b]$. Subdivide $[a, b]$ into parts by means of the points
$$x_0 = a < x_1 < \ldots < x_n = b$$
and form the sum
$$V = \sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)|.$$
DEFINITION 1. The least upper bound of the set of all possible sums $V$ is called the total variation of the function $f(x)$ on $[a, b]$ and is designated by $V_a^b(f)$. If
$$V_a^b(f) < +\infty,$$
then $f(x)$ is said to be a function of finite variation on $[a, b]$. We also say that $f(x)$ has finite variation on $[a, b]$.
THEOREM 1. A monotonic function on $[a, b]$ has finite variation on $[a, b]$.
Proof. It is sufficient to prove the theorem for an increasing function. If $f(x)$ is increasing on $[a, b]$, then all the differences $f(x_{k+1}) - f(x_k)$ are non-negative and
$$V = \sum_{k=0}^{n-1} [f(x_{k+1}) - f(x_k)] = f(b) - f(a).$$
This proves the theorem.
Further examples of functions of finite variation are furnished by functions satisfying a Lipschitz condition.
DEFINITION 2. A finite function $f(x)$ defined on $[a, b]$ is said to satisfy a Lipschitz condition if there exists a constant $K$ such that for any two points $x$ and $y$ in $[a, b]$,
$$|f(x) - f(y)| \le K\, |x - y|.$$
If the function $f(x)$ has a derivative $f'(x)$ at every point of $[a, b]$ and if $f'(x)$ is bounded, then, as is clear from the mean value theorem,
$$f(x) - f(y) = f'(z)\,(x - y) \quad (x < z < y),$$
$f(x)$ satisfies a Lipschitz condition. If $f(x)$ satisfies a Lipschitz condition, then
$$|f(x_{k+1}) - f(x_k)| \le K\,(x_{k+1} - x_k),$$
whence
$$V \le K\,(b - a)$$
and $f(x)$ is a function of finite variation.
An example of a continuous function with infinite total variation is the function
$$f(x) = x \cos \frac{\pi}{2x} \quad (0 < x \le 1), \qquad f(0) = 0.$$
If we take the points
$$0 < \frac{1}{2n} < \frac{1}{2n - 1} < \ldots < \frac{1}{3} < \frac{1}{2} < 1$$
for points of division of $[0, 1]$, then it is easy to verify that
$$V = 1 + \frac{1}{2} + \ldots + \frac{1}{n}.$$
Therefore
$$V_0^1(f) = +\infty.$$
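The value of the sum $V$ for this subdivision can be confirmed numerically. The following is a minimal sketch in floating-point arithmetic; the helper names `variation_sum`, `partition` and `harmonic` are illustrative choices and not notation from the text, and agreement is only expected up to rounding error.

```python
import math


def f(x):
    """The function x * cos(pi / (2x)) on (0, 1], with f(0) = 0."""
    return 0.0 if x == 0 else x * math.cos(math.pi / (2 * x))


def variation_sum(func, points):
    """The sum V = sum |func(x_{k+1}) - func(x_k)| over consecutive division points."""
    return sum(abs(func(b) - func(a)) for a, b in zip(points, points[1:]))


def partition(n):
    """Division points 0 < 1/(2n) < 1/(2n-1) < ... < 1/2 < 1, in increasing order."""
    return [0.0] + [1.0 / k for k in range(2 * n, 0, -1)]


def harmonic(n):
    """Partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(1.0 / k for k in range(1, n + 1))


for n in (1, 5, 50):
    # The two printed numbers should agree up to floating-point rounding.
    print(n, variation_sum(f, partition(n)), harmonic(n))
```

Since the partial sums of the harmonic series are unbounded, the sums $V$ grow without limit as $n$ increases, which is exactly the statement that the total variation is infinite.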
THEOREM 2. Every function offinite variation in [a, b] is bounded in [a, b}. x < b, In fact, for a
<
b
V=if(x)-f(a)l+lf(b)-f(.x)!~V(f). a
Therefore b
1/(.x) I~ lf(a) I+ V (f). a
THEOREM 3. The sum, difference and product of two functions of finite variation are functions of finite variation.
Proof. Let $f(x)$ and $g(x)$ be functions of finite variation on the closed interval $[a, b]$, and let $s(x)$ be their sum. Then
$$|s(x_{k+1}) - s(x_k)| \le |f(x_{k+1}) - f(x_k)| + |g(x_{k+1}) - g(x_k)|,$$
and it follows that
$$V_a^b(s) \le V_a^b(f) + V_a^b(g).$$
3. FUNCTIONS OF FINITE VARIATION
Therefore $s(x)$ is a function of finite variation. The proof that $f(x) - g(x)$ is of finite variation is very similar to the proof just given, and need not be written out. Next, suppose that $f(x)$ and $g(x)$ have finite variation and that $p(x) = f(x)g(x)$. Let
$$A = \sup\{|f(x)|\}, \qquad B = \sup\{|g(x)|\},$$
both suprema being taken over the closed interval $[a, b]$. Then
$$|p(x_{k+1}) - p(x_k)| \le |f(x_{k+1})g(x_{k+1}) - f(x_k)g(x_{k+1})| + |f(x_k)g(x_{k+1}) - f(x_k)g(x_k)| \le B\,|f(x_{k+1}) - f(x_k)| + A\,|g(x_{k+1}) - g(x_k)|.$$
This implies that
$$V_a^b(p) \le B\,V_a^b(f) + A\,V_a^b(g),$$
and accordingly $p(x)$ has finite variation in $[a, b]$.
THEOREM 4. If $f(x)$ and $g(x)$ are functions of finite variation and if $g(x) \ge \sigma > 0$, then the quotient $f(x)/g(x)$ is a function of finite variation.
The proof is left to the reader.
THEOREM 5. Let a finite function $f(x)$ be defined on $[a, b]$ and let $a < c < b$. Then
$$V_a^b(f) = V_a^c(f) + V_c^b(f). \tag{1}$$
Proof. Subdivide each of the intervals $[a, c]$ and $[c, b]$ into subintervals by means of the points
$$y_0 = a < y_1 < \dots < y_m = c, \qquad z_0 = c < z_1 < \dots < z_n = b,$$
and form the sums
$$V_1 = \sum_{k=0}^{m-1} |f(y_{k+1}) - f(y_k)|, \qquad V_2 = \sum_{k=0}^{n-1} |f(z_{k+1}) - f(z_k)|.$$
The points $\{y_k\}$ and $\{z_k\}$ subdivide the whole interval $[a, b]$. If $V$ is the sum corresponding to this method of subdivision,
$$V = V_1 + V_2.$$
It follows at once that
$$V_1 + V_2 \le V_a^b(f)$$
and that
$$V_a^c(f) + V_c^b(f) \le V_a^b(f). \tag{2}$$
Now subdivide the interval $[a, b]$ by means of the points
$$x_0 = a < x_1 < \dots < x_n = b,$$
being careful to include the point $c$ as a point of division. Writing $c = x_m$, we can express the sum $V$ which corresponds to our method of subdivision in the form
$$V = \sum_{k=0}^{m-1} |f(x_{k+1}) - f(x_k)| + \sum_{k=m}^{n-1} |f(x_{k+1}) - f(x_k)|.$$
More briefly,
$$V = V_1 + V_2,$$
where $V_1$ and $V_2$ are sums corresponding to the intervals $[a, c]$ and $[c, b]$. Consequently
$$V \le V_a^c(f) + V_c^b(f). \tag{3}$$
Inequality (3) has been established only for sums $V$ corresponding to methods of subdivision for which the point $c$ is a point of subdivision. Since the addition of new subdivision points does not decrease the sums $V$, (3) is true for all sums $V$. From this it is clear that
$$V_a^b(f) \le V_a^c(f) + V_c^b(f). \tag{4}$$
Combining (2) and (4), we obtain (1).
COROLLARY 1. Let $a < c < b$. If $f(x)$ has finite variation on $[a, b]$, then it has finite variation on each of the intervals $[a, c]$ and $[c, b]$, and conversely.
COROLLARY 2. If it is possible to subdivide the segment $[a, b]$ into a finite number of parts on each of which the function $f(x)$ is monotonic, then $f(x)$ has finite variation on $[a, b]$.
THEOREM 6. A function $f(x)$ defined and finite on $[a, b]$ is a function of finite variation if and only if it is representable as the difference of two increasing functions.
Proof. The sufficiency of the condition follows from Theorems 1 and 3. To prove its necessity, we set
$$\pi(x) = V_a^x(f) \quad (a < x \le b), \qquad \pi(a) = 0.$$
By virtue of Theorem 5, the function $\pi(x)$ is an increasing function. Setting
$$\nu(x) = \pi(x) - f(x), \tag{5}$$
we obtain another increasing function $\nu(x)$. In fact, if $a \le x < y \le b$, then, by Theorem 5,
$$\nu(y) = \pi(y) - f(y) = \pi(x) + V_x^y(f) - f(y),$$
and hence
$$\nu(y) - \nu(x) = V_x^y(f) - [f(y) - f(x)].$$
However, from the very definition of total variation, it is clear that
$$f(y) - f(x) \le V_x^y(f),$$
so that
$$\nu(y) - \nu(x) \ge 0.$$
Therefore $\nu(x)$ is an increasing function. It remains to write the equality (5) in the form
$$f(x) = \pi(x) - \nu(x),$$
in order to obtain the desired representation of $f(x)$.
COROLLARY 1. If a function $f(x)$ has finite variation on $[a, b]$, then at almost every point of $[a, b]$ the derivative $f'(x)$ exists, and is finite. Furthermore, $f'(x)$ is a summable function on $[a, b]$.
COROLLARY 2. The set of points of discontinuity of a function of finite variation is at most denumerable. At every point $x_0$ of discontinuity, both limits
$$f(x_0 + 0) = \lim_{x \to x_0,\ x > x_0} f(x), \qquad f(x_0 - 0) = \lim_{x \to x_0,\ x < x_0} f(x)$$
exist.
We now take up the problem of writing a function of finite variation as the sum of a continuous function and a saltus function. Let
$$x_1, x_2, x_3, \dots \qquad (a < x_n < b) \tag{6}$$
be the sequence of all points which are points of discontinuity of at least one of the functions $\pi(x)$ and $\nu(x)$. Consider the following saltus functions:
$$s_\pi(x) = [\pi(a+0) - \pi(a)] + \sum_{x_k < x} [\pi(x_k+0) - \pi(x_k-0)] + [\pi(x) - \pi(x-0)] \qquad (a < x \le b),$$
$$s_\nu(x) = [\nu(a+0) - \nu(a)] + \sum_{x_k < x} [\nu(x_k+0) - \nu(x_k-0)] + [\nu(x) - \nu(x-0)] \qquad (a < x \le b);$$
we further agree that $s_\pi(a) = s_\nu(a) = 0$.
(If a point $x_k$ is a point of continuity of one of the functions $\pi(x)$ and $\nu(x)$, the corresponding saltus vanishes automatically. Moreover, we can show that a point of discontinuity of the function $\nu(x)$ cannot be a point of continuity of the function $\pi(x)$; but this is only of slight interest.) Let $s(x) = s_\pi(x) - s_\nu(x)$.
This function $s(x)$ can be written in the following form:
$$s(x) = [f(a+0) - f(a)] + \sum_{x_k < x} [f(x_k+0) - f(x_k-0)] + [f(x) - f(x-0)] \qquad (a < x \le b), \qquad s(a) = 0.$$
It is clear that $s(x)$ is a function of finite variation. It is called the saltus function of the function $f(x)$. It is self-evident that the definition of the function $s(x)$ is not changed if we remove from the sequence (6) those points at which the function $f(x)$ is continuous,$^8$ so that we may suppose that the $x_k$ appearing in (6) are all points of discontinuity of $f(x)$.
8 It is possible to show directly that there are no such points in (6). This will be shown in Theorem 1, §5.
It has already been proved (Theorem 2, §1) that the functions
$$\pi(x) - s_\pi(x), \qquad \nu(x) - s_\nu(x)$$
are continuous and increasing. From this it follows that the difference $\varphi(x) = f(x) - s(x)$ is a continuous function of finite variation. The result just obtained can be stated in the following form.
THEOREM 7. Every function of finite variation can be written as the sum of its saltus function and a continuous function of finite variation.
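The decomposition $f = \pi - \nu$ of Theorem 6 can be illustrated on sampled data. The sketch below is a discrete analogue only: it treats a finite list of values as the function (equivalently, its piecewise linear interpolant), and the names `jordan_decomposition` and `is_nondecreasing` are chosen here for illustration.

```python
def jordan_decomposition(values):
    """Given samples f(x_0), ..., f(x_n), return (pi, nu) where
    pi[k] is the variation sum over the first k steps and nu = pi - f."""
    pi = [0.0]
    for prev, cur in zip(values, values[1:]):
        pi.append(pi[-1] + abs(cur - prev))
    nu = [p - v for p, v in zip(pi, values)]
    return pi, nu

def is_nondecreasing(seq):
    """True if every term is at least as large as the one before it."""
    return all(b >= a for a, b in zip(seq, seq[1:]))

if __name__ == "__main__":
    samples = [0.0, 2.0, 1.0, 3.0, -1.0, 0.5]
    pi, nu = jordan_decomposition(samples)
    print(pi)  # running variation sums
    print(nu)  # pi - f, also nondecreasing
    print(is_nondecreasing(pi), is_nondecreasing(nu))
```

Each step of `nu` equals $|d| - d$ for the corresponding increment $d$ of the samples, which is the discrete form of the inequality $f(y) - f(x) \le V_x^y(f)$ used in the proof.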
§ 4. HELLY'S PRINCIPLE OF CHOICE
In this section, we take up a theorem due to E. Helly which has many important applications. We first prove two lemmas.
LEMMA 1. Let an infinite family of functions $H = \{f(x)\}$ be defined on $[a, b]$. If all the functions of the family are bounded by one and the same number,
$$|f(x)| \le K, \tag{1}$$
then, for any denumerable subset $E$ of $[a, b]$, it is possible to find a sequence $\{f_n(x)\}$ in the family $H$ which converges at every point of the set $E$.
Proof. Let $E = \{x_k\}$. Consider the set $\{f(x_1)\}$ of values taken on by the functions of the family $H$ at the point $x_1$. By (1), this set is bounded and, by the Bolzano-Weierstrass Theorem, we can select a convergent sequence from it:
$$f_1^{(1)}(x_1),\ f_2^{(1)}(x_1),\ f_3^{(1)}(x_1),\ \dots; \qquad \lim_{n \to \infty} f_n^{(1)}(x_1) = A_1. \tag{2}$$
Now consider the sequence
$$f_1^{(1)}(x_2),\ f_2^{(1)}(x_2),\ f_3^{(1)}(x_2),\ \dots$$
of values taken on by the functions of the set $\{f_n^{(1)}(x)\}$ at the point $x_2$. This sequence is also bounded, and we can apply the Bolzano-Weierstrass Theorem to it. This gives a convergent subsequence
$$f_1^{(2)}(x_2),\ f_2^{(2)}(x_2),\ f_3^{(2)}(x_2),\ \dots; \qquad \lim_{n \to \infty} f_n^{(2)}(x_2) = A_2, \tag{3}$$
selected from $\{f_n^{(1)}(x_2)\}$. Continuing this process indefinitely, we construct a denumerable set of convergent sequences
$$\begin{array}{llll}
f_1^{(1)}(x_1), & f_2^{(1)}(x_1), & f_3^{(1)}(x_1),\ \dots; & \lim\limits_{n \to \infty} f_n^{(1)}(x_1) = A_1,\\
f_1^{(2)}(x_2), & f_2^{(2)}(x_2), & f_3^{(2)}(x_2),\ \dots; & \lim\limits_{n \to \infty} f_n^{(2)}(x_2) = A_2,\\
\dots & & & \\
f_1^{(k)}(x_k), & f_2^{(k)}(x_k), & f_3^{(k)}(x_k),\ \dots; & \lim\limits_{n \to \infty} f_n^{(k)}(x_k) = A_k,\\
\dots & & &
\end{array}$$
where each sequence of functions is a subsequence of the preceding one, and in which the order of elements has not been altered. We now form the sequence of diagonal elements of the infinite matrix just constructed, i.e., the sequence
$$\{f_n^{(n)}(x)\} \qquad (n = 1, 2, 3, \dots).$$
This sequence converges at every point of the set $E$. In fact, for every fixed $k$, the sequence
$$\{f_n^{(n)}(x_k)\} \qquad (n \ge k)$$
is a subsequence of $\{f_n^{(k)}(x_k)\}$ and converges to $A_k$.
LEMMA 2. Let $F = \{f(x)\}$ be an infinite family of increasing functions, defined on the segment $[a, b]$.
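If all functions of the family are bounded by one and the same number,
$$|f(x)| \le K, \qquad f \in F,\ a \le x \le b,$$
then there exists a sequence $\{f_n(x)\}$ of functions of $F$ which converges at every point of $[a, b]$ to an increasing function $\varphi(x)$.
Proof. Applying Lemma 1 to the family $F$, with $E$ taken to be the set of all rational points of $[a, b]$ together with the point $a$ (if $a$ is irrational), we obtain a sequence of functions of the family $F$,
$$F_0 = \{f^{(n)}(x)\},$$
such that
$$\lim_{n \to \infty} f^{(n)}(x_k) \tag{4}$$
exists and is finite at every point $x_k \in E$. We now define a function $\psi(x)$ by the following procedure. First, we define
$$\psi(x_k) = \lim_{n \to \infty} f^{(n)}(x_k) \qquad (x_k \in E)$$
for all $x_k \in E$. This defines $\psi(x)$ only on $E$, of course. It is easy to see that $\psi(x)$ is an increasing function on $E$, that is, if $x_i, x_k \in E$ and $x_k < x_i$, then $\psi(x_k) \le \psi(x_i)$. For $x \in [a, b] - E$, we set
$$\psi(x) = \sup_{x_k < x} \{\psi(x_k)\} \qquad (x_k \in E).$$
It is obvious that $\psi(x)$ is an increasing function on the closed interval $[a, b]$ and that the set of points $Q$ where $\psi(x)$ is discontinuous is at most denumerable.
We show next that
$$\lim_{n \to \infty} f^{(n)}(x_0) = \psi(x_0) \tag{5}$$
at every point $x_0$ where $\psi(x)$ is continuous. Let $\varepsilon$ be any positive number, and let $x_k$ and $x_i$ be points of $E$ such that
$$x_k < x_0 < x_i, \qquad \psi(x_i) - \psi(x_k) < \frac{\varepsilon}{2}.$$
Fixing the points $x_k$ and $x_i$, select a natural number $n_0$ such that for $n > n_0$,
$$|f^{(n)}(x_k) - \psi(x_k)| < \frac{\varepsilon}{2}, \qquad |f^{(n)}(x_i) - \psi(x_i)| < \frac{\varepsilon}{2}.$$
It is easy to see that
$$\psi(x_0) - \varepsilon < f^{(n)}(x_k) \le f^{(n)}(x_i) < \psi(x_0) + \varepsilon$$
for $n > n_0$. Since
$$f^{(n)}(x_k) \le f^{(n)}(x_0) \le f^{(n)}(x_i),$$
we have
$$\psi(x_0) - \varepsilon < f^{(n)}(x_0) < \psi(x_0) + \varepsilon$$
for $n > n_0$. This proves (5). Thus, the equality
$$\lim_{n \to \infty} f^{(n)}(x) = \psi(x) \tag{6}$$
can fail only on the finite or denumerable set $Q$, where $\psi(x)$ is discontinuous. We now apply Lemma 1 to the sequence $F_0$, taking for the set $E$ the set of those points of $Q$ where (6) is not fulfilled. This yields a subsequence $\{f_n(x)\}$ of $F_0$, which converges at all points of $[a, b]$ (because at points where the sequence $\{f^{(n)}(x)\}$ converges, every subsequence of it converges as well). Setting
$$\varphi(x) = \lim_{n \to \infty} f_n(x),$$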
we obtain a function which is obviously an increasing function.
THEOREM (HELLY'S FIRST THEOREM). Let an infinite family of functions $F = \{f(x)\}$ be defined on the segment $[a, b]$. If all functions of the family and the total variation of all functions of the family are bounded by a single number,
$$|f(x)| \le K, \qquad V_a^b(f) \le K,$$
then there exists a sequence $\{f_n(x)\}$ in the family $F$ which converges at every point of $[a, b]$ to some function $\varphi(x)$ of finite variation.
Proof. For every function $f(x)$ of the family $F$, set
$$\pi(x) = V_a^x(f), \qquad \nu(x) = \pi(x) - f(x).$$
Both $\pi(x)$ and $\nu(x)$ are increasing functions. Furthermore,
$$|\pi(x)| \le K, \qquad |\nu(x)| \le 2K.$$
Applying Lemma 2 to the family $\{\pi(x)\}$, we find that there is a convergent sequence $\{\pi_k(x)\}$,
$$\lim_{k \to \infty} \pi_k(x) = \alpha(x),$$
in this family. To every function $\pi_k(x)$, there corresponds a function $\nu_k(x)$, complementing it to the function $f_k(x) = \pi_k(x) - \nu_k(x)$ of the family $F$. Applying Lemma 2 to the family $\{\nu_k(x)\}$, we find a convergent subsequence $\{\nu_{k_i}(x)\}$,
$$\lim_{i \to \infty} \nu_{k_i}(x) = \beta(x),$$
of $\{\nu_k(x)\}$. Then the sequence of functions
$$f_{k_i}(x) = \pi_{k_i}(x) - \nu_{k_i}(x),$$
belonging to $F$, converges to the function
$$\varphi(x) = \alpha(x) - \beta(x),$$
which, being the difference of two increasing functions, has finite variation. This proves Helly's theorem.
§ 5. CONTINUOUS FUNCTIONS OF FINITE VARIATION
THEOREM 1. Let a function $f(x)$ of finite variation be defined on the closed interval $[a, b]$. If $f(x)$ is continuous at the point $x_0$, then the function
$$\pi(x) = V_a^x(f)$$
is also continuous at $x_0$.
Proof. Suppose that $x_0 < b$. We shall show that $\pi(x)$ is continuous on the right at the point $x_0$. For this purpose, taking an arbitrary $\varepsilon > 0$, we subdivide the segment $[x_0, b]$ by means of the points $x_0 < x_1 < \dots < x_n = b$ so that
$$V = \sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)| > V_{x_0}^b(f) - \varepsilon. \tag{1}$$
Since the sum $V$ does not decrease when new points are added, we may suppose that
$$|f(x_1) - f(x_0)| < \varepsilon.$$
It follows from (1) that
$$V_{x_0}^b(f) < \varepsilon + \sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)| < 2\varepsilon + \sum_{k=1}^{n-1} |f(x_{k+1}) - f(x_k)| \le 2\varepsilon + V_{x_1}^b(f).$$
Hence
$$V_{x_0}^{x_1}(f) < 2\varepsilon,$$
and consequently
$$\pi(x_1) - \pi(x_0) < 2\varepsilon.$$
This implies that
$$\pi(x_0 + 0) - \pi(x_0) < 2\varepsilon.$$
Since $\varepsilon$ is arbitrary, we have
$$\pi(x_0 + 0) = \pi(x_0).$$
It can be shown in like manner that $\pi(x_0 - 0) = \pi(x_0)$, i.e., that $\pi(x)$ is continuous on the left (if $x_0 > a$) at the point $x_0$.
COROLLARY. A continuous function of finite variation can be written as the difference of two continuous increasing functions.
In fact, if $f(x)$ is a continuous function of finite variation defined on $[a, b]$, then both of its increasing components
$$\pi(x) = V_a^x(f) \qquad \text{and} \qquad \nu(x) = \pi(x) - f(x)$$
are continuous.
Let a continuous function $f(x)$ be defined on $[a, b]$. Subdivide $[a, b]$ by means of the points
$$x_0 = a < x_1 < x_2 < \dots < x_n = b$$
and form the sums
$$V = \sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)|, \qquad \Omega = \sum_{k=0}^{n-1} \omega_k,$$
where $\omega_k$ designates the oscillation of the function $f(x)$ in the interval $[x_k, x_{k+1}]$. Let $\lambda = \max (x_{k+1} - x_k)$.
THEOREM 2. As $\lambda \to 0$, each of the sums $V$ and $\Omega$ tends to the total variation $V_a^b(f)$ of the function $f(x)$.$^9$ (We do not suppose that the variation $V_a^b(f)$ is finite.)
Proof. As already noted, the sum $V$ does not decrease when new points of subdivision are added. On the other hand, if this new point falls into the interval between $x_k$ and $x_{k+1}$, the increase in the sum $V$ due to this point is not greater than twice the oscillation $\omega_k$ of the function $f(x)$ on the segment $[x_k, x_{k+1}]$. Take any number
$$A < V_a^b(f)$$
and find a sum $V^*$ such that
$$V^* > A. \tag{2}$$
This sum corresponds to some method of subdivision, say
$$x_0^* = a < x_1^* < \dots < x_m^* = b.$$
Now choose $\delta > 0$ so small that
$$|f(x'') - f(x')| < \frac{V^* - A}{4m},$$
provided that $|x'' - x'| < \delta$. We shall show that $V > A$ for every method of subdivision of $[a, b]$ for which $\lambda < \delta$. Let (I) denote any such subdivision, and let (II) be the subdivision obtained from (I) by adding to (I) all of the points $x_0^*, x_1^*, \dots, x_m^*$. Let $V_0$ be the sum corresponding to the method of subdivision (II). It is then clear that
$$V_0 \ge V^*. \tag{3}$$
On the other hand, the subdivision (II) is obtained from the subdivision (I) by at most $m$ repetitions of the process of adding a single point. Each addition of a point of subdivision increases the sum $V$ by an amount less than $\frac{V^* - A}{2m}$, and consequently
$$V_0 - V < \frac{V^* - A}{2}.$$
It follows from this observation and from (3) that
$$V > V_0 - \frac{V^* - A}{2} \ge \frac{A + V^*}{2} > A.$$
Therefore the inequality $V > A$ holds for all $\lambda < \delta$. Since
$$V \le V_a^b(f)$$
for all $V$, it follows that
$$\lim_{\lambda \to 0} V = V_a^b(f).$$
9 It is essential here that we are dealing with a continuous function. Let, for instance, $f(x)$ be defined on $[-1, +1]$ as follows: $f(0) = 1$, $f(x) = 0$ for $x \ne 0$. Then $V_{-1}^{+1}(f) = 2$, but for an arbitrary method of subdividing $[-1, +1]$ for which $x = 0$ is not a point of division, $V = 0$, $\Omega = 1$.
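The role of continuity in Theorem 2 can be seen on the function of footnote 9, which equals 1 at the origin and 0 elsewhere on $[-1, +1]$. The following sketch assumes uniform subdivisions; the helper names are illustrative and not part of the text.

```python
def f(x):
    """f(0) = 1 and f(x) = 0 for x != 0 (discontinuous at 0)."""
    return 1.0 if x == 0 else 0.0

def variation_sum(func, points):
    """Sum of |func(x_{k+1}) - func(x_k)| over consecutive points."""
    return sum(abs(func(q) - func(p)) for p, q in zip(points, points[1:]))

def uniform_partition(a, b, n):
    """Subdivision of [a, b] into n equal parts."""
    return [a + (b - a) * k / n for k in range(n + 1)]

if __name__ == "__main__":
    for n in (10, 11, 1000, 1001):
        # Even n puts a division point at 0, odd n does not.
        print(n, variation_sum(f, uniform_partition(-1, 1, n)))
```

For even `n` the origin is a point of division and the sum equals 2; for odd `n` it is not, and the sum equals 0. The sums therefore have no limit as the mesh tends to zero, although the total variation is 2.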
It is now easy to carry out the proof for the sums $\Omega$. On the one hand, it is clear that
$$\Omega \ge V. \tag{4}$$
But if we find a sum $\Omega$ corresponding to any method of subdivision and then add as new points of subdivision those points at which the function $f(x)$ takes on the values
$$m_k = \min \{f(x)\}, \qquad M_k = \max \{f(x)\} \qquad (x_k \le x \le x_{k+1}),$$
then the sum $V'$ corresponding to this method of subdivision will obviously be not less than $\Omega$. Consequently
$$\Omega \le V_a^b(f). \tag{5}$$
From (4) and (5), we have
$$\lim_{\lambda \to 0} \Omega = V_a^b(f).$$
The theorem just proved is the basis of a very interesting approach to the study of continuous functions of finite variation due to S. Banach. Let $f(x)$ be defined and continuous on the segment $[a, b]$ and let
$$m = \min \{f(x)\}, \qquad M = \max \{f(x)\}.$$
Introduce the function $N(y)$, defined on the closed interval $[m, M]$, in the following way: $N(y)$ is the number of roots of the equation $f(x) = y$. If the set of these roots is infinite, then
$$N(y) = +\infty.$$
We will call the function $N(y)$ the Banach indicatrix.
THEOREM 3 (S. BANACH). The Banach indicatrix is measurable and
$$\int_m^M N(y)\,dy = V_a^b(f).$$
Proof. Subdivide $[a, b]$ into $2^n$ equal parts and set
$$d_1 = \left[a,\ a + \frac{b-a}{2^n}\right], \qquad d_k = \left(a + (k-1)\frac{b-a}{2^n},\ a + k\,\frac{b-a}{2^n}\right] \qquad (k = 2, 3, \dots, 2^n).$$
Further, let the function $L_k(y)$ $(k = 1, 2, \dots, 2^n)$ be equal to 1 if the equation
$$f(x) = y \tag{6}$$
has at least one root in the interval $d_k$, and let $L_k(y)$ be equal to 0 if this equation has no root in $d_k$. If $m_k$ and $M_k$ are, respectively, the greatest lower and least upper bounds of the function $f(x)$ in the interval $d_k$, then $L_k(y)$ equals 1 in the interval $(m_k, M_k)$ and equals 0 outside the segment $[m_k, M_k]$, so that this function can have no more than two points of discontinuity and is obviously measurable. Note further that
$$\int_m^M L_k(y)\,dy = M_k - m_k = \omega_k,$$
where $\omega_k$ is the oscillation of $f(x)$ on the closed interval $\bar{d}_k$. Finally, introduce the function
$$N_n(y) = L_1(y) + L_2(y) + \dots + L_{2^n}(y),$$
equal to the number of those intervals $d_k$ which contain at least one root of equation (6). The function $N_n(y)$ is clearly measurable. Here,
$$\int_m^M N_n(y)\,dy = \sum_{k=1}^{2^n} \omega_k,$$
so that by Theorem 2,
$$\lim_{n \to \infty} \int_m^M N_n(y)\,dy = V_a^b(f).$$
It is easy to see that
$$N_1(y) \le N_2(y) \le N_3(y) \le \dots,$$
and, hence, the finite or infinite limit
$$N^*(y) = \lim_{n \to \infty} N_n(y)$$
exists and is a measurable function. According to B. Levi's Theorem (Chapt. VI, §1),
$$\int_m^M N^*(y)\,dy = \lim_{n \to \infty} \int_m^M N_n(y)\,dy = V_a^b(f).$$
We now need only to show that
$$N^*(y) = N(y), \tag{7}$$
in order to prove the theorem. First of all, it is perfectly clear that
$$N_n(y) \le N(y),$$
so that
$$N^*(y) \le N(y). \tag{8}$$
Now let $q$ be a natural number not greater than $N(y)$. Then we can find $q$ distinct roots $x_1 < x_2 < \dots < x_q$ of equation (6). If $n$ is so large that
$$\frac{b-a}{2^n} < \min (x_{k+1} - x_k),$$
all $q$ roots $x_k$ will fall into distinct intervals $d_k$, so that
$$N_n(y) \ge q,$$
and therefore
$$N^*(y) \ge q. \tag{9}$$
If $N(y) = +\infty$, we can take $q$ arbitrarily large, so that $N^*(y) = +\infty$ also. If $N(y)$ is finite, we can take $q = N(y)$, and then (9) can be written in the form
$$N^*(y) \ge N(y).$$
(7) follows from this and (8).
COROLLARY 1. A continuous function $f(x)$ has finite variation if and only if its Banach indicatrix $N(y)$ is summable.
COROLLARY 2. If $f(x)$ is a continuous function of finite variation, then the set of values taken on by it an infinite number of times has (on the axis of ordinates) measure zero.
In fact, in this case the Banach indicatrix, being summable, is finite almost everywhere.
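Banach's identity can be tried out numerically for $f(x) = \sin x$ on $[0, 2\pi]$, whose total variation is 4. The sketch below is a rough approximation under stated assumptions: $N(y)$ is estimated by counting sign changes of $f(x_i) - y$ on a uniform grid, and the integral in $y$ is taken by the midpoint rule. Both choices, and all function names, are illustrative rather than part of the theorem.

```python
import math

def count_roots(values, y):
    """Count sign changes of f(x_i) - y along the grid (a proxy for N(y))."""
    count = 0
    for p, q in zip(values, values[1:]):
        if (p - y) * (q - y) < 0:
            count += 1
    return count

def indicatrix_integral(func, a, b, n_x, n_y):
    """Midpoint-rule estimate of the integral of N(y) over [min f, max f]."""
    values = [func(a + (b - a) * i / n_x) for i in range(n_x + 1)]
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_y
    total = sum(count_roots(values, lo + (j + 0.5) * step) for j in range(n_y))
    return total * step

def sampled_variation(func, a, b, n_x):
    """Variation sum of func over a uniform grid with n_x parts."""
    values = [func(a + (b - a) * i / n_x) for i in range(n_x + 1)]
    return sum(abs(q - p) for p, q in zip(values, values[1:]))

if __name__ == "__main__":
    print(indicatrix_integral(math.sin, 0.0, 2 * math.pi, 2000, 200))
    print(sampled_variation(math.sin, 0.0, 2 * math.pi, 2000))
```

Both printed values should be close to 4. For functions that oscillate on a scale finer than the grid, the sign-change count undercounts the roots, so the sketch is only meaningful when the grid resolves the oscillations.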
§ 6. THE STIELTJES INTEGRAL
We now take up a very important generalization of the Riemann integral which is called the Stieltjes integral. Let $f(x)$ and $g(x)$ be finite functions defined on the closed interval $[a, b]$. Subdivide $[a, b]$ into parts by means of the points
$$x_0 = a < x_1 < \dots < x_n = b,$$
choose a point $\xi_k$ in $[x_k, x_{k+1}]$ for $k = 0, \dots, n-1$, and form the sum
$$\sigma = \sum_{k=0}^{n-1} f(\xi_k)\,[g(x_{k+1}) - g(x_k)].$$
If the sum $\sigma$ tends to a finite limit $I$ as
$$\lambda = \max (x_{k+1} - x_k) \to 0,$$
independently of both the method of subdivision and the choice of the points $\xi_k$, this limit is called the Stieltjes integral of the function $f(x)$ with respect to the function $g(x)$ and is designated by
$$\int_a^b f(x)\,dg(x), \qquad \text{or} \qquad (S)\int_a^b f(x)\,dg(x).$$
The exact meaning of the definition is this: the number $I$ is the Stieltjes integral of the function $f(x)$ with respect to the function $g(x)$ if, for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for an arbitrary method of subdivision for which $\lambda < \delta$, the inequality
$$|\sigma - I| < \varepsilon$$
holds, for all choices of the points $\xi_k$. It is clear that the Riemann integral is a special case of the Stieltjes integral, obtained by setting $g(x) = x$.
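The sum $\sigma$ of the definition translates directly into code. The sketch below uses uniform subdivisions and midpoints for the points $\xi_k$; these are choices made here for illustration, since the definition allows any subdivision and any points in the subintervals, and the function names are likewise illustrative.

```python
def stieltjes_sum(f, g, points, tags):
    """sigma = sum of f(xi_k) * (g(x_{k+1}) - g(x_k)), xi_k in [x_k, x_{k+1}]."""
    return sum(f(t) * (g(q) - g(p))
               for p, q, t in zip(points, points[1:], tags))

def uniform_partition(a, b, n):
    """Subdivision of [a, b] into n equal parts."""
    return [a + (b - a) * k / n for k in range(n + 1)]

def midpoints(points):
    """One intermediate point per subinterval: its midpoint."""
    return [(p + q) / 2 for p, q in zip(points, points[1:])]

if __name__ == "__main__":
    pts = uniform_partition(0.0, 1.0, 1000)
    tags = midpoints(pts)
    # g(x) = x: an ordinary Riemann sum for the integral of x over [0, 1].
    print(stieltjes_sum(lambda x: x, lambda x: x, pts, tags))
    # g(x) = x^2: approximates the integral of x d(x^2), i.e. of 2x^2 dx.
    print(stieltjes_sum(lambda x: x, lambda x: x * x, pts, tags))
```

The first value approximates 1/2 and the second approximates 2/3.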
We list the following obvious properties of the Stieltjes integral.
1. $$\int_a^b [f_1(x) + f_2(x)]\,dg(x) = \int_a^b f_1(x)\,dg(x) + \int_a^b f_2(x)\,dg(x).$$
2. $$\int_a^b f(x)\,d[g_1(x) + g_2(x)] = \int_a^b f(x)\,dg_1(x) + \int_a^b f(x)\,dg_2(x).$$
3. If $k$ and $l$ are constants, then
$$\int_a^b k f(x)\,d\,[l\,g(x)] = kl \int_a^b f(x)\,dg(x).$$
(In all three cases, the existence of the right-hand member implies the existence of the left-hand member.)
4. If $a < c < b$, and all three integrals involved in the equality
$$\int_a^b f(x)\,dg(x) = \int_a^c f(x)\,dg(x) + \int_c^b f(x)\,dg(x) \tag{*}$$
exist, then the equality (*) holds.
In order to prove this property of the integral, it is necessary only to see that the point $c$ is included in the points of subdivision of $[a, b]$ when we form the sum $\sigma$ for the integral $\int_a^b f\,dg$.
It is not difficult to prove that the existence of the integral $\int_a^b f\,dg$ implies the existence of both the integrals $\int_a^c f\,dg$ and $\int_c^b f\,dg$, but we will not stop to discuss this.
It is more interesting to note that the converse statement is not true.
EXAMPLE. Let the functions $f(x)$ and $g(x)$ be defined on the segment $[-1, +1]$, where
$$f(x) = \begin{cases} 0 & \text{if } -1 \le x \le 0, \\ 1 & \text{if } 0 < x \le 1, \end{cases} \qquad g(x) = \begin{cases} 0 & \text{if } -1 \le x < 0, \\ 1 & \text{if } 0 \le x \le 1. \end{cases}$$
It is easy to see that the integrals
$$\int_{-1}^0 f(x)\,dg(x), \qquad \int_0^1 f(x)\,dg(x)$$
exist (because all of the sums $\sigma$ equal zero). But at the same time, the integral
$$\int_{-1}^{+1} f(x)\,dg(x)$$
does not exist. In fact, subdivide the segment $[-1, +1]$ into parts so that the point 0 is not a point of subdivision, and form the sum
$$\sigma = \sum_{k=0}^{n-1} f(\xi_k)\,[g(x_{k+1}) - g(x_k)].$$
It is easy to see that if $x_i < 0 < x_{i+1}$, only the $i$-th term remains in the sum $\sigma$, because if the points $x_k$ and $x_{k+1}$ lie on the same side of the point 0, then $g(x_k) = g(x_{k+1})$. This implies that
$$\sigma = f(\xi_i)\,[g(x_{i+1}) - g(x_i)] = f(\xi_i).$$
Depending upon whether $\xi_i \le 0$ or $\xi_i > 0$, we have
$$\sigma = 0 \qquad \text{or} \qquad \sigma = 1,$$
so that $\sigma$ has no limit.
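The two possible values of the sum in this example can be reproduced in a few lines. In the sketch below, the subdivision of $[-1, +1]$ into an odd number of equal parts (so that 0 is never a point of division) and the use of left or right endpoints as intermediate points are choices made for illustration.

```python
def f(x):
    """f = 0 on [-1, 0], f = 1 on (0, 1]."""
    return 0.0 if x <= 0 else 1.0

def g(x):
    """g = 0 on [-1, 0), g = 1 on [0, 1]."""
    return 0.0 if x < 0 else 1.0

def stieltjes_sum(f, g, points, tags):
    """sigma = sum of f(xi_k) * (g(x_{k+1}) - g(x_k))."""
    return sum(f(t) * (g(q) - g(p))
               for p, q, t in zip(points, points[1:], tags))

def partition_avoiding_zero(n):
    """Split [-1, 1] into 2n + 1 equal parts, so 0 is not a division point."""
    m = 2 * n + 1
    return [-1 + 2 * k / m for k in range(m + 1)]

if __name__ == "__main__":
    for n in (5, 50, 500):
        pts = partition_avoiding_zero(n)
        left = stieltjes_sum(f, g, pts, pts[:-1])   # xi_k = x_k
        right = stieltjes_sum(f, g, pts, pts[1:])   # xi_k = x_{k+1}
        print(n, left, right)
```

However fine the subdivision, the left-endpoint sums equal 0 and the right-endpoint sums equal 1, so no single limit exists.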
5. The existence of one of the integrals $\int_a^b f(x)\,dg(x)$ and $\int_a^b g(x)\,df(x)$ implies the existence of the other. In this case, the equality
$$\int_a^b f(x)\,dg(x) + \int_a^b g(x)\,df(x) = [f(x)g(x)]_a^b \tag{1}$$
holds, where
$$[f(x)g(x)]_a^b = f(b)g(b) - f(a)g(a). \tag{2}$$
The relation (1) is called the formula for integration by parts.
Let us prove this formula. Suppose that the integral $\int_a^b g(x)\,df(x)$ exists. Subdivide $[a, b]$ into parts and form the sum
$$\sigma = \sum_{k=0}^{n-1} f(\xi_k)\,[g(x_{k+1}) - g(x_k)].$$
The sum $\sigma$ can also be written as
$$\sigma = \sum_{k=0}^{n-1} f(\xi_k)\,g(x_{k+1}) - \sum_{k=0}^{n-1} f(\xi_k)\,g(x_k),$$
and this expression is clearly equal to
$$\sigma = -\sum_{k=1}^{n-1} g(x_k)\,[f(\xi_k) - f(\xi_{k-1})] + f(\xi_{n-1})\,g(x_n) - f(\xi_0)\,g(x_0).$$
Adding and subtracting expression (2) on the right side of the last equality, we find
$$\sigma = [f(x)g(x)]_a^b - \Big\{ g(a)\,[f(\xi_0) - f(a)] + \sum_{k=1}^{n-1} g(x_k)\,[f(\xi_k) - f(\xi_{k-1})] + g(b)\,[f(b) - f(\xi_{n-1})] \Big\}.$$
The expression in the curly brackets is nothing but the sum formed for the integral $\int_a^b g\,df$, where the points of subdivision of $[a, b]$ are
$$a \le \xi_0 \le \xi_1 \le \xi_2 \le \dots \le \xi_{n-1} \le b,$$
and the points $a, x_1, x_2, \dots, x_{n-1}, b$ are points of the closed intervals $[a, \xi_0], [\xi_0, \xi_1], \dots, [\xi_{n-1}, b]$. As $\max (x_{k+1} - x_k)$ approaches zero, $\max (\xi_{k+1} - \xi_k)$ also approaches zero, so that the sum in the curly brackets tends to the integral $\int_a^b g\,df$. This completes the proof.
It is natural to ask the question under what conditions the Stieltjes integral exists.
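Before taking up that question, note that the proof of property 5 rests on an exact algebraic identity between two finite sums, and this identity can be checked directly. In the sketch below the functions, the subdivision and the intermediate points are arbitrary choices made for illustration; `dual_points` is a name introduced here for the subdivision $a \le \xi_0 \le \dots \le \xi_{n-1} \le b$.

```python
import math

def stieltjes_sum(f, g, points, tags):
    """sigma = sum of f(xi_k) * (g(x_{k+1}) - g(x_k))."""
    return sum(f(t) * (g(q) - g(p))
               for p, q, t in zip(points, points[1:], tags))

def parts_identity_gap(f, g, points, tags):
    """Return sigma(f dg) + sigma'(g df) - [f g]_a^b, where sigma' uses the
    subdivision a, xi_0, ..., xi_{n-1}, b with the old division points
    a, x_1, ..., x_{n-1}, b as its intermediate points."""
    a, b = points[0], points[-1]
    dual_points = [a] + list(tags) + [b]
    sigma = stieltjes_sum(f, g, points, tags)
    sigma_dual = stieltjes_sum(g, f, dual_points, points)
    return sigma + sigma_dual - (f(b) * g(b) - f(a) * g(a))

if __name__ == "__main__":
    n = 50
    pts = [k / n for k in range(n + 1)]
    tags = [(p + q) / 2 for p, q in zip(pts, pts[1:])]
    # The gap is zero up to floating-point rounding, for any choice of tags.
    print(parts_identity_gap(lambda x: x * x, math.sin, pts, tags))
```

The gap vanishes for every subdivision, not only in the limit; the passage to the limit is needed only to identify the two sums with the two integrals.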
We will restrict ourselves to only one theorem in this direction.
THEOREM 1. The integral
$$\int_a^b f(x)\,dg(x)$$
exists if the function $f(x)$ is continuous on $[a, b]$ and $g(x)$ is of finite variation on $[a, b]$.
Proof. We may obviously suppose that the function $g(x)$ is increasing, because every function of finite variation is the difference of two increasing functions. Subdivide $[a, b]$ by means of the points
$$x_0 = a < x_1 < \dots < x_n = b$$
and designate by $m_k$ and $M_k$ the least and greatest values, respectively, of the function $f(x)$ on $[x_k, x_{k+1}]$. Let
$$s = \sum_{k=0}^{n-1} m_k\,[g(x_{k+1}) - g(x_k)], \qquad S = \sum_{k=0}^{n-1} M_k\,[g(x_{k+1}) - g(x_k)].$$
It is clear that
$$s \le \sigma \le S \tag{3}$$
for all choices of the points $\xi_k$ in the closed intervals $[x_k, x_{k+1}]$. It is also easy to verify that the sum $s$ does not decrease and $S$ does not increase when new points of subdivision are added. From this it follows that none of the sums $s$ surpasses any of the sums $S$. In fact, having two methods, I and II, of subdividing the segment $[a, b]$, with corresponding $s_1, S_1$ and $s_2, S_2$, respectively, we can form a method III by combining the points of subdivision of both of the methods I and II. If the sums $s_3$ and $S_3$ correspond to the method III, we have
$$s_1 \le s_3 \le S_3 \le S_2,$$
so that, in fact, $s_1 \le S_2$. With this in mind, we denote the least upper bound of the set $\{s\}$ of all lower sums by the symbol $I$:
$$I = \sup \{s\}.$$
For every method of subdivision, it is clear that
$$s \le I \le S,$$
and consequently, by (3),
$$|\sigma - I| \le S - s.$$
If we choose an arbitrary $\varepsilon > 0$ and find a $\delta > 0$ such that $|f(x'') - f(x')| < \varepsilon$ whenever $|x'' - x'| < \delta$, then we have
$$M_k - m_k < \varepsilon \qquad (k = 0, 1, \dots, n-1)$$
for $\lambda < \delta$, and accordingly
$$S - s \le \varepsilon\,[g(b) - g(a)].$$
Clearly
$$|\sigma - I| \le \varepsilon\,[g(b) - g(a)]$$
for $\lambda < \delta$. In other words,
$$\lim_{\lambda \to 0} \sigma = I,$$
so that $I$ is the integral $\int_a^b f(x)\,dg(x)$.
It follows from the theorem just proved, together with property 5, that every function of finite variation is integrable with respect to every continuous function. The problem of computing Stieltjes integrals will be studied in detail in §6, Chapt. IX. We restrict ourselves here to two elementary cases.
THEOREM 2. If the function $f(x)$ is continuous on $[a, b]$ and if the function $g(x)$ has a Riemann integrable derivative $g'(x)$ at every point of $[a, b]$, then
$$(S)\int_a^b f(x)\,dg(x) = (R)\int_a^b f(x)\,g'(x)\,dx. \tag{4}$$
Proof. It follows from the conditions of the theorem that $g(x)$ satisfies the Lipschitz condition (its derivative, being Riemann integrable, is bounded) and therefore is of finite variation. Hence the integral on the left side of (4) exists. On the other hand, the function $g'(x)$, and the product $f(x)g'(x)$ with it, are continuous almost everywhere, so that the right member of (4) also exists. It remains to show that the two sides of (4) are equal. For this purpose, subdivide $[a, b]$ by means of the points
$$x_0 = a < x_1 < \dots < x_n = b,$$
and, to every difference $g(x_{k+1}) - g(x_k)$, apply the mean value theorem:
$$g(x_{k+1}) - g(x_k) = g'(\bar{x}_k)\,(x_{k+1} - x_k) \qquad (x_k < \bar{x}_k < x_{k+1}).$$
In forming the sum $\sigma$ for the integral $\int_a^b f\,dg$, we may take the point $\bar{x}_k$ which appears in the mean value theorem for the point $\xi_k$. The sum $\sigma$ then becomes
$$\sigma = \sum_{k=0}^{n-1} f(\bar{x}_k)\,g'(\bar{x}_k)\,(x_{k+1} - x_k).$$
This is a Riemann sum for the function $f(x)g'(x)$. Refining the subdivision and taking the limit, we obtain (4).
THEOREM 3. Let $f(x)$ be a function continuous on $[a, b]$ and let $g(x)$ be constant in each of the intervals $(a, c_1), (c_1, c_2), \dots, (c_m, b)$, where $a < c_1 < c_2 < \dots < c_m < b$.$^{10}$ Then
$$\int_a^b f(x)\,dg(x) = f(a)\,[g(a+0) - g(a)] + \sum_{k=1}^{m} f(c_k)\,[g(c_k+0) - g(c_k-0)] + f(b)\,[g(b) - g(b-0)]. \tag{5}$$
Proof. It is easy to see that
$$V_a^b(g) = |g(a+0) - g(a)| + \sum_{k=1}^{m} \{ |g(c_k) - g(c_k-0)| + |g(c_k+0) - g(c_k)| \} + |g(b) - g(b-0)|,$$
so that the function $g(x)$ has finite variation on $[a, b]$. Therefore $g(x)$ has finite variation on every subinterval of $[a, b]$. Therefore
$$\int_a^b f(x)\,dg(x) = \sum_{k=0}^{m} \int_{c_k}^{c_{k+1}} f(x)\,dg(x), \tag{6}$$
where we write $c_0 = a$, $c_{m+1} = b$. It remains to compute the integral $\int_{c_k}^{c_{k+1}} f(x)\,dg(x)$. Subdividing $[c_k, c_{k+1}]$ and forming the sum $\sigma$ for the interval $[c_k, c_{k+1}]$, we obviously have
$$\sigma = f(\xi_0)\,[g(c_k+0) - g(c_k)] + f(\xi_{n-1})\,[g(c_{k+1}) - g(c_{k+1}-0)],$$
because all other terms vanish. In the limit, therefore,
$$\int_{c_k}^{c_{k+1}} f(x)\,dg(x) = f(c_k)\,[g(c_k+0) - g(c_k)] + f(c_{k+1})\,[g(c_{k+1}) - g(c_{k+1}-0)].$$
The equality (5) follows from this and (6).
10 In other words, $g(x)$ is a step function.
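Formula (5) can be compared with an ordinary Stieltjes sum for a concrete step function. In the sketch below, the jump point 0.3, the values of $g$, and the integrand $f(x) = x^2$ are all made up for illustration; $g$ is continuous at both endpoints, so only the interior jump contributes.

```python
def stieltjes_sum(f, g, points, tags):
    """sigma = sum of f(xi_k) * (g(x_{k+1}) - g(x_k))."""
    return sum(f(t) * (g(q) - g(p))
               for p, q, t in zip(points, points[1:], tags))

C = 0.3  # the single interior point of discontinuity (illustrative)

def g(x):
    """Step function: 0 to the left of C, 0.5 at C itself, 2 to the right."""
    if x < C:
        return 0.0
    if x == C:
        return 0.5
    return 2.0

def f(x):
    """A continuous integrand."""
    return x * x

def jump_formula():
    """Right side of (5): f(C) * [g(C+0) - g(C-0)]; endpoint terms vanish."""
    return f(C) * (2.0 - 0.0)

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        pts = [k / n for k in range(n + 1)]
        tags = [(p + q) / 2 for p, q in zip(pts, pts[1:])]
        print(n, stieltjes_sum(f, g, pts, tags), jump_formula())
```

The sums approach $2 f(0.3) = 0.18$, and the value 0.5 assigned to $g$ at the jump point itself plays no role in the limit.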
§ 7. PASSAGE TO THE LIMIT UNDER THE STIELTJES INTEGRAL SIGN
THEOREM 1. If the function $f(x)$ is continuous on $[a, b]$ and $g(x)$ has finite variation on $[a, b]$, then
$$\left| \int_a^b f(x)\,dg(x) \right| \le M(f) \cdot V_a^b(g), \tag{1}$$
where $M(f) = \max |f(x)|$.
Proof. For an arbitrary method of subdividing $[a, b]$ and an arbitrary choice of the points $\xi_k$,
$$|\sigma| = \left| \sum_{k=0}^{n-1} f(\xi_k)\,[g(x_{k+1}) - g(x_k)] \right| \le M(f) \sum_{k=0}^{n-1} |g(x_{k+1}) - g(x_k)| \le M(f) \cdot V_a^b(g).$$
(1) follows at once.
THEOREM 2. Let $g(x)$ be a function of finite variation defined on the closed interval $[a, b]$, and let $f_n(x)$ be a sequence of continuous functions on $[a, b]$, which converges uniformly to the (necessarily continuous) function $f(x)$. Then
$$\lim_{n \to \infty} \int_a^b f_n(x)\,dg(x) = \int_a^b f(x)\,dg(x).$$
Proof. Let $M(f_n - f) = \max |f_n(x) - f(x)|$. Then, by (1),
$$\left| \int_a^b f_n(x)\,dg(x) - \int_a^b f(x)\,dg(x) \right| \le M(f_n - f) \cdot V_a^b(g),$$
and it remains merely to note that $M(f_n - f) \to 0$ by hypothesis.
THEOREM 3 (HELLY'S SECOND THEOREM). Let $f(x)$ be a continuous function defined on the interval $[a, b]$, and let $\{g_n(x)\}$ be a sequence of functions which converges to a finite function $g(x)$ at every point of $[a, b]$. If
$$V_a^b(g_n) \le K < +\infty$$
for all $n$, then
$$\lim_{n \to \infty} \int_a^b f(x)\,dg_n(x) = \int_a^b f(x)\,dg(x). \tag{2}$$
Proof. We first show that
$$V_a^b(g) \le K, \tag{3}$$
so that the limit function also has finite variation. In fact, if we subdivide the interval $[a, b]$ in any way whatsoever, we have
$$\sum_{k=0}^{m-1} |g_n(x_{k+1}) - g_n(x_k)| \le K \qquad (n = 1, 2, 3, \dots).$$
Taking the limit as $n \to \infty$, we find that
$$\sum_{k=0}^{m-1} |g(x_{k+1}) - g(x_k)| \le K.$$
Since the subdivision used was arbitrary, (3) follows from the last inequality. Now select an arbitrary $\varepsilon > 0$ and subdivide $[a, b]$ by means of the points $\{x_k\}$ $(k = 0, 1, \dots, m)$ into subintervals $[x_k, x_{k+1}]$ so small that the oscillation of the function $f(x)$ is less than $\frac{\varepsilon}{3K}$ on every interval $[x_k, x_{k+1}]$. Then
$$\int_a^b f(x)\,dg(x) = \sum_{k=0}^{m-1} \int_{x_k}^{x_{k+1}} f(x)\,dg(x) = \sum_{k=0}^{m-1} \int_{x_k}^{x_{k+1}} [f(x) - f(x_k)]\,dg(x) + \sum_{k=0}^{m-1} f(x_k) \int_{x_k}^{x_{k+1}} dg(x).$$
Now
$$\int_{x_k}^{x_{k+1}} dg(x) = g(x_{k+1}) - g(x_k).$$
Furthermore, the inequality
$$|f(x) - f(x_k)| < \frac{\varepsilon}{3K}$$
holds for all $x \in [x_k, x_{k+1}]$, and therefore
$$\left| \int_{x_k}^{x_{k+1}} [f(x) - f(x_k)]\,dg(x) \right| \le \frac{\varepsilon}{3K}\,V_{x_k}^{x_{k+1}}(g);$$
accordingly,
$$\left| \sum_{k=0}^{m-1} \int_{x_k}^{x_{k+1}} [f(x) - f(x_k)]\,dg(x) \right| \le \frac{\varepsilon}{3K}\,V_a^b(g) \le \frac{\varepsilon}{3}.$$
We find, therefore, that
$$\int_a^b f(x)\,dg(x) = \sum_{k=0}^{m-1} f(x_k)\,[g(x_{k+1}) - g(x_k)] + \theta\,\frac{\varepsilon}{3} \qquad (|\theta| \le 1).$$
In the same way, we can show that
$$\int_a^b f(x)\,dg_n(x) = \sum_{k=0}^{m-1} f(x_k)\,[g_n(x_{k+1}) - g_n(x_k)] + \theta_n\,\frac{\varepsilon}{3} \qquad (|\theta_n| \le 1).$$
Since $\lim_{n \to \infty} g_n(x) = g(x)$ for all $x \in [a, b]$, it is clear that there exists a natural number $n_0$ such that the inequality
$$\left| \sum_{k=0}^{m-1} f(x_k)\,[g_n(x_{k+1}) - g_n(x_k)] - \sum_{k=0}^{m-1} f(x_k)\,[g(x_{k+1}) - g(x_k)] \right| < \frac{\varepsilon}{3}$$
holds for all $n > n_0$. For all $n > n_0$, therefore, we have
$$\left| \int_a^b f(x)\,dg_n(x) - \int_a^b f(x)\,dg(x) \right| < \varepsilon.$$
This proves the theorem.
With the aid of the preceding theorem we can reduce the problem of evaluating the integral $\int_a^b f(x)\,dg(x)$ ($f(x)$ continuous and $g(x)$ of finite variation) to the case in which $g(x)$ is continuous. Let $g(x)$ be an arbitrary function of finite variation. Consider the saltus function of the function $g(x)$:
$$s(x) = [g(a+0) - g(a)] + \sum_{x_k < x} [g(x_k+0) - g(x_k-0)] + [g(x) - g(x-0)] \qquad (a < x \le b), \qquad s(a) = 0.$$
As proved in Theorem 7, §3, we may write
$$g(x) = s(x) + \gamma(x),$$
where $\gamma(x)$ is a continuous function of bounded variation. Hence,
$$\int_a^b f(x)\,dg(x) = \int_a^b f(x)\,ds(x) + \int_a^b f(x)\,d\gamma(x).$$
The integral $\int_a^b f(x)\,ds(x)$ can be easily evaluated. To do this, first note that the series
$$\sum_{k=1}^{\infty} \{ |g(x_k) - g(x_k-0)| + |g(x_k+0) - g(x_k)| \}$$
is convergent.$^{11}$ Next, define the functions $s_n(x)$ by setting $s_n(a) = 0$ and
$$s_n(x) = [g(a+0) - g(a)] + \sum_{x_k < x,\ k \le n} [g(x_k+0) - g(x_k-0)] + [g(x) - g(x-0)]$$
for $a < x \le b$. It is clear that
$$\lim_{n \to \infty} s_n(x) = s(x)$$
for all $x \in [a, b]$. On the other hand,
$$V_a^b(s_n) = |g(a+0) - g(a)| + \sum_{k=1}^{n} \{ |g(x_k) - g(x_k-0)| + |g(x_k+0) - g(x_k)| \} + |g(b) - g(b-0)|,$$
so that the variations of all the functions $s_n(x)$ are all less than a certain fixed value. By Helly's second theorem, therefore,
$$\int_a^b f(x)\,ds(x) = \lim_{n \to \infty} \int_a^b f(x)\,ds_n(x).$$
As the function $s_n(x)$ is constant in the intervals between the points $a, x_1, \dots, x_n, b$, we infer from Theorem 3, §6, that
$$\int_a^b f(x)\,ds_n(x) = f(a)\,[g(a+0) - g(a)] + \sum_{k=1}^{n} f(x_k)\,[g(x_k+0) - g(x_k-0)] + f(b)\,[g(b) - g(b-0)].$$
(It is clear that the saltus of the functions $s_n(x)$ at the points $a, x_1, \dots, x_n, b$ coincide with the saltus of the function $g(x)$.) Hence
$$\int_a^b f(x)\,ds(x) = f(a)\,[g(a+0) - g(a)] + \sum_{k=1}^{\infty} f(x_k)\,[g(x_k+0) - g(x_k-0)] + f(b)\,[g(b) - g(b-0)],$$
and to find the integral $\int_a^b f(x)\,dg(x)$ it remains to evaluate $\int_a^b f(x)\,d\gamma(x)$, where $\gamma(x)$ is the continuous component of the function $g(x)$.
We call the reader's attention to the fact that the value $g(x_k)$ itself of the function $g(x)$ at a point $x_k$ of discontinuity such that $a < x_k < b$ has no effect on the value of the integral $\int_a^b f\,dg$. This is quite natural, because we can omit the point $x_k$ as a point of subdivision in forming the sums $\sigma$.
11 Write $g(x) = \pi(x) - \nu(x)$, where $\pi(x)$ and $\nu(x)$ are increasing functions. Then each of the non-negative series
$$\sum_{k=1}^{\infty} [\pi(x_k+0) - \pi(x_k-0)], \qquad \sum_{k=1}^{\infty} [\nu(x_k+0) - \nu(x_k-0)]$$
obviously converges, and it remains merely to observe that
$$|g(x_k) - g(x_k-0)| + |g(x_k+0) - g(x_k)| \le [\pi(x_k+0) - \pi(x_k-0)] + [\nu(x_k+0) - \nu(x_k-0)].$$
§ 8. LINEAR FUNCTIONALS
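Let $g(x)$ be a function of finite variation defined on the closed interval $[a, b]$. By means of this function, we can assign the number

$$\Phi(f) = \int_a^b f(x)\,dg(x) \tag{1}$$

to every continuous function $f(x)$ defined on $[a, b]$. The following properties of this function are evident.

1) $\Phi(f_1 + f_2) = \Phi(f_1) + \Phi(f_2)$;

2) $|\Phi(f)| \le K\,M(f)$, where $M(f) = \max |f(x)|$ and $K = V_a^b(g)$.

Suppose that an abstract function $\Phi(f)$ possessing properties 1) and 2) is defined for all continuous functions $f(x)$ on $[a, b]$; such a function is called a bounded linear functional on the space $C$. Before proving the theorem of F. Riesz stated below, we observe that if $\Phi(f)$ is a bounded linear functional on $C$, then

$$\Phi(kf) = k\,\Phi(f)$$

for all numbers $k$ and $f \in C$. This is proved by the argument employed for the corresponding assertion for bounded linear functionals on the space $L_2$ (§4, Chapter VII).

THEOREM (F. RIESZ). Let $\Phi(f)$ be a bounded linear functional on $C$. Then there exists a function $g(x)$ of finite variation on $[a, b]$ such that

$$\Phi(f) = \int_a^b f(x)\,dg(x) \tag{1}$$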
for all $f(x) \in C$.

Proof. It is sufficient to consider the case $a = 0$, $b = 1$, since the general case can be reduced to this by use of a linear transformation of the argument $x$. In §5, Chapt. IV, we noted that

$$\sum_{k=0}^{n} C_n^k x^k (1-x)^{n-k} = 1.$$

For $x \in [0, 1]$, each term of this sum is non-negative. Therefore, if $\varepsilon_k = \pm 1$ $(k = 0, 1, \ldots, n)$, then

$$\Bigl|\sum_{k=0}^{n} \varepsilon_k C_n^k x^k (1-x)^{n-k}\Bigr| \le 1. \tag{2}$$

We next consider the linear functional $\Phi$, which by hypothesis satisfies

$$|\Phi(f)| \le K \cdot M(f).$$

From this and (2), we have

$$\Bigl|\sum_{k=0}^{n} \varepsilon_k \Phi[C_n^k x^k (1-x)^{n-k}]\Bigr| \le K.$$

Upon choosing the numbers $\varepsilon_k$ so that all terms of the last sum are non-negative, we find that

$$\sum_{k=0}^{n} \bigl|\Phi[C_n^k x^k (1-x)^{n-k}]\bigr| \le K. \tag{3}$$

Define a step function $g_n(x)$ by setting

$$g_n(0) = 0,$$
$$g_n(x) = \Phi[C_n^0 x^0 (1-x)^{n}] \qquad (0 < x < \tfrac{1}{n}),$$
$$g_n(x) = \Phi[C_n^0 x^0 (1-x)^{n}] + \Phi[C_n^1 x^1 (1-x)^{n-1}] \qquad (\tfrac{1}{n} \le x < \tfrac{2}{n}),$$
$$\ldots$$
$$g_n(x) = \sum_{k=0}^{n-1} \Phi[C_n^k x^k (1-x)^{n-k}] \qquad (\tfrac{n-1}{n} \le x < 1),$$
$$g_n(1) = \sum_{k=0}^{n} \Phi[C_n^k x^k (1-x)^{n-k}].$$

In view of (3), the functions $g_n(x)$ and their total variations are bounded by a single number. Therefore, on the basis of Helly's first theorem, we infer the existence of a subsequence $\{g_{n_i}(x)\}$ of $\{g_n(x)\}$ which converges at every point of $[0, 1]$ to a function of finite variation $g(x)$. If $f(x)$ is any continuous function defined on $[0, 1]$, then by Theorem 3, §6,

$$\int_0^1 f(x)\,dg_n(x) = \sum_{k=0}^{n} f\bigl(\tfrac{k}{n}\bigr)\,\Phi[C_n^k x^k (1-x)^{n-k}],$$

and therefore

$$\int_0^1 f(x)\,dg_n(x) = \Phi[B_n(x)],$$

where

$$B_n(x) = \sum_{k=0}^{n} f\bigl(\tfrac{k}{n}\bigr)\,C_n^k x^k (1-x)^{n-k}$$

is the Bernstein polynomial of degree $n$ for the function $f(x)$. By Bernstein's theorem of §5, Chapt. IV, we have

$$M(B_n - f) \to 0,$$

and by the definition of a bounded linear functional,

$$|\Phi(B_n) - \Phi(f)| = |\Phi(B_n - f)| \le K\,M(B_n - f).$$

Accordingly, $\Phi(B_n) \to \Phi(f)$ as $n \to \infty$. It follows that

$$\lim_{n\to\infty} \int_0^1 f(x)\,dg_n(x) = \Phi(f).$$

By Helly's second theorem, we have

$$\lim_{i\to\infty} \int_0^1 f(x)\,dg_{n_i}(x) = \int_0^1 f(x)\,dg(x).$$

Therefore

$$\Phi(f) = \int_0^1 f(x)\,dg(x),$$

as was to be proved.

§ 9. EDITOR'S APPENDIX TO CHAPTER VIII
For many applications, especially in the theory of probability, it is useful to consider functions of finite variation on the infinite line $(-\infty, \infty)$.

DEFINITION 1. Let $f(x)$ be a function defined for all $x$, $-\infty < x < \infty$. If $V_a^b(f) < \infty$ for all $a < b$, and if $\sup_{a<b} V_a^b(f)$ is finite, then $f(x)$ is said to be of finite variation on $(-\infty, \infty)$, and the number

$$\sup_{a<b} V_a^b(f) = V_{-\infty}^{\infty}(f)$$

is called the total variation of $f$.

THEOREM 1.

$$V_{-\infty}^{\infty}(f) = \lim_{a\to\infty} V_{-a}^{a}(f).$$

This theorem is obvious.

DEFINITION 2.

$$V_{-\infty}^{a}(f) = \lim_{b\to-\infty} V_b^a(f); \qquad V_a^{\infty}(f) = \lim_{b\to\infty} V_a^b(f).$$

THEOREM 2. For $a < b$,

$$V_{-\infty}^{b}(f) = V_{-\infty}^{a}(f) + V_a^b(f); \tag{1}$$

$$V_{-\infty}^{\infty}(f) = V_{-\infty}^{a}(f) + V_a^{\infty}(f); \tag{2}$$

$$V_a^{\infty}(f) = V_a^b(f) + V_b^{\infty}(f). \tag{3}$$
This theorem is also obvious.

THEOREM 3. Let $f(x)$ have finite variation on $(-\infty, \infty)$. Then

$$\lim_{x\to-\infty} V_{-\infty}^{x}(f) = 0 \tag{1}$$

and

$$\lim_{x\to\infty} V_x^{\infty}(f) = 0. \tag{2}$$

Proof. Let us prove, for example, (2). Since $V_{-\infty}^{\infty}(f) = \sup_a V_{-\infty}^{a}(f)$, it is clear that for every $\varepsilon > 0$, there exists a number $a$ such that $V_{-\infty}^{\infty}(f) - \varepsilon < V_{-\infty}^{a}(f)$. Also, $V_{-\infty}^{\infty}(f) = V_{-\infty}^{a}(f) + V_a^{\infty}(f)$. It follows that $V_a^{\infty}(f) < \varepsilon$. This obviously proves (2), since $V_x^{\infty}(f)$ does not increase as $x$ increases. The relation (1) is proved similarly.

THEOREM 4. If $V_{-\infty}^{\infty}(f) < \infty$, then $f(x)$ is bounded.

Proof. Let $a$ be any number. Then, for every number $x < a$,

$$|f(a) - f(x)| \le V_x^a(f) \le V_{-\infty}^{\infty}(f).$$

Similarly, if $x > a$,

$$|f(x) - f(a)| \le V_a^x(f) \le V_{-\infty}^{\infty}(f).$$

Hence $|f(x)| \le |f(a)| + V_{-\infty}^{\infty}(f)$, for all $x \ne a$.
Theorems 3 and 4 of §3 are easily seen to be true of functions having finite variation on $(-\infty, \infty)$.

THEOREM 5. Let $f(x)$ have finite variation on $(-\infty, \infty)$. Then $f(x)$ can be written as the difference of two bounded monotone increasing functions: $f(x) = \pi(x) - \nu(x)$, and $\pi(x)$ can be chosen so that $\lim_{x\to-\infty} \pi(x) = 0$.

Proof. Let $\pi(x) = V_{-\infty}^{x}(f)$, and let $\nu(x) = \pi(x) - f(x)$. Clearly $\pi(x)$ is an increasing function, and by Theorem 3, we have $\lim_{x\to-\infty} \pi(x) = 0$. To show that $\nu(x)$ is monotone, simply repeat the argument of Theorem 6, §3.

Remark. The converse of Theorem 5 is an obvious consequence of Theorem 3, §3.

COROLLARY. Let $f(x)$ have finite variation on $(-\infty, \infty)$. Then $\lim_{x\to\infty} f(x)$ and $\lim_{x\to-\infty} f(x)$ exist.

Corollaries 1 and 2 of Theorem 6, §3, hold for functions of finite variation on $(-\infty, \infty)$.

Theorem 7, §3, is also true for functions $f(x)$ having finite variation on $(-\infty, \infty)$. The construction is, in fact, a bit simpler in the present case. For all $x$, let

$$s_\pi(x) = \sum_{x_k < x} [\pi(x_k+0) - \pi(x_k-0)] + [\pi(x) - \pi(x-0)],$$

and let

$$s_\nu(x) = \sum_{x_k < x} [\nu(x_k+0) - \nu(x_k-0)] + [\nu(x) - \nu(x-0)].$$

The points $x_1, x_2, x_3, \ldots$ are the points at which at least one of $\pi(x)$ and $\nu(x)$ is discontinuous. Then, putting $s(x) = s_\pi(x) - s_\nu(x)$, we obtain the saltus function of $f(x)$, and the difference $f(x) - s(x)$ is a continuous function of finite variation.

One further reduction is possible at this point. Namely, let us write $\alpha = \lim_{x\to-\infty} \nu(x)$, and $\rho(x) = \nu(x) - \alpha$. Then we have

$$f(x) = \pi(x) - \rho(x) - \alpha,$$

where both $\pi(x)$ and $\rho(x)$ are monotone increasing functions with limit zero at $-\infty$.

The results of §§ 4, 5, and 6 have obvious extensions for functions of finite variation on $(-\infty, \infty)$. We mention explicitly only the definition of the Stieltjes integral $\int_{-\infty}^{\infty} f(x)\,dg(x)$ for bounded continuous functions $f(x)$ and functions of finite variation $g(x)$ on $(-\infty, \infty)$. We make the definition

$$\int_{-\infty}^{\infty} f(x)\,dg(x) = \lim_{\substack{a\to-\infty \\ b\to\infty}} \int_a^b f(x)\,dg(x).$$
The theorems of §6 have obvious analogues for this integral, except for the formula for integration by parts, in which care must be exercised in taking the limits $a \to -\infty$, $b \to \infty$. Helly's second theorem is not true as it stands for functions on $(-\infty, \infty)$. For example, let

$$\gamma(x) = \begin{cases} 0, & x < 0, \\ x, & 0 \le x \le 1, \\ 1, & x > 1, \end{cases}$$

and let $g_n(x) = \gamma(x - n)$ $(n = 1, 2, 3, \ldots)$. Then $g(x) = \lim_{n\to\infty} g_n(x) = 0$ for all $x$, and yet

$$\lim_{n\to\infty} \int_{-\infty}^{\infty} 1\,dg_n(x) = 1 \ne 0 = \int_{-\infty}^{\infty} 1\,dg(x).$$
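The counterexample to Helly's second theorem on the whole line can be checked numerically. The following is a minimal sketch, assuming only the ramp function and the translates $g_n(x) = \gamma(x - n)$ from the text; the helper names `gamma`, `g` and `integral_of_one` are illustrative and not part of the book. It uses the fact that the Stieltjes integral of the constant function 1 over $[a, b]$ is simply $g_n(b) - g_n(a)$.

```python
def gamma(x: float) -> float:
    """Ramp function: 0 for x < 0, x on [0, 1], 1 for x > 1."""
    return min(1.0, max(0.0, x))


def g(n: int, x: float) -> float:
    """Translate g_n(x) = gamma(x - n) of the ramp."""
    return gamma(x - n)


def integral_of_one(n: int, a: float, b: float) -> float:
    """Stieltjes integral of f = 1 with respect to g_n over [a, b]."""
    return g(n, b) - g(n, a)


if __name__ == "__main__":
    x = 3.5
    # At a fixed point the values g_n(x) are eventually 0 ...
    print([g(n, x) for n in range(1, 8)])
    # ... while the integral over a very long interval stays equal to 1.
    print([integral_of_one(n, -1e6, 1e6) for n in (1, 10, 100)])
```

The unit of variation carried by $g_n$ escapes to infinity, which is why the pointwise limit loses it; this is the motivation for restricting the integrands to functions that vanish at infinity.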
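To obtain the analogue of Helly's second theorem, we must introduce a new function class. Let $C_\infty$ denote the class of all continuous functions $f(x)$ defined for $-\infty < x < \infty$ and having the property that

$$\lim_{x\to-\infty} f(x) = \lim_{x\to\infty} f(x) = 0.$$

THEOREM 6. Let $\{g_n(x)\}$ be a sequence of functions on $-\infty < x < \infty$ such that $V_{-\infty}^{\infty}(g_n) < K$ for all $n$, and such that $\lim_{n\to\infty} g_n(x) = g(x)$ exists for all $x$, $-\infty < x < \infty$. For all $f(x) \in C_\infty$, we then have

$$\lim_{n\to\infty} \int_{-\infty}^{\infty} f(x)\,dg_n(x) = \int_{-\infty}^{\infty} f(x)\,dg(x).$$

Proof. This is a simple corollary of Helly's second theorem (Theorem 3, §7). It is first clear that $V_{-\infty}^{\infty}(g) \le K$. Now let $\varepsilon$ be an arbitrary positive number. Then there exists a number $A$ such that

$$|f(x)| < \frac{\varepsilon}{8K} \quad \text{for } |x| \ge A.$$

We then have

$$\Bigl|\int_{-\infty}^{-A} f(x)\,d\psi(x)\Bigr| \le \frac{\varepsilon}{8K}\cdot K = \frac{\varepsilon}{8} \quad \text{and} \quad \Bigl|\int_{A}^{\infty} f(x)\,d\psi(x)\Bigr| \le \frac{\varepsilon}{8}$$

if $\psi(x)$ is any function with total variation not exceeding $K$. It follows that

$$\Bigl|\int_{-\infty}^{\infty} f\,dg_n - \int_{-\infty}^{\infty} f\,dg\Bigr| \le \Bigl|\int_{-A}^{A} f\,dg_n - \int_{-A}^{A} f\,dg\Bigr| + \Bigl|\int_{-\infty}^{-A} f\,dg_n\Bigr| + \Bigl|\int_{-\infty}^{-A} f\,dg\Bigr| + \Bigl|\int_{A}^{\infty} f\,dg_n\Bigr| + \Bigl|\int_{A}^{\infty} f\,dg\Bigr| \le \Bigl|\int_{-A}^{A} f\,dg_n - \int_{-A}^{A} f\,dg\Bigr| + \frac{\varepsilon}{2}.$$

By Helly's second theorem, the first term on the right tends to zero as $n \to \infty$, so that the left-hand side is less than $\varepsilon$ for all sufficiently large $n$. This proves the theorem.

THEOREM. Let $\Phi(f)$ be a bounded linear functional on $C_\infty$. Then there exists a function $g(x)$ of finite variation on $(-\infty, \infty)$ such that

$$\Phi(f) = \int_{-\infty}^{\infty} f(x)\,dg(x)$$

for all $f(x) \in C_\infty$. We omit the proof.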
Exercises for Chapter VIII

1. A necessary and sufficient condition that the function $f(x)$ be of finite variation is that there exist an increasing function $\varphi(x)$ such that $f(x'') - f(x') \le \varphi(x'') - \varphi(x')$ for $x' < x''$.

2. If at every point of the set $E$, the derivative $f'(x)$ of a finite function $f(x)$ exists and $|f'(x)| \le K$, then $m^* f(E) \le K \cdot m^* E$.

3. A function $f(x)$ is said to satisfy a Lipschitz condition of order $\alpha > 0$ if $|f(x'') - f(x')| \le K\,|x'' - x'|^{\alpha}$. Show that for $\alpha > 1$, $f(x) \equiv \text{const}$. Construct an example of a function of finite variation which satisfies no Lipschitz condition. Construct a function satisfying a Lipschitz condition of given order $\alpha < 1$ and having infinite total variation.

4. The integral $\int_a^b f(x)\,dg(x)$ exists if $f(x)$ satisfies a Lipschitz condition of order $\alpha$, and $g(x)$ satisfies a Lipschitz condition of order $\beta$, where $\alpha + \beta > 1$. (V. Kondurar'.)

5. If $f(x)$ is continuous and $g(x)$ is of finite variation, then $\int_a^x f(x)\,dg(x)$ is a function of finite variation, continuous at all points of continuity of $g(x)$.

6. Let $\mu_0, \mu_1, \mu_2, \ldots$ be a given sequence of numbers. Set $\Delta^0 \mu_n = \mu_n$, $\Delta^{k+1} \mu_n = \Delta^k \mu_n - \Delta^k \mu_{n+1}$. A necessary and sufficient condition that there exist an increasing function $g(x)$ for which

$$\int_0^1 x^n\,dg(x) = \mu_n \quad (n = 0, 1, 2, \ldots) \tag{1}$$

is that $\Delta^k \mu_n \ge 0$ for all $k$ and $n$ (F. Hausdorff).

7. (Notation as in Exercise 6.) A necessary and sufficient condition that a function of bounded variation $g(x)$ exist and satisfy condition (1) is that

$$\sum_{k=0}^{n} C_n^k\,|\Delta^{n-k} \mu_k| \le K$$

(F. Hausdorff).

8. Show that Riesz's theorem of §8 is a corollary of Hausdorff's theorem stated in Exercise 7.

9. The set $F = \{f(x)\}$ consists of equicontinuous functions if to every $\varepsilon > 0$ there corresponds a $\delta > 0$ such that $|f(x'') - f(x')| < \varepsilon$ for $|x'' - x'| < \delta$ and for all functions of $F$. If all the functions of such an infinite set $F$ are bounded in absolute value by a single number, then a uniformly convergent sequence can be found in $F$ (C. Arzelà, G. Ascoli).

10. Prove the equality $V_a^b(f) = V_a^c(f) + V_c^b(f)$ for continuous functions $f(x)$, using Banach's theorem of §5.
CHAPTER IX

ABSOLUTELY CONTINUOUS FUNCTIONS. THE INDEFINITE LEBESGUE INTEGRAL

§ 1. ABSOLUTELY CONTINUOUS FUNCTIONS

We now take up a special class of functions of finite variation, the class of absolutely continuous functions. These functions are important for a number of applications, and are in addition interesting on their own account.

DEFINITION. Let $f(x)$ be a finite function defined on the closed interval $[a, b]$. Suppose that for every $\varepsilon > 0$, there exists a $\delta > 0$ such that

$$\Bigl|\sum_{k=1}^{n} \{f(b_k) - f(a_k)\}\Bigr| < \varepsilon \tag{1}$$

for all numbers $a_1, b_1, \ldots, a_n, b_n$ such that

$$a_1 < b_1 \le a_2 < b_2 \le \ldots \le a_n < b_n \quad \text{and} \quad \sum_{k=1}^{n} (b_k - a_k) < \delta. \tag{2}$$

Then the function $f(x)$ is said to be absolutely continuous.

It is evident that every absolutely continuous function is continuous, since the case $n = 1$ is not excluded in the above definition. We shall show in the sequel that there are, however, continuous functions which are not absolutely continuous. Without altering the sense of the definition, we can replace condition (1) by the stronger condition

$$\sum_{k=1}^{n} |f(b_k) - f(a_k)| < \varepsilon. \tag{3}$$

In fact, let the number $\delta > 0$ be such that the inequality

$$\Bigl|\sum_{k=1}^{n} \{f(b_k) - f(a_k)\}\Bigr| < \frac{\varepsilon}{2}$$

follows from (2). Then, taking an arbitrary system of pairwise disjoint open intervals $\{(a_k, b_k)\}$ $(k = 1, 2, \ldots, n)$ for which (2) holds, we divide this system into parts $A$ and $B$; we put into $A$ those intervals $(a_k, b_k)$ for which $f(b_k) - f(a_k) > 0$, and into $B$ all remaining intervals of the system. By virtue of the obvious relations

$$\sum_{A} |f(b_k) - f(a_k)| = \Bigl|\sum_{A} \{f(b_k) - f(a_k)\}\Bigr| < \frac{\varepsilon}{2},$$

$$\sum_{B} |f(b_k) - f(a_k)| = \Bigl|\sum_{B} \{f(b_k) - f(a_k)\}\Bigr| < \frac{\varepsilon}{2},$$

it is clear that (3) holds.
Since all terms of the sum (3) are non-negative, and their number is arbitrary, it is clear that to each $\varepsilon > 0$ there corresponds a $\delta > 0$ such that for an arbitrary finite or denumerable system of pairwise disjoint open intervals $\{(a_k, b_k)\}$, for which

$$\sum_{k} (b_k - a_k) < \delta,$$

the inequality

$$\sum_{k} |f(b_k) - f(a_k)| < \varepsilon$$

holds.

It is possible to replace the increments $|f(b_k) - f(a_k)|$ in the definition of absolute continuity by the oscillations of $f(x)$ in the intervals $[a_k, b_k]$. Let us prove this assertion. Let $m_k$ and $M_k$ be the least and greatest values, respectively, of the function $f(x)$ in the interval $[a_k, b_k]$. Then there exist points $\alpha_k$ and $\beta_k$ in $[a_k, b_k]$ such that

$$f(\alpha_k) = m_k, \qquad f(\beta_k) = M_k.$$

Since the sum of the lengths of the intervals $(\alpha_k, \beta_k)$ is less than or equal to the sum of the lengths of the intervals $(a_k, b_k)$, it is obvious that

$$\sum_{k} |f(\beta_k) - f(\alpha_k)| < \varepsilon.$$

Hence, if the function $f(x)$ is absolutely continuous, to every $\varepsilon > 0$ there corresponds a $\delta > 0$ such that for an arbitrary finite or denumerable system of pairwise disjoint intervals $\{(a_k, b_k)\}$ for which

$$\sum_{k} (b_k - a_k) < \delta,$$

the inequality

$$\sum_{k} \omega_k < \varepsilon$$

holds. ($\omega_k$ designates as usual the oscillation of $f(x)$ in $[a_k, b_k]$.)

A function $f(x)$ satisfying the Lipschitz condition

$$|f(x'') - f(x')| \le K\,|x'' - x'|$$

is a simple example of an absolutely continuous function.

THEOREM 1. If the functions $f(x)$ and $g(x)$ are absolutely continuous, then their sum, difference, and product are also absolutely continuous. If $g(x)$ vanishes nowhere, then the quotient $f(x)/g(x)$ is absolutely continuous.

Proof. The absolute continuity of the sum and difference follows at once from the fact that

$$|\{f(b_k) \pm g(b_k)\} - \{f(a_k) \pm g(a_k)\}| \le |f(b_k) - f(a_k)| + |g(b_k) - g(a_k)|.$$

Furthermore, if $A$ and $B$ are upper bounds for $|f(x)|$ and $|g(x)|$, then

$$|f(b_k)\,g(b_k) - f(a_k)\,g(a_k)| \le |g(b_k)|\cdot|f(b_k) - f(a_k)| + |f(a_k)|\cdot|g(b_k) - g(a_k)| \le B\,|f(b_k) - f(a_k)| + A\,|g(b_k) - g(a_k)|,$$

from which the absolute continuity of $f(x)\,g(x)$ follows. Finally, if $g(x)$ vanishes nowhere, then $|g(x)| \ge \sigma > 0$, from which it follows that

$$\Bigl|\frac{1}{g(b_k)} - \frac{1}{g(a_k)}\Bigr| \le \frac{|g(b_k) - g(a_k)|}{\sigma^2}.$$

The function $1/g(x)$ is therefore absolutely continuous and the function $f(x)/g(x) = f(x)\cdot\frac{1}{g(x)}$ is absolutely continuous, being the product of two absolutely continuous functions.

If $f(x)$ is absolutely continuous on $[a, b]$ and $F(y)$ is absolutely continuous on $[\min f, \max f]$, then the composite function $F(f(x))$ may or may not be absolutely continuous. We shall return to this subject later, and for the time being content ourselves with pointing out two simple conditions under which $F(f(x))$ is absolutely continuous.

THEOREM 2. Let $f(x)$ be an absolutely continuous function defined on the closed interval $[a, b]$. Let the values of $f(x)$ lie in the closed interval $[A, B]$. If $F(y)$ is a function defined on the segment $[A, B]$ which satisfies the Lipschitz condition, then the composite function $F[f(x)]$ is absolutely continuous.

Proof. If $|F(y'') - F(y')| \le K\,|y'' - y'|$, then for an arbitrary system of pairwise disjoint open intervals $(a_k, b_k)$, the inequality

$$\sum_{k=1}^{n} |F[f(b_k)] - F[f(a_k)]| \le K \sum_{k=1}^{n} |f(b_k) - f(a_k)|$$

holds. The right-hand member of this inequality becomes arbitrarily small together with $\sum_{k=1}^{n} (b_k - a_k)$.
THEOREM 3. Let $f(x)$ be an absolutely continuous function defined on $[a, b]$ and suppose that $f(x)$ is strictly increasing. If $F(y)$ is absolutely continuous on $[f(a), f(b)]$, then the function $F[f(x)]$ is absolutely continuous on $[a, b]$.

Proof. Let $\varepsilon$ be an arbitrary number $> 0$, and let $\delta > 0$ have the property that for an arbitrary system of pairwise disjoint open intervals $(A_k, B_k)$, for which

$$\sum_{k=1}^{n} (B_k - A_k) < \delta,$$

the inequality

$$\sum_{k=1}^{n} |F(B_k) - F(A_k)| < \varepsilon$$

holds. Then, for this $\delta$, there exists a number $\eta > 0$ such that the inequality

$$\sum_{k=1}^{m} (b_k - a_k) < \eta$$

implies the inequality

$$\sum_{k=1}^{m} |f(b_k) - f(a_k)| < \delta,$$

provided that the intervals $(a_k, b_k)$ are pairwise disjoint. Next, select an arbitrary system of intervals $(a_k, b_k)$, which are pairwise disjoint and for which the sum of the lengths is less than $\eta$. The intervals $(f(a_k), f(b_k))$ are also pairwise disjoint (this fact is essential) and the sum of their lengths is less than $\delta$. Hence

$$\sum_{k=1}^{m} |F[f(b_k)] - F[f(a_k)]| < \varepsilon.$$

This completes the proof.
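The key inequality in the proof of Theorem 2, namely $\sum |F[f(b_k)] - F[f(a_k)]| \le K \sum |f(b_k) - f(a_k)|$, lends itself to a quick numerical check. The sketch below is only an illustration under assumed choices that do not come from the text: $f(x) = x^2$ on $[0, 1]$, $F(y) = \sin y$ (Lipschitz with $K = 1$), and a hypothetical helper `disjoint_intervals` that pairs up sorted random points.

```python
import math
import random


def disjoint_intervals(n, lo=0.0, hi=1.0, rng=None):
    """Return n pairwise disjoint intervals (a_k, b_k) inside [lo, hi]."""
    rng = rng or random.Random(0)
    pts = sorted(rng.uniform(lo, hi) for _ in range(2 * n))
    # Consecutive pairs of sorted points cannot overlap.
    return [(pts[2 * k], pts[2 * k + 1]) for k in range(n)]


def increment_sum(func, intervals):
    """Sum of the absolute increments |func(b_k) - func(a_k)|."""
    return sum(abs(func(b) - func(a)) for a, b in intervals)


def f(x):
    """Inner function, absolutely continuous on [0, 1] (assumed example)."""
    return x * x


def F(y):
    """Outer function satisfying a Lipschitz condition with K = 1."""
    return math.sin(y)


K = 1.0


def composite(x):
    return F(f(x))


if __name__ == "__main__":
    ivs = disjoint_intervals(25)
    print(increment_sum(composite, ivs), "<=", K * increment_sum(f, ivs))
```

Because the bound holds term by term, the increments of the composite function are controlled by those of $f$, which is all the proof of Theorem 2 uses.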
§ 2. DIFFERENTIAL PROPERTIES OF ABSOLUTELY CONTINUOUS FUNCTIONS
THEOREM 1. Every absolutely continuous function has finite variation.¹

Proof. Let $f(x)$ be an absolutely continuous function defined on the closed interval $[a, b]$. Choose a $\delta > 0$ such that for every system of pairwise disjoint open intervals $\{(a_k, b_k)\}$ for which $\sum_{k=1}^{n} (b_k - a_k) < \delta$, the inequality

$$\sum_{k=1}^{n} |f(b_k) - f(a_k)| < 1$$

obtains. Subdivide $[a, b]$ by means of points

$$c_0 = a < c_1 < c_2 < \ldots < c_N = b$$

into parts such that

$$c_{k+1} - c_k < \delta \quad (k = 0, 1, \ldots, N-1).$$

Then, for every subdivision of the segment $[c_k, c_{k+1}]$, the sum of the absolute increments for $f(x)$ on these parts is less than 1. It follows that

$$V_{c_k}^{c_{k+1}}(f) \le 1,$$

and therefore

$$V_a^b(f) \le N.$$
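Theorem 1 shows that a continuous function of infinite variation cannot be absolutely continuous, and the footnote to the theorem cites $x \cos\frac{\pi}{2x}$ as such a function. The following sketch, under the assumption that the function is extended by the value 0 at $x = 0$ and considered on $[0, 1]$, evaluates the variation sums over the subdivision $0 < \frac{1}{2n} < \frac{1}{2n-1} < \ldots < \frac{1}{2} < 1$. The function names are illustrative only.

```python
import math


def f(x: float) -> float:
    """x * cos(pi / (2x)), extended continuously by f(0) = 0."""
    return 0.0 if x == 0.0 else x * math.cos(math.pi / (2.0 * x))


def variation_sum(n: int) -> float:
    """Sum of |f(q) - f(p)| over the subdivision 0, 1/(2n), ..., 1/2, 1."""
    points = [0.0] + [1.0 / m for m in range(2 * n, 0, -1)]
    return sum(abs(f(q) - f(p)) for p, q in zip(points, points[1:]))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, variation_sum(n))
```

At the points $1/m$ the function vanishes for odd $m$ and equals $\pm 1/m$ for even $m$, so the sum for this subdivision is the harmonic sum $1 + \frac{1}{2} + \ldots + \frac{1}{n}$, which grows without bound.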
COROLLARY. Let $f(x)$ be a function which is absolutely continuous on $[a, b]$. Then the derivative $f'(x)$ exists and is finite at almost every point of $[a, b]$. Furthermore, the function $f'(x)$ is summable on $[a, b]$.

THEOREM 2. If the derivative $f'(x)$ of an absolutely continuous function $f(x)$ is zero almost everywhere, then the function $f(x)$ is constant.

Proof. Let $E$ be the set of those points of $(a, b)$ for which $f'(x) = 0$. Let $\varepsilon > 0$. If $x \in E$, then for all sufficiently small $h > 0$,

$$\frac{|f(x+h) - f(x)|}{h} < \varepsilon. \tag{*}$$

It is clear that the closed intervals $[x, x+h]$ for which $h > 0$ and condition (*) is satisfied cover the set $E$ in Vitali's sense. Hence we can select from them a finite number of pairwise disjoint closed intervals

$$d_1 = [x_1, x_1 + h_1], \quad d_2 = [x_2, x_2 + h_2], \quad \ldots, \quad d_n = [x_n, x_n + h_n],$$

lying in the open interval $(a, b)$ and such that the outer measure of the part of the set $E$ not covered by them is less than an arbitrary preassigned number $\delta > 0$. We may suppose without loss of generality that $x_k < x_{k+1}$. If

$$[a, x_1), \quad (x_1 + h_1, x_2), \quad \ldots, \quad (x_{n-1} + h_{n-1}, x_n), \quad (x_n + h_n, b] \tag{1}$$

are the intervals which remain after removing from $[a, b]$ all intervals $d_k$ $(k = 1, 2, \ldots, n)$, then the sum of the lengths of these intervals will necessarily be less than $\delta$. This follows from the fact that

$$b - a = mE \le \sum_{k=1}^{n} m\,d_k + m^*\Bigl[E - \sum_{k=1}^{n} d_k\Bigr] < \sum_{k=1}^{n} m\,d_k + \delta.$$

This in turn implies

$$\sum_{k=1}^{n} m\,d_k > b - a - \delta.$$

Recall now that the function $f(x)$ is absolutely continuous. Hence, the number $\delta$ can be chosen so small that the sum of the increments of $f(x)$ on the intervals (1) is less than $\varepsilon$:

$$\Bigl|\{f(x_1) - f(a)\} + \sum_{k=1}^{n-1} \{f(x_{k+1}) - f(x_k + h_k)\} + \{f(b) - f(x_n + h_n)\}\Bigr| < \varepsilon. \tag{2}$$

On the other hand, the definition of the segments $d_k$ shows that

$$|f(x_k + h_k) - f(x_k)| < \varepsilon\,h_k,$$

from which we infer that

$$\Bigl|\sum_{k=1}^{n} \{f(x_k + h_k) - f(x_k)\}\Bigr| < \varepsilon\,(b - a) \tag{3}$$

(since $\sum h_k = \sum m\,d_k \le b - a$). Upon combining (2) and (3), we have

$$|f(b) - f(a)| < \varepsilon\,(1 + b - a);$$

since $\varepsilon$ is arbitrary, it follows that $f(b) = f(a)$. This reasoning can be carried out for every interval $[a, x]$ such that $a < x \le b$. For arbitrary $x \in (a, b]$, therefore, we have $f(x) = f(a)$, and $f(x)$ is a constant.²

¹ This theorem implies the existence of continuous functions which are not absolutely continuous (for example, $x \cos\frac{\pi}{2x}$ is such a function: see §3, Chapt. VIII).

² It follows from the theorem just proved that the continuous function $\Theta(x)$ constructed in §2, Chapter VIII is not absolutely continuous.
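The failure of absolute continuity mentioned in the second footnote can be observed directly. The sketch below assumes that the function constructed in §2, Chapter VIII is the classical Cantor staircase; the names `cantor` and `stage_intervals` are hypothetical helpers. It evaluates the staircase exactly at the endpoints of the $2^n$ intervals of the $n$-th stage of the Cantor construction, using rational arithmetic.

```python
from fractions import Fraction


def cantor(x: Fraction, max_digits: int = 64) -> Fraction:
    """Cantor staircase at a point of [0, 1] with a short ternary expansion."""
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(max_digits):
        if x == 0:
            break
        x *= 3
        digit = int(x)          # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:          # inside a removed middle third: value is fixed
            return value + scale
        if digit == 2:
            value += scale
        scale /= 2
    return value


def stage_intervals(n: int):
    """The 2**n closed intervals left after n steps of the Cantor construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined += [(a, a + third), (b - third, b)]
        intervals = refined
    return intervals


if __name__ == "__main__":
    for n in (1, 4, 8):
        ivs = stage_intervals(n)
        total_length = sum(b - a for a, b in ivs)
        total_increment = sum(cantor(b) - cantor(a) for a, b in ivs)
        print(n, total_length, total_increment)
```

The total length of the intervals is $(2/3)^n$, which can be made smaller than any $\delta$, while the sum of the increments of the staircase over them stays equal to 1; no $\delta$ can therefore serve for $\varepsilon < 1$ in the definition of absolute continuity.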
COROLLARY. If the derivatives $f'(x)$ and $g'(x)$ of two absolutely continuous functions $f(x)$ and $g(x)$ are equivalent, then the difference of these functions is constant.

In fact, if we remove from $[a, b]$ the set of points (of zero measure) at which at least one of the functions $f(x)$ or $g(x)$ does not have a finite derivative or for which their derivatives are not equal, then for every remaining point, we have

$$\{f(x) - g(x)\}' = 0.$$
§ 3. CONTINUOUS MAPPINGS
In §2, Chapter VIII, we had occasion to consider images of point sets under certain mappings. We consider here a number of more refined properties of various image sets under continuous mappings. To avoid repetition, we agree once for all that $f(x)$ is a continuous function defined on the closed interval $[a, b]$.

THEOREM 1. The image $f(F)$ of a closed set $F$ is a closed set.

Proof. Let $y_0$ be a limit point of the set $f(F)$,

$$y_0 = \lim_{n\to\infty} y_n \quad [y_n \in f(F)].$$

For every point $y_n$, let $x_n$ be a point in $F$ such that $f(x_n) = y_n$. Since the sequence $\{x_n\}$ is bounded, there exists a convergent subsequence $\{x_{n_k}\}$:

$$\lim x_{n_k} = x_0.$$

Since the set $F$ is closed, it follows that $x_0 \in F$ and so $f(x_0) \in f(F)$. On the other hand, since $f(x)$ is a continuous function,

$$\lim y_{n_k} = \lim f(x_{n_k}) = f(x_0),$$

so that $y_0 = f(x_0)$ and $y_0 \in f(F)$. The set $f(F)$ therefore contains all of its limit points and is closed.

Combining this theorem with Theorem 1, §2, Chapter VIII, we have the following fact.

COROLLARY. If $E$ is a set of type $F_\sigma$, then its map $f(E)$ is a set of type $F_\sigma$.

Now let us consider the question of whether the property of measurability is invariant under a continuous mapping. To answer this question, the following definition, due to N. N. Luzin, will be helpful.

DEFINITION. If the map $f(e)$ of every set $e$ of measure zero is also a set of measure zero, then the function $f(x)$ is said to have the property (N).

THEOREM 2. In order that the map $f(E)$ of every measurable set $E$ be a measurable set, it is necessary and sufficient that the function $f(x)$ possess the property (N).

Proof. Let $f(x)$ possess the property (N) and let $E$ be a measurable set lying in $[a, b]$. Then

$$E = A + e,$$

where $A$ is a set of type $F_\sigma$ and $e$ is a set of measure 0.³ Therefore,

$$f(E) = f(A) + f(e),$$

and consequently the set $f(E)$ is measurable, being the sum of an $F_\sigma$ and a set of measure zero.

Suppose now that the function $f(x)$ does not possess the property (N). Then we can find a set $e_0$ of measure zero lying in the closed interval $[a, b]$ such that the outer measure of its image under $f$ is positive:

$$m^* f(e_0) > 0.$$

Under these conditions, the set $f(e_0)$ contains a non-measurable subset $B$.⁴ For every $y \in B$, consider an element $x \in e_0$ such that $f(x) = y$. The set of all such points $x$ comprises a set $A$ contained in $e_0$ such that $f(A) = B$. It is obvious that $A$ is measurable. In fact, $A$ is a subset of $e_0$ and hence has outer measure zero. At the same time, $f(A) = B$ is non-measurable, so that the function $f(x)$ carries at least one measurable set into a non-measurable set.

THEOREM 3. All absolutely continuous functions possess property (N).

Proof. Let the function $f(x)$ be absolutely continuous and let $E$ be a set of measure zero. We shall show that

$$m f(E) = 0.$$

We first suppose that the points $a$ and $b$ do not belong to $E$, so that $E \subset (a, b)$.
Taking an arbitrary $\varepsilon > 0$, let $\delta > 0$ have the property that for an arbitrary finite or denumerable system of pairwise disjoint intervals $\{(a_k, b_k)\}$, the sum of the lengths of which is less than $\delta$, the inequality

$$\sum_{k} (M_k - m_k) < \varepsilon$$

holds, where as usual

$$m_k = \min_{a_k \le x \le b_k} \{f(x)\}, \qquad M_k = \max_{a_k \le x \le b_k} \{f(x)\}.$$

(See remarks following the definition in §1.) Since $mE = 0$, there exists a bounded open set $G$ such that

$$E \subset G, \qquad mG < \delta.$$

We may suppose that $G \subset (a, b)$, since $E$ by hypothesis is contained in the open interval $(a, b)$. Now, $G$ is the sum of its component intervals $(a_k, b_k)$, the sum of the lengths of which is less than $\delta$. This implies that

$$f(E) \subset f(G) = \sum_{k} f[(a_k, b_k)] \subset \sum_{k} f([a_k, b_k]).$$

Therefore

$$m^* f(E) \le \sum_{k} m^* f([a_k, b_k]).$$

It is also clear that

$$m f([a_k, b_k]) = M_k - m_k,$$

and consequently

$$m^* f(E) \le \sum_{k} (M_k - m_k) < \varepsilon.$$

It follows from this that $m f(E) = 0$, since $\varepsilon$ is arbitrary. To establish the general case, it suffices to note that omitting the points $a$ and $b$ from the set $E$ implies the removal from the set $f(E)$ of not more than two points, $f(a)$ and $f(b)$; this obviously has no effect on the measure of the set $f(E)$.

COROLLARY. An absolutely continuous function maps measurable sets into measurable sets.

³ To show this, let $F_n$ be a closed subset of $E$ such that $mF_n > mE - \frac{1}{n}$, for $n = 1, 2, 3, \ldots$. Then set $A = \sum_{n=1}^{\infty} F_n$ and $e = E - A$.

⁴ If $f(e_0)$ is non-measurable, set $B = f(e_0)$; otherwise apply the statement proved at the end of §6, Chapt. III.
We have now proved that every absolutely continuous function has finite variation and possesses the property (N). It turns out that these two properties characterize the class of absolutely continuous functions.

THEOREM 4 (S. BANACH AND M. A. ZARECKI). If $f(x)$ is a continuous function of finite variation possessing the property (N), then it is absolutely continuous.

Proof. Suppose that $f(x)$ is not absolutely continuous. Then there exists a positive number $\varepsilon_0$ such that for every $\delta > 0$, there exists a family of pairwise disjoint open intervals $\{(a_k, b_k)\}_{k=1}^{n}$ for which

$$\sum_{k=1}^{n} (b_k - a_k) < \delta$$

and having the additional property that

$$\sum_{k=1}^{n} (M_k - m_k) \ge \varepsilon_0.$$

Let

$$\sum_{i=1}^{\infty} \delta_i$$

be a convergent series of positive terms, and for every $\delta_i$, let $(a_k^{(i)}, b_k^{(i)})$ $(k = 1, 2, \ldots, n_i)$ be a collection of pairwise disjoint open intervals for which

$$\sum_{k=1}^{n_i} (b_k^{(i)} - a_k^{(i)}) < \delta_i$$

and

$$\sum_{k=1}^{n_i} (M_k^{(i)} - m_k^{(i)}) \ge \varepsilon_0.$$

[As usual, $M_k^{(i)}$ and $m_k^{(i)}$ are the maximum and minimum values, respectively, of the function $f(x)$ in the interval $[a_k^{(i)}, b_k^{(i)}]$.]

Set

$$E_i = \sum_{k=1}^{n_i} (a_k^{(i)}, b_k^{(i)}), \qquad A = \prod_{n=1}^{\infty} \sum_{i=n}^{\infty} E_i.$$
It is easy to see that $mA = 0$, since $mA \le \sum_{i=n}^{\infty} mE_i < \sum_{i=n}^{\infty} \delta_i$ for every $n$; it then follows from our hypotheses that

$$m f(A) = 0. \tag{1}$$

We next define functions $L_k^{(i)}(y)$ $[k = 1, 2, \ldots, n_i;\ i = 1, 2, 3, \ldots]$ by the following rule. $L_k^{(i)}(y) = 1$ if there is at least one $x$ in the open interval $(a_k^{(i)}, b_k^{(i)})$ for which

$$f(x) = y. \tag{2}$$

Otherwise, $L_k^{(i)}(y) = 0$. Plainly $L_k^{(i)}(y) = 1$ for all $y$ in the open interval $(m_k^{(i)}, M_k^{(i)})$ and $L_k^{(i)}(y) = 0$ for $y$ not in the closed interval $[m_k^{(i)}, M_k^{(i)}]$. Therefore

$$\int_m^M L_k^{(i)}(y)\,dy = M_k^{(i)} - m_k^{(i)} \tag{3}$$

(where $m$ and $M$ denote the least and greatest values of $f(x)$ on $[a, b]$). Let

$$N_i(y) = \sum_{k=1}^{n_i} L_k^{(i)}(y).$$

It is clear that $N_i(y)$ is the number of those intervals $(a_k^{(i)}, b_k^{(i)})$ containing at least one $x$ satisfying equation (2). Hence

$$N_i(y) \le N(y), \tag{4}$$

where $N(y)$ is the Banach indicatrix of the function $f(x)$. Because of (3),

$$\int_m^M N_i(y)\,dy \ge \varepsilon_0. \tag{5}$$

To complete the proof, we shall show that for almost all $y$ in $[m, M]$, the equality

$$\lim_{i\to\infty} N_i(y) = 0 \tag{6}$$

holds. Since the Banach indicatrix $N(y)$ is summable, it will follow from (4) and (6) and Lebesgue's convergence theorem that

$$\lim_{i\to\infty} \int_m^M N_i(y)\,dy = 0;$$

this contradicts inequality (5).

Let $B$ be the set of $y$ for which (6) does not hold, and let $C$ be the set of $y$ for which $N(y) = +\infty$. Since $N(y)$ is a summable function, $mC = 0$, and to prove the theorem it is sufficient to verify that

$$B - C \subset f(A). \tag{7}$$

Let $y_0 \in B - C$. Since the functions $N_i(y)$ assume only non-negative integral values, there is a sequence $\{i_r\}$ of natural numbers such that

$$N_{i_r}(y_0) \ge 1 \quad (r = 1, 2, 3, \ldots).$$

For every $r$, accordingly, there exists a point $x_{i_r}$ such that

$$f(x_{i_r}) = y_0, \qquad x_{i_r} \in E_{i_r}.$$

Since $N(y_0) < +\infty$, there are only a finite number of distinct points among the points $x_{i_r}$. Hence, one of them, say $x_0$, occurs an infinite number of times in the sequence $\{x_{i_r}\}$. The point $x_0$ belongs to an infinite number of the sets $E_i$, and clearly $f(x_0) = y_0$. It is then clear that $x_0 \in A$ and that $f(x_0) = y_0 \in f(A)$. This verifies the inclusion (7) and completes the proof.

THEOREM 5 (G. M. FICHTENHOLZ). Let $F(y)$ and $f(x)$ be two absolutely continuous functions such that the values of $f(x)$ all lie in the closed interval on which $F(y)$ is defined. The composite function $F[f(x)]$ is absolutely continuous if and only if it has finite variation.⁶

Proof. The necessity of the stated condition is obvious. To prove its sufficiency, we need only note that if $f$ and $F$ both possess property (N), then the composite function $F[f(x)]$ also possesses property (N). Then apply Theorem 4.
§4. THE INDEFINITE LEBESGUE INTEGRAL
Let $f(t)$ be a summable function defined on the closed interval $[a, b]$. The function

$$\Phi(x) = C + \int_a^x f(t)\,dt$$

is called an indefinite Lebesgue integral of the function $f(t)$, for every choice of the constant $C$. The term 'indefinite' refers to the variable upper limit of the integral.

THEOREM 1. The indefinite integral $\Phi(x)$ is an absolutely continuous function.

Proof. For every $\varepsilon > 0$, there exists, in view of Theorem 8, §2, Chapt. VI, a $\delta > 0$ such that for every measurable set $e$ of measure $me < \delta$, the inequality

$$\Bigl|\int_e f(t)\,dt\Bigr| < \varepsilon$$

holds. In particular, if the sum of the lengths of a finite system $(a_k, b_k)$ of pairwise disjoint open intervals is less than $\delta$, then

$$\Bigl|\sum_{k=1}^{n} \int_{a_k}^{b_k} f(t)\,dt\Bigr| < \varepsilon.$$

Since

$$\int_{a_k}^{b_k} f(t)\,dt = \Phi(b_k) - \Phi(a_k),$$

we infer

$$\Bigl|\sum_{k=1}^{n} \{\Phi(b_k) - \Phi(a_k)\}\Bigr| < \varepsilon.$$

That is, $\Phi(x)$ is absolutely continuous.

The preceding theorem, together with the corollary to Theorem 1, §2, implies that $\Phi(x)$ has a finite derivative almost everywhere, which is itself a summable function of $x$. This derivative can be identified completely, as the following theorem shows.

⁶ This theorem was discovered by G. M. Fichtenholz in 1922. In 1925, a new proof was given by M. A. Zarecki, who at the same time obtained Theorem 4 above. Theorem 4 was also proved by S. Banach.
THEOREM 2. Let $f(x)$ be a summable function on $[a, b]$. The derivative $\Phi'(x)$ of the indefinite integral

$$\Phi(x) = \int_a^x f(t)\,dt$$

is equal to the function $f(x)$ almost everywhere on $[a, b]$.

Proof. Let $p$ and $q$ be two real numbers such that $p < q$. Let $E_{p,q}$ be the set of those points of $[a, b]$ where the function $\Phi(x)$ is differentiable and where its derivative $\Phi'(x)$ satisfies the inequalities

$$\Phi'(x) > q > p > f(x).$$

It is clear that the set $E_{p,q}$ is measurable. Our first problem is to prove that

$$mE_{p,q} = 0. \tag{1}$$

For this purpose, let $\varepsilon$ be an arbitrary positive number, and let $\delta > 0$ have the properties that $\delta < \varepsilon$ and that

$$\Bigl|\int_e f(t)\,dt\Bigr| < \varepsilon$$

whenever $me < \delta$. Let $G$ be an open set⁷ such that $G \subset [a, b]$ and

$$G \supset E_{p,q}, \qquad mG < mE_{p,q} + \delta.$$

If $x \in E_{p,q}$, then

$$\frac{\Phi(x+h) - \Phi(x)}{h} > q \tag{2}$$

for all sufficiently small $h > 0$. It is clear that the set $E_{p,q}$ is covered by the closed intervals $[x, x+h]$ [for positive $h$ satisfying condition (2)] in the sense of Vitali. We may suppose that all of the intervals $[x, x+h]$ are contained in $G$. By Vitali's Theorem, there exists a sequence

$$[x_1, x_1 + h_1], \quad [x_2, x_2 + h_2], \quad \ldots$$

of these intervals which are pairwise disjoint and for which

$$m\Bigl\{E_{p,q} - \sum_{k=1}^{\infty} [x_k, x_k + h_k]\Bigr\} = 0.$$

Because of (2), we have

$$\frac{1}{h_k} \int_{x_k}^{x_k + h_k} f(t)\,dt > q.$$

Set $S = \sum_{k=1}^{\infty} [x_k, x_k + h_k]$. Then the last inequality implies that

$$\int_S f(t)\,dt > q \cdot mS,$$

or, equivalently,

$$\int_S f(t)\,dt > q\,[mE_{p,q} + \theta\delta] \quad (0 \le \theta \le 1). \tag{3}$$

On the other hand, we have $S \subset G$, and therefore $S - E_{p,q} \subset G - E_{p,q}$. Hence $m[S - E_{p,q}] < \delta$, and

$$\int_{S - E_{p,q}} f(t)\,dt < \varepsilon.$$

Consequently,⁸

$$\int_S f(t)\,dt < \int_{E_{p,q}} f(t)\,dt + \varepsilon. \tag{4}$$

On the set $E_{p,q}$, we have $f(t) < p$ and hence

$$\int_{E_{p,q}} f(t)\,dt \le p \cdot mE_{p,q}. \tag{5}$$

Relations (3), (4) and (5) imply that

$$q\,[mE_{p,q} + \theta\delta] < p \cdot mE_{p,q} + \varepsilon.$$

Since $\delta < \varepsilon$ and $\varepsilon$ is arbitrary, this gives $q \cdot mE_{p,q} \le p \cdot mE_{p,q}$, which in view of $p < q$ is possible only if $mE_{p,q} = 0$. This proves (1).

Now let $E$ be the set of points at which $\Phi'(x)$ exists and $\Phi'(x) > f(x)$. Then

$$E = \sum_{(p,q)} E_{p,q},$$

where the summation runs through all pairs $(p, q)$ of rational numbers for which $p < q$. By virtue of (1), we have

$$mE = 0.$$

In other words, if $A$ is the set of points at which the derivative $\Phi'(x)$ exists, then

$$\Phi'(x) \le f(x) \tag{6}$$

almost everywhere on $A$. Finally, set

$$g(x) = -f(x), \qquad \Gamma(x) = \int_a^x g(t)\,dt.$$

It is easy to see that $\Gamma(x) = -\Phi(x)$, so that $\Gamma'(x)$ exists at all points of $A$. Applying the assertion just proved to the function $\Gamma(x)$, we find that $\Gamma'(x) \le g(x)$ almost everywhere on $A$, or, equivalently, that

$$\Phi'(x) \ge f(x) \tag{7}$$

for almost all $x \in A$. From (6) and (7), it follows that $\Phi'(x) = f(x)$ for almost all $x \in A$, and hence for almost all $x \in [a, b]$. This completes the proof.

⁷ We may suppose that the points $a$ and $b$ are not in $E_{p,q}$.

⁸ Note that $m(E_{p,q} - S) = 0$. Therefore $\int_{E_{p,q}} f\,dt = \int_{S E_{p,q}} f\,dt$.

THEOREM 3. Every absolutely continuous function is an indefinite integral of its own derivative.

Proof. Let $F(x)$ be an absolutely continuous function. Its derivative $F'(x)$ exists almost everywhere and is summable. Write

$$\Phi(x) = F(a) + \int_a^x F'(t)\,dt.$$

The function $\Phi(x)$ is also absolutely continuous and, as proved in the preceding theorem, $\Phi'(x) = F'(x)$ almost everywhere. In view of the corollary of Theorem 2, §2, we infer that the difference $F(x) - \Phi(x)$ is constant; since this difference is 0 for $x = a$, the functions $F(x)$ and $\Phi(x)$ must be identical.

Theorem 2 can be considerably sharpened. We first state a definition.

DEFINITION. If

$$\lim_{h\to 0} \frac{1}{h} \int_x^{x+h} |f(t) - f(x)|\,dt = 0$$

at the point $x$, the point $x$ is said to be a Lebesgue point of the function $f(t)$.

THEOREM 4. Let $x$ be a Lebesgue point of the function $f(t)$. The indefinite integral $\Phi(x) = \int_a^x f(t)\,dt$ is differentiable at the point $x$, and $\Phi'(x) = f(x)$.
Proof It is easy to show that :z:+ll
=i J
therefore
(x} -f(x)J<:~
{f(t)-f(x)}dt;
:z:+h
J
!f(t)-f(x)Jdt,
:2;
and this proves the theorem. We note that the converse statement is not true, in general. THEOREM 5. If the function f(x) is summable on [a, b], then almost every point of [a, b] is a Lebesgue point off (x). Proof Let r be a rational number. The function I f(t) - r [ is summable on [a, b] and hence for almost all points x E [a, b], we have lim h-+0
~
:z:+h
J
lf(t)-rJdt=lf(x)-rJ.
(8)
256
IX. ABSOLUTELY CONTINUOUS FUNCTIONS. THE lNDEFlNlTE LEBESGUE INTEGRAL
Let $E(r)$ be the set of those points of $[a, b]$ at which (8) does not hold. It is clear that $mE(r) = 0$. We enumerate all rational numbers as a sequence $r_1, r_2, r_3, \ldots$, and put

$$E = \sum_{n=1}^{\infty} E(r_n) + E(|f| = +\infty).$$

Then $mE = 0$, and it is sufficient to prove that all points of the set $[a, b] - E$ are Lebesgue points of the function $f(t)$. Let $x_0 \in [a, b] - E$, and let $\varepsilon$ be an arbitrary positive number. Let $r_n$ be a rational number such that

$$|f(x_0) - r_n| < \frac{\varepsilon}{3}.$$

Then it is clear that

$$\bigl|\, |f(t) - r_n| - |f(t) - f(x_0)| \,\bigr| < \frac{\varepsilon}{3},$$

and we have

$$\left| \frac{1}{h} \int_{x_0}^{x_0+h} |f(t) - r_n|\,dt - \frac{1}{h} \int_{x_0}^{x_0+h} |f(t) - f(x_0)|\,dt \right| \le \frac{\varepsilon}{3}.$$

Since $x_0 \notin E$, we have

$$\left| \frac{1}{h} \int_{x_0}^{x_0+h} |f(t) - r_n|\,dt - |f(x_0) - r_n| \right| < \frac{\varepsilon}{3}$$

for $|h| < \delta(\varepsilon)$, and hence for $|h| < \delta(\varepsilon)$,

$$\frac{1}{h} \int_{x_0}^{x_0+h} |f(t) - r_n|\,dt < \frac{2}{3}\varepsilon,$$

i.e.,

$$\frac{1}{h} \int_{x_0}^{x_0+h} |f(t) - f(x_0)|\,dt < \varepsilon.$$

THEOREM 6. Every point of continuity of a summable function $f(t)$ is a Lebesgue point of $f(t)$.

Proof. Let $f(t)$ be continuous at the point $x$. Then to every $\varepsilon > 0$, there corresponds a $\delta > 0$ such that

$$|f(t) - f(x)| < \varepsilon \quad \text{for} \quad |t - x| < \delta.$$

For $|h| < \delta$, we have

$$\frac{1}{h} \int_x^{x+h} |f(t) - f(x)|\,dt < \varepsilon,$$

and the theorem follows.

Theorems 1 and 3 imply that for a function $\Phi(x)$ to be the indefinite integral of a summable function, it is necessary and sufficient that it be absolutely continuous. We may also try to characterize functions which are indefinite integrals of functions in $L_p$ for $p > 1$.
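The definition of a Lebesgue point can be explored numerically. The following Python sketch is only an illustration under stated assumptions: the helper names `mean_abs_deviation`, `jump` and `root_abs` are hypothetical, and the integral is approximated by a plain midpoint rule. It evaluates the mean $\frac{1}{h}\int_x^{x+h}|f(t)-f(x)|\,dt$ for a step function, where the jump point is not a Lebesgue point, and for $\sqrt{|t|}$, where the origin is a point of continuity and hence a Lebesgue point by Theorem 6.

```python
def mean_abs_deviation(f, x, h, n=2000):
    """Midpoint-rule approximation of (1/h) * integral from x to x+h of |f(t) - f(x)| dt.

    h may be negative; the signs of the step and of h cancel.
    """
    step = h / n
    total = 0.0
    for i in range(n):
        t = x + (i + 0.5) * step
        total += abs(f(t) - f(x))
    return total * step / h


def jump(t):
    """Step function: 0 for t < 0 and 1 for t >= 0 (illustrative choice)."""
    return 1.0 if t >= 0 else 0.0


def root_abs(t):
    """Continuous function sqrt(|t|), whose derivative is unbounded near 0."""
    return abs(t) ** 0.5


if __name__ == "__main__":
    for h in (1e-1, 1e-2, 1e-3):
        print(h,
              mean_abs_deviation(jump, 0.0, -h),   # stays near 1: 0 is not a Lebesgue point
              mean_abs_deviation(root_abs, 0.0, h))  # tends to 0, like (2/3) * sqrt(h)
```

For the step function the mean taken from the left of 0 stays at 1 however small $h$ is, so the limit in the definition fails there, while for $\sqrt{|t|}$ the mean decays like $\tfrac{2}{3}\sqrt{h}$.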
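THEOREM 7 (F. RIESZ). A function $F(x)$ $(a \le x \le b)$ is representable in the form

$$F(x) = C + \int_a^x f(t)\,dt, \tag{9}$$

where $f(t) \in L_p$ $(p > 1)$, if and only if, for every subdivision of $[a, b]$ by points $a = x_0 < x_1 < \cdots < x_n = b$, the inequality

$$\sum_{k=0}^{n-1} \frac{|F(x_{k+1}) - F(x_k)|^p}{(x_{k+1} - x_k)^{p-1}} \le K \tag{10}$$

holds, where $K$ is independent of the manner of subdividing $[a, b]$.⁹

Proof. The necessity of condition (10) is almost obvious. In fact, by Hölder's inequality [Chapter VII, §6, Formula (1)],

$$|F(x_{k+1}) - F(x_k)| = \left| \int_{x_k}^{x_{k+1}} f(t)\,dt \right| \le (x_{k+1} - x_k)^{1/q} \left( \int_{x_k}^{x_{k+1}} |f(t)|^p\,dt \right)^{1/p},$$

where $q = \frac{p}{p-1}$. Hence

$$\frac{|F(x_{k+1}) - F(x_k)|^p}{(x_{k+1} - x_k)^{p-1}} \le \int_{x_k}^{x_{k+1}} |f(t)|^p\,dt,$$

and (10) holds, where the number $K$ is the integral $\int_a^b |f(t)|^p\,dt$.

The sufficiency of condition (10) is more difficult to prove. First of all, we note that inequality (10) remains true if some of the terms on its left-hand side are omitted. Hence for an arbitrary finite system of pairwise disjoint open intervals $(a_k, b_k)$ $(k = 1, 2, \ldots, n)$ contained in $[a, b]$, we have

$$\sum_{k=1}^{n} \frac{|F(b_k) - F(a_k)|^p}{(b_k - a_k)^{p-1}} \le K.$$

By Hölder's inequality for sums [Chapt. VII, §6, Formula (8)], we have

$$\sum_{k=1}^{n} |F(b_k) - F(a_k)| = \sum_{k=1}^{n} \frac{|F(b_k) - F(a_k)|}{(b_k - a_k)^{\frac{p-1}{p}}} \, (b_k - a_k)^{\frac{p-1}{p}} \le \left[ \sum_{k=1}^{n} \frac{|F(b_k) - F(a_k)|^p}{(b_k - a_k)^{p-1}} \right]^{\frac{1}{p}} \left[ \sum_{k=1}^{n} (b_k - a_k) \right]^{\frac{p-1}{p}}.$$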
⁹ If $p = 1$, (10) is the condition that $F(x)$ have finite variation. Hence, this condition remains necessary but ceases to be sufficient for $F(x)$ to be representable in the form (9) for $f(t) \in L$.
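Therefore

$$\sum_{k=1}^{n} |F(b_k) - F(a_k)| \le K^{\frac{1}{p}} \left[ \sum_{k=1}^{n} (b_k - a_k) \right]^{\frac{p-1}{p}},$$

and accordingly, $F(x)$ is absolutely continuous. Therefore $F(x)$ is representable in the form (9) with $f(t) \in L$. It remains to show that $f(t) \in L_p$. For this purpose, we subdivide $[a, b]$ into $n$ equal parts by the points

$$x_k^{(n)} = a + \frac{k}{n}(b - a) \quad (k = 0, 1, \ldots, n),$$

and introduce the function $f_n(t)$, setting

$$f_n(t) = \frac{F(x_{k+1}^{(n)}) - F(x_k^{(n)})}{x_{k+1}^{(n)} - x_k^{(n)}} \quad \text{for} \quad x_k^{(n)} < t < x_{k+1}^{(n)}.$$

At the points of subdivision, we set $f_n(x_k^{(n)}) = 0$. It is easy to see that

$$\lim_{n \to \infty} f_n(t) = f(t)$$

almost everywhere. [The equality just stated may fail to be true at points of the form $x_k^{(n)}$ and at points where $F'(x) \ne f(x)$.] In fact, if $x$ is not a point of subdivision and if $F'(x)$ exists and is finite, then $x$ lies in some open interval $(x_{k_n}^{(n)}, x_{k_n+1}^{(n)})$ for all natural numbers $n$. Since

$$x_{k_n+1}^{(n)} - x_{k_n}^{(n)} = \frac{b - a}{n} \to 0 \quad \text{as} \quad n \to \infty,$$

it follows that each of the expressions

$$\frac{F(x_{k_n+1}^{(n)}) - F(x)}{x_{k_n+1}^{(n)} - x}, \qquad \frac{F(x) - F(x_{k_n}^{(n)})}{x - x_{k_n}^{(n)}} \tag{*}$$

converges to $F'(x)$ as $n \to \infty$. However,

$$f_n(x) = \frac{F(x_{k_n+1}^{(n)}) - F(x_{k_n}^{(n)})}{x_{k_n+1}^{(n)} - x_{k_n}^{(n)}},$$

and accordingly the number $f_n(x)$ lies between the two numbers (*). Accordingly

$$\lim_{n \to \infty} f_n(x) = F'(x).$$

Fatou's theorem now implies that

$$\int_a^b |f(t)|^p\,dt \le \sup_n \left\{ \int_a^b |f_n(t)|^p\,dt \right\}.$$

We obtain an upper bound for the right-hand side of the preceding expression, as follows:

$$\int_a^b |f_n(t)|^p\,dt = \sum_{k=0}^{n-1} \int_{x_k^{(n)}}^{x_{k+1}^{(n)}} |f_n(t)|^p\,dt = \sum_{k=0}^{n-1} \frac{|F(x_{k+1}^{(n)}) - F(x_k^{(n)})|^p}{(x_{k+1}^{(n)} - x_k^{(n)})^{p-1}} \le K.$$

Therefore

$$\int_a^b |f(t)|^p\,dt < +\infty.$$

This completes the proof.

In conclusion, we compute the total variation of an indefinite integral.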
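Before turning to the total variation, condition (10) of Theorem 7 can be sanity-checked numerically. The sketch below is an illustration only, with assumptions chosen for convenience: $p = 2$ on $[0, 1]$, the integrand $f(t) = t^{-1/4}$ (which lies in $L_2$, with $K = \int_0^1 t^{-1/2}\,dt = 2$) and the integrand $f(t) = t^{-1/2}$ (which is summable but not in $L_2$). The function names are hypothetical.

```python
import random


def riesz_sum(F, points, p):
    """Left-hand side of (10): sum of |F(x_{k+1}) - F(x_k)|^p / (x_{k+1} - x_k)^(p-1)."""
    return sum(
        abs(F(b) - F(a)) ** p / (b - a) ** (p - 1)
        for a, b in zip(points, points[1:])
    )


def random_partition(a, b, n, rng):
    """A subdivision of [a, b] with at most n random interior points."""
    inner = sorted({rng.uniform(a, b) for _ in range(n)})
    return [a] + [x for x in inner if a < x < b] + [b]


def geometric_partition(levels):
    """Points 0 < 2^-levels < ... < 1/2 < 1, clustering at the singularity t = 0."""
    return [0.0] + [2.0 ** (-k) for k in range(levels, -1, -1)]


def F_good(x):
    """Indefinite integral of t^(-1/4), an L_2 integrand."""
    return (4.0 / 3.0) * x ** 0.75


def F_bad(x):
    """Indefinite integral of t^(-1/2), summable but not in L_2."""
    return 2.0 * x ** 0.5


if __name__ == "__main__":
    rng = random.Random(0)
    print(max(riesz_sum(F_good, random_partition(0.0, 1.0, 50, rng), 2)
              for _ in range(20)))               # never exceeds K = 2
    for levels in (10, 20, 40):
        print(levels, riesz_sum(F_bad, geometric_partition(levels), 2))  # grows without bound
```

For `F_bad` each dyadic interval contributes the same positive amount to the sum, so no constant $K$ can serve for all subdivisions, in agreement with the theorem.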
THEOREM 8. Let $f(t)$ be a summable function defined on $[a, b]$. If

$$F(x) = \int_a^x f(t)\,dt,$$

then

$$\bigvee_a^b (F) = \int_a^b |f(t)|\,dt,$$

i.e., the total variation of an absolutely continuous function is the integral of the absolute value of its derivative.

Proof. If $x_0 = a < x_1 < x_2 < \cdots < x_n = b$ is any subdivision of $[a, b]$, then

$$\sum_{k=0}^{n-1} |F(x_{k+1}) - F(x_k)| = \sum_{k=0}^{n-1} \left| \int_{x_k}^{x_{k+1}} f(t)\,dt \right| \le \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}} |f(t)|\,dt = \int_a^b |f(t)|\,dt.$$

Accordingly,

$$\bigvee_a^b (F) \le \int_a^b |f(t)|\,dt.$$
In order to establish the reverse inequality, we set $(a, b) = E$ and let

$$P = E(f \ge 0), \qquad N = E(f < 0).$$

Then

$$\int_a^b |f(t)|\,dt = \int_P f(t)\,dt - \int_N f(t)\,dt.$$

Let $\varepsilon$ be an arbitrary positive number. Since the integral is absolutely continuous, there exists a $\delta > 0$ such that for every measurable set $e \subset [a, b]$ with measure $me < \delta$, the inequality

$$\int_e |f(t)|\,dt < \varepsilon$$

holds. Let $F(P)$ and $F(N)$ be closed sets, contained in $P$ and $N$ respectively, such that

$$m[P - F(P)] < \delta, \qquad m[N - F(N)] < \delta.$$

Then

$$\int_a^b |f(t)|\,dt < \int_{F(P)} f(t)\,dt - \int_{F(N)} f(t)\,dt + 2\varepsilon.$$

In accordance with the separation theorem for disjoint closed sets (Theorem 2, §4, Chapter II), one can find open sets $\Gamma(P)$ and $\Gamma(N)$ such that

$$\Gamma(P) \supset F(P), \qquad \Gamma(N) \supset F(N), \qquad \Gamma(P) \cdot \Gamma(N) = 0,$$

where the sets $\Gamma(P)$ and $\Gamma(N)$ are contained in $(a, b)$. Furthermore, there exist bounded open sets $A(P)$ and $A(N)$, containing $F(P)$ and $F(N)$, respectively, and such that $m[A(P) - F(P)] < \delta$, $m[A(N) - F(N)] < \delta$. Now set

$$G(P) = A(P) \cdot \Gamma(P), \qquad G(N) = A(N) \cdot \Gamma(N).$$

$G(P)$ and $G(N)$ are disjoint open sets contained in $(a, b)$, containing $F(P)$ and $F(N)$ respectively and such that $m[G(P) - F(P)] < \delta$, $m[G(N) - F(N)] < \delta$. Hence

$$\int_a^b |f(t)|\,dt < \int_{G(P)} f(t)\,dt - \int_{G(N)} f(t)\,dt + 4\varepsilon.$$
The set $G(P)$ is the sum of its component intervals. Taking a sufficiently large finite number of these intervals, we obtain a set $B(P)$ whose measure differs from that of $G(P)$ by less than $\delta$. Then we have

$$\left| \int_{B(P)} f(t)\,dt - \int_{G(P)} f(t)\,dt \right| < \varepsilon.$$

Let $B(P) = \sum_{k=1}^{n} (\lambda_k, \mu_k)$. Then

$$\int_{B(P)} f(t)\,dt = \sum_{k=1}^{n} \int_{\lambda_k}^{\mu_k} f(t)\,dt = \sum_{k=1}^{n} [F(\mu_k) - F(\lambda_k)].$$

Hence

$$\int_{G(P)} f(t)\,dt < \sum_{k=1}^{n} [F(\mu_k) - F(\lambda_k)] + \varepsilon.$$

In an analogous manner we can find a finite number of component intervals of the set $G(N)$, say $(\sigma_1, \tau_1), (\sigma_2, \tau_2), \ldots, (\sigma_m, \tau_m)$, such that

$$\int_{G(N)} f(t)\,dt > \sum_{i=1}^{m} [F(\tau_i) - F(\sigma_i)] - \varepsilon.$$

Combining all these statements, we see that

$$\int_a^b |f(t)|\,dt < \sum_{k=1}^{n} [F(\mu_k) - F(\lambda_k)] - \sum_{i=1}^{m} [F(\tau_i) - F(\sigma_i)] + 6\varepsilon,$$

and necessarily

$$\int_a^b |f(t)|\,dt < \sum_{k=1}^{n} |F(\mu_k) - F(\lambda_k)| + \sum_{i=1}^{m} |F(\tau_i) - F(\sigma_i)| + 6\varepsilon.$$

Since the intervals $(\lambda_k, \mu_k)$ and $(\sigma_i, \tau_i)$ are pairwise disjoint, it follows that

$$\sum_{k=1}^{n} |F(\mu_k) - F(\lambda_k)| + \sum_{i=1}^{m} |F(\tau_i) - F(\sigma_i)| \le \bigvee_a^b (F).$$

Thus

$$\int_a^b |f(t)|\,dt < \bigvee_a^b (F) + 6\varepsilon.$$

Since $\varepsilon$ is arbitrary, the theorem follows.

§ 5. POINTS OF DENSITY. APPROXIMATE CONTINUITY
Let $E$ be a given measurable set. Taking an arbitrary point $x_0$ and a number $h > 0$, we set

$$E(x_0, h) = E \cdot [x_0 - h,\ x_0 + h].$$

This set also is measurable. Let us consider the number

$$\frac{mE(x_0, h)}{2h}. \tag{1}$$

It is natural to consider this the "mean density" of the set $E$ on the closed interval $[x_0 - h, x_0 + h]$.

DEFINITION 1. The limit of (1) as $h \to 0$ is called the density of the set $E$ at the point $x_0$ and is denoted by $D_{x_0}E$. If $D_{x_0}E = 1$, then $x_0$ is a point of density of the set $E$, and if $D_{x_0}E = 0$, $x_0$ is a point of rarefaction of $E$.

In stating this definition, we do not assume that $x_0 \in E$. Furthermore, a measurable set should not be expected to have a defined density at every point of the line. However, the following theorem is valid.

THEOREM 1. Almost all points of a measurable set $E$ are points of density of $E$.

Proof. Let the set $E$ be measurable. Take an arbitrary closed interval $[\alpha, \beta]$ containing the set $E$, and let $a = \alpha - 1$, $b = \beta + 1$. Then, for $x \in E$ and $h < 1$, it is certain that the closed interval $[x - h, x + h]$ is contained in $[a, b]$. If the contrary is not specified, we shall suppose that $h < 1$. Consider the characteristic function $\varphi(x)$ of the set $E$,

$$\varphi(x) = \begin{cases} 1 & \text{if } x \in E, \\ 0 & \text{if } x \notin E, \end{cases}$$

taken only on $[a, b]$. This function is measurable and bounded. Let

$$\Phi(x) = \int_a^x \varphi(t)\,dt.$$

Then $\Phi'(x) = \varphi(x)$ almost everywhere on $[a, b]$, by Theorem 2, §4, and in particular

$$\Phi'(x) = 1 \tag{2}$$

almost everywhere on $E$. We shall show that points for which (2) holds are points of density of the set $E$. In fact at every such point,

$$\lim_{h \to +0} \frac{\Phi(x+h) - \Phi(x)}{h} = \lim_{h \to +0} \frac{\Phi(x) - \Phi(x-h)}{h} = 1,$$

and hence

$$\lim_{h \to +0} \frac{\Phi(x+h) - \Phi(x-h)}{2h} = 1.$$

But

$$\Phi(x+h) - \Phi(x-h) = \int_{x-h}^{x+h} \varphi(t)\,dt = mE(x, h),$$

so that

$$D_x E = \lim_{h \to 0} \frac{mE(x, h)}{2h} = 1,$$

as was to be proved.
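The ratio (1) is easy to compute exactly when $E$ is a finite union of intervals. The following Python sketch is an illustration only (the function names and the sample set $E = [0, 1]$ are assumptions made here, not taken from the text). It shows the three typical behaviours: density 1 at an interior point, density 0 at a point away from $E$, and density $1/2$ at an endpoint, which is therefore neither a point of density nor a point of rarefaction.

```python
def measure_in_window(intervals, x0, h):
    """m(E . [x0 - h, x0 + h]) for E given as a list of disjoint intervals (a, b)."""
    lo, hi = x0 - h, x0 + h
    return sum(max(0.0, min(b, hi) - max(a, lo)) for a, b in intervals)


def mean_density(intervals, x0, h):
    """The ratio mE(x0, h) / (2h) from formula (1)."""
    return measure_in_window(intervals, x0, h) / (2.0 * h)


if __name__ == "__main__":
    E = [(0.0, 1.0)]
    for h in (1e-1, 1e-2, 1e-3):
        print(h,
              mean_density(E, 0.5, h),   # interior point: ratio 1
              mean_density(E, 0.0, h),   # endpoint: ratio 1/2
              mean_density(E, 2.0, h))   # point far from E: ratio 0
```

Only the two endpoints of $[0, 1]$ fail to be points of density of $E$, a set of measure zero, as Theorem 1 requires.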
There is an important generalization of the concept of a continuous function which is closely connected with the concept of points of density.

DEFINITION 2. Let $f(x)$ be a function defined on the closed interval $[a, b]$, and let $x_0 \in [a, b]$. If there exists a measurable subset $E$ of $[a, b]$ having the point $x_0$ as a point of density¹¹, such that $f(x)$ is continuous at the point $x_0$ with respect to $E$, then $f(x)$ is said to be approximately continuous at $x_0$.

¹¹ If $x_0 = a$, then instead of requiring the set $E$ to have $x_0$ as a point of density, we must require the right-hand density of the set to be one at $x_0$; i.e.,

$$\lim_{h \to +0} \frac{m\{E \cdot [a, a+h]\}}{h} = 1.$$

For the point $b$, the definition of approximate continuity must be modified in an analogous way by using the left-hand density.

It is clear that every point of continuity of the function is a point of approximate continuity of the function. Of course, a measurable function may have no points of continuity at all. Such a function, for example, is the function equal to 0 at irrational points and 1 at rational points. On the other hand, the following theorem is true.

THEOREM 2. If $f(x)$ is a measurable function defined on the closed interval $[a, b]$ and finite almost everywhere, then it is approximately continuous at almost all points of $[a, b]$.

Proof. Let $\varepsilon$ be an arbitrary positive number. Using Luzin's Theorem (Theorem 4, §5, Chapter IV), we find a continuous function $\varphi(x)$ such that

$$mE(f \ne \varphi) < \varepsilon.$$

Let $A$ be the set of all points of the set $E(f = \varphi)$ that are points of density of this set; by Theorem 1, $mA = mE(f = \varphi) > b - a - \varepsilon$. If $x_0 \in A$, then $f(x)$ obviously is approximately continuous at this point, since we can take the set $E(f = \varphi)$ as the set $E$ of Definition 2. Hence, if $H$ denotes the set of all points of approximate continuity of $f(x)$, then $A \subset H$, so that

$$m_*H \ge mA > b - a - \varepsilon,$$

and, since $\varepsilon$ is arbitrary,

$$m_*H \ge b - a.$$

Furthermore, $H \subset [a, b]$, so that

$$b - a \le m_*H \le m^*H \le b - a.$$

$H$ is therefore measurable, and $mH = b - a$, as was to be proved.

REMARK. The concept of density given above can be generalized. Namely, we may define the density of the set $E$ at the point $x_0$ as the limit of the ratio

$$\frac{mE(x_0, h_1, h_2)}{h_1 + h_2}$$

as $h_1 > 0$ and $h_2 > 0$ tend to zero independently of each other, where $E(x_0, h_1, h_2)$ is the subset of the set $E$ contained in $[x_0 - h_1, x_0 + h_2]$. However, this generalization alters neither the set of points of rarefaction nor the set of points of density of the set $E$. In fact, let $x_0$ be a point of rarefaction of the set $E$ in the sense of Definition 1. Taking numbers $h_1 > 0$ and $h_2 > 0$, we let $h$ be the greater of the two. Then

$$E(x_0, h_1, h_2) \subset E(x_0, h),$$

and hence

$$\frac{mE(x_0, h_1, h_2)}{h_1 + h_2} \le 2 \cdot \frac{mE(x_0, h)}{2h}.$$

Since the right-hand member of this inequality tends to zero together with $h$, it follows that

$$\lim_{\substack{h_1 \to 0 \\ h_2 \to 0}} \frac{mE(x_0, h_1, h_2)}{h_1 + h_2} = 0,$$

and $x_0$ is a point of rarefaction of the set $E$ in the sense of the generalized definition. The converse statement is obvious. It is just for this reason that we gave the definition of density set forth in Definition 1. It is clear, for example, that the definition of a point of approximate continuity does not depend on the definition of density used in establishing it.
§ 6. SUPPLEMENT TO THE THEORY OF FUNCTIONS OF FINITE VARIATION AND STIELTJES INTEGRALS

Let $f(x)$ $(a \le x \le b)$ be a continuous function of finite variation. Set

$$\varphi(x) = f(a) + \int_a^x f'(t)\,dt, \qquad r(x) = f(x) - \varphi(x).$$

Then

$$f(x) = \varphi(x) + r(x),$$

where $\varphi(x)$ is an absolutely continuous function [with $\varphi(a) = f(a)$], and $r(x)$ is a continuous function of bounded variation, whose derivative obviously equals zero almost everywhere. It is clear that $r(x)$ vanishes identically only when $f(x)$ itself is absolutely continuous.

DEFINITION. A non-constant continuous function of finite variation whose derivative equals zero almost everywhere is called a singular function.

It is clear that a singular function cannot be absolutely continuous, for otherwise (Theorem 2, §2) it would be constant. An example of a singular function is the function $\Theta(x)$ constructed at the end of §2, Chapt. VIII.

THEOREM 1. A continuous function $f(x)$ of finite variation can be uniquely represented in the form

$$f(x) = \varphi(x) + r(x),$$

where $\varphi(x)$ is an absolutely continuous function with $\varphi(a) = f(a)$, and $r(x)$ is a singular function or is identically zero.

Proof. The existence of such a representation was established above. To prove uniqueness, suppose that

$$f(x) = \varphi(x) + r(x) = \varphi_1(x) + r_1(x)$$

are two such representations. Then we have

$$\varphi(x) - \varphi_1(x) = r_1(x) - r(x).$$
Hence, the derivative of the difference $\varphi(x) - \varphi_1(x)$ is 0 almost everywhere, and since this difference is absolutely continuous, it is constant. But $\varphi(a) = \varphi_1(a) = f(a)$. This implies $\varphi(x) \equiv \varphi_1(x)$, and therefore $r(x) \equiv r_1(x)$ as well.

THEOREM 2. If $f(x)$ is an increasing function, then both of its components $\varphi(x)$ and $r(x)$ are increasing functions.

Proof. It is obvious that $f'(x) \ge 0$ wherever this derivative exists. This implies that the function

$$\varphi(x) = \varphi(a) + \int_a^x f'(t)\,dt$$

is an increasing function. Furthermore, Theorem 5, §2, Chapt. VIII, implies that

$$\int_x^y f'(t)\,dt \le f(y) - f(x) \quad (y > x).$$

Therefore

$$\varphi(y) - \varphi(x) \le f(y) - f(x),$$

or $r(x) \le r(y)$.

COROLLARY. A necessary and sufficient condition that an increasing continuous function $f(x)$ be absolutely continuous is that

$$\int_a^b f'(x)\,dx = f(b) - f(a). \tag{1}$$

The necessity of condition (1) is obvious. Conversely, suppose that $f(x)$ is not absolutely continuous and let $\varphi(x)$ and $r(x)$ be the absolutely continuous and singular components of $f(x)$. Then

$$f(b) - f(a) = \varphi(b) - \varphi(a) + r(b) - r(a),$$

or

$$f(b) - f(a) = \int_a^b f'(x)\,dx + r(b) - r(a). \tag{2}$$

But $r(x)$ is an increasing, non-constant function. Therefore $r(b) > r(a)$, and (1) is not satisfied. This proves that the condition of the corollary is sufficient.

In §3, Chapt. VIII, we saw that every function of finite variation can be written as the sum of its saltus function and a continuous function of finite variation. Combining this with Theorem 1, we see that every function of bounded variation can be written

$$f(x) = \varphi(x) + r(x) + s(x),$$

where $\varphi(x)$ is an absolutely continuous function, $r(x)$ is a singular function and $s(x)$ is a saltus function (some terms can be absent).*

In §7, Chapt. VIII we raised the question of evaluating the Stieltjes integral

$$\int_a^b f(x)\,dg(x)$$

in the case where $g(x)$ is continuous. We see that the case of absolutely continuous $g(x)$ can be reduced to the computation of an ordinary Lebesgue integral.

*A similar resolution can be obtained for functions $f(x)$ of finite variation on $(-\infty, \infty)$. -E. H.
THEOREM 3. If $f(x)$ is continuous and $g(x)$ is absolutely continuous on $[a, b]$, then

$$(S)\int_a^b f(x)\,dg(x) = (L)\int_a^b f(x)\,g'(x)\,dx.$$

Proof. It is obvious that both integrals exist. To show that they are equal, we evaluate the difference between the sum

$$\sigma = \sum_{k=0}^{n-1} f(\xi_k)\,[g(x_{k+1}) - g(x_k)]$$

and the integral

$$\int_a^b f(x)\,g'(x)\,dx.$$

Since

$$g(x_{k+1}) - g(x_k) = \int_{x_k}^{x_{k+1}} g'(x)\,dx,$$

we have

$$\sigma - \int_a^b f(x)\,g'(x)\,dx = \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}} [f(\xi_k) - f(x)]\,g'(x)\,dx. \tag{3}$$

If the oscillation of the function $f(x)$ on $[x_k, x_{k+1}]$ is written as $\omega_k$, then (3) implies that

$$\left| \sigma - \int_a^b f(x)\,g'(x)\,dx \right| \le \sum_{k=0}^{n-1} \omega_k \int_{x_k}^{x_{k+1}} |g'(x)|\,dx \le \alpha \int_a^b |g'(x)|\,dx,$$

where $\alpha = \max\{\omega_k\}$. If the lengths of the intervals $[x_k, x_{k+1}]$ tend to 0, then $\alpha \to 0$ also; therefore $\sigma$ approaches the integral $\int_a^b f(x)\,g'(x)\,dx$. Since $\lim \sigma$ is the integral $\int_a^b f(x)\,dg(x)$, the theorem is proved.
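The equality in Theorem 3 can be observed numerically by comparing a Stieltjes sum $\sigma$ with a quadrature of $f g'$. The sketch below is only an illustration under assumptions chosen here: $f(x) = \cos x$ and $g(x) = x^{3/2}$ on $[0, 1]$, with the tags $\xi_k$ taken at the midpoints and a midpoint rule standing in for the Lebesgue integral. The function names are hypothetical.

```python
import math


def stieltjes_sum(f, g, a, b, n):
    """Sum of f(xi_k) * [g(x_{k+1}) - g(x_k)] over n equal parts, xi_k at the midpoints."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x0, x1 = a + k * h, a + (k + 1) * h
        total += f(0.5 * (x0 + x1)) * (g(x1) - g(x0))
    return total


def midpoint_integral(func, a, b, n):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))


def g(x):
    """Absolutely continuous integrator (illustrative choice)."""
    return x ** 1.5


def g_prime(x):
    """Derivative of g."""
    return 1.5 * math.sqrt(x)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        s = stieltjes_sum(math.cos, g, 0.0, 1.0, n)
        l = midpoint_integral(lambda x: math.cos(x) * g_prime(x), 0.0, 1.0, n)
        print(n, s, l, abs(s - l))   # the two values approach each other
```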
We have shown that computing a Stieltjes integral $\int_a^b f(x)\,dg(x)$ involves only a summation of an infinite series and evaluating an ordinary Lebesgue integral, unless the function $g(x)$ has a singular part.* Certain properties of Lebesgue integrals can be established with the aid of Theorem 3, as the following example shows.

*The computation of $\int_a^b f(x)\,dg(x)$ when $g(x)$ is a singular function can be very complicated indeed, and can lead to very curious results. For example, one can show that

$$\int_0^1 \cos 2\pi x \, d\Theta(x) = -\prod_{j=1}^{\infty} \cos\left(\frac{2\pi}{3^j}\right),$$

where $\Theta(x)$ is the singular function defined in §2, Chapter VIII. -E. H.
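The product formula quoted in the footnote can be checked numerically. The sketch below is an illustration resting on one assumption stated here: $\Theta(x)$ is taken to be the Cantor function, which increases by $2^{-n}$ across each of the $2^n$ closed intervals of length $3^{-n}$ remaining at the $n$-th stage of the Cantor construction and is constant on the removed gaps. A Stieltjes sum tagged at the left endpoints of those intervals is then an average of $\cos 2\pi x$ over the points whose ternary digits are 0 or 2. The function names are hypothetical.

```python
import math
from itertools import product


def cantor_stieltjes_cos(level):
    """Stieltjes sum for the integral of cos(2*pi*x) d(Theta) at the given construction stage.

    Each stage-`level` interval carries an increment 2**-level of Theta;
    its left endpoint has ternary digits in {0, 2}.
    """
    total = 0.0
    for digits in product((0, 2), repeat=level):
        x = sum(d / 3 ** (j + 1) for j, d in enumerate(digits))
        total += math.cos(2.0 * math.pi * x)
    return total / 2 ** level


def product_formula(terms=40):
    """Partial product -prod_{j=1}^{terms} cos(2*pi / 3**j)."""
    value = 1.0
    for j in range(1, terms + 1):
        value *= math.cos(2.0 * math.pi / 3 ** j)
    return -value


if __name__ == "__main__":
    print(cantor_stieltjes_cos(12), product_formula())
```

Since every point of a stage-$n$ interval lies within $3^{-n}$ of its left endpoint, the sum at stage 12 differs from the integral by at most about $2\pi \cdot 3^{-12}$.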
THEOREM 4 (INTEGRATION BY PARTS). If $f(x)$ and $g(x)$ are absolutely continuous, then

$$\int_a^b f(x)\,g'(x)\,dx + \int_a^b g(x)\,f'(x)\,dx = \bigl[f(x)\,g(x)\bigr]_a^b.$$

To prove this, it is sufficient to write the left-hand member in the form

$$\int_a^b f(x)\,dg(x) + \int_a^b g(x)\,df(x)$$

and apply Formula (1), §6, Chapt. VIII.
§ 7. RECONSTRUCTION OF THE PRIMITIVE FUNCTION

In §5, Chapt. V, we solved the problem of reconstructing a continuous function $f(x)$ from its derivative $f'(x)$ if the latter exists everywhere and is bounded. Here we ask, does the equality

$$f(x) = f(a) + \int_a^x f'(t)\,dt \tag{1}$$

hold when $f'(x)$ exists everywhere but is not necessarily bounded? It is perfectly clear that this is so if $f(x)$ is absolutely continuous. In this case, it suffices to suppose only that $f'(x)$ exists almost everywhere. In general, however, the existence of $f'(x)$ almost everywhere is insufficient for (1) to hold, even when $f(x)$ is an increasing continuous function whose derivative $f'(x)$ equals zero almost everywhere.¹² However, we will formulate conditions for the validity of equality (1) in terms, not of the function $f(x)$ itself but of its derivative $f'(x)$.

¹² This is clear from the example given by the function $\Theta(x)$ (§2, Chapt. VIII).

THEOREM 1. If the derivative $f'(x)$ exists everywhere, is finite, and is summable, then (1) holds.

The proof will be based on two lemmas.

LEMMA 1. Let the function $\Phi(x)$ be defined and finite on $[a, b]$. If at every point of $[a, b]$ all derived numbers of $\Phi(x)$ are non-negative, then $\Phi(x)$ is an increasing function.

Proof. Let $\varepsilon$ be an arbitrary positive number, and let

$$\Phi_1(x) = \Phi(x) + \varepsilon x.$$

Assume that

$$\Phi_1(b) < \Phi_1(a). \tag{2}$$

Then, if $c = \frac{a + b}{2}$, at least one of the differences

$$\Phi_1(b) - \Phi_1(c), \qquad \Phi_1(c) - \Phi_1(a)$$

is negative. Let $[a_1, b_1]$ be the interval $[a, c]$ or $[c, b]$ for which $\Phi_1(b_1) < \Phi_1(a_1)$, and set $c_1 = \frac{a_1 + b_1}{2}$. Let $[a_2, b_2]$ be the interval $[a_1, c_1]$ or $[c_1, b_1]$ for which $\Phi_1(b_2) < \Phi_1(a_2)$. Continuing this process, we construct a sequence of nested closed intervals $\{[a_n, b_n]\}$ for which $\Phi_1(b_n) < \Phi_1(a_n)$. Let $x_0$ be a point lying in all of the intervals $[a_n, b_n]$. Then, for each $n$, one of the differences

$$\Phi_1(b_n) - \Phi_1(x_0), \qquad \Phi_1(x_0) - \Phi_1(a_n)$$

is negative. Put $h_n = b_n - x_0$ if $\Phi_1(b_n) < \Phi_1(x_0)$ and $h_n = a_n - x_0$ if $\Phi_1(b_n) \ge \Phi_1(x_0)$. It is clear that

$$\Delta_n = \frac{\Phi_1(x_0 + h_n) - \Phi_1(x_0)}{h_n} < 0.$$

Selecting a subsequence $\{\Delta_{n_k}\}$ having a (finite or infinite) limit, we obtain a derived number $D\Phi_1(x_0) \le 0$. On the other hand, the hypothesis of the lemma implies that

$$D\Phi_1(x) \ge \varepsilon$$

for all points $x \in [a, b]$. Thus (2) is impossible. This implies that

$$\Phi_1(b) \ge \Phi_1(a),$$

or, equivalently,

$$\Phi(b) + \varepsilon b \ge \Phi(a) + \varepsilon a.$$

The number $\varepsilon$ is arbitrary, and we infer that

$$\Phi(b) \ge \Phi(a).$$

This completes the proof, since we could have taken an arbitrary subinterval $[x, y]$ in place of $[a, b]$.

LEMMA 2. Let $\varphi(x)$ be a function which is defined and finite throughout $[a, b]$. Suppose that all derived numbers of $\varphi(x)$ are non-negative at almost every point of $[a, b]$ and that no derived number of $\varphi(x)$ is equal to $-\infty$ at any point of $[a, b]$. Then $\varphi(x)$ is an increasing function.

Proof. Let $E$ be the set of points of $[a, b]$ where at least one derived number of $\varphi(x)$ is negative. By hypothesis,
$mE = 0$. By Theorem 6, §2, Chapt. VIII, there exists a continuous increasing function $\sigma(x)$ such that

$$\sigma'(x) = +\infty$$

at all points of the set $E$. Let

$$\Phi(x) = \varphi(x) + \varepsilon\sigma(x),$$

where $\varepsilon$ is an arbitrary positive number. No derived number of $\Phi(x)$ is negative at any point of $[a, b]$. In fact, since $\sigma(x)$ is an increasing function,

$$\frac{\Phi(x+h) - \Phi(x)}{h} \ge \frac{\varphi(x+h) - \varphi(x)}{h},$$

and therefore

$$D\Phi(x) \ge 0 \quad \text{for} \quad x \notin E.$$

If $x \in E$, then $\Phi'(x) = +\infty$, since for $h_n \to 0$, the ratio

$$\frac{\varphi(x + h_n) - \varphi(x)}{h_n}$$

is bounded below (for otherwise there would be a derived number $D\varphi(x) = -\infty$), and $\sigma'(x) = +\infty$. We have accordingly,

$$D\Phi(x) \ge 0$$

everywhere. By the preceding lemma, $\Phi(x)$ is an increasing function, i.e.,

$$\Phi(x) \le \Phi(y)$$

for $x < y$, or, in other terms,

$$\varphi(x) + \varepsilon\sigma(x) \le \varphi(y) + \varepsilon\sigma(y).$$

Taking the limit as $\varepsilon \to 0$, we obtain

$$\varphi(x) \le \varphi(y),$$

as was to be proved.

Proof of Theorem 1. We introduce the function $\varphi_n(x)$, by the definition

$$\varphi_n(x) = \begin{cases} f'(x), & \text{if } f'(x) \le n, \\ n, & \text{if } f'(x) > n. \end{cases}$$

It is easy to see that

$$|\varphi_n(x)| \le |f'(x)|, \tag{3}$$

and that $\varphi_n(x)$ is summable. Write

$$R_n(x) = f(x) - \int_a^x \varphi_n(t)\,dt.$$

We shall show that $R_n(x)$ is an increasing function. To do this, note first that
$$R_n'(x) = f'(x) - \varphi_n(x) \ge 0$$

almost everywhere, so that the set of points at which some derived number of the function $R_n(x)$ is negative has measure zero. On the other hand, we have $\varphi_n(x) \le n$, so that

$$\frac{1}{h} \int_x^{x+h} \varphi_n(t)\,dt \le n$$

and

$$\frac{R_n(x+h) - R_n(x)}{h} \ge \frac{f(x+h) - f(x)}{h} - n.$$

Since $f(x)$ has a finite derivative at every point, this makes it clear that no derived number of the function $R_n(x)$ is $-\infty$. Hence, by Lemma 2, $R_n(x)$ increases. This implies that

$$R_n(b) \ge R_n(a),$$

or, in other terms,

$$f(b) - f(a) \ge \int_a^b \varphi_n(x)\,dx.$$

Since

$$\lim_{n \to \infty} \varphi_n(x) = f'(x),$$

it follows from (3) that

$$\lim_{n \to \infty} \int_a^b \varphi_n(x)\,dx = \int_a^b f'(x)\,dx.$$

Consequently,

$$f(b) - f(a) \ge \int_a^b f'(x)\,dx.$$

The same reasoning applied to the function $-f(x)$ yields the inequality

$$f(b) - f(a) \le \int_a^b f'(x)\,dx.$$

Therefore

$$f(b) = f(a) + \int_a^b f'(x)\,dx,$$

which completes the proof, since any $x \in (a, b]$ can take the role of $b$.

In conclusion, we mention two examples.

I. Let the function
$$f(x) = x^{3/2} \sin\frac{1}{x} \quad (x > 0), \qquad f(0) = 0$$

be defined on $[0, 1]$. This function has a finite derivative everywhere:

$$f'(x) = \frac{3}{2}\, x^{1/2} \sin\frac{1}{x} - x^{-1/2} \cos\frac{1}{x} \quad (x > 0), \qquad f'(0) = 0.$$

This derivative is summable, since

$$|f'(x)| \le \frac{3}{2} + \frac{1}{\sqrt{x}}.$$

Hence, the function $f(x)$ satisfies all conditions of Theorem 1. However, it is easy to see that $f'(x)$ is not bounded, so that the theorem of §5, Chapt. V is not applicable.

II. Let the function

$$f(x) = x^2 \cos\frac{\pi}{x^2} \quad (x > 0), \qquad f(0) = 0$$

be defined on $[0, 1]$. This function also has a finite derivative everywhere, but the derivative is not summable. In fact, if $0 < \alpha < \beta < 1$, the derivative $f'(x)$ is bounded on $[\alpha, \beta]$, so that

$$\int_\alpha^\beta f'(x)\,dx = \beta^2 \cos\frac{\pi}{\beta^2} - \alpha^2 \cos\frac{\pi}{\alpha^2}.$$

In particular, for

$$\alpha_n = \sqrt{\frac{2}{4n+1}}, \qquad \beta_n = \frac{1}{\sqrt{2n}},$$

we have

$$\int_{\alpha_n}^{\beta_n} f'(x)\,dx = \frac{1}{2n}.$$

The intervals $[\alpha_n, \beta_n]$ $(n = 1, 2, \ldots)$ are pairwise disjoint; writing

$$E = \sum_{n=1}^{\infty} [\alpha_n, \beta_n],$$

we have

$$\int_E |f'(x)|\,dx \ge \sum_{n=1}^{\infty} \frac{1}{2n} = +\infty,$$

and $f'(x)$ is not summable.

Hence Lebesgue integration does not furnish a complete solution to the problem of reconstructing a function from its derivative. A complete solution of the problem is given by the process of Perron-Denjoy integration, which generalizes the Lebesgue integral. We cannot discuss this generalized integral here.*
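The computation in Example II can be reproduced directly from the primitive, without any numerical integration. The following Python sketch is an illustration under the assumption that the example is $f(x) = x^2\cos(\pi/x^2)$ with endpoints $\alpha_n = \sqrt{2/(4n+1)}$ and $\beta_n = 1/\sqrt{2n}$; the function names are hypothetical. It checks that the intervals are disjoint, that each increment equals $1/(2n)$, and that the partial sums grow like half the harmonic series.

```python
import math


def f(x):
    """Example II: x^2 * cos(pi / x^2) for x > 0, and 0 at x = 0."""
    return x * x * math.cos(math.pi / (x * x)) if x > 0 else 0.0


def alpha(n):
    """Left endpoint, where cos(pi / x^2) = 0."""
    return math.sqrt(2.0 / (4 * n + 1))


def beta(n):
    """Right endpoint, where cos(pi / x^2) = 1."""
    return 1.0 / math.sqrt(2.0 * n)


def increment(n):
    """f(beta_n) - f(alpha_n), the integral of f' over [alpha_n, beta_n]."""
    return f(beta(n)) - f(alpha(n))


def partial_sum(count):
    """Lower bound for the integral of |f'| over the first `count` intervals."""
    return sum(increment(n) for n in range(1, count + 1))


if __name__ == "__main__":
    for count in (10, 100, 1000, 10000):
        print(count, partial_sum(count))   # grows roughly like 0.5 * log(count)
```

The slow logarithmic growth of the partial sums is exactly the divergence of $\sum 1/(2n)$ used in the text to show that $f'$ is not summable.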
*The interested reader is referred to S. Saks, Theory of the Integral, 2nd edition, Monografie Matematyczne, Warsaw, 1937. -E. H.

Exercises for Chapter IX

1. A summable function is approximately continuous at each of its Lebesgue points. The converse is not true.
2. Let $f(x)$ be a bounded measurable function. A point $x_0$ is a Lebesgue point for $f(x)$ if and only if it is a point of approximate continuity.
3. It is possible for a function to be equal to the derivative of its indefinite integral at a point $x_0$ without being approximately continuous at $x_0$.
4. If all derived numbers of the function $f(x)$ satisfy the inequality $|Df(x)| \le K$, then $f(x)$ satisfies the Lipschitz condition $|f(x) - f(y)| \le K|x - y|$.
5. If the function $F[f(x)]$ is absolutely continuous for every absolutely continuous $f(x)$, then $F(x)$ satisfies a Lipschitz condition. (G. M. Fichtenholz).
6. Let $f(x)$ be defined on $[a, b]$. If to every $\varepsilon > 0$, there corresponds a $\delta > 0$ such that for every finite system of intervals $\{(a_k, b_k)\}$, the sum of whose lengths is less than $\delta$, we have

$$\left| \sum_{k=1}^{n} \{f(b_k) - f(a_k)\} \right| < \varepsilon,$$

then $f(x)$ satisfies the Lipschitz condition.¹³ (G. M. Fichtenholz).
7. Prove the following particular case of the Banach-Zarecki Theorem directly. If a continuous and strictly increasing function possesses the property (N), then it is absolutely continuous.
8. Let $f(x)$ be continuous on $[a, b]$ and let $E$ be the set of points at which at least one derived number of $f(x)$ is non-positive. If the map $f(E)$ of the set $E$ contains no closed interval, then $f(x)$ is an increasing function. (A. Zygmund).
9. Using the preceding result, generalize Lemma 2, §7 in the following way. If $f(x)$ is continuous on $[a, b]$, if all derived numbers of $f(x)$ are non-negative at almost all points of $[a, b]$, and if the set of points at which at least one derived number equals $-\infty$ is finite or denumerable, then $f(x)$ is an increasing function.
10. Let $f(x)$ be continuous and let $f'(x)$ exist everywhere and be summable. If the set $E(|f'| = +\infty)$ is finite or denumerable, then $f(x)$ is absolutely continuous. (Apply the preceding exercise.)
11. A function having a finite derivative everywhere possesses property (N).
12. A necessary and sufficient condition that a continuous, strictly increasing function $f(x)$ be absolutely continuous is that the map $f(E)$ of the set $E$ of points at which $f'(x) = +\infty$ have measure zero. (M. A. Zarecki).
13. A necessary and sufficient condition that a function which is the inverse of a continuous and strictly increasing function $f(x)$ be absolutely continuous, is that $mE(f' = 0) = 0$. (M. A. Zarecki).

¹³ This result shows that it is impossible to discard the requirement that the intervals $(a_k, b_k)$ be pairwise disjoint in the definition of absolute continuity.
INDEX Absolute continuity of the integral, 149 Absolutely continuous functions, 243 differential properties of, 246 Accumulation point, see Point of accumulation Aggregate, 11 Aleksandrov, 107 Aleph-nought, 19 Algebraic numbers, 21 Almost everywhere, 90 Approximate continuity, 149, 260, 262 Arzela, 242 Ascoli, 242 Assemblage, 11 At most denumerable, 17
Bounded function, 102 sequence, 3 6 variation, see Functions of finite variation
Cantor, 11, 27, 60 sets Go and Po, 49, 56, 60, 75, 76, 213 Caratheodory, 88 Cardinal number, 27 Cauchy, 116 sequence, 169 Cauchy-Bunyakovski-Schwarz inequality, 165 CBS inequality, see Cauchy-Bunyakovski-Schwarz inequality Characteristic function, 93, 112 Class of bounded measurable functions, M, 172 of continuous functions, C, 172 of measurable sets, 75 of polynomials, P, 172 of step functions, S, 172 Closed Interval, 13 set, 36, 37 orthogonal system, 177 Closure of a set, 3 7 Collection of objects, 11 Comparison of powers, 27 of Riemann and Lebesgue integrals, 129 Complement of a set, 42 Complementary intervals, 49
Bad element, 29 Baire functions, 129 measurability of, 131 Banach, 81, 252 indicatrix, 225 Banach's theorem, 80, 225 Banach-Zarecki theorem, 250 Bari, 203 Bernstein, 107 polynomials, 108 Bernstein's theorem, 108 Bessel's identity, 177 inequality, 177 Binary expansion, 23 unique representation of, 24 · Bolzano-Cauchy property, 171 Bolzano-Weierstrass theorem, 35, 36 Borel covering theorem, 39 set, 76, 86 Borel's theorem, 39, 104 273
Complete additivity, 57 space, 171 system, 181 Component interval, 47 Condensation point, see Point of condensation Congruent sets, 75 Continued fraction expansion, 23 Continuity, 102 of the norm, 169 Continuous functions of finite variation, 223 mappings, 248 real-valued functions, 32 Continuum hypothesis, 28 Convergence in measure, 95, 96 in the mean, see Mean convergence Convergent subsequence, 36 Countable additivity, 57, 67, 86, 122 of the integral, 142 Countable set, 17 Counting elements, 15 Darboux sum, 132 Decreasing function, 204 De la Valtee-Poussin, 88, 164 De la Vallee-Poussin's theorem, 159 Dense in itself, 37 Density of a set, 261 Denumerable, 17 set, 17 subset, 18 Derived number, 207 set, 37 Difference of sets, 14 Differentiation· of monotone functions, 207 Difficult problem of the theory of measure, 79 Dirichlet function, 116, 133 Disjoint sets, 14 Distance, 44 Dominated convergence, 161
Easy problem of the theory of measure, 79, 80 Egorov, 96 Egorov's theorem, 99, 112 Element, 11 Enumeration of sets, 17 Equality of sets, 12 Equi-absolutely continuous integral, 151, 152 Equicontinuous functions, 242 Equivalent functions, 90, 181 sets, 15 Euclidean space Rn, 185 Everywhere dense set, 171
Faddeyev, 163 Family of sets, 13 Fatou's theorem, 140, 152, 160 Fichtenholz, 96, 203, 252, 270 Fichtenholz's theorem, 158, 252 Finite additivity of the integral, 145 set, 18 variation, see Functions of finite variation First law of the mean for integr~ls, 121 Fixed point, 16 For almost all points, see Almost everywhere Fourier, 176 cofficients, 176 series, 176 Fraenkel, II Frechet's theorem, 106, 110 Functions continuous at a point, 102 of finite variation, 204, ?-15, 223, 238 with summable square, see Square-summable functions Functionals, see Linear functionals Fundamental lemma, 130 sequence, see Cauchy sequence
Gavurin, 163 Good element, 29 Gram determinant, 193 Greater power, 28 Greatest lower bound, 44 Half-open interval, 13 Hardy, 23 Hausdorff, II, 242 Hausdorff's theorem, 80 Helly's theorem (principle of choice), 220,222,233 Hilbert, 167 Hilbert space, 167, 187 completeness of, 171 Hobson, 133 Holder's inequality, 197 Image of a set, 71, 207 Inclusion, 12 Increasing function, 204 indices, 36 Indefinite Lebesgue integral, 243, 252 Index conjugate to p, 197 Infimum, 39 Infinite matrix, 26 set, 11, 15, 18, 20 subset, 18 Inner measure of a bounded set, 63, 64 product, 184 Integrable (L), 120, 144 (R), 116 Integration by parts, 229, 266 Interior point, 41 Intersection of sets, 13 Interval, 13 lnvariance of measurability and measure under isometries, 71 Inverse image of a set, 207 Inverse isometry, 73 Irrational numbers, 23
Isolated point, 34, 37 Isometry, 71 Kaczmarz's theorem, 182 Kantorovic, 115, 163 Kolmogorov, 107, 163, 203 Kondurar', 241 Lebesgue, 163, 202 measure, 67 point, 255 Lebesgue integral, 116, 119, 136, 144 fundamental properties of, 121 Lebesgue's theorem, 95, 96, 112, 127, 149, 153 on dominated convergence, 161 Length of a vector, 186, 189 Legendre polynomials, 195 Levi's theorem, 141, 161 Limit point, 34, 35 of a sequence, 167 Linear functional, 236 Linearly dependent system, 192 independent system, 192 Lipschitz condition, 216 Lower Baire function, 129 Lebesgue sum, 118 1:!-space, 186 1"-space, 196 L:rspace, 165 Lp-space, 196 Luzin, 248 Luzin's theorem, 106, 107, 113, 114 Mapping of sets, 207 Mean convergence, 167, 168 of order p, 199 Measurable (B), 76 in the Lebesgue sense, 67 (L), 67 set, 55, 66, 84
276 Measurable functions, 89, 90 properties of, 93 Measure of a bounded closed set, 60 of a bounded open set, 55, 56 of an interval, 55 of arbitrary sets, 56, 66, 84 of a set, 66, 84 Metric space, 167 Minkowski's inequality, 166, 198 Monotonic functions, 204 differentiation of, 207 Mutually exclusive relations, 32 Natanson, 203 Non-denumerable set, 17 Non-measurable set, 76, 77 Non-void set, 13 Norm, 167, 186, 199, 200 Normalized measurable function, 175 Number of elements in a set, 15 Numerical sequence, 168 One-element set, 14 One-to-<>ne correspondence, 15 Open interval, 13 set, 41 Orlicz, 202 Orthogonal system, 175 Orthonormal system, 175 closure and completeness in, 181 linear independence of, 192 Outer measure of a bounded set, 63 Pairwise dis joint, 14 Parseval's identity, 177 Passage to the limit under the integral sign, 127, 149, 232 Perfect set, 37 Perron-Denjoy integration, 270 Point in R 3 -space, 185 of accumulation, 34 of condensation, 50, 52 of density, 261 of rarefaction, 261
set, 34 Points in Euclidean space R :l, 184 Power of a set, 15, 27 symbol for, 27 Power a, 17 c,22,27 f, 29 of a closed set, 50 of the continuum, 21, 22 Primitive function, 133 Principle of monotonicity, 80, 81 Problem of measure, 79 Proper subset,· 12 Property ( N) , 248 Rademacher's theorem, 183 Rational numbers, 19 Real numbers, 23 Real-valued functions, 28 Real variable, 11 Reconstruction of primitive function, 133, 266 Reflection in the origin, 71 Regular solution, 81 Relations, 32 Riemann, 116 Riemann integral,) 16 Riesz, 106,114,202,236 Riesz-Fischer theorem, 179, 187 Riesz's theorem, 98, 112, 127, 150, 152, 190, 236. 257 Rule of association, 15 Russell, 11 Russell's paradox, 11, 27 Saks, 270 Saltus, 205 function, 206, 219 Schmidt's theorem, 194 Section of a function by the number N, 136 Separability, 45 Separation, 44 property, 46 theorem, 45
Sequence of measurable functions, 95 of natural numbers, 23 Set, 11 dense in itself, 37 inclusion, see Inclusion of irrational numbers, 23 of natural numbers, 36 of real-valued functions, 28 of square-summable functions, see L 2 -space of type F ,_, 75 of type Ga, 75 Single-valued mapping, 71 Singular function, 263 Sliding hump method, 156 Smaller power, 28 Smallest closed interval containing a set, 43 Square-summable function, 165 Steklov's theorem, 178 Step functions, 91, 112 Stieltjes integral, 204, 227 Stone, 167, 202 Strictly monotonic function, 204 Structure of bounded closed sets, 47 of bounded open sets, 47 of measurable functions, I 0 1 Subsequence, 36 Subset, 12 Sum of sets, 12 Summable function, 136, 144, 149 of arbitrary sign, 143 Supremum, 39
Suslin, 76 System, 13 Ternary expansion, 50 Theory of functions, 11 of sets, 11 Titchmarsh, 202 Total variation, 215,238 Transcendental numbers, 23 Transitive relation, 32 Translation, 71 Trichotomic property, 32 Trigonometric polynomials, 110 Union of sets, 12 Upper Baire function, 129 Lebesgue sum, 118 Vector in R 3 -space, 185 Vitali covering, 81 Vitali's theorem, 81, 83, 152, 157,209 Void intersection, 14 set, 12 Weak convergence, 174, 200 Weierstrass's theorems, 107, 109, 111, 173, 174 Wright, 23 Zarecki, 252, 271 Zygmund, 271